M. Wu: ENEE631 Digital Image Processing (Spring'09)
Overview on General Framework & Issues of Image Analysis and Video Streaming
Spring ’09 Instructor: Min Wu
Electrical and Computer Engineering Department,
University of Maryland, College Park
bb.eng.umd.edu (select ENEE631 S’09) [email protected]
ENEE631 Spring’09, Lecture 20 (4/15/2009)
M. Wu: ENEE631 Digital Image Processing (Spring'09), Lec.20 – Overview on CBIR & Video Comm [2]
Overview and Logistics
Last Time:
  – General methodologies on motion analysis
  – Video content analysis
      Basic framework; temporal segmentation; compressed domain processing
Today:
  – Project #2
  – Overview of content-based image retrieval
  – A quick guide on video communications
  – Introduction on data embedding in images
UMCP ENEE631 Slides (created by M. Wu © 2004)
Recap: Video Content Analysis
Teach computer to “understand” video content
  – Define features that computer can learn to measure and compare
      color (RGB values or other color coordinates)
      motion (magnitude and directions)
      shape (contours)
      texture and patterns
  – Give example correspondences so that computer can learn to build connections between features & higher-level semantics/concepts
      statistical classification and recognition techniques
Video understanding
  1. Break a video sequence into chunks, each with consistent content ~ “shot”
  2. Group similar shots into scenes that represent certain events
  3. Describe connections among scenes via story boards or scene graphs
  4. Associate shot/scene with representative features/semantics for future query
UMCP ENEE408G Slides (created by M. Wu & R. Liu © 2002)
Fast Extraction of DC Image From MPEG-1
I frame
  – Put together DC coeff. from each block (and apply proper scaling)
Predictive (P/B) frame
  – Fast approximation of reference block’s DC
  – Adding DC of the motion compensation residue
      recall DCT is a linear transform
$$ [DCT(P_{cur})]_{00} = [DCT(P_{ref})]_{00} + [DCT(P_{diff})]_{00} $$
$$ [DCT(P_{ref})]_{00} \approx \sum_{i=1}^{4} \frac{h_i w_i}{64} [DCT(P_i)]_{00} $$
where P_1, ..., P_4 are the four 8x8 blocks overlapped by the reference block, and h_i, w_i are the height and width of the overlap with P_i.
See Yeo-Liu’s paper for more derivations on approximations (DC; DC+2AC)
[Figure: a reference block overlapping four neighboring blocks 1-4 (labels C, R); DC Frame]
UMCP ENEE408G Slides (created by M. Wu © 2002)
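The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Yeo-Liu implementation: it assumes an orthonormal 8x8 DCT (so the DC term equals the block sum divided by 8), frame sizes that are multiples of 8, and a reference block that lies inside the frame. All function names are hypothetical.

```python
BLOCK = 8  # MPEG-1 uses 8x8 DCT blocks

def block_dc(img, by, bx):
    """DC coefficient of the 8x8 block at block coordinates (by, bx).

    Assumes an orthonormal 2-D DCT, where DC = (sum of pixels) / 8.
    """
    total = sum(img[by * BLOCK + r][bx * BLOCK + c]
                for r in range(BLOCK) for c in range(BLOCK))
    return total / BLOCK

def dc_image(img):
    """DC image of an I frame: one DC value per 8x8 block."""
    rows, cols = len(img) // BLOCK, len(img[0]) // BLOCK
    return [[block_dc(img, by, bx) for bx in range(cols)] for by in range(rows)]

def approx_ref_dc(dc, y, x):
    """Approximate DC of the 8x8 reference block whose top-left pixel is (y, x).

    The reference block overlaps up to four grid blocks; each contributes
    its DC weighted by (overlap height * overlap width) / 64.
    """
    by, bx = y // BLOCK, x // BLOCK
    dy, dx = y % BLOCK, x % BLOCK
    heights = (BLOCK - dy, dy)
    widths = (BLOCK - dx, dx)
    est = 0.0
    for i, h in enumerate(heights):
        for j, w in enumerate(widths):
            if h and w:  # skip blocks with no overlap
                est += h * w / 64 * dc[by + i][bx + j]
    return est

def predicted_block_dc(ref_dc, residue_dc, y, x):
    """DC of a P-frame block: approximate reference DC plus DC of the residue."""
    return approx_ref_dc(ref_dc, y, x) + residue_dc
```

The weighting is exact when each overlapped block is constant and only an approximation otherwise, which is why the slide calls it a fast approximation.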
Summary on Video Temporal Segmentation
A first step toward video content understanding
  – Image analysis may be applied to key frame sequences
  – Motion and temporal info can also be exploited
Two types of transitions
  – “Cut” ~ abrupt transition
  – Gradual transition: fade out and fade in; dissolve; wipe
Detecting transitions: can be done on “DC images” w/o full decompression
  – Detecting cut is relatively easier ~ check frame-wise difference
  – Detecting dissolve and fade by checking linearity: $f_0 (1 - t/T) + f_1 \cdot t/T$
  – Detecting wipe ~ more difficult
      exploit transition patterns, or linearity of color histogram
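As a rough illustration of the cut and dissolve checks, the sketch below works on small frames (for example DC images) stored as 2-D lists. The threshold value and the function names are assumptions made for this example; a practical detector would tune the threshold and would not know the dissolve boundaries in advance.

```python
def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized frames (2-D lists)."""
    n = len(a) * len(a[0])
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n

def detect_cuts(frames, threshold):
    """Indices k where the change from frame k-1 to frame k exceeds the threshold."""
    return [k for k in range(1, len(frames))
            if mean_abs_diff(frames[k - 1], frames[k]) > threshold]

def dissolve_error(frames, start, end):
    """Largest deviation of frames[start..end] from the linear blend
    f0 * (1 - t/T) + f1 * (t/T), with f0 = frames[start] and f1 = frames[end].
    A value near zero suggests a dissolve or fade between the two frames."""
    f0, f1, T = frames[start], frames[end], end - start
    worst = 0.0
    for t in range(T + 1):
        alpha = t / T
        blend = [[p * (1 - alpha) + q * alpha for p, q in zip(r0, r1)]
                 for r0, r1 in zip(f0, f1)]
        worst = max(worst, mean_abs_diff(frames[start + t], blend))
    return worst
```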
Content-based Image Retrieval (CBIR)
An active research area since the 1990s, with renewed interest
  – As digital cameras/camera-phones become affordable, and image & video sharing services (flickr, YouTube/Google, etc.) become popular
Reference: a recent comprehensive survey
  – “Image Retrieval: Ideas, Influences, and Trends of the New Age,” by Datta, Joshi, Li and Wang, ACM Computing Surveys, 4/2008
Includes a broad range of use scenarios and applications
  – Image browsing, search and retrieval
  – Automatic image annotation, and related subfields
With interests & contributions from multiple fields of study
  – Multimedia (MM), machine learning (ML), info retrieval (IR), computer vision (CV), and human-computer interaction (HCI)
  => See also ENEE633 pattern recognition and other CS courses
Requirements for Various CBIR Applications
(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)
Image Retrieval from User & System Perspectives
(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)
Framework for Forming Image Signature for CBIR
(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)
Provide a compact representation to better reflect image content and facilitate search
Closely related to the similarity measure employed
Image Similarity Measures for CBIR
(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)
Major considerations:
  – agreement with semantics;
  – computational efficiency (to work in real time and large scale);
  – robustness to noise (invariant to perturbations);
  – background invariance (allow region-based querying);
  – local linearity (following triangle inequality in a neighborhood).
Grouping of techniques based on design philosophy:
  – treating features as vectors, nonvector representations, or ensembles;
  – using region-based similarity, global similarity, or a combination of both;
  – computing similarities over linear space or nonlinear manifold;
  – considering the role played by image segments in similarity computation;
  – using stochastic, fuzzy, or deterministic similarity measures;
  – use of supervised, semi-supervised, or unsupervised learning.
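To make the signature/similarity pairing concrete, here is a deliberately simple sketch: a normalized gray-level histogram as a global signature, compared by histogram intersection. This pairing is chosen only for illustration and is not taken from the survey figure; the function names and the bin count are assumptions.

```python
def histogram_signature(img, bins=8, levels=256):
    """Normalized gray-level histogram used as a compact global signature."""
    hist = [0] * bins
    for row in img:
        for p in row:
            hist[p * bins // levels] += 1
    total = sum(hist)
    return [h / total for h in hist]

def histogram_intersection(sig_a, sig_b):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(sig_a, sig_b))

def rank_by_similarity(query, database):
    """Return database keys sorted from most to least similar to the query."""
    q = histogram_signature(query)
    scores = {name: histogram_intersection(q, histogram_signature(img))
              for name, img in database.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A global histogram is cheap and tolerant of small perturbations, but it ignores spatial layout, so it does not provide background invariance or region-based querying.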
Clustering Methods and Application Scope
(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)
Clustering and classification help speed up retrieval in large databases
Also facilitate or improve visualization, automatic annotation, & robustness
=> See more in ENEE633 pattern recognition
Video Communications
MM + Data Comm. = Effective MM Communications?
Multimedia vs. Generic Data
  – Perceptual no-difference vs. bit-by-bit accuracy
  – Unequal importance within multimedia data
  – High data volume and real-time requirements
Need to consider the interplay between source coding and transmission, and make use of MM-specific properties
E.g. wireless video needs a “good” compression algorithm to:
  – Support scalable video compression rate (from 10 to several hundred kbps)
  – Be robust to transmission errors and channel impairments
  – Minimize end-to-end delay
  – Handle missing frames intelligently
Video download vs. streaming
Error-Resilient Coding with Localized Synch Marker
To localize error and reduce error propagation
Use spatial and temporal interpolation to conceal errors/losses
[Block diagram: input sequence; H.263 encoder; random noise; MB detection; LRM; H.263 decoder; error concealment; output sequence]
[Figure: H.263 with FRM vs. H.263 with LRM]
(From D. Lun @ HK PolyUniv. Short Course 6/01)
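The concealment step can be illustrated with two very basic strategies. This is a generic sketch, not the scheme from the short course: the temporal variant copies the co-located block from the previous decoded frame (zero motion assumed), and the spatial variant interpolates between the rows just above and below the lost block. Function names and the default macroblock size are assumptions.

```python
def conceal_temporal(cur, prev, top, left, size=16):
    """Fill a lost size x size macroblock with the co-located block of the
    previous decoded frame (zero-motion temporal concealment)."""
    for r in range(top, top + size):
        cur[r][left:left + size] = prev[r][left:left + size]
    return cur

def conceal_spatial(cur, top, left, size=16):
    """Fill a lost macroblock by linearly interpolating, column by column,
    between the pixel row just above and the pixel row just below it.
    Assumes both neighboring rows exist and were received correctly."""
    above, below = cur[top - 1], cur[top + size]
    for k, r in enumerate(range(top, top + size), start=1):
        w = k / (size + 1)  # 0 at the row above, 1 at the row below
        for c in range(left, left + size):
            cur[r][c] = round((1 - w) * above[c] + w * below[c])
    return cur
```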
Issues in Video Communications/Streaming
Source coding aspects
  – Rate-Distortion tradeoff and bit allocation in R-D optimal sense
  – Scalable coding and Fine Granular Scalability (FGS)
  – Multiple description coding
  – Error resilient source coding
Channel coding aspects ~ see ENEE626 for general theory
  – Unequal Error Protection (UEP) channel codes
  – Embedded modulation for achieving UEP
Joint source-channel approaches
  – Jointly select source and channel coding parameters to optimize end-to-end distortion
  – Wisely map source codewords to channel symbols
  – Take advantage of channel’s non-uniform characteristics for UEP
Bandwidth resource determination, allocation & adaptation
Data Hiding in Images: An Introduction
Example: Data Embedding by Replacing LSBs
Downloaded from http://www.cl.cam.ac.uk/~fapp2/steganography/image_downgrading/
UMCP ENEE631 Slides (created by M. Wu © based on Research Talks ’98-’04)
Example: LSB Replacement (cont’d)
Replace LSB with Pentagon’s MSB
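A minimal sketch of this kind of LSB replacement, assuming 8-bit grayscale images of equal size stored as 2-D lists. The function names and the `n_bits` parameter are illustrative, not from the slides.

```python
def embed_msb_in_lsb(host, secret, n_bits=1):
    """Replace the n_bits least significant bits of each 8-bit host pixel
    with the n_bits most significant bits of the co-located secret pixel."""
    keep_mask = 0xFF & ~((1 << n_bits) - 1)
    return [[(h & keep_mask) | (s >> (8 - n_bits)) for h, s in zip(hr, sr)]
            for hr, sr in zip(host, secret)]

def extract_from_lsb(marked, n_bits=1):
    """Recover the hidden bits and shift them back to the MSB positions."""
    low_mask = (1 << n_bits) - 1
    return [[(p & low_mask) << (8 - n_bits) for p in row] for row in marked]
```

With `n_bits=1` the host changes by at most one gray level and the hidden image is recovered as a two-level picture. Larger `n_bits` values recover more of the hidden image at the cost of visible damage to the host.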
Using Higher LSB Bitplanes for Embedding
 value  binary
  ……
  11    1011
  10    1010
   9    1001
   8    1000   <- smaller distortion
   7    0111   <- original pixel value
   6    0110
   5    0101   <- replace 2nd LSB
   4    0100
   3    0011
   2    0010
   1    0001
Issues to consider:
  – how much distortion by embedding?
  – how much resilience to minor changes?
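The table's example (original value 7, embedding a 0 in the 2nd LSB) can be reproduced with the sketch below. It contrasts plain bit replacement with moving to the nearest value that carries the desired bit. The function names are hypothetical and bit planes are counted from 0 at the LSB.

```python
def replace_bitplane(value, plane, bit):
    """Force bit number `plane` (0 = LSB) of value to `bit` by plain replacement."""
    return (value & ~(1 << plane)) | (bit << plane)

def nearest_with_bit(value, plane, bit, max_value=255):
    """Closest value in [0, max_value] whose bit number `plane` equals `bit`."""
    candidates = [v for v in range(max_value + 1) if ((v >> plane) & 1) == bit]
    return min(candidates, key=lambda v: abs(v - value))

# Original pixel 7 (0111), embed bit 0 in the 2nd LSB (plane 1):
# replacement gives 5 (0101), the nearest valid value is 8 (1000).
```

Both results decode to the same bit, since the detector only reads the chosen bit plane, but the nearest-value choice changes the pixel by 1 instead of 2.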
Example: LSB Replacement of Higher Bitplanes
Replace 6 LSBs with Pentagon’s 6 MSBs
Review: Pixel Depth
  – “Contour” artifacts for low pixel depth at gradual transition areas
  – Human eyes distinguish about 50 gray levels => 5~6 bits/pixel
[Figure: the same image at 8 bits/pixel, 4 bits/pixel, and 2 bits/pixel]
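Reducing pixel depth can be sketched as dropping the low-order bits of each 8-bit pixel. The helper below is an illustrative assumption (truncation, with the output kept on the 0..255 scale), not necessarily how the slide's figures were produced.

```python
def reduce_pixel_depth(img, bits):
    """Requantize an 8-bit image to `bits` bits/pixel by zeroing the low bits.

    Output stays on the 0..255 scale so the result can be displayed directly.
    """
    shift = 8 - bits
    return [[(p >> shift) << shift for p in row] for row in img]
```

On a smooth ramp, 4 bits/pixel leaves only 16 distinct levels, which is what produces the visible contours in gradual transition areas.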
Embedding Basics: Two Simple Tries
Data Hiding: To put secondary data in host signal
(1) Replace LSB
(2) Round a pixel value to closest even or odd numbers
      even ~ “0”, odd ~ “1”
    Detection scheme is same as LSB, but embedding brings less distortion in the quantized case, esp. for higher LSB bitplane
Both are equivalent to reducing the effective pixel depth for representing host image
+ Simple embedding
- Fragile to even minor changes
pixel value             98   99   100  101
odd-even mapping         0    1    0    1
lookup table mapping   … 0    1    1    0 …
How to Improve the Robustness?
Introduce quantization to embedding process
  – Make features odd/even multiples of Q
Tradeoff between embedding distortion and robustness
  Larger Q => Higher resilience to minor changes
           => Higher average changes required to embed data
Questions: What’s the expected embedding distortion? Relation with distortion by quantization alone?
feature value           2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
odd-even mapping         0      1        0        1
lookup table mapping   … 0      1        1        0 …
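A small sketch of odd-even embedding with quantization, following the odd-even mapping row above: even multiples of Q carry “0”, odd multiples carry “1”. The function names are hypothetical, and the detector simply rounds to the nearest multiple of Q.

```python
def embed_odd_even(value, bit, Q):
    """Move `value` to the nearest multiple of Q whose index parity equals `bit`
    (even multiple encodes 0, odd multiple encodes 1)."""
    k = 2 * round((value / Q - bit) / 2) + bit
    return k * Q

def detect_odd_even(value, Q):
    """Round to the nearest multiple of Q and read the parity of its index."""
    return round(value / Q) % 2

# With Q = 4, embedding bit 1 into the value 9 gives 12 (= 3Q).
# A perturbation smaller than Q/2 still decodes correctly; a larger one may not.
```

This makes the tradeoff visible: the embedded value can sit up to Q away from the original, while the detector tolerates changes below Q/2.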
Distortion from Quantization-based Embedding
Uniform quantization with step size Q
(Assume source’s distribution within each interval is approx. constant)
$$ \mathrm{MSE} = Q^2 / 12 $$
Odd-even embedding with quantization
$$ \mathrm{MSE} = \tfrac{1}{2} \cdot \frac{Q^2}{12} + \tfrac{1}{2} \cdot \frac{7Q^2}{12} = \frac{Q^2}{3} $$
MSE equivalent to quantizing with 2Q step size!
“Pre-distort” via quantization to gain resilience [-Q/2, Q/2]
feature value      2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
odd-even mapping    0      1        0        1
[Figure: error axis marked at -Q, -Q/2, +Q/2, +Q; uniform error density of height 1/Q over [-Q/2, +Q/2]]
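The two terms in the odd-even MSE can be checked with a short derivation. It assumes, in addition to the uniform-within-interval assumption stated above, that the embedded bits are equally likely to be 0 or 1 and independent of the host.

```latex
% Plain quantization: error e is uniform on [-Q/2, Q/2] with density 1/Q.
\mathrm{MSE}_{\text{quant}} = \frac{1}{Q}\int_{-Q/2}^{Q/2} e^2 \, de = \frac{Q^2}{12}

% Odd-even embedding, case 1 (probability 1/2): the nearest multiple of Q
% already has the right parity, so the error is the same as above.
% Case 2 (probability 1/2): the value moves to the next nearest multiple,
% so |e| is uniform on [Q/2, Q] with density 2/Q.
E\left[e^2 \mid \text{case 2}\right]
  = \frac{2}{Q}\int_{Q/2}^{Q} e^2 \, de
  = \frac{2}{Q}\cdot\frac{Q^3 - Q^3/8}{3}
  = \frac{7Q^2}{12}

% Combining both cases:
\mathrm{MSE}_{\text{embed}}
  = \frac{1}{2}\cdot\frac{Q^2}{12} + \frac{1}{2}\cdot\frac{7Q^2}{12}
  = \frac{Q^2}{3}
  = \frac{(2Q)^2}{12}
```

The last equality is the sense in which the embedding distortion matches plain quantization with step size 2Q.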
Two Views of Quantization-based Embedding
From decoder’s view
  – Partition the signal space into two subsets labeled “0” & “1”
  – Decode according to which subset a sample belongs to
  Embedder picks watermarked sample from the subset labeled with to-be-embedded bit, and tries to minimize the amount of changes
From embedder’s view
  – Design two quantizers “#0”, “#1”: step size 2Q, offset by Q
  – Embedder performs quantization using the quantizer labeled with to-be-embedded bit => “Quantization Index Modulation (QIM)”
  Decoder looks for closest representative
feature value   2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
label           “0”    “1”      “0”      “1”
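The embedder's view can be written directly as two quantizers. In this sketch quantizer #0 has reconstruction points at even multiples of Q and quantizer #1 at odd multiples, so each has step size 2Q and they are offset by Q. The names `qim_embed` and `qim_decode` are illustrative.

```python
def qim_embed(value, bit, Q):
    """Quantize with quantizer #bit: step size 2Q, offset by bit * Q."""
    step, offset = 2 * Q, bit * Q
    return round((value - offset) / step) * step + offset

def qim_decode(value, Q):
    """Return the bit whose quantizer has the closest reconstruction point."""
    return min((0, 1), key=lambda b: abs(value - qim_embed(value, b, Q)))
```

For example, with Q = 4 the value 9 is sent to 8 by quantizer #0 and to 12 by quantizer #1. A received value of 13.9 is closer to 12 than to 16, so it decodes as 1.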
Tampering Detection by Pixel-domain Fragile Wmk
Downloaded from ICIP’97 CD-ROM paper by Yeung-Mintzer
Fight Against Forging Tamper-Detection Watermark?
If using LSB to embed a fragile watermark for tampering detection, adversary can alter image but retain LSB
[Solution 1] Add uncertainty to the embedding mapping
  – through a random look-up table with controlled run length
[Solution 2] Make watermark securely depend on host content
  E.g. embed a robust/content-based hash of host image
feature value           2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
odd-even mapping         0      1        0        1
lookup table mapping   … 0      1        1        0 …
Pixel-domain Table-lookup Embedding (Yeung-Mintzer ICIP’97)
  – Simple to implement; able to localize alteration
[Figure: extracted wmk from altered image]
Yeung’s Fragile Watermark for Tampering Detection
Basic idea:
  – enforce certain relationship to embed data
  – minimize distortion: nearest neighbor, constrained runs
  – diffuse error incurred to surrounding pixels
[Block diagram, embed: original image and data to be embedded -> v’ = v + d1 + d2 such that LUT(v’) = b -> marked image; LUT( ) comes from a lookup table generator driven by a seed; d1: diffused error]
[Block diagram, detect: test image -> table lookup with LUT( ) -> extracted data -> visualize & decide]
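The embed/detect pipeline can be sketched as follows. This is a simplified illustration, not the Yeung-Mintzer implementation: it omits the error-diffusion term d1 and only applies the adjustment that enforces LUT(v’) = b, it embeds one bit per 8-bit pixel, and the seeded generator and `max_run` limit are assumptions standing in for the lookup table generator in the diagram.

```python
import random

def make_lut(seed, size=256, max_run=2):
    """Pseudo-random binary look-up table with runs of at most `max_run`
    identical entries, generated from a secret seed."""
    rng = random.Random(seed)
    lut = []
    for _ in range(size):
        bit = rng.randint(0, 1)
        if len(lut) >= max_run and all(b == bit for b in lut[-max_run:]):
            bit = 1 - bit  # break the run so a matching entry is always nearby
        lut.append(bit)
    return lut

def embed_pixel(value, bit, lut):
    """Move the pixel to the nearest value v' with LUT(v') == bit."""
    candidates = [v for v in range(len(lut)) if lut[v] == bit]
    return min(candidates, key=lambda v: abs(v - value))

def embed_image(img, bits, lut):
    """Embed a binary pattern `bits` (same shape as img), one bit per pixel."""
    return [[embed_pixel(p, b, lut) for p, b in zip(pr, br)]
            for pr, br in zip(img, bits)]

def extract_image(img, lut):
    """Table lookup; comparing the result with the embedded pattern
    shows which pixels were altered."""
    return [[lut[p] for p in row] for row in img]
```

Because the extracted pattern is compared pixel by pixel with the known pattern, a mismatch points to the location of the change. The run-length limit keeps the embedding distortion at no more than `max_run` gray levels per pixel.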
LUT Embedding: Distortion/Security/Robustness
What’s new compared with odd-even embedding?
  – Mapping from feature to embedded bit is less predictable
  – Adjacent intervals may be mapped to the same bit value
How much security gained with proprietary LUT? =>
  – Proprietary LUT brings uncertainty and makes it difficult for attackers to embed specific data at will
How much MSE introduced by embedding? =>
  – Larger than odd-even embedding
How much resilience gained? =>
  – Moving away by Q/2 step may not trigger detection error, due to possible continuous run in LUT
Ref: M. Wu: "Joint Security and Robustness Enhancement for Quantization Based Embedding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 8, pp. 831-841, August 2003. (see ICIP’03 for shorter conf. version)
Case Study: Mintzer-Yeung Patent on Fragile Wmk
F. Mintzer and M.M. Yeung: “Invisible Image Watermark for Image Verification,” U.S. Patent 5,875,249, issued Feb. 1999.
Acquire knowledge on latest art in industry from patents
  – Especially useful when industry doesn’t publish all key techniques (but often aggressively patents this “IP”)
      Watermark is a good example: Digimarc, Verance, IBM, NEC …
  – Complementary to literature search of journal/conf. papers
For details on how to patent your novel ideas
  – Talk to your supervisor & lawyers, and check univ./company policies
  – Resources
      Online workshop on Patent 101: http://www.invent.org/workshop/3_0_0_workshop.asp
      US Patent Office: www.uspto.gov (full-text patent search and patent doc. images)
Reference Readings
Content based image retrieval
  – “Image Retrieval: Ideas, Influences, and Trends of the New Age,” by Datta, Joshi, Li and Wang, ACM Computing Surveys, 4/2008
Video communications
  – Wang’s video textbook: Chapter 14, 15
  – Wood’s book: Chapter 12
Data Embedding
  (1) M. M. Yeung, F. Mintzer: “An Invisible Watermarking Technique for Image Verification,” Proc. of the IEEE Int’l Conf. on Image Proc. (ICIP), Oct. 1997.
  (2) F. Mintzer and M.M. Yeung: “Invisible Image Watermark for Image Verification,” U.S. Patent 5,875,249, issued Feb. 1999. <www.uspto.gov/>
For further exploration
  – M. Wu: "Joint Security and Robustness Enhancement for Quantization Based Embedding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 8, pp. 831-841, August 2003. (see ICIP’03 for shorter version)
A Glimpse at the Patent System
Intended to add "the fuel of interest to the fire of genius" (Abraham Lincoln)
  – In exchange for disclosing an invention to the public, the inventor receives the exclusive right to control exploitation of the invention and to realize any profits for a specific length of time
Three classifications in US Patent laws:
  – Utility patents: of most interest to ECEers
      A term of 20 years from the date the patent application was filed
      Granted to anyone who invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof
  – Design patents: new, original & ornamental design, for 14 years’ protection
  – Plant patents
Patent Process
Idea
  – Make sure your idea is new and practical/useful
Document
  – Keep records that document your discovery
  – File “Invention Disclosure” & be careful with public disclosure
Research
  – Search existing literature & patents related to your invention
  – Analyze existing patents and literature
Apply
  – Prepare and file the patent application documents
  – Review by PTO examiner; amend your patent claims if necessary
Structure of A Utility Patent
Title Page
  – Title, Patent Number, File & Issue Dates
  – Inventors, Assignees, Patent examiners
  – Related patents and references
  – Abstract, # of claims, # of drawings, representative drawing
Drawings
Main text
  – Field of the Invention (usually in one sentence)
  – Background
  – Summary
  – Brief description of drawings
  – Detailed description of the preferred embodiment
  – Claims
Useful Contents for Technical Studies
Detailed descriptions of invention (process/method/apparatus)
  – Along with drawings (and background/summary)
  – They are often intended to be written in an easily accessible way
Technical discussions on “Preferred embodiment(s)” in Mintzer-Yeung’s patent
  – Image stamping via LUT embedding
  – Image verification via table lookup and visualization
  – Error diffusion to alleviate visual distortion incurred by embedding
  – Apply the process to DC-image for embedding data in JPEG image
Claims: Crucial Part for Business Values
Not always “fun” to read
  – Many legal terms and wordings
Usually prefer broad (and allowable) claims
Claims are the hot spot examined by USPTO
  – Determines whether the proposed claims
      (1) have been claimed by other patents?
      (2) are anticipated by other already issued patents?
      (3) are a straightforward combination or extension of existing methods for similar purposes by those “skilled in the art”?
28 Claims in Mintzer-Yeung patent
  – “Root” claims and “child” claims
  Claim Tree: useful tree visualization to illustrate relations between claims