

M. Wu: ENEE631 Digital Image Processing (Spring'09)

Overview on General Framework & Issues of Image Analysis and Video Streaming

Spring ’09 Instructor: Min Wu

Electrical and Computer Engineering Department, University of Maryland, College Park

bb.eng.umd.edu (select ENEE631 S’09) [email protected]

ENEE631 Spring’09 Lecture 20 (4/15/2009)

M. Wu: ENEE631 Digital Image Processing (Spring'09)Lec.20 – Overview on CBIR & Video

Comm [2]

Overview and Logistics

Last Time:
– General methodologies on motion analysis
– Video content analysis
   Basic framework
   Temporal segmentation; Compressed domain processing

Today:
– Project #2
– Overview of content-based image retrieval
– A quick guide on video communications
– Introduction on data embedding in images

UMCP ENEE631 Slides (created by M.Wu © 2004)

Recap: Video Content Analysis

Teach the computer to “understand” video content
– Define features that the computer can learn to measure and compare
   color (RGB values or other color coordinates)
   motion (magnitude and directions)
   shape (contours)
   texture and patterns
– Give example correspondences so that the computer can learn to build connections between features & higher-level semantics/concepts
   statistical classification and recognition techniques

Video understanding
1. Break a video sequence into chunks, each with consistent content ~ “shot”
2. Group similar shots into scenes that represent certain events
3. Describe connections among scenes via story boards or scene graphs
4. Associate each shot/scene with representative features/semantics for future query

UMCP ENEE408G Slides (created by M.Wu & R.Liu © 2002)

Fast Extraction of DC Image From MPEG-1

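I frame
– Put together the DC coeff. from each block (and apply proper scaling)

Predictive (P/B) frame
– Fast approximation of the reference block’s DC
– Adding the DC of the motion compensation residue
   recall DCT is a linear transform

$$[DCT(P_{cur})]_{00} = [DCT(P_{ref})]_{00} + [DCT(P_{diff})]_{00}$$

$$[DCT(P_{ref})]_{00} \approx \sum_{i=1}^{4} \frac{h_i w_i}{64} \, [DCT(P_i)]_{00}$$

where P_1, ..., P_4 are the four blocks that the reference block overlaps, and h_i, w_i are the height and width of the overlap with block P_i.

See Yeo-Liu’s paper for more derivations on approximations (DC; DC+2AC)

(Fig.: reference block overlapping four blocks numbered 1 to 4; labels C and R)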

UMCP ENEE408G Slides (created by M.Wu © 2002)

DC Frame
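
The P/B-frame approximation on this slide can be sketched as follows. This is a minimal illustration, assuming 8x8 blocks and a 2-D list `dc_ref` holding one DC value per block of the reference frame; the function names and the pixel-coordinate convention are hypothetical and not taken from Yeo-Liu’s paper.

```python
BLOCK = 8  # block size assumed for MPEG-1 DCT blocks


def approx_ref_dc(dc_ref, x, y):
    """Approximate the DC of the 8x8 reference block whose top-left pixel is (x, y).

    dc_ref[r][c] is the DC value of the coded block at block-row r, block-column c.
    Each overlapped block contributes its DC weighted by overlap area h_i * w_i / 64.
    """
    r0, c0 = y // BLOCK, x // BLOCK
    dy, dx = y % BLOCK, x % BLOCK
    heights = (BLOCK - dy, dy)  # overlap heights with the upper / lower block rows
    widths = (BLOCK - dx, dx)   # overlap widths with the left / right block columns
    total = 0.0
    for i, h in enumerate(heights):
        for j, w in enumerate(widths):
            if h and w:  # skip blocks with zero overlap
                total += h * w / (BLOCK * BLOCK) * dc_ref[r0 + i][c0 + j]
    return total


def predicted_block_dc(dc_ref, x, y, residue_dc):
    """DC of a predicted block = approximated reference DC + DC of the residue."""
    return approx_ref_dc(dc_ref, x, y) + residue_dc


dc_ref = [[8, 16], [24, 32]]
print(approx_ref_dc(dc_ref, 2, 6))            # reference block straddles all four blocks
print(predicted_block_dc(dc_ref, 0, 0, 1.5))  # aligned block: only block (0, 0) contributes
```

When the motion vector is block-aligned, three of the four weights vanish and the approximation becomes exact.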

Summary on Video Temporal Segmentation

A first step toward video content understanding
– Image analysis may be applied to key frame sequences
– Motion and temporal info can also be exploited

Two types of transitions
– “Cut” ~ abrupt transition
– Gradual transition: Fade out and Fade in; Dissolve; Wipe

Detecting transitions: can be done on “DC images” w/o full decompression
– Detecting a cut is relatively easy ~ check frame-wise difference
– Detecting dissolve and fade by checking linearity: f0 * (1 – t/T) + f1 * t/T
– Detecting wipe ~ more difficult
   exploit transition patterns, or linearity of color histogram
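
The cut and dissolve checks above can be sketched as below. This is a rough illustration, assuming each frame (e.g. a DC image) is a flat list of gray values; the threshold, the function names, and the use of mean absolute difference are illustrative choices, not the specific detector used in the lecture.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def detect_cuts(frames, threshold):
    """Return indices t where frame t differs abruptly from frame t-1."""
    return [t for t in range(1, len(frames))
            if mean_abs_diff(frames[t - 1], frames[t]) > threshold]


def dissolve_error(frames):
    """Largest deviation from the linear blend f0*(1 - t/T) + f1*t/T.

    frames[0] plays the role of f0 and frames[-1] the role of f1;
    a small value suggests a dissolve/fade. Needs at least two frames.
    """
    f0, f1, T = frames[0], frames[-1], len(frames) - 1
    worst = 0.0
    for t, frame in enumerate(frames):
        for p0, p1, p in zip(f0, f1, frame):
            worst = max(worst, abs(p0 * (1 - t / T) + p1 * t / T - p))
    return worst


video = [[10, 10], [10, 12], [200, 200], [200, 198]]
print(detect_cuts(video, threshold=50))                   # abrupt change at frame 2
print(dissolve_error([[0, 100], [50, 50], [100, 0]]))     # perfectly linear blend
```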


Content-based Image Retrieval (CBIR)

An active research area since the 1990s with renewed interest
– As digital cameras/camera-phones become affordable, and image & video sharing services (flickr, YouTube/Google, etc.) become popular

Reference: a recent comprehensive survey
– “Image Retrieval: Ideas, Influences, and Trends of the New Age,” by Datta, Joshi, Li and Wang, ACM Computing Surveys, 4/2008

Includes a broad range of use scenarios and applications
– Image browsing, search and retrieval
– Automatic image annotation, and related subfields

With interest & contributions from multiple fields of study
– Multimedia (MM), machine learning (ML), info retrieval (IR), computer vision (CV), and human-computer interaction (HCI)
=> See also ENEE633 pattern recognition and other CS courses

Requirements for Various CBIR Applications

(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)

Image Retrieval from User & System Perspectives

(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)

Framework for Forming Image Signature for CBIR

(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)

Provide a compact representation to better reflect image content and facilitate search

Closely related to the similarity measure employed

Image Similarity Measures for CBIR

(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)

Major considerations:
– agreement with semantics;
– computational efficiency (to work in real time and large scale);
– robustness to noise (invariant to perturbations);
– background invariance (allow region-based querying);
– local linearity (following triangle inequality in a neighborhood).

Grouping of techniques based on design philosophy: treating features as vectors, nonvector representations, or ensembles; using region-based similarity, global similarity, or a combination of both; computing similarities over linear space or nonlinear manifold; considering the role played by image segments in similarity computation; using stochastic, fuzzy, or deterministic similarity measures; and use of supervised, semi-supervised, or unsupervised learning.
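
As one concrete instance of a signature plus similarity measure, the sketch below uses a global gray-level histogram as the signature and histogram intersection as the (global, deterministic, vector-based) similarity. Both choices, the bin count, and all names are assumptions made for illustration; the survey covers many alternatives.

```python
def gray_histogram(pixels, bins=8, levels=256):
    """Normalized gray-level histogram used as a compact image signature."""
    hist = [0] * bins
    for p in pixels:
        hist[p * bins // levels] += 1
    return [h / len(pixels) for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank(query, database):
    """Return database image names sorted from most to least similar to the query."""
    sig_q = gray_histogram(query)
    scores = {name: histogram_intersection(sig_q, gray_histogram(img))
              for name, img in database.items()}
    return sorted(scores, key=scores.get, reverse=True)


database = {
    "dark": [0, 5, 10, 20],
    "mixed": [5, 15, 210, 240],
    "bright": [200, 220, 240, 250],
}
print(rank([0, 10, 200, 250], database))  # "mixed" shares the query's histogram
```

A global histogram is cheap to compare but ignores spatial layout, which is one reason region-based similarity appears in the grouping above.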

Clustering Methods and Application Scope

(Fig. from ACM Computing Surveys 4/2008 article by Datta, Joshi, Li and Wang)

Clustering and classification help speed up retrieval in large databases
Also facilitate or improve visualization, automatic annotation, & robustness

=> See more in ENEE633 pattern recognition

Video Communications

MM + Data Comm. = Effective MM Communications?

Multimedia vs. Generic Data
– Perceptual no-difference vs. Bit-by-bit accuracy
– Unequal importance within multimedia data
– High data volume and real-time requirements

Need to consider the interplay between source coding and transmission, and make use of MM-specific properties

E.g. wireless video needs a “good” compression algorithm to:
– Support scalable video compression rate (from 10 to several hundred kbps)
– Be robust to transmission errors and channel impairments
– Minimize end-to-end delay
– Handle missing frames intelligently

Video download vs. streaming

Error-Resilient Coding with Localized Synch Marker

To localize errors and reduce error propagation
Use spatial and temporal interpolation to conceal errors/losses

(Fig.: block diagram with blocks labeled input sequence, H.263 encoder, LRM, random noise, MB detection, H.263 decoder, error concealment, output sequence; result panels labeled “H.263 with FRM” and “H.263 with LRM”)

(From D. Lun @ HK PolyUniv. Short Course 6/01)

Issues in Video Communications/Streaming

Source coding aspects
– Rate-Distortion tradeoff and bit allocation in R-D optimal sense
– Scalable coding and Fine Granular Scalability (FGS)
– Multiple description coding
– Error resilient source coding

Channel coding aspects ~ see ENEE626 for general theory
– Unequal Error Protection (UEP) channel codes
– Embedded modulation for achieving UEP

Joint source-channel approaches
– Jointly select source and channel coding parameters to optimize end-to-end distortion
– Wisely map source codewords to channel symbols
– Take advantage of channel’s non-uniform characteristics for UEP

Bandwidth resource determination, allocation & adaptation

Data Hiding in Images: An Introduction

Example: Data Embedding by Replacing LSBs

Downloaded from http://www.cl.cam.ac.uk/~fapp2/steganography/image_downgrading/

UMCP ENEE631 Slides (created by M.Wu © based on Research Talks ’98-’04)

Example: LSB Replacement (cont’d)

Replace LSB with Pentagon’s MSB
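
The replacement in this example can be sketched as below, assuming 8-bit grayscale pixels stored in flat lists. The parameter `k` (number of bitplanes replaced) and the function names are illustrative; `k=1` corresponds to replacing the LSB with the secret image's MSB.

```python
def embed_msbs(host, secret, k=1):
    """Replace the k LSBs of each 8-bit host pixel with the k MSBs of the secret pixel."""
    keep = 0xFF & ~((1 << k) - 1)  # mask that clears the k LSBs of the host
    return [(h & keep) | (s >> (8 - k)) for h, s in zip(host, secret)]


def extract_msbs(marked, k=1):
    """Recover a coarse secret image by moving the k LSBs back to the MSB positions."""
    low = (1 << k) - 1
    return [(m & low) << (8 - k) for m in marked]


host = [100, 101, 254, 3]
secret = [0, 255, 128, 127]

marked = embed_msbs(host, secret, k=1)
print(marked)                # host changes by at most 1 gray level
print(extract_msbs(marked))  # one-bit (black/white) version of the secret
```

With `k=1` each host pixel moves by at most one gray level, so the change is hard to see, and the extracted image is only a two-level version of the secret.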

Using Higher LSB Bitplanes for Embedding

Issues to consider:
– how much distortion by embedding?
– how much resilience to minor changes?

value  binary
 ……
  11    1011
  10    1010
   9    1001
   8    1000   smaller distortion
   7    0111   original pixel value
   6    0110
   5    0101   replace 2nd LSB
   4    0100
   3    0011
   2    0010
   1    0001
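
The table compares two ways of forcing the 2nd LSB of pixel value 7 to “0”. A minimal sketch of both, with hypothetical function names and `plane=1` denoting the 2nd LSB:

```python
def replace_bit(value, bit, plane):
    """Plain bitplane replacement: overwrite one bit and leave the others alone."""
    return (value & ~(1 << plane)) | (bit << plane)


def nearest_with_bit(value, bit, plane, max_value=255):
    """Pick the closest value in [0, max_value] whose chosen bitplane already equals `bit`."""
    candidates = [v for v in range(max_value + 1) if (v >> plane) & 1 == bit]
    return min(candidates, key=lambda v: abs(v - value))


original = 7  # binary 0111
print(replace_bit(original, 0, plane=1))       # 0101: distortion 2
print(nearest_with_bit(original, 0, plane=1))  # 1000: distortion 1
```

Both outputs decode to the same bit, since the detector only reads the 2nd LSB; the second simply lands on a closer value.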

Example: LSB Replacement of Higher Bitplanes

Replace 6 LSBs with Pentagon’s 6 MSBs

Review: Pixel Depth

– “Contour” artifacts for low pixel depth at gradual transition areas
– Human eyes distinguish about 50 gray levels => 5~6 bits/pixel

(Fig.: example images at 8 bits / pixel, 4 bits / pixel, and 2 bits / pixel)
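
A small sketch of what reducing pixel depth does to 8-bit gray values, and of the arithmetic behind “about 50 gray levels => 5~6 bits”. The function name and the truncation-based requantization are illustrative assumptions.

```python
import math


def reduce_depth(pixel, bits):
    """Keep only the `bits` most significant bits of an 8-bit pixel (truncation)."""
    shift = 8 - bits
    return (pixel >> shift) << shift


ramp = [96, 100, 104, 108, 112, 116]     # a gradual transition area
print([reduce_depth(p, 4) for p in ramp])  # collapses into flat steps: "contour" artifacts
print(math.log2(50))                       # between 5 and 6 bits for about 50 levels
```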

Embedding Basics: Two Simple Tries

Data Hiding: To put secondary data in host signal

(1) Replace LSB

(2) Round a pixel value to the closest even or odd number (even ~ “0”, odd ~ “1”)
   Detection scheme is the same as LSB, but embedding brings less distortion in the quantized case, esp. for higher LSB bitplanes

Both are equivalent to reducing the effective pixel depth for representing the host image

+ Simple embedding; Fragile to even minor changes

pixel value             98   99  100  101
odd-even mapping         0    1    0    1
lookup table mapping  …  0    1    1    0 …


How to Improve the Robustness?

Introduce quantization to the embedding process
– Make features odd/even multiples of Q

Tradeoff between embedding distortion and robustness
– Larger Q => Higher resilience to minor changes
           => Higher average changes required to embed data

Questions: What’s the expected embedding distortion? Relation with distortion by quantization alone?

feature value           2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
odd-even mapping         0      1        0        1
lookup table mapping  …  0      1        1        0 …
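
A minimal sketch of odd-even embedding with quantization, assuming a scalar feature and a step size `Q` shared by embedder and detector; the function names are hypothetical. The feature is moved to the nearest multiple of Q whose index parity equals the bit.

```python
import math


def embed_odd_even(feature, bit, Q):
    """Move `feature` to the nearest multiple of Q whose index is even for 0, odd for 1."""
    k = math.floor(feature / Q + 0.5)  # index of the nearest multiple of Q
    if k % 2 != bit:
        # Wrong parity: step to the adjacent multiple on the side the feature lies on.
        k += 1 if feature >= k * Q else -1
    return k * Q


def extract_odd_even(feature, Q):
    """Decode the parity of the nearest multiple of Q."""
    return math.floor(feature / Q + 0.5) % 2


Q = 8
marked0 = embed_odd_even(37, 0, Q)
marked1 = embed_odd_even(37, 1, Q)
print(marked0, marked1)
print(extract_odd_even(marked0 + 3, Q))  # perturbation below Q/2: bit survives
print(extract_odd_even(marked0 + 5, Q))  # perturbation above Q/2: bit flips
```

Doubling Q doubles the tolerated perturbation (Q/2) but also increases the average change needed to embed, which is the tradeoff stated above.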

Distortion from Quantization-based Embedding

Uniform quantization with step size Q
(Assume the source’s distribution within each interval is approx. constant)

$$\mathrm{MSE} = \frac{Q^2}{12}$$

Odd-even embedding with quantization

$$\mathrm{MSE} = \frac{1}{2} \cdot \frac{Q^2}{12} + \frac{1}{2} \cdot \frac{7Q^2}{12} = \frac{Q^2}{3}$$

MSE equivalent to quantizing with step size 2Q! “Pre-distort” via quantization to gain resilience [-Q/2, Q/2]

feature value      2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
odd-even mapping    0      1        0        1

(Fig.: error density of height 1/Q over [-Q/2, +Q/2]; axis marks at -Q, -Q/2, +Q/2, +Q)
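
A short worked derivation of the two terms in the odd-even MSE, under the slide's assumption that the quantization error e is uniform on [-Q/2, Q/2], plus the added assumption (not stated on the slide) that the bit to embed matches the parity of the nearest multiple of Q half of the time:

```latex
\begin{align*}
\text{parity matches (prob. } \tfrac{1}{2}\text{):}\quad
  & E[e^2] = \frac{1}{Q}\int_{-Q/2}^{Q/2} e^2 \, de = \frac{Q^2}{12} \\
\text{parity differs (prob. } \tfrac{1}{2}\text{):}\quad
  & |e'| = Q - |e| \in [Q/2,\, Q], \qquad
    E[e'^2] = \frac{2}{Q}\int_{Q/2}^{Q} x^2 \, dx
            = \frac{2}{Q}\cdot\frac{7Q^3}{24} = \frac{7Q^2}{12} \\
\text{overall:}\quad
  & \mathrm{MSE} = \frac{1}{2}\cdot\frac{Q^2}{12} + \frac{1}{2}\cdot\frac{7Q^2}{12}
                 = \frac{Q^2}{3} = \frac{(2Q)^2}{12}
\end{align*}
```

The last equality is the sense in which the embedding distortion equals that of a uniform quantizer with step size 2Q.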

Two Views of Quantization-based Embedding

From decoder’s view
– Partition the signal space into two subsets labeled “0” & “1”
– Decode according to which subset a sample belongs to
   Embedder picks the watermarked sample from the subset labeled with the to-be-embedded bit, and tries to minimize the amount of change

From embedder’s view
– Design two quantizers “#0”, “#1”: step size 2Q, offset by Q
– Embedder performs quantization using the quantizer labeled with the to-be-embedded bit => “Quantization Index Modulation (QIM)”
   Decoder looks for the closest representative

feature value      2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q

(Fig.: feature axis with alternating intervals labeled “0” and “1”)
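
The embedder's view can be sketched as two interleaved quantizers. This is a scalar illustration with hypothetical function names, assuming quantizer #0 has reconstruction points at even multiples of Q and quantizer #1 at odd multiples.

```python
import math


def qim_embed(x, bit, Q):
    """Quantize x with quantizer #bit: step size 2Q, offset by bit * Q."""
    offset = bit * Q
    return 2 * Q * math.floor((x - offset) / (2 * Q) + 0.5) + offset


def qim_decode(y, Q):
    """Return the label of the quantizer whose closest representative is nearest to y."""
    d0 = abs(y - qim_embed(y, 0, Q))
    d1 = abs(y - qim_embed(y, 1, Q))
    return 0 if d0 <= d1 else 1


Q = 8
print(qim_embed(37, 0, Q), qim_embed(37, 1, Q))  # nearest even / odd multiple of Q
print(qim_decode(35, Q), qim_decode(43, Q))      # noisy samples still decode correctly
```

The decoder here takes the “closest representative” view; checking the parity of the nearest multiple of Q (the decoder's subset view) gives the same decisions.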

Tampering Detection by Pixel-domain Fragile Wmk

Downloaded from ICIP’97 CD-ROM paper by Yeung-Mintzer


Fight Against Forging Tamper-Detection Watermark?

If using LSB to embed a fragile watermark for tampering detection, an adversary can alter the image but retain the LSB

[Solution 1] Add uncertainty to the embedding mapping
– through a random look-up table with controlled run length

[Solution 2] Make the watermark securely depend on host content
– E.g. embed a robust/content-based hash of the host image

feature value           2kQ  (2k+1)Q  (2k+2)Q  (2k+3)Q
odd-even mapping         0      1        0        1
lookup table mapping  …  0      1        1        0 …

Pixel-domain Table-lookup Embedding (Yeung-Mintzer ICIP’97)

– Simple to implement; able to localize alterations

(Fig.: extracted wmk from altered image)

Yeung’s Fragile Watermark for Tampering Detection

Basic idea:
– enforce certain relationship to embed data
– minimize distortion: nearest neighbor, constrained runs
– diffuse error incurred to surrounding pixels

(Fig.: block diagram.
 Embed: original image and data to be embedded -> marked image, with v’ = v + d1 + d2 chosen so that LUT(v’) = b; LUT( ) comes from a lookup table generator driven by a seed.
 Detect: test image -> table lookup with LUT( ) -> extracted data -> visualize & decide.)

d1: diffused error
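
A simplified sketch of the table-lookup idea, assuming 8-bit pixels and a seed shared by embedder and detector. It covers only the seeded LUT with constrained runs and the nearest-neighbor embedding; error diffusion is omitted, and the generator, names, and `max_run` parameter are illustrative rather than the construction in the Yeung-Mintzer paper or patent.

```python
import random


def make_lut(seed, size=256, max_run=2):
    """Pseudo-random binary lookup table with no run of equal bits longer than max_run."""
    rng = random.Random(seed)
    lut = []
    for _ in range(size):
        bit = rng.randint(0, 1)
        if len(lut) >= max_run and all(b == bit for b in lut[-max_run:]):
            bit = 1 - bit  # break the run so a matching entry is always nearby
        lut.append(bit)
    return lut


def embed_pixel(v, bit, lut):
    """Move pixel v to the nearest value v' with lut[v'] == bit."""
    candidates = [u for u in range(len(lut)) if lut[u] == bit]
    return min(candidates, key=lambda u: abs(u - v))


def extract(pixels, lut):
    """Detector: look up each pixel in the same table."""
    return [lut[p] for p in pixels]


lut = make_lut(seed=42)
host = [0, 17, 128, 255]
bits = [1, 0, 1, 0]
marked = [embed_pixel(v, b, lut) for v, b in zip(host, bits)]
print(marked)
print(extract(marked, lut) == bits)
```

Limiting the run length bounds the embedding distortion: with `max_run=2`, any three consecutive table entries contain both bit values, so no pixel moves by more than 2.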

LUT Embedding: Distortion/Security/Robustness

What’s new compared with odd-even embedding?
– Mapping from feature to embedded bit is less predictable
– Adjacent intervals may be mapped to the same bit value

How much security is gained with a proprietary LUT? =>
– Proprietary LUT brings uncertainty and makes it difficult for attackers to embed specific data at will

How much MSE is introduced by embedding? =>
– Larger than odd-even embedding

How much resilience is gained? =>
– Moving away by a Q/2 step may not trigger a detection error
   Due to possible continuous runs in the LUT

Ref: M. Wu: "Joint Security and Robustness Enhancement for Quantization Based Embedding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 8, pp.831-841, August 2003. (see ICIP’03 for shorter conf. version)

Case Study: Mintzer-Yeung Patent on Fragile Wmk

F. Mintzer and M.M. Yeung: “Invisible Image Watermark for Image Verification,” U.S. Patent 5,875,249, issued Feb. 1999.

Acquire knowledge on the latest art in industry from patents
– Especially useful when industry doesn’t publish all key techniques (but they often aggressively patent these “IP”)
   Watermark is a good example: Digimarc, Verance, IBM, NEC …
– Complementary to literature search of journal/conf. papers

For details on how to patent your novel ideas
– Talk to your supervisor & lawyers, and check univ./company policies
– Resources
   Online workshop on Patent 101 http://www.invent.org/workshop/3_0_0_workshop.asp
   US Patent Office www.uspto.gov (full-text patent search and patent doc. images)

Reference Readings

Content based image retrieval
– “Image Retrieval: Ideas, Influences, and Trends of the New Age,” by Datta, Joshi, Li and Wang, ACM Computing Surveys, 4/2008

Video communications
– Wang’s video textbook: Chapter 14, 15.
– Wood’s book: Chapter 12

Data Embedding
(1) M. M. Yeung, F. Mintzer: “An Invisible Watermarking Technique for Image Verification,” Proc. of the IEEE Int’l Conf. on Image Proc. (ICIP), Oct. 1997.
(2) F. Mintzer and M.M. Yeung: “Invisible Image Watermark for Image Verification,” U.S. Patent 5,875,249, issued Feb. 1999. <www.uspto.gov/>

For further exploration
– M. Wu: "Joint Security and Robustness Enhancement for Quantization Based Embedding," IEEE Trans. on Circuits and Systems for Video Technology, vol. 13, no. 8, pp.831-841, August 2003. (see ICIP’03 for shorter version)


A Glimpse at the Patent System

Intended to add "the fuel of interest to the fire of genius" (Abraham Lincoln)
– In exchange for disclosing an invention to the public, the inventor receives the exclusive right to control exploitation of the invention and to realize any profits for a specific length of time

Three classifications in US Patent laws:
– Utility patents: of most interest to ECEers
   A term of 20 years from the date the patent application was filed
   Granted to anyone who invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof
– Design patents: new, original & ornamental design, for 14 years’ protection
– Plant patents

Patent Process

Idea
– Make sure your idea is new and practical/useful

Document
– Keep records that document your discovery
– File “Invention Disclosure” & be careful with public disclosure

Research
– Search existing literature & patents related to your invention
– Analyze existing patents and literature

Apply
– Prepare and file the patent application documents
– Review by PTO examiner; amend your patent claims if necessary

Structure of A Utility Patent

Title Page
– Title, Patent Number, File & Issue Dates
– Inventors, Assignees, Patent examiners
– Related patents and references
– Abstract, # of claims, # of drawings, representative drawing

Drawings

Main text
– Field of the Invention (usually in one sentence)
– Background
– Summary
– Brief description of drawings
– Detailed description of the preferred embodiment
– Claims

Useful Contents for Technical Studies

Detailed descriptions of invention (process/method/apparatus)
– Along with drawings (and background/summary)
– They are often intended to be written in an easily accessible way

Technical discussions on “Preferred embodiment(s)” in Mintzer-Yeung’s patent
– Image stamping via LUT embedding
– Image verification via table lookup and visualization
– Error diffusion to alleviate visual distortion incurred by embedding
– Apply the process to DC-image for embedding data in JPEG image

Claims: Crucial Part for Business Values

Not always “fun” to read
– Many legal terms and wordings

Usually prefer broad (and allowable) claims

Claims are the hot spot examined by USPTO
– Determines whether the proposed claims
   (1) have been claimed by other patents
   (2) are anticipated by other already issued patents
   (3) are a straightforward combination or extension of existing methods for similar purposes by those “skilled in the art”

28 Claims in Mintzer-Yeung patent
– “Root” claims and “child” claims
   Claim Tree: useful tree visualization to illustrate relations between claims

From Fragile to Robust Watermark