![Page 1: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/1.jpg)
From Image Analysis to Content Extraction:
Are We There Yet?
Tsuhan Chen
Carnegie Mellon University
Pittsburgh
![Page 2: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/2.jpg)
Tsuhan Chen
A Journey of 10+ Years
• Multimedia Signal Processing (MMSP) Technical Committee
– Founding Chair 1996~1999
• MMSP Workshops
– Princeton 1997, Los Angeles 1998, Copenhagen 1999, Cannes 2001, St. Thomas 2002, Siena 2004, Shanghai 2005, Victoria 2006…
• IEEE Transactions on Multimedia
– Editor-in-Chief: 2002~2004
• International Conference on Multimedia and Expo (ICME)
– New York 2000, Tokyo 2001, Lausanne 2002, Baltimore 2003, Taipei 2004, Amsterdam 2005, Toronto 2006, Beijing 2007…
• IEEE Fellow, 2007~, “…multimedia signal processing”
• IEEE Distinguished Lecturer, 2007~2008
![Page 3: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/3.jpg)
Signal vs. Content
![Page 4: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/4.jpg)
What is “content”?
[Baker and Kanade]
Number of all possible 16×12 images $= 2^{16 \times 12 \times 8} \gg 30 \times 60 \times 60 \times 24 \times 365 \times \text{human history} \times \text{world population}$
“Content” is based on signals, i.e., prior, statistics, data-driven…
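The comparison on this slide can be checked with exact integer arithmetic. The following is a minimal sketch, assuming 8 bits per pixel and 30 frames per second as the factors on the slide suggest; the figures for the length of human history and the world population are illustrative assumptions, not values from the talk.

```python
# Number of distinct 16x12 images with 8 bits per pixel.
num_images = 2 ** (16 * 12 * 8)

# Rough bound on how many frames could ever have been seen:
# 30 frames/s x seconds per year x years of history x number of people.
frames_per_year = 30 * 60 * 60 * 24 * 365
human_history_years = 10_000         # assumed for illustration
world_population = 10_000_000_000    # assumed for illustration, rounded up

frames_ever_seen = frames_per_year * human_history_years * world_population

print(len(str(num_images)))          # decimal digits in the image count
print(len(str(frames_ever_seen)))    # decimal digits in the bound
print(num_images > frames_ever_seen)
```

Even with generous assumptions, the space of possible images is hundreds of orders of magnitude larger than anything that could be observed, which is why priors and statistics are needed.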
![Page 5: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/5.jpg)
Thoughts
• “The most compelling shapes are those near to our hearts: people’s faces, a gracefully moving body, a natural scene with rustling leaves and flowing water. Evolution has tuned us to these sights…”
[Lengyel, 1998]
• How do we see such “objects of interest”?
• Content extraction is more than processing bits…it’s signal processing + statistical learning
[Chen, 2007]
![Page 6: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/6.jpg)
Sample Projects in Content Retrieval
Beyond digital images/videos…
![Page 7: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/7.jpg)
Trademark Retrieval [Leung&Chen ICME’02]
Hand-Drawn Query
Retrieved Trademarks
![Page 8: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/8.jpg)
Sketch Retrieval
User sketches a query…
Query Sketch
Similar Sketch
Page Stored in Database
[Leung&Chen ICME’03]
![Page 9: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/9.jpg)
3D Object Retrieval [Zhang&Chen ACM MM’01]
![Page 10: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/10.jpg)
3D Protein Retrieval [Chen&Chen ICIP’02]
![Page 11: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/11.jpg)
Object Discovery
Object Discovery ≠ Object Detection
![Page 12: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/12.jpg)
Object Detection
Training Data (Labeled)
Test Data
[BioID Face Database]
![Page 13: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/13.jpg)
Object Discovery
[Caltech Face+Background Dataset]
Discover = Categorize + Localize
How did we do that?
![Page 14: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/14.jpg)
Object Discovery
[UIUC Car Dataset]
Discover = Categorize + Localize
How did we do that?
![Page 15: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/15.jpg)
Discovering Objects in Video
[YouTube/Google Video]
Discover = Categorize + Localize
How did we do that?
![Page 16: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/16.jpg)
The Approach
Feature Extraction
Statistical Learning
![Page 17: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/17.jpg)
Feature Extraction
Maximally Stable Extremal Regions (MSER) [Matas et al., 02]
“Patch”
![Page 18: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/18.jpg)
Scale Invariant Feature Transform (SIFT) [Lowe, 04]
• Robust to viewpoint, illumination, blurring, rotation, and scale changes
![Page 19: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/19.jpg)
Quantization into Visual Words
128-dim SIFT features → K-means → Visual Words (discrete symbols)
[Leung and Malik, 01]
Every image becomes a bag of words…
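The quantization step can be sketched as follows. This is an illustrative outline only: random vectors stand in for real 128-dimensional SIFT descriptors, the vocabulary size of 10 is arbitrary, and the function names are hypothetical rather than taken from the talk.

```python
import numpy as np

def nearest_word(descriptors, centers):
    """Index of the closest cluster center (visual word) for each descriptor."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def build_vocabulary(descriptors, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), size=k, replace=False)].copy()
    for _ in range(iters):
        words = nearest_word(descriptors, centers)
        for j in range(k):
            members = descriptors[words == j]
            if len(members):               # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def bag_of_words(descriptors, centers):
    """Histogram of visual-word counts for one image."""
    return np.bincount(nearest_word(descriptors, centers), minlength=len(centers))

rng = np.random.default_rng(1)
all_descriptors = rng.random((500, 128))    # stand-in for descriptors from many images
vocabulary = build_vocabulary(all_descriptors, k=10)
image_descriptors = rng.random((40, 128))   # stand-in for one image's descriptors
histogram = bag_of_words(image_descriptors, vocabulary)
```

Each descriptor is replaced by the index of its nearest center, so an image with 40 patches becomes 40 discrete symbols, summarized here as a count per word.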
![Page 20: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/20.jpg)
Statistical Learning
Feature Extraction
Statistical Learning
Single Image
Collection of Images
Video
![Page 21: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/21.jpg)
Goal
• Label each patch as background or object of interest
“Location” $r = (200, 200)$, “Appearance” $w = w_2$: $z$ = object of interest
“Location” $r = (300, 100)$, “Appearance” $w = w_3$: $z$ = background
![Page 22: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/22.jpg)
Probabilistic Model
$p(z, r, w) = p(z)\,p(r, w|z) = p(z)\,p(r|z)\,p(w|z)$
r: Location, w: Appearance, z: Obj/Bg
Image Characteristic $p(z)$: $p(z_1) = 0.7$, $p(z_2) = 0.3$
Location Semantics: $p(r|z_1)$ Gaussian, $p(r|z_2)$ uniform
Topic Appearance $p(w|z)$:
| | z1 | z2 |
| --- | --- | --- |
| w1 | 0.4 | 0.1 |
| w2 | 0.4 | 0.0 |
| w3 | 0.2 | 0.9 |
![Page 23: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/23.jpg)
Posterior Probability
$p(z|r, w) = \frac{p(z, r, w)}{\sum_z p(z, r, w)} = \frac{p(z)\,p(r|z)\,p(w|z)}{\sum_z p(z)\,p(r|z)\,p(w|z)}$
$r = (200, 200)$, $w = w_2$: $\hat{z} = \arg\max_z p(z|r, w)$
$r = (300, 100)$, $w = w_3$: $\hat{z} = \arg\max_z p(z|r, w)$
Posterior Probabilities ~ (Soft) Labels
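A small numerical sketch of the posterior for the two example patches. The values of p(z) and p(w|z) are those read from the Probabilistic Model slide, with z1 taken as the object of interest; the Gaussian mean and variance and the 400×400 image size are assumptions made only so the example runs.

```python
import numpy as np

p_z = np.array([0.7, 0.3])                # p(z): z1 = object, z2 = background
p_w_given_z = np.array([[0.4, 0.1],       # rows: w1, w2, w3; columns: z1, z2
                        [0.4, 0.0],
                        [0.2, 0.9]])

# Assumed location models (not from the slides): an isotropic Gaussian for
# the object and a uniform density over a 400x400 image for the background.
mean, var = np.array([200.0, 200.0]), 50.0 ** 2
image_area = 400.0 * 400.0

def p_r_given_z(r):
    diff = np.asarray(r, dtype=float) - mean
    gauss = np.exp(-0.5 * diff @ diff / var) / (2 * np.pi * var)
    return np.array([gauss, 1.0 / image_area])

def posterior(r, w):
    joint = p_z * p_r_given_z(r) * p_w_given_z[w]   # p(z) p(r|z) p(w|z)
    return joint / joint.sum()                      # normalize over z

post_a = posterior((200, 200), w=1)   # word w2 at r = (200, 200)
post_b = posterior((300, 100), w=2)   # word w3 at r = (300, 100)
print(post_a, post_a.argmax())
print(post_b, post_b.argmax())
```

Under these assumptions the first patch is assigned to the object (w2 never occurs in the background) and the second to the background, and the normalized vectors are the soft labels.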
![Page 24: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/24.jpg)
Only half of the story…
$p(w|z)$, $p(z)$, $p(r|z)$
$p(z|r, w)$
r: Location, w: Appearance, z: Obj/Bg
![Page 25: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/25.jpg)
Estimate Image Characteristic
How to estimate $p(z)$:
• If label is known ($z = z_1$): $p(z = z_1) = 1/2$
• If $p(z|w, r)$ is known: $p(z = z_1) = \left(\frac{1}{4} + \frac{3}{4}\right)/2 = 1/2$
r: Location, w: Appearance, z: Obj/Bg
![Page 26: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/26.jpg)
Estimate Topic Appearance
How to estimate $p(w|z)$:
• If label is known ($z = z_1$): $p(w = w_1|z = z_1) = \frac{1/2}{1/2} = 1$
• If $p(z|w, r)$ is known: $p(w = w_1|z = z_1) = \frac{\left(\frac{3}{4} + 0\right)/2}{1/2} = \frac{3}{4}$
Patch words: $w_1$, $w_2$
r: Location, w: Appearance, z: Obj/Bg
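The soft-label estimates can be reproduced with weighted counts. A minimal sketch, assuming two patches, one showing word w1 with posterior 3/4 for z1 and one showing word w2 with posterior 1/4 for z1; the array layout is an illustrative choice.

```python
import numpy as np

words = np.array([0, 1])             # patch 0 shows w1, patch 1 shows w2
post = np.array([[0.75, 0.25],       # soft labels p(z | w, r); columns z1, z2
                 [0.25, 0.75]])

# Image characteristic: average soft label, p(z1) = (3/4 + 1/4) / 2.
p_z = post.mean(axis=0)

# Joint p(w, z) from soft counts, then p(w | z) = p(w, z) / p(z).
num_words = 2
p_wz = np.zeros((num_words, 2))
for w, q in zip(words, post):
    p_wz[w] += q / len(words)
p_w_given_z = p_wz / p_z

print(p_z[0])                # p(z = z1)
print(p_w_given_z[0, 0])     # p(w = w1 | z = z1)
```

With hard labels the same code applies: the rows of `post` simply become 0/1 indicators.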
![Page 27: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/27.jpg)
Estimate Location Semantics
How to estimate mean and var of $p(r|z = z_1)$:
• If label is known ($z = z_1$)
• If $p(z|w, r)$ is known
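One way to estimate the mean and variance from soft labels is a weighted sample mean and covariance, with each patch weighted by its posterior for z1. This is a sketch of that standard estimator, not necessarily the exact form used in the talk; the function name is hypothetical.

```python
import numpy as np

def weighted_gaussian(r, weights):
    """Weighted mean and covariance of patch locations r (one row per patch)."""
    weights = weights / weights.sum()
    mean = weights @ r
    diff = r - mean
    cov = (weights[:, None] * diff).T @ diff
    return mean, cov

r = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
# Hard labels are the special case of 0/1 weights: only the first two
# patches are treated as belonging to z1 here.
mean, cov = weighted_gaussian(r, np.array([1.0, 1.0, 0.0]))
print(mean)
print(cov)
```

Replacing the 0/1 weights with posteriors $p(z = z_1|w, r)$ gives the soft-label version.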
![Page 28: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/28.jpg)
An Iterative Algorithm
$p(w|z)$, $p(z)$
Location Estimation: $p(r|z)$
$p(z|r, w)$
r: Location, w: Appearance, z: Obj/Bg
Can start anywhere, can seed anyhow…
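The loop can be sketched as an EM-style alternation: estimate the three model components from the current soft labels, then recompute the soft labels from the model. The code below is a minimal illustration for a single image under stated assumptions: an isotropic Gaussian for the object location, a uniform background, a random seed for the soft labels, and synthetic patches. None of the names or data come from the talk.

```python
import numpy as np

def gaussian_pdf(r, mean, var):
    """Isotropic 2-D Gaussian density evaluated at each row of r."""
    sq_dist = ((r - mean) ** 2).sum(axis=1)
    return np.exp(-0.5 * sq_dist / var) / (2 * np.pi * var)

def discover(r, w, num_words, image_area, iters=30, seed=0):
    rng = np.random.default_rng(seed)
    post = rng.dirichlet([1.0, 1.0], size=len(r))    # arbitrary soft-label seed
    for _ in range(iters):
        # Estimate p(z), p(w|z) and the Gaussian p(r|z1) from the soft labels.
        p_z = post.mean(axis=0)
        p_w_z = np.full((num_words, 2), 1e-6)         # small floor avoids zeros
        np.add.at(p_w_z, w, post)
        p_w_z /= p_w_z.sum(axis=0)
        q = post[:, 0] / post[:, 0].sum()
        mean = q @ r
        var = (q * ((r - mean) ** 2).sum(axis=1)).sum() / 2 + 1e-6
        # Recompute the posterior p(z|r, w) from the updated model.
        joint = np.column_stack([
            p_z[0] * gaussian_pdf(r, mean, var) * p_w_z[w, 0],
            p_z[1] * (1.0 / image_area) * p_w_z[w, 1],
        ])
        post = joint / joint.sum(axis=1, keepdims=True)
    return post, p_z, p_w_z, mean, var

# Synthetic patches (hypothetical data): 60 clustered "object" patches using
# words 0-1, and 140 scattered "background" patches using words 2-3.
rng = np.random.default_rng(1)
r = np.vstack([rng.normal(200.0, 20.0, size=(60, 2)),
               rng.uniform(0.0, 400.0, size=(140, 2))])
w = np.concatenate([rng.integers(0, 2, size=60), rng.integers(2, 4, size=140)])
post, p_z, p_w_z, mean, var = discover(r, w, num_words=4, image_area=400.0 ** 2)
```

Only the location model distinguishes z1 from z2 at the start, so which patches end up labeled as the object depends on the data and the seed; the sketch makes no convergence guarantee.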
![Page 29: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/29.jpg)
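Collection of Images
Single image: $p(z, r, w) = p(z)\,p(r|z)\,p(w|z)$
Collection: $p(z, r, w|d) = p(z|d)\,p(r|z, d)\,p(w|z, d) = p(z|d)\,p(r|z, d)\,p(w|z)$
$p(z|d)$:
| | d1 | d2 |
| --- | --- | --- |
| z1 | 0.4 | 0.8 |
| z2 | 0.6 | 0.2 |
Location: $p(r|z_1, d_1)$, $p(r|z_1, d_2)$
$p(w|z)$:
| | z1 | z2 |
| --- | --- | --- |
| w1 | 0.4 | 0.1 |
| w2 | 0.4 | 0.0 |
| w3 | 0.2 | 0.9 |
r: Location, w: Appearance, z: Obj/Bg, d: Image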
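To see the effect of the image-dependent term, the sketch below evaluates the same word in two images using the p(z|d) and p(w|z) values read from this slide. The location term $p(r|z, d)$ is left out for brevity, so this is the appearance-only posterior $p(z|w, d) \propto p(z|d)\,p(w|z)$, not the full model.

```python
import numpy as np

p_z_given_d = np.array([[0.4, 0.6],     # image d1: p(z1|d1), p(z2|d1)
                        [0.8, 0.2]])    # image d2: p(z1|d2), p(z2|d2)
p_w_given_z = np.array([[0.4, 0.1],     # rows: w1, w2, w3; columns: z1, z2
                        [0.4, 0.0],
                        [0.2, 0.9]])    # shared across all images

def appearance_posterior(d, w):
    joint = p_z_given_d[d] * p_w_given_z[w]
    return joint / joint.sum()

post_d1 = appearance_posterior(d=0, w=2)   # word w3 seen in image d1
post_d2 = appearance_posterior(d=1, w=2)   # word w3 seen in image d2
print(post_d1, post_d2)
```

The appearance table is shared, but the same word is more likely to belong to the object in an image where the object occupies more of the patches.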
![Page 30: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/30.jpg)
An Iterative Algorithm
$p(w|z)$, $p(z|d)$
Location Estimation: $p(r|z, d)$
$p(z|r, w, d)$
r: Location, w: Appearance, z: Obj/Bg, d: Image
Same as before, but location/characteristics are image-dependent
![Page 31: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/31.jpg)
An Example
[Caltech Face+Background Dataset]
![Page 32: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/32.jpg)
Location Semantics: $p(r|z = z_1, d)$
Topic Appearance: $p(w|z = z_1)$, $p(w|z = z_2)$
Posterior: $p(z|r, w, d)$
r: Location, w: Appearance, z: Obj/Bg, d: Image
![Page 33: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/33.jpg)
Video ≠ Collection of Images
Time
Smooth trajectory expected
![Page 34: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/34.jpg)
![Page 35: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/35.jpg)
![Page 36: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/36.jpg)
Motion Information
![Page 37: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/37.jpg)
$\nu^{(i)}$
![Page 38: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/38.jpg)
$\nu^{(i)}$
![Page 39: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/39.jpg)
$\nu^{(i)}$
![Page 40: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/40.jpg)
$\beta^{(i)} \propto N(\nu^{(i)}|0, S)$
$\nu \equiv \sum_i \beta^{(i)} \nu^{(i)}$
[Bar-Shalom, 80]
![Page 41: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/41.jpg)
$\beta^{(i)} \propto N(\nu^{(i)}|0, S)$
$\beta^{(i)} \propto N(\nu^{(i)}|0, S)\,p(z^{(i)} = z_1|w^{(i)}, r^{(i)}, d)$
$\nu \equiv \sum_i \beta^{(i)} \nu^{(i)}$
[Bar-Shalom, 80]
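The weighted innovation can be sketched numerically: each patch's innovation $\nu^{(i)}$ gets a weight proportional to its Gaussian likelihood $N(\nu^{(i)}|0, S)$ times its object posterior, and the weights are normalized. This is an illustrative reading of the slide; the "no measurement is correct" term found in standard probabilistic data association is omitted here, and the function name is hypothetical.

```python
import numpy as np

def combined_innovation(innovations, S, p_object):
    """Return (beta, nu) with beta_i ∝ N(nu_i | 0, S) * p_object_i."""
    S_inv = np.linalg.inv(S)
    # Mahalanobis term of the zero-mean Gaussian; its normalizing constant is
    # identical for every i and cancels when beta is normalized.
    maha = np.einsum("ij,jk,ik->i", innovations, S_inv, innovations)
    beta = np.exp(-0.5 * maha) * p_object
    beta = beta / beta.sum()
    return beta, beta @ innovations

innovations = np.array([[1.0, 0.0], [-1.0, 0.0]])
S = np.eye(2)

# Equal object posteriors: the two innovations cancel.
beta_eq, nu_eq = combined_innovation(innovations, S, np.array([1.0, 1.0]))
# Second patch judged to be background: only the first innovation counts.
beta_obj, nu_obj = combined_innovation(innovations, S, np.array([1.0, 0.0]))
```

Multiplying by the posterior keeps background patches from pulling the track, even when they lie close to the predicted position.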
![Page 42: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/42.jpg)
$\hat{s}^{+} = \hat{s}^{-} + W\nu$
$W = f\left(\Sigma_{\text{system}}, \Sigma_{\text{measurement}}, E\left[(s - \hat{s}^{+})^2\right]\right)$
![Page 43: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/43.jpg)
$\hat{s}^{+} = \hat{s}^{-} + W\nu$
$W = f\left(\Sigma_{\text{system}}, \Sigma_{\text{measurement}}, E\left[(s - \hat{s}^{+})^2\right]\right)$
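The state update $\hat{s}^{+} = \hat{s}^{-} + W\nu$ has the form of a Kalman filter correction. The sketch below uses the standard linear Kalman gain as one concrete choice of the function $f$; the slide does not spell out $f$, so the observation matrix `H`, the measurement covariance `R`, and the predicted covariance `P_pred` are assumptions of this example.

```python
import numpy as np

def kalman_update(s_pred, P_pred, nu, H, R):
    """One correction step: s_post = s_pred + W @ nu."""
    S = H @ P_pred @ H.T + R                  # innovation covariance
    W = P_pred @ H.T @ np.linalg.inv(S)       # gain from system/measurement noise
    s_post = s_pred + W @ nu
    P_post = (np.eye(len(s_pred)) - W @ H) @ P_pred
    return s_post, P_post, W

s_pred = np.array([0.0, 0.0])     # predicted object position
P_pred = np.eye(2)                # predicted state covariance
H = np.eye(2)                     # position is observed directly
R = np.eye(2)                     # measurement noise covariance
nu = np.array([2.0, 4.0])         # combined innovation

s_post, P_post, W = kalman_update(s_pred, P_pred, nu, H, R)
print(s_post)
```

With equal predicted and measurement uncertainty the gain is one half, so the corrected state moves halfway toward the measurement.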
![Page 44: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/44.jpg)
An Iterative Algorithm
$p(w|z)$, $p(z|d)$
Location Estimation / Motion Modeling: $p(r|z, d)$
$p(z|w, r, d)$
![Page 45: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/45.jpg)
An Iterative Algorithm
$p(w|z)$, $p(z|d)$
Motion Modeling: $p(r|z, d)$
$p(z|w, r, d)$
• Knowledge of appearance improves location estimate
![Page 46: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/46.jpg)
An Iterative Algorithm
$p(w|z)$, $p(z|d)$
Motion Modeling: $p(r|z, d)$
$p(z|w, r, d)$
• Knowledge of location improves appearance estimate
![Page 47: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/47.jpg)
Applications
• Object localization
• Categorization
– Video skimming
• Keyframe extraction
– Video summarization
![Page 48: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/48.jpg)
Input Video
CMU dataset
![Page 49: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/49.jpg)
Comparison
APP+LOC+MOTION: $p(w|z)$, $p(z|d)$, Motion Model $p(r|z, d)$, $p(z|w, r, d)$
APP+LOC: $p(w|z)$, $p(z|d)$, Location Estim. $p(r|z, d)$, $p(z|w, r, d)$
APP: $p(w|z)$, $p(z|d)$, $p(z|w, d)$ [Sivic et al. 05]
![Page 50: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/50.jpg)
Localization
[CMU Dataset]
APP+LOC+MOTION
APP+LOC
APP
![Page 51: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/51.jpg)
Categorization
• Top 40 frames out of 181, according to $p(z = z_1|d)$
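Ranking frames by $p(z = z_1|d)$ amounts to sorting the per-frame object probability and keeping the first k. A minimal sketch with random stand-in scores for 181 frames; in practice the scores would come from the learned model, and the function name is hypothetical.

```python
import numpy as np

def top_frames(p_object, k=40):
    """Indices of the k frames with the largest p(z = z1 | d), best first."""
    order = np.argsort(p_object)[::-1]
    return order[:k]

rng = np.random.default_rng(0)
p_object = rng.random(181)            # hypothetical per-frame scores
selected = top_frames(p_object, k=40)
```

Every selected frame scores at least as high as every frame left out, which is the property the slide relies on.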
[YouTube/Google Video]
![Page 52: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/52.jpg)
Categorization
[YouTube/Google Video]
• Top 40 frames out of 711, according to $p(z = z_1|d)$
![Page 53: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/53.jpg)
Keyframe Extraction on YouTube
[YouTube/Google Video]
![Page 54: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/54.jpg)
Keyframe Extraction – Our Result
181 frames. 2 frames/sec.
5 keyframes from top 40 frames, according to $p(z = z_1|d)$
[YouTube/Google Video]
![Page 55: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/55.jpg)
Keyframe Extraction on YouTube
[YouTube/Google Video]
![Page 56: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/56.jpg)
Keyframe Extraction – Our Result
711 frames. 2 frames/sec.
5 keyframes from top 40 frames, according to $p(z = z_1|d)$
[YouTube/Google Video]
![Page 57: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/57.jpg)
Extensions
• Geometric Consistency
• Semi-supervised
• Multiple classes and instances
• Hierarchical semantics of objects
![Page 58: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/58.jpg)
Geometric Consistency
– Background random, object consistent
– Matched patches more likely from object of interest
[Caltech-4 data set]
![Page 59: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/59.jpg)
Geometric Consistency
Correspondence Info
$p(m|z)$:
| # matches | z1 | z2 |
| --- | --- | --- |
| 0 ~ 5 | 0.36 | 0.97 |
| 6 ~ 10 | 0.37 | 0.02 |
| 11 ~ 15 | 0.20 | 0.01 |
| > 16 | 0.07 | 0.00 |
$p(z, w, r, m|d) = p(z|d)\,p(w|z)\,p(r|z, d)\,p(m|z)$
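Because the match count enters the joint as one extra factor $p(m|z)$, it can be folded into an existing posterior by multiplying and renormalizing. The sketch below assumes the slide's table assigns the larger match counts to z1 (the object of interest), and it assumes that exactly 16 matches falls in the last bin, which the slide's bin labels leave open.

```python
import numpy as np

p_m_given_z = np.array([[0.36, 0.97],   # 0 ~ 5 matches; columns z1, z2
                        [0.37, 0.02],   # 6 ~ 10
                        [0.20, 0.01],   # 11 ~ 15
                        [0.07, 0.00]])  # > 16

def match_bin(m):
    # Assumption: a count of exactly 16 is placed in the last bin.
    return int(np.searchsorted([5, 10, 15], m, side="left"))

def add_match_evidence(posterior, m):
    """Multiply p(z | w, r, d) by p(m | z) and renormalize over z."""
    joint = np.asarray(posterior) * p_m_given_z[match_bin(m)]
    return joint / joint.sum()

many = add_match_evidence([0.5, 0.5], m=12)   # many matches favor the object
few = add_match_evidence([0.5, 0.5], m=2)     # few matches favor the background
print(many, few)
```

A patch that was undecided on appearance and location alone is pushed strongly toward the object once it is matched consistently across images.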
![Page 60: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/60.jpg)
Semi-Supervised
• User provides limited information
– e.g., Label one frame
$p(w|z)$, $p(z|d)$
Location Estimation: $p(r|z, d)$
$p_L(z|w, r, d)$, $p_U(z|w, r, d)$
![Page 61: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/61.jpg)
Multiple Classes and Instances
• Multiple classes
• Multiple instances of the same object class
– Parametric methods
Model selection with BIC [Schwarz 78]
Variational Bayes [Attias 99]
– Nonparametric methods
Mean-shift [Comaniciu & Meer 01]
![Page 62: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/62.jpg)
Hierarchical Semantics of Objects [Parikh&Chen CVPR’07]
Collection of images, Corresponding hSO
hSO nodes: OFFICE, desk-area, computer, CHAIR, PHONE, MONITOR, KEYBOARD
![Page 63: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/63.jpg)
Summary
• Probabilistic framework for object discovery
– Incorporate information from appearance / location / motion / geometry
– Multiple classes and multiple instances possible
– Unsupervised and semi-supervised possible
– Discovery of hierarchical semantics of objects
![Page 64: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/64.jpg)
Finally…
![Page 65: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/65.jpg)
Some Related Work
![Page 66: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/66.jpg)
Camera Array
![Page 67: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/67.jpg)
What can be done…
[EyeVision]
[CMU 3D Dome]
[CMU CamArray]
![Page 68: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/68.jpg)
Beyond Camera Array: “Active Sensing”
![Page 69: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/69.jpg)
![Page 70: From Image Analysis to Content Extraction: Are We There Yet?chenlab.ece.cornell.edu/Publication/Tsuhan/keynote_ICIAP.pdf · Test Data [BioID Face Database] Tsuhan Chen Object DiscoveryObject](https://reader036.vdocument.in/reader036/viewer/2022070905/5f734e8488dca678d40724d7/html5/thumbnails/70.jpg)
Advanced Multimedia Processing Lab
Please visit us at: http://amp.ece.cmu.edu