![Page 1: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/1.jpg)
Building local part models for
category-level recognition
C. Schmid, INRIA Grenoble
Joint work with G. Dorko, S. Lazebnik, J. Ponce
![Page 2: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/2.jpg)
Introduction
• Invariant local descriptors
=> robust recognition of specific objects or scenes
• Recognition of textures and object classes
=> description of intra-class variation, selection of discriminant features, spatial relations
Figure captions: texture recognition; car detection
![Page 3: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/3.jpg)
Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition
![Page 4: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/4.jpg)
Affine-invariant texture recognition
• Texture recognition under viewpoint changes and non-rigid transformations
• Use of affine-invariant regions
– invariance to viewpoint changes
– spatial selection => more compact representation, reduction of redundancy in texton dictionary
[A sparse texture representation using affine-invariant regions,
S. Lazebnik, C. Schmid and J. Ponce, CVPR 2003]
![Page 5: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/5.jpg)
Spatial selection
clustering each pixel
clustering selected pixels
![Page 6: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/6.jpg)
Overview of the approach
![Page 7: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/7.jpg)
Harris detector
Laplace detector
Region extraction
![Page 8: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/8.jpg)
Descriptors – Spin images
![Page 9: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/9.jpg)
Signature and EMD
• Hierarchical clustering
=> Signature: $S = \{(m_1, w_1), \ldots, (m_k, w_k)\}$
• Earth Mover's Distance
– robust distance, optimizes the flow between distributions
– can match signatures of different size
– not sensitive to the number of clusters
$D(S, S') = \dfrac{\sum_{i,j} f_{ij}\, d(m_i, m'_j)}{\sum_{i,j} f_{ij}}$
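The slide says the distance "optimizes the flow" without listing the constraints on the flow. As a reminder, the standard Earth Mover's Distance formulation chooses the flow $f_{ij}$ by solving the transportation problem below. The symbols $k'$ and $w'_j$ (size and weights of the second signature) are notation assumed here and do not appear on the slide.

```latex
\begin{aligned}
\min_{f}\quad & \sum_{i=1}^{k}\sum_{j=1}^{k'} f_{ij}\, d(m_i, m'_j) \\
\text{s.t.}\quad & f_{ij} \ge 0, \\
& \sum_{j=1}^{k'} f_{ij} \le w_i, \qquad \sum_{i=1}^{k} f_{ij} \le w'_j, \\
& \sum_{i=1}^{k}\sum_{j=1}^{k'} f_{ij} = \min\Big(\sum_{i=1}^{k} w_i,\ \sum_{j=1}^{k'} w'_j\Big).
\end{aligned}
```

Because the constraints involve only the weights, the two signatures may have different numbers of clusters, which is the property the slide highlights.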
![Page 10: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/10.jpg)
Database with viewpoint changes
20 samples of 10 different textures
![Page 11: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/11.jpg)
Results
Spin images Gabor-like filters
![Page 12: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/12.jpg)
Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition
![Page 13: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/13.jpg)
A two-layer architecture
• Texture recognition + segmentation
• Classification of individual regions + spatial layout
[A generative architecture for semi-supervised texture
recognition, S. Lazebnik, C. Schmid, J. Ponce, ICCV 2003]
![Page 14: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/14.jpg)
A two-layer architecture
Modeling:
1. Distribution of the local descriptors (affine invariants)
• Gaussian mixture model
• estimation with EM, allows incorporating unsegmented images
2. Co-occurrence statistics of sub-class labels over affinely adapted neighborhoods
Segmentation + Recognition:
1. Generative model for initial class probabilities
2. Co-occurrence statistics + relaxation to improve labels
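The slides do not give the relaxation update rule. The sketch below uses the classic probabilistic relaxation-labeling update as a stand-in, so it is an assumption rather than the authors' exact procedure. The function name `relax`, the `compat` table (standing in for the co-occurrence statistics, with values in [-1, 1]), and the neighbor lists are all hypothetical.

```python
def relax(probs, neighbors, compat, iterations=10):
    """Refine per-region label probabilities using neighbor support.

    probs:     list of dicts, label -> probability, one dict per region
    neighbors: list of lists, neighbors[i] = indices of regions near region i
    compat:    dict, (label_a, label_b) -> compatibility in [-1, 1]
               (assumed stand-in for co-occurrence statistics)
    """
    labels = list(probs[0])
    for _ in range(iterations):
        updated = []
        for i, p in enumerate(probs):
            nbrs = neighbors[i]
            raw = {}
            for a in labels:
                # Average support that the neighbors lend to label a.
                if nbrs:
                    support = sum(
                        compat[(a, b)] * probs[j][b] for j in nbrs for b in labels
                    ) / len(nbrs)
                else:
                    support = 0.0
                raw[a] = p[a] * (1.0 + support)
            z = sum(raw.values())
            # Renormalize; keep the old estimate if everything vanished.
            updated.append({a: raw[a] / z for a in labels} if z > 0 else dict(p))
        probs = updated  # synchronous update of all regions
    return probs


# Toy example: an ambiguous region between two confident "brick" regions.
initial = [
    {"brick": 0.9, "wood": 0.1},
    {"brick": 0.45, "wood": 0.55},
    {"brick": 0.9, "wood": 0.1},
]
chain = [[1], [0, 2], [1]]
same_label_bonus = {
    ("brick", "brick"): 0.5, ("wood", "wood"): 0.5,
    ("brick", "wood"): -0.5, ("wood", "brick"): -0.5,
}
refined = relax(initial, chain, same_label_bonus, iterations=5)
```

In this toy run the middle region starts out leaning toward "wood" and is pulled toward "brick" by its neighbors, which is the qualitative effect of combining individual region classification with spatial layout.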
![Page 15: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/15.jpg)
Texture Dataset – Training Images
T1 (brick) T2 (carpet) T3 (chair) T4 (floor 1) T5 (floor 2) T6 (marble) T7 (wood)
![Page 16: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/16.jpg)
Effect of relaxation + co-occurrence
Original image
Top: before relaxation (individual regions), bottom: after relaxation (co-occurrence)
![Page 17: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/17.jpg)
Recognition + Segmentation Examples
![Page 18: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/18.jpg)
Animal Dataset – Training Images
• no manual segmentation, weakly supervised
• 10 training images per animal (with background)
• no purely negative images
![Page 19: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/19.jpg)
Recognition + Segmentation Examples
![Page 20: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/20.jpg)
Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition
![Page 21: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/21.jpg)
Object class detection/classification
• Description of intra-class variations of object parts
[Selection of scale inv. regions for object class recognition,
G. Dorko and C. Schmid, ICCV’03]
![Page 22: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/22.jpg)
Object class detection/classification
• Description of intra-class variations of object parts
• Selection of discriminant features (weakly supervised)
![Page 23: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/23.jpg)
Training the model
• Training phase 1
– Input: images of the object with background (positive images), no normalization or alignment of the image
– Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT
– Clustering: estimation of Gaussian mixture with EM
![Page 24: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/24.jpg)
Training the model
• Training phase 1
– Input: images of the object with background (positive images), no normalization or alignment of the image/object
– Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT
– Clustering: estimation of Gaussian mixture with EM
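To make the clustering step concrete, here is a minimal sketch of EM for a Gaussian mixture. It is a one-dimensional toy with hand-picked initial means. The descriptors in the talk are high-dimensional, and the slides do not specify the covariance model or the initialization, so the function `em_gmm_1d` and its parameters are illustrative assumptions only.

```python
import math


def em_gmm_1d(data, means, iterations=50):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            z = sum(dens)
            resp.append([d / z for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / nc
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / nc
            variances[c] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances


# Two well-separated groups of toy "descriptors".
samples = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
weights, means, variances = em_gmm_1d(samples, means=[0.0, 6.0])
```

The posteriors computed in the E-step are the quantities P(cluster | descriptor) that a MAP assignment of descriptors to clusters would rely on.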
![Page 25: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/25.jpg)
Training the model
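• Training phase 2 (selection)
– Input: verification set, positive and negative images
– Rank each cluster with likelihood (or mutual information)
– MAP classifier with the n top clusters
$R(c) = \dfrac{\sum_i P(c \mid d^u_i)}{\sum_j P(c \mid d^n_j)}$
where $d^u_i$ and $d^n_j$ are presumably the descriptors from the positive and negative verification images.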
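One plausible reading of the likelihood ranking is a per-cluster ratio: the summed posterior mass a cluster receives from descriptors of positive verification images, divided by the mass it receives from negative ones. The sketch below implements that reading. The function name `rank_clusters`, the input layout, and the small `eps` guard against division by zero are assumptions made for illustration.

```python
def rank_clusters(post_pos, post_neg, eps=1e-9):
    """Rank clusters by a likelihood-ratio score.

    post_pos[i][c]: P(cluster c | i-th descriptor from positive images)
    post_neg[j][c]: P(cluster c | j-th descriptor from negative images)
    Returns (cluster indices sorted best first, list of scores).
    """
    n_clusters = len(post_pos[0])
    scores = []
    for c in range(n_clusters):
        mass_pos = sum(row[c] for row in post_pos)
        mass_neg = sum(row[c] for row in post_neg)
        scores.append(mass_pos / (mass_neg + eps))
    order = sorted(range(n_clusters), key=lambda c: scores[c], reverse=True)
    return order, scores


# Toy posteriors for three clusters.
positives = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
negatives = [[0.1, 0.1, 0.8], [0.1, 0.2, 0.7]]
order, scores = rank_clusters(positives, negatives)
top_n = order[:2]  # the n top clusters kept for the classifier
```

Here cluster 0 fires mostly on positive images and ranks first, while cluster 2 fires mostly on negatives and ranks last.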
![Page 26: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/26.jpg)
Likelihood – mutual information
[Figure panels: Likelihood; Mutual Information. Labels on the slide: 5, 25]
– Likelihood: more discriminant but very specific
– Mutual information: discriminant but not too specific
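The contrast between the two criteria can be illustrated with a small mutual-information computation. The sketch assumes a binary formulation that the slides do not spell out: one variable records whether a descriptor falls in the cluster, the other whether it comes from a positive image. The function name and the count-based interface are hypothetical.

```python
import math


def mutual_information(n_pos_in, n_pos_total, n_neg_in, n_neg_total):
    """Mutual information (bits) between cluster membership and image class.

    n_pos_in / n_neg_in: descriptors assigned to the cluster, from
    positive / negative images; *_total: all descriptors of that class.
    """
    total = n_pos_total + n_neg_total
    joint = {
        (1, 1): n_pos_in,
        (0, 1): n_pos_total - n_pos_in,
        (1, 0): n_neg_in,
        (0, 0): n_neg_total - n_neg_in,
    }
    p_in = (n_pos_in + n_neg_in) / total
    p_cluster = {1: p_in, 0: 1.0 - p_in}
    p_class = {1: n_pos_total / total, 0: n_neg_total / total}
    mi = 0.0
    for (member, label), count in joint.items():
        if count == 0:
            continue  # zero-probability cells contribute nothing
        p = count / total
        mi += p * math.log2(p / (p_cluster[member] * p_class[label]))
    return mi


# A very specific cluster: only 5 positive descriptors, never on negatives.
mi_specific = mutual_information(5, 100, 0, 100)
# A broader cluster: 60 positive descriptors, 10 negative ones.
mi_broad = mutual_information(60, 100, 10, 100)
```

The very specific cluster has an unbounded positive-to-negative ratio but low mutual information because it covers few descriptors, whereas the broader cluster scores higher on mutual information. This matches the slide's remark that mutual information favors features that are discriminant but not too specific.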
![Page 27: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/27.jpg)
Results for test images (Harris-Laplace)

| Detection | 25 Likelihood | 10 Mutual Information |
| --- | --- | --- |
| 354 points | 49 correct + 37 incorrect | 31 correct + 20 incorrect |
| 277 points | 43 correct + 36 incorrect | 26 correct + 20 incorrect |
![Page 28: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/28.jpg)
Relaxation – propagation of probabilities
![Page 29: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/29.jpg)
Classification
• Assign each test descriptor to the most probable cluster (MAP)
• Each descriptor assigned to one of the top n clusters is positive
• If the number of positive descriptors is above a threshold p, classify the image as positive
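The three steps above can be sketched in a few lines. The input layout (one list of cluster posteriors per descriptor) and the function name `classify_image` are assumptions for illustration; "above a threshold" is read as strictly greater than p.

```python
def classify_image(posteriors, top_clusters, p):
    """Classify an image from its descriptors' cluster posteriors.

    posteriors:   list of lists, posteriors[i][c] = P(cluster c | descriptor i)
    top_clusters: indices of the n top-ranked clusters
    p:            threshold on the number of positive descriptors
    Returns (is_positive, number_of_positive_descriptors).
    """
    top = set(top_clusters)
    n_positive = 0
    for post in posteriors:
        # MAP assignment: the most probable cluster for this descriptor.
        best = max(range(len(post)), key=post.__getitem__)
        if best in top:
            n_positive += 1
    return n_positive > p, n_positive


# Four descriptors, three clusters; clusters 0 and 1 are the selected ones.
image_posteriors = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]
is_positive, count = classify_image(image_posteriors, top_clusters=[0, 1], p=2)
```

Three of the four toy descriptors land in a selected cluster, so with p = 2 the image is classified as positive.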
![Page 30: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/30.jpg)
Classification experiments

| Set | Images | Airplanes | Motorbikes | Wild Cats |
| --- | --- | --- | --- | --- |
| Training Phase 1 (training) | #Positive images | 200 | 200 | 25 |
| Training Phase 2 (verification) | #Positive images | 200 | 200 | 25 |
| Training Phase 2 (verification) | #Negative images | 450 | 450 | 450 |
| Testing (test) | #Positive images | 400 | 400 | 50 |
| Testing (test) | #Negative images | 450 | 450 | 450 |

Data sources: http://www.robots.ox.ac.uk/~vgg/data, Corel Image Library
![Page 31: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/31.jpg)
Results: Motorbikes
Equal-Error-Rates as a function of p.
Receiver-Operating-Characteristic
p=6
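For readers unfamiliar with the measure, the equal-error point of an ROC curve is the operating point where the false-positive rate equals the miss rate. The sketch below finds it by sweeping a decision threshold over per-image scores. Using the count of positive descriptors as the score is an assumption, and `equal_error_rate` is a hypothetical helper, not something from the slides.

```python
def equal_error_rate(pos_scores, neg_scores):
    """Find the threshold where miss rate and false-positive rate are closest.

    An image is called positive when its score is >= the threshold.
    Returns (threshold, error rate at that threshold).
    """
    best = None
    for t in sorted(set(pos_scores) | set(neg_scores)):
        miss_rate = sum(s < t for s in pos_scores) / len(pos_scores)
        false_pos_rate = sum(s >= t for s in neg_scores) / len(neg_scores)
        gap = abs(miss_rate - false_pos_rate)
        if best is None or gap < best[0]:
            best = (gap, t, (miss_rate + false_pos_rate) / 2)
    _, threshold, eer = best
    return threshold, eer


# Toy scores: higher means more object-like.
positive_image_scores = [9, 8, 7, 3]
negative_image_scores = [1, 2, 4, 6]
threshold, eer = equal_error_rate(positive_image_scores, negative_image_scores)
```

On this toy data one positive image is missed and one negative image is accepted at the chosen threshold, so both error rates are 25%.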
![Page 32: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/32.jpg)
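Classification results: ROC equal error rates

| Class | Detector | Best p | Best % | Estimated p | Estimated % | p=6 % | Fergus % |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Airplanes | Harris | 8 | **97.5** | 5 | 97 | 97.25 | - |
| Airplanes | Kadir | 18 | 97 | 30 | 96.5 | 96 | 94 |
| Motorbikes | Harris | 9 | **99** | 5 | 98 | 98.25 | - |
| Motorbikes | Kadir | 19 | 98.75 | 32 | 98.25 | 98 | 96 |
| Wild Cats | Harris | 31 | **94** | 34 | 92 | 72 | - |
| Wild Cats | Kadir | 17 | 86 | 45 | 82 | 84 | 90 |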
![Page 33: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/33.jpg)
Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition
![Page 34: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/34.jpg)
Affine-invariant part models
• Matching collections of local affine-invariant regions that map with an affine transformation => part
• Matching works for unsegmented images
• Model = a collection of parts
[Figure label: A]
![Page 35: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/35.jpg)
Matching: Faces
spurious match
![Page 36: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/36.jpg)
Matching: 3D Objects
closeup
closeup closeup
![Page 37: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/37.jpg)
Matching: Finding Repeated Patterns
![Page 38: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/38.jpg)
Matching: Finding Symmetry
![Page 39: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/39.jpg)
Modeling for Recognition
• Match multiple pairs of training images to produce several candidate parts.
• Use additional validation images to evaluate repeatability of parts and individual patches.
• Retain a fixed number of parts having the best repeatability score as class model.
• No background model
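The selection step can be sketched as follows. The slides do not define the repeatability score, so the sketch assumes the simplest choice: the fraction of validation images in which a candidate part is detected. The function `select_parts`, the part identifiers, and the boolean detection lists are hypothetical.

```python
def select_parts(detections, n_keep):
    """Keep the candidate parts with the best repeatability.

    detections: dict, part id -> list of booleans, one per validation image
                (True if the part was detected in that image)
    n_keep:     fixed number of parts retained as the class model
    Returns (list of kept part ids, dict of repeatability scores).
    """
    scores = {pid: sum(hits) / len(hits) for pid, hits in detections.items()}
    # Sort by decreasing score; break ties by part id for determinism.
    ranked = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return ranked[:n_keep], scores


# Three candidate parts evaluated on four validation images.
candidate_detections = {
    "part_a": [True, True, False, True],
    "part_b": [False, False, True, False],
    "part_c": [True, True, True, True],
}
model, repeatability = select_parts(candidate_detections, n_keep=2)
```

Only positive validation images are needed for this score, which is consistent with the slide's note that no background model is used.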
![Page 40: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/40.jpg)
The Butterfly Dataset
• 16 training images (8 pairs) per class
• 10 validation images per class
• 437 test images
• 619 images total
![Page 41: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/41.jpg)
Butterfly Models
Top two rows: pairs of images used for modeling. Bottom two rows: closeup views of some of the parts making up the models of the seven butterfly classes.
![Page 42: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/42.jpg)
Recognition
• Top 10 models per class used for recognition
• Multi-class classification results:
[Table; recoverable column header: total model size (smallest/largest)]
![Page 43: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/43.jpg)
Classification Rate vs. Number of Parts
[Plot; x-axis: number of parts]
![Page 44: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/44.jpg)
Successful Detection Examples
Model part (yellow: detected in test image; blue: occluded in test image)
Test image: all ellipses
Test image: matched ellipses
Note: only one of the two training images is shown
![Page 45: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/45.jpg)
Successful Detection Examples (cont.)
![Page 46: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/46.jpg)
Detection of Multiple Instances
![Page 47: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/47.jpg)
Detection Failures
![Page 48: Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce](https://reader035.vdocument.in/reader035/viewer/2022081520/56649ea95503460f94badd8d/html5/thumbnails/48.jpg)
Future Work
• Spatial relation
– non-rigid models
– relations between clusters and affine-invariant parts
• Feature selection: dimensionality reduction
• Shape information: appropriate descriptors
• Rapid search: structuring of the data