Object Recognition and Localization · az/icvss08_az_intro.pdf
Object Recognition and Localization
Andrew Zisserman
Visual Geometry Group
University of Oxford
http://www.robots.ox.ac.uk/~vgg
Includes slides from: Anna Bosch, Ondra Chum, Mark Everingham, Rob Fergus, Pawan Kumar, Bastian Leibe, Pietro Perona, Bernt Schiele, Josef Sivic and Manik Varma
What we would like to be able to do …
• Visual recognition and scene understanding
• What is in the image and where
• scene type: outdoor, city …
• object classes
• material properties
• actions
Various tasks
Image classification: “the image contains an airplane”
Object pixel-level segmentation: “the object ‘owns’ these pixels”
Object detection/localization: “the object is here, in a bounding box”
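The three tasks differ in how much spatial detail they report, and a finer output implies the coarser ones. The sketch below is illustrative only and is not from the slides: the names `Mask`, `Box`, `mask_to_box` and `contains_object`, and the toy `airplane_mask`, are hypothetical. It shows how a pixel-level segmentation determines a bounding box, which in turn determines the image-level answer.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # 1 where the object "owns" the pixel, else 0
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tightest bounding box around the object's pixels, or None if there are none."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols), max(rows))


def contains_object(mask: Mask) -> bool:
    """Image-level classification answer derived from the segmentation."""
    return mask_to_box(mask) is not None


# Toy 4x5 segmentation of a single object (hypothetical data).
airplane_mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

print(contains_object(airplane_mask))  # classification
print(mask_to_box(airplane_mask))      # detection/localization
```

The reverse direction does not hold: a class label does not say where the object is, and a bounding box does not say which pixels inside it belong to the object.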
Why is the recognition problem hard?
• Scale and shape of the imaged object vary with viewpoint
• Occlusion (self- or by a foreground object)
• Lighting changes
• Background “clutter”
Some object classes (Caltech datasets)
Difficulties:
• Size/shape variation
• Partial occlusion
• Lighting
• Background clutter
• Intra-class variation
Class of model: Pictorial Structure
• Intuitive model of an object
• Model has two components
1. parts (2D image fragments)
2. structure (configuration of parts)
• Dates back to Fischler & Elschlager 1973
Is this complexity of representation necessary?
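A minimal sketch of how the two components can be combined, assuming a Fischler & Elschlager style cost: each part pays an appearance mismatch cost at its chosen location, and each connected pair of parts pays a "spring" cost for deviating from an ideal relative offset. The quadratic spring, the brute-force search, and all names (`configuration_cost`, `best_configuration`, `stiffness`, the toy "head"/"torso" data) are assumptions made for illustration, not details taken from the slides.

```python
from itertools import product
from typing import Dict, Tuple

Point = Tuple[int, int]
# appearance_cost[part][location]: how badly the part's image fragment matches there
AppearanceCost = Dict[str, Dict[Point, float]]
# springs[(a, b)]: ideal offset (dx, dy) of part b relative to part a
Springs = Dict[Tuple[str, str], Point]


def configuration_cost(locations: Dict[str, Point],
                       appearance_cost: AppearanceCost,
                       springs: Springs,
                       stiffness: float = 1.0) -> float:
    """Sum of part mismatch costs plus quadratic deformation costs."""
    cost = sum(appearance_cost[part][loc] for part, loc in locations.items())
    for (a, b), (dx, dy) in springs.items():
        ax, ay = locations[a]
        bx, by = locations[b]
        cost += stiffness * ((bx - ax - dx) ** 2 + (by - ay - dy) ** 2)
    return cost


def best_configuration(appearance_cost: AppearanceCost,
                       springs: Springs,
                       stiffness: float = 1.0):
    """Exhaustive search over all candidate placements (fine for toy sizes only)."""
    parts = list(appearance_cost)
    best = None
    for combo in product(*(appearance_cost[p] for p in parts)):
        locations = dict(zip(parts, combo))
        cost = configuration_cost(locations, appearance_cost, springs, stiffness)
        if best is None or cost < best[0]:
            best = (cost, locations)
    return best


# Toy example: two parts, two candidate locations each (hypothetical numbers).
appearance = {
    "head": {(5, 2): 0.1, (9, 9): 0.0},
    "torso": {(5, 6): 0.2, (0, 0): 0.1},
}
springs = {("head", "torso"): (0, 4)}  # torso ideally 4 pixels below the head

cost, placement = best_configuration(appearance, springs)
print(cost, placement)
```

In this toy case the head candidate with the best appearance score on its own is rejected because it cannot be reconciled with any torso location; the structure term is what separates a plausible configuration from a set of independent part matches.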
Deformations
Learning
Unclear how to model categories, so learn rather than manually specify
Levels of supervision in learning
• pixel level (regions)
• “parts”
• ROI (region of interest)
• none (weak supervision)
Specific object recognition history: Geometric methods
• David Lowe [1985]
• Rothwell et al. [1992]
Specific object recognition history: Appearance-based methods
• Murase & Nayar 1995
• Schmid & Mohr 1996
• Lowe 1999
• …
Original use of SIFT descriptor
History: early object categorization

• Rowley & Kanade, 1998
• Schneiderman & Kanade, 2000
• Viola and Jones, 2000
• Heisele et al., 2001

• LeCun et al., 1998
• Amit and Geman, 1999
• Belongie and Malik, 2002
• DeCoste and Schölkopf, 2002
• Simard et al., 2003

• Poggio et al., 1993
• Schneiderman & Kanade, 2000
• Agarwal and Roth, 2002
• …
Application I: improving photography
Application II: Improving online search
Query: STREET
Application III: Organizing photo collections
Outline
1. Bag of visual words model for categorization
• SVM classifier
2. Adding spatial information for localization
3. Databases and challenges
4. Spatial layout
5. Class based segmentation
• Pixel level localization
6. Conclusions and the future