visual recognition with humans in the loop
DESCRIPTION
Visual Recognition With Humans in the Loop. ECCV 2010, Crete, Greece. Steve Branson Catherine Wah Florian Schroff Boris Babenko Serge Belongie. Peter Welinder Pietro Perona. What type of bird is this?. Field Guide. …?. What type of bird is this?. Computer Vision. ?. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/1.jpg)
1
Visual Recognition With Humans in the Loop
Steve Branson
Catherine Wah
Florian Schroff
Boris Babenko
Serge Belongie
Peter WelinderPietro Perona
ECCV 2010, Crete, Greece
![Page 2: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/2.jpg)
2
What type of bird is this?
![Page 3: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/3.jpg)
What type of bird is this?
3…?
Field Guide
![Page 4: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/4.jpg)
4
What type of bird is this?
Computer Vision
?
![Page 5: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/5.jpg)
5
What type of bird is this?
Bird?
Computer Vision
![Page 6: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/6.jpg)
6
What type of bird is this?
Chair?Bottle?
Computer Vision
![Page 7: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/7.jpg)
7
Parakeet Auklet
• Field guides difficult for average users
• Computer vision doesn’t work perfectly (yet)
• Research mostly on basic-level categories
![Page 8: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/8.jpg)
8
Visual Recognition With Humans in the Loop
Parakeet Auklet
What kind of bird is this?
![Page 9: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/9.jpg)
9
Levels of Categorization
Airplane? Chair? Bottle? …
Basic-Level Categories
[Griffin et al. ‘07, Lazebnik et al. ‘06, Grauman et al. ‘06, Everingham et al. ‘06, Felzenzwalb et al. ‘08, Viola et al. ‘01, … ]
![Page 10: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/10.jpg)
10
Levels of Categorization
American Goldfinch? Indigo Bunting? …
Subordinate Categories
[Belhumeur et al. ‘08 , Nilsback et al. ’08, …]
![Page 11: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/11.jpg)
11
Levels of Categorization
Yellow Belly? Blue Belly?…
Parts and Attributes
[Farhadi et al. ‘09, Lampert et al. ’09, Kumar et al. ‘09]
![Page 12: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/12.jpg)
12
Visual 20 Questions GameBlue Belly? no
Cone-shaped Beak? yes
Striped Wing? yes
American Goldfinch? yes
Hard classification problems can be turned into a sequence of easy ones
![Page 13: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/13.jpg)
13
Recognition With Humans in the Loop
Computer Vision
Cone-shaped Beak? yes
American Goldfinch? yes
Computer Vision
• Computers: reduce number of required questions• Humans: drive up accuracy of vision algorithms
![Page 14: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/14.jpg)
14
Research Agenda
Heavy Reliance on Human Assistance
More Automated
Computer Vision
Improves
Blue belly? noCone-shaped beak? yesStriped Wing? yesAmerican Goldfinch? yes
Striped Wing? yesAmerican Goldfinch? yes
Fully AutomaticAmerican Goldfinch? yes
2010 2015
2025
![Page 15: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/15.jpg)
15
Field Guides
www.whatbird.com
![Page 16: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/16.jpg)
16
Field Guides
www.whatbird.com
![Page 17: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/17.jpg)
17
Example Questions
![Page 18: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/18.jpg)
18
Example Questions
![Page 19: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/19.jpg)
19
Example Questions
![Page 20: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/20.jpg)
20
Example Questions
![Page 21: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/21.jpg)
21
Example Questions
![Page 22: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/22.jpg)
22
Example Questions
![Page 23: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/23.jpg)
23
Basic Algorithm
Input Image ( )xQuestion 1:
Is the belly black?
Question 2:Is the bill hooked?
Computer Vision
A: NO
A: YES
)|( xcp
),|( 1uxcp
),,|( 21 uuxcp
1u
2u
Max Expected Information Gain
Max Expected Information Gain
…
![Page 24: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/24.jpg)
24
Without Computer Vision
Input Image ( )xQuestion 1:
Is the belly black?
Question 2:Is the bill hooked?
Class Prior
A: NO
A: YES
)(cp
)|( 1ucp
),|( 21 uucp
1u
2u
Max Expected Information Gain
Max Expected Information Gain
…
![Page 25: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/25.jpg)
25
Basic Algorithm
Select the next question that maximizes expected information gain:• Easy to compute if we can to estimate probabilities of
the form:)...,,|( 21 tuuuxcp
Object Class
Image Sequence of user responses
![Page 26: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/26.jpg)
26
Basic Algorithm
Zxcpcuuup t )|()|...,( 21
Model of user responses
Computer vision
estimate
)...,,|( 21 tuuuxcp
Normalization factor
![Page 27: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/27.jpg)
27
Basic Algorithm
Zxcpcuuup t )|()|...,( 21
Model of user responses
Computer vision
estimate
)...,,|( 21 tuuuxcp
Normalization factor
![Page 28: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/28.jpg)
28
Modeling User Responses
ti it cupcuuup...121 )|()|...,(• Assume:
• Estimate using Mechanical Turk)|( cup i
grey red
blac k
whi
tebr
own
blue
grey red
blac k
whi
tebr
own
blue
grey red
blac k
whi
tebr
own
blue
Definitely
Probably Guessing
What is the color of the belly?
Pine Grosbeak
![Page 29: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/29.jpg)
29
Incorporating Computer Vision
• Use any recognition algorithm that can estimate: p(c|x)
• We experimented with two simple methods:
)}(exp{)|( xmxcp 1-vs-all SVM
i i capxcp )|()|(
Attribute-based classification
[Lampert et al. ’09, Farhadi et al. ‘09]
![Page 30: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/30.jpg)
30
Incorporating Computer Vision
[Vedaldi et al. ’08, Vedaldi et al. ’09]
Self SimilarityColor Histograms Color Layout
Bag of WordsSpatial Pyramid
Geometric Blur Color SIFT, SIFT
Multiple Kernels
•Used VLFeat and MKL code + color features
![Page 31: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/31.jpg)
31
Birds 200 Dataset•200 classes, 6000+ images, 288 binary attributes
•Why birds?
Black-footed Albatross
Groove-Billed Ani
Parakeet Auklet Field Sparrow Vesper Sparrow
Arctic Tern Forster’s Tern Common Tern Baird’s Sparrow Henslow’s Sparrow
![Page 32: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/32.jpg)
32
Birds 200 Dataset•200 classes, 6000+ images, 288 binary attributes
•Why birds?
Black-footed Albatross
Groove-Billed Ani
Parakeet Auklet Field Sparrow Vesper Sparrow
Arctic Tern Forster’s Tern Common Tern Baird’s Sparrow Henslow’s Sparrow
![Page 33: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/33.jpg)
33
Birds 200 Dataset•200 classes, 6000+ images, 288 binary attributes
•Why birds?
Black-footed Albatross
Groove-Billed Ani
Parakeet Auklet Field Sparrow Vesper Sparrow
Arctic Tern Forster’s Tern Common Tern Baird’s Sparrow Henslow’s Sparrow
![Page 34: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/34.jpg)
34
Results: Without Computer Vision
Comparing Different User Models
![Page 35: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/35.jpg)
35
Results: Without Computer Vision
if users answers agree with field guides…Perfect Users: 100% accuracy in 8≈log2(200) questions
![Page 36: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/36.jpg)
36
Results: Without Computer VisionReal users answer questions
MTurkers don’t always agree with field guides…
![Page 37: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/37.jpg)
37
Results: Without Computer VisionReal users answer questions
MTurkers don’t always agree with field guides…
![Page 38: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/38.jpg)
38
Results: Without Computer VisionProbabilistic User Model: tolerate imperfect user responses
![Page 39: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/39.jpg)
39
Results: With Computer Vision
![Page 40: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/40.jpg)
40
Results: With Computer Vision
Users drive performance: 19% 68%
Just Computer Vision19%
![Page 41: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/41.jpg)
41
Results: With Computer Vision
Computer Vision Reduces Manual Labor: 11.1 6.5 questions
![Page 42: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/42.jpg)
42
Examples
Without computer vision: Q #1: Is the shape perching-like? no (Def.)With computer vision: Q #1: Is the throat white? yes (Def.)
Western Grebe
Different Questions Asked w/ and w/out Computer Vision
perching-like
![Page 43: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/43.jpg)
43
Examples
computer vision
Magnolia Warbler
User Input Helps Correct Computer Vision
Is the breast pattern solid? no (definitely)
Common Yellowthroat
Magnolia Warbler
Common Yellowthroat
![Page 44: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/44.jpg)
44
Recognition is Not Always SuccessfulAcadian Flycatcher
Least Flycatcher
Parakeet Auklet
Least Auklet
Is the belly multi-colored? yes (Def.)
Unlimited questions
![Page 45: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/45.jpg)
Summary
11.1 6.5 questions
Computer vision reduces manual labor
Users drive up performance
19%
45
Recognition of fine-grained categories
More reliable than field guides
![Page 46: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/46.jpg)
Summary
11.1 6.5 questions
Computer vision reduces manual labor
Users drive up performance
19%
46
Recognition of fine-grained categories
More reliable than field guides
![Page 47: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/47.jpg)
Summary
11.1 6.5 questions
Computer vision reduces manual labor
Users drive up performance
19%
47
Recognition of fine-grained categories
More reliable than field guides
![Page 48: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/48.jpg)
Summary
11.1 6.5 questions
Computer vision reduces manual labor
Users drive up performance
19%
48
Recognition of fine-grained categories
More reliable than field guides
![Page 49: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/49.jpg)
49
Future Work• Extend to domains other than birds• Methodologies for generating questions• Improve computer vision
![Page 50: Visual Recognition With Humans in the Loop](https://reader035.vdocument.in/reader035/viewer/2022081422/56816587550346895dd842a5/html5/thumbnails/50.jpg)
50
Questions?
Project page and datasets available at:http://vision.caltech.edu/visipedia/
http://vision.ucsd.edu/project/visipedia/