what, where & how many? combining object detectors and crfs
DESCRIPTION
What, Where & How Many? Combining Object Detectors and CRFs. Ľ ubor Ladick ý , Paul Sturgess, Karteek Alahari, Christopher Russell, Philip H.S. Torr. http://cms.brookes.ac.uk/research/visiongroup/. Complete Scene Understanding. Involves - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/1.jpg)
What, Where & How Many?Combining Object Detectors and CRFs
Ľubor Ladický, Paul Sturgess, Karteek Alahari, Christopher Russell, Philip H.S. Torr
http://cms.brookes.ac.uk/research/visiongroup/
![Page 2: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/2.jpg)
Complete Scene Understanding
Involves
Localization of all instances of foreground objects (“things”)
Localization of all background classes (“stuff”)
Pixel-wise segmentation
3D reconstruction
Pose detection
Action recognition
Event recognition
![Page 3: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/3.jpg)
Complete Scene Understanding
Involves
Localization of all instances of foreground objects (“things”)
Localization of all background classes (“stuff”)
Pixel-wise segmentation
3D reconstruction
Pose detection
Action recognition
Event recognition
Semantic Segmentation
![Page 4: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/4.jpg)
Complete Scene Understanding
Involves
Localization of all instances of foreground objects (“things”)
Localization of all background classes (“stuff”)
Pixel-wise segmentation
![Page 5: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/5.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, detect every thing in it.
Thing : An object with a specific size and shape.
Adelson, Forsyth et al. 96
![Page 6: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/6.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, detect every thing in it.
Thing : An object with a specific size and shape.
Adelson, Forsyth et al. 96
![Page 7: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/7.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, detect every thing in it.
Thing : An object with a specific size and shape.
Adelson, Forsyth et al. 96
![Page 8: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/8.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, detect every thing in it.
Thing : An object with a specific size and shape.
Adelson, Forsyth et al. 96
![Page 9: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/9.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, detect every thing in it.
Thing : An object with a specific size and shape.
Adelson, Forsyth et al. 96
![Page 10: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/10.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, label all the stuff
Stuff : Material defined by a homogeneous or repetitive pattern, with no specific spatial extent / shape. Adelson, Forsyth et al. 96
![Page 11: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/11.jpg)
Semantic Scene Understanding
We're interested in whole scene understandingGiven an image, label all the stuff
Stuff : Material defined by a homogeneous or repetitive pattern, with no specific spatial extent / shape. Adelson, Forsyth et al. 96
![Page 12: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/12.jpg)
Semantic Scene Understanding
Our plan is to combine– State of the art sliding window object detection
– State of the art segmentation techniques
car
![Page 13: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/13.jpg)
Algorithms for Object Localization
Sliding window detectors
HOG descriptor (Dalal & Triggs CVPR05)
Based on histograms of features (Vedaldi et al. ICCV09)
Part-based models (Felzenszwalb et al. CVPR09)
![Page 14: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/14.jpg)
Algorithms for Object Localization
Sliding window detectors
HOG descriptor (Dalal & Triggs CVPR05)
Based on histograms of features (Vedaldi et al. ICCV09)
Part-based models (Felzenszwalb et al. CVPR09)
![Page 15: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/15.jpg)
Algorithms for Object Localization
Sliding window detectors
HOG descriptor (Dalal & Triggs CVPR05)
Based on histograms of features (Vedaldi et al. ICCV09)
Part-based models (Felzenszwalb et al. CVPR09)
![Page 16: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/16.jpg)
Algorithms for Object Localization
Non-maxima suppression
Greedily
Using field of indicator variables (Ramanan et al ICCV09)
• One binary variable yi { 0, 1 } per location/scale
![Page 17: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/17.jpg)
Algorithms for Object Localization
car
Non-maxima suppression
Greedily
Using field of indicator variables (Ramanan et al ICCV09)
• One binary variable yi { 0, 1 } per location/scale
![Page 18: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/18.jpg)
Sliding window detectors
• Sliding window + Segmentation
– OBJCUT (Kumar et al. 05)
– Updating colour model (GrabCut - Rother et al. 04)
car
![Page 19: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/19.jpg)
Sliding window detectors
Sliding window detectors not good for “stuff”
Try to detect “sky” !
![Page 20: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/20.jpg)
Sliding window detectors
Sliding window detectors not good for “stuff”
sky
Sky is irregular shape not suitedto the sliding window approach
![Page 21: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/21.jpg)
Sliding window detectors
State-of-the-art for object detection (“things”) Do not work for background classes (“stuff”)
No distinct shapeCannot be enclosed in a box
Cannot recover from incorrect detections
![Page 22: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/22.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over pixels
Shotton et al. ECCV06
Input image
Final segmentation
CRF construction
![Page 23: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/23.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over pixels
Shotton et al. ECCV06
Input image
Training of Potentials
CRF construction
![Page 24: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/24.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over pixels
MAP
Shotton et al. ECCV06
Input image
Final segmentation
Training of Potentials
CRF construction
![Page 25: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/25.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over Super-pixels / Segments
Input image
Unsupervisedsegmentation
Shi, Malik PAMI2000, Comaniciu, Meer PAMI2002,Felzenszwalb, Huttenlocher, IJCV2004
![Page 26: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/26.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over Super-pixels / Segments
Input image
Unsupervisedsegmentation
Shi, Malik PAMI2000, Comaniciu, Meer PAMI2002,Felzenszwalb, Huttenlocher, IJCV2004
![Page 27: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/27.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over Super-pixels / Segments
Batra et al. CVPR08, Yang et al. CVPR07, Zitnick et al. CVPR08, Rabinovich et al. ICCV07, Boix et al. CVPR10
Input image Training of potentials
Unsupervisedsegmentation
![Page 28: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/28.jpg)
Algorithms for Object-class Segmentation
Pairwise CRF over Super-pixels / Segments
Batra et al. CVPR08, Yang et al. CVPR07, Zitnick et al. CVPR08, Rabinovich et al. ICCV07, Boix et al. CVPR10
Input image
Final segmentation
Training of potentials
Unsupervisedsegmentation
MAP
![Page 29: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/29.jpg)
Algorithms for Object-class Segmentation
Associative Hierarchical CRF
Ladický et al. ICCV09
Input image
Multiplesegmentationsor hierarchies
![Page 30: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/30.jpg)
Algorithms for Object-class Segmentation
Associative Hierarchical CRF
Ladický et al. ICCV09
Input image
Multiplesegmentationsor hierarchies
CRF construction
![Page 31: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/31.jpg)
Algorithms for Object-class Segmentation
Associative Hierarchical CRF
Ladický et al. ICCV09, Russell et al. UAI10
Input image
Final segmentation
Multiplesegmentationsor hierarchies
MAP
CRF construction
![Page 32: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/32.jpg)
Associative Hierarchical CRF
State-of-the-art performance on MSRC & CamVid
• Merges information at multiple scales
• Can recover from incorrect segmentations
Ladický et al. ICCV09, Sturgess et al., BMVC09
![Page 33: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/33.jpg)
Associative Hierarchical CRF
State-of-the-art performance on MSRC & CamVid
• Merges information at multiple scales
• Can recover from incorrect segmentations
However,
• No concept of “things”
• Cannot distinguish between
multiple instances
Ladický et al. ICCV09, Sturgess et al., BMVC09
car
![Page 34: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/34.jpg)
CRF Formulation with Detectors
CRF formulation altered with a potential for each detection
![Page 35: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/35.jpg)
CRF Formulation with Detectors
CRF formulation altered with a potential for each detection
CRF graphover pixels
AH-CRF energy without detectors
![Page 36: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/36.jpg)
CRF Formulation with Detectors
CRF formulation altered with a potential for each detection
Set of pixels of d-th detection
Classifier response
Detected label
CRF graphover pixels
AH-CRF energy without detectors
![Page 37: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/37.jpg)
CRF Formulation with Detectors
Joint CRF formulation should contain
• Possibility to reject detection hypothesis
• Recover the status of the detection (0 / 1)
Thus, potential is a minimum over indicator variable yd { 0, 1 }
CRF graphover pixels
Indicator variables
![Page 38: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/38.jpg)
CRF Formulation with Detectors
Detection potential should decrease the energy of labelling agreeing with the detection hypothesis
Partial disagreement penalized but not directly rejects the hypothesis
CRF graphover pixels
Indicator variables
![Page 39: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/39.jpg)
CRF Formulation with Detectors
Detection potential should decrease the energy of labelling agreeing with the detection hypothesis
Partial disagreement penalized but not directly rejects the hypothesis
Detection hypothesis status Detection strength
Partial inconsistency cost
CRF graphover pixels
![Page 40: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/40.jpg)
CRF Formulation with Detectors
Detection potential should decrease the energy of labelling agreeing with the detection hypothesis
Partial disagreement penalized but not directly rejects the hypothesis
Linear thresholded Linear
CRF graphover pixels
![Page 41: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/41.jpg)
Inference over CRF with Detectors
This higher order potential can be transformed to
Solvable with all graph cut-based methods
which take the form of Robust PN (Kohli et al. CVPR08)
![Page 42: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/42.jpg)
Results on CamVid dataset
Result without detections Set of detections Final Result
Brostow et al.
Sturgess et al.
Brostow et al. ECCV08, Sturgess et al. BMVC09
![Page 43: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/43.jpg)
Results on CamVid dataset
Result without detections
Set of detections Final Result
Also provides number of object instances (using yd’s)
![Page 44: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/44.jpg)
Results on VOC2009 dataset
Input image CRF without detectors
CRF with detectors
Input image CRF without detectors
CRF with detectors
![Page 45: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/45.jpg)
Summary
Novel formulation of CRF with detectors Can recover from incorrect detections Possible to obtain instances of objects Efficiently solvable
![Page 46: What, Where & How Many? Combining Object Detectors and CRFs](https://reader035.vdocument.in/reader035/viewer/2022062807/56815161550346895dbf852d/html5/thumbnails/46.jpg)
Future and ongoing work
Automatic non-maximum suppression Part-based models in CRF New things & stuff dataset available soon!
Questions?