![Page 1: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/1.jpg)
Computer Vision Group, UC Berkeley
How should we combine high level and low level knowledge?
Jitendra Malik UC Berkeley
Recognition using regions is joint work with Chunhui Gu, Joseph Lim & Pablo Arbelaez (CVPR 2009)
![Page 2: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/2.jpg)
The central problems of vision
Grouping/Segmentation
3D structure/Figure-Ground
Object and Scene Recognition
![Page 3: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/3.jpg)
Detection and Segmentation: Giraffes
Figure panels: original image and segmentation (two examples)
![Page 4: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/4.jpg)
Detection and Segmentation: Mugs
Figure panels: original image and segmentation (two examples)
![Page 5: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/5.jpg)
Outline
• Current paradigm: Multiscale scanning
• Our approach
  – Bottom-up region segmentation
  – Hough transform style voting (learned weights)
  – Top-down segmentation
• Results on ETHZ, Caltech 101, MSRC
![Page 6: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/6.jpg)
Detection: Is this an X?
Ask this question repeatedly, varying position, scale, category…
Paradigm introduced by Rowley, Baluja & Kanade 96 for face detection
Viola & Jones 01, Dalal & Triggs 05, Felzenszwalb, McAllester & Ramanan 08
![Page 7: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/7.jpg)
Problems with the multi-scale scanning paradigm
• Computational complexity
  – 10^6 windows, 10 scales, 10^4 categories
• Not natural for irregularly shaped objects
• Segmentation is delinked
• Context is delinked
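To make the computational complexity point concrete, here is a rough back-of-the-envelope sketch using the three figures from the slide. It assumes (this is not stated on the slide) that the classifier is evaluated once for every combination of window, scale and category; the variable names are illustrative only.

```python
# Rough cost of multi-scale scanning, using the figures quoted on the slide.
windows = 10**6      # candidate windows (from the slide)
scales = 10          # scales scanned (from the slide)
categories = 10**4   # object categories (from the slide)

# Assumption: one classifier evaluation per (window, scale, category) triple.
evaluations = windows * scales * categories

print(f"{evaluations:.0e} classifier evaluations per image")
```

Under that assumption the count is on the order of 10^11 evaluations per image, which is the scale of the problem the slide is pointing at.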
![Page 8: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/8.jpg)
Our Approach
• Perceptual Organization provides the right primitives for visual recognition.
• After more than a decade of work, we finally have high-quality, generic detectors for contours and regions. We now only need to work with ~100 elements, each with its local scale estimate.
• In this talk, we demonstrate recognition using regions. Detection and segmentation happen in the same framework.
• There will always be some errors in the bottom-up grouping process; the recognition machinery needs to be robust to them.
![Page 9: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/9.jpg)
Contour Detection (CVPR 2008)
![Page 10: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/10.jpg)
Region Detection (CVPR 2009)
![Page 11: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/11.jpg)
Region detector wins on any measure!
Region Benchmarks on BSDS: Probabilistic Rand Index, Variation of Information
Region Benchmarks on MSRC/PASCAL08
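For readers unfamiliar with the Variation of Information measure named on this slide, the sketch below computes it for two segmentations given as flat lists of per-pixel region labels. It uses the standard definition VI(A, B) = H(A|B) + H(B|A); the function name, the toy label lists and the choice of natural logarithm are assumptions for illustration, not the benchmark code behind the slide.

```python
from collections import Counter
from math import log


def variation_of_information(seg_a, seg_b):
    """VI between two labelings of the same pixels (lower means closer agreement)."""
    if len(seg_a) != len(seg_b):
        raise ValueError("both segmentations must label the same pixels")
    n = len(seg_a)
    count_a = Counter(seg_a)              # pixels per region in A
    count_b = Counter(seg_b)              # pixels per region in B
    count_ab = Counter(zip(seg_a, seg_b)) # pixels per (A-region, B-region) overlap

    vi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # H(A|B) + H(B|A) = -sum p_ab * [log(p_ab / p_b) + log(p_ab / p_a)]
        vi -= p_ab * (log(n_ab / count_b[b]) + log(n_ab / count_a[a]))
    return vi


# Toy 4-pixel "images": identical labelings agree perfectly; crossed ones do not.
same = variation_of_information([0, 0, 1, 1], [0, 0, 1, 1])
crossed = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Identical segmentations give a VI of zero, and the value grows as the two labelings carve the pixels up differently.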
![Page 12: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/12.jpg)
![Page 13: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/13.jpg)
![Page 14: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/14.jpg)
![Page 15: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/15.jpg)
![Page 16: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/16.jpg)
Parallelizing Image Segmentation
Catanzaro et al., UC Berkeley, ICCV 09
• GTX 280 is an Nvidia graphics processor, a massively parallel general-purpose computing platform
  – 30 cores, 8-wide SIMD = 240-way parallelism
  – 140 GB/s memory bandwidth (modern CPUs have ~10-20 GB/s)
  – Special memory subsystems for graphics processing
• Sequential Implementation: 5 minutes per image
• Parallel, Optimized Implementation: 2 seconds
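The figures on this slide imply a speedup that is easy to check by hand. The short sketch below only restates the slide's numbers; the variable names are illustrative.

```python
# Parallelism quoted on the slide: 30 cores, each 8-wide SIMD.
cores = 30
simd_width = 8
parallel_lanes = cores * simd_width           # 240-way parallelism

# Run times quoted on the slide.
sequential_seconds = 5 * 60                   # 5 minutes per image
parallel_seconds = 2                          # 2 seconds per image
speedup = sequential_seconds / parallel_seconds

print(f"{parallel_lanes}-way parallelism, {speedup:.0f}x faster per image")
```

The reported run times correspond to roughly a 150x reduction in time per image.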
![Page 17: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/17.jpg)
Why Use Regions?
• Local estimate of scale; no search necessary
• Shape, color and texture in the same framework
• Hierarchy of regions (“partonomy”) represents scenes, objects, parts. Makes use of context natural.
• Do not suffer from background clutter
• Reduce candidate windows on detection task
  – 1,000 to 10,000 times fewer windows on the ETHZ dataset
• Need to be robust to segmentation errors
![Page 18: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/18.jpg)
Object Representation using Regions
Bag of Regions
Region Segmentation
![Page 19: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/19.jpg)
Region Representation
![Page 20: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/20.jpg)
Region-based Hough Voting
• Recover transformation from matched regions
• Transform exemplar bounding box to query
Figure labels: Exemplar, Query; transformation T(x, y, sx, sy)
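The two steps on this slide can be sketched as follows. This is a minimal illustration, assuming each region is summarised by an axis-aligned bounding box and that T(x, y, sx, sy) is a translation plus per-axis scaling, as the notation suggests. The `Box` type, the function names and the example coordinates are hypothetical; region matching and the learned voting weights from the talk are not reproduced here.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


Transform = Tuple[float, float, float, float]  # (tx, ty, sx, sy)


def recover_transform(exemplar_region: Box, query_region: Box) -> Transform:
    """Translation and per-axis scale mapping the exemplar region onto the query region."""
    sx = query_region.w / exemplar_region.w
    sy = query_region.h / exemplar_region.h
    tx = query_region.x - sx * exemplar_region.x
    ty = query_region.y - sy * exemplar_region.y
    return tx, ty, sx, sy


def transfer_box(box: Box, transform: Transform) -> Box:
    """Apply the recovered transform to a box given in exemplar coordinates."""
    tx, ty, sx, sy = transform
    return Box(sx * box.x + tx, sy * box.y + ty, sx * box.w, sy * box.h)


# Hypothetical example: one matched region pair casts one vote for the object box.
exemplar_object = Box(10, 10, 50, 80)    # ground-truth object box in the exemplar
exemplar_region = Box(20, 30, 10, 20)    # a region inside the exemplar object
query_region = Box(140, 160, 20, 40)     # its match in the query, twice as large

T = recover_transform(exemplar_region, query_region)
vote = transfer_box(exemplar_object, T)
print(vote)  # Box(x=120.0, y=120.0, w=100.0, h=160.0)
```

Each matched region pair yields one such predicted box in the query image; in a Hough-style scheme these predictions are then accumulated as votes.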
![Page 21: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/21.jpg)
Region-based Voting
Exemplar 1
Query
![Page 22: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/22.jpg)
Region-based Voting
Exemplar 1
Query
![Page 23: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/23.jpg)
Region-based Voting
Exemplar 1
Query
![Page 24: Computer Vision Group UC Berkeley How should we combine high level and low level knowledge? Jitendra Malik UC Berkeley Recognition using regions is joint](https://reader035.vdocument.in/reader035/viewer/2022081516/55147625550346b0158b52ef/html5/thumbnails/24.jpg)
Region-based Voting
Exemplar 1
Query