875: Recent Advances in Geometric Computer Vision & Recognition
Jan-Michael Frahm, Spring 2014
Introductions
Grade Requirements
• Presentation of 2 papers in class
  30 min talk, 10 min questions
• Papers for selection must come from:
  top journals: IJCV, PAMI, CVIU, IVCJ
  top conferences: CVPR (2010, 2011), ICCV (2011), ECCV (2010)
  approval for all other venues is needed
• Final project: evaluation, extension of a recent method from the above
Grading
• 20% first presentation
• 20% second presentation
• 30% final project
• 30% attendance & class participation
Schedule
• Jan. 7th, Introduction
• Jan. 7th, Uncertainty in Stereo (guest Philippos Mordohai) (substitute for Jan. 13th class)
• Jan. 15th, Large-scale image localization basic concepts, first paper selection (large-scale localization)
• Jan. 20th, MLK holiday, no class
• Jan. 22nd-29th, Large-scale localization basic concepts
• Feb. 3rd, 1st round of presentations starts
• Mar. 10th, 12th, Spring break (no class)
• Mar. 17th, Modeling dynamic objects/scenes basic concepts, second paper selection, final project definition
• Mar. 19th, Modeling dynamic objects
• Mar. 24th, 2nd round of presentations starts
• Apr. 21st, 23rd, final project presentation
How to give a great presentation
• Structure of the talk:
  Motivation (motivate and explain the problem)
  Overview
  Related work (short concise discussion)
  Approach
  Experiments
  Conclusion and future work
How to give a great presentation
• Use large enough fonts
  5-6 one-line bullet items on a slide max
• Keep it simple
• No complex formulas in your talk
• Bad PowerPoint slides
• How to for presentations
How to give a great presentation
• Abstract the material of the talk
  provide understanding beyond details
• Use pictures to illustrate
  find pictures on the internet
  create a graphic (in ppt, graph tool)
  animate complex pictures
How to give a good presentation
• Avoid bad color schemes
  no red on blue; it looks awful
• Avoid using a laser pointer (especially if you are nervous)
• Add pointing elements in your presentation
• Practice to stay within your time!
• Don’t rush through the talk!
Brush up on Stereo Reconstruction
Stereo
• Extraction of 3D information from 2D images
[Figure: images and the corresponding 3D point cloud]
Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to produce a depth image
  Humans can do it
Stereograms: Invented by Sir Charles Wheatstone, 1838
Depth Recovery from Stereo
[Figure: reference image, matching image, and depth axis; candidate depths d1 to d9 along the epipolar line form the search space]
Pixel similarity: measured by color differences
[Figure: matching cost over depth; depth maps labeled Ground Truth and Pixel Matching]
Matching criteria
• Raw pixel values (correlation)
• Band-pass filtered images [Jones & Malik 92]
• “Corner” like features [Zhang, …]
• Edges [many people…]
• Gradients [Seitz 89; Scharstein 94]
• Rank statistics [Zabih & Woodfill 94]
• Intervals [Birchfield and Tomasi 96]
• Overview of matching metrics and their performance:
  H. Hirschmüller and D. Scharstein, “Evaluation of Stereo Matching Costs on Images with Radiometric Differences”, PAMI 2008
slide: R. Szeliski
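To make one of the listed criteria concrete, the following is a minimal sketch of a rank transform in the spirit of the rank statistics entry. The function name `rank_transform`, the window radius, the border handling, and the toy image are illustrative assumptions, not taken from the slides.

```python
def rank_transform(img, radius=1):
    """Replace each pixel by the number of window neighbors darker than it.

    `img` is a list of rows of intensities. Windows are clipped at the
    image border (an assumption made here so the output keeps its size).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            count = 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    if img[yy][xx] < center:
                        count += 1
            out[y][x] = count
    return out


# Toy image and a radiometrically changed copy (gain 2, offset 10).
image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
brighter = [[2 * v + 10 for v in row] for row in image]

# Raw intensities differ, but the rank-transformed images are identical,
# because a monotonically increasing change preserves the pixel ordering.
same_ranks = rank_transform(image) == rank_transform(brighter)
```

Matching on the transformed images instead of raw pixel values is one way such a criterion can tolerate gain and offset differences between the two cameras.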
Adaptive Weighting
• Boundary Preserving
• More Costly
Simplest Case: Parallel images
• Image planes of cameras are parallel to each other and to the baseline
• Camera centers are at same height
• Focal lengths are the same
• Then, epipolar lines fall along the horizontal scan lines of the images
slide: S. Lazebnik
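Essential matrix for parallel images
Epipolar constraint:
$$x^{T} E\, x' = 0, \qquad E = [t_\times] R$$
$$R = I, \qquad t = (T, 0, 0)$$
$$E = [t_\times] R = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -T \\ 0 & T & 0 \end{bmatrix}$$
where
$$[a_\times] = \begin{bmatrix} 0 & -a_z & a_y \\ a_z & 0 & -a_x \\ -a_y & a_x & 0 \end{bmatrix}$$
[Figure: image points x and x' and the translation t between the two cameras]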
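As a quick numerical check of the parallel-camera case (R = I, t = (T, 0, 0)), the sketch below builds E from the cross-product matrix and evaluates the epipolar residual, assuming the constraint is written as x^T E x' = 0. The helper names `skew` and `epipolar_residual` and the sample coordinates are hypothetical.

```python
def skew(t):
    """Cross-product matrix [t_x] such that skew(t) @ v equals t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def epipolar_residual(E, x, x_prime):
    """Return x^T E x' for homogeneous image points x and x'."""
    Ex = [sum(E[i][j] * x_prime[j] for j in range(3)) for i in range(3)]
    return sum(x[i] * Ex[i] for i in range(3))


T = 0.5                    # assumed baseline length
E = skew((T, 0.0, 0.0))    # R = I, so E = [t_x] R = [t_x]

# Same image row (v = v'): the constraint holds for any horizontal shift.
on_scanline = epipolar_residual(E, (100.0, 50.0, 1.0), (80.0, 50.0, 1.0))

# Different rows: the residual is T * (v' - v), which is nonzero.
off_scanline = epipolar_residual(E, (100.0, 50.0, 1.0), (80.0, 60.0, 1.0))
```

The residual reduces to T(v' - v), so corresponding points must share the same y-coordinate, which is why the search can be restricted to horizontal scan lines.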
Aggregation Structure
[Figure: pixelwise matching costs plotted over depth within the search space]
Aggregation Structure
Cost aggregation: cutting the cost volume.
[Figure: cost volume spanned by the image and the search space]
Aggregation Structure
Fronto-Parallel Plane
• Treat neighbors equally
• Sum of Absolute Differences (SAD)
[Figure: the cost of the center pixel and the costs of neighboring pixels are taken from a fronto-parallel slice of the cost volume; resulting depth map]
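A minimal sketch of fronto-parallel aggregation, assuming a rectified grayscale pair stored as nested lists: a pixelwise cost volume is built first, and each disparity slice is then summed over a square window with equal weights (SAD). The names `cost_volume` and `aggregate_sad`, the `invalid_cost` value, and the toy data are assumptions for illustration.

```python
def cost_volume(left, right, max_disp, invalid_cost=255.0):
    """volume[d][y][x] = |left[y][x] - right[y][x - d]|.

    Pixels whose match would fall outside the right image get
    `invalid_cost` (an arbitrary large value chosen for this sketch).
    """
    h, w = len(left), len(left[0])
    volume = []
    for d in range(max_disp + 1):
        slice_d = [[abs(left[y][x] - right[y][x - d]) if x - d >= 0
                    else invalid_cost
                    for x in range(w)] for y in range(h)]
        volume.append(slice_d)
    return volume


def aggregate_sad(cost_slice, radius=1):
    """Sum the costs in a (2*radius+1)^2 window, treating neighbors equally."""
    h, w = len(cost_slice), len(cost_slice[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += cost_slice[yy][xx]
            out[y][x] = total
    return out


# Toy pair: the left rows are the right rows shifted by 2 pixels.
row_right = [5, 9, 1, 7, 3, 8, 2, 6]
row_left = [0, 0] + row_right[:-2]
left = [row_left[:] for _ in range(3)]
right = [row_right[:] for _ in range(3)]

volume = cost_volume(left, right, max_disp=3)
aggregated = [aggregate_sad(s) for s in volume]
```

In this toy example the aggregated cost at an interior pixel is zero only in the slice for disparity 2, the true shift.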
Aggregation Structure
Adaptive Weight (Yoon and Kweon, PAMI 2006)
• Color differences
• Spatial distances
[Figure: weighted cost of the center pixel and weighted costs of neighboring pixels in the cost volume; resulting depth map]
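The sketch below illustrates the idea of weighting neighbors by color difference and spatial distance. It assumes an exponential weight of the form exp(-(Δc/γc + Δg/γp)) and, as a simplification, uses weights from the reference window only. The parameter values, function names, and sample pixels are illustrative assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long tuples."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def support_weight(color_p, color_q, pos_p, pos_q, gamma_c=10.0, gamma_p=10.0):
    """Weight of neighbor q for center p: high if similar in color and close."""
    delta_c = euclidean(color_p, color_q)   # color difference
    delta_g = euclidean(pos_p, pos_q)       # spatial distance
    return math.exp(-(delta_c / gamma_c + delta_g / gamma_p))


def weighted_cost(center_color, center_pos, window):
    """Normalized weighted sum of pixel costs.

    `window` is a list of (color, position, pixel_cost) tuples that
    includes the center pixel itself.
    """
    weights = [support_weight(center_color, c, center_pos, p)
               for c, p, _ in window]
    total = sum(w * cost for w, (_, _, cost) in zip(weights, window))
    return total / sum(weights)


gray = (100.0, 100.0, 100.0)
white = (200.0, 200.0, 200.0)
window = [
    (gray, (0, 0), 1.0),     # center pixel
    (gray, (1, 0), 2.0),     # same surface, similar color
    (white, (-1, 0), 50.0),  # across a color edge, likely another surface
]
adaptive = weighted_cost(gray, (0, 0), window)
uniform = sum(cost for _, _, cost in window) / len(window)
```

The neighbor across the color edge receives a weight close to zero, so its high cost barely affects the aggregated value. This is the boundary-preserving behavior, bought at the price of computing a weight per neighbor.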
Aggregation Structure
Adaptive Weight, Oriented Plane (Lu et al., CVPR 2013)
[Figure: cost volume with an oriented support plane; resulting depth map]
Your basic stereo algorithm
For each epipolar line
  For each pixel in the left image
  • compare with every pixel on same epipolar line in right image
  • pick pixel with minimum match cost
Improvement: match windows
• This should look familiar...
slide: R. Szeliski
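The loop described above can be sketched for a single rectified scanline as follows. Unlike the slide, the search is limited to a disparity range `max_disp` rather than the whole line, and windows are 1D and clamped at the borders. These choices, along with the function name `match_scanline`, are assumptions made to keep the example short.

```python
def match_scanline(left_row, right_row, max_disp, radius=1):
    """Winner-take-all window matching along one epipolar line.

    Returns, for each left pixel x, the disparity d that minimizes the
    sum of absolute differences between windows centered at x (left)
    and x - d (right).
    """
    w = len(left_row)
    disparities = []
    for x in range(w):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disp + 1):
            if x - d < 0:
                break  # candidate would fall outside the right image
            cost = 0
            for k in range(-radius, radius + 1):
                xl = min(max(x + k, 0), w - 1)      # clamp window at borders
                xr = min(max(x - d + k, 0), w - 1)
                cost += abs(left_row[xl] - right_row[xr])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Toy scanline: the left row is the right row shifted by 2 pixels.
right_row = [5, 9, 1, 7, 3, 8, 2, 6]
left_row = [0, 0] + right_row[:-2]
disparities = match_scanline(left_row, right_row, max_disp=3)
```

For the interior pixels of this toy row the recovered disparity is 2. The first two pixels have no valid match in the right row, which is a small instance of the occlusion problem.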
Depth Map Computation
• Local methods
  Depth with the minimum cost
  Complexity: $O(N^3)$
• Global methods
  Pairwise interactions
  Complexity: $O(N^4)$
Scharstein and Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms”, IJCV 2002
Image Resolution: the total number of pixels
[Figure: volume with sides of N pixels, aN pixels, and bN pixels; labels $O(N^2)$ and $\sim N^2$]
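Depth from disparity
[Figure: camera centers O and O' separated by baseline B, focal length f, scene point X at depth z projecting to image points x and x']
$$\text{disparity} = x - x' = \frac{B \cdot f}{z}$$
Disparity is inversely proportional to depth!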
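A small worked example of the relation disparity = B·f / z, solved for depth. The function name and the numbers (focal length in pixels, baseline in meters) are assumptions chosen for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth z = f * B / disparity for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


focal_px = 700.0     # assumed focal length in pixels
baseline_m = 0.12    # assumed baseline in meters

z_near = depth_from_disparity(42.0, focal_px, baseline_m)  # about 2 m
z_far = depth_from_disparity(21.0, focal_px, baseline_m)   # about 4 m
```

Halving the disparity doubles the depth, which is the inverse proportionality stated above.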
Depth Sampling
Depth sampling for integer pixel disparity
Quadratic precision loss with depth!
Depth Sampling
Depth sampling for wider baseline
Depth Sampling
Depth sampling is in $O(\text{resolution}^6)$
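The quadratic precision loss and the effect of a wider baseline can be checked numerically from z = f·B / d. The sketch computes the gap between the depths of two neighboring integer disparities; the function name `depth_step` and the camera parameters are assumptions.

```python
def depth_step(depth, focal_px, baseline):
    """Depth change caused by lowering the disparity by one pixel at `depth`."""
    disparity = focal_px * baseline / depth
    if disparity <= 1.0:
        raise ValueError("depth is too far to resolve with integer disparities")
    return focal_px * baseline / (disparity - 1.0) - depth


focal_px = 1000.0   # assumed focal length in pixels

step_near = depth_step(2.0, focal_px, baseline=0.1)   # about 0.041
step_far = depth_step(4.0, focal_px, baseline=0.1)    # about 0.167
step_wide = depth_step(4.0, focal_px, baseline=0.2)   # about 0.082
```

Doubling the depth roughly quadruples the step (approximately z² / (f·B)), while doubling the baseline roughly halves it at the same depth.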
Failures of correspondence search
Textureless surfaces
Occlusions, repetition
Non-Lambertian surfaces, specularities
slide: S. Lazebnik
How can we improve window-based matching?
• The similarity constraint is local (each reference window is matched independently)
• Need to enforce non-local correspondence constraints
slide: S. Lazebnik
Non-local constraints
• Uniqueness
  For any point in one image, there should be at most one matching point in the other image
• Ordering
  Corresponding points should be in the same order in both views
  [Figure: a configuration in which the ordering constraint doesn’t hold]
• Smoothness
  We expect disparity values to change slowly (for the most part)
slide: S. Lazebnik
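One common way to approximate the uniqueness constraint in practice is a left-right consistency check. This technique is not named on the slides and is swapped in here only as an illustration. The sketch assumes disparities for one scanline computed once from the left view and once from the right view; the function name and sample values are hypothetical.

```python
def left_right_check(disp_left, disp_right, tolerance=0):
    """Keep a left disparity only if the right view agrees with it.

    disp_left[x] = d means left pixel x matches right pixel x - d.
    disp_right[x] = d means right pixel x matches left pixel x + d.
    Rejected pixels are marked with None.
    """
    checked = []
    for x, d in enumerate(disp_left):
        xr = x - d
        if 0 <= xr < len(disp_right) and abs(disp_right[xr] - d) <= tolerance:
            checked.append(d)
        else:
            checked.append(None)
    return checked


disp_left = [0, 1, 2, 2, 2]
disp_right = [2, 2, 2, 0, 0]
consistent = left_right_check(disp_left, disp_right)
```

Here the first two left pixels both claim right pixel 0, which itself matches left pixel 2, so they are rejected and each right pixel keeps at most one partner.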
Multiple-baseline stereo results
[Figure: images I1, I2, I10]
M. Okutomi and T. Kanade, “A Multiple-Baseline Stereo System,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 15(4):353-363 (1993).
Plane Sweep Stereo
• Choose a reference view
• Sweep family of planes at different depths with respect to the reference camera
Each plane defines a homography warping each input image into the reference view
[Figure: reference camera and two input images]
R. Collins. A space-sweep approach to true multi-image matching. CVPR 1996.
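The plane-induced homography can be sketched as follows. The conventions are assumptions, since the slides do not state them: a point in the reference camera frame maps to the input camera as X_src = R·X_ref + t, and the swept plane is n·X_ref = depth. Under these conventions H = K_src (R + t nᵀ / depth) K_ref⁻¹ maps reference pixels to input-image pixels. All function names and camera values are hypothetical.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def inv3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def plane_homography(K_ref, K_src, R, t, n, depth):
    """Homography from reference pixels to input pixels for one plane."""
    M = [[R[i][j] + t[i] * n[j] / depth for j in range(3)] for i in range(3)]
    return matmul3(matmul3(K_src, M), inv3(K_ref))


def warp_point(H, u, v):
    """Apply H to pixel (u, v) and dehomogenize."""
    x = [H[i][0] * u + H[i][1] * v + H[i][2] for i in range(3)]
    return x[0] / x[2], x[1] / x[2]


K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-0.1, 0.0, 0.0]   # input camera sits 0.1 units to the right
n = [0.0, 0.0, 1.0]    # fronto-parallel sweeping direction

# Sweep three depths and see where reference pixel (400, 300) lands.
hits = [warp_point(plane_homography(K, K, R, t, n, z), 400.0, 300.0)
        for z in (1.0, 2.0, 4.0)]
```

Sampling the input image at the warped positions yields the image warped into the reference view for that plane. In this pure-translation setup the horizontal shift is f·B / depth (50, 25, and 12.5 pixels), consistent with the depth-from-disparity relation.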
Real-time 3D reconstruction from video
“Real-Time Plane-sweeping Stereo with Multiple Sweeping Directions”, CVPR 2007
[Figure: 3D scene, warped images, and SAD as similarity (darker is higher similarity)]
Multi-way sweep
3D reconstruction from video
[Figure: view 1 to view N]