Stereo Matching & Energy Minimization
Vision for Graphics
CSE 590SS, Winter 2001
Richard Szeliski
2/5/2001 Vision for Graphics 2
Stereo Matching
What are some possible algorithms?
• match “features” and interpolate
• match edges and interpolate
• match all pixels with windows (coarse-fine)
• use optimization:
– iterative updating
– energy minimization (regularization, stochastic)
– dynamic programming
– graph algorithms
Feature-based stereo
Match “corner” (interest) points
Interpolate complete solution
Data interpolation
Given a sparse set of 3D points, how do we interpolate to a full 3D surface?
Scattered data interpolation [Nielson93]
• triangulate
• put onto a grid and fill (use pyramid?)
• place a kernel function over each data point
• minimize an energy function
Energy minimization
1-D example: approximating splines
[Figure labels: $z_{x,y}$, $d_{x,y}$]
Relaxation
Iteratively improve a solution by locally minimizing the energy: relax to solution
Earliest application: WWII numerical simulations
[Figure labels: $z_{x,y}$, $d_{x,y}$, $d_{x+1,y}$]
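The relaxation idea can be sketched in 1-D. The slide's energy function did not survive in this transcript, so the sketch below assumes a quadratic energy, E(d) = sum_i (d[i] - z[i])^2 + lam * sum_i (d[i] - d[i-1])^2; the names `relax`, `quadratic_energy`, `lam` and `sweeps` are illustrative only. Each step replaces one d[i] by the value that minimizes the energy with its neighbors held fixed (a Gauss-Seidel sweep).

```python
def quadratic_energy(z, d, lam):
    """Assumed energy: squared data term plus lam * squared neighbor differences."""
    data = sum((di - zi) ** 2 for di, zi in zip(d, z))
    smooth = sum((d[i] - d[i - 1]) ** 2 for i in range(1, len(d)))
    return data + lam * smooth


def relax(z, lam=1.0, sweeps=50):
    """Repeatedly set each d[i] to its local minimizer, neighbors held fixed."""
    d = [0.0] * len(z)
    for _ in range(sweeps):
        for i in range(len(d)):
            nbrs = [d[j] for j in (i - 1, i + 1) if 0 <= j < len(d)]
            # Setting dE/dd[i] = 0 gives a weighted average of data and neighbors.
            d[i] = (z[i] + lam * sum(nbrs)) / (1.0 + lam * len(nbrs))
    return d


z = [1, 1, 1, 1, 0, 0, 0]
d = relax(z)
print([round(v, 3) for v in d], quadratic_energy(z, d, 1.0))
```

Because every update is an exact minimization over one variable, the energy never increases from one step to the next.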
Relaxation
Try solving this problem yourself:
1. Make up a bunch of $z_{x,y}$
2. Guess best values for $d_{x,y}$

Original value z[i]                  1 1 1 1 0 0 0   Sums
Guessed value d[i]                   0 0 0 0 0 0 0
Difference (z[i]-d[i])               1 1 1 1 0 0 0
Difference^2 (d[i]-z[i])^2           1 1 1 1 0 0 0   4
Neighbor diff. (d[i]-d[i-1])           0 0 0 0 0 0
2 |Neighbor diff.| 2|d[i]-d[i-1]|      0 0 0 0 0 0   0
Total                                                4

Original value z[i]                  1 1 0 1 0 0 0   Sums
Guessed value d[i]                   0 0 0 0 0 0 0
Difference (z[i]-d[i])               1 1 0 1 0 0 0
Difference^2 (d[i]-z[i])^2           1 1 0 1 0 0 0   3
Neighbor diff. (d[i]-d[i-1])           0 0 0 0 0 0
2 |Neighbor diff.| 2|d[i]-d[i-1]|      0 0 0 0 0 0   0
Total                                                3

Original value z[i]                  1 1 1 1 0 0 1   Sums
Guessed value d[i]                   0 0 0 0 0 0 0
Difference (z[i]-d[i])               1 1 1 1 0 0 1
Difference^2 (d[i]-z[i])^2           1 1 1 1 0 0 1   5
Neighbor diff. (d[i]-d[i-1])           0 0 0 0 0 0
2 |Neighbor diff.| 2|d[i]-d[i-1]|      0 0 0 0 0 0   0
Total                                                5
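The totals in the exercise can be checked with a few lines of code. This is a minimal sketch that reads the energy off the table rows, that is, the sum of squared differences plus 2 times the sum of absolute neighbor differences; the function name `energy` and the parameter `smoothness_weight` are illustrative.

```python
def energy(z, d, smoothness_weight=2.0):
    """Sum of (d[i]-z[i])^2 plus smoothness_weight * sum of |d[i]-d[i-1]|."""
    data = sum((di - zi) ** 2 for di, zi in zip(d, z))
    smooth = sum(abs(d[i] - d[i - 1]) for i in range(1, len(d)))
    return data + smoothness_weight * smooth


z = [1, 1, 1, 1, 0, 0, 0]
print(energy(z, [0] * 7))  # all-zero guess from the first table: 4.0
print(energy(z, z))        # copying the data pays for one jump: 2.0
```

For the first table, copying the data is a better guess than all zeros, because a single jump costs less than four data errors.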
Relaxation
How can we get the best solution?
Differentiate energy function, set to 0
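For a quadratic energy, setting the derivative to zero gives a linear system that can be solved directly. The sketch below assumes the 1-D energy E(d) = sum_i (d[i] - z[i])^2 + lam * sum_i (d[i] - d[i-1])^2, which is not stated in this transcript; `solve_quadratic` and `lam` are illustrative names. The resulting system is tridiagonal, so it is solved here with the Thomas algorithm.

```python
def solve_quadratic(z, lam=1.0):
    """Solve dE/dd[i] = 0 for the assumed 1-D quadratic energy.

    Row i of the system reads
        (1 + lam * n_i) * d[i] - lam * d[i-1] - lam * d[i+1] = z[i],
    where n_i is the number of neighbors of sample i (1 at the ends, else 2).
    """
    n = len(z)
    diag = [1.0 + lam * ((i > 0) + (i < n - 1)) for i in range(n)]
    off = -lam
    c = [0.0] * n  # modified super-diagonal
    r = [0.0] * n  # modified right-hand side
    c[0] = off / diag[0]
    r[0] = z[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        r[i] = (z[i] - off * r[i - 1]) / denom
    d = [0.0] * n
    d[-1] = r[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        d[i] = r[i] - c[i] * d[i + 1]
    return d


print([round(v, 3) for v in solve_quadratic([1, 1, 1, 1, 0, 0, 0])])
```

The result is the same minimum that iterative relaxation converges to, obtained in a single pass.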
Non-quadratic energy
How about minimizing this cost function?
Discrete optimization space
What if you have discrete (e.g., binary) values?
[Figure: the two candidate labels, 0 and 1, at each pixel]
Dynamic programming
Evaluate best cumulative cost at each pixel
[Figure: the two candidate labels, 0 and 1, at each pixel]
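A minimal sketch of the cumulative-cost idea for a 1-D row of discrete labels follows. It assumes the same energy as the relaxation exercise (squared data term plus 2 times the absolute neighbor difference), restricted to binary labels; `dp_minimize` and its parameters are illustrative names, not taken from the slides.

```python
def dp_minimize(z, labels=(0, 1), smoothness_weight=2.0):
    """Return (best labeling, its energy) for a 1-D chain via dynamic programming."""
    # cost[i][k]: best cumulative energy of d[0..i] given d[i] = labels[k]
    cost = [[(lab - z[0]) ** 2 for lab in labels]]
    back = []  # back[i-1][k]: best predecessor label index for d[i] = labels[k]
    for i in range(1, len(z)):
        row, ptr = [], []
        for lab in labels:
            cands = [cost[-1][j] + smoothness_weight * abs(lab - prev)
                     for j, prev in enumerate(labels)]
            j_best = min(range(len(labels)), key=cands.__getitem__)
            row.append(cands[j_best] + (lab - z[i]) ** 2)
            ptr.append(j_best)
        cost.append(row)
        back.append(ptr)
    # Pick the cheapest final label, then follow the back-pointers.
    k = min(range(len(labels)), key=cost[-1].__getitem__)
    best = cost[-1][k]
    path = [labels[k]]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(labels[k])
    path.reverse()
    return path, best


print(dp_minimize([1, 1, 1, 1, 0, 0, 0]))  # ([1, 1, 1, 1, 0, 0, 0], 2.0)
```

The cost of this approach grows linearly with the number of pixels and quadratically with the number of labels, rather than exponentially as for exhaustive search.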
Dynamic programming
1-D cost function
Dynamic programming
Can we apply this trick in 2D as well?
[Figure labels: $d_{x,y}$, $d_{x-1,y}$, $d_{x,y-1}$, $d_{x-1,y-1}$]
No: $d_{x,y-1}$ and $d_{x-1,y}$ may depend on different values of $d_{x-1,y-1}$
Graph cuts
Solution technique for general 2D problem
Graph cuts
Two different kinds of moves:
• α-β swap: interchange α and β labels
• α-expansion: add pixels to class α
Back to stereo matching…
Neighborhood size (review)
Smaller neighborhood: more details
Larger neighborhood: fewer isolated mistakes
[Figure panels: w = 3, w = 20]
Plane sweep stereo
Re-order (pixel / disparity) evaluation loops
for every pixel,
    for every disparity
        compute cost
becomes
for every disparity
    for every pixel
        compute cost
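The two loop orders can be compared on a single rectified scanline. This is a sketch under assumptions the slide does not state: left pixel x is matched against right pixel x - d, the cost is an absolute intensity difference, and positions with no valid match stay `None`. The function names are illustrative.

```python
def cost_pixel_major(left, right, max_disp):
    """Original order: for every pixel, for every disparity."""
    cost = [[None] * (max_disp + 1) for _ in left]
    for x in range(len(left)):
        for d in range(max_disp + 1):
            if x - d >= 0:
                cost[x][d] = abs(left[x] - right[x - d])
    return cost


def cost_disparity_major(left, right, max_disp):
    """Plane-sweep order: for every disparity, for every pixel."""
    cost = [[None] * (max_disp + 1) for _ in left]
    for d in range(max_disp + 1):
        # The whole scanline is processed at one fixed shift d.
        for x in range(d, len(left)):
            cost[x][d] = abs(left[x] - right[x - d])
    return cost


left = [12, 40, 41, 90, 15]
right = [40, 41, 90, 15, 14]
print(cost_pixel_major(left, right, 2) == cost_disparity_major(left, right, 2))
```

Both orders fill the same cost table. The disparity-major order works on one shifted image at a time, so each disparity plane can be filtered as a whole image.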
Stereo matching framework
1. For every disparity, compute raw matching costs
Why use a robust function?
• occlusions, other outliers
Can also use alternative match criteria
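Step 1 can be sketched for one scanline as follows. The slide's cost formula is missing from this transcript, so a truncated absolute difference stands in here as one common robust choice; `truncated_abs_diff`, `raw_costs` and the threshold `trunc` are hypothetical names, and positions without a valid match are simply given the truncation value.

```python
def truncated_abs_diff(a, b, trunc=20):
    """Robust per-pixel cost: absolute difference, capped at trunc."""
    return min(abs(a - b), trunc)


def raw_costs(left, right, max_disp, trunc=20):
    """planes[d][x] = robust cost of matching left[x] to right[x - d]."""
    planes = []
    for d in range(max_disp + 1):
        plane = [truncated_abs_diff(left[x], right[x - d], trunc) if x >= d else trunc
                 for x in range(len(left))]
        planes.append(plane)
    return planes


right = [10, 50, 90, 30, 70, 20]
left = [0, 0, 10, 50, 90, 30]  # right shifted by two pixels
for d, plane in enumerate(raw_costs(left, right, 3)):
    print(d, plane)
```

Capping the cost limits how much a single occluded or otherwise outlying pixel can contribute once costs are aggregated over a window.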
Stereo matching framework
2. Aggregate costs spatially
• Here, we are using a box filter (efficient moving average implementation)
• Can also use weighted average, [non-linear] diffusion…
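A box filter can be sketched with running sums so that the work per pixel does not depend on the window size. The border handling below (averaging over the samples that fall inside the image) is an assumption, and `box_filter_1d`, `box_filter_2d` and `radius` are illustrative names.

```python
def box_filter_1d(values, radius):
    """Mean over a window of up to 2*radius+1 samples, clipped at the borders."""
    n = len(values)
    prefix = [0.0]  # prefix[i] = sum of values[:i]
    for v in values:
        prefix.append(prefix[-1] + v)
    out = []
    for x in range(n):
        lo, hi = max(0, x - radius), min(n, x + radius + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def box_filter_2d(plane, radius):
    """Separable box filter: filter every row, then every column."""
    rows = [box_filter_1d(row, radius) for row in plane]
    cols = [box_filter_1d(list(col), radius) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


print(box_filter_1d([0, 0, 3, 0, 0], 1))  # [0.0, 1.0, 1.0, 1.0, 0.0]
```

In the stereo framework this would be applied to each disparity plane of the raw cost volume independently.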
Stereo matching framework
3. Choose winning disparity at each pixel
• Can interpolate to sub-pixel accuracy
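Step 3 can be sketched per pixel as below. Fitting a parabola through the winning cost and its two neighbors is one common way to get a sub-pixel estimate; the slide does not say which interpolation it has in mind, so treat this as an assumption. `winner_take_all` is an illustrative name.

```python
def winner_take_all(costs):
    """costs[d] for one pixel -> (integer disparity, sub-pixel disparity)."""
    d = min(range(len(costs)), key=costs.__getitem__)
    if 0 < d < len(costs) - 1:
        c_prev, c_min, c_next = costs[d - 1], costs[d], costs[d + 1]
        curvature = c_prev - 2 * c_min + c_next
        if curvature > 0:
            # Vertex of the parabola through the three samples around the minimum.
            return d, d + 0.5 * (c_prev - c_next) / curvature
    # Minimum at the edge of the range, or flat costs: no refinement.
    return d, float(d)


print(winner_take_all([4, 1, 0, 1, 4]))  # (2, 2.0)
```

When the costs really are samples of a parabola, the refined value recovers its minimum exactly; for other cost shapes it is only an approximation.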
Linear diffusion
Average energy with neighbors
[Figure panels: window, diffusion]
Linear diffusion
Average energy with neighbors + starting value
[Figure panels: window, diffusion]
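The two variants of linear diffusion can be sketched as one update step on a single disparity plane. The update formulas are not in this transcript; the form below, E + lam * sum(E_neighbor - E) with an optional beta * (E_start - E) term that pulls back toward the starting value, is an assumption, as are the names `diffuse`, `lam` and `beta`.

```python
def diffuse(plane, start=None, lam=0.15, beta=0.0):
    """One diffusion step over a 2-D cost plane with 4-connected neighbors.

    With start=None this only averages each cost with its neighbors; with a
    start plane it also adds beta * (start - current) at every pixel.
    """
    h, w = len(plane), len(plane[0])
    out = [row[:] for row in plane]
    for y in range(h):
        for x in range(w):
            e = plane[y][x]
            nbrs = [plane[j][i]
                    for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= j < h and 0 <= i < w]
            update = lam * sum(n - e for n in nbrs)
            if start is not None:
                update += beta * (start[y][x] - e)
            out[y][x] = e + update
    return out


plane = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
once = diffuse(plane, lam=0.25)
print(once)
print(diffuse(once, start=plane, lam=0.25, beta=0.5))
```

Iterating the first form spreads each cost over an ever larger neighborhood, while the start-value term keeps the result tied to the original measurements.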
Dynamic programming
1-D cost function [Intille & Bobick, IJCV 99]
Dynamic programming
Disparity space image and min. cost path
Dynamic programming
Sample result (note horizontal streaks)
Graph cuts
α-β swap, α-expansion
modify smoothness penalty based on edges
compute best possible match within integer disparity
Bayesian inference
Formulate as statistical inference problem
Prior model $p_P(d)$
Measurement model $p_M(I_L, I_R \mid d)$
Posterior model
$p(d \mid I_L, I_R) \propto p_P(d)\, p_M(I_L, I_R \mid d)$
Maximum a Posteriori (MAP) estimate:
maximize $p(d \mid I_L, I_R)$
Markov Random Field
Probability distribution on disparity field d(x,y)
Enforces smoothness or coherence on field
Measurement model
Likelihood of intensity correspondence
Corresponds to Gaussian noise for a quadratic cost function
MAP estimate
Maximize the posterior probability
Equivalent to regularization (energy minimization with smoothness constraints)
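The equivalence can be spelled out by taking the negative logarithm of the posterior. In the sketch below, $p(d \mid I_L, I_R)$ denotes the posterior, and the labels $E_{\text{smooth}}$ and $E_{\text{data}}$ are introduced here for illustration; they do not appear on the slides.

```latex
\begin{align*}
p(d \mid I_L, I_R) &\propto p_P(d)\, p_M(I_L, I_R \mid d) \\
-\log p(d \mid I_L, I_R) &= -\log p_P(d) - \log p_M(I_L, I_R \mid d) + \text{const} \\
&= E_{\text{smooth}}(d) + E_{\text{data}}(d) + \text{const}
\end{align*}
```

Maximizing the posterior is therefore the same as minimizing the sum of a smoothness energy, which comes from the prior, and a data energy, which comes from the measurement model.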
Why Bayesian estimation?
Principled way of determining cost function
Explicit model of noise and prior knowledge
Admits a wider variety of optimization algorithms:
• gradient descent (local minimization)
• stochastic optimization (Gibbs Sampler)
• mean-field optimization
• graph theoretic (actually deterministic) [Zabih]
Mean-field interpretation
Bayesian non-linear diffusion rule:
• update your probability distribution assuming your neighbors’ distributions are independent (valid for Markov chain)
Equivalent to finding best factored approximation
$P(d \mid I_L, I_R) \approx Q(d) = \prod_i Q_i(d_i)$
Mean-field interpretation
log MAP estimate
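$-\log P(d \mid I_L, I_R) = \sum_{ij} E_{ij}(d_i, d_j) + \sum_i E_i(d_i)$
$= \sum_{ij} s_i A_{ij} s_j + \sum_i b_i s_i$
Kullback-Leibler divergence
$D_{KL} = -H(Q) - \sum_d Q(d) \log P(d)$
$= \sum_{ik} q_{ik} \log q_{ik} + \sum_{ij} q_i A_{ij} q_j + \sum_i b_i q_i$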
Mean-field interpretation
minimize K-L divergence subject to $\sum_k q_{ik} = 1$
update rule:
$q_{ik} \propto \exp[-(\sum_j a_{ijk} q_j + b_{ik})]$
$= \exp[-(\sum_j \sum_l E_{ij}(d_i = k, d_j = l)\, p(d_j = l) + E_i(d_i = k))]$
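The update rule can be sketched on a 1-D chain, where each pixel has at most two neighbors. The function `mean_field`, its arguments, and the example energies are hypothetical: `unary[i][k]` plays the role of E_i(d_i = k), `pairwise(k, l)` the role of E_ij(d_i = k, d_j = l), and `q[j][l]` the role of p(d_j = l).

```python
import math


def mean_field(unary, pairwise, iters=20):
    """Iterate q[i][k] proportional to exp(-(sum_j sum_l E_ij(k,l) q[j][l] + E_i(k)))."""
    n, num_labels = len(unary), len(unary[0])
    q = [[1.0 / num_labels] * num_labels for _ in range(n)]  # uniform start
    for _ in range(iters):
        for i in range(n):
            nbrs = [j for j in (i - 1, i + 1) if 0 <= j < n]
            energy = [unary[i][k] + sum(pairwise(k, l) * q[j][l]
                                        for j in nbrs for l in range(num_labels))
                      for k in range(num_labels)]
            m = min(energy)  # shift before exponentiating, for numerical stability
            w = [math.exp(-(e - m)) for e in energy]
            total = sum(w)
            q[i] = [v / total for v in w]  # enforce sum_k q[i][k] = 1
    return q


z = [1, 1, 1, 1, 0, 0, 0]
unary = [[5.0 * (k - zi) ** 2 for k in (0, 1)] for zi in z]
q = mean_field(unary, lambda k, l: abs(k - l))
print([max(range(2), key=qi.__getitem__) for qi in q])
```

Dividing by the sum of the weights is what enforces the normalization constraint on each pixel's distribution.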
Depth Map Results
[Figure panels: input image, sum of absolute differences, mean field, graph cuts]
Stereo with Non-Linear Diffusion
Advantages:
• works very well in non-occluded regions
Disadvantages:
• restricted to two images (not)
• gets confused in occluded regions
• can’t handle mixed pixels
Summary
Applications
Image rectification
Matching criteria
Local algorithms (aggregation)
• area-based; iterative updating
Optimization algorithms:
• energy (cost) formulation
• Markov Random Fields
• mean-field; dynamic programming; stochastic; graph algorithms
More stereo… (next 2 lectures)
Multi-image stereo
Volumetric techniques
Graph cuts
Transparency
Surfaces and level sets
Bibliography
See the references in the readings…
Y. Boykov, O. Veksler, and R. Zabih. Fast approximate energy minimization via graph cuts. Unpublished manuscript, 2000.
A. F. Bobick and S. S. Intille. Large occlusion stereo. International Journal of Computer Vision, 33(3):181-200, September 1999.
D. Scharstein and R. Szeliski. Stereo matching with nonlinear diffusion. International Journal of Computer Vision, 28(2):155-174, July 1998.
R. Szeliski. Stereo algorithms and representations for image-based rendering. In British Machine Vision Conference (BMVC'99), volume 2, pages 314-328, Nottingham, England, September 1999.
R. Szeliski and R. Zabih. An experimental comparison of stereo algorithms. In International Workshop on Vision Algorithms, pages 1-19, Kerkyra, Greece, September 1999.
G. M. Nielson. Scattered data modeling. IEEE Computer Graphics and Applications, 13(1):60-70, January 1993.