875: Recent Advances in Geometric Computer Vision & Recognition

Jan-Michael Frahm, Spring 2014

Introductions

Grade Requirements
• Presentation of 2 papers in class: 30 min talk, 10 min questions
• Papers for selection must come from:
  top journals: IJCV, PAMI, CVIU, IVCJ
  top conferences: CVPR (2010, 2011), ICCV (2011), ECCV (2010)
  approval is needed for all other venues
• Final project: evaluation or extension of a recent method from the above

Grading
• 20% first presentation
• 20% second presentation
• 30% final project
• 30% attendance & class participation

Schedule
• Jan. 7th, Introduction
• Jan. 7th, Uncertainty in Stereo (guest Philippos Mordohai) (substitute for Jan. 13th class)
• Jan. 15th, Large-scale image localization basic concepts, first paper selection (large-scale localization)
• Jan. 20th, MLK holiday, no class
• Jan. 22nd-29th, Large-scale localization basic concepts
• Feb. 3rd, first round of presentations starts
• Mar. 10th, 12th, Spring break (no class)
• Mar. 17th, Modeling dynamic objects/scenes basic concepts, second paper selection, final project definition
• Mar. 19th, Modeling dynamic objects
• Mar. 24th, second round of presentations starts
• Apr. 21st, 23rd, final project presentation

How to give a great presentation
• Structure of the talk:
  Motivation (motivate and explain the problem)
  Overview
  Related work (short, concise discussion)
  Approach
  Experiments
  Conclusion and future work

• Use large enough fonts
  5-6 one-line bullet items on a slide max
• Keep it simple
• No complex formulas in your talk
• Bad Powerpoint slides
• How-to for presentations

http://www.youtube.com/watch?v=lpvgfmEU2Ck

http://youtu.be/gNG0etmnwuk

• Abstract the material of the talk; provide understanding beyond details
• Use pictures to illustrate
  find pictures on the internet
  create a graphic (in ppt, graph tool)
  animate complex pictures

How to give a good presentation
• Avoid bad color schemes: no red on blue, it looks awful
• Avoid using a laser pointer (especially if you are nervous)
• Add pointing elements in your presentation
• Practice to stay within your time!
• Don’t rush through the talk!

Brush up on Stereo Reconstruction

Stereo
• Extraction of 3D information from 2D images

[Figure: Images, Stereo, 3D Point Cloud]

Binocular stereo
• Given a calibrated binocular stereo pair, fuse it to produce a depth image
  Humans can do it

Stereograms: Invented by Sir Charles Wheatstone, 1838

Depth Recovery from Stereo

[Figure: a pixel in the reference image is compared with candidates d1 to d9 along the epipolar line (the search space) in the matching image; the matching cost is plotted against depth]

Pixel similarity: measured by color differences

[Figure panels: Ground Truth, Pixel Matching (Depth Map)]
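The per-pixel search along the epipolar line can be sketched in a few lines of NumPy. This is a minimal illustration, assuming rectified grayscale scanlines so that the epipolar line is the same image row; the function name `pixel_cost_curve` and the toy data are hypothetical and not taken from the slides.

```python
import numpy as np

def pixel_cost_curve(ref_row, match_row, x, max_disparity):
    """Cost of matching reference pixel x against match_row[x - d], d = 0..max_disparity."""
    costs = np.full(max_disparity + 1, np.inf)
    for d in range(max_disparity + 1):
        if x - d < 0:  # candidate falls outside the matching image
            continue
        diff = (np.asarray(ref_row[x], dtype=float)
                - np.asarray(match_row[x - d], dtype=float))
        costs[d] = np.abs(diff).sum()  # absolute color difference (one channel here)
    return costs

# Toy scanlines (assumption): the matching row is the reference row shifted by 2 pixels.
ref_row = np.array([10, 20, 80, 40, 30, 90, 60, 50], dtype=np.uint8)
match_row = np.roll(ref_row, -2)

costs = pixel_cost_curve(ref_row, match_row, x=5, max_disparity=4)
best_d = int(np.argmin(costs))  # the candidate with the minimum matching cost
print(costs, best_d)
```

The cost curve has one entry per candidate in the search space; picking its minimum gives the disparity (and hence depth) for that single pixel.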

Matching criteria
• Raw pixel values (correlation)
• Band-pass filtered images [Jones & Malik 92]
• “Corner” like features [Zhang, …]
• Edges [many people…]
• Gradients [Seitz 89; Scharstein 94]
• Rank statistics [Zabih & Woodfill 94]
• Intervals [Birchfield and Tomasi 96]
• Overview of matching metrics and their performance:
  H. Hirschmüller and D. Scharstein, “Evaluation of Stereo Matching Costs on Images with Radiometric Differences”, PAMI 2008

slide: R. Szeliski

Adaptive Weighting
• Boundary Preserving
• More Costly

Simplest Case: Parallel images

• Image planes of cameras are parallel to each other and to the baseline

• Camera centers are at same height

• Focal lengths are the same

• Then, epipolar lines fall along the horizontal scan lines of the images

slide: S. Lazebnik

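Essential matrix for parallel images

Epipolar constraint:

$$x^T E\, x' = 0, \qquad E = [t_\times] R$$

$$R = I, \qquad t = (T, 0, 0)$$

$$E = [t_\times] R = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -T \\ 0 & T & 0 \end{bmatrix}$$

where, for a vector $a = (a_x, a_y, a_z)$,

$$[a_\times] = \begin{bmatrix} 0 & -a_z & a_y \\ a_z & 0 & -a_x \\ -a_y & a_x & 0 \end{bmatrix}$$

[Figure: corresponding points x and x' in two parallel images separated by the translation t]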
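As a quick numerical check of the parallel-camera case, the sketch below builds E from R = I and t = (T, 0, 0) and evaluates the epipolar constraint for two candidate matches. It assumes the convention $x^T E x' = 0$ with $E = [t_\times] R$ and normalized image coordinates; the helper names and the value of T are illustrative only.

```python
import numpy as np

def skew(a):
    """Cross-product matrix [a_x] such that skew(a) @ b == np.cross(a, b)."""
    ax, ay, az = a
    return np.array([[0.0, -az, ay],
                     [az, 0.0, -ax],
                     [-ay, ax, 0.0]])

T = 0.5                       # baseline length (arbitrary value for the example)
R = np.eye(3)                 # parallel cameras: no relative rotation
t = np.array([T, 0.0, 0.0])   # translation along the x axis only
E = skew(t) @ R

def epipolar_residual(x, x_prime):
    """Value of x^T E x'; zero when the pair satisfies the epipolar constraint."""
    return float(x @ E @ x_prime)

x = np.array([0.3, 0.2, 1.0])            # homogeneous point in the first image
same_row = np.array([0.1, 0.2, 1.0])     # candidate with the same y coordinate
other_row = np.array([0.1, 0.4, 1.0])    # candidate with a different y coordinate

print(epipolar_residual(x, same_row), epipolar_residual(x, other_row))
```

For this E the residual reduces to T(v' - v), so it vanishes exactly when both points share the same y coordinate, which is why the search can be restricted to the horizontal scan line.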

Aggregation Structure

[Figure: pixelwise matching costs plotted against depth over the search space]

The depth of a pixel is taken at the minimum of its matching cost.

Cost Volume

Cost aggregation: cutting the cost volume.

[Figure: cost volume spanned by the image and the search space]

Cost Volume: Fronto-Parallel Plane
• Treat neighbors equally
• Sum of Absolute Differences (SAD): cost of the center pixel plus costs of neighboring pixels

[Figure: cost volume and resulting depth map]
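Fronto-parallel aggregation that treats all neighbors equally amounts to a box filter over each depth slice of the cost volume. The following is a minimal sketch, assuming a cost volume stored as an array of shape (depths, height, width); `box_aggregate`, `winner_takes_all`, and the toy volume are hypothetical names and data.

```python
import numpy as np

def box_aggregate(cost_volume, radius):
    """Sum pixelwise costs over a (2r+1) x (2r+1) window within each depth slice."""
    _, height, width = cost_volume.shape
    padded = np.pad(cost_volume,
                    ((0, 0), (radius, radius), (radius, radius)), mode="edge")
    out = np.zeros(cost_volume.shape, dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += padded[:, dy:dy + height, dx:dx + width]  # every neighbor weighs the same
    return out

def winner_takes_all(cost_volume):
    """Index of the depth slice with the minimum cost at each pixel."""
    return np.argmin(cost_volume, axis=0)

# Toy volume (assumption): slice 3 is correct everywhere, but one pixel is corrupted.
cost_volume = np.ones((5, 6, 6))
cost_volume[3] = 0.0
cost_volume[3, 2, 2] = 5.0   # noisy cost at the true depth
cost_volume[1, 2, 2] = 0.0   # spurious minimum at a wrong depth

raw = winner_takes_all(cost_volume)
aggregated = winner_takes_all(box_aggregate(cost_volume, radius=1))
print(raw[2, 2], aggregated[2, 2])
```

In this toy case the pixelwise minimum picks the wrong slice at the corrupted pixel, while the aggregated cost recovers slice 3 because the neighbors outvote the outlier.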

Adaptive Weight (Yoon and Kweon, PAMI 2006)
• Weights from color differences and spatial distances
• Weighted cost of the center pixel plus weighted costs of neighboring pixels

[Figure: cost volume and resulting depth map]
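A simplified sketch of adaptive support weights is given below. It assumes an exponential weight in the color difference and spatial distance to the window center, computed from the reference window only and for grayscale values; the parameter values `gamma_c` and `gamma_p` and all function names are illustrative assumptions, not the settings of the cited paper.

```python
import numpy as np

def support_weights(window, gamma_c=10.0, gamma_p=5.0):
    """Weight of each pixel in a square window relative to the window's center pixel."""
    r = window.shape[0] // 2
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    spatial = np.sqrt(ys ** 2 + xs ** 2)                         # spatial distance
    color = np.abs(window.astype(float) - float(window[r, r]))   # color difference
    return np.exp(-(color / gamma_c + spatial / gamma_p))

def weighted_cost(window, cost_window, **kwargs):
    """Normalized weighted sum of the pixelwise costs inside the window."""
    w = support_weights(window, **kwargs)
    return float((w * cost_window).sum() / w.sum())

# Toy window (assumption): the right column lies across an intensity edge.
window = np.array([[50, 50, 200],
                   [50, 50, 200],
                   [50, 50, 200]], dtype=np.uint8)
# Pixelwise costs at one depth: low on the center's surface, high across the edge.
cost_window = np.array([[0.0, 0.0, 10.0],
                        [0.0, 0.0, 10.0],
                        [0.0, 0.0, 10.0]])

equal = float(cost_window.mean())               # neighbors treated equally
adaptive = weighted_cost(window, cost_window)   # neighbors across the edge suppressed
print(equal, adaptive)
```

Pixels across the edge receive almost no weight, so their costs barely influence the center pixel. This is the boundary-preserving behavior, bought at the price of computing a weight per neighbor.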

Adaptive Weight: Oriented Plane (Lu et al., CVPR 2013)

[Figure: cost volume and resulting depth map]

Your basic stereo algorithm

For each epipolar line
  For each pixel in the left image
  • compare with every pixel on same epipolar line in right image
  • pick pixel with minimum match cost

Improvement: match windows
• This should look familiar...

slide: R. Szeliski
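The loop structure above, with the window improvement, can be written out directly. This is a deliberately unoptimized sketch, assuming rectified grayscale images (epipolar lines are image rows) and a SAD window cost; `block_match` and the synthetic image pair are hypothetical.

```python
import numpy as np

def block_match(left, right, max_disparity, radius=1):
    """Window-based matching along scanlines; returns an integer disparity map."""
    height, width = left.shape
    size = 2 * radius + 1
    left_p = np.pad(left.astype(float), radius, mode="edge")
    right_p = np.pad(right.astype(float), radius, mode="edge")
    disparity = np.zeros((height, width), dtype=int)
    for y in range(height):                  # for each epipolar line
        for x in range(width):               # for each pixel in the left image
            ref = left_p[y:y + size, x:x + size]
            best_cost, best_d = np.inf, 0
            for d in range(min(max_disparity, x) + 1):   # candidates on the same line
                cand = right_p[y:y + size, x - d:x - d + size]
                cost = np.abs(ref - cand).sum()          # SAD over the window
                if cost < best_cost:                     # keep the minimum match cost
                    best_cost, best_d = cost, d
            disparity[y, x] = best_d
    return disparity

# Synthetic pair (assumption): the left image is the right image shifted by 3 pixels.
rng = np.random.default_rng(0)
right = rng.integers(0, 256, size=(8, 16))
left = np.roll(right, 3, axis=1)

disparity = block_match(left, right, max_disparity=5)
print(disparity[4])
```

Away from the image borders the recovered disparity is 3 everywhere. A random texture makes matching easy; the failure cases discussed in this section (textureless regions, occlusions, repetition) are exactly where this scheme breaks down.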

Depth Map Computation
• Local methods: depth with the minimum cost
• Global methods: pairwise interactions

[Complexity expressions on the slide, in extraction order; their pairing with the two method classes is not recoverable: $O(N^4)$, $O(N^3)$]

Scharstein and Szeliski, “A taxonomy and evaluation of dense two-frame stereo correspondence algorithms”, IJCV 2002

Image Resolution: the total number of pixels

[Figure labels: N pixels, aN pixels, bN pixels, $O(N^2)$, $\sim N^2$]

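Depth from disparity

[Figure: two cameras with centers O and O', focal length f, and baseline B; the scene point X at depth z projects to x and x']

$$\text{disparity} = x - x' = \frac{B \cdot f}{z}$$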

Disparity is inversely proportional to depth!
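Solving the disparity relation for z gives z = fB / disparity. A minimal sketch follows, assuming the focal length is given in pixels and the baseline in meters; the function name and the numbers are illustrative.

```python
def depth_from_disparity(disparity_px, focal_px, baseline):
    """Depth z = f * B / d for a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero corresponds to infinite depth)")
    return focal_px * baseline / disparity_px

focal_px = 800.0   # focal length in pixels (example value)
baseline = 0.5     # baseline in meters (example value)

near = depth_from_disparity(80.0, focal_px, baseline)
far = depth_from_disparity(40.0, focal_px, baseline)
print(near, far)
```

Halving the disparity doubles the depth, which is the inverse proportionality stated above.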

Depth Sampling

Depth sampling for integer pixel disparity

Quadratic precision loss with depth!
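The quadratic loss follows from z = fB/d: the gap between the depths of neighboring integer disparities d and d + 1 is fB/d - fB/(d + 1) = fB / (d (d + 1)), which is roughly z² / (fB). The sketch below evaluates this gap; the camera parameters are example values and `depth_step` is a hypothetical helper.

```python
def depth_step(disparity_px, focal_px, baseline):
    """Depth gap between integer disparities d and d + 1: f*B / (d * (d + 1))."""
    return focal_px * baseline / (disparity_px * (disparity_px + 1))

focal_px, baseline = 800.0, 0.5   # example values: f in pixels, B in meters

step_at_10m = depth_step(40, focal_px, baseline)   # d = 40 corresponds to z = 10 m
step_at_20m = depth_step(20, focal_px, baseline)   # d = 20 corresponds to z = 20 m

# Doubling the baseline doubles the disparity at the same depth (z = 10 m gives d = 80).
step_at_10m_wide = depth_step(80, focal_px, 2 * baseline)

print(step_at_10m, step_at_20m, step_at_10m_wide)
```

Doubling the depth makes the sampling step about four times coarser, while doubling the baseline roughly halves the step at a given depth.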

Depth Sampling

Depth sampling for wider baseline

Depth Sampling

Depth sampling is in $O(\text{resolution}^6)$

Failures of correspondence search
• Textureless surfaces
• Occlusions, repetition
• Non-Lambertian surfaces, specularities

slide: S. Lazebnik

How can we improve window-based matching?

• The similarity constraint is local (each reference window is matched independently)

• Need to enforce non-local correspondence constraints

slide: S. Lazebnik

Non-local constraints
• Uniqueness
  For any point in one image, there should be at most one matching point in the other image
• Ordering
  Corresponding points should be in the same order in both views
  [Figure: a configuration in which the ordering constraint doesn’t hold]
• Smoothness
  We expect disparity values to change slowly (for the most part)

slide: S. Lazebnik
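The three constraints can be phrased as simple checks on the disparities of one scanline. This is a sketch under the assumption that a reference pixel at column x with disparity d matches column x - d in the other view; the function name and the use of the summed absolute disparity change as a smoothness measure are illustrative choices.

```python
def check_scanline(xs, disparities):
    """Return (unique, ordered, roughness) for matches x -> x - d on one scanline."""
    matches = [x - d for x, d in zip(xs, disparities)]
    # Uniqueness: no two reference pixels claim the same pixel in the other view.
    unique = len(set(matches)) == len(matches)
    # Ordering: matched columns appear in the same left-to-right order.
    ordered = all(a <= b for a, b in zip(matches, matches[1:]))
    # Smoothness: total change between neighboring disparities (lower is smoother).
    roughness = sum(abs(a - b) for a, b in zip(disparities, disparities[1:]))
    return unique, ordered, roughness

xs = [5, 6, 7, 8]
smooth = check_scanline(xs, [1, 1, 1, 1])       # matches 4, 5, 6, 7
collision = check_scanline(xs, [1, 2, 3, 1])    # matches 4, 4, 4, 7
swapped = check_scanline(xs, [1, 1, 4, 1])      # matches 4, 5, 3, 7
print(smooth, collision, swapped)
```

A local window matcher evaluates each pixel independently and therefore cannot guarantee any of these properties; they have to be enforced over the whole scanline or image.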

Multiple-baseline stereo results

[Figure: input images I1, I2, I10]

M. Okutomi and T. Kanade, “A Multiple-Baseline Stereo System,” IEEE Trans. on Pattern Analysis and Machine Intelligence, 15(4):353-363 (1993).

http://www.ri.cmu.edu/pub_files/pub2/okutomi_m_1993_1/okutomi_m_1993_1.pdf

Plane Sweep Stereo
• Choose a reference view
• Sweep a family of planes at different depths with respect to the reference camera

Each plane defines a homography warping each input image into the reference view

[Figure: reference camera and two input images]

R. Collins. A space-sweep approach to true multi-image matching. CVPR 1996.

http://www.ri.cmu.edu/pub_files/pub1/collins_robert_1996_1/collins_robert_1996_1.pdf
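The homography induced by one sweep plane can be written down explicitly. The sketch below assumes pinhole cameras with intrinsics K, a reference camera at the origin, an input camera related by X_src = R X_ref + t, and a plane n·X = d expressed in the reference frame; under these assumptions H = K_src (R + t nᵀ / d) K_ref⁻¹ maps reference pixels to input-image pixels. All names and numbers are illustrative, not taken from the cited paper.

```python
import numpy as np

def plane_homography(K_ref, K_src, R, t, n, d):
    """Homography taking reference pixels to source pixels for the plane n . X = d."""
    return K_src @ (R + np.outer(t, n) / d) @ np.linalg.inv(K_ref)

def project(K, X):
    """Pinhole projection of a 3D point given in the camera's own frame."""
    x = K @ X
    return x[:2] / x[2]

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.2, 0.0, 0.0])   # input camera sits 0.2 units to the right
n = np.array([0.0, 0.0, 1.0])    # fronto-parallel sweep direction

# Check one plane: a 3D point on the plane z = 4 must map consistently.
X = np.array([0.4, -0.3, 4.0])
H = plane_homography(K, K, R, t, n, d=4.0)
ref_pixel = np.append(project(K, X), 1.0)
mapped = H @ ref_pixel
mapped = mapped[:2] / mapped[2]
expected = project(K, R @ X + t)
print(mapped, expected)

# The sweep itself: one homography per depth hypothesis.
homographies = [plane_homography(K, K, R, t, n, d) for d in np.linspace(2.0, 10.0, 5)]
```

To warp an input image into the reference view, each reference pixel is sent through H and the input image is sampled at the resulting location; the warped images are then compared (for example with SAD) and the plane with the best score gives that pixel's depth.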

Real-time 3D reconstruction from video

“Real-Time Plane-sweeping Stereo with Multiple Sweeping Directions”, CVPR 2007

[Figure sequence: 3D scene; warped images; SAD as similarity (darker is higher similarity)]

Multi-way sweep

3D reconstruction from video

[Figure: view 1 to view N]
