Stanford CS223B Computer Vision, Winter 2007

Lecture 8 Structure From Motion

Professors Sebastian Thrun and Jana Košecká

CAs: Vaibhav Vaish and David Stavens

Slide credit: Gary Bradski, Stanford SAIL

Structure From Motion

[Figure: camera poses and scene features]

Recover: structure (feature locations), motion (camera extrinsics)


SFM = Holy Grail of 3D Reconstruction

Take movie of object
Reconstruct 3D model
Would be commercially highly viable

live.com


Structure From Motion (1)

[Tomasi & Kanade 92]


Structure From Motion (4a): Images

Marc Pollefeys


Structure From Motion (4b)

Marc Pollefeys



http://www.cs.unc.edu/Research/urbanscape



Problem 1:
– Given n points pij = (xij, yij) in m images
– Reconstruct structure: 3-D locations Pj = (xj, yj, zj)
– Reconstruct camera positions (extrinsics) Mi = (Ai, bi)

Problem 2:
– Establish correspondence: c(pij)




Recovery Problems

                    1 image        2+ images
Location known      calibration    stereo
Location unknown    -              SFM, stitching


SFM: General Formulation

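Perspective projection of feature j into camera i, with rotation angles $(\phi_i, \psi_i, \theta_i)$, translation $b_i$ and focal length $f$:

$$
\begin{pmatrix} X_{ij} \\ Y_{ij} \\ Z_{ij} \end{pmatrix}
= R_i \begin{pmatrix} P_{j,x} \\ P_{j,y} \\ P_{j,z} \end{pmatrix}
+ \begin{pmatrix} b_{i,x} \\ b_{i,y} \\ b_{i,z} \end{pmatrix},
\qquad
\begin{pmatrix} p_{ij,x} \\ p_{ij,y} \end{pmatrix}
= \frac{f}{Z_{ij}} \begin{pmatrix} X_{ij} \\ Y_{ij} \end{pmatrix}
$$

$$
R_i =
\begin{pmatrix} \cos\phi_i & -\sin\phi_i & 0 \\ \sin\phi_i & \cos\phi_i & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos\psi_i & 0 & \sin\psi_i \\ 0 & 1 & 0 \\ -\sin\psi_i & 0 & \cos\psi_i \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta_i & -\sin\theta_i \\ 0 & \sin\theta_i & \cos\theta_i \end{pmatrix}
$$

[Figure: pinhole camera with optical center O and focal length f; a scene point at lateral offset X and depth Z projects to x = fX/Z]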
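The projection model on this slide can be sketched in a few lines of NumPy: rotate each point, add the camera translation, then divide by depth. The function names, the angle order (z, then y, then x) and the sign convention of the rotation matrices are assumptions made for illustration; the slide itself only shows the three elementary rotations and the pinhole relation x = fX/Z.

```python
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def project(P, angles, b, f=1.0):
    """Perspective projection of 3-D points P (n x 3) into one camera."""
    phi, psi, theta = angles
    R = rot_z(phi) @ rot_y(psi) @ rot_x(theta)   # assumed composition order
    cam = P @ R.T + b                            # (X, Y, Z) in the camera frame
    return f * cam[:, :2] / cam[:, 2:3]          # (fX/Z, fY/Z)

# Two points seen by an unrotated camera at the origin with f = 2.
P = np.array([[0.0, 0.0, 4.0], [1.0, -1.0, 5.0]])
p = project(P, angles=(0.0, 0.0, 0.0), b=np.zeros(3), f=2.0)
```

With the identity rotation the second point lands at (2·1/5, 2·(−1)/5) = (0.4, −0.4), which is just x = fX/Z applied per coordinate.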


SFM: Bundle Adjustment

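$$
\min_{\{\phi_i, \psi_i, \theta_i, b_i\},\ \{P_j\}} \ \sum_{i,j}
\left\| \begin{pmatrix} p_{ij,x} \\ p_{ij,y} \end{pmatrix}
- \frac{f}{Z_{ij}} \begin{pmatrix} X_{ij} \\ Y_{ij} \end{pmatrix} \right\|^2,
\qquad
\begin{pmatrix} X_{ij} \\ Y_{ij} \\ Z_{ij} \end{pmatrix} = R_i P_j + b_i
$$

Here $R_i = R_z(\phi_i)\, R_y(\psi_i)\, R_x(\theta_i)$ is the rotation of camera i, $b_i$ its translation, $P_j$ the 3-D location of feature j, and $f$ the focal length.

[Figure: pinhole camera model, x = fX/Z]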


Bundle Adjustment

SFM = Nonlinear Least Squares problem

Minimize through
– Gradient Descent
– Conjugate Gradient
– Gauss-Newton
– Levenberg-Marquardt (common method)

Prone to local minima
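To make the nonlinear least squares view concrete, here is a minimal Levenberg-Marquardt sketch on a tiny synthetic problem. Everything in it is an assumption for illustration: the parameter layout (three angles and three translations per camera, three coordinates per point), the finite-difference Jacobian, the damping schedule, and the synthetic data. Production bundle adjusters use analytic, sparse Jacobians instead.

```python
import numpy as np

def rotation(angles):
    """R = Rz(phi) Ry(psi) Rx(theta); the angle order is an assumption."""
    phi, psi, theta = angles
    cz, sz = np.cos(phi), np.sin(phi)
    cy, sy = np.cos(psi), np.sin(psi)
    cx, sx = np.cos(theta), np.sin(theta)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def residuals(x, obs, m, n, f=1.0):
    """Stacked reprojection errors; x = [6 values per camera, 3 per point]."""
    cams = x[:6 * m].reshape(m, 6)
    P = x[6 * m:].reshape(n, 3)
    res = []
    for i in range(m):
        cam = P @ rotation(cams[i, :3]).T + cams[i, 3:]
        res.append(f * cam[:, :2] / cam[:, 2:3] - obs[i])
    return np.concatenate(res).ravel()

def numeric_jacobian(fun, x, eps=1e-6):
    r0 = fun(x)
    J = np.empty((r0.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x)
        dx[k] = eps
        J[:, k] = (fun(x + dx) - r0) / eps   # forward difference
    return J

def levenberg_marquardt(fun, x0, iters=50, lam=1e-3):
    x = x0.copy()
    cost = np.sum(fun(x) ** 2)
    for _ in range(iters):
        r = fun(x)
        J = numeric_jacobian(fun, x)
        H = J.T @ J + lam * np.eye(x.size)   # damped normal equations
        step = np.linalg.solve(H, -J.T @ r)
        new_cost = np.sum(fun(x + step) ** 2)
        if new_cost < cost:   # accept: reduce damping (closer to Gauss-Newton)
            x, cost, lam = x + step, new_cost, max(lam * 0.5, 1e-9)
        else:                 # reject: increase damping (closer to gradient descent)
            lam *= 10.0
    return x, cost

# Synthetic scene: 3 cameras about 5 units in front of 8 points.
rng = np.random.default_rng(0)
m, n = 3, 8
P_true = rng.uniform(-1.0, 1.0, (n, 3))
cams_true = np.hstack([rng.normal(0.0, 0.1, (m, 3)),
                       rng.normal(0.0, 0.1, (m, 3)) + [0.0, 0.0, 5.0]])
x_true = np.concatenate([cams_true.ravel(), P_true.ravel()])
obs = residuals(x_true, np.zeros((m, n, 2)), m, n).reshape(m, n, 2)

fun = lambda x: residuals(x, obs, m, n)
x0 = x_true + rng.normal(0.0, 0.05, x_true.size)   # perturbed starting guess
initial_cost = np.sum(fun(x0) ** 2)
x_hat, final_cost = levenberg_marquardt(fun, x0)
```

The damping term also keeps the linear system solvable even though the problem has unrecoverable degrees of freedom (global pose and scale). Started far from the truth, the same loop can settle in a local minimum, which is the weakness the slide points out.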


Count # Constraints vs #Unknowns

m camera poses
n points
2mn point constraints
6m + 3n unknowns

Suggests: need 2mn ≥ 6m + 3n
But: Can we really recover all parameters???


How Many Parameters Can’t We Recover?

We can recover all but…

0   3   6   7   8   10   12   n   m   nm

Place Your Bet!

m = # camera poses, n = # feature points





– Can’t recover origin, orientation (6 params)
– Can’t recover scale (1 param)

Thus, we need 2mn ≥ 6m + 3n - 7
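The counting condition 2mn ≥ 6m + 3n - 7 is easy to tabulate. The helper names below are hypothetical, and the count is only a necessary condition: it says nothing about degenerate configurations.

```python
def enough_constraints(m, n, per_camera=6, gauge=7):
    """True if 2mn measurements cover per_camera*m + 3n - gauge unknowns."""
    return 2 * m * n >= per_camera * m + 3 * n - gauge

def min_points(m, per_camera=6, gauge=7):
    """Smallest number of points n that satisfies the count for m views."""
    n = 1
    while not enough_constraints(m, n, per_camera, gauge):
        n += 1
    return n

two_views = min_points(2)     # 4n >= 3n + 5
three_views = min_points(3)   # 6n >= 3n + 11
```

For two views the count requires at least five points, and for three views at least four.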


Are we done?

No, bundle adjustment has many local minima.


The “Trick Of The Day”

Replace Perspective by Orthographic Geometry

Replace Euclidean Geometry by Affine Geometry

Solve SFM linearly via PCA (“closed” form, globally optimal)

Post-Process to make solution Euclidean

Post-Process to make solution perspective

By Tomasi and Kanade, 1992


Orthographic Camera Model

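Orthographic = Limit of Pinhole Model:

Extrinsic parameters (the 3×3 matrix is the rotation):

$$
\begin{pmatrix} p_x \\ p_y \\ p_z \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\begin{pmatrix} P_x \\ P_y \\ P_z \end{pmatrix} +
\begin{pmatrix} b_x \\ b_y \\ b_z \end{pmatrix}
$$

Orthographic projection:

$$
\begin{pmatrix} p_x \\ p_y \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} P_X \\ P_Y \\ P_Z \end{pmatrix} +
\begin{pmatrix} b_x \\ b_y \end{pmatrix}
= A P + b
$$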


Orthographic Projection

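Limit of Pinhole Model:

$$
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i \text{, feature } j \text{)}
$$

Orthographic projection:

$$
\begin{pmatrix} p_x \\ p_y \end{pmatrix} =
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix}
\begin{pmatrix} P_X \\ P_Y \\ P_Z \end{pmatrix} +
\begin{pmatrix} b_x \\ b_y \end{pmatrix}
= A P + b
$$

The rotation is

$$
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
$$

so the rows $a_1 = (a_{11}, a_{12}, a_{13})$ and $a_2 = (a_{21}, a_{22}, a_{23})$ of $A$ satisfy

$$
\|a_1\|^2 = 1, \qquad \|a_2\|^2 = 1, \qquad a_1 \cdot a_2 = 0
$$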


The Orthographic SFM Problem

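Recover $\{A_i, b_i\}$ and $\{P_j\}$ from

$$
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i \text{, feature } j \text{)}
$$

subject to (with $a_1$, $a_2$ the two rows of each $A_i$)

$$
\|a_1\|^2 = 1, \qquad \|a_2\|^2 = 1, \qquad a_1 \cdot a_2 = 0
$$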


The Affine SFM Problem

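Recover $\{A_i, b_i\}$ and $\{P_j\}$ from

$$
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i \text{, feature } j \text{)}
$$

Drop the "subject to" constraints of the orthographic problem:

$$
\|a_1\|^2 = 1, \qquad \|a_2\|^2 = 1, \qquad a_1 \cdot a_2 = 0
$$

What remains is

$$
p_{ij} = A_i P_j + b_i \qquad \text{(camera } i \text{, feature } j \text{)}
$$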


How Many Parameters Can’t We Recover?

We can recover all but…

0   3   6   7   8   10   12   n   m   nm

Place Your Bet!


The Answer is (at least): 12

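$$
p_{ij} = A_i P_j + b_i
$$

Proof: for any vector $d$ and any non-singular matrix $C$, define

$$
A_i' = A_i C, \qquad P_j' = C^{-1} P_j + C^{-1} d, \qquad b_i' = b_i - A_i d
$$

Then

$$
\begin{aligned}
A_i' P_j' + b_i' &= (A_i C)(C^{-1} P_j + C^{-1} d) + b_i - A_i d \\
&= A_i P_j + A_i d - A_i d + b_i \\
&= A_i P_j + b_i = p_{ij}
\end{aligned}
$$

so $p_{ij} = A_i' P_j' + b_i'$ as well. $C$ contributes 9 free parameters and $d$ contributes 3, hence 12.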
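The ambiguity can be checked numerically: transform cameras and points by an arbitrary non-singular C and vector d, and the image measurements do not change. The random test data and the helper name `project_affine` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 6
A = rng.normal(size=(m, 2, 3))   # affine cameras A_i (2 x 3 each)
b = rng.normal(size=(m, 2))      # offsets b_i
P = rng.normal(size=(n, 3))      # points P_j

def project_affine(A, b, P):
    """p[i, j] = A_i P_j + b_i for all cameras i and points j."""
    return np.einsum('ikl,jl->ijk', A, P) + b[:, None, :]

# Fixed, clearly non-singular C (determinant about 3) and an arbitrary d.
C = np.array([[2.0, 0.3, 0.0], [0.0, 1.0, 0.5], [0.1, 0.0, 1.5]])
d = np.array([0.7, -1.2, 0.4])
C_inv = np.linalg.inv(C)

A2 = A @ C                 # A'_i = A_i C
P2 = (P + d) @ C_inv.T     # P'_j = C^{-1} P_j + C^{-1} d
b2 = b - A @ d             # b'_i = b_i - A_i d

same_images = np.allclose(project_affine(A, b, P), project_affine(A2, b2, P2))
```

`same_images` comes out true for any choice of non-singular C and any d, which is the 9 + 3 = 12 parameter family the slide counts.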


Points for Solving Affine SFM Problem

m camera poses
n points

Need to have: 2mn ≥ 8m + 3n - 12
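A short worked check of this count. It reads the 8m as six entries of $A_i$ plus two of $b_i$ per camera, which is an interpretation the slide does not spell out.

$$
n = 4: \quad 2m \cdot 4 = 8m \ \ge\ 8m + 3 \cdot 4 - 12 = 8m \quad \text{(equality for every } m \text{)}
$$

$$
n = 3: \quad 6m \ \ge\ 8m + 9 - 12 \iff m \le 1.5
$$

By counting alone, four points are enough for any number of views, while three points are never enough once there are two or more views. As with the perspective count, this is a necessary condition only.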


Affine SFM

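Start from $p_{ij} = A_i P_j + b_i$ and fix the coordinate system by making $p_{i0} = P_0 =$ origin, so that

$$
p_{ij} = A_i P_j
$$

m cameras:

$$
q_j = \begin{pmatrix} p_{1j} \\ \vdots \\ p_{mj} \end{pmatrix}
= \begin{pmatrix} A_1 \\ \vdots \\ A_m \end{pmatrix} P_j = A P_j
$$

n points:

$$
Q = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & & \vdots \\ p_{m1} & \cdots & p_{mn} \end{pmatrix} = A D,
\qquad
D = \begin{pmatrix} P_1 & \cdots & P_n \end{pmatrix}
$$

Rank Theorem: Q has rank 3

Proof: $A$ has size $2m \times 3$ and $D$ has size $3 \times n$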


The Rank Theorem

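$$
\begin{pmatrix}
p_{11,x} & \cdots & p_{1n,x} \\
p_{11,y} & \cdots & p_{1n,y} \\
\vdots & & \vdots \\
p_{m1,x} & \cdots & p_{mn,x} \\
p_{m1,y} & \cdots & p_{mn,y}
\end{pmatrix}
\quad \text{has rank 3}
$$

The matrix has 2m rows (two per camera) and n columns (one per feature).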
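The rank theorem is easy to verify on synthetic data. This sketch moves the origin to the centroid of the points by subtracting each row's mean; that is one common way to fix the coordinate system, and it is an assumption here, since the slides fix it with a reference point instead. All data is random and noise-free.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 5, 10
A = rng.normal(size=(2 * m, 3))   # stacked affine cameras, 2m x 3
b = rng.normal(size=(2 * m, 1))   # stacked offsets, one per image row
D = rng.normal(size=(3, n))       # 3-D points as columns, 3 x n

Q_raw = A @ D + b                                # measurements including offsets
Q = Q_raw - Q_raw.mean(axis=1, keepdims=True)    # origin moved to the centroid

rank_raw = np.linalg.matrix_rank(Q_raw)
rank_centered = np.linalg.matrix_rank(Q)
```

Without removing the offsets the measurement matrix generically has rank 4; once the origin is fixed, the rank drops to 3 regardless of how many cameras or points are stacked.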


Singular Value Decomposition

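$$
\underbrace{\begin{pmatrix}
p_{11,x} & \cdots & p_{1n,x} \\
p_{11,y} & \cdots & p_{1n,y} \\
\vdots & & \vdots \\
p_{m1,x} & \cdots & p_{mn,x} \\
p_{m1,y} & \cdots & p_{mn,y}
\end{pmatrix}}_{2m \times n}
= \underbrace{U}_{2m \times 3} \ \underbrace{W}_{3 \times 3} \ \underbrace{V^T}_{3 \times n}
$$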


Affine Solution to Orthographic SFM

$W V^T$: affine structure
$U$: affine camera positions

Gives also the optimal affine reconstruction under noise
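A minimal sketch of the factorization step, assuming the measurement matrix stacks an x row and a y row per camera and that the origin is fixed by subtracting each row's mean. The function name and the synthetic test data are illustrative, not from the slides.

```python
import numpy as np

def affine_factorization(Q_raw):
    """Rank-3 factorization of a 2m x n measurement matrix via SVD."""
    t = Q_raw.mean(axis=1, keepdims=True)    # per-row centroid (image offsets)
    Q = Q_raw - t
    U, w, Vt = np.linalg.svd(Q, full_matrices=False)
    cameras = U[:, :3]                       # 2m x 3: affine cameras
    structure = np.diag(w[:3]) @ Vt[:3]      # 3 x n: affine structure (W V^T)
    return cameras, structure, t

# Noise-free synthetic measurements from 4 cameras and 12 points.
rng = np.random.default_rng(3)
A_true = rng.normal(size=(8, 3))
D_true = rng.normal(size=(3, 12))
Q_raw = A_true @ D_true + rng.normal(size=(8, 1))

cameras, structure, t = affine_factorization(Q_raw)
max_error = np.abs(cameras @ structure + t - Q_raw).max()
```

Keeping only the three largest singular values gives the best rank-3 approximation in the least squares sense, which is why the same code also applies to noisy measurements. The recovered cameras and structure match the true ones only up to the affine ambiguity (C, d) discussed earlier.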


Back To Orthographic Projection

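Constraints (with $a_1$, $a_2$ the two rows of each transformed camera matrix $A_i'$):

$$
\|a_1\|^2 = 1, \qquad \|a_2\|^2 = 1, \qquad a_1 \cdot a_2 = 0
$$

with

$$
p_{ij} = A_i' P_j' + b_i', \qquad
A_i' = A_i C, \qquad
P_j' = C^{-1} P_j + C^{-1} d, \qquad
b_i' = b_i - A_i d
$$

$d$ vector, $C$ non-singular matrix

Find C for which constraints are met
Search in 9-dim space (instead of 8m + 3n - 12)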
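One way to find such a C without a 9-dimensional search is to note that the constraints are linear in the symmetric matrix L = C Cᵀ: solve for the six entries of L by least squares, then factor L. This is a swapped-in technique (the metric constraints of the factorization method), not the search the slide describes. It assumes noise-free or low-noise data so that L comes out positive definite, and all names below are illustrative.

```python
import numpy as np

def constraint_row(u, v):
    """Coefficients of u^T L v in the 6 unknowns of a symmetric 3x3 L."""
    return np.array([u[0] * v[0],
                     u[0] * v[1] + u[1] * v[0],
                     u[0] * v[2] + u[2] * v[0],
                     u[1] * v[1],
                     u[1] * v[2] + u[2] * v[1],
                     u[2] * v[2]])

def metric_upgrade(A_aff):
    """Return C such that each 2x3 camera block of A_aff @ C has orthonormal rows."""
    rows, rhs = [], []
    for a1, a2 in zip(A_aff[0::2], A_aff[1::2]):
        rows += [constraint_row(a1, a1), constraint_row(a2, a2), constraint_row(a1, a2)]
        rhs += [1.0, 1.0, 0.0]
    l = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    return np.linalg.cholesky(L)   # raises LinAlgError if L is not positive definite

# Synthetic test: true orthographic cameras distorted by a known affine map.
rng = np.random.default_rng(4)
blocks = []
for k in range(4):
    R, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
    blocks.append(R[:2])                           # two orthonormal rows
A_true = np.vstack(blocks)                         # 8 x 3
C0 = np.array([[2.0, 0.3, 0.0], [0.0, 1.0, 0.5], [0.1, 0.0, 1.5]])
A_aff = A_true @ np.linalg.inv(C0)                 # affine cameras, as from SVD

C = metric_upgrade(A_aff)
A_metric = A_aff @ C
grams = [A_metric[2 * i:2 * i + 2] @ A_metric[2 * i:2 * i + 2].T for i in range(4)]
```

Each Gram matrix in `grams` is the 2×2 identity, so the constraints hold for every camera. The recovered C equals the true distortion only up to a rotation, which is the part of the ambiguity that cannot be removed.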


Back To Projective Geometry

[Figure: orthographic projection (in the limit) compared with projective projection]


Back To Projective Geometry

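$$
\min_{\{\phi_i, \psi_i, \theta_i, b_i\},\ \{P_j\}} \ \sum_{i,j}
\left\| \begin{pmatrix} p_{ij,x} \\ p_{ij,y} \end{pmatrix}
- \frac{f}{Z_{ij}} \begin{pmatrix} X_{ij} \\ Y_{ij} \end{pmatrix} \right\|^2,
\qquad
\begin{pmatrix} X_{ij} \\ Y_{ij} \\ Z_{ij} \end{pmatrix} = R_i P_j + b_i
$$

where $R_i = R_z(\phi_i)\, R_y(\psi_i)\, R_x(\theta_i)$ is the rotation of camera i, $b_i$ its translation, $P_j$ the 3-D location of feature j, and $f$ the focal length.

[Figure: pinhole camera model, x = fX/Z]

Optimize, using orthographic solution as starting point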




Problem 1:
– Given n points pij = (xij, yij) in m images
– Reconstruct structure: 3-D locations Pj = (xj, yj, zj)
– Reconstruct camera positions (extrinsics) Mi = (Ai, bi)

Problem 2:
– Establish correspondence: c(pij)


The Correspondence Problem

[Figure: three images of the same scene: View 1, View 2, View 3]


Correspondence: Solution 1

Track features (e.g., optical flow)

…but fails when images taken from widely different poses


Correspondence: Solution 2

Start with random solution A, b, P
Compute soft correspondence: p(c|A,b,P)
Plug soft correspondence into SFM
Reiterate

See Dellaert/Seitz/Thorpe/Thrun, Machine Learning Journal, 2003
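The "soft correspondence" step can be illustrated with a simplified E-step: given the image positions predicted by the current A, b, P, each measurement is assigned to every feature with a weight that decays with squared distance. The isotropic Gaussian noise model, the parameter `sigma` and the function name are assumptions for illustration, and this is not necessarily the formulation used in the cited paper.

```python
import numpy as np

def soft_correspondence(measured, predicted, sigma=1.0):
    """Weights W[k, j] ~ p(measurement k comes from feature j); rows sum to 1."""
    diff = measured[:, None, :] - predicted[None, :, :]
    d2 = (diff ** 2).sum(axis=2)                   # squared image distances
    logw = -d2 / (2.0 * sigma ** 2)
    logw -= logw.max(axis=1, keepdims=True)        # numerical stability
    w = np.exp(logw)
    return w / w.sum(axis=1, keepdims=True)

# Predicted projections of three features, and three unlabeled measurements.
predicted = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
measured = np.array([[2.9, 0.1], [0.1, -0.2], [0.2, 3.1]])

W = soft_correspondence(measured, predicted, sigma=0.5)
best = W.argmax(axis=1)   # most likely feature for each measurement
```

In an EM loop, these weights would be used to form weighted measurements for the SFM step, and the weights would be recomputed from the updated A, b, P on the next iteration.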


Example


Results: Cube


Animation


Tomasi’s Benchmark Problem


Reconstruction with EM


3-D Structure


Correspondence: Alternative Approach

RANSAC [Fischler/Bolles]

= Random Sample Consensus

Will be discussed Wednesday


Summary SFM

Problem
– Determine feature locations (=structure)
– Determine camera extrinsic (=motion)

Two Principal Solutions
– Bundle adjustment (nonlinear least squares, local minima)
– SVD (through orthographic approximation, affine geometry)

Correspondence
– (RANSAC)
– Expectation Maximization
