computer vision: stereo and 3d reconstruction · computer vision: stereo and 3d reconstruction...

Post on 23-Mar-2020

10 Views

Category:

Documents

1 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Computer Vision: Stereo and 3D Reconstruction

Daniel G. Aliaga

Department of Computer Science

Purdue University

Thanks to S. Narasimhan @ CMU

for some of the slides

Definitions

• Camera geometry (=motion)

– Given corresponded points on ≥2 views, what are the poses of the cameras?

• Correspondence geometry (=correspondence)

– Given a point in one view, what are the constraints of its position in another view?

• Scene geometry (=structure)

– Given corresponded points on ≥2 views and the camera poses, what is the 3D location of the points?

Stereo Rig

scene

LP LL yx ,left image

RP RR yx ,

right image

baseline bAssume that we know corresponds to LP RP

Z

bX

f

xL 2

Z

bX

f

xR 2

Using perspective projection (defined using coordinate system shown)

Z

Y

X

Z

Y

f

y

f

y RL

ZYX ,,

f

Stereo Rig

Z

bX

f

xL 2

Z

bX

f

xR 2

Z

Y

f

y

f

y RL

RL

RL

xx

xxbX

2

RL

RL

xx

yybY

2 RL xx

bfZ

scene

LP LL yx ,

left image

RP RR yx ,

right image

baseline b

ZYX ,,

Stereo: Disparity and Depth

Z

bX

f

xL 2

Z

bX

f

xR 2

Z

Y

f

y

f

y RL

RL

RL

xx

xxbX

2

RL

RL

xx

yybY

2 RL xx

bfZ

is the disparity between corresponding left and right image points RL xxd

• disparity increases with baseline b • inversely proportional to depth

scene

LP LL yx ,

left image

RP RR yx ,

right image

baseline b

ZYX ,,

Stereo: Ray Triangulation

scene point

optical center optical center image plane

(Ray) Triangulation: compute reconstruction as intersection of two rays

Stereo: Ray Triangulation

scene point

optical center optical center image plane

Do two lines intersect in 3D?

If so, how do you compute their intersection?

Stereo: Ray Triangulation

Equations for the intersection:

0)()( 21 ba pppp

0)()( 43 ba pppp

)pp(spp 121b

)pp(tpp 343a

...s

...t

)(5.0 ba ppp

1p

2p

3p

4p

ap

bp

p

Solve for s and t, compute p:

field of view

of stereo

Stereo: Vergence

1. Field of view decreases with increase in baseline and vergence 2. Accuracy increases with increase in baseline and vergence

one pixel

uncertainty of

scenepoint

optical axes of the two cameras need not be parallel

Camera Geometry

• We need to transform “left frame” to “right frame” – includes a rotation and translation:

scene

Left frame Right frame

Lx~ Rx~

333231

232221

131211

rrr

rrr

rrr

R

• In matrix notation, we can write as:

Camera Geometry

• In matrix notation, we can write as:

Camera Geometry

RLLL

RLLL

RLLL

zrzryrxr

yrzryrxr

xrzryrxr

34333231

24232221

14131211

Camera Geometry: Orthonormality Constraints

(a) Rows of R are perpendicular vectors

(b) Each row of R is a unit vector

IRRT

NOTE: Constraints

are NON-LINEAR!

Problem:

Given ‘s

Find

Lx~ Rx~

LRtR ),...,,( 341211 rrr

Camera Geometry: Problem Definition

scene

Lx~ Rx~

•subject to (nonlinear) constraints

Problem: same image coords can be generated by doubling Lx~ Rx~ LRt~

thus, we can find only up to a scale factor!

Solution: fix scale by using constraint: (1 additional equation)

Camera Geometry: Scale Ambiguity

Lx~

Rx~

LRt~

Lx~2

Rx~2

LRt~

2

LRt~

1~~

LRLR tt

Camera Geometry: How many scene points are needed?

Each scene point gives 3 equations:

and 6+1 additional equations from orthonormality of rotation matrix

constraints and scale constraint.

Thus, for n scene points, we have (3n + 6 + 1) equations and 12

unknowns

•What is the minimum value for n?

RLLL

RLLL

RLLL

zrzryrxr

yrzryrxr

xrzryrxr

34333231

24232221

14131211

Camera Geometry: Solving an Over-determined System

• Generally, more than 3 points are used to find the 12 unknowns

• Formulate error for scene point i as:

• Find that minimize:

RLRLi xtxRe ~)~(

)]1()([|| 2

1

12

LRLRT

N

i

i ttIRReE

LRtR &

Camera Geometry: A Linear Estimation

Assume a near correct rotation is known. Then an orthogonal

rotation matrix looks like:

•Using this matrix, iteratively and linearly solve for ’s and tLR:

1

1

1

xy

xz

yz

R

0~)~( RLRL xtxR

•How many equations/scene-points are needed?

•6 unknowns, 3 equations per scene point, so ≥ 2 points

•Limitations:

•1. ignores normality/scale (fix by re-scaling each iteration)

•2. assumes good initial guess

where is the 3D rotation axis

and its length is the amount by

which to rotate

Correspondence

scene point

optical center optical center image plane

Correspondence

scene point

optical center optical center image plane

epipolar line

Correspondence

scene point

optical center image plane

optical center

epipolar line epipolar line

Epipolar Geometry

Epipolar Constraint: reduces correspondence problem to 1D search along conjugate epipolar lines

scene point

optical center image plane

optical center

epipolar line epipolar line epipolar plane

Epipolar Geometry

Epipolar Constraint: can be expressed using the fundamental matrix F

scene point

optical center image plane

optical center

epipolar line epipolar line epipolar plane

Epipolar Geometry

converging

cameras

motion parallel with image plane

Epipolar Geometry

e

e’

Epipolar Geometry

Forward motion

Epipolar Geometry

Correspondence reduced to looking in a small neighborhood of a line…

Fundamental Matrix

How to compute the fundamental matrix?

1. geometric explanation…

2. algebraic explanation…

Fundamental Matrix: Geometric Exp.

Thus, there is a mapping

xx

X

ee

l

lx

point line

c c

Fundamental Matrix: Geometric Exp.

How do you map a point to a line?

xx

X

ee

l

c c

Fundamental Matrix: Geometric Exp.

Idea:

- We know ‘s are in a plane

- Define a line by its “perpendicular”, then we can use dot product; e.g.,

xx

X

ee

l

)(x

0 lx 0)'( lcx

c c

or

What is a definition of as perpendicular to the pictured epipolar line?

Fundamental Matrix: Geometric Exp.

xx

X

ee

l

c c

)()( cxcel

l

xel (assume all in canonical frame

of the right-side camera)

Cross product can be expressed using matrix notation:

Fundamental Matrix: Geometric Exp.

xel

z

y

x

xy

xz

yz

x

x

x

ee

ee

ee

xe

0

0

0

xexe ][

xel ][

How do you compute ?

Fundamental Matrix: Geometric Exp.

x

Use a homography (or projective transformation) to map to x x

(Homography: maps points in a plane to another plane)

...

...

...

,,

1

H

w

xw

xw

xx

x

x y

x

y

x

Hxx

Fundamental Matrix: Geometric Exp.

xel ][

Hxx Hxel ][ HeF ][ 0 Fxx T

xx

X

ee

l

c c

Epipolar Constraint 0 lxWant …

Fundamental Matrix: Algebraic Exp.

xx

X

ee

l

c c

PXx

XtRx

XPx

Fundamental Matrix: Algebraic Exp.

PXx ?X

tcxPtX )( Pwhere is the pseudoinverse of P

Why pseudoinverse?

Since not square, pseudoinverse means but solved as an optimization

IPP P

xel ][

?x

What is in terms of ?

Recall

x x

(Let’s assume which means in on the image plane) 0t X

xPPx PPeF ][ 0 Fxx T

Epipolar Constraint

Correspondence: Epipolar Geometry

Epipolar constraint reduces correspondence problem to 1D search along conjugate epipolar lines

scene point

optical center image plane

optical center

epipolar line epipolar line epipolar plane xx

Correspondence: Epipolar Geometry

Epipolar constraint can be expressed as

scene point

optical center image plane

optical center

epipolar line epipolar line epipolar plane

0 Fxx T

xx

Fundamental matrix

Correspondence: Epipolar Geometry

Interesting case: what happens if camera motion is pure translation?

x x

If motion parallel to x-axis…

]0|[IP ]|[ tIP

eF

Te 001

010

100

000

Fimplies horizontal

epipolar line…

Thus the desire to do image rectification

IH ( ) ?F

Correspondence: Epipolar Geometry

Thus for rectified images, correspondence is reduced to looking in a small neighborhood of a line…

Essential Matrix

• Similar to the fundamental matrix but includes the intrinsic calibration matrix, thus the equation is in terms of the normalized image coordinates, e.g.:

0'' Exx T FKKE T'and

essential matrix

Scene Geometry scene point?

Camera A Camera B

Camera geometry known

Correspondence and epipolar geometry known

What is the location of the scene point (scene geometry)?

Scene Geometry: Linear Formulation

XMx aa

~~

ax~

X~

aa KPM bb KPM

XMx bb

~~

bx~

or Problem?

Assumes we know

But what is the value for ?

Twyxx ~

w

Scene Geometry: Linear Formulation

XMx~~

'

''

~

w

yx

s

sysx

x

'

''

~

w

yx

x

Recall

where

'/''/'

wywx

yx where and are the

observed projections x y

Let , thus ?s 'ws

14131211 mZmYmXmsx Hence?

24232221 mZmYmXmsy

Scene Geometry: Linear Formulation

14131211 mZmYmXmsx

Given 24232221 mZmYmXmsy

For a scene point, how many unknowns?

For a scene point, how many camera views needed?

In general, one scene point observed in at least two views is sufficient…

3+N

3N≥3+N

and N cameras

34333231 mZmYmXms

Scene Geometry: Linear Formulation

b

a

s

sZYX

14131211 mZmYmXmsx

24232221 mZmYmXmsy

34333231 mZmYmXms

x2

Scene Geometry: Linear Formulation

'ssZYX

10'''

'0'''

'0'''

01

0

0

333231

232221

131211

333231

232221

131211

mmm

ymmm

xmmm

mmm

ymmm

xmmm

34

24

14

34

24

14

'

'

'

m

m

m

m

m

m

14131211 mZmYmXmsx

24232221 mZmYmXmsy

34333231 mZmYmXms

x2

Cameras M and M’

Scene Geometry: Nonlinear Form.

• Remember “Bundle Adjustment”

– Given initial guesses, use nonlinear least squares to refine/compute the calibration parameters

– Simple but good convergence depends on accuracy of initial guess

Recall

ij ji

jiij

ji

jiij

Xm

Xmy

Xm

Xmx

mnE 2

3

22

3

1)~

~

()~

~

(1

Goal is 0E

For scene geometry, are the unknowns… X~

Scene Geometry: Nonlinear Form.

Example Result

• Using dense feature-based stereo

• Next: images and features!

[Pollefeys99]

top related