a global linear method for camera pose registration
DESCRIPTION
A Global Linear Method for Camera Pose Registration. Nianjuan Jiang*1, Zhaopeng Cui*2, Ping Tan2. 1 Advanced Digital Sciences Center, Singapore; 2 National University of Singapore. *Joint first authors. Structure from Motion (SfM).
1
A Global Linear Method for Camera Pose Registration
Nianjuan Jiang*1, Zhaopeng Cui*2, Ping Tan2
1 Advanced Digital Sciences Center, Singapore
2 National University of Singapore
*Joint first authors
2
Structure from Motion (SfM)
Simultaneously recover both 3D scene points and camera poses
3
SfM Pipeline
Step 1. Epipolar geometry;
compute relative motion between 2 or 3 cameras
• 6-point method [Quan 1995]
• 7-point method [Torr & Murray 1997]
• 8-point method (normalized) [Hartley 1997]
• 5-point method [Nister 2004]
Images with matched feature points
4
SfM Pipeline
Step 1. Epipolar geometry;
Step 2. Camera registration;
put all cameras in the same coordinate system (auto-calibration if needed [Pollefeys et al. 1998])
• [Fitzgibbon & Zisserman 1998]
• [Pollefeys et al. 2004]
5
SfM Pipeline
Step 1. Epipolar geometry;
Step 2. Camera registration;
Step 3. Bundle adjustment.
optimize all cameras and points
• [Triggs et al. 1999]
6
“The Black Art”
Step 1. Epipolar geometry;
Step 2. Camera registration;
Step 3. Bundle adjustment.
The state of the art:
1. Steps 1 and 3 are very well studied, with elegant theories and algorithms.
2. Step 2 is often ad hoc and heuristic.
The camera registration to initialize bundle adjustment “… is still to some extent a black art…”.
Page 452, Chapter 18.6
7
Typical Solutions
Hierarchical solution: iteratively merge sub-sequences
[Fitzgibbon & Zisserman 1998] [Lhuillier & Quan 2005]
8
Typical Solutions
Hierarchical solution: iteratively merge sub-sequences
[Fitzgibbon & Zisserman 1998] [Lhuillier & Quan 2005]
Incremental solution: iteratively add cameras one by one
[Pollefeys et al. 2004] [Snavely et al. 2006]
9
Pain of Existing Solutions
The block diagram (for the incremental solution):
Initial Reconstruction (2 cameras) → Add Cameras → Bundle Adjustment → More Cameras? (if yes, back to Add Cameras)
Drawbacks:
1. Repetitively calling bundle adjustment → inefficiency: 90% of the total computation time is spent on bundle adjustment.
2. Some cameras are fixed before the others → the asymmetric formulation leads to inferior results.
Our objective: simultaneously register all cameras to initialize the bundle adjustment.
Step 1: Epipolar Geometry → Register All Cameras in a Single Step → Step 3: Bundle Adjustment
10
Previous Works
[Govindu 2001]
[Martinec et al. 2007] [Arie-Nachimson et al. 2012] [Kahl 2005]
linear global solution to rotations
[Hartley et al. 2013]
elegant quasi-convex optimization
linear global solution to translations
[Crandall et al. 2011]
discrete-continuous optimization
Limitations:
• cannot solve translations
• sensitive to outliers
• require coplanar cameras
• degenerate at collinear motion
Desirable features:
1. Solve both rotations & translations;
2. Linear & robust solution;
3. No degeneracy.
11
The Input: Epipolar Geometry
The essential matrix encodes the relative motion:
$E_{ij} = [t_{ij}]_\times R_{ij}$
$E_{ij} \rightarrow R_{ij}$ and $t_{ij}$
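The relation between the essential matrix and the relative motion can be checked numerically. The following is a minimal sketch, not the authors' code: the helper names `skew` and `essential_matrix`, the sample motion, and the coordinate convention $X_j = R_{ij} X_i + t_{ij}$ are all assumptions made for illustration, since the slide does not state a convention.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def essential_matrix(R_ij, t_ij):
    """E_ij = [t_ij]_x R_ij."""
    return skew(t_ij) @ R_ij

# Hypothetical relative motion: a 10-degree rotation about the y axis plus a baseline.
a = np.deg2rad(10.0)
R_ij = np.array([[np.cos(a), 0.0, np.sin(a)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(a), 0.0, np.cos(a)]])
t_ij = np.array([1.0, 0.2, 0.0])
E_ij = essential_matrix(R_ij, t_ij)

# Assumed convention: a point X_i in camera i's frame maps to X_j = R_ij X_i + t_ij.
X_i = np.array([0.3, -0.4, 5.0])
X_j = R_ij @ X_i + t_ij

# Epipolar constraint X_j^T E_ij X_i, which should vanish up to round-off.
residual = X_j @ E_ij @ X_i
print(f"epipolar residual: {residual:.2e}")
```

Under the assumed convention the residual vanishes because $X_j \cdot (t_{ij} \times R_{ij} X_i)$ is a triple product with two linearly dependent terms.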
12
Rotation Registration
A linear equation from every two cameras [Martinec et al. 2007]:
$R_j = R_{ij} R_i$, with $R_i = [r^i_1, r^i_2, r^i_3]$
$R_2 = R_{12} R_1$ {cam 1, cam 2}
$R_3 = R_{23} R_2$ {cam 2, cam 3}
……
$R_n = R_{mn} R_m$ {cam m, cam n}
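Stacking one equation $R_j = R_{ij} R_i$ per camera pair gives a linear system in the entries of all rotations. The sketch below is one possible way to solve it and is not taken from the slides: it assumes camera 0 is fixed to the identity to remove the global rotation ambiguity, solves the rest by least squares, and projects each 3x3 block back onto a rotation with an SVD. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about `axis`."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def nearest_rotation(M):
    """Closest rotation matrix to M in the Frobenius sense."""
    U, _, Vt = np.linalg.svd(M)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    return U @ D @ Vt

def register_rotations(n, relative):
    """relative maps (i, j) to R_ij with R_j = R_ij @ R_i; camera 0 is fixed to I."""
    A = np.zeros((3 * len(relative), 3 * n))
    for row, ((i, j), R_ij) in enumerate(relative.items()):
        A[3 * row:3 * row + 3, 3 * j:3 * j + 3] = np.eye(3)   # + R_j
        A[3 * row:3 * row + 3, 3 * i:3 * i + 3] = -R_ij       # - R_ij R_i
    # With R_0 = I, the first block column moves to the right-hand side.
    rhs = -A[:, :3]
    X = np.linalg.lstsq(A[:, 3:], rhs, rcond=None)[0]
    blocks = [np.eye(3)] + [X[3 * k:3 * k + 3] for k in range(n - 1)]
    return [nearest_rotation(B) for B in blocks]

# Synthetic, noise-free example with four cameras and a connected match graph.
R_true = [np.eye(3), rodrigues([0, 1, 0], 0.3),
          rodrigues([1, 1, 0], -0.5), rodrigues([0, 0, 1], 1.1)]
edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
relative = {(i, j): R_true[j] @ R_true[i].T for i, j in edges}
R_est = register_rotations(len(R_true), relative)
```

With noisy inputs the least-squares blocks are no longer exactly orthonormal, which is why the projection step is included.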
13
Translation Registration (3 cameras)
Input:
Relative translations: $c_{ij}$, $c_{ik}$, $c_{jk}$
Output:
Camera positions: $c_i$, $c_j$, $c_k$
14
Translation Registration (3 cameras)
Suppose $c_i$, $c_j$ are known; $c_k$ can be computed by:
A linear equation:
$c_k - c_i = R_i(\theta_i)\, s_{ij}^{ik}\, (c_j - c_i)$
1. $R_i(\theta_i)$: rotate $c_j - c_i$ to match the orientation of $c_k - c_i$
2. $s_{ij}^{ik}$: shrink/grow to match the length of $c_k - c_i$
both are easy to compute
15
Translation Registration (3 cameras)
A similar linear equation by matching $c_i - c_j$ and $c_k - c_j$:
$c_k - c_j = R_j(-\theta_j)\, s_{ij}^{jk}\, (c_i - c_j)$
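The two equations that place $c_k$ from known $c_i$ and $c_j$ can be evaluated directly from the three relative directions. The sketch below is illustrative only: it assumes that $c_{ab}$ is the unit vector from camera a to camera b, that both rotations are taken about the triangle normal $c_{ij} \times c_{ik}$, and that the scale factors come from the law of sines ($s_{ij}^{ik} = \sin\theta_j / \sin\theta_k$, $s_{ij}^{jk} = \sin\theta_i / \sin\theta_k$). The slide only says both quantities are easy to compute, so these formulas and all function names are assumptions.

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about `axis`."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def angle_between(u, v):
    """Angle in [0, pi] between two vectors."""
    return np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v))

def locate_third_camera(c_i, c_j, c_ij, c_ik, c_jk):
    """Return the two estimates of c_k, one from each linear equation."""
    normal = np.cross(c_ij, c_ik)             # assumes the cameras are not collinear
    theta_i = angle_between(c_ij, c_ik)       # interior angle at camera i
    theta_j = angle_between(-c_ij, c_jk)      # interior angle at camera j
    theta_k = angle_between(-c_ik, -c_jk)     # interior angle at camera k
    s_ik = np.sin(theta_j) / np.sin(theta_k)  # |c_k - c_i| / |c_j - c_i|
    s_jk = np.sin(theta_i) / np.sin(theta_k)  # |c_k - c_j| / |c_i - c_j|
    from_i = c_i + s_ik * rodrigues(normal, theta_i) @ (c_j - c_i)
    from_j = c_j + s_jk * rodrigues(normal, -theta_j) @ (c_i - c_j)
    return from_i, from_j

def unit(v):
    return v / np.linalg.norm(v)

# Hypothetical ground-truth positions used to generate noise-free directions.
c_i = np.array([0.0, 0.0, 0.0])
c_j = np.array([2.0, 0.0, 0.0])
c_k = np.array([0.5, 1.5, 0.3])
est_i, est_j = locate_third_camera(c_i, c_j, unit(c_j - c_i),
                                   unit(c_k - c_i), unit(c_k - c_j))
```

With noise-free directions both estimates coincide with the true $c_k$; with noisy directions they differ slightly.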
16
Translation Registration (3 cameras)
A geometric explanation
$c_k - c_i = R_i(\theta_i)\, s_{ij}^{ik}\, (c_j - c_i)$
$c_k - c_j = R_j(-\theta_j)\, s_{ij}^{jk}\, (c_i - c_j)$
$\pi_1$: plane spanned by $c_{ij}$ and $c_{ik}$
$\pi_2$: plane spanned by $c_{ij}$ and $c_{jk}$
$c_{ik}$ and $c_{jk}$ are non-coplanar
17
Translation Registration (3 cameras)
A geometric explanation
$c_k - c_i = R_i(\theta_i)\, s_{ij}^{ik}\, (c_j - c_i)$
$c_k - c_j = R_j(-\theta_j)\, s_{ij}^{jk}\, (c_i - c_j)$
$AB$: the mutual perpendicular line
$c_k$: the middle point of $AB$
$c_k = c_i + R_i(\theta_i)\, s_{ij}^{ik}\, (c_j - c_i) \approx A$
$c_k = c_j + R_j(-\theta_j)\, s_{ij}^{jk}\, (c_i - c_j) \approx B$
Our linear equations minimize an approximate geometric error!
see derivation in the paper
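For intuition about the geometric picture, the midpoint of the mutual perpendicular between two skew rays can be computed in closed form. This sketch assumes that A and B are the feet of the mutual perpendicular on the ray from $c_i$ along $c_{ik}$ and on the ray from $c_j$ along $c_{jk}$. It computes the exact midpoint and does not reproduce the paper's derivation of the approximate error; the function name and the sample rays are hypothetical.

```python
import numpy as np

def midpoint_of_mutual_perpendicular(p1, d1, p2, d2):
    """Closest points A (on p1 + a*d1) and B (on p2 + b*d2) and their midpoint."""
    # Minimizing |(p1 + a*d1) - (p2 + b*d2)| is a 3x2 least-squares problem.
    M = np.stack([d1, -d2], axis=1)
    a, b = np.linalg.lstsq(M, p2 - p1, rcond=None)[0]
    A = p1 + a * d1
    B = p2 + b * d2
    return A, B, 0.5 * (A + B)

# Two hypothetical rays that almost meet, as happens with noisy directions.
c_i = np.array([0.0, 0.0, 0.0])
c_j = np.array([2.0, 0.0, 0.0])
c_ik = np.array([1.0, 1.0, 0.05]) / np.linalg.norm([1.0, 1.0, 0.05])
c_jk = np.array([-1.0, 1.0, -0.05]) / np.linalg.norm([-1.0, 1.0, -0.05])

A, B, c_k = midpoint_of_mutual_perpendicular(c_i, c_ik, c_j, c_jk)
print("A =", A, "B =", B, "midpoint =", c_k)
```

The segment AB returned here is orthogonal to both ray directions, which is what makes it the mutual perpendicular.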
18
Translation Registration (3 cameras)
No degeneracy with collinear motion
$c_k - c_i = R_i(0)\, s_{ij}^{ik}\, (c_j - c_i)$
$c_k - c_j = R_j(0)\, s_{ij}^{jk}\, (c_i - c_j)$
19
Translation Registration (3 cameras)
Suppose $c_i$, $c_k$ are known; $c_j$ can be computed by:
$c_j - c_i = R_i(-\theta_i)\, s_{ik}^{ij}\, (c_k - c_i)$
$c_j - c_k = R_k(\theta_k)\, s_{ik}^{jk}\, (c_i - c_k)$
20
Translation Registration (3 cameras)
Suppose $c_j$, $c_k$ are known; $c_i$ can be computed by:
$c_i - c_k = R_k(-\theta_k)\, s_{jk}^{ik}\, (c_j - c_k)$
$c_i - c_j = R_j(\theta_j)\, s_{jk}^{ij}\, (c_k - c_j)$
21
Translation Registration (3 cameras)
Collecting all six equations:
$B_{ijk}\, (c_i, c_j, c_k)^\top = 0$
Translation Registration (n cameras)
Generalize to n cameras
1. Collect equations from all triangles in the match graph.
$B_1\, (c_1, c_2, c_3)^\top = 0$
$B_2\, (c_2, c_3, c_4)^\top = 0$
2. Solve all equations
$B c = 0$, $c = [c_1, c_2, \dots, c_9]^\top$
The match graph: each camera is a vertex; connect two cameras if their relative motion is known.
Cameras can be non-coplanar.
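The two steps (collect equations from all triangles, then solve the homogeneous system) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: unit directions `d[a, b]` from camera a to camera b are given in a common frame, scale factors come from the law of sines, camera 0 is pinned at the origin to remove the translation ambiguity, and the remaining scale ambiguity is handled by taking the singular vector with the smallest singular value. The function names and test data are hypothetical.

```python
import numpy as np
from itertools import permutations

def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about `axis`."""
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def angle_between(u, v):
    return np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v))

def register_positions(n, triangles, d):
    """Solve the stacked system for n camera centers, up to scale, with c_0 = 0."""
    blocks = []
    for tri in triangles:
        # Six ordered choices: target t is located from a using the known edge (a, b).
        for a, b, t in permutations(tri):
            th_a = angle_between(d[a, b], d[a, t])
            th_b = angle_between(d[b, a], d[b, t])
            th_t = angle_between(d[t, a], d[t, b])
            scale = np.sin(th_b) / np.sin(th_t)          # |c_t - c_a| / |c_b - c_a|
            M = scale * rodrigues(np.cross(d[a, b], d[a, t]), th_a)
            # Rows of c_t - c_a - M (c_b - c_a) = 0.
            rows = np.zeros((3, 3 * n))
            rows[:, 3 * t:3 * t + 3] += np.eye(3)
            rows[:, 3 * b:3 * b + 3] -= M
            rows[:, 3 * a:3 * a + 3] += M - np.eye(3)
            blocks.append(rows)
    B = np.vstack(blocks)
    # Drop the columns of camera 0 (pinned at the origin) and take the null vector.
    _, _, Vt = np.linalg.svd(B[:, 3:])
    c = np.concatenate([np.zeros(3), Vt[-1]]).reshape(n, 3)
    a, b = triangles[0][:2]
    if np.dot(c[b] - c[a], d[a, b]) < 0:   # resolve the sign of the null vector
        c = -c
    return c

# Four hypothetical non-coplanar cameras and two triangles sharing an edge.
c_true = np.array([[1.0, 1.0, 1.0], [3.0, 1.0, 1.2],
                   [1.5, 2.5, 1.4], [2.8, 2.6, 0.5]])
triangles = [(0, 1, 2), (1, 2, 3)]
d = {}
for tri in triangles:
    for a, b in permutations(tri, 2):
        v = c_true[b] - c_true[a]
        d[a, b] = v / np.linalg.norm(v)

c_est = register_positions(len(c_true), triangles, d)
scale = np.linalg.norm(c_true[1] - c_true[0]) / np.linalg.norm(c_est[1] - c_est[0])
```

In this sketch a single triangle on its own leaves an in-plane rotation undetermined, because every matrix in its equations rotates about the same normal; the demo therefore uses two triangles in different planes. Real data would also need the outlier handling mentioned on the robustness slide.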
23
Triangulation
Once cameras are fixed, triangulate matched corners to generate 3D points.
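The slide does not say which triangulation method is used. As one common choice, the sketch below uses linear (DLT) triangulation with calibrated cameras written as $P = R\,[I \mid -c]$ and observations in normalized image coordinates; the method, the camera model, and the function names are assumptions for illustration.

```python
import numpy as np

def camera_matrix(R, c):
    """P = R [I | -c] for a calibrated camera with orientation R and center c."""
    c = np.asarray(c, dtype=float).reshape(3, 1)
    return R @ np.hstack([np.eye(3), -c])

def project(P, X):
    """Project a 3D point to normalized image coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(Ps, xs):
    """Linear triangulation: each view contributes two rows of a homogeneous system."""
    rows = []
    for P, (x, y) in zip(Ps, xs):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.array(rows))
    X = Vt[-1]                      # null vector = homogeneous 3D point
    return X[:3] / X[3]

# Two hypothetical registered cameras and one matched corner.
P1 = camera_matrix(np.eye(3), [0.0, 0.0, 0.0])
P2 = camera_matrix(np.eye(3), [1.0, 0.0, 0.0])
X_true = np.array([0.2, -0.1, 4.0])
X_est = triangulate([P1, P2], [project(P1, X_true), project(P2, X_true)])
```

The triangulated points and the registered cameras then serve as the starting point for the final bundle adjustment.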
24
Robustness Issues
• Exclude unreliable triplets
• More consistency checks in the paper
Check if ??
Results
Accuracy evaluation: Compare with recent methods on data with known ground truth.
| Method | Fountain-P11 c (meters) | Fountain-P11 R (degrees) | Herz-Jesu-P25 c (meters) | Herz-Jesu-P25 R (degrees) | Castle-P30 c (meters) | Castle-P30 R (degrees) |
|---|---|---|---|---|---|---|
| Ours | 0.0139 | 0.1954 | 0.0636 | 0.1880 | 0.2345 | 0.4800 |
| [Arie-Nachimson et al. 2012] | 0.0226 | 0.4211 | 0.0479 | 0.3125 | - | - |
| [Sinha et al. 2010] | 0.1317 | - | 0.2538 | - | - | - |
| VisualSFM | 0.0364 | 0.2794 | 0.0551 | 0.2868 | 0.2639 | 0.3980 |
Fountain-P11 Herz-Jesu-P25 Castle-P30
All results are after the final bundle adjustment.
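Results
Efficiency evaluation:
| Dataset | Method | Total running time (s)* | BA time (s) | Registration time (s) | # of reconstructed images | # of reconstructed points |
|---|---|---|---|---|---|---|
| Building (128) | Our Method | 17 | 11 | 6 | 128 | 91,290 |
| Building (128) | Visual-SFM | 62 | 57 | 5 | 128 | 78,100 |
| Notre Dame (371) | Our Method | 49 | 20 | 29 | 362 | 103,629 |
| Notre Dame (371) | Visual-SFM | 479 | 442 | 37 | 365 | 104,657 |
| Pisa (481) | Our Method | 69 | 52 | 17 | 479 | 134,555 |
| Pisa (481) | Visual-SFM | 479 | 444 | 12 | 480 | 129,484 |
| Trevi Fountain (1259) | Our Method | 135 | 61 | 74 | 1255 | 297,766 |
| Trevi Fountain (1259) | Visual-SFM | 1790 | 1715 | 75 | 1253 | 292,277 |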
* The total running time excludes the time spent on feature matching and epipolar geometry computation.
Building Notre Dame Pisa Trevi Fountain
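The timing figures can be turned into speedups and bundle-adjustment shares with a few lines of arithmetic. The numbers below are copied from the efficiency table as printed; the dictionary layout and variable names are only an illustrative choice.

```python
# Timings in seconds, copied from the efficiency table as printed.
timings = {
    "Building":       {"ours": {"total": 17,  "ba": 11},
                       "visual_sfm": {"total": 62,   "ba": 57}},
    "Notre Dame":     {"ours": {"total": 49,  "ba": 20},
                       "visual_sfm": {"total": 479,  "ba": 442}},
    "Pisa":           {"ours": {"total": 69,  "ba": 52},
                       "visual_sfm": {"total": 479,  "ba": 444}},
    "Trevi Fountain": {"ours": {"total": 135, "ba": 61},
                       "visual_sfm": {"total": 1790, "ba": 1715}},
}

# Speedup of the global method over Visual-SFM in total running time.
speedup = {name: t["visual_sfm"]["total"] / t["ours"]["total"]
           for name, t in timings.items()}

# Fraction of Visual-SFM's total running time spent in bundle adjustment.
ba_share = {name: t["visual_sfm"]["ba"] / t["visual_sfm"]["total"]
            for name, t in timings.items()}

for name in timings:
    print(f"{name}: speedup {speedup[name]:.1f}x, "
          f"Visual-SFM BA share {ba_share[name]:.0%}")
```

On these figures the speedup ranges from about 3.6x (Building) to about 13.3x (Trevi Fountain), and Visual-SFM spends more than 90% of its running time in bundle adjustment on all four datasets.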
27
Conclusions
• A global solution for orientations & positions;
• Linear, robust & geometrically meaningful;
• No degeneracy.
Thanks!
code & data available at: http://www.ece.nus.edu.sg/stfpage/eletp/
29
A large scale scene
Results
Quasi-dense points generated by CMVS [Furukawa et al. 2010] for better visualization.