Visual 3D Modeling using Cameras and Camera Networks
Marc PollefeysUniversity of North Carolina at Chapel Hill
Talk outline
• Introduction
• Visual 3D modeling with a hand-held camera
– Acquisition of camera motion
– Acquisition of scene structure
– Constructing visual models
• Camera Networks
– Camera Network Calibration
– Camera Network Synchronization
– Towards Active Camera Networks…
• Conclusion
What can be achieved?
• Can we get 3D models from images?
• How much do we need to know about the camera?
• Can we freely move around? Hand-held?
• Do we need to keep parameters fixed? Zoom?
• What about auto-exposure?
• What about camera networks?
• Can we provide more flexible systems? Avoid calibration?
• What about using IP-based PTZ cameras? Hand-held camcorders?
• Unsynchronized or even asynchronous?
Talk outline
• Introduction
• Visual 3D modeling with a hand-held camera
– Acquisition of camera motion
– Acquisition of scene structure
– Constructing visual models
• Camera Networks
– Camera Network Calibration
– Camera Network Synchronization
– Towards Active Camera Networks…
• Conclusion
(Pollefeys et al. ’98)
(Pollefeys et al. ’04)
More efficient RANSAC
Fully projective
Bundle adjustment
Deal with dominant planes
Improved self-calibration
Polar stereo rectification
Deal with radial distortion
Volumetric 3D integration
Image-based rendering
Faster stereo algorithm
Video Key-frame selection
Deal with specularities
Deal with Auto-Exposure
Feature tracking/matching
• Shape-from-Photographs: match Harris corners
• Shape-from-Video: track KLT features
Problem: insufficient motion between consecutive video frames to compute the epipolar geometry accurately and to use it effectively as an outlier filter
Key-frame selection
• Select a key-frame when F yields a better model than H
– Use the Robust Geometric Information Criterion (GRIC = bad-fit penalty + model complexity) (Torr ’98)
– Given view i as a key-frame, pick as the next key-frame the first view j for which GRIC(Fij) < GRIC(Hij) (or a few views later)
[Plot: H-GRIC and F-GRIC as a function of view separation]
(Pollefeys et al.’02)
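The GRIC comparison can be sketched as follows, a minimal sketch following Torr ’98 in which lower GRIC means a better model; the residual values and noise level below are made-up for illustration:

```python
import math

def gric(residuals_sq, sigma2, d, k, r=4):
    """Geometric Robust Information Criterion (Torr '98): robustified
    bad-fit penalty plus model-complexity terms. Lower is better.
    d: dimension of the model manifold (H: 2, F: 3)
    k: number of model parameters (H: 8, F: 7)
    r: dimension of each data point (a point pair in two images -> 4)"""
    n = len(residuals_sq)
    lam1, lam2, lam3 = math.log(r), math.log(r * n), 2.0
    badness = sum(min(e / sigma2, lam3 * (r - d)) for e in residuals_sq)
    return badness + lam1 * d * n + lam2 * k

# Hypothetical squared residuals for 100 tracked features between views i, j:
res_h = [4.0] * 100   # homography fits poorly
res_f = [0.25] * 100  # epipolar geometry fits well
# View j becomes the next key-frame once F beats H:
is_keyframe = gric(res_f, 1.0, d=3, k=7) < gric(res_h, 1.0, d=2, k=8)
```

With enough camera translation the epipolar residuals drop while the homography residuals grow, which is exactly when the comparison flips.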
Epipolar geometry
[Figure: two-view geometry with camera centers C1 and C2, 3D point P, image points m1 and m2, epipolar lines l1 and l2, and epipoles e1 and e2]

m2ᵀ F m1 = 0

Fundamental matrix F (3×3 rank-2 matrix):
1. Computable from corresponding points
2. Simplifies matching
3. Allows detection of wrong matches
4. Related to calibration

Underlying structure in a set of matches for rigid scenes
Epipolar geometry computation:robust estimation (RANSAC)
Step 1. Extract features
Step 2. Compute a set of potential matches
Step 3. do
  Step 3.1 select a minimal sample (i.e. 7 matches) (generate hypothesis)
  Step 3.2 compute solution(s) for F
  Step 3.3 count inliers, if not promising stop (verify hypothesis)
until P(at least one all-inlier sample) = 1 − (1 − (#inliers/#matches)^7)^#samples > 95%

#inliers: 90% | 80% | 70% | 60% | 50%
#samples:  5  | 13  | 35  | 106 | 382

Step 4. Compute F based on all inliers
Step 5. Look for additional matches
Step 6. Refine F based on all correct matches
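The stopping criterion fixes the number of random samples as a function of the inlier ratio; a quick sketch that reproduces the table above:

```python
import math

def ransac_samples(inlier_ratio, sample_size=7, confidence=0.95):
    """Smallest N such that 1 - (1 - w^s)^N >= confidence, i.e. at least
    one all-inlier 7-point sample with the requested probability."""
    return math.ceil(math.log(1.0 - confidence)
                     / math.log(1.0 - inlier_ratio ** sample_size))

samples = {w: ransac_samples(w) for w in (0.9, 0.8, 0.7, 0.6, 0.5)}
# -> {0.9: 5, 0.8: 13, 0.7: 35, 0.6: 106, 0.5: 382}
```

The count explodes as the inlier ratio drops, which is why the early-abort check in Step 3.3 matters in practice.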
Epipolar geometry computation
The geometric relation between two views is fully described by the recovered 3×3 matrix F
Sequential Structure and Motion Computation
• Initialize motion (P1, P2 compatible with F)
• Initialize structure (minimize reprojection error)
• Extend motion (compute pose through matches seen in 2 or more previous views)
• Extend structure (initialize new structure, refine existing structure)
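The "initialize structure" step refines points by minimizing reprojection error; a common linear (DLT) triangulation provides the starting point. A sketch with hypothetical projection matrices and image points:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point seen at pixel x1 in camera
    P1 and x2 in camera P2; the result seeds the non-linear refinement."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)   # null vector of A is the homogeneous point
    X = Vt[-1]
    return X / X[3]

# Two hypothetical cameras one unit apart, observing a point at depth 5:
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X = triangulate_dlt(P1, P2, (0.0, 0.0), (-0.2, 0.0))
```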
Dealing with dominant planar scenes
• Uncalibrated structure-and-motion (USaM) fails when the common features all lie in a plane
• Solution, part 1: model selection to detect the problem
(Pollefeys et al., ECCV‘02)
Dealing with dominant planar scenes
• Uncalibrated structure-and-motion (USaM) fails when the common features all lie in a plane
• Solution, part 2: delay ambiguous computations until after self-calibration (couple self-calibration over all 3D parts)
(Pollefeys et al., ECCV‘02)
Refine Structure and Motion
• Use projective bundle adjustment
– Sparse bundle structure allows very efficient computation (2 levels)
– Take radial distortion into account (1 or 2 parameters)
Self-calibration using the absolute conic

Absolute conic projection: ωᵢ* ∝ Pᵢ Ω* Pᵢᵀ

Euclidean projection matrix: Pᵢ = Kᵢ [Rᵢ | tᵢ], hence ωᵢ* = Kᵢ Kᵢᵀ

Translate constraints on K (e.g. constant parameters, no skew, ...) through the projection equation to constraints on Ω*

Upgrade from projective to metric: transform structure and motion so that Ω* → diag(1,1,1,0)
(Faugeras ECCV’92; Triggs CVPR’97; Pollefeys et al. ICCV’98; etc.)
Practical linear self-calibration
ω* ∝ P Ω* Pᵀ ≈ K̂ K̂ᵀ = diag(f̂², f̂², 1), with K̂ = diag(f̂, f̂, 1)

These encode priors on K = [f s cx; 0 f cy; 0 0 1] after normalization: zero skew s, principal point (cx, cy) near the image center, aspect ratio near 1, and only a rough prior on f.

Don't treat all constraints equally after normalization! With pᵢᵀ the i-th row of P, the weighted linear constraints on Ω* are:

(1/0.01) · p1ᵀ Ω* p2 = 0 (zero skew)
(1/0.1) · p1ᵀ Ω* p3 = 0 (principal point near the image center)
(1/0.1) · p2ᵀ Ω* p3 = 0
(1/0.2) · (p1ᵀ Ω* p1 − p2ᵀ Ω* p2) = 0 (aspect ratio ≈ 1)
(relatively accurate for most cameras)

(1/9) · (p1ᵀ Ω* p1 − p3ᵀ Ω* p3) = 0
(1/9) · (p2ᵀ Ω* p2 − p3ᵀ Ω* p3) = 0
(only a rough approximation, but still useful to avoid degenerate configurations)

(Pollefeys et al., ECCV’02)

Caveat: when fixating a point at the image center, not only the absolute quadric diag(1,1,1,0) satisfies the ICCV’98 equations, but also diag(1,1,1,a), i.e. real or imaginary spheres!
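These equations are linear in the ten entries of the symmetric 4×4 dual absolute quadric Ω*. A sketch of how one camera contributes weighted rows to the linear system (the weights are taken from the slide; stacking the rows of all cameras and taking the smallest singular vector yields Ω*):

```python
import numpy as np

TRIU = np.triu_indices(4)

def daq_row(P, i, j):
    """Row r such that r @ vech(Omega*) = (P Omega* P^T)_{ij},
    with Omega* symmetric, stored as its 10 upper-triangle entries."""
    M = np.outer(P[i], P[j])
    M = 0.5 * (M + M.T)                                  # symmetrize
    scale = np.where(TRIU[0] == TRIU[1], 1.0, 2.0)       # off-diagonals appear twice
    return M[TRIU] * scale

def selfcal_rows(P):
    """Weighted constraints of one normalized camera on Omega*."""
    return np.array([
        (1 / 0.01) * daq_row(P, 0, 1),                       # zero skew
        (1 / 0.1) * daq_row(P, 0, 2),                        # principal point
        (1 / 0.1) * daq_row(P, 1, 2),                        #   near center
        (1 / 0.2) * (daq_row(P, 0, 0) - daq_row(P, 1, 1)),   # aspect ratio ~ 1
        (1 / 9.0) * (daq_row(P, 0, 0) - daq_row(P, 2, 2)),   # rough focal prior
        (1 / 9.0) * (daq_row(P, 1, 1) - daq_row(P, 2, 2)),
    ])

# A metric camera with K = I satisfies every equation exactly:
c, s = np.cos(0.3), np.sin(0.3)
R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
P = np.hstack([R, np.array([[0.2], [0.1], [1.0]])])
omega_metric = np.array([1, 0, 0, 0, 1, 0, 0, 1, 0, 0], float)  # vech(diag(1,1,1,0))
residual = selfcal_rows(P) @ omega_metric
```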
Refine Metric Structure and Motion
• Use metric bundle adjustment
– Use a Euclidean parameterization for the projection matrices
– Same sparseness advantages; also use radial distortion
Mixing real and virtual elements in video
Virtual reconstruction of ancient fountain
Preview fragment of the Sagalassos TV documentary. Similar to 2D3’s Boujou and RealViz’ MatchMover.
Intermezzo: Auto-calibration of Multi-Projector System
Hard because the screens are planar, but still possible (Raij and Pollefeys, submitted)
Stereo rectification
• Resample image to simplify matching process
Stereo rectification
• Resample image to simplify matching process
Also take into account radial distortion!
Polar stereo rectification
Standard homography-based approaches do not work here.
Solution: polar reparameterization of the images around the epipoles (Pollefeys et al. ICCV’99)
General iso-disparity surfaces(Pollefeys and Sinha, ECCV’04)
Example: polar rectification preserves disparities
Application: Active vision
Also interesting relation to human horopter
Stereo matching
Optimal path(dynamic programming )
Similarity measure(SSD or NCC)
Constraints• epipolar
• ordering
• uniqueness
• disparity limit
• disparity gradient limit
Trade-off
• Matching cost
• Discontinuities
(Cox et al. CVGIP’96; Koch ’96; Falkenhagen ’97; Van Meerbergen, Vergauwen, Pollefeys, Van Gool IJCV’02)
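A minimal sketch of the dynamic-programming scanline optimization: each left pixel is matched to a right pixel or declared occluded, the monotone DP enforces the ordering constraint, and the matching cost trades off against discontinuities via the occlusion penalty. The squared-difference cost and flat occlusion penalty are illustrative choices:

```python
def scanline_dp(left, right, occ=10.0):
    """Optimal alignment of one rectified scanline pair; returns the
    minimal path cost (Needleman-Wunsch-style DP)."""
    n, m = len(left), len(right)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            best = D[i][j]
            if i > 0:
                best = min(best, D[i - 1][j] + occ)   # left pixel occluded
            if j > 0:
                best = min(best, D[i][j - 1] + occ)   # right pixel occluded
            if i > 0 and j > 0:                        # match i-1 <-> j-1
                best = min(best, D[i - 1][j - 1] + (left[i - 1] - right[j - 1]) ** 2)
            D[i][j] = best
    return D[n][m]
```

Recovering the disparities would additionally require backtracking the chosen transitions; only the cost is returned here to keep the sketch short.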
Hierarchical stereo matching

Downsampling (Gaussian pyramid)
Disparity propagation

Allows faster computation
Deals with large disparity ranges
Disparity map
image I(x,y)   image I′(x′,y′)   disparity map D(x,y)
(x′, y′) = (x + D(x,y), y)
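A tiny sketch of applying the disparity map: forward-warping I into the neighboring view via (x′, y′) = (x + D(x, y), y), which is exactly how an image can be reconstructed from its neighbors:

```python
def warp_with_disparity(image, disp):
    """Forward-warp a grayscale image (list of rows) into the second
    view; pixels that map outside or are never hit stay None."""
    h, w = len(image), len(image[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xp = x + disp[y][x]
            if 0 <= xp < w:
                out[y][xp] = image[y][x]
    return out

warped = warp_with_disparity([[10, 20, 30]], [[1, 1, 1]])  # -> [[None, 10, 20]]
```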
Example: reconstruct image from neighbors
Multi-view depth fusion
• Compute depth for every pixel of the reference image
– Triangulation
– Use multiple views
– Up- and down-sequence
– Use Kalman filter
(Koch, Pollefeys and Van Gool. ECCV‘98)
Also allows computing a robust texture
Real-time stereo on GPU
• Plane-sweep stereo
• Computes sum-of-squared differences (using the pixel shader)
• Hardware mip-map generation for aggregation over a window
• Trade-off between small and large support windows
(Yang and Pollefeys, CVPR2003)
150M disparity hypotheses/sec (Radeon 9700 Pro), e.g. 512×512×20 disparities at 30 Hz
(Demo GeForce4)
GPU is great for vision too!
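The per-pixel computation is easy to sketch on the CPU: a squared-difference cost volume over disparity hypotheses and a winner-take-all selection (no mip-map window aggregation here; for rectified views each swept "plane" is one disparity):

```python
import numpy as np

def plane_sweep_ssd(left, right, max_disp):
    """Cost volume over disparity hypotheses d, matching left[:, x]
    against right[:, x - d], then per-pixel winner-take-all."""
    h, w = left.shape
    cost = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        cost[d, :, d:] = (left[:, d:] - right[:, : w - d]) ** 2
    return cost.argmin(axis=0)

# A synthetic pair with a uniform true disparity of 2:
left = np.array([[0.0, 10.0, 20.0, 30.0, 40.0, 50.0]])
right = left + 20.0          # right[x] equals left[x + 2]
disp = plane_sweep_ssd(left, right, max_disp=3)
```

The GPU version evaluates the same hypothesis cost in a pixel shader and uses mip-maps for the support-window aggregation this sketch omits.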
Dealing with specular highlights
Extend photo-consistency model to include highlights
(Yang, Pollefeys and Welch, ICCV’03)
3D surface model
Depth image Triangle mesh Texture image
Textured 3D wireframe model
Volumetric 3D integration
Multiple depth images Volumetric integration Texture integration
(Curless and Levoy, Siggraph´96)
patchwork texture map
Dealing with auto-exposure
• Estimate the camera’s radiometric response curve, exposure, and white-balance changes
• Extends prior HDR work at Columbia, CMU, etc.to moving camera
(Kim and Pollefeys, submitted)
[Figure: brightness transfer curve between auto-exposure and fixed-exposure frames; response-curve model; robust estimate using DP]
Dealing with auto-exposure
Applications:
• Photometric alignment of textures (or HDR textures)
• HDR video
(Kim and Pollefeys, submitted)
Part of Jain temple
Recorded during post-ICCV tourist trip in India
(Nikon F50; Scanned)
Example: DV video 3D model
accuracy ~1/500 from DV video (i.e. 140 kB JPEGs, 576×720)
Unstructured lightfield rendering
demo
(Heigl et al.’99)
Talk outline
• Introduction
• Visual 3D modeling with a hand-held camera
– Acquisition of camera motion
– Acquisition of scene structure
– Constructing visual models
• Camera Networks
– Camera Network Calibration
– Camera Network Synchronization
– Towards Active Camera Networks…
• Conclusion
Camera Networks
• CMU’s Dome, 3D Room, etc.
• MIT’s Visual Hull
• Maryland’s Keck lab, ETHZ’s BLUE-C and more
• Recently, Shape-from-Silhouette/Visual-Hull systems have been very popular
Camera Networks
• Offline calibration procedure
• Special calibration data
– Planar pattern
– Moving LED
• Requires physical access to the environment
• Active camera networks
– How do we maintain calibration?
An example
• 4 NTSC videos recorded by 4 computers for 4 minutes
• Manually synchronized and calibrated using a MoCap system
P. Sand, L. McMillan, and J. Popovic. Continuous Capture of Skin Deformation.ACM Transactions on Graphics 22, 3, 578-586, 2003.
Can we do without explicit calibration?
• Feature-based?
– Hard to match features between very different views
– Not many features on the foreground
– Background often doesn’t overlap much between views
• Silhouette-based?
– Necessary for the visual hull anyway
– But the approach is not obvious
Multiple View Geometry of Silhouettes
• Frontier points
• Epipolar tangents
• Points on the silhouettes in two views do not correspond in general, except for the projected frontier points
• Always at least 2 extremal frontier points per silhouette
• In general, correspondence only over two views
[Figure: two silhouettes with frontier points x1, x2 and x′1, x′2 at the outer epipolar tangents]

x2ᵀ F x1 = 0
x′2ᵀ F x′1 = 0
Calibration from Silhouettes: prior work
Epipolar geometry from silhouettes
• Porrill and Pollard ’91
• Astrom, Cipolla and Giblin ’96

Structure-and-motion from silhouettes
• Joshi, Ahuja and Ponce ’95 (trinocular rig / rigid object)
• Vijayakumar, Kriegman and Ponce ’96 (orthographic)
• Wong and Cipolla ’01 (circular motion, at least to start)
• Yezzi and Soatto ’03 (only refinement)

None is really applicable for calibrating a visual-hull system
Camera Network Calibration from Silhouettes
• 7 or more corresponding frontier points needed to compute epipolar geometry for general motion
• Hard to find on a single silhouette, and possibly occluded
• However, Visual Hull systems record many silhouettes!
(Sinha, Pollefeys and McMillan, submitted)
Camera Network Calibration from Silhouettes
• If we know the epipoles, it is simple
• Draw 3 outer epipolar tangents (from two silhouettes)
• Compute the corresponding line homography H⁻ᵀ (not unique)
• Epipolar geometry: F = [e]ₓ H
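Given a hypothesized epipole e′ and any line homography H compatible with the tangents, F = [e′]ₓ H is rank-2 by construction and every epipolar line l′ = F x passes through e′. A sketch with made-up numeric values for e′ and H:

```python
import numpy as np

def skew(e):
    """Cross-product matrix [e]_x, so that skew(e) @ v == cross(e, v)."""
    return np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])

e2 = np.array([120.0, 80.0, 1.0])      # hypothesized epipole (homogeneous)
H = np.array([[1.0, 0.1, 5.0],         # some compatible line homography
              [0.0, 1.2, -3.0],
              [0.0, 0.0, 1.0]])
F = skew(e2) @ H                       # fundamental matrix hypothesis

x = np.array([30.0, 40.0, 1.0])        # any image point in view 1
line = F @ x                           # its epipolar line passes through e2
```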
Let’s just sample: RANSAC
• Repeat
– Generate a random hypothesis for the epipoles
– Compute the epipolar geometry
– Verify the hypothesis and count inliers (use a conservative threshold, e.g. 5 pixels, but abort early if not promising)
until a satisfying hypothesis is found
• Refine the hypothesis
– minimize the symmetric transfer error of the frontier points (use a strict threshold, e.g. 1 pixel)
– include more inliers
until error and inliers are stable

We’ll need an efficient representation, as we are likely to need many trials!
A Compact Representation for Silhouettes: Tangent Envelopes

• Convex hull of the silhouette
• Tangency points for a discrete set of angles
• Approx. 500 bytes/frame, hence a whole video sequence easily fits in memory
• Tangency computations are efficient
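A sketch of building a tangent envelope: the convex hull of the silhouette plus the tangency point for a given tangent direction (pure Python; the point set is illustrative):

```python
import math

def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def tangency_point(hull, angle):
    """Hull vertex extremal in direction (cos a, sin a): where the tangent
    line with that outward normal touches the silhouette."""
    d = (math.cos(angle), math.sin(angle))
    return max(hull, key=lambda p: p[0] * d[0] + p[1] * d[1])

hull = convex_hull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)])  # interior point dropped
```

Storing only these tangency points for a discrete set of angles is what keeps the per-frame footprint to a few hundred bytes.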
Epipole Hypothesis and Computing H
Model Verification
Remarks
• RANSAC allows efficient exploration of the 4D parameter space (i.e. the epipole pair) while being robust to imperfect silhouettes
• Select key-frames to avoid having too many identical constraints (when the silhouette is static)
Reprojection Error and Epipole Hypothesis Distribution
Residual distribution:
– hypotheses along the y-axis
– sorted residuals along the x-axis
– pixel error along the z-axis

40 best hypotheses out of 30,000

Typically, 1 in 5000 samples converges to the global minimum after non-linear refinement (corresponding to 15 s of computation time)
Computed Fundamental Matrices
Computed Fundamental Matrices
F computed directly (black epipolar lines) F after consistent 3D reconstruction (color)
Computed Fundamental Matrices
F computed directly (black epipolar lines) F after consistent 3D reconstruction (color)
From epipolar geometry to full calibration
• Not trivial because there are only matches between two views
• Approach similar to Levi et al. CVPR’03, but practical
• Key step is to solve for a camera triplet
• Assemble the complete camera network
• Projective bundle, self-calibration, metric bundle
– (also linear in v, where v is a 4-vector)
– Choose the P3 corresponding to the closest …
Experiment
4 video sequences at 30 fps
All F matrices computed from silhouettes
Full calibration computed
Metric Cameras and Visual-Hull Reconstruction from 4 views
Final calibration quality comparable to explicit calibration procedure
What if the videos are unsynchronized?
For videos recorded at a constant frame rate, the same constraints remain valid, up to some extra unknown temporal offsets
Synchronization and calibration from silhouettes (Sinha and Pollefeys, submitted)
• Add a random temporal offset to the RANSAC hypothesis generation; sample more
• Use a multi-resolution approach:
– Key-frames with slow motion, rough synchronization
– Add key-frames with faster motion, refine synchronization
Synchronization experiment
• Total temporal offset search range [−500, +500] frames (i.e. ±15 s)
• Unique peaks for the correct offsets
• Possibility of sub-frame synchronization
Synchronize camera network
• Consider an oriented graph with the offsets as branch values
• For consistency, loops should add up to zero
• MLE by minimizing Σ (t̂ − t)²

[Figure: camera-network graph with branch offsets such as +3, −5, +8, +6, +2, 0 compared to ground truth, in frames (= 1/30 s)]
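The loop-consistency MLE is a small linear least-squares problem; a sketch with made-up pairwise offsets (not the slide's graph values):

```python
import numpy as np

def network_offsets(n, pairwise):
    """Least-squares camera offsets t (with t_0 fixed as gauge) from noisy
    pairwise measurements d_ij ~ t_j - t_i; minimizing the squared loop
    errors reduces to exactly this overdetermined linear system."""
    rows, b = [], []
    for (i, j), d in pairwise.items():
        r = np.zeros(n)
        r[i], r[j] = -1.0, 1.0
        rows.append(r)
        b.append(d)
    r0 = np.zeros(n)
    r0[0] = 1.0                      # gauge fixing: t_0 = 0
    rows.append(r0)
    b.append(0.0)
    t, *_ = np.linalg.lstsq(np.array(rows), np.array(b), rcond=None)
    return t

# A consistent loop: (+3) + (+2) + (-5) = 0
t = network_offsets(3, {(0, 1): 3.0, (1, 2): 2.0, (0, 2): 5.0})
```

With inconsistent measurements the residual is distributed over the loop instead of accumulating, which is the point of the MLE formulation.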
Towards active camera networks
• Provide much more flexibility by making use of the pan-tilt-zoom range of networked cameras
• (Maintaining) calibration is a challenge
up to 3Gpix!
Calibration of PTZ cameras
Similar to Collins and Tsin ’99, but with varying radial distortion
Conclusion
• 3D models from video: more flexibility, more general
• Camera network synchronization and calibration, just from silhouettes: great for visual-hull systems
Future plans
• Deal with sub-frame offsets for VH reconstruction
• Extend to active camera networks (PTZ cameras)
• Extend to asynchronous video streams (IP cameras)
Acknowledgment
• NSF Career, NSF ITR on 3D-TV, DARPA seedling, Link foundation• EU ACTS VANGUARD, ITEA BEYOND, EU IST MURALE, FWO-Vlaanderen
• Sudipta Sinha, Ruigang Yang, Seon Joo Kim, Andrew Raij, Greg Welch, Leonard McMillan (UNC)
• Maarten Vergauwen, Frank Verbiest, Kurt Cornelis, Jan Tops, Luc Van Gool (KULeuven), Reinhard Koch (UKiel), Benno Heigl