Panoramas
Creating Full View Panoramic Image Mosaics and Environment Maps
Richard Szeliski and Heung-Yeung Shum
Microsoft Research
Outline
• Main Contribution
• Introduction
• Details
• Results
Contributions
• Novel approach to creating full view panoramic mosaics from image sequences
– Not necessarily pure horizontal camera panning
– Does not require any controlled motions or constraints
• Represent image mosaics using a set of transforms
• Fast and robust
• Method to recover the camera focal length
• Extract environment maps from image mosaics
Introduction
• Image-based rendering
– Realistic rendering without geometry models
• IBR without depth info
– Only supports user panning, rotation, and zoom
– QuickTime VR, Surround Video, …
– Cylindrical images, spherical maps, …
Introduction
• To capture panoramic images
– Use a panoramic camera to get a cylindrical image
– Use a lens with a large field of view (fisheye lens)
– Use mirrored pyramids and parabolic mirrors
• These special setups are hardware-intensive; the alternative is to take a series of regular pictures or video and stitch them together
– Traditionally requires carefully controlled camera motion
– Produces only cylindrical images
Novel Algorithms
• Use a 3-parameter rotational motion model instead of the general 8-parameter planar perspective motion model: fewer unknowns, more robust
• Estimate the focal length from a set of 8-parameter perspective registrations
• Gap closing
Cylindrical Panoramas
• A cylindrical panorama is easy to construct
• Coordinate transformation from the camera frame (X, Y, Z) to the cylindrical image
[Figure: a cylindrical image surface of height h around the camera axes X, Y, Z]
Cylindrical Panorama
• With an ideal pinhole camera and known focal length f, a pixel (u, v) corresponds to the world ray

    (X, Y, Z) = w (u, v, f)

• Distortion: horizontal lines become curved
[Figure: a pixel (u, v) on the image plane at focal distance f]
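The forward mapping into cylindrical coordinates can be sketched as follows (a minimal sketch; coordinates are measured from the image center and f is in pixels, both assumptions of this snippet):

```python
import numpy as np

def cylindrical_coords(x, y, f):
    """Map planar image coordinates (x, y), measured from the image
    center, to cylindrical coordinates (theta, h) for focal length f."""
    theta = np.arctan2(x, f)          # angle around the cylinder axis
    h = y / np.sqrt(x**2 + f**2)      # height on the unit cylinder
    return theta, h
```

In a real warp one would evaluate this (or its inverse) on a pixel grid and resample with interpolation.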
Spherical Panorama
• A direction in space maps to spherical coordinates (θ, φ):

    sin θ = X / sqrt(X² + Z²),   cos θ = Z / sqrt(X² + Z²)
    sin φ = Y / sqrt(X² + Y² + Z²),   i.e.   tan φ = Y / sqrt(X² + Z²)

• Conversely,

    (X, Y, Z) ∝ (cos φ sin θ, sin φ, cos φ cos θ)
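These relations amount to converting a 3-D ray into spherical panorama angles; a minimal sketch:

```python
import numpy as np

def spherical_angles(X, Y, Z):
    """Spherical panorama angles of the ray (X, Y, Z):
    theta = atan2(X, Z) (pan), phi = atan2(Y, sqrt(X^2 + Z^2)) (tilt)."""
    theta = np.arctan2(X, Z)
    phi = np.arctan2(Y, np.hypot(X, Z))
    return theta, phi
```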
Motion Model
• Warp each image to the cylindrical panorama
• Ideal horizontal panning sequence
– Rotation → translation in the angle θ
• In practice
– Estimate a vertical translation as well, to compensate for vertical jitter and optical twist
– The translation (t_x, t_y) is estimated on the WARPED image!
Motion Recovery
• Estimate the incremental translation t = (t_x, t_y) by minimizing the intensity error

    E(t) = Σ_i [ I_1(x'_i + t) − I_0(x_i) ]²

• Taylor series expansion:

    E(t) ≈ Σ_i ( e_i + g_i^T t )²

  where e_i = I_1(x'_i) − I_0(x_i) is the current intensity (or color) error, and g_i^T = ∇I_1(x'_i) is the image gradient of I_1 at x'_i
Motion Recovery
• Minimization → least squares solution of the normal equations:

    [ Σ_i g_i g_i^T ] t = − Σ_i e_i g_i
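One Gauss-Newton step of this minimization reduces to a 2×2 linear solve; a minimal sketch (array shapes and names are assumptions of this snippet):

```python
import numpy as np

def translation_step(e, g):
    """Solve the normal equations [sum_i g_i g_i^T] t = -sum_i e_i g_i
    for the incremental translation t.
    e: (N,) intensity errors e_i;  g: (N, 2) image gradients g_i."""
    A = g.T @ g            # 2x2 normal-equation matrix
    b = -(g.T @ e)         # accumulated residual
    return np.linalg.solve(A, b)
```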
Motion Recovery
• Large initial displacement
– Coarse-to-fine optimization scheme
J. R. Bergen, P. Anandan, K. J. Hanna, and R. Hingorani. Hierarchical model-based motion estimation
Motion Recovery
• Large initial displacement
– Coarse-to-fine optimization scheme
• Discontinuities in intensity or color between the images being composited
– Feathering algorithm: pixels weighted by a distance map
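A minimal sketch of distance-map feathering (the rectangular border-distance map and the blend formula are illustrative simplifications, not the paper's exact implementation):

```python
import numpy as np

def border_distance(h, w):
    """Distance of each pixel to the nearest image border; used as the
    feathering weight map (exact for a full rectangular image)."""
    i = np.arange(h)[:, None]
    j = np.arange(w)[None, :]
    return np.minimum(np.minimum(i + 1, h - i),
                      np.minimum(j + 1, w - j)).astype(float)

def feather_blend(a, b, wa, wb):
    """Weighted average of two aligned images, weighted by the distance
    maps of their source images warped into the composite frame."""
    wsum = wa + wb
    return (wa * a + wb * b) / np.where(wsum == 0, 1.0, wsum)
```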
Limitations of Cylindrical or Spherical Panoramas
• Only handle pure panning motion
• Ill-sampling at the north and south poles causes big registration errors
• Require knowing the focal length f
– Estimating the focal length of the lens by registering images is not very accurate
Perspective Panoramas
• Planar perspective transform between images using 8 parameters:

    x' ~ M x,   M = [ m_0  m_1  m_2 ]
                    [ m_3  m_4  m_5 ]
                    [ m_6  m_7   1  ]

• For example, if the motion is pure translation, only m_2 and m_5 are unknown
Perspective Panoramas
• Iteratively update the transform matrix using

    M ← (I + D) M

• Resample image I_1 with x'' = (I + D) M x to obtain I_1(x'')
Perspective Panoramas
• Minimize

    E(d) = Σ_i [ I_1(x''_i) − I_0(x_i) ]² ≈ Σ_i ( e_i + g_i^T J_i d )²

  where J_i = J_d(x_i) is the Jacobian of the resampled point x''_i with respect to the motion parameters d
Perspective Panoramas
• Least squares minimization of

    E(d) ≈ Σ_i ( e_i + g_i^T J_i d )²

  leads to the normal equations A d = −b, where
  A = Σ_i J_i^T g_i g_i^T J_i is the Hessian matrix, and
  b = Σ_i e_i J_i^T g_i is the accumulated gradient or residual
Perspective Panoramas
• Works well if the initial estimate of the transformation is close enough
• Slow convergence
• Can get stuck in local minima
Rotational Panoramas
• Cameras centered at the origin:

    x ~ T V R p

  where p = (X, Y, Z) is a 3-D world point, x = (x, y, 1) its image, R the camera rotation, V the viewing (focal length) matrix, and T the translation to the image origin
• Simplicity of rotation: set c_x = c_y = 0 (pixels measured from the image center), so T drops out and

    p ~ R^(-1) V^(-1) x
Rotational Panoramas
• Camera rotating around its center of projection
• Focal length is known and the same for all images: V_k = V_l = V
• Small inter-frame rotation with angular velocity ω = (ω_x, ω_y, ω_z)
Rotational Panoramas
• Incremental rotation matrix (Rodrigues' formula):

    R(n̂, θ) = I + sin θ · X(n̂) + (1 − cos θ) · X(n̂)²

  where X(n̂) is the cross-product matrix of the rotation axis n̂
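Rodrigues' formula can be sketched directly from the expression above (the small-angle guard returning the identity is an added detail):

```python
import numpy as np

def rodrigues(omega):
    """Rotation matrix from a rotation vector omega = axis * angle,
    via R = I + sin(t) X(n) + (1 - cos(t)) X(n)^2."""
    t = np.linalg.norm(omega)
    if t < 1e-12:
        return np.eye(3)          # no rotation
    n = omega / t                 # unit axis
    X = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])   # cross-product matrix
    return np.eye(3) + np.sin(t) * X + (1.0 - np.cos(t)) * (X @ X)
```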
Rotational Panoramas
• D is the deformation matrix in M ← (I + D) M; for a small rotation ω = (ω_x, ω_y, ω_z),

    D ≈ V X(ω) V^(-1) =
        [    0        −ω_z      f·ω_y ]
        [   ω_z        0       −f·ω_x ]
        [ −ω_y / f   ω_x / f      0   ]

  where X(ω) is the cross-product (skew-symmetric) matrix of ω
• The Jacobian of D with respect to (ω_x, ω_y, ω_z) is read off entrywise; its nonzero entries are ±1, ±f, and ±1/f
Rotational Panoramas
• 3 parameters: the incremental rotation vector ω
• Update R_k in M ← (I + D) M by R_k ← R(ω) R_k
• Much easier and more intuitive to interactively adjust
Estimate Focal Length
• The 8-parameter perspective registration matrix has the form M ~ V_1 R V_0^(-1), so (up to a global scale) its third column carries a factor f_1 and its third row a factor 1/f_0; these relations let us solve for f_0 and f_1 from the entries of M
Estimate Focal Length
• If the focal length is fixed, take the geometric mean of f_0 and f_1:

    f = sqrt(f_0 · f_1)

• If there is a separate focal length estimate for every image, use the median value as the final estimate
• We can also update the focal length as part of the image registration process, using a least squares approach
Closing the gap in a panorama
• Match the first image against the last one
• Compute the gap angle θ_g
• Distribute the gap angle evenly across the whole sequence
– Modify each rotation by θ_g / N_images
– Update the focal length: f' = f (1 − θ_g / 360°)
• Only works for a 1-D panorama where the camera is continuously turning in the same direction
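The gap-closing bookkeeping is tiny; a sketch (function and variable names are assumptions):

```python
def close_gap(f, gap_deg, n_images):
    """Distribute the gap angle evenly over the sequence and shrink the
    focal length so that a full turn closes to 360 degrees:
    per-image correction = gap/N, f' = f * (1 - gap/360)."""
    per_image = gap_deg / n_images
    f_new = f * (1.0 - gap_deg / 360.0)
    return per_image, f_new
```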
Conclusion
• Does not place constraints on how the images are taken with hand-held cameras
• Accurate and robust
– Estimates only 3 rotation parameters instead of the 8 parameters of general perspective transforms
• Increases accuracy, flexibility, and ease of use
• Focal length estimation
Results
Photographing Long Scenes with Multi-Viewpoint Panoramas
http://grail.cs.washington.edu/projects/multipano/
Abstract
• Multi-viewpoint panoramas of long, roughly planar scenes
– Façades of buildings along a city street
• Panoramas are composed of relatively large regions of ordinary perspective
• User interactions
– To identify the dominant plane
– To draw strokes indicating various high-level goals
• Markov Random Field optimization
Introduction
• Long scenes
– Hard to photograph from one point
– Wider field of view: large distortion
– Single-perspective photographs are not very effective at conveying long scenes
• Street side in a city
• Bank of a river
• Aisle of a grocery store
Introduction
• Take photographs
– Walk along the other side of the street and take handheld photographs at intervals of one large step (roughly one meter)
• Output
– A single panorama that visualizes the entire extent of the scene captured in the input photographs and resembles what a human would see when walking along the street
Contributions
• A practical approach to creating high quality, high-resolution, multi-viewpoint panoramas with a simple and casual capture method.
• A number of novel techniques, including
– An objective function that describes desirable properties of a multi-viewpoint panorama, and
– A novel technique for propagating user-drawn strokes that annotate 3-D objects in the scene
Related Work
• Single-viewpoint panoramas
– Rotating a camera around its optical center
• Strip panoramas
– Translating camera
– Orthographic projection along the horizontal axis, perspective along the vertical
– Varying strip width by depth estimation or appearance optimization
– High-speed video cameras and special setups
Strip Panoramas
• Exhibit distortion for scenes with varying depths, especially if these depth variations occur across the vertical axis of the image
• Created from video sequences; still images created from video rarely have the same quality as those captured by a still camera
– Low resolution, compression artifacts
• Capturing a suitable video can be cumbersome
Approach
• Inspired by the work of artist Michael Koller
– Multi-viewpoint panoramas of San Francisco streets
– Large regions of ordinary perspective photographs
– Artfully seamed together to hide the transitions
– Attractive and informative
Properties of multi-viewpoint panoramas
• Each object in the scene is rendered from a viewpoint roughly in front of it to avoid perspective distortion.
• The panoramas are composed of large regions of linear perspective seen from a viewpoint where a person would naturally stand (for example, a city block is viewed from across the street, rather than from some faraway viewpoint).
• Local perspective effects are evident; objects closer to the image plane are larger than objects further away, and multiple vanishing points can be seen.
• The seams between these perspective regions do not draw attention; that is, the image appears natural and continuous.
Properties
• Assumes a dominant plane in the scene
• Does not attempt to
– Create multi-viewpoint panoramas that turn around street corners
– Show all four sides of a building
System Overview
Pre-processing
• Takes the source images
• Removes radial distortion
• Recovers the camera projection matrices
• Compensates for exposure variation

Panorama Surface
• Defines the picture surface
• Source photographs are then projected onto this surface

Composition
• Selects a viewpoint for each pixel in the output panorama
• The user can interactively refine the result by drawing strokes
Capture Images
• Use a digital SLR camera with auto-focus and manually control the exposure to avoid large exposure shifts
• Use a fisheye lens to ensure a wide field of view for some data sets
Pre-Processing
• Recover the projection matrix (R_i, t_i, f_i) of each camera so that we can later project the source images onto a picture surface
• Use the structure-from-motion system [Hartley and Zisserman 2004] built by Snavely et al. [2006]
– SIFT for keypoint detection and matching
– Bundle adjustment
http://phototour.cs.washington.edu/bundler/
Exposure Compensation
• Adjust the exposure of the various photographs so that they match better in overlapping regions
• Recover the radiometric response function of each photograph [Mitsunaga and Nayar 1999]
• Simpler approach: solve for one brightness scale factor k_i per image such that

    k_i I_i = k_j I_j

  for matched pixels in overlapping images
– Least squares over all pairs of SIFT matches
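A sketch of this simpler approach, solving k_i I_i = k_j I_j for all SIFT match pairs by least squares (working in the log domain so the system is linear, and anchoring k_0 = 1, are assumptions of this sketch, not necessarily the paper's exact solver):

```python
import numpy as np

def exposure_scales(pairs, n):
    """Per-image brightness scales k, with k[0] fixed to 1.
    pairs: iterable of (i, j, Ii, Ij) giving intensities of a matched
    point in images i and j; each contributes k_i * Ii = k_j * Ij,
    i.e. log k_i - log k_j = log(Ij / Ii)."""
    rows, rhs = [], []
    anchor = np.zeros(n)
    anchor[0] = 1.0                  # log k_0 = 0  ->  k_0 = 1
    rows.append(anchor); rhs.append(0.0)
    for i, j, Ii, Ij in pairs:
        r = np.zeros(n)
        r[i], r[j] = 1.0, -1.0
        rows.append(r); rhs.append(np.log(Ij / Ii))
    logk, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return np.exp(logk)
```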
Picture Surface Selection
• The picture surface should be roughly aligned with the dominant plane of the scene
• Extrude in Y direction
Picture Surface Selection
• Define the coordinate system of the recovered 3-D scene
– Automatic: fit a plane to the camera viewpoints using principal component analysis
• The dimension of greatest variation (the first principal component) becomes the new x-axis, and
• The dimension of least variation becomes the new y-axis
– Interactive: the user selects a few projected points that lie along the desired axes
• Draw the curve in the xz plane that defines the picture surface
Picture Surface Selection
• Easy to identify for street scenes
• River banks: hard to specify by drawing strokes
– The user selects clusters of scene points that should lie along the picture surface
– The system fits a third-degree polynomial z(x) to the z-coordinates of these 3-D scene points as a function of their x-coordinates
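The cubic fit described above is a short NumPy call; a minimal sketch:

```python
import numpy as np

def fit_picture_surface(x, z, degree=3):
    """Fit z(x), the picture-surface curve in the xz plane, as a
    third-degree polynomial through selected scene points."""
    return np.poly1d(np.polyfit(x, z, degree))
```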
Sample Picture Surface
• Project each picture-surface sample S(i, j) back into the source photographs
Average Image
the average image of all the projected sources
the average image after unwarping to straighten the ground plane and cropping
Interactive Refinement
• Small drifts can accumulate during structure-from-motion estimation and lead to ground planes that slowly curve
• The user clicks a few points along the average image to indicate y values of the image that should be warped straight
• Resample and crop
the average image after unwarping to straighten the ground plane and cropping
Viewpoint Selection
• How to choose the color of each pixel p of the panorama from one of the source images I_i(p)?
• Determine the labeling L(p), where L(p) is the index of the source image used at pixel p
Viewpoint Selection
• Optimization using a Markov Random Field
• Cost function terms:
– Gives a straight-on view
– Seamless transitions
– Encourages the panorama to resemble the average image in areas where the scene geometry intersects the picture surface
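A heavily simplified sketch of the first two cost terms (all function names and the exact distances are illustrative assumptions, not the paper's precise objective):

```python
import numpy as np

def data_cost(pixel_x, camera_x):
    """Unary term: prefer the source camera whose position projects
    nearest the pixel's column, giving a straight-on view."""
    return abs(pixel_x - camera_x)

def seam_cost(src_i, src_j, p, q):
    """Pairwise term for neighboring pixels p, q labeled i and j:
    small when the two source images agree at both pixels, so the
    seam between regions is hard to notice."""
    return (np.sum((src_i[p] - src_j[p]) ** 2) +
            np.sum((src_i[q] - src_j[q]) ** 2))
```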
Viewpoint Selection
• Minimize the overall cost function over each pixel and each pair of neighboring pixels
• Solve using min-cut optimization
– Compute the panorama at a lower resolution so that the MRF optimization can be computed in reasonable time
– Create higher-resolution versions using the hierarchical approach described by Agarwala et al. [2005]
– Composite the final panorama in the gradient domain to smooth errors across the seams
Viewpoint Selection
• The selected regions are not vertical strips
Interactive Refinement
• The user should be able to express desired changes to the panorama without tedious manual editing of the exact seam locations.
• Three types of strokes:
– View selection: use a certain viewpoint
– Seam suppression: no seam should pass through an object
– Inpainting: eliminate undesirable features
Interactive Refinement
Results
• 1 hour to capture the images (about 100) and 20 minutes of interaction
• Not suitable for every scene
– Suburban scenes with a range of different depths
• More results can be found at http://grail.cs.washington.edu/projects/multipano/