Panoramas
Creating Full View Panoramic Image Mosaics and Environment Maps
Richard Szeliski and Heung-Yeung Shum
Microsoft Research
Outline
• Main Contribution
• Introduction
• Details
• Results
Contributions
• Novel approach to creating full view panoramic mosaics from image sequences
– Not necessarily pure horizontal camera panning
– Does not require any controlled motions or constraints
• Represent image mosaics using a set of transforms
• Fast and robust
• Method to recover the camera focal length
• Extract environment maps from image mosaics
Introduction
• Image-based rendering
– Realistic rendering without geometry models
• IBR without depth info
– Only supports user panning, rotation, and zoom
– QuickTime VR, Surround Video, …
– Cylindrical images, spherical maps, …
Introduction
• To capture panoramic images
– Use a panoramic camera to get a cylindrical image
– Use a lens with a large field of view (fisheye lens)
– Use mirrored pyramids and parabolic mirrors
• These special setups are hardware-intensive; the alternative is to take a series of regular pictures or video and stitch them together
– Traditionally requires carefully controlled camera motion
– Produces only cylindrical images
Novel Algorithms
• Use a 3-parameter rotational motion model instead of the general 8-parameter planar perspective motion model: fewer unknowns, more robust
• Estimate the focal length from a set of 8-parameter perspective registrations
• Gap closing
Cylindrical Panoramas
• A cylindrical panorama is easy to construct
• Coordinate transformation from the camera frame (X, Y, Z) to the cylindrical image
[Figure: a cylindrical image surface of height h around the camera axes X, Y, Z]
Cylindrical Panorama
• With an ideal pinhole camera and known focal length f, a pixel (u, v) corresponds to the world ray

    (X, Y, Z) = w (u, v, f)

• Distortion: horizontal lines become curved
[Figure: a pixel (u, v) on the image plane at focal distance f]
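The forward mapping into cylindrical coordinates can be sketched as follows (a minimal sketch; coordinates are measured from the image center and f is in pixels, both assumptions of this snippet):

```python
import numpy as np

def cylindrical_coords(x, y, f):
    """Map planar image coordinates (x, y), measured from the image
    center, to cylindrical coordinates (theta, h) for focal length f."""
    theta = np.arctan2(x, f)          # angle around the cylinder axis
    h = y / np.sqrt(x**2 + f**2)      # height on the unit cylinder
    return theta, h
```

In a real warp one would evaluate this (or its inverse) on a pixel grid and resample with interpolation.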
Spherical Panorama
• A direction in space maps to spherical coordinates (θ, φ):

    sin θ = X / sqrt(X² + Z²),   cos θ = Z / sqrt(X² + Z²)
    sin φ = Y / sqrt(X² + Y² + Z²),   i.e.   tan φ = Y / sqrt(X² + Z²)

• Conversely,

    (X, Y, Z) ∝ (cos φ sin θ, sin φ, cos φ cos θ)
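These relations amount to converting a 3-D ray into spherical panorama angles; a minimal sketch:

```python
import numpy as np

def spherical_angles(X, Y, Z):
    """Spherical panorama angles of the ray (X, Y, Z):
    theta = atan2(X, Z) (pan), phi = atan2(Y, sqrt(X^2 + Z^2)) (tilt)."""
    theta = np.arctan2(X, Z)
    phi = np.arctan2(Y, np.hypot(X, Z))
    return theta, phi
```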
Motion Model
• Warp each image to the cylindrical panorama
• Ideal horizontal panning sequence
– Rotation → translation in the angle θ
• In practice
– Estimate a vertical translation as well, to compensate for vertical jitter and optical twist
– The translation (t_x, t_y) is estimated on the WARPED image!
Motion Recovery
• Estimate the incremental translation t = (t_x, t_y) by minimizing the intensity error

    E(t) = Σ_i [ I_1(x'_i + t) − I_0(x_i) ]²

• Taylor series expansion:

    E(t) ≈ Σ_i ( e_i + g_i^T t )²

  where e_i = I_1(x'_i) − I_0(x_i) is the current intensity (or color) error, and g_i^T = ∇I_1(x'_i) is the image gradient of I_1 at x'_i
Motion Recovery
• Minimization → least squares solution of the normal equations:

    [ Σ_i g_i g_i^T ] t = − Σ_i e_i g_i
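One Gauss-Newton step of this minimization reduces to a 2×2 linear solve; a minimal sketch (array shapes and names are assumptions of this snippet):

```python
import numpy as np

def translation_step(e, g):
    """Solve the normal equations [sum_i g_i g_i^T] t = -sum_i e_i g_i
    for the incremental translation t.
    e: (N,) intensity errors e_i;  g: (N, 2) image gradients g_i."""
    A = g.T @ g            # 2x2 normal-equation matrix
    b = -(g.T @ e)         # accumulated residual
    return np.linalg.solve(A, b)
```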
Motion Recovery
• Large initial displacement
– Coarse-to-fine optimization scheme
J. R. Bergen, P. Anandan, K. J. Hanna, and R. Hingorani. Hierarchical model-based motion estimation
Motion Recovery
• Large initial displacement
– Coarse-to-fine optimization scheme
• Discontinuities in intensity or color between the images being composited
– Feathering algorithm: pixels weighted by a distance map
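A minimal sketch of distance-map feathering (the rectangular border-distance map and the blend formula are illustrative simplifications, not the paper's exact implementation):

```python
import numpy as np

def border_distance(h, w):
    """Distance of each pixel to the nearest image border; used as the
    feathering weight map (exact for a full rectangular image)."""
    i = np.arange(h)[:, None]
    j = np.arange(w)[None, :]
    return np.minimum(np.minimum(i + 1, h - i),
                      np.minimum(j + 1, w - j)).astype(float)

def feather_blend(a, b, wa, wb):
    """Weighted average of two aligned images, weighted by the distance
    maps of their source images warped into the composite frame."""
    wsum = wa + wb
    return (wa * a + wb * b) / np.where(wsum == 0, 1.0, wsum)
```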
Limitations of Cylindrical or Spherical Panoramas
• Only handle pure panning motion
• Ill-sampling at the north and south poles causes big registration errors
• Require knowing the focal length f
– Estimating the focal length of the lens by registering images is not very accurate
Perspective Panoramas
• Planar perspective transform between images using 8 parameters:

    x' ~ M x,   M = [ m_0  m_1  m_2 ]
                    [ m_3  m_4  m_5 ]
                    [ m_6  m_7   1  ]

• For example, if the motion is pure translation, only m_2 and m_5 are unknown
Perspective Panoramas
• Iteratively update the transform matrix using

    M ← (I + D) M

• Resample image I_1 with x'' = (I + D) M x to obtain I_1(x'')
Perspective Panoramas
• Minimize

    E(d) = Σ_i [ I_1(x''_i) − I_0(x_i) ]² ≈ Σ_i ( e_i + g_i^T J_i d )²

  where J_i = J_d(x_i) is the Jacobian of the resampled point x''_i with respect to the motion parameters d
Perspective Panoramas
• Least squares minimization of

    E(d) ≈ Σ_i ( e_i + g_i^T J_i d )²

  leads to the normal equations A d = −b, where
  A = Σ_i J_i^T g_i g_i^T J_i is the Hessian matrix, and
  b = Σ_i e_i J_i^T g_i is the accumulated gradient or residual
Perspective Panoramas
• Works well if the initial estimate of the transformation is close enough
• Slow convergence
• Can get stuck in local minima
Rotational Panoramas
• Cameras centered at the origin:

    x ~ T V R p

  where p = (X, Y, Z) is a 3-D world point, x = (x, y, 1) its image, R the camera rotation, V the viewing (focal length) matrix, and T the translation to the image origin
• Simplicity of rotation: set c_x = c_y = 0 (pixels measured from the image center), so T drops out and

    p ~ R^(-1) V^(-1) x
Rotational Panoramas
• Camera rotating around its center of projection
• Focal length is known and the same for all images: V_k = V_l = V
• Small inter-frame rotation with angular velocity ω = (ω_x, ω_y, ω_z)
Rotational Panoramas
• Incremental rotation matrix (Rodrigues' formula):

    R(n̂, θ) = I + sin θ · X(n̂) + (1 − cos θ) · X(n̂)²

  where X(n̂) is the cross-product matrix of the rotation axis n̂
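Rodrigues' formula can be sketched directly from the expression above (the small-angle guard returning the identity is an added detail):

```python
import numpy as np

def rodrigues(omega):
    """Rotation matrix from a rotation vector omega = axis * angle,
    via R = I + sin(t) X(n) + (1 - cos(t)) X(n)^2."""
    t = np.linalg.norm(omega)
    if t < 1e-12:
        return np.eye(3)          # no rotation
    n = omega / t                 # unit axis
    X = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])   # cross-product matrix
    return np.eye(3) + np.sin(t) * X + (1.0 - np.cos(t)) * (X @ X)
```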
Rotational Panoramas
• D is the deformation matrix in M ← (I + D) M; for a small rotation ω = (ω_x, ω_y, ω_z),

    D ≈ V X(ω) V^(-1) =
        [    0        −ω_z      f·ω_y ]
        [   ω_z        0       −f·ω_x ]
        [ −ω_y / f   ω_x / f      0   ]

  where X(ω) is the cross-product (skew-symmetric) matrix of ω
• The Jacobian of D with respect to (ω_x, ω_y, ω_z) is read off entrywise; its nonzero entries are ±1, ±f, and ±1/f
Rotational Panoramas
• 3 parameters: the incremental rotation vector ω
• Update R_k in M ← (I + D) M by R_k ← R(ω) R_k
• Much easier and more intuitive to interactively adjust
Estimate Focal Length
• The 8-parameter perspective registration matrix has the form M ~ V_1 R V_0^(-1), so (up to a global scale) its third column carries a factor f_1 and its third row a factor 1/f_0; these relations let us solve for f_0 and f_1 from the entries of M
Estimate Focal Length
• If the focal length is fixed, take the geometric mean of f_0 and f_1:

    f = sqrt(f_0 · f_1)

• If there is a separate focal length estimate for every image, use the median value as the final estimate
• We can also update the focal length as part of the image registration process, using a least squares approach
Closing the gap in a panorama
• Match the first image against the last one
• Compute the gap angle θ_g
• Distribute the gap angle evenly across the whole sequence
– Modify each rotation by θ_g / N_images
– Update the focal length: f' = f (1 − θ_g / 360°)
• Only works for a 1-D panorama where the camera is continuously turning in the same direction
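The gap-closing bookkeeping is tiny; a sketch (function and variable names are assumptions):

```python
def close_gap(f, gap_deg, n_images):
    """Distribute the gap angle evenly over the sequence and shrink the
    focal length so that a full turn closes to 360 degrees:
    per-image correction = gap/N, f' = f * (1 - gap/360)."""
    per_image = gap_deg / n_images
    f_new = f * (1.0 - gap_deg / 360.0)
    return per_image, f_new
```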
Conclusion
• Does not place constraints on how the images are taken with hand-held cameras
• Accurate and robust
– Estimates only 3 rotation parameters instead of the 8 parameters of general perspective transforms
• Increases accuracy, flexibility, and ease of use
• Focal length estimation
Results
Photographing Long Scenes with Multi-Viewpoint Panoramas
http://grail.cs.washington.edu/projects/multipano/
Abstract
• Multi-viewpoint panoramas of long, roughly planar scenes
– Façades of buildings along a city street
• Panoramas are composed of relatively large regions of ordinary perspective
• User interactions
– To identify the dominant plane
– To draw strokes indicating various high-level goals
• Markov Random Field optimization
Introduction
• Long scenes
– Hard to photograph from one point
– Wider field of view: large distortion
– Single-perspective photographs are not very effective at conveying long scenes
• Street side in a city
• Bank of a river
• Aisle of a grocery store
Introduction
• Take photographs
– Walk along the other side of the street and take handheld photographs at intervals of one large step (roughly one meter)
• Output
– A single panorama that visualizes the entire extent of the scene captured in the input photographs and resembles what a human would see when walking along the street
Contributions
• A practical approach to creating high quality, high-resolution, multi-viewpoint panoramas with a simple and casual capture method.
• A number of novel techniques, including
– An objective function that describes desirable properties of a multi-viewpoint panorama, and
– A novel technique for propagating user-drawn strokes that annotate 3-D objects in the scene
Related Work
• Single-viewpoint panoramas
– Rotating a camera around its optical center
• Strip panoramas
– Translating camera
– Orthographic projection along the horizontal axis, perspective along the vertical
– Varying strip width by depth estimation or appearance optimization
– High-speed video cameras and special setups
Strip Panoramas
• Exhibit distortion for scenes with varying depths, especially if these depth variations occur across the vertical axis of the image
• Created from video sequences; still images created from video rarely have the same quality as those captured by a still camera
– Low resolution, compression artifacts
• Capturing a suitable video can be cumbersome
Approach
• Inspired by the work of artist Michael Koller
– Multi-viewpoint panoramas of San Francisco streets
– Large regions of ordinary perspective photographs
– Artfully seamed together to hide the transitions
– Attractive and informative
Properties of multi-viewpoint panoramas
• Each object in the scene is rendered from a viewpoint roughly in front of it to avoid perspective distortion.
• The panoramas are composed of large regions of linear perspective seen from a viewpoint where a person would naturally stand (for example, a city block is viewed from across the street, rather than from some faraway viewpoint).
• Local perspective effects are evident; objects closer to the image plane are larger than objects further away, and multiple vanishing points can be seen.
• The seams between these perspective regions do not draw attention; that is, the image appears natural and continuous.
Properties
• Assumes a dominant plane in the scene
• Does not attempt to
– Create multi-viewpoint panoramas that turn around street corners
– Show all four sides of a building
System Overview
Pre-processing
• Takes the source images
• Removes radial distortion
• Recovers the camera projection matrices
• Compensates for exposure variation

Panorama Surface
• Defines the picture surface
• Source photographs are then projected onto this surface

Composition
• Selects a viewpoint for each pixel in the output panorama
• The user can interactively refine the result by drawing strokes
Capture Images
• Use a digital SLR camera with auto-focus and manually control the exposure to avoid large exposure shifts
• Use a fisheye lens to ensure a wide field of view for some data sets
Pre-Processing
• Recover the projection matrix (R_i, t_i, f_i) of each camera so that we can later project the source images onto a picture surface
• Use the structure-from-motion system [Hartley and Zisserman 2004] built by Snavely et al. [2006]
– SIFT for keypoint detection and matching
– Bundle adjustment
http://phototour.cs.washington.edu/bundler/
Exposure Compensation
• Adjust the exposure of the various photographs so that they match better in overlapping regions
• Recover the radiometric response function of each photograph [Mitsunaga and Nayar 1999]
• Simpler approach: solve for one brightness scale factor k_i per image such that

    k_i I_i = k_j I_j

  for matched pixels in overlapping images
– Least squares over all pairs of SIFT matches
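A sketch of this simpler approach, solving k_i I_i = k_j I_j for all SIFT match pairs by least squares (working in the log domain so the system is linear, and anchoring k_0 = 1, are assumptions of this sketch, not necessarily the paper's exact solver):

```python
import numpy as np

def exposure_scales(pairs, n):
    """Per-image brightness scales k, with k[0] fixed to 1.
    pairs: iterable of (i, j, Ii, Ij) giving intensities of a matched
    point in images i and j; each contributes k_i * Ii = k_j * Ij,
    i.e. log k_i - log k_j = log(Ij / Ii)."""
    rows, rhs = [], []
    anchor = np.zeros(n)
    anchor[0] = 1.0                  # log k_0 = 0  ->  k_0 = 1
    rows.append(anchor); rhs.append(0.0)
    for i, j, Ii, Ij in pairs:
        r = np.zeros(n)
        r[i], r[j] = 1.0, -1.0
        rows.append(r); rhs.append(np.log(Ij / Ii))
    logk, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return np.exp(logk)
```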
Picture Surface Selection
• The picture surface should be roughly aligned with the dominant plane of the scene
• Extrude in Y direction
Picture Surface Selection
• Define the coordinate system of the recovered 3-D scene
– Automatic: fit a plane to the camera viewpoints using principal component analysis
• The dimension of greatest variation (the first principal component) becomes the new x-axis, and
• The dimension of least variation becomes the new y-axis
– Interactive: the user selects a few projected points that lie along the desired axes
• Draw the curve in the xz plane that defines the picture surface
Picture Surface Selection
• Easy to identify for street scenes
• River banks: hard to specify by drawing strokes
– The user selects clusters of scene points that should lie along the picture surface
– The system fits a third-degree polynomial z(x) to the z-coordinates of these 3-D scene points as a function of their x-coordinates
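The cubic fit described above is a short NumPy call; a minimal sketch:

```python
import numpy as np

def fit_picture_surface(x, z, degree=3):
    """Fit z(x), the picture-surface curve in the xz plane, as a
    third-degree polynomial through selected scene points."""
    return np.poly1d(np.polyfit(x, z, degree))
```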
Sample Picture Surface
• Project each picture-surface sample S(i, j) back into the source photographs
Average Image
the average image of all the projected sources
the average image after unwarping to straighten the ground plane and cropping
Interactive Refinement
• Small drifts can accumulate during structure-from-motion estimation and lead to ground planes that slowly curve
• The user clicks a few points along the average image to indicate y values of the image that should be warped straight
• Resample and crop
the average image after unwarping to straighten the ground plane and cropping
Viewpoint Selection
• How to choose the color of each pixel p of the panorama from one of the source images I_i(p)?
• Determine the labeling L(p), where L(p) is the index of the source image used at pixel p
Viewpoint Selection
• Optimization using a Markov Random Field
• Cost function terms:
– Gives a straight-on view
– Seamless transitions
– Encourages the panorama to resemble the average image in areas where the scene geometry intersects the picture surface
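A heavily simplified sketch of the first two cost terms (all function names and the exact distances are illustrative assumptions, not the paper's precise objective):

```python
import numpy as np

def data_cost(pixel_x, camera_x):
    """Unary term: prefer the source camera whose position projects
    nearest the pixel's column, giving a straight-on view."""
    return abs(pixel_x - camera_x)

def seam_cost(src_i, src_j, p, q):
    """Pairwise term for neighboring pixels p, q labeled i and j:
    small when the two source images agree at both pixels, so the
    seam between regions is hard to notice."""
    return (np.sum((src_i[p] - src_j[p]) ** 2) +
            np.sum((src_i[q] - src_j[q]) ** 2))
```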
Viewpoint Selection
• Minimize the overall cost function over each pixel and each pair of neighboring pixels
• Solve using min-cut optimization
– Compute the panorama at a lower resolution so that the MRF optimization can be computed in reasonable time
– Create higher-resolution versions using the hierarchical approach described by Agarwala et al. [2005]
– Composite the final panorama in the gradient domain to smooth errors across the seams
Viewpoint Selection
• The selected regions are not vertical strips
Interactive Refinement
• The user should be able to express desired changes to the panorama without tedious manual editing of the exact seam locations.
• Three types of strokes:
– View selection: use a certain viewpoint
– Seam suppression: no seam should pass through an object
– Inpainting: eliminate undesirable features
Interactive Refinement
Results
• 1 hour to capture the images (about 100) and 20 minutes of interaction
• Not suitable for every scene
– Suburban scenes with a range of different depths
• More results can be found at http://grail.cs.washington.edu/projects/multipano/