Automatic scene inference for 3D object compositing
Kevin Karsch (UIUC), Kalyan Sunkavalli, Sunil Hadap, Nathan Carr, Hailin Jin, Rafael Fonte, Michael Sittig, David Forsyth
SIGGRAPH 2014
What is this system
• An image editing system with drag-and-drop object insertion
• Objects are placed in 3D and relit
• Fully automatic recovery of a comprehensive 3D scene model: geometry, illumination, diffuse albedo, and camera parameters
• Works from a single low dynamic range (LDR) image
Existing problems
• It is the artist's job to create photorealistic effects by reasoning about the physical space
• Lighting, shadows, and perspective must all be made consistent
• This requires camera parameters, scene geometry, surface materials, and sources of illumination
State-of-the-art
• http://www.popularmechanics.com/technology/digital/visual-effects/4218826
• http://en.wikipedia.org/wiki/The_Adventures_of_Seinfeld_%26_Superman
What this system cannot handle
• Works best when scene lighting is diffuse; therefore it generally works better indoors than outdoors
• Errors in geometry, illumination, or materials may be prominent
• Does not handle object insertion behind existing scene elements
Contributions
• Illumination inference: recovers a full lighting model, including light sources not directly visible in the photograph
• Depth estimation: combines data-driven depth transfer with geometric reasoning about the scene layout
How to do this
• Needed: geometry, illumination, and surface reflectance
• Even though the estimates are coarse, the composites still look realistic, because even large changes in lighting are often not perceivable
Workflow
Indoor/outdoor scene classification
• K-nearest-neighbor matching of GIST features
• Indoor dataset: NYUv2
• Outdoor dataset: Make3D
• Different training images and classifiers are chosen depending on whether the scene is indoor or outdoor
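The classification step above can be sketched as a k-nearest-neighbor vote over precomputed GIST descriptors (a minimal sketch: GIST extraction is assumed to happen elsewhere, and the function name, feature dimensionality, and choice of k are illustrative, not the paper's code):

```python
import numpy as np

def knn_indoor_outdoor(query_gist, train_gists, train_labels, k=5):
    """Vote among the k nearest GIST descriptors (L2 distance).

    train_labels: 1 = indoor (e.g. from NYUv2), 0 = outdoor (e.g. from
    Make3D).  GIST descriptor extraction itself is assumed done elsewhere.
    """
    dists = np.linalg.norm(train_gists - query_gist, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest scenes
    votes = train_labels[nearest]
    return 1 if votes.sum() * 2 > k else 0   # majority vote
```

The predicted class then selects which training set and which downstream classifiers are used.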
Single-image reconstruction
• Camera parameters and geometry
– The focal length f, camera center (cx, cy), and extrinsic parameters are computed from three orthogonal vanishing points detected in the scene
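One standard construction for these intrinsics takes the principal point as the orthocenter of the triangle formed by the three orthogonal vanishing points, then solves the orthogonality constraint (v1 - c) · (v2 - c) + f² = 0 for f. This is a sketch under the usual zero-skew, unit-aspect assumptions; the function name and interface are illustrative:

```python
import numpy as np

def calibrate_from_vps(v1, v2, v3):
    """Recover focal length and principal point from three finite,
    mutually orthogonal vanishing points (2D image coordinates)."""
    v1, v2, v3 = (np.asarray(v, float) for v in (v1, v2, v3))
    # Principal point = orthocenter: intersect two altitudes.
    # The altitude through v1 is perpendicular to the edge (v3 - v2), etc.
    A = np.array([v3 - v2, v1 - v3])
    b = np.array([np.dot(v1, v3 - v2), np.dot(v2, v1 - v3)])
    c = np.linalg.solve(A, b)
    # Orthogonality of the two scene directions gives f.
    f = np.sqrt(max(0.0, -np.dot(v1 - c, v2 - c)))
    return f, c
```

Given f and (cx, cy), the rotation (extrinsics) follows by normalizing the back-projected vanishing directions.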
Surface materials
• Per-pixel diffuse albedo and shading are estimated with the Color Retinex method
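The core Retinex heuristic is that large log-intensity gradients are reflectance (albedo) edges while small ones are smooth shading. A toy 1-D version of that idea (the real Color Retinex works in 2-D and also uses chromaticity cues and a global solve; the threshold here is illustrative):

```python
import numpy as np

def retinex_1d(log_I, thresh=0.1):
    """Toy 1-D Retinex: log image = log reflectance + log shading.

    Gradients above `thresh` are classified as albedo edges; the rest
    are attributed to smooth shading.
    """
    g = np.diff(log_I)
    g_refl = np.where(np.abs(g) > thresh, g, 0.0)     # albedo edges only
    log_R = np.concatenate([[0.0], np.cumsum(g_refl)])  # integrate them back
    log_S = log_I - log_R   # so log_R + log_S == log_I exactly
    return log_R, log_S
```

On a signal with a slow shading ramp plus one sharp albedo step, the step ends up in `log_R` and the ramp in `log_S`.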
Data-driven depth estimation
• Database: RGB-D images
• Appearance cues for correspondences: multi-scale SIFT features
• Geometric information about the scene is also incorporated
Data-driven depth estimation (continued)
• Et: depth transfer
• Em: Manhattan world
• Eo: orientation
• E3s: spatial smoothness in 3D
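Schematically, the depth map D is obtained by minimizing a weighted sum of the four terms above (the λ weights are assumptions here; the paper's exact objective may differ in form):

```latex
\min_{D}\; E(D) \;=\; E_{t}(D) \;+\; \lambda_{m}\,E_{m}(D)
  \;+\; \lambda_{o}\,E_{o}(D) \;+\; \lambda_{3s}\,E_{3s}(D)
```

Here E_t ties D to depths transferred from matched RGB-D exemplars, while the remaining terms regularize it toward a Manhattan-world layout, consistent surface orientations, and spatial smoothness in 3D.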
Scene illumination
Visible sources
• Segment the image into superpixels
• Compute features for each superpixel:
– Location in the image
– The 340 features used in Make3D
• Train a binary classifier on annotated data to predict whether or not a superpixel is emitting/reflecting a significant amount of light
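As a stand-in for that trained binary classifier, a minimal logistic-regression sketch over per-superpixel feature vectors (the feature layout, learning rate, and iteration count are illustrative toy choices, not the paper's):

```python
import numpy as np

def train_light_classifier(X, y, lr=0.5, iters=500):
    """Minimal logistic regression by gradient descent.

    X: one feature row per superpixel; y: 1 if the superpixel was
    annotated as emitting/reflecting significant light, else 0.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted light probability
        grad = p - y                             # logistic-loss gradient
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def predict_light(X, w, b):
    """True where a superpixel is predicted to be a visible light source."""
    return (X @ w + b) > 0.0
```

Superpixels flagged as sources become area lights in the recovered illumination model.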
Out-of-view sources
• Data-driven: annotated SUN360 panorama dataset
• Assumption: if photographs are similar, then the illumination environments beyond the photographed regions will be similar as well
Out-of-view sources (continued)
• Features used: geometric context, orientation maps, spatial pyramids, HSV histograms, and the output of the light classifier
• Measures: histogram intersection score and per-pixel inner product
• Similarity metric for IBLs: how similar the rendered canonical objects are
• Ranking function: trained with a 1-slack, linear SVM-ranking optimization
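The histogram intersection score mentioned above is simply the sum of bin-wise minima of two normalized histograms, reaching 1.0 for identical distributions and 0.0 for disjoint ones:

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Intersection score of two histograms, normalized to unit mass."""
    h1 = np.asarray(h1, float)
    h2 = np.asarray(h2, float)
    return np.minimum(h1 / h1.sum(), h2 / h2.sum()).sum()
```

In this pipeline such scores are not used directly as the final similarity; they are inputs to the trained SVM ranking function that orders candidate panoramas.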
Relative intensities of the light sources
• Intensity estimation through rendering: intensities are adjusted until a rendered version of the scene matches the original image
• Humans cannot distinguish between a range of illumination configurations, suggesting that there is a family of lighting conditions that produce the same perceptual response
• The system therefore simply chooses the lighting configuration that can be rendered fastest
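Because rendering is linear in light-source intensity, matching a render to the original image reduces to a least-squares problem over per-source basis renders: a scene lit with intensities w equals Σᵢ wᵢ·Bᵢ, where Bᵢ is the scene rendered with only source i at unit intensity. A sketch of that idea (the paper solves a comparable matching problem; in practice intensities would also be constrained nonnegative):

```python
import numpy as np

def estimate_light_intensities(unit_renders, target):
    """Solve for per-source intensities that best reproduce `target`.

    unit_renders: list of images, each rendered with one light source
    at unit intensity and all others off.
    target: the original photograph (same shape as each render).
    """
    A = np.column_stack([b.ravel() for b in unit_renders])
    w, *_ = np.linalg.lstsq(A, np.ravel(target), rcond=None)
    return w
```

With exact basis renders this recovers the true intensities; with real estimated geometry and albedo it gives the best-fitting configuration in a least-squares sense.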
Physically grounded image editing
• Drag-and-drop insertion
• Lighting adjustment
• Synthetic depth-of-field
User study
• Real object in a real scene vs. inserted object in a real scene
• Synthetic object in a synthetic scene vs. inserted object in a synthetic scene
• The method produces perceptually convincing results