Geometric and Semantic 3D Reconstruction:
Part 4B: 3D Prediction using ConvNets
CVPR 2017 Tutorial
Christian Häne
UC Berkeley
Problem Statement
• Predict full 3D geometry from little input
– Single depth map
– Single RGB image
• Multi-view geometry not applicable
Input Desired Output
Classical Solution
• Fit shape model into input data
Morphable Face Model [Blanz and Vetter, SIGGRAPH 99]
• High resolution geometry
• Manual generation of model
• Only possible for suitable object classes
Encoder-Decoder Architecture
• Auto-Encoder, Input = Output
– Feature Learning [Zeiler et al. CVPR 2010]
– Representation Learning
Convolutional Layers Up-Convolutional Layers
Low Dimensional Representation
Input Output
Auto-Encoder Architecture
• Learning Shape Representations
Input Voxels Output Voxels
3D Up-Convolutional Layers
Shape Code
3D Convolutional Layers
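The auto-encoder idea above can be sketched in a few lines. This is a minimal illustration, not the architecture on the slide: it assumes a toy 8³ grid, a 16-dimensional shape code, and untrained dense layers in place of the 3D convolutional and up-convolutional layers. The names `autoencode`, `bce`, `w_enc`, and `w_dec` are made up for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencode(voxels, w_enc, w_dec):
    """Encode a binary voxel grid to a shape code and decode it back."""
    x = voxels.reshape(-1).astype(np.float64)   # flatten the grid
    code = np.tanh(w_enc @ x)                   # low-dimensional shape code
    recon = sigmoid(w_dec @ code)               # per-voxel occupancy probability
    return code, recon.reshape(voxels.shape)

def bce(target, pred, eps=1e-7):
    """Mean binary cross-entropy between target occupancy and prediction."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

rng = np.random.default_rng(0)
grid = (rng.random((8, 8, 8)) > 0.5).astype(np.float64)  # toy 8^3 input (assumption)
w_enc = rng.normal(scale=0.05, size=(16, grid.size))     # untrained encoder weights
w_dec = rng.normal(scale=0.05, size=(grid.size, 16))     # untrained decoder weights
code, recon = autoencode(grid, w_enc, w_dec)
loss = bce(grid, recon)   # Input = Output: the target is the input itself
```

Training would minimize `loss` over many shapes, so that the 16 numbers in `code` are forced to summarize the whole grid.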
Predicting Geometry from RGB
• Input mapped to shape representation
• Decoded to occupancy volume
2D Convolutional Layers 3D Up-Convolutional Layers
Low Dimensional Shape Representation
Input Image Output Voxel Grid
TL-Networks [Girdhar et al. ECCV 2016]
• Learn shape representation with auto-encoder
• Predict learned embedding from RGB image
TL-Networks [Girdhar et al. ECCV 2016]
• Coarse resolution voxel grids
• Generalizes to real data
3D-R2N2 [Choy et al. ECCV 2016]
• Multiple input images
3D-R2N2 [Choy et al. ECCV 2016]
• Recurrent network with memory layers
– LSTM (long short-term memory) [Hochreiter & Schmidhuber 1997]
– GRU (gated recurrent unit) [Cho et al. 2014]
• Variable number of input images
Update Equations GRU
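The equations on this slide were not captured in the transcript. As a stand-in, here is a sketch of the standard GRU update in its usual vector form. It assumes dense weight matrices, whereas 3D-R2N2 arranges its recurrent units on a 3D grid with convolutions. Conventions also differ on which of `z` and `1 - z` weights the old state. The helper names and sizes are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, params):
    """One GRU update: x is the new input feature, h the previous hidden state."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h + bz)              # update gate
    r = sigmoid(Wr @ x + Ur @ h + br)              # reset gate
    h_cand = np.tanh(Wh @ x + Uh @ (r * h) + bh)   # candidate state
    return (1.0 - z) * h + z * h_cand              # blend old and candidate state

def init_params(n_in, n_hid, rng):
    """Random (untrained) parameters, for illustration only."""
    params = []
    for _ in range(3):
        params += [rng.normal(scale=0.1, size=(n_hid, n_in)),
                   rng.normal(scale=0.1, size=(n_hid, n_hid)),
                   np.zeros(n_hid)]
    return params

rng = np.random.default_rng(0)
params = init_params(n_in=4, n_hid=3, rng=rng)
h = np.zeros(3)
for x in rng.normal(size=(5, 4)):   # any number of views can be fed in sequence
    h = gru_step(x, h, params)
```

Because the same `gru_step` is applied once per view, the number of input images does not need to be fixed in advance.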
3D-R2N2 [Choy et al. ECCV 2016]
Datasets
• Pascal 3D [Xiang et al. 2014]
– CAD model annotations in real images
• CAD annotation for NYUD [Guo & Hoiem, ICCV 2013]
– RGBD images annotated with CAD models
• ShapeNet [Chang et al. 2015]
– CAD model database
– Input images rendered synthetically
Pascal 3D
• Real images with manually aligned CAD models
NYUD + CAD
• CAD models aligned with RGBD data
ShapeNet
• CAD models from 57 object categories
– Rendering of images
– Voxelization
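As a rough illustration of the voxelization step, the sketch below marks every grid cell that contains at least one sample point. It assumes the model has already been sampled into points normalized to the unit cube, and it produces a surface shell rather than a filled solid. Real pipelines differ in exactly these choices. The function name `voxelize_points` is hypothetical.

```python
import numpy as np

def voxelize_points(points, resolution):
    """Mark every voxel of a resolution^3 grid that contains at least one point.
    Points are assumed to lie in the unit cube [0, 1]^3."""
    idx = np.floor(np.asarray(points) * resolution).astype(int)
    idx = np.clip(idx, 0, resolution - 1)     # guard points exactly on the upper face
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

pts = np.array([[0.1, 0.1, 0.1], [0.12, 0.1, 0.1], [0.9, 0.5, 0.2]])
grid = voxelize_points(pts, 4)   # the first two points fall into the same voxel
```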
Notes on Evaluation
• CAD annotations unsuitable for training
– Very small number of models (ca. 10 per class)
– No train / test split of used CAD models
– Training a classifier outperforms voxel prediction
• Using CAD models directly (ShapeNet)
– Synthetic images
• Alleviated by random backgrounds
– Numbers not comparable between papers
• Different: renderings, voxelizations, viewpoint sampling, subsets of categories, …
Supervision
• Ground truth voxels
• Weaker supervision
– Silhouettes
– Depth maps
– Color
Perspective Transformer Nets [Yan et al. NIPS 2016]
• Silhouettes as Supervision
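A simplified sketch of silhouette supervision: project the predicted occupancy grid to an image and compare it with the observed mask. It assumes an orthographic camera looking along one grid axis, whereas the cited work first resamples the volume with a perspective transformation. The squared-error loss and the function names are illustrative choices.

```python
import numpy as np

def project_silhouette(occupancy, axis=2):
    """Orthographic silhouette: a pixel is covered if any voxel along its ray is occupied."""
    return occupancy.max(axis=axis)

def silhouette_loss(occupancy, mask):
    """Mean squared difference between projected and observed silhouette."""
    return float(np.mean((project_silhouette(occupancy) - mask) ** 2))

occ = np.zeros((4, 4, 4))
occ[1:3, 1:3, 2] = 0.9            # predicted occupancy probabilities
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0              # observed silhouette
loss = silhouette_loss(occ, mask)
```

No ground-truth voxels appear anywhere in the loss, which is what makes the supervision weak.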
Perspective Transformer Nets [Yan et al. NIPS 2016]
Ray Consistency [Tulsiani et al. CVPR 2017]
• Ray formulation with multi-view supervision
Figure: Input Image, CNN, Geometric Consistency Loss, Observation O from camera C
• Silhouettes
• Depth
• Color
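The ray formulation can be sketched for a single ray as follows. A ray either stops in the first occupied voxel it meets or escapes the grid. The loss is the expected cost of these events given the predicted occupancies. The example uses a depth observation. The fixed `escape_depth` and the function names are assumptions made for illustration.

```python
import numpy as np

def ray_event_probs(occ_along_ray):
    """Probability that a ray stops in each voxel, plus the probability it escapes.
    occ_along_ray: occupancy probabilities ordered from the camera outward."""
    x = np.asarray(occ_along_ray, dtype=np.float64)
    free_before = np.concatenate(([1.0], np.cumprod(1.0 - x)))  # P(all earlier voxels empty)
    stop = x * free_before[:-1]        # ray terminates in voxel i
    escape = free_before[-1]           # ray passes through every voxel
    return stop, escape

def depth_ray_loss(occ_along_ray, voxel_depths, observed_depth, escape_depth):
    """Expected absolute depth error over all ray events."""
    stop, escape = ray_event_probs(occ_along_ray)
    cost_stop = np.abs(np.asarray(voxel_depths) - observed_depth)
    cost_escape = abs(escape_depth - observed_depth)
    return float(np.sum(stop * cost_stop) + escape * cost_escape)

stop, escape = ray_event_probs([0.5, 0.5])
loss = depth_ray_loss([0.0, 1.0, 0.0], [1.0, 2.0, 3.0],
                      observed_depth=2.0, escape_depth=10.0)
```

Silhouette or color observations would change only the per-event cost; the event probabilities stay the same.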
Ray Consistency [Tulsiani et al. CVPR 2017]
GT Input 3D Supervision
DRC (Mask)
DRC (Depth)
Coarse Resolution
• Problem: Volumetric prediction at low resolution (32³)
• A surface is only 2D, not 3D
– Predict coefficients of parametric shape model [Dibra et al. 3DV 2016]
– Predict volume hierarchically around surface [Häne et al. 2017]
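To make the "surface is only 2D" argument concrete, the sketch below counts how many voxels of a voxelized sphere touch its surface at several resolutions. The sphere, its radius, and the 6-neighbor definition of a boundary voxel are assumptions chosen for illustration.

```python
import numpy as np

def sphere_grid(resolution, radius=0.4):
    """Occupancy of a sphere centered in the unit cube."""
    c = (np.arange(resolution) + 0.5) / resolution - 0.5   # voxel centers
    x, y, z = np.meshgrid(c, c, c, indexing="ij")
    return x**2 + y**2 + z**2 <= radius**2

def boundary_count(occ):
    """Count occupied voxels that have at least one empty 6-neighbor."""
    padded = np.pad(occ, 1, constant_values=False)
    interior = np.ones_like(occ)
    for axis in range(3):
        for shift in (-1, 1):
            interior &= np.roll(padded, shift, axis=axis)[1:-1, 1:-1, 1:-1]
    return int(np.sum(occ & ~interior))

stats = {}
for res in (16, 32, 64):
    stats[res] = (res**3, boundary_count(sphere_grid(res)))   # (total, boundary)
```

Each doubling of the resolution multiplies the total voxel count by 8, while the boundary count grows by roughly a factor of 4, so the share of voxels near the surface keeps shrinking.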
Predict Morphable Shape Model [Dibra et al. 3DV 2016]
• Needs known shape model
Hierarchical Surface Prediction [Häne, Tulsiani, Malik 2017]
• Our approach: Hierarchical Surface Prediction (HSP)
– Three label prediction (free/occupied/boundary)
– Higher resolution only on boundary (octree)
Input 16³ 32³ 64³ 128³ 256³
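One plausible way to obtain the three labels from a high-resolution ground-truth grid is sketched below. A coarse cell is free if the block of fine voxels it covers is empty, occupied if the block is full, and boundary otherwise. This is an assumption about how such labels could be derived, not a statement of the exact procedure used in HSP. It requires a cubic grid whose side is divisible by `factor`.

```python
import numpy as np

FREE, BOUNDARY, OCCUPIED = 0, 1, 2

def three_label_grid(occ_high, factor):
    """Label each coarse cell from the block of fine voxels it covers."""
    r = occ_high.shape[0] // factor
    blocks = occ_high.reshape(r, factor, r, factor, r, factor)
    frac = blocks.mean(axis=(1, 3, 5))          # occupied fraction per coarse cell
    labels = np.full(frac.shape, BOUNDARY, dtype=np.int8)
    labels[frac == 0.0] = FREE
    labels[frac == 1.0] = OCCUPIED
    return labels

occ = np.zeros((8, 8, 8), dtype=bool)
occ[:, :, :3] = True                    # solid slab, 3 voxels thick along z
labels = three_label_grid(occ, 2)       # 4^3 label grid
```

Only the cells labeled `BOUNDARY` would be subdivided at the next level, which gives the prediction its octree structure.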
Hierarchical Surface Prediction [Häne, Tulsiani, Malik 2017]
• Network architecture
Hierarchical Surface Prediction [Häne, Tulsiani, Malik 2017]
• Two baselines
– LR Soft
– LR Hard
Hierarchical Surface Prediction [Häne, Tulsiani, Malik 2017]
Intersection over Union
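For reference, voxel intersection over union can be computed as in the sketch below. The 0.5 threshold on predicted occupancy and the convention of returning 1.0 for two empty grids are assumptions. As noted earlier in the talk, such choices are one reason numbers differ between papers.

```python
import numpy as np

def voxel_iou(pred, gt, threshold=0.5):
    """Intersection over union between a thresholded prediction and ground truth."""
    pred_bin = np.asarray(pred) >= threshold
    gt_bin = np.asarray(gt).astype(bool)
    union = np.logical_or(pred_bin, gt_bin).sum()
    if union == 0:
        return 1.0                       # both grids empty: treat as perfect match
    return float(np.logical_and(pred_bin, gt_bin).sum() / union)

gt = np.zeros((4, 4, 4), dtype=bool)
gt[:2] = True                            # ground truth fills slabs 0 and 1
pred = np.zeros((4, 4, 4))
pred[1:3] = 0.8                          # prediction fills slabs 1 and 2
iou = voxel_iou(pred, gt)                # 16 shared voxels out of 48 in the union
```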
Hierarchical Surface Prediction [Häne, Tulsiani, Malik 2017]
• Smooth high res surfaces
• Quantitatively more accurate than low res
• Various object categories
Conclusion
• Geometry prediction is a recent approach
• Difficult to compare methods
• Weak supervision
• Hierarchical prediction