geometric and semantic 3d reconstruction: part …chaene/cvpr17tut/geometry...geometric and semantic...

Geometric and Semantic 3D Reconstruction:

Part 4B: 3D Prediction using ConvNets

CVPR 2017 Tutorial

Christian Häne

UC Berkeley

Problem Statement

• Predict full 3D geometry from little input

– Single depth map

– Single RGB image

• Multi-view geometry not applicable

Input Desired Output

Classical Solution

• Fit shape model into input data

Morphable Face Model [Blanz and Vetter, SIGGRAPH 99]

• High resolution geometry

• Manual generation of model

• Only possible for suitable object classes

Encoder-Decoder Architecture

• Auto-Encoder, Input = Output

– Feature Learning [Zeiler et al. CVPR 2010]

– Representation Learning

Convolutional Layers Up-Convolutional Layers

Low Dimensional Representation

Input Output

Auto-Encoder Architecture

• Learning Shape Representations

Input Voxels Output Voxels

3D Up-Convolutional Layers

Shape Code

3D Convolutional Layers

Predicting Geometry from RGB

• Input mapped to shape representation

• Decoded to occupancy volume

2D Convolutional Layers 3D Up-Convolutional Layers

Low Dimensional Shape Representation

Input Image Output Voxel Grid

TL-Networks [Girdhar et al. ECCV 2016]

• Learn shape representation with auto-encoder

• Predict learned embedding from RGB image

TL-Networks [Girdhar et al. ECCV 2016]

• Coarse resolution voxel grids

• Generalizes to real data

3D-R2N2 [Choy et al. ECCV 2016]

• Multiple input images


• Recurrent network with memory layers

– LSTM (long short-term memory) [Hochreiter & Schmidhuber 1997]

– GRU (gated recurrent unit) [Cho et al. 2014]

• Variable number of input images

Update Equations GRU
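
The equations shown on this slide did not survive in the text. As a reminder of what a GRU update computes, here is a minimal sketch of the standard fully connected form: an update gate u, a reset gate r, and a candidate state that is blended with the previous hidden state. 3D-R2N2 applies a convolutional variant on a 3D grid of hidden states, which this sketch does not reproduce. The names `gru_step`, `make_params` and `fuse_views`, the parameter layout, and the choice of letting u weight the new candidate are all assumptions made for illustration.

```python
import math
import random


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def affine(W, x, U, h, b):
    """Compute W x + U h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def gru_step(x, h_prev, p):
    """One GRU update for input features x and previous hidden state h_prev."""
    # Update gate: how much of the new candidate enters the state.
    u = [sigmoid(a) for a in affine(p["Wu"], x, p["Uu"], h_prev, p["bu"])]
    # Reset gate: how much of the old state feeds the candidate.
    r = [sigmoid(a) for a in affine(p["Wr"], x, p["Ur"], h_prev, p["br"])]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a) for a in affine(p["Wh"], x, p["Uh"], gated, p["bh"])]
    # Convention assumed here: h = (1 - u) * h_prev + u * candidate.
    return [(1.0 - ui) * hi + ui * ci for ui, hi, ci in zip(u, h_prev, cand)]


def make_params(n_in, n_hid, scale, seed=0):
    """Hypothetical helper: random toy weights, zero biases."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for gate in "urh":
        p["W" + gate] = mat(n_hid, n_in)
        p["U" + gate] = mat(n_hid, n_hid)
        p["b" + gate] = [0.0] * n_hid
    return p


def fuse_views(view_features, p, n_hid):
    """Fold any number of per-view feature vectors into one hidden state."""
    h = [0.0] * n_hid
    for x in view_features:
        h = gru_step(x, h, p)
    return h


params = make_params(n_in=2, n_hid=2, scale=0.5)
one_view = fuse_views([[1.0, 0.0]], params, 2)
three_views = fuse_views([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], params, 2)
```

Because the same step is applied once per image, the hidden state has the same size for one view or many, which is what allows a variable number of input images.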

Datasets

• Pascal 3D [Xiang et al. 2014]

– CAD model annotations in real images

• CAD annotation for NYUD [Guo & Hoiem, ICCV 2013]

– RGBD images annotated with CAD models

• ShapeNet [Chang et al. 2015]

– CAD model database

– Input images rendered synthetically

Pascal 3D

• Real images with manually aligned CAD models

NYUD + CAD

• CAD models aligned with RGBD data

ShapeNet

• CAD models from 57 object categories

– Rendering of images

– Voxelization

Notes on Evaluation

• CAD annotations unsuitable for training

– Very small number of models (ca. 10 per class)

– No train / test split of used CAD models

– Training a classifier outperforms voxel prediction

• Using CAD models directly (ShapeNet)

– Synthetic images

• Alleviated by random backgrounds

– Numbers not comparable between papers

• Different renderings, voxelizations, viewpoint sampling, subsets of categories, …

Supervision

• Ground truth voxels

• Weaker supervision

– Silhouettes

– Depth maps

– Color

Perspective Transformer Nets [Yan et al. NIPS 2016]

• Silhouettes as Supervision
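
To make silhouette supervision concrete, the following minimal sketch projects a predicted occupancy volume to a 2D mask and compares it with an observed silhouette. It is a simplification under stated assumptions: Perspective Transformer Nets use a perspective projection, whereas this sketch uses an orthographic max-projection along the z axis. The function names and the squared-error loss are illustrative choices, not taken from the paper.

```python
def project_silhouette(volume):
    """Orthographic max-projection of volume[z][y][x] along z, giving image[y][x]."""
    depth = len(volume)
    height = len(volume[0])
    width = len(volume[0][0])
    return [
        [max(volume[z][y][x] for z in range(depth)) for x in range(width)]
        for y in range(height)
    ]


def silhouette_loss(volume, mask):
    """Mean squared difference between the projected volume and the mask."""
    projection = project_silhouette(volume)
    total = 0.0
    count = 0
    for proj_row, mask_row in zip(projection, mask):
        for p, m in zip(proj_row, mask_row):
            total += (p - m) ** 2
            count += 1
    return total / count


# A 2x2x2 volume with a single occupied voxel at z=1, y=0, x=1.
volume = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
matching_mask = [[0.0, 1.0], [0.0, 0.0]]
larger_mask = [[1.0, 1.0], [0.0, 0.0]]
print(silhouette_loss(volume, matching_mask))
print(silhouette_loss(volume, larger_mask))
```

The loss only constrains what is visible in the projection, so several views are typically needed to pin down the shape.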

Perspective Transformer Nets [Yan et al. NIPS 2016]

Ray Consistency [Tulsiani et al. CVPR 2017]

• Ray formulation with multi-view supervision

Observation O

from camera C

Input Image

Geometric Consistency Loss

CNN

• Silhouettes

• Depth

• Color
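
The ray formulation can be sketched as follows: for the voxels a ray passes through, ordered from the camera outward, the predicted occupancies define the probability that the ray stops at each voxel or escapes the grid, and the loss is the expected cost of those events against the observation. This is a minimal sketch under assumptions: the function names, the boolean mask convention, and the `escape_cost` parameter are hypothetical choices for illustration, not the paper's notation.

```python
def ray_event_probabilities(occupancies):
    """Return (stop_probs, escape_prob) for voxels ordered from the camera.

    stop_probs[i] is the probability that voxels 0..i-1 are free and
    voxel i is occupied; escape_prob is the probability all are free.
    """
    stop_probs = []
    free_so_far = 1.0
    for x in occupancies:
        stop_probs.append(free_so_far * x)
        free_so_far *= 1.0 - x
    return stop_probs, free_so_far


def mask_ray_loss(occupancies, pixel_in_mask):
    """Expected cost when the observation is a foreground/background label."""
    stop_probs, escape_prob = ray_event_probabilities(occupancies)
    if pixel_in_mask:
        # The ray should hit the object, so escaping is the costly event.
        return escape_prob
    # The ray should pass through free space, so stopping anywhere is costly.
    return sum(stop_probs)


def depth_ray_loss(occupancies, voxel_depths, observed_depth, escape_cost):
    """Expected absolute depth error; escape_cost is an assumed penalty."""
    stop_probs, escape_prob = ray_event_probabilities(occupancies)
    expected = sum(
        q * abs(d - observed_depth) for q, d in zip(stop_probs, voxel_depths)
    )
    return expected + escape_prob * escape_cost


stops, escape = ray_event_probabilities([0.5, 0.5])
print(stops, escape)
print(mask_ray_loss([0.5, 0.5], pixel_in_mask=True))
print(depth_ray_loss([0.5, 0.5], [1.0, 2.0], observed_depth=2.0, escape_cost=3.0))
```

Since every term is a product of occupancies, the expected cost is differentiable with respect to the predicted volume, so the same construction can serve silhouette, depth, or color observations by swapping the per-event cost.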



Qualitative comparison: GT, Input, 3D Supervision, DRC (Mask), DRC (Depth)

Coarse Resolution

• Problem: Volumetric prediction at low resolution 32³

• A surface is only 2D, not 3D

– Predict coefficients of parametric shape model [Dibra et al. 3DV 2016]

– Predict volume hierarchically around surface [Häne et al. 2017]
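
A small arithmetic sketch shows why dense prediction is limited to coarse grids while the surface itself stays comparatively cheap. It counts all voxels of an n³ grid and, as a stand-in for a surface, the voxels in the outer one-voxel shell of the grid. The shell is an assumption chosen only because it is easy to count; real object surfaces differ, but they share the roughly quadratic growth.

```python
def dense_voxels(n):
    """Number of voxels in a full n x n x n grid."""
    return n ** 3


def shell_voxels(n):
    """Voxels in the outer one-voxel-thick shell of an n^3 grid."""
    return n ** 3 - max(n - 2, 0) ** 3


for n in (16, 32, 64, 128, 256):
    dense = dense_voxels(n)
    shell = shell_voxels(n)
    print(n, dense, shell, round(shell / dense, 3))
```

Doubling the resolution multiplies the dense count by eight but the shell count by only about four, so the fraction of voxels near the surface shrinks as the resolution grows.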

Predict Morphable Shape Model [Dibra et al. 3DV 2016]

• Needs known shape model

Hierarchical Surface Prediction [Häne, Tulsiani, Malik 2017]

• Our approach: Hierarchical Surface Prediction (HSP)

– Three label prediction (free/occupied/boundary)

– Higher resolution only on boundary (octree)
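
The three-label idea can be sketched by deriving coarse labels from a high-resolution binary grid: a coarse cell is free if all of its children are empty, occupied if all are full, and boundary otherwise. Only boundary cells would then be refined at the next level. This is one plausible way to construct such labels, stated as an assumption; the names `coarse_labels` and `count_label` are hypothetical and the sketch does not reproduce the network itself.

```python
FREE, OCCUPIED, BOUNDARY = 0, 1, 2


def coarse_labels(grid, factor):
    """Label each factor^3 block of a cubic binary grid[z][y][x]."""
    n = len(grid)
    m = n // factor
    labels = [[[FREE] * m for _ in range(m)] for _ in range(m)]
    for cz in range(m):
        for cy in range(m):
            for cx in range(m):
                filled = sum(
                    grid[cz * factor + dz][cy * factor + dy][cx * factor + dx]
                    for dz in range(factor)
                    for dy in range(factor)
                    for dx in range(factor)
                )
                if filled == 0:
                    labels[cz][cy][cx] = FREE
                elif filled == factor ** 3:
                    labels[cz][cy][cx] = OCCUPIED
                else:
                    labels[cz][cy][cx] = BOUNDARY
    return labels


def count_label(labels, label):
    """Count how many coarse cells carry the given label."""
    return sum(cell == label for plane in labels for row in plane for cell in row)


# A 4^3 grid in which only the slab x == 0 is occupied.
slab = [[[1 if x < 1 else 0 for x in range(4)] for _ in range(4)] for _ in range(4)]
slab_labels = coarse_labels(slab, factor=2)
print(count_label(slab_labels, BOUNDARY), count_label(slab_labels, FREE))
```

In this toy case half of the coarse cells are boundary cells and would be subdivided; the remaining free cells need no further resolution.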

Input 16³ 32³ 64³ 128³ 256³


• Network architecture


• Two baselines

– LR Soft

– LR Hard


Intersection over Union
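
For reference, intersection over union between a predicted and a ground-truth voxel grid can be computed as in this minimal sketch. It assumes both grids are given as flat sequences of equal length, that predictions are occupancy probabilities binarized at a threshold of 0.5, and that two empty grids score 1.0. These choices are illustrative, and differences in exactly such details are one reason numbers are hard to compare between papers.

```python
def voxel_iou(pred, gt, threshold=0.5):
    """IoU between thresholded predictions and a binary ground-truth grid."""
    intersection = 0
    union = 0
    for p, g in zip(pred, gt):
        p_occupied = p >= threshold
        g_occupied = g >= 0.5
        if p_occupied and g_occupied:
            intersection += 1
        if p_occupied or g_occupied:
            union += 1
    # Assumed convention: two empty grids count as a perfect match.
    return intersection / union if union else 1.0


predicted = [0.9, 0.8, 0.1, 0.6]
ground_truth = [1, 0, 0, 1]
print(voxel_iou(predicted, ground_truth))
```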


• Smooth high res surfaces

• Quantitatively more accurate than low res

• Various object categories

Conclusion

• Geometry prediction with ConvNets is a recent approach

• Difficult to compare methods

• Weak supervision

• Hierarchical prediction
