deep hough voting - github pages · learning on graphs and manifolds (shameless plug) fmnet,...

52
Deep Hough Voting 3D Object Detection in Point Clouds Or Litany FAIR / Stanford In collaboration with: Charles Qi, Kaiming He and Leonidas Guibas 1

Upload: others

Post on 31-May-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough Voting3D Object Detection in Point Clouds

Or LitanyFAIR / Stanford

In collaboration with: Charles Qi, Kaiming He and Leonidas Guibas

1

Page 2: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D is a natural representation of the world

Agarwal et. al., ICCV'09 2

Page 3: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D consumer market

3

Page 4: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Data driven tools for 3D

4

Page 5: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D representations

5

Page 6: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Learning on graphs and manifolds (shameless plug)

self-supervised, CVPR'19Shape completion, CVPR'18FMNet, ICCV'17

What if the graph (connectivity) is unknown?

6

Page 7: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough Voting: 3D Object Detection in Point Clouds

7

Page 8: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

What is 3D object detection?Generally: To localize and recognize objects in a 3D scene.

8

Page 9: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

What is 3D object detection?Generally: To localize and recognize objects in a 3D scene.

Specifically in literature: Estimate amodal, oriented 3D bounding boxes and semantic classes of objects from 3D point clouds or RGB-D data.

9

Page 10: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

What is 3D object detection?Generally: To localize and recognize objects in a 3D scene.

Specifically in literature: Estimate amodal, oriented 3D bounding boxes and semantic classes of objects from 3D point clouds or RGB-D data.

Applications:

- Augmented reality.- Robotics.- Autonomous driving.

10

Page 11: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

What is 3D object detection?

KITTI

SUN RGB-D

11

Page 12: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

What is 3D object detection?

Figure by Yin et al. (VoxelNet)

12

Page 13: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

What is 3D object detection?

Figure by Lahoud et al. (2D-driven 3D object detection)

13

Page 14: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D Vs. 2D Object Detection3D input: point clouds from Lidar, RGB-D, reconstructed meshes.

+ Accurate 3D geometry (depth and scale)+ Robust to illumination- Sparse and irregular (doesn’t fit with CNNs).- Centroid can be far from surface points.

14

Page 15: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D Vs. 2D Object Detection3D input: point clouds from Lidar, RGB-D, reconstructed meshes.

3D output: Amodal 3D oriented bounding boxes with semantic classes

3D box parameterization:

Usually we only consider 1D rotation around the up-axis. 15

Page 16: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Evaluation metricAverage Precision (AP) with a 3D Intersection over Union (IoU) threshold.

90% correct in each dimension, perfect angle:0.9^3 = 0.73!

Figure from the SUMO report. 16

Page 17: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Key research problems● 3D object proposal (challenges: large search space, varying sizes and

orientations)

● How to use image (high resolution, rich semantics, 2D geometry) and 3D (low resolution, accurate 3D geometry)

● How to represent “objects”: bounding boxes (2D,3D,oriented,amodal), instance masks, others (convex hulls,voxels,meshes,primitives,…)

17

Page 18: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D object proposal: Current methods’ limitations

208

208100

3D CNN detector

● High computation cost.● Search in empty space (no use

of sparsity in point clouds).

18

Page 19: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D object proposal: Current methods’ limitations

208

208100

3D CNN detector

● High computation cost.● Search in empty space (no use

of sparsity in point clouds).

Bird’s eye view detector

● Restricted to certain types of scenes (e.g. driving).

● Essentially a 2D detector.

19

Page 20: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D object proposal: Current methods’ limitations

208

208100

3D CNN detector

● High computation cost.● Search in empty space (no use

of sparsity in point clouds).

Bird’s eye view detector

Frustum-based detector

● Restricted to certain types of scenes (e.g. driving).

● Essentially a 2D detector.

● Hard dependence on 2D detectors.

20

Page 21: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D object proposal: What we want- Generic: no assumption on canonical viewpoint as in bird’s eye view

detectors.

- 3D-based: no hard dependence on 2D images as in frustum-based detectors.

- Efficient: no brute-force search in the entire 3D space as in 3D CNNs.

Leverage the sparsity in point clouds.

21

Page 22: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Simple point cloud based solution: Direct prediction

PointNet++, Qi et. al. 22

Page 23: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Simple point cloud based solution: Direct prediction- Predict directly from existing points

- Challenge: Existing points can be very far from object centers.

23

Page 24: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

3D object proposal:A return of hough voting!

24

Page 25: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance

25

Page 26: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance

26

Page 27: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance

27

Page 28: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance

28

Page 29: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance- Votes clustering to find peaks

29

Page 30: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance- Votes clustering to find peaks- Find patches that voted for the peaks

back-projection

30

Page 31: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

Hough voting pipeline (in 2D):- Select interest points- Match patch around each interest point

to a training patch (codebook)- Vote for object center given that training

instance- Votes clustering to find peaks- Find patches that voted for the peaks

back-projection- Find full objects based on

back-projected patches

31

Page 32: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Hough voting detector recap

From U. Toronto CSC420

+ Suitable for sparse data: computation is only on “interest” points

+ Long-range and non-uniform context aggregation

- Not end-to-end optimizable

32

Page 33: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

33

Page 34: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

34

Page 35: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

35

Page 36: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

36

Page 37: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

37

Page 38: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

38

Page 39: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

39

- Votes are "virtual points": same structure, better location

Page 40: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Deep Hough voting:

40

- Votes are "virtual points": same structure, better location- Aggregation instead of back-tracing:

- Learn to filter- Predict more than just location: pose, class, etc.- Amodal proposals

Back-trace Learn to aggregate

Page 41: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Results

Single RGB-D imagesEval on 10 classes.5k/5k train/test.amodal

Reconstructed scenes. Eval on 18 classes.1.2k/302train/valNot amodal, no pose.

SUN RGB-D ScanNet

41

Page 42: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

SUN RGB-D42

Page 43: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

SUN RGB-D43

Page 44: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

ScanNet

VotingNet Prediction Ground truth

44

Page 45: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

SUN RGB-D

ScanNet

45

Page 46: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

To vote or not to vote?

46

Page 47: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

When does voting helps the most?

47

Page 48: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Aggregation is key

48

Page 49: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Proposal quality and runtimes

49

Page 50: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Summary● Hough voting is back

○ Effective 3D object detection in point clouds with state-of-the-art performance on real 3D scans

50

Page 51: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Summary

● Hough voting is back○ Effective 3D object detection in point clouds with state-of-the-art performance on real 3D

scans○ Improved context aggregation: low dimensional attention, online graph construction

● Future directions:○ Adding color images (semantics and geometry cues)○ Downstream tasks: extending the system to semantic / instance segmentation○ Other use-cases suitable for voting

51

Page 52: Deep Hough Voting - GitHub Pages · Learning on graphs and manifolds (shameless plug) FMNet, ICCV'17 Shape completion, CVPR'18 self-supervised, CVPR'19 What if the graph (connectivity)

Thanks!orlitany.github.io

52