

Page 1: Deep Learning for Visual Analysisimages.nvidia.com/cn/gtc/downloads/pdf/big-data/201.鲁继文THU_JiwenLu.pdf•Deep learning is effective for many visual analysis tasks including

Deep Learning for Visual Analysis

Department of Automation, Tsinghua University, China

http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/

Jiwen Lu

Outline

• Introduction

• Deep Metric Learning for Visual Analysis

• Multi-Modal Deep Learning for Visual Analysis

• Deep Auto-Encoder Networks for Visual Analysis

• Conclusions and Future Work

Visual Analysis: Visual Recognition

Visual Analysis: Visual Tracking

Visual Analysis: Visual Search

Visual Analysis: Visual Parsing

Deep Learning: Convolutional Neural Networks

Deep Learning: Deep Auto-Encoder Networks

Deep Metric Learning: Face Verification

Final representation:

The distance of a pair is:

Illustration at the top layer

[1] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Discriminative deep metric learning for face verification in the wild, CVPR, pp. 1875-1882, 2014.
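The formulas for the final representation and the pair distance are not reproduced in this transcript. As a minimal sketch, assuming the usual formulation for this kind of network (a stack of fully connected tanh layers whose top-layer output serves as the final representation, and a squared Euclidean distance between the two outputs of a pair), the computation might look like the following. The layer sizes, the random weights, and the decision threshold `tau` are illustrative assumptions, not values from the talk.

```python
import numpy as np

def forward(x, weights, biases):
    """Pass x through a stack of fully connected tanh layers; return the top-layer output."""
    h = x
    for W, b in zip(weights, biases):
        h = np.tanh(W @ h + b)
    return h

def pair_distance(x1, x2, weights, biases):
    """Squared Euclidean distance between the top-layer representations of a pair."""
    diff = forward(x1, weights, biases) - forward(x2, weights, biases)
    return float(diff @ diff)

def same_identity(x1, x2, weights, biases, tau):
    """Declare a pair 'same' when its learned distance falls below the threshold tau."""
    return pair_distance(x1, x2, weights, biases) < tau

# Hypothetical toy network: 8-d input, 6-d hidden layer, 4-d top layer.
rng = np.random.default_rng(0)
sizes = [8, 6, 4]
weights = [0.1 * rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
biases = [np.zeros(sizes[i + 1]) for i in range(len(sizes) - 1)]

x_a = rng.standard_normal(8)
x_b = rng.standard_normal(8)
d_same = pair_distance(x_a, x_a, weights, biases)  # identical inputs give distance 0
d_diff = pair_distance(x_a, x_b, weights, biases)
```

In a trained model the weights would be learned so that distances of matching pairs fall below the threshold and distances of non-matching pairs fall above it; the random weights here only exercise the forward computation.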

Deep Metric Learning

Comparison with the State of the Art

Verification rate (%) of different methods.

ROC curves of different methods.

Deep Metric Learning: Person Re-identification

Basic idea of the proposed deep transfer metric learning (DTML) method.

[2] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep transfer metric learning, CVPR, pp. 325-333, 2015.

Deep Metric Learning: Cross-Dataset Person Re-identification

Feature representation: LBP and color histogram

Datasets: VIPeR, i-LIDS, CAVIAR, 3DPeS
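The slide names the two hand-crafted descriptors but not their settings. The sketch below shows one simple way such a feature could be assembled: a per-channel color histogram concatenated with a histogram of basic 3x3 local binary pattern (LBP) codes. The bin count, the 8-neighbor LBP variant, and the random test patch are assumptions chosen for illustration; the experiments in the talk may have used different settings.

```python
import numpy as np

def color_histogram(img, bins=8):
    """Concatenate per-channel histograms of an HxWx3 uint8 image, L1-normalized."""
    hists = [np.histogram(img[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(hists).astype(np.float64)
    return h / h.sum()

def lbp_histogram(gray):
    """Histogram of basic 3x3 LBP codes (8 neighbors compared with the center pixel)."""
    g = gray.astype(np.int32)
    rows, cols = g.shape
    center = g[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = g[1 + dy:rows - 1 + dy, 1 + dx:cols - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    h = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return h / h.sum()

def describe(img):
    """Feature vector: color histogram followed by the LBP histogram of the gray image."""
    gray = img.mean(axis=2)
    return np.concatenate([color_histogram(img), lbp_histogram(gray)])

# Hypothetical 48x16 pedestrian patch filled with random pixels.
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(48, 16, 3), dtype=np.uint8)
feature = describe(patch)  # 3 * 8 color bins + 256 LBP bins
```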

Deep Metric Learning: Cross-Dataset Person Re-identification

Top-r matched results of different methods on the VIPeR dataset
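Top-r matching results in person re-identification are usually reported as a cumulative matching characteristic: the fraction of probe images whose true match appears among the r nearest gallery images. A minimal sketch of that computation follows, assuming a precomputed probe-by-gallery distance matrix and exactly one true match per probe. The function name `cmc` and the toy distances are illustrative and do not come from the talk.

```python
import numpy as np

def cmc(dist, probe_ids, gallery_ids, max_rank):
    """Rank-r matching rates for r = 1..max_rank, assuming one true match per probe."""
    order = np.argsort(dist, axis=1)           # gallery indices, nearest first
    ranked_ids = gallery_ids[order]            # identities in ranked order
    hits = ranked_ids == probe_ids[:, None]    # True where the correct identity sits
    first_hit = hits.argmax(axis=1)            # 0-based rank of the true match
    return np.array([(first_hit < r).mean() for r in range(1, max_rank + 1)])

# Toy example: 3 probes against 3 gallery images.
dist = np.array([[0.1, 0.9, 0.5],
                 [0.8, 0.7, 0.2],
                 [0.6, 0.4, 0.3]])
probe_ids = np.array([0, 1, 2])
gallery_ids = np.array([0, 1, 2])
rates = cmc(dist, probe_ids, gallery_ids, max_rank=3)
```

Here probes 0 and 2 find their match at rank 1 and probe 1 finds it at rank 2, so the rank-1 rate is 2/3 and the rates from rank 2 onward are 1.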

Deep Metric Learning: Visual Tracking

Main procedure of our proposed DML tracker.

[3] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep metric learning for visual tracking, IEEE Trans. on Circuits and Systems for Video Technology, 2016, accepted.

Deep Metric Learning: Quantitative Evaluation

Deep Metric Learning: Qualitative Evaluation

The sequences (CarDark, Singer1, and Skating1) with illumination and scale changes.

Deep Metric Learning: Qualitative Evaluation

The sequences (David3, Walking2, FaceOcc1, and Liquor) with heavy occlusion.

Deep Metric Learning: Qualitative Evaluation

The sequences (Jumping, CarScale, Ironman, and Soccer) with fast motion and motion blur.

Deep Metric Learning: Image Set Classification

Our proposed multi-manifold deep metric learning framework.

[4] Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, and Jie Zhou, Multi-manifold deep metric learning for image set classification, CVPR, pp. 1137-1145, 2015.

Deep Metric Learning: Image Set Classification

Average classification rates of different methods on different datasets

Deep Metric Learning: Scalable Visual Search

Main procedure of our proposed deep hashing.

[5] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, and Jie Zhou, Deep hashing for compact binary codes learning, CVPR, pp. 2475-2483, 2015.
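The core idea of hashing-based search can be sketched briefly: a network maps each sample to a real-valued vector, the vector is thresholded into a compact binary code, and retrieval ranks database items by Hamming distance to the query code. The sketch below assumes tanh layers and sign thresholding at the top layer. The layer sizes and random weights are placeholders, and the training objective that would make the codes discriminative is not shown.

```python
import numpy as np

def binary_code(x, weights, biases):
    """Map x through tanh layers and threshold the top layer at zero to get a 0/1 code."""
    h = x
    for W, b in zip(weights, biases):
        h = np.tanh(W @ h + b)
    return (h > 0).astype(np.uint8)

def hamming(a, b):
    """Number of positions in which two binary codes differ."""
    return int(np.count_nonzero(a != b))

def search(query_code, database_codes, k):
    """Indices of the k database codes closest to the query in Hamming distance."""
    d = np.count_nonzero(database_codes != query_code, axis=1)
    return np.argsort(d, kind="stable")[:k]

# Hypothetical setup: 32-d inputs hashed to 16-bit codes with untrained weights.
rng = np.random.default_rng(0)
sizes = [32, 24, 16]
weights = [rng.standard_normal((sizes[i + 1], sizes[i])) for i in range(len(sizes) - 1)]
biases = [np.zeros(sizes[i + 1]) for i in range(len(sizes) - 1)]

database = rng.standard_normal((5, 32))
db_codes = np.stack([binary_code(x, weights, biases) for x in database])
query_code = binary_code(database[3], weights, biases)
top = search(query_code, db_codes, k=2)  # the best hit has Hamming distance 0
```

Because codes are short bit vectors, the distance computation reduces to counting differing bits, which is what makes this kind of search scale to large databases.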

Deep Metric Learning: Scalable Visual Search

Results on CIFAR.

Deep Metric Learning: Image Matching

[6] Kevin Lin, Jiwen Lu, Chu-Song Chen, and Jie Zhou, Learning compact binary descriptors with unsupervised deep neural networks, CVPR, pp. 1183-1192, 2016.

Main procedure of our proposed approach.


Multi-Modal Deep Learning

Our proposed multi-modal deep metric learning framework.

[7] Anran Wang, Jiwen Lu, Jianfei Cai, Tat-Jen Cham, and Gang Wang, Large-margin multi-modal deep learning for RGB-D object recognition, IEEE Trans. on Multimedia, vol. 17, no. 11, pp. 1887-1898, 2015.


Multi-Modal Deep Learning: Object Recognition

RGB-D object dataset:

• 51 classes, 207,920 RGB-D image frames, with roughly 600 images per object

• 10-fold split. For each split, one object from each class was sampled, resulting in 51 test objects. After subsampling every 5th frame from the videos, there were some 34,000 images for training and 6,900 images for testing.
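As a quick check on the numbers, keeping every 5th of the 207,920 frames leaves 41,584 images, which is in line with the roughly 40,900 (34,000 + 6,900) reported. The split procedure itself can be sketched as follows: hold out one object per class for testing, train on the remaining objects, and subsample each video. The function `make_split`, the toy class names, and the object counts are hypothetical and only illustrate the protocol.

```python
import random

def make_split(objects_by_class, frames_by_object, step=5, seed=0):
    """Hold out one object per class for testing; keep every `step`-th frame of each video."""
    rng = random.Random(seed)
    train, test = [], []
    for cls, objects in objects_by_class.items():
        held_out = rng.choice(objects)              # the test object for this class
        for obj in objects:
            frames = frames_by_object[obj][::step]  # temporal subsampling
            target = test if obj == held_out else train
            target.extend((cls, obj, frame) for frame in frames)
    return train, test

# Toy data: 3 classes, 4 objects per class, 600 frames per object.
objects_by_class = {c: [f"{c}_{i}" for i in range(4)] for c in ["apple", "bowl", "cap"]}
frames_by_object = {o: list(range(600)) for objs in objects_by_class.values() for o in objs}

train, test = make_split(objects_by_class, frames_by_object)
test_objects = {obj for _, obj, _ in test}
train_objects = {obj for _, obj, _ in train}
```

Repeating the call with ten different seeds would give ten splits; holding out whole objects rather than individual frames keeps near-duplicate frames of the same object from appearing in both training and test sets.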

Multi-Modal Deep Learning

[8] Anran Wang, Jianfei Cai, Jiwen Lu, and Tat-Jen Cham, MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, ICCV, pp. 1125-1133, 2015.

Multi-Modal Deep Learning: Object Recognition

RGB-D object dataset:

Deep Auto-Encoder Networks: Face Recognition

[9] Jiwen Lu, Venice Erin Liong, Gang Wang, and Pierre Moulin, Joint feature learning for face recognition, IEEE Trans. on Information Forensics and Security, vol. 10, no. 7, pp. 1371-1383, 2015.


Deep Auto-Encoder Networks: Face Recognition

Comparison with the State of the Art

Verification rate (%) of different methods.

ROC curves of different methods.

Deep Auto-Encoder Networks: Face Alignment

[10] Renliang Weng, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Learning cascaded deep auto-encoder networks for face alignment, IEEE Trans. on Multimedia, 2016.


Deep Auto-Encoder Networks: Scene Labeling

[11] Anran Wang, Jiwen Lu, Gang Wang, Jianfei Cai, and Tat-Jen Cham, Multi-modal unsupervised feature learning for RGB-D scene labeling, ECCV, pp. 453-467, 2014.


Summary and Future Work

• Deep learning is effective for many visual analysis tasks, including visual recognition, visual tracking, visual search, and visual parsing.

• Theoretical analysis of deep learning is required to show how it improves various visual analysis tasks.