Deep Learning for Visual Analysis

Jiwen Lu

Department of Automation, Tsinghua University, China

http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/



Outline

• Introduction

• Deep Metric Learning for Visual Analysis

• Multi-Modal Deep Learning for Visual Analysis

• Deep Auto-Encoder Networks for Visual Analysis

• Conclusions and Future Work

Visual Analysis

• Visual Recognition

• Visual Tracking

• Visual Search

• Visual Parsing

Deep Learning

• Convolutional Neural Networks

• Deep Auto-Encoder Networks

Deep Metric Learning: Face Verification

Final representation:

The distance of a pair is:

Illustration at the top layer

[1] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Discriminative deep metric learning for face verification in the wild, CVPR, pp. 1875-1882, 2014.
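
The final representation and the pair distance named on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from [1]: it assumes a fully connected network with tanh activations whose top-layer output is the final representation, and a squared Euclidean distance between the representations of a pair. The layer sizes, the random initialisation, and the threshold `tau` are hypothetical.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (illustrative init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def forward(x, layers):
    """Map x to its top-layer representation, applying h = tanh(W h_prev + b) per layer."""
    h = x
    for weights, biases in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return h

def pair_distance(x1, x2, layers):
    """Squared Euclidean distance between the top-layer representations of a pair."""
    f1, f2 = forward(x1, layers), forward(x2, layers)
    return sum((a - b) ** 2 for a, b in zip(f1, f2))

rng = random.Random(0)
layers = [make_layer(8, 6, rng), make_layer(6, 4, rng)]  # hypothetical 8-6-4 network
x_a = [rng.random() for _ in range(8)]
x_b = [rng.random() for _ in range(8)]

d = pair_distance(x_a, x_b, layers)
tau = 1.0                 # hypothetical decision threshold
same_person = d < tau     # verification decision for this pair
```

With untrained weights the decision is meaningless; the sketch only shows where the representation and the distance sit in the pipeline. Training would adjust the weights so that genuine pairs fall below the threshold and impostor pairs above it.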

Comparison with the State of the Art

Verification rate (%) of different methods.

ROC curves of different methods.
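
For readers unfamiliar with the two measures, the sketch below shows one common way to obtain a verification rate and ROC points from pair distances. The distances, labels, and threshold are made-up toy values, and the protocol (accept a pair when its distance is at or below a threshold) is an assumption rather than the exact evaluation procedure behind the reported numbers.

```python
def roc_points(distances, labels):
    """Sweep a threshold over the pair distances.

    A pair is accepted as "same identity" when its distance is <= threshold.
    labels[i] is True for a genuine pair. Returns (false positive rate,
    true positive rate) for each distinct threshold, in increasing order.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(distances)):
        tp = sum(1 for d, y in zip(distances, labels) if d <= t and y)
        fp = sum(1 for d, y in zip(distances, labels) if d <= t and not y)
        points.append((fp / n_neg, tp / n_pos))
    return points

def verification_rate(distances, labels, threshold):
    """Percentage of pairs whose accept/reject decision matches the label."""
    correct = sum(1 for d, y in zip(distances, labels) if (d <= threshold) == y)
    return 100.0 * correct / len(labels)

# Toy data: six pairs with hypothetical distances and ground-truth labels.
distances = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1]
labels = [True, True, False, True, False, False]

points = roc_points(distances, labels)
rate = verification_rate(distances, labels, threshold=0.6)
```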


Person Re-identification

Basic idea of the proposed deep transfer metric learning (DTML) method.

[2] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep transfer metric learning, CVPR, pp. 325-333, 2015.

Cross-Dataset Person Re-identification

Feature representation: LBP and color histogram

Datasets: VIPeR, i-LIDS, CAVIAR, 3DPeS
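
A minimal sketch of the two hand-crafted descriptors named above, assuming the basic 8-neighbour LBP operator and a joint RGB histogram. The bin count, the neighbour ordering, and the plain concatenation of the two histograms are illustrative choices, not the settings used in [2].

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code of pixel (r, c).

    Bit k is set when neighbour k is >= the centre pixel.
    """
    centre = img[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for k, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << k
    return code

def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist]

def color_histogram(pixels, bins=4):
    """Normalised joint RGB histogram with bins**3 cells; channel values are 0..255."""
    hist = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    return [h / len(pixels) for h in hist]

# Toy inputs: a 3x3 grey-level patch and four RGB pixels.
gray = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]

# Concatenate texture and colour histograms into one feature vector.
feature = lbp_histogram(gray) + color_histogram(pixels)
```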


Deep Metric Learning: Cross-Dataset Person Re-identification

Top r matched results of different methods on the VIPeR dataset
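
Top-r matching results in person re-identification are usually reported as rank-r matching rates (a cumulative match characteristic curve). The sketch below shows that computation on a made-up distance matrix. It assumes that probe i has exactly one true match, stored at gallery index i; this is a simplification for illustration.

```python
def cmc_curve(dist, max_rank):
    """Rank-r matching rates for r = 1..max_rank.

    dist[i][j] is the distance between probe i and gallery item j; the true
    match of probe i is assumed to be gallery item i.
    """
    n = len(dist)
    hits = [0] * max_rank
    for i, row in enumerate(dist):
        # 0-based rank of the true match = number of gallery items strictly closer.
        rank = sum(1 for d in row if d < row[i])
        for r in range(rank, max_rank):
            hits[r] += 1
    return [h / n for h in hits]

# Toy 3x3 distance matrix (hypothetical values).
dist = [
    [0.2, 0.9, 0.5],
    [0.4, 0.6, 0.3],
    [0.8, 0.1, 0.7],
]
rates = cmc_curve(dist, max_rank=3)  # rates[r - 1] is the rank-r matching rate
```

Here one of the three probes is matched correctly at rank 1, two within rank 2, and all three within rank 3.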

Visual Tracking

Main procedure of our proposed deep metric learning (DML) tracker.

[3] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep metric learning for visual tracking, IEEE Trans. on Circuits and Systems for Video Technology, 2016, accepted.

Quantitative Evaluation


Deep Metric Learning: Qualitative Evaluation

The sequences (CarDark, Singer1, and Skating1) with illumination and scale changes.

The sequences (David3, Walking2, FaceOcc1, and Liquor) with heavy occlusion.

The sequences (Jumping, CarScale, Ironman, and Soccer) with fast motion and motion blur.

Deep Metric Learning: Image Set Classification

Our proposed multi-manifold deep metric learning framework.

[4] Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, and Jie Zhou, Multi-manifold deep metric learning for image set classification, CVPR, pp. 1137-1145, 2015.

Average classification rates of different methods on different datasets

Deep Metric Learning: Scalable Visual Search

Main procedure of our proposed deep hashing.

[5] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, and Jie Zhou, Deep hashing for compact binary codes learning, CVPR, pp. 2475-2483, 2015.
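
The encode-and-search step of hashing-based retrieval can be sketched as follows. This is a simplified illustration, not the method of [5]: it assumes a small tanh network whose top-layer activations are binarised by their sign, uses untrained random weights, and ranks database items by Hamming distance. All sizes and names are hypothetical, and the training objective is not shown.

```python
import math
import random

def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer (illustrative init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def hash_code(x, layers):
    """Binarise the top-layer activations: bit = 1 where the activation is positive."""
    h = x
    for weights, biases in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return [1 if v > 0 else 0 for v in h]

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(u != v for u, v in zip(a, b))

def search(query_code, database_codes, top_k):
    """Indices of the top_k database codes closest to the query in Hamming distance."""
    order = sorted(range(len(database_codes)),
                   key=lambda i: hamming(query_code, database_codes[i]))
    return order[:top_k]

rng = random.Random(0)
layers = [make_layer(32, 24, rng), make_layer(24, 16, rng)]  # 16-bit codes
database = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(50)]
codes = [hash_code(x, layers) for x in database]

# Query: a slightly perturbed copy of database item 7.
query = [v + 0.01 * rng.gauss(0, 1) for v in database[7]]
nearest = search(hash_code(query, layers), codes, top_k=5)
```

Because the codes are short bit vectors, comparing a query against the whole database needs only bit comparisons, which is what makes this kind of search scalable.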

Results on CIFAR.

Deep Metric Learning: Image Matching

Main procedure of our proposed approach.

[6] Kevin Lin, Jiwen Lu, Chu-Song Chen, and Jie Zhou, Learning compact binary descriptors with unsupervised deep neural networks, CVPR, pp. 1183-1192, 2016.

Multi-Modal Deep Learning

Our proposed multi-modal deep metric learning framework.

[7] Anran Wang, Jiwen Lu, Jianfei Cai, Tat-Jen Cham, and Gang Wang, Large-margin multi-modal deep learning for RGB-D object recognition, IEEE Trans. on Multimedia, vol. 17, no. 11, pp. 1887-1898, 2015.


Object Recognition

RGB-D object dataset:

• 51 classes and 207,920 RGB-D image frames, with roughly 600 images per object.

• 10-fold split: for each split, one object from each class was sampled, resulting in 51 test objects.

• After subsampling every 5th frame from the videos, there were about 34,000 images for training and 6,900 images for testing.
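
The split protocol described for the RGB-D object dataset can be sketched as follows. The nested-dictionary layout, the function name `make_split`, and the toy data are assumptions for illustration; only the two rules (hold out one object per class, keep every 5th frame) come from the description above.

```python
import random

def make_split(objects_by_class, rng, frame_step=5):
    """Build one train/test split.

    objects_by_class maps class name -> {object id -> list of frames}.
    One randomly chosen object per class is held out for testing, and only
    every frame_step-th frame of each object's video is kept.
    """
    train, test = [], []
    for cls, objects in sorted(objects_by_class.items()):
        held_out = rng.choice(sorted(objects))
        for obj_id, frames in sorted(objects.items()):
            target = test if obj_id == held_out else train
            target.extend((cls, obj_id, f) for f in frames[::frame_step])
    return train, test

# Toy data: 2 classes, 3 objects per class, 20 frames per object.
toy = {
    cls: {f"{cls}_{k}": list(range(20)) for k in range(3)}
    for cls in ("apple", "mug")
}

# Ten independent splits, one per seed.
splits = [make_split(toy, random.Random(seed)) for seed in range(10)]
train, test = splits[0]
```

In the toy data each object keeps 4 of its 20 frames, so every split has 16 training samples and 8 test samples, with one held-out object per class.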

[8] Anran Wang, Jianfei Cai, Jiwen Lu, and Tat-Jen Cham, MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, ICCV, pp. 1125-1133, 2015.

Object Recognition

RGB-D object dataset:

Deep Auto-Encoder Networks: Face Recognition

[9] Jiwen Lu, Venice Erin Liong, Gang Wang, and Pierre Moulin, Joint feature learning for face recognition, IEEE Trans. on Information Forensics and Security, vol. 10, no. 7, pp. 1371-1383, 2015.

Comparison with the State of the Art

Verification rate (%) of different methods.

ROC curves of different methods.

Deep Auto-Encoder Networks: Face Alignment

[10] Renliang Weng, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Learning cascaded deep auto-encoder networks for face alignment, IEEE Trans. on Multimedia, 2016.

Deep Auto-Encoder Networks: Scene Labeling

[11] Anran Wang, Jiwen Lu, Gang Wang, Jianfei Cai, and Tat-Jen Cham, Multi-modal unsupervised feature learning for RGB-D scene labeling, ECCV, pp. 453-467, 2014.

Summary and Future Work

• Deep learning is effective for many visual analysis tasks, including visual recognition, visual tracking, visual search, and visual parsing.

• Theoretical analysis of deep learning is still needed to explain how it improves these visual analysis tasks.
