Deep Learning for Visual Analysis
Department of Automation, Tsinghua University, China
http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/
Jiwen Lu
Outline
• Introduction
• Deep Metric Learning for Visual Analysis
• Multi-Modal Deep Learning for Visual Analysis
• Deep Auto-Encoder Networks for Visual Analysis
• Conclusions and Future Work
Visual Analysis: Visual Recognition
Visual Analysis: Visual Tracking
Visual Analysis: Visual Search
Visual Analysis: Visual Parsing
Deep Learning: Convolutional Neural Networks
Deep Learning: Deep Auto-Encoder Networks
Deep Metric Learning
Final representation:
The distance of a pair is:
Illustration at the top layer
[1] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Discriminative deep metric learning for face verification in the wild, CVPR, pp. 1875-1882, 2014.
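The two quantities named on this slide can be illustrated with a minimal sketch. It assumes the usual formulation of [1]: each layer computes h^(m) = s(W^(m) h^(m-1) + b^(m)), the final representation is the top-layer output f(x) = h^(M), and the distance of a pair is the squared Euclidean distance between the two final representations. The tanh activation, the layer sizes, and the toy weights below are assumptions chosen for illustration, not values from the talk.

```python
import math

def forward(x, layers):
    """Return the top-layer representation f(x) = h^(M) of input x."""
    h = x
    for W, b in layers:
        # One layer: h <- tanh(W h + b), with W given as a list of rows.
        h = [math.tanh(sum(w * v for w, v in zip(row, h)) + bias)
             for row, bias in zip(W, b)]
    return h

def pair_distance(x_i, x_j, layers):
    """Squared Euclidean distance between f(x_i) and f(x_j)."""
    f_i, f_j = forward(x_i, layers), forward(x_j, layers)
    return sum((a - c) ** 2 for a, c in zip(f_i, f_j))

# Hypothetical two-layer network (3 -> 2 -> 2) with made-up weights.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
]

d_same = pair_distance([1.0, 0.0, 2.0], [1.0, 0.0, 2.0], layers)
d_diff = pair_distance([1.0, 0.0, 2.0], [-1.0, 2.0, 0.0], layers)
```

In a verification setting, a pair would be accepted as the same identity when this distance falls below a learned threshold; training the weights is outside the scope of this sketch.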
Face Verification
Comparison with the State of the Art
Verification rate (%) of different methods.
ROC curves of different methods.
Deep Metric Learning
Basic idea of the proposed DTML method.
Deep Metric Learning
[2] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep transfer metric learning, CVPR, pp. 325-333, 2015.
Person Re-identification
Cross-Dataset Person Re-identification
Feature representation: LBP and color histogram
Datasets: VIPeR, i-LIDS, CAVIAR, 3DPeS
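The feature representation above (LBP and color histogram) can be sketched roughly as follows. This uses the basic 3x3 LBP operator and a per-channel color histogram; the number of bins, the neighbor ordering, and the simple concatenation are assumptions, since the slide does not give the exact configuration used in the experiments.

```python
def lbp_code(gray, r, c):
    """8-bit LBP code at (r, c): one bit per neighbor that is >= the center."""
    center = gray[r][c]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if gray[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(gray):
    """Normalized 256-bin histogram of LBP codes over interior pixels."""
    hist = [0] * 256
    for r in range(1, len(gray) - 1):
        for c in range(1, len(gray[0]) - 1):
            hist[lbp_code(gray, r, c)] += 1
    total = sum(hist) or 1
    return [h / total for h in hist]

def color_histogram(pixels, bins=8):
    """Concatenated per-channel histograms of (r, g, b) pixels in 0..255."""
    hist = [0] * (3 * bins)
    for px in pixels:
        for ch, value in enumerate(px):
            hist[ch * bins + min(value * bins // 256, bins - 1)] += 1
    n = len(pixels) or 1
    return [h / n for h in hist]

# Toy inputs: a 3x3 grayscale patch and two RGB pixels.
gray = [[10, 10, 10], [10, 5, 10], [10, 10, 10]]
pixels = [(255, 0, 0), (0, 255, 0)]
feature = lbp_histogram(gray) + color_histogram(pixels)
```

The concatenated vector plays the role of the hand-crafted input feature that the metric learning network then maps to a new space.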
Deep Metric Learning
Top r matched results of different methods on the VIPeR dataset.
Deep Metric Learning: Cross-Dataset Person Re-identification
Main procedure of our proposed DML tracker.
Deep Metric Learning
[3] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep metric learning for visual tracking, IEEE Trans. on Circuits and Systems for Video Technology, 2016, accepted.
Visual Tracking
Quantitative Evaluation
Deep Metric Learning
The sequences (CarDark, Singer1, and Skating1) with illumination and scale changes.
Deep Metric Learning: Qualitative Evaluation
The sequences (David3, Walking2, FaceOcc1, and Liquor) with heavy occlusion.
Deep Metric Learning: Qualitative Evaluation
The sequences (Jumping, CarScale, Ironman, and Soccer) with fast motion and motion blur.
Deep Metric Learning: Qualitative Evaluation
Deep Metric Learning: Image Set Classification
Our proposed multi-manifold deep metric learning framework.
[4] Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, and Jie Zhou, Multi-manifold deep metric learning for image set classification, CVPR, pp. 1137-1145, 2015.
Average classification rates of different methods on different datasets
Deep Metric Learning: Image Set Classification
Deep Metric Learning: Scalable Visual Search
Main procedure of our proposed deep hashing.
[5] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, and Jie Zhou, Deep hashing for compact binary codes learning, CVPR, pp. 2475-2483, 2015.
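To make the search step concrete, here is a minimal sketch of how compact binary codes are typically used once a network has been trained. It assumes the codes are obtained by taking the sign of the top-layer outputs and that retrieval ranks database items by Hamming distance. The real-valued vectors below are made-up stand-ins for network outputs, and the function names are hypothetical.

```python
def binarize(h):
    """Turn a real-valued top-layer output into a binary code via its sign."""
    return [1 if v >= 0 else 0 for v in h]

def hamming(a, b):
    """Number of positions in which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def search(query_code, database, k=2):
    """Return the indices of the k database codes closest to the query."""
    ranked = sorted(range(len(database)),
                    key=lambda i: hamming(query_code, database[i]))
    return ranked[:k]

# Made-up top-layer outputs for three database images and one query.
outputs = [
    [0.9, -0.2, 0.4, -0.7],
    [0.8, -0.1, -0.3, -0.6],
    [-0.5, 0.6, -0.4, 0.9],
]
database = [binarize(h) for h in outputs]
query = binarize([0.7, -0.3, 0.5, -0.1])
nearest = search(query, database)
```

Because the codes are short bit vectors, the distance computation reduces to counting differing bits, which is what makes this kind of search scale to large databases.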
Results on CIFAR.
Deep Metric Learning: Scalable Visual Search
Deep Metric Learning: Image Matching
[6] Kevin Lin, Jiwen Lu, Chu-Song Chen, and Jie Zhou, Learning compact binary descriptors with unsupervised deep neural networks, CVPR, pp. 1183-1192, 2016.
Main procedure of our proposed approach.
Multi-Modal Deep Learning
Our proposed multi-modal deep metric learning framework.
[7] Anran Wang, Jiwen Lu, Jianfei Cai, Tat-Jen Cham, and Gang Wang, Large-margin multi-modal deep learning for RGB-D object recognition, IEEE Trans. on Multimedia, vol. 17, no. 11, pp. 1887-1898, 2015.
Multi-Modal Deep Learning
RGB-D object dataset:
• 51 classes and 207,920 RGB-D image frames, with roughly 600 images per object.
• 10-fold split: for each split, one object from each class was sampled, resulting in 51 test objects.
• After subsampling every 5th frame from the videos, there were about 34,000 images for training and 6,900 images for testing.
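The evaluation protocol described above can be sketched as follows: hold out one object per class for testing, and keep every 5th frame of each video. As a rough consistency check, 207,920 / 5 is about 41,600 frames, which is close to the stated 34,000 + 6,900 = 40,900 images. The tuple layout, the function name, and the random seed are assumptions for illustration; the toy data stands in for the real dataset.

```python
import random

def make_split(frames, seed=0, step=5):
    """Build one train/test split.

    frames: list of (class_name, object_id, frame_index) tuples.
    One object per class is held out for testing; only every
    `step`-th frame is kept.
    """
    rng = random.Random(seed)
    objects_by_class = {}
    for cls, obj, _ in frames:
        objects_by_class.setdefault(cls, set()).add(obj)
    # Pick one test object per class.
    held_out = {cls: rng.choice(sorted(objs))
                for cls, objs in objects_by_class.items()}
    train, test = [], []
    for cls, obj, idx in frames:
        if idx % step != 0:
            continue  # subsample every `step`-th frame
        (test if held_out[cls] == obj else train).append((cls, obj, idx))
    return train, test, held_out

# Toy data: 2 classes, 3 objects per class, 20 frames per object.
frames = [(cls, obj, i)
          for cls in ("apple", "mug")
          for obj in (1, 2, 3)
          for i in range(20)]
train, test, held_out = make_split(frames)
```

Repeating the call with ten different seeds would give ten splits in the spirit of the 10-fold protocol, with 51 held-out objects per split on the full dataset.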
Object Recognition
Multi-Modal Deep Learning
[8] Anran Wang, Jianfei Cai, Jiwen Lu, and Tat-Jen Cham, MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, ICCV, pp. 1125-1133, 2015.
Multi-Modal Deep Learning
RGB-D object dataset:
Object Recognition
Deep Auto-Encoder Networks: Face Recognition
[9] Jiwen Lu, Venice Erin Liong, Gang Wang, and Pierre Moulin, Joint feature learning for face recognition, IEEE Trans. on Information Forensics and Security, vol. 10, no. 7, pp. 1371-1383, 2015.
Deep Auto-Encoder Networks: Face Recognition
Comparison with the State of the Art
Verification rate (%) of different methods.
ROC curves of different methods.
Deep Auto-Encoder Networks: Face Alignment
[10] Renliang Weng, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Learning cascaded deep auto-encoder networks for face alignment, IEEE Trans. on Multimedia, 2016.
Deep Auto-Encoder Networks: Scene Labeling
[11] Anran Wang, Jiwen Lu, Gang Wang, Jianfei Cai, and Tat-Jen Cham, Multi-modal unsupervised feature learning for RGB-D scene labeling, ECCV, pp. 453-467, 2014.
Summary and Future Work
• Deep learning is effective for many visual analysis tasks, including visual recognition, visual tracking, visual search, and visual parsing.
• Theoretical analysis of deep learning is required to show how it improves various visual analysis tasks.