deep learning for visual...

Deep Learning for Visual Analysis

Department of Automation, Tsinghua University, China

http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/

Jiwen Lu

Outline

• Introduction

• Deep Metric Learning for Visual Analysis

• Multi-Modal Deep Learning for Visual Analysis

• Deep Auto-Encoder Networks for Visual Analysis

• Conclusions and Future Works

Visual Analysis Visual Recognition

Visual Tracking

Visual Analysis

Visual Search

Visual Analysis

Visual Parsing

Visual Analysis

Deep Learning Convolutional Neural Networks

Deep Learning Deep Auto-Encoder Networks

Deep Metric Learning

Final representation:

The distance of a pair is:

Illustration at the top layer

[1] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Discriminative deep metric learning for face verification in the wild, CVPR, pp. 1875-1882, 2014.

Face Verification

Comparison with State-of-the-Arts

Verification rate (%) of different methods.

ROC curves of different methods.

Basic idea of the proposed DTML method.

[2] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep transfer metric learning, CVPR, pp. 325-333, 2015.

Person Re-identification

Cross-Dataset Person Re-identification

Feature representation: LBP and color histogram

Datasets: VIPER, i-LIDS, CAVIAR, 3DPeS

13 13 Top r matched results of different methods on the VIPeR dataset

Deep Metric Learning Cross-Dataset Person Re-identification

Main procedure of our proposed DML tracker.

[3] Junlin Hu, Jiwen Lu, and Yap-Peng Tan, Deep metric learning for visual tracking, IEEE Trans. on Circuits and Systems for Video Technology, 2016, accepted.

Visual Tracking

Quantitative Evaluation

The sequences (CarDark, Singer1, and Skating1) with illumination and scale changes.

Deep Metric Learning Qualitative Evaluation

17 The sequences (David3, Walking2, FaceOcc1 and Liquor) with heavy occlusion.

18 The sequences (Jumping, CarScale, Ironman and Soccer) with fast motion and motion blur.

Deep Metric Learning Image Set Classification

Our proposed multi-manifold deep metric learning framework.

[4] Jiwen Lu, Gang Wang, Weihong Deng, Pierre Moulin, and Jie Zhou, Multi-manifold deep metric learning for image set classification, CVPR, pp. 1137-1145, 2015.

Average classification rates of different methods on different datasets

Deep Metric Learning Image Set Classification

Deep Metric Learning Scalable Visual Search

Main procedure of our proposed deep hashing. [5] Venice Erin Liong, Jiwen Lu, Gang Wang, Pierre Moulin, and Jie Zhou, Deep hashing for compact binary

codes learning, CVPR, pp. 2475-2483, 2015.

Results on CIFAR.

Deep Metric Learning Scalable Visual Search

Deep Metric Learning Image Matching

[6] Kevin Lin, Jiwen Lu, Chu-Song Chen, and Jie Zhou, Learning compact binary descriptors with unsupervised deep neural networks, CVPR, pp. 1183-1192, 2016.

Main procedure of our proposed approach.

Deep Metric Learning Image Matching

Multi-Modal Deep Learning

Our proposed multi-modal deep metric learning framework.

[7] Anran Wang, Jiwen Lu, Jianfei Cai, Tat-Jen Cham, and Gang Wang, Large-margin multi-modal deep learning for RGB-D object recognition, IEEE Trans. on Multimedia, vol. 17, no. 11, pp. 1887-1898, 2015.

Our proposed multi-modal deep metric learning framework.

RGB-D object dataset:

51 classes, 207,920 RGB-D image frames, with roughly 600 images per object

10-fold split. For each split, one object from each class was sampled, resulting

in 51 test objects. After subsampling every 5th frame from the videos, there

were some 34,000 images for training and 6900 images for testing

Object Recognition

[8] Anran Wang, Jianfei Cai, Jiwen Lu, and Tat-Jen Cham, MMSS: Multi-modal sharable and specific feature learning for RGB-D object recognition, ICCV, pp. 1125-1133, 2015.

RGB-D object dataset:

Object Recognition

Deep Auto-Encoder Networks Face Recognition

[9] Jiwen Lu, Venice Erin Liong, Gang Wang, and Pierre Moulin, Joint feature learning for face recognition, IEEE Trans. on Information Forensics and Security, vol. 10, no. 7, pp. 1371-1383, 2015.

Deep Auto-Encoder Networks Face Recognition

Comparison with State-of-the-Arts

Verification rate (%) of different methods.

ROC curves of different methods.

Deep Auto-Encoder Networks Face Alignment

[10] Renliang Weng, Jiwen Lu, Yap-Peng Tan, and Jie Zhou, Learning cascaded deep auto-encoder networks for face alignment, IEEE Trans. on Multimedia, 2016.

Deep Auto-Encoder Networks Face Alignment

Deep Auto-Encoder Networks Scene Labeling

[11] Anran Wang, Jiwen Lu, Gang Wang, Jianfei Cai, and Tat-Jen Cham, Multi-modal unsupervised feature learning for RGB-D scene labeling, ECCV, pp. 453-467, 2014.

Deep Auto-Encoder Networks Scene Labeling

• Deep learning is effective for many visual analysis tasks

including visual recognition, visual tracking, visual search,

and visual parsing.

• Theoretical analysis for deep learning is required to show

how it improves various visual analysis tasks.

Summary and Future Work

deep learning for visual...

Documents

learning hierarchical visual representations in deep

visual interpretability of deep learning algorithms in

deep learning for visual analysis

deep learning human mind for automated visual...

peretroukhin et al. dpc-net: deep pose correction for visual...

cs6501: deep learning for visual recognition neural...

selfvio: self-supervised deep monocular visual-inertial

鲁家峙的美那年夏天 -...

devise: a deep visual-semantic embedding model

visual perception through deep learning

deep learning for visual question answering · deep...

deep visual-semantic quantization for efficient...

recsal : deep recursive supervision for visual saliency

京鲁秋意menu-繁中 · 2020. 9. 4. · title:...

deep learning models of biological visual information...

grad-cam++: improved visual explanations for deep

deep learning - a visual introduction

deep learning in the visual cortex

sparse deep belief net models for visual area...

detecting visual relationships with deep relational...