Layer-wise CNN Surgery for Visual Sentiment Prediction
Víctor Campos, Xavier Giró, Amaia Salvador, Brendan Jou
July 20th, 2015
Outline
1. Introduction
2. Related work
3. Methodology and results
4. Conclusions
5. Future work
Introduction: motivation
Introduction: problem definition
▷ What? Predict the sentiment that an image provokes in a human
▷ How? Using Convolutional Neural Networks (CNNs)
Introduction: example
Related work: low-level descriptors
Siersdorfer, S., Minack, E., Deng, F., & Hare, J. (2010, October). Analyzing and predicting sentiment of images on the social web. In Proceedings of the international conference on Multimedia (pp. 715-718). ACM.
Machajdik, J., & Hanbury, A. (2010, October). Affective image classification using features inspired by psychology and art theory. In Proceedings of the international conference on Multimedia (pp. 83-92). ACM.
Related work: SentiBank
Borth, D., Ji, R., Chen, T., Breuel, T., & Chang, S. F. (2013, October). Large-scale visual sentiment ontology and detectors using adjective noun pairs. In Proceedings of the 21st ACM international conference on Multimedia (pp. 223-232). ACM.
Related work: CNNs for sentiment prediction
You, Q., Luo, J., Jin, H., & Yang, J. (2015). Robust image sentiment analysis using progressively trained and domain transferred deep networks. In The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI).
Outline
1. Introduction
2. Related work
3. Methodology and results
   a. Convolutional Neural Networks
   b. Datasets
   c. Experimental setup and results
4. Conclusions
5. Future work
Convolutional Neural Networks
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (NIPS).
Datasets
                    Flickr                Twitter
Authors             Borth et al. (2013)   You et al. (2015)
Size                ~500k                 1269
Annotation method   Textual tags          5 human annotators
[Figure: Flickr dataset and Twitter dataset compared by size and quality of the annotations]
Experimental setup: 5-fold cross-validation
[Figure: the dataset is split into train and test folds; results are reported as mean ± std. dev.]
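A minimal sketch of the 5-fold protocol named on this slide, assuming a plain shuffled split without stratification (the slides do not say how the folds were drawn). The `evaluate` callback is a hypothetical stand-in for training and testing the CNN on one split, and the use of the sample standard deviation is also an assumption.

```python
import random
import statistics


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=5, seed=0):
    """Run k train/test rounds and return (mean, std. dev.) of the scores.

    `evaluate(train_idx, test_idx)` is a hypothetical callback that trains
    on `train_idx` and returns a score (e.g. accuracy) on `test_idx`.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return statistics.mean(scores), statistics.stdev(scores)
```

Each sample lands in exactly one test fold, so every image is used for testing once and for training four times.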
Experimental setup: CNN
ARCHITECTURE: CaffeNet
SOFTWARE: [Jia’14]
Pre-trained model
Experimental setup: outline
1. Fine-tuning CaffeNet
2. Layer by layer analysis
3. Layer ablation
4. Layer addition
Fine-tuning CaffeNet
Pre-trained model
Data augmentation (oversampling)
[Figure: CNN]
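The slides do not spell out the oversampling scheme. The sketch below assumes the ten-crop scheme commonly used with CaffeNet: four corner crops plus the center crop, each also flipped horizontally, with the class scores averaged over all ten. The `predict` callback is a hypothetical stand-in for a CNN forward pass, and images are plain lists of rows to keep the example dependency-free.

```python
def ten_crop_boxes(height, width, crop):
    """Return ten (top, left, mirrored) specs: 4 corners + center, plain and flipped."""
    if crop > height or crop > width:
        raise ValueError("crop must fit inside the image")
    tops = (0, height - crop)
    lefts = (0, width - crop)
    boxes = [(t, l) for t in tops for l in lefts]
    boxes.append(((height - crop) // 2, (width - crop) // 2))
    return [(t, l, m) for m in (False, True) for (t, l) in boxes]


def crop_view(image, top, left, crop, mirrored):
    """Cut one square crop out of a list-of-rows image, flipping it if asked."""
    rows = [row[left:left + crop] for row in image[top:top + crop]]
    return [row[::-1] for row in rows] if mirrored else rows


def oversampled_prediction(image, crop, predict):
    """Average the class scores that `predict` returns over the ten crops.

    `predict` is a hypothetical callback mapping a crop to a list of scores.
    """
    height, width = len(image), len(image[0])
    scores = [predict(crop_view(image, t, l, crop, m))
              for t, l, m in ten_crop_boxes(height, width, crop)]
    return [sum(s[c] for s in scores) / len(scores)
            for c in range(len(scores[0]))]
```

For a 256x256 image and a 227x227 crop, for example, `ten_crop_boxes(256, 256, 227)` places the center crop at offset (14, 14).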
Layer by layer analysis
Layer ablation
Raw ablation
2-neuron on top
~16M params (~25%)
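The parameter figure above can be sanity-checked with a short count. The layer shapes below are the standard AlexNet/CaffeNet ones, which is an assumption since the slides do not list them. The slide also does not say which layer the figure refers to; fc7 is used here only as one layer whose size falls in that range.

```python
# Standard AlexNet/CaffeNet layer shapes (an assumption, not from the slides).
# Each entry: (number of output units, inputs feeding each output unit).
LAYERS = {
    "conv1": (96, 3 * 11 * 11),
    "conv2": (256, 48 * 5 * 5),    # grouped convolution, 2 groups
    "conv3": (384, 256 * 3 * 3),
    "conv4": (384, 192 * 3 * 3),   # grouped convolution, 2 groups
    "conv5": (256, 192 * 3 * 3),   # grouped convolution, 2 groups
    "fc6": (4096, 256 * 6 * 6),
    "fc7": (4096, 4096),
    "fc8": (1000, 4096),           # original 1000-way ImageNet classifier
}


def param_count(outputs, fan_in):
    """Weights plus one bias per output unit."""
    return outputs * fan_in + outputs


def layer_share(name, layers=LAYERS):
    """Return (parameter count of `name`, its fraction of the whole network)."""
    counts = {key: param_count(*shape) for key, shape in layers.items()}
    return counts[name], counts[name] / sum(counts.values())
```

Under these assumptions, `layer_share("fc7")` gives about 16.8M parameters, a little over a quarter of the network.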
Layer addition
Conclusions
Future work
[Figure: Flickr dataset, Twitter dataset and a new Flickr dataset compared by size and quality of the annotations]
Experimental setup: introduction
Model
ARCHITECTURE: CaffeNet
SOFTWARE: [Jia’14]
DATASET: [Jou’15]
Acknowledgements
Financial support
Technical support
Albert Gil, Josep Pujal
Evaluation metric: accuracy
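As a reminder of what the metric measures, a minimal sketch of accuracy over predicted and ground-truth labels. The function name and the list-based inputs are illustrative choices, not taken from the slides.

```python
def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the ground-truth label."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    if not actual:
        raise ValueError("need at least one sample")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

With binary sentiment labels, `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` counts three matches out of four samples.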
Top-5 scores
Receptive fields visualization
CONV5, unit 49:
CONV5, unit 51: