Layer-wise CNN Surgery for Visual Sentiment Prediction

Posted on 12-Aug-2015


LAYER-WISE CNN SURGERY FOR VISUAL SENTIMENT PREDICTION

Víctor Campos, Xavier Giró, Amaia Salvador, Brendan Jou

July 20th 2015

Outline

1. Introduction
2. Related work
3. Methodology and results
4. Conclusions
5. Future work


Introduction: motivation


Introduction: problem definition

▷ What? Predict the sentiment that an image provokes in a human
▷ How? Using Convolutional Neural Networks (CNNs)


Introduction: example


Related work: low-level descriptors

Siersdorfer, S., Minack, E., Deng, F., & Hare, J. (2010, October). Analyzing and predicting sentiment of images on the social web. In Proceedings of the international conference on Multimedia (pp. 715-718). ACM.

Machajdik, J., & Hanbury, A. (2010, October). Affective image classification using features inspired by psychology and art theory. In Proceedings of the international conference on Multimedia (pp. 83-92). ACM.

Related work: SentiBank

Borth, D., Ji, R., Chen, T., Breuel, T., & Chang, S. F. (2013, October). Large-scale visual sentiment ontology and detectors using adjective noun pairs. In Proceedings of the 21st ACM international conference on Multimedia (pp. 223-232). ACM.

Related work: CNNs for sentiment prediction

You, Q., Luo, J., Jin, H., & Yang, J. (2015). Robust image sentiment analysis using progressively trained and domain transferred deep networks. In The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI).

Outline

1. Introduction
2. Related work
3. Methodology and results
   a. Convolutional Neural Networks
   b. Datasets
   c. Experimental setup and results
4. Conclusions
5. Future work

Convolutional Neural Networks

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In NIPS.


Datasets

                    Flickr                 Twitter
Authors             Borth et al. (2013)    You et al. (2015)
Size                ~500k                  1269
Annotation method   Textual tags           5 human annotators

[Figure: Flickr dataset and Twitter dataset compared by size and quality of the annotations]


Experimental setup: 5-fold cross-validation

Dataset split into five folds; each round uses one fold as Test and the remaining four as Train

Mean ± Std. Dev.
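The protocol on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: `evaluate` is a hypothetical callback standing in for "train on these indices, return accuracy on those", and the sample standard deviation is used because the slides do not say which estimator was reported.

```python
import random
import statistics


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, evaluate, k=5, seed=0):
    """Run k train/test rounds and return (mean, std) of the fold scores.

    `evaluate(train_idx, test_idx)` is a hypothetical callback that trains
    a model on train_idx and returns its accuracy on test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one goes into the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return statistics.mean(scores), statistics.stdev(scores)
```

With the 1269 Twitter images, `k_fold_indices(1269)` yields four folds of 254 images and one of 253, and each image lands in the test set exactly once.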


Experimental setup: CNN

ARCHITECTURE: CaffeNet
SOFTWARE: Caffe [Jia’14]
Pre-trained model

Experimental setup: outline

1. Fine-tuning CaffeNet
2. Layer by layer analysis
3. Layer ablation
4. Layer addition

Fine-tuning CaffeNet


Pre-trained model

Data augmentation (oversampling)
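The figures for these slides are not part of the transcript, so the exact scheme is not recoverable here. A minimal sketch, assuming the common ten-crop form of oversampling (four corners and the center, each also mirrored horizontally) with the class scores averaged over the crops, might look as follows. `predict` is a hypothetical stand-in for a CNN forward pass, and images are plain 2-D lists to keep the example dependency-free.

```python
def ten_crop(image, crop):
    """Return 10 crops of a 2-D image (list of rows): the four corners and
    the center, each followed by its horizontal mirror."""
    height, width = len(image), len(image[0])
    if crop > height or crop > width:
        raise ValueError("crop size exceeds image size")
    offsets = [
        (0, 0),                                       # top-left
        (0, width - crop),                            # top-right
        (height - crop, 0),                           # bottom-left
        (height - crop, width - crop),                # bottom-right
        ((height - crop) // 2, (width - crop) // 2),  # center
    ]
    crops = []
    for top, left in offsets:
        patch = [row[left:left + crop] for row in image[top:top + crop]]
        crops.append(patch)
        crops.append([row[::-1] for row in patch])    # horizontal mirror
    return crops


def oversampled_prediction(image, crop, predict):
    """Average the class scores that `predict` returns for each crop.

    `predict` is a hypothetical function mapping one crop to a list of
    class probabilities (for example [negative, positive]).
    """
    scores = [predict(patch) for patch in ten_crop(image, crop)]
    n_classes = len(scores[0])
    return [sum(s[c] for s in scores) / len(scores) for c in range(n_classes)]
```

Averaging over several views of the same image tends to smooth out predictions that depend on where the crop happens to fall.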


Layer by layer analysis


Layer ablation


Raw ablation

2-neuron on top


~16M params (~25%)
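The parameter saving can be checked with a short count. The sketch below assumes the standard CaffeNet/AlexNet layer shapes, which are not listed in the slides, and assumes the saving refers to dropping one 4096 x 4096 fully connected layer (fc7). The fraction it returns is measured against the full 1000-class model; the baseline used for the slide's figure is not stated, so the two percentages need not match exactly.

```python
# Assumed CaffeNet layer shapes: (output units, inputs per output unit).
# For grouped convolutions the input count uses the channels of one group.
LAYERS = {
    "conv1": (96, 3 * 11 * 11),
    "conv2": (256, 48 * 5 * 5),
    "conv3": (384, 256 * 3 * 3),
    "conv4": (384, 192 * 3 * 3),
    "conv5": (256, 192 * 3 * 3),
    "fc6": (4096, 256 * 6 * 6),
    "fc7": (4096, 4096),
    "fc8": (1000, 4096),
}


def n_params(layer):
    """Weights plus one bias per output unit."""
    outputs, inputs = LAYERS[layer]
    return outputs * inputs + outputs


def total_params():
    """Parameter count of the whole (assumed) network."""
    return sum(n_params(name) for name in LAYERS)


def ablation_saving(layer="fc7"):
    """Return (parameters removed, fraction of the full model)."""
    removed = n_params(layer)
    return removed, removed / total_params()
```

Under these assumptions fc7 alone holds 4096 * 4096 + 4096 = 16,781,312 parameters, a little over a quarter of the network.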


Layer addition


Conclusions

[Figure: pre-trained model, CNN]


Future work

[Figure: Flickr dataset, Twitter dataset and a new Flickr dataset compared by size and quality of the annotations]

Experimental setup: introduction

Model

ARCHITECTURE: CaffeNet
SOFTWARE: Caffe [Jia’14]
DATASET: [Jou’15]

Acknowledgements

Financial support

Technical support: Albert Gil, Josep Pujal

Evaluation metric: accuracy
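For reference, accuracy is the fraction of images whose predicted sentiment label matches the annotation. A minimal sketch follows; the function name and the 0/1 label encoding are illustrative assumptions, not taken from the slides.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true one."""
    if len(predicted) != len(actual):
        raise ValueError("label lists must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy of an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)
```

For example, with 1 for positive and 0 for negative, `accuracy([1, 0, 1, 1], [1, 0, 0, 1])` counts three matches out of four.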

Top-5 scores

Receptive fields visualization

CONV5, unit 49:

CONV5, unit 51: