mediaeval 2016: a multimodal system for the verifying multimedia use task

1
MediaEval 2016: A multimodal system for the Verifying Multimedia Use task C ´ edric Maigrot Vincent Claveau Ewa Kijak Ronan Sicre {firstname}.{lastname}@irisa.fr MediaEval 2016: A multimodal system for the Verifying Multimedia Use task C ´ edric Maigrot Vincent Claveau Ewa Kijak Ronan Sicre {firstname}.{lastname}@irisa.fr Why use a multimodal system ? Because there are several types of hoax ! False information present in the text content Forged image Image reused for an other event Global Hypotheses Prediction is first made at the image-level, then propagated to the tweets that contain the image Translation if the detected language is different than english Text-based approach (run-T) Detect if the message is style-wise similar to known hoax Capture similar comments between an unknown image and an image from the training set (e.g. It’s photoshopped ) and similar genres of com- ments (e.g. presence of smileys) Prediction made by a k -Nearest-Neighbor ap- proach (in this case k = 1) Source-based approach (run-S) Detect if the message is related to a trustworthy source 2 type of sources searched: news-related organ- isms (e.g. press agencies) and explicit citations of the source of the image (e.g. the pattern pho- tographed by + Name ) Predict real if a trustworthy source is detected, fake else Example Image-based approach (run-I) Detect a known image Compare an unknown image to an image database of 8 000 known images (7 500 fake and 500 real images) Database images extracted from 5 specialized websites Description of an image by a deep CNN layer output (4096-dimensional descriptor) Predict real (resp. fake ) if a real (resp. fake ) similar image is found in the database, uncertain else Combination approach (run-C) Combine the three previous predictions Late fusion: learn the best combination Boosting algorithm (adaboost.MH, parameters of the machine learning algorithm are set by cross-validation on the training data) Results run-T run-I run-S run-C 92.23% 34.07% 94.63% 91.22% 63.98% 49.18% 90.3% 75.25% 75.57% 40.25% 92.42% 82.47% Approaches Score in % 2 228 messages to classify, corresponding to 130 images 86 % to the test tweets are associated with one or more images (the rest is associated with video) Conclusion Text-based approach: competes with the source-based approach in terms of recall but tends to classify every tweet as fake Image-based approach: low precision compared with estimations on the training set. This may be due to: (1) small and unbalanced reference database; (2) original image and forged ones are sometimes very similar; (3) presence of stamps Combination-based approach: does not offer any gain due to overfitting Acknowledgements This work is partly supported by the Direction G ´ en ´ erale de l’Armement, France (DGA).

Upload: multimediaeval

Post on 20-Jan-2017

5 views

Category:

Science


1 download

TRANSCRIPT

Page 1: MediaEval 2016: A Multimodal System for the Verifying Multimedia Use Task

MediaEval 2016: A multimodal system for theVerifying Multimedia Use task

Cedric Maigrot Vincent Claveau Ewa Kijak Ronan Sicre{firstname}.{lastname}@irisa.fr

MediaEval 2016: A multimodal system for theVerifying Multimedia Use task

Cedric Maigrot Vincent Claveau Ewa Kijak Ronan Sicre{firstname}.{lastname}@irisa.fr

Why use a multimodal system ? Because there are several types of hoax !

ú False information present in the text content ú Forged image ú Image reused for an other event

Global Hypotheses

ú Prediction is first made at the image-level, then propagated to the tweets that contain the image

ú Translation if the detected language is different than english

Text-based approach

(run-T)

Detect if the message is style-wisesimilar to known hoaxÜ Capture similar comments between an unknown

image and an image from the training set (e.g.It’s photoshopped) and similar genres of com-ments (e.g. presence of smileys)

Ü Prediction made by a k -Nearest-Neighbor ap-proach (in this case k = 1)

Source-based approach

(run-S)

Detect if the message is related to atrustworthy sourceÜ 2 type of sources searched: news-related organ-

isms (e.g. press agencies) and explicit citationsof the source of the image (e.g. the pattern pho-tographed by + Name)

Ü Predict real if a trustworthy source is detected,fake else

Example Image-based approach

(run-I)

Detect a known imageú Compare an unknown image to an image

database of 8 000 known images (7 500 fake and500 real images)

ú Database images extracted from 5 specializedwebsites

ú Description of an image by a deep CNN layeroutput (4096-dimensional descriptor)

ú Predict real (resp. fake) if a real (resp. fake)similar image is found in the database, uncertainelse

Combination approach (run-C)

Combine the three previous predictionsú Late fusion: learn the best combination

ú Boosting algorithm (adaboost.MH, parameters of the machine learning algorithm are set by cross-validation on the training data)

Results

run-T run-I run-S run-C

92.2

3%

34.0

7%

94.6

3%

91.2

2%

63.9

8%

49.1

8%

90.3

%

75.2

5%

75.5

7%

40.2

5%

92.4

2%

82.4

7%

Approaches

Sco

rein

%

ú 2 228 messages to classify, corresponding to 130 images

ú 86 % to the test tweets are associated with one or more images (the rest is associatedwith video)

Conclusion

ú Text-based approach: competes with the source-based approach in terms of recall buttends to classify every tweet as fake

ú Image-based approach: low precision compared with estimations on the training set.This may be due to: (1) small and unbalanced reference database; (2) original image andforged ones are sometimes very similar; (3) presence of stamps

ú Combination-based approach: does not offer any gain due to overfitting

Acknowledgements

This work is partly supported by the Direction Generale de l’Armement, France (DGA).