
MediaEval 2015 - The CERTH-UNITN Participation @ Verifying Multimedia Use 2015

Christina Boididou (1), Symeon Papadopoulos (1), Duc-Tien Dang-Nguyen (2), Giulia Boato (2), and Yiannis Kompatsiaris (1)

(1) Information Technologies Institute (ITI), CERTH, Greece
(2) University of Trento, Italy

MediaEval 2015 Workshop, Sept 14-15, 2015, Wurzen, Germany

This task is supported by the REVEAL EC FP7 Project.



Overview

Approach
• Use of tweet-based, user-based and forensics features
• Supervised learning (SL) scheme
• Semi-supervised learning scheme, called agreement-based retraining technique (SSL-AR)

Aim
• Predict whether a tweet that shares multimedia content is fake or real


Features

Features used in the experiments:

Feature set   Description
TB-base       Baseline tweet-based
TB-ext        Extended tweet-based
UB-base       Baseline user-based
UB-ext        Extended user-based
FOR           Forensics

Types
• Tweet-based: information coming from the tweet and its metadata
• User-based: information and metadata about the user posting (or retweeting) the tweet
• Multimedia forensics: features based on the image that accompanies the tweet

Sets
• Baseline (base) set: features shared by the task
• Extended (ext) set: newly extracted features
• Forensics (FOR) set: features distributed by the task plus some additional ones


Additional Features

Tweet-based              User-based                        Forensics
Contains word "please"   Account age                       AJPG-BAG combined
Has external link        Number of media content           NAJPG-BAG combined
Number of slang words    Shares location
Number of nouns          Shares location that exists (1)
Readability (2)

For the links:
• Web Of Trust (WOT) score
• In-degree centrality (3)
• Harmonic centrality (3)
• Alexa rankings

(1) Geonames dataset (http://download.geonames.org/export/)
(2) Flesch Reading Ease method, which scores the complexity of a piece of text in the interval [0, 100]
(3) Common Crawl WWW Ranking (http://wwwranking.webdatacommons.org/more.html)
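As an illustration of the readability feature, the standard Flesch Reading Ease formula is 206.835 - 1.015 (words / sentences) - 84.6 (syllables / words). The sketch below is not the authors' extraction code: the function names, the vowel-group syllable heuristic and the sentence splitting are simplifying assumptions, and the clamp to [0, 100] simply mirrors the interval quoted above.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease score, clamped to the interval [0, 100]."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    score = (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )
    return min(100.0, max(0.0, score))
```

Higher scores indicate easier text. A production feature extractor would more likely use a dictionary-based syllable counter than the heuristic used here.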


Additional Forensics Features

[Diagram: AJPG map -> thresholding -> binary map ('Object' mask); mask and BAG -> 'Object' features and 'Background' features -> AJPG-BAG combined]

• NAJPG-BAG was combined in the same way from the NAJPG and BAG features.
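One way to read the combination step is sketched below: threshold the AJPG map into an object mask, then summarise the BAG values inside and outside the mask. The slides do not give the threshold or the statistics used, so the default threshold of 0.5, the choice of mean and standard deviation, and the name `combine_maps` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def combine_maps(ajpg, bag, threshold=0.5):
    """Split BAG values by a thresholded AJPG map and summarise each region.

    ajpg, bag: 2-D lists of floats with the same shape.
    threshold: hypothetical cut-off; the value used by the authors is not given.
    Returns [object_mean, object_std, background_mean, background_std].
    """
    object_values, background_values = [], []
    for ajpg_row, bag_row in zip(ajpg, bag):
        for probability, bag_value in zip(ajpg_row, bag_row):
            if probability >= threshold:      # pixel belongs to the 'Object' mask
                object_values.append(bag_value)
            else:                             # pixel belongs to the 'Background'
                background_values.append(bag_value)

    def summarise(values):
        # Empty regions get zeros so the feature vector length stays fixed.
        return [mean(values), pstdev(values)] if values else [0.0, 0.0]

    return summarise(object_values) + summarise(background_values)
```

Under the same assumptions, passing an NAJPG map in place of `ajpg` would give the NAJPG-BAG counterpart.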


Agreement-based retraining method

• Makes the initial model adaptable
• Predicts more accurately the labels of the samples on which the classifiers disagree
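The idea can be sketched as follows, under assumptions the slides do not spell out: two classifiers (CL1, CL2) are trained on different feature views, test samples on which they agree are added to the training set with their predicted labels, and one classifier is retrained to re-predict the disagreed samples. Retraining CL1 is an arbitrary choice here, and `NearestCentroid` is a tiny stand-in written for this example, not the Random Forest used in the runs.

```python
class NearestCentroid:
    """Minimal stand-in classifier (an assumption, for illustration only)."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [v / counts[label] for v in total] for label, total in sums.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids, key=lambda label: dist(x, self.centroids[label]))
            for x in X
        ]


def agreement_retraining(X1_train, X2_train, y_train, X1_test, X2_test,
                         make_clf=NearestCentroid):
    """Return final test predictions after one round of agreement-based retraining."""
    cl1 = make_clf().fit(X1_train, y_train)
    cl2 = make_clf().fit(X2_train, y_train)
    p1, p2 = cl1.predict(X1_test), cl2.predict(X2_test)

    agreed = [i for i in range(len(p1)) if p1[i] == p2[i]]
    disagreed = [i for i in range(len(p1)) if p1[i] != p2[i]]

    # Treat agreed predictions as labels and extend the training set with them.
    X_new = list(X1_train) + [X1_test[i] for i in agreed]
    y_new = list(y_train) + [p1[i] for i in agreed]
    retrained = make_clf().fit(X_new, y_new)

    final = list(p1)
    if disagreed:
        new_predictions = retrained.predict([X1_test[i] for i in disagreed])
        for i, prediction in zip(disagreed, new_predictions):
            final[i] = prediction
    return final
```

In this toy setting the agreed samples shift the class centroids toward the test distribution, which is how the retrained model can change its decision on a disagreed sample.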


Bagging

Training set
• N = 9
• Equal number of samples from each class
• Average the results of numerous predictors
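A minimal sketch of class-balanced bagging is given below. It assumes that N is the number of bags, that each bag draws the same number of samples per class without replacement, and that the predictors are combined by majority vote as a simple form of averaging. None of these details are confirmed by the slides, and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_bags(X, y, n_bags=9, seed=0):
    """Draw n_bags subsets, each with an equal number of samples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(y):
        by_class[label].append(index)
    per_class = min(len(indices) for indices in by_class.values())

    bags = []
    for _ in range(n_bags):
        chosen = []
        for indices in by_class.values():
            chosen.extend(rng.sample(indices, per_class))
        bags.append(([X[i] for i in chosen], [y[i] for i in chosen]))
    return bags


def majority_vote(per_model_predictions):
    """Combine label predictions from several models, one list per model."""
    n_samples = len(per_model_predictions[0])
    return [
        Counter(model[i] for model in per_model_predictions).most_common(1)[0][0]
        for i in range(n_samples)
    ]
```

With an odd number of predictors such as 9 and two classes, the vote cannot tie.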


Submitted Runs

Run     Learning   Features
RUN-1   SL         TB-base
RUN-2   SL         TB-base + FOR
RUN-3   SSL-AR     (TB-base + FOR) + UB-base
RUN-4   SL         TB-ext + UB-ext + FOR
RUN-5   SSL-AR     (TB-ext + FOR) + UB-ext

• RUN-1, RUN-2 and RUN-4: plain classification model
• RUN-3 and RUN-5: agreement-based retraining technique
• Random Forest classifier used for all models
• CL1, CL2: labels marking the two feature groups in the SSL-AR rows (the parenthesised group and the user-based group)

SL: Supervised Learning
SSL-AR: Semi-Supervised Learning, Agreement-based Retraining


Results

Run     Recall   Precision   F-score   Features
RUN-1   0.794    0.733       0.762     TB-base
RUN-2   0.749    0.994       0.854     TB-base + FOR
RUN-3   0.922    0.736       0.819     (TB-base + FOR) + UB-base
RUN-4   0.798    0.860       0.828     TB-ext + UB-ext + FOR
RUN-5   0.969    0.861       0.911     (TB-ext + FOR) + UB-ext

A. RUN-5 achieved the best score
B. The SSL-AR technique improves performance considerably (F-score 0.911 for RUN-5 vs 0.828 for RUN-4, with the same feature sets)
C. RUN-2 is better than RUN-1 -> contribution of the FOR features
D. RUN-3 vs RUN-5 comparison -> contribution of the ext features
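The F-score column can be checked against the recall and precision columns, assuming it is the usual harmonic mean (F1). The sketch below recomputes it from the reported values; small differences in the third decimal are expected because the inputs are already rounded.

```python
def f_score(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if recall + precision == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision, reported F-score) as listed in the results table
reported = {
    "RUN-1": (0.794, 0.733, 0.762),
    "RUN-2": (0.749, 0.994, 0.854),
    "RUN-3": (0.922, 0.736, 0.819),
    "RUN-4": (0.798, 0.860, 0.828),
    "RUN-5": (0.969, 0.861, 0.911),
}

for run, (recall, precision, reported_f) in reported.items():
    print(f"{run}: recomputed {f_score(recall, precision):.3f}, reported {reported_f:.3f}")
```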


Examples

Fake example classified as real


Fake example classified as fake


Conclusions / Future Work

Features
• ext features perform better than base ones
• FOR features improve performance

Agreement-based retraining technique
• improves accuracy
• adapts to the new data
• requires a number of test samples before it can be applied

Future Ideas
• Experiment with other sets of features
• Perform feature selection
• Adapt the method so that it can be applied with fewer samples


Questions


Thank you for your attention!

