Social-media Storytelling Linking

2018. 11. 20.

Hao Wu

Seamus Lawless

Gareth Jones

Francois Pitie

The ADAPT Centre is funded under the SFI Research Centres Programme (Grant 13/RC/2106) and is co-funded under the European Regional Development Fund.

www.adaptcentre.ie

• Task definition

• Challenges & Solutions

• Training

• Searching

• Result

Tour de France

Challenges & Solutions

Challenges

• Lack of training data
• A video cannot be summarized in a single sentence

Solutions

• Pre-training + fine-tuning
• Video segmentation + length normalization

Data pre-processing

                    Images  Videos  Queries
Edinburgh Festival  32k     6.2k    60
Le Tour de France   66k     19k     58

• Video → shot boundary detection → image sets
• Image / image sets → ResNet-152 → visual embeddings
• Text → text representation: word level + sentence level (Skip-Thought)
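
The slides do not say which shot boundary detector was used. As a minimal sketch, assuming each frame has already been reduced to a small feature vector (for example a colour histogram), a boundary can be declared wherever two consecutive frames differ by more than a threshold. The per-frame embeddings of each shot (ResNet-152 features in the slides) could then be averaged into one vector. The function names, the L1 distance, the threshold value, and the mean pooling are all illustrative assumptions, not the authors' confirmed method.

```python
from typing import List, Sequence

Vector = Sequence[float]


def l1_distance(a: Vector, b: Vector) -> float:
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_shot_boundaries(frames: Sequence[Vector], threshold: float) -> List[int]:
    """Return the start index of every shot.

    Assumption: a new shot starts when consecutive frame features
    differ by more than `threshold` (hypothetical criterion).
    """
    if not frames:
        return []
    starts = [0]
    for i in range(1, len(frames)):
        if l1_distance(frames[i - 1], frames[i]) > threshold:
            starts.append(i)
    return starts


def split_into_shots(frames: Sequence[Vector], threshold: float) -> List[List[Vector]]:
    """Group frames into shots (the 'image sets' of the slide)."""
    starts = detect_shot_boundaries(frames, threshold)
    ends = starts[1:] + [len(frames)]
    return [list(frames[s:e]) for s, e in zip(starts, ends)]


def mean_pool(vectors: Sequence[Vector]) -> List[float]:
    """Average a set of vectors into one (assumed pooling choice)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Toy data: two visually distinct shots of two frames each.
toy_frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_shots = split_into_shots(toy_frames, threshold=0.5)
toy_embeddings = [mean_pool(shot) for shot in toy_shots]
```

In practice the toy vectors would be replaced by real frame features, and the threshold would need tuning per dataset.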

Model overview

Training

Examples

• Pre-training: Snow; Playful dogs; People having meal
• Target information: Deep Time Show; Museum of Edinburgh; Highlights of Chris Froome

Pre-training

Introducing Flickr30k (high-quality image-text pairs)

A boy in a dark shirt is reading a book while sitting on a piano bench

Target information collecting

Collecting from source domain:
• Identify keywords from the query file.
• Match keywords with data in the source.

E.g. Keyword: taking selfies.
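
The keyword-matching step can be sketched as follows. This is only an illustration: it assumes the source domain is a set of captioned images and that matching is a case-insensitive substring test. How keywords were actually identified and matched is not stated in the slides, and the function name, image ids, and the first caption are made up for the example.

```python
from typing import Dict, List


def match_keywords(captions: Dict[str, str], keywords: List[str]) -> Dict[str, List[str]]:
    """Map each keyword to the ids of captions that contain it.

    Assumption: a case-insensitive substring match stands in for
    whatever matching rule was really used.
    """
    matches: Dict[str, List[str]] = {kw: [] for kw in keywords}
    for image_id, caption in captions.items():
        text = caption.lower()
        for kw in keywords:
            if kw.lower() in text:
                matches[kw].append(image_id)
    return matches


# Hypothetical source-domain captions keyed by hypothetical image ids.
source_captions = {
    "img1": "Two friends taking selfies on a bridge",
    "img2": "A boy in a dark shirt is reading a book while sitting on a piano bench",
}
selected = match_keywords(source_captions, ["taking selfies"])
```

The matched image-caption pairs would then serve as target-related training examples drawn from the source domain.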

Collecting from search engine:
• Collect labels from online image search engines (Google and Bing), using story segments + event name as the query.

Model (figure; example labels: Snow, Chris Froome pedaling)

Searching

Search

Trade-off between consistency and accuracy

$R_t = 0.2\,R_{t-1} + 0.8\,M_t$

(M is the raw model output, R is the modified output.)
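
The update rule above is an exponential smoothing of the model output across consecutive steps. A minimal sketch, assuming the output at each step t is a vector of floats and that the first step is left unsmoothed (the slides do not state the initial condition); the function and parameter names are illustrative:

```python
from typing import List, Sequence


def smooth_outputs(raw: Sequence[Sequence[float]], carry: float = 0.2) -> List[List[float]]:
    """Apply R_t = carry * R_{t-1} + (1 - carry) * M_t step by step.

    Assumption: R for the first step equals M for the first step.
    """
    smoothed: List[List[float]] = []
    for m in raw:
        if not smoothed:
            r = list(m)  # first step: nothing to carry over
        else:
            prev = smoothed[-1]
            r = [carry * p + (1.0 - carry) * x for p, x in zip(prev, m)]
        smoothed.append(r)
    return smoothed


# Toy raw outputs M_1, M_2 for two consecutive steps.
example = smooth_outputs([[1.0, 0.0], [0.0, 1.0]])
```

With carry = 0.2 the result stays close to the raw output (accuracy) while retaining a small share of the previous step (consistency); a larger carry would shift the balance toward consistency.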

Search

• λ is used to penalize long videos.
• L denotes the number of segments.
• Sig() is the sigmoid function.

Five runs were submitted. They differ mainly in the value of λ:

Conf    Run1         Run2    Run3    Run4    Run5
λ       3            5       12      20      50
Source  Google+Bing  Google  Google  Google  Google

Results

Figure: Summary Quality (vertical axis, 0 to 0.8) for Run1 to Run5 on Edfest and Tourfrance.

Conclusion & Future Work

Target-specific information is crucial.

Improve video representations by applying key frame selection (or building a sequence model).

Build a classifier to filter crawled images to make this process automatic.

Thanks for listening.
