The Search and Hyperlinking Task at MediaEval 2014


Search and Hyperlinking 2014

Overview

Maria Eskevich, Robin Aly, David Nicolás Racca,
Roeland Ordelman, Shu Chen, Gareth J.F. Jones

Find what you were (not) looking for

Search & Explore

Jump-in points!

Users

| Main group              | User                            | Target                      |
| ----------------------- | ------------------------------- | --------------------------- |
| Researchers & Educators | Journalists                     | Research                    |
|                         | Academic researchers & students | Investigate                 |
|                         | Academic educators              | Educate                     |
| Public users            | Citizens                        | Entertainment, Infotainment |
| Media Professionals     | Broadcast Professionals         | Reuse                       |
|                         | Media Archivists                | Annotate                    |

Recommendation (Linking)

Not what we want

Linking Audio-Visual Content

[Timeline figure, 1998–2015: growth of audio-visual collections from small, not representative datasets towards representative "big data" collections]

Search & Hyperlinking task

•  User oriented: aims to explore the needs of real users expressed as queries.
   – How: UK citizens for query creation, and crowdsourcing for retrieval assessment.

•  Temporal aspect: seeks to direct users to the relevant parts of retrieved videos ("jump-in points"); see the segmentation sketch after this list.
   – How: segmentation, segment overlap, transcripts, prosodic and visual features (low-level, high-level; keyframes).

•  Multimodal: investigates technologies for addressing the variety in user needs and expectations.
   – How: varied visual and audio contributions; an intentional gap between the query and the multimodal descriptors in the content.
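To make the temporal aspect concrete, here is a minimal sketch of the fixed-length sliding-window segmentation commonly used to create retrievable units with jump-in points. The (token, start-time) input format and the 60 s / 30 s window parameters are illustrative assumptions, not part of the task definition.

```python
# Hypothetical sketch: overlapping fixed-length segmentation of a time-stamped
# transcript; each segment's start time serves as the "jump-in point".

def sliding_window_segments(words, length=60.0, step=30.0):
    """words: list of (token, start_time_s) pairs, sorted by time.
    Returns (start, end, text) triples."""
    if not words:
        return []
    segments = []
    start = words[0][1]
    last = words[-1][1]
    while start <= last:
        end = start + length
        text = " ".join(tok for tok, t in words if start <= t < end)
        if text:  # skip windows with no speech
            segments.append((start, end, text))
        start += step
    return segments

# Toy example: three words spread over ~70 seconds yield overlapping units.
demo = [("sightseeing", 2.0), ("london", 35.0), ("bridge", 70.0)]
for seg in sliding_window_segments(demo):
    print(seg)
```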

MediaEval Search & Hyperlinking task in development: 2012 – 2014

|                            | Search 2012       | Search 2013       | Search 2014           | Hyperlinking 2012 | Hyperlinking 2013     | Hyperlinking 2014     |
| -------------------------- | ----------------- | ----------------- | --------------------- | ----------------- | --------------------- | --------------------- |
| Dataset                    | BlipTv            | BBC               | BBC                   | BlipTv            | BBC                   | BBC                   |
| Features released:         |                   |                   |                       |                   |                       |                       |
|   Transcripts              | 2 ASR             | 3 ASR             | 3 ASR                 | 2 ASR             | 3 ASR                 | 3 ASR                 |
|   Prosodic features        | no                | yes               | yes                   | no                | yes                   | yes                   |
|   Visual cues for queries  | yes               | no                | no                    | –                 | –                     | –                     |
|   Concept detection        | –                 | yes               | yes                   | –                 | yes                   | yes                   |
| Type of the task           | Known-item        | Known-item        | Ad-hoc                | Ad-hoc            | Ad-hoc                | Ad-hoc                |
| Query/anchor creation      | PC                | PC                | iPad                  | PC                | PC                    | iPad                  |
| Number of queries/anchors  | 30/30             | 4/50              | 50/30                 | 30/30             | 11/                   | 98/30                 |
| Relevance assessment       | MTurk             | users (BBC)       | MTurk                 | MTurk             | MTurk                 | MTurk                 |
| Number of assessed cases   | 30                | 50                | 9 900                 | 3 517             | 9 975                 | 13 141                |
| Evaluation metrics         | MRR, MASP, MASDWP | MRR, MASP, MASDWP | MAP(-bin/tol), P@5/10 | MAP               | MAP(-bin/tol), P@5/10 | MAP(-bin/tol), P@5/10 |

Dataset: Video collection

•  BBC copyright-cleared broadcast material:
   – Videos:
      •  Development set: 6 weeks, 01.04.2008 – 11.05.2008 (1335 hours / 2323 videos)
      •  Test set: 11 weeks, 12.05.2008 – 31.07.2008 (2686 hours / 3528 videos)
   – Manually transcribed subtitles
   – Metadata

•  Additional data:
   – ASR transcripts: LIMSI/Vocapia, LIUM, NST-Sheffield
   – Shot boundaries, keyframes
   – Output of visual concept detectors by University of Leuven and University of Oxford

Dataset: Query

•  28 users
   – Occupations included: policeman, hairdresser, bouncer, sales manager, student, self-employed

•  Two-hour session on iPads:
   – Search the archive (document level)
   – Define clips (segment level)
   – Define anchors (anchor level)

Workflow: Statement of Information Need → Search → Refine → Relevant Clips → Define Anchors

User study @ BBC:

1.) Statement of Information Need
2.) Search: if relevant clips are found, go to 3.); otherwise go back to 1.)
3.) Refine relevant clip
4.) Define anchors

Data cleaning: Usable Information Need

•  The description clearly specifies what is relevant
•  A query with a suitable title exists
•  Sufficient relevant segments exist (try the query)

Data cleaning: Process

•  For each information need in the batch:
   1.  Check if it is usable
   2.  If in doubt, use search to look for relevant data
   3.  Reword & spell-check the description
   4.  Select the first suitable query
   5.  Save

Data cleaning: Usable Anchor

•  Longer than 5 seconds
•  The destination description clearly identifies the material the user wants to see when activating the anchor described by the label
•  It is likely that there are some relevant items in the collection

Data cleaning: Process

•  For each information need in the assigned batch, go through its anchors:
   – Check if each anchor is usable
   – Reword & spell-check the description
   – Assess whether links are likely to be found in the collection (possibly using search)
   – Save

Dataset: outcome (1/2)

•  30 queries, e.g.:

<top>
  <queryId>query_6</queryId>
  <refId>53b3cf9d42b47e4c32545510</refId>
  <queryText>saturday kitchen cocktails</queryText>
</top>

<top>
  <queryId>query_1</queryId>
  <refId>53b3c64b42b47e4a362be4ce</refId>
  <queryText>sightseeing london</queryText>
</top>

Dataset: outcome (2/2)

•  30 anchors, e.g.:

<anchor>
  <anchorId>anchor_1</anchorId>
  <refId>53b3c46f42b47e459265d06f</refId>
  <startTime>16.38</startTime>
  <endTime>17.35</endTime>
  <fileName>v20080629_184000_bbctwo_killer_whales_in_the</fileName>
</anchor>
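A small sketch of reading such anchor definitions with Python's standard library. It assumes the <anchor> elements are wrapped in a single root element when stored in a file, and it hedges on the time encoding, since the values above look like minutes.seconds rather than decimal seconds.

```python
import xml.etree.ElementTree as ET

xml_data = """
<anchors>
  <anchor>
    <anchorId>anchor_1</anchorId>
    <refId>53b3c46f42b47e459265d06f</refId>
    <startTime>16.38</startTime>
    <endTime>17.35</endTime>
    <fileName>v20080629_184000_bbctwo_killer_whales_in_the</fileName>
  </anchor>
</anchors>
"""

def to_seconds(mmss: str) -> int:
    # Assumption: times are encoded as minutes.seconds (16.38 = 16 min 38 s);
    # adjust this if your copy of the data uses plain decimal values.
    minutes, seconds = mmss.split(".")
    return int(minutes) * 60 + int(seconds)

root = ET.fromstring(xml_data)
for a in root.iter("anchor"):
    print(a.findtext("anchorId"),
          a.findtext("fileName"),
          to_seconds(a.findtext("startTime")),
          to_seconds(a.findtext("endTime")))
```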

Ground truth creation

•  Queries/anchors: user studies at the BBC
   – 28 users with the following profile:
      •  Age: 18–30 years old
      •  Use search engines and services on iPads on a daily basis

•  Relevance assessment: via crowdsourcing on the Amazon MTurk platform
   – Top 10 results from 58 search and 62 hyperlinking submissions
   – 1 judgment per query or anchor, accepted/rejected by an automated algorithm; special cases of worker typos checked manually
   – Number of evaluated HITs: 9 900 for search and 13 141 for hyperlinking

Evaluation metrics

•  P@5/10/20
•  MAP-based:
   – MAP: any overlapping segment is taken into account
   – MAP-bin: relevant segments are binned for relevance
   – MAP-tol: only the start times of the segments are considered
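To make the start-time criterion concrete, below is a minimal sketch of the idea behind MAP-tol, assuming a fixed 30-second tolerance. It illustrates the metric's logic only; it is not the task's official evaluation script.

```python
# Illustrative sketch: average precision where a returned segment counts as a
# hit if its start time lies within `tol` seconds of a relevant start time.

def average_precision_tol(ranked_starts, relevant_starts, tol=30.0):
    """ranked_starts: start times (s) of returned segments, best first.
    relevant_starts: start times of relevant segments from the ground truth."""
    remaining = list(relevant_starts)   # each relevant segment counts once
    hits, ap = 0, 0.0
    for rank, start in enumerate(ranked_starts, 1):
        match = next((r for r in remaining if abs(start - r) <= tol), None)
        if match is not None:
            remaining.remove(match)
            hits += 1
            ap += hits / rank
    return ap / len(relevant_starts) if relevant_starts else 0.0

print(average_precision_tol([10.0, 300.0, 95.0], [0.0, 100.0]))  # 0.833...
```

MAP-tol is then the mean of this per-query value over all queries (or anchors); MAP-bin follows the same pattern but matches returned segments to relevance bins rather than to start times alone.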

RESULTS  

Results: Search sub-task: MAP

[Bar chart: MAP per search run, grouped by transcript type: LIMSI/Vocapia, Manual, No ASR, NST/Sheffield, LIUM]

Results: Search sub-task: MAP_bin

[Bar chart: MAP_bin per search run, grouped by transcript type: LIMSI/Vocapia, Manual, No ASR, NST/Sheffield, LIUM]

Results: Search sub-task: MAP_tol

[Bar chart: MAP_tol per search run, grouped by transcript type: LIMSI/Vocapia, Manual, No ASR, NST/Sheffield, LIUM]

Results: Hyperlinking sub-task: MAP

[Bar chart: MAP per hyperlinking run, grouped by transcript type (LIMSI/Vocapia, Manual, No ASR, NST/Sheffield, LIUM). Runs: CUNI_F_M_NoOverlapAudioWeights, CUNI_F_M_NoOverlapKSI2Weights, CUNI_F_M_NoOverlapKSIWeights, CUNI_F_M_NoOverlapNoWeights, CUNI_F_M_OverlapKSIWeights, CUNI_F_N_NoOverlapAudioWeights, CUNI_F_N_NoOverlapKSIWeights, CUNI_F_N_NoOverlapNoWeights, CUNI_O_M_NoOverlapKSIWeights, DCLab_Sh_N_Concept2, DCLab_Sh_N_ConceptEnrichment, IRISAKUL_Ss_N_HTM, IRISAKUL_Ss_N_NGRAM, IRISAKUL_Ss_N_TM1, IRISAKUL_Ss_N_TM2, IRISAKUL_Ss_O_NGRAMNER, JRS_F_MV_ATextVisR, JRS_F_MV_AwConcept, JRS_F_MV_CTextVisR, JRS_F_MV_CwConcept, JRS_F_M_AText, JRS_F_M_CText, JRS_F_V_AcOnly, JRS_F_V_CcOnly, LINKEDTV2014_O_O_K, LINKEDTV2014_O_VO_KC7S, LINKEDTV2014_O_VO_KC7TS, LINKEDTV2014_Ss_N_ALL, LINKEDTV2014_Ss_N_TEXT]

Results: Hyperlinking sub-task: MAP_bin

[Bar chart: MAP_bin per hyperlinking run; same runs and transcript-type legend as above]

Results: Hyperlinking sub-task: MAP_tol

[Bar chart: MAP_tol per hyperlinking run; same runs and transcript-type legend as above]

Lessons learned

1.  iPad vs PC = different user behaviour and expectations of the system.
2.  Prosodic features broaden the scope of the search sub-task.
3.  Units based on shot segmentation achieve the worst scores in both sub-tasks.
4.  Use of metadata improves results in both sub-tasks.

The Search and Hyperlinking task was supported by

We are grateful to Jana Eggink and Andy O'Dwyer from the BBC for preparing the collection and hosting the user trials.

... and of course Martha for advice & crowdsourcing access.

JRS at Search and Hyperlinking of Television Content Task

Werner Bailer, Harald Stiegler

MediaEval Workshop, Barcelona, Oct. 2014

Linking sub-task

•  Matching terms from textual resources
•  Reranking based on visual similarity (VLAT)
•  Using visual concepts (alone or in addition)
•  Results (a reranking sketch follows below):
   – Differences between the different text resources
   – Context helped in only a few cases
   – Visual reranking provides a small improvement
   – Visual concepts did not provide improvements
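As a rough illustration of the reranking step, the sketch below linearly fuses a text retrieval score with a visual similarity between the anchor's and a candidate's visual descriptors. The cosine similarity and the `alpha` weight are stand-in assumptions; the actual runs use a VLAT-based similarity.

```python
import math

def cosine(u, v):
    # Plain cosine similarity as a stand-in for the VLAT-based measure.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rerank(candidates, anchor_visual, alpha=0.8):
    """candidates: (segment_id, text_score, visual_descriptor) triples.
    Returns the segments re-sorted by the fused score."""
    fused = [(seg, alpha * text + (1 - alpha) * cosine(anchor_visual, vis))
             for seg, text, vis in candidates]
    return sorted(fused, key=lambda x: x[1], reverse=True)

# Toy usage: visual similarity breaks a near-tie between two candidates.
print(rerank([("seg_a", 0.70, [1.0, 0.0]), ("seg_b", 0.68, [0.0, 1.0])],
             anchor_visual=[0.0, 1.0]))
```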

Solution with concept enrichment

•  Concept enrichment: the set of words is extended with their synonyms or other conceptually connected words (see the sketch below).
•  Top 10 vs. top 50 conceptually connected words for each word
•  Conclusion: the results show that enrichment with fewer words gives better precision, because a larger expansion set introduces more noise.

Zsombor Paróczi, Bálint Fodor, Gábor Szűcs
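A minimal sketch of the enrichment step, assuming some resource that maps each word to conceptually connected words; the toy RELATED dictionary below is hypothetical, and WordNet or a distributional model would play that role in practice.

```python
# Hypothetical related-word resource; in practice this would come from
# WordNet, word embeddings, or a similar lexical model.
RELATED = {
    "cocktail": ["drink", "beverage", "mixology", "bar"],
    "kitchen": ["cooking", "cuisine", "chef", "recipe"],
}

def enrich(words, k=10):
    """Extend a word set with up to k related words per word. A smaller k
    keeps precision higher, since every extra word also adds noise."""
    enriched = set(words)
    for w in words:
        enriched.update(RELATED.get(w, [])[:k])
    return enriched

print(enrich({"cocktail", "kitchen"}, k=2))
```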

Television Linked To The Web

www.linkedtv.eu

H.A. Le¹, Q.M. Bui¹, B. Huet¹, B. Cervenková², J. Bouchner², E. Apostolidis³,
F. Markatopoulou³, A. Pournaras³, V. Mezaris³, D. Stein⁴, S. Eickeler⁴, and M. Stadtschnitzer⁴

1 – Eurecom, Sophia Antipolis, France. 2 – University of Economics, Prague, Czech Republic.
3 – Information Technologies Institute, CERTH, Thessaloniki, Greece. 4 – Fraunhofer IAIS, Sankt Augustin, Germany.

16–17 Oct 2014

Reasons to visit the LinkedTV poster

LinkedTV @ MediaEval 2014 Search and Hyperlinking Task