
CLEF NewsREEL 2016 Overview

Transcript
Page 1: CLEF NewsREEL 2016 Overview

News REcommendation Evaluation Lab (NewsREEL)

Lab Overview

Frank Hopfgartner, Benjamin Kille, Andreas Lommatzsch, Martha Larson, Torben Brodt, Jonas Seiler

Page 2: CLEF NewsREEL 2016 Overview

Recommender systems (or recommendation systems) are a subclass of information filtering systems that seek to predict the "rating" or "preference" that a user would give to an item.

Recommender Systems
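To make the definition concrete, here is a minimal sketch (not from the slides) that predicts an unseen rating by blending a user's mean rating with an item's mean rating; all users, items, and ratings are hypothetical:

```python
# Hypothetical toy data: known (user, item) -> rating on a 1-5 scale.
ratings = {
    ("alice", "item_a"): 5, ("alice", "item_b"): 3,
    ("bob", "item_a"): 4, ("bob", "item_c"): 2,
}

def predict(user, item):
    """Predict a rating as the average of the user's mean and the item's mean."""
    user_ratings = [r for (u, _), r in ratings.items() if u == user]
    item_ratings = [r for (_, i), r in ratings.items() if i == item]
    user_mean = sum(user_ratings) / len(user_ratings)
    item_mean = sum(item_ratings) / len(item_ratings)
    return (user_mean + item_mean) / 2

# alice's mean is 4.0 and item_c's mean is 2.0, so the prediction is 3.0.
print(predict("alice", "item_c"))  # -> 3.0
```

Real systems replace this baseline with collaborative filtering or content-based models, but the task shape is the same: fill in the missing cells of the user-item rating matrix.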

Page 3: CLEF NewsREEL 2016 Overview

Items (Set-based recommenders)

Page 4: CLEF NewsREEL 2016 Overview

Items (Streams)

Page 5: CLEF NewsREEL 2016 Overview

• Recommender Systems
• Evaluation
• NewsREEL scenario
• NewsREEL 2016

Outline

Page 6: CLEF NewsREEL 2016 Overview

How do we evaluate?

Academia:
• Static, often rather old datasets
• Offline evaluation
• Focus on algorithms and precision

Industry:
• Dynamic datasets
• Online A/B testing
• Focus on user satisfaction

Page 7: CLEF NewsREEL 2016 Overview

Example: Recommending sites in Évora

Sé Catedral

Capela dos Ossos

Templo romano

Palacio de Don Manuel I

Source (Images): Wikipedia

Page 8: CLEF NewsREEL 2016 Overview

1. Choose a time point t0 to split the dataset
2. Classify ratings before t0 as the training set
3. Classify ratings after t0 as the test set

Offline Evaluation: Dataset Construction

[Rating matrix: Cristiano, Marta, and Luis rate the sites Centro Histórico, Capela dos Ossos, Templo Romano, Sé Catedral, and Almendres Cromlech; most cells are empty (Cristiano: 2, 4, 5, 2; Marta: 5, 3; Luis: 1, 2).]

Page 9: CLEF NewsREEL 2016 Overview

1. Train rating function f(u,i) using the training set
2. Predict ratings for all pairs (u,i) of the test set
3. Compute RMSE(f) over all rating predictions

Offline Evaluation: Benchmarking the Recommendation Task
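The benchmarking steps can be sketched end-to-end; the rating function below is a deliberately trivial global-mean predictor standing in for a real f(u,i), and all events and timestamps are hypothetical:

```python
import math

# Hypothetical timestamped ratings: (user, item, rating, timestamp).
events = [
    ("u1", "i1", 4, 10), ("u1", "i2", 2, 20), ("u2", "i1", 5, 30),
    ("u2", "i2", 1, 40), ("u1", "i3", 3, 50),
]

t0 = 35  # chosen split point: before t0 is training, after t0 is test
train = [e for e in events if e[3] <= t0]
test = [e for e in events if e[3] > t0]

# Stand-in rating function f(u, i): the global mean of the training ratings.
global_mean = sum(r for _, _, r, _ in train) / len(train)
def f(user, item):
    return global_mean

# RMSE over all rating predictions on the test set.
rmse = math.sqrt(sum((f(u, i) - r) ** 2 for u, i, r, _ in test) / len(test))
print(round(rmse, 3))  # -> 1.944
```

A lower RMSE means the predicted ratings sit closer to the held-out ratings; the training/test split by timestamp mirrors the dataset-construction steps on the previous slide.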

Page 10: CLEF NewsREEL 2016 Overview

• Ignores users’ previous interactions/preferences

• Does not consider trends/shifting preferences

• ...

• Technical challenges are ignored

Drawbacks of offline evaluation

And what about me?

Page 11: CLEF NewsREEL 2016 Overview

Example: Online evaluation

Page 12: CLEF NewsREEL 2016 Overview

Online Evaluation (A/B testing)

Compare performance, e.g., based on profit margins

Variants A and B are compared on metrics such as click-through rate, user retention time, required resources, and user satisfaction.
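Given click and impression counts for the two variants, a common way to decide whether the CTR difference between A and B is real is a two-proportion z-test; this is an illustrative sketch with invented traffic numbers, not part of any specific A/B platform:

```python
import math

def ctr(clicks, impressions):
    """Click-through rate: fraction of shown recommendations that were clicked."""
    return clicks / impressions

def z_score(clicks_a, imp_a, clicks_b, imp_b):
    """Two-proportion z-test for the difference between the variants' CTRs."""
    p_a, p_b = clicks_a / imp_a, clicks_b / imp_b
    p = (clicks_a + clicks_b) / (imp_a + imp_b)  # pooled click rate
    se = math.sqrt(p * (1 - p) * (1 / imp_a + 1 / imp_b))
    return (p_a - p_b) / se

# Hypothetical 50/50 traffic split between variant A and variant B.
z = z_score(120, 10_000, 90, 10_000)
print(round(ctr(120, 10_000), 3), round(z, 2))  # |z| > 1.96 -> significant at 5%
```

With these numbers A's CTR (1.2%) beats B's (0.9%) and the z-score exceeds 1.96, so the difference would typically be accepted at the 5% significance level.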

Page 13: CLEF NewsREEL 2016 Overview

• Large user base and costly infrastructure required

• Different evaluation metrics required

• Comparison to offline evaluation challenging

Drawbacks of online evaluation

And what about me?

Page 14: CLEF NewsREEL 2016 Overview

• Academia and industry apply different evaluation approaches
• Limited transfer from offline to online scenario
• Multi-dimensional benchmarking
• Combination of different evaluation approaches

Evaluation challenges

Page 15: CLEF NewsREEL 2016 Overview

• Recommender Systems
• Evaluation
• NewsREEL scenario
• NewsREEL 2016

Outline

Page 16: CLEF NewsREEL 2016 Overview

In CLEF NewsREEL, participants can develop stream-based news recommendation algorithms and have them benchmarked (a) online by millions of users over the period of a few months, and (b) offline by simulating a live stream.

CLEF NewsREEL

Page 17: CLEF NewsREEL 2016 Overview

NewsREEL scenario

Image: Courtesy of T. Brodt (plista)

Page 18: CLEF NewsREEL 2016 Overview

NewsREEL scenario

Profit = clicks on recommendations
Benchmarking metric: click-through rate (CTR)

[Diagram: readers request articles from the news portals; for each article view, the portal requests recommendations from the recommender.]

Page 19: CLEF NewsREEL 2016 Overview

Task 2: Offline Evaluation

Data:
• Traffic and content updates of nine German-language news content provider websites
• Traffic: reading articles, clicking on recommendations
• Updates: adding and updating news articles

Evaluation:
• Simulation of the data stream using the Idomaar framework
• Participants have to predict interactions with the data stream
• Quality measured by the ratio of successful predictions to the total number of predictions

Predict interactions in a simulated data stream
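The Task 2 quality measure, the ratio of successful predictions to the total number of predictions, reduces to a simple hit-rate computation; the article IDs below are hypothetical:

```python
def evaluate(predictions, observed):
    """Quality = successful predictions / total predictions."""
    hits = sum(1 for pred, actual in zip(predictions, observed) if pred == actual)
    return hits / len(predictions)

# Hypothetical replay of a simulated stream: for each recommendation
# request, the recommender predicts which article the user reads next.
predicted = ["a1", "a3", "a2", "a1"]  # the recommender's guesses
actual = ["a1", "a2", "a2", "a4"]     # what users actually read
print(evaluate(predicted, actual))    # 2 of 4 correct -> 0.5
```

In the actual task the Idomaar framework replays the recorded traffic as a stream, so predictions must be produced in event order rather than in batch.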

Page 20: CLEF NewsREEL 2016 Overview

Simulation process

Idomaar

simulate stream

request article

Page 21: CLEF NewsREEL 2016 Overview

Idomaar stream simulation


Page 22: CLEF NewsREEL 2016 Overview

Task 1: Online Evaluation

Data:
• Provide recommendations for visitors of the news portals of plista's customers
• Ten portals (local news, sports, business, technology)
• Communication via the Open Recommendation Platform (ORP)

Evaluation:
• Benchmark own performance against other participants and baseline algorithms during three pre-defined evaluation windows
• Best algorithms determined in the final evaluation period
• Standard evaluation metrics

Recommend news articles in real-time
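A minimal stream-based recommender in the spirit of Task 1 might rank articles by recent popularity; this is an illustrative sketch (the class, window size, and article IDs are invented), not a description of any participant's system or of the ORP message format:

```python
from collections import Counter, deque

class MostPopularRecommender:
    """Recommend the most-clicked recent articles, excluding the one
    the user is currently reading."""

    def __init__(self, window=1000):
        self.recent = deque(maxlen=window)  # sliding window over the stream

    def update(self, article_id):
        """Called for every observed article read in the incoming stream."""
        self.recent.append(article_id)

    def recommend(self, current_article, k=4):
        counts = Counter(self.recent)
        ranked = [a for a, _ in counts.most_common() if a != current_article]
        return ranked[:k]

rec = MostPopularRecommender()
for a in ["n1", "n2", "n1", "n3", "n1", "n2"]:
    rec.update(a)
print(rec.recommend("n1"))  # -> ['n2', 'n3']
```

The sliding window matters in the news domain: articles age quickly, so ranking over all-time counts would keep recommending stale stories.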

Page 23: CLEF NewsREEL 2016 Overview

Real-Time Recommendation

T. Brodt and F. Hopfgartner, "Shedding Light on a Living Lab: The CLEF NewsREEL Open Recommendation Platform," in Proc. of IIiX 2014, Regensburg, Germany, pp. 223–226, 2014.

Page 24: CLEF NewsREEL 2016 Overview

• Recommender Systems
• Evaluation
• NewsREEL scenario
• NewsREEL 2016

Outline

Page 25: CLEF NewsREEL 2016 Overview

Participation


Page 26: CLEF NewsREEL 2016 Overview

Task 1 – First evaluation window

Page 27: CLEF NewsREEL 2016 Overview

Task 1 – Second evaluation window

Page 28: CLEF NewsREEL 2016 Overview

Task 1 – Third Evaluation window

Page 29: CLEF NewsREEL 2016 Overview

• NewsREEL presentations
  – Online Algorithms and Data Analysis
  – Frameworks and Algorithms
• Evaluation results and task winners
• Joint session: LL4IR & NewsREEL: New Ideas

NewsREEL session: Tomorrow, 1:30pm – 3:30pm

Page 30: CLEF NewsREEL 2016 Overview

More Information
• http://orp.plista.com
• http://www.clef-newsreel.org
• http://www.crowdrec.eu
• http://sigir.org/files/forum/2015D/p129.pdf

Thank you

Questions?

