Real-Time Processing of Data Streams
(Verarbeitung von Datenströmen in Echtzeit)
Tobias Heintz (1), Benjamin Kille (2)
(1) plista GmbH
(2) Technische Universität Berlin
September 26, 2014
Table of Contents
Introduction
Recommender Systems
  Unpersonalised Recommendation
  Collaborative Filtering
  Content-based Filtering
  Evaluation
News Recommendation
Big Data Issues
Who are we?
- Tobias Heintz, plista GmbH
- Benjamin Kille, Technische Universität Berlin
plista GmbH
Pioneers for targeted advertisement and content distribution.
- founded 31 July, 2008
- incorporated into the WPP Group as of 1 January, 2014
- headquarters in Berlin, Germany
- 120 employees (30 % R&D)
Technische Universität Berlin
- >30 000 enrolled students
- 331 professors
- >2600 researchers
What problems do we address?
Recommender Systems
We will introduce recommender systems, discuss a variety of algorithms, and explore how to evaluate recommender systems.
News
We will talk about specific challenges when recommending news; we will illustrate issues that arise when systems fail to build comprehensive user profiles; and we will depict how news evolving over time affect recommender systems.
Big Data
We will exemplify in what way news represent a source of big data; we will introduce a system which grants researchers access to big data; and we will show how you can compete with your own approaches.
Why are these problems important?
Users increasingly face information overload as they interact with item collections. For instance:
- >43 000 000 songs on Apple’s iTunes
- 100 h of video are uploaded to YouTube every minute
- >3 000 000 movies on IMDb
- ...
Collections continue to grow, causing even more severe information overload. The same holds for news articles.
Problem definition
Users have insufficient time and cognitive capacity to iterate over the full collection. Recommender systems support users in filtering collections, and they differ with respect to the method they use to filter. More formally, a general-purpose recommender system is a triple (U, I, φ):

U → set of users {u1, u2, . . . , uM}
I → set of items {i1, i2, . . . , iN}
φ → a filter function

The performance of different recommendation algorithms typically depends on φ.
Filter Functions
Filter functions take a user u, the entire item collection I, and a model M. They return a subset of items to be recommended, I*.

φ(u, I, M) = I*

A recommender system’s success or failure strongly depends on the model M, in particular on how accurately it reflects actual user preferences. M may take various kinds of input, as we will discuss for a selection of recommendation algorithms.
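The triple above can be sketched as a tiny interface. Everything here is an illustrative assumption: `model` is any callable that scores a (user, item) pair, and the toy model simply prefers items whose id is close to the user’s.

```python
# Minimal sketch of a recommender as a triple (U, I, phi).
# The callable `model` plays the role of M; names are illustrative.

def phi(user, items, model, top_k=3):
    """Filter function: score every item in I for `user` and return the subset I*."""
    ranked = sorted(items, key=lambda item: model(user, item), reverse=True)
    return ranked[:top_k]

# Toy model M: prefer items whose id is numerically close to the user's id.
model = lambda user, item: -abs(user - item)

print(phi(user=5, items=[1, 4, 6, 9], model=model, top_k=2))  # [4, 6]
```

The point is only the shape of φ: it consumes u, I, and M, and emits the recommended subset I*.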
Random Recommendation
M takes the item collection and selects items randomly.
Most-Popular Recommendation
M orders the item collection according to the number of interactions each item received, K ≥ L ≥ M ≥ N.
[Figure: items ranked by interaction count, from K interactions at the top down to N interactions at the bottom.]
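Most-popular recommendation can be sketched in a few lines; the interaction log format, a list of (user, item) pairs, is an assumption for illustration.

```python
from collections import Counter

# Most-popular recommendation on a toy interaction log.
interactions = [
    ("anna", "cars"), ("bob", "cars"), ("clara", "cars"),
    ("anna", "elektra"), ("bob", "elektra"),
    ("dan", "aviator"),
]

def most_popular(interactions, n=2):
    """M is simply a counter: rank items by their number of interactions."""
    counts = Counter(item for _, item in interactions)
    return [item for item, _ in counts.most_common(n)]

print(most_popular(interactions))  # ['cars', 'elektra']
```

Updating M amounts to incrementing a counter, which is why unpersonalised recommenders are so cheap to run.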
Summary: Unpersonalised Recommenders
Advantages
- low computational complexity
- easy to update M
- domain independent
Disadvantages
- disregard personal taste
- disregard context
- high chance of recommending already-known or unpopular items
Collaborative Filtering
Basic Assumptions
- systems have access to users’ preferences
- users with similar tastes in the past will continue to like similar items
- systems have means to compare users’ tastes
Distinctions
- model-based vs memory-based
- item-based vs user-based
Example
[Figure: users Anna, Bob, Clara, and Dan connected to the movies they liked (Aviator, Bad Boys, Cars, District 9, Elektra). Reading off the edges yields the user profiles, e.g. Anna: [Bad Boys, District 9, Elektra], Bob: [Aviator, Bad Boys, District 9, Elektra], Clara: [Cars, District 9, Elektra], Dan: [Aviator].]
Preference Elicitation
Explicit Preferences
- Likes
- Thumbs Up/Down
- Ratings
- Comments
- Purchases
Implicit Preferences
- Clicks
- Dwell Time
- Returns
How can we measure whether users like items, and how much they do?
Collaborative Filtering Algorithms with Ratings
Memory-based
The algorithm uses the complete set of data in the recommendation process; M contains the full rating matrix.
- user-based k-nearest neighbours
- item-based k-nearest neighbours
Model-based
The algorithm learns a model M and uses it to recommend items.
- matrix factorisation with ALS (alternating least squares)
- matrix factorisation with SGD (stochastic gradient descent)
User-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(u, v)
[Figure: the rating matrix R as a table of users (Anna, Bob, Clara, Dan) × movies (Aviator, Bad Boys, Cars, District 9, Elektra), with 1 marking a liked movie and 0 an unliked one.]
Similarity Measures
Number of items in common:
σ(u, v) = Σ_{i∈I} I(i),  where I(i) = 1 if both u and v liked i, and 0 otherwise
Cosine similarity:
σ(u, v) = (u · v) / (||u|| ||v||)
Pearson’s correlation coefficient:
σ(u, v) = cov(u, v) / (std(u) std(v))
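The three similarity measures can be sketched directly on binary preference vectors. The vectors for Anna and Bob below are illustrative, ordered as (Aviator, Bad Boys, Cars, District 9, Elektra).

```python
import math

def overlap(u, v):
    """Number of items both users liked."""
    return sum(1 for a, b in zip(u, v) if a and b)

def cosine(u, v):
    """sigma(u, v) = (u . v) / (||u|| ||v||)"""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pearson(u, v):
    """sigma(u, v) = cov(u, v) / (std(u) std(v))"""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u) / n)
    sv = math.sqrt(sum((b - mv) ** 2 for b in v) / n)
    return cov / (su * sv) if su and sv else 0.0

anna = [0, 1, 0, 1, 1]  # Bad Boys, District 9, Elektra
bob  = [1, 1, 0, 1, 1]  # Aviator, Bad Boys, District 9, Elektra
print(overlap(anna, bob), round(cosine(anna, bob), 3))  # 3 0.866
```

With binary vectors, cosine similarity is the overlap normalised by the profile sizes, which is why Anna and Bob come out highly similar here.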
User-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(u, v)
[Figure: the user–user similarity matrix (Anna, Bob, Clara, Dan on both axes) with ones on the diagonal; off-diagonal cells such as sim(Anna, Bob) = sim(Bob, Anna) are filled in symmetrically, yielding Anna’s similarity vector [1, sBob, sClara, sDan].]
User-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(u, v)
[Figure: the rating matrix (Anna, Bob, Clara, Dan × Aviator, Bad Boys, Cars, District 9, Elektra) with a ? marking an unknown preference to be predicted.]
User-based k-nearest Neighbour
Recommendation procedure
user profile:
u = (r(i1), r(i2), . . . , r(iN))
similarity vector:
σ(u, ·) = (σ(u, v1), σ(u, v2), . . . , σ(u, u), . . . , σ(u, vM))
preference prediction (similarity-weighted sum of the other users’ ratings):
r̂(u, j) = Σ_v σ(u, v) · r(v, j)
Result
We obtain a prediction for each item’s preference and can rank items accordingly. The algorithm returns as many items as requested, starting from the top rank.
Item-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(i , j)
[Figure: the same rating matrix, now read column-wise: each movie’s column of 0/1 entries forms its item profile.]
Similarity Measures
Number of users in common:
σ(i, j) = Σ_{u∈U} I(u),  where I(u) = 1 if both i and j are liked by u, and 0 otherwise
Cosine similarity:
σ(i, j) = (i · j) / (||i|| ||j||)
Pearson’s correlation coefficient:
σ(i, j) = cov(i, j) / (std(i) std(j))
Item-based k-nearest Neighbour
Input: M × N rating matrix R, similarity measure σ(i , j)
[Figure: the item–item similarity matrix (Aviator, Bad Boys, Cars, District 9, Elektra on both axes) with ones on the diagonal and symmetric entries such as sim(Aviator, Bad Boys) = sim(Bad Boys, Aviator); followed by the rating matrix with a ? marking the preference to be predicted.]
Item-based k-nearest Neighbour
Recommendation procedure
item profile:
i = (r(u1), r(u2), . . . , r(uM))
similarity vector:
σ(i, ·) = (σ(i, j1), σ(i, j2), . . . , σ(i, i), . . . , σ(i, jN))
preference prediction (similarity-weighted sum over the items the user rated):
r̂(u, i) = Σ_j σ(i, j) · r(u, j)
Result
We obtain a prediction for each item’s preference and can rank items accordingly. The algorithm returns as many items as requested, starting from the top rank.
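The item-based variant can be sketched the same way: score an item by summing its similarity to the items the user already liked. The item columns and the overlap similarity below are illustrative assumptions.

```python
# Item profiles read column-wise from the toy rating matrix: set of users who liked each movie.
cols = {
    "aviator":    {"bob", "dan"},
    "bad boys":   {"anna", "bob"},
    "cars":       {"clara"},
    "district 9": {"anna", "bob", "clara"},
    "elektra":    {"anna", "bob", "clara"},
}

def overlap(i, j):
    """Number of users who liked both items."""
    return len(cols[i] & cols[j])

def score(user_items, i):
    """r_hat(u, i) = sum over j in the user's profile of sigma(i, j)."""
    return sum(overlap(i, j) for j in user_items if j != i)

anna = {"bad boys", "district 9", "elektra"}
print(score(anna, "aviator"), score(anna, "cars"))  # 3 2
```

For Anna, Aviator outscores Cars because Aviator co-occurs with more of the movies she already liked.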
Matrix Factorisation
Input: M × N rating matrix R
R = [sparse binary rating matrix: 1 where a user expressed a preference, blank where the preference is unknown]

Goal
Fill the gaps of missing preferences.
Matrix Factorisation
Idea
Project preferences into a low-dimensional space to detect latent structures.

[R]_{M×N} ≈ [P]_{M×K} [Q]ᵀ_{N×K},  with K ≪ M, N

Problem
How do we determine P and Q?
Matrix Factorisation
Learning P and Q
Input: an error metric, e.g.

E(P, Q, R) = Σ_{(u,i)∈R} (r(u, i) − P_u Q_iᵀ)²   (quadratic error)

E(P, Q, R) = Σ_{(u,i)∈R} |r(u, i) − P_u Q_iᵀ|   (absolute error)
Matrix Factorisation
Stochastic Gradient Descent
Optimise the error metric by selecting data points at random:
- initialise P, Q with small random values
- pick a preference (u, i) at random
- determine the gradient at that point
- adjust P, Q accordingly
- continue
Alternating Least Squares
Optimise either P or Q while keeping the other fixed:
- initialise P, Q with small random values
- optimise the error metric with respect to P
- optimise the error metric with respect to Q
- continue
Summary: Collaborative Filtering
Advantages
- takes personal taste into account
- successful in the Netflix Prize competition
- domain-independent
Disadvantages
- cold-start problem
- sparsity
- grey sheep
Cold-Start Problem
- users without known preferences
- items without preferences
- similarity measures fail
- inconclusive latent factors
Grey Sheep
- users who rate all their items as average
- user profile: [3, 3, 3, 3, . . . , 3]
- collaborative systems cannot distinguish good from bad items
Content-based Filtering
Idea
Suggest items which are similar to items users have liked.
Similarity
- based on content → features
- depends on the domain
Content-based Filtering
Input: user profile, item collection, item features, and similaritymeasure
Features
- Name/ID
- Meta data
- Content
  - audio stream → songs
  - video stream → movies
  - text → books, news articles
[Figure: the CBF component computes sim(i, j) between item features.]
Content-based Filtering
Similarity: Examples
- keyword overlap → text
- average colour match → images/video
- maximum amplitude → audio/sound
- common actors → movies
- common interests → friends/partnerships
Summary: Content-based Filtering
Advantages
- considers personal taste
- high transparency: users can anticipate why items are suggested
Disadvantages
- computationally expensive for high-volume content, e.g., video
- low serendipity
- user cold-start
Evaluation
Important aspects
- how well does the system predict preferences?
- how often do users receive useful suggestions?
- how long does it take for the system to provide suggestions?
- how many requests cannot be answered?
- how often do users return to the site?
- how often do users purchase/rent/consume items which the system had recommended?
- how well did users perceive the system?
Evaluation: Rating Prediction
Goal
The evaluation ought to show how well the system estimates preferences.
Assumptions
- the system can access recorded explicit numerical preferences
- tastes remain stable over time
- the more accurately the system estimates preferences, the better suited the suggestions
Metrics
- root mean squared error: RMSE = √( (1/|R|) Σ_{(u,i)∈R} (r̂(u, i) − r(u, i))² )
- mean absolute error: MAE = (1/|R|) Σ_{(u,i)∈R} |r̂(u, i) − r(u, i)|
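Both metrics can be sketched directly on a toy list of (predicted, actual) rating pairs; the numbers are illustrative.

```python
import math

# (predicted, actual) rating pairs.
pairs = [(4.2, 4.0), (3.0, 2.0), (5.0, 4.5)]

# RMSE penalises large errors more strongly than MAE.
rmse = math.sqrt(sum((p - a) ** 2 for p, a in pairs) / len(pairs))
mae = sum(abs(p - a) for p, a in pairs) / len(pairs)
print(round(rmse, 4), round(mae, 4))  # 0.6557 0.5667
```

The middle pair (error 1.0) dominates the RMSE because of the squaring, which is why RMSE exceeds MAE here.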
Evaluation: Ranking
Goal
The evaluation ought to show how well the system ranks items according to users’ preferences.
Assumptions
- the system can access preference relations between items
- tastes remain stable over time
- the better the system ranks items, the better suited the suggestions
Metrics
- normalised discounted cumulative gain: nDCG = DCG / IDCG
- mean reciprocal rank: MRR = (1/|U|) Σ_{u∈U} 1/rank_u
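Both ranking metrics can be sketched on toy data. The relevance grades and ranks below are illustrative; the logarithmic discount base 2 is a common convention, assumed here.

```python
import math

def dcg(rels):
    """Discounted cumulative gain for relevances in ranked order (position 0 = top)."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(rels))

def ndcg(rels):
    """DCG normalised by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(rels, reverse=True))
    return dcg(rels) / ideal if ideal else 0.0

def mrr(first_hit_ranks):
    """Mean reciprocal rank over users; ranks are 1-based."""
    return sum(1.0 / r for r in first_hit_ranks) / len(first_hit_ranks)

# A ranking that puts a relevance-0 item above a relevance-2 item loses some gain.
print(round(ndcg([3, 0, 2]), 4), round(mrr([1, 2, 4]), 4))
```

nDCG rewards placing highly relevant items early; MRR only looks at where each user’s first relevant item appears.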
Evaluation: Top-N
Goal
The evaluation ought to show how well the system selects the top suggestions.
Assumptions
- the system can access preference relations between items
- tastes remain stable over time
- the better the system selects the top suggestions, the better suited they are
Metrics
- precision@N = TP / (TP + FP)
- recall@N = TP / (TP + FN)
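Precision@N and recall@N can be sketched on a toy top-N list; the recommended and relevant sets below are illustrative.

```python
# Top-3 suggestions produced by a hypothetical system.
recommended = ["cars", "elektra", "aviator"]
# Items the user actually likes (the relevant set).
relevant = {"cars", "aviator", "bad boys", "district 9"}

hits = [i for i in recommended if i in relevant]  # true positives
precision = len(hits) / len(recommended)          # TP / (TP + FP)
recall = len(hits) / len(relevant)                # TP / (TP + FN)
print(round(precision, 4), recall)  # 0.6667 0.5
```

Precision asks how much of the list was useful; recall asks how much of the user’s taste the list covered.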
Evaluation: Problems
- explicit preferences may not be available
- tastes change over time
- recorded data do not fully reflect the current situation
Solution
Access real systems with current user interactions to see whether a method performs better than the existing one → second part of the tutorial
Summary: Recommender Systems
- support users by suggesting interesting items
- counteract information overload
- unpersonalised recommenders
- collaborative filtering
  - user-based k-nearest neighbours
  - item-based k-nearest neighbours
  - matrix factorisation
- content-based filtering
- evaluation is still difficult
News Recommendation: Special Characteristics
Collection Dynamics
- thousands of new articles published daily
- older articles’ relevance decays
Contextual Differences
- users perceive recommendations differently
- devices render recommendations differently
- dependence on time of day and day of week
Popularity Bias
- few items receive a lot of attention
- most items receive hardly any attention
News Recommendation: Collection Dynamics
[Figure: daily counts of articles entering and exiting the collection between October and January, on the order of 500 to 2000 per day.]
News Recommendation: Contextual Differences
[Figure: heatmaps of interaction intensity by hour of day (0 to 18+) and weekday (Mon to Sun), split by device type: desktop, phone, and tablet; intensity scale 0.000 to 0.014.]
News Recommendation: Popularity Bias
[Figure: log-log histograms of interactions per item for news (frequencies up to 10^4, interaction counts up to 10^6) and for movies (frequencies up to 10^2, interaction counts up to 10^4); in both domains, few items accumulate many interactions while most receive very few.]
Big Data
Goal
Intelligent real-time processing of huge amounts of data. Recommender systems → personalisation
- volume → the amount of data to be stored increases
- variety → heterogeneous data
- velocity → data streams in (near) real-time
- veracity → noisy data
Big Data
Do news recommendations fulfil the requirements of big data?
Volume
hundreds of GB every day ✓
Variety
news entail textual data and images, inducing some variety
Velocity
news arise continuously → second part of the tutorial ✓
Veracity
news have some consistent attributes (headline, text), but also comprise features which are missing or wrong (date, location, image)
Questions?
Thank you for your attention! We hope you enjoyed the first part of the tutorial! There is more (practical content) to come in the second part!