kNN CF: A Temporal Social Network
Neal Lathia, Stephen Hailes, Licia Capra
University College London
RecSys’08
Advisor: Hsin-Hsi Chen
Reporter: Y.H Chang
2009/03/09
2009/03/09kNN CF: A Temporal
Social Network 2/25
INTRODUCTION(1/4)
Recommender systems have become an important component, or even a core technology, of online business.
Ex: Amazon, Netflix (Netflix Prize competition)
The process of computing recommendations is reduced to the problem of predicting the correct rating that users would give to unrated items.
INTRODUCTION(2/4)
k-Nearest Neighbor Collaborative Filtering (kNN CF / kNN) has surfaced amongst the most popular underlying algorithms of recommender systems.
Collaborative Filtering: using a set of user rating profiles to predict ratings of unrated items.
INTRODUCTION(3/4)
In order to understand the effect of kNN, the algorithm can be viewed as a process that generates a social network graph, where nodes are users and edges connect k similar users.
In this work, (1) we analyse the user-user kNN graph from a temporal perspective, and (2) we observe the emergent properties of the entire graph as the algorithm parameters change.
INTRODUCTION(4/4)
The analysis is decomposed into four separate stages:
1. Individual Nodes
2. Node Pairs
3. Node Neighborhoods
4. Community Graphs
I. USER PROFILES OVER TIME
USER PROFILES OVER TIME (1/2)
In this work we focus on the two MovieLens datasets:
1. 100t MovieLens: 100,000 ratings of 1682 movies by 943 users (1997.09.20 to 1998.04.22).
2. 1000t MovieLens: about 1 million ratings of 3900 movies by 6040 users (2000.04.25 to 2003.02.28).
USER PROFILES OVER TIME (2/2)
II. USER PAIRS OVER TIME
USER PAIRS OVER TIME(1/6)
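Predictions are often computed as a weighted average of deviations from neighbor means. For user a and item i, with b ranging over a's neighbors:

$$\hat{r}_{a,i} = \bar{r}_a + \frac{\sum_{b} (r_{b,i} - \bar{r}_b)\, w_{a,b}}{\sum_{b} w_{a,b}}$$

$\hat{r}_{a,i}$: predicted rating of item i for user a
$\bar{r}_a$: user a's mean rating
$r_{b,i}$: neighbor b's rating of item i
$\bar{r}_b$: neighbor b's mean rating
$w_{a,b}$: similarity between user a and its neighbor b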
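The weighted-average prediction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary layout, the function names `mean_rating` and `predict`, and the fallback to the user's own mean when no neighbor has rated the item are all assumptions made here.

```python
from typing import Dict

# Assumed layout: user -> {item -> rating}
Ratings = Dict[str, Dict[str, float]]


def mean_rating(profile: Dict[str, float]) -> float:
    """Mean of all ratings in one user's profile."""
    return sum(profile.values()) / len(profile)


def predict(ratings: Ratings, sims: Dict[str, float], a: str, i: str) -> float:
    """Predict user a's rating of item i.

    sims maps each neighbor b of a to its similarity with a.
    Only neighbors that have rated item i contribute.
    """
    base = mean_rating(ratings[a])
    numerator = 0.0
    denominator = 0.0
    for b, weight in sims.items():
        if i in ratings[b]:
            # Deviation of b's rating from b's own mean, weighted by similarity.
            numerator += (ratings[b][i] - mean_rating(ratings[b])) * weight
            denominator += weight
    if denominator == 0:
        # Assumption: fall back to a's mean when no neighbor can contribute.
        return base
    return base + numerator / denominator
```

With a single neighbor of similarity 1, the prediction is simply the user's mean shifted by that neighbor's deviation from their own mean.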
USER PAIRS OVER TIME(2/6) - four highly cited methods of the similarity between users
Total n items
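The formulas on this slide did not survive in the text, so the sketch below shows common textbook forms of the four measures named later in the deck (COR, PCC, wPCC, VS) rather than the slide's exact definitions. Several choices are assumptions made here: COR is taken as the proportion of co-rated items over the union of both profiles, PCC and VS are computed over co-rated items only, and the wPCC significance threshold of 50 is a commonly used value, not one confirmed by the source.

```python
import math
from typing import Dict, List

Profile = Dict[str, float]  # item -> rating


def co_rated(ra: Profile, rb: Profile) -> List[str]:
    """Items rated by both users."""
    return sorted(set(ra) & set(rb))


def cor(ra: Profile, rb: Profile) -> float:
    """Co-rated proportion (assumed here: intersection over union)."""
    union = set(ra) | set(rb)
    return len(set(ra) & set(rb)) / len(union) if union else 0.0


def pcc(ra: Profile, rb: Profile) -> float:
    """Pearson correlation over co-rated items; 0.0 when undefined."""
    common = co_rated(ra, rb)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ra[i] for i in common) / len(common)
    mean_b = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - mean_a) * (rb[i] - mean_b) for i in common)
    dev_a = math.sqrt(sum((ra[i] - mean_a) ** 2 for i in common))
    dev_b = math.sqrt(sum((rb[i] - mean_b) ** 2 for i in common))
    if dev_a == 0 or dev_b == 0:
        return 0.0
    return num / (dev_a * dev_b)


def wpcc(ra: Profile, rb: Profile, threshold: int = 50) -> float:
    """PCC scaled down when few items are co-rated (threshold is assumed)."""
    n = len(co_rated(ra, rb))
    return pcc(ra, rb) * min(n, threshold) / threshold


def vs(ra: Profile, rb: Profile) -> float:
    """Cosine (vector) similarity over co-rated items."""
    common = co_rated(ra, rb)
    num = sum(ra[i] * rb[i] for i in common)
    norm_a = math.sqrt(sum(ra[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(rb[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return num / (norm_a * norm_b)
```

Under these definitions, two users with only a few co-rated items can already reach a VS or PCC value near 1, while COR and wPCC stay small until more overlap accumulates.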
USER PAIRS OVER TIME(3/6) -evolution of similarity
USER PAIRS OVER TIME(4/6)
In this work we plot the similarity at time t, sim(t), against the similarity at the time of the next update, sim(t + 1).
The distance from each point to the diagonal represents the change from one update to the next.
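As a small illustration of this plot, the sketch below turns a sequence of similarity values for one user pair into (sim(t), sim(t + 1)) points and measures how far each point lies from the diagonal. The function names and the input format (a plain list with one value per update) are hypothetical.

```python
import math
from typing import List, Tuple


def update_pairs(series: List[float]) -> List[Tuple[float, float]]:
    """Pair each similarity value with the value at the next update."""
    return list(zip(series[:-1], series[1:]))


def distance_to_diagonal(point: Tuple[float, float]) -> float:
    """Perpendicular distance from (sim(t), sim(t + 1)) to the line y = x."""
    x, y = point
    return abs(y - x) / math.sqrt(2)
```

A point on the diagonal has distance 0, meaning the similarity did not change at that update.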
USER PAIRS OVER TIME(5/6) - sim(t) against sim(t+1)
[Figure: sim(t) plotted against sim(t + 1), one panel per method: COR, wPCC (range: -1 to +1), VS, PCC (range: -1 to +1)]
USER PAIRS OVER TIME(6/6)
We classified the similarity methods according to their temporal behavior:
1. Incremental: COR and wPCC. The difference between sim(t) and sim(t + 1) is small, and similarity grows gradually.
2. Corrective: VS. Similarity jumps from 0 to near-perfect, then degrades.
3. Near-random: PCC. Similarity shows near-random behavior from one update to the next.
III. DYNAMIC NEIGHBOURHOODS
DYNAMIC NEIGHBOURHOODS(1/2)
The often-cited assumption of collaborative filtering is that users who have been like-minded in the past will continue sharing opinions in the future.
When applying user-user kNN CF, we would therefore expect each user’s neighborhood to converge to a fixed set of neighbors over time.
DYNAMIC NEIGHBOURHOODS(2/2)
(In this experiment, updates were made daily.) The actual number of neighbors that a user will be connected to depends on:
1. the similarity measure
2. the neighborhood size k
[Figure: curves per similarity measure; axis labels: new recommenders left, time]
The steeper the curves are, the faster the user is meeting other recommenders.
COR and wPCC outperform VS and PCC (N. Lathia et al., 2008).
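One way to produce such a curve for a single user is sketched below, assuming the plotted quantity is the number of other users who have not yet appeared in that user's neighborhood. That reading of the axis label, the function name `recommenders_left`, and the input format (one neighbor set per daily update, never containing the user themself) are assumptions made here.

```python
from typing import Iterable, List, Set


def recommenders_left(daily_neighborhoods: Iterable[Set[int]], n_users: int) -> List[int]:
    """Count, after each update, how many other users have never been a neighbor.

    daily_neighborhoods: the user's top-k neighbor ids at each update.
    n_users: total number of users in the dataset, including this one.
    """
    met: Set[int] = set()
    left: List[int] = []
    for neighbors in daily_neighborhoods:
        met |= set(neighbors)
        # Exclude the user themself from the pool of possible recommenders.
        left.append(n_users - 1 - len(met))
    return left
```

A neighborhood that converges to a fixed set would make this curve flatten quickly; a curve that keeps dropping means the user keeps meeting new recommenders.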
IV. NEAREST-NEIGHBOUR GRAPHS
NEAREST-NEIGHBOUR GRAPHS(1/5)
In the last section, we focus on non-temporal characteristics of the dataset (using wPCC):
1. Path length
2. Connectedness (using only positive similarity)
3. Reciprocity: a characteristic of graphs explored in social network analysis; in this work, it is the proportion of users who are in each other’s top-k
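To make the graph view concrete, the sketch below builds a directed top-k graph from pairwise similarities (keeping only positive values) and measures reciprocity as the fraction of edges whose reverse edge also exists. The exact reciprocity definition used in the experiments is not given in the text, so this edge-based form is an assumption, as are the function names and the nested-dictionary input.

```python
from typing import Dict, Set

Sims = Dict[str, Dict[str, float]]  # user -> {other user -> similarity}


def top_k_graph(sims: Sims, k: int) -> Dict[str, Set[str]]:
    """Connect each user to their k most similar users with positive similarity."""
    graph: Dict[str, Set[str]] = {}
    for user, row in sims.items():
        candidates = [other for other in row if row[other] > 0]
        ranked = sorted(candidates, key=lambda other: row[other], reverse=True)
        graph[user] = set(ranked[:k])
    return graph


def reciprocity(graph: Dict[str, Set[str]]) -> float:
    """Fraction of directed edges u -> v for which v -> u also exists."""
    edges = [(u, v) for u, neighbors in graph.items() for v in neighbors]
    if not edges:
        return 0.0
    mutual = sum(1 for u, v in edges if u in graph.get(v, set()))
    return mutual / len(edges)
```

A value of 1.0 would mean every neighbor relationship is mutual; lower values indicate that many users rely on neighbors who do not rely on them in return.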
NEAREST-NEIGHBOUR GRAPHS(2/5)
NEAREST-NEIGHBOUR GRAPHS(3/5)
power law
(1) There may be some users who are not in any other user’s top-k. Their ratings are therefore inaccessible and will not be used in any prediction.
NEAREST-NEIGHBOUR GRAPHS(4/5)
(2) Some users will have an incredibly high in-degree. We call this group “power users”.
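Both observations can be read off the in-degree of each node in the directed kNN graph. The sketch below is a minimal illustration with a hypothetical graph layout (user mapped to the set of their top-k neighbors) and hypothetical function names; it is not the procedure used in the experiments.

```python
from collections import Counter
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # user -> set of that user's top-k neighbors


def in_degrees(graph: Graph) -> Counter:
    """Count how many users list each user among their neighbors."""
    counts = Counter({user: 0 for user in graph})
    for neighbors in graph.values():
        for neighbor in neighbors:
            counts[neighbor] += 1
    return counts


def unused_users(graph: Graph) -> List[str]:
    """Users in nobody's top-k, whose ratings never feed a prediction."""
    return sorted(user for user, degree in in_degrees(graph).items() if degree == 0)


def power_users(graph: Graph, n: int) -> List[str]:
    """The n users with the highest in-degree."""
    return [user for user, _ in in_degrees(graph).most_common(n)]
```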
NEAREST-NEIGHBOUR GRAPHS(5/5)
More experiments about “power users”:
1. Remove the power users’ ability to contribute to predictions.
2. Allow only the top power users to contribute to the prediction.
Results:
1. The remaining users can still make a significant contribution to each user’s predictions.
2. The 10 topmost power users hold access to over 50% of the dataset.
DISCUSSION
The evolution of similarity between any pair of users is dominated by the similarity method, and the four measures we explored can be classified into three categories (incremental, corrective, near-random) based on their temporal properties.
Measures that are known to perform better display the same behavior: they are incremental, connect each user to other recommenders more quickly, and offer broader access to the ratings in the training set.