Post on 13-Dec-2015
Copyright 2008 by CEBT
Outline
Introduction
Temporal Dynamics
Global effect
Baseline predictors
Time changing baseline predictors
User, Item effect
Bellkor function
Performance
Exploratory Study
Conclusion
Preliminaries
Quiz set and Probe set
– Given
(Kevin, Avatar, 2009/12/20, ★ ★ ★ ★ ★)
(Coca, 2012, 2009/12/10, ★ ★ ★ ★)
– Predict
(Kevin, District 9, 2009/12/18, ?????)
Training
– Dec 31, 1999 – Dec 31, 2005
– 100 million ratings
– 480 thousand users
– 17,770 movies
Testing
– 1.4 million ratings
Ratings Data
[Figure: sparse user-by-movie rating matrix (480,000 users × 17,770 movies) holding ratings from 1 to 5; most cells are empty]
Ratings Data
[Figure: the same user-by-movie rating matrix (480,000 users × 17,770 movies), with the most recent ratings replaced by "?" to form the test data set]
Training Data: 100 million ratings (labels known publicly)
Held-Out Data: 3 million ratings (labels known only to Netflix)
– Quiz Set: 1.5m ratings; scores posted on leaderboard
– Test Set: 1.5m ratings; scores known only to Netflix, used in determining the final winner
Scoring
Quality of the result is measured by RMSE
$$\mathrm{RMSE} = \sqrt{\frac{1}{|R|} \sum_{(u,i) \in R} \left( \hat{r}_{ui} - r_{ui} \right)^2}$$
Does not necessarily correlate well with user satisfaction
Baseline RMSE scores on test data
1.054 - just predict the mean user rating for each movie
0.953 - Netflix’s own system (Cinematch) as of 2006
0.941 - nearest-neighbor method using correlation
0.857 - required 10% reduction to win $1 million
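As a small illustration of the RMSE score used above, the following Python sketch computes it for paired lists of predicted and actual ratings. The function name `rmse` and the sample ratings are hypothetical and chosen only for illustration; they are not taken from the Netflix data.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error over paired lists of ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


# Hypothetical ratings, for illustration only.
actual = [5, 4, 3, 1]
predicted = [4, 4, 5, 1]
score = rmse(predicted, actual)
```

Because the errors are squared before averaging, one large miss costs more than several small ones.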
Considerations
User preference changes over time
Problem of Concept Drift
Instance Selection
Instance Weighting
Try different exponential time decay rates to address the problem
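Instance weighting with exponential time decay can be sketched as below. This is a minimal illustration under stated assumptions: the weight form exp(-rate × age), the function name `decay_weight`, and the sample ages and rates are all hypothetical, not the values used in the study.

```python
import math


def decay_weight(age_days, decay_rate):
    """Weight of a rating that is `age_days` old: exp(-decay_rate * age_days)."""
    return math.exp(-decay_rate * age_days)


# Hypothetical rating ages (in days) and candidate decay rates.
ages = [0, 30, 365]
weights_by_rate = {
    rate: [decay_weight(age, rate) for age in ages]
    for rate in (0.0, 0.001, 0.01)
}
```

A rate of 0.0 gives every rating the same weight, so comparing several rates shows how strongly old ratings are discounted.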
Considerations
Use the full extent of the time period, not only the present behavior
Key: extract signal from each time point while neglecting only the noise
Multiple changing concepts should be captured
User and/or item dependency
Users and items within a single framework
Do not try to extrapolate future temporal dynamics
Too difficult...
Components of a rating predictor
user-movie interaction, movie bias, user bias
User-movie interaction
Characterizes the matching between users and movies
Attracts most research in the field
Baseline predictor
• Separates users and movies
• Often overlooked
• Benefits from insights into users’ behavior
• Among the main practical contributions of the competition
(slide from Yehuda Koren)
Global temporal effects
Average movie rating made a jump
Ratings increase with the movie age at the time of the rating
Baseline predictors
Baseline predictor is: $b_{ui} = \mu + b_u + b_i$, where
– $\mu$ is the overall average rating
– $b_u$ and $b_i$ are the observed deviations of user u and item i
What the deviations capture:
– Rating scale of user u
– Values of other ratings the user gave (day-specific mood, anchoring, multi-user accounts)
– Popularity of movie i
– Selection bias; related to the number of ratings the user gave on the same day (“frequency”)
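A baseline predictor of the form overall average plus user deviation plus item deviation can be sketched as follows. This is only an illustration under simplifying assumptions: the deviations are estimated by plain averaging with no regularization, and the names `fit_baseline` and `predict` and the toy ratings are hypothetical.

```python
from collections import defaultdict


def fit_baseline(ratings):
    """ratings: list of (user, item, rating). Returns (mu, b_u, b_i)."""
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # Item deviation: average of (rating - mu) over the item's ratings.
    item_resid = defaultdict(list)
    for _, item, r in ratings:
        item_resid[item].append(r - mu)
    b_i = {item: sum(v) / len(v) for item, v in item_resid.items()}

    # User deviation: average of what remains after removing mu and b_i.
    user_resid = defaultdict(list)
    for user, item, r in ratings:
        user_resid[user].append(r - mu - b_i[item])
    b_u = {user: sum(v) / len(v) for user, v in user_resid.items()}
    return mu, b_u, b_i


def predict(mu, b_u, b_i, user, item):
    # Unseen users or items fall back to a zero deviation.
    return mu + b_u.get(user, 0.0) + b_i.get(item, 0.0)


# Toy ratings, for illustration only.
toy = [("kevin", "avatar", 5), ("coca", "avatar", 3),
       ("kevin", "2012", 4), ("coca", "2012", 2)]
mu, b_u, b_i = fit_baseline(toy)
```

In this toy data the user "kevin" rates one star above what the overall average and item deviation suggest, so his deviation comes out positive.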
Time changing baseline predictors
Two major temporal effects:
Item’s popularity changes over time
Users change their baseline rating over time
Treat the parameters $b_u$ and $b_i$ as functions of time: $b_{ui}(t) = \mu + b_u(t) + b_i(t)$
Item temporal effect
Bin size balances time resolution against having enough ratings per bin
Each bin corresponds to roughly ten consecutive weeks of data
30 bins spanning all days in the dataset
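The binning described above can be sketched as a function that maps a rating date to one of 30 bins. The date range is the training period given earlier in the slides; equal-width bins and the name `time_bin` are assumptions made for illustration.

```python
from datetime import date

FIRST_DAY = date(1999, 12, 31)
LAST_DAY = date(2005, 12, 31)


def time_bin(day, first_day=FIRST_DAY, last_day=LAST_DAY, n_bins=30):
    """Map a rating date to an equal-width bin index in [0, n_bins - 1]."""
    total_days = (last_day - first_day).days + 1
    offset = (day - first_day).days
    if not 0 <= offset < total_days:
        raise ValueError("date outside the dataset range")
    return offset * n_bins // total_days


# Approximate length of one bin in weeks.
weeks_per_bin = ((LAST_DAY - FIRST_DAY).days + 1) / 30 / 7
```

With this date range each bin covers a little over ten weeks, which matches the slide. A per-item offset for each bin could then be looked up by this index.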
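User temporal effect
Static model: $b_u(t) = b_u$
Linear model: $b_u(t) = b_u + \alpha_u \cdot \mathrm{dev}_u(t)$, where $\mathrm{dev}_u(t) = \mathrm{sign}(t - t_u) \cdot |t - t_u|^{\beta}$, $t_u$ is the mean date of user u's ratings, and $\beta = 0.4$
Spline model
Result (RMSE, lower is better): static 0.9799, linear 0.9605, spline 0.9603 (linear and spline are nearly equal)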
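A linear user-bias drift can be sketched as below. The slide text does not spell out the formula, so this assumes the form b_u + alpha_u × dev_u(t), with dev_u(t) = sign(t - t_u) × |t - t_u|^beta. It also reads the slide's "b=0.4" as the exponent beta. The names `dev` and `user_bias_linear` and the sample values are hypothetical.

```python
def dev(t, t_mean, beta=0.4):
    """Signed, damped distance (in days) of day t from the user's mean rating day."""
    diff = t - t_mean
    sign = (diff > 0) - (diff < 0)
    return sign * abs(diff) ** beta


def user_bias_linear(b_u, alpha_u, t, t_mean, beta=0.4):
    """User bias that drifts with time; alpha_u = 0 recovers the static bias."""
    return b_u + alpha_u * dev(t, t_mean, beta)


# Hypothetical user: mean rating day 10, static bias 0.2, drift coefficient 0.05.
bias_on_day_42 = user_bias_linear(0.2, 0.05, t=42, t_mean=10)
```

An exponent below 1 damps the effect of dates far from the user's mean rating day, so the bias drifts quickly near that day and flattens further away.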
Periodic effect
Dayparting
Some products can be more popular in specific seasons or near certain holidays
Different types of television or radio shows are popular throughout different segments of the day
Season, day-of-week effect
Unfortunately, periodic effects do not show significant predictive power
Performance
1.054 - just predict the mean user rating for each movie
0.953 - Netflix’s own system (Cinematch) as of 2006
0.941 - nearest-neighbor method using correlation
0.864 - Bellkor algorithm 2008
0.856 - Bellkor algorithm 2009, 10.05% improvement
An Exploratory Study
Sudden rise in the average movie rating (early 2004)
Technical improvements in Netflix Cinematch
GUI improvements
Meaning of rating changed
‘Normal User’ increases (?)
An Exploratory Study
Movie’s age
Users prefer new movies for no particular reason
Older movies are just inherently better than newer ones (x)