
EigenTaste: A Constant Time Collaborative Filtering Algorithm

Date post: 14-Jan-2016
Page 1: EigenTaste: A Constant Time Collaborative Filtering Algorithm

EigenTaste: A Constant Time Collaborative Filtering Algorithm

Ken Goldberg

Students: Theresa Roeder, Dhruv Gupta, Chris Perkins

Industrial Engineering and Operations Research

Electrical Engineering and Computer Science

UC Berkeley

Page 2: EigenTaste: A Constant Time Collaborative Filtering Algorithm

CF Problem Definition

• A set of objects (movies, books, jokes)

• A user rates a subset of objects

• Based on the ratings, retrieve objects from the complement of this subset. Criteria:
  – Effective: recommended objects should receive high ratings
  – Efficient: the online recommendation process should run quickly and be scalable

Page 3: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Some Previous Work

• D. Goldberg, et al. - Tapestry (1992)

• Riedl, Resnick, Konstan et al. - GroupLens (1994-)

• Shardanand and Maes - Ringo (1995)

• Resnick and Varian (1997)

• Breese et al. at Microsoft Research (1998)

• Pazzani (1999)

• Herlocker et al. - GroupLens (1999)

Page 4: EigenTaste: A Constant Time Collaborative Filtering Algorithm

WWW-based Recommender Systems

Firefly

MovieCritic

MovieLens

Page 5: EigenTaste: A Constant Time Collaborative Filtering Algorithm

EigenTaste Algorithm

1) Principal Component Analysis
2) Universal Queries (dense ratings matrix)
3) Fine-grained ratings bar (captures nuances)
4) Offline and Online Processing
5) Online: Constant time recommendations

Page 6: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Universal Queries

• Most CF systems require users to select which items they want to rate: sparse ratings matrix

• Eigentaste allows users to rate all items based on short unbiased descriptions (e.g., film synopsis)

• Eigentaste uses a subset of highly discriminatory items for the gauge set

Page 7: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Continuous Rating Scale

[Figure: continuous rating bar running from "Disapprove" to "Approve"]

Page 8: EigenTaste: A Constant Time Collaborative Filtering Algorithm

EigenTaste Algorithm

• A is the n × m normalized rating matrix
  – n users
  – m objects

• C is the k × k reduced correlation matrix
  – k objects in the gauge set
  – $C = \frac{1}{n} A^T A$, with A restricted to the k gauge-set columns
  – assumes ratings are continuous with a linear relationship

• E is the orthogonal matrix of eigenvectors of C; $\Lambda$ is the diagonal matrix of eigenvalues

Page 9: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Correlation Matrix

Page 10: EigenTaste: A Constant Time Collaborative Filtering Algorithm

EigenTaste

• $E C E^T = \Lambda$

• $C = E^T \Lambda E$

• Let $B = A E^T$

• $R_B = \frac{1}{n} B^T B = E C E^T = \Lambda$
  – transformed points are uncorrelated and each column of B has variance $\lambda_i$

• Principal Components (Pearson 1901)
  – consider the eigenvectors with the m largest eigenvalues, $E_m$

• $B_m = A E_m^T$

• choose m based on “knee” in eigenvalues
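The relations on this slide can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes NumPy, uses synthetic ratings on a hypothetical -10 to 10 scale, and the helper name `eigen_decompose` is made up for illustration. Eigenvectors are stored as rows of E so that the slide's convention $E C E^T = \Lambda$ holds.

```python
import numpy as np

def eigen_decompose(A):
    """Return (E, lam) for C = (1/n) A^T A.

    Rows of E are eigenvectors of C, sorted by decreasing eigenvalue,
    so that E @ C @ E.T is the diagonal matrix of eigenvalues.
    """
    n = A.shape[0]
    C = (A.T @ A) / n
    lam, vecs = np.linalg.eigh(C)      # ascending eigenvalues, eigenvectors in columns
    order = np.argsort(lam)[::-1]      # reorder to descending eigenvalues
    return vecs[:, order].T, lam[order]

# Synthetic data (assumption): 200 users, k = 5 gauge-set items.
rng = np.random.default_rng(0)
ratings = rng.uniform(-10, 10, size=(200, 5))
A = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)  # normalize each column

E, lam = eigen_decompose(A)
B = A @ E.T                            # transformed points
R_B = (B.T @ B) / A.shape[0]           # should equal diag(lam) up to rounding
```

Here `R_B` comes out diagonal, which is the "transformed points are uncorrelated" statement, and its diagonal entries are the eigenvalues.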

Page 11: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Dimensionality Reduction

• First two principal components (eigenvectors) account for nearly 50% of the variation in user ratings

• Project user ratings along first two principal components: $x = A E_2^T$

• Facilitates visualization ...
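A minimal sketch of the two-component projection, assuming NumPy and synthetic ratings generated from two hypothetical latent factors. The function name `project_2d` is illustrative only, and the fraction of variance it reports depends entirely on the made-up data, so it should not be read as reproducing the "nearly 50%" figure.

```python
import numpy as np

def project_2d(A):
    """Project normalized ratings A (n x k) onto the first two principal components.

    Returns the n x 2 coordinates and the fraction of total variance
    carried by those two components.
    """
    n = A.shape[0]
    C = (A.T @ A) / n
    lam, vecs = np.linalg.eigh(C)          # ascending order
    order = np.argsort(lam)[::-1]
    lam, vecs = lam[order], vecs[:, order]
    E2 = vecs[:, :2].T                     # 2 x k, rows are the top two eigenvectors
    x = A @ E2.T                           # n x 2 coordinates on the eigen plane
    explained = lam[:2].sum() / lam.sum()
    return x, explained

# Synthetic data (assumption): 500 users, 6 gauge items driven by 2 latent tastes.
rng = np.random.default_rng(1)
latent = rng.standard_normal((500, 2))
loadings = rng.standard_normal((2, 6))
ratings = latent @ loadings + 0.3 * rng.standard_normal((500, 6))
A = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)

x, explained = project_2d(A)
print(f"first two components explain {explained:.0%} of the variance")
```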

Page 12: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Eigen Plane

Recursive Clustering

Page 13: EigenTaste: A Constant Time Collaborative Filtering Algorithm

The EigenTaste Algorithm

• Offline:
  – Compute eigenvectors and project users onto eigen plane.
  – Cluster and compute average ratings for each cluster.

• Online:
  – Collect ratings for objects in gauge set
  – Project onto the eigen plane
  – Find representative cluster
  – Recommend objects based on average ratings within that cluster
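The offline and online steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses NumPy, synthetic ratings, and a uniform `GRID` x `GRID` partition of the eigen plane as a simplified stand-in for the recursive clustering named on the earlier slide. All function and variable names are hypothetical.

```python
import numpy as np

GRID = 4  # hypothetical: uniform 4 x 4 grid of clusters on the eigen plane

def cell_of(x, lo, hi):
    """Map eigen-plane coordinates (shape (..., 2)) to a grid cell index."""
    frac = (x - lo) / (hi - lo)
    idx = np.clip((frac * GRID).astype(int), 0, GRID - 1)
    return idx[..., 0] * GRID + idx[..., 1]

def fit_offline(gauge, others):
    """gauge: n x k normalized gauge-set ratings; others: n x r ratings of the other objects."""
    n = gauge.shape[0]
    C = (gauge.T @ gauge) / n
    lam, vecs = np.linalg.eigh(C)
    E2 = vecs[:, np.argsort(lam)[::-1][:2]].T        # top two eigenvectors as rows
    x = gauge @ E2.T                                 # project all users
    lo, hi = x.min(axis=0), x.max(axis=0)
    cells = cell_of(x, lo, hi)
    # Empty cells fall back to the global average rating.
    means = np.tile(others.mean(axis=0), (GRID * GRID, 1))
    for c in np.unique(cells):
        means[c] = others[cells == c].mean(axis=0)
    ranked = np.argsort(-means, axis=1)              # best-rated objects first, per cell
    return {"E2": E2, "lo": lo, "hi": hi, "ranked": ranked}

def recommend_online(model, new_gauge, top=3):
    """Project one new user's gauge ratings, look up the cluster, return top objects."""
    x = new_gauge @ model["E2"].T                    # O(k) work
    c = cell_of(x, model["lo"], model["hi"])
    return model["ranked"][c, :top]

# Synthetic data (assumption): 300 users, 5 gauge items, 8 other objects.
rng = np.random.default_rng(2)
raw = rng.standard_normal((300, 5))
gauge = (raw - raw.mean(axis=0)) / raw.std(axis=0)
others = rng.uniform(-10, 10, size=(300, 8))

model = fit_offline(gauge, others)
recs = recommend_online(model, gauge[0])
```

The online path touches only the stored eigenvectors and one precomputed row of `ranked`, so its cost does not grow with the number of users.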

Page 14: EigenTaste: A Constant Time Collaborative Filtering Algorithm

First Application (1999)
Jester: Recommending Jokes

• Sense of humor is difficult to specify

• Advantages:
  – Rating process is not altogether unpleasant
  – Can evaluate jokes quickly
  – Dense ratings matrix (large sample size)

• Disadvantages:
  – Offensive/Shaggy Dog jokes
  – Temporal Effects, Portfolio Effects
  – Priming/Masking

Page 15: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Jester: User Interface

Page 16: EigenTaste: A Constant Time Collaborative Filtering Algorithm

System Architecture

[Diagram labels: Client, Internet, Web Server, CGI, Login Interface, CGI, Recommendation Engine, User Rating Profiles, Content Database]

Page 17: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Measure of Effectiveness

Metric: Normalized Mean Absolute Error (NMAE): Average absolute deviation of actual ratings from predicted ratings, normalized over rating range.

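$\mathrm{MAE} = \frac{1}{c} \sum_{i=1}^{c} |r_i - p_i|$

$\mathrm{NMAE} = \frac{\mathrm{MAE}}{r_{\max} - r_{\min}}$

where $r_i$ is an actual rating, $p_i$ the corresponding predicted rating, and c the number of ratings compared.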
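A small sketch of the metric, assuming NumPy. The function name `nmae` and the example rating range of -10 to 10 are illustrative assumptions, not values stated on the slide.

```python
import numpy as np

def nmae(actual, predicted, r_min, r_max):
    """Normalized mean absolute error over paired actual/predicted ratings."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mae = np.mean(np.abs(actual - predicted))   # average absolute deviation
    return mae / (r_max - r_min)                # normalize by the rating range

# Absolute errors are 1, 2 and 3, so MAE = 2; with a range of 20, NMAE = 0.1.
score = nmae([2.0, -4.0, 6.0], [1.0, -2.0, 9.0], r_min=-10.0, r_max=10.0)
```

Dividing by the rating range makes scores comparable across systems that use different rating scales.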

Page 18: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Effectiveness

Algorithm              NMAE
POP                    0.203
1 Nearest Neighbor     0.237
80 Nearest Neighbors   0.187
EigenTaste             0.187

Based on 18,000 users

Page 19: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Computational Complexity

n - number of users
k - number of objects in gauge set

Nearest Neighbor algorithm: Online processing - $O(kn)$

EigenTaste algorithm: Offline processing - $O(k^2 n)$; Online processing - $O(k)$
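To make the difference in online cost concrete, here is a hedged sketch assuming NumPy and synthetic data. The squared Euclidean distance used for the nearest-neighbor scan is an assumption, since the slides do not name the similarity measure, and both function names are hypothetical.

```python
import numpy as np

def nearest_neighbor_online(profiles, new_user):
    """Compare the new user against all n stored profiles: O(kn) per request."""
    dists = np.sum((profiles - new_user) ** 2, axis=1)   # n distances, each over k items
    return int(np.argmin(dists))

def eigentaste_online(E2, new_user):
    """Project onto two stored eigenvectors: O(k) per request, independent of n."""
    return E2 @ new_user

# Synthetic data (assumption): n = 1000 stored users, k = 10 gauge items.
rng = np.random.default_rng(3)
profiles = rng.standard_normal((1000, 10))
E2 = np.linalg.qr(rng.standard_normal((10, 2)))[0].T     # stand-in 2 x k orthonormal rows

neighbor = nearest_neighbor_online(profiles, profiles[42])
coords = eigentaste_online(E2, profiles[42])
```

The first function's work grows with the number of stored profiles, while the second only ever touches a 2 x k matrix.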

Page 20: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Effectiveness and Efficiency

Algorithm              NMAE    Offline complexity   Online complexity
POP                    0.203   O(nm)                O(1)
1 Nearest Neighbor     0.237   O(1)                 O(nk)
80 Nearest Neighbors   0.187   O(1)                 O(nk)
Eigentaste             0.187   O(k^2 n)             O(k)

Page 21: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Prediction Speed

Algorithm          Time to process 9000 users
Nearest Neighbor   28 hours
EigenTaste         3 minutes

Page 22: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Current Jester Dataset

62,000 registered users

approx. 3,000,000 ratings

Page 23: EigenTaste: A Constant Time Collaborative Filtering Algorithm

Second Application (2000)
Sleeper: Recommending Books

Page 28: EigenTaste: A Constant Time Collaborative Filtering Algorithm

EigenTaste Algorithm

1) Principal Component Analysis
2) Universal Queries (dense ratings matrix)
3) Fine-grained ratings bar (captures nuances)
4) Offline and Online Processing
5) Online: Constant time recommendations

Patent application 21 December 1999 by UC Regents

Page 29: EigenTaste: A Constant Time Collaborative Filtering Algorithm

www.cs.berkeley.edu/~goldberg

[email protected]

Eigentaste: A Constant Time Collaborative Filtering Algorithm

(to appear: Information Retrieval Journal, 2001)

