Post on 07-Jul-2015
Collaborative Filtering Based on Star Users
Qiang Liu with Bingfei Cheng and Congfu Xu
College of Computer Science and Technology
Zhejiang University Hangzhou, Zhejiang 310027, China
2012dtd@gmail.com
ICTAI 2011, Boca Raton November 7, 2011
Outline
◦ Introduction
◦ Star-user-based Collaborative Filtering
◦ Experimental Results
◦ Conclusion
INTRODUCTION
Collaborative Filtering
Collaborative Filtering (CF)
◦ Neighborhood-based: user-based, item-based
◦ Model-based: Bayesian model, factorization model, maximum entropy, classification or clustering, ……
Motivation
To improve the most widely used technology in real-life recommender systems.
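Neighborhood Model
Similarity between users:
◦ Pearson: $\frac{\mathrm{cov}(a,b)}{\sigma_a \sigma_b}$
◦ Cosine: $\frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}$
◦ Other similarity measures
Weighted sum of neighbors’ ratings:
◦ $p_{a,i} = \bar{r}_a + \frac{\sum_{u \in U} (r_{u,i} - \bar{r}_u) \cdot w_{a,u}}{\sum_{u \in U} w_{a,u}}$
Common items: 1, 4, 6
Rating vectors of common items: a = [1, 4, 5], b = [2, 2, 5]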
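The two similarity measures can be checked on the example vectors above. The following is a minimal sketch in plain Python; the function names `pearson` and `cosine` are illustrative, and the means are taken over the common items only, which is an assumption the slide does not spell out.

```python
import math

def pearson(a, b):
    """Pearson correlation cov(a, b) / (sigma_a * sigma_b) over co-rated items."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (dev_a * dev_b)  # undefined if either vector is constant

def cosine(a, b):
    """Cosine similarity a . b / (|a| |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

a = [1, 4, 5]  # ratings of user a on common items 1, 4, 6
b = [2, 2, 5]  # ratings of user b on the same items
print(round(pearson(a, b), 3))  # 0.693
print(round(cosine(a, b), 3))   # 0.94
```

Cosine ignores each user's rating offset, which is why it reports a much higher similarity than Pearson on the same pair.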
Challenges faced by traditional methods
Matching similar users (computing similarities):
◦ Sparsity and noise
◦ Scalability
◦ ……
STAR-USER-BASED CF
The MPN users
Let A, B, C, D be the neighbor sets of users A, B, C, D respectively. Then area E is the set of the most popular neighbors (MPN).
What is a star user?
Star users are special users who have rated all items with a relatively stable standard.
We maintain a small set of star users and treat them as fixed neighbors of every general user.
Problem Formulation
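Filling the following matrix $\mathcal{R} \in \mathbb{R}^{H \times N}$ (rows: star users, H; columns: items, N):

$$
\mathcal{R} =
\begin{array}{c|ccccc}
 & i_1 & \cdots & i_i & \cdots & i_N \\ \hline
s_1 & ? & \cdot & \cdot & \cdot & ? \\
\vdots & \cdot & \cdot & \cdot & \cdot & \cdot \\
s_s & \cdot & \cdot & r_{s,i} & \cdot & \cdot \\
\vdots & \cdot & \cdot & \cdot & \cdot & \cdot \\
s_H & ? & \cdot & \cdot & \cdot & ?
\end{array}
$$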
Prediction Model
Selecting Star Neighbors:
Generate predictions based on star users’ ratings:
$p_{u,i} = \bar{r}_u + \frac{\sum_{s \in S} (r_{s,i} - \bar{r}_s) \cdot w_{u,s}}{\sum_{s \in S} w_{u,s}}$
The parameters are $r_{s,i}$ and $w_{u,s}$.
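The prediction rule can be sketched in a few lines of plain Python. The function name `predict_rating`, the argument layout, and the fallback to the user's mean when the weights sum to zero are all assumptions made for illustration; the slides do not specify them.

```python
def predict_rating(user_mean, star_ratings, star_means, weights, item):
    """Predict p_{u,i} from star users' ratings.

    user_mean    -- mean rating of general user u
    star_ratings -- H x N list of lists, star_ratings[s][i] = r_{s,i}
    star_means   -- list of H star-user mean ratings
    weights      -- list of H weights w_{u,s} for this user
    item         -- index i of the item to predict
    """
    numerator = sum(
        (star_ratings[s][item] - star_means[s]) * weights[s]
        for s in range(len(weights))
    )
    denominator = sum(weights)
    if denominator == 0:
        # Assumed fallback: no usable star neighbor, return the user's mean.
        return user_mean
    return user_mean + numerator / denominator

# Toy data (hypothetical): two star users, three items.
stars = [[5.0, 3.0, 1.0], [4.0, 4.0, 1.0]]
means = [sum(row) / len(row) for row in stars]  # [3.0, 3.0]
print(f"{predict_rating(3.0, stars, means, [0.8, 0.2], 0):.2f}")  # 4.80
```

Because every general user shares the same H star users, the neighbor search of user-based kNN is replaced by a lookup of one row of weights.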
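Relationship Matrix W (rows: star users, H; columns: general users, M):

$$
\begin{array}{c|ccccc}
 & u_1 & \cdots & u_i & \cdots & u_M \\ \hline
s_1 & \cdot & \cdot & \cdot & \cdot & \cdot \\
\vdots & \cdot & \cdot & \cdot & \cdot & \cdot \\
s_s & \cdot & \cdot & w_{u,s} & \cdot & \cdot \\
\vdots & \cdot & \cdot & \cdot & \cdot & \cdot \\
s_H & \cdot & \cdot & \cdot & \cdot & \cdot
\end{array}
$$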
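How we get star users (1)
Training Stage:
1. Initialize the star user matrix $\mathcal{R}$.
2. Predict each rating $\hat{r}_{u,i}$ in the training set:
$\hat{r}_{u,i} = \bar{r}_u + \frac{\sum_{s \in S} (r_{s,i} - \bar{r}_s) \times w_{u,s}}{\sum_{s \in S} w_{u,s}}$
3. The residual is $e_{u,i} = r_{u,i} - \hat{r}_{u,i}$ and the gradient of $e_{u,i}^2$ is:
$\frac{\partial}{\partial r_{s,i}} e_{u,i}^2 = -2 e_{u,i} \cdot \frac{N-1}{N} \cdot \frac{w_{u,s}}{\sum_{s \in S} w_{u,s}}$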
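The factor $\frac{N-1}{N}$ in the gradient follows from the star user's mean depending on the entry being updated. A short derivation, assuming $\bar{r}_s$ is the mean over all $N$ items (star users have rated every item) and that $\bar{r}_u$ and the weights $w_{u,s}$ are held fixed during the step:

```latex
\begin{align*}
\bar{r}_s &= \frac{1}{N}\sum_{j=1}^{N} r_{s,j}
  \quad\Longrightarrow\quad
  \frac{\partial \bar{r}_s}{\partial r_{s,i}} = \frac{1}{N}, \\
\frac{\partial \hat{r}_{u,i}}{\partial r_{s,i}}
  &= \frac{w_{u,s}}{\sum_{s' \in S} w_{u,s'}}
     \left(1 - \frac{1}{N}\right)
   = \frac{N-1}{N} \cdot \frac{w_{u,s}}{\sum_{s' \in S} w_{u,s'}}, \\
\frac{\partial}{\partial r_{s,i}} e_{u,i}^2
  &= 2\, e_{u,i} \cdot \left(-\frac{\partial \hat{r}_{u,i}}{\partial r_{s,i}}\right)
   = -2\, e_{u,i} \cdot \frac{N-1}{N} \cdot \frac{w_{u,s}}{\sum_{s' \in S} w_{u,s'}}.
\end{align*}
```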
How we get star users (2)
Training Stage:
4. Update each element of matrix $\mathcal{R}$:
$r_{s,i} \leftarrow r_{s,i} + \eta \cdot e_{u,i} \cdot \frac{w_{u,s}}{\sum_{s \in S} w_{u,s}}$
5. Repeat steps 2 to 4 until convergence.
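Steps 1 to 5 can be sketched as a small stochastic gradient loop. This is an illustrative sketch, not the authors' implementation: the function names, the initialization near 3.0, the fixed number of epochs in place of a convergence test, refreshing the star means once per epoch, and keeping W fixed throughout are all assumptions.

```python
import math
import random

def predict(u, i, R, star_means, user_means, W):
    """Step 2: predicted rating of user u on item i from the star matrix R."""
    den = sum(W[u])
    if den == 0:
        return user_means[u]  # assumed fallback when weights cancel out
    num = sum((R[s][i] - star_means[s]) * W[u][s] for s in range(len(R)))
    return user_means[u] + num / den

def train_star_users(ratings, user_means, W, n_items, n_stars,
                     eta=0.05, epochs=200, seed=0):
    """ratings: list of (u, i, r); W[u][s]: weights. Returns R (H x N)."""
    rng = random.Random(seed)
    # Step 1: initialize R (assumption: small noise around a mid-scale rating).
    R = [[3.0 + rng.uniform(-0.1, 0.1) for _ in range(n_items)]
         for _ in range(n_stars)]
    for _ in range(epochs):  # Step 5: fixed epoch count instead of a convergence check
        star_means = [sum(row) / n_items for row in R]
        for u, i, r in ratings:
            den = sum(W[u])
            if den == 0:
                continue
            err = r - predict(u, i, R, star_means, user_means, W)  # Step 3
            for s in range(n_stars):                               # Step 4
                R[s][i] += eta * err * W[u][s] / den
    return R

def rmse(ratings, user_means, W, R):
    """Root mean squared error of the predictions on the given ratings."""
    star_means = [sum(row) / len(row) for row in R]
    sq = [(r - predict(u, i, R, star_means, user_means, W)) ** 2
          for u, i, r in ratings]
    return math.sqrt(sum(sq) / len(sq))
```

Calling `train_star_users` with a list of `(user, item, rating)` triples and comparing `rmse` before and after training is a quick way to check that the update rule reduces the training error on a toy dataset.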
How we get star users (3)
Parameters:
◦ $\alpha$ (users): the update frequency of $\bar{r}_s$.
◦ $\beta$ (iterations): the update frequency of $w_{u,s} \in W$ for each u and s.
$w_{u,s}$ is computed using Pearson correlation.
Maintain the relationship matrix W, $W \in \mathbb{R}^{M \times H}$, until the recommending stage.
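Building W from Pearson correlations might look like the sketch below. The names `relationship_matrix` and `pearson`, the dict-based rating layout, and returning 0.0 for a constant vector are assumptions for illustration. Since star users have a rating for every item, the common items of a user and a star user are simply the items the user has rated.

```python
import math

def pearson(a, b):
    """Pearson correlation; returns 0.0 (an assumed convention) if undefined."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    dev_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    dev_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if dev_a == 0 or dev_b == 0:
        return 0.0
    return cov / (dev_a * dev_b)

def relationship_matrix(user_ratings, R):
    """Return W as an M x H list of lists, W[u][s] = w_{u,s}.

    user_ratings -- list of M dicts {item_index: rating}
    R            -- H x N star user matrix (list of lists)
    """
    W = []
    for rated in user_ratings:
        items = sorted(rated)
        a = [rated[i] for i in items]
        W.append([pearson(a, [row[i] for i in items]) for row in R])
    return W
```

Each row of W costs H correlations rather than one per other user, which is where the saving over pairwise user similarity comes from.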
EXPERIMENTAL RESULTS
Results on MovieLens Dataset
Time requirement comparison
RMSE of our approach against various H and comparison with kNN
Item-based Model
We first train a small set of star items instead of star users.
Predictions are computed as:
$p_{a,i} = \bar{r}_i + \frac{\sum_{s \in S'} (r_{a,s} - \bar{r}_s) \times w_{s,j}}{\sum_{s \in S'} w_{s,j}}$
Results on Netflix Dataset
Our approach with different values of learning rate
Our approach with different values of H
Discussion
Comparison with kNN
◦ Accuracy
◦ Data Sparsity
◦ Scalability: $O(M^2 \times N') \rightarrow O(M \times H \times N')$, where $H \ll M$.
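The size of the scalability gain follows directly from the two bounds. As a rough worked example, with hypothetical values of M and H chosen only for illustration (they are not taken from the experiments):

```latex
\[
\frac{M^2 \times N'}{M \times H \times N'} = \frac{M}{H},
\qquad \text{e.g. } M = 100{,}000,\; H = 100
\;\Longrightarrow\; \frac{M}{H} = 1000 .
\]
```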
Comparison with SVD
◦ Scientific explanation
◦ Parameters
◦ Updating
CONCLUSION
Summary
We proposed a novel CF model based on star users.
The original intention is to improve the traditional neighborhood-based CF model.
Experimental results on two datasets verified the effectiveness of our approach.
Future work
Incorporating contextual information into our model.
Validating our approach in practical applications.
THANK YOU