Data Mining: Concepts and Techniques
— Chapter 11 —
Additional Theme: Collaborative Filtering & Data Mining
Jiawei Han and Micheline Kamber
Department of Computer Science
University of Illinois at Urbana-Champaign
www.cs.uiuc.edu/~hanj
©2006 Jiawei Han and Micheline Kamber. All rights reserved
Outline
Motivation
Systems in Action
A Conceptual Framework
User-User Methods
Item-Item Methods
Recent Advances and Open Problems
Motivation
User Perspective
Lots of online products, books, movies, etc. Reduce my choices… please…
Manager Perspective
"If I have 3 million customers on the web, I should have 3 million stores on the web."
CEO of Amazon.com [SCH01]
Example: Recommendation
Customers who bought this book also bought:
• Data Preparation for Data Mining, by Dorian Pyle
• The Elements of Statistical Learning, by T. Hastie, et al.
• Data Mining: Introductory and Advanced Topics, by Margaret H. Dunham
• Mining the Web: Analysis of Hypertext and Semi Structured Data
Example: Personalization
Other Examples
Movielens: movies
Moviecritic: movies again
My Launch: music
Gustos Starrater: web pages
Jester: jokes
TV Recommender: TV shows
Suggest 1.0: different products
And much more…
How Does It Work?
Each user has a profile
Users rate items
  Explicitly: a score from 1 to 5
  Implicitly: web usage mining
    Time spent viewing the item
    Navigation path
    Etc.
The system does the rest. How? This is what we will show today.
Basic Approaches
Collaborative Filtering (CF)
  Look at users' collective behavior
  Look at the active user's history
  Combine!
Content-based Filtering
  Recommend items based on keywords
  More appropriate for information retrieval
Collaborative Filtering: A Framework
Users U = {u1, u2, …, ui, …, um} and items I = {i1, i2, …, ij, …, in} form an m × n rating matrix, where each known entry r_ij is user u_i's rating of item i_j (e.g., 3, 1.5, 2, …) and most entries r_ij = ? are unknown.
The task:
Q1: Find the unknown ratings, i.e., learn the unknown function f: U × I → R.
Q2: Which items should we recommend to this user? ...
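To make the framework concrete, here is a minimal Python sketch (not from the original slides) of a toy rating matrix with unknown entries; all values, shapes, and variable names are illustrative.

```python
import numpy as np

# Toy user-item rating matrix R (users x items); np.nan marks an unknown r_ij.
# All values are illustrative.
R = np.array([
    [3.0, 1.5, np.nan, 2.0],        # u1
    [np.nan, 2.0, np.nan, np.nan],  # u2
    [1.0, np.nan, 3.0, np.nan],     # u3
])

# Q1: the entries we need to predict, i.e. where f: U x I -> R is unknown.
unknown = np.argwhere(np.isnan(R))
print("Entries to predict (user, item):", unknown.tolist())

# Q2: once the unknowns are filled in, recommend each user the unrated items
# with the highest predicted ratings.
```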
Collaborative Filtering Road Map
User-User Methods
  Identify like-minded users
  Memory-based: k-NN
  Model-based: Clustering
Item-Item Methods
  Identify buying patterns
  Correlation Analysis
  Linear Regression
  Belief Network
  Association Rule Mining
User-User Similarity: Intuition
(Figure: the target customer among the other customers)
Q1: How to measure similarity?
Q2: How to select neighbors?
Q3: How to combine?
How to Measure Similarity?
Pearson correlation coefficient, computed over the items j that users a and i have both rated:

$$ w_P(a,i) = \frac{\sum_{j \in \text{commonly rated items}} (r_{aj} - \bar{r}_a)(r_{ij} - \bar{r}_i)}{\sqrt{\sum_{j \in \text{commonly rated items}} (r_{aj} - \bar{r}_a)^2}\;\sqrt{\sum_{j \in \text{commonly rated items}} (r_{ij} - \bar{r}_i)^2}} $$

Cosine measure: users are vectors in product-dimension space (one dimension per item i1 … in):

$$ w_C(a,i) = \frac{\vec{r}_a \cdot \vec{r}_i}{\lVert \vec{r}_a \rVert_2 \, \lVert \vec{r}_i \rVert_2} $$
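As an illustration of these two measures, here is a small Python sketch (my own, not from the slides): `pearson_similarity` restricts the sums to commonly rated items, while `cosine_similarity` treats users as vectors with unrated items as zeros. Function names and the toy ratings are made up.

```python
import numpy as np

def pearson_similarity(r_a, r_i):
    """Pearson correlation w_P(a, i) between two users, computed only over
    the items both of them have rated (np.nan marks an unrated item)."""
    common = ~np.isnan(r_a) & ~np.isnan(r_i)
    if common.sum() < 2:
        return 0.0                      # not enough commonly rated items
    x, y = r_a[common], r_i[common]
    x_dev, y_dev = x - x.mean(), y - y.mean()
    denom = np.sqrt((x_dev ** 2).sum()) * np.sqrt((y_dev ** 2).sum())
    return float((x_dev * y_dev).sum() / denom) if denom > 0 else 0.0

def cosine_similarity(r_a, r_i):
    """Cosine measure w_C(a, i): users as vectors in product space,
    with unrated items treated as zeros."""
    x, y = np.nan_to_num(r_a), np.nan_to_num(r_i)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(x @ y / denom) if denom > 0 else 0.0

# Toy example: two users rating four items (np.nan = unrated).
u_a = np.array([4.0, np.nan, 3.0, 5.0])
u_i = np.array([5.0, 2.0, 2.0, 4.0])
print(pearson_similarity(u_a, u_i), cosine_similarity(u_a, u_i))
```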
Nearest Neighbor Approaches [SAR00a]
Offline phase:
  Do nothing… just store transactions
Online phase:
  Identify users highly similar to the active one
    Best k ones, or all with a similarity above a threshold
Prediction: start from user a's neutral (mean) rating and add user a's estimated deviation, a weighted average of each neighbor i's deviation:

$$ \hat{r}_{aj} = \bar{r}_a + \frac{\sum_{i} w(a,i)\,(r_{ij} - \bar{r}_i)}{\sum_{i} |w(a,i)|} $$
Horting Method [AGG99]
k-NN similarity is not transitive; horting takes advantage of transitivity.
Uses a new similarity measure: predictability. User i predicts user a if:
  They have rated sufficiently many common items
  There is an error-bounded linear transformation from user i's ratings to user a's ratings
How Does Horting Work?
Offline phase: build the neighborhood graph
Online phase: compute r_aj
  1. Identify users who predict u_a
  2. Identify users who rated item j
  3. Find the shortest paths from group 1 to group 2
  4. Propagate ratings backward along the paths and average
Better for sparse environments, but not well evaluated.
Clustering [BRE98]
Offline phase:
  Build clusters: k-means, k-medoids, etc.
Online phase:
  Identify the cluster nearest to the active user
  Prediction:
    Use the center of the cluster (faster)
    Weighted average over cluster members, with weights that depend on the active user (slower but a little more accurate)
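A rough illustration of the model-based variant using scikit-learn's KMeans (the slides only say k-means/k-medoids); imputing missing ratings with item means before clustering is an assumption of this sketch, and the center-of-cluster prediction corresponds to the faster option above.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_clusters(R, n_clusters=2, seed=0):
    """Offline phase: impute missing ratings with item means, then cluster
    the users with k-means (a stand-in for the k-means/k-medoids step)."""
    item_means = np.nanmean(R, axis=0)
    R_filled = np.where(np.isnan(R), item_means, R)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(R_filled)
    return model, R_filled

def predict_from_cluster(model, r_active):
    """Online phase: assign the active user to the nearest cluster and use
    the cluster center as the vector of predicted ratings (the faster option)."""
    label = model.predict(r_active.reshape(1, -1))[0]
    return model.cluster_centers_[label]

# Toy example (np.nan = unrated).
R = np.array([[5, 4, np.nan, 1],
              [4, np.nan, 5, 2],
              [1, 2, 1, 5],
              [2, 1, np.nan, 4]], dtype=float)
model, R_filled = build_clusters(R)
print(predict_from_cluster(model, R_filled[0]))
```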
Clustering vs. k-NN Approaches
k-NN using the Pearson measure is slower but more accurate.
Clustering is more scalable, but an active user who falls near a cluster boundary may get bad recommendations.
We can use soft clustering instead, but then we lose the computational edge.
Did We Answer the Questions?
(Figure: the target customer among the other customers, repeated from the intuition slide)
Q1: How to measure similarity?
Q2: How to select neighbors?
Q3: How to combine?
Are We Done?
Q1: How to measure similarity?
The Pearson weight w_P(a, i) sums only over the commonly rated items. What about sparsity? Too few common items implies spurious neighbors and hence bad recommendations.
Sparsity results from the poor representation!
Example from [SAR00P]: U1 rates recycled letter pads high; U2 rates recycled memo pads high. Both of them like recycled office products. They are similar, but the math won't capture that.
By working at the right level of abstraction we can eliminate sparsity.
Done… really??
The Power of Representation [UNG98]
(Figure: users and movies grouped into genre clusters such as Action, Foreign, and Classic)
Q1-B: How can we formalize this intuition?
How to Abstract?
Semi-manual methods
  Use product features
  Cluster products first, then cluster users
  Works only if we have descriptive features
Automatic methods
  Adjusted product taxonomy
  Latent semantic indexing
Adjusted Product Taxonomy [CHO04]
• Input: product taxonomy
• Output: modified taxonomy with an even distribution of transactions across categories
Adjusted Product Taxonomy (2)
(Figure: number of transactions per category under the original taxonomy vs. the adjusted taxonomy)
Latent Semantic Indexing [SAR00b]
Decompose the m × n rating matrix R by singular value decomposition:

$$ R_{m \times n} = U_{m \times r} \; S_{r \times r} \; I'_{r \times n} $$

Keep only the k largest singular values, giving U_k (m × k), S_k (k × k), and I'_k (k × n). The reconstructed matrix

$$ R_k = U_k \, S_k \, I'_k $$

is the closest rank-k matrix to the original matrix R.
• Captures latent associations
• The reduced space is less noisy
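The rank-k reconstruction can be illustrated with NumPy's SVD; the toy matrix and the choice k = 2 are arbitrary, and a real system would first impute or normalize the missing ratings as discussed in [SAR00b].

```python
import numpy as np

# Toy rating matrix R; in practice missing entries are imputed (e.g. with
# item means) before the decomposition.
R = np.array([[5.0, 4.0, 1.0, 1.0],
              [4.0, 5.0, 1.0, 2.0],
              [1.0, 1.0, 5.0, 4.0],
              [2.0, 1.0, 4.0, 5.0]])

# Full SVD: R = U * S * I', with U (m x r), S (r x r, diagonal), I' (r x n).
U, s, It = np.linalg.svd(R, full_matrices=False)

# Keep the k largest singular values: R_k = U_k * S_k * I'_k is the
# closest rank-k approximation of R.
k = 2
R_k = U[:, :k] @ np.diag(s[:k]) @ It[:k, :]
print(np.round(R_k, 2))
```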
Are We Done? (2)
Q2: How to select neighbors?
We don't expect to use the same neighbors for all products; neighbors should be product-category specific.
Not adequately answered yet.
Q2-B: How can we determine whether or not a user is relevant to a given product?
Selecting Relevant Instances [YU01]
Superman and Batman are positively correlated; Titanic and Batman are negatively correlated.
"Dances with Wolves" has nothing to do with Batman's rating, so Karen is not a good instance to consider when predicting the Batman rating.
How can we formalize this? Mutual information:
MI(X;Y) = H(X) - H(X|Y)
Selecting Relevant Instances (2)
Offline phase:
  Estimate the mutual information between items
  For each item:
    Find the users who rated it
    Compute their strength (how many relevant items they also rated)
    Retain a subset of them (keeping about 10% works fine)
Online phase:
  To predict the target item's rating, run k-NN on its reduced instance space
Better results with less data… quality, not quantity, is what matters.
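For concreteness, a small sketch of estimating the empirical mutual information between two items' rating vectors over the users who rated both; the discretization (using raw rating values as categories) and the toy data are assumptions of this example, not details from [YU01].

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Empirical MI(X;Y) between two items' rating vectors, estimated over
    the users who rated both items (ratings used directly as categories)."""
    pairs = [(a, b) for a, b in zip(x, y) if not (np.isnan(a) or np.isnan(b))]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    px = Counter(a for a, _ in pairs)
    py = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * np.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

# Toy example: ratings of two items by five users (np.nan = unrated).
superman = np.array([5.0, 5.0, 1.0, 2.0, np.nan])
batman = np.array([5.0, 4.0, 1.0, 1.0, 3.0])
print(mutual_information(superman, batman))
```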
Are We Done? (3)
Q3: How to combine?
  Weighted average
  Discover association rules in the neighbors' transactions [LEE01, WAN04]
    For every x in this group: like(x, Item1) ^ like(x, Item2) → like(x, Item3)
    Use confidence and support to judge the quality of the prediction
    Prediction is done at the binary level (like, dislike)
    Costly to run online
User-User Methods Evaluation
Achieve good quality in practice.
The more processing we push offline, the better the method scales.
However:
  User preferences are dynamic, so the offline-calculated information needs frequent updates.
  No recommendations for new users: we don't know much about them yet.
Collaborative Filtering Road Map
User-User Methods
  Identify like-minded users
  Memory-based: k-NN
  Model-based: Clustering
Item-Item Methods
  Identify buying patterns
  Correlation Analysis
  Linear Regression
  Belief Network
  Association Rule Mining
Item-Item Similarity: The Intuition
Search for similarities among items.
All computations can be done offline.
Item-item similarity is more stable than user-user similarity, so there is no need for frequent updates.
First-order models
  Correlation Analysis
  Linear Regression
Higher-order models
  Belief Network
  Association Rule Mining
Correlation-based Methods [SAR01]
Same as in user-user similarity, but on item (column) vectors of the rating matrix.
Pearson correlation coefficient: look at the users u who rated both items i and j:

$$ s_{ij} = \frac{\sum_{u \in \text{users who rated both items}} (r_{ui} - \bar{r}_i)(r_{uj} - \bar{r}_j)}{\sqrt{\sum_{u \in \text{users who rated both items}} (r_{ui} - \bar{r}_i)^2}\;\sqrt{\sum_{u \in \text{users who rated both items}} (r_{uj} - \bar{r}_j)^2}} $$
Correlation-based Methods (2)
Offline phase:
  Calculate the n(n-1) similarity measures
  For each item, determine its k most similar items
Online phase:
  Predict the rating for a given user-item pair (a, j) as a weighted sum over the similar items i that user a has rated:

$$ \hat{r}_{aj} = \frac{\sum_{i \in \text{similar items}} s_{ij} \, r_{ai}}{\sum_{i \in \text{similar items}} |s_{ij}|} $$
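Putting the two phases together, a minimal item-based sketch (illustrative only, with invented helper names): `item_similarity` is the Pearson measure above restricted to co-rating users, and `predict_item_based` is the weighted sum over the k most similar items the user has rated.

```python
import numpy as np

def item_similarity(R, i, j):
    """Pearson-style similarity s_ij between items i and j, computed over
    the users who rated both items (means taken over those users)."""
    both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
    if both.sum() < 2:
        return 0.0
    x, y = R[both, i], R[both, j]
    x_dev, y_dev = x - x.mean(), y - y.mean()
    denom = np.sqrt((x_dev ** 2).sum()) * np.sqrt((y_dev ** 2).sum())
    return float((x_dev * y_dev).sum() / denom) if denom > 0 else 0.0

def predict_item_based(R, a, j, k=2):
    """Predict r_aj as a similarity-weighted sum over the k items most
    similar to j that user a has already rated."""
    rated = [i for i in range(R.shape[1]) if i != j and not np.isnan(R[a, i])]
    top = sorted(((item_similarity(R, i, j), i) for i in rated), reverse=True)[:k]
    denom = sum(abs(s) for s, _ in top)
    return sum(s * R[a, i] for s, i in top) / denom if denom > 0 else np.nanmean(R[a])
```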
Regression Based Methods [VUC00]
Offline phase:
  Fit n(n-1) linear regressions; f_ij(x) is a linear transformation of a user's rating on item i to that user's rating on item j
Online phase:
  Same as the previous method, but the weights are inversely proportional to the regression error rates:

$$ \hat{r}_{aj} = \frac{\sum_{i \in \text{items rated by } a} w_{ij} \, f_{ij}(r_{ai})}{\sum_{i \in \text{items rated by } a} w_{ij}} $$
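A possible rendering of this idea in Python: fit a least-squares line f_ij per item pair and weight each prediction by the inverse of its training error. The epsilon smoothing and the mean-squared-error choice are assumptions of this sketch, not specifics from [VUC00].

```python
import numpy as np

def fit_pairwise_regression(R, i, j):
    """Fit f_ij: r_uj ~ alpha + beta * r_ui over users who rated both items,
    and return the coefficients together with the training error."""
    both = ~np.isnan(R[:, i]) & ~np.isnan(R[:, j])
    if both.sum() < 2:
        return None
    x, y = R[both, i], R[both, j]
    beta, alpha = np.polyfit(x, y, deg=1)          # least-squares line
    err = float(np.mean((alpha + beta * x - y) ** 2))
    return alpha, beta, err

def predict_regression(R, a, j, eps=1e-6):
    """Predict r_aj by averaging f_ij(r_ai) over the items i rated by user a,
    weighting each regression inversely to its error."""
    num = den = 0.0
    for i in range(R.shape[1]):
        if i == j or np.isnan(R[a, i]):
            continue
        model = fit_pairwise_regression(R, i, j)
        if model is None:
            continue
        alpha, beta, err = model
        w = 1.0 / (err + eps)                      # weight ~ 1 / regression error
        num += w * (alpha + beta * R[a, i])
        den += w
    return num / den if den > 0 else np.nanmean(R[a])
```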
Higher Order Models
The previous approaches use a Naïve Bayes-like assumption: the effects of other items on a given item are independent.
This is not always true, so higher-order models can do better:
  Belief Network
  Association Rule Mining
Bayesian Belief Network: Introduction
A Bayesian belief network allows a subset of the variables to be conditionally independent.
It is a graphical model of causal relationships: it represents dependencies among the variables and gives a specification of the joint probability distribution.
Nodes are random variables; links are dependencies. Example: X and Y are the parents of Z, and Y is the parent of P; there is no dependency between Z and P; the graph has no loops or cycles.
Bayesian Belief Network: An Example
Variables: FamilyHistory (FH), Smoker (S), LungCancer (LC), Emphysema, PositiveXRay, Dyspnea.
The conditional probability table (CPT) for the variable LungCancer shows the conditional probability for each possible combination of its parents:

        (FH, S)   (FH, ~S)   (~FH, S)   (~FH, ~S)
LC        0.8       0.5        0.7        0.1
~LC       0.2       0.5        0.3        0.9

The joint probability of any assignment factors over the network:

$$ P(z_1, \ldots, z_n) = \prod_{i=1}^{n} P(z_i \mid Parents(Z_i)) $$
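The factored joint probability can be evaluated directly from the CPTs. In the sketch below, the CPT for LungCancer is taken from the table above, while the priors for FamilyHistory and Smoker are invented for illustration (the slide does not give them).

```python
# CPT for LungCancer from the table above; the priors for FamilyHistory and
# Smoker are NOT given on the slide and are assumed here for illustration.
p_fh = {True: 0.1, False: 0.9}     # assumed prior P(FamilyHistory)
p_s = {True: 0.3, False: 0.7}      # assumed prior P(Smoker)
p_lc = {                           # P(LungCancer = yes | FH, S)
    (True, True): 0.8, (True, False): 0.5,
    (False, True): 0.7, (False, False): 0.1,
}

def joint(fh, s, lc):
    """P(FH, S, LC) = P(FH) * P(S) * P(LC | FH, S), following the product
    rule over the network structure."""
    p_lc_given_parents = p_lc[(fh, s)] if lc else 1.0 - p_lc[(fh, s)]
    return p_fh[fh] * p_s[s] * p_lc_given_parents

print(joint(fh=True, s=True, lc=True))   # 0.1 * 0.3 * 0.8 = 0.024
```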
Belief Network for CF [BRE98]
Every item is a node, with a binary rating (like, dislike).
Learn a belief network offline over the training data; the CPT at each node is represented as a decision tree.
Use greedy algorithms to determine the best network structure.
Use probabilistic inference for online prediction.
Belief Network for CF: An Example
(Figure: the CPT of the random variable "Melrose Place" (M.P.) in the movie/TV domain, represented as a decision tree whose internal nodes test parent variables such as "Friends" and "B.H.", and whose leaves give the probability of liking M.P.)
Association Rule Mining
Offline processing:
  Work at the binary level (like, dislike)
  View each user as a market basket containing the items that user liked
  Discover association rules between items
Online processing:
  Match items that the active user likes with the rules' left-hand sides
  Recommend the rules' consequents, ranked by support and confidence
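A toy end-to-end sketch of this idea, mining only single-item rules A → B from like-baskets and firing them for an active user; the thresholds, basket contents, and helper names are all illustrative, and a real system would use a full Apriori/FP-growth miner.

```python
from collections import Counter
from itertools import combinations

# Toy "baskets": the set of items each user liked (binary like/dislike view).
baskets = [
    {"book1", "book2", "book3"},
    {"book1", "book2"},
    {"book1", "book2", "book3"},
    {"movie1", "movie2"},
    {"movie1", "movie2"},
]

def mine_pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    """Offline phase: mine single-item rules A -> B that pass the support
    and confidence thresholds."""
    n = len(baskets)
    item_count = Counter(i for b in baskets for i in b)
    pair_count = Counter(frozenset(p) for b in baskets
                         for p in combinations(sorted(b), 2))
    rules = []
    for pair, c in pair_count.items():
        support = c / n
        if support < min_support:
            continue
        for a in pair:
            b = next(iter(pair - {a}))
            confidence = c / item_count[a]
            if confidence >= min_confidence:
                rules.append((a, b, support, confidence))
    return rules

def recommend(liked, rules):
    """Online phase: fire the rules whose left-hand side the active user likes."""
    return {b for a, b, _, _ in rules if a in liked} - liked

rules = mine_pair_rules(baskets)
print(recommend({"book1"}, rules))   # e.g. {'book2', 'book3'}
```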
Association Rule Mining: Problems
A high support threshold leads to low coverage and may eliminate important but infrequent items from consideration.
A low support threshold results in very large model sizes, a computationally expensive offline pattern-discovery phase, and a slower online matching phase.
Solution: adaptive association rule mining.
Adaptive Association Rule Mining [LIN01]
Parameters: minSupport, minConfidence, desired number of rules.
Given:
  a transaction dataset
  a target item
  a desired range for the number of rules
  a specified minimum confidence
Find: a set S of association rules for the target item such that
  the number of rules in S is in the given range
  the rules in S satisfy the minimum confidence constraint
  the rules in S have higher support than rules not in S that satisfy the above constraints
Adaptive Association Rule Mining (2)
Discover rules with a single item in the head (consequent): Like(x, item1) ^ Like(x, item2) → Like(x, target).
The miner discovers association rules iteratively (for each target item) until the desired number of rules is extracted, so the support threshold is adjusted per item; a sketch of this loop is shown below.
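One way such a per-item adjustment loop could look, reusing the `mine_pair_rules` helper from the earlier sketch; the geometric decay of the support threshold and the stopping rules are assumptions of this sketch, not the exact algorithm of [LIN01].

```python
def adaptive_rule_mining(baskets, target, num_rules_range=(3, 10),
                         min_confidence=0.6, support=0.5, step=0.8, floor=0.01):
    """Relax the support threshold for one target item until the number of
    rules of the form X -> target falls within the desired range.
    Reuses the mine_pair_rules helper from the earlier sketch."""
    rules = []
    while support >= floor:
        rules = [r for r in mine_pair_rules(baskets, support, min_confidence)
                 if r[1] == target]               # keep rules with the target in the head
        if num_rules_range[0] <= len(rules) <= num_rules_range[1]:
            return rules                          # desired number of rules reached
        if len(rules) > num_rules_range[1]:
            break                                 # too many: keep only the highest-support ones
        support *= step                           # too few: lower the support threshold
    rules.sort(key=lambda r: r[2], reverse=True)  # r[2] is the rule's support
    return rules[:num_rules_range[1]]
```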
Item-Item Methods: Why Do They Work?
Like(x, Book1) ^ like(x, Book2) → like(x, Book3), supported by the "book gang" of users.
Like(x, Movie1) → like(x, Movie2), supported by the "movie gang" of users.
We use the right neighbors for each item, without discovering the groups themselves, thus eliminating costly online matching.
In general, item-item methods give better quality than user-user methods and better response time [LIN03].
Recent Work and Open Problems
Order-based methods
  Ordering items is more informative than rating them
  [KAM03] developed k-o'means to work on orders
Preference-based methods
  A total ordering of items is not feasible; work on partial orders (preferences) [COH99]
Integrating background knowledge
  User demographic information, item features, etc.
Modeling time
  Sequential patterns
References (1)
Charu C. Aggarwal, Joel L. Wolf, Kun-Lung Wu, Philip S. Yu: Horting Hatches an Egg: A New Graph-Theoretic Approach to Collaborative Filtering. KDD 1999: 201-212.
J. Breese, D. Heckerman, C. Kadie: Empirical Analysis of Predictive Algorithms for Collaborative Filtering. Proc. 14th Conf. on Uncertainty in Artificial Intelligence, Madison, WI, July 1998.
Yoon Ho Cho and Jae Kyeong Kim: Application of Web Usage Mining and Product Taxonomy to Collaborative Recommendations in E-Commerce. Expert Systems with Applications, 26(2), 2004.
William W. Cohen, Robert E. Schapire, and Yoram Singer: Learning to Order Things. Advances in Neural Information Processing Systems 10, Denver, CO, 1997.
Jiawei Han, Fall 2003 online course notes, available at: http://www-courses.cs.uiuc.edu/~cs397han/slides/05.ppt
Toshihiro Kamishima: Nantonac Collaborative Filtering: Recommendation Based on Order Responses. KDD 2003: 583-588.
C.-H. Lee, Y.-H. Kim, P.-K. Rhee: Web Personalization Expert with Combining Collaborative Filtering and Association Rule Mining Technique. Expert Systems with Applications, 21(3), October 2001, pp. 131-137.
References (2)
W. Lin, 2001P, online presentation, available at: http://www.wiwi.hu-berlin.de/~myra/WEBKDD2000/WEBKDD2000_ARCHIVE/LinAlvarezRuiz_WebKDD2000.ppt
Weiyang Lin, Sergio A. Alvarez, and Carolina Ruiz: Efficient Adaptive-Support Association Rule Mining for Recommender Systems. Data Mining and Knowledge Discovery, 6: 83-105, 2002.
G. Linden, B. Smith, and J. York: Amazon.com Recommendations: Item-to-Item Collaborative Filtering. IEEE Internet Computing, Vol. 7, No. 1, pp. 76-80, Jan. 2003.
Badrul M. Sarwar, George Karypis, Joseph A. Konstan, John Riedl: Analysis of Recommendation Algorithms for E-Commerce. ACM Conf. on Electronic Commerce 2000: 158-167.
B. Sarwar, G. Karypis, J. Konstan, and J. Riedl: Application of Dimensionality Reduction in Recommender Systems - A Case Study. ACM WebKDD 2000 Web Mining for E-Commerce Workshop, 2000.
B. M. Sarwar, G. Karypis, J. A. Konstan, and J. Riedl: Item-Based Collaborative Filtering Recommendation Algorithms. WWW 2001.
References (3)
B. Sarwar, 2000P, online presentation, available at: http://www.wiwi.hu-berlin.de/~myra/WEBKDD2000/WEBKDD2000_ARCHIVE/badrul.ppt
J. Ben Schafer, Joseph A. Konstan, John Riedl: E-Commerce Recommendation Applications. Data Mining and Knowledge Discovery, 5(1/2): 115-153, 2001.
L. H. Ungar and D. P. Foster: Clustering Methods for Collaborative Filtering. AAAI Workshop on Recommendation Systems, 1998.
Yi-Fan Wang, Yu-Liang Chuang, Mei-Hua Hsu, and Huan-Chao Keh: A Personalized Recommender System for the Cosmetic Business. Expert Systems with Applications, 26(3), April 2004, pp. 427-434.
S. Vucetic and Z. Obradovic: A Regression-Based Approach for Scaling-Up Personalized Recommender Systems in E-Commerce. ACM WebKDD 2000 Web Mining for E-Commerce Workshop, 2000.
Kai Yu, Xiaowei Xu, Martin Ester, and Hans-Peter Kriegel: Selecting Relevant Instances for Efficient and Accurate Collaborative Filtering. Proc. 10th CIKM, pp. 239-246, ACM Press, 2001.
ChengXiang Zhai, Spring 2003 online course notes, available at: http://sifaka.cs.uiuc.edu/course/2003-497CXZ/loc/cf.ppt