Date post: | 15-Apr-2017 |
Upload: | stijn-geuens |
An Evaluation Framework for Collaborative Filtering on Purchase Information in
Recommendation Systems
Stijn Geuens, Kristof Coussement, Koen W. De Bock
Subjects of today
An Evaluation Framework for Collaborative Filtering on Purchase Information in Recommendation Systems
Integrating Behavioral, Product, and Customer Data in Hybrid Recommendation Systems Based on Factorization Machines
12/15/2015 BAFI 2015 [email protected] 2
Recommendation Systems: Omnipresent in our Daily Lives
In … E-commerce
In … Travel sector
Recommendation Systems: A Definition [Ricci et al. 2011]
Recommender Systems are software tools and techniques providing personalized suggestions for items to be of use to a user. The suggestions, based on a customer’s profile or behavior, relate to various decision-making processes, such as what items to buy, what music to listen to, or what online news to read.
How to Calculate Recommendations [Bobadilla et al. 2013; Adomavicius et al. 2008]
Based on:
Socio-demographic information: Demographic RecSys [e.g. Pazzani 1999; Porcel et al. 2012]
Product characteristics: Content-based RecSys [e.g. Lang 1995; Meteren and Someren 2000]
Real-time navigation information: Knowledge-based RecSys [e.g. Burke 2000]
Transactional history: Collaborative filtering RecSys [e.g. Herlocker et al. 2004]
Hybrid solutions [e.g. Burke 2002; Preece and Shneiderman 2009]
Collaborative Filtering: Components
Characteristics
Concepts Literature Our study
Input Data
Explicit Feedback [e.g. Linden et al. 2003; Su and Khoshgoftaar 2009] vs Implicit Feedback [e.g. Sarwar et al. 2000; Palanivel and Sivakumar 2010]
Characteristics [e.g. Deshpande and Karypis 2004; Steck 2011; Aggarwal et al. 1999]
Input Data: Explicit vs Implicit feedback [Palanivel and Sivakumar 2010]
Explicit feedback:
+ Representing user preference
- Effort and time to rate
- Only rate a small fraction of purchases
- Difficulties expressing interest
Implicit feedback:
- No explicit indication of preference
+ Objective
+ In large amounts available in server logs
Input Data: Characteristics
Study (binary purchase data) | Characteristics controlled
Breese et al. (1998) | None
Li et al. (2010) | None
Deshpande and Karypis (2004) | None
Linden et al. (2003) | None
Pradel et al. (2011) | None
Sarwar et al. (2000) | Sparsity
Current study | Sparsity, Purchase distribution, Item/user ratio
Input Data: Characteristics
6 levels of sparsity: 95% – 99.5%
3 purchase distributions: Logistic, Linear, Uniform
3 item/user ratios: 0.5, 1, 2
= 54 data sets
+ 2 real-life validation datasets
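The three controlled characteristics can be read off any binary user-by-item purchase matrix. The sketch below is illustrative only: the function name `describe_purchase_matrix` is hypothetical, and it assumes the common definitions (sparsity as the share of empty cells, item/user ratio as number of items divided by number of users, purchase distribution as the sorted purchase counts per item). The slides do not spell these definitions out.

```python
import numpy as np

def describe_purchase_matrix(purchases):
    """Summarize a binary user-by-item matrix (1 = purchased, 0 = not purchased)."""
    purchases = np.asarray(purchases)
    n_users, n_items = purchases.shape
    # Assumed definition: share of user-item cells without a purchase
    sparsity = 1.0 - purchases.sum() / (n_users * n_items)
    # Assumed definition: number of items divided by number of users
    item_user_ratio = n_items / n_users
    # Purchases per item, most popular item first (shape of the purchase distribution)
    item_counts = np.sort(purchases.sum(axis=0))[::-1]
    return {
        "sparsity": sparsity,
        "item_user_ratio": item_user_ratio,
        "item_counts": item_counts,
    }

# Toy matrix: 4 users, 2 items
toy = np.array([[1, 0],
                [0, 0],
                [1, 0],
                [0, 1]])
summary = describe_purchase_matrix(toy)

# Full factorial design from the slide: sparsity levels x distributions x ratios
n_designs = 6 * 3 * 3
```

With 6 sparsity levels, 3 purchase distributions, and 3 item/user ratios, `n_designs` matches the 54 synthetic data sets.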
Algorithm Variations
Data reduction as preprocessing [Sarwar et al. 2000]
CF-method [e.g. Qin et al. 2011]
Similarity calculation [e.g. Adomavicius and Tuzhilin 2005; Bobadilla et al. 2012]
Prediction method [Adomavicius and Tuzhilin 2005]
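As an illustration of data reduction as a preprocessing step, the sketch below projects users and items onto the k leading singular directions with a truncated SVD, so that similarities could afterwards be computed in the reduced space. It is a minimal sketch, not the study's implementation: the function name `svd_reduce`, the toy matrix, and the choice to split the singular values evenly between user and item factors are assumptions.

```python
import numpy as np

def svd_reduce(purchases, k):
    """Return k-dimensional user and item factors from a truncated SVD."""
    matrix = np.asarray(purchases, dtype=float)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Assumption: split each singular value evenly between users and items
    root = np.sqrt(s[:k])
    user_factors = u[:, :k] * root        # shape: n_users x k
    item_factors = vt[:k, :].T * root     # shape: n_items x k
    return user_factors, item_factors

# Toy binary purchase matrix: 4 users, 3 items
toy = np.array([[1, 1, 0],
                [1, 0, 0],
                [0, 1, 1],
                [0, 0, 1]])

users_2d, items_2d = svd_reduce(toy, k=2)      # reduced representation
users_full, items_full = svd_reduce(toy, k=3)  # keeps every singular value
```

The product `users_full @ items_full.T` reproduces the toy matrix, while the two-dimensional factors give only a low-rank approximation of it.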
Algorithm Variations
Study | Reduction as preprocessing | CF method | Similarity measure
Breese et al. (1998) | None | User-based | Cosine, Correlation
Li et al. (2010) | None | User-based | Cosine
Deshpande and Karypis (2004) | None | Item-based | Cosine
Linden et al. (2003) | None | Item-based | Cosine
Pradel et al. (2011) | None | Item-based | Cosine
Sarwar et al. (2000) | None, SVD | User-based | Cosine
Current study | None, SVD, CA, NMF, LPCA | User-based, Item-based | Cosine, Correlation, Jaccard
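The three similarity measures in the current study (cosine, correlation, Jaccard) and the user-based versus item-based choice can be sketched on a binary purchase matrix as follows. This is a hedged illustration using the textbook definitions: the function names and the toy matrix are hypothetical, and returning 0 for all-zero or constant vectors is an assumption made only to avoid division by zero.

```python
import numpy as np

def cosine(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero norm)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def correlation(a, b):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    return cosine(a - a.mean(), b - b.mean())

def jaccard(a, b):
    """Shared purchases divided by purchases made in either vector."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union > 0 else 0.0

def similarity_matrix(purchases, measure, item_based=True):
    """Pairwise similarities between items (columns) or users (rows)."""
    vectors = purchases.T if item_based else purchases
    n = vectors.shape[0]
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            sim[i, j] = measure(vectors[i], vectors[j])
    return sim

# Toy binary purchase matrix: 4 users (rows), 3 items (columns)
toy = np.array([[1, 1, 0],
                [1, 0, 0],
                [0, 1, 1],
                [0, 0, 1]], dtype=float)

item_cos = similarity_matrix(toy, cosine, item_based=True)
item_jac = similarity_matrix(toy, jaccard, item_based=True)
item_corr = similarity_matrix(toy, correlation, item_based=True)
user_cos = similarity_matrix(toy, cosine, item_based=False)
```

Switching between the item-based and user-based method only changes which axis of the matrix supplies the vectors being compared.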
Evaluation
Literature: Accuracy
Our Study:
Accuracy = F1-measure* [Lipton et al. 2014]
Diversity = inverse Intra List Similarity (ILS)* [Ziegler et al. 2005]
Computation Time = Time in Seconds
* Selection sizes between 5 and 200
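A minimal sketch of the two list-based metrics for a single user follows. It assumes F1 is the harmonic mean of precision and recall of the recommended list against held-out purchases, and that ILS is the sum of pairwise similarities within the list; normalization conventions for ILS vary, and the slides do not state which one was used. The function names and example values are hypothetical.

```python
from itertools import combinations

def f1_at_n(recommended, purchased):
    """Harmonic mean of precision and recall for one recommendation list."""
    hits = len(set(recommended) & set(purchased))
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(purchased)
    return 2 * precision * recall / (precision + recall)

def intra_list_similarity(recommended, similarity):
    """Sum of similarity(i, j) over all unordered item pairs in the list.

    A higher value means a less diverse list, so diversity is its inverse.
    """
    return sum(similarity(i, j) for i, j in combinations(recommended, 2))

# Hypothetical example: 4 recommended items, 3 held-out purchases, 2 hits
example_f1 = f1_at_n([1, 2, 3, 4], [2, 4, 5])

# Hypothetical constant similarity of 0.5 between any two items
example_ils = intra_list_similarity([1, 2, 3], lambda i, j: 0.5)
```

The selection size corresponds to the length of `recommended`, which the study varies between 5 and 200.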
Purpose of the Study
Creation of a framework to guide a marketer in:
Selecting the best collaborative filtering algorithm as a function of the specific characteristics of the available input data set
Estimating the impact of changes in the input characteristics of the data set on the algorithms
Research Questions
RQ1. How does CF algorithm configuration affect performance?
RQ2. How do input data characteristics influence the optimal CF configuration(s)?
RQ3. How sensitive are the optimal CF configuration(s) to variations in the input data characteristics?
RQ1. How does CF algorithm configuration affect performance?
Metric | Data Reduction | CF Method | Similarity Measure
Accuracy | CA > NMF, SVD > LPCA > None | Item > User | Cos, Corr > Jaccard
Diversity (inverse ILS) | None, CA, NMF, SVD > LPCA | User > Item | Jaccard > Cos, Corr
Time | SVD, CA, None < LPCA < NMF | / | Jaccard < Cos, Corr
[Figure: F1 (Accuracy) by Selection Size (5 to 200) for CA/Item/Corr, LPCA/Item/Corr, NMF/Item/Corr, SVD/Item/Corr, None/Item/Corr, None/Item/Jaccard, None/User/Jaccard]
[Figure: ILS (Inverse Diversity) by Selection Size (5 to 200) for LPCA/Item/Cosine, CA/Item/Cosine, NMF/Item/Cosine, None/Item/Cosine, SVD/Item/Cosine, None/Item/Jaccard, None/User/Jaccard]
[Figure: Total Computation Time (Seconds) by Reduction Method (SVD, CA, None, LPCA, NMF) for Correlation, Cosine, and Jaccard; second panel shows SVD, CA, and None only]
RQ2. How do input data characteristics influence the optimal CF configuration(s)?
Optimal algorithm variations (CF characteristics) per input data characteristic and evaluation metric:
Input Data Characteristic | Evaluation Metric | Reduction Technique | CF Method | Similarity Measure
Sparsity | All | No impact | No impact | No impact
Purchase distribution | Accuracy, Computation time | No impact | No impact | No impact
Purchase distribution | Diversity | Logistic: no clear pattern; Linear: no reduction; Uniform: no clear pattern | Logistic: item; Linear: user; Uniform: item (low sparsity), user (high sparsity) | Logistic: cosine, correlation; Linear: Jaccard; Uniform: cosine, correlation
Item–user ratio | Accuracy, Diversity | No impact | No impact | No impact
Item–user ratio | Computation time | 0.5 (2000 users): CA (logistic), SVD (linear), SVD (uniform); 1 (1000 users): no reduction; 2 (500 users): no reduction | 0.5 (2000 users): item; 1 (1000 users): user; 2 (500 users): user | No impact
RQ3. How sensitive are the optimal CF configuration(s) to variations in the input data characteristics?
Metric | Sparsity | Purchase Distribution | Item–User Ratio
Accuracy | Negative | / | /
Diversity (inverse ILS) | Positive | Uniform, Linear ≥ Logistic | 2 > 1 > 0.5
Time | / | / | 2 < 1 < 0.5
Empirical validation
Evaluation Metric | Dataset | Reduction Technique, F(4, 228) value (p) | CF Method, F(1, 228) value (p) | Similarity Measure, F(2, 228) value (p)
Accuracy (F1) | Children’s Clothing* | 12.15 (< 0.001) | 117.70 (< 0.001) | 7.72 (0.02)
Accuracy (F1) | Furniture** | 12.89 (< 0.001) | 33.99 (< 0.001) | 6.84 (0.04)
Accuracy (F1) | Ordering | CA > NMF, SVD, LPCA > None | Item > User | Cos, Corr > Jaccard
Inverse Diversity (ILS) | Children’s Clothing | 51.91 (< 0.001) | 28.12 (< 0.001) | 0.04 (0.84)
Inverse Diversity (ILS) | Furniture | 83.20 (< 0.001) | 324.06 (< 0.001) | 1.74 (0.19)
Inverse Diversity (ILS) | Ordering | Children’s Clothing: None, CA, NMF, SVD, LPCA; Furniture: CA, None, SVD, NMF > LPCA | User < Item | Jaccard, Cos, Corr
Computation Time (Sec) | Children’s Clothing | 1621.35 (< 0.001) | 7.60 (0.02) | 6.14 (0.03)
Computation Time (Sec) | Furniture | 16,839.10 (< 0.001) | 32.38 (< 0.001) | 37.33 (< 0.001)
Computation Time (Sec) | Ordering | SVD, CA, None < LPCA < NMF | Item < User | Jaccard < Cos, Corr
A > B (A < B) indicates a significantly higher (lower) value for A compared to B; A, B indicates that A obtains a better value than B, but the difference is not significant
* Children’s clothing: 5,999 users and 4,372 items
** Furniture: 5,368 users and 2,601 items
Discussion
The presented results allow the creation of a framework to guide e-commerce companies in 3 distinct ways:
1. Decide on the most important metric or a trade-off
2. Decide on the most suitable model
Average Ranking
Algorithm | Accuracy (F1) | Diversity (Inverse ILS) | Time (Inverse seconds)
CA/Item/Cos, Corr | 1.28 | 3.28 | 2.83
NMF/Item/Cos, Corr | 2.53 | 3.26 | 10.55
SVD/Item/Cos, Corr | 3.74 | 3.45 | 1.57
None/Item/Cos, Corr | 3.81 | 4.68 | 5.02
LPCA/Item/Cos, Corr | 6.25 | 7.66 | 8.42
LPCA/User/Cos, Corr | 6.62 | 7.38 | 8.58
NMF/User/Cos, Corr | 7.89 | 6.57 | 10.45
SVD/User/Cos, Corr | 8.04 | 6.94 | 2.45
CA/User/Cos, Corr | 8.79 | 6.60 | 3.85
None/User/Cos, Corr | 8.81 | 7.08 | 4.60
None/User/Jaccard | 9.55 | 3.83 | 4.08
None/Item/Jaccard | 10.58 | 5.28 | 3.60
3. Estimate the impact of changes in input data structure on the preferred models
Metric | Sparsity | Purchase Distribution | Item–User Ratio
Accuracy | Negative | / | /
Diversity (inverse ILS) | Positive | Uniform, Linear ≥ Logistic | 2 > 1 > 0.5
Time | / | / | 2 < 1 < 0.5
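One way to combine the first two steps (choosing a metric or trade-off, then a model) is to weight the average rankings per metric and pick the configuration with the lowest weighted rank. The sketch below uses the average rankings reported in the slides; the weighted-average scheme and the function name `best_configuration` are assumptions for illustration, not a procedure taken from the study.

```python
# Average rankings from the slides: (accuracy, diversity, time); lower is better
AVERAGE_RANKS = {
    "CA/Item/Cos, Corr": (1.28, 3.28, 2.83),
    "NMF/Item/Cos, Corr": (2.53, 3.26, 10.55),
    "SVD/Item/Cos, Corr": (3.74, 3.45, 1.57),
    "None/Item/Cos, Corr": (3.81, 4.68, 5.02),
    "LPCA/Item/Cos, Corr": (6.25, 7.66, 8.42),
    "LPCA/User/Cos, Corr": (6.62, 7.38, 8.58),
    "NMF/User/Cos, Corr": (7.89, 6.57, 10.45),
    "SVD/User/Cos, Corr": (8.04, 6.94, 2.45),
    "CA/User/Cos, Corr": (8.79, 6.60, 3.85),
    "None/User/Cos, Corr": (8.81, 7.08, 4.60),
    "None/User/Jaccard": (9.55, 3.83, 4.08),
    "None/Item/Jaccard": (10.58, 5.28, 3.60),
}

def best_configuration(weights):
    """Return the configuration with the lowest weighted average rank.

    weights: (accuracy, diversity, time) importance; assumed non-negative
    with a positive sum. The weighting scheme itself is hypothetical.
    """
    total = sum(weights)

    def score(ranks):
        return sum(w * r for w, r in zip(weights, ranks)) / total

    return min(AVERAGE_RANKS, key=lambda name: score(AVERAGE_RANKS[name]))

accuracy_only = best_configuration((1, 0, 0))
time_only = best_configuration((0, 0, 1))
balanced = best_configuration((1, 1, 1))
```

Under these assumptions, a marketer who cares only about accuracy or weights all three metrics equally would select CA/Item/Cos, Corr, whereas one who cares only about computation time would select SVD/Item/Cos, Corr.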
Paths for further research
Benchmark with other algorithms
Hybrid system
Include other customer actions
– Explicit ratings, views, clicks, addition to cart and wish list
Thank you for your Attention
Contact:
Stijn Geuens
IESEG School of Management
3 Rue de la Digue
F-59000 Lille
(0)3.20.545.892
[email protected]
fr.linkedin.com/pub/stijn-geuens/
stijn.geuens