An Evaluation Framework for Collaborative Filtering on Purchase Information in Recommendation Systems

Stijn Geuens, Kristof Coussement, Koen W. De Bock

Subjects of today

An Evaluation Framework for Collaborative Filtering on Purchase Information in Recommendation Systems

Integrating Behavioral, Product, and Customer Data in Hybrid Recommendation Systems Based on Factorization Machines

12/15/2015 BAFI 2015 [email protected] 2

Recommendation Systems: Omnipresent in our Daily Lives


In … E-commerce




In … Travel sector


Recommendation Systems: A Definition [Ricci et al. 2011]

Recommender Systems are software tools and techniques providing personalized suggestions for items to be of use to a user. The suggestions, based on a customer’s profile or behavior, relate to various decision-making processes, such as what items to buy, what music to listen to, or what online news to read.


How to Calculate Recommendations [Bobadilla et al. 2013; Adomavicius et al. 2008]

Based on:

Socio-demographic information: Demographic RecSys [e.g. Pazzani 1999; Porcel et al. 2012]

Product characteristics: Content-based RecSys [e.g. Lang 1995; Meteren and Someren 2000]

Real-time navigation information: Knowledge-based RecSys [e.g. Burke 2000]

Transactional history: Collaborative filtering RecSys [e.g. Herlocker et al. 2004]

Hybrid solutions [e.g. Burke 2002; Preece and Shneiderman 2009]


Collaborative Filtering


Collaborative Filtering: Components

Characteristics



Concepts Literature Our study


Input Data

Explicit feedback [e.g. Linden et al. 2003; Su and Khoshgoftaar 2009] vs implicit feedback [e.g. Sarwar et al. 2000; Palanivel and Sivakumar 2010]

Characteristics [e.g. Deshpande and Karypis 2004; Steck 2011; Aggarwal et al. 1999]


Input Data: Explicit vs Implicit feedback

[Palanivel and Sivakumar 2010]




Explicit feedback:

+ Representing user preference

- Effort and time to rate

- Only rate a small fraction of purchases

- Difficulties expressing interest

Implicit feedback:

+ Objective

+ In large amounts available in server logs

- No explicit indication of preference


Input Data: Characteristics

Binary purchase data:

Breese et al. (1998)

Li et al. (2010)

Deshpande and Karypis (2004)

Linden et al. (2003)

Pradel et al. (2011)

Sarwar et al. (2000)

Current study


Input Data: Characteristics

Item/user ratio, sparsity, purchase distribution

Study                        | Characteristics controlled
Breese et al. (1998)         | None
Li et al. (2010)             | None
Deshpande and Karypis (2004) | None
Linden et al. (2003)         | None
Pradel et al. (2011)         | None
Sarwar et al. (2000)         | Sparsity
Current study                | Sparsity, purchase distribution, item/user ratio


Input Data: Characteristics

6 levels of sparsity: 95% – 99.5%

3 purchase distributions: Logistic, Linear, Uniform

3 item/user ratios: 0.5, 1, 2

= 54 data sets + 2 real-life validation data sets
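As a rough illustration of two of the controlled characteristics, the sketch below computes the sparsity and the item/user ratio of a binary user-by-item purchase matrix and checks the size of the full factorial design. The function name `data_characteristics` and the toy matrix are hypothetical and not taken from the study; the purchase distribution is left out because the slides do not define how it is generated.

```python
import numpy as np


def data_characteristics(purchases):
    """Return sparsity and item/user ratio of a binary user-by-item matrix.

    Hypothetical helper for illustration; rows are users, columns are items.
    """
    purchases = np.asarray(purchases)
    n_users, n_items = purchases.shape
    # Sparsity: share of user-item cells without a purchase.
    sparsity = 1.0 - np.count_nonzero(purchases) / purchases.size
    return {"sparsity": sparsity, "item_user_ratio": n_items / n_users}


# Full factorial design stated on the slide:
# 6 sparsity levels x 3 purchase distributions x 3 item/user ratios.
n_datasets = 6 * 3 * 3

# Toy data (made up): 4 users, 2 items, 2 purchases.
toy = np.array([[1, 0], [0, 0], [0, 1], [0, 0]])
stats = data_characteristics(toy)
```

On real purchase logs the same two numbers indicate which cell of the design a data set is closest to.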


Algorithm Variations

Data reduction as preprocessing [Sarwar et al. 2000]

CF method [e.g. Qin et al. 2011]

Similarity calculation [e.g. Adomavicius and Tuzhilin 2005; Bobadilla et al. 2012]

Prediction method [Adomavicius and Tuzhilin 2005]
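The similarity calculation step can be sketched for the three measures compared in this study (cosine, correlation, Jaccard) on binary purchase vectors. This is a minimal sketch using textbook definitions; the function names are hypothetical, and returning 0.0 for zero-length or constant vectors is an assumption rather than something the slides specify.

```python
import numpy as np


def cosine(a, b):
    """Cosine of the angle between two purchase vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


def correlation(a, b):
    """Pearson correlation: cosine of the mean-centered vectors."""
    a_c, b_c = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a_c) * np.linalg.norm(b_c)
    return float(a_c @ b_c / denom) if denom else 0.0


def jaccard(a, b):
    """Shared purchases divided by purchases made by either side."""
    union = np.count_nonzero((a > 0) | (b > 0))
    inter = np.count_nonzero((a > 0) & (b > 0))
    return inter / union if union else 0.0


# Toy vectors (made up): two users over four items, one item in common.
u = np.array([1.0, 1.0, 0.0, 0.0])
v = np.array([1.0, 0.0, 1.0, 0.0])
sims = (cosine(u, v), correlation(u, v), jaccard(u, v))
```

The same functions apply to item vectors (columns of the purchase matrix) for item-based CF.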


Algorithm Variations

Data reduction as preprocessing, CF method, similarity measure

Study                        | Reduction as Preprocessing | CF Method              | Similarity Measure
Breese et al. (1998)         | None                       | User-based             | Cosine, Correlation
Li et al. (2010)             | None                       | User-based             | Cosine
Deshpande and Karypis (2004) | None                       | Item-based             | Cosine
Linden et al. (2003)         | None                       | Item-based             | Cosine
Pradel et al. (2011)         | None                       | Item-based             | Cosine
Sarwar et al. (2000)         | None, SVD                  | User-based             | Cosine
Current study                | None, SVD, CA, NMF, LPCA   | User-based, Item-based | Cosine, Correlation, Jaccard

5 x 2 x 3 Experimental Design
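One cell of this design (SVD reduction, item-based CF, cosine similarity) could look roughly as follows. This is a sketch under stated assumptions, not the authors' implementation: the low-rank reconstruction as "reduction", the sum-of-similarities score, and all function names are illustrative choices.

```python
import numpy as np


def reduce_svd(purchases, rank):
    """Rank-`rank` reconstruction of the purchase matrix via truncated SVD."""
    u, s, vt = np.linalg.svd(purchases, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def item_cosine_similarity(matrix):
    """Item-by-item cosine similarity computed on the columns of `matrix`."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for empty items
    normalized = matrix / norms
    sim = normalized.T @ normalized
    np.fill_diagonal(sim, 0.0)       # an item should not recommend itself
    return sim


def recommend_item_based(purchases, user, n, rank=None):
    """Top-n items for `user`; `rank` switches the SVD preprocessing on."""
    base = reduce_svd(purchases, rank) if rank else purchases
    sim = item_cosine_similarity(base)
    # Assumed scoring rule: sum of similarities to the user's purchased items.
    scores = sim @ purchases[user]
    scores[purchases[user] > 0] = -np.inf   # never re-recommend a purchase
    return np.argsort(-scores)[:n].tolist()


# Toy data (made up): 4 users x 4 items.
purchases = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
])
top_plain = recommend_item_based(purchases, user=3, n=2)
top_svd = recommend_item_based(purchases, user=3, n=2, rank=2)
```

Swapping `item_cosine_similarity` for another measure, or transposing the matrix for a user-based variant, would cover other cells of the 5 x 2 x 3 design.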


Evaluation

Literature: Accuracy

Our study:

Accuracy = F1-measure* [Lipton et al. 2014]

Diversity = inverse Intra List Similarity (ILS)* [Ziegler et al. 2005]

Computation Time = Time in Seconds*

* Selection sizes between 5 and 200
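The accuracy and diversity metrics can be sketched for a single user's recommendation list. The F1 definition is the standard harmonic mean of precision and recall; for ILS the sketch averages the pairwise similarities in the list, which is an assumption, since the slides do not state the normalisation used. Function names and toy inputs are hypothetical.

```python
from itertools import combinations


def f1_at_n(recommended, purchased):
    """F1 of a top-n list against the items the user actually bought."""
    hits = len(set(recommended) & set(purchased))
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(purchased)
    return 2 * precision * recall / (precision + recall)


def intra_list_similarity(recommended, similarity):
    """Mean pairwise similarity within the list (higher = less diverse)."""
    pairs = list(combinations(recommended, 2))
    if not pairs:
        return 0.0
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)


# Toy example (made up): item ids, similar when they share parity.
def same_parity(a, b):
    return 1.0 if a % 2 == b % 2 else 0.0


f1 = f1_at_n(recommended=[1, 2, 3, 4], purchased=[2, 4, 6])
ils = intra_list_similarity([1, 2, 3], same_parity)
```

Computation time can be measured around the recommendation call with `time.perf_counter()` from the standard library.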


Purpose of the Study

Creation of a framework to guide a marketer in:

Selecting the best collaborative filtering algorithm as a function of the specific characteristics of the available input data set

Estimating the impact of changes in the input characteristics of the data set on the algorithms


Research Questions

RQ1. How does CF algorithm configuration affect performance?

RQ2. How do input data characteristics influence the optimal CF configuration(s)?

RQ3. How sensitive are the optimal CF configuration(s) to variations in the input data characteristics?



RQ1. How does CF algorithm configuration affect performance?

                        | Data Reduction                 | CF Method   | Similarity Measure
Accuracy                | CA > NMF, SVD > LPCA > None    | Item > User | Cos, Corr > Jaccard
Diversity (inverse ILS) | None, CA, NMF, SVD > LPCA      | User > Item | Jaccard > Cos, Corr
Time                    | SVD, CA, None < LPCA < NMF     | /           | Jaccard < Cos, Corr

[Figure: F1 (Accuracy), 0% to 50%, versus Selection Size (5 to 200). Series: CA/Item/Corr, LPCA/Item/Corr, NMF/Item/Corr, SVD/Item/Corr, None/Item/Corr, None/Item/Jaccard, None/User/Jaccard]

[Figure: ILS (Inverse Diversity), 0 to 16,000, versus Selection Size (5 to 200). Series: LPCA/Item/Cosine, CA/Item/Cosine, NMF/Item/Cosine, None/Item/Cosine, SVD/Item/Cosine, None/Item/Jaccard, None/User/Jaccard]

[Figure: Total Computation Time (Seconds), 0 to 4,000, by Reduction Method (SVD, CA, None, LPCA, NMF). Series: Correlation, Cosine, Jaccard. Second panel: 0 to 500 seconds for SVD, CA, None]

                        | Data Reduction                 | CF Method   | Similarity Measure
Accuracy                | CA > NMF, SVD > LPCA > None    | Item > User | Cos, Corr > Jaccard
Diversity (inverse ILS) | CA, None, NMF, SVD > LPCA      | User > Item | Jaccard > Cos, Corr
Time                    | SVD, CA, None < LPCA < NMF     | /           | Jaccard < Cos, Corr



RQ2. How do input data characteristics influence the optimal CF configuration(s)?

Optimal CF characteristics by input data characteristic and evaluation metric

Columns below: Reduction Technique | CF Method | Similarity Measure

Sparsity

  All metrics: No impact

Purchase distribution

  Accuracy, Computation time: No impact

  Diversity:
    Logistic: no clear pattern | item | cosine, correlation
    Linear: no reduction | user | Jaccard
    Uniform: no clear pattern | item (low sparsity), user (high sparsity) | cosine, correlation

Item–user ratio

  Accuracy, Diversity: No impact

  Computation time:
    0.5 (2000 users): CA (logistic), SVD (linear), SVD (uniform) | item | No impact
    1 (1000 users): no reduction | user | No impact
    2 (500 users): no reduction | user | No impact



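RQ3. How sensitive are the optimal CF configuration(s) to variations in the input data characteristics?

                        | Sparsity | Purchase Distribution      | Item–User Ratio
Accuracy                | Negative | /                          | /
Diversity (inverse ILS) | Positive | Uniform, Linear ≥ Logistic | 2 > 1 > 0.5
Time                    | /        | /                          | 2 < 1 < 0.5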







Empirical validation

Evaluation Metric      | Dataset               | Reduction Technique, F(4, 228) value (p) | CF Method, F(1, 228) value (p) | Similarity Measure, F(2, 228) value (p)
Accuracy (F1)          | Children’s Clothing*  | 12.15 (< 0.001)                          | 117.70 (< 0.001)               | 7.72 (0.02)
Accuracy (F1)          | Furniture**           | 12.89 (< 0.001)                          | 33.99 (< 0.001)                | 6.84 (0.04)
Accuracy (F1)          | Ranking               | CA > NMF, SVD, LPCA > None               | Item > User                    | Cos, Corr > Jaccard
Inverse Diversity (ILS)| Children’s Clothing   | 51.91 (< 0.001)                          | 28.12 (< 0.001)                | 0.04 (0.84)
Inverse Diversity (ILS)| Furniture             | 83.20 (< 0.001)                          | 324.06 (< 0.001)               | 1.74 (0.19)
Inverse Diversity (ILS)| Ranking, Children’s Clothing | None, CA, NMF, SVD, LPCA          | User < Item                    | Jaccard, Cos, Corr
Inverse Diversity (ILS)| Ranking, Furniture    | CA, None, SVD, NMF > LPCA                | User < Item                    | Jaccard, Cos, Corr
Computation Time (Sec) | Children’s Clothing   | 1621.35 (< 0.001)                        | 7.60 (0.02)                    | 6.14 (0.03)
Computation Time (Sec) | Furniture             | 16,839.10 (< 0.001)                      | 32.38 (< 0.001)                | 37.33 (< 0.001)
Computation Time (Sec) | Ranking               | SVD, CA, None < LPCA < NMF               | Item < User                    | Jaccard < Cos, Corr

A > B (A < B) indicates a significantly higher (lower) value for item A compared to B; A, B indicates that A obtains a better value compared to B, but the differences are not significant

* Children’s clothing: 5,999 users and 4,372 items

** Furniture: 5,368 users and 2,601 items


Discussion

The presented results allow the creation of a framework to guide e-commerce companies in 3 distinct ways:

1. Decide on the most important metric or a trade-off

2. Decide on the most suitable model

Average Ranking

Algorithm           | Accuracy (F1) | Diversity (Inverse ILS) | Time (Inverse seconds)
CA/Item/Cos, Corr   | 1.28          | 3.28                    | 2.83
NMF/Item/Cos, Corr  | 2.53          | 3.26                    | 10.55
SVD/Item/Cos, Corr  | 3.74          | 3.45                    | 1.57
None/Item/Cos, Corr | 3.81          | 4.68                    | 5.02
LPCA/Item/Cos, Corr | 6.25          | 7.66                    | 8.42
LPCA/User/Cos, Corr | 6.62          | 7.38                    | 8.58
NMF/User/Cos, Corr  | 7.89          | 6.57                    | 10.45
SVD/User/Cos, Corr  | 8.04          | 6.94                    | 2.45
CA/User/Cos, Corr   | 8.79          | 6.60                    | 3.85
None/User/Cos, Corr | 8.81          | 7.08                    | 4.60
None/User/Jaccard   | 9.55          | 3.83                    | 4.08
None/Item/Jaccard   | 10.58         | 5.28                    | 3.60
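One way to turn the average rankings into a model choice is a weighted average of the three ranks, where the weights express how much a company cares about each metric. The weighting scheme and the function name `best_configuration` are assumptions for illustration and do not come from the presentation; the rank values are copied from the table above.

```python
# Average ranks per configuration: (accuracy, diversity, time); lower is better.
AVERAGE_RANKS = {
    "CA/Item/Cos, Corr": (1.28, 3.28, 2.83),
    "NMF/Item/Cos, Corr": (2.53, 3.26, 10.55),
    "SVD/Item/Cos, Corr": (3.74, 3.45, 1.57),
    "None/Item/Cos, Corr": (3.81, 4.68, 5.02),
    "LPCA/Item/Cos, Corr": (6.25, 7.66, 8.42),
    "LPCA/User/Cos, Corr": (6.62, 7.38, 8.58),
    "NMF/User/Cos, Corr": (7.89, 6.57, 10.45),
    "SVD/User/Cos, Corr": (8.04, 6.94, 2.45),
    "CA/User/Cos, Corr": (8.79, 6.60, 3.85),
    "None/User/Cos, Corr": (8.81, 7.08, 4.60),
    "None/User/Jaccard": (9.55, 3.83, 4.08),
    "None/Item/Jaccard": (10.58, 5.28, 3.60),
}


def best_configuration(weights):
    """Return the configuration with the lowest weighted average rank.

    `weights` is a hypothetical (accuracy, diversity, time) importance tuple.
    """
    total = sum(weights)

    def weighted_rank(ranks):
        return sum(w * r for w, r in zip(weights, ranks)) / total

    return min(AVERAGE_RANKS, key=lambda name: weighted_rank(AVERAGE_RANKS[name]))


balanced = best_configuration((1, 1, 1))   # equal importance for all metrics
fastest = best_configuration((0, 0, 1))    # only computation time matters
```

With equal weights the sketch selects CA/Item/Cos, Corr; when only time matters it selects SVD/Item/Cos, Corr, in line with the rankings in the table.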


3. Estimate the impact of changes in input data structure on the preferred models






Paths for further research

Benchmark with other algorithms

Hybrid system

Include other customer actions

– Explicit ratings, views, clicks, addition to cart and wish list



Thank you for your Attention

Contact: Stijn Geuens

IESEG School of Management

3 Rue de la Digue

F-59000 Lille

(0)3.20.545.892

[email protected]

fr.linkedin.com/pub/stijn-geuens/

stijn.geuens
