
Cold Start Solutions for Recommender Systems

Amin Mantrach, Research Scientist, Yahoo Labs Barcelona
amantrac@yahoo-inc.com

Outline

§  Recommending at cold start:
›  Learning representations for the item cold start:
•  Recommending cold articles to users;
›  Enriching user profiles using users' implicit feedback:
•  Learning representations for completing the user profile;
›  Enriching user profiles using query logs.
§  Discussion: matrix factorization and skip-gram.


RECOMMENDING AT COLD START


Item Cold Start Problem on Yahoo Properties

§  The majority of items (~80%) are never shown or clicked;
§  Personalization uses content as the main signal (CTR cannot be used on cold items);
§  Motivations: why recommend cold-start items?
›  Diversify the offer;
›  Avoid the “Kim Kardashian” effect;
›  Avoid ads selling out too quickly.

Weakly-engaged users on Yahoo Properties


§  User engagement is power-law distributed → ~80% of users have sparse profiles, on Netflix, Amazon, Yahoo News, and Yahoo Search.
§  In other words, we are facing a coverage problem;
§  Recommendations cannot be effective for the majority of users due to the sparsity of their profiles.

[Figure: user coverage vs. number of clicks (x-axis: clicks, 4–300; y-axis: coverage, 0–1).]
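As a rough illustration of how such a coverage curve could be computed (an assumption on my part; the deck does not give the exact definition), one can plot the fraction of users whose click count does not exceed each threshold:

```python
import numpy as np

def coverage_curve(clicks_per_user, thresholds):
    """Fraction of users with at most t clicks, for each threshold t.
    Assumes the figure plots the empirical CDF of clicks per user."""
    c = np.asarray(clicks_per_user)
    return {t: float((c <= t).mean()) for t in thresholds}

# Example: a power-law-like click distribution (illustrative data)
rng = np.random.default_rng(0)
clicks = rng.zipf(2.0, size=10_000)
print(coverage_curve(clicks, [4, 50, 100, 200, 300]))
```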

Drawbacks of the state of the art in cold-start recommendations

Item cold start
§  State-of-the-art collaborative filtering approaches cannot be applied [Koren et al. 2009, Matrix factorization techniques for recommender systems; Rendle et al., UAI 2009, BPR: Bayesian personalized ranking from implicit feedback];
§  Basic approaches rely on content-based (CB) models;
§  The state of the art for item cold start consists of hybrid methods [Agarwal et al., WSDM 2010, fLDA; Gantner et al., ICDM 2010, Learning attribute-to-feature mappings for cold-start recommendations].

Weakly-engaged user
§  80% of users are weakly engaged and thus have sparse profiles;
§  State-of-the-art user-profile enrichment techniques rely on
›  kNN to enrich the user profile (this does not work for weakly-engaged users);
›  external information, but with low coverage.

Our contributions to the cold start


§  A novel collective representation learning framework:
›  Common framework:
•  We address both the item cold start and the user cold start;
•  Our representations are interpretable (non-negative) and can be used to reconstruct the user profile;
•  Our implementation relies on simple alternating least squares (ALS) or multiplicative updates (MU).
§  Weakly-engaged users:
•  We complete user profiles better than the state of the art.

The cold start: Research Questions


Content + user feedback → collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user's profile using implicit user feedback? [ongoing work]
3. Can we use query logs as external information to improve recommendations on Homerun? [patent filed]


1. RECOMMENDING FOR THE ITEM COLD START


Collective Representation Learning

§  Why collective?
›  It allows learning from multiple sources: users' feedback + items' features.
§  Why representation?
›  By learning embeddings we extract latent factors that capture the essence of the data.
§  Why collective representations for cold start?
›  When only one view is observed, we can reconstruct the missing one: by projecting an item's features onto the joint representation, we can reconstruct the missing user feedback (see the sketch below the diagram).


[Diagram: collective factorization. Xs (#items × #features) ≈ W · Hs, a global topic model whose rows of Hs are topics (Topic 1 … Topic k); Xu (#items × #users) ≈ W · Hu, a community/personalized model per user (C1 … Ck). The item factor W is shared across both views.]
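To make the projection idea concrete, here is a minimal sketch (my illustration, not the talk's code): a cold item's feature vector x is mapped into the shared latent space by non-negative least squares against Hs, and user affinities are then read off through Hu.

```python
import numpy as np
from scipy.optimize import nnls

def score_cold_item(x, Hs, Hu):
    """x: feature vector of a cold item (n_features,);
    Hs: k x n_features, Hu: k x n_users, learned jointly with a shared W.
    Returns one predicted affinity score per user."""
    w, _ = nnls(Hs.T, x)   # latent representation: x ~ w @ Hs, with w >= 0
    return w @ Hu          # project through the user view
```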

Collective Representation Learning


Non-negative representations + locality constraints (LCE): two similar items should share similar representations.

Optimization Problem

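The objective appears as an image in the original slides; based on the LCE paper (Saveski and Mantrach, RecSys 2014), it has roughly the following form, where L = D − A is the Laplacian of an item-similarity graph encoding the locality constraint:

```latex
\min_{W, H_s, H_u \ge 0}\;
  \alpha \, \lVert X_s - W H_s \rVert_F^2
+ (1-\alpha) \, \lVert X_u - W H_u \rVert_F^2
+ \lambda \, \operatorname{Tr}\!\left(W^\top L W\right)
+ \mu \left( \lVert W \rVert_F^2 + \lVert H_s \rVert_F^2 + \lVert H_u \rVert_F^2 \right)
```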

§  We implemented an alternating least squares algorithm and a multiplicative update algorithm to learn the decomposition.

[https://github.com/amantrac/JNMF]
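A minimal NumPy sketch of the multiplicative-update variant, following the standard GNMF-style update rules for this kind of objective (a simplification for illustration, not the JNMF code; all hyperparameter values are placeholders):

```python
import numpy as np

def lce_mu(Xs, Xu, A, k, alpha=0.5, lam=0.5, mu=1e-3, iters=200, seed=0):
    """Joint NMF of Xs (items x features) and Xu (items x users) with a
    shared item factor W and a locality penalty over the item graph A."""
    rng = np.random.default_rng(seed)
    n = Xs.shape[0]
    W = rng.random((n, k))
    Hs = rng.random((k, Xs.shape[1]))
    Hu = rng.random((k, Xu.shape[1]))
    D = np.diag(A.sum(axis=1))        # degree matrix; Laplacian L = D - A
    eps = 1e-9                        # guard against division by zero
    for _ in range(iters):
        W *= (alpha * Xs @ Hs.T + (1 - alpha) * Xu @ Hu.T + lam * A @ W) / (
             alpha * W @ (Hs @ Hs.T) + (1 - alpha) * W @ (Hu @ Hu.T)
             + lam * D @ W + mu * W + eps)
        Hs *= (W.T @ Xs) / (W.T @ W @ Hs + mu * Hs + eps)
        Hu *= (W.T @ Xu) / (W.T @ W @ Hu + mu * Hu + eps)
    return W, Hs, Hu
```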

Item Cold-Start Recommendations

Offline evaluation:
§  Enron: 10 mailboxes, 36K emails, 5K users, explicit feedback.
§  Yahoo News articles: 40 days, a random sample of 41K articles, 650K users, and implicit user feedback (3.5M comments).

A/B testing:
›  Average number of items surfaced per day;
›  Dwell time on the items [Yi et al., RecSys 2014, Beyond clicks: dwell time for personalization].


Item Cold Start: Baselines


1.  Content-Based Recommender (CB)
2.  Content Topic-Based Recommender
3.  Latent Semantic Indexing on user profiles [Soboroff '99]
4.  Author Topic Model [Rosen-Zvi '04]
5.  Bayesian Personalized Ranking + kNN (BPR-kNN) [Gantner '10]
6.  fLDA [Agarwal '10]

Offline Evaluation: Email Recipients Recommendation


[Bar chart: performance (MicroF1, MacroF1, MAP, NDCG; y-axis 0.00–0.50) of BPR-kNN, CB, LCE (no GR), and LCE.]
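NDCG and related ranking metrics recur throughout the deck; as a reference point, here is a minimal NDCG@k implementation for a single ranked list (my sketch; binary click-based relevance is an assumption):

```python
import numpy as np

def ndcg_at_k(relevances, k):
    """relevances: graded relevance of items in predicted rank order
    (e.g., 1 = clicked, 0 = not clicked)."""
    rel = np.asarray(relevances, dtype=float)
    discounts = 1.0 / np.log2(np.arange(2, k + 2))   # 1/log2(rank + 1)
    dcg = (rel[:k] * discounts[:rel[:k].size]).sum()
    ideal = np.sort(rel)[::-1][:k]                   # best possible ordering
    idcg = (ideal * discounts[:ideal.size]).sum()
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg_at_k([0, 1, 0, 1], k=3))  # one hit at rank 2 of the top 3
```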

Offline Evaluation: Cold News Articles Recommendation


[Bar chart: ranking accuracy (RA@3, RA@5, RA@7, RA@10; y-axis 0.00–0.40) of CB, BPR-kNN, LCE (no GR), and LCE.]

Next directions…

§  Use dwell time/duration (i.e., the proportion of the video watched) instead of intentional plays;
§  Incorporate a profile-enrichment strategy based on representation learning to diversify recommendations for the weakly-engaged user.

The cold start: Research Questions


Content + user feedback + matrix factorization → collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user's profile using internal user feedback?
3. Can we use external information, such as query logs, to improve recommendations? [1 patent, submitted to Techpulse 2014]

Recommending in the Long Tail: User Profile Completion

Why it is important:
§  Current recommender systems are only effective for the ~20% of loyal users who have a dense profile.
Why enrich the weakly-engaged user?
§  Improving recommendations for the remaining 80% of users;
§  Encouraging weakly-engaged users to become loyal users;
§  Easy to integrate: we feed the existing system with enriched user profiles and do not need to change existing algorithms;
§  Advertising can also benefit from better-enriched profiles.


Endogenous vs Exogenous Profile Enrichment


A. Endogenous: using implicit feedback
§  We have this information for free for loyal users;
§  We do not need to rely on any external source.
Our solution:
§  Learning embedding spaces designed to reconstruct user profiles to improve news recommendation.

B. Exogenous
§  Many external sources of information are available inside Yahoo; they can be used to enrich user profiles.
Our solution:
§  Using search query logs to enrich user profiles for news recommendation.


2. USING IMPLICIT USER FEEDBACK

User Coverage Against Click Count for News Data Set


[Figure: user coverage vs. click count on the news data set (x-axis: clicks, 4–300; y-axis: coverage, 0–1).]

Collective Representation Learning for User Profile Reconstruction

[Diagram: Xs (#items × #features) ≈ W · Hs and Xu (#items × #users) ≈ W · Hu; the user-profile matrix Xp (#users × #features) is approximated as HuT · W · Hs.]

User profile reconstruction: Xp = XuT · Xs
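A minimal sketch of how such a reconstruction could be used to enrich a sparse profile (my illustration; the mixing weight beta and the exact blending are assumptions, not from the talk):

```python
import numpy as np

def enrich_profile(u, Xp, Hu, W, Hs, beta=0.5):
    """Blend user u's observed sparse profile (row u of Xp, over #features)
    with its dense reconstruction from the factors, Xp ~ Hu.T @ W @ Hs."""
    reconstructed = Hu[:, u] @ W @ Hs   # dense profile over all features
    return beta * Xp[u] + (1 - beta) * reconstructed
```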

Optimization Problem


User Profile Reconstruction Regularization


[Plot: NDCG (0.25–0.55) vs. the regularization weight a (0–0.7), one curve per user click count (clicks = 1 … 5).]

Performance in Terms of Sparsity


[Plot: NDCG (0.1–0.6) vs. number of clicks (1–6) for CEUP-ACLS, CEUP-MU, kNN, and no enrichment.]

The cold start: Research Questions


Content + user feedback + matrix factorization → collective factorization [Singh and Gordon, KDD 2008]

Item cold start:
1. Can we learn collective representations from content and collaborative information that outperform state-of-the-art item cold-start recommenders? [Saveski and Mantrach, RecSys 2014]
Weakly-engaged user:
2. Can we design collective representations for enriching the weakly-engaged user's profile using internal user feedback? [submitted to Techpulse 2014]
3. Can query logs be used as external information to improve recommendations? [1 patent]

B. USING EXTERNAL SIGNALS: QUERY LOGS

News Personalization

§  Reading profile (endogenous): aggregated clicked news (implicit feedback) or skipped news.
§  Search profile (exogenous): aggregated queries submitted by the user (explicit feedback).

Search profiles

§  Motivation: use other available sources of information to improve news personalization.
§  Why search?
›  More familiar;
›  Explicit user intent.

Query, Titles, Abstracts

[Screenshots: an example search query, the titles of its results, and their abstracts — the three signals used to build search profiles.]

Coverage


66% of the Homerun users on a given target day also used Search during the preceding month.

[Bar chart: overlap in page views (%) between Homerun users and other Yahoo! properties (Others, Finance, FrontPage, Mail, News, Search, Sports), measured in unique yuids and unique bcookies.]

Coverage


§ Considering users who clicked at least once on a Homerun recommendation on a target day, how many queries did each of them submit during the last 3 months?

[Histogram: number of users (0–1.8M) vs. number of queries per user during 90 days (log-scale x-axis, 10^0–10^4).]

Data Set

§  Users who clicked on at least one recommended article during a target day;
§  We consider only users who submitted at least 1,000 queries during the last 3 months (~10 queries/day);
→ 70K users with 140K recommendations.

User Query, User QTitles, User QAbstracts

[Screenshots: example user profiles built from raw queries (Query), from the titles of the queries' search results (QTitles), and from their abstracts (QAbstracts).]

[Diagrams: for each user, daily vectors from days 1–90 are aggregated into a single profile — User Query 1 … N into a Query user profile, User QTitles 1 … N into a QTitle user profile, and User QAbstracts 1 … N into a QAbstracts user profile.]
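A minimal sketch of this aggregation step (my illustration; plain term counts are an assumption — the talk does not specify the weighting):

```python
from collections import Counter

def build_search_profile(daily_queries):
    """daily_queries: 90 lists, one per day, each holding that day's query
    strings for one user. Returns a bag-of-words profile over query terms."""
    profile = Counter()
    for day in daily_queries:
        for query in day:
            profile.update(query.lower().split())
    return profile

# Usage: build_search_profile([["yahoo news", "fc barcelona"], [], ...])
```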

I. Do search profiles help improve the quality of news personalization?

II. What are the important features to be considered in a search profile?


III. How many queries do we need?


Limitation: a threshold of 400 queries corresponds to a coverage of only ~200K users.

IV. Which period should the historical search information span in order to produce high-quality recommendations?


V. How does the recency of search profiles affect the quality of news personalization?


Status and Limitations

§  The main limitation is coverage:
›  Scales up to only ~200K users.
§  Further work:
›  Improve coverage;
›  Complete user profiles by learning collective representations from (1) implicit feedback, (2) query logs, and (3) item features.

Discussion: Matrix Factorization and Skip-Gram-Based Representations

§  Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of the ICLR Workshop, 2013.

§  “We analyze skip-gram with negative-sampling (SGNS), a word embedding method introduced by Mikolov et al., and show that it is implicitly factorizing a word-context matrix, whose cells are the pointwise mutual information (PMI) of the respective word and context pairs (shifted by a global constant).” [Omer Levy and Yoav Goldberg, Neural Word Embeddings as Implicit Matrix Factorization, NIPS 2014]
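To make the Levy–Goldberg connection concrete, here is a toy sketch (my illustration): build the shifted positive PMI matrix for a tiny corpus and factorize it with truncated SVD, which plays the role that SGNS plays implicitly. The corpus, window size, and shift k are all illustrative.

```python
import numpy as np
from collections import Counter

# Toy corpus and illustrative hyperparameters
corpus = ["the cat sat on the mat".split(),
          "the dog sat on the log".split()]
window, k_neg, dim = 2, 5, 2   # context window, SGNS shift k, embedding size

# Count word-context co-occurrences within the window
pairs, words, ctxs = Counter(), Counter(), Counter()
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - window), min(len(sent), i + window + 1)):
            if i != j:
                pairs[(w, sent[j])] += 1
                words[w] += 1
                ctxs[sent[j]] += 1

vocab = sorted(words)
idx = {w: i for i, w in enumerate(vocab)}
total = sum(pairs.values())

# Shifted positive PMI matrix: max(PMI(w, c) - log k, 0)
M = np.zeros((len(vocab), len(vocab)))
for (w, c), n in pairs.items():
    pmi = np.log(n * total / (words[w] * ctxs[c]))
    M[idx[w], idx[c]] = max(pmi - np.log(k_neg), 0.0)

# Truncated SVD of M yields embeddings akin to what SGNS learns implicitly
U, S, _ = np.linalg.svd(M)
emb = U[:, :dim] * np.sqrt(S[:dim])
print({w: emb[idx[w]].round(2) for w in vocab})
```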

Questions · Doubts · Concerns · Queries · Issues