Simon Dollé_Large-scale Real-time recommendation at Criteo

Date post: 07-Aug-2015
Upload: dataconomy

Copyright © 2015 Criteo

Large-scale real-time recommendation

Simon Dollé

Data Enthusiasts London, July 13th, 2015

We sell clicks

3 billion ads/day

2 billion products

10 ms to pick relevant products
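
To put these figures in perspective, here is a rough back-of-the-envelope calculation. It assumes traffic is spread evenly over the day and that every ad uses its full 10 ms budget; both are simplifications, not figures from the talk, and real traffic will peak well above the average.

```python
# Figures from the slide
ADS_PER_DAY = 3_000_000_000
BUDGET_MS = 10

SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate, assuming perfectly uniform traffic (an assumption)
avg_ads_per_second = ADS_PER_DAY / SECONDS_PER_DAY

# Average number of selections in flight at any instant if each one
# takes the full budget (rate x latency, i.e. Little's law)
avg_in_flight = avg_ads_per_second * BUDGET_MS / 1000

print(f"{avg_ads_per_second:,.0f} ads/s on average")
print(f"{avg_in_flight:,.0f} selections in flight on average")
```

Under these assumptions the system handles roughly 35,000 ads per second on average.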

Data Sources

• Ad display data

• User behavior data

• Catalog data

Offline

• Similarities computed on browsing data

• Based on coevents (collaborative filtering)

• Computed on Hadoop cluster

• MapReduce jobs, Pig

• Takes around 12 hours

• Pushed to memcache servers
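
The co-event idea above can be sketched in a few lines. This is a minimal in-memory illustration, not the production pipeline (which the slide says runs as MapReduce and Pig jobs on Hadoop). The cosine-style normalization, the function name `coevent_similarities`, and the sample sessions are all assumptions made for the example; the talk only states that similarities are based on co-events.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def coevent_similarities(sessions):
    """Item-to-item similarity from co-occurrence counts.

    sessions: iterable of product-id lists, one list per user.
    Returns {product: {other_product: similarity}}.
    """
    views = defaultdict(int)    # product -> number of users who saw it
    coviews = defaultdict(int)  # (a, b) with a < b -> users who saw both

    for products in sessions:
        unique = sorted(set(products))
        for p in unique:
            views[p] += 1
        for a, b in combinations(unique, 2):
            coviews[(a, b)] += 1

    sims = defaultdict(dict)
    for (a, b), n in coviews.items():
        # Cosine-style normalization (an assumption, not from the talk)
        s = n / sqrt(views[a] * views[b])
        sims[a][b] = s
        sims[b][a] = s
    return sims


# Hypothetical browsing data
sessions = [
    ["shoes", "socks", "laces"],
    ["shoes", "socks"],
    ["shoes", "hat"],
]
sims = coevent_similarities(sessions)
print(sorted(sims["shoes"].items(), key=lambda kv: -kv[1]))
```

In a setup like the one on the slide, the resulting per-product lists would be the values pushed to the cache servers, keyed by product id.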

0.04 0.02 0.02 0.05

Online

• Merge candidate products

• Rank candidates using an ML model trained on ad display data

• Features: product-specific, user-specific, user-product interactions, display-specific
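
The merge-then-rank step can be illustrated with a small sketch. The talk does not name the model; logistic regression is used here purely as an assumption, and the feature names, weights, and helper functions (`merge_candidates`, `score`, `rank`) are hypothetical.

```python
import math


def merge_candidates(*sources):
    """Union candidate lists, keeping the best similarity per product."""
    merged = {}
    for source in sources:
        for product, sim in source:
            merged[product] = max(sim, merged.get(product, 0.0))
    return merged


def score(features, weights, bias=0.0):
    """Logistic regression: predicted click probability (assumed model)."""
    z = bias + sum(weights.get(name, 0.0) * value
                   for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates, popularity, user_activity, banner_slots, weights):
    scored = []
    for product, sim in candidates.items():
        features = {
            "product_popularity": popularity.get(product, 0.0),  # product-specific
            "user_activity": user_activity,                      # user-specific
            "similarity": sim,                                   # user-product interaction
        }
        scored.append((score(features, weights), product))
    scored.sort(reverse=True)
    # Display-specific: the banner only has room for `banner_slots` products
    return [product for _, product in scored[:banner_slots]]


# Hypothetical candidate sources and model weights
viewed_similar = [("socks", 0.8), ("laces", 0.4)]
bought_similar = [("socks", 0.5), ("hat", 0.6)]
candidates = merge_candidates(viewed_similar, bought_similar)

weights = {"similarity": 2.0, "product_popularity": 1.0, "user_activity": 0.1}
popularity = {"socks": 0.2, "laces": 0.1, "hat": 0.9}
top = rank(candidates, popularity, user_activity=1.0, banner_slots=2, weights=weights)
print(top)
```

A linear model of this kind is cheap to evaluate per candidate, which matters when the whole selection has to fit in a few milliseconds.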

Online optimizations

• Algorithmic

  • Use a simpler ML model

  • Quickly discard candidates

• Technical

  • Fight against the garbage collector

  • Memcache + local cache

  • Async I/O
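
Two of the technical points, a local cache in front of memcache and asynchronous I/O, can be sketched together. This is an illustration under stated assumptions, not the production code: `TwoLevelCache` and `fake_memcache_get` are hypothetical names, no real memcache client is used (a coroutine with a simulated delay stands in for it), and the FIFO eviction policy is an arbitrary choice for brevity.

```python
import asyncio


class TwoLevelCache:
    """In-process cache in front of a slower remote lookup."""

    def __init__(self, fetch_remote, max_local=1000):
        self._fetch_remote = fetch_remote  # coroutine: key -> value
        self._local = {}
        self._max_local = max_local
        self.remote_calls = 0

    async def get(self, key):
        if key in self._local:
            return self._local[key]        # local hit: no network round trip
        self.remote_calls += 1
        value = await self._fetch_remote(key)
        if len(self._local) >= self._max_local:
            # Evict the oldest inserted entry (dicts keep insertion order)
            self._local.pop(next(iter(self._local)))
        self._local[key] = value
        return value

    async def get_many(self, keys):
        # Issue all lookups concurrently instead of one after another
        return await asyncio.gather(*(self.get(k) for k in keys))


# Stand-in for the remote similarity store (hypothetical data)
FAKE_STORE = {"shoes": [("socks", 0.8)], "hat": [("scarf", 0.6)]}


async def fake_memcache_get(key):
    await asyncio.sleep(0.01)  # simulated network latency
    return FAKE_STORE.get(key, [])


async def main():
    cache = TwoLevelCache(fake_memcache_get)
    first = await cache.get_many(["shoes", "hat"])   # two remote lookups
    second = await cache.get_many(["shoes", "hat"])  # served locally
    return cache, first, second


cache, first, second = asyncio.run(main())
print(cache.remote_calls, first)
```

With concurrent lookups, the wait for several keys is close to the slowest single round trip rather than the sum of all of them, which helps when the total budget is around 10 ms.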

Upcoming challenges

• Long(er)-term user profiles

• More and better product information (images, NLP)

• Instant update of similarities

Fancy a try?

On your own:

• Our first public dataset is online: http://bit.ly/1vgw2XC

  • 4 GB of display and click data, Kaggle challenge in 2014

• NEW: 1 TB dataset released a few weeks ago: http://bit.ly/1PyH4Vq

  • Hosted on Microsoft Azure, just waiting for you

With us!

http://labs.criteo.com/jobs/

Questions?

Thank you!

[email protected]

Credits: Freepik.com

