Spotify’s Music Recommendations Lambda Architecture
Esh Kumar @eshvk
Emily Samuels @emilymsa
Overview
‣ Why Lambda?
‣ Use Case: Discover Recommendations
  • Batch Architecture
  • Real-time Architecture
  • Challenges
‣ Future Work
Why Lambda?
• 1 new user every 3 seconds.
• Contextual, time-based recs are more and more important.
Discover Recs
The Discover Page
Algorithmically generated fresh recs for users.
The Discover Batch Pipeline
Machine Learning Deep Dive
Word2Vec
Words with similar contexts have similar meaning
Word2Vec
King – Man + Woman = Queen
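The analogy above can be sketched as vector arithmetic followed by a nearest-neighbor lookup. This is a minimal illustration, not the pipeline from the talk: the three-dimensional vectors in `VECTORS` are hand-picked assumptions (real word2vec embeddings are learned and much larger), and `cosine` and `analogy` are hypothetical helper names.

```python
import math

# Hypothetical hand-picked embeddings, for illustration only.
VECTORS = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
    "apple": [0.0, 0.3, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(positive, negative, vectors):
    """Return the word closest to sum(positive) - sum(negative).

    Words used in the query itself are excluded from the candidates.
    """
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))


# King - Man + Woman
result = analogy(["king", "woman"], ["man"], VECTORS)
print(result)
```

With these toy vectors the closest remaining word is "queen". The same arithmetic applies when the "words" are tracks and the "sentences" are listening sessions, but that mapping is an assumption here, not something the slide states.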
Annoy
• Approximate Nearest Neighbors Oh Yeah!
• https://github.com/spotify/annoy
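To make "approximate nearest neighbors" concrete, here is a pure-Python sketch of one common approach: a forest of trees, each built by splitting the points with random hyperplanes. It only illustrates the general idea under stated assumptions. It does not use the Annoy library or its API, and all names (`build_tree`, `leaf_candidates`, `approx_nearest`) and the random sample data are hypothetical.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_tree(ids, points, rng, leaf_size=4):
    """Recursively split ids by the hyperplane equidistant from two random points."""
    if len(ids) <= leaf_size:
        return ("leaf", ids)
    a, b = (points[i] for i in rng.sample(ids, 2))
    normal = [x - y for x, y in zip(a, b)]
    midpoint = [(x + y) / 2 for x, y in zip(a, b)]
    offset = dot(normal, midpoint)
    left = [i for i in ids if dot(normal, points[i]) < offset]
    right = [i for i in ids if dot(normal, points[i]) >= offset]
    if not left or not right:  # degenerate split, e.g. duplicate points
        return ("leaf", ids)
    return ("node", normal, offset,
            build_tree(left, points, rng, leaf_size),
            build_tree(right, points, rng, leaf_size))


def leaf_candidates(tree, query):
    """Walk one tree down to the leaf on the query's side of every split."""
    while tree[0] == "node":
        _, normal, offset, left, right = tree
        tree = left if dot(normal, query) < offset else right
    return tree[1]


def approx_nearest(forest, points, query, k):
    """Pool the leaf candidates from every tree, then rank only that pool."""
    pool = set()
    for tree in forest:
        pool.update(leaf_candidates(tree, query))
    return sorted(pool, key=lambda i: sq_dist(points[i], query))[:k]


rng = random.Random(0)
points = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]
forest = [build_tree(list(range(len(points))), points, rng) for _ in range(10)]
neighbors = approx_nearest(forest, points, points[42], k=5)
```

The result is approximate because only points that share a leaf with the query in at least one tree are ever ranked. Adding trees widens the candidate pool and improves recall at the cost of memory and query time.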
Batch Architecture
Strengths
• Recs based on complete user history
Weaknesses
• User vector generation time increases with the number of users.
• Not reflective of current mood.
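The trade-off above is what a Lambda-style design tries to balance: a batch view built from the full history, plus a real-time view built from recent listens. The sketch below is only an illustration under loud assumptions. The slides do not say how user vectors are computed or merged, so averaging track vectors, the blending weight `alpha`, and every name here (`mean_vector`, `serve_user_vector`, `TRACK_VECTORS`) are hypothetical.

```python
# Hypothetical 2-dimensional track vectors, for illustration only.
TRACK_VECTORS = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def serve_user_vector(history, recent, track_vectors, alpha=0.5):
    """Blend a batch view (full history) with a real-time view (recent listens).

    Assumption: alpha weights the real-time view; with no recent listens
    the batch view is served unchanged.
    """
    batch = mean_vector([track_vectors[t] for t in history])
    if not recent:
        return batch
    realtime = mean_vector([track_vectors[t] for t in recent])
    return [(1 - alpha) * b + alpha * r for b, r in zip(batch, realtime)]


history = ["a", "a", "b", "c"]   # complete history, processed in batch
recent = ["b"]                   # what the user is playing right now
blended = serve_user_vector(history, recent, TRACK_VECTORS)
batch_only = serve_user_vector(history, [], TRACK_VECTORS)
```

Here the batch view leans toward track "a", while the blended vector shifts toward "b" as soon as it is played, without waiting for the next batch run.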
Intro to Storm
Storm
• Distributed real-time computation system
Storm @ Spotify
Real-time Architecture
Challenges
• Workers die -> cascading JVM process death
• Memcache flakiness
• Cassandra JVM problems due to write/overwrite pattern
Future/Ongoing Work
• Simplify the topology
• Keep listens for 24 hours
• Ongoing work on other real-time personalization features.
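The "keep listens for 24 hours" item can be pictured as a time-bounded window over a user's listen events. The following is a minimal in-memory sketch, assuming events arrive in non-decreasing timestamp order. The class name `RecentListens` and its methods are hypothetical, and the talk does not say which store or expiry mechanism is actually used.

```python
from collections import deque

DAY_SECONDS = 24 * 60 * 60


class RecentListens:
    """Keeps (timestamp, track_id) events no older than ttl_seconds."""

    def __init__(self, ttl_seconds=DAY_SECONDS):
        self.ttl = ttl_seconds
        self.events = deque()  # oldest event on the left

    def add(self, timestamp, track_id):
        # Assumption: timestamps arrive in non-decreasing order.
        self.events.append((timestamp, track_id))
        self._expire(timestamp)

    def _expire(self, now):
        # Drop events that are ttl_seconds old or older.
        while self.events and self.events[0][0] <= now - self.ttl:
            self.events.popleft()

    def tracks(self, now):
        """Return the track ids still inside the window at time `now`."""
        self._expire(now)
        return [track_id for _, track_id in self.events]


window = RecentListens()
window.add(0, "track-1")
window.add(3600, "track-2")
still_fresh = window.tracks(now=7200)        # both listens within 24 hours
after_a_day = window.tracks(now=DAY_SECONDS)  # the first listen has expired
```

Because the deque is ordered by time, expiry only ever inspects the oldest events, so each listen is added and removed once.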
Questions
Esh Kumar [email protected]
Emily Samuels [email protected]