Date posted: 05-Jul-2015
Category: Software
Uploaded by: hakka-labs
Building personalized ad experiences through iterative engineering and product development
Ad Personalization at Spotify
Kinshuk Mishra, Noel Cody
Music…
...is with you throughout the day.
...fits your mood.
...fits your activity.
Music…
...is personal.
If your day looks like this:
Wake up Work out Commute Focus at Work Relax at Home Sleep
Ads should follow.
Wake up Work out Commute Focus at Work (Classical) Relax at Home Sleep
Electronic Music ad
Not bad. WTF?
Why Personalization?
“...it works well the advertisements are annoying though I am not a fan of mainstream music so hearing about pop bands is also driving me crazy”
“Great way to listen to whatever music you want. The ads can be really annoying though since they don't seem to be targeted. I HATE rap music, yet I seem to get a lot of ads for it.”
Data confirms anecdotal evidence
AD PERSONALIZATION
User Stories
Hypotheses + Goals
Product MVPs + Experiments
● Context-aware ads
● Music ads like music recommendations
● Ads that learn
User Stories to Hypotheses & Goals:
● Real-time genre targeting
● Historic genre targeting
● Real-time moment targeting
Hypotheses to Products:
(Product MVPs to Experiments)
Control Variation 1 Variation 2
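One common way to split users into a control and two variations is deterministic hashing, so a user always lands in the same bucket. The sketch below assumes that approach; the slides do not say how assignment was actually done, and `assign_bucket`, the bucket names, and the experiment name are all hypothetical.

```python
import hashlib

# Bucket names taken from the slide's labels; the identifiers are hypothetical.
BUCKETS = ["control", "variation_1", "variation_2"]


def assign_bucket(user_id: str, experiment: str) -> str:
    """Deterministically map a user to one bucket of an experiment."""
    # Salting with the experiment name keeps assignments independent
    # from one experiment to the next.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Read the first 8 bytes as an integer and reduce modulo the bucket count.
    index = int.from_bytes(digest[:8], "big") % len(BUCKETS)
    return BUCKETS[index]
```

Because the assignment depends only on the user and experiment names, no per-user state has to be stored to keep the experience consistent across sessions.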
INFRASTRUCTURE
Ad Targeting Architecture
Feedback Loop
Ad Targeting Architecture
OSS Data Infrastructure
Spotify Backend Infrastructure
Ad Targeting Architecture V1.0
COTS Data Infrastructure
Spotify Backend Infrastructure
Real-time Targeting
Ad Targeting Architecture V2.0
Real-time + Batch Targeting (a.k.a. Lambda Architecture)
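In a Lambda Architecture, a query combines a batch view (complete but stale) with a real-time view (fresh but covering only recent events). A minimal sketch of that merge step, assuming per-genre play counts as the profile signal; the function names and the data shape are hypothetical and not taken from the slides.

```python
from collections import Counter


def merge_views(batch_view: dict, realtime_view: dict) -> Counter:
    """Combine per-genre play counts from the batch and real-time layers.

    Assumes the real-time view holds only events that arrived after the
    last batch run, so adding the two does not double count.
    """
    merged = Counter(batch_view)
    merged.update(realtime_view)  # Counter.update adds counts per key
    return merged


def top_genre(batch_view: dict, realtime_view: dict):
    """Return the genre with the most plays across both layers, or None."""
    merged = merge_views(batch_view, realtime_view)
    return merged.most_common(1)[0][0] if merged else None
```

For example, a listener with a long rock history who has been playing classical all morning can tip over to classical once the real-time counts are added.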
Ad Targeting Architecture V2.5
Transition to Persistent User Profile
Ad Targeting Architecture V3.0
Richer Profile Schema with Persistence
Tech Choices
Kafka
● Kafka is a distributed, partitioned, replicated commit log service.
● Guarantees:
● Ordering: messages are totally ordered within a partition (not across partitions).
● Fault tolerance: a topic with replication factor N tolerates up to N-1 server failures without losing committed messages.
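The per-partition ordering guarantee matters for ad targeting when one user's events must be processed in sequence. The sketch below is a plain-Python simulation, not the Kafka client API: it assumes events are keyed by user so that each user's events land in a single partition. `NUM_PARTITIONS`, `partition_for`, and `append_all` are hypothetical names, and CRC32 is a stand-in hash rather than the one Kafka's partitioner uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Pick a partition from the message key (stand-in hash, not Kafka's)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_all(events, num_partitions: int = NUM_PARTITIONS):
    """Append (key, value) events to simulated partition logs.

    Returns a dict of partition -> list of (offset, key, value). Offsets
    grow by one per append, so each partition has a total order.
    """
    log = defaultdict(list)
    for key, value in events:
        partition = partition_for(key, num_partitions)
        log[partition].append((len(log[partition]), key, value))
    return log
```

Events for different users may interleave across partitions in any order, but reading one partition front to back always replays a given user's events in the order they were appended.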
Ad Targeting Architecture V1.0
COTS Data Infrastructure
Spotify Backend Infrastructure
Real-time Targeting
Storm
● Real time stream processing
● Like Hadoop without HDFS
● Like Map/Reduce with many reducer steps
● Fault tolerant and guaranteed message processing
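The "many reducer steps" idea can be pictured as a chain of small processing stages fed by a stream source. The sketch below models that chain with plain Python generators. It is not the Storm API and leaves out distribution, acking, and replay; the spout and bolt function names and the event shape are hypothetical.

```python
from collections import Counter


def play_spout(events):
    """Stand-in for a spout: emits one (user_id, genre) tuple per play event."""
    for user_id, genre in events:
        yield (user_id, genre)


def normalize_bolt(stream):
    """First processing step: clean up the genre label."""
    for user_id, genre in stream:
        yield (user_id, genre.strip().lower())


def count_bolt(stream):
    """Second processing step: keep running per-user genre counts."""
    counts = Counter()
    for user_id, genre in stream:
        counts[(user_id, genre)] += 1
        yield (user_id, genre, counts[(user_id, genre)])


def run_topology(events):
    """Wire spout -> normalize -> count and drain the stream."""
    return list(count_bolt(normalize_bolt(play_spout(events))))
```

Each stage consumes tuples as they arrive rather than waiting for a full dataset, which is the main contrast with a batch MapReduce job.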
Storm: Testing (since 0.8.1)
Storm: Visualization (since 0.9.2)
Ad Targeting Architecture V2.0
Real-time + Batch Targeting
Apache Crunch
● Framework for writing, testing, and running MapReduce pipelines
● Pipelines are composed of user-defined functions and higher-level abstractions of common MR tasks (filter, join, etc.)
Apache Crunch
Data structures:
● PCollection<T>
● PTable<K,V>
● PGroupedTable<K,V>
Functions:
● MapFn<T1,T2>: T1 → T2
● CombineFn<K,V>: (K, Iterable<V>) → (K, V)
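To make the two function shapes concrete, the sketch below models them in plain Python: a map function of type T1 → T2 applied to every element, then a combine function of type (K, Iterable<V>) → (K, V) applied per key. This is not the Crunch API (which is Java). The helper names only mirror Crunch's `parallelDo`, `groupByKey`, and `combineValues`, and the play-event records are hypothetical.

```python
from collections import defaultdict


def parallel_do(collection, map_fn):
    """Model of applying a MapFn: T1 -> T2 to every element of a PCollection."""
    return [map_fn(item) for item in collection]


def group_by_key(table):
    """Model of PTable<K,V> -> PGroupedTable<K,V>: collect values per key."""
    grouped = defaultdict(list)
    for key, value in table:
        grouped[key].append(value)
    return dict(grouped)


def combine_values(grouped, combine_fn):
    """Model of applying a CombineFn: (K, Iterable<V>) -> (K, V) per key."""
    return [combine_fn(key, values) for key, values in grouped.items()]


def to_genre_pair(play):
    """MapFn example: play record -> (genre, 1)."""
    return (play["genre"], 1)


def sum_counts(key, values):
    """CombineFn example: (genre, [1, 1, ...]) -> (genre, total)."""
    return (key, sum(values))


def genre_counts(plays):
    """Count plays per genre with map -> group -> combine."""
    pairs = parallel_do(plays, to_genre_pair)
    return combine_values(group_by_key(pairs), sum_counts)
```

Because each step is an ordinary function over in-memory data, the same logic can be unit tested locally without a cluster.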
What’s wrong with plain Python Streaming MapReduce?
● Testability
● Optimization
● Performance
● IDE support
● Type Safety
● Lack of higher-level operations (filter/join/aggregate)
Apache Crunch
From Spotify Presentation: Scalding the Crunchy Pig for Cascading into the Hive
● About a 5x performance improvement over Python streaming MapReduce
● Readable functional-style API in plain Java
● Great local testing support
● First-class support for Avro records.
Ad Targeting Architecture V2.5
Transition to Persistent User Profile
CASSANDRA
● Rich wide-column schema support
● Solid persistence and replication
● Slower reads
vs.
MEMCACHED
● K/V only
● TTL is default (in-memory mgmt)
Gained with Cassandra:
● Rich schema
● Persistence
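The trade-off can be seen in a toy key/value store with expiry: once the TTL passes, the profile is simply gone and has to be rebuilt. The sketch below is a plain-Python illustration, not the Memcached client API. `TtlStore` and its injectable clock are hypothetical, and real Memcached behavior (eviction under memory pressure, optional expiry) is more involved.

```python
import time


class TtlStore:
    """Toy in-memory K/V store where every entry expires after a fixed TTL."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data = {}

    def set(self, key, value):
        """Store an opaque value; there is no schema, only key -> blob."""
        self._data[key] = (value, self._clock() + self._ttl)

    def get(self, key):
        """Return the value, or None if it is missing or has expired."""
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]
            return None
        return value
```

A persistent store with a richer schema avoids both limits shown here: profiles survive restarts and expiry, and individual fields can be read or updated without rewriting the whole value.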
Ad Targeting Architecture V3.0
Richer Profile Schema with Persistence
DATA INGESTION (diagram components): Cassandra, Crunch, Storm, HDFS, Kafka, Logs
TESTING
User Stories
Hypotheses + Goals
Product MVPs + Experiments
AAR
Vital Signs
Ad-Specific Metrics
Higher-level metrics are hard to move
USER EXPERIENCE
TEST ITERATION
IMPACTS AAR
AAR
Vital Signs
Ad-Specific Metrics (our focus)
Test evaluation
● Positive Signals: CTR, Downstream Effects
● Avoidance Signals: Volume, Audio Output
● An “Ad Quality Score”
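The slides do not define the "Ad Quality Score", so the following is only one possible shape for it: a positive signal (CTR) minus a weighted avoidance rate. Everything here is an assumption made for illustration, including the formula, the default weight, and the reading of "Volume" and "Audio Output" as counts of volume-down and output-switch events during an ad.

```python
def ad_quality_score(impressions: int, clicks: int, volume_downs: int,
                     output_switches: int, avoidance_weight: float = 1.0) -> float:
    """Illustrative score: click-through rate minus weighted avoidance rate.

    Hypothetical formula; the presentation does not publish the real one.
    """
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    ctr = clicks / impressions
    # Assumed avoidance events: the listener turns the volume down or
    # switches audio output while the ad is playing.
    avoidance_rate = (volume_downs + output_switches) / impressions
    return ctr - avoidance_weight * avoidance_rate
```

A single number like this makes test variations easy to compare, at the cost of hiding which signal moved; tracking the components alongside the score keeps that visible.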
Thanks!
(We’re hiring): spotify.com/us/jobs/