Date post: | 16-Apr-2017 |
Category: | Data & Analytics |
Upload: | flink-forward |
© King.com Ltd 2016 – Commercially confidential
RBea: Scalable Real-Time Analytics at King
Gyula Fóra
Data Warehouse Engineer
Apache Flink PMC
About King
We make awesome mobile games
463 million monthly active users
30 billion events per day
And a lot of data…
From a streaming perspective…
30 billion events / day
Complex processing logic
Terabytes of state
How do we use Flink?
Standalone deployment
Few heavy streaming jobs
RocksDB state backend with caching
A lot of custom tooling
The RBea platform
Scripting on the live streams
Deploy from browser/Python
Window aggregates
Stateful computations
Scalable + fault tolerant
RBea architecture
Events Output
REST API
RBEA web frontend
Libraries
Data Scientists
RBea backend implementation
One stateful Flink job per game
Stream events and scripts
Events are partitioned by user id
Scripts are broadcast
Output/Aggregation happens downstream
Scripts: S1, S2, S3, S4, S5
Script stream: add/remove scripts
Event stream: loop over deployed scripts and process
CoFlatMap
Output based on API calls
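A rough, dependency-free sketch of the two-input idea described on this slide: one input adds or removes scripts, the other delivers events, and every event is run through all deployed scripts. This is plain Java rather than Flink code, and `Script`, `ScriptRunner`, and their methods are hypothetical names chosen for illustration, not taken from RBea.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a deployed script (not the RBea interface).
interface Script {
    // Called once per event; anything the script emits is appended to out.
    void processEvent(String userId, String event, List<String> out);
}

// Mimics one parallel instance of a two-input operator.
class ScriptRunner {
    // Scripts currently deployed on this instance, keyed by script id.
    private final Map<String, Script> deployed = new LinkedHashMap<>();

    // Script input (assumed to be broadcast to every instance): deploy or replace.
    void addScript(String id, Script script) {
        deployed.put(id, script);
    }

    // Script input: undeploy a script.
    void removeScript(String id) {
        deployed.remove(id);
    }

    // Event input (assumed to be partitioned by user id): loop over all scripts.
    List<String> onEvent(String userId, String event) {
        List<String> out = new ArrayList<>();
        for (Script s : deployed.values()) {
            s.processEvent(userId, event, out);
        }
        return out;
    }
}
```

Because scripts are broadcast while events are partitioned, every instance would hold the full script set but only see its own share of users.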
Dissecting the DSL
@ProcessEvent(semanticClass=SCPurchase.class)
def process(SCPurchase purchase, Output out, Aggregators agg) {
    long amount = purchase.getAmount()
    String curr = purchase.getCurrency()
    out.writeToKafka("purchases", curr + "\t" + amount)

    Counter numPurchases = agg.getCounter("PurchaseCount", MINUTES_10)
    numPurchases.increment()
}
Processing methods by annotation
Event filter conditions
Flexible argument list
Code-generate Java classes
=> void processEvent(Event e, Context ctx);
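To illustrate how an annotation can act as an event filter, here is a small sketch that dispatches by reflection. Note the swap: the slide says RBea code-generates Java classes, whereas this sketch uses runtime reflection because it is shorter to show. The annotation and `SCPurchase` mirror names on the slide, but their definitions here, along with `SCGameStart`, `PurchaseScript`, and `Dispatcher`, are assumptions, and only the event argument is passed (no `Output` or `Aggregators`).

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Assumed shape of the annotation; kept at runtime so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ProcessEvent {
    Class<?> semanticClass();
}

// Hypothetical event types.
class SCPurchase {
    final long amount;
    SCPurchase(long amount) { this.amount = amount; }
}

class SCGameStart { }

// Hypothetical script with one annotated processing method.
class PurchaseScript {
    final List<Long> seen = new ArrayList<>();

    @ProcessEvent(semanticClass = SCPurchase.class)
    public void process(SCPurchase purchase) {
        seen.add(purchase.amount);
    }
}

class Dispatcher {
    // Invokes every annotated one-argument method whose semanticClass
    // matches the event; returns how many methods were called.
    static int processEvent(Object script, Object event) {
        int calls = 0;
        for (Method m : script.getClass().getDeclaredMethods()) {
            ProcessEvent ann = m.getAnnotation(ProcessEvent.class);
            if (ann != null && m.getParameterCount() == 1
                    && ann.semanticClass().isInstance(event)) {
                try {
                    m.setAccessible(true);
                    m.invoke(script, event);
                } catch (ReflectiveOperationException ex) {
                    throw new RuntimeException(ex);
                }
                calls++;
            }
        }
        return calls;
    }
}
```

Generating a class with a fixed `processEvent` method up front would avoid the per-event reflection cost that this sketch pays.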
Output calls create Output events
Output(KAFKA, “purchases”, “…” )
These events are filtered downstream and sent to a Kafka sink
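The pattern above, where an API call only records an output event and routing happens later, could be sketched as follows. `writeToKafka` and `KAFKA` come from the slides; `OutputEvent`, `CollectingOutput`, `Downstream`, the `OTHER` target, and `writeElsewhere` are hypothetical placeholders, and no real Kafka client is involved.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// KAFKA appears on the slide; OTHER is a placeholder for any other sink.
enum Target { KAFKA, OTHER }

// Hypothetical record of one output call.
class OutputEvent {
    final Target target;
    final String name;     // e.g. the Kafka topic
    final String payload;

    OutputEvent(Target target, String name, String payload) {
        this.target = target;
        this.name = name;
        this.payload = payload;
    }
}

// What a script sees: calls do not write anywhere, they only emit events.
class CollectingOutput {
    final List<OutputEvent> events = new ArrayList<>();

    void writeToKafka(String topic, String payload) {
        events.add(new OutputEvent(Target.KAFKA, topic, payload));
    }

    // Hypothetical second output type, only here to make filtering visible.
    void writeElsewhere(String name, String payload) {
        events.add(new OutputEvent(Target.OTHER, name, payload));
    }
}

class Downstream {
    // Selects the events destined for one sink type.
    static List<OutputEvent> forTarget(List<OutputEvent> all, Target target) {
        return all.stream()
                  .filter(e -> e.target == target)
                  .collect(Collectors.toList());
    }
}
```

Keeping scripts free of sink connections means the sinks can be configured once per job instead of once per script.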
Aggregator calls create Aggregate events
Aggr (MYSQL, 60000, “PurchaseCount”, 1)
Flink window operators do the aggregation
Aggregators
long size = aggregate.getWindowSize();
long start = timestamp - (timestamp % size);
long end = start + size;
TimeWindow tw = new TimeWindow(start, end);
Event time windows
Window size / aggregator
Script1
Script2
Window 1, Window 2
NumGames
Revenue
W1: 8999, W2: 9001
W1: 200, W2: 300
MyAggregator
W1: 10, W2: 5
Dynamic window assignment
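A small worked sketch of the window arithmetic above, applied per aggregator so that each one can carry its own window size. It uses the same `timestamp - (timestamp % size)` formula from the slide; the `WindowedSums` class, its string keys, and the in-memory map are assumptions for illustration (in the talk, Flink window operators do the aggregation). Timestamps are assumed to be non-negative milliseconds.

```java
import java.util.Map;
import java.util.TreeMap;

class WindowedSums {
    // Running sums keyed by "aggregatorName@windowStart".
    private final Map<String, Long> sums = new TreeMap<>();

    // Start of the tumbling window that contains the timestamp.
    static long windowStart(long timestamp, long size) {
        return timestamp - (timestamp % size);
    }

    // Assigns the value to its window based on the aggregator's own size.
    void add(String name, long windowSize, long timestamp, long value) {
        String key = name + "@" + windowStart(timestamp, windowSize);
        sums.merge(key, value, Long::sum);
    }

    // Current sum for one aggregator and window; 0 if nothing was added.
    long get(String name, long windowStart) {
        return sums.getOrDefault(name + "@" + windowStart, 0L);
    }
}
```

For example, with a 60000 ms window, timestamps 125000 and 130000 both fall into the window starting at 120000, while 185000 falls into the one starting at 180000.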
User states
RBeaScript
LAST_GAMESTART, LAST_PURCHASE, LAST_…
+ User defined states
Fields: byte[]
User State
byte[], byte[]
LRU Cache
Scripts can read all state, but write only their own
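One way to picture the two ideas on this slide, an LRU cache in front of serialized state and a read-all/write-own rule, is the sketch below. It relies on `java.util.LinkedHashMap` in access order, a standard way to build an LRU cache in Java. `LruCache`, `ScriptStateView`, and the `owner/field` key scheme are assumptions, not RBea's actual design; in particular, evicted entries are simply dropped here, whereas a real deployment would presumably keep them in the state backend.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Least-recently-used cache built on LinkedHashMap's access ordering.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = order entries by last access
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict once over capacity
    }
}

// Hypothetical per-script view over shared user state.
class ScriptStateView {
    private final String scriptId;
    private final Map<String, byte[]> fields; // key: owner + "/" + field

    ScriptStateView(String scriptId, Map<String, byte[]> fields) {
        this.scriptId = scriptId;
        this.fields = fields;
    }

    // Any script may read any owner's field.
    byte[] read(String owner, String field) {
        return fields.get(owner + "/" + field);
    }

    // Writes always land under this script's own id, so a script
    // cannot overwrite another script's state through this view.
    void write(String field, byte[] value) {
        fields.put(scriptId + "/" + field, value);
    }
}
```

The write restriction is enforced by construction: the view never accepts an owner argument on writes.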
Things you might wonder…
Can slow scripts affect other scripts?
Yes, but we are working on it
Separate test/live environments
What does the backend know about the scripts?
Outputs produced
Failures (causes) => these are propagated
Runtime stats in the future
Is RBea useful compared to custom Flink jobs?
Writing + maintaining streaming jobs is hard
Especially with state, windowing, MySQL etc.
RBea physical plan
Wrap up
RBea makes streaming accessible to every data scientist at King
We leverage Flink’s stateful and windowed processing capabilities
People love it because it’s simple and powerful
Thank you!