Big Data at King - Roma Tre University, torlone/bigdata/S6-King.pdf

Date post: 04-Aug-2020
Big Data at King

Fabio Scanu, Senior Data Warehouse Engineer

A bit about King

King in numbers
• 356 million MAU
• 1.5 billion game plays per day
• 9 game studios, 1700 employees

And lots and lots of data...
• 32 billion rows per day
• 1.5 TB new per day
• > 9 PB stored

Studios in Stockholm, London, Barcelona, Malmo, Berlin, Singapore and Seattle. Offices in San Francisco, New York, Malta, Tokyo, Seoul and Shanghai

And for fun:
• 100000s of hours played
• Trillions of candies matched

A bit about Activision Blizzard

Activision Blizzard in numbers
• Headquartered in Santa Monica, California
• 9000 employees
• Focused on games for Xbox, PS, computer, etc.
• Call of Duty, Guitar Hero, Diablo, Warcraft, etc.
• Offices pretty much all over the US


Players are different

356 m

We have more players than the entire US

320 m


What is Big Data?

Big data is…

What's your definition of Big Data?


What is Big Data?

Big data is…


We predict player behaviour…

Good stuff

Effective, Actionable, Predictable


Our data is… growing

Our data


Our data is… not that useful raw

Our data

20130117T060000.142+0100 23102 1387107022 1137497977 0 0 fb notif giveGoldToUser
20130117T060000.277+0100 2310101 1000524045 1 2 5107
20130117T060000.281+0100 2321 1025951084 0 134 1358388857
20130117T060000.282+0100 2369 1025951084 0 134 0 1358398800 facebook bookmark_favorites 0fb_source=bookmark_favorites&ref=bookmarks&count=3&fb_bmpos=9_3
20130117T060000.285+0100 2338 1025951084 ad1c792b WINDOWS_XP CHROME 24.0.1312.52 Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17
20130117T060000.287+0100 2310101 1140113442 -1 4 5101
20130117T060000.288+0100 2310005 1140113442 4 3 1358398800288
20130117T060000.305+0100 2310005 1111576364 5 2 1358398800305
20130117T060000.306+0100 2310006 1031413225 7 13 0 0 8 1358398598520 -1
20130117T060000.350+0100 2310101 1151246251 -1 0 5101
20130117T060000.351+0100 2310005 1151246251 5 7 1358398800351
20130117T060000.358+0100 2310006 1376461814 4 3 0 0 72 1358398575940 -10001


System architecture

Our data

[Diagram: Game servers, Log server, TSV log files, Raw data, ETL, Data Warehouse, Data Mart, Reports, Data scientists]


Why build a dimensional model?

Our data

• Ease of use
• Flexible framework
• Huge bag of techniques & tricks
• Structures thinking
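To make the "ease of use" point concrete, here is a minimal sketch of a dimensional (star schema) model using Python's built-in `sqlite3` module. The table and column names (`dim_player`, `dim_game`, `fact_game_round`) and the rows are illustrative assumptions, not King's actual schema.

```python
import sqlite3

# In-memory database, used only to illustrate the shape of a star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes (hypothetical names and values)
    CREATE TABLE dim_player (player_key INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE dim_game   (game_key INTEGER PRIMARY KEY, game_name TEXT);
    -- Fact table: one row per game round, with keys into the dimensions
    CREATE TABLE fact_game_round (
        player_key INTEGER REFERENCES dim_player(player_key),
        game_key   INTEGER REFERENCES dim_game(game_key),
        level      INTEGER,
        succeeded  INTEGER
    );
    INSERT INTO dim_player VALUES (1, 'SE'), (2, 'IT');
    INSERT INTO dim_game   VALUES (1, 'Candy Crush Saga'), (2, 'Pet Rescue Saga');
    INSERT INTO fact_game_round VALUES
        (1, 1, 65, 0), (1, 1, 65, 1), (2, 1, 35, 1), (2, 2, 10, 0);
""")

# A typical report query: join the fact table to its dimensions and aggregate.
rows = conn.execute("""
    SELECT g.game_name, p.country, COUNT(*) AS rounds, SUM(f.succeeded) AS wins
    FROM fact_game_round AS f
    JOIN dim_player AS p ON p.player_key = f.player_key
    JOIN dim_game   AS g ON g.game_key = f.game_key
    GROUP BY g.game_name, p.country
    ORDER BY g.game_name, p.country
""").fetchall()

for row in rows:
    print(row)
```

Every report follows the same pattern (facts joined to dimensions, then grouped), which is one reading of the "structures thinking" bullet.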


Our data is… actually well structured

Our data


TSV

Our data

20130117T060000.142+0100 23102 1387107022 1137497977 0 0 fb notif giveGoldToUser
20130117T060000.277+0100 2310101 1000524045 1 2 5107
20130117T060000.281+0100 2321 1025951084 0 134 1358388857
20130117T060000.282+0100 2369 1025951084 0 134 0 1358398800 facebook bookmark_favorites 0fb_source=bookmark_favorites&ref=bookmarks&count=3&fb_bmpos=9_3
20130117T060000.285+0100 2338 1025951084 ad1c792b WINDOWS_XP CHROME 24.0.1312.52 Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.52 Safari/537.17
20130117T060000.287+0100 2310101 1140113442 -1 4 5101
20130117T060000.288+0100 2310005 1140113442 4 3 1358398800288
20130117T060000.305+0100 2310005 1111576364 5 2 1358398800305
20130117T060000.306+0100 2310006 1031413225 7 13 0 0 8 1358398598520 -1
20130117T060000.350+0100 2310101 1151246251 -1 0 5101
20130117T060000.351+0100 2310005 1151246251 5 7 1358398800351
20130117T060000.358+0100 2310006 1376461814 4 3 0 0 72 1358398575940 -10001
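As a rough illustration of how one of these TSV lines could be turned into a structured record, the sketch below parses a single event. The column layout (timestamp, event type id, player id, then event-specific fields) is an assumption read off the sample above, not a documented King schema, and the names `Event` and `parse_event` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Event:
    timestamp: datetime
    event_type: int   # assumed: second column is an event type id
    user_id: int      # assumed: third column is a player id
    fields: list      # remaining event-specific columns, kept as strings


def parse_event(line: str) -> Event:
    """Parse one tab-separated event line into an Event."""
    columns = line.rstrip("\n").split("\t")
    # Timestamps in the sample look like 20130117T060000.277+0100
    timestamp = datetime.strptime(columns[0], "%Y%m%dT%H%M%S.%f%z")
    return Event(timestamp, int(columns[1]), int(columns[2]), columns[3:])


# One record from the sample, with the tab separators restored.
sample = "20130117T060000.277+0100\t2310101\t1000524045\t1\t2\t5107"
event = parse_event(sample)
print(event.timestamp.isoformat(), event.event_type, event.user_id, event.fields)
```

The trailing columns stay as strings because their meaning depends on the event type, which is exactly what an ETL step would need a per-type schema to resolve.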

Hadoop strengths and weaknesses

Our data

Strengths | Weaknesses
Scalability | Structured data performance
Resiliency | Ease of use
Flexibility | Maintenance
Low cost accessible storage | Fast data exploration
Unstructured / semi-structured data | JOINs

Data platform 1.0

Our data

[Diagram: Games, Event data, Hive, ETL, Reports, Data scientists]

Data platform 1.5

Our data

[Diagram: Games, Event data, Hive, DB?, ETL, Reports, Data scientists]

Benefits of a column-oriented database

Our data

• Optimised for structured data
• Good for dimensional model
• Fast data exploration
• More friendly / productive environment
• Faster queries = happier users!
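A toy sketch of why a column-oriented layout suits analytical queries: an aggregate over one column only has to read that column. This illustrates the general idea only and makes no claim about how any particular database stores data; the records and the `to_columns` helper are made up.

```python
# Row-oriented layout: the values of each record are stored together.
rows = [
    {"player_id": 1, "level": 65, "attempts": 120},
    {"player_id": 2, "level": 35, "attempts": 4},
    {"player_id": 3, "level": 35, "attempts": 8},
]


def to_columns(records):
    """Pivot a list of records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: the values of each column are stored together.
columns = to_columns(rows)

# The aggregate touches a single list instead of scanning every full record.
average_attempts = sum(columns["attempts"]) / len(columns["attempts"])
print(average_attempts)
```

Storing similar values next to each other also tends to compress well, which is a commonly cited advantage of columnar storage.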

Why ExaSolution?

Our data

• Speed
• Efficiency
• Tuning free
• Scalability (170 TB and counting...)
• ExaSol the company

Our data

[Chart: Database grade servers vs Hadoop grade servers, compared on Performance / price and Price / TB usable storage; axis ticks from 0x to 7x]

Hybrid architecture: best of both worlds

Our data

Hadoop | Analytics database
Scalability | Structured data performance
Resiliency | Ease of use
Flexibility | Low maintenance
Low cost accessible storage | Fast data exploration
Unstructured/semi-structured data | JOINs


Data platform 2.0

Our data

[Diagram: Games, Event data, Hive, ExaSolution, ETL, Reports, Data scientists]

Cool! But…what kind of analysis can I do with that?

Our data

• Fairly deep thinking about the players and their motivation, frustration, achievements, persistence, etc

• Carefully designed experiments (AB tests) to run in the games, which integrate a hypothesis about player’s behaviour with a nicely designed game feature

• Continuing to introduce entirely new challenges as the levels unfold (Candy Crush Saga has 1,280 Reality levels and 665 Dreamworld levels)

• The right analysis
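One common way to read the outcome of an AB test like those mentioned above is a two-proportion z-test on conversion rates. The sketch below is a generic textbook version, not a description of King's testing framework; the function name and the counts in the example are made up.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rate B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200 of 10,000 players convert in A, 260 of 10,000 in B.
z, p_value = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but the test says nothing about why players behaved differently, which is where the hypothesis behind the experiment matters.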


Machine learning and predictive analytics


We have >9 petabytes of player data. Mostly of the form:
• “player ‘x’ tried level ‘y’ and succeeded / failed / spent”

A fairly large space of opportunity to predict…
• Is this player going to stop playing?
• Is this player going to start spending?
• What product should I recommend to this player?
• What other game might they enjoy?
• Is it a good time to recommend they play another game?
• But also segmentation, recommendation, etc
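Before any of these predictions can be made, events of the form "player x tried level y and succeeded / failed / spent" have to be turned into per-player features. The sketch below shows one simple way to do that aggregation; the event tuples, feature names and the `player_features` helper are illustrative assumptions, not King's actual pipeline.

```python
from collections import defaultdict

# Hypothetical events: (player, level, outcome)
events = [
    ("p1", 64, "succeeded"),
    ("p1", 65, "failed"),
    ("p1", 65, "failed"),
    ("p1", 65, "spent"),
    ("p2", 10, "succeeded"),
    ("p2", 11, "succeeded"),
]


def player_features(events):
    """Aggregate raw events into one feature dictionary per player."""
    features = defaultdict(
        lambda: {"events": 0, "succeeded": 0, "failed": 0, "spent": 0, "max_level": 0}
    )
    for player, level, outcome in events:
        f = features[player]
        f["events"] += 1
        f[outcome] += 1
        f["max_level"] = max(f["max_level"], level)
    for f in features.values():
        # Share of a player's events that were failures.
        f["failure_rate"] = f["failed"] / f["events"]
    return dict(features)


features = player_features(events)
for player, f in sorted(features.items()):
    print(player, f)
```

Features like these could then feed whichever model answers a given question, for example a classifier for "is this player going to stop playing?".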


Candy Crush Saga has been at the top of the charts since January 2013


Candy Crush Saga: Can a level be too hard?

[Chart labels: First Episode Unlock, Level 35, Level 65]

Super hard level 65
• 120+ attempts on average
• 50% drop out rate
• Very high revenue
• Very high conversion
• Super happy players when they eventually complete it

Should it be easier?


Machine learning and predictive analytics


The long term value of our players is higher if we make it easier
• We get significantly more direct revenue (all those future levels)
• More players stay active in our network (= more players trying out other games, more players helping & competing with their friends)

At King we optimise for the long term!

Pet Rescue Saga. Which of these is better?

• Clear
• Simple
• Obvious button to buy
• No confusion
• Low price point

or

• Complex
• Choices to make
• Varied price points
• Chance for more revenue, but does it put people off?

Pet Rescue Saga. Which of these is better?

Results of a nice AB test:

Total revenue up significantly - driven almost entirely by our “medium” and “high” spend segments.

No negative impact (zero/low spend segments are unaffected).

and

We should think of how to target the zero spend and low spend segments in other ways.


Challenges

Where next?

• Upstream and downstream throughput and flexibility
• Greater variety of game genres
• Keep on scaling
• Technology innovation
• Evolving data model
• Microbatch ETL
• Real(er) time…

Bridging the latency canyon

Where next?

Data platform 4.0

Where next?

[Diagram: Data, Real time system, VoltDB?, ExaSolution, Hadoop, Microbatch ETL, Batch ETL. Axis: increasing latency, quality, context, with marks at 0 ms, 200 ms, 15 minutes?, hourly and daily]
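The idea behind microbatch ETL can be sketched as grouping events into short fixed windows and processing each window as a small batch, instead of waiting for an hourly or daily run. The 15-minute window below is taken from the "15 minutes?" mark on the slide; the function names and events are hypothetical and this is not a description of King's implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=15)  # assumed window size


def window_start(ts, window=WINDOW):
    """Floor a timestamp to the start of its window, aligned to midnight."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // window) * window


def group_into_microbatches(events, window=WINDOW):
    """Group (timestamp, payload) pairs by the window they fall into."""
    batches = defaultdict(list)
    for ts, payload in events:
        batches[window_start(ts, window)].append(payload)
    return dict(batches)


# Made-up events for illustration.
events = [
    (datetime(2013, 1, 17, 6, 0, 0), "a"),
    (datetime(2013, 1, 17, 6, 14, 59), "b"),
    (datetime(2013, 1, 17, 6, 15, 0), "c"),
    (datetime(2013, 1, 17, 6, 44, 10), "d"),
]
batches = group_into_microbatches(events)
for start in sorted(batches):
    print(start.time(), batches[start])
```

Shrinking the window lowers latency at the cost of more, smaller loads, which is the trade-off the latency axis on the slide points at.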

Where next?

In details

Some numbers

Where next?

• Hadoop with 330+ nodes, adding 2 racks / month
• 32 billion events per day, more than Twitter
  I. If an event weighed 1 gram, this would weigh as much as 53 fully loaded Airbus A380s.
  II. If an event were a grain of salt, this would mean about 30 bathtubs of salt.
• 64 nodes in the in-memory column store DB
• Hive, Impala, Spark and YARN in place
• 9 PB of data in HDFS, 170 TB+ in Exasol
• In 12 months' time, these numbers will double
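A quick back-of-the-envelope check puts these figures on a more familiar scale. The inputs are the numbers quoted in the slides (32 billion events per day, 1.5 TB of new data per day, doubling in 12 months); treating a terabyte as 10^12 bytes is an assumption, and the variable names are only for illustration.

```python
EVENTS_PER_DAY = 32e9        # "32 billion events per day"
NEW_BYTES_PER_DAY = 1.5e12   # "1.5 TB per day", assuming decimal terabytes
SECONDS_PER_DAY = 24 * 60 * 60

# Average ingest rate.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY

# Average stored size per event (storage may be compressed, so this is not the raw size).
bytes_per_event = NEW_BYTES_PER_DAY / EVENTS_PER_DAY

# Doubling in 12 months corresponds to a compound monthly growth of 2**(1/12) - 1.
monthly_growth = 2 ** (1 / 12) - 1

# At 1 gram per event: grams converted to tonnes.
tonnes_per_day = EVENTS_PER_DAY / 1e6

print(f"{events_per_second:,.0f} events per second")
print(f"{bytes_per_event:.1f} bytes per event")
print(f"{monthly_growth:.1%} growth per month")
print(f"{tonnes_per_day:,.0f} tonnes per day at 1 gram per event")
```

That works out to roughly 370,000 events per second on average, about 47 bytes of new storage per event, and close to 6% growth per month.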

In conclusion

Where next?

• What are your requirements?
• There’s not one tool for the job
• Hybrid architectures give the best of several worlds
• 9 PB of data opens up a new set of challenges:
  • A medium-sized table at King has about 300 billion records;
  • Having that much data on this architecture lets you run any kind of analysis you want, using the algorithm you want (NLP, AI, machine learning, etc.)


A few words about our people


About 1700 employees today
• Many 100s of software engineers
• Lots of graphic designers, artists, musicians, business managers, producers, marketers,…
• In the data area:
  • 60+ data scientists
  • 30 data engineers building and maintaining our data and reporting platforms


Great roles


Data Scientists and Data Engineers working
• in our games
• on our network
• on our systems
• on our testing/optimisation frameworks
• …

And we like people to rotate around over time

Between 6 and 11 interviews before joining

https://www.youtube.com/watch?v=V9y21zPw4MY


Working @King


In the office, we have:
• Unlimited food & drinks, gym, wine & whisky tasting, many different beers
• Boxing, krav maga and yoga classes
• Nap rooms, running clubs
• Movie nights, board games, and football tournaments
• Everyone's ideas matter, no matter the seniority
• You get to travel as often as you like
• You can work from home
• Really cool parties & events
• Freedom to work on what you like
• You keep learning all the time
• And much more...


Thank you

