
CASE STUDY

DoubleDown Wins Big with Snowflake

CUSTOMER: DoubleDown
PARTNER: Snowflake
Pub Date: April 11, 2017

DoubleDown Interactive is a leading provider of fun-to-play casino games on the internet. DoubleDown was founded in 2010 in Seattle, Washington, and is part of International Game Technology (NYSE:IGT). Its games are massively popular and available on Facebook, desktop and mobile platforms such as iOS and Android. Although most of its games are free to players, DoubleDown makes money from in-game purchases and working with advertising partners.

DOUBLEDOWN INTERACTIVE'S SCENARIO

According to Rolfe Lindberg, Head of Business Intelligence at DoubleDown Interactive, “We have a lot of analytics projects underway with business analysts and data scientists for decision making purposes as well as for many internal departmental needs.” Game developers, customer support, marketing staff, customer experience and loyalty teams, and external marketing partners all make use of data analytics at DoubleDown.

By understanding and drilling into their data, DoubleDown finds insights that influence game design, enable rigorous marketing campaign evaluation and management, improve understanding of player behavior, assess user experience, and uncover bugs and defects. Metrics based on game event data allow stakeholders to understand what players are doing during gaming sessions, which helps them evolve a particular game as well as create new and different games. In addition, DoubleDown also has several production processes that process data to manage user account balances and handle revenue recognition for its games.

Performing these analyses requires bringing together data from multiple sources. Rolfe explains, “For our internal data we get bookings, user information, marketing campaigns and promotions. Separately, game event logs are generated when users go into our online casinos and play those games. We get this data from MySQL databases, the internal production databases and cloud-based game servers. Some of the operations data is collected from our Splunk system. We also get a lot of external third-party data: we have about 19 vendors who provide us data that needs to go into our data warehouse. That includes ad partner data from Facebook, AppLovin, and many other publishers.”


“We had minimal configuration work to do with Snowflake; we did not have to worry about indexes or administration, because it’s a highly optimized SQL database already. Because the Snowflake data warehouse is truly elastic, we can increase and decrease compute power for different user needs that are temporary, with no changes to data or data locations.”
- Josh McDonald, Director of Analytics Engineering

THE CHALLENGE

DoubleDown’s challenge was to take continuous data feeds from their games and integrate them with other data into a holistic representation of game activity, usability and trends. “When it came to our event log data, this is where we got into the problem of big data. Our game servers generate roughly 3.5 terabytes of data per day,” says Rolfe.

Integrating that data was complex: it required many sources with separate data flow paths and ETL transformations for each, in part because all the game event data is stored in large JSON log files. In addition to using Talend’s enterprise integration data suite to help them with ETL and data integration, DoubleDown also used a NoSQL database, MongoDB, for processing the data. “The previous process was to get the data into a NoSQL database, and then run a collection of NoSQL DB collectors and aggregators. The data was then pulled into a staging area where it got cleaned, transformed, and conformed to the star schema, then it was loaded into our pre-existing enterprise data warehouse,” says Rolfe.

Once in the data warehouse, the data was used for analysis and reporting via both commercial tools, including Tableau, and a homegrown reporting dashboard, supported by MySQL, that was used heavily across the company.

DoubleDown had latency, throughput, and reliability challenges with their data pipeline. They had hidden costs and risks due to the lack of reliability of their data pipeline and the amount of ETL transformations required. According to Rolfe, “There were a lot of challenges with our previous architecture because it took a really long time to process the event log. There were many times that we had to wait until 3 p.m. the next day to get the data from the previous day. If one of the MongoDB clusters went down, we actually lost data.”

DoubleDown also needed even more event detail along with more in-depth reporting and analytics to support more complex ad hoc data science explorations. “We didn’t have any detailed game-level log data because the NoSQL system would not scale to process the larger volume that was required. As a result, it was very difficult for us to go back and do any root cause analysis or find issues we observed in that data.”

[Figure: Previous Environment]


DoubleDown turned to Snowflake’s cloud data warehouse for a better solution to host the computing and data flow for all operational and game event analysis data. This combination has given them increased scalability, lower infrastructure costs and higher agility in navigating new data flow and processing requirements, all of which helps enable them to stay ahead of their growth curve. In fact, within the next year they expect to be using 100% cloud IT infrastructure.

Intrigued by Snowflake’s scalable cloud architecture and its ability to load and process JSON log data in its native form, DoubleDown decided to replace their MongoDB data store and related MapReduce processing with Snowflake. All previous MongoDB transformations and aggregations, plus several new ones, are now done inside Snowflake once the JSON game event data has been loaded directly into it.

According to Rolfe: “We now take in the data from Amazon Kinesis and load it into an Amazon S3 landing area. Once the data is available there, our Talend process runs every 5 minutes and then loads the files directly into an event log table in Snowflake, which makes all the JSON attributes queryable.”
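The loading pattern Rolfe describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DoubleDown's Talend job or Snowflake's loader: the landing area is modeled as a plain dictionary of file names to lines, and the class `EventLog`, the helper `get_path`, and the field names in the usage note are hypothetical. The sketch shows two ideas: each scheduled run loads only the files it has not seen before, and once raw JSON events are stored, any attribute can be addressed by path without defining a schema first.

```python
import json
from typing import Any, Iterable


def parse_event_file(lines: Iterable[str]) -> list[dict]:
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def get_path(event: Any, path: str, default: Any = None) -> Any:
    """Follow a dotted path such as 'player.id' or 'spins.0.bet' into nested JSON."""
    current = event
    for key in path.split("."):
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif isinstance(current, list) and key.isdigit() and int(key) < len(current):
            current = current[int(key)]
        else:
            return default
    return current


class EventLog:
    """In-memory stand-in (hypothetical) for an event log table holding raw JSON."""

    def __init__(self) -> None:
        self.rows: list[dict] = []
        self.loaded_files: set[str] = set()

    def load_batch(self, landing_area: dict[str, list[str]]) -> int:
        """Load every file not loaded by an earlier run; return the number of new rows."""
        new_rows = 0
        for name in sorted(landing_area):
            if name in self.loaded_files:
                continue  # already loaded by a previous scheduled run
            events = parse_event_file(landing_area[name])
            self.rows.extend(events)
            self.loaded_files.add(name)
            new_rows += len(events)
        return new_rows

    def select(self, *paths: str) -> list[tuple]:
        """Return one tuple per event with the value found at each path (None if absent)."""
        return [tuple(get_path(row, p) for p in paths) for row in self.rows]
```

Calling `load_batch` once per scheduled run and then `select("player.id", "game")` would return one tuple per stored event, with `None` wherever an event lacks the requested attribute.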

Putting Snowflake in place was straightforward and happened quickly. “Snowflake seamlessly integrates between our file system and Amazon S3, and it was simple to integrate with our Talend data integration process. We brought it into production in just three months: development took less than two man-months, and then we migrated the process in the third month, including all of the testing and QA,” says Rolfe.

Using Snowflake has brought DoubleDown three important advantages: a faster, more reliable data pipeline; lower costs; and the flexibility to access new data using SQL.

Fast, reliable data pipeline

According to Rolfe, “We have huge amounts of event data in JSON files that we need to process. Snowflake was able to manage this very efficiently: because Snowflake can load and flatten a JSON structure of 2.5 million elements in less than two minutes, we’re able to run and process new event data every five minutes. Our daily process now takes about 15 minutes to process a full day’s worth of data, whereas previously it would take more than 24 hours even while using lower-granularity data.” Using Snowflake also helped DoubleDown eliminate the failures that had created delays waiting for data to be reprocessed. “Since we moved to our new data architecture, we have not had any data loss,” explained Rolfe. The improved reliability means they can now meet their SLAs by getting all game data results to analysts the same day they are generated.
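To make "flatten a JSON structure" concrete, here is a minimal Python sketch of the idea. It is conceptually similar to a SQL flatten step but is not Snowflake's implementation, and the event shape used in the usage note (`player`, `game`, `spins`) is an assumption made for illustration. `flatten` turns a nested event into (path, value) pairs, and `explode` produces one flat row per element of a nested array, which is the shape an analyst typically wants for aggregation.

```python
from typing import Any, Iterator


def flatten(value: Any, path: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (path, leaf_value) for every scalar inside a nested JSON value.

    Empty objects and empty arrays contain no scalars, so they yield nothing.
    """
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield path, value


def explode(event: dict, array_key: str) -> list[dict]:
    """Return one flat row per element of event[array_key].

    Top-level scalar fields are repeated on every row. Elements of the array
    are assumed to be JSON objects (a simplification for this sketch).
    """
    parent = {k: v for k, v in event.items() if not isinstance(v, (dict, list))}
    rows = []
    for item in event.get(array_key, []):
        row = dict(parent)
        row.update({f"{array_key}.{k}": v for k, v in item.items()})
        rows.append(row)
    return rows
```

For an event with a `game` field and a `spins` array of two objects, `explode(event, "spins")` returns two rows, each carrying the `game` value alongside that spin's own fields.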

Cost savings

Rolfe goes on to say, “Snowflake is extremely cost effective: we have saved nearly 80% by implementing Snowflake.” One part of the cost savings was being able to stage and store data with higher granularity cost effectively in Amazon S3, something made possible by the Snowflake architecture. DoubleDown also saw cost savings because they no longer need to allocate resources to constantly monitor and fix their NoSQL clusters, and they do not require specialized resources to write MapReduce jobs in order to transform their game event data.

Flexibility

Snowflake’s ability to load JSON natively saves DoubleDown several steps in their ETL process. “Snowflake provided an upgrade for our transformation processes that previously ran in MongoDB as MapReduce jobs,” says Rolfe. The ability to process their JSON data using SQL also provided significant benefits, allowing them to open up more data to both Tableau users and users of their internal dashboards.


FINDING A BETTER SOLUTION

SEEING RESULTS


ABOUT SNOWFLAKE

Snowflake is the only data warehouse built for the cloud. Snowflake delivers the performance, concurrency and simplicity needed to store and analyze all of an organization’s data in one location. Snowflake’s technology combines the power of data warehousing, the flexibility of big data platforms and the elasticity of the cloud at a fraction of the cost of traditional solutions. Snowflake: Your data, no limits.

Find out more at snowflake.net.


LOOKING INTO THE FUTURE

“Because Snowflake has the standard SQL that you would typically use in a relational database, our development pace was really rapid. Using Snowflake, we are able to quickly create queries that enable new features such as verification and validation of payout probabilities for various games and reconciling chip balances across all players.”
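As an illustration of what reconciling chip balances from event data could look like, the sketch below replays per-player chip changes from an event list and compares the result with the balances on record. The case study does not describe how DoubleDown implements this check, so everything here is an assumption: the function name `reconcile_chip_balances` and the event fields `player_id` and `chips_delta` are hypothetical.

```python
from collections import defaultdict


def reconcile_chip_balances(
    opening: dict[int, int],
    events: list[dict],
    recorded: dict[int, int],
) -> dict[int, tuple[int, int]]:
    """Return {player_id: (expected, recorded)} for each player whose balances disagree.

    The expected balance is the opening balance plus the sum of every
    'chips_delta' in that player's events (hypothetical field names).
    """
    expected: defaultdict[int, int] = defaultdict(int, opening)
    for event in events:
        expected[event["player_id"]] += event["chips_delta"]

    mismatches = {}
    for player_id in sorted(set(expected) | set(recorded)):
        want = expected.get(player_id, 0)
        have = recorded.get(player_id, 0)
        if want != have:
            mismatches[player_id] = (want, have)
    return mismatches
```

An empty result means every recorded balance is explained by the event log; any entry points to a player whose events and account balance have drifted apart, which is where a root-cause investigation would start.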

Previously, the lack of high-granularity game event data meant whole sets of decisions were ignored and therefore the root causes of problem events were not understood or acted upon. Using Snowflake, they can now perform root-cause analysis. Because of this, many future problems and software bugs can be solved, and often avoided entirely, which improves both productivity and cycle time for product delivery. Further, this positively impacts product quality, customer experience and customer lifetime value.

By removing processing steps, not only do they achieve a performance advantage resulting in same-day analytic results, they also achieve a more reliable infrastructure with fewer maintenance requirements and the ability to build out new specialized ad hoc analyses for different stakeholders.

Looking to the future, Rolfe says, “We have only scratched the surface with this new implementation. We have additional real-time reporting for game performance in development so that when a new game is launched, we can immediately see how the game is performing. We can put in alerts based on any data outliers and see where and why things are going wrong.”
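One simple way to alert on data outliers is a z-score check against recent history, sketched below. This is a generic technique chosen for illustration, not a description of the reporting DoubleDown has in development; the function name `find_outliers`, the metric names in the usage note, and the default threshold of three standard deviations are all assumptions.

```python
from statistics import mean, stdev


def find_outliers(
    baseline: dict[str, list[float]],
    current: dict[str, float],
    threshold: float = 3.0,
) -> list[str]:
    """Return the metrics in `current` that deviate strongly from their baseline.

    A metric is flagged when it lies more than `threshold` sample standard
    deviations from the mean of its history. Metrics with fewer than two
    historical points are skipped because no spread can be estimated.
    """
    alerts = []
    for name, value in current.items():
        history = baseline.get(name, [])
        if len(history) < 2:
            continue
        mu, sigma = mean(history), stdev(history)
        if sigma == 0:
            # A perfectly flat history: any change at all is unusual.
            if value != mu:
                alerts.append(name)
        elif abs(value - mu) / sigma > threshold:
            alerts.append(name)
    return alerts
```

With a history of payout rates clustered around 0.95, a current reading of 0.60 would be flagged, while a spins-per-minute reading close to its usual level would not.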

“Overall, the addition of Snowflake is a giant leap for DoubleDown and we expect many more good things to come out of this in the future,” says Rolfe.

