Making the leap to BI on Hadoop by Mariani, dave @ atscale

Date post: 12-Jul-2015
Transcript
Page 1: Making the leap to BI on Hadoop by Mariani, dave @ atscale

Making the leap to BI on Hadoop

Predictive Analytics & Business Insights 2014

November 19, 2014

David P. Mariani

CEO

AtScale, Inc.

Page 2

THE TRUTH ABOUT DATA

“We think only 3% of the potentially useful data is tagged, and even less is analyzed.”
Source: IDC Predictions 2013: Big Data, IDC

“90% of the data in the world today has been created in the last two years.”
Source: IBM

Page 3

The Broken Promise

What We Wanted: Centralized Data Warehouse

Page 4

What We Got: Data Marts

Page 5

WHAT WE GOT

ETL + STAR SCHEMAS

Page 6

Traditional Data Architecture

[Diagram labels: INPUT DATA, ETL, DATA WAREHOUSE, MART (x3), QUERY ENGINE, ANALYSIS TOOLS]

Page 7

What’s Wrong with this Picture

[Diagram labels: INPUT DATA, ETL, DATA WAREHOUSE, MART (x3), QUERY ENGINE, ANALYSIS TOOLS]

Highly complex
Lots of people & skillsets
Multiple copies of data
Stale data
Rigid schema
Tough to change

Write Many / Structured / Early Transformation

Page 8

It Takes an Army

BI Engineer: Design Reports/Dashboards
ETL Engineer: Automate Cube Load
BI Engineer: Design Cube
DBA: Automate Data Load
ETL Engineer: Write ETL Code
DBA: Create Tables
Data Warehouse Architect: Design Star Schema
SAN/NAS Engineer: Define Storage Architecture

Page 9

Star Schema = Unnatural!

Page 10

WHAT WE WANTED

SCHEMA ON DEMAND

Page 11

Data Management Approaches

Traditional Approach
[Diagram labels: INPUT DATA, ETL, DATA WAREHOUSE, MART (x3), QUERY ENGINE, ANALYSIS TOOLS]

New Approach
[Diagram labels: INPUT DATA, HADOOP, ANALYSIS TOOLS]

Page 12

Time for a New Approach

VS

Write Once / Semi-Structured / Late Transformation

✔ ✔ ✔
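
As a rough illustration of "write once, semi-structured, late transformation", the sketch below keeps raw records exactly as they arrived and only projects them onto a set of fields when they are read. The sample lines, the field names, and the `read_with_schema` helper are hypothetical and are not taken from the slides.

```python
import json

# Hypothetical raw events, stored once and never rewritten.
RAW_LINES = [
    '{"match_id": 1, "hero": "Axe", "gold": 450}',
    '{"match_id": 1, "hero": "Lina", "gold": 610, "items": ["Blink Dagger"]}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time by projecting each raw record onto `fields`."""
    for line in lines:
        record = json.loads(line)
        # A field missing from a record becomes None instead of failing a load job.
        yield {name: record.get(name) for name in fields}


rows = list(read_with_schema(RAW_LINES, ["hero", "gold", "items"]))
for row in rows:
    print(row)
```

Asking for a different set of fields later only changes the read call; the stored lines stay untouched.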

Page 13

Not This, That

BI Engineer: Run Queries/Create Reports
Hadoop Engineer: Create EXTERNAL Tables
Hadoop Engineer: Define location to store files

VS

BI Engineer: Design Reports/Dashboards
ETL Engineer: Automate Cube Load
BI Engineer: Design Cube
DBA: Automate Data Load
ETL Engineer: Write ETL Code
DBA: Create Tables
Data Warehouse Architect: Design Star Schema
SAN/NAS Engineer: Define Storage Architecture

Page 14

Example: Key-Values
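
The slide's own example did not survive in this transcript. As a stand-in, here is a minimal sketch of reading key-value data at query time, assuming a hypothetical log line of space-separated `key=value` tokens. The line format and the `parse_key_values` helper are assumptions made for illustration.

```python
def parse_key_values(line, pair_sep=" ", kv_sep="="):
    """Split 'k=v k=v' text into a dict; tokens without the separator are skipped."""
    result = {}
    for token in line.split(pair_sep):
        if kv_sep in token:
            # Split on the first separator only, so values may contain '='.
            key, value = token.split(kv_sep, 1)
            result[key] = value
    return result


# Hypothetical event line; new keys can appear without any table change.
event = parse_key_values("user=42 event=purchase item=blink_dagger gold=2250")
print(event["item"], event["gold"])
```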

Page 15

Example: JSON
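
The JSON shown on the slide is not in this transcript. The sketch below uses a hypothetical nested match record (one match holding a list of players) and flattens it into one row per player at read time. All field names and the `explode_players` helper are assumptions.

```python
import json

# Hypothetical nested record: one match document with its players inline.
RAW_MATCH = """
{"match_id": 1001, "radiant_win": true,
 "players": [
   {"hero": "Axe", "team": "radiant", "gold": 9800, "items": ["Blink Dagger", "Vanguard"]},
   {"hero": "Lina", "team": "dire", "gold": 11200, "items": ["Force Staff"]}
 ]}
"""


def explode_players(raw_json):
    """Turn one nested match document into one flat row per player."""
    match = json.loads(raw_json)
    return [
        {
            "match_id": match["match_id"],
            "hero": player["hero"],
            "gold": player["gold"],
            # A player won if their team is the side that won the match.
            "won": (player["team"] == "radiant") == match["radiant_win"],
        }
        for player in match["players"]
    ]


rows = explode_players(RAW_MATCH)
for row in rows:
    print(row)
```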

Page 16

DEMO

MOBA Game Analytics

Page 17

Demo: DOTA 2 – What the User Sees

Key Data Points: 5 vs. 5 players per match. Players choose ‘Heroes’, use ‘Items’ & earn ‘Gold’.

Page 18

FOR THE DATA SCIENTISTS!

Page 20

As Easy As 1, 2, 3

BI Engineer: Run Queries/Create Reports
Hadoop Engineer: Create EXTERNAL Tables
Hadoop Engineer: Define location to store files

Page 21

Demo: DOTA 2 – Use Case 1

Question: Who are the most popular heroes?
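
The demo query itself is not in the transcript. One way to answer this question over nested match records is a simple pick count, sketched below on hypothetical sample data (two players per match for brevity, not the real 5 vs. 5). The record layout and the `hero_popularity` helper are assumptions.

```python
from collections import Counter

# Hypothetical matches, each carrying its players inline.
MATCHES = [
    {"players": [{"hero": "Axe"}, {"hero": "Lina"}]},
    {"players": [{"hero": "Axe"}, {"hero": "Pudge"}]},
    {"players": [{"hero": "Pudge"}, {"hero": "Axe"}]},
]


def hero_popularity(matches):
    """Count how often each hero was picked, most popular first."""
    picks = Counter(p["hero"] for m in matches for p in m["players"])
    return picks.most_common()


ranking = hero_popularity(MATCHES)
print(ranking)
```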

Page 22

Demo: DOTA 2 – Use Case 2

Question: Which heroes have the highest win rate?
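
A hedged sketch of how this question could be computed: win rate is taken here as wins divided by picks for each hero. The sample matches, the `radiant_win` and `team` fields, and the `hero_win_rates` helper are hypothetical and stand in for the demo data.

```python
from collections import defaultdict

# Hypothetical matches: the winning side plus the players on each team.
MATCHES = [
    {"radiant_win": True, "players": [{"hero": "Axe", "team": "radiant"}, {"hero": "Lina", "team": "dire"}]},
    {"radiant_win": False, "players": [{"hero": "Axe", "team": "radiant"}, {"hero": "Pudge", "team": "dire"}]},
    {"radiant_win": True, "players": [{"hero": "Pudge", "team": "radiant"}, {"hero": "Axe", "team": "dire"}]},
]


def hero_win_rates(matches):
    """Return {hero: wins / picks} across all matches."""
    picks = defaultdict(int)
    wins = defaultdict(int)
    for match in matches:
        winner = "radiant" if match["radiant_win"] else "dire"
        for player in match["players"]:
            picks[player["hero"]] += 1
            if player["team"] == winner:
                wins[player["hero"]] += 1
    return {hero: wins[hero] / picks[hero] for hero in picks}


rates = hero_win_rates(MATCHES)
best_first = sorted(rates.items(), key=lambda pair: -pair[1])
print(best_first)
```

On real data a minimum number of picks per hero would probably be needed so that a hero played once does not top the list.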

Page 23

Demo: DOTA 2 – Use Case 3

Question: What are the top 3 items associated with the best win rate?
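
Because each player row can carry a list of items, this question can be sketched without a separate item table. The example below assumes hypothetical player rows with a `won` flag and an `items` list; `top_items_by_win_rate` and its `min_uses` cut-off are illustrative choices, not the demo's method.

```python
from collections import defaultdict

# Hypothetical per-player rows: match outcome plus the items that player used.
PLAYER_ROWS = [
    {"won": True, "items": ["Blink Dagger", "Black King Bar"]},
    {"won": True, "items": ["Blink Dagger", "Force Staff"]},
    {"won": False, "items": ["Force Staff", "Vanguard"]},
    {"won": False, "items": ["Vanguard", "Black King Bar"]},
    {"won": True, "items": ["Black King Bar", "Blink Dagger"]},
]


def top_items_by_win_rate(rows, n=3, min_uses=1):
    """Rank items by the share of their users who won; ties break by name."""
    uses = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        for item in set(row["items"]):  # count an item once per player
            uses[item] += 1
            wins[item] += int(row["won"])
    rates = [(item, wins[item] / uses[item]) for item in uses if uses[item] >= min_uses]
    rates.sort(key=lambda pair: (-pair[1], pair[0]))
    return rates[:n]


top = top_items_by_win_rate(PLAYER_ROWS)
print(top)
```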

Page 24

Practical Applications

Time Series Analysis (session data)

Affinity Analysis

Segmentation Analysis

Many to Many
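
To make one of these applications concrete, affinity analysis can be sketched as counting which items appear together. The baskets below are hypothetical item lists (for example, items used by one player) and `pair_counts` is an illustrative helper, not something shown in the talk.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets: the set of items seen together in one session.
BASKETS = [
    ["Blink Dagger", "Black King Bar"],
    ["Blink Dagger", "Force Staff"],
    ["Black King Bar", "Blink Dagger", "Force Staff"],
]


def pair_counts(baskets):
    """Count how often each unordered pair of items occurs in the same basket."""
    counts = Counter()
    for basket in baskets:
        # Sorting makes (a, b) and (b, a) land on the same key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(BASKETS)
print(counts.most_common(2))
```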

Page 25

NO JOINS = HORIZONTAL SCALE
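
One reading of this slide, offered as an interpretation: when every record already contains its nested details, each node can aggregate its own slice of the data and only small partial results need to be combined. The sketch below imitates that with in-memory lists standing in for partitions; the data and helper names are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical partitions: each list stands in for the matches held by one node.
PARTITIONS = [
    [{"players": [{"hero": "Axe"}, {"hero": "Lina"}]}],
    [
        {"players": [{"hero": "Axe"}, {"hero": "Pudge"}]},
        {"players": [{"hero": "Lina"}, {"hero": "Axe"}]},
    ],
]


def count_partition(matches):
    """Local step: count hero picks using only this partition's records."""
    return Counter(p["hero"] for m in matches for p in m["players"])


def merge_counts(partials):
    """Global step: add up the small per-partition counters."""
    return reduce(lambda left, right: left + right, partials, Counter())


totals = merge_counts(count_partition(part) for part in PARTITIONS)
print(totals.most_common())
```

Adding a partition adds another independent `count_partition` call; no partition needs to look up rows held by another.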

Page 26

FOR THE ORDINARY HUMAN!

Page 28

DEMO

Page 29

Summary: The Do’s & Don’ts

Do                              Don’t
Capture data “as is”            Pre-aggregate data
Apply schema on read            Force schema on load
Land new data on Hadoop         Land new data on relational DBs
Create a data warehouse         Create data marts
Leverage open source engines    Invest in proprietary databases

Page 30

Business Intelligence Redefined
