
Emerging Business Applications of High Performance Analytics Pivotal

Date post: 02-Jun-2018
Upload: vivek1119
Transcript
Page 1: Emerging Business Applications of High Performance Analytics Pivotal

8/10/2019 Emerging Business Applications of High Performance Analytics Pivotal

http://slidepdf.com/reader/full/emerging-business-applications-of-high-performance-analytics-pivotal 1/39

SAS Event: Kuala Lumpur - Hadoop, Big Data & Analytics
© Copyright 2013 Pivotal. All rights reserved.

Emerging Business Applications of High Performance Analytics

August 2014

Tan Yaw, Sr. Data Scientist

Page 2: Emerging Business Applications of High Performance Analytics Pivotal

Table of Contents

• Introduction
• Data Lake
• Analytics
• Labs

Page 3: Emerging Business Applications of High Performance Analytics Pivotal

Pivotal At-a-Glance

• New Independent Venture: Spun out and jointly owned by EMC & VMware
• Top Talent: ~1700 employees
• Proven Leadership: Paul Maritz,…
• Global Customer Validation: 1000+ Tier-1 Enterprise Customers
• Strategic Backing: $105M investment…
• Bold Vision: New platform for a new… focused on the intersection of Big Data… and Agile Software Development

Page 4: Emerging Business Applications of High Performance Analytics Pivotal

EMC Federation

Page 5: Emerging Business Applications of High Performance Analytics Pivotal

Pivotal Data Labs

$$F(X) = \frac{1}{M}\sum_{m=1}^{M} T_m(X) = \frac{1}{M}\sum_{m=1}^{M}\sum_{i=1}^{n} W_{im}(X)\,Y_i = \sum_{i=1}^{n}\left(\frac{1}{M}\sum_{m=1}^{M} W_{im}(X)\right) Y_i$$
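The identity above says that averaging M predictors, each of which is itself a weighted sum of the training responses Y_i, gives the same result as one weighted sum whose weights are the per-predictor weights averaged over m. A minimal numeric check, assuming randomly generated stand-ins for W_im(X) and Y_i (all values and function names here are illustrative and do not come from the slides):

```python
import random


def ensemble_average(weights, responses):
    """Average of the M individual predictions T_m = sum_i W_im * Y_i."""
    predictions = [sum(w * y for w, y in zip(row, responses)) for row in weights]
    return sum(predictions) / len(predictions)


def averaged_weights_prediction(weights, responses):
    """One weighted sum, using weights averaged over the M predictors."""
    m = len(weights)
    n = len(responses)
    avg_w = [sum(weights[k][i] for k in range(m)) / m for i in range(n)]
    return sum(w * y for w, y in zip(avg_w, responses))


rng = random.Random(0)
M, n = 5, 8
# weights[m][i] plays the role of W_im(X) for one fixed query point X (made-up values).
weights = [[rng.random() for _ in range(n)] for _ in range(M)]
responses = [rng.uniform(-1.0, 1.0) for _ in range(n)]

lhs = ensemble_average(weights, responses)
rhs = averaged_weights_prediction(weights, responses)
print(abs(lhs - rhs) < 1e-12)
```

The two functions differ only in the order of summation, which is exactly the step the last equality in the formula takes.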

Pivotal – What we do

Page 6: Emerging Business Applications of High Performance Analytics Pivotal

Customer Reference

Page 7: Emerging Business Applications of High Performance Analytics Pivotal

Pivotal Confidential–Internal Use Only

Data Lake

Page 8: Emerging Business Applications of High Performance Analytics Pivotal

Big Data


Page 9: Emerging Business Applications of High Performance Analytics Pivotal

Pivotal Business Data Lake Architecture

Figure: architecture diagram with the following labels.

• Centralized Management: System monitoring, System management
• Unified Data Management Tier: Data mgmt. services, MDM, RDM, Audit and policy mgmt.
• Processing Tier: Workflow Management
• Distillation Tier
• HDFS storage: Unstructured and structured data
• In-memory
• MPP database
• Unified Sources: Real-time ingestion, Micro batch ingestion, Batch ingestion
• Flexible Actions: Real-time insights, Interactive insights, Batch insights

Page 10: Emerging Business Applications of High Performance Analytics Pivotal

Emerging Analytics Architecture

Figure: architecture diagram with the following labels.

• Ana… Data M…
• MPP Database
• Enterprise Data Warehouse / RDBMS
• Data Staging Platform
• Data Ingestion
• Streams/Feeds
• Descriptive Analytics: Business Analysis
• Predictive Analytics: Data Science

Page 11: Emerging Business Applications of High Performance Analytics Pivotal

How is Business Data Lake Different

Criteria | Business Data Lake | EDW
Common Data Model | Single Standard Data view = Base class; Enhanced Local Data view = Derived Classes | Single Class = Si… across the enterprise…
Data Quality | |
Data Integration | |
Multiple Interfaces | SQL, SAS, R, MapReduce, NoSQL | SQL access & Integration with S…
Quality of Service | Mixed workload with varying QoS | Limited QoS separa… required
Full Spectrum | Low Latency, Interactive, Batch |
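The Common Data Model entry in this comparison describes the data lake's standard view as a base class and each local view as a derived class. A minimal sketch of that idea, assuming hypothetical class and field names (`CustomerView`, `MarketingCustomerView`, `preferred_channel`) that are not taken from the slides:

```python
from dataclasses import dataclass


@dataclass
class CustomerView:
    """Single standard data view shared across the enterprise (base class)."""
    customer_id: str
    name: str

    def summary(self) -> str:
        return f"{self.customer_id}: {self.name}"


@dataclass
class MarketingCustomerView(CustomerView):
    """Enhanced local view: adds a field that only one team needs."""
    preferred_channel: str = "email"

    def summary(self) -> str:
        return f"{super().summary()} (contact via {self.preferred_channel})"


def enterprise_report(views):
    # Code written against the base view also accepts any derived view.
    return [view.customer_id for view in views]


views = [CustomerView("c1", "Ada"), MarketingCustomerView("c2", "Lin", "sms")]
print(enterprise_report(views))
print(views[1].summary())
```

The design point is that local teams can extend the view without changing the standard fields that enterprise-wide consumers rely on.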

Page 12: Emerging Business Applications of High Performance Analytics Pivotal

Hadoop at the Center

Enabling the Data Driven Enterprise

Fastest SQL Query Engine

Hadoop as a Service

Big Data On-Demand

Ge…

In-Memory Re…

Sp…

Building B…

Page 13: Emerging Business Applications of High Performance Analytics Pivotal

China Citic Bank implements Data Lake to integrate multiple databases and in-database modelling for rapid model deployment.

Opportunity: Integrate the bank’s FICO TRIAD Customer Management Solution, Database Marketing platform, IBM Cognos Business Intelligence software, and subcenter customer relationship management (CRM)

Business Benefits:

• More Productive Telephone Sales Center
• Optimized Marketing Campaigns (1286 with 86% reduction in configuration time)
• Faster model deployment via ‘In-database’ analytics

Page 14: Emerging Business Applications of High Performance Analytics Pivotal

GPDB

AN IDEAL…

Win

NYSE Euronext manage exponential data grow… support analytic applications

“Pivotal rings the NYSE bell on Oct 29”

Opportunity: Work with NYSE Technologies division on new historical archive p…

Solution: Data Infrastructure to allow NYSE to handle trading data in real time.

Co-developed in partnership with NYSE Technologies, Pivotal Data Dispatch is aimed squarely at the big data information worker. The idea of this product is to provide data analysts with an easy way to provision various big data sets from any source, including Hadoop, MPP, flat files or legacy databases.

Page 15: Emerging Business Applications of High Performance Analytics Pivotal

Improving security analytics and implementing Data Lake architecture based on Pivotal HD, HAWQ and GPDB

Business Challenge: Improving security analytics for credit card transactions… developing “Data Lake” architecture for future projects.

Volume: 4 TBs and growing

Solution: Data Lake architecture including Pivotal HD, HAWQ and…

Page 16: Emerging Business Applications of High Performance Analytics Pivotal

Analytics

Page 17: Emerging Business Applications of High Performance Analytics Pivotal

Moneyball

• Lost your best players
• No resources
• Competing against richer, be… opposition

Q: How do you comp…
A: Data Driven Anal…

Page 18: Emerging Business Applications of High Performance Analytics Pivotal

Man vs. Machine : Simple Charts

• Traditionally, ‘Man’ takes data and turns it into charts in order to visualize relationships. Charts are simple and easy to interpret.
• ‘Man’ has ‘Analytical Limits’. We inherently view the world in two dimensions and in simple linear relationships.

Page 19: Emerging Business Applications of High Performance Analytics Pivotal

Man vs. Machine: Complex relationships

• But the real world is complex!
• It is not just X and Y relationships: X → Y → Z → A → B → C
• It is not just linear.
• Charts that try and visualize complex relationships are themselves m… complex.

Page 20: Emerging Business Applications of High Performance Analytics Pivotal

Man vs. Machine: Finding Patterns

• How do we classify and identify different groups within a dataset?

Page 21: Emerging Business Applications of High Performance Analytics Pivotal

Man vs Machine: Machine Learning

Figure: chart with axes X and Y.

• Machines are able to analyze complex patterns within the data that the human mind has difficulty visualizing

Page 22: Emerging Business Applications of High Performance Analytics Pivotal

Evidence-Based Decisions

When somebody on staff asks what we should do to address a problem, the first questions I now ask are ‘What does the research say? What is the evidence base?’

The core idea is that decisions supported by hard facts and sound analysis are likely to be better than decisions made on the basis of instinct, folklore or informal anecdotal evidence.

Page 23: Emerging Business Applications of High Performance Analytics Pivotal

Decision-Based Evidence

Many managers think they’ve committed their organizations to evidence-based decision making, but have instead, without realizing it, committed to decision-based evidence creation.

When asking staff to conduct a major analysis, a project team told us, “The executives have already made up their minds… We are being told that this is the way that we are going, we need to get on board and make the decision work out to be [the new choice].”

Page 24: Emerging Business Applications of High Performance Analytics Pivotal

Route Optimization

Customer

A major courier delivery services company

Business Problem

Optimizing routing decisions while meeting the demand and satisfying the many business constraints to guarantee feasibility and compliance.

Challenges

• Routing problems are known to be NP-Hard
• Size of the operation. Delivery of 3 million packages a day with the largest fleet in the US
• Existing solution takes weeks to roll out monthly routing plans

Solution

• Avoided expensive data movements… demand forecasting and route optim… database
• Built a fully parallelized approxima… that featured a variation of Floyd-Warshall shortest paths and neighborhood search…
• Achieved significant reduction in fu… over a greedy initial feasible solution
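The Solution above mentions a variation of Floyd-Warshall shortest paths. The slides do not show that variation, so the following is only the textbook sequential algorithm, run on a made-up four-stop network; the function name, graph and costs are assumptions for illustration, and it is not the parallelized in-database version the slide describes.

```python
import math


def floyd_warshall(n, edges):
    """All-pairs shortest path costs for a directed graph with n nodes.

    edges is a list of (u, v, cost) tuples with 0 <= u, v < n.
    """
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v, cost in edges:
        dist[u][v] = min(dist[u][v], cost)
    # Allow each node k in turn as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical network; costs could stand for distance or travel time.
edges = [(0, 1, 4.0), (0, 2, 1.0), (2, 1, 2.0), (1, 3, 5.0)]
dist = floyd_warshall(4, edges)
print(dist[0][1], dist[0][3])
```

The resulting cost matrix is the kind of input a route-improvement step such as neighborhood search would consume when comparing candidate stop orderings.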

Page 25: Emerging Business Applications of High Performance Analytics Pivotal

Predicting Commodity Futures through Tw…

Customer

A major agri-business cooperative

Business Problem

Predict price of commodity futures through Twitter

Challenges

• Language on Twitter does not adhere to rules of grammar and has poor structure
• No domain specific label corpus of tweet sentiment – problem is semi-supervised

Solution

• Built Sentiment Analysis and T… Regression algorithms to predict… futures from Tweets
• Established the foundation for… structured data (market fundamentals…)… unstructured data (tweets)

Page 26: Emerging Business Applications of High Performance Analytics Pivotal

Credit Risk Assessment and Stress Testing

Customer

A global financial services provider

Business Problem

Speed up the process of compliance reporting and stress testing for Basel III.

Challenges

Running the calculation procedures on the customer’s legacy database was time-consuming and therefore had to be done in overnight batch mode.

Solution

• Implement risk asset calculation… testing on the Greenplum database…
• Three years of data was processed… under 2 minutes, significantly f… customer’s current procedures.
• Connect an “in-database” visualization tool to the Greenplum database via ODBC for on-demand reporting and visualization.

Page 27: Emerging Business Applications of High Performance Analytics Pivotal

Text Analytics for Churn Prediction

Customer

A major telecom company

Business Problem

Reducing churn through more accurate models

Challenges

• Existing models only used structured features
• Call center memos had poor structure and had lots of typos

Solution

• Built sentiment analysis model… churn and topic models to understand… of conversation in call center memos
• Achieved 16% improvement in ROC… Churn Prediction
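The result above is reported in terms of ROC. One common single-number summary of a ROC curve is the area under it (AUC), which equals the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. Whether the slide's figure refers to AUC is an assumption; the sketch below only illustrates the metric, using made-up labels and scores.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve by pairwise comparison (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: 1 = churned, 0 = stayed; scores are invented model outputs.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.6, 0.5, 0.4, 0.3, 0.7, 0.8]
print(roc_auc(labels, scores))
```

Comparing this value for a model with and without text-derived features is one way an improvement "in ROC" could be measured.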

Page 28: Emerging Business Applications of High Performance Analytics Pivotal

Cross-Channel Customer Engagement

Customer

A major health insurance company

Business Problem

As each call to the call center represents a significant cost to the company, find out when customers are using the call center instead of the website

Challenges

• Unstructured text data requires considerable preprocessing

Solution

• Used logistic regression to predict w… customer would be unable to find the information on the web and need to…
• Created a topic model based on the c… to learn what these customers were calling about, since these would be the topics… were having trouble finding on the w…
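The Solution above uses logistic regression to predict which customers will end up calling. The slides give no features or data, so the following is a minimal sketch under stated assumptions: two invented per-session features, made-up labels, and a plain batch gradient-descent fit written from scratch rather than any particular library's API.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    """Probability of a call; weights[0] is the bias term."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.1, epochs=5000):
    """Fit weights by batch gradient descent on the average log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict_proba(weights, x) - y
            grad[0] += error
            for j, v in enumerate(x):
                grad[j + 1] += error * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical features per web session: [failed searches, help pages viewed].
rows = [[0, 1], [1, 0], [0, 0], [3, 2], [4, 1], [5, 3], [2, 3], [1, 1]]
called = [0, 0, 0, 1, 1, 1, 1, 0]  # 1 = customer phoned the call center afterwards
weights = fit_logistic(rows, called)
print(predict_proba(weights, [4, 2]) > 0.5, predict_proba(weights, [0, 0]) > 0.5)
```

In practice the inputs would come from web logs and the preprocessed text the Challenges bullet refers to; the fitted probabilities can then be thresholded to flag sessions likely to turn into calls.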

Page 29: Emerging Business Applications of High Performance Analytics Pivotal

Labs

Page 30: Emerging Business Applications of High Performance Analytics Pivotal

Industrial-era business practices

• Many enterprise-grade… business practices are… suited for an industr…
• But may face challenges… dealing with the Inte… where ‘Speed’ and ‘… are being key competitive… levers

Page 31: Emerging Business Applications of High Performance Analytics Pivotal

Industrial-era business practices

• Waterfall Project Mgmt
• Develop, Test, Production, DR environments
• Detailed Requirements
• Structured Data Schema

Page 32: Emerging Business Applications of High Performance Analytics Pivotal

Knowledge-era business practices

INNOVATION

• Silicon Valley has always been a hot-bed of innovation.
• When working with new technology, demanding high-availability… speed, uncertain customer preferences, DIFFERENT BUSINESS PROCESSES are needed

Page 33: Emerging Business Applications of High Performance Analytics Pivotal

Page 34: Emerging Business Applications of High Performance Analytics Pivotal

Labs Experiments

• Data Lab experim… as a key approach… generating value

Page 35: Emerging Business Applications of High Performance Analytics Pivotal

Data Labs

Data Science + Data Engineering

MAD Approach to Analytics

Page 36: Emerging Business Applications of High Performance Analytics Pivotal

MAD Approach to Analytics

Magnetic – attracting data to your EDW by r… “barriers to entry”

Agile – enabling rapid analyses through th… of powerful tools as close as possible to the data

Deep – going beyond basic data operations to e… analysts to reach new, rich depths in their…

Page 37: Emerging Business Applications of High Performance Analytics Pivotal

Conclusion

Analytics Vision

Page 38: Emerging Business Applications of High Performance Analytics Pivotal

Analytics Vision

Store Everything

• Obsessively collect data
• Keep it forever
• Put the data in one place

Analyze Anything

• Cleanse, organize, and manage your data lake
• Make the right tools available
• Use the resources wisely to compute, analyze, and understand data

Build… Right T…

• Use in… to iterativ… your p…

Page 39: Emerging Business Applications of High Performance Analytics Pivotal

Thank You

