How PepsiCo's Big Data Strategy is Disrupting CPG Retail Analytics

Post on 11-Feb-2017



Mike Riegling, Analyst, PepsiCo

presented by:

Will Davis, Trifacta

Jeff Huckaby, Tableau

Camilo Silva, Hortonworks

Your Presenter

Mike Riegling

Analyst, Customer Supply Chain

Q&A Session with your hosts:

Will Davis

Director of Product Marketing

Jeff Huckaby

Market Segment Director, Retail & Consumer Goods

Camilo Silva

Enterprise Account Manager


Industry-leading data wrangling solution for data analysts

Self-service data exploration & preparation

Supporting desktop, cloud and big data deployments

The Best-of-Breed Analytics Stack

Leading solutions for data processing, wrangling & visualization

Industry-leading Enterprise Analytics Platform

Governance & Self-service analytics at scale

Deploy on premise, in the cloud, or fully hosted

Future-proof, scalable data platform to enable storage and growth of expanding data

Enables faster business decisions based on more actionable insight

Enables corporate success in consumer markets

Agenda

Analytics Infrastructure at PepsiCo – Will Davis

• History of Big Data at PepsiCo

• IT/Business Collaboration for Analytics

• Analytics Stack: Hortonworks + Trifacta + Tableau

CPFR Data Wrangling & Analytics at PepsiCo – Mike Riegling

• CPFR Process at PepsiCo

• Challenges Managing Diverse Internal & External Data

• Walkthrough of Trifacta + Tableau

Question & Answer

• Will Davis - Trifacta

• Jeff Huckaby - Tableau

• Camilo Silva - Hortonworks

Analytics Infrastructure at PepsiCo

Analytics Journey at PepsiCo

• PepsiCo’s journey with Big Data started over 4 years ago, in response to ever-increasing data requirements across PepsiCo

• Focus on providing technology infrastructure and applications that bring shared success to Business & IT

• Eliminating traditional processes where IT was a bottleneck to the business

• Unified Data Architecture has 3 main pillars:

– Enterprise Data Warehouse

– Hortonworks Data Lake Environment

– Data Discovery, Analytics & Business Intelligence tools (Trifacta & Tableau)

Data Platform - Hortonworks

• Selected Hortonworks Data Platform (HDP) as foundational technology to extend PepsiCo’s Unified Data Architecture

• Leveraging HDP to acquire, understand and incorporate new forms of internal/external business and consumer data

• HDP provides the platform capable of scaling up to effectively leverage the rapid growth of more granular consumer data

• Still early days on Hadoop at PepsiCo – only managing hundreds of TBs of data in HDP

• Use cases on Hadoop include CPFR (first use case), Consumer & Marketing Analytics

• Only standard services are needed to support these use cases – Hive, YARN, Pig, etc.

• CPFR use case with Trifacta consumes approximately 25-50% of HDP resources

Data Wrangling - Trifacta

• Trifacta was selected as the standard self-service data wrangling tool within our data discovery infrastructure.

• Provides PepsiCo users with a familiar, yet powerful portal for data discovery and process development.

• By empowering business users, Trifacta helps bridge the time and resource boundaries between business and IT

• Enables more rapid deployment of solutions that fit business needs precisely

• Collaborative effort, with both sides open to driving innovation and experimentation, delivers greater speed to shared success

Data Visualization & Business Intelligence - Tableau

• Tableau is the data visualization & business intelligence standard at PepsiCo

• Over 2000 users, 59 projects & 541 workbooks across PepsiCo

• 7+ Tableau servers in production environment (each server has 8 cores & 64GB RAM)

• Tableau serves as corporate standard for Business Intelligence throughout PepsiCo on top of EDW as well as self-service analytics for departments and individual analysts

• CPFR use case is a completely self-service process: end users discover and prep diverse data in Trifacta and build dashboards in Tableau (without the help of IT)

Hortonworks + Trifacta + Tableau in the PepsiCo Data Architecture

Diagram: Unified Data Architecture

• Data Sources: ERP, SCM, CRM, Social Media, Sensor Data, Machine Logs

• Tools and Apps: Marketing, Planning, Data Mining, Analytics, Language

• Users: Business Analyst, Data Analyst, Data Scientist, Customer Partners, Frontline Workers

• Components: Enterprise Data Warehouse, Data Discovery / Analytics, Business Intelligence, ETL, Data Quality

PepsiCo CPFR Analysis Process

Collaborations Team Vision

“Expand Collaboration with Customers by Leveraging Shared Data to Enhance Processes, Provide Best in Class Service and Create a Competitive Advantage for PepsiCo”

CPFR Pillars

Planning

• Promotions

• New item introductions

• Transition execution

Forecasting

• Demand planning visibility

• Promotional lifts and pipeline timing

• Seasonal planning

Replenishment

• Store level inventory management

• Right sized inventory position

• Markdown Reduction

Collaboration

Managing Retail Partner Relationships

PepsiCo CPFR Team

Additional Retail Partners

Improving Business with Each Retailer

Diagram: data shared between each retailer and the PepsiCo CPFR Team

• POS Data, Shipment History, Promotions, Forecast, Orders

• Production, Inventory, Promotions, Forecast, Orders, Shipments

Forecasting Collaboration Process

Why combine this data together?

• Combining the data into a single master report gives a more accurate overall picture of performance

• Promotes collaboration between PepsiCo and the customer

• Traditionally the vendor–retailer relationship was contentious

• Combining PepsiCo data and retailer data helps promote shared success goals

• Through this process there was an increase in the forecast accuracy of PepsiCo which resulted in reduced spoilage for retailers
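
As a rough illustration of the points above, the sketch below merges a PepsiCo-side forecast with retailer-side POS sales on a shared key and reports the forecast error per item. The field names (upc, week, forecast_units, pos_units), the sample figures and the absolute-percentage-error metric are all assumptions made for illustration; the deck does not say which fields or accuracy measure the CPFR team uses.

```python
# Hypothetical sample rows; field names and numbers are illustrative only.
pepsico_forecast = [
    {"upc": "000111", "week": "2016-W40", "forecast_units": 1200},
    {"upc": "000222", "week": "2016-W40", "forecast_units": 800},
]
retailer_pos = [
    {"upc": "000111", "week": "2016-W40", "pos_units": 1000},
    {"upc": "000222", "week": "2016-W40", "pos_units": 640},
]


def build_master_report(forecast_rows, pos_rows):
    """Join forecast and POS rows on (upc, week) and add an error column."""
    pos_by_key = {(r["upc"], r["week"]): r["pos_units"] for r in pos_rows}
    report = []
    for row in forecast_rows:
        key = (row["upc"], row["week"])
        if key not in pos_by_key:
            continue  # no retailer data for this item/week; skip it
        actual = pos_by_key[key]
        # Absolute percentage error (an assumed metric); None if no sales.
        ape = abs(row["forecast_units"] - actual) / actual if actual else None
        report.append({**row, "pos_units": actual, "abs_pct_error": ape})
    return report


master_report = build_master_report(pepsico_forecast, retailer_pos)
for line in master_report:
    print(line)
```

Keeping both sides' numbers in one row per item and week is what lets vendor and retailer look at the same gap between forecast and sales.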

Original Process for Building CPFR Forecasts

Last-mile structuring, enriching and cleansing

Initial structuring, enriching, and cleansing

Business

What the Process Looked Like in Access


Challenges Leading to Hortonworks + Trifacta + Tableau Solution

• Data Outgrowing Tools: Existing infrastructure pushed to the limits by the size of the source datasets

• Technical Skills Required: Datasets were connected through a large series of elaborate queries and macros.

• Data Quality Issues: Errors difficult to locate.

• Slow, Manual Process: Build time for one CPFR tool could take months.

PepsiCo’s Hortonworks + Trifacta Solution


Business

All structuring, enriching, and cleansing

Hortonworks + Trifacta + Tableau Solution Benefits at PepsiCo

• Business Benefits:

– Reporting time has been reduced by 70%

– Build time has been reduced by as much as 90%

• Technical Benefits:

– Can easily work with large quantities of non-standard data

– Self-service prep for analysts reduces technical dependencies on IT

– Trifacta surfaces errors and data problems immediately to analysts

• PepsiCo CPFR teams can now respond more quickly to sales trends and adjust forecasting and inventory distribution accordingly

DEMO Intro - Trifacta Wrangling Process for Retailer Data

• Structure the third party data

– BOH: Balance on Hand or Inventory Data

• Cleanse mismatched values and delimiters

– Remove the ‘,’ from values that exceed 1,000

• Extract embedded text/numbers

– Split the Customer Item Code and Item description into two separate columns

• Convert the customer Item Code to the PepsiCo UPC

– Join the BOH dataset with the Item Reference Dataset and build a new master report

• Run the job at scale and profile the results

– Publish to Tableau
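
The demo steps above can be sketched outside Trifacta in plain Python. This is only an illustration of the transformations, not Trifacta's own recipe language, and it leaves out running at scale and publishing to Tableau. The column names (item, boh), the " - " separator between item code and description, the sample rows and the UPC values are all hypothetical.

```python
import csv
import io

# Hypothetical retailer BOH extract: quoted numbers use thousands separators
# and the item code is fused with the description in one column.
RAW_BOH = '''item,boh
"48213 - Cola 12pk","1,250"
"48377 - Chips 10oz","980"
"49990 - Unknown Item","12"
'''

# Hypothetical item reference mapping customer item codes to PepsiCo UPCs.
ITEM_REFERENCE = {"48213": "012000000133", "48377": "028400000222"}


def wrangle_boh(raw_text, item_reference):
    """Structure, cleanse, split and join BOH rows; return (report, unmatched)."""
    report, unmatched = [], []
    # Structure: parse the delimited text into rows keyed by column name.
    for row in csv.DictReader(io.StringIO(raw_text)):
        # Cleanse: drop the thousands separator so "1,250" becomes 1250.
        boh = int(row["boh"].replace(",", ""))
        # Extract: split the item code and description into two values.
        code, _, description = row["item"].partition(" - ")
        # Join: look up the PepsiCo UPC for the customer item code.
        upc = item_reference.get(code)
        if upc is None:
            unmatched.append(code)  # surface join misses instead of hiding them
            continue
        report.append({"upc": upc, "description": description, "boh": boh})
    return report, unmatched


master_report, unmatched_codes = wrangle_boh(RAW_BOH, ITEM_REFERENCE)
print(master_report)
print("unmatched item codes:", unmatched_codes)
```

Collecting unmatched codes separately stands in for the profiling step: rows that fail the join are reported rather than silently dropped.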

Trifacta Sample Workflow

CPFR Dashboard in Tableau

Thanks!

Q&A Session with your hosts:

Will Davis

Director of Product Marketing

Jeff Huckaby

Market Segment Director, Retail & Consumer Goods

Camilo Silva

Enterprise Account Manager


Supporting Resources

Trifacta Wrangler Enterprise for Hadoop

https://www.trifacta.com/gated-form/bringing-hadoop-to-an-analysts-fingertips/

Empowering CPG to Drive Innovation with Data

https://www.trifacta.com/resources/empowering-consumer-packaged-goods-organizations-to-drive-innovation-with-data/

About the Hortonworks Solution

http://hortonworks.com/solutions/

Try Hortonworks Sandbox

http://hortonworks.com/products/sandbox/

Big Data Analytics for Retail with Hadoop

http://hortonworks.com/info/big-data-analytics-for-retail-with-apache-hadoop/

Tableau for Big Data Analysis

http://www.tableau.com/resource/big-data-analysis

Faster, Smarter Retail Analytics with Tableau

http://www.tableau.com/resource/big-data-analysis

Thanks for joining!