1© Cloudera, Inc. All rights reserved.
Architecture Modernization
Frank Vullers
Business Value Strategist EMEA
Our relationship with data is changing
Data is now a strategic asset; how you use it is your key differentiator
Evolution in use of data
Traditional BI
Big Data Analytics: more (different) data
Fast Data Analytics: (near) real time
Traditional BI
Business determines the questions to ask
IT structures data to answer the questions
Describe outcomes
What’s Next?
(Outcome Driven)
“Capture only what’s needed”
Big Data Analytics
Business Explores Data for Questions Worth Answering
IT Delivers a Platform for Storing, Refining, and Analyzing All Data Sources
Explain Causes
Decisions/Action Plans
“Capture in case it’s needed”
(Process Driven)
Scalable Machine Learning: test, train and run on the same environment
Fast Data Analytics
A real-time event requires analysis in (near) real time
React in (near) real time with an alert or offer
“Analyse and React fast within the time window”
(near) Real time
Recommendation Engine
• Next Best Offer
• Content and/or Services Recommendation
Event Detection
• Fraud/Risk Detection
• Spam Filter
• Marketing Alerts
Model scoring
• Embedded Analytics
• Analytic Aggregates
• Reports
Real Time Events
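The "analyse and react within the time window" idea can be sketched as a sliding-window detector. This is only an illustration: the class name, the 60-second window, the threshold of three events and the "card-42" key are assumptions, not part of any product or of the original slides.

```python
from collections import defaultdict, deque


class WindowedDetector:
    """Flags a key when it produces too many events inside a sliding time window."""

    def __init__(self, window_seconds=60.0, max_events=3):
        self.window_seconds = window_seconds
        self.max_events = max_events
        self._events = defaultdict(deque)  # key -> timestamps still inside the window

    def observe(self, key, timestamp):
        """Record one event; return True if the key now exceeds the threshold."""
        window = self._events[key]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_events


# Hypothetical stream: four events within 30 seconds, then one much later.
detector = WindowedDetector(window_seconds=60.0, max_events=3)
alerts = [detector.observe("card-42", t) for t in (0, 10, 20, 30, 200)]
```

Each event is scored as it arrives, so an alert or offer can be raised while the window is still open instead of after a batch run.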
Architecture view
Schema on Read is the Change Agent
©2014 Cloudera, Inc. All rights reserved.
Schema on Write
• Determine Requirements
• Design Schema
• Collect & Transform Data
• Validate Design
Schema on Read
• Explore
• Transform
• Analyze
• Iterate
Image source: “Business Process Analytics” by M. Zur Muhlen, Robert Shapiro, in Handbook of Business Process Management 2, Springer Berlin Heidelberg, pp 137-157, 2010.
When storing data in Hadoop, it is not necessary to declare its structure or its association with any particular application.
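A minimal sketch of schema on read, in plain Python rather than any Hadoop API: the raw lines are stored as they arrived, and each consumer applies its own schema while reading. The sample records, field names and the helper `read_with_schema` are hypothetical.

```python
import json

# Raw events are stored exactly as they arrived; no schema was declared at write time.
RAW_LINES = [
    '{"user": "a", "amount": "12.50", "channel": "web"}',
    '{"user": "b", "amount": "7", "extra_field": true}',
]


def read_with_schema(lines, schema):
    """Apply a schema (field name -> converter) while reading, not while storing."""
    for line in lines:
        record = json.loads(line)
        # Fields missing from a record become None; undeclared fields are ignored.
        yield {
            field: convert(record[field]) if field in record else None
            for field, convert in schema.items()
        }


# Two consumers project the same raw data through different schemas.
billing_view = list(read_with_schema(RAW_LINES, {"user": str, "amount": float}))
marketing_view = list(read_with_schema(RAW_LINES, {"user": str, "channel": str}))
```

The stored data never changes; only the schema passed at read time does, which is what allows the explore, transform, analyze, iterate loop.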
The logical architecture hasn’t changed *
*Ralph Kimball: The Future of Data Warehousing: ETL Will Never be the Same
Modernizing (traditional) Architecture
EDW
ERP CRM …
BI
Traditional BI
Modernizing (traditional) Architecture
EDW
ERP CRM …
BI
Big Data Analytics
Modernizing (traditional) Architecture
EDW
ERP CRM …
BI
Fast Data Analytics
Summary: Modernizing Architecture
EDW
ERP CRM …
BI
Traditional BI
Fast Data Analytics
Big Data Analytics
Logical Information Architecture (1/2)
Landing Zone / Staging Layer
• Data from source
• Separate directories
• Original format and structure
Discovery Zone / Enriched Layer
• Still separate directories
• Data sets “enriched”
• Available for Discovery and Exploration
Integrated Zone / Atomic Layer
• Data joined together
• One atomic data model
• Not optimized for speed
Optimized Zone / Mart Layer
• Organized to provide optimized performance
• Organized by use case
• Denormalized, uses optimized formats
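The four zones can be sketched as a small data structure that maps each zone to a directory. The zone and layer names come from the slide; the directory names, the `/data` root and the helper functions are hypothetical choices for illustration only.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Zone:
    name: str       # zone name from the slide
    layer: str      # matching layer name from the slide
    directory: str  # hypothetical directory name


ZONES = [
    Zone("Landing", "Staging", "landing"),
    Zone("Discovery", "Enriched", "discovery"),
    Zone("Integrated", "Atomic", "integrated"),
    Zone("Optimized", "Mart", "optimized"),
]


def dataset_path(root, zone, group):
    """Build a path for a group of data inside a zone.

    In the Landing and Discovery zones the group would be a source system
    (separate directories); in the Optimized zone it would be a use case.
    """
    return PurePosixPath(root) / zone.directory / group


def next_zone(zone):
    """Return the zone that data is promoted to next, or None after the last one."""
    index = ZONES.index(zone)
    return ZONES[index + 1] if index + 1 < len(ZONES) else None


landing_crm = dataset_path("/data", ZONES[0], "crm")
```

Keeping the zone order in one list makes the promotion path explicit: data only ever moves from a zone to the one that follows it.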
Logical Information Architecture (2/2)
Zones: Landing Zone, Discovery Zone, Integrated Zone, Optimized Zone
Layers: Staging Layer, Enriched Layer, Atomic Layer, Mart Layer
Raw to Trusted
Managed: Z0-M, Z1-M, Z2-M, Z3-M
User: Z0-U, Z1-U, Z2-U
Ingest, Validation & Verification, Enrichment, Transformation, Routing
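The processing steps named on this slide (Ingest, Validation & Verification, Enrichment, Transformation, Routing) can be sketched as a chain of small functions. Only the stage order is taken from the slide; the record format, the segment rule and the destination names are assumptions made up for the example.

```python
def ingest(line):
    """Ingest: keep the source line untouched and wrap it in a record."""
    return {"raw": line}


def validate(record):
    """Validation & Verification: parse the line and reject malformed input."""
    parts = record["raw"].split(",")
    if len(parts) != 2 or not parts[1].strip().isdigit():
        raise ValueError(f"malformed line: {record['raw']!r}")
    return {**record, "customer": parts[0].strip(), "amount": int(parts[1])}


def enrich(record):
    """Enrichment: add a derived attribute (hypothetical segment rule)."""
    return {**record, "segment": "large" if record["amount"] >= 100 else "small"}


def transform(record):
    """Transformation: shape the record for the target model, dropping the raw line."""
    return {key: value for key, value in record.items() if key != "raw"}


def route(record):
    """Routing: pick a destination (hypothetical names) for the finished record."""
    if record["segment"] == "large":
        return "mart/large_orders", record
    return "mart/small_orders", record


STAGES = [ingest, validate, enrich, transform, route]


def run_pipeline(line):
    """Pass one source line through every stage in order."""
    result = line
    for stage in STAGES:
        result = stage(result)
    return result


destination, record = run_pipeline("acme, 250")
```

Because every stage takes the previous stage's output, a stage can be replaced or tested on its own without touching the others.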