8/13/2019 9-10-2012 - The New Analytical Ecosystem - How Big Data Merges Top Down and Bottom Up Computing
The New BI Ecosystem: How Big Data Merges Top-Down and Bottom-Up Computing
Wayne W. Eckerson
Director of Research and Founder, BI Leadership Forum
Agenda
- Big data platforms
  - Relational databases
  - Analytical databases
  - Hadoop
- New analytical ecosystem
What comes next?
Kilobyte (KB)    10^3 bytes
Megabyte (MB)    10^6 bytes
Gigabyte (GB)    10^9 bytes
Terabyte (TB)    10^12 bytes
Petabyte (PB)    10^15 bytes
Exabyte (EB)     10^18 bytes
Zettabyte (ZB)   10^21 bytes
Yottabyte (YB)   10^24 bytes
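The unit ladder on this slide follows one rule: each unit is 1,000 times the previous one. As a rough illustration, the sketch below converts a raw byte count to the largest unit that fits. The helper name `humanize_bytes` is hypothetical, and the sketch assumes decimal (power-of-1000) units as listed on the slide, not binary ones.

```python
# Decimal units in slide order: each step is a factor of 10**3.
UNITS = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def humanize_bytes(n):
    """Return (value, unit) for a non-negative integer byte count."""
    exponent = 0
    # Climb the ladder while the next unit still fits and a larger unit exists.
    while exponent < len(UNITS) - 1 and n >= 1000 ** (exponent + 1):
        exponent += 1
    return n / 1000 ** exponent, UNITS[exponent]

value, unit = humanize_bytes(10**18)
print(f"{value:g} {unit}")
```

Counts beyond the yottabyte range simply stay in YB, since the slide names no larger unit.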
What is big data?
Yes!
a) Lots of data
b) Different types of data
c) More data than you can handle
d) Purpose-built analytical systems
e) Distributed file system
f) New staging area and archive
g) A Java developers' employment act
h) A replacement for the RDBMS
i) A club for hip data people
Data
Systems
Movement
Information explosion
Every 18 months, non-rich structured and unstructured enterprise data doubles.
[Chart: enterprise data volume by year, 2005 to 2012; series: Unstructured & Content Depot, Structured & Replicated]
Source: IDC Digital Universe 2009; white paper sponsored by EMC, May 2009
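To put the doubling claim in numbers: if volume doubles every 18 months, growth after t months is 2^(t/18). The sketch below assumes steady exponential growth, which the slide does not state explicitly (it gives only the doubling period). The helper name `growth_factor` is hypothetical.

```python
def growth_factor(months, doubling_period_months=18):
    """Multiplicative growth after `months`, assuming a constant doubling period."""
    return 2 ** (months / doubling_period_months)

# The chart axis spans 2005 to 2012, i.e. 7 years = 84 months.
factor_2005_to_2012 = growth_factor(7 * 12)
print(f"Growth over 7 years: about {factor_2005_to_2012:.1f}x")
```

Under that assumption, the chart's seven-year span corresponds to roughly a 25-fold increase.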
Data deluge
- Structured data: call detail records, point-of-sale records, claims data
- Semi-structured data: web logs, sensor data, email, Twitter
- Unstructured data: video, audio, images, text
A Sea of Sensors, The Economist, Nov 4, 2010
From transactions to observations
Structured Semi-Structured Unstructured
Three big data platforms (systems)
1. General-purpose relational database
2. Analytical database
3. Hadoop
1. General-purpose RDBMS
- Powers first-generation DW
[Diagram: operational systems, ETL, data warehouse, data mart, BI server, reports/dashboards]
Benefits:
- RDBMS already in-house
- SQL-based
- Trained DBAs
Challenges:
- Cost to deploy and upgrade
- Doesn't support complex analytics
- Scalability and performance
2. Analytical platforms
Purpose-built database management systems designed explicitly for query processing and analysis that provide dramatically higher price/performance and availability compared to general-purpose solutions.
1010data, Aster Data (Teradata), Calpont, Datallegro (Microsoft), Exasol, Greenplum (EMC), IBM Smart Analytics, Infobright, Kognitio, Netezza (IBM), Oracle Exadata, Paraccel, Pervasive, Sand Technology, SAP HANA, Sybase IQ (SAP), Teradata, Vertica (HP)
Deployment options:
- Software only (Paraccel, Vertica)
- Appliance (SAP, Exadata, Netezza)
- Hosted (1010data, Kognitio)
Game-changing technology
- Quicker to deploy: preconfigured and tuned, fast ROI
- Faster and more scalable: faster query response times, linear performance
- Built-in analytics: libraries of functions, extensible SDK
- Less costly: less power, cooling, and space; fewer people to maintain
Business value of analytic platforms
- Kelley Blue Book: consolidates millions of auto transactions each week to calculate car valuations
- AT&T Mobility: tracks purchasing patterns for 80M customers daily to optimize targeted marketing
Analytical database
Analytical appliance
3. Hadoop
- Ecosystem of open source projects
- Hosted by Apache Foundation
- Google developed and shared concepts
- Distributed file system that scales out on commodity servers with direct-attached storage and automatic failover
Hadoop distilled: What's new?
BIG DATA
- Open source $$
- MapReduce
- Unstructured data
- Distributed file system
- Data scientist
- Schema at read
- No SQL
Benefits:
- Comprehensive
- Agile
- Expressive
- Affordable
Drawbacks:
- Immature
- Batch oriented
- Expertise
- TCO
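For readers new to MapReduce, the programming model named on this slide can be sketched in a few lines. The example below is a single-process word count in plain Python, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`) and the sample documents are illustrative assumptions. A real Hadoop job distributes the same three steps across many nodes.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

# Hypothetical input records; in Hadoop these would be file splits.
documents = ["big data big systems", "data movement"]

pairs = [pair for doc in documents for pair in map_phase(doc)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(word_counts)
```

Because each map call and each reduce call is independent of the others, the work can be spread over commodity servers, which is what makes the model batch oriented but scalable.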
Hadoop ecosystem
Source: Hortonworks
Hadoop use cases
- Sabre Holdings: analyze airline shopping data
- Vestas: site wind turbines by modeling larger volumes of weather data
- CBS Interactive: optimize ad placement and pricing
- Nokia: identify new data services
Hadoop hype
Gartner Group Hype Cycle
Overheard:
- "Hadoop will replace relational databases."
- "Hadoop will replace data warehouses."
- "Hadoop has a superior query engine compared to analytical platforms."
- "Use Hadoop for any application that requires more than one node."
Hadoop adoption rates
No plans        38%
Considering     32%
Experimenting   20%
Implementing    5%
In production   4%
Based on 158 respondents, BI Leadership Forum, April, 2012
Hadoop workloads
Workload                Today   In 18 months
Staging area            92%     92%
Online archive          92%     92%
Transformation engine   83%     92%
Ad hoc queries          58%     67%
Scheduled reports       42%     67%
Visual exploration      25%     67%
Data mining             58%     83%
Based on respondents that have implemented Hadoop. BI Leadership Forum, April 2012
Which platform do you choose?
Structured Semi-Structured Unstructured
Hadoop
Analytic Database
General-Purpose RDBMS
Big data platform comparison
              RDBMS                   Analytical Database   Hadoop
Purpose       OLTP                    Analytics             Anything
Volume        Low                     Moderate              High
Variety       Relational              Relational+           Variable
Access        SQL                     SQL+                  Java+
Latency       Low                     Moderate              High
Concurrency   High                    Moderate              Low
Cost per GB   High                    Moderate              Low
Role          DW hub or data mart     DW or sandbox         Staging area and archive
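One way to work with this comparison is to encode it as a lookup structure and query it by requirement. The sketch below is only an illustration: the dictionary keys and the helper `platforms_where` are hypothetical names, and the ratings are copied from the slide rather than measured.

```python
# Ratings transcribed from the comparison slide (coarse, qualitative values).
PLATFORMS = {
    "RDBMS": {
        "purpose": "OLTP", "volume": "Low", "variety": "Relational",
        "access": "SQL", "latency": "Low", "concurrency": "High",
        "cost_per_gb": "High", "role": "DW hub or data mart",
    },
    "Analytical database": {
        "purpose": "Analytics", "volume": "Moderate", "variety": "Relational+",
        "access": "SQL+", "latency": "Moderate", "concurrency": "Moderate",
        "cost_per_gb": "Moderate", "role": "DW or sandbox",
    },
    "Hadoop": {
        "purpose": "Anything", "volume": "High", "variety": "Variable",
        "access": "Java+", "latency": "High", "concurrency": "Low",
        "cost_per_gb": "Low", "role": "Staging area and archive",
    },
}

def platforms_where(attribute, value):
    """Return the platform names whose rating for `attribute` equals `value`."""
    return [name for name, traits in PLATFORMS.items() if traits[attribute] == value]

print(platforms_where("cost_per_gb", "Low"))
print(platforms_where("concurrency", "High"))
```

The single-attribute lookup makes the slide's trade-off visible: no platform is rated best on latency, concurrency, and cost per GB at the same time.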
The New BI Ecosystem
BI Framework 2020
[Diagram: four intelligence domains arranged around Reporting & Analysis; layer labels: End-User Tools, Design Framework, Architecture]
- Business Intelligence: Data Warehousing; reports and dashboards; MAD dashboards
- Analytics Intelligence: Analytic Sandboxes; ad hoc query, spreadsheets, OLAP, visual analysis, analytic workbenches, Hadoop; ad hoc SQL; Excel, Access, OLAP, data mining, visual exploration; exploration; power users
- Continuous Intelligence: event-driven alerts and dashboards; dashboard alerts; event detection and correlation; CEP, streams
- Content Intelligence: keyword search, BI tools, XQuery, Hive, Java, etc.; MapReduce, XML schema, key-value pairs, graph notation, etc.; HDFS, NoSQL databases
BI Framework

TOP DOWN: Business Intelligence
Corporate Objectives and Strategy
Reporting & Monitoring (Casual Users)
Predefined metrics
Data Warehousing Architecture
Non-volatile data
Pros:
- Alignment
- Consistency
Cons:
- Hard to build
- Politically charged
- Hard to change
- Expensive
- Schema heavy

Analysis and Prediction (Power Users)
Processes and Projects
Ad hoc queries
Analytics Architecture
Volatile data
Pros:
- Quick to build
- Politically uncharged
- Easy to change
- Low cost
Cons:
- Alignment
- Consistency
- Schema light

Analysis begets reports; reports beget analysis.
The new analytical ecosystem
[Diagram: the new analytical ecosystem]
- Data sources: operational systems (structured data), machine data, web data, documents and text, audio/video data, external data
- Data movement: extract, transform, load (batch, near real-time, or real-time); streaming/CEP engine
- Platforms and stores: Hadoop cluster, data warehouse, departmental data mart, analytic platform or non-relational database, free-standing sandbox, virtual sandboxes, in-memory sandbox
- Users and tools: BI server, casual user, power user
- Architecture labels: top-down architecture, bottom-up architecture
Analytical sandboxes
[Diagram: the analytical ecosystem with its sandboxes: free-standing sandbox, virtual sandboxes, in-memory sandbox]
- Data sources: operational systems (structured data), machine data, web data, documents and text, audio/video data, external data
- Data movement: extract, transform, load (batch, near real-time, or real-time); streaming/CEP engine
- Platforms and stores: Hadoop cluster, data warehouse, departmental data mart, analytic platform or non-relational database
- Users and tools: BI server, casual user, power user
- Architecture labels: top-down architecture, bottom-up architecture
Workflows
[Diagram: workflows between source systems, the analytical database (DW), and analytical tools; only some step labels survive]
Capture in case it's needed
Capture only what's needed
1. Extract, transform, load
5. Explore data
6. Parse, aggregate
9. Report and mine data
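The two capture policies on this slide can be contrasted with a small sketch. "Capture only what's needed" narrows the data to a predefined schema at load time. "Capture in case it's needed" keeps the raw records and applies a schema later, at the parse-and-aggregate step. Everything below is an assumption made for illustration: the log format, the field names, and the helper functions are hypothetical and do not come from the original deck.

```python
from collections import Counter

# Hypothetical web-log lines: "timestamp user page status"
RAW_LOG = [
    "2012-09-10T10:00:00 alice /home 200",
    "2012-09-10T10:00:05 bob /pricing 200",
    "2012-09-10T10:00:09 alice /pricing 404",
]

def parse(line):
    """Apply a schema to one raw line."""
    timestamp, user, page, status = line.split()
    return {"timestamp": timestamp, "user": user, "page": page, "status": int(status)}

def capture_only_needed(lines):
    """Load-time schema: keep just the field the predefined report uses."""
    return [{"page": parse(line)["page"]} for line in lines]

def capture_in_case(lines):
    """Keep every raw line untouched; the schema is applied at read time."""
    return list(lines)

def aggregate(records, field):
    """Count records per distinct value of `field`."""
    return Counter(record[field] for record in records)

warehouse = capture_only_needed(RAW_LOG)
archive = capture_in_case(RAW_LOG)

# The predefined report works against the narrow store.
page_views = aggregate(warehouse, "page")

# A new question (errors per user) needs fields the narrow store dropped,
# so it is answered by parsing the raw archive at read time.
failed = [r for r in map(parse, archive) if r["status"] >= 400]
errors_by_user = aggregate(failed, "user")
print(page_views, errors_by_user)
```

The narrow store is smaller and consistent, while the raw archive costs more to hold but can answer questions nobody anticipated at load time.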
Recommendations
- Explore applications for multi-structured data
- Apply the right tool for the job: RDBMS, analytical platform, Hadoop, NoSQL
- Make power users full-fledged members of your BI environment
- Reconcile top-down and bottom-up BI environments
- Create an analytical ecosystem!
Wayne Eckerson [email protected]
Questions?
- Analytical thought leader
- Founder, BI Leadership Forum
- Director of Research, TechTarget
- Former director of research at TDWI
- Author