
9-10-2012 - The New Analytical Ecosystem - How Big Data Merges Top Down and Bottom Up Computing

Date post: 04-Jun-2018

Transcript
  • 8/13/2019 9-10-2012 - The New Analytical Ecosystem - How Big Data Merges Top Down and Bottom Up Computing

    1/29

    The New BI Ecosystem: How Big Data Merges Top Down and Bottom Up Computing

    Wayne W. Eckerson

    Director of Research and Founder, BI Leadership Forum

  • 2/29

    Agenda

    - Big data platforms
      - Relational databases
      - Analytical databases
      - Hadoop
    - New analytical ecosystem

  • 3/29

    What comes next?

    Kilobyte (KB)    10^3 bytes
    Megabyte (MB)    10^6 bytes
    Gigabyte (GB)    10^9 bytes
    Terabyte (TB)    10^12 bytes
    Petabyte (PB)    10^15 bytes
    Exabyte (EB)     10^18 bytes
    Zettabyte (ZB)   10^21 bytes
    Yottabyte (YB)   10^24 bytes
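
    The unit ladder above can be sketched in a few lines of Python. This is only an illustration, assuming the decimal (power-of-ten) definitions shown on the slide; the names UNITS and humanize_bytes are hypothetical and not part of the presentation.

```python
# Decimal (SI) byte units as listed on the slide, expressed as powers of ten.
UNITS = [
    ("KB", 10**3), ("MB", 10**6), ("GB", 10**9), ("TB", 10**12),
    ("PB", 10**15), ("EB", 10**18), ("ZB", 10**21), ("YB", 10**24),
]


def humanize_bytes(n: int) -> str:
    """Express n bytes in the largest decimal unit that does not exceed n."""
    if n < 0:
        raise ValueError("byte count must be non-negative")
    label, size = "bytes", 1
    for unit, factor in UNITS:
        if n >= factor:
            label, size = unit, factor
    return f"{n / size:g} {label}"


if __name__ == "__main__":
    print(humanize_bytes(2_500_000_000_000_000))  # 2.5 PB
```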

  • 4/29

    What is big data?

    a) Lots of data
    b) Different types of data
    c) More data than you can handle
    d) Purpose-built analytical systems
    e) Distributed file system
    f) New staging area and archive
    g) A Java developers' employment act
    h) A replacement for the RDBMS
    i) A club for hip data people

    Yes!

    Data, Systems, Movement

  • 5/29

    Information explosion

    Every 18 months, non-rich structured and unstructured enterprise data doubles.

    [Chart: data volume by year, 2005 to 2012; series: Unstructured & Content Depot, Structured & Replicated]

    Source: IDC Digital Universe 2009; White Paper, sponsored by EMC, May 2009
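
    To make the doubling claim concrete, here is a minimal sketch of the arithmetic. It assumes a constant 18-month doubling period, which is a simplification of the slide's statement; the function names and the 100 TB starting volume are hypothetical.

```python
DOUBLING_PERIOD_MONTHS = 18  # doubling period stated on the slide


def growth_factor(months: float, doubling_period: float = DOUBLING_PERIOD_MONTHS) -> float:
    """Multiplier on data volume after `months`, assuming steady doubling."""
    return 2 ** (months / doubling_period)


def projected_volume(initial_tb: float, months: float) -> float:
    """Projected volume in terabytes, starting from a hypothetical initial_tb."""
    return initial_tb * growth_factor(months)


if __name__ == "__main__":
    # The chart spans 2005 to 2012, i.e. 84 months.
    print(round(growth_factor(84), 1))        # about 25.4x over seven years
    print(round(projected_volume(100, 36)))   # 100 TB grows to 400 TB in 3 years
```

    Under that assumption, seven years of growth multiplies volume roughly 25-fold, which is why the curve on the chart looks exponential rather than linear.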

  • 6/29

    Data deluge

    - Structured data: Call detail records, Point of sale records, Claims data
    - Semi-structured data: Web logs, Sensor data, Email, Twitter
    - Unstructured data: Video, Audio, Images, Text

    Source: "A Sea of Sensors," The Economist, Nov 4, 2010

  • 7/29

    From transactions to observations

    Structured, Semi-Structured, Unstructured

  • 8/29

    Three big data platforms (systems)

    - General purpose relational database
    - Analytical database
    - Hadoop

  • 9/29

    1. General purpose RDBMS - Powers first generation DW

    [Diagram components: Operational Systems, ETL, Data Warehouse, Data Mart, BI Server, Reports / Dashboards]

    Benefits:
    - RDBMS already in-house
    - SQL-based
    - Trained DBAs

    Challenges:
    - Cost to deploy and upgrade
    - Doesn't support complex analytics
    - Scalability and performance

  • 10/29

    2. Analytical platforms

    Purpose-built database management systems designed explicitly for query processing and analysis that provide dramatically higher price/performance and availability compared to general purpose solutions.

    Vendors: 1010data, Aster Data (Teradata), Calpont, Datallegro (Microsoft), Exasol, Greenplum (EMC), IBM SmartAnalytics, Infobright, Kognitio, Netezza (IBM), Oracle Exadata, Paraccel, Pervasive, Sand Technology, SAP HANA, Sybase IQ (SAP), Teradata, Vertica (HP)

    Deployment Options
    - Software only (Paraccel, Vertica)
    - Appliance (SAP, Exadata, Netezza)
    - Hosted (1010data, Kognitio)

  • 11/29

    Game-changing technology

    - Quicker to deploy: Preconfigured and tuned; Fast ROI
    - Faster and more scalable: Faster query response times; Linear performance
    - Built-in analytics: Libraries of functions; Extensible SDK
    - Less costly: Less power, cooling, space; Fewer people to maintain

  • 12/29

    Business value of analytic platforms

    - Kelley Blue Book: Consolidates millions of auto transactions each week to calculate car valuations
    - AT&T Mobility: Tracks purchasing patterns for 80M customers daily to optimize targeted marketing

    [Diagram labels: Analytical database, Analytical appliance]

  • 13/29

    3. Hadoop

    - Ecosystem of open source projects
    - Hosted by Apache Foundation
    - Google developed and shared concepts
    - Distributed file system that scales out on commodity servers with direct attached storage and automatic failover

  • 14/29

    Hadoop distilled: What's new?

    [Diagram labels: BIG DATA, Open Source $$, MapReduce, Unstructured data, Distributed File System, Data scientist, Schema at Read, No SQL]

    Benefits
    - Comprehensive
    - Agile
    - Expressive
    - Affordable

    Drawbacks
    - Immature
    - Batch oriented
    - Expertise
    - TCO
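
    For readers unfamiliar with the MapReduce label on this slide, the following is a minimal single-process sketch of the map, shuffle, and reduce phases using a word count. It does not use Hadoop or any Hadoop API; the function names are hypothetical, and a real cluster would distribute each phase across nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big hadoop"]))
```

    Because each map call sees one document and each reduce call sees one key, both phases can in principle run in parallel, which is the property that lets the model scale out on commodity servers.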

  • 15/29

    Hadoop ecosystem

    Source: Hortonworks

  • 16/29

    Hadoop use cases

    - Sabre Holdings: Analyze airline shopping data
    - Vestas: Site wind turbines by modeling larger volumes of weather data
    - CBS Interactive: Optimize ad placement and pricing
    - Nokia: Identify new data services

  • 17/29

    Hadoop hype

    Gartner Group Hype Cycle

    Overheard:
    - "Hadoop will replace relational databases."
    - "Hadoop will replace data warehouses."
    - "Hadoop has a superior query engine compared to analytical platforms."
    - "Use Hadoop for any application that requires more than one node."

  • 18/29

    Hadoop adoption rates

    No plans         38%
    Considering      32%
    Experimenting    20%
    Implementing      5%
    In production     4%

    Based on 158 respondents, BI Leadership Forum, April 2012

  • 19/29

    Hadoop workloads

                             Today   In 18 Months
    Staging area             92%     92%
    Online archive           92%     92%
    Transformation engine    83%     92%
    Ad hoc queries           58%     67%
    Scheduled reports        42%     67%
    Visual exploration       25%     67%
    Data mining              58%     83%

    Based on respondents that have implemented Hadoop. BI Leadership Forum, April 2012

  • 20/29

    Which platform do you choose?

    [Chart: General Purpose RDBMS, Analytic Database, and Hadoop positioned along a data axis labeled Structured, Semi-Structured, Unstructured]

  • 21/29

    Big data platform comparison

                   RDBMS                 Analytical Database   Hadoop
    Purpose        OLTP                  Analytics             Anything
    Volume         Low                   Moderate              High
    Variety        Relational            Relational+           Variable
    Access         SQL                   SQL+                  Java+
    Latency        Low                   Moderate              High
    Concurrency    High                  Moderate              Low
    Cost per GB    High                  Moderate              Low
    Role           DW Hub or data mart   DW or Sandbox         Staging area and archive
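
    One way to work with the comparison above is to encode it as data and query it. The sketch below simply transcribes the slide's table into a dictionary; the structure, key names, and the helper platforms_where are hypothetical and carry no information beyond the table itself.

```python
# The comparison table from the slide, transcribed as a nested dictionary.
COMPARISON = {
    "RDBMS": {
        "purpose": "OLTP", "volume": "Low", "variety": "Relational",
        "access": "SQL", "latency": "Low", "concurrency": "High",
        "cost_per_gb": "High", "role": "DW hub or data mart",
    },
    "Analytical Database": {
        "purpose": "Analytics", "volume": "Moderate", "variety": "Relational+",
        "access": "SQL+", "latency": "Moderate", "concurrency": "Moderate",
        "cost_per_gb": "Moderate", "role": "DW or sandbox",
    },
    "Hadoop": {
        "purpose": "Anything", "volume": "High", "variety": "Variable",
        "access": "Java+", "latency": "High", "concurrency": "Low",
        "cost_per_gb": "Low", "role": "Staging area and archive",
    },
}


def platforms_where(attribute: str, value: str) -> list:
    """Return the platforms whose `attribute` equals `value` in the table."""
    return [name for name, traits in COMPARISON.items() if traits[attribute] == value]


if __name__ == "__main__":
    print(platforms_where("cost_per_gb", "Low"))   # ['Hadoop']
    print(platforms_where("concurrency", "High"))  # ['RDBMS']
```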

  • 22/29

    The New BI Ecosystem

    BI Framework 2020

  • 23/29

    BI Framework 2020

    [Diagram: four intelligence domains with their architectures and tools. Labels:]

    - Business Intelligence: Data Warehousing; Reports and Dashboards; MAD Dashboards; End-User Tools
    - Analytics Intelligence: Analytic Sandboxes; Ad hoc query, Spreadsheets, OLAP, Visual Analysis, Analytic Workbenches, Hadoop; Excel, Access, OLAP, Data mining, visual exploration; Ad hoc SQL
    - Continuous Intelligence: Event-driven; Event-Driven Alerts and Dashboards; Dashboard Alerts; Event detection and correlation; CEP, Streams
    - Content Intelligence: Keyword search, BI tools, XQuery, Hive, Java, etc.; MapReduce, XML schema, Key-value pairs, graph notation, etc.; HDFS, NoSQL databases
    - Other labels: Design Framework; Architecture; Reporting & Analysis; Exploration; Power Users

  • 24/29

    BI Framework

    TOP DOWN - Business Intelligence
    - Corporate Objectives and Strategy
    - Reporting & Monitoring (Casual Users)
    - Predefined Metrics
    - Data Warehousing Architecture
    - Non-volatile Data
    - Pros: Alignment; Consistency
    - Cons: Hard to build; Politically charged; Hard to change; Expensive; Schema Heavy

    BOTTOM UP - Analytics
    - Processes and Projects
    - Analysis and Prediction (Power Users)
    - Ad hoc queries
    - Analytics Architecture
    - Volatile Data
    - Pros: Quick to build; Politically uncharged; Easy to change; Low cost
    - Cons: Alignment; Consistency; Schema Light

    Analysis Begets Reports; Reports Beget Analysis
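
    The "Schema Heavy" versus "Schema Light" contrast on this slide, and the "Schema at Read" label used earlier for Hadoop, can be illustrated with a small sketch. It assumes raw events stored as JSON lines with inconsistent fields; the sample data and the function read_with_schema are hypothetical. The point is that the structure is imposed when the data is read, not when it is stored.

```python
import json

# Raw events kept exactly as captured: fields vary from record to record.
RAW_EVENTS = [
    '{"user": "a", "page": "/home", "ms": 120}',
    '{"user": "b", "page": "/cart"}',
    '{"user": "c", "ms": "95", "referrer": "ad"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select the named fields and coerce types.

    `schema` maps field name -> type constructor. Missing fields become None.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


if __name__ == "__main__":
    # Two different "schemas" over the same stored data, chosen by the reader.
    latency_view = list(read_with_schema(RAW_EVENTS, {"user": str, "ms": int}))
    page_view = list(read_with_schema(RAW_EVENTS, {"user": str, "page": str}))
    print(latency_view)
    print(page_view)
```

    A schema-heavy warehouse would instead reject or reshape the second and third records at load time, which buys consistency at the cost of flexibility.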

  • 25/29

    The new analytical ecosystem

    [Diagram components:]

    - Data sources: Operational Systems (Structured data), Machine Data, Web Data, Documents & Text, External Data, Audio/video Data
    - Data movement: Extract, Transform, Load (Batch, near real-time, or real-time); Streaming/CEP Engine
    - Platforms: Hadoop Cluster; Data Warehouse; Dept Data Mart; Analytic platform or non-relational database
    - Sandboxes: Virtual Sandboxes; Free-Standing Sandbox; In-memory Sandbox
    - Access: BI Server; Casual User; Power User
    - Architectures: Top-down Architecture; Bottom-up Architecture

  • 26/29

    Analytical sandboxes

    [Diagram: the analytical ecosystem from the previous slide. Sandbox types shown: Virtual Sandboxes, Free-Standing Sandbox, In-memory Sandbox]

  • 27/29

    Workflows

    [Diagram components: Source Systems, Analytical database (DW), Analytical tools]

    Steps legible in the diagram:
    1. Extract, transform, load
    5. Explore data
    6. Parse, aggregate
    9. Report and mine data

    Annotations: "Capture in case it's needed"; "Capture only what's needed"
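
    The two capture strategies named in the annotations can be sketched with the Python standard library. This is an illustration only: the sample CSV, table layout, and function names are hypothetical, and mapping "capture in case it's needed" to a raw archive and "capture only what's needed" to the warehouse load is an assumption, since the diagram itself did not survive extraction.

```python
import csv
import io
import sqlite3

# Hypothetical source extract with one column the warehouse does not need.
SOURCE_CSV = """order_id,customer,amount,debug_trace
1,acme,120.50,x1
2,acme,79.50,x2
3,globex,200.00,x3
"""


def extract(text):
    """Extract: read every column of every row from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: keep only the needed columns and give them types."""
    return [(int(r["order_id"]), r["customer"], float(r["amount"])) for r in rows]


def load(conn, records):
    """Load: write the typed records into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def revenue_by_customer(conn):
    """Report: aggregate the warehouse table with SQL."""
    query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    return conn.execute(query).fetchall()


raw_archive = extract(SOURCE_CSV)        # capture in case it's needed: all columns kept
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(raw_archive))  # capture only what's needed: three typed columns
report = revenue_by_customer(warehouse)

if __name__ == "__main__":
    print(report)
```

    The raw archive still holds debug_trace for later exploration, while the warehouse table carries only what the report requires.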

  • 28/29

    Recommendations

    - Explore applications for multi-structured data
    - Apply the right tool for the job: RDBMS, Analytical platform, Hadoop, NoSQL
    - Make power users full-fledged members of your BI environment
    - Reconcile top-down and bottom-up BI environments
    - Create an analytical ecosystem!

  • 29/29

    Questions?

    Wayne Eckerson [email protected]

    - Analytical thought leader
    - Founder, BI Leadership Forum
    - Director of Research, TechTarget
    - Former director of research at TDWI
    - Author

