© Copyright Ovum. All rights reserved. Ovum is a subsidiary of Informa plc.
Hadoop, SQL & NoSQL – No longer an either/or question
Tony Baer
Hadoop Summit 2014
June 4, 2014
Agenda
Where we’ve come – Twins separated at birth & joyous reunion
Why/how the convergence?
Loose ends
Figure (timeline, 1970s to 2010s): File systems; Hierarchical data stores; Network data stores; SQL RDBMS; OODBMS; SQL, NoSQL, Hadoop
Database timeline
Decades: 1960s, 1970s, 1980s, 1990s, 2000s, 2014
Phases: Early Development, Commercialization, Ecosystem Formation, Ecosystem Broadens
Eras: Mainframe era; Midranges & PCs emerge; Client/server & n-Tier; Big Data
“Prehistoric”
CODASYL, IMS
EF Codd publishes seminal RDBMS model
IBM System R, Ingres
DB2, Oracle, Teradata, PC-based DBMSs
SQL becomes de facto enterprise standard data platform
Tooling emerges
SQL market consolidates: Oracle, DB2, SQL Server, Teradata
MySQL/LAMP stack emerges
J2EE, .NET
NewSQL analytic platforms emerge
DBMSs add multiple engines
Big Data platform timeline
Years: 2003-2005, 2009, 2011, 2012, 2013, 2014
Phases: Early Development, Commercialization, Ecosystem Formation
Adoption: Internet firm early adopters; Enterprise early adopters (FS & Media); Mainstream adoption begins
First Advanced SQL platforms emerge
Hadoop emerges
Other NoSQL platforms emerge
MongoDB, Cassandra emerge
Cloudera intros comm’l Hadoop support
Hortonworks enters market
Major vendors enter Big Data market
Tooling emerges
2nd wave NewSQL platforms emerge
Big Data Tools emerge
Big Data Apps emerge
Platform proliferation = Data processing silos
SQL RDBMS
NewSQL RDBMS
NoSQL Key-Value
NoSQL JSON
Hadoop
OLTP (ACID)
OLTP (Non-ACID)
BI Query & Report
Analytics
OLTP (Non-ACID)
Advanced Analytics
Operational Decision Support
Operational Decision Support
MapReduce-based Advanced Analytics
Analytic SLA requirements vary
Batch Periodic Interactive Real-time
Days/Hours Hours/Minutes Seconds Split seconds
Exploratory Analytics
Standard reporting
Interactive query
Search
Streaming
Decision Support
Modeling
Operational Decision Support
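As a rough illustration of the SLA spectrum on this slide, the sketch below classifies a workload by its latency budget. The cut-off values and the names `TIERS` and `sla_tier` are assumptions made for illustration; the slide gives only qualitative ranges (days/hours, hours/minutes, seconds, split seconds).

```python
from datetime import timedelta

# Hypothetical cut-offs, fastest tier first. The slide names only
# qualitative ranges, so these exact limits are assumptions.
TIERS = [
    (timedelta(seconds=1), "Real-time"),    # split seconds
    (timedelta(minutes=1), "Interactive"),  # seconds
    (timedelta(hours=4), "Periodic"),       # hours/minutes
]


def sla_tier(latency_budget: timedelta) -> str:
    """Return the first tier whose limit exceeds the latency budget."""
    for limit, name in TIERS:
        if latency_budget < limit:
            return name
    # Anything slower than the last limit is treated as batch (days/hours).
    return "Batch"
```

A dashboard query with a five-second budget would land in the Interactive tier under these assumed limits, while an overnight model rebuild would land in Batch.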
Analytics problems cross silos – Operational examples
Customer engagement
Interaction – Customer 360 query in DW
Behavior – Enrich with sentiment analysis on Hadoop
Engagement – Manage real-time engagement on NoSQL database
Risk mitigation
Baseline – Model party & transactional risk on DW or Hadoop
Enrich – Analyze, rank impact of externalities on Hadoop
Ingest – Real-time market feeds via streaming in-memory
Define – Decision processes offline via BPM
Act – Allow/deny credit on system of record
Architecture – Common threads
Aggressive tiering
Multiple storage engines
Multiple workload types
On the horizon:
Federated query
Workload/query orchestration
Loose ends:
Common security?
SQL Databases adding multiple personas
IBM DB2
BLU architecture adds columnar, data skipping, advanced tiering
New MongoDB-compliant JSON data store
Oracle Database 12c
“In-Memory” option adds DRAM-based columnar, extreme compression
Microsoft SQL Server
PDW adds columnar indexing
PolyBase feature adds Hadoop integration
Teradata
Teradata 14.10 adds “Intelligent Memory” data tiering, columnar, Hadoop integration
Aster 6 adds graph, file store, “SNAP” framework for choreographing SQL, MapReduce, graph & Hadoop processing
SAP
“Smart Data Access” federated query over HANA, Sybase IQ, Teradata & Hadoop
Hadoop growing beyond MapReduce
Apache Hadoop 2.0’s new YARN resource allocation framework allows multiple workloads
Interactive SQL – lots of flavors
Spark – The new MapReduce & more…
Search
Streaming
Loose ends:
Graph ready for prime time?
Emerging NewSQL + NoSQL databases
JSON data stores exploding
Intuitive for representing Internet data
MongoDB, Couchbase
IBM, Teradata… potentially Oracle adding JSON
New transaction stores … not full ACID
Cassandra for NoSQL (integrated with Hadoop)
NuoDB, Clustrix, MemSQL & others reinvent OLTP for distributed Internet apps
HBase
DynamoDB, Berkeley DB (Oracle NoSQL database) & other key-value stores
A variety of overlapping choices
NewSQL
JSON
Graph
Hadoop
SQL
Deep analytics
Stream
Graph
NoSQL
Account/user profiles
Interactive content
Graph
Machine data
JSON
SQL RDBMS
OLTP
DW
JSON
Distributed OLTP
Fast, deep analytics
Active Archiving
SQL RDBMS
NewSQL RDBMS
NoSQL Key-Value
NoSQL JSON
Hadoop
From To
A variety of overlapping choices – But…
Who owns the logical hub?
SQL RDBMS NewSQL
Hadoop NoSQL
OLTP
DW
Active Archiving
JSON
Distributed OLTP
Fast, deep analytics
JSON
Graph
SQL
Deep analytics
Stream
Graph
Account/user profiles
Interactive content
Graph
Machine data
JSON
Loose ends
Ideally, policy-based federated query will be the solution
Who owns federated query?
Data platform?
BI tool?
Application?
Who owns workload management?
Who owns security?
Tug of war between data platforms likely
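To make the federated query question concrete, here is a minimal sketch of the fan-out-and-merge pattern that whichever layer "owns" federated query would need to perform. It reuses the customer engagement example from earlier in the deck (customer data in a DW, sentiment on Hadoop). The dictionaries standing in for platforms, the field names, and `federated_lookup` are all hypothetical; a real hub would go through each platform's own connector.

```python
# Hypothetical stand-ins for two data platforms. Real systems would be
# reached through their own drivers or connectors, not Python dicts.
WAREHOUSE = {"c1": {"lifetime_value": 1200}, "c2": {"lifetime_value": 300}}
HADOOP_SENTIMENT = {"c1": {"sentiment": "positive"}}

SOURCES = {"dw": WAREHOUSE, "hadoop": HADOOP_SENTIMENT}


def federated_lookup(customer_id, sources):
    """Send the same key to every source and merge whatever comes back."""
    merged = {"customer_id": customer_id}
    for table in sources.values():
        row = table.get(customer_id)
        if row is not None:
            merged.update(row)
    return merged
```

The sketch leaves open exactly the questions the slide raises: it says nothing about where this function should live (data platform, BI tool, or application), nor about who enforces security or manages the workload.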
Takeaways
Analytics no longer limited by platform constraints
Data platforms are taking on multiple personas – platform choice is not either/or
But
Analytics are no longer siloed
Execution remains siloed
The brass ring will be a logical hub for:
Policy/SLA-based workload targeting & management
Security & operations/performance management
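One way to picture policy/SLA-based workload targeting is a hub that filters a catalog of platforms by workload type and required latency tier. The sketch below is illustrative only: the class names, the workload labels, and the capabilities assigned to each platform in `CATALOG` are assumptions, not vendor facts or claims from this deck.

```python
from dataclasses import dataclass

# Latency tiers ordered from fastest to slowest.
TIER_ORDER = ["real-time", "interactive", "periodic", "batch"]


@dataclass(frozen=True)
class Platform:
    name: str
    workloads: frozenset
    fastest_tier: str  # fastest tier this platform is assumed to serve


@dataclass(frozen=True)
class Workload:
    kind: str
    required_tier: str


def eligible(platform, workload):
    """A platform qualifies if it supports the workload and is fast enough."""
    supports = workload.kind in platform.workloads
    fast_enough = (TIER_ORDER.index(platform.fastest_tier)
                   <= TIER_ORDER.index(workload.required_tier))
    return supports and fast_enough


def route(workload, catalog):
    """Return the names of all platforms that satisfy the policy."""
    return [p.name for p in catalog if eligible(p, workload)]


# Hypothetical capability catalog, for illustration only.
CATALOG = [
    Platform("SQL RDBMS", frozenset({"oltp", "bi-query"}), "interactive"),
    Platform("NoSQL", frozenset({"oltp", "operational-decision-support"}), "real-time"),
    Platform("Hadoop", frozenset({"advanced-analytics", "bi-query"}), "periodic"),
]
```

With this catalog, an interactive BI query is routed only to the SQL RDBMS, while the same query with a batch SLA may also run on Hadoop. The policy is central even though execution still happens in separate engines.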
Thank you
Tony Baer
Ovum
(646) 546-5330
@TonyBaer