Unseating the Giants
Monte Zweben, CEO, Splice Machine
October 16, 2014
The Big Squeeze
Data growing much faster than IT budgets
Source: 2013 IBM Briefing Book
Source: Gartner, Worldwide IT Spending Forecast, 3Q13 Update
Traditional RDBMS Giants Overwhelmed…
Scale-up becoming cost-prohibitive
Splice Machine | Proprietary & Confidential
Scale-Out: The Future of Databases
Dramatic improvement in price/performance
Scale Up (increase server size) vs. Scale Out (more small servers)
Unseating the Giants
Scale-Up Giants vs. Scale-Out Challengers
Scale-Out Example #1
New application chooses NoSQL
ADVERTISER ROCKET FUEL
145 RTB advertising supply partners
21,103,424 websites
19Bn daily impressions
###MM WW consumers
91,999 devices
Rocket Fuel: New Application
AdExchange
Rocket Fuel Platform
Auto Optimization
Real-Time Bidding
Publishers
Exchanges
Ad networks
Advertisers
Data Providers
[Figure: grid of per-impression dollar values. Factors shown: Site/Page, Geo/Weather, Time of Day, Brand Affinity, User]
[Figure: Rocket Fuel platform architecture, with numbered steps 1-9.
Components: RTB exchanges, direct publishers, load balancers, bid servers, ad servers, pixel servers, user profile store (HBase), HDFS (master and slaves), ETL from bidder logs and ad server logs, master database, response prediction models (hourly refresh), campaign framework, Apollo ad-hoc analytics tools, Apollo reporting and campaign tools.
Flow labels: bid call, user lookup, user lookup and update, evaluate ads, eligible ads, likelihood scores and bid value, ad rejection, tag for selected ad and bid, Rocket Fuel ad tag, ad creative tag.]
HBase: Proven Scale-Out
• Auto-sharding
• Scales with commodity hardware
• Cost-effective from GBs to PBs
• High availability through failover and replication
• LSM-trees
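The LSM-tree idea named on this slide can be illustrated with a toy model. The sketch below is not HBase's implementation; the class name `TinyLSM` and its `memtable_limit` parameter are hypothetical, chosen only for illustration. Writes go to an in-memory table, full tables are flushed as immutable sorted segments, and reads check the newest data first.

```python
import bisect


class TinyLSM:
    """Toy LSM-tree: an in-memory memtable plus immutable sorted segments."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical flush threshold
        self.memtable = {}
        self.segments = []  # oldest first; each is a sorted list of (key, value)

    def put(self, key, value):
        # Writes only touch memory, which keeps them cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as a new immutable, sorted segment.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: memtable first, then segments newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            keys = [k for k, _ in segment]
            i = bisect.bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return segment[i][1]
        return None

    def compact(self):
        # Merge all segments into one, letting newer values overwrite older ones.
        merged = {}
        for segment in self.segments:
            merged.update(segment)
        self.segments = [sorted(merged.items())] if merged else []
```

Because segments are never modified in place, writes stay sequential, and compaction reclaims the space used by overwritten values.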
Rocket Fuel: Results
Requests per day:
Facebook likes: 5 B
Searches on Google: 6 B
Bid requests considered by Rocket Fuel: 45 B
World-class request velocity on over 10 PB of data
Scale-Out Example #2
Web application replaces Oracle
Before Architecture: Oracle
Oracle too expensive, too slow, and too difficult to scale and modify
Metadata Storage
Shutterfly Website
Photo File Storage
Uploader App
Consumers
After Architecture: MongoDB
Flexibility and scalability of NoSQL ideal for simple web app
Metadata Storage
Shutterfly Website
Photo File Storage
Uploader App
Consumers
MongoDB Architecture
Document data model sharded across commodity servers
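Sharding documents across servers can be sketched generically as hashing a shard key to pick a server. This is a minimal illustration, not MongoDB's actual hashing or chunk-balancing logic; the function names `shard_for` and `distribute` and the `user_id` field are hypothetical.

```python
import hashlib


def shard_for(shard_key, num_shards):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def distribute(documents, key_field, num_shards):
    """Place each document on the shard selected by its key field."""
    shards = [[] for _ in range(num_shards)]
    for doc in documents:
        shards[shard_for(doc[key_field], num_shards)].append(doc)
    return shards


# Usage: spread 100 photo-metadata documents over 4 commodity servers.
docs = [{"user_id": i, "photo": f"img_{i}.jpg"} for i in range(100)]
shards = distribute(docs, "user_id", 4)
```

A stable hash sends the same key to the same shard every time, so a lookup by shard key touches only one server.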
MongoDB: Compelling Results vs. Oracle
⅕ cost with commodity scale-out
9x faster through parallelized queries
Increased agility with flexible schema and “shard on demand”
Scale-Out Example #3
Existing OLTP & OLAP Apps Replace Oracle
Before Architecture: Oracle RAC
Oracle RAC too expensive and too slow, with queries taking up to ½ hour
• Operational Reports for Campaign Performance
• Ad Hoc Audience Segmentation
Social Feeds
Web/eCommerce Clickstreams
ETL
1st Party/CRM Data
3rd Party Data (e.g., Acxiom)
POS Data
Email Marketing
Data Quality
After Architecture: Hadoop RDBMS
RDBMS functionality with proven scale-out from Hadoop
• Operational Reports for Campaign Performance
• Ad Hoc Audience Segmentation
Social Feeds
Web/eCommerce Clickstreams
ETL
1st Party/CRM Data
3rd Party Data (e.g., Acxiom)
POS Data
Email Marketing
Data Quality
Hadoop RDBMS: Best of Both Worlds
Hadoop
• Scale-out on commodity servers
• Proven to 100s of petabytes
• Efficiently handle sparse data
• Extensive ecosystem
RDBMS
• ANSI SQL
• Real-time, concurrent updates
• ACID transactions
• ODBC/JDBC support
Distributed, Parallelized Query Execution
• Parallelized computation across cluster
• Moves computation to the data
• Utilizes HBase co-processors
• No MapReduce
HBase Co-Processor
HBase Server Memory Space
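The "move computation to the data" pattern can be sketched as scatter-gather: each data partition computes a partial result locally, and a coordinator merges only the small partials. The example below uses Python threads to stand in for per-region workers. It is an illustration under that assumption, not the HBase co-processor API, and the names `local_partial_sum` and `parallel_average` and the row fields are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_partial_sum(region_rows, predicate):
    """Runs next to one partition: filter and aggregate locally."""
    matching = [row["amount"] for row in region_rows if predicate(row)]
    return sum(matching), len(matching)


def parallel_average(regions, predicate):
    """Scatter the aggregate to every region, then merge the partials."""
    workers = max(1, len(regions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(
            pool.map(lambda rows: local_partial_sum(rows, predicate), regions)
        )
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


# Usage: two regions, average purchase amount for US rows only.
regions = [
    [{"amount": 10, "country": "US"}, {"amount": 20, "country": "DE"}],
    [{"amount": 30, "country": "US"}],
]
us_average = parallel_average(regions, lambda row: row["country"] == "US")
```

Only a (sum, count) pair leaves each region, so the data sent to the coordinator stays small regardless of how many rows each region scans.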
Hadoop RDBMS: Compelling Results vs. Oracle
¼ cost with commodity scale-out
3-7x faster through parallelized queries
10-20x price/perf with no application, BI or ETL rewrites
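One simple way to relate these figures is to treat the price/performance gain as speedup divided by relative cost. The slide does not define the metric, so this definition is an assumption, and the function name `price_performance_gain` is hypothetical.

```python
def price_performance_gain(cost_fraction, speedup):
    """Assumed definition: performance gained per unit of relative cost."""
    return speedup / cost_fraction


# At 1/4 of the cost, a 3x to 7x speedup gives:
low = price_performance_gain(0.25, 3)
high = price_performance_gain(0.25, 7)
print(f"{low:.0f}x to {high:.0f}x")
```

Under this assumed definition the range works out to 12x to 28x, the same order of magnitude as the quoted 10-20x. The quoted figure is the source's own number and need not equal this naive product.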
Scale-Up vs. Scale-Out
Scale-Up: Top Reasons
1. Willing to pay for engineered systems
2. Lots of custom code (e.g., PL/SQL)
3. Proven reliability
4. Avoid risk of newer technologies
5. Less migration required
Scale-Out: Top Reasons
6. Reduce costs by 4x-10x
7. Increase performance by 3x-10x
8. Ease of scalability
9. Support for flexible schemas
10. Huge ecosystem of open source tools
Unseating the Giants: Why is it different this time?
It’s not just technology – user requirements fundamentally changed
Seismic User Shift
• Budgets flat
• Massive increase in data:
  - Volume
  - Velocity
  - Variety
• No longer acceptable to throw data away
Disruptive Tech: Scale-Out
• Leverage commodity H/W
• Reduce costs by 4-5x
• Increase perf by 5-10x
• Increase agility