
“One Size Fits All” An Idea Whose Time Has Come and Gone by Michael Stonebraker.

Date post: 18-Jan-2016
Page 1: 1 “One Size Fits All” An Idea Whose Time Has Come and Gone by Michael Stonebraker.

“One Size Fits All”
An Idea Whose Time Has Come and Gone

by Michael Stonebraker

Page 2

Current DBMS Gold Standard

• Store fields in one record contiguously on disk
• Use B-tree indexing
• Use small (e.g. 4K) disk blocks
• Align fields on byte or word boundaries
• Conventional (row-oriented) query optimizer and executor

Page 3

Terminology -- “Row Store”

[Figure: whole records laid out one after another, labeled Record 2, Record 4, Record 1, Record 3]

E.g. DB2, Oracle, Sybase, SQLServer, …

Page 4

Row Stores are Write Optimized

• Can insert and delete a record in one physical write
• Good for OLTP
• But not for the data warehouse and other read-mostly markets

Page 5

The Elephants and Warehouses

• Bitmap indexes
• Star schema optimization
• Materialized views
• Compression (coding) of attributes

Page 6

Ultimate Result…

• Read optimized store
• Instead of a write optimized store

Page 7

A Column Store (Like Sybase IQ)
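
The contrast between the write-optimized row layout and the read-optimized column layout can be sketched in a few lines. This is only an illustration under stated assumptions: the three-field records, the field names, and the in-memory lists standing in for disk pages are all hypothetical, and nothing here reflects how Sybase IQ or any other product is implemented.

```python
# Hypothetical three-field records; names and values are illustrative only.
records = [
    (1, "IBM", 101.5),
    (2, "MSFT", 27.3),
    (3, "ORCL", 13.9),
]
new_record = (4, "SUNW", 4.2)

# Row layout: the fields of a record sit together, so an insert is one append.
row_store = list(records)
row_store.append(new_record)

# Column layout: one sequence per attribute, so an insert touches every column.
fields = ("id", "symbol", "price")
column_store = {name: [r[i] for r in records] for i, name in enumerate(fields)}
for name, value in zip(fields, new_record):
    column_store[name].append(value)

# A read-mostly query over one attribute walks every whole record in the
# row layout...
row_total = sum(price for _id, _symbol, price in row_store)
# ...but reads only a single column in the column layout.
column_total = sum(column_store["price"])
```

The insert costs one write in the row layout and one write per attribute in the column layout, while the aggregate reads one column instead of every field of every record.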

Page 8

What is Fast Approaching…

[Figure: a warehouse engine and an OLTP engine connected to a common parser]

Page 9

Two Engines United by a Common Parser

• Marketing can preserve the “one size fits all” fiction
• Good idea because two engines cause
  – Sales confusion
  – Marketing confusion
  – Compatibility issues

But it won’t work in stream processing!

Page 10

Example Application – Feed Alarms

[Figure: Feed A and Feed B enter a custom-coded feed alarm application, which emits alarms]

Page 11

Characteristics of Feed Alarm Pilot

• 500 rapidly updating tickers (5 sec. interval) + 4000 slowly updating tickers (60 sec. interval) in each feed.
• Problem types
  1. Low-level alarm: ticker not seen within update interval.
  2. Problem in feed: more than 100 low-level alarms from Feed A or Feed B.
  3. Problem in exchange: more than 100 low-level alarms from NASDAQ or NYSE.
• Suppression
  – When problems of type 2 or 3 are detected, do not emit (distracting) problems of type 1.
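
The three problem types and the suppression rule can be sketched as a small function. The slide does not say over what period the 100 low-level alarms are counted, so the sketch assumes the caller passes in the alarms for one evaluation period; the `(ticker, feed, exchange)` tuple shape and the function name are also hypothetical.

```python
from collections import Counter

THRESHOLD = 100  # "more than 100" low-level alarms, per the slide


def summarize(low_alarms):
    """Decide what to emit for one evaluation period (an assumed notion).

    low_alarms: list of (ticker, feed, exchange) tuples, one per type 1 alarm.
    """
    by_feed = Counter(feed for _ticker, feed, _exch in low_alarms)
    by_exchange = Counter(exch for _ticker, _feed, exch in low_alarms)

    # Type 2 and type 3 problems: more than THRESHOLD low-level alarms.
    problems = [("feed", f) for f, n in sorted(by_feed.items()) if n > THRESHOLD]
    problems += [
        ("exchange", e) for e, n in sorted(by_exchange.items()) if n > THRESHOLD
    ]

    # Suppression: once a type 2 or 3 problem exists, drop the type 1 alarms.
    if problems:
        return problems
    return [("low", ticker) for ticker, _feed, _exch in low_alarms]
```

With two isolated alarms the function passes them through unchanged; with 150 alarms from one feed it reports a single feed-level problem and nothing else.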

Page 12

Results

• Aurora/Grassy Brook implementation
  – ~160K msgs/sec on a 3.2GHz Linux Pentium
• Elephant solution
  – ~900 msgs/sec on the same hardware
• More than 2 orders of magnitude difference…
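
The "more than 2 orders of magnitude" claim follows directly from the two throughput figures. The short check below uses only the numbers quoted on the slide; the variable names are illustrative.

```python
import math

aurora_rate = 160_000  # msgs/sec, Aurora/Grassy Brook figure from the slide
elephant_rate = 900    # msgs/sec, "elephant" figure from the slide

speedup = aurora_rate / elephant_rate
orders_of_magnitude = math.log10(speedup)
print(f"speedup: {speedup:.0f}x, orders of magnitude: {orders_of_magnitude:.2f}")
```

The ratio is a little under 180x, so its base-10 logarithm is above 2.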

Page 13

Why?

• Inbound vs outbound processing
• The right primitives
• Integration of application logic

Page 14

Traditional Model: Outbound Processing

[Figure: diagram with the labels Updates, Data, Storage, and Processing and queries]

Page 15

Stream Processing Model: Inbound Processing

[Figure: diagram with the labels Data, Application, and Storage]

Page 16

Alarm Correlation Application

[Figure: operator network for the alarm correlation application. Recoverable labels: inputs Reuters and ComStock, each with 500 fast and 4000 slow tickers; operators Alarm (D = 5 sec), Alarm (D = 60 sec), Filter (Alarm = 1), U, Count 100, Filter (ex = NY), Filter (hi = 1), Filter hi/lo, Map; outputs Hi = problem in Reuters, Hi = problem in ComStock, Hi = problem in NY, Hi = problem in NA, Lo = problem in security in Reuters, Lo = problem in security in ComStock.]

Page 17

Inbound Processing

• Never store the data!
• Lower overhead
• Lower latency
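
The difference between the two models can be sketched with two small functions. This is a toy contrast, not a description of any engine: a Python list stands in for storage, and the function names and the predicate-style `query` argument are assumptions made for illustration.

```python
def outbound(updates, query):
    """Outbound model: store every item first, then query the stored data."""
    table = []
    for item in updates:
        table.append(item)  # each item is written before any processing
    return [item for item in table if query(item)]


def inbound(stream, query):
    """Inbound model: evaluate each item as it arrives; nothing is stored."""
    for item in stream:
        if query(item):
            yield item  # a result is available as soon as the item is seen
```

Both produce the same answers for the same input. The inbound version holds no table and can emit a result before the rest of the stream has arrived, which is where the lower overhead and lower latency come from.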

Page 18

Inbound Processing in DBMSs

• Triggers (glue-on)
  – Limited support
  – Often slow
• In theory, a DBMS could be both inbound and outbound, but this is a research project…
• Just hooking a query plan up to a stream is not good enough…

Page 19

Windowed Time Series Operators

• Windowed time series operators
  – Group by stock_id
  – Window is 2 ticks
  – Slide by 1 tick
• Resilient to stream imperfections
  – User-specified timeouts for late data
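
A window of 2 ticks that slides by 1 tick, grouped by stock_id, can be sketched as follows. This is a minimal sketch, not Aurora's operator: the `(stock_id, price)` tuple shape and the function name are assumptions, and timeouts for late data are left out.

```python
from collections import defaultdict, deque


def sliding_windows(ticks, size=2):
    """Yield (stock_id, window) for each full window, sliding by one tick.

    ticks: iterable of (stock_id, price) pairs in arrival order.
    """
    # One bounded buffer per group; maxlen makes the deque drop its oldest
    # tick on append, which is exactly a slide of 1.
    windows = defaultdict(lambda: deque(maxlen=size))
    for stock_id, price in ticks:
        window = windows[stock_id]
        window.append(price)
        if len(window) == size:
            yield stock_id, tuple(window)
```

Each stock keeps only its last two ticks, so state is bounded by the number of groups rather than by the length of the stream.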

Page 20

Alarm Correlation Application

[Figure: operator network for the alarm correlation application. Recoverable labels: inputs Reuters and ComStock, each with 500 fast and 4000 slow tickers; operators Alarm (D = 5 sec), Alarm (D = 60 sec), Filter (Alarm = 1), U, Count 100, Filter (ex = NY), Filter (hi = 1), Filter hi/lo, Map; outputs Hi = problem in Reuters, Hi = problem in ComStock, Hi = problem in NY, Hi = problem in NA, Lo = problem in security in Reuters, Lo = problem in security in ComStock.]

Page 21

Windowed Aggregates With Timeouts

Alarm (D = 5 sec) is the same as:

Aggregate
  – Group by ticker
  – Window (size = 2 tuples, step = 1 tuple)
  – Timeout = 5 sec.
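
One way to read the equivalence: a 2-tuple window per ticker closes when the next tuple for that ticker arrives, and the timeout fires if that takes longer than 5 seconds. The sketch below replays recorded timestamps rather than running against a live clock, so it reports the moment each timeout would have fired. The `(timestamp, ticker)` event shape, the `end_time` parameter, and the function name are assumptions, and each gap yields one alarm no matter how long it lasts.

```python
def timeout_alarms(events, timeout=5.0, end_time=None):
    """Return (ticker, fire_time) for every window that timed out.

    events: list of (timestamp_sec, ticker), sorted by timestamp.
    end_time: hypothetical "now", used to check windows still open at the end.
    """
    last_seen = {}
    alarms = []
    for ts, ticker in events:
        prev = last_seen.get(ticker)
        # The window (prev, current) took too long to fill: the timeout
        # would have fired `timeout` seconds after the previous tuple.
        if prev is not None and ts - prev > timeout:
            alarms.append((ticker, prev + timeout))
        last_seen[ticker] = ts
    if end_time is not None:
        for ticker, prev in sorted(last_seen.items()):
            if end_time - prev > timeout:
                alarms.append((ticker, prev + timeout))
    return alarms
```

A stored-data DBMS would have to poll for the same information, since nothing arrives to trigger the check when a ticker simply goes quiet.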

Page 22

Windowed Aggregates with Timeout in DBMSs

• In the trigger system?
• On stored data (polling)?

Page 23

Integration of Application Logic

• All required capabilities in single system
  – No process switches
  – Integrated storage (not client-server)

Page 24

Integrated Code

Count 100 same as Map F.evaluate:

    cnt++
    if (cnt % 100 != 0)
        if !suppress emit lo-alarm
        else emit drop-alarm
    else emit hi-alarm, set suppress = true

• Lets first 100 low-alarms through.
• Emits one high-alarm for every 100 low-alarms.
• Suppresses low-alarms after 1st high-alarm.
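
The pseudocode translates almost line for line into a small runnable class. The class name, the `every` parameter, and returning the alarm kind as a string instead of emitting it on an output stream are assumptions made for this sketch. Note that, read literally, the pseudocode turns the 100th low-level alarm into the high-alarm, so 99 low-alarms pass through before suppression starts.

```python
class Count100:
    """Runnable rendering of the slide's Map F.evaluate pseudocode."""

    def __init__(self, every=100):
        self.every = every
        self.cnt = 0
        self.suppress = False

    def evaluate(self):
        """Call once per incoming low-level alarm; returns what to emit."""
        self.cnt += 1
        if self.cnt % self.every != 0:
            # Low-alarms pass until the first high-alarm, then are dropped.
            return "drop-alarm" if self.suppress else "lo-alarm"
        # Every `every`-th alarm becomes a high-alarm and turns suppression on.
        self.suppress = True
        return "hi-alarm"
```

Because the counter, the suppression flag, and the emit decision live in one operator, no process switch or round trip to a separate application is needed per alarm.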

Page 25

Application Integration in DBMSs

• Client-server present for protection
• Stored procedures are a start
  – Tough to do control flow
• Object-relational blades are better
  – But still tough to do control flow
• Unified programming language never made it
  – E.g. Rigel or Pascal R
• No support for embedded DBMS applications

Page 26

Transactions in Streams

• Locking
  – Critical sections are enough; no need for xacts
• Crash recovery
  – Log-based recovery slow
  – Doesn’t recover whole state
  – System unavailable during recovery
• Much better to just do HA
  – Failover to a backup (Tandem-style)
  – Forget about state recovery

Page 27

Net-Net

• Inbound vs outbound processing
• Windowed primitives vs end-of-table primitives
• Separate app vs embedded app
• HA failover vs transactions

Page 28

Whenever These Matter a Lot

• Separate engine
• To get 2 orders of magnitude benefit

Page 29

Candidates for a Separate Engine

• OLTP
• Warehouses
• Stream processing
• Sensor networks (TinyDB, etc.)
• Text retrieval (Google, etc.)
• Scientific data bases (lineage, arrays, etc.)
• XML (argued by some)

Page 30

Obvious Research Template

• Pick an area where “one size doesn’t fit”
• And figure out what does

Page 31

More Generally

• Current system software factored into
  – App server (e.g. Websphere)
  – Messaging system (e.g. MQSeries)
  – DBMS (e.g. DB2)
• Stream processing engines integrate pieces of all three
  – To avoid process switches
• How many other interesting factorings are there?

