MULTICORE IN DATA APPLIANCES

Gustavo Alonso Systems Group

Dept. of Computer Science ETH Zürich, Switzerland

SwissBox – CREST Workshop – March 2012

Systems Group = www.systems.ethz.ch

Enterprise Computing Center = www.ecc.ethz.ch

2 Gustavo Alonso - Systems Group - ETH Zürich

The SwissBox project

Build an open source data appliance

• Hardware

• Software

Goals

Robust by design

Scalable by design

Fully predictable

Behavior immune to peak loads (read or write)

Efficient use of modern hardware

• Cost efficiency

• Power/space/management complexity

DATA APPLIANCE

Database in a box

• Funny database

• Funny box

Examples:

• Exadata (Oracle)

• TwinFin (Netezza – IBM)

• NewDB (SAP)

• Teradata

ORACLE EXADATA

• Intelligent storage manager

• Massive caching

• RAC based architecture

• Fast network interconnect

The Multicore Challenge

Data parallelism

Relational model is highly parallel

• Independent tables

• Orthogonal operators

• Intra- and Inter-query parallelism

• Most successful commercial parallel systems

And yet …

Database engines and multicore

[Figure: Postgres, TPC-WB, 20 GB DB. x-axis: TPCW-S clients (0 to 250); y-axis: throughput in TPS (0 to 1800). Series: PG 8, PG-24, PG 48; curve annotations: 8 cores, 24 cores, 48 cores.]

Salomie, Subasu, Giceva, Alonso, EuroSys 2011

Database engines and multicore

[Figure: MySQL, TPC-WB, 20 GB DB. x-axis: clients (0 to 350); y-axis: throughput in TPS (0 to 800). Series: MYSQL-12, MYSQL-24, MYSQL-48; curve annotations: 8 cores, 24 cores, 48 cores.]

Salomie, Subasu, Giceva, Alonso, EuroSys 2011

CLAIM #1

Adding resources to a troubled application does not necessarily lead to improvements

Size matters

The challenge of appliances is the unprecedented power available

• 64 AMD cores, 256 GB memory, 10 Gb network + 3 TB NAS: ~14k CHF

• Imagine a rack full of those

Is your job large enough?

CLAIM #2

Large-scale parallelism is sensible only when looking at aggregated loads

Load interaction

In a highly parallel system, multiple jobs will get in each other’s way:

• Synchronization

• Data movement

• Resource arbitration

• Resource capping

• Heavy vs. light jobs

• Management and coordination

Load interaction in practice

[Figure: System X]

Giannikis, Alonso, Kossmann, PVLDB 2012

CLAIM #3

Robustness and performance can only be obtained by minimizing interaction

Locality in the XXIst century

[Figure: topology diagram with nodes P0 to P7]

Each die has:

• 6 cores

• 4 HT ports

• 2 memory channels

Each package has:

• 12 cores

• 4 HT ports

• 4 memory channels

CLAIM #4

Robustness and definitely performance can only be achieved on fixed data paths

Architecture

[Figure: architecture diagram. Labels: RACK (×5), multicore, parallel hardware accelerator, other data centers.]

CLAIM #5

Traditional layered architectures (HW/OS/VM/App) do not work in these environments

CLAIM #6

Strong notions of consistency (serializability) and atomicity of complex programs are no longer feasible

SWISSBOX

Alonso, Kossmann, Roscoe, CIDR 2011

SwissBox: the project

Great opportunity for research

• Rethink the entire system software stack

• Redesign the operating system, database, and storage system architecture

• Software – hardware co-design

SwissBox: the product

Direct collaboration and input from industry

Great demand for tailored systems

• Big data

• Highly demanding applications

• Low power / high efficiency

Claim # 1 => Deterministic behavior

Adding resources to an application does not necessarily lead to a performance improvement

System performance completely determined at design time through simple parameters
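One way to read "determined at design time through simple parameters" is as a closed-form latency bound. The sketch below is only an illustration under stated assumptions, not the model from the talk: it assumes every core scans its own memory partition continuously at a fixed bandwidth, and that a query waits at most for the scan in progress plus one full scan. The function names and the numbers in the usage note are hypothetical.

```python
def scan_cycle_seconds(partition_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time for one full pass over a single core's partition."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("scan bandwidth must be positive")
    return partition_bytes / bandwidth_bytes_per_s


def worst_case_latency_seconds(data_bytes: float, cores: int,
                               bandwidth_bytes_per_s: float) -> float:
    """Upper bound on query latency under the assumptions above.

    Assumption: data is split evenly across cores, and a query that
    arrives just after a scan started waits for that scan to finish
    and is then answered by the next one (two cycles in total).
    """
    if cores <= 0:
        raise ValueError("need at least one core")
    partition = data_bytes / cores
    return 2 * scan_cycle_seconds(partition, bandwidth_bytes_per_s)
```

With 16 GiB of data, 16 cores and an assumed 1 GiB/s of scan bandwidth per core, `worst_case_latency_seconds(16 * 2**30, 16, 2**30)` returns `2.0`. The bound depends only on data size, core count and bandwidth, not on how many queries are in flight.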

Clock Scan

[Figure: Clock Scan. Data is held in a circular buffer (wide table) swept by a write cursor and a read cursor. Incoming queries and updates are collected, and a query index is built for the next scan.]

Unterbrunner, Giannikis, Alonso, Kossmann, PVLDB 2009
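The Clock Scan slide can be illustrated with a minimal sketch of one scan cycle. It is a simplification under stated assumptions, not the published algorithm: queries are restricted to single equality predicates, the two cursors are collapsed into two steps per record (updates first, then queries), and the names `build_query_index` and `clock_scan_cycle` are hypothetical.

```python
from collections import defaultdict


def build_query_index(queries):
    """Index equality predicates: (column, value) -> ids of queries asking for it."""
    index = defaultdict(list)
    for qid, (column, value) in queries.items():
        index[(column, value)].append(qid)
    return index


def clock_scan_cycle(table, queries, updates):
    """One full pass over `table`, a list of dict records.

    queries: {query_id: (column, value)}, equality predicates only
    updates: list of (match_column, match_value, set_column, new_value)
    """
    index = build_query_index(queries)
    results = {qid: [] for qid in queries}
    for record in table:
        # Write cursor: apply this batch's updates to the record first.
        for match_col, match_val, set_col, new_val in updates:
            if record.get(match_col) == match_val:
                record[set_col] = new_val
        # Read cursor: probe the query index with the record's values
        # instead of evaluating every query against every record.
        for column, value in record.items():
            for qid in index.get((column, value), ()):
                results[qid].append(dict(record))
    return results
```

The cost of a cycle is one pass over the data plus index probes, so it grows slowly with the number of queries in the batch. Because updates are applied ahead of reads, every query in the batch sees the same state of each record.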

Claim # 2 => Batch processing

Large scale parallelism makes sense only when considering aggregated loads

Execution proceeds in batches (thousands of queries per batch)

Shared join

Crescando runs selections and projections on one set of cores

SharedDB runs joins on the streams from Crescando, thousands of queries at a time

Giannikis, Alonso, Kossmann, PVLDB 2012
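A shared join over a batch of queries can be sketched as follows. This is a minimal illustration, assuming each tuple in a stream carries the set of query ids that selected it. A joined tuple is kept only for the queries interested in both of its inputs. The function name `shared_hash_join` and the data layout are hypothetical, not SharedDB's actual interface.

```python
from collections import defaultdict


def shared_hash_join(left, right, key):
    """Join two annotated streams once for all queries in the batch.

    left, right: lists of (row_dict, frozenset_of_query_ids)
    Returns a list of (joined_row, frozenset_of_query_ids).
    """
    # Build phase: hash the left stream on the join key.
    buckets = defaultdict(list)
    for row, qids in left:
        buckets[row[key]].append((row, qids))

    # Probe phase: a result belongs to the queries that selected both inputs.
    out = []
    for r_row, r_qids in right:
        for l_row, l_qids in buckets.get(r_row[key], ()):
            common = l_qids & r_qids
            if common:
                out.append(({**l_row, **r_row}, common))
    return out
```

The hash table is built and probed once per batch rather than once per query, which is where the benefit of aggregating thousands of queries would come from.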

Claim # 3 => No load interaction

Robustness and performance can only be obtained by minimizing interaction

Operators are orthogonal and work on clearly delimited resources

Predictability, robustness

Giannikis, Alonso, Kossmann, PVLDB 2012

Claim # 4 => No dynamic scheduling

Robustness (and definitely performance) can only be achieved on fixed data paths

No dynamic scheduling, operators are always on and at fixed locations

Single plan: operator per core
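"Operator per core" with no dynamic scheduling could look like the sketch below: the placement is computed once, up front, and each operator process pins itself to its core. The operator names and both helper functions are hypothetical. `os.sched_setaffinity` is a real standard-library call but exists only on some platforms (notably Linux), so the helper checks for it first.

```python
import os


def static_plan(operators, cores):
    """Assign each operator its own core, once, in declaration order."""
    if len(operators) > len(cores):
        raise ValueError("need at least one core per operator")
    return dict(zip(operators, cores))


def pin_to_core(core_id):
    """Pin the calling process to one core; return False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # e.g. macOS or Windows: no affinity API in the stdlib
    os.sched_setaffinity(0, {core_id})  # pid 0 means the calling process
    return True
```

For example, `static_plan(["scan", "join", "sort"], [0, 1, 2, 3])` yields `{"scan": 0, "join": 1, "sort": 2}`. Each operator process would then call `pin_to_core` with its assigned id at startup and stay there for its lifetime.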

Raw performance

Giannikis, Alonso, Kossmann, PVLDB 2012

Claim # 5 => Open stack

Traditional layered architectures (HW/OS/VM/App) do not work in these environments

Operating System / Database co-design

Claim # 6 => Consistency

Strong notions of consistency (serializability) and atomicity of complex programs are no longer feasible

Snapshot isolation, multiversions, eventual consistency
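To show how multiversioning lets readers proceed without blocking writers, here is a minimal sketch of a multiversion store with snapshot reads. It covers only the read side: there is no write-write conflict detection, so it is not full snapshot isolation. The class `MultiVersionStore` and its logical clock are assumptions made for illustration.

```python
import bisect


class MultiVersionStore:
    """Keeps every committed version of a key, tagged with a commit timestamp."""

    def __init__(self):
        self._versions = {}  # key -> (commit timestamps, values), in commit order
        self._clock = 0      # logical commit counter (assumption: single writer)

    def write(self, key, value):
        """Commit a new version and return its timestamp."""
        self._clock += 1
        stamps, values = self._versions.setdefault(key, ([], []))
        stamps.append(self._clock)
        values.append(value)
        return self._clock

    def snapshot(self):
        """Return a timestamp that identifies the current committed state."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before snapshot_ts."""
        stamps, values = self._versions.get(key, ([], []))
        i = bisect.bisect_right(stamps, snapshot_ts)
        return values[i - 1] if i else None
```

A reader that takes `snap = store.snapshot()` keeps seeing the state as of `snap`, even while later writes add versions. This is the property that lets long scans and updates run side by side.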

SWISSBOX

Where are we?

Fully predictable performance

• Accurate analytical model

• Easily tunable / scalable

Tolerates high peaks of reads and updates without compromising SLA

Intelligent storage engine

Next steps

Hardware acceleration

Query optimizer

Parallel operators

OS / DB interfaces

In the future

Virtualized operators

Flexible, elastic deployment through OS interaction

Scalability through operator/plan replication across cores and machines

Hardware acceleration by operator offloading and in-network data processing

Operator parallelism

SwissBox in a nutshell

A new way to process data

• Parallel, predictable by design

• Not optimal but good enough

• Co-design at all levels

Great opportunity for research

• Redo everything from scratch
