Welcome [tc18.tableau.com] · 2020-01-06 · Hyper compiles each SQL query to machine code and then...

Welcome

Boom Goes the Data Engine! Turbocharging Tableau with Hyper

Tobias Muehlbauer

Jan Finis

#TC18

What is Hyper?

2008: Hyper Started as a Research Project at Technical University Munich

Academic Success

Commercial Spin-Off

Early 2016: Tableau Acquires Hyper

Europe R&D Center in Munich with Now Over 30 Employees

10.5: Replace Tableau Data Engine in Existing Tableau Products

2018+: Data Engine for Prep, Improve Existing Scenarios, and Evaluate New Use Cases

Hyper

Hyper as the Data Engine Replacement

Desktop, Online, Public, Server

Hyper

Hyper as the Data Engine for Prep

Customer Impact in 10.5 and 2018

Speed and size: fast analysis on data of all sizes

• Larger data sets

• Query performance scales linearly with number of CPU cores

Data freshness: faster extract creation and refreshes

• Ingestion at speed of data source

• No post-processing phase

Enterprise ready: improved scalability and performance

• Improved throughput with Hyper and Tableau Server

Learnings

Integration project that replaces a core component

Scalability and performance are an end-to-end story

Every data set and query workload is different

• Continuous performance improvements

• Differences in size of extract files

• Materialization of calculations

Resource usage and deployment guidelines

Why are Databases So Slow?

Pat Hanrahan, Co-Founder and Chief Scientist of Tableau

Keynote at an Academic Database Conference, 2012

Why is Hyper So Fast?

Modern hardware …

... and what it means for database systems

A Changing Hardware Landscape

Main memory capacities are growing fast

A Changing Hardware Landscape

CPUs are based on an increasingly complex super-scalar multi-core architecture

What Does it Mean for Database Systems?

A Changing Hardware Landscape: What Does it Mean for Database Systems?

If a hand-coded program is faster than all databases, then why can't the database just generate this program?

Hyper compiles each SQL query to machine code and then executes this code

Traditional Interpretation vs. Compilation

Traditional Interpreting Database

Interpreter has to handle all possible queries

→ very general

Cannot be adapted to specific query at hand

→ no query-specific optimizations

Query execution starts immediately

Compiling Database (Hyper)

Code generated for the specific query at hand

→ highly specialized

Query-specific optimizations baked into the program

→ highly optimized for query at hand

1. Generate highly optimized C program (fast)

2. Compilation to machine code (slow, some seconds for C code)

3. Execution of machine code
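
The contrast between interpreting and compiling can be sketched in a few lines of Python. This is an illustration only, not Hyper's code generator: the predicate tree, the column names price and quantity, and the helpers interpret, to_source, and compile_predicate are all hypothetical, and Python source stands in for the generated program.

```python
# Illustrative sketch only: contrasts interpreting a filter expression per row
# with generating query-specific code once. Not Hyper's actual code generator.

# A tiny predicate tree for: price * quantity > 100 (hypothetical columns).
EXPR = (">", ("*", ("col", "price"), ("col", "quantity")), ("const", 100))

def interpret(node, row):
    """Generic interpreter: walks the tree again for every row."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "const":
        return node[1]
    left, right = interpret(node[1], row), interpret(node[2], row)
    if kind == "*":
        return left * right
    if kind == ">":
        return left > right
    raise ValueError(f"unknown operator {kind!r}")

def to_source(node):
    """Code generator: turns the tree into a Python expression string."""
    kind = node[0]
    if kind == "col":
        return f"row[{node[1]!r}]"
    if kind == "const":
        return repr(node[1])
    return f"({to_source(node[1])} {kind} {to_source(node[2])})"

def compile_predicate(node):
    """Compile the generated source once; the result is a plain function."""
    source = f"lambda row: {to_source(node)}"
    return eval(compile(source, "<generated query>", "eval"))

rows = [
    {"price": 10, "quantity": 5},
    {"price": 30, "quantity": 4},
    {"price": 7, "quantity": 20},
]

interpreted = [r for r in rows if interpret(EXPR, r)]
predicate = compile_predicate(EXPR)      # pay the compilation cost once
compiled = [r for r in rows if predicate(r)]
assert interpreted == compiled           # same answer, different execution model
```

The interpreter dispatches on the node type for every row, while the generated function contains only the arithmetic this one query needs. The trade-off from the slide shows up here as well: the compiled path has to build and compile its source before the first row is processed.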

Why is Compilation Attractive Now?

Analogy: Fetching data from memory = Getting a document (1 cycle = 3 feet)

Processing the data = Reading the document

Latency

CPU Registers: ≈1 cycle

L1 Cache: ≈4 cycles

L2 Cache: ≈10 cycles

L3 Cache: ≈40 cycles

RAM: ≈200 cycles

DISK (HDD): ≈4-40 million cycles (other side of the earth)

For decades, databases “went around the earth” to get your data

Hyper
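
The arithmetic behind the analogy is easy to check. The sketch below converts the cycle counts into walking distances at 3 feet per cycle. Pairing each memory level with a latency in fastest-to-slowest order is an assumption about the slide's figure, and the names LATENCY_CYCLES and distance_feet are made up for this example.

```python
FEET_PER_CYCLE = 3        # from the slide: 1 cycle = 3 feet
FEET_PER_MILE = 5280

# Assumed pairing of memory levels and latencies, fastest to slowest.
LATENCY_CYCLES = {
    "CPU registers": 1,
    "L1 cache": 4,
    "L2 cache": 10,
    "L3 cache": 40,
    "RAM": 200,
    "Disk (HDD), low end": 4_000_000,
    "Disk (HDD), high end": 40_000_000,
}

def distance_feet(cycles):
    """Distance walked to fetch the 'document' at 3 feet per cycle."""
    return cycles * FEET_PER_CYCLE

for level, cycles in LATENCY_CYCLES.items():
    feet = distance_feet(cycles)
    miles = feet / FEET_PER_MILE
    print(f"{level:22s} {cycles:>12,} cycles {feet:>13,} ft {miles:>10,.1f} mi")
```

Under these assumptions a RAM access is a 600-foot walk, while a disk access is a trip of roughly 2,300 to 22,700 miles, a range that brackets the distance to the opposite side of the earth.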

Query Optimization

Efficient execution is not enough!

How Well Do Databases Optimize?

Problem: Query optimization is inherently hard

→ Only real experts can write good query optimizers

→ Many existing systems lack various optimizations

→ Only very few researchers in the whole world specialize in query optimization

So, why is Hyper good at optimization?

Prof. Dr. Thomas Neumann, co-founder of Hyper and principal advisor at Tableau, is one of these few researchers.

Thomas is a leading database systems researcher specializing in query optimization.

Parallelization

Modern CPUs have lots of cores!

But more cores are only beneficial if the software can keep these cores busy

No parallelization: Almost no utilization

Traditional parallelization in database systems only scales to a few cores; with more cores there is no further speedup

Hyper’s morsel-driven parallelization fully utilizes large numbers of cores (>120)

Morsel-Driven Parallelism

Assume you have 4 people (cores) and your goal is to eat a cake (process a query) as fast as possible

How do you do it?

How Traditional Databases Eat Cake

Cut the cake into four equally large pieces

Every person eats one piece

What if one person is slower than the others? (skew)

Hard-to-eat nuts

Piece is larger than anticipated

Distracted by other work

In the end, the slowest person eats alone, while the others have to wait

Bad CPU utilization

What if new people arrive or have to leave after the cake was cut?

→ Load balancing is hard/no elasticity
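
A small simulation makes the cost of skew concrete. It is a sketch under made-up numbers: 16 slices of cake, four of which take four times as long to eat, split into four fixed pieces up front. The helper names static_partition and makespan_and_utilization are hypothetical.

```python
def static_partition(costs, workers):
    """Split the items into `workers` contiguous, equally sized pieces."""
    size = -(-len(costs) // workers)  # ceiling division
    return [costs[i:i + size] for i in range(0, len(costs), size)]

def makespan_and_utilization(pieces, workers):
    """Total time is set by the slowest worker; the others idle until then."""
    busy = [sum(piece) for piece in pieces]
    makespan = max(busy)
    utilization = sum(busy) / (workers * makespan)
    return makespan, utilization

# 16 slices of cake; the last four contain hard-to-eat nuts (4x the cost).
costs = [1] * 12 + [4] * 4
pieces = static_partition(costs, workers=4)
makespan, utilization = makespan_and_utilization(pieces, workers=4)
print(makespan, round(utilization, 2))
```

Three workers finish after 4 time units, but the one holding the nuts needs 16, so the whole query takes 16 units and the four cores are busy less than half of the time.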

Why Skew Really Hurts

Amdahl’s law: The theoretical speedup of a partly parallel task is always limited by the non-parallel part of the task

If not all parts of query execution are fully parallelized, scalability will be limited

The more processors you want to utilize, the better you must parallelize

Skew forces a part of the task to be serial (last person eats alone)

Example: 95% parallel, 32 cores → ≈13x speedup (≈40% utilization)
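
The figures on this slide are consistent with Amdahl's law in its usual form, speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction of the work and n is the number of cores. The short check below reads the slide's numbers as p = 95% and n = 32, which is an interpretation of the figure; the function name amdahl_speedup is made up for this example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Theoretical speedup when only part of the work runs in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

speedup = amdahl_speedup(0.95, 32)
utilization = speedup / 32   # fraction of the 32 cores doing useful work
print(f"speedup {speedup:.1f}x, utilization {utilization:.0%}")
```

Even with only 5% of the work left serial, 32 cores deliver a speedup of about 12.5x, so well over half of the available compute sits idle.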

How Traditional Databases Eat Cake

How Hyper Eats Cake

Cut the cake into very small morsels

Everyone grabs a new morsel as soon as they finish their current one

A faster eater simply eats more morsels

If morsels are small enough, all eaters finish at roughly the same time (skew resilience)

If a tastier cake becomes available, an eater can switch to it quickly (query prioritization)

Number of eaters per cake can change dynamically (elasticity)

Enabling morsel-driven parallelism in existing database systems is a lot of effort
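
The morsel idea can be sketched as a scheduling simulation in which every worker takes the next morsel from a shared list the moment it becomes free. This models the dispatch policy only and is not Hyper's scheduler; the costs and the function name morsel_schedule are made up for illustration.

```python
import heapq

def morsel_schedule(costs, workers):
    """Each worker takes the next morsel from a shared list as soon as it is free."""
    # Heap of (time at which the worker becomes free, worker id).
    free_at = [(0, w) for w in range(workers)]
    heapq.heapify(free_at)
    eaten = [0] * workers    # morsels processed per worker
    finish = [0] * workers   # time each worker finishes its last morsel
    for cost in costs:
        now, w = heapq.heappop(free_at)   # next worker to become idle
        finish[w] = now + cost
        eaten[w] += 1
        heapq.heappush(free_at, (finish[w], w))
    return eaten, finish

# One slow morsel early on (hard-to-eat nuts) among 15 ordinary ones.
costs = [1, 6] + [1] * 14
eaten, finish = morsel_schedule(costs, workers=4)
makespan = max(finish)
utilization = sum(costs) / (4 * makespan)
print(eaten, finish, makespan, round(utilization, 2))
```

The worker that hits the slow morsel processes only that one, while the other three each process five and finish at time 5, one unit before the slow worker is done at time 6. Cutting the same list into four fixed pieces would leave the first worker with 9 units of work, so the others would sit idle for more than half of the run.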

Hyper versus TDE on 32 cores

More than Analytics …

Supporting Transactions and Analytics

Combining a Transactional System and a Data Warehouse is hard

Hyper: The New Data Engine

Extract Creation

Extract Refresh

Federation

Project Maestro

Dashboards

Interactive Analysis

Deep Analytics

Hyper allows both efficient management of your data and fast analysis of the latest state of your data.

Please complete the session survey from the My Evaluations menu in your TC18 app

Thank you!

#TC18

Tobias Muehlbauer: [email protected]

Jan Finis: [email protected]
