Date post: 05-Jan-2016
A Development Environment for Query Optimizers Olga Papaemmanouil, Brandeis University Nga Tran, Brandeis University Mitch Cherniack, Brandeis University
Transcript
Page 1: Olga Papaemmanouil, Brandeis University Nga Tran, Brandeis University Mitch Cherniack, Brandeis University.

A Development Environment for Query Optimizers

Olga Papaemmanouil, Brandeis University
Nga Tran, Brandeis University
Mitch Cherniack, Brandeis University

Page 2:

natural consequence of One-Size-D-N-F-A

many new niche data management systems

many new “redesigns” of DBMS components
◦ exploiting HW Architecture (e.g., shared-nothing)
◦ exploiting SW Architecture (e.g., column-stores)

needed: Development Environment (DE) for
◦ Rapid Prototyping
◦ Evaluation
◦ Refinement
of alternative component designs

One-Design-D-N-F-A

Page 3:

Development Environment for Query Optimizers

◦ Rapid Prototyping: Declarative Specifications, Generator Tools

◦ Evaluation: Component Benchmarks

◦ Refinement: Component Profiling Tools

Proposal-Ware: |vision| >> |substance|

Results for Join Plan Enumerator (JPE)

◦ SQL Server: better plans with less optimizer work

◦ Vertica: Part of version 2.0 Query Optimizer

Project Overview

Page 4:

Development Environment Overview

Initial Results: JPEs

Future Work: Other Components

Talk Outline

Page 5:

1. Chooses Join Order
2. Chooses Join Algorithms and Access Methods
3. Chooses Pre-Join Data Transfer (Distributed DBs)

Targeted Optimizer Components

Join Plan Selector: JPE (Join Plan Enumerator), PPM (Physical Plan Mapper), Pruner

Page 6:

Targeted Optimizer Components

[Figure: the query SELECT ... FROM L, O, C, S WHERE ... passes through the Parser, the Join Plan Selector (JPE, PPM, Pruner), and the Execution Engine. The diagram shows a join graph over L, O, C, S, candidate logical plans, and physical plans whose joins are annotated HJ or INLJ. Legends: Borrowed vs. DE Supported components; Join Graph, Logical Plan, Physical Plan as the data passed between components; Fixed vs. Generated.]

Page 7:

Targeted Optimizer Components

[Figure: the same pipeline (Parser, Join Plan Selector with JPE, PPM, Pruner, Execution Engine) for the query SELECT ... FROM L, O, C, S WHERE ..., now showing individual candidate joins over L, O, C, S, each with HJ and INLJ alternatives. Legends: Borrowed vs. DE Supported components; Join Graph, Logical Plan, Physical Plan; Fixed vs. Generated.]

Page 8:

Targeted Optimizer Components

[Figure: the same pipeline (Parser, Join Plan Selector with JPE, PPM, Pruner, Execution Engine) for the query SELECT ... FROM L, O, C, S WHERE ..., showing plans over L, O, C, S that use INLJ joins. Legends: Borrowed vs. DE Supported components; Join Graph, Logical Plan, Physical Plan; Fixed vs. Generated.]

Page 9:

Targeted Optimizer Components

[Figure: the pipeline (Parser, Join Plan Selector, Execution Engine) with a Controller shown alongside the JPE, PPM, and Pruner. Labels in the diagram include Join Graph, Physical Plan, Plan, Property, Order, Cost, and join operators (PNLJ, LJoin, …). Legends: Borrowed vs. DE Supported components; Join Graph, Logical Plan, Physical Plan; Fixed vs. Generated.]

Page 10:

Targeted Optimizer Components

[Figure: the components JPE, PPM, Pruner, and Controller, together with the labels Plan, LJoin, Property, and Order.]

Page 11:

Proposed Development Environment

JPE: JPE Spec, JPE Generator, JPE Benchmark Tools, JPE Profiling Tools
PPM: PPM Spec, PPM Generator, PPM Benchmark Tools, PPM Profiling Tools
Pruner: Pruner Spec, Pruner Generator, Pruner Benchmark Tools, Pruner Profiling Tools
Controller: Controller Spec, Controller Generator, Controller Benchmark Tools, Controller Profiling Tools
Plan (LJoin): Plan Op Spec, Plan Class Generator
Property (Order): Property Spec, Property Annotator Generator

Page 12:

Development Environment Overview

Initial Results: JPEs

Future Work: Other Components

Talk Outline

Page 13:

Initial Results: JPEs

Why not universal (exhaustive) enumeration?

1. Scalability
◦ most products include optimization-level “knob” (low = fast)
◦ DBA Guide [Tow03]: high knob setting for simple queries only
◦ issue of space (not just time)

2. Relies on Brittle Cost Models
◦ Cardinality Errors of Cost Models well-known
◦ DBA Guide [Tow03]: low knob setting can improve plans

3. Lightweight Query Optimizers
◦ Useful for adaptive query optimization, automatic DB design

Therefore, we assume targeted JPEs desirable

Page 14:

Intuition:
◦ good plans usually perform certain joins before others
◦ a good ‘early’ join might …
    produce small intermediate results
    exploit ephemeral properties of inputs (e.g., order)

JPEG-Generated JPEs:
◦ specified by “Join ranking function”
    higher rank = preferable to perform earlier in plan
◦ only enumerate plans that perform joins in rank order

JPEs: Generating

[Figure: example join graph with edges labeled by join rank (1, 2, 3).]
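The rank-order restriction above can be illustrated with a minimal sketch. It assumes, beyond what the slide states, that each join has a fixed rank, that rank 1 is the highest rank (performed first, matching the numbering of the example rankings later in the talk), and that a plan is modeled simply as a sequence of joins. The function name and the example ranks are hypothetical, and this is not the JPEG implementation.

```python
from itertools import groupby, permutations, product


def rank_ordered_join_sequences(joins, rank):
    """Yield every join sequence that performs joins in rank order.

    Joins with equal rank may appear in any relative order; joins with a
    better (lower-numbered) rank always come first.
    """
    ordered = sorted(joins, key=rank)
    # Group joins that share a rank; only the order inside a group is free.
    groups = [list(group) for _, group in groupby(ordered, key=rank)]
    for combo in product(*(permutations(group) for group in groups)):
        yield [join for group in combo for join in group]


# Hypothetical ranks for a chain join graph L - O - C - N.
ranks = {("L", "O"): 2, ("O", "C"): 1, ("C", "N"): 2}
sequences = list(rank_ordered_join_sequences(list(ranks), ranks.get))
for seq in sequences:
    print(seq)
```

With these ranks only 2 of the 3! = 6 possible join sequences are enumerated, since the rank-1 join must come first and only the two rank-2 joins may swap.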

Page 15:

An Example JPEG Enumeration

[Figure: step-by-step enumeration over a join graph on tables L, O, C, N whose edges are labeled with join ranks 1 and 2.]

JPEs: Generating

Page 16:

Some Example Join Rankings

A. Cardinality: 1=m::1, 2=m::n
B. Order: 1=Both, 2=Larger, 3=Smaller, 4=Neither
C. Indexed: 1=Either, 2=Neither
D. Size: 1=At Least One Very Small Table, 2=Everything Else, 3=At Least One Very Large Table
E. Distribution: 1=“Both Join Partitioned/One Replicated”, 2=“Larger Partitioned”, 3=“Smaller Partitioned”, 4=“Neither Partitioned”

JPEs: Generating
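As an illustration only, rankings C (Indexed) and D (Size) might be written as small functions over the two join inputs. The `JoinInput` type, its fields, and the row-count thresholds for "very small" and "very large" are assumptions made for this sketch; the slides do not define them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JoinInput:
    """Hypothetical description of one input to a join."""
    rows: int
    indexed_on_join_key: bool


def indexed_rank(left, right):
    """Ranking C: 1 = either input is indexed, 2 = neither."""
    return 1 if (left.indexed_on_join_key or right.indexed_on_join_key) else 2


def size_rank(left, right, very_small=1_000, very_large=10_000_000):
    """Ranking D: 1 = at least one very small table, 3 = at least one
    very large table, 2 = everything else. Thresholds are assumed values;
    the 'very small' test is applied first when both conditions hold."""
    if min(left.rows, right.rows) <= very_small:
        return 1
    if max(left.rows, right.rows) >= very_large:
        return 3
    return 2


small = JoinInput(rows=500, indexed_on_join_key=False)
medium = JoinInput(rows=100_000, indexed_on_join_key=False)
large = JoinInput(rows=50_000_000, indexed_on_join_key=True)

print(indexed_rank(small, large), size_rank(small, large))
print(indexed_rank(small, medium), size_rank(medium, large))
```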

Page 17:

Results: Replacing the SQL Server JPE

JPEs: Generating

Query         SQL Server                  SQL Server + JPEG
(# Tables)    Chosen Plan (Sec)   DJWUs   Chosen Plan (Sec)   DJWUs
Q2 (5)        13.27               26      13.27               15
Q3 (3)        267.49              4       267.49              4
Q5 (6)        332.94              244     254.28              77
Q7 (6)        883.74              137     205.62              57
Q8 (8)        304.08              507     304.08              204
Q9 (6)        457.52              255     395.88              76
Q10 (4)       329.62              11      329.62              10
Q11 (3)       40.70               4       40.70               4
Q18 (3)       1392.86             4       1392.86             4
Q21 (4)       259.47              11      259.47              8

10 Queries Based on TPC-H, Scale Factor 10

DJWU = “Distinct Join Work Unit”: Measure of Optimizer Work (# of distinct logical joins that are costed)
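To make the DJWU definition concrete, here is a minimal sketch of counting distinct costed logical joins. It assumes, beyond what the slide states, that a logical join is identified by the unordered pair of table sets it combines, so that L ⋈ O and O ⋈ L count once. `DJWUCounter` is a hypothetical helper, not part of SQL Server or JPEG.

```python
class DJWUCounter:
    """Counts the distinct logical joins an optimizer costs (hypothetical)."""

    def __init__(self):
        self._seen = set()

    def record(self, left_tables, right_tables):
        # Assumption: a join is identified by the unordered pair of its
        # input table sets, so commuted inputs are the same logical join.
        key = frozenset((frozenset(left_tables), frozenset(right_tables)))
        self._seen.add(key)

    @property
    def count(self):
        return len(self._seen)


counter = DJWUCounter()
counter.record({"L"}, {"O"})       # costed once
counter.record({"O"}, {"L"})       # same logical join, not counted again
counter.record({"L", "O"}, {"C"})  # a different logical join
print(counter.count)  # prints 2
```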

Page 18:

Results: Replacing the SQL Server JPE

JPEs: Generating

Query         SQL Server                  SQL Server + JPEG
(# Tables)    Chosen Plan (Sec)   DJWUs   Chosen Plan (Sec)   DJWUs   Outcome
Q2 (5)        13.27               26      13.27               15      Same Plan, Less Work
Q3 (3)        267.49              4       267.49              4       Same Plan, Same Work
Q5 (6)        332.94              244     254.28              77      Better Plan, Less Work
Q7 (6)        883.74              137     205.62              57      Better Plan, Less Work
Q8 (8)        304.08              507     304.08              204     Same Plan, Less Work
Q9 (6)        457.52              255     395.88              76      Better Plan, Less Work
Q10 (4)       329.62              11      329.62              10      Same Plan, Less Work
Q11 (3)       40.70               4       40.70               4       Same Plan, Same Work
Q18 (3)       1392.86             4       1392.86             4       Same Plan, Same Work
Q21 (4)       259.47              11      259.47              8       Same Plan, Less Work
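The three outcome groups can be recomputed from the table itself. The sketch below copies the table's numbers and assumes that an equal chosen-plan runtime means the same plan was chosen, which the slide does not state explicitly; the variable and function names are illustrative.

```python
# (query, SQL Server secs, SQL Server DJWUs, +JPEG secs, +JPEG DJWUs),
# copied from the results table.
RESULTS = [
    ("Q2", 13.27, 26, 13.27, 15),
    ("Q3", 267.49, 4, 267.49, 4),
    ("Q5", 332.94, 244, 254.28, 77),
    ("Q7", 883.74, 137, 205.62, 57),
    ("Q8", 304.08, 507, 304.08, 204),
    ("Q9", 457.52, 255, 395.88, 76),
    ("Q10", 329.62, 11, 329.62, 10),
    ("Q11", 40.70, 4, 40.70, 4),
    ("Q18", 1392.86, 4, 1392.86, 4),
    ("Q21", 259.47, 11, 259.47, 8),
]


def outcome(base_secs, base_work, jpeg_secs, jpeg_work):
    # Assumption: equal runtime means the same plan was chosen. The table
    # contains no regressions, so only these labels are needed.
    plan = "Better Plan" if jpeg_secs < base_secs else "Same Plan"
    work = "Less Work" if jpeg_work < base_work else "Same Work"
    return f"{plan}, {work}"


groups = {}
for query, base_secs, base_work, jpeg_secs, jpeg_work in RESULTS:
    label = outcome(base_secs, base_work, jpeg_secs, jpeg_work)
    groups.setdefault(label, []).append(query)

total_base_work = sum(row[2] for row in RESULTS)
total_jpeg_work = sum(row[4] for row in RESULTS)

for label, queries in groups.items():
    print(label, queries)
print(total_base_work, total_jpeg_work)  # prints 1203 459
```

Summed over the ten queries, the JPEG-based enumerator costs 459 DJWUs against 1203 for the original, roughly 62% less optimizer work.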

Page 19:

How to benchmark JPE independently of cost model?
◦ Best Runtime of Enumerated Plans
◦ Worst Runtime of Enumerated Plans
◦ Average Runtime of Enumerated Plans
◦ Number of Enumerated Plans

Issues:
◦ Runtimes depend on execution engine
◦ How to reconcile # of Plans vs Quality of Plans

JPEs: Benchmarking
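The four metrics above are simple aggregates over the measured runtimes of every plan a JPE enumerates. A minimal sketch, with a hypothetical function name and made-up runtimes:

```python
from statistics import mean


def jpe_benchmark(runtimes_sec):
    """Summarize the measured runtimes of all plans enumerated by a JPE."""
    if not runtimes_sec:
        raise ValueError("the JPE enumerated no plans")
    return {
        "best": min(runtimes_sec),
        "worst": max(runtimes_sec),
        "average": mean(runtimes_sec),
        "num_plans": len(runtimes_sec),
    }


# Hypothetical runtimes (seconds) of three enumerated plans.
metrics = jpe_benchmark([2.0, 4.0, 6.0])
print(metrics)
```

None of these metrics consults a cost model, but all three runtime metrics inherit the execution-engine dependence noted under Issues.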

Page 20:

Primary JPE Tuning Strategies:

◦ Merging/Splitting Join Ranks: generates more/fewer enumerated plans

◦ Reordering Join Ranks: generates different enumerated plans

Profiling Tools to Guide Above:

◦ Rank Prevalence: What % of joins have rank i?

◦ Merge Impact: What is % increase in enumerated plans resulting from merging ranks i and j?

◦ Reorder Impact: What is effect on benchmark metrics from swapping ranks i and j?

JPEs: Profiling Tools
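Rank Prevalence and Merge Impact can be sketched under a deliberately simplified model that the slides do not specify: each join has a fixed rank, and the number of enumerated plans equals the number of join sequences that respect rank order, i.e. the product of the factorials of the rank-group sizes. All names and the example ranks are hypothetical.

```python
from collections import Counter
from math import factorial, prod


def rank_prevalence(ranks):
    """Percentage of joins that have each rank."""
    counts = Counter(ranks)
    return {rank: 100.0 * n / len(ranks) for rank, n in counts.items()}


def num_rank_ordered_sequences(ranks):
    # Simplifying assumption: joins of equal rank may be performed in any
    # relative order, so each rank group contributes (group size)! orders.
    return prod(factorial(n) for n in Counter(ranks).values())


def merge_impact(ranks, i, j):
    """% increase in enumerated sequences after merging rank j into rank i."""
    merged = [i if rank == j else rank for rank in ranks]
    before = num_rank_ordered_sequences(ranks)
    after = num_rank_ordered_sequences(merged)
    return 100.0 * (after - before) / before


join_ranks = [1, 2, 2, 3]  # hypothetical ranks of four joins
print(rank_prevalence(join_ranks))
print(merge_impact(join_ranks, 2, 3))
```

In this example merging ranks 2 and 3 raises the count from 2 to 6 sequences, consistent with the slide's point that merging ranks generates more enumerated plans.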

Page 21:

Development Environment Overview

Initial Results: JPEs

Future Work: Other Components

Talk Outline

Page 22:

Possible Specification:
◦ >: Plan × Plan → Boolean
◦ Ps: Set of “interesting” properties

Possible Benchmarks:
◦ >-quality: % of times predicted better plan is actually better
◦ P-quality: % of plans identified as best that would have been pruned with property P identified as interesting

Pruner Tools: Early Thoughts
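One possible reading of >-quality, sketched under assumptions: the comparator `better(p, q)` is the pruner's `>` relation, actual runtimes are available for every plan, and the percentage is taken over all plan pairs on which the comparator expresses a preference. The function name, the plan names, and the numbers are hypothetical.

```python
from itertools import combinations


def comparator_quality(plans, better, runtime):
    """% of compared plan pairs where the predicted better plan is
    actually faster at runtime."""
    correct = total = 0
    for p, q in combinations(plans, 2):
        if better(p, q):
            winner, loser = p, q
        elif better(q, p):
            winner, loser = q, p
        else:
            continue  # the comparator expresses no preference
        total += 1
        correct += runtime(winner) < runtime(loser)
    return 100.0 * correct / total if total else 0.0


# Hypothetical plans: (estimated cost, measured runtime in seconds).
PLANS = {"A": (10, 5.0), "B": (20, 8.0), "C": (30, 6.0)}
quality = comparator_quality(
    list(PLANS),
    better=lambda p, q: PLANS[p][0] < PLANS[q][0],
    runtime=lambda p: PLANS[p][1],
)
print(round(quality, 2))
```

Here the comparator is right about A vs. B and A vs. C but wrong about B vs. C, so two of three predictions hold.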

Page 23:

Possible Specification:
◦ Top-Down vs Bottom-Up Traversal
◦ Handoff Policy: conditions for transferring control from JPE to PPM/Pruner

Possible Benchmarks (C_eager vs C_lazy):
◦ Eager-Benefit: % of subplans pruned earlier by C_eager than by C_lazy (assuming pruned by both)
◦ Eager-Penalty: % of subplans pruned by C_eager but not by C_lazy

Controller Tools: Early Thoughts
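The two controller benchmarks can be sketched from pruning logs. The sketch assumes each controller reports, per pruned subplan, the step at which it was pruned, and it picks denominators the slide leaves open: Eager-Benefit is taken over subplans pruned by both controllers, Eager-Penalty over all subplans pruned by the eager one. Names and data are hypothetical.

```python
def eager_metrics(eager_pruned, lazy_pruned):
    """Compare two controllers from {subplan: step at which it was pruned}."""
    both = eager_pruned.keys() & lazy_pruned.keys()
    earlier = sum(1 for s in both if eager_pruned[s] < lazy_pruned[s])
    only_eager = eager_pruned.keys() - lazy_pruned.keys()
    # Assumed denominators: see the lead-in above.
    benefit = 100.0 * earlier / len(both) if both else 0.0
    penalty = 100.0 * len(only_eager) / len(eager_pruned) if eager_pruned else 0.0
    return benefit, penalty


# Hypothetical pruning logs for subplans named by the tables they join.
eager = {"LO": 1, "OC": 2, "CS": 3, "LS": 1}
lazy = {"LO": 4, "OC": 2, "CS": 5}
benefit, penalty = eager_metrics(eager, lazy)
print(round(benefit, 2), penalty)
```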

Page 24:

Early-Stage Project:
Development Environment for Join Plan Selection Components of Query Optimizers

Will Support Per Component:
◦ Rapid Prototyping: Declarative Specification, Generator Tool
◦ Evaluation: Component Benchmarks
◦ Refinement: Profiling Tools

Early Results for JPEs:
◦ JPEG: Enumerate According to “Join Ranking Functions”
◦ Used to build JPEs for SQL Server, Vertica

Conclusions

Page 25:

Questions?