Date post: 19-Jan-2016
Sort in GPDB Feng Tian GreenPlum Inc.
Transcript
Page 1: Sort in GPDB Feng Tian GreenPlum Inc.. WARNING: NON-TECH SLIDES Why (NOW)?  Real customers, real problems.  About to get the code in MAIN Make Joy/Brian's.

Sort in GPDB

Feng Tian
GreenPlum Inc.

Page 2:

WARNING: NON-TECH SLIDES

Why (NOW)?
  Real customers, real problems.
  About to get the code in MAIN
  Make Joy/Brian's code reading experience easier.

Page 3:

Outline (Doesn't this look familiar?)

Motivation
Review of Current Status
Improve Sort Performance
Remaining Work

Page 4:

Sort in Database

One of the most important operators
  Order by
  Group by
  OLAP
    Rollup and Cube
    Window (partition by and order by)
  Merge Join
  Build index

Page 5:

Sort in GPDB

One of the most mysterious operators
  Sort is slow vs. Sort is OK
  Fix planner to avoid sort vs. Fix sort

Page 6:

Sort is fun

One of the most extensively studied algorithms
In-memory sorting algorithms
  CK always got some interesting links
  Jie challenged my interview question
  Sedgewick: Quicksort is optimal
  Bentley & McIlroy, 93.
External sort
  TAOCP

Page 7:

GPDB Sort is funny

Good
  Honest TAOCP
  Honest BM93.
Bad
  Equal keys
  Lots of columns
  Sort strings
Ugly
  Combination of the bads

Page 8:

Goal

Get rid of the ugly part of GPDB Sort.

Page 9:

Outline

Motivation
Review of Current Status
Improve Sort Performance
Remaining Work

Page 10:

GPDB Sort

Quicksort if entries fit in memory
External sort
  An honest implementation from TAOCP
  I/O pattern is pretty good
  Amount of I/O when sorting tuples is OK
    No compression
  Sorting datum is terrible, but not a concern at this moment
    Only used for distinct
    May eventually be replaced by hash
  Use Heap to merge

Page 11:

GPDB Sort

Details
  Cost of comparison
    Non-trivial overhead (Unicode)
    String compare is extremely slow
      strcoll vs. strxfrm + strcmp
  Cost of memtuple_getattr
    It is way better than heap_getattr
    Postgres devs have known this for a long time
    Cache first sort column
      Sort (1, 'a'), (2, 'a'), (3, 'c') ... is fast.
      Sort (1, 'a'), (1, 'b'), (1, 'c') ... is miserably slow.
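
The strcoll vs. strxfrm + strcmp trade-off can be sketched in C as below. This is only an illustration, not the GPDB code: make_sort_key, compare_slow and compare_cached are hypothetical names. The idea is to pay the locale transformation once per string, then compare the cached keys with plain strcmp.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd strxfrm image of s, or NULL on
 * allocation failure. The caller frees the result. */
char *make_sort_key(const char *s)
{
    assert(s != NULL);
    /* With a size of 0, strxfrm only reports the length it needs
     * (excluding the terminating NUL). */
    size_t need = strxfrm(NULL, s, 0) + 1;
    char *key = malloc(need);
    if (key != NULL)
        strxfrm(key, s, need);
    return key;
}

/* Locale-aware comparison done the slow way: collation work on every call. */
int compare_slow(const char *a, const char *b)
{
    return strcoll(a, b);
}

/* Same ordering, using keys that were transformed once up front. */
int compare_cached(const char *key_a, const char *key_b)
{
    return strcmp(key_a, key_b);
}
```

A sort compares each string many times, so the one-time strxfrm cost is spread over all of those comparisons. The price is the extra memory held by the transformed keys.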

Page 12:

Outline

Motivation
Review of Current Status
Improve Sort Performance
Remaining Work

Page 13:

Goal

It should be “invisible”
  No API change
  Keep fast cases fast
Slow cases? What slow cases?
  Planner can honestly optimize a query, without worrying about “avoiding” sort
  User can write a query, without trying to be creative
  In the cases where a sort cannot be avoided, may save our neck.

Page 14:

Quicksort Is Optimal (Sedgewick)

Equal keys
  Equal keys are good (Bentley & McIlroy)
Do not special-case small n
  Why? Not sure. Cache oblivious?
Multi-column sort keys
  Comparisons get slower and slower

Page 15:

Quicksort

As in the old algorithm, cache first sort column
Quicksort on first column
For the range with equal first column, cache the second sort column, quicksort the range
Until all sort columns are processed
May stop early.
  Sort (1, 'a'), (2, 'b'), (3, 'c') will not compare strings at all.
  Sort (1, 'a'), (1, 'b'), (1, 'c') will only call memtuple_getattr when necessary.

Page 16:

Example

(1, ?), (3, ?), (2, ?), (0, ?), (3, ?), (2, ?)
Choose Pivot (2, ?)
  (2, ?), (1, ?), (0, ?) :: (3, ?), (3, ?), (2, ?)
Swap to middle
  (0, ?), (1, ?) :: (2, ?), (2, ?) :: (3, ?), (3, ?)

Page 17:

Recurse Down

Quicksort each partition
  For left, right, just quicksort.
  For the middle part, expand to level k+1
    (2, ?), (2, ?) ... (2, ?) to (2, 'a'), (2, 'x'), (2, 'd') ... (2, 'z')
    Of course, only if the middle has not expanded all levels
NO EXTRA LEVEL EXPANSION
NO EXTRA COMPARISON
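
The multi-key quicksort described on pages 15 to 17 can be sketched as follows. This is a minimal illustration, not the code in mk_qsort.c: it assumes rows of NCOLS integer columns already in memory, and Row, NCOLS and mk_quicksort are made-up names. It uses a plain three-way partition instead of the Bentley & McIlroy swap-to-middle variant, and it does not model caching a column with memtuple_getattr. It only shows that the next column is examined for the equal-key middle range and nowhere else.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 3   /* assumption: every row has three integer sort columns */

typedef struct { int col[NCOLS]; } Row;

static void swap_rows(Row *a, Row *b)
{
    Row t = *a; *a = *b; *b = t;
}

/* Sort rows[lo..hi) on columns level, level+1, ..., NCOLS-1. */
void mk_quicksort(Row *rows, size_t lo, size_t hi, int level)
{
    assert(lo <= hi);
    if (hi - lo < 2 || level >= NCOLS)
        return;

    int pivot = rows[lo + (hi - lo) / 2].col[level];

    /* Three-way partition on the current column:
     * [lo,lt) < pivot, [lt,i) == pivot, [gt,hi) > pivot. */
    size_t lt = lo, i = lo, gt = hi;
    while (i < gt) {
        int v = rows[i].col[level];
        if (v < pivot)
            swap_rows(&rows[lt++], &rows[i++]);
        else if (v > pivot)
            swap_rows(&rows[i], &rows[--gt]);
        else
            i++;
    }

    mk_quicksort(rows, lo, lt, level);      /* left: keep sorting this column */
    mk_quicksort(rows, lt, gt, level + 1);  /* middle: equal here, go one column deeper */
    mk_quicksort(rows, gt, hi, level);      /* right: keep sorting this column */
}
```

If every first-column value is distinct, the middle range always has one row and column 1 is never read, which is the "may stop early" case above.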

Page 18:

Heapsort

Used in external sort (both to produce runs and to merge runs)
Cache first sort column when inserting into heap
Expand to the (n+1)th sort column only when the first n columns equal those of the heap top
  Remember the lv of expansion
  Maintain an array of datum d, entry.sort_column[x] = d[x] if x < lv
Siftup and Siftdown
  Siftdown hole

Page 19:

HeapSort Continued

NO EXTRA EXPANSION
NO EXTRA COMPARISON
However, code became more complicated.
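
The lazy key expansion used by the heap can be sketched as below. This is a simplified illustration and not the code in mk_heap.c: each entry keeps its own cache of fetched columns instead of sharing the equal prefix through the datum array d, and the names Entry, fetch_column, fetch_count, entry_init, entry_compare, heap_build and heap_pop are all hypothetical. fetch_column stands in for an expensive memtuple_getattr call, and tuples are assumed to be arrays of NCOLS integers.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 3   /* assumption: three integer sort columns per tuple */

typedef struct {
    const int *tuple;     /* stand-in for the stored tuple */
    int cached[NCOLS];    /* sort columns fetched so far */
    int lv;               /* number of columns already cached */
} Entry;

long fetch_count = 0;     /* counts simulated attribute fetches */

static int fetch_column(const int *tuple, int col)
{
    fetch_count++;
    return tuple[col];
}

/* Return column col of e, fetching and caching it only if needed. */
static int entry_column(Entry *e, int col)
{
    while (e->lv <= col) {
        e->cached[e->lv] = fetch_column(e->tuple, e->lv);
        e->lv++;
    }
    return e->cached[col];
}

void entry_init(Entry *e, const int *tuple)
{
    e->tuple = tuple;
    e->lv = 0;
    (void) entry_column(e, 0);   /* cache the first sort column eagerly */
}

/* Compare column by column; a later column is fetched only on a tie. */
int entry_compare(Entry *a, Entry *b)
{
    for (int col = 0; col < NCOLS; col++) {
        int x = entry_column(a, col), y = entry_column(b, col);
        if (x != y)
            return (x > y) - (x < y);
    }
    return 0;
}

static void sift_down(Entry *h, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < n && entry_compare(&h[l], &h[m]) < 0) m = l;
        if (r < n && entry_compare(&h[r], &h[m]) < 0) m = r;
        if (m == i) return;
        Entry t = h[i]; h[i] = h[m]; h[m] = t;
        i = m;
    }
}

/* Turn h[0..n) into a min-heap. */
void heap_build(Entry *h, size_t n)
{
    for (size_t i = n / 2; i-- > 0; )
        sift_down(h, n, i);
}

/* Remove and return the smallest entry; requires *n > 0. */
Entry heap_pop(Entry *h, size_t *n)
{
    assert(*n > 0);
    Entry top = h[0];
    h[0] = h[--*n];
    sift_down(h, *n, 0);
    return top;
}
```

When all first-column values differ, fetch_count ends up equal to the number of entries, because no comparison ever needs a second column.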

Page 20:

Handling String

When caching a sort column, cache strxfrm
  Comparison uses strcmp
Equal String
  Collapse equal strings
    Compare pointer value first
    Save memory
Problems
  Memory consumption
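
Collapsing equal strings so that a pointer check can short-circuit strcmp might look like the sketch below. It is an assumption-laden illustration, not the GPDB implementation: StringPool, pool_intern, compare_interned and pool_free are hypothetical names, the pool has a small fixed capacity, and the linear lookup is there only to keep the example short (a real pool would more likely use a hash table). The strings handed to the pool could be the cached strxfrm images.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POOL_MAX 64   /* assumption: tiny fixed-size pool, enough for a demo */

typedef struct {
    char  *items[POOL_MAX];
    size_t count;
} StringPool;

/* Return the pooled copy of s, adding it if absent. Returns NULL if the
 * pool is full or allocation fails. Equal strings get the same pointer. */
const char *pool_intern(StringPool *pool, const char *s)
{
    assert(pool != NULL && s != NULL);
    for (size_t i = 0; i < pool->count; i++)
        if (strcmp(pool->items[i], s) == 0)
            return pool->items[i];
    if (pool->count == POOL_MAX)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    pool->items[pool->count++] = copy;
    return copy;
}

/* Compare two pooled strings: identical pointers mean equal strings,
 * so strcmp runs only when the pointers differ. */
int compare_interned(const char *a, const char *b)
{
    if (a == b)
        return 0;
    return strcmp(a, b);
}

void pool_free(StringPool *pool)
{
    for (size_t i = 0; i < pool->count; i++)
        free(pool->items[i]);
    pool->count = 0;
}
```

Each distinct string is stored once, which is where the memory saving comes from; the pool itself still has to live for the duration of the sort.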

Page 21:

Minor improvements

Fast path some basic types
  Int, maybe float later
Limit Sort: Use heapsort instead of insertion sort
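
A limit sort driven by a heap can be sketched as follows. This is an illustration under assumptions, not the GPDB code: it sorts plain integers, and limit_sort and sift_down_max are hypothetical names. A max-heap of at most k values holds the k smallest inputs seen so far, so each input costs O(log k) instead of the O(k) shifting an insertion sort would need.

```c
#include <assert.h>
#include <stddef.h>

/* Restore the max-heap property below index i in h[0..n). */
static void sift_down_max(int *h, size_t n, size_t i)
{
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, big = i;
        if (l < n && h[l] > h[big]) big = l;
        if (r < n && h[r] > h[big]) big = r;
        if (big == i) return;
        int t = h[i]; h[i] = h[big]; h[big] = t;
        i = big;
    }
}

/* Write the k smallest values of in[0..n) into out in ascending order.
 * out must have room for k values. Returns how many were written,
 * which is the smaller of k and n. */
size_t limit_sort(const int *in, size_t n, int *out, size_t k)
{
    assert(in != NULL && out != NULL);
    size_t m = 0;
    if (k == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (m < k) {
            /* Heap not full yet: append and sift up. */
            size_t c = m++;
            out[c] = in[i];
            while (c > 0 && out[(c - 1) / 2] < out[c]) {
                size_t parent = (c - 1) / 2;
                int t = out[parent]; out[parent] = out[c]; out[c] = t;
                c = parent;
            }
        } else if (in[i] < out[0]) {
            /* Smaller than the current k-th smallest: replace the max. */
            out[0] = in[i];
            sift_down_max(out, m, 0);
        }
    }

    /* Heapsort step: move the max to the end until the array is ascending. */
    for (size_t end = m; end > 1; end--) {
        int t = out[0]; out[0] = out[end - 1]; out[end - 1] = t;
        sift_down_max(out, end - 1, 0);
    }
    return m;
}
```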

Page 22:

Outline

Motivation
Review of Current Status
Improve Sort Performance
Remaining Work

Page 23:

“Honest” Implementation

Cutting corners in a performance prototype is dangerous
  Error handling
  Special cases
Relatively honest
  Does not handle unique check etc.
  Passes make installcheck-good.
  Passes TPCH and opperf if hashagg and hashjoin are turned off

Page 24:

TPCH 1G Q1

Hashagg ~5.7 sec
Old sort ~15 sec
New sort ~8 sec
  Aggregate computing takes ~4 sec
  Hashagg proper ~1.5 sec
  New sort generated 3 runs, motioned 6M tuples, and did one more comparison in Agg in less than 4 sec.
    The extra comparison takes more than 1 sec
    Sort proper is ~2 sec

Page 25:

Building index

On ship_instruction, ship_mode, comment
  Old: All take 24 to 26 sec
  New: 4 sec, 6 sec, 11 sec
On two columns
  Old: 70+ sec
  New: 16?

Page 26:

OLAP (Cube and Rollup)

For “Big” OLAP CUBE/ROLLUP queries, 10~15% faster
Not much on “smaller” ones, some may even see some small regression
  Unstable timing, regression comes and goes
  Our OLAP plans have many sorts, on 1 or 2 integer columns, so this is expected
However, we can finish some “machine freezing” queries now

Page 27:

Yahoo Hashagg

Slightly slower
  Heapsort overhead :-(
  On par once I fastpath-ed int4cmp

Page 28:

Outline

Motivation
Review of Current Status
Improve Sort Performance
Remaining Work

Page 29:

More Improvements

We know the level of key change
  Important for sort agg
  Important for OLAP
  Important for merge join
Take (more) advantage of unique, limit, aggregate.
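
Why the level of key change matters to a consumer such as sort agg or rollup can be sketched as below. This is a hypothetical illustration: Row, NCOLS, key_change_level and count_groups are made-up names, and the level is recomputed here by comparing adjacent rows, whereas the point on this page is that a multi-key sort could hand that level over without the extra comparison.

```c
#include <assert.h>
#include <stddef.h>

#define NCOLS 2   /* assumption: two integer grouping columns */

typedef struct { int col[NCOLS]; } Row;

/* Index of the first column where prev and cur differ, or NCOLS if the
 * two rows have identical keys. */
int key_change_level(const Row *prev, const Row *cur)
{
    for (int c = 0; c < NCOLS; c++)
        if (prev->col[c] != cur->col[c])
            return c;
    return NCOLS;
}

/* For rows already sorted on all columns, set groups[p] to the number of
 * distinct key prefixes of length p + 1, as a rollup would need. */
void count_groups(const Row *rows, size_t n, size_t groups[NCOLS])
{
    assert(rows != NULL || n == 0);
    for (int p = 0; p < NCOLS; p++)
        groups[p] = 0;

    for (size_t i = 0; i < n; i++) {
        /* The first row opens a group at every level. */
        int lv = (i == 0) ? 0 : key_change_level(&rows[i - 1], &rows[i]);
        /* A change at column lv starts a new group for every prefix
         * that includes column lv. */
        for (int p = lv; p < NCOLS; p++)
            groups[p]++;
    }
}
```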

Page 30:

Improve the code

Heap code (maybe) is (more) complicated (than necessary), don't know how to improve yet.
Memory management.
Explain analyze accounting and reporting.

Page 31:

Code Review

Code is at ftian_main_cr2 branch
  tuplesort.c
    Should make it tuplesortnew.c, and probably GUC it.
    Uses memtuple and logtape as before.
    Uses new quicksort and heapsort.
  mk_qsort.c
    Multi-key quicksort. Straightforward.
  mk_heap.c
    Multi-key heapsort. 700 lines of heapsort :-(
About time to port into MAIN.

Page 32:

Feedback (Thanks!)

Welcome ideas, new improvements and critique of the approach.

