Date post: 07-Jun-2018
Analyzing and Optimizing Linux Kernel for PostgreSQL Sangwook Kim PGConf.Asia 2017
Page 1: Analyzing and Optimizing Linux Kernel for PostgreSQL · Analyzing and Optimizing Linux Kernel for PostgreSQL ... MongoDB Library PostgreSQL ... Tuning PostgreSQL for High Write Workloads

Analyzing and Optimizing Linux Kernel for PostgreSQL

Sangwook Kim PGConf.Asia 2017


Sangwook Kim
•  Co-founder and CEO @ Apposha
•  Ph.D. in Computer Science
•  Cloud/Virtualization
   -  SMP scheduling [ASPLOS’13, VEE’14]
   -  Group-based memory management [JPDC’14]
•  Database/Storage
   -  Non-volatile cache management [USENIX ATC’15, APSys’16]
   -  Request-centric I/O prioritization [FAST’17, HotStorage’17]


What We Do

•  Fully-managed & performance-optimized for MySQL and PostgreSQL
•  Fully-managed & performance-optimized for MongoDB
•  No special H/W support
•  No code changes for DB engine


Contents

•  Background tasks

•  Full page writes

•  Parallel query


PostgreSQL Architecture

[Figure: a client exchanges requests and responses with PostgreSQL, whose tasks (T1 to T4) issue I/O through the operating system to the storage device. Label: database performance.]

* PostgreSQL’s processes
-  Backend (foreground)
-  Checkpointer
-  Autovacuum workers
-  WAL writer
-  Writer
-  …


PostgreSQL Checkpoint

Figure source: http://www.interdb.jp/pg/pgsql09.html


Impact of Checkpointer

[Figure: max transaction latency (ms, 0 to 2500) vs. elapsed time (sec, 0 to 3000) for the Default configuration; annotation: 1.1 sec.]

•  Dell PowerEdge R730
   -  32 cores
   -  132GB DRAM
   -  1 SAS SSD
•  PostgreSQL v9.5.6
   -  52GB shared_buffers
   -  10GB max_wal_size
•  TPC-C workload
   -  50GB dataset
   -  1 hour run
   -  50 clients


PostgreSQL Autovacuum

[Figure: autovacuum and the Free Space Map.]

Figure source: http://bstar36.tistory.com/308


Impact of Autovacuum Workers

[Figure: transaction throughput (trx/sec, 0 to 9000) vs. # of clients (1 to 300) for Default and Aggressive AV; annotation: 40%.]

Aggressive AV settings (default => aggressive):
-  autovacuum_max_workers 3 => 6
-  autovacuum_naptime 1min => 15s
-  autovacuum_vacuum_threshold 50 => 25
-  autovacuum_analyze_threshold 50 => 10
-  autovacuum_vacuum_scale_factor 0.2 => 0.1
-  autovacuum_analyze_scale_factor 0.1 => 0.05
-  autovacuum_vacuum_cost_delay 20ms => -1
-  autovacuum_vacuum_cost_limit -1 => 1000
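
The "Aggressive AV" values above could be applied with `ALTER SYSTEM`. The following is a minimal sketch that only renders the statements; the dictionary name and the helper `alter_system_statements` are hypothetical, and the slides do not say how the settings were actually applied.

```python
# Aggressive autovacuum (AV) values copied from the slide; defaults in comments.
AGGRESSIVE_AV = {
    "autovacuum_max_workers": "6",              # default 3
    "autovacuum_naptime": "15s",                # default 1min
    "autovacuum_vacuum_threshold": "25",        # default 50
    "autovacuum_analyze_threshold": "10",       # default 50
    "autovacuum_vacuum_scale_factor": "0.1",    # default 0.2
    "autovacuum_analyze_scale_factor": "0.05",  # default 0.1
    "autovacuum_vacuum_cost_delay": "-1",       # default 20ms
    "autovacuum_vacuum_cost_limit": "1000",     # default -1
}


def alter_system_statements(settings):
    """Render one ALTER SYSTEM SET statement per setting (hypothetical helper)."""
    return [f"ALTER SYSTEM SET {name} = '{value}';" for name, value in settings.items()]


if __name__ == "__main__":
    for statement in alter_system_statements(AGGRESSIVE_AV):
        print(statement)
```

The printed statements would need to be run by a superuser. In the PostgreSQL versions used in this talk, `autovacuum_max_workers` only takes effect after a server restart, while the other settings can be picked up by a configuration reload.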


Impact of Autovacuum Workers

[Figure: max transaction latency (ms, 0 to 2500) vs. elapsed time (sec, 0 to 3000) for Default and Aggressive AV; annotation: 2.5 sec.]


Why Background Tasks Matter

• Multiple independent layers

[Figure: layered I/O stack separated by abstraction: Application (read()/write() on the Buffer Cache), Caching Layer, File System Layer, Block Layer, Storage Device. Foreground (FG) and background (BG) requests are mixed at each layer, and the layers independently apply admission control and reordering.]


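Why Background Tasks Matter

•  I/O priority inversion

[Figure: the I/O stack (Application, Caching Layer, File System Layer, Block Layer, Storage Device). At several layers a foreground (FG) task waits on a lock or a condition variable (including a user-level one) until a background (BG) task, which is itself waiting on I/O or on another BG task, wakes it.]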


Request-Centric I/O Prioritization

• Solution v1 (reactive)
   -  Request-aware I/O processing in the I/O path [FAST’17]
   -  I/O priority inheritance [USENIX ATC’15, FAST’17]
      -  Locks
      -  Condition variables

[Figure: I/O priority inheritance. Lock case: an FG task blocks on a lock held by a BG task; the BG task inherits the FG priority until its I/O is submitted and completes. Condition variable (CV) case: a BG task registers with the CV, inherits priority from the waiting FG task, and wakes the FG task after its I/O is submitted and completes.]

[USENIX ATC’15] Request-Oriented Durable Write Caching for Application Performance
[FAST’17] Enlightening the I/O Path: A Holistic Approach for Application Performance
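
Priority inheritance on a lock can be illustrated with a toy model. This is a sketch under simplifying assumptions (two priority levels, an owner that holds a single lock, no real blocking); the names `Task` and `InheritingLock` are hypothetical and do not come from the cited papers or the Linux kernel.

```python
from dataclasses import dataclass, field

FG, BG = 0, 1  # lower value means higher I/O priority


@dataclass
class Task:
    name: str
    base_prio: int
    inherited: list = field(default_factory=list)

    @property
    def effective_prio(self):
        # A task runs at the best priority among its own and the inherited ones.
        return min([self.base_prio, *self.inherited])


class InheritingLock:
    def __init__(self):
        self.owner = None
        self.waiters = []

    def acquire(self, task):
        """Return True if the lock was taken, False if the task must wait."""
        if self.owner is None:
            self.owner = task
            return True
        # Contended: the owner inherits the waiter's priority.
        self.waiters.append(task)
        self.owner.inherited.append(task.effective_prio)
        return False

    def release(self):
        """Drop inherited priority and hand the lock to the oldest waiter."""
        self.owner.inherited.clear()  # assumes the owner holds only this lock
        self.owner = self.waiters.pop(0) if self.waiters else None
        if self.owner is not None:
            # Remaining waiters now boost the new owner.
            self.owner.inherited.extend(w.effective_prio for w in self.waiters)
        return self.owner


if __name__ == "__main__":
    lock = InheritingLock()
    checkpointer = Task("checkpointer", BG)
    backend = Task("backend", FG)
    lock.acquire(checkpointer)
    lock.acquire(backend)               # backend waits; checkpointer is boosted
    print(checkpointer.effective_prio)  # FG level while the backend waits
    lock.release()
    print(checkpointer.effective_prio)  # back to BG level
```

While the backend waits, I/O issued by the boosted checkpointer would be treated as foreground I/O, which is the effect the reactive solution aims for.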


Request-Centric I/O Prioritization

• Solution v1 (problem)

[Figure: kernel source files touched across the stack (Application, Caching Layer, File System Layer, Block Layer), plus synchronization and interface code.]

Synchronization
-  linux/include/linux/mutex.h
-  linux/include/linux/pagemap.h
-  linux/include/linux/rtmutex.h
-  linux/include/linux/rwsem.h
-  linux/include/uapi/linux/sem.h
-  linux/include/linux/wait.h
-  linux/kernel/sched/wait.c
-  linux/kernel/locking/rwsem.c
-  linux/kernel/futex.c
-  linux/kernel/locking/mutex.h
-  linux/kernel/locking/mutex.c
-  linux/kernel/locking/rtmutex.c
-  linux/kernel/locking/rwsem-xadd.c

Interface
-  linux/kernel/fork.c
-  linux/kernel/sys.c
-  linux/kernel/sysctl.c
-  linux/include/linux/sched.h

Block Layer
-  linux/include/linux/blk_types.h
-  linux/include/linux/blkdev.h
-  linux/include/linux/buffer_head.h
-  linux/include/linux/caq.h
-  linux/block/blk-core.c
-  linux/block/blk-flush.c
-  linux/block/blk-lib.c
-  linux/block/blk-mq.c
-  linux/block/caq-iosched.c
-  linux/block/cfq-iosched.c
-  linux/block/elevator.c

File System Layer
-  linux/fs/buffer.c
-  linux/fs/ext4/extents.c
-  linux/fs/ext4/inode.c
-  linux/include/linux/jbd2.h
-  linux/fs/jbd2/commit.c
-  linux/fs/jbd2/journal.c
-  linux/fs/jbd2/transaction.c

Caching Layer
-  linux/include/linux/writeback.h
-  linux/mm/page-writeback.c
-  linux/include/linux/mm_types.h
-  linux/fs/buffer.c


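Request-Centric I/O Prioritization

• Solution v2 (proactive)

[Figure: Linux I/O stack showing VFS, the Page Cache, the file systems (Ext4, XFS, F2FS, etc.), the Block Layer with its I/O schedulers (Noop, CFQ, Deadline), and the Device Driver, with two additions: the Apposha Front-End File System and the Apposha I/O Scheduler.]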

-  Priority-aware I/O scheduling

-  Control device-level congestion

-  Control writes at the upper layer

-  Tag low-priority for background I/Os
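
The four points above can be sketched as a toy dispatch queue: requests are tagged FG or BG at submission, FG requests are dispatched first, and the number of BG requests in flight at the device is capped. This is only an illustration; the class `PriorityIOQueue` and the parameter `max_bg_inflight` are hypothetical and do not describe the Apposha scheduler's actual design.

```python
import heapq
import itertools

FG, BG = 0, 1  # lower value is dispatched first


class PriorityIOQueue:
    def __init__(self, max_bg_inflight=1):
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within one priority
        self.max_bg_inflight = max_bg_inflight
        self.bg_inflight = 0

    def submit(self, prio, request):
        """Queue a request tagged with its priority (FG or BG)."""
        heapq.heappush(self._heap, (prio, next(self._seq), request))

    def dispatch(self):
        """Return the next request for the device, or None if nothing may go."""
        if not self._heap:
            return None
        prio, _, request = self._heap[0]
        if prio == BG and self.bg_inflight >= self.max_bg_inflight:
            return None  # throttle BG I/O to limit device-level congestion
        heapq.heappop(self._heap)
        if prio == BG:
            self.bg_inflight += 1
        return request

    def complete_bg(self):
        """Account for a finished BG request so another one may be dispatched."""
        self.bg_inflight -= 1


if __name__ == "__main__":
    q = PriorityIOQueue(max_bg_inflight=1)
    q.submit(BG, "checkpoint-write-1")
    q.submit(FG, "commit-wal-write")
    q.submit(BG, "checkpoint-write-2")
    print(q.dispatch())  # the FG request goes first
    print(q.dispatch())  # then one BG request
    print(q.dispatch())  # None: the BG cap is reached
```

Capping BG requests in flight keeps the device queue short, so a later FG request does not sit behind a long run of background writes.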


Request-Centric I/O Prioritization

• V12 Engine

[Figure: V12 Engine components. User-level libraries: MongoDB Library (V12-M) and PostgreSQL Library (V12-P). Kernel modules: Front-End File System and I/O Scheduler.]

User-level library
-  Classify task priority, optimize WAL accesses


Request-Centric I/O Prioritization

[Figure: max transaction latency (ms, 0 to 2500) vs. elapsed time (sec, 0 to 3000) for Default, Aggressive AV, and V12-P; annotation: 0.15 sec.]

V12-P: V12 Engine for PostgreSQL


Request-Centric I/O Prioritization

[Figure: max transaction latency (ms, 0 to 2500) vs. elapsed time (sec, 0 to 3000) for Default, Aggressive AV, V12-P, and v9.6; annotation: 1.1 sec.]

V12-P: V12 Engine for PostgreSQL

New feature since v9.6: checkpoint_flush_after


Contents

•  Background tasks

•  Full page writes

•  Parallel query


Full Page Writes

Figure source: Tuning PostgreSQL for High Write Workloads – Grant McAlister


Full Page Writes

•  Impact on WAL size

[Figure: WAL size (GB, 0 to 14) vs. elapsed time (sec, 0 to 600) for the Default configuration.]

•  PostgreSQL v9.6.5
   -  52GB shared_buffers
   -  1GB max_wal_size
•  TPC-C workload
   -  50GB dataset
   -  10 min run
   -  200 clients


WAL Compression

•  Impact on WAL size

[Figure: WAL size (GB, 0 to 14) vs. elapsed time (sec, 0 to 600) for Default and WAL compression.]

•  PostgreSQL v9.6.5
   -  52GB shared_buffers
   -  1GB max_wal_size
•  TPC-C workload
   -  50GB dataset
   -  10 min run
   -  200 clients


WAL Compression

•  Impact on performance

[Figure: transaction throughput (trx/sec, 0 to 12000) for Default and WAL compression.]


What if we can safely disable full_page_writes w/o special H/W support?


Atomic Write Support

[Figure sequence: PostgreSQL (full_page_writes off) running on the Linux kernel (w/ V12 Engine), with a data block in memory and a mapping table plus data blocks on disk.]

-  write(): the data block is updated in memory; the mapping table and the existing data block are on disk.
-  fsync(): writeback stores the in-memory data block as a second data block on disk, next to the existing one.
-  fsync(): the mapping table is switched to the new data block with an atomic update.
-  The old data block is freed, leaving a single data block on disk.
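
The sequence in the figure resembles a shadow-copy (copy-on-write) update. The following toy model is a sketch of that general technique, not of the V12 Engine itself: the class `ShadowBlockStore`, its fields, and the `crash_before_remap` flag are all hypothetical names introduced for illustration.

```python
class ShadowBlockStore:
    """Toy block store that never overwrites a live on-disk block in place."""

    def __init__(self):
        self.disk = {}       # physical block number -> data ("on disk")
        self.mapping = {}    # logical block number -> physical block number
        self.dirty = {}      # logical block number -> data ("in memory")
        self._next_phys = 0

    def write(self, logical, data):
        """write(): update only the in-memory copy."""
        self.dirty[logical] = data

    def fsync(self, crash_before_remap=False):
        """fsync(): write back to a fresh block, remap, then free the old block."""
        for logical, data in list(self.dirty.items()):
            new_phys = self._next_phys
            self._next_phys += 1
            self.disk[new_phys] = data          # writeback to a new location
            if crash_before_remap:
                return                          # simulated crash: old block still mapped
            old_phys = self.mapping.get(logical)
            self.mapping[logical] = new_phys    # the single atomic switch
            if old_phys is not None:
                del self.disk[old_phys]         # free the old copy
            del self.dirty[logical]

    def read_durable(self, logical):
        """Return what a reader would see on disk after a crash."""
        return self.disk[self.mapping[logical]]


if __name__ == "__main__":
    store = ShadowBlockStore()
    store.write(0, b"page v1")
    store.fsync()
    store.write(0, b"page v2")
    store.fsync(crash_before_remap=True)
    print(store.read_durable(0))  # still the complete old page
```

Because the mapped block is never partially overwritten, a crash leaves either the complete old page or the complete new page, which is the property that full_page_writes otherwise has to provide through WAL.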


Atomic Write Support

•  Impact on WAL size

[Figure: WAL size (GB, 0 to 14) vs. elapsed time (sec, 0 to 600) for Default, WAL compression, and V12-P.]

V12-P: V12 Engine for PostgreSQL


Atomic Write Support

•  Impact on performance

[Figure: transaction throughput (trx/sec, 0 to 12000) for Default, WAL compression, and V12-P; annotation: 2X.]

V12-P: V12 Engine for PostgreSQL


Contents

•  Background tasks

•  Full page writes

•  Parallel query


Parallel Query

[Figure: a user query arrives at the Backend, which distributes the work (scan, aggregate, join, …) to four Workers.]


Parallel Query

[Figure: a user query goes to the Backend and its four Workers; each Worker reads through the cache to storage. Two Workers hit the cache (fast response) and two miss (slow response).]


Parallel Query

• Problem inside Linux kernel

[Figure: fair-based ("communism") I/O scheduling of concurrent queries, each labeled 10sec.]

-  Query1, Query2: complete at 18sec and 20sec
-  Query1, Query2, Query3: complete at 24sec, 26sec, and 30sec
-  Query1 … QueryN: ?


Parallel Query

• Problem inside Linux kernel

[Figure: I/O scheduling of Query1 and Query2, each labeled 10sec.]

-  Fair-based (communism): the queries complete at 18sec and 20sec
-  Task-aware (utilitarianism): the queries complete at 10sec and 20sec
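
The difference between the two policies can be reproduced with a small idealized model. In this sketch each query needs a fixed amount of exclusive I/O time; `round_robin` stands in for fair sharing and `one_at_a_time` for task-aware scheduling. Both function names are hypothetical, and the model does not reproduce the exact figures on the slide (18sec/20sec), only the trend.

```python
def round_robin(demands, time_slice=1):
    """Fair sharing: give each unfinished query one time slice in turn."""
    remaining = list(demands)
    finish = [None] * len(demands)
    clock = 0
    while any(r > 0 for r in remaining):
        for i, r in enumerate(remaining):
            if r > 0:
                run = min(time_slice, r)
                clock += run
                remaining[i] -= run
                if remaining[i] == 0:
                    finish[i] = clock  # query i completes at this time
    return finish


def one_at_a_time(demands):
    """Task-aware: serve all I/O of one query before starting the next."""
    finish, clock = [], 0
    for demand in demands:
        clock += demand
        finish.append(clock)
    return finish


if __name__ == "__main__":
    demands = [10, 10]  # two queries, 10 sec of I/O each
    for policy in (round_robin, one_at_a_time):
        times = policy(demands)
        print(policy.__name__, times, "mean:", sum(times) / len(times))
```

In this model the last query finishes at the same time under both policies, but serving one query at a time lets the first query finish much earlier, which lowers the average completion time.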


Optimizing Parallel Query

•  Illustrative experiment

pgbench -i -s 1000 test1
pgbench -i -s 1000 test2

Q1 (-d test1): SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%'
Q2 (-d test2): SELECT * FROM pgbench_accounts WHERE filler LIKE '%x%'

                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Gather  (cost=1000.00..1818916.53 rows=1 width=97)
   Workers Planned: 7
   ->  Parallel Seq Scan on pgbench_accounts  (cost=0.00..1817916.43 rows=1 width=97)
         Filter: (filler ~~ '%x%'::text)


Optimizing Parallel Query

•  Illustrative experiment

[Figure: query latency (ms, 0 to 70) of Q1 to Q4 under the Deadline, CFQ, NOOP, and Taskaware I/O schedulers.]


Recall

• Multiple independent layers

[Figure: layered I/O stack separated by abstraction: Application (read()/write() on the Buffer Cache), Caching Layer, File System Layer, Block Layer, Storage Device. FG and BG requests are reordered on the way down.]


Optimizing Parallel Query

•  Illustrative experiment (device queue off)

[Figure: query latency (ms, 0 to 100) of Q1 to Q4 under the Deadline, CFQ, NOOP, and Taskaware I/O schedulers; annotations: Avg 23%, Q1 73%.]


Optimizing Parallel Query

•  Illustrative experiment (4ms idle_slice)

[Figure: query latency (ms, 0 to 70) of Q1 to Q4 under Deadline, CFQ, NOOP, Taskaware, and Taskaware (Idle 4ms); annotations: Avg 17%, Q1 2.8X.]


Optimizing Parallel Query

• Ongoing work
   -  Handling heavy tasks
   -  Extending to CPU scheduling


H http://apposha.io F www.facebook.com/apposha M [email protected]

