Percona Server for MySQL 5.7: Key Performance Algorithms
Laurynas Biveinis, Alexey Stroganov
Percona
Percona Server for MySQL 5.7 Key Performance Algorithms
• Focus on InnoDB: the buffer pool, flushing, the doublewrite buffer
• Talk assumes familiarity, but feel free to interrupt
• What we learned
• What we did
• How we did it
InnoDB buffer pool
• Memory cache of disk data pages, sized by innodb_buffer_pool_size
• In-memory data pages accessible through several data structures
• 1) Page hash for lookup
[Diagram: (space_id; page_id) → fold → hash array → data page lists]
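The lookup path above can be sketched as follows. This is a minimal illustration, not InnoDB's code: the `fold` formula, the `PageHash` class, and the cell count are assumptions made for the example.

```python
from typing import List, Optional, Tuple

PageKey = Tuple[int, int]  # (space_id, page_id)


def fold(space_id: int, page_id: int) -> int:
    """Hypothetical fold: mix both ids into one integer."""
    return (space_id << 20) + space_id + page_id


class PageHash:
    """Hash array whose cells hold chains of cached data pages."""

    def __init__(self, n_cells: int) -> None:
        self.cells: List[List[Tuple[PageKey, bytes]]] = [[] for _ in range(n_cells)]

    def _cell(self, space_id: int, page_id: int) -> List[Tuple[PageKey, bytes]]:
        # The folded value selects one cell of the hash array.
        return self.cells[fold(space_id, page_id) % len(self.cells)]

    def insert(self, space_id: int, page_id: int, frame: bytes) -> None:
        self._cell(space_id, page_id).append(((space_id, page_id), frame))

    def lookup(self, space_id: int, page_id: int) -> Optional[bytes]:
        # Walk the chain in the cell and compare full keys to resolve collisions.
        for key, frame in self._cell(space_id, page_id):
            if key == (space_id, page_id):
                return frame
        return None  # the page is not in the buffer pool
```

A miss (`None`) is the case where the page has to be read from disk into a free page.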
InnoDB buffer pool
• 2) flush list for dirty page management. Dirtying:
[Diagram: INSERT INTO foo VALUES(bar) turns a clean page into a dirty page with LSN = 42, which joins the flush list tail: dirty pages with LSN = 25, 32, 42]
InnoDB buffer pool
• 2) flush list for dirty page management. Flushing:
[Diagram: the flush list head holds dirty pages with LSN = 5, 7, 12; flushing up to LSN 10 cleans the first two pages and leaves the dirty page with LSN = 12 at the flush list head]
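The two flush list operations above, dirtying and flushing up to an LSN, can be sketched as below. The `FlushList` class and its method names are hypothetical; the sketch only assumes that pages are kept ordered by the LSN at which they were first dirtied.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class FlushList:
    """Dirty pages ordered by LSN, oldest modification first."""

    def __init__(self) -> None:
        self._pages: Deque[Tuple[str, int]] = deque()  # (page name, LSN)

    def add_dirty(self, page: str, lsn: int) -> None:
        # LSNs only grow, so appending keeps the list ordered.
        if self._pages and lsn < self._pages[-1][1]:
            raise ValueError("LSN must not go backwards")
        self._pages.append((page, lsn))

    def flush_up_to(self, target_lsn: int) -> List[str]:
        """Flush (here: just remove) every page dirtied at or before target_lsn."""
        flushed = []
        while self._pages and self._pages[0][1] <= target_lsn:
            flushed.append(self._pages.popleft()[0])
        return flushed

    def oldest_lsn(self) -> Optional[int]:
        return self._pages[0][1] if self._pages else None
```

With pages at LSN 5, 7, and 12, `flush_up_to(10)` flushes the first two and leaves LSN 12 as the oldest dirty page, matching the slide.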
InnoDB buffer pool
• 3) LRU list for deciding which pages to evict
• Preventing eviction for recently-used pages (making them young)
• innodb_old_blocks_pct, innodb_old_blocks_time
[Diagram: LRU list of dirty and clean pages split into an old part and a young part; a page access moves the accessed page from the old part to the young part]
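A simplified model of the old/young split is sketched below. It is an assumption-laden illustration: `MidpointLRU` is a hypothetical class, time is passed in explicitly, and the old sublist is only ever topped up from the young tail, which is cruder than what InnoDB does. The defaults mirror innodb_old_blocks_pct = 37 and innodb_old_blocks_time = 1000 ms.

```python
from collections import deque


class MidpointLRU:
    """LRU list with an old sublist (eviction candidates) and a young sublist."""

    def __init__(self, old_pct: int = 37, old_time_ms: int = 1000) -> None:
        self.old_pct = old_pct
        self.old_time_ms = old_time_ms
        self.young = deque()   # index 0 = most recently made young
        self.old = deque()     # index -1 = next page to evict
        self.inserted_ms = {}  # page -> time it entered the old sublist

    def read_in(self, page: str, now_ms: int) -> None:
        # Newly read pages start at the head of the old sublist.
        self.old.appendleft(page)
        self.inserted_ms[page] = now_ms

    def access(self, page: str, now_ms: int) -> None:
        if page in self.young:
            self.young.remove(page)
            self.young.appendleft(page)
        elif page in self.old and now_ms - self.inserted_ms[page] >= self.old_time_ms:
            # Only pages that survived old_time_ms in the old sublist become young.
            self.old.remove(page)
            self.young.appendleft(page)
            self._rebalance()

    def _rebalance(self) -> None:
        # Keep roughly old_pct percent of all pages in the old sublist.
        total = len(self.young) + len(self.old)
        target_old = total * self.old_pct // 100
        while len(self.old) < target_old and self.young:
            self.old.appendleft(self.young.pop())

    def evict(self) -> str:
        victim = self.old.pop() if self.old else self.young.pop()
        del self.inserted_ms[victim]
        return victim
```

The time threshold is what keeps a one-off table scan from pushing frequently used pages out of the young sublist.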
InnoDB buffer pool
• 4) free list for having free space in the buffer pool to read currently non-present pages. Reading:
[Diagram: a page read takes one of five free pages from the free list, leaving four free pages and one clean page]
InnoDB buffer pool
• 3/4) Evicting/flushing pages from the LRU list and putting them on the free list (up to innodb_lru_scan_depth)
[Diagram: before, the LRU list holds five pages (dirty, clean, dirty, clean, clean) and the free list holds four free pages; after, the LRU list holds four pages (dirty, dirty, clean, clean) and the free list holds five free pages]
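One LRU flush batch, as described above, might look like the sketch below. The `Page` type, the `lru_flush_batch` function, and the `write_page` callback are hypothetical names; the only behaviour assumed is that a scan from the LRU tail is bounded by a depth, dirty pages are written before eviction, and each evicted page yields one free page.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass
class Page:
    name: str
    dirty: bool = False


def lru_flush_batch(
    lru: Deque[Page],
    free_list: Deque[str],
    scan_depth: int,
    write_page: Callable[[Page], None],
) -> int:
    """Scan up to scan_depth pages from the LRU tail and move them to the free list."""
    freed = 0
    for _ in range(min(scan_depth, len(lru))):
        page = lru.pop()  # tail = least recently used
        if page.dirty:
            write_page(page)  # a dirty page must reach disk before eviction
            page.dirty = False
        free_list.append("free page")  # the frame is now reusable
        freed += 1
    return freed
```

Clean pages are cheap to evict; dirty ones cost a write, which is why the mix of pages near the LRU tail matters for how fast free pages can be produced.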
The doublewrite buffer
• innodb_doublewrite
• Protects against data loss in case of a crash in the middle of a page write
• Implemented by writing data pages twice
The doublewrite buffer
[Diagram: Step 1: add the data page to the doublewrite buffer in memory. Step 2: flush it to the doublewrite buffer on disk. Step 3: write the page to the data file.]
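The three steps, and why writing twice helps after a crash, can be sketched with dictionaries standing in for disk areas. Everything here is an illustrative assumption: the `DoublewriteStore` class, the CRC32 checksum layout, and the simulated torn write are not InnoDB's actual format or code.

```python
import zlib


def with_checksum(payload: bytes) -> bytes:
    """Prefix the payload with a 4-byte CRC32 so torn pages can be detected."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def is_intact(page: bytes) -> bool:
    return len(page) >= 4 and zlib.crc32(page[4:]).to_bytes(4, "big") == page[:4]


class DoublewriteStore:
    def __init__(self) -> None:
        self.dblwr_memory = {}  # step 1 target
        self.dblwr_disk = {}    # step 2 target
        self.data_file = {}     # step 3 target

    def add(self, page_no: int, payload: bytes) -> None:
        # Step 1: add the page to the in-memory doublewrite buffer.
        self.dblwr_memory[page_no] = with_checksum(payload)

    def flush(self, crash_during_data_write: bool = False) -> None:
        # Step 2: persist the whole batch to the on-disk doublewrite area first.
        self.dblwr_disk = dict(self.dblwr_memory)
        # Step 3: only then write each page to its final place in the data file.
        for page_no, page in self.dblwr_memory.items():
            if crash_during_data_write:
                self.data_file[page_no] = page[: len(page) // 2]  # simulated torn write
                return
            self.data_file[page_no] = page
        self.dblwr_memory.clear()

    def recover(self) -> None:
        # After a crash, replace torn data pages with their intact doublewrite copies.
        for page_no, copy in self.dblwr_disk.items():
            if not is_intact(self.data_file.get(page_no, b"")) and is_intact(copy):
                self.data_file[page_no] = copy
```

The ordering is the point: because step 2 completes before step 3 starts, a page torn in step 3 always has a complete copy to restore from.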
Buffer pool concurrency
[Diagram: the flush list is protected by the flush list mutex and the page hash by the page hash latch; the LRU list, free list, and misc. data are protected by the buffer pool mutex]
Buffer pool instances
[Diagram: buffer pool instances 0 and 1, each with its own flush list, LRU list, free list, page hash, and misc. data, and its own buffer pool mutex, flush list mutex, and page hash latch]
Buffer pool instances
• innodb_buffer_pool_instances
• Problem: some instances are cold and some are hot
• “First the accesses to the buffer pools is in no way evenly spread out.”
• http://bit.ly/bpsplit
• Six-year-old quote, just as relevant today
Concurrency in XtraDB
[Diagram: flush list (flush list mutex), page hash (page hash latch), LRU list (LRU list mutex), free list (free list mutex), misc (misc mutex / atomics)]
Patch contributed to MySQL, and merged in 8.0.0 http://bugs.mysql.com/bug.php?id=75534
The concurrency solutions are compatible
[Diagram: buffer pool instances 0 and 1, each with its own flush list (flush list mutex), page hash (page hash latch), LRU list (LRU list mutex), free list (free list mutex), and misc (misc mutex / atomics)]
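How the two approaches combine can be sketched as below: pages are spread over instances, and inside each instance every list has its own lock. The class names, the instance-selection formula, and the use of a plain lock for the page hash latch are assumptions for illustration only.

```python
import threading
from collections import deque


class BufferPoolInstance:
    """One instance with a separate lock per data structure."""

    def __init__(self) -> None:
        self.flush_list = deque()
        self.flush_list_mutex = threading.Lock()
        self.lru_list = deque()
        self.lru_list_mutex = threading.Lock()
        self.free_list = deque()
        self.free_list_mutex = threading.Lock()
        self.page_hash = {}
        self.page_hash_latch = threading.Lock()  # stand-in for a read-write latch

    def take_free_page(self):
        # Touches only the free list, so only the free list mutex is taken.
        with self.free_list_mutex:
            return self.free_list.popleft() if self.free_list else None

    def add_dirty(self, page: str, lsn: int) -> None:
        # Touches only the flush list, so LRU and free list users are not blocked.
        with self.flush_list_mutex:
            self.flush_list.append((page, lsn))


class BufferPool:
    def __init__(self, n_instances: int) -> None:
        self.instances = [BufferPoolInstance() for _ in range(n_instances)]

    def instance_for(self, space_id: int, page_id: int) -> BufferPoolInstance:
        # Hypothetical mapping of a page to an instance.
        return self.instances[(space_id * 31 + page_id) % len(self.instances)]
```

Instances cut contention between pages that land in different instances; the per-list locks cut contention between different kinds of work on the same instance, which is what helps when one instance is hot.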
Buffer pool mutexes are so 5.5
[Chart: improvement by the buffer pool mutex split compared with improvement by adaptive flushing]
5.6+ changed things
• In 5.5 and earlier: reduce mutex contention by X%, observe TPS increase by ~X%
• Changes to flushing heuristics are driven by performance stability, not necessarily by peak performance
• Pre-release Percona Server 5.6: reduce mutex contention by X%, observe TPS increase by ~0%
• What happened? The InnoDB cleaner thread happened
Buffer pool / flushing concurrency in 5.5
[Timeline diagram, columns: Master thread, Query thread 1, Query thread 2; events shown over time: flush list flush, access page, LRU list flush]
Buffer pool / flushing concurrency in 5.6+
[Timeline diagram, columns: Cleaner thread, Query thread 1, Query thread 2; events shown over time: flush list flush, access page, LRU list flush]
Buffer pool / flushing concurrency in 5.6+
• In 5.6+, code-level changes to make locking finer-grained are still relevant, but
• Increasing thread specialization means that…
• …flushing heuristics, including LRU, are very important now
MySQL 5.7 multi-threaded flushing
[Timeline diagram, 0 s to 1 s: the coordinator thread and worker threads #0 and #1 (innodb_page_cleaners) each take one buffer pool instance (#0, #1, #2) and flush its LRU list, then its flush list, followed by further LRU flushes]
MySQL 5.7.11 OLTP_RW
PFS data is incomplete
MySQL 5.7.11 OLTP_RW
660 pthread_cond_wait, enter (ib0mutex.h:850), buf_dblwr_write_single_page (ib0mutex.h:850), buf_flush_write_block_low (buf0flu.cc:1096), buf_flush_page (buf0flu.cc:1096), buf_flush_single_page_from_LRU (buf0flu.cc:2217), buf_LRU_get_free_block (buf0lru.cc:1401), ...
631 pthread_cond_wait, buf_dblwr_write_single_page (buf0dblwr.cc:1213), buf_flush_write_block_low (buf0flu.cc:1096), buf_flush_page (buf0flu.cc:1096), buf_flush_single_page_from_LRU (buf0flu.cc:2217), buf_LRU_get_free_block (buf0lru.cc:1401), ...
337 pthread_cond_wait, PolicyMutex<TTASEventMutex<GenericPolicy> (ut0mutex.ic:89), get_next_redo_rseg (trx0trx.cc:1185), trx_assign_rseg_low (trx0trx.cc:1278), trx_set_rw_mode (trx0trx.cc:1278), lock_table (lock0lock.cc:4076), ...
Highlighted: 631 pthread_cond_wait, buf_dblwr_write_single_page
Single-page flushing
[Flowchart: query thread needs a free page → is a free page available? Yes: take a free page from the free list. No: single-page flush with single-page doublewrite. → Query thread has a free page]
Percona Server innodb_empty_free_list_algorithm=backoff
[Flowchart: query thread needs a free page → is a free page available? Yes: take a free page from the free list. No: wait and check again, instead of a single-page flush with single-page doublewrite. → Query thread has a free page]
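The waiting branch of the flowchart can be sketched as a retry loop with growing sleeps. This is a hedged illustration: `get_free_block`, its parameters, and the doubling schedule are assumptions for the example and do not reproduce the server's actual wait times.

```python
import time
from collections import deque
from typing import Callable, Deque


def get_free_block(
    free_list: Deque[str],
    sleep: Callable[[float], None] = time.sleep,
    initial_wait_s: float = 0.001,
    max_wait_s: float = 0.1,
    max_attempts: int = 20,
) -> str:
    """Take a free page, backing off while the free list is empty."""
    wait_s = initial_wait_s
    for _ in range(max_attempts):
        if free_list:
            return free_list.popleft()
        # No free page: wait for the LRU flusher instead of flushing a page ourselves.
        sleep(wait_s)
        wait_s = min(wait_s * 2, max_wait_s)
    raise TimeoutError("no free page became available")
```

The query thread never writes a page here, so it cannot end up queueing on the single-page doublewrite path; the cost moves to how quickly the LRU flusher refills the free list.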
MySQL 5.7 multi-threaded flushing
[Timeline diagram, 0 s to 1 s: the coordinator thread and worker threads #0 and #1 each flush the LRU list and then the flush list of one buffer pool instance (#0, #1, #2); free pages are produced only during the LRU flushes, and in between there are single-page flushes]
Percona Server 5.7 multi-threaded flushing
[Timeline diagram, 0 s to 1 s: LRU flushers #0 and #1 repeatedly flush LRU instances #0 and #1, each producing free pages; separately, the coordinator and worker #0 flush flush list instances #0 and #1]
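The idea of one dedicated LRU flusher per buffer pool instance can be sketched with plain threads. The `Instance` class, the `lru_flusher` function, and the `target_free` parameter are hypothetical; the sketch assumes only that each flusher works on its own instance, independently of the flush list threads.

```python
import threading
from collections import deque


class Instance:
    """A buffer pool instance reduced to an LRU list and a free list."""

    def __init__(self, n_pages: int) -> None:
        self.lru = deque(f"page{i}" for i in range(n_pages))
        self.free = deque()
        self.mutex = threading.Lock()


def lru_flusher(instance: Instance, stop: threading.Event, target_free: int) -> None:
    """Keep one instance's free list topped up until asked to stop."""
    while not stop.is_set():
        with instance.mutex:
            # Evict from the LRU tail until enough free pages exist.
            while len(instance.free) < target_free and instance.lru:
                instance.lru.pop()
                instance.free.append("free page")
        stop.wait(0.001)  # short pause before checking this instance again
```

Starting one such thread per instance means a slow or busy instance does not delay free page production in the others.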
Percona Server 5.7.10-3 OLTP_RW
2678 nanosleep (libpthread.so.0), …, buf_LRU_get_free_block (buf0lru.cc:1435), ...
867 pthread_cond_wait, ..., log_write_up_to (log0log.cc:1293), ...
396 pthread_cond_wait, …, mtr_t::s_lock (sync0rw.ic:433), btr_cur_search_to_nth_level (btr0cur.cc:1022), ...
337 libaio::?? (libaio.so.1), LinuxAIOHandler::collect (os0file.cc:2325), ...
240 poll (libc.so.6), ..., Protocol_classic::read_packet (protocol_classic.cc:810), ...
Highlighted: 2678 nanosleep, …, buf_LRU_get_free_block
Percona Server 5.7.10-3 OLTP_RW flushers only
139 libaio::?? (libaio.so.1), LinuxAIOHandler::collect (os0file.cc:2448), LinuxAIOHandler::poll (os0file.cc:2594), ...
56 pthread_cond_wait, …, buf_dblwr_add_to_batch (buf0dblwr.cc:1111), …, buf_flush_LRU_list_batch (buf0flu.cc:1555), ..., buf_lru_manager (buf0flu.cc:2334), ...
25 pthread_cond_wait, …, os_event_wait_low (os0event.cc:534), buf_flush_page_cleaner_worker (buf0flu.cc:3482), ...
21 pthread_cond_wait, …, PolicyMutex<TTASEventMutex<GenericPolicy> (ut0mutex.ic:89), buf_page_io_complete (buf0buf.cc:5966), fil_aio_wait (fil0fil.cc:5754), io_handler_thread (srv0start.cc:330), ...
8 pthread_cond_timedwait, …, buf_flush_page_cleaner_coordinator (buf0flu.cc:2726), ...
Highlighted: 56 pthread_cond_wait, …, buf_dblwr_add_to_batch
Legacy doublewrite buffer: adding pages
Legacy doublewrite buffer: flushing buffer
Parallel doublewrite buffer: adding pages
Parallel doublewrite buffer: flushing buffers
Percona Server 5.7.11-4 OLTP_RW flushers only
112 libaio::?? (libaio.so.1), LinuxAIOHandler::collect (os0file.cc:2455), ..., io_handler_thread (srv0start.cc:330), ...
54 pthread_cond_wait, …, buf_dblwr_flush_buffered_writes (buf0dblwr.cc:1287), …, buf_flush_LRU_list (buf0flu.cc:2341), buf_lru_manager (buf0flu.cc:2341), ...
35 pthread_cond_wait, …, PolicyMutex<TTASEventMutex<GenericPolicy> (ut0mutex.ic:89), buf_page_io_complete (buf0buf.cc:5986), …, io_handler_thread (srv0start.cc:330), ...
27 pthread_cond_wait, ..., buf_flush_page_cleaner_worker (buf0flu.cc:3489), ...
10 pthread_cond_wait, …, enter (ib0mutex.h:845), buf_LRU_block_free_non_file_page (ib0mutex.h:845), buf_LRU_block_free_hashed_page (buf0lru.cc:2567), …, buf_page_io_complete (buf0buf.cc:6070), …, io_handler_thread (srv0start.cc:330), ...
Percona Server 5.7 OLTP_RW
Summary: the 5.7 story
• I/O-bound workloads: high demand for free pages, provided by LRU batch flushing or single-page flushing
• Single-page flushes are bad, with and without doublewrite
• Removed them
• Made the batch LRU flusher truly parallel
• Doublewrite buffer negates parallel flushing gains
• Made it parallel too
Join us at Percona Live
When: April 24-27, 2017
Where: Santa Clara, CA, USA
The Percona Live Open Source Database Conference is a great event for users of any level using open source database technologies.
• Get briefed on the hottest topics
• Learn about building and maintaining high-performing deployments
• Listen to technical experts and top industry leaders
Use promo code “WebinarPL” to save an extra 15%. Register now and get the early bird rate, but hurry, prices go up Jan 31st. https://www.percona.com/live/17/register
Sponsorship opportunities available as well: https://www.percona.com/live/17/be-a-sponsor