Date post: | 17-May-2015 |
Category: |
Technology |
Upload: | mysqlconference |
View: | 2,170 times |
Download: | 0 times |
InnoDB Performance and Usability Patches
MySQL CE 2009
Vadim Tkachenko, Ewen Fortune
Percona Inc MySQLPerformanceBlog.com
Who are we ?
• Vadim Tkachenko – Co-Founder of Percona Inc
• Lead of R&D department
– Co-Author MySQLPerformanceBlog.com – Co-Author “High Performance MySQL” 2nd edition book
• Ewen Fortune – Consultant, Percona Inc
• Special Thanks Yasufumi Kinoshita – Not here, but author of most patches
What is this talk about?
• Patches made by Percona for InnoDB Storage Engine
• Two main focuses – Performance improvement patches – “Usability” patches
• Make InnoDB a bit more friendly • World changed since time of Pentium 100MHz and 8MB of RAM
– But many such assumptions still in InnoDB code
Why we do it
• Most requirements and changes come from practical work with customers
• We need InnoDB fully utilizing modern hardware today – 16 cores – RAIDs – SSD / FusionIO / other storage technologies
• InnoDB team is “conservative” in making improvements in this area
Future
• Why patches ? Why it can’t be included in InnoDB ? – We are often asked about, but actually question is to
InnoDB team • (empty space due to uncertainty of MySQL future in
Oracle)
• Anyway we will continue our work
Versions
• 5.0 – Set of patches – SHOW PATCHES to see full list
• 5.1 – Storage engine XtraDB – Based on InnoDB + patches, not real competitor of
InnoDB, but drop-in enhanced version
Performance Patches
Scalability
• Enhanced read_write locks – Improves InnoDB scalability on systems with 8-16 cores – Similar on Google implementation, InnoDB-plugin-1.0.3 – Our implementation is alternative
• Topic to research which one is better • InnoDB-plugin may be preferred, InnoDB team made hard job
porting it to many platforms
– And now in 5.4 • Split buffer_pool mutex even more
– Additional split of buffer_pool mutex to 5.0.33
IO patches
• InnoDB IO patches – Part similar to Google’s InnoDB IO patches, but again
alternative – Several parts – some of them now in 5.4
IO – multiple threads
• Read_io_threads – Number of threads for reads requests (by default 1) – Not really useful as used only for read-ahead requests
• Write_io_threads – Number of threads for write requests (by default 1) – This is one you may want to use on system with multiple
disks • Io_capacity
– Amount of IO operations per second InnoDB assumes server can do (by default 100, which is not right assumptions for modern systems)
IO – Adaptive checkpoint
• InnoDB flushing of dirty buffer_pool pages may be intensive
• Lack of free pages may be controlled by innodb_max_dirty_pages_pct
• Flushing at the moment of checkpoint is not controllable, intensive and may hurt
Adaptive checkpointing InnoDB default behavior, hiccups during buffer_pool flushing
Adaptive checkpoint
• What we do: • Flush pages more intensive
– the closer checkpoint the more intensive
Adaptive_checkpoint
• Adaptive_checkpoint=1
IO Control of Insert buffer
• Ibuf_max_size – maximal size of insert buffer (by default can be half of buffer_pool)
• Ibuf_accel_rate – IO rate for background thread, works in pair with io_capacity
IO – multiple pages
• Read_ahead = (both | linear | random) – Control to use or not internal InnoDB read-ahead logic
• Flush_neighbor_pages = (yes|no) – By default InnoDB also writes neighborhoods of flushing
pages • All these operations were made for disks with
expensive (in time sense) random reads – may be not needed for SSD / FusionIO / other devices with cheap random reads
Extra rollback segments
• By default InnoDB uses single segment protected by mutex
• Sensitive in intensive parallel insert load
Fix group commit
• “Broken” in 5.0 – Problem appears on slow disks with enabled binary-logs
Benchmark
• Tpcc-like workload • 100 Warehouses (about 10GB of data) • Buffer_pool=5GB • System: Dell PowerEdge R900, RAID 10 on 8
disks, RAM 32GB – O_DIRECT for InnoDB, xfs filesystem, mounted with
nobarrier • 5.0.77 vs 5.0.77-percona
– Had no chance to test 5.4 yet
Benchmark
Usability patches
Microslow
• InnoDB part
InnoDB_IO_r_ops: 1 InnoDB_IO_r_bytes: 16384 InnoDB_IO_r_wait: 0.028487 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000 # InnoDB_pages_distinct: 5
Limit data dictionary
• Problem: – Data dictionary entry of once opened table kept in
memory forever (or while DELETE table) – Is not problem for regular usage (100-1000 tables) – Is problem for instances with 10K+ tables
• 10GB+ of memory just allocated for datadictionary entries
• Our solution: – LRU based datadictionary entries – Remove from memory oldest entries if limit reached
IO access pattern
• mysql> select INDEX_ID,TABLE_NAME,INDEX_NAME,sum(N_READ),sum(N_WRITE) from INFORMATION_SCHEMA.INNODB_ALL_PAGE_IO group
• by INDEX_ID;
• +------------+------------------+-------------------+-------------+--------------+
• | INDEX_ID | TABLE_NAME | INDEX_NAME | sum(N_READ) | sum(N_WRITE) |
• +------------+------------------+-------------------+-------------+--------------+
• | 30 | tpcc/item | PRIMARY | 547 | 0 |
• | 32 | tpcc/district | PRIMARY | 1 | 1 | • | 36 | tpcc/history | GEN_CLUST_INDEX | 11 | 5 |
• | 37 | tpcc/history | fkey_history_1 | 166 | 163 |
• | 38 | tpcc/history | fkey_history_2 | 37 | 30 |
• | 39 | tpcc/new_orders | PRIMARY | 76 | 76 |
• | 43 | tpcc/order_line | PRIMARY | 218 | 189 |
• | 44 | tpcc/order_line | fkey_order_line_2 | 1040 | 1040 | • | 46 | tpcc/stock | PRIMARY | 3137 | 1764 |
• | 47 | tpcc/stock | fkey_stock_2 | 269 | 0 |
• | 48 | tpcc/customer | PRIMARY | 960 | 580 |
• | 49 | tpcc/customer | idx_customer | 171 | 0 |
• | 50 | tpcc/orders | PRIMARY | 94 | 70 |
• | 51 | tpcc/orders | idx_orders | 142 | 129 |
Show pattern of pages on disk accessed
Show buffer pool content
• What is in buffer_pool select space,offset, RECORDS, DATASIZE, INDEX_NAME,TABLE_SCHEMA,TABLE_NAME from information_schema.INNODB_BUFFER_POOL_CONTENT limit 10; +-------+---------+---------+----------+------------+--------------+-------------+ | space | offset | RECORDS | DATASIZE | INDEX_NAME | TABLE_SCHEMA | TABLE_NAME | +-------+---------+---------+----------+------------+--------------+-------------+ | 1584 | 640643 | 9 | 10312 | PRIMARY | art104 | article104 | | 1648 | 2100 | 135 | 15226 | PRIMARY | art114 | author114 | | 1492 | 4507 | 158 | 15130 | PRIMARY | art87 | author87 | | 1406 | 17498 | 141 | 16056 | img_status | art52 | img_out52 | | 1466 | 47632 | 49 | 15140 | PRIMARY | art62 | img_out62 | | 1470 | 1395457 | 24 | 14769 | PRIMARY | art84 | article84 | | 1460 | 16025 | 62 | 15174 | PRIMARY | art61 | img_out61 | | 1458 | 560956 | 20 | 14977 | PRIMARY | art61 | article61 | | 1466 | 67953 | 56 | 15182 | PRIMARY | art62 | img_out62 | | 1621 | 162962 | 46 | 15134 | PRIMARY | art110 | link_out110 | +-------+---------+---------+----------+------------+--------------+-------------+
Show memory usage
• Extended information about memory consuming ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 328830416; in additional pool allocated 2117120 + Internal hash tables (constant factor + variable factor) + Adaptive hash index 4839388 (4425628 + 413760) + Page hash 138716 + Dictionary cache 3383508 (3320220 + 63288) + File system 41848 (41336 + 512) + Lock system 332788 (332468 + 320) + Recovery system 0 (0 + 0) + Threads 41900 (41348 + 552) Buffer pool size 16384 + Buffer pool size, bytes 268435456 Free buffers 12396
Show locks held • ---TRANSACTION 0 163390, ACTIVE 0 sec, process no 15571, OS thread id 1159485776
inserting
• mysql tables in use 1, locked 1 • 7 lock struct(s), heap size 1216, undo log entries 4 • MySQL thread id 15, query id 15455 127.0.0.1 root update
• INSERT INTO history(h_c_d_id, h_c_w_id, h_c_id, h_d_id, h_w_id, h_date, h_amount, h_data) VALUES(?, ?, ?, ?, ?, ?,
• ?, ?)
• Trx read view will not see trx with id >= 0 163391, sees < 0 163086 • TABLE LOCK table `test/warehouse` trx id 0 163390 lock mode IX • RECORD LOCKS space id 10 page no 3 n bits 168 index `PRIMARY` of table `test/warehouse`
trx id 0 163390 lock_mode X
• locks rec but not gap • TABLE LOCK table `test/district` trx id 0 163390 lock mode IX
• RECORD LOCKS space id 18 page no 7 n bits 216 index `PRIMARY` of table `test/district` trx id 0 163390 lock_mode X locks rec but not gap
• TABLE LOCK table `test/customer` trx id 0 163390 lock mode IX • RECORD LOCKS space id 19 page no 57918 n bits 96 index `PRIMARY` of table `test/customer ̀
trx id 0 163390 lock_mode X locks rec but not gap
• TABLE LOCK table `test/history` trx id 0 163390 lock mode IX
Extra undo slots
• By default 1024 slots to store transaction undo information, that may limit count of concurrent transactions to 512
• We increase to 4072 – Only on 5.1 XtraDB – Use it only if you need, breaks compatibility with InnoDB
TransactionalReplication
• Similar to Google’s patch • Information in relay-log.info is not consistent with
InnoDB state. – When server crash MySQL will repeat several transaction
• You are lucky if replication fails on “Duplicate key error” • In worst case you will have several transactions executed twice
• Our solution: store information of binary log name and position and relay-log name and position in InnoDB transactional log file
Plans
• Still hunt performance improvements • Operations tasks:
– Fast recovery • There is reported bug http://bugs.mysql.com/bug.php?id=29847
– Preload table / index into buffer_pool. – Copy single .ibd table from one server to different – Open InnoDB tables in parallel
• Currently serialized
– Different improvements on statistics • Some patches already published (not by us)
To finalize
• Most of patches is not rocket science – Could be developed or included in official tree long time
ago • Even more, for some patches we just only uncommented few
lines of code
– Expect most of them in MariaDB 5.1
Questions ?
• Thank you for coming!