© 2011 IBM Corporation1 IBM Internal Use Only
Freakish Database Performance With Flash Storage
© 2011 IBM Corporation2 IBM Internal Use Only
Agenda
• Share some experience with using solid state/ flash storage for database workloads:
• OLTP (2TB)
• Warehouse (76TB)
• Which workload characteristics can best leverage flash storage?
• What are some best practices
© 2011 IBM Corporation3 IBM Internal Use Only
OLTP Workload
Initial profile• A brokerage house package
• Batch cycle comprised of five Java programs (only one can be parallelized)
• 1.5M transactions in 8 hours after extensive application and SQL tuning
• 1.68TB uncompressed
• Online backup time (backup, then gzip) in 36 hours
The Challenge• Goal: 1.5 M trans in 5.5 hours
• Stretch goal: 1.5M trans in 2 hours
• Improve backup time
• What is possible if CPU, memory, storage, and network are not constrained?
© 2011 IBM Corporation4 IBM Internal Use Only
The Setup
No holds barred• 2 x 64 cores, 3.86
GHz, 1TB RAM
• 86TB HDD, 256GB cache – 2 ms average response time
• 1TB SSD
• 10GbE
Approaches• Enabled compression
• No database tuning
• All-HDD
• Mixed – SSD (logs & temp), HDD (data & indexes)
• All-SSD
© 2011 IBM Corporation5 IBM Internal Use Only
Results
Mixed SSD (logs & temp) & HDD (data, indexes) 14%
All – SSD 26%
Disk Utilization < 1% busy
Average IOPS 20
Throughput 450KB/s
Application Engines 30
Uncompressed offline backup 30 – 40 min
Compressed online/offline backup (SSD to HDD) 18 min
Accept all default database settings out of the box• STMM
• Auto runstats
• Auto online table reorg
© 2011 IBM Corporation6 IBM Internal Use Only
Application Engines Performance
Most improvements resulted from more CPUs for the application
• CPU intensive
• Verbose application logging
• Application logs generated more IOs than database!
• More application engines generating transactions to reduce batch elapsed time
• Low database IO profile
© 2011 IBM Corporation7 IBM Internal Use Only
Final Results
Results
Goal: 1.5M trans in 5.5 hours Y
Stretch goal: 1.5M trans in 2 hours Y
Improve backup time: 18 minutes v. 36 hours Y
Best result: 1.5M trans in 1.1 hours! (All SSD) Y
© 2011 IBM Corporation8 IBM Internal Use Only
Warehouse Workload
Initial profile• Servers and storage running 100% all day
long
• Maxed out at around 30 – 40 active users
• Half-stroked disks to get performance and throughput
The Challenge• Aging servers and storage
• Data center floor space, cooling, and power consumption constraints
• Same or better performance
© 2011 IBM Corporation9 IBM Internal Use Only
The Setup
Approach• Replacement will be very fast, very
small, very simple
© 2011 IBM Corporation10 IBM Internal Use Only
Database IO Improvement for Warehouse Workload76TB IBM SSD v. Old HDD
Sub-millisecond IO response time Sustained
• Synchronous reads 21.8x
• Synchronous writes 13.6x• Asynchronous reads 17.6x• Asynchronous writes 18.34x• Data pages per asynchronous request 1.8xNote: Asynchronous IOs are ~18x faster, each asynchronous request is ~2x more effective due to 32K page size, that is a 36x improvement.
© 2011 IBM Corporation11 IBM Internal Use Only
Benchmark Queries Improvement for Warehouse Workload76TB IBM SSD v. Old HDD
Benchmark details
• Actual IO and CPU intensive queries captured from business users
• Runs weekly to monitor any performance degradation with respect to new and organic growth in the warehouse over time
• Noise queries (75) + benchmark queries (25) = 100
All SSD Old
Noise queries completed 85% 32%
BM queries completed 100%(first time ever)
64%(historically never reached
100)
CPU utilization 30% 100%
© 2011 IBM Corporation12 IBM Internal Use Only
Benchmark Queries Speed Up Factor for Warehouse Workload(Plotted on Logarithmic Scale)
76TB IBM SSD v. Old HDD
Speed up details
• Average: 2.21 (log) or 163.96x faster
• Median: 1.48 (log) or 29.96x faster (50% is at least ~30x faster)
• Low: 0.56 (log) or 3.59x faster
• High: 3.05 (log) or 1,113.56x faster
• Time is measured as elapsed time (prepare + execute + fetch)
© 2011 IBM Corporation13 IBM Internal Use Only
CPU Utilization
About 30% busy … BTW … We are also using disk level encryption (SED)
© 2011 IBM Corporation14 IBM Internal Use Only
EXP30 Ultra SSD
IO Specifications• Each drive: SFF (1.8”), 1/5 of 1U, 387GB
• IO drawer: 30 drives (6 x 5). Total raw capacity: 11.6TB (30 x 387GB). Cache: 3.1GB
• IOPS: 400K (100% read) / 280K (70/30 R/W) / 165K (100% write)
• Two POWER 740 servers connected to one IO drawer
• PCIe attached via GX++ adapter (8Gb/s)
• Configured as 5+p LUNs (130GB LUNs)
© 2011 IBM Corporation15 IBM Internal Use Only
Deployment Considerations
• IO adapter card (HBA)
• At 120K – 400K IOPS per IO drawer, and 32K IO size, it is possible to saturate the HBA
• Plan for adequate number of HBAs
• If using SAN then be sure the bandwidth to the storage server is consistent along the whole path, for example, 8Gb/s
• Balance IOs across HBAs and front end ports for even utilization
• Be cautious about mixing flash storage & HDD drives in one HBA
• Fewer, larger LUNs (500GB– 700GB)
• LUNs do take up available system memory and CPU cycles on the server
• Multiple logical volumes per LUN, no reason to stripe LV across LUNs
• Use large page size (32K), extent size, but ensure that the database bufferpool(s) are adequately sized to accept big reads
• Optimize data movement with less IOPS. It is not about driving up IOPS
© 2011 IBM Corporation16 IBM Internal Use Only
Candidate Application Considerations• High IO profile
• Indexes, data
• Database logs and temp spaces can take advantage of cache write through already, may not be the best candidates
• Applications that can parallelize well to take advantage of higher IO throughput
• Before we can process more transactions per second the applications need to be able to generate more transactions per second
• For example, we needed to increase the number of application engines from 3 to 30 in order to generate 8x throughput in transaction rate
• Applications that spend more time fetching result sets across a network, rather than executing complex queries in the database, will likely see less improvement (slow consumers)
• client_idle_wait_time (ms) (time spent waiting for client/application to send its next request)
• If the database spends more time waiting for client/application to send work then improving database response time alone will not improve throughput.
• Increase application parallelism
• Look for network congestion issues
• call monreport.dbsummary(600), examine client_idle_wait_time
© 2011 IBM Corporation17 IBM Internal Use Only
Why Consider Flash Storage
• Greatly beneficial for high IO workloads
• Much smaller footprint, much more energy efficient
• Servers (11), IO drawers (7), power supply all fit in one rack!
• Achieve high performance, and throughput quickly without tuning
• Performance, reliability, price