www.percona.com 2
In This Presentation
Flash technology overview
Review some of the available technology
What does this mean for databases?
Specific opportunities for MySQL
There were HDDs
Good at Sequential Read/Writes
RT=Seek Time + Rotation Latency
Reads/Writes: Similar Latency
No specific write limits
Retain data for a long time
One IO request at a time (no parallelism)
Low cost per GB
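The response-time formula on this slide can be worked through numerically. The sketch below assumes the average rotational latency is half a full rotation; the 4 ms seek time and 7200 RPM are illustrative values, not figures from the presentation.

```python
def hdd_response_time_ms(avg_seek_ms: float, rpm: int) -> float:
    """Average service time for one random IO: seek time plus rotational latency."""
    full_rotation_ms = 60_000 / rpm               # one full platter rotation, in ms
    avg_rotation_latency_ms = full_rotation_ms / 2  # assumption: wait half a rotation on average
    return avg_seek_ms + avg_rotation_latency_ms


# Illustrative (assumed) drive: 4 ms average seek, 7200 RPM.
rt = hdd_response_time_ms(avg_seek_ms=4.0, rpm=7200)
print(f"{rt:.2f} ms per random IO, about {1000 / rt:.0f} IO/s")
```

Because the drive serves one request at a time, the random IO rate is roughly the inverse of this response time.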
Using Many HDDs together
Caching Reads
Buffering Writes (Writeback Cache)
Better Sequential Read/Write speed
Better throughput at high concurrency
Higher IO latencies for uncached IO
NAND Flash
Cell
Page/Read Block
Erase Block
Write but no overwrite
Wears with writes (erases)
Writing to the Flash
• Erase: set all bits to 1 (“1111111…”)
• Write: set some of the bits to 0 (“0100111…”)
• Change zero to one: impossible. Do Erase, then Write
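The erase/write rules above can be modeled in a few lines. This is a toy sketch only: the `EraseBlock` class, its size, and its method names are hypothetical, and real NAND programs whole pages inside much larger erase blocks.

```python
class EraseBlock:
    """Toy model of a NAND erase block (hypothetical, for illustration only)."""

    def __init__(self, size_bytes: int = 4):
        # A freshly erased block has every bit set to 1.
        self.cells = bytearray([0xFF] * size_bytes)

    def erase(self) -> None:
        """Erase: set all bits back to 1."""
        for i in range(len(self.cells)):
            self.cells[i] = 0xFF

    def program(self, data: bytes) -> None:
        """Write: may only change bits from 1 to 0."""
        if len(data) != len(self.cells):
            raise ValueError("data must match the block size")
        for current, wanted in zip(self.cells, data):
            # Any bit that is 0 now but 1 in the new data would need 0 -> 1.
            if wanted & ~current & 0xFF:
                raise ValueError("cannot change 0 to 1: erase the block first")
        for i, wanted in enumerate(data):
            self.cells[i] &= wanted


block = EraseBlock(size_bytes=1)
block.program(b"\x4f")      # 1 -> 0 transitions only: allowed
try:
    block.program(b"\xff")  # would need 0 -> 1: rejected
except ValueError as err:
    print(err)
block.erase()               # erase first, then the write succeeds
block.program(b"\xff")
```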
Flash Storage Design
Cache
Battery/Super Capacitor
Controller + Complex Firmware
Built-in Parallelism
Flash Controller Tasks
Write wear leveling
Garbage collection
Error correction
Bad block mapping
Read scrubbing
Read disturb management
Encryption
Flash Properties
Lots of IOs per device! (100K+)
Less random IO penalty
Writes more expensive than reads (but can be faster)
Lifetime limited by amount of writes
Limited retention
Concurrent execution on single device
Fast write acknowledgement (safe or not)
Can burst writes
Evaluation
Performance changes over time
Empty space matters
Complex internals
Watch stability carefully
How Flash Fails
End of life (EOL) is clearly defined by the amount written (but devices can often handle a lot more)
One day… it’s gone
“Power Loss Protection”
Internal ECC and redundancy
To RAID or Not to RAID?
More valuable for consumer grade
Watch for good flash support
RAID controller logic may slow things down
Use a redundant array of inexpensive servers instead?
Redundancy
Device internal redundancy
Hardware RAID
Software RAID
Filesystem “RAID”
Database History
Most were designed in the HDD era
Optimize for sequential IO
Count on cheap sequential writes
RAID, BBU to improve performance
Warmup
Much faster warmup times
Even if the database fits in memory, SSD might be justified
Tolerate more IO bound load
• HDD: 5ms per IO. Can do 20 IOs (non-parallel) within a 100ms response time
• Flash: 0.1ms per IO. Can do 1000 IOs (non-parallel) within a 100ms response time
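The arithmetic behind these figures is a simple division of the response-time budget by the per-IO latency. A minimal sketch, with a hypothetical function name and latencies expressed in microseconds to keep the math in integers:

```python
def serial_ios_within_budget(io_latency_us: int, budget_us: int = 100_000) -> int:
    """How many back-to-back (non-parallel) IOs fit into a response-time budget."""
    return budget_us // io_latency_us


# Per-IO latencies from the slide: 5 ms for HDD, 0.1 ms for flash.
for name, latency_us in (("HDD", 5_000), ("Flash", 100)):
    print(name, serial_ios_within_budget(latency_us))
```

A query that must issue its IOs one after another can therefore touch about 50 times more uncached pages on flash within the same 100ms budget.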
Endurance Math
HGST FlashMax III 2200GB
• 4400GB/day over 5 years
• 1400MB/sec peak writes
• 66 days at peak write throughput
Crucial M500 960GB
• 72TB total lifetime writes
• 400MB/sec write
• 52 hours at peak write throughput
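The endurance figures can be checked by dividing the rated lifetime writes by the peak write rate. The sketch below assumes decimal units (1 GB = 1000 MB, 1 TB = 1,000,000 MB) and 365-day years; with those assumptions it gives roughly 66 days for the FlashMax III and roughly 50 hours for the M500, close to the slide's 52 hours, which presumably used a slightly different unit convention.

```python
def seconds_at_peak(lifetime_writes_mb: float, peak_mb_per_sec: float) -> float:
    """Time needed to consume the rated write endurance at peak write speed."""
    return lifetime_writes_mb / peak_mb_per_sec


# HGST FlashMax III 2200GB: 4400 GB/day over 5 years, 1400 MB/sec peak writes.
hgst_lifetime_mb = 4400 * 1000 * 365 * 5          # assumption: decimal GB, 365-day years
hgst_days = seconds_at_peak(hgst_lifetime_mb, 1400) / 86_400

# Crucial M500 960GB: 72 TB total lifetime writes, 400 MB/sec write.
crucial_lifetime_mb = 72 * 1000 * 1000            # assumption: decimal TB
crucial_hours = seconds_at_peak(crucial_lifetime_mb, 400) / 3_600

print(f"HGST FlashMax III: {hgst_days:.1f} days at peak")
print(f"Crucial M500: {crucial_hours:.1f} hours at peak")
```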
“Torn Page” problem
Flash can avoid this with little cost due to internal design
FusionIO NVMFS (Atomic Writes)
Copy-on-Write File Systems: ZFS, Btrfs
Filesystem-level data journaling is less preferred: data=journal for ext4
skip-innodb-doublewrite
Fast IO Path
Bypass Caching (O_DIRECT)
Native Asynchronous IO
Efficient Checksumming
innodb_checksum_algorithm=crc32
innodb_flush_method=O_DIRECT
IO Cost Accounting
Sequential vs Random IO balance
IO vs CPU Balance
Smaller page sizes might make sense: innodb_page_size=4K
Less Merging on Flushing
Do not assume that flushing multiple sequential dirty pages costs the same as flushing one
innodb_flush_neighbors=0
Less Space on Disk
InnoDB Compression (2x typical)
TokuDB Compression (5-10x typical)
Archiving data off OLTP System
Fewer Writes on Flash
Hybrid Flash/HDD System
Transactional logs and other logs on HDD with RAID and BBU
Small temporary objects on tmpfs
innodb_log_file_size=<LARGE>
Other Thoughts
Host hardware and OS matter, especially with high end flash
Virtualization has higher relative overhead
Network has higher relative overhead
Peter Zaitsev
@PeterZaitsev https://www.linkedin.com/in/peterzaitsev
Thank You!