1
NITRO: A CAPACITY-OPTIMIZED SSD CACHE FOR PRIMARY STORAGE
Cheng Li, Rutgers University; Philip Shilane, Fred Douglis, Hyong Shim, Stephen Smaldone,
and Grant Wallace, EMC Corporation
2014 USENIX Annual Technical Conference
Presented By: Nusrat Sharmin
2
PROBLEM BACKGROUND Requirements of primary storage customers
High IOPS, high throughput, low latency & low cost
High performance & cost efficiency conflict with each other
For example: SSDs support high IOPS with low latency but cost more than HDDs; HDDs have high capacity and low cost but limited IOPS & latency
Prior work used SSDs as a cache in front of HDDs & modified the SSD interface for caching purposes: improved performance, but the SSDs occupy a large portion of the total storage cost
3
INTRODUCTION TO NITRO Nitro – an SSD cache architecture that
Applies data reduction techniques to SSD caches to increase effective cache size and reduce SSD cost for a given system
Deduplication & compression are the primary strategies to achieve high space efficiency
Storage that combines deduplication & compression is referred to as capacity-optimized storage (COS)
4
MOTIVATION An analysis of the deduplication patterns of
primary storage traces & the properties of local compression
Minimization of deduplication overheads such as in-memory indices
A cache replacement policy that tracks the status of WEUs instead of
individual compressed extents
5
MAIN CONTRIBUTIONS Propose Nitro – an SSD cache that utilizes
deduplication, compression, and a large replacement unit to accelerate primary I/O
Investigate trade-offs between deduplication, compression, RAM requirements,
performance & SSD lifespan
Experiment with both COS & TPS prototypes to validate Nitro's performance improvements
6
POTENTIAL BENEFITS OF ADDING DEDUPLICATION & COMPRESSION
A cache has the potential to capture a significant fraction
of the available deduplication
A deduplicated SSD cache accelerates the performance of primary storage
Compression increases the effective cache capacity
Packing compressed extents into WEUs greatly benefits caching while decreasing SSD erasures
7
APPROPRIATE STORAGE LAYER FOR NITRO
Two main locations for a server-side cache:
At the highest layer of the storage stack, right after processing the storage protocol: the server's first opportunity to cache data, close to the client, which minimizes latency
Post-deduplication within the system: existing functionality can be reused, but some mechanism must provide a fingerprint for every cache read
File recipe: a structure mapping from file offset to fingerprint (see the sketch below)
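A minimal sketch (in Python, with names and an extent size I chose for illustration, not taken from the paper) of a file recipe as a mapping from file offset to extent fingerprint:

```python
import hashlib

EXTENT_SIZE = 8 * 1024  # assumed extent size, for illustration only

def build_recipe(file_data: bytes) -> dict:
    """Map each aligned file offset to the fingerprint of its extent."""
    recipe = {}
    for offset in range(0, len(file_data), EXTENT_SIZE):
        extent = file_data[offset:offset + EXTENT_SIZE]
        recipe[offset] = hashlib.sha1(extent).hexdigest()  # content fingerprint
    return recipe

# A post-deduplication cache read at a given offset first looks up the
# fingerprint here, then consults the fingerprint index (not shown).
recipe = build_recipe(b"example file contents " * 1000)
print(recipe[0])
```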
8
NITRO ARCHITECTURE Three layers:
Bottom layer: COS or TPS HDD systems provide large capacity
Middle layer: SSDs used to accelerate performance
Upper layer: in-memory structures for managing the SSD & memory caches
9
NITRO ARCHITECTURE Conceptually divided into two halves:
Top half: the CacheManager, which manages the cache infrastructure
A file index: maps the file system interface (file handle and offset) to internal SSD locations
A fingerprint index: detects duplicate content before it is written to the SSD
Dirty list: tracks dirty data for write-back mode
NVRAM: sits in front of the cache to buffer pending writes and to support write-back caching (check-pointing and journaling of the dirty list)
Lower half: implements SSD caching (the top-half structures are sketched below)
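A minimal sketch (Python, names are my own) of the CacheManager's top-half structures as plain dictionaries and lists; the paper's actual structures are more compact and RAM-optimized:

```python
from dataclasses import dataclass, field

@dataclass
class WeuLocation:
    """Where a compressed extent lives inside the SSD cache."""
    weu_id: int       # which write-evict unit holds the extent
    offset: int       # byte offset within that WEU
    length: int       # compressed length of the extent
    generation: int   # WEU generation number, used for validity checks

@dataclass
class CacheManagerState:
    # file index: (file handle, file offset) -> WeuLocation
    file_index: dict = field(default_factory=dict)
    # fingerprint index: extent fingerprint -> WeuLocation (enables deduplication)
    fingerprint_index: dict = field(default_factory=dict)
    # dirty list: compact extent locations awaiting write-back (journaled in NVRAM)
    dirty_list: list = field(default_factory=list)

state = CacheManagerState()
state.fingerprint_index["fp-example"] = WeuLocation(0, 0, 4096, generation=1)
```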
10
NITRO ARCHITECTURE Nitro components:
Extent: the basic unit of data from a file
Write-Evict Unit (WEU): the unit of replacement (writing and evicting) for the SSD; file extents are compressed & packed together
File index: a mapping from file handle and offset to an
extent's location in a WEU
11
NITRO ARCHITECTURE An extent's location consists of the WEU ID number, the offset within the WEU & the amount of
compressed data
Fingerprint index: implements deduplication and increases the effective cache capacity
Recipe cache: represents a file as a sequence of fingerprints referencing extents
Dirty list: a compact list of extent locations, maintained in NVRAM for consistent logging; dirty extents are written to the storage system either when they
are evicted from the SSD or when they reach a size watermark (WEU packing and extent locations are sketched below)
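A minimal sketch (Python; zlib stands in for the LZ4 compression used in the paper, and the WEU size and names are assumptions) of packing compressed extents into a fixed-size WEU and recording each extent's (WEU ID, offset, compressed length) location:

```python
import zlib

WEU_SIZE = 1 * 1024 * 1024  # assumed WEU size, for illustration only

class Weu:
    """An in-memory write-evict unit being filled with compressed extents."""
    def __init__(self, weu_id):
        self.weu_id = weu_id
        self.buffer = bytearray()
        self.header = []  # one (fingerprint, offset, length) record per packed extent

    def try_append(self, fingerprint, extent):
        """Compress an extent and pack it; return its location, or None if the WEU is full."""
        compressed = zlib.compress(extent)  # the paper uses LZ4; zlib stands in here
        if len(self.buffer) + len(compressed) > WEU_SIZE:
            return None  # caller seals this WEU, writes it to SSD, and starts a new one
        offset = len(self.buffer)
        self.buffer.extend(compressed)
        self.header.append((fingerprint, offset, len(compressed)))
        return (self.weu_id, offset, len(compressed))  # WEU ID, offset, compressed length

weu = Weu(weu_id=0)
print(weu.try_append("fp-0", b"some extent data " * 256))
```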
12
NITRO FUNCTIONALITY File read path
Read requests check the file index based on file handle and offset
If there is a hit in the file index:
The CacheManager reads the compressed extent from the WEU, decompresses it, and updates the LRU status of the WEU
When reading a duplicate entry, the CacheManager confirms validity with the WEU generation number
An auxiliary structure tracks whether each WEU is in memory or on SSD
If there is a miss in the file index, the CacheManager prefetches the file recipe into the recipe cache and checks the fingerprint index
The read is serviced from HDD if the request misses in both the file & fingerprint indices (see the read-path sketch below)
File write path: extents are buffered in NVRAM and passed to the CacheManager for asynchronous SSD caching
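A minimal sketch (Python, hypothetical helper names, dictionaries in place of the paper's indices, zlib in place of LZ4) of the read path: a file-index hit, then a fingerprint-index hit via the recipe, then fall back to HDD. The read_from_hdd callback and the weus mapping are assumptions for illustration:

```python
import zlib

def read_extent(fh, offset, file_index, fp_index, recipe, weus, read_from_hdd):
    """Serve one read: file-index hit, else fingerprint-index hit via the recipe, else HDD.
    (LRU updates and generation-number validity checks are omitted for brevity.)"""
    loc = file_index.get((fh, offset))
    if loc is not None:                                   # hit in the file index
        weu_id, weu_off, length = loc
        return zlib.decompress(bytes(weus[weu_id][weu_off:weu_off + length]))
    fp = recipe.get(offset)                               # prefetched recipe: offset -> fingerprint
    if fp is not None and fp in fp_index:                 # duplicate already cached
        loc = fp_index[fp]
        file_index[(fh, offset)] = loc                    # add a duplicate file-index entry
        weu_id, weu_off, length = loc
        return zlib.decompress(bytes(weus[weu_id][weu_off:weu_off + length]))
    return read_from_hdd(fh, offset)                      # miss in both indices
```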
13
NITRO FUNCTIONALITY Cache insertion path
Follows 6 steps, as shown in Figure 3 of the paper (and sketched below):
1. Hash a new extent to create a fingerprint
2. Check the fingerprint against the FP index. If the fingerprint is in the index, update the appropriate LRU status and go to step 5
3. Compress and append the extent to a WEU & update the WEU header
4. Update the FP index to map from the fingerprint to the WEU location
5. Update the file index to map from file handle and offset to the WEU location. The first entry for the cached extent is marked as a "Base" entry
6. When an in-memory WEU becomes full, increment the generation number and write it to the SSD
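A minimal sketch (Python, all names and sizes assumed, zlib in place of LZ4) of the six insertion steps; the real Nitro also updates LRU status on duplicate hits and distinguishes Base entries, which are only hinted at in comments here:

```python
import hashlib, zlib

WEU_SIZE = 1 * 1024 * 1024            # assumed WEU size
file_index, fp_index = {}, {}         # (file handle, offset) -> loc ; fingerprint -> loc
weu, weu_id, generation = bytearray(), 0, 0

def insert_extent(fh, offset, extent):
    global weu, weu_id, generation
    fp = hashlib.sha1(extent).hexdigest()        # step 1: fingerprint the new extent
    loc = fp_index.get(fp)                       # step 2: check the FP index (a dedup hit skips to step 5)
    if loc is None:
        data = zlib.compress(extent)             # step 3: compress & append to the in-memory WEU
        loc = (weu_id, len(weu), len(data))
        weu.extend(data)
        fp_index[fp] = loc                       # step 4: FP index maps fingerprint -> WEU location
    file_index[(fh, offset)] = loc               # step 5: file index entry ("Base" flag not shown)
    if len(weu) >= WEU_SIZE:                     # step 6: WEU full -> bump generation number,
        generation += 1                          #         write the WEU to SSD (not shown),
        weu, weu_id = bytearray(), weu_id + 1    #         and start filling a new one

insert_extent("fh1", 0, b"example extent " * 512)
```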
14
NITRO FUNCTIONALITY SSD cache replacement policy
Selects a WEU from the SSD to evict before reusing that space for a newly packed WEU
The CacheManager initiates cache replacement by migrating dirty data from the selected WEU to disk storage and removing the corresponding invalid entries from the file & FP indices (sketched below)
Cleaning the file index
Uses asynchronous cleaning by background threads to remove invalid, duplicate file index entries
If an entry is accessed by a client before it is cleaned, a generation number mismatch indicates the entry can be removed
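A minimal sketch (Python, names and policy details assumed) of WEU-granularity eviction: pick the least-recently-used WEU, write back its dirty extents, and drop index entries that point into it. The write_back_to_disk callback is an assumption for illustration:

```python
from collections import OrderedDict

weu_lru = OrderedDict()   # weu_id -> {"dirty": [extent locations]}, least recently used first
file_index = {}           # (file handle, offset) -> (weu_id, weu_offset, length)
fp_index = {}             # fingerprint -> (weu_id, weu_offset, length)

def evict_one_weu(write_back_to_disk):
    """Evict the least-recently-used WEU and return its id so the space can be reused."""
    victim_id, info = weu_lru.popitem(last=False)          # pick the LRU WEU
    for loc in info["dirty"]:                              # migrate dirty extents to disk first
        write_back_to_disk(loc)
    for index in (file_index, fp_index):                   # drop entries pointing into the victim
        stale = [key for key, loc in index.items() if loc[0] == victim_id]
        for key in stale:
            del index[key]
    return victim_id

weu_lru[7] = {"dirty": []}
print(evict_one_weu(lambda loc: None))   # -> 7
```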
15
NITRO FUNCTIONALITY Faster snapshot restore/access
When reading a snapshot, its recipe is prefetched from disk into the recipe cache
Using the FP index, duplicate reads access extents already in the cache
Any extents shared between the primary and snapshot versions can be reused
System restart
To accelerate cache warming, a system restart/crash recovery technique is used
A journal tracks the dirty and invalid status of extents
When recovering from a crash, the CacheManager reads the journal and
the WEU headers from SSD (faster than reading all extent headers), and recreates the indices (sketched below)
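A minimal sketch (Python, hypothetical header layout) of rebuilding the fingerprint index from per-WEU headers after a restart; the real recovery also consults the journal for the dirty/invalid status of extents:

```python
def rebuild_fp_index(weu_headers):
    """weu_headers: weu_id -> list of (fingerprint, offset, length) read back from the SSD."""
    fp_index = {}
    for weu_id, entries in weu_headers.items():
        for fingerprint, offset, length in entries:
            fp_index[fingerprint] = (weu_id, offset, length)
    return fp_index

# One header read per WEU is enough to recreate the index, which is why recovery
# is faster than scanning every individual extent header on the SSD.
headers = {0: [("fp-a", 0, 4096), ("fp-b", 4096, 2048)]}
print(rebuild_fp_index(headers))
```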
16
NITRO IMPLEMENTATION Developed a simulator & two prototypes; the CacheManager is shared between implementations
while the storage components differ
The simulator measures read-hit ratios and SSD
churn, and its disk stub generates synthetic content based on fingerprints
The prototypes measure performance and use real SSDs and HDDs
17
NITRO IMPLEMENTATION Potential SSD customization
Presents WEU-LRU, an update to the greedy SSD GC algorithm that replaces whole WEUs
Added the SATA TRIM command to the simulator, which invalidates a range of SSD logical addresses
The SSD simulator is based on well-studied simulators with a hybrid mapping scheme where blocks are categorized into data and log blocks
18
NITRO IMPLEMENTATION Prototype system
Implemented in user space, leveraging multi-threading & asynchronous I/O to increase parallelism while replaying storage traces
Uses real SSDs and either a COS or TPS system with hard drives for storage
Evicted dirty extents are moved to a write queue & written to disk storage before the corresponding WEU is replaced
19
EXPERIMENTAL METHODOLOGY Metrics
IOPS: input/output operations per second
Read-hit ratio: the ratio of read I/O requests satisfied by
Nitro over total read requests
Read response time (RRT): the average elapsed time
from the dispatch of one read request to when it finishes; it also characterizes the user-perceivable latency
SSD erasures: the number of SSD blocks erased, which counts against SSD lifespan
Deduplication & compression ratio: the ratio of the original data size to the size after deduplication or compression (≥ 1X), as computed below
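A small sketch (Python, trivially simplified) of how the ratio-style metrics follow from the definitions above:

```python
def read_hit_ratio(reads_hit_in_nitro, total_reads):
    """Fraction of read requests satisfied by the Nitro cache."""
    return reads_hit_in_nitro / total_reads

def reduction_ratio(original_bytes, reduced_bytes):
    """Deduplication or compression ratio; >= 1X means space was saved."""
    return original_bytes / reduced_bytes

print(read_hit_ratio(800, 1000))             # 0.8
print(f"{reduction_ratio(100, 40):.1f}X")    # 2.5X
```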
20
EXPERIMENTAL METHODOLOGY Experimental traces
FIU traces: including WebVM (a VM running two web servers), Mail (an
email server with small I/Os), and Homes (a file server with a large fraction of random writes)
Boot-storm trace: many VMs booting up within a short time frame from the
same storage system
Restore trace:
To study snapshots, 100 daily snapshots of a 38GB workstation VM with a median overwrite rate of 2.3% are used
Large read I/Os are issued while restoring the entire VM
21
EXPERIMENTAL METHODOLOGY
Fingerprint generation: because extent sizes vary, extent fingerprints must be generated with a multi-pass algorithm (a simplified sketch follows)
Synthetic compression information: LZ4 is used for compression & decompression
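A minimal sketch (Python) of one simple way to derive an extent fingerprint when only per-block content hashes are available from a trace; this is an assumption-laden stand-in, not the paper's multi-pass algorithm, and the block size and names are my own:

```python
import hashlib

BLOCK = 4096  # assumed trace block granularity

def extent_fingerprint(block_hashes, offset, length):
    """Combine the per-block content hashes covering [offset, offset+length)
    into a single extent fingerprint (a stand-in for the paper's multi-pass algorithm)."""
    h = hashlib.sha1()
    for off in range(offset, offset + length, BLOCK):
        h.update(block_hashes[off].encode())
    return h.hexdigest()

blocks = {0: "hash-of-block-0", 4096: "hash-of-block-1"}
print(extent_fingerprint(blocks, 0, 8192))
```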
22
EXPERIMENTAL METHODOLOGY Parameter space
The configuration space is given in Table 1 of the paper, with default values
in bold
The notation used:
Deduplicated (D), Non-Deduplicated (ND), Compressed (C), Non-Compressed (NC); WEU(D,C) is the default
configuration
23
EXPERIMENTAL METHODOLOGY Experimental platform
The COS system is a server with 2.33GHz Xeon CPUs (two sockets), 36GB of DRAM, 960MB of NVRAM, and two shelves of hard drives
One shelf: 12 1TB 7200RPM hard drives; the other: 15 2TB 7200RPM hard drives
The RAID-6 configuration includes 2 spare disks for each shelf
The TPS system is a server with four 1.6GHz Xeon CPUs and 8GB of DRAM with battery protection
24
EXPERIMENTAL METHODOLOGY
The TPS system has 11 1TB 7200RPM disk drives in a RAID-5 configuration
Both prototypes use a Samsung 256GB SSD on a SATA-2 controller
SSD simulation parameters are set based on a
Micron MLC SSD specification
25
EVALUATION
Simulation Results:
Read-hit Ratio
26
EVALUATION
Simulation Results:
Impact of fingerprint index ratio
27
EVALUATION
Simulation Results:
WEU vs. SSD co-design
28
EVALUATION
Prototype System Results:
Performance in TPS System & Performance in COS System
29
EVALUATION
Prototype System Results:
Sensitivity Analysis
30
EVALUATION
Prototype System Results:
Nitro Overheads
31
NITRO ADVANTAGES As Nitro effectively expands the cache:
Improved random read performance in aged COS systems
Faster snapshot restore performance
Write reductions to the SSD extend its lifespan
32
NITRO ADVANTAGES
33
RELATED WORK SSD as storage or cache
Intel Turbo Memory: uses a non-volatile disk cache to enable fast start-up
Another system, by Kgil et al., splits a flash cache into separate read & write regions and uses a programmable flash memory controller
SDF: provides hardware/software co-designed storage to exploit flash
performance
FlashTier: redesigns the SSD to support caching instead of storage and introduces silent eviction
34
RELATED WORK Deduplication and compression in SSDs
iDedup: deduplicates primary workloads to balance performance & capacity
savings
ChunkStash: keeps a fingerprint index in flash while the actual data resides on disk
Dedupv1: improves inline deduplication by leveraging the high random read performance of SSDs
CAFTL: achieves best-effort deduplication using the SSD FTL
Feng and Schindler found that VDI and long-term CIFS workloads can be deduplicated with a small SSD cache
SAR: selective caching schemes for restoring from deduplicated storage
35
CONCLUSION Nitro focuses on
Improving storage performance with a capacity-optimized SSD cache that uses deduplication and compression
To deduplicate the SSD cache, a fingerprint index is used; to maintain deduplication while reducing RAM requirements and
to support the variable-sized extents that result from compression, Nitro relies on Write-Evict Units, which pack extents together and maximize the cache hit-ratio while extending SSD lifespan
The impact of various design trade-offs on overall performance is analyzed, including cache size, fingerprint index size, RAM usage, and SSD erasures
The evaluation shows that "Nitro can improve performance in both COS & TPS systems"
36
THANK YOU