Embedded System Lab. HotStorage'20
A New LSM-style Garbage Collection Scheme for ZNS SSDs
2020. 07. 13
Presentation by Choi, Gunhee
Gunhee Choi, Kwanghee Lee, Myunghoon Oh, and Jongmoo Choi (Dankook University); Jhuyeong Jhin and Yongseok Oh (SK hynix)
12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage '20), 2020
Content
1. Zoned Namespace SSD
2. Motivation
3. LSM-ZGC Design
4. Evaluation
5. Conclusions
- Appendix
1. Zoned Namespace SSD
• What are next-generation SSDs?
[Figure: Traditional SSD vs. Open-Channel SSD, Key-Value SSD, Zoned Namespace SSD, Optane SSD]
• What is ZNS SSD?
[Figure: a Traditional SSD maps a single LBA space onto NAND; a Zoned Namespace SSD divides the LBA space into Zone 1, Zone 2, Zone 3, … on top of NAND]
✓ Benefits
• Better performance and WAF by distributing different workloads into different zones
• Better isolation (I/O determinism)
• Reduced DRAM usage and over-provisioning area in SSDs
• What are the issues of ZNS SSD?
✓ Sequential write constraint: writes must be issued sequentially, as on SMR drives.
✓ The host needs to control zones directly: zone open, close, and reset, as well as zone garbage collection.
[Figure: applications on the host access the ZNS SSD controller as a block device with no FTL. Zones are laid out as Zone 1, Zone 2, …, Zone N. Inside a zone, the write pointer marks the writing progress direction; a region is labeled "unable to write until zone reset".]
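The two issues above can be illustrated with a toy model of a single zone. This is a minimal sketch under stated assumptions, not the ZNS command interface: the `Zone` class, its `write` and `reset` methods, and the block-granular addressing are all hypothetical names chosen for illustration.

```python
class Zone:
    """Toy model of one zone; sizes are counted in blocks (hypothetical API)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.write_pointer = 0                  # next block that may be written
        self.blocks = [None] * capacity_blocks

    def write(self, block_index, data):
        # Sequential write constraint: only the block at the write pointer is writable.
        if block_index != self.write_pointer:
            raise ValueError("non-sequential write rejected")
        if self.write_pointer >= self.capacity:
            raise ValueError("zone is full; reset required")
        self.blocks[block_index] = data
        self.write_pointer += 1

    def reset(self):
        # Zone reset rewinds the write pointer and discards the contents.
        self.write_pointer = 0
        self.blocks = [None] * self.capacity
```

In this model an in-place update of block 0 fails once the write pointer has moved on, so the host has to write the new version elsewhere and later reclaim the stale copy, which is where zone garbage collection comes in.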
2. Motivation
• How much is the Zone Garbage Collection (hereafter ZGC) overhead?
✓ Using a real ZNS SSD prototype (SK hynix prototype ZNS SSD, ref: https://news.skhynix.co.kr/1915)
✓ Zone size: 1GB (note that the typical segment size in LFS is 2MB)
Basic Zone Garbage Collection (Basic_ZGC)
[Figure: a zone bitmap kept in memory tracks valid blocks for Zone 0, …, Zone 100, …]
Step 1. Select a candidate zone (Greedy or CB)
Step 2. Find valid blocks using the zone bitmap
Step 3. Read valid data in 4KB (or larger) I/O size
Step 4. Write data in 4KB (or larger) I/O size
Step 5. Reset the selected zone
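The five steps can be sketched over plain in-memory lists. This is a minimal sketch, not the authors' implementation: `basic_zgc`, the dictionary layout, and the use of a Python list as the destination zone are assumptions made for illustration, and only the greedy policy is shown for Step 1.

```python
def basic_zgc(zones, bitmaps, free_zone):
    """Sketch of the five Basic_ZGC steps (all names are hypothetical).

    zones:     dict zone_id -> list of block payloads (one entry per 4KB block)
    bitmaps:   dict zone_id -> list of bools (True = valid block)
    free_zone: list that receives the copied blocks (stands in for an open zone)
    Returns the victim zone id and the number of block-sized I/Os issued.
    """
    # Step 1: greedy policy, pick the zone with the fewest valid blocks.
    victim = min(bitmaps, key=lambda z: sum(bitmaps[z]))
    # Step 2: find valid blocks using the zone bitmap.
    valid_idx = [i for i, ok in enumerate(bitmaps[victim]) if ok]
    ios = 0
    for i in valid_idx:
        data = zones[victim][i]   # Step 3: one block-sized read per valid block
        free_zone.append(data)    # Step 4: one block-sized write per valid block
        ios += 2
    # Step 5: reset the selected zone (contents and bitmap are cleared).
    n = len(zones[victim])
    zones[victim] = [None] * n
    bitmaps[victim] = [False] * n
    return victim, ios
```

The I/O count grows with the number of valid blocks in the victim, which is the cost the following observations measure on the real prototype.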
Observation 1: Zone garbage collection overhead
• Zone: 1GB
• Block: 4KB
[Figure: zone garbage collection overhead chart; callouts: "5 times!" and "18 times!"]
☞ Motivation 1: reducing the utilization of a candidate zone is indispensable
Observation 2: I/O size for Read/Write
• Another feature of ZNS SSD
✓ A zone is, in general, mapped onto multiple channels/ways.
• Then, what about reading/writing data in a larger I/O size (e.g., 128KB)?
[Figure: read/write comparison across I/O sizes; callout: "11 times!"]
☞ Motivation 2: accessing data in a larger I/O size is beneficial in ZNS SSDs
3. LSM-ZGC Design
So, our ideas are:
1) Make the utilization of a candidate zone low
2) Access data in a larger I/O size
• How to access data in a larger I/O size?
✓ The coexistence of valid and invalid data makes it difficult
✓ Read not only valid but also invalid data in a larger I/O size
• How to make the utilization of a candidate zone low?
✓ Traditional hot/cold separation is not applicable in ZNS SSDs since a zone is quite big
✓ Employ the segment concept for finer-grained hot/cold separation
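A quick request-count calculation shows why reading invalid data along with valid data can still pay off. This is only arithmetic on the sizes quoted in the slides (1GB zone, 4KB block, 128KB large I/O); it counts requests and says nothing about measured time. The function names are hypothetical.

```python
ZONE_SIZE = 1 << 30     # 1GB zone
BLOCK = 4 << 10         # 4KB block
LARGE_IO = 128 << 10    # 128KB request


def read_requests_valid_only(utilization):
    # Basic approach: one 4KB read per valid block in the candidate zone.
    return int(utilization * (ZONE_SIZE // BLOCK))


def read_requests_whole_zone():
    # Large-I/O approach: read everything, valid or not, in 128KB requests.
    return ZONE_SIZE // LARGE_IO


for u in (0.1, 0.5, 0.9):
    print(u, read_requests_valid_only(u), read_requests_whole_zone())
```

The whole-zone count is fixed regardless of utilization, while the valid-only count scales with it, so the number of requests is lower for the large-I/O approach whenever more than 1/32 of the blocks are valid.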
• Two management units
✓ Zone: for garbage collection vs. Segment: for hot/cold separation
✓ A zone is divided into multiple segments (1GB vs. 2MB in this study)
• Segment state and transition rule (refer to our paper for details)
✓ New data → C0
✓ During ZGC, data surviving from C0:
▪ Data in a highly utilized segment (utilization > threshold_cold): cold → C1C
▪ Others: hot (or unknown) → C1H
▪ Reasoning: spatial locality, also observed in previous studies such as F2FS (FAST'15), Multi-stream (FAST'19), Key-range locality (FAST'20)
✓ During ZGC, data surviving from C1C or C1H (second-time survivors) → C2
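The transition rule can be written as a small function. This is a sketch of the rule as stated on the slide only; the function name `next_state` is hypothetical, the default threshold of 0.8 is taken from the evaluation parameters, and no rule is sketched for data surviving from C2 because the slides do not give one.

```python
def next_state(state, segment_utilization, threshold_cold=0.8):
    """Return the state assigned to data that survives a ZGC pass."""
    if state == "C0":
        # First survival: classify by the utilization of the source segment.
        return "C1C" if segment_utilization > threshold_cold else "C1H"
    if state in ("C1C", "C1H"):
        # Second survival, regardless of utilization.
        return "C2"
    raise ValueError(f"no transition sketched for state {state!r}")
```

For example, data surviving from a C0 segment that is 90% valid would be classified as cold, while data from a 50% valid segment would be treated as hot (or unknown).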
LSM (Log-Structured Merge) Zone GC
[Figure: Zone 0, Zone 1, Zone 2, Zone 3, …, Zone N-1, Zone N with a zone bitmap. A C0_zone is made of 2MB segments; its valid data is merged into a hot segment and a cold segment (2MB each), which go to C1H_zone and C1C_zone.]
Step 1. Select a candidate zone (or zones; Greedy or CB)
Step 2. Read all data in 128KB I/O size
Step 3-1. Check valid data
Step 3-2. Identify hot/cold segments
Step 4. Merge valid data only, according to hot/cold
Step 5. Write data in 128KB I/O size
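Steps 2 through 4 can be sketched for a single candidate C0 zone as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `lsm_zgc` and its parameters are hypothetical, the zone is a plain Python list, candidate selection (Step 1) and the final 128KB writes (Step 5) are left to the caller, and only the C0 classification rule is applied.

```python
def lsm_zgc(zone_blocks, zone_valid, blocks_per_segment, threshold_cold=0.8):
    """Sketch of LSM_ZGC Steps 2 to 4 for one candidate C0 zone.

    zone_blocks: list of block payloads for the whole zone
    zone_valid:  list of bools of the same length, the zone bitmap (True = valid)
    Returns (hot, cold): valid blocks destined for C1H_zone and C1C_zone.
    """
    hot, cold = [], []
    # Step 2: the whole zone is read (valid and invalid), so walk every segment.
    for start in range(0, len(zone_blocks), blocks_per_segment):
        seg_data = zone_blocks[start:start + blocks_per_segment]
        seg_valid = zone_valid[start:start + blocks_per_segment]
        # Step 3-1: check valid data with the bitmap.
        survivors = [d for d, ok in zip(seg_data, seg_valid) if ok]
        # Step 3-2: a highly utilized segment is treated as cold.
        utilization = sum(seg_valid) / len(seg_valid)
        # Step 4: merge valid data only, according to hot/cold.
        (cold if utilization > threshold_cold else hot).extend(survivors)
    return hot, cold
```

The caller would then append `hot` to the current C1H_zone and `cold` to the current C1C_zone in large sequential writes before resetting the candidate zone.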
4. Evaluation
Garbage collection overhead: uniform update pattern
Experimental environment
• Intel Core i7 (8 cores)
• 16GB DRAM
• 1TB ZNS SSD
• Size of zone: 1GB
[Figure: garbage collection overhead under the uniform update pattern; callouts: average of 1.9 times, max of 2.3 times]
Garbage collection overhead: skewed update pattern
Experimental environment
• Intel Core i7 (8 cores)
• 16GB DRAM
• 1TB ZNS SSD
• Size of zone: 1GB
Parameters
• Workload: 70/30 hot/cold ratio
• threshold_cold: 0.8
• Average utilization: x-axis
[Figure: garbage collection time (sec, 0 to 50) vs. utilization (0.5 to 0.9) for LSM_ZGC and Basic_ZGC; callouts: average of 1.4 times, max of 1.6 times]
Hot/Cold Separation
Parameters
• Workload: 70/30 hot/cold ratio
• threshold_cold: 0.8
• Average utilization: 0.6
✓ Without hot/cold separation
✓ With hot/cold separation
[Figure: two rows of three histograms, one row per case above; each histogram plots count vs. utilization (0.1 to 0.9) at garbage collection count 100, 500, and 900]
5. Conclusions
• Our contributions
▪ Observation: zone garbage collection really matters
▪ Proposal: a new LSM-style zone garbage collection scheme
▪ Evaluation: results based on a real implementation
• Future work
▪ We are currently extending F2FS on our ZNS SSD prototype
▪ Also, evaluating LSM ZGC under diverse workloads with different hot/cold ratios, data sizes, initial placements, and classification policies
Thank You!
Questions?
6. Appendix
Performance comparison using multi-thread & Scalability
[Figure: multi-thread performance comparison; callout: non-linear scalability]
• Worker only: 40
• With LSM_ZGC: 45
• With Basic_ZGC: 53
Sensitivity Analysis: various parameters
✓ Effect of threshold_cold (initial utilization: 0.6)
[Figure: histograms of count vs. utilization (0.1 to 0.9) for threshold 0.6, 0.8, and 0.9]
✓ Effect of initial utilization of a zone (threshold_cold: 0.8)
[Figure: histograms of count vs. utilization (0.1 to 0.9) for initial utilization 0.5, 0.6, 0.8, and 0.9]