Date post: 14-Jul-2020
FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs
Jian Huang, Anirudh Badam, Laura Caulfield, Suman Nath, Sudipta Sengupta, Bikash Sharma, Moinuddin K. Qureshi
Transcript
Page 1: FlashBlox: Achieving Both Performance Isolation and ...nvmw.ucsd.edu/nvmw2018-program/unzip/current/nvmw2018-paper… · FlashBlox: Achieving Both Performance Isolation ... Different

FlashBlox: Achieving Both Performance Isolation and Uniform Lifetime for Virtualized SSDs

Jian Huang†, Anirudh Badam, Laura Caulfield, Suman Nath, Sudipta Sengupta, Bikash Sharma, Moinuddin K. Qureshi

Flash Has Changed Over the Last Decade

Performance improvement: 100x lower latency, 5,000x higher throughput

Increased parallelism: dozens of parallel chips

Became a commodity: less than $0.2/GB

Significant improvements on Flash

Shared Flash-Based Solid State Disk (SSD) in the Cloud

SSDs are virtualized and shared in data centers

Performance Interference in Shared SSD

[Figure: a flash-based SSD as a black box: a Flash Translation Layer over multiple channels, each with multiple chips, receiving concurrent writes and reads.]

Read/write interference causes long (3x) tail latency!

FlashBlox: Hardware Isolation in Cloud Storage

[Figure: SSD with a Flash Translation Layer over multiple channels, each with multiple chips.]

Leveraging parallel chips for hardware isolation

Internal Parallelism Enables Hardware Isolation

[Figure: SSD internals: multiple channels, each with multiple chips, each chip with multiple planes.]

Channel-Level Parallelism

Chip-Level Parallelism

Plane-Level Parallelism

Plane-level parallelism is constrained, as each chip contains only one address buffer

Different parallelism levels provide different isolation guarantees

New Abstractions for Hardware Isolation

[Figure: SSD internals: multiple channels, each with multiple chips, each chip with multiple planes.]

Isolation level: High, Medium, Low

High: Virtual SSD (Channel Level)

Medium: Virtual SSD (Chip Level)

Low: Virtual SSD (Plane Level), software-based

Hardware Isolation Meets the Pay-As-You-Go Model in Cloud

[Figure: SSD internals partitioned into vSSD (Channel), vSSD (Chip), and vSSD (Software).]

[Table: Azure DocumentDB, Azure SQL Database, Amazon DynamoDB; rows: Throughput, Single Partition Size, Price; cell values not recoverable.]

Hundreds of vSSDs can be supported in a single server

Impact of Hardware Isolation on SSD Lifetime

[Figure: SSD with a Flash Translation Layer over multiple channels, each with multiple chips; label: Write Intensive.]

[Figure: the average rate at which flash blocks are erased; y-axis: average number of blocks erased per second, 0 to 4.5.]

Flash blocks wear out at different rates under different workloads

FlashBlox Challenges

[Figure: three apps and three chips; labels: SSD Lifetime, Performance Isolation.]

FlashBlox: Swapping Channels for Wear Balance

[Figure: used erase cycles for Channels 1 to 4.]

Adjusting the wear imbalance at a coarser time granularity can achieve near-ideal SSD lifetime

Swap candidates: the channel that has incurred the maximum wear-out, and the channel that has the minimum rate of wear-out

Channel migration takes 15 minutes, once per 19 days

Overall performance drops for only 0.04% of the time
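The choice of swap candidates named on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the FlashBlox implementation: the `Channel` record, its field names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """Hypothetical per-channel wear record (field names are illustrative)."""
    name: str
    erase_cycles: float  # total erase cycles consumed so far
    erase_rate: float    # recent erase cycles per day


def pick_swap_pair(channels):
    """Return (most worn channel, channel with the lowest wear rate)."""
    most_worn = max(channels, key=lambda c: c.erase_cycles)
    slowest = min(channels, key=lambda c: c.erase_rate)
    return most_worn, slowest


# Made-up wear statistics for a four-channel device.
channels = [
    Channel("channel-1", erase_cycles=900.0, erase_rate=40.0),
    Channel("channel-2", erase_cycles=300.0, erase_rate=12.0),
    Channel("channel-3", erase_cycles=120.0, erase_rate=2.0),
    Channel("channel-4", erase_cycles=450.0, erase_rate=20.0),
]

src, dst = pick_swap_pair(channels)
print(f"swap {src.name} with {dst.name}")
```

A real policy would also need to handle the case where both picks are the same channel, in which case no swap is needed.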

How Frequently Should We Swap?

[Figure: used erase cycles for Channels 1 to 4; one app adds M erase cycles to one channel per interval and moves to the next channel at each swap.]

Imbalance = MaxWear / AvgWear

Imbalance after each interval, first round: 4, 2, 4/3, 1

Imbalance after each interval, second round: 8/5, 4/3, 8/7, 1

How many times should we swap within the SSD lifetime?
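The imbalance sequence on this slide can be reproduced with a short simulation. The sketch below assumes a single write-heavy app that adds a fixed amount of wear (M, set to 1 here) to whichever channel hosts it and moves to the next channel after every interval; the function and parameter names are illustrative only.

```python
from fractions import Fraction


def imbalance_trace(num_channels, intervals, wear_per_interval=1):
    """Rotate one write-heavy app across channels and record MaxWear / AvgWear.

    Assumption: the app adds a fixed amount of wear (M) to the channel that
    hosts it during an interval, then moves to the next channel.
    """
    wear = [Fraction(0)] * num_channels
    trace = []
    for step in range(intervals):
        wear[step % num_channels] += wear_per_interval
        average = sum(wear) / num_channels
        trace.append(max(wear) / average)
    return trace


trace = imbalance_trace(num_channels=4, intervals=8)
print(", ".join(str(value) for value in trace))
# prints: 4, 2, 4/3, 1, 8/5, 4/3, 8/7, 1
```

Each completed round brings the imbalance back to 1, and the peak within a round shrinks as total wear accumulates, which is why a bounded number of swaps is enough.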

Quantifying the Swapping Frequency

After K rounds of cycling:

Wear Imbalance = (MK + M)/(MK + M/N) = (K + 1)/(K + 1/N) ≤ (1 + x)

where MK + M is the maximum wear-out and MK + M/N is the average wear-out

K ≥ (N - 1 - x) / (Nx)

If N = 16 and x = 0.1, then K ≈ 9, which means that after about NK ≈ 148 swaps the wear imbalance is guaranteed to be bounded by 1.1

For an SSD with a 5-year lifetime, swapping once every 12 days guarantees that the channels are well balanced in the worst case
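The bound can be checked numerically. In the sketch below, N is read as the number of channels and x as the tolerated excess imbalance; that reading, the function names, and the 365-day year are assumptions. Evaluated exactly, the formula gives K ≥ 9.3125, or roughly 149 swaps, in line with the rounded figures on the slide and with one swap about every 12 days over five years.

```python
def min_rounds(num_channels, slack):
    """Smallest K satisfying (K + 1) / (K + 1/N) <= 1 + x."""
    return (num_channels - 1 - slack) / (num_channels * slack)


def wear_imbalance(rounds, num_channels):
    """Worst-case MaxWear / AvgWear after `rounds` rounds plus one interval."""
    return (rounds + 1) / (rounds + 1 / num_channels)


N, x = 16, 0.1            # values from the slide
K = min_rounds(N, x)      # about 9.31 rounds
swaps = N * K             # about 149 swaps in total
lifetime_days = 5 * 365   # assumed 5-year lifetime, ignoring leap days

print(f"K >= {K:.4f}, swaps ~ {swaps:.0f}")
print(f"imbalance at K: {wear_imbalance(K, N):.3f}")
print(f"swap interval ~ {lifetime_days / swaps:.1f} days")
```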

Adaptive Wear Leveling in Practice

[Figure: used erase cycles for Channels 1 to 4, each serving one app; wear labels: M, M/3, M/20.]

Using erase rate as the trigger condition for swapping

Intra Channel Wear Leveling

[Figure: used erase cycles for Channels 1 to 4; label: Chip.]

Chips will be swapped along with the channel migration

Intra-chip wear leveling mechanisms +

FlashBlox Architecture

[Figure: layered architecture, top to bottom.]

Apps, each on its own Virtual SSD: Pay-As-You-Go Model in Cloud

Flash Resource Manager: Isolation, Bandwidth & Capacity Requirement (Virtual SSD to Parallel Chips Mappings)

Channel-Level Wear Leveling: Inter Channel Swapping

Chip-Level Wear Leveling: Intra Channel Swapping, Other FTL Algorithms

Channels

FlashBlox Experimental Setup

SSD configuration: 16 channels, 4 chips, 4 planes, 16 KB page size

14 data center workloads: Yahoo! Cloud Serving Benchmark (YCSB), Bing Search / Index / PageRank, Transactional Database, Azure Storage

Tail Latency Reduction with FlashBlox

App1 and App2 run the following YCSB workloads:

A: Session store recording recent actions

B: Photo tagging

C: User profile cache

D: User status update

E: Threaded conversations

F: User database

[Figure: 99th percentile latency (microseconds, 0 to 700) on the Yahoo! Cloud Serving Benchmark (YCSB) for workload pairs A+A, A+B, A+C, A+D, A+E, A+F; series: App1-Software Isolation, App1-FlashBlox, App2-Software Isolation, App2-FlashBlox.]

Tail latency reduction: 2.6x; average latency reduction: 1.4x
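For readers unfamiliar with the metric, the following sketch shows one common way to compute a 99th percentile latency (the nearest-rank method) next to the mean. The latency samples are invented for illustration and are not measurements from the talk; the helper name `percentile` is likewise an assumption.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in microseconds: 98 fast requests and 2 slow ones.
latencies_us = [100] * 98 + [650, 700]

p99 = percentile(latencies_us, 99)
mean = sum(latencies_us) / len(latencies_us)
print(f"p99 = {p99} us, mean = {mean:.1f} us")
```

A handful of slow requests barely moves the mean but dominates the 99th percentile, which is why tail latency is the more sensitive indicator of interference between co-located apps.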

Impact of Channel Migration on Application Performance

[Figure: Bing Search's performance during channel migration; latency (milliseconds, 0 to 0.6) over time (seconds); series: Without Migration, With Migration; annotation: 34%.]

Channel migration takes 15 minutes, once per 19 days

Overall performance drops for only 0.04% of the time

FlashBlox Summary

2.6x reduction in tail latency

Near-ideal SSD lifetime

Swap once per 19 days

Thanks!

Jian Huang†

[email protected]

Anirudh Badam, Laura Caulfield, Suman Nath, Sudipta Sengupta, Bikash Sharma, Moinuddin K. Qureshi

Q&A