
Enterprise MLC NAND Industry Comparison

Gary Tressler, Dustin Vanstee and Tom Griffin, IBM Corporation
Santa Clara, CA, August 2011
Transcript
Page 1: Enterprise MLC NAND Industry Comparison


Page 2: Enterprise MLC NAND Industry Comparison

Agenda

• Introduction
• Goals
• Methodology
• Platform
• Program-Erase Cycling
• High Temperature Data Retention
• Summary

Page 3: Enterprise MLC NAND Industry Comparison

Flash Characterization Goals

[Figure: Example of RBER variation across Flash blocks. Axes: raw bit error rate, cycles (k).]

Goals

Compare industry Enterprise MLC NAND devices
• Multiple suppliers
• Endurance/retention BER envelope
• Program-erase cycling temperature sensitivity
• Dwell time sensitivity
• UBER analysis

Summarize relationship of characterization variables relative to Raw Bit Error Rate (RBER)

Attempt to understand Flash impacts on SSD usable life
• Use characterization data to interpolate for different usage scenarios
• Use characterization data to extrapolate to usage scenarios that are time prohibitive to directly measure

Page 4: Enterprise MLC NAND Industry Comparison

Why Does Flash Bit Error Rate Matter for SSDs?

In enterprise applications, Flash is generally written at a much higher rate than in the client space

Flash cells degrade due to the large voltage required to program/erase the devices and due to the presence of defects
• After many program/erase cycles it is not always possible to read back the data stored, due to physical wearout of the device cell

To overcome this effect, SSD controllers implement various ECC and recovery mechanisms to mitigate bit errors
• ECC and recovery schemes can only protect to a limit

From an SSD perspective, validating this limit is a time-consuming process; it will likely take years to understand without acceleration

Flash characterization requires testing under different environmental and usage conditions, and gathering Raw Bit Error Rate (RBER) statistics to evaluate the Flash robustness

Page 5: Enterprise MLC NAND Industry Comparison

Flash Characterization Methodology

Approach
• Dwell time = 30s, 60s, 120s, 240s, 480s
• Program-erase cycling = 15K, 30K
• Cycling temperature = 25C, 55C, 85C
• Bake temperature = 100C
• Bake hours = 0, 1, 2, 3, 4, 15, 26 hrs

Program-Erase Cycling and Data Retention Flow Chart

• Test Configuration: select vendor, cycling temp, and block set to test
• PE Cycling at controlled temp with intermediate data readouts
• Initial Data Readout immediately after PE cycling complete. Data readout at room temp
• Flash Bake: N hrs at 100C
• Data Readout at room temp
• Repeat sequence until data retention testing is complete

Phases: PE Cycling Phase (weeks to months), Data Retention Phase (hours)

PE Cycling Data Set Matrix (dwell times in seconds*)

PE cycling temp   Vendor A                Vendor B                Vendor C
25C               30, 60, 120, 240, 480   30, 60, 120, 240, 480   30, 60, 120
55C               30, 60, 120, 240, 480   30, 60, 120, 240, 480   x
85C               30, 60, 120, 240, 480   30, 60, 120, 240, 480   x

Data Retention Data Set Matrix (dwell times in seconds*)

PE cycling temp   Vendor A      Vendor B      Vendor C
25C               30, 60, 120   30, 60, 120   30, 60, 120
55C               30, 60, 120   30, 60, 120   x
85C               30, 60, 120   30, 60, 120   x

* Dwell times applied noted under each supplier
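The PE cycling data set matrix can be written down as a small data structure to make the set of test conditions explicit. The sketch below is only an illustration: it assumes the first matrix on the slide (the one with dwell times up to 480s) is the PE cycling matrix, and that an "x" cell means no data set for that combination. The names `PE_CYCLING_MATRIX` and `enumerate_conditions` are hypothetical and do not come from the presentation.

```python
# Hypothetical encoding of the PE cycling data set matrix from the slide.
# Keys are PE cycling temperatures in C; values map vendor -> dwell times in s.
# An empty list stands for an "x" cell (assumed: no data set for that cell).
FULL = [30, 60, 120, 240, 480]
SHORT = [30, 60, 120]

PE_CYCLING_MATRIX = {
    25: {"A": FULL, "B": FULL, "C": SHORT},
    55: {"A": FULL, "B": FULL, "C": []},
    85: {"A": FULL, "B": FULL, "C": []},
}


def enumerate_conditions(matrix):
    """Yield (vendor, cycling_temp_c, dwell_s) for every populated cell."""
    for temp_c, by_vendor in sorted(matrix.items()):
        for vendor, dwells in sorted(by_vendor.items()):
            for dwell_s in dwells:
                yield (vendor, temp_c, dwell_s)


conditions = list(enumerate_conditions(PE_CYCLING_MATRIX))
print(len(conditions), "PE cycling conditions")
```

Under these assumptions the enumeration gives one tuple per vendor, temperature and dwell time combination, which is a convenient unit for scheduling block sets on a test card.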

Page 6: Enterprise MLC NAND Industry Comparison

Flash Characterization Platform

FPGA Evaluation Board
• Embedded processor
• Drives signaling and test data to Flash test cards
• Summarizes test responses and transmits via serial port to PC

Logic
• Cross-device sampling
• Hides tPROG and tBERS
• Optimizes test card bandwidth
• Nested cycling permits multiple dwell times

[Figure: platform block diagram. Labels: PC Workstation; Serial Port Interface; Xilinx Test Board; Flash Device Test Card (Asynchronous), Flash Array 8 Sites / 8 High Stack; Flash Device Test Card (Toggle/ONFI), Flash Array 4 Sites / 8 High Stack; 25C, 55C, 85C PE Cycling.]

Page 7: Enterprise MLC NAND Industry Comparison

Raw Program-Erase Cycling @ 25C: No Data Retention, Dwell Time Variance

Raw program-erase cycling (no data retention evaluation) - RBER at 25C
• Small and inconsistent sensitivity to dwell time

Page 8: Enterprise MLC NAND Industry Comparison

Raw Program-Erase Cycling @ 85C: No Data Retention, Dwell Time Variance

Raw program-erase cycling (no data retention evaluation) - RBER at 85C
• Vendor A shows minimal sensitivity to dwell time
• Vendor B shows some evidence of longer dwell time resulting in lower RBER

Page 9: Enterprise MLC NAND Industry Comparison

Raw Program-Erase Cycling: No Data Retention, Temperature Variance

Program-erase cycling temperature sensitivity inconclusive
• Vendors A & B show opposite relationships relative to temperature

[Figure panels: Vendor A, Vendor B]

Page 10: Enterprise MLC NAND Industry Comparison

Post Program-Erase Cycling Data Retention Study

• As bake duration increases, raw bit error rate (RBER) increases
• Higher program-erase cycling temperatures result in improved data retention
• Longer dwell times show lower bit error rates

[Figure annotations: increasing bake time (1hr, 2hr, 3hr, 4hr, 15hr, 26hr) degrades HTDR RBER; increasing dwell (30, 60, 120) improves HTDR RBER; increasing PE cycling temperature improves HTDR RBER.]

* Bake times are 1, 2, 3, 4, 15 and 26 hrs

Page 11: Enterprise MLC NAND Industry Comparison

Post Program-Erase Cycling Data Retention Study - Dwell Time Sensitivity

• As dwell time increases, raw bit error rate (RBER) decreases
• Linear relationship on log-log plot implies a power law relationship: RBER ~ DT^m
  • Where m is the slope, a function of supplier, temperature, program-erase cycling and data retention bake duration
  • Can be applied to predict RBER for extended dwell times (as in SSD environment)

Example (m = -0.66)
• If dwell time is doubled, RBER is reduced by a factor of 1.58 (2^-0.66 is about 0.63, and 1/0.63 is about 1.58)
• 1200s dwell time has 4.57x RBER improvement vs 120s dwell time (10^0.66)
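The power law RBER ~ DT^m can be checked numerically. The sketch below reproduces the two factors quoted on the slide for m = -0.66 and shows one way a slope could be estimated from (dwell time, RBER) pairs by a least-squares line on log-log axes. The function names and the RBER values are invented for illustration; the synthetic points are generated from the power law itself and are not measured data.

```python
import math


def rber_ratio(dwell_from_s, dwell_to_s, m):
    """RBER(dwell_to) / RBER(dwell_from) under RBER ~ DT^m."""
    return (dwell_to_s / dwell_from_s) ** m


def fit_slope(dwell_times_s, rbers):
    """Least-squares slope of log10(RBER) against log10(dwell time)."""
    xs = [math.log10(d) for d in dwell_times_s]
    ys = [math.log10(r) for r in rbers]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


M = -0.66  # slope quoted on the slide

# Doubling the dwell time: RBER drops by a factor of about 1.58.
improvement_2x = 1 / rber_ratio(120, 240, M)
# Going from 120 s to 1200 s: RBER drops by a factor of about 4.57.
improvement_10x = 1 / rber_ratio(120, 1200, M)

# Synthetic (made-up) RBER points that follow the power law exactly,
# used only to show that the fit recovers the slope.
dwells = [30, 60, 120, 240, 480]
synthetic_rber = [1e-3 * (d / 30) ** M for d in dwells]
fitted_m = fit_slope(dwells, synthetic_rber)

print(round(improvement_2x, 2), round(improvement_10x, 2), round(fitted_m, 2))
```

With real characterization data the fitted slope would differ per supplier, cycling temperature, cycle count and bake duration, as the slide notes, so a separate fit per condition is the natural unit.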

Page 12: Enterprise MLC NAND Industry Comparison

Uncorrectable Bit Error Rate (UBER) Study

• Data retention bit error rate is very sensitive to program-erase cycling temperature
• Devices cycled at higher temperatures have a lower RBER and lower ECC requirement
• For 30K program-erase cycled devices, 25C requires greater than 200 ECC bits per 1KB, while 85C requires about 35 ECC bits for the same data set

[Figure: curves for 85C PE cycling and 25C PE cycling, with the JEDEC spec line at 1E-14 (fails / bits read).]

UBER = (number of 1KB UE) / (total bits read)
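The UBER definition above can be sketched as a short calculation over per-codeword error counts. This is a minimal illustration under stated assumptions: a "1KB" codeword is taken as 8192 data bits, a codeword is treated as uncorrectable when its raw bit errors exceed the ECC correction strength, and the error counts are invented sample values, not results from the study. The function name `uber` and its parameters are hypothetical.

```python
def uber(bit_errors_per_codeword, ecc_bits, codeword_bits=8 * 1024):
    """Empirical UBER: uncorrectable 1KB codewords divided by total bits read.

    A codeword counts as uncorrectable when its raw bit errors exceed the
    number of bits the ECC can correct (ecc_bits).
    """
    if not bit_errors_per_codeword:
        raise ValueError("need at least one codeword read")
    uncorrectable = sum(1 for e in bit_errors_per_codeword if e > ecc_bits)
    total_bits_read = len(bit_errors_per_codeword) * codeword_bits
    return uncorrectable / total_bits_read


# Made-up raw bit error counts for eight 1KB codeword reads (not measured data).
sample_errors = [3, 12, 40, 7, 0, 36, 210, 18]

# Stronger ECC leaves fewer uncorrectable codewords, so UBER falls.
for t in (8, 35, 200, 210):
    print(t, uber(sample_errors, ecc_bits=t))
```

Sweeping `ecc_bits` this way mirrors the slide's comparison of ECC requirement between cycling temperatures: the required strength is the smallest value at which the computed UBER falls to the target.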

Page 13: Enterprise MLC NAND Industry Comparison

Extending Flash Dwell Time at Constant Data Retention

[Figure: Constant RBER Bands. Axes: Dwell Time (s), ticks from 30 to 1920, and Program-Erase Cycle Count, ticks from 2000 to 40000. Labels: "Decreasing RBER", "Typical Flash Qualification Dwell Time Region at 30K PE", "Typical SSD Dwell Time".]

• Additional program-erase cycles can be realized when typical SSD-level dwell times are applied

Page 14: Enterprise MLC NAND Industry Comparison

Summary

• Flash device characterization pursued to describe relationship between variables and raw bit error rate, and to investigate Flash impact on SSD usable life
• Program-erase cycling shows some sensitivity to dwell time
  • Dependence observed for one supplier at 85C (not seen at 25C)
• Program-erase cycling shows sensitivity to temperature; suppliers under test show opposite relationships
• Post program-erase cycling data retention bit error rate shows clear sensitivity to dwell time during cycling
  • Extended dwell times exhibit lower bit error rate
• Post program-erase cycling data retention bit error rate shows clear sensitivity to temperature during cycling
  • Higher temperatures exhibit lower bit error rate
• Flash raw bit error rate vs. program-erase cycle dwell time log-log plot exhibits power law relationship
  • Can be applied to predict RBER for extended dwell times (as in SSD environment)
• Controller ECC requirement is reduced for higher program-erase cycling temperatures
• Additional Flash program-erase cycles can be realized when typical SSD-level dwell times are applied at constant data retention

