Future Compute Memory Non Volatile Memory (NVM) in Compute Al Fazio Intel Fellow Director, Memory technology Development November 12, 2008

Future Compute Memory

Non Volatile Memory (NVM) in Compute

Al Fazio

Intel Fellow

Director, Memory Technology Development

November 12, 2008

Outline

Motivations for NVM in Compute

Key Principles of NAND Flash Operation & Device Physics from a compute applications viewpoint

Memory Controller Architecture

NVM Impact on Compute Applications

Future NVM Technology Trends

1987 View of NVM in Compute

2008 View of NVM in Compute

Form Factor: 2.5”/ 1.8” Standard SATA 3Gb/s

Performance

• World Class SATA I/O Performance
• X25-E (SLC) Throughput
  – Sustained R/W: 240 / 170MB/s
  – Active (avg): 2.4W; Idle: 0.06W
• X25-M/X18-M (MLC) Throughput
  – Sustained R/W: 240 / 70MB/s
  – Active (avg): 0.25W; Idle: 0.06W

A Full Range of NVM in Compute

Motivation for NVM in compute: Huge Scaling Discrepancy Between CPU and HDD

[Figure: normalized performance (0 to 200) versus date, Jan-96 to Jan-09. Series: normalized CPU performance (CPU, multicore CPU) and normalized media access time for a 20K read (Disk). Markers: Intel® Turbo Memory, Intel® Mainstream SATA SSD, Intel® SSD.]

1.3X vs 175X in 13 years!

Source: Intel measurements
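To put the "1.3X vs 175X in 13 years" gap in per-year terms, the short sketch below converts each total factor into a compound annual growth factor. It assumes the 175X figure refers to normalized CPU performance and the 1.3X figure to disk media access time, which is how the chart labels read; the function name is illustrative.

```python
def annual_growth(total_factor: float, years: float) -> float:
    """Compound annual growth factor that yields total_factor after `years`."""
    return total_factor ** (1.0 / years)


# Assumption: 175X is the CPU gain and 1.3X is the disk gain over the 13 years.
cpu = annual_growth(175.0, 13)
hdd = annual_growth(1.3, 13)

print(f"CPU: {100 * (cpu - 1):.1f}% per year")
print(f"HDD: {100 * (hdd - 1):.1f}% per year")
```

Under that reading, the CPU curve compounds at roughly 49% per year while disk access time improves by about 2% per year.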

20+ Years Flash Floating Gate Technology

3+ decades of floating-gate technology scaling starting on EPROM Flash

Technology designed for high-volume manufacturing

Year / technology node:

• 1986 / 1.5µm
• 1988 / 1.0µm
• 1991 / 0.8µm
• 1993 / 0.6µm
• 1996 / 0.4µm
• 1998 / 0.25µm
• 2000 / 0.18µm
• 2002 / 0.13µm
• 2004 / 90nm
• 2006 / 72nm
• 2007 / 50nm
• 2008 / 34nm (32Gbit)

Source: Intel

Flash: A License to Disrupt

35mm film, floppy drives, audio tape…

• Flash use in consumer electronics characterized by:
  – Large block files (.jpg, mp3…)
  – # writes determined by human interaction (i.e. photos taken)

To disrupt HDD, flash must accommodate compute characteristics:

• Small random writes, # writes determined by OS

• Add to this:
  – Flash requires high fields to overcome energy barriers for non-volatility
  – Flash reliability dominated by oxide degradation, a result of program/erase

NAND Physical Organization

A block is a sea of cells arranged in a grid

Top gates (TGs) are connected in wordlines (typ. 32, only 5 shown)

Cells in different wordlines are strung together in series

Each string of cells is connected to a bitline at one end and to the source at the other

Select devices control whether the block is connected to bitlines and source

[Figure: one block. Five gates in series along a string, crossed by wordlines, with a select device at each end connecting the string to the bitline and to the source.]

Flash Cell Layout and Cross-Section

N-Channel MOSFET with a few distinguishing features:
– Isolated floating gate
– Charge storage on floating gate modulates threshold voltage of underlying MOSFET

[Figure: cell layout and cross-section. Labels: drain (D), source (S), control gate (CG), floating gate (FG); N+ source and N+ drain in P-type silicon substrate; tunnel oxide; Poly1 floating gate; inter-poly dielectric (ONO); Poly2 control gate; oxide sidewall; stored electrons. Ids versus Vcg curves for the erased “1” and programmed “0” states.]

Charge Storage: Program and Erase

Programming means injecting electrons to the FG

• Fowler-Nordheim Tunneling

Erase: Fowler-Nordheim Tunneling in reverse direction

[Figure: bias conditions on a cell between N+ regions. Programming (NAND): 20V, 0V, 0V. Erase: -20V, 0V, 0V.]

[Figure: Vt distributions. 2 levels => 1 bit/cell (states “1” and “0”). 4 levels => 2 bits/cell (state labels “11”, “10”, “01” and “00”).]

Reliability and Oxide Traps

Normally, F-N tunneling occurs only during accelerated stresses done by engineers trying to study oxide degradation…

• In Flash memories, it is the basis of device operation itself

This fact has two fundamental implications:

• Flash reliability is dominated by oxide-degradation effects, notably trap buildup in the tunnel oxide, which occur as a result of program/erase cycling

• More than any other IC technology, developing a Flash technology centers around obtaining acceptable reliability

Over time, charges can detrap

• Effect will cause VT to shift, with possible data loss

[Figure: energy versus distance band diagram between channel and FG showing trapped charge (Q’); cell cross-section with N+ regions, FG and top gate.]

Bit Errors: Overview

At any instant, some fraction of bits are in the wrong data state, typically 1E-9 to 1E-6, called the “raw bit error rate” or RBER

These failing bits develop with use

• During write, some bits program when they shouldn’t, or program higher than they should

• Cells shift in VT over time, either simply with time (“data retention”) or because of repetitive read operations (“read disturb”)

• Both kinds increase with more program/erase cycles

• Several mechanisms cause bit errors, each with its own dependence on cycles, time, temperature, etc.

This complexity means that RBER is a number, but not like pi:

• It is like temperature: a number for a specific set of conditions, location, and instant

[Figure: Vt distributions of levels L0 to L3 after write and after time, with the read and Vpass voltages marked.]
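As a concrete reading of the RBER definition above (the fraction of bits in the wrong data state at one instant), the following sketch compares a written buffer with what is read back. The buffer size and the single flipped bit are made up for illustration; a real measurement would also record the cycle count, retention time and temperature that the slide says the number depends on.

```python
def raw_bit_error_rate(written: bytes, read_back: bytes) -> float:
    """Fraction of bits that differ between written and read-back data."""
    if len(written) != len(read_back) or not written:
        raise ValueError("buffers must be non-empty and the same length")
    # XOR leaves a 1 wherever the two buffers disagree; count those bits.
    wrong_bits = sum(bin(a ^ b).count("1") for a, b in zip(written, read_back))
    return wrong_bits / (8 * len(written))


written = bytes(4096)                  # hypothetical 4 KiB of zeros
corrupted = bytearray(written)
corrupted[100] ^= 0b00000100           # flip one bit to mimic a failing cell
rber = raw_bit_error_rate(written, bytes(corrupted))
print(rber)
```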

Erratic Nature of Write Errors

[Figure: RBER (1.0E-09 to 1.0E-05) versus P/E cycles, shown once on a log axis from 1 to 10000 cycles and once on a linear axis from 4000 to 10000 cycles.]

Errors are erratic: Most bits failing at 5K didn’t fail at 10K

Explanation: oxide traps are transient

Data verified only at symbols: did we miss errors in between?

Ran experiment to verify data after every cycle

• Example bit failed 11 times, never at previous verify points

• Previous verifies detected only 0.6% of failing bits

Standard “test after stress” qualifications miss most errors!

[Figure legend: bit fail points; earlier data; IMFT.]

Next several slides are based on 70nm results from: Mielke, N., et al., “Bit error rate in NAND Flash memories”, IEEE International Reliability Physics Symposium, 2008

Data-Retention Errors

[Figure: raw bit error rate (10^-9 to 10^-4) versus retention time (0 to 10000 hours), post 10K cycles.]

After cycling, RBER increases over time without bias

Error transitions show cells are losing VT (“charge loss”)

Two products dominated by upper state (L3), others by L1 & L2

Characteristics:

• L1 & L2: Detrapping from the tunnel oxide

• L3: SILC (trap-assisted tunneling) leakage off FG

[Figure: Vt distributions L0 to L3 with read levels R1 to R3; cell cross-sections (n+ regions, FG, CG) illustrating SILC and detrapping.]
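The figure's read levels R1 to R3 separate the four Vt distributions L0 to L3. A minimal sketch of how charge loss turns into a read error follows; the reference voltages and the size of the Vt drop are hypothetical numbers chosen only to make the effect visible, not values from the slides.

```python
from bisect import bisect_right

# Hypothetical read reference voltages R1 < R2 < R3, in volts.
READ_REFS = (0.0, 1.5, 3.0)


def read_level(vt: float) -> int:
    """Return the level index (0..3 for L0..L3) that a cell with threshold vt reads as."""
    return bisect_right(READ_REFS, vt)


programmed_vt = 3.2                    # a cell written to L3, just above R3
after_loss_vt = programmed_vt - 0.4    # charge loss lowers VT during retention

print(read_level(programmed_vt), read_level(after_loss_vt))
```

In this toy case the cell was written as L3 but reads back as L2 once its VT has dropped below R3, which is the "charge loss" error transition described above.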

Read Disturb Errors

[Figure: raw bit error rate (0 to 1.5x10^-6) versus number of reads (0 to 10000), post 10K cycles.]

After cycling, RBER increases with repetitive reading

Error transitions show erased cells gaining VT

Mechanism is well known: SILC under read bias

[Figure: Vt distributions L0 to L3 with read levels R1 to R3; cell cross-section (n+ regions) under ~6V read bias illustrating SILC.]

Effect of ECC

[Figure: cumulative fraction of sectors failing for four conditions (5K cycles; 10K cycles; 10K cycles + 0.5 year or 5K reads; 10K cycles + 1 year or 10K reads). With 1-bit ECC the axis runs from 0 to 2x10^-4; with 4-bit ECC it runs from 0 to 4x10^-13.]

Failures drop several orders of magnitude, ~10^12x over no ECC

Curves get steeper (because of ECC power law)

Dominant mechanism switches to retention (because of underlying error distribution)
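The drop in failing sectors with stronger ECC can be illustrated with a simple binomial model. The sketch below assumes bit errors are independent with a single rate, uses a hypothetical RBER of 1e-6, and takes a 4096-bit (512-byte) sector; the measured curves in the slides come from real error distributions, so this shows only the trend.

```python
from math import comb


def sector_failure_probability(rber: float, bits_per_sector: int, correctable: int) -> float:
    """P(more than `correctable` bit errors in a sector), errors i.i.d. with rate rber.

    Only the first 20 terms of the upper tail are summed. That is accurate when
    bits_per_sector * rber is small, and it avoids the cancellation error of
    computing 1 - P(at most `correctable` errors).
    """
    total = 0.0
    for k in range(correctable + 1, correctable + 21):
        total += comb(bits_per_sector, k) * rber**k * (1.0 - rber) ** (bits_per_sector - k)
    return total


RBER = 1e-6            # hypothetical raw bit error rate
SECTOR_BITS = 4096     # assumes a 512-byte sector

p0 = sector_failure_probability(RBER, SECTOR_BITS, 0)   # no ECC
p1 = sector_failure_probability(RBER, SECTOR_BITS, 1)   # 1-bit ECC
p4 = sector_failure_probability(RBER, SECTOR_BITS, 4)   # 4-bit ECC
print(p0, p1, p4)
```

Each extra correctable bit multiplies the failure probability by roughly another factor of the (small) expected error count per sector, which is why the 4-bit curve sits many orders of magnitude below the 1-bit curve.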

Workable UBER Definition for NAND

UBER = Uncorrectable Bit Error Rate

\[
\mathrm{UBER}
= \frac{\text{Cum Fraction Sectors Failing}}{(\text{bits per sector}) \cdot (\#\,\text{reads per sector})}
= \frac{\text{Cum Fraction Sectors Failing}}{(\text{bits per sector}) \cdot \left( N_{CYC} \cdot \#\,\text{Reads/Cycle} + N_{Post\text{-}Cyc} \right)}
\]

# Reads/Cycle: worst case is 1

N_Post-Cyc: for read disturb, the # of reads in the stress; for unbiased stress, impute the same rate as in cycling
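The definition above can be written as a small function. The parameter names mirror the terms in the formula; the example inputs (failing fraction, sector size, cycle and read counts) are hypothetical and serve only to show the arithmetic.

```python
def uber(cum_fraction_failing: float, bits_per_sector: int,
         n_cyc: int, reads_per_cycle: float, n_post_cyc: float) -> float:
    """UBER = failing-sector fraction / (bits per sector * reads per sector)."""
    reads_per_sector = n_cyc * reads_per_cycle + n_post_cyc
    if reads_per_sector <= 0:
        raise ValueError("at least one read per sector is required")
    return cum_fraction_failing / (bits_per_sector * reads_per_sector)


# Hypothetical inputs: 1e-13 of sectors failing, 4096-bit sectors,
# 10K cycles at the worst case of 1 read per cycle, plus 10K post-cycling reads.
example = uber(1e-13, 4096, n_cyc=10_000, reads_per_cycle=1, n_post_cyc=10_000)
print(f"{example:.2e}")
```

Because the denominator is the total number of bits read per sector, the same failing fraction gives a lower UBER when more reads are credited, which is why the read-count assumptions have to be stated alongside the result.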

UBER Estimate

[Figure: cumulative fraction of sectors failing with 4-bit ECC (0 to 2x10^-13) versus bits read per sector (0 to 10^8), for write, read disturb and retention; the retention point is annotated 3x10^-21.]

Data re-plotted vs. # bits read

UBER at any point is the slope of the line to the origin

UBER is very low: 3x10^-21 at worst-case point (retention)

UBER increases with greater use, so use range must be stated when UBER is specified

Concurrency in Intel® SSD ASIC

10 external physical NAND channels

• Up to 2 NAND components per channel

• Component = Dual Die or Quad Die Packages

Each channel supports multiple outstanding tasks

• Each NAND channel fully hardware automated/accelerated

• Hardware fully overlaps & pipelines commands

• Automated ECC generators & correctors

[Figure: block diagram. NAND channels #0, #1, #2 … #9, each shown with an ECC generator and dual-CE# NAND packages; a memory arbiter; an ECC corrector; connections to the rest of the ASIC datapath.]

Algorithmic Efficiency

A high-performance NAND controller is necessary but not sufficient

Primary impact on overall performance is algorithmic efficiency

• Especially the case for small random writes

Write Amplification

Write Amplification is the amount of NAND written for a requested amount of write from host

[Figure: an erase block (EB) of 64 pages, Page 0 to Page 63, with the data to be written targeting part of the block.]

*Simplified example to illustrate the write amplification effect. Specific algorithms vary greatly.

Write Amplification (cont’d)

[Figure: the erase block (Page 0 to Page 63) and a DRAM copy of the same pages.]

First retrieve all data in erase block

Then insert new data in retrieved copy

Write Amplification (cont’d)

[Figure: the erase block (Page 0 to Page 63) and the DRAM copy.]

Then erase the NAND erase block

Finally put all data back (including new)

Write Amplification (cont’d)

[Figure: the erase block (Page 0 to Page 63) and the DRAM copy after the write-back.]

Example amplification is 32 (32X NAND written for host request). Traditional schemes have amplification of approx. 20-40X.
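The four steps of the simplified example (retrieve, insert, erase, write back) can be sketched as a toy model. The 64-page erase block comes from the slides; the two-page host write is an assumption chosen because it reproduces the quoted amplification of 32, and the function name is illustrative rather than any controller's actual algorithm.

```python
PAGES_PER_ERASE_BLOCK = 64  # Page 0 .. Page 63 in the slides


def read_modify_write(block: list, updates: dict) -> int:
    """Apply page updates with the naive scheme; return the NAND pages programmed."""
    dram_copy = list(block)                  # 1. retrieve all data in the erase block
    for page_index, data in updates.items():
        dram_copy[page_index] = data         # 2. insert new data in the retrieved copy
    block[:] = [None] * len(block)           # 3. erase the NAND erase block
    block[:] = dram_copy                     # 4. put all data back (including new)
    return len(block)                        # every page in the block was reprogrammed


block = [f"old-{i}" for i in range(PAGES_PER_ERASE_BLOCK)]
updates = {2: "new-2", 3: "new-3"}           # hypothetical two-page host write
nand_pages_written = read_modify_write(block, updates)
write_amplification = nand_pages_written / len(updates)
print(write_amplification)
```

In this model the amplification is simply the block size divided by the size of the host write, so smaller random writes make it worse.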

Client Workload Write Amplification

[Figure: XP Mobile Workload writes. Data written (MB, log scale from 0.1 to 1000) versus workload duration (0 to 190 minutes), comparing measured writes from host (Host Data Written) with measured writes to NAND (NAND Data Written).]

Intel® High-Performance SATA SSDs: typical write amplification <1.1 for client workloads (this example <1.05)

Performance measurements are made using specific computer systems and/or components and reflect the approximate performance of the technology as measured by those tests. Any difference in system hardware or software design or configuration may affect actual results.

*Third party marks and brands are the property of their respective owners

Variability in Wear Leveling

Controllers vary in wear-leveling effectiveness

Poor wear leveling can have high impact

20x in cycles can be 10x or more in RBER

10x in RBER is 10^(ECC+1) in ECC failure rate: 100,000x for 4-bit ECC

[Figure: Wear Leveling. P/E cycles (0 to 2500) versus sorted erase block (min to max, 1 to 6145). Unsophisticated regioned scheme: cycles/block varies 20x. More sophisticated scheme: 4% variation. Ref: Intel® X18-M.]
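The 10^(ECC+1) rule follows from a short approximation, sketched here under the assumption of independent bit errors. Let p be the RBER, n the number of bits per sector and t the number of bits the ECC can correct (these symbols are introduced here for illustration). A sector fails when more than t bits are wrong:

```latex
\[
P_{\text{fail}}(p) \;=\; \sum_{k=t+1}^{n} \binom{n}{k}\, p^{k} (1-p)^{n-k}
\;\approx\; \binom{n}{t+1}\, p^{\,t+1}
\qquad (np \ll 1)
\]
\[
\frac{P_{\text{fail}}(10p)}{P_{\text{fail}}(p)} \;\approx\; 10^{\,t+1},
\qquad t = 4 \;\Rightarrow\; 10^{5} = 100{,}000
\]
```

The leading term dominates when the expected number of errors per sector is small, so a 10x rise in RBER caused by uneven wear multiplies the failure rate of a 4-bit ECC by about 100,000.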

Putting it together: SSD Reliability Metrics

SSD UBER values can be << 10^-15

UBER ∝ usage: program/erase/read & subsequent retention

Intel® X18-M and X25-M Mainstream SATA SSD (80GB)

• 10 Channels Architecture with 50nm MLC ONFI 1.0 NAND

• 5 years usage, 1000G, 1.2 million hrs MTBF

• GB/day client workload @ 1e-15 UBER: >>100GB/day, 5 years

Intel® X25-M and X18-M Mainstream SATA SSDs deliver >5X accepted requirement for clients (20GB/day)

Intel® X25-E Extreme SATA SSD (32GB)

• 10 Channels Architecture with 50nm SLC ONFI 1.0 NAND

• 1000G, 2 million hrs MTBF

• Intel SLC SSDs support >7000 8K 2:1 R/W random IOPS 24/7, 5 years

Intel X25-E SLC SSDs support the endurance required to replace many 15K RPM HDDs for IOPS applications
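The workload figures above translate into total host writes and drive fills with simple arithmetic. The sketch below is an illustration, not a specification: it assumes decimal gigabytes, 365-day years, 8K meaning 8 KiB, and a 2:1 R/W mix meaning one write in every three operations.

```python
GB = 1e9
DAYS = 5 * 365

# Mainstream (MLC, 80GB): 100GB/day of host writes for 5 years.
mlc_host_bytes = 100 * GB * DAYS
mlc_drive_fills = mlc_host_bytes / (80 * GB)

# Extreme (SLC, 32GB): 7000 IOPS of 8K transfers at 2:1 read/write, 24/7.
write_iops = 7000 / 3                                    # one write per two reads
slc_host_bytes = write_iops * 8 * 1024 * 86_400 * DAYS   # assumes 8K = 8 KiB
slc_drive_fills = slc_host_bytes / (32 * GB)

print(f"MLC: {mlc_host_bytes / 1e12:.1f} TB written, {mlc_drive_fills:.0f} drive fills")
print(f"SLC: {slc_host_bytes / 1e15:.2f} PB written, {slc_drive_fills:.0f} drive fills")
```

With even wear leveling, the average P/E cycles per block would be about the number of drive fills times the write amplification, so at an amplification near 1.1 the MLC case lands at roughly 2,500 cycles.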

Why Random Performance Matters (more than sequential transfer rate)

Most requests are non-sequential, where the non-transfer time component is dominant

For non-sequential accesses, >95% of total HDD service time is mechanical latency

[Figure: approximate service time breakdown (0% to 100%) for a 7200RPM HDD with 8ms average seek time and 75MB/s transfer rate performing a 32KB random read operation. Components: seek, rotation, media transfer (440us), interface transfer (110us).]
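The service-time breakdown can be reproduced from the drive parameters given in the figure. Two inputs below are assumptions rather than values stated in the slide: average rotational latency is taken as half a revolution, and the interface is taken as a 300MB/s (SATA 3Gb/s) link, which matches the quoted 110us.

```python
seek_ms = 8.0                                 # average seek time from the slide
rotation_ms = 0.5 * 60_000 / 7200             # half a revolution at 7200RPM
media_ms = 32 * 1024 / 75e6 * 1000            # 32KB at 75MB/s
interface_ms = 32 * 1024 / 300e6 * 1000       # assumes a 300MB/s SATA link

total_ms = seek_ms + rotation_ms + media_ms + interface_ms
mechanical_share = (seek_ms + rotation_ms) / total_ms

print(f"rotation {rotation_ms:.2f} ms, media {media_ms * 1000:.0f} us, "
      f"interface {interface_ms * 1000:.0f} us")
print(f"mechanical share of service time: {mechanical_share:.1%}")
```

Seek and rotation together account for a little over 95% of the total, consistent with the claim that mechanical latency dominates non-sequential accesses.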

Intel® Mainstream SATA SSD Bridges the HDD Performance Gap

Random Read Performance

[Figure: random read IOPS at QD=32 (0 to 50000) versus transfer size (512B to 131072B, doubling at each step). Series: 7200RPM HDD, 128GB SATA SSD (A), 64GB SSD (B), Intel® Mainstream SATA SSD.]

Intel® Mainstream SATA SSD Bridges the HDD Performance Gap (cont’d)

Random Write Performance

[Figure: random write IOPS at QD=32 (0 to 4000) versus transfer size (512B to 131072B, doubling at each step). Series: 7200RPM HDD, 128GB SATA SSD (A), 64GB SSD (B), Intel® Mainstream SATA SSD.]

Intel® Mainstream SATA SSDs Save Power: SATA Power Rails With 2 Hour Mobile Workload

[Figure: power (0 to 4.5W) over 120 minutes, samples sorted by increasing power. Series: 5400 HDD (8X), 7200 HDD (13X), Intel® X25-M SSD (1X).]

HDD spends only 10% in lowest power states

Intel® X25-M SSD spends 96% in lowest power states

Intel® Mainstream SATA SSDs Mean Better Mobile CPU Scaling

[Figure: Sysmark07*-Productivity performance scaling. Benchmark score (50 to 225) for a 2 GHz Intel® Core™2 Duo Processor and a 3 GHz Intel® Core™2 Duo Extreme Processor, with a 5400 RPM mobile HDD and with an Intel® Mainstream SATA SSD. Annotations: 25% scaling, 36% scaling.]

Intel® Mainstream SATA SSDs maximize end user’s processor performance

SSDs in Data Center

Data center value proposition:

• Performance, especially IOPS performance
  – IOPS = Input/Output Operations Per Second

• Fewer devices needed to meet IOPS need, saving money

• Lower power consumption

• Higher system reliability

SSD Value:

A lower cost, greener, more reliable data center

Enterprise HDD Performance Gap Results in Multiplication of HDDs

[Figure: normalized performance (0 to 200) versus date, Jan-96 to Jan-09. Series: normalized CPU performance (CPU, multicore CPU) and normalized media access time for a 20K read (Disk), with the Intel SSD marked. Source: Intel measurements.]

7056 15K RPM HDDs (TPC-C Report, http://www.tpc.org/)

7056 HDDs are expensive

7056 HDDs are hard to manage

7056 HDDs fail often

7056 HDDs burn a lot of power

Similar I/O Performance For IOPS Intensive Workload

490 Fibre Channel 15K RPM drives in 4 racks

Vs.

8 lab prototype SSDs (not product) internal to server

IOPS Application Optimization

                 HDD          SSD
IOPS             64,000       120,000
Drives           490 HDDs     8 SSDs
Drive shelves    35           1
Floor space      24 sq ft     1 sq ft
Power            14 kW        0.6 kW
IOPS/W           4.6          200

Energy Savings: $9K savings/year

Energy Costs: 23X reduction

Storage Cost: 8X reduction

Floor Space: 24X reduction

Picture not to scale
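The efficiency and reduction figures in this comparison follow directly from the raw numbers. The sketch below recomputes them; the dictionary layout and function name are illustrative, and the storage-cost and dollar-savings figures are left out because the slide gives no prices to derive them from.

```python
hdd = {"iops": 64_000, "drives": 490, "floor_sq_ft": 24, "power_kw": 14.0}
ssd = {"iops": 120_000, "drives": 8, "floor_sq_ft": 1, "power_kw": 0.6}


def iops_per_watt(config: dict) -> float:
    """I/O operations per second delivered per watt of power drawn."""
    return config["iops"] / (config["power_kw"] * 1000)


hdd_efficiency = iops_per_watt(hdd)
ssd_efficiency = iops_per_watt(ssd)
power_reduction = hdd["power_kw"] / ssd["power_kw"]
floor_reduction = hdd["floor_sq_ft"] / ssd["floor_sq_ft"]

print(f"IOPS/W: HDD {hdd_efficiency:.1f}, SSD {ssd_efficiency:.0f}")
print(f"Power: {power_reduction:.0f}X lower, floor space: {floor_reduction:.0f}X lower")
```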

Intel® Turbo Memory (NAND Cache)

NAND flash solution on PCI-e bus

Intel driver interfaces to Microsoft ReadyBoost* and ReadyDrive*

O-ROM handles pre-driver load cache management

Supports on-motherboard and minicard solutions

[Figure: software/hardware stack. Software: operating system with Microsoft ReadyBoost* Technology/SuperFetch* memory management technology and ReadyDrive* Technology/T13 NVM commands; Intel® Matrix Storage Manager Driver 7.0; Robson driver; OROM. Hardware: NAND controller (CTRL) with NAND on PCIe*; disk on SATA.]

Standardized High Performance NAND Platform

[Figure labels: Host; NVMHCI; Platform NVM subsystem; Flash Controller; ONFI 2.0 (~133MB/s per channel); ONFI NAND DIMM Connector; Flash devices.]

All elements necessary for standardized high-performance platform NAND solution

*Future platform evolution forecasted

NAND in the Platform

NAND in the platform has started with modules plugged in on PCIe

As NAND becomes more prevalent, the controller will be integrated with the platform

• Down on motherboard or higher levels of integration

OEMs want to offer customers capacity/feature choice, so NAND will remain on a module

Issue: How to plug a NAND-only module into a PC platform?

• NAND does not talk PCIe*

[Figure labels: Chipset; Intel® Turbo Memory.]

Connector for NAND-only Modules

To offer capacity choice, ONFI is defining a standard connector

• Enables OEMs to sell NAND on a module
• Like an unbuffered and unregistered DIMM

The ONFI connector effort is leveraging existing DRAM standards

• Avoids major connector tooling costs
• Re-uses electrical verification
• Ensures low cost with quick time to market

Both right-angle and vertical entry form factors are being delivered

NAND Technology Future Scaling Trends: $$/GB = Bit Area & Reliability Scaling

NAND scaling: multiple potential vectors

Pipeline for next several years

• Traditional FG NAND: 34nm 32Gb MLC NAND

NAND Pathfinding Programs

• Trap Based Flash (figure labels: control dielectric, charge storage dots, tunnel oxide)
• Vertical Integration (Ref: S. Jung, IEDM 2006)
• Non-NAND NVM (figure labels: storage element, switch element)

More Evolutionary: higher production probability, but less scalable

Less Evolutionary: lower production probability, but potentially more scalable

NVM in Compute…

20+ Year Vision driven by Moore’s Law

Now we can start…

