Post on 25-Dec-2015

Disk Drive Roadmap from the Thermal Perspective

A Case for Dynamic Thermal Management

Sudhanva Gurumurthi, Anand Sivasubramaniam, Vivek Natarajan

Computer Systems Lab, Pennsylvania State University

2

Power Demands of Data Centers

“What matters most to the computer designers at Google is not speed but power – low-power – because data centers can consume as much electricity as a city”, Eric Schmidt, CEO, Google

• Data centers consume several Megawatts of power

• Electricity bill
  – $4 billion/year
  – Disks account for 27% of computing-load costs

• Difficult to cool at high power-densities

Sources:

1. “Intel’s Huge Bet Turns Iffy”, New York Times article, September 29, 2002

2. “Power, Heat, and Sledgehammer”, Apr. 2002.

3. “Heat Density Trends in Data Processing, Computer Systems, and Telecommunications Equipment”, 2000.

3

Data Center Cooling Costs

• Data center of a large financial institution in New York City
  – Power consumption ~ 4.8 MW

Source: “Energy Benchmarking and Case Study – NY Data Center No. 2”, Lawrence Berkeley National Lab, July 2003.

[Figure: pie chart of data center power consumption. Legend: Servers, Air-Conditioning, Other. Labeled shares: 51%, 42%, 7%]

4

Temperature Affects Disk Drive Reliability

• Heat-Related Problems
  – Thermal-tilt of disk stack and actuator arms
  – Out-gassing of spindle/voice-coil motor lubricants
  – Wear-out of bearings

• A hard disk operating 5 °C above its normal temperature is 10-15% more likely to fail

Disk drive design constrained by the thermal-envelope

5

Source: Hitachi GST Technology Overview Charts, http://www.hitachigst.com/hdd/technolo/overview/storagetechchart.html

6

Thermal-Constrained Design

• $\text{Power} \propto (\#\,\text{Platters}) \times (\text{RPM})^{2.8} \times (\text{Diameter})^{4.6}$

• $\text{Data Rate} \propto (\text{Linear-Density}) \times (\text{RPM}) \times (\text{Diameter})$

[Figure: design-cycle diagram. Labels: Increase RPM, Temperature, Shrink Platter, 1 platter, Lower Capacity, Lower Data Rate, Data-Rate, Capacity]

• 40% annual IDR growth: can we stay on this roadmap?
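The power and data-rate proportionalities on this slide can be tried out numerically. The following is a minimal sketch that assumes the exponents hold exactly and that the unknown proportionality constants cancel when two designs are compared. The function names and the example drive (2.6-inch platter at 15,000 RPM, shrunk to 2.1 inches) are illustrative choices, not values from the slides.

```python
def relative_power(platters: float, rpm: float, diameter_in: float) -> float:
    """Power up to an unknown constant: platters * RPM^2.8 * diameter^4.6."""
    return platters * rpm ** 2.8 * diameter_in ** 4.6


def relative_data_rate(linear_density: float, rpm: float, diameter_in: float) -> float:
    """Data rate up to an unknown constant: linear density * RPM * diameter."""
    return linear_density * rpm * diameter_in


def rpm_at_equal_power(rpm_old: float, dia_old: float, dia_new: float) -> float:
    """RPM a resized platter can run at for the same power and platter count."""
    return rpm_old * (dia_old / dia_new) ** (4.6 / 2.8)


# Illustrative comparison: shrink the platter, spend the saved power on RPM.
base_rpm, base_dia, small_dia = 15_000.0, 2.6, 2.1
new_rpm = rpm_at_equal_power(base_rpm, base_dia, small_dia)
gain = (relative_data_rate(1.0, new_rpm, small_dia)
        / relative_data_rate(1.0, base_rpm, base_dia))

print(f"RPM at equal power: {new_rpm:.0f}")
print(f"Data-rate ratio at equal power and linear density: {gain:.2f}")
```

Under these assumptions the smaller platter can spin faster for the same power, and the RPM gain outweighs the loss from the shorter track length. The price is recording area, which is the trade-off the slide labels "Lower Capacity".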

7

Outline

• Introduction

• Modeling

• The Roadmap

• Dynamic Thermal Management

• Conclusions

8

Modeling

• Baseline input parameters
  – Linear-Density (BPI)
  – Track-Density (TPI)

• Characteristics Modeled
  – Capacity
  – Performance
  – Temperature

9

Capacity Model

• $C_{max} = \eta \times n_{surf} \times \pi (r_o^2 - r_i^2) \times (BPI \times TPI)$

• Stroke-Efficiency: η < 1
  – Spare tracks, recalibration tracks etc.
  – Assumed η = 2/3 [CMRR]

• User-accessible capacity needs to be derated due to:
  – Zoned-Bit Recording (ZBR)
  – Servo Overheads
  – ECC Overheads
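As a rough illustration of the capacity formula, the sketch below evaluates it for one made-up drive. The radii, BPI, and TPI values are hypothetical placeholders chosen only to give a plausible order of magnitude. Only the stroke efficiency of 2/3 comes from the slide.

```python
import math


def max_capacity_bits(n_surf: int, r_outer_in: float, r_inner_in: float,
                      bpi: float, tpi: float,
                      stroke_efficiency: float = 2 / 3) -> float:
    """Raw capacity in bits: eta * n_surf * pi * (ro^2 - ri^2) * BPI * TPI."""
    recording_area_in2 = math.pi * (r_outer_in ** 2 - r_inner_in ** 2)
    return stroke_efficiency * n_surf * recording_area_in2 * bpi * tpi


# Hypothetical single-platter (two-surface) 2.6-inch drive; every number
# below is an assumption for illustration, not data from the slides.
bits = max_capacity_bits(n_surf=2, r_outer_in=1.3, r_inner_in=0.65,
                         bpi=500_000, tpi=60_000)
print(f"Raw capacity: {bits / 8 / 1e9:.1f} GB before ZBR, servo, and ECC derating")
```

The result is an upper bound: the ZBR, servo, and ECC overheads listed above would each reduce the user-accessible figure further.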

10

Performance Models

• Parameters Modeled
  – IDR
  – Seek-time

• IDR
  – IDR experienced by outermost zone

• Seek-time
  – Uses linear-interpolation based on track-to-track, average, and full-stroke times [Worthington’95]
  – Accurate for seeks longer than 10 cylinders
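The slide names the three anchor points of the seek-time model but not the formula. The sketch below is one plausible reading rather than the model from [Worthington’95]: a piecewise-linear curve through the track-to-track, average, and full-stroke times. Placing the average seek at one third of the full stroke is an assumption, and the timing values in the usage lines are hypothetical.

```python
def seek_time_ms(distance: int, t_track_ms: float, t_avg_ms: float,
                 t_full_ms: float, n_cylinders: int) -> float:
    """Piecewise-linear seek time through three published data points."""
    if distance <= 0:
        return 0.0                      # no arm movement
    full = n_cylinders - 1              # longest possible seek, in cylinders
    avg_dist = full / 3.0               # assumption: average seek is 1/3 stroke
    d = min(float(distance), float(full))
    if d <= avg_dist:
        x0, y0, x1, y1 = 1.0, t_track_ms, avg_dist, t_avg_ms
    else:
        x0, y0, x1, y1 = avg_dist, t_avg_ms, float(full), t_full_ms
    return y0 + (y1 - y0) * (d - x0) / (x1 - x0)


# Hypothetical drive: 30,000 cylinders, 0.4 / 3.6 / 7.4 ms seek times.
for cylinders in (1, 100, 10_000, 29_999):
    print(cylinders, round(seek_time_ms(cylinders, 0.4, 3.6, 7.4, 30_000), 3))
```

A straight-line fit ignores the acceleration phase that dominates very short seeks, which is consistent with the slide's note that accuracy is only claimed beyond 10 cylinders.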

11

Validation

• Compared modeled vs. actual capacity and IDR using 13 disks from 4 different manufacturers from 1999-2002

• Inputs: BPI, TPI, RPM, Platter-size, Number of platters

• Assumed all disks have 30 zones.

12

Performance Model Validation

Model                  Year  Actual Capacity (GB)  Model Capacity (GB)  Actual IDR (MB/s)  Model IDR (MB/s)
Quantum Atlas 10K      1999  18                    17.6                 39.3               46.5
Seagate Cheetah X15    2000  18                    20.1                 63.5               73.6
IBM Ultrastar 73LZX    2001  36                    34.7                 86.3               85.2
Fujitsu AL-7LE         2001  73                    67.6                 84.1               88.1
Seagate Cheetah 15K.3  2002  73                    74.8                 111.4              114.4
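To make the agreement between model and measurement easier to judge, the validation figures listed on this slide can be turned into signed percentage errors. This is a small sketch over the five drives shown here. The helper name `percent_error` is introduced only for illustration.

```python
# (drive, year, actual GB, model GB, actual IDR MB/s, model IDR MB/s),
# copied from the validation slide.
rows = [
    ("Quantum Atlas 10K", 1999, 18, 17.6, 39.3, 46.5),
    ("Seagate Cheetah X15", 2000, 18, 20.1, 63.5, 73.6),
    ("IBM Ultrastar 73LZX", 2001, 36, 34.7, 86.3, 85.2),
    ("Fujitsu AL-7LE", 2001, 73, 67.6, 84.1, 88.1),
    ("Seagate Cheetah 15K.3", 2002, 73, 74.8, 111.4, 114.4),
]


def percent_error(model: float, actual: float) -> float:
    """Signed error of the model relative to the actual value, in percent."""
    return 100.0 * (model - actual) / actual


for name, year, cap_actual, cap_model, idr_actual, idr_model in rows:
    cap_err = percent_error(cap_model, cap_actual)
    idr_err = percent_error(idr_model, idr_actual)
    print(f"{name:<22} {year}  capacity {cap_err:+6.1f}%   IDR {idr_err:+6.1f}%")
```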

13

Source: Hitachi GST Technology Overview Charts, http://www.hitachigst.com/hdd/technolo/overview/storagetechchart.html

14

Change in BPI and TPI Trends

• Slowdown in BPI
  – Difficult to lower fly-height
  – Requires higher recording media coercivity
  – Smaller grain sizes suffer from superparamagnetic effects

• Slowdown in TPI
  – Narrower tracks more susceptible to media noise
  – Inter-track interference
  – Increase in track edge-effects with narrower tracks

• Bit-Aspect Ratios (BPI/TPI) dropping
  – Larger slowdown in BPI

• Long-term areal density growth expected to slow down to 40-50% per year
  – 1 Tb/in² disk expected to be available in 2010 [DS2]

15

Capturing BPI and TPI Trends

• Studied published work on designing Terabit areal-density disks.

• Chose design with most conservative assumptions about BPI

• Scaled BPI and TPI CGRs to achieve 1 Tb/in² areal density in 2010
  – BPI CGR = 14%
  – TPI CGR = 28%
  – Areal-density CGR = 46%
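The three growth rates are linked by compounding: areal density is the product of BPI and TPI, so their annual growth factors multiply. A short check of that arithmetic, using only the 14% and 28% figures from the slide:

```python
bpi_cgr = 0.14   # annual BPI growth, from the slide
tpi_cgr = 0.28   # annual TPI growth, from the slide

# Areal density = BPI * TPI, so the yearly growth factors multiply.
areal_cgr = (1 + bpi_cgr) * (1 + tpi_cgr) - 1
print(f"Areal-density CGR: {areal_cgr:.1%}")

# The bit-aspect ratio BPI/TPI changes by this factor each year.
bar_factor = (1 + bpi_cgr) / (1 + tpi_cgr)
print(f"Bit-aspect ratio factor per year: {bar_factor:.3f}")
```

The product comes to about 45.9%, which rounds to the 46% quoted above. The bit-aspect ratio factor is below 1, so the ratio falls every year because TPI grows faster than BPI.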

16

Thermal Model

• Extension of work by Eibeck et al. at the University of California

• Components Modeled:
  – Internal air
  – Spindle-assembly
  – Arm-assembly
  – Drive base and cover

• Drive completely enclosed

• External temperature maintained constant

17

Modeling the Heat-Transfer

• Newton’s Law of Cooling: $dQ/dt = hA\,\Delta T$

• Internal air heat = heat convected by solid components + viscous dissipation − heat lost through drive cover
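Newton's law of cooling can be stepped forward in time to see how a drive approaches its steady-state temperature. The sketch below is a deliberately simplified, single-node version: it lumps the whole drive into one thermal mass, whereas the model described in these slides tracks several components separately. All parameter values are hypothetical except the 28 °C ambient temperature.

```python
def simulate_internal_air(power_w: float, h_w_per_m2k: float, area_m2: float,
                          heat_capacity_j_per_k: float, t_ambient_c: float,
                          dt_s: float = 1.0, duration_s: float = 3000.0) -> list:
    """Explicit finite-difference integration of a one-node thermal model."""
    temp = t_ambient_c
    history = [temp]
    for _ in range(int(duration_s / dt_s)):
        # Newton's law of cooling: heat leaving through the cover, in watts.
        heat_out_w = h_w_per_m2k * area_m2 * (temp - t_ambient_c)
        # Net heat flow raises the temperature of the lumped thermal mass.
        temp += (power_w - heat_out_w) * dt_s / heat_capacity_j_per_k
        history.append(temp)
    return history


# Hypothetical drive: 10 W dissipated, h*A = 0.75 W/K, 200 J/K thermal mass.
history = simulate_internal_air(power_w=10.0, h_w_per_m2k=25.0, area_m2=0.03,
                                heat_capacity_j_per_k=200.0, t_ambient_c=28.0)
steady_state_c = 28.0 + 10.0 / (25.0 * 0.03)
print(f"after 50 min: {history[-1]:.2f} C, steady state: {steady_state_c:.2f} C")
```

The temperature rises quickly at first and then flattens as the heat leaving through the cover catches up with the heat generated, which is the balance point at which $dQ/dt$ in equals $dQ/dt$ out.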

18

Drive Parameters

• Materials
  – Proprietary data
  – Assumed platters, arms, and spindle-hub composed of Aluminum

• Geometry
  – Modeling and measurement

• Voice-coil motor (VCM) power
  – Used published data from IBM [Sri-Jayantha’95]

• External air temperature
  – Assumed 28 °C for single-platter configuration

19

The Thermal-Envelope

[Figure: temperature (°C, axis from 28 to 48) versus time (minutes, 1 to 50), with the thermal envelope marked]

20

Outline

• Introduction

• Modeling

• Formulating a Disk-Drive Roadmap

• The Roadmap

• Dynamic Thermal Management

• Conclusions

21

Drive RPM

[Figure: RPM (axis from 0 to 250,000) versus year for 2.6", 2.1", and 1.6" platters. Annotations: BPI CGR = 30%, TPI CGR = 50%; BPI CGR = 14%, TPI CGR = 28%; Areal Density ≥ 1 Tb/in²]

22

Drive Temperature

[Figure: temperature (°C, axis ticks at 10, 100, and 1000) versus year for 2.6", 2.1", and 1.6" platters, with the thermal envelope marked]

23

24

25

Outline

• Introduction

• Modeling

• Formulating a Disk-Drive Roadmap

• The Roadmap

• Dynamic Thermal Management

• Conclusions

26

Dynamic Thermal Management (DTM)

• Goal: boost performance while staying within the thermal envelope through dynamic activity control

• How much do higher RPMs benefit application I/O performance?

27

Applications Studied

• Five commercial I/O traces
  – Openmail (HP Labs)
  – OLTP Application (UMass Repository)
  – Web Search-Engine (UMass Repository)
  – TPC-C (Penn State)
  – TPC-H (IBM Research)

• Attempted to re-create, in DiskSim, the disk system on which each trace was collected

28

30-60% Performance Boost for 10,000 RPM Increase

29

Search-Engine - Thermal Behavior

Thermal Envelope = 45.22 °C

30

DTM Solution 1: Exploiting Thermal Slack

[Figure: temperature versus time. Labels: Thermal-Envelope, SPM+VCM On, VCM Off, RPM, Thermal Slack]

31

Thermal Slack

[Figure: RPM (axis from 0 to 60,000) versus platter diameter (2.6, 2.1, and 1.6 inches). Series: Envelope-Design, VCM Off]

32

33

DTM Solution 2: Activity Throttling

• Thermal-design assuming an average-case operation

• Basic idea
  – Disk services requests at its peak-performance configuration
  – Throttle disk activities if thermal-envelope may be exceeded
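The basic idea can be sketched as a simple threshold controller. This is an illustrative toy, not the mechanism evaluated in the talk: it assumes a first-order thermal model in which the drive relaxes toward one temperature while serving requests and toward a lower one while throttled. The 45.22 °C envelope is the value quoted for the search-engine workload. The other temperatures, the time constant, and the resume threshold are hypothetical.

```python
def simulate_throttling(envelope_c: float, t_active_c: float, t_idle_c: float,
                        resume_c: float, tau_s: float = 60.0,
                        dt_s: float = 0.1, duration_s: float = 600.0) -> list:
    """Return (temperature, throttled) samples for a threshold controller."""
    temp, throttled, trace = t_idle_c, False, []
    for _ in range(int(duration_s / dt_s)):
        if throttled and temp <= resume_c:
            throttled = False            # enough slack regained: resume service
        target = t_idle_c if throttled else t_active_c
        next_temp = temp + (target - temp) * dt_s / tau_s
        if not throttled and next_temp > envelope_c:
            throttled = True             # next step would cross the envelope
            next_temp = temp + (t_idle_c - temp) * dt_s / tau_s
        temp = next_temp
        trace.append((temp, throttled))
    return trace


trace = simulate_throttling(envelope_c=45.22, t_active_c=48.0,
                            t_idle_c=40.0, resume_c=44.5)
peak = max(temp for temp, _ in trace)
busy = sum(1 for _, throttled in trace if not throttled) / len(trace)
print(f"peak {peak:.2f} C, serving requests {busy:.0%} of the time")
```

The controller lets the drive run at its peak configuration until the next step would exceed the envelope, then backs off until the temperature drops to the resume threshold. A design that never throttles would instead have to be sized for the worst case.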

34

Approach 1: Seek Throttling

[Figure: temperature versus time. Labels: Thermal-Envelope, VCM On, VCM Off]

35

Approach 2: (Seek+RPM) Throttling

[Figure: temperature versus time. Labels: Thermal-Envelope, VCM On, VCM Off, VCM Off + Low RPM]

36

Throttling-Ratio

[Figure, first panel: Seek Throttling, 2.6", 24,534 RPM. Throttling-ratio (axis from 0 to 2) versus tcool (0.5, 1, 2, 4, 6, 8 secs). Caption: 2.6” 40% IDR Growth to 2005]

[Figure, second panel: (Seek+RPM) Throttling, 2.6", 37,001 RPM. Throttling-ratio (axis from 0 to 1.8) versus tcool (0.5, 1, 2, 4, 6, 8 secs). Caption: 2.6” 40% IDR Growth to 2007]

• tcool – Disk undergoing throttling

• theat – Disk operating at maximal performance configuration

• Throttling-Ratio = (theat/tcool)
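The throttling-ratio definition can be illustrated with a closed-form calculation, under the assumption (not stated in the slides) that the drive heats and cools exponentially with a single time constant. The disk starts at the thermal envelope, cools for tcool, then runs at peak configuration until it is back at the envelope, and that second interval is theat. All parameter values except the tcool sweep and the 45.22 °C envelope are hypothetical.

```python
import math


def throttling_ratio(t_cool_s: float, envelope_c: float, t_active_c: float,
                     t_idle_c: float, tau_s: float) -> float:
    """t_heat / t_cool for a first-order model, t_idle < envelope < t_active."""
    # Temperature reached after cooling for t_cool_s from the envelope.
    temp_low = t_idle_c + (envelope_c - t_idle_c) * math.exp(-t_cool_s / tau_s)
    # Time at peak configuration to climb from temp_low back to the envelope.
    t_heat_s = tau_s * math.log((t_active_c - temp_low) / (t_active_c - envelope_c))
    return t_heat_s / t_cool_s


for t_cool in (0.5, 1, 2, 4, 6, 8):      # same sweep as the plotted x-axis
    ratio = throttling_ratio(t_cool, envelope_c=45.22, t_active_c=48.0,
                             t_idle_c=40.0, tau_s=60.0)
    print(f"tcool = {t_cool:>3} s  throttling-ratio = {ratio:.2f}")
```

A ratio above 1 means the disk spends more time at its peak configuration than it spends throttled. In this toy model the ratio falls as tcool grows, because each extra second of cooling buys less additional headroom.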

37

Summary

• Need aggressive RPM increases to sustain IDR growth
  – Scaling BPI and TPI more difficult
  – Lower Signal-to-Noise ratios at higher densities increase ECC overheads

• IDR growth will be limited by heat dissipation
  – 40% growth rate cannot be maintained beyond 2007 even for 1.6” platter-size
  – Expected to slow down to 14%

• Possible to buy back performance with Dynamic Thermal Management (DTM).