
Input/Output III Disks CS 423, Fall 2007 Klara Nahrstedt/Sam King 6/3/20141.

Date post: 29-Mar-2015

Input/Output III
Disks

CS 423, Fall 2007
Klara Nahrstedt/Sam King


Administrative

• MP3 – Deadline, November 5, 8am
• Re-grading of MP2, HW1, Midterm is closed


Data Centers

• Core of enterprise computing
– A cluster of specialized servers
– Multiple tiers

[Figure: data center tiers (Web, Email, Applications, File, Database, Storage) connected by a network and a SAN]


Data Path

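• Disks, storage server, and database server
• Widely used caching memory
– Large access speed gaps
– Different sizes
– Various granularity

[Figure: data path from the processors and L2 cache through the database buffer (database server), across the SAN to the storage cache (storage server), and on to the disks. Latency labels in the figure, fastest to slowest: 14 cycles (6 ns), >300 cycles (~125 ns), ~40 us, >5 ms]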


Two Performance Trends

• The gaps are increasingly large

[Figure: log-scale plot of time (us), from 1.E-04 to 1.E+05, against year (1960 to 2010), with curves for processor cycle, memory cycle, and disk access]

Source: Zhifeng Chen, “Optimization of Data Access for Database Applications”, PhD Thesis, 2005, UIUC


Disks First, Then File Systems

• Form factor: .5-1” × 4” × 5.7”, Storage: 18-73GB
• Form factor: .4-.7” × 2.7” × 3.9”, Storage: 4-27GB
• Form factor: .2-.4” × 2.1” × 3.4”, Storage: 170MB-1GB


Disk Technology Trends

• Disks are getting smaller for similar capacity
– Spin faster, less rotational delay, higher bandwidth
– Less distance for head to travel (faster seeks)
– Lighter weight (for portables)
• Disk data is getting denser
– More bits/square inch
– Tracks are closer together
– Doubles density every 18 months
• Disks are getting cheaper ($/MB)
– Factor of ~2 per year since 1991
– Head close to surface


Disk Organization

• Disk surface
– Circular disk coated with magnetic material
• Tracks
– Concentric rings around disk surface, bits laid out serially along each track
• Sectors
– Each track is split into arcs called sectors (min unit of transfer)


More on Disks

• CDs and floppies come individually, but magnetic disks come organized in a disk pack
• Cylinder
– The set of tracks at the same arm position across all platters in the pack
• Disk arm
– Seeks the right cylinder


Disk Examples (Summarized Specs)

                                          Seagate Barracuda   IBM Ultrastar 72ZX
Capacity, Interface & Configuration
Formatted Gbytes                          28                  73.4
Interface                                 Ultra ATA/66        Ultra160 SCSI
Platters / Heads                          4 / 8               11 / 22
Bytes per sector                          512                 512-528

Performance
Max internal transfer rate (Mbytes/sec)   40                  53
Max external transfer rate (Mbytes/sec)   66.6                160
Avg transfer rate (Mbytes/sec)            > 15                22.1-37.4
Multisegmented cache (Kbytes)             512                 16,384
Average seek, read/write (msec)           8                   5.3
Average rotational latency (msec)         4.16                2.99
Spindle speed (RPM)                       7,200               10,000


Disk Performance

• Seek
– Position heads over cylinder, typically 5.3-8 ms
• Rotational delay
– Wait for a sector to rotate underneath the heads
– A full rotation typically takes 8.3-6.0 ms (7,200-10,000 RPM), so the average wait of ½ rotation is 4.15-3 ms
• Transfer bytes
– Average transfer bandwidth (15-37 Mbytes/sec)
• Performance of transferring 1 Kbyte
– Seek (5.3 ms) + half rotational delay (3 ms) + transfer (0.04 ms)
– Total time is 8.34 ms, or 120 Kbytes/sec!
• What block size can get 90% of the disk transfer bandwidth?


Disk Behaviors

• There are more sectors on outer tracks than inner tracks
– Read outer tracks: 37.4 MB/sec
– Read inner tracks: 22 MB/sec
• Seek time and rotational latency dominate the cost of small reads
– A lot of disk transfer bandwidth is wasted
– Need algorithms to reduce seek time

Block Size (Kbytes)    % of Disk Transfer Bandwidth
1 Kbytes               0.5%
8 Kbytes               3.7%
256 Kbytes             55%
1 Mbytes               83%
2 Mbytes               90%
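
The block-size percentages can be reproduced with a simple cost model. The sketch below assumes every request pays one average seek (5.3 ms) plus half a rotation (3 ms) and then transfers at 0.04 ms per Kbyte, the figures used in the 1 Kbyte example; the function names are illustrative, not part of the lecture.

```python
# Assumed model: fixed positioning overhead per request, then a constant
# transfer rate. All three constants come from the 1 Kbyte example.
SEEK_MS = 5.3
HALF_ROTATION_MS = 3.0
TRANSFER_MS_PER_KBYTE = 0.04  # roughly 25 Mbytes/sec


def bandwidth_fraction(block_kbytes):
    """Fraction of the raw transfer bandwidth achieved when reading one block."""
    transfer_ms = block_kbytes * TRANSFER_MS_PER_KBYTE
    return transfer_ms / (SEEK_MS + HALF_ROTATION_MS + transfer_ms)


def block_kbytes_for(fraction):
    """Block size (Kbytes) at which the given bandwidth fraction is reached."""
    overhead_ms = SEEK_MS + HALF_ROTATION_MS
    return overhead_ms * fraction / (1.0 - fraction) / TRANSFER_MS_PER_KBYTE


for size in (1, 8, 256, 1024, 2048):
    print(f"{size:5d} Kbytes: {100 * bandwidth_fraction(size):5.1f}%")
print(f"90% needs roughly {block_kbytes_for(0.9) / 1024:.1f} Mbytes")
```

Under this model, reaching 90% requires the transfer to take nine times as long as the positioning overhead, which is why the block has to grow to nearly 2 Mbytes.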


Observations

• Getting first byte from disk read is slow
– High latency
• Peak bandwidth high, but rarely achieved
• Need to mitigate disk performance impact
– Do extra calculations to speed up disk access
  • Schedule requests to shorten seeks
– Move some disk data into main memory – file system caching


RAID

• Use parallel processing to speed up CPU performance

• Use parallel I/O to improve disk performance, reliability (1988, Patterson)

• Design new class of I/O devices called RAID – Redundant Array of Inexpensive Disks (also Redundant Array of Independent Disks)

• Use the RAID in OS as a SLED (Single Large Expensive Disk), but with better performance and reliability


RAID (cont.)

• RAID consists of a RAID SCSI controller plus a box of SCSI disks
• Data are divided into strips and distributed over disks for parallel operation
• RAID 0 … RAID 5 levels
• RAID 0 organization writes consecutive strips over the drives in round-robin fashion – operation is called striping
• RAID 1 organization uses striping and duplicates all disks
• RAID 2 works on words, or even bytes, and stripes them across multiple disks; uses error-correcting codes, hence a very robust scheme
• RAID 3, 4, 5 are variations of the previous ones
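
The round-robin striping of RAID 0 can be sketched in a few lines. This is a minimal illustration, assuming strips are numbered from 0 and disks are indexed from 0; the function names are hypothetical and do not correspond to any real controller interface.

```python
def raid0_location(strip, num_disks):
    """Map a logical strip number to (disk index, strip offset on that disk)."""
    return strip % num_disks, strip // num_disks


def raid0_layout(num_strips, num_disks):
    """List which logical strips end up on each disk."""
    disks = [[] for _ in range(num_disks)]
    for strip in range(num_strips):
        disk, _offset = raid0_location(strip, num_disks)
        disks[disk].append(strip)
    return disks


for index, strips in enumerate(raid0_layout(12, 4)):
    print(f"disk {index}: strips {strips}")
```

Because consecutive strips land on different drives, a large sequential read can keep all drives busy at once.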


Linux Kernel


Kernel Components Affected by Block Device Op.

[Figure: layered diagram of the kernel components involved: VFS, disk caches, mapping layer, disk filesystems, generic block layer, I/O scheduler layer, block device drivers]


Disk Scheduling

• Which disk request is serviced first?
– FCFS
– Shortest seek time first
– Elevator (SCAN)
– LOOK
– C-SCAN (Circular SCAN)
– C-LOOK
• Look familiar?


FIFO (FCFS) order

• Method
– First come first serve
• Pros
– Fairness among requests
– In the order applications expect
• Cons
– Arrival may be on random spots on the disk (long seeks)
– Wild swing can happen
• Analogy:
– Can elevator scheduling use FCFS?

[Figure: cylinders 0 to 199, head starting at cylinder 53, request queue 98, 183, 37, 122, 14, 124, 65, 67]
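
The cost of FCFS on this slide's request queue can be checked with a short calculation. The sketch assumes the label 53 in the figure is the starting head position and measures cost as the total number of cylinders the head crosses; `head_movement` is an illustrative helper name.

```python
def head_movement(start, order):
    """Total cylinders crossed when serving requests in the given order."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # arrival order from the slide
print(head_movement(53, queue))  # 640
```

The swings from 183 down to 37 and from 122 down to 14 account for much of the total, which is the "wild swing" problem noted under Cons.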


SSTF (Shortest Seek Time First)

• Method
– Pick the one closest on disk
– Rotational delay is in calculation
• Pros
– Try to minimize seek time
• Cons
– Starvation
• Question
– Is SSTF optimal?
– Can we avoid starvation?
• Analogy: elevator

[Figure: cylinders 0 to 199, head starting at cylinder 53, request queue 98, 183, 37, 122, 14, 124, 65, 67, served in the order 65, 67, 37, 14, 98, 122, 124, 183]
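
The SSTF service order can be reproduced with a greedy loop. This sketch assumes the head starts at cylinder 53 and considers seek distance only (rotational delay is ignored); the function names are illustrative.

```python
def sstf_order(start, requests):
    """Repeatedly serve the pending request closest to the current position."""
    pending = list(requests)
    position, order = start, []
    while pending:
        nearest = min(pending, key=lambda cylinder: abs(cylinder - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


def head_movement(start, order):
    """Total cylinders crossed when serving requests in the given order."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
order = sstf_order(53, queue)
print(order)                     # [65, 67, 37, 14, 98, 122, 124, 183]
print(head_movement(53, order))  # 236
```

A steady stream of new requests near the head would keep winning the `min`, so a distant request such as 183 could wait indefinitely; that is the starvation risk listed under Cons.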


Elevator (SCAN)

• Method
– Take the closest request in the direction of travel
– Real implementations do not go to the end (called LOOK)
• Pros
– Bounded time for each request
• Cons
– Request at the other end will take a while

[Figure: cylinders 0 to 199, head starting at cylinder 53, request queue 98, 183, 37, 122, 14, 124, 65, 67, served in the order 37, 14, 0, 65, 67, 98, 122, 124, 183]
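
A minimal SCAN sketch follows, assuming the head starts at cylinder 53 and is initially moving toward cylinder 0, as the service order on the slide suggests. The function names are illustrative.

```python
def scan_order(start, requests, lowest=0):
    """Sweep down to the lowest cylinder, then reverse and sweep up."""
    down = sorted((c for c in requests if c <= start), reverse=True)
    up = sorted(c for c in requests if c > start)
    # The arm travels to the edge of the disk even if no request is there.
    turn = [] if down and down[-1] == lowest else [lowest]
    return down + turn + up


def head_movement(start, order):
    """Total cylinders crossed when visiting the stops in the given order."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
stops = scan_order(53, queue)
print(stops)                     # [37, 14, 0, 65, 67, 98, 122, 124, 183]
print(head_movement(53, stops))  # 236
```

The request at 183 is served last, after a full sweep down and back up, which illustrates the Cons entry about requests at the other end.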


C-SCAN (Circular SCAN)

• Method
– Like SCAN
– But, wrap around
– Real implementation doesn’t go to the end (C-LOOK)
• Pros
– Uniform service time
• Cons
– Do nothing on the return

[Figure: cylinders 0 to 199, head starting at cylinder 53, request queue 98, 183, 37, 122, 14, 124, 65, 67, served in the order 65, 67, 98, 122, 124, 183, 199, 0, 14, 37]
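
C-SCAN can be sketched the same way, assuming the head starts at cylinder 53, sweeps toward cylinder 199, and then jumps back to cylinder 0 without serving anything on the way. Whether the return sweep counts toward head movement is a matter of convention; this sketch counts it. The function names are illustrative.

```python
def cscan_order(start, requests, lowest=0, highest=199):
    """Sweep up to the highest cylinder, wrap to the lowest, sweep up again."""
    up = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    stops = list(up)
    if not up or up[-1] != highest:
        stops.append(highest)   # travel to the far edge
    if not wrapped or wrapped[0] != lowest:
        stops.append(lowest)    # idle return to the near edge
    return stops + wrapped


def head_movement(start, order):
    """Total cylinders crossed when visiting the stops in the given order."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
stops = cscan_order(53, queue)
print(stops)                     # [65, 67, 98, 122, 124, 183, 199, 0, 14, 37]
print(head_movement(53, stops))  # 382, including the 199-cylinder return
```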


LOOK and C-LOOK

• SCAN and C-SCAN move the disk arm across the full width of the disk

• In practice, neither algorithm is implemented this way

• More commonly, the arm goes only as far as the final request in each direction. Then, it reverses direction immediately, without first going all the way to the end of the disk.

• These versions of SCAN and C-SCAN are called LOOK and C-LOOK
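
Dropping the trips to the disk edges gives LOOK and C-LOOK. The sketch below reuses the request queue from the earlier scheduling slides and assumes a starting head position of 53, with LOOK initially moving down and C-LOOK moving up, mirroring the SCAN and C-SCAN examples. The function names are illustrative.

```python
def look_order(start, requests):
    """Sweep down to the lowest pending request, then reverse."""
    down = sorted((c for c in requests if c <= start), reverse=True)
    up = sorted(c for c in requests if c > start)
    return down + up


def clook_order(start, requests):
    """Sweep up to the highest pending request, then jump to the lowest."""
    up = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    return up + wrapped


def head_movement(start, order):
    """Total cylinders crossed when serving requests in the given order."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(head_movement(53, look_order(53, queue)))   # 208
print(head_movement(53, clook_order(53, queue)))  # 322
```

For this queue, the same sweeps with the edge visits included cost 236 (SCAN) and 382 (C-SCAN), so skipping the edges saves 28 and 60 cylinders respectively.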


Group Discussion Questions

• The disk scheduling algorithm that may cause starvation is:
– FCFS or SSTF or C-SCAN or LOOK ??
• From the list of disk-scheduling algorithms (FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK), SSTF will always give the least head movement for any set of cylinder-number requests to the disk scheduler:
– True or False ??
• The cylinder numbers on a disk are 0,1,…10. Currently, there are five cylinder requests on the disk scheduler queue in the following order: 1,5,4,8,7 and the head is located at position 2 and moving in the direction of increasing block numbers. The time to serve a request is proportional to the distance from the head to the cylinder number requested. If T(X) is the time it takes to service the requests currently in the queue using scheduling algorithm X, then:
– T(SSTF) < T(SCAN) < T(FCFS) or
– T(FCFS) < T(SSTF) < T(SCAN) or
– T(SSTF) < T(FCFS) < T(SCAN) or
– None of the above???
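
The third question can be checked numerically. The sketch below is one way to set it up, assuming time is measured in cylinders crossed and that SCAN travels all the way to cylinder 10 before reversing; all names are illustrative.

```python
def head_movement(start, order):
    """Total cylinders crossed when visiting the stops in the given order."""
    position, total = start, 0
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def sstf_order(start, requests):
    """Repeatedly serve the pending request closest to the current position."""
    pending, position, order = list(requests), start, []
    while pending:
        nearest = min(pending, key=lambda cylinder: abs(cylinder - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


queue, head, highest = [1, 5, 4, 8, 7], 2, 10

t_fcfs = head_movement(head, queue)
t_sstf = head_movement(head, sstf_order(head, queue))

# SCAN: sweep up to the last cylinder, then reverse and sweep down.
up = sorted(c for c in queue if c >= head)
down = sorted((c for c in queue if c < head), reverse=True)
t_scan = head_movement(head, up + [highest] + down)

print("T(FCFS) =", t_fcfs)
print("T(SSTF) =", t_sstf)
print("T(SCAN) =", t_scan)
```

Running it and comparing the three totals gives the ordering asked for in the question.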


History of Disk-related Concerns

• When memory was expensive
– Do as little bookkeeping as possible
• When disks were expensive
– Get every last sector of usable space
• When disks became more common
– Make them much more reliable
• When processors got much faster
– Make them appear faster


Disk Versus Memory

Memory
• Latency in 10’s of processor cycles
• Transfer rate 300+MB/s
• Contiguous allocation gains ~10x

Disk
• Latency in milliseconds
• Transfer rate 5-50MB/s
• Contiguous allocation gains ~1000x


On-Disk Caching

• Method
– Put RAM on disk controller to cache blocks
  • Seagate ATA disk has .5MB, IBM Ultra160 SCSI has 16MB
  • Some of the RAM space stores “firmware” (an OS)
– Blocks are replaced usually in LRU order
• Pros
– Good for reads if you have locality
• Cons
– Expensive
– Need to deal with reliable writes
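
LRU replacement of cached blocks can be sketched with an ordered dictionary. This is a simplified read-only model, not how any particular controller firmware works; `LRUBlockCache` and the `read_block` callback are hypothetical names, and writes are left out entirely.

```python
from collections import OrderedDict


class LRUBlockCache:
    """Read cache that evicts the least recently used block when full."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # hypothetical callback: block number -> data
        self.blocks = OrderedDict()   # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, block_number):
        if block_number in self.blocks:
            self.blocks.move_to_end(block_number)  # mark as most recently used
            self.hits += 1
            return self.blocks[block_number]
        self.misses += 1
        data = self.read_block(block_number)       # go to the platters
        self.blocks[block_number] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)        # evict least recently used
        return data


cache = LRUBlockCache(capacity=2, read_block=lambda n: f"data-{n}")
for block in (1, 2, 1, 3, 2):
    cache.read(block)
print(cache.hits, cache.misses)  # 1 4
```

In the usage example, block 2 is evicted when block 3 arrives because block 1 was touched more recently, so the final read of block 2 misses.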


Disk Block Caches

• Main memory rather than disk may hold disk blocks

• 85% or more of all I/O requests by file system and applications can be satisfied by disk block cache

• BSD UNIX provides a disk block cache as part of the block-oriented device software layer

• It consists of between 100 and 1000 individual buffers.
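
The effect of an 85% hit rate can be estimated with a weighted average. The sketch below borrows two figures from earlier slides as assumptions: about 125 ns for a memory access and about 8.3 ms (seek plus half a rotation) for a disk access. It ignores transfer time and the cost of the cache lookup on a miss.

```python
def average_access_ms(hit_rate, cache_ms, disk_ms):
    """Expected access time when a fraction hit_rate is served from the cache."""
    return hit_rate * cache_ms + (1.0 - hit_rate) * disk_ms


MEMORY_MS = 125e-6  # ~125 ns, assumed from the data path slide
DISK_MS = 8.3       # 5.3 ms seek + 3 ms half rotation, from the performance slide

print(f"no cache:  {DISK_MS:.2f} ms")
print(f"85% hits:  {average_access_ms(0.85, MEMORY_MS, DISK_MS):.2f} ms")
```

Even at an 85% hit rate the average is dominated by the misses, since each one costs tens of thousands of times more than a hit.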


CD-ROM

• Compact Disk – Read Only Memory, Optical Disk
– 1980 – Philips and Sony developed CD
• CD is prepared (WRITE OPERATION)
– using a high-power infrared laser to burn 0.8 micron diameter holes in a coated glass master disk;
– from the master disk, mold is created, processed and reflective layer is deposited on polycarbonate;
– depressions on the polycarbonate substrate are called pits, the unburned areas between pits are called lands
• CD is read (READ OPERATION)
– Low-power laser diode shines infrared light with a wavelength of 0.78 micron on pits and lands as they stream by.
– Laser is on the polycarbonate side, so pits stick out toward the laser as bumps in the flat surface.
– Pits and lands return different light to the player’s photodetector (pit returns less light than light bouncing off land), hence the player tells pit from land. Pit length is 0.6 micrometers.
– Pit/land and land/pit transitions represent 1, absence of transition is 0.
• CD-Recordable, CD-Rewritables, DVD
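
The transition rule for reading bits can be illustrated with a toy decoder. The sketch assumes the surface has already been sampled once per bit cell into a string of 'P' (pit) and 'L' (land); that representation and the function name are illustrative, and the channel coding a real CD applies on top of this rule is not modeled.

```python
def decode_transitions(surface):
    """Return 1 for each pit/land or land/pit boundary, 0 where nothing changes."""
    return [1 if left != right else 0 for left, right in zip(surface, surface[1:])]


print(decode_transitions("LLPPPLL"))  # [0, 1, 0, 0, 1, 0]
```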


Disk I/O Summary

• Disk is an important I/O device
• Disk must be fast and reliable
• Error handling is important; check checksum, called ECC (Error Correcting Code)
– Track a bad sector
– Substitute a spare for the bad sector
– Shift all sectors to bypass the bad one
• RAID protects against a few bad sectors, but does not protect against write errors laying down bad data
• Stable storage may be needed

