Post on 18-Jun-2018
Adapted from COD2e by Patterson & Hennessy, Chapter 8: I/O
Computer Architecture
Hebrew University, Spring 2001
Chapter 8–Input/Output
Big Picture: Where are We Now?
I/O Systems
[Figure: a computer made up of a processor (control and datapath), memory, and input and output devices]
Outline of Chapter Lectures
I/O Systems and Performance
Types and Characteristics of I/O Devices
– Magnetic Disks
– Graphic Displays
– Networks
Buses
– Bus Types and Bus Operation
– Bus Arbitration and How to Design a Bus Arbiter
  » some examples
Interfacing the OS and the I/O Device
– Operating System’s Role
– Delegating I/O Responsibility from the CPU
I/O System Design Issues
Performance
– throughput and latency
– fast devices and many devices
Expandability/Flexibility (number + variety of devices)
– wide performance spectrum
Resilience in the face of failure

Device          Behavior      Partner   Data Rate (KB/sec)
Keyboard        Input         Human          0.01
Mouse           Input         Human          0.02
Line Printer    Output        Human          1.00
Laser Printer   Output        Human        100.00
Graphics        Output        Human     30,000.00
Network-LAN     Input/Output  Machine      200.00
Floppy disk     Storage       Machine       50.00
Optical Disk    Storage       Machine      500.00
Magnetic Disk   Storage       Machine    2,000.00
Typical Desktop I/O System
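[Figure: typical desktop I/O system. A 200 MHz Pentium processor (pipeline and caches, 2.4 GB/sec) connects through the chipset to memory (528 MB/sec) and to a PCI bus (132 MB/sec). On the PCI bus sit the disk controller (disks), graphics controller (graphics), Ethernet controller, and USB hub controller; the USB (1.5 Mb/sec) connects the mouse, keyboard, and printer.]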
I/O System Performance
I/O system performance depends on many aspects of the system:
– The CPU
– The memory system:
  » Internal and external caches
  » Main memory
– The underlying interconnection (buses)
– The I/O controller
– The I/O device
– The speed of the I/O software
– The efficiency of the software’s use of the I/O devices
Two common performance metrics:
– Throughput: I/O bandwidth
– Response time: latency
Producer-Server Model
Throughput:
– The number of tasks completed by the server in unit time
– Highest possible throughput:
  » Server should never be idle
  » Queue should never be empty
Response time:
– Begins when a task is placed in the queue
– Ends when it is completed by the server
– To minimize response time:
  » Queue should be empty
  » Server will be idle whenever a task arrives
[Figure: Producer -> Queue -> Server]
Throughput versus Response Time
[Figure: response time in ms (axis marked 100, 200, 300) plotted against percentage of maximum throughput (20% to 100%)]
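The tension between throughput and response time can be made concrete with a small calculation. The sketch below assumes an M/M/1 queueing model (random arrivals, a single server), which the slides do not specify; the function name and the 10 ms service time are illustrative only.

```python
def mean_response_time_ms(service_time_ms, utilization):
    """Mean response time under the assumed M/M/1 model: R = S / (1 - rho)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time_ms / (1.0 - utilization)


if __name__ == "__main__":
    # Response time grows slowly at first, then sharply near full utilization.
    for pct in (20, 40, 60, 80, 90):
        r = mean_response_time_ms(10.0, pct / 100)
        print(f"{pct:3d}% of maximum throughput: {r:6.1f} ms")
```

Under this assumption, keeping the server busy all the time (maximum throughput) and keeping the queue empty (minimum response time) cannot both be achieved.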
Throughput Enhancement
In general, throughput can be improved by:
– Throwing more hardware at the problem
Response time is much harder to reduce:
– Ultimately it is limited by the speed of light
[Figure: one producer feeding two parallel queue/server pairs]
I/O Benchmarks for Magnetic Disks
Supercomputer application:
– Large-scale scientific problems
Transaction processing:
– Examples: airline reservations systems and banks
File system:
– Example: UNIX file system
Supercomputer I/O
Supercomputer I/O is dominated by:
– Access to large files on magnetic disks
Supercomputer I/O consists of:
– One large read (read in the data)
– Many writes to snapshot the state of the computation
– More output than input
The key supercomputer I/O measure is data throughput:
– Bytes/second transferred between disk and memory
Transaction Processing I/O
Transaction processing (TP):
– Examples: airline reservations systems, bank ATMs, inventory systems, purchasing systems, ...
– Many small changes to a large body of shared data
Transaction processing requirements:
– Throughput and response time are both important
  » response time: for users
  » throughput: for cost
– Must gracefully handle certain types of failure
Transaction processing chiefly focuses on I/O rate:
– The number of disk accesses per second
Each transaction in a typical TP system takes:
– Between 2 and 10 disk I/Os
– Between 5,000 and 20,000 CPU instructions per disk I/O
File System I/O
Measurements of UNIX file systems in an engineering environment:
– 80% of accesses are to files less than 10 KB
– 90% of all file accesses are to data with sequential addresses on the disk
– 67% of the accesses are reads
– 27% of the accesses are writes
– 6% of the accesses are read-write accesses
Magnetic Disk
Purpose:
– Long term, nonvolatile storage
– Large, inexpensive, and slow
– Lowest level in memory hierarchy
Two major types:
– Floppy disk
– Hard disk
Both types of disks:
– Rely on a rotating platter coated with a magnetic surface
– Use a moveable read/write head to access the disk
Advantages of hard disks over floppy disks:
– Platters are rigid (metal or glass) so they can be larger
– Higher density because it can be controlled more precisely
– Higher data rate because it spins faster
– Can incorporate more than one platter
[Figure: memory hierarchy, top to bottom: Registers, Cache, Memory, Disk]
Organization of Hard Disk
Typical numbers (depending on disk size):
– 500 to 2,000 tracks per surface
– 32 to 128 sectors per track
  » A sector is the smallest unit that can be read or written
Traditionally all tracks have the same number of sectors:
– Alternative, constant bit density: record more sectors on outer tracks
  » Being done in new disks
[Figure: disk platters, with a track and a sector marked]
Magnetic Disk Characteristics
Cylinder: all the tracks under the heads at a given point on all surfaces
Reading or writing data is a three-stage process:
– Seek time: position the arm over the proper track
– Rotational latency: wait for the desired sector to rotate under the read/write head
– Transfer time: transfer a block of bits (sector) under the read/write head
Average seek time as reported by the industry:
– Typically in the range of 12 ms to 20 ms
– (Sum of the time for all possible seeks) / (total # of possible seeks)
Due to locality of disk reference, actual average seek time:
– Typically only 25%–33% of advertised number
[Figure: disk geometry showing platter, track, sector, cylinder, and head]
Typical Performance of a Disk
Rotational latency:
– Most disks rotate at 3,600–5,400 RPM
– Approximately 11–17 ms per revolution
– Average latency to the desired sector is 1/2 around the disk: 6–8 ms
Transfer time is a function of:
– Transfer size (usually a sector): 1 KB / sector
– Rotation speed: 3,600 RPM to 5,400 RPM
– Recording density: typical diameter ranges from 2 to 14 in
– Typical values: 2 to 4 MB per second
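The seek, rotation, and transfer stages of a disk access can be added up numerically. This is a minimal sketch using the typical values from these slides; the function names and the particular combination of 12 ms seek, 3,600 RPM, and 4 MB/s are chosen only for illustration.

```python
def ms_per_revolution(rpm):
    """Time for one full rotation of the platter, in milliseconds."""
    return 60_000.0 / rpm


def average_access_time_ms(seek_ms, rpm, transfer_kb, rate_mb_per_s):
    """Seek time + average rotational latency (half a turn) + transfer time."""
    rotational_latency_ms = ms_per_revolution(rpm) / 2
    transfer_ms = transfer_kb / (rate_mb_per_s * 1024) * 1000
    return seek_ms + rotational_latency_ms + transfer_ms


if __name__ == "__main__":
    for rpm in (3600, 5400):
        print(f"{rpm} RPM: {ms_per_revolution(rpm):.1f} ms per revolution")
    # One 1 KB sector, advertised 12 ms seek, 3,600 RPM, 4 MB/s transfer rate.
    print(f"access time: {average_access_time_ms(12, 3600, 1, 4):.2f} ms")
```

With these numbers the seek and the rotational latency dominate; transferring a single sector takes well under a millisecond.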
Reliability and Availability
I/O systems often have fault-tolerance requirements:
– Must be able to get the data even if there is a failure
  » Available = data is available
  » Reliable = never any failures
Reliability and availability are often confused
Availability can be improved by adding hardware:
– Example: adding ECC on memory
Reliability can only be improved by:
– Bettering environmental conditions
– Building more reliable components
– Building with fewer components
  » Improved availability may come at the cost of lower reliability
Disk Arrays
Organize disks to take advantage of small, inexpensive disks:
– Arrange them in an array
– Increase throughput using many disk drives:
  » Data is spread over multiple disks
  » Multiple accesses are made to several disks
Reliability is lower than a single disk:
– But availability can be improved by adding redundant disks:
  » Lost information can be reconstructed from redundant information
– MTTR: mean time to repair is on the order of hours
– MTTF: mean time to failure of disks is three to five years
Graphics Displays
Raster Cathode Ray Tube (CRT) Displays:
– Resolution: (M pixels) x (N horizontal scan lines)
Typical Sizes:
– Studio Quality: 720 x 480
– High Resolution Workstation Monitor: 1280 x 1024
– High Definition TV (proposed standards):
  » U.S.A.: 1440 x 960
  » Others: 1920 x 1080
[Figure: raster display with M pixels per scan line and N scan lines]
Graphics Displays and Bit Map
M x N image represented by a bit map in memory:
– Black & White: 1 bit per pixel
  » The pixel is either ON or OFF. No longer widely used.
– Gray scale: 8 bits per pixel
  » Each pixel can have one of 256 shades between black and white
– Color: 24 bits per pixel
  » 8 bits for each of the primary colors: Red, Green, Blue
[Figure: bit map with M pixels per scan line and N scan lines]
Cost of Computer Graphics
Size of Frame Buffer = M x N x P
– M: Number of pixels per scan line
– N: Number of horizontal scan lines
– P: Number of bits per pixel
Color Frame Buffer for High Resolution Workstation:
– 1280 x 1024 x 24-bit = 3840 KB ~= 4 MB
The size of the color frame buffer can be reduced:
– Most pictures do not need the full palette (2^24) of possible colors
– Use a two-level representation
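The frame buffer formula M x N x P is easy to check numerically. A small sketch, with a hypothetical helper name, that reproduces the workstation figure above and shows the effect of fewer bits per pixel:

```python
def frame_buffer_bytes(m_pixels, n_lines, bits_per_pixel):
    """Frame buffer size M x N x P, converted from bits to bytes."""
    return m_pixels * n_lines * bits_per_pixel // 8


if __name__ == "__main__":
    for label, bits in (("black & white", 1), ("gray scale", 8), ("color", 24)):
        kb = frame_buffer_bytes(1280, 1024, bits) // 1024
        print(f"1280 x 1024, {label:13s} ({bits:2d} bits/pixel): {kb} KB")
```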
Color Map: Lower Cost of Frame Buffer
Color Map Size is 256-word x 24-bit:
– Only 256 different colors can appear on the screen at a time
The color map is loaded by the application program:
– Each picture can have its own palette of colors to choose from
[Figure: an M x N frame buffer in memory with 8 bits per pixel; each pixel value (e.g. P or Q) selects one word of the 256-word x 24-bit color map (Word 0 to Word 255), and that 24-bit word drives the color CRT display. A region x1..x2, y1..y2 of the frame buffer appears at the same position on the M x N screen.]
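The two-level representation can be sketched in a few lines: the frame buffer stores 8-bit indices, and the color map turns each index into a 24-bit color. The tiny 2 x 2 image, the palette entries, and the function name below are made up for illustration.

```python
# 256-word x 24-bit color map: each word is an (R, G, B) triple of 8-bit values.
color_map = [(0, 0, 0)] * 256
color_map[1] = (255, 0, 0)      # palette entry 1: red (illustrative)
color_map[2] = (0, 0, 255)      # palette entry 2: blue (illustrative)

# Frame buffer holds one 8-bit index per pixel instead of a 24-bit color.
frame_buffer = [
    [0, 1],
    [2, 1],
]


def resolve(frame_buffer, color_map):
    """Look up every pixel's index in the color map, as the display path would."""
    return [[color_map[index] for index in row] for row in frame_buffer]


if __name__ == "__main__":
    for row in resolve(frame_buffer, color_map):
        print(row)
```

Reloading `color_map` changes every displayed color without touching the frame buffer, which is how each picture can bring its own palette.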
Bandwidth for Frame Buffer
Assume no color map:
– Frame Buffer Size: 1280 x 1024 x 24 = 3840 KB ~= 4 MB
– Bandwidth Requirement: 4 MB x 60 Hz = 240 MB / sec
Assume N = 32-bit = 4-byte:
– Frame buffer needs to respond at:
  » 240 / 4 = 60 MHz => 16.6 ns
– This is faster than most of the low cost SRAM
– Another reason to use a color map!
[Figure: a 1280 x 1024 x 24 frame buffer feeds a DAC over an N-bit path to drive a 1280 x 1024 x 24-bit high-resolution color monitor at a 60 Hz refresh rate. What bandwidth is needed?]
Bandwidth for Frame Buffer (Cont.)
With a 256-word by 24-bit color map:
– Frame Buffer Size: 1280 x 1024 x 8 = 1280 KB
– Bandwidth Requirement: 1280 KB x 60 Hz = 75 MB / sec
Assume N = 32-bit = 4-byte:
– The frame buffer needs to respond at: 75 / 4 = 18.75 MHz => 53.3 ns
– DRAM is still too slow and SRAM is too expensive.
– Solution: Video RAM (VRAM)
[Figure: a 1280 x 1024 x 8 frame buffer feeds a 256-word x 24-bit color map over an N-bit path, followed by a DAC, to drive a 1280 x 1024 x 24-bit high-resolution color monitor at a 60 Hz refresh rate. What bandwidth is needed?]
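The bandwidth arithmetic for the frame buffer, with and without a color map, can be replayed as follows. This is a sketch with hypothetical helper names; like the slides, it treats 1 MB/sec as 10^6 bytes/sec when converting to an access rate, and it takes the rounded 4 MB (4096 KB) figure for the 24-bit case.

```python
def refresh_bandwidth_mb_s(frame_buffer_kb, refresh_hz):
    """Bandwidth needed to read the whole frame buffer once per refresh."""
    return frame_buffer_kb * refresh_hz / 1024


def cycle_time_ns(bandwidth_mb_s, port_bytes):
    """Time available per access when each access delivers port_bytes bytes."""
    accesses_per_second = bandwidth_mb_s * 1e6 / port_bytes
    return 1e9 / accesses_per_second


if __name__ == "__main__":
    cases = (("no color map (4 MB)", 4096), ("with color map (1280 KB)", 1280))
    for label, size_kb in cases:
        bw = refresh_bandwidth_mb_s(size_kb, 60)
        print(f"{label}: {bw:.0f} MB/sec, {cycle_time_ns(bw, 4):.1f} ns per 4-byte access")
```

Using the unrounded 3840 KB instead of 4 MB gives 225 MB/sec, so the 240 MB/sec figure is slightly pessimistic.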
Video Random Access Memory (VRAM)
A high-speed shift register:
– Holds one row of DRAM
Random Port:
– Accesses the DRAM array
– Works like regular DRAM (and just as slow)
Serial Port:
– Accesses the shift register
[Figure: an N-row x N-column DRAM array, addressed by row and column addresses, with an M-bit random port; a row is loaded into an N x M shift register that feeds the M-bit serial port]
ABCs of Networks
Starting point: send bits between 2 computers
FIFO queue on each end
Can send both ways (“Full Duplex”)
Rules for communication: a “protocol”
– May involve a variety of control information
  » request/response
  » size of message
  » type of message
A Simple Example
What is the format of a packet?
– packet = unit of maximum size that makes up all messages
  » routing of a packet is independent of other packets
– Fixed? Number of bytes?
  » fixed or variable size
Packet fields:
– Request/Response: 1 bit
  » 0: Please send data from Address
  » 1: Data corresponding to request
– Address/Data: 32 bits
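The 1-bit plus 32-bit packet from this example can be packed into a single 33-bit integer. A minimal sketch; placing the request/response bit above the 32-bit field is an assumption, as are the function and constant names.

```python
REQUEST = 0    # "Please send data from Address"
RESPONSE = 1   # "Data corresponding to request"


def pack(kind, payload):
    """Build a 33-bit packet: 1 request/response bit, then 32 address/data bits."""
    if kind not in (REQUEST, RESPONSE):
        raise ValueError("kind must be 0 or 1")
    if not 0 <= payload < 2 ** 32:
        raise ValueError("payload must fit in 32 bits")
    return (kind << 32) | payload


def unpack(packet):
    """Split a 33-bit packet back into (kind, payload)."""
    return packet >> 32, packet & 0xFFFFFFFF


if __name__ == "__main__":
    request = pack(REQUEST, 0x0000_1000)      # ask for the word at address 0x1000
    response = pack(RESPONSE, 0xDEAD_BEEF)    # reply carrying the data
    print(unpack(request), unpack(response))
```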
Example: Ethernet (IEEE 802.3)
Essentially a 10 Mb/s, 1-wire bus with no central control
– Collision-based protocol with carrier sense
  » Listen. If nobody is talking, go ahead and talk.
  » If you hear anybody else talking, stop and try again later.
– Binary exponential backoff
  » increase wait time by 2x each collision
Recently:
– 100 Mbit (and 1 Gbit coming)
– switched organization
[Figure: computers (or repeaters) attach through transceiver cables (50 m) to transceivers, which detect collisions, on a 50 ohm coax cable (10 Mbps, 500 m). Frames are Manchester encoded and separated by idle periods and contention intervals; each contention slot is 2t = worst-case round trip time = 512 bit times = 51.2 µs.]
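Binary exponential backoff can be sketched as below. The slide only says the wait doubles per collision; picking a random number of slots from the doubled window and capping the exponent at 10 follow the classic IEEE 802.3 scheme and are not stated on the slide. The slot time is 512 bit times at 10 Mb/s, that is 51.2 microseconds. All names are illustrative.

```python
import random

SLOT_TIME_US = 51.2   # 512 bit times at 10 Mb/s


def backoff_window_us(collisions, max_exponent=10):
    """Size of the waiting window after `collisions` collisions; doubles each time."""
    return (2 ** min(collisions, max_exponent)) * SLOT_TIME_US


def backoff_slots(collisions, rng, max_exponent=10):
    """Pick how many slots to wait: uniform over [0, 2^k) with k capped."""
    k = min(collisions, max_exponent)
    return rng.randrange(2 ** k)


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so runs are repeatable
    for n in range(1, 6):
        wait = backoff_slots(n, rng) * SLOT_TIME_US
        print(f"collision {n}: window {backoff_window_us(n):7.1f} us, wait {wait:7.1f} us")
```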
Buses: Connecting I/O to the System
Bus–a shared communication link– uses one set of wires to connect multiple subsystems– as opposed to a point-to-point link
– connects processor to I/O devices as well as connecting I/O devices to memory
» processor may get data/commands from CPU or memory
Advantages of Buses
Versatility:
– New devices can be added easily
– Peripherals can be moved between computer systems that use the same bus standard
Low Cost:
– A single set of wires is shared in multiple ways
[Figure: processor, memory, and three I/O devices attached to one shared bus]
Disadvantages of Buses
Buses create a communication bottleneck
– Bandwidth of the bus can limit the maximum I/O throughput
The maximum bus speed is largely limited by:
– The length of the bus
– The number of devices on the bus
– The need to support a range of devices with:
  » Widely varying latencies
  » Widely varying data transfer rates
General Organization of a Bus
Control lines:
– Signal requests and acknowledgments
– Indicate what type of information is on the data lines
Data lines: carry information between source and destination:
– Data and addresses
– Complex commands
A bus transaction includes two parts:
– Sending the address
– Receiving or sending the data
[Figure: a bus made up of data lines and control lines]
Master versus Slave
A bus transaction includes two parts:
– Sending the address
– Receiving or sending the data
Master is the one who starts the bus transaction by:
– Sending the address
Slave is the one who responds to the address by:
– Sending data to the master if the master asks for data
– Receiving data from the master if the master wants to send data
[Figure: the bus master sends the address to the bus slave; data can go either way]
Output Operation
Output = processor sending data to the I/O device
Step 1: Request Memory
– Control lines: Memory Read Request
– Data lines: Memory Address
Step 2: Read Memory
– Memory places the requested data on the data lines
Step 3: Send Data to I/O Device
– Control lines: Device Write Request
– Data lines: I/O Device Address and then Data
[Figure: processor, memory, and I/O device (disk) on a shared bus, drawn once per step]
Input Operation
Input = processor receiving data from the I/O device
Step 1: Request Memory
– Control lines: Memory Write Request
– Data lines: Memory Address
Step 2: Receive Data
– Control lines: I/O Read Request
– Data lines: I/O Device Address and then Data
[Figure: processor, memory, and I/O device (disk) on a shared bus, drawn once per step]
Types of Buses
Processor-Memory Bus (design specific)
– Short and high speed
– Only needs to match the memory system
  » Maximize memory-to-processor bandwidth (cache transfers)
– Connects directly to the processor
I/O Bus (industry standard)
– Usually lengthy and slower
– Needs to match a wide range of I/O devices (cost and speed)
– Connects to the processor-memory bus or backplane bus
Backplane Bus (often industry standard)
– Backplane: an interconnection structure within the chassis
– Allows processors, memory, and I/O devices to coexist
– Cost advantage: one single bus for all components
One Bus System: Backplane Bus
A single bus (the backplane bus) is used for:
– Processor to memory communication
– Communication between I/O devices and memory
Advantages: simple and low cost
Disadvantages: slow, and the bus can become a major bottleneck
Example: IBM PC: ISA
[Figure: processor, memory, and I/O devices all attached to the backplane bus]
A Two-Bus System
I/O buses tap into the processor-memory bus via bus adaptors:
– Processor-memory bus: mainly processor-memory traffic
– I/O buses: provide expansion slots for I/O devices
Apple Macintosh-II
– NuBus: processor, memory, and a few selected I/O devices
– SCSI Bus: the rest of the I/O devices
[Figure: processor and memory on the processor-memory bus; three bus adaptors each connect it to an I/O bus]
A Three-Bus System
Backplane buses tap into the processor-memory bus
– Processor-memory bus is used for processor-memory traffic
– I/O buses are connected to the backplane bus
Advantage: load on the processor bus is greatly reduced
Many new PCs use this organization
[Figure: processor and memory on the processor-memory bus; a bus adaptor connects it to a backplane bus, and further bus adaptors connect the backplane bus to I/O buses]
Synchronous & Asynchronous Bus
Synchronous Bus:
– Includes a clock in the control lines
– Fixed protocol for communication, relative to the clock
– Advantage: involves very little logic and can run very fast
– Disadvantages:
  » Every device on the bus must run at the same clock rate
  » To avoid clock skew, a fast bus cannot be long
Asynchronous Bus:
– It is not clocked
– It can accommodate a wide range of devices
– It can be lengthened without worrying about clock skew
– It requires a handshaking protocol
A Handshaking Protocol
Three control lines
– ReadReq: indicates a read request for memory
  » Address is put on the data lines at the same time
– DataRdy: indicates data word is ready on data lines
  » Data is put on data lines at the same time
– Ack: acknowledges ReadReq or DataRdy of the other party
[Figure: timing diagram of ReadReq, the data lines (address, then data), Ack, and DataRdy, with the signal transitions numbered 1 to 7]
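One read over the three control lines can be traced step by step. The sketch below follows the seven-step sequence as described in the textbook for this protocol; mapping it onto the numbers in the timing diagram is my reading, and the function and variable names are hypothetical.

```python
def handshake_read(memory, address):
    """Simulate one asynchronous read; return (data, trace of line states)."""
    lines = {"ReadReq": 0, "DataRdy": 0, "Ack": 0, "data_lines": None}
    trace = []

    def step(who, action, **changes):
        lines.update(changes)
        trace.append((who, action, dict(lines)))

    step("device", "put address on data lines, raise ReadReq",
         ReadReq=1, data_lines=address)
    latched_address = lines["data_lines"]
    step("memory", "1: see ReadReq, latch address, raise Ack", Ack=1)
    step("device", "2: see Ack, release ReadReq and data lines",
         ReadReq=0, data_lines=None)
    step("memory", "3: see ReadReq low, drop Ack", Ack=0)
    step("memory", "4: put data on data lines, raise DataRdy",
         DataRdy=1, data_lines=memory[latched_address])
    received = lines["data_lines"]
    step("device", "5: see DataRdy, read data, raise Ack", Ack=1)
    step("memory", "6: see Ack, drop DataRdy, release data lines",
         DataRdy=0, data_lines=None)
    step("device", "7: see DataRdy low, drop Ack", Ack=0)
    return received, trace


if __name__ == "__main__":
    data, trace = handshake_read({0x10: 42}, 0x10)
    for who, action, state in trace:
        print(f"{who:6s} {action:45s} {state}")
    print("data read:", data)
```

Neither side advances until it sees the other side's signal change, which is why no shared clock is needed.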
Increasing Bus Bandwidth
Separate versus multiplexed address and data lines:
– Address and data can be transmitted in one bus cycle if separate address and data lines are available
– Cost: (a) more bus lines, (b) increased complexity
Data bus width:
– With a wider data bus, transfers of multiple words require fewer bus cycles
– Cost: more bus lines
Block transfers:
– Transfer multiple words in back-to-back bus cycles
– Only one address needs to be sent at the beginning
– The bus is not released until the last word is transferred
– Cost: (a) increased complexity, (b) longer response time for other requests
Split transaction: free up the bus when not transferring
– Cost: complexity, potentially higher latency
Obtaining Access to the Bus
One of the most important issues in bus design:
– How is the bus reserved by a device that wishes to use it?
Chaos is avoided by a master-slave arrangement:
– Only the bus master can control access to the bus:
  » It initiates and controls all bus requests
– A slave responds to read and write requests
The simplest system:
– Processor is the only bus master
– All bus requests must be controlled by the processor
– Major drawback: processor is involved in every transaction
[Figure: the bus master initiates requests to the bus slave over the control lines; data can go either way]
Multiple Masters => Arbitration
Bus arbitration scheme:
– A bus master wanting to use the bus asserts “bus request”
– A bus master cannot use the bus until its request is granted
– A bus master must signal the arbiter when it has finished using the bus
Bus arbitration schemes try to balance two factors:
– Bus priority: the highest-priority device should be serviced first
– Fairness: even the lowest-priority device should never be completely locked out
Bus arbitration schemes fall into four broad classes:
– Distributed arbitration by self-selection: each device wanting the bus places a code indicating its identity on the bus
– Distributed arbitration by collision detection: as in Ethernet
– Daisy chain arbitration: most common in I/O
– Centralized, parallel arbitration: a central arbiter chooses among the requesting devices
Daisy Chain Arbitration Scheme
Advantage: simple
Disadvantages:
– Cannot assure fairness:
  » A low-priority device may be locked out indefinitely
– The use of the “daisy chain” grant signal limits bus speed
[Figure: the bus arbiter's grant line passes from Device 1 (highest priority) through Device 2 to Device N (lowest priority); the release and request lines are shared by all devices]
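Daisy chain arbitration and its fairness problem can be shown with a few lines. This is a simplified model, with hypothetical names, in which the grant is passed down the chain and taken by the first device that is requesting the bus.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that takes the grant, or None.

    requests[0] is the device closest to the arbiter (highest priority).
    """
    for index, requesting in enumerate(requests):
        if requesting:
            return index        # this device keeps the grant; it goes no further
    return None                 # nobody asked for the bus


if __name__ == "__main__":
    print(daisy_chain_grant([False, True, True]))   # device 1 wins over device 2
    # If device 0 requests in every round, device 2 is never served.
    rounds = [daisy_chain_grant([True, False, True]) for _ in range(5)]
    print(rounds)
```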
Bus Example: PCI
Clock at 33 MHz (with extension to 66 MHz) [CLK]
Central arbitration [REQ#, GNT#]
– Overlapped with previous transaction
Multiplexed Address/Data
– 32 lines (with extension to 64) [AD]
General Protocol
– Transaction type (bus command) [C/BE#]
– Address handshake and duration [FRAME#, TRDY#]
– Data width (byte enable) [C/BE#]
– Variable-length data block handshake [IRDY#, TRDY#]
Maximum Bandwidth is 132 MB/s
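The 132 MB/s figure follows from the clock rate and the bus width, assuming one transfer per clock cycle. A quick check, with a hypothetical helper name, that also covers the 66 MHz and 64-bit extensions mentioned above:

```python
def peak_bandwidth_mb_s(clock_mhz, width_bits):
    """Peak bandwidth assuming one full-width transfer every clock cycle."""
    return clock_mhz * width_bits // 8


if __name__ == "__main__":
    for clock_mhz, width_bits in ((33, 32), (66, 32), (33, 64), (66, 64)):
        mb_s = peak_bandwidth_mb_s(clock_mhz, width_bits)
        print(f"{clock_mhz} MHz x {width_bits} bits: {mb_s} MB/s")
```

This is a peak rate; address cycles on the multiplexed lines and arbitration reduce what a real transfer achieves.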
Example: USB (Universal Serial Bus)
Targeted at low-cost, low-bandwidth peripherals
– Software control and hardware logic are complex
  » But these are inexpensive for high volume
– Advantages for configurability, expandability, and ease of use
  » cables are inexpensive and easy to connect
Clock at 1.5 MHz (with option for 12 MHz)
Serial bus with a single signal
– Differential pair for noise tolerance
– Clock encoded with data using Non-Return to Zero, Inverted (NRZI)
  » Each signal transition corresponds to a 0 bit
  » After 6 consecutive 1’s, a 0 is stuffed to force a transition
– Star topology:
  » connections routed point-to-point and through hubs
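The NRZI encoding and bit stuffing described above can be sketched as two small functions. The names and the choice of starting line level are illustrative; the rules (a transition for every 0, a stuffed 0 after six consecutive 1s) are the ones stated on the slide.

```python
def bit_stuff(bits):
    """Insert a 0 after every run of six consecutive 1s."""
    out, ones = [], 0
    for bit in bits:
        out.append(bit)
        if bit == 1:
            ones += 1
            if ones == 6:
                out.append(0)   # stuffed bit forces a transition on the wire
                ones = 0
        else:
            ones = 0
    return out


def nrzi_encode(bits, initial_level=1):
    """NRZI: toggle the line level for a 0 bit, hold it for a 1 bit."""
    level, levels = initial_level, []
    for bit in bits:
        if bit == 0:
            level ^= 1
        levels.append(level)
    return levels


if __name__ == "__main__":
    data = [1, 1, 1, 1, 1, 1, 1, 0]
    stuffed = bit_stuff(data)
    print("stuffed:", stuffed)
    print("levels: ", nrzi_encode(stuffed))
```

Because a long run of 1s would otherwise leave the line unchanged, the stuffed 0 guarantees the receiver sees a transition often enough to recover the clock.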
More on USB
General Protocol
– Transactions are packed into 1 ms frames
– Each transaction consists of three phases
  » Token specifies device and transaction type
  » Data transfer up to 1 KB
  » Handshake to ensure reliable transfer
– Software schedules transactions
  » hardware arbitration is unnecessary
Maximum bandwidth 1.5 Mb/s (with option for 12 Mb/s)
USB System Configuration
Figure from Universal Serial Bus System Architecture
USB Signaling
Figure from Universal Serial Bus System Architecture
Data encoded on a differential-pair signal line allows the clock to be recovered.
Operating System Tasks
Operating system acts as the interface between:
– The I/O hardware and the program that requests I/O
Three characteristics of I/O systems:
– The I/O system is shared by multiple programs using the processor
– I/O systems often use interrupts (externally generated exceptions) to communicate information about I/O operations
  » Interrupts must be handled by the OS because they cause a transfer to supervisor mode
– The low-level control of an I/O device is complex:
  » Managing a set of concurrent events
  » The requirements for correct device control are very detailed
Operating System Requirements
Provide protection for shared I/O resources
– Guarantee that a user’s program can only access the portions of an I/O device to which the user has rights
Provide abstraction for accessing devices:
– Supply routines that handle low-level device operation
Handle the interrupts generated by I/O devices
Provide equitable access to the shared I/O resources
– All user programs must have equal access to the I/O resources
Schedule accesses in order to enhance system throughput
OS–I/O Communication
The Operating System must be able to prevent:
– User programs from communicating with I/O devices directly
If user programs could perform I/O directly:
– Protection of shared I/O resources could not be provided
Three types of communication are required:
1. OS must be able to give commands to I/O devices
2. I/O device must be able to notify OS when it has completed an operation or has encountered an error
3. Data must be transferred between memory and an I/O device
Giving Commands to I/O Devices
Two methods are used to address the device:
– Special I/O instructions
– Memory-mapped I/O
Special I/O instructions specify:
– Both the device number and the command word
  » Device number: the processor communicates this via a set of wires normally included as part of the I/O bus
  » Command word: usually sent on the bus’s data lines
  » Used in Intel x86 and IBM 360
Memory-mapped I/O:
– Portions of the address space are assigned to I/O devices
– Reads and writes to those addresses are interpreted as commands to the I/O devices
– User programs are prevented from issuing I/O operations directly:
  » I/O address space is protected by the address translation
  » Used in MIPS, SPARC, etc.
I/O Device Notifying the OS
The OS needs to know when:
– The I/O device has completed an operation
– The I/O operation has encountered an error
This can be accomplished in two different ways:
– Polling:
  » The I/O device puts information in a status register
  » The OS periodically checks the status register, which costs overhead
– I/O Interrupt:
  » Whenever an I/O device needs attention from the processor, it interrupts the processor from what it is currently doing
  » The interrupt must tell the OS which device raised it: done with either a register or an interrupt vector
  » An I/O interrupt is asynchronous: the processor can wait until the end of the current instruction before recognizing it
– Interrupts may have different priorities, which needs hardware support
Programmed I/O and Polling
Advantage:
– Simple: processor is totally in control and does all the work
Disadvantage:
– Polling overhead and data transfer consume CPU time
Example: output operation
[Figure: CPU, memory, I/O controller (IOC), and device, with a flowchart of the output loop: read data from memory; is the device ready? if no, keep checking; if yes, store data to device; done? if no, repeat. The busy-wait loop is not an efficient way to use the CPU unless the device is very fast, and using the processor may be an inefficient way to transfer data.]
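The programmed-I/O output loop can be simulated to show where the CPU time goes. In this sketch the device is a made-up class whose status register reports ready only after a fixed number of polls; all names are hypothetical.

```python
class SimulatedDevice:
    """Toy output device: busy for a fixed number of status checks per word."""

    def __init__(self, polls_until_ready):
        self._polls_until_ready = polls_until_ready
        self._countdown = polls_until_ready
        self.received = []

    def status_ready(self):
        """Read the status register; each read while busy counts the device down."""
        if self._countdown > 0:
            self._countdown -= 1
            return False
        return True

    def write(self, word):
        self.received.append(word)
        self._countdown = self._polls_until_ready   # device is busy again


def programmed_output(memory, device):
    """Send every word in memory to the device; return the number of wasted polls."""
    wasted_polls = 0
    for word in memory:                      # read data from memory
        while not device.status_ready():     # busy-wait until the device is ready
            wasted_polls += 1
        device.write(word)                   # store data to device
    return wasted_polls


if __name__ == "__main__":
    device = SimulatedDevice(polls_until_ready=3)
    wasted = programmed_output([10, 20, 30], device)
    print("device received:", device.received, "wasted polls:", wasted)
```

Every wasted poll is CPU time that an interrupt-driven scheme would have left to the user program.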
Interrupt Driven Data Transfer
Advantage:
– User program progress is only halted during the actual transfer
Disadvantage: special hardware is needed to:
– Cause an interrupt (I/O device)
– Detect an interrupt (processor)
– Save the proper state to resume after the interrupt (processor)
[Figure: CPU, memory, I/O controller (IOC), and device. While the user program (add, sub, and, or, nop) runs: (1) I/O interrupt, (2) save PC, (3) jump to the interrupt service address, (4) the interrupt service routine (read, store, ..., rti) runs from memory]
Programmer’s View
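[Figure: the main program (Add, Sub, Div, ...) is running when (1) an interrupt request arrives (e.g., from the keyboard); (2) the processor saves the PC and “branches” to the interrupt target address, where the handler saves processor status/state, services the (keyboard) interrupt, and restores processor status/state; (3) the saved PC is retrieved and the main program resumes]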
Delegating I/O Handling: DMA
Direct Memory Access (DMA):
– External to the CPU
– Acts as a master on the bus
– Transfers blocks of data to or from memory without CPU intervention
The CPU sends a starting address, direction, and length count to the DMA controller (DMAC), then issues "start".
The DMAC provides handshake signals for the peripheral controller, and memory addresses and handshake signals for memory.
[Figure: CPU, memory, DMAC, I/O controller (IOC), and device]
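The division of labor between CPU and DMA controller can be sketched as follows. The class, the direction strings, and modelling the device as a Python list are all illustrative; raising a completion flag at the end is an assumption about how the CPU learns the transfer finished.

```python
class DMAController:
    """Toy DMAC: moves a block between memory and a device without the CPU."""

    def __init__(self, memory):
        self.memory = memory
        self.interrupt_pending = False

    def start(self, start_address, direction, count, device):
        """Run one block transfer using the parameters the CPU supplied."""
        if direction not in ("to_memory", "from_memory"):
            raise ValueError("direction must be 'to_memory' or 'from_memory'")
        for offset in range(count):
            if direction == "to_memory":
                self.memory[start_address + offset] = device.pop(0)
            else:
                device.append(self.memory[start_address + offset])
        self.interrupt_pending = True   # assumed: signal completion to the CPU


if __name__ == "__main__":
    memory = [0] * 8
    dmac = DMAController(memory)
    disk_data = [7, 8, 9]
    # The CPU's only work: starting address, direction, length count, then "start".
    dmac.start(2, "to_memory", 3, disk_data)
    print(memory, dmac.interrupt_pending)
```

The loop inside `start` stands in for work the DMAC hardware does on the bus while the CPU runs other instructions.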
Delegating I/O Handling: IOP
[Figure: CPU, IOP, and memory share the main memory bus; the IOP connects devices D1, D2, ..., Dn over an I/O bus]
(1) CPU issues an instruction to the IOP: OP | Device Address (the target device, and where the commands are)
(2) IOP looks in memory for commands: OP | Addr | Cnt | Other (what to do, where to put data, how much, special requests)
(3) Device to/from memory transfers are controlled by the IOP directly. The IOP steals memory cycles.
(4) IOP interrupts the CPU when done
DMA vs. IOP? How is the I/O program stored and accessed?
Summary
I/O Performance: many factors
– expandability, latency, bandwidth
I/O Devices: wide spectrum
– disks, graphics, networks
Buses
– different types of buses:
  » Processor-memory buses, I/O buses, backplane buses
– Bus arbitration schemes:
  » Daisy chain arbitration: cheap, but cannot assure fairness
  » Centralized parallel arbitration: requires a central arbiter
OS and I/O
– communication: polling and interrupts
– Handling I/O outside the CPU: DMA, I/O processor