Date post: 17-Jan-2016
Page 1: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Computer Architecture

Chang-Bum Lee

Dept. of Computer Engineering, Youngsan University

Computer Architecture 1


Course Content (1)

Lecture #1: Course Overview
- Course Contents
- Course Schedule
- Grading Guidelines
- Test and Assignments

Lecture #2: Basic Architecture of Computer
- Basic Architecture
- System Configuration

Lecture #3: Instruction Execution
- Fetch Cycle
- Execution Cycle
- Interrupt Cycle


Course Content (2)

Lecture #4, 5: Instruction Set
- Program Control
- Instruction Formats
- Addressing Modes
- Pentium Processors

Lecture #6, 7: Arithmetic and Logical Operations
- Arithmetic and Logical Unit
- Integer Representation
- Logic Operations
- Shift Operations
- Arithmetic Operations of Integer (Addition, Subtraction, Multiplication, and Division)


Course Content (3)

Lecture #8, 9: Real Numbers
- Representation of Floating Point Numbers
- Arithmetic Operations of Floating Point Numbers (Addition, Subtraction, Multiplication, and Division)

Lecture #10: Control Unit
- Structure of Control Unit
- Microinstruction
- Microprogram

Lecture #11: Memory Devices
- Memory Hierarchy
- RAM
- ROM
- Design of Memory Device Modules


Course Content (4)

Lecture #12: Cache Memory
- Cache Size
- Fetch Method
- Mapping


Computer Architecture: Course Overview

Lecture #1


Course Objectives

- Understand the role and relationship of hardware and software
- Exposure to:
  - Machine organization
  - Assembly language programming
  - C programming
- Be able to actually build an entire (slow) computing system
  - Hardware and software
- Be distinguished from mere programmers


Course Schedule

- The complete course, including lectures and seminars, will be covered in 90 hours (15 weeks).
- The total duration of the course will be 4 months.
- Lectures: 3 hours weekly (2 hours + 1 hour)


Grading Guidelines

- Attendance: 20%
  - Depends on the student's class participation
- Final Exam: 40%
  - Textbook-based, in-class final exam
- Midterm Exam: 30%
  - Textbook-based, in-class midterm exam
- Assignments: 10%
  - Based on submitted assignments


Course References

- Textbook: Computer Architecture, by Jong-Hyun Kim (Sang Lung Publishing Corp.)
- The course slides will be available at http://prof.ysu.ac.kr/blog/postlist.asp?b_id=cblee


Course Summary

- Introduction to computer architecture
  - How is data represented?
  - What are the pieces of a computer?
  - How do computers work?
- Programming
  - How do I "talk" directly to the machine?
  - How do I program in C?
- Computer Systems and Computation
  - How do simple HW/SW elements come together to realize complex computations?


Computer Architecture: Basic Architecture

Lecture #2


Introduction - Architecture (1)

- Architecture is those attributes visible to the programmer.
  - Instruction set, number of bits used for data representation, I/O mechanisms, addressing techniques
  - e.g. Is there a multiply instruction?
- Organization is how features are implemented.
  - Control signals, interfaces, memory technology
  - e.g. Is there a hardware multiply unit or is it done by repeated addition?


Introduction - Architecture (2)

- All members of the Intel x86 family share the same basic architecture.
- All members of the IBM System/370 family share the same basic architecture.
- This gives code compatibility.
  - At least backwards
- Organization differs between different versions.


Structure & Function

- Structure is the way in which components relate to each other.
- Function is the operation of individual components as part of the structure.
- All computer functions are:
  - Data processing
  - Data storage
  - Data movement
  - Control


ENIAC

- Electronic Numerical Integrator And Computer
- Eckert and Mauchly at the University of Pennsylvania
- Trajectory tables for weapons
- Started 1943, finished 1946
  - Too late for war effort
- Used until 1955
- Decimal (not binary)
- 20 accumulators of 10 digits
- Programmed manually by switches
- 18,000 vacuum tubes, 30 tons
- 15,000 square feet
- 140 kW power consumption
- 5,000 additions per second


Structure of von Neumann Machine

- Stored program concept
- Main memory storing programs and data
- ALU operating on binary data
- Control unit interpreting instructions from memory and executing them
- Input and output equipment operated by control unit
- Princeton Institute for Advanced Studies (IAS)
  - Completed 1952


Transistor Based Computers

- Transistors
  - Replaced vacuum tubes
  - Smaller
  - Cheaper
  - Less heat dissipation
  - Solid-state device
  - Made from silicon (sand)
  - Invented 1947 at Bell Labs
  - William Shockley et al.
- Transistor-based computers
  - Second generation machines
  - NCR & RCA produced small transistor machines
  - IBM 7000
  - DEC (1957)
    - Produced PDP-1


Speeding It Up & Performance Mismatch

- Speeding it up
  - Pipelining
  - On-board cache (L1 & L2 cache)
  - Branch prediction
  - Data flow analysis
  - Speculative execution
- Performance mismatch
  - Processor speed increased
  - Memory capacity increased
  - Memory speed lags behind processor speed


Solutions

- Increase number of bits retrieved at one time.
  - Make DRAM “wider” rather than “deeper”
- Change DRAM interface.
  - Cache
- Reduce frequency of memory access.
  - More complex cache and cache on chip
- Increase interconnection bandwidth.
  - High speed buses
  - Hierarchy of buses


Program Concept

- Hardwired systems are inflexible.
- General purpose hardware can do different tasks, given correct control signals.
- Instead of re-wiring, supply a new set of control signals.
- A program is a sequence of steps.
  - For each step, an arithmetic or logical operation is done.
  - For each operation, a different set of control signals is needed.


Computer Components

- The Control Unit and the Arithmetic and Logic Unit constitute the Central Processing Unit.
- Data and instructions need to get into the system and results out.
  - Input/output
- Temporary storage of code and results is needed.
  - Main memory


Computer Architecture: CPU Structures and Functions

Lecture #3


CPU Structure

- CPU must:
  - Fetch instructions
  - Interpret instructions
  - Fetch data, process data, and write data
- Registers
  - CPU must have some working space (temporary storage)
  - Number and function vary between processor designs
  - One of the major design decisions
  - Top level of memory hierarchy
- Control Unit
  - Control unit coordinates sequence of execution steps
- ALU
  - ALU performs arithmetic and logical processing

[Figure: CPU block diagram showing Registers, ALU, Control Unit, the CPU internal bus, and the system bus (address bus, data bus, control bus)]


CPU Structure

[Figure: diagram with the labels Instruction Set, Software, and Hardware]


Fetch Cycle (1)

- Program Counter (PC) holds address of next instruction to fetch.
- Processor fetches instruction from memory location pointed to by PC.
- Increment PC
  - Unless told otherwise
- Instruction loaded into Instruction Register (IR)
  - t0: MAR <- PC
  - t1: MBR <- M[MAR], PC <- PC+1
  - t2: IR <- MBR
- Processor interprets instruction and performs required actions
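
The three fetch micro-operations can be traced in a few lines of Python. This is only a minimal sketch: the dictionary-based CPU state, the `fetch` function name, and the sample memory contents are assumptions made for illustration, not part of the course material.

```python
def fetch(cpu, memory):
    """Run the fetch-cycle micro-operations t0..t2 on a dict-based CPU state."""
    cpu["MAR"] = cpu["PC"]             # t0: MAR <- PC
    cpu["MBR"] = memory[cpu["MAR"]]    # t1: MBR <- M[MAR]
    cpu["PC"] += 1                     #     PC  <- PC + 1 (same time step)
    cpu["IR"] = cpu["MBR"]             # t2: IR  <- MBR
    return cpu["IR"]


# Hypothetical program: two instructions stored at addresses 100 and 101.
memory = {100: "LOAD 250", 101: "ADD 251"}
cpu = {"PC": 100, "MAR": 0, "MBR": 0, "IR": None}

first = fetch(cpu, memory)
print(first, cpu["PC"])
```

After one call the IR holds the instruction from address 100 and the PC already points at the next instruction.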


Fetch Cycle (2)

- Micro-operations
  - t0: MAR <- PC
  - t1: MBR <- M[MAR], PC <- PC+1
  - t2: IR <- MBR

[Figure: address and instruction flow in the fetch cycle, showing the Control Unit, Memory Devices, address bus, control bus, and data bus]


Execute Cycle (1)

- Processor-memory
  - Data transfer between CPU and main memory
- Processor-I/O
  - Data transfer between CPU and I/O module
- Data processing
  - Some arithmetic or logical operation on data
- Control
  - Alteration of sequence of operations
  - e.g. jump
- Combination of above


Execute Cycle (2)

- Example
  - LOAD addr:
    - t0: MAR <- IR(addr)
    - t1: MBR <- M[MAR]
    - t2: AC <- MBR
  - STA addr
  - ADD addr

[Figure: execute-cycle data flow, showing the Control Unit, Memory Devices, address bus, control bus, and data bus]
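
The execute-cycle micro-operations can be sketched the same way. Only the LOAD sequence is given in the slide; the STA and ADD sequences below are assumptions based on the usual accumulator-machine convention, and the `execute` function and the dictionary-based state are hypothetical names chosen for illustration.

```python
def execute(cpu, memory, opcode, addr):
    """Execute one accumulator-machine instruction as micro-operations."""
    cpu["MAR"] = addr                        # t0: MAR <- IR(addr)
    if opcode == "LOAD":
        cpu["MBR"] = memory[cpu["MAR"]]      # t1: MBR <- M[MAR]
        cpu["AC"] = cpu["MBR"]               # t2: AC  <- MBR
    elif opcode == "ADD":                    # assumed sequence, not in the slide
        cpu["MBR"] = memory[cpu["MAR"]]      # t1: MBR <- M[MAR]
        cpu["AC"] = cpu["AC"] + cpu["MBR"]   # t2: AC  <- AC + MBR
    elif opcode == "STA":                    # assumed sequence, not in the slide
        cpu["MBR"] = cpu["AC"]               # t1: MBR <- AC
        memory[cpu["MAR"]] = cpu["MBR"]      # t2: M[MAR] <- MBR
    else:
        raise ValueError(f"unknown opcode: {opcode}")


# Hypothetical data: compute M[252] = M[250] + M[251].
memory = {250: 7, 251: 5, 252: 0}
cpu = {"AC": 0, "MAR": 0, "MBR": 0}
for opcode, addr in [("LOAD", 250), ("ADD", 251), ("STA", 252)]:
    execute(cpu, memory, opcode, addr)
print(memory[252])
```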


Interrupt Cycle

- Added to instruction cycle
- Processor checks for interrupt
  - Indicated by an interrupt signal
- If no interrupt, fetch next instruction
- If interrupt pending:
  - Suspend execution of current program
  - Save context
  - Set PC to start address of interrupt handler routine
  - Process interrupt
  - Restore context and continue interrupted program
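
The interrupt check can be sketched as one extra step at the end of each instruction cycle. This is a simplified illustration: here the "context" is just a copy of the register dictionary, and the function names and the handler address 500 are hypothetical.

```python
def check_interrupt(cpu, pending, handler_address, saved_contexts):
    """Interrupt cycle: if an interrupt is pending, save context and jump."""
    if not pending:
        return False                      # no interrupt: fetch next instruction
    saved_contexts.append(dict(cpu))      # save context (copy of all registers)
    cpu["PC"] = handler_address           # PC <- start address of handler
    return True


def return_from_interrupt(cpu, saved_contexts):
    """Restore the most recently saved context and resume the program."""
    cpu.update(saved_contexts.pop())


cpu = {"PC": 101, "AC": 7}
saved_contexts = []
taken = check_interrupt(cpu, True, 500, saved_contexts)
cpu["AC"] = 0                             # the handler is free to change registers
return_from_interrupt(cpu, saved_contexts)
print(cpu)
```

Because saved contexts are kept on a list used as a stack, the same sketch also covers nested interrupts, where a handler is itself interrupted.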


Multiple Interrupts (1)

- Disable interrupts
  - Processor will ignore further interrupts while processing one interrupt
  - Interrupts remain pending and are checked after first interrupt has been processed
  - Interrupts handled in sequence as they occur


Multiple Interrupts (2)

- Define priorities
  - Low priority interrupts can be interrupted by higher priority interrupts
  - When higher priority interrupt has been processed, processor returns to previous interrupt

[Figure: diagram labeled "Main Program"]


Indirect Cycle

- May require memory access to fetch operands
- Indirect addressing requires more memory accesses
- Can be thought of as additional instruction subcycle


Prefetch

- Fetch accesses main memory
- Execution usually does not access main memory
- Can fetch next instruction during execution of current instruction
- Called instruction prefetch


Improved Performance

- But not doubled:
  - Fetch usually shorter than execution
    - Prefetch more than one instruction?
  - Any jump or branch means that prefetched instructions are not the required instructions
- Add more stages to improve performance


Pipelining

- Fetch instruction
- Decode instruction
- Calculate operands (i.e. EAs)
- Fetch operands
- Execute instructions
- Write result

Overlap these operations
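
The benefit of overlapping the six stages can be estimated with a small calculation. The sketch below assumes an idealized pipeline in which every stage takes exactly one clock cycle and no branches or stalls occur; neither assumption is stated in the slide, and the function names are hypothetical.

```python
def sequential_cycles(instructions, stages):
    """Cycles needed when each instruction finishes before the next starts."""
    return instructions * stages


def pipelined_cycles(instructions, stages):
    """Cycles needed when stages overlap: fill the pipe once, then one per cycle."""
    return stages + (instructions - 1)


n, k = 9, 6   # nine instructions through the six stages listed above
print(sequential_cycles(n, k), pipelined_cycles(n, k))
print(sequential_cycles(n, k) / pipelined_cycles(n, k))   # speedup factor
```

For long instruction streams the speedup approaches the number of stages, which is why adding stages can improve performance as long as branches do not force the pipeline to be refilled.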


Two Stage Instruction Pipeline

[Figure (a) Simplified View: Fetch and Execute stages with the labels Instruction, Instruction, Result]

[Figure (b) Expanded View: Fetch and Execute stages with the labels Instruction, Instruction, Result, Wait, Wait, New Address, Discard]


Memory Connection

- Receives and sends data
- Receives addresses (of locations)
- Receives control signals
  - Read
  - Write
  - Timing


Input/Output Connection

- Similar to memory from computer’s viewpoint
- Output
  - Receive data from computer
  - Send data to peripheral
- Input
  - Receive data from peripheral
  - Send data to computer
- Receive control signals from computer
- Send control signals to peripherals
  - e.g. spin disk
- Receive addresses from computer
  - e.g. port number to identify peripheral
- Send interrupt signals (control)


CPU Connection

- Reads instruction and data
- Writes out data (after processing)
- Sends control signals to other units
- Receives (& acts on) interrupts
- Buses
  - There are a number of possible interconnection systems
  - Single and multiple bus structures are most common
  - e.g. Control/Address/Data bus (PC)
  - e.g. Unibus (DEC-PDP)


What is a Bus?

- A communication pathway connecting two or more devices
- Usually broadcast
- Often grouped
  - A number of channels in one bus
  - e.g. 32 bit data bus is 32 separate single bit channels.
- Power lines may not be shown


Data Bus and Address Bus

- Data Bus
  - Carries data
    - Remember that there is no difference between “data” and “instruction” at this level
  - Width is a key determinant of performance
    - 8, 16, 32, 64 bit
- Address Bus
  - Identify the source or destination of data
  - e.g. CPU needs to read an instruction (data) from a given location in memory
  - Bus width determines maximum memory capacity of system
    - e.g. 8080 has 16 bit address bus giving 64k address space
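
The relation between address bus width and address space is a power of two: n address lines can select 2^n distinct locations. A quick check of the 8080 figure, with `address_space` as a hypothetical helper name:

```python
def address_space(address_bits):
    """Number of distinct addresses selectable with the given bus width."""
    return 2 ** address_bits


for bits in (16, 20, 32):
    locations = address_space(bits)
    print(f"{bits}-bit address bus: {locations} locations ({locations // 1024}K)")
```

With 16 address lines this gives 65,536 locations, which is the 64K address space quoted for the 8080.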


Control Bus

- Control and timing information
  - Memory read/write signal
  - Interrupt request
  - Clock signals


Single Bus Problems

- Lots of devices on one bus leads to:
  - Propagation delays
    - Long data paths mean that co-ordination of bus use can adversely affect performance.
  - A bottleneck if aggregate data transfer approaches bus capacity.
- Most systems use multiple buses to overcome these problems.


Bus Types and Arbitration

- Bus Types
  - Dedicated
    - Separate data & address lines
  - Multiplexed
    - Shared lines
    - Address valid or data valid control line
    - Advantage: fewer lines
    - Disadvantages:
      - More complex control
      - Reduced ultimate performance
- Bus Arbitration
  - More than one module controlling the bus
  - e.g. CPU and DMA controller
  - Only one module may control bus at one time
  - Arbitration may be centralised or distributed


Timing

- Co-ordination of events on bus
- Synchronous
  - Events determined by clock signals
  - Control Bus includes clock line
  - A single 1-0 is a bus cycle
  - All devices can read clock line
  - Usually sync on leading edge
  - Usually a single cycle for an event
- Asynchronous
  - Read, Write


Memory Hierarchy & Physical Types

- Memory Hierarchy
  - Registers
    - Exist in CPU
  - Internal or main memory
    - May include one or more levels of cache
    - Mainly “RAM”
  - External memory
    - Backing store
- Physical Types
  - Semiconductor types are mainly RAM
  - Magnetic types are Disk & Tape
  - Optical types are CD & DVD
  - Others are Bubble, Hologram, etc.


Performance

- Access time
  - Time between presenting the address and getting the valid data
- Memory cycle time
  - Time may be required for the memory to “recover” before next access.
  - Cycle time is access + recovery.
- Transfer rate
  - Rate at which data can be moved.


Instruction Representation

- In machine code each instruction has a unique bit pattern.
- For human consumption (well, programmers anyway) a symbolic representation is used.
  - e.g. ADD, SUB, LOAD
- Operands can also be represented in this way.
  - ADD A,B


Computer Architecture: Instruction Types and Addressing Modes

Lecture #4, #5


Instruction Format and Types

Simple Instruction Format (16 bits)
Opcode (4 bits) | Operand Reference (6 bits) | Operand Reference (6 bits)

Instruction Types
- Data processing
- Data storage (main memory)
- Data movement (I/O)
- Program flow control
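
The simple 16-bit format (a 4-bit opcode followed by two 6-bit operand references) can be packed and unpacked with shifts and masks. The sketch below assumes the opcode occupies the most significant bits; the `encode` and `decode` names and the sample opcode value are hypothetical.

```python
def encode(opcode, operand1, operand2):
    """Pack a 4-bit opcode and two 6-bit operand references into 16 bits."""
    if not (0 <= opcode < 16 and 0 <= operand1 < 64 and 0 <= operand2 < 64):
        raise ValueError("field value out of range")
    return (opcode << 12) | (operand1 << 6) | operand2


def decode(word):
    """Split a 16-bit instruction word back into its three fields."""
    return (word >> 12) & 0xF, (word >> 6) & 0x3F, word & 0x3F


word = encode(0b0001, 5, 9)      # hypothetical opcode 1 with operands 5 and 9
print(f"{word:016b}", decode(word))
```

The field widths fix the design limits directly: 4 opcode bits allow at most 16 operations, and each 6-bit operand reference can name at most 64 locations.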


Number of Addresses (1)

- 3 addresses
  - Operand 1, Operand 2, Result
  - a = b + c;
  - May be a fourth: next instruction (usually implicit)
  - Not common
  - Needs very long words to hold everything


Number of Addresses (2)

- 2 addresses
  - One address doubles as operand and result.
  - a = a + b
  - Reduces length of instruction
  - Requires some extra work
    - Temporary storage to hold some results
- 1 address
  - Implicit second address
  - Usually a register (accumulator)
  - Common on early machines


Number of Addresses (3)

- 0 (zero) addresses
  - All addresses implicit
  - Uses a stack
  - e.g.
    - push a
    - push b
    - add
    - pop c
  - c = a + b
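
The zero-address sequence above can be run on a tiny stack-machine interpreter. This is a minimal sketch that supports only the three operations used in the example; the `run` function and the variable-dictionary representation of memory are assumptions made for illustration.

```python
def run(program, variables):
    """Interpret push/add/pop instructions on an operand stack."""
    stack = []
    for instruction in program:
        op, *args = instruction.split()
        if op == "push":
            stack.append(variables[args[0]])   # push value of named variable
        elif op == "add":
            right = stack.pop()                # both operands are implicit:
            left = stack.pop()                 # they come from the stack top
            stack.append(left + right)
        elif op == "pop":
            variables[args[0]] = stack.pop()   # store stack top into variable
        else:
            raise ValueError(f"unknown instruction: {instruction}")
    return variables


result = run(["push a", "push b", "add", "pop c"], {"a": 2, "b": 3})
print(result["c"])
```

Note that `add` carries no address at all, which is what makes this a zero-address format.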


Design Decisions (1)

- Operation repertoire
  - How many ops?
  - What can they do?
  - How complex are they?
- Data types
- Instruction formats
  - Length of op code field
  - Number of addresses


Addressing Modes

- Immediate
- Direct
- Indirect
- Register
- Register Indirect
- Displacement (Indexed)
- Stack


Immediate Addressing

- Operand is part of instruction
- Operand = address field
- e.g. ADD 5
  - Add 5 to contents of accumulator
  - 5 is operand
- No memory reference to fetch data
- Fast
- Limited range


Immediate Addressing Diagram

[Figure: instruction with Opcode and Operand fields]


Direct Addressing

- Address field contains address of operand.
- Effective address (EA) = address field (A)
- e.g. ADD A
  - Add contents of address A to accumulator
- Single memory reference to access data
- No additional calculations to work out effective address
- Limited address space


Direct Addressing Diagram

[Figure: instruction with Opcode and Address A fields; Address A points to the Operand in Memory]


Indirect Addressing

- Memory cell pointed to by address field contains the address of (pointer to) the operand.
- EA = (A)
  - Look in A, find address (A) and look there for operand.
- e.g. ADD (A)
  - Add contents of cell pointed to by contents of A to accumulator.
- Large address space
  - 2^n where n = word length
- May be nested, multilevel, cascaded
  - e.g. EA = (((A)))
  - Draw the diagram yourself
- Multiple memory accesses to find operand
- Hence slower


Indirect Addressing Diagram

[Figure: instruction with Opcode and Address A fields; Address A points to a pointer in Memory, which points to the Operand]


Register Addressing (1)

- Operand is held in register named in address field.
- EA = R
- Limited number of registers
- Very small address field needed
  - Shorter instructions
  - Faster instruction fetch


Register Addressing (2)

- No memory access
- Very fast execution
- Very limited address space
- Multiple registers help performance
  - Requires good assembly programming or compiler writing
  - N.B. C programming
    - register int a;
- c.f. Direct addressing


Register Addressing Diagram

[Figure: instruction with Opcode and Register Address R fields; R selects the Operand in Registers]


Register Indirect Addressing

- C.f. indirect addressing
- EA = (R)
- Operand is in memory cell pointed to by contents of register R
- Large address space (2^n)
- One fewer memory access than indirect addressing


Register Indirect Addressing Diagram

[Figure: instruction with Opcode and Register Address R fields; register R in Registers holds a pointer to the Operand in Memory]


Displacement Addressing

- EA = A + (R)
- Address field holds two values
  - A = base value
  - R = register that holds displacement
  - or vice versa
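
The addressing modes covered so far differ only in how the operand is located from the address field A and the register field R. The following sketch puts them side by side; the `fetch_operand` function, the mode names as strings, and the sample memory and register contents are all hypothetical.

```python
def fetch_operand(mode, memory, registers, A=None, R=None):
    """Return the operand selected by the given addressing mode."""
    if mode == "immediate":
        return A                           # operand = address field
    if mode == "direct":
        return memory[A]                   # EA = A
    if mode == "indirect":
        return memory[memory[A]]           # EA = (A)
    if mode == "register":
        return registers[R]                # EA = R
    if mode == "register_indirect":
        return memory[registers[R]]        # EA = (R)
    if mode == "displacement":
        return memory[A + registers[R]]    # EA = A + (R)
    raise ValueError(f"unknown mode: {mode}")


memory = {10: 20, 20: 99, 25: 7}
registers = {"R1": 20, "R2": 15}

print(fetch_operand("immediate", memory, registers, A=10))
print(fetch_operand("direct", memory, registers, A=10))
print(fetch_operand("indirect", memory, registers, A=10))
print(fetch_operand("register", memory, registers, R="R1"))
print(fetch_operand("register_indirect", memory, registers, R="R1"))
print(fetch_operand("displacement", memory, registers, A=10, R="R2"))
```

Counting the `memory[...]` lookups in each branch gives the number of memory references per mode: none for immediate and register, one for direct, register indirect, and displacement, and two for indirect.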


Displacement Addressing Diagram

[Figure: instruction with Opcode, Register R, and Address A fields; the contents of R in Registers are added (+) to A to form a pointer to the Operand in Memory]


Relative Addressing

- A version of displacement addressing
- R = Program counter, PC
- EA = A + (PC)
- i.e. get operand from A cells from current location pointed to by PC
- c.f. locality of reference & cache usage


Base-Register Addressing

- A holds displacement
- R holds pointer to base address
- R may be explicit or implicit
- e.g. segment registers in 80x86


Indexed Addressing

- A = base
- R = displacement
- EA = A + R
- Good for accessing arrays
  - EA = A + R
  - R++


Combinations

- Postindex
  - EA = (A) + (R)
- Preindex
  - EA = (A+(R))
- (Draw the diagrams)
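
The difference between the two combinations is whether indexing happens after or before the memory indirection. A minimal sketch, with hypothetical function names and sample values:

```python
def postindex_ea(A, R, memory, registers):
    """Postindex: indirection first, then add the index. EA = (A) + (R)."""
    return memory[A] + registers[R]


def preindex_ea(A, R, memory, registers):
    """Preindex: add the index first, then indirection. EA = (A + (R))."""
    return memory[A + registers[R]]


memory = {10: 100, 13: 200}
registers = {"R": 3}

print(postindex_ea(10, "R", memory, registers))   # (10) + 3
print(preindex_ea(10, "R", memory, registers))    # (10 + 3)
```

With the same A and R the two modes yield different effective addresses, because postindexing offsets the pointer that was read from memory while preindexing offsets the location the pointer is read from.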


Stack Addressing

- Operand is (implicitly) on top of stack
- e.g.
  - ADD: pop top two items from stack and add


Pentium Addressing Modes

- Virtual or effective address is offset into segment.
  - Starting address plus offset gives linear address.
  - This goes through page translation if paging enabled.
- 12 addressing modes available
  - Immediate
  - Register operand
  - Displacement
  - Base
  - Base with displacement
  - Scaled index with displacement
  - Base with index and displacement
  - Base scaled index with displacement
  - Relative


Instruction Types

- Instructions are generally of four types.
  - Data processing
  - Data storage (main memory)
  - Data movement (I/O)
  - Program flow control


Design Decisions (1)

- Operation repertoire
  - How many ops?
  - What can they do?
  - How complex are they?
- Data types
- Instruction formats
  - Length of op code field
  - Number of addresses


Design Decisions (2)

- Registers
  - Number of CPU registers available
  - Which operations can be performed on which registers?
- Addressing modes (later…)
- RISC vs. CISC


Types of Operation

- There are several types of operations, as follows.
  - Data Transfer
  - Arithmetic
  - Logical
  - Conversion
  - I/O
  - System Control
  - Transfer of Control


Arithmetic

- Arithmetic operations include Add, Subtract, Multiply, Divide.
- Can use signed integers.
- Can arithmetic operations process floating point?
- May include:
  - Increment (a++)
  - Decrement (a--)
  - Negate (-a)


Shift and Rotate Operations

- Logical right shift
- Logical left shift
- Arithmetic right shift
- Arithmetic left shift
- Right rotate
- Left rotate
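
The shift and rotate operations listed above can be modeled on an 8-bit register. This is a sketch under stated assumptions: the 8-bit width and the function names are chosen for illustration, and the arithmetic left shift is left out because conventions differ (some machines treat it like a logical left shift, others preserve the sign bit).

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1          # 0xFF keeps results within 8 bits
SIGN = 1 << (WIDTH - 1)          # 0x80 is the sign bit


def logical_right(x, n=1):
    return (x & MASK) >> n       # zeros enter from the left


def logical_left(x, n=1):
    return (x << n) & MASK       # zeros enter from the right


def arithmetic_right(x, n=1):
    sign = x & SIGN
    for _ in range(n):
        x = (x >> 1) | sign      # the sign bit is copied back in each step
    return x & MASK


def rotate_left(x, n=1):
    n %= WIDTH
    return ((x << n) | (x >> (WIDTH - n))) & MASK


def rotate_right(x, n=1):
    n %= WIDTH
    return ((x >> n) | (x << (WIDTH - n))) & MASK


x = 0b10010011
for op in (logical_right, logical_left, arithmetic_right, rotate_left, rotate_right):
    print(f"{op.__name__:17s} {op(x):08b}")
```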


Logical and Conversion

- Logical
  - Bitwise operations
  - Logical operations are AND, OR, NOT, etc.
- Conversion
  - e.g. Binary to Decimal


Input/Output

- May be specific instructions.
- May be done using data movement instructions (memory mapped).
- May be done by a separate controller (DMA).


Transfer of Control

- Branch
  - e.g. branch to x if result is zero
- Skip
  - e.g. increment and skip if zero
  - ISZ Register1: Skip if zero
  - Branch xxxx
- Subroutine call
  - c.f. interrupt call: jump to interrupt service routine


Branch Instruction

- Unconditional Branch
  - Jump to 211 unconditionally.
- Conditional Branch 1
  - Jump to 211 if accumulator is zero.
- Conditional Branch 2
  - Jump to 235 if R1 equals R2.


Nested Procedure Calls

- If the main program calls procedure 1 (Proc.1), control goes to Proc.1 and its body is executed.
- If Proc.1 calls another procedure (Proc.2), control goes to Proc.2 and its body is executed.
- When Proc.2 reaches a RETURN instruction, control returns to Proc.1.
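
Nested calls return in the reverse order of the calls, which is why return addresses are commonly kept on a stack. The slide does not say how return addresses are stored, so the stack here is an assumption, and `trace_nested_calls` is a hypothetical name.

```python
def trace_nested_calls():
    """Trace main -> Proc.1 -> Proc.2 using a stack of return points."""
    return_stack = []
    trace = []

    def call(procedure, return_to):
        return_stack.append(return_to)      # remember where to come back to
        trace.append(f"enter {procedure}")

    def ret():
        target = return_stack.pop()         # most recent caller comes off first
        trace.append(f"return to {target}")

    call("Proc.1", return_to="main")
    call("Proc.2", return_to="Proc.1")
    ret()                                    # RETURN inside Proc.2
    ret()                                    # RETURN inside Proc.1
    return trace


trace = trace_nested_calls()
print("\n".join(trace))
```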


Computer Architecture: Arithmetic and Logical Operations of Computer

Lecture #6, #7


Arithmetic & Logic Unit

Does the calculations.
Everything else in the computer is there to service this unit.
Handles integers.
May handle floating point (real) numbers.
May be a separate FPU (maths co-processor).

Page 89: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Integer Representation

Only have 0 & 1 to represent everything.
Positive numbers are stored in binary.
—e.g. 41=00101001
There is no minus sign and no period (radix point).
Negative numbers use sign-magnitude, one’s complement, or two’s complement.

Page 90: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Sign-Magnitude

Left most bit is sign bit.
0 means positive.
1 means negative.
+18 = 00010010
-18 = 10010010
Problems
—Need to consider both sign and magnitude in arithmetic
—Two representations of zero (+0 and -0)

Page 91: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Two’s Complement

+3 = 00000011
+2 = 00000010
+1 = 00000001
+0 = 00000000
-1 = 11111111
-2 = 11111110
-3 = 11111101

Benefits
—Two’s complement has one representation of zero.
—Arithmetic works easily (see later).
—Negating is fairly easy.
– 3 = 00000011
– Boolean complement gives 11111100
– Add 1 to LSB 11111101
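
The negation rule on this slide (complement every bit, then add 1 at the LSB) can be sketched in a few lines. This is an illustrative sketch only; the function names and the 8-bit default width are assumptions, not part of the lecture.

```python
def negate_twos_complement(pattern: int, bits: int = 8) -> int:
    """Return the two's complement negation of a bit pattern."""
    mask = (1 << bits) - 1
    inverted = ~pattern & mask        # Boolean complement of every bit
    return (inverted + 1) & mask      # add 1 at the LSB, drop any carry out


def to_signed(pattern: int, bits: int = 8) -> int:
    """Interpret a bit pattern as a signed two's complement integer."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern


pattern = negate_twos_complement(0b00000011)
print(format(pattern, "08b"), to_signed(pattern))  # 11111101 -3
```

Negating zero returns zero, which matches the single representation of zero listed under Benefits.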

Page 92: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Logical Operations

AND, OR, XOR, NOT
Selective-set, Selective-complement
Masking, Insert, Compare

Bitwise operations
—Logical Shift
—Circular Shift
—Arithmetic Shift
—Shift with Carry

Page 93: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Shift and Rotate Operations


Page 94: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Addition and Subtraction

Normal binary addition
Monitor sign bit for overflow
Take two’s complement of subtrahend and add to minuend.
—i.e. a - b = a + (-b)
So we only need addition and complement circuits.
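
As a rough illustration of the points above, the sketch below adds bit patterns with an ordinary adder, forms a - b as a plus the complement of b plus 1, and reports overflow by watching the sign bits. The function names and the 8-bit width are assumptions made for this example.

```python
def add_signed(a: int, b: int, bits: int = 8):
    """Add two bit patterns; return (result pattern, overflow flag)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + b) & mask
    # Overflow: both operands share a sign and the result's sign differs.
    overflow = ((a ^ result) & (b ^ result) & sign) != 0
    return result, overflow


def sub_signed(a: int, b: int, bits: int = 8):
    """Compute a - b as a + (complement of b) + 1, using only an adder."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    result = (a + (~b & mask) + 1) & mask
    # Overflow: operands have different signs and the result's sign differs from a.
    overflow = ((a ^ b) & (a ^ result) & sign) != 0
    return result, overflow


print(sub_signed(7, 5))       # (2, False)
print(add_signed(100, 100))   # (200, True): 200 does not fit in signed 8 bits
```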

Page 95: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Hardware for Addition and Subtraction

(Figure: block diagram with B Register, Complementer, SW, Adder, A Register, and OF.)

OF: overflow bit
SW: switch (selects addition or subtraction)

Page 96: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Multiplication

Is complex
Work out partial product for each digit
Take care with place value (column)
Add partial products

Page 97: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Multiplication Example

    1011   Multiplicand (11 dec)
  x 1101   Multiplier (13 dec)
    1011   Partial products
   0000
  1011
 1011
10001111   Product (143 dec)

Note: if multiplier bit is 1, copy multiplicand (place value), otherwise zero
Note: need double length result
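
The worked example can be reproduced with a minimal shift-and-add sketch for unsigned operands. The function name and the 4-bit default width are illustrative assumptions.

```python
def multiply_unsigned(multiplicand: int, multiplier: int, bits: int = 4) -> int:
    """Shift-and-add multiplication of two unsigned integers."""
    product = 0
    for position in range(bits):
        if (multiplier >> position) & 1:
            # Multiplier bit is 1: add the multiplicand at this place value.
            product += multiplicand << position
        # Multiplier bit is 0: the partial product is zero, nothing to add.
    return product


print(format(multiply_unsigned(0b1011, 0b1101), "08b"))  # 10001111
```

The result needs up to twice as many bits as the operands, which is the "double length result" noted on the slide.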

Page 98: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Booth’s Algorithm

(Flowchart)
START
A ← 0, Q-1 ← 0, M ← Multiplicand, Q ← Multiplier, Counter ← n
Examine Q0, Q-1:
  = 10: A ← A - M
  = 01: A ← A + M
  = 11 or = 00: no addition or subtraction
Arithmetic Shift Right of A, Q, Q-1
Counter ← Counter - 1
Counter = 0? No: examine Q0, Q-1 again. Yes: END
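
The flowchart can be traced with the following sketch, which keeps A, Q, and Q-1 as separate variables. The function name and register modelling are assumptions for illustration; like the basic flowchart, it uses an n-bit A register, so it assumes the multiplicand is not the most negative n-bit value.

```python
def booth_multiply(multiplicand: int, multiplier: int, n: int = 8) -> int:
    """Multiply two n-bit signed integers following Booth's flowchart."""
    mask = (1 << n) - 1
    m = multiplicand & mask       # M register
    a = 0                         # A register
    q = multiplier & mask         # Q register
    q_minus1 = 0                  # Q-1 bit
    for _ in range(n):            # Counter runs from n down to 0
        pair = (q & 1, q_minus1)
        if pair == (1, 0):
            a = (a - m) & mask    # A <- A - M
        elif pair == (0, 1):
            a = (a + m) & mask    # A <- A + M
        # Arithmetic shift right of A, Q, Q-1 treated as one long register.
        q_minus1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))   # keep the sign bit of A
    product = (a << n) | q        # 2n-bit result held in A, Q
    if product & (1 << (2 * n - 1)):
        product -= 1 << (2 * n)   # convert to a signed Python integer
    return product


print(booth_multiply(7, 3, n=4))    # 21
print(booth_multiply(-7, 3, n=4))   # -21
```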

Page 99: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Division

More complex than multiplication
Negative numbers are really bad!
Based on long division

Division of Unsigned Binary Integers

       00001101    Quotient
1011 ) 10010011    Dividend (divisor on the left)
        1011
       001110      Partial remainder
         1011
         001111    Partial remainder
           1011
            100    Remainder
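
The long-division example can be checked with a small sketch that brings down one dividend bit at a time and subtracts the divisor whenever it fits. The function name and the 8-bit default width are assumptions for this example.

```python
def divide_unsigned(dividend: int, divisor: int, bits: int = 8):
    """Binary long division of unsigned integers; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    quotient = 0
    remainder = 0
    for position in range(bits - 1, -1, -1):
        # Bring down the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> position) & 1)
        quotient <<= 1
        if remainder >= divisor:
            remainder -= divisor  # the divisor fits: quotient bit is 1
            quotient |= 1
    return quotient, remainder


q, r = divide_unsigned(0b10010011, 0b1011)
print(format(q, "08b"), format(r, "b"))  # 00001101 100
```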

Page 100: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Computer Architecture: Real Numbers

Lecture #8, #9


Page 101: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Real Numbers

Numbers with fractions
Could be done in pure binary
—1001.1010 = 2^3 + 2^0 + 2^-1 + 2^-3 = 9.625
Where is the binary point?
Fixed?
—Very limited
Moving?
—How do you show where it is?

Page 102: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Floating Point

+/- .significand x 2^exponent

Point is actually fixed between sign bit and body of mantissa.
Exponent indicates place value (point position).

Field layout: Sign bit | Biased Exponent | Mantissa

Page 103: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Floating Point Examples

(a) 32-bit floating point format

S (1 bit) | E field (8 bits) | Mantissa field (23 bits)

(b) Example of a data representation

Sign (S) bit = 0
Exponent (E) field = 00000101
Mantissa (M) field = 1101 0000 0000 0000 0000 000

0 00000101 11010000 00000000 0000000

Page 104: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Signs for Floating Point

Mantissa is stored in 2s complement.
Exponent is in excess or biased notation.
—e.g. Excess (bias) 128 means
—8 bit exponent field
—Pure value range 0-255
—Subtract 128 to get correct value
—Range -128 to +127

Page 105: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Normalization

FP numbers are usually normalized.
i.e. exponent is adjusted so that leading bit (MSB) of mantissa is 1.
Since it is always 1 there is no need to store it.
c.f. Scientific notation where numbers are normalized to give a single digit before the decimal point.
e.g. 3.123 x 10^3

Page 106: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

FP Ranges

For a 32 bit number
—8 bit exponent
—+/- 2^256 ≈ 1.2 x 10^77

Accuracy
—The effect of changing lsb of mantissa
—23 bit mantissa: 2^-23 ≈ 1.2 x 10^-7

Page 107: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Expressible Numbers


Page 108: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

IEEE 754

Standard for floating point storage
32 and 64 bit standards
8 and 11 bit exponent respectively
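
To see the 32-bit layout in practice, the sketch below splits a Python float into the sign, 8-bit biased exponent, and 23-bit fraction fields using the standard `struct` module. The function name is an assumption. Note that IEEE 754 single precision uses a bias of 127, unlike the excess-128 example on the earlier slide.

```python
import struct


def decode_float32(value: float):
    """Return (sign, biased exponent, fraction) of value as an IEEE 754 single."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                       # 1 bit
    biased_exponent = (bits >> 23) & 0xFF   # 8 bits
    fraction = bits & 0x7FFFFF              # 23 bits, leading 1 not stored
    return sign, biased_exponent, fraction


sign, exponent, fraction = decode_float32(-6.5)
# -6.5 = -1.625 x 2^2, so the stored exponent is 2 + 127 = 129
print(sign, exponent, format(fraction, "023b"))
```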

Page 109: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Floating Point Arithmetic

FP Arithmetic +/-
—Check for zeros
—Align significands (adjusting exponents)
—Add or subtract significands
—Normalize result

FP Arithmetic x/÷
—Check for zero
—Add/subtract exponents
—Multiply/divide significands (watch sign)
—Normalize
—Round
—All intermediate results should be in double length storage

Page 110: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Floating Point Multiplication


Page 111: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Computer Architecture: Control Unit

Lecture #10


Page 112: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Control Unit

Functions of control unit
— Decoding of an instruction code
— Generation of control signals for instruction execution

Micro-instruction:
— Control word

Micro-program:
— Set of micro-instructions

Routine
— Groups of micro-instructions for special functions of CPU
  ex. Fetch cycle routine, Execution cycle routine, Interrupt cycle routine

Page 113: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Structure of Control Unit

Configuration elements
— Instruction decoder
— Control address register: CAR
— Control memory: internal memory that stores the micro-programs
— Control buffer register: CBR
— Subroutine register: SBR
— Sequencing module

Page 114: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Internal Structure of Control Unit

(Figure: block diagram with Instruction Register, Instruction Decoder, CAR, Control Memory Device, CBR, Decoder, SBR, Sequencing Module, External Control Signals, Internal Control Signals, and Condition Flags.)

Page 115: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Internal Structure of the Control Memory Device

Example
— Capacity of CMD = 128 words
— The first half (Address 0 ~ 63): Store common routines
— The second half (Address 64 ~ 127): Store execution routines of each instruction

(Figure: control memory map. Addresses 0 to 63 hold the Fetch Cycle Routine, Indirect Cycle Routine, and Interrupt Cycle Routine; addresses 64 to 127 hold Execution Cycle Routine 1, Execution Cycle Routine 2, and so on.)

Page 116: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Mapping

(Figure: the Instruction Code is passed through the Mapping Function.)

Page 117: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Binary Codes and Symbols for Micro Operations (Examples)

• Op field 1

Code   Micro-operation     Symbol
000    None                NOP
001    MAR ← PC            PCTAR
010    MAR ← IR(addr)      IRTAR
011    AC ← AC + MBR       ADD
100    MBR ← M[MAR]        READ
101    AC ← MBR            BRTAC
110    IR ← MBR            BRTIR
111    M[MAR] ← MBR        WRITE

Page 118: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Binary Codes and Symbols for Micro Operations (Examples)

• Op field 2

Code   Micro-operation     Symbol
000    None                NOP
001    PC ← PC + 1         INCPC
010    MBR ← AC            ACTBR
011    MBR ← PC            PCTBR
100    PC ← MBR            BRTPC
101    MAR ← SP            SPTAR
110    AC ← AC - MBR       SUB
111    PC ← IR(addr)       IRTPC

Page 119: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Micro-programming

Fetch Cycle Routine

        ORG 0
FETCH:  PCTAR         U  JMP  NEXT   ; MAR ← PC
        READ, INCPC   U  JMP  NEXT   ; MBR ← M[MAR], PC ← PC+1
        BRTIR         U  MAP         ; IR ← MBR

JMP NEXT: execution continues with the next micro-instruction
MAP: branch to the execution cycle

(Figure: binary bit pattern of each micro-instruction.)

Page 120: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Indirect Cycle Routine

Micro-instruction routine of the indirect cycle

(Figure: micro-instructions with their binary bit patterns. The first two continue with the next micro-instruction; the last returns to the execution cycle.)

Page 121: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Execution Cycle Routine

(Table: Instruction, Op code, Starting address of the routine.)

Page 122: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Execution Cycle Routines for each instruction

(Figure: micro-program listing. Two of the routines carry the comment "; Call the indirect cycle routine if I=1".)

Page 123: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Computer Architecture: Memory Devices

Lecture #11


Page 124: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Memory Classification

• Main memory
— Internal memory

• Auxiliary storage device
— External memory

Page 125: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Memory Hierarchy

Registers
—In CPU

Internal or Main memory
—May include one or more levels of cache
—“RAM”

External memory
—Backing store

Page 126: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Semiconductor Memory Types


Page 127: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Semiconductor Memory

RAM
—Misnamed as all semiconductor memory is random access
—Read/Write
—Volatile
—Temporary storage
—Static or dynamic

Page 128: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Memory Cell Operation

(Figure: memory cell operation. (a) Write: Select, Control, and Data In lines into the Cell. (b) Read: Select and Control lines into the Cell, with a Sense line out.)

Page 129: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Dynamic RAM

Bits stored as charge in capacitors
Charges leak
Need refreshing even when powered
Simpler construction
Smaller per bit
Less expensive
Need refresh circuits
Slower
Main memory
Essentially analogue
—Level of charge determines value

Page 130: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Refreshing

Refresh circuit included on chip
Disable chip
Count through rows
Read & Write back
Takes time
Slows down apparent performance

Page 131: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Dynamic RAM Structure

(Figure: DRAM cell with Address Line, Bit Line B, Transistor, Storage Capacitor, and Ground.)

Page 132: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

DRAM Operation

Address line active when bit read or written
—Transistor switch closed (current flows)

Write
—Voltage to bit line
– High for 1, low for 0
—Then signal address line
– Transfers charge to capacitor

Read
—Address line selected
– Transistor turns on
—Charge from capacitor fed via bit line to sense amplifier
– Compares with reference value to determine 0 or 1
—Capacitor charge must be restored

Page 133: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Typical 16 Mb DRAM (4M x 4)


Page 134: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Static RAM

Bits stored as on/off switches
No charges to leak
No refreshing needed when powered
More complex construction
Larger per bit
More expensive
Does not need refresh circuits
Faster
Cache
Digital
—Uses flip-flops

Page 135: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Static RAM Structure

(Figure: SRAM cell with dc voltage, Ground, Address Line, two Bit Line B connections, transistors T1 to T6, and nodes C1 and C2.)

Page 136: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

SRAM and DRAM

Both volatile
—Power needed to preserve data

DRAM
—Simpler to build, smaller
—More dense
—Less expensive
—Needs refresh
—Larger memory units

SRAM
—Faster
—Used in cache

Page 137: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Read Only Memory (ROM)

Permanent storage
—Nonvolatile
Microprogramming (see later)
Library subroutines
Systems programs (BIOS)
Function tables

Page 138: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Types of ROM

Written during manufacture
—Very expensive for small runs
Programmable (once)
—PROM
—Needs special equipment to program
Read “mostly”
—Erasable Programmable (EPROM)
– Erased by UV
—Electrically Erasable (EEPROM)
– Takes much longer to write than read
—Flash memory
– Erase whole memory electrically

Page 139: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Packaging


Page 140: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Design of Memory Device Module

[Example] Design of 1K×32 bit memory device module using 1K×8 bit RAM chips

– Method: parallel connection of 4 RAM chips
– Capacity of module: (1K×8) × 4 = 1K×32 bits = 1K words
– Address bits (10 bits: A9∼A0): common connection to all chips
– Address area: 000H ∼ 3FFH (H: Hexadecimal)
– Data Store: 8 bits/chip

Page 141: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Design of 1K×32 bits Memory Device Module

(Figure: four RAM chips sharing Address (A9-0) and together driving the 32-bit Data Bus.)

Page 142: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Design of Memory Device Module (cont'd)

[Example] Design of 4K×8 bit memory device module using 1K×8 bit RAM chips

– Method: serial connection of 4 RAM chips
– Capacity of module: (1K×8) × 4 = 4K×8 bits = 4K bytes
– Address bits (12 bits: A11∼A0):
  + upper 2 bits: generation of 4 chip select signals using address decoder
  + lower 10 bits: common connection to all chips
– Address area: 000H ∼ FFFH (H: Hexadecimal)
– Data Store: 8 bits/address
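
The address split described above can be sketched as follows: the upper 2 bits play the role of the 2×4 decoder input and the lower 10 bits go to every chip. The function name and the chip numbering 0 to 3 are assumptions made for illustration; the slides may number the chips differently.

```python
def decode_address(address: int):
    """Split a 12-bit address into (chip index, offset within the chip)."""
    if not 0 <= address <= 0xFFF:
        raise ValueError("address must fit in 12 bits")
    chip = address >> 10        # upper 2 bits (A11, A10) feed the decoder
    offset = address & 0x3FF    # lower 10 bits (A9..A0) go to all chips
    return chip, offset


# Address area covered by each 1K×8 chip (chip numbers are hypothetical).
for chip in range(4):
    start = chip << 10
    end = start | 0x3FF
    print(f"chip {chip}: {start:03X}H to {end:03X}H")
# chip 0: 000H to 3FFH
# chip 1: 400H to 7FFH
# chip 2: 800H to BFFH
# chip 3: C00H to FFFH
```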

Page 143: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Design of 4K×8 bits Memory Device Module

(Figure: a 2×4 Decoder generates the chip select signals for four RAM chips that share Data (D7-0).)

Page 144: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Address Areas of each RAM

(Table: RAM chip number and its address area, from and to, for each of the four chips.)

Page 145: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Design Procedure of Memory Module

Design Procedure
1. Decision of memory capacity for computer system
2. Chip decision and design of address map
3. Circuit design in detail

Page 146: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Memory Design for 8-bit Micro Computer

• Capacity: 1K bytes RAM, 512 bytes ROM
• Address: RAM = 0 ~, ROM = 800H ~
• Useful chips: 256×8 bits RAM, 512×8 bits ROM

o Address table

(Table: Memory Chip, Address Area (Hexadecimal), Address bits.)

Page 147: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Design Example of Memory Device for 8-bit Micro Computer

(Figure: Address lines, a Decoder with outputs 3, 2, 1, 0, and the 8-bit Data bus.)

Page 148: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Cache Memory

[Wikipedia definition] A cache is a component that improves performance by transparently storing data such that future requests for that data can be served faster.

Purpose for use: high-speed memory which is installed between the CPU and main memory to minimize the CPU waiting time caused by the speed difference between them.

Characteristics
Use of memory chips which have a higher access speed than that of main memory
Small capacity because of the price and limited space

(Figure: CPU, Cache, Main Memory.)

Page 149: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Cache Memory

cache hit: data which CPU wants to access already exists in cache
cache miss: data which CPU wants to access doesn’t exist in cache

Cache hit ratio (H):
The ratio (or percentage) of accesses that result in cache hits is known as the hit ratio of the cache.

H = (number of cache hits) / (total number of memory accesses)

Cache miss ratio = (1 - H)

Average access time of memory device (Ta):
Ta = H × Tc + (1 - H) × Tm
Tc: cache access time, Tm: main memory access time
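
The average access time formula can be evaluated directly. In the sketch below the function name and the sample values (H = 0.95, Tc = 5 ns, Tm = 50 ns) are assumptions chosen only to illustrate the calculation.

```python
def average_access_time(hit_ratio: float, cache_time: float, memory_time: float) -> float:
    """Ta = H x Tc + (1 - H) x Tm."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must be between 0 and 1")
    return hit_ratio * cache_time + (1.0 - hit_ratio) * memory_time


# Hypothetical figures: 95% hits, 5 ns cache, 50 ns main memory.
ta = average_access_time(0.95, 5.0, 50.0)
print(f"{ta:.2f} ns")  # 7.25 ns
```

With these assumed numbers, the 5% of accesses that miss account for about a third of the average time, which is why the hit ratio matters so much.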

Page 150: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Computer Architecture: Cache Memory

Lecture #12


Page 151: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

So you want fast?

It is possible to build a computer which uses only static RAM (see later).

This would be very fast.

This would need no cache.
—How can you cache cache?

This would cost a very large amount.

Page 152: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Locality of Reference

During the course of the execution of a program, memory references tend to cluster.

e.g. loops


Page 153: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Cache

Small amount of fast memory
Sits between normal main memory and CPU
May be located on CPU chip or module

(Figure: CPU, Cache, Main Memory. Word Transfer between CPU and Cache; Block Transfer between Cache and Main Memory.)

Page 154: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Cache operation - overview

CPU requests contents of memory location.
Check cache for this data.
If present, get from cache (fast).
If not present, read required block from main memory to cache.
Then deliver from cache to CPU.
Cache includes tags to identify which block of main memory is in each cache slot.

Page 155: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Size does matter

Cost
—More cache is expensive.

Speed
—More cache is faster (up to a point).
—Checking cache for data takes time.

Page 156: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Typical Cache Organization


Page 157: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Mapping Function

Cache of 64kByte
Cache block of 4 bytes
—i.e. cache is 16k (2^14) lines of 4 bytes
16MBytes main memory
24 bit address
—(2^24 = 16M)

Page 158: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Direct Mapping

Each block of main memory maps to only one cache line.
—i.e. if a block is in cache, it must be in one specific place
Address is in two parts.
Least Significant w bits identify unique word.
Most Significant s bits specify one memory block.
The MSBs are split into a cache line field r and a tag of s-r (most significant).

Page 159: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Direct Mapping-Address Structure

Tag Field (t): 8 bits | Slot Field (s): 14 bits | Word Field (w): 2 bits

• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
—8 bit tag (=22-14)
—14 bit slot or line
• No two blocks in the same line have the same Tag field.
• Check contents of cache by finding line and checking Tag.
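
The 8/14/2 split of a 24-bit address can be sketched with shifts and masks. The function name and the sample address are assumptions for illustration.

```python
def split_direct_mapped(address: int):
    """Split a 24-bit address into (tag, slot, word) for the 8/14/2 layout."""
    word = address & 0x3             # lowest 2 bits: byte within the 4-byte block
    slot = (address >> 2) & 0x3FFF   # next 14 bits: cache slot (line)
    tag = (address >> 16) & 0xFF     # top 8 bits: tag stored with the line
    return tag, slot, word


tag, slot, word = split_direct_mapped(0x16339C)
print(f"tag={tag:02X} slot={slot:04X} word={word}")  # tag=16 slot=0CE7 word=0
```

A lookup reads the line selected by `slot` and reports a hit only if the tag stored there equals `tag`.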

Page 160: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Direct Mapping - Cache Slot Table

Cache Slot    Main Memory blocks held
0             0, m, 2m, 3m, …, 2^s-m
1             1, m+1, 2m+1, …, 2^s-m+1
…
m-1           m-1, 2m-1, 3m-1, …, 2^s-1

Page 161: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Direct Mapping Cache Organization

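(Figure: the Memory Address is split into Tag, Slot, and Word. The Slot field selects one of Slot(0) … Slot(m-1) in the Cache, each holding a Tag and Data. A Comparator checks the stored Tag against the address Tag, giving a Cache hit or, on a Cache miss, an access to Main Memory.)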

Page 162: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Direct Mapping Summary

Address length = (t + s + w) bits
Number of addressable units = 2^(t+s+w) words or bytes
Block size = 2^w words or bytes
Number of blocks in main memory = 2^(t+s+w) / 2^w = 2^(t+s)
Number of slots in cache = m = 2^s
Size of tag = t bits

Page 163: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Direct Mapping Characteristics

Simple
Inexpensive
Fixed location for given block
—If a program repeatedly accesses 2 blocks that map to the same line, the cache miss rate is very high.
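
The conflict problem can be demonstrated with a toy direct-mapped cache that stores only the tag held in each slot. The class, the helper, and the address sequences below are illustrative assumptions; the field widths follow the 14-bit slot and 2-bit word layout used earlier.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only (no data)."""

    def __init__(self, slot_bits: int = 14, word_bits: int = 2):
        self.slot_bits = slot_bits
        self.word_bits = word_bits
        self.tags = {}            # slot number -> tag currently stored there
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        block = address >> self.word_bits
        slot = block & ((1 << self.slot_bits) - 1)
        tag = block >> self.slot_bits
        if self.tags.get(slot) == tag:
            self.hits += 1
            return True
        self.tags[slot] = tag     # miss: load the block, replacing the old one
        self.misses += 1
        return False


def run(addresses):
    """Feed a sequence of addresses to a fresh cache; return (hits, misses)."""
    cache = DirectMappedCache()
    for address in addresses:
        cache.access(address)
    return cache.hits, cache.misses


# Two blocks that share slot 0 but differ in tag evict each other every time.
print(run([0x000000, 0x010000] * 5))   # (0, 10)
# Two blocks in different slots miss only on first use.
print(run([0x000000, 0x000004] * 5))   # (8, 2)
```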

Page 164: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Associative Mapping

A main memory block can load into any line of cache.

Memory address is interpreted as tag and word.
Tag uniquely identifies block of memory.
Every line’s tag is examined for a match.
Cache searching gets expensive.

Page 165: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Fully Associative Cache Organization

(Figure: the Memory Address is split into Tag Field and Word Field. A Comparator checks the address Tag against the Tag stored in every slot, Slot(0) … Slot(m-1), of the Cache, giving a Cache hit or, on a Cache miss, an access to Main Memory.)

Page 166: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Associative Mapping Example

(Figure: Main Memory of 128 bytes, addressed by a 5-bit Tag and a Word field, holding 32-bit data blocks; Cache of 32 bytes, in which each slot holds a Tag and data.)

Page 167: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Associative Mapping-Address Structure

Tag: 5 bits | Word: 2 bits

5 bit tag stored with each 32 bit block of data
Compare tag field with tag entry in cache to check for hit
Least significant 2 bits of address identify which byte is required from the 32 bit data block

Page 168: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Associative Mapping Summary

Address length = (t + w) bits
Number of addressable units = 2^(t+w) words or bytes
Slot size = 2^w words or bytes
Number of blocks in main memory = 2^(t+w) / 2^w = 2^t
Number of slots in cache = undetermined
Size of tag = t bits

Page 169: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Set Associative Mapping

Cache is divided into a number of sets.
Each set contains a number of lines.
A given block maps to any line in a given set.
—e.g. Block B can be in any line of set i.
e.g. 2 lines per set
—2 way associative mapping
—A given block can be in one of 2 lines in only one set.

Address fields: Tag Field | Set Field | Word Field

Page 170: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Set Associative Mapping Example

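(Figure: the Memory Address is split into Tag, Set, and Word. The Set field selects one of Set(0) … Set(m-1) in the Cache, each set holding Slot(0) and Slot(1) with a Tag and Data. A Comparator checks the stored Tags against the address Tag, giving a Cache hit or, on a Cache miss, an access to Main Memory.)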

Page 171: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Set Associative Mapping -Address Structure

Use set field to determine cache set to look in.
Compare tag field to see if we have a hit.
e.g.

Address     Tag   Data       Set number
1FF 7FFC    1FF   12345678   1FFF
001 7FFC    001   11223344   1FFF

Tag: 9 bits | Set: 13 bits | Word: 2 bits
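
The two example addresses can be checked with a short sketch of the 9/13/2 split. The function names are assumptions; `make_address` simply joins the 9-bit tag and the remaining 15 bits as they are written in the Address column.

```python
def make_address(tag: int, low15: int) -> int:
    """Join a 9-bit tag and the low 15 bits (set + word) into a 24-bit address."""
    return (tag << 15) | low15


def split_set_associative(address: int):
    """Split a 24-bit address into (tag, set number, word) for the 9/13/2 layout."""
    word = address & 0x3
    set_number = (address >> 2) & 0x1FFF
    tag = (address >> 15) & 0x1FF
    return tag, set_number, word


for tag, low15 in [(0x1FF, 0x7FFC), (0x001, 0x7FFC)]:
    t, s, w = split_set_associative(make_address(tag, low15))
    print(f"tag={t:03X} set={s:04X} word={w}")
# tag=1FF set=1FFF word=0
# tag=001 set=1FFF word=0
```

Both addresses land in set 1FFF with different tags, so a 2-way set associative cache can hold both blocks at once.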

Page 172: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Set Associative Mapping Summary

Address length = (s + w) bits
Number of addressable units = 2^(s+w) words or bytes
Block size = line size = 2^w words or bytes
Number of blocks in main memory = 2^s
Number of lines in set = k
Number of sets = v = 2^d
Number of lines in cache = kv = k * 2^d
Size of tag = (s - d) bits

Page 173: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Pentium 4 Cache

80386 – no on chip cache
80486 – 8k using 16 byte lines and four way set associative organization
Pentium (all versions) – two on chip L1 caches
— Data & instructions
Pentium 4 – L1 caches
— 8k bytes
— 64 byte lines
— four way set associative
L2 cache
— Feeding both L1 caches
— 256k
— 128 byte lines
— 8 way set associative

Page 174: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Pentium 4 Core Processor

Fetch/Decode Unit
— Fetches instructions from L2 cache
— Decode into micro-ops
— Store micro-ops in L1 cache
Out of order execution logic
— Schedules micro-ops
— Based on data dependence and resources
— May speculatively execute
Execution units
— Execute micro-ops
— Data from L1 cache
— Results in registers
Memory subsystem
— L2 cache and systems bus

Page 175: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Pentium 4 Design

Decodes instructions into RISC like micro-ops before L1 cache
Micro-ops fixed length
— Superscalar pipelining and scheduling
Pentium instructions long & complex
Performance improved by separating decoding from scheduling & pipelining
— (More later – ch14)
Data cache is write back
— Can be configured to write through
L1 cache controlled by 2 bits in register
— CD = cache disable
— NW = not write through
— 2 instructions to invalidate (flush) cache and write back then invalidate

Page 176: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

DRAM

1. Synchronous DRAM (SDRAM)
— Add a clock signal to DRAM interface, so that the repeated transfers would not bear overhead to synchronize with DRAM controller

2. Double Data Rate (DDR SDRAM)
— Transfer data on both the rising edge and falling edge of the DRAM clock signal, doubling the peak data rate
— DDR2 lowers power by dropping the voltage from 2.5 to 1.8 volts + offers higher clock rates: up to 400 MHz
— DDR3 drops to 1.5 volts + higher clock rates: up to 800 MHz

Improved Bandwidth, not Latency

Page 177: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

DRAM

Standard   Clock Rate (MHz)   M transfers/second   DRAM Name    Mbytes/s/DIMM   DIMM Name
DDR        133                266                  DDR266       2128            PC2100
DDR        150                300                  DDR300       2400            PC2400
DDR        200                400                  DDR400       3200            PC3200
DDR2       266                533                  DDR2-533     4264            PC4300
DDR2       333                667                  DDR2-667     5336            PC5300
DDR2       400                800                  DDR2-800     6400            PC6400
DDR3       533                1066                 DDR3-1066    8528            PC8500
DDR3       666                1333                 DDR3-1333    10664           PC10700
DDR3       800                1600                 DDR3-1600    12800           PC12800

M transfers/second = Clock Rate x 2; Mbytes/s/DIMM = M transfers/second x 8
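
The "x 2" and "x 8" relationships in the table can be sketched as a small calculation. The function name and the default of 8 bytes per transfer are assumptions taken from the "x 8" annotation.

```python
def ddr_rates(clock_mhz: int, bytes_per_transfer: int = 8):
    """Return (M transfers/second, Mbytes/s per DIMM) for a DDR clock rate."""
    transfers = clock_mhz * 2                 # data moves on both clock edges
    mbytes = transfers * bytes_per_transfer   # 8 bytes moved per transfer
    return transfers, mbytes


print(ddr_rates(133))   # (266, 2128)  -> DDR266, PC2100
print(ddr_rates(400))   # (800, 6400)  -> DDR2-800, PC6400
print(ddr_rates(800))   # (1600, 12800) -> DDR3-1600, PC12800
```

Rows whose clock rate is a rounded fraction (266, 333, 533, and 666 MHz) differ by a unit or two from this integer calculation, because the table starts from the rounded transfer rate (for example 667 x 8 = 5336).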

Page 178: Computer Architecture Chang-Bum Lee Dept. of Computer Engineering Youngsan University Computer Architecture 1.

Error Correction

Motivation:
—Failures/time proportional to number of bits!
—As DRAM cells shrink, more vulnerable

Went through period in which failure rate was low enough without error correction that people didn’t do correction
—DRAM banks too large now
—Servers always corrected memory systems

Basic idea: add redundancy through parity bits
—Common configuration: Random error correction
– SEC-DED (single error correct, double error detect)
– One example: 64 data bits + 8 parity bits (11% overhead)
—Really want to handle failures of physical components as well
– Organization is multiple DRAMs/DIMM, multiple DIMMs
– Want to recover from failed DRAM and failed DIMM!
– “Chip kill” handle failures width of single DRAM chip
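
To make the SEC-DED idea concrete, here is a scaled-down sketch that protects 4 data bits with 4 parity bits (a Hamming code plus one overall parity bit). It is not the 64 + 8 bit code mentioned above; the function names and the bit layout are assumptions for illustration.

```python
def secded_encode(data_bits):
    """Encode 4 data bits into an 8-bit SEC-DED codeword (list of 0/1)."""
    code = [0] * 8                          # index 0: overall parity; 1..7: Hamming positions
    code[3], code[5], code[6], code[7] = data_bits
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions with bit 0 set
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions with bit 1 set
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions with bit 2 set
    code[0] = sum(code[1:]) % 2             # makes the total number of 1s even
    return code


def secded_decode(code):
    """Return (status, data_bits); data_bits is None for a double error."""
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position            # XOR of the positions holding a 1
    overall = sum(code) % 2                 # 1 means an odd number of flipped bits
    fixed = list(code)
    if overall == 1:
        fixed[syndrome] ^= 1                # single error (syndrome 0 = overall parity bit)
        status = "corrected"
    elif syndrome != 0:
        return "double error detected", None
    else:
        status = "ok"
    return status, [fixed[3], fixed[5], fixed[6], fixed[7]]


word = secded_encode([1, 0, 1, 1])
word[5] ^= 1                                # flip one bit to simulate a fault
print(secded_decode(word))                  # ('corrected', [1, 0, 1, 1])
```

Flipping any two bits of a codeword leaves the overall parity even but the syndrome non-zero, which is how the double error is detected rather than miscorrected.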

