12 CPU Structure and Function

Computer Organization

CPU Structure
CPU must:
  Fetch instructions
  Interpret instructions
  Fetch data
  Process data
  Write data

CPU With Systems Bus

CPU Internal Structure

Registers
CPU must have some working space (temporary storage)
Called registers
Number and function vary between processor designs
One of the major design decisions
Top level of memory hierarchy

User Visible Registers
General Purpose
Data
Address
Condition Codes

General Purpose Registers (1)
May be true general purpose
May be restricted
May be used for data or addressing
Data
  Accumulator
Addressing
  Segment

General Purpose Registers (2)
Make them general purpose
  Increase flexibility and programmer options
  Increase instruction size & complexity
Make them specialized
  Smaller (faster) instructions
  Less flexibility

How Many GP Registers?
Between 8 - 32
Fewer = more memory references
More does not reduce memory references and takes up processor real estate
See also RISC

How big?
Large enough to hold full address
Large enough to hold full word
Often possible to combine two data registers
  C programming
    long int a;
    long long int a;
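
To illustrate combining two data registers to hold one double-length value, here is a minimal sketch. It assumes 32-bit registers; the names MASK32, split_to_pair, join_pair and add_pair are hypothetical and chosen only for this example.

```python
MASK32 = 0xFFFFFFFF  # assumption: each data register is 32 bits wide


def split_to_pair(value):
    """Split a 64-bit value into (high, low) 32-bit register contents."""
    return (value >> 32) & MASK32, value & MASK32


def join_pair(high, low):
    """Recombine a (high, low) register pair into one 64-bit value."""
    return ((high & MASK32) << 32) | (low & MASK32)


def add_pair(a, b):
    """Add two (high, low) pairs, propagating the carry from low to high,
    roughly as an add followed by an add-with-carry would."""
    low_sum = a[1] + b[1]
    carry = low_sum >> 32
    high_sum = (a[0] + b[0] + carry) & MASK32
    return high_sum, low_sum & MASK32
```

For example, adding the pairs (0, 0xFFFFFFFF) and (0, 1) carries out of the low register and into the high one.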

Condition Code Registers
Sets of individual bits
  e.g. result of last operation was zero
Can be read (implicitly) by programs
  e.g. Jump if zero
Can not (usually) be set by programs
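
The idea that condition codes are set as a side effect of an operation and then read implicitly by a later jump can be sketched as follows. This assumes an 8-bit ALU; add8 and jump_if_zero are hypothetical names, not instructions of any particular processor.

```python
def add8(a, b):
    """Add two 8-bit values and return (result, flags).

    The flags are set by the hardware as a side effect of the add;
    the program does not set them directly.
    """
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "zero": result == 0,
        "carry": total > 0xFF,
        "sign": bool(result & 0x80),
        # Signed overflow: operands share a sign, result has the other sign.
        "overflow": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


def jump_if_zero(pc, target, flags):
    """Return the next PC: the jump reads the zero flag implicitly."""
    return target if flags["zero"] else pc + 1
```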

Control & Status Registers
Program Counter
Instruction Decoding Register
Memory Address Register
Memory Buffer Register

Revision: what do these all do?

Program Status Word
A set of bits
Includes Condition Codes
  Sign of last result
  Zero
  Carry
  Equal
  Overflow
Interrupt enable/disable
Supervisor

Supervisor Mode
Intel ring zero
Kernel mode
Allows privileged instructions to execute
Used by operating system
Not available to user programs
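
A program status word and the supervisor check can be sketched as a set of bit flags. The bit positions and the opcode names HALT and LOAD_PSW below are hypothetical; real processors define their own layouts and privileged instructions.

```python
from enum import IntFlag


class PSW(IntFlag):
    # Bit positions are hypothetical, chosen only for illustration.
    SIGN = 1 << 0
    ZERO = 1 << 1
    CARRY = 1 << 2
    EQUAL = 1 << 3
    OVERFLOW = 1 << 4
    INTERRUPT_ENABLE = 1 << 5
    SUPERVISOR = 1 << 6


# Hypothetical opcode names standing in for privileged instructions.
PRIVILEGED = {"HALT", "LOAD_PSW"}


def may_execute(opcode, psw):
    """Privileged instructions execute only when the supervisor bit is set."""
    return opcode not in PRIVILEGED or bool(psw & PSW.SUPERVISOR)
```

A user program running with PSW.ZERO | PSW.INTERRUPT_ENABLE may execute ordinary instructions but not the privileged ones; setting PSW.SUPERVISOR, as the operating system would, lifts the restriction.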

Other Registers
May have registers pointing to:
  Process control blocks (see O/S)
  Interrupt Vectors (see O/S)

N.B. CPU design and operating system design are closely linked

Example Register Organizations

Instruction Cycle
Revision
Stallings Chapter 3

Indirect Cycle
May require memory access to fetch operands
Indirect addressing requires more memory accesses
Can be thought of as additional instruction subcycle

Instruction Cycle with Indirect

Instruction Cycle State Diagram

Data Flow (Instruction Fetch)
Depends on CPU design
In general:
Fetch
  PC contains address of next instruction
  Address moved to MAR
  Address placed on address bus
  Control unit requests memory read
  Result placed on data bus, copied to MBR, then to IR
  Meanwhile PC incremented by 1
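
The fetch steps listed above can be sketched as register transfers. This is a minimal model, assuming a word-addressed memory held in a Python list; the CPU class and its method names are hypothetical.

```python
class CPU:
    def __init__(self, memory):
        self.memory = memory  # word-addressed memory, modeled as a list
        self.pc = 0           # program counter
        self.mar = 0          # memory address register
        self.mbr = 0          # memory buffer register
        self.ir = 0           # instruction register

    def read_memory(self):
        """Control unit requests a read: the address is taken from MAR
        and the word on the data bus is copied into MBR."""
        self.mbr = self.memory[self.mar]

    def fetch(self):
        self.mar = self.pc    # address of next instruction moved to MAR
        self.read_memory()    # result placed on data bus, copied to MBR
        self.pc += 1          # meanwhile PC incremented by 1
        self.ir = self.mbr    # instruction moved from MBR to IR
```

In hardware the PC increment overlaps the memory read; the sequential statements here only model the end state of each register.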

Data Flow (Data Fetch)
IR is examined
If indirect addressing, indirect cycle is performed
  Right most N bits of MBR transferred to MAR
  Control unit requests memory read
  Result (address of operand) moved to MBR
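
The indirect cycle can be sketched the same way. The value N = 12 and the dictionary-based register and memory models are assumptions made only for this example.

```python
ADDRESS_BITS = 12  # assumption: N, the width of the address field


def indirect_cycle(regs, memory):
    """Perform one indirect cycle.

    regs is a dict with 'mar' and 'mbr'; on entry MBR holds the fetched
    instruction word, whose rightmost N bits are an address reference.
    """
    # Rightmost N bits of MBR transferred to MAR.
    regs["mar"] = regs["mbr"] & ((1 << ADDRESS_BITS) - 1)
    # Control unit requests a memory read; the address of the operand
    # comes back in MBR.
    regs["mbr"] = memory[regs["mar"]]
    return regs
```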

Data Flow (Fetch Diagram)

Data Flow (Indirect Diagram)

Data Flow (Execute)
May take many forms
Depends on instruction being executed
May include
  Memory read/write
  Input/Output
  Register transfers
  ALU operations

Data Flow (Interrupt)
Simple
Predictable
Current PC saved to allow resumption after interrupt
Contents of PC copied to MBR
Special memory location (e.g. stack pointer) loaded to MAR
MBR written to memory
PC loaded with address of interrupt handling routine
Next instruction (first of interrupt handler) can be fetched
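
The interrupt data flow can be sketched as the following register transfers. The downward-growing stack and the function name interrupt_cycle are assumptions; the slide only says that a special memory location such as the stack pointer supplies the save address.

```python
def interrupt_cycle(regs, memory, handler_address):
    """Save the current PC and redirect execution to the handler.

    regs is a dict with 'pc', 'sp', 'mar' and 'mbr'; memory is a dict
    mapping addresses to words.
    """
    regs["mbr"] = regs["pc"]           # contents of PC copied to MBR
    regs["sp"] -= 1                    # assumption: stack grows downward
    regs["mar"] = regs["sp"]           # save location loaded to MAR
    memory[regs["mar"]] = regs["mbr"]  # MBR written to memory
    regs["pc"] = handler_address       # PC loaded with handler address
    return regs
```

After this cycle the next fetch reads the first instruction of the handler, and the saved PC is available in memory for resumption.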

Data Flow (Interrupt Diagram)

Prefetch
Fetch accessing main memory
Execution usually does not access main memory
Can fetch next instruction during execution of current instruction
Called instruction prefetch

Improved Performance
But not doubled:
  Fetch usually shorter than execution
    Prefetch more than one instruction?
  Any jump or branch means that prefetched instructions are not the required instructions
Add more stages to improve performance

Pipelining
Fetch instruction
Decode instruction
Calculate operands (i.e. EAs)
Fetch operands
Execute instructions
Write result

Overlap these operations

Two Stage Instruction Pipeline

Timing Diagram for Instruction Pipeline Operation

The Effect of a Conditional Branch on Instruction Pipeline Operation

Six Stage Instruction Pipeline

Alternative Pipeline Depiction

Speedup Factors with Instruction Pipelining
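
As a rough guide to the speedup a pipeline can give, the sketch below uses the usual idealized model: k stages of equal duration, n instructions, and no branches or stalls. Under those assumptions n instructions finish in k + (n - 1) cycles instead of n * k. The function names are hypothetical.

```python
def pipeline_cycles(k, n):
    """Cycles to complete n instructions on an ideal k-stage pipeline:
    k cycles for the first instruction, then one more per instruction."""
    return k + (n - 1)


def speedup(k, n):
    """Ratio of unpipelined time (n * k cycles) to pipelined time."""
    return (n * k) / pipeline_cycles(k, n)
```

With a single instruction there is no gain, and as n grows the speedup approaches k. Branches and unequal stage times reduce the figure in practice.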

Dealing with Branches
Multiple Streams
Prefetch Branch Target
Loop buffer
Branch prediction
Delayed branching

Multiple Streams
Have two pipelines
Prefetch each branch into a separate pipeline
Use appropriate pipeline
Leads to bus & register contention
Multiple branches lead to further pipelines being needed

Prefetch Branch Target
Target of branch is prefetched in addition to instructions following branch
Keep target until branch is executed
Used by IBM 360/91

Loop Buffer
Very fast memory
Maintained by fetch stage of pipeline
Check buffer before fetching from memory
Very good for small loops or jumps
cf. cache
Used by CRAY-1
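
The "check buffer before fetching from memory" idea can be sketched as below. This is a simplified model that assumes the buffer keeps the most recently fetched instructions and discards the oldest when full; the capacity of 8 and the class name LoopBuffer are hypothetical, and real loop buffers differ in organization.

```python
from collections import OrderedDict


class LoopBuffer:
    def __init__(self, memory, capacity=8):
        self.memory = memory         # main memory, indexed by address
        self.capacity = capacity     # hypothetical buffer size
        self.buffer = OrderedDict()  # address -> instruction, oldest first
        self.hits = 0
        self.misses = 0

    def fetch(self, address):
        # Check the buffer before going to main memory.
        if address in self.buffer:
            self.hits += 1
            return self.buffer[address]
        self.misses += 1
        instruction = self.memory[address]
        self.buffer[address] = instruction
        if len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)  # drop the oldest entry
        return instruction
```

A loop whose body fits in the buffer misses only on its first pass; every later iteration is served from the buffer.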

Loop Buffer Diagram

Branch Prediction (1)
Predict never taken
  Assume that jump will not happen
  Always fetch next instruction
  68020 & VAX 11/780
  VAX will not prefetch after branch if a page fault would result (O/S v CPU design)
Predict always taken
  Assume that jump will happen
  Always fetch target instruction

Branch Prediction (2)
Predict by Opcode
  Some instructions are more likely to result in a jump than others
  Can get up to 75% success
Taken/Not taken switch
  Based on previous history
  Good for loops
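
A taken/not taken switch based on previous history is often built as a two-bit saturating counter. The sketch below shows that common scheme; it is an assumption that it matches the state diagram in the slides, and the names TwoBitPredictor and accuracy are hypothetical.

```python
class TwoBitPredictor:
    def __init__(self):
        # Counter values 0 and 1 predict not taken; 2 and 3 predict taken.
        self.counter = 0

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        """Move one step toward the actual outcome, saturating at 0 and 3."""
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def accuracy(outcomes):
    """Fraction of branch outcomes predicted correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        correct += predictor.predict() == taken
        predictor.update(taken)
    return correct / len(outcomes)
```

Because two consecutive mispredictions are needed to flip the prediction, the single not-taken outcome at the end of a loop does not disturb the prediction for the next run of the loop.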

Branch Prediction (3)
Delayed Branch
  Do not take jump until you have to
  Rearrange instructions

Branch Prediction Flowchart

Branch Prediction State Diagram

Dealing With Branches

Intel 80486 Pipelining
Fetch
  From cache or external memory
  Put in one of two 16-byte prefetch buffers
  Fill buffer with new data as soon as old data consumed
  Average 5 instructions fetched per load
  Independent of other stages to keep buffers full
Decode stage 1
  Opcode & address-mode info
  At most first 3 bytes of instruction
  Can direct D2 stage to get rest of instruction
Decode stage 2
  Expand opcode into control signals
  Computation of complex address modes
Execute
  ALU operations, cache access, register update
Writeback
  Update registers & flags
  Results sent to cache & bus interface write buffers

80486 Instruction Pipeline Examples

Pentium 4 Registers

EFLAGS Register

Control Registers

MMX Register Mapping
MMX uses several 64 bit data types
Use 3 bit register address fields
  8 registers
No MMX specific registers
  Aliasing to lower 64 bits of existing floating point registers
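
The aliasing can be sketched as one register file viewed two ways. This model assumes 80-bit floating point registers held as Python integers and, as a simplification, leaves the upper 16 bits unchanged on an MMX write; how real hardware treats those bits is not modeled. The class and method names are hypothetical.

```python
MASK64 = (1 << 64) - 1


class FloatRegisterFile:
    def __init__(self):
        # Eight 80-bit floating point registers, held as integers.
        self.fp = [0] * 8

    def read_mmx(self, index):
        """An MMX register is the lower 64 bits of the FP register
        selected by the 3-bit register address field."""
        return self.fp[index & 0b111] & MASK64

    def write_mmx(self, index, value):
        i = index & 0b111
        upper = self.fp[i] & ~MASK64  # simplification: upper bits kept
        self.fp[i] = upper | (value & MASK64)
```

Since there is no separate MMX storage, a write through write_mmx is visible in the corresponding floating point register, and vice versa.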

Mapping of MMX Registers to Floating-Point Registers

Pentium Interrupt Processing
Interrupts
  Maskable
  Nonmaskable
Exceptions
  Processor detected
  Programmed
Interrupt vector table
  Each interrupt type assigned a number
  Index to vector table
  256 * 32 bit interrupt vectors

5 priority classes
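
Using the interrupt number as an index into a table of 256 32-bit vectors can be sketched as follows. The little-endian byte order and the flat bytearray table are assumptions for the example, and the function names are hypothetical.

```python
VECTOR_COUNT = 256
VECTOR_BYTES = 4  # each vector is 32 bits


def vector_offset(number):
    """Byte offset of an interrupt's vector within the table."""
    if not 0 <= number < VECTOR_COUNT:
        raise ValueError("interrupt number out of range")
    return number * VECTOR_BYTES


def read_vector(table, number):
    """Read the 32-bit vector for the given interrupt number
    (assumption: stored little-endian)."""
    start = vector_offset(number)
    return int.from_bytes(table[start:start + VECTOR_BYTES], "little")
```

The whole table occupies 256 * 4 = 1024 bytes, and interrupt number 0x21, for instance, finds its vector at byte offset 0x84.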

PowerPC User Visible Registers

PowerPC Register Formats

Date post:	20-Jan-2016
Upload:	tracy-harvey