Date post: | 20-Jan-2016 |
Upload: | tracy-harvey |
12 CPU Structure and Function
Computer Organization
CPU Structure
  CPU must:
    Fetch instructions
    Interpret instructions
    Fetch data
    Process data
    Write data
CPU With Systems Bus
CPU Internal Structure
Registers
  CPU must have some working space (temporary storage)
  Called registers
  Number and function vary between processor designs
  One of the major design decisions
  Top level of memory hierarchy
User Visible Registers
  General Purpose
  Data
  Address
  Condition Codes
General Purpose Registers (1)
  May be true general purpose
  May be restricted
  May be used for data or addressing
  Data
    Accumulator
  Addressing
    Segment
General Purpose Registers (2)
  Make them general purpose
    Increase flexibility and programmer options
    Increase instruction size & complexity
  Make them specialized
    Smaller (faster) instructions
    Less flexibility
How Many GP Registers?
  Between 8 and 32
  Fewer = more memory references
  More does not reduce memory references and takes up processor real estate
  See also RISC
How big?
  Large enough to hold full address
  Large enough to hold full word
  Often possible to combine two data registers
    C programming
      long long int a;
      long int a;
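Combining two data registers to hold one double-width value can be sketched as follows. This is only an illustration: the 32-bit register width and the helper names `split_into_pair` and `join_pair` are assumptions, not something the slides specify.

```python
WORD_BITS = 32  # assumed register width; the slides do not fix one


def split_into_pair(value):
    """Split a double-width value into (high, low) register contents."""
    mask = (1 << WORD_BITS) - 1
    return (value >> WORD_BITS) & mask, value & mask


def join_pair(high, low):
    """Rebuild the double-width value from a register pair."""
    return (high << WORD_BITS) | low


high, low = split_into_pair(0x1122334455667788)
print(hex(high), hex(low))
print(hex(join_pair(high, low)))
```

The high and low halves travel in separate registers, and the pair is treated as one operand by the instructions that support it.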
Condition Code Registers
  Sets of individual bits
    e.g. result of last operation was zero
  Can be read (implicitly) by programs
    e.g. Jump if zero
  Cannot (usually) be set by programs
Control & Status Registers
  Program Counter
  Instruction Decoding Register
  Memory Address Register
  Memory Buffer Register
  Revision: what do these all do?
Program Status Word
  A set of bits
  Includes Condition Codes
  Sign of last result
  Zero
  Carry
  Equal
  Overflow
  Interrupt enable/disable
  Supervisor
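As a rough illustration of how the condition-code bits listed above could be derived, the sketch below computes sign, zero, carry and overflow for an 8-bit addition. The 8-bit width, the function name `add8_flags` and the dictionary layout are assumptions chosen for the example; real processors set these bits in hardware as a side effect of the ALU operation.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, condition codes)."""
    raw = a + b
    result = raw & 0xFF  # keep only the low 8 bits
    flags = {
        "zero": result == 0,
        "sign": bool(result & 0x80),  # top bit of the result
        "carry": raw > 0xFF,          # carry out of bit 7
        # signed overflow: both operands differ in sign from the result
        "overflow": bool((a ^ result) & (b ^ result) & 0x80),
    }
    return result, flags


print(add8_flags(0x7F, 0x01))
print(add8_flags(0xFF, 0x01))
```

A conditional jump such as "jump if zero" then reads the stored bit implicitly instead of recomputing the comparison.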
Supervisor Mode
  Intel ring zero
  Kernel mode
  Allows privileged instructions to execute
  Used by operating system
  Not available to user programs
Other Registers
  May have registers pointing to:
    Process control blocks (see O/S)
    Interrupt Vectors (see O/S)
  N.B. CPU design and operating system design are closely linked
Example Register Organizations
Instruction Cycle
  Revision
  Stallings Chapter 3
Indirect Cycle
  May require memory access to fetch operands
  Indirect addressing requires more memory accesses
  Can be thought of as additional instruction subcycle
Instruction Cycle with Indirect
Instruction Cycle State Diagram
Data Flow (Instruction Fetch)
  Depends on CPU design
  In general:
  Fetch
    PC contains address of next instruction
    Address moved to MAR
    Address placed on address bus
    Control unit requests memory read
    Result placed on data bus, copied to MBR, then to IR
    Meanwhile PC incremented by 1
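The fetch steps above can be traced with a small register-transfer model. This is a minimal sketch: the `FetchCpu` class, the word-addressed list used as memory and the sample instruction words are assumptions for illustration, and the buses are not modelled separately.

```python
class FetchCpu:
    """Toy CPU holding only the registers involved in instruction fetch."""

    def __init__(self, memory):
        self.memory = memory  # word-addressed main memory (a list)
        self.pc = 0           # program counter
        self.mar = 0          # memory address register
        self.mbr = 0          # memory buffer register
        self.ir = 0           # instruction register

    def fetch(self):
        self.mar = self.pc                # address moved to MAR
        self.mbr = self.memory[self.mar]  # memory read, result copied to MBR
        self.ir = self.mbr                # then to IR
        self.pc += 1                      # PC incremented (in parallel in hardware)


cpu = FetchCpu(memory=[0x1940, 0x5941, 0x2941])
cpu.fetch()
print(hex(cpu.ir), cpu.pc)
```

In hardware the PC increment overlaps with the memory read; the sequential statements here only record the order of the register transfers.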
Data Flow (Data Fetch)
  IR is examined
  If indirect addressing, indirect cycle is performed
    Rightmost N bits of MBR transferred to MAR
    Control unit requests memory read
    Result (address of operand) moved to MBR
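The indirect cycle can be sketched the same way. The 16-bit word with a 12-bit address field (N = 12), the function name `indirect_cycle` and the memory contents are assumptions made for this example only.

```python
ADDRESS_BITS = 12  # assumed value of N: address field width in the instruction


def indirect_cycle(memory, mbr):
    """Perform one indirect cycle and return the new (MAR, MBR) contents."""
    mar = mbr & ((1 << ADDRESS_BITS) - 1)  # rightmost N bits of MBR -> MAR
    mbr = memory[mar]                      # memory read: operand address -> MBR
    return mar, mbr


# Location 0x940 holds a pointer to the operand, which lives at 0x123.
memory = {0x940: 0x0123, 0x123: 42}
mar, mbr = indirect_cycle(memory, mbr=0x1940)  # MBR initially holds the instruction
operand = memory[mbr]                          # a further read reaches the operand
print(hex(mar), hex(mbr), operand)
```

The extra memory read is the cost of indirect addressing: one access for the pointer, then another for the operand itself.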
Data Flow (Fetch Diagram)
Data Flow (Indirect Diagram)
Data Flow (Execute)
  May take many forms
  Depends on instruction being executed
  May include
    Memory read/write
    Input/Output
    Register transfers
    ALU operations
Data Flow (Interrupt)
  Simple
  Predictable
  Current PC saved to allow resumption after interrupt
  Contents of PC copied to MBR
  Special memory location (e.g. stack pointer) loaded to MAR
  MBR written to memory
  PC loaded with address of interrupt handling routine
  Next instruction (first of interrupt handler) can be fetched
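The interrupt data flow above can be sketched as a sequence of register transfers. The dictionary-based state, the downward-growing stack and the addresses used are assumptions for illustration; actual processors differ in where and how they save the PC.

```python
def enter_interrupt(state, memory, handler_address):
    """Save the current PC and redirect execution to the handler."""
    state["mbr"] = state["pc"]           # contents of PC copied to MBR
    state["sp"] -= 1                     # assumed: stack grows downward
    state["mar"] = state["sp"]           # special location (stack pointer) -> MAR
    memory[state["mar"]] = state["mbr"]  # MBR written to memory
    state["pc"] = handler_address        # PC loaded with handler address


state = {"pc": 0x105, "sp": 0x200, "mar": 0, "mbr": 0}
stack_memory = {}
enter_interrupt(state, stack_memory, handler_address=0x040)
print(hex(state["pc"]), hex(stack_memory[0x1FF]))
```

Because the old PC is now in memory, the handler can return to the interrupted program by loading it back.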
Data Flow (Interrupt Diagram)
Prefetch
  Fetch accessing main memory
  Execution usually does not access main memory
  Can fetch next instruction during execution of current instruction
  Called instruction prefetch
Improved Performance
  But not doubled:
    Fetch usually shorter than execution
      Prefetch more than one instruction?
    Any jump or branch means that prefetched instructions are not the required instructions
  Add more stages to improve performance
Pipelining
  Fetch instruction
  Decode instruction
  Calculate operands (i.e. EAs)
  Fetch operands
  Execute instructions
  Write result
  Overlap these operations
Two Stage Instruction Pipeline
Timing Diagram for Instruction Pipeline Operation
The Effect of a Conditional Branch on Instruction Pipeline Operation
Six Stage Instruction Pipeline
Alternative Pipeline Depiction
Speedup Factors with Instruction Pipelining
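The idealized speedup of a pipeline can be estimated with a short calculation. The sketch assumes k stages of equal duration and no branches or stalls, so n instructions take k + (n - 1) cycles instead of n * k; the function names are illustrative.

```python
def pipeline_cycles(n_instructions, k_stages):
    """Cycles needed when the stages overlap perfectly."""
    # The first instruction takes k cycles; each later one finishes a cycle after.
    return k_stages + (n_instructions - 1)


def speedup(n_instructions, k_stages):
    """Ratio of unpipelined time to ideal pipelined time."""
    unpipelined = n_instructions * k_stages
    return unpipelined / pipeline_cycles(n_instructions, k_stages)


print(pipeline_cycles(9, 6))
for n in (1, 10, 100, 1000):
    print(n, round(speedup(n, 6), 2))
```

As n grows the speedup approaches k, which is why longer instruction streams without branches benefit most.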
Dealing with Branches
  Multiple Streams
  Prefetch Branch Target
  Loop buffer
  Branch prediction
  Delayed branching
Multiple Streams
  Have two pipelines
  Prefetch each branch into a separate pipeline
  Use appropriate pipeline
  Leads to bus & register contention
  Multiple branches lead to further pipelines being needed
Prefetch Branch Target
  Target of branch is prefetched in addition to instructions following branch
  Keep target until branch is executed
  Used by IBM 360/91
Loop Buffer
  Very fast memory
  Maintained by fetch stage of pipeline
  Check buffer before fetching from memory
  Very good for small loops or jumps
  c.f. cache
  Used by CRAY-1
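The "check buffer before fetching from memory" idea can be sketched as below. This is a simplified model: the `LoopBuffer` class, its four-entry size and the oldest-first replacement are assumptions, and a real loop buffer holds a run of sequential instructions in hardware.

```python
class LoopBuffer:
    """Small buffer of recently fetched instructions, checked before memory."""

    def __init__(self, memory, size=4):
        self.memory = memory
        self.size = size
        self.buffer = {}       # address -> instruction, in insertion order
        self.memory_reads = 0  # counts fetches that went to main memory

    def fetch(self, address):
        if address in self.buffer:            # hit: no memory access needed
            return self.buffer[address]
        self.memory_reads += 1                # miss: go to main memory
        instruction = self.memory[address]
        if len(self.buffer) >= self.size:
            oldest = next(iter(self.buffer))  # dicts preserve insertion order
            del self.buffer[oldest]
        self.buffer[address] = instruction
        return instruction


program = ["load", "add", "store", "branch"]
loop_buffer = LoopBuffer(program, size=4)
for _ in range(3):              # run a 4-instruction loop three times
    for address in range(4):
        loop_buffer.fetch(address)
print(loop_buffer.memory_reads)
```

Only the first pass through the loop reaches main memory; later iterations are served from the buffer, which is why the technique suits small loops.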
Loop Buffer Diagram
Branch Prediction (1)
  Predict never taken
    Assume that jump will not happen
    Always fetch next instruction
    68020 & VAX 11/780
    VAX will not prefetch after branch if a page fault would result (O/S v CPU design)
  Predict always taken
    Assume that jump will happen
    Always fetch target instruction
Branch Prediction (2)
  Predict by Opcode
    Some instructions are more likely to result in a jump than others
    Can get up to 75% success
  Taken/Not taken switch
    Based on previous history
    Good for loops
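A history-based taken/not taken switch can be sketched with a two-bit saturating counter. This is one common variant and not necessarily the exact state diagram used in the slides; the class name, the initial state and the sample loop are assumptions.

```python
class TwoBitPredictor:
    """Two-bit saturating counter: 0 and 1 predict not taken, 2 and 3 predict taken."""

    def __init__(self):
        self.counter = 0  # assumed starting state: strongly not taken

    def predict(self):
        return self.counter >= 2

    def update(self, taken):
        if taken:
            self.counter = min(3, self.counter + 1)
        else:
            self.counter = max(0, self.counter - 1)


def count_correct(outcomes):
    """Count how many branch outcomes the predictor guesses correctly."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict() == taken:
            correct += 1
        predictor.update(taken)
    return correct


# A loop branch taken nine times and then falling through, executed twice.
loop_outcomes = ([True] * 9 + [False]) * 2
print(count_correct(loop_outcomes), "of", len(loop_outcomes))
```

After warming up, the predictor mispredicts only the loop exit, because a single not-taken outcome is not enough to flip its prediction.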
Branch Prediction (3)
  Delayed Branch
    Do not take jump until you have to
    Rearrange instructions
Branch Prediction Flowchart
Branch Prediction State Diagram
Dealing With Branches
Intel 80486 Pipelining
  Fetch
    From cache or external memory
    Put in one of two 16-byte prefetch buffers
    Fill buffer with new data as soon as old data consumed
    Average 5 instructions fetched per load
    Independent of other stages to keep buffers full
  Decode stage 1
    Opcode & address-mode info
    At most first 3 bytes of instruction
    Can direct D2 stage to get rest of instruction
  Decode stage 2
    Expand opcode into control signals
    Computation of complex address modes
  Execute
    ALU operations, cache access, register update
  Writeback
    Update registers & flags
    Results sent to cache & bus interface write buffers
80486 Instruction Pipeline Examples
Pentium 4 Registers
EFLAGS Register
Control Registers
MMX Register Mapping
  MMX uses several 64 bit data types
  Use 3 bit register address fields
    8 registers
  No MMX specific registers
    Aliasing to lower 64 bits of existing floating point registers
Mapping of MMX Registers to Floating-Point Registers
Pentium Interrupt Processing
  Interrupts
    Maskable
    Nonmaskable
  Exceptions
    Processor detected
    Programmed
  Interrupt vector table
    Each interrupt type assigned a number
    Index to vector table
    256 * 32 bit interrupt vectors
  5 priority classes
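Using the interrupt number as an index into the vector table can be illustrated with a small calculation. The sketch assumes a flat table of 256 contiguous 32-bit entries starting at offset 0; the constant and function names are illustrative and are not taken from Intel documentation.

```python
VECTOR_COUNT = 256  # number of interrupt vectors, as stated in the slides
VECTOR_BYTES = 4    # 32 bits per vector


def vector_offset(interrupt_number):
    """Byte offset of an interrupt's vector within the table."""
    if not 0 <= interrupt_number < VECTOR_COUNT:
        raise ValueError("interrupt number out of range")
    return interrupt_number * VECTOR_BYTES


print(vector_offset(0), vector_offset(13), vector_offset(255))
print(VECTOR_COUNT * VECTOR_BYTES)  # total table size in bytes
```

The lookup is a single multiply and add, so the processor can locate the handler for any interrupt type in constant time.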
PowerPC User Visible Registers
PowerPC Register Formats