Sequential CPU Implementation
– 2 – Processor
Suggested Reading
- Chap 4.3
Y86 Instruction Set (P259)

Instruction          Byte 0   Byte 1   Bytes 2-5
nop                  0 0
halt                 1 0
rrmovl rA, rB        2 0      rA rB
irmovl V, rB         3 0      8 rB     V
rmmovl rA, D(rB)     4 0      rA rB    D
mrmovl D(rB), rA     5 0      rA rB    D
OPl rA, rB           6 fn     rA rB
jXX Dest             7 fn     Dest (bytes 1-4)
call Dest            8 0      Dest (bytes 1-4)
ret                  9 0
pushl rA             A 0      rA 8
popl rA              B 0      rA 8

Function codes (icode ifun)
addl  6 0
subl  6 1
andl  6 2
xorl  6 3
jmp   7 0
jle   7 1
jl    7 2
je    7 3
jne   7 4
jge   7 5
jg    7 6
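
The encoding table above can be turned into bytes mechanically. The following is a minimal sketch, not the course's assembler: the helper names are hypothetical, the register IDs for %eax, %ecx, %edx, %ebx (0 to 3) and the little-endian layout of the 4-byte constant are assumptions based on the usual Y86 conventions rather than on this slide.

```python
import struct

# Assumed register IDs (usual Y86 convention; not listed on this slide).
REG = {"%eax": 0, "%ecx": 1, "%edx": 2, "%ebx": 3}
NO_REG = 8  # the "8" nibble in the table marks an unused register field


def encode_irmovl(value, rb):
    """irmovl V, rB -> bytes 3 0 | 8 rB | V (4 bytes, assumed little-endian)."""
    return bytes([0x30, (NO_REG << 4) | REG[rb]]) + struct.pack("<i", value)


def encode_opl(fn, ra, rb):
    """OPl rA, rB -> bytes 6 fn | rA rB."""
    return bytes([0x60 | fn, (REG[ra] << 4) | REG[rb]])


def encode_jxx(fn, dest):
    """jXX Dest -> bytes 7 fn | Dest (4 bytes, assumed little-endian)."""
    return bytes([0x70 | fn]) + struct.pack("<I", dest)


if __name__ == "__main__":
    print(encode_irmovl(0x100, "%ebx").hex())   # 6-byte instruction
    print(encode_opl(0, "%edx", "%ebx").hex())  # addl %edx, %ebx
    print(encode_jxx(3, 0x19).hex())            # je 0x19
```

The instruction lengths that fall out of this (6, 2, and 5 bytes) are the same amounts by which the PC is incremented in the stage tables that follow.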
Building Blocks (P278, P279, P280)
Combinational Logic
- Compute Boolean functions of inputs
- Continuously respond to input changes
- Operate on data and implement control
Storage Elements
- Store bits
- Addressable memories
- Non-addressable registers
- Loaded only as clock rises
[Figure: register file with read ports A and B (srcA, valA, srcB, valB), write port W (dstW, valW), and clock input; ALU with inputs A, B, and fun; MUX with inputs 0 and 1; equality comparator; clocked register]
Hardware Control Language
- Very simple hardware description language
- Can only express limited aspects of hardware operation
  - Parts we want to explore and modify
Data Types
- bool: Boolean
  - a, b, c, …
- int: words
  - A, B, C, …
  - Does not specify word size (bytes, 32-bit words, …)
Statements
- bool a = bool-expr ;
- int A = int-expr ;
HCL Operations
- Classify by type of value returned
Boolean Expressions
- Logic operations
  - a && b, a || b, !a
- Word comparisons
  - A == B, A != B, A < B, A <= B, A >= B, A > B
- Set membership
  - A in { B, C, D }
  - Same as A == B || A == C || A == D
Word Expressions
- Case expressions
  - [ a : A; b : B; c : C ]
  - Evaluate test expressions a, b, c, … in sequence
  - Return word expression A, B, C, … for first successful test
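
The two less familiar HCL forms, set membership and case expressions, can be mimicked in ordinary code. This is only an illustrative sketch: `hcl_in`, `hcl_case`, and `min3` are hypothetical names, and returning `None` when no test succeeds is an assumption (the slide does not say what happens in that case).

```python
def hcl_in(a, members):
    """A in { B, C, D }: same as A == B || A == C || A == D."""
    return any(a == m for m in members)


def hcl_case(*cases):
    """[ a : A; b : B; c : C ]: return the word paired with the first true test.

    Each case is a (test, value) pair, checked in the order given.
    """
    for test, value in cases:
        if test:
            return value
    return None  # assumption: no test succeeded


def min3(a, b, c):
    """Example case expression: the minimum of three words."""
    return hcl_case(
        (a <= b and a <= c, a),
        (b <= a and b <= c, b),
        (True, c),  # a final test of 1 acts as the default case
    )


if __name__ == "__main__":
    print(min3(3, 1, 2))
    print(hcl_in(2, {1, 2, 3}))
```

A final case whose test is the constant 1 plays the role of a default, which is the pattern used by the control logic definitions later in these slides.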
4.3.1 Instruction Execution Stages (P281)
- Fetch: Read instruction from instruction memory
- Decode: Read program registers
- Execute: Compute value or address
- Memory: Read or write data
- Write Back: Write program registers
- PC: Update program counter
Instruction Decoding
Instruction Format
- Instruction byte: icode:ifun
- Optional register byte: rA:rB
- Optional constant word: valC
Example: in the encoding 5 0 | rA rB | D, the first byte is icode:ifun, the second byte is the optional register byte rA:rB, and D is the optional constant word valC.
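
Splitting an instruction into these fields is a matter of nibble extraction. The sketch below is a hypothetical helper, not part of the course material; it assumes the constant word is 4 bytes in little-endian order and uses 8 as the placeholder for an absent register field.

```python
import struct


def split_fields(mem, pc, need_regids, need_valc):
    """Return (icode, ifun, rA, rB, valC, valP) for the instruction at pc.

    need_regids / need_valc say whether the optional register byte and
    the optional constant word are present for this instruction.
    """
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    pos = pc + 1
    ra = rb = 8  # placeholder when there is no register byte
    if need_regids:
        ra, rb = mem[pos] >> 4, mem[pos] & 0xF
        pos += 1
    valc = None
    if need_valc:
        # Assumed: 4-byte little-endian signed constant.
        (valc,) = struct.unpack_from("<i", mem, pos)
        pos += 4
    return icode, ifun, ra, rb, valc, pos  # pos is the incremented PC (valP)


if __name__ == "__main__":
    # Bytes 5 0 | 0 3 | 0x12: icode 5 with register byte and displacement.
    code = bytes.fromhex("500312000000")
    print(split_fields(code, 0, need_regids=True, need_valc=True))
```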
Executing Arith./Logical Operation (Figure 4.16, P283)
OPl rA, rB    encoding: 6 fn | rA rB
- Fetch: Read 2 bytes
- Decode: Read operand registers
- Execute: Perform operation, set condition codes
- Memory: Do nothing
- Write back: Update register
- PC Update: Increment PC by 2
Stage Computation: Arith/Log. Ops (Figure 4.16, P283)
- Formulate instruction execution as sequence of simple steps
- Use same general form for all instructions

Stage        OPl rA, rB                   Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valP ← PC+2                  Compute next PC
Decode       valA ← R[rA]                 Read operand A
             valB ← R[rB]                 Read operand B
Execute      valE ← valB OP valA          Perform ALU operation
             Set CC                       Set condition code register
Memory
Write back   R[rB] ← valE                 Write back result
PC update    PC ← valP                    Update PC

M1[PC] denotes reading one byte of data from memory starting at address PC.
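
The stage table for OPl can be followed line by line in software. The following is a rough sketch under stated assumptions, not the SEQ hardware: `step_opl` is a hypothetical function, registers are a list indexed by register ID, words are 32 bits, and the condition codes are returned as a (ZF, SF, OF) tuple.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit words

ALU_OPS = {
    0: lambda b, a: b + a,  # addl
    1: lambda b, a: b - a,  # subl
    2: lambda b, a: b & a,  # andl
    3: lambda b, a: b ^ a,  # xorl
}


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement value."""
    return x - (1 << 32) if x & 0x80000000 else x


def step_opl(pc, mem, regs):
    """Run one OPl instruction; return (new PC, new registers, (ZF, SF, OF))."""
    # Fetch
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    if icode != 6:
        raise ValueError("not an OPl instruction")
    ra, rb = mem[pc + 1] >> 4, mem[pc + 1] & 0xF
    valp = pc + 2
    # Decode
    vala, valb = regs[ra], regs[rb]
    # Execute: valE <- valB OP valA, and set CC
    exact = ALU_OPS[ifun](to_signed(valb), to_signed(vala))
    vale = exact & MASK
    overflow = not (-(1 << 31) <= exact < (1 << 31))
    cc = (vale == 0, bool(vale >> 31), overflow)
    # Memory: nothing to do
    # Write back: R[rB] <- valE
    new_regs = list(regs)
    new_regs[rb] = vale
    # PC update: PC <- valP
    return valp, new_regs, cc


if __name__ == "__main__":
    regs = [0] * 8
    regs[2], regs[3] = 0x200, 0x100          # register IDs 2 and 3
    print(step_opl(0, bytes.fromhex("6023"), regs))
```

Note that the function computes every value from the old state and only then builds the new state, which mirrors how all SEQ state updates take effect together at the clock edge.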
Executing rrmovl
rrmovl rA, rB    encoding: 2 0 | rA rB
- Fetch: Read 2 bytes
- Decode: Read operand register rA
- Execute: Do nothing
- Memory: Do nothing
- Write back: Update register
- PC Update: Increment PC by 2
Stage Computation: rrmovl (Figure 4.16, P283)
- Formulate instruction execution as sequence of simple steps
- Use same general form for all instructions

Stage        rrmovl rA, rB                Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valP ← PC+2                  Compute next PC
Decode       valA ← R[rA]                 Read operand A
Execute      valE ← 0 + valA              Perform ALU operation
Memory
Write back   R[rB] ← valE                 Write back result
PC update    PC ← valP                    Update PC
Executing irmovl
irmovl V, rB    encoding: 3 0 | 8 rB | V
- Fetch: Read 6 bytes
- Decode: Do nothing
- Execute: Do nothing
- Memory: Do nothing
- Write back: Update register
- PC Update: Increment PC by 6
Stage Computation: irmovl (Figure 4.16, P283)
- Formulate instruction execution as sequence of simple steps
- Use same general form for all instructions

Stage        irmovl V, rB                 Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valC ← M4[PC+2]              Read constant value
             valP ← PC+6                  Compute next PC
Decode
Execute      valE ← 0 + valC              Perform ALU operation
Memory
Write back   R[rB] ← valE                 Write back result
PC update    PC ← valP                    Update PC
Executing rmmovl (Figure 4.17, P283)
rmmovl rA, D(rB)    encoding: 4 0 | rA rB | D
- Fetch: Read 6 bytes
- Decode: Read operand registers
- Execute: Compute effective address
- Memory: Write to memory
- Write back: Do nothing
- PC Update: Increment PC by 6
Stage Computation: rmmovl (Figure 4.17, P283)
- Use ALU for address computation

Stage        rmmovl rA, D(rB)             Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valC ← M4[PC+2]              Read displacement D
             valP ← PC+6                  Compute next PC
Decode       valA ← R[rA]                 Read operand A
             valB ← R[rB]                 Read operand B
Execute      valE ← valB + valC           Compute effective address (sum of the displacement and the base register value)
Memory       M4[valE] ← valA              Write value to memory
Write back
PC update    PC ← valP                    Update PC
Executing mrmovl
mrmovl D(rB), rA    encoding: 5 0 | rA rB | D
- Fetch: Read 6 bytes
- Decode: Read operand register rB
- Execute: Compute effective address
- Memory: Read from memory
- Write back: Update register rA
- PC Update: Increment PC by 6
Stage Computation: mrmovl (Figure 4.17, P283)
- Use ALU for address computation

Stage        mrmovl D(rB), rA             Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valC ← M4[PC+2]              Read displacement D
             valP ← PC+6                  Compute next PC
Decode       valB ← R[rB]                 Read operand B
Execute      valE ← valB + valC           Compute effective address
Memory       valM ← M4[valE]              Read data from memory
Write back   R[rA] ← valM                 Update register rA
PC update    PC ← valP                    Update PC
Executing pushl (Figure 4.18, P284)
pushl rA    encoding: A 0 | rA 8
- Fetch: Read 2 bytes
- Decode: Read stack pointer and rA
- Execute: Decrement stack pointer by 4
- Memory: Store valA at the address of new stack pointer
- Write back: Update stack pointer
- PC Update: Increment PC by 2
Stage Computation: pushl (Figure 4.18, P284)
- Use ALU to decrement stack pointer

Stage        pushl rA                     Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valP ← PC+2                  Compute next PC
Decode       valA ← R[rA]                 Read valA
             valB ← R[%esp]               Read stack pointer
Execute      valE ← valB + (-4)           Decrement stack pointer
Memory       M4[valE] ← valA              Store to stack
Write back   R[%esp] ← valE               Update stack pointer
PC update    PC ← valP                    Update PC

Note: until the write-back completes, the element written in the memory stage actually lies outside the stack (below the old stack pointer).
Executing popl
popl rA    encoding: B 0 | rA 8
- Fetch: Read 2 bytes
- Decode: Read stack pointer
- Execute: Increment stack pointer by 4
- Memory: Read from old stack pointer
- Write back: Update stack pointer, write result to register
- PC Update: Increment PC by 2
Stage Computation: popl (Figure 4.18, P284)
- Use ALU to increment stack pointer
- Must update two registers
  - Popped value
  - New stack pointer

Stage        popl rA                      Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             rA:rB ← M1[PC+1]             Read register byte
             valP ← PC+2                  Compute next PC
Decode       valA ← R[%esp]               Read stack pointer
             valB ← R[%esp]               Read stack pointer
Execute      valE ← valB + 4              Increment stack pointer
Memory       valM ← M4[valA]              Read from stack
Write back   R[%esp] ← valE               Update stack pointer
             R[rA] ← valM                 Write back result
PC update    PC ← valP                    Update PC

Note: the write-back stage writes two registers, and the two writes are ordered. They must be performed in the order shown above, because rA may itself be %esp. See Practice Problem 4.5 (book P270).
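
The effect of the write-back order can be checked with a small model. This is a minimal sketch with hypothetical names: registers are a dict keyed by register name, memory is a dict from address to word, and the two writes are applied in the order given in the table.

```python
def popl(regs, mem, ra):
    """Model popl rA; returns the new register state as a fresh dict."""
    # Decode: both operands are the stack pointer
    vala = valb = regs["%esp"]
    # Execute: increment stack pointer
    vale = valb + 4
    # Memory: read from the OLD stack pointer
    valm = mem[vala]
    # Write back, in the order shown in the stage table
    new_regs = dict(regs)
    new_regs["%esp"] = vale   # first: new stack pointer
    new_regs[ra] = valm       # second: popped value (wins if rA is %esp)
    return new_regs


if __name__ == "__main__":
    regs = {"%esp": 0x100, "%eax": 0}
    mem = {0x100: 0xABCD}
    print(popl(regs, mem, "%eax"))  # ordinary pop
    print(popl(regs, mem, "%esp"))  # popl %esp keeps the popped value
```

With the writes in this order, popl %esp leaves the value read from memory in %esp; reversing them would leave the incremented stack pointer instead.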
Executing Jumps (Figure 4.19, P284)
jXX Dest    encoding: 7 fn | Dest
- Fetch: Read 5 bytes, increment PC by 5
- Decode: Do nothing
- Execute: Determine whether to take branch based on jump condition and condition codes
- Memory: Do nothing
- Write back: Do nothing
- PC Update: Set PC to Dest if branch taken, or to incremented PC if branch not taken
[Figure: the instruction at "fall thru:" directly follows the jump and runs when the branch is not taken; the instruction at "target:" runs when the branch is taken]
Stage Computation: Jumps (Figure 4.19, P284)
- Compute both addresses
- Choose based on setting of condition codes and branch condition

Stage        jXX Dest                     Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             valC ← M4[PC+1]              Read destination address
             valP ← PC+5                  Fall-through address
Decode
Execute      Bch ← Cond(CC, ifun)         Take branch?
Memory
Write back
PC update    PC ← Bch ? valC : valP       Update PC
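
The slides do not spell out Cond(CC, ifun), so the following is only a sketch under assumptions: the condition codes are taken to be (ZF, SF, OF) in that order, and the seven jump conditions are assumed to follow the usual x86 signed-comparison rules. The function names are hypothetical.

```python
def cond(cc, ifun):
    """Branch flag for jXX with function code ifun, given cc = (ZF, SF, OF)."""
    zf, sf, of = cc
    less = sf != of  # signed "less than" after a subtraction
    table = {
        0: True,                  # jmp
        1: less or zf,            # jle
        2: less,                  # jl
        3: zf,                    # je
        4: not zf,                # jne
        5: not less,              # jge
        6: not less and not zf,   # jg
    }
    return table[ifun]


def jxx_new_pc(cc, ifun, valc, valp):
    """PC update for jXX: PC <- Bch ? valC : valP."""
    return valc if cond(cc, ifun) else valp


if __name__ == "__main__":
    # je with CC = 000 at address 0x00e: not taken, falls through to 0x013.
    print(hex(jxx_new_pc((False, False, False), 3, 0x019, 0x013)))
```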
Executing call
call Dest    encoding: 8 0 | Dest
- Fetch: Read 5 bytes, increment PC by 5
- Decode: Read stack pointer
- Execute: Decrement stack pointer by 4
- Memory: Write incremented PC to new value of stack pointer
- Write back: Update stack pointer
- PC Update: Set PC to Dest
[Figure: "return:" labels the instruction following the call; "target:" labels the instruction at Dest]
Stage Computation: call (Figure 4.19, P284)
- Use ALU to decrement stack pointer
- Store incremented PC

Stage        call Dest                    Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
             valC ← M4[PC+1]              Read destination address
             valP ← PC+5                  Compute return point
Decode       valB ← R[%esp]               Read stack pointer
Execute      valE ← valB + (-4)           Decrement stack pointer
Memory       M4[valE] ← valP              Write return value on stack
Write back   R[%esp] ← valE               Update stack pointer
PC update    PC ← valC                    Set PC to destination
Executing ret
ret    encoding: 9 0
- Fetch: Read 1 byte
- Decode: Read stack pointer
- Execute: Increment stack pointer by 4
- Memory: Read return address from old stack pointer
- Write back: Update stack pointer
- PC Update: Set PC to return address
[Figure: "return:" labels the instruction to which control returns]
Stage Computation: ret (Figure 4.19, P284)
- Use ALU to increment stack pointer
- Read return address from memory

Stage        ret                          Comment
Fetch        icode:ifun ← M1[PC]          Read instruction byte
Decode       valA ← R[%esp]               Read operand stack pointer
             valB ← R[%esp]               Read operand stack pointer
Execute      valE ← valB + 4              Increment stack pointer
Memory       valM ← M4[valA]              Read return address
Write back   R[%esp] ← valE               Update stack pointer
PC update    PC ← valM                    Set PC to return address
Computation Steps (Figure 4.16, P285)
- All instructions follow same general pattern
- Differ in what gets computed on each step

Stage        Signals        OPl rA, rB               Comment
Fetch        icode,ifun     icode:ifun ← M1[PC]      Read instruction byte
             rA,rB          rA:rB ← M1[PC+1]         Read register byte
             valC                                    [Read constant word]
             valP           valP ← PC+2              Compute next PC
Decode       valA, srcA     valA ← R[rA]             Read operand A
             valB, srcB     valB ← R[rB]             Read operand B
Execute      valE           valE ← valB OP valA      Perform ALU operation
             Cond code      Set CC                   Set condition code register
Memory       valM                                    [Memory read/write]
Write back   dstE           R[rB] ← valE             Write back ALU result
             dstM                                    [Write back memory result]
PC update    PC             PC ← valP                Update PC

Steps in brackets are not used by this instruction.
Computation Steps (Figure 4.19, P284)
- All instructions follow same general pattern
- Differ in what gets computed on each step

Stage        Signals        call Dest                Comment
Fetch        icode,ifun     icode:ifun ← M1[PC]      Read instruction byte
             rA,rB                                   [Read register byte]
             valC           valC ← M4[PC+1]          Read constant word
             valP           valP ← PC+5              Compute next PC
Decode       valA, srcA                              [Read operand A]
             valB, srcB     valB ← R[%esp]           Read operand B
Execute      valE           valE ← valB + (-4)       Perform ALU operation
             Cond code                               [Set condition code reg.]
Memory       valM           M4[valE] ← valP          Memory write
Write back   dstE           R[%esp] ← valE           Write back ALU result
             dstM                                    [Write back memory result]
PC update    PC             PC ← valC                Update PC

Steps in brackets are not used by this instruction.
Computed Values
Fetch
- icode: Instruction code
- ifun: Instruction function
- rA: Instr. register A
- rB: Instr. register B
- valC: Instruction constant
- valP: Incremented PC
Decode and Write back
- valA: Register value A
- valB: Register value B
Execute
- valE: ALU result
- Bch: Branch flag
Memory
- valM: Value from memory
4.3.2, 4.3.3, 4.3.4 SEQ Operation
State
- Program counter register (PC)
- Condition code register (CC)
- Register file
- Memories
  - Access same memory space
  - Data: for reading/writing program data
  - Instruction: for reading instructions
- All updated as clock rises
Combinational Logic
- ALU
- Control logic
- Memory reads
  - Instruction memory
  - Register file
  - Data memory
[Figure: combinational logic connected to the data memory (read, write), register file (read ports, write ports), PC, and CC]
SEQ Operation #2 (point 1) (Figure 4.23, P297)
Cycle 1: 0x000: irmovl $0x100,%ebx   # %ebx <-- 0x100
Cycle 2: 0x006: irmovl $0x200,%edx   # %edx <-- 0x200
Cycle 3: 0x00c: addl %edx,%ebx       # %ebx <-- 0x300, CC <-- 000
Cycle 4: 0x00e: je dest              # Not taken
- State set according to second irmovl instruction
- Combinational logic starting to react to state changes
[Figure: clock waveform for cycles 1 to 4; stored state: %ebx = 0x100, PC = 0x00c, CC = 100]
SEQ Operation #3 (point 2)
Same four-cycle program trace as in SEQ Operation #2.
- State set according to second irmovl instruction
- Combinational logic generates results for addl instruction
[Figure: stored state: %ebx = 0x100, PC = 0x00c, CC = 100; values waiting at the state inputs: new PC = 0x00e, new CC = 000]
SEQ Operation #4 (point 3)
Same four-cycle program trace as in SEQ Operation #2.
- State set according to addl instruction
- Combinational logic starting to react to state changes
[Figure: stored state: %ebx = 0x300, PC = 0x00e, CC = 000]
SEQ Operation #5 (point 4)
Same four-cycle program trace as in SEQ Operation #2.
- State set according to addl instruction
- Combinational logic generates results for je instruction
[Figure: stored state: %ebx = 0x300, PC = 0x00e, CC = 000; value waiting at the PC input: 0x013]
SEQ Semantics
- Achieve the same effect as a sequential execution of the assignments shown in the tables of Figures 4.16 to 4.19
  - Even though all of the state updates occur simultaneously, as the clock rises to start the next cycle
  - A problem: popl %esp needs to write two registers in sequence, so the register file control logic must handle this case
- Principle: the processor never needs to read back the state updated by an instruction in order to complete the processing of this instruction (P295)
  - If it did, the update would have to take effect within the instruction cycle
  - E.g. pushl semantics
  - E.g. no instruction needs to both set and then read the condition codes
Exercise (2010.4.7)
See the code on page 282, modified as below.
1 irmovl $-9, %edx
2 irmovl $9, %ebx
3 addl %edx, %ebx
Questions:
1) Write the binary code of these three instructions. Be sure to give the address of each instruction, starting from 0x000.
2) Trace the generic form of the addl instruction implementation and the specific execution of this instruction. Also give the CC value after addl executes.
SEQ CPU Implementation
What will we discuss today?
- The implementation of a sequential CPU: SEQ
  - Every instruction finishes in one cycle
  - Instructions execute sequentially
  - No two instructions execute in parallel or overlap
- A revised version of SEQ: SEQ+
  - Modify the PC update stage of SEQ to show the difference between ISA and implementation
Some Macros (Figure 4.24, P299)

Name      Value   Meaning
INOP      0       Code for nop instruction
IHALT     1       Code for halt instruction
IRRMOVL   2       Code for rrmovl instruction
IIRMOVL   3       Code for irmovl instruction
IRMMOVL   4       Code for rmmovl instruction
IMRMOVL   5       Code for mrmovl instruction
IOPL      6       Code for integer op instructions
IJXX      7       Code for jump instructions
…         …       …
IPOPL     B       Code for popl instruction
RESP      6       Register ID for %esp
RNONE     8       Indicates no register file access
ALUADD    0       Function for addition operation
SEQ Hardware Structure (Figure 4.20, P292)
Stages
- Fetch: Read instruction from memory
- Decode: Read program registers
- Execute: Compute value or address
- Memory: Read or write data
- Write Back: Write program registers
- PC: Update program counter
Instruction Flow
- Read instruction at address specified by PC
- Process through stages
- Update program counter
[Figure 4.20: SEQ hardware structure, drawn bottom to top. Fetch: PC, instruction memory, and PC increment produce icode, ifun, rA, rB, valC, valP. Decode: register file ports A and B, addressed by srcA and srcB, produce valA and valB. Execute: ALU with inputs aluA and aluB, plus CC, produces valE and Bch. Memory: data memory with Addr and Data inputs produces valM. Write back: register file ports E and M, addressed by dstE and dstM, receive valE and valM. PC: newPC]
Difference between semantics and implementation
ISA
- Every stage may update some state; these updates occur sequentially
SEQ
- All the state update operations occur simultaneously at the rising clock edge (except memory and CC)
SEQ Hardware (Figure 4.21, P293)
Key
- Blue boxes: predesigned hardware blocks
  - E.g., memories, ALU
- Gray boxes: control logic
  - Describe in HCL
- White ovals: labels for signals
- Thick lines: 32-bit word values
- Thin lines: 4-8 bit values
- Dotted lines: 1-bit values
[Figure 4.21: detailed SEQ hardware. Hardware blocks: instruction memory, PC increment, register file (ports A, B, E, M), ALU, CC, data memory. Control logic blocks: srcA, srcB, dstE, dstM, ALU A, ALU B, ALU fun., Mem. control (read, write), Addr, Data, New PC. Signals: icode, ifun, rA, rB, valC, valP, valA, valB, valE, Bch, valM, data out, newPC]
Fetch Logic (Figure 4.25, P299)
Predefined Blocks
- PC: Register containing PC
- Instruction memory: Read 6 bytes (PC to PC+5)
- Split: Divide instruction byte into icode and ifun
- Align: Get fields for rA, rB, and valC
[Figure: PC addresses the instruction memory; Split takes byte 0 and produces icode and ifun; Align takes bytes 1-5 and produces rA, rB, and valC; PC increment produces valP; control blocks: Instr valid, Need regids, Need valC]
Fetch Logic
Control Logic
- Instr. valid: Is this instruction valid?
- Need regids: Does this instruction have a register byte?
- Need valC: Does this instruction have a constant word?
Fetch Control Logic (P300)
pushl rA A 0 rA 8
jXX Dest 7 fn Dest
popl rA B 0 rA 8
call Dest 8 0 Dest
rrmovl rA, rB 2 0 rA rB
irmovl V, rB 3 0 8 rB V
rmmovl rA, D(rB) 4 0 rA rB D
mrmovl D(rB), rA 5 0 rA rB D
OPl rA, rB 6 fn rA rB
ret 9 0
nop 0 0
halt 1 0
bool need_regids =
    icode in { IRRMOVL, IOPL, IPUSHL, IPOPL,
               IIRMOVL, IRMMOVL, IMRMOVL };

bool instr_valid =
    icode in { INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
               IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL };
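
The fetch control signals can be cross-checked against the instruction lengths in the encoding table. The sketch below mirrors the two HCL definitions above; the `need_valc` set and the length formula are not given on this slide and are assumptions derived from the encoding table (instructions with a V, D, or Dest field carry a 4-byte constant). The function name is hypothetical.

```python
# Instruction codes, numbered as in the encoding table (0 through B).
(INOP, IHALT, IRRMOVL, IIRMOVL, IRMMOVL, IMRMOVL,
 IOPL, IJXX, ICALL, IRET, IPUSHL, IPOPL) = range(12)

NEED_REGIDS = {IRRMOVL, IOPL, IPUSHL, IPOPL, IIRMOVL, IRMMOVL, IMRMOVL}
# Assumed from the encoding table: instructions with V, D, or Dest.
NEED_VALC = {IIRMOVL, IRMMOVL, IMRMOVL, IJXX, ICALL}


def fetch_control(icode):
    """Return (instr_valid, need_regids, need_valc, length in bytes)."""
    instr_valid = icode in range(12)
    need_regids = icode in NEED_REGIDS
    need_valc = icode in NEED_VALC
    # valP = PC + 1 + (1 if register byte) + (4 if constant word)
    length = 1 + (1 if need_regids else 0) + (4 if need_valc else 0)
    return instr_valid, need_regids, need_valc, length


if __name__ == "__main__":
    for name, code in [("OPl", IOPL), ("irmovl", IIRMOVL),
                       ("jXX", IJXX), ("ret", IRET)]:
        print(name, fetch_control(code))
```

The computed lengths (2, 6, 5, and 1 bytes) agree with the PC increments used in the stage tables for these instructions.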
Decode & Write-Back Logic (Figure 4.26, P300)
Register File
- Read ports A, B
- Write ports E, M
- Addresses are register IDs or 8 (no access)
Control Logic
- srcA, srcB: read port addresses
- dstE, dstM: write port addresses
[Figure: register file with read ports A and B (addresses srcA, srcB; outputs valA, valB) and write ports E and M (addresses dstE, dstM; inputs valE, valM); the four control blocks take icode, rA, and rB as inputs]
A Source (P301)

Instruction         Decode                 Comment
OPl rA, rB          valA ← R[rA]           Read operand A
rmmovl rA, D(rB)    valA ← R[rA]           Read operand A
popl rA             valA ← R[%esp]         Read stack pointer
jXX Dest                                   No operand
call Dest                                  No operand
ret                 valA ← R[%esp]         Read stack pointer

int srcA = [
    icode in { IRRMOVL, IRMMOVL, IOPL, IPUSHL } : rA;
    icode in { IPOPL, IRET } : RESP;
    1 : RNONE; # Don't need register
];
E Destination (P301)

Instruction         Write-back             Comment
OPl rA, rB          R[rB] ← valE           Write back result
rmmovl rA, D(rB)                           None
popl rA             R[%esp] ← valE         Update stack pointer
jXX Dest                                   None
call Dest           R[%esp] ← valE         Update stack pointer
ret                 R[%esp] ← valE         Update stack pointer

int dstE = [
    icode in { IRRMOVL, IIRMOVL, IOPL } : rB;
    icode in { IPUSHL, IPOPL, ICALL, IRET } : RESP;
    1 : RNONE; # Don't need register
];
Execute Logic (Figure 4.27, P302)
Units
- ALU
  - Implements 4 required functions
  - Generates condition code values
- CC
  - Register with 3 condition code bits
- bcond
  - Computes branch flag
Control Logic
- Set CC: Should condition code register be loaded?
- ALU A: Input A to ALU
- ALU B: Input B to ALU
- ALU fun: What function should ALU compute?
[Figure: control blocks Set CC, ALU A, ALU B, and ALU fun. take icode, ifun, valC, valA, and valB as inputs; the ALU produces valE and feeds CC; bcond combines CC and ifun to produce Bch]
ALU A Input (P302)

Instruction         Execute                Comment
OPl rA, rB          valE ← valB OP valA    Perform ALU operation
rmmovl rA, D(rB)    valE ← valB + valC     Compute effective address
popl rA             valE ← valB + 4        Increment stack pointer
jXX Dest                                   No operation
call Dest           valE ← valB + (-4)     Decrement stack pointer
ret                 valE ← valB + 4        Increment stack pointer

int aluA = [
    icode in { IRRMOVL, IOPL } : valA;
    icode in { IIRMOVL, IRMMOVL, IMRMOVL } : valC;
    icode in { ICALL, IPUSHL } : -4;
    icode in { IRET, IPOPL } : 4;
    # Other instructions don't need ALU
];
ALU Operation (P303)

Instruction         Execute                Comment
OPl rA, rB          valE ← valB OP valA    Perform ALU operation
rmmovl rA, D(rB)    valE ← valB + valC     Compute effective address
popl rA             valE ← valB + 4        Increment stack pointer
jXX Dest                                   No operation
call Dest           valE ← valB + (-4)     Decrement stack pointer
ret                 valE ← valB + 4        Increment stack pointer

int alufun = [
    icode == IOPL : ifun;
    1 : ALUADD;
];
Condition Set (P303)
bool set_cc = icode in { IOPL };
- We will not discuss the details of bcond
  - Though it is also a control unit
Memory Logic (Figure 4.28, P303)
Memory
- Reads or writes memory word
Control Logic
- Mem. read: should word be read?
- Mem. write: should word be written?
- Mem. addr.: Select address
- Mem. data.: Select data
[Figure: control blocks Mem. read, Mem. write, Mem addr, and Mem data take icode, valE, valA, and valP as inputs; the data memory has read, write, address, and data in inputs, and its data out is valM]
Memory Address (P304)

Instruction         Memory                 Comment
OPl rA, rB                                 No operation
rmmovl rA, D(rB)    M4[valE] ← valA        Write value to memory
popl rA             valM ← M4[valA]        Read from stack
jXX Dest                                   No operation
call Dest           M4[valE] ← valP        Write return value on stack
ret                 valM ← M4[valA]        Read return address

int mem_addr = [
    icode in { IRMMOVL, IPUSHL, ICALL, IMRMOVL } : valE;
    icode in { IPOPL, IRET } : valA;
    # Other instructions don't need address
];
Memory Read (P304)

Instruction         Memory                 Comment
OPl rA, rB                                 No operation
rmmovl rA, D(rB)    M4[valE] ← valA        Write value to memory
popl rA             valM ← M4[valA]        Read from stack
jXX Dest                                   No operation
call Dest           M4[valE] ← valP        Write return value on stack
ret                 valM ← M4[valA]        Read return address

bool mem_read = icode in { IMRMOVL, IPOPL, IRET };
bool mem_write = icode in { IRMMOVL, IPUSHL, ICALL };
PC Update Logic (Figure 4.29, P304)
New PC
- Select next value of PC
[Figure: the New PC block takes icode, Bch, valC, valM, and valP as inputs and produces the next value of PC]
PC Update (P304)

Instruction         PC update                  Comment
OPl rA, rB          PC ← valP                  Update PC
rmmovl rA, D(rB)    PC ← valP                  Update PC
popl rA             PC ← valP                  Update PC
jXX Dest            PC ← Bch ? valC : valP     Update PC
call Dest           PC ← valC                  Set PC to destination
ret                 PC ← valM                  Set PC to return address

int new_pc = [
    icode == ICALL : valC;
    icode == IJXX && Bch : valC;
    icode == IRET : valM;
    1 : valP;
];
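
The case expression for new_pc reads top to bottom like a chain of if statements. A minimal sketch, assuming the instruction codes 7, 8, and 9 for jXX, call, and ret from the encoding table; the function is a hypothetical software stand-in for the New PC block, not the hardware itself.

```python
IJXX, ICALL, IRET = 0x7, 0x8, 0x9  # codes from the encoding table


def new_pc(icode, bch, valc, valm, valp):
    """Select the next PC, testing the cases in the same order as the HCL."""
    if icode == ICALL:
        return valc          # call: jump to destination
    if icode == IJXX and bch:
        return valc          # taken branch: jump to destination
    if icode == IRET:
        return valm          # ret: return address read from the stack
    return valp              # default: fall through to next instruction


if __name__ == "__main__":
    # A je at 0x00e that is not taken continues at valP = 0x013.
    print(hex(new_pc(IJXX, False, 0x019, None, 0x013)))
```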
SEQ Hardware (Review) (Figure 4.21, P293)
- Stages occur in sequence
- One operation in process at a time
[Figure 4.21: detailed SEQ hardware, the same diagram as on the earlier SEQ Hardware slide]
4.3.5 SEQ+ Hardware
- Still sequential implementation
- Reorder PC stage to put at beginning
PC Stage
- Task is to select PC for current instruction
- Based on results computed by previous instruction
Processor State
- PC is no longer stored in register
- But, can determine PC based on other stored information
[Figure: SEQ+ hardware. Same blocks and signals as SEQ, except that the PC logic sits at the start of the cycle and computes PC from the stored values pIcode, pBch, pValM, pValC, and pValP]
PC Computation (P308)
int pc = [
    pIcode == ICALL : pValC;
    pIcode == IJXX && pBch : pValC;
    pIcode == IRET : pValM;
    1 : pValP;
];
SEQ Summary
Implementation
- Express every instruction as series of simple steps
- Follow same general flow for each instruction type
- Assemble registers, memories, predesigned combinational blocks
- Connect with control logic
Limitations
- Too slow to be practical
- In one cycle, must propagate through instruction memory, register file, ALU, and data memory
- Would need to run clock very slowly
- Hardware units only active for fraction of clock cycle