Sequential CPU Implementation
Outline
• Logic design
• Organizing Processing into Stages
• SEQ timing
• Suggested Reading 4.2, 4.3.1 ~ 4.3.3
Arithmetic Logic Unit
• Combinational logic
  – Continuously responding to inputs
• Control signal selects function computed
  – Corresponding to 4 arithmetic/logical operations in Y86
• Also computes values for condition codes
• We will use it as a basic component for our CPU
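The ALU behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: 32-bit words, the control encoding 0 to 3 shown in the ALU figure, and the operand order `b OP a` used in the stage tables later in these slides. The names `alu`, `to_signed`, and `MASK` are hypothetical.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit words


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << 32) if x & 0x80000000 else x


def alu(fun, a, b):
    """Return (result, condition codes) for control value fun.

    fun: 0 = add, 1 = subtract, 2 = and, 3 = xor; the result is b OP a.
    """
    a &= MASK
    b &= MASK
    if fun == 0:
        full = to_signed(b) + to_signed(a)  # exact signed sum
        result = (b + a) & MASK
    elif fun == 1:
        full = to_signed(b) - to_signed(a)  # exact signed difference
        result = (b - a) & MASK
    elif fun == 2:
        result = b & a
        full = to_signed(result)  # logical ops never overflow
    elif fun == 3:
        result = b ^ a
        full = to_signed(result)
    else:
        raise ValueError("unknown ALU function")
    cc = {
        "ZF": result == 0,                  # zero flag
        "SF": bool(result & 0x80000000),    # sign flag
        "OF": full != to_signed(result),    # signed overflow flag
    }
    return result, cc
```

Because the function has no internal state, calling it again with the same inputs always gives the same outputs, which is the defining property of combinational logic.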
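[Figure: four ALU configurations with inputs A and B carrying the values X and Y. Control value 0 computes X + Y, 1 computes X - Y, 2 computes X & Y, and 3 computes X ^ Y. Each configuration also outputs the condition codes OF, ZF, SF.]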
Storage
• Registers
  – Hold single words or bits
  – Loaded as clock rises
  – Not only program registers
[Figure: register with input I, output O, and a Clock input]
Storing 1 Bit
[Figure: bistable element with outputs Q+ = q and Q– = !q, where q = 0 or 1, drawn as the chain Vin, V1, V2. Plots show V1 versus Vin and V2 versus Vin, with all voltages ranging from 0 to 1.]
Storing 1 Bit (cont.)
[Figure: bistable element (Q+ = q, Q– = !q, q = 0 or 1) with V2 connected back to Vin. On the plot of V2 versus Vin, the curve meets the line Vin = V2 at three points: Stable 0, Metastable, and Stable 1.]
Physical Analogy
[Figure: the V2 versus Vin plot with its three points Stable 0, Metastable, and Stable 1, shown next to a physical analogy with the positions Stable left, Metastable, and Stable right.]
Storing and Accessing 1 Bit
[Figure: R-S latch with inputs R and S and outputs Q+ and Q–, compared with the bistable element (Q+ = q, Q– = !q, q = 0 or 1)]
– Resetting: R = 1, S = 0 gives Q+ = 0, Q– = 1
– Setting: R = 0, S = 1 gives Q+ = 1, Q– = 0
– Storing: R = 0, S = 0 keeps Q+ = q, Q– = !q
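The three R-S latch modes can be reproduced with a small gate-level sketch. It assumes the latch is built from two cross-coupled NOR gates, which the slides do not spell out; the function names `nor` and `rs_latch` are hypothetical.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))


def rs_latch(r, s, q_plus, q_minus):
    """Evaluate the cross-coupled NOR gates until the outputs settle.

    (q_plus, q_minus) is the state before the inputs R and S are applied.
    """
    for _ in range(4):  # a few passes are enough for R/S not both 1
        new_q_plus = nor(r, q_minus)
        new_q_minus = nor(s, new_q_plus)
        if (new_q_plus, new_q_minus) == (q_plus, q_minus):
            break
        q_plus, q_minus = new_q_plus, new_q_minus
    return q_plus, q_minus
```

With R = S = 0 the function returns whatever state it was given, which is the "storing" behavior of the bistable element.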
1-Bit Latch
[Figure: R-S latch driven by a data input D and a clock input C]
– Latching: C = 1, D = d gives R = !d, S = d, so Q+ = d and Q– = !d
– Storing: C = 0 gives R = 0, S = 0, so Q+ = q and Q– = !q
Transparent 1-Bit Latch
[Figure: D latch with inputs Data (D) and Clock (C) and outputs Q+ and Q–]
– When in latching mode, combinational propagation from D to Q+ and Q–
– Value latched depends on value of D as C falls
[Figure: latching mode (C = 1, D = d, Q+ = d, Q– = !d) and a timing diagram of C, D, and Q+ over time, showing Q+ following a changing D.]
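A minimal sketch of the transparent latch follows, assuming R = C AND (NOT D) and S = C AND D feeding a NOR-based R-S latch. The gate choice and the class name `DLatch` are assumptions made for illustration.

```python
def nor(a, b):
    """Two-input NOR gate on 0/1 values."""
    return int(not (a or b))


class DLatch:
    """Transparent latch: Q+ follows D while C = 1 and holds while C = 0."""

    def __init__(self):
        self.q_plus, self.q_minus = 0, 1  # assumed initial state

    def update(self, d, c):
        r = c & (1 - d)  # R = C AND (NOT D)
        s = c & d        # S = C AND D
        for _ in range(3):  # let the cross-coupled gates settle
            self.q_plus = nor(r, self.q_minus)
            self.q_minus = nor(s, self.q_plus)
        return self.q_plus
```

Driving `update` with C held at 1 while D changes shows the output changing too; the value kept after C drops to 0 is the last D seen while C was high.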
Edge-Triggered Latch
– Only in latching mode for brief period
  • Rising clock edge
– Value latched depends on data as clock rises
– Output remains stable at all other times
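[Figure: D latch with Data on D and Clock passing through a Trigger block (signal T) to C, with outputs Q+ and Q–; timing diagram of C, D, Q+, and T over time.]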
Registers
– Stores word of data
  • Different from program registers seen in assembly code
– Collection of edge-triggered latches
– Loads input on rising edge of clock
[Figure: register with input I, output O, and Clock. Structure: eight edge-triggered latches (D, C, Q+) sharing one Clock, with inputs i7 to i0 and outputs o7 to o0.]
Register Operation
• Stores data bits
• For most of time acts as barrier between input and output
• As clock rises, loads input
[Figure: before the rising clock edge, State = x and Output = x while Input = y; after the rising clock edge, State = y and Output = y.]
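The barrier behavior of a register can be modeled as below. This is a behavioral sketch only (no gate-level timing), and the class name `Register` and method `tick` are hypothetical.

```python
class Register:
    """Clocked register: the state changes only on a 0 -> 1 clock transition."""

    def __init__(self, state=0):
        self.state = state
        self.prev_clock = 0  # assumption: the clock starts low

    def tick(self, clock, inp):
        """Present the current clock level and input; return the output."""
        if clock == 1 and self.prev_clock == 0:  # rising edge
            self.state = inp
        self.prev_clock = clock
        return self.state
```

Changing the input while the clock stays high or low has no effect on the output, which is what lets a register separate one cycle's computation from the next.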
State Machine Example
[Figure: accumulator circuit. The combinational logic contains an ALU and a MUX controlled by Load. The MUX passes either In (Load = 1) or the ALU sum of In and Out (Load = 0) to a clocked register whose output is Out.]
– Accumulator circuit
– Load or accumulate on each cycle
[Timing diagram: In takes the values x0, x1, x2, x3, x4, x5 on successive cycles. Out shows x0, x0+x1, x0+x1+x2, x3, x3+x4, x3+x4+x5, so the values x0 and x3 are loaded and the others are accumulated.]
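The accumulator can be simulated cycle by cycle. The sketch assumes the register starts at 0 and that words are 32 bits wide; the function name `accumulate` is hypothetical.

```python
def accumulate(inputs, loads, word_mask=0xFFFFFFFF):
    """Return the value of Out after each rising clock edge.

    inputs: the value on In during each cycle
    loads:  the value of Load (0 or 1) during each cycle
    """
    out = 0  # assumption: initial register contents
    history = []
    for value, load in zip(inputs, loads):
        alu_out = (value + out) & word_mask      # combinational: In + Out
        next_state = value if load else alu_out  # MUX selected by Load
        out = next_state                         # register loads as clock rises
        history.append(out)
    return history
```

With inputs 1, 2, 3, 10, 20, 30 and Load asserted on the first and fourth cycles, the outputs mirror the pattern x0, x0+x1, x0+x1+x2, x3, x3+x4, x3+x4+x5.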
Building Blocks
• Combinational Logic
  – Compute Boolean functions of inputs
  – Continuously respond to input changes
  – Operate on data and implement control
• Storage Elements
  – Store bits
  – Addressable memories
  – Non-addressable registers
  – Loaded only as clock rises
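[Figure: building blocks. Combinational: ALU (inputs A and B, control fun), MUX (inputs 0 and 1), and equality test (=). Storage: clocked register, and register file with read ports A (srcA, valA) and B (srcB, valB), write port W (dstW, valW), and Clock.]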
Summary
• Storage
  – Registers
    • Hold single words
    • Loaded as clock rises
  – Random-access memories
    • Hold multiple words
    • Possible multiple read or write ports
    • Read word when address input changes
    • Write word as clock rises
Instruction Decoding
• Instruction Format
  – Instruction byte icode:ifun
  – Optional register byte rA:rB
  – Optional constant word valC
[Figure: example encoding 5 0 rA rB D, divided into icode:ifun, the optional rA:rB byte, and the optional valC word]
Y86 Instruction Set #1
Instruction         Byte 0   Bytes 1 onward
halt                0 0
nop                 1 0
cmovXX rA, rB       2 fn     rA rB
irmovl V, rB        3 0      F rB    V (4 bytes)
rmmovl rA, D(rB)    4 0      rA rB   D (4 bytes)
mrmovl D(rB), rA    5 0      rA rB   D (4 bytes)
OPl rA, rB          6 fn     rA rB
jXX Dest            7 fn     Dest (4 bytes)
call Dest           8 0      Dest (4 bytes)
ret                 9 0
pushl rA            A 0      rA F
popl rA             B 0      rA F

cmovXX function codes: rrmovl 2 0, cmovle 2 1, cmovl 2 2, cmove 2 3, cmovne 2 4, cmovge 2 5, cmovg 2 6
Y86 Instruction Set #2
OPl function codes: addl 6 0, subl 6 1, andl 6 2, xorl 6 3
Y86 Instruction Set #3
jXX function codes: jmp 7 0, jle 7 1, jl 7 2, je 7 3, jne 7 4, jge 7 5, jg 7 6
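The instruction format can be decoded mechanically from the encoding table. The sketch below assumes little-endian 4-byte constants (as in IA32-style Y86) and uses 0xF to mean "no register"; the function name `decode` and the two lookup sets are hypothetical helpers derived from the table, not part of the slides.

```python
import struct

# icode values whose encoding includes a register byte or a constant word
NEEDS_REGIDS = {0x2, 0x3, 0x4, 0x5, 0x6, 0xA, 0xB}
NEEDS_VALC = {0x3, 0x4, 0x5, 0x7, 0x8}


def decode(mem, pc):
    """Split the instruction at mem[pc] into its fields and compute valP."""
    icode, ifun = mem[pc] >> 4, mem[pc] & 0xF
    pos = pc + 1
    rA = rB = 0xF  # 0xF stands for "no register"
    valC = None
    if icode in NEEDS_REGIDS:
        rA, rB = mem[pos] >> 4, mem[pos] & 0xF
        pos += 1
    if icode in NEEDS_VALC:
        (valC,) = struct.unpack_from("<i", mem, pos)  # little-endian word
        pos += 4
    return {"icode": icode, "ifun": ifun, "rA": rA, "rB": rB,
            "valC": valC, "valP": pos}
```

For example, the bytes `50 03 08 00 00 00` decode as icode 5 (mrmovl) with rA = 0, rB = 3, valC = 8, and a next-instruction address six bytes further on.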
Instruction Execution Stages
• Fetch
  – Read instruction from instruction memory
• Decode
  – Read program registers
• Execute
  – Compute value or address
• Memory
  – Read or write data
• Write Back
  – Write program registers
• PC
  – Update program counter
Executing Arith./Logical Operation
• Fetch
  – Read 2 bytes
• Decode
  – Read operand registers
• Execute
  – Perform operation
  – Set condition codes
• Memory
  – Do nothing
• Write back
  – Update register
• PC Update
  – Increment PC by 2
OPl rA, rB    6 fn rA rB
Stage Computation: Arith/Log. Ops
OPl rA, rB
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valP ← PC+2             Compute next PC
Decode       valA ← R[rA]            Read operand A
             valB ← R[rB]            Read operand B
Execute      valE ← valB OP valA     Perform ALU operation
             Set CC                  Set condition code register
Memory
Write back   R[rB] ← valE            Write back result
PC update    PC ← valP               Update PC
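The stage computation for OPl translates almost line for line into code. In this sketch the processor state is a plain dictionary with keys `M` (memory bytes), `R` (register list), `PC`, and `CC`; that layout, the 32-bit word size, and the function name `step_opl` are assumptions. The overflow flag is left out to keep the example short.

```python
MASK = 0xFFFFFFFF  # assumption: 32-bit words

ALU_OPS = {
    0: lambda b, a: b + a,  # addl
    1: lambda b, a: b - a,  # subl
    2: lambda b, a: b & a,  # andl
    3: lambda b, a: b ^ a,  # xorl
}


def step_opl(state):
    """Execute one OPl rA, rB instruction, stage by stage."""
    M, R, PC = state["M"], state["R"], state["PC"]
    # Fetch
    icode, ifun = M[PC] >> 4, M[PC] & 0xF
    rA, rB = M[PC + 1] >> 4, M[PC + 1] & 0xF
    valP = PC + 2
    if icode != 6:
        raise ValueError("not an OPl instruction")
    # Decode
    valA, valB = R[rA], R[rB]
    # Execute: valE <- valB OP valA, then set condition codes (OF omitted)
    valE = ALU_OPS[ifun](valB, valA) & MASK
    state["CC"] = {"ZF": valE == 0, "SF": bool(valE >> 31)}
    # Memory: nothing to do
    # Write back
    R[rB] = valE
    # PC update
    state["PC"] = valP
```

Note the operand order: `subl rA, rB` computes `R[rB] - R[rA]` and writes the result to `rB`.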
SEQ Hardware Structure
• Instruction Flow
  – Read instruction at address specified by PC
  – Process through stages
  – Update program counter
[Figure: SEQ hardware structure with the stages Fetch, Decode, Execute, Memory, and Write back. Components: PC, instruction memory, PC increment, register file (ports A, B, E, M), ALU, CC, and data memory. Signals: icode, ifun, rA, rB, valP, srcA, srcB, dstB, valA, valB, aluA, aluB, valE, newPC.]
Stage Computation
• Formulate instruction execution as sequence of simple steps
• Use same general form for all instructions
Executing rrmovl
• Fetch
  – Read 2 bytes
• Decode
  – Read operand register rA
• Execute
  – Do nothing
• Memory
  – Do nothing
• Write back
  – Update register
• PC Update
  – Increment PC by 2
rrmovl rA, rB    2 0 rA rB
Stage Computation: rrmovl
rrmovl rA, rB
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valP ← PC+2             Compute next PC
Decode       valA ← R[rA]            Read operand A
Execute      valE ← 0 + valA         Perform ALU operation
Memory
Write back   R[rB] ← valE            Write back result
PC update    PC ← valP               Update PC
Executing irmovl
• Fetch
  – Read 6 bytes
• Decode
  – Do nothing
• Execute
  – Do nothing
• Memory
  – Do nothing
• Write back
  – Update register
• PC Update
  – Increment PC by 6
irmovl V, rB    3 0 F rB V
Stage Computation: irmovl
irmovl V, rB
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valC ← M4[PC+2]         Read constant value
             valP ← PC+6             Compute next PC
Decode
Execute      valE ← 0 + valC         Perform ALU operation
Memory
Write back   R[rB] ← valE            Write back result
PC update    PC ← valP               Update PC
Executing rmmovl
• Fetch
  – Read 6 bytes
• Decode
  – Read operand registers
• Execute
  – Compute effective address
• Memory
  – Write to memory
• Write back
  – Do nothing
• PC Update
  – Increment PC by 6
rmmovl rA, D(rB)    4 0 rA rB D
Stage Computation: rmmovl
– Use ALU for address computation
rmmovl rA, D(rB)
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valC ← M4[PC+2]         Read displacement D
             valP ← PC+6             Compute next PC
Decode       valA ← R[rA]            Read operand A
             valB ← R[rB]            Read operand B
Execute      valE ← valB + valC      Compute effective address
Memory       M4[valE] ← valA         Write value to memory
Write back
PC update    PC ← valP               Update PC
Executing mrmovl
• Fetch
  – Read 6 bytes
• Decode
  – Read operand register rB
• Execute
  – Compute effective address
• Memory
  – Read from memory
• Write back
  – Update register rA
• PC Update
  – Increment PC by 6
mrmovl D(rB), rA    5 0 rA rB D
Stage Computation: mrmovl
• Use ALU for address computation
mrmovl D(rB), rA
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valC ← M4[PC+2]         Read displacement D
             valP ← PC+6             Compute next PC
Decode       valB ← R[rB]            Read operand B
Execute      valE ← valB + valC      Compute effective address
Memory       valM ← M4[valE]         Read data from memory
Write back   R[rA] ← valM            Update register rA
PC update    PC ← valP               Update PC
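The two memory-move instructions share the same fetch and address computation and differ only in the Memory and Write back stages. The sketch below assumes a state dictionary with keys `M` (a `bytearray`), `R` (register list), and `PC`, 32-bit little-endian words, and a hypothetical function name `step_mov`.

```python
import struct

MASK = 0xFFFFFFFF  # assumption: 32-bit words


def step_mov(state):
    """Execute one rmmovl (icode 4) or mrmovl (icode 5) instruction."""
    M, R, PC = state["M"], state["R"], state["PC"]
    # Fetch
    icode = M[PC] >> 4
    rA, rB = M[PC + 1] >> 4, M[PC + 1] & 0xF
    (valC,) = struct.unpack_from("<i", M, PC + 2)  # displacement D
    valP = PC + 6
    # Decode
    valB = R[rB]
    # Execute: the ALU computes the effective address
    valE = (valB + valC) & MASK
    if icode == 4:    # rmmovl: Memory stage writes valA
        valA = R[rA]
        struct.pack_into("<I", M, valE, valA & MASK)
    elif icode == 5:  # mrmovl: Memory stage reads valM, Write back stores it
        (valM,) = struct.unpack_from("<I", M, valE)
        R[rA] = valM
    else:
        raise ValueError("not rmmovl or mrmovl")
    # PC update
    state["PC"] = valP
```

Storing a register and loading it back through the same address is a quick way to check that both paths agree on the effective address.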
Executing pushl
• Fetch
  – Read 2 bytes
• Decode
  – Read stack pointer and rA
• Execute
  – Decrement stack pointer by 4
• Memory
  – Store valA at the address of new stack pointer
• Write back
  – Update stack pointer
• PC Update
  – Increment PC by 2
pushl rA    A 0 rA F
Stage Computation: pushl
• Use ALU to decrement stack pointer
pushl rA
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valP ← PC+2             Compute next PC
Decode       valA ← R[rA]            Read valA
             valB ← R[%esp]          Read stack pointer
Execute      valE ← valB + (-4)      Decrement stack pointer
Memory       M4[valE] ← valA         Store to stack
Write back   R[%esp] ← valE          Update stack pointer
PC update    PC ← valP               Update PC
Executing popl
• Fetch
  – Read 2 bytes
• Decode
  – Read stack pointer
• Execute
  – Increment stack pointer by 4
• Memory
  – Read from old stack pointer
• Write back
  – Update stack pointer
  – Write result to register
• PC Update
  – Increment PC by 2
popl rA    B 0 rA F
Stage Computation: popl
popl rA
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             rA:rB ← M1[PC+1]        Read register byte
             valP ← PC+2             Compute next PC
Decode       valA ← R[%esp]          Read stack pointer
             valB ← R[%esp]          Read stack pointer
Execute      valE ← valB + 4         Increment stack pointer
Memory       valM ← M4[valA]         Read from stack
Write back   R[%esp] ← valE          Update stack pointer
             R[rA] ← valM            Write back result
PC update    PC ← valP               Update PC
• Use ALU to increment stack pointer
• Must update two registers
  – Popped value
  – New stack pointer
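The pushl and popl tables can be sketched together, since they mirror each other. The sketch assumes register id 4 for %esp (the usual Y86 numbering, not stated in these slides), a state dictionary with keys `M` (a `bytearray`), `R`, and `PC`, and a hypothetical function name `step_stack`.

```python
import struct

MASK = 0xFFFFFFFF  # assumption: 32-bit words
ESP = 4            # assumption: register id of %esp


def step_stack(state):
    """Execute one pushl (icode A) or popl (icode B) instruction."""
    M, R, PC = state["M"], state["R"], state["PC"]
    # Fetch
    icode = M[PC] >> 4
    rA = M[PC + 1] >> 4
    valP = PC + 2
    if icode == 0xA:    # pushl rA
        valA, valB = R[rA], R[ESP]                    # Decode
        valE = (valB - 4) & MASK                      # Execute
        struct.pack_into("<I", M, valE, valA & MASK)  # Memory: new top
        R[ESP] = valE                                 # Write back
    elif icode == 0xB:  # popl rA
        valA = valB = R[ESP]                          # Decode
        valE = (valB + 4) & MASK                      # Execute
        (valM,) = struct.unpack_from("<I", M, valA)   # Memory: old top
        R[ESP] = valE                                 # Write back: pointer
        R[rA] = valM                                  # Write back: value
    else:
        raise ValueError("not pushl or popl")
    # PC update
    state["PC"] = valP
```

popl reads memory at the old stack pointer (valA) while the ALU computes the new one (valE), which is why the stack pointer is read into both valA and valB.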
Executing Jumps
jXX Dest    7 fn Dest
[Figure: instruction sequence around a jump. The instruction labeled "fall thru:" follows the jump and executes when the branch is not taken; the instruction labeled "target:" executes when the branch is taken.]
• Fetch
  – Read 5 bytes
  – Increment PC by 5
• Decode
  – Do nothing
• Execute
  – Determine whether to take branch based on jump condition and condition codes
• Memory
  – Do nothing
• Write back
  – Do nothing
• PC Update
  – Set PC to Dest if branch taken or to incremented PC if branch not taken
Stage Computation: Jumps
jXX Dest
Stage        Computation               Comment
Fetch        icode:ifun ← M1[PC]       Read instruction byte
             valC ← M4[PC+1]           Read destination address
             valP ← PC+5               Fall through address
Decode
Execute      Cnd ← Cond(CC, ifun)      Take branch?
Memory
Write back
PC update    PC ← Cnd ? valC : valP    Update PC
• Compute both addresses
• Choose based on setting of condition codes and branch condition
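The Cond(CC, ifun) function and the final PC selection can be sketched as follows. The flag combinations are the standard IA32 signed-comparison conditions, which these slides do not list, so treat them as an assumption; the names `cond` and `next_pc` are hypothetical.

```python
def cond(cc, ifun):
    """Evaluate the jump condition selected by ifun against the flags."""
    zf, sf, of = cc["ZF"], cc["SF"], cc["OF"]
    table = {
        0: True,                   # jmp: always
        1: (sf != of) or zf,       # jle
        2: sf != of,               # jl
        3: zf,                     # je
        4: not zf,                 # jne
        5: sf == of,               # jge
        6: (sf == of) and not zf,  # jg
    }
    return bool(table[ifun])


def next_pc(cc, ifun, valC, valP):
    """PC update stage for jXX: PC <- Cnd ? valC : valP."""
    return valC if cond(cc, ifun) else valP
```

Both candidate addresses (valC and valP) are available before the condition is known; the PC update stage only has to pick one.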
Executing call
call Dest    8 0 Dest
[Figure: the instruction after the call is labeled "return:"; the call destination is labeled "target:".]
• Fetch
  – Read 5 bytes
  – Increment PC by 5
• Decode
  – Read stack pointer
• Execute
  – Decrement stack pointer by 4
• Memory
  – Write incremented PC to the address given by the new stack pointer
• Write back
  – Update stack pointer
• PC Update
  – Set PC to Dest
Stage Computation: call
call Dest
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
             valC ← M4[PC+1]         Read destination address
             valP ← PC+5             Compute return point
Decode       valB ← R[%esp]          Read stack pointer
Execute      valE ← valB + (-4)      Decrement stack pointer
Memory       M4[valE] ← valP         Write return value on stack
Write back   R[%esp] ← valE          Update stack pointer
PC update    PC ← valC               Set PC to destination
• Use ALU to decrement stack pointer
• Store incremented PC
Executing ret
ret    9 0
[Figure: execution resumes at the instruction labeled "return:".]
• Fetch
  – Read 1 byte
• Decode
  – Read stack pointer
• Execute
  – Increment stack pointer by 4
• Memory
  – Read return address from old stack pointer
• Write back
  – Update stack pointer
• PC Update
  – Set PC to return address
Stage Computation: ret
ret
Stage        Computation             Comment
Fetch        icode:ifun ← M1[PC]     Read instruction byte
Decode       valA ← R[%esp]          Read operand stack pointer
             valB ← R[%esp]          Read operand stack pointer
Execute      valE ← valB + 4         Increment stack pointer
Memory       valM ← M4[valA]         Read return address
Write back   R[%esp] ← valE          Update stack pointer
PC update    PC ← valM               Set PC to return address
• Use ALU to increment stack pointer
• Read return address from memory
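The call and ret stage computations pair up the same way pushl and popl do, with the PC taking the place of a general register. The sketch assumes register id 4 for %esp, 32-bit little-endian words, a state dictionary with keys `M` (a `bytearray`), `R`, and `PC`, and a hypothetical function name `step_call_ret`.

```python
import struct

MASK = 0xFFFFFFFF  # assumption: 32-bit words
ESP = 4            # assumption: register id of %esp


def step_call_ret(state):
    """Execute one call (icode 8) or ret (icode 9) instruction."""
    M, R, PC = state["M"], state["R"], state["PC"]
    icode = M[PC] >> 4                               # Fetch
    if icode == 8:    # call Dest
        (valC,) = struct.unpack_from("<I", M, PC + 1)
        valP = PC + 5                                # return point
        valB = R[ESP]                                # Decode
        valE = (valB - 4) & MASK                     # Execute
        struct.pack_into("<I", M, valE, valP)        # Memory: push valP
        R[ESP] = valE                                # Write back
        state["PC"] = valC                           # PC update
    elif icode == 9:  # ret
        valA = valB = R[ESP]                         # Decode
        valE = (valB + 4) & MASK                     # Execute
        (valM,) = struct.unpack_from("<I", M, valA)  # Memory: pop address
        R[ESP] = valE                                # Write back
        state["PC"] = valM                           # PC update
    else:
        raise ValueError("not call or ret")
```

Running a call followed by the ret at its destination should leave the stack pointer unchanged and the PC at the instruction after the call.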
Next
• SEQ Implementation
• Suggested Reading 4.3.1, 4.3.4