Post on 14-Dec-2015
in1210/01-PDS 1TU-Delft
The Processing Unit
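Problem
[Figure: block diagram with an instruction, a Decoder, a register (Reg), and an ALU (function f, input a, output y); a question mark marks the missing control part.]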
Basic cycle
Assume an instruction occupies a single word in memory.
Basic cycle to be implemented:
1. Fetch the instruction pointed to by PC and put it in the Instruction Register (IR):
   [IR] ← M([PC])
2. Increment PC: [PC] ← [PC] + 4
3. Perform the actions specified in IR
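The basic cycle can be sketched as a small loop. This is only an illustration: the tuple instruction format and the opcodes `LOADI` and `ADDI` are hypothetical and do not come from the slides.

```python
def run(memory, steps):
    """Run a toy fetch/increment/execute loop for a fixed number of instructions."""
    pc = 0
    regs = [0, 0, 0, 0]
    for _ in range(steps):
        ir = memory[pc]        # 1. [IR] <- M([PC])
        pc += 4                # 2. [PC] <- [PC] + 4
        op, dst, value = ir    # 3. perform the action specified in IR
        if op == "LOADI":      # hypothetical opcode: load a constant
            regs[dst] = value
        elif op == "ADDI":     # hypothetical opcode: add a constant
            regs[dst] += value
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return pc, regs


# Two one-word instructions stored at byte addresses 0 and 4.
program = {0: ("LOADI", 1, 5), 4: ("ADDI", 1, 7)}
final_pc, final_regs = run(program, steps=2)
```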
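Organization
[Figure: CPU organization around a single CPU bus. Components shown: PC, IR with Decoder and control, register file (R0-R3), MAR and MDR (connected to the memory bus), Y, ALU, and Z.]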
Register gating
[Figure: register Ri, Y, ALU, and Z on the CPU bus, with a MUX (Select) that has a constant-4 input. Control signals: Ri_in, Ri_out, Y_in, Select, Z_in, Z_out.]
Operation
An operation cycle includes:
- Fetch the contents of a memory location and put them in one of the CPU registers
- Store the contents of a CPU register in a memory location
- Transfer data from register to register or to the ALU
- Perform an arithmetic or logic operation
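Fetch from Memory (1)
[Figure: MDR sits between the internal processor bus and the data lines of the memory bus. Gating signals: MDR_in and MDR_out on the processor side, MDR_inE and MDR_outE on the memory side.]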
Fetch from memory (2)
1. [MAR] ← [Ri]
2. Start read on memory bus
3. Wait for MFC response
4. Load MDR from memory bus
5. [Rj] ← [MDR]
[Figure: CPU and Memory connected by Address, Read, Data, and MFC lines.]
e.g. LHZ Rj,Ri
Fetch from memory (3)
Signal activation sequence:
1. Ri_out, MAR_in, Read
2. MDR_inE, WMFC
3. MDR_out, Rj_in
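The three control steps of the read can be traced in a few lines. This is a sketch under simplifying assumptions: memory is modelled as a dictionary that answers immediately, so waiting for MFC is only a comment, and the function name `read_word` is made up for the example.

```python
def read_word(regs, memory, i, j):
    """Apply the three control steps of a memory read: Rj <- M([Ri])."""
    bus = regs[i]        # step 1: Ri_out drives the CPU bus
    mar = bus            #         MAR_in latches the address, Read is issued
    mdr = memory[mar]    # step 2: MDR_inE loads MDR from the memory bus (WMFC)
    bus = mdr            # step 3: MDR_out drives the CPU bus
    regs[j] = bus        #         Rj_in latches the data
    return regs


regs = read_word([0, 8, 0, 0], {8: 1234}, i=1, j=2)
```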
Timing of read
[Figure: timing diagram over clock steps 1-3 showing CLK, MAR_in, address, MR, Read, MDR_inE, Data, MFC, and MDR_out.]
Store to memory
1. Ri_out, MAR_in
2. Rj_out, MDR_in, Write
3. MDR_outE, WMFC
[Figure: CPU and Memory connected by Address, Write, Data, and MFC lines.]
e.g. STW Rj,Ri
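The store sequence mirrors the read. In the sketch below, which register holds the address and which holds the data follows the step list (Ri to MAR, Rj to MDR); the dictionary memory and the name `store_word` are assumptions for illustration.

```python
def store_word(regs, memory, i, j):
    """Apply the three control steps of a memory write: M([Ri]) <- [Rj]."""
    mar = regs[i]        # step 1: Ri_out, MAR_in
    mdr = regs[j]        # step 2: Rj_out, MDR_in, Write
    memory[mar] = mdr    # step 3: MDR_outE puts the data on the memory bus, WMFC
    return memory


memory = store_word([0, 8, 77, 0], {}, i=1, j=2)
```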
Register Transfers
[Figure: register file (R0-R3) with Address_in, Address_out, R_in, and R_out on the CPU bus; Y (Y_in), ALU (F_alu), and Z (Z_in, Z_out).]
Copy of registers
Copy contents of R1 to R3:
1. Address_out = R1
2. R_out
3. Address_in = R3
4. R_in
Arithmetic Operation
ADD R3,R2,R1
Step  Action
1.    Address_out = R1, R_out, Y_in
2.    Address_out = R2, F_alu = "ADD", Z_in
3.    Address_in = R3, Z_out, R_in
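Because only one value can be on the CPU bus per step, the addition takes three steps, with Y and Z as temporary registers. A minimal sketch, assuming R_out is also active in step 2 so that R2 reaches the bus (the slide does not list it there):

```python
def add(regs, dst, src_a, src_b):
    """Three-step register addition on a single bus: R[dst] <- R[src_a] + R[src_b]."""
    y = regs[src_a]        # step 1: Address_out = src_a, R_out, Y_in
    z = y + regs[src_b]    # step 2: Address_out = src_b, (R_out assumed), F_alu = ADD, Z_in
    regs[dst] = z          # step 3: Address_in = dst, Z_out, R_in
    return regs


regs = add([0, 3, 4, 0], dst=3, src_a=1, src_b=2)
```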
Steps in time
[Figure: timing diagram of Y_in, Z_in, Z_out, and R_in over steps 1-3.]
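Register gating
[Figure: one bit line of the common bus. Each of three register bits is a D flip-flop (inputs C and D, output Q; label IR/W) connected to the bus through a tri-state based gate enabled by R1_out, R2_out, or R3_out.]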
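Timing
[Figure: timing diagram of one transfer relative to the rising edge of the clock. Labelled intervals: turn output on (R_out), transmission time, delay through ALU, set-up time, and hold time; data is available at the next register.]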
Complete instruction
1. Fetch instruction
2. Fetch the operand
3. Perform operation
4. Store result
Example: ADD (R3),R1
[R1] ← M([R3]) + [R1]
Execution fetch (1)
Step  Action
1     PC_out, MAR_in, Read, Set carry-in ALU, F_alu = "ADD", Z_in
2     Z_out, PC_in, Wait for MFC
3     MDR_out, IR_in
Steps 1-3: instruction fetch and PC update
[IR] ← M([PC])
[PC] ← [PC] + 1
Note: for architectures having PC := PC + 4 a different scheme must be used
Fetch instruction
[Figure: datapath used during instruction fetch: PC (PC_in, PC_out), MAR (MAR_in), MDR (MDR_in, MDR_out), IR (IR_in), ALU (ADD, carry), and Z (Z_in, Z_out), with the Read and WMFC signals.]
Execute
Step  Action
4     Address_out = R3, MAR_in, Read
5     Address_out = R1, R_out, Y_in, Wait for MFC
6     MDR_out, F_alu = "ADD", Z_in
7     Address_in = R1, Z_out, R_in, End
Steps 4 and 5: operand fetch
Step 6: perform addition
Step 7: store result
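Putting the fetch steps 1-3 and the execute steps 4-7 together for ADD (R3),R1 gives the following trace. It is a sketch only: memory is a dictionary indexed by word address (matching the PC + 1 scheme), the instruction is stored as a plain string, and the function name is invented for the example.

```python
def execute_add_indirect(state, memory):
    """Trace steps 1-7 for ADD (R3),R1, i.e. [R1] <- M([R3]) + [R1]."""
    # Steps 1-3: instruction fetch and PC update
    mar = state["PC"]            # step 1: PC_out, MAR_in, Read
    z = state["PC"] + 1          #         carry-in set, F_alu = ADD, Z_in
    state["PC"] = z              # step 2: Z_out, PC_in, wait for MFC
    state["IR"] = memory[mar]    # step 3: MDR_out, IR_in
    # Steps 4-5: operand fetch
    mar = state["R3"]            # step 4: Address_out = R3, MAR_in, Read
    y = state["R1"]              # step 5: Address_out = R1, R_out, Y_in, wait for MFC
    mdr = memory[mar]            #         MDR is loaded when MFC arrives
    # Step 6: perform addition
    z = y + mdr                  # MDR_out, F_alu = ADD, Z_in
    # Step 7: store result
    state["R1"] = z              # Address_in = R1, Z_out, R_in, End
    return state


state = execute_add_indirect(
    {"PC": 0, "IR": None, "R1": 10, "R3": 5},
    {0: "ADD (R3),R1", 5: 32},
)
```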
Execute
[Figure: the CPU organization diagram (PC, IR, Decoder, control, register file R0-R3, MAR, MDR, Y, Z, ALU on the CPU bus) shown again for the execute steps, with the Read signal marked.]
Branching
Step  Action
1-3   <instruction fetch as in previous example>
4     PC_out, Y_in
5     Offset-field-of-IR_out, F_alu = "ADD", Z_in
6     Z_out, PC_in, End
Conditional branching
Step  Action
1-3   <instruction fetch as in previous example>
4     PC_out, Y_in, if N=0 then End
5     Offset-field-of-IR_out, F_alu = "ADD", Z_in
6     Z_out, PC_in, End
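Steps 4-6 of the conditional branch amount to an early exit when the N flag is clear. A minimal sketch, assuming `pc` already holds the value updated during instruction fetch and `offset` is the signed offset field of IR:

```python
def branch_if_negative(pc, offset, n_flag):
    """Steps 4-6 of the conditional branch; returns the new PC."""
    y = pc              # step 4: PC_out, Y_in
    if n_flag == 0:     #         if N=0 then End (branch not taken)
        return pc
    z = y + offset      # step 5: Offset-field-of-IR_out, F_alu = ADD, Z_in
    return z            # step 6: Z_out, PC_in, End


taken = branch_if_negative(10, -4, n_flag=1)
not_taken = branch_if_negative(10, -4, n_flag=0)
```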
Control mechanisms
There are two basic control organizations:
- Hardwired control
- Micro-programmed control
Control Unit Organization
[Figure: an Encoder/Decoder block produces the control signals from the control step counter (driven by Clock CLK), the IR, the status flags, and the condition codes.]
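Separating decoding/encoding
[Figure: the control step counter (Clock, Reset, Run, End) feeds a step decoder that produces T_1 ... T_n; the instruction decoder produces Ins_1 ... Ins_n from IR; the Encoder combines these with the status flags and condition codes.]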
Generation of control signals
[Figure: gate network combining T_1, T_5, T_6, ADD, and BR to produce Z_in.]
Z_in = T_1 + T_6 . ADD + T_5 . BR
End signal
Other example:
End = T_7 . ADD + T_6 . BR + (T_6 . N + T_4 . /N) . BRN
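The hardwired equations for Z_in and End translate directly into Boolean functions of the step number and the decoded instruction. In this sketch the step signals T_1 ... T_n are represented by a single integer `t`, which is an encoding chosen for the example rather than something the slides prescribe.

```python
def z_in(t, add=False, br=False):
    """Z_in = T_1 + T_6 . ADD + T_5 . BR"""
    return bool(t == 1 or (t == 6 and add) or (t == 5 and br))


def end(t, n=False, add=False, br=False, brn=False):
    """End = T_7 . ADD + T_6 . BR + (T_6 . N + T_4 . /N) . BRN"""
    return bool(
        (t == 7 and add)
        or (t == 6 and br)
        or (((t == 6 and n) or (t == 4 and not n)) and brn)
    )
```

For instance, `z_in(6, add=True)` is true because ADD loads Z in step 6, while `end(4, n=False, brn=True)` is true because a branch-on-negative that is not taken finishes in step 4.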
PLAs
[Figure: a PLA (an AND array followed by an OR array) takes IR, counter, and flags as inputs and produces the control signals.]
Performance
Performance is dependent on:
- Power of instructions
- Cycle time
- Number of cycles per instruction
Performance improvement by:
- Multiple datapaths
- Instruction prefetching and pipelining
- Caches
Multiple datapaths
[Figure: register file (R0-R3), Y, and ALU connected by more than one datapath.]
Complete CPU
[Figure: the CPU contains an instruction unit, integer unit, floating-point unit, instruction cache, data cache, and bus interface; outside the CPU are main memory and input/output.]
Microprogrammed control
- All control bits are organized as memory
- Each memory location represents a control setting
- Memory words are called micro-instructions
Example
micro-instruction  PC_in  MAR_in  Addr_in  Z_in  ...
1                  0      1       00       1     ...
2                  1      0       00       0     ...
3                  0      0       01       0     ...
...
Basic organization
[Figure: IR feeds the starting address generator, which loads the micro-PC (driven by Clock); the micro-PC addresses the control store, which outputs the control signals.]
Micro-routine
Address  Micro-instruction
0        PC_out, MAR_in, Read, Set carry-in ALU, F_alu = "ADD", Z_in
1        Z_out, PC_in, Wait for MFC
2        MDR_out, IR_in
3        Branch to starting address routine (here 25)
...
25       PC_out, Y_in, if N=0 then goto address 0
26       Offset-field-of-IR_out, F_alu = "ADD", Z_in
27       Z_out, PC_in, End
Addresses 0-3: fetch instruction
Address 25: test N bit
Addresses 26-27: new PC address
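The micro-routine can be modelled as a small control store stepped by a micro-PC. The sketch below is an assumption-laden illustration: the sequencing tags (`next`, `dispatch`, `fetch_if_n0`, `end`) are invented for the example, and the trace stops where the hardware would jump back to address 0 for the next instruction.

```python
# Control store: address -> (control signals, sequencing tag).
CONTROL_STORE = {
    0: ("PC_out, MAR_in, Read, Set carry-in ALU, F_alu=ADD, Z_in", "next"),
    1: ("Z_out, PC_in, WMFC", "next"),
    2: ("MDR_out, IR_in", "next"),
    3: ("", "dispatch"),                 # branch to starting address of routine
    25: ("PC_out, Y_in", "fetch_if_n0"),  # if N=0 then goto address 0
    26: ("Offset-field-of-IR_out, F_alu=ADD, Z_in", "next"),
    27: ("Z_out, PC_in", "end"),
}


def trace(start_address, n_flag):
    """Return the micro-PC values visited while executing one machine instruction."""
    upc, visited = 0, []
    while True:
        visited.append(upc)
        _signals, seq = CONTROL_STORE[upc]
        if seq == "end" or (seq == "fetch_if_n0" and n_flag == 0):
            return visited  # micro-PC would now return to the fetch routine
        upc = start_address if seq == "dispatch" else upc + 1


taken = trace(25, n_flag=1)
not_taken = trace(25, n_flag=0)
```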
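Detailed organization
[Figure: IR, status flags, and condition codes feed the starting address generator, which loads the micro-PC (driven by Clock); the micro-PC addresses the control store, which outputs the control signals.]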
micro-PC
The micro-PC is incremented by 1, except:
- At End
  » Micro-PC is set to the first micro-instruction of the instruction fetch routine
- After loading IR
  » Micro-PC is set to the first micro-instruction for executing the machine instruction
- At a Branch instruction
Why micro-programming
- Flexibility
  - emulation of different instruction sets on the same hardware
- Support for powerful instructions
Structure micro-instructions
- Simplest organization: 1 bit per control signal
- However,
  - Many bits needed (e.g. 80-120 bits)
  - For many signals only one is needed per cycle; hence they can be grouped
  - Coding is possible: e.g. an address instead of a single control bit per register
Example
Field 1 (4 bits): Register address_in
Field 2 (4 bits): Register address_out
Field 3 (4 bits): Other registers_in
Field 4 (4 bits): Function ALU
Field 5 (2 bits): Read/Write/Nop
Field 6 (1 bit): Carry-in ALU
Field 7 (1 bit): WMFC
Field 8 (1 bit): End
...
F1 F2 F3 F4 F5 F6 F7 F8
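With the field widths listed above, a micro-instruction fits in 21 bits instead of one bit per control signal. The sketch below packs and unpacks such a word; placing F1 in the most significant bits is an assumption, since the slide only gives the left-to-right order, and the concrete field values are arbitrary.

```python
# (field name, width in bits), in the left-to-right order of the slide.
FIELDS = [("F1", 4), ("F2", 4), ("F3", 4), ("F4", 4),
          ("F5", 2), ("F6", 1), ("F7", 1), ("F8", 1)]

WORD_WIDTH = sum(width for _, width in FIELDS)


def pack(values):
    """Pack a dict of field values into one micro-instruction word (F1 is MSB)."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a micro-instruction word back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values


example = {"F1": 3, "F2": 1, "F3": 0, "F4": 5, "F5": 1, "F6": 1, "F7": 0, "F8": 1}
round_trip = unpack(pack(example))
```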
Forms of organization
- Little coding: horizontal organization
  - Large words
  - Little decoding logic
  - Fast
- Much coding: vertical organization
  - Small control store
  - Much decoding logic
  - Slower
- Mixed organization
Horizontal/Vertical
[Figure: Horizontal: four bits F0-F3 drive R0-R3 directly, one bit per register. Vertical: two bits F0-F1 pass through a Decoder that selects one of R0-R3.]
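The difference between the two organizations is where the one-hot register enables come from. In this sketch the horizontal form stores them directly, while the vertical form stores a 2-bit code and decodes it; treating F1 as the more significant bit is an assumption.

```python
def horizontal(f0, f1, f2, f3):
    """Horizontal: each micro-instruction bit is a register enable, no decoding."""
    return [f0, f1, f2, f3]


def vertical(f1, f0):
    """Vertical: a 2-bit field is decoded into one of four register enables."""
    index = (f1 << 1) | f0
    return [int(i == index) for i in range(4)]


same_selection = horizontal(0, 0, 1, 0) == vertical(1, 0)
```

The horizontal word needs four bits here and the vertical one only two, at the cost of the decoder and of being able to enable just one register at a time.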
Sequencing
- Thus far only a branch after fetch
- No sharing of micro-code between micro-routines
- Micro-subroutines lead to a more efficient control store
Multi-way branching
- A number of two-way branches
  - disadvantage: slows down
- More than one branch address in the micro-instruction
  - disadvantage: more bits required
- Bit-ORing of the specified branch address
Example
[Figure: the branch address in the micro-instruction (x x x 0 0) is ORed with bits taken from part of the IR (x x) to give the actual branch address (x x x . .).]
Example microroutine (1)
ADD (Rsrc)+, Rdst
Instruction format (IR):
  OP code (bits 11 and up) | Mode 010 (bits 10-8) | Rsrc (bits 7-4) | Rdst (bits 3-0)
bit 8: direct/indirect
bits 9,10: indexed (11), autodecrement (10), autoincrement (01), register (00)
Example microroutine (2)
Address  Micro-instruction
0        PC_out, MAR_in, Read, Set carry-in ALU, F_alu = "ADD", Z_in
1        Z_out, PC_in, Wait for MFC
2        MDR_out, IR_in
3        Branch{micro-PC ← 101 (from PLA); micro-PC_5,4 ← [IR_10,9]; micro-PC_3 ← [not IR_10].[not IR_9].[IR_8]}
...
121      Rsrc_out, MAR_in, Read, Set carry-in ALU, F_alu = "ADD", Z_in
122      Z_out, Rsrc_in
123      Branch{micro-PC ← 170; micro-PC_0 ← [not IR_8]}, WMFC
170      MDR_out, MAR_in, Read, WMFC
171      MDR_out, Y_in
172      Rdst_out, F_alu = "ADD", Z_in
173      Z_out, Rdst_in, End
Addresses 0-3: FETCH
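The two Branch micro-instructions use bit-ORing: a base address is modified with bits taken from IR. The sketch below computes both targets for the autoincrement mode 010. It assumes the micro-addresses are octal, which is consistent with the bit pattern 0 0 1 0 1 0 0 0 1 shown for 121; the helper names are invented for the example.

```python
def bit(word, n):
    """Return bit n of word (0 or 1)."""
    return (word >> n) & 1


def dispatch_address(ir):
    """Micro-instruction 3: base 101 (octal) with IR mode bits ORed in."""
    address = 0o101
    address |= (bit(ir, 10) << 5) | (bit(ir, 9) << 4)  # micro-PC_5,4 <- [IR_10,9]
    # micro-PC_3 <- [not IR_10].[not IR_9].[IR_8]
    address |= ((1 - bit(ir, 10)) & (1 - bit(ir, 9)) & bit(ir, 8)) << 3
    return address


def operand_address(ir):
    """Micro-instruction 123: base 170 (octal) with not-IR_8 ORed into bit 0."""
    return 0o170 | (1 - bit(ir, 8))


ir_autoincrement = 0b010 << 8  # mode field 010 in bits 10-8, other fields zero
```

For this mode the dispatch lands on 121, and because bit 8 is 0 (direct) the second branch goes to 171, skipping the extra memory read at 170 that the indirect case needs.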
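Micro-branch address
[Figure: for the instruction format OP code | Mode 010 | Rsrc | Rdst, the PLA supplies address 101; the IR mode bits and the term /IR10./IR9.IR8 are ORed in, giving the 9-bit micro-address 0 0 1 0 1 0 0 0 1 = 121.]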
Micro-branch address
[Figure: for the same instruction format, branch address 170 (0 0 1 1 1 1 0 0 0) with /IR8 ORed into bit 0.]
Next-address field (1)
- Micro-instruction contains the address of the next micro-instruction
- Larger store needed
- Branch micro-instructions no longer needed
Next-address field (2)
[Figure: components shown: IR, status flags, and condition codes feeding decoding circuits; micro-AR; control store; micro-IR with its next-address field; microinstruction decoder.]
Example
Field 0 (8 bits): Next address
Field 1 (4 bits): Register address_in
Field 2 (4 bits): Register address_out
Field 3 (4 bits): Other registers_in
Field 4 (4 bits): Function ALU
Field 5 (2 bits): Read/Write/Nop
Field 6 (1 bit): Carry-in ALU
Field 7 (1 bit): WMFC
Field 8 (1 bit): End
... PLA/ORing etc.
F1 F2 F3 F4 F5 F6 F7 F8 F0
Emulation
- A micro-program determines the machine instructions of a computer
- Suppose we have two computers M1 and M2 with different instruction sets
- By adapting the micro-program of M1, we can emulate M2
Organization
- Micro-program is often placed in ROM on the CPU chip
- Some machines had a writable control store, i.e. the user could change the instruction set