8/3/2019 Hardware Slides 18
http://slidepdf.com/reader/full/hardware-slides-18 1/46
DOC 112: Computer Hardware Lecture 18 Slide 1
Lecture 18
Designing a Central Processor Unit:
The Controller: State Sequencing and Output Logic
DOC 112: Computer Hardware Lecture 18 Slide 2
Last lecture we defined the data paths:
[Figure: data path diagram. An internal 32 bit bus connects the registers R0 to R6 (clocks c0 to c6), the registers A (c7), B (c8) and C (c9), the PC (c10), the MAR (c11), the IR (c12), the MDR (c13, c14) and the memory, together with the ALU, mask and shifter (function lines f0 to f5) and the multiplexers (select lines s0 to s6). The controller generates c0 ... c14, f0 ... f5 and s0 ... s6.]
DOC 112: Computer Hardware Lecture 18 Slide 3
The instructions were also defined:
Bits 31-24: Opcode
Bits 23-20: Rdest
Bits 19-0: Address
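As a rough illustration of this format, the three fields can be pulled out of a 32 bit word with shifts and masks. The function name and the example word are made up for this sketch; only the bit positions come from the slide.

```python
def decode_fields(instruction: int) -> dict:
    """Split a 32 bit instruction word into the fields listed above."""
    return {
        "opcode": (instruction >> 24) & 0xFF,   # bits 31-24
        "rdest": (instruction >> 20) & 0xF,     # bits 23-20
        "address": instruction & 0xFFFFF,       # bits 19-0
    }

# Hypothetical instruction word: opcode 0x12, Rdest 3, address 0x00ABC
fields = decode_fields(0x12300ABC)
```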
DOC 112: Computer Hardware Lecture 18 Slide 4
The next job is designing the controller
[Figure: state diagram with states F1, F2, F3, E1, E2, E3, E4]
The controller's state sequence looks simple enough, but there is a problem:
What should the input signal(s) be?
DOC 112: Computer Hardware Lecture 18 Slide 5
Determining the state sequences
The state sequencing depends on the instruction we are executing.
For example, if we are executing a STORE instruction we will branch from E2 to F1.
If we are executing a LOADINDIRECT instruction we will go all the way to E4 before returning to F1.
This suggests that some complex sequencing logic is required.
DOC 112: Computer Hardware Lecture 18 Slide 6
The controller - State diagram
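[Figure: state diagram with states F1, F2, F3, E1, E2, E3, E4 and transitions labelled 1 and 0]
We can try to get round this by designing a combinatorial circuit with one output C, that will tell us whether we continue to the next execution cycle or fetch another instruction.
[Figure: combinatorial circuit with inputs IR31, IR30, ..., Q2, Q1, Q0 and output C]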
DOC 112: Computer Hardware Lecture 18 Slide 7
De-multiplexers to the rescue
A demultiplexer can be used to decode the top eight bits
of the IR, and give us an output line for each
instruction.
Only one output line is 1 at any time, indicating the instruction being executed.
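A small software model of this decoder might look as follows. The opcode numbers are invented for the sketch, since the real encoding is not given on these slides; only the idea of one output line per instruction comes from the lecture.

```python
# Hypothetical opcode numbering; the real encoding is not given on the slides.
OPCODES = {"NOP": 0, "LOAD": 1, "STORE": 2, "ADD": 3}

def decode_instruction(ir: int) -> dict:
    """Model of the demultiplexer: one output line per instruction."""
    opcode = (ir >> 24) & 0xFF          # top eight bits of the IR
    return {name: int(opcode == code) for name, code in OPCODES.items()}

lines = decode_instruction(0x02100010)  # opcode field holds 2
```

With these assumed codes only the STORE line is 1 for the example word.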
DOC 112: Computer Hardware Lecture 18 Slide 8
Instructions with equivalent sequences
There are several instructions that need the same
sequence of register transfers (even though the
function bits may differ). We can simply implement
these from the output lines of the instruction decoder:
ADDS = ADD + SUBTRACT + AND + OR + XOR
SHIFTS = ASL + ASR + ROR
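In software terms each group is just an OR of decoder output lines. The sketch below assumes the decoder outputs are held in a dictionary of 0/1 values; the helper one_hot is only there to build test inputs.

```python
NAMES = ["ADD", "SUBTRACT", "AND", "OR", "XOR", "ASL", "ASR", "ROR"]

def one_hot(active: str) -> dict:
    """Decoder output with a single line set to 1 (helper for this sketch)."""
    return {name: int(name == active) for name in NAMES}

def group_lines(d: dict) -> dict:
    """OR together decoded instruction lines that share a state sequence."""
    adds = d["ADD"] | d["SUBTRACT"] | d["AND"] | d["OR"] | d["XOR"]
    shifts = d["ASL"] | d["ASR"] | d["ROR"]
    return {"ADDS": adds, "SHIFTS": shifts}
```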
DOC 112: Computer Hardware Lecture 18 Slide 9
We can now do our state assignments
DOC 112: Computer Hardware Lecture 18 Slide 10
De-multiplexers to the rescue
We can now use a 3-8 demultiplexer to decode our states
We now have hardware lines that tell us both the state and the instruction or group of instructions.
We can use these as Boolean variables in our hardware design!
DOC 112: Computer Hardware Lecture 18 Slide 11
The “C” input to the finite state machine
We can now simply write Boolean equations to define when
the finite state machine needs to return to fetch a new
instruction.
For example we can go through our register transfer tables and
find all the instructions that need exactly 2 execution cycles,
and thus determine that the condition for returning from E2 is:
(E2 · (RETURN + SHIFTS + MOVE + JUMPINDIRECT))'
DOC 112: Computer Hardware Lecture 18 Slide 12
The C input to the finite state machine
If we proceed in the same way for all the states where we may
branch back to F1 we get the following Boolean equation for C:
C = (F3 · NOP)' ·
(E1 · (SKIP + CLEAR + JUMP))' ·
(E2 · (RETURN + SHIFTS + MOVE + JUMPINDIRECT))' ·
(E3 · (COMP + DEC + INC + COMPARE +
ADDS + STOREINDIRECT + LOAD))'
Which we can easily implement with gates.
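The equation can be checked in software with a sketch like the one below. It assumes that every one of the four terms is complemented, as the E2 term is on the previous slide, so that C = 1 means "continue" and C = 0 means "return to F1". The table and function names are made up for the example, and instruction groups such as ADDS and SHIFTS are passed in by name.

```python
# For each state, the instructions (or groups) that finish in that state.
RETURN_POINTS = {
    "F3": {"NOP"},
    "E1": {"SKIP", "CLEAR", "JUMP"},
    "E2": {"RETURN", "SHIFTS", "MOVE", "JUMPINDIRECT"},
    "E3": {"COMP", "DEC", "INC", "COMPARE", "ADDS", "STOREINDIRECT", "LOAD"},
}

def c_input(state: str, instruction: str) -> int:
    """C = 1: continue to the next state; C = 0: return to F1."""
    c = 1
    for s, group in RETURN_POINTS.items():
        term = int(state == s and instruction in group)  # e.g. E2·(RETURN + ...)
        c &= 1 - term                                    # AND with the complement
    return c
```

For instance, c_input("E2", "MOVE") is 0 (MOVE is finished after E2) while c_input("E2", "LOAD") is 1.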
DOC 112: Computer Hardware Lecture 18 Slide 13
We continue using our standard method
DOC 112: Computer Hardware Lecture 18 Slide 14
Giving us the following Karnaugh maps
D2 = C• Q2• Q1’ + C• Q1• Q0’
D1 = C• Q1’ + C• Q2• Q0’ + Q2’• Q1’• Q0’
D0 = Q2’• Q1’• Q0’ + Q2’• Q1• Q0 + C• Q2• Q1• Q0’
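The three equations can be exercised with a short simulation. The state codes used below (F1 = 000, F2 = 011, F3 = 001, E1 = 010, E2 = 100, E3 = 110, E4 = 111) are not taken from the state assignment table, which is not reproduced here; they are inferred from the equations themselves and should be treated as an assumption.

```python
def next_state(c: int, q2: int, q1: int, q0: int) -> tuple:
    """D inputs of the three state flip-flops, from the equations above."""
    n2, n1, n0 = 1 - q2, 1 - q1, 1 - q0          # complemented outputs
    d2 = (c & q2 & n1) | (c & q1 & n0)
    d1 = (c & n1) | (c & q2 & n0) | (n2 & n1 & n0)
    d0 = (n2 & n1 & n0) | (n2 & q1 & q0) | (c & q2 & q1 & n0)
    return d2, d1, d0

# State codes inferred from the equations (an assumption, see above).
NAMES = {(0, 0, 0): "F1", (0, 1, 1): "F2", (0, 0, 1): "F3", (0, 1, 0): "E1",
         (1, 0, 0): "E2", (1, 1, 0): "E3", (1, 1, 1): "E4"}

def run(c: int, steps: int) -> list:
    """Names of the states visited from F1 with C held constant."""
    state, visited = (0, 0, 0), []
    for _ in range(steps):
        visited.append(NAMES[state])
        state = next_state(c, *state)
    return visited
```

With C held at 1 the machine walks F1, F2, F3, E1, E2, E3, E4 and back to F1; with C held at 0 it returns to F1 straight after F3.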
DOC 112: Computer Hardware Lecture 18 Slide 15
Further simplification
We can use the EOR simplification rule:
D0 = Q2’• Q1’• Q0’ + Q2’• Q1• Q0 + C• Q2• Q1• Q0’
D0 = Q2’• (Q1 ⊕ Q0)’ + C• Q2• Q1• Q0’
But, since we have already decoded the states, we will not bother with this
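The EOR step relies on Q1’•Q0’ + Q1•Q0 being the complement of Q1 EOR Q0. A brute-force check over all sixteen input combinations, reading the simplified form as Q2’•(Q1 EOR Q0)’ + C•Q2•Q1•Q0’, could be sketched like this (function names are made up):

```python
from itertools import product

def d0_sum_of_products(c, q2, q1, q0):
    """D0 as read from the Karnaugh map."""
    n2, n1, n0 = 1 - q2, 1 - q1, 1 - q0
    return (n2 & n1 & n0) | (n2 & q1 & q0) | (c & q2 & q1 & n0)

def d0_with_eor(c, q2, q1, q0):
    """D0 after applying the EOR simplification rule."""
    xnor = 1 - (q1 ^ q0)                 # complement of Q1 EOR Q0
    return ((1 - q2) & xnor) | (c & q2 & q1 & (1 - q0))

same = all(d0_sum_of_products(*v) == d0_with_eor(*v)
           for v in product((0, 1), repeat=4))
```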
DOC 112: Computer Hardware Lecture 18 Slide 16
Further simplification
Instead we can simplify the equations using the decoded states:
D2 = C·Q2·Q1’ + C·Q1·Q0’
D1 = C·Q1’ + C·Q2·Q0’ + F1
D0 = F1 + F2 + C·E3
DOC 112: Computer Hardware Lecture 18 Slide 17
The final circuit is simpler than expected!
DOC 112: Computer Hardware Lecture 18 Slide 18
Start Up
We did not check whether the circuit will be safe at start up, but it is.
We will need to add extra hardware to make the processor do something particular at start up (and maybe also on a signal from a reset button), so the design will be safe in any case.
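One way to convince ourselves of the safety claim is to start the next-state equations from every possible 3 bit code, including the one code the seven states do not use, and check that code 000 is always reached. The sketch assumes 000 is the code of F1, which is what the Karnaugh map equations suggest; the function names are invented.

```python
from itertools import product

def next_state(c, q2, q1, q0):
    """Next-state equations from the Karnaugh maps earlier in the lecture."""
    n2, n1, n0 = 1 - q2, 1 - q1, 1 - q0
    d2 = (c & q2 & n1) | (c & q1 & n0)
    d1 = (c & n1) | (c & q2 & n0) | (n2 & n1 & n0)
    d0 = (n2 & n1 & n0) | (n2 & q1 & q0) | (c & q2 & q1 & n0)
    return d2, d1, d0

def worst_case_clocks(state, limit=8):
    """Largest number of clock edges before reaching 000, over any C values."""
    if state == (0, 0, 0):
        return 0
    if limit == 0:
        raise RuntimeError("state 000 is never reached")
    return 1 + max(worst_case_clocks(next_state(c, *state), limit - 1)
                   for c in (0, 1))

clocks = {s: worst_case_clocks(s) for s in product((0, 1), repeat=3)}
```

Under these assumptions the unused code 101 reaches 000 in at most three clock edges, and no starting code needs more than six.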
DOC 112: Computer Hardware Lecture 18 Slide 19
The output Logic
We have now successfully designed the state sequencing logic, and all that remains is to design the output logic.
Recall that the Moore machine had no connection between the inputs and the output logic. This is a safer design methodology.
However, for the processor we use the Mealy machine (the inputs go to the output logic).
DOC 112: Computer Hardware Lecture 18 Slide 20
The output logic of the controller
The output logic is a huge combinatorial design problem.
The inputs are the states (F1, F2 etc) and the instructions (LOAD, STORE etc) which we have already decoded.
The outputs are the clock controls (c0, c1, c2 etc), the arithmetic function select lines (f0, f1, etc) and the multiplexer select lines (s0, s1, etc).
DOC 112: Computer Hardware Lecture 18 Slide 21
Clock Gates
The clock gate signals c0 to c8 determine which register is loaded at each cycle.
The MAR will use this typical gating circuit:
DOC 112: Computer Hardware Lecture 18 Slide 22
Gating The MAR
To determine when the MAR should be loaded we need to look through all the register transfer tables. This gives us an equation for CMAR:
CMAR = F1 + E1•(LOAD + STORE) +
E2•(LOADINDIRECT + STOREINDIRECT)
DOC 112: Computer Hardware Lecture 18 Slide 23
Using don’t care states
However, the only time we need the MAR to be correct is before we load the MDR. At other times we can load it without disturbing the execution.
Thus we can simplify the equation:
CMAR = F1 + E1 + E2
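The simplification is safe as long as the simplified clock fires in every case where the exact equation does; the extra loads are the don't cares. A small check, using only the instructions named in the CMAR equation plus ADD as a stand-in for "any other instruction" (an arbitrary choice for this sketch), could look like this:

```python
from itertools import product

STATES = ["F1", "F2", "F3", "E1", "E2", "E3", "E4"]
# Instructions named in the CMAR equation, plus ADD as a stand-in for the rest.
INSTRUCTIONS = ["LOAD", "STORE", "LOADINDIRECT", "STOREINDIRECT", "ADD"]

def cmar_exact(state, instr):
    """CMAR as read from the register transfer tables."""
    return (state == "F1"
            or (state == "E1" and instr in ("LOAD", "STORE"))
            or (state == "E2" and instr in ("LOADINDIRECT", "STOREINDIRECT")))

def cmar_simplified(state, instr):
    """CMAR = F1 + E1 + E2."""
    return state in ("F1", "E1", "E2")

# The simplified clock must fire whenever the exact one does.
covers = all(cmar_simplified(s, i) or not cmar_exact(s, i)
             for s, i in product(STATES, INSTRUCTIONS))
# Cases where the MAR is now loaded although nothing needs it (the don't cares).
extra = [(s, i) for s, i in product(STATES, INSTRUCTIONS)
         if cmar_simplified(s, i) and not cmar_exact(s, i)]
```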
DOC 112: Computer Hardware Lecture 18 Slide 24
The MDR Clock
The same procedure is followed for all the other register clocks. From the register transfers we find:
CMDR = F2 + E2•LOAD + E3•LOADINDIRECT
The MDR (loaded in F2) is needed in cycle 3 by the
CALL instruction, but only LOADINDIRECT uses it
after E3, so we can simplify the equation to:
CMDR = F2 + E2·LOAD + E3
DOC 112: Computer Hardware Lecture 18 Slide 25
The Register Clocks
The register to be clocked is recorded in the IR bits 20-22. The
condition for any register (Rdest) to receive a clock edge is:
CRdest = E4+ E3•(LOAD+ADD+INC+DEC+COMP) +
E2•(ASL + MOVE +
CALL+CALLINDIRECT) +
E1•CLEAR
It cannot be simplified further
Bits 31-24: Opcode
Bits 23-20: Rdest
Bits 19-16: Rsrc
Bits 15-0: Unused
DOC 112: Computer Hardware Lecture 18 Slide 26
The Register clocks
A decoder is required to determine which register is clocked.
A four bit decoder is required if we expand the design to 16 registers.
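In software the decoder plus clock gating amounts to the following sketch. It assumes seven registers R0 to R6 as in the data path diagram and takes CRdest as an already computed 0/1 value; the function name and example IR value are invented.

```python
def register_clocks(ir: int, c_rdest: int, n_registers: int = 7) -> list:
    """Gate the CRdest condition onto the one register named in the IR."""
    rdest = (ir >> 20) & 0x7            # IR bits 20-22 select the register
    return [int(c_rdest and rdest == r) for r in range(n_registers)]

# Hypothetical IR contents with Rdest = 2
clocks = register_clocks(0x00200000, c_rdest=1)
```

Only the entry for R2 is 1 in this example; with c_rdest = 0 no register is clocked at all.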
DOC 112: Computer Hardware Lecture 18 Slide 27
The Shifter Function
The shifter function is defined as follows.
The control bits are defined by equations:
f4 = ASR+ROR
f3 = ASL+ROR
00 is the default function
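The two equations imply a particular code for each shift instruction. The sketch below simply evaluates them for a decoded instruction name; ADD is included only as an example of a non-shift instruction.

```python
def shifter_select(instr: str) -> tuple:
    """Return (f4, f3) for a decoded instruction name."""
    f4 = int(instr in ("ASR", "ROR"))
    f3 = int(instr in ("ASL", "ROR"))
    return f4, f3

codes = {name: shifter_select(name) for name in ("ASL", "ASR", "ROR", "ADD")}
```

This gives 01 for ASL, 10 for ASR, 11 for ROR and the default 00 for everything else.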
DOC 112: Computer Hardware Lecture 18 Slide 28
The ALU Function
f2 = E3•(COMP+OR+AND) + E2•(COMP+DEC)
f1 = E3•(SUBTRACT+COMPARE+DEC+INC+ADD+AND)
+ E2•(COMP+DEC)
f0 = E3•(DEC + INC + ADD + OR) + E2•(COMP+DEC)
DOC 112: Computer Hardware Lecture 18 Slide 29
The carry in bit
The default will be 0
The only place that a 1 carry is required is INC•E3
Thus
f5 = INC•E3
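The ALU function bits and the carry in can be evaluated together from the decoded state and instruction. This is only a transcription of the equations for f2, f1, f0 and f5 into code; the function name is made up, and the meaning of each function code is whatever the ALU table defines.

```python
def alu_controls(state: str, instr: str) -> dict:
    """f2, f1, f0 and the carry in f5 from the equations on these slides."""
    e2, e3 = state == "E2", state == "E3"
    e2_term = e2 and instr in ("COMP", "DEC")    # E2·(COMP + DEC)
    f2 = (e3 and instr in ("COMP", "OR", "AND")) or e2_term
    f1 = (e3 and instr in ("SUBTRACT", "COMPARE", "DEC", "INC", "ADD", "AND")) or e2_term
    f0 = (e3 and instr in ("DEC", "INC", "ADD", "OR")) or e2_term
    f5 = e3 and instr == "INC"                   # the only case needing a 1 carry
    return {"f2": int(f2), "f1": int(f1), "f0": int(f0), "f5": int(f5)}
```

Note that INC and ADD in E3 produce the same function bits and differ only in the carry in.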
DOC 112: Computer Hardware Lecture 18 Slide 30
The multiplexer selection bits
The multiplexer selections are defined as follows:
DOC 112: Computer Hardware Lecture 18 Slide 31
The internal bus selector: s6 s5 s4
First we need to look at the register transfer tables to determine when the different paths are selected. Using, for example,
SPC to mean the condition when the PC is selected we find:
SPC = E2•(CALL+CALLINDIRECT)
SALU = E1•CLEAR + (E2+E3)•(INC+DEC+COMP)
+ TWO•E3
SMask = E1•(LOAD+JUMP + STORE) + E3•CALL
SMDR = LOAD•E3 + E4
DOC 112: Computer Hardware Lecture 18 Slide 32
The internal bus selector: s6 s5 s4
Using the unallocated
selections as don’t cares
we can write:
s4 = SALU + SMDR
s5 = SPC
s6 = SMask + SMDR
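These three equations fix the select code seen by the bus multiplexer for each path. The sketch below evaluates them with one condition active at a time; the resulting codes are what the equations imply, not a copy of the selection table.

```python
def bus_select(spc: int, salu: int, smask: int, smdr: int) -> tuple:
    """Return (s6, s5, s4) from the four path-selection conditions."""
    s4 = salu | smdr
    s5 = spc
    s6 = smask | smdr
    return s6, s5, s4

codes = {
    "PC": bus_select(1, 0, 0, 0),
    "ALU": bus_select(0, 1, 0, 0),
    "Mask": bus_select(0, 0, 1, 0),
    "MDR": bus_select(0, 0, 0, 1),
    "none": bus_select(0, 0, 0, 0),
}
```

So (s6 s5 s4) is 010 for the PC, 001 for the ALU, 100 for the mask, 101 for the MDR, and 000 when none of the four conditions holds.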
DOC 112: Computer Hardware Lecture 18 Slide 33
The register selector
We can find the conditions defining the register selector from places
in the register transfer tables where A or B are loaded. Sometimes
the register to be selected is the source (Rsrc: bits 19-16),
sometimes it is the destination (Rdest: bits 23-20), sometimes the
internal bus.
SRsrc = E1·(INDIRECT + TWO)
SBus = E2·ONE
We will use SRdest = (SRsrc + SBus)'
(INDIRECT, ONE, and TWO are Boolean variables indicating the
instruction type)
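The three selection conditions can be sketched as below, with INDIRECT, ONE and TWO passed in as 0/1 values. The function name is invented; the point is that SRdest is simply the default whenever neither of the other two conditions holds.

```python
def register_select(state: str, indirect: int, one: int, two: int) -> dict:
    """Selection conditions for the register multiplexer."""
    s_rsrc = int(state == "E1" and bool(indirect or two))   # E1·(INDIRECT + TWO)
    s_bus = int(state == "E2" and bool(one))                # E2·ONE
    s_rdest = 1 - (s_rsrc | s_bus)      # default when neither condition holds
    return {"SRsrc": s_rsrc, "SBus": s_bus, "SRdest": s_rdest}
```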
DOC 112: Computer Hardware Lecture 18 Slide 34
The register selector
The selection is done by a multiplexer, with an additional
set of gates to impose the SBus condition.
DOC 112: Computer Hardware Lecture 18 Slide 35
The PC selector
Last but not least we can get the conditions for the PC
selector from the register transfer tables:
s3 = F1 + E1·(CALL+CALLINDIRECT)
DOC 112: Computer Hardware Lecture 18 Slide 36
How did we do?
We can now make a wiring list, buy the components from Maplin and test it.
The components will cost £200-£300 (over twice the price of an Intel Core 2).
The clock could be set at about 10 kHz (a bit faster if we fabricate it on a single chip).
So it looks as if we had better consider the Mark 2
version straight away.
DOC 112: Computer Hardware Lecture 18 Slide 37
Improvements
All instructions are 32 bit, but mostly the bottom 16 bits are empty.
This means that we are wasting memory space and doing many more fetch cycles than we need.
We could pack up the instructions on byte boundaries
and introduce some multiplexing hardware to load the
IR correctly.
DOC 112: Computer Hardware Lecture 18 Slide 38
More Arithmetic hardware
We have three unused inputs on the multiplexer that selects the internal bus.
Additional arithmetic hardware could include:
A sixteen bit multiplier (multiply the bottom 16 bits of A and B to obtain a 32 bit result)
An incrementer
A decrementer
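As a quick illustration of the multiplier option: masking A and B to their bottom 16 bits guarantees that the product fits in 32 bits. The function name is made up for this sketch.

```python
def multiply16(a: int, b: int) -> int:
    """Multiply the bottom 16 bits of A and B, giving a result of up to 32 bits."""
    return (a & 0xFFFF) * (b & 0xFFFF)

largest = multiply16(0xFFFF, 0xFFFF)    # largest possible product
```

The largest possible product, 0xFFFF times 0xFFFF, is 0xFFFE0001, which still fits on the 32 bit bus.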
DOC 112: Computer Hardware Lecture 18 Slide 39
Other functionality
A circuit to test if the result (or internal bus) was zero would enable us to provide a SKIP_EQUAL instruction. (The software department would be very keen to have this).
This would require a 32 bit OR gate and a single bit
register.
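A software model of this zero test might look like the sketch below. It assumes the flag stored in the single bit register is the inverted output of the OR gate, so that 1 means "the bus was zero"; the class and function names are invented.

```python
def is_zero(bus: int) -> int:
    """1 when all 32 bus bits are 0: the complement of a 32 input OR."""
    any_bit = 0
    for bit in range(32):
        any_bit |= (bus >> bit) & 1     # the 32 bit OR gate
    return 1 - any_bit

class ZeroFlag:
    """Single bit register holding the result of the last zero test."""
    def __init__(self):
        self.value = 0

    def clock(self, bus: int) -> None:
        self.value = is_zero(bus)
```

A SKIP_EQUAL instruction would then only need to look at ZeroFlag.value when deciding whether to skip.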
DOC 112: Computer Hardware Lecture 18 Slide 40
More Multiplexers
Additional multiplexers could help us to reduce the instruction cycles of many instructions.
For instance a multiplexer to select the input to B independently of A would reduce many three cycle instructions to two cycles.
DOC 112: Computer Hardware Lecture 18 Slide 41
More Data Paths
A data path from the registers to the internal bus would reduce some instructions by one cycle.
This would require an additional input on the bus selector multiplexer, and so might be considered an alternative to the additional arithmetic functions already discussed.
DOC 112: Computer Hardware Lecture 18 Slide 42
Optimised Combinational logic
This is the hard part.
We want to have the minimum time delays in all our
combinational logic.
This is partly a question of path length, but does
require looking at low level transistor models to
calculate the time accurately
DOC 112: Computer Hardware Lecture 18 Slide 45
And that's the end of the course
- well nearly!
Coursework 2 will be the first lab exercise of next term.
There will be a revision session before the exam at the start of the summer term. Watch the web page for the time and venue.
In the meantime
Have a great Christmas!