Recap: ALU
• Big combinational logic (16-bit bus)
• Add/subtract, and, xor, shift left/right, copy input 2
• A 3-bit control for 5 primary ALU operations
– ALU performs operations in parallel
– Control wires select which result the ALU outputs
Can we combine these 5 bits into 3 bits for 7 operations? Yes, you can. But you will still need 5 bits at the end.
Recap: Multiplexer
Goal: select one of n k-bit buses.
• Implemented by layering k n-to-1 multiplexers.
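The layering idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: buses are modeled as integers, and the names `mux_n_to_1` and `bus_mux` are hypothetical helpers, not part of the slides.

```python
def mux_n_to_1(inputs, select):
    """One-bit n-to-1 multiplexer: pass through the selected input bit."""
    return inputs[select]


def bus_mux(buses, select, k):
    """Select one of n k-bit buses by layering k one-bit n-to-1 multiplexers.

    Each bus is modeled as an int (an assumption of this sketch).
    Multiplexer i sees bit i of every bus; all k share the same select lines.
    """
    out = 0
    for i in range(k):
        bit_inputs = [(bus >> i) & 1 for bus in buses]
        out |= mux_n_to_1(bit_inputs, select) << i
    return out


# Choose bus 2 out of four 16-bit buses.
selected = bus_mux([0x1234, 0xABCD, 0x0F0F, 0xFFFF], select=2, k=16)
print(hex(selected))
```

All k multiplexers receive the same select value, which is why one select input is enough to switch the whole bus.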
Clock
Clock.
• Fundamental abstraction: regular on-off pulse.
– on: fetch phase
– off: execute phase
• External analog device.
• Synchronizes operations of different circuit elements.
• Requirement: clock cycle longer than max switching time.
[Figure: clock waveform alternating between on and off, with one period labeled "cycle time".]
Introduction to Computer Science • Robert Sedgewick and Kevin Wayne • Copyright © 2005 • http://www.cs.Princeton.EDU/IntroCS
TOY Machine Architecture
The TOY Machine
Combinational circuits. ALU.
Sequential circuits. Memory.
Machine architecture. Wire components together to make computer.
TOY machine.
• 256 16-bit words of memory.
• 16 16-bit registers.
• 1 8-bit program counter.
• 16 instruction types.
[Figure: fetch-execute cycle.]
Design a processor
How to build a processor
• Develop instruction set architecture (ISA)
– 16-bit words, 16 TOY machine instructions
• Determine major components
– ALU, memory, registers, program counter
• Determine datapath requirements
– Flow of bits
• Analyze how to implement each instruction
– Determine settings of control signals
Practice: 4-bit counter
4-bit counter
[Figure: counter C with a 4-bit input "in", a 4-bit output "out", and a control input "op".]
operation  op  semantics
reset      00  C ← 0
load       01  C ← in
inc        10  C ← C+1
dec        11  C ← C-1
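The operation table above can be read as a next-state function. The sketch below is a minimal Python model. The name `counter_step` is hypothetical, and wrapping around at 4 bits (15 + 1 gives 0, 0 - 1 gives 15) is an assumption the slide does not state.

```python
def counter_step(c, op, inp=0):
    """Return the next value of the 4-bit counter C.

    op is the 2-bit control from the table:
    00 reset, 01 load, 10 inc, 11 dec.
    """
    if op == 0b00:
        nxt = 0          # reset: C <- 0
    elif op == 0b01:
        nxt = inp        # load:  C <- in
    elif op == 0b10:
        nxt = c + 1      # inc:   C <- C+1
    elif op == 0b11:
        nxt = c - 1      # dec:   C <- C-1
    else:
        raise ValueError("op must be a 2-bit value")
    return nxt & 0xF     # keep 4 bits (wrap-around is an assumption)


c = counter_step(0, 0b01, inp=9)   # load 9
c = counter_step(c, 0b10)          # increment
print(c)
```

In hardware the same choice would be made by a multiplexer that selects among the four candidate values using op.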
Build a TOY: Interface
Instruction set architecture (ISA).
• 16-bit words, 256 words of memory, 16 registers.
• Determine set of primitive instructions.
– too narrow ⇒ cumbersome to program
– too broad ⇒ cumbersome to build hardware
• 16 instructions.
Instructions
0: halt
1: add
2: subtract
3: and
4: xor
5: shift left
6: shift right
7: load address
8: load
9: store
A: load indirect
B: store indirect
C: branch zero
D: branch positive
E: jump register
F: jump and link
TOY Reference Card

Instruction formats (bit positions 15 down to 0):

          15-12    11-8     7-4        3-0
Format 1  opcode   dest d   source s   source t
Format 2  opcode   dest d   addr (bits 7-0)

#   Operation        Fmt  Pseudocode
0:  halt             1    exit(0)
1:  add              1    R[d] ← R[s] + R[t]
2:  subtract         1    R[d] ← R[s] - R[t]
3:  and              1    R[d] ← R[s] & R[t]
4:  xor              1    R[d] ← R[s] ^ R[t]
5:  shift left       1    R[d] ← R[s] << R[t]
6:  shift right      1    R[d] ← R[s] >> R[t]
7:  load addr        2    R[d] ← addr
8:  load             2    R[d] ← mem[addr]
9:  store            2    mem[addr] ← R[d]
A:  load indirect    1    R[d] ← mem[R[t]]
B:  store indirect   1    mem[R[t]] ← R[d]
C:  branch zero      2    if (R[d] == 0) pc ← addr
D:  branch positive  2    if (R[d] > 0) pc ← addr
E:  jump register    1    pc ← R[t]
F:  jump and link    2    R[d] ← pc; pc ← addr

Register 0 always 0.
Loads from mem[FF] come from stdin.
Stores to mem[FF] go to stdout.
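The two instruction formats on the reference card differ only in how the low 8 bits are read. A small Python sketch of the field extraction follows. The function name `decode` is hypothetical, and it returns the fields of both formats at once so the caller can pick the ones its opcode uses.

```python
def decode(word):
    """Split a 16-bit TOY instruction word into its fields.

    Format 1 uses (opcode, d, s, t); Format 2 uses (opcode, d, addr).
    addr overlaps the s and t fields: it is simply bits 7-0.
    """
    opcode = (word >> 12) & 0xF   # bits 15-12
    d = (word >> 8) & 0xF         # bits 11-8
    s = (word >> 4) & 0xF         # bits 7-4
    t = word & 0xF                # bits 3-0
    addr = word & 0xFF            # bits 7-0
    return opcode, d, s, t, addr


# 1234 is an add (opcode 1): R[2] <- R[3] + R[4]
print(decode(0x1234))
# FF30 is a jump and link (opcode F): R[F] <- pc; pc <- 30
print(decode(0xFF30))
```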
Components
[Figure: major components. PC with a +1 incrementer; Registers (ports W, W Data, W Addr, A Addr, B Addr, A Data, B Data); Memory (ports W, W Data, Addr, R Data); IR (fields op, d, s, t); ALU; Cond Eval; Clock.]
Datapath
[Figure: the components (PC with +1 incrementer, Registers, Memory, IR, ALU, Cond Eval), annotated with the data flow each instruction needs.]
Opcode  Data flow
1-6     R[d] ← R[s] ALU R[t]
7       R[d] ← addr
8       R[d] ← mem[addr]
9       mem[addr] ← R[d]
A       R[d] ← mem[R[t]]
B       mem[R[t]] ← R[d]
C, D    if (R[d]?) pc ← addr
E       pc ← R[t]
F       R[d] ← pc; pc ← addr
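The data flows listed for each opcode can be checked with a small behavioral model. The Python sketch below follows the slide's pseudocode literally, including pc ← R[t] for opcode E. Several points are assumptions of the sketch rather than facts from the slides: the name `execute`, the logical (not arithmetic) right shift, the two's-complement reading of "R[d] > 0", masking addresses to 8 bits, and leaving out the mem[FF] stdin/stdout behavior.

```python
MASK16 = 0xFFFF


def to_signed(x):
    """Interpret a 16-bit pattern as two's complement (an assumption)."""
    return x - 0x10000 if x & 0x8000 else x


def execute(op, d, s, t, addr, R, mem, pc):
    """Apply one decoded instruction to registers R and memory mem.

    pc is the already-incremented program counter.
    Returns the next pc, or None for halt.
    """
    if op == 0x0:
        return None
    if 0x1 <= op <= 0x6:                 # R[d] <- R[s] ALU R[t]
        a, b = R[s], R[t]
        if op == 0x1:
            r = a + b
        elif op == 0x2:
            r = a - b
        elif op == 0x3:
            r = a & b
        elif op == 0x4:
            r = a ^ b
        elif op == 0x5:
            r = a << b
        else:
            r = a >> b                   # logical shift: an assumption
        R[d] = r & MASK16
    elif op == 0x7:
        R[d] = addr
    elif op == 0x8:
        R[d] = mem[addr]
    elif op == 0x9:
        mem[addr] = R[d]
    elif op == 0xA:
        R[d] = mem[R[t] & 0xFF]
    elif op == 0xB:
        mem[R[t] & 0xFF] = R[d]
    elif op == 0xC:
        if R[d] == 0:
            pc = addr
    elif op == 0xD:
        if to_signed(R[d]) > 0:
            pc = addr
    elif op == 0xE:
        pc = R[t] & 0xFF                 # as written on the slide
    elif op == 0xF:
        R[d] = pc
        pc = addr
    R[0] = 0                             # register 0 always 0
    return pc


R = [0] * 16
R[3], R[4] = 0x0028, 0x0064
mem = [0] * 256
next_pc = execute(0x1, 2, 3, 4, 0x34, R, mem, 0x21)   # add R[2] <- R[3] + R[4]
print(hex(R[2]), hex(next_pc))
```

Each branch of the `if` chain corresponds to one row of the opcode table, which is what the datapath multiplexers later have to select between.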
Datapath
[Figure: datapath wiring between the components. Labeled paths: pc+1; pc for jump and branch; address for load/store; result of ALU or address for load address; pc for jal; addr; store data; load. Bus widths of 8 and 16 bits are marked.]
Datapath
[Figure: datapath with multiplexers inserted wherever several sources feed one input; multiplexer inputs are labeled with their select values.]
Control
[Figure: datapath with its control signals labeled: WRITE_MEM, WRITE_IR, CLOCK_MEM, CLOCK_REG, WRITE_REG, ALU_OP, READ_REGA_MUX, WRITE_REG_MUX, MEM_ADDR_MUX, WRITE_PC, PC_MUX, ALU_MUX. Widths of 2 and 5 bits are marked on two of the control lines.]
A total of 17 control signals
TOY architecture
[Figure: complete TOY architecture. The datapath (PC, Registers, Memory, IR, ALU, Cond Eval) plus a Control unit whose inputs are Opcode, Execute, Fetch, Clock, and the Cond Eval outputs =0 and >0. A 1-bit counter driven by the clock produces the Fetch and Execute lines.]
Clock
[Figure: TOY architecture diagram showing the clock, the 1-bit counter, and the Control unit inputs Opcode, Execute, Fetch, Clock, =0, and >0.]
Clock
Two-cycle design (fetch and execute).
• Use 1-bit counter to distinguish between 2 cycles.
• Use two cycles since fetch and execute phases each access memory and alter program counter.
Clocking Methodology
Two-cycle design.
• Each control signal is in one of four epochs.
– fetch [set memory address from pc]
– fetch and clock [write instruction to IR]
– execute [set ALU inputs from registers]
– execute and clock [write result of ALU to registers]
[Figure: timing diagram of the clock against the Fetch and Execute lines, divided into Phase 1 (fetch), Phase 2 (fetch & clock), Phase 3 (execute), and Phase 4 (execute & clock).]
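The four epochs can be enumerated from two bits of state: the 1-bit counter (fetch or execute) and the clock level. The Python sketch below assumes that the "and clock" epochs are those in which the clock is high, and that the counter toggles once per clock cycle. The names `epoch` and `epochs` are hypothetical.

```python
def epoch(execute_bit, clock):
    """Name the epoch for a counter state and a clock level.

    execute_bit: 0 during the fetch cycle, 1 during the execute cycle.
    clock: 0 while values settle, 1 when state is written (assumption).
    """
    phase = "execute" if execute_bit else "fetch"
    return phase + " and clock" if clock else phase


def epochs(n_instructions):
    """Yield the epoch sequence for n instructions, in order."""
    for _ in range(n_instructions):
        for execute_bit in (0, 1):      # 1-bit counter: fetch, then execute
            for clock in (0, 1):        # clock low, then high
                yield epoch(execute_bit, clock)


for name in epochs(1):
    print(name)
```

Splitting each cycle into a settling half and a writing half is what lets values propagate through the combinational logic before any register or memory is updated.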
Clocking Methodology
[Figure: datapath diagram with fetch and execute labels.]
Program counter
Read program counter when
• Fetch
• Execute for jal
Write program counter when
• Fetch and clock
• Execute and clock, depending on conditions
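The write rule above can be written as a small predicate. In this Python sketch the name `write_pc` and its parameters are hypothetical, the "conditions" are taken to be the branch tests from the reference card, and "R[d] > 0" is read as a positive 16-bit two's-complement value (an assumption).

```python
def write_pc(phase, clock, opcode, r_d):
    """Return True when the program counter is written in this epoch.

    phase:  "fetch" or "execute"
    clock:  1 in the "and clock" epochs, else 0
    opcode: 4-bit opcode from the IR
    r_d:    16-bit contents of R[d], used by the branch conditions
    """
    if not clock:
        return False                 # nothing is written while values settle
    if phase == "fetch":
        return True                  # fetch and clock: pc <- pc + 1
    if opcode == 0xC:                # branch zero
        return r_d == 0
    if opcode == 0xD:                # branch positive
        return 0 < r_d < 0x8000
    return opcode in (0xE, 0xF)      # jump register, jump and link


print(write_pc("fetch", 1, 0x1, 0))      # every instruction increments pc
print(write_pc("execute", 1, 0x1, 0))    # add leaves pc alone in execute
print(write_pc("execute", 1, 0xC, 0))    # taken branch zero
```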
Example: ADD
[Figure: datapath in its initial state; the PC holds 20 and the IR contents are unknown (????).]
State: PC=20, Mem[20]=1234, R[3]=0028, R[4]=0064
Example: ADD (fetch)
[Figure sequence, three steps: (1) PC = 20 drives the memory Addr input while the IR is still unknown (????). (2) Memory R Data returns Mem[20] = 1234. (3) The incrementer computes PC + 1 = 21.]
State: PC=20, Mem[20]=1234, R[3]=0028, R[4]=0064
Example: ADD (fetch and clock)
[Figure: the PC now holds 21 and the IR holds 1234, split into the fields op = 1, d = 2, s = 3, t = 4.]
State: PC=21, IR=1234, Mem[20]=1234, R[3]=0028, R[4]=0064
Example: ADD (execute)
[Figure sequence, three steps: (1) IR fields s = 3 and t = 4 select the registers to read. (2) The registers output 0028 and 0064. (3) The ALU outputs 0028 + 0064 = 008C.]
State: PC=21, IR=1234, Mem[20]=1234, R[3]=0028, R[4]=0064
Example: ADD (execute and clock)
[Figure: the ALU result 008C is written to the register selected by d = 2.]
State: PC=21, IR=1234, R[2]=008C, Mem[20]=1234, R[3]=0028, R[4]=0064
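The ADD walk-through can be replayed in software, one epoch at a time. The Python sketch below uses the slide's values (PC=20, Mem[20]=1234, R[3]=0028, R[4]=0064, all hexadecimal). The function name `run_add_example` and the intermediate variable names are hypothetical.

```python
def run_add_example():
    """Replay the four epochs of the ADD example and return the final state."""
    pc = 0x20
    mem = [0] * 256
    mem[0x20] = 0x1234
    R = [0] * 16
    R[3], R[4] = 0x0028, 0x0064

    # Phase 1, fetch: pc addresses memory; pc + 1 is computed alongside.
    r_data = mem[pc]
    pc_plus_1 = (pc + 1) & 0xFF

    # Phase 2, fetch and clock: write the IR and the incremented pc.
    ir = r_data
    pc = pc_plus_1

    # Phase 3, execute: IR fields select registers; the ALU adds them.
    d = (ir >> 8) & 0xF
    s = (ir >> 4) & 0xF
    t = ir & 0xF
    alu_out = (R[s] + R[t]) & 0xFFFF

    # Phase 4, execute and clock: write the ALU result to R[d].
    R[d] = alu_out

    return {"pc": pc, "ir": ir, "R": R}


state = run_add_example()
print(hex(state["pc"]), hex(state["ir"]), hex(state["R"][2]))
```

The printed values should match the final slide of the example: PC=21, IR=1234, R[2]=008C.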
Example: Jump and link
[Figure: datapath in its initial state for a jump and link instruction.]
State: PC=20, Mem[20]=FF30, R[3]=0028, R[4]=0064
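The slide gives only the starting state for this example. The Python sketch below works out the result by applying the reference-card rule for opcode F (R[d] ← pc; pc ← addr) after a normal fetch. The outcome shown is therefore derived, not copied from the slides, and the name `run_jal_example` is hypothetical.

```python
def run_jal_example():
    """Fetch and execute the word FF30 at address 20 (hexadecimal)."""
    pc = 0x20
    mem = [0] * 256
    mem[0x20] = 0xFF30
    R = [0] * 16
    R[3], R[4] = 0x0028, 0x0064

    # Fetch and clock: load the IR, increment the pc.
    ir = mem[pc]
    pc = (pc + 1) & 0xFF

    # Decode as Format 2: opcode, d, addr.
    opcode = (ir >> 12) & 0xF
    d = (ir >> 8) & 0xF
    addr = ir & 0xFF
    if opcode != 0xF:
        raise ValueError("expected a jump and link instruction")

    # Execute: the pc is read (as the link value) before it is overwritten.
    link = pc
    R[d] = link
    pc = addr
    return pc, R


pc, R = run_jal_example()
print(hex(pc), hex(R[0xF]))
```

This is the one instruction that both reads and writes the program counter during execute, which is why the old value has to be captured before the new one is written.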
Fetch
[Figure: TOY architecture diagram for the fetch epoch.]
Fetch and clock
[Figure: TOY architecture diagram for the fetch and clock epoch.]
Execute
[Figure: TOY architecture diagram for the execute epoch.]
Control
Two approaches to implement control
• Micro-programming
– Use a memory (ROM) for micro-code
– More flexible
– Easier to program
• Hard-wired
– Use logic gates and wires
– More efficient
[Figure: Control block with inputs Opcode, Execute, Fetch, Clock, =0, and >0, and 17 control signals as outputs; implemented as a 512x17 ROM with a 9-bit address.]
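The ROM size follows from the control inputs: a 4-bit opcode plus the five single-bit lines Execute, Fetch, Clock, =0, and >0 give 9 address bits, so 2^9 = 512 words of 17 control bits each. The Python sketch below packs those inputs into an address. The bit ordering is an assumption, the ROM contents are not given in the slides (zeros stand in as placeholders), and the names `rom_address` and `control_signals` are hypothetical.

```python
ADDRESS_BITS = 4 + 5          # opcode (4) + execute, fetch, clock, =0, >0
ROM_WORDS = 1 << ADDRESS_BITS
CONTROL_BITS = 17

# Placeholder contents: the real micro-code is not given in the slides.
rom = [0] * ROM_WORDS


def rom_address(opcode, execute, fetch, clock, is_zero, is_positive):
    """Pack the control inputs into a 9-bit ROM address.

    The ordering of the fields within the address is an assumption.
    """
    return ((opcode & 0xF) << 5
            | (execute & 1) << 4
            | (fetch & 1) << 3
            | (clock & 1) << 2
            | (is_zero & 1) << 1
            | (is_positive & 1))


def control_signals(opcode, execute, fetch, clock, is_zero, is_positive):
    """Look up the 17-bit control word for the current inputs."""
    word = rom[rom_address(opcode, execute, fetch, clock, is_zero, is_positive)]
    return word & ((1 << CONTROL_BITS) - 1)


print(ROM_WORDS, rom_address(0xF, 1, 1, 1, 1, 1))
```

A micro-programmed control unit is then just this lookup; changing the instruction set means rewriting ROM contents rather than rewiring gates.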
Execute and clock (write-back)
[Figure: TOY architecture diagram for the execute and clock epoch.]
Writing registers and memory
[Figure: Memory (ports W, W Data, Addr, R Data) and Registers (ports W, W Data, W Addr, A Addr, B Addr, A Data, B Data).]
More examples
[Figure: datapath diagram for working through further examples.]
Control
Control. Circuit that determines control line sequencing.
[Figure: control circuit, with labels: opcode from IR; external clock just ticks; control lines to ALU; control lines to processor registers; data bus from ALU; data bus to memory input.]
History + Future
Computer constructed by layering abstractions.
• Better implementation at low levels improves everything.
• Ongoing search for better abstract switch!
History.
• 1820s: mechanical switches.
• 1940s: relays, vacuum tubes.
• 1950s: transistor, core memory.
• 1960s: integrated circuit.
• 1970s: microprocessor.
• 1980s: VLSI.
• 1990s: integrated systems.
• 2000s: web computer.
• Future: quantum, optical soliton, …
Ray Kurzweil (http://en.wikipedia.org/wiki/Image:PPTMooresLawai.jpg)