ECEG-3202: Computer Systems Design & Organization, Dept of ECE, AAU
Central Processing Unit
Sample Realistic Designs
Major Components of the CPU
• Every CPU consists of the three basic components shown in the figure below.
  – Registers hold the inputs of the ALU operations and eventually receive the results.
  – The control unit controls the operation of both the ALU and the registers through the control signals.
  – The ALU performs the actual operations.
[Figure: the Control Unit, Registers, and ALU blocks of the CPU]
Realistic Organization
• With a large number of registers, dedicated connections are impossible.
  – Some form of BUS mechanism has to be used to organize the connections.
• Most ALU operations require two pieces of data.
  – We can send them to temporary ALU registers one at a time.
  – Better yet, we can utilize a two bus system.
Register File Control
• Once the results are ready, they have to be sent to the proper register for storage.
  – The instruction must specify the register and the control unit must enable the right control input.
  – Registers are usually given a code to reduce the size of the instructions.
    • This code can be decoded to create the needed load control inputs for the registers.
Controlling the ALU
• The ALU is a multi-function unit.
  – The control unit needs to specify, through control signals, which operation is to be performed.
• Putting all of the above together, each microoperation needs the following information:
  – Select inputs for BUS A.
  – Select inputs for BUS B.
  – Destination register code.
  – ALU operation code.
Register Organization
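[Figure: registers R1 to R7 and the Input Data line feed MUX 1 (selected by SELA, driving Bus A) and MUX 2 (selected by SELB, driving Bus B). Both buses enter the Arithmetic Logic Unit (ALU), whose operation is selected by OPR. SELD drives a 3 × 8 decoder that generates the register load controls.]
Control Word
Field   SELA  SELB  SELD  OPER
Bits    3     3     3     5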
Example of a Microoperation
• The control word described above has 14 bits. Its fields specify the different parts of the microoperation to be performed.
• For example, the microoperation:
  R1 ← R2 – R3
has the following control word:
  Field    SELA  SELB  SELD  OPER
  Symbol   R2    R3    R1    SUB
  Code     010   011   001   00101
The control word is
  010 011 001 00101
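The packing of the four fields into one 14-bit word can be sketched in a few lines of Python. This is only an illustration: the helper names are made up here, the register codes are assumed to be the register number in binary (which matches R1 = 001, R2 = 010, R3 = 011 above), and SUB is the only operation code the slides give.

```python
# Field widths from the control word layout: SELA, SELB, SELD = 3 bits, OPER = 5 bits.
OPER_CODES = {"SUB": 0b00101}  # only the code shown on the slide; others are not given


def register_code(name: str) -> int:
    """Return the 3-bit select code for R1..R7 (assumed: the register number)."""
    if not (name.startswith("R") and name[1:].isdigit()):
        raise ValueError(f"not a register name: {name}")
    index = int(name[1:])
    if not 1 <= index <= 7:
        raise ValueError(f"register out of range: {name}")
    return index


def control_word(sela: str, selb: str, seld: str, oper: str) -> int:
    """Pack SELA, SELB, SELD and OPER into a single 14-bit integer."""
    word = register_code(sela)
    word = (word << 3) | register_code(selb)
    word = (word << 3) | register_code(seld)
    word = (word << 5) | OPER_CODES[oper]
    return word


def format_control_word(word: int) -> str:
    """Show the word as four space-separated bit fields."""
    bits = format(word, "014b")
    return " ".join([bits[0:3], bits[3:6], bits[6:9], bits[9:14]])


# R1 <- R2 - R3: Bus A carries R2, Bus B carries R3, the result is loaded into R1.
print(format_control_word(control_word("R2", "R3", "R1", "SUB")))
```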
Stack Organization
• A useful feature that is included in almost every CPU.
• The stack is a storage device that stores information in a Last In First Out (LIFO) manner.
• The stack in digital computers is essentially a memory unit with a dedicated address register – the Stack Pointer – that continuously points to the uppermost item in the stack.
• Items are added to the stack using a PUSH operation and removed from it using a POP.
  – These operations are simulated through incrementing and decrementing the stack pointer register.
Register Stack
• If the microprocessor has enough registers, it is actually possible to implement the stack operation using registers.
• The stack pointer register in this situation would contain the index of the register containing the item at the top of the stack.
• With a register stack there would also be a need for a couple of flag registers to determine when the stack is completely full or completely empty.
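A register stack with a stack pointer and the two flags might be modelled as below. This is a sketch under assumptions that the slides do not spell out: the stack has 8 registers, the pointer is incremented before a push and decremented after a pop, and it wraps around modulo the stack size. The class and attribute names are illustrative.

```python
class RegisterStack:
    """A small LIFO stack built from a fixed set of registers."""

    def __init__(self, size: int = 8):
        self.registers = [0] * size
        self.size = size
        self.sp = 0          # index of the register holding the top item
        self.full = False    # flag: no free register left
        self.empty = True    # flag: nothing to pop

    def push(self, value: int) -> None:
        if self.full:
            raise OverflowError("stack is full")
        self.sp = (self.sp + 1) % self.size   # increment, wrapping around
        self.registers[self.sp] = value
        self.full = self.sp == 0              # wrapped back to 0: every register used
        self.empty = False

    def pop(self) -> int:
        if self.empty:
            raise IndexError("stack is empty")
        value = self.registers[self.sp]
        self.sp = (self.sp - 1) % self.size   # decrement, wrapping around
        self.empty = self.sp == 0             # back at the start: nothing left
        self.full = False
        return value


stack = RegisterStack(size=4)
for item in (10, 20, 30):
    stack.push(item)
print(stack.pop(), stack.pop(), stack.pop())  # items come back in reverse order
```

The flags are needed because the pointer value alone is ambiguous: with wrap-around, SP = 0 occurs both when the stack is empty and when it is completely full.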
Stack Based CPUs
• There are several CPUs that were designed without general purpose registers.
  – Instead, these CPUs had a fast memory stack that could be used in place of the registers.
  – In order to use such a system effectively, mathematical expressions have to be re-written in a slightly different manner.
Infix Notation
• Common arithmetic expressions are written with the operator between its operands.
  – This causes a problem for programmers.
  – Consider the following expression:
    A * B + C * D
  – The program must:
    • Read the entire expression.
    • Extract all of the operands.
    • Extract all of the operations.
    • Decide which operations to do first.
Prefix Notation
• It is possible to re-write arithmetic expressions so that the operation is specified before its operands.
  – This way, there is no need to parse the entire expression.
    • Read an operation, scan forward until its two operands are obtained, execute it, continue.
  – The previous expression can be re-written as:
    + * A B * C D
    • We read the + operator first.
    • Scanning forward, we find an * operation. Therefore, the first operand of the + is the result of this *. Perform the * operation.
    • We find another * operation. Perform it.
    • Now we have the two operands for the +, so perform it.
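The scan-forward procedure can be sketched as a short recursive routine: each operator asks for its two operands, and an operand may itself be another operator's result. The function name and the sample values for A, B, C and D are assumptions made for this illustration only.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def eval_prefix(tokens, values):
    """Evaluate a list of prefix tokens; `values` maps variable names to numbers."""

    def evaluate(pos):
        # Returns (value, position of the next unread token).
        token = tokens[pos]
        if token in OPS:
            left, pos = evaluate(pos + 1)   # first operand follows the operator
            right, pos = evaluate(pos)      # second operand follows the first
            return OPS[token](left, right), pos
        value = values[token] if token in values else float(token)
        return value, pos + 1

    result, end = evaluate(0)
    if end != len(tokens):
        raise ValueError("unused tokens at the end of the expression")
    return result


# A * B + C * D written in prefix form, with made-up operand values.
sample = {"A": 2, "B": 3, "C": 4, "D": 5}
print(eval_prefix("+ * A B * C D".split(), sample))
```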
Reverse Polish Notation
• The previous notation is known either as prefix notation or Polish Notation, since it was defined by a Polish mathematician.
• A more popular notation is actually the reverse of this one – Reverse Polish Notation (RPN), also called postfix notation.
  – The operands are specified first, then the operator.
  – This notation is extremely popular with stack based CPUs.
    • Like the CPUs in HP’s scientific calculators.
RPN
• Our sample expression can be written as:
A B * C D * +
  – It can be evaluated as follows:
    • Scan from the left; as soon as an operation is found, perform it on the two operands immediately to its left.
    • Replace the operation and its two operands with the result.
    • Continue forward.
RPN Evaluation Example
• Evaluate the following expression:
  1 + 2 * 3 + 4 * 5 * 6
  – First, re-write it in RPN:
    4 5 * 6 * 2 3 * + 1 +
    • We find “4 5 *” first. Evaluate that. 20. The expression now becomes:
      20 6 * 2 3 * + 1 +
    • Then we find “20 6 *”. Evaluate it. 120.
      120 2 3 * + 1 +
    • Now we find “2 3 *”. Evaluate it. 6.
      120 6 + 1 +
    • Now we find “120 6 +”. Evaluate it. 126.
      126 1 +
    • Evaluate the last expression “126 1 +”.
    • The result is 127.
Conversion to RPN
• We must follow the hierarchy of operations:
  – Perform all operations inside inner parentheses first, then outer ones.
  – Perform multiplication and division before addition and subtraction.
• Example
  – Translate the following expression to RPN:
    (A + B) * [C * (D + E) + F]
  – The result is:
    A B + D E + C * F + *
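One mechanical way to apply these rules is an operator-stack conversion (the shunting-yard method), sketched below. It is not necessarily the procedure used to produce the slide's answer: it keeps operands in their original left-to-right order, so for the example it yields A B + C D E + * F + *, which differs from the hand-converted form only in the order of the operands of one multiplication. The function name and tokenising rule are assumptions.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
OPENERS = {"(", "["}
CLOSERS = {")": "(", "]": "["}


def to_rpn(expression: str) -> str:
    """Convert an infix expression with + - * / and ( ) [ ] to RPN."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()\[\]]", expression)
    output, pending = [], []          # pending holds operators and open brackets
    for token in tokens:
        if token in PRECEDENCE:
            # Emit waiting operators of equal or higher precedence first.
            while (pending and pending[-1] in PRECEDENCE
                   and PRECEDENCE[pending[-1]] >= PRECEDENCE[token]):
                output.append(pending.pop())
            pending.append(token)
        elif token in OPENERS:
            pending.append(token)
        elif token in CLOSERS:
            # Finish everything inside the bracket pair being closed.
            while pending and pending[-1] not in OPENERS:
                output.append(pending.pop())
            if not pending or pending.pop() != CLOSERS[token]:
                raise ValueError("unbalanced brackets")
        else:
            output.append(token)      # operands go straight to the output
    while pending:
        top = pending.pop()
        if top in OPENERS:
            raise ValueError("unbalanced brackets")
        output.append(top)
    return " ".join(output)


print(to_rpn("(A + B) * [C * (D + E) + F]"))
print(to_rpn("A * B + C * D"))
```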
Evaluating RPN Expressions with a Stack
• Push all operands on the stack until the first operation.
• Pop the first two elements off the stack and perform the operation.
• Push the result back on the stack.
• Continue.
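The four steps above map almost directly onto code. The sketch below assumes space-separated tokens and integer operands; the function name is illustrative.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def eval_rpn(expression: str):
    """Evaluate a space-separated RPN expression using a stack."""
    stack = []
    for token in expression.split():
        if token in OPS:
            right = stack.pop()              # the top item is the second operand
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(int(token))         # operands are pushed as they arrive
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


print(eval_rpn("3 4 * 5 6 * +"))
```

Popping the right-hand operand first matters for subtraction and division, where the operand order changes the result.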
Example
• Evaluate the following expression using a stack:
  3 4 * 5 6 * +
  • Push the 3 on the stack.
  • Push the 4 on the stack.
  • Pop the 4 and the 3, perform the * operation.
  • Push the result (12) on the stack.
  • Push 5 on the stack.
  • Push 6 on the stack.
  • Pop the 6 and the 5, perform the * operation.
  • Push the result (30) on the stack.
  • Pop the 30 and the 12 off the stack, perform the + operation.
  • Push the result (42) on the stack.
Example 2
• Evaluate the following expression using a stack:
  1 + 2 * 3 + 4 * 5 * 6
  – First, re-write it in RPN:
    4 5 * 6 * 2 3 * + 1 +
  – Stack contents after each token is processed (top of stack on the first row):
    Token    4    5    *    6    *    2    3    *    +    1    +
    Top      4    5    20   6    120  2    3    6    126  1    127
                  4         20        120  2    120       126
                                           120
Section 8.8
Reduced Instruction Set Computer (RISC)
Instruction Set vs. Architecture
• The design of the instruction set is an important aspect of computer architecture.
  – The instruction set chosen determines the way machine language programs are constructed.
• Early computers had small and simple instruction sets.
  – Due mainly to the need to reduce the hardware needed to implement them.
CISC - Complex Instruction Set Computers
• With the invention of complex ICs, hardware complexity became a non-issue. This led to the development of some highly complex architectures.
  – Architectures with instruction sets that contained more than 100 instructions became widespread.
  – The trend was to move operations from software to hardware.
    • Machine instructions like COS, SIN and TAN started to appear.
    • Some processors even had machine instructions for matrix operations.
RISC - Reduced Instruction Set Computers
• Complex instruction sets had a large number of complex instructions.
  – The complex instructions required a long time to execute.
  – The instructions required a lot of memory accesses.
  – Some of the instructions were so specialized that they were used quite infrequently.
• In the early 1980s, designers tried to balance that by moving towards simpler instruction sets.
CISC Characteristics
• Designers wanted to simplify the process of compilation.
  – Rather than translate a high level language instruction into many machine language instructions, why not design machine language instructions that implement it directly?
  – The result: complex machine language instructions.
CISC Characteristics
• In order to be efficient with memory use, variable length instructions were used.
  – Register based instructions were short, 1-2 bytes, while memory based instructions were long, up to 5 bytes.
  – Packing such variable length instructions into a fixed-length memory requires some very special decoding circuits.
CISC Characteristics
• Instructions in typical CISC processors provide for direct manipulation of operands in memory.
  – This requires multiple memory references during the execution of the instruction.
  – The reason for including these instructions is to simplify the compilation of high-level language programs.
    • Remember, most variables in a high-level language program are implemented as memory locations.
  – As more instructions and addressing modes are added to a processor, more logic is needed to support them.
    • Ultimately, this leads to lower performance.
CISC Characteristics in Summary
• A large number of instructions.
  – Typically 100 – 200 instructions.
• Some instructions that perform specialized tasks and are used infrequently.
• A large variety of addressing modes.
• Variable length instruction formats.
• Instructions that manipulate operands in memory directly.
RISC Characteristics
• RISC tries to reduce execution time by simplifying the instruction set of the computer.
• The basic characteristics of RISC processors are:
  – Relatively few instructions.
  – Relatively few addressing modes.
  – Memory access limited to load and store instructions.
  – All operations are done within the registers of the CPU.
  – Fixed-length, easily decoded instruction formats.
  – Single cycle instruction execution.
  – Hardwired rather than micro-programmed control units.
RISC Characteristics
• Mostly register to register operations.
  – Only simple load and store for memory access.
    • Operands are read into registers using a load instruction.
    • The operation is done between the registers.
    • Results are stored in memory using an explicit store instruction.
  – This simplifies the instruction set and forces the optimization of register usage.
    • This also removes the need for many complex addressing modes.
RISC Characteristics
• Simple instruction formats.
  – The instruction length is fixed.
  – Instructions are aligned to memory words.
• Easy to decode instruction formats.
  – This simplifies the control logic.
• Hard-wired control is used to speed up the generation of control signals.
Single Cycle Instruction Execution
• A main feature of RISC processors is their ability to complete the execution of an instruction every clock cycle.
  – This is done by overlapping the fetch, decode and execute cycles of two or three instructions by using pipelining.
• Most CISC processors today also depend on this important feature for speeding up their performance.
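The effect of overlapping can be seen with a small cycle count. The sketch below assumes an ideal three-stage pipeline (fetch, decode, execute) in which every stage takes one clock cycle and nothing ever stalls; real pipelines lose cycles to branches and data dependencies. The function names are illustrative.

```python
STAGES = ("fetch", "decode", "execute")   # stage names taken from the slide


def pipelined_cycles(instructions: int, stages: int = len(STAGES)) -> int:
    """Cycles needed when a new instruction enters the pipeline every cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1      # fill the pipe once, then one result per cycle


def unpipelined_cycles(instructions: int, stages: int = len(STAGES)) -> int:
    """Cycles needed when each instruction finishes before the next one starts."""
    return stages * instructions


def timeline(instructions: int):
    """One row per instruction; column c names the stage active in cycle c + 1."""
    total = pipelined_cycles(instructions)
    rows = []
    for i in range(instructions):
        row = ["-"] * total
        for s, name in enumerate(STAGES):
            row[i + s] = name             # instruction i enters stage s in cycle i + s + 1
        rows.append(row)
    return rows


for row in timeline(3):
    print(" ".join(f"{cell:8}" for cell in row))
print(pipelined_cycles(100), "cycles pipelined vs", unpipelined_cycles(100), "without overlap")
```

Each individual instruction still takes three cycles from fetch to completion; the "single cycle" figure refers to the rate at which instructions finish once the pipeline is full.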
Additional Features of RISC Processors
• A relatively large number of registers.
  – This would be useful for storing intermediate results.
• Use of overlapped register windows.
  – This helps speed up procedure calls.
• Compiler support for efficient translation of high-level language programs to make use of these features.
Over-lapped Register Windows
• A function call in a high-level language program requires many operations to implement:
  – Register values in the calling program must be saved.
  – Parameters must be placed into appropriate registers for the subroutine.
  – The subroutine is called.
• On the return path, a similar set of operations is needed.
  – The subroutine has to save its return values in the appropriate registers.
  – Control is returned to the calling program.
  – Register values in the calling program are restored.
• All of these are very time consuming.
Over-lapped Register Windows
• Given that function calls occur very often in high-level language programs, some way of speeding up this process has to be found.
• Some processors use a separate register bank for each procedure.
  – No need to save and restore the calling procedure’s registers.
• RISC processors do a similar thing, but the register groups are not dedicated to particular procedures.
Over-lapped Register Windows
• Each procedure is allocated a group of registers.
• When a function call is being executed, a set of registers is automatically assigned to the new procedure.
  – Therefore, there is no need to save and restore the calling procedure’s registers.
• This new set of registers overlaps by a certain amount with the registers of the calling procedure.
  – This overlap is used for passing parameters.
• When a function terminates, the registers allocated to it are freed for later use by a different procedure.
Over-lapped Register Window Example
• Our CPU has a total of 74 registers.
  – When a program starts, 10 registers are allocated for global data.
  – The main program is allocated 10 registers for its local data.
  – The main program calls function A.
    • 6 registers are allocated for passing data back and forth between main and A.
    • 10 registers are allocated to A for local data.
  – A calls B.
    • 6 registers are allocated common to A and B.
    • 10 registers are allocated to B for local data.
• Each procedure can access a total of 32 registers.
  Registers    Allocation
  R0 – R9      Global registers
  R10 – R19    Main local registers
  R20 – R25    Shared by Main and A
  R26 – R35    A local registers
  R36 – R41    Shared by A and B
  R42 – R51    B local registers
  R52 – R73    Free registers
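The register ranges in this example follow a regular pattern, which the sketch below reproduces. It assumes the layout shown here: the global registers come first, and each deeper call takes the next block of local registers followed by the block it will share with its own callee. The function name and the dictionary keys are made up for the illustration, and running out of free registers is not handled.

```python
G, L, C = 10, 10, 6   # global, local and common register counts from the example


def allocation(depth: int) -> dict:
    """Physical register ranges visible at call depth `depth` (main program = 0)."""
    local_start = G + depth * (L + C)
    ranges = {
        "global": (0, G - 1),
        "local": (local_start, local_start + L - 1),
        "shared_with_callee": (local_start + L, local_start + L + C - 1),
    }
    if depth > 0:
        # The block just below the locals is the caller's "shared with callee" block.
        ranges["shared_with_caller"] = (local_start - C, local_start - 1)
    return ranges


for depth, name in enumerate(["Main", "A", "B"]):
    print(name, allocation(depth))
```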
Effects on Programming
• High level languages – no effect.
  – All of this is done by the compiler.
• Assembly language – registers no longer have a set name.
  – If you write an instruction of the form:
    ADD R1, R2
    there is no guarantee that you will actually use R1 and R2 of the processor.
    • The above instruction means: use the first register and the second register in my window.
Over-lapped Window Parameters
• The number of registers allocated for each type is a parameter of the processor design.
  – G: the number of global registers.
  – L: the number of local registers.
  – C: the number of common registers.
  All of these depend on the design of the CPU.
• In some processor designs, these parameters are decided dynamically.
  – Depending on the total number of procedures and the total number of registers, a procedure’s window size may change.
    • The operating system now has to be very intelligent.
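The parameters G, L and C determine how many registers a procedure sees and how large the register file must be. The sketch below assumes each procedure sees the globals, its own locals, and one common block on each side, and that the windows are arranged in a ring so every common block is shared by two neighbouring windows. The window count (called `windows` here) is an extra parameter introduced for the illustration.

```python
def window_size(g: int, l: int, c: int) -> int:
    """Registers visible to one procedure: globals, locals and two common blocks."""
    return g + l + 2 * c


def register_file_size(g: int, l: int, c: int, windows: int) -> int:
    """Total physical registers when `windows` overlapping windows form a ring."""
    return g + windows * (l + c)   # each window contributes its locals plus one common block


# Values from the 74-register example: G = 10, L = 10, C = 6.
print(window_size(10, 10, 6))
print(register_file_size(10, 10, 6, windows=4))
```

With four windows the file size comes out to the 74 registers of the example, and the window size to the 32 registers each procedure can access.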
Berkeley RISC I – Example RISC CPU
• 32-bit processor.
  – 32-bit address.
  – 8, 16, or 32-bit data.
  – 32-bit instruction format.
• 31 instructions.
  – 12 Data Manipulation.
  – 11 Data Transfer.
  – 8 Program Control.
• 3 addressing modes:
  – Register.
  – Immediate.
  – Relative to PC.
• 138 registers.
  – 10 Global registers.
  – 8 windows of 32 registers each (8 × 16 non-shared registers + 10 global = 138).
Berkeley RISC I
• Since only 32 registers are accessible at any point in time, only 5 bits are needed for register selection.
• Instructions utilize a three address format.
  – Destination Register.
  – Source Register.
  – Second Source Register or Immediate Data.
• Register R0 is a constant 0 all the time.
  – It can be used to fool the processor into performing additional addressing modes.
Instruction Formats
Register Mode (S2 is a register)
  Field    Opcode  Rd  Rs  0  Not Used  S2
  Bits     8       5   5   1  8         5

Register-Immediate Mode (S2 is immediate data)
  Field    Opcode  Rd  Rs  1  S2
  Bits     8       5   5   1  13

PC Relative Mode (Y is a relative displacement)
  Field    Opcode  Cond  Y
  Bits     8       5     19
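The three 32-bit formats can be packed and unpacked with shifts and masks, as sketched below. Two assumptions: the fields are laid out most-significant first in the order opcode, Rd, Rs, mode bit, S2 (or opcode, Cond, Y), and immediates are treated as unsigned. The slides do not list numeric opcodes, so the opcode value in the usage line is a placeholder.

```python
def _field(value: int, bits: int, name: str) -> int:
    """Check that `value` fits in an unsigned field of `bits` bits."""
    if not 0 <= value < (1 << bits):
        raise ValueError(f"{name} does not fit in {bits} bits")
    return value


def encode_register_mode(opcode: int, rd: int, rs: int, s2: int) -> int:
    # Opcode(8) | Rd(5) | Rs(5) | 0 | not used(8) | S2 register(5)
    return (_field(opcode, 8, "opcode") << 24 | _field(rd, 5, "rd") << 19
            | _field(rs, 5, "rs") << 14 | _field(s2, 5, "s2"))


def encode_immediate_mode(opcode: int, rd: int, rs: int, imm: int) -> int:
    # Opcode(8) | Rd(5) | Rs(5) | 1 | S2 immediate(13)
    return (_field(opcode, 8, "opcode") << 24 | _field(rd, 5, "rd") << 19
            | _field(rs, 5, "rs") << 14 | 1 << 13 | _field(imm, 13, "imm"))


def encode_pc_relative(opcode: int, cond: int, y: int) -> int:
    # Opcode(8) | Cond(5) | Y(19)
    return (_field(opcode, 8, "opcode") << 24 | _field(cond, 5, "cond") << 19
            | _field(y, 19, "y"))


def decode_register_format(word: int) -> dict:
    """Unpack a word in either of the two register formats (not PC relative)."""
    fields = {
        "opcode": word >> 24,
        "rd": (word >> 19) & 0x1F,
        "rs": (word >> 14) & 0x1F,
        "immediate": bool((word >> 13) & 1),
    }
    fields["s2"] = word & 0x1FFF if fields["immediate"] else word & 0x1F
    return fields


PLACEHOLDER_OPCODE = 0x01   # hypothetical value, not taken from the slides
word = encode_immediate_mode(PLACEHOLDER_OPCODE, rd=22, rs=0, imm=150)
print(format(word, "032b"), decode_register_format(word))
```

The mode bit in position 13 is what tells the decoder whether the low bits name a register or hold immediate data, which is why a single fixed-length format can serve both cases.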
Data Transfer Operations
Opcode   Operands      Register Transfer    Description
LDL      (Rs)S2, Rd    Rd ← M[Rs + S2]      Load Long
LDSU     (Rs)S2, Rd    Rd ← M[Rs + S2]      Load Short Unsigned
LDSS     (Rs)S2, Rd    Rd ← M[Rs + S2]      Load Short Signed
LDBU     (Rs)S2, Rd    Rd ← M[Rs + S2]      Load Byte Unsigned
LDBS     (Rs)S2, Rd    Rd ← M[Rs + S2]      Load Byte Signed
LDHI     Rd, Y         Rd ← Y               Load Immediate High
STL      Rd, (Rs)S2    M[Rs + S2] ← Rd      Store Long
STS      Rd, (Rs)S2    M[Rs + S2] ← Rd      Store Short
STB      Rd, (Rs)S2    M[Rs + S2] ← Rd      Store Byte
GETPSW   Rd            Rd ← PSW             Load Status Word
PUTPSW   Rd            PSW ← Rd             Set Status Word
Data Manipulation Operations
Opcode   Operands      Register Transfer        Description
ADD      Rs, S2, Rd    Rd ← Rs + S2             Integer Add
ADDC     Rs, S2, Rd    Rd ← Rs + S2 + Carry     Add with Carry
SUB      Rs, S2, Rd    Rd ← Rs – S2             Integer Subtract
SUBC     Rs, S2, Rd    Rd ← Rs – S2 – Carry     Subtract with Carry
SUBR     Rs, S2, Rd    Rd ← S2 – Rs             Reverse Subtract
SUBCR    Rs, S2, Rd    Rd ← S2 – Rs – Carry     Reverse Subtract with Carry
AND      Rs, S2, Rd    Rd ← Rs ∧ S2             AND
OR       Rs, S2, Rd    Rd ← Rs ∨ S2             OR
XOR      Rs, S2, Rd    Rd ← Rs ⊕ S2             XOR
SLL      Rs, S2, Rd    Rd ← Rs shifted by S2    Shift left
SRL      Rs, S2, Rd    Rd ← Rs shifted by S2    Shift right logical
SRA      Rs, S2, Rd    Rd ← Rs shifted by S2    Shift right arithmetic
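A few of these register transfers can be tried out with a small model of the register file, sketched below. The assumptions: registers are 32 bits wide with two's complement wrap-around, S2 is passed in as an already resolved value (register contents or immediate), writes to R0 are discarded so that it always reads as 0, and the carry-based instructions are left out. The function and table names are illustrative.

```python
MASK = 0xFFFFFFFF   # keep every result within 32 bits


def to_signed(value: int) -> int:
    """Interpret a 32-bit pattern as a two's complement number."""
    return value - (1 << 32) if value & 0x80000000 else value


OPERATIONS = {
    "ADD": lambda rs, s2: rs + s2,
    "SUB": lambda rs, s2: rs - s2,
    "SUBR": lambda rs, s2: s2 - rs,
    "AND": lambda rs, s2: rs & s2,
    "OR": lambda rs, s2: rs | s2,
    "XOR": lambda rs, s2: rs ^ s2,
    "SLL": lambda rs, s2: rs << s2,
    "SRL": lambda rs, s2: rs >> s2,              # zeros enter from the left
    "SRA": lambda rs, s2: to_signed(rs) >> s2,   # the sign bit is replicated
}


def execute(registers: list, op: str, rs: int, s2: int, rd: int) -> None:
    """Perform registers[rd] <- registers[rs] op s2."""
    result = OPERATIONS[op](registers[rs], s2) & MASK
    if rd != 0:          # R0 is a constant 0, so a write to it has no effect
        registers[rd] = result


regs = [0] * 32
execute(regs, "ADD", 0, 150, 22)   # R22 <- 0 + 150: R0 turns ADD into "load immediate"
execute(regs, "SUB", 22, 200, 5)   # R5 <- 150 - 200, stored as a 32-bit pattern
execute(regs, "SRA", 5, 1, 6)      # arithmetic shift keeps the sign
execute(regs, "SRL", 5, 1, 7)      # logical shift does not
print(regs[22], hex(regs[5]), hex(regs[6]), hex(regs[7]))
```

The first call shows the R0 trick in action: adding an immediate to the constant 0 behaves like an immediate load even though the instruction set has no separate instruction for it.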