
1 The Processor: Datapath and Control We will design a microprocessor that includes a subset of the...

Date post: 22-Dec-2015

The Processor: Datapath and Control

• We will design a microprocessor that implements a subset of the MIPS instruction set:
– Memory access: load word and store word (lw, sw).
– Arithmetic-logical (AL) instructions: add, sub, and, or, and slt.
– Branch instructions: beq and jump (j).

• The subset omits some of the integer instructions and all of the floating-point (fp) instructions, but the principle is the same.

• For every instruction the first two steps are identical:
– Fetch the instruction from the memory address that the PC points to.
– Decode the instruction and read the registers or memory contents specified.

Abstract View of the Datapath

• The datapath contains 2 types of logic elements:
– Combinational: elements that operate on data values. Their outputs depend only on their current inputs. The ALU is a combinational element.
– State: elements with internal storage. Their state is defined by the values they contain (memory and registers).

[Figure: abstract view of the datapath, showing the PC, instruction memory, registers, ALU, and data memory.]

Clocking Methodology

• A state element has at least two inputs and one output. The inputs are the data value to be written into the element and the clock signal, which determines when the value is written. The output is the data value stored in the element. Thus a state element can be read at any time, but it is written only when the clock allows.

• A clocking methodology defines when signals can be read and written. This is crucial to the correct design of a computer.

• We will assume an edge-triggered clocking methodology: any values stored in the machine are updated only on a clock edge.

Edge-Triggered Clocking

• Because only state elements can store values, any collection of combinational logic must have its inputs coming from a set of state elements and its outputs written to a set of state elements. The time needed for the signals to travel from state element 1, through the combinational logic, to state element 2 defines the length of the clock cycle.

• An edge-triggered methodology allows us to read the contents of a register, send the value through some combinational logic, and write that register in the same clock cycle. We assume that state elements have implicit clock signals.

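[Figure: state element 1 feeds combinational logic, which feeds state element 2, all within one clock cycle. A second diagram shows a single state element whose output passes through combinational logic back to its own input.]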

Fetching an Instruction

• A memory unit holds the instructions that are to be executed. The address of the next instruction is in the PC. We need an adder (an ALU that performs only addition) to calculate the address of the next instruction to fetch.

• Thick arrows symbolize 32-bit buses unless specified differently. Thin arrows specify 1-bit lines; colored lines specify control lines.

[Figure: the fetch datapath. The PC supplies the read address to the instruction memory, and an adder computes PC + 4. The elements shown are: a. instruction memory, b. program counter, c. adder.]

The Register File

• The R-type instructions (also called the arithmetic-logical instructions) read the contents of 2 registers, perform an ALU operation, and write the result back into a third register.

• The 32 registers are stored in the register file. The register file has three 5-bit inputs to specify the registers, two 32-bit outputs for the data read, one 32-bit input for the data written, and one control signal (RegWrite) that decides whether the data is written. In addition we need an ALU to perform the operations.

[Figure: the register file (inputs Read register 1, Read register 2, Write register, Write data, and RegWrite; outputs Read data 1 and Read data 2) and the ALU (3-bit ALU operation control; outputs ALU result and Zero).]
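The register file described above can be modelled in a few lines of Python. This is only a sketch: the class name `RegisterFile` and the method names are illustrative, and the rule that register 0 always reads as zero is standard MIPS behaviour rather than something stated on the slide.

```python
class RegisterFile:
    """32 x 32-bit registers, two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, read_reg1, read_reg2):
        # Reads are combinational: the outputs follow the 5-bit register numbers.
        return self.regs[read_reg1 & 0x1F], self.regs[read_reg2 & 0x1F]

    def clock_edge(self, write_reg, write_data, reg_write):
        # The write happens only on a clock edge and only if RegWrite is asserted.
        # Register 0 is hardwired to zero in MIPS, so writes to it are ignored.
        if reg_write and (write_reg & 0x1F) != 0:
            self.regs[write_reg & 0x1F] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.clock_edge(write_reg=10, write_data=7, reg_write=True)     # $t2 = 7
rf.clock_edge(write_reg=11, write_data=5, reg_write=True)     # $t3 = 5
a, b = rf.read(10, 11)
rf.clock_edge(write_reg=9, write_data=a + b, reg_write=True)  # add $t1,$t2,$t3
```

The three register numbers map to the three 5-bit inputs, and `reg_write` plays the role of the RegWrite control line.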

Data Memory

• The 2 elements needed to implement load and store instructions are the data memory and a unit that sign-extends the 16-bit constant in an I-type instruction. In addition we use the existing ALU to compute the address to access.

• The data memory has two 32-bit inputs, the address and the write data, and one 32-bit output, the read data. In addition it has 2 control lines: MemWrite and MemRead.

[Figure: the load/store datapath. The register file and a 16-to-32-bit sign-extend unit feed the ALU, whose result is the address input of the data memory. Control lines: RegWrite, ALU operation (3 bits), MemRead, and MemWrite.]
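A minimal sketch of the two new elements, assuming a word-addressed Python dictionary stands in for the data memory. The names `sign_extend16`, `effective_address`, and `DataMemory` are illustrative and do not come from the slides.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(base, offset16):
    # The ALU adds the base register to the sign-extended 16-bit offset.
    return (base + sign_extend16(offset16)) & 0xFFFFFFFF


class DataMemory:
    def __init__(self):
        self.words = {}

    def access(self, address, write_data, mem_read, mem_write):
        # MemRead puts the addressed word on the read-data output;
        # MemWrite replaces the addressed word with the write-data input.
        read_data = self.words.get(address, 0) if mem_read else None
        if mem_write:
            self.words[address] = write_data & 0xFFFFFFFF
        return read_data


mem = DataMemory()
addr = effective_address(0x1000, 0xFFFC)                   # offset of -4
mem.access(addr, 42, mem_read=False, mem_write=True)       # like sw
loaded = mem.access(addr, 0, mem_read=True, mem_write=False)  # like lw
```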

Branch Equal

• The beq instruction has 3 operands: two registers that are compared for equality and a 16-bit offset used to compute the branch address relative to the PC. To implement this instruction we must add the sign-extended offset to the PC.

• There are 2 important details:
1. The base for the address calculation is the address of the instruction after the current one. Since we compute PC+4 when fetching, we already have this address.
2. The offset is in words, not bytes, so we have to shift the offset left by 2.

[Figure: the beq datapath. The ALU compares the two registers and sends Zero to the branch control logic; a separate adder sums PC + 4 (from the instruction datapath) with the sign-extended offset shifted left by 2 to produce the branch target.]
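The two details above combine into a single expression for the branch target. The following Python sketch is illustrative only; the function names are assumptions.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, offset16):
    # Detail 1: the base is PC + 4, already computed during fetch.
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Detail 2: the offset counts words, so shift it left by 2 to get bytes.
    return (pc_plus_4 + (sign_extend16(offset16) << 2)) & 0xFFFFFFFF


forward = branch_target(0x1000, 3)        # 3 instructions past PC + 4
backward = branch_target(0x1000, 0xFFFF)  # offset -1: back to the branch itself
```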

Combining ALU and Memory Instructions

• The ALU datapath (see "The Register File") and the memory datapath (see "Data Memory") are similar. The differences are:
– The second input to the ALU is a register (R-type) or the sign-extended offset (I-type).
– The value stored into the destination register comes from the ALU (R-type) or from memory (I-type).

• Using 2 multiplexors (MUXs) we can combine both datapaths.

[Figure: the combined R-type and memory datapath. One MUX, controlled by ALUSrc, selects the second ALU input; another, controlled by MemtoReg, selects the register write data. Other control lines: RegWrite, ALU operation (3 bits), MemRead, and MemWrite.]

The Complete Datapath

• This simple processor can execute ALU instructions, access memory, or compute the next instruction's address in a single cycle.

[Figure: the complete single-cycle datapath, combining instruction fetch (PC, instruction memory, PC + 4 adder), the register file, the sign-extend unit, the ALU, the data memory, and the branch-target adder. Control lines: RegWrite, ALUSrc, ALU operation (3 bits), MemRead, MemWrite, MemtoReg, and PCSrc.]

ALU Control

• The ALU has 3 control inputs. We use 5 of the 8 possible input combinations:
000 AND
001 OR
010 add
110 subtract
111 slt

• The ALU control uses as its inputs the funct field of the instruction and a 2-bit control field called ALUOp.

• For lw/sw the ALU computes the address using addition (ALUOp=00). For the R-type instructions the ALU performs one of 5 actions depending on the funct field of the instruction (ALUOp=10). For beq the ALU performs a subtraction (ALUOp=01).

• The ALU control is a large truth table that, given the funct field and ALUOp, outputs the 3-bit control for the ALU.
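The truth table can be sketched as a small lookup in Python. The 3-bit ALU codes and the ALUOp values come from the slide; the funct encodings (0x20 add, 0x22 sub, 0x24 and, 0x25 or, 0x2A slt) are the standard MIPS values and are not listed in the slides. All function names are illustrative.

```python
FUNCT_TO_ALU = {0x20: 0b010, 0x22: 0b110, 0x24: 0b000, 0x25: 0b001, 0x2A: 0b111}


def alu_control(alu_op, funct):
    if alu_op == 0b00:               # lw/sw: address calculation
        return 0b010                 # add
    if alu_op == 0b01:               # beq: compare by subtracting
        return 0b110                 # subtract
    if alu_op == 0b10:               # R-type: the funct field decides
        return FUNCT_TO_ALU[funct & 0x3F]
    raise ValueError("unused ALUOp value")


def to_signed(x):
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def alu(control, a, b):
    """Return (result, zero) for the five operations the datapath uses."""
    if control == 0b000:
        result = a & b
    elif control == 0b001:
        result = a | b
    elif control == 0b010:
        result = a + b
    elif control == 0b110:
        result = a - b
    elif control == 0b111:
        result = 1 if to_signed(a) < to_signed(b) else 0
    else:
        raise ValueError("unused ALU control code")
    result &= 0xFFFFFFFF
    return result, result == 0
```

For beq, `alu(alu_control(0b01, 0), a, b)` subtracts the two registers, and the Zero output signals equality.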

Main Control

• Look at the formats of the R-type and I-type instructions:

R-type:  Field  opcode  rs     rt     rd     shamt  funct
         Bits   31-26   25-21  20-16  15-11  10-6   5-0

I-type:  Field  opcode  rs     rt     address
         Bits   31-26   25-21  20-16  15-0

• The following observations can be made:
– The opcode is always in bits 31-26.
– The 2 registers to be read are always given by the rs (25-21) and rt (20-16) fields (R-type, beq, and store).
– The base register for load/store instructions is always rs (25-21).
– The 16-bit offset for beq, lw, and sw is always in bits 15-0.
– The destination register is in one of two places: for lw it is rt (20-16), for an R-type instruction it is rd (15-11). Thus we need a MUX to select which field of the instruction supplies the register to be written.
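The bit positions above translate directly into shifts and masks. A minimal Python sketch follows; the function names and the dictionary layout are illustrative, and the example word 0x014B4820 is the standard MIPS encoding of add $t1,$t2,$t3.

```python
def decode(instr):
    """Split a 32-bit instruction word into the fields listed above."""
    return {
        "opcode": (instr >> 26) & 0x3F,   # bits 31-26
        "rs": (instr >> 21) & 0x1F,       # bits 25-21
        "rt": (instr >> 16) & 0x1F,       # bits 20-16
        "rd": (instr >> 11) & 0x1F,       # bits 15-11 (R-type only)
        "shamt": (instr >> 6) & 0x1F,     # bits 10-6  (R-type only)
        "funct": instr & 0x3F,            # bits 5-0   (R-type only)
        "address": instr & 0xFFFF,        # bits 15-0  (I-type only)
    }


def write_register(fields, reg_dst):
    # The MUX from the last observation: rd when RegDst is 1, rt when it is 0.
    return fields["rd"] if reg_dst else fields["rt"]


fields = decode(0x014B4820)  # add $t1,$t2,$t3
```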

The Main Control Signals

• There are 7 control signals in our microprocessor. Let's see what happens when each one is deasserted (set to 0) and asserted (set to 1):
– RegDst: when deasserted, the Write register is rt; when asserted, the Write register is rd.
– RegWrite: when deasserted, nothing happens; when asserted, the Write register is written with the Write data.
– ALUSrc: when deasserted, the 2nd ALU operand comes from the register file; when asserted, the 2nd ALU operand is the sign-extended 16-bit address field.
– PCSrc: when deasserted, PC = PC + 4; when asserted, PC = branch target.
– MemRead: when deasserted, nothing happens; when asserted, the memory contents at the address input are put on the Read data output.
– MemWrite: when deasserted, nothing happens; when asserted, the memory contents at the address input are replaced by the Write data input.
– MemtoReg: when deasserted, the register Write data input comes from the ALU; when asserted, it comes from memory.


Main Control Diagram

[Figure: the single-cycle datapath with the main control unit. The Control block takes Instruction [31-26] and produces RegDst, Branch, MemRead, MemtoReg, ALUOp, MemWrite, ALUSrc, and RegWrite. The ALU control block takes Instruction [5-0] and ALUOp. Instruction [25-21] and [20-16] select the registers to read, a MUX chooses Instruction [20-16] or [15-11] as the write register, and Instruction [15-0] goes to the sign-extend unit. PCSrc selects between PC + 4 and the branch target.]

Opcode to Control

• The control lines are determined by the opcode of the instruction. The exception is the PCSrc line, which also depends on the Zero output of the ALU: it is asserted only when Branch is asserted and the beq comparison finds the registers equal. In the table, x means don't care.

Line      R-type  lw  sw  beq
RegDst    1       0   x   x
ALUSrc    0       1   1   0
MemtoReg  0       1   x   x
RegWrite  1       1   0   0
MemRead   0       1   0   0
MemWrite  0       0   1   0
Branch    0       0   0   1
ALUOp     10      00  00  01

• At this stage the Control is a black box, which receives inputs and gives outputs.
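One way to peek inside the black box is to write the table as a lookup keyed by opcode. The sketch below is illustrative: the signal values are copied from the table, `None` stands for a don't-care, and the opcode numbers (0x00 R-type, 0x23 lw, 0x2B sw, 0x04 beq) are the standard MIPS encodings, which the slides do not list.

```python
CONTROL = {
    0x00: dict(name="R-type", RegDst=1, ALUSrc=0, MemtoReg=0, RegWrite=1,
               MemRead=0, MemWrite=0, Branch=0, ALUOp=0b10),
    0x23: dict(name="lw", RegDst=0, ALUSrc=1, MemtoReg=1, RegWrite=1,
               MemRead=1, MemWrite=0, Branch=0, ALUOp=0b00),
    0x2B: dict(name="sw", RegDst=None, ALUSrc=1, MemtoReg=None, RegWrite=0,
               MemRead=0, MemWrite=1, Branch=0, ALUOp=0b00),
    0x04: dict(name="beq", RegDst=None, ALUSrc=0, MemtoReg=None, RegWrite=0,
               MemRead=0, MemWrite=0, Branch=1, ALUOp=0b01),
}


def main_control(opcode):
    """Return the control lines for a 6-bit opcode (bits 31-26)."""
    return CONTROL[opcode & 0x3F]


def pc_src(branch, zero):
    # PCSrc is the AND of the Branch line and the ALU's Zero output.
    return 1 if (branch and zero) else 0
```

For example, `pc_src(main_control(0x04)["Branch"], zero=True)` selects the branch target, while any non-branch instruction keeps PC + 4.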

Operation of the Datapath

• Let's see the stages of execution of an R-type instruction, add $t1,$t2,$t3:
1. The instruction is fetched from memory and the PC is incremented.
2. The two registers $t2 and $t3 are read from the register file.
3. The ALU operates on the data read from the register file.
4. The result of the ALU is written into the register $t1.

• This doesn't really happen in 4 steps, because the implementation is combinational, but at the end of the clock cycle the result is written into the destination register.

• Let's look at lw $t1,offset($t2):
1. The instruction is fetched from memory and the PC is incremented.
2. The register $t2 is read from the register file.
3. The ALU computes the sum of $t2 and the sign-extended offset.
4. The sum from the ALU is used as the address for the data memory.
5. The data from memory is written into register $t1.

Adding the Jump Instruction

• The j instruction uses pseudodirect addressing: the upper 4 bits of PC+4 are concatenated with the 26 bits of the address field in the J-type instruction, shifted left by 2.

[Figure: the single-cycle datapath extended for jump. Instruction [25-0] is shifted left by 2 (26 bits to 28 bits) and concatenated with PC+4 [31-28] to form Jump address [31-0]. An extra MUX, controlled by the new Jump signal, selects between the jump address and the output of the branch MUX.]
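The jump address formation can be written as one line of bit manipulation. A minimal Python sketch, with an illustrative function name:

```python
def jump_address(pc, instr):
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    target26 = instr & 0x03FFFFFF          # Instruction [25-0]
    # Upper 4 bits of PC + 4, then the 26-bit field shifted left by 2.
    return (pc_plus_4 & 0xF0000000) | (target26 << 2)


low = jump_address(0x00400000, 0x08100004)   # j with address field 0x100004
high = jump_address(0x90000000, 0x08000001)  # the upper 4 bits come from the PC
```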

Performance of Single-Cycle Machines

• Let's assume that the operation times of the following units are: memory 2 nanoseconds (ns), ALU and adders 2 ns, register file 1 ns. We will assume that MUXs, control, sign-extension, PC accesses, and wires have no delay.

• Which implementation is faster?
1. Every instruction operates in 1 clock cycle of fixed length.
2. Every instruction operates in a clock cycle of varying length.

• Let's look at the time needed by each instruction (all times in ns):

Inst.    Fetch  Reg. Rd  ALU op  Memory  Reg. Wr  Total
R-type   2      1        2       0       1        6
Load     2      1        2       2       1        8
Store    2      1        2       2       -        7
Branch   2      1        2       -       -        5
Jump     2      -        -       -       -        2

Fixed vs. Variable Cycle Length

• Let's assume a program has the following instruction mix: 24% loads, 12% stores, 44% R-type, 18% branches, 2% jumps.

• CPU execution time = Instruction count * Cycle time

• For the fixed cycle length the cycle time is 8 ns, long enough for the longest instruction (load). Thus each instruction takes 8 ns to execute.

• For the variable cycle time the average CPU clock cycle is:
8*24% + 7*12% + 6*44% + 5*18% + 2*2% ≈ 6.3 ns

• The variable-clock implementation is clearly faster, but it is extremely hard to implement.

• So why not use the fixed-length single-cycle implementation, which is only 8/6.3 ≈ 1.27 times slower?

• Because once we add instructions such as multiply and divide, which can take tens of cycles, the fixed cycle must stretch to fit the slowest instruction, and this scheme becomes too slow.
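The comparison can be reproduced from the unit delays and the instruction mix given above. This Python sketch is only a check of the arithmetic; the dictionary names are illustrative.

```python
UNIT_NS = {"fetch": 2, "reg_read": 1, "alu": 2, "memory": 2, "reg_write": 1}

# Units used by each instruction class, as in the timing table.
STEPS = {
    "r_type": ["fetch", "reg_read", "alu", "reg_write"],
    "load":   ["fetch", "reg_read", "alu", "memory", "reg_write"],
    "store":  ["fetch", "reg_read", "alu", "memory"],
    "branch": ["fetch", "reg_read", "alu"],
    "jump":   ["fetch"],
}

MIX = {"load": 0.24, "store": 0.12, "r_type": 0.44, "branch": 0.18, "jump": 0.02}

latency = {name: sum(UNIT_NS[u] for u in units) for name, units in STEPS.items()}

# Fixed cycle: every instruction pays for the slowest one.
fixed_cycle = max(latency.values())

# Variable cycle: each instruction pays only for the units it uses.
average_variable_cycle = sum(MIX[name] * latency[name] for name in MIX)

slowdown = fixed_cycle / average_variable_cycle
```

The unrounded average is 6.34 ns, so the ratio comes out slightly below the 1.27 obtained from the rounded 6.3 ns.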

A Multicycle Implementation

• We broke each instruction into several steps, and we can use these steps to build a multicycle implementation. Each step takes 1 cycle. The multicycle implementation allows a functional unit to be used more than once per instruction, as long as it is used on different clock cycles.

[Figure: abstract view of the multicycle datapath: the PC, a single memory holding instructions and data, the instruction register, the memory data register, the register file, the A and B registers, the ALU, and ALUOut.]

We now have only a single memory unit and a single ALU. In addition we need registers to hold the output of each stage.

New Registers and MUXs

• We have now added several new registers (which are transparent to the programmer) and some new MUXs:
– Instruction Register (IR): the instruction fetched.
– Memory Data Register (MDR): the data read from memory.
– A, B: the values read from the register file.
– ALUOut: the result of the ALU operation.

• The new MUXs added are:
– An additional MUX on the 1st ALU input, which chooses between the A register and the PC.
– The MUX on the 2nd ALU input is changed from a 2-way to a 4-way MUX. The additional inputs are the constant 4 (used to increment the PC) and the sign-extended and shifted offset field (used in beq).


Multicycle Diagram

• There are 3 possible sources for the PC value: 1. The output of the ALU which is PC+4; 2. The register ALUOut which is the address of the computed branch target; 3. The lower 26 bits of the IR shifted left by 2, concatenated with the 4 upper bits of the PC.

[Figure: the multicycle datapath with its control lines: IorD, MemRead, MemWrite, IRWrite, RegDst, RegWrite, MemtoReg, ALUSrcA, ALUSrcB, and ALUOp. The 4-way MUX on the second ALU input selects among B, the constant 4, the sign-extended offset, and the sign-extended offset shifted left by 2.]

The Instruction Execution Stages (1, 2)

1. Instruction Fetch (IF) - Fetch the instruction from memory and compute the address of the next sequential instruction:
IR = Memory[PC];
PC = PC + 4;

2. Instruction Decode (ID) and register fetch - Read the registers from the register file and compute the potential branch address (even if it turns out not to be needed):
A = Reg[IR[25-21]];
B = Reg[IR[20-16]];
ALUOut = PC + (sign-extended(IR[15-0]) << 2);

The Instruction Execution Stages (3)

3. Execution (EX), memory address computation, or branch completion - In this stage the operation is determined by the instruction class:
A. Memory reference: ALUOut = A + sign-extended(IR[15-0]);
B. R-type: ALUOut = A op B;
C. Branch: if (A == B) PC = ALUOut;
D. Jump: PC = PC[31-28] cat (IR[25-0] << 2);

The Instruction Execution Stages (4, 5)

4. Memory access (Mem) or R-type completion - During this step the load/store instruction accesses memory or the AL instruction writes its result.
A. Memory reference:
MDR = Memory[ALUOut]; (load)
Memory[ALUOut] = B; (store)
B. R-type: Reg[IR[15-11]] = ALUOut;

5. Memory read completion step - The load completes by writing the value from memory into a register.
Reg[IR[20-16]] = MDR;
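The five steps can be traced with a small Python sketch that runs one instruction and reports how many steps (cycles) it needed. Everything here is an illustrative assumption rather than part of the slides: the function and variable names, the dictionary-based memory, the standard MIPS opcode and funct numbers, and the omission of slt for brevity.

```python
LW, SW, BEQ, J, RTYPE = 0x23, 0x2B, 0x04, 0x02, 0x00
R_OPS = {0x20: lambda a, b: a + b, 0x22: lambda a, b: a - b,
         0x24: lambda a, b: a & b, 0x25: lambda a, b: a | b}
MASK = 0xFFFFFFFF


def sign_extend16(value):
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_instruction(state):
    """Execute one instruction step by step; return the number of cycles."""
    mem, reg = state["mem"], state["reg"]
    # Step 1 (IF): IR = Memory[PC]; PC = PC + 4
    ir = mem[state["pc"]]
    pc = (state["pc"] + 4) & MASK
    # Step 2 (ID): read A and B, compute the potential branch target
    rs, rt, rd = (ir >> 21) & 0x1F, (ir >> 16) & 0x1F, (ir >> 11) & 0x1F
    a, b, imm = reg[rs], reg[rt], sign_extend16(ir)
    alu_out = (pc + (imm << 2)) & MASK
    opcode = ir >> 26
    if opcode in (LW, SW):
        alu_out = (a + imm) & MASK           # step 3: address computation
        if opcode == LW:
            mdr = mem[alu_out]               # step 4: memory read
            reg[rt] = mdr                    # step 5: write back
            cycles = 5
        else:
            mem[alu_out] = b                 # step 4: memory write
            cycles = 4
    elif opcode == RTYPE:
        alu_out = R_OPS[ir & 0x3F](a, b) & MASK  # step 3: A op B
        reg[rd] = alu_out                        # step 4: R-type completion
        cycles = 4
    elif opcode == BEQ:
        if a == b:                           # step 3: branch completion
            pc = alu_out
        cycles = 3
    elif opcode == J:
        pc = (pc & 0xF0000000) | ((ir & 0x03FFFFFF) << 2)  # step 3
        cycles = 3
    else:
        raise ValueError("instruction outside the implemented subset")
    state["pc"] = pc
    return cycles


# lw $t1,8($t2) followed by add $t1,$t2,$t3
state = {"pc": 0, "reg": [0] * 32,
         "mem": {0: 0x8D490008, 4: 0x014B4820, 0x108: 42}}
state["reg"][10] = 0x100   # $t2
state["reg"][11] = 5       # $t3
cycles = [run_instruction(state), run_instruction(state)]
```

The sketch runs the steps back to back in software; in the hardware each step occupies its own clock cycle and the intermediate values live in IR, A, B, ALUOut, and MDR.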

Cycles Per Instruction (CPI)

• The CPI of a program defines how many cycles an average instruction takes. Assuming an instruction mix (for the gcc compiler) of 22% loads, 11% stores, 49% R-type, 16% branches, and 2% jumps, what is the CPI, assuming each state requires one clock cycle?

• The number of clock cycles for each instruction class is: Loads: 5; Stores: 4; R-type: 4; Branches: 3; Jumps: 3.

• Thus the CPI = 0.22*5 + (0.11 + 0.49)*4 + (0.16 + 0.02)*3 = 4.04

• This is better than the worst-case CPI of 5, in which every instruction would take as many clock cycles as the longest one (the load).
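The CPI figure is a weighted average and can be checked in a few lines of Python; the dictionary names are illustrative.

```python
CYCLES = {"load": 5, "store": 4, "r_type": 4, "branch": 3, "jump": 3}
MIX = {"load": 0.22, "store": 0.11, "r_type": 0.49, "branch": 0.16, "jump": 0.02}

# Average cycles per instruction, weighted by how often each class occurs.
cpi = sum(MIX[name] * CYCLES[name] for name in CYCLES)

# Worst case: every instruction takes as long as the longest one.
worst_case_cpi = max(CYCLES.values())
```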

Exceptions

• One of the hardest parts of control is implementing exceptions and interrupts: events other than branches and jumps that change the normal flow of instruction execution.

• An exception is an unexpected event that happens during program execution, such as an arithmetic overflow or an illegal instruction (these are the only 2 in our design).

• An interrupt is an event that is external to the processor, such as a request by an I/O device.

• When an exception occurs the machine must save the address of the offending instruction in the exception program counter (EPC) and then transfer execution to the OS. The OS might service the exception and return control to the program, or terminate execution.

Causes of Exceptions

• In order for the OS to handle the exception it must know the cause of the exception. MIPS has a register called the Cause register, which holds the reason for the exception.

• A second method is called vectored interrupts. In a vectored interrupt the address to which control is transferred is determined by the exception cause. The OS knows the cause of the exception from the address that is jumped to.

• We need two additional registers: the EPC, which holds the address of the offending instruction, and the Cause register, which holds 0 for an undefined instruction and 1 for an arithmetic overflow.

• We will need 2 control signals to write the EPC and Cause registers (EPCWrite and CauseWrite) and a signal to set the LSB of the Cause register (IntCause).
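The register updates described above can be sketched in Python. This is a simplified illustration, not the control logic itself: the function names, the dictionary holding the CPU state, and the handler address 0xC0000000 are assumptions chosen for the example.

```python
HANDLER_ADDRESS = 0xC0000000   # assumed placeholder for the OS handler entry
UNDEFINED_INSTRUCTION = 0      # Cause value for an undefined instruction
ARITHMETIC_OVERFLOW = 1        # Cause value for an arithmetic overflow


def to_signed(x):
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def add_overflows(a, b):
    """True if the signed 32-bit sum of a and b does not fit in 32 bits."""
    total = to_signed(a) + to_signed(b)
    return total < -(1 << 31) or total > (1 << 31) - 1


def take_exception(cpu, int_cause, offending_pc):
    cpu["epc"] = offending_pc & 0xFFFFFFFF   # EPCWrite
    cpu["cause"] = int_cause                 # CauseWrite, value set by IntCause
    cpu["pc"] = HANDLER_ADDRESS              # transfer execution to the OS


cpu = {"pc": 0x00400020, "epc": 0, "cause": 0}
if add_overflows(0x7FFFFFFF, 1):
    take_exception(cpu, ARITHMETIC_OVERFLOW, offending_pc=0x0040001C)
```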

Datapath with Exceptions

• IntCause is set by the control according to the cause: either the control cannot decode the instruction or the ALU signals an overflow. The next-PC MUX now has 4 inputs, because the exception handler address is added.

[Figure: the multicycle datapath with exception handling. It adds the EPC and Cause registers, the control signals EPCWrite, IntCause, and CauseWrite, and the exception handler address (labelled C0 00 00 00) as the fourth input of the PC source MUX. The Control block takes Op [5-0] from Instruction [31-26] and also produces PCWriteCond, PCWrite, IorD, MemRead, MemWrite, MemtoReg, IRWrite, PCSource, ALUOp, ALUSrcB, ALUSrcA, RegWrite, and RegDst.]

