
1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Page 1: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

COMP541

Multicycle MIPS

Montek Singh

Apr 8, 2015

Page 2: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Topics
• Challenges w/ single-cycle MIPS implementation
• Multicycle MIPS
  – State elements
  – Now add registers between stages
  – How to control
• Performance

Page 3: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Review: Processor Performance
• Program execution time:
  Execution Time = (# instructions) × (cycles/instruction) × (seconds/cycle) = IC × CPI × Tc
• Definitions:
  – IC = instruction count
  – Cycles/instruction = CPI
  – Seconds/cycle = clock period = Tc
  – 1/CPI = instructions/cycle = IPC
• Challenge is to satisfy constraints of:
  – Cost
  – Power
  – Performance

Page 4: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Single-Cycle Performance (textbook version)
• Tc is limited by the critical path (lw)
  – lw is typically the longest instruction

[Figure: single-cycle MIPS datapath and control unit (PC, instruction memory, register file, sign extend, ALU, data memory), annotated with control-signal values]

Page 5: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Single-Cycle Performance (textbook version)
• Single-cycle critical path:
  Tc = tpcq_PC + tmem + max(tRFread, tsext + tmux) + tALU + tmem + tmux + tRFsetup
• In most implementations, the limiting paths are memory, ALU, and register file:
  Tc = tpcq_PC + 2tmem + tRFread + tALU + tmux + tRFsetup

[Figure: single-cycle MIPS datapath and control unit, annotated with control-signal values]

Page 6: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Single-Cycle Performance Example

Tc = tpcq_PC + 2tmem + tRFread + tALU + tmux + tRFsetup

= [30 + 2(250) + 150 + 200 + 25 + 20] ps = 925 ps

What’s the max clock frequency?
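The question above can be answered with a few lines of arithmetic. The sketch below is only an illustration: the variable names are made up here, and the delay values are the ones plugged into the formula on this slide.

```python
# Delays in picoseconds, in the order they appear in the Tc formula above.
t_pcq_pc = 30     # PC register clock-to-Q
t_mem = 250       # memory read (counted twice: instruction fetch + data access)
t_rf_read = 150   # register file read
t_alu = 200       # ALU
t_mux = 25        # multiplexer
t_rf_setup = 20   # register file setup

tc_ps = t_pcq_pc + 2 * t_mem + t_rf_read + t_alu + t_mux + t_rf_setup

# f = 1 / Tc. With Tc in picoseconds, 1000 / Tc gives the frequency in GHz.
f_max_ghz = 1e3 / tc_ps

print(f"Tc = {tc_ps} ps, f_max = {f_max_ghz:.3f} GHz")
```

With these numbers the maximum clock frequency comes out at roughly 1.08 GHz.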

Page 7: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Single-Cycle Performance Example
• For a program with 100 billion instructions executing on a single-cycle MIPS processor:
  Execution Time = # instructions × CPI × Tc
                 = (100 × 10^9)(1)(925 × 10^-12 s)
                 = 92.5 seconds
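As a quick check of the calculation above, here is a minimal sketch of the execution-time formula. The function name `execution_time` is an illustrative choice, not something defined in the lecture.

```python
def execution_time(instruction_count, cpi, tc_seconds):
    """Execution Time = IC x CPI x Tc, returned in seconds."""
    return instruction_count * cpi * tc_seconds

# Single-cycle processor: CPI = 1, Tc = 925 ps, 100 billion instructions.
single_cycle_s = execution_time(100e9, 1, 925e-12)
print(f"{single_cycle_s:.1f} s")
```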

Page 8: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle MIPS

Key idea: Break instruction execution into multiple clock cycles

Page 9: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle MIPS Processor
• Single-cycle microarchitecture:
  + simple
  - cycle time limited by longest instruction (lw)
  - two adders/ALUs and two memories
• Multicycle microarchitecture:
  + higher clock speed
  + simpler instructions run faster
  + reuse expensive hardware on multiple cycles
  - sequencing overhead
• Same design steps: datapath & control

Page 10: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle State Elements
• Replace Instruction and Data memories with a single unified memory
  – More realistic (buy one big RAM!)
  – Was not possible in single-cycle implementation: both instruction and data accesses needed within same clock cycle
• Now: use same memory twice if needed
  – instruction fetch and data access are in distinct clock cycles

[Figure: multicycle state elements: PC (with enable), unified Instr / Data memory, register file]

Page 11: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: lw instr fetch
• First consider executing lw
• STEP 1: Fetch instruction
  – introduce Instruction Register to buffer this instruction
  – a "non-architectural register": not accessible to programmer

[Figure: datapath with Instruction Register (output Instr, enable IRWrite) added at the memory read port]

Page 12: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: lw register read
• Read register $rs
  – insert another non-architectural register, A
  – buffers the value of $rs read from register file

[Figure: datapath with Instr25:21 driving register file port A1 and register A added at RD1]

Page 13: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: lw immediate
• Immediate field is sign-extended
  – for consistency, could insert another non-architectural register to buffer SignImm
  – skipped in this version because SignImm is a simple combinational function of Instr, which is already being held in the Instruction Register

[Figure: datapath with Sign Extend unit taking Instr15:0 and producing SignImm]

Page 14: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: lw address
• ALU computes memory address
  – insert another register to buffer ALUOut

[Figure: datapath with ALU (inputs SrcA, SrcB; control ALUControl2:0) and ALUOut register added]

Page 15: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: lw memory read
• Same memory read now for data access
  – insert a multiplexer in front of memory's address input
  – choose either PC or ALUOut as address, i.e., either instruction fetch or data access
  – controlled by new control signal IorD

[Figure: datapath with address multiplexer (select IorD, output Adr) and Data register added at the memory]

Page 16: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: lw write register
• Data from memory is written into register file

[Figure: datapath with Data register feeding WD3, Instr20:16 driving A3, and RegWrite driving WE3]

Page 17: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: increment PC
• PC incremented by re-using the ALU to do PC + 4
  – in single-cycle, we had to introduce a dedicated +4 adder
  – in multi-cycle, same ALU used twice, in distinct cycles!

[Figure: datapath with SrcA multiplexer (select ALUSrcA), four-input SrcB multiplexer (select ALUSrcB1:0, including the constant 4), and PC enable PCWrite]

Now using main ALU when it is not busy (instead of dedicated adder)

Page 18: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: sw
• Compared to lw:
  – address computation is identical to lw
  – write data in $rt to memory
  – MemWrite will be 1 during the appropriate clock cycle
  – $rt is buffered using nonarchitectural register B

[Figure: datapath with Instr20:16 driving A2, register B at RD2 feeding memory WD, and MemWrite driving memory WE]

Page 19: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: R-type Instrs.
• Read from $rs and $rt
  – multiplexers in front of ALU choose $rs and $rt as operands
• Write ALUResult to register file
• Write to $rd (instead of $rt)
  – multiplexers in front of write address/data to register file

[Figure: datapath with RegDst multiplexer (Instr20:16 vs. Instr15:11) on A3 and MemtoReg multiplexer on WD3]

Page 20: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Datapath: beq
• 2 tasks:
  – Determine whether values in rs and rt are equal
  – Calculate branch target address: BTA = (sign-extended immediate << 2) + (PC + 4)
• ALU reused!

[Figure: datapath with <<2 shifter on SignImm, ALU Zero output, Branch signal, PCSrc multiplexer, and PC enable PCEn derived from PCWrite, Branch and Zero]
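The branch target formula from this slide can be sketched in software as follows. This is a behavioral illustration only; the helper names `sign_extend16` and `branch_target` are hypothetical, and the result is wrapped to 32 bits to mimic the width of the PC.

```python
def sign_extend16(imm16):
    """Interpret the low 16 bits as a signed two's-complement value."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16

def branch_target(pc, imm16):
    """BTA = (sign-extended immediate << 2) + (PC + 4), wrapped to 32 bits."""
    return ((sign_extend16(imm16) << 2) + pc + 4) & 0xFFFFFFFF

# Forward branch over three instructions, and a branch with offset -1
# (which targets the branch instruction itself).
print(hex(branch_target(0x00400000, 0x0003)))
print(hex(branch_target(0x00400010, 0xFFFF)))
```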

Page 21: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Complete Multicycle Processor
• Caveat: Same differences in functionality w.r.t. our lab version as single-cycle MIPS

[Figure: complete multicycle datapath with Control Unit (inputs Op = Instr31:26 and Funct = Instr5:0; outputs IorD, MemWrite, IRWrite, PCWrite, Branch, PCSrc, ALUControl2:0, ALUSrcB1:0, ALUSrcA, RegWrite, MemtoReg, RegDst)]

Page 22: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Control Unit

[Figure: Control Unit block diagram]
• Main Controller (FSM): input Opcode5:0
  – Multiplexer selects: MemtoReg, RegDst, IorD, PCSrc, ALUSrcB1:0, ALUSrcA
  – Register enables: IRWrite, MemWrite, PCWrite, Branch, RegWrite
  – ALUOp1:0 to the ALU Decoder
• ALU Decoder: inputs ALUOp1:0 and Funct5:0, output ALUControl2:0

Page 23: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: Fetch

[Figure: complete multicycle datapath annotated with the control-signal values for the Fetch state]

[State diagram: Reset → S0: Fetch]

Page 24: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: Fetch

[Figure: complete multicycle datapath annotated with the control-signal values for the Fetch state]

[State diagram: Reset → S0: Fetch]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite)

• Fetch instruction
• Also increment PC (because ALU not in use)

Note: signals only shown when needed and enables only when asserted.

Page 25: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: Decode

[State diagram: Reset → S0: Fetch → S1: Decode]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite)
  S1: Decode

[Figure: complete multicycle datapath annotated with the control-signal values for the Decode state]

• No signals needed for decode
• Register values also fetched
  – Perhaps will not be used

Page 26: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: Address Calculation

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode
  S2: MemAdr, entered from S1 when Op = LW or Op = SW

[Figure: complete multicycle datapath annotated with the control-signal values for the MemAdr state]

• Now change states depending on instr

Page 27: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: Address Calculation

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW

[Figure: complete multicycle datapath annotated with the control-signal values for the MemAdr state]

• For lw or sw, need to compute addr

Page 28: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: lw

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)

• For lw now need to read from memory
• Then write to register

Page 29: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: sw

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW

• sw just writes to memory
• One step shorter

Page 30: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: R-Type

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)

• The R-type instructions have two steps: compute result in ALU and write to reg

Page 31: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: beq

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode (ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00)
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)
  S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch), entered from S1 when Op = BEQ

• beq needs to use ALU twice, so consumes two cycles
  – One to compute addr
  – Another to decide on eq
• Can take advantage of decode when ALU not used to compute BTA (no harm if BTA not used)

Page 32: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Complete Multicycle Controller FSM

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode (ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00)
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)
  S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch), entered from S1 when Op = BEQ

Page 33: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: addi

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode (ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00)
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)
  S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch), entered from S1 when Op = BEQ
  S9: ADDIExecute, entered from S1 when Op = ADDI
  S10: ADDIWriteback

Similar to R-type:
• Add
• Write back

Page 34: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Main Controller FSM: addi

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 0, IRWrite, PCWrite), entered on Reset
  S1: Decode (ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00)
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)
  S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 1, Branch), entered from S1 when Op = BEQ
  S9: ADDIExecute (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = ADDI
  S10: ADDIWriteback (RegDst = 0, MemtoReg = 0, RegWrite)

Page 35: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Extended Functionality: j

[Figure: multicycle datapath extended for j: PCSrc widened to PCSrc1:0 with a third multiplexer input PCJump, formed from PC31:28 and Instr25:0 shifted left by 2 (bits 27:0)]
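The PCJump value in this datapath concatenates the upper four PC bits with the 26-bit jump field shifted left by two. A minimal behavioral sketch follows, assuming the PC has already been incremented to PC + 4 by the time the jump state runs (as it is in the Fetch state); the function name `jump_target` is hypothetical.

```python
def jump_target(pc, instr):
    """PCJump = {PC[31:28], Instr[25:0] << 2}, as a 32-bit address."""
    upper = pc & 0xF0000000               # PC bits 31:28
    lower = (instr & 0x03FFFFFF) << 2     # 26-bit jump field becomes bits 27:0
    return upper | lower

# Example: a j instruction (opcode 000010) with a jump field of 0x0100008,
# fetched at 0x00400000, so the PC already holds 0x00400004.
print(hex(jump_target(0x00400004, 0x08100008)))
```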

Page 36: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Control FSM: j

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite), entered on Reset
  S1: Decode (ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00)
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)
  S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch), entered from S1 when Op = BEQ
  S9: ADDIExecute (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = ADDI
  S10: ADDIWriteback (RegDst = 0, MemtoReg = 0, RegWrite)
  S11: Jump, entered from S1 when Op = J

Page 37: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Control FSM: j

[State diagram]
  S0: Fetch (IorD = 0, ALUSrcA = 0, ALUSrcB = 01, ALUOp = 00, PCSrc = 00, IRWrite, PCWrite), entered on Reset
  S1: Decode (ALUSrcA = 0, ALUSrcB = 11, ALUOp = 00)
  S2: MemAdr (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = LW or Op = SW
  S3: MemRead (IorD = 1), entered from S2 when Op = LW
  S4: MemWriteback (RegDst = 0, MemtoReg = 1, RegWrite)
  S5: MemWrite (IorD = 1, MemWrite), entered from S2 when Op = SW
  S6: Execute (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10), entered from S1 when Op = R-type
  S7: ALUWriteback (RegDst = 1, MemtoReg = 0, RegWrite)
  S8: Branch (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 01, PCSrc = 01, Branch), entered from S1 when Op = BEQ
  S9: ADDIExecute (ALUSrcA = 1, ALUSrcB = 10, ALUOp = 00), entered from S1 when Op = ADDI
  S10: ADDIWriteback (RegDst = 0, MemtoReg = 0, RegWrite)
  S11: Jump (PCSrc = 10, PCWrite), entered from S1 when Op = J
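The controller's state transitions can be sketched as a small next-state table. This is an illustrative software model rather than the lab's hardware description: state names follow the diagram, the opcode labels are plain strings, and the edges that return to Fetch after the last state of each instruction are an assumption of this sketch, since the transcript of the diagram does not spell them out.

```python
# Branches out of Decode and MemAdr depend on the opcode.
DECODE_TARGETS = {
    "LW": "MemAdr", "SW": "MemAdr", "R-type": "Execute",
    "BEQ": "Branch", "ADDI": "ADDIExecute", "J": "Jump",
}
MEMADR_TARGETS = {"LW": "MemRead", "SW": "MemWrite"}

# All other states have a single successor.
UNCONDITIONAL = {
    "Fetch": "Decode",
    "MemRead": "MemWriteback",
    "Execute": "ALUWriteback",
    "ADDIExecute": "ADDIWriteback",
    # Assumed return edges: each instruction's final state goes back to Fetch.
    "MemWriteback": "Fetch", "MemWrite": "Fetch", "ALUWriteback": "Fetch",
    "Branch": "Fetch", "ADDIWriteback": "Fetch", "Jump": "Fetch",
}

def next_state(state, op):
    """Return the state that follows `state` for an instruction of type `op`."""
    if state == "Decode":
        return DECODE_TARGETS[op]
    if state == "MemAdr":
        return MEMADR_TARGETS[op]
    return UNCONDITIONAL[state]

def cycles(op):
    """Count the states visited from Fetch until the FSM returns to Fetch."""
    state, count = "Fetch", 0
    while True:
        count += 1
        state = next_state(state, op)
        if state == "Fetch":
            return count

for op in ("LW", "SW", "R-type", "BEQ", "ADDI", "J"):
    print(op, cycles(op))
```

Walking the table this way gives the number of clock cycles each instruction type spends in the controller.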

Page 38: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Performance
• Instructions take different number of cycles:
  – 3 cycles: beq, j
  – 4 cycles: R-type, sw, addi
  – 5 cycles: lw
• CPI is weighted average
• SPECINT2000 benchmark:
  – 25% loads
  – 10% stores
  – 11% branches
  – 2% jumps
  – 52% R-type
• Average CPI = (0.11 + 0.02)(3) + (0.52 + 0.10)(4) + (0.25)(5) = 4.12
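The weighted average can be reproduced directly from the instruction mix and cycle counts on this slide. The dictionary layout below is just one convenient way to organize the numbers.

```python
# (fraction of instructions, cycles per instruction) for each class.
mix = {
    "load":   (0.25, 5),
    "store":  (0.10, 4),
    "branch": (0.11, 3),
    "jump":   (0.02, 3),
    "r_type": (0.52, 4),
}

total_fraction = sum(fraction for fraction, _ in mix.values())
average_cpi = sum(fraction * cycles for fraction, cycles in mix.values())

print(f"mix covers {total_fraction:.2f} of instructions, CPI = {average_cpi:.2f}")
```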

Page 39: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Performance

[Figure: complete multicycle datapath and control unit]

• Multicycle critical path: Tc = tpcq + tmux + max(tALU + tmux, tmem) + tsetup

Page 40: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Performance Example

Tc = tpcq_PC + tmux + max(tALU + tmux, tmem) + tsetup

= tpcq_PC + tmux + tmem + tsetup

= [30 + 25 + 250 + 20] ps = 325 ps

Element               Parameter   Delay (ps)
Register clock-to-Q   tpcq_PC     30
Register setup        tsetup      20
Multiplexer           tmux        25
ALU                   tALU        200
Memory read           tmem        250
Register file read    tRFread     150
Register file setup   tRFsetup    20
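Plugging the table's delays into the multicycle critical-path formula gives the cycle time. The sketch below uses illustrative variable names and shows why the `max` term resolves to the memory delay.

```python
# Delays in picoseconds, from the table above.
t_pcq, t_setup, t_mux, t_alu, t_mem = 30, 20, 25, 200, 250

# The slower of the two candidate paths in a cycle: ALU followed by a mux, or memory.
slowest_stage = max(t_alu + t_mux, t_mem)   # max(225, 250) -> memory dominates

tc_ps = t_pcq + t_mux + slowest_stage + t_setup
print(f"Tc = {tc_ps} ps")
```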

Page 41: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Multicycle Performance Example
• For a program with 100 billion instructions executing on a multicycle MIPS processor:
  – CPI = 4.12
  – Tc = 325 ps
• Execution Time = (# instructions) × CPI × Tc
                 = (100 × 10^9)(4.12)(325 × 10^-12 s)
                 = 133.9 seconds
• This is slower than the single-cycle processor (92.5 seconds). Why?
  – Not all steps the same length
  – Sequencing overhead for each step (tpcq + tsetup = 50 ps)
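The comparison can be checked numerically with the figures from the two performance examples. The variable names are illustrative.

```python
instructions = 100e9

# (CPI, Tc in seconds) for each design, from the examples in this lecture.
single_cycle_s = instructions * 1 * 925e-12
multicycle_s = instructions * 4.12 * 325e-12

slowdown = multicycle_s / single_cycle_s
print(f"single-cycle: {single_cycle_s:.1f} s, multicycle: {multicycle_s:.1f} s, "
      f"ratio: {slowdown:.2f}x")
```

The clock period shrinks by a factor of about 2.8 (925 ps to 325 ps), but the CPI grows by a factor of 4.12, so the multicycle design ends up about 1.45 times slower on this instruction mix.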

Page 42: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Review: Single-Cycle MIPS Processor

[Figure: single-cycle MIPS datapath and control unit, extended with a Jump multiplexer selecting PCJump (formed from bits 31:28 and Instr25:0 shifted left by 2)]

Page 43: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Review: Multicycle MIPS Processor

[Figure: multicycle MIPS datapath and control unit, including the PCJump path (PC31:28 with Instr25:0 shifted left by 2) and the three-input PCSrc multiplexer]

Page 44: 1 COMP541 Multicycle MIPS Montek Singh Apr 8, 2015.

Next Time
• Next topic: we'll look at pipelined MIPS
  – Improving throughput (and adding complexity!) by trying to use all of the hardware every cycle
