
15-447 Computer Architecture Fall 2007 ©

October 3rd, 2007

Majd F. Sakr

msakr@qatar.cmu.edu

www.qatar.cmu.edu/~msakr/15447-f07/

CS-447– Computer Architecture

M,W 10-11:20am

Lecture 11: Single Cycle Datapath


Lecture Objectives

° Learn what a datapath is and how it provides the required functions.

° Appreciate why different implementation strategies affect the clock rate and CPI of a machine.

° Understand how the ISA determines many aspects of the hardware implementation.


Implementation vs. Performance

Performance of a processor is determined by

• Instruction count of a program

• CPI

• Clock cycle time (clock rate)

The compiler & the ISA determine the instruction count.

The implementation of the processor determines the CPI and the clock cycle time.
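The three factors listed above combine in the standard CPU performance equation. It is written out here for reference; the slide itself lists only the factors, and the wording of the terms is chosen for illustration:

```latex
\text{CPU time}
  = \text{Instruction count} \times \text{CPI} \times \text{Clock cycle time}
  = \frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}}
```

The compiler and ISA act on the first factor; the processor implementation acts on the other two.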


Possible Execution Steps of Any Instruction

° Instruction Fetch

° Instruction Decode and Register Fetch

° Execution of the Memory Reference Instruction

° Execution of Arithmetic-Logical operations

° Branch Instruction

° Jump Instruction


Instruction Processing

° Five steps:

• Instruction fetch (IF)

• Instruction decode and operand fetch (ID)

• ALU/execute (EX)

• Memory access (MEM), not required by every instruction

• Write-back (WB)

[Figure: abstract datapath (PC, instruction memory, register file, ALU, data memory) with the IF, ID, EX, MEM, and WB steps marked]


Datapath & Control

[Figure: the datapath together with its control unit]


Datapath Elements

The datapath contains two types of logic elements:

• Combinational (e.g., the ALU): elements that operate on data values. Their outputs depend only on their current inputs.

• State (e.g., registers and memory): elements with internal storage. Their state is defined by the values they contain.
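The distinction between the two element types can be sketched in a few lines of Python. This is an illustrative model only: the names `alu_add` and `Register`, the 32-bit width, and the edge-triggered update are assumptions, not details taken from the slides.

```python
def alu_add(a: int, b: int) -> int:
    """Combinational element: the output is a pure function of the inputs."""
    return (a + b) & 0xFFFFFFFF  # wrap to 32 bits (assumed word size)


class Register:
    """State element: holds a value until a clock edge latches a new one."""

    def __init__(self, value: int = 0) -> None:
        self.value = value        # currently stored state
        self.next_value = value   # input waiting to be latched

    def set_input(self, value: int) -> None:
        self.next_value = value & 0xFFFFFFFF

    def clock_edge(self) -> None:
        self.value = self.next_value


pc = Register(0)
pc.set_input(alu_add(pc.value, 4))
before_edge = pc.value  # the stored value has not changed yet
pc.clock_edge()
after_edge = pc.value   # the new value is visible only after the edge
```

Calling `alu_add` twice with the same arguments always gives the same result, while reading `pc.value` depends on what was latched earlier.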


State Elements


Pentium Processor Die

° State

• Registers

• Memory

° Control ROM

° Combinational logic (Compute)


Abstract View of the Datapath

[Figure: abstract datapath showing the PC, instruction memory (address, instruction), register file (register numbers, data), ALU, and data memory (address, data)]


Single Cycle Implementation

° This simple processor can execute an ALU instruction, access memory, or compute the next instruction's address in a single cycle.


Program Counter

If each instruction occupies 4 memory locations (bytes), then Next PC <= PC + 4


PC Datapath – Branch Offset

PC <= PC + Branch Offset


Abstract View After PC Basic Implementation


The Register File

° Arithmetic & Logical instructions (R-type) read the contents of 2 registers, perform an ALU operation, and write the result back to a register.

° Registers are stored in the register file. The register file has inputs that select the registers to read and write, outputs for the data read, an input for the data to be written, and 1 control signal (RegWrite) that decides whether the data is written. In addition, we need an ALU to perform the operations.


The Register File

[Figure: register file (Read register 1, Read register 2, Write register, Write data, Read data 1, Read data 2, RegWrite) feeding the ALU (3-bit ALU operation control, Zero, ALU result)]
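The register file interface can be sketched as a small Python class. This is a minimal model under stated assumptions: the class name `RegisterFile`, the default of 32 registers, and the 32-bit width are illustrative, and special cases such as a hard-wired zero register are not modeled.

```python
class RegisterFile:
    """General-purpose registers with two read ports and one write port."""

    def __init__(self, num_regs: int = 32) -> None:
        self.regs = [0] * num_regs

    def read(self, read_reg1: int, read_reg2: int) -> tuple[int, int]:
        """Return (Read data 1, Read data 2) for the two selected registers."""
        return self.regs[read_reg1], self.regs[read_reg2]

    def write(self, write_reg: int, write_data: int, reg_write: bool) -> None:
        """Store write_data only when the RegWrite control signal is asserted."""
        if reg_write:
            self.regs[write_reg] = write_data & 0xFFFFFFFF


rf = RegisterFile()
rf.write(8, 5, reg_write=True)
rf.write(9, 7, reg_write=False)  # RegWrite deasserted: register 9 keeps its old value
data1, data2 = rf.read(8, 9)
```

Here `data1` holds the value written to register 8, while `data2` is still the initial value of register 9 because the second write was gated off.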


R-Type Instructions

• Assembly (e.g., register-register signed addition)

ADD rd_reg rs_reg rt_reg

• Machine encoding

• Semantics

if MEM[PC] == ADD rd rs rt

GPR[rd] ← GPR[rs] + GPR[rt]

PC ← PC + 4
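The ADD semantics above can be traced in a short Python sketch. The slide's machine-encoding figure did not survive, so the field layout below assumes the standard MIPS R-type format (opcode, rs, rt, rd, shamt, funct), which the slides do not state explicitly. The helper names are hypothetical, and overflow handling is omitted: results simply wrap to 32 bits.

```python
def decode_r_type(instr: int) -> dict[str, int]:
    """Split a 32-bit word into R-type fields (assumed MIPS layout)."""
    return {
        "opcode": (instr >> 26) & 0x3F,
        "rs": (instr >> 21) & 0x1F,
        "rt": (instr >> 16) & 0x1F,
        "rd": (instr >> 11) & 0x1F,
        "shamt": (instr >> 6) & 0x1F,
        "funct": instr & 0x3F,
    }


def execute_add(gpr: list[int], pc: int, instr: int) -> int:
    """GPR[rd] <- GPR[rs] + GPR[rt]; return the next PC (PC + 4)."""
    f = decode_r_type(instr)
    gpr[f["rd"]] = (gpr[f["rs"]] + gpr[f["rt"]]) & 0xFFFFFFFF
    return pc + 4


gpr = [0] * 32
gpr[8], gpr[9] = 5, 7
# 0x01095020 encodes ADD with rd=10, rs=8, rt=9 under the assumed layout.
new_pc = execute_add(gpr, 0x1000, 0x01095020)
```

After the call, register 10 holds the sum of registers 8 and 9, and `new_pc` points at the next sequential instruction.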


ADD rd rs rt


Datapath for Add


I-Type ALU Instructions

° Assembly (e.g., register-immediate signed additions)

ADDI rt_reg rs_reg immediate_16

° Machine encoding

° Semantics

if MEM[PC] == ADDI rt rs immediate

GPR[rt] ← GPR[rs] + sign-extend (immediate)

PC ← PC + 4
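The sign extension in the ADDI semantics is the step most easily gotten wrong, so a minimal Python sketch may help. The function names are hypothetical, and the register file is modeled as a plain list of 32-bit values.

```python
def sign_extend16(imm16: int) -> int:
    """Interpret a 16-bit field as a signed value (negative if bit 15 is set)."""
    imm16 &= 0xFFFF
    return imm16 - 0x10000 if imm16 & 0x8000 else imm16


def execute_addi(gpr: list[int], pc: int, rt: int, rs: int, imm16: int) -> int:
    """GPR[rt] <- GPR[rs] + sign-extend(immediate); return PC + 4."""
    gpr[rt] = (gpr[rs] + sign_extend16(imm16)) & 0xFFFFFFFF
    return pc + 4


gpr = [0] * 32
gpr[8] = 10
# The 16-bit pattern 0xFFFF is the two's-complement encoding of -1.
new_pc = execute_addi(gpr, 0x2000, rt=9, rs=8, imm16=0xFFFF)
```

Without sign extension the immediate 0xFFFF would be added as 65535 rather than subtracting 1 from the source register.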


ADDI rt_reg rs_reg immediate_16


Datapath for R and I-Type ALU Instructions


Data Memory

° The element needed to implement load and store instructions is the data memory. In addition, we use the existing ALU to compute the address to access.

° The data memory has 2 x-bit inputs (the address and the write data) and 1 x-bit output (the read data). In addition, it has 2 control lines: MemWrite and MemRead.


Data Memory

[Figure: register file, 16-to-32-bit sign-extend unit, ALU, and data memory (Address, Write data, Read data) with control signals RegWrite, ALU operation (3 bits), MemRead, and MemWrite]


Load Instruction

° Assembly (e.g., load 4-byte word)

LW rt_reg offset_16 (base_reg)

° Machine encoding

° Semantics

if MEM[PC] == LW rt offset16 (base)

EA = sign-extend(offset) + GPR[base]

GPR[rt] ← MEM[ translate(EA) ]

PC ← PC + 4
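The load semantics can be followed step by step in Python. This is a sketch under simplifying assumptions: `translate` is treated as the identity (no virtual memory), memory is modeled as a dictionary from byte address to 32-bit word, and the name `execute_lw` is hypothetical.

```python
def execute_lw(gpr: list[int], mem: dict[int, int], pc: int,
               rt: int, base: int, offset16: int) -> int:
    """GPR[rt] <- MEM[EA] with EA = sign-extend(offset) + GPR[base]; return PC + 4."""
    offset16 &= 0xFFFF
    offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16  # sign-extend
    ea = (gpr[base] + offset) & 0xFFFFFFFF   # effective address, computed by the ALU
    gpr[rt] = mem[ea]                        # address translation assumed to be identity
    return pc + 4


gpr = [0] * 32
gpr[8] = 0x100
mem = {0xFC: 0xDEADBEEF}
# Offset 0xFFFC is -4, so the load reads the word just below the base address.
new_pc = execute_lw(gpr, mem, 0x3000, rt=9, base=8, offset16=0xFFFC)
```

The address computation reuses the same addition the ALU performs for ADDI; only the source of the value written back to the register differs.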


LW Datapath


Branch Equal

° The beq (branch if equal) instruction has 3 operands: two registers that are compared for equality, and an n-bit offset used to compute the branch target address relative to the PC.


Branch Equal

[Figure: branch datapath. The register file feeds the ALU, whose Zero output goes to the branch control logic; the 16-bit offset is sign-extended to 32 bits, shifted left by 2, and added to PC + 4 (from the instruction datapath) to form the branch target]
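The branch datapath can be sketched in Python using the elements named in the figure labels (sign extend, shift left 2, an adder fed by PC + 4, and the ALU's Zero output). It is assumed here that the ALU compares the registers by subtracting them and that the offset is 16 bits wide; the function name `execute_beq` is hypothetical.

```python
def execute_beq(gpr: list[int], pc: int, rs: int, rt: int, offset16: int) -> int:
    """Return the next PC for beq: the branch target if GPR[rs] == GPR[rt], else PC + 4."""
    offset16 &= 0xFFFF
    offset = offset16 - 0x10000 if offset16 & 0x8000 else offset16  # sign-extend
    zero = ((gpr[rs] - gpr[rt]) & 0xFFFFFFFF) == 0   # ALU subtracts; Zero flags equality
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    branch_target = (pc_plus_4 + (offset << 2)) & 0xFFFFFFFF  # shift left 2, then add
    return branch_target if zero else pc_plus_4


gpr = [0] * 32
gpr[8] = gpr[9] = 5
gpr[10] = 6
taken = execute_beq(gpr, 0x1000, rs=8, rt=9, offset16=3)
not_taken = execute_beq(gpr, 0x1000, rs=8, rt=10, offset16=3)
backward = execute_beq(gpr, 0x1000, rs=8, rt=9, offset16=0xFFFF)
```

The shift by 2 converts an offset counted in instructions into a byte offset, which is why a separate adder (rather than the main ALU, which is busy with the comparison) computes the target.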


Unconditional Jump

° Assembly

J immediate_26

° Machine encoding

° Semantics

if MEM[PC] == J immediate26

target = { PC[31:28], immediate26, 2’b00 }

PC ← target
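The concatenation in the jump semantics translates directly into bit operations. The sketch below follows the slide's formula as written (upper four bits taken from PC); the function name `jump_target` is hypothetical.

```python
def jump_target(pc: int, immediate26: int) -> int:
    """Build { PC[31:28], immediate26, 2'b00 } as a 32-bit address."""
    upper = pc & 0xF0000000                     # PC[31:28]
    middle = (immediate26 & 0x03FFFFFF) << 2    # 26-bit field followed by two zero bits
    return upper | middle


target = jump_target(0x40001000, 0x100)
```

Because the two low bits are always zero, the 26-bit field addresses instructions rather than bytes, which extends the reach of the jump to a 256 MB region.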


Unconditional Jump Datapath


Combining ALU and Memory Instructions

° The ALU datapath and the Memory datapath are similar. The differences are:

• The second input to the ALU is a register (R-type) or the sign-extended offset (I-type).

• The value stored into the destination register comes from the ALU (R-type) or from memory (load).

° Using 2 multiplexers (Mux), we can combine both datapaths.


Combining ALU and Memory Instructions

[Figure: combined ALU and memory datapath. One mux, controlled by ALUSrc, selects the second ALU input (Read data 2 or the sign-extended 16-bit field); a second mux, controlled by MemtoReg, selects the register write data (ALU result or memory Read data). Other control signals: RegWrite, ALU operation (3 bits), MemRead, MemWrite]
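The role of the two multiplexers can be sketched in Python. Two assumptions are made for illustration: the ALU is fixed to addition, and a select value of 0 picks the register or ALU-result input while 1 picks the immediate or memory input. The function names are hypothetical.

```python
def mux(select: int, in0: int, in1: int) -> int:
    """2-to-1 multiplexer: pass in1 when select is 1, otherwise in0."""
    return in1 if select else in0


def datapath_step(read_data1: int, read_data2: int, imm32: int,
                  memory: dict[int, int], alu_src: int, mem_to_reg: int) -> int:
    """Return the value presented to the register file's Write data input."""
    alu_in2 = mux(alu_src, read_data2, imm32)            # ALUSrc mux
    alu_result = (read_data1 + alu_in2) & 0xFFFFFFFF     # ALU fixed to add in this sketch
    mem_data = memory.get(alu_result, 0) if mem_to_reg else 0  # only loads read memory
    return mux(mem_to_reg, alu_result, mem_data)         # MemtoReg mux


r_type_result = datapath_step(5, 7, 0, {}, alu_src=0, mem_to_reg=0)
addi_result = datapath_step(10, 0, -1, {}, alu_src=1, mem_to_reg=0)
load_result = datapath_step(0x100, 0, 4, {0x104: 99}, alu_src=1, mem_to_reg=1)
```

The same hardware serves all three instruction kinds; only the two select signals change.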


The Complete Datapath

[Figure: the complete single-cycle datapath: PC, instruction memory, an adder for PC + 4, an adder for the branch target (with shift left 2), register file, sign-extend unit (16 to 32 bits), ALU, data memory, and three muxes controlled by PCSrc, ALUSrc, and MemtoReg. Other control signals: RegWrite, ALU operation (3 bits), MemRead, MemWrite]


Complete Datapath


What’s Wrong with Single Cycle?

° All instructions run at the speed of the slowest instruction.

° Adding a long instruction can hurt performance

• What if you wanted to include multiply?

° You cannot reuse any parts of the processor

• We have 3 different adders: one to calculate PC+4, one for PC+4+offset, and the ALU

° No profit in making the common case fast

• Since every instruction runs at the slowest instruction speed

- This is particularly important for loads as we will see later


What’s Wrong with Single Cycle?

1 ns – Register read/write time

2 ns – ALU/adder

2 ns – memory access

0 ns – MUX, PC access, sign extend, ROM

Instr  Get instr  Read reg  ALU operation  Mem   Write reg  Total
add    2 ns       1 ns      2 ns           -     1 ns       6 ns
beq    2 ns       1 ns      2 ns           -     -          5 ns
sw     2 ns       1 ns      2 ns           2 ns  -          7 ns
lw     2 ns       1 ns      2 ns           2 ns  1 ns       8 ns
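The per-instruction latencies above can be reproduced from the component delays. The Python sketch below uses the slide's numbers; the dictionary and step names are chosen here for illustration.

```python
# Component delays from the slide, in nanoseconds.
DELAY_NS = {"instr_fetch": 2, "reg_read": 1, "alu": 2, "mem": 2, "reg_write": 1}

# Functional units each instruction passes through, in order.
STEPS = {
    "add": ["instr_fetch", "reg_read", "alu", "reg_write"],
    "beq": ["instr_fetch", "reg_read", "alu"],
    "sw": ["instr_fetch", "reg_read", "alu", "mem"],
    "lw": ["instr_fetch", "reg_read", "alu", "mem", "reg_write"],
}

latency_ns = {name: sum(DELAY_NS[s] for s in steps) for name, steps in STEPS.items()}

# In a single-cycle design the clock period must cover the slowest instruction.
cycle_time_ns = max(latency_ns.values())
```

The load determines the cycle time because it is the only instruction that uses every unit.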


Computing Execution Time

Assume: 100 instructions executed

25% of instructions are loads,

10% of instructions are stores,

45% of instructions are adds, and

20% of instructions are branches.

Single-cycle execution:

100 * 8ns = 800 ns

Optimal execution:

25*8ns + 10*7ns + 45*6ns + 20*5ns = 640 ns
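The two totals can be checked with a few lines of Python. The latencies and instruction mix are the slide's; the variable names and the final speedup ratio are additions for illustration.

```python
LATENCY_NS = {"lw": 8, "sw": 7, "add": 6, "beq": 5}
MIX = {"lw": 25, "sw": 10, "add": 45, "beq": 20}  # counts out of 100 instructions

# Single cycle: every instruction takes as long as the slowest one.
single_cycle_ns = sum(MIX.values()) * max(LATENCY_NS.values())

# Optimal: every instruction takes only as long as it needs.
optimal_ns = sum(MIX[name] * LATENCY_NS[name] for name in MIX)

speedup = single_cycle_ns / optimal_ns
```

With this mix, the single-cycle design spends 25% more time than the ideal case in which each instruction takes only its own latency.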

