Pipeline Note 2

7/27/2019 Pipeline Note 2


Data Hazards



We must ensure that the results obtained when instructions are executed in a pipelined processor are identical to those obtained when the same instructions are executed sequentially.

Hazard occurs:

A ← 3 + A
B ← 4 × A

No hazard:

A ← 5 × C
B ← 20 + C

When two operations depend on each other, they must be executed sequentially in the correct order. Another example:

Mul R2, R3, R4
Add R5, R4, R6
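The dependency in the Mul/Add pair can be checked mechanically. The sketch below is only an illustration, assuming the destination register is the last operand (so Mul R2, R3, R4 writes R4, which the Add then reads). The helper names parse and has_raw_hazard are hypothetical and do not come from the notes.

```python
def parse(instr):
    """Split 'Op Ra, Rb, Rc' into (op, sources, destination).

    Assumption: the destination is the LAST operand, so
    'Mul R2, R3, R4' reads R2 and R3 and writes R4.
    """
    op, operands = instr.split(None, 1)
    regs = [r.strip() for r in operands.split(",")]
    return op, regs[:-1], regs[-1]


def has_raw_hazard(first, second):
    """True if `second` reads the register that `first` writes."""
    _, _, dest = parse(first)
    _, sources, _ = parse(second)
    return dest in sources


print(has_raw_hazard("Mul R2, R3, R4", "Add R5, R4, R6"))  # True
print(has_raw_hazard("Mul R2, R3, R4", "Add R5, R7, R6"))  # False
```

Only the read-after-write case is checked here, which is the case the example illustrates.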



Data Hazards

Figure 8.6. Pipeline stalled by data dependency between D2 and W1.



Data Hazards

Instruction       Pipeline stages (one clock cycle per column)
ADD R1, R2, R3    IF  ID  EX  M   WB
SUB R4, R1, R5        IF  ID  EX  M   WB

ADD: select R2 and R3 for ALU operations; add R2 and R3; store the sum in R1.
SUB: select R1 and R5 for ALU operations.



Stalling

Stalling involves halting the flow of instructions until the required result is ready to be used.

However, stalling wastes processor time by doing nothing while waiting for the result.
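The cost of stalling can be estimated with a small model. The following is a rough sketch rather than a description of any real processor: it assumes five one-cycle stages (IF, ID, EX, M, WB), no forwarding, and that a register written in WB can first be read in the following cycle. The function names id_cycles and stall_cycles are hypothetical.

```python
def id_cycles(program):
    """Return the clock cycle (1-indexed) in which each instruction is in ID.

    `program` is a list of (dest, sources) tuples.
    Assumptions (illustrative model only):
      - stages are IF, ID, EX, M, WB, one cycle each;
      - no forwarding: a register written in WB can first be read
        in the following cycle;
      - instructions enter ID in program order, at most one per cycle.
    """
    ready = {}      # register -> first cycle in which its new value can be read
    cycles = []
    prev_id = 1     # so that the first instruction is in ID in cycle 2
    for dest, sources in program:
        earliest = prev_id + 1
        for reg in sources:
            earliest = max(earliest, ready.get(reg, 0))
        cycles.append(earliest)
        # WB is 3 cycles after ID; the value is readable one cycle later.
        ready[dest] = earliest + 3 + 1
        prev_id = earliest
    return cycles


def stall_cycles(program):
    """Number of stall cycles inserted before each instruction."""
    ids = id_cycles(program)
    return [0] + [b - a - 1 for a, b in zip(ids, ids[1:])]


# ADD R1, R2, R3 followed by SUB R4, R1, R5 (destination listed first)
program = [("R1", ["R2", "R3"]), ("R4", ["R1", "R5"])]
print(stall_cycles(program))  # [0, 3]
```

Under these assumptions the SUB waits three cycles for R1. A design that lets the register file be written and read in the same cycle, or that forwards results, would need fewer.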



Stalling (cont.)

[Figure: pipeline timing (IF ID EX M WB) for ADD R1, R2, R3 followed by SUB R4, R1, R5, with three STALL rows.]



Types of Pipelining

Software pipelining
- Can handle complex instructions
- Allows programs to be reused

Hardware pipelining
- Helps the designer manage complexity: a complex task can be divided into smaller, more manageable pieces
- Offers higher performance



Types of Hardware Pipelines

Instruction pipeline: an instruction pipeline is very similar to a manufacturing assembly line.

Data pipeline: a data pipeline is designed to pass data from stage to stage.



Instruction Pipeline Conflicts

Conflicts are divided into two categories:
- Data conflicts
- Branch conflicts

A data conflict happens when the current instruction changes a register that the next one needs.

A branch conflict happens when the current instruction makes a jump.



Overview

Branch Penalties

Branch Prediction



Branch Penalties

Branches are a major problem because you do not know which instruction will come next until the branch instruction is executed.

The time to fill the pipeline again is known as the branch penalty.

The larger the number of stages in the pipeline, the larger the potential branch penalty.
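To get a feel for how the penalty scales, here is a back-of-the-envelope sketch. It assumes an ideal pipeline that completes one instruction per cycle and pays the penalty only on taken branches. The function name effective_cpi and all the numbers are hypothetical, chosen just for illustration.

```python
def effective_cpi(branch_fraction, taken_fraction, penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction once branch penalties are included.

    Assumptions (illustrative only):
      - without branches the pipeline completes one instruction per cycle;
      - only taken branches pay the penalty;
      - every taken branch pays the full `penalty_cycles`.
    """
    return base_cpi + branch_fraction * taken_fraction * penalty_cycles


# Hypothetical workload: 20% of instructions are branches, 60% of them taken.
for penalty in (2, 4, 8):
    cpi = effective_cpi(0.20, 0.60, penalty)
    print(f"penalty = {penalty} cycles -> CPI = {cpi:.2f}")
```

With these made-up figures the average rises from 1.24 to 1.96 cycles per instruction as the penalty grows from 2 to 8 cycles, which is why deeper pipelines depend more heavily on good branch handling.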



Branch Penalties (cont.)

Instruction 3 is a conditional branch to instruction 20, which is taken. As soon as it is executed in step 6, the pipeline is flushed (instruction 3 is able to complete) and instructions starting at #20 are loaded into the pipeline.

Note that no instructions are completed during cycles 7 through 10.



Branch Prediction: Improving Branch Performance

Static Prediction

With static branch prediction, the instruction loaded depends on the design of the pipeline. If the prediction is correct, there is no branch penalty.

A compiler has to be aware of the type of prediction used on the machine in order to optimize the machine code properly.

- Never taken: the prediction is that the branch will not be taken. Therefore, the next instructions in memory are loaded into the pipeline.
- Always taken: the prediction is that the branch will be taken, so the next instruction loaded is at the branch destination.
- Code prediction: the processor determines which prediction to use based on the instruction.
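The difference between the two fixed policies can be seen by counting wrong guesses over a trace of branch outcomes. This is a minimal sketch; the function name mispredictions and the trace are hypothetical.

```python
def mispredictions(outcomes, policy):
    """Count wrong guesses for a trace of branch outcomes.

    outcomes: list of bools, True = the branch was taken.
    policy:   "never_taken" or "always_taken".
    """
    if policy not in ("never_taken", "always_taken"):
        raise ValueError(f"unknown policy: {policy}")
    predicted = policy == "always_taken"
    return sum(1 for taken in outcomes if taken != predicted)


# Hypothetical trace: a loop-closing branch taken 9 times, then falling through.
trace = [True] * 9 + [False]
print(mispredictions(trace, "never_taken"))   # 9
print(mispredictions(trace, "always_taken"))  # 1
```

For a loop-closing branch like this one, always taken is the better fixed guess. A branch that usually skips over a block would favour never taken.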



Delayed Branch

The locations that follow a branch instruction are called branch delay slots. The instructions in the delay slots are always fetched. Therefore, we would like to arrange for them to be fully executed whether or not the branch is taken.

The objective is to place useful instructions in these slots.

The effectiveness of the delayed branch approach depends on how often it is possible to reorder instructions.



Delayed Branch

(a) Original program loop

LOOP    Shift_left    R1
        Decrement     R2
        Branch=0      LOOP
NEXT    Add           R1,R3

(b) Reordered instructions

LOOP    Decrement     R2
        Branch=0      LOOP
        Shift_left    R1
NEXT    Add           R1,R3

Figure 8.12. Reordering of instructions for a delayed branch.



Delayed Branch

Clock cycle                 1  2  3  4  5  6  7  8

Instruction
Decrement                   F  E
Branch                         F  E
Shift (delay slot)                F  E
Decrement (Branch taken)             F  E
Branch                                  F  E
Shift (delay slot)                         F  E
Add (Branch not taken)                        F  E

Figure 8.13. Execution timing showing the delay slot being filled during the last two passes through the loop in Figure 8.12.



Branch Prediction (cont.)

Better performance can be achieved if we arrange for some branch instructions to be predicted as taken and others as not taken.

- Use hardware to observe whether the target address is lower or higher than that of the branch instruction.
- Let the compiler include a branch prediction bit.

So far, the branch prediction decision is always the same every time a given instruction is executed. This is static branch prediction.
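The address comparison can be sketched in a few lines. The notes only say that hardware compares the two addresses; the mapping used below (backward branches predicted taken, forward branches predicted not taken) is a commonly used heuristic and is an assumption here, as are the function name predict_taken and the example addresses.

```python
def predict_taken(branch_address, target_address):
    """Static guess based on branch direction alone.

    Assumption: backward branches (target below the branch) usually
    close loops and are predicted taken; forward branches are
    predicted not taken.
    """
    return target_address < branch_address


print(predict_taken(0x1040, 0x1000))  # True:  backward branch, e.g. end of a loop
print(predict_taken(0x1040, 0x1080))  # False: forward branch, e.g. skipping a block
```

The prediction depends only on the instruction itself, so it is the same on every execution, which is what makes it static.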



Influence on Instruction Sets



Overview

Some instructions are much better suited to pipelined execution than others.

Addressing modes

Condition code flags



Addressing Modes

Addressing modes include simple ones and complex ones.

In choosing the addressing modes to be implemented in a pipelined processor, we must consider the effect of each addressing mode on instruction flow in the pipeline:
- Side effects
- The extent to which complex addressing modes cause the pipeline to stall
- Whether a given mode is likely to be used by compilers



Addressing Modes

In a pipelined processor, complex addressing modes do not necessarily lead to faster execution.

Advantage: they reduce the number of instructions and the program space.

Disadvantage: they cause the pipeline to stall, need more hardware to decode, and are not convenient for the compiler to work with.

Conclusion: complex addressing modes are not suitable for pipelined execution.



Condition Codes

If an optimizing compiler attempts to reorder instructions to avoid stalling the pipeline when branches or data dependencies between successive instructions occur, it must ensure that reordering does not cause a change in the outcome of a computation.

The dependency introduced by the condition-code flags reduces the flexibility available for the compiler to reorder instructions.
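One way to see the lost flexibility is to treat the condition-code flags as one more register that instructions read and write. The sketch below is a conservative check for whether two adjacent instructions may be exchanged. The instruction encoding (dicts with reads, writes, sets_flags, uses_flags), the pseudo-register name "FLAGS", and the function name can_swap are all hypothetical.

```python
def can_swap(first, second):
    """True if two adjacent instructions can be exchanged safely.

    Each instruction is a dict with:
      reads, writes          : sets of register names
      sets_flags, uses_flags : bools for the condition-code flags
    The flags are modelled as one extra register named "FLAGS".
    """
    def reads(i):
        return i["reads"] | ({"FLAGS"} if i["uses_flags"] else set())

    def writes(i):
        return i["writes"] | ({"FLAGS"} if i["sets_flags"] else set())

    return not (
        writes(first) & reads(second)      # second needs a result of first
        or reads(first) & writes(second)   # second would overwrite an input of first
        or writes(first) & writes(second)  # final value depends on the order
    )


# Two additions on disjoint registers.
a = {"reads": {"R1", "R2"}, "writes": {"R2"}, "sets_flags": False, "uses_flags": False}
b = {"reads": {"R3", "R4"}, "writes": {"R4"}, "sets_flags": False, "uses_flags": False}
print(can_swap(a, b))  # True: no shared registers, flags untouched

# The same two additions when every arithmetic instruction sets the flags.
a_flags = dict(a, sets_flags=True)
b_flags = dict(b, sets_flags=True)
print(can_swap(a_flags, b_flags))  # False: the final flag values depend on the order
```

The check is deliberately conservative. If no later instruction reads the flags, the second swap would in fact be harmless, but a compiler can only exploit that when it can prove it.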


Datapath and Control Considerations

Original Design

[Figure: Three-bus organization of the datapath.]

Pipelined Design

[Figure: Datapath modified for pipelined execution, with interstage buffers at the input and output of the ALU.]

- Separate instruction and data caches

- PC is connected to IMAR

- DMAR

- Separate MDR

- Buffers for ALU

- Instruction queue

- Instruction decoder output

Operations that can proceed independently in this design:

- Reading an instruction from the instruction cache
- Incrementing the PC
- Decoding an instruction
- Reading from or writing into the data cache
- Reading the contents of up to two registers
- Writing into one register in the register file
- Performing an ALU operation

Date post:	02-Apr-2018
Upload:	hazwan-khalid-mojam