Date post: | 02-Apr-2018 |
Category: |
Documents |
Upload: | hazwan-khalid-mojam |
7/27/2019 Pipeline Note 2
Data Hazards
Data Hazards
We must ensure that the results obtained when instructions are executed in a pipelined processor are identical to those obtained when the same instructions are executed sequentially.
Hazard occurs:
    A ← 3 + A
    B ← 4 × A
No hazard:
    A ← 5 × C
    B ← 20 + C
When two operations depend on each other, they must be executed
sequentially in the correct order. Another example:
Mul R2, R3, R4
Add R5, R4, R6
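The dependency in the Mul/Add pair can be checked mechanically. The following is a minimal Python sketch, not part of the original notes. It assumes that the last operand of each instruction is the destination (so Mul writes R4 and Add reads it); the `Instr` type and the `has_raw_hazard` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Tuple


class Instr(NamedTuple):
    op: str
    sources: Tuple[str, ...]  # registers the instruction reads
    dest: str                 # register the instruction writes


def has_raw_hazard(first: Instr, second: Instr) -> bool:
    """Return True if `second` reads the register that `first` writes."""
    return first.dest in second.sources


# Assumption: the last operand is the destination register.
mul = Instr("Mul", ("R2", "R3"), "R4")
add = Instr("Add", ("R5", "R4"), "R6")

print(has_raw_hazard(mul, add))  # True: Add needs the R4 produced by Mul
print(has_raw_hazard(add, mul))  # False: Mul does not read R6
```

When the check returns True, the second instruction must not read its operands until the first has produced its result.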
Data Hazards
Figure 8.6. Pipeline stalled by data dependency between D2 and W1.
Data Hazards
ADD R1, R2, R3    IF  ID  EX  M   WB
SUB R4, R1, R5        IF  ID  EX  M   WB

ADD, ID: select R2 and R3 for ALU operations
ADD, EX: add R2 and R3
ADD, WB: store sum in R1
SUB, ID: select R1 and R5 for ALU operations
Stalling
Stalling involves halting the flow of
instructions until the required result is ready
to be used.
However, stalling wastes processor time: the pipeline
does nothing while it waits for the result.
Stalling (Cont.)
ADD R1, R2, R3    IF  ID  EX  M   WB
SUB R4, R1, R5        IF  ID  EX  M   WB
[Figure: three further rows each show the stages IF ID EX M WB with a STALL marker.]
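The number of stall cycles for the ADD/SUB pair can be worked out from the stage positions. The sketch below is an illustration under stated assumptions, not the notes' own method: there is no forwarding, and a register written in WB can first be read in ID on the following cycle. The names `stalls_needed` and `timeline` are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "M", "WB"]


def stalls_needed(distance: int = 1) -> int:
    """Stall cycles for a consumer issued `distance` instructions after its producer.

    Assumes no forwarding, and that the consumer's ID stage must come
    at least one cycle after the producer's WB stage.
    """
    wb = STAGES.index("WB")    # producer enters IF at cycle 0, WB at cycle 4
    id_ = STAGES.index("ID")   # consumer reads its registers in ID
    earliest_read = wb + 1     # first cycle in which the new value is readable
    unstalled_read = distance + id_
    return max(0, earliest_read - unstalled_read)


def timeline(stalls: int) -> list:
    """Stage sequence of the consumer with its stall cycles inserted after IF."""
    return ["IF"] + ["STALL"] * stalls + STAGES[1:]


print(stalls_needed(1))            # 3
print(timeline(stalls_needed(1)))  # ['IF', 'STALL', 'STALL', 'STALL', 'ID', 'EX', 'M', 'WB']
```

Under a different assumption, for example a register file that is written in the first half of a cycle and read in the second half, the count would be one lower.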
Types of Pipelining
Software Pipelining
Can handle complex instructions
Allows programs to be reused
Hardware Pipelining
Helps the designer manage complexity: a complex
task can be divided into smaller, more
manageable pieces.
Hardware pipelining offers higher performance.
Types of Hardware Pipelines
Instruction Pipeline - An instruction pipeline is
very similar to a manufacturing assembly line.
Data Pipeline - A data pipeline is designed to
pass data from stage to stage.
Instruction Pipeline Conflicts
Conflicts are divided into two categories:
Data Conflicts
Branch Conflicts
A data conflict happens when the current
instruction changes a register that the next
one needs.
A branch conflict happens when the current
instruction makes a jump.
Overview
Branch Penalties
Branch Prediction
Branch Penalties
Branches are a major problem because you do not know which instruction will come next until the branch instruction is executed.
The time to fill the pipeline again is known as the branch penalty.
The larger the number of stages in the pipeline, the larger the potential branch penalty.
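The link between pipeline depth and branch penalty can be made concrete with a small calculation. This is a rough Python sketch under simplifying assumptions that are not in the notes: one instruction is fetched per cycle, there is no branch prediction and there are no other stalls, and every instruction fetched behind a taken branch is discarded once the branch resolves. The function names and the sample numbers are hypothetical.

```python
def branch_penalty(resolve_stage: int) -> int:
    """Cycles lost per taken branch that resolves at the end of stage `resolve_stage` (1-based).

    The instructions occupying the earlier stages were fetched from the
    wrong path and are flushed.
    """
    return resolve_stage - 1


def total_cycles(n_instructions: int, n_stages: int,
                 taken_branches: int, resolve_stage: int) -> int:
    """Cycles to complete a program: pipeline fill time, one completion per cycle, plus penalties."""
    fill = n_stages - 1
    return fill + n_instructions + taken_branches * branch_penalty(resolve_stage)


# Hypothetical workload: 100 instructions, 10 of them taken branches.
print(total_cycles(100, 5, 10, resolve_stage=3))   # 124
print(total_cycles(100, 10, 10, resolve_stage=8))  # 179
```

In the second call the deeper pipeline resolves the branch later, so each taken branch costs 7 cycles instead of 2.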
Branch Penalties (Cont.)
Instruction 3 is a conditional branch to instruction 20, which is taken.
As soon as it is executed in step 6, the pipeline is flushed (instruction 3 is able to
complete) and instructions starting at #20 are loaded into the pipeline.
Note that no instructions are completed during cycles 7 through 10.
Branch Prediction: Improving Branch Performance
Static Prediction
With static branch prediction, the instruction loaded depends on the design of the pipeline. If the prediction is correct, there is no branch penalty.
A compiler has to be aware of the type of prediction used on the machine in order to optimize the machine code properly.
Never taken - The prediction is that the branch will not be taken. Therefore, the next instructions in memory are loaded into the pipeline.
Always taken - The prediction is that the branch will be taken, so the next instruction loaded is at the branch destination.
Code prediction - The processor determines which prediction to use based on the instruction.
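The three static schemes can be compared on a branch trace. The following Python sketch is illustrative only: the trace, the opcode names, and the rule used in `code_based` (predict taken for the opcodes in a fixed set) are all assumptions made for the example, not details given in the notes.

```python
from typing import Callable, List, Tuple

Branch = Tuple[str, bool]  # (opcode, was the branch actually taken?)


def never_taken(opcode: str) -> bool:
    return False


def always_taken(opcode: str) -> bool:
    return True


def code_based(opcode: str) -> bool:
    # Hypothetical rule: these opcodes are predicted taken, all others not taken.
    return opcode in {"BNE"}


def mispredictions(predict: Callable[[str], bool], trace: List[Branch]) -> int:
    """Count the branches whose outcome differs from the static prediction."""
    return sum(1 for opcode, taken in trace if predict(opcode) != taken)


# Hypothetical trace of five executed branches.
trace = [("BNE", True), ("BNE", True), ("BNE", False),
         ("BEQ", False), ("BEQ", True)]

print(mispredictions(never_taken, trace))   # 3
print(mispredictions(always_taken, trace))  # 2
print(mispredictions(code_based, trace))    # 2
```

Each misprediction costs one branch penalty, so the scheme with the fewest mispredictions on typical code is the better design choice.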
Delayed Branch
The instructions in the delay slots are always fetched. Therefore, we would like to arrange for them to be fully executed whether or not the branch is taken.
The objective is to place useful instructions in these slots.
The effectiveness of the delayed branch approach depends on how often it is possible to reorder instructions.
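The benefit of filling a delay slot can be estimated by counting fetched instructions. This is a minimal Python sketch with assumptions stated up front: a two-stage pipeline (fetch, execute), one delay slot per branch, and a no-op placed in the slot when no useful instruction can be moved there. The function names are hypothetical.

```python
def instructions_fetched(loop_body: int, iterations: int,
                         slot_filled: bool, after_loop: int = 1) -> int:
    """Instructions fetched for a loop followed by `after_loop` further instructions.

    `loop_body` counts the useful instructions per pass, including the branch.
    An unfilled delay slot adds one no-op per pass.
    """
    per_iteration = loop_body if slot_filled else loop_body + 1
    return per_iteration * iterations + after_loop


def cycles(fetched: int, stages: int = 2) -> int:
    """Clock cycles to complete `fetched` instructions in a pipeline with no stalls."""
    return fetched + stages - 1


# Loop of three instructions (shift, decrement, branch), two passes, then one Add.
print(cycles(instructions_fetched(3, 2, slot_filled=True)))   # 8
print(cycles(instructions_fetched(3, 2, slot_filled=False)))  # 10
```

With the slot filled, the two passes and the final Add take eight clock cycles, which agrees with the eight cycles shown in Figure 8.13.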
Delayed Branch
(a) Original program loop
LOOP    Shift_left    R1
        Decrement     R2
        Branch=0      LOOP
NEXT    Add           R1,R3

(b) Reordered instructions
LOOP    Decrement     R2
        Branch=0      LOOP
        Shift_left    R1
NEXT    Add           R1,R3

Figure 8.12. Reordering of instructions for a delayed branch.
Delayed Branch
Clock cycle                 1  2  3  4  5  6  7  8
Decrement                   F  E
Branch                         F  E
Shift (delay slot)                F  E
Decrement (Branch taken)             F  E
Branch                                  F  E
Shift (delay slot)                         F  E
Add (Branch not taken)                        F  E

Figure 8.13. Execution timing showing the delay slot being filled
during the last two passes through the loop in Figure 8.12.
Branch Prediction (Cont.)
Better performance can be achieved if we arrange for some branch instructions to be predicted as taken and others as not taken.
Use hardware to observe whether the target address is lower or higher than that of the branch instruction.
Let the compiler include a branch prediction bit.
So far, the branch prediction decision is the same every time a given instruction is executed. This is static branch prediction.
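Both per-branch schemes above reduce to a one-line decision. The Python sketch below is an illustration, not the notes' own design. It assumes the common heuristic that a branch to a lower address (typically the bottom of a loop) is predicted taken, and it assumes a hypothetical instruction encoding in which the compiler's prediction bit is bit 0 of the instruction word.

```python
def predict_by_direction(branch_addr: int, target_addr: int) -> bool:
    """Predict taken when the target lies below the branch (a backward branch).

    Assumption: backward branches usually close loops, so they are predicted taken.
    """
    return target_addr < branch_addr


def predict_by_bit(instruction_word: int, bit: int = 0) -> bool:
    """Predict taken when the compiler-set prediction bit is 1.

    The bit position is hypothetical; a real encoding defines its own.
    """
    return (instruction_word >> bit) & 1 == 1


print(predict_by_direction(0x1040, 0x1000))  # True: backward branch
print(predict_by_direction(0x1040, 0x1080))  # False: forward branch
print(predict_by_bit(0b1011))                # True
print(predict_by_bit(0b1010))                # False
```

In both cases the prediction for a given branch instruction never changes at run time, which is what makes the scheme static.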
Influence on Instruction Sets
Overview
Some instructions are much better suited to
pipeline execution than others.
Addressing modes
Condition-code flags
Addressing Modes
Addressing modes include simple ones and complex ones.
In choosing the addressing modes to be implemented in a pipelined processor, we must consider the effect of each addressing mode on instruction flow in the pipeline:
Side effects
The extent to which complex addressing modes cause the pipeline to stall
Whether a given mode is likely to be used by compilers
Addressing Modes
In a pipelined processor, complex addressing modes do
not necessarily lead to faster execution.
Advantage: they reduce the number of instructions and the program space.
Disadvantages: they can cause the pipeline to stall, need more hardware to
decode, and are not convenient for compilers to work with.
Conclusion: complex addressing modes are not suitable
for pipelined execution.
Condition Codes
If an optimizing compiler attempts to reorder instructions to avoid stalling the pipeline when branches or data dependencies between successive instructions occur, it must ensure that reordering does not cause a change in the outcome of a computation.
The dependency introduced by the condition-code flags reduces the flexibility available for the compiler to reorder instructions.
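One way to see the lost flexibility is to treat the condition-code flags as an extra register that some instructions write implicitly. The Python sketch below is an illustration under that assumption. The `Op` type, the `can_swap` function, the register name `CC`, and the sample instructions are hypothetical.

```python
from typing import FrozenSet, NamedTuple


class Op(NamedTuple):
    name: str
    reads: FrozenSet[str]
    writes: FrozenSet[str]


def can_swap(a: Op, b: Op) -> bool:
    """Adjacent instructions a, b may be swapped only if neither one
    reads or writes a location that the other one writes."""
    a_conflict = a.writes & (b.reads | b.writes)
    b_conflict = b.writes & a.reads
    return not a_conflict and not b_conflict


compare = Op("Compare R3,R4", frozenset({"R3", "R4"}), frozenset({"CC"}))

# An add that leaves the flags alone versus one that sets them.
add_plain = Op("Add R1,R2", frozenset({"R1", "R2"}), frozenset({"R2"}))
add_flags = Op("Add R1,R2", frozenset({"R1", "R2"}), frozenset({"R2", "CC"}))

print(can_swap(add_plain, compare))  # True: no shared locations
print(can_swap(add_flags, compare))  # False: both write the flags
```

If every arithmetic instruction sets the flags, almost no pair passes this test, which is why the compiler has more freedom when flag updates are optional.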
Datapath and Control Considerations
Original Design
[Figure: Three-bus organization of the datapath.]
Pipelined Design
[Figure: Datapath modified for pipelined execution, with interstage buffers at the input and output of the ALU.]
- Separate instruction and data caches
- PC is connected to IMAR
- DMAR
- Separate MDR
- Buffers for ALU
- Instruction queue
- Instruction decoder output
Operations that can proceed independently in the pipelined design:
- Reading an instruction from the instruction cache
- Incrementing the PC
- Decoding an instruction
- Reading from or writing into the data cache
- Reading the contents of up to two registers
- Writing into one register in the register file
- Performing an ALU operation