Chapter One: Introduction to Pipelined Processors
Principle of Designing Pipeline Processors
(Design Problems of Pipeline Processors)
Job Sequencing and Collision Prevention
• Consider the reservation table given below, with an initiation made at t = 0
Stage  0    1    2    3    4    5
Sa     A    -    -    -    -    A
Sb     -    A    -    A    -    -
Sc     -    -    A    -    A    -
• Consider the next initiation made at t = 1
• The second initiation fits into the reservation table without any collision
Stage  0    1    2    3    4    5    6    7
Sa     A1   A2   -    -    -    A1   A2   -
Sb     -    A1   A2   A1   A2   -    -    -
Sc     -    -    A1   A2   A1   A2   -    -
• Now consider the case where the first initiation is made at t = 0 and the second at t = 2.
• Here the markings A1 and A2 fall in the same time unit of the same stage. This is called a collision, and it must be avoided
Stage  0     1     2     3     4     5     6     7
Sa     A1    -     A2    -     -     A1    -     A2
Sb     -     A1    -     A1A2  -     A2    -     -
Sc     -     -     A1    -     A1A2  -     A2    -
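The collision check described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names `RESERVATION_TABLE` and `colliding_stages` are hypothetical, and the exact columns used by Sb and Sc (t = 1, 3 and t = 2, 4) are an assumption; only the spacing of 2 between their markings is certain.

```python
# One set of busy time units per stage for a single initiation.
# Assumption: Sb is used at t = 1, 3 and Sc at t = 2, 4; only the spacing
# of 2 between their markings is certain.
RESERVATION_TABLE = {"Sa": {0, 5}, "Sb": {1, 3}, "Sc": {2, 4}}


def colliding_stages(table, latency):
    """Return {stage: times} where two initiations `latency` apart overlap."""
    collisions = {}
    for stage, times in table.items():
        shifted = {t + latency for t in times}  # markings of the second job
        overlap = times & shifted
        if overlap:
            collisions[stage] = sorted(overlap)
    return collisions


print(colliding_stages(RESERVATION_TABLE, 1))  # {}
print(colliding_stages(RESERVATION_TABLE, 2))  # {'Sb': [3], 'Sc': [4]}
```

An empty result means the second initiation fits; a non-empty result names the stages and time units where A1 and A2 would overlap.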
Terminologies
• Latency: Time difference between two initiations in units of clock period
• Forbidden Latency: Latencies resulting in collision
• Forbidden Latency Set: Set of all forbidden latencies
General Method of finding Latency
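Considering all initiations (A1 to A6, made at t = 0 to t = 5):
Stage  0     1     2     3     4     5     6     7     8     9     10
Sa     A1    A2    A3    A4    A5    A6A1  A2    A3    A4    A5    A6
Sb     -     A1    A2    A1A3  A2A4  A3A5  A4A6  A5    A6    -     -
Sc     -     -     A1    A2    A1A3  A2A4  A3A5  A4A6  A5    A6    -
• A1 collides with A3 (latency 2) in Sb and Sc, and with A6 (latency 5) in Sa
• Forbidden latencies are 2 and 5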
Shortcut Method of finding Latency
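• For each stage, take the distances between the markings in its row (including 0); the forbidden latency set is the union of these sets
• Forbidden Latency Set = {0, 5} ∪ {0, 2} ∪ {0, 2} = {0, 2, 5}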
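The shortcut method amounts to collecting, row by row, the distances between markings. A minimal sketch, assuming a dictionary representation of the reservation table (the names are hypothetical, and the Sb and Sc column positions are assumed; only their spacing affects the result):

```python
from itertools import combinations

# Assumed marking positions; only the spacing within each row matters here.
RESERVATION_TABLE = {"Sa": {0, 5}, "Sb": {1, 3}, "Sc": {2, 4}}


def forbidden_latencies(table):
    """Union, over all stages, of the distances between markings in a row."""
    forbidden = {0}  # two initiations at the same time always collide
    for times in table.values():
        for earlier, later in combinations(sorted(times), 2):
            forbidden.add(later - earlier)
    return forbidden


print(sorted(forbidden_latencies(RESERVATION_TABLE)))  # [0, 2, 5]
```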
Terminologies
• Initiation Sequence: Sequence of time units at which initiations can be made without causing a collision
• Example (for the reservation table above): {0, 1, 4, 7, …}
• Latency Sequence: Sequence of latencies between successive initiations
• Example: {1, 3, 3, …}
• For a reservation table, the number of valid initiation sequences and latency sequences is infinite
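To make the two definitions concrete, the sketch below derives a latency sequence from a list of initiation times and checks the times against the forbidden latency set {0, 2, 5}. The function names and the sample times are illustrative assumptions. Note that the check compares every pair of initiations, not only successive ones, because any two jobs whose start times differ by a forbidden latency collide.

```python
FORBIDDEN = {0, 2, 5}  # forbidden latency set of the example reservation table


def latency_sequence(initiations):
    """Latencies between successive initiation times."""
    return [later - earlier for earlier, later in zip(initiations, initiations[1:])]


def is_collision_free(initiations, forbidden):
    """True if no pair of initiations is separated by a forbidden latency."""
    return all(
        later - earlier not in forbidden
        for index, earlier in enumerate(initiations)
        for later in initiations[index + 1:]
    )


times = [0, 1, 4, 7, 8]  # hypothetical initiation times
print(latency_sequence(times))               # [1, 3, 3, 1]
print(is_collision_free(times, FORBIDDEN))   # True
print(is_collision_free([0, 2], FORBIDDEN))  # False
```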
• Initiation Rate (IR): The average number of initiations made per unit time
– It is a positive fraction, and its maximum value is 1
• Average Latency (AL): The average of the latencies in a given latency sequence
– AL = 1/IR
• Latency Cycle: Among the infinitely many possible latency sequences, the periodic ones are significant, e.g. {1, 3, 3, 1, 3, 3, …}
• The subsequence that repeats itself is called the latency cycle, e.g. {1, 3, 3}
• Period of cycle: The sum of the latencies in a latency cycle (1 + 3 + 3 = 7)
• Average Latency: The average taken over the latency cycle (AL = 7/3 = 2.33)
• To design a pipeline, we need a control strategy that maximizes the throughput (number of results per unit time)
• Maximizing throughput is equivalent to minimizing AL
• Control Strategy
– Initiate the pipeline as specified by a latency sequence.
– A control strategy for a latency sequence that is aperiodic in nature is impossible to design
• Thus the design problem is to arrive at a latency cycle having minimal average latency.
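The quantities defined for a latency cycle (period, average latency and initiation rate) follow directly from the list of latencies. A small sketch using exact fractions; the function name `cycle_metrics` is a hypothetical helper:

```python
from fractions import Fraction


def cycle_metrics(cycle):
    """Return (period, average latency, initiation rate) of a latency cycle."""
    period = sum(cycle)
    average_latency = Fraction(period, len(cycle))
    initiation_rate = 1 / average_latency  # AL = 1/IR
    return period, average_latency, initiation_rate


period, al, ir = cycle_metrics((1, 3, 3))
print(period, al, ir)       # 7 7/3 3/7
print(round(float(al), 2))  # 2.33
```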
• Stage Utilization Factor (SUF): The SUF of a particular stage is the fraction of time units in which the stage is used while following a latency sequence.
• Example: Consider 5 initiations of function A as below
Stage  0    1    2    3    4    5    6    7    8    9    10   11   12   13
Sa     A1   A2   -    -    A3   A1   A2   A4   A5   A3   -    -    A4   A5
Sb     -    A1   A2   A1   A2   A3   -    A3   A4   A5   A4   A5   -    -
Sc     -    -    A1   A2   A1   A2   A3   -    A3   A4   A5   A4   A5   -
• The SUF of stage Sa is the number of markings present along Sa divided by the time interval over which the markings are counted.
• SUF(Sa) = SUF(Sb) = SUF(Sc) = 10/14
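The count of 10 markings in 14 time units can be reproduced by marking the busy time units of each stage. This is a sketch under assumptions: the initiation times 0, 1, 4, 7, 8 and the Sb and Sc column positions are inferred for illustration, and the names are hypothetical.

```python
from fractions import Fraction

# Assumed marking positions for Sb and Sc; Sa uses t = 0 and t = 5.
RESERVATION_TABLE = {"Sa": {0, 5}, "Sb": {1, 3}, "Sc": {2, 4}}


def stage_utilization(table, initiations, duration):
    """Fraction of the first `duration` time units in which each stage is busy."""
    utilization = {}
    for stage, times in table.items():
        busy = {start + t for start in initiations for t in times}
        busy = {t for t in busy if t < duration}  # count only the given window
        utilization[stage] = Fraction(len(busy), duration)
    return utilization


suf = stage_utilization(RESERVATION_TABLE, [0, 1, 4, 7, 8], 14)
print(suf["Sa"], suf["Sb"], suf["Sc"])  # 5/7 5/7 5/7
```

`Fraction` reduces 10/14 to 5/7, so the printed value is the same ratio in lowest terms.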
• Let SU(i) be the stage utilization factor of stage i
• Let N(i) be the number of markings against stage i in the reservation table
• Suppose we initiate the pipeline with initiation rate IR; then SU(i) is given by
$$SU(i) = \frac{(\text{No. of initiations made over a given period}) \times N(i)}{\text{Duration of the period}}$$
• For the SUF example:
$$SU(a) = \frac{5 \times 2}{14}$$
• Minimum Average Latency (MAL)
• Thus SU(i) = IR × N(i)
• SU(i) ≤ 1 ⇒ IR × N(i) ≤ 1 ⇒ N(i) ≤ 1/IR ⇒ N(i) ≤ AL
• Therefore, taking the maximum over the k stages,
$$\text{MAL} \ge \max_{1 \le i \le k} N(i)$$
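The bound can be checked numerically for the running example, where every stage has N(i) = 2. The sketch below is illustrative only; `MARKS_PER_STAGE`, `utilization` and `mal_lower_bound` are hypothetical names.

```python
from fractions import Fraction

MARKS_PER_STAGE = {"Sa": 2, "Sb": 2, "Sc": 2}  # N(i) for each stage


def utilization(initiation_rate, marks):
    """SU(i) = IR x N(i) for every stage."""
    return {stage: initiation_rate * n for stage, n in marks.items()}


def mal_lower_bound(marks):
    """Lower bound on the MAL: the largest number of markings in any row."""
    return max(marks.values())


ir = Fraction(3, 7)  # initiation rate of the latency cycle (1, 3, 3)
print(utilization(ir, MARKS_PER_STAGE)["Sa"])  # 6/7
print(mal_lower_bound(MARKS_PER_STAGE))        # 2
```

With IR = 3/7 each stage is busy 6/7 of the time, which respects SU(i) ≤ 1, and the average latency 7/3 is above the bound of 2.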
State Diagram
• Suppose a pipeline is initially empty and an initiation is made at t = 0.
• We now need to check whether an initiation is possible at t = i, for i > 0.
• The bit bi is used to record whether an initiation is possible
• bi = 1: initiation not possible
• bi = 0: initiation possible
i     0    1    2    3    4    5
bi    1    0    1    0    0    1
• The above binary representation (binary vector) is called the collision vector (CV)
• The collision vector obtained after making the first initiation is called the initial collision vector (ICV): ICVA = (101001)
• The state diagram is the graphical representation of the states (CVs) that a pipeline can reach and of the relations between them
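As a minimal sketch, the ICV can be written down directly from the forbidden latency set {0, 2, 5}: bit bi is 1 exactly when latency i is forbidden, with b0 as the leftmost bit. The function name and the string representation are illustrative choices.

```python
def initial_collision_vector(forbidden, length):
    """Bit string b0 b1 ... with bi = 1 when latency i is forbidden."""
    return "".join("1" if i in forbidden else "0" for i in range(length))


# The reservation table has 6 columns, so the vector has 6 bits.
print(initial_collision_vector({0, 2, 5}, 6))  # 101001
```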
• States (CVs) are denoted by nodes
• The node representing CVt-1 is connected to the node for CVt by a directed arc from CVt-1 to CVt. The node for CV* is connected in the same way, with a * on the arc
Procedure to draw state diagram
1. Start with the ICV.
2. For each unprocessed state, say CVt-1, do as follows:
a) Find CVt from CVt-1 by the following steps:
   1. Left shift CVt-1 by 1 bit
   2. Drop the leftmost bit
   3. Append the bit 0 at the right-hand end
b) If the 0th (leftmost) bit of CVt is 0, then obtain CV* by logically ORing CVt with the ICV.
c) If the state CVt does not already exist, make a new node for it. Join CVt-1 to CVt with an arc.
d) If CV* exists, repeat step (c) for CV*, but mark the arc with a *.
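The procedure above can be sketched as a breadth-first construction over bit strings. This is one possible reading of the procedure, not a definitive implementation: each state gets a plain arc to its left-shifted successor and, when its 0th bit is zero, a starred arc to CV*. All function names are hypothetical.

```python
from collections import deque


def left_shift(cv):
    """Drop the leftmost bit and append 0 at the right-hand end."""
    return cv[1:] + "0"


def bitwise_or(a, b):
    """Logical OR of two equal-length bit strings."""
    return "".join("1" if "1" in pair else "0" for pair in zip(a, b))


def build_state_diagram(icv):
    """Map each state to its outgoing arcs as (next_state, starred) pairs."""
    arcs = {}
    pending = deque([icv])
    while pending:
        cv = pending.popleft()
        if cv in arcs:  # state already processed
            continue
        out = [(left_shift(cv), False)]  # plain arc: one time unit passes
        if cv[0] == "0":  # 0th bit is zero, so an initiation is allowed
            out.append((bitwise_or(cv, icv), True))  # starred arc to CV*
        arcs[cv] = out
        pending.extend(state for state, _ in out)
    return arcs


diagram = build_state_diagram("101001")
print(len(diagram))       # 14
print(diagram["010010"])  # [('100100', False), ('111011', True)]
```

For ICV = 101001 this reaches 14 distinct states, including the all-zero state with its self-loop.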
State Diagram
• Start with the ICV: 101001
• Left shift: 101001 → 010010. The 0th bit is zero, so CV* exists: ICV 101001 OR CVi 010010 = CV* 111011 (arc marked *)
• Left shift: 010010 → 100100 (no CV*) and 111011 → 110110 (no CV*)
• Left shift: 100100 → 001000. The 0th bit is zero, so CV* exists: ICV 101001 OR CVi 001000 = CV* 101001, which is the ICV itself
• Left shift: 110110 → 101100 (no CV*)
• Left shift: 001000 → 010000. The 0th bit is zero, so CV* exists: ICV 101001 OR CVi 010000 = CV* 111001
• Left shift: 101100 → 011000. The 0th bit is zero, so CV* exists: ICV 101001 OR CVi 011000 = CV* 111001, which already exists
• Left shift: 011000 → 110000 (no CV*), 010000 → 100000 (no CV*) and 111001 → 110010 (no CV*)
• Left shift: 110000 → 100000, 110010 → 100100 and 100000 → 000000. For 000000 the 0th bit is zero, so CV* exists and is the ICV 101001
• Left shift: 000000 → 000000 (self-loop)
• The completed diagram has 14 states: 101001, 010010, 111011, 100100, 110110, 001000, 101100, 010000, 111001, 011000, 110000, 100000, 110010 and 000000
• From the above diagram, closed loops can be identified as latency cycles.
• To find the latencies corresponding to a loop, start at any * and count the states passed (equivalently, the unmarked arcs traversed) before the next * is encountered; each count is one latency. Continue until the initial * is reached again.
• Latency cycles identified from the loops of the diagram: (3), (1, 3, 3), (4, 3), (1, 6), (1, 7), (4), (6), (7)
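The latency cycles read from the diagram can be cross-checked by simulating the collision vector. The sketch below is an illustrative check, with hypothetical function names: waiting one time unit is a left shift, and an initiation is allowed only when the 0th bit is zero. A cycle is accepted if repeating it from the ICV never causes a collision.

```python
ICV = "101001"


def bitwise_or(a, b):
    """Logical OR of two equal-length bit strings."""
    return "".join("1" if "1" in pair else "0" for pair in zip(a, b))


def initiate(state, latency, icv):
    """Wait `latency` time units, then initiate; None means a collision."""
    for _ in range(latency):
        state = state[1:] + "0"  # one left shift per time unit
    if state[0] == "1":
        return None
    return bitwise_or(state, icv)


def is_latency_cycle(cycle, icv):
    """Check that repeating `cycle` forever from the ICV never collides."""
    state, seen = icv, set()
    while state not in seen:  # stop once a cycle-boundary state repeats
        seen.add(state)
        for latency in cycle:
            state = initiate(state, latency, icv)
            if state is None:
                return False
    return True


listed = [(3,), (1, 3, 3), (4, 3), (1, 6), (1, 7), (4,), (6,), (7,)]
print(all(is_latency_cycle(c, ICV) for c in listed))  # True
print(is_latency_cycle((2,), ICV))                    # False
```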
• The state with all zeros has a self-loop, which corresponds to an empty pipeline. It is possible to wait there for an indefinite number of time units, giving latency cycles of the form (1, 8), (1, 9), (1, 10), etc.
• Simple Cycle: A latency cycle in which each state is encountered only once.
• Complex Cycle: A latency cycle that consists of more than one simple cycle.
• It is enough to look for simple cycles
• In the above example, the cycle that offers the MAL is (1, 3, 3) (MAL = (1 + 3 + 3)/3 = 2.33)
• Thus we have
$$\text{MAL} \ge \max_{1 \le i \le k} N(i) = 2$$
• A cycle arrived at in this way is called a greedy cycle: it minimizes the latency between successive initiations
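A greedy cycle can be found by always taking the smallest collision-free latency from the current state until a state repeats. The sketch below assumes the bit-string representation of collision vectors used in these slides, and its function names are hypothetical. For other reservation tables a greedy cycle may not be the one with minimal average latency, so this is a way to find a candidate rather than a guaranteed optimum.

```python
from fractions import Fraction


def bitwise_or(a, b):
    """Logical OR of two equal-length bit strings."""
    return "".join("1" if "1" in pair else "0" for pair in zip(a, b))


def initiate(state, latency, icv):
    """Wait `latency` time units, then initiate; None means a collision."""
    for _ in range(latency):
        state = state[1:] + "0"
    if state[0] == "1":
        return None
    return bitwise_or(state, icv)


def greedy_cycle(icv):
    """Always take the smallest collision-free latency until a state repeats."""
    state, position, latencies = icv, {}, []
    while state not in position:
        position[state] = len(latencies)
        latency, nxt = 1, initiate(state, 1, icv)
        while nxt is None:  # terminates: after len(icv) shifts the vector is all zeros
            latency += 1
            nxt = initiate(state, latency, icv)
        latencies.append(latency)
        state = nxt
    return tuple(latencies[position[state]:])


cycle = greedy_cycle("101001")
print(cycle)                             # (1, 3, 3)
print(Fraction(sum(cycle), len(cycle)))  # 7/3
```

For ICV = 101001 the greedy choice reproduces the cycle (1, 3, 3) with average latency 7/3 ≈ 2.33, which sits above the lower bound of 2.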