CMPUT680 - Fall 2006
Topic A: Data Dependence in Loops
José Nelson Amaral
http://www.cs.ualberta.ca/~amaral/courses/680
CMPUT 680 - Compiler Design and Optimization
Reading
Wolfe, Michael, High Performance Compilers for Parallel Computing, Addison-Wesley, 1996. Chapter 5.
Randy Allen, Ken Kennedy, Optimizing Compilers for Modern Architectures: A Dependence-based Approach, Morgan Kaufmann, 200. Chapter 2.
Basic Concept and Motivation
A loop-carried data dependence occurs when a memory access in iteration i of a loop cannot occur before an access in some iteration i-k is performed.
There is a data dependence between an access a at iteration i-k and an access b at iteration i if:
  a and b access the same memory location;
  there is a path from a to b;
  either a or b is a write.
Three Types of Data Dependence
Flow dependence (δ):
  X =
  = X
Anti-dependence (δ^-1):
  = X
  X =
Output dependence (δ^0):
  X =
  X =
Data Dependence
Example 1:
S1: A = 0
S2: B = A
S3: A = B + 1
S4: C = A
[Figure: dependence graph with nodes S1, S2, S3, S4]
S2 is flow dependent on S1:
S1 δ^f S2
S1 δ S2
(Wolfe, pp. 138)
S1 is the source and S2 is the target of the dependence.
Data Dependence
S2 δ S3 : S3 is flow-dependent on S2
S1 δ^0 S3 : S3 is output-dependent on S1
S2 δ^-1 S3 : S3 is anti-dependent on S2
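The dependences of Example 1 can also be found mechanically. The following is a minimal sketch, assuming each statement is described by the single variable it writes and the set of variables it reads; the names `Stmt` and `dependences` are hypothetical and do not come from Wolfe's text. A write stops being a source once a later statement overwrites the same variable.

```python
from collections import namedtuple

# Hypothetical statement record: label, variable written, variables read.
Stmt = namedtuple("Stmt", ["label", "writes", "reads"])

def dependences(stmts):
    """List (source, kind, target) dependences in straight-line code."""
    deps = []
    for s, src in enumerate(stmts):
        write_live = True                 # src's value has not been overwritten yet
        pending_reads = set(src.reads)    # reads not yet followed by a write
        for tgt in stmts[s + 1:]:
            if write_live and src.writes in tgt.reads:
                deps.append((src.label, "flow", tgt.label))
            if tgt.writes in pending_reads:
                deps.append((src.label, "anti", tgt.label))
                pending_reads.discard(tgt.writes)
            if write_live and tgt.writes == src.writes:
                deps.append((src.label, "output", tgt.label))
                write_live = False
    return deps

example1 = [
    Stmt("S1", "A", set()),     # S1: A = 0
    Stmt("S2", "B", {"A"}),     # S2: B = A
    Stmt("S3", "A", {"B"}),     # S3: A = B + 1
    Stmt("S4", "C", {"A"}),     # S4: C = A
]
deps = dependences(example1)
for source, kind, target in deps:
    print(f"{target} is {kind}-dependent on {source}")
```

Besides the dependences listed on the slides, the sketch also reports the flow dependence of S4 on S3.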
Parameterized Dependences
DO I = 1, N
S1   A(I+1) = A(I) + B(I)
ENDDO
“Statement S1 depends upon itself.”
DO I = 1, N
S1   A(I+2) = A(I) + B(I)
ENDDO
“Statement S1 depends on an instance of itself two iterations previous.”
We need to be able to describe such dependences formally.
(Allen-Kennedy, pp. 39)
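As an informal illustration of what a formal description has to capture, the sketch below enumerates the iterations of a loop of the form `A(I+c) = A(I) + B(I)` and records how many iterations separate the write of an element from its later read. The function name `self_dependence_distances` and its parameters are assumptions made for this example only.

```python
def self_dependence_distances(offset, n):
    """Distances J - I where iteration I writes A(I+offset) and a later iteration J reads A(J)."""
    # Map each written element to the iteration that writes it.
    written_at = {i + offset: i for i in range(1, n + 1)}
    distances = set()
    for j in range(1, n + 1):
        i = written_at.get(j)          # iteration that wrote A(j), if any
        if i is not None and i < j:    # the write must come from an earlier iteration
            distances.add(j - i)
    return distances

one_back = self_dependence_distances(1, 10)   # A(I+1) = A(I) + B(I)
two_back = self_dependence_distances(2, 10)   # A(I+2) = A(I) + B(I)
print(one_back, two_back)
```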
Loop Normalization
Given a loop of the form:
DO I = L, U STEP S
  …
ENDDO
The normalized value of an iteration k can be obtained from:
Normalized(k) = (k-L+S)/S
Example:
DO I = 5, 26 STEP 3
  …
ENDDO
Iteration space:            5 8 11 14 … 26
Normalized iteration space: 1 2 3 4 … 8
(Allen-Kennedy, pp. 39)
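The normalization formula can be checked on the example loop with a few lines of code. This is a small sketch; the function name `normalized` and its parameter names are chosen here for illustration.

```python
def normalized(k, lower, step):
    """Normalized iteration number of index value k for a loop starting at lower with the given step."""
    return (k - lower + step) // step

# DO I = 5, 26 STEP 3
iteration_space = list(range(5, 26 + 1, 3))
normalized_space = [normalized(k, 5, 3) for k in iteration_space]
print(iteration_space)
print(normalized_space)
```

The same formula also works for a negative step, for instance a loop that runs from 6 down to 2 in steps of -2.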
Data Dependences
Loop carried: between two statement instances in two different iterations of a loop.
Loop independent: between two statement instances in the same loop iteration.
Lexically forward: the source comes before the target.
Lexically backward: otherwise.
The right-hand side of an assignment is considered to precede the left-hand side.
Review of Linear Algebra: Lexicographic Order
Two n-vectors a and b are equal, a = b, if a_i = b_i, 1 ≤ i ≤ n.
We say that a is less than b, a < b, if a_i < b_i, 1 ≤ i ≤ n.
We say that a is lexicographically less than b at level j, a «j b, if a_i = b_i, 1 ≤ i < j, and a_j < b_j.
We say that a is lexicographically less than b, a « b, if there is a j, 1 ≤ j ≤ n, such that a «j b.
(Wolfe, pp. 86)
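The definitions above translate directly into code. The sketch below assumes vectors are equal-length Python tuples; `lex_level` and `lex_less` are illustrative names, and the sample vectors are made up for the example.

```python
def lex_level(a, b):
    """Return the level j (1-based) such that a «j b, or None if a is not lexicographically less than b."""
    for j, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            # The first differing component decides the comparison.
            return j if x < y else None
    return None   # a == b, so neither vector is less than the other

def lex_less(a, b):
    """True if a « b at some level."""
    return lex_level(a, b) is not None

print(lex_level((1, 2, 3), (1, 2, 4)))
print(lex_less((0, 5), (1, 0)))
```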
Lexicographic Order
Example of vectors
Consider the vectors a and b below:
[Equation: 4-component column vectors a and b]
We say that a is lexicographically less than b at level 3, a «3 b, or simply that a « b.
Both a and b are lexicographically positive because 0 « a and 0 « b.
Properties of Lexicographic Order
Let n ≥ 1, and let i, j, k denote arbitrary vectors in R^n.
1. For each u in 1 ≤ u ≤ n, the relation «u in R^n is irreflexive and transitive.
2. The n relations «u are pairwise disjoint: i «u j and i «v j imply that u = v.
3. If i ≠ j, there is a unique integer u such that 1 ≤ u ≤ n and exactly one of the following two conditions holds: i «u j or j «u i.
4. i «u j and j «v k together imply that i «w k, where w = min(u, v).
Data Dependence in Loops
An Example
Find the dependence relations due to the array X in the program below:
(S1) for i = 2 to 9 do
(S2)   X[i] = Y[i] + Z[i]
(S3)   A[i] = X[i-1] + 1
(S4) end for
Solution
To find the data dependence relations in a simple loop, we can unroll the loop and see which statement instances depend on which others:
       i = 2             i = 3             i = 4
(S2)   X[2]=Y[2]+Z[2]    X[3]=Y[3]+Z[3]    X[4]=Y[4]+Z[4]
(S3)   A[2]=X[1]+1       A[3]=X[2]+1       A[4]=X[3]+1
(Wolfe, pp. 140)
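The unrolling can be mimicked with a short script. This is a sketch under the assumption that only accesses to X matter; `x_dependences` is a hypothetical helper that records, for each element of X, the iteration in which S2 writes it and the iteration in which S3 reads it.

```python
def x_dependences(lo=2, hi=9):
    """Return (element, write iteration, read iteration, kind) for every element of X that is both written and read."""
    writes = {i: i for i in range(lo, hi + 1)}        # S2 writes X[i] at iteration i
    reads = {i - 1: i for i in range(lo, hi + 1)}     # S3 reads X[i-1] at iteration i
    deps = []
    for elem in sorted(writes.keys() & reads.keys()):
        w, r = writes[elem], reads[elem]
        # S2 precedes S3 in the loop body, so a write in the same or an
        # earlier iteration reaches the read: a flow dependence.
        kind = "flow" if w <= r else "anti"
        deps.append((elem, w, r, kind))
    return deps

x_deps = x_dependences()
for elem, w, r, kind in x_deps:
    print(f"X[{elem}]: written at i={w}, read at i={r}, {kind}, distance {r - w}")
```

Every reported pair is a flow dependence whose source is one iteration before its target.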
Data Dependence in Loops
There is a loop-carried, lexically forward, flow dependence from S2 to S3.
[Figure: data dependence graph for statements in a loop, with an edge from S2 to S3 labeled (1,3)]
(1,3) := iteration distance is 1, latency is 3.
Iteration Space (an informal introduction)
Iteration space and iteration-space dependence graph
Example
Show the iteration space dependence graph for the loop in our example.
Solution
[Figure: iteration space dependence graph over iterations 0 1 2 3 4 5 6 7 8 9]
We need an abstraction for this.
Iteration Space (an informal introduction)
(S1) for i = 3 to 9 do
(S2)   X[i] = Y[i] + Z[i]
(S3)   A[i] = X[i-2] + 1
(S4)   B[i] = A[i-1] + 2
(S5) end for
Iteration Vector: a vector formed by the values of the index variable used to access an array in the loop.
For each dependency, there is an iteration vector for the source and one for the target:
i2S(X) = [3; 4; 5; 6; 7; 8; 9]
i3T(X) = [1; 2; 3; 4; 5; 6; 7]
i3S(A) = [3; 4; 5; 6; 7; 8; 9]
i4T(A) = [2; 3; 4; 5; 6; 7; 8]
[Figure: dependence graph with nodes S2, S3, S4]
Iteration Space (an informal introduction)
Distance Vector: a vector formed by the difference between the iteration vectors of the source and target of a dependency.
d(X) = i3T(X) - i2S(X) = [-2; -2; -2; -2; -2; -2; -2]
d(A) = i4T(A) - i3S(A) = [-1; -1; -1; -1; -1; -1; -1]
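The iteration and distance vectors of this informal introduction can be reproduced as follows. The sketch follows the slides' convention of listing the array index used in each iteration and subtracting source from target element by element; the variable names simply mirror the slide notation.

```python
iterations = range(3, 9 + 1)               # for i = 3 to 9

i2S_X = [i for i in iterations]            # S2 writes X[i]
i3T_X = [i - 2 for i in iterations]        # S3 reads X[i-2]
i3S_A = [i for i in iterations]            # S3 writes A[i]
i4T_A = [i - 1 for i in iterations]        # S4 reads A[i-1]

# Element-wise difference: target vector minus source vector.
d_X = [t - s for s, t in zip(i2S_X, i3T_X)]
d_A = [t - s for s, t in zip(i3S_A, i4T_A)]

print("d(X) =", d_X)
print("d(A) =", d_A)
```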
Iteration Space (an informal introduction)
Direction Vector: contains only information about the direction of the dependence, but no iteration distance information.
dir(X) = [<; <; <; <; <; <; <]
dir(A) = [<; <; <; <; <; <; <]
The elements of a direction vector are <, >, and =. Other authors use +, -, 0.
Iteration Space (an informal introduction)
Each element of the direction vector can be stored in two bits.
Given a distance vector, we can compute the direction vector, but not vice versa.
Iteration Space (an informal introduction)
Example
Show the index variable iteration vectors and normalized iteration vectors for the iterations in the loop below:
(1) for i = 2 to 6 do
(2)   for j = 6 to 2 by -2 do
(3)     A[i, j] = A[i, j+2] + 1
(4)   end for
(5) end for
Solution
Since there are two nested loops, the iteration space has two dimensions.
Iteration Space (an informal introduction)
[Figure: iteration space dependence graph corresponding to the index variable iteration vectors; axes i = 2, 3, 4, 5, 6 and j = 2, 4, 6]
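The two-dimensional iteration space of this loop nest can be enumerated directly. The sketch below lists the index variable iteration vectors in execution order, normalizes them with Normalized(k) = (k-L+S)/S, and then looks for iterations that read an element written by another iteration. The helper `normalized` and the variable names are assumptions made for this illustration.

```python
def normalized(k, lower, step):
    """Normalized iteration number of index value k."""
    return (k - lower + step) // step

# Execution order: i = 2..6 (step 1), j = 6, 4, 2 (step -2).
index_vectors = [(i, j) for i in range(2, 6 + 1) for j in range(6, 2 - 1, -2)]
normalized_vectors = [(normalized(i, 2, 1), normalized(j, 6, -2))
                      for i, j in index_vectors]

# A[i, j] is written at iteration (i, j); remember the normalized iteration.
write_iter = dict(zip(index_vectors, normalized_vectors))

# Each iteration reads A[i, j+2]; if that element is written somewhere,
# record the normalized distance (reader minus writer).
distances = set()
for (i, j), reader in zip(index_vectors, normalized_vectors):
    writer = write_iter.get((i, j + 2))
    if writer is not None:
        distances.add(tuple(r - w for w, r in zip(writer, reader)))

print(normalized_vectors)
print(distances)
```

In normalized coordinates every dependence found has the same distance, one iteration of the inner loop within the same outer iteration.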
Distance/Direction Vectors
It is often convenient to deal with incompletely specified direction vectors
Example 1: {(0, 0, 0, 1), (0, -1, 0, 1), (0, 0, 1, 1), (0, -1, 1, 1)} ==> {(0, ≤0, ≥0, 1)}
Example 2: {(0, -1, 0, -1), (0, 0, 0, -1), (0, 1, 0, -1)} ==> {(0, *, 0, -1)}
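One possible way to compute such an incompletely specified vector is to collect, position by position, the values that occur in the set. The sketch below is an assumption about how the summary could be formed, not a procedure taken from the readings; `summarize` is a hypothetical name, and `exact` reports whether the summary describes the original set and nothing else.

```python
from itertools import product

def summarize(vectors):
    """Return (summary, exact) for a collection of direction vectors over {-1, 0, 1}."""
    vectors = {tuple(v) for v in vectors}
    columns = [set(col) for col in zip(*vectors)]   # values seen at each position
    summary = []
    for col in columns:
        if col == {-1, 0, 1}:
            summary.append("*")                     # any direction
        elif len(col) == 1:
            summary.append(next(iter(col)))         # fully specified
        else:
            summary.append(frozenset(col))          # partially specified
    # The summary is exact when the cross product of the columns is the set itself.
    exact = set(product(*columns)) == vectors
    return tuple(summary), exact

ex1 = summarize([(0, 0, 0, 1), (0, -1, 0, 1), (0, 0, 1, 1), (0, -1, 1, 1)])
ex2 = summarize([(0, -1, 0, -1), (0, 0, 0, -1), (0, 1, 0, -1)])
print(ex1)
print(ex2)
```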
Distance/Direction Vectors
Let a, b denote two vectors in R^n and s their direction vector. Then a « b if and only if s has one of the following forms:
(1, *, *, …, *)
(0, 1, *, …, *)
(0, 0, 1, *, …, *)
…
(0, 0, …, 0, 1).
More precisely, a «u b for u in 1 ≤ u ≤ n if and only if s has the form with a leading 1 after (u - 1) zeros.
Notation (0, 1, -1) (=, >, <)
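The test on the form of s can be sketched in code. The example assumes that the direction vector is the component-wise sign of b - a, which is one common convention and is not spelled out on the slide; `direction` and `level_from_direction` are illustrative names.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def direction(a, b):
    """Direction vector of a and b, assumed here to be sign(b_k - a_k) per component."""
    return tuple(sign(y - x) for x, y in zip(a, b))

def level_from_direction(s):
    """Return u if s has a leading 1 after (u - 1) zeros, else None."""
    for u, component in enumerate(s, start=1):
        if component != 0:
            return u if component == 1 else None
    return None

s = direction((1, 2, 3), (1, 2, 5))
print(s, level_from_direction(s))
```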
An Example
do i = 3, 100
S:   A[2i] = B[i] + 2
T:   C[i] = D[i] + 2A[2i+1] + A[2i - 4] + A[i]
done
What are the dependences and the dependence distance vectors in the example above?
An Example
do i = 3, 100
S:    A[2i] = B[i] + 2
T1:   TEMP1 = D[i] + 2A[2i + 1]
T2:   TEMP2 = TEMP1 + A[2i - 4]
T3:   C[i] = TEMP2 + A[i]
done
iS(A) = [6; 8; 10; 12; 14; 16; …; 198; 200]
iT1(A) = [7; 9; 11; 13; 15; 17; …; 199; 201]
iT2(A) = [2; 4; 6; 8; 10; 12; …; 194; 196]
iT3(A) = [3; 4; 5; 6; 7; 8; …; 99; 100]
An Example
iS(A) = [6; 8; 10; 12; 14; 16; …; 198; 200]
iT1(A) = [7; 9; 11; 13; 15; 17; …; 199; 201]
d(T1,S) = iT1(A) - iS(A) = [1; 1; 1; …; 1]
The indices accessed by T1 and S differ by 1 in every iteration. Because S writes only even elements of A and T1 reads only odd elements, the two statements never access the same location, so this difference of 1 does not correspond to a flow dependence of T1 on S.
An Example
iS(A) = [6; 8; 10; 12; 14; 16; …; 198; 200]
iT2(A) = [2; 4; 6; 8; 10; 12; …; 194; 196]
d(T2,S) = iT2(A) - iS(A) = [-4; -4; -4; …; -4]
T2 is flow dependent on S with dependence distance -4 in the accessed indices: at iteration i, T2 reads A[2i - 4], the element that S wrote at iteration i - 2.
An Example
iS(A) = [6; 8; 10; 12; 14; 16; …; 198; 200]
iT3(A) = [3; 4; 5; 6; 7; 8; …; 99; 100]
d(T3,S) = iT3(A) - iS(A)
T3 is flow dependent on S with dependence distance (i - 2i) = -i, which varies with the iteration.
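The slides compare the accessed indices element by element. A complementary cross-check is to ask directly which iterations touch the same element of A, using the condition stated earlier that two accesses must reference the same memory location. The sketch below does this by brute force and expresses each distance in iterations (reading iteration minus writing iteration); `find_dependences` is a hypothetical helper written for this example.

```python
def find_dependences(lo=3, hi=100):
    """For each read in T1, T2, T3, collect iteration distances to the write in S that produced the element."""
    written_at = {2 * i: i for i in range(lo, hi + 1)}   # S writes A[2i] at iteration i
    read_index = {
        "T1": lambda i: 2 * i + 1,    # reads A[2i + 1]
        "T2": lambda i: 2 * i - 4,    # reads A[2i - 4]
        "T3": lambda i: i,            # reads A[i]
    }
    result = {name: set() for name in read_index}
    for name, index in read_index.items():
        for i in range(lo, hi + 1):
            w = written_at.get(index(i))    # iteration of S that wrote this element, if any
            if w is not None:
                result[name].add(i - w)     # positive: the write happened in an earlier iteration
    return result

found = find_dependences()
print("T1:", sorted(found["T1"]))
print("T2:", sorted(found["T2"]))
print("T3:", sorted(found["T3"]))
```

Under these assumptions T1 shares no element with S, T2 always reads what S wrote two iterations earlier, and T3 reads elements written a varying number of iterations earlier.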
Wolfe’s Definition
From Michael Wolfe, pg. 140:
“An anti-dependence from a statement to itself is considered lexically forward”:
Sk: x[i] = x[i+1] + 1
“A dependence is lexically forward when the source comes before the target without passing through a loop back edge”:
x[1] = x[2] + 1
(back edge)
x[2] = x[3] + 1
(back edge)
x[3] = x[4] + 1
Wolfe’s Definition
From Michael Wolfe, pg. 140:
“A self-flow dependence is lexically backward”:
Sk: x[i] = x[i-1] + 1
x[1] = x[0] + 1
(back edge)
x[2] = x[1] + 1
(back edge)
x[3] = x[2] + 1
Allen-Kennedy Definition
From Allen-Kennedy, pg. 45:
“Suppose that there is a dependence from statement S1 on iteration i of a loop nest of n loops and statement S2 on iteration j; then the dependence distance vector d(i,j) is defined as a vector of length n such that:
$d(i,j)_k = j_k - i_k$”
Allen-Kennedy Definition
From Allen-Kennedy, pg. 46:
“Suppose that there is a dependence from statement S1 on iteration i of a loop nest of n loops and statement S2 on iteration j; then the dependence direction vector D(i,j) is defined as a vector of length n such that:
$$D(i,j)_k = \begin{cases} \text{“<”} & \text{if } d(i,j)_k > 0 \\ \text{“=”} & \text{if } d(i,j)_k = 0 \\ \text{“>”} & \text{if } d(i,j)_k < 0 \end{cases}$$”
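The two Allen-Kennedy definitions can be written as a pair of small functions. This is a sketch that assumes iteration vectors are equal-length tuples of integers; the function names are chosen for illustration.

```python
def distance_vector(i, j):
    """d(i, j)_k = j_k - i_k for source iteration i and target iteration j."""
    return tuple(jk - ik for ik, jk in zip(i, j))

def direction_vector(i, j):
    """D(i, j)_k is '<' if d_k > 0, '=' if d_k == 0, and '>' if d_k < 0."""
    return tuple("<" if d > 0 else "=" if d == 0 else ">"
                 for d in distance_vector(i, j))

d = distance_vector((1, 3), (2, 1))
D = direction_vector((1, 3), (2, 1))
print(d, D)
```

The direction vector follows from the distance vector, but the distance cannot be recovered from the direction alone.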
Allen-Kennedy Definition
From Allen-Kennedy, pg. 50:
“Statement S2 has a loop-carried dependence on statement S1 if and only if S1 references location M on iteration i, S2 references M on iteration j, and d(i,j) > 0 (that is, D(i,j) contains a “<” as its leftmost non-“=” component).”
“A loop-carried dependence from statement S1 to statement S2 is said to be backward if S2 appears before S1 in the loop body or if S1 and S2 are the same statement. The carried dependence is said to be forward if S2 appears after S1 in the loop body.”
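These two statements can be turned into a small classifier. The sketch below assumes a direction vector given as a tuple of "<", "=", ">" and statement positions given as integers that increase with textual order in the loop body; `carried_level` and `classify` are hypothetical names.

```python
def carried_level(direction):
    """Return the position of the leftmost non-'=' component if it is '<', else None."""
    for level, component in enumerate(direction, start=1):
        if component != "=":
            return level if component == "<" else None
    return None

def classify(direction, s1_pos, s2_pos):
    """Classify a dependence from S1 (at s1_pos) to S2 (at s2_pos)."""
    if carried_level(direction) is None:
        return "not loop-carried"
    # Backward if S2 appears before S1 or both are the same statement.
    return "forward" if s2_pos > s1_pos else "backward"

print(carried_level(("=", "<", ">")))
print(classify(("<",), 1, 2))
print(classify(("<",), 1, 1))
```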