FADA: Fuzzy Array Dataflow Analysis.
ADaAn: Array Dataflow Analyzer.
Work directed by: D. Barthou and S. Touati
PRiSM laboratory - University of Versailles
ARPA-informal seminar
19/02/2008
By : M. [email protected]
2/24
Introduction
• Technological barrier reached (soon?)
• Parallelization
  – Automatic (icc, …)
    • Bad detection [Padua, 2001, 2006]
  – Led by directives (OpenMP, MPI, …)
• Parallelism detection
  – We need an efficient dataflow analysis method.
3/24
Outline
1. FADA
   – Exact analysis
   – Fuzzy analysis (FADA)
   – FADA vs. state of the art
2. Applications & parallel work
3. Implementation: ADaAn
4. Conclusion
4/24
Dependence Analysis, the Evolution
[Figure: timeline from 1990 to 2000 of dependence analysis methods]
• Classical tests: ZIV, MIV, GCD, Banerjee test, …
• Parametric Integer Programming (PIP): Feautrier’s exact analysis
• Parametric solver (Omega): exact analysis (PETIT)
• Adaptation of PETIT
• Region analysis (PIPS)
• Hybrid analysis (Polaris)
• Fuzzy analysis: FADA
5/24
for (i=1; i<=N; i++){
  for (j=1; j<=M; j++){
S0: T[i,j]=0;
    for (k=1; k<=L; k++)
S1:   T[i,j] = T[i,j]+ A[i,k]*B[k,j];
  }
}
Exact Analysis (Feautrier’s modeling)
Array Dataflow Analysis (ADA)
P. Feautrier. “Dataflow analysis of array and scalar references.” International Journal of Parallel Programming, 20(1):23-53, 1991.
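Which operation writes the value of T[i,j] read by S1 during iteration (i_r, j_r, k_r)?

It may come from S1 during (i_w, j_w, k_w), under the constraints:
\[
\begin{aligned}
(1,1,1) &\le (i_r, j_r, k_r) \le (N, M, L) \\
(1,1,1) &\le (i_w, j_w, k_w) \le (N, M, L) \\
(i_w, j_w) &= (i_r, j_r) \\
(i_w, j_w, k_w) &\prec_{lex} (i_r, j_r, k_r)
\end{aligned}
\]
Candidate source from S1:
\[
\text{if } (N \ge 1 \wedge M \ge 1 \wedge L \ge 1 \wedge k_r \ge 2) \text{ then } S_1 : (i_r, j_r, k_r - 1) \text{ else } \bot
\]

It may come from S0 during (i_w, j_w), under the constraints:
\[
\begin{aligned}
(1,1,1) &\le (i_r, j_r, k_r) \le (N, M, L) \\
(1,1) &\le (i_w, j_w) \le (N, M) \\
(i_w, j_w) &= (i_r, j_r) \\
(i_w, j_w) &\prec_{lex} (i_r, j_r, k_r)
\end{aligned}
\]
Candidate source from S0:
\[
\text{if } (N \ge 1 \wedge M \ge 1 \wedge L \ge 1) \text{ then } S_0 : (i_r, j_r) \text{ else } \bot
\]

The exact source of T[i,j] read by S1 during iteration (i_r, j_r, k_r):
\[
\begin{aligned}
&\text{if } (N \ge 1 \wedge M \ge 1 \wedge L \ge 1) \\
&\quad \text{if } (k_r \ge 2) \text{ then } S_1 : (i_r, j_r, k_r - 1) \\
&\quad \text{else } S_0 : (i_r, j_r) \\
&\text{else } \bot
\end{aligned}
\]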
ADA works only for static control programs
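The closed-form source above can be checked by brute force on small bounds. The following is a minimal sketch, not Feautrier's algorithm: it simply replays the loop nest, records the last writer of each T[i,j], and compares it with the closed form. The function names `last_writers`, `closed_form` and `check` are illustrative assumptions.

```python
def last_writers(N, M, L):
    """Replay the loop nest and record, for each read of T[i,j] in S1,
    which statement instance wrote the value being read."""
    last = {}      # (i, j) -> most recent writer of T[i,j]
    sources = {}   # (i, j, k) -> writer observed by the read in S1
    for i in range(1, N + 1):
        for j in range(1, M + 1):
            last[(i, j)] = ("S0", i, j)              # S0: T[i,j] = 0
            for k in range(1, L + 1):
                sources[(i, j, k)] = last[(i, j)]    # S1 reads T[i,j]
                last[(i, j)] = ("S1", i, j, k)       # S1 writes T[i,j]
    return sources


def closed_form(i, j, k):
    """Source given by the exact analysis (valid when N, M, L >= 1)."""
    return ("S1", i, j, k - 1) if k >= 2 else ("S0", i, j)


def check(N, M, L):
    """True if the replayed sources match the closed form everywhere."""
    return all(src == closed_form(*it)
               for it, src in last_writers(N, M, L).items())
```

Such a replay only works because every loop bound and subscript is known, which is exactly the static control restriction stated above.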
6/24
J.-F. Collard, D. Barthou, P. Feautrier. “Fuzzy array dataflow analysis.” ACM Symp. on Principles and Practice of Parallel Programming, 30(8):92-101, Aug. 1995.
FADA (Fuzzy Array Dataflow Analysis)
for (i=1; i<=N; i++)
if( c(i) )
S0: A = … ;
else
S1: A = …;
endif
S2: … = …A;
endfor
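ADA-like modeling
\[
1 \le i_r \le N, \qquad 1 \le i_w \le N, \qquad c(i_w) = \text{true}, \qquad i_w \preceq_{lex} i_r
\]
Parameterized solution
[Formula: parameterized source of the form "if (…) S0:(…) else …", with a guard on N and i_r]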
Can <S0,iw> be the source of ‘A’ read by <S2,ir> ?
7/24
Reducing Fuzziness
\[ S = \{\, x \mid A \cdot x \le B \;\wedge\; c(x) \,\} \] (red and blue elements)
\[ S' = \{\, x \mid A \cdot x \le B \,\} \] (blue elements)
FADA’s advanced analyses
• Structural Analysis
• Iterative Analysis
• Translating Properties
8/24
for (i=1; i<=N; i++)
if( c(i) )
S0: A = … ;
else
S1: A = …;
endif
S2: … = …A;
endfor
Structural Analysis
FADA can deduce that the value of A read by S2 is produced during the same iteration, by S0 or S1.
Structural property of an if-then-else construct: one and only one branch is executed during a given iteration.
FADA proves that there is no dependence carried by the i-loop.
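The structural property can be illustrated by replaying the loop for an arbitrary condition c(i) and checking that the writer seen by S2 always belongs to the reader's own iteration. This is a hedged sketch: `trace_sources` and `carried_by_loop` are hypothetical helper names, and the replay stands in for FADA's symbolic reasoning rather than reproducing it.

```python
def trace_sources(N, c):
    """Replay the loop; return {reader iteration: (writer statement, writer iteration)}."""
    last = None
    sources = {}
    for i in range(1, N + 1):
        if c(i):
            last = ("S0", i)     # S0: A = ...
        else:
            last = ("S1", i)     # S1: A = ...
        sources[i] = last        # S2 reads A
    return sources


def carried_by_loop(N, c):
    """True if some read in S2 gets its value from an earlier iteration."""
    return any(writer_iter != reader_iter
               for reader_iter, (_, writer_iter) in trace_sources(N, c).items())
```

Whatever predicate is passed as `c`, exactly one branch runs before S2, so `carried_by_loop` returns False.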
9/24
Iterative Analysis
for (i=1; i<=N; i++){
if (A[i-1])
S0: …=…B[i-1];
if(!A[i])
S1: B[i]=…;
}
Iterative analysis: compare two non-affine constraints by comparing the sources of the variables they reference (here, A-cells).
FADA’s inference:
• There may be a conflict on B[i] between S0 during iteration i+1 and S1 during iteration i.
• FADA compares the if-conditions and deduces:
• S0 cannot be executed during iteration i+1 if S1 was executed during iteration i. (for k>0 or k<1)
FADA can deduce:
The source of the B-cells read by S0 cannot be an S1 operation.
Improved version
A[i+k]=…;
FADA proves that there is no dependence at all.
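For the loop at the top of this slide, where A is not modified inside the loop, the mutual exclusion of the two conditions can be checked exhaustively on small sizes. The sketch below is an illustration under that assumption; `flow_from_s1_to_s0` and `no_flow_for_any_condition` are hypothetical names, and the enumeration replaces FADA's symbolic comparison of the if-conditions.

```python
from itertools import product


def flow_from_s1_to_s0(A):
    """A is a list of booleans indexed 0..N. Replay the loop and return the
    (writer iteration, reader iteration) pairs where S0 reads a B cell
    previously written by S1."""
    N = len(A) - 1
    written_by = {}    # index of B -> iteration of S1 that wrote it
    flows = []
    for i in range(1, N + 1):
        if A[i - 1]:                       # S0: ... = ... B[i-1]
            if (i - 1) in written_by:
                flows.append((written_by[i - 1], i))
        if not A[i]:                       # S1: B[i] = ...
            written_by[i] = i
    return flows


def no_flow_for_any_condition(N):
    """Enumerate every boolean content of A[0..N] and check that no flow exists."""
    return all(not flow_from_s1_to_s0(list(bits))
               for bits in product([False, True], repeat=N + 1))
```

S1 writes B[i] only when A[i] is false, while S0 reads B[i] during iteration i+1 only when A[i] is true, so the list of flows stays empty for every input.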
10/24
Translating Properties
• A demonstrator, with external/internal knowledge (iterative analysis, structural analysis, …)
  – Desired cases:
    • Obtain trivial values “true/false”
    • Interpret all non-affine constraints (and solve the rest using a parametric solver)
[Formula: example properties involving a function f, the indices i and j, and the bounds N and M]
11/24
FADA, a global view
[Figure: Program → basic analysis → parameterized definitions → advanced analyses → exact definitions]
Advanced analyses:
• Structural analysis
• Iterative analysis
• Translating properties
12/24
Hybrid Analysis
Rus & Rauchwerger’s work
[Figure: Program → Regions RO(i), WR(i), RW(i) (as USRs) → DS {out, anti, flow} → PDAG test "DS = {}": Yes → parallel version, No → sequential version]
USR: Uniform Sets of References
PDAG: Predicate Directed Acyclic Graph
DS: Dependence Set
RO: Read Only
WR: Write first (write, read)
RW: Read first (read, write)
• Statically:
Build the regions, the DS, and the predicate "DS = empty"
• Dynamically:
Check the predicate and branch to the correct version
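The static/dynamic split can be illustrated with a toy run-time test. This is only a sketch of the general idea (evaluate a "DS is empty" predicate, then branch), not Rus and Rauchwerger's USR/PDAG machinery. The loop body `data[writes[it]] = data[reads[it]] + 1` and the names `independent` and `run` are assumptions made for the example.

```python
def independent(writes, reads):
    """Run-time predicate 'DS is empty': no cell written by one iteration is
    read or written by a different iteration."""
    writer = {}
    for it, w in enumerate(writes):
        if w in writer:                    # output dependence
            return False
        writer[w] = it
    # flow or anti dependence: a cell read here is written by another iteration
    return all(writer.get(r, it) == it for it, r in enumerate(reads))


def run(data, writes, reads):
    """Check the predicate, then branch to the matching version of the loop."""
    data = list(data)
    if independent(writes, reads):
        # Any iteration order is legal; reversed order stands in for a parallel run.
        order, version = reversed(range(len(writes))), "parallel"
    else:
        order, version = range(len(writes)), "sequential"
    for it in order:
        data[writes[it]] = data[reads[it]] + 1
    return version, data
```

For example, `run([0, 0, 0], [1, 2], [0, 1])` falls back to the sequential version, because the second iteration reads the cell written by the first.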
13/24
Hybrid Analysis Vs. Fuzzy Analysis
Criterion | HA | FADA
Input restrictions | no while-loop support (warning); no aliasing | no aliasing
Method | region-computation based | instance-wise, seek-definitions based
Assertion | code generation |
Theoretically | well defined | less defined
Interprocedural | natural | passes by summarizing
Albert does not agree; I think that the recurrence operator (on USR sets) does not handle while loops (the operator requires upper and lower bounds).
14/24
FADA Vs. State of the art
Tool | While and if-then construct handling | Comparing non-affine entities | Dynamic analysis
PIPS | Yes | No | No
PETIT | Yes | No | No
FADA | Yes | Yes, and all analysis is done statically | No
HA | Yes | Yes, but the comparison is performed dynamically | Yes
15/24
Outline
1. FADA
2. Applications & parallel work
   1. Parallelism detection
   2. Improving communications
   3. Source-to-source transformations
3. Implementation : ADaAn
4. Conclusion
16/24
FADA’s Applications
1. Parallelism detection
for (i=1; i<=N; i++)
if( c(i) )
S0: A = … ;
else
S1: A = …;
endif
S3: … = …A;
endfor
for (i=1; i<=N; i++){
if (A[i-1]){
…=…B[i-1];
}
A[i-2]=…;
if(!A[i]){
B[i]=…;
}
}
Example 1 (first loop): no dependence carried by the i-loop.
Example 2 (second loop): no dependence at all.
17/24
Applications
2. Improve synchronizations and communications
#pragma omp parallel cyclic
for (i=1; i<=N; i++){
  j=0;
  while (f(A[i,j])){
    B[a[f(i)],b[j]] = … ;
    j++;
  }
}
#pragma omp parallel cyclic
for (i=1; i<=N; i++){
  j=0;
  while (f(A[i,j])){
    …= … B[a[f(i)],b[j]] ;
    j++;
  }
}
#pragma omp nowait
We can remove the implicit barrier
18/24
Applications
3. Source-to-source transformations
#pragma unroll(2) merge(while)
for(…){
j=0;
while(q(i,j)){
a[i]= … a[i];
j++;
}
}
for(…){
j=0;
while(q(i,j) && q(i+1,j)){
a[i]=…a[i];
a[i+1]=…a[i+1];
j++;
}
while(q(i,j)){
a[i]= … a[i];
j++;
}
while(q(i+1,j)){
a[i+1]= … a[i+1];
j++;
}
}
A simplified deep-jam example
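The equivalence of the two versions can be checked on a concrete instance. The sketch below makes several assumptions that are not on the slide: the outer loop runs over i = 0..n-1 with step 2 in the jammed version, q(i, j) is the threshold test j < limits[i], and the elided update is replaced by the stand-in a[i] = 2*a[i] + 1. An odd trailing iteration is handled by a plain loop.

```python
def update(x):
    """Stand-in for the elided right-hand side '... a[i]' (an assumption)."""
    return 2 * x + 1


def original(limits, a):
    """Original loop nest, with q(i, j) taken to be j < limits[i]."""
    a = list(a)
    for i in range(len(limits)):
        j = 0
        while j < limits[i]:
            a[i] = update(a[i])
            j += 1
    return a


def jammed(limits, a):
    """Unrolled-by-2 version with the two while loops merged, then residual loops."""
    a = list(a)
    n = len(limits)

    def q(i, j):
        return j < limits[i]

    for i in range(0, n - 1, 2):
        j = 0
        while q(i, j) and q(i + 1, j):     # merged part
            a[i] = update(a[i])
            a[i + 1] = update(a[i + 1])
            j += 1
        while q(i, j):                     # residual iterations of row i
            a[i] = update(a[i])
            j += 1
        while q(i + 1, j):                 # residual iterations of row i+1
            a[i + 1] = update(a[i + 1])
            j += 1
    if n % 2:                              # leftover row when n is odd
        j = 0
        while q(n - 1, j):
            a[n - 1] = update(a[n - 1])
            j += 1
    return a
```

With a threshold condition, at most one of the two residual loops executes, which is why sharing the counter j between them is safe here. For a general q this does not hold, and a dependence analysis is needed to validate the transformation.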
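19/24
Irregular code transformation
[Figure: graph G with nodes S0 and S1 and labels X(j), Y(i); transformed graph G’ with nodes S0’, S0’’, S1’, S1’’ and labels X1(j’), X2(j’’), X3(j’’’), X4(j’’’’), Y1(i’), Y2(i’’), Y3(i’’’), Y4(i’’’’)]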
Main idea
Are dependences preserved? Check i = affine_relation(i’, i’’, i’’’, i’’’’)
• True: validate the step
• False: reject the transformation
• Demonstration not achieved: generate appropriate code to correct the transformed program
[Formula: Y(i) expressed through four combinations of Y1(i’), Y2(i’’), Y3(i’’’), Y4(i’’’’)]
20/24
Outline
1. FADA
2. Applications & parallel work
3. ADaAn
   1. Input
   2. Preprocessing
   3. Output
4. Conclusion
21/24
ADaAn (Array Dataflow Analyzer)
Overview
[Figure: ADaAn pipeline. Program → Preprocessing → FADA’s basic analysis (BeeCl@ck, PIPlib, Polylib) → FADA’s advanced analyses (SWI-PROLOG) → Definitions, Verification code, Dependence graph]
Input file
PROGRAM ADaAn_input;
BEGIN
for(i:1:N){
S0: j=0; /*labeled statement*/
[do whith j] while(C(i,j)){
if (a[i,j]){
A[i,j] += B[i,j];
}else{
A[i,b[j]] = 5;
};
j++;
};
};
END
22/24
ADaAn (Array Dataflow Analyzer)
Preprocessing step : Automatic symbolic constants identification
C=…;
for (i:1:N){
…=…i;
C=…C;
};
C1=C…;
for (i:N-50:N){
…=…C1;
};
C=…;
for (i:1:N){
…=…i;
C=…C;
};
C=C…;
for (i:N-50:N){
…=…C;
};
Preprocessing step : Advanced symbolic constants identification
for(i:1:N){
j=0;
while (C(i,j)){
…=…T[i];
…=…V;
j++;
};
V=…;
};
Preprocessing step : Single loop counter ID and loop normalization
…
for (i:1:N){
…=…i;
};
…
for (i0:0:ub(…)-lb(…)){
…=…i0+lb(…);
};
…
for (i:1:N){
…=…i;
};
…
for (i:lb(…):ub(…)){
…=…i;
};
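Loop normalization can be checked with a small sketch. It assumes the normalized counter i0 runs from 0 to ub - lb inclusive, so that the original index is recovered as i0 + lb. The function names are illustrative, and lb and ub stand for the values of the elided bound expressions.

```python
def original_indices(lb, ub):
    """Values taken by i in 'for (i:lb:ub)' (bounds inclusive)."""
    return [i for i in range(lb, ub + 1)]


def normalized_indices(lb, ub):
    """Values of the original index rebuilt from the normalized counter i0."""
    return [i0 + lb for i0 in range(0, ub - lb + 1)]
```

Both functions enumerate the same sequence for any pair of bounds, including empty ranges where ub < lb.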
23/24
ADaAn
Output
24/24
ADaAn
• Achieved
  – Basic analysis
  – Structural analysis
• In progress
  – Parametric PROLOG-like demonstrator (for iterative analysis and translating properties)