Preliminary Transformations

Chapter 4 of Allen and Kennedy

Harel Paz

Introduction

Goal: higher dependence test accuracy.

Most dependence tests require subscript expressions to be linear or affine functions of the loop induction variables, with known constant coefficients and at most a symbolic additive constant.

Affine functions:

$$f(x_1, x_2, \ldots, x_n) = A_1 x_1 + A_2 x_2 + \cdots + A_n x_n + b$$

An Example

    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        KI = KI + INC
        U(KI) = U(KI) + W(J)
      ENDDO
      S(I) = U(KI)
    ENDDO

This is programmer-optimized code. As written, U(KI) cannot be tested for dependence.

Preliminary transformations are needed!

An Example (cont.)

    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        KI = KI + INC
        U(KI) = U(KI) + W(J)
      ENDDO
      S(I) = U(KI)
    ENDDO

INC is invariant in the inner loop.

KI is an auxiliary induction variable of the inner loop.

An Example (cont.)

    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        ! Deleted: KI = KI + INC
        U(KI + J*INC) = U(KI + J*INC) + W(J)
      ENDDO
      KI = KI + 100 * INC
      S(I) = U(KI)
    ENDDO

Induction-variable substitution replaces references to an auxiliary induction variable with direct functions of the loop index.

After the substitution, KI contains a loop-invariant value inside the inner loop.

KI is an auxiliary induction variable of the outer loop.

An Example (cont.)

    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        U(KI + (I-1)*100*INC + J*INC) = U(KI + (I-1)*100*INC + J*INC) + W(J)
      ENDDO
      ! Deleted: KI = KI + 100 * INC
      S(I) = U(KI + I * (100*INC))
    ENDDO
    KI = KI + 100 * 100 * INC

A second application of induction-variable substitution removes all assignments to KI from inside the loops.

An Example (cont.)

    INC = 2
    KI = 0
    DO I = 1, 100
      DO J = 1, 100
        U(KI + (I-1)*100*INC + J*INC) = U(KI + (I-1)*100*INC + J*INC) + W(J)
      ENDDO
      S(I) = U(KI + I * (100*INC))
    ENDDO
    KI = KI + 100 * 100 * INC

INC and KI are constant values.

An Example (cont.)

    INC = 2
    ! Deleted: KI = 0
    DO I = 1, 100
      DO J = 1, 100
        U(I*200 + J*2 - 200) = U(I*200 + J*2 - 200) + W(J)
      ENDDO
      S(I) = U(I*200)
    ENDDO
    KI = 20000

Applying constant propagation substitutes the constants.

An Example (cont.)

Applying dead code elimination removes all unused code.

    INC = 2
    DO I = 1, 100
      DO J = 1, 100
        U(I*200 + J*2 - 200) = U(I*200 + J*2 - 200) + W(J)
      ENDDO
      S(I) = U(I*200)
    ENDDO
    KI = 20000
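As a sanity check on the example, the following Python sketch (not part of the original slides) replays the original loop nest and compares every subscript of U, and the final value of KI, with the closed forms obtained after the transformations. The function names are hypothetical and chosen only for this illustration.

```python
def original_subscripts():
    """Run the original loop nest and record every subscript of U."""
    inc, ki = 2, 0
    inner, outer = [], []
    for i in range(1, 101):
        for j in range(1, 101):
            ki = ki + inc
            inner.append(ki)      # subscript in U(KI) = U(KI) + W(J)
        outer.append(ki)          # subscript in S(I) = U(KI)
    return inner, outer, ki


def transformed_subscripts():
    """Subscripts after IV substitution and constant propagation."""
    inner = [i * 200 + j * 2 - 200
             for i in range(1, 101) for j in range(1, 101)]
    outer = [i * 200 for i in range(1, 101)]
    return inner, outer, 20000    # 20000 is the propagated final KI


print(original_subscripts() == transformed_subscripts())
```

Both versions touch the same elements in the same order, which is what makes the affine form I*200 + J*2 - 200 a safe replacement for U(KI).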

Information Requirements

Preliminary transformations: induction-variable substitution, constant propagation, dead code elimination, loop normalization.

The transformations need knowledge of:
  Loop stride
  Constant-valued assignments
  Loop-invariant quantities
  Usage of variables

This knowledge is provided by data flow analysis.

Loop Normalization

Gives every loop a lower bound of 1 and a stride of 1.

Makes dependence testing as simple as possible.

Makes transformations like induction-variable substitution easier to perform.

Loop Normalization - Algorithm

Procedure normalizeLoop(L0);
  i = a unique compiler-generated LIV (loop induction variable)
  S1: replace the loop header for L0 ( DO I = L, U, S ) with the adjusted loop header DO i = 1, (U - L + S) / S;
  S2: replace each reference to I within the loop by L + (i - 1)*S;
  S3: insert a finalization assignment I = L + (i - 1)*S; immediately after the end of the loop;
end normalizeLoop;
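The three steps can be checked with a small Python simulation. This is only a sketch under the assumption of a positive stride S; the function names are hypothetical. It compares the values that I takes in the original loop with the values L + (i - 1)*S produced by the normalized loop, including the value of I after the loop exits.

```python
def original_iterations(L, U, S):
    """Values taken by I in DO I = L, U, S (positive stride assumed)."""
    values = []
    I = L
    while I <= U:
        values.append(I)
        I += S
    return values, I              # I keeps its value after loop exit


def normalized_iterations(L, U, S):
    """Same loop after normalization: DO i = 1, (U - L + S) / S."""
    trip_count = max((U - L + S) // S, 0)
    values = []
    i = 1
    while i <= trip_count:
        values.append(L + (i - 1) * S)   # step S2: I replaced by L + (i-1)*S
        i += 1
    I = L + (i - 1) * S                  # step S3: finalization assignment
    return values, I


print(original_iterations(3, 20, 4))
print(normalized_iterations(3, 20, 4))
```

The finalization assignment matters whenever I is read after the loop: without it the normalized program would leave I with a stale value.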

Loop Normalization - Caveat

Un-normalized:

    DO I = 1, M
      DO J = I, N
        A(J, I) = A(J, I - 1) + 5
      ENDDO
    ENDDO

Dependence equations: J = J', I = I' - 1

Direction vector of (<,=)

Normalized:

    DO I = 1, M
      DO J = 1, N - I + 1
        A(J + I - 1, I) = A(J + I - 1, I - 1) + 5
      ENDDO
    ENDDO

Dependence equations: J + I - 1 = J' + I' - 1, I = I' - 1

Direction vector of (<,>)

Loop Normalization - Caveat

Caveat: consider interchanging the loops.

(<,=) becomes (=,<): OK
(<,>) becomes (>,<): Problem

Handled by another transformation.
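The two direction vectors can be reproduced by brute force. The Python sketch below is an illustration only, with hypothetical helper names: it enumerates the iterations of each loop nest in execution order, tracks which iteration last wrote each element of A, and records the direction vector from that write to each later read.

```python
def direction(source, sink):
    """Direction vector from a source iteration to a sink iteration."""
    return tuple("<" if s < d else "=" if s == d else ">"
                 for s, d in zip(source, sink))


def flow_directions(iterations, write_index, read_index):
    """Collect direction vectors of all write-then-read dependences.

    iterations must be listed in execution order; in each iteration the
    statement reads its right-hand side before writing its left-hand side.
    """
    last_writer = {}
    found = set()
    for it in iterations:
        element = read_index(*it)
        if element in last_writer:
            found.add(direction(last_writer[element], it))
        last_writer[write_index(*it)] = it
    return found


def unnormalized(m, n):
    its = [(i, j) for i in range(1, m + 1) for j in range(i, n + 1)]
    return flow_directions(its, lambda i, j: (j, i), lambda i, j: (j, i - 1))


def normalized(m, n):
    its = [(i, j) for i in range(1, m + 1) for j in range(1, n - i + 2)]
    return flow_directions(its,
                           lambda i, j: (j + i - 1, i),
                           lambda i, j: (j + i - 1, i - 1))


print(unnormalized(4, 4))
print(normalized(4, 4))
```

For M = N = 4 the first call reports only (<,=) and the second only (<,>), matching the slide: normalization changed the direction vector without changing the set of array elements involved.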

Data Flow Analysis

Goal: perform the preliminary transformations.

Need: understand how data elements are created and used in a program.
  Definition-use graph.
  Static single assignment (SSA).

Data flow analysis is also heavily used in other optimizing transformations that preserve the program's meaning.

Definition-use Graph

The definition-use graph is a graph that contains an edge from each definition point in the program to every possible use of the variable at run time.

Blocks

[Figure: control flow of a Switch branching to Case 1, Case 2, and Case 3]

A basic block is a maximal group of statements such that one statement in the group is executed if and only if every statement in the group is executed.

Block’s Definition-Use Edges

Constructing definition-use edges for a basic block:

  Walk through each statement in order in the block.
  For each statement, note the variable it defines and the variables it uses.
  For each use, add an edge from the last definition of that variable in the block.
  When a new definition is encountered for a variable, it kills the existing definition.
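A minimal Python sketch of this walk is given below. It is not from the slides; it assumes a hypothetical statement encoding in which each statement is a pair (defined variable, list of used variables). Besides the edges, it returns the variables used before any definition in the block and the definitions that survive to the end of the block.

```python
def block_def_use(statements):
    """Build definition-use edges inside one basic block.

    statements: list of (target, operands); target may be None.
    Returns (edges, uses, defsout) where edges are
    (defining statement index, using statement index, variable).
    """
    last_def = {}    # variable -> index of its latest definition so far
    edges = []
    uses = set()     # variables used with no prior definition in the block
    for index, (target, operands) in enumerate(statements):
        for var in operands:
            if var in last_def:
                edges.append((last_def[var], index, var))
            else:
                uses.add(var)
        if target is not None:
            last_def[target] = index   # kills any earlier definition
    defsout = set(last_def.values())   # definitions not killed in the block
    return edges, uses, defsout


block = [
    ("x", ["a", "b"]),   # 0: x = a + b
    ("y", ["x"]),        # 1: y = x * 2
    ("x", ["y", "c"]),   # 2: x = y + c   (kills statement 0)
    ("z", ["x"]),        # 3: z = x
]
print(block_def_use(block))
```

The use of x in statement 3 is linked to statement 2, not statement 0, because the second definition killed the first.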

Definition-use Graph - Sets

The basic block computation produces the sets:

  uses(b): the set of all variables used within block b that have no prior definitions within the block.
  defsout(b): the set of all definitions within block b that are not killed within the block.
  killed(b): the set of all definitions that define variables killed by other definitions within block b.

Constructing the graph for the whole program requires:

  reaches(b): the set of all definitions from all blocks (including b) that can possibly reach b.

Definition-use Graph:Reaches Set

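Computing reaches for one block b may immediately change all other reaches sets, including b's own, since reaches(b) is an input to the other reaches equations.

$$\mathit{reaches}(b) = \bigcup_{p \in P(b)} \Bigl( \mathit{defsout}(p) \cup \bigl( \mathit{reaches}(p) \setminus \mathit{killed}(p) \bigr) \Bigr)$$

where P(b) is the set of predecessors of block b in the control flow graph.

Achieving correct solutions requires solving all the equations simultaneously. There is a workaround.

[Figure: control flow of a Switch branching to Case 1, Case 2, and Case 3]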

Definition-use Graph – Calculating reaches
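One common way to solve such a system of set equations is to iterate until nothing changes. The Python sketch below uses a worklist for this; it is an assumption about how the computation can be organized, not a transcription of the slide's algorithm, and the block and definition names are hypothetical.

```python
def compute_reaches(preds, defsout, killed):
    """Iterate reaches(b) = union over predecessors p of
    defsout(p) | (reaches(p) - killed(p)) until a fixed point."""
    succs = {b: [] for b in preds}
    for b, ps in preds.items():
        for p in ps:
            succs[p].append(b)

    reaches = {b: set() for b in preds}
    worklist = list(preds)
    while worklist:
        b = worklist.pop()
        new = set()
        for p in preds[b]:
            new |= defsout[p] | (reaches[p] - killed[p])
        if new != reaches[b]:
            reaches[b] = new
            worklist.extend(succs[b])   # successors depend on reaches(b)
    return reaches


# entry defines x (d1); the loop body B2 redefines x (d2).
preds = {"entry": [], "B1": ["entry", "B2"], "B2": ["B1"], "exit": ["B1"]}
defsout = {"entry": {"d1"}, "B1": set(), "B2": {"d2"}, "exit": set()}
killed = {"entry": {"d2"}, "B1": set(), "B2": {"d1"}, "exit": set()}
print(compute_reaches(preds, defsout, killed))
```

Because of the back edge from B2 to B1, both d1 and d2 reach the loop header, which a single pass over the blocks in program order would miss.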

Dead Code Elimination

Removes all dead code, thus making the code cleaner.

Dead code is code whose results are never used in any "useful statement".

What are useful statements? Output statements, input statements, control flow statements, and the statements they require.

Dead Code Elimination –Main Idea

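[Figure: statements are marked useful by working backward from the output statements, for example from "Output X and Y values" to X = t*2 + j, and from "output z" to the definition of z. Statements that are never marked are dead code and should be eliminated.]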
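The main idea can be sketched in Python as a mark phase followed by a sweep. This is a simplified illustration for straight-line code only, with a hypothetical statement encoding (target, operands, is_critical), where is_critical flags input, output, and control flow statements.

```python
def eliminate_dead_code(statements):
    """Keep critical statements and everything they transitively require."""
    # Link each statement to the definitions that reach its operands.
    last_def = {}
    required = []
    for index, (target, operands, _) in enumerate(statements):
        required.append([last_def[v] for v in operands if v in last_def])
        if target is not None:
            last_def[target] = index

    # Mark: start from the critical statements and walk backward.
    worklist = [i for i, stmt in enumerate(statements) if stmt[2]]
    useful = set(worklist)
    while worklist:
        index = worklist.pop()
        for definition in required[index]:
            if definition not in useful:
                useful.add(definition)
                worklist.append(definition)

    # Sweep: drop every statement that was never marked.
    return [stmt for i, stmt in enumerate(statements) if i in useful]


program = [
    ("t", [], True),             # input t
    ("x", ["t", "j"], False),    # x = t*2 + j
    ("y", ["t"], False),         # y = t + 1
    ("w", ["y"], False),         # w = y * 3   (never used: dead)
    (None, ["x", "y"], True),    # output x, y
]
print(eliminate_dead_code(program))
```

Marking from the useful statements, rather than deleting statements whose result is unused, also removes chains of dead statements that only feed each other.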

Constant Propagation

Replace all variables that have constant values (at a certain point) at runtime with those constant values.

Constant Propagation –Main Idea

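Example: if Z is known to have the constant value 15, then Y = X + Z becomes Y = X + 15.

Lattice meet rules:

$$\begin{aligned}
\text{unknown} \wedge \text{any} &= \text{any} \\
\text{non-const} \wedge \text{any} &= \text{non-const} \\
\text{const}_i \wedge \text{const}_j &=
  \begin{cases}
    \text{const}_i & \text{if } \text{const}_i = \text{const}_j \\
    \text{non-const} & \text{otherwise}
  \end{cases}
\end{aligned}$$

Values can only move down in the lattice.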

Constant Propagation - Algorithm
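The following Python sketch shows one way such an algorithm can be organized around the lattice. It is an assumption-laden illustration, not the slide's algorithm: it expects each variable to be defined exactly once (as in SSA form), supports only the hypothetical definition kinds "const", "input", "add", and "phi", and simply re-evaluates all definitions until no value changes.

```python
UNKNOWN = "unknown"       # top of the lattice
NONCONST = "non-const"    # bottom of the lattice


def meet(a, b):
    """Lattice meet: unknown ^ any = any, non-const ^ any = non-const."""
    if a == UNKNOWN:
        return b
    if b == UNKNOWN:
        return a
    if a == NONCONST or b == NONCONST:
        return NONCONST
    return a if a == b else NONCONST


def evaluate(definition, value):
    kind = definition[0]
    if kind == "const":
        return definition[1]
    if kind == "input":
        return NONCONST
    operands = [value[name] for name in definition[1:]]
    if kind == "phi":
        result = UNKNOWN
        for operand in operands:
            result = meet(result, operand)
        return result
    # kind == "add"
    if NONCONST in operands:
        return NONCONST
    if UNKNOWN in operands:
        return UNKNOWN
    return operands[0] + operands[1]


def propagate_constants(defs):
    value = {name: UNKNOWN for name in defs}
    changed = True
    while changed:
        changed = False
        for name, definition in defs.items():
            new = meet(value[name], evaluate(definition, value))
            if new != value[name]:
                value[name] = new      # values only move down the lattice
                changed = True
    return value


defs = {
    "Z": ("const", 15),
    "X": ("input",),
    "Y": ("add", "X", "Z"),     # Y = X + Z, can be rewritten as Y = X + 15
    "A": ("const", 2),
    "A2": ("const", 2),
    "B": ("add", "A", "Z"),
    "C": ("phi", "B", "A"),     # merges 17 and 2
    "D": ("phi", "A", "A2"),    # merges 2 and 2
}
print(propagate_constants(defs))
```

Taking the meet with the previous value on every update guarantees that a value never climbs back up the lattice, so the loop terminates.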

Complexity Issue

Number of definition-use edges can grow very large in presence of control flow.

[Figure: three definitions of X (statements S1, S2, S3) reach three uses of X (statements S5, S6, S7), giving 3 x 3 = 9 definition-use edges.]

Static Single-Assignment Form

SSA: a variation on the definition-use graph with the following properties:

1. Each assignment creates a different variable name.

2. Where control flow joins, a special operation is inserted to merge different incarnations of the same variable.

Benefits: Reduces the number of definition-use edges. Improves performance of algorithms.

SSA Example

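[Figure: the same example before and after conversion to SSA form. Before: S1, S2, S3 each define X, and S5, S6, S7 each use X. After: S1, S2, S3 define X1, X2, X3; the merge X4 = Φ(X1, X2, X3) is inserted where control flow joins; S5, S6, S7 each use X4.]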
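The saving can be counted directly. The Python sketch below is only an illustration with hypothetical function names; it assumes, as in the figure, that every definition reaches every use through a single join point where one Φ-node is placed.

```python
def def_use_edges(defs, uses):
    """Plain definition-use graph: every definition reaches every use."""
    return [(d, u) for d in defs for u in uses]


def ssa_edges(defs, uses, phi="X4 = PHI(X1, X2, X3)"):
    """SSA form: definitions feed the merge, and the merge feeds the uses."""
    return [(d, phi) for d in defs] + [(phi, u) for u in uses]


defs = ["S1", "S2", "S3"]
uses = ["S5", "S6", "S7"]
print(len(def_use_edges(defs, uses)), len(ssa_edges(defs, uses)))
```

With n definitions and m uses the edge count drops from n*m to n + m, which is where the performance benefit for the data flow algorithms comes from.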

Another Example

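Source loop:

    DO I = 1, N
      .....
    ENDDO

With explicit control flow:

       I = 1
    L: IF (I > N) GO TO E
       ……
       I = I + 1
       GO TO L
    E:

Each assignment creates a new name:

       I1 = 1
    L: IF (I > N) GO TO E
       ……
       I2 = I + 1
       GO TO L
    E:

A Φ-node merges the two names at the loop head:

       I1 = 1
    L: I3 = Φ(I1, I2)
       IF (I3 > N) GO TO E
       ……
       I2 = I3 + 1
       GO TO L
    E: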

Forward Expression Substitution

We will deal only with a restricted form of forward expression substitution: substituting statements whose right-hand side contains only the loop induction variable and loop-invariant variables.

    DO I = 1, 100
      K = I + 2
      A(K) = A(K) + 5
    ENDDO

becomes

    DO I = 1, 100
      A(I+2) = A(I+2) + 5
    ENDDO

Forward Expression Substitution

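Needs definition-use edges and control flow analysis.

Need to guarantee that the definition is always executed on a loop iteration before the statement into which it is substituted.

Here the definition of K is executed only on even iterations, so it cannot be substituted into A(K):

    DO I = 1, 100
      IF (MOD(I, 2) .EQ. 0) THEN
        K = I + 2
      ENDIF
      A(K) = A(K) + 5
    ENDDO

Here every use of K is preceded by a definition in the same branch, so substitution is safe:

    DO I = 1, 100
      IF (MOD(I, 2) .EQ. 0) THEN
        K = I + 2
        A(K) = A(K) + 5
      ELSE
        K = I + 1
        A(K) = A(K) + 6
      ENDIF
    ENDDO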

Forward Expression Substitution- Algorithm

In order to forward substitute expressions involving only loop-invariant variables and the loop induction variable:

  Examine each SSA edge into a statement S that is a candidate for forward substitution.
  If the edge comes from within the loop, its source must be the Φ-node for the loop induction variable, at the beginning of the loop.

       I1 = 1
    L: I3 = Φ(I1, I2)
       IF (I3 > N) GO TO E
       K = I3 + 2
       …
       I2 = I3 + 1
       GO TO L

Forward Expression Substitution- Algorithm

If a statement S can be forward substituted, examine each SSA edge whose source is S and whose target is within the loop:

  If the target is a Φ-node, do nothing.
  Otherwise, substitute rhs(S) for every occurrence of lhs(S) in the statement at the sink of the SSA edge, and update the SSA edges.

If all uses of lhs(S) are removed, S can be deleted.

If all uses of lhs(S) inside the loop are removed (but there are uses outside the loop), S should be moved outside the loop.

If not all uses of lhs(S) inside the loop are removed, try induction-variable substitution.
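A much simplified Python sketch of the substitution step follows. It is not the SSA-based algorithm above: it works on text, handles only a straight-line loop body (so every definition executes before the statements after it), and assumes each scalar is defined once and is not used after the loop, so that the definition can simply be deleted. The encoding of statements as (target, right-hand side) strings is hypothetical.

```python
import re

IDENT = r"[A-Za-z_]\w*"


def forward_substitute(body, induction_var, invariants):
    """Substitute scalar definitions that depend only on the induction
    variable and loop invariants into the statements that follow them."""
    allowed = {induction_var} | set(invariants)
    pending = {}     # scalar name -> expression to substitute
    result = []
    for target, rhs in body:
        for name, expr in pending.items():
            pattern = r"\b%s\b" % re.escape(name)
            replacement = lambda match, e=expr: "(" + e + ")"
            target = re.sub(pattern, replacement, target)
            rhs = re.sub(pattern, replacement, rhs)
        is_scalar = re.fullmatch(IDENT, target) is not None
        names_in_rhs = set(re.findall(IDENT, rhs))
        if is_scalar and names_in_rhs <= allowed:
            pending[target] = rhs          # candidate: substitute and delete
        else:
            result.append((target, rhs))
    return result


body = [("K", "I + 2"), ("A(K)", "A(K) + 5")]
print(forward_substitute(body, "I", []))
```

On the slide's example the definition K = I + 2 disappears and the subscripts become (I + 2), which is the affine form the dependence tests need.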

Preliminary Transformations

Second Part of the Lecture

Harel Paz

Last Week

Goal: high dependence test accuracy

Preliminary transformations:
  Loop normalization
  Dead code elimination
  Constant propagation
  Induction-variable substitution
  Forward expression substitution

Data flow analysis: definition-use graph and SSA

SSA - Loop Example

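Source loop:

    DO I = 1, N
      .....
    ENDDO

With explicit control flow:

       I = 1
    L: IF (I > N) GO TO E
       ……
       I = I + 1
       GO TO L
    E:

Each assignment creates a new name:

       I1 = 1
    L: IF (I > N) GO TO E
       ……
       I2 = I + 1
       GO TO L
    E:

A Φ-node merges the two names at the loop head:

       I1 = 1
    L: I3 = Φ(I1, I2)
       IF (I3 > N) GO TO E
       ……
       I2 = I3 + 1
       GO TO L
    E: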

Induction Variable Substitution

Need to recognize auxiliary induction variables.

An auxiliary induction variable in a DO loop headed by

    DO I = LB, UB, S

is any variable that can be correctly expressed as

    cexpr * I + iexpr_L

at every location L where it is used in the loop, where cexpr and iexpr_L are expressions that do not vary in the loop, although different locations in the loop may require substitution of different values of iexpr_L.

We will deal only with auxiliary induction variables defined by a statement of the form K = K ± cexpr.

Induction Variable Recognition - Main Idea

A statement S may define an auxiliary induction variable for a loop L if S is contained in a simple cycle of SSA edges that involves only S and one other statement, a Φ-node, in the loop.

  Check that the form is K = K ± cexpr.
  Check that cexpr is loop invariant.

    DO I = 1, N
      A(I) = B(K) + 1
      K = K + 4
      …
      D(K) = D(K) + A(I)
    ENDDO

[Figure: SSA graph for this loop; the recoverable node labels are A(I)=B(K)+1 and D(K)=D(K)+A(I).]
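The two checks can be approximated without building SSA form. The Python sketch below is a textual stand-in, not the SSA-cycle test itself: it accepts a scalar only if it is assigned exactly once in the loop body by a statement of the form K = K ± cexpr, and if cexpr mentions no variable that is assigned in the loop. The statement encoding and function name are hypothetical, and cexpr is conservatively restricted to a single term without + or -.

```python
import re

IDENT = r"[A-Za-z_]\w*"


def find_auxiliary_ivs(body, loop_var):
    """Return {variable: (sign, cexpr)} for statements K = K +/- cexpr."""
    scalar_defs = [t for t, _ in body if re.fullmatch(IDENT, t)]
    varying = set(scalar_defs) | {loop_var}
    found = {}
    for target, rhs in body:
        match = re.fullmatch(r"(%s)\s*([+-])\s*([^+\-]+)" % IDENT, rhs.strip())
        if not match or match.group(1) != target:
            continue                      # not of the form K = K +/- cexpr
        if scalar_defs.count(target) != 1:
            continue                      # more than one definition in the loop
        cexpr = match.group(3).strip()
        if set(re.findall(IDENT, cexpr)) & varying:
            continue                      # cexpr is not loop invariant
        found[target] = (match.group(2), cexpr)
    return found


body = [
    ("A(I)", "B(K) + 1"),
    ("K", "K + 4"),
    ("D(K)", "D(K) + A(I)"),
]
print(find_auxiliary_ivs(body, "I"))
```

Requiring a single definition plays the role of the "simple cycle" condition: with two updates of K in the loop, the SSA cycle would pass through more than one statement besides the Φ-node.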

IVSub Without Loop Normalization

    DO I = L, U, S
      K = K + N
      … = A(K)
    ENDDO

becomes

    DO I = L, U, S
      … = A(K + (I - L + S) / S * N)
    ENDDO
    K = K + (U - L + S) / S * N

Problem: inefficient code and a nonlinear subscript.

IVsub for such a loop is fruitless!

IVSub on a Normalized Loop

    DO I = L, U, S
      K = K + N
      … = A(K)
    ENDDO

Loop normalization gives:

    I = L
    DO i = 1, (U - L + S) / S, 1
      K = K + N
      … = A(K)
      I = I + S
    ENDDO

Induction-variable substitution then gives:

    I = L
    DO i = 1, (U - L + S) / S, 1
      … = A(K + i * N)
    ENDDO
    K = K + (U - L + S) / S * N
    I = I + (U - L + S) / S * S

Advantages: efficient code that is appropriate for dependence testing.

IVsub for such a loop is beneficial!
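A short Python check of the substitution on a normalized loop is sketched below, with hypothetical function names. It compares the subscripts produced by the running update K = K + N with the closed form K + i * N, together with the final value of K.

```python
def subscripts_with_aux_iv(K, N, trip_count):
    """Normalized loop that still updates K on every iteration."""
    seen = []
    for i in range(1, trip_count + 1):
        K = K + N
        seen.append(K)            # subscript in ... = A(K)
    return seen, K


def subscripts_after_ivsub(K, N, trip_count):
    """Same loop after induction-variable substitution."""
    seen = [K + i * N for i in range(1, trip_count + 1)]
    K = K + trip_count * N        # assignment placed after the loop
    return seen, K


print(subscripts_with_aux_iv(10, 3, 5))
print(subscripts_after_ivsub(10, 3, 5))
```

After the substitution the subscript is a linear function of the normalized index i, with no division by the stride.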

Summary

Transformations to put more subscripts into standard form:
  Loop normalization
  Induction-variable substitution
  Constant propagation
