Intermediate Code Generation

CS3012: Formal Languages and Compilers


Why use intermediate code?

analysis independent from target language

optimisation independent from target language

porting to new machines requires only a change of one component of the compiler

We will generate three-address code, using syntax-directed definitions.


Three Address Code

Statements in this language are of the form:

x := y op z

where x, y and z are names, constants or compiler-generated temporary variables, and op stands for any operator.

A more complicated statement like

d := a+b*c

would have to be translated to

t1 := b * c
d := a + t1

where t1 is a compiler-generated temporary variable.
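The translation above can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of expressions and the function name to_three_address are assumptions made for the sketch, not part of the course material.

```python
from itertools import count


def to_three_address(target, expr):
    """Flatten a nested expression into three-address statements.

    expr is either a name (str) or a tuple (op, left, right).
    This encoding is hypothetical, chosen only for the sketch.
    """
    temps = count(1)   # source of temporary names t1, t2, ...
    code = []

    def walk(node):
        # A plain name needs no code; its "place" is the name itself.
        if isinstance(node, str):
            return node
        op, left, right = node
        left_place = walk(left)
        right_place = walk(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} := {left_place} {op} {right_place}")
        return temp

    if isinstance(expr, str):
        code.append(f"{target} := {expr}")
    else:
        # The outermost operator assigns directly to the target.
        op, left, right = expr
        left_place = walk(left)
        right_place = walk(right)
        code.append(f"{target} := {left_place} {op} {right_place}")
    return code


# d := a + b * c
print("\n".join(to_three_address("d", ("+", "a", ("*", "b", "c")))))
```

Each inner operator gets its own temporary, so every emitted statement has at most one operator on its right-hand side.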


expression: a := b * c + b / c

postfix: a b c * b c / + :=

syntax tree (children indented under their parent):

:=
  a
  +
    *
      b
      c
    /
      b
      c

three-address code:

t1 := b * c
t2 := b / c
t3 := t1 + t2
a := t3


Three-address statements

x := y op z              assignment
x := op y                unary assignment
x := y                   copy
goto L                   unconditional jump
if x relop y goto L      conditional jump
param x                  procedure call
call p n                 procedure call
return y                 procedure call
x := y[i]                indexed assignment
x[i] := y                indexed assignment
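To see how these statement forms behave, here is a minimal interpreter sketch for a subset of them (assignment, copy, jumps and labels). The tuple encoding and the function name run are assumptions made for this sketch; procedure calls and indexed assignments are left out.

```python
import operator

# Hypothetical tuple encoding, one tuple per three-address statement:
#   ("assign", x, y, op, z)    x := y op z
#   ("copy", x, y)             x := y
#   ("label", L)               L:
#   ("goto", L)                goto L
#   ("if", x, relop, y, L)     if x relop y goto L
BINOPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
RELOPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq, ">": operator.gt}


def run(program, env):
    """Execute a list of statements, updating and returning env."""
    labels = {s[1]: i for i, s in enumerate(program) if s[0] == "label"}

    def value(operand):
        # An operand is an integer constant or a variable name.
        return operand if isinstance(operand, int) else env[operand]

    pc = 0
    while pc < len(program):
        stmt = program[pc]
        kind = stmt[0]
        pc += 1
        if kind == "assign":
            _, x, y, op, z = stmt
            env[x] = BINOPS[op](value(y), value(z))
        elif kind == "copy":
            _, x, y = stmt
            env[x] = value(y)
        elif kind == "goto":
            pc = labels[stmt[1]]
        elif kind == "if":
            _, x, relop, y, label = stmt
            if RELOPS[relop](value(x), value(y)):
                pc = labels[label]
        # "label" statements are only jump targets and do nothing themselves.
    return env


# sum := 1 + 2 + ... + n
program = [
    ("copy", "sum", 0),
    ("copy", "i", 1),
    ("label", "L1"),
    ("if", "i", ">", "n", "L2"),
    ("assign", "sum", "sum", "+", "i"),
    ("assign", "i", "i", "+", 1),
    ("goto", "L1"),
    ("label", "L2"),
]
print(run(program, {"n": 4})["sum"])
```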


A Syntax-directed Translation

To generate three-address code from source, we will use syntax-directed definitions.

First, we will consider the language of assignments and expressions.

S will have one attribute "code", which will contain the three-address code fragment of the assignment.

E will have two attributes:

code - the corresponding code fragment
place - the name that will hold the value corresponding to E.

The notation gen(x ":=" y "+" z) represents

x := y + z

The notation <fragment> || expr means concatenate the expression onto the end of the code fragment.


1) S -> id := E
   S.code := E.code || gen(id.place ":=" E.place)

2) E1 -> E2 + E3
   E1.place := newtemp();
   E1.code := E2.code || E3.code || gen(E1.place ":=" E2.place "+" E3.place)

3) E1 -> E2 * E3
   E1.place := newtemp();
   E1.code := E2.code || E3.code || gen(E1.place ":=" E2.place "*" E3.place)

4) E1 -> -E2
   E1.place := newtemp();
   E1.code := E2.code || gen(E1.place ":=" "uminus" E2.place)

5) E1 -> ( E2 )
   E1.place := E2.place;
   E1.code := E2.code

6) E -> id
   E.place := id.place;
   E.code := ""


a := b * c + b * -c

Parse tree (children indented under their parent):

S
  a
  :=
  E8n
    E3n
      E1n
        b
      *
      E2n
        c
    +
    E7n
      E4n
        b
      *
      E6n
        -
        E5n
          c


Constructing the Attributes

        place   code
E1n     b
E2n     c
E3n     t1      E1n.code || E2n.code || t1 := b * c
E4n     b
E5n     c
E6n     t2      E5n.code || t2 := uminus c
E7n     t3      E4n.code || E6n.code || t3 := b * t2
E8n     t4      E3n.code || E7n.code || t4 := t1 + t3
S               E8n.code || a := t4
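The attribute computation above can be sketched in Python, with each expression returning its (place, code) pair. The class name Translator and the tuple encoding of the tree are assumptions of this sketch. It also assumes that a parenthesised expression simply reuses the place of its inner expression.

```python
class Translator:
    """Computes the synthesized attributes place and code bottom-up."""

    def __init__(self):
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def expr(self, node):
        """Return (place, code) for an expression node."""
        kind = node[0]
        if kind == "id":                       # E -> id
            return node[1], []
        if kind == "paren":                    # E1 -> ( E2 )
            return self.expr(node[1])
        if kind == "uminus":                   # E1 -> -E2
            place2, code2 = self.expr(node[1])
            place = self.newtemp()
            return place, code2 + [f"{place} := uminus {place2}"]
        # E1 -> E2 + E3  or  E1 -> E2 * E3
        op, e2, e3 = node
        place2, code2 = self.expr(e2)
        place3, code3 = self.expr(e3)
        place = self.newtemp()
        # List concatenation plays the role of the || operator.
        return place, code2 + code3 + [f"{place} := {place2} {op} {place3}"]

    def assign(self, name, node):              # S -> id := E
        place, code = self.expr(node)
        return code + [f"{name} := {place}"]


# a := b * c + b * -c
tree = ("+",
        ("*", ("id", "b"), ("id", "c")),
        ("*", ("id", "b"), ("uminus", ("id", "c"))))
print("\n".join(Translator().assign("a", tree)))
```

The printed statements use the same temporaries t1 to t4 as the table, because the sub-expressions are visited in the same left-to-right, bottom-up order.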


Flow of Control

We can extend that syntax-directed definition to handle flow of control statements:

S1 -> while E do S2

S1.begin := newlabel();
S1.after := newlabel();
S1.code := gen(S1.begin ":") || E.code ||
           gen("if" E.place "= 0 goto" S1.after) ||
           S2.code ||
           gen("goto" S1.begin) ||
           gen(S1.after ":")

The attributes "begin" and "after" will hold labels, and newlabel() will return a new label.


labels        code

              ...
S1.begin :    E.code
              if E.place = 0 goto S1.after
              S2.code
              goto S1.begin
S1.after :    ...
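The layout above can be produced by a short Python sketch. The class name Gen and the label names L1, L2, ... are assumptions of the sketch. The code for E and S2 is passed in as ready-made lists of statements.

```python
class Gen:
    """Generates the code layout for S1 -> while E do S2."""

    def __init__(self):
        self.nlabels = 0

    def newlabel(self):
        self.nlabels += 1
        return f"L{self.nlabels}"

    def while_stmt(self, cond_place, cond_code, body_code):
        begin = self.newlabel()    # S1.begin
        after = self.newlabel()    # S1.after
        return ([f"{begin}:"]
                + cond_code
                + [f"if {cond_place} = 0 goto {after}"]
                + body_code
                + [f"goto {begin}", f"{after}:"])


# while i do i := i - 1
# E is just the name i, so E.code is empty and E.place is "i".
loop = Gen().while_stmt("i", [], ["t1 := i - 1", "i := t1"])
print("\n".join(loop))
```

The condition code is placed after the begin label so that E is re-evaluated on every iteration.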


Looking up the Symbol Table

1) S -> id := E
   p := lookup(id.name);
   if p ≠ nil then emit(p ":=" E.place)
   else error

2) E1 -> E2 + E3
   E1.place := newtemp();
   emit(E1.place ":=" E2.place "+" E3.place)

3) E1 -> E2 * E3
   E1.place := newtemp();
   emit(E1.place ":=" E2.place "*" E3.place)

4) E1 -> -E2
   E1.place := newtemp();
   emit(E1.place ":= uminus" E2.place)

5) E1 -> ( E2 )
   E1.place := E2.place

6) E -> id
   p := lookup(id.name);
   if p ≠ nil then E.place := p
   else error


res := a * (alpha + -b)

Assume res, a, alpha and b have already been declared, and placed in the symbol table:

index   lexptr      token   attributes
5       -> res      ID_T
6       -> a        ID_T
7       -> alpha    ID_T
8       -> b        ID_T


processed string          attributes          output

res := a
res := E1                 E1.place = <6>
res := E1 * (alpha
res := E1 * (E2           E2.place = <7>
res := E1 * (E2 + -b
res := E1 * (E2 + -E3     E3.place = <8>
res := E1 * (E2 + E4      E4.place = <9>      <9> := uminus <8>
res := E1 * (E5           E5.place = <10>     <10> := <7> + <9>
res := E1 * (E5)
res := E1 * E6            E6.place = <10>
res := E7                 E7.place = <11>     <11> := <6> * <10>
S                                             <5> := <11>
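The trace above can be reproduced with a small Python sketch of the emit-based rules. The class SymbolTable, the function translate and the tuple encoding of the tree are assumptions of the sketch. Starting the indices at 5 only mirrors the numbering of the example, and temporaries are assumed to be allocated as new symbol table entries.

```python
class SymbolTable:
    def __init__(self, names, first_index=5):
        # first_index=5 only mirrors the numbering used in the example.
        self.index = {name: first_index + k for k, name in enumerate(names)}
        self.next_index = first_index + len(names)

    def lookup(self, name):
        return self.index.get(name)    # None plays the role of nil

    def newtemp(self):
        # Assumption: a temporary is simply the next free table entry.
        place = self.next_index
        self.next_index += 1
        return place


def translate(table, target, tree):
    """Emit three-address code for target := tree, using table indices."""
    output = []

    def expr(node):
        kind = node[0]
        if kind == "id":                          # E -> id
            p = table.lookup(node[1])
            if p is None:
                raise NameError(f"undeclared identifier {node[1]}")
            return p
        if kind == "paren":                       # E1 -> ( E2 )
            return expr(node[1])
        if kind == "uminus":                      # E1 -> -E2
            place2 = expr(node[1])
            place = table.newtemp()
            output.append(f"<{place}> := uminus <{place2}>")
            return place
        op, e2, e3 = node                         # E1 -> E2 op E3
        place2 = expr(e2)
        place3 = expr(e3)
        place = table.newtemp()
        output.append(f"<{place}> := <{place2}> {op} <{place3}>")
        return place

    place = expr(tree)                            # S -> id := E
    p = table.lookup(target)
    if p is None:
        raise NameError(f"undeclared identifier {target}")
    output.append(f"<{p}> := <{place}>")
    return output


# res := a * (alpha + -b)
tree = ("*", ("id", "a"),
        ("paren", ("+", ("id", "alpha"), ("uminus", ("id", "b")))))
print("\n".join(translate(SymbolTable(["res", "a", "alpha", "b"]), "res", tree)))
```

Unlike the version with a code attribute, nothing is concatenated here: each statement is emitted as soon as its production is reduced.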


Arrays

We will store the elements of an array in a block of consecutive locations.

A is an array

w is the width of each element

low is the lower bound on the index

base is the address of A

The ith element of A begins at location:

base + (i - low) * w

or

i * w + c,   where c = base - (low * w)

Since c does not depend on i, it can be computed once, when A is declared.

We then store c with A in the symbol table, and the address of A[i] then is c + (i * w)
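A quick numerical check of the two forms of the address formula, as a Python sketch. The function names and the sample values (base 1000, lower bound 1, width 4) are chosen only for illustration.

```python
def precompute_c(base, low, w):
    """The part of the address that does not depend on the index."""
    return base - low * w


def element_address(c, i, w):
    """Address of A[i], using the stored constant c."""
    return c + i * w


base, low, w = 1000, 1, 4
c = precompute_c(base, low, w)

for i in range(low, low + 5):
    direct = base + (i - low) * w
    print(i, direct, element_address(c, i, w))
```

With c stored in the symbol table, each access costs one multiplication and one addition, instead of a subtraction, a multiplication and an addition.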


Multi-dimensional Arrays

We will consider arrays stored row by row

low1 is the lower bound on the first index

low2 is the lower bound on the second index

n2 is the number of values the second index can take (its upper bound - low2 + 1)

The address of A[i,j] is:

base + ((i - low1) * n2 + (j - low2)) * w

or

((i * n2) + j) * w + (base - ((low1 * n2) + low2) * w)
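The same check for the two-dimensional case, as a Python sketch. It assumes n2 is the number of elements in each row, and the function names and sample values are chosen only for illustration.

```python
def precompute_c2(base, low1, low2, n2, w):
    """The part of the address that does not depend on i or j."""
    return base - (low1 * n2 + low2) * w


def element_address2(c, i, j, n2, w):
    """Address of A[i,j] for a row-major array, using the constant c."""
    return (i * n2 + j) * w + c


# A[1..2, 1..3] with 4-byte elements, stored from address 1000.
base, low1, low2, n2, w = 1000, 1, 1, 3, 4
c = precompute_c2(base, low1, low2, n2, w)

for i in (1, 2):
    for j in (1, 2, 3):
        direct = base + ((i - low1) * n2 + (j - low2)) * w
        print((i, j), direct, element_address2(c, i, j, n2, w))
```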


Grammar of Array References

The obvious grammar for indexing array elements is:

L -> id [ Elist ] | id
Elist -> Elist , E | E

We will use, however, a different grammar, that allows us to build up the index limits as we construct the Elists:

L -> Elist ] | id
Elist -> Elist , E | id [ E

We also need:

attributes:
Elist.ndim - number of dimensions
Elist.place - temp value
Elist.array - the symbol table entry of the array being indexed
L.place - position in symbol table
L.offset - offset into the array

functions:
limit(array,i) - the limit of the ith dimension of the array
c(array) - returns the pre-computed constant c of the address formula
width(array) - returns w


The syntax-directed definition

1) S -> L := E
   if L.offset = null
   then emit(L.place ":=" E.place)
   else emit(L.place "[" L.offset "] :=" E.place)

2) E1 -> E2 + E3
   E1.place := newtemp();
   emit(E1.place ":=" E2.place "+" E3.place)

3) E1 -> (E2)
   E1.place := E2.place

4) E -> L
   if L.offset = null
   then E.place := L.place
   else E.place := newtemp();
        emit(E.place ":=" L.place "[" L.offset "]")

5) L -> Elist ]
   L.place := newtemp();
   L.offset := newtemp();
   emit(L.place ":=" c(Elist.array));
   emit(L.offset ":=" Elist.place "*" width(Elist.array))


6) L -> id
   L.place := id.place;
   L.offset := null

7) Elist1 -> Elist2 , E
   t := newtemp();
   m := Elist2.ndim + 1;
   emit(t ":=" Elist2.place "*" limit(Elist2.array, m));
   emit(t ":=" t "+" E.place);
   Elist1.array := Elist2.array;
   Elist1.place := t;
   Elist1.ndim := m

8) Elist -> id [ E
   Elist.array := id.place;
   Elist.place := E.place;
   Elist.ndim := 1
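To see what rules 4, 5, 7 and 8 emit for a reference such as x := A[i, j], here is a Python sketch that applies them in the order a bottom-up parse would. The names ArrayInfo, Emitter, array_ref and load are assumptions of the sketch, and the array values (limits 2 and 3, width 4, c = 984) are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class ArrayInfo:
    limits: list   # limits[m - 1] stands for limit(array, m)
    width: int     # stands for width(array)
    c: int         # stands for c(array)


class Emitter:
    def __init__(self):
        self.code = []
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def emit(self, line):
        self.code.append(line)


def array_ref(em, arr, index_places):
    """Return (L.place, L.offset) for id [ E , E , ... ]."""
    # Rule 8: Elist -> id [ E
    place, ndim = index_places[0], 1
    # Rule 7: Elist1 -> Elist2 , E  (once per further index)
    for e_place in index_places[1:]:
        t = em.newtemp()
        m = ndim + 1
        em.emit(f"{t} := {place} * {arr.limits[m - 1]}")
        em.emit(f"{t} := {t} + {e_place}")
        place, ndim = t, m
    # Rule 5: L -> Elist ]
    l_place = em.newtemp()
    l_offset = em.newtemp()
    em.emit(f"{l_place} := {arr.c}")
    em.emit(f"{l_offset} := {place} * {arr.width}")
    return l_place, l_offset


def load(em, l_place, l_offset):
    """Rule 4 (E -> L) for an indexed L: fetch the element into a temp."""
    e_place = em.newtemp()
    em.emit(f"{e_place} := {l_place}[{l_offset}]")
    return e_place


em = Emitter()
A = ArrayInfo(limits=[2, 3], width=4, c=984)
place, offset = array_ref(em, A, ["i", "j"])
value = load(em, place, offset)
em.emit(f"x := {value}")
print("\n".join(em.code))
```

The index arithmetic (i * n2 + j) is built up one dimension at a time, and the constant c is only added in when the closing bracket is seen.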


Type Conversion

We have seen before how to compute the type expression for complex expressions using more than one data type.

It is the job of the compiler to construct the necessary three-address code to do any automatic type conversion required.

We will assume that there are two basic types, integer and real, and we may have to convert integers to reals.

We assume that there is a function

inttoreal

and two different "+" operators:

int+

real+


Semantic rule for E1 -> E2 + E3

E1.place := newtemp();
if E2.type = integer and E3.type = integer then
begin
    emit(E1.place ":=" E2.place "int+" E3.place);
    E1.type := integer
end
else if E2.type = real and E3.type = real then
begin
    emit(E1.place ":=" E2.place "real+" E3.place);
    E1.type := real
end
else if E2.type = integer and E3.type = real then
begin
    u := newtemp();
    emit(u ":= inttoreal" E2.place);
    emit(E1.place ":=" u "real+" E3.place);
    E1.type := real
end
else if E2.type = real and E3.type = integer then
begin
    u := newtemp();
    emit(u ":= inttoreal" E3.place);
    emit(E1.place ":=" E2.place "real+" u);
    E1.type := real
end
else E1.type := type_error;

We would also require similar semantic rules for

E1 -> E2 * E3

using operators "int*" and "real*".
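The rule for addition can be sketched in Python as follows. The names Emitter and translate_add are assumptions of the sketch, and types are represented as plain strings. In each mixed case, the integer operand is the one converted with inttoreal.

```python
class Emitter:
    def __init__(self):
        self.code = []
        self.ntemps = 0

    def newtemp(self):
        self.ntemps += 1
        return f"t{self.ntemps}"

    def emit(self, line):
        self.code.append(line)


def translate_add(em, e2_place, e2_type, e3_place, e3_type):
    """Return (E1.place, E1.type) for E1 -> E2 + E3."""
    place = em.newtemp()
    if e2_type == "integer" and e3_type == "integer":
        em.emit(f"{place} := {e2_place} int+ {e3_place}")
        return place, "integer"
    if e2_type == "real" and e3_type == "real":
        em.emit(f"{place} := {e2_place} real+ {e3_place}")
        return place, "real"
    if e2_type == "integer" and e3_type == "real":
        u = em.newtemp()
        em.emit(f"{u} := inttoreal {e2_place}")   # convert the left operand
        em.emit(f"{place} := {u} real+ {e3_place}")
        return place, "real"
    if e2_type == "real" and e3_type == "integer":
        u = em.newtemp()
        em.emit(f"{u} := inttoreal {e3_place}")   # convert the right operand
        em.emit(f"{place} := {e2_place} real+ {u}")
        return place, "real"
    return place, "type_error"


# x + i, where x is real and i is integer
em = Emitter()
result = translate_add(em, "x", "real", "i", "integer")
print(result)
print("\n".join(em.code))
```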


Generating the code

processed string              attributes          output

id1 := id2 * (id3 + -id4)
id1 := E1 * (id3 + -id4)      E1.place = <6>
id1 := E1 * (E2 + -id4)       E2.place = <7>
id1 := E1 * (E2 + -E3)        E3.place = <8>
id1 := E1 * (E2 + E4)         E4.place = <9>      <9> := uminus <8>
id1 := E1 * (E5)              E5.place = <10>     <10> := <7> + <9>
id1 := E1 * E6                E6.place = <10>
id1 := E7                     E7.place = <11>     <11> := <6> * <10>
S                                                 <5> := <11>

Date post:	06-Jan-2016
Upload:	randy