Date post: 21-Dec-2015
Upload: steven-goodwin

Chapter 5

Intermediate Code Generation

Chapter 5 -- Intermediate Code Generation

Let us see where we are now.
- We have tokenized the program and parsed it.
- We know the structure of the program and of every statement in it, and we have presumably established that it is free of grammatical errors.
- It would appear that we are ready to start translating it.

1. Semantic Actions and Syntax-Directed Translation

We can attach a meaning to every production.

Because the sequence of productions guides the generation of intermediate code, we call this process Syntax-Directed Translation.

Def: The computations or other operations attached to the productions impute meaning to each production, and so these operations are called Semantic Actions.

Def: The information obtained by the semantic actions is associated with the symbols of the grammar; it is normally put in fields of records associated with the symbols; these fields are called attributes.

Note: As far as the parser is concerned, neither the semantic actions nor the attributes are a part of the grammar; they are only used as a device for bridging the gap between parsing and constructing an intermediate representation.

Things we must take care of with the semantic actions:
- making sure the variables are declared before use
- type checking
- making sure actual and formal parameters are matched

These things are called semantic analysis.

So we can now have it both ways: we can put context-dependent information and actions together into a language that is still context-free.
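As a minimal sketch of the first check in the list above, the following assumes a toy program given as a list of tuples and a dictionary standing in for the symbol table. The function name and the tuple layout are hypothetical, chosen only for illustration.

```python
def check_declared_before_use(statements):
    """Return error messages for names that are used before being declared.

    Each statement is a tuple: ("declare", name, type) or ("use", name).
    This layout is an assumption made for the sketch.
    """
    symbol_table = {}  # name -> declared type
    errors = []
    for position, statement in enumerate(statements, start=1):
        kind, name = statement[0], statement[1]
        if kind == "declare":
            symbol_table[name] = statement[2]
        elif kind == "use" and name not in symbol_table:
            errors.append(f"statement {position}: '{name}' used before declaration")
    return errors


program = [("declare", "a", "integer"), ("use", "a"), ("use", "b")]
print(check_declared_before_use(program))
```

The context-free grammar accepts both uses; only the semantic action that consults the symbol table can reject the second one.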

2. Intermediate Representations

We will look at several different representations:
- Syntax Trees
- Directed Acyclic Graphs
- Postfix notation
- Three-Address Code
- Other forms

2.1 Syntax Trees

Typically used when intermediate code is to be generated later (maybe after an optimization pass).

Statement: x = a * b + a * b

[Figure: Parse Tree]

[Figure: Syntax Tree]

In a Parse Tree the emphasis is on the grammatical structure of the statement.

In a Syntax Tree the emphasis is on the actual computation to be performed.

2.2 Directed Acyclic Graphs

The directed acyclic graph (DAG) is a relative of a Syntax Tree.

The difference is that nodes for variables or repeated sub-expressions are merged.

[Figure: DAG]

Notice that this is not the same computation as in the previous examples. They had a typo (swapped the * and +), but it still explains the concept.

The use of DAGs to eliminate redundant code is our first instance of optimization. We will see more optimization in Chapter 6.

Redundant code really comes into play when we do array subscripts.

When you start generating intermediate code, you will be amazed at how much is generated for array subscripts.

2.3 Postfix Notation

- Very easy to generate from a bottom-up parse.
- You can also generate it from a Syntax Tree via a postorder traversal.
- The chief virtue of postfix is that it can be evaluated with the use of a stack.
- Nested if statements can cause problems.
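The two middle points above can be sketched as follows. The tuple encoding of the syntax tree, `(operator, left, right)` with strings as leaves, and the names `to_postfix` and `eval_postfix` are assumptions made for this illustration.

```python
import operator

# Binary operators supported by this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def to_postfix(node):
    """Postorder traversal: left operand, right operand, then the operator."""
    if isinstance(node, tuple):
        op, left, right = node
        return to_postfix(left) + to_postfix(right) + [op]
    return [node]  # a leaf (variable name)


def eval_postfix(tokens, env):
    """Evaluate a postfix token list with a stack; env maps names to values."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok])
    return stack.pop()


# Right-hand side of x = a * b + a * b
tree = ("+", ("*", "a", "b"), ("*", "a", "b"))
postfix = to_postfix(tree)
print(" ".join(postfix))                        # a b * a b * +
print(eval_postfix(postfix, {"a": 2, "b": 3}))  # 12
```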

2.4 Three-Address Code

This form breaks the program down into elementary statements having no more than 3 variables and no more than one operator.

Sample Statement: x = a + b * b
Translation:
  T := b * b
  x := a + T

Note: T is a temporary variable.

The notation is a compromise; it has the general form of a high-level language, but the individual statements are simple enough that they map into assembly language in a reasonably straightforward manner.

3AC may be:
- generated from a traversal of a Syntax Tree or a DAG, or
- generated as intermediate code directly in the course of the parse.

2.5 Intermediate Languages

Sometimes the intermediate representation may be a language of its own.

This helps uncouple the front end of the compiler from the back end.

You can then have a front end for each language that generates the same intermediate language, and then one back end for each type of computer.

Examples:
- UNCOL (1961) -- UNiversal Compiler-Oriented Language.
- P-Code (1981) -- UCSD -- based upon a p-code interpreter (they also built p-code compilers).
- GNU Intermediate Code -- gcc, g++, g77, gada -- a Lisp-like intermediate language.

3. Bottom-Up Translation

Bottom-up parsing generally lends itself to intermediate code generation more readily than does top-down parsing.

In either case, we must keep track of the various elements or pieces of the intermediate representation we are using, so we can get at them when we need them.

These elements will be attributes of symbols in the grammar.

For an identifier, the attribute will usually be its address in the symbol table.

For a non-terminal, the attribute will be some appropriate reference to part of the intermediate representation.

The most convenient way to keep track of these attributes is by keeping them in a stack (known as the semantic stack).

In the case of bottom-up parsing, the semantic stack and the parser stack move in synchronism.
- When we pop from the parse stack we pop the semantic stack, and when we push something onto the parse stack we will push something onto the semantic stack.
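The lockstep movement of the two stacks can be sketched as below. This is a simplification under several assumptions: the grammar (E -> E + T | T, T -> T * F | F, F -> id) is a stock expression grammar and not taken from the slides, the parse stack holds grammar symbols where a real LR parser would hold states, and the shift/reduce moves for `a + b * c` are written out by hand because no parsing table is included. The semantic actions here build postfix strings.

```python
# Production name -> (left-hand side, right-hand side length, semantic action).
# Each action receives the attributes popped for the right-hand side symbols.
PRODUCTIONS = {
    "E -> E + T": ("E", 3, lambda e, _plus, t: f"{e} {t} +"),
    "E -> T":     ("E", 1, lambda t: t),
    "T -> T * F": ("T", 3, lambda t, _star, f: f"{t} {f} *"),
    "T -> F":     ("T", 1, lambda f: f),
    "F -> id":    ("F", 1, lambda lexeme: lexeme),
}

# Hand-written moves an LR parser would make on: a + b * c
MOVES = [
    ("shift", "id", "a"), ("reduce", "F -> id"), ("reduce", "T -> F"),
    ("reduce", "E -> T"), ("shift", "+", None),
    ("shift", "id", "b"), ("reduce", "F -> id"), ("reduce", "T -> F"),
    ("shift", "*", None),
    ("shift", "id", "c"), ("reduce", "F -> id"),
    ("reduce", "T -> T * F"), ("reduce", "E -> E + T"),
]


def run(moves):
    parse_stack, semantic_stack = [], []
    for move in moves:
        if move[0] == "shift":
            _, symbol, attribute = move
            parse_stack.append(symbol)          # push on the parse stack ...
            semantic_stack.append(attribute)    # ... and on the semantic stack
        else:
            lhs, length, action = PRODUCTIONS[move[1]]
            attributes = semantic_stack[-length:]
            del parse_stack[-length:]           # pop the handle ...
            del semantic_stack[-length:]        # ... and its attributes
            parse_stack.append(lhs)
            semantic_stack.append(action(*attributes))
        # The two stacks always have the same height.
        assert len(parse_stack) == len(semantic_stack)
    return parse_stack, semantic_stack


print(run(MOVES))  # (['E'], ['a b c * +'])
```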

3.1 Trees

Syntax Trees
- The semantic action associated with each production will include planting a tree and taking care of the grafts.
- The attribute will contain a pointer to the root of the tree for this expression.
- You will need functions like make_tree( ) and make_leaf( ).
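The slides name make_tree( ) and make_leaf( ) without giving their signatures, so the parameters below are assumptions. This sketch builds the syntax tree for the earlier statement x = a * b + a * b; the `Node` class and the `show` helper are hypothetical additions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    label: str                      # operator or identifier
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def make_leaf(name):
    """Plant a single-node tree for an identifier."""
    return Node(name)


def make_tree(op, left, right):
    """Graft two existing subtrees under a new operator node."""
    return Node(op, left, right)


def show(node):
    """Render a tree as a fully parenthesized expression."""
    if node.left is None and node.right is None:
        return node.label
    return f"({show(node.left)} {node.label} {show(node.right)})"


rhs = make_tree("+",
                make_tree("*", make_leaf("a"), make_leaf("b")),
                make_tree("*", make_leaf("a"), make_leaf("b")))
root = make_tree("=", make_leaf("x"), rhs)
print(show(root))  # (x = ((a * b) + (a * b)))
```

In a bottom-up parse, each call would sit in the semantic action of the matching production, and the returned node would be the attribute pushed on the semantic stack.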

DAGs
- The main difference between constructing a DAG and constructing a Syntax Tree is that we do not create redundant nodes in a DAG.
- That means that the functions to create trees and grafts must be modified to check for duplicates.
- This is typically done with a hash function.
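One way to sketch the duplicate check is to let a Python dictionary play the role of the hash table. The `DagBuilder` class and its node-numbering scheme are assumptions for this illustration, not the slides' design.

```python
class DagBuilder:
    """Creates DAG nodes, reusing an existing node when an identical one exists."""

    def __init__(self):
        self._index = {}   # (label, left id, right id) -> node id; the hash lookup
        self.table = []    # node id -> (label, left id, right id)

    def _intern(self, key):
        # Only create a node if no identical node has been created before.
        if key not in self._index:
            self._index[key] = len(self.table)
            self.table.append(key)
        return self._index[key]

    def make_leaf(self, name):
        return self._intern((name, None, None))

    def make_tree(self, op, left, right):
        return self._intern((op, left, right))


# Right-hand side of x = a * b + a * b
dag = DagBuilder()
first = dag.make_tree("*", dag.make_leaf("a"), dag.make_leaf("b"))
second = dag.make_tree("*", dag.make_leaf("a"), dag.make_leaf("b"))
root = dag.make_tree("+", first, second)
print(first == second)  # True: the repeated sub-expression is shared
print(len(dag.table))   # 4 nodes: a, b, *, +
```

A syntax tree for the same expression needs seven nodes, so the saving is already visible on this small example.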

3.2 Postfix Notation

Postfix notation is particularly easy to generate from a bottom-up parse.

Grammar:

3.3 Three-Address Code

We can obtain 3AC from a tree or a DAG, or we can generate it directly in the course of a bottom-up parse.

To generate 3AC during the parse we need the following functions:
- MakeQuad( ) -- puts its parameters into the listing file in the proper format.
- GetTemp( ) -- generates a new temporary number.
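The two helpers can be sketched as below, with several assumptions: the quads are collected in a list where the slides describe a listing file, the parameter order of `make_quad` is invented, and a recursive walk over a tuple-encoded expression stands in for the order in which a bottom-up parser would perform its reductions.

```python
class QuadGenerator:
    def __init__(self):
        self.quads = []        # stands in for the listing file
        self._next_temp = 0

    def get_temp(self):
        """Generate a new temporary name (T1, T2, ...)."""
        self._next_temp += 1
        return f"T{self._next_temp}"

    def make_quad(self, op, arg1, arg2, result):
        """Format one three-address statement and append it to the listing."""
        self.quads.append(f"{result} := {arg1} {op} {arg2}")

    def gen(self, node):
        """Return the name that holds the value of node, emitting quads as needed."""
        if isinstance(node, str):
            return node  # a variable is its own place
        op, left, right = node
        left_place = self.gen(left)
        right_place = self.gen(right)
        temp = self.get_temp()
        self.make_quad(op, left_place, right_place, temp)
        return temp

    def assign(self, target, expr):
        """Emit code for target = expr, storing the last result in target directly."""
        if isinstance(expr, str):
            self.quads.append(f"{target} := {expr}")
            return
        op, left, right = expr
        left_place = self.gen(left)
        right_place = self.gen(right)
        self.make_quad(op, left_place, right_place, target)


g = QuadGenerator()
g.assign("x", ("+", "a", ("*", "b", "b")))  # x = a + b * b
print("\n".join(g.quads))
# T1 := b * b
# x := a + T1
```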

4. Top-Down Translation

It is, to be blunt, a mess.

We need two stacks, one for the parser and one for the attributes. Thus the semantic stack must now be separate and will not move in synchronism with the parser stack.

Up to now, we picked up attributes from children, and the bottom-up parser provided a handy way to do this. Now we have to be able to transfer attributes between siblings, and in some cases it may be necessary to transfer them from parent to child.

4.1 Synthesized and Inherited Attributes

- synthesized -- from your children (like we have looked at).
- inherited -- from your parent or sibling.

Def: A grammar in which all attributes are synthesized is called an S-attributed grammar.

Notice that inheritance so far is from left to right.
- That may not always be the case, and that can really mess things up.

Def: Grammars in which no non-terminal ever inherits from a younger brother (a sibling to its right) are called L-attributed grammars.

4.2 Attributes in a Top-Down Parse

Since so many of our problems arise from the need to eliminate left recursion in the underlying grammar, the place to start is to see how a normal semantic action has to be modified when removing left recursion.

In this example, .s is used to denote a synthesized attribute, and .i an inherited one.

[Figure: A sample parse tree of a+b]
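The grammar from the slide's figure is not reproduced in this transcript, so the sketch below assumes a stock example: E -> E + T | T with left recursion removed, giving E -> T E' and E' -> + T E' | empty (with - handled the same way). The parameter of `parse_expr_rest` plays the role of the inherited attribute E'.i, carrying the tree built so far from the left sibling, and its return value plays the role of the synthesized attribute E'.s. All function names are hypothetical.

```python
def parse_expression(tokens):
    """Recursive-descent parse of identifiers joined by + and -; returns a tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_term():
        # T -> id ; T.s is the identifier itself.
        if peek() is None or peek() in ("+", "-"):
            raise SyntaxError(f"identifier expected at position {pos}")
        return advance()

    def parse_expr_rest(inherited):
        # E' -> + T E1' : E1'.i = (+, E'.i, T.s) ; E'.s = E1'.s
        if peek() in ("+", "-"):
            op = advance()
            right = parse_term()
            return parse_expr_rest((op, inherited, right))
        # E' -> empty : E'.s = E'.i
        return inherited

    # E -> T E' : E'.i = T.s ; E.s = E'.s
    tree = parse_expr_rest(parse_term())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree


print(parse_expression(["a", "+", "b"]))            # ('+', 'a', 'b')
print(parse_expression(["a", "-", "b", "+", "c"]))  # ('+', ('-', 'a', 'b'), 'c')
```

Passing the partial tree down as an inherited attribute is what keeps the operators left-associative even though the rewritten grammar is right-recursive.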

4.3 Removal of Left Recursion

When transforming a grammar to remove left recursion, we must also transform the semantic actions, as the example we just saw implies.

There are some basic rules on how to do this, but we will not cover them at this time.

5. More about Bottom-Up Translation

We are now going to consider the cases where semantic actions must be embedded in the midst of the right-hand side of a production, and cases where it is expedient to use inherited attributes.

5.1 Embedded Semantic Actions

You will recall that in S-attributed grammars, the semantic actions occur in one piece at the end of the right-hand side of the production.

For some statements that is fine, but for others it is not.
- The typical example is: if C then S, because based upon the value of C we need a conditional jump to the end of the statement S.
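A sketch of why the action must sit between C and S: the conditional jump has to be emitted before the code for S, yet its target is only known after S. The version below reserves a slot for the jump and fills it in afterwards, a technique commonly called backpatching, which the slides do not name. The quad format, the 0-based quad numbering, and the function name are assumptions.

```python
def translate_if(condition, body):
    """Emit quads for `if C then S`.

    condition is the text of C; body is the list of quads already generated for S.
    """
    code = [f"T1 := {condition}"]

    # Embedded action: runs after C and before S. The jump target is unknown,
    # so reserve a slot and remember where it is.
    jump_index = len(code)
    code.append(None)

    code.extend(body)

    # Final action: S is complete, so the first quad after it is the target.
    code[jump_index] = f"if not T1 goto {len(code)}"
    return code


for number, quad in enumerate(translate_if("a < b", ["x := a", "y := b"])):
    print(number, quad)
# 0 T1 := a < b
# 1 if not T1 goto 4
# 2 x := a
# 3 y := b
```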

There are a few ways around this:
- Adding semantic actions in the middle of the production (YACC/BISON allow this)
- Breaking productions up
- Adding marker non-terminals

Another problem:
- The limit on how far a jump-if-nonzero can go (the jnz limit). On the Intel architecture, this is 127 bytes for a short jump.

5.2 Inherited Attributes

Example: Fortran Variables
- The data type comes first, so you can enter lexemes into the symbol table with the data type (an inherited attribute).
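A toy sketch of the idea, assuming a Fortran-like declaration given as a single string such as `INTEGER A, B, C` and a dictionary as the symbol table; the function name and the string handling are illustrative and do not reflect real Fortran parsing.

```python
def declare_type_first(statement, symbol_table):
    """Enter each declared name with the type seen at the start of the statement."""
    type_name, _, names = statement.partition(" ")
    inherited_type = type_name.upper()  # known before any identifier is read
    for name in names.split(","):
        # The type is passed along to every identifier as it is encountered.
        symbol_table[name.strip()] = inherited_type
    return symbol_table


print(declare_type_first("INTEGER A, B, C", {}))
# {'A': 'INTEGER', 'B': 'INTEGER', 'C': 'INTEGER'}
```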

6. Pascal-Type Declarations

Problem:
- The data type comes last, so you can't add the variables into the symbol table as they come.

Solution:
- Save the identifiers (or their symbol table pointers) onto some stack/list.
- When the data types come around, update the list elements.
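The solution above can be sketched as follows, assuming a Pascal-like declaration given as a string such as `a, b, c : integer;`. The function name, the `pending` list, and the use of `None` for a not-yet-known type are assumptions made for this illustration.

```python
def declare_type_last(declaration, symbol_table):
    """Enter names first, then fill in their type once it has been read."""
    pending = []  # identifiers whose type is not yet known
    names_part, _, type_part = declaration.partition(":")

    # The identifiers arrive first: enter them without a type and remember them.
    for name in names_part.split(","):
        name = name.strip()
        symbol_table[name] = None
        pending.append(name)

    # The type arrives last: update every saved entry.
    type_name = type_part.strip().rstrip(";")
    for name in pending:
        symbol_table[name] = type_name
    return symbol_table


print(declare_type_last("a, b, c : integer;", {}))
# {'a': 'integer', 'b': 'integer', 'c': 'integer'}
```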

7. Type Checking and Coercion

integer operation / float operation
- is it integer_add, float_add, ...?

The problem is that this is an overloaded operator.

It is up to the Semantic Analyzer
- to determine which operation is desired and to choose the appropriate implementation.
- And if the user has specified two incompatible operands, it must take some action.
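A sketch of how the choice might be made for +, under stated assumptions: only the types "integer" and "float" are numeric, an integer operand is widened to float when the other operand is a float, and incompatible operands are reported by raising an exception. The slides do not specify a coercion policy, so this one is illustrative; `int_to_float` is a hypothetical conversion operation.

```python
def select_add(left_type, right_type):
    """Return (operation, coercions) for the overloaded operator +."""
    numeric = {"integer", "float"}
    if left_type not in numeric or right_type not in numeric:
        # Incompatible operands: the analyzer must take some action.
        raise TypeError(f"incompatible operands for +: {left_type}, {right_type}")
    if left_type == "integer" and right_type == "integer":
        return "integer_add", []
    # Mixed or all-float operands: convert any integer operand, then add as floats.
    coercions = []
    if left_type == "integer":
        coercions.append("int_to_float(left)")
    if right_type == "integer":
        coercions.append("int_to_float(right)")
    return "float_add", coercions


print(select_add("integer", "integer"))  # ('integer_add', [])
print(select_add("integer", "float"))    # ('float_add', ['int_to_float(left)'])
```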

8. Summary

We have seen some of the principal methods for intermediate code generation and some of the principal problems.

Now we have nearly reached the conclusion of the front end of the compiler.

Some front-ends also include some optimization, and some interpreters also stop here.

