
Compiler Structures

Date post: 13-Jan-2016
Upload: nadda
Page 1: Compiler Structures

241-437 Compilers: Attr. Grammars/8 1

Compiler Structures

• Objective
  – describe semantic analysis with attribute grammars, as applied in yacc and recursive descent parsers

241-437, Semester 1, 2011-2012

8. Attribute Grammars

Page 2: Compiler Structures

Overview

1. What is an Attribute Grammar?

2. Parse Tree Evaluation

3. Attributes

4. Attribute Grammars and yacc

5. A Grid Grammar

6. Recursive Descent and Attributes

Page 3: Compiler Structures

In this lecture

[Figure: the phases of a compiler]
Source Program
  => Front End: Lexical Analyzer, Syntax Analyzer, Semantic Analyzer, Int. Code Generator
  => Intermediate Code
  => Back End: Code Optimizer, Target Code Generator
  => Target Lang. Prog.

Concentrating on attribute grammars.

Page 4: Compiler Structures

1. What is an Attribute Grammar?

• An attribute grammar is a context-free grammar with semantic actions attached to some of the productions
  – semantic = meaning

• An action specifies the meaning of a production in terms of its body terminals and nonterminals.

Page 5: Compiler Structures

Example Attribute Grammar

Production       Semantic Action
L => E           printf(Ebody.val)
E => E + T       E.val := Ebody.val + Tbody.val
E => T           E.val := Tbody.val
T => T * F       T.val := Tbody.val * Fbody.val
T => F           T.val := Fbody.val
F => ( E )       F.val := Ebody.val
F => num         F.val := value(num)

Page 6: Compiler Structures

2. Parse Tree Evaluation

• One way of understanding semantic actions is as extra information (attributes) attached to the nodes of the parse tree for the input.

• The semantic action specifies the parent node attribute in terms of the attributes of its children.

Page 7: Compiler Structures

Basic Parse Tree

Input: 9 * 5 + 2

Grammar:
L => E
E => E + T | T
T => T * F | F
F => ( E ) | num

Parse tree (children are indented under their parent):
L
  E
    E
      T
        T
          F
            9
        *
        F
          5
    +
    T
      F
        2

Page 8: Compiler Structures

Adding Meaning to the Tree

• What is the meaning of "9 * 5 + 2"?
  – the answer is to evaluate it, to get 47

• Add attributes to the tree, starting from the leaves and working up to the root
  – use the semantic actions to get the attribute values

Page 9: Compiler Structures

Parse Tree with Actions

Each node is labelled with its val attribute, evaluated bottom-up using the semantic actions of the example grammar:

L                      printf prints 47
  E  (val = 47)
    E  (val = 45)
      T  (val = 45)
        T  (val = 9)
          F  (val = 9)
            9
        *
        F  (val = 5)
          5
    +
    T  (val = 2)
      F  (val = 2)
        2

Page 10: Compiler Structures

3. Attributes

• Attribute values can be
  – numbers, strings, any data structures, code, assembly language instructions

• It's not always necessary to build a parse tree in order to evaluate the grammar's actions.

Page 11: Compiler Structures

Kinds of Attribute

• There are two main kinds of attribute:
  – synthesized and inherited attributes

• The value of a synthesized attribute is calculated from the attribute values of the symbols in its production body (its children in the parse tree)
  – as in the previous example

Page 12: Compiler Structures

Synthesized Attributes in a Tree

• Example:

Production       Semantic Action
T => T * F       T.val := Tbody.val * Fbody.val

T  (val = 45)
  T  (val = 9)
  *
  F  (val = 5)

evaluate bottom-up

Page 13: Compiler Structures

Inherited Attributes

• An inherited attribute for a body symbol (i.e. terminal, non-terminal) gets its value from the other body symbols and the parent value
  – often used for evaluating more complex programming language features

Page 14: Compiler Structures

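Inherited Attributes in a Tree

• Two examples, for a parent A with attribute A.a and body symbols X and Y with attributes X.x and Y.y:
  – X.x := function(A.a, Y.y)
  – Y.y := function(A.a, X.x)

[Figure: two trees, each with parent A.a and children X.x and Y.y, with arrows showing the direction of evaluation]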

Page 15: Compiler Structures

4. Attribute Grammars and yacc

• yacc supports (synthesized) attribute grammars
  – yacc actions are semantic actions
  – no parse tree is needed, since yacc evaluates the actions using the parser's built-in stack

Page 16: Compiler Structures

expr.y Again

%token NUMBER

%%

exprs: expr '\n'          { printf("Value = %d\n", $1); }
     | exprs expr '\n'    { printf("Value = %d\n", $2); }
     ;

expr: expr '+' term       { $$ = $1 + $3; }
    | expr '-' term       { $$ = $1 - $3; }
    | term                { $$ = $1; }
    ;

(%token NUMBER is a declaration; the code in braces is an action; $$, $1, $2 and $3 are attributes)

continued

Page 17: Compiler Structures

term: term '*' factor     { $$ = $1 * $3; }
    | term '/' factor     { $$ = $1 / $3; }   /* integer division */
    | factor
    ;

factor: '(' expr ')'      { $$ = $2; }
      | NUMBER
      ;

(more actions)

continued

Page 18: Compiler Structures

%%

#include "lex.yy.c"

int yyerror(char *s)
{
  fprintf(stderr, "%s\n", s);
  return 0;
}

int main(void)
{
  yyparse();    // the syntax analyzer
  return 0;
}

(c code)

Page 19: Compiler Structures

Evaluation in yacc

Input: 3 * 5 + 4\n

Stack        Input        Stack Action           val_      Semantic Action
$            3*5+4\n$     shift
$ 3          *5+4\n$      reduce F => num        3         $$ = $1 (implicit)
$ F          *5+4\n$      reduce T => F          3         $$ = $1 (implicit)
$ T          *5+4\n$      shift                  3
$ T *        5+4\n$       shift                  3
$ T * 5      +4\n$        reduce F => num        3 5       $$ = $1 (implicit)
$ T * F      +4\n$        reduce T => T * F      3 5       $$ = $1 * $3
$ T          +4\n$        reduce E => T          15        $$ = $1 (implicit)
$ E          +4\n$        shift                  15
$ E +        4\n$         shift                  15
$ E + 4      \n$          reduce F => num        15 4      $$ = $1 (implicit)
$ E + F      \n$          reduce T => F          15 4      $$ = $1 (implicit)
$ E + T      \n$          reduce E => E + T      15 4      $$ = $1 + $3
$ E          \n$          shift                  19
$ E \n       $            reduce Es => E \n      19        printf $1
$ Es         $            accept                 19

Page 20: Compiler Structures

5. A Grid Grammar

• A robot starts at (0,0) on a grid, and is given compass directions:
  – n = north, s = south, e = east, w = west

• Evaluate the sequence of directions to work out the final position of the robot.

Page 21: Compiler Structures

Example

• The robot receives the directions:
  – n e e n n w
  – what is the 'meaning' (semantics) of the directions?
  – the 'meaning' is the final robot position, (1,3)

[Figure: a grid showing the robot's path from its start to its final position, with a compass marking n, e, w, s]

Page 22: Compiler Structures

5.1. Grid Grammar

Input: n w s s

robot => path
path  => path step | ε
step  => e | w | s | n

Parse tree (children are indented under their parent):
robot
  path
    path
      path
        path
          path  (ε)
          step
            n
        step
          w
      step
        s
    step
      s

Page 23: Compiler Structures

Grid Attribute Grammar

Production            Semantic Actions
robot => path         printf( pathbody.(x,y) )
path => path step     path.x := pathbody.x + stepbody.dx
                      path.y := pathbody.y + stepbody.dy
path => ε             path.(x,y) := (0,0)
step => e             step.(dx,dy) := (1,0)
step => w             step.(dx,dy) := (-1,0)
step => s             step.(dx,dy) := (0,-1)
step => n             step.(dx,dy) := (0,1)

Page 24: Compiler Structures

Data Types

• The path rules use (x,y), the position of the robot.

• The step rules use (dx,dy), the step taken by the robot.

• Implementing these data types requires new features of yacc.

[Figure: a grid position (x,y) and a step offset (dx,dy)]

Page 25: Compiler Structures

Parse Tree with Actions

Input: n w s s

Each path node is labelled with its (x,y) position and each step node with its (dx,dy) offset, evaluated bottom-up:

robot                      printf (-1,-1)
  path  (-1,-1)
    path  (-1,0)
      path  (-1,1)
        path  (0,1)
          path  (0,0)
          step  (0,1)
            n
        step  (-1,0)
          w
      step  (0,-1)
        s
    step  (0,-1)
      s

Page 26: Compiler Structures

5.2. Non-integer Yacc Attributes

• The default yacc attributes (e.g. $$, $1, etc) are integers.

• We want data structures for (x,y) and (dx,dy), coded as two struct types.

Page 27: Compiler Structures

Defining New Types

• The new types are collected together inside a %union in the yacc definitions section:

%union {
  type1 name1;
  type2 name2;
  . . .
}

• For the grid:

%union {
  struct { int x; int y; } pos;
  struct { int dx; int dy; } offset;
}

Page 28: Compiler Structures

Using the Types

• The non-terminals that return the new types must be listed.

• Any tokens that use the types must be listed.

• For the grid:

%type <offset> step
%type <pos> path

  – these non-terminals return values of the specified type

Page 29: Compiler Structures

Using Typed Variables

• If an attribute (variable) is a record, then dotted-name notation is used to refer to its fields
  – e.g. $$.dx, $1.y

• The default action ($$ = $1) will cause an error if $$ and $1 are not the same type.

Page 30: Compiler Structures

5.3. Grid Compiler

$ flex grid.l
$ bison grid.y
$ gcc grid.tab.c -o gridEval

[Figure: the build pipeline]
grid.l, a flex file   => flex  => lex.yy.c
grid.y, a bison file  => bison => grid.tab.c (which #includes lex.yy.c)
grid.tab.c            => gcc   => gridEval, c executable

Page 31: Compiler Structures

Usage

$ ./gridEval
nwss
Robot is at (-1,-1)
$ ./gridEval
n n n w w w s e
Robot is at (-2,2)
$

(I typed the lines of directions, then ctrl-D.)

Page 32: Compiler Structures

grid.l

%%

[nN]      {return NORTH;}
[sS]      {return SOUTH;}
[eE]      {return EAST;}
[wW]      {return WEST;}

[ \n\t]   ;

%%

int yywrap(void) { return 1; }

Page 33: Compiler Structures

grid.y

/* type definitions */
%union {
  struct { int x; int y; } pos;
  struct { int dx; int dy; } offset;
}

%token EAST WEST NORTH SOUTH

/* types used by the non-terminals */
%type <offset> step
%type <pos> path

%%

continued

Page 34: Compiler Structures

robot: path          { printf("Robot is at (%d,%d)\n", $1.x, $1.y); }
     ;

path: path step      { $$.x = $1.x + $2.dx;
                       $$.y = $1.y + $2.dy; }
    |                { $$.x = 0; $$.y = 0; }
    ;

step: EAST           { $$.dx = 1;  $$.dy = 0; }
    | WEST           { $$.dx = -1; $$.dy = 0; }
    | SOUTH          { $$.dx = 0;  $$.dy = -1; }
    | NORTH          { $$.dx = 0;  $$.dy = 1; }
    ;

%%

continued

Page 35: Compiler Structures

#include "lex.yy.c"

int yyerror(char *s)
{
  fprintf(stderr, "%s\n", s);
  return 0;
}

int main(void)
{
  yyparse();
  return 0;
}

Page 36: Compiler Structures

6. Recursive Descent and Attributes

• It is easy to add semantic actions to a recursive descent parser
  – in many cases, there's no need for the parser to build a parse tree in order to evaluate the attributes

• The basic translation strategy:
  – each production becomes a function

continued

Page 37: Compiler Structures

• The function (e.g. f()) calls other functions representing its body non-terminals
  – those functions return values (attributes) to f()
  – f() combines the values, and returns a value (attribute)

Page 38: Compiler Structures

6.1. The Expressions Parser Again

• The basic LL(1) grammar:

Stats => ( [ Stat ] \n )*
Stat  => let ID = Expr | Expr
Expr  => Term ( (+ | - ) Term )*
Term  => Fact ( (* | / ) Fact )*
Fact  => '(' Expr ')' | Int | Id

Page 39: Compiler Structures

An Expressions Program (test3.txt)

5 + 6                          <-- give answer
let x = 2                      <-- declare variable
3 + ( (x*y)/2)  // comments
// y
let x = 5
let y = x /0                   <-- error
// comments

Page 40: Compiler Structures

6.2. Parsing with Actions

• exprParse1.c is a recursive descent parser for the expressions language.

• It differs from exprParse0.c by having semantic actions attached to its productions
  – these actions evaluate the expressions, and assign values to expression variables

Page 41: Compiler Structures

Grammar with Actions

Productions                        Actions
Stats => ( [ Stat ] \n )*          ---
Stat => let ID = Expr              add id to symbol table;
                                   id.val = expr.val;
                                   print( id.val );
Stat => Expr                       print( expr.val );

continued

Page 42: Compiler Structures

Expr => Term ( (+ | - ) Term )*    return term1.val (+| -) term2.val (+| -) ... termn.val;
Term => Fact ( (* | / ) Fact )*    return fact1.val (*| /) fact2.val (*| /) ... factn.val;

continued

Page 43: Compiler Structures

Fact => '(' Expr ')'               return expr.val;
Fact => Int                        return int.val;
Fact => Id                         lookup id;
                                   if not found then add (id, 0) to table;
                                   return id.val;

Page 44: Compiler Structures

The Symbol Table

• The symbol table is a data structure used to store expression variables and their values.

• In exprParse1.c, it's an array of structs, with each struct holding the name of the variable and its current integer value.

[Figure: the syms[] array, each element holding an id and a value]

Page 45: Compiler Structures

6.3. Usage

$ gcc -Wall -o exprParse1 exprParse1.c
$ ./exprParse1 < test3.txt
== 11
x being declared
x = 2
y being declared
== 3
x = 5
Error: Division by zero; using 1 instead
y = 5
$

Page 46: Compiler Structures

6.4. exprParse1.c Callgraph

[Figure: call graph of exprParse1.c, with three labelled regions: "same as in exprParse0.c", "symbol table (new)", and "generated from grammar (now with actions)"]

Page 47: Compiler Structures

6.5. Symbol Table Data Structures

#define MAX_SYMS 15       // max no of variables

typedef struct SymInfo {
  char *id;               // name of variable
  int value;              // value (an integer)
} SymbolInfo;

int symNum = 0;           // number of symbols stored
SymbolInfo syms[MAX_SYMS];

[Figure: the syms[] array with elements 0, 1, 2, ..., 14, each holding an id and a value]

Page 48: Compiler Structures

Symbol Table Functions

SymbolInfo *getIDEntry(void)
/* find _OR_ create symbol table entry for
   current tokString; return a pointer to it */
{
  SymbolInfo *si = NULL;
  if ((si = lookupID(tokString)) != NULL)   // already declared
    return si;

  // add id to table
  printf("%s being declared\n", tokString);
  return addID(tokString, 0);   // 0 is default value
} // end of getIDEntry()

Page 49: Compiler Structures

SymbolInfo *lookupID(char *nm)
/* is nm in the symbol table?
   return pointer to struct or NULL */
{
  int i;
  for (i = 0; i < symNum; i++)
    if (!strcmp(syms[i].id, nm))
      return &syms[i];
  return NULL;
} // end of lookupID()

Page 50: Compiler Structures

SymbolInfo *addID(char *nm, int value)
/* add nm and value to the symbol table;
   return pointer to struct */
{
  if (symNum == MAX_SYMS) {
    printf("Symbol table full; cannot add %s\n", nm);
    exit(1);
  }

  syms[symNum].id = (char *) malloc(strlen(nm)+1);
  strcpy(syms[symNum].id, nm);
  syms[symNum].value = value;
  SymbolInfo *si = &syms[symNum];
  symNum++;

  return si;
} // end of addID()

Page 51: Compiler Structures

Using the Symbol Table

• The grammar functions use the symbol table via the matchId() function.

SymbolInfo *matchId(void)
// checks current ID with symbol table
{
  SymbolInfo *si;
  dprint("Parsing ident\n");
  if ((si = getIDEntry()) == NULL) {
    printf("Error: id is NULL on line %d\n", lineNum);
    exit(1);
  }
  match(ID);   // ok, so consume ID token
  return si;
} // end of matchId()

Page 52: Compiler Structures

6.6. Translating the Grammar Rules

• The same translation is carried out as before, but the code is augmented with actions.

• The semantic actions are translated into extra C code in the grammar functions.

Page 53: Compiler Structures

The Grammar Functions

• main() and statements() are unchanged from exprParse0.c since they don't have any semantic actions.

• Functions with extra actions:
  – statement(), expression(), term(), factor()

Page 54: Compiler Structures

Unchanged Functions

int main(void)
{
  nextToken();
  statements();
  match(SCANEOF);
  return 0;
}

void statements(void)
// statements ::= { [ statement ] '\n' }
{
  dprint("Parsing statements\n");
  while (currToken != SCANEOF) {
    if (currToken != NEWLINE)
      statement();
    match(NEWLINE);
  }
} // end of statements()

Page 55: Compiler Structures

statement() Before and After

With no semantic actions:

void statement(void)
// statement ::= ( 'let' ID '=' EXPR ) | EXPR
{
  if (currToken == LET) {
    match(LET);
    match(ID);
    match(ASSIGNOP);
    expression();
  }
  else
    expression();
} // end of statement()

Page 56: Compiler Structures

void statement(void)
// statement ::= ( 'let' ID '=' EXPR ) | EXPR
{
  SymbolInfo *si;
  int value;
  dprint("Parsing statement\n");
  if (currToken == LET) {
    match(LET);
    si = matchId();          // was match(ID);
    match(ASSIGNOP);
    value = expression();
    si->value = value;
    printf("%s = %d\n", si->id, value);
  }
  else {   // expression
    value = expression();
    printf("== %d\n", value);
  }
}

Actions:
  add id to table; id.val = expr.val; print( id.val );
or
  print( expr.val );

Page 57: Compiler Structures

expression() Before and After

With no semantic actions:

void expression(void)
// expression ::= term ( ('+'|'-') term )*
{
  term();
  while ((currToken == PLUSOP) || (currToken == MINUSOP)) {
    match(currToken);
    term();
  }
} // end of expression()

Page 58: Compiler Structures

int expression(void)
// expression ::= term ( ('+'|'-') term )*
{
  int result, v2;
  int isAddOp;

  dprint("Parsing expression\n");
  result = term();
  while ((currToken == PLUSOP) || (currToken == MINUSOP)) {
    isAddOp = (currToken == PLUSOP) ? 1 : 0;
    match(currToken);
    v2 = term();
    if (isAddOp == 1)   // addition
      result += v2;
    else                // subtraction
      result -= v2;
  }
  return result;
} // end of expression()

Action:
  return term1.val (+| -) term2.val (+| -) ... termn.val;

Page 59: Compiler Structures

term() Before and After

With no semantic actions:

void term(void)
// term ::= factor ( ('*'|'/') factor )*
{
  factor();
  while ((currToken == MULTOP) || (currToken == DIVOP)) {
    match(currToken);
    factor();
  }
} // end of term()

Page 60: Compiler Structures

int term(void)
// term ::= factor ( ('*'|'/') factor )*
{
  int result, v2;
  int isMultOp;
  dprint("Parsing term\n");
  result = factor();
  while ((currToken == MULTOP) || (currToken == DIVOP)) {
    isMultOp = (currToken == MULTOP) ? 1 : 0;
    match(currToken);
    v2 = factor();
    if (isMultOp == 1)   // multiplication
      result *= v2;
    else {               // division
      if (v2 == 0)
        printf("Error: Division by zero; using 1 instead\n");
      else
        result = result / v2;
    }
  }
  return result;
} // end of term()

Action:
  return fact1.val (*| / ) fact2.val (*| / ) ... factn.val;

Page 61: Compiler Structures

factor() Before and After

With no semantic actions:

void factor(void)
// factor ::= '(' expression ')' | INT | ID
{
  if (currToken == LPAREN) {
    match(LPAREN);
    expression();
    match(RPAREN);
  }
  else if (currToken == INT)
    match(INT);
  else if (currToken == ID)
    match(ID);
  else
    syntax_error(currToken);
} // end of factor()

Page 62: Compiler Structures

int factor(void)
// factor ::= '(' expression ')' | INT | ID
{
  int result = 0;
  dprint("Parsing factor\n");
  if (currToken == LPAREN) {
    match(LPAREN);
    result = expression();
    match(RPAREN);
  }
  else if (currToken == INT) {
    match(INT);
    result = currTokValue;
  }
  else if (currToken == ID) {
    SymbolInfo *si = matchId();
    result = si->value;
  }
  else
    syntax_error(currToken);
  return result;
} // end of factor()

Actions:
  return expr.val;
or
  return int.val;
or
  add id to table (if new); return id.val;

