Semantic Analysis (Generating An AST)

CS 471, September 26, 2007

CS 471 – Fall 2007

Semantic Analysis

Source code
→ Lexical Analysis (reports lexical errors) → tokens
→ Parsing (reports syntax errors) → AST
→ Semantic Analysis (reports semantic errors) → Valid programs: decorated AST

Goals of a Semantic Analyzer

Compiler must do more than recognize whether a sentence belongs to the language…

• Find all possible remaining errors that would make program invalid
– undefined variables, types
– type errors that can be caught statically

• Figure out useful information for later phases
– types of all expressions
– data layout

Semantic Actions

Can do useful things with the parsed phrases
– Each terminal and nonterminal may be associated with a type, e.g. exp: INT type is int
– For rule: A → B C D
• Type must match A
• Value can be built with B C D

Semantic Actions

Semantic action executed when grammar production is reduced

• Recursive-descent parser: semantic code interspersed with control flow

• Yacc: fragments of C code attached to a grammar production
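
A minimal sketch of the recursive-descent case, assuming integer literals as operands and a grammar stratified into exp/term/factor to encode the usual precedence (both are assumptions, not taken from the slides). The names eval, parse_exp, parse_term and parse_factor are hypothetical. Each semantic action (add, subtract, multiply, negate) sits directly inside the parsing control flow:

```c
#include <assert.h>
#include <ctype.h>

/* Cursor into the input string; a hypothetical stand-in for a real token stream. */
static const char *cur;

static void skip_spaces(void) { while (*cur == ' ') cur++; }

/* factor -> INT | '-' factor */
static int parse_factor(void) {
    skip_spaces();
    if (*cur == '-') {              /* unary minus */
        cur++;
        return -parse_factor();     /* semantic action: negate */
    }
    assert(isdigit((unsigned char)*cur));
    int v = 0;
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* term -> factor { '*' factor } */
static int parse_term(void) {
    int v = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cur != '*') return v;
        cur++;
        v *= parse_factor();        /* semantic action: multiply */
    }
}

/* exp -> term { ('+' | '-') term } */
static int parse_exp(void) {
    int v = parse_term();
    for (;;) {
        skip_spaces();
        if (*cur == '+')      { cur++; v += parse_term(); }  /* semantic action: add */
        else if (*cur == '-') { cur++; v -= parse_term(); }  /* semantic action: subtract */
        else return v;
    }
}

/* Parse and evaluate a whole expression string. */
int eval(const char *src) {
    cur = src;
    return parse_exp();
}
```

Calling eval("1 + 2 * 3") walks the three parsing functions once and computes the value as a side effect of parsing, with no tree built.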

Interpreter

Could develop an interpreter that executes the program as part of the semantic actions!

Example Grammar:

E → id
E → E + E
E → E – E
E → E * E
E → -E

Unions in Yacc

%union allows us to declare a union datatype used to package the types/attributes of symbols

%union {
    int pos;
    int ival;
    string sval;
    struct {
        int intval;
        enum Types valtype;
    } constantval;
    A_exp exp;
}

Exported as YYSTYPE

Types in Yacc

Using the values of union structs, tell Yacc the types

Terminals

%token <sval> ID STRING

%token <ival> INT

%token <pos> COMMA SEMI LBRACE RBRACE …

And Nonterminals (use %type)

%type <exp> expression program

(<exp> is the type; expression and program are LHSs of productions)

Symbols in Yacc

• The symbol $n (n > 0) refers to the attribute of the nth symbol on the RHS

• The symbol $$ refers to the attribute of the LHS

• The symbol $n (n ≤ 0) refers to contextual information

Note: an action in the middle of a rule counts as a symbol!

expr : expr1 PLUS expr2
$$     $1         $3

Interpreter in Yacc

%{ declarations of yylex and yyerror %}
%union {int num; string id;}
%token <num> INT
%token <id> ID
%type <num> exp
%start exp

%left PLUS MINUS
%left TIMES
%left UMINUS
%%

[please fill in solution]

Grammar:
E → id
E → E + E
E → E – E
E → E * E
E → -E

Recall

expr : expr1 PLUS expr2
$$     $1         $3

Internally: A Semantic Stack

Implemented using a stack parallel to the state stack

Stack                         Input          Action
(empty)                       1 + 2 * 3 $    shift
INT: 1                        + 2 * 3 $      reduce
exp: 1                        + 2 * 3 $      shift
exp: 1 +:                     2 * 3 $        shift
exp: 1 +: INT: 2              * 3 $          reduce
exp: 1 +: exp: 2              * 3 $          shift
exp: 1 +: exp: 2 *:           3 $            shift
exp: 1 +: exp: 2 *: INT: 3    $              reduce
exp: 1 +: exp: 2 *: exp: 3    $              reduce
exp: 1 +: exp: 6              $              reduce
exp: 7                        $              accept
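
The trace can be replayed with a small value stack. This is only a sketch: the state stack and the parse tables are left out, operator tokens carry a dummy attribute of 0, and the names ValueStack, shift, reduce_int, reduce_binop and run_trace are hypothetical rather than anything Yacc generates.

```c
#include <assert.h>

enum { STACK_MAX = 32 };

/* Value stack kept in step with the parser's state stack (state stack omitted). */
typedef struct {
    int vals[STACK_MAX];
    int top;                /* number of entries currently on the stack */
} ValueStack;

static void push(ValueStack *s, int v) {
    assert(s->top < STACK_MAX);
    s->vals[s->top++] = v;
}

/* shift: push the token's attribute (operators carry a dummy 0). */
void shift(ValueStack *s, int attr) { push(s, attr); }

/* reduce exp -> INT: $$ = $1, so the top entry is reused unchanged. */
void reduce_int(ValueStack *s) { assert(s->top >= 1); }

/* reduce exp -> exp OP exp: pop three attributes, push $$. */
void reduce_binop(ValueStack *s, char op) {
    assert(s->top >= 3);
    int rhs = s->vals[s->top - 1];      /* $3 */
    int lhs = s->vals[s->top - 3];      /* $1 */
    s->top -= 3;
    push(s, op == '+' ? lhs + rhs : lhs * rhs);
}

/* Replays the shift/reduce sequence for 1 + 2 * 3 and returns the final attribute. */
int run_trace(void) {
    ValueStack s = { {0}, 0 };
    shift(&s, 1); reduce_int(&s);       /* INT: 1 -> exp: 1 */
    shift(&s, 0);                       /* + */
    shift(&s, 2); reduce_int(&s);       /* INT: 2 -> exp: 2 */
    shift(&s, 0);                       /* * */
    shift(&s, 3); reduce_int(&s);       /* INT: 3 -> exp: 3 */
    reduce_binop(&s, '*');              /* exp: 2 * exp: 3 -> exp: 6 */
    reduce_binop(&s, '+');              /* exp: 1 + exp: 6 -> exp: 7 */
    assert(s.top == 1);
    return s.vals[0];
}
```

In a Yacc-generated parser the decision to shift or reduce comes from the parse tables; here the sequence is hard-coded so that only the attribute bookkeeping is visible.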

Inlined TypeChecker and CodeGen

You can even type check and generate code:

expr : expr PLUS expr {
    if ($1.type == $3.type &&
        ($1.type == IntType ||
         $1.type == RealType))
        $$.type = $1.type;
    else
        error("+ applied on wrong type!");
    GenerateAdd($1, $3, $$);
}
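
The typing rule inside that action can be pulled out into a plain C function, which makes it easier to read and test on its own. A minimal sketch: IntType and RealType come from the slide, while StringType, ErrorType and the function name check_plus are assumptions added for illustration.

```c
#include <assert.h>

/* IntType and RealType appear on the slide; the other two are assumed here. */
typedef enum { IntType, RealType, StringType, ErrorType } Type;

/* Result type of `lhs + rhs` under the slide's rule:
   both operands must have the same type, and it must be int or real.
   Returns ErrorType when the rule is violated. */
Type check_plus(Type lhs, Type rhs) {
    if (lhs == rhs && (lhs == IntType || lhs == RealType))
        return lhs;
    return ErrorType;
}
```

Returning an error type instead of calling error() directly is a design choice of this sketch; it lets the caller decide how to report the problem and keep checking the rest of the program.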

Problems

•Difficult to read

•Difficult to maintain

•Compiler must analyze program in order parsed

•Instead … we split up tasks

Compiler ‘main program’

void Compile() {
    TokenStream l = Lexer(input);
    AST tree = Parser(l);
    if (TypeCheck(tree)) {
        IR ir = genIntermediateCode(tree);
        emitCode(ir);
    }
}

Thread of control

[Diagram]
Data: Input Stream → (characters) → Lexer → (tokens) → Parser → AST
Calls: compile → parse → getToken → readStream

Producing the Parse Tree

Separates issues of syntax (parsing) from issues of semantics (type checking, translation to machine code)

• One leaf for every token

• One internal node for every reduction during parsing

• Concrete parse tree represents concrete syntax

But … parse tree has problems

• Punctuation tokens redundant

• Structure of the tree conveys this info

Enter the Abstract Syntax Tree

AST

• Abstract Syntax Tree is a tree representation of the program. Used for
– semantic analysis (type checking)
– some optimization (e.g. constant folding)
– intermediate code generation (sometimes intermediate code = AST with somewhat different set of nodes)

• Compiler phases = recursive tree traversals
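
As an illustration of a phase written as a recursive tree traversal, here is a sketch of constant folding over a toy expression AST. The node layout (ExpKind, struct Exp_) and the names IntExp, VarExp, OpExp and fold are hypothetical and much simpler than the Tiger AST.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { E_INT, E_VAR, E_ADD, E_MUL } ExpKind;

typedef struct Exp_ *Exp;
struct Exp_ {
    ExpKind kind;
    int value;          /* used when kind == E_INT */
    const char *name;   /* used when kind == E_VAR */
    Exp left, right;    /* used when kind is E_ADD or E_MUL */
};

static Exp new_exp(ExpKind kind, int value, const char *name, Exp left, Exp right) {
    Exp e = malloc(sizeof(*e));
    assert(e != NULL);
    e->kind = kind;
    e->value = value;
    e->name = name;
    e->left = left;
    e->right = right;
    return e;
}

Exp IntExp(int v)                 { return new_exp(E_INT, v, NULL, NULL, NULL); }
Exp VarExp(const char *n)         { return new_exp(E_VAR, 0, n, NULL, NULL); }
Exp OpExp(ExpKind op, Exp l, Exp r) { return new_exp(op, 0, NULL, l, r); }

/* Post-order traversal: fold the children first, then this node
   if both children turned out to be integer constants. */
Exp fold(Exp e) {
    if (e->kind != E_ADD && e->kind != E_MUL)
        return e;                       /* leaves are already as simple as possible */
    e->left = fold(e->left);
    e->right = fold(e->right);
    if (e->left->kind == E_INT && e->right->kind == E_INT) {
        int l = e->left->value, r = e->right->value;
        int v = (e->kind == E_ADD) ? l + r : l * r;
        free(e->left);
        free(e->right);
        e->kind = E_INT;
        e->value = v;
        e->left = e->right = NULL;
    }
    return e;
}
```

For a + 2 * 3 the traversal rewrites the right subtree to the constant 6 and leaves the addition in place, because a is not known at compile time.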

Do We Need An AST?

• Old-style compilers: semantic actions generate code during parsing

Problems:

• hard to maintain

• limits language features

• not modular!

expr ::= expr PLUS expr {: emitCode(add); :}

[Diagram: input → parser (with its stack) → code]

Interesting Detour

•Old compilers didn’t create ASTs … not enough memory to store entire program

•Can also see reasons for C requiring forward declarations - avoids an extra compilation pass

Positions

In one pass compiler – errors reported using position of the lexer as approximation (global var)

Abstract syntax data structures must have pos fields

• Line number

• Char number

•Line number is unambiguous

•Char number is a matter of style

Abstract Syntax for Tiger

/* absyn.h */

typedef struct A_var_ *A_var;

struct A_var_ {
    enum {A_simpleVar, A_fieldVar, A_subscriptVar} kind;
    A_pos pos;
    union {
        S_symbol simple;
        struct {A_var var; S_symbol sym;} field;
        struct {A_var var; A_exp exp;} subscript;
    } u;
};

More Syntax (Constructors…p.98)

A_var A_SimpleVar(A_pos pos, S_symbol sym);

A_exp A_WhileExp(A_pos pos, A_exp test, A_exp body);

A_expList A_ExpList(A_exp head, A_expList tail);
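
Each constructor allocates a node, sets the kind tag, and fills in the matching union member. The sketch below shows the pattern for two A_var constructors. The typedefs for A_pos, S_symbol and A_exp are stand-ins so the fragment compiles on its own (assumptions, not the definitions from the Tiger support code), and plain malloc with an assert is used in place of whatever allocation helper the project provides.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in definitions so the sketch is self-contained (assumptions). */
typedef int A_pos;
typedef const char *S_symbol;
typedef struct A_exp_ *A_exp;
typedef struct A_var_ *A_var;

struct A_var_ {
    enum { A_simpleVar, A_fieldVar, A_subscriptVar } kind;
    A_pos pos;
    union {
        S_symbol simple;
        struct { A_var var; S_symbol sym; } field;
        struct { A_var var; A_exp exp; } subscript;
    } u;
};

/* Build a node for a plain variable reference such as `a`. */
A_var A_SimpleVar(A_pos pos, S_symbol sym) {
    A_var p = malloc(sizeof(*p));
    assert(p != NULL);
    p->kind = A_simpleVar;      /* tag selects which union member is valid */
    p->pos = pos;
    p->u.simple = sym;
    return p;
}

/* Build a node for a field access such as `a.x`. */
A_var A_FieldVar(A_pos pos, A_var var, S_symbol sym) {
    A_var p = malloc(sizeof(*p));
    assert(p != NULL);
    p->kind = A_fieldVar;
    p->pos = pos;
    p->u.field.var = var;
    p->u.field.sym = sym;
    return p;
}
```

Code that consumes these nodes switches on kind and reads only the union member that the tag selects.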

Tiger Program

(a := 5; a+1) translates to:

A_SeqExp(2,
  A_ExpList(A_AssignExp(4,
              A_SimpleVar(2, S_Symbol("a")),
              A_IntExp(7,5)),
  A_ExpList((A_OpExp(11, A_plusOp,
              A_VarExp(A_SimpleVar(10, S_Symbol("a"))),
              A_IntExp(12,1))),
  NULL)))

• AssignExp uses the column of ":=" for pos

• OpExp uses the column of "+" for pos

Some Odd Tiger Features

Tiger allows mutually recursive declarations:

let var a := 5
    function f() : int = g(a)
    function g(i: int) = f()
in f()
end

Thus: FunctionDec constructor takes a list of functions
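
One reason a list helps: a later phase can enter every function header in the group before it checks any body, so f can call g even though g is declared afterwards. The sketch below shows that two-pass idea on a deliberately tiny model. FunDec, Env and check_group are hypothetical, and each function body is reduced to the single function name it calls.

```c
#include <assert.h>
#include <string.h>

enum { MAX_FUNCS = 16 };

/* Hypothetical, simplified declaration: a name and the one function its body calls. */
typedef struct {
    const char *name;
    const char *callee;
} FunDec;

/* Hypothetical environment: just the set of declared function names. */
typedef struct {
    const char *names[MAX_FUNCS];
    int count;
} Env;

static int declared(const Env *env, const char *name) {
    for (int i = 0; i < env->count; i++)
        if (strcmp(env->names[i], name) == 0)
            return 1;
    return 0;
}

/* Checks one group of adjacent function declarations.
   Returns the number of calls to undeclared functions. */
int check_group(Env *env, const FunDec *group, int n) {
    int errors = 0;
    /* Pass 1: enter every header in the group before looking at any body. */
    for (int i = 0; i < n; i++) {
        assert(env->count < MAX_FUNCS);
        env->names[env->count++] = group[i].name;
    }
    /* Pass 2: check the bodies; every function in the group is now visible. */
    for (int i = 0; i < n; i++)
        if (!declared(env, group[i].callee))
            errors++;
    return errors;
}
```

With f and g in the same group both calls resolve; if f were checked alone, its call to g would be reported as undeclared.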

Correlation to Yacc (and your project)

(Demo)

Checklist

1. Detailed look at the Tiger AST (absyn.h)

2. Edit tiger.grm

3. The Tiger Language Manual
• PA3 and PA4 make heavy use of it
• Follow the structure to generate your yacc file

