Fall 2016-2017 Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University of the Negev

Tentative syllabus

• Front End
– Scanning
– Top-down Parsing (LL)
– Bottom-up Parsing (LR)
• Intermediate Representation
– Operational Semantics
– Lowering
• Optimizations
– Dataflow Analysis
– Loop Optimizations
• Code Generation
– Register Allocation
– Energy Optimization
– Instruction Selection

mid-term exam

Previously

• LR(0) parsing
– Running the parser
– Constructing transition diagram
– Constructing parser table
– Detecting conflicts
• SLR(0)
– Eliminating conflicts via FOLLOW sets

Agenda

• LR(1)
• LALR(1)
• Automatic LR parser generation
• Handling ambiguities

Going beyond SLR(0)

• Some common language constructs introduce conflicts even for SLR

(0) S’ → S
(1) S → L = R
(2) S → R
(3) L → * R
(4) L → id
(5) R → L

[Figure: LR(0) automaton for the grammar]

q0: S’ → ∙ S, S → ∙ L = R, S → ∙ R, L → ∙ * R, L → ∙ id, R → ∙ L
q1: S’ → S ∙
q2: S → L ∙ = R, R → L ∙
q3: S → R ∙
q4: L → * ∙ R, R → ∙ L, L → ∙ * R, L → ∙ id
q5: L → id ∙
q6: S → L = ∙ R, R → ∙ L, L → ∙ * R, L → ∙ id
q7: L → * R ∙
q8: R → L ∙
q9: S → L = R ∙

Transitions:
q0: S → q1, L → q2, R → q3, * → q4, id → q5
q2: = → q6
q4: R → q7, L → q8, * → q4, id → q5
q6: R → q9, L → q8, * → q4, id → q5

shift/reduce conflict

• In q2: S → L ∙ = R (shift on =) vs. R → L ∙ (reduce)
• FOLLOW(R) contains =
– S ⇒ L = R ⇒ * R = R
• SLR cannot resolve the conflict

q2: S → L ∙ = R, R → L ∙
q2 –=→ q6
q6: S → L = ∙ R, R → ∙ L, L → ∙ * R, L → ∙ id
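The claim that FOLLOW(R) contains = can be checked mechanically. The following is a minimal sketch, not part of the lecture: the class name `FollowSetsDemo` and its helpers are hypothetical, and the code assumes a grammar without ε-productions (true for the example grammar), so FIRST of a sequence is FIRST of its first symbol.

```java
import java.util.*;

// Sketch only: FOLLOW sets for the lecture's example grammar (no epsilon productions).
class FollowSetsDemo {
    // Each production is {lhs, rhs symbols...}.
    static final String[][] PRODUCTIONS = {
        {"S'", "S"},
        {"S", "L", "=", "R"},
        {"S", "R"},
        {"L", "*", "R"},
        {"L", "id"},
        {"R", "L"},
    };
    static final Set<String> NONTERMINALS = Set.of("S'", "S", "L", "R");

    // FIRST of one symbol: a terminal is its own FIRST set.
    static Set<String> firstOf(String symbol, Map<String, Set<String>> first) {
        return NONTERMINALS.contains(symbol) ? first.get(symbol) : Set.of(symbol);
    }

    static Map<String, Set<String>> followSets() {
        Map<String, Set<String>> first = new HashMap<>();
        Map<String, Set<String>> follow = new HashMap<>();
        for (String n : NONTERMINALS) {
            first.put(n, new TreeSet<>());
            follow.put(n, new TreeSet<>());
        }
        follow.get("S'").add("$");               // end of input follows the start symbol
        boolean changed = true;
        while (changed) {                         // iterate until a fixed point is reached
            changed = false;
            for (String[] p : PRODUCTIONS) {
                // FIRST(lhs) includes FIRST of the first rhs symbol.
                changed |= first.get(p[0]).addAll(new ArrayList<>(firstOf(p[1], first)));
                for (int i = 1; i < p.length; i++) {
                    if (!NONTERMINALS.contains(p[i])) continue;
                    // What can follow p[i]: FIRST of the next symbol, or FOLLOW(lhs) at the end.
                    Set<String> add = (i + 1 < p.length)
                        ? firstOf(p[i + 1], first)
                        : follow.get(p[0]);
                    changed |= follow.get(p[i]).addAll(new ArrayList<>(add));
                }
            }
        }
        return follow;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> follow = followSets();
        System.out.println("FOLLOW(R) = " + follow.get("R"));
        System.out.println("'=' in FOLLOW(R): " + follow.get("R").contains("="));
    }
}
```

Because = is in FOLLOW(R), an SLR table would place both "shift" and "reduce R → L" in the entry for q2 on =, which is the conflict described above.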

Inputs requiring shift/reduce

• For the input id, the rightmost derivation S’ ⇒ S ⇒ R ⇒ L ⇒ id requires reducing in q2
• For the input id = id, the rightmost derivation S’ ⇒ S ⇒ L = R ⇒ L = L ⇒ L = id ⇒ id = id requires shifting in q2

LR(1) grammars

• In SLR: a reduce item N → α ∙ is applicable only when the lookahead is in FOLLOW(N)
• But for a given context (state), are all tokens in FOLLOW(N) indeed possible?
– Not always
– We can compute a context-sensitive (i.e., specific to a given state) subset of FOLLOW(N) and use it to remove even more conflicts
• LR(1) keeps a lookahead with each LR item
• Idea: a more refined notion of FOLLOW computed per item

LR(1) item

[N → α ∙ β, t]

• α: the part of the input already matched
• β: the part of the input still to be matched
• Hypothesis about αβ being a possible handle: so far we’ve matched α, expecting to see β, and after reducing to N we expect to see the token t

LR(1) items

• An LR(1) item is a pair
– LR(0) item
– Lookahead token
• Meaning
– We matched the part left of the dot, looking to match the part on the right of the dot, followed by the lookahead token
• Example
– The production L → id yields the following LR(0) items:
[L → ● id]
[L → id ●]
– and the following LR(1) items:
[L → ● id, *]
[L → ● id, =]
[L → ● id, id]
[L → ● id, $]
[L → id ●, *]
[L → id ●, =]
[L → id ●, id]
[L → id ●, $]

Computing Closure for LR(1)

• For every [A → α ● Bβ, c] in S
– for every production B → δ and every token b in the grammar such that b ∈ FIRST(βc)
– Add [B → ● δ, b] to S
• Repeat until no new items can be added to S
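The closure rule can be sketched as a worklist algorithm. This is an illustrative sketch rather than the lecture's code: the class `Lr1ClosureDemo`, the string encoding of items ("production,dot,lookahead") and the ASCII rendering are assumptions made for the example, and the code relies on the example grammar having no ε-productions, so FIRST(βc) is FIRST of the first symbol of βc.

```java
import java.util.*;

// Sketch only: LR(1) closure for the lecture's example grammar (no epsilon productions).
class Lr1ClosureDemo {
    // Each production is {lhs, rhs symbols...}, numbered (0)..(5) as on the slides.
    static final String[][] P = {
        {"S'", "S"}, {"S", "L", "=", "R"}, {"S", "R"},
        {"L", "*", "R"}, {"L", "id"}, {"R", "L"},
    };
    static final Set<String> NT = Set.of("S'", "S", "L", "R");

    // FIRST of a single symbol; 'visiting' stops the recursion from looping.
    static Set<String> first(String sym, Set<String> visiting) {
        if (!NT.contains(sym)) return Set.of(sym);
        Set<String> out = new TreeSet<>();
        if (!visiting.add(sym)) return out;          // already being expanded
        for (String[] p : P)
            if (p[0].equals(sym)) out.addAll(first(p[1], visiting));
        return out;
    }

    // Items are encoded as "production,dot,lookahead", e.g. "0,0,$" for [S' -> . S, $].
    static Set<String> closure(Set<String> kernel) {
        Set<String> items = new LinkedHashSet<>(kernel);
        Deque<String> work = new ArrayDeque<>(kernel);
        while (!work.isEmpty()) {
            String[] it = work.pop().split(",");
            String[] p = P[Integer.parseInt(it[0])];
            int dot = Integer.parseInt(it[1]);       // number of rhs symbols matched
            if (dot + 1 >= p.length) continue;       // dot at the end: nothing to expand
            String b = p[dot + 1];                   // symbol right after the dot
            if (!NT.contains(b)) continue;           // only nonterminals are expanded
            // First symbol of (beta c): the symbol after B, or the lookahead c.
            String next = (dot + 2 < p.length) ? p[dot + 2] : it[2];
            for (String la : first(next, new HashSet<>()))
                for (int i = 0; i < P.length; i++)
                    if (P[i][0].equals(b) && items.add(i + ",0," + la))
                        work.push(i + ",0," + la);
        }
        return items;
    }

    // Render an encoded item as [A -> alpha . beta, t].
    static String show(String item) {
        String[] it = item.split(",");
        String[] p = P[Integer.parseInt(it[0])];
        int dot = Integer.parseInt(it[1]);
        StringBuilder sb = new StringBuilder("[" + p[0] + " ->");
        for (int i = 1; i <= p.length; i++) {
            if (i == dot + 1) sb.append(" .");
            if (i < p.length) sb.append(" ").append(p[i]);
        }
        return sb.append(", ").append(it[2]).append("]").toString();
    }

    public static void main(String[] args) {
        for (String item : closure(Set.of("0,0,$"))) System.out.println(show(item));
    }
}
```

Starting from the single kernel item [S’ → ∙ S, $], the sketch yields eight items: the L-items appear with both = and $ as lookahead, while R → ∙ L appears only with $.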

[Figure: LR(1) automaton for the grammar]

q0: (S’ → ∙ S, $), (S → ∙ L = R, $), (S → ∙ R, $), (L → ∙ * R, =), (L → ∙ id, =), (R → ∙ L, $), (L → ∙ id, $), (L → ∙ * R, $)
q1: (S’ → S ∙, $)
q2: (S → L ∙ = R, $), (R → L ∙, $)
q3: (S → R ∙, $)
q4: (L → * ∙ R, =), (R → ∙ L, =), (L → ∙ * R, =), (L → ∙ id, =), (L → * ∙ R, $), (R → ∙ L, $), (L → ∙ * R, $), (L → ∙ id, $)
q5: (L → id ∙, $), (L → id ∙, =)
q6: (S → L = ∙ R, $), (R → ∙ L, $), (L → ∙ * R, $), (L → ∙ id, $)
q7: (L → * R ∙, =), (L → * R ∙, $)
q8: (R → L ∙, =), (R → L ∙, $)
q9: (S → L = R ∙, $)
q10: (L → * ∙ R, $), (R → ∙ L, $), (L → ∙ * R, $), (L → ∙ id, $)
q11, q12: (L → id ∙, $) and (R → L ∙, $)
q13: (L → * R ∙, $)

Transitions are labelled S, L, R, id, *, = as in the LR(0) automaton. States q10–q13 are the copies reached after =, whose items carry only the lookahead $.

Back to the conflict

• Is there a conflict now?
– No: in q2 the reduce item (R → L ∙, $) applies only on lookahead $, while the shift applies only on =

q2: (S → L ∙ = R, $), (R → L ∙, $)
q2 –=→ q6
q6: (S → L = ∙ R, $), (R → ∙ L, $), (L → ∙ * R, $), (L → ∙ id, $)

LALR(1)

• LR(1) tables have a huge number of entries
• Often we don’t need such refined observation (and cost)
• Idea: find states with the same LR(0) component and merge their lookahead components as long as there are no conflicts
• LALR(1) is not as powerful as LR(1) in theory but works quite well in practice
– Merging cannot introduce new shift-reduce conflicts, only reduce-reduce conflicts, which are unlikely in practice
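The merging idea can be illustrated with a small sketch. It is not taken from the lecture: the class `LalrMergeDemo`, the method `mergeByCore` and the "core|lookahead" string encoding of items are hypothetical, and the sketch only groups states and unions lookaheads; it does not rebuild transitions or check for conflicts.

```java
import java.util.*;

// Sketch only: merge LR(1) states that share the same LR(0) core.
class LalrMergeDemo {
    // Each state is a set of items written "core|lookahead", e.g. "L -> id .|$".
    // Result: core set -> (core item -> union of lookaheads over all merged states).
    static Map<Set<String>, Map<String, Set<String>>> mergeByCore(List<Set<String>> states) {
        Map<Set<String>, Map<String, Set<String>>> merged = new LinkedHashMap<>();
        for (Set<String> state : states) {
            // The LR(0) core is the state with lookaheads stripped.
            Set<String> core = new TreeSet<>();
            for (String item : state) core.add(item.split("\\|")[0]);
            Map<String, Set<String>> lookaheads =
                merged.computeIfAbsent(core, k -> new TreeMap<>());
            for (String item : state) {
                String[] parts = item.split("\\|");
                lookaheads.computeIfAbsent(parts[0], k -> new TreeSet<>()).add(parts[1]);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Set<String>> lr1States = List.of(
            Set.of("L -> id .|$", "L -> id .|="),      // state with both lookaheads
            Set.of("L -> id .|$"),                     // its $-only copy
            Set.of("L -> * R .|=", "L -> * R .|$"),
            Set.of("L -> * R .|$"),
            Set.of("S -> L . = R|$", "R -> L .|$"));
        // Five LR(1) states collapse into three LALR(1) states.
        mergeByCore(lr1States).forEach((core, la) -> System.out.println(core + " : " + la));
    }
}
```

In the example automaton the $-only copies are subsets of existing states, so merging adds no lookaheads and therefore no conflicts.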

[Figure, three animation steps: the LR(1) automaton for the grammar (states q0–q13), with states that share the same LR(0) component merged step by step. In the last step shown, the $-only states q11, q12 and q13 have been merged away and only q10, with items (L → * ∙ R, $), (R → ∙ L, $), (L → ∙ * R, $), (L → ∙ id, $), remains next to q0–q9.]

Left/Right recursion

• At home: create a simple grammar with left-recursion and one with right-recursion
• Construct the corresponding LR(0) parsers
– Any conflicts?
• Run on simple input and observe behavior
– Attempt to generalize the observation for long inputs

Example: non-LR(1) grammar

(1) S → Y b c $
(2) S → Z b d $
(3) Y → a
(4) Z → a

Initial state: [S → ∙ Y b c, $], [S → ∙ Z b d, $], [Y → ∙ a, b], [Z → ∙ a, b]
After the transition on a: [Y → a ∙, b], [Z → a ∙, b]

reduce-reduce conflict on lookahead ‘b’

Automated parser generation (via CUP)

High-level structure

• LANG.lex (lexer spec) → JFlex → Lexer.java → javac → Lexical analyzer
• LANG.cup (parser spec) → CUP → Parser.java, sym.java → javac → Parser
• text → Lexical analyzer → tokens → Parser → AST
• (Token.java)

Expression calculator

expr → expr + expr
     | expr - expr
     | expr * expr
     | expr / expr
     | - expr
     | ( expr )
     | number

Goals of expression calculator parser:
• Is 2+3+4+5 a valid expression?
• What is the meaning (value) of this expression?

Syntax analysis with CUP

CUP – parser generator
Generates an LALR(1) Parser
Input: spec file
Output: a syntax analyzer
Can dump automaton and table

[Figure: parser spec → CUP → .java → javac → Parser; tokens → Parser → AST]

CUP spec file

• Package and import specifications
• User code components
• Symbol (terminal and non-terminal) lists
– Terminals go to sym.java
– Types of AST nodes
• Precedence declarations
• The grammar
– Semantic actions to construct AST

Parsing ambiguous grammars

Expression Calculator – 1st Attempt

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
non terminal Integer expr;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr
       | LPAREN expr RPAREN
       | NUMBER
       ;

• Symbol type (Integer): explained later

Ambiguities

• a + b * c has two parse trees: a + (b * c) and (a + b) * c
• a + b + c has two parse trees: (a + b) + c and a + (b + c)

Ambiguities as conflicts for LR(1)

• a + b + c: (a + b) + c vs. a + (b + c)
• a + b * c: a + (b * c) vs. (a + b) * c
– In both cases, after a + b the parser can either reduce expr + expr or shift the next operator: a shift/reduce conflict

Expression Calculator – 2nd Attempt

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;
non terminal Integer expr;

precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

• Increasing precedence: the precedence declarations are listed from lowest to highest
• Contextual precedence: %prec UMINUS

Parsing ambiguous grammars using precedence declarations

• Each terminal is assigned a precedence
– By default all terminals have lowest precedence
– The user can assign precedences explicitly
– CUP assigns each production a precedence
  • Precedence of the rightmost terminal in the production
  • or user-specified contextual precedence
• On a shift/reduce conflict, CUP resolves the ambiguity by comparing the precedence of the terminal with that of the production and decides whether to shift or reduce
– Higher precedence for the terminal means shift
– Higher precedence for the production means reduce
• In case of equal precedences, left/right help resolve conflicts
– left means reduce
– right means shift
• More information on precedence declarations in CUP’s manual
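The resolution rule can be written down as a small decision function. The following is a sketch of the rule as stated on the slide, not CUP's actual implementation: the class `PrecedenceDemo`, its tables and the treatment of undeclared terminals (level 0, unresolved on a tie) are assumptions made for illustration. The levels mirror the declarations precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS.

```java
import java.util.*;

// Sketch only: resolving a shift/reduce conflict with precedence and associativity.
class PrecedenceDemo {
    enum Assoc { LEFT, RIGHT, NONASSOC }
    enum Action { SHIFT, REDUCE, ERROR }

    // Hypothetical tables: a higher number means a later declaration, i.e. higher precedence.
    static final Map<String, Integer> LEVEL = Map.of(
        "PLUS", 1, "MINUS", 1, "MULT", 2, "DIV", 2, "UMINUS", 3);
    static final Map<Integer, Assoc> ASSOC = Map.of(
        1, Assoc.LEFT, 2, Assoc.LEFT, 3, Assoc.LEFT);

    // productionPrec: the terminal that gives the production its precedence
    // (its rightmost terminal, or the one named by %prec).
    // lookahead: the terminal the parser could shift.
    static Action resolve(String productionPrec, String lookahead) {
        int prod = LEVEL.getOrDefault(productionPrec, 0);   // assumption: undeclared = level 0
        int tok = LEVEL.getOrDefault(lookahead, 0);
        if (prod > tok) return Action.REDUCE;
        if (tok > prod) return Action.SHIFT;
        // Equal precedence: associativity decides.
        switch (ASSOC.getOrDefault(tok, Assoc.NONASSOC)) {
            case LEFT:  return Action.REDUCE;
            case RIGHT: return Action.SHIFT;
            default:    return Action.ERROR;                // nonassoc or undeclared
        }
    }

    public static void main(String[] args) {
        System.out.println("a + b . + c   : " + resolve("PLUS", "PLUS"));
        System.out.println("a + b . * c   : " + resolve("PLUS", "MULT"));
        System.out.println("a * b . + c   : " + resolve("MULT", "PLUS"));
        System.out.println("- a . * b     : " + resolve("UMINUS", "MULT"));
    }
}
```

With these tables, a + b followed by + reduces (left associativity), a + b followed by * shifts (MULT binds tighter), and - a followed by * reduces because the unary-minus production carries the precedence of UMINUS.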

Resolving ambiguity (associativity)

• a + b + c: (a + b) + c vs. a + (b + c)
• precedence left PLUS selects (a + b) + c

Resolving ambiguity (op. precedence)

• a + b * c: a + (b * c) vs. (a + b) * c
• precedence left PLUS followed by precedence left MULT selects a + (b * c)

Resolving ambiguity (contextual)

• - a * b: -(a * b) vs. (-a) * b
• precedence left MULT together with MINUS expr %prec UMINUS selects (-a) * b

Resolving ambiguity

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;

precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

• MINUS expr %prec UMINUS: the rule has the precedence of UMINUS
• UMINUS is never returned by the scanner (used only to define precedence)

More CUP directives

• precedence nonassoc NEQ
– Non-associative operators: < > == != etc.
– 1<2<3 identified as an error (semantic error?)
• start with non-terminal
– Specifies a start non-terminal other than the first non-terminal
– Can change to test parts of the grammar
• Getting internal representation
– Command line options:
  • -dump_grammar
  • -dump_states
  • -dump_tables
  • -dump

Scanner integration

import java_cup.runtime.*;
%%
%cup
%eofval{
  return new Symbol(sym.EOF);
%eofval}
NUMBER=[0-9]+
%%
<YYINITIAL>"+" { return new Symbol(sym.PLUS); }
<YYINITIAL>"-" { return new Symbol(sym.MINUS); }
<YYINITIAL>"*" { return new Symbol(sym.MULT); }
<YYINITIAL>"/" { return new Symbol(sym.DIV); }
<YYINITIAL>"(" { return new Symbol(sym.LPAREN); }
<YYINITIAL>")" { return new Symbol(sym.RPAREN); }
<YYINITIAL>{NUMBER} {
  return new Symbol(sym.NUMBER, Integer.valueOf(yytext()));
}
<YYINITIAL>\n { }
<YYINITIAL>. { }

• Parser gets terminals from the scanner
• The sym constants are generated from token declarations in the .cup file

Recap

• Package and import specifications and user code components
• Symbol (terminal and non-terminal) lists
– Define building-blocks of the grammar
• Precedence declarations
– May help resolve conflicts
• The grammar
– May introduce conflicts that have to be resolved

Abstract syntax tree construction

Assigning meaning

• So far, only validation
• Add Java code implementing semantic actions

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

Assigning meaning

• Symbol labels used to name variables
• RESULT names the left-hand side symbol

non terminal Integer expr;

expr ::= expr:e1 PLUS expr:e2
           {: RESULT = Integer.valueOf(e1.intValue() + e2.intValue()); :}
       | expr:e1 MINUS expr:e2
           {: RESULT = Integer.valueOf(e1.intValue() - e2.intValue()); :}
       | expr:e1 MULT expr:e2
           {: RESULT = Integer.valueOf(e1.intValue() * e2.intValue()); :}
       | expr:e1 DIV expr:e2
           {: RESULT = Integer.valueOf(e1.intValue() / e2.intValue()); :}
       | MINUS expr:e1
           {: RESULT = Integer.valueOf(0 - e1.intValue()); :} %prec UMINUS
       | LPAREN expr:e1 RPAREN
           {: RESULT = e1; :}
       | NUMBER:n
           {: RESULT = n; :}
       ;

Abstract Syntax Trees

• More useful representation of syntax tree
– Less clutter
– Actual level of detail depends on your design
• Basis for semantic analysis
• Later annotated with various information
– Type information
– Computed values
• Technically – a class hierarchy of abstract syntax tree nodes

Parse tree vs. AST

[Figure: for the input 1 + (2) + (3), the parse tree has an expr node for every production applied, including the parenthesized sub-expressions; the AST keeps only the two + nodes and the leaves 1, 2 and 3.]

AST hierarchy example

• expr
– int_const
– plus
– minus
– times
– divide

AST construction

• AST Nodes constructed during parsing
– Stored in push-down stack
• Bottom-up parser
– Grammar rules annotated with actions for AST construction
– When node is constructed all children available (already constructed)
– Node (RESULT) pushed on stack
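A class hierarchy of AST nodes could look like the following sketch. It is an assumption-laden illustration rather than the course's code: the class names `Expr`, `IntConst`, `Plus` and `Times` follow Java naming instead of the lowercase names on the slides, and the `eval` method is added only to show that every child is already available when a parent node is built.

```java
// Sketch only: a tiny AST hierarchy with an evaluator (names are hypothetical).
abstract class Expr {
    abstract int eval();
}

class IntConst extends Expr {
    final int val;
    IntConst(int val) { this.val = val; }
    @Override int eval() { return val; }
    @Override public String toString() { return Integer.toString(val); }
}

class Plus extends Expr {
    final Expr e1, e2;
    Plus(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    @Override int eval() { return e1.eval() + e2.eval(); }
    @Override public String toString() { return "(" + e1 + " + " + e2 + ")"; }
}

class Times extends Expr {
    final Expr e1, e2;
    Times(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    @Override int eval() { return e1.eval() * e2.eval(); }
    @Override public String toString() { return "(" + e1 + " * " + e2 + ")"; }
}

class AstDemo {
    // The AST a bottom-up parser would build for 1 + (2) + (3) with left-associative +:
    // children are constructed first, then passed to the parent's constructor.
    static Expr sample() {
        Expr one = new IntConst(1);
        Expr two = new IntConst(2);
        Expr inner = new Plus(one, two);       // built when "expr + expr" is first reduced
        Expr three = new IntConst(3);
        return new Plus(inner, three);         // built at the final reduction
    }

    public static void main(String[] args) {
        Expr ast = sample();
        System.out.println(ast + " = " + ast.eval());
    }
}
```

The parentheses of the input leave no trace in the tree: the action for LPAREN expr RPAREN simply passes the inner node on.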

AST construction

expr ::= expr:e1 PLUS expr:e2
           {: RESULT = new plus(e1,e2); :}
       | LPAREN expr:e RPAREN
           {: RESULT = e; :}
       | INT_CONST:i
           {: RESULT = new int_const(…, i); :}

Parsing 1 + (2) + (3), sentential forms after successive reductions:
1 + (2) + (3)
expr + (2) + (3)
expr + (expr) + (3)
expr + (3)
expr + (expr)
expr

Resulting AST nodes:
• int_const (val = 1), int_const (val = 2), int_const (val = 3)
• plus (e1, e2) over 1 and 2
• plus (e1, e2) over the first plus node and 3

Example of lists

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV, LPAREN, RPAREN, SEMI;
terminal UMINUS;
non terminal Integer expr;
non terminal expr_list, expr_part;

precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr_list ::= expr_list expr_part
            | expr_part
            ;

expr_part ::= expr:e {: System.out.println("= " + e); :} SEMI
            ;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

• The action in expr_part is executed when e is shifted, before SEMI is read

Next lecture: IR and Operational Semantics