
Automated Parser Generation (via CUP)

Date post: 23-Feb-2016
Upload: robbin
Page 1: Automated Parser Generation (via  CUP )

Automated Parser Generation (via CUP)

Page 2: Automated Parser Generation (via  CUP )

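High-level structure

[Diagram: how the lexical analyzer and the parser are generated and used]

Lexer spec (TPL.lex) → JFlex → .java (Lexer.java, (Token.java)) → javac → Lexical analyzer
Parser spec (TPL.cup) → CUP → .java (sym.java, Parser.java) → javac → Parser

At run time: text → Lexical analyzer → tokens → Parser → AST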

Page 3: Automated Parser Generation (via  CUP )

Expression calculator

expr → expr + expr
     | expr - expr
     | expr * expr
     | expr / expr
     | - expr
     | ( expr )
     | number

Goals of expression calculator parser:
• Is 2+3+4+5 a valid expression?
• What is the meaning (value) of this expression?

Page 4: Automated Parser Generation (via  CUP )

Syntax analysis with CUP

[Diagram: Parser spec → CUP → .java → javac → Parser; at run time: tokens → Parser → AST]

CUP: parser generator
• Generates an LALR(1) parser
• Input: spec file
• Output: a syntax analyzer

Page 5: Automated Parser Generation (via  CUP )

CUP spec file

• Package and import specifications
• User code components
• Symbol (terminal and non-terminal) lists
  – Terminals go to sym.java
  – Types of AST nodes
• Precedence declarations
• The grammar
  – Semantic actions to construct AST

Page 6: Automated Parser Generation (via  CUP )

Expression Calculator – 1st Attempt

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;

non terminal Integer expr;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr
       | LPAREN expr RPAREN
       | NUMBER
       ;

Note: the symbol type (Integer) is explained later.

Page 7: Automated Parser Generation (via  CUP )

Ambiguities

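[Figure: two parse trees for each expression]
• a * b + c: either (a * b) + c, with + at the root, or a * (b + c), with * at the root
• a + b + c: either (a + b) + c or a + (b + c)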
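Why the choice of tree matters can be checked with plain arithmetic. The following is a small illustrative sketch, not part of the original slides; the class and method names are hypothetical. It evaluates both groupings of a * b + c, and also both groupings of a - b - c, which has the same grammatical shape as a + b + c but makes the difference visible in the value.

```java
// Illustrative only: each method hard-codes one of the two parse trees.
class AmbiguityDemo {
    // Tree with + at the root: (a * b) + c
    static int timesFirst(int a, int b, int c) { return (a * b) + c; }

    // Tree with * at the root: a * (b + c)
    static int plusFirst(int a, int b, int c) { return a * (b + c); }

    // Left-grouped tree: (a - b) - c
    static int leftGrouped(int a, int b, int c) { return (a - b) - c; }

    // Right-grouped tree: a - (b - c)
    static int rightGrouped(int a, int b, int c) { return a - (b - c); }

    public static void main(String[] args) {
        System.out.println(timesFirst(2, 3, 4));   // prints 10
        System.out.println(plusFirst(2, 3, 4));    // prints 14
        System.out.println(leftGrouped(10, 4, 3)); // prints 3
        System.out.println(rightGrouped(10, 4, 3)); // prints 9
    }
}
```

For a + b + c the two trees happen to give the same integer value, but they are still different trees, which is why the grammar is ambiguous even in that case.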

Page 8: Automated Parser Generation (via  CUP )

Ambiguities as conflicts for LR(1)

[Figure: two parse trees for a * b + c, (a * b) + c and a * (b + c), and two parse trees for a + b + c, (a + b) + c and a + (b + c)]

Page 9: Automated Parser Generation (via  CUP )

Expression Calculator – 2nd Attempt

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;
non terminal Integer expr;

// Precedence increases from the first declaration to the last
precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS    // contextual precedence
       | LPAREN expr RPAREN
       | NUMBER
       ;

Page 10: Automated Parser Generation (via  CUP )

Parsing ambiguous grammars using precedence declarations

• Each terminal is assigned a precedence
  – By default, all terminals have the lowest precedence
  – The user can assign precedences explicitly
  – CUP assigns each production a precedence
    • The precedence of the rightmost terminal in the production
    • Or a user-specified contextual precedence
• On a shift/reduce conflict, CUP compares the precedence of the lookahead terminal with that of the production: it shifts if the terminal's precedence is higher and reduces if the production's is higher
• When the precedences are equal, left/right associativity resolves the conflict
  – left means reduce
  – right means shift
• More information on precedence declarations is in CUP’s manual
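The resolution rule above can be written as a small decision function. This is a minimal sketch of the rule as the slide states it, not CUP's actual source code; the class name, the enum names, and the numeric precedence levels are all assumptions made for illustration.

```java
// Sketch of shift/reduce conflict resolution by precedence and associativity.
// All names and the numeric levels are hypothetical, not CUP's internals.
class PrecedenceDemo {
    enum Assoc { LEFT, RIGHT, NONASSOC }
    enum Action { SHIFT, REDUCE, ERROR }

    // productionPrec: precedence of the production that could be reduced
    // terminalPrec:   precedence of the lookahead terminal
    // terminalAssoc:  associativity declared for the lookahead terminal
    static Action resolve(int productionPrec, int terminalPrec, Assoc terminalAssoc) {
        if (terminalPrec > productionPrec) return Action.SHIFT;
        if (terminalPrec < productionPrec) return Action.REDUCE;
        // Equal precedence: associativity decides.
        switch (terminalAssoc) {
            case LEFT:  return Action.REDUCE;
            case RIGHT: return Action.SHIFT;
            default:    return Action.ERROR; // nonassoc: e.g. 1<2<3 is rejected
        }
    }

    public static void main(String[] args) {
        final int PLUS = 1, MULT = 2; // later declarations get higher levels
        // "expr PLUS expr" on the stack, lookahead PLUS: group to the left
        System.out.println(resolve(PLUS, PLUS, Assoc.LEFT)); // prints REDUCE
        // "expr PLUS expr" on the stack, lookahead MULT: MULT binds tighter
        System.out.println(resolve(PLUS, MULT, Assoc.LEFT)); // prints SHIFT
        // "expr MULT expr" on the stack, lookahead PLUS: finish the product
        System.out.println(resolve(MULT, PLUS, Assoc.LEFT)); // prints REDUCE
    }
}
```

The three calls in main correspond to parsing a + b + c and a * b + c with PLUS and MULT both declared left-associative and MULT declared after PLUS.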

Page 11: Automated Parser Generation (via  CUP )

Resolving ambiguity

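a + b + c

[Figure: the two candidate parse trees, (a + b) + c and a + (b + c)]

precedence left PLUS

Left associativity means reduce, so the parser builds (a + b) + c.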

Page 12: Automated Parser Generation (via  CUP )

Resolving ambiguity

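a * b + c

[Figure: the two candidate parse trees, (a * b) + c and a * (b + c)]

precedence left PLUS
precedence left MULT

MULT is declared after PLUS and so has higher precedence: the parser builds (a * b) + c.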

Page 13: Automated Parser Generation (via  CUP )

Resolving ambiguity

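- a * b

[Figure: the two candidate parse trees, -(a * b) and (-a) * b]

precedence left MULT
MINUS expr %prec UMINUS

With UMINUS declared after MULT, the rule MINUS expr has higher precedence than MULT: the parser builds (-a) * b.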

Page 14: Automated Parser Generation (via  CUP )

Resolving ambiguity

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV;
terminal LPAREN, RPAREN;
terminal UMINUS;    // never returned by the scanner (used only to define precedence)

precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS    // rule has the precedence of UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

Page 15: Automated Parser Generation (via  CUP )

More CUP directives

• precedence nonassoc NEQ
  – Non-associative operators: < > == != etc.
  – 1<2<3 identified as an error (semantic error?)
• start with non-terminal
  – Specifies a start non-terminal other than the first non-terminal
  – Can be changed to test parts of the grammar
• Getting the internal representation
  – Command-line options:
    • -dump_grammar
    • -dump_states
    • -dump_tables
    • -dump

Page 16: Automated Parser Generation (via  CUP )

Scanner integration

import java_cup.runtime.*;
%%
%cup
%eofval{
  return new Symbol(sym.EOF);
%eofval}
NUMBER=[0-9]+
%%
<YYINITIAL>"+" { return new Symbol(sym.PLUS); }
<YYINITIAL>"-" { return new Symbol(sym.MINUS); }
<YYINITIAL>"*" { return new Symbol(sym.MULT); }
<YYINITIAL>"/" { return new Symbol(sym.DIV); }
<YYINITIAL>"(" { return new Symbol(sym.LPAREN); }
<YYINITIAL>")" { return new Symbol(sym.RPAREN); }
<YYINITIAL>{NUMBER} { return new Symbol(sym.NUMBER, Integer.valueOf(yytext())); }
<YYINITIAL>\n { }
<YYINITIAL>. { }

• The parser gets terminals from the scanner
• The sym constants are generated from the token declarations in the .cup file
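To make the scanner's job concrete without running JFlex, here is a hand-written stand-in that applies the same rules in plain Java. It is only a sketch under stated assumptions: the Token class and the Kind enum are hypothetical replacements for java_cup.runtime.Symbol and the generated sym constants, and the method returns the whole token list at once rather than one token per call.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-in for the generated scanner (illustration only).
// Token and Kind are hypothetical; the real scanner returns
// java_cup.runtime.Symbol objects tagged with constants from sym.
class MiniScanner {
    enum Kind { PLUS, MINUS, MULT, DIV, LPAREN, RPAREN, NUMBER, EOF }

    static final class Token {
        final Kind kind;
        final Integer value; // set only for NUMBER tokens
        Token(Kind kind, Integer value) { this.kind = kind; this.value = value; }
        @Override public String toString() {
            return value == null ? kind.name() : kind.name() + "(" + value + ")";
        }
    }

    static List<Token> scan(String text) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c >= '0' && c <= '9') {
                // NUMBER=[0-9]+ : take the longest run of digits
                int start = i;
                while (i < text.length() && text.charAt(i) >= '0' && text.charAt(i) <= '9') i++;
                tokens.add(new Token(Kind.NUMBER, Integer.valueOf(text.substring(start, i))));
                continue;
            }
            switch (c) {
                case '+': tokens.add(new Token(Kind.PLUS, null)); break;
                case '-': tokens.add(new Token(Kind.MINUS, null)); break;
                case '*': tokens.add(new Token(Kind.MULT, null)); break;
                case '/': tokens.add(new Token(Kind.DIV, null)); break;
                case '(': tokens.add(new Token(Kind.LPAREN, null)); break;
                case ')': tokens.add(new Token(Kind.RPAREN, null)); break;
                default: break; // like the "\n" and "." rules: skip anything else
            }
            i++;
        }
        tokens.add(new Token(Kind.EOF, null)); // like the %eofval block
        return tokens;
    }

    public static void main(String[] args) {
        // prints [NUMBER(12), PLUS, NUMBER(3), MULT, LPAREN, NUMBER(4), RPAREN, EOF]
        System.out.println(scan("12 + 3*(4)"));
    }
}
```

Note that no UMINUS token is ever produced: both unary and binary minus reach the parser as MINUS.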

Page 17: Automated Parser Generation (via  CUP )

Recap

• Package and import specifications and user code components
• Symbol (terminal and non-terminal) lists
  – Define building-blocks of the grammar
• Precedence declarations
  – May help resolve conflicts
• The grammar
  – May introduce conflicts that have to be resolved

Page 18: Automated Parser Generation (via  CUP )

Assigning meaning

• So far, only validation
• Add Java code implementing semantic actions

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

Page 19: Automated Parser Generation (via  CUP )

Assigning meaning

• Symbol labels used to name variables
• RESULT names the left-hand side symbol

expr ::= expr:e1 PLUS expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() + e2.intValue()); :}
       | expr:e1 MINUS expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() - e2.intValue()); :}
       | expr:e1 MULT expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() * e2.intValue()); :}
       | expr:e1 DIV expr:e2
         {: RESULT = Integer.valueOf(e1.intValue() / e2.intValue()); :}
       | MINUS expr:e1
         {: RESULT = Integer.valueOf(0 - e1.intValue()); :} %prec UMINUS
       | LPAREN expr:e1 RPAREN
         {: RESULT = e1; :}
       | NUMBER:n
         {: RESULT = n; :}
       ;
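What the actions compute can be traced by hand. The sketch below is an illustration rather than generated parser code: the class ActionDemo and its methods are hypothetical, and the order of the calls in main is written out manually to match the order in which a bottom-up parser with the precedence declarations above would reduce -2 + 3 * 4.

```java
// Illustration of the semantic actions as plain Java methods.
// The names are hypothetical; a real CUP parser runs the action code
// itself each time it reduces by the corresponding production.
class ActionDemo {
    // expr ::= expr:e1 OP expr:e2  {: RESULT = e1 OP e2; :}
    static Integer binary(String op, Integer e1, Integer e2) {
        switch (op) {
            case "PLUS":  return e1 + e2;
            case "MINUS": return e1 - e2;
            case "MULT":  return e1 * e2;
            case "DIV":   return e1 / e2; // integer division, as in the slide
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // expr ::= MINUS expr:e1  {: RESULT = 0 - e1; :}
    static Integer negate(Integer e1) {
        return 0 - e1;
    }

    public static void main(String[] args) {
        // -2 + 3 * 4, reduced in the order the precedences dictate
        Integer neg = negate(2);                  // unary minus binds tightest
        Integer prod = binary("MULT", 3, 4);      // then the product
        Integer sum = binary("PLUS", neg, prod);  // finally the sum
        System.out.println(sum); // prints 10
    }
}
```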

Page 20: Automated Parser Generation (via  CUP )

Building an AST

• More useful representation of syntax tree
  – Less clutter
  – Actual level of detail depends on your design
• Basis for semantic analysis
• Later annotated with various information
  – Type information
  – Computed values
• Technically: a class hierarchy of abstract syntax tree nodes

Page 21: Automated Parser Generation (via  CUP )

Parse tree vs. AST

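[Figure: the parse tree for 1 + (2) + (3), with an expr node for every reduction and leaves for the parentheses, shown next to the AST, which keeps only the two + nodes and the constants 1, 2 and 3]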

Page 22: Automated Parser Generation (via  CUP )

AST hierarchy example

[Diagram: class hierarchy with expr as the base class and int_const, plus, minus, times, divide as its subclasses]
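One way such a hierarchy could look in Java is sketched below. The class names are Java-style stand-ins for the names in the diagram (Expr for expr, IntConst for int_const, and so on), and the eval method is an assumption added here to show what a node can be used for; the slides do not prescribe any particular fields or methods.

```java
// Sketch of an AST class hierarchy (names and eval() are assumptions).
class AstDemo {
    public static void main(String[] args) {
        // AST for 1 + 2 * 3
        Expr tree = new Plus(new IntConst(1), new Times(new IntConst(2), new IntConst(3)));
        System.out.println(tree + " = " + tree.eval()); // prints (1 + (2 * 3)) = 7
    }
}

abstract class Expr {
    abstract int eval();
}

final class IntConst extends Expr {
    final int val;
    IntConst(int val) { this.val = val; }
    @Override int eval() { return val; }
    @Override public String toString() { return Integer.toString(val); }
}

final class Plus extends Expr {
    final Expr e1, e2;
    Plus(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    @Override int eval() { return e1.eval() + e2.eval(); }
    @Override public String toString() { return "(" + e1 + " + " + e2 + ")"; }
}

final class Minus extends Expr {
    final Expr e1, e2;
    Minus(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    @Override int eval() { return e1.eval() - e2.eval(); }
    @Override public String toString() { return "(" + e1 + " - " + e2 + ")"; }
}

final class Times extends Expr {
    final Expr e1, e2;
    Times(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    @Override int eval() { return e1.eval() * e2.eval(); }
    @Override public String toString() { return "(" + e1 + " * " + e2 + ")"; }
}

final class Divide extends Expr {
    final Expr e1, e2;
    Divide(Expr e1, Expr e2) { this.e1 = e1; this.e2 = e2; }
    @Override int eval() { return e1.eval() / e2.eval(); }
    @Override public String toString() { return "(" + e1 + " / " + e2 + ")"; }
}
```

There is no node class for parentheses: grouping is already encoded in the shape of the tree, which is part of what "less clutter" means.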

Page 23: Automated Parser Generation (via  CUP )

AST construction

• AST nodes constructed during parsing
  – Stored in push-down stack
• Bottom-up parser
  – Grammar rules annotated with actions for AST construction
  – When node is constructed all children available (already constructed)
  – Node (RESULT) pushed on stack
• Top-down parser
  – More complicated
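The bottom-up case can be sketched with an explicit stack. This is a simplified illustration, not how CUP is implemented internally: it tracks only the values of expr symbols (a real LR parser also keeps states and terminals on its stack), the reductions for 1 + (2) + (3) are listed by hand, and the names Node, Num, Add, and the reduce methods are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified sketch of AST construction on a stack during bottom-up parsing.
class StackBuildDemo {
    interface Node { }

    static final class Num implements Node {
        final int val;
        Num(int val) { this.val = val; }
        @Override public String toString() { return Integer.toString(val); }
    }

    static final class Add implements Node {
        final Node e1, e2;
        Add(Node e1, Node e2) { this.e1 = e1; this.e2 = e2; }
        @Override public String toString() { return "plus(" + e1 + ", " + e2 + ")"; }
    }

    // Reduce by expr ::= INT_CONST: push a new leaf node.
    static void reduceConst(Deque<Node> stack, int value) {
        stack.push(new Num(value));
    }

    // Reduce by expr ::= expr PLUS expr: both children are already built.
    static void reducePlus(Deque<Node> stack) {
        Node e2 = stack.pop(); // the right operand was pushed last
        Node e1 = stack.pop();
        stack.push(new Add(e1, e2)); // RESULT goes back on the stack
    }

    static Node build() {
        Deque<Node> stack = new ArrayDeque<>();
        // Reductions for 1 + (2) + (3) with left-associative PLUS.
        // Reducing by expr ::= ( expr ) leaves the node unchanged (RESULT = e).
        reduceConst(stack, 1);
        reduceConst(stack, 2);
        reducePlus(stack);      // 1 + (2)
        reduceConst(stack, 3);
        reducePlus(stack);      // (1 + (2)) + (3)
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints plus(plus(1, 2), 3)
    }
}
```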

Page 24: Automated Parser Generation (via  CUP )

AST construction

[Figure: bottom-up reduction of 1 + (2) + (3), with the AST nodes built so far shown at each step]

Reduction steps shown:
1 + (2) + (3)
expr + (2) + (3)
expr + (expr) + (3)
expr + (3)
expr + (expr)
expr

Resulting AST nodes: int_const (val = 1), int_const (val = 2), int_const (val = 3), and two plus nodes (e1, e2).

expr ::= expr:e1 PLUS expr:e2
         {: RESULT = new plus(e1,e2); :}
       | LPAREN expr:e RPAREN
         {: RESULT = e; :}
       | INT_CONST:i
         {: RESULT = new int_const(…, i); :}

Page 25: Automated Parser Generation (via  CUP )

Example of lists

terminal Integer NUMBER;
terminal PLUS, MINUS, MULT, DIV, LPAREN, RPAREN, SEMI;
terminal UMINUS;
non terminal Integer expr;
non terminal expr_list, expr_part;

precedence left PLUS, MINUS;
precedence left DIV, MULT;
precedence left UMINUS;

expr_list ::= expr_list expr_part
            | expr_part
            ;

expr_part ::= expr:e {: System.out.println("= " + e); :} SEMI
            ;

expr ::= expr PLUS expr
       | expr MINUS expr
       | expr MULT expr
       | expr DIV expr
       | MINUS expr %prec UMINUS
       | LPAREN expr RPAREN
       | NUMBER
       ;

