
Chap. 5, Top-Down Parsing J. H. Wang Mar. 29, 2011.


Chap. 5, Top-Down Parsing

J. H. Wang, Mar. 29, 2011


Outline

• Overview
• LL(k) Grammars
• Recursive-Descent LL(1) Parsers
• Table-Driven LL(1) Parsers
• Obtaining LL(1) Grammars
• A Non-LL(1) Language
• Properties of LL(1) Parsers
• Parse Table Representation
• Syntactic Error Recovery and Repair


Overview

• Two forms of top-down parsers
  – Recursive-descent parsers
  – Table-driven LL parsers: LL(k), to be explained later
• Compiler compilers (or parser generators)
  – With a CFG as a language's definition, parsers can be automatically constructed
  – Language revisions, updates, or extensions can be easily applied to a new parser
  – The grammar can be proved unambiguous if parser construction is successful


Top-Down Parsing

• Top-down
  – Grows a parse tree from the root to the leaves
• Predictive
  – Must predict which production rule to apply
• LL(k)
  – Scans the input left to right, produces a leftmost derivation, uses k symbols of lookahead
• Recursive descent
  – Can be implemented by a set of mutually recursive procedures


LL(k) Grammars

• Recall from Chap. 2
  – There is a parsing procedure for each nonterminal A
  – The procedure is responsible for accomplishing one step of derivation for the corresponding production
  – A production is chosen by inspecting the next k tokens. The Predict Set for a production of A is the set of tokens that trigger that production
  – The Predict Set is determined by the right-hand side (RHS)


• We need a strategy for choosing productions
  – Predictk(p): the set of length-k token strings that predict the application of rule p
    • Input string: αaβ ∈ Σ*
    • S ⇒*lm αAy1…yn
  – P = { p ∈ ProductionsFor(A) | a ∈ Predict(p) }
    • P is the empty set -> syntax error
    • P contains more than one production -> nondeterminism
    • P contains exactly one production -> that production is applied


How to Compute Predict(p)

• To predict production p: A → X1…Xm, m ≥ 0
  – The set of terminal symbols that are first produced in some derivation from X1…Xm
  – Those terminal symbols that can follow A (these matter when X1…Xm can derive λ)
  – (Fig. 5.1)
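
The two ingredients above (what the RHS can start with, and what can follow A) can be sketched as a fixed-point computation. This is a minimal illustration, not the algorithm of Fig. 5.1: the expression grammar, the "$" end marker, and all function names are assumptions made for this example.

```python
from collections import defaultdict

LAMBDA = "λ"  # marker for the empty string (assumed notation)
EOF = "$"     # end-of-input marker (assumed notation)

# Hypothetical grammar: (LHS, RHS tuple); an empty tuple is a λ-production.
GRAMMAR = [
    ("E", ("T", "Etail")),
    ("Etail", ("+", "T", "Etail")),
    ("Etail", ()),
    ("T", ("id",)),
    ("T", ("(", "E", ")")),
]
START = "E"
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_of_string(symbols, first):
    """First set of a symbol string; includes LAMBDA if the whole string can vanish."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        result |= sym_first - {LAMBDA}
        if LAMBDA not in sym_first:
            return result
    result.add(LAMBDA)
    return result


def compute_first():
    """Iterate over the productions until no First set grows."""
    first = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new = first_of_string(rhs, first)
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def compute_follow(first):
    """Iterate over the productions until no Follow set grows."""
    follow = defaultdict(set)
    follow[START].add(EOF)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                tail = first_of_string(rhs[i + 1:], first)
                new = tail - {LAMBDA}
                if LAMBDA in tail:  # everything after sym can vanish
                    new |= follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def predict(production, first, follow):
    """First of the RHS, plus Follow of the LHS when the RHS can derive λ."""
    lhs, rhs = production
    rhs_first = first_of_string(rhs, first)
    result = rhs_first - {LAMBDA}
    if LAMBDA in rhs_first:
        result |= follow[lhs]
    return result
```

For the λ-production of Etail, the predict set comes entirely from Follow(Etail), which is the case the second bullet covers.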


• For an LL(1) grammar, the productions for each nonterminal A must have disjoint predict sets
• Not all CFGs are LL(1)
  – More lookahead may be needed: LL(k), k > 1
  – A more powerful parsing method may be required (Chap. 6)
  – The grammar may be ambiguous
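
The disjointness requirement can be checked mechanically once predict sets are known. The sketch below assumes the predict sets are already available as a dictionary keyed by production; the if-then-else productions and the function names are hypothetical, chosen only to show a conflict.

```python
from itertools import combinations

# Hypothetical predict sets, keyed by (LHS, RHS) production.
PREDICT = {
    ("Stmt", ("if", "Expr", "then", "Stmt")): {"if"},
    ("Stmt", ("if", "Expr", "then", "Stmt", "else", "Stmt")): {"if"},
    ("Stmt", ("other",)): {"other"},
}


def ll1_conflicts(predict_sets):
    """Return (nonterminal, production_a, production_b, shared_tokens) tuples."""
    conflicts = []
    for (p, p_set), (q, q_set) in combinations(predict_sets.items(), 2):
        # Only productions of the same nonterminal compete with each other.
        if p[0] == q[0] and p_set & q_set:
            conflicts.append((p[0], p, q, p_set & q_set))
    return conflicts


def is_ll1(predict_sets):
    """True when no two productions of one nonterminal share a lookahead token."""
    return not ll1_conflicts(predict_sets)
```

Here the two if-productions both predict on "if", so the check reports a conflict; a grammar with no reported conflicts satisfies the LL(1) condition stated above.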


Recursive-Descent LL(1) Parsers

• Input: token stream ts
  – PEEK(): examines the next input token without advancing the input
  – ADVANCE(): advances the input by one token
• To construct a recursive-descent parser
  – We write a separate procedure for each nonterminal A
  – For each production pi, we check each symbol in the RHS X1…Xm
    • Terminal symbol: MATCH(ts, Xi)
    • Nonterminal symbol: call Xi(ts)
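
As a rough sketch of this construction, the code below hand-writes one procedure per nonterminal for a small hypothetical grammar. The TokenStream class, the "$" end marker, and the grammar itself are assumptions for illustration; only PEEK, ADVANCE, and MATCH correspond to operations named in the slides.

```python
class TokenStream:
    """Minimal token stream offering the PEEK and ADVANCE operations."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input (assumed)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1


def match(ts, expected):
    """MATCH: consume the next token if it is the expected terminal."""
    if ts.peek() != expected:
        raise SyntaxError(f"expected {expected!r}, found {ts.peek()!r}")
    ts.advance()


# Hypothetical grammar:
#   E     -> T Etail
#   Etail -> + T Etail | λ
#   T     -> id | ( E )

def parse_E(ts):
    if ts.peek() in {"id", "("}:       # Predict(E -> T Etail)
        parse_T(ts)
        parse_Etail(ts)
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in E")


def parse_Etail(ts):
    if ts.peek() == "+":               # Predict(Etail -> + T Etail)
        match(ts, "+")
        parse_T(ts)
        parse_Etail(ts)
    elif ts.peek() in {")", "$"}:      # Predict(Etail -> λ): consume nothing
        pass
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in Etail")


def parse_T(ts):
    if ts.peek() == "id":              # Predict(T -> id)
        match(ts, "id")
    elif ts.peek() == "(":             # Predict(T -> ( E ))
        match(ts, "(")
        parse_E(ts)
        match(ts, ")")
    else:
        raise SyntaxError(f"unexpected {ts.peek()!r} in T")


def parse(tokens):
    """Return True if tokens form a sentence of the grammar, else raise SyntaxError."""
    ts = TokenStream(tokens)
    parse_E(ts)
    match(ts, "$")
    return True
```

Each procedure body mirrors the RHS of the chosen production: terminals become match calls and nonterminals become procedure calls.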


Table-Driven LL(1) Parsers

• Creating recursive-descent parsers can be automated, but there are drawbacks
  – Size of the parser code
  – Inefficiency: overhead of method calls and returns
• To create table-driven parsers, we use a stack to simulate the actions of MATCH() and the calls to the nonterminals' procedures
  – Terminal symbol: MATCH
  – Nonterminal symbol: table lookup
  – (Fig. 5.8)
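
A minimal sketch of such a stack-driven loop follows. It is not the parser of Fig. 5.8: the table contents, the grammar, the "$" end marker, and the name ll1_parse are assumptions chosen for illustration.

```python
# Hypothetical LL(1) table for the grammar
#   E -> T Etail;  Etail -> + T Etail | λ;  T -> id | ( E )
# Keys are (nonterminal, lookahead); values are the RHS to apply.
TABLE = {
    ("E", "id"): ("T", "Etail"),
    ("E", "("): ("T", "Etail"),
    ("Etail", "+"): ("+", "T", "Etail"),
    ("Etail", ")"): (),
    ("Etail", "$"): (),
    ("T", "id"): ("id",),
    ("T", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "Etail", "T"}


def ll1_parse(tokens, start="E"):
    """Return the productions applied, in leftmost-derivation order."""
    tokens = list(tokens) + ["$"]   # "$" marks end of input (assumed)
    pos = 0
    stack = ["$", start]            # top of stack is the end of the list
    applied = []
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, lookahead))
            if rhs is None:
                raise SyntaxError(f"no entry for ({top}, {lookahead})")
            applied.append((top, rhs))
            stack.extend(reversed(rhs))   # push RHS so its first symbol is on top
        else:
            if top != lookahead:
                raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
            pos += 1                      # terminal matched: advance the input
    return applied
```

The explicit stack plays the role of the call stack in the recursive-descent version: popping a nonterminal and pushing its RHS corresponds to calling that nonterminal's procedure.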


How to Build LL(1) Parse Table

• The table is indexed by the top-of-stack (TOS) symbol and the next input token
  – Row: nonterminal symbol
  – Column: next input token
  – (Fig. 5.9)
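
One way to fill such a table, assuming predict sets have already been computed, is sketched below. The predict sets, the dictionary representation, and the name fill_table are assumptions for this example and may differ from Fig. 5.9.

```python
# Hypothetical predict sets for the grammar
#   E -> T Etail;  Etail -> + T Etail | λ;  T -> id | ( E )
PREDICT = {
    ("E", ("T", "Etail")): {"id", "("},
    ("Etail", ("+", "T", "Etail")): {"+"},
    ("Etail", ()): {")", "$"},
    ("T", ("id",)): {"id"},
    ("T", ("(", "E", ")")): {"("},
}


def fill_table(predict_sets):
    """Map (nonterminal, token) to the RHS predicted by that token."""
    table = {}
    for (lhs, rhs), tokens in predict_sets.items():
        for tok in tokens:
            if (lhs, tok) in table:
                # Two productions of lhs predict on tok: the grammar is not LL(1).
                raise ValueError(f"not LL(1): conflict at ({lhs}, {tok})")
            table[(lhs, tok)] = rhs
    return table
```

Missing keys correspond to blank (error) entries in the table; a conflict detected while filling means the grammar is not LL(1).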


Obtaining LL(1) Grammars

• It's easy to violate the requirement of a unique prediction for each combination of nonterminal and lookahead symbol
  – Common prefixes
  – Left recursion


Common Prefixes

• Two productions for the same nonterminal begin with the same string of grammar symbols
  – Ex. (Fig. 5.12) Not LL(k)
• Factoring transformation
  – Fig. 5.13
  – Ex. (Fig. 5.14)
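
The factoring idea can be sketched as follows: alternatives that start with the same symbol are replaced by their longest common prefix followed by a fresh nonterminal that derives the differing suffixes. This is an illustrative version, not necessarily the algorithm of Fig. 5.13; the grammar representation and the generated names (such as Stmt_tail1) are assumptions, and the fresh names are assumed not to clash with existing symbols.

```python
from collections import defaultdict


def common_prefix(alternatives):
    """Longest prefix shared by every alternative (each a tuple of symbols)."""
    prefix = []
    for column in zip(*alternatives):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)


def left_factor(grammar):
    """grammar maps a nonterminal to a list of RHS tuples; () denotes λ."""
    result = {nt: list(alts) for nt, alts in grammar.items()}
    worklist = list(result)
    counter = 0
    while worklist:
        nt = worklist.pop()
        groups = defaultdict(list)
        for alt in result[nt]:
            groups[alt[:1]].append(alt)      # group by first symbol; () for λ
        new_alts = []
        for key, alts in groups.items():
            if key == () or len(alts) == 1:
                new_alts.extend(alts)        # nothing to factor in this group
                continue
            prefix = common_prefix(alts)
            counter += 1
            fresh = f"{nt}_tail{counter}"    # hypothetical fresh nonterminal
            new_alts.append(prefix + (fresh,))
            result[fresh] = [alt[len(prefix):] for alt in alts]
            worklist.append(fresh)           # the suffixes may need factoring too
        result[nt] = new_alts
    return result
```

Applied to two if-statement productions that differ only after `if Expr then StmtList`, the shared prefix is kept once and the choice between `endif` and `else StmtList endif` moves into the new nonterminal, where one token of lookahead decides it.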


Left Recursion

• A production is left recursive if its LHS symbol is also the first symbol of its RHS
  – E.g. StmtList → StmtList ; Stmt
  – A → A α | β
  – (Fig. 5.15 & Fig. 5.16)
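
For immediate left recursion of the form A → A α | β, a common textbook rewrite is A → β A', A' → α A' | λ. The sketch below implements that rewrite only; it may differ in detail from the procedure of Fig. 5.16, it does not handle indirect left recursion, and the "Tail" suffix for the fresh nonterminal is an assumed naming choice.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | λ.

    alternatives is a list of RHS tuples for nt; () denotes λ.
    Returns a dict mapping each nonterminal to its new alternatives.
    """
    recursive = [alt[1:] for alt in alternatives if alt[:1] == (nt,)]  # the α parts
    others = [alt for alt in alternatives if alt[:1] != (nt,)]         # the β parts
    if not recursive:
        return {nt: list(alternatives)}     # nothing to do
    if not others:
        raise ValueError(f"{nt} has no non-recursive alternative")
    tail = nt + "Tail"                      # hypothetical fresh nonterminal name
    return {
        nt: [beta + (tail,) for beta in others],
        tail: [alpha + (tail,) for alpha in recursive] + [()],
    }
```

For StmtList → StmtList ; Stmt | Stmt this yields StmtList → Stmt StmtListTail and StmtListTail → ; Stmt StmtListTail | λ, which generates the same statement lists without left recursion.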


A Non-LL(1) Language

• Almost all common programming language constructs are LL(1)
  – One exception: if-then-else (the dangling else problem)
  – Can be resolved by mandating that each else is matched to its closest unmatched then
  – (Fig. 5.17)


• Ambiguous (Chap. 6)
  – E.g. if expr then if expr then other else other
    • if expr then { if expr then other else other }
    • if expr then { if expr then other } else other
    • -> at least two distinct parses
• Dangling bracket language (DBL)
  – DBL = { [^i ]^j | i ≥ j ≥ 0 }
    • if expr then Stmt -> [ (opening bracket)
    • else Stmt -> ] (optional closing bracket)


• Fig. 5.18(a)
  – S → [ S CL | λ
  – CL → ] | λ
• E.g. [[]
• Fig. 5.18(b)
  – S → [ S | T
  – T → [ T ] | λ
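
To see why the example string [[] matters for the grammar of Fig. 5.18(a), one can count its parse trees. The sketch below is a small brute-force counter written only for that grammar (S → [ S CL | λ, CL → ] | λ); the function names are assumptions for illustration.

```python
from functools import lru_cache


def count_CL(s):
    """Number of parse trees deriving s from CL -> ] | λ."""
    return 1 if s in ("", "]") else 0


@lru_cache(maxsize=None)
def count_S(s):
    """Number of parse trees deriving s from S -> [ S CL | λ."""
    if s == "":
        return 1                       # only S -> λ derives the empty string
    if s[0] != "[":
        return 0                       # S -> [ S CL must start with "["
    rest = s[1:]
    total = 0
    for cut in range(len(rest) + 1):   # split rest into an S part and a CL part
        total += count_S(rest[:cut]) * count_CL(rest[cut:])
    return total
```

Under this grammar the single closing bracket in [[] can be attached to either opening bracket, so the counter finds more than one parse tree for it, which is the dangling-else ambiguity in bracket form.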


• It's not LL(k)
  – [ ∈ Predict(S → [ S)
  – [ ∈ Predict(S → T)
  – [[ ∈ Predict2(S → [ S)
  – [[ ∈ Predict2(S → T)
  – …
  – [^k ∈ Predictk(S → [ S)
  – [^k ∈ Predictk(S → T)


Properties of LL(1) Parsers

• A correct, leftmost parse is constructed

• All grammars in LL(1) are unambiguous

• All table-driven LL(1) parsers operate in linear time and space with respect to the length of the parsed input


Thanks for Your Attention!

