Date post: 17-Dec-2015
YANG

Chap 5 LL(1) Parsing

• LL(1): left-to-right scanning, leftmost derivation, 1-token lookahead

• parser generator:

Parsing becomes the easiest!

Modifying parsers is also convenient.

Given the productions

A → α1, A → α2, ....., A → αn

During a (leftmost) derivation,

... A ... ⇒ ... α1 ... or ... α2 ... or ... αn ...

Which route should we choose?

(Trial and error is not a good idea.)

» Use the lookahead symbols.

• Consider the situation: We are about to expand a nonterminal A and there are several productions whose LHS is A:

A → α1

A → α2

.....

A → αn

We choose one of the productions based on the lookahead token.

Which one should we choose?

Consider First(α1), First(α2), ......, First(αn), and

if αi ⇒* λ, then consider also Follow(A).

• Define

predict(A → α) = First(α) ∪ (if λ ∈ First(α) then Follow(A))

• If the lookahead token a ∈ predict(A → α), then we use the production A → α to expand A.

• What if a ∈ predict(A → α1) and a ∈ predict(A → α2)?

• What if a ∉ predict(A → α) for all productions A → α whose LHS is A?
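
The predict definition above can be sketched in Python. This is a minimal sketch, not the book's algorithm: the grammar (E → T A, A → + T A | λ, T → ID), the end marker $, and the helper names first_of, compute_sets, and predict are all assumptions chosen for illustration. λ is dropped from the result because it is never a lookahead token.

```python
LAMBDA = "λ"
# Assumed example grammar: E -> T A, A -> + T A | λ, T -> ID
GRAMMAR = [
    ("E", ["T", "A"]),
    ("A", ["+", "T", "A"]),
    ("A", []),              # A -> λ
    ("T", ["ID"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
START, END = "E", "$"


def first_of(symbols, first):
    """First set of a symbol string; contains LAMBDA if the string can vanish."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in NONTERMINALS else {sym}
        result |= sym_first - {LAMBDA}
        if LAMBDA not in sym_first:
            return result
    result.add(LAMBDA)
    return result


def compute_sets():
    """Iterate the First and Follow rules until nothing changes."""
    first = {n: set() for n in NONTERMINALS}
    follow = {n: set() for n in NONTERMINALS}
    follow[START].add(END)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            new_first = first_of(rhs, first)
            if not new_first <= first[lhs]:
                first[lhs] |= new_first
                changed = True
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                tail = first_of(rhs[i + 1:], first)
                new_follow = tail - {LAMBDA}
                if LAMBDA in tail:          # the rest of the RHS can vanish
                    new_follow |= follow[lhs]
                if not new_follow <= follow[sym]:
                    follow[sym] |= new_follow
                    changed = True
    return first, follow


def predict(lhs, rhs, first, follow):
    """predict(A -> α): First(α), plus Follow(A) when α can derive λ."""
    f = first_of(rhs, first)
    if LAMBDA in f:
        return (f - {LAMBDA}) | follow[lhs]
    return f


first, follow = compute_sets()
for lhs, rhs in GRAMMAR:
    print(lhs, "->", " ".join(rhs) or LAMBDA, sorted(predict(lhs, rhs, first, follow)))
```

If two productions with the same LHS share a token in their predict sets, the first "What if" question above applies; if no predict set contains the lookahead, the second one does.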

Property of LL(1) grammars:

If a grammar is LL(1), then

for any two productions

A → α, A → β

First(α Follow(A)) ∩ First(β Follow(A)) = ∅

Given the FIRST and FOLLOW sets in Fig. 5-2 and 5-3, calculate the predict set for each production.

§5.2 LL(1) Parse Table

• The predict() function may be represented as an LL(1) parse table.

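T: Vn × Vt → P ∪ {error}

(table sketch: one row per nonterminal A, B, ...; one column per terminal a, b, ...; each entry holds a production number, such as 3, or error)

T[A, a] = A → α    if a ∈ predict(A → α)

        = error    otherwise

• A grammar is LL(1) iff every entry in the parse table contains a unique production or the error flag.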
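
A minimal Python sketch of filling T from predict sets and applying the LL(1) test. The function names build_table and conflicts, and both sets of predict data, are assumptions made for illustration. A missing key plays the role of the error flag.

```python
def build_table(predict_sets):
    """predict_sets: list of (production number, LHS, set of lookahead terminals)."""
    table = {}
    for number, lhs, lookaheads in predict_sets:
        for a in lookaheads:
            # T[lhs, a] collects every production predicted by lookahead a.
            table.setdefault((lhs, a), []).append(number)
    return table


def conflicts(table):
    """Entries holding more than one production; empty iff the grammar is LL(1)."""
    return {key: prods for key, prods in table.items() if len(prods) > 1}


# Assumed predict sets for E -> T A, A -> + T A | λ, T -> ID (productions 1..4).
LL1_PREDICT = [(1, "E", {"ID"}), (2, "A", {"+"}), (3, "A", {"$"}), (4, "T", {"ID"})]
# Assumed predict sets for two productions of A that both start with a.
BAD_PREDICT = [(1, "A", {"a"}), (2, "A", {"a"})]

print(conflicts(build_table(LL1_PREDICT)))
print(conflicts(build_table(BAD_PREDICT)))
```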

Figure 5.5 The LL(1) table for Micro

§5.3 LL(1) parsers

• Similar to scanners, there are two kinds of parsers:

1. built-in: recursive descent

2. table-driven

1. built-in

stmt()
{
    token = next_token();
    switch (token) {
    case ID:      /* production 5: <stmt> --> ID := <exp> ; */
        match(ID); match(ASSIGN); exp(); match(SEMICOLON);
        break;
    case READ:    /* production 6 */
        ...
    case WRITE:   /* production 7 */
        ...
    default:
        syntax_error(....);
    }
}
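
The same procedure can be written as a small runnable Python sketch. Several things here are assumptions, not taken from the slides: tokens are plain strings, next_token() only peeks at the lookahead while match() consumes it, <exp> is reduced to a single ID, and the READ and WRITE cases (elided above) are left out rather than guessed.

```python
class ParseError(Exception):
    pass


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def next_token(self):
        """Return the lookahead token without consuming it."""
        return self.tokens[self.pos]

    def match(self, expected):
        """Consume the lookahead token if it is the expected one."""
        if self.next_token() != expected:
            raise ParseError(f"expected {expected}, found {self.next_token()}")
        self.pos += 1

    def stmt(self):
        token = self.next_token()
        if token == "ID":            # production 5: <stmt> -> ID := <exp> ;
            self.match("ID")
            self.match("ASSIGN")
            self.exp()
            self.match("SEMICOLON")
        else:                        # READ / WRITE cases omitted in this sketch
            raise ParseError(f"unexpected token {token}")

    def exp(self):
        # Assumption: <exp> -> ID only, to keep the sketch small.
        self.match("ID")


def parses(tokens):
    """True if the token list is exactly one <stmt>."""
    parser = Parser(tokens)
    try:
        parser.stmt()
        parser.match("$")
        return True
    except ParseError:
        return False


print(parses(["ID", "ASSIGN", "ID", "SEMICOLON"]))
print(parses(["ID", "ID"]))
```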

• It is obvious that these recursive descent parsing procedures can be generated automatically from the grammar.

(diagram relating: grammar, LL(1) table, parser generator, recursive descent parser)

• However, it is difficult for the parser generator to integrate the semantic routines into the (generated) recursive descent parser automatically.

2. table-driven parser

(+) generic driver

Only the LL(1) table needs to be changed when the grammar is modified.

(+) non-recursive (faster)

The parser maintains a stack itself. No recursive calls.

lldriver()
{
    push( START_SYMBOL );
    a := next_token();
    while stack is not empty do
    {
        X := symbol on stack top;
        if ( X is a nonterminal && T[X, a] == X → Y1 Y2 ... Ym ) {
            pop(1);
            push Ym, Ym-1, ..., Y1;      /* Y1 ends up on stack top */
        }
        else if ( X == a ) {             /* terminal matches the lookahead */
            pop(1);
            a := next_token();
        }
        else if ( X is an action symbol ) {
            pop(1);
            call the corresponding semantic routine;
        }
        else syntax_error();
    }
}
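
A minimal runnable Python sketch of this driver, without action symbols. The grammar (0: G → E $, 1: E → T A, 2: A → + T A, 3: A → λ, 4: T → ID), its hand-built table, and the choice to return the list of production numbers used are assumptions for illustration only.

```python
# Assumed grammar, numbered: right-hand sides only.
PRODUCTIONS = {
    0: ["E", "$"],          # G -> E $
    1: ["T", "A"],          # E -> T A
    2: ["+", "T", "A"],     # A -> + T A
    3: [],                  # A -> λ
    4: ["ID"],              # T -> ID
}
# Hand-built LL(1) table; a missing entry means error.
TABLE = {("G", "ID"): 0, ("E", "ID"): 1, ("A", "+"): 2, ("A", "$"): 3, ("T", "ID"): 4}
NONTERMINALS = {"G", "E", "A", "T"}


def lldriver(tokens):
    """Parse tokens; return the production numbers used (a leftmost parse)."""
    tokens = iter(list(tokens) + ["$"])
    stack = ["G"]                       # push(START_SYMBOL)
    a = next(tokens)
    used = []
    while stack:
        x = stack[-1]
        if x in NONTERMINALS and (x, a) in TABLE:
            number = TABLE[(x, a)]
            stack.pop()
            # Push Ym ... Y1 so that Y1 ends up on top.
            stack.extend(reversed(PRODUCTIONS[number]))
            used.append(number)
        elif x == a:                    # terminal on top matches the lookahead
            stack.pop()
            a = next(tokens, None)
        else:
            raise ValueError(f"syntax error: {x!r} on stack top, lookahead {a!r}")
    return used


print(lldriver(["ID", "+", "ID"]))
```

Changing the grammar only means changing PRODUCTIONS and TABLE; the loop itself stays the same, which is the "generic driver" advantage listed above.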

Ex. begin A := B - 3 + A; end $

a = begin

X = <GOAL>    (parse stack: <GOAL>)

Trace the action of the parser on this example.

§5.5 Action symbols

• Action symbols may be processed by the parser in a similar way.

1. in recursive descent parsers

Ex. gen_action( “ID:=<exp>#assign;” ) will generate the following code:

match(ID); match(ASSIGN); exp(); assign(); match(SEMICOLON);

• Parameters are transmitted through a semantic stack.

• The semantic stack is a stack of semantic records.

• The parser stack is a stack of grammar (and action) symbols.

2. in LL(1) driver

• Action symbols are pushed onto the parse stack in the same way as grammar symbols.

• When an action symbol is on stack top, the driver calls the corresponding semantic routine.

• See the lldriver routine above.

• Parameters are also transmitted through the semantic stack.
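
A minimal Python sketch of a driver that handles action symbols and a semantic stack. Everything specific is an assumption for illustration: the grammar (G → E $, E → T A, A → + T #add A | λ, T → NUM #push), the action names #push and #add, tokens as (kind, value) pairs, and plain integers as semantic records.

```python
PRODUCTIONS = {
    0: ["E", "$"],                  # G -> E $
    1: ["T", "A"],                  # E -> T A
    2: ["+", "T", "#add", "A"],     # A -> + T #add A
    3: [],                          # A -> λ
    4: ["NUM", "#push"],            # T -> NUM #push
}
TABLE = {("G", "NUM"): 0, ("E", "NUM"): 1, ("A", "+"): 2, ("A", "$"): 3, ("T", "NUM"): 4}
NONTERMINALS = {"G", "E", "A", "T"}


def evaluate(tokens):
    """Parse a sum of numbers and return its value via the semantic stack."""
    tokens = list(tokens) + [("$", None)]
    pos = 0
    parse_stack = ["G"]         # grammar and action symbols
    semantic_stack = []         # semantic records (here: integers)
    last_value = None           # value of the most recently matched token
    while parse_stack:
        x = parse_stack.pop()
        kind, value = tokens[pos] if pos < len(tokens) else (None, None)
        if x in NONTERMINALS:
            if (x, kind) not in TABLE:
                raise ValueError(f"syntax error at {kind!r}")
            parse_stack.extend(reversed(PRODUCTIONS[TABLE[(x, kind)]]))
        elif x == "#push":      # action symbol: record the number just matched
            semantic_stack.append(last_value)
        elif x == "#add":       # action symbol: parameters come from the stack
            right = semantic_stack.pop()
            left = semantic_stack.pop()
            semantic_stack.append(left + right)
        elif x == kind:         # terminal matches the lookahead
            last_value = value
            pos += 1
        else:
            raise ValueError(f"syntax error: expected {x!r}, found {kind!r}")
    return semantic_stack.pop()


print(evaluate([("NUM", 1), ("+", None), ("NUM", 2), ("+", None), ("NUM", 3)]))
```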

§5.6 Making grammars LL(1)

• Not all grammars are LL(1). However, some non-LL(1) grammars can be made LL(1) by simple modifications.

• When is a grammar not LL(1)? When there is an entry in the parse table that contains more than one production.

Ex.
              ......    ID    ......
    ....
    <stmt>              2,5
    ....

This is called a conflict, which means we do not know which production to use when <stmt> is on stack top and ID is the next input token.

• Conflicts are classified into two categories:

1. common prefix

2. left recursion

• Common prefix

Ex.

<stmt> → if <exp> then <stmt>

<stmt> → if <exp> then <stmt> else <stmt>

Consider when <stmt> is on stack top and ‘if’ is the next input token. We cannot choose which production to use at this time.

In general, if we have two productions

A → α, A → β

and First(α) ∩ First(β) ≠ ∅, then we have a conflict.

• Solution: factor out the common prefix

Ex.

<stmt> → if <exp> then <stmt> <tail>

<tail> → λ

<tail> → else <stmt>
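
Factoring can be sketched in Python for the two-production case. This is a minimal sketch: the function names and the convention of passing the new nonterminal's name as a parameter are assumptions, and an empty list stands for λ.

```python
def common_prefix(rhs1, rhs2):
    """Longest common prefix of two right-hand sides (lists of symbols)."""
    prefix = []
    for x, y in zip(rhs1, rhs2):
        if x != y:
            break
        prefix.append(x)
    return prefix


def left_factor(lhs, rhs1, rhs2, new_name):
    """Replace lhs -> rhs1 | rhs2 by lhs -> prefix new_name, new_name -> rest1 | rest2."""
    prefix = common_prefix(rhs1, rhs2)
    if not prefix:                       # nothing to factor
        return [(lhs, rhs1), (lhs, rhs2)]
    n = len(prefix)
    return [
        (lhs, prefix + [new_name]),
        (new_name, rhs1[n:]),            # [] stands for λ
        (new_name, rhs2[n:]),
    ]


short = ["if", "<exp>", "then", "<stmt>"]
long_ = ["if", "<exp>", "then", "<stmt>", "else", "<stmt>"]
for lhs, rhs in left_factor("<stmt>", short, long_, "<tail>"):
    print(lhs, "->", " ".join(rhs) or "λ")
```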

2. left recursion: productions of the form:

A → A α

• Grammars with left-recursive productions are not LL(1) because we may have

A ⇒ A α ⇒ A α α ⇒ ...

(same lookahead at every step)

• Problem: left recursion

A → A α
A → β
A → γ

Intuition: all the strings derivable from A have the form: β, γ, βα, γα, βαα, γαα, ...

Solution: replace the productions. We may use the following productions instead:

A → β T
A → γ T
T → λ
T → α T

Ex. Given the left-recursive grammar:

E → E + T
E → T
T → T * P
T → P
P → ID

After eliminating left recursion, we get

E → T A
A → λ
A → + T A
T → P B
B → λ
B → * P B
P → ID
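
The transformation for immediate left recursion can be sketched in Python. This is a minimal sketch under assumptions: the function name, the dictionary result, and passing the new nonterminal's name as a parameter are illustrative choices; an empty list stands for λ.

```python
def eliminate_left_recursion(lhs, alternatives, new_name):
    """Rewrite lhs -> lhs α | β into lhs -> β new_name, new_name -> λ | α new_name."""
    # α parts: what follows the leading lhs in each left-recursive alternative.
    alphas = [rhs[1:] for rhs in alternatives if rhs[:1] == [lhs]]
    # β parts: the alternatives that do not start with lhs.
    betas = [rhs for rhs in alternatives if rhs[:1] != [lhs]]
    if not alphas:                      # no left recursion: nothing to do
        return {lhs: alternatives}
    return {
        lhs: [beta + [new_name] for beta in betas],
        new_name: [[]] + [alpha + [new_name] for alpha in alphas],
    }


print(eliminate_left_recursion("E", [["E", "+", "T"], ["T"]], "A"))
print(eliminate_left_recursion("T", [["T", "*", "P"], ["P"]], "B"))
```

Applied to E and T in turn, this reproduces the productions for E, A, T, and B listed above.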

3. more general solution

Ex.

<stmt> → <label> <unlabeled stmt>
<label> → ID :
<label> → λ
<unlabeled stmt> → ID := <exp> ;

We cannot decide which production to use when <label> is on the stack top and ID is the next token:

(diagram: <stmt> expands to <label> <unlabeled stmt>; with lookahead ID, the ID may begin either <label> or <unlabeled stmt>)

• Solution: use the following productions (which essentially look ahead 2 tokens)

<stmt> → ID <suffix>
<suffix> → : <unlabeled stmt>
<suffix> → := <exp> ;
<unlabeled stmt> → ID := <exp> ;

Try two examples:

A: B := C ;

B := C ;
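
The two examples can be tried with a small Python recognizer for the rewritten productions. This is a minimal sketch: the token spelling, the function name parse_stmt, and the simplification <exp> → ID are assumptions, since the slides do not define <exp> here.

```python
def parse_stmt(tokens):
    """Return 'labeled' or 'unlabeled'; raise ValueError on a syntax error."""
    tokens = list(tokens) + ["$"]
    pos = 0

    def match(expected):
        nonlocal pos
        if tokens[pos] != expected:
            raise ValueError(f"expected {expected!r}, found {tokens[pos]!r}")
        pos += 1

    def assignment_rest():
        # := <exp> ;   with the assumption <exp> -> ID
        match(":=")
        match("ID")
        match(";")

    match("ID")                      # <stmt> -> ID <suffix>
    if tokens[pos] == ":":           # <suffix> -> : <unlabeled stmt>
        match(":")
        match("ID")                  # <unlabeled stmt> -> ID := <exp> ;
        assignment_rest()
        kind = "labeled"
    elif tokens[pos] == ":=":        # <suffix> -> := <exp> ;
        assignment_rest()
        kind = "unlabeled"
    else:
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    match("$")
    return kind


print(parse_stmt(["ID", ":", "ID", ":=", "ID", ";"]))   # A: B := C ;
print(parse_stmt(["ID", ":=", "ID", ";"]))              # B := C ;
```

The decision between the two <suffix> productions is made on the token after the first ID, which is why the rewritten grammar "essentially looks ahead 2 tokens".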

4. For more difficult cases, we use semantic routines to help parsing.

Ex. In Ada, we may declare arrays as

A: array(I .. J, BOOLEAN)

A straightforward grammar is (for array bound)

<bound> → <exp> .. <exp>
<bound> → ID
<exp> → ID
<exp> → …

and ID ∈ First(<exp>)

This grammar is not LL(1) because we cannot make a decision when <bound> is on stack top and ID is the next token.

• Solution:

<bound> → <exp> <tail>
<tail> → λ
<tail> → .. <exp>

• All grammars can be transformed into Greibach Normal Form, in which a production has the form:

A → a α    (a is a terminal)

So given a grammar G, we can do

G → GNF → no common prefix, no left recursion, but still NOT LL(1)!

Ex.

S → a A a
S → b A b a
A → b
A → λ

Consider the case when A is on stack top and b is the next token.

§5.7 The dangling-else problem

• Consider if a then if b then x := 1 else x := 2

Two possibilities: (figure: two flowcharts. In one, the else branch x := 2 belongs to the test on b; in the other, it belongs to the test on a.)

The problem is which ‘if’ the ‘else’ belongs to.

• In essence, we are trying to find an LL(1) grammar for the set { [^i ]^j | i ≥ j ≥ 0 }

But is it possible?

• 1st attempt: G1

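S → [ S C
S → λ
C → ]
C → λ

This grammar is ambiguous. Consider [ [ ]

(figure: two parse trees for [ [ ]. In one, the ] is derived from the C of the outer S → [ S C; in the other, from the C of the inner S → [ S C.)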

• 2nd attempt: we can make ] be associated with the nearest unpaired [ as follows:

S → [ S
S → T
T → [ T ]
T → λ

This grammar is not ambiguous. Consider [ [ ]

(figure: the only parse tree, S ⇒ [ S ⇒ [ T ⇒ [ [ T ] ⇒ [ [ ])

However, this grammar is not LL(1), either. Consider the case when S is on stack top and [ is the next input token.

[ ∈ First( [ S )
[ ∈ First( T )

This grammar can be parsed with a bottom-up parser, but not a top-down parser.

• Solution: conflicts + special rules

1. G → S ;
2. S → if S E
3. S → other
4. E → else S
5. E → λ

The parse table

          if     else    other    ;
    G     1              1
    S     2              3
    E            4,5              5

The entry T[E, else] = 4,5 is a conflict.

We can enforce that T[E, else] = 4th rule.

This essentially forces ‘else’ to be matched with the nearest unpaired ‘if’.
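
The effect of forcing T[E, else] = 4 can be checked with a minimal Python sketch of a table-driven parse. The function name parse and the choice to return the production numbers used are assumptions; the productions and table are the ones above, with the conflict resolved in favor of rule 4.

```python
PRODUCTIONS = {
    1: ["S", ";"],          # G -> S ;
    2: ["if", "S", "E"],    # S -> if S E
    3: ["other"],           # S -> other
    4: ["else", "S"],       # E -> else S
    5: [],                  # E -> λ
}
TABLE = {
    ("G", "if"): 1, ("G", "other"): 1,
    ("S", "if"): 2, ("S", "other"): 3,
    ("E", "else"): 4,       # conflict 4,5 resolved by the special rule
    ("E", ";"): 5,
}
NONTERMINALS = {"G", "S", "E"}


def parse(tokens):
    """Return the production numbers used, in leftmost-derivation order."""
    tokens = iter(tokens)
    a = next(tokens, None)
    stack = ["G"]
    used = []
    while stack:
        x = stack.pop()
        if x in NONTERMINALS:
            if (x, a) not in TABLE:
                raise ValueError(f"syntax error at {a!r}")
            number = TABLE[(x, a)]
            used.append(number)
            stack.extend(reversed(PRODUCTIONS[number]))
        elif x == a:
            a = next(tokens, None)
        else:
            raise ValueError(f"expected {x!r}, found {a!r}")
    return used


print(parse(["if", "if", "other", "else", "other", ";"]))
```

When ‘else’ arrives, the E on stack top is the one pushed by the inner ‘if’, so rule 4 is applied to the inner E and rule 5 later to the outer E.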

• Alternative solution: change the language.

• Add ‘end if’ at the end of every ‘if’.

S → if S E
S → other
E → else S end if
E → end if

§5.9 Properties of LL(1) parsers:

• A correct leftmost parse is guaranteed.

• All LL(1) grammars are unambiguous.

• Parsing takes linear time and linear space.

§ llgen

Page 776 of the book

output from llgen

*define
decrtn 1
ifprocess 2

§ LL(k) parsing

• Recall a grammar is LL(1) only if

for any two productions A → α and A → β,

First(α Follow(A)) ∩ First(β Follow(A)) = ∅

• To generalize, we write

for any two productions A → α and A → β,

Firstk(α Followk(A)) ∩ Firstk(β Followk(A)) = ∅

if G is strong LL(k).

• The word ‘strong’ means that this definition imposes too strong a condition.

• Consider

G → S $
S → a A a
S → b A b a
A → b
A → λ

-- This grammar is not LL(1). When A is on stack top and b is the next token, we cannot choose between A → b and A → λ.

(diagram: A on stack top; b is the next input token)

-- Does it help if we can look ahead two tokens? NO!
If the next two tokens are bb, then we should choose A → b.
If the next two tokens are ba, then we cannot make a choice.

case 1. input is aba

    stack (top first):   G    S $    a A a $      A a $
    step:                     lookahead ab   match a   lookahead ba

At this point, we should choose A → b.

case 2. input is bba

    stack (top first):   G    S $    b A b a $    A b a $
    step:                     lookahead bb   match b   lookahead ba

At this point, we should choose A → λ.

So the problem is not the limited number of lookahead tokens.

The problem is in the ‘context’.

• Therefore, the grammar is not strong LL(1).

• Actually, we can verify that the grammar is not strong LL(k) for all k ≥ 1 by verifying that

Firstk( ba$ ) ⊆ Firstk( b Followk(A) ) ∩ Firstk( Followk(A) )

for all k ≥ 1

• However, it is possible to parse the language of the grammar under the following conditions:

1. look ahead two tokens

2. from left to right

3. using the left context

We call such grammars LL(2), rather than strong LL(2).

• Note that LL(2) ≠ strong LL(2)

LL(1) = strong LL(1)
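
The difference can be checked numerically for the grammar G → S $, S → a A a, S → b A b a, A → b, A → λ. This is a minimal Python sketch under assumptions: the two right contexts of A ("a$" after S → a A a, "ba$" after S → b A b a) are written out by hand, only the choice between the two A productions is tested, and the function names are illustrative.

```python
# What can follow A, one entry per left context (after "a" and after "b").
FOLLOWERS = ["a$", "ba$"]
A_ALTERNATIVES = ["b", ""]      # A -> b, A -> λ


def first_k(string, k):
    """First k symbols of a terminal string."""
    return string[:k]


def is_strong_ll(k):
    """Strong LL(k) test for A: lookahead sets use all of Follow_k(A) at once."""
    sets = [{first_k(alt + f, k) for f in FOLLOWERS} for alt in A_ALTERNATIVES]
    return not (sets[0] & sets[1])


def is_ll_with_context(k):
    """LL(k) test for A: compare the alternatives within each left context."""
    return all(first_k("b" + f, k) != first_k(f, k) for f in FOLLOWERS)


for k in range(1, 5):
    print(k, is_strong_ll(k), is_ll_with_context(k))
```

In this sketch the strong test fails for every k because First_k(ba$) lands in both sets, while the context-aware test already succeeds at k = 2.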

