Top Down Parser

Post on 23-Jan-2016


1

Top-Down Parsing

• The parse tree is created top to bottom.

• Top-down parsers:

– Recursive-Descent Parsing

• Backtracking is needed (If a choice of a production rule does not work, we backtrack to try other alternatives.)

• It is a general parsing technique, but not widely used.

• Not efficient

– Predictive Parsing

• no backtracking

• efficient

• needs a special form of grammars (LL(1) grammars).

• Non-Recursive (Table Driven) Predictive Parser is also known as LL(1) parser.

2

Top-Down Parsing

• Begin with the start symbol at the root of the parse tree

• Build the parse tree from the top down

3

Top-Down Parsing

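S → aSbS | bSaS | ε

[Figure: parse tree rooted at S. The root expands to a S b S, one inner S expands to b S a S, and the remaining S nodes derive ε.]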

4

Parsing Decisions

Which nonterminal in the parse tree should be expanded?

Which of its grammar rules should be used to expand it?

5

Nondeterministic Parser

Expand any nonterminal.

Expand it using a grammar rule that occurs in the derivation of the

input string.

6

Backtracking Parser

Expand the leftmost nonterminal in the parse tree.

Try a grammar rule for the nonterminal. If it does not work out, try

another one.

7

Backtracking Parser

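S → aSa | bSb | a | b

Input: b b b

The slides step through the partial parse trees the parser builds. Each step lists the expansions from the root down:

1. S → aSa: fails, because the input starts with b.

2. S → bSb, inner S → aSa: fails, because the second input symbol is b.

3. S → bSb, S → bSb, innermost S → aSa: fails, because the third input symbol is b.

4. S → bSb, S → bSb, innermost S → bSb: fails, because the input is exhausted after three b's.

5. S → bSb, S → bSb, innermost S → a: fails, because the third input symbol is b.

6. S → bSb, S → bSb, innermost S → b: fails, because this derives bbbbb but the input has only three symbols.

7. After backtracking to the inner S: S → bSb, inner S → b. This derives bbb and the parse succeeds.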
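The search traced on these slides can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the slides: the names `GRAMMAR`, `parse`, `parse_seq` and `accepts` are hypothetical, and generators are used so that a failed alternative simply produces no result and the next alternative is tried.

```python
from typing import Iterator

# Grammar from the slides: S -> aSa | bSb | a | b (one character per symbol).
GRAMMAR = {"S": ["aSa", "bSb", "a", "b"]}

def parse(symbol: str, text: str, pos: int) -> Iterator[int]:
    """Yield every position at which `symbol` can end when it starts at `pos`."""
    if symbol not in GRAMMAR:                 # terminal: must match the next character
        if text.startswith(symbol, pos):
            yield pos + 1
        return
    for rhs in GRAMMAR[symbol]:               # try the alternatives in order
        yield from parse_seq(rhs, text, pos)  # a failing alternative yields nothing

def parse_seq(rhs: str, text: str, pos: int) -> Iterator[int]:
    """Match the symbols of `rhs` left to right, backtracking over earlier choices."""
    if not rhs:
        yield pos
        return
    for mid in parse(rhs[0], text, pos):
        yield from parse_seq(rhs[1:], text, mid)

def accepts(text: str) -> bool:
    """True if some sequence of choices consumes the whole input."""
    return any(end == len(text) for end in parse("S", text, 0))
```

Calling `accepts("bbb")` explores the failing expansions first and then finds S → bSb with the inner S → b. The sketch terminates only because every alternative of this grammar starts with a terminal; a left-recursive grammar would recurse forever.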

14

Recursive Descent Parsing

• Basic idea:

– Write a routine to recognize each lhs

– This produces a parser with mutually recursive routines.

– Good for hand-coded parsers.

Ex: A → aBb (This is only the production rule for A)

proc A {

- match the current token with a, and move to the next token;

- call 'B';

- match the current token with b, and move to the next token;

}

15

Recursive Descent Parsing (cont.)

• When to apply ε-productions.

A → aA | bB | ε

• If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production.

16

Recursive Descent Parsing (cont.)

A → aBb | bAB

proc A {

case of the current token {

'a': - match the current token with a, and move to the next token;

- call 'B';

- match the current token with b, and move to the next token;

'b': - match the current token with b, and move to the next token;

- call 'A';

- call 'B';

}

}

17

Recursive Descent Parser for a Simple Declaration Statement

• Decl_stmt → type idlist ;

• Type → int | float

• Idlist → id | id , idlist

Proc declstmt()

{

– call type();

– call idlist();

– match the current token with ;, and move to the next token;

}

Proc type()

{

case of the current token {

'int' : match the current token with int, move to the next token;

'float' : match the current token with float, move to the next token;

}

}

Write the code for the nonterminal idlist

18

A → aB | b corresponds to

A() {

if (lookahead == 'a') {

match('a');

B();

}

else if (lookahead == 'b')

match('b');

else error();

}

19

Recursive descent parser for expression

• E → TE'

• E' → +TE' | ε

• T → FT'

• T' → *FT' | ε

• F → (E)

• F → id

parse() {
  token = get_next_token();
  if (E() and token == '$')
    then return true
    else return false
}

E() {
  if (T())
    then return Eprime()
    else return false
}

Eprime() {
  if (token == '+') then
    token = get_next_token()
    if (T())
      then return Eprime()
      else return false
  else if (token == ')' or token == '$')
    then return true
    else return false
}

The remaining procedures are similar.
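For readers who want to run the idea, the routines for this expression grammar might look as follows in Python. This is a sketch under assumptions rather than the slides' own code: the class name `Parser`, the token list format (strings such as "id", "+", "(") and the lookahead checks used for the ε-productions are choices made for this illustration.

```python
class Parser:
    """Recognizer for E -> T E', E' -> + T E' | ε, T -> F T', T' -> * F T' | ε, F -> ( E ) | id."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]    # '$' marks the end of the input
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def parse(self):
        return self.E() and self.token == "$"

    def E(self):
        return self.T() and self.Eprime()

    def Eprime(self):
        if self.token == "+":
            self.advance()
            return self.T() and self.Eprime()
        # ε-production: accept only if the next token may follow E'
        return self.token in (")", "$")

    def T(self):
        return self.F() and self.Tprime()

    def Tprime(self):
        if self.token == "*":
            self.advance()
            return self.F() and self.Tprime()
        # ε-production: accept only if the next token may follow T'
        return self.token in ("+", ")", "$")

    def F(self):
        if self.token == "id":
            self.advance()
            return True
        if self.token == "(":
            self.advance()
            if self.E() and self.token == ")":
                self.advance()
                return True
        return False
```

For example, `Parser(["id", "+", "id", "*", "id"]).parse()` returns True, while `Parser(["id", "+"]).parse()` returns False. One routine per non-terminal keeps the code in step with the grammar, which is why the approach suits hand-written parsers.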

20

When Top down parsing doesn’t Work Well

• Consider productions S → S a | a:

– In the process of parsing S we try the above rules

– Applied consistently in this order, get infinite loop

– Could re-order productions, but search will have lots of backtracking and general rule for ordering is complex

• Problem here is the left-recursive grammar.

21

Left Recursion

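E → E + T | T

T → T * F | F

F → n | (E)

[Figure: partial parse tree in which E is expanded to E + T, and the leftmost E is again expanded to E + T.]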

22

Elimination of Left recursion

• Consider the left-recursive grammar

S → S α | β

• S generates all strings starting with a β and followed by any number of α's

• Can rewrite using right-recursion

S → β S'

S' → α S' | ε

23

Elimination of left Recursion. Example

• Consider the grammar

S → 1 | S 0 ( β = 1 and α = 0 )

can be rewritten as

S → 1 S'

S' → 0 S' | ε

24

More Elimination of Left Recursion

• In general

S → S α1 | … | S αn | β1 | … | βm

• All strings derived from S start with one of β1,…,βm and continue with several instances of α1,…,αn

• Rewrite as

S → β1 S' | … | βm S'

S' → α1 S' | … | αn S' | ε
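The rewrite rule above is mechanical, so it can be expressed as a small function. The following Python sketch assumes a representation chosen for this illustration only: a right-hand side is a list of symbol strings, the empty list stands for ε, and the new non-terminal is named by appending a prime. The function name is hypothetical.

```python
def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite S -> S α1 | ... | S αn | β1 | ... | βm without immediate left recursion.

    `alternatives` is a list of right-hand sides; each is a list of symbols
    and [] denotes ε. Returns a dict mapping non-terminals to alternatives.
    """
    new_nt = nt + "'"
    # Left-recursive alternatives S -> S αi: keep only the αi part.
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    # Remaining alternatives S -> βj.
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not alphas:                       # nothing to do
        return {nt: alternatives}
    return {
        nt: [beta + [new_nt] for beta in betas],                      # S  -> βj S'
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],        # S' -> αi S' | ε
    }
```

Applied to E → E + T | T, written as `[["E", "+", "T"], ["T"]]`, the function returns E → T E' and E' → + T E' | ε. It handles only immediate left recursion; indirect left recursion needs a substitution step first.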

25

General Left Recursion

• The grammar

S → A α | δ (1)

A → S β (2)

is also left-recursive because

S ⇒+ S β α

• This left recursion can also be eliminated by first substituting (2) into (1)

• There is a general algorithm (e.g. Aho, Sethi, Ullman §4.3)

26

Predictive Parsing

• Wouldn't it be nice if

– the r.d. parser just knew which production to expand next?

– Idea:

switch ( something ) {

case L1: return E1();

case L2: return E2();

otherwise: print "syntax error";

}

– what's "something", L1, L2?

• the parser will do lookahead (look at next token)

27

Predictive parsing (Contd..)

• Modification of Recursive descent top down parsing

in which parser “predicts” which production to use

– By looking at the next few tokens

– No backtracking

• Predictive parsers accept LL(k) grammars

– L means “left-to-right” scan of input

– L means “leftmost derivation”

– k means “predict based on k tokens of lookahead”

• In practice, LL(1) is used

28

LL(1) Languages

• LL(1) means that for each non-terminal and input token there is at most one production that could lead to success.

• LL(k) means that for each non-terminal and k tokens of lookahead, there is at most one production that could lead to success.

29

But First: Left Factoring

• Consider the grammar

E → T + E | T

T → int | int * T | ( E )

Impossible to predict because

– For T two productions start with int

– For E it is not clear how to predict

• A grammar must be left-factored before use for predictive parsing

30

Left-Factoring Example

• Starting with the grammar

– E → T + E | T

– T → int | int * T | ( E )

• Factor out common prefixes of productions

E → T X

X → + E | ε

T → ( E ) | int Y

Y → * T | ε

31

Left-Factoring (cont.)

• In general,

A → αβ1 | αβ2, where α is non-empty and the first symbols of β1 and β2 (if they have one) are different.

When processing α we cannot know whether to expand

A to αβ1 or

A to αβ2.

But if we rewrite the grammar as follows

A → αA'

A' → β1 | β2

we can immediately expand A to αA'.

32

Left-Factoring -- Algorithm

• For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, say

A → αβ1 | ... | αβn | γ1 | ... | γm

convert it into

A → αA' | γ1 | ... | γm

A' → β1 | ... | βn
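One pass of this algorithm can be sketched in Python as below. The representation (right-hand sides as lists of symbols, [] for ε), the function names and the scheme of naming new non-terminals with added primes are assumptions made for this sketch. It groups alternatives by their first symbol and factors out the longest prefix shared by each group.

```python
def common_prefix(seqs):
    """Longest prefix shared by all sequences in `seqs`."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix

def left_factor(nt, alternatives):
    """One pass of left factoring for the non-terminal `nt`.

    Alternatives that share a first symbol are replaced by
    nt -> prefix nt', with nt' deriving the differing remainders.
    """
    groups = {}
    for rhs in alternatives:                       # group by first symbol
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {nt: []}
    primes = 0
    for first, group in groups.items():
        if first is None or len(group) == 1:       # nothing to factor
            result[nt].extend(group)
            continue
        prefix = common_prefix(group)
        primes += 1
        new_nt = nt + "'" * primes                 # A', A'', ...
        result[nt].append(prefix + [new_nt])
        result[new_nt] = [rhs[len(prefix):] for rhs in group]
    return result
```

With `left_factor("A", [list("abB"), list("aB"), list("cdg"), list("cdeB"), list("cdfB")])` the result is A → aA' | cdA'', A' → bB | B, A'' → g | eB | fB. The new non-terminals may themselves need another pass if their alternatives still share a prefix.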

33

Left-Factoring – Example1

A → abB | aB | cdg | cdeB | cdfB

A → aA' | cdA''

A' → bB | B

A'' → g | eB | fB

34

Predictive Parser (example)

stmt → if ...... |

while ...... |

begin ...... |

for .....

• When we are trying to rewrite the non-terminal stmt, if the current token is if, we have to choose the first production rule.

• When we are trying to rewrite the non-terminal stmt, we can uniquely choose the production rule by just looking at the current token.

• Even after we eliminate the left recursion in a grammar and left factor it, the grammar may still not be suitable for predictive parsing (not an LL(1) grammar).

35

Non-Recursive Predictive Parsing -- LL(1) Parser

• Non-Recursive predictive parsing is a table-driven parser.

• It is a top-down parser.

• It is also known as LL(1) Parser.

[Figure: block diagram. The non-recursive predictive parser reads the input buffer, works with a stack, consults the parsing table, and produces the output.]

36

LL(1) Parser

input buffer

– our string to be parsed. We will assume that its end is marked with a special symbol $.

output

– a production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.

37

stack

– contains the grammar symbols

– at the bottom of the stack, there is a special end marker symbol $.

– initially the stack contains only the symbol $ and the starting symbol S, so the initial stack is $S.

– when the stack is emptied (i.e. only $ is left in the stack), the parsing is completed.

• parsing table

– a two-dimensional array M[A,a]

– each row is a non-terminal symbol

– each column is a terminal symbol or the special symbol $

– each entry holds a production rule.

38

• INITIAL CONFIGURATION

  Stack    Input Buffer
  $S       input string $

• FINAL CONFIGURATION

  Stack    Input Buffer
  $        $

39

LL(1) Parser – Parser Actions

• The symbol at the top of the stack (say X) and the current symbol in the input string (say a) determine the parser action.

• There are four possible parser actions.

1. If X and a are both $ (final configuration), the parser halts (successful completion).

2. If X and a are the same terminal symbol (different from $), the parser pops X from the stack and moves to the next symbol in the input buffer.

3. If X is a non-terminal, the parser looks at the parsing table entry M[X,a]. If M[X,a] holds a production rule X → Y1Y2...Yk, it pops X from the stack and pushes Yk,Yk-1,...,Y1 onto the stack, so that Y1 ends up on top. The parser also outputs the production rule X → Y1Y2...Yk to represent a step of the derivation.

4. None of the above: error.

– All empty entries in the parsing table are errors.

– If X is a terminal symbol different from a, this is also an error case.
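The four actions translate almost directly into a loop. The sketch below is an illustration under assumptions, not the slides' implementation: the table is a Python dict keyed by (non-terminal, terminal) whose values are right-hand sides as lists ([] for ε), and the sample `TABLE` is the one for the grammar S → aBc, B → bB | ε. The name `ll1_parse` is hypothetical.

```python
def ll1_parse(table, start, tokens):
    """Table-driven LL(1) parse; returns the productions applied, in order.

    Raises SyntaxError on an empty table entry or a terminal mismatch.
    Non-terminals are taken to be the row keys of `table`.
    """
    nonterminals = {row for (row, _) in table}
    stack = ["$", start]                  # bottom marker, then the start symbol
    buf = list(tokens) + ["$"]            # input followed by the end marker
    pos = 0
    output = []
    while True:
        X, a = stack[-1], buf[pos]
        if X == "$" and a == "$":         # action 1: accept
            return output
        if X in nonterminals:             # action 3: expand using M[X, a]
            rhs = table.get((X, a))
            if rhs is None:
                raise SyntaxError(f"no entry M[{X},{a}]")
            stack.pop()
            stack.extend(reversed(rhs))   # push Yk ... Y1 so Y1 is on top
            output.append((X, rhs))
        elif X == a:                      # action 2: match a terminal
            stack.pop()
            pos += 1
        else:                             # action 4: error
            raise SyntaxError(f"expected {X}, found {a}")

# Parsing table for S -> aBc, B -> bB | ε
TABLE = {("S", "a"): ["a", "B", "c"], ("B", "b"): ["b", "B"], ("B", "c"): []}
```

`ll1_parse(TABLE, "S", "abbc")` returns the productions S → aBc, B → bB, B → bB, B → ε in that order, which is the leftmost derivation of abbc.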

41

LL(1) Parser – Example1

S → aBc

B → bB | ε

LL(1) Parsing Table

      a          b          c         $
S     S → aBc
B                B → bB     B → ε

stack     input     output
$S        abbc$     S → aBc
$cBa      abbc$
$cB       bbc$      B → bB
$cBb      bbc$
$cB       bc$       B → bB
$cBb      bc$
$cB       c$        B → ε
$c        c$
$         $         accept, successful completion

42

LL(1) Parser – Example1 (cont.)

Outputs: S → aBc, B → bB, B → bB, B → ε

Derivation (left-most): S ⇒ aBc ⇒ abBc ⇒ abbBc ⇒ abbc

Parse tree:

S
  a
  B
    b
    B
      b
      B
        ε
  c

43

LL(1) Parser – Example2

E → TE'

E' → +TE' | ε

T → FT'

T' → *FT' | ε

F → (E) | id

      id         +            *            (          )          $
E     E → TE'                              E → TE'
E'               E' → +TE'                            E' → ε     E' → ε
T     T → FT'                              T → FT'
T'               T' → ε       T' → *FT'               T' → ε     T' → ε
F     F → id                               F → (E)

1. E → TE'   2. E' → +TE'   3. E' → ε

4. T → FT'   5. T' → *FT'   6. T' → ε

7. F → (E)   8. F → id

44

LL(1) Parser – Example2

stack       input     output
$E          id+id$    E → TE'
$E'T        id+id$    T → FT'
$E'T'F      id+id$    F → id
$E'T'id     id+id$
$E'T'       +id$      T' → ε
$E'         +id$      E' → +TE'
$E'T+       +id$
$E'T        id$       T → FT'
$E'T'F      id$       F → id
$E'T'id     id$
$E'T'       $         T' → ε
$E'         $         E' → ε
$           $         accept

1. E → TE'   2. E' → +TE'   3. E' → ε

4. T → FT'   5. T' → *FT'   6. T' → ε

7. F → (E)   8. F → id

      id   +   *   (   )   $
E     1            1
E'         2           3   3
T     4            4
T'         6   5       6   6
F     8            7

45

Constructing LL(1) Parsing Tables

• Two functions are used in the construction of LL(1) parsing tables: FIRST and FOLLOW.

• FIRST(α) is the set of terminal symbols that occur as first symbols in strings derived from α, where α is any string of grammar symbols.

• If α derives ε, then ε is also in FIRST(α).

• FOLLOW(A) is the set of terminals that occur immediately after (follow) the non-terminal A in the strings derived from the starting symbol.

– a terminal a is in FOLLOW(A) if S ⇒* αAaβ

– $ is in FOLLOW(A) if S ⇒* αA

46

Compute FIRST for Any String X [FIRST(X)]

1. If X is a terminal symbol or ε, then FIRST(X)={X}.

2. If X is a non-terminal symbol and X → ε is a production rule, then ε is in FIRST(X).

3. If X is a non-terminal symbol and X → Y1Y2..Yn is a production rule:

if a terminal a is in FIRST(Yi) and ε is in all FIRST(Yj) for j=1,...,i-1, then a is in FIRST(X);

if ε is in all FIRST(Yj) for j=1,...,n, then ε is in FIRST(X).
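The three rules can be applied repeatedly until no FIRST set changes. The Python sketch below assumes a grammar given as a dict from non-terminal to a list of right-hand sides (lists of symbols, [] for ε); any symbol that is not a key of the dict is treated as a terminal. The names `EPS`, `first_sets` and `first_of_string` are chosen for this illustration.

```python
EPS = "ε"

def first_of_string(symbols, first):
    """FIRST of a string of grammar symbols, given the current FIRST sets."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})     # rule 1: FIRST(a) = {a} for a terminal
        result |= sym_first - {EPS}
        if EPS not in sym_first:              # this symbol cannot vanish: stop here
            return result
    result.add(EPS)                           # every symbol can derive ε (or the string is empty)
    return result

def first_sets(grammar):
    """Compute FIRST for every non-terminal by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:          # rules 2 and 3, one production at a time
                before = len(first[nt])
                first[nt] |= first_of_string(rhs, first)
                if len(first[nt]) != before:
                    changed = True
    return first

EXPR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
```

For the expression grammar, `first_sets(EXPR)` gives FIRST(E) = FIRST(T) = FIRST(F) = {(, id}, FIRST(E') = {+, ε} and FIRST(T') = {*, ε}.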

47

Example

1. X → a   FIRST(X)={a}

2. X → ε   FIRST(X)={ε}

3. X → a | ε   FIRST(X)={a, ε}

4. X → AbB, A → aB, B → ε

FIRST(X)={a} FIRST(A)={a} FIRST(B)={ε}

5. X → ABC, A → ε, B → ε, C → c

FIRST(X)={c} FIRST(A)={ε} FIRST(B)={ε}

48

FIRST Example

E → TE'

E' → +TE' | ε

T → FT'

T' → *FT' | ε

F → (E) | id

FIRST(F) = {(,id}

FIRST(T') = {*, ε}

FIRST(T) = {(,id}

FIRST(E') = {+, ε}

FIRST(E) = {(,id}

49

E → TE'

• FIRST(E) = ?

– E is a non-terminal and has a production E → TE'. From rule 3:

• Add all the non-ε symbols of FIRST(T). FIRST(E') is added as well only if the preceding non-terminal T can derive ε.

• FIRST(T) = ?

– T is a non-terminal and has a production rule T → FT'. From rule 3:

– Add all the non-ε symbols of FIRST(F). FIRST(T') is added as well only if the preceding non-terminal F can derive ε.

• FIRST(F) = ?

– F is a non-terminal and has the productions F → id | (E).

FIRST(F) = { id, ( }

Hence

FIRST(E)=FIRST(T)=FIRST(F)={ (,id }

50

FIRST(E') AND FIRST(T')

E' → +TE' | ε

• FIRST(E') = FIRST(+TE') ∪ FIRST(ε)

= {+, ε}

T' → *FT' | ε

• FIRST(T') = FIRST(*FT') ∪ FIRST(ε)

= {*, ε}

51

FIRST SETS

FIRST(E) = {(,id}

FIRST(T) = {(,id}

FIRST(E’) = {+, ε}

FIRST(T’) = {*, ε}

FIRST(F) = {(,id}

E → TE'

E' → +TE' | ε

T → FT'

T' → *FT' | ε

F → (E) | id

52

Compute FOLLOW (for non-terminals)

• If S is the start symbol, $ is in FOLLOW(S).

• If A → αBβ is a production rule, everything in FIRST(β) except ε is in FOLLOW(B).

• If ( A → αB is a production rule ) or ( A → αBβ is a production rule and ε is in FIRST(β) ), everything in FOLLOW(A) is in FOLLOW(B).

We apply these rules until nothing more can be added to any FOLLOW set.
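A fixed-point loop over the three rules might look like this in Python. The sketch is an assumption-laden illustration: the grammar is a dict of right-hand-side lists ([] for ε), the FIRST sets for the expression grammar are supplied as data rather than recomputed, and the names `follow_sets` and `first_of_string` are hypothetical.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
# FIRST sets of the expression grammar, taken as given.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}

def first_of_string(symbols):
    """FIRST of a string of symbols; terminals are their own FIRST set."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}

def follow_sets(grammar, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")                         # rule 1
    changed = True
    while changed:                                 # repeat until nothing is added
        changed = False
        for A, alternatives in grammar.items():
            for rhs in alternatives:
                for i, B in enumerate(rhs):
                    if B not in grammar:           # only non-terminals have FOLLOW
                        continue
                    tail = first_of_string(rhs[i + 1:])
                    new = tail - {EPS}             # rule 2
                    if EPS in tail:                # rule 3 (covers an empty tail too)
                        new |= follow[A]
                    if not new <= follow[B]:
                        follow[B] |= new
                        changed = True
    return follow
```

`follow_sets(GRAMMAR, "E")` yields FOLLOW(E) = FOLLOW(E') = {$, )}, FOLLOW(T) = FOLLOW(T') = {+, $, )} and FOLLOW(F) = {*, +, $, )}.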

53

FOLLOW Example

E → TE'

E' → +TE' | ε

T → FT'

T' → *FT' | ε

F → (E) | id

• FOLLOW(E)

• Since E is the start symbol, add $ to the FOLLOW set.

• From rule 2 applied to F → (E), the terminal ) follows E. So add ) to the FOLLOW set of E as well.

• Hence

• FOLLOW(E) = { $, ) }

54

• FOLLOW(E') : [E → TE', E' → +TE']

• From rule (3), everything in FOLLOW(E) is added to FOLLOW(E').

• Hence FOLLOW(E') = { $, ) }

• FOLLOW(T) : [E → TE', E' → +TE']

• From rule (2), FIRST(E') except ε is added to FOLLOW(T).

• From rule (3), since FIRST(E') contains ε, add FOLLOW(E) to FOLLOW(T).

• Hence FOLLOW(T) = { +, $, ) }

55

• FOLLOW(F) : [T → FT', T' → *FT']

• From rule (2), FIRST(T') except ε is added to FOLLOW(F).

• From rule (3), since FIRST(T') contains ε, add FOLLOW(T) to FOLLOW(F).

• Hence FOLLOW(F) = { *, +, $, ) }

• FOLLOW(T') : [T → FT', T' → *FT']

• From rule (3), everything in FOLLOW(T) is added to FOLLOW(T').

• Hence FOLLOW(T') = { +, $, ) }

56

FOLLOW SETS

FOLLOW(E) = { $, ) }

FOLLOW(E') = { $, ) }

FOLLOW(T) = { +, ), $ }

FOLLOW(T') = { +, ), $ }

FOLLOW(F) = { +, *, ), $ }

57

EXERCISES

• COMPUTE FIRST and FOLLOW

SETS for the following grammar

S → aBc

B → bB | ε

58

• SOLUTION

• FIRST(S)={a}

• FIRST(B)={b, ε}

• FOLLOW(S)={$}

• FOLLOW(B)={c}

59

• 2. statement → if-statement | other

If-statement → if ( exp ) statement else-part

Else-part → else statement | ε

Exp → 0 | 1

3: A → ( A ) A | ε

4:

Lexp → atom | list

Atom → number | identifier

List → ( lexp-seq )

Lexp-seq → lexp , lexp-seq | lexp

– Left factor the grammar

– Compute First and Follow for the resultant grammar.

60

The LL(1) Parse Table

• Let G be an LL(1) grammar and M be the parsing table.

– M has one row for each nonterminal A

– M has one column for each terminal symbol a, plus a

column for the end of input symbol “$”.

61

Constructing LL(1) Parsing Table -- Algorithm

• for each production rule A → α of a grammar G

– for each terminal a in FIRST(α), add A → α to M[A,a]

– if ε is in FIRST(α), for each terminal b in FOLLOW(A) add A → α to M[A,b]

– if ε is in FIRST(α) and $ is in FOLLOW(A), add A → α to M[A,$]

• All other undefined entries of the parsing table are error entries.
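The construction is a pair of nested loops once FIRST and FOLLOW are known. In the Python sketch below, both sets are supplied as data for the expression grammar, and each table entry is a list of right-hand sides so that a multiply-defined entry would show up as a list with more than one element. The dict-based layout and the names `build_table` and `first_of_string` are assumptions of this sketch.

```python
EPS = "ε"

GRAMMAR = {
    "E": [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T": [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F": [["(", "E", ")"], ["id"]],
}
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {"$", ")"}, "E'": {"$", ")"}, "T": {"+", "$", ")"},
          "T'": {"+", "$", ")"}, "F": {"*", "+", "$", ")"}}

def first_of_string(symbols, first):
    """FIRST of a right-hand side; terminals are their own FIRST set."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    return result | {EPS}

def build_table(grammar, first, follow):
    """Return M as a dict: (non-terminal, terminal) -> list of right-hand sides."""
    table = {}
    for A, alternatives in grammar.items():
        for rhs in alternatives:
            f = first_of_string(rhs, first)
            columns = f - {EPS}                # terminals in FIRST(α)
            if EPS in f:                       # α can vanish: use FOLLOW(A), which may contain $
                columns |= follow[A]
            for a in columns:
                table.setdefault((A, a), []).append(rhs)
    return table
```

For the expression grammar this produces 13 filled entries, each holding exactly one production; missing keys play the role of error entries.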

62

Constructing LL(1) Parsing Table -- Example

S → aBc    B → bB | ε

FIRST(S)={a} FIRST(B)={b, ε}

FOLLOW(S)={$} FOLLOW(B)={c}

S → aBc

First(S)=First(aBc)={a}

Hence M[S,a] = S → aBc

B → bB | ε

First(B)={b, ε}

M[B,b] = B → bB

Follow(B)={c}

Hence M[B,c] = B → ε

      a          b          c         $
S     S → aBc
B                B → bB     B → ε

63

Expression Grammar Parse Table

E → TE'

E' → +TE' | ε

T → FT'

T' → *FT' | ε

F → ( E ) | id

• First (E) = First (T) = First (F) = { (, id }

• First (E') = { +, ε }

• First (T') = { *, ε }

• Follow (E) = Follow (E') = { $, ) }

• Follow (T) = Follow (T') = { +, $, ) }

• Follow (F) = { *, +, $, ) }

64

Expression Grammar Parse Table

The table is filled in one production at a time, using the First and Follow sets listed above.

• E → TE' : Since First(TE') = First(T) = { (, id }, we add E → TE' to M[E,(] and M[E,id].

• E' → +TE' : Since First(+TE') = {+}, we add E' → +TE' to M[E',+].

• E' → ε : We must examine Follow(E') = { $, ) }. We add E' → ε to M[E',)] and M[E',$].

• T → FT' : Since First(FT') = First(F) = { (, id }, we add T → FT' to M[T,(] and M[T,id].

• T' → *FT' : Since First(*FT') = {*}, we add T' → *FT' to M[T',*].

• T' → ε : We examine Follow(T') = { +, $, ) }. We add T' → ε to M[T',+], M[T',)], and M[T',$].

• F → ( E ) : We add F → ( E ) to M[F,(].

• F → id : We add F → id to M[F,id].

      id          +            *            (            )          $
E     E → TE'                               E → TE'
E'                E' → +TE'                              E' → ε     E' → ε
T     T → FT'                               T → FT'
T'                T' → ε       T' → *FT'                 T' → ε     T' → ε
F     F → id                                F → ( E )

The completed parse table for the expression grammar

73

Exercises on Parsing Table Construction

• 1. statement → if-statement | other

If-statement → if ( exp ) statement else-part

Else-part → else statement | ε

Exp → 0 | 1

2: A → ( A ) A | ε

3.

Lexp → atom | list

Atom → number | identifier

List → ( lexp-seq )

Lexp-seq → lexp , lexp-seq | lexp

– Show the actions of the corresponding LL(1) parser, given the input string (a,(b,(2)),( c )).

74

LL(1) Grammars

• A grammar whose parsing table has no multiply-defined entries is said to be an LL(1) grammar.

– first L: input scanned from left to right

– second L: left-most derivation

– 1: one input symbol used as a lookahead symbol to determine the parser action

• An entry of the parsing table may contain more than one production rule. In this case, we say that the grammar is not an LL(1) grammar.

75

A Grammar which is not LL(1)

S → i C t S E | a     FOLLOW(S) = { $, e }

E → e S | ε           FOLLOW(E) = { $, e }

C → b                 FOLLOW(C) = { t }

FIRST(iCtSE) = {i}

FIRST(a) = {a}

FIRST(eS) = {e}

FIRST(ε) = {ε}

FIRST(b) = {b}

      a        b        e                  i             t     $
S     S → a                                S → iCtSE
E                       E → eS, E → ε                          E → ε
C              C → b

two production rules for M[E,e]

Problem: ambiguity

76

A Grammar which is not LL(1) (cont.)

• What do we have to do if the resulting parsing table contains multiply defined entries?

– If we did not eliminate left recursion, eliminate the left recursion in the grammar.

– If the grammar is not left factored, we have to left factor the grammar.

– If the new grammar's parsing table still contains multiply defined entries, that grammar is ambiguous or it is inherently not an LL(1) grammar.

77

• A left recursive grammar cannot be a LL(1) grammar.

• A grammar that is not left factored cannot be an LL(1) grammar.

• An ambiguous grammar cannot be a LL(1) grammar.

78

Properties of LL(1) Grammars

• A grammar G is LL(1) if and only if the following conditions hold for any two distinct production rules A → α and A → β:

- α and β cannot both derive strings starting with the same terminal.

- At most one of α and β can derive ε.

- If β can derive ε, then α cannot derive any string starting with a terminal in FOLLOW(A).
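These three conditions can be checked pairwise once FIRST(α), FIRST(β) and FOLLOW(A) are known. The following Python sketch takes those sets as plain inputs; the function name `ll1_conflicts` and the wording of the returned messages are hypothetical, chosen for this illustration.

```python
EPS = "ε"

def ll1_conflicts(first_alpha, first_beta, follow_a):
    """Check the three LL(1) conditions for one pair A -> α | β.

    Arguments are FIRST(α), FIRST(β) and FOLLOW(A) as sets; ε is written EPS.
    Returns a list of violated conditions (empty if the pair is fine).
    """
    problems = []
    # Condition 1: α and β must not start with a common terminal.
    if (first_alpha - {EPS}) & (first_beta - {EPS}):
        problems.append("common first terminal")
    # Condition 2: at most one of α and β may derive ε.
    if EPS in first_alpha and EPS in first_beta:
        problems.append("both derive ε")
    # Condition 3, applied in both directions.
    if EPS in first_beta and (first_alpha - {EPS}) & follow_a:
        problems.append("FIRST(α) overlaps FOLLOW(A)")
    if EPS in first_alpha and (first_beta - {EPS}) & follow_a:
        problems.append("FIRST(β) overlaps FOLLOW(A)")
    return problems
```

For the pair E → eS | ε with FOLLOW(E) = {$, e}, `ll1_conflicts({"e"}, {EPS}, {"$", "e"})` reports the FIRST/FOLLOW overlap, whereas for E' → +TE' | ε with FOLLOW(E') = {$, )} the returned list is empty.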

79

Non LL(1) Examples

Grammar                    Not LL(1) because

S → S a | a                Left recursive

S → a S | a                FIRST(a S) ∩ FIRST(a) ≠ ∅

S → a R | ε                For R: S ⇒* ε and ε ⇒* ε
R → S | ε

S → a R a                  For R: FIRST(S) ∩ FOLLOW(R) ≠ ∅
R → S | ε

80

Error Recovery in Predictive Parsing

• An error may occur in the predictive parsing (LL(1) parsing)

– if the terminal symbol on the top of stack does not match with

the current input symbol.

– if the top of stack is a non-terminal A, the current input symbol is a,

and the parsing table entry M[A,a] is empty.

• What should the parser do in an error case?

– The parser should be able to give an error message (as meaningful an error message as possible).

– It should be able to recover from that error case and continue parsing the rest of the input.

81

Example

82

Parsing table M[N][T] (columns: (, number, ), +, -, *, $; empty entries are errors):

M[exp, (] = M[exp, number] = exp → term exp'

M[exp', +] = M[exp', -] = exp' → addop term exp'

M[exp', )] = M[exp', $] = exp' → ε

M[addop, +] = addop → +

M[addop, -] = addop → -

M[term, (] = M[term, number] = term → factor term'

M[term', *] = term' → mulop factor term'

M[term', )] = M[term', +] = M[term', -] = M[term', $] = term' → ε

M[mulop, *] = mulop → *

M[factor, (] = factor → ( exp )

M[factor, number] = factor → number

Example: Parse 1 + 2 * 3

Or: number + number * number

The slides animate the stack and the remaining input step by step (top of stack at the right):

stack                        remaining input        action
$ exp                        num + num * num $      exp → term exp'
$ exp' term                  num + num * num $      term → factor term'
$ exp' term' factor          num + num * num $      factor → number
$ exp' term' num             num + num * num $      match num
$ exp' term'                 + num * num $          term' → ε
$ exp'                       + num * num $          exp' → addop term exp'
$ exp' term addop            + num * num $          addop → +
$ exp' term +                + num * num $          match +
$ exp' term                  num * num $            term → factor term'
$ exp' term' factor          num * num $            factor → number
$ exp' term' num             num * num $            match num
$ exp' term'                 * num $                term' → mulop factor term'
$ exp' term' factor mulop    * num $                mulop → *
$ exp' term' factor *        * num $                match *
$ exp' term' factor          num $                  factor → number
$ exp' term' num             num $                  match num
$ exp' term'                 $                      term' → ε
$ exp'                       $                      exp' → ε
$                            $

110

Successful Parse!

111

Self Study

• Error Recovery Techniques

• Panic-Mode Error Recovery

• Phrase-Level Error Recovery

• Error-Productions

• Global-Correction

• Reference :

• Aho, Sethi and Ullman