Date post: | 23-Jan-2016 |

Transcript:

1. Top-Down Parsing

The parse tree is created from top to bottom.

Top-down parsers:

Recursive-Descent Parsing

Backtracking is needed (If a choice of a production rule does not work, we backtrack to try other alternatives.)

It is a general parsing technique, but not widely used.

Not efficient

Predictive Parsing

no backtracking

efficient

needs a special form of grammars (LL(1) grammars).

Non-Recursive (Table Driven) Predictive Parser is also known as LL(1) parser.

2. Top-Down Parsing

Begin with the start symbol at the root of the parse tree

Build the parse tree from the top down

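3. Top-Down Parsing

S → aSbS | bSaS | ε

[Figure: parse tree with root S expanded to a S b S; one child S is expanded to b S a S, and the remaining S nodes derive ε.]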

4. Parsing Decisions

Which nonterminal in the parse tree should be expanded?

Which of its grammar rules should be used to expand it?

5. Nondeterministic Parser

Expand any nonterminal.

Expand it using a grammar rule that occurs in the derivation of the input string.

6. Backtracking Parser

Expand the leftmost nonterminal in the parse tree.

Try a grammar rule for the nonterminal. If it does not work out, try another one.

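7. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. The root S is expanded with S → aSa.]

8. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. The root S is expanded with S → bSb; the inner S is expanded with S → aSa.]

9. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. S → bSb is applied twice; the innermost S is expanded with S → aSa.]

10. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. S → bSb is applied three times.]

11. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. S → bSb is applied twice; the innermost S is expanded with S → a.]

12. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. S → bSb is applied twice; the innermost S is expanded with S → b.]

13. Backtracking Parser

S → aSa | bSb | a | b

Input: b b b

[Figure: parse tree. The root S is expanded with S → bSb and the inner S with S → b, which matches the whole input.]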
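
The trial-and-error search traced on these slides can be sketched in a few lines. This is a minimal illustration rather than the lecture's own code: the names GRAMMAR, derive, and accepts are hypothetical, and the grammar is the one shown above, S → aSa | bSb | a | b. Alternatives are tried in order, and moving on to the next alternative after a failed match is the backtracking step.

```python
# Grammar from the slides: S -> aSa | bSb | a | b
GRAMMAR = {"S": [["a", "S", "a"], ["b", "S", "b"], ["a"], ["b"]]}


def derive(symbols, text, pos):
    """Yield every input position reachable after matching `symbols` from `pos`."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Nonterminal: try each alternative in order. Falling through to the
        # next alternative after a failure is the backtracking step.
        for alternative in GRAMMAR[head]:
            for mid in derive(alternative, text, pos):
                yield from derive(rest, text, mid)
    elif pos < len(text) and text[pos] == head:
        # Terminal: it must equal the current input symbol.
        yield from derive(rest, text, pos + 1)


def accepts(text):
    """True if some sequence of choices consumes the whole input."""
    return any(end == len(text) for end in derive(["S"], text, 0))
```

Calling accepts("bbb") explores the same dead ends as the slides before it finds S → bSb with the inner S → b. The sketch terminates only because every alternative starts with a terminal; a left-recursive grammar would recurse forever.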

14. Recursive Descent Parsing

Basic idea:

Write a routine to recognize each left-hand side (nonterminal).

This produces a parser with mutually recursive routines.

Good for hand-coded parsers.

Ex: A → aBb (This is the only production rule for A)

proc A {

- match the current token with a, and move to the next token;

- call B;

- match the current token with b, and move to the next token;

}

15. Recursive Descent Parsing (cont.)

When to apply ε-productions.

A → aA | bB | ε

If all other productions fail, we should apply an ε-production. For example, if the current token is not a or b, we may apply the ε-production.

16. Recursive Descent Parsing (cont.)

A → aBb | bAB

proc A {

case of the current token {

a: - match the current token with a, and move to the next token;

- call B;

- match the current token with b, and move to the next token;

b: - match the current token with b, and move to the next token;

- call A;

- call B;

}

}

17. Recursive Descent Parser for a Simple Declaration Statement

decl_stmt → type idlist;
type → int | float
idlist → id | id , idlist

proc declstmt()
{
    call type();
    call idlist();
}

proc type()
{
    case of the current token {
        int: match the current token with int, and move to the next token;
        float: match the current token with float, and move to the next token;
    }
}

Write the code for the nonterminal idlist
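
One possible answer to the exercise, written as a runnable sketch. Several things here are assumptions and not part of the slides: the class and function names (DeclParser, ParseError, parse_decl), the representation of the input as a list of token strings, the "$" end marker, and the reading of the trailing ";" in the first rule as a terminal. The two idlist alternatives both start with id, so the routine matches id first and then looks at the next token to decide whether the list continues.

```python
class ParseError(Exception):
    """Raised when the current token does not fit the grammar."""


class DeclParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of the input
        self.pos = 0

    @property
    def current(self):
        return self.tokens[self.pos]

    def match(self, expected):
        """Match the current token with `expected` and move to the next token."""
        if self.current != expected:
            raise ParseError(f"expected {expected!r}, found {self.current!r}")
        self.pos += 1

    def decl_stmt(self):
        self.type_()
        self.idlist()
        self.match(";")  # assumption: ';' is a terminal of decl_stmt

    def type_(self):
        if self.current in ("int", "float"):
            self.match(self.current)
        else:
            raise ParseError(f"expected a type, found {self.current!r}")

    def idlist(self):
        # idlist -> id | id , idlist
        self.match("id")
        if self.current == ",":
            self.match(",")
            self.idlist()


def parse_decl(tokens):
    """Return True if `tokens` form exactly one declaration statement."""
    parser = DeclParser(tokens)
    try:
        parser.decl_stmt()
        parser.match("$")
    except ParseError:
        return False
    return True
```

For example, parse_decl(["float", "id", ",", "id", ";"]) succeeds, while parse_decl(["int", ";"]) fails inside idlist because no id follows the type.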

18. A → aB | b will correspond to

A() {
    if (lookahead == 'a') {
        match('a');
        B();
    }
    else if (lookahead == 'b')
        match('b');
    else error();
}

19. Recursive descent parser for expression

E → T E'
E' → + T E' | ε
T → F T'
T' → * F T' | ε
F → ( E )
F → id

parse() {
    token = get_next_token();
    if (E() and token == '$') then return true
    else return false
}

E() {
    if (T()) then return Eprime()
    else return false
}

Eprime() {
    if (token == '+') then
        token = get_next_token()
        if (T()) then return Eprime()
        else return false
    else if (token == ')' or token == '$') then return true
    else return false
}

The remaining procedures are similar.
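
For reference, here is a complete recognizer for this grammar as a runnable sketch. The slides give only parse, E, and Eprime; the routines for T, T', and F below are filled in by analogy and are an assumption, as are the function name parse_expression and the token representation (a list of strings such as "id", "+", "(").

```python
def parse_expression(tokens):
    """Recognize E -> T E', E' -> + T E' | eps, T -> F T', T' -> * F T' | eps,
    F -> ( E ) | id. Returns True if the whole token list is an expression."""
    tokens = list(tokens) + ["$"]  # "$" marks the end of the input
    pos = 0

    def advance():
        nonlocal pos
        pos += 1

    def E():
        return T() and E_prime()

    def E_prime():
        if tokens[pos] == "+":
            advance()
            return T() and E_prime()
        # eps-production: allowed only if the next token can follow E'
        return tokens[pos] in (")", "$")

    def T():
        return F() and T_prime()

    def T_prime():
        if tokens[pos] == "*":
            advance()
            return F() and T_prime()
        # eps-production: allowed only if the next token can follow T'
        return tokens[pos] in ("+", ")", "$")

    def F():
        if tokens[pos] == "id":
            advance()
            return True
        if tokens[pos] == "(":
            advance()
            if E() and tokens[pos] == ")":
                advance()
                return True
        return False

    return E() and tokens[pos] == "$"
```

As in the pseudocode, each ε-production is taken only when the current token is one that may legally follow the nonterminal, so an input such as id id is rejected inside T_prime instead of being silently accepted.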

20. When Top-Down Parsing Doesn't Work Well

Consider the productions S → S a | a:

In the process of parsing S we try the above rules.

Applied consistently in this order, they give an infinite loop.

We could re-order the productions, but the search would have lots of backtracking, and a general rule for ordering is complex.

The problem here is that the grammar is left-recursive: S ⇒+ S α for some α.

21. Left Recursion

E → E + T | T
T → T * F | F
F → n | (E)

[Figure: parse tree in which E is expanded to E + T, and the leftmost E is again expanded to E + T.]

22. Elimination of Left Recursion

Consider the left-recursive grammar

S → S α | β

S generates all strings that start with a β followed by any number of α's.

It can be rewritten using right recursion:

S → β S'
S' → α S' | ε

23. Elimination of Left Recursion. Example

Consider the grammar

S → 1 | S 0 (β = 1 and α = 0)

It can be rewritten as

S → 1 S'
S' → 0 S' | ε

24. More Elimination of Left Recursion

In general

S → S α1 | ... | S αn | β1 | ... | βm

All strings derived from S start with one of β1, ..., βm and continue with several instances of α1, ..., αn.

Rewrite as

S → β1 S' | ... | βm S'
S' → α1 S' | ... | αn S' | ε
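
The rewrite above is mechanical, so it can be sketched as a small function. The function name and the data representation are assumptions made for illustration: each alternative is a list of symbols, the empty list stands for ε, and the new nonterminal is named by appending a prime. The sketch handles immediate left recursion only and assumes every α is non-empty.

```python
def eliminate_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite S -> S a1 | ... | S an | b1 | ... | bm as
    S -> b1 S' | ... | bm S' and S' -> a1 S' | ... | an S' | eps."""
    # The alphas: what follows the leading S in each left-recursive alternative.
    alphas = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # The betas: alternatives that do not start with S.
    betas = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not alphas:
        return {nonterminal: alternatives}  # nothing to do
    fresh = nonterminal + "'"
    return {
        nonterminal: [beta + [fresh] for beta in betas],
        fresh: [alpha + [fresh] for alpha in alphas] + [[]],  # [] is eps
    }
```

Applied to S → S 0 | 1 this returns S → 1 S' and S' → 0 S' | ε, and applied to E → E + T | T it returns E → T E' and E' → + T E' | ε.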

25. General Left Recursion

The grammar

S → A α | δ (1)
A → S β (2)

is also left-recursive because

S ⇒+ S β α

This left recursion can also be eliminated, by first substituting (2) into (1).

There is a general algorithm (e.g. Aho, Sethi, Ullman, Section 4.3).
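
As a worked instance of the substitution step, write the grammar as S → A α | δ and A → S β. Replacing A in the first rule by the body of the second gives an immediately left-recursive rule, which can then be rewritten with a new nonterminal. The name S' for that nonterminal is an assumption made here for illustration.

```latex
\begin{align*}
S  &\to S\,\beta\,\alpha \mid \delta
      && \text{substitute } A \to S\,\beta \text{ into } S \to A\,\alpha \mid \delta \\
S  &\to \delta\, S'
      && \text{every string derived from } S \text{ starts with } \delta \\
S' &\to \beta\,\alpha\, S' \mid \varepsilon
      && \text{followed by any number of } \beta\alpha
\end{align*}
```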

26. Predictive Parsing

Wouldn't it be nice if the recursive-descent parser just knew which production to expand next?

Idea:

switch ( something ) {
    case L1: return E1();
    case L2: return E2();
    otherwise: print "syntax error";
}

What are something, L1, and L2?

The parser will do lookahead (look at the next token).

27. Predictive Parsing (cont.)

A modification of recursive-descent top-down parsing in which the parser predicts which production to use by looking at the next few tokens.

No backtracking.

Predictive parsers accept LL(k) grammars:

The first L means left-to-right scan of the input.

The second L means leftmost derivation.

k means predict based on k tokens of lookahead.

In practice, LL(1) is used.

28. LL(1) Languages

For each non-terminal and input token there is at most one production that could lead to success.

LL(k) means that for each non-terminal and k tokens of lookahead, there is only one production that could lead to success.

29. But First: Left Factoring

Consider the grammar

E → T + E | T
T → int | int * T | ( E )

It is impossible to predict because:

For T, two productions start with int.

For E, it is not clear how to predict.

A grammar must be left-factored before it is used for predictive parsing.

30. Left-Factoring Example

Starting with the grammar

E → T + E | T
T → int | int * T | ( E )

Factor out the common prefixes of productions:

E → T X
X → + E | ε
T → ( E ) | int Y
Y → * T | ε

31. Left-Factoring (cont.)

In general,

A → αβ1 | αβ2

where α is non-empty and the first symbols of β1 and β2 (if they have one) are different.

When processing α we cannot know whether to expand A to αβ1 or A to αβ2.

But if we rewrite the grammar as follows

A → αA'
A' → β1 | β2

we can immediately expand A to αA'.

32. Left-Factoring -- Algorithm

For each non-terminal A with two or more alternatives (production rules) with a common non-empty prefix, say

A → αβ1 | ... | αβn | γ1 | ... | γm

convert it into

A → αA' | γ1 | ... | γm
A' → β1 | ... | βn

33. Left-Factoring Example 1

A → abB | aB | cdg | cdeB | cdfB

A → aA' | cdA''
A' → bB | B
A'' → g | eB | fB
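
One pass of the left-factoring step can be sketched as follows. The function names and the representation are assumptions made for illustration: each alternative is a list of symbols, the empty list stands for ε, and new nonterminals are named by appending primes. Alternatives are grouped by their first symbol, and the longest prefix shared by the whole group is factored out. The new nonterminals may need factoring again, which this single pass does not attempt.

```python
def common_prefix(sequences):
    """Longest list of symbols that every sequence starts with."""
    prefix = []
    for symbols in zip(*sequences):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(nonterminal, alternatives):
    """One pass of left factoring for a single nonterminal."""
    # Group the alternatives by their first symbol (None for eps).
    groups = {}
    for alt in alternatives:
        groups.setdefault(alt[0] if alt else None, []).append(alt)

    result = {nonterminal: []}
    fresh = nonterminal
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[nonterminal].extend(group)  # nothing to factor
            continue
        prefix = common_prefix(group)
        fresh += "'"  # A', A'', ...
        result[nonterminal].append(prefix + [fresh])
        result[fresh] = [alt[len(prefix):] for alt in group]
    return result
```

With the alternatives of Example 1 written as lists of single-character symbols, the function returns A → aA' | cdA'', A' → bB | B, and A'' → g | eB | fB.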

34. Predictive Parser (example)

stmt → if ...... |
       while ...... |
       begin ...... |
       for .....

When we are trying to expand the non-terminal stmt and the current token is if, we have to choose the first production rule.

When we are trying to expand the non-terminal stmt, we can uniquely choose the production rule by just looking at the current token.
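
The choice described above amounts to a lookup keyed on the current token. The sketch below is illustrative only: the function name choose_production is hypothetical, and the production bodies are left elided exactly as on the slide, so the function returns a label for the chosen rule instead of parsing anything.

```python
# One entry per alternative of stmt; the bodies are elided on the slide,
# so each production is represented only by a descriptive label.
STMT_PRODUCTIONS = {
    "if": "stmt -> if ...",
    "while": "stmt -> while ...",
    "begin": "stmt -> begin ...",
    "for": "stmt -> for ...",
}


def choose_production(current_token):
    """Pick the production for stmt by looking only at the current token."""
    if current_token not in STMT_PRODUCTIONS:
        raise SyntaxError(f"no stmt production starts with {current_token!r}")
    return STMT_PRODUCTIONS[current_token]
```

Because every alternative starts with a different keyword, the lookup never has two candidates for the same token, which is what makes backtracking unnecessary here.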

Even after we eliminate the left recursion in a grammar and left-factor it, the grammar may still not be suitable for predictive parsing (it may not be an LL(1) grammar).

35. Non-Recursive Predictive Parsing -- LL(1) Parser

Non-recursive predictive parsing is table-driven.

It is a top-down parser.

It is also known as an LL(1) parser.

[Figure: block diagram. The non-recursive predictive parser reads the input buffer, uses a stack, consults the parsing table, and produces the output.]

36. LL(1) Parser

input buffer

The string to be parsed. We will assume that its end is marked with a special symbol $.

output

A production rule representing a step of the derivation sequence (left-most derivation) of the string in the input buffer.

37. stack

Contains the grammar symbols.

At the bottom of the stack, there is a special end marker symbol $.

Initially the stack contains only the symbol $ and the starting symbol S, so $S is the initial stack.
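
To show how the input buffer, the stack, and the output fit together, here is a sketch of the standard table-driven loop. It goes beyond what the slides above spell out: the function name ll1_parse, the table format (a dictionary keyed by nonterminal and lookahead token), and the toy grammar S → a S b | ε are all assumptions chosen for illustration.

```python
def ll1_parse(table, start, tokens):
    """Return the productions used (a leftmost derivation), or None on error.

    `table` maps (nonterminal, lookahead) to a production body, given as a
    list of symbols; the empty list stands for eps.
    """
    nonterminals = {nonterminal for nonterminal, _ in table}
    buffer = list(tokens) + ["$"]  # the end of the input is marked with $
    stack = ["$", start]           # initial stack $S, with S on top
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        current = buffer[pos]
        if top in nonterminals:
            body = table.get((top, current))
            if body is None:       # empty table entry: syntax error
                return None
            output.append((top, body))
            stack.extend(reversed(body))  # leftmost symbol ends up on top
        elif top == current:
            pos += 1               # terminal (or $) matches the input
        else:
            return None
    return output if pos == len(buffer) else None


# Hypothetical parsing table for the toy grammar S -> a S b | eps.
TOY_TABLE = {
    ("S", "a"): ["a", "S", "b"],
    ("S", "b"): [],
    ("S", "$"): [],
}
```

For the input a a b b the loop emits S → a S b twice and then S → ε, which is the leftmost derivation of the string; for a b b it stops with an error when $ on the stack meets the extra b.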
