Top-Down Parsing

Post on 12-Jan-2016


CH4.1

314ALL

Dr. Mohamed Ramadan Saady

Top-Down Parsing

• Identify a leftmost derivation for an input string

• Why ?

• By always replacing the leftmost non-terminal symbol via a production rule, we are guaranteed to develop the parse tree in a left-to-right fashion that is consistent with scanning the input.

• A ⇒ aBc ⇒ adDc ⇒ adec (scan a, scan d, scan e, scan c - accept!)

• Recursive-descent parsing concepts

• Predictive parsing

• Recursive / Brute force technique

• non-recursive / table driven

• Error recovery

• Implementation

Top-Down Parsing

From Grammar to Parser, take I

Recursive Descent Parsing

• General category of top-down parsing

• Choose production rule based on input symbol

• May require backtracking to correct a wrong choice.

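• Example: S → c A d    A → a b | a

input: cad

1. Start with S as the root; the input pointer is at c.

2. Expand S → c A d. The leaf c matches the input c, so advance to a.

3. Expand A → a b. The leaf a matches the input a, so advance to d.

4. Problem: the leaf b does not match the input d, so backtrack. Reset the input pointer to a and try the other alternative for A.

5. Expand A → a. The leaf a matches, then the leaf d matches the input d. The parse succeeds.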
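The backtracking in this example can be sketched in Python. This is only an illustration for this one grammar, not a general parser; the function names `parse_S` and `parse_A` are made up for the sketch. `parse_A` offers its alternatives one at a time, and `parse_S` moves on to the next alternative whenever the rest of the input fails to match.

```python
# Grammar from the slide: S -> c A d,  A -> a b | a

def parse_A(text, pos):
    """Yield every position at which A could end when started at pos."""
    # Alternative 1: A -> a b
    if text.startswith("ab", pos):
        yield pos + 2
    # Alternative 2: A -> a
    if text.startswith("a", pos):
        yield pos + 1


def parse_S(text):
    """Return True if the whole text derives from S -> c A d."""
    if not text.startswith("c"):
        return False
    for after_A in parse_A(text, 1):
        if text[after_A:] == "d":
            return True  # this alternative for A led to a complete parse
        # Otherwise backtrack: the loop asks parse_A for its next alternative.
    return False


print(parse_S("cad"))   # uses A -> a
print(parse_S("cabd"))  # uses A -> a b
```

Trying alternatives in order and undoing the choice on failure is what makes this approach costly in general, which motivates the predictive techniques that follow.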

Top-Down Parsing

From Grammar to Parser, take II

Predictive Parsing

• Backtracking is bad!

• To eliminate backtracking, what must we do/be sure of for the grammar?

• no left recursion

• apply left factoring

• (frequently) when the grammar satisfies the above conditions: the current input symbol in conjunction with the current non-terminal uniquely determines the production that needs to be applied.

• Utilize transition diagrams:

For each non-terminal of the grammar do the following:

1. Create an initial and final state

2. If A → X1 X2 … Xn is a production, add a path from the initial state to the final state with edges X1, X2, … , Xn

• Once transition diagrams have been developed, apply a straightforward technique to turn them into code: one procedure per diagram, possibly recursive.

Transition Diagrams

• Unlike lexical equivalents, each edge is labeled with a token or a non-terminal

• Transition implies: if token, match input else call proc

• Recall earlier grammar and its associated transition diagrams

E → T E’    E’ → + T E’ | ε

T → F T’    T’ → * F T’ | ε

F → ( E ) | id

E:  0 -[T]-> 1 -[E’]-> 2

E’: 3 -[+]-> 4 -[T]-> 5 -[E’]-> 6,  3 -[ε]-> 6

T:  7 -[F]-> 8 -[T’]-> 9

T’: 10 -[*]-> 11 -[F]-> 12 -[T’]-> 13,  10 -[ε]-> 13

F:  14 -[(]-> 15 -[E]-> 16 -[)]-> 17,  14 -[id]-> 17

How are transition diagrams used ?

Are ε-moves a problem ?

Can we simplify transition diagrams ?

Why is simplification critical ?

How are Transition Diagrams Used ?

main() { TD_E(); }

TD_E() { TD_T(); TD_E’(); }

TD_E’() {
    token = get_token();
    if token = '+' then { TD_T(); TD_E’(); }
}

TD_T() { TD_F(); TD_T’(); }

TD_T’() {
    token = get_token();
    if token = '*' then { TD_F(); TD_T’(); }
}

TD_F() {
    token = get_token();
    if token = '(' then { TD_E(); match(')'); }
    else if token.value <> id then { error + EXIT }
    else ...
}

What happened to ε-moves?

… “else unget() and terminate”

NOTE: not all error conditions have been represented.
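The procedures above can be sketched as runnable Python. This is one possible rendering, not the deck's own code: the `Parser` class, `peek`, `match`, and `accepts` are names chosen for the sketch, and it looks at the next token without consuming it instead of calling `get_token()` followed by `unget()`. Taking an ε-move then simply means returning without consuming anything.

```python
# Grammar: E -> T E',  E' -> + T E' | ε,  T -> F T',  T' -> * F T' | ε,
#          F -> ( E ) | id

class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise SyntaxError(f"expected {expected!r}, got {self.peek()!r}")
        self.pos += 1

    def td_E(self):
        self.td_T()
        self.td_E_prime()

    def td_E_prime(self):
        if self.peek() == "+":
            self.match("+")
            self.td_T()
            self.td_E_prime()
        # else: ε-move, nothing is consumed

    def td_T(self):
        self.td_F()
        self.td_T_prime()

    def td_T_prime(self):
        if self.peek() == "*":
            self.match("*")
            self.td_F()
            self.td_T_prime()
        # else: ε-move, nothing is consumed

    def td_F(self):
        if self.peek() == "(":
            self.match("(")
            self.td_E()
            self.match(")")
        else:
            self.match("id")


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    parser = Parser(tokens)
    try:
        parser.td_E()
        parser.match("$")  # all input must be used up
        return True
    except SyntaxError:
        return False


print(accepts(["id", "+", "id", "*", "id"]))
```

Each non-terminal edge in a diagram becomes a procedure call and each token edge becomes a `match`, so no backtracking is needed.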

How can Transition Diagrams be Simplified ?

(1) Start from the diagram for E’:

E’: 3 -[+]-> 4 -[T]-> 5 -[E’]-> 6,  3 -[ε]-> 6

(2) The E’ edge is a tail-recursive call, so replace it with an ε-edge back to the start state:

E’: 3 -[+]-> 4 -[T]-> 5 -[ε]-> 3,  3 -[ε]-> 6

(3) Remove state 5 by routing the T edge straight back to state 3:

E’: 3 -[+]-> 4 -[T]-> 3,  3 -[ε]-> 6

(4) Recall the diagram for E, which calls E’:

E: 0 -[T]-> 1 -[E’]-> 2

(5) Substitute the simplified E’ diagram for the E’ edge in E:

E: 0 -[T]-> 3 -[+]-> 4 -[T]-> 3,  3 -[ε]-> 6

Additional Transition Diagram Simplifications

• Similar steps for T and T’

• Simplified Transition diagrams:

T:  7 -[F]-> 10 -[*]-> 7,  10 -[ε]-> 13

T’: 10 -[*]-> 11 -[F]-> 10,  10 -[ε]-> 13

F:  14 -[(]-> 15 -[E]-> 16 -[)]-> 17,  14 -[id]-> 17

Why is simplification important ?

How does code change?
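One way to see how the code changes: once the tail-recursive E’ and T’ edges are folded back into loops, the procedures for E’ and T’ disappear and E and T become simple `while` loops. The sketch below assumes that reading of the simplified diagrams; `parse_expression` and its helper names are invented for the illustration.

```python
def parse_expression(tokens):
    """Return True if tokens form a sentence of the expression grammar."""
    toks = list(tokens) + ["$"]  # "$" marks the end of input
    pos = 0

    def parse_F():
        nonlocal pos
        if toks[pos] == "(":
            pos += 1
            parse_E()
            if toks[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
        elif toks[pos] == "id":
            pos += 1
        else:
            raise SyntaxError(f"unexpected token {toks[pos]!r}")

    def parse_T():
        nonlocal pos
        parse_F()
        while toks[pos] == "*":  # loop replaces the recursive call to T'
            pos += 1
            parse_F()

    def parse_E():
        nonlocal pos
        parse_T()
        while toks[pos] == "+":  # loop replaces the recursive call to E'
            pos += 1
            parse_T()

    try:
        parse_E()
        return toks[pos] == "$"
    except SyntaxError:
        return False


print(parse_expression(["id", "*", "(", "id", "+", "id", ")"]))
```

Fewer procedures and fewer calls is the practical payoff of simplifying the diagrams first.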

Top-Down Parsing

From Grammar to Parser, take III

Motivating Table-Driven Parsing

1. Left to right scan input

2. Find leftmost derivation

Grammar: E → T E’

E’ → + T E’ | ε

T → id

Input : id + id $ ($ is the terminator)

Derivation: E

Processing Stack:

Non-Recursive / Table Driven

Input: a + b $ (string + terminator)

Stack: X Y Z $, with X on top (NT + T symbols of CFG; $ is the empty stack symbol)

Predictive Parsing Program: reads the input and the stack, consults the parsing table, and produces the output

Parsing Table M[A,a]: what actions the parser should take based on stack / input

General parser behavior: X : top of stack, a : current input

1. When X = a = $: halt, accept, success

2. When X = a ≠ $: POP X off stack, advance input, go to 1.

3. When X is a non-terminal, examine M[X,a]:

if it is an error, call recovery routine

if M[X,a] = {X → UVW}, POP X, PUSH W, V, U (so U ends up on top). DO NOT expend any input

Algorithm for Non-Recursive Parsing

Set ip to point to the first symbol of w$;   /* ip is the input pointer */
repeat
    let X be the top stack symbol and a the symbol pointed to by ip;
    if X is a terminal or $ then
        if X = a then
            pop X from the stack and advance ip
        else error()
    else /* X is a non-terminal */
        if M[X,a] = X → Y1 Y2 … Yk then begin
            pop X from the stack;
            push Yk, Yk-1, … , Y1 onto the stack, with Y1 on top;
            output the production X → Y1 Y2 … Yk   /* may also execute other code based on the production used */
        end
        else error()
until X = $ /* stack is empty */
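The algorithm above can be sketched in Python as follows. The parsing table here is written out by hand for the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id used throughout these slides; the dictionary layout, the empty tuple standing for ε, and the name `ll1_parse` are assumptions of this sketch.

```python
EPS = ()  # an empty right-hand side stands for ε

# M[X, a]: production to apply with non-terminal X on top and input symbol a.
TABLE = {
    ("E", "id"): ("T", "E'"),
    ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"),
    ("E'", ")"): EPS,
    ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"),
    ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS,
    ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPS,
    ("T'", "$"): EPS,
    ("F", "id"): ("id",),
    ("F", "("): ("(", "E", ")"),
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def ll1_parse(tokens, start="E"):
    """Return the productions used, in order, or raise SyntaxError."""
    buf = list(tokens) + ["$"]  # w$
    ip = 0                      # input pointer
    stack = ["$", start]        # top of stack is the end of the list
    output = []
    while True:
        X, a = stack[-1], buf[ip]
        if X not in NONTERMINALS:  # X is a terminal or $
            if X != a:
                raise SyntaxError(f"expected {X!r}, got {a!r}")
            if X == "$":
                return output      # X = a = $: accept
            stack.pop()
            ip += 1                # the only place input is expended
        else:
            rhs = TABLE.get((X, a))
            if rhs is None:
                raise SyntaxError(f"no table entry for M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))  # push Yk ... Y1, leaving Y1 on top
            output.append((X, rhs))


for lhs, rhs in ll1_parse(["id", "+", "id", "*", "id"]):
    print(lhs, "->", " ".join(rhs) or "ε")
```

The printed productions are exactly the steps of a leftmost derivation of the input, which is why the output of this parser can drive later phases such as tree building.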

Example

E → T E’    E’ → + T E’ | ε    T → F T’    T’ → * F T’ | ε    F → ( E ) | id

Our well-worn example !

Table M (rows: non-terminal, columns: input symbol)

Non-terminal  id         +            *            (          )         $
E             E → TE’                              E → TE’
E’                       E’ → +TE’                            E’ → ε    E’ → ε
T             T → FT’                              T → FT’
T’                       T’ → ε       T’ → *FT’               T’ → ε    T’ → ε
F             F → id                               F → (E)

Trace of Example

STACK        INPUT             OUTPUT
$E           id + id * id$
$E’T         id + id * id$     E → TE’
$E’T’F       id + id * id$     T → FT’
$E’T’id      id + id * id$     F → id
$E’T’        + id * id$
$E’          + id * id$        T’ → ε
$E’T+        + id * id$        E’ → +TE’
$E’T         id * id$
$E’T’F       id * id$          T → FT’
$E’T’id      id * id$          F → id
$E’T’        * id$
$E’T’F*      * id$             T’ → *FT’
$E’T’F       id$
$E’T’id      id$               F → id
$E’T’        $
$E’          $                 T’ → ε
$            $                 E’ → ε

Expend input: the input advances only when the terminal on top of the stack matches the current input symbol.

Leftmost Derivation for the Example

The leftmost derivation for the example is as follows:

E ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’

⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’

⇒ id + id * id E’ ⇒ id + id * id

What’s the Missing Puzzle Piece ?

Constructing the Parsing Table M !

1st : Calculate First & Follow for Grammar

2nd: Apply Construction Algorithm for Parsing Table ( We’ll see this shortly )

Basic Tools:

First: Let α be a string of grammar symbols. First(α) is the set that includes every terminal that appears leftmost in α or in any string originating from α. NOTE: If α ⇒* ε, then ε is in First(α).

Follow: Let A be a non-terminal. Follow(A) is the set of terminals a that can appear directly to the right of A in some sentential form. (S ⇒* αAaβ, for some α and β). NOTE: If S ⇒* αA, then $ is in Follow(A).

Motivation Behind First & Follow

First:

Is used to help find the appropriate production to apply given the top-of-the-stack non-terminal and the current input symbol.

Example: If A → α, and a is in First(α), then when a = input, replace A with α (in the stack).

( a is one of the first symbols of α, so when A is on the stack and a is input, POP A and PUSH α. )

Follow:

Is used when First has a conflict, to resolve choices, or when First gives no suggestion. When α → ε or α ⇒* ε, then what follows A dictates the next choice to be made.

Example: If A → α, and b is in Follow(A), then when α ⇒* ε and b is an input character, we expand A with α, which will eventually expand to ε, of which b follows!

( α ⇒* ε : i.e., First(α) contains ε. )

An example.

S → a B C d    B → C B | ε | S a    C → b

STACK        INPUT             OUTPUT
$S           abbd$

Computing First(X) : All Grammar Symbols

1. If X is a terminal, First(X) = {X}

2. If X → ε is a production rule, add ε to First(X)

3. If X is a non-terminal, and X → Y1 Y2 … Yk is a production rule

Place First(Y1) in First(X)

if Y1 ⇒* ε, Place First(Y2) in First(X)

if Y2 ⇒* ε, Place First(Y3) in First(X)

…

if Yk-1 ⇒* ε, Place First(Yk) in First(X)

NOTE: As soon as some Yi does not derive ε, Stop.

Repeat above steps until no more elements are added to any First( ) set.

Checking “Yj ⇒* ε ?” essentially amounts to checking whether ε belongs to First(Yj)
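The three rules, repeated until nothing changes, can be sketched as a fixed-point loop in Python. The grammar encoding (a dictionary from non-terminal to a list of right-hand-side tuples, with the empty tuple for ε) and the name `compute_first` are assumptions of this sketch. It also adds ε to First(X) when every Yi can derive ε, which is the usual completion of rule 3.

```python
EPS = "ε"


def compute_first(grammar, terminals):
    """grammar: {non-terminal: [tuple of symbols, ...]}; () encodes X -> ε."""
    first = {t: {t} for t in terminals}  # rule 1
    for nt in grammar:
        first[nt] = set()
    changed = True
    while changed:  # repeat until no First set grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                all_nullable = True
                for sym in rhs:  # rule 3: walk Y1, Y2, ... Yk
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        all_nullable = False
                        break  # stop as soon as Yi cannot derive ε
                if all_nullable:
                    first[nt].add(EPS)  # rule 2, or every Yi derives ε
                if len(first[nt]) != before:
                    changed = True
    return first


# The expression grammar used in these slides.
expr = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
first = compute_first(expr, {"+", "*", "(", ")", "id"})
for nt in expr:
    print(nt, sorted(first[nt]))
```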

Computing First(X) : All Grammar Symbols - continued

Informally, suppose we want to compute

First(X1 X2 … Xn ) = First(X1) “+”

First(X2) if ε is in First(X1) “+”

First(X3) if ε is in First(X2) “+”

…

First(Xn) if ε is in First(Xn-1)

Note 1: Only add ε to First(X1 X2 … Xn) if ε is in First(Xi) for all i

Note 2: For First(X1), if X1 → Z1 Z2 … Zm , then we need to compute First(Z1 Z2 … Zm) !
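A small Python sketch of First for a string of symbols, assuming the First sets of the individual symbols are already known. The `FIRST` dictionary below hard-codes the sets this deck gives for the expression grammar; `first_of_string` is a name invented for the sketch.

```python
EPS = "ε"

# First sets of single symbols for the expression grammar (terminals map to themselves).
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
    "+": {"+"}, "*": {"*"}, "(": {"("}, ")": {")"}, "id": {"id"},
}


def first_of_string(symbols, first=FIRST):
    """Compute First(X1 X2 ... Xn) from the First sets of each Xi."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result  # Xi cannot vanish, so later symbols do not contribute
    result.add(EPS)        # Note 1: ε only if every Xi can derive ε
    return result


print(sorted(first_of_string(["T", "E'"])))
print(sorted(first_of_string(["E'", "T'"])))
```

The same function applied to the right-hand side of each production is what the table-construction step needs.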

Example 1

Given the production rules:

S → i E t S S’ | a

S’ → e S | ε

E → b

Verify that

First(S) = { i, a }

First(S’) = { e, ε }

First(E) = { b }

Example 2

Computing First for: E → T E’    E’ → + T E’ | ε    T → F T’    T’ → * F T’ | ε    F → ( E ) | id

First(E) = First(TE’) = First(T)

(in general First(T) “+” First(E’), but Not First(E’) here, since T does not derive ε)

First(T) = First(FT’) = First(F)

(in general First(F) “+” First(T’), but Not First(T’) here, since F does not derive ε)

First(F) = First((E)) “+” First(id) = “(” and “id”

Overall: First(E) = First(T) = First(F) = { ( , id }

First(E’) = { + , ε }    First(T’) = { * , ε }