1
Top-Down Parser
A top-down parser starts with the root of the parse tree.
The root node is labeled with the goal symbol of the grammar.
3
Top-Down Parsing Algorithm
Construct the root node of the parse tree.
Repeat until the fringe of the parse tree matches the input string:
5
Top-Down Parsing
1. At a node labeled A, select a production with A on its lhs and, for each symbol on its rhs, construct the appropriate child.
7
Top-Down Parsing
2. When a terminal symbol is added to the fringe and it does not match the input string, backtrack.
3. Find the next node to be expanded.
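The three steps can be sketched as a small backtracking recognizer. This is only an illustration under stated assumptions: the toy grammar `S -> a S b | c` and the names `GRAMMAR`, `derive` and `accepts` are hypothetical and do not come from the slides. A toy grammar is used because every production starts with a terminal, so the search is guaranteed to terminate.

```python
# Hypothetical toy grammar (not from the slides): S -> a S b | c
GRAMMAR = {"S": [["a", "S", "b"], ["c"]]}


def derive(fringe, tokens, pos):
    """Yield every input position reachable after matching `fringe` from `pos`."""
    if not fringe:
        yield pos
        return
    head, rest = fringe[0], fringe[1:]
    if head in GRAMMAR:
        # Step 1: at a node labeled `head`, try each production in turn.
        for rhs in GRAMMAR[head]:
            # Step 3: the children of `head` become the next nodes to expand.
            yield from derive(rhs + rest, tokens, pos)
    elif pos < len(tokens) and tokens[pos] == head:
        # The terminal matches the input: advance and keep going.
        yield from derive(rest, tokens, pos + 1)
    # Step 2: on a mismatch nothing is yielded, so the caller moves on to
    # its next production. That fall-through is the backtracking.


def accepts(tokens, goal="S"):
    """True if some sequence of choices consumes the whole input."""
    return any(end == len(tokens) for end in derive([goal], tokens, 0))
```

For example, `accepts(list("aacbb"))` succeeds only after the recognizer has tried and abandoned the production `S -> a S b` at the innermost level.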
9
Top-Down Parsing
The key is picking the right production in step 1.
That choice should be guided by the input string.
11
Expression Grammar
 1  Goal   → expr
 2  expr   → expr + term
 3         | expr - term
 4         | term
 5  term   → term * factor
 6         | term ∕ factor
 7         | factor
 8  factor → number
 9         | id
10         | ( expr )
12
Top-Down Parsing
Let’s try parsing
x – 2 * y
13
P  Sentential Form    Input
-  Goal               x – 2 * y
1  expr               x – 2 * y
2  expr + term        x – 2 * y
4  term + term        x – 2 * y
7  factor + term      x – 2 * y
9  <id,x> + term      x – 2 * y
9  <id,x> + term      x – 2 * y
14
This worked well except that “–” does not match “+”
15
The parser must backtrack to the step where it expanded expr with production 2 (expr → expr + term) and try a different alternative.
16
This time the “–” and “–” matched
P  Sentential Form    Input
-  Goal               x – 2 * y
1  expr               x – 2 * y
3  expr – term        x – 2 * y
4  term – term        x – 2 * y
7  factor – term      x – 2 * y
9  <id,x> – term      x – 2 * y
9  <id,x> – term      x – 2 * y
17
We can advance past “–” to look at “2”
P  Sentential Form    Input
-  Goal               x – 2 * y
1  expr               x – 2 * y
3  expr – term        x – 2 * y
4  term – term        x – 2 * y
7  factor – term      x – 2 * y
9  <id,x> – term      x – 2 * y
9  <id,x> – term      x – 2 * y
-  <id,x> – term      x – 2 * y
18
Now, we need to expand “term”
19
P  Sentential Form       Input
-  <id,x> – term         x – 2 * y
7  <id,x> – factor       x – 2 * y
8  <id,x> – <num,2>      x – 2 * y
-  <id,x> – <num,2>      x – 2 * y
“2” matches “2”.
We have more input but no non-terminals left to expand.
20
The expansion terminated too soon
Need to backtrack
21
P  Sentential Form                 Input
-  <id,x> – term                   x – 2 * y
5  <id,x> – term * factor          x – 2 * y
7  <id,x> – factor * factor        x – 2 * y
8  <id,x> – <num,2> * factor       x – 2 * y
-  <id,x> – <num,2> * factor       x – 2 * y
-  <id,x> – <num,2> * factor       x – 2 * y
9  <id,x> – <num,2> * <id,y>       x – 2 * y
-  <id,x> – <num,2> * <id,y>       x – 2 * y
Success! We matched and consumed all the input.
22
Another Possible Parse
P  Sentential Form                        Input
-  Goal                                   x – 2 * y
1  expr                                   x – 2 * y
2  expr + term                            x – 2 * y
2  expr + term + term                     x – 2 * y
2  expr + term + term + term              x – 2 * y
2  expr + term + term + term + ....       x – 2 * y
Consuming no input!!
Wrong choice of expansion leads to non-termination.
Parser must make the right choice.
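The runaway expansion is easy to reproduce. The following sketch (the function name `expand_leftmost_with_rule_2` is made up for this illustration) always rewrites the leftmost `expr` with rule 2 and records the sentential forms. No terminal ever reaches the left end of the fringe, so nothing can be compared against the input.

```python
def expand_leftmost_with_rule_2(steps):
    """Repeatedly rewrite the leftmost expr using rule 2, expr -> expr + term."""
    form = ["expr"]
    history = [" ".join(form)]
    for _ in range(steps):
        # The leftmost symbol is always `expr`, so no terminal is ever exposed
        # at the front of the fringe and no input token is consumed.
        form = ["expr", "+", "term"] + form[1:]
        history.append(" ".join(form))
    return history


for line in expand_leftmost_with_rule_2(3):
    print(line)
```

The loop is bounded here only by the `steps` argument; a parser that kept choosing rule 2 would have no such bound.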
23
Left Recursion
Top-down parsers cannot handle left-recursive grammars.
24
Left Recursion
Formally,
A grammar is left recursive if ∃ A ∈ NT such that ∃ a derivation A ⇒+ A α, for some string α ∈ (NT ∪ T)*
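The definition translates into a reachability check. The sketch below is a minimal version that assumes the grammar has no ε-productions, so only the first symbol of each right-hand side matters; the dictionary encoding and the name `left_recursive_nonterminals` are choices made for this example, and `∕` is written as `/`.

```python
# The expression grammar from the slides, rules 1-10.
GRAMMAR = {
    "Goal":   [["expr"]],
    "expr":   [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "term":   [["term", "*", "factor"], ["term", "/", "factor"], ["factor"]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}


def left_recursive_nonterminals(grammar):
    """Return the non-terminals A for which A =>+ A alpha.

    Assumes no epsilon-productions, so only the first rhs symbol matters.
    """
    # leftmost[A]: non-terminals that can appear first after one step from A
    leftmost = {
        a: {rhs[0] for rhs in rules if rhs and rhs[0] in grammar}
        for a, rules in grammar.items()
    }
    result = set()
    for start in grammar:
        seen, stack = set(), list(leftmost[start])
        while stack:
            nt = stack.pop()
            if nt in seen:
                continue
            seen.add(nt)
            stack.extend(leftmost[nt])
        if start in seen:  # start can reappear at the left end of its own derivation
            result.add(start)
    return result
```

Because the check follows chains of leftmost symbols, it would also flag indirect left recursion, not just rules of the form A → A α.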
25
Left Recursion
Our expression grammar is left recursive.
This can lead to non-termination in a top-down parser.
27
Left Recursion
Non-termination is bad in any part of a compiler!
28
Left Recursion
For a top-down parser, any recursion must be a right recursion.
We would like to convert left recursion to right recursion.
30
Eliminating Left Recursion
To remove left recursion, we transform the grammar.
31
Eliminating Left Recursion
Consider a grammar fragment:
A → A α | β
where neither α nor β starts with A.
32
Eliminating Left Recursion
We can rewrite this as:
A  → β A'
A' → α A' | ε
where A' is a new non-terminal.
34
Eliminating Left Recursion
A  → β A'
A' → α A'
   | ε
This accepts the same language but uses only right recursion.
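The rewrite is mechanical enough to express as a function. This is a sketch for immediate left recursion only, generalized to several α and β alternatives; the list-of-symbols encoding, the `EPSILON` marker and the function name are assumptions of this example rather than anything defined on the slides.

```python
EPSILON = "ε"


def eliminate_immediate_left_recursion(nt, alternatives):
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A' and A' -> a A' | ε."""
    # Alternatives of the form A alpha: keep alpha only.
    alphas = [rhs[1:] for rhs in alternatives if rhs and rhs[0] == nt]
    # Alternatives that do not start with A: these are the betas.
    betas = [rhs for rhs in alternatives if not rhs or rhs[0] != nt]
    if not alphas:
        return {nt: alternatives}  # not left recursive, nothing to do
    new_nt = nt + "'"  # assumed not to clash with an existing non-terminal
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[EPSILON]],
    }


expr_rules = [["expr", "+", "term"], ["expr", "-", "term"], ["term"]]
print(eliminate_immediate_left_recursion("expr", expr_rules))
```

Applied to the three `expr` alternatives of the expression grammar, this yields `expr → term expr'` together with the three alternatives for `expr'`.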
35
Eliminating Left Recursion
The expression grammar we have been using contains two cases of left recursion.
36
Eliminating Left Recursion
expr → expr + term | expr – term | term
term → term * factor | term ∕ factor | factor
37
Eliminating Left Recursion
Applying the transformation yields
expr  → term expr'
expr' → + term expr'
      | – term expr'
      | ε
38
Eliminating Left Recursion
Applying the transformation yields
term  → factor term'
term' → * factor term'
      | ∕ factor term'
      | ε
39
Eliminating Left Recursion
These fragments use only right recursion.
They retain the original left associativity.
A top-down parser will terminate using them.
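One way to see how left associativity can survive the transformation is to pass the value built so far into the routine for expr'. The sketch below is an illustration under simplifying assumptions: `term` is reduced to a single number, tokens are a plain Python list, and the names `evaluate` and `expr_prime` are invented for this example.

```python
def evaluate(tokens):
    """Evaluate tokens such as [8, '-', 2, '-', 3] using
    expr -> term expr' and expr' -> + term expr' | - term expr' | ε,
    with term simplified to a single number."""
    pos = 0

    def term():
        nonlocal pos
        value = tokens[pos]
        pos += 1
        return value

    def expr_prime(left):
        nonlocal pos
        if pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            # Combine with everything to the left before recursing, so the
            # operators group as ((8 - 2) - 3) rather than (8 - (2 - 3)).
            combined = left + right if op == "+" else left - right
            return expr_prime(combined)
        return left  # ε-production: nothing more to combine

    return expr_prime(term())


print(evaluate([8, "-", 2, "-", 3]))
```

A right-associative reading of the same input would give 9 instead of 3, so the result distinguishes the two groupings.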
42
 1  Goal   → expr
 2  expr   → term expr'
 3  expr'  → + term expr'
 4         | – term expr'
 5         | ε
 6  term   → factor term'
 7  term'  → * factor term'
 8         | ∕ factor term'
 9         | ε
10  factor → number
11         | id
12         | ( expr )
43
Predictive Parsing
If a top-down parser picks the wrong production, it may need to backtrack.
The alternative is to look ahead in the input and use context to pick correctly.
45
Predictive Parsing
How much lookahead is needed?
In general, an arbitrarily large amount.
47
Predictive Parsing
Fortunately, large classes of CFGs can be parsed with limited lookahead.
Most programming language constructs fall in those subclasses.
49
Predictive Parsing
Basic Idea:
Given A → α | β, the parser should be able to choose between α and β.
50
Predictive Parsing
FIRST Sets:
For some rhs α ∈ G, define FIRST(α) as the set of tokens that appear as the first symbol in some string that derives from α.
51
Predictive Parsing
That is, x ∈ FIRST(α) iff α ⇒* x γ, for some γ.
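FIRST sets are usually computed by iterating to a fixed point. The sketch below does this for the right-recursive expression grammar; the dictionary encoding, the `EPS` marker and the function names are assumptions of this example, and `∕` is written as `/`. Following the common convention, ε is included in FIRST of any string that can derive the empty string.

```python
EPS = "ε"
GRAMMAR = {
    "Goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "term", "expr'"], ["-", "term", "expr'"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "factor", "term'"], ["/", "factor", "term'"], [EPS]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}


def first_of_string(symbols, first):
    """FIRST of a sequence of symbols, given FIRST of each non-terminal."""
    result = set()
    for sym in symbols:
        if sym == EPS:
            continue
        # A terminal's FIRST set is just the terminal itself.
        sym_first = first[sym] if sym in first else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result  # this symbol cannot vanish, so stop here
    result.add(EPS)  # every symbol in the sequence can derive ε
    return result


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                new = first_of_string(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


FIRST = compute_first(GRAMMAR)
print(sorted(FIRST["expr'"]))
```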
52
Predictive Parsing
The LL(1) Property
If A → α and A → β both appear in the grammar, we would like FIRST(α) ∩ FIRST(β) = ∅
53
Predictive Parsing
Predictive parsers accept LL(k) grammars:
L: “left-to-right” scan of input
L: left-most derivation
k: “k” tokens of lookahead
54
Predictive Parsing
The LL(1) Property
FIRST(α) ∩ FIRST(β) = ∅ allows the parser to make a correct choice with a lookahead of exactly one symbol!
55
Predictive Parsing
What about ε-productions?
They complicate the definition of LL(1).
57
Predictive Parsing
If A → α and A → β and ε ∈ FIRST(α), then we need to ensure that FIRST(β) is disjoint from FOLLOW(α), too.
58
Predictive Parsing
FOLLOW(α) is the set of all words in the grammar that can legally appear after an α.
59
Predictive Parsing
For a non-terminal X,
FOLLOW(X) is the set of symbols that might follow the derivation of X.
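FOLLOW sets for non-terminals can be computed with the same fixed-point style as FIRST sets. The sketch below is self-contained and works on the right-recursive expression grammar; the `eof` end-of-input marker, the dictionary encoding and all function names are assumptions of this example, and `∕` is written as `/`.

```python
EPS, EOF = "ε", "eof"
GRAMMAR = {
    "Goal":   [["expr"]],
    "expr":   [["term", "expr'"]],
    "expr'":  [["+", "term", "expr'"], ["-", "term", "expr'"], [EPS]],
    "term":   [["factor", "term'"]],
    "term'":  [["*", "factor", "term'"], ["/", "factor", "term'"], [EPS]],
    "factor": [["number"], ["id"], ["(", "expr", ")"]],
}


def first_of(symbols, first):
    """FIRST of a symbol sequence; an empty sequence yields {ε}."""
    out = set()
    for s in symbols:
        f = first.get(s, {s}) if s != EPS else {EPS}
        out |= f - {EPS}
        if EPS not in f:
            return out
    return out | {EPS}


def size(sets):
    return sum(len(s) for s in sets.values())


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    while True:
        before = size(first)
        for nt, rules in grammar.items():
            for rhs in rules:
                first[nt] |= first_of(rhs, first)
        if size(first) == before:  # nothing grew: fixed point reached
            return first


def compute_follow(grammar, goal):
    first = compute_first(grammar)
    follow = {nt: set() for nt in grammar}
    follow[goal].add(EOF)  # end of input can follow the goal symbol
    while True:
        before = size(follow)
        for lhs, rules in grammar.items():
            for rhs in rules:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue  # FOLLOW is tracked for non-terminals only
                    tail = first_of(rhs[i + 1:], first)
                    follow[sym] |= tail - {EPS}
                    if EPS in tail:  # everything after sym can vanish
                        follow[sym] |= follow[lhs]
        if size(follow) == before:
            return follow


FOLLOW = compute_follow(GRAMMAR, "Goal")
print(sorted(FOLLOW["expr'"]))
```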
60
Predictive Parsing
[Figure: FIRST and FOLLOW for a symbol X]
61
Predictive Parsing
Define FIRST+(α) as
FIRST(α) ∪ FOLLOW(α), if ε ∈ FIRST(α)
FIRST(α), otherwise
62
Predictive Parsing
Then a grammar is LL(1) iff A → α and A → β implies
FIRST+(α) ∩ FIRST+(β) = ∅
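The condition can be checked directly once FIRST and FOLLOW are known. In the sketch below the sets for the right-recursive expression grammar are written out by hand rather than computed, `eof` is an assumed end-of-input marker, and FOLLOW is taken for the production's left-hand side, which is the usual practical reading of the definition. Only non-terminals with more than one alternative are listed, since a single alternative cannot conflict with anything.

```python
from itertools import combinations

EPS, EOF = "ε", "eof"

# FIRST of each alternative, in grammar order (worked out by hand).
ALTERNATIVES = {
    "expr'":  [{"+"}, {"-"}, {EPS}],
    "term'":  [{"*"}, {"/"}, {EPS}],
    "factor": [{"number"}, {"id"}, {"("}],
}
# FOLLOW of each left-hand side (worked out by hand).
FOLLOW = {
    "expr'":  {EOF, ")"},
    "term'":  {"+", "-", EOF, ")"},
    "factor": {"*", "/", "+", "-", EOF, ")"},
}


def first_plus(first_set, follow_set):
    """FIRST+ adds FOLLOW only when the alternative can derive ε."""
    if EPS in first_set:
        return first_set | follow_set
    return first_set


def is_ll1(alternatives, follow):
    """True if the FIRST+ sets of every pair of alternatives are disjoint."""
    for nt, firsts in alternatives.items():
        plus = [first_plus(f, follow[nt]) for f in firsts]
        for a, b in combinations(plus, 2):
            if a & b:
                return False
    return True


print(is_ll1(ALTERNATIVES, FOLLOW))
```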
63
Predictive Parsing
Given a grammar that has the LL(1) property
• we can write a simple routine to recognize each lhs
• code is simple and fast
66
Predictive Parsing
Consider A → β1 | β2 | β3, which satisfies the LL(1) property: FIRST+(βi) ∩ FIRST+(βj) = ∅ for every pair i ≠ j.
67
/* find an A */
if (token ∈ FIRST(β1))
    find a β1 and return true
else if (token ∈ FIRST(β2))
    find a β2 and return true
else if (token ∈ FIRST(β3))
    find a β3 and return true
else
    error and return false
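A concrete instance of this dispatch, for a made-up non-terminal with three alternatives, might look as follows. The productions `A -> x A | y | ( A )` and the name `find_A` are hypothetical, and the routine returns the position after the recognized A (or `None` on error) instead of true or false so that callers know how much input was consumed.

```python
# Hypothetical productions for illustration: A -> x A | y | ( A )
def find_A(tokens, pos=0):
    """Return the position just past a recognized A, or None on error."""
    token = tokens[pos] if pos < len(tokens) else None
    if token == "x":  # token is in FIRST(x A)
        return find_A(tokens, pos + 1)
    elif token == "y":  # token is in FIRST(y)
        return pos + 1
    elif token == "(":  # token is in FIRST(( A ))
        end = find_A(tokens, pos + 1)
        if end is not None and end < len(tokens) and tokens[end] == ")":
            return end + 1
        return None
    else:
        return None  # error: no alternative can start with this token


print(find_A(list("xxy")))
```

Each branch is chosen by looking at one token only, and no branch is ever retried, which is what the disjoint FIRST sets make possible.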
72
Predictive Parsing
Grammars with the LL(1) property are called predictive grammars because the parser can “predict” the correct expansion at each point in the parse.
73
Predictive Parsing
Parsers that capitalize on the LL(1) property are called predictive parsers.
One kind of predictive parser is the recursive descent parser.
75
Recursive Descent Parsing
 1  Goal   → expr
 2  expr   → term expr'
 3  expr'  → + term expr'
 4         | - term expr'
 5         | ε
 6  term   → factor term'
 7  term'  → * factor term'
 8         | ∕ factor term'
 9         | ε
10  factor → number
11         | id
12         | ( expr )
76
Recursive Descent Parsing
This leads to a parser with six mutually recursive routines:
Goal, Expr, EPrime, Term, TPrime, Factor
77
Recursive Descent Parsing
Each recognizes one non-terminal (NT) or terminal (T).
78
Recursive Descent Parsing
The term descent refers to the direction in which the parse tree is built.
Here are some of these routines written as functions
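As a rough illustration of the shape such routines take, here is a minimal recognizer with one method per non-terminal of the right-recursive grammar. It is a sketch under stated assumptions: the input is a list of token kinds (`"id"`, `"number"`, operators and parentheses) that a scanner is assumed to have produced, `"eof"` is an invented end marker, and the class and method names are chosen for this example. It only accepts or rejects; it does not build a parse tree.

```python
class Parser:
    """Recognizer with one routine per non-terminal of the grammar."""

    def __init__(self, tokens):
        # Token kinds are assumed to come from a scanner; "eof" marks the end.
        self.tokens = list(tokens) + ["eof"]
        self.pos = 0

    @property
    def token(self):
        return self.tokens[self.pos]

    def advance(self):
        self.pos += 1

    def goal(self):  # Goal -> expr
        return self.expr() and self.token == "eof"

    def expr(self):  # expr -> term expr'
        return self.term() and self.eprime()

    def eprime(self):  # expr' -> + term expr' | - term expr' | ε
        if self.token in ("+", "-"):
            self.advance()
            return self.term() and self.eprime()
        # ε is chosen only when the lookahead is in FOLLOW(expr').
        return self.token in ("eof", ")")

    def term(self):  # term -> factor term'
        return self.factor() and self.tprime()

    def tprime(self):  # term' -> * factor term' | / factor term' | ε
        if self.token in ("*", "/"):
            self.advance()
            return self.factor() and self.tprime()
        # ε is chosen only when the lookahead is in FOLLOW(term').
        return self.token in ("+", "-", "eof", ")")

    def factor(self):  # factor -> number | id | ( expr )
        if self.token in ("number", "id"):
            self.advance()
            return True
        if self.token == "(":
            self.advance()
            if self.expr() and self.token == ")":
                self.advance()
                return True
        return False


# x – 2 * y as token kinds
print(Parser(["id", "-", "number", "*", "id"]).goal())
```

Every decision is made from the single current token, so the recognizer never backtracks on the input that earlier required two rounds of backtracking.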