Date post: | 27-Dec-2015 |
Category: |
Documents |
Upload: | cynthia-cain |
1
LR Parsing Techniques
• Bottom-Up Parsing
  – LR: a special form of BU parser
• LR Parsing as Handle Pruning
• Shift-Reduce Parser (LR implementation)
• LR(k) Parsing Model
  – k lookaheads to determine the next action
• Parsing Table Construction: SLR, LR, LALR
2
Bottom-Up Parsing
• A bottom-up parser attempts to construct a parse tree for an input string beginning at the leaves (the bottom) and working up towards the root (the top).
3
Bottom-Up Parsing: Ex1
BU parsing: construct a parse tree from the leaves to the root, here by left-to-right reduction.
G: S → a A B e
   A → A b c | b
   B → d
Input: abbcde
[Figure: the parse tree for abbcde grown in four steps, reducing A → b, then A → A b c, then B → d, then S → a A B e.]
4
Bottom-Up Parsing: Ex2
BU parsing: construct a parse tree from the leaves to the root, here by random reduction.
G: S → a A B e
   A → A b c | b
   B → d
Input: abbcde
[Figure: the same parse tree for abbcde, built with the reductions applied in an arbitrary order rather than strictly left to right.]
5
LR Parsing: BU + Left-to-Right
• Many ways to construct a parse tree bottom-up
  – Ex1 & Ex2
  – Prefer a simpler form of parser: left-to-right scanning
• Scanning strictly left-to-right yields a rightmost derivation in reverse (thus the name LR parser). Why rightmost? (See Ex1.)
  – Never consider the terminals to the right while reducing the string in (Σ|N)* on the left
  – Reduce the left (Σ|N)* (terminals or nonterminals) as much as possible, until no further reduction applies
  – Shift when no further reduction applies
  – Reversing the sequence of reductions gives a rightmost derivation
• LR parser
  – A special form of BU parser
  – A parser with a simpler form: left-to-right scan
6
LR Parsing: BU + Left-to-Right
LR parsing: construct a parse tree from the leaves to the root, scanning left-to-right (resulting in a rightmost derivation in reverse).
G: S → a A B e
   A → A b c | b
   B → d
Input: abbcde
abbcde <=rm aAbcde <=rm aAde <=rm aABe <=rm S
[Figure: the parse tree for abbcde after each of the four reductions.]
8
Rightmost Derivation in Reverse
[Figure: two parse trees for id1 + id2 * id3 with interior nodes labeled E; the reductions are numbered 1 to 5 in the order they are performed.]
9
LR Parsing
The L stands for scanning the input from left to right
The R stands for constructing a rightmost derivation in reverse
10
LR Parsing
• LR parsing =/= leftmost reduction
  – The first reducible substring does not always result in a successful parse
  – Handle(s): the reducible substrings that do lead to S
• Top-down: expansion and matching
• Bottom-up: shift/reduce
  – Locating the next "handle" to reduce [how?]
  – Handle pruning: hide the details below the reduced (Σ|N)*
[Figure: partial parse trees for abbcde.]
11
Handles
• NOT every (leftmost) reduction (β => A) leads to the start symbol S; only handles do:
  αβw <=rm αAw <=rm … <=rm S
• A handle of a right-sentential form γ consists of
  – a production A → β
  – a position of γ where β can be replaced by A to produce the previous right-sentential form in a rightmost derivation of γ
• Example: abbcde <=rm aAbcde <=rm aAde <=rm aABe <=rm S
  Right-sentential form   Handle
  abbcde                  A → b
  aAbcde                  A → A b c
  aAde                    B → d
  aABe                    S → a A B e
12
Handles
• If S =>*rm αAw =>rm αβw, then A → β in the position following α is a handle of αβw. (The string w contains only terminal symbols.)
• We say "a handle" rather than "the handle" since the grammar may be ambiguous. But if the grammar is unambiguous, then every right-sentential form has exactly one handle.
15
LR Parsing as Handle Pruning
αβw <=rm αAw <=rm … <=rm S
[Figure: parse tree with root S and interior node A above the handle.]
• The string to the right of the handle contains only terminals (A is the rightmost nonterminal)
• A is the leftmost complete interior node with all its children in the tree
• Pruning: find a string that is reducible toward S, hide its details by reduction, and proceed with the new sentential form
• Never consider the terminals to the right while reducing grammar symbols on the left
17
LR Parsing as Handle Pruning (1st reduction sequence)
• A rightmost derivation in reverse can be obtained by handle pruning.
• Let G = E → E+E | E*E | (E) | id (ambiguous!)
  Right-sentential form   Handle   Reducing production
  id1+id2*id3             id1      E → id
  E+id2*id3               id2      E → id
  E+E*id3                 id3      E → id
  E+E*E                   E*E      E → E*E
  E+E                     E+E      E → E+E
  E
18
LR Parsing as Handle Pruning (2nd reduction sequence)
• A rightmost derivation in reverse can be obtained by handle pruning.
• Let G = E → E+E | E*E | (E) | id (ambiguous!)
  Right-sentential form   Handle   Reducing production
  id1+id2*id3             id1      E → id
  E+id2*id3               id2      E → id
  E+E*id3                 E+E      E → E+E
  E*id3                   id3      E → id
  E*E                     E*E      E → E*E
  E
20
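Shift-Reduce Parsing
[Figure: shift-reduce parser model. A parsing program, driven by a parsing table, reads the input, keeps a stack with the handle on top, and produces output. Its two moves are shift and reduce (β => A), following αβw <=rm αAw <=rm … <=rm S.]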
21
Stack Implementation of Shift-Reduce Parsers
• A convenient way to implement a shift-reduce parser is to use a stack to hold grammar symbols and an input buffer to hold the string to be parsed.
• In other words: a push-down machine with a tape
• The parser operates by shifting zero or more symbols onto the stack until a handle β is on top of the stack. The parser then reduces β to the left side of the appropriate production.
• This procedure repeats until the stack contains the start symbol and the input is empty.
22
Stack Operations
Shift: shift the next input symbol onto the top of the stack
Reduce: replace the handle at the top of the stack with the corresponding nonterminal
Accept: announce successful completion of the parsing
Error: call an error recovery routine
24
An Example
Action  Stack    Input
S       $        abbcde$
S       $a       bbcde$
R       $ab      bcde$
S       $aA      bcde$
S       $aAb     cde$
R       $aAbc    de$
S       $aA      de$
R       $aAd     e$
S       $aAB     e$
R       $aABe    $
A       $S       $
(S = shift, R = reduce, A = accept)
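The stack operations in this trace can be replayed mechanically. The sketch below is only an illustration: the function name `replay` and the hand-written `SCRIPT` of actions are assumptions made for this example, and the script is supplied by hand because choosing the actions automatically needs the parsing table built later in the deck.

```python
# Grammar from the slides: S -> a A B e, A -> A b c | b, B -> d
def replay(tokens, script):
    """Apply a scripted sequence of shift/reduce actions and record each configuration."""
    stack, rest, trace = [], list(tokens), []
    for action in script:
        trace.append(("".join(stack), "".join(rest), action))
        if action == "shift":
            stack.append(rest.pop(0))      # move the next input symbol onto the stack
        elif action == "accept":
            break
        else:                              # a reduction, given as (lhs, rhs)
            lhs, rhs = action
            assert stack[-len(rhs):] == list(rhs), "handle is not on top of the stack"
            del stack[-len(rhs):]          # pop the handle
            stack.append(lhs)              # push the nonterminal it reduces to
    return trace, stack, rest

# Hypothetical action script matching the S/R/A column of the example.
SCRIPT = ["shift", "shift", ("A", "b"), "shift", "shift", ("A", "Abc"),
          "shift", ("B", "d"), "shift", ("S", "aABe"), "accept"]

trace, stack, rest = replay("abbcde", SCRIPT)
for st, inp, act in trace:
    print(f"${st:<6} {inp + '$':>8}  {act}")
```

Each printed line is one configuration (stack, remaining input, action), in the same order as the rows of the example.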
25
Configurations of shift-reduce parser on input id1+id2*id3
Step  Stack      Input          Action
1     $          id1+id2*id3$   shift
2     $id1       +id2*id3$      reduce by E → id
3     $E         +id2*id3$      shift
4     $E+        id2*id3$       shift
5     $E+id2     *id3$          reduce by E → id
6     $E+E       *id3$          shift
7     $E+E*      id3$           shift
8     $E+E*id3   $              reduce by E → id
9     $E+E*E     $              reduce by E → E*E
10    $E+E       $              reduce by E → E+E
11    $E         $              accept
*Note: The grammar is ambiguous. Therefore, there is another possible reduction sequence.
26
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
How to represent parsing states so we can tell the right parsing actions to take?
28
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S0: S' → . S; S → . a A B e
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
29
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S1: S → a . A B e (shift a); A → . A b c; A → . b
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
30
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S2: A → b . (shift b, then reduce A → b)
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
31
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S3: S → a A . B e; B → . d; A → A . b c
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
32
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S4: A → A b . c (shift b)
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
33
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S5: A → A b c . (shift c, reduce A → A b c)
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
34
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S6: S → a A . B e; B → . d
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
35
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S7: B → d . (shift d, reduce B → d)
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
36
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S8: S → a A B . e
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
37
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S9: S → a A B e . (shift e, reduce S → a A B e)
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
38
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S10: S' → S . (S reduced)
[Figure: parse tree for abbcde$ with the positions of parsing states 0 to 10 marked.]
39
LR Parsing States
G: S → a A B e, A → A b c | b, B → d; input: abbcde
S0: S' → . S; S → . a A B e
S1: S → a . A B e (shift a); A → . A b c | . b (closed, no further expansion)
S2: A → b . (shift b, reduce A → b)
S3: S → a A . B e; B → . d; A → A . b c
S4: A → A b . c (shift b)
S5: A → A b c . (shift c, reduce A → A b c)
S6: S → a A . B e; B → . d
S7: B → d . (shift d, reduce B → d)
S8: S → a A B . e
S9: S → a A B e . (shift e, reduce S → a A B e)
S10: S' → S . (S reduced)
[Figure: the parse tree for abbcde after each reduction.]
40
Ambiguity: Sources of Conflicts
• When trying to reduce a substring of the current sentential form:
  – Not all reducible substrings are handles
  – Ambiguous: more than one substring qualifies as a handle
• Sources of conflicts
  – Non-LR grammar
  – Shift-reduce conflicts
  – Reduce-reduce conflicts
41
Shift/Reduce Conflict
stmt → if expr then stmt
     | if expr then stmt else stmt
     | other
Stack: $ … if expr then stmt      Input: else stmt … $
• Shift:  stmt → if expr then stmt else stmt
• Reduce: stmt → if expr then stmt
42
Reduce/Reduce Conflict
(1) stmt → id ( para_list )        // func(a,b)
(2) stmt → expr := expr
(3) para_list → para_list , para
(4) para_list → para
(5) para → id
(6) expr → id ( expr_list )        // array(a,b)
(7) expr → id
(8) expr_list → expr_list , expr
(9) expr_list → expr
    Stack              Input
(a) $ … id ( id        , id ) … $    [Q: r5? r7?]
(b) $ … procid ( id    , id ) … $    [r5]
[Sol: use "stmt → procid ( para_list )" => (a) r7, (b) r5]
– Needs a complex lexical analyzer to distinguish id from procid
– The reduction depends on stack[sp-2]
43
LR(k) Grammars
• Only some classes of grammars, known as the "LR(k) grammars," can be parsed deterministically by a shift-reduce parser
• CFGs that are non-LR may need some adaptation before they can be parsed deterministically with a shift-reduce parser
• Parsing table construction
  – Predict the handles at each position (after shifts)
44
LR(k) Parsing
The L stands for scanning the input from left to right
The R stands for constructing a rightmost derivation in reverse
The k stands for the number of lookahead input symbols used to make parsing decisions
45
LR Parsing
The LR parsing algorithm
Constructing SLR(1) parsing tables
Constructing LR(1) parsing tables
Constructing LALR(1) parsing tables
46
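Model of an LR Parser
[Figure: model of an LR parser. The LR parsing program reads the input, consults a parsing table with Action and Goto parts, and produces output. The stack holds S0 … Xm-1 Sm-1 Xm Sm, with Sm on top. Annotations: initial state; state before action; state after action; shift/reduce; handle; state after reduction.]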
47
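Parsing Table for Expression Grammar
(0) E' → E
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → ( E )
(6) F → id
Follow(E) = {+, ), $}
Follow(T) = {+, ), $, *}
Follow(F) = {+, ), $, *}
      |           Action            |   Goto
State | id   +    *    (    )    $   | E   T   F
0     | s5   -    -    s4   -    -   | 1   2   3
1     | -    s6   -    -    -    acc | -   -   -
2     | -    r2   s7   -    r2   r2  | -   -   -
3     | -    r4   r4   -    r4   r4  | -   -   -
4     | s5   -    -    s4   -    -   | 8   2   3
5     | -    r6   r6   -    r6   r6  | -   -   -
6     | s5   -    -    s4   -    -   | -   9   3
7     | s5   -    -    s4   -    -   | -   -   10
8     | -    s6   -    -    s11  -   | -   -   -
9     | -    r1   s7   -    r1   r1  | -   -   -
10    | -    r3   r3   -    r3   r3  | -   -   -
11    | -    r5   r5   -    r5   r5  | -   -   -
(- = no entry)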
48
GOTO Actions
I0: E' → . E; E → . E + T; E → . T; T → . T * F; T → . F; F → . ( E ); F → . id
I1: E' → E . ; E → E . + T
I2: E → T . ; T → T . * F
I3: T → F .
I4: F → ( . E ); E → . E + T; E → . T; T → . T * F; T → . F; F → . ( E ); F → . id
I5: F → id .
Transitions out of I0: E → I1, T → I2, F → I3, ( → I4, id → I5
[Figure: stack contents before and after a reduction: 0 id 5, 0 F 3, 0 T 2, 0 E 1.]
49
LR Parsing Algorithm
• Input:
  – An input string w and an LR parsing table with functions action and goto for a grammar G.
• Output:
  – If w is in L(G), a bottom-up parse for w; otherwise, an error indication.
• Method:
  – Initially, the parser has s0 on its stack, where s0 is the initial state, and w$ in the input buffer.
  – Shift/reduce according to the parsing table (see next page)
50
LR Parsing Program
a := get input token;
while (1) do {
    s := the state on top of the stack;
    if (action[s,a] == shift s') {
        push a then s' on top of the stack;
        a := get input token;
    } else if (action[s,a] == reduce A → β) {
        pop 2*|β| symbols off the stack;
        s' := the state now on top of the stack;
        push A then goto[s',A] on top of the stack;
        output the production A → β;
    } else if (action[s,a] == accept) return;
    else error();
}
51
LR Parsing on id1*id2+id3
      Stack              Input          shift/reduce+goto      Action
(1)   0                  id * id + id $  (0,id):s5              Shift
(2)   0 id 5             * id + id $     (5,*):r6; (0,F):3      Reduce by F → id
(3)   0 F 3              * id + id $     (3,*):r4; (0,T):2      Reduce by T → F
(4)   0 T 2              * id + id $     (2,*):s7               Shift
(5)   0 T 2 * 7          id + id $       (7,id):s5              Shift
(6)   0 T 2 * 7 id 5     + id $          (5,+):r6; (7,F):10     Reduce by F → id
(7)   0 T 2 * 7 F 10     + id $          (10,+):r3; (0,T):2     Reduce by T → T*F
(8)   0 T 2              + id $          (2,+):r2; (0,E):1      Reduce by E → T
(9)   0 E 1              + id $          (1,+):s6               Shift
(10)  0 E 1 + 6          id $            (6,id):s5              Shift
(11)  0 E 1 + 6 id 5     $               (5,$):r6; (6,F):3      Reduce by F → id
(12)  0 E 1 + 6 F 3      $               (3,$):r4; (6,T):9      Reduce by T → F
(13)  0 E 1 + 6 T 9      $               (9,$):r1; (0,E):1      Reduce by E → E+T
(14)  0 E 1              $               (1,$):acc              Accept
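The table-driven loop can be tried out on this input with a short Python sketch. It is a minimal illustration rather than the deck's own program: the names `ACTION_ROWS`, `PRODS`, `GOTO` and `lr_parse` are made up for this example, the action and goto entries are transcribed from the expression-grammar parsing table in the slides, and the stack holds states only (the slides push grammar symbols as well, hence their pop of 2*|β| entries).

```python
# Production number -> (left-hand side, length of right-hand side)
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}
TERMS = ["id", "+", "*", "(", ")", "$"]

# One row per state; "-" marks an empty (error) entry.
ACTION_ROWS = """
0  s5 -  -  s4 -   -
1  -  s6 -  -  -   acc
2  -  r2 s7 -  r2  r2
3  -  r4 r4 -  r4  r4
4  s5 -  -  s4 -   -
5  -  r6 r6 -  r6  r6
6  s5 -  -  s4 -   -
7  s5 -  -  s4 -   -
8  -  s6 -  -  s11 -
9  -  r1 s7 -  r1  r1
10 -  r3 r3 -  r3  r3
11 -  r5 r5 -  r5  r5
"""
ACTION = {}
for line in ACTION_ROWS.strip().splitlines():
    state, *cells = line.split()
    for term, cell in zip(TERMS, cells):
        if cell != "-":
            ACTION[int(state), term] = cell

GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3, (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}

def lr_parse(tokens):
    """Return the production numbers used, in the order the reductions happen."""
    tokens = list(tokens) + ["$"]
    stack, pos, output = [0], 0, []          # stack of states; state 0 is the initial state
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:
            raise SyntaxError(f"unexpected {tokens[pos]!r} at position {pos}")
        if act == "acc":
            return output
        if act[0] == "s":                    # shift: push the new state, advance the input
            stack.append(int(act[1:]))
            pos += 1
        else:                                # reduce by production n
            n = int(act[1:])
            lhs, length = PRODS[n]
            del stack[-length:]              # pop one state per right-hand-side symbol
            stack.append(GOTO[stack[-1], lhs])
            output.append(n)

print(lr_parse(["id", "*", "id", "+", "id"]))
```

The returned list is the sequence of reductions, which read backwards is a rightmost derivation of the input.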
52
LR Parsing Advantages
• Efficient: non-backtracking
  – Efficient parsing
  – Efficient error detection (& correction): a syntax error is detected as soon as it appears during the left-to-right scan
• Coverage: virtually all programming language constructs
  – G(LR) ⊃ G(TD predictive parsing)
• Disadvantage: too much work to construct by hand (hence generators such as YACC)
53
How To: LR Parsing (repeated)
• LR parsing =/= leftmost reduction
  – The first reducible substring does not always result in a successful parse
  – Handle(s): the reducible substrings that do lead to S
• Top-down: expansion and matching
• Bottom-up: shift/reduce
  – Locating the next "handle" to reduce [how?]
  – Handle pruning: hide the details below the reduced (Σ|N)*
54
LR Parsing Table Construction Techniques
Parsing table construction:
• SLR(1) parser
  – LR(0) items & states
• LR(1) parser
  – shift/reduce conflict resolution
  – LR(1) items & states
• LALR(1) parser
  – LR(1) state merge
  – reduce-reduce conflict
56
SLR Parser
• Coverage: the weakest of the three methods in terms of the number of grammars it handles
• Easiest to construct
Parser: a DFA for recognizing viable prefixes States: Sets of LR(0) Items
The items in a set can be viewed as the states of an NFA recognizing viable prefixes
Grouping items into sets is equivalent to subset construction
60
Viable Prefix
• The prefixes of c.s.f.'s (canonical/right-sentential forms) that can appear on the stack of a shift-reduce parser are called viable prefixes.
• Equivalently, a viable prefix is a prefix of a right-sentential form that does not continue past the right end of the rightmost handle of that sentential form.
• If α is a viable prefix, then there is some w ∈ Σ* such that αw is a c.s.f.
61
Item and Valid Item
• An LR(0) item (item for short) is a marked production [A → β1•β2] (dotted rule: a production with a dot in its RHS)
• An item [A → β1•β2] is said to be valid for some viable prefix αβ1 iff there is some w ∈ Σ* such that S =>*rm αAw =>rm αβ1β2w
• The "•" represents where we are now during parsing
  – Left of dot: what has been scanned
  – Right of dot: what is to be visited later
[Figure: parse tree with root S; node A spans β1 β2, with α to its left and w to its right.]
62
Example of Valid Item
• Consider the grammar:
  S → 1 C | D
  C → 3 | 4
  D → 1 B
  B → 2
• Valid items for the viable prefix ε: [S → • 1 C], [S → • D], and [D → • 1 B]
[Figure: the two possible tree tops, S → 1 C or S → D with D → 1 B.]
63
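Example of Valid Item (cont.)
• Assume the viable prefix is "1"; the parse could continue as S → 1 C or as S → D, D → 1 B (figure).
• Valid items for the viable prefix "1": [S → 1 • C], [C → • 3], [C → • 4], [D → 1 • B], and [B → • 2]
[Figure: the candidate parse trees S → D → 1 B (with B → 2) and S → 1 C (with C → 3 or C → 4).]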
64
Example of Valid Item (cont.)
• Assume the parse tree S → 1 C, C → 3 (figure)
• Valid item for viable prefix "13": [C → 3 •]
• Valid item for viable prefix "1C": [S → 1 C •]
65
Closure: All Valid Items Enumerable from G
• Given a grammar
  E' → E
  E → E+T | T
  T → T*F | F
  F → (E) | id
• What are the valid items for the viable prefix "…E+"?
• [E → E+ • T], but also [T → • F], since E' =>*rm E+T =>rm E+F (using T → F)
• Likewise, [T → • T*F], [T → • F], [F → • (E)], [F → • id]
  – called the closure of [E → E+ • T] (inclusive)
66
Computation of Closure
• Given a set, I, of items
• Initially Closure(I) = I
• Loop: for all items [A → α•Bβ] …
  – If [A → α•Bβ] is in Closure(I) and B → γ is in P, then include [B → •γ] in Closure(I).
• Repeat the loop until no new dotted rules can be added
• Initial set of items for a grammar:
  – I0 = Closure({[S' → •S]})
  – (S: start symbol, S': augmented start symbol)
67
GOTO Computation
• Let I be a set of items which are valid for some viable prefix γ.
• Then goto(I,X), where X ∈ (N ∪ Σ), is the set of items which are valid for the viable prefix γX.
• So [A → α•Xβ] in I implies Closure({[A → αX•β]}) ⊆ goto(I,X)
  S =>*rm δAw =>rm δα • Xβw (γ = δα), which becomes δαX • βw after X
  (I is the set of items at the dot before X, including [A → α•Xβ] and others)
68
Sets of LR(0) Items Construction
• Augment the grammar with: S' → S
• Let I0 = Closure({[S' → •S]}), C = {I0}
  while (not all elements of C are marked) {
    – select an unmarked item set of C (say "I") and mark it;
    – for all X ∈ (V ∪ Σ): if goto(I,X) is non-empty and not already in C, then add goto(I,X) to C (unmarked);
  }
• Also called the Characteristic Finite State Machine (CFSM) construction algorithm.
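The closure, goto and marking loop above can be sketched in a few lines of Python. This is an illustration under stated assumptions: an item is encoded as a pair (production index, dot position), the names `closure`, `goto` and `canonical_collection` are chosen for this example, and the grammar is the expression grammar used elsewhere in the deck. States are numbered in discovery order, so the numbering need not match I0 to I11 in the textbook figure.

```python
GRAMMAR = [("E'", ("E",)), ("E", ("E", "+", "T")), ("E", ("T",)),
           ("T", ("T", "*", "F")), ("T", ("F",)),
           ("F", ("(", "E", ")")), ("F", ("id",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Close a set of LR(0) items, each encoded as (production index, dot position)."""
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMS:       # dot stands before a nonterminal B
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (q, 0) not in result:
                    result.add((q, 0))                    # add [B -> . gamma]
                    work.append((q, 0))
    return frozenset(result)

def goto(state, symbol):
    """Move the dot over `symbol` in every item that allows it, then close the result."""
    moved = {(p, dot + 1) for p, dot in state
             if dot < len(GRAMMAR[p][1]) and GRAMMAR[p][1][dot] == symbol}
    return closure(moved) if moved else None

def canonical_collection():
    start = closure({(0, 0)})                             # I0 = Closure({[S' -> . S]})
    states, transitions, work = [start], {}, [start]
    while work:                                           # "unmarked" sets still to process
        state = work.pop(0)
        symbols = sorted({GRAMMAR[p][1][dot] for p, dot in state
                          if dot < len(GRAMMAR[p][1])})
        for sym in symbols:
            target = goto(state, sym)
            if target not in states:
                states.append(target)
                work.append(target)
            transitions[states.index(state), sym] = states.index(target)
    return states, transitions

states, transitions = canonical_collection()
print(len(states), "item sets,", len(transitions), "transitions")
```

Only symbols that actually follow a dot are tried, which is one way of skipping the empty goto sets.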
69
SLR(1) Parsing Actions
• Compute the CFSM states C = {I0, I1, …, In}.
1. If [A → α•aβ] ∈ Ii and goto(Ii,a) = Ij, then set action(Ii,a) = shift Ij (where 'a' is a terminal)
2. If [A → α•] ∈ Ii, then set action(Ii,a) = reduce A → α for all a in Follow(A)
   – A terminal a in Follow(A) does not guarantee that reducing to A will result in a successful parse (α is not necessarily a "handle").
   – But a terminal NOT in Follow(A) definitely indicates an impossible parse.
   – So reducing on the symbols in Follow(A) is only a loose criterion for a possibly successful parse.
3. If [S' → S•] ∈ Ii, then set action(Ii,$) = accept
4. All other action(*,*) = error
70
Conflicts
• Shift-reduce conflicts: both a shift action and a reduce action are possible in the same state (closed item set).
  – E.g., state 2 in Figure 4.37 (p. 229) [Aho 86]
• Reduce-reduce conflicts: two or more distinct reduce actions are possible in the same state (closed item set).
71
Example: Grammar G for Math Expressions
(0) E' → E
(1) E → E+T
(2) E → T
(3) T → T*F
(4) T → F
(5) F → (E)
(6) F → id
Follow(E) = {+, ), $}, Follow(T) = {+, ), $, *}, Follow(F) = {+, ), $, *}
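These Follow sets can be checked with a small fixed-point computation. The sketch below assumes a grammar without ε-productions (true for G), which keeps FIRST and FOLLOW simple; the function names `first_sets` and `follow_sets` are chosen for this example only.

```python
GRAMMAR = [("E'", ["E"]), ("E", ["E", "+", "T"]), ("E", ["T"]),
           ("T", ["T", "*", "F"]), ("T", ["F"]),
           ("F", ["(", "E", ")"]), ("F", ["id"])]

def first_sets(grammar):
    """FIRST of each nonterminal; valid only for grammars with no epsilon-productions."""
    nonterms = {lhs for lhs, _ in grammar}
    first = {n: set() for n in nonterms}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            x = rhs[0]
            new = first[x] if x in nonterms else {x}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def follow_sets(grammar, start):
    """FOLLOW of each nonterminal, again assuming no epsilon-productions."""
    nonterms = {lhs for lhs, _ in grammar}
    first = first_sets(grammar)
    follow = {n: set() for n in nonterms}
    follow[start].add("$")                       # end marker follows the start symbol
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, x in enumerate(rhs):
                if x not in nonterms:
                    continue
                if i + 1 < len(rhs):             # x is followed by another symbol
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in nonterms else {nxt}
                else:                            # x ends the production: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[x]:
                    follow[x] |= new
                    changed = True
    return follow

follow = follow_sets(GRAMMAR, "E'")
for name in ("E", "T", "F"):
    print(name, sorted(follow[name]))
```

For a grammar with ε-productions the two loops would also have to track nullable symbols, which this sketch leaves out.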
72
Computing SLR(1) States for G
• an SLR(1) State = a set of LR(0) items
• (See the next slide, Fig. 4.35, page 225, [Aho 86])
73
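Canonical LR(0) Collection for G
I0: E' → . E; E → . E + T; E → . T; T → . T * F; T → . F; F → . ( E ); F → . id
I1: E' → E . ; E → E . + T
I2: E → T . ; T → T . * F
I3: T → F .
I4: F → ( . E ); E → . E + T; E → . T; T → . T * F; T → . F; F → . ( E ); F → . id
I5: F → id .
I6: E → E + . T; T → . T * F; T → . F; F → . ( E ); F → . id
I7: T → T * . F; F → . ( E ); F → . id
I8: F → ( E . ); E → E . + T
I9: E → E + T . ; T → T . * F
I10: T → T * F .
I11: F → ( E ) .
Transitions:
I0: E → I1, T → I2, F → I3, ( → I4, id → I5
I1: + → I6
I2: * → I7
I4: E → I8, T → I2, F → I3, ( → I4, id → I5
I6: T → I9, F → I3, ( → I4, id → I5
I7: F → I10, ( → I4, id → I5
I8: ) → I11, + → I6
I9: * → I7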
78
Transition Diagram of DFA D for Viable Prefixes
• State transitions in terms of sets of LR(0) items (Fig. 4.36)
• SLR(1) parsing table (Fig. 4.31):
  – Ii --a--> Ij (a is a terminal): action(i,a) = shift j
  – Ii --A--> Ij (A is a nonterminal): goto(i,A) = j
  – Ii contains [A → α .]: action(i,a) = reduce A → α for every a in FOLLOW(A)
  – If A = S' (augmented start symbol): action(i,$) = accept
79
Visualizing Transitions in the Transition Diagram
• Shift: moving forward one step along an arc
  – Equivalent to pushing input symbols
• Reduce "LHS → RHS": moving backward to a previous state 's' along the arcs labeled with the RHS symbols
  – Then GOTO(s, LHS)
  – Equivalent to popping the RHS symbols from the stack, then pushing LHS, then redefining the current state
80
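Parsing Table for Expression Grammar
      |           action            |   goto
State | id   +    *    (    )    $   | E   T   F
0     | s5   -    -    s4   -    -   | 1   2   3
1     | -    s6   -    -    -    acc | -   -   -
2     | -    r2   s7   -    r2   r2  | -   -   -
3     | -    r4   r4   -    r4   r4  | -   -   -
4     | s5   -    -    s4   -    -   | 8   2   3
5     | -    r6   r6   -    r6   r6  | -   -   -
6     | s5   -    -    s4   -    -   | -   9   3
7     | s5   -    -    s4   -    -   | -   -   10
8     | -    s6   -    -    s11  -   | -   -   -
9     | -    r1   s7   -    r1   r1  | -   -   -
10    | -    r3   r3   -    r3   r3  | -   -   -
11    | -    r5   r5   -    r5   r5  | -   -   -
(- = no entry)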
81
LR Parsing Table Construction Techniques…
• Canonical LR parsing table …
• LALR parsing table …
(See textbook …)
82
Canonical LR Parser
• The SLR(1) parser does NOT always work
  – SLR(1) grammar => unambiguous
  – Unambiguous CFG =/=> SLR(1) grammar
• E.g., a shift-reduce conflict in the SLR(1) parsing table may NOT be a real shift-reduce conflict (e.g., the "reduce" is impossible)
• Need more specific, additional information to define states [to avoid false reductions]
  – Use LR(1) items instead of LR(0) items
  – Many more states than SLR(1)
• Need (canonical) LR(1) or LALR(1) parsers (parsing table construction methods)
83
Example: non-SLR(1) Grammar for Assignment
(0) S' → S
(1) S → L = R
(2) S → R
(3) L → * R   (content of R)
(4) L → id
(5) R → L
I2: (1) S → L . = R
    (5) R → L .
  – Action(2,'=') = shift 6
  – Action(2,'=') = reduce 5, since Follow(R) = {'=', …}
    (S => L = R => *R = R shows that '=' can follow R)
I3: (2) S → R .
  – '=' ∉ Follow(S)
If we reduce by (5) on '=': goto I3, then error (checking Follow(S)), so the string is NOT really reducible.
Example: non-SLR(1) Grammar for Assignment
• Problem:
  – G is unambiguous
  – The SLR shift/reduce conflict is false, but
  – the SLR parsing table is unable to remember enough left context to decide the proper action on '=' when seeing a string reducible to L
85
Why Unambiguous Yet Non-SLR(1)
• Some reduce actions, chosen by checking the input against Follow(LHS), are not really reducible
  – Not all symbols in FOLLOW(LHS) result in a successful reduction to S.
  – The parse may fail after a few more reduction steps.
• SLR(1) states, being defined by LR(0) items, do not resolve such conflicts
  – Need more specific constraints to rule out a subset of Follow(LHS) from indicating a reduction action
86
LR(1) Parsing Table Construction
• SLR: reduce A → α on input 'a' if Ii contains [A → α .] and 'a' ∈ FOLLOW(A)
  – Not really reducible for all 'a' ∈ FOLLOW(A)
  – Only for a subset (maybe a proper subset)
  – In some cases S =>* βαa, yet βAa is not a right-sentential form
    • The reduction A → α does not produce a right-sentential form
  – E.g., "L • = R …" cannot be reduced to "R • = R …"
  – although S =>* *R • = R puts '=' in Follow(R)
87
LR(1) Parsing Table Construction
• Solution:
  – Define each state by including more specific information to rule out invalid reductions
  – Sometimes results in splitting states of the same "core"
• LR(0) items: [A → α . β]
  – Only the dotted production (the "core")
• LR(1) items: [A → α . β, LA's]
  – Dotted production (the "core"), plus the lookaheads that allow reduction by A → αβ
  – "1": the length of the LA symbols
88
LR(1) Parsing Table Construction
• [A → α . β, a] (β ≠ ε): the LA ('a') has no effect on items of this form
• [A → α . , a] (i.e., β = ε): the LA has an effect on items of this form
  – Reduction is called for only when the next input is 'a' (not for all terminal symbols in Follow(A))
  – Only a subset of Follow(A) will be the right LA's
• Initially, only one restriction is known: [S' → . S, $]
• Infer the other restrictions by closure computation
89
LR(1) Item and Valid Item
• An LR(1) item is a dotted production plus a lookahead symbol: [A → α•β, a]
• An LR(1) item [A → α•β, a] is said to be valid for a viable prefix γ if there is a rightmost derivation S =>*rm δAw =>rm δαβw, where
  1. γ = δα, and
  2. 'a' ∈ First(w) (or w = ε and a = '$')
• The "•" represents where we are now during parsing
  – Left of dot: what has been scanned
  – Right of dot: what is to be visited later
90
LR(1) Parsing Table Construction
• Change the closure() and goto() functions of the SLR parsing table construction, with initial collection:
  – C = {closure({[S' → . S, $]})}
  – [A → α•Bβ, a] valid implies [B → •γ, b] valid if b is in FIRST(βa)
• Construction method for the sets of LR(1) items
  – See the next few pages
91
LR(1): Closure(I)
• Given a set, I, of items
• Initially Closure(I) = I
• Repeat:
  – for each item [A → α•Bβ, a] in Closure(I),
  – each production B → γ in G',
  – and each terminal b in FIRST(βa),
  – include [B → •γ, b] in Closure(I).
• Until no more items can be added to Closure(I)
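As a concrete check of this loop, here is a minimal Python sketch for the grammar S' → S, S → C C, C → c C | d used in the later slides. The encoding of an item as (production index, dot position, lookahead) and the names `first_of_symbol` and `closure1` are assumptions made for this example; because the grammar has no ε-productions, FIRST(βa) is just FIRST of the first symbol of β, or {a} when β is empty.

```python
GRAMMAR = [("S'", ("S",)), ("S", ("C", "C")), ("C", ("c", "C")), ("C", ("d",))]
NONTERMS = {lhs for lhs, _ in GRAMMAR}

def first_of_symbol(x, seen=None):
    """FIRST of a single grammar symbol (no epsilon-productions assumed)."""
    if x not in NONTERMS:
        return {x}
    seen = seen if seen is not None else set()
    if x in seen:                        # already being expanded (left recursion)
        return set()
    seen.add(x)
    out = set()
    for lhs, rhs in GRAMMAR:
        if lhs == x:
            out |= first_of_symbol(rhs[0], seen)
    return out

def closure1(items):
    """Close a set of LR(1) items, each encoded as (production index, dot, lookahead)."""
    result, work = set(items), list(items)
    while work:
        p, dot, la = work.pop()
        rhs = GRAMMAR[p][1]
        if dot < len(rhs) and rhs[dot] in NONTERMS:
            beta = rhs[dot + 1:]
            # FIRST(beta a): first symbol of beta if there is one, else the lookahead a
            lookaheads = first_of_symbol(beta[0]) if beta else {la}
            for q, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot]:
                    for b in lookaheads:
                        if (q, 0, b) not in result:
                            result.add((q, 0, b))
                            work.append((q, 0, b))
    return frozenset(result)

I0 = closure1({(0, 0, "$")})
for p, dot, la in sorted(I0):
    lhs, rhs = GRAMMAR[p]
    print(f"[{lhs} -> {' '.join(rhs[:dot])} . {' '.join(rhs[dot:])}, {la}]")
```

The printed items correspond to state I0 of the LR(1) goto graph, with the combined lookahead c/d written as two separate items.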
92
LR(1): GOTO(I,X)
• Let J = {[A → αX•β, a] | [A → α•Xβ, a] is in I}.
• goto(I,X) = closure(J)
• That is:
  – J = {}
  – For all [A → α•Xβ, a] in I, J += {[A → αX•β, a]}
  – Return(closure(J))
[Figure: I = {[A → α•Xβ, a], [A' → α'•Xβ', a'], …} --X--> J = {[A → αX•β, a], [A' → α'X•β', a'], …}; Goto(I,X) = Closure(J).]
93
Sets of LR(1) Items Construction
• Augment the grammar with: S' → S, call it G'
• Let I0 = Closure({[S' → •S, $]}), C = {I0}
  Repeat {
    – for all I ∈ C and all X ∈ (N ∪ Σ): if goto(I,X) is non-empty and not already in C, then add goto(I,X) to C
  }
  Until no more sets of items can be added to C
94
Example: resolving shift/reduce conflicts with LR(1) items
• G': {S' → S, S → CC, C → cC | d}
• L(G) = { c^m d c^n d | m, n ≥ 0 }
• => I0 ~ I9 (Fig. 4.39, p. 235 [Aho 86])
• I3 vs. I6: the same set of LR(0) items with different lookaheads
• The conditions for reduction are different
  – I3: reduce on c/d (when constructing the 1st 'C')
  – I6: reduce on $ (when constructing the 2nd 'C')
95
SLR(1) Goto Graph
G: S' → S, S → C C, C → c C, C → d
Follow sets: S: {$}; C: {c, d, $}
I0: S' → . S; S → . C C; C → . c C; C → . d
I1: S' → S . [$]
I2: S → C . C; C → . c C; C → . d
I3: C → c . C; C → . c C; C → . d
I4: C → d . , c/d/$
I5: S → C C . [$]
I8: C → c C . , c/d/$
Transitions:
I0: S → I1, C → I2, c → I3, d → I4
I2: C → I5, c → I3, d → I4
I3: C → I8, c → I3, d → I4
96
LR(1) Goto Graph
G: S' → S, S → C C, C → c C, C → d
I0: S' → . S, $; S → . C C, $; C → . c C, c/d; C → . d, c/d
I1: S' → S . , $
I2: S → C . C, $; C → . c C, $; C → . d, $
I3: C → c . C, c/d; C → . c C, c/d; C → . d, c/d
I4: C → d . , c/d
I5: S → C C . , $
I6: C → c . C, $; C → . c C, $; C → . d, $
I7: C → d . , $
I8: C → c C . , c/d
I9: C → c C . , $
Transitions:
I0: S → I1, C → I2, c → I3, d → I4
I2: C → I5, c → I6, d → I7
I3: C → I8, c → I3, d → I4
I6: C → I9, c → I6, d → I7
97
Construction of Canonical LR(1) Parsing Table
• Algorithm 4.10
  – Shift: (same as SLR, ignoring the LA in the item)
  – Reduce on 'a': [A → α•, a]
  – Accept on '$': [S' → S•, $]
  – Goto: (same as SLR)
• LR(1) grammar:
  – a grammar without conflicts (multiply defined actions) in its LR(1) parsing table
98
SLR(1) vs. LR(1)
• LR(1): more specific states
  – May split into states with the same "core" but with different lookaheads
  – SLR(1) grammars ⊂ LR(1) grammars
  – Number of states: LR(1) >> SLR(1)
99
LALR(1)
• Merge LR(1) states with the same core, while retaining lookahead symbols
  – Considerably smaller than canonical LR tables
• Most programming language constructs can be expressed by an LALR grammar
  – SLR and LALR have the same number of states
    • Without/with lookahead symbols [full/subset of FOLLOW]
    • Several hundred states for PASCAL
    • Several thousand, if using LR(1)
• G is an LALR(1) grammar if there are no conflicts after the state merge
100
LALR(1) vs. LR(1)
• Effect of LR(1) state merge:
  – The merging of states with common cores can never produce a shift-reduce conflict that was not present in one of the original states
    • Because shift actions depend only on the core, not the lookahead
  – However, a merge may produce a reduce-reduce conflict.
    • Because the union of lookaheads may introduce unnecessary reductions
101
LALR(1) vs. LR(1)
• Example: merging that produces reduce-reduce conflicts.
  – LR(1) grammar:
    • S' → S
    • S → a A d | b B d | a B e | b A e
    • A → c
    • B → c
  – Sets of LR(1) items:
    • {[A → c . , d], [B → c . , e]} (valid for viable prefix ac)
    • {[A → c . , e], [B → c . , d]} (valid for viable prefix bc)
  – Merging the states with common cores gives {[A → c . , d/e], [B → c . , d/e]}
    • merging also merges lookaheads
  – Reduce-reduce conflicts:
    • A → c and B → c, on inputs d and e
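The merge in this example is small enough to reproduce directly. The sketch below is an illustration only: items are encoded as (lhs, rhs, dot, lookahead) tuples, and the helper names `core`, `merge_by_core` and `reduce_reduce_conflicts` are invented for this example.

```python
from collections import defaultdict

# The two LR(1) states from the example, reached on viable prefixes "ac" and "bc".
STATE_AC = {("A", ("c",), 1, "d"), ("B", ("c",), 1, "e")}
STATE_BC = {("A", ("c",), 1, "e"), ("B", ("c",), 1, "d")}

def core(state):
    """The LR(0) part of a state: its items with the lookaheads dropped."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)

def merge_by_core(states):
    """LALR-style merge: union the items of all states that share a core."""
    merged = defaultdict(set)
    for state in states:
        merged[core(state)] |= state
    return list(merged.values())

def reduce_reduce_conflicts(state):
    """Map each lookahead to the completed productions competing for it (two or more)."""
    by_lookahead = defaultdict(set)
    for lhs, rhs, dot, la in state:
        if dot == len(rhs):                      # dot at the end: a reduce item
            by_lookahead[la].add((lhs, rhs))
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}

merged = merge_by_core([STATE_AC, STATE_BC])
print(reduce_reduce_conflicts(STATE_AC))         # no conflict before the merge
print(reduce_reduce_conflicts(merged[0]))        # A -> c and B -> c compete on d and on e
```

Before the merge each state reduces by a different production on each lookahead; after the merge both productions are candidates on both d and e.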
102
LALR(1) vs. LR(1)
• Effect of LR(1) state merge:
  – Behave like the original, or
  – Declare an error later, but before shifting the next input symbol
  – For correct input: LR and LALR have the same sequence of shift/reduce
  – For erroneous input: LALR requires extra reduces after LR has detected an error (but before shifting the next symbol)
103
Example: Merge States with Same Core
• Fig. 4.39: I4 vs. I7: same reduction with different lookaheads
• State merge:
  – dotted rules remain, LA's merged
• Examples:
  – I3 + I6 => I36
  – I4 + I7 => I47
  – I8 + I9 => I89
  – Same as the SLR(1) table (Fig. 4.41, p. 239, [Aho 86])
104
LALR(1) Parsing Table Construction (I)
• Method 1: (naïve method)
  – [1] Construct the LR(1) parsing table
    • Very costly [#states is normally very large]
  – [2] Merge states with the same core
105
LALR(1) Parsing Table Construction (II)
• Method 2: (efficient construction method)
  – [1] Construct the kernels of the sets of LR(0) items, from [S' → •S]
    • It is possible to compute shift/reduce/goto actions directly from kernel items
    • Kernel items: items whose dot is not at the beginning, except [S' → . S, $]: those not derived from closure()
      – Can represent a set of items
  – [2] Append lookaheads
    • Compute the initial spontaneous lookaheads, and
    • those item pairs that pass propagated lookaheads
106
LALR(1) Parsing Table Construction (II.1)
• Compute shift/reduce/goto actions directly from kernel items: (pp. 240-241)
  – Reduce:
  – Shift:
  – Goto:
  – Need to pre-compute First'(C) = {A | C =>*rm Aδ} for all pairs of nonterminals (C, A) and
LALR(1) Parsing Table LALR(1) Parsing Table Construction (II.2)Construction (II.2)
• Determine spontaneous and propagated lookaheads (Fig. 4.43)– Compute closure({core,#}) by assuming a “du
mmy lookahead” ‘#’
108
LALR(1) Parsing Table Construction: Example
• Example 4.46 / Fig. 4.42 [p. 241, Aho 86]
  – Kernels of the sets of LR(0) items
  – Fig. 4.37 [with non-kernel items]
• Example 4.47
  – Get the spontaneous & propagated lookaheads
    • Fig. 4.44: item pairs that propagate lookaheads
    • Fig. 4.45: initial spontaneous lookaheads, and multiple passes of lookahead propagation
• LALR(1) parsing table:
  – Left as an exercise
109
LALR(1) Parsing Table Construction
• LALR(/LR) (Fig. 4.45) vs. SLR (Fig. 4.37)
  – SLR: I2: shift/reduce conflict on '='
    I2: (1) S → L . = R
        (5) R → L .
  – LALR(/LR): I2: shift on '=', reduce on '$', NO conflict
    I2: (1) S → L . = R, $
        (5) R → L . , $