Parsing - University of Aizu official website hamada/LP/L4-1-LP.pdf · Shift reduce parser 1....

Post on 18-Mar-2020

Parsing

Today

[Figure: compiler pipeline. COOL code (txt) → Lexical Analysis → Syntax Analysis (Parsing) → AST, symbol table, etc. → Intermediate Representation (IR) → Code Generation → executable code (exe)]

[Figure: overview of parsing methods. Parsing divides into Top Down Parsing (Predictive Parsing, LL(k) Parsing; related topics: Left Recursion, Left Factoring) and Bottom Up Parsing (Shift-reduce Parsing, LR(k) Parsing)]

Bottom-Up Parsers

Bottom-up parsers build the nodes on the bottom of the parse tree first. They are suitable for automatic parser generation and handle a larger class of grammars. Examples: shift-reduce parsers (or LR(k) parsers).

Bottom-up Parsing

• No problem with left-recursion
• Widely used in practice
• LR(1), SLR(1), LALR(1)

Grammar Hierarchy

[Figure: nested grammar classes, labeled Non-ambiguous CFG, CLR(1), LALR(1), SLR(1), and LL(1)]

Bottom-up Parsing

• Works from tokens to start-symbol
• Repeat:
    - identify handle (a reducible sequence):
        · non-terminal is not constructed but
        · all its children have been constructed
    - reduce: construct non-terminal and update stack
• Until reducing to start-symbol

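Bottom-up Parsing

Grammar:

E → E + (E)
E → i        where i = 0, 1, 2, …, 9

Reducing the input 1 + (2) + (3) step by step:

1 + (2) + (3)
E + (2) + (3)
E + (E) + (3)
E + (3)
E + (E)
E

[Figure: the parse tree built bottom-up over the tokens 1 + ( 2 ) + ( 3 )]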

Bottom-up Parsing

• Is the following grammar LL(1)?

E → E + (E)
E → i

Example strings: 1 + (2) and 1 + (2) + (3)

• NO: the production E → E + (E) is left-recursive
• But this is a useful grammar
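One quick way to see why the answer is NO: the production E → E + (E) is immediately left-recursive, and a left-recursive grammar cannot be LL(1). The check can be sketched in a few lines of Python. The dictionary encoding of the grammar and the function name `immediately_left_recursive` are assumptions made for illustration; they are not part of the lecture.

```python
# Grammar from the slide, encoded (as an assumption) as a dict that maps each
# non-terminal to a list of right-hand sides; each right-hand side is a list
# of grammar symbols.
GRAMMAR = {
    "E": [["E", "+", "(", "E", ")"], ["i"]],
}


def immediately_left_recursive(grammar):
    """Return the non-terminals A that have a production of the form A -> A ..."""
    return sorted(
        head
        for head, bodies in grammar.items()
        if any(body and body[0] == head for body in bodies)
    )


print(immediately_left_recursive(GRAMMAR))
```

This only detects immediate left recursion (A → A …); indirect left recursion through other non-terminals would need a more general check.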

Bottom-Up Parser

A bottom-up parser, or a shift-reduce parser, begins at the leaves and works up to the top of the tree.

The reduction steps trace a rightmost derivation in reverse.

Consider the grammar:

S → aABe
A → Abc | b
B → d

We want to parse the input string abbcde.

Bottom-Up Parser Example

Productions:

S → aABe
A → Abc
A → b
B → d

INPUT: a b b c d e $

Step   Sentential form   Reduction applied next
1      a b b c d e       A → b (the first b)
2      a A b c d e       A → Abc
3      a A d e           B → d
4      a A B e           S → aABe
5      S

In step 2 the remaining b also matches A → b, but we are not reducing here in this example. A parser would reduce, get stuck and then backtrack!

OUTPUT: [Figure: the parse tree, built bottom-up. S has children a, A, B, e; the upper A has children A, b, c; the lower A has child b; B has child d]

This parser is known as an LR Parser because it scans the input from Left to right, and it constructs a Rightmost derivation in reverse order.

Bottom-Up Parser Example

The scanning of productions for matching with handles in the input string, and backtracking, makes the method used in the previous example very inefficient.
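The method from the previous example can be sketched as a brute-force search: scan every production against every position of the current sentential form, reduce, and backtrack when the result is a dead end. The code below is a minimal illustration under that reading, not the lecture's own program; the names `PRODUCTIONS` and `reduce_to_start` and the string encoding of sentential forms are assumptions.

```python
# Productions of the example grammar, as (head, body) pairs.
PRODUCTIONS = [
    ("S", "aABe"),
    ("A", "Abc"),
    ("A", "b"),
    ("B", "d"),
]


def reduce_to_start(form, start="S", failed=None):
    """Try to reduce a sentential form to the start symbol by backtracking.

    Returns a list of (production, resulting form) steps on success, or
    None when no sequence of reductions reaches the start symbol.
    """
    if failed is None:
        failed = set()  # forms already known to be dead ends
    if form == start:
        return []
    if form in failed:
        return None
    # Scan every production against every position of the current form.
    for head, body in PRODUCTIONS:
        pos = form.find(body)
        while pos != -1:
            reduced = form[:pos] + head + form[pos + len(body):]
            rest = reduce_to_start(reduced, start, failed)
            if rest is not None:
                return [(f"{head} -> {body}", reduced)] + rest
            pos = form.find(body, pos + 1)  # backtrack: try the next match
    failed.add(form)
    return None


for production, form in reduce_to_start("abbcde"):
    print(f"{production:10} gives {form}")
```

For abbcde the search applies A → b, A → Abc, B → d and S → aABe, matching the steps of the example. In general, every failed choice costs a full re-scan of the productions, which is the inefficiency described above.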

Can we do better?

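LR Parser Example

[Figure: model of an LR parser. The LR Parsing Program reads the Input, maintains a Stack, consults the action and goto tables, and produces the Output]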

Shift reduce parser

1. Construct the action-goto table from the given grammar
2. Apply the shift-reduce parsing algorithm to construct the parse tree

Step 1, constructing the action-goto table, is what makes the difference between the different types of shift-reduce parsing such as SLR, CLR, and LALR.

In this course, due to lack of time, we will not study how to construct the action-goto table.

Shift reduce parser

2. Apply the shift-reduce parsing algorithm to construct the parse tree

The following algorithm shows how we can construct the table of parsing moves for an input string w$ with respect to a given grammar G. The stack initially holds only the start state.

set ip to point to the first symbol of the input string w$
repeat forever begin
    if action[top(stack), current-input(ip)] = shift(s) then begin
        push current-input(ip) then s on top of the stack
        advance ip to the next input symbol
    end
    else if action[top(stack), current-input(ip)] = reduce A → β then begin
        pop 2*|β| symbols off the stack;
        push A then goto[top(stack), A] on top of the stack;
        output the production A → β
    end
    else if action[top(stack), current-input(ip)] = accept then
        return
    else error()
end

In the reduce case, top(stack) in goto[top(stack), A] is the state uncovered by the pop, read before A is pushed.

LR Parser Example

The following grammar:

(1) E → E + T
(2) E → T
(3) T → T ∗ F
(4) T → F
(5) F → ( E )
(6) F → id

can be parsed with this action and goto table:

         action                                goto
State    id     +      *      (      )      $      E     T     F
0        s5                   s4                   1     2     3
1               s6                          acc
2               r2     s7            r2     r2
3               r4     r4            r4     r4
4        s5                   s4                   8     2     3
5               r6     r6            r6     r6
6        s5                   s4                         9     3
7        s5                   s4                               10
8               s6                   s11
9               r1     s7            r1     r1
10              r3     r3            r3     r3
11              r5     r5            r5     r5

s represents shift
r represents reduce
acc represents accept
empty represents error
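The shift-reduce algorithm combined with this table can be sketched in Python as follows. This is a minimal illustration, not the lecture's own program: the dictionary layout of `ACTION` and `GOTO`, the string encoding of entries such as "s5" and "r2", the function name `lr_parse`, and the use of ASCII `*` for ∗ are all assumptions.

```python
# Productions numbered as in the slides: number -> (head, length of body).
PRODUCTIONS = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

# ACTION[state][terminal]; a missing entry means error.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

# GOTO[state][non-terminal]
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def lr_parse(tokens):
    """Run the shift-reduce algorithm; return the production numbers output."""
    tokens = list(tokens) + ["$"]
    stack = [0]  # state, symbol, state, ... with a state always on top
    ip = 0
    output = []
    while True:
        entry = ACTION[stack[-1]].get(tokens[ip])
        if entry is None:
            raise SyntaxError(f"unexpected {tokens[ip]!r} at position {ip}")
        if entry == "acc":
            return output
        kind, number = entry[0], int(entry[1:])
        if kind == "s":
            stack += [tokens[ip], number]  # push the symbol, then the state
            ip += 1
        else:
            head, length = PRODUCTIONS[number]
            del stack[len(stack) - 2 * length:]  # pop 2*|beta| entries
            stack += [head, GOTO[stack[-1]][head]]  # goto from uncovered state
            output.append(number)


print(lr_parse(["id", "*", "id", "+", "id"]))
```

For the input id ∗ id + id this sketch returns [6, 4, 6, 3, 2, 6, 4, 1], the production numbers in the order the reductions are performed.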

LR Parser Example

INPUT: id ∗ id + id $

The LR Parsing Program makes the following moves, using the grammar and the action and goto table above. The stack is written from bottom to top.

Step   STACK                 Remaining INPUT    Move
1      0                     id ∗ id + id $     s5
2      0 id 5                ∗ id + id $        r6: F → id, goto[0, F] = 3
3      0 F 3                 ∗ id + id $        r4: T → F, goto[0, T] = 2
4      0 T 2                 ∗ id + id $        s7
5      0 T 2 ∗ 7             id + id $          s5
6      0 T 2 ∗ 7 id 5        + id $             r6: F → id, goto[7, F] = 10
7      0 T 2 ∗ 7 F 10        + id $             r3: T → T ∗ F, goto[0, T] = 2
8      0 T 2                 + id $             r2: E → T, goto[0, E] = 1
9      0 E 1                 + id $             s6
10     0 E 1 + 6             id $               s5
11     0 E 1 + 6 id 5        $                  r6: F → id, goto[6, F] = 3
12     0 E 1 + 6 F 3         $                  r4: T → F, goto[6, T] = 9
13     0 E 1 + 6 T 9         $                  r1: E → E + T, goto[0, E] = 1
14     0 E 1                 $                  acc

OUTPUT: [Figure: the parse tree, built bottom-up one reduction at a time. The root E has children E, +, T. The left E derives T, which has children T, ∗, F; that T derives F → id and that F derives id. The right T derives F → id]
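In order, the reductions performed in this example use productions 6, 4, 6, 3, 2, 6, 4, 1. Replaying that sequence backwards from the start symbol, always expanding the rightmost non-terminal, gives the rightmost derivation that the LR parser traced in reverse. The sketch below illustrates this; the names `BODIES` and `rightmost_derivation` and the use of ASCII `*` for ∗ are assumptions made for the example.

```python
# Productions numbered as in the slides: number -> (head, body symbols).
BODIES = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}
NONTERMINALS = {"E", "T", "F"}


def rightmost_derivation(output, start="E"):
    """Replay the parser output backwards as a rightmost derivation."""
    form = [start]
    steps = [" ".join(form)]
    for number in reversed(output):
        head, body = BODIES[number]
        # Locate the rightmost non-terminal of the current sentential form.
        idx = max(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[idx] != head:
            raise ValueError(f"production {number} does not apply to {form[idx]}")
        form[idx:idx + 1] = body  # expand it with the production body
        steps.append(" ".join(form))
    return steps


for step in rightmost_derivation([6, 4, 6, 3, 2, 6, 4, 1]):
    print(step)
```

The first line printed is E and the last is id * id + id; read from bottom to top, the lines are the sentential forms the parser passes through as it reduces.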

Constructing Parsing Tables

All LR parsers use the same parsing program that we demonstrated in the previous slides. What differentiates the LR parsers are the action and the goto tables:

• Simple LR (SLR): succeeds for the fewest grammars, but is the easiest to implement.
• Canonical LR: succeeds for the most grammars, but is the hardest to implement. It splits states when necessary to prevent reductions that would get the parser stuck.
• Lookahead LR (LALR): succeeds for most common syntactic constructions used in programming languages, but produces LR tables much smaller than canonical LR.