
Implementation of Parsers Bottom-Up Syntax Analysis Bottom ... · Implementation of Parsers...

Date post: 01-Aug-2020

Implementation of Parsers Bottom-Up Syntax Analysis

Bottom-Up Syntax Analysis

Educational Objectives:
• General Principles of Bottom-Up Syntax Analysis
• LR(k) Analysis
• Resolving Conflicts in Parser Generation
• Connection between CFGs and push-down automata

Ina Schaefer Context-Free Analysis 59


Basic Ideas: Bottom-Up Syntax Analysis

• Bottom-up analysis is more powerful than top-down analysis, since the production is chosen at the end of the analysis, while in top-down analysis the production is selected up front.

• LR: Read the input from left to right (L) and construct a rightmost derivation in reverse (R)


Principles of LR Parsing

1. Reduce from the sentence to the axiom.
2. Construct sentential forms from prefixes in $(N \cup T)^{*}$ and input rests in $T^{*}$. The prefixes are prefixes of right sentential forms of the grammar. Such prefixes are called viable prefixes. This prefix property has to hold invariantly to avoid dead ends.
3. Reductions are always made at the left-most possible position.


Viable Prefix

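Definition
Let $S \Rightarrow^{*}_{rm} \beta A u \Rightarrow_{rm} \beta \alpha u$ be a right derivation, so that $\beta \alpha u$ is a right sentential form of $\Gamma$.

Then $\alpha$ is called the handle or redex of the right sentential form $\beta \alpha u$.

Each prefix of $\beta \alpha$ is a viable prefix of $\Gamma$.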
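As a small illustration (an example chosen here, not taken from this slide), assume the grammar with the productions $S \to E\#$, $E \to E + T \mid T$ and $T \to ID$. One right derivation is:

```latex
S \Rightarrow_{rm} E\# \Rightarrow_{rm} E + T\# \Rightarrow_{rm} E + ID\#
% last step: \beta = E +, \quad A = T, \quad \alpha = ID, \quad u = \#
```

Under these assumptions, $ID$ is the handle of the right sentential form $E + ID\#$, and the prefixes of $\beta \alpha = E + ID$, namely $\varepsilon$, $E$, $E +$ and $E + ID$, are viable prefixes.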


Regularity of Viable Prefixes

Theorem
The language of viable prefixes of a grammar $\Gamma$ is regular.

Proof.
Cf. Wilhelm, Maurer, Thm. 8.4.1 and Corollary 8.4.2.1 (pp. 361, 362). The essential proof steps are illustrated in the following by the construction of LR-DFA($\Gamma$).


LR Parsing: Examples

• Consider $\Gamma_1$:
  - $S \to aCD$
  - $C \to b$
  - $D \to a \mid b$

Analysis of aba can lead to a dead end (cf. lecture).

Considering viable prefixes can avoid this.


LR Parsing: Examples (2)

• Consider $\Gamma_2$:
  - $S \to E\#$
  - $E \to a \mid (E) \mid EE$

Analysis of ((a))(a)# (cf. lecture)

A stack can manage the prefixes already read.


LR Parsing: Examples (3)

• Consider $\Gamma_3$:
  - $S \to E\#$
  - $E \to E + T \mid T$
  - $T \to ID$

Analysis of ID + ID + ID # (cf. Lecture)


LR Parsing: Shift and Reduce

Schematic syntax tree for input $xay$ with $a \in T$, $x, y \in T^{*}$ and start symbol $S$:

[Figure: schematic syntax tree over the input x a y with the read pointer, shown before a step, after a shift step, and after a reduce step.]


LR Parsing: Shift and Reduce (2)

Shift step:

[Figure: schematic syntax tree for input x a y. The read pointer moves past a, and a is appended to the part already analysed.]

Reduce step:

[Figure: schematic syntax tree for input x a y. A suffix of the part already analysed is replaced by a non-terminal, and the read pointer does not move.]


LR Parsing: Shift and Reduce (3)

Main Problems:
• A reduction may only be performed if the resulting prefix is still a viable prefix.
• When to shift? When to reduce? Which production to use?

Solution:
For each grammar $\Gamma$, construct the LR-DFA($\Gamma$) automaton (also called LR(0) automaton), which describes the viable prefixes.


Construction of LR-DFA

Let $\Gamma = (T, N, \Pi, S)$ be a CFG.
• For each non-terminal $n \in N$, construct its item automaton.
• Build the union of the item automata: the start state is the start state of the item automaton for $S$, the final states are the final states of the item automata.
• Add $\varepsilon$-transitions from each state that has the position point in front of a non-terminal $A$ to the start state of the item automaton of $A$.

If all states of the LR-DFA are considered as final states, the accepted language is the language of viable prefixes.
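The construction can be sketched in Python. This is a minimal sketch under assumptions, not the construction from the lecture: instead of building the item automata and making them deterministic afterwards, it uses the equivalent closure/goto formulation, which follows the $\varepsilon$-transitions while the states are built. The names `GRAMMAR`, `closure`, `goto` and `build_lr_dfa` are hypothetical, the grammar is $S \to E\#$, $E \to E + T \mid T$, $T \to ID$, and the error state is left implicit (a missing edge).

```python
# Assumed grammar: S -> E #, E -> E + T | T, T -> ID.
GRAMMAR = {
    "S": [("E", "#")],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("ID",)],
}
START = "S"

# An item is a triple (lhs, rhs, dot); dot is the position point inside rhs.


def closure(items):
    """Follow epsilon transitions: whenever the position point stands in
    front of a non-terminal B, add the start items [B -> .gamma]."""
    result = set(items)
    todo = list(items)
    while todo:
        _, rhs, dot = todo.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            for prod in GRAMMAR[rhs[dot]]:
                item = (rhs[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the position point over `symbol` in every item that allows it."""
    moved = {(lhs, rhs, dot + 1)
             for (lhs, rhs, dot) in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


def build_lr_dfa():
    """Return the list of states (item sets) and the labelled edges."""
    start = closure({(START, rhs, 0) for rhs in GRAMMAR[START]})
    states, edges = [start], {}
    for index, state in enumerate(states):  # the list grows while iterating
        symbols = sorted({rhs[dot] for (_, rhs, dot) in state if dot < len(rhs)})
        for symbol in symbols:
            target = goto(state, symbol)
            if target not in states:
                states.append(target)
            edges[(index, symbol)] = states.index(target)
    return states, edges


if __name__ == "__main__":
    states, edges = build_lr_dfa()
    for number, state in enumerate(states):
        print(number, sorted(state))
```

The state numbers produced by this sketch depend on the order of discovery and need not agree with the numbering used on the slides.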


Construction of LR-DFA: Example

$\Gamma_3$: $S \to E\#$, $E \to E + T \mid T$, $T \to ID$

[Figure: item automata for $\Gamma_3$, connected by ε-transitions.]

Item automata:
  [S → .E#]   --E-->   [S → E.#]   --#-->  [S → E#.]
  [E → .E+T]  --E-->   [E → E.+T]  --+-->  [E → E+.T]  --T-->  [E → E+T.]
  [E → .T]    --T-->   [E → T.]
  [T → .ID]   --ID-->  [T → ID.]

ε-transitions lead from [S → .E#] and [E → .E+T] to [E → .E+T] and [E → .T], and from [E → .T] and [E → E+.T] to [T → .ID].

Determinisation yields the following automaton.


Construction of LR-DFA: Example (2)

Determinisation:

[Figure: LR-DFA for $\Gamma_3$ with states q0 to q7; all transitions not listed below are error transitions into q7.]

States of the LR-DFA:
  q0: [S → .E#], [E → .E+T], [E → .T], [T → .ID]
  q1: [S → E.#], [E → E.+T]
  q2: [S → E#.]
  q3: [E → E+.T], [T → .ID]
  q4: [E → E+T.]
  q5: [E → T.]
  q6: [T → ID.]
  q7: error state

Transitions: q0 --E--> q1, q0 --T--> q5, q0 --ID--> q6, q1 --#--> q2, q1 --+--> q3, q3 --T--> q4, q3 --ID--> q6.

Viable prefixes of maximal length: E#, T, ID, E + ID, E + T


Construction of LR-DFA: Example (3)

Remarks:
• In the example, each final state contains exactly one completely read production. In general, this is not the case.
• If a final state contains more than one completely read production, we have a reduce/reduce conflict.
• If a final state contains a completely read production and an incompletely read production with a terminal after the position point, we have a shift/reduce conflict.
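The two conflict conditions can be checked mechanically on a single state. The following is a minimal sketch, assuming that a state is given as a set of items `(lhs, rhs, dot)`; the function name `conflicts` and this representation are hypothetical and not part of the slides.

```python
def conflicts(state, terminals):
    """Return the kinds of LR(0) conflicts found in one state.

    state     -- set of items (lhs, rhs, dot), dot being the position point
    terminals -- set of terminal symbols of the grammar
    """
    # Completely read productions: the position point is at the end.
    complete = [item for item in state if item[2] == len(item[1])]
    # Incompletely read productions with a terminal after the position point.
    shiftable = [item for item in state
                 if item[2] < len(item[1]) and item[1][item[2]] in terminals]

    found = []
    if len(complete) > 1:
        found.append("reduce/reduce")
    if complete and shiftable:
        found.append("shift/reduce")
    return found


if __name__ == "__main__":
    terminals = {"ID", "+", "#"}
    # Illustrative state with the items [E -> T.+E] and [E -> T.].
    state = {("E", ("T", "+", "E"), 1), ("E", ("T",), 1)}
    print(conflicts(state, terminals))  # prints ['shift/reduce']
```

A state such as {[T → ID.], [N → ID.]} would be reported as a reduce/reduce conflict by the same function.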


Analysis with LR-DFA

Analysis of ID + ID + ID # with the LR-DFA (the viable prefix read by the LR-DFA is listed next to each sentential form):

  Sentential form        Viable prefix
  ID + ID + ID #   ⇐     ID
  T + ID + ID #    ⇐     T
  E + ID + ID #    ⇐     E + ID
  E + T + ID #     ⇐     E + T
  E + ID #         ⇐     E + ID
  E + T #          ⇐     E + T
  E #              ⇐     E #
  S


Analysis with LR-DFA (2)

Note:
• The sentential forms always consist of a viable prefix and the remaining input.
• If only the LR-DFA is used, the sentential form has to be read from the beginning after each reduction.

Thus: use a pushdown automaton for the analysis.


LR pushdown automaton

Definition
Let $\Gamma = (N, T, \Pi, S)$ be a CFG. The LR-DFA pushdown automaton for $\Gamma$ contains:

• a finite set of states $Q$ (the states of LR-DFA($\Gamma$))
• a set of actions $Act = \{shift, accept, error\} \cup red(\Pi)$, where $red(\Pi)$ contains an action $reduce(A \to \alpha)$ for each production $A \to \alpha$
• an action table $at : Q \to Act$
• a successor table $succ : P \times (N \cup T) \to Q$ with $P = \{q \in Q \mid at(q) = shift\}$


LR pushdown automaton (2)

Remarks:
• The LR-DFA pushdown automaton is a variant of pushdown automata particularly designed for LR parsing.
• States encode the left context that has been read.
• If there are no conflicts, the action table can be constructed directly from the LR-DFA:
  - accept: final state of the item automaton of the start symbol
  - reduce: all other final states
  - error: error state
  - shift: all other states


Execution of Pushdown Automaton

• Configuration: an element of $Q^{*} \times T^{*}$, where the variable stack denotes the sequence of states and the variable inr denotes the remaining input

• Start configuration: (q0, input), where q0 is the start state of the LR-DFA

• Interpretation procedure:

    (stack, inr) := (q0, input);
    do {
      step(stack, inr);
    } while ( at(top(stack)) != accept
              and at(top(stack)) != error );
    if ( at(top(stack)) == error ) return error;

with step defined as follows:


Execution of Push-Down Automaton (2)

    void step ( var StateSeq stack, var SymbolSeq inr ) {
      State tk := top(stack);
      switch ( at(tk) ) {
        case shift:
          /* push the successor state for the next input symbol */
          stack := push( succ(tk, top(inr)), stack );
          inr := tail(inr);
          break;
        case reduce A -> a:
          /* pop one state per symbol of the right-hand side a */
          stack := mpop( length(a), stack );
          stack := push( succ(top(stack), A), stack );
          break;
      }
    }
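The interpretation procedure and the step function can be tried out with a small runnable sketch. It is written in Python under stated assumptions: the tables are those of the LR pushdown automaton for $\Gamma_3$ with states q0 to q7, the dictionary encoding and the names `ACTION`, `SUCC` and `parse` are hypothetical, and a missing input symbol is treated as a transition into the error state q7.

```python
# Action table at: state -> action (reduce carries the production A -> a).
ACTION = {
    "q0": ("shift",),
    "q1": ("shift",),
    "q2": ("accept",),
    "q3": ("shift",),
    "q4": ("reduce", "E", ("E", "+", "T")),
    "q5": ("reduce", "E", ("T",)),
    "q6": ("reduce", "T", ("ID",)),
    "q7": ("error",),
}

# Successor table succ, defined for the shift states only.
SUCC = {
    "q0": {"ID": "q6", "+": "q7", "#": "q7", "E": "q1", "T": "q5"},
    "q1": {"ID": "q7", "+": "q3", "#": "q2", "E": "q7", "T": "q7"},
    "q3": {"ID": "q6", "+": "q7", "#": "q7", "E": "q7", "T": "q4"},
}


def step(stack, inr):
    """Perform one shift or reduce step; both lists are modified in place."""
    action = ACTION[stack[-1]]
    if action[0] == "shift":
        # Assumption: exhausted or unknown input leads to the error state.
        symbol = inr.pop(0) if inr else None
        stack.append(SUCC[stack[-1]].get(symbol, "q7"))
    elif action[0] == "reduce":
        _, lhs, rhs = action
        del stack[len(stack) - len(rhs):]       # mpop(length(a), stack)
        stack.append(SUCC[stack[-1]][lhs])      # push(succ(top(stack), A))


def parse(tokens):
    """Run the automaton; return (accepted, list of stack snapshots)."""
    stack, inr = ["q0"], list(tokens)
    trace = [tuple(stack)]
    while ACTION[stack[-1]][0] not in ("accept", "error"):
        step(stack, inr)
        trace.append(tuple(stack))
    return ACTION[stack[-1]][0] == "accept", trace


if __name__ == "__main__":
    accepted, trace = parse(["ID", "+", "ID", "+", "ID", "#"])
    print(accepted)  # prints True
    for snapshot in trace:
        print(" ".join(snapshot))
```

The loop tests the state on top of the stack before each step, which for the shift state q0 behaves like the do-while loop of the interpretation procedure.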


LR push down automaton: Example

LR-DFA with states q0, ..., q7 for grammar $\Gamma_3$

Action Table

  State   Action
  q0      shift
  q1      shift
  q2      accept
  q3      shift
  q4      reduce E → E+T
  q5      reduce E → T
  q6      reduce T → ID
  q7      error

Successor Table

  State   ID   +    #    E    T
  q0      q6   q7   q7   q1   q5
  q1      q7   q3   q2   q7   q7
  q3      q6   q7   q7   q7   q4

The rows for q2 and q4 to q7 are empty, since succ is only defined for shift states.


LR push down automaton: Example (2)

Computation for Input ID + ID + ID #

  Stack          Input Rest        Action
  q0             ID + ID + ID #    shift
  q0 q6          + ID + ID #       reduce T → ID
  q0 q5          + ID + ID #       reduce E → T
  q0 q1          + ID + ID #       shift
  q0 q1 q3       ID + ID #         shift
  q0 q1 q3 q6    + ID #            reduce T → ID
  q0 q1 q3 q4    + ID #            reduce E → E+T
  q0 q1          + ID #            shift
  q0 q1 q3       ID #              shift
  q0 q1 q3 q6    #                 reduce T → ID
  q0 q1 q3 q4    #                 reduce E → E+T
  q0 q1          #                 shift
  q0 q1 q2                         accept


LR-DFA Construction

Questions:
• Does the LR-DFA construction work for all unambiguous grammars?
• For which grammars does the construction work?
• How can the construction be generalized / made more expressive?


Example LR-DFA

LR-DFA for $\Gamma_6$: $S \to E\#$, $E \to T + E \mid T$, $T \to ID \mid N()$, $N \to ID$

[Figure: LR-DFA for $\Gamma_6$ with states q0 to q10 and error transitions.]

The start state q0 contains the items [S → .E#], [E → .T+E], [E → .T], [T → .ID], [T → .N()] and [N → .ID]. State q4 contains the items [E → T.+E] and [E → T.]; state q6 contains the items [T → ID.] and [N → ID.].


LR Parsing Conflicts

2 Kinds of Conflicts:
• Shift/Reduce Conflicts (q4 in example)
• Reduce/Reduce Conflicts (q6 in example)


LR Parsing Theory

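Definition
Let $\Gamma = (N, T, \Pi, S)$ be a CFG and $k \in \mathbb{N}$. $\Gamma$ is an LR(k) grammar if for any two right derivations

$$S \Rightarrow^{*}_{rm} \alpha A u \Rightarrow_{rm} \alpha \beta u$$

$$S \Rightarrow^{*}_{rm} \gamma B v \Rightarrow_{rm} \alpha \beta w$$

it holds that:
If $prefix(k, u) = prefix(k, w)$, then $\alpha = \gamma$, $A = B$ and $v = w$.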


LR Parsing Theory (2)

Remarks:

• While for LL grammars the selection of the production depends on the non-terminal to be derived, for LR grammars it depends on the complete left context.

• For LL grammars, the look ahead considers the language to be generated from the non-terminal. For LR grammars, the look ahead considers the language generated from not yet read non-terminals.


Characterization of LR(0)

Theorem
A reduced CFG $\Gamma$ is LR(0) if and only if LR-DFA($\Gamma$) contains no conflicts.

Proof.
Cf. lecture.


Characterization of LR(0) (2)

Example: Application of the LR(0) Characterization
Show (using the above theorem) that $\Gamma_5$ is LR(0).
$\Gamma_5$:

• $S \to A \mid B$
• $A \to aAb \mid 0$
• $B \to aBbb \mid 1$


Expressiveness of LR(k)

• For each deterministic context-free language $L$ with the prefix property (i.e. $\forall v, w \in L$: $v$ is no proper prefix of $w$), there exists an LR(0) grammar.

• Grammar $\Gamma_5$ is not LL(k) for any k, but it is LR(0).

• Methods for LR(1) can be generalized to LR(k), SLR(k) and LALR(k).


Resolving Conflicts by Look Ahead

• Compute look ahead sets from $(N \cup T)^{\le k}$ for items. The look ahead set of an item approximates the set of prefixes of length k with which the input rest at this item can start.

• If the look ahead sets of the items in a state are disjoint, then the action to be executed (shift, reduce) can be determined with k symbols of look ahead.

• In a state, select the action whose look ahead set contains the prefix of the input rest. The action table has to be extended accordingly.

• There are different methods for computing the look ahead sets.


Common Methods for Look Ahead Computation

• SLR(k) uses the LR-DFA and $FOLLOW_k$ of the conflicting items for look ahead

• LALR(k) (look ahead LR) uses the LR-DFA with state-dependent look ahead sets

• LR(k) integrates the computation of look ahead sets into the automaton construction (LR(k) automaton)


SLR Grammars

Definition (SLR(1) grammar)
Let $\Gamma = (N, T, \Pi, S)$ be a CFG and $LA([A \to \alpha.]) = FOLLOW_1(A)$.

A state of LR-DFA($\Gamma$) has an SLR(1) conflict if it contains two different reduce items with $LA([A \to \alpha.]) \cap LA([B \to \beta.]) \neq \emptyset$, or two items $[A \to \alpha.]$ and $[B \to \beta.a\gamma]$ with $a \in LA([A \to \alpha.])$.

$\Gamma$ is SLR(1) if there is no SLR(1) conflict.


SLR Grammars (2)

Example: $\Gamma_6$ is an SLR(1) grammar
• $S \to E\#$
• $E \to T + E \mid T$
• $T \to ID \mid N()$
• $N \to ID$

Consider the conflicts between $[E \to T.]$ and $[E \to T.+E]$ and between $[T \to ID.]$ and $[N \to ID.]$:

$$FOLLOW_1(E) \cap \{+\} = \{\#\} \cap \{+\} = \emptyset$$

$$FOLLOW_1(T) \cap FOLLOW_1(N) = \{\#, +\} \cap \{(\} = \emptyset$$
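The FOLLOW sets used in this check can be computed by a fixed-point iteration. The following Python sketch is an illustration under assumptions: it only handles grammars without ε-productions (which holds for $\Gamma_6$), it relies on the explicit end marker # in $S \to E\#$, and the names `GRAMMAR`, `first_sets` and `follow_sets` are hypothetical.

```python
# Grammar Γ6: S -> E #, E -> T + E | T, T -> ID | N ( ), N -> ID.
GRAMMAR = {
    "S": [("E", "#")],
    "E": [("T", "+", "E"), ("T",)],
    "T": [("ID",), ("N", "(", ")")],
    "N": [("ID",)],
}


def first_sets():
    """FIRST_1 of every non-terminal (assumes no epsilon productions)."""
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, productions in GRAMMAR.items():
            for rhs in productions:
                head = rhs[0]
                new = first[head] if head in GRAMMAR else {head}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def follow_sets():
    """FOLLOW_1 of every non-terminal (assumes no epsilon productions)."""
    first = first_sets()
    follow = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for lhs, productions in GRAMMAR.items():
            for rhs in productions:
                for i, symbol in enumerate(rhs):
                    if symbol not in GRAMMAR:
                        continue
                    if i + 1 < len(rhs):
                        # A symbol follows: use its FIRST set.
                        nxt = rhs[i + 1]
                        new = first[nxt] if nxt in GRAMMAR else {nxt}
                    else:
                        # End of the production: inherit FOLLOW of lhs.
                        new = follow[lhs]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


if __name__ == "__main__":
    follow = follow_sets()
    # Shift/reduce check for [E -> T.] against the shift symbol +.
    print(follow["E"] & {"+"})          # prints set()
    # Reduce/reduce check for [T -> ID.] against [N -> ID.].
    print(follow["T"] & follow["N"])    # prints set()
```

Both intersections are empty, which agrees with the two equations above.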


SLR Grammars (3)

Example: $\Gamma_7$ (simplified C expressions) is not an SLR(1) grammar

• $S \to E\#$
• $E \to L = R \mid R$
• $L \to {*}R \mid ID$
• $R \to L$


SLR Grammars (4)

LR-DFA for $\Gamma_7$

[Figure: LR-DFA for $\Gamma_7$: $S \to E\#$, $E \to L = R \mid R$, $L \to {*}R \mid ID$, $R \to L$.]

The only conflict is in the state with the items $[E \to L.=R]$ and $[R \to L.]$, with

$$FOLLOW_1(R) \cap \{=\} = \{=, \#\} \cap \{=\} \neq \emptyset$$


Construction of LR(1) Automata

The LR(1) automaton contains items $[A \to \alpha.\beta, V]$ with $V \subseteq T$, where
• $\alpha$ is on top of the stack
• the beginning of the input rest is derivable from $\beta c$ with $c \in V$, i.e. $V \subseteq FOLLOW_1(A)$.
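How the look ahead sets arise during the construction can be sketched for the grammar $S \to E\#$, $E \to L = R \mid R$, $L \to {*}R \mid ID$, $R \to L$. This is a minimal Python sketch under assumptions, not the construction from the lecture: each item carries a single look ahead symbol (a set $V$ corresponds to several such items), the grammar has no ε-productions, and the names `first`, `closure`, `goto` as well as the dummy look ahead `$` of the start item are hypothetical.

```python
GRAMMAR = {
    "S": [("E", "#")],
    "E": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("ID",)],
    "R": [("L",)],
}


def first(symbol, seen=frozenset()):
    """FIRST_1 of a symbol (assumes no epsilon productions)."""
    if symbol not in GRAMMAR:
        return {symbol}
    if symbol in seen:          # guard against left recursion
        return set()
    result = set()
    for rhs in GRAMMAR[symbol]:
        result |= first(rhs[0], seen | {symbol})
    return result


def closure(items):
    """Items are (lhs, rhs, dot, lookahead). For [A -> a.Bb, c] add
    [B -> .g, d] for every d in FIRST(b c)."""
    result = set(items)
    todo = list(items)
    while todo:
        _, rhs, dot, lookahead = todo.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue
        rest = rhs[dot + 1:]
        lookaheads = first(rest[0]) if rest else {lookahead}
        for prod in GRAMMAR[rhs[dot]]:
            for symbol in lookaheads:
                item = (rhs[dot], prod, 0, symbol)
                if item not in result:
                    result.add(item)
                    todo.append(item)
    return frozenset(result)


def goto(state, symbol):
    """Move the position point over `symbol`, keeping the look ahead."""
    moved = {(lhs, rhs, dot + 1, la)
             for (lhs, rhs, dot, la) in state
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved)


if __name__ == "__main__":
    # "$" is a dummy look ahead for the start item (hypothetical marker).
    q0 = closure({("S", ("E", "#"), 0, "$")})
    after_l = goto(q0, "L")
    reduce_la = {la for (_, rhs, dot, la) in after_l if dot == len(rhs)}
    shift_on = {rhs[dot] for (_, rhs, dot, _) in after_l if dot < len(rhs)}
    print(reduce_la, shift_on)  # prints {'#'} {'='}
```

In the state reached from the start state via L, the reduce item has look ahead # while the shift symbol is =, so one symbol of look ahead separates the two actions.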


Construction of LR(1) Automata (2)

LR(1) automaton for $\Gamma_7$. The conflict is resolved, as $\{=\} \cap \{\#\} = \emptyset$.

[Figure: LR(1) automaton for $\Gamma_7$. Each item carries its look ahead set, e.g. $[E \to L.=R, \{\#\}]$ and $[R \to L., \{\#\}]$ in the state that has the conflict in the LR-DFA.]


LALR(1) Automata

• LALR(1) automata are constructed from LR(1) automata by merging states whose items differ only in their look ahead sets. The look ahead sets of equal items are united. The resulting automaton has the same states as the LR-DFA.

• However, the LALR(1) automaton can be constructed directly in a more efficient way.
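The merging step can be sketched as follows. This Python sketch assumes that the LR(1) states are already given as sets of items `(lhs, rhs, dot, lookahead)` with one look ahead symbol per item; the function name `merge_by_core` and this representation are hypothetical, and the sketch shows only the merging, not the more efficient direct construction.

```python
def merge_by_core(lr1_states):
    """Merge LR(1) states whose items differ only in the look ahead.

    Returns a dict that maps each core (a frozenset of LR(0) items) to a
    dict from LR(0) item to the united look ahead set.
    """
    merged = {}
    for state in lr1_states:
        core = frozenset((lhs, rhs, dot) for (lhs, rhs, dot, _) in state)
        lookaheads = merged.setdefault(core, {})
        for (lhs, rhs, dot, la) in state:
            lookaheads.setdefault((lhs, rhs, dot), set()).add(la)
    return merged


if __name__ == "__main__":
    # Two illustrative LR(1) states with the same core [L -> ID.].
    state_a = {("L", ("ID",), 1, "#")}
    state_b = {("L", ("ID",), 1, "#"), ("L", ("ID",), 1, "=")}
    merged = merge_by_core([state_a, state_b])
    print(len(merged))  # prints 1
```

Since the cores are exactly the item sets of the LR-DFA, the merged automaton has one state per LR-DFA state.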


LALR(1) Automata (2)

LALR(1) automaton for $\Gamma_7$.

[Figure: LALR(1) automaton for $\Gamma_7$ with states q0 to q9. The look ahead sets of merged items are united, e.g. $[L \to ID., \{\#, =\}]$ and $[R \to L., \{\#, =\}]$.]


Grammar Classes

[Figure: relationship of the grammar classes. Within the unambiguous grammars, LR(0), SLR(1), LALR(1), LR(1) and LR(k) are shown as nested classes, and LL(0), LL(1) and LL(k) are shown as nested classes. The ambiguous grammars lie outside.]


Literature

Recommended Reading for Bottom-Up Analysis:
• Wilhelm, Maurer: Chapter 8, Sections 8.4.1 - 8.4.5, pp. 353 - 383


Parser Generators

Educational Objectives

• Usage of Parser Generators
• Characteristics of Parser Generators


JavaCUP Parser Generator

• CUP - Constructor of Useful Parsers: http://www2.cs.tum.edu/projects/cup/

• Java-based generator for LALR parsers

• JFlex can be used to generate the corresponding scanner.

• Running JavaCUP:

java -jar java-cup-11a.jar options inputfile


Structure of JavaCUP Specification

package JavaPackageName;
import java_cup.runtime.*;

/* User supplied code for scanner, actions, ... */

/* Terminals (tokens returned by the scanner). */
terminal TerminalDecls;

/* Non-terminals */
non terminal NonTerminalDecls;

/* Precedences */
precedence [left | right | nonassoc] TerminalList;

/* Grammar */
start with non_terminalName;

non_terminalName ::= prod_1 | ... | prod_n ;


Example: JavaCUP Specification for Γ7

import java_cup.runtime.*;

/* Terminals (tokens returned by the scanner). */
terminal ID, EQ, MULT;

/* Non terminals */
non terminal S, E, L, R;

/* The grammar */

start with S;

S ::= E;
E ::= L EQ R | R;
L ::= MULT R | ID;
R ::= L;


Structure of Generated Parser Code

• Output files: parser.java and sym.java

• Tables for the LALR automaton:
  - Production table: provides the symbol number of the left-hand side non-terminal, along with the length of the right-hand side, for each production in the grammar
  - Action table: indicates what action (shift, reduce, or error) is to be taken on each lookahead symbol when encountered in each state
  - Reduce-goto table: indicates which state to shift to after a reduce
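
How the three tables cooperate can be illustrated with a small hand-written driver. The sketch below does not show the code that JavaCUP generates; it assumes a toy grammar (1) S → ( S ), (2) S → x, hand-built tables for it, and an encoding (positive entry = shift, negative entry = reduce) chosen purely for this example. The class name TableDrivenParser is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hand-built tables for the toy grammar  (1) S -> ( S )   (2) S -> x
final class TableDrivenParser {
    static final int ERR = 0;                  // state 0 is never a shift target in this table
    static final int ACC = Integer.MAX_VALUE;  // accept marker

    // Production table: { symbol number of left-hand side, length of right-hand side }.
    static final int[][] PRODUCTION = { null, { 0, 3 }, { 0, 1 } };

    // Action table, columns: '(' ')' 'x' end-of-input.
    // n > 0 means "shift and go to state n", -p means "reduce by production p".
    static final int[][] ACTION = {
        { 2,   ERR, 3,   ERR },  // state 0: start
        { ERR, ERR, ERR, ACC },  // state 1: S recognised
        { 2,   ERR, 3,   ERR },  // state 2: after '('
        { ERR, -2,  ERR, -2  },  // state 3: S -> x .
        { ERR, 5,   ERR, ERR },  // state 4: after '(' S
        { ERR, -1,  ERR, -1  },  // state 5: S -> ( S ) .
    };

    // Reduce-goto table: GOTO[state][non-terminal], -1 where undefined.
    static final int[][] GOTO = { { 1 }, { -1 }, { 4 }, { -1 }, { -1 }, { -1 } };

    static int column(char c) {
        switch (c) {
            case '(': return 0;
            case ')': return 1;
            case 'x': return 2;
            default:  return -1;   // unknown character
        }
    }

    static boolean parse(String input) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(0);
        int pos = 0;
        while (true) {
            int la = pos < input.length() ? column(input.charAt(pos)) : 3;
            if (la < 0) return false;
            int act = ACTION[stack.peek()][la];
            if (act == ACC) return true;
            if (act == ERR) return false;
            if (act > 0) {                       // shift
                stack.push(act);
                pos++;
            } else {                             // reduce by production -act
                int[] prod = PRODUCTION[-act];
                for (int i = 0; i < prod[1]; i++) stack.pop();
                stack.push(GOTO[stack.peek()][prod[0]]);
            }
        }
    }
}
```

On a reduce, the production table says how many states to pop and which non-terminal was recognised; the reduce-goto table then supplies the state to continue in.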


Usage of Generated Parser

• The parser calls the scanner via its scan() method whenever a new terminal is needed.

• Initialising the parser with a new scanner:

parser parser_obj = new parser(new my_scanner());

• Using the parser:

Symbol parse_tree = parser_obj.parse();


Error Handling

Error Handling

Educational Objectives:

• Problems and Principles of Error Handling
• Techniques of Error Handling for Context-Free Analysis


Principles of Error Handling

Error handling is required in all analysis phases and at runtime. One distinguishes:

• lexical errors
• parse errors (in context-free analysis)
• errors in name and type analysis
• runtime errors (cannot be avoided in most cases)
• logical errors (behavioural errors)

The first 2 (3) kinds of errors are syntactic errors. We only consider error handling in context-free analysis.

The specification of error handling basically follows from the language specification.


Requirements for error handling

• Errors should be localized as exactly as possible. (Problem: an error is not always detected at the position where it occurs.)

• As many errors as possible should be detected at once, but only real errors and no consequential errors.

• Errors are not always unique, i.e. it is not clear in general how to correct an error: class int { Int a; .... } or int a = 1-;

• Error handling should not slow down the analysis of correct programs.

Therefore, error handling is non-trivial and depends on the source language to be analysed.


Error Handling in Context-Free Analysis

1. Panic Error Handling
Mark synchronizing terminal symbols, e.g. end or ;

If the parser reaches an error state, all symbols up to the next synchronizing symbol are skipped and the stack is corrected as if the production with the synchronizing symbol had been read correctly.

  - Pros: easy to implement, termination guaranteed
  - Cons: large parts of the program can be skipped or misinterpreted
  - Example: incorrect input a := b *** c;
    Read until ; then correct the stack and resume as if the statement had been accepted.
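
A minimal sketch of panic-mode recovery, assuming a toy language whose statements all have the form id = id ; and a hand-written parser instead of a generated one. The class PanicModeDemo and the token spelling are made up for this illustration; ; serves as the synchronizing symbol.

```java
import java.util.ArrayList;
import java.util.List;

// Toy parser for statements of the form  id = id ;  with panic-mode recovery.
final class PanicModeDemo {

    // Returns the token positions at which an error was detected.
    static List<Integer> parse(List<String> tokens) {
        String[] expected = { "id", "=", "id", ";" };
        List<Integer> errors = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.size()) {
            boolean ok = true;
            for (String want : expected) {
                if (pos < tokens.size() && tokens.get(pos).equals(want)) {
                    pos++;
                } else {
                    ok = false;
                    break;
                }
            }
            if (!ok) {
                errors.add(pos);
                // Panic mode: skip everything up to and including the next ';',
                // then continue as if the statement had been accepted.
                while (pos < tokens.size() && !tokens.get(pos).equals(";")) pos++;
                if (pos < tokens.size()) pos++;
            }
        }
        return errors;
    }
}
```

Because every recovery step consumes at least one token or reaches the end of the input, the loop always terminates, which matches the "termination guaranteed" property listed above.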


Error Handling in Context-Free Analysis (2)

2. Error Productions
Extend the grammar with productions describing typical error situations, so-called error productions. Error messages can be directly associated with error productions.

  - Pros: easy to implement, termination guaranteed
  - Cons: the extended grammar can belong to a more general grammar class; knowledge of typical error situations is necessary
  - Example: typical error in PASCAL
    if ... then A := E; else ...
    Error production:
    Stmt → if Expr then Stmt′ ; else Stmt′


Error Handling in Context-Free Analysis (3)

3. Production-Local Error Correction
The goal is a local correction of the input such that the analysis can be resumed. Local means that the parser tries to correct the input for the current production.

  - Pros: flexible and powerful technique
  - Cons: problematic if errors occur earlier than they can be detected; correction operations can lead to a non-terminating analysis


Error Handling in Context-Free Analysis (4)

4. Global Error Correction
Attempt to find a correction that is as good as possible by altering the input already read or the lookahead input.

Idea: Define a distance or quality measure on inputs. For each incorrect input, look for a syntactically correct input that is best according to the measure used.

  - Pros: very powerful technique
  - Cons: analysis effort can be rather high; implementation is complex and poses a risk of non-termination


Error Handling in Context-Free Analysis (5)

5. Interactive Error Correction
For modern programming languages, syntactic analysis is often already supported by editors. In this case, the editor marks error positions.

  - Pros: quick feedback, possible error positions are shown directly, interaction with the programmer is possible
  - Cons: editing can be disturbed, the analysis must be able to handle incomplete programs

The presented techniques can also be combined. The syntax of the programming language is important for selecting a technique. Error handling also depends on the grammar class and on the implementation technique used for the parser.


Burke-Fischer Error Handling

Example of a global error correction technique

• Procedure: Use a correction window of n symbols before the symbol at which the error was detected. Check all possible variations of the symbol sequence in the correction window that can be obtained by insertion, exchange or modification of a symbol at any position.

• Quality measure: Choose the variation that allows the longest continuation of the parsing procedure.

• Implementation: Work with two stack automata: one represents the configuration at the beginning of the correction window, the other the configuration at the end of the correction window. In case of an error, the automaton running behind can be used to resume at the old position and to test the computed variations.
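
The idea of trying variations inside a correction window and keeping the one that parses furthest can be sketched as follows. This is a strongly simplified illustration, not the Burke-Fischer algorithm itself: the single-token edits are assumed to be insertion, deletion and replacement, the "parser" is a toy check for statements of the form id = id ;, and each variation is re-checked from the start of the input instead of using two stack automata. All names (WindowRepair, parsedPrefix, repair) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WindowRepair {
    static final List<String> VOCABULARY = Arrays.asList("id", "=", ";");
    static final String[] PATTERN = { "id", "=", "id", ";" };

    // Toy stand-in for the parser: number of tokens accepted before the first error.
    static int parsedPrefix(List<String> tokens) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(PATTERN[i % PATTERN.length])) return i;
        }
        return tokens.size();
    }

    // Tries every single-token edit inside the window and keeps the variation
    // that allows the longest continuation of parsing.
    static List<String> repair(List<String> tokens, int window) {
        int error = parsedPrefix(tokens);
        if (error == tokens.size()) return tokens;      // nothing to repair
        List<String> best = tokens;
        int bestScore = error;
        for (int p = Math.max(0, error - window); p <= error; p++) {
            List<List<String>> candidates = new ArrayList<>();
            List<String> deleted = new ArrayList<>(tokens);
            deleted.remove(p);                          // delete the token at position p
            candidates.add(deleted);
            for (String t : VOCABULARY) {
                List<String> inserted = new ArrayList<>(tokens);
                inserted.add(p, t);                     // insert t before position p
                candidates.add(inserted);
                List<String> replaced = new ArrayList<>(tokens);
                replaced.set(p, t);                     // replace the token at position p by t
                candidates.add(replaced);
            }
            for (List<String> c : candidates) {
                int score = parsedPrefix(c);
                if (score > bestScore) {
                    best = c;
                    bestScore = score;
                }
            }
        }
        return best;
    }
}
```

For the input id = id id = id ; with a window of 2, the variation that inserts the missing ; after the first statement lets the toy parser consume the whole input and is therefore selected.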


Literature

Recommended Reading: Wilhelm, Maurer: Chapter 8, Sections 8.3.6 and 8.4.6 (general understanding sufficient)


Concrete and Abstract Syntax

Concrete and Abstract Syntax

Educational Objectives

• Connection of parsing to other phases of program processing and translation
• Differences between abstract and concrete syntax
• Language concepts for describing syntax trees
• Syntax tree construction


Connection of Parsers to other Phases

1. Parser directly controls following phases
2. Concrete Syntax Tree as Interface
3. Abstract Syntax Tree as Interface


Direct Control by Parser

• Example: Recursive descent: the parser calls other actions after each derivation/reduction step

• Pros:
  - simple (if realisable)
  - flexible
  - efficient (especially memory efficient)

• Cons:
  - non-modular, no clear interfaces
  - not suitable for global aspects of translation
  - following phases depend on parsing
  - cannot be used with every parser generator


Abstract Syntax vs. Concrete Syntax

Definition (Concrete Syntax)
The concrete syntax of a programming language determines the actual text representation of the programs (incl. keywords, separators).
If Γ is the CFG used for parsing a program P in a certain language, the syntax tree of P according to Γ is the concrete syntax tree of P.

Definition (Abstract Syntax)
The abstract syntax of a programming language describes the tree structure of programs in a form that is sufficient and suitable for further processing.
A tree representing a program P according to the abstract syntax of a language is called an abstract syntax tree of P.


Abstract Syntax

• abstraction from keywords and separators
• operator precedences are represented in the tree structure (different non-terminals are not necessary)
• better incorporation of symbol information
• simplifying transformations

Remarks:
• The abstract syntax of a language is often not specified in the language report.
• The abstract syntax usually also comprises information about source code positions.


Example: Concrete vs. Abstract Syntax

Concrete Syntax: Γ2

• S → E#
• E → T + E | T
• T → F * T | F
• F → (E) | ID

Abstract Syntax
• Exp = Add | Mult | Ident
• Add (Exp left, Exp right)
• Mult (Exp left, Exp right)


Example: Concrete vs. Abstract Syntax (2)

Text: (a + b) * c

[Figure: concrete syntax tree of ( ID + ID ) * ID for the identifiers a, b, c (root S with children E and #), shown next to the abstract syntax tree Mult(Add(a, b), c). Source: A. Poetzsch-Heffter, TU Kaiserslautern, 07.05.2007]


Concrete Syntax Tree as Interface

Token Stream → Parser (with Tree Construction) → Concrete Syntax Tree → Further Language Processing

• Counters disadvantages of direct control by parser
• Advantages over Abstract Syntax
  - No additional specification of abstract syntax required
  - Tree construction does not have to be described.
  - Tree construction can be done automatically by parser generators.


Abstract Syntax Tree as Interface

Token Stream → Parser (with Transforming Tree Construction) → Abstract Syntax Tree → Further Language Processing

• Advantages over Concrete Syntax
  - Simpler, more compact tree representation
  - Simplifies later phases
  - Often implemented by programming or specification language as mutable data structure


Abstract Syntax: Specification and Tree Construction

• For representing abstract syntax trees, we use order-sorted terms.
• The sets and types of these terms are described by type declarations.


Order-sorted Data Types

Definition
Order-sorted data types are specified by declarations of the following form:

• Variant type declarations: V = V0 | V1 | ... | Vm
• Tuple type declarations: T (T1 sel1, ..., Tn seln)
• List type declarations: L + S

Example:
• Exp = Add | Mult | Ident
• Add (Exp left, Exp right)
• Mult (Exp left, Exp right)

where Ident is a predefined type.


Order-sorted Data Types (2)

Definition (Order-sorted Types - contd.)
Order-sorted terms are recursively defined as follows:

• If t is a term of type Vi, then it is also of type V.
• If ti is a term of type Ti for each i, then T (t1, ..., tn) is of type T; T is also the constructor.
• If s1, ..., sk are terms of type S, then L(s1, ..., sk) is of type L; L is also the list constructor.

Additional operators:
• the selectors selk : T → Tk, which return the k-th subterm of a tuple
• the usual list operations (rest, append, conc, ...)


Order-sorted Data Types (3)

Remarks: Order-sorted data types
• generalise data types of functional languages by subtyping; a term can belong to several types
• are used in specification languages, e.g. OBJ3, MAX, ...
• are a very compact form of type declaration
• can be implemented in OO languages


Example: Order-sorted Data Types and OO Types

Declaration of order-sorted data types:
• Exp = Add | Mult | Neg | Ident
• Add (Exp left, Exp right)
• Mult (Exp left, Exp right)
• Neg (Exp val)


Implementation in Java

interface Exp {
  Exp left() throws IllegalSelectException;
  Exp right() throws IllegalSelectException;
  Exp val() throws IllegalSelectException;
}

class Add implements Exp {
  private Exp left;
  private Exp right;

  Add( Exp l, Exp r ) { left = l; right = r; }

  // Interface methods are implicitly public, so the implementations must be public as well.
  public Exp left() { return left; }
  public Exp right() { return right; }
  public Exp val() throws IllegalSelectException {
    throw new IllegalSelectException();
  }
}


Implementation in Java (2)

class Mult implements Exp {
  // analogous to Add
}

class Neg implements Exp {
  private Exp val;

  Neg( Exp v ) { val = v; }

  public Exp left() throws IllegalSelectException {
    throw new IllegalSelectException();
  }
  public Exp right() throws IllegalSelectException {
    throw new IllegalSelectException();
  }
  public Exp val() { return val; }
}


Implementation in Java (3)

// In Java, the extends clause has to precede the implements clause.
class Ident extends PredefIdent implements Exp {

  public Exp left() throws IllegalSelectException {
    throw new IllegalSelectException();
  }
  public Exp right() throws IllegalSelectException {
    throw new IllegalSelectException();
  }
  public Exp val() throws IllegalSelectException {
    throw new IllegalSelectException();
  }
}
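
As a possible variation of the classes above, the failing selectors can be written once as default methods of the interface, so that each class only overrides the selectors it actually supports. The sketch assumes Java 8 or later, makes IllegalSelectException an unchecked exception, and replaces PredefIdent by a plain name field; these choices and the helper AstDemo.show are assumptions of this example, not part of the slides.

```java
// Assumption: an unchecked variant of the exception used above.
final class IllegalSelectException extends RuntimeException {}

interface Exp {
    // By default every selector fails; each variant overrides the ones it supports.
    default Exp left()  { throw new IllegalSelectException(); }
    default Exp right() { throw new IllegalSelectException(); }
    default Exp val()   { throw new IllegalSelectException(); }
}

final class Ident implements Exp {
    final String name;
    Ident(String name) { this.name = name; }
}

final class Add implements Exp {
    private final Exp left, right;
    Add(Exp l, Exp r) { left = l; right = r; }
    @Override public Exp left()  { return left; }
    @Override public Exp right() { return right; }
}

final class Mult implements Exp {
    private final Exp left, right;
    Mult(Exp l, Exp r) { left = l; right = r; }
    @Override public Exp left()  { return left; }
    @Override public Exp right() { return right; }
}

final class Neg implements Exp {
    private final Exp val;
    Neg(Exp v) { val = v; }
    @Override public Exp val() { return val; }
}

final class AstDemo {
    // Renders a term in constructor notation, using only the selectors.
    static String show(Exp e) {
        if (e instanceof Ident) return ((Ident) e).name;
        if (e instanceof Neg) return "Neg(" + show(e.val()) + ")";
        String ctor = (e instanceof Add) ? "Add" : "Mult";
        return ctor + "(" + show(e.left()) + "," + show(e.right()) + ")";
    }
}
```

With these classes, the term for (a + b) * c is built as new Mult(new Add(new Ident("a"), new Ident("b")), new Ident("c")).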


Transformation of Concrete to Abstract Syntax

[Figure: concrete syntax tree of ( ID + ID ) * ID for the identifiers a, b, c; the nodes for + and * are annotated with the constructors Add(_,_) and Mult(_,_) of the abstract syntax. Source: A. Poetzsch-Heffter, TU Kaiserslautern, 07.05.2007]


Transformation to Abstract Syntax with JavaCUP

S ::= E:e # {: RESULT = e; :} ;

E ::= E:e '+' T:t {: RESULT = ADD(e,t); :}
    | T:t {: RESULT = t; :} ;

T ::= T:t '*' F:f {: RESULT = MULT(t,f); :}
    | F:f {: RESULT = f; :} ;

F ::= '(' E:e ')' {: RESULT = e; :}
    | ID:i {: RESULT = i; :} ;
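
What the semantic actions above compute can be imitated without a parser generator. The following sketch uses a hand-written recursive-descent parser instead of a generated LALR parser; the left-recursive rules are written as loops, which yields the same left-associative trees. The classes Tree and AstBuilder and the restriction to single-letter identifiers are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Minimal generic tree node, printed in constructor notation.
final class Tree {
    final String label;
    final List<Tree> kids;
    Tree(String label, Tree... kids) {
        this.label = label;
        this.kids = Arrays.asList(kids);
    }
    @Override public String toString() {
        if (kids.isEmpty()) return label;
        return label + kids.stream().map(Tree::toString).collect(Collectors.joining(",", "(", ")"));
    }
}

final class AstBuilder {
    private final String in;
    private int pos = 0;

    private AstBuilder(String text) { this.in = text.replace(" ", ""); }

    static Tree parse(String text) {
        AstBuilder p = new AstBuilder(text);
        Tree t = p.expr();
        if (p.pos != p.in.length()) throw new IllegalArgumentException("unexpected input at " + p.pos);
        return t;
    }

    // '#' marks the end of the input, as in the grammar above.
    private char peek() { return pos < in.length() ? in.charAt(pos) : '#'; }

    // E ::= E '+' T | T   (RESULT = Add(e,t) or t)
    private Tree expr() {
        Tree t = term();
        while (peek() == '+') { pos++; t = new Tree("Add", t, term()); }
        return t;
    }

    // T ::= T '*' F | F   (RESULT = Mult(t,f) or f)
    private Tree term() {
        Tree t = factor();
        while (peek() == '*') { pos++; t = new Tree("Mult", t, factor()); }
        return t;
    }

    // F ::= '(' E ')' | ID   (parentheses leave no trace in the abstract syntax)
    private Tree factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            Tree t = expr();
            if (peek() != ')') throw new IllegalArgumentException("expected ')' at " + pos);
            pos++;
            return t;
        }
        if (Character.isLetter(c)) { pos++; return new Tree(String.valueOf(c)); }
        throw new IllegalArgumentException("unexpected input at " + pos);
    }
}
```

Calling AstBuilder.parse("(a + b) * c") builds the tree Mult(Add(a,b),c): keywords, parentheses and the chain of E, T and F nodes from the concrete syntax tree no longer appear.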


Recommended Reading

• Wilhelm, Maurer: Section 9.1, pp. 406 + 407
• Appel: Chapter 4, pp. 89 - 105
