
TMPA-2017: Extended Context-Free Grammars Parsing with Generalized LL

Date post: 05-Apr-2017

Extended Context-Free Grammars Parsing with Generalized LL

Author: Artem Gorokhov

Saint Petersburg University

Programming Languages and Tools Lab, JetBrains

March 4, 2017

Artem Gorokhov (SPbU) March 4,2017 1 / 15


Extended Context-Free Grammar

S = a M*
M = a? (B K)+ | u B
B = c | ε



Existing solutions

ANTLR, Yacc, Bison
- Can't use ECFG without transformation
- Admit only a subclass of context-free languages (LL(k), LR(k))

Some research on ECFG parsing
- No tools
- LL(k), LR(k)

Generalized LL
- Admits arbitrary CFGs (including ambiguous ones)
- Can't use ECFG without transformation

Automata and ECFGs

Grammar G0

S = a* S b? | c

=⇒ RA for grammar G0

[Figure: recursive automaton (RA) for grammar G0, with transitions labelled a, b, c, S and ε]
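To make the idea of a recursive automaton concrete, here is a minimal sketch of a recognizer that runs an RA for G0 over an input string. The state numbering, the ε-free transition table and all function names are assumptions made for illustration; they are not taken from the slides, and the fixpoint loop is a simple stand-in rather than the GLL algorithm of the talk. A nonterminal edge is followed by reusing the spans already known to be derivable, which also copes with the left recursion hidden in a* S (a* may be empty).

```python
from collections import defaultdict

# Hypothetical epsilon-free encoding of an RA for G0 (S = a* S b? | c):
# state -> list of (label, target state). State numbers are assumptions.
TRANSITIONS = {
    0: [("a", 1), ("S", 2), ("c", 3)],
    1: [("a", 1), ("S", 2)],
    2: [("b", 3)],
    3: [],
}
START = {"S": 0}        # start state of each nonterminal
FINAL = {2, 3}          # accepting states
NONTERMINALS = {"S"}


def derivable_spans(text):
    """Return {nonterminal: set of (i, j)} such that text[i:j] is derivable."""
    spans = defaultdict(set)
    changed = True
    while changed:                          # repeat until no new span appears
        changed = False
        for nt, start in START.items():
            for i in range(len(text) + 1):
                # Explore every (state, position) pair reachable from (start, i).
                seen = {(start, i)}
                todo = [(start, i)]
                while todo:
                    state, pos = todo.pop()
                    if state in FINAL and (i, pos) not in spans[nt]:
                        spans[nt].add((i, pos))
                        changed = True
                    for label, target in TRANSITIONS[state]:
                        if label in NONTERMINALS:
                            # Nonterminal edge: jump over each span known so far.
                            ends = [j for (k, j) in spans[label] if k == pos]
                        elif pos < len(text) and text[pos] == label:
                            ends = [pos + 1]    # terminal edge matches the input
                        else:
                            ends = []
                        for end in ends:
                            if (target, end) not in seen:
                                seen.add((target, end))
                                todo.append((target, end))
    return spans


def accepts(text):
    """True if the whole input is derivable from S."""
    return (0, len(text)) in derivable_spans(text)["S"]
```

For example, accepts("aacb") holds for the input used on the following slides, while accepts("ab") does not.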

Recursive Automata Minimization

Grammar G1

S = K K K K K K | K a K K K K
K = S K | a K | a

Automaton for G1

[Figure: recursive automaton for G1, with transitions labelled a, K and S]

Minimized automaton for G1

[Figure: minimized recursive automaton for G1, with transitions labelled a, K and S]

Derivation Trees for Recursive Automata

Input: aacb

Automaton:

[Figure: recursive automaton with transitions labelled a, b, c and S]

Derivation trees (each node is: label, left position, right position):

Tree 1:
S,0,4
  a,0,1
  a,1,2
  S,2,3
    c,2,3
  b,3,4

Tree 2:
S,0,4
  a,0,1
  S,1,3
    a,1,2
    S,2,3
      c,2,3
  b,3,4

Tree 3:
S,0,4
  a,0,1
  S,1,4
    a,1,2
    S,2,3
      c,2,3
    b,3,4

SPPF for Recursive Automata

Input: aacb

Automaton:

[Figure: recursive automaton with transitions labelled a, b, c and S]

Shared Packed Parse Forest:

[Figure: SPPF with nodes S,0,4; S,1,4; S,1,3; S,2,3; a,0,1; a,1,2; c,2,3; b,3,4; 3,0,3; 3,1,3; 2,0,2]
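The saving that an SPPF gives over separate derivation trees can be illustrated by counting nodes. The sketch below lists the symbol nodes of the three derivation trees for aacb as (label, left, right) triples and merges equal triples into one shared node. The variable and function names are assumptions for illustration, and the sketch covers only the sharing of symbol nodes: a real SPPF also contains packed and intermediate nodes (such as 3,1,3 in the figure), which are not modelled here.

```python
# Symbol nodes of the three derivation trees for "aacb",
# written as (label, left position, right position).
TREES = [
    [("S", 0, 4), ("a", 0, 1), ("a", 1, 2), ("S", 2, 3), ("c", 2, 3), ("b", 3, 4)],
    [("S", 0, 4), ("a", 0, 1), ("S", 1, 3), ("a", 1, 2), ("S", 2, 3), ("c", 2, 3),
     ("b", 3, 4)],
    [("S", 0, 4), ("a", 0, 1), ("S", 1, 4), ("a", 1, 2), ("S", 2, 3), ("c", 2, 3),
     ("b", 3, 4)],
]


def shared_nodes(trees):
    """Merge nodes with the same (label, left, right) triple into one node."""
    pool = set()
    for tree in trees:
        pool.update(tree)
    return pool


separate_total = sum(len(tree) for tree in TREES)   # nodes if trees are kept apart
shared_total = len(shared_nodes(TREES))             # nodes after sharing
print(separate_total, shared_total)
```

Kept apart, the three trees need 20 nodes; with sharing, 8 distinct symbol nodes remain.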



Input processing

Descriptors queue

A descriptor (G, i, U, T) uniquely defines the state of the parsing process:
- G: position in the grammar; for a recursive automaton, a state of the RA
- i: position in the input
- U: stack node
- T: current parse forest root
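A descriptor queue can be sketched as a worklist plus a set of descriptors that have already been created. Since a descriptor fully defines a parsing state, handling the same one twice would only repeat work. The class and field names below are assumptions for illustration, and FIFO order is an arbitrary choice; the slides do not fix a processing order.

```python
from collections import deque, namedtuple

# Hypothetical field names for the four descriptor components (G, i, U, T).
Descriptor = namedtuple("Descriptor", ["state", "pos", "stack_node", "sppf_root"])


class DescriptorQueue:
    """Worklist of descriptors in which each descriptor is enqueued at most once."""

    def __init__(self):
        self._pending = deque()
        self._seen = set()

    def add(self, descriptor):
        """Enqueue the descriptor unless it was created before; return True if added."""
        if descriptor in self._seen:
            return False
        self._seen.add(descriptor)
        self._pending.append(descriptor)
        return True

    def pop(self):
        """Remove and return the oldest pending descriptor."""
        return self._pending.popleft()

    def __bool__(self):
        return bool(self._pending)


queue = DescriptorQueue()
queue.add(Descriptor(state=0, pos=0, stack_node="u0", sppf_root=None))
queue.add(Descriptor(state=0, pos=0, stack_node="u0", sppf_root=None))  # ignored
```

A parser loop would then pop descriptors while the queue is non-empty and add new ones as it advances through the automaton and the input.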

Input processing

Input: bc

Grammar:

S = (a | b | S) c?

=⇒

S = a C_opt | b C_opt | S C_opt
C_opt = ε | c

Steps (∙ marks the current position in the input and in the grammar):

1. Input ∙bc, grammar position S = ∙ a C_opt. Descriptor (S = ∙ a C_opt, 0, ..., ...) is added to the queue.
2. Grammar position S = ∙ b C_opt. Descriptor (S = ∙ b C_opt, 0, ..., ...) is added to the queue.
3. Grammar position S = ∙ S C_opt. Descriptor (S = ∙ S C_opt, 0, ..., ...) is added to the queue.
4. Input ∙bc, grammar position S = ∙ a C_opt.
5. Input ∙bc, grammar position S = ∙ b C_opt.
6. Input b∙c, grammar position S = b ∙ C_opt. SPPF node b,0,1 is created.
7. Grammar position C_opt = ∙ ε. Descriptor (C_opt = ∙ε, 1, ..., ...) is added to the queue.
8. Grammar position C_opt = ∙ c. Descriptor (C_opt = ∙c, 1, ..., ...) is added to the queue.
9. Input b∙c, grammar position C_opt = ∙ c.
10. Input bc∙, grammar position C_opt = c ∙. SPPF nodes c,1,2 and C_opt,1,2 are created.

Input processing

Input: bc

Automaton:

[Figure: recursive automaton with transitions labelled a, b, c and S]

Steps:

1. Input ∙bc. Descriptors queue: (S, 0, ..., ...).
2. SPPF nodes b,0,1 and S,0,1 are created.

Evaluation

Grammar G1

S = K K K K K K | K a K K K K
K = S K | a K | a

RA for grammar G1

[Figure: minimized recursive automaton for G1, with transitions labelled a, K and S]

Experiment results for input a^40

            Memory usage                               Time, sec
            Descriptors   Stack Edges   SPPF Nodes
Grammar     7,940         6,974         111,127,244    81
RA          5,830         4,234         74,292,078     54
Ratio       27%           39%           33%            35%
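The slide does not state how the Ratio row is computed. A minimal sketch, assuming it is the relative reduction (Grammar - RA) / Grammar, reproduces the three memory ratios from the table. The function and variable names are assumptions for illustration. Applied to the rounded times shown (81 and 54 s), the same formula gives about 33%, slightly below the 35% on the slide, which may have been computed from unrounded timings.

```python
def reduction_percent(grammar_value, ra_value):
    """Relative reduction of the RA figure against the grammar figure, in percent."""
    return round(100 * (grammar_value - ra_value) / grammar_value)


# (Grammar, RA) pairs copied from the results table for input a^40.
RESULTS = {
    "Descriptors": (7_940, 5_830),
    "Stack Edges": (6_974, 4_234),
    "SPPF Nodes": (111_127_244, 74_292_078),
}

for name, (grammar_value, ra_value) in RESULTS.items():
    print(f"{name}: {reduction_percent(grammar_value, ra_value)}%")
```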

Applicability

Graph parsing: all input strings in one graph

abcd, abfd =⇒ [Figure: a single graph with edges labelled a, b, c, d and f]

Graph parsing results

            Memory usage                               Time, min
            Descriptors   Stack Edges   Stack Nodes
Grammar     21,134,080    7,482,789     2,731,529      02.26
RA          9,153,352     2,792,330     839,148        01.25
Ratio       57%           63%           69%            45%

