Date post: 17-May-2018
NLP Programming Tutorial 8 - Phrase Structure Parsing

Graham Neubig
Nara Institute of Science and Technology (NAIST)

Interpreting Language is Hard!

I saw a girl with a telescope

● “Parsing” resolves structural ambiguity in a formal way

Two Types of Parsing

● Dependency: focuses on relations between words

[Figure: dependency arcs over “I saw a girl with a telescope”]

● Phrase structure: focuses on identifying phrases and their recursive structure

[Figure: phrase structure tree over “I saw a girl with a telescope” with POS tags PRP VBD DT NN IN DT NN and phrase nodes NP, NP, NP, PP, VP, S]

(S (NP (PRP I)) (VP (VBD saw) (NP (DT a) (NN girl)) (PP (IN with) (NP (DT a) (NN telescope)))))

Recursive Structure?

[Figure: the phrase structure tree for “I saw a girl with a telescope” (POS tags PRP VBD DT NN IN DT NN; phrase nodes NP, NP, NP, PP, VP, S), shown over five animation steps. The last three steps add a node marked “???”.]

Different Structure, Different Interpretation

[Figure: “I saw a girl with a telescope” (POS tags PRP VBD DT NN IN DT NN; phrase nodes NP, NP, NP, PP, VP, S) with a second possible structure, shown over four animation steps. An additional NP node is marked “???”.]

Non-Terminals, Pre-Terminals, Terminals

[Figure: the parse tree of “I saw a girl with a telescope” with its three layers labeled]

● Non-Terminal: the phrase labels (S, VP, NP, PP)

● Pre-Terminal: the part-of-speech tags (PRP VBD DT NN IN DT NN)

● Terminal: the words (I saw a girl with a telescope)

Parsing as a Prediction Problem

● Given a sentence X, predict its parse tree Y

● A type of “structured” prediction (similar to POS tagging, word segmentation, etc.)

[Figure: X = “I saw a girl with a telescope”; Y = its parse tree with POS tags PRP VBD DT NN IN DT NN and phrase nodes NP, NP, NP, PP, VP, S]

Probabilistic Model for Parsing

● Given a sentence X, predict the most probable parse tree Y

$$\operatorname*{argmax}_{Y} P(Y \mid X)$$

[Figure: the parse tree of “I saw a girl with a telescope” with POS tags PRP VBD DT NN IN DT NN and phrase nodes NP, NP, NP, PP, VP, S]

Probabilistic Generative Model

● We assume some probabilistic model generated the parse tree Y and sentence X jointly

$$P(Y, X)$$

● The parse tree with highest joint probability given X also has the highest conditional probability, because P(Y∣X) = P(Y, X) / P(X) and P(X) does not depend on Y

$$\operatorname*{argmax}_{Y} P(Y \mid X) = \operatorname*{argmax}_{Y} P(Y, X)$$

Probabilistic Context Free Grammar (PCFG)

● How do we define a joint probability for a parse tree?

● PCFG: Define probability for each node

[Figure: the parse tree of “I saw a girl with a telescope” with nodes annotated by rule probabilities such as P(S → NP VP), P(PRP → “I”), P(VP → VBD NP PP), P(PP → IN NP), P(NP → DT NN), P(NN → “telescope”)]

● Parse tree probability is product of node probabilities

P(S → NP VP) * P(NP → PRP) * P(PRP → “I”) * P(VP → VBD NP PP) * P(VBD → “saw”) * P(NP → DT NN) * P(DT → “a”) * P(NN → “girl”) * P(PP → IN NP) * P(IN → “with”) * P(NP → DT NN) * P(DT → “a”) * P(NN → “telescope”)
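
The product of node probabilities can be sketched in a few lines of Python. This is only an illustration: the nested-tuple tree representation, the names `RULE_PROB`, `rules_of`, and `tree_log_prob`, and every probability value below are assumptions made for the example, not part of the tutorial. Log probabilities are summed instead of multiplying raw probabilities, which avoids underflow on longer sentences.

```python
import math

# Hypothetical rule probabilities, chosen only for illustration.
RULE_PROB = {
    ("S", ("NP", "VP")): 0.8,
    ("NP", ("PRP",)): 0.1,
    ("PRP", ("I",)): 0.4,
    ("VP", ("VBD", "NP", "PP")): 0.6,
    ("VBD", ("saw",)): 0.05,
    ("NP", ("DT", "NN")): 0.5,
    ("DT", ("a",)): 0.6,
    ("NN", ("girl",)): 0.2,
    ("PP", ("IN", "NP")): 1.0,
    ("IN", ("with",)): 0.3,
    ("NN", ("telescope",)): 0.1,
}

# A tree is (label, child, child, ...); a leaf child is a plain word string.
TREE = ("S",
        ("NP", ("PRP", "I")),
        ("VP",
         ("VBD", "saw"),
         ("NP", ("DT", "a"), ("NN", "girl")),
         ("PP", ("IN", "with"),
          ("NP", ("DT", "a"), ("NN", "telescope")))))


def rules_of(tree):
    """Yield one (lhs, rhs) rule for every node of the tree."""
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield (label, rhs)
    for child in children:
        if not isinstance(child, str):
            yield from rules_of(child)


def tree_log_prob(tree):
    """Sum of log rule probabilities = log of the product."""
    return sum(math.log(RULE_PROB[rule]) for rule in rules_of(tree))


log_p = tree_log_prob(TREE)
print(len(list(rules_of(TREE))), "rules, tree probability:", math.exp(log_p))
```

The tree has 13 nodes, matching the 13 factors in the product above.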

Probabilistic Parsing

● Given this model, parsing is the algorithm to find

$$\operatorname*{argmax}_{Y} P(Y, X)$$

● Can we use the Viterbi algorithm as we did before?
  ● Answer: No!
  ● Reason: Parse candidates are not graphs, but hypergraphs.

What is a Hypergraph?

● Let's say we have two parse trees

[Figure: two parse trees of “I saw a girl with a telescope”. Subscripts give the span of word positions that a node covers. Both trees contain the nodes PRP_{0,1}, NP_{0,1}, VBD_{1,2}, DT_{2,3}, NN_{3,4}, IN_{4,5}, DT_{5,6}, NN_{6,7}, NP_{2,4}, NP_{5,7}, PP_{4,7}, VP_{1,7}, S_{0,7}; the second tree also contains NP_{2,7}.]

● Most parts are the same!

● Create a graph with all the shared edges and all the nodes

[Figure: the combined graph, drawn first with the edges of the first tree, then with the edges of the second tree, then with the edges of both trees]

● Two choices!
  ● Choose red, get the first tree
  ● Choose blue, get the second tree

Why a “Hyper”graph?

● The “degree” of an edge is the number of children
  ● Degree 1: PRP_{0,1} → I, VBD_{1,2} → saw
  ● Degree 2: VP_{1,7} → VBD_{1,2} NP_{2,7}
  ● Degree 3: VP_{1,7} → VBD_{1,2} NP_{2,4} PP_{4,7}

● The degree of a hypergraph is the maximum degree of all its edges

● A graph is a hypergraph of degree 1!

[Figure: example graph with nodes 0, 1, 2, 3 and edge weights 2.5, 4.0, 2.3, 2.1, 1.4]
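
One way to picture a hyperedge in code is as a head node plus a tuple of child nodes, with the degree being the number of children. The sketch below is a minimal illustration under that assumption; the `Hyperedge` class, the `(symbol, start, end)` node encoding, and `hypergraph_degree` are hypothetical names, not something the tutorial defines.

```python
from dataclasses import dataclass
from typing import Tuple

# A node is (symbol, start, end), e.g. ("VP", 1, 7) for VP spanning words 1..7.
Node = Tuple[str, int, int]


@dataclass(frozen=True)
class Hyperedge:
    head: Node                # the parent node
    tails: Tuple[Node, ...]   # the children

    @property
    def degree(self) -> int:
        """Degree of an edge = number of children."""
        return len(self.tails)


edges = [
    Hyperedge(("VP", 1, 7), (("VBD", 1, 2), ("NP", 2, 7))),                # degree 2
    Hyperedge(("VP", 1, 7), (("VBD", 1, 2), ("NP", 2, 4), ("PP", 4, 7))),  # degree 3
    Hyperedge(("NP", 2, 7), (("NP", 2, 4), ("PP", 4, 7))),                 # degree 2
    Hyperedge(("NP", 0, 1), (("PRP", 0, 1),)),                             # degree 1
]


def hypergraph_degree(edges):
    """Degree of a hypergraph = maximum degree over its edges."""
    return max(edge.degree for edge in edges)


print([edge.degree for edge in edges], hypergraph_degree(edges))
```

An ordinary graph is the special case in which every `tails` tuple has exactly one element.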

Weighted Hypergraphs

● Like graphs:
  ● can add weights to hypergraph edges
  ● use negative log probability of rule

[Figure: the hypergraph over “I saw a girl with a telescope” (nodes PRP_{0,1}, NP_{0,1}, VBD_{1,2}, DT_{2,3}, NN_{3,4}, IN_{4,5}, DT_{5,6}, NN_{6,7}, NP_{2,4}, NP_{5,7}, PP_{4,7}, NP_{2,7}, VP_{1,7}, S_{0,7}) with weighted edges, for example:]

-log(P(S → NP VP))

-log(P(VP → VBD NP PP))

-log(P(PRP → “I”))

-log(P(VP → VBD NP))
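
A short sketch of why negative log probabilities work as edge weights: multiplying rule probabilities corresponds to adding weights, and a more probable rule gets a smaller weight, so the most probable tree becomes the minimum-weight one. The three probabilities are the ones listed in the example grammar later in these slides; the dictionary layout and variable names are assumptions for illustration.

```python
import math

# Rule probabilities as listed in the example grammar later in these slides.
probs = {
    "S -> NP VP": 0.8,
    "VP -> VBD NP PP": 0.6,
    "VP -> VBD NP": 0.4,
}

# Edge weight = negative log probability of the rule.
weights = {rule: -math.log(p) for rule, p in probs.items()}

# Multiplying probabilities corresponds to adding weights.
product = probs["S -> NP VP"] * probs["VP -> VBD NP PP"]
total = weights["S -> NP VP"] + weights["VP -> VBD NP PP"]

for rule, weight in weights.items():
    print(f"{rule}: {weight:.3f}")
print("sum of weights equals -log(product):", math.isclose(-math.log(product), total))
```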

Solving Hypergraphs

● Parsing = finding minimum path through a hypergraph

● We can do this for graphs with the Viterbi algorithm
  ● Forward: Calculate score of best path to each state
  ● Backward: Recover the best path

● For hypergraphs, almost identical algorithm!
  ● Inside: Calculate score of best subtree for each node
  ● Outside: Recover the best tree

Review: Viterbi Algorithm (Forward Step)

[Figure: graph with nodes 0, 1, 2, 3 and edges e1: 0→1 (2.5), e2: 0→2 (1.4), e3: 1→2 (4.0), e4: 1→3 (2.1), e5: 2→3 (2.3)]

best_score[0] = 0
best_edge[0] = NULL
for each node in the graph except node 0 (ascending order)
    best_score[node] = ∞
    for each incoming edge of node
        score = best_score[edge.prev_node] + edge.score
        if score < best_score[node]
            best_score[node] = score
            best_edge[node] = edge

Example:

[Figure: graph with nodes 0, 1, 2, 3 and edges e1 (2.5), e2 (1.4), e3 (4.0), e4 (2.1), e5 (2.3); the node scores start at 0.0, ∞, ∞, ∞ and are updated step by step]

Initialize:
best_score[0] = 0

Check e1:
score = 0 + 2.5 = 2.5 (< ∞)
best_score[1] = 2.5
best_edge[1] = e1

Check e2:
score = 0 + 1.4 = 1.4 (< ∞)
best_score[2] = 1.4
best_edge[2] = e2

Check e3:
score = 2.5 + 4.0 = 6.5 (> 1.4)
No change!

Check e4:
score = 2.5 + 2.1 = 4.6 (< ∞)
best_score[3] = 4.6
best_edge[3] = e4

Check e5:
score = 1.4 + 2.3 = 3.7 (< 4.6)
best_score[3] = 3.7
best_edge[3] = e5

Result of Forward Step

[Figure: the example graph with the best score at each node: node 0 = 0.0, node 1 = 2.5, node 2 = 1.4, node 3 = 3.7]

best_score = ( 0.0, 2.5, 1.4, 3.7 )

best_edge = ( NULL, e1, e2, e5 )
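
The forward step can be sketched in Python on the example graph. The edge endpoints are read off the worked example (e1 goes from node 0 to node 1, and so on); the `Edge` tuple and the function name `viterbi_forward` are assumptions for illustration. Scores are rounded when printed because floating-point sums such as 1.4 + 2.3 are not exact.

```python
import math
from collections import namedtuple

Edge = namedtuple("Edge", "name prev_node next_node score")

# The example graph: edge name, source node, target node, weight.
EDGES = [
    Edge("e1", 0, 1, 2.5),
    Edge("e2", 0, 2, 1.4),
    Edge("e3", 1, 2, 4.0),
    Edge("e4", 1, 3, 2.1),
    Edge("e5", 2, 3, 2.3),
]


def viterbi_forward(edges, num_nodes):
    """Return the best score and best incoming edge for every node."""
    best_score = [math.inf] * num_nodes
    best_edge = [None] * num_nodes
    best_score[0] = 0.0
    for node in range(1, num_nodes):          # nodes in ascending order
        for edge in edges:
            if edge.next_node != node:        # only incoming edges of this node
                continue
            score = best_score[edge.prev_node] + edge.score
            if score < best_score[node]:
                best_score[node] = score
                best_edge[node] = edge
    return best_score, best_edge


best_score, best_edge = viterbi_forward(EDGES, 4)
print([round(s, 1) for s in best_score])
print([e.name if e else None for e in best_edge])
```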

Review: Viterbi Algorithm (Backward Step)

[Figure: the example graph with best scores 0.0, 2.5, 1.4, 3.7 at nodes 0, 1, 2, 3]

best_path = [ ]
next_edge = best_edge[best_edge.length - 1]
while next_edge != NULL
    add next_edge to best_path
    next_edge = best_edge[next_edge.prev_node]
reverse best_path

Example of Backward Step

[Figure: the example graph with best scores 0.0, 2.5, 1.4, 3.7 at nodes 0, 1, 2, 3]

Initialize:
best_path = []
next_edge = best_edge[3] = e5

Process e5:
best_path = [e5]
next_edge = best_edge[2] = e2

Process e2:
best_path = [e5, e2]
next_edge = best_edge[0] = NULL

Reverse:
best_path = [e2, e5]
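
The backward step is a short loop over back-pointers. The sketch below starts from the `best_edge` table given in the forward-step result; the `Edge` tuple and the function name `viterbi_backward` are assumptions for illustration.

```python
from collections import namedtuple

Edge = namedtuple("Edge", "name prev_node next_node")

e1 = Edge("e1", 0, 1)
e2 = Edge("e2", 0, 2)
e5 = Edge("e5", 2, 3)

# Result of the forward step: best incoming edge for nodes 0..3.
best_edge = [None, e1, e2, e5]


def viterbi_backward(best_edge):
    """Follow back-pointers from the last node, then reverse the path."""
    best_path = []
    next_edge = best_edge[len(best_edge) - 1]
    while next_edge is not None:
        best_path.append(next_edge)
        next_edge = best_edge[next_edge.prev_node]
    best_path.reverse()
    return best_path


path = viterbi_backward(best_edge)
print([edge.name for edge in path])
```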

Inside Step for Hypergraphs:

● Find the score of best subtree of VP_{1,7}

[Figure: VP_{1,7} has two incoming hyperedges: e1 with children VBD_{1,2}, NP_{2,4}, PP_{4,7}, and e2 with children VBD_{1,2}, NP_{2,7}]

score(e1) = -log(P(VP → VBD NP PP)) + best_score[VBD_{1,2}] + best_score[NP_{2,4}] + best_score[PP_{4,7}]

score(e2) = -log(P(VP → VBD NP)) + best_score[VBD_{1,2}] + best_score[NP_{2,7}]

best_edge[VP_{1,7}] = argmin_{e1,e2} score

best_score[VP_{1,7}] = score(best_edge[VP_{1,7}])
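
As a rough sketch, the inside step for a single node scores each incoming hyperedge as rule weight plus the best scores of its children, then keeps the minimum. The child scores below are hypothetical numbers picked for the example; the rule probabilities 0.6 and 0.4 are the ones in the example grammar in these slides. The `(symbol, start, end)` node keys and the helper `edge_score` are assumptions.

```python
import math

# Hypothetical best scores (negative log probabilities) of the child nodes.
best_score = {
    ("VBD", 1, 2): 3.0,
    ("NP", 2, 4): 2.8,
    ("PP", 4, 7): 4.1,
    ("NP", 2, 7): 7.5,
}
best_edge = {}

# Incoming hyperedges of VP(1,7): (name, rule probability, children).
incoming = [
    ("e1", 0.6, [("VBD", 1, 2), ("NP", 2, 4), ("PP", 4, 7)]),  # VP -> VBD NP PP
    ("e2", 0.4, [("VBD", 1, 2), ("NP", 2, 7)]),                # VP -> VBD NP
]


def edge_score(prob, children):
    """Rule weight plus the best score of every child."""
    return -math.log(prob) + sum(best_score[child] for child in children)


# argmin over the incoming edges
scored = [(edge_score(prob, children), name) for name, prob, children in incoming]
score, name = min(scored)

best_edge[("VP", 1, 7)] = name
best_score[("VP", 1, 7)] = score
print(name, round(score, 3))
```

Processing nodes so that children are always scored before their parents (shorter spans first) makes every lookup in `best_score` valid.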

Building Hypergraphs from Grammars

● Ok, we can solve hypergraphs, but what we have is:

A Grammar:
P(S → NP VP) = 0.8
P(S → PRP VP) = 0.2
P(VP → VBD NP PP) = 0.6
P(VP → VBD NP) = 0.4
P(NP → DT NN) = 0.5
P(NP → NN) = 0.5
P(PRP → “I”) = 0.4
P(VBD → “saw”) = 0.05
P(DT → “a”) = 0.6
...

A Sentence:
I saw a girl with a telescope

● How do we build a hypergraph?

CKY Algorithm

● The CKY (Cocke-Kasami-Younger) algorithm creates and solves hypergraphs

● Grammar must be in Chomsky normal form (CNF)
  ● All rules have two non-terminals or one terminal on right

OK (two non-terminals): S → NP VP, S → PRP VP, VP → VBD NP
OK (one terminal): PRP → “I”, VBD → “saw”, DT → “a”
Not OK!: VP → VBD NP PP, NP → NN, NP → PRP

● Can convert rules into CNF

VP → VBD NP PP becomes VP → VBD VP' and VP' → NP PP
NP → PRP + PRP → “I” becomes NP_PRP → “I”
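
The two conversions can be sketched as small Python helpers. This is an illustration only: the function names, the right-branching split, and the naming of extra symbols beyond VP' (VP'', and so on) are assumptions, and rule probabilities are left out to keep the example short.

```python
def binarize(lhs, rhs):
    """Split a rule with 3+ right-hand symbols into binary rules.

    Right-branching: VP -> VBD NP PP becomes VP -> VBD VP' and VP' -> NP PP.
    """
    rules = []
    current, count = lhs, 0
    rhs = list(rhs)
    while len(rhs) > 2:
        count += 1
        new_sym = lhs + "'" * count          # VP', VP'', ... (assumed naming)
        rules.append((current, [rhs[0], new_sym]))
        current, rhs = new_sym, rhs[1:]
    rules.append((current, rhs))
    return rules


def collapse_unary(parent, child, word):
    """Merge parent -> child and child -> word into parent_child -> word."""
    return (parent + "_" + child, [word])


print(binarize("VP", ["VBD", "NP", "PP"]))
print(collapse_unary("NP", "PRP", "I"))
```

Rules that already have two symbols on the right pass through `binarize` unchanged.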

CKY Algorithm

● Start by expanding all rules for terminals with scores

[Figure: chart over “I saw him” with nodes PRP_{0,1}, NP_{0,1}, VP_{1,2}, VBD_{1,2}, PRP_{2,3}, NP_{2,3}; scores shown in the figure: 1.0, 3.2, 1.4, 2.4, 2.6, 0.5]

● Expand all possible nodes for 0,2

S_{0,2}: 0.5 + 3.2 + 1.0 = 4.7
SBAR_{0,2}: 5.3

● Expand all possible nodes for 1,3

VP_{1,3}: 5.0

● Expand all possible nodes for 0,3

S_{0,3}: 5.9
SBAR_{0,3}: 6.1

● Find the S that covers the entire sentence and its best edge

● Expand the left child, right child recursively until we have our tree

Printing Parse Trees

● Standard text format for parse tree: “Penn Treebank”

[Figure: tree with PP over IN “with” and NP, where NP is over DT “a” and NN “telescope”]

(PP (IN with) (NP (DT a) (NN telescope)))

● Hypergraphs printed recursively, starting at top:

[Figure: the chosen tree over “I saw a girl with a telescope” with nodes PRP_{0,1}, NP_{0,1}, VBD_{1,2}, DT_{2,3}, NN_{3,4}, IN_{4,5}, DT_{5,6}, NN_{6,7}, NP_{2,4}, NP_{5,7}, PP_{4,7}, VP_{1,7}, S_{0,7}]

print(S_{0,7}) = “(S ” + print(NP_{0,1}) + “ ” + print(VP_{1,7}) + “)”
print(NP_{0,1}) = “(NP ” + print(PRP_{0,1}) + “)”
print(PRP_{0,1}) = “(PRP I)”
...
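
The recursive printing idea can be sketched on a tree stored as nested tuples. The tuple representation and the function name `to_ptb` are assumptions for illustration; the tutorial itself prints from the hypergraph's best edges.

```python
def to_ptb(tree):
    """Render (label, child, ...) as a Penn Treebank style bracketed string.

    A child is either a word (str) or another (label, child, ...) tuple.
    """
    label, children = tree[0], tree[1:]
    parts = [c if isinstance(c, str) else to_ptb(c) for c in children]
    return "(" + label + " " + " ".join(parts) + ")"


pp = ("PP", ("IN", "with"), ("NP", ("DT", "a"), ("NN", "telescope")))
print(to_ptb(pp))  # (PP (IN with) (NP (DT a) (NN telescope)))
```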

Pseudo-Code

CKY Pseudo-Code: Read Grammar

# Read a grammar in format “lhs \t rhs \t prob \n”
make list nonterm    # Make list of (lhs, rhs1, rhs2, log prob)
make map preterm     # Make a map preterm[rhs] = [ (lhs, log prob) ...]
for rule in grammar_file
    split rule into lhs, rhs, prob (with “\t”)    # Rule P(lhs → rhs)=prob
    split rhs into rhs_symbols (with “ ”)
    if length(rhs_symbols) == 1:                   # If this is a pre-terminal
        add (lhs, log(prob)) to preterm[rhs]
    else:                                          # Otherwise, it is a non-terminal
        add (lhs, rhs_symbols[0], rhs_symbols[1], log(prob)) to nonterm
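
A possible Python rendering of the grammar-reading step, reading from any iterable of lines. The function name `load_grammar`, the whitespace stripping, and the sample rules are assumptions for illustration; the real grammar files for the exercise may differ in detail.

```python
import math
from collections import defaultdict


def load_grammar(lines):
    """Parse 'lhs<TAB>rhs<TAB>prob' lines into (nonterm, preterm)."""
    nonterm = []                   # (lhs, rhs1, rhs2, log prob)
    preterm = defaultdict(list)    # word -> [(lhs, log prob), ...]
    for line in lines:
        if not line.strip():
            continue               # skip blank lines
        lhs, rhs, prob = (field.strip() for field in line.split("\t"))
        log_prob = math.log(float(prob))
        rhs_symbols = rhs.split()
        if len(rhs_symbols) == 1:  # pre-terminal rule: lhs -> word
            preterm[rhs_symbols[0]].append((lhs, log_prob))
        else:                      # binary rule: lhs -> rhs1 rhs2
            nonterm.append((lhs, rhs_symbols[0], rhs_symbols[1], log_prob))
    return nonterm, preterm


# Hypothetical sample lines in the same tab-separated format.
sample = [
    "S\tNP_PRP VP\t0.8\n",
    "VP\tVBD NP_PRP\t0.4\n",
    "NP_PRP\tI\t0.4\n",
    "VBD\tsaw\t0.05\n",
]
nonterm, preterm = load_grammar(sample)
print(len(nonterm), "binary rules;", sorted(preterm), "as known words")
```

With a file on disk, the same function can be called as `load_grammar(open(path))`, since a file object iterates over its lines.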

CKY Pseudo-Code: Add Pre-Terminals

split line into words
make map best_score    # index: sym_{i,j}, value = best log prob
make map best_edge     # index: sym_{i,j}, value = (lsym_{i,k}, rsym_{k,j})
# Add the pre-terminal sym
for i in 0 .. length(words)-1:
    for lhs, log_prob in preterm[words[i]]:    # all lhs where P(lhs → words[i]) > 0
        best_score[lhs_{i,i+1}] = log_prob

CKY Pseudo-Code: Combine Non-Terminals

for j in 2 .. length(words):            # j is right side of the span
    for i in j-2 .. 0:                  # i is left side (Note: Reverse order!)
        for k in i+1 .. j-1:            # k is beginning of the second child
            # Try every grammar rule log(P(sym → lsym rsym)) = logprob
            for sym, lsym, rsym, logprob in nonterm:
                # Both children must have a probability
                if best_score[lsym_{i,k}] > -∞ and best_score[rsym_{k,j}] > -∞:
                    # Find the log probability for this node/edge
                    my_lp = best_score[lsym_{i,k}] + best_score[rsym_{k,j}] + logprob
                    # If this is the best edge, update
                    if my_lp > best_score[sym_{i,j}]:
                        best_score[sym_{i,j}] = my_lp
                        best_edge[sym_{i,j}] = (lsym_{i,k}, rsym_{k,j})

CKY Pseudo-Code: Print Tree

print(S_{0,length(words)})    # Print the “S” that spans all words

subroutine print(sym_{i,j}):
    if sym_{i,j} exists in best_edge:    # for non-terminals
        return “(”+sym+“ ” + print(best_edge[sym_{i,j}][0]) + “ ” + print(best_edge[sym_{i,j}][1]) + “)”
    else:                                # for terminals
        return “(”+sym+“ ”+words[i]+“)”
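
Putting the pre-terminal, combination, and printing steps together, a compact Python sketch might look like the following. It is one possible rendering of the pseudo-code, not a reference solution for the exercise: the function names, the `(symbol, i, j)` dictionary keys, and the toy grammar for “I saw him” are all assumptions. Missing chart entries are treated as log probability -∞ by checking key membership.

```python
import math


def cky(words, nonterm, preterm):
    """Fill best_score / best_edge for every (symbol, i, j) span."""
    best_score, best_edge = {}, {}
    # Add the pre-terminals
    for i, word in enumerate(words):
        for lhs, log_prob in preterm.get(word, []):
            best_score[(lhs, i, i + 1)] = log_prob
    # Combine non-terminals
    for j in range(2, len(words) + 1):        # right side of the span
        for i in range(j - 2, -1, -1):        # left side, in reverse order
            for k in range(i + 1, j):         # start of the second child
                for sym, lsym, rsym, logprob in nonterm:
                    left, right = (lsym, i, k), (rsym, k, j)
                    if left in best_score and right in best_score:
                        my_lp = best_score[left] + best_score[right] + logprob
                        if my_lp > best_score.get((sym, i, j), -math.inf):
                            best_score[(sym, i, j)] = my_lp
                            best_edge[(sym, i, j)] = (left, right)
    return best_score, best_edge


def print_tree(node, words, best_edge):
    """Recursively render a node in Penn Treebank bracket format."""
    sym, i, _ = node
    if node in best_edge:                     # non-terminal
        left, right = best_edge[node]
        return ("(" + sym + " " + print_tree(left, words, best_edge)
                + " " + print_tree(right, words, best_edge) + ")")
    return "(" + sym + " " + words[i] + ")"   # pre-terminal over a word


# Hypothetical toy grammar in CNF (log probabilities).
nonterm = [
    ("S", "NP_PRP", "VP", math.log(1.0)),
    ("VP", "VBD", "NP_PRP", math.log(1.0)),
]
preterm = {
    "I": [("NP_PRP", math.log(0.5))],
    "saw": [("VBD", math.log(1.0))],
    "him": [("NP_PRP", math.log(0.5))],
}

words = "I saw him".split()
best_score, best_edge = cky(words, nonterm, preterm)
tree = print_tree(("S", 0, len(words)), words, best_edge)
print(tree)
```

The sketch assumes that an S spanning the whole sentence was found; a fuller program would check that `("S", 0, len(words))` is in `best_score` before printing.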

Exercise

● Write cky.py

● Test the program
  ● Input: test/08-input.txt
  ● Grammar: test/08-grammar.txt
  ● Answer: test/08-output.txt

● Run the program on actual data:
  ● data/wiki-en-test.grammar, data/wiki-en-short.tok

● Visualize the trees
  ● script/print-trees.py < wiki-en-test.trees
  ● (Requires NLTK: http://nltk.org/)

● Challenge: think of a way to handle unknown words

Thank You!

