Learning Accurate, Compact, and Interpretable Tree Annotation
Transcript
Page 1: Learning Accurate, Compact, and Interpretable Tree Annotation

Learning Accurate, Compact, and Interpretable Tree Annotation

Recent Advances in Parsing Technology

WS 2011/2012

Saarland University in Saarbrücken

Miloš Ercegovčević

Page 2: Learning Accurate, Compact, and Interpretable Tree Annotation

Outline

Introduction
  EM algorithm

Latent Grammars
  Motivation
  Learning Latent PCFG
  Split-Merge Adaptation

Efficient inference with Latent Grammars
  Pruning in Multilevel Coarse-to-Fine parsing
  Parse Selection

Page 3: Learning Accurate, Compact, and Interpretable Tree Annotation

Introduction : EM Algorithm

Iterative algorithm for finding MLE or MAP estimates of parameters in statistical models

X – observed data; Z – set of latent variables; Θ – vector of unknown parameters.

Likelihood function:

$L(\theta; X) = p(X \mid \theta) = \sum_{Z} p(X, Z \mid \theta)$

MLE of the marginal likelihood: $\hat{\theta} = \arg\max_{\theta} L(\theta; X)$

However, this maximization is usually intractable, since we know neither Z nor Θ. EM instead works with the complete-data likelihood:

$L(\theta; X, Z) = p(X, Z \mid \theta)$

Page 4: Learning Accurate, Compact, and Interpretable Tree Annotation

Introduction : EM Algorithm

Find the MLE of the marginal likelihood by iteratively applying two steps:

Expectation step (E-step): compute the expected complete-data log-likelihood under the posterior of Z given the current parameters:

$Q(\theta \mid \theta^{(t)}) = E_{Z \mid X, \theta^{(t)}}\left[\log L(\theta; X, Z)\right]$

Maximization step (M-step): find the Θ that maximizes this quantity:

$\theta^{(t+1)} = \arg\max_{\theta} Q(\theta \mid \theta^{(t)})$
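As a concrete illustration (mine, not from the slides), here is a minimal EM loop for a mixture of two biased coins, where the identity of the coin behind each flip sequence plays the role of Z:

```python
import numpy as np

# Each row: (heads, flips) for one observed sequence X_i.
data = np.array([[9, 10], [8, 10], [4, 10], [5, 10], [7, 10]], dtype=float)

theta = np.array([0.6, 0.5])   # initial head probabilities of the two coins
prior = np.array([0.5, 0.5])   # mixing weights P(Z = coin k)

for _ in range(100):
    heads, flips = data[:, :1], data[:, 1:]
    # E-step: posterior P(Z = k | X_i) (the binomial coefficient cancels).
    lik = prior * theta ** heads * (1.0 - theta) ** (flips - heads)
    post = lik / lik.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters from expected counts.
    theta = (post * heads).sum(axis=0) / (post * flips).sum(axis=0)
    prior = post.mean(axis=0)

print(theta, prior)   # converges to a local maximum of the marginal likelihood
```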

Page 5: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent PCFG

Standard coarse treebank tree: baseline parsing F1 is 72.6.

Page 6: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent PCFG

Parent-annotated trees [Johnson ’98], [Klein & Manning ’03]: F1 86.3

Page 7: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent PCFG

Head-lexicalized trees [Collins ’99], [Charniak ’00]: F1 88.6

Page 8: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent PCFG

Automatically clustered categories [Matsuzaki et al. ’05]: F1 86.7

The same number of subcategories for all categories.

Page 9: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent PCFG

At each step, split each category in two. After 6 split iterations the number of subcategories is 64. Initialize each EM run with the results of the smaller grammar.

Page 10: Learning Accurate, Compact, and Interpretable Tree Annotation

Learning Latent PCFG

Induce subcategories with EM, like forward-backward for HMMs, but with the brackets fixed by the treebank tree.

[Figure: fixed parse tree over "He was right." with latent-annotated nodes X1-X7; arrows mark the forward (inside) and backward (outside) passes.]

Page 11: Learning Accurate, Compact, and Interpretable Tree Annotation

Learning Latent Grammar: Inside-Outside Probabilities

For an annotated node A_x spanning (r, t) in the fixed tree, with children B_y spanning (r, s) and C_z spanning (s, t):

$P_{IN}(r, t, A_x) = \sum_{y,z} \beta(A_x \rightarrow B_y C_z)\, P_{IN}(r, s, B_y)\, P_{IN}(s, t, C_z)$

$P_{OUT}(r, s, B_y) = \sum_{x,z} \beta(A_x \rightarrow B_y C_z)\, P_{OUT}(r, t, A_x)\, P_{IN}(s, t, C_z)$

$P_{OUT}(s, t, C_z) = \sum_{x,y} \beta(A_x \rightarrow B_y C_z)\, P_{OUT}(r, t, A_x)\, P_{IN}(r, s, B_y)$
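As an illustration (my data layout; the slides give only the equations), a minimal inside pass over one fixed binary tree, vectorized over each node's subcategories:

```python
import numpy as np

def inside(node, beta):
    """Return P_IN(node) as a vector over the node's subcategories.

    node: (category, (left_child, right_child)), or (category, word) at leaves.
    beta: dict with rule scores beta[(A, B, C)] of shape (n_A, n_B, n_C) and
          emission scores beta[(A, word)] of shape (n_A,), assumed given.
    """
    cat, rest = node
    if isinstance(rest, str):               # leaf: emission scores
        return beta[(cat, rest)]
    left, right = rest
    b = inside(left, beta)                  # P_IN(r, s, B_y)
    c = inside(right, beta)                 # P_IN(s, t, C_z)
    rule = beta[(cat, left[0], right[0])]   # beta(A_x -> B_y C_z)
    # P_IN(r, t, A_x) = sum_{y,z} beta(A_x -> B_y C_z) P_IN(B_y) P_IN(C_z)
    return np.einsum('xyz,y,z->x', rule, b, c)
```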

Page 12: Learning Accurate, Compact, and Interpretable Tree Annotation

Learning Latent Grammar: EM

Expectation step (E-step): the posterior probability of using rule A_x → B_y C_z over split points (r, s, t) in the fixed tree T of sentence w:

$P((r, s, t, A_x \rightarrow B_y C_z) \mid w, T) \propto P_{OUT}(r, t, A_x)\, \beta(A_x \rightarrow B_y C_z)\, P_{IN}(r, s, B_y)\, P_{IN}(s, t, C_z)$

Maximization step (M-step): re-estimate the rule probabilities from the expected counts:

$\beta(A_x \rightarrow B_y C_z) := \frac{\#\{A_x \rightarrow B_y C_z\}}{\sum_{y',z'} \#\{A_x \rightarrow B_{y'} C_{z'}\}}$
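A minimal M-step sketch, assuming the E-step posteriors have already been accumulated into expected rule counts:

```python
from collections import defaultdict

def m_step(expected_counts):
    """expected_counts: dict mapping (A_x, B_y, C_z) -> expected count,
    accumulated from the E-step posteriors over all training trees."""
    parent_totals = defaultdict(float)
    for (a_x, _, _), count in expected_counts.items():
        parent_totals[a_x] += count
    # beta(A_x -> B_y C_z) = #{A_x -> B_y C_z} / total count of rules with parent A_x
    return {rule: count / parent_totals[rule[0]]
            for rule, count in expected_counts.items()}
```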

Page 13: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent Grammar: Adaptive Splitting

Goal: a more compact grammar, without loss in accuracy.

Want to split more where the data demands it. Solution: split everything, then merge back the splits with the smallest loss:

loss = (data likelihood with the split reversed) / (data likelihood with the split)

Page 14: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent Grammar: Adaptive Splitting

The likelihood of the data for tree T and sentence w, computed at any node A spanning (r, t):

$P(w, T) = \sum_{x} P_{IN}(r, t, A_x)\, P_{OUT}(r, t, A_x)$

Then for two annotations A_1, A_2 (a candidate split), the overall loss can be estimated as:

$\Delta_{ANNOTATION}(A_1, A_2) = \prod_{i} \prod_{n \in T_i} \frac{P_n(w_i, T_i)}{P(w_i, T_i)}$

where P_n(w_i, T_i) is the likelihood of tree T_i with the split reversed at node n.
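A sketch of the merge criterion (all four helpers are hypothetical stand-ins for quantities computed via inside-outside), worked in log space for numerical stability:

```python
import math

def merge_loss(trees, split_likelihood, merged_likelihood_at, nodes_with_split):
    """Estimate the likelihood loss of reversing a candidate split A_1/A_2.

    trees: list of (sentence, tree) pairs.
    split_likelihood(w, t): P(w, T) with the split kept.
    merged_likelihood_at(w, t, node): P_n(w, T) with the split reversed at node n.
    nodes_with_split(t): nodes of t where the candidate annotation occurs.
    """
    log_loss = 0.0
    for w, t in trees:
        log_p = math.log(split_likelihood(w, t))
        for node in nodes_with_split(t):
            log_loss += math.log(merged_likelihood_at(w, t, node)) - log_p
    return log_loss   # near 0: merging is nearly free; very negative: keep the split
```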

Page 15: Learning Accurate, Compact, and Interpretable Tree Annotation

Number of Phrasal Subcategories

[Bar chart: number of subcategories allocated to each phrasal category after split-merge; y-axis 0-40.]

Page 16: Learning Accurate, Compact, and Interpretable Tree Annotation

Number of Phrasal Subcategories

[Same bar chart; NP, VP, and PP receive the most subcategories.]

Page 17: Learning Accurate, Compact, and Interpretable Tree Annotation

Number of Phrasal Subcategories

[Same bar chart; rare categories such as NAC and X receive the fewest subcategories.]

Page 18: Learning Accurate, Compact, and Interpretable Tree Annotation

Number of Lexical Subcategories

[Bar chart: number of subcategories allocated to each part-of-speech tag after split-merge; y-axis 0-70. Highlighted: TO, ",", and POS, which receive very few subcategories.]

Page 19: Learning Accurate, Compact, and Interpretable Tree Annotation

Number of Lexical Subcategories

[Same bar chart; IN, DT, RB, and VBx are among the most heavily split tags.]

Page 20: Learning Accurate, Compact, and Interpretable Tree Annotation

Number of Lexical Subcategories

[Same bar chart; open-class tags NN, NNS, NNP, and JJ also receive many subcategories.]

Page 21: Learning Accurate, Compact, and Interpretable Tree Annotation

Latent Grammar: Results

Parser                      F1, ≤ 40 words    F1, all words
Klein & Manning ’03               86.3             85.7
Matsuzaki et al. ’05              86.7             86.1
Collins ’99                       88.6             88.2
Charniak & Johnson ’05            90.1             89.6
Petrov et al. ’06                 90.2             89.7

Page 22: Learning Accurate, Compact, and Interpretable Tree Annotation

Efficient Inference with Latent Grammars

The learned latent grammar reaches 91.2 F1 on the WSJ dev set (1600 sentences), but parsing the set takes 1621 minutes: more than a minute per sentence. For use in real-world applications this is too slow.

Improve on inference with:
  Hierarchical Pruning
  Parse Selection

Page 23: Learning Accurate, Compact, and Interpretable Tree Annotation

Intermediate Grammars

[Diagram: learning proceeds through a sequence of grammars X-Bar = G0, G1, ..., G6 = G; each round of splitting doubles the subcategories, e.g. DT → DT1 DT2 → DT1...DT4 → DT1...DT8.]

Page 24: Learning Accurate, Compact, and Interpretable Tree Annotation

Projected Grammars

[Diagram: instead of storing the intermediate grammars from learning, each coarser level is recovered from the final grammar G by projection: G0 = π0(G), G1 = π1(G), ..., G5 = π5(G), G6 = G.]

Page 25: Learning Accurate, Compact, and Interpretable Tree Annotation

Estimating Grammars

Rules in G:

S_1 → NP_1 VP_1   0.20
S_1 → NP_1 VP_2   0.12
S_1 → NP_2 VP_1   0.02
S_1 → NP_2 VP_2   0.03
S_2 → NP_1 VP_1   0.11
S_2 → NP_1 VP_2   0.05
S_2 → NP_2 VP_1   0.08
S_2 → NP_2 VP_2   0.12

Rule in π(G):

S → NP VP   0.56

The projected probabilities are estimated from the infinite tree distribution that G induces over the treebank.
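To make the projection step concrete, here is a sketch with a hypothetical helper and placeholder weights; the slide's 0.56 uses the true parent-subcategory weights derived from the infinite tree distribution, which this sketch simply takes as given:

```python
def project_rule(fine_rules, parent_weights):
    """Project split-rule probabilities onto one coarse rule A -> B C.

    fine_rules: dict (A_x, B_y, C_z) -> beta(A_x -> B_y C_z), all refining A -> B C.
    parent_weights: dict A_x -> P(A_x | A), assumed precomputed from the
    infinite tree distribution.
    """
    return sum(parent_weights[a_x] * p for (a_x, _, _), p in fine_rules.items())

# The slide's eight fine rules refining S -> NP VP:
fine = {('S1', 'NP1', 'VP1'): 0.20, ('S1', 'NP1', 'VP2'): 0.12,
        ('S1', 'NP2', 'VP1'): 0.02, ('S1', 'NP2', 'VP2'): 0.03,
        ('S2', 'NP1', 'VP1'): 0.11, ('S2', 'NP1', 'VP2'): 0.05,
        ('S2', 'NP2', 'VP1'): 0.08, ('S2', 'NP2', 'VP2'): 0.12}
weights = {'S1': 0.5, 'S2': 0.5}      # placeholder weights for illustration only
print(project_rule(fine, weights))    # one coarse probability for S -> NP VP
```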

Page 26: Learning Accurate, Compact, and Interpretable Tree Annotation

Hierarchical Pruning

Consider the span:

coarse:          ... QP NP VP ...
split in two:    ... QP1 QP2 NP1 NP2 VP1 VP2 ...
split in four:   ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...
split in eight:  ...

At each level, symbols whose coarser projection scored a low posterior on this span are pruned before the finer pass, as sketched below.
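A sketch of the pruning decision at one span (hypothetical data structures; the posteriors would come from an inside-outside pass at the coarser level):

```python
THRESHOLD = 1e-4   # symbols whose coarse posterior falls below this are pruned

def allowed_fine_symbols(span, coarse_posteriors, refinements):
    """One round of hierarchical pruning for a single span (i, j).

    coarse_posteriors: dict (i, j, symbol) -> posterior P(symbol over (i, j) | w),
    computed at the coarser level; refinements: dict coarse symbol -> its split
    symbols at the next level. Returns the symbols the finer pass may build.
    """
    i, j = span
    keep = []
    for coarse_sym, fine_syms in refinements.items():
        if coarse_posteriors.get((i, j, coarse_sym), 0.0) > THRESHOLD:
            keep.extend(fine_syms)    # survives: expand into its subcategories
    return keep                       # everything else is never built at this level
```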

Page 27: Learning Accurate, Compact, and Interpretable Tree Annotation

Parse Selection

Given a sentence w and a split PCFG G, select the parse that minimizes the expected loss under our beliefs:

$T^* = \arg\min_{T} \sum_{T_P} P(T_P \mid w, G)\, L(T, T_P)$

Intractable: we cannot enumerate all candidate trees T_P.

Page 28: Learning Accurate, Compact, and Interpretable Tree Annotation

Parse Selection

Possible solutions:
  best derivation
  generate n-best parses and re-rank them
  sampling derivations of the grammar
  select the minimum-risk candidate based on a loss function over posterior marginals:

$q(A \rightarrow B C, i, k, j) = \frac{r(A \rightarrow B C, i, k, j)}{P_{IN}(root, 0, n)}$

$T_G = \arg\max_{T} \prod_{e \in T} q(e)$

where r(A → BC, i, k, j) is the posterior probability of applying rule A → BC over span (i, j) with split point k.
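A sketch of the final selection step (assuming the normalized posteriors q(e) have already been computed by inside-outside): a Viterbi-style dynamic program maximizing the product of rule posteriors.

```python
from functools import lru_cache

def select_tree(n, q, symbols, root='ROOT'):
    """Search for T_G = argmax_T prod_{e in T} q(e).

    n: sentence length; q: dict (A, B, C, i, k, j) -> normalized posterior q(e);
    symbols: the nonterminal inventory. Returns (score, backpointer tree).
    """
    @lru_cache(maxsize=None)
    def best(a, i, j):
        if j - i == 1:                       # width-1 span: preterminal, no rule
            return 1.0, (a, i)
        top_score, top_tree = 0.0, None
        for k in range(i + 1, j):            # split point
            for b in symbols:
                for c in symbols:
                    e = q.get((a, b, c, i, k, j), 0.0)
                    if e == 0.0:
                        continue             # rule never scored: skip
                    sb, tb = best(b, i, k)
                    sc, tc = best(c, k, j)
                    if e * sb * sc > top_score:
                        top_score, top_tree = e * sb * sc, (a, tb, tc)
        return top_score, top_tree
    return best(root, 0, n)
```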

Page 29: Learning Accurate, Compact, and Interpretable Tree Annotation

Results

Page 30: Learning Accurate, Compact, and Interpretable Tree Annotation

Thank You!

Page 31: Learning Accurate, Compact, and Interpretable Tree Annotation

References

S. Petrov, L. Barrett, R. Thibaux, and D. Klein. 2006. Learning Accurate, Compact, and Interpretable Tree Annotation. In COLING-ACL ’06, pages 433-440. (Paper and slides.)

S. Petrov and D. Klein. 2007. Improved Inference for Unlexicalized Parsing. In HLT-NAACL ’07. (Paper and slides.)

T. Matsuzaki, Y. Miyao, and J. Tsujii. 2005. Probabilistic CFG with Latent Annotations. In ACL ’05, pages 75-82.

