Statistical Natural Language Parsing

Mausam

(Based on slides of Michael Collins, Dan Jurafsky, Dan Klein, Chris Manning, Ray Mooney, Luke Zettlemoyer)

Two views of linguistic structure: 1. Constituency (phrase structure)

• Phrase structure organizes words into nested constituents

• How do we know what is a constituent? (Not that linguists don't argue about some cases.)

• Distribution: a constituent behaves as a unit that can appear in different places
• John talked [to the children] [about drugs]
• John talked [about drugs] [to the children]
• *John talked drugs to the children about

• Substitution/expansion/pro-forms
• I sat [on the box / right on top of the box / there]

• Coordination, regular internal structure, no intrusion, fragments, semantics, …

Two views of linguistic structure: 2. Dependency structure

• Dependency structure shows which words depend on (modify or are arguments of) which other words

The boy put the tortoise on the rug

[Figure: dependency tree in which "put" is the head, with dependents "boy", "tortoise", and "on"; "on" heads "rug"; each article "The"/"the" depends on its noun]

Why Parse?

• Part-of-speech information

• Phrase information

• Useful relationships

The rise of annotated data

The Penn Treebank

( (S (NP-SBJ (DT The) (NN move))
     (VP (VBD followed)
         (NP (NP (DT a) (NN round))
             (PP (IN of)
                 (NP (NP (JJ similar) (NNS increases))
                     (PP (IN by) (NP (JJ other) (NNS lenders)))
                     (PP (IN against) (NP (NNP Arizona) (JJ real) (NN estate) (NNS loans))))))
         (, ,)
         (S-ADV (NP-SBJ (-NONE- *))
                (VP (VBG reflecting)
                    (NP (NP (DT a) (VBG continuing) (NN decline))
                        (PP-LOC (IN in) (NP (DT that) (NN market)))))))
     (. .)))

[Marcus et al. 1993, Computational Linguistics]

Penn Treebank Non-terminals

The rise of annotated data

• Starting off, building a treebank seems a lot slower and less useful than building a grammar

• But a treebank gives us many things:
• Reusability of the labor
• Many parsers, POS taggers, etc.
• Valuable resource for linguistics
• Broad coverage
• Frequencies and distributional information
• A way to evaluate systems

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications:

• High-precision question answering [Pasca and Harabagiu, SIGIR 2001]
• Improving biological named entity finding [Finkel et al., JNLPBA 2004]
• Syntactically based sentence compression [Lin and Wilbur, 2007]
• Extracting opinions about products [Bloom et al., NAACL 2007]
• Improved interaction in computer games [Gorniak and Roy, 2005]
• Helping linguists find data [Resnik et al., BLS 2005]
• Source sentence analysis for machine translation [Xu et al., 2009]
• Relation extraction systems [Fundel et al., Bioinformatics 2006]

Example Application: Machine Translation

• The boy put the tortoise on the rug
• लड़के ने रखा कछुआ ऊपर कालीन (the Hindi rendering)
• SVO vs. SOV; preposition vs. post-position

[Figure, repeated over several build slides: the English parse tree, S with NP (the boy), VP (put the tortoise), and PP (on the rug), shown step by step alongside the corresponding tree over the Hindi words, illustrating how the constituents are reordered and the preposition becomes a post-position]

Pre-1990 ("Classical") NLP Parsing

• Goes back to Chomsky's PhD thesis in the 1950s

• Wrote symbolic grammar (CFG or often richer) and lexicon:
  S → NP VP        NN → interest
  NP → (DT) NN     NNS → rates
  NP → NN NNS      NNS → raises
  NP → NNP         VBP → interest
  VP → V NP        VBZ → rates

• Used grammar/proof systems to prove parses from words

• This scaled very badly and didn't give coverage. For the sentence
  Fed raises interest rates 0.5% in effort to control inflation:
• Minimal grammar: 36 parses
• Simple 10-rule grammar: 592 parses
• Real-size broad-coverage grammar: millions of parses

Classical NLP Parsing: The problem and its solution

• Categorical constraints can be added to grammars to limit unlikely/weird parses for sentences
• But this attempt makes the grammars not robust: in traditional systems, commonly 30% of sentences in even an edited text would have no parse

• A less constrained grammar can parse more sentences
• But simple sentences end up with ever more parses, with no way to choose between them

• We need mechanisms that allow us to find the most likely parse(s) for a sentence
• Statistical parsing lets us work with very loose grammars that admit millions of parses for sentences but still quickly find the best parse(s)

Context-Free Grammars and Ambiguities

Context-Free Grammars

Context-Free Grammars in NLP

• A context-free grammar G in NLP = (N, C, Σ, S, L, R)
• Σ is a set of terminal symbols
• C is a set of preterminal symbols
• N is a set of nonterminal symbols
• S is the start symbol (S ∈ N)
• L is the lexicon, a set of items of the form X → x, where X ∈ C and x ∈ Σ
• R is the grammar, a set of items of the form X → γ, where X ∈ N and γ ∈ (N ∪ C)*
• By usual convention, S is the start symbol, but in statistical NLP we usually have an extra node at the top (ROOT, TOP)
• We usually write e for an empty sequence, rather than nothing
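To make the definition concrete, here is a minimal sketch (my own illustrative encoding, not from the slides) of how the components of G = (N, C, Σ, S, L, R) might be held in plain Python containers:

# Illustrative encoding of a tiny CFG; the container layout and names are assumptions.
cfg = {
    "N": {"S", "NP", "VP", "PP"},                              # nonterminals
    "C": {"DT", "NN", "V", "P"},                               # preterminals
    "Sigma": {"the", "boy", "put", "tortoise", "on", "rug"},   # terminals
    "S": "S",                                                  # start symbol (often wrapped in ROOT)
    "L": {"DT": {"the"}, "NN": {"boy", "tortoise", "rug"},     # lexicon: preterminal -> words
          "V": {"put"}, "P": {"on"}},
    "R": {"S": [("NP", "VP")], "VP": [("V", "NP", "PP")],      # rules: X -> sequence over N ∪ C
          "NP": [("DT", "NN")], "PP": [("P", "NP")]},
}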

A Context-Free Grammar of English

Left-Most Derivations

Properties of CFGs

Attachment ambiguities

• A key parsing decision is how we 'attach' various constituents
• PPs, adverbial or participial phrases, infinitives, coordinations, etc.

• Catalan numbers: Cn = (2n)! / [(n+1)! n!]
• An exponentially growing series, which arises in many tree-like contexts
• E.g. the number of possible triangulations of a polygon with n+2 sides
• Turns up in triangulation of probabilistic graphical models…
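Since the Catalan numbers come up repeatedly, a quick way to see how fast they grow (a small illustrative sketch; math.comb is a standard-library function):

from math import comb

def catalan(n: int) -> int:
    # C_n = (2n)! / ((n+1)! n!) = C(2n, n) / (n + 1)
    return comb(2 * n, n) // (n + 1)

print([catalan(n) for n in range(8)])   # [1, 1, 2, 5, 14, 42, 132, 429]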

Attachments

• I cleaned the dishes from dinner
• I cleaned the dishes with detergent
• I cleaned the dishes in my pajamas
• I cleaned the dishes in the sink

Syntactic Ambiguities I

• Prepositional phrases: They cooked the beans in the pot on the stove with handles
• Particle vs. preposition: The lady dressed up the staircase
• Complement structures: The tourists objected to the guide that they couldn't hear. She knows you like the back of her hand
• Gerund vs. participial adjective: Visiting relatives can be boring. Changing schedules frequently confused passengers

Syntactic Ambiguities II

• Modifier scope within NPs: impractical design requirements; plastic cup holder
• Multiple gap constructions: The chicken is ready to eat. The contractors are rich enough to sue
• Coordination scope: Small rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

• Dislocation / gapping:
• Which book should Peter buy?
• A debate arose which continued until the election

• Binding / reference:
• The IRS audits itself

• Control:
• I want to go
• I want you to go

A Fragment of a Noun Phrase Grammar

Extended Grammar with Prepositional Phrases

Verbs, Verb Phrases, and Sentences

PPs Modifying Verb Phrases

Complementizers and SBARs

More Verbs

Coordination

Much more remains…

Parsing: Two problems to solve
1. Repeated work…

Parsing: Two problems to solve
2. Choosing the correct parse

• How do we work out the correct attachment?
• She saw the man with a telescope
• Is the problem 'AI complete'? Yes, but …
• Words are good predictors of attachment, even absent full understanding
• Moscow sent more than 100,000 soldiers into Afghanistan …
• Sydney Water breached an agreement with NSW Health …
• Our statistical parsers will try to exploit such statistics

Probabilistic Context-Free Grammars

Probabilistic – or stochastic – context-free grammars (PCFGs)

• G = (T, N, S, R, P)
• T is a set of terminal symbols
• N is a set of nonterminal symbols
• S is the start symbol (S ∈ N)
• R is a set of rules/productions of the form X → γ
• P is a probability function, P: R → [0,1]
• A grammar G generates a language model L: ∑_{γ ∈ T*} P(γ) = 1

PCFG Example: A Probabilistic Context-Free Grammar (PCFG)

S → NP VP       1.0
VP → Vi         0.4
VP → Vt NP      0.4
VP → VP PP      0.2
NP → DT NN      0.3
NP → NP PP      0.7
PP → P NP       1.0

Vi → sleeps     1.0
Vt → saw        1.0
NN → man        0.7
NN → woman      0.2
NN → telescope  0.1
DT → the        1.0
IN → with       0.5
IN → in         0.5

• Probability of a tree t with rules α1 → β1, α2 → β2, …, αn → βn is
  p(t) = ∏_{i=1}^{n} q(αi → βi)
  where q(α → β) is the probability for rule α → β

Example of a PCFG

Probability of a Parse

• t1 = the parse of "The man sleeps":
  (S (NP (DT The) (NN man)) (VP (Vi sleeps)))
  p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0 = 0.084

• t2 = a parse of "The man saw the woman with the telescope", with the PP attached inside the VP:
  p(t2) is likewise the product of the probabilities of its rules (S → NP VP, VP → VP PP, VP → Vt NP, PP → P NP, the three NP → DT NN expansions, and the lexical rules)
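As a sanity check on the formula, a tiny sketch (illustrative code, not from the slides) that recomputes p(t1) as the product of its rule probabilities:

# Rule probabilities q(alpha -> beta) for the rules used in t1, taken from the PCFG above.
q = {
    ("S", ("NP", "VP")): 1.0, ("NP", ("DT", "NN")): 0.3, ("DT", ("the",)): 1.0,
    ("NN", ("man",)): 0.7, ("VP", ("Vi",)): 0.4, ("Vi", ("sleeps",)): 1.0,
}

def tree_prob(rules):
    p = 1.0
    for rule in rules:       # p(t) = product over the rules used in the tree
        p *= q[rule]
    return p

t1_rules = [("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("DT", ("the",)),
            ("NN", ("man",)), ("VP", ("Vi",)), ("Vi", ("sleeps",))]
print(tree_prob(t1_rules))   # 1.0 * 0.3 * 1.0 * 0.7 * 0.4 * 1.0 = 0.084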

PCFGs: Learning and Inference

• Model: the probability of a tree t with n rules αi → βi, i = 1..n

• Learning: read the rules off of labeled sentences, use ML estimates for the probabilities, and use all of our standard smoothing tricks

• Inference: for an input sentence s, define T(s) to be the set of trees whose yield is s (whole leaves, read left to right, match the words in s)
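The "read the rules off" step is just relative-frequency counting: q_ML(X → β) = count(X → β) / count(X). A small sketch (the two-tree "treebank" below is invented purely for illustration):

from collections import Counter

treebank = [   # each tree listed as the rules it uses (hypothetical toy data)
    [("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("VP", ("Vi",))],
    [("S", ("NP", "VP")), ("NP", ("NP", "PP")), ("NP", ("DT", "NN")),
     ("PP", ("P", "NP")), ("VP", ("Vt", "NP"))],
]

rule_counts, lhs_counts = Counter(), Counter()
for tree_rules in treebank:
    for lhs, rhs in tree_rules:
        rule_counts[(lhs, rhs)] += 1
        lhs_counts[lhs] += 1

q = {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}
print(q[("NP", ("DT", "NN"))])   # 2 of the 3 NP expansions, so 0.666...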

Grammar Transforms

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w, with X, Y, Z ∈ N and w ∈ Σ
• A transformation to this form doesn't change the weak generative capacity of a CFG
• That is, it recognizes the same language (but maybe with different trees)
• Empties and unaries are removed recursively
• n-ary rules are divided by introducing new nonterminals (n > 2), as sketched below
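The n-ary step on its own can be sketched in a few lines (illustrative code; the VP_V-style naming follows the symbols used later in these slides, everything else is an assumption):

def binarize(rules):
    # Split each X -> Y1 Y2 ... Yn (n > 2) left-to-right by introducing new symbols.
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new_sym = f"{lhs}_{rhs[0]}"          # e.g. VP -> V NP PP introduces VP_V
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]
        out.append((lhs, tuple(rhs)))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]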

A phrase structure grammar

S → NP VP
VP → V NP
VP → V NP PP
NP → NP NP
NP → NP PP
NP → N
NP → e
PP → P NP

N → people | fish | tanks | rods
V → people | fish | tanks
P → with

Chomsky Normal Form steps (after removing the empty NP → e, every rule containing an NP also gets a variant with that NP dropped)

S → NP VP | VP
VP → V NP | V | V NP PP | V PP
NP → NP NP | NP | NP PP | PP | N
PP → P NP | P
N → people | fish | tanks | rods
V → people | fish | tanks
P → with

Chomsky Normal Form steps (the unary S → VP is eliminated by expanding it)

S → NP VP | V NP | V | V NP PP | V PP
VP → V NP | V | V NP PP | V PP
NP → NP NP | NP | NP PP | PP | N
PP → P NP | P
N → people | fish | tanks | rods
V → people | fish | tanks
P → with

Chomsky Normal Form steps (the unary S → V is eliminated: S now also rewrites directly to the verbs)

S → NP VP | V NP | V NP PP | V PP | people | fish | tanks
VP → V NP | V | V NP PP | V PP
NP → NP NP | NP | NP PP | PP | N
PP → P NP | P
N → people | fish | tanks | rods
V → people | fish | tanks
P → with

Chomsky Normal Form steps (the unary VP → V is eliminated: VP now also rewrites directly to the verbs)

S → NP VP | V NP | V NP PP | V PP | people | fish | tanks
VP → V NP | V NP PP | V PP | people | fish | tanks
NP → NP NP | NP | NP PP | PP | N
PP → P NP | P
N → people | fish | tanks | rods
V → people | fish | tanks
P → with

Chomsky Normal Form steps (the remaining unaries NP → NP, NP → PP, NP → N and PP → P are eliminated)

S → NP VP | V NP | V NP PP | V PP | people | fish | tanks
VP → V NP | V NP PP | V PP | people | fish | tanks
NP → NP NP | NP PP | P NP | people | fish | tanks | rods
PP → P NP | with
V → people | fish | tanks
P → with

Chomsky Normal Form steps (the ternary rules are binarized with the new symbols VP_V and S_V)

S → NP VP | V NP | V S_V | V PP | people | fish | tanks
S_V → NP PP
VP → V NP | V VP_V | V PP | people | fish | tanks
VP_V → NP PP
NP → NP NP | NP PP | P NP | people | fish | tanks | rods
PP → P NP | with
V → people | fish | tanks
P → with


Chomsky Normal Form

• You should think of this as a transformation for efficient parsing
• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform
• In practice, full Chomsky Normal Form is a pain
• Reconstructing n-aries is easy; reconstructing unaries/empties is trickier
• Binarization is crucial for cubic-time CFG parsing
• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

An example: before binarization…

[Figure: the tree ROOT → S → NP VP for "people fish tanks with rods", with NP → N (people) and the ternary rule VP → V NP PP (fish / tanks / with rods)]

After binarization…

[Figure: the same tree with VP → V VP_V and VP_V → NP PP, so every branching node is binary]

Treebank: empties and unaries

[Figure: the PTB tree ROOT → S-HLN → (NP-SUBJ → -NONE- e) (VP → VB "Atone"), and its successive simplifications: NoFuncTags (function tags dropped), NoEmpties (the empty NP removed), NoUnaries (the unary chain collapsed, with a high vs. low choice of the remaining label)]

Parsing

Constituency Parsing

• Input sentence: fish people fish tanks
• PCFG rules with probabilities θi:
  S → NP VP     θ0
  NP → NP NP    θ1
  …
  N → fish      θ42
  N → people    θ43
  V → fish      θ44
  …

[Figure: a parse tree for the sentence under this PCFG, S → NP VP, with NPs over "fish people" and a VP over "fish tanks"]

Cocke-Kasami-Younger (CKY) Constituency Parsing

• Sentence: fish people fish tanks
• Viterbi (max) scores

[Figure: two adjacent chart cells, for "people" (NP 0.35, V 0.1, N 0.5) and "fish" (VP 0.06, NP 0.14, V 0.6, N 0.2), shown alongside the grammar listed below ("The grammar: binary, no epsilons")]

Extended CKY parsing

• Unaries can be incorporated into the algorithm: messy, but doesn't increase algorithmic complexity
• Empties can be incorporated: use fenceposts; doesn't increase complexity, essentially like unaries
• Binarization is vital: without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and X -> Y Z of
      q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)
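The same recurrence with memoization (dynamic programming) avoids the repeated work noted earlier. A sketch in Python (the grammar encoding, lexical_q and binary_q dictionaries, is my own assumption, not the slides'):

from functools import lru_cache

def make_best_score(lexical_q, binary_q, words):
    # lexical_q[(X, word)] = q(X -> word); binary_q[(X, Y, Z)] = q(X -> Y Z)
    @lru_cache(maxsize=None)
    def best_score(X, i, j):
        if i == j:
            return lexical_q.get((X, words[i]), 0.0)
        best = 0.0
        for (A, Y, Z), q in binary_q.items():
            if A != X:
                continue
            for k in range(i, j):                # split point, as in the pseudocode
                best = max(best, q * best_score(Y, i, k) * best_score(Z, k + 1, j))
        return best
    return best_score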

function CKY(words, grammar) returns [most_probable_parse, prob]
  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back  = new Pair[#(words)+1][#(words)+1][#(nonterms)]
  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965)… extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true
  return buildTree(score, back)

The CKY algorithm (1960/1965)… extended to unaries

The grammar: binary, no epsilons

S → NP VP      0.9
S → VP         0.1
VP → V NP      0.5
VP → V         0.1
VP → V VP_V    0.3
VP → V PP      0.1
VP_V → NP PP   1.0
NP → NP NP     0.1
NP → NP PP     0.2
NP → N         0.7
PP → P NP      1.0

N → people     0.5
N → fish       0.2
N → tanks      0.2
N → rods       0.1
V → people     0.1
V → fish       0.6
V → tanks      0.3
P → with       1.0
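A compact runnable rendering of the two pseudocode fragments above (a sketch in Python rather than the slides' notation; scores only, backpointers for buildTree omitted), using exactly this grammar. It reproduces the chart values stepped through next:

from collections import defaultdict

binary = {("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5, ("VP", "V", "VP_V"): 0.3,
          ("VP", "V", "PP"): 0.1, ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0}
unary = {("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7}
lexicon = {("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
           ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3, ("P", "with"): 1.0}

def apply_unaries(score, begin, end):
    added = True
    while added:                                  # unary closure, as in the pseudocode
        added = False
        for (A, B), p in unary.items():
            prob = p * score[begin, end, B]
            if prob > score[begin, end, A]:
                score[begin, end, A] = prob
                added = True

def cky(words):
    n = len(words)
    score = defaultdict(float)                    # (begin, end, A) -> best probability
    for i, w in enumerate(words):
        for (A, word), p in lexicon.items():
            if word == w:
                score[i, i + 1, A] = p
        apply_unaries(score, i, i + 1)
    for span in range(2, n + 1):
        for begin in range(n - span + 1):
            end = begin + span
            for (A, B, C), p in binary.items():
                for split in range(begin + 1, end):
                    prob = score[begin, split, B] * score[split, end, C] * p
                    score[begin, end, A] = max(score[begin, end, A], prob)
            apply_unaries(score, begin, end)
    return score

chart = cky("fish people fish tanks".split())
print(chart[0, 4, "S"])                           # about 0.00018522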

Worked example: the CKY chart for "fish people fish tanks"

[The original slides step through the chart cells score[begin][end] one at a time, with the grammar and the relevant pseudocode fragment repeated alongside each step; the recoverable cell values are summarized here.]

• Lexical step, score[i][i+1][A] = P(A → words[i]):
  fish: N 0.2, V 0.6 · people: N 0.5, V 0.1 · fish: N 0.2, V 0.6 · tanks: N 0.2, V 0.3

• Unary closure over each single word (NP → N, VP → V, S → VP):
  fish: NP 0.14, VP 0.06, S 0.006 · people: NP 0.35, VP 0.01, S 0.001 · tanks: NP 0.14, VP 0.03, S 0.003

• Span-2 cells (binary rules, then unaries again):
  [0,2] "fish people": NP 0.0049, VP 0.105, S 0.0105 (via S → VP)
  [1,3] "people fish": NP 0.0049, VP 0.007, S 0.0189 (via S → NP VP)
  [2,4] "fish tanks":  NP 0.00196, VP 0.042, S 0.0042 (via S → VP)

• Span-3 cells:
  [0,3]: NP 0.0000686, VP 0.00147, S 0.000882
  [1,4]: NP 0.0000686, VP 0.000098, S 0.01323

• Span-4 cell (the whole sentence):
  [0,4]: NP 0.0000009604, VP 0.00002058, S 0.00018522 (via S → NP VP, splitting after "fish people")

Call buildTree(score, back) to get the best parse

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall: 3/8 = 37.5%
LP/LR F1: 40.0%
Tagging Accuracy: 11/11 = 100.0%
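The arithmetic above as a tiny sketch (the bracket triples are just (label, start, end); this is illustrative code, not a standard evalb implementation):

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}

matched = len(gold & cand)            # 3 brackets agree on label and span
lp = matched / len(cand)              # labeled precision: 3/7
lr = matched / len(gold)              # labeled recall:    3/8
f1 = 2 * lp * lr / (lp + lr)
print(f"LP={lp:.1%}  LR={lr:.1%}  F1={f1:.1%}")   # LP=42.9%  LR=37.5%  F1=40.0%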

How good are PCFGs?

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust: usually admit everything, but with low probability
• Partial solution for grammar ambiguity: a PCFG gives some idea of the plausibility of a parse, but not so good, because the independence assumptions are too strong
• Gives a probabilistic language model, but in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
• (A word is independent of the rest of the tree given its POS)

A Case of PP Attachment Ambiguity

A Case of Coordination Ambiguity

Structural Preferences: Close Attachment

• Example: John was believed to have been shot by Bill
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)
• So the two analyses receive the same probability

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
• At any node, the material inside that node is independent of the material outside that node, given the label of that node
• Any information that statistically connects behavior inside and outside a node must flow through that node's label

[Figure: a tree with S → NP VP and NP → DT NN; the NP's expansion depends only on the label NP, not on anything outside the NP]

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects)

[Figure: bar charts of NP expansion frequencies (NP PP vs. DT NN vs. PRP) for all NPs, for NPs under S, and for NPs under VP; pronoun (PRP) expansions dominate subject NPs under S, while NP PP and DT NN expansions are far more common for object NPs under VP]

Non-Independence II

• Symptoms of overly strong assumptions: rewrites get used where they don't belong
• (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

Horizontal Markovization

• Horizontal Markovization merges states

[Figure: parsing accuracy (roughly 70-74 F1) and grammar size (roughly 3,000-12,000 symbols) as a function of horizontal Markov order 0, 1, 2v, 2, ∞]
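One way to picture the state merging: during binarization, let the introduced symbols remember only the last h children already generated, so different long rules collapse onto shared intermediate states. A sketch (illustrative only; the symbol-naming scheme is an assumption, not the parser's actual code):

def binarize_markov(lhs, rhs, h=1):
    # Left-to-right binarization of lhs -> rhs, remembering only the last h
    # already-generated children in the intermediate symbol names.
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, seen, current, rest = [], [rhs[0]], lhs, list(rhs)
    while len(rest) > 2:
        new_sym = f"@{lhs}|{'_'.join(seen[-h:])}"
        rules.append((current, (rest[0], new_sym)))
        current, rest = new_sym, rest[1:]
        seen.append(rest[0])
    rules.append((current, tuple(rest)))
    return rules

print(binarize_markov("VP", ("V", "NP", "PP", "PP"), h=1))
# [('VP', ('V', '@VP|V')), ('@VP|V', ('NP', '@VP|NP')), ('@VP|NP', ('PP', 'PP'))]

With h=2 the second intermediate symbol would instead be '@VP|V_NP', i.e. less merging and a larger grammar.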

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e. parent annotation)

[Figure: example trees for order 1 vs. order 2, and plots of parsing accuracy (roughly 72-79 F1) and grammar size (up to about 25,000 symbols) as a function of vertical Markov order 1, 2v, 2, 3v, 3]

Model    F1    Size
v=h=2v   77.8  7.5K

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used
• Solution: mark unary rewrite sites with -U

Annotation  F1    Size
Base        77.8  7.5K
UNARY       78.3  8.0K

Tag Splits

• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution: subdivide the IN tag

Annotation  F1    Size
Previous    78.3  8.0K
SPLIT-IN    80.3  8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1    Size
80.4  8.1K
80.5  8.1K
81.2  8.5K
81.6  9.0K
81.7  9.1K
81.8  9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples: possessive NPs; finite vs. infinite VPs; lexical heads
• Solution: annotate future elements into nodes

Annotation  F1    Size
tag splits  82.3  9.7K
POSS-NP     83.1  9.8K
SPLIT-VP    85.7  10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites:
• Contains a verb
• Is (non-)recursive
• Base NPs [cf. Collins 99]
• Right-recursive NPs

Annotation     F1    Size
Previous       85.7  10.5K
BASE-NP        86.0  11.7K
DOMINATES-V    86.9  14.1K
RIGHT-REC-NP   87.0  15.2K

[Figure: a PP attaching high to a VP site that dominates a verb (marked v) vs. low to an NP site that does not (marked -v)]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser              LP    LR    F1
Magerman 95         84.9  84.6  84.7
Collins 96          86.3  85.8  86.0
Klein & Manning 03  86.9  85.7  86.3
Charniak 97         87.4  87.5  87.4
Collins 99          88.7  88.6  88.6

Lexicalised PCFGs

Heads in Context-Free Rules

Heads

Rules to Recover Heads: An Example for NPs

Rules to Recover Heads: An Example for VPs

Adding Headwords to Trees

Lexicalized CFGs in Chomsky Normal Form

Example

Lexicalized CKY

[Figure: combining Y[h] over span i..k with Z[h′] over span k..j into X[h], e.g. VP → VBD[saw] NP[her] building (VP → VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max of
      max over k and X -> Y Z of score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k and X -> Y Z of score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
• Essentially, run the O(n^5) CKY
• Remember only a few hypotheses for each span <i,j>
• If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why? each of the O(n) split points combines at most K left hypotheses with K right hypotheses)
• Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Parameter Estimation

A Model from Charniak (1997)

Other Details

Final Test Set Results

Parser              LP    LR    F1
Magerman 95         84.9  84.6  84.7
Collins 96          86.3  85.8  86.0
Klein & Manning 03  86.9  85.7  86.3
Charniak 97         87.4  87.5  87.4
Collins 99          88.7  88.6  88.6

Analysis / Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

The Game of Designing a Grammar

• Annotation refines base treebank symbols to improve the statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits

• Manually split categories:
• NP: subject vs. object
• DT: determiners vs. demonstratives
• IN: sentential vs. prepositional
• Advantages: fairly compact grammar; linguistic motivations
• Disadvantages: performance leveled out; manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories
• Can learn with EM, like Forward-Backward for HMMs (Forward corresponds to Outside, Backward to Inside)

[Figure: a tree over "He was right" with latent subcategory variables X1 … X7 at its nodes]

Automatic Annotation Induction

• Label all nodes with latent variables; same number k of subcategories for all categories
• Advantages: automatically learned
• Disadvantages: grammar gets too large; most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al. '05  86.7

Refinement of the DT tag

[Figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4]

• Hierarchical refinement: repeatedly learn more fine-grained subcategories
• Start with two (per non-terminal), then keep splitting
• Initialize each EM run with the output of the last

Adaptive Splitting

• Want to split complex categories more
• Idea: split everything, then roll back the splits which were least useful [Petrov et al. 06]
• Evaluate the loss in likelihood from removing each split: data likelihood with the split reversed vs. data likelihood with the split
• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al. '05     86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al. 06         90.2              89.7

Hierarchical Pruning

• Parse multiple times with grammars at different levels of granularity:
  coarse:         … QP NP VP …
  split in two:   … QP1 QP2 NP1 NP2 VP1 VP2 …
  split in four:  … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
  split in eight: … and so on

Bracket Posteriors

[Figure: parse time drops from 1621 min to 111 min, 35 min, and finally 15 min with hierarchical coarse-to-fine pruning, at 91.2 F1 with no search error]

Page 2: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Two views of linguistic structure 1 Constituency (phrase structure)

bull Phrase structure organizes words into nested constituents

bull How do we know what is a constituent (Not that linguists donrsquot argue about some cases)bull Distribution a constituent behaves as a unit that can appear in different

places

bull John talked [to the children] [about drugs]

bull John talked [about drugs] [to the children]

bull John talked drugs to the children about

bull Substitutionexpansionpro-forms

bull I sat [on the boxright on top of the boxthere]

bull Coordination regular internal structure no intrusion fragments semantics hellip

Two views of linguistic structure 2 Dependency structure

bull Dependency structure shows which words depend on (modify or are arguments of) which other words

The boy put the tortoise on the rug

rug

the

the

ontortoise

put

boy

The

Why Parse

bull Part of speech information

bull Phrase information

bull Useful relationships

8

The rise of annotated data

The Penn Treebank

( (S(NP-SBJ (DT The) (NN move))(VP (VBD followed)

(NP(NP (DT a) (NN round))(PP (IN of)(NP(NP (JJ similar) (NNS increases))(PP (IN by)

(NP (JJ other) (NNS lenders)))(PP (IN against)

(NP (NNP Arizona) (JJ real) (NN estate) (NNS loans))))))( )(S-ADV

(NP-SBJ (-NONE- ))(VP (VBG reflecting)(NP(NP (DT a) (VBG continuing) (NN decline))(PP-LOC (IN in)

(NP (DT that) (NN market)))))))( )))

[Marcus et al 1993 Computational Linguistics]

Penn Treebank Non-terminals

The rise of annotated data

bull Starting off building a treebank seems a lot slower and less useful than building a grammar

bull But a treebank gives us many thingsbull Reusability of the labor

bull Many parsers POS taggers etc

bull Valuable resource for linguistics

bull Broad coverage

bull Frequencies and distributional information

bull A way to evaluate systems

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications

bull High precision question answering [Pasca and Harabagiu SIGIR 2001]

bull Improving biological named entity finding [Finkel et al JNLPBA 2004]

bull Syntactically based sentence compression [Lin and Wilbur 2007]

bull Extracting opinions about products [Bloom et al NAACL 2007]

bull Improved interaction in computer games [Gorniak and Roy 2005]

bull Helping linguists find data [Resnik et al BLS 2005]

bull Source sentence analysis for machine translation [Xu et al 2009]

bull Relation extraction systems [Fundel et al Bioinformatics 2006]

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model (a sum-based CKY sketch follows this slide)

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model
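The "probabilistic language model" point is just CKY with max replaced by sum: the inside algorithm, which computes P(sentence) summed over all parses. A minimal Python sketch for a binary-rule PCFG (unary rules are omitted for brevity, and the rule-table format is an assumption, not an existing API):

from collections import defaultdict

def sentence_prob(words, lex, binary, start="S"):
    """Inside algorithm: lex[(A, w)] = P(A -> w), binary[(A, B, C)] = P(A -> B C)."""
    n = len(words)
    inside = defaultdict(float)          # inside[(i, j, A)] = P(A derives words[i:j])
    for i, w in enumerate(words):
        for (A, word), p in lex.items():
            if word == w:
                inside[(i, i + 1, A)] += p
    for span in range(2, n + 1):
        for begin in range(n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), p in binary.items():
                    inside[(begin, end, A)] += (p * inside[(begin, split, B)]
                                                  * inside[(split, end, C)])
    return inside[(0, n, start)]

Replacing the sums with max (and keeping backpointers) recovers the Viterbi CKY used in the chart example above.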

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)

bull The two analyses therefore receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

[Diagram: a tree with an NP node; the rules S -> NP VP and NP -> DT NN apply outside and inside it, and any statistical influence between the two must pass through the NP label]

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

Expansion   All NPs   NPs under S   NPs under VP

NP PP       11%       9%            23%

DT NN       9%        9%            7%

PRP         6%        21%           4%
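Counts like the ones in this chart can be read directly off a treebank. A sketch using NLTK's bundled 10% sample of the Penn Treebank (this assumes nltk is installed and the "treebank" corpus has been downloaded; function tags such as -SBJ are stripped before comparing labels):

from collections import Counter
from nltk.corpus import treebank          # requires nltk.download("treebank")
from nltk.tree import ParentedTree

counts = Counter()
for sent in treebank.parsed_sents():
    tree = ParentedTree.convert(sent)
    for np in tree.subtrees(lambda t: t.label().split("-")[0] == "NP"):
        parent = np.parent().label().split("-")[0] if np.parent() else "ROOT"
        expansion = " ".join(child.label() for child in np)
        counts[parent, expansion] += 1

for (parent, expansion), n in counts.most_common(10):
    print(f"NP -> {expansion}   (under {parent}): {n}")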

Non-Independence II

bull Symptoms of overly strong assumptions

bull Rewrites get used where they don't belong

[Example figure: in the PTB this construction is for possessives]

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

[Charts: merged-state F1 (roughly 70-74) and grammar size in symbols (roughly 3K-12K) as a function of horizontal Markov order 0, 1, 2v, 2, inf]
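In practice horizontal markovization happens during binarization: the intermediate symbols introduced for an n-ary rule remember only the last h sisters already generated, so intermediate states that share that short history get merged. A Python sketch (the @-style symbol naming is an illustrative convention, not taken from any particular parser):

def binarize_rule(lhs, rhs, h=1):
    """Right-branching binarization of lhs -> rhs (a list of symbols), keeping
    at most the last h generated sisters in the intermediate symbol names."""
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, prev, seen = [], lhs, []
    for sym in rhs[:-2]:
        seen = (seen + [sym])[-h:] if h > 0 else []
        new = f"@{lhs}->..._{'_'.join(seen)}" if seen else f"@{lhs}->..."
        rules.append((prev, (sym, new)))
        prev = new
    rules.append((prev, (rhs[-2], rhs[-1])))
    return rules

binarize_rule("VP", ["V", "NP", "PP", "PP"], h=1) yields VP -> V @VP->..._V, then @VP->..._V -> NP @VP->..._NP, then @VP->..._NP -> PP PP; with h=0 all the intermediate symbols collapse to @VP->..., which is exactly the state merging plotted in the chart.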

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 vs Order 2 [example trees without and with parent annotation]

[Charts: F1 (roughly 72-79) and grammar size in symbols (up to about 25K) as a function of vertical Markov order 1, 2v, 2, 3v, 3]

Model F1 Size

v=h=2v 77.8 7.5K
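Order-2 vertical markovization (parent annotation) can be implemented as a preprocessing pass over the treebank trees before the rules are read off. A Python sketch over trees written as (label, child, child, ...) tuples (the tree representation is an assumption; real implementations often skip the POS level):

def parent_annotate(tree, parent=None):
    """Append the parent label to every node: an NP under S becomes NP^S."""
    if isinstance(tree, str):            # a word: leave unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, *(parent_annotate(c, label) for c in children))

Applied to ("S", ("NP", ("N", "people")), ("VP", ("V", "fish"))), this returns ("S", ("NP^S", ("N^NP", "people")), ("VP^S", ("V^VP", "fish"))); reading rules off the annotated trees then gives the order-2 grammar.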

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 77.8 7.5K

UNARY 78.3 8.0K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solution

bull Subdivide the IN tag

Annotation F1 Size

Previous 78.3 8.0K

SPLIT-IN 80.3 8.1K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U ("the X" vs "those")

bull UNARY-RB mark phrasal adverbs as RB^U ("quickly" vs "very")

bull TAG-PA mark tags with non-canonical parents ("not" is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with -AUX [cf Charniak 97]

bull SPLIT-CC separate "but" and "&" from other conjunctions

bull SPLIT-% "%" gets its own tag

F1 Size

80.4 8.1K

80.5 8.1K

81.2 8.5K

81.6 9.0K

81.7 9.1K

81.8 9.3K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examples

bull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 82.3 9.7K

POSS-NP 83.1 9.8K

SPLIT-VP 85.7 10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sites

bull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 85.7 10.5K

BASE-NP 86.0 11.7K

DOMINATES-V 86.9 14.1K

RIGHT-REC-NP 87.0 15.2K

[Diagram: NP, VP and PP attachment sites in a tree, marked v or -v according to whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 84.9 84.6 84.7

Collins 96 86.3 85.8 86.0

Klein & Manning 03 86.9 85.7 86.3

Charniak 97 87.4 87.5 87.4

Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Context Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
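The head tables themselves are not reproduced in this transcript; what they encode is, for each parent category, a scan direction and a priority list of child labels. A simplified Python sketch of that idea (the priority lists below are illustrative stand-ins, not Collins' actual tables):

HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NP", "JJ"]),
    "VP": ("left",  ["Vi", "Vt", "VB", "VBD", "VP"]),
    "PP": ("left",  ["IN", "P"]),
    "S":  ("left",  ["VP", "S"]),
}

def find_head(parent, children):
    """Return the index of the head child for a rule parent -> children."""
    direction, priorities = HEAD_RULES.get(parent, ("left", []))
    idxs = list(range(len(children)))
    if direction == "right":
        idxs.reverse()
    for label in priorities:             # highest-priority label wins
        for i in idxs:
            if children[i] == label:
                return i
    return idxs[0]                       # fallback: first child in scan direction

find_head("VP", ["Vt", "NP", "PP"]) returns 0, so the verb's word becomes the headword propagated up to the VP node.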

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: X[h] -> Y[h] Z[h'] built over span i..j, split at k, with head positions h and h']

e.g. (VP -> VBD[saw] NP[her]), i.e. the rule (VP -> VBD NP) headed by "saw"

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return the max over split points k, other head w, and rules X -> Y Z of:
      score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]

bull Essentially run the O(n^5) CKY

bull Remember only a few hypotheses for each span <i,j>

bull If we keep K hypotheses at each span then we do at most O(nK^2) work per span (why?)

bull Keeps things more or less cubic (a small pruning sketch follows the diagram below)

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram: X[h] -> Y[h] Z[h'] over span i..j with split point k]
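A per-cell beam can be as simple as keeping the K highest-scoring entries of a span's cell before that span is combined into larger ones. A Python sketch (the cell layout, a dictionary from (category, head) pairs to scores, is an assumption):

import heapq

def prune_cell(cell, k=10):
    """Keep only the k best (category, head) -> score entries for one span <i, j>."""
    if len(cell) <= k:
        return cell
    return dict(heapq.nlargest(k, cell.items(), key=lambda kv: kv[1]))

With at most K surviving entries per child span, combining two spans tries at most K × K pairs, which is where the O(nK^2) per-span bound above comes from.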

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 84.9 84.6 84.7

Collins 96 86.3 85.8 86.0

Klein & Manning 03 86.9 85.7 86.3

Charniak 97 87.4 87.5 87.4

Collins 99 88.7 88.6 88.6

Analysis / Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to improve statistical fit of the grammar

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categories

bull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantages

bull Fairly compact grammar

bull Linguistic motivations

bull Disadvantages

bull Performance leveled out

bull Manually annotated

Learning Latent Annotations

Latent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

[Diagram: parse tree over "He was right" with latent node labels X1 ... X7]

Can learn with EM, like Forward-Backward for HMMs (Forward ~ Outside, Backward ~ Inside)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein & Manning '03 86.3

Matsuzaki et al '05 86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate the loss in likelihood from removing each split as the ratio: (data likelihood with the split reversed) / (data likelihood with the split)

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 88.4

With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser  F1 (≤ 40 words)  F1 (all words)

Klein & Manning '03  86.3  85.7

Matsuzaki et al '05  86.7  86.1

Collins '99  88.6  88.2

Charniak & Johnson '05  90.1  89.6

Petrov et al 06  90.2  89.7

Hierarchical Pruning

coarse: ... QP NP VP ...

split in two: ... QP1 QP2 NP1 NP2 VP1 VP2 ...

split in four: ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...

split in eight: ... ... ... ... ... ... ... ...

Parse multiple times with grammars at different levels of granularity
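One common way to realize this is coarse-to-fine pruning: parse with the coarser grammar first, and allow a refined symbol in a chart cell only if its coarse projection had non-negligible posterior probability there. A Python sketch (the posterior table and the projection map are assumed inputs, not a real parser API):

def prune_chart(spans, fine_symbols, project, coarse_posterior, threshold=1e-4):
    """Decide which fine-grained symbols are allowed in each span.
    project[fine_symbol] gives its coarse symbol (e.g. "NP3" -> "NP");
    coarse_posterior[(i, j, coarse_symbol)] comes from the previous, coarser pass."""
    allowed = {}
    for (i, j) in spans:
        allowed[(i, j)] = [sym for sym in fine_symbols
                           if coarse_posterior.get((i, j, project[sym]), 0.0) > threshold]
    return allowed

Each pass only has to consider the symbols its predecessor left alive, which is what makes the final fine-grained pass affordable.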

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)

Page 3: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Two views of linguistic structure 2 Dependency structure

bull Dependency structure shows which words depend on (modify or are arguments of) which other words

The boy put the tortoise on the rug

rug

the

the

ontortoise

put

boy

The

Why Parse

bull Part of speech information

bull Phrase information

bull Useful relationships

8

The rise of annotated data

The Penn Treebank

( (S(NP-SBJ (DT The) (NN move))(VP (VBD followed)

(NP(NP (DT a) (NN round))(PP (IN of)(NP(NP (JJ similar) (NNS increases))(PP (IN by)

(NP (JJ other) (NNS lenders)))(PP (IN against)

(NP (NNP Arizona) (JJ real) (NN estate) (NNS loans))))))( )(S-ADV

(NP-SBJ (-NONE- ))(VP (VBG reflecting)(NP(NP (DT a) (VBG continuing) (NN decline))(PP-LOC (IN in)

(NP (DT that) (NN market)))))))( )))

[Marcus et al 1993 Computational Linguistics]

Penn Treebank Non-terminals

The rise of annotated data

bull Starting off building a treebank seems a lot slower and less useful than building a grammar

bull But a treebank gives us many thingsbull Reusability of the labor

bull Many parsers POS taggers etc

bull Valuable resource for linguistics

bull Broad coverage

bull Frequencies and distributional information

bull A way to evaluate systems

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications

bull High precision question answering [Pasca and Harabagiu SIGIR 2001]

bull Improving biological named entity finding [Finkel et al JNLPBA 2004]

bull Syntactically based sentence compression [Lin and Wilbur 2007]

bull Extracting opinions about products [Bloom et al NAACL 2007]

bull Improved interaction in computer games [Gorniak and Roy 2005]

bull Helping linguists find data [Resnik et al BLS 2005]

bull Source sentence analysis for machine translation [Xu et al 2009]

bull Relation extraction systems [Fundel et al Bioinformatics 2006]

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 4: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Why Parse

bull Part of speech information

bull Phrase information

bull Useful relationships

8

The rise of annotated data

The Penn Treebank

( (S(NP-SBJ (DT The) (NN move))(VP (VBD followed)

(NP(NP (DT a) (NN round))(PP (IN of)(NP(NP (JJ similar) (NNS increases))(PP (IN by)

(NP (JJ other) (NNS lenders)))(PP (IN against)

(NP (NNP Arizona) (JJ real) (NN estate) (NNS loans))))))( )(S-ADV

(NP-SBJ (-NONE- ))(VP (VBG reflecting)(NP(NP (DT a) (VBG continuing) (NN decline))(PP-LOC (IN in)

(NP (DT that) (NN market)))))))( )))

[Marcus et al 1993 Computational Linguistics]

Penn Treebank Non-terminals

The rise of annotated data

bull Starting off building a treebank seems a lot slower and less useful than building a grammar

bull But a treebank gives us many thingsbull Reusability of the labor

bull Many parsers POS taggers etc

bull Valuable resource for linguistics

bull Broad coverage

bull Frequencies and distributional information

bull A way to evaluate systems

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications

bull High precision question answering [Pasca and Harabagiu SIGIR 2001]

bull Improving biological named entity finding [Finkel et al JNLPBA 2004]

bull Syntactically based sentence compression [Lin and Wilbur 2007]

bull Extracting opinions about products [Bloom et al NAACL 2007]

bull Improved interaction in computer games [Gorniak and Roy 2005]

bull Helping linguists find data [Resnik et al BLS 2005]

bull Source sentence analysis for machine translation [Xu et al 2009]

bull Relation extraction systems [Fundel et al Bioinformatics 2006]

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

The chart for "fish people fish tanks" (word positions 0-4), filled with the grammar
above. Final Viterbi scores in each cell score[begin][end], with the rule that
produced the best score for each label:

Single-word spans (lexicon, then unaries):
  [0,1] fish:    N 0.2   V 0.6   NP → N 0.14   VP → V 0.06   S → VP 0.006
  [1,2] people:  N 0.5   V 0.1   NP → N 0.35   VP → V 0.01   S → VP 0.001
  [2,3] fish:    N 0.2   V 0.6   NP → N 0.14   VP → V 0.06   S → VP 0.006
  [3,4] tanks:   N 0.2   V 0.3   NP → N 0.14   VP → V 0.03   S → VP 0.003

Two-word spans (binary rules, then unaries):
  [0,2]: NP → NP NP 0.0049    VP → V NP 0.105   S → VP 0.0105
  [1,3]: NP → NP NP 0.0049    VP → V NP 0.007   S → NP VP 0.0189
  [2,4]: NP → NP NP 0.00196   VP → V NP 0.042   S → VP 0.0042

Three-word spans:
  [0,3]: NP → NP NP 0.0000686   VP → V NP 0.00147    S → NP VP 0.000882
  [1,4]: NP → NP NP 0.0000686   VP → V NP 0.000098   S → NP VP 0.01323

Whole sentence:
  [0,4]: NP → NP NP 0.0000009604   VP → V NP 0.00002058   S → NP VP 0.00018522

Call buildTree(score, back) to get the best parse.
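As a cross-check of the chart above, here is a compact, runnable Python sketch (not part of the slides) of the Viterbi CKY recursion with the unary loop; it uses the toy grammar above and fills only the score chart, so backpointers and buildTree are omitted.

from collections import defaultdict

lexicon = {          # P(tag -> word)
    ("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
    ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3,
    ("P", "with"): 1.0,
}
unary = {("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7}            # A -> B
binary = {("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5,                 # A -> B C
          ("VP", "V", "VP_V"): 0.3, ("VP", "V", "PP"): 0.1,
          ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0}

def apply_unaries(cell):
    # Keep applying unary rules while some label's best score improves.
    added = True
    while added:
        added = False
        for (A, B), p in unary.items():
            if cell[B] > 0 and p * cell[B] > cell[A]:
                cell[A] = p * cell[B]
                added = True

def cky(words):
    n = len(words)
    score = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):                      # lexical cells
        for (A, word), p in lexicon.items():
            if word == w:
                score[i][i + 1][A] = p
        apply_unaries(score[i][i + 1])
    for span in range(2, n + 1):                       # larger spans
        for begin in range(0, n - span + 1):
            end = begin + span
            cell = score[begin][end]
            for split in range(begin + 1, end):        # binary rules
                for (A, B, C), p in binary.items():
                    prob = score[begin][split][B] * score[split][end][C] * p
                    if prob > cell[A]:
                        cell[A] = prob
            apply_unaries(cell)
    return score

chart = cky("fish people fish tanks".split())
print({A: p for A, p in chart[0][4].items() if p > 0})
# whole-sentence cell, e.g. S around 0.00018522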

Evaluating constituency parsing

Gold standard brackets:
  S-(0,11)  NP-(0,2)  VP-(2,9)  VP-(3,9)  NP-(4,6)  PP-(6,9)  NP-(7,9)  NP-(9,10)

Candidate brackets:
  S-(0,11)  NP-(0,2)  VP-(2,10)  VP-(3,10)  NP-(4,6)  PP-(6,10)  NP-(7,10)

Labeled Precision:   3/7 = 42.9%
Labeled Recall:      3/8 = 37.5%
LP/LR F1:            40.0%
Tagging Accuracy:    11/11 = 100.0%
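The bracket counting above can be reproduced in a few lines of Python; this sketch (not from the slides) treats brackets as (label, start, end) triples.

from collections import Counter

def parseval(gold, candidate):
    # Compare bracket multisets so repeated brackets are counted correctly.
    g, c = Counter(gold), Counter(candidate)
    matched = sum((g & c).values())
    precision = matched / sum(c.values())
    recall = matched / sum(g.values())
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)]
cand = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)]
print(parseval(gold, cand))   # approximately (0.429, 0.375, 0.400)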

How good are PCFGs?

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Give a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
  • (A word is independent of the rest of the tree given its POS)

A Case of PP Attachment Ambiguity
[tree diagrams omitted]

A Case of Coordination Ambiguity
[tree diagrams omitted]

Structural Preferences: Close Attachment
[tree diagrams omitted]

• Example: John was believed to have been shot by Bill
• The low attachment analysis (Bill does the shooting) contains the same rules
  as the high attachment analysis (Bill does the believing)
  • The two analyses receive the same probability

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
  • At any node, the material inside that node is independent of the material
    outside that node, given the label of that node
  • Any information that statistically connects behavior inside and outside a
    node must flow through that node's label
• E.g. once a subtree is labeled NP, rules like S → NP VP above it and
  NP → DT NN inside it are chosen independently of each other

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP
  (i.e. subjects vs. objects):

                  NP → PP    NP → DT NN    NP → PRP
  All NPs           11%          9%           6%
  NPs under S        9%          9%          21%
  NPs under VP      23%          7%           4%

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
  • [figure omitted] In the PTB, this construction is for possessives

Advanced Unlexicalized Parsing

Horizontal Markovization

• Horizontal Markovization merges states

[Charts omitted: parsing F1 (roughly 70-74) and number of grammar symbols
(roughly 0-12,000) as a function of the horizontal Markov order: 0, 1, 2v, 2, inf]

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes
  (i.e. parent annotation)

[Charts omitted: parsing F1 (roughly 72-79) and number of grammar symbols
(roughly 0-25,000) as a function of the vertical Markov order: 1, 2v, 2, 3v, 3]

Model      F1     Size
v=h=2v     77.8   7.5K
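Parent annotation (vertical order 2) is just a tree transform; the sketch below (not from the slides) shows one way to write it, leaving the POS tags unannotated since annotating tags with their parents is the separate TAG-PA option discussed later.

def parent_annotate(tree, parent=None):
    """tree: (label, children), where children is a list of subtrees or a word string.
    Phrasal labels are split by their parent label, e.g. an NP under S becomes NP^S."""
    label, children = tree
    if isinstance(children, str):                 # preterminal: keep tag and word as-is
        return (label, children)
    new_label = label if parent is None else f"{label}^{parent}"
    return (new_label, [parent_annotate(c, label) for c in children])

t = ("S", [("NP", [("PRP", "She")]),
           ("VP", [("VBD", "saw"), ("NP", [("DT", "the"), ("NN", "man")])])])
print(parent_annotate(t))
# ('S', [('NP^S', [('PRP', 'She')]), ('VP^S', [('VBD', 'saw'), ('NP^VP', ...)])])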

Unary Splits

• Problem: unary rewrites are used to transmute categories so a
  high-probability rule can be used
• Solution: mark unary rewrite sites with -U

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits

• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating
  conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution: subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

Annotation   F1     Size
UNARY-DT     80.4   8.1K
UNARY-RB     80.5   8.1K
TAG-PA       81.2   8.5K
SPLIT-AUX    81.6   9.0K
SPLIT-CC     81.7   9.1K
SPLIT-%      81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside
  its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

A Fully Annotated Tree
[tree diagram omitted]

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Lexicalised PCFGs

Heads in Context-Free Rules
Heads
Rules to Recover Heads: An Example for NPs
Rules to Recover Heads: An Example for VPs
Adding Headwords to Trees
Lexicalized CFGs in Chomsky Normal Form
Example

[The head-rule tables and worked examples on these slides are figures and are omitted]
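Since the head-rule tables themselves are not reproduced here, the sketch below (not from the slides) only illustrates the mechanism with deliberately simplified priority lists; real parsers use the Magerman/Collins head tables.

HEAD_RULES = {
    # label: (search direction, priority list of child labels), simplified for illustration
    "NP": ("right", ["NN", "NNS", "NNP", "NP"]),
    "VP": ("left",  ["VBD", "VBZ", "VB", "VP"]),
    "PP": ("left",  ["IN", "P"]),
    "S":  ("left",  ["VP"]),
}

def find_head(label, child_labels):
    """Return the index of the head child of a rule label -> child_labels."""
    direction, priorities = HEAD_RULES.get(label, ("left", []))
    indices = list(range(len(child_labels)))
    if direction == "right":
        indices.reverse()
    for wanted in priorities:          # first matching priority wins
        for i in indices:
            if child_labels[i] == wanted:
                return i
    return indices[0]                  # fallback: first child in search order

print(find_head("VP", ["VBD", "NP", "PP"]))   # -> 0  (the verb heads the VP)
print(find_head("NP", ["DT", "NN"]))          # -> 1  (the noun heads the NP)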

Lexicalized CKY

Each constituent X[h] over a span (i, j) carries a head word h; a binary rule
combines Y[h] over (i, k) and Z[h'] over (k, j), with the head passed up from
one of the two children, e.g. (VP → VBD NP)[saw] built from VBD[saw] and NP[her].

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return max over split points k, other head words w, and rules X → Y Z of:
      score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i,j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per
    span (why?)
  • Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation
  (crucial for speed)
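A per-cell beam can be sketched in a few lines (not from the slides): each span's cell keeps only its K best (label, head word) hypotheses before larger spans are built.

import heapq

def prune_cell(cell, K=10):
    """cell: dict mapping (label, head) -> score; keep only the K highest-scoring."""
    best = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(best)

cell = {("NP", "man"): 0.02, ("NP", "telescope"): 0.0004, ("VP", "saw"): 0.015}
print(prune_cell(cell, K=2))   # keeps the two best hypotheses for this span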

Parameter Estimation

A Model from Charniak (1997)

Other Details

[The model equations and examples on these slides are figures and are omitted]

Final Test Set Results

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Analysis / Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

[The tables and figures on these slides are omitted]

The Game of Designing a Grammar

• Annotation refines base treebank symbols to improve the statistical fit of
  the grammar
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering?

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories
  [figure omitted: a tree with latent symbols X1 … X7 over "He was right"]
• Can learn with EM, like Forward-Backward for HMMs: Backward/Inside and
  Forward/Outside probabilities over the tree

Automatic Annotation Induction

• Label all nodes with latent variables; the same number k of subcategories
  for all categories
• Advantages
  • Automatically learned
• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

DT → DT-1, DT-2, DT-3, DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories
• Start with two subcategories per non-terminal, then keep splitting
• Initialize each EM run with the output of the last
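The "start with two and keep splitting" initialization can be sketched as follows (not from the slides). It uses the usual trick of dividing each rule's probability mass uniformly over the split versions plus a little symmetry-breaking noise, and glosses over renormalization across rules with the same parent; it is not the exact Berkeley-parser procedure.

import itertools, random

def split_grammar(binary_rules, k=2, noise=0.01):
    """binary_rules: dict (A, B, C) -> prob; returns the split rule dict."""
    out = {}
    for (A, B, C), p in binary_rules.items():
        for a, b, c in itertools.product(range(k), repeat=3):
            share = p / (k * k)        # each parent subcategory splits p over k*k child pairs
            out[(f"{A}-{a}", f"{B}-{b}", f"{C}-{c}")] = share * (1 + random.uniform(-noise, noise))
    return out

print(len(split_grammar({("S", "NP", "VP"): 0.9}, k=2)))   # 8 split versions of one rule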

Adaptive Splitting    [Petrov et al. 06]

• Want to split complex categories more
• Idea: split everything, then roll back the splits which were least useful
• Evaluate the loss in likelihood from removing each split:
  (data likelihood with the split reversed) / (data likelihood with the split)
• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model               F1
Previous            88.4
With 50% Merging    89.5

Number of Phrasal Subcategories
[chart omitted]

Final Results

Parser                    F1 (<= 40 words)   F1 (all words)
Klein & Manning '03       86.3               85.7
Matsuzaki et al. '05      86.7               86.1
Collins '99               88.6               88.2
Charniak & Johnson '05    90.1               89.6
Petrov et al. 06          90.2               89.7

Hierarchical Pruning

Parse multiple times with grammars at different levels of granularity:

  coarse:          … QP  NP  VP …
  split in two:    … QP1 QP2   NP1 NP2   VP1 VP2 …
  split in four:   … QP1 QP2 QP3 QP4   NP1 NP2 NP3 NP4   VP1 VP2 VP3 VP4 …
  split in eight:  … … … … … … … …
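A sketch (not from the slides) of the pruning decision: spans that score too poorly under the coarser grammar are blocked before parsing with the next, more split grammar. Real systems threshold bracket posteriors from inside-outside; here a Viterbi score stands in for simplicity.

def allowed_spans(coarse_chart, threshold=1e-8):
    """coarse_chart[(i, j)] maps coarse labels to scores for span (i, j);
    return the set of spans the finer grammar is allowed to build."""
    allowed = set()
    for (i, j), cell in coarse_chart.items():
        if cell and max(cell.values()) >= threshold:
            allowed.add((i, j))
    return allowed

coarse = {(0, 2): {"NP": 3e-3}, (1, 3): {"VP": 2e-11}, (0, 4): {"S": 1e-4}}
print(allowed_spans(coarse))   # {(0, 2), (0, 4)}: span (1, 3) is pruned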

Bracket Posteriors

[Charts omitted: posterior bracket charts at increasing levels of refinement,
with parsing times of 1621 min, 111 min, 35 min, and 15 min (91.2 F1, no search error)]

Page 5: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

The rise of annotated data

The Penn Treebank

( (S(NP-SBJ (DT The) (NN move))(VP (VBD followed)

(NP(NP (DT a) (NN round))(PP (IN of)(NP(NP (JJ similar) (NNS increases))(PP (IN by)

(NP (JJ other) (NNS lenders)))(PP (IN against)

(NP (NNP Arizona) (JJ real) (NN estate) (NNS loans))))))( )(S-ADV

(NP-SBJ (-NONE- ))(VP (VBG reflecting)(NP(NP (DT a) (VBG continuing) (NN decline))(PP-LOC (IN in)

(NP (DT that) (NN market)))))))( )))

[Marcus et al 1993 Computational Linguistics]

Penn Treebank Non-terminals

The rise of annotated data

bull Starting off building a treebank seems a lot slower and less useful than building a grammar

bull But a treebank gives us many thingsbull Reusability of the labor

bull Many parsers POS taggers etc

bull Valuable resource for linguistics

bull Broad coverage

bull Frequencies and distributional information

bull A way to evaluate systems

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications

bull High precision question answering [Pasca and Harabagiu SIGIR 2001]

bull Improving biological named entity finding [Finkel et al JNLPBA 2004]

bull Syntactically based sentence compression [Lin and Wilbur 2007]

bull Extracting opinions about products [Bloom et al NAACL 2007]

bull Improved interaction in computer games [Gorniak and Roy 2005]

bull Helping linguists find data [Resnik et al BLS 2005]

bull Source sentence analysis for machine translation [Xu et al 2009]

bull Relation extraction systems [Fundel et al Bioinformatics 2006]

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 6: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Penn Treebank Non-terminals

The rise of annotated data

bull Starting off building a treebank seems a lot slower and less useful than building a grammar

bull But a treebank gives us many thingsbull Reusability of the labor

bull Many parsers POS taggers etc

bull Valuable resource for linguistics

bull Broad coverage

bull Frequencies and distributional information

bull A way to evaluate systems

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications

bull High precision question answering [Pasca and Harabagiu SIGIR 2001]

bull Improving biological named entity finding [Finkel et al JNLPBA 2004]

bull Syntactically based sentence compression [Lin and Wilbur 2007]

bull Extracting opinions about products [Bloom et al NAACL 2007]

bull Improved interaction in computer games [Gorniak and Roy 2005]

bull Helping linguists find data [Resnik et al BLS 2005]

bull Source sentence analysis for machine translation [Xu et al 2009]

bull Relation extraction systems [Fundel et al Bioinformatics 2006]

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

• You should think of this as a transformation for efficient parsing
• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform
• In practice full Chomsky Normal Form is a pain
  • Reconstructing n-aries is easy
  • Reconstructing unaries/empties is trickier
• Binarization is crucial for cubic time CFG parsing
• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

An example before binarization…

    (ROOT (S (NP (N people))
             (VP (V fish) (NP (N tanks)) (PP (P with) (NP (N rods))))))

After binarization…

    (ROOT (S (NP (N people))
             (VP (V fish) (VP_V (NP (N tanks)) (PP (P with) (NP (N rods)))))))

Treebank empties and unaries

PTB Tree:     (ROOT (S-HLN (NP-SUBJ (-NONE- e)) (VP (VB Atone))))
NoFuncTags:   (ROOT (S (NP (-NONE- e)) (VP (VB Atone))))
NoEmpties:    (ROOT (S (VP (VB Atone))))
NoUnaries:    (ROOT (S Atone))  [high]   or   (ROOT (VB Atone))  [low]

Parsing

66

Constituency Parsing

fish people fish tanks

PCFG:
Rule            Prob
S  → NP VP      θ0
NP → NP NP      θ1
…
N → fish        θ42
N → people      θ43
V → fish        θ44
…

[Tree over the sentence: (S (NP (N fish) (N people)) (VP (V fish) (NP (N tanks))))]

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (max) scores, e.g. for the cells over "people" and "fish":
    people: N 0.5, V 0.1, NP 0.35
    fish:   N 0.2, V 0.6, NP 0.14, VP 0.06

S  → NP VP     0.9
S  → VP        0.1
VP → V NP      0.5
VP → V         0.1
VP → V VP_V    0.3
VP → V PP      0.1
VP_V → NP PP   1.0
NP → NP NP     0.1
NP → NP PP     0.2
NP → N         0.7
PP → P NP      1.0
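For later reference, the same grammar can be written down as plain Python dictionaries (the variable names q_bin, q_unary, and q_lex are ours); the sketches further below reuse this format:

# The toy grammar above as Python dictionaries; lexical rules omit decimals nowhere.
q_bin = {("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5, ("VP", "V", "VP_V"): 0.3,
         ("VP", "V", "PP"): 0.1, ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
         ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0}
q_unary = {("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7}
q_lex = {("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
         ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3, ("P", "with"): 1.0}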

Extended CKY parsing

• Unaries can be incorporated into the algorithm
  • Messy, but doesn't increase algorithmic complexity
• Empties can be incorporated
  • Use fenceposts
  • Doesn't increase complexity; essentially like unaries
• Binarization is vital
  • Without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and X -> Y Z of
           q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)
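This recursion translates almost line for line into memoized Python. A sketch for the purely binary case only (no unaries, no backpointers), with q_lex[(X, word)] and q_bin[(X, Y, Z)] holding lexical and binary rule probabilities in the dictionary format shown a few slides back:

from functools import lru_cache

def make_parser(q_lex, q_bin):
    """Return a bestScore(X, i, j, sentence) function for the given grammar."""

    @lru_cache(maxsize=None)
    def _best(X, i, j, s):
        if i == j:                                   # single-word span
            return q_lex.get((X, s[i]), 0.0)
        best = 0.0
        for (A, Y, Z), p in q_bin.items():           # every binary rule A -> Y Z with A == X
            if A != X:
                continue
            for k in range(i, j):                    # every split point
                cand = p * _best(Y, i, k, s) * _best(Z, k + 1, j, s)
                best = max(best, cand)
        return best

    def best_score(X, i, j, sentence):
        return _best(X, i, j, tuple(sentence))       # tuple so the cache can hash it

    return best_score

With the toy grammar above you would also need the unary rules (handled in a later sketch) before whole sentences score above zero; this covers only the binary backbone of the recurrence.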

function CKY(words, grammar) returns [most_probable_parse, prob]

  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]

  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965) … extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true

  return buildTree(score, back)

The CKY algorithm (1960/1965) … extended to unaries
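The "handle unaries" block above is just a fixed-point closure over unary rules within a single cell. That step in isolation, as a sketch (cell maps a nonterminal to its best probability so far; unary_rules[(A, B)] = P(A → B), as in the q_unary dictionary earlier):

def apply_unary_closure(cell, unary_rules):
    """Repeatedly apply unary rules A -> B inside one chart cell until
    no entry improves, mirroring the 'handle unaries' loop of CKY."""
    added = True
    while added:
        added = False
        for (A, B), p in unary_rules.items():
            if B in cell:
                prob = p * cell[B]
                if prob > cell.get(A, 0.0):
                    cell[A] = prob
                    added = True
    return cell

# e.g. the cell over the word "fish": {'N': 0.2, 'V': 0.6}
# with NP -> N 0.7, VP -> V 0.1, S -> VP 0.1 this converges to
# {'N': 0.2, 'V': 0.6, 'NP': 0.14, 'VP': 0.06, 'S': 0.006},
# matching the values in the worked chart below.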

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
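The bracket scoring itself is a few lines; a sketch that reproduces the numbers above, with brackets encoded as (label, start, end) triples:

def prf1(gold, guess):
    """Labeled precision, recall and F1 over bracket sets."""
    gold, guess = set(gold), set(guess)
    correct = len(gold & guess)
    p = correct / len(guess)
    r = correct / len(gold)
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1

gold  = {("S",0,11), ("NP",0,2), ("VP",2,9),  ("VP",3,9),  ("NP",4,6),
         ("PP",6,9),  ("NP",7,9), ("NP",9,10)}
guess = {("S",0,11), ("NP",0,2), ("VP",2,10), ("VP",3,10), ("NP",4,6),
         ("PP",6,10), ("NP",7,10)}
print(prf1(gold, guess))   # precision 3/7 = 0.429, recall 3/8 = 0.375, F1 = 0.4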

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

• Example: John was believed to have been shot by Bill

• Low attachment analysis (Bill does the shooting) contains the same rules as high attachment analysis (Bill does the believing)
  • Two analyses receive the same probability

94

PCFGs and Independence

• The symbols in a PCFG define independence assumptions

• At any node, the material inside that node is independent of the material outside that node, given the label of that node

• Any information that statistically connects behavior inside and outside a node must flow through that node's label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

• The independence assumptions of a PCFG are often too strong

• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects)

[Chart: distribution of NP expansions]
Expansion   All NPs   NPs under S   NPs under VP
NP PP       11%       9%            23%
DT NN       9%        9%            7%
PRP         6%        21%           4%

Non-Independence II

• Symptoms of overly strong assumptions:
• Rewrites get used where they don't belong
• (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[Charts: parsing accuracy (roughly 70-74 F1) and grammar size (0-12,000 symbols) as a function of horizontal Markov order: 0, 1, 2v, 2, ∞]

Vertical Markovization

• Vertical Markov order: rewrites depend on past k ancestor nodes (i.e., parent annotation); Order 1 vs. Order 2

[Charts: parsing accuracy (roughly 72-79 F1) and grammar size (0-25,000 symbols) as a function of vertical Markov order: 1, 2v, 2, 3v, 3]

Model     F1     Size
v=h=2v    77.8   7.5K
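To make the two annotations concrete, here is a small illustrative sketch of how they change symbol names during binarization. The ^parent and @X->..._sibling naming conventions are typical of unlexicalized parsers, but the exact strings below are our own choice:

def parent_annotate(label, parent):
    """Vertical Markov order 2: annotate a symbol with its parent,
    so an NP under a VP becomes NP^VP."""
    return f"{label}^{parent}"

def horiz_markov_symbols(lhs, rhs, order=1):
    """Intermediate symbols created while binarizing lhs -> rhs left-to-right,
    remembering only the last `order` siblings already generated."""
    symbols = []
    for i in range(1, len(rhs) - 1):
        context = rhs[max(0, i - order):i]           # older siblings are forgotten
        symbols.append("@" + lhs + "->..._" + "_".join(context))
    return symbols

print(parent_annotate("NP", "VP"))
# NP^VP
print(horiz_markov_symbols("NP", ["DT", "JJ", "JJ", "JJ", "NN"], order=1))
# ['@NP->..._DT', '@NP->..._JJ', '@NP->..._JJ']  (the two JJ-context symbols merge)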

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

• Solution: mark unary rewrite sites with -U

Tag Splits

• Problem: Treebank tags are too coarse

• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN

• Partial Solution:
  • Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")

• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")

• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]

• SPLIT-CC: separate "but" and "&" from other conjunctions

• SPLIT-%: "%" gets its own tag

F1     Size
80.4   8.1K
80.5   8.1K
81.2   8.5K
81.6   9.0K
81.7   9.1K
81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield

• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads

• Solution: annotate future elements into nodes

Annotation    F1     Size
tag splits    82.3   9.7K
POSS-NP       83.1   9.8K
SPLIT-VP      85.7   10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

[Diagram: an NP attachment site dominating a verb (v) vs. one that does not (-v)]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
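The actual head tables are shown as figures on these slides. As an illustration only, a Collins-style head finder for NPs could look like the following simplified sketch (the priority list here is abbreviated, not the full published table):

def np_head_child(children):
    """Pick the head child of an NP, Collins-style (simplified sketch):
    prefer the rightmost NN/NNS/NNP/NNPS, else the rightmost NP,
    else fall back to the last child."""
    for tag_set in (("NN", "NNS", "NNP", "NNPS"), ("NP",)):
        for child in reversed(children):
            if child in tag_set:
                return child
    return children[-1]

print(np_head_child(["DT", "JJ", "NN"]))     # NN
print(np_head_child(["NP", "PP"]))           # NP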

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: build X[h] over span (i, j) from Y[h] over (i, k) and Z[h'] over (k, j); e.g. (VP -> VBD[saw] NP[her]) gives (VP->VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return max(
      max over k, w, X -> Y Z of
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w),
      max over k, w, X -> Y Z of
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h) )

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i, j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram: X[h] over span (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j)]
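A sketch of the per-cell beam itself: keep only the K best labeled hypotheses for a span before longer spans can use them (the cell representation below is illustrative):

import heapq

def prune_cell(cell, K=10):
    """Keep only the K highest-scoring hypotheses in one chart cell;
    everything else is dropped before longer spans build on this one."""
    if len(cell) <= K:
        return cell
    kept = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

# cell maps (label, head_index) -> best score for this span
cell = {("NP", 2): 0.01, ("VP", 1): 0.004, ("S", 1): 0.0001, ("NP", 3): 2e-6}
print(prune_cell(cell, K=2))   # keeps the two best: ('NP', 2) and ('VP', 1)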

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Analysis / Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
  Parent annotation [Johnson '98]
  Head lexicalization [Collins '99, Charniak '00]
  Automatic clustering

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional

• Advantages
  • Fairly compact grammar
  • Linguistic motivations

• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent Annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: parse tree over "He was right." with latent subcategory variables X1 … X7 at the nodes]

Can learn with EM, like Forward-Backward for HMMs (Forward/Outside, Backward/Inside)

Automatic Annotation Induction

Label all nodes with latent variables; same number k of subcategories for all categories

• Advantages
  • Automatically learned

• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate loss in likelihood from removing each split =
      (data likelihood with split reversed) / (data likelihood with split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al. '05      86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al. 06          90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … … … … …

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)

Statistical parsing applications

Statistical parsers are now robust and widely used in larger NLP applications

bull High precision question answering [Pasca and Harabagiu SIGIR 2001]

bull Improving biological named entity finding [Finkel et al JNLPBA 2004]

bull Syntactically based sentence compression [Lin and Wilbur 2007]

bull Extracting opinions about products [Bloom et al NAACL 2007]

bull Improved interaction in computer games [Gorniak and Roy 2005]

bull Helping linguists find data [Resnik et al BLS 2005]

bull Source sentence analysis for machine translation [Xu et al 2009]

bull Relation extraction systems [Fundel et al Bioinformatics 2006]

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

• How do we work out the correct attachment?
  • She saw the man with a telescope
• Is the problem 'AI complete'? Yes, but …
• Words are good predictors of attachment
  • Even absent full understanding
  • Moscow sent more than 100,000 soldiers into Afghanistan …
  • Sydney Water breached an agreement with NSW Health …
• Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic – or stochastic – context-free grammars (PCFGs)

• G = (T, N, S, R, P)
  • T is a set of terminal symbols
  • N is a set of nonterminal symbols
  • S is the start symbol (S ∈ N)
  • R is a set of rules/productions of the form X → γ
  • P is a probability function
    • P: R → [0, 1]
    • for each X ∈ N, ∑_{X→γ ∈ R} P(X → γ) = 1
• A grammar G generates a language model L: ∑_{γ ∈ T*} P(γ) = 1

PCFG Example: A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0

Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

• Probability of a tree t with rules α1 → β1, α2 → β2, ..., αn → βn is

p(t) = ∏_{i=1}^{n} q(αi → βi)

where q(α → β) is the probability for rule α → β

44
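As a concrete illustration (added here, not from the slides), the product above can be computed directly from a list of the rules used in a tree; the rule list below is the parse of "The man sleeps" under this example grammar.

# q maps each rule to its probability, as in the example PCFG above.
q = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "NN")): 0.3,
    ("VP", ("Vi",)): 0.4,
    ("DT", ("the",)): 1.0,
    ("NN", ("man",)): 0.7,
    ("Vi", ("sleeps",)): 1.0,
}

def tree_prob(rules):
    """p(t) = product of q(alpha_i -> beta_i) over the rules used in t."""
    p = 1.0
    for rule in rules:
        p *= q[rule]
    return p

t1 = [("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("DT", ("the",)),
      ("NN", ("man",)), ("VP", ("Vi",)), ("Vi", ("sleeps",))]
print(tree_prob(t1))   # 1.0 * 0.3 * 1.0 * 0.7 * 0.4 * 1.0 ≈ 0.084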

Example of a PCFG

48

Probability of a Parse

The man sleeps

The man saw the woman with the telescope

[Tree figures for the two parses above, annotated with rule probabilities:
t1: parse of "The man sleeps" (S → NP VP, NP → DT NN, DT → the, NN → man, VP → Vi, Vi → sleeps), with p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0
t2: parse of "The man saw the woman with the telescope", with p(t2) likewise the product of the probabilities of its rules]

PCFGs Learning and Inference

Model: The probability of a tree t with n rules αi → βi, i = 1..n, is p(t) = ∏_{i=1}^{n} q(αi → βi).

Learning: Read the rules off of labeled sentences, use ML estimates for probabilities, and use all of our standard smoothing tricks.

Inference: For input sentence s, define T(s) to be the set of trees whose yield is s (whose leaves, read left to right, match the words in s).
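A minimal sketch (mine, not from the slides) of the learning step: the maximum-likelihood estimate of q(X → γ) is just count(X → γ) / count(X), before any smoothing.

from collections import Counter

def mle_rule_probs(trees):
    """trees: each tree is a list of (lhs, rhs) rules read off a treebank parse.
    Returns q(lhs -> rhs) = count(lhs -> rhs) / count(lhs)."""
    rule_counts, lhs_counts = Counter(), Counter()
    for tree in trees:
        for lhs, rhs in tree:
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

# Toy treebank of two tiny trees (illustrative only):
trees = [
    [("S", ("NP", "VP")), ("NP", ("N",)), ("VP", ("V", "NP")), ("NP", ("N",))],
    [("S", ("NP", "VP")), ("NP", ("N",)), ("VP", ("V",))],
]
print(mle_rule_probs(trees))
# e.g. q(VP -> V NP) = 1/2, q(VP -> V) = 1/2, q(NP -> N) = 1.0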

Grammar Transforms

51

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w
  • X, Y, Z ∈ N and w ∈ Σ
• A transformation to this form doesn't change the weak generative capacity of a CFG
  • That is, it recognizes the same language
  • But maybe with different trees
• Empties and unaries are removed recursively
• n-ary rules are divided by introducing new nonterminals (n > 2)
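For illustration (added here, not from the slides), a small sketch of the n-ary step; the '@'-style intermediate symbol names are just one possible convention.

def binarize_rule(lhs, rhs):
    """Turn X -> A B C D into X -> A @X_1, @X_1 -> B @X_2, @X_2 -> C D."""
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, current = [], lhs
    for i, child in enumerate(rhs[:-2]):
        new_sym = "@%s_%d" % (lhs, i + 1)          # fresh intermediate nonterminal
        rules.append((current, (child, new_sym)))
        current = new_sym
    rules.append((current, (rhs[-2], rhs[-1])))
    return rules

print(binarize_rule("VP", ["V", "NP", "PP"]))
# [('VP', ('V', '@VP_1')), ('@VP_1', ('NP', 'PP'))]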

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

• In practice, full Chomsky Normal Form is a pain
  • Reconstructing n-aries is easy
  • Reconstructing unaries/empties is trickier
• Binarization is crucial for cubic time CFG parsing
• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

[Tree figure: parse of "people fish tanks with rods" before binarization, with a flat VP → V NP PP]

An example before binarization…

[Tree figure: the same parse after binarization, with VP → V VP_V and VP_V → NP PP]

After binarization…

Treebank empties and unaries

[Tree figures for the word "Atone", showing the same parse under successive transformations: the original PTB tree (ROOT, S-HLN, NP-SUBJ containing an empty -NONE- element e, VP, VB), then NoFuncTags, NoEmpties, NoUnaries keeping the highest label, and NoUnaries keeping the lowest label]

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 0.35, V 0.1, N 0.5

VP 0.06, NP 0.14, V 0.6, N 0.2

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

• Unaries can be incorporated into the algorithm
  • Messy, but doesn't increase algorithmic complexity
• Empties can be incorporated
  • Use fenceposts
  • Doesn't increase complexity; essentially like unaries
• Binarization is vital
  • Without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X → s[i])
  else
    return max over split k and rules X → Y Z of
      q(X → Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)
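As written, this recursion recomputes the same (X, i, j) sub-problems exponentially often; memoizing them is exactly what the bottom-up CKY algorithm shown next does. A rough Python rendering (mine, assuming a toy binary grammar and lexicon) might look like:

from functools import lru_cache

# Illustrative grammar format (binary rules only) and lexicon:
# grammar[X] = list of (Y, Z, prob); lexicon[(X, word)] = prob
grammar = {"S": [("NP", "VP", 0.9)], "VP": [("V", "NP", 0.5)], "NP": [("NP", "NP", 0.1)]}
lexicon = {("NP", "fish"): 0.2, ("NP", "people"): 0.5, ("V", "fish"): 0.6}

def best_score(X, i, j, s):
    @lru_cache(maxsize=None)
    def rec(X, i, j):
        if i == j:                                   # single word: use the lexicon
            return lexicon.get((X, s[i]), 0.0)
        best = 0.0
        for Y, Z, p in grammar.get(X, []):           # try every rule X -> Y Z
            for k in range(i, j):                    # and every split point
                best = max(best, p * rec(Y, i, k) * rec(Z, k + 1, j))
        return best
    return rec(X, i, j)

print(best_score("S", 0, 2, ["people", "fish", "fish"]))   # 0.027 with this toy grammar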

function CKY(words, grammar) returns [most_probable_parse, prob]
  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]
  for i=0; i<#(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    //handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A->B in grammar
          prob = P(A->B)*score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965)… extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B]*score[split][end][C]*P(A->BC)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      //handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A->B)*score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true
  return buildTree(score, back)

The CKY algorithm (19601965)hellip extended to unaries

The grammar: Binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob > score[begin][end][A])

score[begin][end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob > score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob > score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob > score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0

Tagging Accuracy: 11/11 = 100.0%
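The bracket arithmetic above is easy to reproduce; a small helper (added here, not from the slides) that scores candidate labelled spans against the gold standard:

def parseval(gold, candidate):
    """gold, candidate: sets of (label, start, end) brackets."""
    matched = len(gold & candidate)
    precision = matched / len(candidate)
    recall = matched / len(gold)
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print(parseval(gold, cand))   # (0.4285..., 0.375, 0.4)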

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

• Low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)
  • Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects)

[Chart, approximately:

             All NPs   NPs under S   NPs under VP
  NP PP      11%       9%            23%
  DT NN      9%        9%            7%
  PRP        6%        21%           4%  ]

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
  • (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

[Charts: parsing F1 (roughly 70-74) and number of grammar symbols (up to ~12,000) as a function of horizontal Markov order: 0, 1, 2v, 2, inf]
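One way to see the merging (sketch added here, not from the slides): with horizontal Markov order h, the intermediate symbols created during binarization remember only the last h siblings already generated, so many distinct full histories collapse onto the same state. The symbol-naming convention below is made up for illustration.

def markovized_symbols(parent, children, h):
    """Intermediate symbol names for a right-factored rule, remembering
    at most the last h already-generated siblings."""
    symbols, seen = [], []
    for child in children[:-1]:
        seen.append(child)
        history = seen[-h:] if h > 0 else []
        symbols.append("@%s->...%s" % (parent, "_".join(history)))
    return symbols

# A large h keeps full histories; order 1 collapses many of them:
print(markovized_symbols("VP", ["V", "NP", "PP", "PP"], h=10))
# ['@VP->...V', '@VP->...V_NP', '@VP->...V_NP_PP']
print(markovized_symbols("VP", ["V", "NP", "PP", "PP"], h=1))
# ['@VP->...V', '@VP->...NP', '@VP->...PP']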

Vertical Markovization

• Vertical Markov order: rewrites depend on past k ancestor nodes (i.e. parent annotation)

Order 1 vs. Order 2

[Charts: parsing F1 (roughly 72-79) and number of grammar symbols (up to ~25,000) as a function of vertical Markov order: 1, 2v, 2, 3v, 3]

Model F1 Size

v=h=2v 77.8 7.5K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 77.8 7.5K

UNARY 78.3 8.0K

Solution: Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

• Partial Solution:
  • Subdivide the IN tag

Annotation F1 Size

Previous 78.3 8.0K

SPLIT-IN 80.3 8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1 Size

80.4 8.1K

80.5 8.1K

81.2 8.5K

81.6 9.0K

81.7 9.1K

81.8 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation F1 Size

tag splits 82.3 9.7K

POSS-NP 83.1 9.8K

SPLIT-VP 85.7 10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation F1 Size

Previous 85.7 10.5K

BASE-NP 86.0 11.7K

DOMINATES-V 86.9 14.1K

RIGHT-REC-NP 87.0 15.2K

[Figure: a VP containing an NP and a PP, with attachment sites marked v / -v]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser LP LR F1

Magerman 95 84.9 84.6 84.7

Collins 96 86.3 85.8 86.0

Klein & Manning 03 86.9 85.7 86.3

Charniak 97 87.4 87.5 87.4

Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Context Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[h′]

X[h]

i h k h′ j

(VP → VBD[saw] NP[her])

(VP → VBD NP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X → s[i])
  else
    return the larger of:
      max over split k, head word w, and rules X → Y Z of
        score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over split k, head word w, and rules X → Y Z of
        score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n⁵) CKY
  • Remember only a few hypotheses for each span ⟨i, j⟩
  • If we keep K hypotheses at each span, then we do at most O(nK²) work per span (why?)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[h′]

X[h]

i h k h′ j
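A sketch (mine, not from the slides) of the per-cell beam idea: after a span's cell is filled, keep only the K highest-scoring (label, head) entries before larger spans are built from it.

def prune_cell(cell, K=10):
    """cell: dict mapping (label, head) -> score for one span <i, j>.
    Keep only the K best hypotheses, as in per-cell beam search."""
    best = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:K]
    return dict(best)

cell = {("NP", "man"): 0.02, ("NP", "telescope"): 0.001, ("VP", "saw"): 0.015}
print(prune_cell(cell, K=2))   # keeps the two highest-scoring entries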

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: a binary tree with latent subcategory variables X1…X7 over the sentence "He was right"]

Can learn with EM, like Forward-Backward for HMMs (Forward ~ Outside, Backward ~ Inside)

Automatic Annotation Induction

• Advantages
  • Automatically learned
    Label all nodes with latent variables
    Same number k of subcategories for all categories
• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model F1

Klein & Manning '03 86.3

Matsuzaki et al. '05 86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea: split everything, roll back splits which were least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate loss in likelihood from removing each split = (Data likelihood with split reversed) / (Data likelihood with split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 88.4

With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser F1 (≤ 40 words) F1 (all words)

Klein & Manning '03 86.3 85.7

Matsuzaki et al. '05 86.7 86.1

Collins '99 88.6 88.2

Charniak & Johnson '05 90.1 89.6

Petrov et al. 06 90.2 89.7

Hierarchical Pruning

coarse: … QP NP VP …

split in two: … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four: … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight: … … … … … … … …

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

[Figure: bracket posteriors at successive coarse-to-fine pruning stages; parsing times 1621 min, 111 min, 35 min, and 15 min, with 91.2 F1 and no search error at the final stage]

Page 9: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 10: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPIN NP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we 'attach' various constituents
bull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers: Cn = (2n)! / [(n+1)! n!]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical models…
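As a quick illustration (my own, not from the slides), a few lines of Python show how fast the Catalan numbers, and hence the number of possible binary bracketings of a sentence, grow:

    from math import comb

    def catalan(n):
        # C_n = (2n)! / ((n+1)! n!) = comb(2n, n) / (n+1)
        return comb(2 * n, n) // (n + 1)

    # number of binary bracketings over n+1 words
    for n in range(2, 11):
        print(n, catalan(n))
    # 2 2, 3 5, 4 14, 5 42, ..., 10 16796 -- exponential growth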

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrases: They cooked the beans in the pot on the stove with handles

bull Particle vs preposition: The lady dressed up the staircase

bull Complement structures: The tourists objected to the guide that they couldn't hear / She knows you like the back of her hand

bull Gerund vs participial adjective: Visiting relatives can be boring / Changing schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPs: impractical design requirements / plastic cup holder

bull Multiple gap constructions: The chicken is ready to eat / The contractors are rich enough to sue

bull Coordination scope: Small rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation / gapping
  bull Which book should Peter buy?
  bull A debate arose which continued until the election

bull Binding
  bull Reference: The IRS audits itself

bull Control
  bull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing: Two problems to solve
1. Repeated work…

Parsing: Two problems to solve
2. Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem 'AI complete'? Yes but …

bull Words are good predictors of attachment
bull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic – or stochastic – context-free grammars (PCFGs)

bull G = (Σ, N, S, R, P)
bull Σ is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S ∈ N)

bull R is a set of rules/productions of the form X → β

bull P is a probability function

bull P: R → [0,1]

bull For each nonterminal X, the probabilities of its rules sum to one: Σ_{X → β ∈ R} P(X → β) = 1

bull A grammar G generates a language model L:

Σ_{g ∈ T} P(g) = 1   (summing over the strings g that G generates)
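A minimal sketch (my own illustration, not the course code) of how such a grammar can be represented, with a check that each nonterminal's rule probabilities sum to one:

    from collections import defaultdict

    # rule probabilities q(X -> beta), keyed by (lhs, rhs)
    pcfg = {
        ("S",  ("NP", "VP")): 1.0,
        ("NP", ("DT", "NN")): 0.3,
        ("NP", ("NP", "PP")): 0.7,
        ("VP", ("Vt", "NP")): 0.4,
        ("VP", ("Vi",)):      0.4,
        ("VP", ("VP", "PP")): 0.2,
    }

    totals = defaultdict(float)
    for (lhs, rhs), p in pcfg.items():
        totals[lhs] += p

    for lhs, total in totals.items():
        assert abs(total - 1.0) < 1e-9, f"rules for {lhs} do not sum to 1"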

PCFG Example
A Probabilistic Context-Free Grammar (PCFG)

S  ⇒ NP VP    1.0
VP ⇒ Vi       0.4
VP ⇒ Vt NP    0.4
VP ⇒ VP PP    0.2
NP ⇒ DT NN    0.3
NP ⇒ NP PP    0.7
PP ⇒ P NP     1.0

Vi ⇒ sleeps      1.0
Vt ⇒ saw         1.0
NN ⇒ man         0.7
NN ⇒ woman       0.2
NN ⇒ telescope   0.1
DT ⇒ the         1.0
IN ⇒ with        0.5
IN ⇒ in          0.5

bull Probability of a tree t with rules

α1 → β1, α2 → β2, …, αn → βn

is

p(t) = ∏_{i=1}^{n} q(αi → βi)

where q(α → β) is the probability for rule α → β

44

Example of a PCFG

48

Probability of a Parse
(same PCFG and tree-probability formula as above)

The man sleeps

The man saw the woman with the telescope

[Slide figures: two parse trees under this grammar.
t1 = the parse of "The man sleeps" (S → NP VP; NP → DT NN; VP → Vi), with
p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0 = 0.084
t2 = a parse of "The man saw the woman with the telescope", whose probability p(t2) is likewise the product of the probabilities of all rules used in the tree]
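A small sketch (my own, using the toy grammar above) of how a tree's probability is computed as a product over its rules:

    from math import prod

    # rules used in t1, the parse of "The man sleeps", with their probabilities
    t1_rules = [
        ("S  -> NP VP",  1.0),
        ("NP -> DT NN",  0.3),
        ("DT -> the",    1.0),
        ("NN -> man",    0.7),
        ("VP -> Vi",     0.4),
        ("Vi -> sleeps", 1.0),
    ]

    p_t1 = prod(q for _, q in t1_rules)
    print(p_t1)  # ≈ 0.084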

PCFGs Learning and Inference

bull Model: the probability of a tree t with n rules αi → βi, i = 1..n, is p(t) = ∏_{i=1}^{n} q(αi → βi)

bull Learning: read the rules off of labeled sentences, use ML estimates for probabilities

and use all of our standard smoothing tricks

bull Inference: for input sentence s, define T(s) to be the set of trees whose yield is s

(whose leaves read left to right match the words in s)
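A sketch (my own illustration) of the maximum-likelihood estimate: count each rule in the treebank and divide by the count of its left-hand side:

    from collections import Counter

    def ml_estimates(treebank_rules):
        """treebank_rules: list of (lhs, rhs) pairs read off the training trees."""
        rule_count = Counter(treebank_rules)
        lhs_count = Counter(lhs for lhs, _ in treebank_rules)
        # q(X -> beta) = count(X -> beta) / count(X)
        return {(lhs, rhs): c / lhs_count[lhs] for (lhs, rhs), c in rule_count.items()}

    rules = [("NP", ("DT", "NN")), ("NP", ("DT", "NN")), ("NP", ("NP", "PP")),
             ("VP", ("V", "NP"))]
    print(ml_estimates(rules))
    # {('NP', ('DT', 'NN')): 0.666..., ('NP', ('NP', 'PP')): 0.333..., ('VP', ('V', 'NP')): 1.0}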

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X → Y Z or X → w
bull X, Y, Z ∈ N and w ∈ Σ

bull A transformation to this form doesn't change the weak generative capacity of a CFG
bull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n > 2)
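A sketch (my own, not the course code) of this binarization step: an n-ary rule is split into a chain of binary rules with new intermediate symbols, following the naming used on the later slides (VP_V etc.):

    def binarize(lhs, rhs):
        """Turn X -> A B C ... into binary rules using new symbols like X_A, X_A_B."""
        if len(rhs) <= 2:
            return [(lhs, tuple(rhs))]
        rules = []
        prev = lhs
        for i in range(len(rhs) - 2):
            new_sym = lhs + "_" + "_".join(rhs[:i + 1])   # e.g. VP_V
            rules.append((prev, (rhs[i], new_sym)))
            prev = new_sym
        rules.append((prev, (rhs[-2], rhs[-1])))
        return rules

    print(binarize("VP", ["V", "NP", "PP"]))
    # [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]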

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a pain
bull Reconstructing n-aries is easy

bull Reconstructing unaries/empties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

An example before binarization…

[Slide figure: ROOT → S → NP VP, with NP → N (people) and VP → V NP PP (fish / tanks / with rods), using the flat ternary rule VP → V NP PP]

After binarization…

[Slide figure: the same tree after binarization, where VP → V NP PP has been replaced by VP → V VP_V and VP_V → NP PP]

Treebank empties and unaries

[Slide figure: the same PTB fragment ("Atone") shown at successive levels of preprocessing — the original PTB tree (ROOT → S-HLN → NP-SUBJ (-NONE- e) VP → VB "Atone"), then NoFuncTags, NoEmpties, and NoUnaries with high vs low unary removal]

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithm
bull Messy but doesn't increase algorithmic complexity

bull Empties can be incorporated
bull Use fenceposts

bull Doesn't increase complexity essentially like unaries

bull Binarization is vital
bull Without binarization you don't get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules) but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k, X -> Y Z of
      q(X -> Y Z) *
      bestScore(Y, i, k, s) *
      bestScore(Z, k+1, j, s)
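A direct, memoized rendering of this recursion (my own sketch; the rule and lexicon probabilities below are assumed to be given as dicts with hypothetical names):

    from functools import lru_cache

    # binary_rules: dict X -> list of (Y, Z, prob); lexicon: dict (X, word) -> prob
    def make_best_score(binary_rules, lexicon, words):
        @lru_cache(maxsize=None)
        def best_score(X, i, j):
            if i == j:
                return lexicon.get((X, words[i]), 0.0)
            best = 0.0
            for (Y, Z, q) in binary_rules.get(X, []):
                for k in range(i, j):
                    best = max(best, q * best_score(Y, i, k) * best_score(Z, k + 1, j))
            return best
        return best_score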

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (1960/1965) … extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (1960/1965) … extended to unaries
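For concreteness, a compact runnable sketch (my own) of the binary-rule core of CKY; the unary-handling loops shown above are omitted, so the tiny grammar below is a simplified, hypothetical one that goes straight from preterminals to S:

    def cky(words, lexicon, binary_rules):
        """lexicon: dict (tag, word) -> prob; binary_rules: list of (A, B, C, prob)."""
        n = len(words)
        score = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
        back = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            for (tag, word), p in lexicon.items():
                if word == w:
                    score[i][i + 1][tag] = p
        for span in range(2, n + 1):
            for begin in range(0, n - span + 1):
                end = begin + span
                for split in range(begin + 1, end):
                    for (A, B, C, p) in binary_rules:
                        if B in score[begin][split] and C in score[split][end]:
                            prob = score[begin][split][B] * score[split][end][C] * p
                            if prob > score[begin][end].get(A, 0.0):
                                score[begin][end][A] = prob
                                back[begin][end][A] = (split, B, C)
        return score, back

    lexicon = {("N", "fish"): 0.2, ("V", "fish"): 0.6,
               ("N", "people"): 0.5, ("V", "people"): 0.1,
               ("N", "tanks"): 0.2, ("V", "tanks"): 0.3}
    binary_rules = [("S", "N", "VP", 0.9), ("VP", "V", "N", 0.5)]
    score, back = cky(["people", "fish", "tanks"], lexicon, binary_rules)
    print(score[0][3].get("S"))  # ≈ 0.027: N(people) with VP(fish tanks)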

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
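A small sketch (my own) of this PARSEVAL-style computation over labeled bracket sets:

    def parseval(gold, cand):
        """gold, cand: sets of (label, start, end) brackets."""
        correct = len(gold & cand)
        precision = correct / len(cand)
        recall = correct / len(gold)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
            ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
    cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
            ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
    print(parseval(gold, cand))  # ≈ (0.429, 0.375, 0.400)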

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

[Slide chart: how often each NP expansion is used, by position]
                 NP PP   DT NN   PRP
All NPs           11%      9%     6%
NPs under S        9%      9%    21%
NPs under VP      23%      7%     4%

Non-Independence II

bull Symptoms of overly strong assumptions
bull Rewrites get used where they don't belong

[Slide figure: example tree — in the PTB this construction is for possessives]

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

[Slide charts: parsing F1 (roughly 70–74) and grammar size in symbols (roughly 3000–12000) as a function of horizontal Markov order 0, 1, 2v, 2, ∞]

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1    Order 2

[Slide charts: parsing F1 (roughly 72–79) and grammar size in symbols (up to ~25000) as a function of vertical Markov order 1, 2v, 2, 3v, 3]

Model       F1     Size
v=h=2v      77.8   7.5K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1     Size
80.4   8.1K
80.5   8.1K
81.2   8.5K
81.6   9.0K
81.7   9.1K
81.8   9.3K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

[Slide figure: an NP/VP/PP attachment configuration with nodes marked v / -v for whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results

bull Beats "first generation" lexicalized parsers

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112
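The slide itself shows a table of NP head rules. As a rough illustration only (my own simplified sketch, loosely in the spirit of Collins-style head finding, not the exact table on the slide), an NP head can be guessed by scanning the children for noun-like tags:

    def np_head(children):
        """children: list of (tag, word) pairs for an NP; returns a guessed head word."""
        # prefer the rightmost noun-like child, then fall back to the last child
        for tag, word in reversed(children):
            if tag in ("NN", "NNS", "NNP", "NNPS", "NX", "POS"):
                return word
        return children[-1][1]

    print(np_head([("DT", "the"), ("JJ", "fast"), ("NN", "car")]))  # car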

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Slide figure: a span i..j with head h, built from Y[h] over i..k and Z[h'] over k..j; e.g. (VP -> VBD[saw] NP[her]) ≡ (VP -> VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max(
      max over k, X -> Y Z of
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w),
      max over k, X -> Y Z of
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h) )

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]
bull Essentially run the O(n⁵) CKY

bull Remember only a few hypotheses for each span <i,j>

bull If we keep K hypotheses at each span then we do at most O(nK²) work per span (why?)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Slide figure: the lexicalized span X[h] over i..j built from Y[h] and Z[h']]
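A minimal sketch (my own) of per-cell beam pruning: after filling a chart cell, keep only the K highest-scoring entries:

    import heapq

    def prune_cell(cell, K=10):
        """cell: dict mapping chart entries (e.g. (label, head)) to scores.
        Keep only the K best entries."""
        if len(cell) <= K:
            return cell
        best = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
        return dict(best)

    cell = {("NP", "man"): 0.02, ("NP", "telescope"): 0.001, ("S", "saw"): 0.0005}
    print(prune_cell(cell, K=2))  # keeps the two highest-scoring entries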

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

Learning Latent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

[Slide figure: a tree over "He was right" whose nodes carry latent subcategory variables X1 … X7]

Can learn with EM, like Forward-Backward for HMMs (Forward/Outside, Backward/Inside)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al '05    86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

    (Data likelihood with split reversed) / (Data likelihood with split)

bull No loss in accuracy when 50% of the splits are reversed
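A schematic sketch (my own) of this merge criterion: score each split by how little the data likelihood drops when it is undone, and reverse the least useful fraction:

    def least_useful_splits(splits, fraction=0.5):
        """splits: dict split_id -> (likelihood_with_split, likelihood_with_split_reversed).
        Ratios near 1 mean the split barely helped, so those are the ones to merge back."""
        ratio = {s: reversed_ / with_ for s, (with_, reversed_) in splits.items()}
        ranked = sorted(ratio, key=ratio.get, reverse=True)
        return ranked[: int(len(ranked) * fraction)]

    example = {"NP-1/NP-2": (0.95, 0.94), "DT-1/DT-2": (0.95, 0.50)}
    print(least_useful_splits(example, fraction=0.5))  # ['NP-1/NP-2']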

Adaptive Splitting Results

Model               F1
Previous            88.4
With 50% Merging    89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al '05       86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al 06           90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …

split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight:  … … … … … … … … …

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)

Page 11: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN V NPINNP

DT NNDT NNthe

boyput

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNthe

boy

put

tortoisethethe rug

on

Example Application Machine Translation

bull The boy put the tortoise on the rug

bull लड़क न रखा कछआ ऊपर कालीनbull SVO vs SOV preposition vs post-position

S

NP VP PP

DT NN V NP IN NP

DT NN DT NNtheboy

put

tortoisethethe rug

on

S

NP VPPP

DT NN VNPINNP

DT NNDT NNलड़क न रखाकछआकालीन

ऊपर

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 12: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Pre 1990 (ldquoClassicalrdquo) NLP Parsing

bull Goes back to Chomskyrsquos PhD thesis in 1950s

bull Wrote symbolic grammar (CFG or often richer) and lexiconS NP VP NN interest

NP (DT) NN NNS rates

NP NN NNS NNS raises

NP NNP VBP interest

VP V NP VBZ rates

bull Used grammarproof systems to prove parses from words

bull This scaled very badly and didnrsquot give coverage For sentence

Fed raises interest rates 05 in effort to control inflationbull Minimal grammar 36 parses

bull Simple 10 rule grammar 592 parses

bull Real-size broad-coverage grammar millions of parses

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

• A context free grammar G in NLP = (N, C, Σ, S, L, R)
• Σ is a set of terminal symbols
• C is a set of preterminal symbols
• N is a set of nonterminal symbols
• S is the start symbol (S ∈ N)
• L is the lexicon, a set of items of the form X → x
  where X ∈ C and x ∈ Σ
• R is the grammar, a set of items of the form X → γ
  where X ∈ N and γ ∈ (N ∪ C)*

• By usual convention S is the start symbol, but in statistical NLP we usually have an extra node at the top (ROOT, TOP)
• We usually write e for an empty sequence, rather than nothing
22
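As a concrete illustration of the (N, C, Σ, S, L, R) pieces, a toy grammar can be written down directly as data. This is a minimal sketch of my own (the names and representation are illustrative, not from the slides):

from collections import OrderedDict

# A toy CFG: R maps nonterminals (N) to lists of right-hand sides over N ∪ C,
# and the lexicon L maps preterminals (C) to terminal words (Σ).
grammar_rules = OrderedDict([            # R: X → γ, X in N
    ("S",  [["NP", "VP"]]),
    ("NP", [["DT", "NN"], ["NP", "PP"]]),
    ("VP", [["V", "NP"], ["VP", "PP"]]),
    ("PP", [["P", "NP"]]),
])
lexicon = {                               # L: X → x, X in C, x in Σ
    "DT": ["the"], "NN": ["man", "telescope"],
    "V": ["saw"], "P": ["with"],
}
start_symbol = "S"                        # S (with an extra ROOT node on top in practice)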

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

• A key parsing decision is how we 'attach' various constituents
• PPs, adverbial or participial phrases, infinitives, coordinations, etc.

Attachment ambiguities

• A key parsing decision is how we 'attach' various constituents
• PPs, adverbial or participial phrases, infinitives, coordinations, etc.

• Catalan numbers: Cn = (2n)! / [(n+1)! n!]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip
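To get a feel for how fast this grows, here is a quick sketch of my own (not from the slides) computing Cn directly from the factorial formula:

from math import factorial

def catalan(n):
    # C_n = (2n)! / ((n+1)! * n!)
    return factorial(2 * n) // (factorial(n + 1) * factorial(n))

# The number of possible bracketings (attachment structures) grows exponentially:
for n in range(1, 11):
    print(n, catalan(n))   # 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796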

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

• Prepositional phrases:
They cooked the beans in the pot on the stove with handles

• Particle vs. preposition:
The lady dressed up the staircase

• Complement structures:
The tourists objected to the guide that they couldn't hear
She knows you like the back of her hand

• Gerund vs. participial adjective:
Visiting relatives can be boring
Changing schedules frequently confused passengers

Syntactic Ambiguities II

• Modifier scope within NPs:
impractical design requirements
plastic cup holder

• Multiple gap constructions:
The chicken is ready to eat
The contractors are rich enough to sue

• Coordination scope:
Small rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

• Dislocation / gapping
• Which book should Peter buy?
• A debate arose which continued until the election

• Binding
• Reference: The IRS audits itself

• Control
• I want to go
• I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing: Two problems to solve
1. Repeated work…

Parsing: Two problems to solve
1. Repeated work…

Parsing: Two problems to solve
2. Choosing the correct parse

• How do we work out the correct attachment?
• She saw the man with a telescope

• Is the problem 'AI complete'? Yes, but …

• Words are good predictors of attachment
• Even absent full understanding
• Moscow sent more than 100,000 soldiers into Afghanistan …
• Sydney Water breached an agreement with NSW Health …

• Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic – or stochastic – context-free grammars (PCFGs)

• G = (Σ, N, S, R, P)
• Σ is a set of terminal symbols
• N is a set of nonterminal symbols
• S is the start symbol (S ∈ N)
• R is a set of rules/productions of the form X → γ
• P is a probability function
• P: R → [0,1]
• ∑γ P(X → γ) = 1 for every nonterminal X

• A grammar G generates a language model L:
∑ P(s) = 1, summing over all strings s ∈ Σ*

PCFG Example: A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0

Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

• Probability of a tree t with rules
α1 → β1, α2 → β2, …, αn → βn
is
p(t) = ∏_{i=1}^{n} q(αi → βi)
where q(α → β) is the probability for rule α → β

44

Example of a PCFG

48

Probability of a Parse: A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0

Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

• Probability of a tree t with rules
α1 → β1, α2 → β2, …, αn → βn
is
p(t) = ∏_{i=1}^{n} q(αi → βi)
where q(α → β) is the probability for rule α → β

44

The man sleeps                              The man saw the woman with the telescope

[Tree diagrams: t1 is the parse of "The man sleeps" (S → NP VP, NP → DT NN, VP → Vi);
t2 is the parse of "The man saw the woman with the telescope" with the PP attached to the VP]

t1: p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0
    (S → NP VP, NP → DT NN, DT → the, NN → man, VP → Vi, Vi → sleeps)

t2: p(t2) = the product of the probabilities of all rules used in t2,
    including VP → VP PP, VP → Vt NP, PP → P NP, and the lexical rules
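A small sketch of my own showing the product computation for t1, with the rule probabilities written out as a list:

# p(t1) for "The man sleeps" as a product of its rule probabilities
rules_in_t1 = [
    ("S -> NP VP",   1.0),
    ("NP -> DT NN",  0.3),
    ("DT -> the",    1.0),
    ("NN -> man",    0.7),
    ("VP -> Vi",     0.4),
    ("Vi -> sleeps", 1.0),
]
p_t1 = 1.0
for rule, q in rules_in_t1:
    p_t1 *= q
print(p_t1)   # 0.084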

PCFGs Learning and Inference

Model: The probability of a tree t with n rules αi → βi, i = 1..n

Learning: Read the rules off of labeled sentences, use ML estimates for probabilities

and use all of our standard smoothing tricks

Inference: For input sentence s, define T(s) to be the set of trees whose yield is s

(whose leaves, read left to right, match the words in s)
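For learning, the maximum-likelihood estimate is just the relative frequency of each rule among all expansions of its left-hand side. A minimal sketch of my own, assuming trees are given as nested (label, children) tuples with (tag, word) leaves (this representation is an assumption, not from the slides):

from collections import Counter, defaultdict

def count_rules(tree, counts):
    # tree = (label, [child_trees]) for internal nodes, (tag, word) for leaves
    label, children = tree
    if isinstance(children, str):                      # preterminal -> word
        counts[(label, (children,))] += 1
        return
    counts[(label, tuple(c[0] for c in children))] += 1
    for c in children:
        count_rules(c, counts)

def ml_estimates(trees):
    counts = Counter()
    for t in trees:
        count_rules(t, counts)
    lhs_totals = defaultdict(int)
    for (lhs, rhs), c in counts.items():
        lhs_totals[lhs] += c
    # q(X -> gamma) = count(X -> gamma) / count(X)
    return {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}

tree = ("S", [("NP", [("N", "people")]), ("VP", [("V", "fish")])])
print(ml_estimates([tree]))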

Grammar Transforms

51

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w
• X, Y, Z ∈ N and w ∈ Σ

• A transformation to this form doesn't change the weak generative capacity of a CFG
• That is, it recognizes the same language
• But maybe with different trees

• Empties and unaries are removed recursively

• n-ary rules are divided by introducing new nonterminals (n > 2)
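A sketch of my own of the n-ary binarization step (the intermediate-symbol naming convention here is an assumption; the deck itself uses names like VP_V):

def binarize(lhs, rhs):
    # Turn X -> A B C D into X -> A @X_A, @X_A -> B @X_A_B, @X_A_B -> C D
    rules = []
    current_lhs = lhs
    remaining = list(rhs)
    while len(remaining) > 2:
        first = remaining.pop(0)
        new_sym = "@%s_%s" % (current_lhs.lstrip("@"), first)
        rules.append((current_lhs, [first, new_sym]))
        current_lhs = new_sym
    rules.append((current_lhs, remaining))
    return rules

print(binarize("VP", ["V", "NP", "PP"]))
# [('VP', ['V', '@VP_V']), ('@VP_V', ['NP', 'PP'])]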

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

• In practice, full Chomsky Normal Form is a pain
• Reconstructing n-aries is easy
• Reconstructing unaries/empties is trickier

• Binarization is crucial for cubic time CFG parsing

• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

• Unaries can be incorporated into the algorithm
• Messy, but doesn't increase algorithmic complexity

• Empties can be incorporated
• Use fenceposts
• Doesn't increase complexity; essentially like unaries

• Binarization is vital
• Without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar

• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and X -> Y Z of:
      q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)
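The repeated-work problem noted earlier goes away once bestScore memoizes its results. A rough Python rendering of my own (the grammar is assumed to arrive as a lexicon dict and a list of binary rules; these data structures are not from the slides):

from functools import lru_cache

def make_parser(unary_lex, binary_rules, words):
    # unary_lex[(X, word)] = q(X -> word); binary_rules = [(X, Y, Z, q), ...]
    @lru_cache(maxsize=None)
    def best_score(X, i, j):
        if i == j:
            return unary_lex.get((X, words[i]), 0.0)
        best = 0.0
        for (A, Y, Z, q) in binary_rules:
            if A != X:
                continue
            for k in range(i, j):
                best = max(best, q * best_score(Y, i, k) * best_score(Z, k + 1, j))
        return best
    return best_score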

function CKY(words, grammar) returns [most_probable_parse, prob]

  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]

  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965) … extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true

  return buildTree(score, back)

The CKY algorithm (1960/1965) … extended to unaries
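A compact runnable rendering of the same algorithm in Python, using the toy fish/people grammar from the following slides. This is a sketch of my own: it returns only the best score for each span and omits backpointers.

from collections import defaultdict

lexicon = {("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2,
           ("N", "rods"): 0.1, ("V", "people"): 0.1, ("V", "fish"): 0.6,
           ("V", "tanks"): 0.3, ("P", "with"): 1.0}
unary = [("NP", "N", 0.7), ("VP", "V", 0.1), ("S", "VP", 0.1)]
binary = [("S", "NP", "VP", 0.9), ("VP", "V", "NP", 0.5), ("VP", "V", "VP_V", 0.3),
          ("VP", "V", "PP", 0.1), ("VP_V", "NP", "PP", 1.0), ("NP", "NP", "NP", 0.1),
          ("NP", "NP", "PP", 0.2), ("PP", "P", "NP", 1.0)]

def cky(words):
    n = len(words)
    score = defaultdict(float)               # score[(begin, end, A)] = best probability
    def apply_unaries(b, e):
        added = True
        while added:
            added = False
            for A, B, q in unary:
                p = q * score[(b, e, B)]
                if p > score[(b, e, A)]:
                    score[(b, e, A)] = p
                    added = True
    for i, w in enumerate(words):
        for (A, word), q in lexicon.items():
            if word == w:
                score[(i, i + 1, A)] = q
        apply_unaries(i, i + 1)
    for span in range(2, n + 1):
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for A, B, C, q in binary:
                    p = score[(begin, split, B)] * score[(split, end, C)] * q
                    if p > score[(begin, end, A)]:
                        score[(begin, end, A)] = p
            apply_unaries(begin, end)
    return score[(0, n, "S")]

print(cky("fish people fish tanks".split()))   # best S score; matches the top cell of the worked example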

The grammar: Binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)

Candidate brackets: S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
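A sketch of my own of the bracket-scoring computation, with brackets represented as (label, start, end) triples (this representation is an assumption):

def parseval(gold, candidate):
    # gold, candidate: sets of (label, start, end) brackets
    matched = len(gold & candidate)
    precision = matched / len(candidate)
    recall = matched / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print(parseval(gold, cand))   # (0.4286, 0.375, 0.4)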

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

• Low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)
• The two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

[Chart: how often an NP expands as NP PP vs. DT NN vs. PRP, shown separately for all NPs, for NPs under S (subjects), and for NPs under VP (objects); the three distributions differ substantially, e.g. subject NPs are far more often pronouns]

Non-Independence II

• Symptoms of overly strong assumptions:
• Rewrites get used where they don't belong
• (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[Charts: parsing F1 (roughly 70–74) and grammar size (number of symbols, roughly 3,000–12,000) as a function of horizontal Markov order 0, 1, 2v, 2, ∞; limiting the order merges states and shrinks the grammar with little loss in accuracy]
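Horizontal markovization shows up in how the intermediate symbols created during binarization are named: instead of remembering the whole right-hand side produced so far, keep only the last v sibling(s), so that different rules share intermediate states. A sketch of my own (the naming convention is an assumption):

def binarize_markov(lhs, rhs, order=1):
    # Intermediate symbols remember only the last `order` siblings already generated,
    # so different n-ary rules can share (merge) intermediate states.
    rules = []
    current = lhs
    for i in range(len(rhs) - 2):
        history = rhs[max(0, i + 1 - order):i + 1]        # last `order` siblings
        new_sym = "@%s->...%s" % (lhs, "_".join(history))
        rules.append((current, [rhs[i], new_sym]))
        current = new_sym
    rules.append((current, rhs[-2:]))
    return rules

print(binarize_markov("VP", ["V", "NP", "PP", "PP"], order=1))
# [('VP', ['V', '@VP->...V']), ('@VP->...V', ['NP', '@VP->...NP']), ('@VP->...NP', ['PP', 'PP'])]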

Vertical Markovization

• Vertical Markov order: rewrites depend on past k ancestor nodes
(i.e. parent annotation)

Order 1 vs. Order 2

[Charts: parsing F1 (roughly 72–79) and grammar size (number of symbols, up to ~25,000) as a function of vertical Markov order 1, 2v, 2, 3v, 3; deeper vertical context helps accuracy but blows up the symbol count]

Model F1 Size
v=h=2v 77.8 7.5K
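Vertical markovization of order 2 is just parent annotation. A sketch of my own of annotating each nonterminal with its parent's label, using the same (label, children) tree tuples as in the earlier sketch (hypothetical representation):

def parent_annotate(tree, parent="ROOT"):
    # (label, children) nodes become (label^parent, children); (tag, word) leaves are left alone
    label, children = tree
    if isinstance(children, str):
        return (label, children)                  # preterminal -> word
    new_label = "%s^%s" % (label, parent)
    return (new_label, [parent_annotate(c, label) for c in children])

tree = ("S", [("NP", [("PRP", "He")]), ("VP", [("VBD", "was"), ("ADJP", [("JJ", "right")])])])
print(parent_annotate(tree))
# ('S^ROOT', [('NP^S', [('PRP', 'He')]), ('VP^S', [('VBD', 'was'), ('ADJP^VP', [('JJ', 'right')])])])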

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size
Base 77.8 7.5K
UNARY 78.3 8.0K

Solution: Mark unary rewrite sites with -U

Tag Splits

• Problem: Treebank tags are too coarse

• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN

• Partial Solution:
• Subdivide the IN tag

Annotation F1 Size
Previous 78.3 8.0K
SPLIT-IN 80.3 8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1 Size
80.4 8.1K
80.5 8.1K
81.2 8.5K
81.6 9.0K
81.7 9.1K
81.8 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield

• Examples:
• Possessive NPs
• Finite vs. infinite VPs
• Lexical heads

• Solution: annotate future elements into nodes

Annotation F1 Size
tag splits 82.3 9.7K
POSS-NP 83.1 9.8K
SPLIT-VP 85.7 10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
• Contains a verb
• Is (non)-recursive
• Base NPs [cf. Collins 99]
• Right-recursive NPs

Annotation F1 Size
Previous 85.7 10.5K
BASE-NP 86.0 11.7K
DOMINATES-V 86.9 14.1K
RIGHT-REC-NP 87.0 15.2K

[Diagram: NP attachment to VP vs. NP, with the attachment sites marked v / -v]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser LP LR F1
Magerman 95 84.9 84.6 84.7
Collins 96 86.3 85.8 86.0
Klein & Manning 03 86.9 85.7 86.3
Charniak 97 87.4 87.5 87.4
Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
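The head-rule tables themselves are simple: for each parent category, scan the children in a specified direction for the first child whose label is on a priority list. A sketch of my own (the rules below are illustrative and are not the exact Collins head table):

# Illustrative head-finding rules: (scan direction, priority list of child labels)
HEAD_RULES = {
    "NP": ("right-to-left", ["NN", "NNS", "NNP", "NP", "JJ"]),
    "VP": ("left-to-right", ["VBD", "VBZ", "VBP", "VB", "VP"]),
    "PP": ("left-to-right", ["IN", "TO"]),
    "S":  ("left-to-right", ["VP", "S"]),
}

def find_head(parent, children):
    direction, priorities = HEAD_RULES.get(parent, ("left-to-right", []))
    order = children if direction == "left-to-right" else list(reversed(children))
    for label in priorities:
        for child in order:
            if child == label:
                return child
    return order[0]                      # default: first child in scan order

print(find_head("NP", ["DT", "JJ", "NN"]))      # NN
print(find_head("VP", ["VBD", "NP", "PP"]))     # VBD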

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: building X[h] over span (i, j) from Y[h] over (i, k) and Z[h'] over (k, j), where h is the head word]
e.g. (VP → VBD[saw] NP[her]) vs. (VP → VBD NP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return max of:
      max over k, X -> Y Z of
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, X -> Y Z of
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
• Essentially, run the O(n^5) CKY
• Remember only a few hypotheses for each span <i,j>
• If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
• Keeps things more or less cubic

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram: X[h] built from Y[h] and Z[h'] over span (i, j)]
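A sketch of my own of the per-cell beam: after filling a span, keep only the K best (category, head) hypotheses (the data structures here are assumptions, not the Collins parser's actual ones):

import heapq

def prune_cell(cell, K=10):
    # cell: dict mapping (category, head) -> score for one span <i, j>;
    # keep only the K highest-scoring hypotheses
    if len(cell) <= K:
        return cell
    kept = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

cell = {("NP", "man"): 0.03, ("NP", "telescope"): 0.001, ("S", "saw"): 0.0005}
print(prune_cell(cell, K=2))   # keeps the two best hypotheses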

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:

• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering?

Manual Splits

• Manually split categories:
• NP: subject vs. object
• DT: determiners vs. demonstratives
• IN: sentential vs. prepositional

• Advantages:
• Fairly compact grammar
• Linguistic motivations

• Disadvantages:
• Performance leveled out
• Manually annotated

Learning Latent Annotations

Latent Annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Diagram: a parse tree for "He was right" with latent subcategory variables X1 … X7 at each node]

Can learn with EM: like Forward-Backward for HMMs, with Forward/Outside and Backward/Inside passes over trees

Automatic Annotation Induction

• Advantages:
• Automatically learned:
Label all nodes with latent variables
Same number k of subcategories for all categories

• Disadvantages:
• Grammar gets too large
• Most categories are oversplit, while others are undersplit

Model F1
Klein & Manning '03 86.3
Matsuzaki et al. '05 86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea: split everything, roll back splits which were least useful

[Petrov et al. 06]

Adaptive Splitting

• Evaluate loss in likelihood from removing each split =
(Data likelihood with split reversed) / (Data likelihood with split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model F1
Previous 88.4
With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser F1 (≤ 40 words) F1 (all words)
Klein & Manning '03 86.3 85.7
Matsuzaki et al. '05 86.7 86.1
Collins '99 88.6 88.2
Charniak & Johnson '05 90.1 89.6
Petrov et al. 06 90.2 89.7

Hierarchical Pruning

Parse multiple times with grammars at different levels of granularity:

coarse:         … QP NP VP …
split in two:   … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:  … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight: … … … … … … … … …

Bracket Posteriors

1621 min
111 min
35 min
15 min [91.2 F1] (no search error)

Page 15: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Classical NLP ParsingThe problem and its solution

bull Categorical constraints can be added to grammars to limit unlikelyweird parses for sentencesbull But the attempt make the grammars not robust

bull In traditional systems commonly 30 of sentences in even an edited text would have no parse

bull A less constrained grammar can parse more sentencesbull But simple sentences end up with ever more parses with no way to

choose between them

bull We need mechanisms that allow us to find the most likely parse(s) for a sentencebull Statistical parsing lets us work with very loose grammars that admit

millions of parses for sentences but still quickly find the best parse(s)

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 16: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Context Free Grammars and Ambiguities

20

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

• How do we work out the correct attachment?
  • She saw the man with a telescope
• Is the problem 'AI complete'? Yes, but …
• Words are good predictors of attachment
  • Even absent full understanding
  • Moscow sent more than 100,000 soldiers into Afghanistan …
  • Sydney Water breached an agreement with NSW Health …
• Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic – or stochastic – context-free grammars (PCFGs)

• G = (T, N, S, R, P)
  • T is a set of terminal symbols
  • N is a set of nonterminal symbols
  • S is the start symbol (S ∈ N)
  • R is a set of rules/productions of the form X → γ
  • P is a probability function
    • P: R → [0, 1]
    • for each X ∈ N: Σ_{X→γ ∈ R} P(X → γ) = 1
• A grammar G generates a language model L:
  Σ_{γ ∈ T*} P(γ) = 1
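A small sanity check one can write (a sketch, not from the slides): with P stored as a dict of rule probabilities, the probabilities of all rules sharing a left-hand side should sum to 1.

from collections import defaultdict

def check_pcfg(rule_probs, tol=1e-9):
    # rule_probs: {(lhs, rhs_tuple): probability}
    totals = defaultdict(float)
    for (lhs, _rhs), p in rule_probs.items():
        totals[lhs] += p
    # return the offending nonterminals, if any
    return {lhs: t for lhs, t in totals.items() if abs(t - 1.0) > tol}

rules = {("VP", ("Vi",)): 0.4, ("VP", ("Vt", "NP")): 0.4, ("VP", ("VP", "PP")): 0.2}
print(check_pcfg(rules))   # {} -> the VP rules are properly normalized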

PCFG Example
A Probabilistic Context-Free Grammar (PCFG)

  S ⇒ NP VP      1.0
  VP ⇒ Vi        0.4
  VP ⇒ Vt NP     0.4
  VP ⇒ VP PP     0.2
  NP ⇒ DT NN     0.3
  NP ⇒ NP PP     0.7
  PP ⇒ P NP      1.0

  Vi ⇒ sleeps     1.0
  Vt ⇒ saw        1.0
  NN ⇒ man        0.7
  NN ⇒ woman      0.2
  NN ⇒ telescope  0.1
  DT ⇒ the        1.0
  IN ⇒ with       0.5
  IN ⇒ in         0.5

• Probability of a tree t with rules
    α1 → β1, α2 → β2, …, αn → βn
  is
    p(t) = Π_{i=1}^{n} q(αi → βi)
  where q(α → β) is the probability for rule α → β

44
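The product p(t) = Π q(αi → βi) is easy to compute once a tree is represented as its list of rules; a minimal sketch (not from the slides), using the example grammar above:

q = {  # rule probabilities from the example PCFG above
    ("S", ("NP", "VP")): 1.0, ("NP", ("DT", "NN")): 0.3, ("VP", ("Vi",)): 0.4,
    ("DT", ("the",)): 1.0, ("NN", ("man",)): 0.7, ("Vi", ("sleeps",)): 1.0,
}

def tree_prob(rules_used):
    p = 1.0
    for rule in rules_used:
        p *= q[rule]
    return p

# "The man sleeps": S -> NP VP, NP -> DT NN, DT -> the, NN -> man, VP -> Vi, Vi -> sleeps
t1 = [("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("DT", ("the",)),
      ("NN", ("man",)), ("VP", ("Vi",)), ("Vi", ("sleeps",))]
print(tree_prob(t1))   # 0.084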

Example of a PCFG

48

Probability of a Parse (using the PCFG above)

t1 = the parse of "The man sleeps":
     (S (NP (DT the) (NN man)) (VP (Vi sleeps)))

  p(t1) = q(S → NP VP) × q(NP → DT NN) × q(DT → the) × q(NN → man) × q(VP → Vi) × q(Vi → sleeps)
        = 1.0 × 0.3 × 1.0 × 0.7 × 0.4 × 1.0 = 0.084

t2 = the parse of "The man saw the woman with the telescope" in which the PP
     "with the telescope" attaches to the VP; p(t2) is likewise the product of
     the probabilities of all the rules used in t2.

PCFGs: Learning and Inference

• Model: the probability of a tree t with n rules αi → βi, i = 1..n, is
    p(t) = Π_{i=1}^{n} q(αi → βi)

• Learning: read the rules off of labeled sentences, use ML estimates for the
  probabilities, and use all of our standard smoothing tricks

• Inference: for input sentence s, define T(s) to be the set of trees whose yield is s
  (whose leaves, read left to right, match the words in s)
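The ML estimate is just relative frequency over the treebank: q(α → β) = count(α → β) / count(α). A sketch (not from the slides), assuming each tree is handed over as its list of rules:

from collections import Counter

def ml_estimates(treebank):
    # treebank: iterable of trees, each tree a list of (lhs, rhs_tuple) rules
    rule_counts, lhs_counts = Counter(), Counter()
    for tree in treebank:
        for lhs, rhs in tree:
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {(lhs, rhs): c / lhs_counts[lhs]
            for (lhs, rhs), c in rule_counts.items()}

trees = [[("S", ("NP", "VP")), ("VP", ("Vi",))],
         [("S", ("NP", "VP")), ("VP", ("Vt", "NP"))]]
print(ml_estimates(trees))   # VP -> Vi and VP -> Vt NP each get 0.5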

Grammar Transforms

51

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w
  • X, Y, Z ∈ N and w ∈ Σ
• A transformation to this form doesn't change the weak generative capacity of a CFG
  • That is, it recognizes the same language
    • But maybe with different trees
• Empties and unaries are removed recursively
• n-ary rules are divided by introducing new nonterminals (n > 2)
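The n-ary part of the transformation is mechanical; here is a small illustrative sketch (not the full CNF conversion, and not from the slides) that splits a rule X → Y1 … Yn into binary rules by introducing intermediate symbols named like the VP_V used later in these slides:

def binarize(lhs, rhs):
    # Turn X -> Y1 Y2 ... Yn (n > 2) into binary rules, introducing new
    # symbols named after the material already consumed.
    rules = []
    current = lhs
    for i in range(len(rhs) - 2):
        new_sym = "%s_%s" % (lhs, "_".join(rhs[: i + 1]))
        rules.append((current, (rhs[i], new_sym)))
        current = new_sym
    rules.append((current, tuple(rhs[-2:])))
    return rules

print(binarize("VP", ["V", "NP", "PP"]))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]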

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with


Chomsky Normal Form

• You should think of this as a transformation for efficient parsing
• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform
• In practice, full Chomsky Normal Form is a pain
  • Reconstructing n-aries is easy
  • Reconstructing unaries/empties is trickier
• Binarization is crucial for cubic-time CFG parsing
• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

An example: before binarization…

  (ROOT (S (NP (N people))
           (VP (V fish)
               (NP (N tanks))
               (PP (P with) (NP (N rods))))))

After binarization…

  (ROOT (S (NP (N people))
           (VP (V fish)
               (VP_V (NP (N tanks))
                     (PP (P with) (NP (N rods)))))))

Treebank empties and unaries

  PTB Tree:     (ROOT (S-HLN (NP-SUBJ (-NONE- e)) (VP (VB Atone))))
  NoFuncTags:   (ROOT (S (NP (-NONE- e)) (VP (VB Atone))))
  NoEmpties:    (ROOT (S (VP (VB Atone))))
  NoUnaries:    high: (ROOT (S Atone))      low: (ROOT (VB Atone))

Parsing

66

Constituency Parsing

fish people fish tanks

PCFG:
  Rule            Prob
  S → NP VP       θ0
  NP → NP NP      θ1
  …
  N → fish        θ42
  N → people      θ43
  V → fish        θ44
  …

[example tree: (S (NP (NP (N fish)) (NP (N people))) (VP (V fish) (NP (N tanks))))]

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores — e.g. for the cells over "people" and "fish":
  people:  NP 0.35, V 0.1, N 0.5
  fish:    VP 0.06, NP 0.14, V 0.6, N 0.2

  S → NP VP     0.9
  S → VP        0.1
  VP → V NP     0.5
  VP → V        0.1
  VP → V VP_V   0.3
  VP → V PP     0.1
  VP_V → NP PP  1.0
  NP → NP NP    0.1
  NP → NP PP    0.2
  NP → N        0.7
  PP → P NP     1.0

Extended CKY parsing

• Unaries can be incorporated into the algorithm
  • Messy, but doesn't increase algorithmic complexity
• Empties can be incorporated
  • Use fenceposts
  • Doesn't increase complexity; essentially like unaries
• Binarization is vital
  • Without binarization, you don't get parsing cubic in the length of the
    sentence and in the number of nonterminals in the grammar
  • Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

  bestScore(X, i, j, s):
    if (j == i)
      return q(X -> s[i])
    else
      return max over k, X -> Y Z of
        q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)
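The "repeated work" problem disappears once bestScore memoizes on (X, i, j); a sketch of that idea in Python (not from the slides), with illustrative names for the rule tables:

from functools import lru_cache

def make_parser(lex_prob, bin_rules, words):
    # lex_prob[(X, word)] = q(X -> word); bin_rules: list of (X, Y, Z, q)
    @lru_cache(maxsize=None)                 # memoization = CKY's chart
    def best_score(X, i, j):
        if i == j:
            return lex_prob.get((X, words[i]), 0.0)
        best = 0.0
        for (A, Y, Z, q) in bin_rules:
            if A != X:
                continue
            for k in range(i, j):
                best = max(best, q * best_score(Y, i, k) * best_score(Z, k + 1, j))
        return best
    return best_score

words = ["fish", "people", "fish", "tanks"]
# best = make_parser(lex_prob, bin_rules, words)("S", 0, len(words) - 1)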

The CKY algorithm (1960/1965) … extended to unaries

  function CKY(words, grammar) returns [most_probable_parse, prob]
    score = new double[#(words)+1][#(words)+1][#(nonterms)]
    back  = new Pair[#(words)+1][#(words)+1][#(nonterms)]
    for i = 0; i < #(words); i++
      for A in nonterms
        if A -> words[i] in grammar
          score[i][i+1][A] = P(A -> words[i])
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          if score[i][i+1][B] > 0 && A -> B in grammar
            prob = P(A -> B) * score[i][i+1][B]
            if prob > score[i][i+1][A]
              score[i][i+1][A] = prob
              back[i][i+1][A] = B
              added = true

    for span = 2 to #(words)
      for begin = 0 to #(words) - span
        end = begin + span
        for split = begin+1 to end-1
          for A, B, C in nonterms
            prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
            if prob > score[begin][end][A]
              score[begin][end][A] = prob
              back[begin][end][A] = new Triple(split, B, C)
        // handle unaries
        boolean added = true
        while added
          added = false
          for A, B in nonterms
            prob = P(A -> B) * score[begin][end][B]
            if prob > score[begin][end][A]
              score[begin][end][A] = prob
              back[begin][end][A] = B
              added = true
    return buildTree(score, back)

The grammar: binary, no epsilons

  S → NP VP      0.9        N → people   0.5
  S → VP         0.1        N → fish     0.2
  VP → V NP      0.5        N → tanks    0.2
  VP → V         0.1        N → rods     0.1
  VP → V VP_V    0.3        V → people   0.1
  VP → V PP      0.1        V → fish     0.6
  VP_V → NP PP   1.0        V → tanks    0.3
  NP → NP NP     0.1        P → with     1.0
  NP → NP PP     0.2
  NP → N         0.7
  PP → P NP      1.0
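For following the worked example below, the same grammar can also be written as plain data (a sketch, not from the slides); the decimals are the ones in the table above:

unary_and_lexical = {            # X -> Y (unaries) and X -> word
    ("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7,
    ("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
    ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3, ("P", "with"): 1.0,
}
binary = {                       # X -> Y Z
    ("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5, ("VP", "V", "VP_V"): 0.3,
    ("VP", "V", "PP"): 0.1, ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
    ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
}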

The empty chart: one cell score[i][j] for each span over

  0 fish 1 people 2 fish 3 tanks 4

i.e. score[0][1] … score[3][4] for single words, up to score[0][4] for the whole sentence.

Filling the width-1 cells with the lexical rules:

  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])

  [0,1] fish:    N → fish 0.2,    V → fish 0.6
  [1,2] people:  N → people 0.5,  V → people 0.1
  [2,3] fish:    N → fish 0.2,    V → fish 0.6
  [3,4] tanks:   N → tanks 0.2,   V → tanks 0.3

Then close the width-1 cells under the unary rules:

  // handle unaries
  boolean added = true
  while added
    added = false
    for A, B in nonterms
      if score[i][i+1][B] > 0 && A -> B in grammar
        prob = P(A -> B) * score[i][i+1][B]
        if prob > score[i][i+1][A]
          score[i][i+1][A] = prob
          back[i][i+1][A] = B
          added = true

  [0,1] fish:    N 0.2, V 0.6, NP → N 0.14, VP → V 0.06, S → VP 0.006
  [1,2] people:  N 0.5, V 0.1, NP → N 0.35, VP → V 0.01, S → VP 0.001
  [2,3] fish:    N 0.2, V 0.6, NP → N 0.14, VP → V 0.06, S → VP 0.006
  [3,4] tanks:   N 0.2, V 0.3, NP → N 0.14, VP → V 0.03, S → VP 0.003

Width-2 spans, combining pairs of cells with the binary rules:

  prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
  if (prob > score[begin][end][A])
    score[begin][end][A] = prob
    back[begin][end][A] = new Triple(split, B, C)

  [0,2] fish people:  NP → NP NP 0.0049,   VP → V NP 0.105,   S → NP VP 0.00126
  [1,3] people fish:  NP → NP NP 0.0049,   VP → V NP 0.007,   S → NP VP 0.0189
  [2,4] fish tanks:   NP → NP NP 0.00196,  VP → V NP 0.042,   S → NP VP 0.00378

Unary closure on the width-2 cells:

  // handle unaries
  boolean added = true
  while added
    added = false
    for A, B in nonterms
      prob = P(A -> B) * score[begin][end][B]
      if prob > score[begin][end][A]
        score[begin][end][A] = prob
        back[begin][end][A] = B
        added = true

  [0,2] fish people:  NP 0.0049,   VP 0.105,   S → VP 0.0105
  [1,3] people fish:  NP 0.0049,   VP 0.007,   S → NP VP 0.0189
  [2,4] fish tanks:   NP 0.00196,  VP 0.042,   S → VP 0.0042

Width-3 spans, maximizing over split points:

  for split = begin+1 to end-1
    for A, B, C in nonterms
      prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
      if prob > score[begin][end][A]
        score[begin][end][A] = prob
        back[begin][end][A] = new Triple(split, B, C)

  [0,3] fish people fish:  NP → NP NP 0.0000686,  VP → V NP 0.00147,  S → NP VP 0.000882

The same loop fills the other width-3 cell:

  [1,4] people fish tanks:  NP → NP NP 0.0000686,  VP → V NP 0.000098,  S → NP VP 0.01323

Finally the full-sentence span [0,4], giving the completed chart:

  [0,1] fish:    N 0.2, V 0.6, NP 0.14, VP 0.06, S 0.006
  [1,2] people:  N 0.5, V 0.1, NP 0.35, VP 0.01, S 0.001
  [2,3] fish:    N 0.2, V 0.6, NP 0.14, VP 0.06, S 0.006
  [3,4] tanks:   N 0.2, V 0.3, NP 0.14, VP 0.03, S 0.003
  [0,2]          NP 0.0049,        VP 0.105,       S 0.0105
  [1,3]          NP 0.0049,        VP 0.007,       S 0.0189
  [2,4]          NP 0.00196,       VP 0.042,       S 0.0042
  [0,3]          NP 0.0000686,     VP 0.00147,     S 0.000882
  [1,4]          NP 0.0000686,     VP 0.000098,    S 0.01323
  [0,4]          NP 0.0000009604,  VP 0.00002058,  S 0.00018522


Call buildTree(score, back) to get the best parse.
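A minimal sketch (not from the slides) of what buildTree could look like, reconstructing the best tree from the back-pointers; it assumes back[(i, j, A)] holds a string B for a unary step, a (split, B, C) triple for a binary step, or nothing for a lexical rule:

def build_tree(back, words, i, j, A):
    entry = back.get((i, j, A))
    if entry is None:                      # lexical rule A -> words[i]
        return (A, words[i])
    if isinstance(entry, str):             # unary rule A -> B
        return (A, build_tree(back, words, i, j, entry))
    split, B, C = entry                    # binary rule A -> B C
    return (A,
            build_tree(back, words, i, split, B),
            build_tree(back, words, split, j, C))

# e.g. tree = build_tree(back, "fish people fish tanks".split(), 0, 4, "S")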

Evaluating constituency parsing

Gold standard brackets:
  S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets:
  S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision:  3/7 = 42.9%
Labeled Recall:     3/8 = 37.5%
LP/LR F1:           40.0%
Tagging Accuracy:   11/11 = 100.0%
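The bookkeeping behind those numbers (a sketch, not from the slides): count the labeled brackets in common between candidate and gold, then compute precision, recall, and F1.

from collections import Counter

def labeled_prf(gold, candidate):
    # gold, candidate: lists of (label, start, end) brackets
    matched = sum((Counter(gold) & Counter(candidate)).values())
    precision = matched / len(candidate)
    recall = matched / len(gold)
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1

gold = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)]
cand = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)]
print(labeled_prf(gold, cand))   # (0.4285..., 0.375, 0.4)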

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Give a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
  • (A word is independent of the rest of the tree, given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

• Example: John was believed to have been shot by Bill
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)
  • So the two analyses receive the same probability

94

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
  • At any node, the material inside that node is independent of the material outside that node, given the label of that node
  • Any information that statistically connects behavior inside and outside a node must flow through that node's label

[figure: an S → NP VP tree; the highlighted NP expands by NP → DT NN independently of the outside context]

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects)

  Expansion   All NPs   NPs under S   NPs under VP
  NP PP         11%          9%           23%
  DT NN          9%          9%            7%
  PRP            6%         21%            4%

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
    (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[two plots over horizontal Markov order 0, 1, 2v, 2, inf:
 parsing F1 (roughly 70–74) and number of symbols (up to ~12,000)]

Vertical Markovization

• Vertical Markov order: rewrites depend on past k ancestor nodes
  (i.e. parent annotation)

[two plots over vertical Markov order 1, 2v, 2, 3v, 3:
 parsing F1 (roughly 72–79) and number of symbols (up to ~25,000)]

  Model     F1     Size
  v=h=2v    77.8   7.5K
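Order-2 vertical Markovization is just parent annotation of the node labels; a small sketch (not from the slides) over trees written as nested tuples:

def parent_annotate(tree, parent="ROOT"):
    # tree: (label, child, child, ...) with plain strings at the leaves
    if isinstance(tree, str):          # a word: leave it alone
        return tree
    label, children = tree[0], tree[1:]
    new_label = "%s^%s" % (label, parent)
    return (new_label,) + tuple(parent_annotate(c, label) for c in children)

t = ("S", ("NP", ("PRP", "He")), ("VP", ("VBD", "was"), ("ADJP", ("JJ", "right"))))
print(parent_annotate(t))
# ('S^ROOT', ('NP^S', ('PRP^NP', 'He')),
#  ('VP^S', ('VBD^VP', 'was'), ('ADJP^VP', ('JJ', 'right'))))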

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used
• Solution: mark unary rewrite sites with -U

  Annotation   F1     Size
  Base         77.8   7.5K
  UNARY        78.3   8.0K

Tag Splits

• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution:
  • Subdivide the IN tag

  Annotation   F1     Size
  Previous     78.3   8.0K
  SPLIT-IN     80.3   8.1K

Other Tag Splits

  Annotation                                                           F1     Size
  UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")          80.4   8.1K
  UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")        80.5   8.1K
  TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)     81.2   8.5K
  SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]          81.6   9.0K
  SPLIT-CC: separate "but" and "&" from other conjunctions             81.7   9.1K
  SPLIT-%: "%" gets its own tag                                        81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

  Annotation   F1      Size
  tag splits   82.3     9.7K
  POSS-NP      83.1     9.8K
  SPLIT-VP     85.7    10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
    • Base NPs [cf. Collins 99]
    • Right-recursive NPs

  Annotation     F1      Size
  Previous       85.7    10.5K
  BASE-NP        86.0    11.7K
  DOMINATES-V    86.9    14.1K
  RIGHT-REC-NP   87.0    15.2K

[figure: a VP with NP and PP attachment sites marked v / -v for dominates-a-verb]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

  Parser               LP     LR     F1
  Magerman 95          84.9   84.6   84.7
  Collins 96           86.3   85.8   86.0
  Klein & Manning 03   86.9   85.7   86.3
  Charniak 97          87.4   87.5   87.4
  Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[diagram: X[h] over span (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j),
 e.g. (VP -> VBD[saw] NP[her]), i.e. (VP -> VBD NP)[saw]]

  bestScore(X, i, j, h):
    if (j = i)
      return score(X, s[i])
    else
      return max of
        max over k, w, X -> Y Z of
          score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
        max over k, w, X -> Y Z of
          score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i,j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[diagram: X[h] over (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j)]
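A per-cell beam is simple to state in code; a sketch (not the Collins parser itself) of keeping only the top-K scoring hypotheses for each span:

import heapq

def prune_cell(cell, K):
    # cell: dict mapping an item (e.g. a labeled, lexicalized edge) to its score.
    # Keep only the K best-scoring hypotheses for this span.
    if len(cell) <= K:
        return cell
    kept = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

cell = {("NP", "man"): 0.02, ("NP", "telescope"): 0.001, ("S", "saw"): 0.0005}
print(prune_cell(cell, 2))   # keeps the two highest-scoring entries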

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

  Parser               LP     LR     F1
  Magerman 95          84.9   84.6   84.7
  Collins 96           86.3   85.8   86.0
  Klein & Manning 03   86.9   85.7   86.3
  Charniak 97          87.4   87.5   87.4
  Collins 99           88.7   88.6   88.6

Analysis / Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

• Annotation refines base treebank symbols to improve statistical fit of the grammar
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[figure: a tree over "He was right" with latent subcategory variables X1 … X7 at its nodes]

• Can learn with EM, like Forward-Backward for HMMs
  (Forward/Outside and Backward/Inside passes)

Automatic Annotation Induction

• Label all nodes with latent variables
  • Same number k of subcategories for all categories
• Advantages
  • Automatically learned
• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit, while others are undersplit

  Model                 F1
  Klein & Manning '03   86.3
  Matsuzaki et al '05   86.7

Refinement of the DT tag

[figure: DT split into DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement: repeatedly learn more fine-grained subcategories
  • start with two (per non-terminal), then keep splitting
  • initialize each EM run with the output of the last

Adaptive Splitting

• Want to split complex categories more
• Idea: split everything, roll back the splits which were least useful   [Petrov et al. 06]

• Evaluate the loss in likelihood from removing each split:
    (data likelihood with split reversed) / (data likelihood with split)
• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

  Model              F1
  Previous           88.4
  With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

  Parser                    F1 (≤ 40 words)   F1 (all words)
  Klein & Manning '03       86.3              85.7
  Matsuzaki et al '05       86.7              86.1
  Collins '99               88.6              88.2
  Charniak & Johnson '05    90.1              89.6
  Petrov et al 06           90.2              89.7

Hierarchical Pruning

  coarse:          … QP NP VP …
  split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
  split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
  split in eight:  … (and so on) …

• Parse multiple times, with grammars at different levels of granularity

Bracket Posteriors

[figure: bracket posterior plots at each level of the grammar hierarchy;
 parsing time drops from 1621 min to 111 min, 35 min, and finally
 15 min at 91.2 F1 (no search error)]

Page 17: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Context-Free Grammars

21

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 18: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Context-Free Grammars in NLP

bull A context free grammar G in NLP = (N C Σ S L R)bull Σ is a set of terminal symbols

bull C is a set of preterminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull L is the lexicon a set of items of the form X x

bull X isin C and x isin Σ

bull R is the grammar a set of items of the form X

bull X isin N and isin (N cup C)

bull By usual convention S is the start symbol but in statistical NLP we usually have an extra node at the top (ROOT TOP)

bull We usually write e for an empty sequence rather than nothing22

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

• You should think of this as a transformation for efficient parsing

• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform

• In practice, full Chomsky Normal Form is a pain
  • Reconstructing n-aries is easy
  • Reconstructing unaries/empties is trickier

• Binarization is crucial for cubic time CFG parsing

• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

An example before binarization… (tree for "people fish tanks with rods")

  ROOT → S → NP VP, with NP → N (people) and a ternary VP → V NP PP
  (V = fish, NP → N = tanks, PP → P NP with P = with and NP → N = rods)

After binarization…

  The ternary VP is split in two: VP → V VP_V and VP_V → NP PP; the rest of the tree is unchanged.
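A small sketch of the n-ary part of the transform illustrated above (unaries and empties are not handled here); the intermediate symbol names follow the VP_V style of the example, and the tuple tree encoding is the same illustrative one used earlier:

    def binarize(tree):
        # Right-binarize n-ary nodes: VP -> V NP PP becomes VP -> V VP_V, VP_V -> NP PP.
        if isinstance(tree, str):                    # a word
            return tree
        label = tree[0]
        children = [binarize(c) for c in tree[1:]]
        if len(children) <= 2:
            return tuple([label] + children)
        first = children[0]
        first_label = first if isinstance(first, str) else first[0]
        rest = tuple([label + '_' + first_label] + children[1:])
        return (label, first, binarize(rest))

    vp = ('VP', ('V', 'fish'), ('NP', ('N', 'tanks')),
                ('PP', ('P', 'with'), ('NP', ('N', 'rods'))))
    print(binarize(vp))
    # ('VP', ('V', 'fish'), ('VP_V', ('NP', ('N', 'tanks')), ('PP', ...)))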

Treebank empties and unaries

Successive simplifications of the PTB tree for the one-word sentence "Atone":

  PTB Tree:    ROOT → S-HLN → (NP-SUBJ → -NONE- → e) (VP → VB → Atone)
  NoFuncTags:  ROOT → S → (NP → -NONE- → e) (VP → VB → Atone)
  NoEmpties:   ROOT → S → VP → VB → Atone
  NoUnaries:   either ROOT → S → Atone (high) or ROOT → VB → Atone (low)

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people: N 0.5, V 0.1, NP 0.35
fish:   N 0.2, V 0.6, NP 0.14, VP 0.06

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

Extended CKY parsing

• Unaries can be incorporated into the algorithm
  • Messy, but doesn't increase algorithmic complexity

• Empties can be incorporated
  • Use fenceposts
  • Doesn't increase complexity; essentially like unaries

• Binarization is vital
  • Without binarization you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
  • Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and X -> Y Z of
      q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)

function CKY(words, grammar) returns [most_probable_parse, prob]

  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]

  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965)… extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true

  return buildTree(score, back)

The CKY algorithm (1960/1965)… extended to unaries

The grammar: Binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0
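Below is a compact, runnable Python rendering of the pseudocode above on exactly this grammar; it is a sketch that follows the slides' recurrence (binary rules plus a unary closure per cell, back pointers omitted for brevity), not the original course code:

    from collections import defaultdict

    binary = {('S', ('NP', 'VP')): 0.9, ('VP', ('V', 'NP')): 0.5,
              ('VP', ('V', 'VP_V')): 0.3, ('VP', ('V', 'PP')): 0.1,
              ('VP_V', ('NP', 'PP')): 1.0, ('NP', ('NP', 'NP')): 0.1,
              ('NP', ('NP', 'PP')): 0.2, ('PP', ('P', 'NP')): 1.0}
    unary = {('S', 'VP'): 0.1, ('VP', 'V'): 0.1, ('NP', 'N'): 0.7}
    lexicon = {('N', 'people'): 0.5, ('N', 'fish'): 0.2, ('N', 'tanks'): 0.2,
               ('N', 'rods'): 0.1, ('V', 'people'): 0.1, ('V', 'fish'): 0.6,
               ('V', 'tanks'): 0.3, ('P', 'with'): 1.0}

    def apply_unaries(cell):
        # Close one chart cell under unary rules A -> B (the "handle unaries" loop).
        added = True
        while added:
            added = False
            for (a, b), p in unary.items():
                if b in cell and p * cell[b] > cell.get(a, 0.0):
                    cell[a] = p * cell[b]
                    added = True

    def cky(words):
        n = len(words)
        score = defaultdict(dict)            # (begin, end) -> {symbol: best prob}
        for i, w in enumerate(words):
            cell = score[(i, i + 1)]
            for (tag, word), p in lexicon.items():
                if word == w:
                    cell[tag] = p
            apply_unaries(cell)
        for span in range(2, n + 1):
            for begin in range(n - span + 1):
                end = begin + span
                cell = score[(begin, end)]
                for split in range(begin + 1, end):
                    left, right = score[(begin, split)], score[(split, end)]
                    for (a, (b, c)), p in binary.items():
                        if b in left and c in right:
                            prob = p * left[b] * right[c]
                            if prob > cell.get(a, 0.0):
                                cell[a] = prob
                apply_unaries(cell)
        return score[(0, n)]

    print(cky('fish people fish tanks'.split()))
    # The best S score, 0.00018522, matches the worked chart that follows.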

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP → NP NP 0.0000009604
VP → V NP 0.00002058
S → NP VP 0.00018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse
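The pseudocode finishes by calling buildTree(score, back); here is a sketch of that back-pointer reconstruction, assuming back maps (begin, end, symbol) either to a single child symbol (a unary step) or to a (split, B, C) triple (a binary step), which is how the pseudocode above fills it:

    def build_tree(back, words, begin, end, symbol):
        # Follow back-pointers down from (begin, end, symbol) to rebuild the best parse.
        entry = back.get((begin, end, symbol))
        if entry is None:                            # lexical rule: symbol -> words[begin]
            return (symbol, words[begin])
        if isinstance(entry, str):                   # unary rule over the same span
            return (symbol, build_tree(back, words, begin, end, entry))
        split, b, c = entry                          # binary rule split at 'split'
        return (symbol,
                build_tree(back, words, begin, split, b),
                build_tree(back, words, split, end, c))

    # root = build_tree(back, words, 0, len(words), 'S')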

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
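The arithmetic above is just set counting over labeled spans; a small sketch (the bracket encoding is chosen here for illustration) that reproduces the example numbers:

    def labeled_prf(gold, cand):
        # gold, cand: sets of (label, start, end) brackets.
        gold, cand = set(gold), set(cand)
        correct = len(gold & cand)
        p, r = correct / len(cand), correct / len(gold)
        f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
        return p, r, f1

    gold = {('S', 0, 11), ('NP', 0, 2), ('VP', 2, 9), ('VP', 3, 9),
            ('NP', 4, 6), ('PP', 6, 9), ('NP', 7, 9), ('NP', 9, 10)}
    cand = {('S', 0, 11), ('NP', 0, 2), ('VP', 2, 10), ('VP', 3, 10),
            ('NP', 4, 6), ('PP', 6, 10), ('NP', 7, 10)}
    print(labeled_prf(gold, cand))   # (0.428..., 0.375, 0.4)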

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

• Low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)
  • The two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

Expansion    All NPs    NPs under S    NPs under VP
NP PP          11%           9%             23%
DT NN           9%           9%              7%
PRP             6%          21%              4%
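Distributions like the table above come straight out of counting NP expansions conditioned on the parent category; a minimal sketch over the same illustrative tuple tree encoding used earlier:

    from collections import Counter, defaultdict

    def np_expansions_by_parent(trees):
        # Count how NP rewrites, grouped by the parent category of the NP node.
        counts = defaultdict(Counter)
        def walk(tree, parent):
            if isinstance(tree, str):
                return
            label, children = tree[0], tree[1:]
            rhs = ' '.join(c if isinstance(c, str) else c[0] for c in children)
            if label == 'NP':
                counts[parent][rhs] += 1
                counts['ALL'][rhs] += 1
            for c in children:
                walk(c, label)
        for t in trees:
            walk(t, 'TOP')
        return counts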

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong

(In the PTB, this construction is for possessives.)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

Charts (vs. horizontal Markov order 0, 1, 2v, 2, ∞): parsing F1 rises from about 70 to about 74, while the number of grammar symbols grows from roughly 3000 to 12000.

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 (plain categories) vs. Order 2 (parent-annotated categories)

Charts (vs. vertical Markov order 1, 2v, 2, 3v, 3): parsing F1 rises from about 72 to about 79, while the number of grammar symbols grows to roughly 25000.

Model      F1     Size
v=h=2v     77.8   7.5K
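Parent annotation (vertical order 2) is just a relabelling of the treebank before rule counting; a minimal sketch, again over the illustrative tuple encoding (this version annotates every node, including tags):

    def parent_annotate(tree, parent='ROOT'):
        # Append the parent category to each label, e.g. NP under S becomes NP^S.
        if isinstance(tree, str):
            return tree
        label, children = tree[0], tree[1:]
        return tuple([label + '^' + parent] +
                     [parent_annotate(c, label) for c in children])

    # ('S', ('NP', ('PRP', 'He')), ...)  ->  ('S^ROOT', ('NP^S', ('PRP^NP', 'He')), ...)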

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

• Partial Solution:
  • Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")

• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")

• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]

• SPLIT-CC: separate "but" and "&" from other conjunctions

• SPLIT-%: "%" gets its own tag

Annotation   F1     Size
UNARY-DT     80.4   8.1K
UNARY-RB     80.5   8.1K
TAG-PA       81.2   8.5K
SPLIT-AUX    81.6   9.0K
SPLIT-CC     81.7   9.1K
SPLIT-%      81.8   9.3K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads

• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser                 LP     LR     F1
Magerman 95            84.9   84.6   84.7
Collins 96             86.3   85.8   86.0
Klein & Manning 03     86.9   85.7   86.3
Charniak 97            87.4   87.5   87.4
Collins 99             88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
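The head-rule tables themselves are on the image slides; purely as an illustration of their shape, here is a simplified head finder in the usual priority-list style — the particular lists below are stand-ins, not the course's exact rules:

    # Simplified stand-in head rules: (scan direction, priority list of child labels).
    HEAD_RULES = {
        'NP': ('right', ['NN', 'NNS', 'NNP', 'NP', 'JJ']),
        'VP': ('left',  ['Vi', 'Vt', 'VB', 'VBD', 'VP']),
        'PP': ('left',  ['IN', 'P']),
        'S':  ('left',  ['VP', 'S']),
    }

    def find_head(label, child_labels):
        # Return the index of the head child of a rule  label -> child_labels.
        direction, priorities = HEAD_RULES.get(label, ('left', []))
        order = list(range(len(child_labels)))
        if direction == 'right':
            order.reverse()
        for wanted in priorities:
            for i in order:
                if child_labels[i] == wanted:
                    return i
        return order[0]                              # default: first child in scan order

    # find_head('VP', ['Vt', 'NP']) -> 0   (the verb heads the VP)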

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max over k, w, and X -> Y Z of the larger of
      score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n⁵) CKY
  • Remember only a few hypotheses for each span <i,j>
  • If we keep K hypotheses at each span, then we do at most O(nK²) work per span (why?)
  • Keeps things more or less cubic

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j
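A sketch of the per-cell beam itself; the cell representation and the value of K here are illustrative, not Collins' actual data structures:

    import heapq

    def prune_cell(cell, k=50):
        # cell: dict mapping (symbol, head_index) -> score; keep only the top-k entries.
        if len(cell) <= k:
            return cell
        return dict(heapq.nlargest(k, cell.items(), key=lambda kv: kv[1]))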

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser                 LP     LR     F1
Magerman 95            84.9   84.6   84.7
Collins 96             86.3   85.8   86.0
Klein & Manning 03     86.9   85.7   86.3
Charniak 97            87.4   87.5   87.4
Collins 99             88.7   88.6   88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional

• Advantages:
  • Fairly compact grammar
  • Linguistic motivations

• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories (X1 … X7 label the nodes of the tree for "He was right")
• Can learn with EM, like Forward-Backward for HMMs (Forward/Outside and Backward/Inside passes)

Automatic Annotation Induction

• Advantages:
  • Automatically learned
  • Label all nodes with latent variables
  • Same number k of subcategories for all categories

• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al '05    86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:
  (data likelihood with the split reversed) / (data likelihood with the split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model                F1
Previous             88.4
With 50% Merging     89.5

Number of Phrasal Subcategories

Final Results

Parser                     F1 (≤ 40 words)    F1 (all words)
Klein & Manning '03        86.3               85.7
Matsuzaki et al '05        86.7               86.1
Collins '99                88.6               88.2
Charniak & Johnson '05     90.1               89.6
Petrov et al 06            90.2               89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … … … … … … … … … … … … … …

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

Coarse-to-fine pruning timings: 1621 min → 111 min → 35 min → 15 min [91.2 F1] (no search error)

Page 19: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Context Free Grammar of English

23

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

• A key parsing decision is how we 'attach' various constituents
  • PPs, adverbial or participial phrases, infinitives, coordinations, etc.

• Catalan numbers: Cₙ = (2n)! / [(n+1)! n!]

• An exponentially growing series, which arises in many tree-like contexts
  • E.g. the number of possible triangulations of a polygon with n+2 sides
  • Turns up in triangulation of probabilistic graphical models…

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

• Prepositional phrases: They cooked the beans in the pot on the stove with handles

• Particle vs. preposition: The lady dressed up the staircase

• Complement structures: The tourists objected to the guide that they couldn't hear / She knows you like the back of her hand

• Gerund vs. participial adjective: Visiting relatives can be boring / Changing schedules frequently confused passengers

Syntactic Ambiguities II

• Modifier scope within NPs: impractical design requirements / plastic cup holder

• Multiple gap constructions: The chicken is ready to eat / The contractors are rich enough to sue

• Coordination scope: Small rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

• Dislocation / gapping
  • Which book should Peter buy?
  • A debate arose which continued until the election

• Binding
  • Reference: The IRS audits itself

• Control
  • I want to go
  • I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 20: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Left-Most Derivations

24

Properties of CFGs

25

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words, grammar) returns [most_probable_parse, prob]
  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back  = new Pair[#(words)+1][#(words)+1][#(nonterms)]
  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A]  = B
            added = true

The CKY algorithm (1960/1965)… extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A]  = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A]  = B
            added = true
  return buildTree(score, back)

The CKY algorithm (1960/1965)… extended to unaries
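To make the pseudocode concrete, here is a small runnable Python sketch of the same procedure (lexical step, then unary closure, then binary rules plus unary closure for each longer span). It is my own illustration, not the lecture's code; the grammar is the toy grammar from the following slides, and helper names such as apply_unaries are assumptions.

from collections import defaultdict

unary   = {("S", "VP"): 0.1, ("NP", "N"): 0.7, ("VP", "V"): 0.1}
binary  = {("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5,
           ("VP", "V", "VP_V"): 0.3, ("VP", "V", "PP"): 0.1,
           ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
           ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0}
lexicon = {"people": {"N": 0.5, "V": 0.1}, "fish": {"N": 0.2, "V": 0.6},
           "tanks": {"N": 0.2, "V": 0.3}, "rods": {"N": 0.1}, "with": {"P": 1.0}}

def cky(words):
    n = len(words)
    score = defaultdict(float)              # (begin, end, label) -> best probability

    def apply_unaries(b, e):
        added = True
        while added:                        # unary closure, as in the pseudocode
            added = False
            for (A, B), q in unary.items():
                p = q * score[b, e, B]
                if p > score[b, e, A]:
                    score[b, e, A] = p
                    added = True

    for i, w in enumerate(words):           # width-1 spans: lexicon, then unaries
        for tag, q in lexicon.get(w, {}).items():
            score[i, i + 1, tag] = q
        apply_unaries(i, i + 1)

    for span in range(2, n + 1):            # longer spans: binary rules, then unaries
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), q in binary.items():
                    p = score[begin, split, B] * score[split, end, C] * q
                    if p > score[begin, end, A]:
                        score[begin, end, A] = p
            apply_unaries(begin, end)
    return score

chart = cky(["fish", "people", "fish", "tanks"])
print(chart[0, 4, "S"])    # ≈ 0.00018522, matching the filled-in chart below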

The grammar: binary, no epsilons

S → NP VP      0.9
S → VP         0.1
VP → V NP      0.5
VP → V         0.1
VP → V VP_V    0.3
VP → V PP      0.1
VP_V → NP PP   1.0
NP → NP NP     0.1
NP → NP PP     0.2
NP → N         0.7
PP → P NP      1.0
N → people     0.5
N → fish       0.2
N → tanks      0.2
N → rods       0.1
V → people     0.1
V → fish       0.6
V → tanks      0.3
P → with       1.0

The chart has one cell score[begin][end] per span of the sentence fish (0–1), people (1–2), fish (2–3), tanks (3–4):
width 1: [0,1] [1,2] [2,3] [3,4];  width 2: [0,2] [1,3] [2,4];  width 3: [0,3] [1,4];  width 4: [0,4]

Step 1 — lexicon (the loop score[i][i+1][A] = P(A -> words[i])):
  [0,1] fish:    N 0.2   V 0.6
  [1,2] people:  N 0.5   V 0.1
  [2,3] fish:    N 0.2   V 0.6
  [3,4] tanks:   N 0.2   V 0.3

Step 2 — unary closure on the width-1 spans (NP → N, VP → V, S → VP):
  [0,1] fish:    N 0.2   V 0.6   NP 0.14   VP 0.06   S 0.006
  [1,2] people:  N 0.5   V 0.1   NP 0.35   VP 0.01   S 0.001
  [2,3] fish:    N 0.2   V 0.6   NP 0.14   VP 0.06   S 0.006
  [3,4] tanks:   N 0.2   V 0.3   NP 0.14   VP 0.03   S 0.003

Step 3 — binary rules, then unaries, on the width-2 spans:
  [0,2]: NP (NP NP) 0.0049    VP (V NP) 0.105    S (VP) 0.0105
  [1,3]: NP (NP NP) 0.0049    VP (V NP) 0.007    S (NP VP) 0.0189
  [2,4]: NP (NP NP) 0.00196   VP (V NP) 0.042    S (VP) 0.0042

Step 4 — the width-3 spans:
  [0,3]: NP (NP NP) 0.0000686   VP (V NP) 0.00147     S (NP VP) 0.000882
  [1,4]: NP (NP NP) 0.0000686   VP (V NP) 0.0000098   S (NP VP) 0.01323

Step 5 — the full span:
  [0,4]: NP (NP NP) 0.0000009604   VP (V NP) 0.00002058   S (NP VP) 0.00018522

Call buildTree(score back) to get the best parse
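buildTree itself is not spelled out on the slides; below is a minimal sketch under the assumption that back is a dictionary keyed by (begin, end, label), holding either a unary child label or a (split, B, C) triple as in the pseudocode above (my own code and data layout, for illustration only).

def build_tree(back, words, begin, end, label):
    bp = back.get((begin, end, label))
    if bp is None:                       # lexical cell: span covers one word
        return (label, words[begin])
    if isinstance(bp, str):              # unary backpointer: label -> bp
        return (label, build_tree(back, words, begin, end, bp))
    split, B, C = bp                     # binary backpointer: label -> B C
    return (label,
            build_tree(back, words, begin, split, B),
            build_tree(back, words, split, end, C))

# usage: tree = build_tree(back, words, 0, len(words), "S")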

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0

Tagging Accuracy: 11/11 = 100.0%
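The same computation as a small Python sketch (my own helper, not a standard API), with brackets represented as (label, start, end) triples taken from this example:

def parseval(gold, cand):
    matched = len(gold & cand)               # brackets that agree on label and span
    lp = matched / len(cand)
    lr = matched / len(gold)
    f1 = 2 * lp * lr / (lp + lr)
    return lp, lr, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print(parseval(gold, cand))   # ≈ (0.429, 0.375, 0.400)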

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Give a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

• The low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)
  • The two analyses receive the same probability

94

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
• At any node, the material inside that node is independent of the material outside that node, given the label of that node
• Any information that statistically connects behavior inside and outside a node must flow through that node's label

[Figure: a tree in which S expands as NP VP (rule S → NP VP) and the NP expands as DT NN (rule NP → DT NN)]

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects)

                 NP PP   DT NN   PRP
  All NPs          11%      9%     6%
  NPs under S       9%      9%    21%
  NPs under VP     23%      7%     4%

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
  (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states (a binarization sketch follows below)

[Figures: parsing F1 (roughly 70–74) and grammar size (number of symbols, roughly 3,000–12,000) as a function of horizontal Markov order 0, 1, 2v, 2, inf]
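One way to see what horizontal Markovization does is in the binarization step itself. The sketch below is my own illustration (the intermediate symbol names such as @VP->...V are made up): it keeps only the last h sisters in the intermediate symbols, so different full histories collapse to the same state.

def binarize_markov(lhs, rhs, h=1):
    """Right-binarize lhs -> rhs, remembering at most h previous sisters
    in the names of the intermediate symbols."""
    rules, seen, parent = [], [], lhs
    for i in range(len(rhs) - 2):
        seen.append(rhs[i])
        context = "_".join(seen[-h:]) if h > 0 else ""
        new_sym = f"@{lhs}->...{context}"
        rules.append((parent, [rhs[i], new_sym]))
        parent = new_sym
    rules.append((parent, rhs[-2:]))
    return rules

print(binarize_markov("VP", ["V", "NP", "PP", "ADVP"], h=1))
# [('VP', ['V', '@VP->...V']), ('@VP->...V', ['NP', '@VP->...NP']),
#  ('@VP->...NP', ['PP', 'ADVP'])]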

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation; see the parent-annotation sketch below)

[Figures: example trees for Order 1 vs. Order 2, and parsing F1 (roughly 72–79) and grammar size (number of symbols, up to ~25,000) as a function of vertical Markov order 1, 2v, 2, 3v, 3]

Model      F1     Size
v=h=2v     77.8   7.5K
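Order-2 vertical Markovization is just parent annotation. A small sketch (my own code, operating on bracketed (label, children) tuples, which is an assumed representation):

def parent_annotate(tree, parent="ROOT"):
    """Annotate every nonterminal with its parent's (unannotated) label."""
    label, children = tree
    new_children = [c if isinstance(c, str) else parent_annotate(c, label)
                    for c in children]
    return (f"{label}^{parent}", new_children)

t = ("S", [("NP", [("N", ["people"])]),
           ("VP", [("V", ["fish"]), ("NP", [("N", ["tanks"])])])])
print(parent_annotate(t))
# ('S^ROOT', [('NP^S', [('N^NP', ['people'])]),
#             ('VP^S', [('V^VP', ['fish']), ('NP^VP', [('N^NP', ['tanks'])])])])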

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

• Solution: mark unary rewrite sites with -U

Tag Splits

• Problem: Treebank tags are too coarse

• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN

• Partial solution:
  • Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")            F1 80.4   Size 8.1K
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")          F1 80.5   Size 8.1K
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)       F1 81.2   Size 8.5K
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]            F1 81.6   Size 9.0K
• SPLIT-CC: separate "but" and "&" from other conjunctions               F1 81.7   Size 9.1K
• SPLIT-%: "%" gets its own tag                                          F1 81.8   Size 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield

• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads

• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3    9.7K
POSS-NP      83.1    9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

[Figure: a PP attachment site under the VP (marked v, dominates a verb) vs. under the NP (marked -v)]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: X[h] over span i..j is built from Y[h] over i..k and Z[h'] over k..j, where h is the head of X]
e.g. (VP → VBD NP)[saw], i.e. VP[saw] → VBD[saw] NP[her]

bestScore(X, i, j, h)
  if (j = i)
    return score(X, s[i])
  else
    return max over k and rules X -> Y Z of
      max  score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max  score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i,j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic (a small pruning sketch follows below)
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram as above: X[h] over i..j built from Y[h] and Z[h']]
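A minimal sketch of the per-cell beam idea (my own illustration; the cell layout and scores below are made up, not the Collins parser's actual data structures):

import heapq

def prune_cell(cell, k=5):
    """cell: dict mapping (label, head) -> score; keep only the k best hypotheses."""
    return dict(heapq.nlargest(k, cell.items(), key=lambda kv: kv[1]))

cell = {("NP", "tanks"): 0.14, ("VP", "fish"): 0.042, ("S", "fish"): 0.0042,
        ("NP", "fish"): 0.002, ("VP", "tanks"): 0.001, ("S", "tanks"): 0.0001}
print(prune_cell(cell, k=3))   # keeps the three highest-scoring hypotheses for this span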

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering?

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories (X1, X2, …, X7 over the tree for "He was right")

Can learn with EM, like Forward-Backward for HMMs (Forward ≈ Outside, Backward ≈ Inside)

Automatic Annotation Induction

• Advantages
  • Automatically learned: label all nodes with latent variables, with the same number k of subcategories for all categories
• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit, while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories — start with two (per non-terminal), then keep splitting; initialize each EM run with the output of the last.

[Figure: the DT tag being successively split]

Adaptive Splitting

Want to split complex categories more.

Idea: split everything, then roll back the splits which were least useful. [Petrov et al. 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:

      loss(split) = (data likelihood with split reversed) / (data likelihood with split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model               F1
Previous            88.4
With 50% Merging    89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al. '05      86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al. 06          90.2              89.7

Hierarchical Pruning

Parse multiple times with grammars at different levels of granularity:

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … … … … …

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)


A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 22: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

Attachment ambiguities

bull A key parsing decision is how we lsquoattachrsquo various constituentsbull PPs adverbial or participial phrases infinitives coordinations etc

bull Catalan numbers Cn

= (2n)[(n+1)n]

bull An exponentially growing series which arises in many tree-like contexts

bull Eg the number of possible triangulations of a polygon with n+2 sides

bull Turns up in triangulation of probabilistic graphical modelshellip

Attachments

bull I cleaned the dishes from dinner

bull I cleaned the dishes with detergent

bull I cleaned the dishes in my pajamas

bull I cleaned the dishes in the sink

Syntactic Ambiguities I

bull Prepositional phrasesThey cooked the beans in the pot on the stove with handles

bull Particle vs prepositionThe lady dressed up the staircase

bull Complement structuresThe tourists objected to the guide that they couldnrsquot hearShe knows you like the back of her hand

bull Gerund vs participial adjectiveVisiting relatives can be boringChanging schedules frequently confused passengers

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4    fish people fish tanks
S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4    fish people fish tanks
S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP 00000009604
VP V NP 000002058
S NP VP 000018522

0

1

2

3

4

1 2 3 4    fish people fish tanks
S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse
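The slides call buildTree(score, back) without showing it. Below is a minimal sketch (not from the deck) of how the backpointer table can be unwound into a tree, assuming back is a dict keyed by (begin, end, nonterminal) whose value is either absent (lexical cell), a single nonterminal (unary step), or a (split, B, C) triple (binary step); the array-indexed back[i][j][A] of the pseudocode works the same way. Function and variable names are illustrative.

  def build_tree(back, words, begin, end, A):
      # back[(begin, end, A)] is one of:
      #   missing          -> A -> words[begin] was a lexical rule
      #   "B"              -> unary rule A -> B over the same span
      #   (split, B, C)    -> binary rule A -> B C, B over [begin, split), C over [split, end)
      entry = back.get((begin, end, A))
      if entry is None:
          return (A, words[begin])                        # preterminal
      if isinstance(entry, tuple):
          split, B, C = entry
          return (A, build_tree(back, words, begin, split, B),
                     build_tree(back, words, split, end, C))
      return (A, build_tree(back, words, begin, end, entry))   # follow one unary step

  # Tiny example using the toy grammar above (S -> NP VP, NP -> N, VP -> V):
  back = {(0, 2, "S"): (1, "NP", "VP"), (0, 1, "NP"): "N", (1, 2, "VP"): "V"}
  print(build_tree(back, ["people", "fish"], 0, 2, "S"))
  # ('S', ('NP', ('N', 'people')), ('VP', ('V', 'fish')))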

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0

Tagging Accuracy: 11/11 = 100.0%
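A small sketch (not from the slides) of how the labeled precision / recall / F1 figures above are computed, representing each parse as a collection of labeled brackets (label, start, end):

  from collections import Counter

  def parseval(gold, candidate):
      # gold, candidate: lists of (label, start, end) brackets
      g, c = Counter(gold), Counter(candidate)
      matched = sum((g & c).values())            # brackets identical in label and span
      precision = matched / len(candidate)
      recall = matched / len(gold)
      f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
      return precision, recall, f1

  gold = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
          ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)]
  cand = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
          ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)]
  print(parseval(gold, cand))   # -> (0.4285..., 0.375, 0.4), i.e. the 42.9 / 37.5 / 40.0 above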

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

• The low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)
• The two analyses therefore receive the same probability

94
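Spelled out: a PCFG scores a tree by the product of its rule probabilities, so if the low-attachment tree t_low and the high-attachment tree t_high contain exactly the same rules with the same counts, as in this example, then

  p(t_low) = product over rules r of q(r)^count(r) = p(t_high)

and the model has no way to prefer the analysis in which Bill does the shooting.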

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

[Figure: a tree fragment with an NP node inside an S; the rules S -> NP VP and NP -> DT NN illustrate that, given the label NP, what is inside the node is independent of what is outside it]

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

[Bar chart: how often an NP expands as NP PP, DT NN, or PRP, shown for all NPs, for NPs under S (mostly subjects), and for NPs under VP (mostly objects); the three distributions differ sharply, e.g. pronominal (PRP) expansions are far more common in subject position]

Non-Independence II

• Symptoms of overly strong assumptions: rewrites get used where they don't belong
• Example from the figure: in the PTB, this construction is reserved for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

[Charts: parsing accuracy (F1 roughly 70-74) and grammar size (number of symbols, up to about 12000) as a function of horizontal Markov order 0, 1, 2v, 2, inf]

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

[Figure: example trees for vertical order 1 vs. order 2 (parent annotation)]
[Charts: parsing accuracy (F1 roughly 72-79) and grammar size (number of symbols, up to about 25000) as a function of vertical Markov order 1, 2v, 2, 3v, 3]

Model     F1     Size
v=h=2v    77.8   75K
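A minimal sketch (not from the slides) of order-2 vertical markovization, i.e. annotating every node with its parent's label before rules are read off; trees are assumed to be nested tuples such as ("S", ("NP", ...), ("VP", ...)), and whether preterminal tags are also annotated is a separate design choice (the TAG-PA split below).

  def parent_annotate(tree, parent="ROOT"):
      # tree is (label, child, ...) with plain strings at the leaves
      label, children = tree[0], tree[1:]
      if len(children) == 1 and isinstance(children[0], str):
          return (label + "^" + parent, children[0])      # preterminal: keep the word
      return tuple([label + "^" + parent] +
                   [parent_annotate(c, parent=label) for c in children])

  t = ("S", ("NP", ("N", "people")), ("VP", ("V", "fish"), ("NP", ("N", "tanks"))))
  print(parent_annotate(t))
  # ('S^ROOT', ('NP^S', ('N^NP', 'people')), ('VP^S', ('V^VP', 'fish'), ('NP^VP', ('N^NP', 'tanks'))))

Distinguishing NP^S (subjects) from NP^VP (objects) directly targets the non-independence illustrated earlier.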

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   75K
UNARY        78.3   80K

Solution: mark unary rewrite sites with -U

Tag Splits

• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution: subdivide the IN tag

Annotation   F1     Size
Previous     78.3   80K
SPLIT-IN     80.3   81K
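SPLIT-IN can be pictured as a deterministic relabeling pass over the treebank before the grammar is read off. The sketch below uses a coarse two-way bucket by parent category purely for illustration; the original subdivision is finer-grained.

  def split_in(tree, parent=None):
      # tree = (label, child, ...); leaves are plain strings
      label, children = tree[0], tree[1:]
      if label == "IN" and isinstance(children[0], str):
          new = "IN^SBAR" if parent == "SBAR" else "IN^PP" if parent == "PP" else "IN"
          return (new, children[0])
      return tuple([label] + [split_in(c, parent=label) if isinstance(c, tuple) else c
                              for c in children])

  print(split_in(("PP", ("IN", "with"), ("NP", ("NN", "rods")))))
  # ('PP', ('IN^PP', 'with'), ('NP', ('NN', 'rods')))

After relabeling, complementizers like "that" (under SBAR) and true prepositions like "with" (under PP) no longer share a tag, so the grammar stops confusing them.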

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1 / Size after each split (cumulative, in the order listed above):
80.4 81K; 80.5 81K; 81.2 85K; 81.6 90K; 81.7 91K; 81.8 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

• Examples:
• Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   97K
POSS-NP      83.1   98K
SPLIT-VP     85.7   105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites:
• Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation      F1     Size
Previous        85.7   105K
BASE-NP         86.0   117K
DOMINATES-V     86.9   141K
RIGHT-REC-NP    87.0   152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
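The head-rule slides above are shown as figures; the sketch below only gives the flavor of such rules (in the spirit of Collins-style head tables, not the exact table): for an NP, scan the children from the right for a noun-like category, otherwise fall back to the rightmost child, and let that child's headword percolate up.

  # Illustrative head-finding rule for NP, not the exact Collins (1999) table.
  NOUN_LIKE = {"NN", "NNS", "NNP", "NNPS", "NP"}

  def np_head_child(children):
      # children: child category labels, left to right
      for label in reversed(children):        # prefer the rightmost nominal child
          if label in NOUN_LIKE:
              return label
      return children[-1]                     # default: rightmost child

  print(np_head_child(["DT", "JJ", "NN"]))    # -> NN  ("the red ball" is headed by "ball")
  print(np_head_child(["NP", "PP"]))          # -> NP  ("the man with a telescope" is headed by "man")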

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Figure: an item X[h] over span (i, j) is built from Y[h] over (i, k) and Z[h'] over (k, j); the head h of one child becomes the head of X]

Rule notation: (VP -> VBD[saw] NP[her]) is an instance of the lexicalized rule (VP -> VBD NP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max over split points k, rules X -> Y Z, and head words w of the two cases:
      score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)    (head comes from Y)
      score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)    (head comes from Z)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
• Essentially run the O(n^5) CKY
• Remember only a few hypotheses for each span <i, j>
• If we keep K hypotheses at each span then we do at most O(nK^2) work per span (why?)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Figure: item X[h] over (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j)]
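A minimal sketch of the per-cell beam in the spirit described above: once a span's cell is filled, keep only the K best (nonterminal, headword) hypotheses, so that later spans combine at most K x K items per split point, which is the O(nK^2) work per span mentioned above. Names are illustrative.

  import heapq

  def prune_cell(cell, K=10):
      # cell: dict mapping (nonterminal, headword) -> Viterbi score for one span <i, j>
      if len(cell) <= K:
          return cell
      return dict(heapq.nlargest(K, cell.items(), key=lambda kv: kv[1]))

  cell = {("NP", "man"): 3e-4, ("NP", "telescope"): 1e-6, ("VP", "saw"): 2e-5, ("S", "saw"): 8e-7}
  print(prune_cell(cell, K=2))   # keeps only ('NP', 'man') and ('VP', 'saw')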

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: a parse tree over "He was right" with latent subcategory variables X1 ... X7 at its nodes]

Can learn with EM, like Forward-Backward for HMMs (Forward corresponds to Outside, Backward to Inside)

Automatic Annotation Induction

• Advantages:
  • Automatically learned
(Method: label all nodes with latent variables; same number k of subcategories for all categories)
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

[Figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement: repeatedly learn more fine-grained subcategories
• start with two (per non-terminal), then keep splitting
• initialize each EM run with the output of the last

[Figure: the DT tag refined further, one binary split at a time]
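A sketch of the split step only, assuming rule probabilities live in a dict from (parent, children) to probability: each symbol is split in two, each new rule shares its parent rule's mass over the copies, and a little random noise breaks the symmetry before the next EM run (the initialization trick mentioned above). This is illustrative scaffolding, not the full learner of Petrov et al.

  import random
  from itertools import product

  def split_symbol(sym):
      return [sym + "-0", sym + "-1"]

  def split_grammar(rules, noise=0.01, seed=0):
      # rules: {(parent, (child1, child2, ...)): prob}; lowercase children are terminals, left unsplit
      rng = random.Random(seed)
      new_rules = {}
      for (parent, children), p in rules.items():
          for new_parent in split_symbol(parent):
              choices = [split_symbol(c) if c[0].isupper() else [c] for c in children]
              combos = list(product(*choices))
              for combo in combos:
                  # share the mass over the copies, then jitter to break symmetry
                  new_rules[(new_parent, combo)] = (p / len(combos)) * (1 + rng.uniform(-noise, noise))
      return new_rules

  g = {("NP", ("DT", "NN")): 0.3, ("DT", ("the",)): 1.0}
  print(len(split_grammar(g)))   # 2*4 NP rules + 2*1 DT rules = 10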

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:
      loss(split) = (data likelihood with the split reversed) / (data likelihood with the split)
• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model               F1
Previous            88.4
With 50% Merging    89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al. '05      86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al. '06         90.2              89.7

Hierarchical Pruning

coarse:          ... QP NP VP ...
split in two:    ... QP1 QP2 NP1 NP2 VP1 VP2 ...
split in four:   ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...
split in eight:  ... (and so on for every category) ...

Parse multiple times with grammars at different levels of granularity
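A sketch of the coarse-to-fine idea behind this: parse with the coarser grammar first, record a score for every labeled span, and let the finer pass build an item only if its coarse projection survived a threshold. The chart machinery is abstracted away; project() maps a refined symbol such as NP-3 back to its coarse symbol, and the threshold value is illustrative.

  def project(symbol):
      # NP-3 -> NP, VP-1 -> VP, ...: each refined grammar refines the previous one
      return symbol.split("-")[0]

  def allowed(coarse_scores, i, j, refined_symbol, threshold=1e-6):
      # coarse_scores: dict (i, j, coarse_symbol) -> posterior (or max) score from the coarser pass
      return coarse_scores.get((i, j, project(refined_symbol)), 0.0) >= threshold

  coarse = {(0, 2, "NP"): 0.03, (0, 2, "VP"): 1e-9}
  print(allowed(coarse, 0, 2, "NP-3"), allowed(coarse, 0, 2, "VP-1"))   # True False

  # Inside the fine pass, a cell entry for "NP-3" over (begin, end) is only built
  # if allowed(coarse_scores, begin, end, "NP-3") is true.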

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)



bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 25: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Syntactic Ambiguities I

• Prepositional phrases: They cooked the beans in the pot on the stove with handles

• Particle vs preposition: The lady dressed up the staircase

• Complement structures: The tourists objected to the guide that they couldn't hear / She knows you like the back of her hand

• Gerund vs participial adjective: Visiting relatives can be boring / Changing schedules frequently confused passengers

Syntactic Ambiguities II

• Modifier scope within NPs: impractical design requirements / plastic cup holder

• Multiple gap constructions: The chicken is ready to eat / The contractors are rich enough to sue

• Coordination scope: Small rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

• Dislocation / gapping: Which book should Peter buy? / A debate arose which continued until the election

• Binding (reference): The IRS audits itself

• Control: I want to go / I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing: two problems to solve. 1: Repeated work…

Parsing: two problems to solve. 2: Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

• Is the problem 'AI complete'? Yes, but …

• Words are good predictors of attachment, even absent full understanding:

• Moscow sent more than 100,000 soldiers into Afghanistan …

• Sydney Water breached an agreement with NSW Health …

• Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic – or stochastic – context-free grammars (PCFGs)

• G = (Σ, N, S, R, P):

• Σ is a set of terminal symbols

• N is a set of nonterminal symbols

• S is the start symbol (S ∈ N)

• R is a set of rules/productions of the form X → γ

• P is a probability function, P: R → [0,1], with Σγ P(X → γ) = 1 for every nonterminal X

• A grammar G generates a language model L: summed over all terminal strings g, Σ P(g) = 1 (a small sketch of this representation follows)
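To make this concrete, here is a minimal sketch (not from the slides) of a PCFG held in a Python dictionary, using a few rules of the example grammar below, together with the sum-to-one check on each left-hand side:

# A toy PCFG stored as {lhs: [(rhs_tuple, prob), ...]}.
# Rule names follow the example grammar below; this is an illustrative sketch.
pcfg = {
    "S":  [(("NP", "VP"), 1.0)],
    "VP": [(("Vi",), 0.4), (("Vt", "NP"), 0.4), (("VP", "PP"), 0.2)],
    "NP": [(("DT", "NN"), 0.3), (("NP", "PP"), 0.7)],
    "PP": [(("P", "NP"), 1.0)],
}

# Sanity check: P defines a distribution over the expansions of each nonterminal.
for lhs, expansions in pcfg.items():
    total = sum(p for _, p in expansions)
    assert abs(total - 1.0) < 1e-9, f"probabilities for {lhs} sum to {total}"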

PCFG Example: A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0

Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

• Probability of a tree t with rules α1 → β1, α2 → β2, …, αn → βn is

p(t) = ∏_{i=1}^{n} q(αi → βi)

where q(α → β) is the probability for rule α → β.

44

Example of a PCFG

48

Probability of a Parse (using the same PCFG and tree-probability formula as above)


Two example parses:

t1 = the tree for "The man sleeps": S → NP VP, NP → DT NN, DT → the, NN → man, VP → Vi, Vi → sleeps.
p(t1) = 1.0 × 0.3 × 1.0 × 0.7 × 0.4 × 1.0 = 0.084

t2 = the tree for "The man saw the woman with the telescope", with the PP attached to the VP; p(t2) is likewise the product of the probabilities of every rule used in that tree.
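A minimal sketch of the product above for t1, with the rule probabilities taken from the grammar (the rule strings are just labels for illustration):

from functools import reduce

# Rules used in t1 = "The man sleeps", with their probabilities from the PCFG above.
rules_t1 = [
    ("S -> NP VP", 1.0),
    ("NP -> DT NN", 0.3),
    ("DT -> the", 1.0),
    ("NN -> man", 0.7),
    ("VP -> Vi", 0.4),
    ("Vi -> sleeps", 1.0),
]

p_t1 = reduce(lambda acc, rule: acc * rule[1], rules_t1, 1.0)
print(p_t1)  # ≈ 0.084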

PCFGs Learning and Inference

Model: the probability of a tree t with n rules αi → βi, i = 1..n, is the product of the rule probabilities.

Learning: read the rules off of labeled sentences, use ML estimates for the probabilities, q_ML(α → β) = Count(α → β) / Count(α), and use all of our standard smoothing tricks (a small sketch follows below).

Inference: for an input sentence s, define T(s) to be the set of trees whose yield is s (whose leaves, read left to right, match the words in s).
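A small sketch of the learning step under an assumed tree encoding (nested tuples are a stand-in for whatever treebank reader is actually used): count each rule and divide by the count of its left-hand side.

from collections import defaultdict

def ml_rule_probs(trees):
    """trees: nested tuples like ("S", ("NP", ("N", "people")), ("VP", ("V", "fish")))."""
    rule_counts = defaultdict(int)
    lhs_counts = defaultdict(int)

    def visit(node):
        if isinstance(node, str):          # a word; no rule to record
            return
        lhs, children = node[0], node[1:]
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_counts[(lhs, rhs)] += 1
        lhs_counts[lhs] += 1
        for c in children:
            visit(c)

    for t in trees:
        visit(t)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Example: two tiny trees give q(VP -> V NP) = 0.5 and q(VP -> V) = 0.5.
trees = [
    ("S", ("NP", ("N", "people")), ("VP", ("V", "fish"), ("NP", ("N", "tanks")))),
    ("S", ("NP", ("N", "people")), ("VP", ("V", "fish"))),
]
print(ml_rule_probs(trees))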

Grammar Transforms

51

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w, with X, Y, Z ∈ N and w ∈ Σ

• A transformation to this form doesn't change the weak generative capacity of a CFG

• That is, it recognizes the same language (but maybe with different trees)

• Empties and unaries are removed recursively

• n-ary rules are divided by introducing new nonterminals (n > 2); a small sketch of this step follows below
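An illustrative sketch of the binarization step, using the same VP_V-style intermediate symbols as the grammars below (this is not the exact tool used for the lecture's grammars):

def binarize(rules):
    """Turn rules like ("VP", ("V", "NP", "PP")) into binary rules by
    introducing intermediate symbols such as VP_V."""
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new_sym = f"{lhs}_{rhs[0]}"          # e.g. VP_V
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]
        out.append((lhs, rhs))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]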

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

• You should think of this as a transformation for efficient parsing

• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform (see the sketch after this list)

• In practice, full Chomsky Normal Form is a pain

• Reconstructing n-aries is easy; reconstructing unaries/empties is trickier

• Binarization is crucial for cubic-time CFG parsing

• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker
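For the book-keeping point above, a sketch of the detransform: it assumes the naming convention used here (intermediate labels contain an underscore) and splices such nodes out so their children re-attach to the parent.

def unbinarize(tree):
    """tree: (label, child, child, ...) or a word string.
    Splices out intermediate nodes introduced by binarization (labels with '_')."""
    if isinstance(tree, str):
        return tree
    label, children = tree[0], [unbinarize(c) for c in tree[1:]]
    flat = []
    for c in children:
        if not isinstance(c, str) and "_" in c[0]:
            flat.extend(c[1:])     # re-attach the intermediate node's children
        else:
            flat.append(c)
    return (label,) + tuple(flat)

binarized = ("VP", ("V", "fish"),
             ("VP_V", ("NP", ("N", "tanks")),
                      ("PP", ("P", "with"), ("NP", ("N", "rods")))))
print(unbinarize(binarized))
# ('VP', ('V', 'fish'), ('NP', ('N', 'tanks')), ('PP', ('P', 'with'), ('NP', ('N', 'rods'))))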

An example before binarization…

(ROOT (S (NP (N people)) (VP (V fish) (NP (N tanks)) (PP (P with) (NP (N rods))))))

After binarization…

(ROOT (S (NP (N people)) (VP (V fish) (VP_V (NP (N tanks)) (PP (P with) (NP (N rods)))))))

Treebank empties and unaries (the one-word sentence "Atone"):

PTB tree: (ROOT (S-HLN (NP-SUBJ (-NONE- e)) (VP (VB Atone))))
NoFuncTags: (ROOT (S (NP (-NONE- e)) (VP (VB Atone))))
NoEmpties: (ROOT (S (VP (VB Atone))))
NoUnaries, high attachment: (ROOT (S Atone)); low attachment: (ROOT (VB Atone))

Parsing

66

Constituency Parsing

fish people fish tanks

Rule → Prob θi:
S → NP VP θ0
NP → NP NP θ1
…
N → fish θ42
N → people θ43
V → fish θ44
…

PCFG (shown with an example parse of "fish people fish tanks": S → NP VP, with the NP covering "fish people" and the VP covering "fish tanks")

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (max) scores for the one-word spans, e.g. "people": NP 0.35, V 0.1, N 0.5; "fish": VP 0.06, NP 0.14, V 0.6, N 0.2

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

Extended CKY parsing

• Unaries can be incorporated into the algorithm: messy, but doesn't increase algorithmic complexity

• Empties can be incorporated: use fenceposts; doesn't increase complexity, essentially like unaries

• Binarization is vital: without binarization you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar

• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and X -> Y Z of
      q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)

The CKY algorithm (1960/1965) … extended to unaries

function CKY(words, grammar) returns [most_probable_parse, prob]

  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back  = new Pair[#(words)+1][#(words)+1][#(nonterms)]

  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true

  return buildTree(score, back)
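Below is a compact runnable re-implementation of the same algorithm in Python, offered as an illustration rather than the original course code; it hard-codes the small fish/people grammar listed just below and returns the Viterbi scores for the whole sentence.

from collections import defaultdict

# Binarized PCFG from the worked example (probabilities as in the slides).
binary = {  # (B, C) -> list of (A, prob) for rules A -> B C
    ("NP", "VP"): [("S", 0.9)],
    ("V", "NP"): [("VP", 0.5)],
    ("V", "VP_V"): [("VP", 0.3)],
    ("V", "PP"): [("VP", 0.1)],
    ("NP", "PP"): [("NP", 0.2), ("VP_V", 1.0)],
    ("NP", "NP"): [("NP", 0.1)],
    ("P", "NP"): [("PP", 1.0)],
}
unary = {"N": [("NP", 0.7)], "V": [("VP", 0.1)], "VP": [("S", 0.1)]}  # B -> (A, prob) for A -> B
lexical = {
    "people": [("N", 0.5), ("V", 0.1)],
    "fish": [("N", 0.2), ("V", 0.6)],
    "tanks": [("N", 0.2), ("V", 0.3)],
    "rods": [("N", 0.1)],
    "with": [("P", 1.0)],
}

def apply_unaries(cell):
    changed = True
    while changed:                      # unary closure, as in the pseudocode above
        changed = False
        for b, score_b in list(cell.items()):
            for a, p in unary.get(b, []):
                if p * score_b > cell.get(a, 0.0):
                    cell[a] = p * score_b
                    changed = True

def cky(words):
    n = len(words)
    score = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for a, p in lexical.get(w, []):
            score[i][i + 1][a] = p
        apply_unaries(score[i][i + 1])
    for span in range(2, n + 1):
        for begin in range(0, n - span + 1):
            end = begin + span
            cell = score[begin][end]
            for split in range(begin + 1, end):
                left, right = score[begin][split], score[split][end]
                for (b, c), rules in binary.items():
                    if left[b] > 0 and right[c] > 0:
                        for a, p in rules:
                            cand = p * left[b] * right[c]
                            if cand > cell[a]:
                                cell[a] = cand
            apply_unaries(cell)
    return score[0][n]

print(cky("fish people fish tanks".split())["S"])   # ≈ 0.00018522, the best S parse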

The grammar: binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

(CKY chart for "fish people fish tanks", fenceposts 0–4, filled step by step with the grammar above.)

Lexical fill, score[i][i+1][A] = P(A → words[i]):
  [0,1] fish: N 0.2, V 0.6    [1,2] people: N 0.5, V 0.1    [2,3] fish: N 0.2, V 0.6    [3,4] tanks: N 0.2, V 0.3

Unary closure over the one-word spans (NP → N, VP → V, S → VP):
  [0,1] fish: NP 0.14, VP 0.06, S 0.006      [1,2] people: NP 0.35, VP 0.01, S 0.001
  [2,3] fish: NP 0.14, VP 0.06, S 0.006      [3,4] tanks: NP 0.14, VP 0.03, S 0.003

Length-2 spans (binary rules, then unaries again; e.g. in [0,2] the binary pass gives S → NP VP 0.00126, which the unary S → VP then improves to 0.0105):
  [0,2] "fish people": NP → NP NP 0.0049; VP → V NP 0.105; S → VP 0.0105
  [1,3] "people fish": NP → NP NP 0.0049; VP → V NP 0.007; S → NP VP 0.0189
  [2,4] "fish tanks":  NP → NP NP 0.00196; VP → V NP 0.042; S → VP 0.0042

Length-3 spans:
  [0,3]: NP → NP NP 0.0000686; VP → V NP 0.00147; S → NP VP 0.000882
  [1,4]: NP → NP NP 0.0000686; VP → V NP 0.000098; S → NP VP 0.01323

Full span [0,4]:
  NP → NP NP 0.0000009604; VP → V NP 0.00002058; S → NP VP 0.00018522

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)

Candidate brackets: S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
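A minimal sketch of the bracket-scoring computation, writing each labeled bracket as a (label, start, end) triple:

def parseval(gold, guess):
    gold, guess = set(gold), set(guess)
    correct = len(gold & guess)
    lp = correct / len(guess)
    lr = correct / len(gold)
    f1 = 2 * lp * lr / (lp + lr) if lp + lr else 0.0
    return lp, lr, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
guess = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
         ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print(parseval(gold, guess))   # LP ≈ 42.9%, LR = 37.5%, F1 = 40.0%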

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)

• So the two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

(figure: in a tree with S → NP VP and NP → DT NN, what happens inside the NP is independent of the rest of the tree given the label NP)

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

Expansion   All NPs   NPs under S   NPs under VP
NP PP          11%          9%           23%
DT NN           9%          9%            7%
PRP             6%         21%            4%

Non-Independence II

• Symptoms of overly strong assumptions: rewrites get used where they don't belong

(example: in the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

(figure: as the horizontal Markov order goes from 0 through 1, 2v, 2 to ∞, parsing accuracy stays roughly in the 70–74 F1 range while the number of grammar symbols grows from a few thousand toward about 12,000)

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation); Order 1 vs Order 2

(figure: as the vertical Markov order goes from 1 through 2v, 2, 3v to 3, F1 rises from about 72 toward 79 while the number of symbols grows toward roughly 25,000)

Model    F1     Size
v=h=2v   77.8   7.5K
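A sketch of what these annotations do to symbol names (hypothetical helper functions, just to make the states concrete): vertical markovization appends ancestor labels, and horizontal markovization limits how many already-generated siblings an intermediate binarized symbol remembers.

def vertical_annotate(label, ancestors, v=2):
    """Vertical markovization of order v: annotate with the v-1 closest ancestors.
    `ancestors` lists the parent first, then the grandparent, and so on."""
    return "^".join([label] + ancestors[:v - 1])   # v=2 gives parent annotation, e.g. NP^S

def horizontal_symbol(parent, generated, h=2):
    """Intermediate symbol in a markovized binarization that remembers at most
    the last h siblings already generated under `parent`."""
    remembered = generated[-h:] if h > 0 else []
    return "@" + parent + "->..." + "_".join(remembered)

print(vertical_annotate("NP", ["S", "ROOT"]))           # NP^S
print(horizontal_symbol("VP", ["V", "NP", "PP"], h=2))  # @VP->...NP_PP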

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Solution: mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

• Partial solution: subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs "those")

• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs "very")

• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]

• SPLIT-CC: separate "but" and "&" from other conjunctions

• SPLIT-%: "%" gets its own tag

Annotation   F1     Size
UNARY-DT     80.4   8.1K
UNARY-RB     80.5   8.1K
TAG-PA       81.2   8.5K
SPLIT-AUX    81.6   9.0K
SPLIT-CC     81.7   9.1K
SPLIT-%      81.8   9.3K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

• Examples:
• Possessive NPs
• Finite vs infinite VPs
• Lexical heads

• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites:
• Contains a verb
• Is (non)-recursive
• Base NPs [cf. Collins 99]
• Right-recursive NPs

Annotation     F1     Size
Previous       85.7   10.5K
BASE-NP        86.0   11.7K
DOMINATES-V    86.9   14.1K
RIGHT-REC-NP   87.0   15.2K

(figure: a PP attachment site annotated v or -v depending on whether the intervening material dominates a verb)

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

(diagram: a lexicalized constituent X[h] over span [i, j], split at k into Y[h] over [i, k] and Z[h'] over [k, j]; e.g. (VP → VBD[saw] NP[her]), i.e. (VP → VBD NP)[saw])

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return max of:
      max over k, w, X -> Y Z of score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, w, X -> Y Z of score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]

• Essentially, run the O(n^5) CKY; remember only a few hypotheses for each span ⟨i, j⟩

• If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?); keeps things more or less cubic (see the sketch below)

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)
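A minimal sketch of per-cell beam pruning (the hypothesis representation and K are assumptions for illustration): after a span is filled, keep only its K best entries before larger spans are built.

import heapq

def prune_cell(cell, k=10):
    """cell: dict mapping (category, head) hypotheses to scores.
    Keep only the k highest-scoring hypotheses for this span."""
    if len(cell) <= k:
        return cell
    kept = heapq.nlargest(k, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

# Example: a crowded span is cut down to its 2 best entries.
span_cell = {("NP", "man"): 1e-4, ("NP", "telescope"): 2e-6,
             ("VP", "saw"): 5e-5, ("S", "saw"): 1e-7}
print(prune_cell(span_cell, k=2))   # keeps ('NP', 'man') and ('VP', 'saw')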

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Analysis/Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits

• Manually split categories:
• NP: subject vs object
• DT: determiners vs demonstratives
• IN: sentential vs prepositional

• Advantages:
• Fairly compact grammar
• Linguistic motivations

• Disadvantages:
• Performance leveled out
• Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories (figure: a tree over "He was right" with latent labels X1 … X7 at its nodes)

• Can learn with EM, like Forward-Backward for HMMs: Backward/Inside and Forward/Outside passes

Automatic Annotation Induction

Label all nodes with latent variables; same number k of subcategories for all categories

• Advantages:
• Automatically learned

• Disadvantages:
• Grammar gets too large
• Most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al '05   86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split = (data likelihood with the split reversed) / (data likelihood with the split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al '05       86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al '06          90.2              89.7

Hierarchical Pruning

coarse:         … QP NP VP …

split in two:   … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four:  … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight: … (and so on) …

Parse multiple times with grammars at different levels of granularity
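A sketch of the coarse-to-fine idea (illustrative, with an assumed naming scheme for split symbols): parse with the coarser grammar first, and allow a refined symbol in a cell only if its coarse projection survived a posterior threshold there.

def coarse_projection(symbol):
    """Map a split symbol such as 'NP3' back to its coarse base category 'NP'."""
    return symbol.rstrip("0123456789")

def allowed_refined_symbols(cell_posteriors, refined_symbols, threshold=1e-4):
    """cell_posteriors: coarse-symbol posteriors for one span from the previous pass."""
    return [s for s in refined_symbols
            if cell_posteriors.get(coarse_projection(s), 0.0) >= threshold]

posteriors = {"NP": 0.3, "QP": 1e-6, "VP": 0.05}
print(allowed_refined_symbols(posteriors, ["NP1", "NP2", "QP1", "QP2", "VP1"]))
# ['NP1', 'NP2', 'VP1']  (the QP splits are pruned away for this span)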

Bracket Posteriors

(bracket posterior heat maps at increasing refinement; parse times: 1621 min, 111 min, 35 min, 15 min [91.2 F1, no search error])

Page 26: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Syntactic Ambiguities II

bull Modifier scope within NPsimpractical design requirementsplastic cup holder

bull Multiple gap constructionsThe chicken is ready to eatThe contractors are rich enough to sue

bull Coordination scopeSmall rats and mice can squeeze into holes or cracks in the wall

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 27: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Non-Local Phenomena

bull Dislocation gappingbull Which book should Peter buy

bull A debate arose which continued until the election

bull Bindingbull Reference

bull The IRS audits itself

bull Controlbull I want to go

bull I want you to go

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing: Two problems to solve
1. Repeated work…

Parsing: Two problems to solve
2. Choosing the correct parse

• How do we work out the correct attachment?
• She saw the man with a telescope
• Is the problem 'AI complete'? Yes, but …
• Words are good predictors of attachment
  • Even absent full understanding
  • Moscow sent more than 100,000 soldiers into Afghanistan …
  • Sydney Water breached an agreement with NSW Health …
• Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic – or stochastic – context-free grammars (PCFGs)

• G = (Σ, N, S, R, P)
  • Σ is a set of terminal symbols
  • N is a set of nonterminal symbols
  • S is the start symbol (S ∈ N)
  • R is a set of rules/productions of the form X → γ
  • P is a probability function P: R → [0,1], with ∑_γ P(X → γ) = 1 for every nonterminal X
• A grammar G generates a language model L: the probabilities of everything it generates sum to one,

  ∑_{γ ∈ T} P(γ) = 1, where T is the set of trees generated by G

PCFG Example

A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0

Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

• Probability of a tree t with rules α1 → β1, α2 → β2, …, αn → βn is

  p(t) = ∏_{i=1}^{n} q(αi → βi)

  where q(α → β) is the probability for rule α → β.

44
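This toy grammar is small enough to write down directly. A minimal sketch in Python (the rules and numbers are the ones listed above; the dictionary encoding and the sum-to-one check are just one way to hold them, not anything prescribed by the slides):

# Toy PCFG from the slide: each left-hand side maps to (right-hand side, probability) pairs.
pcfg = {
    "S":  [(("NP", "VP"), 1.0)],
    "VP": [(("Vi",), 0.4), (("Vt", "NP"), 0.4), (("VP", "PP"), 0.2)],
    "NP": [(("DT", "NN"), 0.3), (("NP", "PP"), 0.7)],
    "PP": [(("P", "NP"), 1.0)],
    "Vi": [(("sleeps",), 1.0)],
    "Vt": [(("saw",), 1.0)],
    "NN": [(("man",), 0.7), (("woman",), 0.2), (("telescope",), 0.1)],
    "DT": [(("the",), 1.0)],
    "IN": [(("with",), 0.5), (("in",), 0.5)],
}

# A PCFG is well-formed if the rule probabilities for each nonterminal sum to 1.
for lhs, rules in pcfg.items():
    total = sum(p for _, p in rules)
    assert abs(total - 1.0) < 1e-9, f"{lhs} rules sum to {total}"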

Example of a PCFG

48

Probability of a Parse

(The same PCFG and tree-probability formula as above, now applied to two example sentences.)

The man sleeps

The man saw the woman with the telescope

[Two parse trees under the PCFG above:
t1 = the parse of "The man sleeps": S → NP VP, NP → DT NN, DT → the, NN → man, VP → Vi, Vi → sleeps;
p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0 = 0.084
t2 = the parse of "The man saw the woman with the telescope", with the PP attached to the VP;
p(t2) is likewise the product of the probabilities of all rules used in that tree.]
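To make the product formula concrete, a small sketch that multiplies out the rule probabilities of t1; the rule list below is read off the tree described above, and the representation is purely illustrative:

# Rules used in t1, the parse of "The man sleeps", with their probabilities.
t1_rules = [
    ("S -> NP VP", 1.0),
    ("NP -> DT NN", 0.3),
    ("DT -> the", 1.0),
    ("NN -> man", 0.7),
    ("VP -> Vi", 0.4),
    ("Vi -> sleeps", 1.0),
]

p_t1 = 1.0
for _, q in t1_rules:
    p_t1 *= q
print(p_t1)   # 1.0 * 0.3 * 1.0 * 0.7 * 0.4 * 1.0, approximately 0.084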

PCFGs: Learning and Inference

Model: The probability of a tree t with n rules αi → βi, i = 1..n

Learning: Read the rules off of labeled sentences, use ML estimates for the probabilities, and use all of our standard smoothing tricks

Inference: For input sentence s, define T(s) to be the set of trees whose yield is s (whose leaves, read left to right, match the words in s)
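The learning step ("read the rules off, take ML estimates") is just counting. A sketch, assuming trees are represented as nested (label, children) tuples; that encoding and the helper names are assumptions made for the example:

from collections import Counter, defaultdict

def count_rules(tree, counts):
    # Count every rule label -> (child labels) occurring in one tree.
    label, children = tree
    if isinstance(children, str):                  # preterminal: label -> word
        counts[(label, (children,))] += 1
        return
    counts[(label, tuple(c[0] for c in children))] += 1
    for child in children:
        count_rules(child, counts)

def mle_pcfg(trees):
    # q(alpha -> beta) = count(alpha -> beta) / count(alpha)
    counts = Counter()
    for t in trees:
        count_rules(t, counts)
    lhs_totals = defaultdict(int)
    for (lhs, _), c in counts.items():
        lhs_totals[lhs] += c
    return {rule: c / lhs_totals[rule[0]] for rule, c in counts.items()}

# Tiny example: one tree for "people fish".
tree = ("S", [("NP", [("N", "people")]), ("VP", [("V", "fish")])])
print(mle_pcfg([tree]))     # every rule seen once, so all estimates are 1.0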

Grammar Transforms

51

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w
  • X, Y, Z ∈ N and w ∈ Σ
• A transformation to this form doesn't change the weak generative capacity of a CFG
  • That is, it recognizes the same language
  • But maybe with different trees
• Empties and unaries are removed recursively
• n-ary rules are divided by introducing new nonterminals (n > 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

• You should think of this as a transformation for efficient parsing
• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform
• In practice, full Chomsky Normal Form is a pain
  • Reconstructing n-aries is easy
  • Reconstructing unaries/empties is trickier
• Binarization is crucial for cubic-time CFG parsing (a right-binarization sketch follows below)
  • The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker
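A sketch of the right-binarization step for the n-ary rules, introducing intermediate symbols named after the consumed prefix in the spirit of the VP_V symbol used on these slides (the function name and rule encoding are illustrative):

def binarize(rules):
    # Split each rule X -> Y1 Y2 ... Yn (n > 2) into binary rules,
    # introducing a new intermediate symbol for the remainder.
    binary = []
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            rest = lhs + "_" + rhs[0]            # e.g. VP_V
            binary.append((lhs, (rhs[0], rest)))
            lhs, rhs = rest, rhs[1:]
        binary.append((lhs, tuple(rhs)))
    return binary

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]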

An example before binarization…

[Tree for "people fish tanks with rods": ROOT → S, S → NP VP, NP → N (people), and the ternary VP → V (fish) NP (N tanks) PP (P with, NP → N rods)]

After binarization…

[The same tree, with VP → V VP_V and VP_V → NP PP, so every rule has at most two children]

Treebank empties and unaries

[Five versions of the same tree for "Atone":
PTB Tree: (ROOT (S-HLN (NP-SUBJ (-NONE- e)) (VP (VB Atone))))
NoFuncTags: (ROOT (S (NP (-NONE- e)) (VP (VB Atone))))
NoEmpties: (ROOT (S (VP (VB Atone))))
NoUnaries, cutting high: (ROOT (S Atone))
NoUnaries, cutting low: (ROOT (VB Atone))]

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

[Chart cells for "people fish": people → NP 0.35, V 0.1, N 0.5; fish → VP 0.06, NP 0.14, V 0.6, N 0.2]

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

• Unaries can be incorporated into the algorithm
  • Messy, but doesn't increase algorithmic complexity
• Empties can be incorporated
  • Use fenceposts
  • Doesn't increase complexity; essentially like unaries
• Binarization is vital
  • Without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and rules X -> Y Z of
      q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)
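Evaluated naively, this recursion recomputes the same (X, i, j) subproblems exponentially often; memoizing them is exactly what turns it into the CKY dynamic program below. A Python sketch of the memoized form, where lexical_prob and binary_rules are hypothetical grammar accessors, not a real API:

from functools import lru_cache

def make_parser(lexical_prob, binary_rules, words):
    # lexical_prob(X, word) -> q(X -> word); binary_rules(X) -> list of (Y, Z, q).
    @lru_cache(maxsize=None)
    def best_score(X, i, j):
        if i == j:
            return lexical_prob(X, words[i])
        best = 0.0
        for Y, Z, q in binary_rules(X):
            for k in range(i, j):
                best = max(best, q * best_score(Y, i, k) * best_score(Z, k + 1, j))
        return best
    return best_score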

function CKY(words, grammar) returns [most_probable_parse, prob]

  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]

  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965) … extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true

  return buildTree(score, back)

The CKY algorithm (1960/1965) … extended to unaries

The grammar: Binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score, back) to get the best parse.

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)

Candidate brackets: S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall: 3/8 = 37.5%
LP/LR F1: 40.0%
Tagging Accuracy: 11/11 = 100.0%
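The same numbers can be reproduced with a few lines of set arithmetic. A sketch using (label, start, end) triples, filled in with the gold and candidate brackets of this example:

def labeled_prf1(gold, guess):
    correct = len(gold & guess)
    precision = correct / len(guess)
    recall = correct / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
guess = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
         ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print(labeled_prf1(gold, guess))   # 3 correct brackets: (0.429, 0.375, 0.400)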

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

• Example: John was believed to have been shot by Bill
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)
  • So the two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

• Any information that statistically connects behavior inside and outside a node must flow through that node's label

[Diagram: a tree with an NP node under S; the rules S → NP VP and NP → DT NN illustrate that the material inside the NP is independent of the material outside it, given the label NP]

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects)

[Histograms of NP expansions:
              All NPs   NPs under S   NPs under VP
  NP PP         11%          9%            23%
  DT NN          9%          9%             7%
  PRP            6%         21%             4%]

Non-Independence II

• Symptoms of overly strong assumptions
  • Rewrites get used where they don't belong

[In the PTB, this construction is for possessives]

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal markovization merges states

[Two plots against horizontal Markov order (0, 1, 2v, 2, ∞): parsing F1 (roughly 70–74) and number of grammar symbols (roughly 0–12,000)]
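One way to see what gets merged: when binarizing a rule, the intermediate symbol records only the last h siblings generated so far instead of the full history. A sketch, with an illustrative symbol-naming scheme:

def binarize_markov(lhs, rhs, h=1):
    # Right-binarize X -> Y1 ... Yn, keeping only the last h generated
    # siblings in the intermediate symbol (horizontal Markov order h).
    rules, prev = [], []
    current, rhs = lhs, list(rhs)
    while len(rhs) > 2:
        head = rhs.pop(0)
        prev = (prev + [head])[-h:] if h > 0 else []
        nxt = lhs + "|" + "_".join(prev)         # e.g. VP|V, VP|NP
        rules.append((current, (head, nxt)))
        current = nxt
    rules.append((current, tuple(rhs)))
    return rules

print(binarize_markov("VP", ("V", "NP", "NP", "PP"), h=1))
# [('VP', ('V', 'VP|V')), ('VP|V', ('NP', 'VP|NP')), ('VP|NP', ('NP', 'PP'))]
# With a larger h the second intermediate symbol would be VP|V_NP; lowering h merges such states.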

Vertical Markovization

• Vertical Markov order: rewrites depend on past k ancestor nodes (i.e. parent annotation)

[Example trees for order 1 vs. order 2, plus two plots against vertical Markov order (1, 2v, 2, 3v, 3): parsing F1 (roughly 72–79) and number of grammar symbols (up to about 25,000)]
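Order-2 vertical markovization is just parent annotation. A sketch over (label, children) trees; the tree encoding and the ^ separator are illustrative, and POS tags are left unannotated here (annotating them too is the separate TAG-PA split discussed below):

def parent_annotate(tree, parent="ROOT"):
    # NP under S becomes NP^S, VP under S becomes VP^S, and so on.
    label, children = tree
    if isinstance(children, str):          # preterminal (tag over a word): leave it
        return (label, children)
    return (label + "^" + parent, [parent_annotate(c, label) for c in children])

tree = ("S", [("NP", [("PRP", "He")]),
              ("VP", [("VBD", "was"), ("ADJP", [("JJ", "right")])])])
print(parent_annotate(tree))
# ('S^ROOT', [('NP^S', [('PRP', 'He')]), ('VP^S', [('VBD', 'was'), ('ADJP^VP', [('JJ', 'right')])])])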

Model F1 Size
v = h = 2v 77.8 7.5K

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size
Base 77.8 7.5K
UNARY 78.3 8.0K

Solution: Mark unary rewrite sites with -U

Tag Splits

• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution:
  • Subdivide the IN tag

Annotation F1 Size
Previous 78.3 8.0K
SPLIT-IN 80.3 8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1 Size
80.4 8.1K
80.5 8.1K
81.2 8.5K
81.6 9.0K
81.7 9.1K
81.8 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation F1 Size
tag splits 82.3 9.7K
POSS-NP 83.1 9.8K
SPLIT-VP 85.7 10.5K

Distance Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation F1 Size
Previous 85.7 10.5K
BASE-NP 86.0 11.7K
DOMINATES-V 86.9 14.1K
RIGHT-REC-NP 87.0 15.2K

[Diagram: a PP attaching under VP vs. under a lower NP, with the intervening nodes marked v / -v for whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser LP LR F1
Magerman 95 84.9 84.6 84.7
Collins 96 86.3 85.8 86.0
Klein & Manning 03 86.9 85.7 86.3
Charniak 97 87.4 87.5 87.4
Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: X[h] over span (i, j) is built from Y[h] over (i, k) and Z[h′] over (k, j); e.g. (VP → VBD[saw] NP[her]) yields (VP → VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return max of
      max over k, rules X -> Y Z, and head words w:
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, rules X -> Y Z, and head words w:
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j
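A sketch of the per-cell beam itself: once a span's cell is filled, keep only its K highest-scoring entries. The chart-cell layout is illustrative; the Collins parser's real pruning also uses probability thresholds and the punctuation constraints mentioned above:

import heapq

def prune_cell(cell, K=10):
    # cell maps chart entries for one span, e.g. (label, head), to Viterbi scores.
    if len(cell) <= K:
        return cell
    return dict(heapq.nlargest(K, cell.items(), key=lambda kv: kv[1]))

cell = {("NP", "man"): 0.02, ("NP", "telescope"): 0.001, ("S", "saw"): 0.03}
print(prune_cell(cell, K=2))   # keeps the two highest-scoring entries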

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1
Magerman 95 84.9 84.6 84.7
Collins 96 86.3 85.8 86.0
Klein & Manning 03 86.9 85.7 86.3
Charniak 97 87.4 87.5 87.4
Collins 99 88.7 88.6 88.6

Analysis/Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to improve statistical fit of the grammar
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent Annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Tree over "He was right" with latent subcategory variables X1 … X7 at each node]

Can learn with EM, like Forward-Backward for HMMs (Forward/Backward ↔ Outside/Inside)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1
Klein & Manning '03 86.3
Matsuzaki et al. '05 86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories
  • start with two (per non-terminal), then keep splitting
  • initialize each EM run with the output of the last
(a sketch of the resulting split-merge loop follows below)

DT
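The split-merge loop can be sketched at a high level; run_em, split_symbols, and merge_least_useful below are hypothetical helpers standing in for the EM and merging steps described on these slides, not a real API:

def split_merge_training(grammar, treebank, rounds=6, merge_fraction=0.5):
    # Hierarchical refinement in the style of Petrov et al. 06: split every
    # subcategory in two, re-estimate with EM (initialized from the previous
    # grammar), then roll back the least useful half of the splits.
    for _ in range(rounds):
        grammar = split_symbols(grammar, factor=2)                       # hypothetical
        grammar = run_em(grammar, treebank)                              # hypothetical
        grammar = merge_least_useful(grammar, treebank, merge_fraction)  # hypothetical
        grammar = run_em(grammar, treebank)
    return grammar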

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:

  loss = (data likelihood with split reversed) / (data likelihood with split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model F1
Previous 88.4
With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser F1 (≤ 40 words) F1 (all words)
Klein & Manning '03 86.3 85.7
Matsuzaki et al. '05 86.7 86.1
Collins '99 88.6 88.2
Charniak & Johnson '05 90.1 89.6
Petrov et al. '06 90.2 89.7

Hierarchical Pruning

coarse: … QP NP VP …
split in two: … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four: … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight: … … …

Parse multiple times with grammars at different levels of granularity (a coarse-to-fine sketch follows below)
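A sketch of that coarse-to-fine loop: each pass keeps only the chart items whose projection back to the previous, coarser grammar had enough posterior mass. parse_chart, posterior, and project are hypothetical helpers for the inside-outside computations:

def coarse_to_fine_parse(sentence, grammars, threshold=1e-4):
    # grammars: increasingly refined grammars, coarsest first.
    allowed, chart = None, None
    for grammar in grammars:
        chart = parse_chart(sentence, grammar, allowed=allowed)     # hypothetical
        allowed = {(i, j, project(label))                           # coarse projection
                   for (i, j, label) in chart
                   if posterior(chart, (i, j, label)) > threshold}
    return chart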

Bracket Posteriors

[Figures: bracket posterior plots under increasingly refined grammars; parsing times 1621 min, 111 min, 35 min, and 15 min, the last at 91.2 F1 (no search error)]

Page 28: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

32

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 29: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Fragment of a Noun Phrase Grammar

33

Extended Grammar with Prepositional Phrases

34

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A]  = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A]  = B
            added = true
  return buildTree(score, back)

The CKY algorithm (19601965)hellip extended to unaries
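For readers who want to run the algorithm, here is a compact Python sketch of the same procedure; the dictionary-based grammar encoding and the names cky, lexicon, unary and binary are assumptions made for this example rather than part of the original pseudocode:

from collections import defaultdict

def cky(words, lexicon, unary, binary):
    # lexicon[(A, word)] = P(A -> word); unary[(A, B)] = P(A -> B);
    # binary[(A, B, C)] = P(A -> B C).  score[(i, j, A)] is the best probability
    # for nonterminal A over the words between fenceposts i and j.
    n = len(words)
    score = defaultdict(float)
    back = {}

    def apply_unaries(i, j):
        added = True
        while added:                              # unary closure, as in the pseudocode
            added = False
            for (A, B), q in unary.items():
                prob = q * score[(i, j, B)]
                if prob > score[(i, j, A)]:
                    score[(i, j, A)] = prob
                    back[(i, j, A)] = ("unary", B)
                    added = True

    for i, w in enumerate(words):                 # lexical rules on the diagonal
        for (A, word), q in lexicon.items():
            if word == w and q > score[(i, i + 1, A)]:
                score[(i, i + 1, A)] = q
                back[(i, i + 1, A)] = ("word", w)
        apply_unaries(i, i + 1)

    for span in range(2, n + 1):                  # longer spans, shortest first
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), q in binary.items():
                    prob = score[(begin, split, B)] * score[(split, end, C)] * q
                    if prob > score[(begin, end, A)]:
                        score[(begin, end, A)] = prob
                        back[(begin, end, A)] = ("binary", split, B, C)
            apply_unaries(begin, end)
    return score, back

With the grammar below encoded in these dictionaries, cky("fish people fish tanks".split(), lexicon, unary, binary) reproduces the chart values shown in the walkthrough that follows.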

The grammar (binary, no epsilons):

S → NP VP      0.9        N → people   0.5
S → VP         0.1        N → fish     0.2
VP → V NP      0.5        N → tanks    0.2
VP → V         0.1        N → rods     0.1
VP → V VP_V    0.3        V → people   0.1
VP → V PP      0.1        V → fish     0.6
VP_V → NP PP   1.0        V → tanks    0.3
NP → NP NP     0.1        P → with     1.0
NP → NP PP     0.2
NP → N         0.7
PP → P NP      1.0

The chart for "fish people fish tanks" uses fenceposts 0–4; cell score[i][j] records, for every nonterminal, the best probability found over the span from fencepost i to fencepost j.

First the lexical loop fills the diagonal cells score[i][i+1] with P(A → words[i]), and the unary closure then adds the NP → N, VP → V and S → VP entries:

score[0][1] (fish):    N → fish 0.2,   V → fish 0.6,   NP → N 0.14,  VP → V 0.06,  S → VP 0.006
score[1][2] (people):  N → people 0.5, V → people 0.1, NP → N 0.35,  VP → V 0.01,  S → VP 0.001
score[2][3] (fish):    N → fish 0.2,   V → fish 0.6,   NP → N 0.14,  VP → V 0.06,  S → VP 0.006
score[3][4] (tanks):   N → tanks 0.2,  V → tanks 0.3,  NP → N 0.14,  VP → V 0.03,  S → VP 0.003

Then the binary loop (prob = score[begin][split][B] * score[split][end][C] * P(A → B C)), again followed by the unary closure, fills the longer spans, shortest first:

score[0][2]: NP → NP NP 0.0049,       VP → V NP 0.105,      S → VP 0.0105
score[1][3]: NP → NP NP 0.0049,       VP → V NP 0.007,      S → NP VP 0.0189
score[2][4]: NP → NP NP 0.00196,      VP → V NP 0.042,      S → VP 0.0042
score[0][3]: NP → NP NP 0.0000686,    VP → V NP 0.00147,    S → NP VP 0.000882
score[1][4]: NP → NP NP 0.0000686,    VP → V NP 0.000098,   S → NP VP 0.01323
score[0][4]: NP → NP NP 0.0000009604, VP → V NP 0.00002058, S → NP VP 0.00018522

Call buildTree(score back) to get the best parse
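A sketch of buildTree, following the back pointers produced by the Python CKY sketch above (the tuple-based back-pointer encoding is an assumption made there, not something fixed by the pseudocode):

def build_tree(back, i, j, A):
    entry = back[(i, j, A)]
    if entry[0] == "word":                 # preterminal over a single word
        return (A, entry[1])
    if entry[0] == "unary":                # unary chain A -> B
        return (A, build_tree(back, i, j, entry[1]))
    _, split, B, C = entry                 # binary rule A -> B C, split at 'split'
    return (A, build_tree(back, i, split, B), build_tree(back, split, j, C))

On the chart above, build_tree(back, 0, 4, "S") returns the parse (S (NP (NP (N fish)) (NP (N people))) (VP (V fish) (NP (N tanks)))), whose probability 0.9 × 0.0049 × 0.042 = 0.00018522 matches the S entry in score[0][4].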

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)

Candidate brackets: S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
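The same numbers can be reproduced with a few lines of Python (a minimal sketch, not the official evalb scorer; brackets are encoded as (label, start, end) tuples):

def prf(gold, cand):
    matched = len(gold & cand)                    # brackets identical in label and span
    precision = matched / len(cand)
    recall = matched / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}

p, r, f = prf(gold, cand)
print(f"LP={p:.1%} LR={r:.1%} F1={f:.1%}")        # LP=42.9% LR=37.5% F1=40.0%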

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Gives a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
  • (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

• Example: John was believed to have been shot by Bill
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)
  • The two analyses receive the same probability

94

PCFGs and Independence

• The symbols in a PCFG define independence assumptions
• At any node, the material inside that node is independent of the material outside that node, given the label of that node
• Any information that statistically connects behavior inside and outside a node must flow through that node's label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects):

                 NP PP   DT NN   PRP
  All NPs         11%      9%     6%
  NPs under S      9%      9%    21%
  NPs under VP    23%      7%     4%

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong (in the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[Plots: parsing F1 (roughly 70–74) and grammar size in symbols (up to about 12,000) as a function of horizontal Markov order 0, 1, 2v, 2, ∞]

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation); a small annotation sketch follows after the result table below

[Figures: example trees for order 1 vs. order 2; plots of parsing F1 (roughly 72–79) and grammar size in symbols (up to about 25,000) as a function of vertical Markov order 1, 2v, 2, 3v, 3]

Model     F1     Size
v=h=2v    77.8   7.5K
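As a concrete illustration of parent annotation (vertical Markov order 2), here is a small Python sketch over tuple-encoded trees; the tuple encoding and the function name are assumptions made for this example:

# Order-2 vertical Markovization: every nonterminal is annotated with its
# parent's label, so NP under S becomes NP^S and NP under VP becomes NP^VP.
# Trees are (label, child, child, ...) tuples, with plain strings as words.
def parent_annotate(tree, parent=None):
    if isinstance(tree, str):              # a word: terminals stay unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent is not None else label
    return (new_label, *(parent_annotate(c, label) for c in children))

t = ("S", ("NP", ("N", "people")), ("VP", ("V", "fish"), ("NP", ("N", "tanks"))))
print(parent_annotate(t))
# ('S', ('NP^S', ('N^NP', 'people')), ('VP^S', ('V^VP', 'fish'), ('NP^VP', ('N^NP', 'tanks'))))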

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used
• Solution: mark unary rewrite sites with -U

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits

• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution:
  • Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

Annotation   F1     Size
UNARY-DT     80.4   8.1K
UNARY-RB     80.5   8.1K
TAG-PA       81.2   8.5K
SPLIT-AUX    81.6   9.0K
SPLIT-CC     81.7   9.1K
SPLIT-%      81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

[Figure: a VP with NP and PP attachment sites marked v / -v according to whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Figure: a lexicalized binary rule X[h] → Y[h] Z[h′] over the span i … h … k … h′ … j, where the parent takes its head h from one child; e.g. VP → VBD[saw] NP[her], i.e. the lexicalized rule (VP → VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max of
      max over k and rules X -> Y Z of
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k and rules X -> Y Z of
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

(where w ranges over possible head words of the non-head child)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]; see the sketch below
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i,j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Figure: a lexicalized span X[h] → Y[h] Z[h′] over i … h … k … h′ … j]
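As a tiny illustration of the per-cell beam idea, the helper below is hypothetical (it is not code from the Collins parser); the cell is assumed to map lexicalized labels to their best scores for one span:

def prune_cell(cell, K=10):
    # keep only the K highest-scoring hypotheses for this span
    best = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:K]
    return dict(best)

cell = {"NP[tanks]": 0.042, "VP[fish]": 0.105, "S[fish]": 0.0105, "NP[fish]": 0.00196}
print(prune_cell(cell, K=2))   # {'VP[fish]': 0.105, 'NP[tanks]': 0.042}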

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: a tree over "He was right" with latent subcategory variables X1 … X7 at its nodes; Forward/Outside and Backward/Inside quantities play the roles they do for HMMs]

Can learn with EM, like Forward-Backward for HMMs

Automatic Annotation Induction

• Label all nodes with latent variables, with the same number k of subcategories for all categories
• Advantages:
  • Automatically learned
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories. Start with two subcategories per non-terminal, then keep splitting; initialize each EM run with the output of the last.

DT

Adaptive Splitting

Want to split complex categories more.

Idea: split everything, then roll back the splits which were least useful. [Petrov et al. 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split = (data likelihood with the split reversed) / (data likelihood with the split)
• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al. '05     86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al. '06        90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … … … … …

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)


Page 31: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Verbs Verb Phrases and Sentences

35

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
  • At any node, the material inside that node is independent of the material outside that node, given the label of that node
  • Any information that statistically connects behavior inside and outside a node must flow through that node's label

[Figure: tree fragment with S → NP VP and NP → DT NN, illustrating that the NP's internal expansion is conditioned only on the label NP]

Non-Independence I

• The independence assumptions of a PCFG are often too strong

• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects)

[Bar chart: relative frequencies of the NP expansions NP PP, DT NN, and PRP, computed over all NPs, NPs under S, and NPs under VP — the three distributions differ sharply, which a single NP symbol cannot capture]

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
  • (In the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[Charts: parsing F1 (y-axis roughly 70–74) and number of grammar symbols (y-axis 0–12,000) as a function of horizontal Markov order (0, 1, 2v, 2, inf) — accuracy peaks at intermediate orders while the symbol count grows steadily with the order]
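As a concrete illustration (the symbol-naming scheme below is an assumption, not the lecture's), here is a minimal sketch of binarizing an n-ary rule with horizontal Markovization of order h, so that intermediate symbols remember only the last h sisters already generated.

# Minimal sketch: right-branching binarization with order-h horizontal memory.
def binarize(parent, children, h=1):
    """Return a list of (lhs, (rhs1, rhs2)) binary rules for parent -> children."""
    if len(children) <= 2:
        return [(parent, tuple(children))]
    rules, left, prev = [], parent, children[0]
    for i in range(1, len(children) - 1):
        memory = children[max(0, i - h):i]          # last h sisters generated
        new_sym = "@%s->_%s" % (parent, "_".join(memory))
        rules.append((left, (prev, new_sym)))
        left, prev = new_sym, children[i]
    rules.append((left, (prev, children[-1])))
    return rules

# Example: VP -> V NP PP PP with h = 1
for lhs, rhs in binarize("VP", ["V", "NP", "PP", "PP"], h=1):
    print(lhs, "->", " ".join(rhs))
# VP -> V @VP->_V
# @VP->_V -> NP @VP->_NP
# @VP->_NP -> PP PP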

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation)

[Figure: the same tree shown at vertical order 1 (plain labels) and order 2 (parent-annotated labels)]

[Charts: parsing F1 (y-axis roughly 72–79) and number of grammar symbols (y-axis 0–25,000) as a function of vertical Markov order (1, 2v, 2, 3v, 3) — accuracy improves with parent annotation, but the symbol count grows quickly]

Model     F1    Size
v=h=2v    77.8  7.5K
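A minimal sketch of order-2 vertical Markovization (parent annotation) applied to a treebank tree before rules are read off; the tree encoding and function name are illustrative assumptions, not the lecture's code.

# Minimal sketch: annotate every internal nonterminal with its parent's label.
def parent_annotate(tree, parent="ROOT"):
    """tree = [label, child1, child2, ...]; a leaf child is a plain string."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return [label, children[0]]          # keep POS tags unsplit here
    new_label = "%s^%s" % (label, parent)
    return [new_label] + [parent_annotate(c, label) for c in children]

t = ["S", ["NP", ["DT", "the"], ["NN", "man"]], ["VP", ["VBD", "slept"]]]
print(parent_annotate(t))
# ['S^ROOT', ['NP^S', ['DT', 'the'], ['NN', 'man']], ['VP^S', ['VBD', 'slept']]]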

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

• Solution: mark unary rewrite sites with -U

Annotation   F1    Size
Base         77.8  7.5K
UNARY        78.3  8.0K

Tag Splits

• Problem: treebank tags are too coarse

• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN

• Partial solution:
  • Subdivide the IN tag

Annotation   F1    Size
Previous     78.3  8.0K
SPLIT-IN     80.3  8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1 / Size after each of the above splits in turn: 80.4 / 8.1K, 80.5 / 8.1K, 81.2 / 8.5K, 81.6 / 9.0K, 81.7 / 9.1K, 81.8 / 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield

• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads

• Solution: annotate future elements into nodes

Annotation    F1    Size
tag splits    82.3  9.7K
POSS-NP       83.1  9.8K
SPLIT-VP      85.7  10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation     F1    Size
Previous       85.7  10.5K
BASE-NP        86.0  11.7K
DOMINATES-V    86.9  14.1K
RIGHT-REC-NP   87.0  15.2K

[Figure: attachment-ambiguity sketch with NP, VP, PP nodes marked v / -v according to whether the site dominates a verb]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser                 LP    LR    F1
Magerman 95            84.9  84.6  84.7
Collins 96             86.3  85.8  86.0
Klein & Manning 03     86.9  85.7  86.3
Charniak 97            87.4  87.5  87.4
Collins 99             88.7  88.6  88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
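The head-rule tables on these slides did not survive the export; purely as an illustration, here is a minimal sketch of Collins-style head finding with simplified priority lists (the lists and names below are assumptions, not the exact rules from the lecture).

# Minimal sketch: head percolation with per-category priority lists.
HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NNPS", "NP", "JJ"]),
    "VP": ("left",  ["VBD", "VBZ", "VBP", "VB", "VBN", "VBG", "MD", "VP"]),
}

def find_head(label, children):
    """children: list of child labels; returns the index of the head child."""
    direction, priorities = HEAD_RULES.get(label, ("left", []))
    order = list(range(len(children)))
    if direction == "right":
        order.reverse()                 # scan right-to-left for nominal heads
    for tag in priorities:
        for i in order:
            if children[i] == tag:
                return i
    return order[0]                     # default: first child in scan order

print(find_head("VP", ["VBD", "NP", "PP"]))   # 0  (the verb heads the VP)
print(find_head("NP", ["DT", "JJ", "NN"]))    # 2  (the noun heads the NP)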

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Figure: a lexicalized binary rule X[h] → Y[h] Z[h'] spanning i..j with split point k; the head word h comes from the left child and h' is the head of the right child]

Example: (VP → VBD[saw] NP[her]) is an instance of the lexicalized rule (VP → VBD NP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return the max, over split points k and rules X -> Y Z, of:
      score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)   // head from the left child
      score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)   // head from the right child

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99] (a minimal sketch follows below)
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i,j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Figure: lexicalized rule X[h] → Y[h] Z[h'] over span i..j with split point k]
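A minimal sketch of what "per-cell beams" means in practice (illustrative names and scores, not the Collins parser's actual code): each chart cell keeps only the K highest-scoring hypotheses for its span.

# Minimal sketch of per-cell beam pruning.
import heapq

def prune_cell(hypotheses, k=5):
    """hypotheses: dict mapping (label, head) -> log-probability.
    Returns a dict containing only the K best entries for this span."""
    best = heapq.nlargest(k, hypotheses.items(), key=lambda kv: kv[1])
    return dict(best)

# Toy usage: a cell with more hypotheses than the beam allows.
cell = {("NP", "man"): -2.1, ("NP", "the"): -7.3, ("S", "saw"): -9.0,
        ("VP", "saw"): -3.4, ("NP", "telescope"): -5.8, ("PP", "with"): -6.6}
print(prune_cell(cell, k=3))
# keeps only the three highest-scoring (label, head) pairs for the span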

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser                 LP    LR    F1
Magerman 95            84.9  84.6  84.7
Collins 96             86.3  85.8  86.0
Klein & Manning 03     86.9  85.7  86.3
Charniak 97            87.4  87.5  87.4
Collins 99             88.7  88.6  88.6

Analysis / Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional

• Advantages:
  • Fairly compact grammar
  • Linguistic motivations

• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: a bracketed tree over "He was right" whose nodes carry latent subcategory variables X1 ... X7]

Can learn with EM, like Forward-Backward for HMMs: the Forward/Outside and Backward/Inside passes play the same roles
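For reference, these are the standard Inside/Outside recursions such an EM would use for a binarized grammar with latent subcategories A_x; this is a minimal textbook statement under that assumption, not notation copied from the slides.

% Inside ("Backward") and Outside ("Forward") scores for a binarized PCFG
% with latent symbols A_x, over fenceposts 0..n of the sentence w_1..w_n.
\begin{aligned}
\beta(A_x, i-1, i) &= q(A_x \rightarrow w_i) \\
\beta(A_x, i, j)   &= \sum_{A_x \rightarrow B_y\, C_z}\ \sum_{k=i+1}^{j-1}
    q(A_x \rightarrow B_y C_z)\,\beta(B_y, i, k)\,\beta(C_z, k, j) \\
\alpha(\mathrm{ROOT}, 0, n) &= 1 \\
\alpha(B_y, i, k)  &= \sum_{A_x \rightarrow B_y\, C_z}\ \sum_{j=k+1}^{n}
    q(A_x \rightarrow B_y C_z)\,\alpha(A_x, i, j)\,\beta(C_z, k, j)
    \;+\; \sum_{A_x \rightarrow C_z\, B_y}\ \sum_{m=0}^{i-1}
    q(A_x \rightarrow C_z B_y)\,\alpha(A_x, m, k)\,\beta(C_z, m, i)
\end{aligned}

The expected rule counts needed by EM are then proportional to products of an outside score, a rule probability, and the inside scores of the two children, normalized by the sentence probability β(ROOT, 0, n).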

Automatic Annotation Induction

• Advantages:
  • Automatically learned: label all nodes with latent variables, same number k of subcategories for all categories

• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

[Figure: the DT tag split into subcategories DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement: repeatedly learn more fine-grained subcategories
• start with two (per non-terminal), then keep splitting
• initialize each EM run with the output of the last

Adaptive Splitting

• Want to split complex categories more

• Idea: split everything, then roll back the splits which were least useful [Petrov et al 06]

• Evaluate the loss in likelihood from removing each split:

    loss = (data likelihood with the split reversed) / (data likelihood with the split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                      F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03         86.3              85.7
Matsuzaki et al. '05        86.7              86.1
Collins '99                 88.6              88.2
Charniak & Johnson '05      90.1              89.6
Petrov et al. 06            90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … …

Parse multiple times with grammars at different levels of granularity (see the sketch below)
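A minimal sketch of the pruning idea (illustrative names and thresholds, not the Berkeley parser's implementation): a refined symbol is allowed over a span only if its coarse projection has sufficient posterior mass there.

# Minimal sketch of coarse-to-fine pruning between refinement levels.
def allowed_spans(posteriors_coarse, projection, threshold=1e-4):
    """posteriors_coarse: dict (i, j, coarse_label) -> posterior probability.
    projection: dict mapping a refined label (e.g. 'NP-3') to its coarse
    label (e.g. 'NP'). Returns the set of (i, j, refined_label) to keep."""
    keep = set()
    for refined, coarse in projection.items():
        for (i, j, label), p in posteriors_coarse.items():
            if label == coarse and p >= threshold:
                keep.add((i, j, refined))
    return keep

# Toy usage with hypothetical numbers: NP over span (0, 2) survives, VP does not.
post = {(0, 2, "NP"): 0.63, (0, 2, "VP"): 2e-7}
proj = {"NP-1": "NP", "NP-2": "NP", "VP-1": "VP", "VP-2": "VP"}
print(sorted(allowed_spans(post, proj)))
# [(0, 2, 'NP-1'), (0, 2, 'NP-2')]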

Bracket Posteriors

[Figure: bracket posterior heatmaps at successive refinement levels]

Parsing times at successive levels: 1621 min, 111 min, 35 min, 15 min [91.2 F1] (no search error)

Page 32: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

PPs Modifying Verb Phrases

36

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 33: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Complementizers and SBARs

37

More Verbs

38

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 35: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Coordination

39

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 36: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Much more remainshellip

40

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Worked example: filling the CKY chart for "fish people fish tanks"

(the original slides step through the chart cell by cell; the Viterbi scores below are the cell contents after each pass)

Span 1 (lexical rules, then unaries):
(0,1) fish:   N 0.2, V 0.6, NP 0.14, VP 0.06, S 0.006
(1,2) people: N 0.5, V 0.1, NP 0.35, VP 0.01, S 0.001
(2,3) fish:   N 0.2, V 0.6, NP 0.14, VP 0.06, S 0.006
(3,4) tanks:  N 0.2, V 0.3, NP 0.14, VP 0.03, S 0.003

Span 2 (binary rules, then unaries):
(0,2): NP → NP NP 0.0049, VP → V NP 0.105, S → VP 0.0105
(1,3): NP → NP NP 0.0049, VP → V NP 0.007, S → NP VP 0.0189
(2,4): NP → NP NP 0.00196, VP → V NP 0.042, S → VP 0.0042

Span 3:
(0,3): NP → NP NP 0.0000686, VP → V NP 0.00147, S → NP VP 0.000882
(1,4): NP → NP NP 0.0000686, VP → V NP 0.000098, S → NP VP 0.01323

Span 4:
(0,4): NP → NP NP 0.0000009604, VP → V NP 0.00002058, S → NP VP 0.00018522

Call buildTree(score back) to get the best parse
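The toy grammar is small enough to run this end to end. Below is a minimal Python sketch of the probabilistic CKY pass with unary handling over the grammar above (scores only; backpointers and buildTree are omitted). It is an illustration of the algorithm on these slides, not code from any particular parser; run on "fish people fish tanks" it reproduces the top-cell score S = 0.00018522 from the chart above.

from collections import defaultdict

# Toy grammar from the slides: binary rules, unary rules, lexical rules.
binary = {('S','NP','VP'): 0.9, ('VP','V','NP'): 0.5, ('VP','V','VP_V'): 0.3,
          ('VP','V','PP'): 0.1, ('VP_V','NP','PP'): 1.0, ('NP','NP','NP'): 0.1,
          ('NP','NP','PP'): 0.2, ('PP','P','NP'): 1.0}
unary = {('S','VP'): 0.1, ('VP','V'): 0.1, ('NP','N'): 0.7}
lexical = {('N','people'): 0.5, ('N','fish'): 0.2, ('N','tanks'): 0.2, ('N','rods'): 0.1,
           ('V','people'): 0.1, ('V','fish'): 0.6, ('V','tanks'): 0.3, ('P','with'): 1.0}

def cky(words):
    n = len(words)
    score = defaultdict(float)            # (begin, end, A) -> best probability

    def apply_unaries(b, e):
        added = True
        while added:                      # keep closing the cell under unary rules
            added = False
            for (A, B), p in unary.items():
                prob = p * score[b, e, B]
                if prob > score[b, e, A]:
                    score[b, e, A] = prob
                    added = True

    for i, w in enumerate(words):         # span-1 cells: lexical rules, then unaries
        for (A, word), p in lexical.items():
            if word == w:
                score[i, i+1, A] = p
        apply_unaries(i, i+1)

    for span in range(2, n + 1):          # longer spans: binary rules, then unaries
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), p in binary.items():
                    prob = score[begin, split, B] * score[split, end, C] * p
                    if prob > score[begin, end, A]:
                        score[begin, end, A] = prob
            apply_unaries(begin, end)
    return score

chart = cky("fish people fish tanks".split())
print(chart[0, 4, 'S'])                   # 0.00018522 (up to floating-point rounding)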

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%

Labeled Recall: 3/8 = 37.5%

LP/LR F1: 40.0%

Tagging Accuracy: 11/11 = 100.0%
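A small Python sketch of this computation, with brackets represented as (label, start, end) triples (an assumed representation, not from the slides):

def bracket_scores(gold, cand):
    # gold, cand: sets of (label, start, end) tuples
    matched = len(gold & cand)
    precision = matched / len(cand)
    recall = matched / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = {('S',0,11), ('NP',0,2), ('VP',2,9), ('VP',3,9), ('NP',4,6), ('PP',6,9), ('NP',7,9), ('NP',9,10)}
cand = {('S',0,11), ('NP',0,2), ('VP',2,10), ('VP',3,10), ('NP',4,6), ('PP',6,10), ('NP',7,10)}
print(bracket_scores(gold, cand))   # approximately (0.429, 0.375, 0.400)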

How good are PCFGs

bull Penn WSJ parsing accuracy: about 73% LP/LR F1

bull Robust

bull Usually admit everything, but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good, because the independence assumptions are too strong

bull Give a probabilistic language model

bull But, in the simple case, it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)
bull The two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

(figure: a tree with S → NP VP and NP → DT NN; the material inside the circled NP subtree is independent of the material outside it, given the label NP)

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

Expansion     All NPs   NPs under S   NPs under VP
NP → NP PP    11%       9%            23%
NP → DT NN    9%        9%            7%
NP → PRP      6%        21%           4%

Non-Independence II

bull Symptoms of overly strong assumptions:
bull Rewrites get used where they don't belong

(figure: in the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization merges states

(figures: F1 roughly 70–74 and grammar size roughly 3,000–12,000 symbols as the horizontal Markov order varies over 0, 1, 2v, 2, ∞)

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 vs Order 2

(figures: F1 roughly 72–79 and grammar size up to about 25,000 symbols as the vertical Markov order varies over 1, 2v, 2, 3v, 3)

Model F1 Size
v=h=2v 77.8 7.5K
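Vertical markovization is just a relabeling of the training trees before rules are read off. A minimal Python sketch of order-2 annotation (each phrasal label marked with its parent) on nested-tuple trees; this is an illustration of the idea, not the exact transform used by any cited parser:

def parent_annotate(tree, parent='ROOT'):
    """Return a copy of the tree with each phrasal label rewritten as label^parent."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return (label, children[0])            # leave preterminal/word nodes alone
    new_label = label + '^' + parent
    return (new_label,) + tuple(parent_annotate(c, label) for c in children)

tree = ('S', ('NP', ('PRP', 'He')), ('VP', ('VBD', 'was'), ('ADJP', ('JJ', 'right'))))
print(parent_annotate(tree))
# ('S^ROOT', ('NP^S', ('PRP', 'He')), ('VP^S', ('VBD', 'was'), ('ADJP^VP', ('JJ', 'right'))))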

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size
Base 77.8 7.5K
UNARY 78.3 8.0K

Solution: Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solution:
bull Subdivide the IN tag

Annotation F1 Size
Previous 78.3 8.0K
SPLIT-IN 80.3 8.1K

Other Tag Splits

bull UNARY-DT: mark demonstratives as DT^U ("the X" vs "those")

bull UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs "very")

bull TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

bull SPLIT-AUX: mark auxiliary verbs with -AUX [cf Charniak 97]

bull SPLIT-CC: separate "but" and "&" from other conjunctions

bull SPLIT-%: "%" gets its own tag

Annotation F1 Size (one row per split above, applied cumulatively)
UNARY-DT 80.4 8.1K
UNARY-RB 80.5 8.1K
TAG-PA 81.2 8.5K
SPLIT-AUX 81.6 9.0K
SPLIT-CC 81.7 9.1K
SPLIT-% 81.8 9.3K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examples:
bull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size
tag splits 82.3 9.7K
POSS-NP 83.1 9.8K
SPLIT-VP 85.7 10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution: mark a property of higher or lower sites:
bull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size
Previous 85.7 10.5K
BASE-NP 86.0 11.7K
DOMINATES-V 86.9 14.1K
RIGHT-REC-NP 87.0 15.2K

(figure: a tree fragment NP–VP–PP–NP in which nodes dominating a verb are marked ^v and nodes not dominating a verb are marked ^-v)

A Fully Annotated Tree

Final Test Set Results

bull Beats "first generation" lexicalized parsers

Parser LP LR F1
Magerman 95 84.9 84.6 84.7
Collins 96 86.3 85.8 86.0
Klein & Manning 03 86.9 85.7 86.3
Charniak 97 87.4 87.5 87.4
Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

(figure: a lexicalized binary rule X[h] over span i..j, built from Y[h] over i..k and Z[h'] over k..j)

(VP -> VBD[saw] NP[her])
(VP -> VBD NP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return max over k, w, X -> Y Z of
      max( score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w),
           score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h) )

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]
bull Essentially run the O(n⁵) CKY

bull Remember only a few hypotheses for each span <i,j>

bull If we keep K hypotheses at each span, then we do at most O(nK²) work per span (why?)

bull Keeps things more or less cubic

bull Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

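A minimal sketch of the per-cell beam, assuming each chart cell is a dict from (category, head index) to Viterbi score; the helper name and representation are illustrative, not Collins's actual implementation:

def prune_cell(cell, beam_size=10):
    """Keep only the top-K (category, head) hypotheses in a chart cell.

    cell: dict mapping (category, head_index) -> score
    """
    top = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:beam_size]
    return dict(top)

# Example: a span whose cell holds four lexicalized hypotheses.
cell = {('NP', 2): 0.03, ('VP', 1): 0.005, ('S', 1): 0.004, ('NP', 3): 0.0001}
print(prune_cell(cell, beam_size=2))   # keeps the two highest-scoring entries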

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Analysis/Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to improve statistical fit of the grammar

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categories:
bull NP: subject vs object

bull DT: determiners vs demonstratives

bull IN: sentential vs prepositional

bull Advantages:
bull Fairly compact grammar

bull Linguistic motivations

bull Disadvantages:
bull Performance leveled out

bull Manually annotated

Learning Latent Annotations

Latent Annotations:

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

(figure: a parse of "He was right" with latent subcategory variables X1 … X7 at each node)

Can learn with EM, like Forward-Backward for HMMs (Forward/Outside and Backward/Inside passes)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1
Klein & Manning '03 86.3
Matsuzaki et al '05 86.7

Refinement of the DT tag

(figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4)

Hierarchical refinement: repeatedly learn more fine-grained subcategories

start with two (per non-terminal), then keep splitting

initialize each EM run with the output of the last

Adaptive Splitting

Want to split complex categories more

Idea: split everything, roll back the splits which were least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate the loss in likelihood from removing each split = (data likelihood with split reversed) / (data likelihood with split)

bull No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model F1
Previous 88.4
With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser F1 (≤ 40 words) F1 (all words)
Klein & Manning '03 86.3 85.7
Matsuzaki et al '05 86.7 86.1
Collins '99 88.6 88.2
Charniak & Johnson '05 90.1 89.6
Petrov et al 06 90.2 89.7

Hierarchical Pruning

coarse: … QP NP VP …

split in two: … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four: … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight: … and so on …

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

(figure: bracket posteriors at each pass of the coarse-to-fine hierarchy; parsing time drops from 1621 min to 111 min to 35 min to 15 min, ending at 91.2 F1 with no search error)


Parsing: Two problems to solve

1. Repeated work…

2. Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachment
bull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (T, N, S, R, P)
bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S ∈ N)

bull R is a set of rules/productions of the form X → γ, where X ∈ N and γ ∈ (N ∪ T)*

bull P is a probability function

bull P: R → [0,1]

bull for every X ∈ N, the probabilities of the rules with X on the left-hand side sum to 1

bull A grammar G generates a language model L:

Σ_{γ ∈ T*} P(γ) = 1
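As a concrete illustration, a PCFG like the one on the next slide can be held in a dictionary and checked against the constraint that each nonterminal's rule probabilities sum to 1. This is only a sketch; note the slides write the phrasal rule as PP → P NP while tagging prepositions IN, so the sketch uses IN throughout:

from collections import defaultdict

# Rules as (LHS, RHS-tuple) -> probability, using the toy grammar on the next slide.
rules = {
    ('S',  ('NP', 'VP')): 1.0,
    ('VP', ('Vi',)): 0.4,      ('VP', ('Vt', 'NP')): 0.4, ('VP', ('VP', 'PP')): 0.2,
    ('NP', ('DT', 'NN')): 0.3, ('NP', ('NP', 'PP')): 0.7,
    ('PP', ('IN', 'NP')): 1.0,
    ('Vi', ('sleeps',)): 1.0,  ('Vt', ('saw',)): 1.0,
    ('NN', ('man',)): 0.7,     ('NN', ('woman',)): 0.2,   ('NN', ('telescope',)): 0.1,
    ('DT', ('the',)): 1.0,
    ('IN', ('with',)): 0.5,    ('IN', ('in',)): 0.5,
}

totals = defaultdict(float)
for (lhs, rhs), p in rules.items():
    totals[lhs] += p
for lhs, total in totals.items():
    assert abs(total - 1.0) < 1e-9, lhs   # each nonterminal's rule probabilities sum to 1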

PCFG Example: A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0
Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

bull Probability of a tree t with rules α1 → β1, α2 → β2, …, αn → βn is

p(t) = ∏_{i=1}^{n} q(αi → βi)

where q(α → β) is the probability for rule α → β

44

Example of a PCFG

48

Probability of a Parse

(the same PCFG and tree probability formula as above)


t1 = The man sleeps
(S (NP (DT the) (NN man)) (VP (Vi sleeps)))
p(t1) = 1.0 × 0.3 × 1.0 × 0.7 × 0.4 × 1.0 = 0.084

t2 = The man saw the woman with the telescope
(S (NP (DT the) (NN man)) (VP (VP (Vt saw) (NP (DT the) (NN woman))) (PP (IN with) (NP (DT the) (NN telescope)))))
p(t2) = 1.0 × 0.3 × 1.0 × 0.7 × 0.2 × 0.4 × 1.0 × 0.3 × 1.0 × 0.2 × 1.0 × 0.5 × 0.3 × 1.0 × 0.1 ≈ 1.5 × 10⁻⁵
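A small sketch of this product over a tree represented as nested tuples (a hypothetical representation, not from the slides); the rules dictionary here is just the subset needed for t1:

rules = {
    ('S', ('NP', 'VP')): 1.0, ('NP', ('DT', 'NN')): 0.3, ('VP', ('Vi',)): 0.4,
    ('DT', ('the',)): 1.0, ('NN', ('man',)): 0.7, ('Vi', ('sleeps',)): 1.0,
}

def tree_prob(tree):
    """tree: (label, child, child, ...) with terminal children given as strings."""
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rules[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)
    return p

t1 = ('S', ('NP', ('DT', 'the'), ('NN', 'man')), ('VP', ('Vi', 'sleeps')))
print(tree_prob(t1))   # approximately 0.084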

PCFGs Learning and Inference

Model: The probability of a tree t with n rules αi → βi, i = 1..n, is p(t) = ∏_{i=1}^{n} q(αi → βi)

Learning: Read the rules off of labeled sentences, use ML estimates for the probabilities (see the sketch below), and use all of our standard smoothing tricks

Inference: For an input sentence s, define T(s) to be the set of trees whose yield is s (whose leaves, read left to right, match the words in s)
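A sketch of the maximum-likelihood estimate q(α → β) = count(α → β) / count(α), read off a tiny treebank of nested-tuple trees; no smoothing, and the helper names are illustrative:

from collections import Counter

def ml_rule_probs(treebank):
    rule_counts, lhs_counts = Counter(), Counter()
    def visit(tree):
        label, children = tree[0], tree[1:]
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_counts[(label, rhs)] += 1
        lhs_counts[label] += 1
        for c in children:
            if not isinstance(c, str):
                visit(c)
    for t in treebank:
        visit(t)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Two tiny trees; q(VP -> Vi) = 1/2, q(VP -> Vt NP) = 1/2, q(S -> NP VP) = 1.
treebank = [
    ('S', ('NP', ('N', 'people')), ('VP', ('Vi', 'sleep'))),
    ('S', ('NP', ('N', 'people')), ('VP', ('Vt', 'fish'), ('NP', ('N', 'tanks')))),
]
print(ml_rule_probs(treebank)[('VP', ('Vi',))])   # 0.5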

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X → Y Z or X → w
bull X, Y, Z ∈ N and w ∈ Σ

bull A transformation to this form doesn't change the weak generative capacity of a CFG
bull That is, it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n > 2), as in the sketch below
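A minimal sketch of the n-ary binarization step alone (empty and unary removal omitted), naming intermediate symbols in the VP_V style used on these slides:

def binarize(rule):
    """Split an n-ary rule (lhs, rhs_tuple) into binary rules, left to right.

    Intermediate symbols are named like VP_V, mirroring the slides.
    """
    lhs, rhs = rule
    new_rules = []
    current_lhs = lhs
    while len(rhs) > 2:
        new_sym = current_lhs + '_' + rhs[0]   # e.g. VP and V give VP_V
        new_rules.append((current_lhs, (rhs[0], new_sym)))
        current_lhs, rhs = new_sym, rhs[1:]
    new_rules.append((current_lhs, rhs))
    return new_rules

print(binarize(('VP', ('V', 'NP', 'PP'))))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]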

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice, full Chomsky Normal Form is a pain
bull Reconstructing n-aries is easy

bull Reconstructing unaries/empties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 38: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Parsing Two problems to solve1 Repeated workhellip

Parsing Two problems to solve2 Choosing the correct parse

bull How do we work out the correct attachment

bull She saw the man with a telescope

bull Is the problem lsquoAI completersquo Yes but hellip

bull Words are good predictors of attachmentbull Even absent full understanding

bull Moscow sent more than 100000 soldiers into Afghanistan hellip

bull Sydney Water breached an agreement with NSW Health hellip

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial solution: subdivide the IN tag

Annotation F1 Size
Previous 78.3 80K
SPLIT-IN 80.3 81K

Other Tag Splits

bull UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
bull UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
bull TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
bull SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
bull SPLIT-CC: separate "but" and "&" from other conjunctions
bull SPLIT-"": "" gets its own tag

F1 Size
80.4 81K
80.5 81K
81.2 85K
81.6 90K
81.7 91K
81.8 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examples:
bull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size
tag splits 82.3 97K
POSS-NP 83.1 98K
SPLIT-VP 85.7 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution: mark a property of higher or lower sites:
bull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size
Previous 85.7 105K
BASE-NP 86.0 117K
DOMINATES-V 86.9 141K
RIGHT-REC-NP 87.0 152K

[Figure: tree fragment with NP, VP, PP, NP nodes marked v / -v for the DOMINATES-V annotation]

A Fully Annotated Tree

Final Test Set Results

bull Beats "first generation" lexicalized parsers

Parser LP LR F1
Magerman 95 84.9 84.6 84.7
Collins 96 86.3 85.8 86.0
Klein & Manning 03 86.9 85.7 86.3
Charniak 97 87.4 87.5 87.4
Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
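The head-rule tables for NPs and VPs on these two slides did not survive the text extraction. As a stand-in, here is a simplified, hypothetical sketch of the general idea behind such rules (Collins-style head finding): for each parent category, scan the children in a fixed direction for the first child whose tag appears in a priority list. The specific lists below are illustrative, not the ones from the slides.

HEAD_RULES = {
    # parent: (scan direction, priority list of child labels) - illustrative only
    "NP": ("right-to-left", ["NN", "NNS", "NNP", "NNPS", "NP", "JJ"]),
    "VP": ("left-to-right", ["VBD", "VBZ", "VBP", "VB", "VBN", "VBG", "VP"]),
}

def find_head(parent, children):
    direction, priorities = HEAD_RULES[parent]
    order = children if direction == "left-to-right" else list(reversed(children))
    for label in priorities:
        for child in order:
            if child == label:
                return children.index(child)
    return len(children) - 1          # fallback: last child

print(find_head("VP", ["VBD", "NP", "PP"]))  # 0 -> the VBD heads the VP
print(find_head("NP", ["DT", "NN"]))         # 1 -> the NN heads the NP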

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max of:
      max over k, w, rules X -> Y Z:
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, w, rules X -> Y Z:
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]
bull Essentially, run the O(n^5) lexicalized CKY
bull Remember only a few hypotheses for each span <i,j>
bull If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
bull Keeps things more or less cubic
bull Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)
(A sketch of per-cell beam pruning follows.)

[Figure: lexicalized chart item X[h] built from Y[h] and Z[h'] over i..k..j]
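A rough sketch (not from the slides) of the per-cell beam idea. It assumes a chart represented as a dict mapping each span (i, j) to a dict of {(category, head): score}; only the K highest-scoring hypotheses per span are kept before spans are combined.

def prune_cell(cell, K=10):
    # keep only the K best (category, head) hypotheses in one chart cell
    best = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:K]
    return dict(best)

def prune_chart(cells, K=10):
    return {span: prune_cell(cell, K) for span, cell in cells.items()}

# With at most K entries per child span, combining two spans costs O(K^2),
# so the overall work stays more or less cubic in the sentence length.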

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1
Magerman 95 84.9 84.6 84.7
Collins 96 86.3 85.8 86.0
Klein & Manning 03 86.9 85.7 86.3
Charniak 97 87.4 87.5 87.4
Collins 99 88.7 88.6 88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to improve statistical fit of the grammar:

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categories:
bull NP: subject vs. object
bull DT: determiners vs. demonstratives
bull IN: sentential vs. prepositional

bull Advantages:
bull Fairly compact grammar
bull Linguistic motivations

bull Disadvantages:
bull Performance leveled out
bull Manually annotated

Learning Latent Annotations

Latent Annotations:
bull Brackets are known
bull Base categories are known
bull Hidden variables for subcategories

[Figure: parse tree over "He was right" with latent subcategory variables X1 ... X7 at its nodes]

Can learn with EM, like Forward-Backward for HMMs (Forward/Outside, Backward/Inside)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1
Klein & Manning '03 86.3
Matsuzaki et al. '05 86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more.

Idea: split everything, then roll back the splits which were least useful.

[Petrov et al. 06]

Adaptive Splitting

bull Evaluate the loss in likelihood from removing each split:
  loss = (data likelihood with the split reversed) / (data likelihood with the split)

bull No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model F1
Previous 88.4
With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser F1 (<= 40 words) F1 (all words)
Klein & Manning '03 86.3 85.7
Matsuzaki et al. '05 86.7 86.1
Collins '99 88.6 88.2
Charniak & Johnson '05 90.1 89.6
Petrov et al. 06 90.2 89.7

Hierarchical Pruning

coarse: ... QP NP VP ...

split in two: ... QP1 QP2 NP1 NP2 VP1 VP2 ...

split in four: ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...

split in eight: ... (and so on) ...

Parse multiple times with grammars at different levels of granularity
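A small runnable sketch (not from the slides) of how the coarser pass prunes the finer one: symbols whose posterior in a span falls below a threshold are dropped, and only refinements of the surviving symbols are allowed at the next level. The posterior numbers here are made up, standing in for what an inside-outside pass with the coarse grammar would produce.

coarse_posteriors = {                 # span -> {coarse symbol: posterior}
    (0, 2): {"NP": 0.96, "QP": 0.001, "VP": 0.02},
    (2, 4): {"VP": 0.90, "NP": 0.09, "QP": 0.0005},
}
refinements = {"NP": ["NP1", "NP2"], "VP": ["VP1", "VP2"], "QP": ["QP1", "QP2"]}

def allowed_fine_symbols(posteriors, threshold=0.01):
    allowed = {}
    for span, syms in posteriors.items():
        kept = [s for s, p in syms.items() if p > threshold]
        allowed[span] = [f for s in kept for f in refinements[s]]
    return allowed

print(allowed_fine_symbols(coarse_posteriors))
# {(0, 2): ['NP1', 'NP2', 'VP1', 'VP2'], (2, 4): ['VP1', 'VP2', 'NP1', 'NP2']}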

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)


Parsing: Two problems to solve
2. Choosing the correct parse

bull How do we work out the correct attachment?

bull She saw the man with a telescope

bull Is the problem 'AI complete'? Yes, but ...

bull Words are good predictors of attachment
bull Even absent full understanding

bull Moscow sent more than 100,000 soldiers into Afghanistan ...

bull Sydney Water breached an agreement with NSW Health ...

bull Our statistical parsers will try to exploit such statistics

Probabilistic Context Free Grammar

45

Probabilistic - or stochastic - context-free grammars (PCFGs)

bull G = (T, N, S, R, P)
bull T is a set of terminal symbols
bull N is a set of nonterminal symbols
bull S is the start symbol (S ∈ N)
bull R is a set of rules/productions of the form X → γ
bull P is a probability function
bull P: R → [0,1]
bull for every X ∈ N: Σ_{X→γ ∈ R} P(X → γ) = 1

bull A grammar G generates a language model L:
  Σ_{γ ∈ T*} P(γ) = 1
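A minimal sketch (not part of the slides) of this definition in code: a PCFG stored as a rule-to-probability map, with a check that the probabilities of all rules sharing a left-hand side sum to 1. Only the phrasal rules of the example grammar below are included, for brevity.

from collections import defaultdict

rules = {
    ("S",  ("NP", "VP")): 1.0,
    ("VP", ("Vi",)): 0.4, ("VP", ("Vt", "NP")): 0.4, ("VP", ("VP", "PP")): 0.2,
    ("NP", ("DT", "NN")): 0.3, ("NP", ("NP", "PP")): 0.7,
    ("PP", ("P", "NP")): 1.0,
}

totals = defaultdict(float)
for (lhs, rhs), p in rules.items():
    totals[lhs] += p
assert all(abs(t - 1.0) < 1e-9 for t in totals.values())   # each LHS sums to 1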

PCFG Example: A Probabilistic Context-Free Grammar (PCFG)

S ⇒ NP VP 1.0
VP ⇒ Vi 0.4
VP ⇒ Vt NP 0.4
VP ⇒ VP PP 0.2
NP ⇒ DT NN 0.3
NP ⇒ NP PP 0.7
PP ⇒ P NP 1.0

Vi ⇒ sleeps 1.0
Vt ⇒ saw 1.0
NN ⇒ man 0.7
NN ⇒ woman 0.2
NN ⇒ telescope 0.1
DT ⇒ the 1.0
IN ⇒ with 0.5
IN ⇒ in 0.5

bull Probability of a tree t with rules α1 → β1, α2 → β2, ..., αn → βn is
  p(t) = ∏_{i=1..n} q(αi → βi)
  where q(α → β) is the probability for rule α → β

44

Example of a PCFG

48

Probability of a Parse (using the same PCFG as above)


The man sleeps
The man saw the woman with the telescope

[Figure: parse tree t1 for "The man sleeps" (S → NP VP, NP → DT NN, VP → Vi) and parse tree t2 for "The man saw the woman with the telescope", with the PP attached inside the VP]

p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0
p(t2) = the product of the probabilities of all rules used in t2, computed the same way
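A small sketch (not from the slides) computing p(t1) for "The man sleeps" as the product of its rule probabilities, under the PCFG above:

t1_rules = [
    ("S",  ("NP", "VP"), 1.0),
    ("NP", ("DT", "NN"), 0.3),
    ("DT", ("the",),     1.0),
    ("NN", ("man",),     0.7),
    ("VP", ("Vi",),      0.4),
    ("Vi", ("sleeps",),  1.0),
]

p_t1 = 1.0
for lhs, rhs, q in t1_rules:
    p_t1 *= q
print(p_t1)   # 1.0 * 0.3 * 1.0 * 0.7 * 0.4 * 1.0 = 0.084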

PCFGs: Learning and Inference

bull Model: the probability of a tree t with n rules αi → βi, i = 1..n, is p(t) = ∏_i q(αi → βi)

bull Learning: read the rules off of labeled sentences, use ML estimates for the probabilities,
and use all of our standard smoothing tricks

bull Inference: for input sentence s, define T(s) to be the set of trees whose yield is s
(whose leaves, read left to right, match the words in s)
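A sketch (not from the slides) of the "read the rules off labeled trees" step: maximum-likelihood rule probabilities are rule counts normalized by the count of the left-hand side. Trees are assumed to be nested lists [label, child1, ...] with string leaves; the two-tree treebank is made up for illustration.

from collections import Counter

def rule_counts(tree, counts):
    label, children = tree[0], tree[1:]
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    counts[(label, rhs)] += 1
    for c in children:
        if not isinstance(c, str):
            rule_counts(c, counts)

def ml_estimates(treebank):
    counts = Counter()
    for t in treebank:
        rule_counts(t, counts)
    lhs_totals = Counter()
    for (lhs, rhs), c in counts.items():
        lhs_totals[lhs] += c
    return {(lhs, rhs): c / lhs_totals[lhs] for (lhs, rhs), c in counts.items()}

toy = [["S", ["NP", "people"], ["VP", ["V", "fish"]]],
       ["S", ["NP", "people"], ["VP", ["V", "fish"], ["NP", "tanks"]]]]
print(ml_estimates(toy)[("VP", ("V",))])   # 0.5: one of the two observed VP expansions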

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X → Y Z or X → w
bull X, Y, Z ∈ N and w ∈ T

bull A transformation to this form doesn't change the weak generative capacity of a CFG
bull That is, it recognizes the same language
bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n > 2)
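A short sketch (not from the slides) of the n-ary binarization step: a rule X → A B C ... is replaced by a chain of binary rules through new intermediate symbols. The @-prefixed naming is just for illustration; the slides write the intermediate symbol for VP → V NP PP as VP_V.

def binarize(lhs, rhs):
    # turn one n-ary rule into a list of binary rules
    rules = []
    while len(rhs) > 2:
        new_sym = "@%s_%s" % (lhs, rhs[0])
        rules.append((lhs, (rhs[0], new_sym)))
        lhs, rhs = new_sym, rhs[1:]
    rules.append((lhs, tuple(rhs)))
    return rules

print(binarize("VP", ("V", "NP", "PP")))
# [('VP', ('V', '@VP_V')), ('@VP_V', ('NP', 'PP'))]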

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice, full Chomsky Normal Form is a pain
bull Reconstructing n-aries is easy

bull Reconstructing unaries/empties is trickier

bull Binarization is crucial for cubic-time CFG parsing

bull The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

[Figure: the tree for "Atone" under successive transformations: PTB Tree (with the -NONE- empty subject and functional tags S-HLN, NP-SUBJ), NoFuncTags, NoEmpties, NoUnaries, with unary removal done either high or low]

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people: NP 0.35, V 0.1, N 0.5
fish: VP 0.06, NP 0.14, V 0.6, N 0.2

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0

Extended CKY parsing

bull Unaries can be incorporated into the algorithm
bull Messy, but doesn't increase algorithmic complexity

bull Empties can be incorporated
bull Use fenceposts

bull Doesn't increase complexity; essentially like unaries

bull Binarization is vital
bull Without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and rules X -> Y Z of
      q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[#(words)+1][#(words)+1][#(nonterms)]

back = new Pair[#(words)+1][#(words)+1][#(nonterms)]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammar: binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0
N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

(A runnable Python sketch of CKY with unaries on this grammar follows.)
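A compact Python sketch of the same CKY-with-unaries procedure, run on the toy grammar above for "fish people fish tanks". This is not the lecture's Java-style pseudocode, just an equivalent sketch; backpointers are stored but tree reconstruction (buildTree) is omitted.

binary = {("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5, ("VP", "V", "VP_V"): 0.3,
          ("VP", "V", "PP"): 0.1, ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0}
unary = {("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7}
lexicon = {("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
           ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3, ("P", "with"): 1.0}

def cky(words):
    n = len(words)
    score = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    back = [[{} for _ in range(n + 1)] for _ in range(n + 1)]

    def apply_unaries(i, j):
        added = True
        while added:                      # handle unary chains
            added = False
            for (A, B), p in unary.items():
                if B in score[i][j] and p * score[i][j][B] > score[i][j].get(A, 0.0):
                    score[i][j][A] = p * score[i][j][B]
                    back[i][j][A] = B
                    added = True

    for i, w in enumerate(words):         # lexical rules, then unaries
        for (A, word), p in lexicon.items():
            if word == w:
                score[i][i + 1][A] = p
        apply_unaries(i, i + 1)

    for span in range(2, n + 1):          # binary rules over increasing span widths
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), p in binary.items():
                    if B in score[begin][split] and C in score[split][end]:
                        prob = p * score[begin][split][B] * score[split][end][C]
                        if prob > score[begin][end].get(A, 0.0):
                            score[begin][end][A] = prob
                            back[begin][end][A] = (split, B, C)
            apply_unaries(begin, end)
    return score, back

score, back = cky("fish people fish tanks".split())
print(score[0][4]["S"])   # about 0.00018522, matching the S entry in the top chart cell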

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse


Page 40: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Probabilistic Context Free Grammar

45

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 41: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Probabilistic ndash or stochastic ndash context-free grammars (PCFGs)

bull G = (Σ N S R P)bull T is a set of terminal symbols

bull N is a set of nonterminal symbols

bull S is the start symbol (S isin N)

bull R is a set of rulesproductions of the form X

bull P is a probability function

bull P R [01]

bull

bull A grammar G generates a language model L

P(g) =1g IcircT

aring

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with


Chomsky Normal Form

• You should think of this as a transformation for efficient parsing.
• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform.
• In practice, full Chomsky Normal Form is a pain:
  • Reconstructing n-aries is easy.
  • Reconstructing unaries/empties is trickier.
• Binarization is crucial for cubic-time CFG parsing (a small sketch of the binarization step follows below).
• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker.
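Here is a minimal sketch of the binarization step, under the assumption that rules are plain (lhs, rhs-tuple) pairs; the naming scheme for the new intermediate symbols is illustrative, loosely following the VP_V style used in the slides.

def binarize(rules):
    """Split n-ary rules (n > 2) into binary ones by introducing new
    nonterminals, e.g. VP -> V NP PP becomes VP -> V VP_V, VP_V -> NP PP.
    Rules are (lhs, rhs) pairs with rhs a tuple of symbols."""
    out = []
    for lhs, rhs in rules:
        while len(rhs) > 2:
            new_sym = lhs + "_" + rhs[0]      # e.g. "VP_V"; naming is illustrative
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]
        out.append((lhs, rhs))
    return out

print(binarize([("VP", ("V", "NP", "PP"))]))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]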

An example before binarization…

[ROOT [S [NP [N people]]
         [VP [V fish] [NP [N tanks]] [PP [P with] [NP [N rods]]]]]]

After binarization…

[ROOT [S [NP [N people]]
         [VP [V fish] [VP_V [NP [N tanks]] [PP [P with] [NP [N rods]]]]]]]

Treebank empties and unaries

PTB Tree:          [ROOT [S-HLN [NP-SUBJ [-NONE- e]] [VP [VB Atone]]]]
NoFuncTags:        [ROOT [S [NP [-NONE- e]] [VP [VB Atone]]]]
NoEmpties:         [ROOT [S [VP [VB Atone]]]]
NoUnaries (high):  [ROOT [S Atone]]
NoUnaries (low):   [ROOT [VB Atone]]

Parsing


Constituency Parsing

Sentence: fish people fish tanks

PCFG (rule → probability parameter θi):
S → NP VP    θ0
NP → NP NP   θ1
…
N → fish     θ42
N → people   θ43
V → fish     θ44
…

Candidate parse: [S [NP [N fish] [N people]] [VP [V fish] [NP [N tanks]]]]

Cocke-Kasami-Younger (CKY) Constituency Parsing

Sentence: fish people fish tanks

Viterbi (max) scores, e.g. for the words people and fish:
people: NP 0.35, V 0.1, N 0.5
fish:   VP 0.06, NP 0.14, V 0.6, N 0.2

Grammar:
S → NP VP     0.9
S → VP        0.1
VP → V NP     0.5
VP → V        0.1
VP → V VP_V   0.3
VP → V PP     0.1
VP_V → NP PP  1.0
NP → NP NP    0.1
NP → NP PP    0.2
NP → N        0.7
PP → P NP     1.0

Extended CKY parsing

• Unaries can be incorporated into the algorithm.
  • Messy, but doesn't increase algorithmic complexity.
• Empties can be incorporated.
  • Use fenceposts; doesn't increase complexity, essentially like unaries.
• Binarization is vital.
  • Without binarization you don't get parsing that is cubic in the length of the sentence and in the number of nonterminals in the grammar.
  • Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there.

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X -> s[i])
  else
    return max over k and rules X -> Y Z of
           q(X -> Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)

function CKY(words, grammar) returns [most_probable_parse, prob]
  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back  = new Pair[#(words)+1][#(words)+1][#(nonterms)]
  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965) … extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true
  return buildTree(score, back)

The CKY algorithm (1960/1965) … extended to unaries
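As a cross-check of the pseudocode, here is a compact runnable sketch in Python. The dictionary-based grammar encoding and the helper names are illustrative assumptions, not part of the slides; it fills the Viterbi chart for a binarized PCFG with unary rules.

from collections import defaultdict

def cky(words, lexicon, binary, unary):
    """Viterbi CKY for a PCFG in CNF plus unary rules.
    lexicon: {(A, word): prob}, binary: {(A, B, C): prob}, unary: {(A, B): prob}.
    Returns (score, back) charts keyed by (begin, end, label)."""
    n = len(words)
    score = defaultdict(float)   # (begin, end, A) -> best probability
    back = {}                    # (begin, end, A) -> backpointer

    def apply_unaries(begin, end):
        added = True
        while added:
            added = False
            for (A, B), p in unary.items():
                prob = p * score[(begin, end, B)]
                if prob > score[(begin, end, A)]:
                    score[(begin, end, A)] = prob
                    back[(begin, end, A)] = B
                    added = True

    for i, w in enumerate(words):                      # lexical step
        for (A, word), p in lexicon.items():
            if word == w:
                score[(i, i + 1, A)] = p
        apply_unaries(i, i + 1)

    for span in range(2, n + 1):                       # binary step
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), p in binary.items():
                    prob = score[(begin, split, B)] * score[(split, end, C)] * p
                    if prob > score[(begin, end, A)]:
                        score[(begin, end, A)] = prob
                        back[(begin, end, A)] = (split, B, C)
            apply_unaries(begin, end)
    return score, back

With the grammar and lexicon of the next slide, score[(0, 4, "S")] should come out to about 0.00018522, matching the worked chart below.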

The grammar: binary, no epsilons

S → NP VP     0.9
S → VP        0.1
VP → V NP     0.5
VP → V        0.1
VP → V VP_V   0.3
VP → V PP     0.1
VP_V → NP PP  1.0
NP → NP NP    0.1
NP → NP PP    0.2
NP → N        0.7
PP → P NP     1.0

N → people    0.5
N → fish      0.2
N → tanks     0.2
N → rods      0.1
V → people    0.1
V → fish      0.6
V → tanks     0.3
P → with      1.0

Chart cells score[begin][end] for the sentence fish people fish tanks (word positions 0-3, fenceposts 0-4). The lexical step and unary closure fill the diagonal; larger spans are then built with the binary rules, applying the unary closure after each cell:

score[0][1] (fish):    N 0.2, V 0.6, NP 0.14, VP 0.06, S 0.006
score[1][2] (people):  N 0.5, V 0.1, NP 0.35, VP 0.01, S 0.001
score[2][3] (fish):    N 0.2, V 0.6, NP 0.14, VP 0.06, S 0.006
score[3][4] (tanks):   N 0.2, V 0.3, NP 0.14, VP 0.03, S 0.003

score[0][2]: NP → NP NP 0.0049, VP → V NP 0.105, S → VP 0.0105
score[1][3]: NP → NP NP 0.0049, VP → V NP 0.007, S → NP VP 0.0189
score[2][4]: NP → NP NP 0.00196, VP → V NP 0.042, S → VP 0.0042

score[0][3]: NP → NP NP 0.0000686, VP → V NP 0.00147, S → NP VP 0.000882
score[1][4]: NP → NP NP 0.0000686, VP → V NP 0.0000098, S → NP VP 0.01323

score[0][4]: NP → NP NP 0.0000009604, VP → V NP 0.00002058, S → NP VP 0.00018522

Call buildTree(score, back) to get the best parse.
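A sketch of the buildTree step, reconstructing the best tree from the back pointers produced by the CKY sketch above (same illustrative chart encoding; the root symbol "S" is an assumption of the example call).

def build_tree(back, words, begin, end, label):
    """Follow back pointers to recover the highest-scoring tree for
    (begin, end, label) as a nested tuple."""
    bp = back.get((begin, end, label))
    if bp is None:                       # lexical cell: label -> word
        return (label, words[begin])
    if isinstance(bp, str):              # unary rule: label -> bp
        return (label, build_tree(back, words, begin, end, bp))
    split, B, C = bp                     # binary rule: label -> B C
    return (label,
            build_tree(back, words, begin, split, B),
            build_tree(back, words, split, end, C))

# e.g. build_tree(back, words, 0, len(words), "S")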

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)
Candidate brackets:     S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall:    3/8 = 37.5%
LP/LR F1:          40.0%
Tagging Accuracy:  11/11 = 100.0%
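A minimal sketch of how these numbers are computed from labeled spans. The multiset counting and the exact bracket format are illustrative assumptions; the real evalb tool has more conventions (e.g. for punctuation).

from collections import Counter

def prf1(gold_brackets, cand_brackets):
    """gold/cand are lists of (label, start, end) spans."""
    gold, cand = Counter(gold_brackets), Counter(cand_brackets)
    matched = sum((gold & cand).values())          # multiset intersection
    precision = matched / sum(cand.values())
    recall = matched / sum(gold.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)]
cand = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)]
print(prf1(gold, cand))   # (0.428..., 0.375, 0.400)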

How good are PCFGs?

• Penn WSJ parsing accuracy: about 73% LP/LR F1.
• Robust: usually admit everything, but with low probability.
• Partial solution for grammar ambiguity: a PCFG gives some idea of the plausibility of a parse.
  • But not so good, because the independence assumptions are too strong.
• Give a probabilistic language model.
  • But in the simple case it performs worse than a trigram model.
• The problem seems to be that PCFGs lack the lexicalization of a trigram model.

Weaknesses of PCFGs

• Lack of sensitivity to structural frequencies.
• Lack of sensitivity to lexical information.
  • (A word is independent of the rest of the tree given its POS.)

A Case of PP Attachment Ambiguity

A Case of Coordination Ambiguity

Structural Preferences: Close Attachment

• Example: John was believed to have been shot by Bill.
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing), so the two analyses receive the same probability.

PCFGs and Independence

• The symbols in a PCFG define independence assumptions.
• At any node, the material inside that node is independent of the material outside that node, given the label of that node.
• Any information that statistically connects behavior inside and outside a node must flow through that node's label.

(Example: in a tree with S → NP VP and NP → DT NN, everything below the NP is independent of the rest of the tree given the label NP.)

Non-Independence I

• The independence assumptions of a PCFG are often too strong.
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects):

Expansion     All NPs   NPs under S   NPs under VP
NP → NP PP    11%       9%            23%
NP → DT NN    9%        9%            7%
NP → PRP      6%        21%           4%

Non-Independence II

• Symptoms of overly strong assumptions: rewrites get used where they don't belong.
• (In the PTB, the construction shown in the accompanying figure is for possessives.)

Advanced Unlexicalized Parsing


Horizontal Markovization

• Horizontal Markovization merges states.
• (Charts: parsing F1 (roughly 70-74) and number of grammar symbols (roughly 0-12,000) as a function of the horizontal Markov order 0, 1, 2v, 2, ∞.)

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes, i.e. parent annotation (Order 1 vs. Order 2; a small sketch of parent annotation follows below).
• (Charts: parsing F1 (roughly 72-79) and number of grammar symbols (roughly 0-25,000) as a function of the vertical Markov order 1, 2v, 2, 3v, 3.)

Model     F1     Size
v=h=2v    77.8   7.5K
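Parent annotation (vertical order 2) just rewrites each nonterminal label to include its parent's label. A minimal sketch on the nested-tuple trees used earlier; the ^ separator (e.g. NP^S) is the usual convention, and leaving preterminals unsplit is a simplifying assumption.

def parent_annotate(tree, parent="ROOT"):
    """Return a copy of the tree with each nonterminal label rewritten
    as label^parent (vertical Markov order 2)."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return (label, children[0])          # leave preterminals unsplit
    new_label = label + "^" + parent
    return (new_label,) + tuple(parent_annotate(c, label) for c in children)

t = ("S", ("NP", ("PRP", "He")), ("VP", ("VBD", "was"), ("ADJP", ("JJ", "right"))))
print(parent_annotate(t))
# ('S^ROOT', ('NP^S', ('PRP', 'He')), ('VP^S', ('VBD', 'was'), ('ADJP^VP', ('JJ', 'right'))))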

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used.
• Solution: mark unary rewrite sites with -U.

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits

• Problem: treebank tags are too coarse.
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN.
• Partial solution: subdivide the IN tag.

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

Annotation                                                          F1     Size
UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")         80.4   8.1K
UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")       80.5   8.1K
TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)    81.2   8.5K
SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]         81.6   9.0K
SPLIT-CC: separate "but" and "&" from other conjunctions            81.7   9.1K
SPLIT-%: "%" gets its own tag                                       81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield.
• Examples: possessive NPs; finite vs. infinite VPs; lexical heads.
• Solution: annotate future elements into nodes.

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights.
• Solution: mark a property of higher or lower sites: contains a verb; is (non)-recursive; base NPs [cf. Collins 99]; right-recursive NPs.
• (Figure: a PP attaching to a VP site marked v vs. an NP site marked -v.)

Annotation     F1     Size
Previous       85.7   10.5K
BASE-NP        86.0   11.7K
DOMINATES-V    86.9   14.1K
RIGHT-REC-NP   87.0   15.2K

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers.

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

Heads in Context-Free Rules

Heads

Rules to Recover Heads: An Example for NPs

Rules to Recover Heads: An Example for VPs

Adding Headwords to Trees

Lexicalized CFGs in Chomsky Normal Form

Example

Lexicalized CKY

(Rule schema: X[h] → Y[h] Z[h'] over a span i … h … k … h' … j, e.g. (VP → VBD NP)[saw] built from VBD[saw] and NP[her].)

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return max of
      max over k, w, X -> Y Z:  score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, w, X -> Y Z:  score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs


Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99].
  • Essentially, run the O(n^5) lexicalized CKY, but remember only a few hypotheses for each span <i, j> (toy sketch below).
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?), which keeps things more or less cubic.
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed).
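A toy sketch of the per-cell beam idea. The chart layout mirrors the CKY sketch above; the beam size and the keying of hypotheses by (label, head index) are illustrative assumptions.

import heapq

def prune_cell(cell, K=10):
    """cell maps (label, head_index) -> score; keep only the K best entries."""
    best = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(best)

# After filling chart[(begin, end)] for a span, replace it with
# prune_cell(chart[(begin, end)]) before building larger spans.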

Parameter Estimation

A Model from Charniak (1997)

Other Details

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Analysis / Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

The Game of Designing a Grammar

• Annotation refines base treebank symbols to improve statistical fit of the grammar:
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages: fairly compact grammar; linguistic motivations.
• Disadvantages: performance leveled out; manually annotated.

Learning Latent Annotations

• Brackets are known.
• Base categories are known.
• Hidden variables for subcategories.
• (Figure: a tree over "He was right" whose nodes carry latent subcategory labels X1 … X7.)
• Can learn with EM, like Forward-Backward for HMMs (forward/outside and backward/inside probabilities).

Automatic Annotation Induction

• Label all nodes with latent variables; same number k of subcategories for all categories.
• Advantages: automatically learned.
• Disadvantages: grammar gets too large; most categories are oversplit while others are undersplit.

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

DT → DT-1, DT-2, DT-3, DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories. Start with two subcategories per non-terminal, then keep splitting; initialize each EM run with the output of the last (a toy sketch of the split step follows).
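A toy sketch of the split step only, for binary rules in the dictionary encoding used in the earlier sketches; the EM re-estimation itself is not shown, and the noise level is an illustrative assumption.

import random

def split_nonterminals(rules, k=2, noise=0.01):
    """Split every nonterminal A into A-0 ... A-(k-1) and distribute each rule's
    probability (almost) uniformly over the split variants, with a little random
    noise to break symmetry before EM. Rules: {(A, (B, C)): prob}."""
    new_rules = {}
    for (A, (B, C)), p in rules.items():
        for a in range(k):
            for b in range(k):
                for c in range(k):
                    q = p / (k * k) * (1 + random.uniform(-noise, noise))
                    new_rules[(f"{A}-{a}", (f"{B}-{b}", f"{C}-{c}"))] = q
    return new_rules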

Adaptive Splitting

• Want to split complex categories more.
• Idea: split everything, then roll back the splits which were least useful. [Petrov et al. 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split: the ratio of the data likelihood with the split reversed to the data likelihood with the split (written out below).
• No loss in accuracy when 50% of the splits are reversed.
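One way to write the criterion just described (the notation is mine, not from the slides): for a candidate split of a category,

  \Delta_{\text{split}} = \frac{P(\text{data} \mid \text{grammar with the split merged back})}{P(\text{data} \mid \text{grammar with the split})}

Splits whose \Delta_{\text{split}} is close to 1 cost almost no likelihood and are the first to be rolled back.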

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al. '05      86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al. 06          90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … … … … … … …

Parse multiple times with grammars at different levels of granularity.

Bracket Posteriors

Parsing time with hierarchical pruning: 1621 min → 111 min → 35 min → 15 min [91.2 F1] (no search error).

Page 42: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

PCFG ExampleA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 43: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Example of a PCFG

48

Probability of a ParseA Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

A Probabilistic Context-Free Grammar (PCFG)

S rArr NP VP 10

VP rArr Vi 04

VP rArr Vt NP 04

VP rArr VP PP 02

NP rArr DT NN 03

NP rArr NP PP 07

PP rArr P NP 10

Vi rArr sleeps 10

Vt rArr saw 10

NN rArr man 07

NN rArr woman 02

NN rArr telescope 01

DT rArr the 10

IN rArr with 05

IN rArr in 05

bull Probability of a tree t with rules

α1 rarr β1α2 rarr β2 αn rarr βn

is

p(t) =n

i = 1

q(α i rarr βi )

where q(α rarr β) is the probability for rule α rarr β

44

The man sleeps

The man saw the woman with the telescope

NNDT Vi

VPNP

NNDT

NP

NNDT

NP

NNDT

NPVt

VP

IN

PP

VP

S

S

t1=

p(t1)=100310070410

10

0403

10 07 10

t2=

p(ts)=180310070204100310020405031001

10

03 03 03

02

04 04

0510

10 10 1007 02 01

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)

Candidate brackets: S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall: 3/8 = 37.5%
LP/LR F1: 40.0%
Tagging Accuracy: 11/11 = 100.0%
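The bracket arithmetic above is easy to reproduce; a small sketch (brackets written as (label, start, end) triples; the helper name is ours, not from the slides):

from collections import Counter

def labeled_prf(gold, cand):
    """Labeled precision/recall/F1 over (label, start, end) brackets, counted as multisets."""
    g, c = Counter(gold), Counter(cand)
    match = sum((g & c).values())
    p = match / sum(c.values())
    r = match / sum(g.values())
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

gold = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)]
cand = [("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)]
print(labeled_prf(gold, cand))   # (0.4285..., 0.375, 0.4) -> 42.9%, 37.5%, 40.0%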

How good are PCFGs

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Give a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
  • (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

• Example: John was believed to have been shot by Bill
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)
  • The two analyses therefore receive the same probability

94

PCFGs and Independence

• The symbols in a PCFG define independence assumptions
• At any node, the material inside that node is independent of the material outside that node, given the label of that node
• Any information that statistically connects behavior inside and outside a node must flow through that node's label

[Diagram: a tree with S → NP VP and NP → DT NN, the NP node circled to mark the inside/outside boundary]

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects)

[Bar chart: relative frequency of the NP expansions NP PP, DT NN, and PRP for all NPs, for NPs under S, and for NPs under VP; pronoun (PRP) expansions are far more common for subject NPs under S than for object NPs under VP]

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
  • (Example from the PTB: a construction reserved for possessives gets reused elsewhere)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[Charts: parsing F1 (roughly 70–74) and number of grammar symbols (0–12,000) as a function of the horizontal Markov order: 0, 1, 2v, 2, ∞]

Vertical Markovization

• Vertical Markov order: rewrites depend on past k ancestor nodes (i.e. parent annotation)

[Charts: example trees for order 1 vs. order 2, plus parsing F1 (roughly 72–79) and number of grammar symbols (0–25,000) as a function of the vertical Markov order: 1, 2v, 2, 3v, 3]

Model     F1    Size
v=h=2v    77.8  7.5K
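Parent annotation (vertical Markov order 2) is a simple tree transform; a sketch on a plain tuple representation of trees, where each nonterminal label X is rewritten as X^parent:

def parent_annotate(tree, parent="ROOT"):
    """Rewrite each nonterminal label X as X^parent (vertical Markov order 2)."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return (label, children[0])                  # leave POS tags / words alone
    new_label = f"{label}^{parent}"
    return (new_label,) + tuple(parent_annotate(c, label) for c in children)

t = ("S", ("NP", ("PRP", "He")), ("VP", ("VBD", "was"), ("ADJP", ("JJ", "right"))))
print(parent_annotate(t))
# ('S^ROOT', ('NP^S', ('PRP', 'He')), ('VP^S', ('VBD', 'was'), ('ADJP^VP', ('JJ', 'right'))))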

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used
• Solution: mark unary rewrite sites with -U

Annotation  F1    Size
Base        77.8  7.5K
UNARY       78.3  8.0K

Tag Splits

• Problem: treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial solution: subdivide the IN tag

Annotation  F1    Size
Previous    78.3  8.0K
SPLIT-IN    80.3  8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")            F1 80.4, Size 8.1K
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")          F1 80.5, Size 8.1K
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)       F1 81.2, Size 8.5K
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]            F1 81.6, Size 9.0K
• SPLIT-CC: separate "but" and "&" from other conjunctions               F1 81.7, Size 9.1K
• SPLIT-%: "%" gets its own tag                                          F1 81.8, Size 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation  F1    Size
tag splits  82.3  9.7K
POSS-NP     83.1  9.8K
SPLIT-VP    85.7  10.5K

Distance Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites:
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation     F1    Size
Previous       85.7  10.5K
BASE-NP        86.0  11.7K
DOMINATES-V    86.9  14.1K
RIGHT-REC-NP   87.0  15.2K

[Diagram: an NP–VP–PP–NP attachment configuration with nodes marked v / -v for whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser               LP    LR    F1
Magerman 95          84.9  84.6  84.7
Collins 96           86.3  85.8  86.0
Klein & Manning 03   86.9  85.7  86.3
Charniak 97          87.4  87.5  87.4
Collins 99           88.7  88.6  88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
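The head-rule tables themselves live on the slides and are not recoverable from this text; the idea, though, is a per-category priority list scanned from one end of the rule. The priority lists below are simplified illustrative stand-ins, not the actual Collins rules:

# Illustrative head finder: scan the children using a per-category priority list.
HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NNPS", "NP", "JJ"]),
    "VP": ("left",  ["VBD", "VBZ", "VBP", "VB", "VBN", "VBG", "VP"]),
    "S":  ("left",  ["VP", "S"]),
    "PP": ("left",  ["IN", "TO"]),
}

def find_head(parent, children):
    """Return the index of the head child of `parent`, given the child labels."""
    direction, priorities = HEAD_RULES.get(parent, ("right", []))
    order = range(len(children)) if direction == "left" else range(len(children) - 1, -1, -1)
    for label in priorities:                  # first try the priority list in order...
        for i in order:
            if children[i] == label:
                return i
    return 0 if direction == "left" else len(children) - 1   # ...else fall back to an edge child

print(find_head("VP", ["VBD", "NP", "PP"]))   # 0 -> VBD heads the VP
print(find_head("NP", ["DT", "NN"]))          # 1 -> NN heads the NP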

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: X[h] over span (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j); e.g. VP → VBD[saw] NP[her] yields (VP → VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X → s[i])
  else
    return max over split points k and rules X → Y Z of
      max( score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w),
           score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h) )

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n^5) CKY
  • Remember only a few hypotheses for each span ⟨i,j⟩
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram: X[h] over (i, j) from Y[h] over (i, k) and Z[h'] over (k, j)]
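A per-cell beam is little more than truncating each span's hypothesis list to its K best entries; a sketch (dict-based cells, not the Collins parser's actual data structures):

import heapq

def prune_cell(cell, K):
    """Keep only the K highest-scoring hypotheses for one span <i,j>.
    `cell` maps (symbol, head) -> score; everything else is dropped."""
    if len(cell) <= K:
        return cell
    kept = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

cell = {("NP", "man"): 1e-4, ("NP", "telescope"): 2e-7, ("S", "saw"): 3e-6}
print(prune_cell(cell, 2))   # keeps the two best hypotheses for this span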

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP    LR    F1
Magerman 95          84.9  84.6  84.7
Collins 96           86.3  85.8  86.0
Klein & Manning 03   86.9  85.7  86.3
Charniak 97          87.4  87.5  87.4
Collins 99           88.7  88.6  88.6

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to improve statistical fit of the grammar:
  Parent annotation [Johnson '98]
  Head lexicalization [Collins '99, Charniak '00]
  Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Diagram: a parse tree over "He was right" whose nodes carry hidden subcategory variables X1 ... X7]

Can learn with EM, like Forward-Backward for HMMs: the forward/backward quantities become outside/inside scores.

Automatic Annotation Induction

• Advantages:
  • Automatically learned: label all nodes with latent variables, with the same number k of subcategories for all categories
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al '05   86.7

Refinement of the DT tag: DT is split into DT-1, DT-2, DT-3, DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories;
start with two (per non-terminal), then keep splitting,
initializing each EM run with the output of the last.

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:
  loss(split) = (data likelihood with the split reversed) / (data likelihood with the split)
• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model             F1
Previous          88.4
With 50% Merging  89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al '05      86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al 06          90.2              89.7

Hierarchical Pruning

coarse:         … QP NP VP …
split in two:   … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:  … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight: … (and so on) …

Parse multiple times with grammars at different levels of granularity.
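One way to sketch the coarse-to-fine idea: after parsing with the coarser grammar, a refined symbol is allowed in a cell only if its coarse projection did well enough there. This is illustrative only; the real system thresholds on bracket posterior probabilities rather than raw scores:

def allowed_refined_symbols(coarse_chart, projection, threshold):
    """coarse_chart[(i, j)] maps coarse symbols to scores from the previous pass;
    projection maps refined symbol -> coarse symbol (e.g. 'NP-3' -> 'NP').
    Returns, per span, the set of refined symbols the finer pass may use."""
    allowed = {}
    for span, scores in coarse_chart.items():
        ok = {c for c, s in scores.items() if s >= threshold}
        allowed[span] = {r for r, c in projection.items() if c in ok}
    return allowed

coarse = {(0, 2): {"NP": 0.02, "VP": 1e-9}}
proj = {"NP-1": "NP", "NP-2": "NP", "VP-1": "VP"}
print(allowed_refined_symbols(coarse, proj, 1e-6))   # {(0, 2): {'NP-1', 'NP-2'}}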

Bracket Posteriors

[Figure: bracket posteriors at successive refinement/pruning stages; parsing time falls from 1621 min to 111 min to 35 min to 15 min, at 91.2 F1 with no search error]


Probability of a Parse: A Probabilistic Context-Free Grammar (PCFG)

S  ⇒ NP VP      1.0
VP ⇒ Vi         0.4
VP ⇒ Vt NP      0.4
VP ⇒ VP PP      0.2
NP ⇒ DT NN      0.3
NP ⇒ NP PP      0.7
PP ⇒ P NP       1.0
Vi ⇒ sleeps     1.0
Vt ⇒ saw        1.0
NN ⇒ man        0.7
NN ⇒ woman      0.2
NN ⇒ telescope  0.1
DT ⇒ the        1.0
IN ⇒ with       0.5
IN ⇒ in         0.5

• The probability of a tree t built with rules α1 → β1, α2 → β2, ..., αn → βn is
  p(t) = q(α1 → β1) · q(α2 → β2) · ... · q(αn → βn)   (the product over the n rules used),
  where q(α → β) is the probability for rule α → β.


Two example trees:

t1: "The man sleeps"
    (S (NP (DT The) (NN man)) (VP (Vi sleeps)))
    p(t1) = 1.0 · 0.3 · 1.0 · 0.7 · 0.4 · 1.0 = 0.084

t2: "The man saw the woman with the telescope"
    (S (NP (DT The) (NN man))
       (VP (VP (Vt saw) (NP (DT the) (NN woman)))
           (PP (IN with) (NP (DT the) (NN telescope)))))
    p(t2) = 1.0 · 0.3 · 1.0 · 0.7 · 0.2 · 0.4 · 1.0 · 0.3 · 1.0 · 0.2 · 1.0 · 0.5 · 0.3 · 1.0 · 0.1
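The arithmetic for p(t1) is just the product of the probabilities of the rules used; as a quick check:

from math import prod

# Rules used in t1 = (S (NP (DT The) (NN man)) (VP (Vi sleeps))), with q() from the grammar above
t1_rule_probs = [1.0,   # S  -> NP VP
                 0.3,   # NP -> DT NN
                 1.0,   # DT -> the
                 0.7,   # NN -> man
                 0.4,   # VP -> Vi
                 1.0]   # Vi -> sleeps
print(prod(t1_rule_probs))   # ≈ 0.084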

PCFGs Learning and Inference

Model: the probability of a tree t with n rules αi → βi, i = 1..n, is the product of the rule probabilities q(αi → βi).

Learning: read the rules off of labeled sentences, use ML estimates for the probabilities, and use all of our standard smoothing tricks.

Inference: for an input sentence s, define T(s) to be the set of trees whose yield is s (whose leaves, read left to right, match the words in s).
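A minimal sketch of the learning step (relative-frequency estimates, no smoothing), assuming trees are given as nested (label, child, ...) tuples:

from collections import Counter, defaultdict

def read_rules(tree, rules):
    """Collect rule counts from a tree given as (label, child, child, ...) with string leaves."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        rules[(label, (children[0],))] += 1          # lexical rule, e.g. NN -> man
        return
    rules[(label, tuple(c[0] for c in children))] += 1
    for c in children:
        read_rules(c, rules)

def mle_pcfg(treebank):
    """Relative-frequency (ML) estimates q(A -> beta) = count(A -> beta) / count(A)."""
    counts = Counter()
    for t in treebank:
        read_rules(t, counts)
    lhs_totals = defaultdict(float)
    for (lhs, _), n in counts.items():
        lhs_totals[lhs] += n
    return {rule: n / lhs_totals[rule[0]] for rule, n in counts.items()}

t1 = ("S", ("NP", ("DT", "the"), ("NN", "man")), ("VP", ("Vi", "sleeps")))
print(mle_pcfg([t1])[("S", ("NP", "VP"))])   # 1.0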

Grammar Transforms

51

Chomsky Normal Form

• All rules are of the form X → Y Z or X → w, with X, Y, Z ∈ N and w ∈ Σ
• A transformation to this form doesn't change the weak generative capacity of a CFG
  • That is, it recognizes the same language
  • But maybe with different trees
• Empties and unaries are removed recursively
• n-ary rules are divided by introducing new nonterminals (n > 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

• You should think of this as a transformation for efficient parsing
• With some extra book-keeping in symbol names, you can even reconstruct the same trees with a detransform
• In practice, full Chomsky Normal Form is a pain
  • Reconstructing n-aries is easy
  • Reconstructing unaries/empties is trickier
• Binarization is crucial for cubic time CFG parsing
• The rest isn't necessary; it just makes the algorithms cleaner and a bit quicker
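The n-ary-to-binary step can be sketched directly; the naming scheme below (intermediate symbols like VP_V for VP → V NP PP) mirrors the one used in the transformed grammar above:

def binarize(lhs, rhs):
    """Split an n-ary rule LHS -> X1 X2 ... Xn (n > 2) into binary rules,
    introducing intermediate symbols named after the material already consumed."""
    rules = []
    current = lhs
    for i, sym in enumerate(rhs[:-2]):
        new = f"{lhs}_{'_'.join(rhs[:i + 1])}"   # e.g. VP_V for VP -> V NP PP
        rules.append((current, (sym, new)))
        current = new
    rules.append((current, (rhs[-2], rhs[-1])))
    return rules

print(binarize("VP", ("V", "NP", "PP")))
# [('VP', ('V', 'VP_V')), ('VP_V', ('NP', 'PP'))]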

An example before binarization…
(ROOT (S (NP (N people))
         (VP (V fish) (NP (N tanks)) (PP (P with) (NP (N rods))))))

After binarization…
(ROOT (S (NP (N people))
         (VP (V fish) (VP_V (NP (N tanks)) (PP (P with) (NP (N rods)))))))

Treebank empties and unaries: the tree for "Atone" at different levels of normalization

PTB Tree:   (ROOT (S-HLN (NP-SUBJ (-NONE- e)) (VP (VB Atone))))
NoFuncTags: (ROOT (S (NP (-NONE- e)) (VP (VB Atone))))
NoEmpties:  (ROOT (S (VP (VB Atone))))
NoUnaries:  high attachment (ROOT (S Atone))  vs.  low attachment (ROOT (VB Atone))

Parsing

66

Constituency Parsing

fish people fish tanks

PCFG rule probabilities:
Rule          Prob
S → NP VP     θ0
NP → NP NP    θ1
…
N → fish      θ42
N → people    θ43
V → fish      θ44
…

[Diagram: one parse of "fish people fish tanks": (S (NP (NP (N fish)) (NP (N people))) (VP (V fish) (NP (N tanks))))]

Cocke-Kasami-Younger (CKY) Constituency Parsing

Viterbi (max) scores for "fish people fish tanks", e.g. for the cells over "people" and "fish":
people: NP 0.35, V 0.1, N 0.5
fish:   VP 0.06, NP 0.14, V 0.6, N 0.2

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

• Unaries can be incorporated into the algorithm
  • Messy, but doesn't increase algorithmic complexity
• Empties can be incorporated
  • Use fenceposts
  • Doesn't increase complexity; essentially like unaries
• Binarization is vital
  • Without binarization, you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X → s[i])
  else
    return max over k and rules X → Y Z of
      q(X → Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)

function CKY(words, grammar) returns [most_probable_parse, prob]
  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]
  for i = 0; i < #(words); i++
    for A in nonterms
      if A -> words[i] in grammar
        score[i][i+1][A] = P(A -> words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A -> B in grammar
          prob = P(A -> B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965) … extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A -> B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A -> B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true
  return buildTree(score, back)

The CKY algorithm (1960/1965) … extended to unaries
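The pseudocode above translates almost line for line into a runnable sketch; here the chart is a dict keyed by span rather than a 3-d array, and the grammar is the one used in the walkthrough:

from collections import defaultdict

def cky(words, lex, unary, binary):
    """Probabilistic CKY with unary closure.
    lex: {(A, word): p}   unary: {(A, B): p}   binary: {(A, (B, C)): p}"""
    n = len(words)
    score = defaultdict(dict)                     # score[(i, j)][A] = best probability for A over (i, j)

    def apply_unaries(cell):
        added = True
        while added:                              # repeatedly relax unary rules A -> B, as in the pseudocode
            added = False
            for (A, B), p in unary.items():
                if B in cell and cell[B] * p > cell.get(A, 0.0):
                    cell[A] = cell[B] * p
                    added = True

    for i, w in enumerate(words):                 # lexical step
        cell = score[(i, i + 1)]
        for (A, word), p in lex.items():
            if word == w and p > cell.get(A, 0.0):
                cell[A] = p
        apply_unaries(cell)

    for span in range(2, n + 1):                  # binary step per span, then unaries
        for begin in range(0, n - span + 1):
            end = begin + span
            cell = score[(begin, end)]
            for split in range(begin + 1, end):
                left, right = score[(begin, split)], score[(split, end)]
                for (A, (B, C)), p in binary.items():
                    if B in left and C in right:
                        prob = left[B] * right[C] * p
                        if prob > cell.get(A, 0.0):
                            cell[A] = prob
            apply_unaries(cell)
    return score

lex = {("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
       ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3, ("P", "with"): 1.0}
unary = {("NP", "N"): 0.7, ("VP", "V"): 0.1, ("S", "VP"): 0.1}
binary = {("S", ("NP", "VP")): 0.9, ("VP", ("V", "NP")): 0.5, ("VP", ("V", "VP_V")): 0.3,
          ("VP", ("V", "PP")): 0.1, ("VP_V", ("NP", "PP")): 1.0,
          ("NP", ("NP", "NP")): 0.1, ("NP", ("NP", "PP")): 0.2, ("PP", ("P", "NP")): 1.0}

chart = cky("fish people fish tanks".split(), lex, unary, binary)
print(chart[(0, 4)].get("S"))   # ≈ 0.00018522, matching the worked chart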

The grammar: binary, no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

[Chart layout: cells score[i][j] for all spans 0 ≤ i < j ≤ 4 (score[0][1], score[1][2], …, score[0][4]), drawn over fenceposts 0–4 for the words: fish people fish tanks]

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i = 0; i < #(words); i++
  for A in nonterms
    if A -> words[i] in grammar
      score[i][i+1][A] = P(A -> words[i])

After the lexical step:
[0,1] fish:   N→fish 0.2,  V→fish 0.6
[1,2] people: N→people 0.5, V→people 0.1
[2,3] fish:   N→fish 0.2,  V→fish 0.6
[3,4] tanks:  N→tanks 0.2, V→tanks 0.1
Fenceposts 0–4 over the words: fish people fish tanks. Grammar:
S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

After the unary closure on the diagonal:
[0,1] fish:   N→fish 0.2,  V→fish 0.6,  NP→N 0.14, VP→V 0.06, S→VP 0.006
[1,2] people: N→people 0.5, V→people 0.1, NP→N 0.35, VP→V 0.01, S→VP 0.001
[2,3] fish:   N→fish 0.2,  V→fish 0.6,  NP→N 0.14, VP→V 0.06, S→VP 0.006
[3,4] tanks:  N→tanks 0.2, V→tanks 0.1, NP→N 0.14, VP→V 0.03, S→VP 0.003
Fenceposts 0–4 over the words: fish people fish tanks. Grammar:
S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

After the binary step for spans of length 2:
[0,1] fish:   N→fish 0.2,  V→fish 0.6,  NP→N 0.14, VP→V 0.06, S→VP 0.006
[1,2] people: N→people 0.5, V→people 0.1, NP→N 0.35, VP→V 0.01, S→VP 0.001
[2,3] fish:   N→fish 0.2,  V→fish 0.6,  NP→N 0.14, VP→V 0.06, S→VP 0.006
[3,4] tanks:  N→tanks 0.2, V→tanks 0.1, NP→N 0.14, VP→V 0.03, S→VP 0.003
[0,2]: NP→NP NP 0.0049,  VP→V NP 0.105, S→NP VP 0.00126
[1,3]: NP→NP NP 0.0049,  VP→V NP 0.007, S→NP VP 0.0189
[2,4]: NP→NP NP 0.00196, VP→V NP 0.042, S→NP VP 0.00378
Fenceposts 0–4 over the words: fish people fish tanks. Grammar:
S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 45: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

PCFGs Learning and Inference

Model The probability of a tree t with n rules αi βi i = 1n

Learning Read the rules off of labeled sentences use ML estimates for

probabilities

and use all of our standard smoothing tricks

Inference For input sentence s define T(s) to be the set of trees whose yield is s

(whole leaves read left to right match the words in s)

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 46: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Grammar Transforms

51

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

• Unaries can be incorporated into the algorithm: messy, but doesn't increase algorithmic complexity
• Empties can be incorporated: use fenceposts; doesn't increase complexity, essentially like unaries
• Binarization is vital: without binarization you don't get parsing cubic in the length of the sentence and in the number of nonterminals in the grammar
• Binarization may be an explicit transformation or implicit in how the parser works (Earley-style dotted rules), but it's always there

A Recursive Parser

bestScore(X, i, j, s)
  if (j == i)
    return q(X → s[i])
  else
    return max over k, X → Y Z of
      q(X → Y Z) * bestScore(Y, i, k, s) * bestScore(Z, k+1, j, s)

function CKY(words, grammar) returns [most_probable_parse, prob]
  score = new double[#(words)+1][#(words)+1][#(nonterms)]
  back = new Pair[#(words)+1][#(words)+1][#(nonterms)]
  for i = 0; i < #(words); i++
    for A in nonterms
      if A → words[i] in grammar
        score[i][i+1][A] = P(A → words[i])
    // handle unaries
    boolean added = true
    while added
      added = false
      for A, B in nonterms
        if score[i][i+1][B] > 0 && A → B in grammar
          prob = P(A → B) * score[i][i+1][B]
          if prob > score[i][i+1][A]
            score[i][i+1][A] = prob
            back[i][i+1][A] = B
            added = true

The CKY algorithm (1960/1965) … extended to unaries

  for span = 2 to #(words)
    for begin = 0 to #(words) - span
      end = begin + span
      for split = begin+1 to end-1
        for A, B, C in nonterms
          prob = score[begin][split][B] * score[split][end][C] * P(A → B C)
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = new Triple(split, B, C)
      // handle unaries
      boolean added = true
      while added
        added = false
        for A, B in nonterms
          prob = P(A → B) * score[begin][end][B]
          if prob > score[begin][end][A]
            score[begin][end][A] = prob
            back[begin][end][A] = B
            added = true
  return buildTree(score, back)

The CKY algorithm (1960/1965) … extended to unaries
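The pseudocode above translates almost line for line into the following runnable Python sketch, shown here with the toy grammar used in the worked example that follows; the dict-based grammar encoding and the names cky / close_unaries are our own choices, not part of the slides.

from collections import defaultdict

# Toy grammar from the slides, written as dicts of rule probabilities.
BINARY = {                                     # A -> B C
    ("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5, ("VP", "V", "VP_V"): 0.3,
    ("VP", "V", "PP"): 0.1, ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
    ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
}
UNARY = {("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7}     # A -> B
LEXICAL = {                                    # A -> w
    ("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2,
    ("N", "rods"): 0.1, ("V", "people"): 0.1, ("V", "fish"): 0.6,
    ("V", "tanks"): 0.3, ("P", "with"): 1.0,
}

def cky(words):
    n = len(words)
    score = defaultdict(float)        # (begin, end, A) -> best probability
    back = {}                         # (begin, end, A) -> backpointer

    def close_unaries(b, e):          # the "handle unaries" loop
        added = True
        while added:
            added = False
            for (a, child), p in UNARY.items():
                prob = p * score[b, e, child]
                if prob > score[b, e, a]:
                    score[b, e, a], back[b, e, a] = prob, child
                    added = True

    for i, w in enumerate(words):     # lexical step
        for (a, word), p in LEXICAL.items():
            if word == w:
                score[i, i + 1, a] = p
        close_unaries(i, i + 1)

    for span in range(2, n + 1):      # binary step, smallest spans first
        for begin in range(n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (a, b, c), p in BINARY.items():
                    prob = score[begin, split, b] * score[split, end, c] * p
                    if prob > score[begin, end, a]:
                        score[begin, end, a] = prob
                        back[begin, end, a] = (split, b, c)
            close_unaries(begin, end)
    return score, back

if __name__ == "__main__":
    score, back = cky("fish people fish tanks".split())
    print(round(score[0, 4, "S"], 8))   # 0.00018522, as in the worked chart below

The work is cubic in sentence length (spans times split points) times the number of grammar rules, which is exactly why binarization matters.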

The grammar: Binary, no epsilons

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0
N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

[The empty CKY chart for "fish people fish tanks" (fenceposts 0–4): width-1 cells score[0][1], score[1][2], score[2][3], score[3][4]; width-2 cells score[0][2], score[1][3], score[2][4]; width-3 cells score[0][3], score[1][4]; and the full-span cell score[0][4]]

fish   people   fish   tanks      (chart fenceposts 0–4)

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0
N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0

for i = 0; i < #(words); i++
  for A in nonterms
    if A → words[i] in grammar
      score[i][i+1][A] = P(A → words[i])

score[0][1] (fish):   N → fish 0.2,   V → fish 0.6
score[1][2] (people): N → people 0.5, V → people 0.1
score[2][3] (fish):   N → fish 0.2,   V → fish 0.6
score[3][4] (tanks):  N → tanks 0.2,  V → tanks 0.3

fish   people   fish   tanks      (grammar sidebar as above)

// handle unaries
boolean added = true
while added
  added = false
  for A, B in nonterms
    if score[i][i+1][B] > 0 && A → B in grammar
      prob = P(A → B) * score[i][i+1][B]
      if prob > score[i][i+1][A]
        score[i][i+1][A] = prob
        back[i][i+1][A] = B
        added = true

score[0][1] (fish):   N 0.2, V 0.6, NP→N 0.14, VP→V 0.06, S→VP 0.006
score[1][2] (people): N 0.5, V 0.1, NP→N 0.35, VP→V 0.01, S→VP 0.001
score[2][3] (fish):   N 0.2, V 0.6, NP→N 0.14, VP→V 0.06, S→VP 0.006
score[3][4] (tanks):  N 0.2, V 0.3, NP→N 0.14, VP→V 0.03, S→VP 0.003

fish   people   fish   tanks      (grammar sidebar as above)

prob = score[begin][split][B] * score[split][end][C] * P(A → B C)
if prob > score[begin][end][A]
  score[begin][end][A] = prob
  back[begin][end][A] = new Triple(split, B, C)

(width-1 cells as before)
score[0][2]: NP→NP NP 0.0049,  VP→V NP 0.105,  S→NP VP 0.00126
score[1][3]: NP→NP NP 0.0049,  VP→V NP 0.007,  S→NP VP 0.0189
score[2][4]: NP→NP NP 0.00196, VP→V NP 0.042,  S→NP VP 0.00378

fish   people   fish   tanks      (grammar sidebar as above)

// handle unaries
boolean added = true
while added
  added = false
  for A, B in nonterms
    prob = P(A → B) * score[begin][end][B]
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = B
      added = true

(width-1 cells as before)
score[0][2]: NP→NP NP 0.0049,  VP→V NP 0.105,  S→VP 0.0105
score[1][3]: NP→NP NP 0.0049,  VP→V NP 0.007,  S→NP VP 0.0189
score[2][4]: NP→NP NP 0.00196, VP→V NP 0.042,  S→VP 0.0042

fish   people   fish   tanks      (grammar sidebar as above)

for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A → B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

(width-1 and width-2 cells as before)
score[0][3]: NP→NP NP 0.0000686, VP→V NP 0.00147, S→NP VP 0.000882

fish   people   fish   tanks      (grammar sidebar as above)

for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A → B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

(earlier cells as before)
score[1][4]: NP→NP NP 0.0000686, VP→V NP 0.000098, S→NP VP 0.01323

fish   people   fish   tanks      (grammar sidebar as above)

for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A → B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

(earlier cells as before)
score[0][4]: NP→NP NP 0.0000009604, VP→V NP 0.00002058, S→NP VP 0.00018522

fish   people   fish   tanks      (grammar sidebar as above; chart complete)

Call buildTree(score, back) to get the best parse
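The slides end with buildTree(score, back). A possible reconstruction routine over back-pointers of the shape produced by the cky sketch above (a child label for unary steps, a (split, B, C) triple for binary steps, nothing for terminals) might look like this; it is an illustrative sketch, not the parser's actual buildTree.

def build_tree(back, words, begin, end, label):
    bp = back.get((begin, end, label))
    if bp is None:                       # terminal: the label covers one word
        return (label, words[begin])
    if isinstance(bp, str):              # unary rewrite: label -> bp
        return (label, build_tree(back, words, begin, end, bp))
    split, left, right = bp              # binary rewrite: label -> left right
    return (label,
            build_tree(back, words, begin, split, left),
            build_tree(back, words, split, end, right))

# build_tree(back, "fish people fish tanks".split(), 0, 4, "S") returns the
# Viterbi parse as a nested tuple rooted in S.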

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)
Candidate brackets: S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall: 3/8 = 37.5%
LP/LR F1: 40.0
Tagging Accuracy: 11/11 = 100.0%
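A small sketch of the labeled-bracket arithmetic above, with brackets encoded as (label, start, end) triples; the encoding and the function name are our own, and the multiset handling and length cutoffs of real PARSEVAL scoring are ignored.

def bracket_scores(gold, guess):
    gold, guess = set(gold), set(guess)
    matched = len(gold & guess)                 # brackets shared by both sets
    precision = matched / len(guess)
    recall = matched / len(gold)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
guess = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
         ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print([round(100 * x, 1) for x in bracket_scores(gold, guess)])   # [42.9, 37.5, 40.0]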

How good are PCFGs?
• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust: usually admit everything, but with low probability
• Partial solution for grammar ambiguity: a PCFG gives some idea of the plausibility of a parse, but not so good because the independence assumptions are too strong
• Give a probabilistic language model, but in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs
• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
• (A word is independent of the rest of the tree given its POS)

A Case of PP Attachment Ambiguity

A Case of Coordination Ambiguity

Structural Preferences Close Attachment

Structural Preferences Close Attachment

• Example: John was believed to have been shot by Bill
• Low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing), so the two analyses receive the same probability

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
• At any node, the material inside that node is independent of the material outside that node, given the label of that node
• Any information that statistically connects behavior inside and outside a node must flow through that node's label
[figure: a tree with an S node expanding as S → NP VP and the NP expanding as NP → DT NN]

Non-Independence I

• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects)
[bar chart: relative frequencies of the expansions NP → NP PP, NP → DT NN, and NP → PRP for all NPs, for NPs under S, and for NPs under VP — subject NPs are mostly pronouns (PRP), while NPs under VP are mostly full NPs]

Non-Independence II

• Symptoms of overly strong assumptions:
• Rewrites get used where they don't belong
[example tree: in the PTB this construction is for possessives]

Advanced Unlexicalized Parsing

Horizontal Markovization
• Horizontal Markovization merges states
[two charts over horizontal Markov order 0, 1, 2v, 2, ∞: parsing accuracy (F1 roughly 70–74) and grammar size (roughly 3,000–12,000 symbols)]

Vertical Markovization
• Vertical Markov order: rewrites depend on past k ancestor nodes (i.e., parent annotation)
Order 1 vs. Order 2
[two charts over vertical Markov order 1, 2v, 2, 3v, 3: parsing accuracy (F1 roughly 72–79) and grammar size (roughly 5,000–25,000 symbols)]

Model    F1    Size
v=h=2v   77.8  7.5K
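A minimal sketch of what order-2 vertical Markovization (parent annotation) does to a tree: every phrasal label is re-labelled with its parent. The nested-tuple tree encoding and the choice to leave POS tags unannotated are our own simplifications.

def parent_annotate(tree, parent="ROOT"):
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return (label, children[0])       # preterminal: leave the tag alone
    return (label + "^" + parent,) + tuple(
        parent_annotate(child, label) for child in children)

t = ("S", ("NP", ("N", "people")), ("VP", ("V", "fish"), ("NP", ("N", "tanks"))))
print(parent_annotate(t))
# ('S^ROOT', ('NP^S', ('N', 'people')),
#  ('VP^S', ('V', 'fish'), ('NP^VP', ('N', 'tanks'))))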

Unary Splits
• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used
• Solution: mark unary rewrite sites with -U

Annotation   F1    Size
Base         77.8  7.5K
UNARY        78.3  8.0K

Tag Splits
• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN
• Partial Solution: subdivide the IN tag

Annotation   F1    Size
Previous     78.3  8.0K
SPLIT-IN     80.3  8.1K

Other Tag Splits
• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

F1    Size
80.4  8.1K
80.5  8.1K
81.2  8.5K
81.6  9.0K
81.7  9.1K
81.8  9.3K

Yield Splits
• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples: possessive NPs; finite vs. infinite VPs; lexical heads
• Solution: annotate future elements into nodes

Annotation   F1    Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7  10.5K

Distance / Recursion Splits
• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites: contains a verb; is (non)-recursive
• Base NPs [cf. Collins 99]
• Right-recursive NPs

Annotation     F1    Size
Previous       85.7  10.5K
BASE-NP        86.0  11.7K
DOMINATES-V    86.9  14.1K
RIGHT-REC-NP   87.0  15.2K

[figure: an NP and PP attachment inside a VP, with sites marked v / -v for whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results
• Beats "first generation" lexicalized parsers

Parser               LP    LR    F1
Magerman 95          84.9  84.6  84.7
Collins 96           86.3  85.8  86.0
Klein & Manning 03   86.9  85.7  86.3
Charniak 97          87.4  87.5  87.4
Collins 99           88.7  88.6  88.6

Lexicalised PCFGs

Heads in Context-Free Rules

Heads

Rules to Recover Heads: An Example for NPs

Rules to Recover Heads: An Example for VPs

Adding Headwords to Trees

Adding Headwords to Trees

Lexicalized CFGs in Chomsky Normal Form

Example

Lexicalized CKY
[figure: X[h] → Y[h] Z[h'] over span i … h … k … h' … j, e.g. VBD[saw] and NP[her] combining into (VP → VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j = i)
    return score(X → s[i])
  else
    return max of
      max over k, w, X → Y Z of score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, w, X → Y Z of score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

Pruning with Beams
• The Collins parser prunes with per-cell beams [Collins 99]
• Essentially, run the O(n^5) CKY
• Remember only a few hypotheses for each span <i,j>
• If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
• Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)
[figure: X[h] → Y[h] Z[h'] over span i … h … k … h' … j]
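A minimal sketch of the per-cell beam idea, assuming each chart cell is a dict from nonterminal label to Viterbi score; the cutoff k and the function name are illustrative assumptions, not the Collins parser's actual pruning code.

import heapq

def prune_cell(cell, k=5):
    """Keep only the k highest-scoring labelled hypotheses for one span."""
    if len(cell) <= k:
        return cell
    return dict(heapq.nlargest(k, cell.items(), key=lambda kv: kv[1]))

print(prune_cell({"NP": 0.2, "VP": 0.05, "S": 0.01, "PP": 0.001, "X": 1e-9}, k=3))
# {'NP': 0.2, 'VP': 0.05, 'S': 0.01}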

Parameter Estimation

A Model from Charniak (1997)

A Model from Charniak (1997)

Other Details

Final Test Set Results

Parser               LP    LR    F1
Magerman 95          84.9  84.6  84.7
Collins 96           86.3  85.8  86.0
Klein & Manning 03   86.9  85.7  86.3
Charniak 97          87.4  87.5  87.4
Collins 99           88.7  88.6  88.6

Analysis/Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

The Game of Designing a Grammar
Annotation refines base treebank symbols to improve statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits
• Manually split categories:
  - NP: subject vs. object
  - DT: determiners vs. demonstratives
  - IN: sentential vs. prepositional
• Advantages: fairly compact grammar; linguistic motivations
• Disadvantages: performance leveled out; manually annotated

Learning Latent Annotations
Latent Annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories
[figure: a tree over "He was right" with latent subcategory variables X1 … X7 at its nodes]
Can learn with EM, like Forward-Backward for HMMs (Forward ↔ Outside, Backward ↔ Inside)

Automatic Annotation Induction
• Advantages: automatically learned; label all nodes with latent variables; same number k of subcategories for all categories
• Disadvantages: grammar gets too large; most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al. '05  86.7

Refinement of the DT tag
[figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4]
Hierarchical refinement: repeatedly learn more fine-grained subcategories
• start with two (per non-terminal), then keep splitting
• initialize each EM run with the output of the last

Adaptive Splitting
• Want to split complex categories more
• Idea: split everything, roll back splits which were least useful
[Petrov et al. 06]

Adaptive Splitting
• Evaluate loss in likelihood from removing each split = (data likelihood with split reversed) / (data likelihood with split)
• No loss in accuracy when 50% of the splits are reversed
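In symbols (notation ours, not from the slides), the quantity evaluated for each split s of the refined grammar G on training data D is the likelihood ratio

  \Delta_s \;=\; \frac{P(D \mid G \text{ with split } s \text{ reversed})}{P(D \mid G)}

and splits whose ratio is close to 1 (little is lost by merging them back) are the ones rolled back.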

Adaptive Splitting Results
Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results
Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al. '05      86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al. 06          90.2              89.7

Hierarchical Pruning

Parse multiple times with grammars at different levels of granularity:
coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … … …

Bracket Posteriors
[coarse-to-fine parsing times: 1621 min → 111 min → 35 min → 15 min, at 91.2 F1 with no search error]

Page 47: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Chomsky Normal Form

bull All rules are of the form X Y Z or X wbull X Y Z isin N and w isin Σ

bull A transformation to this form doesnrsquot change the weak generative capacity of a CFGbull That is it recognizes the same language

bull But maybe with different trees

bull Empties and unaries are removed recursively

bull n-ary rules are divided by introducing new nonterminals (n gt 2)

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 48: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

S VP

VP V NP

VP V

VP V NP PP

VP V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

S V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT
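A hedged sketch of the split step that precedes each EM run: every symbol is split into k subcategories, the rule mass is divided evenly over the refined combinations, and a little noise breaks the symmetry so EM can move away from the uniform solution. The grammar representation is an assumption for illustration; a real implementation would leave terminal symbols unsplit.

import itertools, random

def split_grammar(rules, k=2, noise=0.01):
    # rules: dict mapping (lhs, rhs_tuple) -> probability
    out = {}
    for (lhs, rhs), p in rules.items():
        for i in range(k):
            for combo in itertools.product(range(k), repeat=len(rhs)):
                new_lhs = f"{lhs}-{i}"
                new_rhs = tuple(f"{sym}-{j}" for sym, j in zip(rhs, combo))
                # divide the mass evenly, then jitter so EM can break symmetry
                out[(new_lhs, new_rhs)] = p / (k ** len(rhs)) * (1 + random.uniform(-noise, noise))
    return out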

Adaptive Splitting

Want to split complex categories more

Idea: split everything, then roll back the splits which were least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate the loss in likelihood from removing each split:

loss = (data likelihood with split reversed) / (data likelihood with split)

bull No loss in accuracy when 50% of the splits are reversed
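In code, the merge step only needs this per-split likelihood ratio; a small sketch follows (the loss_ratio input and the function name are hypothetical):

def splits_to_merge(loss_ratio, merge_fraction=0.5):
    # loss_ratio: dict split_id -> (likelihood with split reversed) / (likelihood with split);
    # values close to 1 mean reversing the split costs almost nothing
    ranked = sorted(loss_ratio, key=loss_ratio.get, reverse=True)   # least useful first
    return ranked[:int(len(ranked) * merge_fraction)]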

Adaptive Splitting Results

Model F1

Previous 88.4

With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser                     F1 (le 40 words)   F1 (all words)

Klein & Manning '03        86.3               85.7

Matsuzaki et al '05        86.7               86.1

Collins '99                88.6               88.2

Charniak & Johnson '05     90.1               89.6

Petrov et al 06            90.2               89.7

Hierarchical Pruning

coarse: ... QP NP VP ...

split in two: ... QP1 QP2 NP1 NP2 VP1 VP2 ...

split in four: ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...

split in eight: ... ... ...

Parse multiple times with grammars at different levels of granularity
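A sketch of the coarse-to-fine loop, assuming a helper posteriors(sentence, grammar, allowed) that computes inside-outside posteriors restricted to refinements of the previously surviving items (the helper and its signature are assumptions, not a particular parser's API):

def coarse_to_fine(sentence, grammars, posteriors, threshold=1e-4):
    # grammars: list ordered from coarsest to finest
    allowed = None                       # no constraint for the first (coarsest) pass
    for g in grammars:
        chart = posteriors(sentence, g, allowed)     # {(i, j, label): posterior}
        allowed = {item for item, p in chart.items() if p > threshold}
    return allowed                       # items that survive the finest pass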

Bracket Posteriors

(figure: bracket posterior charts; parsing times with successively finer pruning: 1621 min, 111 min, 35 min, 15 min [91.2 F1, no search error])

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

V fish

S fish

V tanks

S tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP

NP NP PP

NP PP

NP N

PP P NP

PP P

N people

N fish

N tanks

N rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V NP PP

S V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation   F1    Size
tag splits   82.3  9.7K
POSS-NP      83.1  9.8K
SPLIT-VP     85.7  10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation     F1    Size
Previous       85.7  10.5K
BASE-NP        86.0  11.7K
DOMINATES-V    86.9  14.1K
RIGHT-REC-NP   87.0  15.2K

[Figure: attachment example with NP, VP, PP nodes marked v / -v according to whether they dominate a verb]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser               LP    LR    F1
Magerman 95          84.9  84.6  84.7
Collins 96           86.3  85.8  86.0
Klein & Manning 03   86.9  85.7  86.3
Charniak 97          87.4  87.5  87.4
Collins 99           88.7  88.6  88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads: An Example for NPs

112

Rules to Recover Heads: An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115
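Slides 110–115 survive here only as titles, so as a rough illustration of the idea: a heavily simplified sketch of head-finding rules and headword propagation. The rule table below is a toy approximation covering a few categories, not Collins' actual head table, and it assumes treebank-style trees where only preterminals dominate words.

# Sketch: toy head rules plus headword propagation up the tree.
from nltk.tree import Tree

# For each parent category: search direction and a priority list of child labels.
HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NP", "PRP"]),
    "VP": ("left",  ["VBD", "VBZ", "VB", "VBN", "VP"]),
    "PP": ("left",  ["IN", "TO"]),
    "S":  ("left",  ["VP"]),
}

def head_child(tree):
    direction, priorities = HEAD_RULES.get(tree.label(), ("left", []))
    children = list(tree) if direction == "left" else list(reversed(tree))
    for wanted in priorities:                 # first, try the priority list
        for child in children:
            if isinstance(child, Tree) and child.label() == wanted:
                return child
    return children[0]                        # otherwise fall back to the first child scanned

def annotate_heads(tree):
    """Return (tree_with_headwords, headword) for a parsed nltk Tree."""
    if len(tree) == 1 and not isinstance(tree[0], Tree):        # preterminal over a word
        word = tree[0]
        return Tree(f"{tree.label()}[{word}]", [word]), word
    new_children, head_words = [], {}
    for child in tree:
        new_child, hw = annotate_heads(child)
        new_children.append(new_child)
        head_words[id(child)] = hw
    hw = head_words[id(head_child(tree))]
    return Tree(f"{tree.label()}[{hw}]", new_children), hw

t = Tree.fromstring("(S (NP (DT the) (NN boy)) (VP (VBD saw) (NP (PRP her))))")
print(annotate_heads(t)[0])
# (S[saw] (NP[boy] (DT[the] the) (NN[boy] boy)) (VP[saw] (VBD[saw] saw) (NP[her] (PRP[her] her))))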

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: X[h] over span (i, j) is built from Y[h] over (i, k) and Z[h'] over (k, j), with i ≤ h < k ≤ h' < j; the lexicalized rule can be written (VP → VBD[saw] NP[her]) or equivalently (VP → VBD NP)[saw]]

bestScore(X, i, j, h)
  if (j == i)
    return score(X, s[i])
  else
    return the larger of
      max over k, w, X→YZ of  score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, w, X→YZ of  score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
• Essentially run the O(n^5) CKY
• Remember only a few hypotheses for each span <i, j>
• If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
• Keeps things more or less cubic (a beam-pruning sketch follows below)
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram: X[h] over span (i, j) split at k into Y[h] and Z[h']]
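A schematic sketch of the per-cell beam, under the assumption that each chart cell stores a map from (category, head) hypotheses to their best scores; the beam size and helper name are illustrative, not Collins' actual code.

# Sketch: keep only the K best (category, head) hypotheses in each chart cell.
import heapq

K = 20  # beam size per span (illustrative value)

def prune_cell(cell, k=K):
    """cell maps (category, head_index) -> best score; keep only the k highest-scoring entries."""
    if len(cell) <= k:
        return cell
    kept = heapq.nlargest(k, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

# Usage inside a CKY-style loop, where chart[(i, j)] is such a dict of hypotheses:
#   ... fill chart[(i, j)] by combining chart[(i, m)] and chart[(m, j)] ...
#   chart[(i, j)] = prune_cell(chart[(i, j)])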

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP    LR    F1
Magerman 95          84.9  84.6  84.7
Collins 96           86.3  85.8  86.0
Klein & Manning 03   86.9  85.7  86.3
Charniak 97          87.4  87.5  87.4
Collins 99           88.7  88.6  88.6

Analysis / Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: the tree for "He was right" with latent subcategory variables X1 ... X7 at its nodes]

Can learn with EM, like Forward-Backward for HMMs (Forward ≈ Outside, Backward ≈ Inside)
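As a sketch of the "Inside" half of that EM computation over a single observed tree (bracketing and base categories fixed, subcategories latent): the data structures below, binary_probs and lex_probs, are assumed toy parameter tables, not the actual implementation of Matsuzaki et al. or Petrov et al.

# Sketch: inside scores over subcategories for one fixed tree.
import numpy as np

# Assumed toy parameters: binary_probs[(X, Y, Z)] is a (kX, kY, kZ) array of
# P(X_a -> Y_b Z_c); lex_probs[(X, word)] is a (kX,) array of P(word | X_a).
def inside(node, binary_probs, lex_probs):
    """node is either ('X', word) for a preterminal or ('X', left_node, right_node)."""
    label = node[1]
    label = node[0]
    if isinstance(node[1], str):                       # preterminal over a word
        return lex_probs[(label, node[1])]
    left, right = node[1], node[2]
    in_left = inside(left, binary_probs, lex_probs)    # shape (kY,)
    in_right = inside(right, binary_probs, lex_probs)  # shape (kZ,)
    rule = binary_probs[(label, left[0], right[0])]    # shape (kX, kY, kZ)
    # inside[a] = sum_{b,c} P(X_a -> Y_b Z_c) * inside_Y[b] * inside_Z[c]
    return np.einsum("abc,b,c->a", rule, in_left, in_right)

# The tree likelihood is the sum over root subcategories (times any root prior);
# pairing this with an analogous outside pass gives the EM expected counts.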

Automatic Annotation Induction

Label all nodes with latent variables; same number k of subcategories for all categories

• Advantages:
  • Automatically learned
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al '05   86.7

Refinement of the DT tag

[Figure: the DT tag refined into subcategories DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement: repeatedly learn more fine-grained subcategories
• Start with two (per non-terminal), then keep splitting
• Initialize each EM run with the output of the last

Adaptive Splitting

• Want to split complex categories more
• Idea: split everything, roll back the splits which were least useful [Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:
  loss = (data likelihood with split reversed) / (data likelihood with split)
• No loss in accuracy when 50% of the splits are reversed
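A schematic sketch of that merge decision: rank splits by how much (log-)likelihood would be lost if they were reversed, and merge back the least useful half. The likelihood tables are assumed inputs here; in the real system they are approximated from inside/outside expected counts, so this is an illustration of the criterion, not Petrov et al.'s implementation.

# Sketch: pick the 50% of splits whose removal costs the least likelihood.
def merge_least_useful(splits, loglik_with, loglik_without, fraction=0.5):
    """
    splits: list of split identifiers.
    loglik_with / loglik_without: assumed dicts mapping a split to the data
    log-likelihood with the split kept vs. reversed.
    Returns the set of splits to merge back.
    """
    # Smaller loss = the split helped the likelihood less, so it is merged first.
    loss = {s: loglik_with[s] - loglik_without[s] for s in splits}
    ranked = sorted(splits, key=lambda s: loss[s])
    n_merge = int(len(ranked) * fraction)
    return set(ranked[:n_merge])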

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al '05      86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al 06          90.2              89.7

Hierarchical Pruning

coarse:          ... QP  NP  VP ...
split in two:    ... QP1 QP2   NP1 NP2   VP1 VP2 ...
split in four:   ... QP1 QP2 QP3 QP4   NP1 NP2 NP3 NP4   VP1 VP2 VP3 VP4 ...
split in eight:  ... (and so on) ...

Parse multiple times with grammars at different levels of granularity (a coarse-to-fine pruning sketch follows below)
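A schematic sketch of the pruning mask a coarse pass could hand to a finer pass, assuming bracket posteriors from the coarse grammar are available; the threshold value and function names are illustrative, not the actual Berkeley parser code.

# Sketch: allow a fine constituent only where its coarse projection had enough posterior mass.
THRESHOLD = 1e-4  # illustrative pruning threshold on coarse posteriors

def allowed_mask(coarse_posteriors, projection, fine_labels, threshold=THRESHOLD):
    """
    coarse_posteriors: dict mapping (i, j, coarse_label) -> posterior probability,
        as computed by inside-outside under the coarse grammar (assumed available).
    projection: dict mapping each fine label (e.g. 'NP3') to its coarse label ('NP').
    Returns the set of (i, j, fine_label) items the fine parser may build.
    """
    allowed = set()
    for (i, j, coarse_label), p in coarse_posteriors.items():
        if p < threshold:
            continue
        for fine in fine_labels:
            if projection[fine] == coarse_label:
                allowed.add((i, j, fine))
    return allowed

# The fine CKY pass then skips any (span, label) not in `allowed`, which is what
# makes parsing with the fully split grammar fast (cf. the timings on the next slide).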

Bracket Posteriors

[Figure: bracket posterior heatmaps at successive levels of grammar refinement]

Parsing times with hierarchical pruning: 1621 min → 111 min → 35 min → 15 min [91.2 F1] (no search error)






Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

A phrase structure grammar

S NP VP

VP V NP

VP V NP PP

NP NP NP

NP NP PP

NP N

NP e

PP P NP

N people

N fish

N tanks

N rods

V people

V fish

V tanks

P with

Chomsky Normal Form steps

S NP VP

VP V NP

S V NP

VP V VP_V

VP_V NP PP

S V S_V

S_V NP PP

VP V PP

S V PP

NP NP NP

NP NP PP

NP P NP

PP P NP

NP people

NP fish

NP tanks

NP rods

V people

S people

VP people

V fish

S fish

VP fish

V tanks

S tanks

VP tanks

P with

PP with

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation)

Order 1 vs. Order 2 (example trees)

[Figure: two charts over vertical Markov order 1 / 2v / 2 / 3v / 3: parsing accuracy (roughly 72-79 F1) and number of grammar symbols (roughly 5,000-25,000).]

Model    F1     Size
v=h=2v   77.8   7.5K
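Order-2 vertical markovization is parent annotation. A minimal sketch on an nltk.Tree (assuming NLTK is installed; NLTK's chomsky_normal_form tree transform exposes similar horzMarkov / vertMarkov options):

import nltk

def parent_annotate(tree, parent="ROOT"):
    """Order-2 vertical markovization: relabel every nonterminal as LABEL^PARENT."""
    if not isinstance(tree, nltk.Tree):
        return tree                       # leave the words alone
    return nltk.Tree("%s^%s" % (tree.label(), parent),
                     [parent_annotate(child, tree.label()) for child in tree])

t = nltk.Tree.fromstring("(S (NP (PRP He)) (VP (VBD was) (ADJP (JJ right))))")
print(parent_annotate(t))
# (S^ROOT (NP^S (PRP^NP He)) (VP^S (VBD^VP was) (ADJP^VP (JJ^ADJP right))))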

Unary Splits

bull Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Solution: mark unary rewrite sites with -U

Tag Splits

bull Problem: Treebank tags are too coarse

bull Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN

bull Partial solution:
bull Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

bull UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")

bull UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")

bull TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

bull SPLIT-AUX: mark auxiliary verbs with -AUX [cf Charniak 97]

bull SPLIT-CC: separate "but" and "&" from other conjunctions

bull SPLIT-%: "%" gets its own tag

Annotation   F1     Size
UNARY-DT     80.4   8.1K
UNARY-RB     80.5   8.1K
TAG-PA       81.2   8.5K
SPLIT-AUX    81.6   9.0K
SPLIT-CC     81.7   9.1K
SPLIT-%      81.8   9.3K

Yield Splits

bull Problem: sometimes the behavior of a category depends on something inside its future yield

bull Examples:
bull Possessive NPs
bull Finite vs. infinitival VPs
bull Lexical heads

bull Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

bull Problem: vanilla PCFGs cannot distinguish attachment heights

bull Solution: mark a property of higher or lower sites:
bull Contains a verb
bull Is (non)-recursive
bull Base NPs [cf Collins 99]
bull Right-recursive NPs

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

[Figure: an NP/VP/PP attachment example with the higher and lower sites marked v / -v for whether they dominate a verb.]

A Fully Annotated Tree

Final Test Set Results

bull Beats "first generation" lexicalized parsers

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
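The head-rule tables themselves are in the slide figures; as a stand-in, here is a simplified Collins-style sketch covering only NP and VP, with priority lists that only approximate the real tables:

# Simplified head-finding rules for NP and VP (priorities are approximate).
NP_PRIORITY = ["NN", "NNS", "NNP", "NNPS", "NX", "JJR", "NP"]        # scan right-to-left
VP_PRIORITY = ["VBD", "VBN", "MD", "VBZ", "VB", "VBG", "VBP", "VP"]  # scan left-to-right

def find_head(label, children):
    """children: the rule's RHS as a list of (category, headword) pairs."""
    if label == "NP":
        for cat in NP_PRIORITY:
            for child in reversed(children):
                if child[0] == cat:
                    return child
        return children[-1]                    # default: rightmost child
    if label == "VP":
        for cat in VP_PRIORITY:
            for child in children:
                if child[0] == cat:
                    return child
        return children[0]                     # default: leftmost child
    return children[0]

print(find_head("VP", [("VBD", "saw"), ("NP", "her"), ("PP", "with")]))  # ('VBD', 'saw')
print(find_head("NP", [("DT", "the"), ("JJ", "red"), ("NN", "rug")]))    # ('NN', 'rug')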

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Figure: a span i..j built as X[h] → Y[h] Z[h'] with split point k and head word h, e.g. the rule VP → VBD[saw] NP[her], i.e. (VP → VBD NP)[saw].]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return the maximum of
      max over splits k and rules X[h] -> Y[h] Z[w]:
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over splits k and rules X[h] -> Y[w] Z[h]:
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]:
bull Essentially run the O(n^5) CKY
bull Remember only a few hypotheses for each span <i,j>
bull If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
bull Keeps things more or less cubic
bull Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)
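A minimal sketch of the per-span beam idea (the actual Collins beam also thresholds hypotheses relative to the best item in the cell, in addition to the punctuation constraints above):

import heapq

def prune_cell(cell, K=10):
    """cell: dict mapping (category, headword) -> score; keep only the K best."""
    return dict(heapq.nlargest(K, cell.items(), key=lambda kv: kv[1]))

cell = {("NP", "tanks"): 0.03, ("NP", "fish"): 0.001,
        ("VP", "fish"): 0.0004, ("S", "fish"): 0.00004}
print(prune_cell(cell, K=2))   # only the two highest-scoring hypotheses survive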

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Analysis/Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categories:
bull NP: subject vs. object
bull DT: determiners vs. demonstratives
bull IN: sentential vs. prepositional

bull Advantages:
bull Fairly compact grammar
bull Linguistic motivations

bull Disadvantages:
bull Performance leveled out
bull Manually annotated

Learning Latent Annotations

Latent Annotations:

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

[Figure: a bracketed tree over "He was right" whose nodes X1 ... X7 carry latent subcategory variables.]

Can learn with EM, like Forward-Backward for HMMs (Forward/Outside and Backward/Inside passes)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement: repeatedly learn more fine-grained subcategories:

bull start with two (per non-terminal), then keep splitting

bull initialize each EM run with the output of the last
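A sketch of a single split step under these assumptions (the division of the rule mass by 4 and the 1% perturbation are illustrative choices; renormalization and the actual re-estimation are left to EM):

import itertools, random

def split_grammar(rules, rng=random.Random(0)):
    """One split step: every symbol X becomes X-0 / X-1, every binary rule
    A -> B C is copied for all 2x2x2 subcategory combinations, and each copy
    gets a tiny random perturbation so EM can break the symmetry."""
    new_rules = {}
    for (A, B, C), p in rules.items():
        for a, b, c in itertools.product((0, 1), repeat=3):
            noise = 1.0 + 0.01 * (rng.random() - 0.5)
            new_rules["%s-%d" % (A, a), "%s-%d" % (B, b), "%s-%d" % (C, c)] = p / 4 * noise
    return new_rules

print(len(split_grammar({("S", "NP", "VP"): 0.9})))   # 8 refined copies of the rule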

Adaptive Splitting

Want to split complex categories more

Idea: split everything, then roll back the splits which were least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate the loss in likelihood from removing each split:

loss(split) = P(data | split reversed) / P(data | split kept)

bull No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3               85.7
Matsuzaki et al. '05      86.7               86.1
Collins '99               88.6               88.2
Charniak & Johnson '05    90.1               89.6
Petrov et al. 06          90.2               89.7

Hierarchical Pruning

coarse:          ... QP NP VP ...
split in two:    ... QP1 QP2 NP1 NP2 VP1 VP2 ...
split in four:   ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...
split in eight:  ...

Parse multiple times with grammars at different levels of granularity
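A conceptual sketch of the pruning step between two passes: cells whose coarse-symbol posterior falls below a threshold are closed to all of that symbol's refined subcategories in the next, finer pass (symbol names and the threshold here are illustrative):

def allowed_refined(coarse_posteriors, projection, threshold=1e-4):
    """coarse_posteriors: (begin, end, coarse_symbol) -> posterior probability.
       projection: refined_symbol -> coarse_symbol.
       Returns the (begin, end, refined_symbol) items the next pass may build."""
    allowed = set()
    for (begin, end, coarse), post in coarse_posteriors.items():
        if post >= threshold:
            for refined, proj in projection.items():
                if proj == coarse:
                    allowed.add((begin, end, refined))
    return allowed

posteriors = {(0, 2, "NP"): 0.8, (1, 3, "VP"): 2e-6}     # from the coarser pass
projection = {"NP-1": "NP", "NP-2": "NP", "VP-1": "VP", "VP-2": "VP"}
print(sorted(allowed_refined(posteriors, projection)))
# only the NP-1 / NP-2 items over span (0, 2) survive the threshold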

Bracket Posteriors

Parsing time at successive pruning levels: 1621 min, 111 min, 35 min, 15 min [91.2 F1] (no search error)

Page 57: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Chomsky Normal Form

bull You should think of this as a transformation for efficient parsing

bull With some extra book-keeping in symbol names you can even reconstruct the same trees with a detransform

bull In practice full Chomsky Normal Form is a painbull Reconstructing n-aries is easy

bull Reconstructing unariesempties is trickier

bull Binarization is crucial for cubic time CFG parsing

bull The rest isnrsquot necessary it just makes the algorithms cleaner and a bit quicker

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")

• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")

• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]

• SPLIT-CC: separate "but" and "&" from other conjunctions

• SPLIT-%: "%" gets its own tag

F1 / Size (cumulative, one row per split above, in order):
80.4   8.1K
80.5   8.1K
81.2   8.5K
81.6   9.0K
81.7   9.1K
81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield

• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads

• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites
  • Contains a verb
  • Is (non)-recursive
    • Base NPs [cf. Collins 99]
    • Right-recursive NPs

Annotation     F1     Size
Previous       85.7   10.5K
BASE-NP        86.0   11.7K
DOMINATES-V    86.9   14.1K
RIGHT-REC-NP   87.0   15.2K

[Tree diagram: a higher attachment site under VP (dominates a verb, v) vs. a lower NP-internal site (-v)]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads: An Example for NPs

112

Rules to Recover Heads: An Example for VPs

113
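Since the head-rule tables are shown only as figures here, the following Python sketch gives the flavor of head percolation, with deliberately simplified priority lists for NP and VP (the real Collins tables are longer and differ in detail):

# Sketch: simplified head-finding rules.
# For an NP, search the children right-to-left for a nominal tag;
# for a VP, search left-to-right for a verbal tag; otherwise default
# to the leftmost child.
HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NNPS", "NP", "PRP"]),
    "VP": ("left",  ["VBD", "VBN", "VBZ", "VBP", "VB", "MD", "VP"]),
}

def find_head(parent, children):
    direction, priorities = HEAD_RULES.get(parent, ("left", []))
    indices = range(len(children)) if direction == "left" else range(len(children) - 1, -1, -1)
    for wanted in priorities:
        for i in indices:
            if children[i] == wanted:
                return i
    return 0   # default: the leftmost child

print(find_head("VP", ["VBD", "NP", "PP"]))   # 0 -> the VBD is the head child
print(find_head("NP", ["DT", "JJ", "NN"]))    # 2 -> the NN is the head child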

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: building X[h] over span (i, j) from Y[h] over (i, k) and Z[h'] over (k, j); h is the head position X inherits, h' the head of the other child]

(VP → VBD[saw] NP[her])

(VP → VBD NP)[saw]

bestScore(X, i, j, h)
  if (j == i)
    return score(X -> s[i])
  else
    return the maximum, over split points k and rules X -> Y Z, of the larger of:

      score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
        (the head word h comes from the left child; w ranges over head positions in the right child)

      score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)
        (the head word h comes from the right child; w ranges over head positions in the left child)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially, run the O(n⁵) CKY
  • Remember only a few hypotheses for each span <i, j>
  • If we keep K hypotheses at each span, then we do at most O(nK²) work per span (why?)
  • Keeps things more or less cubic

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

[Diagram: combining Y[h] over (i, k) and Z[h'] over (k, j) into X[h] over (i, j); a tiny beam-pruning sketch follows]
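A minimal sketch of the per-cell beam, with made-up scores (my illustration): each chart cell keeps only its K best (category, head) hypotheses, so combining two cells costs at most K² rule applications per split point.

import heapq

# A cell maps hypotheses (label, head word) to scores;
# prune_cell keeps only the K highest-scoring entries.
def prune_cell(cell, K):
    kept = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

cell = {("NP", "tanks"): 0.042, ("VP", "fish"): 0.105,
        ("S", "fish"): 0.0042, ("NP", "fish"): 0.0049}
print(prune_cell(cell, K=2))   # keeps the two best: VP[fish] and NP[tanks]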

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Analysis/Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering

Manual Splits

• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional

• Advantages
  • Fairly compact grammar
  • Linguistic motivations

• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:

• Brackets are known

• Base categories are known

• Hidden variables for subcategories

[Tree diagram: latent subcategory variables X1 ... X7 on the nodes of a parse of "He was right"]

Can learn with EM, like Forward-Backward for HMMs (Forward/Outside and Backward/Inside probabilities)

Automatic Annotation Induction

• Advantages
  • Automatically learned

[Approach: label all nodes with latent variables, with the same number k of subcategories for all categories]

• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al '05   86.7

Refinement of the DT tag

[Diagram: the DT tag refined into subcategories DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement: repeatedly learn more fine-grained subcategories
  • start with two (per non-terminal), then keep splitting
  • initialize each EM run with the output of the last

[Diagram: hierarchical binary splitting of DT into progressively finer subcategories; a tiny sketch of the splitting bookkeeping follows below]
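A toy sketch of just the splitting bookkeeping (my illustration; the EM re-estimation that gives each subcategory its own rule probabilities is elided): every round doubles the number of subcategories of each symbol, and each new subcategory records which one it was split from.

def split_symbols(symbols):
    """One split round: each subcategory spawns two children."""
    return [s + suffix for s in symbols for suffix in ("-0", "-1")]

symbols = ["DT"]
for round_no in range(1, 4):
    symbols = split_symbols(symbols)
    print("after round", round_no, ":", symbols)
# after round 1 : ['DT-0', 'DT-1']
# after round 2 : ['DT-0-0', 'DT-0-1', 'DT-1-0', 'DT-1-1']
# after round 3 : eight subcategories, and so on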

Adaptive Splitting

Want to split complex categories more

Idea: split everything, then roll back the splits which were least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:

  loss = (data likelihood with the split reversed) / (data likelihood with the split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al '05      86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al 06          90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …

split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight:  … and so on …

Parse multiple times with grammars at different levels of granularity
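A minimal sketch of the pruning step with toy numbers (the posteriors, projection map, and threshold are all illustrative assumptions): after parsing with a coarser grammar, a span keeps only those refined symbols whose coarse projection received non-negligible posterior probability.

# coarse_posterior holds posteriors from the coarser pass; projection maps
# each refined symbol back to its coarse symbol. Both are toy data.
coarse_posterior = {
    (0, 2, "NP"): 0.90, (0, 2, "VP"): 1e-7,
    (2, 4, "VP"): 0.80, (2, 4, "NP"): 0.02,
}
projection = {"NP-1": "NP", "NP-2": "NP", "VP-1": "VP", "VP-2": "VP"}
THRESHOLD = 1e-4

def allowed_fine_symbols(i, j):
    """Refined symbols still allowed in cell (i, j) after pruning."""
    return [fine for fine, coarse in projection.items()
            if coarse_posterior.get((i, j, coarse), 0.0) > THRESHOLD]

print(allowed_fine_symbols(0, 2))   # ['NP-1', 'NP-2']  (VP-1, VP-2 pruned)
print(allowed_fine_symbols(2, 4))   # all four survive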

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)

Page 58: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

ROOT

S

NP VP

N

people

V NP PP

P NP

rodswithtanksfish

NN

An example before binarizationhellip

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 59: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

P NP

rods

N

with

NP

N

people tanksfish

N

VP

V NP PP

VP_V

ROOT

S

After binarizationhellip

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 60: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Treebank empties and unaries

ROOT

S-HLN

NP-SUBJ VP

VB-NONE-

e Atone

PTB Tree

ROOT

S

NP VP

VB-NONE-

e Atone

NoFuncTags

ROOT

S

VP

VB

Atone

NoEmpties

ROOT

S

Atone

NoUnaries

ROOT

VB

Atone

High Low

Parsing

66

Constituency Parsing

fish people fish tanks

Rule Prob θi

S NP VP θ0

NP NP NP θ1

hellip

N fish θ42

N people θ43

V fish θ44

hellip

PCFG

N N V N

VP

NPNP

S

Cocke-Kasami-Younger (CKY) Constituency Parsing

fish people fish tanks

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories:
  • NP: subject vs object
  • DT: determiners vs demonstratives
  • IN: sentential vs prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

• Brackets are known
• Base categories are known
• Hidden variables for subcategories

[Figure: parse tree over "He was right" with latent subcategory variables X1 ... X7 at its nodes]

Can learn with EM, like Forward-Backward for HMMs (Forward ~ Outside, Backward ~ Inside)

Automatic Annotation Induction

• Advantages:
  • Automatically learned:
    Label all nodes with latent variables
    Same number k of subcategories for all categories
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                 F1
Klein & Manning '03   86.3
Matsuzaki et al '05   86.7

Refinement of the DT tag

[Figure: DT refined into DT-1, DT-2, DT-3, DT-4 by hierarchical binary splitting]

Hierarchical refinement: repeatedly learn more fine-grained subcategories
• start with two (per non-terminal), then keep splitting
• initialize each EM run with the output of the last

Adaptive Splitting

Want to split complex categories more.
Idea: split everything, roll back the splits which were least useful. [Petrov et al. 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:

      loss(split) = P(data | grammar with split reversed) / P(data | grammar with split)

• No loss in accuracy when 50% of the splits are reversed; a small ranking sketch follows below
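As a small illustration of this criterion, suppose we already have the likelihood ratio for every candidate split; ranking them and rolling back the half with ratios closest to 1 is then a one-liner. The split names and numbers below are invented.

def splits_to_merge(split_likelihood_ratios, fraction=0.5):
    """split_likelihood_ratios: {split: P(data | split reversed) / P(data | split)}.
    Ratios near 1 mean reversing the split costs almost nothing, so merge those."""
    ranked = sorted(split_likelihood_ratios, key=split_likelihood_ratios.get, reverse=True)
    return ranked[: int(fraction * len(ranked))]

ratios = {"NP-1/NP-2": 0.999, "VP-1/VP-2": 0.62, "DT-1/DT-2": 0.97, ",-1/,-2": 1.0}
print(splits_to_merge(ratios))   # the two least useful splits get rolled back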

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al '05      86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al 06          90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … (and so on) …

Parse multiple times with grammars at different levels of granularity (a pruning sketch follows below)
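A small sketch of the pruning step between two passes, assuming bracket posteriors from the coarser grammar are already available: a refined item enters the next chart only if its coarse projection had non-negligible posterior. The data structures, projection function, and threshold are illustrative.

def allowed_after_pruning(coarse_posteriors, projection, threshold=1e-4):
    """coarse_posteriors: {(span, coarse_label): posterior} from the previous pass.
    projection maps a refined label such as 'NP-3' to its coarse label 'NP'."""
    def allowed(span, refined_label):
        return coarse_posteriors.get((span, projection(refined_label)), 0.0) >= threshold
    return allowed

# Toy usage: NP over span (2, 4) survived the coarse pass, QP did not.
posteriors = {((2, 4), "NP"): 0.63, ((2, 4), "QP"): 1e-7}
allowed = allowed_after_pruning(posteriors, projection=lambda label: label.split("-")[0])
print(allowed((2, 4), "NP-3"))   # True  -> keep refining this item
print(allowed((2, 4), "QP-1"))   # False -> pruned before the finer pass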

Bracket Posteriors

[Figure: bracket posteriors at increasing levels of grammar refinement; with hierarchical pruning, parsing time falls from 1621 min to 111 min to 35 min to 15 min, at 91.2 F1 with no search error]

Page 64: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Viterbi (Max) Scores

people fish

NP 035V 01N 05

VP 006NP 014V 06N 02

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammar (binary, no epsilons)

S → NP VP 0.9
S → VP 0.1
VP → V NP 0.5
VP → V 0.1
VP → V VP_V 0.3
VP → V PP 0.1
VP_V → NP PP 1.0
NP → NP NP 0.1
NP → NP PP 0.2
NP → N 0.7
PP → P NP 1.0
N → people 0.5
N → fish 0.2
N → tanks 0.2
N → rods 0.1
V → people 0.1
V → fish 0.6
V → tanks 0.3
P → with 1.0
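For readers who want to run the algorithm, here is a compact Python version of the CKY-with-unaries pseudocode above, using exactly this toy grammar. It is a sketch for illustration, not the original course code.

binary = {          # A -> B C : prob
    ("S", "NP", "VP"): 0.9, ("VP", "V", "NP"): 0.5, ("VP", "V", "VP_V"): 0.3,
    ("VP", "V", "PP"): 0.1, ("VP_V", "NP", "PP"): 1.0, ("NP", "NP", "NP"): 0.1,
    ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
}
unary = {("S", "VP"): 0.1, ("VP", "V"): 0.1, ("NP", "N"): 0.7}   # A -> B : prob
lexicon = {                                                      # A -> word : prob
    ("N", "people"): 0.5, ("N", "fish"): 0.2, ("N", "tanks"): 0.2, ("N", "rods"): 0.1,
    ("V", "people"): 0.1, ("V", "fish"): 0.6, ("V", "tanks"): 0.3, ("P", "with"): 1.0,
}

def cky(words):
    n = len(words)
    score = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    back = [[{} for _ in range(n + 1)] for _ in range(n + 1)]

    def apply_unaries(i, j):
        added = True
        while added:                       # closure over unary chains
            added = False
            for (A, B), q in unary.items():
                p = q * score[i][j].get(B, 0.0)
                if p > score[i][j].get(A, 0.0):
                    score[i][j][A], back[i][j][A] = p, B
                    added = True

    for i, w in enumerate(words):          # lexicon + unaries on the diagonal
        for (A, word), q in lexicon.items():
            if word == w:
                score[i][i + 1][A] = q
        apply_unaries(i, i + 1)

    for span in range(2, n + 1):           # binary rules, then unaries, per span
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (A, B, C), q in binary.items():
                    p = score[begin][split].get(B, 0.0) * score[split][end].get(C, 0.0) * q
                    if p > score[begin][end].get(A, 0.0):
                        score[begin][end][A] = p
                        back[begin][end][A] = (split, B, C)
            apply_unaries(begin, end)

    def build(A, i, j):                    # follow backpointers to the best tree
        bp = back[i][j].get(A)
        if bp is None:
            return (A, words[i])
        if isinstance(bp, str):            # unary backpointer
            return (A, build(bp, i, j))
        split, B, C = bp
        return (A, build(B, i, split), build(C, split, j))

    return score[0][n].get("S", 0.0), build("S", 0, n)

prob, tree = cky("fish people fish tanks".split())
print(prob)   # about 0.00018522
print(tree)

On the sentence used in the walkthrough below, this reproduces the cell values shown there and ends with S built from NP and VP over the whole span with probability about 0.00018522.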

The chart for the sentence "fish people fish tanks" has a cell score[i][j] for every span 0 <= i < j <= 4. The walkthrough below lists the best score found for each nonterminal in each cell as the loops above fill the chart.

Step 1, lexicon (the A → words[i] loop fills the diagonal cells):
  [0,1] fish:   N → fish 0.2,   V → fish 0.6
  [1,2] people: N → people 0.5, V → people 0.1
  [2,3] fish:   N → fish 0.2,   V → fish 0.6
  [3,4] tanks:  N → tanks 0.2,  V → tanks 0.3

Step 2, unaries applied to the diagonal cells:
  [0,1]: NP → N 0.14, VP → V 0.06, S → VP 0.006
  [1,2]: NP → N 0.35, VP → V 0.01, S → VP 0.001
  [2,3]: NP → N 0.14, VP → V 0.06, S → VP 0.006
  [3,4]: NP → N 0.14, VP → V 0.03, S → VP 0.003

Step 3, binary rules over the spans of length 2:
  [0,2]: NP → NP NP 0.0049,  VP → V NP 0.105,  S → NP VP 0.00126
  [1,3]: NP → NP NP 0.0049,  VP → V NP 0.007,  S → NP VP 0.0189
  [2,4]: NP → NP NP 0.00196, VP → V NP 0.042,  S → NP VP 0.00378

Step 4, unaries over the spans of length 2 (S → VP overtakes S → NP VP in two cells):
  [0,2]: S → VP 0.0105
  [1,3]: S → NP VP 0.0189 (unchanged)
  [2,4]: S → VP 0.0042

Step 5, spans of length 3:
  [0,3]: NP → NP NP 0.0000686, VP → V NP 0.00147,  S → NP VP 0.000882
  [1,4]: NP → NP NP 0.0000686, VP → V NP 0.000098, S → NP VP 0.01323

Step 6, the full span:
  [0,4]: NP → NP NP 0.0000009604, VP → V NP 0.00002058, S → NP VP 0.00018522

Call buildTree(score, back) to get the best parse.

Evaluating constituency parsing

Gold standard brackets: S-(0,11), NP-(0,2), VP-(2,9), VP-(3,9), NP-(4,6), PP-(6,9), NP-(7,9), NP-(9,10)
Candidate brackets: S-(0,11), NP-(0,2), VP-(2,10), VP-(3,10), NP-(4,6), PP-(6,10), NP-(7,10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall: 3/8 = 37.5%
LP/LR F1: 40.0%
Tagging Accuracy: 11/11 = 100.0%
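The same numbers can be reproduced with a few lines of Python. This is a PARSEVAL-style sketch over labeled spans, not the official evalb tool.

def bracket_scores(gold, candidate):
    gold, candidate = set(gold), set(candidate)
    matched = len(gold & candidate)
    lp = matched / len(candidate)          # labeled precision
    lr = matched / len(gold)               # labeled recall
    f1 = 2 * lp * lr / (lp + lr) if lp + lr else 0.0
    return lp, lr, f1

gold = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 9), ("VP", 3, 9),
        ("NP", 4, 6), ("PP", 6, 9), ("NP", 7, 9), ("NP", 9, 10)}
cand = {("S", 0, 11), ("NP", 0, 2), ("VP", 2, 10), ("VP", 3, 10),
        ("NP", 4, 6), ("PP", 6, 10), ("NP", 7, 10)}
print(bracket_scores(gold, cand))   # about (0.429, 0.375, 0.4)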

How good are PCFGs?

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Give a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
  • (A word is independent of the rest of the tree given its POS)

A Case of PP Attachment Ambiguity

(example parse trees shown as figures on the original slides)

A Case of Coordination Ambiguity

(example parse trees shown as figures on the original slides)

Structural Preferences: Close Attachment

• Example: "John was believed to have been shot by Bill."
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing), so the two analyses receive the same probability.

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
  • At any node, the material inside that node is independent of the material outside that node, given the label of that node.
  • Any information that statistically connects behavior inside and outside a node must flow through that node's label.

(figure: a tree fragment with S → NP VP and NP → DT NN, illustrating that only the NP label links the NP's inside and outside)

Non-Independence I

• The independence assumptions of a PCFG are often too strong.
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects).

(bar chart: the relative frequencies of the expansions NP → NP PP, NP → DT NN, and NP → PRP differ sharply between all NPs, NPs under S, and NPs under VP)

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong.

(figure annotation: in the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing


Horizontal Markovization

• Horizontal Markovization merges states.

(figures: parsing F1 stays roughly in the 70-74 range as the horizontal Markov order varies over 0, 1, 2v, 2, ∞, while the number of grammar symbols grows from under 3,000 toward 12,000)

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation).

(figures: example trees at order 1 vs. order 2; parsing F1 rises roughly from 72 to 79 as the vertical order varies over 1, 2v, 2, 3v, 3, while the number of symbols grows toward 25,000)

Model   | F1   | Size
v=h=2v  | 77.8 | 7.5K
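A small Python sketch of what the two Markovizations do to symbols; the symbol-naming conventions here are made up for illustration, not the ones used in the cited parsers.

def parent_annotate(label, parent):
    """Vertical order 2: a symbol remembers its parent (e.g. NP under VP becomes NP^VP)."""
    return "%s^%s" % (label, parent)

def horizontal_binarize(lhs, rhs, h):
    """Right-branching binarization whose intermediate symbols remember only the last
    h sisters already generated (horizontal Markov order h)."""
    rules = []
    current = lhs
    for i in range(len(rhs) - 2):
        memory = rhs[max(0, i + 1 - h):i + 1] if h > 0 else []
        new_sym = "@%s|%s" % (lhs, "-".join(memory))
        rules.append((current, (rhs[i], new_sym)))
        current = new_sym
    rules.append((current, (rhs[-2], rhs[-1])))
    return rules

print(parent_annotate("NP", "VP"))                                 # NP^VP
for rule in horizontal_binarize("NP", ["DT", "JJ", "JJ", "JJ", "NN"], 1):
    print(rule)
# with h=1 every position after a JJ reuses the same state @NP|JJ, so the
# intermediate symbols merge; with a large h they would all stay distinct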

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used.
• Solution: mark unary rewrite sites with -U.

Annotation | F1   | Size
Base       | 77.8 | 7.5K
UNARY      | 78.3 | 8.0K

Tag Splits

• Problem: Treebank tags are too coarse.
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN.
• Partial solution: subdivide the IN tag.

Annotation | F1   | Size
Previous   | 78.3 | 8.0K
SPLIT-IN   | 80.3 | 8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those") (F1 80.4, size 8.1K)
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very") (F1 80.5, size 8.1K)
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP) (F1 81.2, size 8.5K)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97] (F1 81.6, size 9.0K)
• SPLIT-CC: separate "but" and "&" from other conjunctions (F1 81.7, size 9.1K)
• SPLIT-%: "%" gets its own tag (F1 81.8, size 9.3K)

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield.
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes.

Annotation | F1   | Size
tag splits | 82.3 | 9.7K
POSS-NP    | 83.1 | 9.8K
SPLIT-VP   | 85.7 | 10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights.
• Solution: mark a property of higher or lower attachment sites:
  • Contains a verb
  • Is (non-)recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation    | F1   | Size
Previous      | 85.7 | 10.5K
BASE-NP       | 86.0 | 11.7K
DOMINATES-V   | 86.9 | 14.1K
RIGHT-REC-NP  | 87.0 | 15.2K

(figure: an NP / VP / PP attachment configuration with the lower site marked v or -v according to whether it dominates a verb)

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers.

Parser              | LP   | LR   | F1
Magerman 95         | 84.9 | 84.6 | 84.7
Collins 96          | 86.3 | 85.8 | 86.0
Klein & Manning 03  | 86.9 | 85.7 | 86.3
Charniak 97         | 87.4 | 87.5 | 87.4
Collins 99          | 88.7 | 88.6 | 88.6

Lexicalised PCFGs

Heads in Context-Free Rules

Heads

Rules to Recover Heads: An Example for NPs

Rules to Recover Heads: An Example for VPs

Adding Headwords to Trees

Lexicalized CFGs in Chomsky Normal Form

Example

(the head tables, headword-annotated trees, and worked example on these slides are not reproduced in this text; a sketch of head finding follows)
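The following Python sketch only illustrates the general head-finding scheme; the priority lists are simplified stand-ins, not Collins' actual head-percolation tables.

# Illustrative head-finding rules (the priority lists are made up for this sketch).
HEAD_RULES = {
    "NP": ("rightmost", ["NN", "NNS", "NNP", "NNPS", "NP"]),
    "VP": ("leftmost",  ["VBD", "VBZ", "VBP", "VB", "VBN", "VBG", "VP"]),
    "S":  ("leftmost",  ["VP", "S"]),
    "PP": ("leftmost",  ["IN", "TO"]),
}

def find_head(parent, children):
    """Return the index of the head child of `parent`, given its children's labels."""
    direction, priorities = HEAD_RULES.get(parent, ("leftmost", []))
    order = range(len(children)) if direction == "leftmost" else range(len(children) - 1, -1, -1)
    for label in priorities:               # scan by priority, then by direction
        for i in order:
            if children[i] == label:
                return i
    return 0 if direction == "leftmost" else len(children) - 1   # fallback

print(find_head("VP", ["VBD", "NP", "PP"]))   # 0: the VBD ("saw") heads the VP
print(find_head("NP", ["DT", "JJ", "NN"]))    # 2: the NN heads the NP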

Lexicalized CKY

(figure: a span [i, j] headed by h is built from Y[h] over [i, k] and Z[h'] over [k, j]; e.g. VP[saw] → VBD[saw] NP[her], or, with the rule itself lexicalized, (VP → VBD NP)[saw])

bestScore(X, i, j, h)
  if (j = i)
    return score(X, s[i])
  else
    return the larger of
      max over k, w, and rules X -> Y Z of
        score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over k, w, and rules X -> Y Z of
        score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs


Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]:
  • Essentially, run the O(n^5) CKY.
  • Remember only a few hypotheses for each span <i, j>.
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?), which keeps things more or less cubic (a small sketch follows).
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed).
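A minimal sketch of per-cell beam pruning in Python; the chart-cell representation is assumed, and the real Collins beam also uses relative-score thresholds and the punctuation rules mentioned above.

import heapq

def prune_cell(cell, K):
    """Keep only the K highest-scoring hypotheses for one span <i, j>.

    cell: dict mapping (label, headword) -> score
    """
    if len(cell) <= K:
        return cell
    kept = heapq.nlargest(K, cell.items(), key=lambda kv: kv[1])
    return dict(kept)

cell = {("NP", "man"): 0.02, ("NP", "the"): 1e-9, ("S", "saw"): 3e-4, ("VP", "saw"): 0.01}
print(prune_cell(cell, 2))   # keeps the NP[man] and VP[saw] hypotheses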

Parameter Estimation

A Model from Charniak (1997)

Other Details

(the generative model and its estimation details are given as equations and figures on the original slides and are not reproduced in this text)

Final Test Set Results

Parser              | LP   | LR   | F1
Magerman 95         | 84.9 | 84.6 | 84.7
Collins 96          | 86.3 | 85.8 | 86.0
Klein & Manning 03  | 86.9 | 85.7 | 86.3
Charniak 97         | 87.4 | 87.5 | 87.4
Collins 99          | 88.7 | 88.6 | 88.6

Analysis / Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve the statistical fit of the grammar:
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering

Manual Splits

• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

(figure: a tree over "He was right" whose nodes carry hidden subcategory variables X1 … X7)

Can learn with EM, like Forward-Backward for HMMs: Forward corresponds to Outside and Backward to Inside.

Automatic Annotation Induction

Label all nodes with latent variables; the same number k of subcategories for all categories.

• Advantages:
  • Automatically learned
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model               | F1
Klein & Manning '03 | 86.3
Matsuzaki et al '05 | 86.7

Refinement of the DT tag

(figure: DT is split into subcategories DT-1, DT-2, DT-3, DT-4)

Hierarchical refinement: repeatedly learn more fine-grained subcategories.
• Start with two subcategories per non-terminal, then keep splitting (a small sketch of the split step follows).
• Initialize each EM run with the output of the last.
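A minimal Python sketch of the split step that this hierarchy relies on. The perturbation constant and the naming scheme are assumptions; the merge step and the EM re-estimation itself are not shown.

import itertools, random

def split_grammar(binary_rules, noise=0.01, seed=0):
    """binary_rules: dict (A, B, C) -> prob. Returns a refined dict over split symbols
    A_0, A_1, etc. Each refined parent keeps roughly the original rule mass: the four
    child combinations per refined parent each start near prob / 4, plus a little
    random noise to break symmetry before EM."""
    rng = random.Random(seed)
    refined = {}
    for (A, B, C), p in binary_rules.items():
        for a, b, c in itertools.product((0, 1), repeat=3):
            perturbed = (p / 4.0) * (1.0 + noise * rng.uniform(-1, 1))
            refined[("%s_%d" % (A, a), "%s_%d" % (B, b), "%s_%d" % (C, c))] = perturbed
    return refined

refined = split_grammar({("S", "NP", "VP"): 0.9})
print(len(refined))   # 8 refined versions of S -> NP VP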


Page 65: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Extended CKY parsing

bull Unaries can be incorporated into the algorithmbull Messy but doesnrsquot increase algorithmic complexity

bull Empties can be incorporatedbull Use fenceposts

bull Doesnrsquot increase complexity essentially like unaries

bull Binarization is vitalbull Without binarization you donrsquot get parsing cubic in the length of the

sentence and in the number of nonterminals in the grammar

bull Binarization may be an explicit transformation or implicit in how the parser works (Early-style dotted rules) but itrsquos always there

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 66: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Recursive Parser

bestScore(Xijs)

if (j == i)

return q(X-gts[i])

else

return max q(X-gtYZ)

bestScore(Yiks)

bestScore(Zk+1js)

kX-gtYZ

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 67: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

function CKY(words grammar) returns [most_probable_parseprob]

score = new double[(words)+1][(words)+1][(nonterms)]

back = new Pair[(words)+1][(words)+1][nonterms]]

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if prob gt score[i][i+1][A]

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

The CKY algorithm (19601965)hellip extended to unaries

for span = 2 to (words)

for begin = 0 to (words)- span

end = begin + span

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

return buildTree(score back)

The CKY algorithm (19601965)hellip extended to unaries

The grammarBinary no epsilons

S NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks 02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

score[0][1]

score[1][2]

score[2][3]

score[3][4]

score[0][2]

score[1][3]

score[2][4]

score[0][3]

score[1][4]

score[0][4]

0

1

2

3

4

1 2 3 4fish people fish tanks

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
  (a word is independent of the rest of the tree given its POS)

A Case of PP Attachment Ambiguity

A Case of Coordination Ambiguity

Structural Preferences: Close Attachment

• Example: John was believed to have been shot by Bill
• The low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing), so the two analyses receive the same probability.

PCFGs and Independence
• The symbols in a PCFG define independence assumptions:
  • At any node, the material inside that node is independent of the material outside that node, given the label of that node.
  • Any information that statistically connects behavior inside and outside a node must flow through that node's label.
  (Figure: a tree with S → NP VP and NP → DT NN, illustrating the inside/outside split at an NP node.)

Non-Independence I
• The independence assumptions of a PCFG are often too strong.
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects); a small corpus-counting sketch follows the table below.

                NP → NP PP   NP → DT NN   NP → PRP
All NPs             11%           9%          6%
NPs under S          9%           9%         21%
NPs under VP        23%           7%          4%
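This kind of non-independence is easy to see in data. The sketch below is an illustration, not part of the lecture: it uses NLTK's bundled 10% sample of the Penn Treebank (it assumes nltk is installed and nltk.download('treebank') has been run), so the percentages it prints will not match the full-WSJ figures above exactly, but the subject/object asymmetry for NP → PRP shows up clearly.

    from collections import Counter
    from nltk.corpus import treebank
    from nltk.tree import ParentedTree, Tree

    counts = Counter()   # (parent category, NP expansion) -> frequency
    for sent in treebank.parsed_sents():
        ptree = ParentedTree.convert(sent)
        for np in ptree.subtrees(lambda t: t.label().split("-")[0] == "NP"):
            parent = np.parent()
            plabel = parent.label().split("-")[0] if parent is not None else "ROOT"
            rhs = " ".join(c.label().split("-")[0] for c in np if isinstance(c, Tree))
            counts[(plabel, rhs)] += 1

    for plabel in ("S", "VP"):
        total = sum(c for (p, _), c in counts.items() if p == plabel)
        print(plabel, "NP -> PRP:", f"{counts[(plabel, 'PRP')] / total:.1%}")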

Non-Independence II
• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong.
  • (In the PTB, this construction is for possessives.)

Advanced Unlexicalized Parsing

Horizontal Markovization
• Horizontal Markovization merges states (a small binarization sketch follows).
(Plots: parsing F1, roughly 70-74, and grammar size in symbols, roughly 3,000-12,000, as a function of horizontal Markov order 0, 1, 2v, 2, inf.)
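To make the idea concrete, here is a tiny Python sketch (with our own illustrative naming scheme for the intermediate symbols, not the lecture's exact notation) of how an n-ary rule is binarized while remembering only the last h sibling symbols; as h shrinks, more of the intermediate states collapse together.

    def binarize(lhs, rhs, h=1):
        # Binarize lhs -> rhs, keeping at most h previous siblings in state names.
        if len(rhs) <= 2:
            return [(lhs, tuple(rhs))]
        rules, prev = [], lhs
        for i in range(len(rhs) - 2):
            seen = rhs[max(0, i + 1 - h):i + 1]        # last h generated siblings
            new = f"@{lhs}->_{'_'.join(seen)}"         # intermediate symbol
            rules.append((prev, (rhs[i], new)))
            prev = new
        rules.append((prev, (rhs[-2], rhs[-1])))
        return rules

    # With h=1 the state after generating NP forgets the earlier VBD;
    # with h=0 all intermediate states of a VP rule merge into "@VP->_".
    print(binarize("VP", ["VBD", "NP", "PP", "PP"], h=1))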

Vertical Markovization
• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation). (Example trees: Order 1 vs. Order 2; a small parent-annotation sketch follows the table.)
(Plots: parsing F1, roughly 72-79, and grammar size in symbols, roughly 5,000-25,000, as a function of vertical Markov order 1, 2v, 2, 3v, 3.)

Model     F1     Size
v=h=2v    77.8   7.5K
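A minimal sketch of order-2 vertical Markovization (parent annotation) over an NLTK-style tree; in practice the grammars discussed here treat POS tags separately (that is the TAG-PA split below), but for illustration every node simply gets its parent's label appended.

    from nltk.tree import Tree

    def parent_annotate(tree, parent="ROOT"):
        if isinstance(tree, str):          # leave the words themselves untouched
            return tree
        label = tree.label()
        return Tree(f"{label}^{parent}", [parent_annotate(c, label) for c in tree])

    t = Tree.fromstring("(S (NP (PRP They)) (VP (VBD saw) (NP (DT the) (NN fish))))")
    print(parent_annotate(t))
    # (S^ROOT (NP^S (PRP^NP They)) (VP^S (VBD^VP saw) (NP^VP (DT^NP the) (NN^NP fish))))

The NP under S becomes NP^S and the NP under VP becomes NP^VP, which is exactly the distinction the expansion statistics above call for.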

Unary Splits
• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used.
• Solution: mark unary rewrite sites with -U.

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits
• Problem: treebank tags are too coarse.
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN.
• Partial solution: subdivide the IN tag.

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits
                                                                      F1     Size
• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")         80.4   8.1K
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")       80.5   8.1K
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)    81.2   8.5K
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]         81.6   9.0K
• SPLIT-CC: separate "but" and "&" from other conjunctions            81.7   9.1K
• SPLIT-%: "%" gets its own tag                                       81.8   9.3K

Yield Splits
• Problem: sometimes the behavior of a category depends on something inside its future yield.
• Examples:
  • Possessive NPs
  • Finite vs. infinitival VPs
  • Lexical heads
• Solution: annotate future elements into nodes (a small sketch of the possessive case follows the table).

Annotation   F1     Size
tag splits   82.3    9.7K
POSS-NP      83.1    9.8K
SPLIT-VP     85.7   10.5K
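As a small illustration of the POSS-NP idea (a sketch under our own labeling convention, not the exact annotation used in the lecture): relabel any NP whose last child is a possessive POS marker, since such NPs expand very differently from ordinary ones.

    from nltk.tree import Tree

    def mark_possessive_nps(tree):
        if isinstance(tree, str):
            return tree
        children = [mark_possessive_nps(c) for c in tree]
        label = tree.label()
        if label == "NP" and children and isinstance(children[-1], Tree) and children[-1].label() == "POS":
            label = "NP-POSS"          # hypothetical label, just for illustration
        return Tree(label, children)

    t = Tree.fromstring("(NP (NP (NNP John) (POS 's)) (NN dog))")
    print(mark_possessive_nps(t))      # (NP (NP-POSS (NNP John) (POS 's)) (NN dog))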

Distance / Recursion Splits
• Problem: vanilla PCFGs cannot distinguish attachment heights.
• Solution: mark a property of higher or lower sites:
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs
(Figure: an NP/VP/PP attachment configuration with the verb-dominating region marked "v" and the other marked "-v".)

Annotation      F1     Size
Previous        85.7   10.5K
BASE-NP         86.0   11.7K
DOMINATES-V     86.9   14.1K
RIGHT-REC-NP    87.0   15.2K

A Fully Annotated Tree

Final Test Set Results
• Beats "first generation" lexicalized parsers.

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Lexicalised PCFGs

Heads in Context-Free Rules

Heads

Rules to Recover Heads: An Example for NPs

Rules to Recover Heads: An Example for VPs
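The head-rule tables themselves did not survive the export, so here is a heavily abridged sketch of the general mechanism: each category has a search direction and a priority list of child categories, and the first match is the head child. The two priority lists below are simplified stand-ins in the spirit of the Collins rules, not the actual tables from these slides.

    # direction to scan the children in, plus an (abridged, illustrative) priority list
    HEAD_RULES = {
        "NP": ("right-to-left", ["NN", "NNS", "NNP", "NNPS", "NP", "CD", "JJ"]),
        "VP": ("left-to-right", ["VBD", "VBZ", "VBP", "VB", "VBN", "VBG", "MD", "VP"]),
    }

    def head_child_index(label, children):
        direction, priority = HEAD_RULES.get(label, ("left-to-right", []))
        order = range(len(children)) if direction == "left-to-right" else range(len(children) - 1, -1, -1)
        for cat in priority:               # first priority category found wins
            for i in order:
                if children[i] == cat:
                    return i
        return 0 if direction == "left-to-right" else len(children) - 1

    print(head_child_index("NP", ["DT", "JJ", "NN"]))    # 2: the NN heads the NP
    print(head_child_index("VP", ["VBD", "NP", "PP"]))   # 0: the VBD heads the VP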

Adding Headwords to Trees

Lexicalized CFGs in Chomsky Normal Form

Example

Lexicalized CKY

(Chart item: X[h] over span [i, j] is built from Y[h] over [i, k] and Z[h'] over [k, j]; e.g., (VP → VBD[saw] NP[her]) gives (VP → VBD NP)[saw].)

bestScore(X, i, j, h)
  if (j = i)
    return score(X, s[i])
  else
    return max of
      max over split points k and head words w, for rules X → Y Z:
        score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      max over split points k and head words w, for rules X → Y Z:
        score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

Parsing with Lexicalized CFGs

Pruning with Beams
• The Collins parser prunes with per-cell beams [Collins 99]:
  • Essentially, run the O(n^5) CKY.
  • Remember only a few hypotheses for each span <i, j>.
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?).
  • Keeps things more or less cubic.
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed).
(A small per-cell beam sketch follows.)
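A per-cell beam is simple to state in code; this sketch is illustrative only (not the Collins parser's actual data structures) and keeps just the K highest-scoring labelled hypotheses for a span before the parser moves on.

    def prune_cell(cell, K=10):
        # cell maps a (possibly lexicalized) label to its best score for this span
        kept = sorted(cell.items(), key=lambda kv: kv[1], reverse=True)[:K]
        return dict(kept)

    print(prune_cell({"NP": 0.0049, "VP": 0.105, "S": 0.0105}, K=2))
    # {'VP': 0.105, 'S': 0.0105}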

Parameter Estimation

A Model from Charniak (1997)

Other Details

Final Test Set Results

Parser               LP     LR     F1
Magerman 95          84.9   84.6   84.7
Collins 96           86.3   85.8   86.0
Klein & Manning 03   86.9   85.7   86.3
Charniak 97          87.4   87.5   87.4
Collins 99           88.7   88.6   88.6

Analysis/Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

The Game of Designing a Grammar
• Annotation refines base treebank symbols to improve the statistical fit of the grammar:
  • Parent annotation [Johnson '98]
  • Head lexicalization [Collins '99, Charniak '00]
  • Automatic clustering

Manual Splits
• Manually split categories:
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages:
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages:
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories
(Example tree: hidden subcategory variables X1 ... X7 over the bracketing of "He was right".)
Can learn with EM, like Forward-Backward for HMMs: Forward/Outside and Backward/Inside passes.

Automatic Annotation Induction
• Label all nodes with latent variables; the same number k of subcategories for all categories.
• Advantages:
  • Automatically learned
• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag: DT is split into DT-1, DT-2, DT-3, DT-4.

Hierarchical refinement: repeatedly learn more fine-grained subcategories of DT:
• start with two subcategories per non-terminal, then keep splitting
• initialize each EM run with the output of the last

Adaptive Splitting [Petrov et al. 06]
• Want to split complex categories more.
• Idea: split everything, then roll back the splits which were least useful.
• Evaluate the loss in likelihood from removing each split as the ratio of the data likelihood with the split reversed to the data likelihood with the split.
• No loss in accuracy when 50% of the splits are reversed.

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03            86.3              85.7
Matsuzaki et al. '05           86.7              86.1
Collins '99                    88.6              88.2
Charniak & Johnson '05         90.1              89.6
Petrov et al. 06               90.2              89.7

Hierarchical Pruning
• Parse multiple times with grammars at different levels of granularity:
  coarse:          ... QP  NP  VP ...
  split in two:    ... QP1 QP2 NP1 NP2 VP1 VP2 ...
  split in four:   ... QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 ...
  split in eight:  ...

Bracket Posteriors
(Coarse-to-fine parsing times: 1621 min → 111 min → 35 min → 15 min, at 91.2 F1 with no search error.)

Page 71: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for i=0 ilt(words) i++

for A in nonterms

if A -gt words[i] in grammar

score[i][i+1][A] = P(A -gt words[i])

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

For longer spans, try every split point (grammar as above):

for split = begin+1 to end-1
  for A, B, C in nonterms
    prob = score[begin][split][B] * score[split][end][C] * P(A->B C)
    if prob > score[begin][end][A]
      score[begin][end][A] = prob
      back[begin][end][A] = new Triple(split, B, C)

New span-3 cell:
[0,3] fish people fish: NP → NP NP 0.0000686, VP → V NP 0.00147, S → NP VP 0.000882

Same loop for the other span-3 cell (grammar as above):

[1,4] people fish tanks: NP → NP NP 0.0000686, VP → V NP 0.000098, S → NP VP 0.01323

And finally the full span (grammar as above):

[0,4] fish people fish tanks: NP → NP NP 0.0000009604, VP → V NP 0.00002058, S → NP VP 0.00018522

Once the chart is complete, call buildTree(score, back) to recover the best parse by following the backpointers from S over [0,4].
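
To tie the pseudocode fragments above together, here is a compact runnable sketch of the whole probabilistic CKY loop in Python. The grammar and sentence are the toy ones from the chart above; the data-structure layout and helper names are my own, not from the original pseudocode. It interleaves the binary step and the unary closure in the same way, and on this grammar it should reproduce the chart values shown (e.g. S with probability 0.00018522 over the full span).

from collections import defaultdict

# Toy grammar from the example: (parent, children) -> probability
binary = {('S', ('NP', 'VP')): 0.9, ('VP', ('V', 'NP')): 0.5,
          ('VP', ('V', 'VP_V')): 0.3, ('VP', ('V', 'PP')): 0.1,
          ('VP_V', ('NP', 'PP')): 1.0, ('NP', ('NP', 'NP')): 0.1,
          ('NP', ('NP', 'PP')): 0.2, ('PP', ('P', 'NP')): 1.0}
unary = {('S', ('VP',)): 0.1, ('VP', ('V',)): 0.1, ('NP', ('N',)): 0.7}
lexicon = {('N', 'people'): 0.5, ('N', 'fish'): 0.2, ('N', 'tanks'): 0.2,
           ('N', 'rods'): 0.1, ('V', 'people'): 0.1, ('V', 'fish'): 0.6,
           ('V', 'tanks'): 0.3, ('P', 'with'): 1.0}

def cky(words):
    n = len(words)
    score = defaultdict(float)   # (begin, end, label) -> best probability
    back = {}                    # (begin, end, label) -> backpointer

    def close_unaries(b, e):     # the "handle unaries" loop from the slides
        added = True
        while added:
            added = False
            for (a, (child,)), p in unary.items():
                prob = p * score[b, e, child]
                if prob > score[b, e, a]:
                    score[b, e, a] = prob
                    back[b, e, a] = child
                    added = True

    for i, w in enumerate(words):            # lexical rules on the diagonal
        for (a, word), p in lexicon.items():
            if word == w:
                score[i, i + 1, a] = p
        close_unaries(i, i + 1)

    for span in range(2, n + 1):             # longer spans, every split point
        for begin in range(0, n - span + 1):
            end = begin + span
            for split in range(begin + 1, end):
                for (a, (b_sym, c_sym)), p in binary.items():
                    prob = score[begin, split, b_sym] * score[split, end, c_sym] * p
                    if prob > score[begin, end, a]:
                        score[begin, end, a] = prob
                        back[begin, end, a] = (split, b_sym, c_sym)
            close_unaries(begin, end)
    return score, back

score, back = cky("fish people fish tanks".split())
print(score[0, 4, 'S'])   # expected: 0.00018522 (up to floating-point noise)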

Evaluating constituency parsing

Gold standard brackets: S-(0:11), NP-(0:2), VP-(2:9), VP-(3:9), NP-(4:6), PP-(6:9), NP-(7:9), NP-(9:10)
Candidate brackets:     S-(0:11), NP-(0:2), VP-(2:10), VP-(3:10), NP-(4:6), PP-(6:10), NP-(7:10)

Labeled Precision: 3/7 = 42.9%
Labeled Recall:    3/8 = 37.5%
LP/LR F1:          40.0%
Tagging Accuracy:  11/11 = 100.0%
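
As a check on the arithmetic above, here is a small sketch in Python that scores labeled brackets; the bracket sets are the ones from this example and the function name is mine.

def bracket_prf(gold, cand):
    # gold, cand: sets of (label, start, end) brackets
    matched = len(gold & cand)
    p = matched / len(cand)
    r = matched / len(gold)
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1

gold = {('S', 0, 11), ('NP', 0, 2), ('VP', 2, 9), ('VP', 3, 9),
        ('NP', 4, 6), ('PP', 6, 9), ('NP', 7, 9), ('NP', 9, 10)}
cand = {('S', 0, 11), ('NP', 0, 2), ('VP', 2, 10), ('VP', 3, 10),
        ('NP', 4, 6), ('PP', 6, 10), ('NP', 7, 10)}

p, r, f1 = bracket_prf(gold, cand)
print(f"LP={p:.1%}  LR={r:.1%}  F1={f1:.1%}")   # LP=42.9%  LR=37.5%  F1=40.0%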

How good are PCFGs?

• Penn WSJ parsing accuracy: about 73% LP/LR F1
• Robust
  • Usually admit everything, but with low probability
• Partial solution for grammar ambiguity
  • A PCFG gives some idea of the plausibility of a parse
  • But not so good, because the independence assumptions are too strong
• Give a probabilistic language model
  • But in the simple case it performs worse than a trigram model
• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

Weaknesses
• Lack of sensitivity to structural frequencies
• Lack of sensitivity to lexical information
• (A word is independent of the rest of the tree given its POS)

A Case of PP Attachment Ambiguity

A Case of Coordination Ambiguity

Structural Preferences: Close Attachment

Structural Preferences: Close Attachment
• Example: John was believed to have been shot by Bill
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing)
• The two analyses therefore receive the same probability
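
The tie can be stated directly: under a PCFG, a tree's probability depends only on how many times each rule is used, so in standard notation

P(t) = \prod_{r \in R} P(r)^{\mathrm{count}(r,\, t)}

and since the two attachment analyses use exactly the same multiset of rules, P(t_low) = P(t_high); nothing in the model can prefer one over the other.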

PCFGs and Independence
• The symbols in a PCFG define independence assumptions
• At any node, the material inside that node is independent of the material outside that node, given the label of that node
• Any information that statistically connects behavior inside and outside a node must flow through that node's label
(figure: an S → NP VP tree with the subject NP expanded by NP → DT NN)

Non-Independence I
• The independence assumptions of a PCFG are often too strong
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e. subjects vs. objects)

Expansion    All NPs    NPs under S    NPs under VP
NP PP        11%        9%             23%
DT NN        9%         9%             7%
PRP          6%         21%            4%

Non-Independence II
• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong
(example figure not preserved; caption: in the PTB this construction is for possessives)

Advanced Unlexicalized Parsing

Horizontal Markovization
• Horizontal Markovization merges states

(two plots, not preserved: parsing F1 (roughly 70-74) and number of grammar symbols (0-12,000), each plotted against horizontal Markov order 0, 1, 2v, 2, inf)

Vertical Markovization
• Vertical Markov order: rewrites depend on past k ancestor nodes
  (i.e. parent annotation)
• Order 1 vs. Order 2 (example trees not preserved)

(two plots, not preserved: parsing F1 (roughly 72-79) and number of grammar symbols (0-25,000), each plotted against vertical Markov order 1, 2v, 2, 3v, 3)

Model     F1     Size
v=h=2v    77.8   7.5K
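
To make the two kinds of Markovization concrete, here is a small illustrative Python sketch using a toy tree representation of my own (not the tree format used in the experiments). Vertical order 2 is plain parent annotation; horizontal order 1 means the intermediate symbols introduced during binarization remember only the previous sibling, which is what merges states.

def parent_annotate(tree, parent=None):
    # Vertical Markov order 2: every nonterminal also records its parent.
    if isinstance(tree, str):                     # a word
        return tree
    label, kids = tree
    new_label = label if parent is None else f"{label}^{parent}"
    return (new_label, [parent_annotate(k, label) for k in kids])

print(parent_annotate(("S", [("NP", ["he"]), ("VP", [("V", ["fish"])])])))
# ('S', [('NP^S', ['he']), ('VP^S', [('V^VP', ['fish'])])])

def binarize(label, children, horz=1):
    # Right-factor an n-ary rule into binary rules; each intermediate symbol
    # remembers only the last `horz` siblings already generated.
    if len(children) <= 2:
        return [(label, tuple(children))]
    rules = []
    prev = label
    generated = []
    for child in children[:-2]:
        generated.append(child)
        memory = generated[-horz:] if horz else []
        intermediate = f"@{label}|<{'-'.join(memory)}>"
        rules.append((prev, (child, intermediate)))
        prev = intermediate
    rules.append((prev, tuple(children[-2:])))
    return rules

for rule in binarize("NP", ["DT", "JJ", "JJ", "NN"], horz=1):
    print(rule)
# ('NP', ('DT', '@NP|<DT>'))
# ('@NP|<DT>', ('JJ', '@NP|<JJ>'))
# ('@NP|<JJ>', ('JJ', 'NN'))

With horz=2 the second intermediate symbol would instead be @NP|<DT-JJ>, so fewer states are shared; with horz=0 all intermediate symbols collapse to @NP|<>, the most aggressive merging.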

Unary Splits
• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used
• Solution: mark unary rewrite sites with -U

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits
• Problem: Treebank tags are too coarse
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after) and true prepositions (in, of, to) are all tagged IN
• Partial Solution:
  • Subdivide the IN tag

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits
• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

(cumulative results, one row per split in the order listed above)
F1     Size
80.4   8.1K
80.5   8.1K
81.2   8.5K
81.6   9.0K
81.7   9.1K
81.8   9.3K

Yield Splits
• Problem: sometimes the behavior of a category depends on something inside its future yield
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance / Recursion Splits
• Problem: vanilla PCFGs cannot distinguish attachment heights
• Solution: mark a property of higher or lower sites:
  • Contains a verb
  • Is (non)-recursive
  • Base NPs [cf. Collins 99]
  • Right-recursive NPs

Annotation     F1     Size
Previous       85.7   10.5K
BASE-NP        86.0   11.7K
DOMINATES-V    86.9   14.1K
RIGHT-REC-NP   87.0   15.2K

(figure not preserved: attachment example with NP, VP, PP nodes marked v / -v for whether they dominate a verb)

A Fully Annotated Tree

Final Test Set Results
• Beats "first generation" lexicalized parsers

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Lexicalised PCFGs

Heads in Context-Free Rules

Heads

Rules to Recover Heads: An Example for NPs

Rules to Recover Heads: An Example for VPs

Adding Headwords to Trees

Adding Headwords to Trees
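
The actual head tables are in the figures on these slides (and in Collins' thesis); purely as an illustration of how they are applied, here is a hedged Python sketch with simplified, made-up priority lists. Each category scans its children in a direction, takes the first child whose label is on its priority list, and percolates that child's head word upward.

# Simplified stand-in head rules, NOT the real Collins (1999) tables:
# category -> (scan direction, priority list of child labels)
HEAD_RULES = {
    "NP": ("right", ["NN", "NNS", "NNP", "NP", "JJ"]),
    "VP": ("left",  ["VBD", "VBZ", "VBP", "VB", "VP"]),
    "PP": ("left",  ["IN", "TO"]),
    "S":  ("left",  ["VP", "S"]),
}

def find_head(parent, child_labels):
    direction, priorities = HEAD_RULES.get(parent, ("left", []))
    order = list(range(len(child_labels)))
    if direction == "right":
        order.reverse()
    for wanted in priorities:          # priority list first, then position
        for i in order:
            if child_labels[i] == wanted:
                return i
    return order[0]                    # fallback: leftmost (or rightmost) child

def lexicalize(tree):
    # tree = (label, [children]) or (tag, word) at preterminals;
    # returns (label, headword, annotated children)
    label, kids = tree
    if isinstance(kids, str):          # preterminal: the word is its own head
        return (label, kids, kids)
    annotated = [lexicalize(k) for k in kids]
    h = find_head(label, [k[0] for k in annotated])
    return (label, annotated[h][1], annotated)

t = ("S", [("NP", [("DT", "the"), ("NN", "boy")]),
           ("VP", [("VBD", "saw"), ("NP", [("PRP", "her")])])])
print(lexicalize(t)[1])   # 'saw'  (the S node is headed by the VP's verb)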

Lexicalized CFGs in Chomsky Normal Form

Example

Lexicalized CKY

(figure: X[h] spanning [i, j] built from Y[h] over [i, k] and Z[h'] over [k, j]; e.g. (VP → VBD[saw] NP[her]) yields (VP → VBD NP)[saw])

bestScore(X, i, j, h)
  if (j = i)
    return score(X, s[i])
  else
    return the max, over split points k and rules X -> Y Z, of
      score(X[h] -> Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)
      score(X[h] -> Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)

(here w ranges over the candidate head words of the non-head child)

Parsing with Lexicalized CFGs

Pruning with Beams
• The Collins parser prunes with per-cell beams [Collins 99]
  • Essentially run the O(n^5) CKY
  • Remember only a few hypotheses for each span <i, j>
  • If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)
  • Keeps things more or less cubic
• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)
(figure: X[h] over [i, j] built from Y[h] over [i, k] and Z[h'] over [k, j])
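
A small sketch of the per-cell beam itself (the data layout and names are mine): after a span is filled, keep only its K best labeled hypotheses before it is used to build larger spans. Keeping K hypotheses per child cell is what bounds the work per span at O(nK^2).

import heapq

def prune_cell(cell, K=10):
    # cell: dict mapping (label, headword) -> best score for this span.
    # Keep only the K highest-scoring hypotheses (a per-cell beam).
    if len(cell) <= K:
        return cell
    return dict(heapq.nlargest(K, cell.items(), key=lambda kv: kv[1]))

# Example: a crowded span shrinks to its 2 best hypotheses.
span_cell = {("NP", "boy"): 1e-4, ("NP", "the"): 1e-7,
             ("S", "saw"): 3e-5, ("VP", "saw"): 2e-9}
print(prune_cell(span_cell, K=2))   # {('NP', 'boy'): 0.0001, ('S', 'saw'): 3e-05}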

Parameter Estimation

A Model from Charniak (1997)

A Model from Charniak (1997)

Other Details

Final Test Set Results

Parser                LP     LR     F1
Magerman 95           84.9   84.6   84.7
Collins 96            86.3   85.8   86.0
Klein & Manning 03    86.9   85.7   86.3
Charniak 97           87.4   87.5   87.4
Collins 99            88.7   88.6   88.6

Analysis/Evaluation (Method 2)

Dependency Accuracies

Strengths and Weaknesses of Modern Parsers

Modern Parsers

The Game of Designing a Grammar

Annotation refines base treebank symbols to improve statistical fit of the grammar:
• Parent annotation [Johnson '98]
• Head lexicalization [Collins '99, Charniak '00]
• Automatic clustering

Manual Splits
• Manually split categories
  • NP: subject vs. object
  • DT: determiners vs. demonstratives
  • IN: sentential vs. prepositional
• Advantages
  • Fairly compact grammar
  • Linguistic motivations
• Disadvantages
  • Performance leveled out
  • Manually annotated

Learning Latent Annotations

Latent Annotations:
• Brackets are known
• Base categories are known
• Hidden variables for subcategories

(figure: the tree for "He was right" with latent subcategory variables X1 ... X7 at its nodes)

Can learn with EM, like Forward-Backward for HMMs (Forward ~ Outside, Backward ~ Inside)
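
For reference, the Inside/Outside quantities alluded to here are the tree analogues of Backward/Forward. In standard PCFG notation (my notation; for latent annotations the same recursions run over the subcategory symbols), with spans written between fence-posts i and j:

\beta_A(i, j) = P(A \Rightarrow^* w_{i+1} \cdots w_j) = \sum_{A \to B\,C} \sum_{k=i+1}^{j-1} P(A \to B\,C)\, \beta_B(i, k)\, \beta_C(k, j), \qquad \beta_A(i, i+1) = P(A \to w_{i+1})

\alpha_A(i, j) = \sum_{B \to C\,A} \sum_{k < i} P(B \to C\,A)\, \beta_C(k, i)\, \alpha_B(k, j) \;+\; \sum_{B \to A\,C} \sum_{k > j} P(B \to A\,C)\, \beta_C(j, k)\, \alpha_B(i, k), \qquad \alpha_S(0, n) = 1

The E-step's expected rule counts are proportional to \alpha_A(i,j)\, P(A \to B\,C)\, \beta_B(i,k)\, \beta_C(k,j), summed over spans and split points and normalized by the sentence probability \beta_S(0,n).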

Automatic Annotation Induction
• Advantages
  • Automatically learned:
    label all nodes with latent variables,
    same number k of subcategories for all categories
• Disadvantages
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag
(figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4)

Hierarchical refinement: repeatedly learn more fine-grained subcategories
• start with two (per non-terminal), then keep splitting
• initialize each EM run with the output of the last

Adaptive Splitting
• Want to split complex categories more
• Idea: split everything, roll back the splits which were least useful
[Petrov et al. 06]

Adaptive Splitting
• Evaluate the loss in likelihood from removing each split: the data likelihood with the split reversed, relative to the data likelihood with the split (see the formula below)
• No loss in accuracy when 50% of the splits are reversed
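
Written out, the quantity in the bullet above is a likelihood ratio for undoing one split:

\Delta_{\text{split}} \;=\; \frac{P(\text{data} \mid \text{grammar with the split reversed})}{P(\text{data} \mid \text{grammar with the split})}

Splits whose ratio is close to 1 contribute almost nothing, and merging back the least useful half of them costs no accuracy, as the next slide's numbers show.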

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03       86.3              85.7
Matsuzaki et al. '05      86.7              86.1
Collins '99               88.6              88.2
Charniak & Johnson '05    90.1              89.6
Petrov et al. 06          90.2              89.7

Hierarchical Pruning

coarse:          … QP NP VP …
split in two:    … QP1 QP2 NP1 NP2 VP1 VP2 …
split in four:   … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …
split in eight:  … (and so on) …

Parse multiple times with grammars at different levels of granularity; a sketch of the pruning step follows.
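
A small hedged sketch of the pruning step that connects the passes (symbol names, the projection function, and the threshold are illustrative assumptions, not the actual parser's code): an item of the finer grammar is only allowed if its projection into the previous, coarser grammar had a bracket posterior above a threshold.

def prune_with_coarse_posteriors(fine_items, coarse_posteriors, project, threshold=1e-4):
    # fine_items: (begin, end, fine_symbol) items the finer grammar could build
    # coarse_posteriors: {(begin, end, coarse_symbol): posterior from the coarser pass}
    # project: maps a fine symbol (e.g. "NP3") to its coarser symbol (e.g. "NP")
    return [(b, e, s) for (b, e, s) in fine_items
            if coarse_posteriors.get((b, e, project(s)), 0.0) >= threshold]

# Toy example: NP3 over span (0, 2) survives only because its projection NP did.
coarse = {(0, 2, "NP"): 0.8, (1, 3, "VP"): 1e-7}
fine = [(0, 2, "NP3"), (1, 3, "VP2")]
print(prune_with_coarse_posteriors(fine, coarse, project=lambda s: s.rstrip("0123456789")))
# [(0, 2, 'NP3')]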

Bracket Posteriors

Parsing times: 1621 min → 111 min → 35 min → 15 min [91.2 F1] (no search error)

Page 72: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06

N people 05V people 01

N fish 02V fish 06

N tanks 02V tanks 01

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

if score[i][i+1][B] gt 0 ampamp A-gtB in grammar

prob = P(A-gtB)score[i][i+1][B]

if(prob gt score[i][i+1][A])

score[i][i+1][A] = prob

back[i][i+1][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 73: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if (prob gt score[begin][end][A])

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 74: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S NP VP000126

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S NP VP000378

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

handle unaries

boolean added = true

while added

added = false

for A B in nonterms

prob = P(A-gtB)score[begin][end][B]

if prob gt score[begin][end][A]

score[begin][end][A] = prob

back[begin][end][A] = B

added = true

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 75: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs?

• Penn WSJ parsing accuracy: about 73% LP/LR F1

• Robust

• Usually admit everything, but with low probability

• Partial solution for grammar ambiguity

• A PCFG gives some idea of the plausibility of a parse

• But not so good, because the independence assumptions are too strong

• Gives a probabilistic language model

• But in the simple case it performs worse than a trigram model

• The problem seems to be that PCFGs lack the lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences: Close Attachment

93

Structural Preferences: Close Attachment

bull Example John was believed to have been shot by Bill

• The low attachment analysis (Bill does the shooting) contains the same rules as the high attachment analysis (Bill does the believing)

• The two analyses therefore receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

[Figure: a tree with an NP node under S and VP; the rules S → NP VP and NP → DT NN apply above and inside it, and only the NP label links the material inside the node to the material outside.]

Non-Independence I

• The independence assumptions of a PCFG are often too strong

• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects)

[Figure: bar charts of how NPs expand (NP PP vs. DT NN vs. PRP) for all NPs, for NPs under S, and for NPs under VP; subject NPs (under S) are far more often pronouns, while NPs under VP far more often expand with a PP.]

Non-Independence II

• Symptoms of overly strong assumptions:

• Rewrites get used where they don't belong

• (In the PTB, this construction is used for possessives)

Advanced Unlexicalized Parsing

99

Horizontal Markovization

• Horizontal Markovization merges states

[Figure: parsing accuracy (roughly 70–74) and number of grammar symbols (roughly 0–12,000) plotted against horizontal Markov order (0, 1, 2v, 2, ∞).]

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation)

[Figure: example trees for order 1 vs. order 2 (parent-annotated), plus plots of parsing accuracy (roughly 72–79) and number of grammar symbols (0–25,000) against vertical Markov order (1, 2v, 2, 3v, 3).]

Model F1 Size

v=h=2v 77.8 7.5K
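To make parent annotation concrete, here is a small sketch (the tuple tree encoding and the function name are illustrative assumptions, not from the slides); in practice POS tags are often handled separately (cf. TAG-PA below).

# Vertical Markov order 2: re-label every non-terminal with its parent's label.
def parent_annotate(tree, parent=None):
    label, *children = tree                      # tree = (label, child1, child2, ...)
    new_label = f"{label}^{parent}" if parent is not None else label
    new_children = [child if isinstance(child, str) else parent_annotate(child, label)
                    for child in children]
    return (new_label, *new_children)

tree = ("S", ("NP", ("PRP", "He")), ("VP", ("VBD", "was"), ("ADJP", ("JJ", "right"))))
print(parent_annotate(tree))
# ('S', ('NP^S', ('PRP^NP', 'He')), ('VP^S', ('VBD^VP', 'was'), ('ADJP^VP', ('JJ^ADJP', 'right'))))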

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 77.8 7.5K

UNARY 78.3 8.0K

Solution: mark unary rewrite sites with -U

Tag Splits

• Problem: Treebank tags are too coarse

• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN

• Partial solution: subdivide the IN tag

Annotation F1 Size

Previous 78.3 8.0K

SPLIT-IN 80.3 8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")

• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")

• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)

• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]

• SPLIT-CC: separate "but" and "&" from other conjunctions

• SPLIT-%: "%" gets its own tag

Annotation F1 Size

UNARY-DT 80.4 8.1K

UNARY-RB 80.5 8.1K

TAG-PA 81.2 8.5K

SPLIT-AUX 81.6 9.0K

SPLIT-CC 81.7 9.1K

SPLIT-% 81.8 9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield

• Examples:

• Possessive NPs

• Finite vs. infinite VPs

• Lexical heads

• Solution: annotate future elements into nodes

Annotation F1 Size

Tag splits 82.3 9.7K

POSS-NP 83.1 9.8K

SPLIT-VP 85.7 10.5K

Distance / Recursion Splits

• Problem: vanilla PCFGs cannot distinguish attachment heights

• Solution: mark a property of higher or lower sites:

• Contains a verb

• Is (non)-recursive

• Base NPs [cf. Collins 99]

• Right-recursive NPs

Annotation F1 Size

Previous 85.7 10.5K

BASE-NP 86.0 11.7K

DOMINATES-V 86.9 14.1K

RIGHT-REC-NP 87.0 15.2K

[Figure: tree fragment with NP, VP, and PP attachment sites marked v / -v according to whether they dominate a verb.]

A Fully Annotated Tree

Final Test Set Results

• Beats "first generation" lexicalized parsers

Parser LP LR F1

Magerman 95 84.9 84.6 84.7

Collins 96 86.3 85.8 86.0

Klein & Manning 03 86.9 85.7 86.3

Charniak 97 87.4 87.5 87.4

Collins 99 88.7 88.6 88.6

Lexicalised PCFGs

109

Heads in Context-Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113
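The head rules for NPs and VPs are table lookups; as a sketch of the mechanism: for each parent category, scan the children for the first label on a priority list, in a fixed direction. The lists below are simplified illustrative fragments, not the full Collins (1999) table.

HEAD_RULES = {
    # parent: (scan direction, labels in priority order) -- illustrative fragments only
    "NP": ("right-to-left", ["NN", "NNS", "NNP", "NNPS", "NP", "JJ"]),
    "VP": ("left-to-right", ["VBD", "VBZ", "VBP", "VB", "VBN", "VBG", "VP"]),
    "PP": ("left-to-right", ["IN", "TO"]),
    "S":  ("left-to-right", ["VP", "S"]),
}

def find_head_child(parent, children):
    """Return the index of the head child, given the list of child labels."""
    direction, priorities = HEAD_RULES.get(parent, ("left-to-right", []))
    order = list(range(len(children)))
    if direction == "right-to-left":
        order.reverse()
    for label in priorities:                    # first priority label found wins
        for i in order:
            if children[i] == label:
                return i
    return order[0]                             # default: first child in scan order

print(find_head_child("NP", ["DT", "JJ", "NN"]))    # 2  (the NN heads the NP)
print(find_head_child("VP", ["VBD", "NP", "PP"]))   # 0  (the VBD heads the VP)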

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

[Diagram: X[h] over span (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j), where h and h' are the head positions.]

(VP → VBD[saw] NP[her])    (VP → VBD NP)[saw]

bestScore(X, i, j, h)

  if (j == i)

    return score(X → s[i])

  else

    return the max, over split points k, other head words w, and rules X → Y Z, of:

      score(X[h] → Y[h] Z[w]) * bestScore(Y, i, k, h) * bestScore(Z, k, j, w)    (head from the left child)

      score(X[h] → Y[w] Z[h]) * bestScore(Y, i, k, w) * bestScore(Z, k, j, h)    (head from the right child)

Parsing with Lexicalized CFGs

119

Pruning with Beams

• The Collins parser prunes with per-cell beams [Collins 99]

• Essentially, run the O(n^5) CKY

• Remember only a few hypotheses for each span <i, j>

• If we keep K hypotheses at each span, then we do at most O(nK^2) work per span (why?)

• Keeps things more or less cubic

• Also, certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

A small sketch of the per-cell beam idea follows.
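This is illustrative only; the actual Collins parser also tracks head positions and applies the punctuation constraints mentioned above.

import heapq

K = 10  # beam size per span

def prune_cell(hypotheses, k=K):
    """hypotheses: dict (label, head) -> score; keep only the k best entries."""
    return dict(heapq.nlargest(k, hypotheses.items(), key=lambda item: item[1]))

def combine(left_cell, right_cell, rule_scores):
    """Combine two pruned cells: at most K * K candidate pairs per split point."""
    out = {}
    for (ylabel, yhead), yscore in left_cell.items():
        for (zlabel, zhead), zscore in right_cell.items():
            # rule_scores yields (parent label, chosen head, rule probability) triples
            for xlabel, head, p in rule_scores(ylabel, yhead, zlabel, zhead):
                cand = p * yscore * zscore
                if cand > out.get((xlabel, head), 0.0):
                    out[(xlabel, head)] = cand
    return prune_cell(out)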

[Diagram: X[h] over span (i, j) built from Y[h] over (i, k) and Z[h'] over (k, j).]

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 84.9 84.6 84.7

Collins 96 86.3 85.8 86.0

Klein & Manning 03 86.9 85.7 86.3

Charniak 97 87.4 87.5 87.4

Collins 99 88.7 88.6 88.6

Analysis / Evaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to improve the statistical fit of the grammar:

Parent annotation [Johnson '98]

Head lexicalization [Collins '99, Charniak '00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

• Manually split categories:

• NP: subject vs. object

• DT: determiners vs. demonstratives

• IN: sentential vs. prepositional

• Advantages:

• Fairly compact grammar

• Linguistic motivations

• Disadvantages:

• Performance leveled out

• Manually annotated

Learning Latent Annotations

• Brackets are known

• Base categories are known

• Hidden variables for subcategories

[Figure: the parse of "He was right" with each node carrying a latent subcategory variable X1 ... X7.]

• Can learn with EM, like Forward-Backward for HMMs (here Forward/Backward become Outside/Inside)

Automatic Annotation Induction

bull Advantages

bull Automatically learned

• Label all nodes with latent variables

• Same number k of subcategories for all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein & Manning '03 86.3

Matsuzaki et al. '05 86.7

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea: split everything, then roll back the splits which were least useful

[Petrov et al 06]

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:

loss(split) = (data likelihood with the split reversed) / (data likelihood with the split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 88.4

With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser F1 (≤ 40 words) F1 (all words)

Klein & Manning '03 86.3 85.7

Matsuzaki et al. '05 86.7 86.1

Collins '99 88.6 88.2

Charniak & Johnson '05 90.1 89.6

Petrov et al. 06 90.2 89.7

Hierarchical Pruning

coarse: … QP NP VP …

split in two: … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four: … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight: … (and so on) …

Parse multiple times with grammars at different levels of granularity
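One way to picture the coarse-to-fine idea in code (a conceptual sketch with hypothetical data structures and threshold, not the actual Berkeley-parser implementation): items that the coarser grammar finds implausible are never built with the finer grammar.

THRESHOLD = 1e-4

def coarse_symbol(fine_label):
    # map a fine symbol to its coarser "parent" symbol, e.g. NP3 -> NP
    return fine_label.rstrip("0123456789")

def allowed(span, fine_label, coarse_posteriors):
    """coarse_posteriors: dict (span, coarse_label) -> posterior from the previous pass."""
    return coarse_posteriors.get((span, coarse_symbol(fine_label)), 0.0) >= THRESHOLD

# During the fine pass, CKY simply skips any (span, label) for which allowed(...) is
# False, so each successively split grammar only explores items the coarser grammar
# found plausible.
coarse_posteriors = {((0, 2), "NP"): 0.7, ((0, 2), "VP"): 2e-6}
print(allowed((0, 2), "NP3", coarse_posteriors))  # True
print(allowed((0, 2), "VP1", coarse_posteriors))  # False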

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)

Page 76: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 77: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

for split = begin+1 to end-1

for ABC in nonterms

prob=score[begin][split][B]score[split][end][C]P(A-gtBC)

if prob gt score[begin][end][A]

score[begin]end][A] = prob

back[begin][end][A] = new Triple(splitBC)

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 78: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

N fish 02V fish 06NP N 014VP V 006S VP 0006

N people 05V people 01NP N 035VP V 001S VP 0001

N fish 02V fish 06NP N 014VP V 006S VP 0006

N tanks 02V tanks 01NP N 014VP V 003S VP 0003

NP NP NP00049

VP V NP0105

S VP00105

NP NP NP00049

VP V NP0007

S NP VP00189

NP NP NP000196

VP V NP0042

S VP00042

NP NP NP00000686

VP V NP000147

S NP VP0000882

NP NP NP00000686

VP V NP0000098

S NP VP001323

NP NP NP

00000009604VP V NP

000002058S NP VP

000018522

0

1

2

3

4

1 2 3 4fish people fish tanksS NP VP 09

S VP 01

VP V NP 05

VP V 01

VP V VP_V 03

VP V PP 01

VP_V NP PP 10

NP NP NP 01

NP NP PP 02

NP N 07

PP P NP 10

N people 05

N fish 02

N tanks02

N rods 01

V people 01

V fish 06

V tanks 03

P with 10

Call buildTree(score back) to get the best parse

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 79: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Evaluating constituency parsing

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 80: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Evaluating constituency parsing

Gold standard brackets S-(011) NP-(02) VP-(29) VP-(39) NP-(46) PP-(6-9) NP-(79) NP-(910)

Candidate brackets S-(011) NP-(02) VP-(210) VP-(310) NP-(46) PP-(6-10) NP-(710)

Labeled Precision 37 = 429

Labeled Recall 38 = 375

LPLR F1 400

Tagging Accuracy 1111 = 1000

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea: split everything, roll back the splits which were least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate the loss in likelihood from removing each split:

loss(split) = (data likelihood with the split reversed) / (data likelihood with the split)

bull No loss in accuracy when 50% of the splits are reversed
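Put together with the hierarchical refinement above, one round of a split-merge procedure might look like the following sketch; split_in_two, run_em, likelihood_loss and merge are hypothetical stand-ins for the real grammar operations:

    def split_merge_round(grammar, treebank, split_in_two, run_em, likelihood_loss, merge):
        # Split every subcategory in two, re-fit with EM (warm-started from the
        # previous grammar), then roll back the half of the new splits whose
        # reversal costs the least data likelihood.
        grammar = split_in_two(grammar)
        grammar = run_em(grammar, treebank)
        losses = likelihood_loss(grammar, treebank)    # split -> loss if reversed
        least_useful = sorted(losses, key=losses.get)[: len(losses) // 2]
        return merge(grammar, least_useful)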

Adaptive Splitting Results

Model F1

Previous 88.4
With 50% Merging 89.5

Number of Phrasal Subcategories

Final Results

Parser  F1 (≤ 40 words)  F1 (all words)
Klein & Manning '03  86.3  85.7
Matsuzaki et al '05  86.7  86.1
Collins '99  88.6  88.2
Charniak & Johnson '05  90.1  89.6
Petrov et al 06  90.2  89.7

Hierarchical Pruning

coarse:        … QP NP VP …

split in two:   … QP1 QP2 NP1 NP2 VP1 VP2 …

split in four:  … QP1 QP2 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 …

split in eight: … …

Parse multiple times with grammars at different levels of granularity
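A sketch of the pruning decision between granularity levels (the threshold is illustrative): after parsing with a coarser grammar, a (span, symbol) pair survives only if its posterior clears a threshold, and the next, finer pass may build only the refined versions of the surviving symbols over each span.

    def allowed_after_coarse_pass(posteriors, threshold=1e-4):
        # posteriors maps (i, j, coarse symbol) -> posterior probability from the
        # coarser pass; the finer pass is restricted to the surviving items.
        return {(i, j, sym) for (i, j, sym), p in posteriors.items() if p >= threshold}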

Bracket Posteriors

1621 min

111 min

35 min

15 min [91.2 F1] (no search error)

Page 81: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

How good are PCFGs

bull Penn WSJ parsing accuracy about 73 LPLR F1

bull Robust

bull Usually admit everything but with low probability

bull Partial solution for grammar ambiguity

bull A PCFG gives some idea of the plausibility of a parse

bull But not so good because the independence assumptions are

too strong

bull Give a probabilistic language model

bull But in the simple case it performs worse than a trigram model

bull The problem seems to be that PCFGs lack the

lexicalization of a trigram model

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 82: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Weaknesses of PCFGs

87

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 83: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Weaknesses

bull Lack of sensitivity to structural frequencies

bull Lack of sensitivity to lexical information

bull (A word is independent of the rest of the tree given its POS)

88

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 84: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Case of PP Attachment Ambiguity

89

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 85: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

90

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 86: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Case of Coordination Ambiguity

91

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 87: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

92

Structural Preferences Close Attachment

93

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

(figure: the DT tag split into subcategories DT-1, DT-2, DT-3, DT-4)

Hierarchical refinement: repeatedly learn more fine-grained subcategories:
• start with two subcategories per non-terminal, then keep splitting
• initialize each EM run with the output of the last

Adaptive Splitting [Petrov et al. 06]

• Want to split complex categories more.
• Idea: split everything, then roll back the splits which were least useful.

Adaptive Splitting

• Evaluate the loss in likelihood from removing each split:

    loss(split) = (data likelihood with the split reversed) / (data likelihood with the split)

• No loss in accuracy when 50% of the splits are reversed (a sketch of the full training loop follows).
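The overall split/EM/merge training loop can be sketched as follows; every helper name here is a hypothetical placeholder rather than a real API, and the exact procedure is in Petrov et al. 06:

    def train_latent_grammar(treebank, rounds=6):
        grammar = one_subcategory_per_symbol(treebank)
        for _ in range(rounds):
            grammar = split_each_symbol_in_two(grammar, jitter=0.01)   # break symmetry with small noise
            grammar = run_em(grammar, treebank)                        # fit the latent subcategories
            grammar = merge_least_useful_splits(grammar, treebank, fraction=0.5)
        return grammar

The merge step uses exactly the likelihood-loss criterion above to decide which half of the splits to reverse.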

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                    F1 (<= 40 words)   F1 (all words)
Klein & Manning '03       86.3               85.7
Matsuzaki et al. '05      86.7               86.1
Collins '99               88.6               88.2
Charniak & Johnson '05    90.1               89.6
Petrov et al. 06          90.2               89.7

Hierarchical Pruning

Parse multiple times, with grammars at different levels of granularity:

coarse:           ... QP  NP  VP ...
split in two:     ... QP1 QP2   NP1 NP2   VP1 VP2 ...
split in four:    ... QP1 QP2 QP3 QP4   NP1 NP2 NP3 NP4   VP1 VP2 VP3 VP4 ...
split in eight:   ... ... ...
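A sketch of the coarse-to-fine idea; posterior() is a hypothetical helper standing in for an inside-outside pass that respects the current pruning constraints, not a real API:

    def coarse_to_fine_chart(sentence, grammars, threshold=1e-4):
        """grammars: ordered from coarsest to finest."""
        allowed = None                                   # no constraints at the coarsest level
        for g in grammars:
            post = posterior(g, sentence, allowed)       # bracket posteriors under grammar g
            allowed = {(i, j, X) for (i, j, X), p in post.items() if p > threshold}
            # the next, finer grammar only builds items whose projection
            # back to this level is still in `allowed`
        return allowed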

Bracket Posteriors

(figure; parsing times at successive levels of the coarse-to-fine hierarchy: 1621 min, 111 min, 35 min, 15 min [91.2 F1], with no search error)

Structural Preferences: Close Attachment

• Example: "John was believed to have been shot by Bill."
• The low-attachment analysis (Bill does the shooting) contains the same rules as the high-attachment analysis (Bill does the believing), so the two analyses receive the same probability.

PCFGs and Independence

• The symbols in a PCFG define independence assumptions:
  • At any node, the material inside that node is independent of the material outside that node, given the label of that node.
  • Any information that statistically connects behavior inside and outside a node must flow through that node's label.

(figure: an S expanding as S -> NP VP, with the NP expanding as NP -> DT NN)
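Equivalently, in symbols (a standard identity rather than anything specific to these slides):

\[
P(T) \;=\; \prod_{(A \to \beta)\,\in\,T} P(A \to \beta \mid A),
\]

so nothing outside an NP can influence how that NP expands unless the information is encoded in the NP's label itself, which is exactly what the annotations in the following slides do.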

Non-Independence I

• The independence assumptions of a PCFG are often too strong.
• Example: the expansion of an NP is highly dependent on the parent of the NP (i.e., subjects vs. objects):

                 NP -> NP PP   NP -> DT NN   NP -> PRP
  All NPs        11%           9%            6%
  NPs under S     9%           9%           21%
  NPs under VP   23%           7%            4%

Non-Independence II

• Symptoms of overly strong assumptions:
  • Rewrites get used where they don't belong.

(figure: in the PTB, this construction is for possessives)

Advanced Unlexicalized Parsing


Horizontal Markovization

• Horizontal Markovization merges states: the symbols introduced by binarization remember only a limited window of the siblings generated so far, rather than the whole right-hand side.

(figures: F1 stays in roughly the 70-74 range while the number of grammar symbols ranges from about 3,000 to 12,000 as the horizontal Markov order varies over 0, 1, 2v, 2, inf)

Vertical Markovization

• Vertical Markov order: rewrites depend on the past k ancestor nodes (i.e., parent annotation).

(figures: Order 1 vs. Order 2 example trees; F1 ranges roughly 72-79 while the number of grammar symbols grows to about 25,000 as the vertical Markov order varies over 1, 2v, 2, 3v, 3)

Model     F1     Size
v=h=2v    77.8   7.5K
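A minimal sketch of the order-2 case (parent annotation) applied to treebank trees, assuming trees are nested (label, children...) tuples; this is illustrative and not the exact preprocessing behind the reported numbers:

    def parent_annotate(tree, parent=None):
        """Append the parent's label to every non-terminal (vertical Markov order 2)."""
        label, children = tree[0], tree[1:]
        new_label = f"{label}^{parent}" if parent is not None else label
        new_children = [child if isinstance(child, str) else parent_annotate(child, label)
                        for child in children]
        return (new_label, *new_children)

    t = ("S", ("NP", ("PRP", "He")), ("VP", ("VBD", "was"), ("ADJP", ("JJ", "right"))))
    print(parent_annotate(t))
    # ('S', ('NP^S', ('PRP^NP', 'He')), ('VP^S', ('VBD^VP', 'was'), ('ADJP^VP', ('JJ^ADJP', 'right'))))

Reading rules off the annotated trees then makes each rewrite depend on its parent category, at the cost of a larger symbol set, as the figures above indicate.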

Unary Splits

• Problem: unary rewrites are used to transmute categories so a high-probability rule can be used.
• Solution: mark unary rewrite sites with -U.

Annotation   F1     Size
Base         77.8   7.5K
UNARY        78.3   8.0K

Tag Splits

• Problem: Treebank tags are too coarse.
• Example: SBAR sentential complementizers (that, whether, if), subordinating conjunctions (while, after), and true prepositions (in, of, to) are all tagged IN.
• Partial solution: subdivide the IN tag.

Annotation   F1     Size
Previous     78.3   8.0K
SPLIT-IN     80.3   8.1K

Other Tag Splits

• UNARY-DT: mark demonstratives as DT^U ("the X" vs. "those")
• UNARY-RB: mark phrasal adverbs as RB^U ("quickly" vs. "very")
• TAG-PA: mark tags with non-canonical parents ("not" is an RB^VP)
• SPLIT-AUX: mark auxiliary verbs with -AUX [cf. Charniak 97]
• SPLIT-CC: separate "but" and "&" from other conjunctions
• SPLIT-%: "%" gets its own tag

Annotation   F1     Size
UNARY-DT     80.4   8.1K
UNARY-RB     80.5   8.1K
TAG-PA       81.2   8.5K
SPLIT-AUX    81.6   9.0K
SPLIT-CC     81.7   9.1K
SPLIT-%      81.8   9.3K

Yield Splits

• Problem: sometimes the behavior of a category depends on something inside its future yield.
• Examples:
  • Possessive NPs
  • Finite vs. infinite VPs
  • Lexical heads
• Solution: annotate future elements into nodes.

Annotation   F1     Size
tag splits   82.3   9.7K
POSS-NP      83.1   9.8K
SPLIT-VP     85.7   10.5K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 89: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Structural Preferences Close Attachment

bull Example John was believed to have been shot by Bill

bull Low attachment analysis (Bill does the shooting) contains same rules as high attachment analysis (Bill does the believing)bull Two analyses receive the same probability

94

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 90: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

PCFGs and Independence

bull The symbols in a PCFG define independence assumptions

bull At any node the material inside that node is independent of the material outside that node given the label of that node

bull Any information that statistically connects behavior inside and outside a node must flow through that nodersquos label

NP

S

VP

S NP VP

NP DT NN

NP

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 91: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Non-Independence I

bull The independence assumptions of a PCFG are often too strong

bull Example the expansion of an NP is highly dependent on the parent of the NP (ie subjects vs objects)

119

6

NP PP DT NN PRP

9 9

21

NP PP DT NN PRP

74

23

NP PP DT NN PRP

All NPs NPs under S NPs under VP

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 92: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Non-Independence II

bull Symptoms of overly strong assumptionsbull Rewrites get used where they donrsquot belong

In the PTB this

construction is

for possessives

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 93: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Advanced Unlexicalized Parsing

99

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 94: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Horizontal Markovization

bull Horizontal Markovization Merges States

70

71

72

73

74

0 1 2v 2 inf

Horizontal Markov Order

0

3000

6000

9000

12000

0 1 2v 2 inf

Horizontal Markov Order

Sym

bo

ls

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 95: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Vertical Markovization

bull Vertical Markov order rewrites depend on past k ancestor nodes

(ie parent annotation)

Order 1 Order 2

7273747576777879

1 2v 2 3v 3

Vertical Markov Order

0

5000

10000

15000

20000

25000

1 2v 2 3v 3

Vertical Markov Order

Sym

bo

ls

Model F1 Size

v=h=2v 778 75K

Unary Splits

bull Problem unary rewrites are used to transmute categories so a high-probability rule can be used

Annotation F1 Size

Base 778 75K

UNARY 783 80K

Solution Mark unary rewrite sites with -U

Tag Splits

bull Problem Treebank tags are too coarse

bull Example SBAR sentential complementizers (that whether if) subordinating conjunctions (while after) and true prepositions (in of to) are all tagged IN

bull Partial Solutionbull Subdivide the IN tag

Annotation F1 Size

Previous 783 80K

SPLIT-IN 803 81K

Other Tag Splits

bull UNARY-DT mark demonstratives as DT^U (ldquothe Xrdquo vs ldquothoserdquo)

bull UNARY-RB mark phrasal adverbs as RB^U (ldquoquicklyrdquo vs ldquoveryrdquo)

bull TAG-PA mark tags with non-canonical parents (ldquonotrdquo is an RB^VP)

bull SPLIT-AUX mark auxiliary verbs with ndashAUX [cf Charniak 97]

bull SPLIT-CC separate ldquobutrdquo and ldquoamprdquo from other conjunctions

bull SPLIT- ldquordquo gets its own tag

F1 Size

804 81K

805 81K

812 85K

816 90K

817 91K

818 93K

Yield Splits

bull Problem sometimes the behavior of a category depends on something inside its future yield

bull Examplesbull Possessive NPs

bull Finite vs infinite VPs

bull Lexical heads

bull Solution annotate future elements into nodes

Annotation F1 Size

tag splits 823 97K

POSS-NP 831 98K

SPLIT-VP 857 105K

Distance Recursion Splits

bull Problem vanilla PCFGs cannot distinguish attachment heights

bull Solution mark a property of higher or lower sitesbull Contains a verb

bull Is (non)-recursive

bull Base NPs [cf Collins 99]

bull Right-recursive NPs

Annotation F1 Size

Previous 857 105K

BASE-NP 860 117K

DOMINATES-V 869 141K

RIGHT-REC-NP 870 152K

NP

VP

PP

NP

v

-v

A Fully Annotated Tree

Final Test Set Results

bull Beats ldquofirst generationrdquo lexicalized parsers

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

Lexicalised PCFGs

109

Heads in Comtext Free Rules

110

Heads

111

Rules to Recover Heads An Example for NPs

112

Rules to Recover Heads An Example for VPs

113

Adding Headwords to Trees

114

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

• Advantages:
  • Automatically learned: label all nodes with latent variables, same number k of subcategories for all categories

• Disadvantages:
  • Grammar gets too large
  • Most categories are oversplit while others are undersplit

Model                  F1
Klein & Manning '03    86.3
Matsuzaki et al. '05   86.7

Refinement of the DT tag

[Figure: DT split into subcategories DT-1, DT-2, DT-3, DT-4]

Hierarchical refinement: repeatedly learn more fine-grained subcategories:
  • start with two (per non-terminal), then keep splitting
  • initialize each EM run with the output of the last

[Figure: hierarchical binary splitting of the DT subcategories]

Adaptive Splitting [Petrov et al. 06]

• Want to split complex categories more
• Idea: split everything, roll back splits which were least useful

Adaptive Splitting

• Evaluate loss in likelihood from removing each split:

    loss = (data likelihood with split reversed) / (data likelihood with split)

• No loss in accuracy when 50% of the splits are reversed

Adaptive Splitting Results

Model              F1
Previous           88.4
With 50% Merging   89.5

Number of Phrasal Subcategories

Final Results

Parser                   F1 (≤ 40 words)   F1 (all words)
Klein & Manning '03      86.3              85.7
Matsuzaki et al. '05     86.7              86.1
Collins '99              88.6              88.2
Charniak & Johnson '05   90.1              89.6
Petrov et al. 06         90.2              89.7

Hierarchical Pruning

Parse multiple times with grammars at different levels of granularity:

  coarse:          … QP  NP  VP …
  split in two:    … QP1 QP2  NP1 NP2  VP1 VP2 …
  split in four:   … QP1 QP2 QP3 QP4  NP1 NP2 NP3 NP4  VP1 VP2 VP3 VP4 …
  split in eight:  … … … … … … … … … … … … … … … … …
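
The pruning criterion behind this coarse-to-fine scheme, written as a sketch in standard inside/outside notation (my notation, not the slide's): an item A over span (i, j) is dropped from the finer pass when the posterior of its coarse projection π(A) falls below a threshold ε:

\frac{O_{\text{coarse}}(\pi(A), i, j)\; I_{\text{coarse}}(\pi(A), i, j)}{I_{\text{coarse}}(\mathrm{ROOT}, 0, n)} \;<\; \varepsilon
\quad\Longrightarrow\quad \text{prune } (A, i, j) \text{ at the finer level}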

Bracket Posteriors

1621 min
111 min
35 min
15 min [91.2 F1] (no search error)


Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 109: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Adding Headwords to Trees

115

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 110: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Lexicalized CFGs in Chomsky Normal Form

116

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 111: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Example

117

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 112: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Lexicalized CKY

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

(VP-gt VBD[saw] NP[her])

(VP-gtVBDNP)[saw]

bestScore(Xijh)

if (j = i)

return score(Xs[i])

else

return

max max score(X[h]-gtY[h]Z[w])

bestScore(Yikh)

bestScore(Zkjw)

max score(X[h]-gtY[w]Z[h])

bestScore(Yikw)

bestScore(Zkjh)

kh

X-gtYZ

kh

X-gtYZ

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 113: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Parsing with Lexicalized CFGs

119

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 114: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Pruning with Beams

bull The Collins parser prunes with per-cell beams [Collins 99]bull Essentially run the O(n5) CKY

bull Remember only a few hypotheses for each span ltijgt

bull If we keep K hypotheses at each span then we do at most O(nK2) work per span (why)

bull Keeps things more or less cubic

bull Also certain spans are forbidden entirely on the basis of punctuation (crucial for speed)

Y[h] Z[hrsquo]

X[h]

i h k hrsquo j

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 115: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Parameter Estimation

121

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 116: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Model from Charniak (1997)

122

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 117: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

A Model from Charniak (1997)

123

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 118: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Other Details

124

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 119: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Final Test Set Results

Parser LP LR F1

Magerman 95 849 846 847

Collins 96 863 858 860

Klein amp Manning 03 869 857 863

Charniak 97 874 875 874

Collins 99 887 886 886

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 120: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

AnalysisEvaluation (Method 2)

126

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 121: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Dependency Accuracies

127

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 122: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Strengths and Weaknesses of Modern Parsers

128

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)

Page 123: Statistical Natural Language Parsingmausam/courses/csl772/autumn2014/lectures/1… · Statistical parsing applications Statistical parsers are now robust and widely used in larger

Modern Parsers

129

Annotation refines base treebank symbols to

improve statistical fit of the grammar

Parent annotation [Johnson rsquo98]

Head lexicalization [Collins rsquo99 Charniak rsquo00]

Automatic clustering

The Game of Designing a Grammar

Manual Splits

bull Manually split categoriesbull NP subject vs object

bull DT determiners vs demonstratives

bull IN sentential vs prepositional

bull Advantagesbull Fairly compact grammar

bull Linguistic motivations

bull Disadvantagesbull Performance leveled out

bull Manually annotated

ForwardOutside

Learning Latent AnnotationsLatent Annotations

bull Brackets are known

bull Base categories are known

bull Hidden variables for subcategories

X1

X2 X7X4

X5 X6X3

He was right

Can learn with EM like Forward-Backward for HMMs BackwardInside

Automatic Annotation Induction

bull Advantages

bull Automatically learned

Label all nodes with latent variables

Same number k of subcategoriesfor all categories

bull Disadvantages

bull Grammar gets too large

bull Most categories are oversplit while others are undersplit

Model F1

Klein amp Manning rsquo03 863

Matsuzaki et al rsquo05 867

Refinement of the DT tag

DT

DT-1 DT-2 DT-3 DT-4

Hierarchical refinement Repeatedly learn more fine-grained subcategories

start with two (per non-terminal) then keep splitting

initialize each EM run with the output of the last

DT

Adaptive Splitting

Want to split complex categories more

Idea split everything roll back splits which were

least useful

[Petrov et al 06]

Adaptive Splitting

bull Evaluate loss in likelihood from removing each split =

Data likelihood with split reversed

Data likelihood with split

bull No loss in accuracy when 50 of the splits are reversed

Adaptive Splitting Results

Model F1

Previous 884

With 50 Merging 895

Number of Phrasal Subcategories

Final Results

F1

le 40 words

F1

all wordsParser

Klein amp Manning rsquo03 863 857

Matsuzaki et al rsquo05 867 861

Collins rsquo99 886 882

Charniak amp Johnson rsquo05 901 896

Petrov et al 06 902 897

Hierarchical Pruning

hellip QP NP VP hellipcoarse

split in two hellip QP1 QP2 NP1 NP2 VP1 VP2 hellip

hellip QP1 QP1 QP3 QP4 NP1 NP2 NP3 NP4 VP1 VP2 VP3 VP4 hellipsplit in four

split in eight hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip hellip

Parse multiple times with grammars at different levels of granularity

Bracket Posteriors

1621 min

111 min

35 min

15 min [912 F1](no search error)
