CS388: Natural Language Processing Lecture 10: Syntax I
Greg Durrett
Slides adapted from Dan Klein, UC Berkeley
Page 1: CS388: Natural Language Processing Lecture 10: Syntax I (gdurrett/courses/fa2018/lectures/lec10-1pp.pdf)

CS388: Natural Language Processing
Lecture 10: Syntax I

Greg Durrett

Slides adapted from Dan Klein, UC Berkeley

Page 2

Recall: CNNs vs. LSTMs

‣ Both LSTMs and convolutional layers transform the input using context

[figure: "the movie was good" encoded two ways: an n×k input passed through c filters (m×k each) yields an O(n)×c map; an n×k input passed through a BiLSTM with hidden size c yields an n×2c output]

‣ LSTM: "globally" looks at the entire sentence (but local for many problems)

‣ CNN: local, depending on filter width + number of layers

Page 3

Recall: CNNs

[figure: "the movie was good" as an n×k input; c filters, m×k each, produce an n×c map; max pooling over the sentence gives a c-dimensional vector; projection W + softmax gives P(y|x)]

‣ Max pooling: return the max activation of a given filter over the entire sentence; like a logical OR (sum pooling is like logical AND)
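The max-vs-sum contrast can be sketched in a few lines of plain Python; the activation values below are made-up numbers, purely for illustration:

```python
# Toy CNN feature map for "the movie was good": n positions x c filters
# (here n = 4 words, c = 3 filters). Activations are hypothetical.
feature_map = [
    [0.1, 0.0, 0.9],  # the
    [0.2, 0.8, 0.1],  # movie
    [0.0, 0.1, 0.0],  # was
    [0.7, 0.2, 0.3],  # good
]

# Max pooling: the strongest activation of each filter at ANY position,
# so it behaves like a logical OR over positions.
max_pooled = [max(col) for col in zip(*feature_map)]  # [0.7, 0.8, 0.9]

# Sum pooling: accumulates evidence from ALL positions, like a logical AND.
sum_pooled = [sum(col) for col in zip(*feature_map)]
```

Either way, the variable-length sentence is reduced to a single c-dimensional vector that the projection + softmax layer consumes.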

Page 4

Recall: Neural CRFs

Barack Obama will travel to Hangzhou today for the G20 meeting.
B-PER  I-PER  O  O  O  B-LOC  O  O  O  B-ORG  O  O
(PERSON, LOC, and ORG spans)

1) Compute f(x)
2) Run forward-backward
3) Compute error signal
4) Backprop (no knowledge of sequential structure required)

Page 5

This Lecture

‣ Constituency formalism

‣ Context-free grammars and the CKY algorithm

‣ Refining grammars

‣ Discriminative parsers

Page 6

Constituency

Page 7

Syntax

‣ Study of word order and how words form sentences

‣ Why do we care about syntax?

‣ Recognize verb-argument structures (who is doing what to whom?)

‣ Multiple interpretations of words (noun or verb? "Fed raises…" example)

‣ Higher level of abstraction beyond words: some languages are SVO, some are VSO, some are SOV; parsing can canonicalize

Page 8

Constituency Parsing

‣ Tree-structured syntactic analyses of sentences

‣ Common things: noun phrases, verb phrases, prepositional phrases

‣ Bottom layer is POS tags

‣ Examples will be in English. Constituency makes sense for a lot of languages, but not all

Page 9

[tree examples: a sentential complement, a whole embedded sentence, an adverbial phrase]

Page 10

Constituency Parsing

The rat the cat chased squeaked

I raced to Indianapolis, unimpeded by traffic

Page 11

Challenges

‣ PP attachment

‣ If we do no annotation, these trees differ only in one rule: VP → VP PP vs. NP → NP PP

‣ Parse will go one way or the other, regardless of words

‣ Lexicalization allows us to be sensitive to specific words

(same parse as "the cake with some icing")

Page 12

Challenges

‣ NP internal structure: tags + depth of analysis

Page 13

Constituency

‣ How do we know what the constituents are?

‣ Constituency tests:

‣ Substitution by proform (e.g., pronoun)

‣ Clefting (It was with a spoon that…)

‣ Answer ellipsis (What did they eat? the cake) (How? with a spoon)

‣ Sometimes constituency is not clear, e.g., coordination: she went to and bought food at the store

Page 14

Context-Free Grammars, CKY

Page 15

CFGs and PCFGs

‣ Write symbolic or logical rules:

‣ Use deduction systems to prove parses from words

‣ Minimal grammar on the "Fed raises" sentence: 36 parses

‣ Simple 10-rule grammar: 592 parses

‣ Real-size grammar: many millions of parses

‣ This scaled very badly, didn't yield broad-coverage tools

Grammar (CFG)      Lexicon

ROOT → S           NN → interest
S → NP VP          NNS → raises
NP → DT NN         VBP → interest
NP → NN NNS        VBZ → raises
NP → NP PP
VP → VBP NP
VP → VBP NP PP
PP → IN NP

‣ Context-free grammar: symbols which rewrite as one or more symbols

‣ Lexicon consists of "preterminals" (POS tags) rewriting as terminals (words)

‣ CFG is a tuple (N, T, S, R): N = nonterminals, T = terminals, S = start symbol (generally a special ROOT symbol), R = rules

‣ PCFG: probabilities associated with rewrites, normalized by source symbol

[in the slide, each rule carries a probability; probabilities of rules with the same left-hand side sum to 1]

Page 16

Estimating PCFGs

‣ Maximum likelihood PCFG: count and normalize! Same as HMMs / Naive Bayes

S → NP VP     1.0
NP → PRP      0.5
NP → DT NN    0.5
…

‣ Tree T is a series of rule applications r:  P(T) = ∏_{r ∈ T} P(r | parent(r))
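The count-and-normalize recipe can be sketched directly. The toy "treebank" below is invented for illustration and represents each tree simply as a list of (parent, children) rules; a real implementation would walk actual tree structures:

```python
from collections import Counter

# Hypothetical two-tree "treebank", flattened into rule applications.
treebank = [
    [("S", ("NP", "VP")), ("NP", ("PRP",)), ("VP", ("VBD", "NP")), ("NP", ("DT", "NN"))],
    [("S", ("NP", "VP")), ("NP", ("DT", "NN")), ("VP", ("VBD",))],
]

rule_counts = Counter()    # count(parent -> children)
parent_counts = Counter()  # count(parent), the normalizer

for tree in treebank:
    for parent, children in tree:
        rule_counts[(parent, children)] += 1
        parent_counts[parent] += 1

# MLE: P(rule | parent) = count(rule) / count(parent)
pcfg = {rule: c / parent_counts[rule[0]] for rule, c in rule_counts.items()}

print(pcfg[("S", ("NP", "VP"))])   # 1.0
print(pcfg[("NP", ("DT", "NN"))])  # 2/3 of the three NP expansions
```

By construction, the probabilities of all rules sharing a left-hand side sum to 1, matching the normalization in P(T).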

Page 17

Binarization

‣ To parse efficiently, we need our PCFGs to be at most binary (not CNF)

VP → VBD NP PP PP   ("sold | the book | to her | for $3"),   P(VP → VBD NP PP PP) = 0.2

‣ Lossless: VP → VBD VP-[NP PP PP],  VP-[NP PP PP] → NP VP-[PP PP],  VP-[PP PP] → PP PP

‣ Lossy: VP → VBD VP,  VP → NP VP,  VP → PP PP

P(VP → VBZ PP) = 0.1 …
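The lossless transformation can be written as a short rule rewriter; the VP-[…] naming below follows the slide, and the function itself is a sketch, not any particular parser's implementation:

```python
# Lossless right-binarization of an n-ary rule: intermediate symbols
# remember the remaining children, so the original rule is recoverable.
def binarize(parent, children):
    base, cur, kids = parent, parent, list(children)
    rules = []
    while len(kids) > 2:
        rest = kids[1:]
        new_sym = f"{base}-[{' '.join(rest)}]"   # e.g. VP-[NP PP PP]
        rules.append((cur, (kids[0], new_sym)))
        cur, kids = new_sym, rest
    rules.append((cur, tuple(kids)))             # final binary (or unary) rule
    return rules

for rule in binarize("VP", ["VBD", "NP", "PP", "PP"]):
    print(rule)
# ('VP', ('VBD', 'VP-[NP PP PP]'))
# ('VP-[NP PP PP]', ('NP', 'VP-[PP PP]'))
# ('VP-[PP PP]', ('PP', 'PP'))
```

Dropping the bracketed annotation from `new_sym` (using a bare "VP" everywhere) gives exactly the lossy variant.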

Page 18

Chomsky Normal Form

‣ Lossless:
P(VP → VBD VP-[NP PP PP]) = 0.2
P(VP-[NP PP PP] → NP VP-[PP PP]) = 1.0
P(VP-[PP PP] → PP PP) = 1.0

‣ Lossy:
P(VP → VBD VP) = 0.2
P(VP → NP VP) = 0.03
P(VP → PP PP) = 0.001

‣ Deterministic symbols make this the same as before

‣ Makes different independence assumptions; not the same PCFG

Page 19

CKY

He wrote [NP a long report [PP on Mars]]

‣ Find argmax P(T|x) = argmax P(T, x)

‣ Dynamic programming: chart maintains the best way of building symbol X over span (i, j)

‣ Loop over all split points k, apply rules X → Y Z to build X in every possible way

‣ CKY (Cocke-Kasami-Younger) = Viterbi; there is also an algorithm called inside-outside, the analogue of forward-backward

[diagram: X over span (i, j) built from Y over (i, k) and Z over (k, j)]
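A minimal Viterbi-CKY sketch follows; the toy grammar, lexicon, and probabilities are invented for illustration, and for brevity the chart stores only scores (a real parser would also store backpointers to recover the tree):

```python
from collections import defaultdict

def cky(words, lexicon, grammar):
    """chart[(i, j)][X] = best probability of building X over span (i, j)."""
    n = len(words)
    chart = defaultdict(dict)
    # Base case: preterminals over single words.
    for i, w in enumerate(words):
        for (tag, word), p in lexicon.items():
            if word == w:
                chart[(i, i + 1)][tag] = p
    # Build longer spans bottom-up; O(n^3 * |grammar|) overall.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):                     # split point
                for (x, y, z), p in grammar.items():      # rule X -> Y Z
                    if y in chart[(i, k)] and z in chart[(k, j)]:
                        score = p * chart[(i, k)][y] * chart[(k, j)][z]
                        if score > chart[(i, j)].get(x, 0.0):
                            chart[(i, j)][x] = score
    return chart

words = "he wrote a report".split()
lexicon = {("NP", "he"): 1.0, ("VBD", "wrote"): 1.0,
           ("DT", "a"): 1.0, ("NN", "report"): 1.0}
grammar = {("S", "NP", "VP"): 1.0, ("VP", "VBD", "NP"): 1.0,
           ("NP", "DT", "NN"): 0.5}
chart = cky(words, lexicon, grammar)
print(chart[(0, 4)]["S"])  # 0.5
```

Maximizing over split points and rules at each cell is exactly the Viterbi recurrence; summing instead of maximizing would give the inside algorithm.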

Page 20

Unary Rules

[trees: SBAR over S ("the rat the cat chased squeaked"); NP over NNS ("mice")]

‣ Unary productions in the treebank need to be dealt with by parsers

‣ Binary trees over n words have at most n-1 nonterminal nodes, but you can have unlimited numbers of nodes with unaries (S → SBAR → NP → S → …)

‣ In practice: enforce at most one unary over each span, modify CKY accordingly

Page 21

Results

Klein and Manning (2003)

‣ Standard dataset for English: Penn Treebank (Marcus et al., 1993)

‣ Evaluation: F1 over labeled constituents of the sentence

‣ Vanilla PCFG: ~75 F1

‣ Best PCFGs for English: ~90 F1

‣ Other languages: results vary widely depending on annotation + complexity of the grammar

‣ SOTA: 95 F1

Page 22

Refining Generative Grammars

Page 23

PCFG Independence Assumptions

How often an NP rewrites each way, by context:

         All NPs   NPs under S   NPs under VP
NP PP    11%       9%            23%
DT NN    9%        9%            7%
PRP      6%        21%           4%

‣ Language is not context-free: NPs in different contexts rewrite differently

‣ Can we make the grammar "less context-free"?

Page 24

Rule Annotation

‣ Vertical (parent) annotation: add the parent symbol to each node; can do grandparents too

‣ Vertical Markov order: rewrites depend on the past k ancestor nodes (cf. parent annotation)

‣ Like a trigram HMM tagger, incorporates more context

[charts (Klein and Manning, 2003): parsing accuracy and number of grammar symbols as the vertical Markov order varies over 1, 2v, 2, 3v, 3 and the horizontal Markov order varies over 0, 1, 2v, 2, ∞]

‣ Horizontal annotation: remember the states of multi-arity rules during binarization
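Parent annotation itself is a small tree transformation. A sketch on nested-tuple trees; the tuple representation and the "X^parent" label convention are assumptions for illustration:

```python
# Vertical order-2 (parent) annotation: rewrite each nonterminal X as
# "X^parent". POS tags (preterminals) are left alone, as is standard.
def annotate(tree, parent="ROOT"):
    label, children = tree[0], tree[1:]
    if isinstance(children[0], str):   # preterminal: (tag, word)
        return tree
    new_label = f"{label}^{parent}"
    return (new_label,) + tuple(annotate(c, parent=label) for c in children)

tree = ("S", ("NP", ("PRP", "he")), ("VP", ("VBD", "squeaked")))
print(annotate(tree))
# ('S^ROOT', ('NP^S', ('PRP', 'he')), ('VP^S', ('VBD', 'squeaked')))
```

Reading off rules from the annotated tree now distinguishes NP-under-S from NP-under-VP, which is exactly the context the charts above show matters.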

Page 25

Annotated Tree

Klein and Manning (2003)

‣ 75 F1 with basic PCFG => 86.3 F1 with this highly customized PCFG (SOTA was 90 F1 at the time, but with more complex methods)

Page 26

Lexicalized Parsers

‣ What's different between basic PCFG scores here? What (lexical) correlations need to be scored?

‣ Even with parent annotation, these trees have the same rules. Need to use the words

Page 27

Lexicalized Parsers

‣ Add "head words" to each phrasal node
‣ Syntactic vs. semantic heads
‣ Headship not in (most) treebanks
‣ Usually use head rules, e.g.:
  ‣ NP: take leftmost NP; else take rightmost N*; else take rightmost JJ; else take right child
  ‣ VP: take leftmost VB*; else take leftmost VP; else take left child

‣ Annotate each grammar symbol with its "head word": most important word of that constituent

‣ Rules for identifying headwords (e.g., the last word of an NP before a preposition is typically the head)

‣ Collins and Charniak (late 90s): ~89 F1 with these
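Head-rule application can be sketched as a priority list per phrase label. The table below paraphrases the slide's NP/VP rules only; real head tables (e.g., Collins's) are larger and more careful:

```python
# (direction, label-prefix) pairs, tried in order; startswith handles
# wildcard entries like N* and VB*.
HEAD_RULES = {
    "NP": [("leftmost", "NP"), ("rightmost", "NN"), ("rightmost", "JJ")],
    "VP": [("leftmost", "VB"), ("leftmost", "VP")],
}

def find_head(label, children):
    """children: list of (child_label, head_word) pairs for one rule."""
    for direction, target in HEAD_RULES.get(label, []):
        order = children if direction == "leftmost" else list(reversed(children))
        for child_label, head_word in order:
            if child_label.startswith(target):
                return head_word
    # Fallbacks from the slide: left child for VP, right child for NP.
    return children[0][1] if label == "VP" else children[-1][1]

print(find_head("NP", [("DT", "the"), ("NN", "report")]))   # report
print(find_head("VP", [("VBD", "wrote"), ("NP", "report")]))  # wrote
```

Applying this bottom-up propagates a head word to every phrasal node, which is the annotation lexicalized parsers score.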

Page 28

Discriminative Parsers

Page 29

CRF Parsing

He wrote a long report on Mars.

[two candidate parses: the PP "on Mars" attaches to the NP ("a long report on Mars") or to the VP ("wrote … on Mars"); lexical cues include "My report", "Fig. 1", and the bigrams (report, on Mars) vs. (wrote, on Mars)]

Page 30

CRF Parsing

Taskar et al. (2004); Hall, Durrett, and Klein (2014); Durrett and Klein (2015)

score(NP → NP PP over span (2, 5, 7)) = w^T f(NP → NP PP, (2, 5, 7))

[the NP over (2, 7) "a long report on Mars" splits into an NP over (2, 5) "a long report" and a PP over (5, 7) "on Mars"]

‣ Features fire on the anchored rule, e.g., left child last word = report ∧ NP → NP PP

‣ Can learn that we report [PP], which is common due to reporting on things

‣ Can "neuralize" this as well, like neural CRFs for NER

Page 31

Joint Discrete and Continuous Parsing

Durrett and Klein (ACL 2015)

Parsing a sentence ("He wrote a long report on Mars"):

‣ Discrete feature computation
‣ Feedforward pass on nets
‣ Combine discrete + continuous scores
‣ Run CKY dynamic program
‣ Chart remains discrete!

Page 32

Neural CRF Parsing

Stern et al. (2017), Kitaev et al. (2018)

‣ Simpler version: score constituents rather than rule applications:  score(NP over span (2, 7)) = w^T f(NP, (2, 7))

‣ Use BiLSTMs (Stern) or self-attention (Kitaev) to compute span embeddings

‣ 91-93 F1; 95 F1 with ELMo (SOTA). Great on other languages too!

Page 33

Takeaways

‣ PCFGs estimated generatively can perform well if sufficiently engineered

‣ Neural CRFs work well for constituency parsing

‣ Next time: revisit lexicalized parsing as dependency parsing

Page 34

Survey

‣ Write one thing you like about the class

‣ Write one thing you don't like about the class

