Introduction to Syntax and Context-Free Grammars
Slides with contributions from Owen Rambow, Dan Jurafsky and James Martin
Announcements
• Reading:
  • Today: C11-11.1, Speech and Language; 10.2, 11-11.1, NLP
  • Next time: C11.2-11.4 NLP
Looking ahead
• Today: grammars, Context Free and Dependency
• Wednesday: Dependency parsing
• Move into semantics
What is Syntax?
• Study of structure of language
• How words are arranged in a sentence and the relationship between them.
• Goal: relate surface form (perception) to semantics (meaning)
What Syntax is Not
• Phonology: study of sound systems and how sounds combine
• Morphology: study of how words are formed from smaller parts (morphemes)
• Semantics: study of meaning of language
Syntax as an interface
[Figure labels: Morphology, Syntax, Semantics; Representational Device]
Simplified View of Linguistics
• Phonology: ⇔ /waddyasai/
• Morphology: /waddyasai/ ⇔ what did you say; what did you say ⇔ what do+past2ndP say
• Syntax: what do you say ⇔ [dependency tree: say, with subj "you", obj "what", and Q]
• Semantics: [dependency tree: say, with subj "you", obj "what", and Q] ⇔ Q[ λx. say(you, x) ]
The Big Picture
[Figure: three linked boxes, Empirical Matter, Formalisms, and Linguistic Theory; the links between them are marked "?"]
Empirical Matter
• Maud expects there to be a riot
• *Teri promised there to be a riot
• Maud expects the shit to hit the fan
• *Teri promised the shit to hit the fan
Formalisms
• Data structures • Formalisms (e.g., CFG) • Algorithms • Distributional Models
Linguistic Theory
What About Chomsky?
• At birth of formal language theory (comp sci) and formal linguistics
• Major contribution: syntax is cognitive reality
• Humans able to learn languages quickly, but not all languages ⇒ universal grammar is biological
• Goal of syntactic study: find universal principles and language-specific parameters
• Specific Chomskyan theories change regularly
• General ideas adopted by almost all contemporary syntactic theories (“principles-and-parameters-type theories”)
Types of Linguistic Theories
• Prescriptive: “prescriptive linguistics” is an oxymoron
  • Prescriptive grammar: how people ought to talk
• Descriptive: provide account of syntax of a language
  • Descriptive grammar: how people do talk
  • often appropriate for NLP engineering work
• Explanatory: provide principles-and-parameters style account of syntax of (preferably) several languages
The Big Picture
[Figure repeated: Empirical Matter, Formalisms (data structures, formalisms (e.g., CFG), algorithms, distributional models), and Linguistic Theory; the links between them are marked "?" and "or"]
Need for Syntax
• Grammar checkers
• Question answering
• Information extraction
• Machine translation
• Given variability in language, helps to normalize
Key ideas of syntax
• Constituency (we’ll spend most of our time on this)
• Subcategorization
• Grammatical relations
• Movement/long-distance dependency
Structure in Strings
• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
  • the boy likes a girl
  • the small girl likes the big girl
  • a very small nice boy sees a very nice boy
• Some bad sentences:
  • *the boy the girl
  • *small boy likes nice girl
• Can we find subsequences of words (constituents) which in some way behave alike?
Structure in Strings Proposal 1
• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
  • (the) boy (likes a girl)
  • (the small) girl (likes the big girl)
  • (a very small nice) boy (sees a very nice boy)
• Some bad sentences:
  • *(the) boy (the girl)
  • *(small) boy (likes the nice girl)
Structure in Strings Proposal 2
• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
  • (the boy) likes (a girl)
  • (the small girl) likes (the big girl)
  • (a very small nice boy) sees (a very nice boy)
• Some bad sentences:
  • *(the boy) (the girl)
  • *(small boy) likes (the nice girl)
More Structure in Strings
• Some words: the a small nice big very boy girl sees likes
• Some good sentences:
  • ((the) boy) likes ((a) girl)
  • ((the) (small) girl) likes ((the) (big) girl)
  • ((a) ((very) small) (nice) boy) sees ((a) ((very) nice) girl)
• Some bad sentences:
  • *((the) boy) ((the) girl)
  • *((small) boy) likes ((the) (nice) girl)
From Substrings to Trees
• (((the) boy) likes ((a) girl))
[Tree: likes at the root, with daughters boy and girl; the is the daughter of boy, a is the daughter of girl]
Node Labels?
• (((the) boy) likes ((a) girl))
• Choose constituents so each one has one non-bracketed word: the head
• Group words by distribution of constituents they head (part-of-speech, POS):
  • Noun (N), verb (V), adjective (Adj), adverb (Adv), determiner (Det)
• Category of constituent: XP, where X is POS
  • NP, S, AdjP, AdvP, DetP
Node Labels
• (((the/Det) boy/N) likes/V ((a/Det) girl/N))
[Tree: S headed by likes, with two NP daughters headed by boy and girl; each NP has a DetP daughter (the, a)]
Types of Nodes
• (((the/Det) boy/N) likes/V ((a/Det) girl/N))
[Phrase-structure tree: S headed by likes, with two NP daughters headed by boy and girl; each NP has a DetP daughter (the, a)]
• nonterminal symbols = constituents
• terminal symbols = words
Constituency (Review)
• E.g., Noun phrases (NPs)
  • A red dog on a blue tree
  • A blue dog on a red tree
  • Some big dogs and some little dogs
  • A dog
  • We
  • Big dogs, little dogs, red dogs, blue dogs, yellow dogs, green dogs, black dogs, and white dogs
• How do we know these form a constituent?
Constituency (II)
• They can all appear before a verb:
  – Some big dogs and some little dogs are going around in cars…
  – Big dogs, little dogs, red dogs, blue dogs, yellow dogs, green dogs, black dogs, and white dogs are all at a dog party!
  – I do not
• But individual words can’t always appear before verbs:
  – *little are going…
  – *blue are…
  – *and are
• Must be able to state generalizations like:
  – Noun phrases occur before verbs
Constituency (III)
• Preposing and postposing:
  • Under a tree is a yellow dog.
  • A yellow dog is under a tree.
• But not:
  • *Under, is a yellow dog a tree.
  • *Under a is a yellow dog tree.
• Prepositional phrases notable for ambiguity in attachment
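Phrase Structure and Dependency Structure
[Dependency tree: likes/V at the root, with daughters boy/N and girl/N; the/Det under boy/N, a/Det under girl/N. All nodes are labeled with words!]
[Phrase-structure tree: S headed by likes, with two NP daughters headed by boy and girl; each NP has a DetP daughter (the, a). Only leaf nodes labeled with words!]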
Phrase Structure and Dependency Structure (ctd)
[Figure: the dependency tree and the phrase-structure tree for "the boy likes a girl", side by side]
Representationally equivalent if each nonterminal node has one lexical daughter (its head)
Types of Dependency
[Dependency tree over likes/V, boy/N, girl/N, a/Det, the/Det, small/Adj, very/Adv, and sometimes/Adv, with arcs labeled Subj, Obj, Adj(unct), and Fw]
Grammatical Relations
• Types of relations between words
• Arguments: subject, object, indirect object, prepositional object
• Adjuncts: temporal, locative, causal, manner, …
• Function Words
Subcategorization
• List of arguments of a word (typically, a verb), with features about realization (POS, perhaps case, verb form etc)
• In canonical order Subject-Object-IndObj
• Example:
  • like: N-N, N-V(to-inf)
  • see: N, N-N, N-N-V(inf)
• Note: J&M talk about subcategorization only within VP
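The frames listed for like and see can be stored as a small lookup table. The sketch below is only an illustration: the names `SUBCAT` and `licenses` are hypothetical, and the frames are copied from the example above rather than from any real lexicon.

```python
# Hypothetical lexicon: each verb maps to the argument frames it allows,
# in canonical Subject-Object-IndObj order (frames taken from the slide).
SUBCAT = {
    "like": [("N", "N"), ("N", "V(to-inf)")],
    "see": [("N",), ("N", "N"), ("N", "N", "V(inf)")],
}

def licenses(verb, args):
    """Return True if the verb lists exactly this sequence of arguments."""
    return tuple(args) in SUBCAT.get(verb, [])

print(licenses("see", ["N"]))    # intransitive frame of "see"
print(licenses("like", ["N"]))   # "like" has no one-argument frame here
```

A lookup of this kind only checks the category sequence; features such as case or verb form would need richer entries.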
Subcategorization examples
• Give
• Pretend
• Tell
• Bet
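What About the VP?
[Two phrase-structure trees for "the boy likes a girl": in the first, S directly dominates the subject NP, likes, and the object NP; in the second, S dominates the subject NP and a VP, and the VP dominates likes and the object NP]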
What About the VP?
• Existence of VP is a linguistic (i.e., empirical) claim, not a methodological claim
• Semantic evidence???
• Syntactic evidence
  • VP-fronting (and quickly clean the carpet he did!)
  • VP-ellipsis (He cleaned the carpets quickly, and so did she)
  • Can have adjuncts before and after VP, but not in VP (He often eats beans, *he eats often beans)
• Note: VP cannot be represented in a dependency representation
Context-Free Grammars
• Defined in formal language theory (comp sci)
• Terminals, nonterminals, start symbol, rules
• String-rewriting system
• Start with start symbol, rewrite using rules, done when only terminals left
• NOT A LINGUISTIC THEORY, just a formal device
CFG: Example
• Many possible CFGs for English, here is an example (fragment):
  • S → NP VP
  • VP → V NP
  • NP → DetP N | AdjP NP
  • AdjP → Adj | Adv AdjP
  • N → boy | girl
  • V → sees | likes
  • Adj → big | small
  • Adv → very
  • DetP → a | the
the very small boy likes a girl
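To make the fragment concrete, here is a minimal sketch that encodes the rules literally as a Python dictionary and checks whether a word string can be derived from S. The names `GRAMMAR`, `spans`, and `accepts` are illustrative assumptions, and the naive top-down search only works because this fragment has no left recursion.

```python
# Literal encoding of the fragment; any symbol that is not a key is a terminal.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"]],
    "NP": [["DetP", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N": [["boy"], ["girl"]],
    "V": [["sees"], ["likes"]],
    "Adj": [["big"], ["small"]],
    "Adv": [["very"]],
    "DetP": [["a"], ["the"]],
}

def spans(symbol, words, start):
    """Yield every end index such that symbol derives words[start:end]."""
    if symbol not in GRAMMAR:  # terminal: must match the next word
        if start < len(words) and words[start] == symbol:
            yield start + 1
        return
    for rhs in GRAMMAR[symbol]:
        ends = {start}
        for part in rhs:  # thread the possible end positions through the rule
            ends = {e2 for e in ends for e2 in spans(part, words, e)}
        yield from ends

def accepts(sentence):
    """True if the whole sentence is derivable from the start symbol S."""
    words = sentence.split()
    return len(words) in set(spans("S", words, 0))

print(accepts("the boy likes a girl"))
print(accepts("the boy the girl"))
```

Read literally, NP → AdjP NP attaches the adjective phrase outside the determiner, so this sketch accepts "very small the boy likes a girl" and rejects strings with an adjective between determiner and noun; the fragment is best treated as illustrative.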
Derivations in a CFG
Grammar: S → NP VP; VP → V NP; NP → DetP N | AdjP NP; AdjP → Adj | Adv AdjP; N → boy | girl; V → sees | likes; Adj → big | small; Adv → very; DetP → a | the
Successive sentential forms, each shown on the slides with its partial phrase-structure tree:
• S
• NP VP
• DetP N VP
• the boy VP
• the boy likes NP
• the boy likes a girl
Derivations in a CFG; Order of Derivation Irrelevant
• NP likes DetP girl
[Partial tree: S over NP and VP; VP over V (likes) and NP; the object NP over DetP and N (girl); the subject NP and the object DetP are not yet rewritten]
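The rewriting process can be replayed step by step. The following is a minimal sketch of a leftmost derivation, assuming a hand-picked rule sequence; `leftmost_derivation`, `RULES`, and `NONTERMINALS` are hypothetical names introduced for illustration.

```python
def leftmost_derivation(start, rules, nonterminals):
    """Apply rules in order, always rewriting the leftmost nonterminal."""
    form = [start]
    steps = [" ".join(form)]
    for lhs, rhs in rules:
        # locate the leftmost nonterminal in the current sentential form
        i = next(k for k, sym in enumerate(form) if sym in nonterminals)
        if form[i] != lhs:
            raise ValueError(f"leftmost nonterminal is {form[i]}, not {lhs}")
        form[i:i + 1] = rhs  # rewrite it, whatever surrounds it
        steps.append(" ".join(form))
    return steps

NONTERMINALS = {"S", "NP", "VP", "DetP", "N", "V"}
RULES = [
    ("S", ["NP", "VP"]),
    ("NP", ["DetP", "N"]),
    ("DetP", ["the"]),
    ("N", ["boy"]),
    ("VP", ["V", "NP"]),
    ("V", ["likes"]),
    ("NP", ["DetP", "N"]),
    ("DetP", ["a"]),
    ("N", ["girl"]),
]

steps = leftmost_derivation("S", RULES, NONTERMINALS)
print(" => ".join(steps))
```

Rewriting the nonterminals in a different order (for example rightmost first) would visit different intermediate strings but end in the same sentence with the same tree, which is the sense in which the order of derivation is irrelevant.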
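Derivations of CFGs
• String rewriting system: we derive a string (= derived structure)
• But derivation history represented by phrase-structure tree (= derivation structure)!
[Phrase-structure tree for the derived string "the boy likes a girl": S over NP and VP; subject NP over DetP (the) and N (boy); VP over V (likes) and NP; object NP over DetP (a) and N (girl)]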
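The relation between the two structures can be sketched in a few lines: the tree records the derivation history, and reading off its leaves gives the derived string. The nested-tuple encoding and the names `TREE` and `tree_yield` are assumptions made for this example.

```python
# Each node is (label, child, child, ...); a bare string is a word (terminal).
TREE = ("S",
        ("NP", ("DetP", "the"), ("N", "boy")),
        ("VP", ("V", "likes"),
               ("NP", ("DetP", "a"), ("N", "girl"))))

def tree_yield(node):
    """Return the words at the leaves of the tree, left to right."""
    if isinstance(node, str):
        return [node]
    label, *children = node
    return [word for child in children for word in tree_yield(child)]

print(" ".join(tree_yield(TREE)))
```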
Formal Definition of a CFG
G = (V, T, P, S)
• V: finite set of nonterminal symbols
• T: finite set of terminal symbols, V and T are disjoint
• P: finite set of productions of the form A → α, A ∈ V and α ∈ (T ∪ V)*
• S ∈ V: start symbol
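The four conditions in the definition can be checked directly. This is a minimal sketch, assuming productions are stored as (left-hand side, right-hand side tuple) pairs; `is_valid_cfg` and the sample sets are hypothetical names, not part of any library.

```python
def is_valid_cfg(V, T, P, S):
    """Check the conditions of the definition G = (V, T, P, S)."""
    if V & T:            # V and T must be disjoint
        return False
    if S not in V:       # the start symbol is a nonterminal
        return False
    symbols = V | T
    # every production A -> alpha has A in V and alpha in (T ∪ V)*
    return all(lhs in V and all(sym in symbols for sym in rhs)
               for lhs, rhs in P)

V = {"S", "NP", "VP", "DetP", "N", "V"}
T = {"the", "a", "boy", "girl", "likes"}
P = [("S", ("NP", "VP")), ("VP", ("V", "NP")), ("NP", ("DetP", "N")),
     ("DetP", ("the",)), ("DetP", ("a",)), ("N", ("boy",)),
     ("N", ("girl",)), ("V", ("likes",))]

print(is_valid_cfg(V, T, P, "S"))
```

An empty right-hand side is allowed by the definition, since α ranges over (T ∪ V)*, and the check above accepts it.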
Context?
• The notion of context in CFGs has nothing to do with the ordinary meaning of the word context in language
• All it really means is that the non-terminal on the left-hand side of a rule is out there all by itself (free of context)
  A -> B C
  Means that I can rewrite an A as a B followed by a C regardless of the context in which A is found
Key Constituents (English)
• Sentences
• Noun phrases
• Verb phrases
• Prepositional phrases
Sentence-Types
• Declaratives: I do not.
  S -> NP VP
• Imperatives: Go dogs! Go!
  S -> VP
• Yes-No Questions: Do you like my hat?
  S -> Aux NP VP
• WH Questions: What are they going to do?
  S -> WH Aux NP VP
NPs
• NP -> Pronoun
  • I came, you saw it, they conquered
• NP -> Proper-Noun
  • New Jersey is west of New York City
  • Lee Bollinger is the president of Columbia
• NP -> Det Noun
  • The president
• NP -> Nominal
• Nominal -> Noun Noun
  • A morning flight to Denver
What other types of nominals do you find in English? Give examples.
PPs
• PP -> Preposition NP
  • Over the house
  • Under the house
  • To the tree
  • At play
  • At a party on a boat at night
It is hot out here in the sun. It is not hot here under the house. What is “here”?
Recursion
• We’ll have to deal with rules such as the following where the non-terminal on the left also appears somewhere on the right
  (directly)
  NP -> NP PP    [[The flight] [to Boston]]
  VP -> VP PP    [[departed Miami] [at noon]]
  (indirectly)
  NP -> NP Srel
  Srel -> NP VP    [[the dog] [[the cat] likes]]
Recursion
• Of course, this is what makes syntax interesting
  The dog bites
  The dog the mouse bit bites
  The dog the mouse the cat ate bit bites
Recursion
[[Flights] [from Denver]]
[[[Flights] [from Denver]] [to Miami]]
[[[[Flights] [from Denver]] [to Miami]] [in February]]
[[[[[Flights] [from Denver]] [to Miami]] [in February]] [on a Friday]]
Etc.
NP -> NP PP
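Each application of NP -> NP PP wraps the noun phrase built so far together with one more prepositional phrase. A minimal sketch of that left-recursive growth, with the hypothetical helper `stack_pps`:

```python
def stack_pps(head, pps):
    """Apply NP -> NP PP once per PP and return every intermediate bracketing."""
    np = f"[{head}]"
    bracketings = []
    for pp in pps:
        np = f"[{np} [{pp}]]"  # the old NP becomes the left daughter of a new NP
        bracketings.append(np)
    return bracketings

for b in stack_pps("Flights", ["from Denver", "to Miami", "in February"]):
    print(b)
```

Because the rule can apply any number of times, the set of noun phrases it licenses is unbounded.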
Implications of Recursion and Context-Freeness
• VP -> V NP
• (I) hate
  flights from Denver
  flights from Denver to Miami
  flights from Denver to Miami in February
  flights from Denver to Miami in February on a Friday
  flights from Denver to Miami in February on a Friday under $300
  flights from Denver to Miami in February on a Friday under $300 with lunch
• This is why context-free grammars are appealing! If you have a rule like VP -> V NP
  • It only cares that the thing after the verb is an NP
  • It doesn’t have to know about the internal affairs of that NP
Grammar Equivalence
• Can have different grammars that generate same set of strings (weak equivalence)
  • Grammar 1: NP → DetP N and DetP → a | the
  • Grammar 2: NP → a N | NP → the N
• Can have different grammars that have same set of derivation trees (strong equivalence)
  • With CFGs, possible only with useless rules
  • Grammar 2: NP → a N | NP → the N
  • Grammar 3: NP → a N | NP → the N, DetP → many
• Strong equivalence implies weak equivalence
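Weak equivalence of Grammar 1 and Grammar 2 can be checked by enumerating both languages and comparing the sets. The sketch below assumes the grammars are non-recursive (so the languages are finite) and borrows N → boy | girl from the earlier fragment to get actual word strings; `language`, `G1`, and `G2` are illustrative names.

```python
from itertools import product

def language(grammar, symbol):
    """All word tuples derivable from symbol; assumes no recursive rules."""
    if symbol not in grammar:  # terminal symbol
        return {(symbol,)}
    result = set()
    for rhs in grammar[symbol]:
        parts = [language(grammar, s) for s in rhs]
        for combo in product(*parts):  # one choice per right-hand-side symbol
            result.add(tuple(w for part in combo for w in part))
    return result

N_RULES = {"N": [["boy"], ["girl"]]}  # assumption: taken from the CFG fragment
G1 = {"NP": [["DetP", "N"]], "DetP": [["a"], ["the"]], **N_RULES}
G2 = {"NP": [["a", "N"], ["the", "N"]], **N_RULES}

print(language(G1, "NP") == language(G2, "NP"))
```

The two grammars generate the same strings but assign different trees (Grammar 1 has a DetP node, Grammar 2 does not), so they are weakly but not strongly equivalent.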
Normal Forms &c
• There are weakly equivalent normal forms (Chomsky Normal Form, Greibach Normal Form)
• There are ways to eliminate useless productions and so on
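Chomsky Normal Form
A CFG is in Chomsky Normal Form (CNF) if all productions are of one of two forms:
• A → B C with A, B, C nonterminals
• A → a, with A a nonterminal and a a terminal
Every CFG has a weakly equivalent CFG in CNF (up to the empty string, which these two rule forms cannot derive)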
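The two permitted rule shapes translate into a short check. This is a sketch under the assumption that productions are (left-hand side, right-hand side list) pairs; `is_cnf` and the sample rule lists are hypothetical.

```python
def is_cnf(productions, nonterminals):
    """True if every production is A -> B C (nonterminals) or A -> a (terminal)."""
    for lhs, rhs in productions:
        binary = len(rhs) == 2 and all(s in nonterminals for s in rhs)
        lexical = len(rhs) == 1 and rhs[0] not in nonterminals
        if not (binary or lexical):
            return False
    return True

NONTERMS = {"S", "NP", "VP", "DetP", "N", "V", "AdjP", "Adj", "Adv"}
CNF_RULES = [("S", ["NP", "VP"]), ("VP", ["V", "NP"]), ("NP", ["DetP", "N"]),
             ("DetP", ["the"]), ("N", ["boy"]), ("V", ["likes"])]
# AdjP -> Adj is a unit production, which CNF does not allow
WITH_UNIT_RULE = CNF_RULES + [("AdjP", ["Adj"]), ("Adj", ["small"])]

print(is_cnf(CNF_RULES, NONTERMS))
print(is_cnf(WITH_UNIT_RULE, NONTERMS))
```

The second call shows why the English fragment used earlier is not in CNF as written: AdjP → Adj rewrites one nonterminal as a single nonterminal.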
“Generative Grammar”
• Formal languages: formal device to generate a set of strings (such as a CFG)
• Linguistics (Chomskyan linguistics in particular): approach in which a linguistic theory enumerates all possible strings/structures in a language (= competence)
• Chomskyan theories do not really use formal devices: they use CFG + informally defined transformations
Nobody Uses Simple CFGs (Except Intro NLP Courses)
• All major syntactic theories (Chomsky, LFG, HPSG, TAG-based theories) represent both phrase structure and dependency, in one way or another
• All successful parsers currently use statistics about phrase structure and about dependency
• Derive dependency through “head percolation”: for each rule, say which daughter is head
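Head percolation can be sketched as a recursive walk over a phrase-structure tree: each phrase passes up the head word of its head daughter, and the other daughters' head words become its dependents. The head table `HEAD_CHILD`, the nested-tuple tree encoding, and the function name are assumptions for this toy example, not a real head percolation table.

```python
# Hypothetical head table: for each phrase label, the label of its head daughter.
HEAD_CHILD = {"S": "VP", "VP": "V", "NP": "N"}

def head_and_deps(node, deps):
    """Return the head word of node; append (head, dependent) pairs to deps."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]  # preterminal: its word is its own head
    words = [head_and_deps(child, deps) for child in children]
    # index of the daughter named as head for this phrase label
    h = next(i for i, child in enumerate(children)
             if child[0] == HEAD_CHILD[label])
    deps.extend((words[h], w) for i, w in enumerate(words) if i != h)
    return words[h]

TREE = ("S",
        ("NP", ("DetP", "the"), ("N", "boy")),
        ("VP", ("V", "likes"),
               ("NP", ("DetP", "a"), ("N", "girl"))))

deps = []
root = head_and_deps(TREE, deps)
print(root, deps)
```

For this tree the walk recovers the dependency structure shown earlier: likes governs boy and girl, and each noun governs its determiner.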
Massive Ambiguity of Syntax
• For a standard sentence, and a grammar with wide coverage, there are 1000s of derivations!
• Example:
  • The large portrait painter told the delegation that he sent money orders in a letter on Wednesday
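How quickly derivations multiply can be seen even in a toy grammar. The sketch below counts parse trees with a CKY-style chart for a grammar in Chomsky Normal Form; the grammar, the lexicon, and the test sentences are made up for illustration and are far smaller than a wide-coverage grammar.

```python
from collections import defaultdict

def count_parses(words, lexical, binary, start="S"):
    """Count parse trees for words under a CNF grammar (CKY-style chart)."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, A) -> number of trees for A over words[i:j]
    for i, w in enumerate(words):
        for cat in lexical.get(w, []):
            chart[i, i + 1, cat] += 1
    for width in range(2, n + 1):          # shorter spans are filled first
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):      # split point
                for parent, left, right in binary:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]

# Toy grammar (assumption): PPs may attach to a VP or to an NP.
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
          ("NP", "NP", "PP"), ("PP", "P", "NP")]
LEXICAL = {"I": ["NP"], "elves": ["NP"], "kitchens": ["NP"],
           "Wednesdays": ["NP"], "saw": ["V"], "in": ["P"], "on": ["P"]}

print(count_parses("I saw elves in kitchens".split(), LEXICAL, BINARY))
print(count_parses("I saw elves in kitchens on Wednesdays".split(), LEXICAL, BINARY))
```

Each added prepositional phrase increases the number of attachment sites, so the count grows quickly with sentence length.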
Penn Treebank (PTB)
• Syntactically annotated corpus of newspaper texts (phrase structure)
• The newspaper texts are naturally occurring data, but the PTB is not!
• PTB annotation represents a particular linguistic theory (but a fairly “vanilla” one)
• Particularities
  • Very indirect representation of grammatical relations (need for head percolation tables)
  • Completely flat structure in NP (brown bag lunch, pink-and-yellow child seat)
  • Has flat Ss, flat VPs
Example from PTB
( (S (NP-SBJ It)
     (VP 's
         (NP-PRD (NP (NP the latest investment craze)
                     (VP sweeping
                         (NP Wall Street)))
                 :
                 (NP (NP a rash)
                     (PP of
                         (NP (NP new closed-end country funds)
                             ,
                             (NP (NP those
                                     (ADJP publicly traded)
                                     portfolios)
                                 (SBAR (WHNP-37 that)
                                       (S (NP-SBJ *T*-37)
                                          (VP invest
                                              (PP-CLR in
                                                  (NP (NP stocks)
                                                      (PP of
                                                          (NP a single foreign country)))))))))))
Types of syntactic constructions
• Is this the same construction?
  • An elf decided to clean the kitchen
  • An elf seemed to clean the kitchen
  An elf cleaned the kitchen
• Is this the same construction?
  • An elf decided to be in the kitchen
  • An elf seemed to be in the kitchen
  An elf was in the kitchen
Types of syntactic constructions (ctd)
• Is this the same construction?
  There is an elf in the kitchen
  • There decided to be an elf in the kitchen
  • There seemed to be an elf in the kitchen
• Is this the same construction?
  It is raining/it rains
  • It decided to rain/be raining
  • It seemed to rain/be raining
Types of syntactic constructions (ctd)
• Is this the same construction?
  • An elf decided that he would clean the kitchen
  • An elf seemed that he would clean the kitchen
  An elf cleaned the kitchen
Types of syntactic constructions (ctd)
Conclusion:
• to seem: whatever is embedded surface subject can appear in upper clause
• to decide: only full nouns that are referential can appear in upper clause
• Two types of verbs
Types of syntactic constructions: Analysis
[Two phrase-structure trees]
• to decide: [S [NP an elf] [VP [V to decide] [S [NP an elf] [VP [V to be] [PP in the kitchen]]]]]
• to seem: [S [VP [V to seem] [S [NP an elf] [VP [V to be] [PP in the kitchen]]]]]
Types of syntactic constructions: Analysis
[Two phrase-structure trees]
• decided: [S [NP an elf] [VP [V decided] [S [NP PRO] [VP [V to be] [PP in the kitchen]]]]]
• seemed: [S [VP [V seemed] [S [NP an elf] [VP [V to be] [PP in the kitchen]]]]]
Types of syntactic constructions: Analysis
[Two phrase-structure trees]
• decided: [S [NP an elf] [VP [V decided] [S [NP PRO] [VP [V to be] [PP in the kitchen]]]]]
• seemed: [S [NPi an elf] [VP [V seemed] [S [NP ti] [VP [V to be] [PP in the kitchen]]]]]
Types of syntactic constructions: Analysis
• to seem: lower surface subject raises to upper clause; raising verb
  seems (there to be an elf in the kitchen)
  there seems (t to be an elf in the kitchen)
  it seems (there is an elf in the kitchen)
Types of syntactic constructions: Analysis (ctd)
• to decide: subject is in upper clause and co-refers with an empty subject in lower clause; control verb
  an elf decided (an elf to clean the kitchen)
  an elf decided (PRO to clean the kitchen)
  an elf decided (he cleans/should clean the kitchen)
  *it decided (an elf cleans/should clean the kitchen)
Lessons Learned from the Raising/Control Issue
• Use distribution of data to group phenomena into classes
• Use different underlying structure as basis for explanations
• Allow things to “move” around from underlying structure -> transformational grammar
• Check whether explanation you give makes predictions
Examples from PTB
(S (NP-SBJ-1 The ropes) (VP seem (S (NP-SBJ *-1) (VP to (VP make (NP much sound))))))
(S (NP-SBJ-1 The ancient church vicar) (VP refuses (S (NP-SBJ *-1) (VP to (VP talk (PP-CLR about (NP it)))))
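Bracketed trees of this kind are easy to read into nested lists. The sketch below is a minimal reader, not the official treebank tooling; `parse_bracketing` and `leaves` are hypothetical names, and the input string is the first example with spaces restored.

```python
import re

def parse_bracketing(text):
    """Turn a bracketed tree into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            node = []
            while tokens[pos] != ")":
                node.append(read())
            pos += 1                      # consume ")"
            return node
        token = tokens[pos]
        pos += 1
        return token

    return read()

def leaves(node):
    """Words and empty elements at the leaves, left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in leaves(child)]

RAISING = ("(S (NP-SBJ-1 The ropes) (VP seem "
           "(S (NP-SBJ *-1) (VP to (VP make (NP much sound))))))")
tree = parse_bracketing(RAISING)
print(leaves(tree))
```

The leaf *-1 is the empty lower subject, coindexed with NP-SBJ-1 in the upper clause. A bracketing with missing closing parentheses makes this reader raise an IndexError instead of guessing the structure.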
The Big Picture
[Figure: Empirical Matter, Formalisms, and Linguistic Theory, linked by arrows labeled "uses", "descriptive theory is about", "or", "explanatory theory is about", and "predicts"]
Empirical Matter
• Maud expects there to be a riot
• *Teri promised there to be a riot
• Maud expects the shit to hit the fan
• *Teri promised the shit to hit the fan
Formalisms
• Data structures • Formalisms • Algorithms • Distributional Models
Linguistic Theory
Content: Relate morphology to semantics
• Surface representation (eg, ps)
• Deep representation (eg, dep)
• Correspondence