Lexical Semantics, Distributions, Predicate-Argument Structure, and Frame Semantic Parsing 11-711 Algorithms for NLP 29 November 2016 (With thanks to Noah Smith and Lori Levin)


11-711 Course Context

• Previous semantics lectures discussed composing meanings of parts to produce the correct global sentence meaning
– The mailman bit my dog.

• The "atomic units" of meaning have come from the lexical entries for words

• The meanings of words have been overly simplified (as in FOL): atomic objects in a set-theoretic model

Word Sense

• Instead, a bank can hold the investments in a custodial account in the client's name.

• But as agriculture burgeons on the east bank, the river will shrink even more.

• While some banks furnish sperm only to married women, others are much less restrictive.

• The bank is near the corner of Forbes and Murray.

Four Meanings of "Bank"
• Synonyms:
• bank1 = "financial institution"
• bank2 = "sloping mound"
• bank3 = "biological repository"
• bank4 = "building where a bank1 does its business"

• The connections between these different senses vary from practically none (homonymy) to related (polysemy).
– The relationship between the senses bank4 and bank1 is called metonymy.

Antonyms

• White/black, tall/short, skinny/American, …
• But different dimensions possible:
– White/Black vs. White/Colorful
– Often culturally determined

• Partly interesting because automatic methods have trouble separating these from synonyms
– Same semantic field

How Many Senses?

• This is a hard question, due to vagueness.

Ambiguity vs. Vagueness

• Lexical ambiguity: My wife has two kids (children or goats?)

• vs. Vagueness: 1 sense, but indefinite: horse (mare, colt, filly, stallion, …) vs. kid:
– I have two horses and George has three
– I have two kids and George has three

• Verbs too: I ran last year and George did too
• vs. Reference: I, here, the dog: not considered ambiguous in the same way

How Many Senses?

• This is a hard question, due to vagueness.
• Considerations:
– Truth conditions (serve meat / serve time)
– Syntactic behavior (serve meat / serve as senator)
– Zeugma test:
• #Does United serve breakfast and Pittsburgh?
• ??She poaches elephants and pears.

Related Phenomena

• Homophones (would/wood, two/too/to)
– Mary, merry, marry in some dialects, not others

• Homographs (bass/bass)

Word Senses and Dictionaries

Ontologies

• For NLP, databases of word senses are typically organized by lexical relations such as hypernym (IS-A) into a DAG

• This has been worked on for quite a while
• Aristotle's classes (about 330 BC)
– substance (physical objects)
– quantity (e.g., numbers)
– quality (e.g., being red)
– Others: relation, place, time, position, state, action, affection

Word senses in WordNet 3.0

Synsets

• (bass6, bass-voice1, basso2)
• (bass1, deep6) (Adjective)

• (chump1, fool2, gull1, mark9, patsy1, fall guy1, sucker1, soft touch1, mug2)

"Rough" Synonymy

• Jonathan Safran Foer's Everything Is Illuminated

Noun relations in WordNet 3.0

Is a hamburger food?
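The IS-A organization makes this kind of category question mechanical: a hamburger is food if "food" is reachable by following hypernym links upward. A minimal sketch, using a hand-made toy taxonomy (the entries are illustrative, not WordNet's actual graph):

```python
# Toy IS-A (hypernym) DAG: word -> list of direct hypernyms.
# Invented for illustration; WordNet's real graph is much larger.
ISA = {
    "hamburger": ["sandwich"],
    "sandwich": ["snack_food", "dish"],
    "snack_food": ["food"],
    "dish": ["food"],
    "food": ["substance"],
}

def is_a(word, category):
    """True if `category` is reachable from `word` via hypernym links."""
    if word == category:
        return True
    return any(is_a(h, category) for h in ISA.get(word, []))

print(is_a("hamburger", "food"))  # → True
```

Note that the traversal must follow all hypernym parents, since WordNet is a DAG rather than a tree: "sandwich" here reaches "food" through two different paths.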

Verb relations in WordNet 3.0

• Not nearly as much information as for nouns

Frame-based Knowledge Rep.

• Organize relations around concepts
• Equivalent to (or weaker than) FOPC

(Image from futurehumanevolution.com)

Word similarity

• Human-language words seem to have real-valued semantic distance (vs. logical objects)

• Two main approaches:
– Thesaurus-based methods
• E.g., WordNet-based
– Distributional methods
• Distributional "semantics", vector "semantics"
• More empirical, but affected by more than semantic similarity ("word relatedness")

Human-subject Word Associations

Stimulus: wall
Number of different answers: 39
Total count of all answers: 98
BRICK 16 (0.16), STONE 9 (0.09), PAPER 7 (0.07), GAME 5 (0.05), BLANK 4 (0.04), BRICKS 4 (0.04), FENCE 4 (0.04), FLOWER 4 (0.04), BERLIN 3 (0.03), CEILING 3 (0.03), HIGH 3 (0.03), STREET 3 (0.03), ...

Stimulus: giraffe
Number of different answers: 26
Total count of all answers: 98
NECK 33 (0.34), ANIMAL 9 (0.09), ZOO 9 (0.09), LONG 7 (0.07), TALL 7 (0.07), SPOTS 5 (0.05), LONG NECK 4 (0.04), AFRICA 3 (0.03), ELEPHANT 2 (0.02), HIPPOPOTAMUS 2 (0.02), LEGS 2 (0.02), ...

From the Edinburgh Word Association Thesaurus, http://www.eat.rl.ac.uk/

Thesaurus-based Word Similarity

• Simplest approach: path length

• Better approach: weighted links
• Use corpus stats to get probabilities of nodes
• Refinement: use the information content of the LCS (lowest common subsumer); e.g., for hill and coast, whose LCS is geological-formation:

2 * log P(geological-formation) / (log P(hill) + log P(coast)) = 0.59
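This is Lin's information-content similarity. A minimal sketch; the node probabilities below are illustrative values chosen so that the hill/coast pair reproduces the slide's 0.59 (real values come from corpus counts over a taxonomy like WordNet):

```python
import math

# Corpus-derived node probabilities (illustrative values chosen to
# reproduce the slide's hill/coast example, not measured here).
P = {
    "geological-formation": 0.00176,  # the LCS of hill and coast
    "hill": 0.0000189,
    "coast": 0.0000216,
}

def lin_similarity(a, b, lcs):
    """Lin (1998): sim = 2 * log P(LCS) / (log P(a) + log P(b))."""
    return 2 * math.log(P[lcs]) / (math.log(P[a]) + math.log(P[b]))

print(round(lin_similarity("hill", "coast", "geological-formation"), 2))  # → 0.59
```

The intuition: the more informative (lower-probability) the shared ancestor is relative to the two words themselves, the more similar the words.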

Distributional Word Similarity
• Determine similarity of words by their distribution in a corpus
– "You shall know a word by the company it keeps!" (Firth, 1957)

• E.g.: a 100k-dimension vector, "1" if the word occurs within "2 lines":

• "Who is my neighbor?" Which functions?

Who is my neighbor?
• Linear window? 1-500 words wide. Or the whole document. Remove stopwords?

• Use dependency-parse relations? More expensive, but maybe better relatedness.
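Either neighborhood definition reduces to counting. A minimal linear-window sketch (the window size and the toy sentence are arbitrary choices for illustration):

```python
from collections import Counter, defaultdict

def cooccurrence(tokens, window=2):
    """Build sparse count vectors: how often each context word occurs
    within +/- `window` positions of each target word."""
    vectors = defaultdict(Counter)
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                vectors[target][tokens[j]] += 1
    return vectors

# Toy corpus (invented); real vectors come from millions of tokens.
toks = "the cat sat on the mat the dog sat on the rug".split()
v = cooccurrence(toks)
print(v["sat"]["the"], v["sat"]["dog"])  # → 4 1
```

Swapping the window loop for dependency-parse neighbors changes only which (target, context) pairs get counted; the rest of the pipeline is identical.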

Weights vs. just counting

• Weight the counts by the a priori chance of co-occurrence

• Pointwise Mutual Information (PMI)
• Objects of drink:
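A sketch of PMI over raw co-occurrence counts; the verb-object counts below are invented for illustration, not from a real corpus:

```python
import math
from collections import Counter

def pmi(pair_counts, total):
    """Pointwise mutual information from raw co-occurrence counts:
    pmi(w, c) = log2( P(w, c) / (P(w) * P(c)) ), i.e. how much more
    often w and c co-occur than chance would predict."""
    w_counts, c_counts = Counter(), Counter()
    for (w, c), n in pair_counts.items():
        w_counts[w] += n
        c_counts[c] += n
    return {
        (w, c): math.log2((n / total) / ((w_counts[w] / total) * (c_counts[c] / total)))
        for (w, c), n in pair_counts.items()
    }

# Hypothetical verb-object counts, in the spirit of "objects of drink":
counts = {("drink", "beer"): 8, ("drink", "water"): 12,
          ("see", "beer"): 1, ("see", "water"): 19}
scores = pmi(counts, total=sum(counts.values()))
print(scores[("drink", "beer")] > scores[("see", "beer")])  # → True
```

High-frequency words like "see" co-occur with everything, so raw counts overstate their association; PMI corrects for exactly that.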

Distance between vectors

• Compare sparse high-dimensional vectors
– Normalize for vector length

• Just use vector cosine?
• Several other functions come from the IR community
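A sketch of cosine over sparse vectors stored as dicts; the context counts below are toy numbers, not corpus data:

```python
import math

def cosine(u, v):
    """Cosine between two sparse count vectors stored as dicts.
    Dividing by the norms normalizes for vector length, so frequent
    and rare words stay comparable."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

# Toy context-count vectors (invented numbers):
apple = {"eat": 10, "tree": 5, "pie": 8}
pear  = {"eat": 7, "tree": 6, "pie": 2}
print(round(cosine(apple, pear), 3))  # → 0.894
```

The `u.keys() & v.keys()` intersection is what makes this practical for sparse 100k-dimension vectors: only shared contexts contribute to the dot product.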

Lots of functions to choose from

Distributionally Similar Words

rum, vodka, cognac, brandy, whisky, liquor, detergent, cola, gin, lemonade, cocoa, chocolate, scotch, noodle, tequila, juice

write, read, speak, present, receive, call, release, sign, offer, know, accept, decide, issue, prepare, consider, publish

ancient, old, modern, traditional, medieval, historic, famous, original, entire, main, Indian, various, single, African, Japanese, giant

mathematics, physics, biology, geology, sociology, psychology, anthropology, astronomy, arithmetic, geography, theology, Hebrew, economics, chemistry, scripture, biotechnology

(From an implementation of the method described in Lin, 1998, "Automatic Retrieval and Clustering of Similar Words", COLING-ACL. Trained on newswire text.)


Recent events (2013-now)

• RNNs (Recurrent Neural Networks) as another way to get feature vectors
– Hidden weights accumulate fuzzy info on words in the neighborhood
– The set of hidden weights is used as the vector!
• Composition by multiplying (etc.)
– Mikolov et al. (2013): "king – man + woman = queen" (!?)
– CCG with vectors as NP semantics, matrices as verb semantics (!?)
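The vector-offset analogy can be illustrated mechanically. The 3-d vectors below are invented so the offsets line up; real word2vec embeddings are learned, have hundreds of dimensions, and only approximately satisfy the analogy:

```python
import math

# Toy embeddings (dimensions loosely: royalty, masculine, feminine).
# Invented for illustration, not learned from data.
E = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "boy":    [0.1, 0.8, 0.2],
    "banana": [0.0, 0.2, 0.2],
}

def cos(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# king - man + woman, then find the nearest remaining vocabulary word.
target = [k - m + w for k, m, w in zip(E["king"], E["man"], E["woman"])]
best = max((w for w in E if w not in {"king", "man", "woman"}),
           key=lambda w: cos(E[w], target))
print(best)  # → queen
```

Excluding the three query words from the candidate set is standard practice, since the nearest neighbor of king − man + woman is often king itself.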

RNNs

(RNN diagram from openi.nlm.nih.gov)

Semantic Processing [2]

Semantic Cases / Thematic Roles

• Developed in the late 1960s and 1970s
• Postulate a limited set of abstract semantic relationships between a verb and its arguments: thematic roles or case roles

• In some sense, part of the verb's semantics


Thematic Role example

• John broke the window with the hammer
• John: AGENT role; window: THEME role; hammer: INSTRUMENT role

• Extend LF notation to use semantic roles


Thematic Roles

• Is there a precise way to define the meaning of AGENT, THEME, etc.?

• By definition:
– "The AGENT is an instigator of the action described by the sentence."

• Testing via sentence rewrite:
– John intentionally broke the window
– *The hammer intentionally broke the window


Thematic Roles [2]

• THEME
– Describes the primary object undergoing some change or being acted upon
– For transitive verb X, "what was Xed?"
– The gray eagle saw the mouse. "What was seen?" (A: the mouse)

Breaking, Eating, Opening
• John broke the window.
• The window broke.
• John is always breaking things.

• We ate dinner.
• We already ate.
• The pies were eaten up quickly.

• Open up!
• Someone left the door open.
• John opens the window at night.


breaker, broken thing, breaking frequency?

eater, eaten thing, eating speed?

opener, opened thing, opening time?

Can We Generalize?

• Thematic roles describe general patterns of participants in generic events.

• This gives us a kind of shallow, partial semantic representation.

• First proposed by Panini, before 400 BC!

Thematic Roles

Role         Definition                             Example
Agent        Volitional causer of the event         The waiter spilled the soup.
Force        Non-volitional causer of the event     The wind blew the leaves around.
Experiencer  The experiencer of the event           Mary has a headache.
Theme        Most directly affected participant     Mary swallowed the pill.
Result       End-product of an event                We constructed a new building.
Content      Proposition of a propositional event   Mary knows you hate her.
Instrument   An instrument used in the event        You shot her with a pistol.
Beneficiary  The beneficiary of the event           I made you a reservation.
Source       Origin of a transferred thing          I flew in from Pittsburgh.
Goal         Destination of a transferred thing     Go to hell!

Thematic Grid or Case Frame

• Example: break
– The child broke the vase. <agent theme> = <subj obj>
– The child broke the vase with a hammer. <agent theme instr> = <subj obj PP>
– The hammer broke the vase. <theme instr> = <obj subj>
– The vase broke. <theme> = <subj>

Thematic Grid or Case Frame

The Thematic Grid or Case Frame shows:
• How many arguments the verb has
• What roles the arguments have
• Where to find each argument

• For example, you can find the agent in the subject position

Diathesis Alternation: a change in the number of arguments or the grammatical relations associated with each argument

• Chris gave a book to Dana. <agent theme goal> = <subj obj PP>
• A book was given to Dana by Chris. <agent theme goal> = <PP subj PP>
• Chris gave Dana a book. <agent theme goal> = <subj obj2 obj>
• Dana was given a book by Chris. <agent theme goal> = <PP obj subj>

The Trouble With Thematic Roles

• They are not formally defined.
• They are overly general.
• "agent verb theme with instrument" and "instrument verb theme" ...
– The cook opened the jar with the new gadget.
→ The new gadget opened the jar.
– Susan ate the sliced banana with a fork.
→ #The fork ate the sliced banana.

Two Datasets

• Proposition Bank (PropBank): verb-specific thematic roles

• FrameNet: "frame"-specific thematic roles

• These are lexicons containing case frames / thematic grids for each verb.

Proposition Bank (PropBank)

• A set of verb-sense-specific "frames" with informal English glosses describing the roles

• Conventions for labeling optional modifier roles

• The Penn Treebank is labeled with those verb-sense-specific semantic roles.

"Agree" in PropBank

• arg0: agreer
• arg1: proposition
• arg2: other entity agreeing

• The group agreed it wouldn't make an offer.
• Usually John agrees with Mary on everything.

"Fall (move downward)" in PropBank

• arg1: logical subject, patient, thing falling
• arg2: extent, amount fallen
• arg3: starting point
• arg4: ending point
• argM-loc: medium
• Sales fell to $251.2 million from $278.8 million.
• The average junk bond fell by 4.2%.
• The meteor fell through the atmosphere, crashing into Cambridge.

FrameNet

• FrameNet is similar, but abstracts away from specific verbs, so that semantic frames are first-class citizens.

• For example, there is a single frame called change_position_on_a_scale.

change_position_on_a_scale

Oil rose in price by 2%.
It has increased to having them 1 day a month.
Microsoft shares fell to 7 5/8.
Colon cancer incidence fell by 50% among men.

Many words, not just verbs, share the same frame:

Verbs: advance, climb, decline, decrease, diminish, dip, double, drop, dwindle, edge, explode, fall, fluctuate, gain, grow, increase, jump, move, mushroom, plummet, reach, rise, rocket, shift, skyrocket, slide, soar, swell, swing, triple, tumble
Nouns: decline, decrease, escalation, explosion, fall, fluctuation, gain, growth, hike, increase, rise, shift, tumble
Adverb: increasingly

Conversely, one word has many frames. Example: rise

• Change-position-on-a-scale: Oil ROSE in price by two percent.
• Change-posture: a Protagonist changes the overall position or posture of a body.
– Source: starting point of the change of posture.
– Charles ROSE from his armchair.
• Get-up: A Protagonist leaves the place where they have slept, their Bed, to begin or resume domestic, professional, or other activities. Getting up is distinct from Waking up, which is concerned only with the transition from the sleeping state to a wakeful state.
– I ROSE from bed, threw on a pair of camouflage shorts and drove my little Toyota Corolla to a construction clearing a few miles away.
• Motion-directional: In this frame a Theme moves in a certain Direction which is often determined by gravity or other natural, physical forces. The Theme is not necessarily a self-mover.
– The balloon ROSE upward.
• Sidereal-appearance: An Astronomical_entity comes into view above the horizon as part of a regular, periodic process of (apparent) motion of the Astronomical_entity across the sky. In the case of the sun, the appearance begins the day.
– At the time of the new moon, the moon RISES at about the same time the sun rises, and it sets at about the same time the sun sets. Each day the sun's RISE offers us a new day.

FrameNet
• Frames are not just for verbs!
• Verbs: advance, climb, decline, decrease, diminish, dip, double, drop, dwindle, edge, explode, fall, fluctuate, gain, grow, increase, jump, move, mushroom, plummet, reach, rise, rocket, shift, skyrocket, slide, soar, swell, swing, triple, tumble

• Nouns: decline, decrease, escalation, explosion, fall, fluctuation, gain, growth, hike, increase, rise, shift, tumble

• Adverb: increasingly

FrameNet

• Includes inheritance and causation relationships among frames.

• Examples are included, but little fully-annotated corpus data.

SemLink
• It would be really useful if these different resources were interconnected in a useful way.

• The SemLink project is (was?) trying to do that
• The Unified Verb Index (UVI) connects:
– PropBank
– VerbNet
– FrameNet
– WordNet / OntoNotes

Semantic Role Labeling

• Input: sentence
• Output: for each predicate*, labeled spans identifying each of its arguments.

• Example: [agent The batter] hit [patient the ball] [time yesterday]

• Somewhere between syntactic parsing and full-fledged compositional semantics.

*Predicates are sometimes identified in the input, sometimes not.

But wait. How is this different from dependency parsing?

• Semantic role labeling
– [agent The batter] hit [patient the ball] [time yesterday]

• Dependency parsing
– [subj The batter] hit [obj the ball] [mod yesterday]


1. These are not the same task.
2. Semantic role labeling is much harder.

Subject vs. agent

• Subject is a grammatical relation
• Agent is a semantic role

• In English, a subject has these properties:
– It comes before the verb
– If it is a pronoun, it is in nominative case (in a finite clause)
• I/he/she/we/they hit the ball.
• *Me/him/her/us/them hit the ball.
– If the verb is in present tense, it agrees with the subject
• She/he/it hits the ball.
• I/we/they hit the ball.
• *She/he/it hit the ball.
• *I/we/they hits the ball.
• I hit the ball.
• I hit the balls.

Subject vs. agent

• In the most typical sentences (for some definition of "typical"), the agent is the subject:
– The batter hit the ball.
– Chris opened the door.
– The teacher gave books to the students.

• Sometimes the agent is not the subject:
– The ball was hit by the batter.
– The balls were hit by the batter.

• Sometimes the subject is not the agent:
– The door opened.
– The key opened the door.
– The students were given books.
– Books were given to the students.

Similarities to WSD

• Pick the correct choice from N ambiguous possibilities

• Definitions are not crisp
• Need to pick a labeling scheme and corpus
– Choices have a big effect on performance and usefulness

Semantic Role Labeling

• Input: sentence
• Output: segmentation into roles, with labels

• Example from the book:
• [arg0 The Examiner] issued [arg1 a special edition] [argM-tmp yesterday]

Semantic Role Labeling: How It Works

• First, parse.
• For each predicate word in the parse:
– For each node in the parse:
• Classify the node with respect to the predicate.

Yet Another Classification Problem!

• As before, there are many techniques (e.g., Naïve Bayes)

• Key: what features?

Features for Semantic Role Labeling

• What is the predicate?
• Phrase type of the constituent
• Head word of the constituent, and its POS
• Path in the parse tree from the constituent to the predicate
• Active or passive
• Is the phrase before or after the predicate?
• Subcategorization (≈ grammar rule) of the predicate

Feature example

• Example sentence: [arg0 The Examiner] issued [arg1 a special edition] [argM-tmp yesterday]

• Arg0 features: issued, NP, Examiner, NNP, path, active, before, VP → VBD NP PP
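The feature list above can be bundled into one record per candidate constituent. The constituent and predicate dicts below are hypothetical stand-ins for what a real parser would produce, mirroring the slide's Arg0 example:

```python
# Sketch of per-constituent feature extraction for SRL classification.
# The record formats are hypothetical, not from any particular toolkit.
def srl_features(constituent, predicate):
    return {
        "predicate": predicate["lemma"],       # what is the predicate?
        "phrase_type": constituent["label"],   # e.g. NP
        "head_word": constituent["head"],      # head word of the constituent
        "head_pos": constituent["head_pos"],   # its POS tag
        "path": constituent["path"],           # tree path to the predicate
        "voice": predicate["voice"],           # active or passive
        "position": "before" if constituent["start"] < predicate["index"] else "after",
        "subcat": predicate["subcat"],         # grammar rule of the predicate's VP
    }

examiner_np = {"label": "NP", "head": "Examiner", "head_pos": "NNP",
               "path": "NP↑S↓VP↓VBD", "start": 0}
issued = {"lemma": "issued", "voice": "active", "index": 2,
          "subcat": "VP -> VBD NP PP"}
print(srl_features(examiner_np, issued))
```

Each such dict then becomes one training or test instance for the classifier, labeled arg0, arg1, argM-tmp, or "none".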

Example

Figure 20.16: Parse tree for a PropBank sentence, showing the PropBank argument labels. The dotted line shows the path feature NP ↑ S ↓ VP ↓ VBD for ARG0, the NP-SBJ constituent The San Francisco Examiner.

Additional Issues

• Initial filtering of non-arguments
• Using chunking or partial parsing instead of full parsing

• Enforcing consistency (e.g., non-overlap, only one arg0)

• Phrasal verbs, support verbs / light verbs
– take a nap: the verb take is the syntactic head of the VP, but the predicate is napping, not taking

Two datasets, two systems

• The example from the book uses PropBank

• The locally-developed system SEMAFOR works on the SemEval problem, based on FrameNet

PropBank vs. FrameNet

Shallow approaches to deep problems

• For both WSD and SRL:
– Shallow approaches are much easier to develop
• As in, possible at all for unlimited vocabularies
– Not wonderful performance yet
• Sometimes claimed to help a particular system, but often doesn't seem to help
– Definitions are not crisp
• There clearly is something there, but the granularity of the distinctions is very problematic

• Deep Learning will fix everything?

Questions?

SEMAFOR

• A FrameNet-based semantic role labeling system developed within Noah's research group
‣ It uses a dependency parser (the MSTParser) for preprocessing
‣ Identifies and disambiguates predicates; then identifies and disambiguates each predicate's arguments
‣ Trained on frame-annotated corpora from the SemEval 2007/2010 tasks. Domains: weapons reports, travel guides, news, Sherlock Holmes stories.

Noun compounds
• A very flexible (productive) syntactic structure in English
‣ The noun-noun pattern is easily applied to name new concepts (Web browser) and to disambiguate known concepts (fire truck)
‣ Can also combine two NPs: incumbent protection plan, [undergraduate [[computer science] [lecture course]]]
‣ Sometimes creates ambiguity, especially in writing, where there is no phonological stress: Spanish teacher
‣ People are creative about interpreting even nonsensical compounds

• Also present in many other languages, sometimes with special morphology
‣ German is infamous for loving to merge words into compounds, e.g. Fremdsprachenkenntnisse, 'knowledge of foreign languages'

Noun compounds
• SemEval 2007 task: Classification of Semantic Relations between Nominals
‣ 7 predefined relation types
1. Cause-Effect: flu virus
2. Instrument-User: laser printer
3. Product-Producer: honey bee
4. Origin-Entity: rye whiskey
5. Purpose-Tool: soup pot
6. Part-Whole: car wheel
7. Content-Container: apple basket

• http://nlp.cs.swarthmore.edu/semeval/tasks/task04/description.shtml

Noun compounds
• SemEval 2010 task: Noun compound interpretation using paraphrasing verbs
‣ A dataset was compiled in which subjects were presented with a noun compound and asked to provide a verb describing the relationship
‣ nut bread elicited: contain (21); include (10); be made with (9); have (8); be made from (5); use (3); be made using (3); feature (2); be filled with (2); taste like (2); be made of (2); come from (2); consist of (2); hold (1); be composed of (1); be blended with (1); be created out of (1); encapsulate (1); diffuse (1); be created with (1); be flavored with (1)

• http://semeval2.fbk.eu/semeval2.php?location=tasks#T12

Thesaurus/dictionary-based similarity measures

