
From Simple to Complex QA - NUS

Date post: 11-Oct-2020
From Simple to Complex QA Eduard Hovy CMU Language Technologies Institute www.cs.cmu.edu/~hovy

Webclopedia QA, 2003

• Where are zebras most likely found? A: in the dictionary
• Where do lobsters like to live? A: on the table
• How many people live in Chile? A: nine
• What is an invertebrate? A: Dukakis

Webclopedia (Hovy et al. 2001)


Basic simple factoid QA

Input Q
• Identify keywords from Q
• Build (Boolean) query for IR
• Retrieve texts using IR
• Rank texts/passages
• Find specified Qtype
• Move A patterns over text and score each position
• Rank windows; return top N
A list

Diagram labels: 1M documents → 3000 sentences → 50 candidates → 5 answers
Corpus: 30%
+ Web: add 10%

Example A patterns:
… X was born in <YEAR> …
… X was born on <DATE> …
… X (<YEAR>–<YEAR>) …
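The "move A patterns over text and score each position" step can be sketched roughly as follows. This is a minimal illustration, not the Webclopedia implementation: the regular expressions, the pattern weights, and the toy passages are all assumptions made for the example.

```python
import re
from collections import Counter

YEAR = r"(\d{4})"

def birth_year_patterns(name):
    """Hypothetical answer patterns for a birth-year Qtype, with made-up weights."""
    x = re.escape(name)
    return [
        (re.compile(rf"{x} was born in {YEAR}"), 1.0),
        (re.compile(rf"{x} was born on [^.]*?{YEAR}"), 0.9),
        (re.compile(rf"{x} \({YEAR}\s*[–-]\s*\d{{4}}\)"), 0.8),
    ]

def rank_answers(name, passages, top_n=5):
    """Slide each pattern over every passage, sum weights per candidate, return top N."""
    scores = Counter()
    for passage in passages:
        for pattern, weight in birth_year_patterns(name):
            for match in pattern.finditer(passage):
                scores[match.group(1)] += weight
    return [answer for answer, _ in scores.most_common(top_n)]

# Toy retrieved passages; the last one is deliberately noisy.
passages = [
    "Mozart was born in 1756 in Salzburg.",
    "Mozart (1756–1791) was a prolific composer.",
    "Mozart was born on 27 January 1756.",
    "Some say Mozart was born in 1757.",
]
print(rank_answers("Mozart", passages))
```

Because several patterns agree on the same year, the redundant evidence outweighs the single noisy passage, which is the usual argument for adding a large corpus or the web.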


Where is the Answer? Progress since 2003?

Typical QA format:

Question: Q
Context: “w w w w w w … w”
A:

Either the Q context provides the A
1. nearby (= n-word window) context
2. distant (= doc-level) context

Or not at all… so you have to use background info
3. from the training data
4. from logical derivation / reasoning rules / procedure


Either…

When all info needed to get the A is present in the Q context

…then some form of surface and simple type matching + sub-A composition is enough

→ Ultimately, just do [nested] simple QA

Or…

But when getting the A requires information not in the Q context (like background info, calculation, etc.)

…then you are in trouble: this is not standardized, hence impossible to evaluate

→ No complex QA!?


Outline

1. A in the nearby context
2. A in the distant context
3. A hidden in the training data
4. A only by reasoning


Option 1: A in nearby context

• Build and use short patterns or a rich LM
• Tons of work since 2000 on pattern learning and generalization, QA typologies, etc.
• Numerous QA datasets (TREC, SQuAD, CNN…)
• Many QA competitions (SEMEVAL…)

So where’s the limit?


You can do a LOT with patterns

Did you know you are an expert on the Panama Canal?

Blah Panama Canal blah blah Panama blah Pres. Roosevelt blah USA blah blah blah blah blah 10 years blah until 1914 blah blah blah blah 51 miles blah blah blah blah blah blah blah blah blah blah blah blah blah 8 to 10 hours blah blah blah blah Gatun Lake blah

When was the Panama Canal completed?
How long is the Panama Canal?
How long did it take to build the Panama Canal?
How long does it take to cross the Panama Canal?
What is the lake in the Panama Canal called?
Which US President enabled the Panama Canal?
Which oceans does the Panama Canal connect?
In your training data, you have surely seen “Panama Canal” with only two ocean names…

So where’s the limit?


A corpus to test the power of ngram/pattern QA models

• CLOTH (Xie, Lai, Dai, Hovy, EMNLP 2018)
  – Large-scale Cloze test dataset
  – Created by English teachers in China for English exams (Middle and High school levels)
  – After cleanup: 7k passages; 99k questions (2/3 removed)
  – Dropped words and word options carefully created by teachers: highly nuanced alternatives
  – Tests knowledge of grammar, vocabulary, reasoning
• How well do state-of-the-art computational models do compared to humans?
  – We test using a 1-billion-word language model


• Tense, voice, preps
• Local content words
• Copy/paraphrase words
• Content words, long-distance dependencies

Percentages of test examples, Middle/High school levels [table not preserved]

(Xie et al. 2018)


QA system results

• Even a 1B-LM still lags behind human performance
• Increasing the context length for 1B-LM does not help
• However: human-created questions are different: [results table not preserved]

(AR: Attention Reader)

(This was pre-BERT!)

(Xie et al. 2018)


Conclusion for option 1

For factoid QA types that obey patterns, if the A is close enough, and you have enough training data…
…you will always learn good enough word combination patterns to connect Q parameters <–> Q context material <–> A

(If you haven’t seen the necessary word combinations, you won’t ever be able to answer the Q)


Option 2: A in distant context

• Still use some form of matching Q and A
• Need a more-sophisticated and longer-distance type of ‘pattern’


Making matching more complex: RACE: A better testbed

• RACE: ReAding Comprehension dataset from Examinations (Lai, Xie, Liu, Yang, Hovy, EMNLP 2018)
• Collected from Chinese middle and high school exams that evaluate human students’ English reading comprehension ability
  – Designed by human experts: Ensures quality and broad topic coverage
  – Substantially more difficult than existing QA datasets (but RACE-M easier than RACE-H)
  – About 4/5 of source material filtered out to remove duplicates, incorrect format, etc.
• After cleaning: 27,933 passages; 97,687 questions

(Lai et al. 2018)


Toward ‘reasoning’: types of more-complex matching

• Paraphrasing Qs: test language ability
• Detail Qs: identify and match details of a thing
• Attitude Qs: find opinions/attitudes of the author towards something (‘sentiment’)
• Whole-picture Qs: understand the entire story (multi-sentence)
• Summarization Qs: understand the point (multi-sentence)

Increasing reasoning

(Lai et al. 2018)


Comparison with other QA datasets

• Reasoning questions: 59.2% of RACE; 20.5% of SQuAD
• Processing types:
  – Word matching: exact match
  – Paraphrasing: paraphrase or entailment
  – Single-sent reasoning: incomplete info or conceptual overlap
  – Multi-sent reasoning: synthesizing information from multiple sentences
  – Insufficient/Ambiguous: no A, or A is not unique

(Lai et al. 2018)


Comparing QA algorithms

• Baselines:
  – Sliding Window: TF-IDF based matching algorithm
  – Stanford Attention Reader (AR) and Gated Attention Reader (early-2018 state-of-the-art neural models)
• RACE has more ‘semantics’ (= requires more ‘reasoning’) than other corpora:
  – higher human ceiling
  – harder for neural models

(Lai et al. 2018)
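To make the Sliding Window baseline concrete, here is a simplified sketch of the idea for multiple-choice questions. The inverse-count word weighting, the window size, and the toy passage are assumptions for illustration; the exact scoring used in the RACE experiments may differ.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (w.strip(".,?!\"'").lower() for w in text.split())
    return [t for t in tokens if t]

def sliding_window_score(passage, question, answer):
    """Best overlap between any passage window and the question+answer word set."""
    tokens = tokenize(passage)
    counts = Counter(tokens)
    # Assumed weighting: words that are rare in the passage count for more.
    weight = {w: math.log(1.0 + 1.0 / c) for w, c in counts.items()}
    target = set(tokenize(question)) | set(tokenize(answer))
    size = max(len(target), 1)
    best = 0.0
    for start in range(max(len(tokens) - size + 1, 1)):
        window = tokens[start:start + size]
        best = max(best, sum(weight[w] for w in window if w in target))
    return best

def choose_answer(passage, question, options):
    """Pick the option whose best window score is highest (first option wins ties)."""
    scores = [sliding_window_score(passage, question, o) for o in options]
    return options[scores.index(max(scores))]

passage = ("Tom went to the market in the morning. He bought three red apples. "
           "Later he walked home along the river.")
question = "What did Tom buy at the market?"
options = ["three red apples", "a blue bicycle", "two fish"]
print(choose_answer(passage, question, options))
```

Note that the method never links "buy" to "bought"; it succeeds here only because the correct option's words appear verbatim near question words, which is exactly the kind of surface matching the reasoning-type questions are meant to defeat.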


Matching type performance

• Turkers and Sliding Window are good at simple matching questions
• Surprisingly, Stanford AR does not have better performance on matching questions

(Lai et al. 2018)


Conclusion for option 2

When the A is distant, or requires more-sophisticated matching/‘reasoning’ (not just simple word-string/language model),

then attention-based neural models can do some of it, but still fail with the harder parts


Option 3: A ‘hidden’ in training data

Sometimes the Q context does not contain the A at all… but you can STILL get the right A! (And even get it without the Q itself!)

• Corrupted ngrams and other SQuAD perturbations (Jia and Liang, EMNLP 2017)
• Necessity of Q context or even of Q itself (Kaushik and Lipton, EMNLP 2018, Best Short Paper award)


Example: Q only

Question: shin kanemaru, the gravel-voiced back-room boss who died on thursday aged 81, goes down in history as japan’s most corrupt post-war politician after ___________

Passage: ... glynis bc-nj-zimmer-profile-2takes-nyt rahane fumio yasuhiro dragnea lhadon bjorkman/max ... seventh-largest embarrased jeopardy hilariously masahisa haibara bajram 8-to-24 duke/meredith acceding ... koidu iraqs 2:32:21 //www.ironmanlive.com/ sagawa kyubin dean internatinoal 90-meter kakuei tanaka seven-paragraph 577,610 wendover golf-lpga-jpn partner, un-appointed uemazzei canada-u.s.

Answer: kakuei tanaka

(Kaushik and Lipton, EMNLP 2018)


Do you actually need the context?

• Research goal:
  – How strong are models that see the Q only?
  – What about models that see the Q context passage only?
  – How do we know models are really “reading” the whole passage?
• Question-only setting:
  – If the QA system needs the passage, randomize its words first
  – If just candidate As needed, place them in random spots, fill intervening text with gibberish
• Passage-only setting:
  – ‘Ignore’ the Qs: assign each Q to some random passage

(Kaushik and Lipton EMNLP 2018)
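The two settings amount to simple dataset perturbations. The sketch below shows one way to build them, assuming each example is a dict with hypothetical `passage`, `question`, and `answer` keys; it is not the authors' code, and the toy examples are invented.

```python
import random

def question_only(examples, seed=0):
    """Shuffle each passage's words so that only the question stays informative."""
    rng = random.Random(seed)
    perturbed = []
    for ex in examples:
        words = ex["passage"].split()
        rng.shuffle(words)
        perturbed.append({**ex, "passage": " ".join(words)})
    return perturbed

def passage_only(examples, seed=0):
    """Reassign questions at random across passages so the question carries no signal."""
    rng = random.Random(seed)
    questions = [ex["question"] for ex in examples]
    rng.shuffle(questions)
    return [{**ex, "question": q} for ex, q in zip(examples, questions)]

# Invented toy examples, for illustration only.
examples = [
    {"passage": "the red box is on the table", "question": "where is the red box", "answer": "table"},
    {"passage": "anna gave the book to ben", "question": "who got the book", "answer": "ben"},
    {"passage": "the train left at noon", "question": "when did the train leave", "answer": "noon"},
]
q_only = question_only(examples)
p_only = passage_only(examples)
```

If a model trained and tested on `q_only` or `p_only` stays close to its full-input accuracy, the dataset is leaking the answer through something other than reading comprehension.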


Experiments

• Datasets/tests:
  – Span selection: SQuAD, TriviaQA
  – Cloze queries: Children’s Book Test (CBT), CNN, CLOTH, Who-did-What, DailyMail
  – Multi-class classification (implicit): bAbI (20 tasks)
  – Multiple-choice question answering: RACE, MCTest
  – Answer generation: MS MARCO
• Algorithms:
  – Key-Value Memory Networks: Miller et al. 2016: Key-Value Memory Networks for Directly Reading Documents. Proceedings of EMNLP
  – Gated Attention Readers: Dhingra et al. 2017: Gated-Attention Readers for Text Comprehension. Proceedings of ACL
  – QANet: Yu et al. 2018: QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension. Proceedings of ICLR

(Kaushik and Lipton EMNLP 2018)


Some results

[Result tables not preserved]
• SQuAD, using QANet
• bAbI, using Key-Value MemNets

(Kaushik and Lipton EMNLP 2018)


[Result tables not preserved]
• Who-did-What, using Gated-Attention Readers
• CBT, using Gated-Attention Readers

(Kaushik and Lipton EMNLP 2018)


Why? What’s going on??

Question: shin kanemaru, the gravel-voiced back-room boss who died on thursday aged 81, goes down in history as japan’s most corrupt post-war politician after ___________

Passage: ... glynis bc-nj-zimmer-profile-2takes-nyt rahane fumio yasuhiro dragnea lhadon bjorkman/max ... seventh-largest embarrased jeopardy hilariously masahisa haibara bajram 8-to-24 duke/meredith acceding ... koidu iraqs 2:32:21 //www.ironmanlive.com/ sagawa kyubin dean internatinoal 90-meter kakuei tanaka seven-paragraph 577,610 wendover golf-lpga-jpn partner, un-appointed uemazzei canada-u.s.

Answer: kakuei tanaka

Callouts on passage words (positions not preserved):
• Transportation company
• Kanemaru’s secretary
• Long-term politician
• Name not in Google


Conclusion for option 3

• Don’t trust QA datasets!
• Don’t trust QA system claims!
• First, check whether
  – there are any pre-existing (= training data) dependencies among the Q and candidate As
  – the full context predicts the A without even the Q


Option 4: A only through reasoning

For truly complex QA:
1. Identify the individual steps/pieces needed to derive the A
2. Figure out how to compute/find them
   – From the Q context and/or from elsewhere
3. Compose (and check?) them
   – Build an A finding ‘script’


Possible sources of this knowledge

• External search:
  – Query something like the web and hope to be lucky
• Entailments: “sentence” → “sentence”
  – Operate at surface form (in RTE formulation)
  – Allow one surface form to be stated when another is given
  – New surface form may provide Answer
  – Need: entailment rules + entailment applier
• Axioms: A ∨ B → C
  – Operate at deeper level
  – Connect representation subgraphs, even providing new nodes
  – Expanded graph may provide Answer
  – Need: axioms/composition rules + theorem prover


Type 1: A popular task today: QA over structured data

• Data: database, table, etc.
• Task: ask Qs that require (1) finding various bits of data and (2) composing them to make the A
• The missing information is the script governing the sequence of access and composition
• Research: how to [learn to] build this script?
• Evaluation: did the system produce the right A?
• Examples:
  – U.S. geography database of 800 facts (Zelle & Mooney, 1996)
  – Wikitable questions (Pasupat and Liang, 2015; Dasigi 2018)
  – Other domains’ tables (several AI2 projects)


Wikitable dataset

Athlete | Nation | Olympics | Medals
Gillis Grafström | Sweden (SWE) | 1920–1932 | 4
Kim Soo-Nyung | South Korea (KOR) | 1988-200 | 6
Evgeni Plushenko | Russia (RUS) | 2002–2014 | 4
Kim Yu-na | South Korea (KOR) | 2010–2014 | 2
Patrick Chan | Canada (CAN) | 2014 | 2

Question: Which athlete was from South Korea after the year 2010?

Answer: Kim Yu-Na

Reasoning:
1) Get rows where Nation column contains South Korea
2) Filter rows where Olympics has a value greater than 2010.
3) Get value from Athlete column from filtered rows.

Program: ((reverse athlete) (and (nation south_korea) (year ((reverse date) (>= 2010-mm-dd)))))

WikiTableQuestions, Pasupat and Liang, 2015

(Dasigi LTI PhD thesis, 2018)
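The three reasoning steps can be run directly against the table. The sketch below is only an illustration in plain Python, not the logical-form executor used in the thesis; the helper names are made up, and the year test follows the program's `>=` rather than the prose's "greater than". The truncated Olympics value for Kim Soo-Nyung is kept exactly as it appears on the slide.

```python
import re

TABLE = [
    {"Athlete": "Gillis Grafström", "Nation": "Sweden (SWE)", "Olympics": "1920–1932", "Medals": 4},
    {"Athlete": "Kim Soo-Nyung", "Nation": "South Korea (KOR)", "Olympics": "1988-200", "Medals": 6},
    {"Athlete": "Evgeni Plushenko", "Nation": "Russia (RUS)", "Olympics": "2002–2014", "Medals": 4},
    {"Athlete": "Kim Yu-na", "Nation": "South Korea (KOR)", "Olympics": "2010–2014", "Medals": 2},
    {"Athlete": "Patrick Chan", "Nation": "Canada (CAN)", "Olympics": "2014", "Medals": 2},
]

def years(cell):
    """Extract every complete four-digit year mentioned in a table cell."""
    return [int(y) for y in re.findall(r"\d{4}", cell)]

def athletes_from_nation_since(table, nation, min_year):
    # Step 1: rows whose Nation column contains the nation name.
    rows = [r for r in table if nation in r["Nation"]]
    # Step 2: keep rows whose Olympics cell mentions a year >= min_year.
    rows = [r for r in rows if any(y >= min_year for y in years(r["Olympics"]))]
    # Step 3: project the Athlete column.
    return [r["Athlete"] for r in rows]

print(athletes_from_nation_since(TABLE, "South Korea", 2010))
```

The hard part of the task is not executing such a script but producing it from the question when only the final answer is given as supervision.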


Example: Dasigi

• Approach for learning to build access routines:
  1. Parse Q, build dependency tree
  2. Convert into Logical Form
  3. Translate into candidate table access routine
  4. (try all kinds of mappings from words to query operators/structure)
  5. Test composition by repeated trial and error
• Essentially, learning is a search in ‘operator combination space’ to build the logical form
• Weak supervision is not enough. Speed up the learning/search by:
  – Learning to associate table access parameters with parts of the tree (Q variables)
  – Learning to associate nesting and access operators with parts of the tree (‘operator’ words: “the most”, “last”, etc.)
  – Predefining some lexicon-to-operation mappings
  – Paying attention to grammatical construction of the tree
  – Implementing heuristics to guide exploration (‘short Qs first’)

(Dasigi LTI PhD thesis, 2018)


Dasigi approach

• Strategies:
  – Incorporate knowledge of grammatical constraints
  – ‘Lucky’ examples: remove right A with wrong query logic
  – Question coverage: how many Q words mapped?
  – Complex queries (denotation): how large is the query?
  – Do iterative search, from simpler to more complex Qs
• Combine into single Objective: Minimize expected value of cost (Goodman, 1996; Goel and Byrne, 2000; Smith and Eisner, 2005) with a linear combination of coverage and denotation costs

Symbols in the slide's equation (equation itself not preserved): x: NL term; y: script term; d: denotation
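One plausible way to write down the objective described in words above is given here. It is a reconstruction from the bullet text, not a copy of the slide's equation: the model parameters \theta, the mixing weight \lambda, and the names S (coverage cost) and T (denotation cost) are assumptions introduced for illustration.

```latex
\min_{\theta} \; \sum_{i=1}^{N} \mathbb{E}_{p(y_i \mid x_i;\, \theta)}
  \bigl[ C(x_i, y_i, d_i) \bigr],
\qquad
C(x_i, y_i, d_i) = \lambda \, S(y_i, x_i) + (1 - \lambda) \, T(y_i, d_i)
```

Under this reading, S penalizes scripts y_i that leave words of the question x_i unmapped, T penalizes scripts whose execution does not yield the denotation d_i, and \lambda in [0, 1] trades the two off.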


Empirical comparison on WikiTableQuestions

● Requires approximate set of logical forms during training
● Used output from Dynamic Programming on Denotations (Pasupat and Liang, 2016)
● Various models: strings, trees, etc.
● Efficient search followed by pruning using human annotations

(Krishnamurthy, Dasigi and Gardner, 2017)


Dasigi results using iterative search

[Result charts not preserved; panels: NLVR, WikiTableQuestions]

● Similar trend in 2 domains
● Used functional query language (Liang et al., 2018)

(Dasigi, Gardner, Murty, Zettlemoyer, Hovy 2018)


Conclusion for option 4.1

Interesting idea to ‘operationalize’ the Q and test its ‘truth’ by running the script for the A

But works only with structured A sources where such operationalization is possible

Can we ‘operationalize’ other, typical kinds of Qs?


Type 2: A new QA task: Multi-domain knowledge

Q: What is the largest capital city south of Santiago de Chile?
  – Geographic knowledge (lat-long, population)
  – Numerical ability (sorting, etc.)

Q: Which of the leaders of the XYZ enterprise are well-liked, and why?
  – Discovery of social role by actions
  – Sentiment judgments attached to actions


Multi-domain knowledge

• Define N self-contained standardized ‘domain specialists’ (KBs + reasoners) that any QA engine can run
• At run-time, analyze the Q, build the A script, activate the specialists as needed, compute the A

Specialists shown in the diagram:
• Arithmetic
• Geography
• Psych: goals
• Social customs
• Physics
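A toy sketch of how a QA engine might compose two such specialists for the "largest capital city south of X" question follows. Everything in it is hypothetical: the class names, the method names, and the city data (placeholder names and numbers, not real geography) are invented to show the shape of an A script, not an existing API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class City:
    name: str
    latitude: float   # degrees; more negative means further south
    population: int
    is_capital: bool

# Made-up knowledge base; names and numbers are placeholders.
CITIES = [
    City("Reference", -33.0, 500, True),
    City("Alpha", -35.0, 300, True),
    City("Beta", -41.0, 900, False),
    City("Gamma", -38.0, 700, True),
    City("Delta", -10.0, 2000, True),
]

class GeographySpecialist:
    """Hypothetical geography specialist: a small KB plus spatial relations."""
    def __init__(self, cities):
        self.by_name = {c.name: c for c in cities}

    def capitals(self):
        return [c for c in self.by_name.values() if c.is_capital]

    def south_of(self, cities, reference):
        ref = self.by_name[reference]
        return [c for c in cities if c.latitude < ref.latitude]

class ArithmeticSpecialist:
    """Hypothetical numerical specialist: comparison and selection."""
    def argmax(self, items, key):
        return max(items, key=key) if items else None

def largest_capital_south_of(reference, geo, arith):
    # The A script: a fixed sequence of specialist calls plus result composition.
    candidates = geo.south_of(geo.capitals(), reference)
    best = arith.argmax(candidates, key=lambda c: c.population)
    return best.name if best else None

geo, arith = GeographySpecialist(CITIES), ArithmeticSpecialist()
print(largest_capital_south_of("Reference", geo, arith))
```

The script itself also serves as a trace of how the answer was produced, which can be inspected independently of whether the final answer is correct.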


Research needed

• For each domain specialist:
  – Define its ‘knowledge service’
  – Create the underlying knowledge
  – Define the I/O APIs for the QA engine to use
  – Build the specialist
• For each QA engine:
  – Analyze the Q → determine parameters and need
  – Decompose need, build a script of specialist queries plus their result composition
  – Execute


Some specialist areas we are currently working on in my group

1. Arithmetic/numerical reasoning for entailment (Ravichander, Naik, Rosé, Hovy, CoNLL 2019, ACL 2019)
2. Psych goals for sentiment justification (Otani and Hovy, ACL 2019)
3. Social roles for group activity support (Yang, Kraut, Hovy, EMNLP, HCI, and others 2017–18)


Topic 1. Numerical calculation

• Task: Entailment problem
• Input: clauses containing numbers
• Output: entailed / not-entailed
• Results:
  – EQUATE dataset extracted from ~8 existing QA and Entailment resources, with As added
  – Baseline numerical reasoner scores on the dataset

(Ravichander, Naik, Rosé, Hovy, 2019)

P: A bomb in a Hebrew University cafeteria killed five Americans and four Israelis
H: A bombing at Hebrew University in Jerusalem killed nine people, including five Americans
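The P/H pair above needs one arithmetic step: five plus four justifies nine. The sketch below checks only that numeric part, by testing whether each number in the hypothesis is either copied from the premise or is the sum of some premise numbers. It is a deliberately crude illustration under those assumptions, not the reasoner from the paper, and it ignores what the numbers refer to.

```python
import re
from itertools import combinations

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}

def extract_numbers(text):
    """Collect digit strings and small spelled-out numbers, in order of appearance."""
    numbers = []
    for token in re.findall(r"[a-z]+|\d+", text.lower()):
        if token.isdigit():
            numbers.append(int(token))
        elif token in NUMBER_WORDS:
            numbers.append(NUMBER_WORDS[token])
    return numbers

def justify(premise, hypothesis):
    """Map each hypothesis number to ('copy' | 'sum', premise numbers used), or None."""
    p_nums = extract_numbers(premise)
    report = {}
    for h in extract_numbers(hypothesis):
        if h in p_nums:
            report[h] = ("copy", (h,))
            continue
        report[h] = None
        for size in range(2, len(p_nums) + 1):
            match = next((c for c in combinations(p_nums, size) if sum(c) == h), None)
            if match is not None:
                report[h] = ("sum", match)
                break
    return report

def numbers_justified(premise, hypothesis):
    """True when every number in the hypothesis is accounted for by the premise."""
    return all(v is not None for v in justify(premise, hypothesis).values())

P = "A bomb in a Hebrew University cafeteria killed five Americans and four Israelis"
H = "A bombing at Hebrew University in Jerusalem killed nine people, including five Americans"
print(justify(P, H))
```

The returned report records which premise numbers were combined and by what operation, so a wrong decision can be traced to a specific extraction or composition step.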


EQUATE corpus

Dataset | Size | Classes | Synthetic | Data Source | Annotation Source | Quantitative Phenomena
Stress Test | 7500 | 3 | ✓ | AQuA-RAT | Automatic | Quantifiers
RTE-Quant | 166 | 2 | ✗ | RTE2-RTE4 | Expert | Arithmetic, World knowledge, Ranges, Quantifiers
AwpNLI | 722 | 2 | ✓ | Arithmetic Word Problems | Automatic | Arithmetic
NewsNLI | 1000 | 2 | ✗ | CNN | Crowd-sourced | Ordinals, Quantifiers, Arithmetic, World Knowledge, Magnitude, Ratios
RedditNLI | 250 | 3 | ✗ | Reddit | Expert | Range, Arithmetic, Approximation, Verbal

(Ravichander et al., 2019)


Baselines (SOTA methods)

• Majority Class (MAJ): Simple baseline always predicts the majority class in test set.
• Hypothesis-Only (HYP): FastText classifier trained on only hypotheses to predict the entailment relation (Gururangan et al. 2018)
• ALIGN: A bag-of-words alignment model inspired by MacCartney (2009)
• NB (Nie and Bansal 2017): Sentence encoder consisting of stacked BiLSTM-RNNs with shortcut connections and fine-tuning of embeddings. Achieves top non-ensemble result in the RepEval-2017 shared task
• CH (Chen et al. 2017): Sentence encoder consisting of stacked BiLSTM-RNNs with shortcut connections, character-composition word embeddings learned via CNNs, intra-sentence gated attention and ensembling. Achieves best overall result in the RepEval-2017 shared task
• RC (Balazs et al. 2017): Single-layer BiLSTM with mean pooling and intra-sentence attention
• IS (Conneau et al. 2017): Single-layer BiLSTM-RNN with max-pooling, shown to learn robust universal sentence representations that transfer well across inference tasks
• BiLSTM: We reimplement the simple BiLSTM baseline model of Nangia et al. (2017). Our reimplementation achieves slightly better results on the MultiNLI dev set
• CBOW: Bag-of-words sentence representation from word embeddings passed through a tanh non-linearity and a softmax layer for classification.

(Ravichander et al., 2019)


Constructing entailment inferences

• Generate a report for each premise-hypothesis pair, consisting of:
  – Extracted NUMSETS for premise and hypothesis
  – Which NUMSETS were combined and by what operation
  – Which NUMSETS were justified and which weren’t
• Combines neural and symbolic programs
  – Some submodules are neural; overall framework is symbolic
  – Lightweight supervision

(Ravichander et al., 2019)


Topic 2. Human goals

• Complex QA domain: human goal for sentiment
  – I loved the hotel’s price but the room was noisy → [price +] [room -]
• Task: sentiment justification: WHY does the Holder have the sentiment value for the facet?
• Approach: Classify each clause into a list of human (psychological and social) goals
  – Initial set: Maslow hierarchy
  – Currently: ~110 human goals from USC (Talevich et al.)
• Data: Crowdsourced; κ ≈ 0.55

(Otani and Hovy, 2019)


[Figure not preserved] (Talevich et al. 2017)


Topic 3. Social roles

• Complex QA domain: Human interactions in groups
• Task: Automated social role discovery
  – Input: Discussions in a social media platform
  – Output: Role list, and assignment for each user
• Data:
  – Wikipedia editors: our role taxonomy conforms to Wikipedia’s internal set
  – Cancer Survivor Network discussion groups

(Yang, Kraut, Hovy, 2018)


[Diagram not preserved; its labels: User edit history, Roles, Role assignments]

Example roles, each shown as weights over edit actions:
– Information_insertion 0.4, Reference_insertion 0.2, …
– Grammar 0.2, Markup_deletion 0.1, Rephrase 0.1, …
– Wikilink_insertion 0.2, Wikilink_deletion 0.1, …

(Yang, Kraut, Hovy, 2018)


Latent role model in Wikipedia

[Plate diagram not preserved; its labels:]
– Role: distribution of edit actions
– Role proportions
– Role assignment for user u and word n
– Edit actions

(Yang, Kraut, Hovy, 2018)


Discovered editor roles (naming by expert)

Expert’s role name | Discovered representative behavior
Substantive Expert | Information insertion, wikilink insertion, reference insertion
Social Networker | Main talk namespace, user namespace
Vandal Fighter | Reverting, user talk namespace
Quality Assurance | Wikilink insertion, wikipedia namespace, template namespace
Fact Checker | Information deletion, wikilink deletion, reference deletion
Cleanup Worker | Wikilink modification, template insertion, markup modification
Fact Updater | Template modification, reference modification
Copy Editor | Grammar, paraphrase, relocation

(Yang, Kraut, Hovy, 2018)


Topics 4–. Other inference specialists

• Geography and Time… (see (Allen, CACM 1983) and (Davis, JAIR 2017))
  – E.g.: north-of, area-included-in-region…
• Physics, Biology… (see the HALO project)
  – Recent work on aspects of Physics at AI2 (Clark et al.)
• Emotions


Physics: noun-noun compounds

Where is…
• …the kitchen table: LOC
• …the coffee table: FUNCTION → LOC
• …the wood table: MATERIAL
• …the teacher’s table: ? FUNCTION → LOC ?
• …the data table: TYPES → CONTENT → LOC ?

• Need to know the relation and the noun types to infer additional info


Conclusion for option 4.2

Where next with Complex QA?

• Identify and build the most useful domain specialists
  – Find basic knowledge primitives
  – Develop reasoning logics, models, and implementations
  – Develop/find QA datasets that exercise this sort of specialist knowledge and reasoning
• Great overview in (Davis, JAIR 2018)
• Create a common library for all to share
• Evaluate correctness AND Answer production scripts (traces, as ‘explanation’)

Open-source and general-purpose (not just scientific/political) version of WolframAlpha


THANK YOU
