Date post: 29-Jan-2020
TAC 2018 Streaming Multimedia KBP Pilot Hoa Trang Dang National Institute of Standards and Technology
Transcript
Page 1: TAC 2018 Streaming Multimedia KBP Pilot...National Institute of Standards and Technology. Background •NIST will evaluate performers in DARPA AIDA Program (Active Interpretation of

TAC 2018 Streaming Multimedia KBP Pilot

Hoa Trang Dang

National Institute of Standards and Technology

Background

• NIST will evaluate performers in DARPA AIDA Program (Active Interpretation of Disparate Alternatives)
• Some AIDA evaluations will be open evaluations in TAC and TRECVID.
• The goal of AIDA is to develop a semantic engine that automatically generates multiple alternative analytic interpretations of a situation, based on a variety of unstructured sources that may be noisy, conflicting, or deceptive.
• Documents can contain a mix of multilingual text, speech, image, video; including metadata.
• A document can be as small as a single tweet, or as large as a Web page containing a news article with text, pictures and video clips.

§ All data will be in streaming mode; systems can access the data only once in raw format, but may access a KB containing a structured semantic representation of all data seen to date

ACTIVE INTERPRETATION OF DISPARATE ALTERNATIVES (AIDA)

• Given a scenario (“Benghazi”), document stream, and several topics. For each topic:
  • TA1 outputs all Knowledge Elements (entities, relations, events, etc., defined in the ontology) in the documents, including alternative interpretations
  • TA2 fuses KEs from TA1 into the TA2 KB, maintaining alternative interpretations
  • TA3 constructs internally consistent hypotheses (partial KBs) from TA2 KB

[Diagram: TA1, TA2, TA3]

Scenario-Specific Ontology

• Scenarios will involve events such as international conflicts, natural disasters, violence at international events, or protests and demonstrations.
• AIDA will extend KBP ontology of entities, relations, events, belief and sentiment to include additional concepts that are needed to cover informational conflicts in each topic in the scenario
• Ideally, would have a single ontology for all topics in the scenario (?)

AIDA KB representation

• Knowledge Element (KE) is a structured representation of entities, relations, events, etc. -- likely an augmented triple like in Cold Start KB
  • Triple is augmented with provenance and confidence
  • Provenance is a set of justifications. Each justification has a justification-level confidence
  • KE-level confidence is explicitly provided by TA1 and TA2, and is an aggregation of justification-level confidences
• KB contains conflicting KEs (as found in the raw documents)
  • Representation -- not reconciliation -- of conflicts
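
The KE structure on this slide can be sketched as a small data model. This is an illustration only, not the AIDA exchange format: the class and field names (`Justification`, `KnowledgeElement`, `doc_id`, `span`) are hypothetical, and noisy-or is just one assumed way to aggregate justification-level confidences, since the slides do not say which aggregation is used.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List


@dataclass(frozen=True)
class Justification:
    """Pointer into a raw document plus a justification-level confidence."""
    doc_id: str        # hypothetical document identifier
    span: str          # hypothetical span descriptor, e.g. "text:10-25"
    confidence: float  # assumed to lie in [0, 1]


@dataclass
class KnowledgeElement:
    """A triple augmented with provenance (a set of justifications)."""
    subject: str
    predicate: str
    obj: str
    provenance: List[Justification] = field(default_factory=list)

    def confidence(self) -> float:
        """KE-level confidence aggregated from its justifications.

        Noisy-or is an assumption made for this sketch.
        """
        return 1.0 - prod(1.0 - j.confidence for j in self.provenance)


# Two conflicting KEs are both stored: representation, not reconciliation.
kb = [
    KnowledgeElement("vehicle_1", "located_on", "road",
                     [Justification("doc1", "image:0", 0.6),
                      Justification("doc2", "text:10-25", 0.5)]),
    KnowledgeElement("vehicle_1", "located_on", "river",
                     [Justification("doc3", "text:40-52", 0.3)]),
]
for ke in kb:
    print(ke.subject, ke.predicate, ke.obj, round(ke.confidence(), 2))
```

Both assertions about `vehicle_1` stay in the KB with their own confidence; choosing between them is left to hypothesis construction downstream.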

What is allowed in KB representation?

• AIDA: “Although there may be need for some natural language, image thumbnails, featurized media, etc. in the KB for reference, registration, or matching purposes, it is expected that most of the assertions in the KB will be expressible in the structured representation, with elements derived from an ontology.”
• Features accessible to TA1/TA2 in KE cannot be document-level content features (?). Allowable features include
  • Number of supporting docs, and link to docs (but can’t read docs)
  • Time of first supporting doc, most recent supporting doc
• Comments/recommendations from participating teams are welcome regarding what features should be allowed in the KB
• For evaluation purposes, provenance accessible to LDC should be pointers into the raw documents denoting text spans, audio spans, images, or video shots

TAC/TRECVID 2018 tasks (pilot)

• Task 1: Extract all events, subevent or actions, entities, relations, locations, time, and sentiment from multimedia document stream, conditioned on zero or more different contexts, or hypotheses (TAC, TRECVID 2018)
  • Output is a set of all possible KEs, including confidence and provenance
  • Mention-level output, including within-document linking
• Task 2: Build KB by aggregating all KEs from TA1 and “user” (TAC 2018)
  • Output is KB including cross-doc linking
  • Evaluate by queries (with entry points) and assessment
• [Task 3: Create hypotheses from Task 2 KBs (AIDA program-internal in 2018)]

Training/Evaluation data

• One new scenario per evaluation cycle; 4 scenarios total over lifetime of AIDA program.
  • 100K docs/scenario, including relevant and irrelevant documents
  • 5-20% of docs will be relevant to the scenario
  • 200 labeled docs per scenario
• 12-20 topics per scenario
• At least one foreign language per scenario, plus English
• AIDA: “Government will provide linguistic resources and tools of a quality and composition to be determined, but consisting at least of the type and size found in a LORELEI Related Language Pack (LRLP)”

Low Resource Language Packs

• 1Mw - 2Mw+ monotext from news, web text & social media
• 300Kw - 1.1Mw+ parallel text of variable quality (professional, crowd, found, comparable)
• Annotations for 25Kw - 75Kw/language including
  • Simple Named Entity (PER, ORG, GPE, LOC/FAC)
  • KB linking of names to GeoNames and CIA World Factbook
  • Situation Frames: needs/issues for an incident (e.g. Urgent shelter need in Kermanshah province)
  • Full Entity (name, nom, pro) and within-doc coref
  • Predicate-argument annotation of disaster-relevant Acts and States
• Grammatical resources ranging from full grammatical sketch to found resources (dictionaries, grammars, primers, gazetteers) to lexicons
• Basic NLP tools including word, sentence segmenters, encoding converters; name taggers

Related TRECVID Tasks

TRECVID (2001 – Present)

• Shot boundary detection: Identify the shot boundaries in the given video clip(s)
• High-level feature extraction / Semantic Indexing: Given a standard set of shot boundaries and a list of feature (concepts) definitions, return a ranked list of shots according to the highest possibility of detecting the presence of each feature
• Ad-hoc Video Search: Given a statement of information need, return a ranked list of shots which best satisfy the need; similar to semantic indexing, but with complex concepts (combination of concepts); e.g., find group of children playing frisbee in a park.
• Rushes Summarization: Given a video from the rushes test collection, automatically create an MPEG-1 summary clip less than or equal to a maximum duration that shows the main objects and events in the rushes video to be summarized
• Surveillance event detection: detect a set of predefined events and identify their occurrences temporally
• Content-based copy detection: given a test collection of videos and a set of (video, audio, video+audio) queries, determine for each query the place, if any, that some part of the query occurs, with possible transformations, in the test collection

TRECVID (2001 – Present)

• Known-item Search: Given a text-only description of the video desired and a test collection of video with associated metadata, automatically return a list of up to 100 video IDs ranked by probability to be the one sought
• Instance Search: Given a collection of test videos, a master shot reference, and a collection of queries that delimit a person, object, or place entity in some example video, locate for each query the 1000 shots most likely to contain a recognizable instance of the entity [AIDA TA2 cross-doc coref]
• Multimedia Event Detection: Given a collection of test videos and a list of test events, indicate whether each of the test events is present anywhere in each of the test videos and give the strength of evidence for each such judgment
• Localization: Given a video shot, determine the presence of a concept temporally within the shot, with respect to a subset of the frames comprised by the shot, and, spatially, for each such frame that contains the concept, to a bounding rectangle [AIDA provenance?]

Latest task introduced in 2016: Video-to-Text

• Given a set of 2000 URLs of Twitter (Vine) videos and sets of text descriptions (each composed of 2000 sentences), systems are asked to work and submit results for two subtasks:
  • Matching and Ranking: Return for each video URL a ranked list of the most likely text descriptions that correspond (were annotated) to the video from each of the different text description sets.
  • Description Generation: Automatically generate for each video URL a text description (1 sentence) independently and without taking into consideration the existence of text description sets.
• Systems and annotators were encouraged to describe videos using 4 facets:
  • Who is the video describing, such as concrete objects and beings (kinds of persons, animals, things)
  • What are the objects and beings doing? (generic actions, conditions/state or events)
  • Where, such as locale, site, place, geographic, architectural (kind of place, geographic or architectural)
  • When, such as time of day, season, etc.

Airplane, Anchorperson, Animal, Basketball, Beach, Bicycling, Boat_Ship, Boy, Bridges, Bus, Car_Racing, Chair, Cheering, Classroom, Computers, Dancing, Demonstration_Or_Protest, Greeting, Hand, Highway

Sitting_Down, Stadium, Swimming, Telephones, Throwing, Baby, Door_Opening, Fields, Flags, Forest, George_Bush, Hill, Lakes, Military_Airplane, Explosion_Fire, Female-Human-Face-Closeup, Flowers, Girl, Government-Leader, Instrumental_Musician

Oceans, Quadruped, Skating, Skier, Soldiers, Studio_With_Anchorperson, Traffic, Kitchen, Meeting, Motorcycle, News_Studio, Nighttime, Office, Old_People, People_Marching, Press_Conference, Reporters, Roadway_Junction, Running, Singing

Examples of concepts used in the TRECVID Semantic INdexing (SIN) task

Multimedia

• Each document can contain a mix of text, speech, image, video; including metadata.
• Multiple languages: English plus 1-2 foreign languages (TBA)
• LDC will provide language packs containing resources for each language

• All participants will be given the same documents
• Participants are allowed to process info in a proper subset of the languages or media types
• NIST may report breakdown evaluation results by language, media type, etc.

Streaming Extraction

• Documents arrive in batches as a chunk.
  • ~100 documents/chunk (?), with cap on length of time covered in a chunk
• TA1 (and TA2?) system emits KE’s (triple + confidence + extras) after each chunk.
• At specified time points in the stream, the set of accumulated KE’s is evaluated.
  • Ranked precision/recall derivatives.
• At some of those points, a wild hypothesis appears!
  • A hypothesis = a set of proposed tuples.
  • TA1 system outputs KE’s primed by the hypothesis, which are evaluated.
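
The chunked loop described on this slide can be sketched as follows. Everything here is an assumption made for illustration: the function names (`chunked`, `run_stream`, `toy_extract`), the choice of average precision as one "ranked precision/recall derivative", and the rule of keeping the maximum confidence per triple are not specified by the slides.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Set, Tuple

Triple = Tuple[str, str, str]


def chunked(docs: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` documents."""
    chunk: List[str] = []
    for doc in docs:
        chunk.append(doc)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def average_precision(ranked: List[Triple], relevant: Set[Triple]) -> float:
    """Average precision over a confidence-ranked KE list (assumed metric)."""
    hits, total = 0, 0.0
    for rank, ke in enumerate(ranked, start=1):
        if ke in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def run_stream(docs: Iterable[str],
               extract: Callable[[str], List[Tuple[Triple, float]]],
               relevant: Set[Triple],
               chunk_size: int = 100,
               eval_every: int = 2) -> List[float]:
    """Process each chunk once, accumulate KEs, and score at time points."""
    accumulated: Dict[Triple, float] = {}
    scores: List[float] = []
    for i, chunk in enumerate(chunked(docs, chunk_size), start=1):
        for doc in chunk:                  # raw documents are read only here
            for triple, conf in extract(doc):
                accumulated[triple] = max(conf, accumulated.get(triple, 0.0))
        if i % eval_every == 0:            # a specified evaluation point
            ranked = sorted(accumulated, key=accumulated.get, reverse=True)
            scores.append(average_precision(ranked, relevant))
    return scores


def toy_extract(doc: str) -> List[Tuple[Triple, float]]:
    """Hypothetical extractor: each toy document is 'subj|pred|obj|conf'."""
    s, p, o, c = doc.split("|")
    return [((s, p, o), float(c))]


docs = ["a|r|b|0.9", "c|r|d|0.4", "a|r|e|0.7", "f|r|g|0.2"]
relevant = {("a", "r", "b"), ("a", "r", "e")}
scores = run_stream(docs, toy_extract, relevant, chunk_size=1, eval_every=2)
print(scores)
```

The loop never revisits an earlier chunk; later evaluation points see only the accumulated KEs, which mirrors the read-once constraint on raw data.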

TA1 Extraction Conditioned on Context

• TA1 must be capable of accepting alternate contexts and producing alternate analyses for each context.
  • For example, the analysis of a certain image produces knowledge elements representing a bus on a road. However, knowledge elements in one or more hypotheses suggest that this is a river rather than a road. The analysis algorithm should use this information for additional analysis of the image with priors favoring a boat.
• Simplifying assumptions for evaluation purposes:
  • Contexts are coherent hypotheses (represented as a partial KB) drawn from a small static set of possible hypotheses that are produced manually by LDC
  • Only “what if” hypotheses are input to TA1; KEs and confidence values resulting from “what if” hypotheses do not get passed on to TA2 but are evaluated separately
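
One way to read the bus/boat example is as a re-weighting of detector scores under a hypothesis-specific prior. The sketch below assumes a simple Bayes-style rescaling; the function name `condition_on_context` and all numbers are made up for illustration, and the slides do not prescribe any particular mechanism.

```python
from typing import Dict


def condition_on_context(likelihood: Dict[str, float],
                         prior: Dict[str, float]) -> Dict[str, float]:
    """Combine detector scores with a context prior and renormalize."""
    unnorm = {label: likelihood[label] * prior.get(label, 0.0)
              for label in likelihood}
    z = sum(unnorm.values())
    if z == 0.0:
        raise ValueError("prior assigns zero mass to every candidate label")
    return {label: score / z for label, score in unnorm.items()}


# Made-up detector scores for one image region.
likelihood = {"bus": 0.6, "boat": 0.4}

# Default context: the surface is a road, so a bus is far more plausible.
road_prior = {"bus": 0.9, "boat": 0.1}
# "What if" hypothesis: the surface is a river rather than a road.
river_prior = {"bus": 0.1, "boat": 0.9}

default = condition_on_context(likelihood, road_prior)
what_if = condition_on_context(likelihood, river_prior)
print(max(default, key=default.get), max(what_if, key=what_if.get))
```

In line with the simplifying assumptions on this slide, the `what_if` result would be reported and evaluated separately rather than passed on to TA2.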

How is Task 1 different from past TRECVID and TAC component tasks?

• Multimedia
• Streaming input
  • Can’t go back to reanalyze raw docs in previous data chunks
  • TA1 has access to TA2 KB encoding previously added KE’s
• Multiple hypotheses and interpretations
  • Expanded ontology to cover informational conflicts in scenario
  • TA1 outputs all possible extractions and interpretations, not just the most confident ones
  • TA1 extraction from data items may be conditioned on hypothesis

How is Task 2 different from Cold Start KBP?

• Multimedia
• Streaming input
  • TA2 has no access to raw data items to assist in fusing incoming KEs with existing KB; can only use what’s represented in the incoming KE and existing KB
• Multiple hypotheses and interpretations
  • Expanded ontology to cover information conflicts in scenario
  • TA2 KB must maintain all possible KEs (even low-confidence KEs) in order to support creation of multiple hypotheses and disparate interpretations
  • TA2 KEs and confidences theoretically could be conditioned on hypothesis in future, but for 2018 the TA2 KB is independent of any “what if” hypotheses.
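
The fusion constraint on this slide (use only what is in the incoming KEs and the existing KB, and discard nothing) can be sketched as below. The representation is hypothetical: KEs are keyed by their triple, provenance is a mapping from an invented pointer string to a confidence, and cross-document entity linking is left out entirely.

```python
from typing import Dict, Tuple

Triple = Tuple[str, str, str]
Provenance = Dict[str, float]   # justification pointer -> confidence


def fuse(kb: Dict[Triple, Provenance],
         incoming: Dict[Triple, Provenance]) -> None:
    """Merge incoming KEs into the KB in place, without reading raw docs."""
    for triple, justifications in incoming.items():
        merged = kb.setdefault(triple, {})
        for pointer, conf in justifications.items():
            # Keep the strongest confidence seen for the same pointer.
            merged[pointer] = max(conf, merged.get(pointer, 0.0))


consulate = ("attack_1", "location", "consulate")
annex = ("attack_1", "location", "annex")

kb: Dict[Triple, Provenance] = {consulate: {"doc1:text:5-20": 0.8}}
chunk_kes: Dict[Triple, Provenance] = {
    consulate: {"doc7:video:shot3": 0.6},
    annex: {"doc9:text:30-41": 0.05},   # low confidence, but still kept
}
fuse(kb, chunk_kes)
print(len(kb), sorted(kb[consulate]))
```

No confidence threshold is applied, so the low-confidence alternative remains available for building disparate hypotheses later.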

Evaluation by Assessment

• Evaluate using post-submission assessment and clustering of pooled mentions
• To support evaluation of TA1 extraction conditioned on context, ground-truth must be conditioned on a small set of hypotheses, predetermined by LDC.

• Only targeted KEs (relevant to hypotheses) will be evaluated
• Only k highest-confidence mentions/justifications for each KE will be pooled and assessed
• LDC might provide exhaustive annotation of mentions of entities for a small set of documents, for gold-standard based “NER” evaluation
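
The pooling rule above can be sketched in a few lines. The KE identifiers, pointer strings, and the function name `pool_for_assessment` are hypothetical, and applying the top-k cut per submitted run (rather than across all runs) is an assumption; the slides only state that the k highest-confidence justifications for each KE are pooled.

```python
import heapq
from typing import Dict, List, Set, Tuple

Justification = Tuple[str, float]   # (pointer, confidence)


def pool_for_assessment(runs: List[Dict[str, List[Justification]]],
                        targeted: Set[str],
                        k: int) -> Dict[str, Set[str]]:
    """Pool the k highest-confidence justifications per targeted KE per run."""
    pool: Dict[str, Set[str]] = {ke: set() for ke in targeted}
    for run in runs:
        for ke, justifications in run.items():
            if ke not in targeted:
                continue            # only targeted KEs are assessed
            top = heapq.nlargest(k, justifications, key=lambda j: j[1])
            pool[ke].update(pointer for pointer, _ in top)
    return pool


run_a = {"ke1": [("d1", 0.9), ("d2", 0.2), ("d3", 0.7)],
         "ke2": [("d4", 0.8)]}
run_b = {"ke1": [("d3", 0.6), ("d5", 0.5)]}
pool = pool_for_assessment([run_a, run_b], targeted={"ke1"}, k=2)
print(sorted(pool["ke1"]))
```

Justifications proposed by several runs are assessed once, since the pool is a set of pointers.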

AIDA Evaluation Schedule

• Three 18-month phases
• January 2018 kick-off

• ~Sept 2018: Eval Pilot
• ~May 2019: Eval 1 (Phase 1)
• ~Nov 2020: Eval 2 (Phase 2)
• ~May 2022: Eval 3 (Phase 3)

TAC 2018 Streaming MM KBP Pilot Evaluation Schedule

• Sample/training/eval data release:
  • ~January: scenario and 3 mostly labeled topics for training; all 100K unlabeled docs for the scenario (foreign languages announced at this time)
  • ~April: 3 additional labeled topics for training
  • ~September: 6 “evaluation” topics
• Early September (?): Task 1 evaluation window
• Mid September (?): Task 2 evaluation window

