
Using Query Patterns to Learn the Durations of Events

Page 1: Using Query Patterns to  Learn  the Durations of Events

Using Query Patterns to Learn the Durations of Events

Andrey Gusev

joint work with Nate Chambers, Pranav Khaitan, Divye Khilnani, Steven Bethard, Dan Jurafsky

Page 2: Using Query Patterns to  Learn  the Durations of Events

Examples of Event Durations

• Talk to a friend – minutes
• Driving – hours
• Study for an exam – days
• Travel – weeks
• Run a campaign – months
• Build a museum – years

Page 3: Using Query Patterns to  Learn  the Durations of Events

Why are we interested in durations?

• Event Understanding
  • Duration is an important aspectual property
  • Can help build timelines and events
• Event coreference
  • Duration may be a cue that events are coreferent
  • Gender (learned from the web) helps nominal coreference
• Integration into search products
  • Query: “healthy sleep time for age groups”
  • Query: “president term length in [country x]”

Page 4: Using Query Patterns to  Learn  the Durations of Events

Approach 1: Supervised System

How can we learn event durations?

Page 5: Using Query Patterns to  Learn  the Durations of Events

Dataset (Pan et al., 2006)

• Labeled 58 documents from TimeBank with event durations
• Average of minimum and maximum labeled durations
• Example: “A Brooklyn woman who was watching her clothes dry in a laundromat.”
  • Min duration – 5 min
  • Max duration – 1 hour
  • Average – 1950 seconds
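The averaging step on this slide can be checked with a few lines of Python. This is only a sketch: the function name and the unit table are illustrative and do not come from the slides or from Pan et al.

```python
# Seconds per time unit (illustrative table, not from the slides).
SECONDS_PER_UNIT = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def mean_duration_seconds(min_value, min_unit, max_value, max_unit):
    """Average the minimum and maximum labeled durations, in seconds."""
    low = min_value * SECONDS_PER_UNIT[min_unit]
    high = max_value * SECONDS_PER_UNIT[max_unit]
    return (low + high) / 2


# Laundromat example from the slide: min 5 minutes, max 1 hour.
print(mean_duration_seconds(5, "minute", 1, "hour"))  # 1950.0
```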

Page 6: Using Query Patterns to  Learn  the Durations of Events

Original Features (Pan et al., 2006)

• Event Properties
  • Event token, lemma, POS tag
• Subject and Object
  • Head word of the syntactic subject and objects of the event, along with their lemmas and POS tags
• Hypernyms
  • WordNet hypernyms for the event, its subject and its object
  • Starting from the first synset of each lemma, three hypernyms were extracted from the WordNet hierarchy

Page 7: Using Query Patterns to  Learn  the Durations of Events

New Features

• Event Attributes
  • Tense, aspect, modality, event class
• Named Entity Class of Subjects and Objects
  • Person, organization, location, or other
• Typed Dependencies
  • Binary feature for each typed dependency
• Reporting Verbs
  • Binary feature for reporting verbs (say, report, reply, etc.)

Page 8: Using Query Patterns to  Learn  the Durations of Events

Limitations of the Supervised Approach

Need explicitly annotated datasets
• Sparse and limited data
• Limited to the annotated domain
• Low inter-annotator agreement
  • More than a Day and Less Than a Day – 87.7%
  • Duration Buckets – 44.4%
  • Approximate Duration Buckets – 79.8%

Page 9: Using Query Patterns to  Learn  the Durations of Events

Overcoming Supervised Limitations

Statistical Web Count approach
• Lots of text/data that can be used
• Not limited to the annotated domain
• Implicit annotations from many sources
• Hearst (1998), Ji and Lin (2009)

Page 10: Using Query Patterns to  Learn  the Durations of Events

Approach 2: Statistical Web Counts

How can we learn event durations?

Page 11: Using Query Patterns to  Learn  the Durations of Events

Terms: Duration Buckets and Distributions

Query (duration bucket)     Hits (distribution)
“talked for * seconds”      1638
“talked for * minutes”      61816
“talked for * hours”        68370
“talked for * days”         4361
“talked for * weeks”        3754
“talked for * months”       5157
“talked for * years”        103336
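To turn the hit counts above into a distribution over duration buckets, one option is to divide each count by the total. The sketch below does that for the “talked for * <bucket>” counts from the slide; the function name is illustrative, and plain normalization is an assumption about how the counts become a distribution.

```python
# Hit counts for "talked for * <bucket>" as listed on the slide.
TALKED_HITS = {
    "seconds": 1638,
    "minutes": 61816,
    "hours": 68370,
    "days": 4361,
    "weeks": 3754,
    "months": 5157,
    "years": 103336,
}


def to_distribution(hits):
    """Normalize raw hit counts so that the bucket shares sum to 1."""
    total = sum(hits.values())
    if total == 0:
        return {bucket: 0.0 for bucket in hits}
    return {bucket: count / total for bucket, count in hits.items()}


distribution = to_distribution(TALKED_HITS)
for bucket, share in distribution.items():
    print(f"{bucket:8s} {share:.3f}")
```

With these raw counts the largest share goes to “years”, which is the effect the later slides on base distributions address.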

Page 12: Using Query Patterns to  Learn  the Durations of Events

Two Duration Prediction Tasks

• Coarse grained prediction
  • “Less than a day” or “Longer than a day”
• Fine grained prediction
  • Second, minute, hour, etc.

Page 13: Using Query Patterns to  Learn  the Durations of Events

Task 1: Coarse Grained Prediction

Page 14: Using Query Patterns to  Learn  the Durations of Events

Yesterday Pattern for Coarse Grained Task

• <eventpast> yesterday
• <eventpastp> yesterday
  • eventpast = past tense
  • eventpastp = past progressive tense
• Normalize yesterday event pattern counts with counts of event occurrence in general
• Average the two ratios
• Find threshold on the training set

Page 15: Using Query Patterns to  Learn  the Durations of Events

Example: “to say” with Yesterday Pattern

• “said yesterday” – 14,390,865 hits
• “said” – 1,693,080,248 hits
• “was saying yesterday” – 29,626 hits
• “was saying” – 14,167,103 hits

$\mathrm{Ratio}_{\mathrm{past}} = \frac{14{,}390{,}865}{1{,}693{,}080{,}248} = 0.0085$

$\mathrm{Ratio}_{\mathrm{pastp}} = \frac{29{,}626}{14{,}167{,}103} = 0.0021$

• Average Ratio = 0.0053
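The arithmetic of the “said” example can be reproduced as follows. The hit counts and the threshold t = 0.002 come from the slides; the function names are illustrative. The slides do not say which side of the threshold maps to which class, so the direction used here (a high “yesterday” ratio means “less than a day”) is an assumption.

```python
def yesterday_ratio(past_yesterday, past_all, pastp_yesterday, pastp_all):
    """Average the normalized 'yesterday' counts of the two tense patterns.

    Assumes both general counts (past_all, pastp_all) are nonzero.
    """
    ratio_past = past_yesterday / past_all
    ratio_pastp = pastp_yesterday / pastp_all
    return (ratio_past + ratio_pastp) / 2


def coarse_label(ratio, threshold=0.002):
    """Assumed direction: events often seen with 'yesterday' are short."""
    return "less than a day" if ratio > threshold else "longer than a day"


# Hit counts for "to say" from the slide.
ratio = yesterday_ratio(14_390_865, 1_693_080_248, 29_626, 14_167_103)
print(round(ratio, 4))      # 0.0053
print(coarse_label(ratio))  # less than a day (under the assumed direction)
```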

Page 16: Using Query Patterns to  Learn  the Durations of Events

Threshold for Yesterday Pattern

[Figure: accuracy as a function of the ratio threshold. x-axis: Ratio; y-axis: Accuracy (0.65 to 0.75). Selected threshold: t = 0.002]
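Choosing the threshold on the training set can be sketched as a sweep over candidate values that keeps the most accurate one. The training pairs and the candidate grid below are made up for illustration, and the rule “ratio above the threshold means less than a day” is an assumption rather than something the slides state.

```python
def best_threshold(examples, candidates):
    """Return (threshold, accuracy) for the most accurate candidate.

    examples: list of (ratio, is_less_than_a_day) pairs.
    Ties are resolved in favor of the earliest candidate.
    """
    best_t, best_acc = None, -1.0
    for t in candidates:
        correct = sum((ratio > t) == is_short for ratio, is_short in examples)
        acc = correct / len(examples)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical training pairs: (average yesterday ratio, gold "less than a day").
train = [(0.0053, True), (0.0030, True), (0.0010, False), (0.0005, False)]
print(best_threshold(train, [0.0005, 0.001, 0.002, 0.003]))  # (0.001, 1.0)
```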

Page 17: Using Query Patterns to  Learn  the Durations of Events

Task 2: Fine Grained Prediction

Page 18: Using Query Patterns to  Learn  the Durations of Events

Fine Grained Durations from Web Counts

• How long does the event “X” last?
• Ask the web:
  • “X for * seconds”
  • “X for * minutes”
  • …
• Output distribution over time units

[Figure: distribution for “said”]

Page 19: Using Query Patterns to  Learn  the Durations of Events

Not All Time Units are Equal

• Need to look at the base distribution
  • “for * seconds”
  • “for * minutes”
  • …
• In habituals, etc. people like to say “for years”

Page 20: Using Query Patterns to  Learn  the Durations of Events

Conditional Frequencies for Buckets

• Divide
  • “X for * seconds”
• By
  • “for * seconds”
• Reduce credit for seeing “X for years”

[Figure: distribution for “said”]
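The division described on this slide can be sketched as below: each event count is divided by the base count for the same bucket. Renormalizing the ratios so they sum to 1 is an added assumption, and all counts in the example are invented to show the effect; they are not web counts from the talk.

```python
def conditional_distribution(event_hits, base_hits):
    """Divide "X for * <bucket>" hits by "for * <bucket>" hits per bucket.

    Buckets with no base hits are skipped. The renormalization at the end
    is an assumption, added so the result is a distribution.
    """
    ratios = {
        bucket: event_hits[bucket] / base_hits[bucket]
        for bucket in event_hits
        if base_hits.get(bucket, 0) > 0
    }
    total = sum(ratios.values())
    if total == 0:
        return ratios
    return {bucket: value / total for bucket, value in ratios.items()}


# Invented counts: the event is seen equally often with both units,
# but "for * years" is four times as common in general.
event = {"minutes": 100, "years": 100}
base = {"minutes": 1000, "years": 4000}
print(conditional_distribution(event, base))
```

In this toy case “years” ends up with a quarter of the weight of “minutes”, which is the intended reduction of credit for a unit that is frequent regardless of the event.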

Page 21: Using Query Patterns to  Learn  the Durations of Events

Double Peak Distribution

• Two interpretations
  • Durative
  • Iterative
• Distributions show that with two peaks

[Figure: duration distributions for “to smile” and “to run”. x-axis: duration buckets (S, M, H, D, W, M, Y, D); y-axis: 0.0 to 0.5]

Page 22: Using Query Patterns to  Learn  the Durations of Events

Merging Patterns

• Multiple patterns
• Distributions averaged
• Reduces noise from individual patterns
• A pattern needs to have more than 100 and fewer than 100,000 hits

[Figure: distribution for “said”]
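Averaging per-pattern distributions with a hit-count filter could look like the sketch below. The slide does not say what the hit limits are counted over, so the code simply takes one hit total per pattern as input and treats that as an assumption. The function name and the example numbers are illustrative.

```python
def merge_patterns(patterns, min_hits=100, max_hits=100_000):
    """Average the distributions of patterns whose hit total is in range.

    patterns: list of (total_hits, distribution) pairs, where each
    distribution maps bucket name to probability.
    Returns None when no pattern passes the filter.
    """
    kept = [dist for total, dist in patterns if min_hits < total < max_hits]
    if not kept:
        return None
    buckets = kept[0].keys()
    return {b: sum(d.get(b, 0.0) for d in kept) / len(kept) for b in buckets}


# Invented example: the second pattern has too few hits and is dropped.
patterns = [
    (500, {"minutes": 0.6, "hours": 0.4}),
    (50, {"minutes": 0.0, "hours": 1.0}),
    (2000, {"minutes": 0.2, "hours": 0.8}),
]
print(merge_patterns(patterns))
```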

Page 23: Using Query Patterns to  Learn  the Durations of Events

Fine Grained Patterns

• Used patterns
  • <eventpast> for * <bucket>
  • <eventpastp> for * <bucket>
  • spent * <bucket> <eventger>
• Patterns not used
  • <eventpast> in * <bucket>
  • takes * <bucket> to <event>
  • <eventpast> last <bucket>
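The three patterns that were used can be expanded into concrete query strings once the verb forms are known. In this sketch the caller supplies the inflected forms, since producing them automatically is outside what the slides describe. The bucket list follows the seven units shown on the earlier “talked for” slide, and the function name is illustrative.

```python
BUCKETS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]

# The three patterns listed as used, written as format templates.
TEMPLATES = [
    "{past} for * {bucket}",
    "{pastp} for * {bucket}",
    "spent * {bucket} {ger}",
]


def build_queries(past, pastp, ger):
    """Return every query string for one event, grouped by template."""
    return {
        template: [
            template.format(past=past, pastp=pastp, ger=ger, bucket=bucket)
            for bucket in BUCKETS
        ]
        for template in TEMPLATES
    }


queries = build_queries(past="talked", pastp="was talking", ger="talking")
for template, strings in queries.items():
    print(template, "->", strings[0])
```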

Page 24: Using Query Patterns to  Learn  the Durations of Events

Evaluation and Results

Page 25: Using Query Patterns to  Learn  the Durations of Events

Evaluation

• TimeBank annotations (Pan, Mulkar and Hobbs 2006)
  • Coarse Task: Greater or less than a day
  • Fine Task: Time units (seconds, minutes, hours, …, years)
    • Counted as correct if within 1 time unit
• Baseline: Majority Class
  • Fine Grained – months
  • Coarse Grained – greater than a day
• Compare with re-implementation of the supervised system (Pan, Mulkar and Hobbs 2006)
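The “within 1 time unit” criterion for the fine grained task can be written as a distance check on an ordered list of units. The unit list below spells out the “seconds, minutes, hours, …, years” range as seven units, which is an assumption about the elided middle. The function names are illustrative.

```python
UNITS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def fine_correct(predicted, gold):
    """A prediction counts as correct if it is at most one unit away."""
    return abs(UNITS.index(predicted) - UNITS.index(gold)) <= 1


def fine_accuracy(predictions, golds):
    """Share of predictions that satisfy the within-one-unit criterion."""
    hits = sum(fine_correct(p, g) for p, g in zip(predictions, golds))
    return hits / len(golds)


print(fine_correct("hours", "days"))     # True: adjacent units
print(fine_correct("seconds", "hours"))  # False: two units apart
print(fine_accuracy(["hours", "years"], ["days", "minutes"]))  # 0.5
```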

Page 26: Using Query Patterns to  Learn  the Durations of Events

New Split for TimeBank Dataset

• Train – 1664 events (714 unique verbs)
• Test – 471 events (274 unique verbs)
• TestWSJ – 147 events (84 unique verbs)
• Split info is available at http://cs.stanford.edu/~agusev/durations/

Page 27: Using Query Patterns to  Learn  the Durations of Events

Web Counts System Scoring

• Fine grained
  • Smooth over the adjacent buckets and select the top bucket
    $\mathrm{score}(b_i) = b_{i-1} + b_i + b_{i+1}$
• Coarse grained
  • “Yesterday” classifier with a threshold (t = 0.002)
  • Use fine grained approach
    • Select coarse grained bucket based on fine grained bucket
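The smoothing rule score(b_i) = b_{i-1} + b_i + b_{i+1} and the mapping from a fine grained bucket to a coarse one can be sketched as follows. The slides do not cover two details, so both are assumptions here: a missing neighbor at either end counts as 0, and “days” and longer map to “longer than a day”. The example distribution is invented.

```python
UNITS = ["seconds", "minutes", "hours", "days", "weeks", "months", "years"]


def smoothed_scores(dist):
    """score(b_i) = b_{i-1} + b_i + b_{i+1}; missing neighbors count as 0."""
    n = len(dist)
    return [
        (dist[i - 1] if i > 0 else 0.0) + dist[i] + (dist[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


def top_bucket(dist):
    """Index of the bucket with the highest smoothed score."""
    scores = smoothed_scores(dist)
    return max(range(len(scores)), key=scores.__getitem__)


def coarse_from_fine(unit):
    """Assumed mapping: seconds, minutes and hours are 'less than a day'."""
    short = {"seconds", "minutes", "hours"}
    return "less than a day" if unit in short else "longer than a day"


# Invented distribution: "minutes" has the largest raw mass, but "hours"
# wins after smoothing because both of its neighbors are heavy.
dist = [0.01, 0.34, 0.24, 0.09, 0.13, 0.10, 0.09]
best = UNITS[top_bucket(dist)]
print(best, "->", coarse_from_fine(best))  # hours -> less than a day
```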

Page 28: Using Query Patterns to  Learn  the Durations of Events

Results

Accuracy (%)
System             Coarse - Test   Fine - Test   Coarse - WSJ   Fine - WSJ
Baseline           62.4            59.2          57.1           52.4
Supervised         73.0            62.4          74.8           66.0
Bucket Counts      72.4            66.5          73.5           68.7
Yesterday Counts   70.7            N/A           74.8           N/A

Web counts perform as well as the fully supervised system

Page 29: Using Query Patterns to  Learn  the Durations of Events

Backoff Statistics (“Spent” Pattern)

• Events in training dataset

Both   Subject   Object   None
356    446       195      548

• Had at least 10 hits

Both   Subject   Object   None
3      86        84       1372

Page 30: Using Query Patterns to  Learn  the Durations of Events

Effect of the Event Context

• Supervised classifiers use context in their features
• The web counts system doesn’t use the context of the events
  • Significantly fewer hits when including context
  • Better accuracy with more hits than with context
• What is the effect of subject/object context on the understanding of event duration?

Page 31: Using Query Patterns to  Learn  the Durations of Events

Human Annotation: Mechanical Turk

Can humans do this task without context?

Page 32: Using Query Patterns to  Learn  the Durations of Events

MTurk Setup

• 10 MTurk workers for each event
• Without the context
  • Event – choice for each duration bucket
• With the context
  • Event with subject/object – choice for each duration bucket

Page 33: Using Query Patterns to  Learn  the Durations of Events

Sometimes Context Doesn’t Matter

[Figure panels: “Exploded”, “Intolerant”]

Page 34: Using Query Patterns to  Learn  the Durations of Events

Web counts vs. Turk distributions

[Figure panels: “said” (web count), “said” (MTurk)]

Page 35: Using Query Patterns to  Learn  the Durations of Events

Web counts vs. Turk distributions

[Figure panels: “looking” (web count), “looking” (MTurk)]

Page 36: Using Query Patterns to  Learn  the Durations of Events

Web counts vs. Turk distributions

[Figure panels: “considering” (web count), “considering” (MTurk)]

Page 37: Using Query Patterns to  Learn  the Durations of Events

Results: Mechanical Turk Annotations

Compare accuracy:
– Event with context
– Event without context

Accuracy (%)
System              Coarse - Test   Fine - Test   Coarse - WSJ   Fine - WSJ
Baseline            62.4            59.2          57.1           52.4
Event only          52.0            42.1          49.4           43.8
Event and context   65.0            56.7          70.1           59.9

Context significantly improves accuracy of MTurk annotations

Page 38: Using Query Patterns to  Learn  the Durations of Events

Event Duration Lexicon

• Distributions for the 1000 most frequent verbs from the NYT portion of Gigaword, each with its 10 most frequent grammatical objects
• Due to thresholds, not all events have distributions

EVENT=to use,ID=e13-7,OBJ=computer,PATTERNS=2,DISTR=[0.009;0.337;0.238;0.090;0.130;0.103;0.092;0.002;]

http://cs.stanford.edu/~agusev/durations/
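A lexicon line in the format shown on this slide can be parsed with a small helper. The sketch assumes that fields are separated by commas, that field values contain no commas themselves, and that DISTR holds semicolon-separated numbers. These are inferences from the single example line, not a documented file format. The function and key names are illustrative.

```python
def parse_entry(line):
    """Parse one lexicon line of the form KEY=value,KEY=value,...,DISTR=[...]."""
    fields = dict(part.split("=", 1) for part in line.strip().split(","))
    # DISTR looks like "[0.009;0.337;...;0.002;]" with a trailing semicolon.
    values = [float(v) for v in fields["DISTR"].strip("[]").split(";") if v]
    return {
        "event": fields["EVENT"],
        "id": fields["ID"],
        "object": fields["OBJ"],
        "patterns": int(fields["PATTERNS"]),
        "distribution": values,
    }


# Example line copied from the slide.
line = ("EVENT=to use,ID=e13-7,OBJ=computer,PATTERNS=2,"
        "DISTR=[0.009;0.337;0.238;0.090;0.130;0.103;0.092;0.002;]")
entry = parse_entry(line)
print(entry["event"], entry["object"], entry["distribution"])
```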

Page 39: Using Query Patterns to  Learn  the Durations of Events

Summary

• We learned aspectual information from the web
• Event durations from the web counts are as accurate as a supervised system
• Web counts are domain-general, work well even without context
• New lexicon with the 1000 most frequent verbs with their 10 most frequent objects
• MTurk suggests that context can improve accuracy of event duration annotation

Page 40: Using Query Patterns to  Learn  the Durations of Events

Thanks! Questions?

