Exploiting Timelines to Enhance Multi-document Summarization
Jun-Ping Ng, Yan Chen, Min-Yen Kan, Zhoujun Li
DSO National Laboratories
National University of Singapore
Beihang University
Outline
• Overview
• Approach
• Experiments and Results
• Discussion
OVERVIEW
Multi-document Summarization
Extractive Summarization
• Find the most salient sentences in source collection
• Top-k sentences are extracted to compose final summary
• <Graphic>
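The two bullets above can be sketched in a few lines. This is a minimal illustration only: `extract_top_k` and `toy_score` are hypothetical names, the word-count scorer stands in for whatever salience model a real system uses, and returning sentences in source order is an assumption, not something the slides specify.

```python
from typing import Callable, List


def extract_top_k(sentences: List[str],
                  score: Callable[[str], float],
                  k: int) -> List[str]:
    """Return the k highest-scoring sentences, kept in source order."""
    # Rank sentence indices by descending score.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    # Keep the top k, then restore their original order (an assumption).
    chosen = sorted(ranked[:k])
    return [sentences[i] for i in chosen]


def toy_score(sentence: str) -> float:
    # Hypothetical salience stand-in: the number of words.
    return float(len(sentence.split()))
```

Any scoring function with the same signature can replace `toy_score`.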
Two Storms
(1) A fierce cyclone packing extreme winds and torrential rain smashed into Bangladesh’s southwestern coast Thursday, wiping out homes and trees in what officials described as the worst storm in years.
(2) More than 100,000 coastal villagers have been evacuated before the cyclone made landfall.
(3) The storm matched one in 1991 that sparked a tidal wave that killed an estimated 138,000 people, Karmakar told AFP.
Timeline
APPROACH
Merging Timelines Into Summarization
Temporal Processing
• Based on TimeML (Pustejovsky et al., 2003)
• Basic temporal units: events + timexes
• Three steps
– Event-timex temporal relation classification
– Event-event temporal relation classification
– Timex normalization
• Merge to obtain timelines
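The merge step might look roughly like the sketch below. It assumes the classifiers have already produced event-timex links and that timexes have been normalized to ISO-style date strings. The function name, the input shapes, and the toy dates in the usage note are all hypothetical, and event-event relations are ignored for brevity.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_timeline(event_timex_links: List[Tuple[str, str]],
                   normalized_timexes: Dict[str, str]
                   ) -> List[Tuple[str, List[str]]]:
    """Group events under the normalized date of the timex they link to."""
    spans: Dict[str, List[str]] = defaultdict(list)
    for event, timex_id in event_timex_links:
        date = normalized_timexes.get(timex_id)
        if date is not None:  # skip timexes that could not be normalized
            spans[date].append(event)
    # ISO-style date strings sort chronologically as plain strings.
    return sorted(spans.items())
```

For example, with links `[("smashed", "t1"), ("killed", "t2")]` and made-up normalizations `{"t1": "2000-01-02", "t2": "1991"}`, the event "killed" is placed before "smashed" on the timeline.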
Timelines
Sentence Scoring
• Time span importance
• Contextual time span importance
• Sentence temporal coverage density
Defining Timeline Features
Time Span Importance (TSI)
• Time spans which contain many events are more salient
• Sentences which reference events in these time spans are thus better candidates for a summary
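One way to turn this intuition into a number is sketched below. The slides do not give the scoring formula, so the normalization by the busiest time span and the use of `max` to score a sentence are assumptions made for illustration, as are the function names.

```python
from typing import Dict, Iterable, List


def time_span_importance(timeline: Dict[str, List[str]]) -> Dict[str, float]:
    """Score each time span by its event count, scaled to [0, 1]."""
    counts = {span: len(events) for span, events in timeline.items()}
    peak = max(counts.values(), default=0)
    # Assumption: divide by the largest count so the busiest span scores 1.0.
    return {span: (n / peak if peak else 0.0) for span, n in counts.items()}


def sentence_tsi(sentence_spans: Iterable[str], tsi: Dict[str, float]) -> float:
    """Score a sentence by the most important time span it references."""
    return max((tsi.get(span, 0.0) for span in sentence_spans), default=0.0)
```

Under this sketch, a sentence that mentions even one event in a crowded time span scores highly, which matches the second bullet above.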
Scoring TSI
Contextual Time Span Importance (CTSI)
• Time spans near to “important” time spans may also be important
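A rough sketch of the idea follows. It assumes time spans are listed in timeline order and that a neighbour's influence decays with distance. The `window` and `decay` parameters are hypothetical knobs, not values from the slides.

```python
from typing import List


def contextual_tsi(tsi_in_order: List[float],
                   window: int = 1,
                   decay: float = 0.5) -> List[float]:
    """Score each span from the importance of its neighbours on the timeline."""
    n = len(tsi_in_order)
    scores = []
    for i in range(n):
        total = 0.0
        # Look at spans up to `window` positions away on either side.
        for j in range(max(0, i - window), min(n, i + window + 1)):
            if j != i:
                # Assumption: influence shrinks geometrically with distance.
                total += tsi_in_order[j] * decay ** abs(i - j)
        scores.append(total)
    return scores
```

With `[1.0, 0.0, 0.0]` as input, only the middle span gains importance, because it is the only one adjacent to the important first span.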
Scoring CTSI
Sentence Temporal Coverage Density (TCD)
• Number of sentences in a summary is limited
• Favour sentences which
– contain more events
– cover a wide variety of time spans
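The two preferences above can be captured with a simple sort key, sketched here. This is an illustrative stand-in, not the scoring formula used in the work. It assumes each sentence comes with a list of (event, time span) pairs, and `total_spans` is the number of time spans on the whole timeline.

```python
from typing import List, Tuple


def temporal_coverage(sentence_events: List[Tuple[str, str]],
                      total_spans: int) -> Tuple[float, int]:
    """Return (fraction of timeline spans covered, number of events)."""
    distinct_spans = {span for _, span in sentence_events}
    coverage = len(distinct_spans) / total_spans if total_spans else 0.0
    # Assumption: coverage is compared first, event count breaks ties.
    return coverage, len(sentence_events)
```

Sorting candidate sentences by this tuple in descending order favours those that touch many time spans, then those that contain more events.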
Scoring TCD
Sentence Re-ordering
• SWING makes use of the Maximal Marginal Relevance (MMR) algorithm to identify redundancies in selected sentences
• MMR is heavily biased towards lexical and surface similarities
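For reference, a generic greedy MMR selection can be sketched as follows. This is the textbook form of the algorithm, not SWING's actual code. The Jaccard word-overlap similarity and the `lam` trade-off value are assumptions chosen to show the lexical bias the slide mentions.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_select(relevance: Dict[str, float], k: int, lam: float = 0.7) -> List[str]:
    """Greedily pick k sentences, trading relevance against redundancy."""
    tokens = {s: set(s.lower().split()) for s in relevance}
    selected: List[str] = []
    remaining = list(relevance)
    while remaining and len(selected) < k:
        def mmr(s: str) -> float:
            # Redundancy: highest similarity to any sentence already chosen.
            redundancy = max((jaccard(tokens[s], tokens[t]) for t in selected),
                             default=0.0)
            return lam * relevance[s] - (1 - lam) * redundancy
        best = max(remaining, key=mmr)
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because the redundancy term only looks at shared words, two sentences that describe the same moment in different words are not penalised.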
Beyond Lexical Penalties
An official in Barisal, 120 kilometres south of Dhaka, spoke of severe destruction as the 500 kilometre-wide mass of cloud passed overhead.
“Many trees have been uprooted and houses and schools blown away,” Mostofa Kamal, a district relief and rehabilitation officer, told AFP by telephone.
“Mud huts have been damaged and the roofs of several houses blown off,” said the state’s relief minister, Mortaza Hossain.
TimeMMR
• Novel dimension to redundancy detection
• Beyond lexical similarities, identify sentences which contain substantial time span overlaps
• Candidate sentences which share many time spans with selected sentences are penalised
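The temporal penalty described above might be sketched like this. The overlap measure (the share of a candidate's time spans already covered) and the `penalty` weight are assumptions, and the lexical MMR term is left out so that the temporal part stands alone.

```python
from typing import Dict, List, Set


def time_overlap(candidate_spans: Set[str], covered_spans: Set[str]) -> float:
    """Fraction of the candidate's time spans already covered by the summary."""
    if not candidate_spans:
        return 0.0
    return len(candidate_spans & covered_spans) / len(candidate_spans)


def time_mmr_select(relevance: Dict[str, float],
                    spans: Dict[str, Set[str]],
                    k: int,
                    penalty: float = 0.5) -> List[str]:
    """Greedily pick k sentences, penalising shared time spans."""
    selected: List[str] = []
    covered: Set[str] = set()
    remaining = list(relevance)
    while remaining and len(selected) < k:
        best = max(remaining,
                   key=lambda s: relevance[s] - penalty * time_overlap(spans[s], covered))
        selected.append(best)
        covered |= spans[best]  # time spans now represented in the summary
        remaining.remove(best)
    return selected
```

Here a sentence with lower relevance can still be chosen over a stronger one when the stronger sentence only repeats time spans the summary already covers.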
EXPERIMENTS AND RESULTS
Results
• TAC-2010 data set to train regression model
• TAC-2011 data set to test
• Using timelines leads to better summaries!

System        ROUGE-2
SWING         0.1339
+ Timelines   0.1394*
+ TimeMMR     0.1389
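For readers unfamiliar with the metric, ROUGE-2 is based on bigram overlap between a system summary and a reference summary. The sketch below computes a simplified recall variant against a single reference with whitespace tokenization. The official ROUGE toolkit offers more options (such as stemming and multiple references), and the slides do not state which configuration produced the scores above.

```python
from collections import Counter


def bigrams(text: str) -> Counter:
    """Count adjacent word pairs in lower-cased, whitespace-split text."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2_recall(candidate: str, reference: str) -> float:
    """Share of reference bigrams that also appear in the candidate."""
    ref = bigrams(reference)
    cand = bigrams(candidate)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match at the number of times the bigram occurs in the candidate.
    overlap = sum(min(count, cand[bg]) for bg, count in ref.items())
    return overlap / total
```

For instance, the candidate "the storm hit the coast" matches two of the three bigrams in the reference "the storm hit bangladesh".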
Overcoming Errors
• Timelines contain errors
– Errors from underlying temporal processing systems
– Simplifying assumptions made in timeline construction
– Lack of consistency checking and validation
Reliability Filtering
• Identify timelines which potentially contain more errors
• Exclude these when performing summarization
Length as a Metric
• Use the length of a timeline as a gauge of its “accuracy”
• Drop timelines which are shorter than the average length, computed over the whole input document collection
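The filter can be sketched in a few lines. It assumes one timeline per document set and measures length as the number of entries on the timeline. Both the input shape and that definition of length are assumptions, since the slides do not define them.

```python
from typing import Dict, List


def filter_by_length(timelines: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Keep only timelines at least as long as the collection average."""
    if not timelines:
        return {}
    average = sum(len(t) for t in timelines.values()) / len(timelines)
    # Timelines shorter than the average are treated as unreliable and dropped.
    return {name: t for name, t in timelines.items() if len(t) >= average}
```

Document sets whose timeline is dropped would then be summarized without timeline features.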
Results
• Experiments repeated with reliability filtering
• Significant improvement obtained
• After filtering, timelines are used in 21 out of 44 document sets

System                    ROUGE-2
SWING                     0.1339
+ Timelines               0.1394*
+ Timelines + Filtering   0.1418**
+ TimeMMR                 0.1389
+ TimeMMR + Filtering     0.1402**
DISCUSSION
Text Example
The Army’s surgeon general criticized stories in The Washington Post disclosing problems at Walter Reed Army Medical Center, saying the series unfairly characterized the living conditions and care for soldiers recuperating from wounds at the hospital’s facilities.
The Army’s surgeon general criticized stories in The Washington Post disclosing problems at Walter Reed Army Medical Center, saying the series unfairly characterized the living conditions and care for soldiers recuperating from wounds at the hospital’s facilities.
Defense Secretary Robert Gates says people found to have been responsible for allowing substandard living conditions for soldier outpatients at Walter Reed Army Medical Center in Washington will be “held accountable,” although so far no one in the Army chain of command has offered to resign.
A top Army general vowed to personally oversee the upgrading of Walter Reed Army Medical Center’s Building 18, a dilapidated former hotel that houses wounded soldiers as outpatients.
Top Army officials visited Building 18, the decrepit former hotel housing more than 80 recovering soldiers, outside
“I’m not sure it was an accurate representation,” Lt. Gen. Kevin Kiley, chief of the Army Medical Command which oversees Walter Reed and all Army health care, told reporters during a news conference.
Timelines Used SWING
Future Work
• Study the use of alternative evaluation metrics, especially for TimeMMR
• Look at better metrics for reliability filtering
• Expand the scope of the timelines that are used for more flexibility
Conclusion
• Temporal information is useful for summarization!
• Sentence Scoring
– Derive features from a timeline
– Combine features with a supervised learning summarization framework
• Sentence Re-ordering
– Use overlapping time spans to identify redundancies
Thank you!