Post on 10-Jul-2015
“Concise Preservation by Combining Managed Forgetting
and Contextualized Remembering”
EU/FP7 ForgetIT Project (2013-2016)
http://www.forgetit-project.eu
What Triggers Human Remembering of Events?
Large-Scale Analysis of Collective Memory in Wikipedia
Nattiya Kanhabua, Tu Ngoc Nguyen and Claudia Niederée
L3S Research Center, Hannover, Germany
Motivation: ForgetIT Project
Human Forgetting and Remembering
Collective Memory in Wikipedia
Experiments and Discussion
Conclusions
Outline
However, we are facing:
• Dramatic increase in content creation (e.g. digital photos)
• Increasing use of mobile devices with restricted capacity
• Information overload and changing professional and private lives
• Inadvertent forgetting due to lack of systematic preservation
Forgetting plays a crucial role for human remembering and life
(focus on current, relevant information; ignore redundant details)
Managed forgetting ≠ automatic deletion
Instead: a range of forgetting options e.g.
• Resource condensation
• Change of indexing & ranking
• Reduction of redundancy
A computer that forgets intentionally?
And in the context of digital preservation?
Managed forgetting = to remember the right information
Individual memories are subject to a fast
forgetting process [Ebbinghaus, 1885]
• Rapidly forget details -> “less redundancy”
Episodic memory (of one’s past event) is
reconstructed from similar events/context
• Rely on common patterns -> “false memory”
Memory bumps in the forgetting curve are
caused by reminding or triggering through:
• A physical object (e.g. a printed photo)
• A digital memory system
• Different subsequent events
Human Forgetting and Remembering
H. Ebbinghaus, Über das Gedächtnis. Untersuchungen zur experimentellen Psychologie. Duncker & Humblot,
Leipzig, 1885.
E. Tulving, Episodic memory: From mind to brain. Annual review of psychology, vol. 53, no. 1, pp. 1-25, 2002.
“Collective memory is a socially constructed, common image (memory)
of the past of a community, which frames its current understanding
and actions.” [Halbwachs, 1950]
• Crowd phenomenon and important to societal processes
• Not static, as it is determined by the concerns of the present
From Individual Memories to Collective Memory
M. Halbwachs, On collective memory. Chicago: The University of Chicago Press, 1950 (Translation).
Flashbulb memories in cognitive psychology
• A study of remembering of high-impact events, e.g.,
The British Royal Wedding or September 11 attacks
• Aspects: details, confidence, consistency of memory
over time, impact of media coverage
• Qualitative study: limited number of events and users
Collective Memory in Wikipedia
Wikipedia as a source for global memory
• Largest and most up-to-date online encyclopedia
(19M registered users, 30K active editors)
• Social negotiation and construction reflected in
early editing activities and talk pages
• Indicators for identifying real-world events
C. Pentzold, The online encyclopaedia wikipedia as a global memory place, Memory Studies, 2009.
M. Georgescu, N. Kanhabua, D. Krause, W. Nejdl, and S. Siersdorfer, Extracting event-related information
from article updates in wikipedia, ECIR'2013.
View logs as the signal for collective memory
• Public page view traffic over a long time span
• Does not directly reflect how people forget, but significant
patterns are a good estimate of public remembering
• Large-scale analysis complements (1) qualitative
studies and (2) analysis of article content (scalability)
Contributions
First study of identifying catalysts for event memory triggering by using
time series analysis techniques:
• temporal correlations in peaking page visits between events,
• a surprise score or the residual sum of squares on prediction error, and
• the skewness of view shapes as a catalyst for memories
Identify the relationship between events by using different features
• the role of time passed, the same types of events, the size or magnitude of
events, the near-by city or neighbor country
Analyze over 5500 high-impact events from 11 event categories
Related to the previous study by [Au Yeung and Jatowt, 2011]
• Analyzed references to the past (as an indicator of what is remembered) in a
large news collection to identify which years are most frequently referenced
C.-m. Au Yeung and A. Jatowt, Studying how the past is remembered: Towards computational history
through large scale text mining, CIKM’2011
We propose a 3-step approach, for a given event:
1. Compute “remembering scores” of past events within the same category
2. Rank related past events by the computed remembering scores
3. Identify features (e.g., time, location) having a high correlation with remembering
Our Approach
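Steps 2 and 3 of the approach can be sketched as follows. This is a minimal illustration, not the authors' implementation: the event names, scores and feature values are made-up placeholders, and Pearson correlation is an assumed choice, since the slides do not say which correlation measure was used.

```python
import math

def rank_by_score(scores):
    """Step 2: order past events by descending remembering score."""
    return sorted(scores, key=scores.get, reverse=True)

def pearson(xs, ys):
    """Step 3 helper: Pearson correlation between a feature and the scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    norm = math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / norm if norm else 0.0

# Placeholder output of step 1 (hypothetical events and scores).
scores = {"past_event_a": 0.2, "past_event_b": 0.9, "past_event_c": 0.5}
# Placeholder feature: years between each past event and the new event.
years_apart = {"past_event_a": 12.0, "past_event_b": 1.0, "past_event_c": 6.0}

ranking = rank_by_score(scores)
names = list(scores)
corr = pearson([years_apart[n] for n in names], [scores[n] for n in names])
print(ranking, round(corr, 3))
```

A strongly negative correlation for the time feature would suggest that more recent past events are remembered more, which is the kind of relationship step 3 looks for.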
Remembering scores: a linear combination of three features:
1. Cross-correlation coefficient (CCF)
2. Sum of squared error (SSE)
3. Skewness (Kurtosis)
Measuring Signals for Memory Revival
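The three signals can be computed from daily page view series roughly as below. This is a sketch under stated assumptions: the cross-correlation is a Pearson correlation over the overlapping window at a given lag, the "prediction" behind the SSE is a trailing-mean forecast (the slides do not name the prediction model), and the shape measure is kurtosis, since the slides list skewness and kurtosis together. All function names and the `window` parameter are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def cross_correlation(x, y, lag=0):
    """Pearson correlation of x[t] with y[t + lag] over the overlapping window."""
    if lag < 0:
        return cross_correlation(y, x, -lag)
    n = min(len(x), len(y) - lag)
    a, b = x[:n], y[lag:lag + n]
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    norm = math.sqrt(sum((p - ma) ** 2 for p in a) * sum((q - mb) ** 2 for q in b))
    return cov / norm if norm else 0.0

def surprise_sse(views, window=7):
    """Sum of squared errors of a trailing-mean forecast (assumed baseline)."""
    sse = 0.0
    for t in range(window, len(views)):
        forecast = mean(views[t - window:t])
        sse += (views[t] - forecast) ** 2
    return sse

def kurtosis(xs):
    """Population kurtosis m4 / m2^2 (3 for a normal distribution)."""
    m = mean(xs)
    m2 = mean([(v - m) ** 2 for v in xs])
    m4 = mean([(v - m) ** 4 for v in xs])
    return m4 / (m2 ** 2) if m2 else 0.0
```

A sudden peak in the views of a past event's page, aligned with the peak of the new event's page, raises all three values: the series correlate, the forecast is badly wrong, and the view distribution becomes heavy-tailed.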
Remembering = α•CCF + β•SSE + γ•Kurtosis
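As a worked illustration of the formula, the sketch below combines the three features with the weights reported in the experiments (α = 0.5, β = 0.4, γ = 0.1). The min-max scaling step is an assumption added here so that the unbounded SSE does not dominate; the slides do not state how the features were normalized. The input values are made-up placeholders.

```python
def min_max(values):
    """Scale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def remembering_scores(ccf, sse, kurt, alpha=0.5, beta=0.4, gamma=0.1):
    """Linear combination of the three (normalized) signals per past event."""
    ccf_n, sse_n, kurt_n = min_max(ccf), min_max(sse), min_max(kurt)
    return [alpha * c + beta * s + gamma * k
            for c, s, k in zip(ccf_n, sse_n, kurt_n)]

# Placeholder feature values for three hypothetical past events.
scores = remembering_scores(
    ccf=[0.2, 0.8, 0.5],
    sse=[100.0, 300.0, 200.0],
    kurt=[2.0, 4.0, 3.0],
)
print(scores)
```

Because the weights sum to 1 and every feature is scaled to [0, 1], the resulting scores also fall in [0, 1] and can be compared across candidate events of one category.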
Features for Triggered Remembering
Temporal similarity:
• Time distance between two events (in days, months or years)
• Time distance based on exponential decay functions
Location similarity:
• Map a geographic hierarchy of event locations as follows
city -> state -> country -> neighbor countries -> continent
• Assign 4 scale values: 4 to same city, 3 to same state, 2 to same country, 1 to same continent
Impact of Events:
• Damaged area/properties/cost/fatalities
• Magnitude (for earthquake events)
• Highest winds, lowest pressure (for Atlantic hurricanes)
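The temporal and location features can be sketched as below. This is an illustration under assumptions: the decay rate is a hypothetical parameter, the location records are placeholder dictionaries, and the "neighbor countries" level is not scored because the slide assigns only four scale values.

```python
import math

def time_decay_similarity(days_apart, decay_rate=0.001):
    """Exponential decay of similarity with time distance (decay_rate assumed)."""
    return math.exp(-decay_rate * abs(days_apart))

# Levels checked top-down, so a city only matches if all coarser levels match.
HIERARCHY = ("continent", "country", "state", "city")

def location_similarity(a, b):
    """Return 4 for same city, 3 same state, 2 same country, 1 same continent, else 0."""
    score = 0
    for level in HIERARCHY:
        if a.get(level) is None or a.get(level) != b.get(level):
            break
        score += 1
    return score

# Placeholder locations (hypothetical names).
loc_a = {"continent": "cont_1", "country": "country_1", "state": "state_1", "city": "city_1"}
loc_b = {"continent": "cont_1", "country": "country_1", "state": "state_2", "city": "city_9"}

print(location_similarity(loc_a, loc_a), location_similarity(loc_a, loc_b))
print(time_decay_similarity(0), time_decay_similarity(365))
```

Checking the hierarchy from the continent downwards avoids treating two cities with the same name in different countries as the same place.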
N. Kanhabua and K. Nørvåg: Determining time of queries for re-ranking search results. ECDL 2010
J. Strötgen, M. Gertz, and C. Junghans: An event-centric model for multilingual document similarity. SIGIR 2011
Experiments
Datasets:
• Page view statistics, 2007-2013
• A large set of 5,500 events
• From 11 event-related categories
• α = 0.5, β = 0.4, γ = 0.1
Temporal and spatial distributions
• Strong focus on more recent events
• Better coverage with increasing popularity
• Most frequent locations depending on event types
Temporal and Spatial Distributions
Category: Atlantic Hurricane
Distributions of remembering scores
• Hurricane Sandy (formation date: October 22, 2012; affected area: Mid-Atlantic)
• Hurricane Hanna (formation date: August 28, 2008; affected area: US east coast)
Location and time have a low effect on
remembering scores for this category.
Category: Atlantic Hurricane
Top-10 events triggered by the two events
• Hurricane Hanna triggers remembering of Hurricane Gustav, the most recent hurricane to strike the area of Puerto Rico and the East Coast
• Hurricane Sandy triggers the 1991 Perfect Storm, which initially formed around the Canada area and is high-impact and most destructive
Category: Aviation accidents
A mixture of impact factors, such as time and location
• Qantas Flight 32 (incident on 4 November 2010) triggers remembering of
(1) Qantas Flight 30 and British Airways Flight 9 (both going to Australia),
and (2) Aero Caribbean Flight 883 (most recent event)
Callouts in the figure: most recent; same destination; deadliest (two aircraft collided); Concorde
Category: Earthquakes
A series of earthquake events in Christchurch, New Zealand
• 2010 Canterbury earthquake triggers 2010 Haiti earthquake (recent and
high-impact), two close-by events, and high-impact historical earthquakes
• 2011 Christchurch earthquake shows a locality focus, i.e., people seem to be
interested in the previous events in the same region
• For the June 2011 Christchurch earthquake, the remembered events are dominated
by the two predecessor events
Look beyond single events, especially if there are
several events in temporal and spatial proximity.
Category: Terrorist incidents
Interesting observation: semantic similarity between events
• June 2012 Kaduna church bombings trigger other religion-related terror attacks
• 2008 Mumbai attacks trigger terror attacks on business, entertainment and hotel targets
Conclusions
We identified some first patterns of event memory triggering for diverse event types, including natural and man-made disasters as well as accidents and terrorism.
Our analysis confirmed the influence of closeness in time and location, but the semantic similarity of events also influences which event memories are triggered by an event.
In our future work, we plan to deepen our systematic analysis of factors for revisiting past events and of the combination of those factors.
We also plan to investigate external factors such as media coverage linking new events to past events or reflection of such relationships in other types of social media.
What do you remember? Thanks for your attention!