How I spend my summer vacations

Post on 04-Feb-2015

description

WADL2013 presentation by Justin F. Brunelle on his current research projects.

transcript

How I spend my summer vacations

Justin F. Brunelle
WS-DL Research Group

Department of Computer Science
Old Dominion University

WADL 2013

Justin in a nutshell

• PhD Student at ODU
– Dynamic representations in the archives
– Improved quality from archived data
• Alter-ego: Application Developer at The MITRE Corporation
– Big data & cloud computing

How much can we archive?

The setup

• 1,000 URIs from Twitter
• 1,000 URIs from Archive-it
• Capture with tools
• Study the archivability

[Image slides: three examples labeled "Good", then five labeled "Bad"]

Why?

Losing the Moment

• What we share != What we curate
• 4.2% of Twitter is perfectly archived
– Losing My Revolution: 11% gone in 2 years
• 34.2% of Archive-it is perfectly archived
• Accessibility? Gov vs. non-Gov?
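
The slides do not define "perfectly archived". A minimal sketch, assuming it means that every embedded resource of a memento was captured, shows how such a percentage could be computed. The data layout (a URI mapped to the HTTP status codes of its embedded resources) and both function names are hypothetical.

```python
def is_perfectly_archived(statuses):
    """Assumed definition: every embedded resource returned HTTP 200."""
    return all(code == 200 for code in statuses)


def percent_perfect(mementos):
    """Percentage of mementos whose embedded resources were all captured."""
    if not mementos:
        return 0.0
    perfect = sum(
        1 for statuses in mementos.values() if is_perfectly_archived(statuses)
    )
    return 100.0 * perfect / len(mementos)


# Hypothetical sample: URI -> status codes of its embedded resources.
sample = {
    "http://example.com/a": [200, 200, 200],
    "http://example.com/b": [200, 404],
    "http://example.com/c": [200, 503, 200],
    "http://example.com/d": [],  # no embedded resources counts as perfect
}
print(percent_perfect(sample))
```

Under this definition a single missing image makes a memento imperfect, which may be one reason the reported percentages are so low.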

Measuring memento damage

VS.

Not all embedded resources are created equal

Planned Work

• Evaluate importance of missing stuff
– Size, position
– # CSS Classes
– Not all stylesheets created equal
– Missing border vs missing functionality
– “Whitespace”
– Provide Web service
• Mechanical Turk evaluation of “damage”
• Evaluate collections of mementos
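
The idea of weighting missing resources by importance can be sketched as a damage score. This is an illustration, not the method from the talk: the class, the field names, the position factor of 2.0, and the formula are all assumptions, and only size and position are modeled.

```python
from dataclasses import dataclass


@dataclass
class EmbeddedResource:
    uri: str
    area_fraction: float  # share of the rendered page it covers, 0.0 to 1.0
    central: bool         # True if it sits in the main content area
    missing: bool         # True if the archive failed to capture it


def importance(resource):
    """Assumed weighting: bigger and more central resources matter more."""
    position_factor = 2.0 if resource.central else 1.0  # illustrative value
    return resource.area_fraction * position_factor


def damage(resources):
    """Weighted share of the page that is missing, from 0.0 to 1.0."""
    total = sum(importance(r) for r in resources)
    if total == 0:
        return 0.0
    lost = sum(importance(r) for r in resources if r.missing)
    return lost / total


def page(missing_uri):
    """Build the same hypothetical three-image page with one image missing."""
    layout = [
        ("hero.png", 0.5, True),
        ("icon.png", 0.1, False),
        ("badge.png", 0.1, False),
    ]
    return [EmbeddedResource(u, a, c, u == missing_uri) for u, a, c in layout]


print(damage(page("icon.png")))  # small corner image lost
print(damage(page("hero.png")))  # large central image lost
```

Both pages are missing one of three images, so a plain count would rate them equally. The weighted score rates the loss of the large central image as far more damaging.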

What does it all mean?

• Archivability is measurable
• Damage is measurable
• If we can predict archivability…
– We can try new methods of archiving on “hard to capture” mementos
– Attempt repairs on existing mementos
– Gauge our successes in real-time
• Next step: capturing dynamic content