DANS is an institute of KNAW and NWO
Data Archiving and Networked Services
A Perspective on Archiving the Scholarly Web
Herbert van de Sompel, LANL Team Leader, Prototyping / DANS Visiting Fellow
http://orcid.org/0000-0002-0715-6126/
Andrew Treloar, ANDS Director of Technology / DANS Visiting Fellow
http://orcid.org/0000-0002-8911-3081/
Overview
• Some context
• Thinking about the future
• Changes in research process and outputs
• Implications for archives
• Conclusions
October 8, 2014 CC-BY-SA, @hvdsomp and @atreloar
Republic of Letters
System of Journals
System of Journals
• Roosendaal and Geurts (1997)
• Registration
– submission of manuscript
• Certification
– peer-review (pre-publication)
– commentary (post-publication)
• Awareness
– discovery services
• Archiving
– libraries (print)
– publishers (electronic)
– special purpose organisations (e.g. Portico)
Web of Objects
Image: http://www.researchobject.org/pages/wp-content/uploads/2014/08/research-objects-illustration.png
Pointers to the future
“the future is already here – it’s just not very evenly distributed”
William Gibson, NPR interview
So, let’s look at the most evenly distributed example we have: the life sciences
Registration: Protein Data Bank
Registration: Wiki Pathways
Registration: NeuroLex
Certification: PubMed Commons
Awareness: myExperiment
Awareness: OpenMalaria
Archiving: CSIRO
Archiving: GenBank
Caveats
• Different disciplines have different cultures and practices
• The life sciences even differ somewhat from other sciences
• But they are tackling some of the challenges now that other disciplines will face soon (or are grappling with now):
– increased data volumes and complexity
– greater reliance on in-silico science
– need to support greater transparency and collaboration
Changes in research process
Changes in research objects
Publishing, pre-web
CC-BY-SA, @hvdsomp
Publishing, web
Publishing, web
Web platforms for scholarship
• Common web platforms are increasingly used for scholarship
– Wikis, GitHub, Dropbox, Twitter, WordPress, etc.
• Many of these have desirable characteristics:
– Versioning
– Timestamping
– Social embedding
• They record rather than archive
• But they are capturing critical elements of the scholarly process
GitHub terms of service
• GitHub reserves the right at any time and from time to time to modify or discontinue, temporarily or permanently, the Service (or any part thereof) with or without notice. (E.1)
• GitHub does not warrant that (i) the service will meet your specific requirements, (ii) the service will be uninterrupted, timely, secure, or error-free, (iii) the results that may be obtained from the use of the service will be accurate or reliable, (iv) the quality of any products, services, information, or other material purchased or obtained by you through the service will meet your expectations, and (v) any errors in the Service will be corrected. (D.4)
Recording vs. Archiving
Recording | Archiving
Short-term | Longer-term
No guarantees provided | Attempt to provide guarantees
Write many/Read many | Write once/Read many
Scholarly process | Scholarly record
Herbert Van de Sompel, Andrew Treloar OCLC DANS Workshop, Amsterdam, Netherlands, June 10 2014
Web of Objects – Core Observations
• The research process, not just its outcome, is becoming visible … on the web
• Massive extension of the scholarly record with an enormous variety of novel objects
• The objects are heterogeneous, dynamic, compound, inter-related and distributed across the web
• The objects are often hosted on common web platforms that are not dedicated to scholarship
The archival paradigm [and infrastructure] must take these characteristics into account
Transfer from recording to archive
• What: Boundaries of archived object(s)
• When: Triggers for archival transition
• Who: Archivist? Researcher? Software?
• Why: Motivations to archive
Further work
• Exploring the implications of this approach, and answers to some of these questions, together with the archival team at DANS (and the institutions/researchers with whom they engage)
Conclusion
• Nature of research process is changing
• Kinds of research objects (inputs and outputs) are changing
• Need the ability to record and archive processes as well as outputs
• Archives need to respond to these changes
Data Archiving and Networked Services (DANS)
Anna van Saksenlaan 10 | 2593 HT The Hague | The Netherlands
P.O. Box 93067 | 2509 AB The Hague | The Netherlands
070 3446 484 | [email protected] | www.dans.knaw.nl
KVK 54667089 | DANS is an institute of KNAW and NWO
Questions?
@hvdsomp
@atreloar
CC-BY-SA @hvdsomp and @atreloar