Date post: | 17-Dec-2015 |
Sustainable Preservation of Linked Data
Vassilis Christophides
Linked Data vs Cultural Artifacts!
• Linked datasets are digitally-born objects designed to be copied; they rely on vocabularies and integrity constraints (understandable by both people and programs), and their data and structures change over time
Digital Object vs Data Preservation
Source: Preserving Our Digital Heritage: The National Digital Information Infrastructure and Preservation Program 2010 Report. A Collaborative Initiative of the Library of Congress
Frame Linked Data Preservation as a Sustainable Economic Activity
• Economic activity: deliberate allocation of resources
– Cost of losing datasets
• Sustainable: ongoing resource allocation over long periods of time
– Involved data subjects
• Articulate the problem/provide recommendations & guidelines
– Economic and societal benefits
[Figure labels: Technical, Social, Economic]
Blue Ribbon Task Force on Sustainable Digital Preservation and Access, Final report 2010
Sustainability Conditions
• Who benefits from use of the preserved data?
• Who selects what data to preserve?
• Who owns the data?
• Who preserves the data?
• Who pays both for data and preservation services?
• recognition of the benefits of preservation by decision makers
• selection of datasets with long-term value
• incentives for decision makers to act in the public interest or to elaborate new business models
• appropriate governance of preservation activities
• ongoing and efficient allocation of resources to preservation
• timely actions to ensure long-term data access and usability
The Scientific Data Life Cycle
• Data Life Cycle Labs: A New Concept to Support Data-Intensive Science
[Figure: the scientific data life cycle (Philip Lord, 2003). Diagram labels: Scientist; Research Process; Primary data; Secondary (derived) data; Tertiary data for publication; Primary publication; Secondary publication; Tertiary publication; Peer Review; e-Prints; Publication Archives; Publication Process; Library, Peers, Public, Industry; Web Content; Patent data; Research based on data; Metadata; Curation; Curator; Curation Process; Archived data; Data repositories]
Data-as-a-Service (DaaS) Pricing Models
• By far the most common case is a fixed price for the entire data set (CustomLists, Infochimps) or a fixed number of transactions per month based on client subscriptions (Azure DataMarket, Infochimps API)
• DaaS pricing models are based on tiered data access and fall into:
– Volume-based model: 1) quantity-based pricing and 2) pay per call (a “call” is a single request/response interaction with the API for data)
– Data type-based model: for example, a mapping API that offers the geo-coordinates and zip codes of the neighbourhoods in an urban area, while additional attributes such as school or post office locations are sold for an additional charge
– Hybrid pricing models combine value with volume charges to create finer-grained pricing that better meets both the buyers’ and sellers’ needs
• Existing pricing models essentially favour big customers that can typically afford to purchase the entire data sets they need, whereas small customers often need only a few data items from them and cannot afford to pay the full price
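The pricing models listed above can be compared with a small calculation. The following is only an illustrative sketch: every function name, tier boundary and price is a hypothetical assumption made for this example, and none of it is taken from CustomLists, Infochimps, Azure DataMarket or any other provider.

```python
from __future__ import annotations

# All names and prices here are hypothetical, chosen only to illustrate the models.


def fixed_price(dataset_price: float) -> float:
    """Flat fee for the entire data set, regardless of how much of it is used."""
    return dataset_price


def pay_per_call(calls: int, price_per_call: float) -> float:
    """Volume-based: every request/response interaction with the API is billed."""
    return calls * price_per_call


def quantity_tiered(records: int, tiers: list[tuple[float, float]]) -> float:
    """Volume-based, quantity pricing.

    `tiers` holds (upper_bound, unit_price) pairs in ascending order; each
    record is billed at the unit price of the tier it falls into.
    """
    cost, lower = 0.0, 0.0
    for upper, unit_price in tiers:
        if records <= lower:
            break
        cost += (min(records, upper) - lower) * unit_price
        lower = upper
    return cost


def data_type_based(base_fee: float, addons: dict[str, float], selected: set[str]) -> float:
    """Data type-based: a base attribute set plus a surcharge per extra attribute."""
    return base_fee + sum(addons[name] for name in selected)


def hybrid(calls: int, price_per_call: float, base_fee: float,
           addons: dict[str, float], selected: set[str]) -> float:
    """Hybrid: a data type (value) charge combined with a volume charge."""
    return data_type_based(base_fee, addons, selected) + pay_per_call(calls, price_per_call)


# Assumed example figures for a small customer who needs only 50 items.
TIERS = [(1000, 0.10), (10000, 0.05), (float("inf"), 0.01)]
ADDONS = {"schools": 5.0, "post_offices": 3.0}

if __name__ == "__main__":
    print("fixed price:     ", fixed_price(500.0))
    print("pay per call:    ", pay_per_call(50, 0.02))
    print("quantity tiered: ", quantity_tiered(2500, TIERS))
    print("data type based: ", data_type_based(20.0, ADDONS, {"schools"}))
    print("hybrid:          ", hybrid(50, 0.02, 20.0, ADDONS, {"schools"}))
```

With these assumed numbers, the fixed price is the same whether the customer uses 50 items or the whole data set, while the per-call and hybrid charges grow with actual use. That is the gap between big and small customers described in the last bullet.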