Provenance of scientific information as experienced in DRIVER

Date post: 08-Jan-2016
Description:
Provenance of scientific information as experienced in DRIVER. 6th e-Infrastructure Concertation Event, Lyon, 24th November 2008. Wolfram Horstmann, Bielefeld University / DRIVER.
Transcript
Page 1: Provenance of  scientific information as experienced in DRIVER

Provenance of scientific information

as experienced in DRIVER

6th e-Infrastructure Concertation Event

Lyon, 24th November 2008

Wolfram Horstmann, Bielefeld University / DRIVER

Page 2: Provenance of  scientific information as experienced in DRIVER

Notions of Provenance

• Where do data objects* originate from?
  – Scientific work (examples)
    • Instrumentation techniques
      – Manufacturers of hardware and software
    • Methodologies
      – Processes, e.g. gene sequencing
  – Technical/local (examples)
    • (Web) identifiers
    • Database, repository name

* Primary data, documents, metadata …

Page 3: Provenance of  scientific information as experienced in DRIVER

Why Provenance?

• Quoting / Citing / Referencing as global scientific principle
  – “Reproducible research”

• Giving credits to authors / creators in distributed environments

• Original location / context has to be known

• Experienced in Grid-Environments [1]

Page 4: Provenance of  scientific information as experienced in DRIVER

Provenance & Interoperability

• Re-Use / Sharing: “Addressing/Accessing”
  – Common view, common use
  – Unidirectional: no change of data objects!
• Federation: “Discovering in Context”
  – Remote representation of distributed DOs
• Aggregation: “Contextualizing”
  – Add unchanged object in a context
• Processing/Annotation: “Changing”
  – Uni- vs. bidirectional: change of DOs and remote representation vs. back-storage (e.g. CVS)

Page 5: Provenance of  scientific information as experienced in DRIVER

Scenarios in DRIVER

Page 6: Provenance of  scientific information as experienced in DRIVER

Digital Scientific Data

Page 7: Provenance of  scientific information as experienced in DRIVER

Digital Object Collections

[Figure: digital objects grouped into collections (⊃)]

Page 8: Provenance of  scientific information as experienced in DRIVER

Digital Object Repositories

[Figure: several repositories combined (+ + + + =)]

Page 9: Provenance of  scientific information as experienced in DRIVER

Digital Information Space

Page 10: Provenance of  scientific information as experienced in DRIVER

Conventional Web Data

Page 11: Provenance of  scientific information as experienced in DRIVER

“Simple” Applications

Page 12: Provenance of  scientific information as experienced in DRIVER

Metadata Infrastructure

Page 13: Provenance of  scientific information as experienced in DRIVER

Basic Provenance Settings

• Indicate production situation
  – Metadata
    • Author, instrumentation etc.
• Remote representation
  – Indicate place of origin in remote systems
    • Metadata as digital objects / first order citizens
  – Allow lineage representation
    • Credits in remote environments / versioning

Page 14: Provenance of  scientific information as experienced in DRIVER

Orders of Provenance

• 1st order: metadata
  – Provenance attached to data
  – Minimal “knowledge” required in application
  – Allows remote handling of data objects
  – Requires metadata infrastructure
  – Metadata introduce 2 objects: requires linkage
• 2nd order: context / compounds
  – Express multiple relations between objects
  – May introduce semantic model
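The two orders can be made concrete with a small sketch. This is an illustration only, not DRIVER code: the class names, the `isSupportedBy` relation and all example.org identifiers are hypothetical. It shows that first-order provenance creates two objects (data and metadata) that must be linked, while second-order provenance records relations between objects in a context.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    uri: str      # web-resolvable identifier of the object
    payload: str  # stand-in for the actual data


@dataclass(frozen=True)
class MetadataRecord:
    uri: str              # the record is a second object with its own identifier
    describes: str        # linkage: URI of the data object it describes
    author: str
    instrumentation: str
    origin: str           # repository or database where the object originates


@dataclass
class Context:
    """2nd order: named relations between objects (hypothetical model)."""
    relations: list = field(default_factory=list)

    def relate(self, subject, predicate, obj):
        self.relations.append((subject, predicate, obj))

    def objects(self, subject, predicate):
        return [o for s, p, o in self.relations if s == subject and p == predicate]


def resolve(record, store):
    """Follow the metadata-to-data link; None signals a broken linkage."""
    return store.get(record.describes)


# Hypothetical identifiers, for illustration only.
dataset = DataObject("http://repo.example.org/data/42", "ACGT")
record = MetadataRecord(
    uri="http://repo.example.org/meta/42",
    describes=dataset.uri,
    author="A. Researcher",
    instrumentation="sequencer model X",
    origin="http://repo.example.org",
)
store = {dataset.uri: dataset}

context = Context()
context.relate("http://repo.example.org/docs/7", "isSupportedBy", dataset.uri)

print(resolve(record, store) is dataset)
print(context.objects("http://repo.example.org/docs/7", "isSupportedBy"))
```

If the `describes` link cannot be resolved, `resolve` returns `None`, which is the linkage problem the slide mentions.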

Page 15: Provenance of  scientific information as experienced in DRIVER

Provenance in DRIVER #1

• Simple objects: OAI-PMH [2]
  – 1st order provenance
    • Metadata: minimum OAI-DC
  – 2nd order provenance
    • DRIVER explicit identifiers for repositories
    • OAI-PMH: inline representation (“about”)
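The inline “about” representation can be illustrated with a short sketch. It assumes the provenance container described in the OAI-PMH provenance guidelines [2], with an `originDescription` element that may nest to record several harvesting hops. The sample record, the repository URL and the function name `origin_chain` are made up for illustration and are not taken from DRIVER.

```python
import xml.etree.ElementTree as ET

# Namespace of the provenance container, per the OAI-PMH guidelines [2].
PROV_NS = "http://www.openarchives.org/OAI/2.0/provenance"

# Hypothetical harvested record fragment; all values are invented.
SAMPLE_ABOUT = """\
<about xmlns="http://www.openarchives.org/OAI/2.0/">
  <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
    <originDescription harvestDate="2008-11-01T10:00:00Z" altered="false">
      <baseURL>http://repository.example.org/oai</baseURL>
      <identifier>oai:repository.example.org:1234</identifier>
      <datestamp>2008-10-15</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>
"""


def origin_chain(about_xml):
    """Return the harvesting hops, most recent first, as plain dicts."""
    root = ET.fromstring(about_xml)
    tag = f"{{{PROV_NS}}}"
    chain = []
    # The first match in document order is the outermost (latest) hop.
    node = root.find(f".//{tag}originDescription")
    while node is not None:
        chain.append({
            "base_url": node.findtext(f"{tag}baseURL"),
            "identifier": node.findtext(f"{tag}identifier"),
            "datestamp": node.findtext(f"{tag}datestamp"),
            "harvest_date": node.get("harvestDate"),
            "altered": node.get("altered") == "true",
        })
        # A nested originDescription records an earlier hop.
        node = node.find(f"{tag}originDescription")
    return chain


chain = origin_chain(SAMPLE_ABOUT)
for hop in chain:
    print(hop["base_url"], hop["identifier"], hop["altered"])
```

The last entry of the chain is the place of origin, which is what an aggregator needs in order to link back to the data provider.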

Page 16: Provenance of  scientific information as experienced in DRIVER

Semantic/Compound Data

Page 17: Provenance of  scientific information as experienced in DRIVER

“Semantic” Applications

Page 18: Provenance of  scientific information as experienced in DRIVER

Provenance in DRIVER #2

• “Enhanced Publications”
  – Research project in DRIVER-II
  – Representation of data/document packages
  – Use of OAI-ORE

Page 19: Provenance of  scientific information as experienced in DRIVER

Provenance in OAI-ORE

• OAI-ORE: Object Re-Use and Exchange [4]
  – Uses Resource Maps < Named Graphs
  – Uses “lineage” to represent explicit provenance
  – Future: explicit provenance model [7]?
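How “lineage” expresses provenance can be sketched with plain triples. The example assumes the ORE 1.0 vocabulary terms `ore:describes`, `ore:aggregates`, `ore:proxyFor`, `ore:proxyIn` and `ore:lineage`, and reads `ore:lineage` as linking a proxy to the proxy of the same resource in the aggregation it was taken from. All example.org URIs and the helper names are hypothetical, and plain tuples stand in for a real RDF library.

```python
ORE = "http://www.openarchives.org/ore/terms/"


def ore(term):
    """Expand an ORE vocabulary term to its full URI."""
    return ORE + term


# Hypothetical URIs: one dataset aggregated first in A, then re-used in B.
DATASET = "http://repo.example.org/data/42"
REM_A, AGG_A = "http://a.example.org/rem", "http://a.example.org/aggregation"
REM_B, AGG_B = "http://b.example.org/rem", "http://b.example.org/aggregation"
PROXY_A = "http://a.example.org/proxy/42"
PROXY_B = "http://b.example.org/proxy/42"

triples = {
    (REM_A, ore("describes"), AGG_A),
    (AGG_A, ore("aggregates"), DATASET),
    (PROXY_A, ore("proxyFor"), DATASET),
    (PROXY_A, ore("proxyIn"), AGG_A),
    (REM_B, ore("describes"), AGG_B),
    (AGG_B, ore("aggregates"), DATASET),
    (PROXY_B, ore("proxyFor"), DATASET),
    (PROXY_B, ore("proxyIn"), AGG_B),
    # Provenance statement: B's view of the dataset derives from A's.
    (PROXY_B, ore("lineage"), PROXY_A),
}


def objects(graph, subject, predicate):
    """All objects of matching triples, sorted for stable output."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def lineage_source(graph, proxy):
    """Return the aggregation the proxied resource was taken from, or None."""
    for earlier_proxy in objects(graph, proxy, ore("lineage")):
        for aggregation in objects(graph, earlier_proxy, ore("proxyIn")):
            return aggregation
    return None


print(lineage_source(triples, PROXY_B))
print(lineage_source(triples, PROXY_A))
```

The statement is attached to the proxy rather than to the dataset itself, so the same resource can carry a different lineage in each aggregation that re-uses it.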

Page 20: Provenance of  scientific information as experienced in DRIVER

Summary

• Provenance essential for …
  – Indicating origin in distributed data spaces
    • Accessing / Addressing
    • Federation / Aggregation
    • Processing / Annotation
  – Document and data citation / trace-back
  – 1st order: describing data > metadata
  – 2nd order: describing context > semantic data

Page 21: Provenance of  scientific information as experienced in DRIVER

Lessons learnt in DRIVER

• Use web-enabled identification (URI/UDDI etc.)
  – “Dark” databases don't interoperate
• 1st order provenance at place of origin
  – Requires metadata to describe origin
  – Enables a metadata infrastructure
  – Introduces linkage problem
• 2nd order provenance in contexts
  – Requires data provider identification in federators / aggregators in order to link back
  – May require semantic model for context
  – Would benefit from a semantic infrastructure

Page 22: Provenance of  scientific information as experienced in DRIVER

Resources

[1] On provenance in the eScience / grid environment
  – http://www.sigmod.org/sigmod/record/issues/0509/p31-special-sw-section-5.pdf
  – In gLite
    • http://www.cesnet.cz/doc/techzpravy/2007/glite-job-provenance/
    • http://twiki.ipaw.info/bin/view/Challenge

[2] On provenance in OAI-PMH
  – http://www.openarchives.org/OAI/2.0/guidelines-provenance.htm

[3] On provenance in OAI-ORE (referred to as ore:lineage)
  – http://www.openarchives.org/ore/meetings/Soton/ore_beyond_basics.pdf (general)
  – http://www.openarchives.org/ore/1.0/vocabulary (definition)

[4] Named Graphs, Provenance and Trust (Carroll et al.)
  – http://www4.wiwiss.fu-berlin.de/bizer/SWTSGuide/carroll-ISWC2004.pdf

[5] W3C: On provenance in RDF
  – http://www.w3.org/2001/12/attributions/

[6] Open Provenance Model
  – http://eprints.ecs.soton.ac.uk/14979/1/opm.pdf

[7] DRIVER: Digital Repository Infrastructure for European Research
  – http://www.driver-community.eu

