Date post: | 14-Jul-2015 |
Upload: | mhaendel |
World Wide Web Consortium (W3C)
The W3C is the main international standards organization for the World Wide Web
Over 400 member organizations work together in the W3C to develop standards for the World Wide Web
W3C has sophisticated development and community validation procedures for standards development
The Semantic Web is the new global web of knowledge
It involves standards for publishing, sharing, and querying facts, expert knowledge and services
It is a scalable approach to the discovery of independently formulated and distributed knowledge
Cyganiak and Jentzsch. http://lod-cloud.net/
Resource Description Framework
Language to represent knowledge
– Logic-based formalism -> automated reasoning
– Graph-like properties -> data analysis
Good for:
– Describing in terms of type, attributes, relations
– Integrating data from different sources
– Sharing the data (W3C standard)
– Reusing what is available, developing what you need, and contributing back to the web of data
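As a minimal sketch of the points above, the following Python models RDF statements as (subject, predicate, object) tuples using only the standard library. The `ex:` identifiers, the two source graphs, and the helper functions are hypothetical; `rdf:type`, `dct:title`, `dct:creator`, `dctypes:Dataset` and `foaf:Person` are existing vocabulary terms. A real project would normally use an RDF library rather than bare tuples.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# All ex: names are hypothetical placeholders.
source_a = {
    ("ex:dataset1", "rdf:type", "dctypes:Dataset"),
    ("ex:dataset1", "dct:title", "Example gene list"),
}
source_b = {
    ("ex:dataset1", "dct:creator", "ex:alice"),
    ("ex:alice", "rdf:type", "foaf:Person"),
}

# Integrating data from different sources is a set union of triples,
# which works here because both sources use the same dataset identifier.
graph = source_a | source_b


def objects(graph, subject, predicate):
    """Return all objects matching a subject and predicate, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def describe(graph, subject):
    """Return the (predicate, object) pairs attached to one subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)
```

Calling `describe(graph, "ex:dataset1")` returns the type, title, and creator together, even though they came from different sources.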
Challenge: Working with Web Data
Datasets often have inadequate descriptions, so we don’t know what they are about or how they were constructed
Datasets change over time, but often don’t come with versioning information
May have been constructed using other data, but it’s not clear which version of data was used or whether these were modified
Data may be available in a variety of formats
There may be multiple copies of data from different providers, but it’s unclear if they are exact copies or derivatives
Version of standard or vocabulary used not indicated
Data registries are not synchronized and can contain conflicting information
Key Use Cases for HCLS Dataset Description
1. Dataset Identification, Description, Licensing and Provenance
2. Dataset Discovery (via Catalog)
3. Exchange of Dataset Descriptions
4. Dataset Linking
5. Content Summary
6. Monitoring of Dataset Changes
Objectives
Develop a guidance note for reusing existing vocabularies to describe datasets with RDF
– Mandatory, recommended, optional descriptors
– Identifiers
– Versioning
– Attribution
– Provenance
– Content summarization
Recommend vocabulary-linked attributes and value sets
Provide reference editor and validation
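A validator of the kind mentioned in the objectives could be sketched as follows. The requirement level assigned to each property here is an illustrative assumption, not the level defined in the HCLS note, and `check_description` is a hypothetical helper.

```python
# Hypothetical requirement levels, for illustration only.
REQUIREMENTS = {
    "dct:title": "mandatory",
    "dct:description": "mandatory",
    "dct:license": "recommended",
    "dcat:keyword": "optional",
}


def check_description(description):
    """Report which mandatory and recommended properties are missing.

    `description` maps property names to values for one dataset.
    Missing optional properties are not reported.
    """
    report = {"mandatory": [], "recommended": []}
    for prop, level in REQUIREMENTS.items():
        if level in report and not description.get(prop):
            report[level].append(prop)
    # A description is valid when no mandatory property is missing.
    report["valid"] = not report["mandatory"]
    return report


example = {"dct:title": "Example gene list", "dcat:keyword": "genes"}
result = check_description(example)
```

Separating the three levels lets a tool reject a description only when mandatory descriptors are absent, while still nudging authors toward the recommended ones.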
We compiled a list of metadata fields used across the community
and then surveyed over 20 vocabularies to see if they provided relevant metadata elements or value sets…
…to produce a big spreadsheet that maps metadata needs with existing vocabularies
Dublin Core Metadata Initiative
Widely used
Broadly applicable
– Documents
– Datasets
✗ Generic terms
✗ Not comprehensive
✗ No required properties
“Date: A point or period of time associated with an event in the lifecycle of the resource.”
Description
– Identifiers
– Title
– Description
– Homepage
– License
– Language
– Keywords
– Concepts and vocabularies used
– Standards
– Publication
Attribution
Simple Model
– Individuals are related to roles using specific properties, e.g. dct:creator, pav:createdBy, pav:curatedBy
Expandable Model
– Individuals are related to roles and dates via an associated object
– PROV, VIVO-ISF
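The difference between the two attribution models can be illustrated with triples held in plain Python lists. This is a sketch under assumptions: `prov:qualifiedAttribution`, `prov:Attribution`, `prov:agent` and `prov:hadRole` are PROV-O terms, while `ex:date`, the `ex:` resources, and the `attributions` helper are hypothetical placeholders.

```python
# Simple model: a specific property links the dataset directly to the agent.
simple = [
    ("ex:dataset1", "dct:creator", "ex:alice"),
    ("ex:dataset1", "pav:curatedBy", "ex:bob"),
]

# Expandable model: an intermediate node carries the agent, role, and date.
# ex:date and ex:curator are hypothetical, used only for illustration.
expandable = [
    ("ex:dataset1", "prov:qualifiedAttribution", "ex:attr1"),
    ("ex:attr1", "rdf:type", "prov:Attribution"),
    ("ex:attr1", "prov:agent", "ex:bob"),
    ("ex:attr1", "prov:hadRole", "ex:curator"),
    ("ex:attr1", "ex:date", "2015-01-01"),
]


def attributions(triples, dataset):
    """Collect (agent, role, date) for each qualified attribution of a dataset."""
    found = []
    for s, p, node in triples:
        if s == dataset and p == "prov:qualifiedAttribution":
            # Gather every property attached to the attribution node.
            props = {pp: o for ss, pp, o in triples if ss == node}
            found.append(
                (props.get("prov:agent"), props.get("prov:hadRole"), props.get("ex:date"))
            )
    return found
```

The simple model needs one triple per contributor but cannot say when the contribution happened; the expandable model costs several triples and in return can attach any number of details to the same attribution.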
Provenance and Change
Version number
Source
Provenance: retrieved from, derived from, created with
Frequency of change
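One plausible way to record these fields, and to monitor change between versions, is sketched below. The mapping of the slide's fields onto `pav:version`, `pav:retrievedFrom`, `pav:derivedFrom`, `pav:createdWith` and `dct:accrualPeriodicity` is an assumption, and all `ex:` values and the `changed_properties` helper are hypothetical.

```python
# Hypothetical descriptions of two versions of one dataset.
v1 = {
    "pav:version": "1.0",
    "pav:retrievedFrom": "ex:provider/dump-a.tsv",
    "dct:accrualPeriodicity": "freq:monthly",
}
v2 = {
    "pav:version": "1.1",
    "pav:retrievedFrom": "ex:provider/dump-b.tsv",
    "pav:derivedFrom": "ex:dataset1/v1.0",
    "pav:createdWith": "ex:conversion-script",
    "dct:accrualPeriodicity": "freq:monthly",
}


def changed_properties(old, new):
    """Return the sorted property names whose values differ between versions."""
    keys = set(old) | set(new)
    return sorted(k for k in keys if old.get(k) != new.get(k))


diff = changed_properties(v1, v2)
```

Here `diff` lists every property that was added, removed, or given a new value, while the unchanged update frequency is left out.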
HCLS: http://www.w3.org/blog/hcls/
Mailing list: http://lists.w3.org/Archives/Public/public-semweb-lifesci/
Editors’ Draft: http://tiny.cc/hcls-datadesc-ed
W3C Interest Group Note: http://tiny.cc/hcls-datadesc
Special thanks to Alasdair Gray, Scott Marshall, Joachim Baran
Thanks to all other contributors to the HCLS note