EUDAT-B2FIND: A FAIR-friendly and Interdisciplinary Data Catalogue

Heinrich Widmann, DKRZ

BlueBRIDGE Workshop 2017

03.04.2017

EUDAT and the B2 services | Guidelines and Concepts | FAIR Approach of EUDAT-B2FIND | Outlook and Summary

Outline

1 EUDAT and the B2 services

2 Guidelines and Concepts

3 FAIR Approach of EUDAT-B2FIND

4 Outlook and Summary

EUDAT and the B2 services

Motivation and Objective | B2 Service Suite | Collaborative Data Infrastructure

EUDAT - Motivation and Objective

The European Data Infrastructure project (EUDAT) is funded by the EU Horizon 2020 programme; it started in 2011, is now in its second phase (EUDAT2020) and will end in 2018
From 2018 onwards: agreement of the EUDAT-EGI-Indigo consortium on the EOSC-Hub proposal

Motivation: manage the rising tide of research data
Challenge: help communities handle Big Data management across a wide cross-disciplinary scope
Objective: build up a Collaborative Data Infrastructure (CDI),
  based on common and generic data services
  driven by the requirements of the research communities

EUDAT B2 Service Suite

For details see http://www.eudat.eu/services

EUDAT Collaborative Data Infrastructure (CDI)

For details see http://www.eudat.eu/eudat-cdi

Guidelines and Concepts

The FAIR principles → EUDAT-B2FIND

The FAIR principles → as implemented by EUDAT-B2FIND

Findability → Discovery portal with powerful search features
Accessibility → Persistent identifiers for unique resolvability of data objects
Interoperability → Interdisciplinary catalogue based on common standards
Reusability → Interoperable format used for data access by EUDAT's storage B2 services

FAIR Approach of EUDAT-B2FIND

Findability → Discovery Portal | Accessibility → Persistent Identifiers | Interoperability → Mapping onto Common Schema | MD Ingestion and Common Schema | Disciplines, Communities and MD Catalogue

Facetted Search

B2FIND provides search for
  Free text
  Geospatial coverage
  Temporal coverage
  Publication year
  Textual facets
    Tags
    Creator
    Discipline etc.
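
The search criteria listed above can be combined with a logical AND. The following is a minimal sketch of that idea on an in-memory list; it is not B2FIND's implementation. The `Record` class, its attribute names and the sample records are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """Hypothetical, simplified metadata record (not the B2FIND data model)."""
    title: str
    description: str = ""
    publication_year: Optional[int] = None
    temporal: Optional[tuple] = None  # (start_year, end_year)
    bbox: Optional[tuple] = None      # (min_lon, min_lat, max_lon, max_lat)
    facets: dict = field(default_factory=dict)  # e.g. {"Discipline": {"History"}}


def overlaps(a, b):
    """True if two closed 1-D intervals (lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(records, text=None, year=None, period=None, bbox=None, facets=None):
    """Return the records that match every criterion that was given."""
    hits = []
    for r in records:
        # Free text: case-insensitive substring match on title and description.
        if text and text.lower() not in (r.title + " " + r.description).lower():
            continue
        if year is not None and r.publication_year != year:
            continue
        # Temporal coverage: the record's interval must overlap the query interval.
        if period and not (r.temporal and overlaps(r.temporal, period)):
            continue
        # Spatial coverage: bounding boxes must overlap in longitude and latitude.
        if bbox and not (
            r.bbox
            and overlaps((r.bbox[0], r.bbox[2]), (bbox[0], bbox[2]))
            and overlaps((r.bbox[1], r.bbox[3]), (bbox[1], bbox[3]))
        ):
            continue
        # Textual facets: every requested facet value must be present.
        if facets and not all(v in r.facets.get(k, set()) for k, v in facets.items()):
            continue
        hits.append(r)
    return hits


# Made-up sample records.
records = [
    Record("Baltic Sea temperature", "CTD casts", 2015, (2010, 2014),
           (9.0, 53.0, 30.0, 66.0), {"Discipline": {"Oceanography"}, "Tags": {"sea"}}),
    Record("Medieval charters", "Scanned manuscripts", 2012, (1100, 1300),
           None, {"Discipline": {"History"}}),
]

hits = search(records, text="sea", period=(2012, 2016),
              facets={"Discipline": "Oceanography"})
print([r.title for r in hits])
```

A production catalogue would delegate these filters to a search index rather than scan a list; the sketch only shows how the criteria narrow the result set together.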

Types of Identifiers

Resolvability of Data Objects

Resolvability of Data Collection

Distribution of Data Access Identifiers

Pie chart values: DOI & PID 1, DOI 28, PID 27, URL 45
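
A distribution like the one charted here could be tallied as sketched below. This is an illustration under assumptions, not the procedure used for the slide: records are plain dictionaries carrying the identifier fields `DOI`, `PID` and `Source` named in the B2FIND schema, the category `none` is an addition for records without any identifier, and all sample identifiers are made up.

```python
from collections import Counter


def access_category(record):
    """Classify a record by the identifier fields it carries."""
    has_doi = bool(record.get("DOI"))
    has_pid = bool(record.get("PID"))
    if has_doi and has_pid:
        return "DOI & PID"
    if has_doi:
        return "DOI"
    if has_pid:
        return "PID"
    if record.get("Source"):
        return "URL"
    return "none"  # assumption: not a category on the slide


def distribution(records):
    """Percentage share of each category, rounded to whole numbers."""
    counts = Counter(access_category(r) for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: round(100 * n / total) for category, n in counts.items()}


# Made-up sample records with hypothetical identifiers.
sample = [
    {"Title": "a", "DOI": "10.1234/abc", "PID": "11111/xyz"},
    {"Title": "b", "DOI": "10.1234/def"},
    {"Title": "c", "PID": "11111/uvw"},
    {"Title": "d", "Source": "http://example.org/data/1"},
]
print(distribution(sample))
```

Because each share is rounded independently, the percentages need not add up to exactly 100.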

Levels of Interoperability -1-

Levels of Interoperability -2-

Levels of Interoperability -3-

Levels of Interoperability -4-

B2FIND Ingestion Workflow

Preconditions to join B2FIND
  MD provider service
  Specification of MD (format, schema)
  Only two mandatory fields (title and one identifier)

B2FIND MD Schema (extract)

MD Type         | Field name       | Semantic definition                                      | Allowed values                                 | Obligation    | Occurrence
General Info    | Title            | A name or title by which a resource is known             | Free text (Unicode)                            | Mandatory     | 1
                | Description      | Additional information about the content of the resource | Free text (Unicode)                            | Recommended   | 0-1
Data Access     | Source           | URL that uniquely identifies a resource                  | Should be a resolvable URL                     | Mandatory [1] | 0-1
                | PID              | Persistent IDentifier (Handle in a Handle server)        | + persistent and resolvable via Handle server  |               | 0-1
                | DOI              | Digital Object Identifier (registered at DataCite)       | + citable and resolvable via DOI agencies      |               | 0-1
Provenance Data | Creator          | Main researchers involved in data production             | List of persons                                | Recommended   | 0-1
                | Discipline       | Field of research                                        | Controlled vocabulary, see b2find_disciplines  | Recommended   | 0-n
                | PublicationYear  | The year data are published                              | YYYY                                           | Optional      | 0-1
Coverage Data   | TemporalCoverage | The temporal limits                                      | Interval of UTC date-times                     | Optional      | 0-1
                | SpatialCoverage  | Spatial extent                                           | Spatial coordinate box or point                | Optional      | 0-1
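
A record could be checked against this extract roughly as follows. The sketch assumes that records are plain dictionaries keyed by the field names of the table and that the "Mandatory [1]" obligation means at least one of Source, PID or DOI must be present, in line with the precondition "title and one identifier" from the ingestion workflow. The function name `validate` and the optional `disciplines` parameter are hypothetical.

```python
import re

IDENTIFIER_FIELDS = ("Source", "PID", "DOI")


def validate(record, disciplines=None):
    """Return a list of problems; an empty list means the record passes."""
    problems = []

    # Title: mandatory, non-empty free text.
    title = record.get("Title")
    if not isinstance(title, str) or not title.strip():
        problems.append("Title is mandatory")

    # Assumption: at least one of the three data access identifiers is required.
    if not any(record.get(name) for name in IDENTIFIER_FIELDS):
        problems.append("one of Source, PID or DOI is mandatory")

    # PublicationYear: optional, but must be four digits if present.
    year = record.get("PublicationYear")
    if year is not None and not re.fullmatch(r"[0-9]{4}", str(year)):
        problems.append("PublicationYear must have the form YYYY")

    # Discipline: optional, 0-n values from a controlled vocabulary (if supplied).
    if disciplines is not None:
        for value in record.get("Discipline", []):
            if value not in disciplines:
                problems.append(f"unknown discipline: {value}")

    return problems


# Made-up sample records with hypothetical identifiers.
good = {"Title": "Sea level data", "DOI": "10.1234/abc", "PublicationYear": "2016"}
bad = {"Title": " ", "PublicationYear": "16", "Discipline": ["Alchemy"]}

print(validate(good))
print(validate(bad, disciplines={"History", "Oceanography"}))
```

Returning a list of problems instead of raising on the first one lets a data provider see every issue with a record in a single pass.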

The facet Discipline and its Controlled Vocabulary

Taken from the list of academic disciplines at http://en.wikipedia.org/wiki/List_of_academic_disciplines_and_sub-disciplines

Coverage of Disciplines in B2FIND

Pie chart values: Social Sciences 10, Natural Sciences 8, Humanities 51, Professions 2, Not stated 29

B2FIND Metadata Catalogue - Ingestion Status

Outlook and Summary

Challenges | Lessons learned | Next steps | Conclusions | Links and Contact

Challenges

EUDAT has to master the balancing act between providing generic, discipline-agnostic services and meeting research-specific needs
  e.g. requirements of Blue Growth communities
Integrate B2 services in/as BlueBRIDGE VRE?
Handle scalability and granularity issues
Assure the quality of metadata
Improve the usability of the (graphical) user interface
Use the potential of the Semantic Web (LOD)

Lessons learned

Less is sometimes more: a catalogue with a manageable amount of high-quality metadata instead of a mess of millions of entries
Talk more to community representatives and researchers (ideally already while the metadata are being generated)
Low(er) barrier for communities to get contact, documentation and support from EUDAT

Next steps

Provide guidelines and recommendations for data providers
'Annotation' functionality (B2NOTE): users link datasets to external reference materials (vocabularies, ontologies, etc.)
Hierarchical search, query-based taxonomies: enabling hierarchical search, e.g. in trees of disciplines
Extend and adapt validation and consistency checks, e.g.
  check of resolvability of URLs (resource identifiers)
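
The resolvability check named in the last item could look roughly like the sketch below. It uses only the Python standard library and treats a URL as resolvable when a HEAD request ends in a status below 400; this criterion, the function names and the injectable `opener` parameter are assumptions made for illustration, not the check B2FIND runs.

```python
import urllib.error
import urllib.request


def is_resolvable(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request to `url` ends in an HTTP status below 400."""
    try:
        # Request() raises ValueError for strings that are not URLs.
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # Covers HTTP errors (4xx/5xx), DNS failures, timeouts and malformed URLs.
        return False


def unresolvable_sources(records, opener=urllib.request.urlopen):
    """Return the Source URLs of the given records that fail the check."""
    return [
        r["Source"]
        for r in records
        if r.get("Source") and not is_resolvable(r["Source"], opener=opener)
    ]
```

Calling `unresolvable_sources(records)` on a list of record dictionaries returns the URLs that need attention. Some servers reject HEAD requests even though the resource exists, so a fallback to GET may be needed before a URL is reported as broken. Passing a replacement `opener` makes the functions testable without network access.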

Conclusions

EUDAT-B2FIND
  has established an operational service based on agreed standards and guidelines such as the FAIR principles,
  provides a discovery portal with powerful search functionalities, and
  is based on a unique catalogue of research data, combining many heterogeneous and cross-discipline sources
Improved interoperability is achieved by homogenisation to a common metadata schema
Further efforts are being made to address the demands of the communities and data projects, and to adapt the system for future challenges

Links
  Info about EUDAT: http://eudat.eu
  B2FIND portal: http://b2find.eudat.eu

Contact
  Support form: www.eudat.eu/support-request
  Email: widmann@dkrz.de

Thank you for your attention !
