IBC FAIR Data Prototype Implementation slideshow

Date post: 24-Jan-2017
Upload: mark-wilkinson
Page 1: IBC FAIR Data Prototype Implementation   slideshow

Mark D. Wilkinson, CBGP-UPM/INIA, Madrid

A novel, API-free approach to interoperability leads to FAIRness for legacy and prospective data.

IBC Scientific Days, January 17-18, 2017

Page 2: IBC FAIR Data Prototype Implementation   slideshow

The Problem

...one recent survey of 18 microarray studies found that only two were fully reproducible using the archived data. Another study of 19 papers in population genetics found that 30% of analyses could not be reproduced from the archived data and that 35% of datasets were incorrectly or insufficiently described.

Dominique G. Roche, Loeske E. B. Kruuk, Robert Lanfear, Sandra A. Binning (2015)

http://dx.doi.org/10.1371/journal.pbio.1002295

Page 3: IBC FAIR Data Prototype Implementation   slideshow

The Problem

We surveyed 100 datasets associated with nonmolecular studies in journals that commonly publish ecological and evolutionary research and have a strong PDA policy. Out of these datasets, 56% were incomplete, and 64% were archived in a way that partially or entirely prevented reuse.

Dominique G. Roche, Loeske E. B. Kruuk, Robert Lanfear, Sandra A. Binning (2015)

http://dx.doi.org/10.1371/journal.pbio.1002295

Page 4: IBC FAIR Data Prototype Implementation   slideshow

The Problem

Is that data, therefore...

Useless?

Page 5: IBC FAIR Data Prototype Implementation   slideshow

The Problem

NO!!

Page 6: IBC FAIR Data Prototype Implementation   slideshow

The Problem

It’s Reuseless!

h.t. to Barend Mons

Page 7: IBC FAIR Data Prototype Implementation   slideshow

FAIR

The Four Principles

Findable
→ Globally unique, resolvable, and persistent identifiers
→ Machine-actionable contextual information supporting discovery

Accessible
→ Clearly-defined access protocol
→ Clearly-defined rules for authorization/authentication

Interoperable
→ Use shared vocabularies and/or ontologies
→ Syntactically and semantically machine-accessible format

Reusable
→ Be compliant with the F, A, and I Principles
→ Contextual information, allowing proper interpretation
→ Rich provenance information facilitating accurate citation

Page 8: IBC FAIR Data Prototype Implementation   slideshow

“Skunkworks”

Task: Build a prototype

Page 9: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks Participants

Mark Wilkinson
Michel Dumontier
Barend Mons
Tim Clark
Jun Zhao
Paolo Ciccarese
Paul Groth
Erik van Mulligen
Luiz Olavo Bonino da Silva Santos
Matthew Gamble
Carole Goble
Joël Kuiper
Morris Swertz
Erik Schultes
Mercè Crosas
Adrian Garcia
Philip Durbin
Jeffrey Grethe
Katy Wolstencroft
Sudeshna Das
M. Emily Merrill

Page 10: IBC FAIR Data Prototype Implementation   slideshow

The Hourglass Concept

We want a large ecosystem of apps that use FAIR Data

Page 11: IBC FAIR Data Prototype Implementation   slideshow

The Hourglass Concept

We want to support a wide range of source providers

Page 12: IBC FAIR Data Prototype Implementation   slideshow

The Hourglass Concept

The FAIR solution between them must be THIN!

Page 13: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks participants had tons of experience vis-à-vis metadata around scholarly publication

Page 14: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks participants had tons of experience vis-à-vis metadata around scholarly publication:

RDA, Force11, Dataverse, Research Objects, NanoPubs, Semantic Science, SADI, AlzForum, SWAN, LSID, ...

Page 15: IBC FAIR Data Prototype Implementation   slideshow

There was very little disagreement about F, about A, or about R

Page 16: IBC FAIR Data Prototype Implementation   slideshow

The “I” is the big problem

Page 17: IBC FAIR Data Prototype Implementation   slideshow

Interoperability is Hard!!

The “I” is the big problem

Page 18: IBC FAIR Data Prototype Implementation   slideshow

Keeping the history brief

A series of teleconferences led to the concept of putting metadata into an iterative set of ~identical “containers”

Page 19: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks Hackathons

The “containers of containers of containers” idea was elaborated by the belief that we should also reject any solution that required a new API

ProgrammableWeb.com already catalogues >16,000 different Web APIs

Page 20: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks Hackathons

The “containers of containers of containers” idea was elaborated by the belief that we should also reject any solution that required a new API

ProgrammableWeb.com already catalogues >16,000 different Web APIs

APIs DO NOT MAKE YOU INTEROPERABLE!

Page 21: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks Hackathons

The “containers of containers of containers” idea was elaborated by the belief that we should also reject any solution that required a new API

Page 22: IBC FAIR Data Prototype Implementation   slideshow

Skunkworks Hackathons

Are there existing standards that are

And have the properties of

?

Page 23: IBC FAIR Data Prototype Implementation   slideshow
Page 24: IBC FAIR Data Prototype Implementation   slideshow

LDP Useful Features

Uses machine-accessible standards and representations, following a REST paradigm

Defines the concept of a “Container” - a machine-actionable way to represent repositories, data deposits, data files, data points, and their metadata

Defines HTTP-resolvable URIs for each of these containers

Uses a widely accepted standard (DCAT) to relate metadata to data → machine-actionable data mining

FAIR facets marked on the slide: I, I + R, F + A, I

Page 25: IBC FAIR Data Prototype Implementation   slideshow

LDP Useful Features

Inspiration, Not Adoption

Page 26: IBC FAIR Data Prototype Implementation   slideshow

The FAIR Accessor

In incremental detail

Page 27: IBC FAIR Data Prototype Implementation   slideshow

What can we describe with FAIR Accessors?

FAIR Accessors provide a machine-actionable, structured, REST-oriented way to publish Metadata about a wide range of scholarly “entities”

Page 28: IBC FAIR Data Prototype Implementation   slideshow

What can we describe with FAIR Accessors?

Warehouses (e.g. EBI)

Databases (e.g. UniProt)

Repositories (e.g. Zenodo, INRA-URGI Wheat Repo, UniProt)

Datasets (e.g. output from a workflow)

Research Objects (data and/or workflow and/or results and/or publications)

Data “slices” (e.g. the result of a database query)

Data Records (e.g. image, excel file, patient clinical record)

Other…

Page 29: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)
<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml
dcat:Distribution_2: Source URL_U2, format application/xml

Page 30: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

(a “resource” is a URI / URL)

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)
<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml
dcat:Distribution_2: Source URL_U2, format application/xml

Page 32: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

Page 33: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

There is a URI for the “Container” (of any of the kinds listed earlier)

Page 34: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Resources are manipulated using the HTTP protocol on the Resource URI

For the FAIR Accessor, the only HTTP method we currently require is HTTP GET

Page 35: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

What is returned is a document full of metadata richly describing that Container (warehouse, database, dataset, slice, etc.)

And a list of Resources (URIs) that represent the contained “things”

Page 36: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Looking more closely at one of those contained things...

Page 37: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

MetaRecord Resource3 (HTTP GET)
<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml
dcat:Distribution_2: Source URL_U2, format application/xml

The contained thing is a Resource

Page 38: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

That Resource can be resolved by HTTP GET

Page 39: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

To retrieve a Metadata document describing that resource (e.g. a single record)

Page 40: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Which record does this Metadata describe?

The foaf:primaryTopic attribute defines this

Page 41: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Using the metadata structures defined by DCAT, the FAIR Accessor may also tell you how to get the content of the record, and what formats are available

Page 42: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

In this case, the record is available in XML format by calling HTTP GET on URL_U2

Page 43: IBC FAIR Data Prototype Implementation   slideshow

What does a FAIR Accessor “look like”?

Or in RDF format by calling HTTP GET on URL_U1
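
The two-step walk described on these slides (GET the Container, then GET each MetaRecord and read its distributions) can be sketched roughly as follows. This is only an illustration: the `FAKE_WEB` dictionary, the `example.org` URIs, and the dictionary keys are hypothetical stand-ins for real HTTP responses and parsed RDF, not the Accessor's actual document format.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory stand-in for the Web: URI -> already-parsed metadata.
# A real client would issue HTTP GET and parse the returned RDF instead.
FAKE_WEB: Dict[str, dict] = {
    "http://example.org/accessor": {
        "contains": [
            "http://example.org/accessor/rec1",
            "http://example.org/accessor/rec2",
        ],
    },
    "http://example.org/accessor/rec1": {
        "primaryTopic": "http://example.org/data/rec1",
        "distributions": [
            {"source": "http://example.org/data/rec1.rdf", "format": "application/rdf+xml"},
            {"source": "http://example.org/data/rec1.xml", "format": "application/xml"},
        ],
    },
    "http://example.org/accessor/rec2": {
        "primaryTopic": "http://example.org/data/rec2",
        "distributions": [],  # no distribution advertised for this record
    },
}


def fetch(uri: str) -> dict:
    """Stand-in for HTTP GET on a Resource URI."""
    return FAKE_WEB[uri]


def find_downloads(container_uri: str, wanted_format: str,
                   get: Callable[[str], dict] = fetch) -> List[str]:
    """Walk Container -> MetaRecords -> distributions and collect source URLs."""
    container = get(container_uri)           # first GET: the Container metadata
    urls: List[str] = []
    for meta_uri in container["contains"]:   # select the contained Resources
        record = get(meta_uri)               # second GET: one MetaRecord
        for dist in record["distributions"]:
            if dist["format"] == wanted_format:
                urls.append(dist["source"])
    return urls


print(find_downloads("http://example.org/accessor", "application/xml"))
```

The point of the sketch is that the client needs no service-specific calls: it only dereferences URIs and interprets the metadata it gets back.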

Page 45: IBC FAIR Data Prototype Implementation   slideshow

Or you may add additional layers...

Metadata
Metadata
Metadata
Metadata
DATA - format 1, DATA - format 2

Page 46: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

1: There is no API

GET
Interpret the Metadata
Select the desired Resource
GET

Page 47: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

1: There is no API

GET
Interpret the Metadata
Select the desired Resource
GET

ANY Web agent can explore/index a FAIR Accessor (e.g. Google)

An agent that understands globally-accepted vocabularies can explore it “intelligently”
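
As a rough illustration of "there is no API", the only client-side machinery needed is a plain HTTP GET, optionally with an `Accept` header to ask for a machine-readable representation. The sketch below only builds the request with the standard library and does not send it. The URL is the example Accessor quoted later in this deck; whether that server honours `text/turtle` is an assumption, not something the slides state.

```python
import urllib.request


def build_get(resource_uri: str, media_type: str = "text/turtle") -> urllib.request.Request:
    """Prepare (but do not send) an HTTP GET for a Resource URI.

    media_type is an assumed preference; a server may ignore it and
    return whatever representation it has.
    """
    return urllib.request.Request(
        resource_uri,
        headers={"Accept": media_type},
        method="GET",
    )


req = build_get("http://linkeddata.systems/Accessors/UniProtAccessor")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual GET; everything after that is interpretation of the returned metadata.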

Page 48: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

1: There is no API

It’s difficult to get thinner than nothing...

Page 49: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

2: Identifiers for unidentified/unidentifiable things

HTTP GET
<FAIR metadata/>
This is the ArrayExpress query I did for paper doi:10/1234.56
Results: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

Page 50: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

2: Identifiers for unidentified/unidentifiable things

HTTP GET
<FAIR metadata/>
This is the ArrayExpress query I did for paper doi:10/1234.56
Results: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

Should assist with reproducibility and transparency

Page 51: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

3: A predictable “place” for metadata

Repository metadata
MetaRecord URL
Record Metadata... (PrimaryTopic: record 1A445)
DATA - format 1, DATA - format 2

Different “kinds” of metadata have distinct ontological types, and distinct document structures. There is no ambiguity regarding what the metadata is describing - a repository or a record.

Page 52: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

3: Symmetry & predictable path to citation

Repository metadata (dataset XXX): Cite: doi:10/8847.384; License: cc-by
Record Metadata...: Part of dataset XXX
DATA - format 1, DATA - format 2

The record metadata contains an “upward” link to the Repository-level metadata, which should contain license and citation information
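
A minimal sketch of the "upward" link idea, assuming each metadata document has already been fetched and parsed into a dictionary. The `part_of`, `license`, and `cite` keys and the `example.org` URIs are hypothetical names chosen for this illustration; only the license and citation values come from the slide.

```python
from typing import Dict, Optional

# Hypothetical parsed metadata documents, keyed by URI.
DOCS: Dict[str, dict] = {
    "http://example.org/accessor": {
        "license": "cc-by",
        "cite": "doi:10/8847.384",
        "part_of": None,                          # top level: nothing above it
    },
    "http://example.org/accessor/rec1": {
        "part_of": "http://example.org/accessor",  # the "upward" link
    },
}


def resolve_up(uri: str, key: str, docs: Dict[str, dict] = DOCS,
               max_hops: int = 10) -> Optional[str]:
    """Climb 'part_of' links until some document carries the requested key."""
    for _ in range(max_hops):
        doc = docs[uri]
        if doc.get(key) is not None:
            return doc[key]
        parent = doc.get("part_of")
        if parent is None:
            return None
        uri = parent
    return None


print(resolve_up("http://example.org/accessor/rec1", "license"))
print(resolve_up("http://example.org/accessor/rec1", "cite"))
```

Because every record points back to its container, an agent holding only a record URI still has a predictable route to the license and citation information.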

Page 53: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

4: Granularity of Access/Privacy/Security

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: <<184 Records>>
Contact Mark Wilkinson for more information about these records

Page 55: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

4: Granularity of Access/Privacy/Security

Container (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource3

MetaRecord Resource3 (HTTP GET)
<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:distribution: <<NONE>>

Page 57: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

4: Granularity of Access/Privacy/Security

Container (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)
<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml

Page 59: IBC FAIR Data Prototype Implementation   slideshow

Features of the FAIR Accessor

4: Granularity of Access/Privacy/Security

Thin solution - if it’s private, do nothing! Literally!
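
One way to picture "if it's private, do nothing" is a publisher-side helper that simply leaves the distribution entry out of a MetaRecord when the record is private. This is a sketch under assumptions: the function name and the dictionary layout are hypothetical, and a real Accessor would emit RDF rather than a Python dictionary.

```python
from typing import Dict, List


def build_metarecord(record_uri: str, distribution_urls: List[str],
                     private: bool = False) -> Dict[str, object]:
    """Describe a record; advertise download locations only if it is public."""
    doc: Dict[str, object] = {"foaf:primaryTopic": record_uri}
    if distribution_urls and not private:
        doc["dcat:distribution"] = list(distribution_urls)
    # For a private record nothing more is added: the metadata stays
    # discoverable, but no path to the data is published.
    return doc


public = build_metarecord("http://example.org/data/rec1",
                          ["http://example.org/data/rec1.rdf"])
hidden = build_metarecord("http://example.org/data/rec2",
                          ["http://example.org/data/rec2.rdf"], private=True)
print(public)
print(hidden)
```

The same choice can be made one level up, by listing only a record count in the Container instead of the individual MetaRecord URIs.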

Page 60: IBC FAIR Data Prototype Implementation   slideshow

The Real Thing

A working FAIR Accessor

Serving a “Slice” of UniProt

Page 61: IBC FAIR Data Prototype Implementation   slideshow

A real-world scenario...

You are publishing a paper describing the evolution of proteins in the RNA Processing machineries of the fungus Aspergillus nidulans.

You want to be a good scholarly publisher, interested in transparency and reproducibility

So you must describe, in detail, the inclusion/exclusion criteria for selecting proteins for your dataset

(today, this is generally done either in the text of the paper, or not at all...)

Page 62: IBC FAIR Data Prototype Implementation   slideshow

The query that returns the relevant proteins

WHERE {
  ?protein a up:Protein .
  ?protein up:organism ?organism .
  ?organism rdfs:subClassOf taxon:162425 .
  ?protein up:classifiedWith ?go .
  ?go rdfs:subClassOf* <http://purl.obolibrary.org/obo/GO_0006396> .
  bind(replace(str(?protein), "http://purl.uniprot.org/uniprot/", "", "i") as ?id)
}

Page 63: IBC FAIR Data Prototype Implementation   slideshow

The query that returns the relevant proteins

WHERE {
  ?protein a up:Protein .
  ?protein up:organism ?organism .
  ?organism rdfs:subClassOf taxon:162425 .
  ?protein up:classifiedWith ?go .
  ?go rdfs:subClassOf* <http://purl.obolibrary.org/obo/GO_0006396> .
  bind(replace(str(?protein), "http://purl.uniprot.org/uniprot/", "", "i") as ?id)
}

taxon:162425 → NCBI Taxonomy: Aspergillus nidulans

Page 64: IBC FAIR Data Prototype Implementation   slideshow

The query that returns the relevant proteins

WHERE {
  ?protein a up:Protein .
  ?protein up:organism ?organism .
  ?organism rdfs:subClassOf taxon:162425 .
  ?protein up:classifiedWith ?go .
  ?go rdfs:subClassOf* <http://purl.obolibrary.org/obo/GO_0006396> .
  bind(replace(str(?protein), "http://purl.uniprot.org/uniprot/", "", "i") as ?id)
}

GO_0006396 → Gene Ontology: RNA Processing
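
The slides show only the WHERE clause. As a hedged sketch, the snippet below wraps it into a complete query string and encodes it as a SPARQL-protocol GET URL. The PREFIX declarations, the `SELECT ?id ?protein` line, and the endpoint address are assumptions added for illustration; they do not appear on the slides. Nothing is sent over the network.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed prefix declarations (not shown on the slide).
PREFIXES = """\
PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
"""

# The WHERE clause as shown on the slide.
WHERE = """\
WHERE {
  ?protein a up:Protein .
  ?protein up:organism ?organism .
  ?organism rdfs:subClassOf taxon:162425 .
  ?protein up:classifiedWith ?go .
  ?go rdfs:subClassOf* <http://purl.obolibrary.org/obo/GO_0006396> .
  BIND(REPLACE(STR(?protein), "http://purl.uniprot.org/uniprot/", "", "i") AS ?id)
}
"""


def sparql_get_url(endpoint: str, select: str = "SELECT ?id ?protein") -> str:
    """Build a GET URL carrying the query in the standard 'query' parameter."""
    query = PREFIXES + select + "\n" + WHERE
    return endpoint + "?" + urlencode({"query": query})


# The endpoint below is an assumption for this sketch.
url = sparql_get_url("https://sparql.uniprot.org/sparql")
print(url[:80] + "...")
```

Publishing the query text (or an Accessor that records it) alongside the paper is what makes the inclusion/exclusion criteria reproducible.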

Page 65: IBC FAIR Data Prototype Implementation   slideshow

Create and publish a FAIR Accessor for that query

http://linkeddata.systems/Accessors/UniProtAccessor

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

Page 66: IBC FAIR Data Prototype Implementation   slideshow

Create and publish a FAIR Accessor for that query

http://linkeddata.systems/Accessors/UniProtAccessor

Resolve the URI (in software or in your browser)

Page 67: IBC FAIR Data Prototype Implementation   slideshow

Create and publish a FAIR Accessor for that query

Returns a page of metadata (in this example, in RDF)

Page 68: IBC FAIR Data Prototype Implementation   slideshow
Page 69: IBC FAIR Data Prototype Implementation   slideshow

Page 70: IBC FAIR Data Prototype Implementation   slideshow

Note that this Metadata is about ME! I am the creator of this dataset, and may be credited for it.

Page 71: IBC FAIR Data Prototype Implementation   slideshow
Page 72: IBC FAIR Data Prototype Implementation   slideshow
Page 73: IBC FAIR Data Prototype Implementation   slideshow
Page 74: IBC FAIR Data Prototype Implementation   slideshow
Page 75: IBC FAIR Data Prototype Implementation   slideshow

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

Page 76: IBC FAIR Data Prototype Implementation   slideshow

Step down to individual Record metadata

Container Resource (HTTP GET)
<FAIR metadata/>
Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)
<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml
dcat:Distribution_2: Source URL_U2, format application/xml

Page 77: IBC FAIR Data Prototype Implementation   slideshow

Step down to individual Record metadata

Software calls HTTP GET on the URL representing the MetaRecord Resource for the desired record in the Container

(or just click on it, or type it into your browser)

Page 78: IBC FAIR Data Prototype Implementation   slideshow

The document that is returned

<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml
dcat:Distribution_2: Source URL_U2, format application/xml

Page 79: IBC FAIR Data Prototype Implementation   slideshow
Page 80: IBC FAIR Data Prototype Implementation   slideshow

Note the change in metadata focus!

This metadata is about the UniProt Record (not about Mark Wilkinson).

The record described in this metadata was created by UniProt, so the citation and authorship information is now THEIRS, not MINE.

Page 81: IBC FAIR Data Prototype Implementation   slideshow

Container Resource

Symmetrical Linkback upward to the Accessor

Container, for additional metadata

Page 82: IBC FAIR Data Prototype Implementation   slideshow

<FAIR metadata/>
foaf:primaryTopic: Record R
dcat:Distribution_1: Source URL_U1, format rdf+xml
dcat:Distribution_2: Source URL_U2, format application/xml

Page 83: IBC FAIR Data Prototype Implementation   slideshow

Two ways to retrieve the record - RDF or HTML

(in REST-speak, two Representations of that Resource)

Page 84: IBC FAIR Data Prototype Implementation   slideshow

Note that this metadata record is somewhat more FAIR than what you can (easily) retrieve from UniProt itself!

e.g. the UniProt record does not include the citation or license information - you have to manually surf around the UniProt Web page to find that.

So the Accessor makes UniProt’s already notably FAIR data even more FAIR (with respect to “R”)

Page 85: IBC FAIR Data Prototype Implementation   slideshow

How FAIR are we now?

What does the Accessor give us?

Page 86: IBC FAIR Data Prototype Implementation   slideshow

What we have achieved

We have created a FAIR record for something - i.e. a slice of a database - that was, historically, un-recordable and un-identifiable in any formal way.

F

F + A

F + R

Accessors are a standard approach to providing human & machine accessible metadata to facilitate appropriate discovery (contextual, biological), proper usage (license) and proper citation for any kind of data.

The discovery, accessibility, and drill-down/up behaviors do not require any novel API, rather simply rely on global Web standards; this allows them to be indexed by existing Web search engines

Page 87: IBC FAIR Data Prototype Implementation   slideshow

What we have achieved

F + A

I

The metadata itself uses machine-accessible syntaxes, and widely adopted ontologies and vocabularies, and thus easily integrates with other metadata

A

Accessors provide a lightweight means to protect privacy while still providing the maximum degree of transparency possible

+

Accessors can be static or dynamic, i.e. we can provide template Accessor file(s) that are edited in Notepad, then published together with the data; or Accessors can dynamically generate their output from code (e.g. layered on a database server)
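
To make the static/dynamic distinction concrete, here is a small sketch that fills a text template to produce a Container document in Turtle. The same template could be edited by hand (static) or filled from a database query (dynamic). The choice of `ldp:BasicContainer`, `ldp:contains`, and Dublin Core terms is an assumption made for this example; the slides say LDP was an inspiration rather than something adopted, so the real Accessor vocabulary may differ.

```python
from string import Template
from typing import List

# Hypothetical template for a Container document (Turtle syntax).
CONTAINER_TEMPLATE = Template("""\
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ldp: <http://www.w3.org/ns/ldp#> .

<$container>
    a ldp:BasicContainer ;
    dcterms:title "$title" ;
    dcterms:license <$license> ;
    ldp:contains $members .
""")


def render_container(container: str, title: str, license_uri: str,
                     member_uris: List[str]) -> str:
    """Fill the template; assumes a non-empty member list and a title
    that contains no double quotes (no escaping is done here)."""
    members = ",\n        ".join(f"<{uri}>" for uri in member_uris)
    return CONTAINER_TEMPLATE.substitute(
        container=container, title=title, license=license_uri, members=members
    )


turtle = render_container(
    "http://example.org/accessor",
    "Example slice",
    "https://creativecommons.org/licenses/by/4.0/",
    ["http://example.org/accessor/rec1", "http://example.org/accessor/rec2"],
)
print(turtle)
```

A dynamic Accessor would call something like `render_container` per request with member URIs drawn from the underlying database; a static one would just publish the rendered file.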

Page 88: IBC FAIR Data Prototype Implementation   slideshow

So far, we have focused on FAIR Metadata

Page 89: IBC FAIR Data Prototype Implementation   slideshow

Are there approaches to making the DATA FAIR?

Page 90: IBC FAIR Data Prototype Implementation   slideshow

Making a Plant-related Resource FAIR

FAIR reformatting of the plant component of the

Pathogen Host Interaction Database (PHI-base)

Page 91: IBC FAIR Data Prototype Implementation   slideshow

Making a Plant-related Resource FAIR

Dr. Mikel Egaña Aranguren, Ontologist

Dr. Alejandro Rodríguez González, Database Expert

Dr. Alejandro Rodríguez Iglesias (PhD student at the time)

Rodriguez-Iglesias A., Rodriguez-González A., Irvine AG., Sesma A., Urban M., Hammond-Kosack KE., Wilkinson MD. 2016. Publishing FAIR Data: An Exemplar Methodology Utilizing PHI-Base. Frontiers in Plant Science 7.

Page 92: IBC FAIR Data Prototype Implementation   slideshow

Making a Plant-related Resource FAIR

Extract, Transform, Load

A “Brute Force” approach to FAIRness

Requires a ~comprehensive data/semantic model

Page 93: IBC FAIR Data Prototype Implementation   slideshow

Semantic Modeling of Plant Pathogen Interaction Data

The Plant Pathogen Interaction Ontology (PPIO)

Written in OWL2

Many of the Classes are defined by rich logical axioms

Designed for automated classification and enrichment of data through logical reasoning (e.g. if attached to a data stream)

Page 94: IBC FAIR Data Prototype Implementation   slideshow

General introduction

The Disease Triangle – Pathogen/Host/Environment

http://fyi.uwex.edu/fieldcroppathology/field-crops-fungicide-information/

Page 95: IBC FAIR Data Prototype Implementation   slideshow

General introduction

The Disease Triangle – Pathogen/Host/Environment

This concept has evolved over decades of domain-expert thought and discussion

Why not use this as the basis for our Semantic model of Pathogenicity?

Page 96: IBC FAIR Data Prototype Implementation   slideshow

The Disease Triangle, Modelled as “Contexts”

Interaction Context

Environmental Context

Host Context

Pathogen Context

Resulting phenotype

Page 97: IBC FAIR Data Prototype Implementation   slideshow

Interaction Context

Phenotype

Resistance phenotype

Susceptibility phenotype

Phenotypic Process

1. Abnormal growth development phenotype.

2. Color variation phenotype.

3. Tissue disintegration phenotype.

4. Vascular system damage phenotype.

Page 98: IBC FAIR Data Prototype Implementation   slideshow

Interaction Context

Phenotypic Process Branch

Page 99: IBC FAIR Data Prototype Implementation   slideshow

Plant Trait Ontology +

Interaction Context

Phenotypic Process Branch

Page 100: IBC FAIR Data Prototype Implementation   slideshow

The Disease Triangle, Modelled as “Contexts”

Interaction Context

Environmental Context

Host Context

Pathogen Context

Resulting phenotype

Page 101: IBC FAIR Data Prototype Implementation   slideshow

Environmental Context manually extracted from the literature

Environmental Context

Page 104: IBC FAIR Data Prototype Implementation   slideshow

Environmental Context manually extracted from the literature

Environmental Context

A pathogen that enters through the stomata will be more successful in high humidity, and have higher pathogenicity

Page 105: IBC FAIR Data Prototype Implementation   slideshow

The Disease Triangle, Modelled as “Contexts”

Interaction Context

Environmental Context

Host Context

Pathogen Context

Resulting phenotype

Page 106: IBC FAIR Data Prototype Implementation   slideshow

• Plant-pathogen interaction data, including:
  • Resulting phenotypes
  • Molecular/genetic basis of pathogenicity
  • Experimental approaches
  • Provenance information

Host Context

Pathogen Context

Page 107: IBC FAIR Data Prototype Implementation   slideshow

• 4800 interactions
• 3300 gene-mutant records
• 220 pathogens
• 130 hosts
• 261 registered diseases
• 1700 references

Host Context

Pathogen Context

Page 108: IBC FAIR Data Prototype Implementation   slideshow

[Figure: semantic model of an Interaction Context. Nodes shown: Interaction Context [WT] and Interaction Context [mutant], each with a Host Context, Pathogen Context, Environmental Context, Description, and Protocol. Example values shown: “Historical observation”, “Base state”, “reduced virulence”, “soft rot”, “gene deletion”; Protocol description; Citation; PubMedID “PMID:1234”.]

Rodríguez-Iglesias A. et al., Front. Plant Sci. (2016)

Page 110: IBC FAIR Data Prototype Implementation   slideshow

Pathogen Context

Allele

Gene

Locus ID

Gene function

Gene name

Gene accession

“AEQ95741”

“Effector protein”

“TAL2G”

“G7TJZ8”

Rodríguez-Iglesias A. et al Front. Plant Sci., (2016)

Page 111: IBC FAIR Data Prototype Implementation   slideshow

Pathogen Context

Allele

Gene

Locus ID

Gene function

Gene name

Gene accession

“AEQ95741”

“Effector protein”

“TAL2G”

“G7TJZ8”

Rodríguez-Iglesias A. et al Front. Plant Sci., (2016)

http://identifiers.org/*/*

Page 112: IBC FAIR Data Prototype Implementation   slideshow

Pathogen Context

Allele

Gene

Locus ID

Gene function

Gene name

Gene accession

“AEQ95741”

“Effector protein”

“TAL2G”

“G7TJZ8”

Rodríguez-Iglesias A. et al Front. Plant Sci., (2016)

rdfs:label

Page 114: IBC FAIR Data Prototype Implementation   slideshow

Pathogen Context

Allele

Gene

Locus ID

Gene function

Gene name

Gene accession

“AEQ95741”

“Effector protein”

“TAL2G”

“G7TJZ8”

Rodríguez-Iglesias A. et al Front. Plant Sci., (2016)

http://identifiers.org/*/*

KEY MESSAGE:

Because we use identifiers.org URIs, and AgroLD does also, we can query our Pathogen Host Interaction database, and DYNAMICALLY RETRIEVE additional information from AgroLD with NO additional effort!!
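
The key message rests on both datasets naming the same thing with the same URI. The sketch below mints identifiers.org-style URIs following the `http://identifiers.org/*/*` pattern on the slide and joins two small tables on them. The `uniprot` namespace for the accession “G7TJZ8”, the field names, and the second dataset's content are assumptions for illustration only; no real AgroLD data is shown.

```python
from typing import Dict


def identifiers_org_uri(namespace: str, accession: str) -> str:
    """Mint a URI following the http://identifiers.org/<namespace>/<id> pattern."""
    return f"http://identifiers.org/{namespace}/{accession}"


# Hypothetical extract of our own data, keyed by shared URI.
phi_base: Dict[str, dict] = {
    identifiers_org_uri("uniprot", "G7TJZ8"): {"gene_function": "Effector protein"},
}

# Hypothetical extract of a second resource that uses the same URIs.
other_resource: Dict[str, dict] = {
    identifiers_org_uri("uniprot", "G7TJZ8"): {"extra_annotation": "placeholder value"},
    identifiers_org_uri("uniprot", "P00000"): {"extra_annotation": "unrelated entry"},
}


def join_on_uri(left: Dict[str, dict], right: Dict[str, dict]) -> Dict[str, dict]:
    """Merge the descriptions of every entity that both sides identify identically."""
    shared = left.keys() & right.keys()
    return {uri: {**left[uri], **right[uri]} for uri in shared}


print(join_on_uri(phi_base, other_resource))
```

In a triplestore the same join happens implicitly in a federated SPARQL query; no mapping table is needed because the identifiers already agree.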

Page 115: IBC FAIR Data Prototype Implementation   slideshow

Transform PHI-base data into RDF compliant with the PPIO Ontology

Load into Virtuoso Triplestore

Page 116: IBC FAIR Data Prototype Implementation   slideshow

Transform PHI-base data into RDF compliant with the PPIO Ontology

Load into Virtuoso Triplestore

(this was a LOT of work!!)
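
The "Transform" step can be pictured as turning each source row into RDF triples. The sketch below emits N-Triples lines for one row. The `example.org` namespace, the `InteractionContext` and `hasGene` terms, and the row layout are hypothetical placeholders and are NOT the real PPIO terms or PHI-base schema; only `rdf:type` and `rdfs:label` are standard.

```python
from typing import Dict, List

EX = "http://example.org/ppio-sketch#"   # hypothetical namespace, not the real PPIO
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def ntriple(subject: str, predicate: str, obj: str, literal: bool = False) -> str:
    """Format one N-Triples line; literals get minimal escaping."""
    if literal:
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    else:
        rendered = f"<{obj}>"
    return f"<{subject}> <{predicate}> {rendered} ."


def row_to_triples(row: Dict[str, str]) -> List[str]:
    """Transform one (hypothetical) source row into triples."""
    interaction = f"http://example.org/interaction/{row['id']}"
    gene = f"http://identifiers.org/uniprot/{row['gene_accession']}"
    return [
        ntriple(interaction, RDF_TYPE, EX + "InteractionContext"),
        ntriple(interaction, EX + "hasGene", gene),
        ntriple(gene, RDFS_LABEL, row["gene_name"], literal=True),
    ]


triples = row_to_triples({"id": "1", "gene_accession": "G7TJZ8", "gene_name": "TAL2G"})
print("\n".join(triples))
```

The resulting file could then be bulk-loaded into a triplestore; the hard part in practice is the modelling behind the terms, not the string formatting.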

Page 117: IBC FAIR Data Prototype Implementation   slideshow

Transform PHI-base data into RDF compliant with the PPIO Ontology

Load into Virtuoso Triplestore

…but are we now FAIR?

Page 118: IBC FAIR Data Prototype Implementation   slideshow

Transform PHI-base data into RDF compliant with the PPIO Ontology

Load into Virtuoso Triplestore

…but are we now FAIR?

…Not really….

Page 119: IBC FAIR Data Prototype Implementation   slideshow

Findable: X

Accessible: HTTP GET, SPARQL, open access

Interoperable: RDF with published ontologies

Reusable: X

Page 120: IBC FAIR Data Prototype Implementation   slideshow

Findable: X
• How would you find this database?
• How would you know if anything interesting is in it?
• How would you (your machine) find a record?

Accessible: HTTP GET, SPARQL, open access

Interoperable: RDF with published ontologies

Reusable: X
• Who do you cite if you reuse a piece of data?
• What are the license conditions?
• Can I reuse the data at all??

Page 121: IBC FAIR Data Prototype Implementation   slideshow

Build a FAIR Accessor: http://linkeddata.systems/SemanticPHIBase/Metadata

Container Resource (HTTP GET)

<FAIR metadata/>

Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)

<FAIR metadata/>

foaf:primaryTopic SemPHI #1

dcat:Distribution_1: Source URL_U1, Format rdf+xml

dcat:Distribution_2: Source URL_U2, Format HTML

Page 122: IBC FAIR Data Prototype Implementation   slideshow

Build a FAIR Accessor: http://linkeddata.systems/SemanticPHIBase/Metadata

Container Resource (HTTP GET)

<FAIR metadata/>

Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)

<FAIR metadata/>

foaf:primaryTopic SemPHI #1

dcat:Distribution_1: Source URL_U1, Format rdf+xml

dcat:Distribution_2: Source URL_U2, Format HTML

The URL of the record in “native” PHI-base

Page 123: IBC FAIR Data Prototype Implementation   slideshow

Build a FAIR Accessor: http://linkeddata.systems/SemanticPHIBase/Metadata

Container Resource (HTTP GET)

<FAIR metadata/>

Contains: MetaRecordResource1, MetaRecordResource2, MetaRecordResource3, ...

MetaRecord Resource3 (HTTP GET)

<FAIR metadata/>

foaf:primaryTopic SemPHI #1

dcat:Distribution_1: Source URL_U1, Format rdf+xml

dcat:Distribution_2: Source URL_U2, Format HTML

The URL of the (RDF) record in Semantic PHI-base

Page 124: IBC FAIR Data Prototype Implementation   slideshow

This allows us to find Semantic PHI-base

based on its Repository-level Metadata

“what kind of data does Semantic PHI-base Contain?”

“Does it have any information about my gene of interest?”

And “drill-down” to a record of interest, selected based on its Record Metadata
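The drill-down can be sketched as two HTTP GET steps: read the Container Resource, then read each MetaRecord it lists. In this sketch a dictionary stands in for the web so the example runs offline; every URL except the Container URL, and all field names, are hypothetical.

```python
CONTAINER = "http://linkeddata.systems/SemanticPHIBase/Metadata"

# Stand-in for the web: URL -> what an HTTP GET on that URL would return.
WEB = {
    CONTAINER: {
        "metadata": {"title": "Semantic PHI-base"},
        "contains": [CONTAINER + "/rec1", CONTAINER + "/rec2"],
    },
    CONTAINER + "/rec1": {
        "metadata": {"primaryTopic": "SemPHI #1"},
        "distributions": [
            {"source": "http://example.org/rdf/1", "format": "application/rdf+xml"},
            {"source": "http://example.org/html/1", "format": "text/html"},
        ],
    },
    CONTAINER + "/rec2": {
        "metadata": {"primaryTopic": "SemPHI #2"},
        "distributions": [
            {"source": "http://example.org/rdf/2", "format": "application/rdf+xml"},
        ],
    },
}


def http_get(url: str) -> dict:
    """Simulated HTTP GET against the in-memory WEB."""
    return WEB[url]


def find_distribution(container_url: str, topic: str, fmt: str):
    """Walk Container -> MetaRecords -> Distributions; return a source URL or None."""
    container = http_get(container_url)
    for meta_url in container["contains"]:
        record = http_get(meta_url)
        if record["metadata"]["primaryTopic"] != topic:
            continue
        for dist in record["distributions"]:
            if dist["format"] == fmt:
                return dist["source"]
    return None


print(find_distribution(CONTAINER, "SemPHI #1", "text/html"))
```

The client needs nothing beyond GET and the metadata vocabulary to reach a record in the format it wants.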

Page 125: IBC FAIR Data Prototype Implementation   slideshow

But… FAIR Accessors should be symmetrical

How do I go from data back “upwards” to metadata?

Page 126: IBC FAIR Data Prototype Implementation   slideshow

But… FAIR Accessors should be symmetrical

How do I go from data back “upwards” to metadata?

To allow the retrieval of the Metadata for any piece of data in Semantic PHI Base

Use the URL of the FAIR Accessor (Container Resource) as the URL of the “Named Graph” in the triplestore

using RDF “Quads”

SubjectURI PredicateURI ObjectURI ContextURL

Container URL HTTP GET

<FAIR metadata/>

Contains

MetaRecordResource1 MetaRecordResource2 MetaRecordResource3

...

Page 127: IBC FAIR Data Prototype Implementation   slideshow

But… FAIR Accessors should be symmetrical

How do I go from data back “upwards” to metadata?

Container URL HTTP GET

<FAIR metadata/>

Contains

MetaRecordResource1 MetaRecordResource2 MetaRecordResource3

...

To allow the retrieval of the Metadata for any piece of data in Semantic PHI Base

Use the URL of the FAIR Accessor (Container Resource) as the URL of the “Named Graph” in the triplestore

using RDF “Quads”

SubjectURI PredicateURI ObjectURI Container URL
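A small sketch of the quad idea, assuming quads are held as plain tuples rather than in a real triplestore. The gene URI and the `hasFunction` predicate are made up for illustration; only the Container URL comes from the slides.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    context: str  # URL of the named graph


CONTAINER_URL = "http://linkeddata.systems/SemanticPHIBase/Metadata"

# Every statement carries the FAIR Accessor's Container URL as its fourth element.
store = [
    Quad("http://example.org/gene/1",
         "http://www.w3.org/2000/01/rdf-schema#label",
         "TAL2G", CONTAINER_URL),
    Quad("http://example.org/gene/1",
         "http://example.org/hasFunction",
         "Effector protein", CONTAINER_URL),
]


def metadata_urls_for(subject: str, quads: list) -> set:
    """Return the named-graph URLs of all quads about `subject`."""
    return {q.context for q in quads if q.subject == subject}


print(metadata_urls_for("http://example.org/gene/1", store))
```

Because the graph name is itself the Accessor URL, an HTTP GET on the returned value leads straight back to the repository-level metadata.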

Page 128: IBC FAIR Data Prototype Implementation   slideshow

Findable

Accessible

Interoperable

Reusable

Page 129: IBC FAIR Data Prototype Implementation   slideshow

The Brute Force approach is…

a lot of work!

Worthwhile for community-critical resources and

databases like AgroLD, UniProt, PHI-base,

ChEMBL, etc.

Page 130: IBC FAIR Data Prototype Implementation   slideshow

Is there a more “elegant” & lightweight

way to be FAIR?

Page 131: IBC FAIR Data Prototype Implementation   slideshow

FAIR Projection:

Providing FAIR Data from non-FAIR Data

Dynamically

Page 132: IBC FAIR Data Prototype Implementation   slideshow

This is going to be a bit complicated, but please be patient

Page 133: IBC FAIR Data Prototype Implementation   slideshow

Imagine the data we need to integrate

is in a CSV file in FigShare or Zenodo

How do we discover and integrate that data?

Page 134: IBC FAIR Data Prototype Implementation   slideshow

Things we need to do:

We need a way to query “opaque” data blobs (like CSV) about their content

We need a way to retrieve that content in a FAIR format

We need, therefore, to model semantics for that opaque data content

We need to model various semantics for that content (one “size” doesn’t fit all!)

We need to associate those semantic models with a record or record-sets

We need a way to query those semantics to determine which “size” fits our requirements

We would like to reuse semantic definitions as much as possible

We need to do all of this without creating a new API :-)

Page 135: IBC FAIR Data Prototype Implementation   slideshow

Triple Pattern Fragments +

RDF Mapping Language

Ruben Verborgh, Ghent University

Anastasia Dimou, Ghent University

Page 136: IBC FAIR Data Prototype Implementation   slideshow

Triple Pattern Fragments (TPF): A REST interface for requesting/retrieving RDF Triples (from any source)

Ruben Verborgh

“Slices” of data, from any source, are considered Resources and are therefore represented by a distinct URL:

http://some.database.org/dataset?s=___;p=___;o=___

Calling HTTP GET on a TPF URL returns the set of Triples matching {?s, ?p, ?o}

PLUS hypermedia instructions and Resource URLs for other relevant slices.

Page 137: IBC FAIR Data Prototype Implementation   slideshow

Triple Pattern Fragments (TPF): A REST interface for retrieving RDF Triples (from any source)

Ruben Verborgh

For example, the “BMI” column from a patient registry is a Resource with the URL:

http://my.registry.org/patients?p=CMO:0000105 (CMO:0000105 = “body mass index”)

HTTP GET gives me all BMI triples in the registry, together with other Resource URLs representing other “slices” that

might be useful, for example:

http://my.registry.org/patients?p=CMO:0000004 (CMO:0000004 = “systolic B.P.”)
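A rough sketch of the "slice" idea: a triple pattern with wildcards, and a URL that names that pattern. The `s`/`p`/`o` parameter names follow the URLs shown on the slides; a real TPF server may use different parameter names, and the patient triples here are invented.

```python
from urllib.parse import urlencode


def tpf_url(base: str, s=None, p=None, o=None) -> str:
    """Build the URL of one slice; positions left as None are wildcards."""
    params = {k: v for k, v in (("s", s), ("p", p), ("o", o)) if v is not None}
    # safe=":" keeps CURIE-style values such as CMO:0000105 readable.
    return f"{base}?{urlencode(params, safe=':')}" if params else base


def match(triples, s=None, p=None, o=None):
    """Return the triples matching the pattern {?s, ?p, ?o}."""
    pattern = (s, p, o)
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]


# Invented registry content: (subject, predicate, object).
registry = [
    ("patient:1", "CMO:0000105", "22.5"),
    ("patient:1", "CMO:0000004", "120"),
    ("patient:2", "CMO:0000105", "27.1"),
]

print(tpf_url("http://my.registry.org/patients", p="CMO:0000105"))
print(match(registry, p="CMO:0000105"))
```

A server would run something like `match` behind each URL and add links to neighbouring slices in its response.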

Page 139: IBC FAIR Data Prototype Implementation   slideshow

We have a standard, RESTful way to request triples from

any data source

i.e. every slice of every dataset will be considered a distinct Resource

→ simply call HTTP GET on that Resource to get the Triples

Page 140: IBC FAIR Data Prototype Implementation   slideshow

But... We have no way to know what

TPF Resources are available for any given dataset

or what those Resources “are” (proteins? genes? patients? articles?)

Page 141: IBC FAIR Data Prototype Implementation   slideshow

RML: A way to describe the structure of an RDF document

Anastasia Dimou

RML allows us to create models of (meta)data structures

“What could this data look like, if it were mapped to RDF?”

RML fulfills similar objectives to DCAT Profiles, the Dublin Core Application Profile, and ISO 11179 (Metadata Registries), but has added advantages!

http://rml.io/RMLmappingLanguage.html

Page 142: IBC FAIR Data Prototype Implementation   slideshow

Using RML to describe the structure and semantics of a single Triple

THE MODEL

Map1
  subjectMap: template “http://example.org/patient/{id}”, class ex:PatientRecord
  predicateObjectMap: predicate ex:hasVariant, objectMap → parentTriplesMap Map2

Map2
  subjectMap (SubjectMap2): template “http://identifiers.org/dbsnp/{snp}”, class SO:0000694 (“SNP”)

Page 143: IBC FAIR Data Prototype Implementation   slideshow

Using RML to describe the structure and semantics of a single Triple

THE MODEL

Map1
  subjectMap: template “http://example.org/patient/{id}”, class ex:PatientRecord
  predicateObjectMap: predicate ex:hasVariant, objectMap → parentTriplesMap Map2

Map2
  subjectMap (SubjectMap2): template “http://identifiers.org/dbsnp/{snp}”, class SO:0000694 (“SNP”)

We call this a “Triple Descriptor”

These are used to describe the structure of data

“slices” in which all Triples have the same structure

Page 144: IBC FAIR Data Prototype Implementation   slideshow

THE MODEL

Using RML to describe the structure and semantics of a single Triple

Map1
  subjectMap: template “http://example.org/patient/{id}”, class ex:PatientRecord
  predicateObjectMap: predicate ex:hasVariant, objectMap → parentTriplesMap Map2

Map2
  subjectMap (SubjectMap2): template “http://identifiers.org/dbsnp/{snp}”, class SO:0000694 (“SNP”)

Patient:123

rdf:type

ex:PatientRecord

snp:rs0020394

ex:hasVariant

rdf:type

ex:PatientRecord

The Data
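How a model of this shape turns one source record into triples can be sketched as follows. This is not an RML processor: the model is a plain dictionary whose keys are invented for the example, and only the two templates, the predicate, and the class names are taken from the slide.

```python
# Dictionary stand-in for the Triple Descriptor shown above (key names are hypothetical).
MODEL = {
    "subject_template": "http://example.org/patient/{id}",
    "subject_class": "ex:PatientRecord",
    "predicate": "ex:hasVariant",
    "object_template": "http://identifiers.org/dbsnp/{snp}",
    "object_class": "SO:0000694",
}


def project(row: dict, model: dict) -> list:
    """Fill both templates from one source row and emit the resulting triples."""
    subject = model["subject_template"].format(**row)
    obj = model["object_template"].format(**row)
    return [
        (subject, "rdf:type", model["subject_class"]),
        (subject, model["predicate"], obj),
        (obj, "rdf:type", model["object_class"]),
    ]


triples = project({"id": "123", "snp": "rs0020394"}, MODEL)
for triple in triples:
    print(triple)
```

Every row processed with the same model yields triples of identical structure, which is what makes the slice predictable to a consumer.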

Page 145: IBC FAIR Data Prototype Implementation   slideshow

Where are we now?

TPF - A standard, RESTful way to request Triples

Triple Descriptors - A standard way to describe the structure and meaning of a Triple

Page 146: IBC FAIR Data Prototype Implementation   slideshow

Where are we now?

TPF - A standard, RESTful way to request Triples

Triple Descriptors - A standard way to describe the structure and meaning of a Triple

We need a way to associate these with each other

We need a way to associate these with a dataset or record

Page 147: IBC FAIR Data Prototype Implementation   slideshow

Luckily, we have already solved this!

Page 148: IBC FAIR Data Prototype Implementation   slideshow

MetaRecord Resource3

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

HTTP GET

The FAIR Accessor can do this

Using the metadata structures defined by DCAT, the FAIR Accessor also tells you how to get the content of the record, and what formats are available

Page 149: IBC FAIR Data Prototype Implementation   slideshow

If we consider the TPF Resource URL to be just another DCAT Distribution, we get...

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL, format rdf+xml

Page 150: IBC FAIR Data Prototype Implementation   slideshow

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL, format rdf+xml

If we consider the TPF Resource URL to be just another DCAT Distribution, we get...

URL representing the Triple Pattern Fragment Resource

Page 151: IBC FAIR Data Prototype Implementation   slideshow

If we consider the TPF Resource URL to be just another DCAT Distribution, we get… now add the Triple Descriptor

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL, format rdf+xml, Model: Triple_Desc_URL

Page 152: IBC FAIR Data Prototype Implementation   slideshow

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL_1, format rdf+xml, Model: Triple_Desc_URL

If we consider the TPF Resource URL to be just another DCAT Distribution, we get… now add the Triple Descriptor

HTTP GET on that URL returns:

Page 153: IBC FAIR Data Prototype Implementation   slideshow

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL, format rdf+xml, Model: Triple_Desc_URL

If we consider the TPF Resource URL to be just another DCAT Distribution, we get… now add the Triple Descriptor

Record + TPF Server + RML Model = FAIR Projector

Page 154: IBC FAIR Data Prototype Implementation   slideshow

If we consider the TPF Resource URL to be just another DCAT Distribution, we get… now add the Triple Descriptor

<FAIR metadata/>

foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL, format rdf+xml, Model: Triple_Desc_URL

HTTP GET on TPF_URL returns rdf+xml triples from Record R that look like the model at Triple_Desc_URL

Interoperability without Brute Force
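From a client's point of view, a FAIR Projector is simply a distribution that also names its Triple Descriptor. A sketch of how a client might pick it out of the record metadata above, assuming the metadata has already been parsed into a dictionary (the key names are hypothetical; the values are the placeholders used on the slides):

```python
RECORD_METADATA = {
    "primaryTopic": "Record R",
    "distributions": [
        {"source": "URL_U1", "format": "application/rdf+xml"},
        {"source": "URL_U2", "format": "application/xml"},
        {"source": "TPF_URL", "format": "application/rdf+xml",
         "model": "Triple_Desc_URL"},
    ],
}


def projectors(metadata: dict, wanted_model=None) -> list:
    """Return source URLs of distributions that declare a Triple Descriptor.

    If `wanted_model` is given, keep only distributions using that model.
    """
    return [
        dist["source"]
        for dist in metadata["distributions"]
        if "model" in dist
        and (wanted_model is None or dist["model"] == wanted_model)
    ]


print(projectors(RECORD_METADATA))
print(projectors(RECORD_METADATA, wanted_model="Triple_Desc_URL"))
```

A client that already understands a given model can filter on it and call HTTP GET on the returned URL without inspecting the data first.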

Page 155: IBC FAIR Data Prototype Implementation   slideshow

I hear you objecting… I skipped something important!!!

We still have not defined a way to CREATE these triples

Page 156: IBC FAIR Data Prototype Implementation   slideshow

<FAIR metadata/> foaf:primaryTopic Record R

dcat:Distribution_1: Source URL_U1, format rdf+xml

dcat:Distribution_2: Source URL_U2, format application/xml

dcat:Distribution_3: Source TPF_URL, format rdf+xml, Model: Triple_Desc_URL

I hear you objecting… I skipped something important!!!

How does this return Triples?

We still have not defined a way to CREATE these triples

Page 157: IBC FAIR Data Prototype Implementation   slideshow

Sadly, there is no magic wand to create interoperability

Page 158: IBC FAIR Data Prototype Implementation   slideshow

Sadly, there is no magic wand to create interoperability

Someone has to write the TPF server that converts the data

Interoperability will never come “for free”

(because semantics will never come “for free”)

Page 159: IBC FAIR Data Prototype Implementation   slideshow

However, there are reasons for optimism!

1. Researchers transform data anyway to integrate it - this is a daily routine in most bioinformatics labs

2. For the most common file formats (e.g. CSV or Excel), there are RML-based tools to automate the RDF transformation; simply create an RML model of what you want, and ask the tool to convert the file.

3. Investing time into creating an RML model is more FAIR than ad hoc “re-useless” brute-force transformation. When you create a FAIR Projector for your own data transformation needs, it is reusable!

Page 160: IBC FAIR Data Prototype Implementation   slideshow

However, there are reasons for optimism!

AND

4. RML Triple Descriptors are very simple (one triple!), so we can also templatize their construction; creating a FAIR Projector is quite easy in many cases!

5. Citations Citations Citations! FAIR Accessors/Projectors are FAIR objects - You can get credit if other people use your Projector for their analyses
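Points 2 and 4 can be illustrated together: a one-triple descriptor built from a template function, then applied to every row of a CSV file. This is a hand-rolled stand-in for an RML-based tool, not the API of any such tool; the function names, CSV column names, and sample rows are assumptions.

```python
import csv
import io


def make_descriptor(subject_template: str, predicate: str, object_template: str) -> dict:
    """Templatized construction of a one-triple descriptor."""
    return {
        "subject": subject_template,
        "predicate": predicate,
        "object": object_template,
    }


def convert_csv(text: str, descriptor: dict) -> list:
    """Emit one N-Triples line per CSV row, filling templates from column values."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = descriptor["subject"].format(**row)
        obj = descriptor["object"].format(**row)
        lines.append(f"<{subject}> <{descriptor['predicate']}> <{obj}> .")
    return lines


# Invented CSV content with the two columns the templates refer to.
CSV_TEXT = "id,snp\n123,rs0020394\n456,rs0000001\n"

descriptor = make_descriptor(
    "http://example.org/patient/{id}",
    "http://example.org/hasVariant",
    "http://identifiers.org/dbsnp/{snp}",
)
ntriples = convert_csv(CSV_TEXT, descriptor)
print("\n".join(ntriples))
```

Changing the three arguments to `make_descriptor` is all it takes to project a different column pair, which is the sense in which descriptor construction can be templatized.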

Page 161: IBC FAIR Data Prototype Implementation   slideshow

Summary of FAIR Projectors

FAIR Projectors provide a discoverable and standardized REST interface to retrieve interoperable data, and its interoperable metadata

F+I+R

A + I

I

FAIR Projectors can convert non-FAIR data into FAIR data, or can change the structure, URL format, or semantics of existing FAIR data sources

FAIR Projectors can be deployed over, and provide a common interface to:

- Static Data Deposits, in any format, anywhere
- Databases
- Triplestores
- Certain (common) types of Web Services

R+++

Triple Descriptors are FAIR entities, intended for reuse, & None of this required a new API

Page 162: IBC FAIR Data Prototype Implementation   slideshow

Siri…

I need data about the expression of the Oryza dwarf-1 gene under high salt conditions.

Please find that data, regardless of location and format

If possible, please reformat it automatically to match my local dataset

Also, please collect the citation information for each piece of data

If the data is not under an open access license, or if any of the data is behind a firewall or paywall, please provide me the contact information of the data owner so that I can ask them for a copy.

The (near!) future of FAIR

Page 163: IBC FAIR Data Prototype Implementation   slideshow

Thanks to:

Michel Dumontier - Stanford Center for Biomedical Informatics Research, Stanford, California.

Ruben Verborgh – Ghent University – imec, Ghent, Belgium

Luiz Olavo Bonino da Silva Santos - Dutch Techcentre for Life Sciences, Utrecht, The Netherlands - Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.

Tim Clark - Department of Neurology, Massachusetts General Hospital Boston MA and Harvard Medical School, Boston, MA, USA

Morris A. Swertz - Genomics Coordination Center and Department of Genetics, University Medical Center Groningen, Groningen, The Netherlands

Fleur D.L. Kelpin - Genomics Coordination Center and Department of Genetics, University Medical Center Groningen, Groningen, The Netherlands

Alasdair J. G. Gray - Department of Computer Science, School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh, UK

Erik A. Schultes - Department of Human Genetics, Leiden University Medical Center, The Netherlands

Erik M. van Mulligen - Department of Medical Informatics, Erasmus University Medical Center Rotterdam, The Netherlands

Paolo Ciccarese - Perkin Elmer Innovation Lab, Cambridge MA and Harvard Medical School, Boston MA, USA

Mark Thompson - Leiden University Medical Center, Leiden, The Netherlands

Jerven T. Bolleman - Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva, Switzerland

Page 164: IBC FAIR Data Prototype Implementation   slideshow

Thanks to my former lab members… I MISS YOU!!!

Dr. Mikel Egaña Aranguren, Ontologist

Dr. Alejandro Rodríguez González, Database Expert

Dr. Alejandro Rodríguez Iglesias (PhD student at the time)

Page 165: IBC FAIR Data Prototype Implementation   slideshow

Funding for Mark Wilkinson from: Fundacion BBVA and the UPM Isaac Peral programme, and the Spanish Ministerio de Economía y Competitividad grant number TIN2014-55993-R. Additional support for FAIR Skunkworks members comes from:

European Union funded projects ELIXIR-EXCELERATE (H2020 no. 676559), ADOPT BBMRI-ERIC (H2020 no. 676550), CORBEL (H2020 no. 654248)

Netherlands Organisation for Scientific Research (Odex4all project)

Stichting Topconsortium voor Kennis en Innovatie High Tech Systemen en Materialen (FAIRdICT project)

BBMRI-NL

RD-Connect and ELIXIR (Rare disease implementation study FP7 no. 305444).

Page 166: IBC FAIR Data Prototype Implementation   slideshow

You are not only welcome to share and reuse this presentation...

...You are encouraged to!

http://tinyurl.com/IBC-FAIR

Page 167: IBC FAIR Data Prototype Implementation   slideshow

Some graphical elements were taken from slide templates provided by: FGST (Free Google Slides Templates)

Important: All our templates are free to use under Creative Commons Attribution License. If you use the graphic assets (photos, icons and typographies) included in these Google Slides Templates you must keep the Credits slide or add all attributions in the last slide notes.

