COMMIT/VIVO

Post on 12-Sep-2014

description

This presentation describes how Data2Semantics (http://www.data2semantics.org) uses the VIVO portal (http://vivoweb.org) to interlink researchers contributing to projects within the COMMIT programme (http://www.commit-nl.nl).

transcript

Rinke Hoekstra and Adianto Wibisono
VU University Amsterdam/University of Amsterdam

rinke.hoekstra@vu.nl

What is Data2Semantics?

Next Steps...

What is

... first a bit of background

Data2Semantics: From Data to Semantics for Scientific Data Publishers

HUBBLE Linked Data Hub for Clinical Decision Support

PROV-O-Matic™

• Python wrapper script for shell commands
  https://github.com/Data2Semantics/data/blob/master/src/d2s/prov.py

• Output in PROV-O & W3C Time vocabulary

• Timestamped URIs for files/resources

• ... integrate with GIT?

• Provenance trail for conversion, loading and linking
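
As a rough illustration of the wrapper idea in the bullets above, the following Python sketch runs a shell command and emits a few PROV-O statements as N-Triples. It is not the code from prov.py: the `BASE` namespace, the function names and the URI layout are assumptions made for this example, and the W3C Time vocabulary is left out for brevity.

```python
import subprocess
import sys
from datetime import datetime, timezone
from urllib.parse import quote

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"
BASE = "http://example.org/prov/"  # hypothetical namespace, not from the slides


def timestamped_uri(name, moment):
    """Mint a URI that embeds a timestamp, so each version of a resource is distinct."""
    stamp = moment.strftime("%Y%m%dT%H%M%S%fZ")
    return f"{BASE}{quote(name, safe='')}/{stamp}"


def run_with_provenance(command, inputs=(), outputs=()):
    """Run `command` (a list of arguments); return (exit code, N-Triples lines)."""
    started = datetime.now(timezone.utc)
    result = subprocess.run(command)
    ended = datetime.now(timezone.utc)

    activity = timestamped_uri("activity", started)
    triples = [
        f"<{activity}> <{RDF_TYPE}> <{PROV}Activity> .",
        f'<{activity}> <{PROV}startedAtTime> "{started.isoformat()}"^^<{XSD_DATETIME}> .',
        f'<{activity}> <{PROV}endedAtTime> "{ended.isoformat()}"^^<{XSD_DATETIME}> .',
    ]
    # Inputs are stamped with the start time, outputs with the end time.
    for name in inputs:
        triples.append(f"<{activity}> <{PROV}used> <{timestamped_uri(name, started)}> .")
    for name in outputs:
        triples.append(
            f"<{timestamped_uri(name, ended)}> <{PROV}wasGeneratedBy> <{activity}> ."
        )
    return result.returncode, triples


if __name__ == "__main__":
    # Example: wrap a trivial command; the file names are placeholders.
    code, lines = run_with_provenance(
        [sys.executable, "-c", "pass"], inputs=["in.csv"], outputs=["out.ttl"]
    )
    print("\n".join(lines))
```

Appending the returned lines to a log for every conversion, loading and linking step would give one possible form of the provenance trail mentioned above.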

Monday, February 27, 12

TabLinker: Semi-Automatic RDF Converter for Eccentric Excel Files

Partial Replication

Yasgui

COMPLEXITY vs. INTERESTINGNESS

?

Data Analysis

Provenance Reconstruction

http://www.data2semantics.org

RDF Conversion

RDF Cleaning

Internal Linking

Link to Other Data

Semi-Automatic Annotation

Cloud

Provenance Enrichment

acquiring data from text?

xml2rdf, d2rq

rdb2rdf

e.g. GATE, OpenCalais

AIDA Browser, Poseidon (Pirates/Maps)

…

SILK, Amalgame, Graph Rewriting

Provenance

Analysis/Metrics

Querying and Ranking

Visualization

User Interfaces

sgvizler

RDF Feedback

Semi-Automatic Conversion

“tablinker”

HUBBLE Linked Data Hub for Clinical Decision Support

AERS-LD: serious adverse event reports exposed as linked data

Papers & Guidelines

BioPortal: MeSH, MedDRA, SNOMED CT, etc.

LOD Cloud: UMLS, DBpedia, SIDER, DrugBank, LinkedCT

SILK link specification language and PROV-O

BioPortal Annotator with Annotation Ontology and PROV-O

4Store

Google Web Toolkit

Hubble demonstrates three ‘sales pitches’ of linked data: interoperability, interlinking and tool availability.

From patient to:
- Relevant publications
- Related adverse events
- Clinical trials
- Drug information
- Known side effects
- Statistical analysis

Key Points

• Build useful services and tools for data publishers ...

• ... that maintain provenance information ...

• ... and cater for the entire research cycle ...

• ... including a feedback loop to new research

One of our use cases ...

• Public-private research community

• Emphasis on applications of IT

• Emphasis on knowledge transfer

• 15 projects

• Collaboration with EIT ICT-Labs
  http://www.eitictlabs.eu/

http://www.commit-nl.nl

Why VIVO?

• Demonstrate collaboration within COMMIT/
  between projects (synergy), between organizations

• Integrate project results with collaboration network
  shared publications, deliverables

Linked Data Rubik’s Cube by Duncan Hull

Why ?

Most Dutch universities

Large companies

Government organizations

The Data

• COMMIT Website
  http://www.commit-nl.nl

• All project plans (buzzword mining)

• All public deliverables (~200 per year)

• All participating persons (not just researchers)

“Pilot”

• Scraping

• Web Karma
  http://bit.ly/WebKarma

Future Work

• Improve people scraper
  first name, family name, affiliation

• Ingest other content
  deliverables, plans etc.

• Shared ontology amongst Dutch VIVO installations

• Shared identifiers for researchers in NL (and VIVO)
  ORCID, ResearcherID, Digital Author ID

Event

• Yearly event for all COMMIT people

• Tap into registration process to get detailed info

• Wireless sensor networks to capture “synergy”

• Prizes and whatnot...

VIVO Pitfalls

• Very “institutional” perspective

• How to actively engage individual researchers?
  Reward mechanisms, integrate with Web 2.0 practices...

http://oreilly.com/web2/archive/what-is-web-20.html (2005)

Web 2.0

• Web applications generate your data

• Rich user experience

• You control your own data

• Immediate reward

• Quality increases by usage

Linkitup

• Lightweight Web Application

• Interface to API of existing data repositories

• Enrich metadata by linking to Linked Data resources

• Provide annotation services for data files

• Plugin based architecture

• Publish RDF metadata as new data publication

http://linkitup.data2semantics.org
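
To make the plugin-based idea in the bullets above more concrete, here is a minimal Python sketch of a plugin registry that enriches a metadata record with links and serialises them as N-Triples. It does not reflect the actual Linkitup code: the `register` decorator, the record layout, the `tags` plugin and the use of `rdfs:seeAlso` are all assumptions for illustration, and the plugin builds DBpedia-style URIs by string manipulation without checking that the resources exist.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import quote

Triple = Tuple[str, str, str]
Plugin = Callable[[dict], List[Triple]]

RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"
PLUGINS: Dict[str, Plugin] = {}  # hypothetical registry: plugin name -> function


def register(name: str):
    """Decorator that adds a linking function to the registry under `name`."""
    def decorator(func: Plugin) -> Plugin:
        PLUGINS[name] = func
        return func
    return decorator


@register("tags")
def link_tags(record: dict) -> List[Triple]:
    """Illustrative only: turn each tag into a DBpedia-style URI, without any lookup."""
    links = []
    for tag in record.get("tags", []):
        target = "http://dbpedia.org/resource/" + quote(tag.strip().replace(" ", "_"))
        links.append((record["uri"], RDFS_SEE_ALSO, target))
    return links


def enrich(record: dict) -> List[str]:
    """Run every registered plugin on the record and return N-Triples lines."""
    lines = []
    for name in sorted(PLUGINS):
        for s, p, o in PLUGINS[name](record):
            lines.append(f"<{s}> <{p}> <{o}> .")
    return lines


if __name__ == "__main__":
    # The record below is made up; a real one would come from a repository API.
    record = {"uri": "http://example.org/dataset/1", "tags": ["Linked data"]}
    print("\n".join(enrich(record)))
```

Keeping each linking source behind its own registered function means a new source can be added without touching `enrich`, which is one plausible reading of "plugin based architecture".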

Where to publish the RDF?

Send me more!

Future Work

• Improve people scraper
  first name, family name, affiliation

• Ingest other content
  deliverables, plans etc.

• Shared ontology amongst Dutch VIVO installations

• Shared identifiers for researchers in NL
  ORCID, ResearcherID, Digital Author ID

• ... reward mechanisms for individual authors!

http://www.data2semantics.org

Next week COMMIT/ Data

Early March COMMIT/ VIVO

Early April COMMIT/ Days
