
EDF 2012 Datasets

Date post: 11-May-2015
Upload: jens-lehmann
Description:
The presentation shows which datasets have been converted to RDF and interlinked within the LATC EU project. In particular, it shows the typical conversion process for one example dataset - the EU financial transparency system.
Transcript
Page 1: EDF 2012 Datasets

Jens Lehmann AKSW Group, University of Leipzig

6 June 2012

Realising and Exploiting the EU Data Cloud

European Data Forum, Copenhagen, Denmark

Dataset Presentations

Page 2: EDF 2012 Datasets

EU-Level Dataset Development

Page 3: EDF 2012 Datasets

List of LATC Datasets

Categories: Business, Legal, Institutions

● FTS (EU finance)
● Eur-Lex (European Law)
● EuroStat (Statistical Data)
● CORDIS (EU projects, finance)
● N-Lex (National Law)
● Institution List
● Euraxess (EU jobs, companies)
● Taxation & Customs
● EU Who is Who
● EURES (EU jobs)
● EU Patent Office
● EU Barometer
● EC Competition (market overview)
● EU Agencies
● European Election Results
● eSBN (eBusiness solutions)
● PreLex (inter-institutional law)
● European Parliament Media
● UNODC (drugs & crime statistics)
● European Central Bank Statistics
● Other: Eventseer, Sciencewise

Total: 22 datasets

http://latc-project.eu/datasets/

Page 4: EDF 2012 Datasets

Financial Transparency System

Step 1: Analysing the Dataset

The Financial Transparency System (FTS) contains information about 110,000+ EU grants

Contains beneficiaries, amount of funding, year, responsible department, country etc.

Covers years 2007 – 2010

Originally published in HTML, XML and CSV

Page 5: EDF 2012 Datasets

Financial Transparency System

Step 2: Modelling the Data in RDF and OWL

Michael Martin, Claus Stadler, Philipp Frischmuth, Jens Lehmann: Increasing the Financial Transparency of European Commission Project Funding. Semantic Web Journal (under review)

Page 6: EDF 2012 Datasets

Financial Transparency System

Step 3: Converting the Dataset

Java classes generated automatically from XML Schema

XML data accessible as Java objects → script-based transformation

High flexibility for data cleansing and special cases

Source code of transformation

● https://github.com/AKSW/FTS-EC-2-RDF/

Diagram: XSD → (JAXB) → Java classes; XML → (JAXB) → Java objects → (Transformation) → RDF
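The conversion step above (XML records read into objects, then transformed to RDF with room for data cleansing) can be sketched as follows. This is a minimal illustration in Python rather than the Java/JAXB pipeline the slide describes; the element names, the sample record, and the `example.org` namespaces are all assumptions made for the sketch and are not taken from the real FTS schema or ontology.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the real FTS XML schema is not shown in the slides.
SAMPLE_XML = """
<commitments>
  <commitment id="c1">
    <year>2010</year>
    <beneficiary>Example University</beneficiary>
    <country>DE</country>
    <amount>125000.50</amount>
  </commitment>
</commitments>
"""

BASE = "http://example.org/fts/"          # placeholder namespace (assumption)
ONT = "http://example.org/fts/ontology/"  # placeholder ontology namespace (assumption)
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def commitment_to_triples(elem):
    """Turn one <commitment> element into a list of N-Triples lines."""
    subject = f"<{BASE}commitment/{elem.get('id')}>"
    triples = [f"{subject} <{RDF_TYPE}> <{ONT}Commitment> ."]
    for field in ("year", "beneficiary", "country"):
        value = (elem.findtext(field) or "").strip()  # simple cleansing step
        if value:
            triples.append(f'{subject} <{ONT}{field}> "{escape_literal(value)}" .')
    amount = (elem.findtext("amount") or "").strip()
    if amount:
        triples.append(f'{subject} <{ONT}amount> "{amount}"^^<{XSD_DECIMAL}> .')
    return triples


def convert(xml_text):
    """Parse the XML text and convert every commitment record."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter("commitment"):
        lines.extend(commitment_to_triples(elem))
    return lines


if __name__ == "__main__":
    print("\n".join(convert(SAMPLE_XML)))
```

Keeping the per-record logic in one function is what gives the flexibility for cleansing and special cases: each field can be normalised or skipped before a triple is emitted.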

Page 7: EDF 2012 Datasets

Financial Transparency System

Step 4: Publishing the Dataset

Landing Page, Linked Data, SPARQL endpoint, browser at http://fts.publicdata.eu via OntoWiki

Metadata: Datahub (http://thedatahub.org)

Page 8: EDF 2012 Datasets

Financial Transparency System

Page 9: EDF 2012 Datasets

Financial Transparency System

Page 10: EDF 2012 Datasets

Financial Transparency System

Step 5: Enriching the Dataset

Linking with LIMES (http://limes.aksw.org)

Link targets:

● LinkedGeoData: cities
● DBpedia: cities, countries, years, schema

Geo-Coding of beneficiaries on city and address level – 45k coordinates

Metadata: author, license, source, statistics using Dublin Core, VoID, Data Cube
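The linking step can be illustrated with a small label-matching sketch. It is not how LIMES works internally; it swaps in plain string similarity from Python's `difflib` to show the general idea of proposing `owl:sameAs` links when two labels are similar enough. The FTS-side URIs, the labels, and the 0.85 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical source entities (URI -> label); the URIs are placeholders.
FTS_CITIES = {
    "http://example.org/fts/city/1": "Leipzig ",
    "http://example.org/fts/city/2": "Kopenhagen",
    "http://example.org/fts/city/3": "Madrid",
}
# Target entities (URI -> label), here only two DBpedia cities.
DBPEDIA_CITIES = {
    "http://dbpedia.org/resource/Leipzig": "Leipzig",
    "http://dbpedia.org/resource/Copenhagen": "Copenhagen",
}


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two labels."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def discover_links(source, target, threshold=0.85):
    """Link each source entity to its best-matching target, if similar enough."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, OWL_SAME_AS, best_uri))
    return links


if __name__ == "__main__":
    for s, p, o in discover_links(FTS_CITIES, DBPEDIA_CITIES):
        print(f"<{s}> <{p}> <{o}> .")
```

The nested loop compares every source label with every target label, which is fine for a toy example but quadratic in general; dedicated link discovery frameworks exist largely to avoid that cost.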

Page 11: EDF 2012 Datasets

Financial Transparency System

Step 6: Queries, Applications, Visualisation

RDF version allows:

● Find organisations with highest funding
● Compare funding across countries / beneficiaries
● Compare funding per year and country (from FTS) with gross domestic product (from DBpedia) – see next slide

→ overall increases transparency and may serve as input for research policy strategies

Page 12: EDF 2012 Datasets

Financial Transparency System

SELECT * {
  { SELECT ?ftsyear ?ftscountry (SUM(?amount) AS ?funding) {
      ?com rdf:type fts-o:Commitment .
      ?com fts-o:year ?year .
      ?year rdfs:label ?ftsyear .
      ?com fts-o:benefit ?benefit .
      ?benefit fts-o:detailAmount ?amount .
      ?benefit fts-o:beneficiary ?beneficiary .
      ?beneficiary fts-o:country ?country .
      ?country owl:sameAs ?ftscountry .
    } GROUP BY ?ftsyear ?ftscountry
  }
  { SELECT ?dbpcountry ?gdpyear ?gdpnominal {
      ?dbpcountry rdf:type dbp-o:Country .
      ?dbpcountry dbp-p:gdpNominal ?gdpnominal .
      ?dbpcountry dbp-p:gdpNominalYear ?gdpyear .
    }
  }
  FILTER ((?ftsyear = str(?gdpyear)) && (?ftscountry = ?dbpcountry))
}
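The logic of the query on this slide (sum FTS funding per year and country, then keep only rows whose year and country match a DBpedia GDP entry) can be mirrored in a few lines of Python. The rows below are made-up toy values standing in for query results, not real FTS or DBpedia figures, and the function names are hypothetical.

```python
from collections import defaultdict

GERMANY = "http://dbpedia.org/resource/Germany"
DENMARK = "http://dbpedia.org/resource/Denmark"

# Toy rows (assumption): (year label, country URI reached via owl:sameAs, amount)
FTS_BENEFITS = [
    ("2009", GERMANY, 100.0),
    ("2009", GERMANY, 50.0),
    ("2010", DENMARK, 70.0),
]
# Toy rows (assumption): (country URI, GDP year as integer, nominal GDP)
DBPEDIA_GDP = [
    (GERMANY, 2009, 3000.0),
    (DENMARK, 2011, 300.0),
]


def funding_per_year_and_country(benefits):
    """First sub-select: SUM(?amount) grouped by year label and country."""
    totals = defaultdict(float)
    for year, country, amount in benefits:
        totals[(year, country)] += amount
    return dict(totals)


def join_with_gdp(totals, gdp_rows):
    """FILTER step: year label equals str(GDP year) and the countries match."""
    rows = []
    for country, gdp_year, gdp in gdp_rows:
        key = (str(gdp_year), country)
        if key in totals:
            rows.append({
                "year": key[0],
                "country": country,
                "funding": totals[key],
                "gdp": gdp,
            })
    return rows


if __name__ == "__main__":
    totals = funding_per_year_and_country(FTS_BENEFITS)
    for row in join_with_gdp(totals, DBPEDIA_GDP):
        print(row)
```

The `str(gdp_year)` conversion plays the same role as `str(?gdpyear)` in the query: the FTS year is a label (a string), while the GDP year is assumed here to be numeric.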

Page 13: EDF 2012 Datasets

Financial Transparency System

Page 14: EDF 2012 Datasets

European Employment Services

European Employment Services (EURES): cooperation network for the free movement of workers in the EU

Publishes 1.2+ million job vacancies, 700,000 CVs, 25,000 employers

RDF version can be used to:

● Compare geographical, economic information for new jobs (DBpedia, LGD)
● Salary comparisons relative to standards in job region
● Quality of nearby schools

Page 15: EDF 2012 Datasets

European Employment Services

Neither API nor dump available → site scraping

Modelling considered existing ontologies

Published using D2R: http://www4.wiwiss.fu-berlin.de/eures/

7 million triples, classes: Offer, Skill, Employer

3000 links to DBpedia cities + regions + countries + languages + currencies, LEXVO languages, Eurostat

Updates can be performed by scraping only new pages
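Updating by scraping only new pages amounts to remembering which pages were already processed. A minimal sketch, assuming each vacancy page has a stable identifier and that a listing of current identifiers is available; the identifiers, the `fetch` callback, and the function names are hypothetical and do not reflect the actual EURES scraper.

```python
def select_new_pages(listed_ids, seen_ids):
    """Return identifiers from the listing that have not been scraped yet."""
    # dict.fromkeys removes duplicates while keeping the listing order.
    return [i for i in dict.fromkeys(listed_ids) if i not in seen_ids]


def incremental_update(listed_ids, seen_ids, fetch):
    """Fetch only unseen pages and record them as seen.

    fetch is any callable that takes an identifier and returns page content;
    seen_ids is a mutable set that persists between runs.
    """
    pages = {}
    for page_id in select_new_pages(listed_ids, seen_ids):
        pages[page_id] = fetch(page_id)
        seen_ids.add(page_id)
    return pages


if __name__ == "__main__":
    seen = {"job-1", "job-2"}

    def fake_fetch(page_id):
        # Stand-in for an HTTP request in this sketch.
        return f"<html>{page_id}</html>"

    new_pages = incremental_update(["job-1", "job-2", "job-3"], seen, fake_fetch)
    print(sorted(new_pages))
```

In practice the set of seen identifiers would be stored on disk or in a database between runs; that part is left out here.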

Page 16: EDF 2012 Datasets

Euraxess

Contains research jobs in the EU: 6,400 organisations, 1,700 open jobs, 61,000 registered researchers, 18,000 researcher CVs

http://ec.europa.eu/euraxess/

Contains information about people, jobs, skills, languages etc.

Links to DBpedia languages and LEXVO languages

Page 17: EDF 2012 Datasets

Euraxess + EURES Query

Query: aggregates information about jobs and companies in a country from two different sources

SELECT DISTINCT ?job ?company WHERE {
  SERVICE <http://www4.wiwiss.fu-berlin.de/eures/sparql> {
    ?job eures:country ?countryjob .
    ?countryjob a eures:Country .
    ?countryjob rdfs:label ?n .
  }
  SERVICE <http://www4.wiwiss.fu-berlin.de/euraxess/sparql> {
    ?company euraxess:country ?countrycomp .
    ?countrycomp a euraxess:Country .
    ?countryjob owl:sameAs ?countrycomp .
  }
}
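Conceptually, each SERVICE block in a federated query yields a set of variable bindings from one endpoint, and the two sets are joined on the variables they share (here `?countryjob`). The sketch below mimics that join in Python on toy bindings; the identifiers in the sample data are invented for illustration and the helper names are hypothetical.

```python
def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def join(left, right):
    """Join two lists of solution mappings (dicts of variable -> value)."""
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def project_distinct(rows, variables):
    """SELECT DISTINCT over the given variables, keeping first-seen order."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row.get(v) for v in variables)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(variables, key)))
    return out


# Toy bindings (assumption) as the first SERVICE block might return them.
EURES_ROWS = [
    {"job": "eures:job/1", "countryjob": "eures:country/DK", "n": "Denmark"},
    {"job": "eures:job/2", "countryjob": "eures:country/DE", "n": "Germany"},
]
# Toy bindings (assumption) as the second SERVICE block might return them.
EURAXESS_ROWS = [
    {"company": "euraxess:org/7", "countrycomp": "euraxess:country/DK",
     "countryjob": "eures:country/DK"},
]

if __name__ == "__main__":
    joined = join(EURES_ROWS, EURAXESS_ROWS)
    print(project_distinct(joined, ["job", "company"]))
```

Only the job located in the country that has an `owl:sameAs` counterpart on the Euraxess side survives the join, which is the sense in which the links make the two sources combinable.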

Page 18: EDF 2012 Datasets

Summary / Take Away Messages

Linked Data increasingly important in EU E-Government

Many RDF conversion tools/techniques available depending on source format

Linked Data simplifies data integration – added value by enrichment, e.g. linking to other data sets or schema creation

LOD cloud provides rich background information

Thanks for your Attention!

