Ontop: Answering SPARQL Queries over Relational Databases

Guohui Xiao

Faculty of Computer Science, Free University of Bozen-Bolzano, Italy

Free University of Bozen-Bolzano

February 12, 2016, Stanford University, CA, USA

Introduction Overview of Ontop SPARQL Query Answering in Ontop Use Cases Recent Progresses and Future

About Me

• Guohui Xiao, PhD

• Assistant Professor at KRDB Research Centre for Knowledge and Data, Free University of Bozen-Bolzano, Italy

• Education:
  - PhD in Computer Science, Vienna University of Technology, Austria
  - MSc and BSc in Mathematics, Peking University, China

• Research interests:
  - Artificial intelligence, knowledge representation
  - Description logics, ontologies, Semantic Web
  - Ontology-based data access
  - Implementation and optimization of reasoning systems

• Ontop team leader

• Current project: Optique (Scalable End-user Access to Big Data), EU FP7

G. Xiao (FUB) Ontop: Answering SPARQL Queries over Relational Databases 1/56

Outline

1 Introduction

2 Overview of Ontop

3 SPARQL Query Answering in Ontop

4 Use Cases

5 Recent Progresses and Future

We are Living in the Era of Big Data

Figure: "Data Never Sleeps 2.0" infographic

The Problem: information access

How to formulate the right question to obtain the right answer in the ocean of Big Data.

How much time is spent searching for data?

Engineers in industry spend a significant amount of their time searching for data that they require for their core tasks. For example, in the oil & gas industry, 30–70% of engineers’ time is spent looking for data and assessing its quality (Crompton, 2008).

Example: Statoil Exploration

Experts in geology and geophysics develop stratigraphic models of unexplored areas on the basis of data acquired from previous operations at nearby locations.

Facts:

• 1,000 TB of relational data

• using diverse schemata

• spread over 2,000 tables, over multiple individual databases

Data Access for Exploration:

• 900 experts in Statoil Exploration.

How much time/money is spent searching for data?

A user query at Statoil

Show all Norwegian wellbores with some additional attributes (wellbore id, .....................). Limit to all wellbores with ... and show attributes like ............................................... Limit to all wellbores with ... in ................. and show key attributes in a table. After connecting to ... we could for instance limit further to cores in ... with ...... and where it is larger than a given value, for instance ..... We could also find out whether there are cores in ..... which are not stored in .... (based on .....) and where there could be .......... value. Some of the missing data we possibly own, other not.

SELECT [...]
FROM
  db_name.table1 table1,
  db_name.table2 table2a,
  db_name.table2 table2b,
  db_name.table3 table3a,
  db_name.table3 table3b,
  db_name.table3 table3c,
  db_name.table3 table3d,
  db_name.table4 table4a,
  db_name.table4 table4b,
  db_name.table4 table4c,
  db_name.table4 table4d,
  db_name.table4 table4e,
  db_name.table4 table4f,
  db_name.table5 table5a,
  db_name.table5 table5b,
  db_name.table6 table6a,
  db_name.table6 table6b,
  db_name.table7 table7a,
  db_name.table7 table7b,
  db_name.table8 table8,
  db_name.table9 table9,
  db_name.table10 table10a,
  db_name.table10 table10b,
  db_name.table10 table10c,
  db_name.table11 table11,
  db_name.table12 table12,
  db_name.table13 table13,
  db_name.table14 table14,
  db_name.table15 table15,
  db_name.table16 table16
WHERE [...]
  table2a.attr1='keyword' AND
  table3a.attr2=table10c.attr1 AND
  table3a.attr6=table6a.attr3 AND
  table3a.attr9='keyword' AND
  table4a.attr10 IN ('keyword') AND
  table4a.attr1 IN ('keyword') AND
  table5a.kinds=table4a.attr13 AND
  table5b.kinds=table4c.attr74 AND
  table5b.name='keyword' AND
  (table6a.attr19=table10c.attr17 OR
    (table6a.attr2 IS NULL AND
     table10c.attr4 IS NULL)) AND
  table6a.attr14=table5b.attr14 AND
  table6a.attr2='keyword' AND
  (table6b.attr14=table10c.attr8 OR
    (table6b.attr4 IS NULL AND
     table10c.attr7 IS NULL)) AND
  table6b.attr19=table5a.attr55 AND
  table6b.attr2='keyword' AND
  table7a.attr19=table2b.attr19 AND
  table7a.attr17=table15.attr19 AND
  table4b.attr11='keyword' AND
  table8.attr19=table7a.attr80 AND
  table8.attr19=table13.attr20 AND
  table8.attr4='keyword' AND
  table9.attr10=table16.attr11 AND
  table3b.attr19=table10c.attr18 AND
  table3b.attr22=table12.attr63 AND
  table3b.attr66='keyword' AND
  table10a.attr54=table7a.attr8 AND
  table10a.attr70=table10c.attr10 AND
  table10a.attr16=table4d.attr11 AND
  table4c.attr99='keyword' AND
  table4c.attr1='keyword' AND
  table11.attr10=table5a.attr10 AND
  table11.attr40='keyword' AND
  table11.attr50='keyword' AND
  table2b.attr1=table1.attr8 AND
  table2b.attr9 IN ('keyword') AND
  table2b.attr2 LIKE 'keyword%' AND
  table12.attr9 IN ('keyword') AND
  table7b.attr1=table2a.attr10 AND
  table3c.attr13=table10c.attr1 AND
  table3c.attr10=table6b.attr20 AND
  table3c.attr13='keyword' AND
  table10b.attr16=table10a.attr7 AND
  table10b.attr11=table7b.attr8 AND
  table10b.attr13=table4b.attr89 AND
  table13.attr1=table2b.attr10 AND
  table13.attr20='keyword' AND
  table13.attr15='keyword' AND
  table3d.attr49=table12.attr18 AND
  table3d.attr18=table10c.attr11 AND
  table3d.attr14='keyword' AND
  table4d.attr17 IN ('keyword') AND
  table4d.attr19 IN ('keyword') AND
  table16.attr28=table11.attr56 AND
  table16.attr16=table10b.attr78 AND
  table16.attr5=table14.attr56 AND
  table4e.attr34 IN ('keyword') AND
  table4e.attr48 IN ('keyword') AND
  table4f.attr89=table5b.attr7 AND
  table4f.attr45 IN ('keyword') AND
  table4f.attr1='keyword' AND
  table10c.attr2=table4e.attr19 AND
  (table10c.attr78=table12.attr56 OR
    (table10c.attr55 IS NULL AND
     table12.attr17 IS NULL))

At Statoil, it takes up to 4 days to formulate a query in SQL.

Statoil loses up to €50,000,000 per year because of this!

Challenges Accessing Big Data

This is what happens:

Need for Abstraction

We need to facilitate access to data

• by abstracting away from how the data is stored, and

• by making use of high-level views of the data, so-called ontologies.

Ontology Based Data Access Framework

Figure: The OBDA framework. A query is posed over the ONTOLOGY (global vocabulary + conceptual view) and a result is returned; MAPPINGS (how to populate the ontology) connect the ontology to the DATA SOURCES (external and heterogeneous).

Logical transparency in accessing data. The user

• does not need to know where and how the data is stored;

• sees only a conceptual view of the data.

Ontop

• Is a platform to query databases through ontologies, relying on semantic technologies.

• Compliant with the standards of the W3C.

• Supports all major relational DBs (Oracle, DB2, Postgres, MySQL, etc.).

• Open-source and released under Apache license.

• Development of Ontop:
  - development started 6 years ago
  - already well established:
    • 200+ topics on the mailing list
    • 2,300+ downloads in the last 10 months
  - currently being developed in the context of the EU project Optique

Architecture of Ontop

Figure: Architecture of the Ontop system

• Application Layer: Protege; Optique Platform; Sesame Workbench & SPARQL Endpoint

• API Layer: OWL-API; Sesame Storage And Inference Layer (SAIL) API

• Ontop Core: Ontop SPARQL Query Answering Engine (Quest); R2RML API; OWL-API (OWL Parser); Sesame API (SPARQL Parser); JDBC

• Inputs: Relational Databases; R2RML Mappings; OWL 2 QL Ontologies; SPARQL Queries

Databases

Ontop supports standard relational database engines via JDBC.

• commercial databases: DB2, Oracle, MS SQL Server

• open-source databases: PostgreSQL, MySQL, H2, HSQL

• federated databases (e.g., Teiid [1] or Exareme [2]) to support multiple data sources (e.g., relational databases, XML, CSV, and Web Services).

[1] http://teiid.jboss.org
[2] http://www.exareme.org

Example: Hospital Database

Table: tbl_patient

pid  name    type   stage
1    'Mary'  false  4
2    'John'  true   1

Type:

• false for Non-Small Cell Lung Carcinoma (NSCLC);

• true for Small Cell Lung Carcinoma (SCLC).

Stage:

• NSCLC: 1–6 for stages I, II, III, IIIa, IIIb, and IV, respectively;

• SCLC: 1 and 2 for stages Limited and Extensive, respectively.
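
The encoding above can be tried out in an in-memory SQLite database. This is only a minimal sketch: the column types, the 0/1 encoding of the boolean `type` column, and the `decode` helper are assumptions made for illustration and are not part of Ontop.

```python
import sqlite3

# Stage codes as listed on the slide; the dictionaries are an illustrative
# encoding chosen for this sketch.
NSCLC_STAGES = {1: "I", 2: "II", 3: "III", 4: "IIIa", 5: "IIIb", 6: "IV"}
SCLC_STAGES = {1: "Limited", 2: "Extensive"}


def build_hospital_db():
    """Create an in-memory copy of tbl_patient with the two example rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_patient ("
        " pid INTEGER PRIMARY KEY, name TEXT, type INTEGER, stage INTEGER)"
    )
    # SQLite has no native boolean type, so false/true are stored as 0/1.
    conn.executemany(
        "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
        [(1, "Mary", 0, 4), (2, "John", 1, 1)],
    )
    return conn


def decode(type_flag, stage):
    """Translate the (type, stage) codes into readable labels."""
    if type_flag:
        return "SCLC", SCLC_STAGES[stage]
    return "NSCLC", NSCLC_STAGES[stage]


conn = build_hospital_db()
decoded = {
    name: decode(type_flag, stage)
    for name, type_flag, stage in conn.execute(
        "SELECT name, type, stage FROM tbl_patient ORDER BY pid"
    )
}
print(decoded)
```

The point of the sketch is that the meaning of `stage` depends on `type`, which is exactly the kind of implicit knowledge a user would otherwise have to remember when writing SQL by hand.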

Ontology

• Ontop uses RDFS and OWL 2 QL as ontology languages.

• OWL 2 QL is based on the DL-Lite family of lightweight description logics, which guarantees first-order (FO) rewritability.

Example

:NSCLC rdfs:subClassOf :LungCancer .
:SCLC rdfs:subClassOf :LungCancer .
:LungCancer rdfs:subClassOf :Neoplasm .

:hasNeoplasm rdfs:domain :Patient .
:hasNeoplasm rdfs:range :Neoplasm .

:hasName a owl:DatatypeProperty .
:hasStage a owl:ObjectProperty .
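
To make the meaning of these axioms concrete, the following sketch computes which class memberships they entail for a few facts. It is a deliberately simplified illustration: it covers only the subclass, domain, and range axioms of the example, assumes every class has at most one direct superclass, and uses made-up individual names (`tumor1`, `tumor2`, `mary`, `john`). It computes the inferences forward purely for illustration and says nothing about how Ontop implements reasoning.

```python
# Axioms transcribed from the example ontology.
SUBCLASS_OF = {"NSCLC": "LungCancer", "SCLC": "LungCancer", "LungCancer": "Neoplasm"}
DOMAIN = {"hasNeoplasm": "Patient"}
RANGE = {"hasNeoplasm": "Neoplasm"}


def superclasses(cls):
    """Return cls together with all of its (transitive) superclasses."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def entailed_types(triples):
    """Map each individual to the classes entailed by the axioms above."""
    types = {}
    for subject, predicate, obj in triples:
        if predicate == "a":
            # Explicit class membership propagates up the subclass chain.
            types.setdefault(subject, set()).update(superclasses(obj))
            continue
        if predicate in DOMAIN:
            types.setdefault(subject, set()).update(superclasses(DOMAIN[predicate]))
        if predicate in RANGE:
            types.setdefault(obj, set()).update(superclasses(RANGE[predicate]))
    return types


# Hypothetical facts, invented for this sketch.
facts = [
    ("tumor1", "a", "NSCLC"),
    ("mary", "hasNeoplasm", "tumor1"),
    ("john", "hasNeoplasm", "tumor2"),
]
types = entailed_types(facts)
for individual in sorted(types):
    print(individual, sorted(types[individual]))
```

Note that `tumor2` is inferred to be a `Neoplasm` only through the range axiom, and `mary` and `john` are inferred to be patients only through the domain axiom.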

Mappings

Ontop supports two mapping languages:

• W3C RDB2RDF mapping language R2RML

• Ontop native mapping language

Example (Mappings in Ontop native mapping language)

:db1/{pid} a :Patient .
    ← SELECT pid FROM tbl_patient

:db1/neoplasm/{pid} a :NSCLC .
    ← SELECT pid FROM tbl_patient WHERE type = false

:db1/neoplasm/{pid} a :SCLC .
    ← SELECT pid FROM tbl_patient WHERE type = true

:db1/{pid} :hasName {name} .
    ← SELECT pid, name FROM tbl_patient

:db1/{pid} :hasNeoplasm :db1/neoplasm/{pid} .
    ← SELECT pid FROM tbl_patient

:db1/neoplasm/{pid} :hasStage :stage-IIIa .
    ← SELECT pid FROM tbl_patient WHERE stage = 4 AND type = false
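
One way to read these mappings is as a recipe for producing RDF triples: run the SQL on the right, then fill the placeholders of the template on the left with each result row. The sketch below does exactly that over an in-memory SQLite copy of the example table. It is an illustration of the semantics only, not Ontop code: the `materialize` helper is hypothetical, booleans are stored as 0/1 because SQLite has no boolean type, and names are rendered as quoted literals by assumption.

```python
import sqlite3

# (triple template, source SQL) pairs transcribed from the slide.
MAPPINGS = [
    (":db1/{pid} a :Patient .",
     "SELECT pid FROM tbl_patient"),
    (":db1/neoplasm/{pid} a :NSCLC .",
     "SELECT pid FROM tbl_patient WHERE type = 0"),
    (":db1/neoplasm/{pid} a :SCLC .",
     "SELECT pid FROM tbl_patient WHERE type = 1"),
    (':db1/{pid} :hasName "{name}" .',
     "SELECT pid, name FROM tbl_patient"),
    (":db1/{pid} :hasNeoplasm :db1/neoplasm/{pid} .",
     "SELECT pid FROM tbl_patient"),
    (":db1/neoplasm/{pid} :hasStage :stage-IIIa .",
     "SELECT pid FROM tbl_patient WHERE stage = 4 AND type = 0"),
]


def example_db():
    """In-memory copy of the example table (assumed column types)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_patient "
        "(pid INTEGER, name TEXT, type INTEGER, stage INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
        [(1, "Mary", 0, 4), (2, "John", 1, 1)],
    )
    return conn


def materialize(conn, mappings):
    """Instantiate every template once per row returned by its SQL."""
    triples = set()
    for template, sql in mappings:
        cursor = conn.execute(sql)
        columns = [column[0] for column in cursor.description]
        for row in cursor:
            triples.add(template.format(**dict(zip(columns, row))))
    return triples


triples = materialize(example_db(), MAPPINGS)
for triple in sorted(triples):
    print(triple)
```

On the two example rows this yields nine triples; only Mary's neoplasm receives the `:hasStage :stage-IIIa` triple, because only her row satisfies `stage = 4` with `type = false`.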

Queries

• Ontop supports essentially all features of SPARQL 1.0 as well as the OWL 2 QL entailment regime of SPARQL 1.1.

• Implementation of other features of SPARQL 1.1 (e.g., aggregates, property path queries, negation) is work in progress.

The following SPARQL query retrieves the names of all patients who have a neoplasm (tumor) at stage IIIa.

SELECT ?name WHERE {
  ?p a :Patient ;
     :hasName ?name ;
     :hasNeoplasm ?tumor .
  ?tumor a :Neoplasm ;
         :hasStage :stage-IIIa .
}
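
On the example data this query should return only Mary. The sketch below checks that with a hand-derived SQL query over an in-memory SQLite copy of the table. The SQL is an assumption worked out by hand from the example mappings and ontology, not the output of Ontop: `:hasStage :stage-IIIa` is produced only for rows with `stage = 4` and `type = false`, every such neoplasm is an `:NSCLC` and therefore a `:Neoplasm`, and all remaining triple patterns come from the same table keyed by `pid`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient "
    "(pid INTEGER, name TEXT, type INTEGER, stage INTEGER)"
)
# Booleans are stored as 0/1 because SQLite has no boolean type.
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, "Mary", 0, 4), (2, "John", 1, 1)],
)

# Hand-derived SQL counterpart of the SPARQL query (illustrative only).
SQL = """
SELECT p.name
FROM tbl_patient AS p
WHERE p.type = 0 AND p.stage = 4
"""

names = [row[0] for row in conn.execute(SQL)]
print(names)  # ['Mary']
```

The SPARQL query mentions neither the table nor the numeric stage code; that knowledge lives in the mappings.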

Ontop Core API

• The core of Ontop is the SPARQL query answering engine Quest.

• We will explain the details in the next section.

API layer of Ontop

System developers can use Ontop as a Java library

• OWL API is a reference implementation for creating, manipulating, and serializing OWL ontologies. We extended the OWLReasoner Java interface to support SPARQL query answering.

• Sesame is a de facto standard framework for processing RDF data. Ontop implements the Sesame Storage And Inference Layer (SAIL) API, supporting inferencing and querying over relational databases.

• Available as Maven artifacts from the Maven Central repository.

Application Layer of Ontop

• Command line interface

• Protege plugin

• Sesame Workbench and SPARQL Endpoint

• Optique Platform

• Stardog

Ontop Protege plugin

The Ontop Protege plugin provides a graphical interface for:

• editing mappings

• executing SPARQL queries

• checking (in)consistency of the ontology

• bootstrapping ontologies and mappings from the database

• importing and exporting R2RML mappings

• materializing RDF triples, etc.

Mapping Editor in Protege

SPARQL query answering in Protege

Ontop plugin available from Protege plugin repository

Sesame workbench and SPARQL endpoint

• Sesame OpenRDF Workbench is a web application for administering Sesame repositories.

• We extended the Workbench to create and manage Ontop repositories.

• Such repositories can then be used as standard HTTP SPARQL endpoints.

• Currently Ontop supports only Sesame v2; we are working on supporting v4.
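
Because such a repository behaves like a standard HTTP SPARQL endpoint, any HTTP client can query it through the SPARQL Protocol. The sketch below only builds the request and does not send it. The endpoint URL, the repository name `hospital`, and the namespace `http://example.org/hospital#` are hypothetical placeholders, not values taken from the talk.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and namespace, chosen only for illustration.
ENDPOINT = "http://localhost:8080/openrdf-sesame/repositories/hospital"
QUERY = """PREFIX : <http://example.org/hospital#>
SELECT ?name WHERE {
  ?p a :Patient ;
     :hasName ?name .
}"""


def build_request(endpoint, query):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


request = build_request(ENDPOINT, QUERY)
print(request.full_url)

# Sending the request needs a running endpoint, so it is left commented out:
# import json
# with urllib.request.urlopen(request) as response:
#     results = json.load(response)
```

Passing the query as a URL-encoded `query` parameter and negotiating the result format through the `Accept` header follows the SPARQL 1.1 Protocol, so the same client code should work against other compliant endpoints.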

Screenshot of the Ontop Sesame Workbench

Ontop in the Optique Architecture

Stardog

• Stardog is a commercial triplestore developed by Complexible, Inc.

• Since version 4, released in November 2015, Stardog has integrated Ontop code to support SPARQL queries over virtual RDF graphs.

• The Virtual Graph feature is only available in the enterprise edition.

Conceptual Framework of Query Answering by Query Rewriting

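Figure: the OBDA layers (ontology, mappings, data sources) and the four steps of query answering. An ontological query q is turned into a rewritten query (Rewriting, using the ontology), then into SQL (Unfolding, using the mappings). The SQL is evaluated over the data sources (Evaluation), and the relational answer is translated back into the ontological answer, the result of q (Result Translation).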
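
The four steps can be sketched end to end on a toy example. This is only an illustration, not Ontop's implementation: the ontology is reduced to subclass axioms, the query to a single class, and the names SUBCLASS_OF, MAPPINGS, IRI_TEMPLATE and demo_connection are hypothetical. SQLite stores booleans as 0/1, so type = false is written type = 0.

```python
import sqlite3

# Hypothetical ontology: direct subclass axioms (child -> parent).
SUBCLASS_OF = {":NSCLC": ":Neoplasm", ":SCLC": ":Neoplasm"}

# Hypothetical mappings: class -> SQL query returning the pid column.
MAPPINGS = {
    ":NSCLC": "SELECT pid FROM tbl_patient WHERE type = 0",
    ":SCLC": "SELECT pid FROM tbl_patient WHERE type = 1",
}

IRI_TEMPLATE = ":db1/neoplasm/{pid}"


def rewrite(cls):
    """Rewriting: expand a class into itself plus all of its subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def unfold(classes):
    """Unfolding: replace each mapped class by its SQL and union the parts."""
    parts = [MAPPINGS[c] for c in sorted(classes) if c in MAPPINGS]
    return " UNION ".join(parts) if parts else None


def evaluate(conn, sql):
    """Evaluation: run the SQL over the data source."""
    if sql is None:
        return []
    return [row[0] for row in conn.execute(sql)]


def translate(pids):
    """Result translation: turn relational values into IRIs."""
    return sorted(IRI_TEMPLATE.format(pid=p) for p in pids)


def answer(conn, cls):
    """Answer the query `?x a cls` by chaining the four steps."""
    return translate(evaluate(conn, unfold(rewrite(cls))))


def demo_connection():
    """Create a small in-memory data source for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_patient (pid INTEGER PRIMARY KEY, type INTEGER, stage INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tbl_patient VALUES (?, ?, ?)",
        [(1, 0, 4), (2, 1, 2), (3, 0, 1)],
    )
    return conn
```

Calling answer(demo_connection(), ":Neoplasm") returns the IRIs of all three patients, although no mapping mentions :Neoplasm directly: the rewriting step brings in the two subclasses.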


Ontop Workflow

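Figure: The Ontop workflow. Off-line: the Reasoner turns the Ontology into a Classified Ontology; the Mapping Optimiser combines it with the Mappings and the DB Integrity Constraints into the T-mapping. On-line: a SPARQL Query passes through the Query Rewriter and the SPARQL to SQL Translator, which produce an SQL query.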

• The off-line stage (start-up time) processes the ontology, mappings, and database integrity constraints.

• The on-line stage executes SPARQL queries by rewriting them into SQL queries.


Offline Stage

The offline stage can be thought of as consisting of three phases:

• ontology classification

• T-mapping construction

• T-mapping optimization

Example

• New axioms in the classified ontology:

:NSCLC rdfs:subClassOf :Neoplasm .
:SCLC rdfs:subClassOf :Neoplasm .

• Inferred mappings after T-mapping construction:

:db1/neoplasm/{pid} a :Neoplasm . ← SELECT pid FROM tbl_patient WHERE type = false
:db1/neoplasm/{pid} a :Neoplasm . ← SELECT pid FROM tbl_patient WHERE type = true

• Optimized T-mappings:

:db1/neoplasm/{pid} a :Neoplasm . ← SELECT pid FROM tbl_patient WHERE type = false OR type = true
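
The example can be reproduced with a small sketch. It assumes a much simplified mapping format (IRI template, table, WHERE condition) and an acyclic class hierarchy; which class maps to which value of type is also an assumption. The functions build_t_mappings and optimize are hypothetical names, not Ontop's API.

```python
from collections import defaultdict

# Classified ontology from the example: child class -> parent class.
SUBCLASS_OF = {":NSCLC": ":Neoplasm", ":SCLC": ":Neoplasm"}

# Original mappings in a simplified, hypothetical form:
# class -> list of (IRI template, table, WHERE condition).
MAPPINGS = {
    ":NSCLC": [(":db1/neoplasm/{pid}", "tbl_patient", "type = false")],
    ":SCLC": [(":db1/neoplasm/{pid}", "tbl_patient", "type = true")],
}


def ancestors(cls):
    """Yield every superclass of cls (assumes the hierarchy has no cycles)."""
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        yield cls


def build_t_mappings(mappings):
    """T-mapping construction: copy each mapping to all superclasses."""
    t_mappings = defaultdict(list)
    for cls, assertions in mappings.items():
        for target in [cls, *ancestors(cls)]:
            t_mappings[target].extend(assertions)
    return dict(t_mappings)


def optimize(t_mappings):
    """T-mapping optimization: merge mappings of one class that share the
    IRI template and the table into one query with a disjunctive filter."""
    optimized = {}
    for cls, assertions in t_mappings.items():
        grouped = defaultdict(list)
        for template, table, cond in assertions:
            if cond not in grouped[(template, table)]:
                grouped[(template, table)].append(cond)
        optimized[cls] = [
            (template, f"SELECT pid FROM {table} WHERE " + " OR ".join(conds))
            for (template, table), conds in grouped.items()
        ]
    return optimized
```

For :Neoplasm, optimize(build_t_mappings(MAPPINGS)) yields a single mapping whose source query filters on type = false OR type = true, matching the optimized T-mapping above.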


Online Stage

During query execution (the online stage), Ontop transforms an input SPARQL query into an optimized SQL query using the T-mappings and database integrity constraints.

Optimizing the generated SQL queries

Structural optimizations

• pushing the joins inside the unions,

• pushing the functions as high as possible in the query tree,

• eliminating sub-queries.

Semantic query optimizations

Semantic analysis of SQL queries to reduce their size and complexity:

• removing redundant self-joins,

• detecting unsatisfiable or trivially valid (true) conditions.
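
The last bullet can be illustrated with a small sketch for conditions on a single boolean column. It assumes the column is declared NOT NULL (the kind of fact a database integrity constraint provides); with NULLs, type = false OR type = true is not trivially valid. The helper names are hypothetical and this is not how Ontop represents conditions.

```python
from functools import reduce

# A condition on one NOT NULL boolean column is modelled as the set of
# column values that satisfy it.
ALL_VALUES = frozenset({False, True})


def equals(value):
    """Condition `col = value`."""
    return frozenset({value})


def conj(*conds):
    """Condition `c1 AND c2 AND ...`: a value must satisfy every conjunct."""
    return reduce(frozenset.intersection, conds, ALL_VALUES)


def disj(*conds):
    """Condition `c1 OR c2 OR ...`: a value must satisfy some disjunct."""
    return reduce(frozenset.union, conds, frozenset())


def render(column, cond):
    """Return simplified SQL text for the condition."""
    if not cond:
        return "FALSE"  # unsatisfiable: the whole sub-query can be dropped
    if cond == ALL_VALUES:
        return "TRUE"  # trivially valid: the filter can be dropped
    (value,) = cond
    return f"{column} = {'true' if value else 'false'}"
```

For instance, render("type", conj(disj(equals(False), equals(True)), equals(False))) simplifies to type = false.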


Example of SQL translation and optimization

• Consider a SPARQL query

SELECT ?x WHERE { ?x a :Neoplasm ; :hasStage :stage-IIIa . }

• Non-optimized generated SQL query

SELECT Q1.x FROM (
  (SELECT concat(":db1/neoplasm/", pid) AS x
   FROM tbl_patient WHERE type = false OR type = true) Q1
  JOIN
  (SELECT concat(":db1/neoplasm/", pid) AS x
   FROM tbl_patient WHERE stage = 4 AND type = false) Q2
  ON Q1.x = Q2.x)

• SQL query after the structural optimization

SELECT concat(":db1/neoplasm/", Q.pid) AS x FROM
  (SELECT T1.pid
   FROM tbl_patient T1 JOIN tbl_patient T2 ON T1.pid = T2.pid
   WHERE (T1.type = false OR T1.type = true)
     AND T2.stage = 4 AND T2.type = false) Q

• SQL query after the self-join elimination

SELECT concat(":db1/neoplasm/", Q.pid) AS x FROM
  (SELECT pid FROM tbl_patient WHERE type = false AND stage = 4) Q

• SQL query after the second structural optimization

SELECT concat(":db1/neoplasm/", pid) AS x FROM tbl_patient
WHERE type = false AND stage = 4
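
One way to check that the four queries are equivalent is to run them on a small table. The sketch below uses SQLite through Python and adapts the SQL accordingly: || replaces concat, single quotes replace double quotes, 0/1 replace false/true, and the outer parentheses around the join are dropped. The table contents and the name run_all are made up for the illustration. The self-join elimination relies on pid being a primary key.

```python
import sqlite3

NON_OPTIMIZED = """
SELECT Q1.x FROM
  (SELECT ':db1/neoplasm/' || pid AS x
   FROM tbl_patient WHERE type = 0 OR type = 1) Q1
  JOIN
  (SELECT ':db1/neoplasm/' || pid AS x
   FROM tbl_patient WHERE stage = 4 AND type = 0) Q2
  ON Q1.x = Q2.x
"""

AFTER_STRUCTURAL = """
SELECT ':db1/neoplasm/' || Q.pid AS x FROM
  (SELECT T1.pid
   FROM tbl_patient T1 JOIN tbl_patient T2 ON T1.pid = T2.pid
   WHERE (T1.type = 0 OR T1.type = 1)
     AND T2.stage = 4 AND T2.type = 0) Q
"""

AFTER_SELF_JOIN_ELIMINATION = """
SELECT ':db1/neoplasm/' || Q.pid AS x FROM
  (SELECT pid FROM tbl_patient WHERE type = 0 AND stage = 4) Q
"""

FULLY_OPTIMIZED = """
SELECT ':db1/neoplasm/' || pid AS x FROM tbl_patient
WHERE type = 0 AND stage = 4
"""

QUERIES = (
    NON_OPTIMIZED,
    AFTER_STRUCTURAL,
    AFTER_SELF_JOIN_ELIMINATION,
    FULLY_OPTIMIZED,
)


def run_all():
    """Run every variant on the same toy data and return the sorted answers."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_patient ("
        "pid INTEGER PRIMARY KEY, type INTEGER NOT NULL, stage INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tbl_patient VALUES (?, ?, ?)",
        [(1, 0, 4), (2, 1, 4), (3, 0, 2), (4, 0, 4)],
    )
    return [sorted(row[0] for row in conn.execute(q)) for q in QUERIES]
```

All four result lists returned by run_all() are identical on this data, which is consistent with each optimization step preserving the answers.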




Statoil Use Case

• Optique Use Case Partner

• Main reference: “Ontology Based Access to Exploration Data at Statoil” [Kharlamov, Hovland, et al., 2015, ISWC In-use Track].

• Exploration domain

• Improve the efficiency of the information gathering routine for geologists at Statoil

• Efficient, creative data collection from multiple data sources

• ⇒ separate slides for this use case


Siemens Use Case

• Optique Use Case Partner

• Main reference: “How Semantic Technologies Can Enhance Data Access at Siemens Energy” [Kharlamov, Solomakhina, et al., 2014, ISWC In-use Track]

• ⇒ separate slides for this use case


EPNet Use Case

• EPNet Project (ERC Advanced Grant EPNet “Production and distribution of food during the Roman Empire: Economics and Political Dynamics”, ERC-2013-ADG 340828).

• Main reference: “Ontology-Based Data Integration in EPNet: Production and Distribution of Food During the Roman Empire” [Calvanese, Liuzzo, et al., 2016, J. of Eng. Appl. of AI]

• Ontology-Based Data Integration: integrating multiple data sources.

• Linking three datasets: the EPNet relational repository, the Epigraphic Database Heidelberg, and the Pleiades dataset


EMSec Use Case

• EMSec (Echtzeitdienste für die Maritime Sicherheit, Real-time Services for Maritime Security) is a project funded by the German BMBF (Federal Ministry of Education and Research)

• Geo-spatial support by Ontop-spatial (developed as a fork of Ontop)

• Sextant for visualizing linked geospatial data

• Use case paper “Ontology-based Data Access for Maritime Security” is under submission


IBM Research Ireland Use Case

• Main reference: “Data Access Linking and Integration with DALI: Building a Safety Net for an Ocean of City Data” [Lopez et al., 2015, ISWC In-use Track]

• Smarter Cities Technology Centre, IBM Research, Ireland


Electronic Health Records Use Case

• Main reference: “Validating an ontology-based algorithm to identify patients with Type 2 Diabetes Mellitus in Electronic Health Records” [Rahimi et al., 2014, Int. J. of Medical Informatics]

• Medicine, The University of New South Wales, Australia


Use Cases

• More use cases are listed at https://github.com/ontop/ontop/wiki/UseCases

• Unfortunately, we are not able to track all use cases of Ontop.




Recent Progresses

More recent lines of research on Ontop include

• formalization of SPARQL in the context of OBDA [Rodriguez-Muro and Rezk, 2015, J. Web Semantics] [Kontchakov et al., 2014, ISWC]

• OWL 2 QL entailment regime [Kontchakov et al., 2014, ISWC]

• SWRL rule language with a limited form of recursion handled by SQL Common Table Expressions [Xiao et al., 2014, RR]

• owl:sameAs for cross-linked datasets [Calvanese, Giese, et al., 2015, ISWC]

• Expressive ontologies beyond OWL 2 QL by rewriting and approximation with the help of the mapping layer [Botoeva et al., 2016, AAAI]

• System description of Ontop [Calvanese, Cogrel, et al., 2016, Semantic Web J., to appear]
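
As background for the SWRL bullet above, the sketch below shows how a recursive SQL Common Table Expression evaluates a transitive rule of the form reachable(x, z) ← reachable(x, y), edge(y, z). It is a generic illustration run on SQLite, not the translation used by Ontop; the table edge and the function closure are hypothetical.

```python
import sqlite3

# Recursive CTE computing the transitive closure of the edge table.
# UNION (rather than UNION ALL) removes duplicates, so the recursion
# also terminates when the data contains cycles.
TRANSITIVE_CLOSURE = """
WITH RECURSIVE reachable(src, dst) AS (
    SELECT src, dst FROM edge
    UNION
    SELECT r.src, e.dst FROM reachable r JOIN edge e ON r.dst = e.src
)
SELECT src, dst FROM reachable ORDER BY src, dst
"""


def closure(edges):
    """Load the edges into an in-memory table and return their closure."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    return conn.execute(TRANSITIVE_CLOSURE).fetchall()
```

With the edges a → b and b → c, closure additionally derives the pair (a, c).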


Beyond OWL 2 QL (AAAI 16 paper)

• Framework for Rewriting and Approximation of OBDA specifications

Figure: the OBDA specification 〈T, M, S〉 is transformed into a new specification 〈T′, M′, S〉.

Rewriting: The new specification is equivalent to the original one w.r.t. query answering (query-inseparable).

Approximation: The new specification is a sound approximation of the original one w.r.t. query answering.
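
One possible way to state the two notions formally is given below. The notation is an assumption made for this note: cert(q, P, D) stands for the certain answers to a query q over the OBDA specification P and a database instance D of the source schema S, and the class of queries considered is left open.

```latex
% Rewriting: the two specifications are query-inseparable.
\forall q \;\forall D:\quad
  \mathit{cert}\bigl(q, \langle T, M, S\rangle, D\bigr)
  = \mathit{cert}\bigl(q, \langle T', M', S\rangle, D\bigr)

% Sound approximation: every answer of the new specification
% is also an answer of the original one.
\forall q \;\forall D:\quad
  \mathit{cert}\bigl(q, \langle T', M', S\rangle, D\bigr)
  \subseteq \mathit{cert}\bigl(q, \langle T, M, S\rangle, D\bigr)
```

Under this reading, a rewriting loses no answers, while an approximation may miss some answers but never returns a wrong one.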


Beyond OWL 2 QL (III)

Figure: Ontoprox


WIP: OBDA beyond Relational Databases

Figure: the query answering pipeline.

OFFLINE: (a) mapping parsing: mapping file → mapping M; (b) ontology parsing: ontology file → ontology T; (c) mapping compilation: M and T → T-mapping M_T.

ONLINE: (0) SPARQL parsing: SPARQL string → SPARQL Q; (1) rewriting (using T): Q → rewritten SPARQL Q′; (2) unfolding w.r.t. mappings: Q′ → RA q1; (3) structural/semantic optimization: q1 → RA q2; (4) normalization/decomposition: q2 → RA q3; (5) RA-to-native query translation: q3 → native queries; (6) evaluation: native queries → native results; (7) post-processing: native results → SPARQL result.

• NoSQL Movement

• In fact, most of the components of Ontop are SQL-independent

• We are working on OBDA over non-relational data sources

• We are currently targeting MongoDB


Future

• To further improve performance, we will investigate data-dependent optimizations.

• Support larger fragments of SPARQL (e.g., aggregation, negation, and path queries) and R2RML (e.g., named graphs).

• For end-users, we will improve the GUI and extend utilities to make Ontop even more user-friendly.

• Go beyond relational databases and support other kinds of data sources (e.g., graph and document databases).

• Continue building community


Acknowledgment


Thank you for your attention!

QUESTIONS?


References I

Kharlamov, Evgeny, Nina Solomakhina, Ozgur Lutfu Ozcep, Dmitriy Zheleznyakov, Thomas Hubauer, Steffen Lamparter, Mikhail Roshchin, Ahmet Soylu, and Stuart Watson (2014). “How Semantic Technologies Can Enhance Data Access at Siemens Energy”. In: The Semantic Web - ISWC 2014 - 13th International Semantic Web Conference, Riva del Garda, Italy, October 19-23, 2014. Proceedings, Part I, pp. 601–619.

Kontchakov, Roman, Martin Rezk, Mariano Rodriguez-Muro, Guohui Xiao, and Michael Zakharyaschev (2014). “Answering SPARQL Queries over Databases under OWL 2 QL Entailment Regime”. In: vol. 8796. doi:10.1007/978-3-319-11964-9_35, pp. 552–567.

Rahimi, Alireza, Siaw-Teng Liaw, Jane Taggart, Pradeep Ray, and Hairong Yu (2014). “Validating an ontology-based algorithm to identify patients with Type 2 Diabetes Mellitus in Electronic Health Records”. In: Int. J. of Medical Informatics 83.10. doi:10.1016/j.ijmedinf.2014.06.002, pp. 768–778.

Xiao, Guohui, Martin Rezk, Mariano Rodriguez-Muro, and Diego Calvanese (2014). “Rules and Ontology Based Data Access”. In: Proc. 8th Int. Conference on Web Reasoning and Rule Systems (RR 2014). Ed. by Marie-Laure Mugnier and Roman Kontchakov. Lecture Notes in Computer Science. Springer.

Calvanese, Diego, Martin Giese, Dag Hovland, and Martin Rezk (2015). “Ontology-based Integration of Cross-linked Datasets”. In: Proc. of the 14th Int. Semantic Web Conference (ISWC). Lecture Notes in Computer Science. Springer.

Kharlamov, Evgeny, Dag Hovland, et al. (2015). “Ontology Based Access to Exploration Data at Statoil”. In: The Semantic Web - ISWC 2015 - 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015, Proceedings, Part II, pp. 93–112.

Lopez, Vanessa, Martin Stephenson, Spyros Kotoulas, and Pierpaolo Tommasi (2015). “Data Access Linking and Integration with DALI: Building a Safety Net for an Ocean of City Data”. In: The Semantic Web - ISWC 2015 - 14th International Semantic Web Conference, Bethlehem, PA, USA, October 11-15, 2015, Proceedings, Part II, pp. 186–202.

Rodriguez-Muro, Mariano and Martin Rezk (2015). “Efficient SPARQL-to-SQL with R2RML Mappings”. In: 33. doi:10.1016/j.websem.2015.03.001, pp. 141–169.


References II

Botoeva, Elena, Diego Calvanese, Valerio Santarelli, Domenico Fabio Savo, Alessandro Solimando, and Guohui Xiao (2016). “Beyond OWL 2 QL in OBDA: Rewritings and Approximations”. In: Proc. of the 30th AAAI Conf. on Artificial Intelligence (AAAI). AAAI Press.

Calvanese, Diego, Benjamin Cogrel, Sarah Komla-Ebri, Roman Kontchakov, Davide Lanti, Martin Rezk, Mariano Rodriguez-Muro, and Guohui Xiao (2016). “Ontop: Answering SPARQL Queries over Relational Databases”. In: Semantic Web Journal.

Calvanese, Diego, Pietro Liuzzo, Alessandro Mosca, Jose Remesal, Martin Rezk, and Guillem Rull (2016). “Ontology-Based Data Integration in EPNet: Production and Distribution of Food During the Roman Empire”. In: Engineering Applications of Artificial Intelligence.
