Ontop at Work
Mariano Rodríguez-Muro (1), Roman Kontchakov (2), Michael Zakharyaschev (2)
(1) Faculty of Computer Science, Free University of Bozen-Bolzano, Italy
(2) Department of Computer Science and Information Systems, Birkbeck, University of London, U.K.
May 22nd, 2013
...
OBDA: What is it?
Loosely speaking: using ontologies to access data.
[Figure: OBDA system diagram with the components user query, ontology (TBox), mappings, (virtual) ABox, OBDA system and RDBMS data source.]
Our focus is on OWL 2 QL ontologies, since they are tailored to handle very large amounts of data by means of query rewriting techniques.
...
Query Answering by Query Rewriting
Objective: given a query Q over the ontology T, derive a query Q′ over the database D that preserves the semantics of T.
Consider a TBox T:
Movie ≡ ∃title, Movie ⊑ ∃year, Movie ≡ ∃cast, ∃cast⁻ ⊑ Person,
Actor ⊑ Person, Actress ⊑ Person, Producer ⊑ Person,
Director ⊑ Person, Writer ⊑ Person, Editor ⊑ Person.
...
Example
The database D: two DB relations title[m, t, y] and castinfo[p, m, r].
The mapping M (logical form, think R2RML):
Movie(m) ← title(m, t, y),
title(m, t) ← title(m, t, y),
year(m, y) ← title(m, t, y),
cast(m, p) ← castinfo(p, m, r),
Person(p) ← castinfo(p, m, r),
Actor(p) ← castinfo(p, m, "c1"),
· · ·
Editor(p) ← castinfo(p, m, "c6").
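To make the mapping concrete, here is a minimal Python sketch that evaluates the rules of M over a toy instance of title and castinfo. The sample rows and the helper name virtual_abox are invented for illustration, and only the two role codes shown on the slide (c1, c6) are covered. In an OBDA system the ABox stays virtual; it is materialised here only so that it can be inspected.

```python
# Toy instance of the two relations (rows are invented for illustration).
title = [(1, "Alien", 1979), (2, "Heat", 1995)]      # title[m, t, y]
castinfo = [(10, 1, "c1"), (11, 2, "c6")]            # castinfo[p, m, r]

# Only the role codes that appear on the slide; the codes in between are elided there.
ROLE_CLASS = {"c1": "Actor", "c6": "Editor"}


def virtual_abox(title_rows, castinfo_rows):
    """Evaluate the mapping rules and return the ABox as a set of facts."""
    facts = set()
    for m, t, y in title_rows:
        facts.add(("Movie", m))          # Movie(m)    <- title(m, t, y)
        facts.add(("title", m, t))       # title(m, t) <- title(m, t, y)
        facts.add(("year", m, y))        # year(m, y)  <- title(m, t, y)
    for p, m, r in castinfo_rows:
        facts.add(("cast", m, p))        # cast(m, p)  <- castinfo(p, m, r)
        facts.add(("Person", p))         # Person(p)   <- castinfo(p, m, r)
        if r in ROLE_CLASS:              # Actor(p)    <- castinfo(p, m, "c1"), etc.
            facts.add((ROLE_CLASS[r], p))
    return facts


abox = virtual_abox(title, castinfo)
```

Each title row yields three facts and each castinfo row yields a cast fact, a Person fact and, when the role code is known, a fact for the corresponding class.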
...
The classic OBDA architecture
[Figure: CQ q + ontology T → (rewriting) → FO q′; FO q′ + mapping → (unfolding) → SQL; data D + mapping → (ABox virtualisation) → ABox A.]
Stages in the classic OBDA approach:
– Rewriting w.r.t. T,
– Unfolding w.r.t. M,
– Execution over D.
Unfolding and mappings are ignored in most OBDA literature.
...
Example: Rewriting
Given the query Q
q(x) ← Person(x)
the rewriting step gives
q(x) ← Person(x)
q(x) ← cast(z, x)
q(x) ← Actor(x)
. . .
q(x) ← Editor(x)
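The rewriting of this atomic query can be sketched as collecting everything the TBox declares to be subsumed by Person. The Python sketch below is a deliberate simplification: it handles only atomic queries and the two axiom shapes of the example TBox (A ⊑ B and ∃R⁻ ⊑ B), and it does not follow chains of inclusions. The names SUBSUMED_BY and rewrite_atomic are hypothetical.

```python
# Axioms of the example TBox with Person on the right-hand side.
# ("exists_inv", "cast") stands for the concept ∃cast⁻.
SUBSUMED_BY = {
    "Person": ["Actor", "Actress", "Producer", "Director", "Writer", "Editor",
               ("exists_inv", "cast")],
}


def rewrite_atomic(concept):
    """Return the UCQ (a list of rule strings) for q(x) ← concept(x)."""
    ucq = [f"q(x) ← {concept}(x)"]
    for sub in SUBSUMED_BY.get(concept, []):
        if isinstance(sub, tuple):           # axiom ∃role⁻ ⊑ concept
            ucq.append(f"q(x) ← {sub[1]}(z, x)")
        else:                                # axiom sub ⊑ concept
            ucq.append(f"q(x) ← {sub}(x)")
    return ucq


ucq = rewrite_atomic("Person")
```

For Person this yields eight conjunctive queries: the original one, one for the range of cast, and one per sub-concept.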
...
Example: Unfolding
Given the query Q
q(x) ← Person(x)
unfolding its rewriting w.r.t. M gives
q(x1) ← castinfo(x1, m, r)
q(x2) ← castinfo(x2, m, r)
q(x3) ← castinfo(x3, m, "c1")
. . .
q(x8) ← castinfo(x8, m, "c6")
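In SQL terms, unfolding replaces each ontology atom by the query that the mapping associates with it and glues the disjuncts together with UNION. The following Python sketch illustrates this with sqlite3 on a made-up castinfo table. The MAPPING dictionary, the column alias x and the sample rows are assumptions of this sketch, and only the role codes c1 and c6 from the slides are used.

```python
import sqlite3

# SQL definition of each ontology predicate, following the mapping M.
MAPPING = {
    "Person": "SELECT p AS x FROM castinfo",
    "cast":   "SELECT p AS x FROM castinfo",   # the answer variable is the cast member
    "Actor":  "SELECT p AS x FROM castinfo WHERE r = 'c1'",
    "Editor": "SELECT p AS x FROM castinfo WHERE r = 'c6'",
}


def unfold(predicates):
    """Unfold a union of atomic queries into a single SQL UNION query."""
    return "\nUNION\n".join(MAPPING[pred] for pred in predicates)


sql = unfold(["Person", "cast", "Actor", "Editor"])

# Run the unfolded query on a toy database (rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE castinfo (p INTEGER, m INTEGER, r TEXT)")
conn.executemany("INSERT INTO castinfo VALUES (?, ?, ?)",
                 [(10, 1, "c1"), (11, 2, "c6"), (12, 2, "c3")])
answers = sorted(row[0] for row in conn.execute(sql))
```

The first two disjuncts are identical and the remaining ones are subsumed by them, which is exactly the redundancy discussed on the following slides.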
...
Issues
The issues with these rewritings are:
– Large size (n1 ∗ . . . ∗ n2)
– Largely redundant (w.r.t. query containment)
In the literature we find two solutions. The first is encoding the rewriting as a Datalog program. For example, given the query:
q(x, y) ← Person(x), Person(y), cast(m, x), cast(m, z)
we generate the rewriting:
q(x, y) ← Person(x), Person(y), cast(m, x), cast(m, z)
Person(x) ← cast(m, x)
Person(x) ← Actor(x)
. . .
Person(x) ← Editor(x)
...
Issues
Encoding the rewriting as a Datalog program does not remove the problem: the query still needs to be unfolded into an SQL query. There are two choices here:
– Generate SQL queries with nested UNIONs. Very bad for performance.
– Expand into a UCQ. Back to square one.
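A short Python sketch shows why expanding the Datalog program into a UCQ brings the size problem back: every atom contributes one factor to a product. The list of alternatives for Person is an assumption that follows the eight disjuncts obtained earlier for q(x) ← Person(x); the names ALTERNATIVES and expand are hypothetical.

```python
from itertools import product

# Alternatives for each predicate: the predicate itself plus the bodies of its
# Datalog rules. Predicates without an entry have a single alternative.
ALTERNATIVES = {
    "Person": ["Person({0})", "cast(_, {0})", "Actor({0})", "Actress({0})",
               "Producer({0})", "Director({0})", "Writer({0})", "Editor({0})"],
}

# Body of q(x, y) ← Person(x), Person(y), cast(m, x), cast(m, z)
body = [("Person", "x"), ("Person", "y"), ("cast", "m, x"), ("cast", "m, z")]


def expand(atoms):
    """Expand into a UCQ: one CQ per choice of one alternative for every atom."""
    choices = [
        [alt.format(args) for alt in ALTERNATIVES.get(pred, [pred + "({0})"])]
        for pred, args in atoms
    ]
    return [", ".join(cq) for cq in product(*choices)]


ucq = expand(body)
```

With two Person atoms the expansion already contains 8 × 8 = 64 conjunctive queries, most of them redundant.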
...
Issues (cont.)
The second solution found in the literature is using query containment to clean the output. For example, to detect that this:
q(x1) ← castinfo(x1, m, r)
q(x2) ← castinfo(x2, m, r)
q(x3) ← castinfo(x3, m, "c1")
. . .
q(x8) ← castinfo(x8, m, "c6")
can be simplified to
q(x1) ← castinfo(x1, m, r)
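Containment between conjunctive queries can be decided by searching for a homomorphism from the containing query into the contained one. The Python sketch below does this by brute force, which also hints at why the check is expensive on large sets of queries. The query encoding (head variables plus a list of atoms, constants written in double quotes) and the function names are assumptions of this sketch, not the procedure used by any particular system.

```python
from itertools import product


def is_const(term):
    """Constants are written in double quotes; everything else is a variable."""
    return term.startswith('"')


def contained_in(q_sub, q_sup):
    """True if q_sub ⊆ q_sup, i.e. some homomorphism maps q_sup into q_sub."""
    head_sub, body_sub = q_sub
    head_sup, body_sup = q_sup
    vars_sup = sorted({t for _, args in body_sup for t in args if not is_const(t)})
    terms_sub = sorted({t for _, args in body_sub for t in args})
    atoms_sub = {(pred, tuple(args)) for pred, args in body_sub}
    for image in product(terms_sub, repeat=len(vars_sup)):
        h = dict(zip(vars_sup, image))          # candidate homomorphism
        if tuple(h.get(t, t) for t in head_sup) != tuple(head_sub):
            continue                            # heads must be mapped onto each other
        if all((pred, tuple(h.get(t, t) for t in args)) in atoms_sub
               for pred, args in body_sup):
            return True
    return False


def minimise(ucq):
    """Keep only the CQs that are not contained in another CQ of the union."""
    kept = []
    for q in ucq:
        if any(contained_in(q, k) for k in kept):
            continue
        kept = [k for k in kept if not contained_in(k, q)] + [q]
    return kept


q1 = (("x1",), [("castinfo", ("x1", "m", "r"))])
q2 = (("x2",), [("castinfo", ("x2", "m", "r"))])
q3 = (("x3",), [("castinfo", ("x3", "m", '"c1"'))])
q8 = (("x8",), [("castinfo", ("x8", "m", '"c6"'))])
minimal = minimise([q1, q2, q3, q8])
```

On these four disjuncts only q1 survives, matching the simplification stated on the slide.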
...
Issues (cont.)
Using query containment to clean the output has its own problems:
– Query containment is an extremely expensive operation.
– We are working with large sets of queries.
...
Roots of the problem
There are 3 main reasons for large CQ rewritings and unfoldings:
(E) Sub-queries of q with existentially quantified variables can be folded in many different ways to match the canonical model (existential trees), e.g.,
Person ⊑ ∃hasFather.Person
and the query
q(x) ← hasFather(x, y), hasFather(y, z)
(H) The concepts and roles for atoms in q can have many sub-concepts and sub-roles according to T.
(M) The mapping M can have multiple definitions of the ontology terms.
Most of the proposed rewriting techniques try to tame (E).
...
More about (E)
– It is in theory incurable.
– It is independent of (H) and (M).
However:
– Rewriting algorithms deal with (E) and (H) at the same time.
– Real-world Qs and Ts generate few queries when dealing with (E) in isolation.
– Even artificially constructed Qs and Ts become simple.
The strongest issues in query rewriting are (H) and (M).
In Ontop we deal with (H) and (M) separately from (E). We do it through T-mappings and TreeWitness rewritings.
...
Dealing with (H) and (M): T-Mappings
A T-mapping MT is a transformation of M that enforces all (H) entailments (H-completeness). Formally,
if M |= A(c) and T |= A ⊑ B, then MT |= B(c).
T-mapping example 1:
Consider two DB relations title[m, t, y] and castinfo[p, m, r] and an ontology MO describing the film domain as follows:
Movie ≡ ∃cast
Let M be the following mapping:
Movie(m) ← title(m, t, y),
cast(m, p) ← castinfo(p, m, r).
...
Dealing with (H) and (M): T-Mappings (cont.)
T-mapping example 1 (domain/range):
Consider again the DB relations title[m, t, y] and castinfo[p, m, r] and the ontology MO with
Movie ≡ ∃cast
Since T |= ∃cast ⊑ Movie, the T-mapping MT extends M with a third rule:
Movie(m) ← title(m, t, y),
cast(m, p) ← castinfo(p, m, r),
Movie(m) ← castinfo(p, m, r).
...
T-Mappings: Example 2
T-mapping example 2 (hierarchies):
Consider a TBox T
Actor ⊑ Person, Actress ⊑ Person, Producer ⊑ Person,
Director ⊑ Person, Writer ⊑ Person, Editor ⊑ Person.
The mapping M:
Actor(p) ← castinfo(p, m, "c1"),
· · ·
Editor(p) ← castinfo(p, m, "c6").
...
T-Mappings: Example 2 (cont.)
T-mapping example 2 (hierarchies):
For the TBox T with the six sub-concepts of Person and the mapping M above, the T-mapping MT additionally contains one Person rule per sub-concept:
Person(p) ← castinfo(p, m, "c1"),
· · ·
Person(p) ← castinfo(p, m, "c6").
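The construction of a T-mapping for hierarchies can be sketched as a saturation: whenever the mapping has a rule for A and the TBox entails A ⊑ B, add the same body as a rule for B. The Python sketch below is restricted to inclusions between atomic concepts and to the two mapping rules that are spelled out on the slide; the names TBOX, M and t_mapping are hypothetical.

```python
# Atomic concept inclusions of the example TBox, as (sub-concept, super-concept).
TBOX = [("Actor", "Person"), ("Actress", "Person"), ("Producer", "Person"),
        ("Director", "Person"), ("Writer", "Person"), ("Editor", "Person")]

# Mapping M as (head concept, body) pairs; only the rules shown on the slide.
M = [("Actor", 'castinfo(p, m, "c1")'), ("Editor", 'castinfo(p, m, "c6")')]


def t_mapping(mapping, tbox):
    """Saturate the mapping: A(p) ← body and A ⊑ B yield B(p) ← body."""
    result = list(mapping)
    changed = True
    while changed:                       # repeat so that chains A ⊑ B ⊑ C are followed
        changed = False
        for head, body in list(result):
            for sub, sup in tbox:
                if sub == head and (sup, body) not in result:
                    result.append((sup, body))
                    changed = True
    return result


MT = t_mapping(M, TBOX)
```

Each rule of M is kept and gains a Person counterpart with the same body, so H-completeness is obtained at the level of the mapping rather than at query time.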
...
Optimising T-mappings
T-mappings allow us to deal with hierarchical reasoning (H) at the level of the unfolding. At this point, we can exploit
– DB dependencies and
– SQL expressivity
to reduce, and often eliminate, the exponential growth coming from (H) and (M).
...
Optimising with Dependencies
A first optimisation is query containment (w.r.t. dependencies).
Example:
Consider the previous example. Since T |= ∃cast ⊑ Movie, the T-mapping contains:
Movie(m) ← title(m, t, y),
Movie(m) ← castinfo(p, m, r).
The latter rule is redundant since IMDb contains the foreign key
castinfo(p, m, r) ⇝ title(m, t, y),
that is, every movie identifier m occurring in castinfo also occurs in title.
This step is crucial to reduce the growth due to inferences related to domain and range.
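The pruning step can be sketched in Python for the simplest case, where every rule body is a single relation and every dependency is a single-column foreign key. The rule and key encodings and the name prune are assumptions of this sketch, and the foreign key is taken to state that castinfo.m references title.m. Cycles of foreign keys are not handled.

```python
# T-mapping rules for Movie, as (concept, relation, column holding the individual).
RULES = [("Movie", "title", "m"), ("Movie", "castinfo", "m")]

# Foreign keys as (referencing relation, column, referenced relation, column).
# Assumed here: castinfo.m references title.m.
FOREIGN_KEYS = {("castinfo", "m", "title", "m")}


def prune(rules, foreign_keys):
    """Drop rules whose answers are already produced by another rule."""
    kept = []
    for concept, rel, col in rules:
        # The rule is redundant if its column references the column of another
        # rule for the same concept: every value it yields is found there too.
        redundant = any(
            other_concept == concept
            and (rel, col, other_rel, other_col) in foreign_keys
            for other_concept, other_rel, other_col in rules
        )
        if not redundant:
            kept.append((concept, rel, col))
    return kept


pruned = prune(RULES, FOREIGN_KEYS)
```

Only the rule over title is kept, because every m produced from castinfo is guaranteed by the key to be produced from title as well.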
...
Optimising with SQL expressivity
Observation. The only means for perfect reformulations to deal with (H) is through disjunction (UNION). DBMSs are not good at planning UNIONs.
However, at the level of the unfolding and mappings, we have full SQL expressivity (e.g., disjunction (OR), inequalities, etc.).
Objective: given a T-mapping, define mapping transformations that entail the same ABox using fewer mappings while ensuring that the encoding used is efficient during execution.
...
Optimising with SQL expressivity (cont.)
Use OR and inequalities to re-express mappings for hierarchies and discriminant columns.
Dealing with discriminant columns:
For example, the T-mapping for IMDb and MO contains six rules for the sub-concepts of Person:
Person(p) ← castinfo(p, m, "c1")
· · ·
Person(p) ← castinfo(p, m, "c6")
These can be reduced to a single rule:
Person(p) ← castinfo(p, m, r), (r = "c1") ∨ · · · ∨ (r = "c6").
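The effect of this transformation on the generated SQL can be checked with a small sqlite3 experiment in Python. The table contents are invented, only the two role codes shown on the slide are used, and "c9" is a made-up code that should not match. The sketch compares one SELECT per rule combined by UNION against a single SELECT with a disjunctive filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE castinfo (p INTEGER, m INTEGER, r TEXT)")
conn.executemany("INSERT INTO castinfo VALUES (?, ?, ?)",
                 [(10, 1, "c1"), (11, 2, "c6"), (10, 2, "c6"), (12, 2, "c9")])

codes = ["c1", "c6"]   # role codes of the Person rules that appear on the slide

# One SELECT per mapping rule, glued together with UNION.
union_sql = " UNION ".join("SELECT p FROM castinfo WHERE r = ?" for _ in codes)

# A single rule with a disjunctive filter over the discriminant column.
or_sql = ("SELECT DISTINCT p FROM castinfo WHERE "
          + " OR ".join("r = ?" for _ in codes))

union_rows = sorted(conn.execute(union_sql, codes).fetchall())
or_rows = sorted(conn.execute(or_sql, codes).fetchall())
```

Both queries return the same persons, but the second one reads castinfo in a single block instead of one block per rule.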
...
The architecture of Ontop
[Figure: architecture diagram. Nodes: CQ q, ontology T, UCQ qtw, T-mapping, mapping M, dependencies Σ, SQL, data D, ABox A, H-complete ABox A. Edge labels: tw-rewriting, unfolding, completion, ABox virtualisation, ABox completion, SQO.]
Highlights: (H) and (M) are dealt with by T-mappings, rewriting is done for H-complete ABoxes, and SQO is used extensively over the unfolding.
...
Other Optimisations in Ontop
We also apply other important optimisations during system setup and at query time. The most important are:
– Equivalence simplification: simplify the ontology vocabulary w.r.t. equivalence (keep one representative of each equivalence class).
– Semantic query optimisation: optimise each generated query individually (see next slides).
– Emptiness indexes: keep track of empty predicates.
...
Results
A summary of the results we have observed using this architecture:
– Mappings per class/property are few.
– Query rewritings are small.
– SQL queries generated like this often correspond to what a human expert would have generated.
– Query execution of SPARQL with entailments is fast, often much faster than in triple stores.
Query rewriting can be done efficiently.
...
Benchmarks
[Figure: per-query comparison of OWLIM, Stardog and Ontop on queries R1 to R5, Q1 to Q6 and V7 to V10; logarithmic vertical axis from 0.1 to 1000000.]
Benchmark: LUBMex, 200 Unis (30M triples). Systems: OWLIM (forward chaining), Stardog (rewriting), Ontop/DB2.
Ontop/DB2
– returns immediately for 5/15 queries,
– is faster than the rest in 12/15 queries.
...
Summary
Results so far:
– We efficiently deal with the exponential growth from (H) and (M).
– We use dependencies and CQC/SQO to minimise and optimise mapping rules.
– We exploit SQL expressivity to transform mappings and minimise their number.
OWL 2 QL query answering with query rewriting is efficient, and materialisation is not required.
Ontop is available as a SPARQL end-point, as an OWLAPI and Sesame library, and as a Protege 4 plugin. Many more features (SPARQL, R2RML). It is permanently under development, yet stable enough to be used seriously in many projects, incl. Optique.
Current work is applying these techniques to more expressive settings, e.g., OWL + Rules, OWL 2 EL, OWL 2 RL, through a hybrid approach.
...
Semantic Query Optimisation
Consider the query
q(t, y) ← Movie(m), title(m, t), year(m, y), (y > 2010)
By straightforwardly applying the unfolding to qtw and the T-mapping M above, we obtain the query
q′tw(t, y) ← title(m, t0, y0), title(m, t, y1), title(m, t2, y), (y > 2010),
which requires two (potentially) expensive Join operations. However, by using the primary key m of title we obtain:
q′′tw(t, y) ← title(m, t, y), (y > 2010).
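The equivalence of the two unfoldings can be checked on a toy table with sqlite3. The rows below are invented, and the two SQL strings are one plausible rendering of q′tw and q′′tw rather than the text Ontop would emit. Because m is declared as the primary key, the three copies of title joined on m always denote the same row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE title (m INTEGER PRIMARY KEY, t TEXT, y INTEGER)")
conn.executemany("INSERT INTO title VALUES (?, ?, ?)",
                 [(1, "Alien", 1979), (2, "Gravity", 2013), (3, "Arrival", 2016)])

# Direct unfolding: one copy of title per ontology atom, joined on m.
unfolded = """
    SELECT t2.t, t3.y
    FROM title t1
    JOIN title t2 ON t2.m = t1.m
    JOIN title t3 ON t3.m = t1.m
    WHERE t3.y > 2010
"""

# After SQO: the self-joins on the primary key collapse into a single scan.
optimised = "SELECT t, y FROM title WHERE y > 2010"

rows_unfolded = sorted(conn.execute(unfolded).fetchall())
rows_optimised = sorted(conn.execute(optimised).fetchall())
```

Both queries return the same rows, while the optimised form avoids the two self-joins altogether.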
...
Semantic Query Optimisation
Semantic Query Optimisation (SQO) is a field from DB theory focused on the optimisation of queries w.r.t. dependencies.
Semantic Query Optimisation in DB and OBDA:
– While some SQO techniques reached industrial RDBMSs, SQO never had a strong impact on the database community.
– In OBDA, in contrast, SQL queries are generated automatically, and so SQO is the only tool to reach optimal queries.
In practice, an OBDA system must implement at least SQO w.r.t. primary keys and foreign keys to deal with the disparities between the RDF and relational models.
...
Why does it work?
DBs are created through standard practices that generate the features targeted by the previous optimisations.
Starting from a rich conceptual schema, we encode it in a relational schema by:
– amalgamating N-to-1 and 1-to-1 attributes of an entity into a single n-ary relation with a primary key identifying the entity (e.g., title with title and year),
– using foreign keys over attribute columns when a column refers to the entity (e.g., name and castinfo),
– using type-discriminant columns to encode hierarchical information (e.g., castinfo).
As this process is universal, the T-mappings created for the resulting databases are dramatically simplified by the Ontop optimisations.