SPARQL Query Answering over OWL Ontologies
Presented By Mohamed Sabri
Ilianna Kollia1,2, Birte Glimm2, and Ian Horrocks2
1 National Technical University of Athens, Greece 2 Oxford University, UK
Motivation:
• The query evaluation mechanism defined in the SPARQL Query specification is based on subgraph matching (simple entailment).
• SPARQL 1.1 includes more elaborate entailment regimes, including RDFS and OWL.
• Query answering under such entailment regimes is more complex, as it may involve retrieving answers that only follow implicitly from the queried graph.
• While several methods and implementations for SPARQL under RDFS semantics are available, methods that use OWL semantics have not yet been well studied.
Effects of Different Entailment Regimes:
• Green dashed lines indicate RDF-entailed triples and red dashed lines indicate triples that are also RDFS-entailed.
• SELECT ?pub WHERE { ?pub rdf:type ex:Publication }
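To make the difference between the regimes concrete, here is a minimal Python sketch. The three-triple graph and the class ex:JournalArticle are hypothetical (the slide's figure is not reproduced here), and only the two RDFS subclass rules (rdfs9 and rdfs11) are applied, so this is an illustration rather than a full RDFS reasoner.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical toy graph, invented for illustration only.
graph = {
    ("ex:paper1", RDF_TYPE, "ex:Publication"),
    ("ex:paper2", RDF_TYPE, "ex:JournalArticle"),
    ("ex:JournalArticle", SUBCLASS, "ex:Publication"),
}

def instances_simple(g, cls):
    """Simple entailment: only explicitly stated rdf:type triples match."""
    return {s for (s, p, o) in g if p == RDF_TYPE and o == cls}

def rdfs_subclass_closure(g):
    """Apply rules rdfs9 and rdfs11 until no new triple is derived."""
    closed = set(g)
    while True:
        new = set()
        for (c, p1, d) in closed:
            if p1 != SUBCLASS:
                continue
            for (s, p2, o) in closed:
                if p2 == RDF_TYPE and o == c:      # rdfs9: type propagates upward
                    new.add((s, RDF_TYPE, d))
                if p2 == SUBCLASS and s == d:      # rdfs11: subclass transitivity
                    new.add((c, SUBCLASS, o))
        if new <= closed:
            return closed
        closed |= new

simple_answers = instances_simple(graph, "ex:Publication")
rdfs_answers = instances_simple(rdfs_subclass_closure(graph), "ex:Publication")
print(simple_answers, rdfs_answers)
```

Under subgraph matching only ex:paper1 is returned; after the closure the query also returns ex:paper2.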
OWL Reasoning:
SELECT ?pub WHERE { ?pub rdf:type ex:Publication }
1) Materialization approaches.
2) Query rewriting approaches.
3) Combined approaches.
Motivation (cont’d):
• However, the previous techniques are only applicable to less expressive OWL 2 profiles.
• In this paper, the authors present a sound and complete algorithm for answering SPARQL queries under the OWL 2 Direct Semantics entailment regime (SPARQL-OWL).
• …the first implementation to fully support SPARQL-OWL!
• “[Query Answering] is a very active research area, with many different techniques being developed and investigated… it is reasonable to expect that the future performance improvements in query answering systems will be even more spectacular than those achieved in the past by class reasoning systems”, Ian Horrocks.
Web Ontology Language OWL:
SubClassOf(:DogOwner ObjectSomeValuesFrom(:owns :Dog))
SubClassOf(:CatOwner ObjectSomeValuesFrom(:owns :Cat))
ObjectPropertyDomain(:owns :Person)
ClassAssertion(ObjectUnionOf(:DogOwner :CatOwner) :mary)
ObjectPropertyAssertion(:owns :mary _:somePet)
• We can infer: O |= ClassAssertion(:Person :mary)
• However, there is no unique canonical model that could be used to answer queries. Why?!
• And in OWL it cannot be guaranteed that the models of an ontology are finite!
Definitions (simplified):
Definition 1. We write I for the set of all IRIs, L for the set of all literals, and B for the set of all blank nodes. The set T of RDF terms is I ∪ L ∪ B. Let V be a countably infinite set of variables disjoint from T. A triple pattern is a member of the set (T ∪ V) × (I ∪ V) × (T ∪ V), and a basic graph pattern (BGP) is a set of triple patterns.
Definition 2. A solution mapping is a partial function μ: V → T from variables to RDF terms.
Definition 3. An RDF instance mapping is a partial function σ: B → T from blank nodes to RDF terms.
We use:
• O_G to denote the result of mapping an OWL 2 DL graph G into an OWL ontology.
• O_G^BGP to denote the result of mapping a BGP into axiom templates with respect to O_G.
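The definitions above can be sketched in a few lines of Python. The string encoding of terms ("?x" for variables, "_:b" for blank nodes) and the sample triples are assumptions made for this illustration; they are not part of the paper.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Term = str                                  # IRIs, literals, blank nodes, variables as strings
TriplePattern = Tuple[Term, Term, Term]

def is_var(t: Term) -> bool:
    return t.startswith("?")

def is_bnode(t: Term) -> bool:
    return t.startswith("_:")

def apply_mappings(bgp: Iterable[TriplePattern],
                   mu: Dict[Term, Term],
                   sigma: Dict[Term, Term]) -> FrozenSet[TriplePattern]:
    """Compute mu(sigma(bgp)): sigma replaces blank nodes, mu replaces variables.
    Both mappings are partial, so unmapped terms are left unchanged."""
    def subst(t: Term) -> Term:
        if is_bnode(t):
            return sigma.get(t, t)
        if is_var(t):
            return mu.get(t, t)
        return t
    return frozenset((subst(s), subst(p), subst(o)) for (s, p, o) in bgp)

# Hypothetical BGP and mappings.
bgp = {("?pub", "rdf:type", "ex:Publication"), ("?pub", "ex:author", "_:a")}
mu = {"?pub": "ex:paper1"}
sigma = {"_:a": "ex:alice"}
instantiated = apply_mappings(bgp, mu, sigma)
print(sorted(instantiated))
```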
SPARQL-OWL Query Answering:
• The SPARQL-OWL regime specifies what the answers are, but not how they can actually be computed.
• A straightforward algorithm to realize the entailment regime: given an OWL 2 DL graph G and a well-formed BGP for G,
1) map G into O_G and BGP into O_G^BGP,
2) and then simply test, for each compatible pair (μ, σ), whether sk(O_G) |= μ(σ(O_G^BGP)).
• Compatible mappings: (intuitively) class variables can only be mapped to class names, object property variables to object properties, etc.
• But in the worst case, the number of distinct compatible pairs (μ, σ) is exponential in the number of variables in the query, i.e., if m is the number of terms in O_G and n is the number of variables in O_G^BGP, we test O(m^n) solutions.
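The m^n blow-up of the straightforward algorithm can be seen in a small Python sketch. The function `entails` is a hypothetical stand-in for a reasoner entailment check, and the three class names and their subclass facts are invented for the example.

```python
from itertools import product

def naive_answers(variables, candidates, entails):
    """Test every compatible assignment of candidate terms to variables.
    `entails(mu)` stands in for one entailment check by a reasoner."""
    names = sorted(variables)
    answers, tests = [], 0
    for values in product(*(candidates[v] for v in names)):
        mu = dict(zip(names, values))
        tests += 1                      # one entailment check per assignment
        if entails(mu):
            answers.append(mu)
    return answers, tests

# Hypothetical ontology: three class names and the entailed subclass pairs.
classes = [":A", ":B", ":C"]
entailed_pairs = {(":A", ":B"), (":B", ":C"), (":A", ":C")}

# Query template SubClassOf(?x ?y): both variables range over class names.
answers, tests = naive_answers(
    ["?x", "?y"],
    {"?x": classes, "?y": classes},
    lambda mu: (mu["?x"], mu["?y"]) in entailed_pairs,
)
print(tests, len(answers))
```

With m = 3 candidate terms and n = 2 variables, all 3^2 = 9 assignments are tested even though only 3 are answers.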
SPARQL-OWL Query Answering (cont’d):
• So, the authors present optimizations to reduce the number of:
  - entailment checks, and
  - method calls to the reasoner.
• However, optimizations cannot easily be integrated into the above simple algorithm, since it uses the reasoner to check for the entailment of the query as a whole and does not take advantage of relations that may exist between axiom templates.
• For example, choosing a good execution order can significantly affect performance: consider the BGP { ?x rdf:type :A . ?x :op ?y . }. With 100 individuals, only one of which belongs to class :A, a bad order may perform 10,200 tests instead of just 200.
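The test counts in this example can be reproduced with a short Python simulation. One number is an assumption that the slide does not state: the sketch supposes that exactly 200 pairs of individuals are related by :op, which is what makes the bad order add up to 10,200 tests. The individual names are invented.

```python
individuals = [f"i{k}" for k in range(100)]
in_A = {"i0"}                                   # only one individual belongs to :A

# ASSUMPTION: 200 pairs are related by :op (chosen to match the slide's total).
op_pairs = {(f"i{k}", f"i{(k + 1) % 100}") for k in range(100)} | \
           {(f"i{k}", f"i{(k + 2) % 100}") for k in range(100)}

def count_tests(order):
    """Count entailment checks when templates are evaluated one after another."""
    tests = 0
    if order == "type-first":
        xs = []
        for x in individuals:                   # ?x rdf:type :A
            tests += 1
            if x in in_A:
                xs.append(x)
        for x in xs:                            # ?x :op ?y, with ?x already bound
            for y in individuals:
                tests += 1
    else:
        pairs = []
        for x in individuals:                   # ?x :op ?y, nothing bound yet
            for y in individuals:
                tests += 1
                if (x, y) in op_pairs:
                    pairs.append((x, y))
        for (x, _y) in pairs:                   # ?x rdf:type :A for each surviving pair
            tests += 1
    return tests

good = count_tests("type-first")
bad = count_tests("op-first")
print(good, bad)
```

Evaluating the class template first gives 100 + 100 = 200 checks; the reverse order gives 100 × 100 + 200 = 10,200.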
SPARQL-OWL Query Answering (cont’d):
• Also, instead of checking entailment, we can, for several axiom templates, directly retrieve the solutions from the reasoner. Example: { ?x rdfs:subClassOf :C }
• Most reasoner methods are highly optimized, which can significantly reduce the number of tests that are performed. Furthermore, if the class hierarchy is precomputed, the reasoner can find the answers simply with a cache lookup.
• However, the actual execution cost might vary significantly (depending on the internal state of the reasoner).
• The proposed algorithm internally uses an OWL 2 DL reasoner to check entailment (HermiT, in this implementation).
Simple and Complex Axiom Templates:
• We distinguish between simple and complex axiom templates, where:
  - Simple axiom templates are those that correspond to dedicated reasoning tasks, e.g.:
    SubClassOf(?x :C)
  - Complex axiom templates are, in contrast, evaluated by iterating over the compatible mappings and by checking entailment for each instantiated axiom template, e.g.:
    SubClassOf(:C ObjectIntersectionOf(?z ObjectSomeValuesFrom(?x ?y)))
    ClassAssertion(ObjectSomeValuesFrom(:op ?x) ?y)
390 I. Kollia, B. Glimm, and I. Horrocks
Algorithm 1. Query Evaluation Procedure
Input:  G: the active graph, which is an OWL 2 DL graph
        BGP: an OWL 2 DL BGP
Output: a multiset of solutions for evaluating BGP over G under OWL 2 Direct Semantics
 1: O_G := map(G)
 2: O_G^BGP := map(BGP, O_G)
 3: Axt := rewrite(O_G^BGP) {create a list Axt of simplified axiom templates from O_G^BGP}
 4: Axt_1, …, Axt_m := connectedComponents(Axt)
 5: for j = 1, …, m do
 6:   R_j := {(μ0, σ0) | dom(μ0) = dom(σ0) = ∅}
 7:   axt_1, …, axt_n := reorder(Axt_j)
 8:   for i = 1, …, n do
 9:     R_new := ∅
10:     for (μ, σ) ∈ R_j do
11:       if isSimple(axt_i) and ((V(axt_i) ∪ B(axt_i)) \ (dom(μ) ∪ dom(σ))) ≠ ∅ then
12:         R_new := R_new ∪ {(μ ∪ μ′, σ ∪ σ′) | (μ′, σ′) ∈ callReasoner(μ(σ(axt_i)))}
13:       else
14:         B := {(μ ∪ μ′, σ ∪ σ′) | dom(μ′) = V(μ(axt_i)), dom(σ′) = B(σ(axt_i)), (μ ∪ μ′, σ ∪ σ′) is compatible with axt_i and sk(O_G)}
15:         B := prune(B, axt_i, O_G)
16:         while B ≠ ∅ do
17:           (μ′, σ′) := removeNext(B)
18:           if O_G |= μ′(σ′(axt_i)) then
19:             R_new := R_new ∪ {(μ′, σ′)}
20:           else
21:             B := prune(B, axt_i, (μ′, σ′))
22:           end if
23:         end while
24:       end if
25:     end for
26:     R_j := R_new
27:   end for
28: end for
29: R := {(μ_1 ∪ … ∪ μ_m, σ_1 ∪ … ∪ σ_m) | (μ_j, σ_j) ∈ R_j, 1 ≤ j ≤ m}
30: return {(μ, m) | m > 0 is the maximal number with {(μ, σ_1), …, (μ, σ_m)} ⊆ R}
We first explain the general outline of the algorithm and leave the details of the used submethods for the following section. First, G and BGP are mapped to O_G and O_G^BGP, respectively (lines 1 and 2). The function rewrite (line 3) can be assumed to do nothing. Next, the method connectedComponents (line 4) partitions the axiom templates into sets of connected components, i.e., within a component the templates share common variables, whereas between components there are no shared variables. Unconnected components unnecessarily increase the amount of intermediate results and, instead, we can simply combine the results for the components in the end (line 29). For each component, we proceed as described below: we first determine an order (method reorder in line 7). For a simple axiom template, which contains so far unbound variables, we then call a specialized reasoner method to retrieve entailed results (line 12).
• For a simple axiom template, we then call a specialized reasoner method to retrieve entailed results (line 12).
• Otherwise, we check which compatible solutions yield an entailed axiom (lines 13 to 24).
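The partitioning step (connectedComponents, line 4 of Algorithm 1) can be sketched in Python as follows. Representing axiom templates as strings and finding variables with a regular expression is a simplification assumed here for illustration; it is not how the authors' system represents templates.

```python
import re

def variables(template: str) -> set:
    """Variables are tokens of the form ?name (an assumption of this sketch)."""
    return set(re.findall(r"\?\w+", template))

def connected_components(templates):
    """Group templates so that templates sharing a variable (directly or
    through a chain of other templates) end up in the same component."""
    components = []                                  # list of (variable set, template list)
    for t in templates:
        vs = variables(t)
        touching = [c for c in components if c[0] & vs]
        others = [c for c in components if not (c[0] & vs)]
        merged_vars = set(vs).union(*(c[0] for c in touching))
        merged_templates = [x for c in touching for x in c[1]] + [t]
        components = others + [(merged_vars, merged_templates)]
    return [ts for (_vars, ts) in components]

t1 = "SubClassOf(?x :C)"
t2 = "SubObjectPropertyOf(?p :q)"
t3 = "SubClassOf(?x ObjectSomeValuesFrom(:op ?y))"
parts = connected_components([t1, t2, t3])
print(parts)
```

Here t1 and t3 share ?x and form one component, while t2 is evaluated on its own and its results are combined with the others only at the end.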
Proposed Optimizations:
1) Axiom Template Reordering:
  - The simple axiom templates are ordered by their cost, which is computed as the weighted sum of the estimated number of required consistency checks and the estimated result size (reasoner-dependent).
  - The complex templates are ordered based only on the number of bindings that have to be tested.
2) Axiom Template Rewriting:
  - Some axiom templates that are costly to evaluate can be rewritten into axiom templates that can be evaluated more efficiently and yield an equivalent result.
  - Example: SubClassOf(?x ObjectIntersectionOf(ObjectSomeValuesFrom(:op ?y) :C)) can be rewritten into SubClassOf(?x :C) and SubClassOf(?x ObjectSomeValuesFrom(:op ?y)).
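A minimal Python sketch of cost-based ordering for simple axiom templates follows. The weights, the `Estimate` record, and the estimate values are all hypothetical; in the described system the estimates come from reasoner statistics, and the actual cost function may differ.

```python
from dataclasses import dataclass

@dataclass
class Estimate:
    template: str
    consistency_checks: int      # estimated number of required consistency checks
    result_size: int             # estimated number of results

def cost(e: Estimate, w_checks: float = 1.0, w_size: float = 1.0) -> float:
    """Weighted sum of the two estimates (weights are assumptions)."""
    return w_checks * e.consistency_checks + w_size * e.result_size

def reorder(estimates, **weights):
    """Cheapest template first."""
    return sorted(estimates, key=lambda e: cost(e, **weights))

# Hypothetical estimates for two templates.
estimates = [
    Estimate("ObjectPropertyAssertion(:op ?x ?y)", consistency_checks=10000, result_size=200),
    Estimate("ClassAssertion(:A ?x)", consistency_checks=100, result_size=1),
]
plan = reorder(estimates)
print([e.template for e in plan])
```

Because the class assertion template is estimated to be far cheaper, it is placed first, which binds ?x before the property template is evaluated.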
Table 2. Axiom templates and their equivalent simpler ones, where C(i) are class expressions (possibly containing variables), a is an individual or variable, and r is an object property expression (possibly containing a variable)

ClassAssertion(ObjectIntersectionOf(:C1 … :Cn) :a)    ≡  {ClassAssertion(:Ci :a) | 1 ≤ i ≤ n}
SubClassOf(:C ObjectIntersectionOf(:C1 … :Cn))        ≡  {SubClassOf(:C :Ci) | 1 ≤ i ≤ n}
SubClassOf(ObjectUnionOf(:C1 … :Cn) :C)               ≡  {SubClassOf(:Ci :C) | 1 ≤ i ≤ n}
SubClassOf(ObjectSomeValuesFrom(:op owl:Thing) :C)    ≡  ObjectPropertyDomain(:op :C)
SubClassOf(owl:Thing ObjectAllValuesFrom(:op :C))     ≡  ObjectPropertyRange(:op :C)
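The rewriting rules of Table 2 can be sketched in Python over a simple nested-tuple encoding of axiom templates. The tuple encoding and the helper names are assumptions of this sketch; the actual system works on OWL API objects.

```python
def is_op(expr, name):
    """True if expr is a compound expression built with the given constructor."""
    return isinstance(expr, tuple) and expr[0] == name

def flat(parts):
    """Rewrite each part recursively and concatenate the results."""
    return [r for p in parts for r in rewrite(p)]

def rewrite(axt):
    """Apply the Table 2 equivalences; return a list of simpler templates."""
    if axt[0] == "ClassAssertion" and is_op(axt[1], "ObjectIntersectionOf"):
        return flat(("ClassAssertion", c, axt[2]) for c in axt[1][1:])
    if axt[0] == "SubClassOf":
        sub, sup = axt[1], axt[2]
        if is_op(sup, "ObjectIntersectionOf"):
            return flat(("SubClassOf", sub, c) for c in sup[1:])
        if is_op(sub, "ObjectUnionOf"):
            return flat(("SubClassOf", c, sup) for c in sub[1:])
        if is_op(sub, "ObjectSomeValuesFrom") and sub[2] == "owl:Thing":
            return [("ObjectPropertyDomain", sub[1], sup)]
        if sub == "owl:Thing" and is_op(sup, "ObjectAllValuesFrom"):
            return [("ObjectPropertyRange", sup[1], sup[2])]
    return [axt]                                     # nothing to simplify

some = ("ObjectSomeValuesFrom", ":op", "?y")
split = rewrite(("SubClassOf", "?x", ("ObjectIntersectionOf", some, ":C")))
domain = rewrite(("SubClassOf", ("ObjectSomeValuesFrom", ":op", "owl:Thing"), ":C"))
print(split, domain)
```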
Class-Property Hierarchy Exploitation. The number of consistency checks needed to evaluate a BGP can be further reduced by taking the class and property hierarchies into account. Once the classes and properties are classified (this can ideally be done before a system accepts queries), the hierarchies are stored in the reasoner’s internal structures. We further use the hierarchies to prune the search space of solutions in the evaluation of certain axiom templates. We illustrate the intuition with an example. Let us assume that O_G^BGP contains the axiom template:

SubClassOf(:Infection ObjectSomeValuesFrom(:hasCausalLinkTo ?x))

If :C is not a solution and SubClassOf(:B :C) holds, then :B is also not a solution. Thus, when searching for solutions for x, the method removeNext (line 17) chooses the next binding to test by traversing the class hierarchy top-down. When we find a non-solution :C, the subtree rooted in :C of the class hierarchy can safely be pruned, which we do in the method prune in line 21. Queries over ontologies with a large number of classes and a deep class hierarchy can, therefore, gain the maximum advantage from this optimization. We employ similar optimizations using the object and data property hierarchies. It is obvious that we only prune mappings that cannot constitute actual solution and instance mappings; hence, soundness and completeness of Algorithm 1 is preserved.
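The top-down traversal with subtree pruning can be sketched in Python. The toy hierarchy, the class names, and the `is_solution` callback (a stand-in for one entailment check) are hypothetical; the sketch only illustrates why non-solutions let whole subtrees be skipped.

```python
def solutions_top_down(children, root, is_solution):
    """Traverse the class hierarchy from the root; when a class is not a
    solution, none of its subclasses are visited from it."""
    found, tests = [], 0
    stack, seen = [root], set()
    while stack:
        cls = stack.pop()
        if cls in seen:
            continue
        seen.add(cls)
        tests += 1                                   # one entailment check
        if is_solution(cls):
            found.append(cls)
            stack.extend(children.get(cls, []))      # descend only below solutions
    return found, tests

# Hypothetical classified hierarchy (parent -> direct subclasses).
children = {
    "owl:Thing": [":A", ":B"],
    ":A": [":A1", ":A2"],
    ":B": [":B1"],
}
# Hypothetical set of true solutions; it is upward closed, as the argument requires.
true_solutions = {"owl:Thing", ":A", ":A1"}

found, tests = solutions_top_down(children, "owl:Thing", lambda c: c in true_solutions)
print(found, tests)
```

Of the six classes only five are tested: once :B is found to be a non-solution, its subclass :B1 is never checked.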
Exploiting the Domain and Range Restrictions. Domain and range restrictions in O_G can be exploited to further restrict the mappings for class variables. Let us assume that O_G contains Axiom (6) and O_G^BGP contains Axiom Template (7).

ObjectPropertyRange(:takesCourse :Course) (6)
SubClassOf(:GraduateStudent ObjectSomeValuesFrom(:takesCourse ?x)) (7)

Only the class :Course and its subclasses can be solutions for x and we can immediately prune other mappings in the method prune (line 15), which again preserves soundness and completeness.
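The pruning rule as stated in the text (keep only the range class and its subclasses as candidates) can be sketched in Python. The hierarchy below, including :GraduateCourse and :Seminar, is invented for illustration, and the function names are assumptions of this sketch.

```python
def subclasses_of(children, cls):
    """Return cls together with all of its direct and indirect subclasses."""
    out, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in out:
            out.add(c)
            stack.extend(children.get(c, []))
    return out

def prune_by_range(candidates, children, range_class):
    """Keep only candidate bindings at or below the declared range class."""
    allowed = subclasses_of(children, range_class)
    return [c for c in candidates if c in allowed]

# Hypothetical classified hierarchy (parent -> direct subclasses).
children = {
    "owl:Thing": [":Course", ":Person"],
    ":Course": [":GraduateCourse"],
    ":GraduateCourse": [":Seminar"],
    ":Person": [":GraduateStudent"],
}
all_classes = ["owl:Thing", ":Course", ":GraduateCourse", ":Seminar", ":Person", ":GraduateStudent"]

remaining = prune_by_range(all_classes, children, ":Course")
print(remaining)
```

Only three of the six candidate classes remain to be tested for ?x in template (7).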
4 System Evaluation
Since entailment regimes only change the evaluation of basic graph patterns, standard SPARQL algebra processors can be used that allow for custom BGP evaluation. Furthermore, standard OWL reasoners can be used to perform the required reasoning tasks.
Proposed Optimizations (cont’d):
3) Class-Property Hierarchy Exploitation:
  - The hierarchies are stored in the reasoner’s internal structures.
  - Can be used to prune the search space of solutions in the evaluation of certain axiom templates.
  - Example: SubClassOf(:Infection ObjectSomeValuesFrom(:hasCausalLinkTo ?x)) [If :C is not a solution and SubClassOf(:B :C) holds, then :B is also not a solution.]
4) Exploiting the Domain and Range Restrictions:
  - Can be exploited to further restrict the mappings for class variables.
  - Example: ObjectPropertyRange(:takesCourse :Course) and SubClassOf(:GraduateStudent ObjectSomeValuesFrom(:takesCourse ?x)) [Only the class :Course and its subclasses can be solutions for x, and we can immediately prune other mappings.]
System Evaluation:
[Figure: query processing pipeline. Labels: SPARQL query, Query Parsing, Algebra Object, Algebra Evaluation, BGP, BGP Execution, BGP Parsing, BGP Rewriting, BGP Reordering, BGP Evaluation, BGP solution sequence, Query solution sequence, OWL Reasoner, OWL Ontology.]
Fig. 1. The main phases of query processing in our system
4.1 The System Architecture
Figure 1 depicts the main phases of query processing in our prototypical system. In our setting, the queried graph is seen as an ontology that is loaded into an OWL reasoner. Currently, we only load the default graph/ontology of the RDF dataset into a reasoner and each query is evaluated using this reasoner. We plan, however, to extend the system to named graphs, where the dataset clause of the query can be used to determine a reasoner which contains one of the named ontologies instead of the default one. Loading the ontology and the initialization of the reasoner are performed before the system accepts queries. We use the ARQ library³ of the Jena Semantic Web Toolkit for parsing the query and for the SPARQL algebra operations apart from our custom BGP evaluation method. The BGP is parsed and mapped into axiom templates by our extension of the OWL API [6], which uses the active ontology for type disambiguation. The resulting axiom templates are then passed to a query optimizer, which applies the axiom template rewriting and then searches for a good query execution plan based on statistics provided by the reasoner. We use the HermiT reasoner⁴ for OWL reasoning, but only the module that generates statistics and provides cost estimations is HermiT specific.
4.2 Experimental Results
We tested our system with the Lehigh University Benchmark (LUBM) [3] and a range of custom queries that test complex axiom template evaluation over the more expressive GALEN ontology. All experiments were performed on a Windows Vista machine with a double core 2.2 GHz Intel x86 32 bit processor and Java 1.6 allowing 1GB of Java heap space. We measure the time for one-off tasks such as classification separately since such tasks are usually performed before the system accepts queries. Whether more costly operations such as the realization of the ABox, which computes the types for all individuals, are done in the beginning, depends on the setting and the reasoner. Since realization is relatively quick in HermiT for LUBM (GALEN has no individuals), we also performed this task upfront. The given results are averages from executing each query three times. The ontologies and all code required to perform the experiments are available online.⁵
³ http://jena.sourceforge.net/ARQ/
⁴ http://www.hermit-reasoner.com/
⁵ http://www.hermit-reasoner.com/2010/sparqlowl/sparqlowl.zip
• Since entailment regimes only change the evaluation of basic graph patterns, standard SPARQL algebra processors can be used that allow for custom BGP evaluation.
• Uses the ARQ library of the Jena Semantic Web Toolkit for parsing the query and for the SPARQL algebra operations apart from our custom BGP evaluation.
• The BGP is parsed and mapped into axiom templates by our extension of the OWL API.
• We use the HermiT reasoner for OWL reasoning.
Experimental Results:
Table 3. Query answering times in milliseconds for LUBM(1,0) and in seconds for the queries of Table 4 with and without optimizations

LUBM(1,0)
Query  Time (ms)
1      20
2      46
3      19
4      19
5      32
6      58
7      42
8      353
9      4,475
10     23
11     19
12     28
13     16
14     45

GALEN queries from Table 4 (each x marks one enabled optimization among Reordering, Hierarchy Exploitation, Rewriting)
Query  Optimizations  Time (s)
1      (none)         2.1
1      x              0.1
2      (none)         780.6
2      x              4.4
3      (none)         >30 min
3      x              119.6
3      x              204.7
3      x x            4.9
4      x x            >30 min
4      x x            361.9
4      x x            >30 min
4      x x x          68.2
5      x              >30 min
5      x              >30 min
5      x x            5.6
We first evaluate the 14 conjunctive ABox queries provided in the LUBM. These queries are simple ones and have variables only in place of individuals and literals. The LUBM ontology contains 43 classes, 25 object properties, and 7 data properties. We tested the queries on LUBM(1,0), which contains data for one university starting from index 0, and which contains 16,283 individuals and 8,839 literals. The ontology took 3.8 s to load and 22.7 s for classification and realization. Table 3 shows the execution time for each of the queries. The reordering optimization has the biggest impact on queries 2, 7, 8, and 9. These queries require much more time or are not answered at all within the time limit of 30 min without this optimization (758.9 s, 14.7 s, >30 min, >30 min, respectively).
Conjunctive queries are supported by a range of OWL reasoners. SPARQL-OWL allows, however, the creation of very powerful queries, which are not currently supported by any other system. In the absence of suitable standard benchmarks, we created a custom set of queries as shown in Table 4 (in FSS). Note that we omit variable type declarations since the variable types are unambiguous in FSS. Since the complex queries are mostly based on complex schema queries, we switched from the very simple LUBM ontology to the GALEN ontology. GALEN consists of 2,748 classes, 413 object properties, and no individuals or literals. The ontology took 1.6 s to load and 4.8 s to classify the classes and properties. The execution time for these queries is shown on the right-hand side of Table 3. For each query, we tested the execution once without optimizations and once for each combination of applicable optimizations from Section 3.
As expected, an increase in the number of variables within an axiom template leads to a significant increase in the query execution time because the number of mappings that have to be checked grows exponentially in the number of variables. This can, in particular, be observed from the difference in execution time between Query 1 and 2.
• Evaluate the 14 conjunctive ABox queries provided in the LUBM.
• Without the reordering optimization, queries 2, 7, 8, and 9 required 758.9 s, 14.7 s, >30 min, and >30 min, respectively.
Experimental Results:
Table 4. Sample complex queries for the GALEN ontology

1  SubClassOf(:Infection ObjectSomeValuesFrom(:hasCausalLinkTo ?x))
2  SubClassOf(:Infection ObjectSomeValuesFrom(?y ?x))
3  SubClassOf(?x ObjectIntersectionOf(:Infection ObjectSomeValuesFrom(:hasCausalAgent ?y)))
4  SubClassOf(:NAMEDLigament ObjectIntersectionOf(:NAMEDInternalBodyPart ?x))
   SubClassOf(?x ObjectSomeValuesFrom(:hasShapeAnalagousTo ObjectIntersectionOf(?y ObjectSomeValuesFrom(?z :linear))))
5  SubClassOf(?x :NonNormalCondition)
   SubObjectPropertyOf(?z :ModifierAttribute)
   SubClassOf(:Bacterium ObjectSomeValuesFrom(?z ?w))
   SubObjectPropertyOf(?y :StatusAttribute)
   SubClassOf(?w :AbstractStatus)
   SubClassOf(?x ObjectSomeValuesFrom(?y :Status))
From Queries 1, 2, and 3 it is evident that the use of the hierarchy exploitation optimization leads to a decrease in execution time of up to two orders of magnitude and, in combination with the query rewriting optimization, we can get an improvement of up to three orders of magnitude as seen in Query 3. Query 4 can only be completed in the given time limit if at least reordering and hierarchy exploitation is enabled. Rewriting splits the first axiom template into the following two simple axiom templates, which are evaluated much more efficiently:

SubClassOf(:NAMEDLigament :NAMEDInternalBodyPart)
SubClassOf(:NAMEDLigament ?x)

After the rewriting, the reordering optimization has an even more pronounced effect since both rewritten axiom templates can be evaluated with a simple cache lookup. Without reordering, the complex axiom template could be executed before the simple ones, which leads to the inability to answer the query within the time limit of 30 min. Without a good ordering, Query 5 can also not be answered, but the additional use of the class and property hierarchy further improves the execution time by three orders of magnitude.
Although our optimizations can significantly improve the query execution time, the required time can still be quite high. In practice, it is, therefore, advisable to add as many restrictive axiom templates for query variables as possible. For example, the addition of SubClassOf(?y :Shape) to Query 4 reduces the runtime from 68.2 s to 1.6 s.
5 Discussion
We have presented a sound and complete query answering algorithm and novel optimizations for SPARQL’s OWL Direct Semantics entailment regime. Our prototypical query answering system combines existing tools such as ARQ, the OWL API, and the HermiT OWL reasoner to implement an algorithm that evaluates basic graph patterns under OWL’s Direct Semantics. Apart from the query reordering optimization, the system is independent of the reasoner used.
Conclusion & Future Work:
• A sound and complete query answering algorithm and novel optimizations for SPARQL’s OWL Direct Semantics entailment regime.
• The prototype combines existing tools: ARQ, the OWL API, and the HermiT reasoner.
• Apart from the query reordering optimization, the system is independent of the reasoner used.
• The optimizations can improve query execution time by up to three orders of magnitude.
• Future work will include the creation of more accurate cost estimates for the cost-based query reordering. (the next paper on course webpage!)
• The implementation of caching strategies that reduce the number of tests for different instantiations of a complex axiom template. (no reported work yet!)
• Although the proposed optimizations can significantly improve the query execution time, the required time can still be quite high. In practice, it is, therefore, advisable to add as many restrictive axiom templates for query variables as possible.