Post on 18-Aug-2015
A Toolkit for Query Answering with Existential Rules
Jean-François Baget, Michel Leclère, Marie-Laure Mugnier, Swan Rocher, and Clément Sipieter
RuleML 2015
sipieter@lirmm.fr Graal 1
Ontology-mediated Query Answering
[Figure: a Query is answered over a Knowledge base made of an Ontology and Data.]
Ontology language
Existential rules framework = Datalog+ [Calì et al., 2009]
What is Datalog+?
An extension of positive Datalog:
- Existentially quantified variables in rule heads: ∀x (human(x) → ∃y parentOf(y, x))
- Negative constraints: ∀x (man(x) ∧ woman(x) → ⊥)
- Equality rules: ∀x ∀y ∀z (motherOf(y, x) ∧ motherOf(z, x) → y = z)
Generalizes:
- Horn description logics (e.g. DL-Lite, the 3 tractable profiles of OWL 2),
- database dependencies (TGDs and EGDs).
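To make the first two kinds of statement concrete, here is a minimal Python sketch. It is not Graal code: the tuple encoding of atoms, the function names and the "_:n0" naming of fresh nulls are all assumptions made for illustration. It applies the rule human(x) → ∃y parentOf(y, x) once and checks the constraint man(x) ∧ woman(x) → ⊥.

```python
from itertools import count

_fresh = count()  # hypothetical generator of fresh null names


def apply_parent_rule(facts):
    """Apply human(x) -> exists y. parentOf(y, x) once.

    A fresh null stands for the unknown parent y when no parent is recorded.
    """
    new_facts = set(facts)
    for fact in facts:
        if fact[0] == "human":
            x = fact[1]
            has_parent = any(f[0] == "parentOf" and f[2] == x for f in facts)
            if not has_parent:
                new_facts.add(("parentOf", f"_:n{next(_fresh)}", x))
    return new_facts


def violates_constraint(facts):
    """Check the negative constraint man(x) and woman(x) -> false."""
    men = {f[1] for f in facts if f[0] == "man"}
    women = {f[1] for f in facts if f[0] == "woman"}
    return bool(men & women)


facts = {("human", "Alice"), ("man", "Bob")}
saturated = apply_parent_rule(facts)
```

The fresh null is what distinguishes an existential rule from a plain Datalog rule: the head introduces an individual that is not named anywhere in the data.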
Graal, an architecture overview
Graal - General architecture
[Figure: module diagram. Modules: core; store (RDBMS, Triple Store, InMemory, GraphDB); io (DLGP, OWL2, RuleML); ruleset-analyser; fragment; forward-chaining; backward-chaining; homomorphism.]
[Figure: class diagram. Classes: Rule, NegativeConstraint, AtomSet, Atom, Predicate, Term, Substitution, Query, ConjunctiveQuery, UnionConjunctiveQuery, GraphOfRuleDependencies.]
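Store: Storing data
[Figure: the store module (RDBMS, Triple Store, GraphDB, InMemory) builds on the core AtomSet interface.]
AtomSet:
    Iterator<Atom> iterator()
    Set<Term> getTerms()
    boolean contains(Atom)
    boolean addAtom(Atom)
    boolean removeAtom(Atom)
[Class diagram: AtomSet, Atom, Predicate, Term, with multiplicities 0..*, 0..* and 1.]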
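As an illustration of how small the storage contract is, the following Python sketch mirrors the five AtomSet operations with an in-memory set. The class name, the method names and the tuple encoding of atoms (predicate first, then terms) are hypothetical; this is not Graal's Java implementation.

```python
class InMemoryAtomSet:
    """Toy in-memory store; an atom is a tuple (predicate, term1, ..., termN)."""

    def __init__(self):
        self._atoms = set()

    def __iter__(self):
        # counterpart of iterator()
        return iter(self._atoms)

    def get_terms(self):
        # counterpart of getTerms(): every term occurring in some atom
        return {term for atom in self._atoms for term in atom[1:]}

    def contains(self, atom):
        return atom in self._atoms

    def add_atom(self, atom):
        # returns True only if the atom was not already stored
        if atom in self._atoms:
            return False
        self._atoms.add(atom)
        return True

    def remove_atom(self, atom):
        # returns True only if the atom was actually stored
        if atom not in self._atoms:
            return False
        self._atoms.remove(atom)
        return True


store = InMemoryAtomSet()
store.add_atom(("parentOf", "Alice", "Bob"))
```

Any backend that can offer these operations (a relational table, a triple store, a graph database) can be plugged in behind the same interface.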
Homomorphism: Querying data
[Figure: the homomorphism module evaluates a Query against the store, in one of two ways.]
- Generic: a recursive backtrack algorithm over the core AtomSet interface, usable with an RDBMS, a Triple Store or a GraphDB.
- Native: the query is translated into the language of the store: SQL for an RDBMS, SPARQL for a Triple Store, Cypher for Neo4j.
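The generic recursive backtrack can be sketched in a few lines of Python. This is an illustration under assumptions, not Graal's algorithm: atoms are tuples, and a term is treated as a variable when it starts with a lowercase letter (a convention chosen only for this example).

```python
def is_var(term):
    # Assumed convention: variables start with a lowercase letter.
    return term[0].islower()


def extend(atom, fact, subst):
    """Map `atom` onto `fact`, extending `subst`; return None on failure."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None  # constant mismatch
    return new


def homomorphisms(query, facts, subst=None):
    """Yield every substitution mapping all atoms of `query` into `facts`."""
    subst = {} if subst is None else subst
    if not query:
        yield subst
        return
    first, rest = query[0], query[1:]
    for fact in facts:
        new = extend(first, fact, subst)
        if new is not None:
            # backtracking happens implicitly when the recursion yields nothing
            yield from homomorphisms(rest, facts, new)


facts = [("parentOf", "Alice", "Bob"), ("parentOf", "Bob", "Carol")]
query = [("parentOf", "x", "y"), ("parentOf", "y", "z")]
answers = list(homomorphisms(query, facts))
```

Because the function only iterates over the facts, it needs nothing beyond the iterator of the store, which is why the same procedure works on any backend; the native translations trade this generality for the optimizer of the underlying system.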
Taking the ontology into account
[Figure: the general architecture diagram shown earlier (modules and classes).]
Forward Chaining / Chase
[Figure: Query and Rules as input; the forward-chaining module works with the homomorphism module on the store, here an RDBMS accessed through SQL.]
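A forward-chaining loop can be sketched as follows. This is a simplified restricted chase written for illustration, not the Graal implementation: rules are (body, head) pairs of atom lists, variables start with a lowercase letter, fresh nulls are named "_:n0", "_:n1", and so on, and a round limit is imposed because the chase is not guaranteed to terminate with existential rules. All of these choices are assumptions.

```python
from itertools import count


def is_var(term):
    return term[0].islower()  # assumed convention for variables


def extend(atom, fact, subst):
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def homomorphisms(atoms, facts, subst):
    if not atoms:
        yield subst
        return
    for fact in facts:
        new = extend(atoms[0], fact, subst)
        if new is not None:
            yield from homomorphisms(atoms[1:], facts, new)


def chase(facts, rules, max_rounds=10):
    """Saturate `facts` with `rules`, round by round, up to `max_rounds`."""
    facts = set(facts)
    fresh = count()
    for _ in range(max_rounds):
        added = set()
        for body, head in rules:
            for h in list(homomorphisms(body, facts, {})):
                # restricted chase: skip if the head is already satisfied
                if next(homomorphisms(head, facts, h), None) is not None:
                    continue
                h = dict(h)
                for atom in head:
                    for term in atom[1:]:
                        if is_var(term) and term not in h:
                            h[term] = f"_:n{next(fresh)}"  # existential variable
                added |= {(a[0],) + tuple(h.get(t, t) for t in a[1:]) for a in head}
        if added <= facts:
            break  # fixpoint reached
        facts |= added
    return facts


rules = [
    ([("human", "x")], [("parentOf", "y", "x")]),
    ([("parentOf", "y", "x")], [("hasParent", "x")]),
]
result = chase({("human", "Alice")}, rules)
```

After saturation, the query is evaluated on the enlarged fact base with a plain homomorphism check, which is the sense in which forward chaining makes the data grow.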
Backward Chaining / Query rewriting
[Figure: Query and Rules as input; the query-rewriting module produces a union of queries Q1 ∨ Q2 ∨ Q3 ∨ Q4 ∨ Q5 ∨ Q6 ∨ … ∨ Qn, which the homomorphism module evaluates on the store, here an RDBMS accessed through SQL.]
Pure - A compilation-based query rewriter
Efficiency of the Query Rewriting approach in practice?
+ the data does not grow
− the size of the rewriting set can be prohibitive in practice
[Figure: a hierarchy over A, B1, B2, ..., Bn, labelled B1(x) ∧ A(x), B2(x) ∧ B1(x), ..., Bn(x) ∧ Bn−1(x).]
[Figure: a class hierarchy A1, A2, A3, ..., An.]
A2(x) → A1(x)
A3(x) → A2(x)
...
An(x) → An−1(x)
q = A1(x1) ∧ ... ∧ A1(xk)
rewriting set: n^k CQs
It is not a theoretical worst-case: it happens often in practice because hierarchies are at the heart of ontologies.
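The n^k count can be checked by brute force. The sketch below enumerates the rewritings of q = A1(x1) ∧ ... ∧ A1(xk) under the hierarchy An → ... → A1, where each atom may independently be replaced by any of the n classes. It assumes that x1, ..., xk are distinct answer variables, so that no rewriting is redundant; the function name and the string encoding of queries are made up for the example.

```python
from itertools import product


def rewritings(n, k):
    """All CQs obtained by replacing each A1(xj) with some Ai(xj), 1 <= i <= n."""
    classes = [f"A{i}" for i in range(1, n + 1)]
    return [
        tuple(f"{cls}(x{j})" for j, cls in enumerate(choice, start=1))
        for choice in product(classes, repeat=k)
    ]


small = rewritings(3, 2)
```

With a hierarchy of only 10 classes and a query of 5 atoms, the same enumeration already produces 100000 conjunctive queries.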
Compilation-based Query Rewriting
Preprocess some simple rules known as sources of combinatorial explosion:
atom1 → atom2
without existential variables.
E.g., subclass, subproperty, domain, range, inverse properties...
R = RC ∪ RE
1. Compile RC into a preorder over atoms
2. Embed this preorder into the rewriting process
[Figure: offline compilation turns RC into a preorder; query rewriting then takes q, RE and the preorder and produces a "pivotal" query.]
Example
R0 : project(x, y, z, w) → hasArea(x, y)
R1 : project(x, y, z, w) → hasScManager(x, z)
R2 : project(x, y, z, w) → hasAdmManager(x, w)
R3 : sensitiveArea(x) → area(x)
R4 : security(x) → sensitiveArea(x)
R5 : innovation(x) → sensitiveArea(x)
R6 : hasScManager(x, y) → hasManager(x, y)
R7 : hasAdmManager(x, y) → hasManager(x, y)
R8 : isManagerOf(y, x) → hasManager(x, y)
R9 : hasManager(y, x) → isManagerOf(x, y)
R10 : manager(x) → isManager(x, y)
R11 : isManagerOf(x, y) ∧ hasArea(y, z) ∧ sensitiveArea(z) → criticalManager(x)
R12 : criticalManager(x) → isManagerOf(x, y) ∧ hasArea(y, z) ∧ sensitiveArea(z)
R13 : accreditatedManager(x) → isManagerOf(x, y) ∧ project(y, z, v, w) ∧ security(z)
Closure of the compilable rules
R0 : project(x, y, z, w) → hasArea(x, y)
R1 : project(x, y, z, w) → hasScManager(x, z)
R2 : project(x, y, z, w) → hasAdmManager(x, w)
R3 : sensitiveArea(x) → area(x)
R4 : security(x) → sensitiveArea(x)
R5 : innovation(x) → sensitiveArea(x)
R6 : hasScManager(x, y) → hasManager(x, y)
R7 : hasAdmManager(x, y) → hasManager(x, y)
R8 : isManagerOf(y, x) → hasManager(x, y)
R9 : hasManager(y, x) → isManagerOf(x, y)
We compute all inferred rules obtained by composition:
Ra : R1 · R6 = project(x, y, z, w) → hasManager(x, z)
Rb : R2 · R7 = project(x, y, z, w) → hasManager(x, w)
Rc : R4 · R3 = security(x) → area(x)
Rd : R5 · R3 = innovation(x) → area(x)
Re : R6 · R9 = hasScManager(x, y) → isManagerOf(y, x)
Rf : R7 · R9 = hasAdmManager(x, y) → isManagerOf(y, x)
Rg : Ra · R9 = project(x, y, z, w) → isManagerOf(z, x)
Rh : Rb · R9 = project(x, y, z, w) → isManagerOf(w, x)
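The closure by composition can be reproduced with a short fixpoint computation. The Python sketch below is an illustration, not the Pure implementation: a rule is a (body atom, head atom) pair of tuples, variables are renamed canonically (v0, v1, ... in body order) so that equal rules compare equal, body atoms are assumed to have pairwise distinct variables, and trivial rules of the form atom → atom are discarded.

```python
def canonical(rule):
    """Rename variables to v0, v1, ... following their order in the body."""
    body, head = rule
    names = {}
    for term in body[1:]:
        names.setdefault(term, f"v{len(names)}")
    rename = lambda atom: (atom[0],) + tuple(names[t] for t in atom[1:])
    return rename(body), rename(head)


def compose(r1, r2):
    """Return r1 · r2 when the head of r1 matches the body of r2, else None."""
    (b1, h1), (b2, h2) = r1, r2
    if h1[0] != b2[0] or len(h1) != len(b2):
        return None
    sigma = dict(zip(b2[1:], h1[1:]))  # assumes distinct variables in b2
    return canonical((b1, (h2[0],) + tuple(sigma[t] for t in h2[1:])))


def closure(rules):
    closed = {canonical(r) for r in rules}
    changed = True
    while changed:
        changed = False
        for r1 in list(closed):
            for r2 in list(closed):
                r = compose(r1, r2)
                if r is not None and r[0] != r[1] and r not in closed:
                    closed.add(r)
                    changed = True
    return closed


RULES = [  # R0 to R9 from the slide
    (("project", "x", "y", "z", "w"), ("hasArea", "x", "y")),
    (("project", "x", "y", "z", "w"), ("hasScManager", "x", "z")),
    (("project", "x", "y", "z", "w"), ("hasAdmManager", "x", "w")),
    (("sensitiveArea", "x"), ("area", "x")),
    (("security", "x"), ("sensitiveArea", "x")),
    (("innovation", "x"), ("sensitiveArea", "x")),
    (("hasScManager", "x", "y"), ("hasManager", "x", "y")),
    (("hasAdmManager", "x", "y"), ("hasManager", "x", "y")),
    (("isManagerOf", "y", "x"), ("hasManager", "x", "y")),
    (("hasManager", "y", "x"), ("isManagerOf", "x", "y")),
]
closed = closure(RULES)
```

On this input the fixpoint holds the ten original rules plus the eight inferred rules Ra to Rh; compositions such as R8 · R9 only yield trivial rules and are dropped.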
Preorder
Each atom is followed by the atoms from which the compiled rules infer it:
isManagerOf(x, y): hasScManager(y, x), hasAdmManager(y, x), hasManager(y, x), project(y, z, x, w), project(y, z, w, x)
area(x): security(x), innovation(x), sensitiveArea(x)
hasManager(x, y): hasScManager(x, y), hasAdmManager(x, y), isManagerOf(y, x), project(x, z, y, w), project(x, z, w, y)
hasAdmManager(x, y): project(x, z, w, y)
hasScManager(x, y): project(x, z, y, w)
sensitiveArea(x): security(x), innovation(x)
hasArea(x, y): project(x, y, z, w)
≼-Homomorphism
Q = isManagerOf(x, y) ∧ hasArea(y, z) ∧ sensitiveArea(z)
F = isManagerOf(M1, P) ∧ project(P, A, M1, M2) ∧ security(A)
h = {{x, M1}, {y, P}, {z, A}}
Compiled preorder (excerpt):
hasArea(x, y): project(x, y, z, w)
sensitiveArea(x): security(x), innovation(x)
...
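The example can be replayed with a small Python sketch of a homomorphism check that takes the compiled rules into account. It is an illustration under assumptions, not the algorithm of Pure: only the three compiled rules of the excerpt are encoded, variables start with a lowercase letter, rule heads are assumed to contain pairwise distinct variables, and body positions that do not occur in the head are treated as unconstrained (None).

```python
def is_var(term):
    return term[0].islower()  # assumed convention for variables


# Compiled rules (body atom, head atom), restricted to the excerpt.
COMPILED = [
    (("project", "a", "b", "c", "d"), ("hasArea", "a", "b")),
    (("security", "a"), ("sensitiveArea", "a")),
    (("innovation", "a"), ("sensitiveArea", "a")),
]


def candidates(atom):
    """Yield `atom` and every atom from which a compiled rule infers it."""
    yield atom
    for body, head in COMPILED:
        if head[0] == atom[0] and len(head) == len(atom):
            sigma = dict(zip(head[1:], atom[1:]))
            # None marks a body position that the head does not constrain
            yield (body[0],) + tuple(sigma.get(t) for t in body[1:])


def match(atom, fact, subst):
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    new = dict(subst)
    for term, value in zip(atom[1:], fact[1:]):
        if term is None:
            continue
        if is_var(term):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def preorder_homomorphisms(query, facts, subst=None):
    subst = {} if subst is None else subst
    if not query:
        yield subst
        return
    for candidate in candidates(query[0]):
        for fact in facts:
            new = match(candidate, fact, subst)
            if new is not None:
                yield from preorder_homomorphisms(query[1:], facts, new)


Q = [("isManagerOf", "x", "y"), ("hasArea", "y", "z"), ("sensitiveArea", "z")]
F = [("isManagerOf", "M1", "P"), ("project", "P", "A", "M1", "M2"), ("security", "A")]
found = list(preorder_homomorphisms(Q, F))
```

No classical homomorphism exists from Q to F, since F contains neither a hasArea atom nor a sensitiveArea atom; the mapping is found only because project(P, A, M1, M2) and security(A) lie below hasArea(P, A) and sensitiveArea(A) in the compiled preorder.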
Query rewriting using ≼-Homomorphism
Q = criticalManager(x)
Q′ = isManagerOf(x, y) ∧ hasArea(y, z) ∧ sensitiveArea(z)
Q′′ = accreditatedManager(x)
[Classical rewriting: 38 CQs]
R10 : manager(x) → isManager(x, y)
R11 : isManagerOf(x, y) ∧ hasArea(y, z) ∧ sensitiveArea(z) → criticalManager(x)
R12 : criticalManager(x) → isManagerOf(x, y) ∧ hasArea(y, z) ∧ sensitiveArea(z)
R13 : accreditatedManager(x) → isManagerOf(x, y) ∧ project(y, z, v, w) ∧ security(z)
Compiled preorder (excerpt):
hasArea(x, y): project(x, y, z, w)
sensitiveArea(x): security(x), innovation(x)
...
Results - Impact of compilation on rewriting sizes
Benchmark: translation of DL-LiteR ontologies:
- Adolena (102 rules, 75% compilable)
- Vicodi (222 rules, 100% compilable)
- OpenGalen2-Lite (51k rules, 55% compilable)
- OBOprotein (43k rules, 82% compilable)
Each ontology is provided with 5 queries.
(A = Adolena, V = Vicodi, G = OpenGalen2-Lite, O = OBOprotein)
            UCQ  p-UCQ
A  Q1        27      2
   Q2        50      2
   Q3       104      1
   Q4       224      2
   Q5       624      1
V  Q1        15      1
   Q2        10      1
   Q3        72      1
   Q4       185      1
   Q5        30      1
G  Q1         2      1
   Q2      1152      1
   Q3       488      5
   Q4       147      1
   Q5       324     19
O  Q1        27     20
   Q2      1356   1264
   Q3     33887      1
   Q4     34733    682
   Q5     36612      -
Results - Impact of compilation on rewriting times
          Pure  PureC  PureC to UCQ
A  Q1      190     20           140
   Q2      100     10            50
   Q3      180     20            40
   Q4      290     10           140
   Q5     1510     10           450
V  Q1       20     10            10
   Q2       20     10            20
   Q3      120     10            80
   Q4      130     10            70
   Q5       20     10            20
G  Q1       10     10            20
   Q2     1070     60           630
   Q3     1030     80           270
   Q4       30     20            20
   Q5      900     40           100
O  Q1      450    140           150
   Q2     1170   1120          1880
   Q3       TO    100        558000
   Q4       TO    440            TO
   Q5       TO     TO            TO
TO = 10 min.
Query evaluation (Future work)
In central memory, compute ≼-homomorphisms from Q to F.
Otherwise:
- Unfold Q using the preorder,
- Transform Q + the compilable rules into a Datalog program P,
- Saturate F with RC (the set of compilable rules).
https://graphik-team.github.io/graal/
Results - Impact of compilation on rewriting times
          Pure  PureC  PureC to UCQ  Nyaya  Requiem  Iqaros    tw   Rapid
A  Q1      190     20           140   1130      270      60    20      20
   Q2      100     10            50    870      110      60    20      30
   Q3      180     20            40   2370      140     200    10      40
   Q4      290     10           140   5560      260     140    20      50
   Q5     1510     10           450  33210      470     580    20     100
V  Q1       20     10            10     20       20      20    10      10
   Q2       20     10            20     60       20      20    10      10
   Q3      120     10            80     30       70      30    10      30
   Q4      130     10            70     30      140      40    10      40
   Q5       20     10            20     30       80      50    20      30
G  Q1       10     10            20      -       50      50    10      10
   Q2     1070     60           630      -   209050    5870    20      80
   Q3     1030     80           270      -   259110    9190    30      60
   Q4       30     20            20      -   190260     780    10      20
   Q5      900     40           100      -   238460    7410    30      50
O  Q1      450    140           150      -     3450    6680    20      30
   Q2     1170   1120          1880      -    21790   27820   580     960
   Q3       TO    100        558000      -       TO      TO    80     620
   Q4       TO    440            TO      -       TO  139990  1240   14700
   Q5       TO     TO            TO      -       TO      TO    TO  562230
TO = 10 min.