Özgür L. Özçep INSTITUT FÜR INFORMATIONSSYSTEME Data Exchange 1 Lecture 5: Motivation, Relational DE, Chase 18 November, 2015 Foundations of Ontologies and Databases for Information Systems CS5130 (Winter 2015)
Page 1: Özgür L. Özçep Data Exchange 1 - uni-luebeck.de · Özgür L. Özçep INSTITUT FÜR INFORMATIONSSYSTEME Data Exchange 1 Lecture5:Motivation,RelationalDE,Chase 18November,2015

Özgür L. Özçep

INSTITUT FÜR INFORMATIONSSYSTEME

Data Exchange 1
Lecture 5: Motivation, Relational DE, Chase

18 November, 2015

Foundations of Ontologies and Databases for Information Systems
CS5130 (Winter 2015)


Recap of Lecture 4

One of these lectures ...

- Last lecture was one of those where the lecturer sees this:
  https://www.youtube.com/watch?v=IQgAuBhlBT0

Owl video

- Locality as a means for proving inexpressibility results for logics
- Hanf locality: Answers are the same on two structures that are point-wise similar (Ex. 4.1)
- Gaifman locality: A query cannot distinguish between tuples that are locally the same in the given structure
- Bounded number of degrees property (BNDP): A query cannot produce more degrees in the output, w.r.t. a given bound, than in the input
- Relations: Hanf ⇒ Gaifman ⇒ BNDP
- 0-1 law: Either almost all structures have the property or almost all structures do not have it
- The 0-1 law also holds for logics with recursion (Datalog) (Ex. 4.3)

End of Recap

Solution to Exercise 4.1 (6 Points)
Use Hanf locality in order to prove that the following Boolean queries are not FOL-definable: 1. graph acyclicity, 2. tree.

Solution
Graph Acyclicity (GA).
- For contradiction, assume GA is Hanf-local with parameter r′. Choose r > r′ + 1 such that r is even.
- Let G be the union of a cycle of length 2r and a linear order of length r.
- Let G′ be a linear order of length 3r.
- Take a bijection f : G → G′ where
  - the cycle is unravelled to the middle of G′,
  - the lower half of the order in G is mapped to the lower part of G′,
  - the upper half of the order in G is mapped to the upper part of G′.
- The r′-neighbourhoods of any a in G and of f(a) ∈ G′ are the same.
- Hence G ⇆_r G′, but G is cyclic and G′ is not (contradiction).

Tree
- Same construction (as G′ is a tree whereas G is not)

Solution to Exercise 4.2 (4 Points)

Show that EVEN(σ) can be defined within second-order logic.

Hint: formalize “There is a binary relation which is an equivalence relation having only equivalence classes with exactly two elements” and argue why this shows the axiomatizability.

Solution

∃R ( ∀x R(x, x)
   ∧ ∀x∀y (R(x, y) → R(y, x))
   ∧ ∀x∀y∀z ((R(x, y) ∧ R(y, z)) → R(x, z))
   ∧ ∀x∃y (R(x, y) ∧ x ≠ y ∧ ∀z (R(x, z) → (z = x ∨ z = y))) )
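The second-order sentence can be sanity-checked on small domains by brute force. The following is only a sketch: the function name `has_pairing_relation` and the restriction to domains of size 1 to 4 are choices made here for illustration, not part of the exercise. It enumerates every binary relation R on an n-element domain and tests the four conjuncts (reflexivity, symmetry, transitivity, classes of size exactly two).

```python
from itertools import product

def has_pairing_relation(n):
    """True iff some binary relation R on {0, ..., n-1} satisfies all four conjuncts."""
    dom = range(n)
    pairs = [(x, y) for x in dom for y in dom]
    for bits in product([False, True], repeat=len(pairs)):
        R = {p for p, bit in zip(pairs, bits) if bit}
        if not all((x, x) in R for x in dom):                       # reflexivity
            continue
        if not all((y, x) in R for (x, y) in R):                    # symmetry
            continue
        if not all((x, z) in R for (x, y) in R for (u, z) in R if y == u):  # transitivity
            continue
        # every x has a partner y != x, and x is related to nothing but x and y
        if all(any((x, y) in R and x != y and
                   all(z in (x, y) for z in dom if (x, z) in R)
                   for y in dom)
               for x in dom):
            return True
    return False

# Maps each domain size to whether a witnessing relation exists.
results = {n: has_pairing_relation(n) for n in range(1, 5)}
```

On these sizes a witness R exists exactly for the even domains, which is the behaviour the sentence is meant to capture.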

Solution to Exercise 4.3 (2 Points)

Argue why (in particular within the DB community) one imposes safety conditions for Datalog rules.

Solution
Otherwise the semantics would lead either to infinite answer sets or to domain dependence. For example, for ans(x) ← R(a), all bindings for x in the domain of a DB that contains R(a) would have to be listed. So the answer would depend not only on R(a) but also on the domains of the variables one allows.
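The domain dependence of the unsafe rule ans(x) ← R(a) can be made concrete with a small sketch. The function `eval_unsafe_rule` and the two example domains are hypothetical, introduced here only to illustrate the argument: the same fact set yields different answers once the assumed domain changes.

```python
def eval_unsafe_rule(facts, domain):
    """Evaluate ans(x) <- R(a): x is unconstrained, so it ranges over the whole domain."""
    if ("R", "a") in facts:
        return {("ans", x) for x in domain}
    return set()

facts = {("R", "a")}

# Same database, two different assumed domains (both made up for illustration).
small = eval_unsafe_rule(facts, {"a"})
large = eval_unsafe_rule(facts, {"a", "b", "c"})
```

A safety condition (every head variable occurs in a positive body atom) rules out such rules, so that answers depend on the stored facts only.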

Solution to Exercise 4.4 (4 points)

Give examples of general program rules for which
1. no fixpoint exists at all (Hint: “This sentence is not true”),
2. two minimal fixpoints exist (Hint: “The following sentence is false. The previous sentence is true.”).

Solution
- No fixpoint: p ← ¬p
- Two minimal fixpoints. Unfortunately the hint was wrong (sorry for that). It should have been: “The following sentence is false. The previous sentence is false.”

  q ← ¬p
  p ← ¬q

  Fixpoints: {p} and {q}.
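Both claims can be checked by enumerating all interpretations. The sketch below assumes that "fixpoint" means a fixpoint of the immediate consequence operator T_P; the rule encoding `(head, positive_body, negative_body)` and the function names are illustrative choices, not taken from the lecture.

```python
from itertools import combinations

def consequence(rules, interp):
    """One application of the immediate consequence operator T_P to `interp`."""
    return frozenset(
        head for head, pos, neg in rules
        if all(a in interp for a in pos) and all(a not in interp for a in neg)
    )

def fixpoints(rules, atoms):
    """All interpretations I over `atoms` with T_P(I) = I, found by enumeration."""
    atoms = sorted(atoms)
    candidates = [frozenset(c) for k in range(len(atoms) + 1)
                  for c in combinations(atoms, k)]
    return [i for i in candidates if consequence(rules, i) == i]

liar = [("p", [], ["p"])]                       # p <- not p
pair = [("q", [], ["p"]), ("p", [], ["q"])]     # q <- not p ; p <- not q

liar_fps = fixpoints(liar, {"p"})
pair_fps = fixpoints(pair, {"p", "q"})
```

For the first program no interpretation is reproduced by T_P; for the second, exactly {p} and {q} are, and neither is contained in the other, so both are minimal.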


Data Exchange: Motivation

References

- M. Arenas, P. Barceló, L. Libkin, and F. Murlak. Foundations of Data Exchange. Cambridge University Press, 2014.
- M. Arenas. Slides to “Data Exchange in the Relational and RDF Worlds”, Fifth Workshop on Semantic Web Information Management, 2011.

Data Exchange History

- Much research in the DB community
- Incorporated into IBM Clio
- Formal treatment starts with the 2003 paper by Fagin and colleagues

Lit: R. Fagin, L. M. Haas, M. Hernández, R. J. Miller, L. Popa, and Y. Velegrakis. Clio: Schema Mapping Creation and Data Exchange. In: Conceptual Modeling: Foundations and Applications, pages 198–236. Springer-Verlag, Berlin, Heidelberg, 2009.

Lit: R. Fagin et al. Data exchange: Semantics and query answering. In: Database Theory - ICDT 2003, Proceedings, volume 2572 of LNCS, pages 207–224. Springer, 2003.

Semantic Integration

- Data exchange is a form of semantic integration
- Research area semantic integration (SI): Deals with issues related to ensuring interoperability of possibly heterogeneous data sources.
- Lectures 5 and 6: Data Exchange: Directed DB-level SI for source and target DB
- Following lectures
  - OBDA: Bridging the DB and ontology world
  - Ontology-level integration

Data Exchange (DE)

- DE deals in a specific way with the integration of DBs
- Heterogeneity: Two DBs on the same domain but with different schemata, σ (source) and τ (target)
- Interoperability: Relationship specifications M_στ for σ and τ
- Relevant service: Query answering over τ
- Challenges
  - Consistency: Is there a corresponding τ instance for a given σ instance?
  - Materialization: If yes, then construct and materialize one instance for τ
  - Query answering: Answer query on this instance (using rewriting)
  - How does one construct/maintain mappings?


Relational DE

- Going to deal mainly with relational DBs
- Language for specifying M_στ: Specific FOL formulas called tuple-generating dependencies (tgds)
- Allow for constraints on the target schema (such as foreign keys)
- Explicate criteria for goodness of solutions by the notions of universal model and core
- Query answering w.r.t. certain answer semantics and using rewriting

Running Example: Flight Domain

Source schema σ
  Geo(city, coun, pop)
  Flight(src, dest, airl, dep)

Target schema τ
  Routes(fno, src, dest)
  Info(fno, dep, arr, airl)
  Serves(airl, city, coun, phone)

- Instead of changing the source schema σ, invent own (target) schema τ
- Query over target schema

Running Example: Flight Domain

Source schema σ
  Geo(city, coun, pop)
  Flight(src, dest, airl, dep)

Target schema τ
  Routes(fno, src, dest)
  Info(fno, dep, arr, airl)
  Serves(airl, city, coun, phone)

- Find “corresponding” τ DB instances for given σ instances
- Correspondence ensured by mapping rules M_στ
  1. Flight(src, dest, airl, dep) → ∃fno ∃arr (Routes(fno, src, dest) ∧ Info(fno, dep, arr, airl))
  2. Flight(city, dest, airl, dep) ∧ Geo(city, coun, pop) → ∃phone Serves(airl, city, coun, phone)
  3. Flight(src, city, airl, dep) ∧ Geo(city, coun, pop) → ∃phone Serves(airl, city, coun, phone)


Running Example: Flight Domain

Source schema σ
  Geo(city, coun, pop)
  Flight(src, dest, airl, dep)
    (paris, sant., airFr, 2320)

Target schema τ
  Routes(fno, src, dest)
    (⊥₁, paris, sant.)
  Info(fno, dep, arr, airl)
    (⊥₁, 2320, ⊥₂, airFr)
  Serves(airl, city, coun, phone)

- Find “corresponding” τ DB instances for given σ instances
- Correspondence ensured by mapping rules M_στ
  1. Flight(src, dest, airl, dep) → ∃fno ∃arr (Routes(fno, src, dest) ∧ Info(fno, dep, arr, airl))
  2. Flight(city, dest, airl, dep) ∧ Geo(city, coun, pop) → ∃phone Serves(airl, city, coun, phone)
  3. Flight(src, city, airl, dep) ∧ Geo(city, coun, pop) → ∃phone Serves(airl, city, coun, phone)

Running Example: Flight Domain

Source schema σ
  Geo(city, coun, pop)
  Flight(src, dest, airl, dep)
    (paris, sant., airFr, 2320)

Target schema τ
  Routes(fno, src, dest)
    (⊥₁, paris, sant.)
  Info(fno, dep, arr, airl)
    (⊥₁, 2320, ⊥₂, airFr)
  Serves(airl, city, coun, phone)

- σ-instance: S = {Flight(paris, sant, airFr, 2320)}
- τ solution: T = {Routes(⊥₁, paris, sant), Info(⊥₁, 2320, ⊥₂, airFr)}
- In general there may be more than one solution: T′ = {Routes(123, paris, sant), Info(123, 2320, ⊥₂, airFr)}
- Have to answer queries w.r.t. all solutions: certain answers
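How the solution T arises from S can be sketched in a few lines. The code below hard-codes mapping rule 1 only; the function name `apply_rule_1` and the encoding of nulls as strings `"N1"`, `"N2"` (standing for ⊥₁, ⊥₂) are assumptions made for this illustration.

```python
from itertools import count

def apply_rule_1(flights):
    """Apply rule 1 to every Flight tuple, inventing fresh nulls for fno and arr."""
    fresh = count(1)
    routes, info = set(), set()
    for src, dest, airl, dep in sorted(flights):
        fno = f"N{next(fresh)}"   # labelled null for the existential variable fno
        arr = f"N{next(fresh)}"   # labelled null for the existential variable arr
        routes.add((fno, src, dest))
        info.add((fno, dep, arr, airl))
    return routes, info

# Source instance S from the slide.
S = {("paris", "sant", "airFr", 2320)}
routes, info = apply_rule_1(S)
```

The same null is used for fno in both generated tuples, which is what ties the Routes fact to its Info fact.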

Running Example: Flight Domain

Source schema σ
  Geo(city, coun, pop)
  Flight(src, dest, airl, dep)

Target schema τ
  Routes(fno, src, dest)
  Info(fno, dep, arr, airl)
  Serves(airl, city, coun, phone)

- σ-instance: S = {Flight(paris, sant, airFr, 2320)}
- Boolean query Q1 = ∃fno Routes(fno, paris, sant)
  - Certain answer is yes, because in all solutions there is a route from Paris to Santiago
- Boolean query Q2 = Routes(123, paris, sant)
  - Certain answer is no
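The two queries can be evaluated on the two solutions T and T′ given earlier. This is only a partial illustration: certain answers quantify over all solutions, and the sketch checks just these two. For Q2 a single counterexample (T) already settles the answer "no"; for Q1 the check is a necessary condition, not a proof. The dictionary encoding and the null names `"N1"`, `"N2"` are assumptions of this sketch.

```python
def q1(solution):
    """Q1 = exists fno. Routes(fno, paris, sant)."""
    return any(src == "paris" and dest == "sant" for _, src, dest in solution["Routes"])

def q2(solution):
    """Q2 = Routes(123, paris, sant)."""
    return (123, "paris", "sant") in solution["Routes"]

def holds_in_all(query, solutions):
    """A Boolean query can only be certain if it holds in every given solution."""
    return all(query(sol) for sol in solutions)

T = {"Routes": {("N1", "paris", "sant")}, "Info": {("N1", 2320, "N2", "airFr")}}
T_prime = {"Routes": {(123, "paris", "sant")}, "Info": {(123, 2320, "N2", "airFr")}}

q1_holds = holds_in_all(q1, [T, T_prime])
q2_holds = holds_in_all(q2, [T, T_prime])
```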

Relational Mappings

- Going to deal mainly with relational mappings
- Relational DBs (Codd 1970) are very successful and still highly relevant
- There were other opinions...

“Some of the ideas presented in the paper are interesting and may be of some use, but, in general, this very preliminary work fails to make a convincing point as to their implementation, performance, and practical usefulness. The paper’s general point is that the tabular form presented should be suitable for general data access, but I see two problems with this statement: expressivity and efficiency. [...] The formalism is needlessly complex and mathematical, using concepts and notation with which the average data bank practitioner is unfamiliar.” Cited according to (Santini 2005)

Lit: E. F. Codd. A relational model of data for large shared data banks. Commun. ACM, 13(6):377–387, June 1970.

Lit: S. Santini. We are sorry to inform you ... Computer, December 2005.

Relational Mappings Formally

Definition
A relational mapping M is a tuple of the form

M = (σ, τ, M_στ, M_τ)

where
- σ is the source schema
- τ is the target schema, with all relation symbols different from those in σ
- M_στ is a finite set of FOL formulae over σ ∪ τ called source-to-target dependencies
- M_τ is a set of constraints on the target schema called target dependencies

DB Instances of Schemata

- Schemata are relational signatures
- Concrete database instance
  - For a given schema σ, a concrete DB instance is a σ FOL structure with active domain
  - Active domain: Domain contains all and only the individuals (also called constants) occurring in relations
  - Usually: All source instances are concrete DBs
- Generalized DB instances
  - For some attributes in the target schema (example: flight number fno) no corresponding attribute in the source may exist
  - Next to constants CONST allow a disjoint set of marked NULLs, denoted VAR
  - A generalized DB instance may contain elements from CONST ∪ VAR


Source-Target-Dependencies M_στ

- Source-target dependencies may be arbitrary FOL formulae
- But usually they have a simple directed form
  - due to decidability
- Here: source-to-target tuple-generating dependencies (st-tgds)

Definition
A source-to-target tuple-generating dependency (st-tgd) is a FOL formula of the form

∀x⃗ ∀y⃗ (φ_σ(x⃗, y⃗) → ∃z⃗ ψ_τ(x⃗, z⃗))

where
- φ_σ is a conjunction of atoms over the source schema σ
- ψ_τ is a conjunction of atoms over the target schema τ


Wake-Up Question

Are st-tgds Datalog rules?

- No, as Datalog rules do not allow existentials in the head of the rule
- But there is the extended logic called Datalog+/−

Lit: A. Calì, G. Gottlob, and T. Lukasiewicz. Datalog+/-: A unified approach to ontologies and integrity constraints. In Proceedings of the 12th International Conference on Database Theory, pages 14–30. ACM Press, 2009.

Target Dependencies M_τ

- Constraints on the target schema are well-known constraints from classical DB theory
- Two different types of dependencies are sufficiently general to capture these constraints

Definition
A tuple-generating dependency (tgd) is a FOL formula of the form

∀x⃗ ∀y⃗ (φ(x⃗, y⃗) → ∃z⃗ ψ(x⃗, z⃗))

where φ, ψ are conjunctions of atoms over τ.

An equality-generating dependency (egd) is a FOL formula of the form

∀x⃗ (φ(x⃗) → x_i = x_j)

where φ(x⃗) is a conjunction of atoms over τ and x_i, x_j occur in x⃗.
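To connect the two definitions with classical constraints, here is how a key and a foreign key could be written on the flight schema. Both constraints are hypothetical examples chosen for illustration; the lecture does not state them as part of M_τ.

```latex
% Hypothetical key constraint (an egd): fno determines src in Routes
\forall f\, \forall s\, \forall d\, \forall s'\, \forall d'\,
  \bigl(\mathit{Routes}(f, s, d) \land \mathit{Routes}(f, s', d') \rightarrow s = s'\bigr)

% Hypothetical foreign key (a tgd): every fno in Info also occurs in Routes
\forall f\, \forall p\, \forall a\, \forall l\,
  \bigl(\mathit{Info}(f, p, a, l) \rightarrow \exists s\, \exists d\; \mathit{Routes}(f, s, d)\bigr)
```

A full key constraint on fno would need one such egd per dependent attribute (here a second one concluding d = d′).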

Semantics: Solutions

Definition
Given: a mapping M and a σ instance S

A τ instance T is called a solution for S under M iff (S, T) satisfies all rules in M_στ (for short: (S, T) ⊨ M_στ) and T satisfies all rules in M_τ.

- (S, T) ⊨ M_στ iff S ∪ T ⊨ M_στ, where
  - S ∪ T is the union of the instances S, T: the structure containing all relations from S and T, with domain the union of the domains of S and T
  - well defined because the schemata are disjoint
- Sol_M(S): solutions for S under M
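For the flight example, the condition (S, T) ⊨ M_στ can be checked directly. The sketch below covers mapping rule 1 only and assumes no target dependencies; the function name `satisfies_rule_1` and the null names `"N1"`, `"N2"` are illustrative.

```python
def satisfies_rule_1(flight, routes, info):
    """(S, T) |= rule 1: each Flight tuple needs a Routes and an Info tuple sharing one fno."""
    return all(
        any((fno, src, dest) in routes
            for (fno, d, arr, a) in info
            if d == dep and a == airl)
        for (src, dest, airl, dep) in flight
    )

flight = {("paris", "sant", "airFr", 2320)}

# T from the running example, with nulls written as strings.
t_ok = satisfies_rule_1(flight,
                        {("N1", "paris", "sant")},
                        {("N1", 2320, "N2", "airFr")})

# The empty target instance violates rule 1.
t_empty = satisfies_rule_1(flight, set(), set())

# Routes and Info exist but do not share a flight number.
t_mismatch = satisfies_rule_1(flight,
                              {("N1", "paris", "sant")},
                              {("N3", 2320, "N2", "airFr")})
```

The third case shows why the existential fno has to be witnessed by one value for both atoms in the conclusion.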

First Key Problem: Existence of Solutions

Problem: SOLEXISTENCE_M
Input: Source instance S
Output: Answer whether there exists a solution for S under M

- Note: M is assumed to be fixed ⟹ data complexity
- This problem is going to be approached with a well-known proof tool: the chase

Trivial Case: No Target Dependencies

- Without target constraints there is always a solution

Proposition
Let M = (σ, τ, M_στ) with M_στ consisting of st-tgds. Then for any source instance S there are infinitely many solutions, and at least one solution can be constructed in polynomial time.

Proof Idea
- For every rule and every tuple a⃗ fulfilling the body, generate facts according to the head (using fresh named nulls for the existentially quantified variables)
- The resulting τ instance T is a solution
- Polynomial: Testing whether a⃗ fulfills the body (a conjunctive query) can be done in polynomial time
- Infinity: From T one can build further solutions by extension


Reminder: Conjunctive Queries (CQs)

- Class of sufficiently expressive and feasible FOL queries of the form

  Q(x⃗) = ∃y⃗ (α_1(x⃗_1, y⃗_1) ∧ ··· ∧ α_n(x⃗_n, y⃗_n))

  where
  - α_i(x⃗_i, y⃗_i) are atomic FOL formulae and
  - x⃗_i are variable vectors among x⃗ and y⃗_i are variable vectors among y⃗
- Corresponds to the SELECT-PROJECT-JOIN fragment of SQL
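The correspondence with select-project-join can be seen in a small sketch. The query and the data are made up for illustration (in particular the second flight and the population figure): Q(airl, coun) = ∃src ∃dest ∃dep ∃pop (Flight(src, dest, airl, dep) ∧ Geo(dest, coun, pop)).

```python
# Hypothetical instance of the source schema.
flight = {("paris", "sant", "airFr", 2320), ("paris", "lima", "lan", 900)}
geo = {("sant", "chile", 5_000_000)}

# Q(airl, coun) = exists src, dest, dep, pop. Flight(src, dest, airl, dep) AND Geo(dest, coun, pop)
answers = {
    (airl, coun)                              # PROJECT onto the free variables
    for (src, dest, airl, dep) in flight      # JOIN: combine tuples of both relations ...
    for (city, coun, pop) in geo
    if dest == city                           # ... SELECT those agreeing on the shared variable
}
```

The flight to lima contributes nothing because no Geo tuple matches its destination; the existentially quantified variables are simply dropped by the projection.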

Reminder: Conjunctive Queries (CQs)

Theorem
- Answering CQs is NP-complete w.r.t. combined complexity (Chandra, Merlin 1977)
- Subsumption test for CQs is NP-complete
- Answering CQs is in AC0 (and thus in P) w.r.t. data complexity

Lit: A. K. Chandra and P. M. Merlin. Optimal implementation of conjunctive queries in relational data bases. In: Proceedings of the Ninth Annual ACM Symposium on Theory of Computing, STOC’77, pages 77–90, New York, NY, USA, 1977. ACM.

Undecidability for General Constraints

Theorem
There is a relational mapping M = (σ, τ, M_στ, M_τ) such that SOLEXISTENCE_M is undecidable.

- As a consequence: Further restrict mapping rules
- But the following chase construction is defined for arbitrary st-tgds


Undecidability for General Constraints

Theorem
There is a relational mapping M = (σ, τ, M_στ, M_τ) such that SOLEXISTENCE_M is undecidable.

Wake-Up Question
As another exercise in reduction, prove the following corollary: There is a relational mapping M = (σ, τ, M_στ) with a single FOL dependency in M_στ s.t. SOLEXISTENCE_M is undecidable.

Proof
- Assume otherwise
- Given M = (σ, τ, M_στ, M_τ),
  - construct M′ = (σ, τ, {χ}) with
  - χ = ⋀(M_στ ∪ M_τ)
- M and M′ have the same solutions, so deciding SOLEXISTENCE_M′ would decide SOLEXISTENCE_M, contradicting the theorem

Existence Proof vs. Construction

- Showing existence ≠ constructing a verifier
- Actually we are going to construct a solution using the chase
- Interesting debate in philosophy of mathematics whether non-constructive proofs are acceptable
- Mathematical Intuitionism: field allowing only constructive proofs
  - truth = provable = constructively provable
  - Classical logical inference rules such as ¬¬A ⊢ A are not allowed
  - Main inventor: L.E.J. Brouwer (1881 to 1966)
    Irony: He has many interesting results in classical (non-constructive) mathematics (Brouwer’s fixpoint theorem)

Chase Construction

- A widely used tool in DB theory
- Original use: Calculating entailments of DB constraints

  Lit: D. Maier, A. O. Mendelzon, and Y. Sagiv. Testing implications of data dependencies. ACM Trans. Database Syst., 4(4):455–469, Dec. 1979.

- General idea
  - Apply tgds as completion/repair rules in a bottom-up strategy
  - until no tgds can be applied anymore
  - Chase construction may fail if one of the egds is violated
- The chase leads to an instance with desirable properties
  - It produces not too many redundant facts
  - Universality

Example (Terminating c(h)ase)

- Source schema σ = {E}; target schema τ = {G, L}
- M_στ = {θ₁} with θ₁: E(x, y) → G(x, y)
- M_τ = {χ₁} with χ₁: G(x, y) → ∃z L(y, z)
- Source instance S = {E(a, b)}

- (S, ∅) (violates θ₁)
- (S, {G(a, b)}) (violates χ₁)
- (S, {G(a, b), L(b, ⊥)}) (termination)
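The repair steps of this example can be reproduced with a small chase procedure. This is a minimal sketch under several assumptions that are not from the lecture: only tgds are handled (no egds), atoms contain variables only, all values are strings, nulls are named `"N1"`, `"N2"`, ..., and a step bound `max_steps` guards against non-termination. The names `matches` and `chase` are illustrative.

```python
from itertools import count

def matches(atoms, facts, binding):
    """Yield each extension of `binding` that maps every atom onto some fact."""
    if not atoms:
        yield binding
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for fact_rel, fact_args in sorted(facts):
        if fact_rel != rel:
            continue
        new = dict(binding)
        if all(new.setdefault(v, c) == c for v, c in zip(args, fact_args)):
            yield from matches(rest, facts, new)

def chase(source, tgds, max_steps=50):
    """Repair one violated tgd per step; return (facts, log) once nothing is violated."""
    facts, fresh, log = set(source), count(1), []
    for _ in range(max_steps):
        # A violation is a body match that cannot be extended to a head match.
        violation = next(
            ((name, head, b)
             for name, body, head in tgds
             for b in matches(body, facts, {})
             if not any(True for _ in matches(head, facts, b))),
            None,
        )
        if violation is None:
            return facts, log
        name, head, env = violation
        for rel, args in head:
            for v in args:
                if v not in env:                 # existential variable: invent a fresh null
                    env[v] = f"N{next(fresh)}"
            facts.add((rel, tuple(env[v] for v in args)))
        log.append(name)
    raise RuntimeError("step bound reached; the chase may not terminate")

theta1 = ("theta1", [("E", ("x", "y"))], [("G", ("x", "y"))])
chi1 = ("chi1", [("G", ("x", "y"))], [("L", ("y", "z"))])
result, log = chase({("E", ("a", "b"))}, [theta1, chi1])
```

The log records which dependency was repaired in each step, mirroring the "violates" annotations of the example; the null `"N1"` plays the role of ⊥.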


Example (Non-terminating c(h)ase)

- Source schema σ = {E}; target schema τ = {G, L}
- M_στ = {θ₁} with θ₁: E(x, y) → G(x, y)
- M_τ = {χ₁, χ₂} with χ₁: G(x, y) → ∃z L(y, z) and χ₂: L(x, y) → ∃z G(y, z)
- Source instance S = {E(a, b)}

- (S, ∅) (violates θ₁)
- (S, {G(a, b)}) (violates χ₁)
- (S, {G(a, b), L(b, ⊥)}) (violates χ₂)
- (S, {G(a, b), L(b, ⊥), G(⊥, ⊥₁)}) (violates χ₁)
- (S, {G(a, b), L(b, ⊥), G(⊥, ⊥₁), L(⊥₁, ⊥₂)}) (violates χ₂)
- . . . (non-termination)
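The unbounded growth of this chase sequence can be made visible with a generator that is hard-wired to the two target dependencies of the example. It is a simulation for illustration only, not a general chase: the function name `chase_steps` and the null names `"N1"`, `"N2"`, `"N3"` (standing for ⊥, ⊥₁, ⊥₂) are assumptions of the sketch.

```python
from itertools import count, islice

def chase_steps():
    """Generate the target instances of the non-terminating chase, one per step."""
    fresh = count(1)
    target = [("G", "a", "b")]          # state after applying theta_1 to E(a, b)
    yield list(target)
    while True:
        rel, _, last = target[-1]
        # chi_1 turns G(x, y) into L(y, null); chi_2 turns L(x, y) into G(y, null)
        target.append(("L" if rel == "G" else "G", last, f"N{next(fresh)}"))
        yield list(target)

# Only a finite prefix can be inspected; the generator itself never stops.
first_four = list(islice(chase_steps(), 4))
```

Each newly added fact ends in a fresh null, which immediately triggers the other dependency, so every instance in the sequence is violated again.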


Example (Non-terminating c(h)ase)

I Source schema σ = {E}; target schema τ = {G , L}I Mστ = { E (x , y)→ G (x , y)︸ ︷︷ ︸

θ1

}

Mτ = { G (x , y)→ ∃z L(y , z)︸ ︷︷ ︸χ1

, L(x , y)→ ∃z G (y , z)︸ ︷︷ ︸χ2

}

I Source instance S = {E (a, b)}

I (S, ∅) (violates θ1)I (S, {G (a, b)}) (violates χ1)I (S, {G (a, b), L(b,⊥)}) (violates χ2 )I (S, {G (a, b), L(b,⊥),G (⊥,⊥1)}) (violates χ1 )I (S, {G (a, b), L(b,⊥),G (⊥,⊥1), L(⊥1,⊥2)}) (violates χ2 )I . . . (non-termination)

54 / 69

Page 55: Özgür L. Özçep Data Exchange 1 - uni-luebeck.de · Özgür L. Özçep INSTITUT FÜR INFORMATIONSSYSTEME Data Exchange 1 Lecture5:Motivation,RelationalDE,Chase 18November,2015

Example (Non-terminating c(h)ase)

I Source schema σ = {E}; target schema τ = {G , L}I Mστ = { E (x , y)→ G (x , y)︸ ︷︷ ︸

θ1

}

Mτ = { G (x , y)→ ∃z L(y , z)︸ ︷︷ ︸χ1

, L(x , y)→ ∃z G (y , z)︸ ︷︷ ︸χ2

}

I Source instance S = {E (a, b)}

I (S, ∅) (violates θ1)I (S, {G (a, b)}) (violates χ1)I (S, {G (a, b), L(b,⊥)}) (violates χ2 )I (S, {G (a, b), L(b,⊥),G (⊥,⊥1)}) (violates χ1 )I (S, {G (a, b), L(b,⊥),G (⊥,⊥1), L(⊥1,⊥2)}) (violates χ2 )I . . . (non-termination)

55 / 69

Page 56: Özgür L. Özçep Data Exchange 1 - uni-luebeck.de · Özgür L. Özçep INSTITUT FÜR INFORMATIONSSYSTEME Data Exchange 1 Lecture5:Motivation,RelationalDE,Chase 18November,2015

Example (Non-terminating c(h)ase)

I Source schema σ = {E}; target schema τ = {G , L}I Mστ = { E (x , y)→ G (x , y)︸ ︷︷ ︸

θ1

}

Mτ = { G (x , y)→ ∃z L(y , z)︸ ︷︷ ︸χ1

, L(x , y)→ ∃z G (y , z)︸ ︷︷ ︸χ2

}

I Source instance S = {E (a, b)}

I (S, ∅) (violates θ1)I (S, {G (a, b)}) (violates χ1)I (S, {G (a, b), L(b,⊥)}) (violates χ2 )I (S, {G (a, b), L(b,⊥),G (⊥,⊥1)}) (violates χ1 )I (S, {G (a, b), L(b,⊥),G (⊥,⊥1), L(⊥1,⊥2)}) (violates χ2 )I . . . (non-termination)

56 / 69

Page 57: Özgür L. Özçep Data Exchange 1 - uni-luebeck.de · Özgür L. Özçep INSTITUT FÜR INFORMATIONSSYSTEME Data Exchange 1 Lecture5:Motivation,RelationalDE,Chase 18November,2015

Chase Definition

I Let S be a σ instance and dom(S) its domain

Definition (Chase steps)

S ⇝_{χ,ā} S′ iff

1. χ is a tgd of the form φ(x̄) → ∃ȳ ψ(x̄, ȳ) and
I S ⊨ φ(ā) for some elements ā from dom(S)
I S′ extends S with all atoms occurring in ψ(ā, ⊥̄), where ⊥̄ is a tuple of fresh nulls.

2. or χ is an egd of the form φ(x̄) → x_i = x_j and
I S ⊨ φ(ā) for some elements ā from dom(S) with a_i ≠ a_j and
I either a_i is a constant or a null, a_j is a null and S′ = S[a_j/a_i] (a_j is replaced by a_i),
I or a_j is a constant, a_i is a null and S′ = S[a_i/a_j] (a_i is replaced by a_j).

S ⇝_{χ,ā} fail iff

I χ is an egd of the form φ(x̄) → x_i = x_j,
I S ⊨ φ(ā) for some elements ā from dom(S) with a_i ≠ a_j
I and both a_i, a_j are constants.
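The egd case and the failure case can be sketched in a few lines. This is an illustration under assumptions, not part of the lecture: nulls are modelled by a small `Null` class, every other value counts as a constant, and the function receives the two values a_i and a_j that a body match has already produced (finding the match is left out).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Null:
    """A labelled null; any other value in a fact is treated as a constant."""
    label: int


FAIL = "fail"


def substitute(facts, old, new):
    """Return the instance in which every occurrence of old is replaced by new."""
    return {
        (pred, *(new if arg == old else arg for arg in args))
        for pred, *args in facts
    }


def egd_step(facts, a_i, a_j):
    """One chase step for an egd whose body matched with values a_i and a_j."""
    if a_i == a_j:
        return facts                         # egd already satisfied, no step
    if isinstance(a_j, Null):
        return substitute(facts, a_j, a_i)   # a_i is a constant or a null
    if isinstance(a_i, Null):
        return substitute(facts, a_i, a_j)   # a_j is a constant
    return FAIL                              # two distinct constants


# Hypothetical egd: P(x, y) and P(x, y') imply y = y'
merged = egd_step({("P", "a", Null(0)), ("P", "a", "b")}, Null(0), "b")
clash = egd_step({("P", "a", "b"), ("P", "a", "c")}, "b", "c")
```

In the first call the null is replaced by the constant b, so the two facts collapse into one; in the second call both values are constants and the step yields fail.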

57 / 69

Chase

Definition
A chase sequence for S under M is a sequence of chase steps S_i ⇝_{χ_i,ā_i} S_{i+1} such that

I S_0 = S
I each χ_i is in M
I for each distinct i, j also (χ_i, ā_i) ≠ (χ_j, ā_j)

For a finite chase sequence the last instance is called its result.

I If the result is fail, then the sequence is said to be a failing sequence.
I If no further dependency from M can be applied to a result, then the sequence is called successful.

61 / 69

Indeterminism

I Indeterminism regarding choice of nulls (no problem)
I Indeterminism regarding order of chosen tgds and egds

This may lead to different chase results

62 / 69

Use of Chases in Data Exchange

I A chase sequence for S under M is a chase sequence for (S, ∅) under Mστ ∪ Mτ

I If (S, T) is the result of a finite sequence, call just T the result

I Chase is the right tool for finding solutions

Proposition

Given M and a source instance S.
I If there is a successful chase sequence for S with result T, then T is a solution.
I If there is a failing chase sequence for S, then S has no solution.

I The proposition does not cover all cases: the chase may not terminate
I In this case there may still be a solution

64 / 69

Weak Acyclicity

I In order to guarantee termination restrict target constraints
I Reason for non-termination: generation of new nulls with same dependencies

Example (Cycle in Dependencies)

I χ1 = G(x, y) → ∃z L(y, z)
I χ2 = L(x, y) → ∃z G(y, z)

Possible infinite generation

G(a, b) ⇝_{χ1} L(b, ⊥1) ⇝_{χ2} G(⊥1, ⊥2) ⇝_{χ1} L(⊥2, ⊥3) . . .

I Problem caused by cycle in dependencies

65 / 69

Simple Dependency Graphs

I Nodes: pairs (R, i) of predicate R and argument-position i
I Edges: From (Rb, i) to (Rh, j) iff there is a tgd such that
1. Rh occurs in the head and Rb occurs in the body and
2. either the variable x in i-position in Rb occurs in j-position in Rh
3. or the variable in j-position in Rh is existentially quantified

Example (Simple Dependency Graph)

I χ1 = G(x, y) → ∃z L(y, z)
I χ2 = L(x, y) → ∃z G(y, z)

[Figure: simple dependency graph on the nodes (G,1), (G,2), (L,1), (L,2)]

A set of tgds is called acyclic if its simple dependency graph is acyclic.

67 / 69

Dependency Graphs

I Nodes: pairs (R, i) of predicate R and argument-position i
I Edges: From (Rb, i) to (Rh, j) iff there is a tgd such that
1. Rh occurs in the head and Rb occurs in the body and
2. either the variable x in i-position in Rb occurs in j-position in Rh
3. or the variable in j-position in Rh is existentially quantified; these edges are labelled by *

Example (Dependency Graph)

I χ1 = G(x, y) → ∃z L(y, z)
I χ2 = L(x, y) → ∃z G(y, z)

[Figure: dependency graph on the nodes (G,1), (G,2), (L,1), (L,2); some edges are labelled *]

A set of tgds is called weakly acyclic if its dependency graph has no cycle with a * edge.
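Both acyclicity tests can be checked mechanically. The following is a minimal sketch under assumptions: a tgd is encoded as a triple (body atoms, head atoms, existential variables), which is an ad hoc representation chosen here, and the edge rule is read exactly as worded above, so a starred edge starts at every body position. The function names are hypothetical.

```python
# An atom is (predicate, (variable, ...)); positions are 1-based as in (R, i).
CHI1 = ([("G", ("x", "y"))], [("L", ("y", "z"))], {"z"})
CHI2 = ([("L", ("x", "y"))], [("G", ("y", "z"))], {"z"})


def dependency_graph(tgds):
    """Return the set of edges (source, target, starred)."""
    edges = set()
    for body, head, existential in tgds:
        for rb, body_vars in body:
            for i, x in enumerate(body_vars, start=1):
                for rh, head_vars in head:
                    for j, v in enumerate(head_vars, start=1):
                        if v in existential:
                            edges.add(((rb, i), (rh, j), True))    # * edge
                        elif v == x:
                            edges.add(((rb, i), (rh, j), False))
    return edges


def reachable(edges, start):
    """All nodes reachable from start via one or more edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for src, dst, _ in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def is_acyclic(tgds):
    """True iff the simple dependency graph (labels ignored) has no cycle."""
    edges = dependency_graph(tgds)
    return all(src not in reachable(edges, src) for src, _, _ in edges)


def is_weakly_acyclic(tgds):
    """True iff no cycle of the dependency graph contains a * edge."""
    edges = dependency_graph(tgds)
    return not any(
        starred and (src == dst or src in reachable(edges, dst))
        for src, dst, starred in edges
    )
```

For the example, `is_weakly_acyclic([CHI1, CHI2])` is false because the starred edges between (G,2) and (L,2) form a cycle, while χ1 on its own passes both tests. A hypothetical tgd E(x, y) → E(y, x) has a cycle but no * edge, so it is weakly acyclic without being acyclic.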

68 / 69

Termination for weakly acyclic tgds

Theorem
Let M = (σ, τ, Mστ, Mτ) be a mapping where Mτ is the union of egds and weakly acyclic tgds. Then the length of every chase sequence for a source S is polynomially bounded w.r.t. the size of S.

I In particular: Every chase sequence terminates
I Moreover: SOLEXISTENCE_M can be solved in polynomial time
I a solution can be constructed in polynomial time

69 / 69

