Data Exchange over RDF

Date post: 11-May-2015
Upload: net2-project
Data Exchange over RDF. Andrés Letelier. Advisor: Marcelo Arenas. Pontificia Universidad Católica de Chile. September 1, 2011
Transcript
Page 1: Data Exchange over RDF

Data Exchange over RDF

Andrés Letelier

Advisor: Marcelo Arenas

Pontificia Universidad Católica de Chile

September 1, 2011

Page 2: Data Exchange over RDF

What is data exchange?

Problem: Data under a source schema S needs to be restructured and translated into a target schema T.

$S \longrightarrow T$

$I_S \longrightarrow I_T$

Page 3: Data Exchange over RDF

Schema mappings

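Question: Which source instances correspond to which target instances?

Answer: Schema mappings:

$\mathcal{M} \subseteq \mathrm{Instances}(S) \times \mathrm{Instances}(T)$

Usually, schema mappings are defined as $\mathcal{M} = (S, T, \Sigma_{ST})$, where $\Sigma_{ST}$ is a set of dependencies relating S and T.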

Page 4: Data Exchange over RDF

Definition (Solution)

$I_2$ is a solution of $I_1$ under $\mathcal{M}$ iff $(I_1, I_2) \in \mathcal{M}$.

The set of all solutions for $I_1$ under $\mathcal{M}$ is denoted by $\mathrm{Sol}_{\mathcal{M}}(I_1)$.

Page 5: Data Exchange over RDF

Resource Description Framework (RDF)

- Data model for representing information about World Wide Web resources
- W3C Recommendation (1999)
- Part of the Semantic Web stack
- Directed, labeled graphs
- Blank nodes (labeled nulls)
- Basically, sets of triples (s, p, o)

Page 6: Data Exchange over RDF

Example

D = {
  (B1, name, paul),
  (B1, email, [email protected]),
  (B2, name, john),
  (B2, city, Liverpool)
}
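Since an RDF graph is basically a set of triples, the example dataset can be sketched as a plain Python set. The `_:` prefix for blank nodes, the helper names, and the placeholder address `paul@example.org` are assumptions of this sketch and do not come from the slides.

```python
# Example dataset D as a set of (subject, predicate, object) triples.
# Assumption: blank nodes are strings starting with "_:".
D = {
    ("_:B1", "name", "paul"),
    ("_:B1", "email", "paul@example.org"),  # placeholder address (assumption)
    ("_:B2", "name", "john"),
    ("_:B2", "city", "Liverpool"),
}


def is_blank(term):
    """True if the term is a blank node (labeled null)."""
    return term.startswith("_:")


def subjects(graph):
    """All nodes that appear in subject position."""
    return {s for (s, p, o) in graph}


def outgoing(graph, node):
    """Labeled edges (predicate, object) leaving `node` in the graph."""
    return {(p, o) for (s, p, o) in graph if s == node}
```

Viewed this way, each triple is a directed edge from subject to object labeled by the predicate, and `outgoing(D, "_:B2")` lists the edges leaving the second blank node.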

Page 7: Data Exchange over RDF

SPARQL (pronounced “sparkle”)

- Query language for RDF
- W3C Recommendation (2008)
- Standard for querying RDF datasets
- Returns sets of partial mappings
- Operators:
  - Projection
  - AND (inner join)
  - OPT (left join)
  - FILTER
  - UNION
  - and more

Page 8: Data Exchange over RDF

Example

$P_1$ = (?X, name, ?Y)

$\llbracket P_1 \rrbracket_D$ =

?X    ?Y
B1    paul
B2    john
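The evaluation of a single triple pattern can be sketched as follows: every triple of the graph that matches the pattern yields one partial mapping from variables to terms. The dictionary representation of mappings, the `_:` prefix for blank nodes, and the placeholder e-mail address are assumptions made for illustration.

```python
# Example dataset D; "_:" marks blank nodes, the address is a placeholder.
D = {
    ("_:B1", "name", "paul"),
    ("_:B1", "email", "paul@example.org"),
    ("_:B2", "name", "john"),
    ("_:B2", "city", "Liverpool"),
}


def is_var(term):
    return term.startswith("?")


def eval_triple(pattern, graph):
    """Return the mappings (dicts) that send `pattern` onto a triple of `graph`."""
    results = []
    for triple in graph:
        mu = {}
        for p_term, g_term in zip(pattern, triple):
            if is_var(p_term):
                # setdefault keeps an earlier binding, so a repeated
                # variable has to match the same term again.
                if mu.setdefault(p_term, g_term) != g_term:
                    break
            elif p_term != g_term:
                break
        else:
            # No break: the triple matches the pattern.
            if mu not in results:
                results.append(mu)
    return results


P1 = ("?X", "name", "?Y")
answers = sorted(eval_triple(P1, D), key=lambda m: m["?X"])
for mu in answers:
    print(mu)
```

The two printed mappings correspond to the two rows of the table above.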

Page 9: Data Exchange over RDF

Example

$P_2$ = (?X, name, ?Y) AND (?X, email, ?Z)

$\llbracket P_2 \rrbracket_D$ =

?X    ?Y    ?Z
B1    paul  [email protected]

Page 10: Data Exchange over RDF

Example

$P_3$ = (?X, name, ?Y) OPT (?X, email, ?Z)

$\llbracket P_3 \rrbracket_D$ =

?X    ?Y    ?Z
B1    paul  [email protected]
B2    john
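The difference between AND and OPT in the last two examples can be reproduced with a small evaluator. This is a minimal sketch of the usual join and left-outer-join semantics over sets of mappings. The nested-tuple encoding of patterns, the function names, and the placeholder e-mail address are assumptions of the sketch.

```python
D = {
    ("_:B1", "name", "paul"),
    ("_:B1", "email", "paul@example.org"),  # placeholder address
    ("_:B2", "name", "john"),
    ("_:B2", "city", "Liverpool"),
}


def eval_triple(t, graph):
    """Mappings that send triple pattern `t` onto some triple of `graph`."""
    out = []
    for triple in graph:
        mu = {}
        if all((mu.setdefault(p, g) == g) if p.startswith("?") else p == g
               for p, g in zip(t, triple)) and mu not in out:
            out.append(mu)
    return out


def compatible(m1, m2):
    """Two mappings are compatible if they agree on their shared variables."""
    return all(m2[v] == val for v, val in m1.items() if v in m2)


def join(o1, o2):
    """AND: merge every compatible pair of mappings."""
    out = []
    for a in o1:
        for b in o2:
            if compatible(a, b) and {**a, **b} not in out:
                out.append({**a, **b})
    return out


def left_join(o1, o2):
    """OPT: the join, plus left mappings that have no compatible partner."""
    unmatched = [a for a in o1 if not any(compatible(a, b) for b in o2)]
    return join(o1, o2) + unmatched


def evaluate(p, graph):
    """Patterns are triples or nested tuples ("AND" | "OPT", P1, P2)."""
    if p[0] == "AND":
        return join(evaluate(p[1], graph), evaluate(p[2], graph))
    if p[0] == "OPT":
        return left_join(evaluate(p[1], graph), evaluate(p[2], graph))
    return eval_triple(p, graph)


name = ("?X", "name", "?Y")
email = ("?X", "email", "?Z")
result_and = evaluate(("AND", name, email), D)
result_opt = sorted(evaluate(("OPT", name, email), D), key=lambda m: m["?X"])
```

With AND, only the mapping for B1 survives. With OPT, the mapping for B2 is kept with `?Z` left unbound, which is why the answers are partial mappings.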

Page 11: Data Exchange over RDF

Well-designed SPARQL patterns

Definition (Well-designed patterns)

A pattern P is well-designed if, for every subpattern $P'$ of the form ($P_1$ OPT $P_2$), every variable that appears in $P_2$ and outside $P'$ also appears in $P_1$.

Example

- (?X, name, ?Y) OPT ((?X, email, ?Z) OPT (?X, city, ?A)) is well-designed
- (?X, name, ?Y) OPT ((?W, email, ?Z) OPT (?X, city, ?A)) is not: ?X appears in (?X, city, ?A) and outside the inner OPT, but not in (?W, email, ?Z)
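The definition can be checked mechanically. The sketch below walks a pattern while tracking the variables that occur outside the current subpattern. It covers only the AND/OPT fragment used in these slides, and the nested-tuple encoding and function names are assumptions of the sketch.

```python
def variables(p):
    """Variables mentioned in a pattern (triple or ("AND"|"OPT", P1, P2))."""
    if p[0] in ("AND", "OPT"):
        return variables(p[1]) | variables(p[2])
    return {t for t in p if t.startswith("?")}


def well_designed(p, outside=frozenset()):
    """Check the well-designedness condition for every OPT subpattern.

    `outside` holds the variables occurring in the whole pattern
    outside the subpattern `p` currently being inspected.
    """
    if p[0] not in ("AND", "OPT"):
        return True  # a single triple pattern has no OPT subpattern
    left, right = p[1], p[2]
    if p[0] == "OPT":
        # Every variable of P2 that also occurs outside (P1 OPT P2)
        # has to occur in P1.
        if any(v in outside and v not in variables(left)
               for v in variables(right)):
            return False
    return (well_designed(left, outside | variables(right))
            and well_designed(right, outside | variables(left)))


good = ("OPT", ("?X", "name", "?Y"),
        ("OPT", ("?X", "email", "?Z"), ("?X", "city", "?A")))
bad = ("OPT", ("?X", "name", "?Y"),
       ("OPT", ("?W", "email", "?Z"), ("?X", "city", "?A")))

print(well_designed(good), well_designed(bad))
```

The two patterns are the ones from the example above. The second fails at the inner OPT, where ?X is in `outside` but missing from the left operand.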

Page 12: Data Exchange over RDF

Data Exchange over RDF

- S and T are fixed to be RDF triples
- Tuple generating dependencies have to be redefined
- But first, we need some definitions...

Page 13: Data Exchange over RDF

RDF Tuple Generating Dependencies

Let P be a SPARQL pattern, $\mu_1$ and $\mu_2$ be partial mappings, and $\Omega_1$ and $\Omega_2$ be sets of mappings. Then:

- $\mathrm{var}(P)$ is the set of variables mentioned in P
- $\mathrm{dom}(\mu_1)$ is the domain of $\mu_1$
- A SPARQL SELECT query, denoted by (W, P) where $W \subseteq \mathrm{var}(P)$, is the projection of the evaluation of P onto the variables in W

Page 14: Data Exchange over RDF

RDF Tuple Generating Dependencies

Let P be a SPARQL pattern, $\mu_1$ and $\mu_2$ be partial mappings, and $\Omega_1$ and $\Omega_2$ be sets of mappings. Then:

- $\mu_1$ is subsumed by $\mu_2$ ($\mu_1 \preceq \mu_2$) if
  - $\mathrm{dom}(\mu_1) \subseteq \mathrm{dom}(\mu_2)$,
  - for every ?X in $\mathrm{dom}(\mu_1)$ that is not bound to a blank node, $\mu_1(?X) = \mu_2(?X)$, and
  - for every pair of variables ?X and ?Y in $\mathrm{dom}(\mu_1)$ such that $\mu_1(?X) = \mu_1(?Y)$, it is the case that $\mu_2(?X) = \mu_2(?Y)$.
- $\Omega_1$ is subsumed by $\Omega_2$ ($\Omega_1 \sqsubseteq \Omega_2$) if for every mapping $\mu_1$ in $\Omega_1$ there exists a mapping $\mu_2$ in $\Omega_2$ such that $\mu_1 \preceq \mu_2$.
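The two subsumption relations translate almost literally into code. In this sketch mappings are dictionaries and blank nodes are strings starting with `_:`. Both choices, like the function names, are assumptions made for illustration.

```python
from itertools import combinations


def is_blank(term):
    return term.startswith("_:")


def subsumed(m1, m2):
    """Mapping-level subsumption: m1 is subsumed by m2."""
    # 1. dom(m1) is contained in dom(m2)
    if not set(m1) <= set(m2):
        return False
    # 2. variables not bound to a blank node keep their value
    if any(m1[v] != m2[v] for v in m1 if not is_blank(m1[v])):
        return False
    # 3. variables that are equal in m1 stay equal in m2
    return all(m2[x] == m2[y]
               for x, y in combinations(m1, 2) if m1[x] == m1[y])


def set_subsumed(o1, o2):
    """Set-level subsumption: every mapping of o1 is subsumed by one of o2."""
    return all(any(subsumed(a, b) for b in o2) for a in o1)


print(subsumed({"?X": "1"}, {"?X": "1", "?Y": "2"}))
print(subsumed({"?X": "_:n", "?Y": "_:n"}, {"?X": "a", "?Y": "b"}))
```

The second call shows why the third condition is needed: a blank node may be replaced by any value, but two variables that share a blank node must still agree afterwards.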

Page 15: Data Exchange over RDF

RDF Tuple Generating Dependencies

(Re)Definition (Tuple Generating Dependencies)

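Let $P_1$ and $P_2$ be SPARQL patterns, and $W \subset \mathrm{var}(P_1) \cap \mathrm{var}(P_2)$. An RDF tgd is a sentence of the form

$(W, P_1) \rightarrow (W, P_2)$

Given two RDF graphs $G_1$ and $G_2$, and a set of tgds $\Sigma$, $(G_1, G_2) \models \Sigma$ if for every tgd $(W, P_1) \rightarrow (W, P_2)$ in $\Sigma$ it is the case that $\llbracket (W, P_1) \rrbracket_{G_1} \sqsubseteq \llbracket (W, P_2) \rrbracket_{G_2}$.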
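Putting evaluation, projection, and subsumption together gives a direct test for $(G_1, G_2) \models \Sigma$. The sketch below handles only triple patterns, AND, and OPT. The tgd, the graphs, the `label` predicate, and all function names are hypothetical and chosen only to exercise the definition.

```python
from itertools import combinations


def eval_triple(t, graph):
    out = []
    for triple in graph:
        mu = {}
        if all((mu.setdefault(p, g) == g) if p.startswith("?") else p == g
               for p, g in zip(t, triple)) and mu not in out:
            out.append(mu)
    return out


def compatible(m1, m2):
    return all(m2[v] == val for v, val in m1.items() if v in m2)


def join(o1, o2):
    return [{**a, **b} for a in o1 for b in o2 if compatible(a, b)]


def evaluate(p, graph):
    if p[0] == "AND":
        return join(evaluate(p[1], graph), evaluate(p[2], graph))
    if p[0] == "OPT":
        o1, o2 = evaluate(p[1], graph), evaluate(p[2], graph)
        rest = [a for a in o1 if not any(compatible(a, b) for b in o2)]
        return join(o1, o2) + rest
    return eval_triple(p, graph)


def project(w, omega):
    """SELECT query (W, P): restrict every mapping to the variables in W."""
    out = []
    for m in omega:
        r = {v: m[v] for v in w if v in m}
        if r not in out:
            out.append(r)
    return out


def subsumed(m1, m2):
    return (set(m1) <= set(m2)
            and all(m1[v] == m2[v] for v in m1 if not m1[v].startswith("_:"))
            and all(m2[x] == m2[y] for x, y in combinations(m1, 2)
                    if m1[x] == m1[y]))


def set_subsumed(o1, o2):
    return all(any(subsumed(a, b) for b in o2) for a in o1)


def satisfies(g1, g2, tgds):
    """(G1, G2) |= Sigma, with each tgd given as a triple (W, P1, P2)."""
    return all(set_subsumed(project(w, evaluate(p1, g1)),
                            project(w, evaluate(p2, g2)))
               for (w, p1, p2) in tgds)


# Hypothetical setting: copy every name into a "label" triple.
sigma = [({"?Y"}, ("?X", "name", "?Y"), ("?X", "label", "?Y"))]
G1 = {("_:B1", "name", "paul"), ("_:B2", "name", "john")}
G2 = {("_:c1", "label", "paul"), ("_:c2", "label", "john")}
G2_incomplete = {("_:c1", "label", "paul")}

print(satisfies(G1, G2, sigma), satisfies(G1, G2_incomplete, sigma))
```

Here `G2` is a solution for `G1` under `sigma`, while `G2_incomplete` is not, because the source mapping for john has no counterpart on the target side.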

Page 16: Data Exchange over RDF

RDF Schema Mappings

Since S and T are fixed,

$\mathcal{M} = \Sigma$

$G_2 \in \mathrm{Sol}_{\mathcal{M}}(G_1) \iff (G_1, G_2) \models \Sigma$

Page 17: Data Exchange over RDF

Universal solutions

Example

Let $W = \{?X\}$, $\Sigma$ = (W, (?X, name, ?Y) AND (?X, email, ?Z)) → (W, (?Y, hasmail, ?Z)), and consider the dataset D defined earlier.

Solution 1: $G_2$ = { (paul, hasmail, [email protected]) }

Solution 2: $G_2'$ = { (paul, hasmail, [email protected]), (john, hasmail, n) }

Page 18: Data Exchange over RDF

Universal solutions

Definition: A solution $G_2$ is universal if for every other solution $G_2'$, $G_2 \sqsubseteq G_2'$.

- Solution 1 is universal
- Solution 2 is not

Page 19: Data Exchange over RDF

Universal solutions

Not all settings have universal solutions. Consider $G_1$ = { (1, 2, 3) }, $W = \{?X, ?Y\}$ and

$\Sigma$ = (W, (?X, ?Y, ?Z)) → (W, ((?X, a, b) OPT (?W, b, ?Y)) AND ((?X, c, d) OPT (?Z, d, ?Y)))

Page 20: Data Exchange over RDF

Solution 1: $G_2$ = { (1, a, b), (n1, b, 2), (1, c, d) }

Solution 2: $G_2'$ = { (1, a, b), (n2, d, 2), (1, c, d) }

This setting has no universal solution!

Page 21: Data Exchange over RDF

Good and bad news

Bad news: There is no guarantee that an exchange setting that has a solution will have a universal solution.

Good news: If the heads of all tgds in $\Sigma$ are well-designed and there is a solution, there is always a universal solution.

Better news: We have an algorithm.

Page 22: Data Exchange over RDF

“Chasing” SPARQL queries

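Input: A mapping $\mu$ and a (well-designed) SPARQL pattern P

Output: An RDF graph G such that $\mu \in \llbracket P \rrbracket_G$

Chase($\mu$, $\nu$, P, G), by cases on P:

- t (a triple pattern):
  - add unbound variables in t as fresh blank nodes to $\nu$
  - add $\nu(t)$ to G
- $P_1$ AND $P_2$:
  - Chase($\mu$, $\nu$, $P_1$, G)
  - Chase($\mu$, $\nu$, $P_2$, G)
- $P_1$ OPT $P_2$:
  - Chase($\mu$, $\nu$, $P_1$, G)
  - if $\mathrm{dom}(\mu) \setminus \mathrm{dom}(\nu) \cap \mathrm{var}(P_2) \neq \emptyset$: Chase($\mu$, $\nu$, $P_2$, G)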
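The procedure above can be sketched in a few lines of Python. The slide leaves some details open, so this sketch rests on three stated assumptions: $\nu$ starts empty, a variable of a triple pattern takes its value from $\mu$ when $\mu$ binds it and a fresh blank node otherwise, and the OPT condition is read as $(\mathrm{dom}(\mu) \setminus \mathrm{dom}(\nu)) \cap \mathrm{var}(P_2) \neq \emptyset$. The wrapper `chase_pattern` and the `_:b0`, `_:b1`, ... naming of blank nodes are also hypothetical.

```python
import itertools


def variables(p):
    """Variables mentioned in a pattern (triple or ("AND"|"OPT", P1, P2))."""
    if p[0] in ("AND", "OPT"):
        return variables(p[1]) | variables(p[2])
    return {t for t in p if t.startswith("?")}


def chase(mu, nu, p, graph, fresh):
    """Extend `nu` and `graph` in place so that the pattern `p` is matched."""
    if p[0] == "AND":
        chase(mu, nu, p[1], graph, fresh)
        chase(mu, nu, p[2], graph, fresh)
    elif p[0] == "OPT":
        chase(mu, nu, p[1], graph, fresh)
        # Assumed reading: chase the optional part only if mu still has
        # a variable, not yet covered by nu, that occurs in P2.
        if (set(mu) - set(nu)) & variables(p[2]):
            chase(mu, nu, p[2], graph, fresh)
    else:
        for term in p:
            if term.startswith("?") and term not in nu:
                # Assumption: reuse mu's value when there is one,
                # otherwise invent a fresh blank node.
                nu[term] = mu[term] if term in mu else f"_:b{next(fresh)}"
        # nu(t): replace variables by their values, keep constants.
        graph.add(tuple(nu.get(term, term) for term in p))


def chase_pattern(mu, p):
    """Hypothetical wrapper: chase `p` from an empty nu and an empty graph."""
    nu, graph = {}, set()
    chase(mu, nu, p, graph, itertools.count())
    return nu, graph


P = ("OPT", ("?X", "name", "?Y"), ("?X", "email", "?Z"))
nu_a, g_a = chase_pattern({"?X": "u1", "?Y": "paul"}, P)
nu_b, g_b = chase_pattern({"?X": "u1", "?Y": "paul", "?Z": "m"}, P)
nu_c, g_c = chase_pattern({"?X": "u1"}, ("?X", "knows", "?W"))
```

In the first call the optional part is skipped because $\mu$ binds nothing that only the e-mail triple could supply. In the second call `?Z` forces the optional triple into the graph. The third call shows a fresh blank node being created for the unbound `?W`.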

Page 23: Data Exchange over RDF

After chasing:

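- $\mu \preceq \nu$
- $\nu \in \llbracket P \rrbracket_G$
- $\mu \sqsubseteq \llbracket P \rrbracket_G$
- If we chase the mappings in $\llbracket (W, P_1) \rrbracket_{G_1}$ with every $P_2$ in Heads($\Sigma$), we get a universal solution.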

Page 24: Data Exchange over RDF

Certain answers

Definition (Certain answers on a regular data exchange setting)

The set of certain answers is the intersection of the evaluation of the query over all the valid solutions.

Example

Consider $G_1$ = { (1, 2, 3) } and

(?X, (?X, ?Y, ?Z)) → (?X, (?X, 1, 2) OPT (?X, ?Y, 3))

Page 25: Data Exchange over RDF

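Solution 1: $G_2$ = { (1, 1, 2) }

Solution 2: $G_2'$ = { (1, 1, 2), (1, 2, 3) }

$\llbracket (W, P_2) \rrbracket_{G_2} = \{?X \mapsto 1\}$

$\llbracket (W, P_2) \rrbracket_{G_2'} = \{?X \mapsto 1, ?Y \mapsto 2\}$

The intersection of $\llbracket (W, P_2) \rrbracket_{G_2}$ and $\llbracket (W, P_2) \rrbracket_{G_2'}$ is empty!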

Page 26: Data Exchange over RDF

Certain answers

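Given a pattern P and a set of RDF graphs $\mathcal{G}$, let Lower(P, $\mathcal{G}$) be the set of all lower bounds of $\mathcal{G}$ w.r.t. subsumption, that is, the sets of mappings $\Omega$ such that $\Omega \sqsubseteq \llbracket P \rrbracket_G$ for every G in $\mathcal{G}$.

(Re)Definition (Certain Answers)

The set of certain answers of a set of RDF graphs $\mathcal{G}$ and a SPARQL pattern P is defined as any set of mappings $\Omega^\star$ in Lower(P, $\mathcal{G}$) such that for any other $\Omega$ in Lower(P, $\mathcal{G}$) it is the case that $\Omega \sqsubseteq \Omega^\star$.

Claim: All the possible sets of certain answers to an RDF data exchange setting are homomorphically equivalent.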

Page 27: Data Exchange over RDF

Back in our previous example...

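Solution 1: $G_2$ = { (1, 1, 2) }

Solution 2: $G_2'$ = { (1, 1, 2), (1, 2, 3) }

$\llbracket (W, P_2) \rrbracket_{G_2} = \{?X \mapsto 1\}$

$\llbracket (W, P_2) \rrbracket_{G_2'} = \{?X \mapsto 1, ?Y \mapsto 2\}$

The set of certain answers is now $\{?X \mapsto 1\}$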
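The contrast between the two definitions of certain answers can be checked on this example. The sketch compares a plain set intersection of the two evaluations with a lower-bound test based on subsumption. The dictionary encoding of mappings and the function names are assumptions of the sketch, and the two answer sets are copied from the slide rather than computed.

```python
from itertools import combinations


def subsumed(m1, m2):
    """Mapping-level subsumption, with blank nodes written as "_:...". """
    return (set(m1) <= set(m2)
            and all(m1[v] == m2[v] for v in m1 if not m1[v].startswith("_:"))
            and all(m2[x] == m2[y] for x, y in combinations(m1, 2)
                    if m1[x] == m1[y]))


def set_subsumed(o1, o2):
    return all(any(subsumed(a, b) for b in o2) for a in o1)


def as_set(omega):
    """Hashable view of a set of mappings, for plain set intersection."""
    return {frozenset(m.items()) for m in omega}


def is_lower_bound(candidate, evaluations):
    """True if `candidate` is subsumed by every evaluation in the list."""
    return all(set_subsumed(candidate, omega) for omega in evaluations)


# Evaluations over the two solutions, as given on the slide.
omega_g2 = [{"?X": "1"}]
omega_g2_prime = [{"?X": "1", "?Y": "2"}]
evaluations = [omega_g2, omega_g2_prime]

plain_intersection = as_set(omega_g2) & as_set(omega_g2_prime)
small = [{"?X": "1"}]
large = [{"?X": "1", "?Y": "2"}]

print(plain_intersection)
print(is_lower_bound(small, evaluations), is_lower_bound(large, evaluations))
```

The intersection is empty because the two mappings differ as sets of bindings, whereas the mapping that binds only `?X` is subsumed by both evaluations and is therefore a lower bound. The larger mapping is not, since it is not subsumed by the evaluation over $G_2$.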

Page 28: Data Exchange over RDF

In conclusion...

Our contributions so far:

- RDF and SPARQL TGDs
- RDF schema mappings
- Universal solutions
- Materialization of universal solutions
- Certain answers

Page 29: Data Exchange over RDF

In conclusion...

To do:

- Prove remaining claims
- Query answering (using universal solutions)
- Incomplete information in the source instance
- Knowledge exchange over RDF

Page 30: Data Exchange over RDF

Thank you for listening

Any questions?

