Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues

By A. Kementsietsidis, M. Arenas and R.J. Miller

Presented by

Md. Anisur Rahman: 3558643
Anahit Martirosyan: 100628480
LianXiang Qiu: 3603336

University of Ottawa, Winter 2004

Outline

P2P Data-Sharing System
Mapping Table
Alternative Semantics for Mapping Tables
Mapping Tables as Constraints
An algorithm for checking the consistency of the existing mappings and inferring new mappings from them
Conclusion and Future Work

Peer-to-Peer Data-Sharing System

What is a Mapping Table?

Mapping table:

GDB_id   SwissProt_id
G1       P9
G1       Q62
G2       P40
G3       P38

Relation GDB:

GDB_id   Gene_Name
G1       NF1
G2       NF2
G3       NGFB

Relation SwissProt:

SwissProt_id   Protein_name
P9             NF1
P40            MERL

A mapping table m from a set of attributes X to a set of attributes Y is a finite set of mappings over X ∪ Y.
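As a small illustration of the definition, the mapping table of this slide can be held as a set of (GDB_id, SwissProt_id) pairs. This is only a sketch: the names `MAPPING_TABLE` and `mapped_values` are made up for the example and do not come from the paper.

```python
# Mapping table from {GDB_id} to {SwissProt_id}: one tuple per mapping.
MAPPING_TABLE = {
    ("G1", "P9"),
    ("G1", "Q62"),
    ("G2", "P40"),
    ("G3", "P38"),
}


def mapped_values(table, x_value):
    """Return every Y-value that the table associates with the given X-value."""
    return {y for x, y in table if x == x_value}


print(sorted(mapped_values(MAPPING_TABLE, "G1")))  # ['P9', 'Q62']
```

A set is used because a mapping table is a finite set of mappings: one X-value may be associated with several Y-values, as G1 is here.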

Alternative Semantics for Mapping Tables

Closed-Closed-World Semantics:

GDB_id   SwissProt_id
G2       P40

Closed-Open-World Semantics:

GDB_id     SwissProt_id
G2         P40
v - {G2}   v' - {P40}

Valuation over a mapping table

A valuation p over a mapping table m is a function that maps each constant value in m to itself and each variable v of m to a value in the domain of the attribute where v appears. If v appears in an expression of the form v - S, then p(v) ∉ S.

Mapping table m:

Attr1       Attr2
a           3
b           2
v - {a,b}   1

dom(Attr1) = {a, b, c, d}
dom(Attr2) = {1, 2, 3}

Valuations over m: p(a) = a, p(3) = 3, and either p(v) = c or p(v) = d.
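The valuations of the example can be enumerated mechanically. The sketch below assumes each variable is described by its name, the attribute it appears under, and its exclusion set S; the class `Var` and the function `valuations` are illustrative names, not taken from the paper. Constants are left out because a valuation maps every constant to itself.

```python
from itertools import product


class Var:
    """A variable cell of the form v - S (S may be empty)."""

    def __init__(self, name, attribute, excluded=()):
        self.name = name
        self.attribute = attribute
        self.excluded = frozenset(excluded)


def valuations(variables, domains):
    """Yield each valuation as a dict from variable name to a domain value."""
    # A variable v - S may take any value of its attribute's domain outside S.
    choices = [sorted(domains[v.attribute] - v.excluded) for v in variables]
    for combo in product(*choices):
        yield {v.name: value for v, value in zip(variables, combo)}


DOMAINS = {"Attr1": {"a", "b", "c", "d"}, "Attr2": {1, 2, 3}}
v = Var("v", "Attr1", excluded={"a", "b"})
print(list(valuations([v], DOMAINS)))  # [{'v': 'c'}, {'v': 'd'}]
```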

Mapping Constraint

Relation GDB:

GDB_id   Gene_Name
G1       NF1
G2       NF2
G3       NGFB

Relation SwissProt:

SwissProt_id   Protein_name
P9             NF1
P40            MERL

Mapping table m:

GDB_id     SwissProt_id
G2         P40
v - {G2}   v' - {P40}

Mapping constraint: $\mu : \text{GDB\_id} \xrightarrow{m} \text{SwissProt\_id}$

A relation having attributes from both GDB and SwissProt:

GDB_id   Gene_Name   SwissProt_id   Protein_name
G1       NF1         P9             NF1
G2       NF2         P40            MERL
G3       NGFB        P9             NF1
G2       NF2         P9             NF1
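To see what the constraint demands of the combined relation, the sketch below tests each tuple against mapping table m. It assumes the usual reading that a tuple satisfies the constraint when its (GDB_id, SwissProt_id) values match some row of m under some valuation, and it assumes each variable occurs in a single cell. The helper names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    """A variable cell of the form v - S."""
    name: str
    excluded: frozenset = frozenset()


def cell_matches(cell, value):
    """A constant matches only itself; a variable v - S matches any value outside S."""
    if isinstance(cell, Var):
        return value not in cell.excluded
    return cell == value


def satisfies(xy_values, table):
    """True if some row of the mapping table matches the given (X, Y) values."""
    return any(
        all(cell_matches(cell, value) for cell, value in zip(row, xy_values))
        for row in table
    )


M = [
    ("G2", "P40"),
    (Var("v", frozenset({"G2"})), Var("v'", frozenset({"P40"}))),
]
COMBINED = [
    ("G1", "NF1", "P9", "NF1"),
    ("G2", "NF2", "P40", "MERL"),
    ("G3", "NGFB", "P9", "NF1"),
    ("G2", "NF2", "P9", "NF1"),
]
for gdb_id, _gene, swissprot_id, _protein in COMBINED:
    print(gdb_id, swissprot_id, satisfies((gdb_id, swissprot_id), M))
```

Under these assumptions only the last tuple fails: G2 is listed in m, so it may be paired with P40 alone.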

Extension of a mapping constraint

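Given a mapping constraint $\mu : X \xrightarrow{m} Y$,

$\mathrm{ext}(\mu) = \{\, p(t) \mid t \in m \text{ and } p \text{ is a valuation over } m \,\}$

Mapping table m:

Attr1       Attr2
a           3
b           2
v - {a,b}   1

Mapping constraint: $\mu : \text{Attr1} \xrightarrow{m} \text{Attr2}$

dom(Attr1) = {a, b, c, d}
dom(Attr2) = {1, 2, 3}

ext(µ):

Attr1   Attr2
a       3
b       2
c       1
d       1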
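The extension shown above can be reproduced by expanding every row of m under all of its valuations. This is a minimal sketch for finite domains, assuming each variable occurs in a single cell so that cells can be expanded independently; `Var` and `extension` are names chosen for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Var:
    """A variable cell of the form v - S."""
    name: str
    excluded: frozenset = frozenset()


def extension(attributes, table, domains):
    """Return ext(mu): all ground tuples obtained by applying a valuation to a row."""
    result = set()
    for row in table:
        choices = []
        for attribute, cell in zip(attributes, row):
            if isinstance(cell, Var):
                # A variable v - S ranges over the attribute's domain minus S.
                choices.append(sorted(domains[attribute] - cell.excluded))
            else:
                # A constant is mapped to itself by every valuation.
                choices.append([cell])
        result.update(product(*choices))
    return result


DOMAINS = {"Attr1": {"a", "b", "c", "d"}, "Attr2": {1, 2, 3}}
M = [("a", 3), ("b", 2), (Var("v", frozenset({"a", "b"})), 1)]
print(sorted(extension(["Attr1", "Attr2"], M, DOMAINS)))
# [('a', 3), ('b', 2), ('c', 1), ('d', 1)]
```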

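Cover of a set of mapping constraints

A mapping constraint $\mu : X \xrightarrow{m} Y$ is called the cover of a set $\Sigma$ of mapping constraints if:

$\Sigma$ is consistent if and only if there exists a tuple $t \in \mathrm{ext}(\mu)$.

For every mapping constraint $\mu' : X \xrightarrow{m'} Y$, $\Sigma \models \mu'$ if and only if $\mathrm{ext}(\mu) \subseteq \mathrm{ext}(\mu')$.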

Example of Cover

Relation r1:

A1   A2
p    x
q    y
r    z

Relation r2:

B1   B2
px   1
qy   2
rz   3
rx   4

Relation r3:

C1   C2
a    i
b    j
c    k

Mapping table m1:

A1          A2   B1
p           x    px
q           y    qy
v - {p,q}   v'   v'' - {px,qy}

Mapping table m2:

B1            C1   C2
px            a    i
qy            b    j
v - {px,qy}   v'   v'' - {i,j}

$\mu_1 : A_1, A_2 \xrightarrow{m_1} B_1$

$\mu_2 : B_1 \xrightarrow{m_2} C_1, C_2$

$\Sigma = \{\mu_1, \mu_2\}$

Mapping table m:

A1   A2   C1   C2
p    x    a    i
q    y    b    j

$\mu : A_1, A_2 \xrightarrow{m} C_1, C_2$
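The constant rows of mapping table m can be obtained by joining m1 and m2 on B1 and then dropping B1. The sketch below does only that: it is restricted to the variable-free rows of m1 and m2, ignores the rows containing variables, and is not the paper's cover algorithm. The function name `compose` is illustrative.

```python
# Constant rows of m1 over (A1, A2, B1) and of m2 over (B1, C1, C2).
M1 = [("p", "x", "px"), ("q", "y", "qy")]
M2 = [("px", "a", "i"), ("qy", "b", "j")]


def compose(m1, m2):
    """Join on the shared attribute (last column of m1, first of m2) and project it out."""
    return {r1[:-1] + r2[1:] for r1 in m1 for r2 in m2 if r1[-1] == r2[0]}


print(sorted(compose(M1, M2)))
# [('p', 'x', 'a', 'i'), ('q', 'y', 'b', 'j')]
```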

The Algorithm

Input:
A path P1, P2, ..., Pn of peers
A set $\Sigma$ of mapping constraints over the path
Two sets of attributes X and Y in peers P1 and Pn

Output:
A mapping constraint $\mu : X \xrightarrow{m} Y$ that is a cover of $\Sigma$

How is the Algorithm useful?

To check whether $\Sigma \models \mu'$:
Run the algorithm to find the cover $\mu$.
Check whether $\mathrm{ext}(\mu) \subseteq \mathrm{ext}(\mu')$.

To check whether $\Sigma$ is consistent:
Run the algorithm to find the cover $\mu$.
Check whether $\mathrm{ext}(\mu)$ is nonempty.
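Once the extension of the cover is available as a finite set of tuples, both checks reduce to elementary set operations. The sketch below assumes the extensions have already been materialised; the function names and the sample tuples in `CANDIDATE_EXT` are hypothetical and serve only to exercise the two tests.

```python
def is_consistent(cover_ext):
    """The set of constraints is consistent iff the cover's extension is nonempty."""
    return len(cover_ext) > 0


def implies(cover_ext, candidate_ext):
    """The set of constraints implies mu' iff ext(mu) is a subset of ext(mu')."""
    return cover_ext <= candidate_ext


# Extension of a cover over (A1, A2, C1, C2), and a hypothetical candidate mu'.
COVER_EXT = {("p", "x", "a", "i"), ("q", "y", "b", "j")}
CANDIDATE_EXT = COVER_EXT | {("r", "z", "c", "k")}

print(is_consistent(COVER_EXT))           # True
print(implies(COVER_EXT, CANDIDATE_EXT))  # True
print(implies(CANDIDATE_EXT, COVER_EXT))  # False
```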

An Example

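Path: P1, P2, P3, P4

$\Sigma = \{\mu_1, \mu_2, \ldots, \mu_{11}\}$

Peer   Attributes
P1     {A1, A2, ..., A6}
P2     {B1, B2, ..., B6}
P3     {C1, C2, C3, C4}
P4     {D3, D4}

Constraints between P1 and P2:

$\mu_1 : A_1 \xrightarrow{m_1} B_1$
$\mu_2 : A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$\mu_3 : A_3 \xrightarrow{m_3} B_2, B_3$
$\mu_4 : A_4 \xrightarrow{m_4} B_4$
$\mu_5 : A_5 \xrightarrow{m_5} B_5$
$\mu_6 : A_6 \xrightarrow{m_6} B_6$

Constraints between P2 and P3:

$\mu_7 : B_1, B_2 \xrightarrow{m_7} C_1$
$\mu_8 : B_3 \xrightarrow{m_8} C_2$
$\mu_9 : B_5 \xrightarrow{m_9} C_3$

Constraints between P3 and P4:

$\mu_{10} : C_3 \xrightarrow{m_{10}} D_3$
$\mu_{11} : C_4 \xrightarrow{m_{11}} D_4$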

Partitions

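Constraints between P1 and P2:

$\mu_1 : A_1 \xrightarrow{m_1} B_1$
$\mu_2 : A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$\mu_3 : A_3 \xrightarrow{m_3} B_2, B_3$
$\mu_4 : A_4 \xrightarrow{m_4} B_4$
$\mu_5 : A_5 \xrightarrow{m_5} B_5$
$\mu_6 : A_6 \xrightarrow{m_6} B_6$

Partitions:

Partition 1: µ1, µ2, µ3
Partition 2: µ4
Partition 3: µ5
Partition 4: µ6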
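The grouping above is what results from placing two constraints in the same partition whenever they share an attribute, directly or through a chain of other constraints. The slides do not spell the rule out, so that reading is an assumption of the sketch below, as are the names `CONSTRAINTS` and `partitions`.

```python
# Attributes mentioned by each constraint between P1 and P2.
CONSTRAINTS = {
    "mu1": {"A1", "B1"},
    "mu2": {"A1", "A2", "B1", "B2"},
    "mu3": {"A3", "B2", "B3"},
    "mu4": {"A4", "B4"},
    "mu5": {"A5", "B5"},
    "mu6": {"A6", "B6"},
}


def partitions(constraints):
    """Group constraints that share attributes, directly or transitively."""
    groups = []  # list of (constraint names, attributes), attribute sets kept disjoint
    for name, attrs in constraints.items():
        names, merged_attrs = {name}, set(attrs)
        untouched = []
        for group_names, group_attrs in groups:
            if group_attrs & merged_attrs:
                # The new constraint links this group to the one being built.
                names |= group_names
                merged_attrs |= group_attrs
            else:
                untouched.append((group_names, group_attrs))
        groups = untouched + [(names, merged_attrs)]
    return [sorted(group_names) for group_names, _ in groups]


print(partitions(CONSTRAINTS))
# [['mu1', 'mu2', 'mu3'], ['mu4'], ['mu5'], ['mu6']]
```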

Inferred Partitions

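Partitions of peer P1:

Partition 1: $A_1 \xrightarrow{m_1} B_1$, $A_1, A_2 \xrightarrow{m_2} B_1, B_2$, $A_3 \xrightarrow{m_3} B_2, B_3$
Partition 2: $A_4 \xrightarrow{m_4} B_4$
Partition 3: $A_5 \xrightarrow{m_5} B_5$
Partition 4: $A_6 \xrightarrow{m_6} B_6$

Partitions of peer P2:

Partition 5: $B_1, B_2 \xrightarrow{m_7} C_1$
Partition 6: $B_3 \xrightarrow{m_8} C_2$
Partition 7: $B_5 \xrightarrow{m_9} C_3$

[Figure: graph whose nodes are the partitions 1 to 7]

Inferred partition over P1 and P2 (constraints as drawn in the figure):

$A_4 \xrightarrow{m_4} B_4$
$A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$A_3 \xrightarrow{m_3} B_2, B_3$
$B_1, B_4 \xrightarrow{m_7} C_1$
$B_3 \xrightarrow{m_8} C_2$
$A_1 \xrightarrow{m_1} B_1$
$B_5 \xrightarrow{m_9} C_3$
$A_5 \xrightarrow{m_5} B_5$
$A_6 \xrightarrow{m_6} B_6$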

Advantages of Partitioning

While computing the cover, partitioning reduces computational cost as fewer constraints are considered at a time.

Different partitions can be processed in parallel.

Description of the Algorithm

The algorithm has two phases:
The Information Gathering Phase
The Computation Phase

Information Gathering Phase

Path: P1, P2, P3, P4

P1: Compute partitions. For each partition, send to P2 the set of attributes in the partition.

P2: Compute own partitions. Compute inferred partitions using the partition information received from P1.

P3: Compute own partitions. Compute inferred partitions using the inferred partitions propagated from P2.
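The forward pass described above can be imitated in a few lines. This is a simplified sketch, not the paper's algorithm: each peer forwards the full attribute sets of its inferred partitions, and two partitions are merged whenever they share an attribute. The constraint attributes are the ones listed on the "An Example" slide, and `merge_overlapping` and `gather` are illustrative names.

```python
def merge_overlapping(attribute_sets):
    """Merge attribute sets that overlap, directly or transitively."""
    merged = []  # pairwise disjoint attribute sets
    for attrs in attribute_sets:
        attrs = set(attrs)
        untouched = []
        for group in merged:
            if group & attrs:
                attrs |= group
            else:
                untouched.append(group)
        merged = untouched + [attrs]
    return merged


# Attributes of the constraints leaving P1, P2 and P3, in that order.
PEER_CONSTRAINTS = [
    [{"A1", "B1"}, {"A1", "A2", "B1", "B2"}, {"A3", "B2", "B3"},
     {"A4", "B4"}, {"A5", "B5"}, {"A6", "B6"}],
    [{"B1", "B2", "C1"}, {"B3", "C2"}, {"B5", "C3"}],
    [{"C3", "D3"}, {"C4", "D4"}],
]


def gather(peer_constraints):
    """Return, for each peer, the inferred partitions it holds after the forward pass."""
    incoming = []  # attribute sets received from the previous peer
    history = []
    for own_constraints in peer_constraints:
        own_partitions = merge_overlapping(own_constraints)
        inferred = merge_overlapping(incoming + own_partitions)
        history.append(inferred)
        incoming = inferred  # propagated to the next peer on the path
    return history


for peer, inferred in zip(["P1", "P2", "P3"], gather(PEER_CONSTRAINTS)):
    print(peer, [sorted(group) for group in inferred])
```

Keeping the merged sets pairwise disjoint means each incoming set has to be compared only once against every existing group.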

Computation Phase

Path: P1, P2, P3, P4

P3: Using the local constraints of the inferred partition, computes a cover between P3 and P4. The mappings belonging to the cover are streamed to peer P2.

P2: Determines with which of its own partitions the incoming stream of mappings should be associated. With this information, it generates a cover between itself and P4.

P1: Uses the incoming stream of mappings to generate a cover between its own attributes and those of peer P4.

Conclusion and Future Scope

This paper showed that by treating mapping tables as constraints on the exchange of information between peers it is possible to reason about them and check their consistency.

There is scope for investigating the use of mapping tables in support of query answering.

Thank You