Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues By A. Kementsietsidis, M....


Mapping Data in Peer-to-Peer Systems: Semantics and Algorithmic Issues

By A. Kementsietsidis, M. Arenas and R.J. Miller

Presented by Md. Anisur Rahman: 3558643

Anahit Martirosyan: 100628480

LianXiang Qiu: 3603336

University of Ottawa, Winter 2004

Outline

P2P Data-Sharing System

Mapping Table

Alternative Semantics for Mapping Tables

Mapping Tables as Constraints

An algorithm for checking consistency of the existing mappings and inferring new mappings from them

Conclusion and Future Work

Peer-to-Peer Data-Sharing System

What is a Mapping Table?

[Figure: relation GDB (GDB_id, Gene_Name) and relation SwissProt (SwissProt_id, Protein_name), related by a mapping table over (GDB_id, SwissProt_id).]

A mapping table m from a set of attributes X to a set of attributes Y is a finite set of mappings over $X \cup Y$.

Alternative Semantics for Mapping Tables

Closed-Closed-World Semantics

Closed-Open-World Semantics

Mapping table with constants:

GDB_id | SwissProt_id
G2 | P40

Mapping table with variables:

GDB_id | SwissProt_id
v - {G2} | v' - {P40}

Valuation over a mapping table

A valuation p over a mapping table m is a function that maps each constant value in m to itself and each variable v of m to a value in the domain of the attribute where v appears. If v appears in an expression of the form v - S, then $p(v) \notin S$.

Mapping table m:

Attr1 | Attr2
v - {a,b} | 1

dom(Attr1) = {a, b, c, d}

dom(Attr2) = {1, 2, 3}

Constants map to themselves: p(a) = a, p(3) = 3. The variable must avoid {a, b}: p(v) = c or p(v) = d.
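The valuation example can be checked mechanically. The following is a minimal sketch, not the authors' code: it assumes a variable cell is encoded as the hypothetical tuple `("var", name, excluded)` and that domains are small finite lists; the function names are illustrative only.

```python
from itertools import product

def is_var(cell):
    # Assumed encoding: ("var", name, excluded) stands for "name - excluded".
    return isinstance(cell, tuple) and len(cell) == 3 and cell[0] == "var"

def valuations(row, attrs, dom):
    """Yield each variable assignment for `row` that respects every v - S."""
    allowed = {}
    for attr, cell in zip(attrs, row):
        if is_var(cell):
            _, name, excluded = cell
            values = [x for x in dom[attr] if x not in excluded]
            if name in allowed:  # repeated variable: keep values valid everywhere
                values = [x for x in allowed[name] if x in values]
            allowed[name] = values
    names = sorted(allowed)
    for combo in product(*(allowed[n] for n in names)):
        yield dict(zip(names, combo))

dom = {"Attr1": ["a", "b", "c", "d"], "Attr2": [1, 2, 3]}
row = (("var", "v", {"a", "b"}), 1)  # the row (v - {a,b}, 1)
found = list(valuations(row, ["Attr1", "Attr2"], dom))
print(found)  # [{'v': 'c'}, {'v': 'd'}]
```

Constants need no entry in the result because every valuation maps them to themselves.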

Mapping Constraint

[Figure: relation GDB (GDB_id, Gene_Name) and relation SwissProt (SwissProt_id, Protein_name), with mapping table m between them. The constraint applies to a relation having attributes from both GDB and SwissProt (GDB_id, Gene_Name, SwissProt_id, Protein_Name).]

Mapping table m:

GDB_id | SwissProt_id
v - {G2} | v' - {P40}

Mapping constraint: $\mu : \mathrm{GDB\_id} \xrightarrow{m} \mathrm{SwissProt\_id}$

Extension of a mapping constraint

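Given a mapping constraint $\mu : X \xrightarrow{m} Y$, $\mathrm{ext}(\mu) = \{\, p(t) \mid t \in m \text{ and } p \text{ is a valuation over } m \,\}$.

Mapping table m, with $\mu : \mathrm{Attr1} \xrightarrow{m} \mathrm{Attr2}$:

Attr1 | Attr2
v - {a,b} | 1

dom(Attr1) = {a, b, c, d}

dom(Attr2) = {1, 2, 3}

ext(µ):

Attr1 | Attr2
c | 1
d | 1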
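For finite domains, the extension can be enumerated directly from the definition. This is a rough sketch under two assumptions that are not from the slides: variable cells use the hypothetical encoding `("var", name, excluded)`, and each variable occurs only once per row, so cells can be expanded independently.

```python
from itertools import product

def extension(table, attrs, dom):
    """Return ext(mu) as a set of ground tuples over `attrs`."""
    ext = set()
    for row in table:
        per_cell = []
        for attr, cell in zip(attrs, row):
            if isinstance(cell, tuple) and cell[0] == "var":
                # Variable cell: any domain value outside the excluded set.
                _, _name, excluded = cell
                per_cell.append([x for x in dom[attr] if x not in excluded])
            else:
                # Constant cell: a valuation maps it to itself.
                per_cell.append([cell])
        ext.update(product(*per_cell))
    return ext

dom = {"Attr1": ["a", "b", "c", "d"], "Attr2": [1, 2, 3]}
m = [(("var", "v", {"a", "b"}), 1)]  # the single row (v - {a,b}, 1)
ext_mu = extension(m, ["Attr1", "Attr2"], dom)
print(sorted(ext_mu))  # [('c', 1), ('d', 1)]
```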

Cover of a set of mapping constraints

A mapping constraint $\mu$ is called the cover of a set of mapping constraints $\Sigma$ if:

$\Sigma$ is consistent if and only if there exists $t \in \mathrm{ext}(\mu)$

For every mapping constraint $\mu'$, $\Sigma \models \mu'$ if and only if $\mathrm{ext}(\mu) \subseteq \mathrm{ext}(\mu')$

Example of Cover

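Mapping table m1, with $\mu_1 : A_1, A_2 \xrightarrow{m_1} B_1$:

A1 | A2 | B1
p | x | px
q | y | qy
v - {p,q} | v' | v'' - {px,qy}

Mapping table m2, with $\mu_2 : B_1 \xrightarrow{m_2} C_1, C_2$:

B1 | C1 | C2
px | a | i
qy | b | j
v - {px,qy} | v' | v'' - {i,j}

$\Sigma = \{\mu_1, \mu_2\}$

Mapping table m, with $\mu : A_1, A_2 \xrightarrow{m} C_1, C_2$:

A1 | A2 | C1 | C2
p | x | a | i
q | y | b | j

[Figure labels: Relation r1, Relation r2, Relation r3]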
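The constant rows of table m can be reproduced by joining m1 and m2 on their shared attribute B1 and projecting onto A1, A2, C1, C2. The sketch below covers only those ground rows; how rows containing variables combine is deliberately left out, and the function name `join_and_project` is illustrative rather than taken from the paper.

```python
# Ground (constant-only) rows of m1 and m2 from the example.
m1 = [{"A1": "p", "A2": "x", "B1": "px"},
      {"A1": "q", "A2": "y", "B1": "qy"}]
m2 = [{"B1": "px", "C1": "a", "C2": "i"},
      {"B1": "qy", "C1": "b", "C2": "j"}]

def join_and_project(left, right, keep):
    """Join rows that agree on all shared attributes, then keep `keep`."""
    out = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[a] == r[a] for a in shared):
                merged = {**l, **r}
                out.append(tuple(merged[a] for a in keep))
    return out

m = join_and_project(m1, m2, ["A1", "A2", "C1", "C2"])
print(m)  # [('p', 'x', 'a', 'i'), ('q', 'y', 'b', 'j')]
```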

The Algorithm

Input:

A path $\theta = P_1, P_2, \ldots, P_n$ of peers

A set $\Sigma$ of mapping constraints over path $\theta$

Two sets of attributes X and Y in peers $P_1$ and $P_n$, respectively

Output: A mapping constraint $\mu$ from X to Y that is a cover of $\Sigma$

How is the Algorithm useful?

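To check whether $\Sigma \models \mu'$: run the algorithm to find the cover $\mu$, then check whether $\mathrm{ext}(\mu) \subseteq \mathrm{ext}(\mu')$.

To check whether $\Sigma$ is consistent: run the algorithm to find the cover $\mu$, then check whether $\mathrm{ext}(\mu)$ is nonempty.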
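Once a cover is available, both checks reduce to simple set operations on extensions. A minimal sketch, assuming the extensions are finite and already materialized as Python sets of tuples (the sample values are illustrative, not taken from the paper):

```python
def is_consistent(cover_ext):
    """The constraint set is consistent iff the cover's extension is nonempty."""
    return len(cover_ext) > 0

def entails(cover_ext, candidate_ext):
    """The constraint set entails mu' iff ext(cover) is a subset of ext(mu')."""
    return cover_ext <= candidate_ext

cover_ext = {("c", 1), ("d", 1)}                # ext of the cover (assumed)
candidate_ext = {("c", 1), ("d", 1), ("a", 2)}  # ext of some mu' (assumed)

print(is_consistent(cover_ext))           # True
print(entails(cover_ext, candidate_ext))  # True
print(entails(candidate_ext, cover_ext))  # False
```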

An Example

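$\theta = P_1, P_2, P_3, P_4$

$\Sigma = \{\mu_1, \mu_2, \ldots, \mu_{11}\}$

Attributes along the path: {A1, A2, ..., A6}, {B1, B2, ..., B6}, {C1, C2, C3, C4}, {D3, D4}

Constraints from A attributes to B attributes:

$\mu_1 : A_1 \xrightarrow{m_1} B_1$
$\mu_2 : A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$\mu_3 : A_3 \xrightarrow{m_3} B_2, B_3$
$\mu_4 : A_4 \xrightarrow{m_4} B_4$
$\mu_5 : A_5 \xrightarrow{m_5} B_5$
$\mu_6 : A_6 \xrightarrow{m_6} B_6$

Constraints from B attributes to C attributes:

$\mu_7 : B_1, B_2 \xrightarrow{m_7} C_1$
$\mu_8 : B_3 \xrightarrow{m_8} C_2$
$\mu_9 : B_5 \xrightarrow{m_9} C_3$

Constraints from C attributes to D attributes:

$\mu_{10} : C_3 \xrightarrow{m_{10}} D_3$
$\mu_{11} : C_4 \xrightarrow{m_{11}} D_4$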

Partitions

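Constraints considered:

$\mu_1 : A_1 \xrightarrow{m_1} B_1$
$\mu_2 : A_1, A_2 \xrightarrow{m_2} B_1, B_2$
$\mu_3 : A_3 \xrightarrow{m_3} B_2, B_3$
$\mu_4 : A_4 \xrightarrow{m_4} B_4$
$\mu_5 : A_5 \xrightarrow{m_5} B_5$
$\mu_6 : A_6 \xrightarrow{m_6} B_6$

[Figure: diagram over the constraints µ1 to µ6]

Resulting partitions:

Partition 1: $A_1 \xrightarrow{m_1} B_1$; $A_1, A_2 \xrightarrow{m_2} B_1, B_2$; $A_3 \xrightarrow{m_3} B_2, B_3$
Partition 2: $A_4 \xrightarrow{m_4} B_4$
Partition 3: $A_5 \xrightarrow{m_5} B_5$
Partition 4: $A_6 \xrightarrow{m_6} B_6$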
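One way to read the partitioning is as grouping constraints that share attributes, directly or through a chain of other constraints. The sketch below is written under that assumption (the slides do not spell out the rule), with attribute sets that assume the constraints read as µ1: A1 to B1, µ2: A1, A2 to B1, B2, µ3: A3 to B2, B3, and µ4 to µ6 mapping Ai to Bi.

```python
def partition_constraints(constraints):
    """Group constraint names whose attribute sets overlap, transitively."""
    groups = []  # list of (names, attributes) pairs
    for name, attrs in constraints.items():
        names, attrs = {name}, set(attrs)
        rest = []
        for g_names, g_attrs in groups:
            if g_attrs & attrs:      # shares an attribute: merge into one group
                names |= g_names
                attrs |= g_attrs
            else:
                rest.append((g_names, g_attrs))
        groups = rest + [(names, attrs)]
    return [names for names, _ in groups]

# Assumed attribute sets for the six constraints.
constraints = {
    "mu1": {"A1", "B1"},
    "mu2": {"A1", "A2", "B1", "B2"},
    "mu3": {"A3", "B2", "B3"},
    "mu4": {"A4", "B4"},
    "mu5": {"A5", "B5"},
    "mu6": {"A6", "B6"},
}
parts = sorted(sorted(p) for p in partition_constraints(constraints))
print(parts)  # [['mu1', 'mu2', 'mu3'], ['mu4'], ['mu5'], ['mu6']]
```

Because the groups share no attributes, each one can be handed to a separate worker.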

Inferred Partitions

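Peer P1 partitions:

Partition 1: $A_1 \xrightarrow{m_1} B_1$; $A_1, A_2 \xrightarrow{m_2} B_1, B_2$; $A_3 \xrightarrow{m_3} B_2, B_3$
Partition 2: $A_4 \xrightarrow{m_4} B_4$
Partition 3: $A_5 \xrightarrow{m_5} B_5$
Partition 4: $A_6 \xrightarrow{m_6} B_6$

Peer P2 partitions:

Partition 5: $B_1, B_2 \xrightarrow{m_7} C_1$
Partition 6: $B_3 \xrightarrow{m_8} C_2$
Partition 7: $B_5 \xrightarrow{m_9} C_3$

[Figure: inferred partition over P1 and P2, drawn over the mapping tables m1 to m9]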

Advantages of Partitioning

While computing the cover, partitioning reduces computational cost as fewer constraints are considered at a time.

Different partitions can be processed in parallel.

Description of the Algorithm

The algorithm has two phases:

The Information Gathering Phase

The Computation Phase

Information Gathering Phase

P1: Computes its partitions. For each partition, sends to P2 the set of attributes in the partition.

P2: Computes its own partitions. Computes inferred partitions using the information about the partitions of P1.

P3: Computes its own partitions. Computes inferred partitions using the inferred partitions propagated from P2.
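The step performed at P2 can be pictured as merging the attribute sets received from P1 with the attribute sets of P2's own constraints. This is only a sketch of that idea: the merge rule (overlapping attribute sets belong together) and all attribute sets below are assumptions made for illustration, not values confirmed by the slides.

```python
def merge_by_shared_attributes(groups):
    """Merge attribute sets that overlap, directly or transitively."""
    merged = []
    for attrs in groups:
        attrs = set(attrs)
        rest = []
        for other in merged:
            if other & attrs:   # overlap: fold the existing group into this one
                attrs |= other
            else:
                rest.append(other)
        merged = rest + [attrs]
    return merged

# Attribute sets P1 would send, one per partition (assumed values).
from_p1 = [{"A1", "A2", "A3", "B1", "B2", "B3"},
           {"A4", "B4"}, {"A5", "B5"}, {"A6", "B6"}]
# Attribute sets of P2's own constraints (assumed values).
own_p2 = [{"B1", "B2", "C1"}, {"B3", "C2"}, {"B5", "C3"}]

inferred = merge_by_shared_attributes(from_p1 + own_p2)
for group in inferred:
    print(sorted(group))
```

With these inputs the first P1 partition is joined with the two P2 constraints that touch B1, B2 and B3, which is the kind of grouping P2 would then propagate to P3.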

Computation Phase

P3: Using the local constraints of the inferred partition, computes a cover between P3 and P4. The mappings belonging to the cover are streamed to peer P2.

P2: Determines with which of its own partitions the incoming stream of mappings should be associated. With this information, it generates a cover between itself and P4.

P1: Uses the incoming stream of mappings to generate a cover between its own attributes and those of peer P4.
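The right-to-left flow of the computation phase can be illustrated with generators, where each peer consumes the stream produced by its successor. This is a simplified sketch, assuming ground mappings only, a single partition, and one attribute per side; the tables and the name `extend_cover` are hypothetical.

```python
def extend_cover(local, incoming):
    """Combine local (x, y) mappings with a stream of (y, z) mappings into (x, z)."""
    by_y = {}
    for x, y in local:
        by_y.setdefault(y, []).append(x)
    for y, z in incoming:           # handle each incoming mapping as it arrives
        for x in by_y.get(y, []):
            yield (x, z)

# Hypothetical ground mappings along the path P1, P2, P3, P4.
p1_to_p2 = [("a1", "b1"), ("a2", "b2")]
p2_to_p3 = [("b1", "c1"), ("b2", "c2")]
p3_to_p4 = [("c1", "d1")]

cover_p3_p4 = iter(p3_to_p4)                             # computed at P3
cover_p2_p4 = extend_cover(p2_to_p3, cover_p3_p4)        # computed at P2
cover_p1_p4 = list(extend_cover(p1_to_p2, cover_p2_p4))  # computed at P1
print(cover_p1_p4)  # [('a1', 'd1')]
```

Only the local table is indexed up front, so a peer can start emitting mappings before the stream from its successor has ended.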

Conclusion and Future Scope

This paper showed that by treating mapping tables as constraints on the exchange of information between peers it is possible to reason about them and check their consistency.

There is scope for investigating the use of mapping tables in support of query answering.

Thank You
