
George Anadiotis, Spyros Kotoulas and Ronny Siebes VU University Amsterdam.

Date post: 28-Mar-2015
Page 1:

George Anadiotis, Spyros Kotoulas and Ronny Siebes
VU University Amsterdam

Page 2:

• Why do we need distribution…
• Why do we need anytime behavior…
• Why it should be (very) scalable…
• Why should we drop consistency and completeness…
• Why do we need trust/ontology ranking…
• etc.

Page 3:

• What is P2P? (1 slide)
• Relationship between P2P and SW (3 slides)
• Our Goal (1 slide)
• Distributed SW stores (1 slide)
  ◦ Structured P2P stores (3 slides)
  ◦ Federated stores (2 slides)
• Our approach (6 slides)
• Future work (1 slide)

Page 4:

• Class of distributed systems
• Most important characteristics
  ◦ Same functionality across peers
  ◦ Peer autonomy
  ◦ Formation of overlay networks
  ◦ Common interface
  ◦ They respect some agreed-upon way to organize

File-sharing networks are NOT the only Peer-to-Peer systems.

Page 5:

Page 6:

• Source of semantic information to self-organize
• Interoperability

Page 7:

• Scalable infrastructure for
  ◦ Storage
  ◦ Reasoning
  ◦ Collaboration
• Self-organization
• Autonomy: control of data
• Privacy
• Scalable algorithms
• Robustness
• No censorship
• No preferential treatment of information

Common misconception: All Peer-to-Peer systems can offer the above

Page 8:

• Global-scale semantic web storage and reasoning
  ◦ Scalability
    ◦ Computation
    ◦ Administration

Page 9:

• Structured peer-to-peer
  ◦ Use DHTs
  ◦ One global distributed store
  ◦ Peers do not maintain their own data
• Federated stores
  ◦ Each peer maintains its own store
  ◦ Stores are interconnected
  ◦ Either global schema or mappings between schemata

Page 10:

• The mathematical abstraction for hash tables is a Map
• Functionality:
  ◦ put(key, value)
  ◦ get(key)
• Similar to normal hash tables, with the difference that each bucket is now a peer
• Accessing different buckets involves network traffic
• Routing to a bucket involves contacting approximately log(N) peers, where N is the network size
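The put/get abstraction above can be sketched in a few lines. This is a minimal illustration, not the authors' system: the `ToyDHT` class, its SHA-1 based peer selection and the `remote_accesses` counter are all assumptions made for the example. Routing is not modelled; the sketch counts one remote access per operation, whereas a real DHT would contact about log(N) peers.

```python
import hashlib


class ToyDHT:
    """A map whose buckets are peers (hypothetical, for illustration only)."""

    def __init__(self, n_peers):
        # One dictionary per peer stands in for that peer's local storage.
        self.peers = [dict() for _ in range(n_peers)]
        # Every put/get touches a (possibly remote) peer, so we count them.
        self.remote_accesses = 0

    def _peer_for(self, key):
        # Hash the key and map the digest onto a peer index.
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return int.from_bytes(digest, "big") % len(self.peers)

    def put(self, key, value):
        self.remote_accesses += 1
        self.peers[self._peer_for(key)][key] = value

    def get(self, key):
        self.remote_accesses += 1
        return self.peers[self._peer_for(key)].get(key)


dht = ToyDHT(n_peers=24)
dht.put("horse", "the horse is an animal")
print(dht.get("horse"))
```

The interface is the same as for a local hash table; only the cost model changes, because each call crosses the network.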

Page 11:

Values are stored in the peer with ID starting with the first letter of the key

[Figure: a grid of peers with IDs a to x]

<Key=horse, Value=the horse is an animal>

Page 12:

Peer 1: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>
Peer 2: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>

[Figure: a grid of DHT peers with IDs a to x. Each triple is stored at the peers whose IDs match the first letter of its subject, predicate and object:]

a: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>
h: <animal, lives_in, habitat>
l: <animal, lives_in, habitat>
m: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>
r: <rabbit, subClassOf, animal>
s: <seal, subClassOf, animal>, <monk_seal, subClassOf, seal>, <rabbit, subClassOf, animal>
t: <mseal1, type, monk_seal>
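The placement shown on this slide can be reproduced with a short sketch. It assumes the first-letter rule from the previous slide is applied to every term of a triple (subject, predicate and object); the function name `distribute` and the constants are made up for the example.

```python
from collections import defaultdict

# Triples published by the two peers on the slide.
PEER_1 = [
    ("rabbit", "subClassOf", "animal"),
    ("seal", "subClassOf", "animal"),
    ("animal", "lives_in", "habitat"),
]
PEER_2 = [
    ("monk_seal", "subClassOf", "seal"),
    ("mseal1", "type", "monk_seal"),
]


def distribute(triples):
    """Store each triple at the DHT peer named by the first letter of each term."""
    nodes = defaultdict(set)
    for triple in triples:
        for term in triple:
            nodes[term[0]].add(triple)
    return nodes


nodes = distribute(PEER_1 + PEER_2)
for node_id in sorted(nodes):
    print(node_id, sorted(nodes[node_id]))
```

Note that every triple is copied to up to three DHT peers, and that the publishing peers no longer hold their own data.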

Page 13:

[Figure: the triples of Peer 1 and Peer 2 distributed over DHT peers a to x, as on the previous slide]

RDFS class axioms

(1) <X, subClassOf, Z> <- <X, subClassOf, Y>, <Y, subClassOf, Z>

(2) <X, type, Z> <- <X, type, Y>, <Y, subClassOf, Z>

Derived triple (appears three times in the figure):

<monk_seal, subClassOf, animal>

Page 14:

[Figure: the triples of Peer 1 and Peer 2 distributed over DHT peers a to x, as on the previous slide]

RDFS class axioms

(1) FORALL O,V O[rdfs:subClassOf->V] <- EXISTS W (O[rdfs:subClassOf->W] AND W[rdfs:subClassOf->V]).

(2) FORALL O,T O[rdf:type->T] <- EXISTS S (S[rdfs:subClassOf->T] AND O[rdf:type->S]).

Derived triples (each appears three times in the figure):

<monk_seal, subClassOf, animal>
<mseal1, type, animal>
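The two class axioms can be applied by naive forward chaining until no new triple appears. The sketch below runs on a single machine and ignores the DHT entirely; `rdfs_closure` and `TRIPLES` are names chosen for this example.

```python
def rdfs_closure(triples):
    """Forward-chain rules (1) and (2) until no new triple is derived."""
    closed = set(triples)
    while True:
        sub = [(s, o) for s, p, o in closed if p == "subClassOf"]
        derived = set()
        # Rule (1): subClassOf is transitive.
        for x, y in sub:
            for y2, z in sub:
                if y == y2:
                    derived.add((x, "subClassOf", z))
        # Rule (2): an instance of a class is an instance of its superclasses.
        for x, p, y in closed:
            if p == "type":
                for y2, z in sub:
                    if y == y2:
                        derived.add((x, "type", z))
        if derived <= closed:
            return closed
        closed |= derived


TRIPLES = {
    ("rabbit", "subClassOf", "animal"),
    ("seal", "subClassOf", "animal"),
    ("animal", "lives_in", "habitat"),
    ("monk_seal", "subClassOf", "seal"),
    ("mseal1", "type", "monk_seal"),
}

closure = rdfs_closure(TRIPLES)
for triple in sorted(closure - TRIPLES):
    print(triple)
```

Besides the two derived triples listed on the slide, rule (2) also yields <mseal1, type, seal>, so the closure of this example holds eight triples.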

Page 15:

• As shown, the transitive closure has to be calculated; backward chaining would require many DHT messages
• But materializing the closure does not scale to a large number of ontologies
  ◦ E.g. an animal hierarchy: adding the triple <animal, subClassOf, living_organism> means that, for all triples with animal, we need to insert an additional triple
• Control over ontologies
  ◦ Provenance of information
  ◦ Ontologies and instance data are made public
  ◦ Publishers are not in control of their ontologies/data
• One super-user inserts all data
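The cost of the animal-hierarchy example can be made concrete by comparing the materialized closure before and after the new triple is added. This is a sketch under the same two RDFS class rules; the `closure` helper and the variable names are assumptions for the example, and DHT replication of each triple is not counted.

```python
def closure(triples):
    """Naive forward chaining over the two RDFS class rules from the slides."""
    closed = set(triples)
    changed = True
    while changed:
        sub = {(s, o) for s, p, o in closed if p == "subClassOf"}
        # (x p y) and (y subClassOf z) give (x p z) for p in {subClassOf, type}.
        new = {
            (x, p, z)
            for x, p, y in closed
            if p in ("subClassOf", "type")
            for y2, z in sub
            if y == y2
        }
        changed = not new <= closed
        closed |= new
    return closed


BASE = {
    ("rabbit", "subClassOf", "animal"),
    ("seal", "subClassOf", "animal"),
    ("animal", "lives_in", "habitat"),
    ("monk_seal", "subClassOf", "seal"),
    ("mseal1", "type", "monk_seal"),
}

before = closure(BASE)
after = closure(BASE | {("animal", "subClassOf", "living_organism")})
added = after - before
print(len(added), "triples must be inserted into the DHT")
for triple in sorted(added):
    print(triple)
```

In this small example one published triple leads to five insertions: the triple itself plus one derived triple for each class or instance below animal.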

Page 16:

• Each peer maintains its ontology and instance data
• Mappings are (manually) defined between ontologies
• Thus, a semantic topology is created
• Queries are posted according to such a schema and forwarded following these mappings
• Semantic Web counterpart of Federated Databases
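Query forwarding along mappings might look roughly like the following sketch. Everything in it is hypothetical: the two stores, the vocabulary of P2 ("pinniped", "fauna"), the mapping table and the function names are invented to illustrate the idea, and real federated systems use far richer mapping languages.

```python
from collections import deque

# Hypothetical peers, each holding triples in its own vocabulary.
STORES = {
    "P1": {("seal", "subClassOf", "animal"), ("rabbit", "subClassOf", "animal")},
    "P2": {("pinniped", "subClassOf", "fauna")},
}

# Hypothetical, manually defined term mappings between neighbouring peers.
MAPPINGS = {
    ("P1", "P2"): {"seal": "pinniped", "animal": "fauna"},
}


def translate(pattern, mapping):
    """Rewrite a query pattern into the neighbour's vocabulary (None = wildcard)."""
    return tuple(None if t is None else mapping.get(t, t) for t in pattern)


def matches(pattern, triple):
    return all(p is None or p == t for p, t in zip(pattern, triple))


def federated_query(start, pattern):
    """Answer locally, then forward the translated query along the mappings."""
    results = []
    seen = {start}
    queue = deque([(start, pattern)])
    while queue:
        peer, pat = queue.popleft()
        results += [(peer, t) for t in sorted(STORES[peer]) if matches(pat, t)]
        for (src, dst), mapping in MAPPINGS.items():
            if src == peer and dst not in seen:
                seen.add(dst)
                queue.append((dst, translate(pat, mapping)))
    return results


for peer, triple in federated_query("P1", (None, "subClassOf", "animal")):
    print(peer, triple)
```

The set of peers a query reaches is determined by the mapping edges, which is the semantic topology mentioned above.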

Page 17:

• Bootstrapping
  ◦ New peers have to manually map their ontologies to the ontology of a peer already in the network
  ◦ Finding relevant ontologies requires flooding
• Routing
  ◦ The overlay is created according to the ontologies understood by peers, not the data they contain. Possible scalability problem.
  ◦ Searching for instances requires flooding

Page 18:

• Effort to combine both approaches
  ◦ Use a DHT to efficiently find ontologies and instance data
  ◦ Exploit semantic locality by keeping ontologies local to the publisher
  ◦ Whenever possible, perform reasoning peer-to-peer

Page 19:

Peer 1: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>
Peer 2: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>

[Figure: a grid of DHT peers with IDs a to x. The DHT stores only an index from terms to the peers that hold them:]

animal: P1
habitat: P1
lives_in: P1
monk_seal: P2
mseal1: P2
rabbit: P1
seal: P1, P2
subClassOf: P1, P2
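Building such an index is straightforward. The sketch below assumes that every term of every published triple is indexed; `PUBLISHED` and `build_index` are names chosen for this example, and in the approach described here the resulting entries would live in the DHT rather than in one dictionary.

```python
from collections import defaultdict

# Triples stay with their publishers; only term -> peer pointers are shared.
PUBLISHED = {
    "P1": [
        ("rabbit", "subClassOf", "animal"),
        ("seal", "subClassOf", "animal"),
        ("animal", "lives_in", "habitat"),
    ],
    "P2": [
        ("monk_seal", "subClassOf", "seal"),
        ("mseal1", "type", "monk_seal"),
    ],
}


def build_index(published):
    """Map every term to the set of peers holding a triple that mentions it."""
    index = defaultdict(set)
    for peer, triples in published.items():
        for triple in triples:
            for term in triple:
                index[term].add(peer)
    return index


index = build_index(PUBLISHED)
for term in sorted(index):
    print(f"{term}: {', '.join(sorted(index[term]))}")
```

Under this assumption the index also contains an entry for `type`, which the slide does not show.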

Page 20:

Peer 1: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>
Peer 2: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>
Peer 3

[Figure: DHT peers a to x holding the index: animal:P1, rabbit:P1, seal:P1,P2, subClassOf:P1,P2, monk_seal:P2, mseal1:P2, habitat:P1, lives_in:P1]

Query: <seal, subClassOf, X?> <Y?, subClassOf, seal>

DHT lookup: seal? -> P1, P2

Answers: <monk_seal, subClassOf, seal>, <seal, subClassOf, animal>

Page 21:

Peer 1: <rabbit, subClassOf, animal>, <seal, subClassOf, animal>, <animal, lives_in, habitat>
Peer 2: <monk_seal, subClassOf, seal>, <mseal1, type, monk_seal>
Peer 3

[Figure: DHT peers a to x holding the index: animal:P1, rabbit:P1, seal:P1,P2, subClassOf:P1,P2, monk_seal:P2, mseal1:P2, habitat:P1, lives_in:P1]

Query: <monk_seal, subClassOf, X?>

DHT lookup: monk_seal? -> P2

Answer: <monk_seal, subClassOf, seal>

Follow-up query: <seal, subClassOf, X?>

DHT lookup: seal? -> P1

Answer: <seal, subClassOf, animal>
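The lookup-then-ask pattern of the last two slides can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: `STORES`, `INDEX`, `lookup`, `ask` and `superclasses` are hypothetical names, the index is a plain dictionary instead of a DHT, and every peer listed for a term is asked.

```python
# Triples remain at their publishers.
STORES = {
    "P1": {
        ("rabbit", "subClassOf", "animal"),
        ("seal", "subClassOf", "animal"),
        ("animal", "lives_in", "habitat"),
    },
    "P2": {
        ("monk_seal", "subClassOf", "seal"),
        ("mseal1", "type", "monk_seal"),
    },
}

# Stand-in for the DHT index: term -> peers that mention the term.
INDEX = {}
for peer, triples in STORES.items():
    for triple in triples:
        for term in triple:
            INDEX.setdefault(term, set()).add(peer)


def lookup(term):
    """DHT lookup: which peers know about this term?"""
    return sorted(INDEX.get(term, ()))


def ask(peer, pattern):
    """Send a triple pattern (None = variable) to one peer."""
    return sorted(
        t for t in STORES[peer]
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


def superclasses(cls):
    """Follow subClassOf links peer-to-peer, one index lookup per class."""
    found, frontier = [], [cls]
    while frontier:
        current = frontier.pop()
        for peer in lookup(current):
            for _, _, parent in ask(peer, (current, "subClassOf", None)):
                if parent not in found:
                    found.append(parent)
                    frontier.append(parent)
    return found


print(lookup("seal"))
print(superclasses("monk_seal"))
```

No closure is materialized here: the chain from monk_seal to animal is resolved at query time, at the price of one lookup and one or more peer contacts per step.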

Page 22:

• Control
  ◦ Access control
  ◦ Select which data is published on the index
  ◦ Trust: ban spammers, remember good peers
• Privacy
  ◦ It is possible to obfuscate descriptors stored in the DHT
• Responsibility
  ◦ Publisher has the responsibility to maintain own data
• Scalability
  ◦ DHTs can scale to millions of nodes
• Data is up-to-date

Page 23:

• Based on data from Swoogle, there is currently little overlap between ontologies
• The distribution of ontology popularity follows a power-law pattern
• If most answers reside on the same peer, our approach outperforms those that rely on triple distribution on top of a DHT

Page 24:

• Simulations using SWDs from Swoogle and Watson (around 25,000)
• Integration of privacy in the index
• Selecting the right ontologies/peers

Page 25:

?