Date post: 11-Jun-2020
Peer-to-Peer and GRID Computing (ID2210)
ID2210 - Introduction

Seif Haridi ([email protected])
Fatemeh Rahimian ([email protected])
Amir H. Payberah ([email protected])


Course Objective


Objectives

•Introduction to basic concepts and principles of large-scale, dynamic, decentralized distributed systems and distributed algorithms.

•Study of peer-to-peer overlays, DHTs, gossip-based algorithms and content distribution networks.

•Implementation and evaluation of some peer-to-peer algorithms in a simulator environment.

•How to read, review and present a scientific paper.


Topics of Study

•Fundamental results in large-scale distributed algorithms.

•Overview of peer-to-peer systems, algorithms, and applications.

•Study of Distributed Hash Tables (DHTs), also called Structured Overlay Networks (SONs).

•Gossip and Epidemic Overlays.

•Content and Streaming Distribution Networks.


Non Objectives

•Learning about GRID concepts and applications.

•Learning how to program centralized services, such as Web services technology.


Material

•Mainly based on research papers.

•You will find all the material on the course webpage: http://www.ict.kth.se/courses/ID2210/


Examination

•Students will work in groups of 1 or 2.

•The course has four types of requirements:

•Reading assignment: 30 points

•Lab assignment: 40 points

•Group presentation: 15 points

•Quiz: 15 points


Reading Assignment

•Read, summarize and review three papers.

•For each paper studied, write a summary report:
  •Identify and motivate the problem.
  •Pinpoint the main contributions.
  •Explain the solution(s).
  •Explain how it is evaluated.
  •Identify positive and negative aspects of the solution/paper.
  •Answer a few given questions.

•Your summary report is reviewed and graded by two other groups.
  •The reviews affect the grade of the reviewer group only.


Lab Assignment

•You implement two peer-to-peer systems in Kompics.

•You evaluate, in simulation, the performance or properties of the implemented systems. For the evaluation part you will be given the source code.

•Report your results in a document.


Group Presentation and Quiz

•You give a 15-minute talk on a scientific paper.

•The list of papers will be available on the course webpage.

•You are free to choose any other paper, but it must be approved by the teaching assistants.

•In the quiz we will ask questions based on the lecture notes and their corresponding papers.


Final Grade

•Final grade is determined by the sum of the assignment grades.

  A: 90 – 100
  B: 80 – 89
  C: 70 – 79
  D: 60 – 69
  E: 50 – 59
  F: < 50


Discussion Forum

•Use the course discussion forum if you have any questions: http://www.ict.kth.se/courses/ID2210/

•Please use your Firstname-Lastname for your login.


Teachers

•Course responsible: Seif Haridi ([email protected])

•Teaching assistants: Amir H. Payberah ([email protected]), Fatemeh Rahimian ([email protected])

•Guest lecturers: Sarunas Girdzijauskas ([email protected]), Jim Dowling ([email protected])


Course Overview


P2P: Why Should We Care?

[Figure: CacheLogic Research, Internet Protocol Breakdown 1993-2006]


Outline

•What is P2P?

•Evolution (discovery related)
  •1st Generation: Centralized systems
    •Napster
  •2nd Generation: Flooding-based systems
    •Gnutella
  •3rd Generation: Distributed Hash Tables (DHT)
    •Chord, Pastry, Kademlia, etc.


What is P2P Computing? (1/3)

•Oram (first book on P2P): P2P is a class of applications that takes advantage of resources (storage, CPU, etc.) available at the edges of the Internet.

•Because accessing these decentralized resources means operating in an environment of unstable connectivity and unpredictable IP addresses, P2P nodes must operate outside the DNS system and have significant or total autonomy from central servers.


What is P2P Computing? (2/3)

•P2P Working Group (a standardization effort): P2P computing is the sharing of computer resources and services by direct exchange between systems.

•Peer-to-peer computing takes advantage of existing computing power and networking connectivity, allowing economical clients to leverage their collective power to benefit the entire enterprise.


What is P2P Computing? (3/3)

•Our view: P2P computing is distributed computing with the following desirable properties:

  •Resource Sharing
  •Dual client/server role
  •Decentralization/Autonomy
  •Scalability
  •Robustness/Self-Organization


P2P Research Issues

•Discovery: Where are things?

•Content Distribution: How fast can we get things?

•NAT/Firewalls: jumping over them

•Security

•Anonymity

• ...


Let us see how it all started ...

•Some users store data items on their machines.

•Other users are interested in this data.

•Problem: How does a user know which other user(s) in the world have the data item(s) that s/he desires?


Let us see how it all started ...

[Figure: rakhsh.sics.se holds Ubuntu.iso and Britney.mp3; castor.sics.se holds Hello.mp3 and FamilyGuy.avi; further nodes are elided. One node asks: "Where is FamilyGuy.avi?"]


Example P2P Problem: Lookup

•At the heart of all P2P systems.

[Figure: publishers and a client connected through the Internet. A publisher holds Key = "title", Value = data file; the client issues Lookup("title").]


First Generation


First Generation of P2P Systems

•Central Directory + Distributed Storage

[Figure: rakhsh.sics.se (Ubuntu.iso, Britney.mp3), castor.sics.se (GaGa.mp3, FamilyGuy.avi) and x.kth.se (Ubuntu.iso) send queries to the central directory; data transfers go directly between nodes.]

Central directory:
  FamilyGuy.avi → {castor.sics.se}
  Britney.mp3 → {rakhsh.sics.se}
  GaGa.mp3 → {castor.sics.se}
  Ubuntu.iso → {rakhsh.sics.se, x.kth.se}


Basic Operations in Napster

•Join
  •Connect to the central server (Napster)

•Share (Publish/Insert)
  •Inform the server about what you have

•Leave/Fail
  •Simply disconnect
  •Server detects failure, removes your data from the directory

•Search (Query)
  •Ask the central server and it returns a list of hits

•Download
  •Directly download from other nodes using the hits provided by the server
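The operations above can be sketched as a small in-memory index. This is only an illustration under assumptions: the class name `CentralDirectory` and its method names are hypothetical and do not reflect the real Napster protocol or wire format. The download step is left out because it happens directly between peers, outside the directory.

```python
class CentralDirectory:
    """Hypothetical Napster-style index: file name -> set of peers holding it."""

    def __init__(self):
        self.index = {}   # file name -> set of peer names
        self.shared = {}  # peer name -> set of file names it has shared

    def join(self, peer):
        # Joining only registers the peer; nothing is shared yet.
        self.shared.setdefault(peer, set())

    def share(self, peer, filename):
        # The peer informs the server about a file it holds.
        self.join(peer)
        self.shared[peer].add(filename)
        self.index.setdefault(filename, set()).add(peer)

    def leave(self, peer):
        # On leave or detected failure, drop everything the peer shared.
        for filename in self.shared.pop(peer, set()):
            holders = self.index[filename]
            holders.discard(peer)
            if not holders:
                del self.index[filename]

    def search(self, filename):
        # Return the list of hits; the download itself is peer to peer.
        return sorted(self.index.get(filename, set()))


directory = CentralDirectory()
directory.share("rakhsh.sics.se", "Ubuntu.iso")
directory.share("rakhsh.sics.se", "Britney.mp3")
directory.share("x.kth.se", "Ubuntu.iso")
print(directory.search("Ubuntu.iso"))
```

The server keeps one entry per shared item, so its state grows with the size of the system, and every search depends on this one object being reachable.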


Centralized Lookup

[Figure: publisher P announces Share("title", P) for Key = "title", Value = data file; client C issues Search("title").]


The End of Napster

•Since users of Napster stored copyrighted material, the service was stopped for legal reasons.


Napster Advantage/Disadvantage?

•Advantage
  •Simple

•Disadvantage
  •O(N) state in server
  •Single point of failure


Second Generation


Second Generation of P2P Systems

•Distributed Directory + Distributed Storage

[Figure: several nodes, each holding its own part of the directory.]


Gnutella Protocol Messages

•Broadcast Messages
  •Ping: initiating message ("I'm here") for overlay maintenance
  •Query: search pattern and TTL (time-to-live)

•Back-Propagated Messages
  •Pong: reply to a ping, contains information about the peer
  •Query Hit: contains information about the computer that has the requested file

•Node-to-Node Messages
  •GET: return the requested file
  •PUSH: push the file to the requester node


Gnutella Search Mechanism

•Node 2 initiates search for file A

•Sends message to all neighbours

•Neighbours forward message

•Nodes that have file A initiate a reply message

•Query reply message is back-propagated

•Node 2 directly connects to node 7 and downloads file A

[Figure: overlay of nodes 1 to 7; node 7 holds file A.]
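The search steps above amount to a TTL-limited flood. The sketch below is a simplified model, not the Gnutella wire protocol: the function `flood_query`, the `overlay` topology and the `files` placement are all assumptions chosen for illustration (the slide's exact topology is not recoverable here), and back-propagation of hits is reduced to returning a list.

```python
from collections import deque


def flood_query(neighbours, files, start, wanted, ttl):
    """Flood a query from `start` for at most `ttl` hops; return nodes that hit."""
    seen = {start}                      # nodes that already received the query
    frontier = deque([(start, ttl)])
    hits = []
    while frontier:
        node, remaining = frontier.popleft()
        if node != start and wanted in files.get(node, set()):
            hits.append(node)           # in Gnutella this would travel back as a Query Hit
        if remaining == 0:
            continue                    # TTL exhausted: do not forward further
        for peer in neighbours[node]:
            if peer not in seen:        # duplicate queries are dropped
                seen.add(peer)
                frontier.append((peer, remaining - 1))
    return sorted(hits)


# Hypothetical overlay: adjacency lists for nodes 1 to 7.
overlay = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 6], 5: [3, 7], 6: [4], 7: [5]}
files = {5: {"A"}, 7: {"A"}}

print(flood_query(overlay, files, start=2, wanted="A", ttl=2))
print(flood_query(overlay, files, start=2, wanted="A", ttl=4))
```

With a small TTL the query from node 2 never reaches the holders, which is why flooding gives no guarantee of finding an item that exists.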


Gnutella Advantage/Disadvantage?

•Advantage
  •Robust

•Disadvantage
  •Worst case O(N) messages per lookup
  •No guarantee of finding a data item (because of the TTL)


Third Generation


Distributed Hash Tables (DHT)

•An ordinary hashtable, which is distributed.

Key      Value
Fatemeh  Stockholm
Ali      California
Tallat   Islamabad
Cosmin   Bucharest
Seif     Stockholm
Amir     Tehran


Distributed Hash Tables (DHT)

•put(key, value), get(key) interface.

•The neighbours of a node are well-defined and not randomly chosen.

•Values are no longer stored at their owners; instead the network chooses at which node a data item will be stored.

•Every node provides a lookup operation.

•Nodes keep routing pointers.
  •If item not found, route to another node.


The Key Idea in DHTs

•1st and 2nd Generation: Each data item is stored in the machine of its creator/downloader.

•3rd Generation (DHTs): The ID of a data item determines the machine on which it is going to be stored.


Distributed Hash Tables (DHT)

1. Decide on a common key space for nodes and values.
2. Connect the nodes smartly.
3. Make a strategy for assigning items to nodes.

[Figure: the set of nodes and the set of items are both mapped to keys in the common key space.]


Consistent Hashing using a Ring (1/6)

•Identifier space of size 16, [0, 15].

Nodes: rakhsh.sics.se, castor.sics.se, x.kth.se, 193.9.9.3

H(rakhsh.sics.se)=12
H(castor.sics.se)=3
H(x.kth.se)=0
H(192.9.9.3)=7


Consistent Hashing using a Ring (2/6)

•Identifier space of size 16, [0, 15].

Nodes: rakhsh.sics.se, castor.sics.se, x.kth.se, 193.9.9.3

H(rakhsh.sics.se)=12
H(castor.sics.se)=3
H(x.kth.se)=0
H(192.9.9.3)=7

Items: plan.tex, id2210.pdf, hello.mp3

H(plan.tex)=2
H(id2210.pdf)=12
H(hello.mp3)=14
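The slides do not say which hash function H is. As a sketch only, assuming SHA-1 truncated to the 16-value identifier space (the helper name `ident` is hypothetical), node names and item names can be mapped into the same space as follows. The identifiers this produces will generally differ from the illustrative values on the slide.

```python
import hashlib

ID_SPACE = 16  # identifiers 0..15, as on the slide


def ident(name: str) -> int:
    """Map a node name or an item name into the shared identifier space."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ID_SPACE


for name in ["rakhsh.sics.se", "castor.sics.se", "x.kth.se", "plan.tex", "hello.mp3"]:
    print(name, ident(name))
```

The point is only that nodes and items share one key space; with so few identifiers, collisions between names are likely, which is why real systems use a much larger space.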


Consistent Hashing using a Ring (3/6)

•Assume the ID space is [0, 15], i.e. a maximum of 16 nodes.

•We treat this range as a circular id space.

•succ(x) is the first node on the ring with id greater than or equal to x (wrapping around from 15 to 0), where x is the id of a document or node.

•The successor of node i is succ(i+1).

•Thus, the nodes form a ring.

[Figure: circular identifier space 0 to 15 with nodes at ids 0, 3, 7 and 12.]
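The definition of succ(x) can be written down directly for the example ring. This is a minimal sketch with global knowledge of all node ids, which a real node would not have; the names `succ`, `successor_of_node` and `NODES` are illustrative.

```python
from bisect import bisect_left

ID_SPACE = 16
NODES = [0, 3, 7, 12]  # node ids from the example, kept sorted


def succ(x, nodes=NODES):
    """First node id >= x on the ring, wrapping past 15 back to the lowest id."""
    i = bisect_left(nodes, x % ID_SPACE)
    return nodes[i % len(nodes)]  # i == len(nodes) means we wrapped around


def successor_of_node(i, nodes=NODES):
    """The successor of node i is succ(i + 1)."""
    return succ(i + 1, nodes)


print([succ(x) for x in (2, 12, 14)])
print([successor_of_node(n) for n in NODES])
```

For the example ids, succ(2) is 3, succ(12) is 12 and succ(14) wraps around to 0, and following `successor_of_node` from any node visits 0, 3, 7, 12 in a cycle.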


Consistent Hashing using a Ring (4/6)

•Using this ring, we can decide which item is stored at which node.

•Initially, node 0 stored item 2 and node 7 stored items 12 and 14.

•The policy is: an item with ID x is stored at the node with id succ(x).

[Figure: the ring with nodes 0, 3, 7 and 12; item 2 is at node 0, items 12 and 14 are at node 7.]


Consistent Hashing using a Ring (5/6)

•The policy is: an item with ID x is stored at the node with id succ(x).

•So, node 0 gets to store item 14, node 3 to store item 2, and node 12 to store item 12.

•But how can we do this?

[Figure: the ring with nodes 0, 3, 7 and 12; item 2 is now at node 3, item 12 at node 12 and item 14 at node 0.]


Consistent Hashing using a Ring (6/6)

•But how can we do this?

•If the successor pointers are already in place, the two operations, get and put, can be done simply by following them sequentially.

•From any node, you can do: put(hash(item), item)

•From any node, you can do: get(hash(item))

[Figure: the ring with nodes 0, 3, 7 and 12; item 2 at node 3, item 12 at node 12 and item 14 at node 0.]
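A sketch of put and get that use only successor pointers, as described above. The `Node` class, `build_ring` and the hop counter are assumptions made for illustration; keys are passed in already hashed into [0, 15]. A node hands the request to its successor until the key falls between itself and that successor.

```python
def in_interval(key, lo, hi):
    """True if key lies in the ring interval (lo, hi]; lo == hi means the whole ring."""
    if lo < hi:
        return lo < key <= hi
    return key > lo or key <= hi  # interval wraps past 0


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self     # fixed up by build_ring
        self.store = {}

    def find_owner(self, key):
        """Follow successor pointers one by one; return (owner, hops taken)."""
        node, hops = self, 0
        while not in_interval(key, node.id, node.successor.id):
            node, hops = node.successor, hops + 1
        return node.successor, hops + 1

    def put(self, key, value):
        owner, _ = self.find_owner(key)
        owner.store[key] = value

    def get(self, key):
        owner, _ = self.find_owner(key)
        return owner.store.get(key)


def build_ring(ids):
    nodes = [Node(i) for i in sorted(ids)]
    for k, node in enumerate(nodes):
        node.successor = nodes[(k + 1) % len(nodes)]
    return nodes


ring = build_ring([0, 3, 7, 12])
ring[0].put(2, "plan.tex")      # issued at node 0
ring[2].put(12, "id2210.pdf")   # issued at node 7
ring[1].put(14, "hello.mp3")    # issued at node 3
print({n.id: n.store for n in ring})
print(ring[3].get(2))           # issued at node 12
```

Following successors one at a time costs up to N hops per operation, which is the motivation for the extra routing pointers that DHTs such as Chord add.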


Distributed Hash Tables (DHT)

•Nodes are the hash buckets

•Key identifies data uniquely

•DHT balances keys and data across nodes

•DHT replicates, caches, routes lookups, etc.

[Figure: distributed applications call insert(key, data) and lookup(key) → data on the DHT, which spans the nodes.]


Why DHTs Now?

•Demand pulls
  •Growing need for security and robustness.
  •Large-scale distributed applications are difficult to build.
  •Many applications use location-independent data.

•Technology pushes
  •Faster and better computers: every PC can be a server.
  •Scalable lookup algorithms.
  •Trustworthy systems from untrusted components.


DHT is a Good Interface

•Supports a wide range of applications, because
  •Keys have no semantic meaning
  •Values are application dependent

•Minimal interface

DHT:     lookup(key) → data;  insert(key, data)
UDP/IP:  send(IP addr, data);  recv(IP addr) → data


DHT is a Good Shared Infrastructure

•Applications inherit some security and robustness from DHT
  •DHT replicates data
  •Resistant to malicious participants

•Low-cost deployment
  •Self-organizing across administrative domains
  •Can be shared among applications

•Supports large-scale workloads


DHT Applications

• Distributed File Systems [CFS, OceanStore, PAST, Arla/DKS]

•Web cache/archives [Squirrel]

• Censor-resistant stores [Eternity, FreeNet]

• Event notification [Scribe, DKS]

• Naming systems [ChordDNS, INS]

• Query and indexing [Kademlia]

• Communication primitives [I3]

• Backup store [HiveNet]

• Distributed Authorizations Delegation


Name-based Communication

•Map names to locations

Key      Value
Fatemeh  130.237.32.10
Ali      192.9.12.5
Amir     18.7.23.5
Cosmin   10.10.95.4
Seif     127.5.220.12
Tallat   12.110.210.2


Distributed Backup

•Clients install the backup tool

•Decide on amount of space to share

•Choose files for backup

•Data is encrypted

•Stored in the directory

Key       Value
Hi.mp3    2343
2210.txt  2511
Bye.avi   4539
...       ...


A Page to Remember

•P2P computing
  •Resource Sharing
  •Dual client/server role
  •Decentralization/Autonomy
  •Scalability
  •Robustness/Self-Organization

•Three generations:
  •1st Generation: Centralized systems
    •Napster
  •2nd Generation: Flooding-based systems
    •Gnutella
  •3rd Generation: Distributed Hash Tables (DHT)
    •Chord, Pastry, Kademlia, etc.


Questions?

