Page 1: Peer To Peer Distributed Systems

Peer To Peer Distributed Systems

Pete Keleher

Page 2: Peer To Peer Distributed Systems

Why Distributed Systems?

Aggregate resources!
– memory
– disk
– CPU cycles

Proximity to physical stuff
– things with sensors
– things that print
– things that go boom
– other people

Fault tolerance!
– Don’t want one tsunami to take everything down

Page 3: Peer To Peer Distributed Systems

Why Peer To Peer Systems?

What’s peer to peer?

Page 4: Peer To Peer Distributed Systems

(Traditional) Client-Server

[Diagram: one central server connected to many clients]

Page 5: Peer To Peer Distributed Systems

Peer To Peer

– Lots of reasonable machines
  • No one machine loaded more than others
  • No one machine irreplaceable!

Page 6: Peer To Peer Distributed Systems

Peer-to-Peer (P2P)

Where do the machines come from?
– “found” resources
  • SETI@home
  • BOINC
– existing resources
  • computing “clusters” (32, 64, …)

What good is a peer-to-peer system?
– all those things mentioned before, including storage: files, MP3s, leaked documents, porn …

Page 7: Peer To Peer Distributed Systems

The lookup problem

[Diagram: nodes N1–N6 on the Internet; a publisher wants to store (key=“title”, value=MP3 data…) and a client wants to run Lookup(“title”)]

Page 8: Peer To Peer Distributed Systems

Centralized lookup (Napster)

[Diagram: publisher N4 calls SetLoc(“title”, N4) on a central database DB; the client asks the DB Lookup(“title”) and fetches (key=“title”, value=MP3 data…) directly from N4]

Simple, but O(N) state and a single point of failure.

Page 9: Peer To Peer Distributed Systems

Flooded queries (Gnutella)

[Diagram: the client’s Lookup(“title”) is flooded from neighbor to neighbor until it reaches publisher N4, which holds (key=“title”, value=MP3 data…)]

Robust, but worst case O(N) messages per lookup.

Page 10: Peer To Peer Distributed Systems

Routed queries (Freenet, Chord, etc.)

[Diagram: the client’s Lookup(“title”) is routed hop by hop toward publisher N4, which holds (key=“title”, value=MP3 data…)]

Bad load balance.

Page 11: Peer To Peer Distributed Systems

Routing challenges

Define a useful key nearness metric.

Keep the hop count small.
– O(log N)

Keep the routing tables small.
– O(log N)

Stay robust despite rapid changes.

Page 12: Peer To Peer Distributed Systems

Distributed Hash Tables to the Rescue!

Load Balance: Distributed hash function spreads keys evenly over the nodes (Consistent hashing).

Decentralization: Fully distributed (Robustness).

Scalability: Lookup grows as a log of number of nodes.

Availability: Automatically adjusts internal tables to reflect changes.

Flexible Naming: No constraints on key structure.

Page 13: Peer To Peer Distributed Systems

What’s a Hash?

Wikipedia: any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer.

Example: Assume N is a large prime, and ‘a’ means the ASCII code for the letter ‘a’ (it’s 97).

H(“pete”) = H(“pet”) × N + ‘e’
          = (H(“pe”) × N + ‘t’) × N + ‘e’
          = 451845518507

H(“pete”)  mod 1000 = 507
H(“peter”) mod 1000 = 131
H(“petf”)  mod 1000 = 986

It’s a deterministic random number generator!
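A minimal sketch of the rolling hash described above. The slide does not give the prime N, so the 1000003 used here is an arbitrary illustrative choice and the absolute values will differ from the 451845518507 example; only the behavior (nearby inputs scatter to unrelated buckets) is the point.

```python
N = 1000003  # assumption: some large prime; the slide leaves N unspecified

def H(s: str) -> int:
    """Polynomial rolling hash: H(s + c) = H(s) * N + ord(c)."""
    h = 0
    for c in s:
        h = h * N + ord(c)
    return h

for word in ("pete", "peter", "petf"):
    print(word, H(word) % 1000)  # nearby inputs land on scattered buckets
```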

Page 14: Peer To Peer Distributed Systems

Chord (a DHT)

m-bit identifier space for both keys and nodes.

Key identifier = SHA-1(key).

Node identifier = SHA-1(IP address).

Both are uniformly distributed.

How to map key IDs to node IDs?
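A short sketch of how such identifiers can be computed: SHA-1, truncated to m bits. The choice m = 7 matches the 7-bit ring on the next slide, and the input strings are made-up examples.

```python
import hashlib

M = 7  # bits in the identifier space (assumption: matches the 7-bit ring below)

def chord_id(data: str, m: int = M) -> int:
    """SHA-1 hash reduced to an m-bit identifier."""
    digest = hashlib.sha1(data.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

node_id = chord_id("128.8.126.7")  # node identifier = SHA-1(IP address)
key_id = chord_id("title")         # key identifier = SHA-1(key)
print(node_id, key_id)
```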

Page 15: Peer To Peer Distributed Systems

Consistent hashing [Karger 97]

[Diagram: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80 (“K5” = key 5, “N105” = node 105)]

A key is stored at its successor: node with next higher ID
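A minimal sketch of the “key is stored at its successor” rule, assuming a global view of all node IDs (Chord itself finds the successor with routed messages; this helper just makes the rule concrete). The node IDs match the diagram.

```python
import bisect

def successor(key_id: int, node_ids: list[int]) -> int:
    """Return the first node ID clockwise from key_id, wrapping past the top."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]

nodes = [32, 90, 105]
print(successor(5, nodes))    # 32
print(successor(80, nodes))   # 90
print(successor(110, nodes))  # 32 (wraps around the ring)
```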

Page 16: Peer To Peer Distributed Systems

Basic lookup

[Diagram: ring with nodes N10, N32, N60, N90, N105, N120; the question “Where is key 80?” is forwarded from successor to successor around the ring until the answer comes back: “N90 has K80”]
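A sketch of this basic lookup: every node knows only its successor, so the query crawls the ring one hop at a time. The Node class and ring-wiring code are illustrative, not from the Chord paper.

```python
class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.successor: "Node" = self  # fixed up when the ring is built

    def in_range(self, key_id: int) -> bool:
        """True if key_id lies on the half-open arc (self.id, successor.id]."""
        lo, hi = self.id, self.successor.id
        if lo < hi:
            return lo < key_id <= hi
        return key_id > lo or key_id <= hi  # arc wraps past zero

    def find_successor(self, key_id: int) -> "Node":
        node = self
        while not node.in_range(key_id):
            node = node.successor  # O(N) hops in the worst case
        return node.successor

# Build the ring from the diagram and ask "Where is key 80?"
ring = [Node(i) for i in (10, 32, 60, 90, 105, 120)]
for a, b in zip(ring, ring[1:] + ring[:1]):
    a.successor = b
print(ring[0].find_successor(80).id)  # 90: "N90 has K80"
```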


Page 21: Peer To Peer Distributed Systems

“Finger table” allows log(N)-time lookups

[Diagram: node N80 with fingers spanning ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring]

Every node knows m other nodes in the ring

Page 22: Peer To Peer Distributed Systems

Finger i points to successor of n + 2^(i−1)

[Diagram: N80’s fingers; the finger for 80 + 32 = 112 points to N120]

Each node knows more about the portion of the circle close to it
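A sketch of N80’s finger table under the same m = 7 assumption, reusing the successor() helper from the consistent-hashing sketch: finger i points to successor(n + 2^(i−1)) for i = 1..m.

```python
def finger_table(n: int, node_ids: list[int], m: int = 7) -> list[int]:
    size = 2 ** m
    return [successor((n + 2 ** (i - 1)) % size, node_ids) for i in range(1, m + 1)]

print(finger_table(80, [10, 32, 60, 80, 90, 105, 120]))
# [90, 90, 90, 90, 105, 120, 32] -- the finger for 80 + 32 = 112 is N120, as in
# the diagram; the last finger (80 + 64 = 144 mod 128 = 16) wraps around to N32.
```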

Page 23: Peer To Peer Distributed Systems

Lookups take O(log(N)) hops

[Diagram: ring with nodes N5, N10, N20, N32, N60, N80, N99, N110; Lookup(K19) hops along finger pointers and reaches K19’s successor (N20) in O(log N) hops]
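A sketch of the finger-based routing behind the O(log N) bound, extending the Node class from the basic-lookup sketch: each hop jumps to the closest finger that precedes the key, roughly halving the remaining distance. Simplified: the finger lists are assumed to be already populated (as in the finger_table sketch) and there is no failure handling.

```python
def between(x: int, lo: int, hi: int) -> bool:
    """True if x lies on the open arc (lo, hi) of the circle."""
    if lo < hi:
        return lo < x < hi
    return x > lo or x < hi  # arc wraps past zero

class FingerNode(Node):
    def __init__(self, node_id: int):
        super().__init__(node_id)
        self.fingers: list["FingerNode"] = []  # fingers[i-1] = successor(id + 2**(i-1))

    def closest_preceding_finger(self, key_id: int) -> "FingerNode":
        for f in reversed(self.fingers):  # farthest finger that still precedes the key
            if between(f.id, self.id, key_id):
                return f
        return self

    def find_successor(self, key_id: int) -> "FingerNode":
        node = self
        while not node.in_range(key_id):
            node = node.closest_preceding_finger(key_id)
        return node.successor
```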


Page 28: Peer To Peer Distributed Systems

Joining: linked list insert

[Diagram: new node N36 joins between N25 and N40 (which holds keys K30 and K38); step 1: N36 runs Lookup(36) to find its place in the ring]

Invariants to preserve:
1. Each node’s successor is correctly maintained.
2. For every key k, node successor(k) is responsible for k.

Page 29: Peer To Peer Distributed Systems

Join (2)

[Diagram: step 2: N36 sets its own successor pointer to N40]

Initialize the new node’s finger table.

Page 30: Peer To Peer Distributed Systems

Join (3)

[Diagram: step 3: set N25’s successor pointer to N36]

Update finger pointers of existing nodes.

Page 31: Peer To Peer Distributed Systems

Join (4)

[Diagram: step 4: copy keys 26..36 (here K30) from N40 to N36; K38 stays at N40]

Transferring keys.
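A sketch of the four join steps above, building on the Node class and between() helper from the earlier sketches. Assumptions: each node also keeps a predecessor pointer and a `store` dict of the keys it is responsible for, neither of which is in the slides; the real protocol performs these steps with remote calls plus the stabilization below.

```python
class RingNode(Node):
    def __init__(self, node_id: int):
        super().__init__(node_id)
        self.predecessor: "RingNode" = self
        self.store: dict[int, object] = {}  # keys this node is responsible for

def join(new: RingNode, bootstrap: RingNode) -> None:
    succ = bootstrap.find_successor(new.id)        # 1. Lookup(new.id)
    new.successor = succ                           # 2. new node sets its successor
    pred = succ.predecessor
    pred.successor = new                           # 3. predecessor now points at new
    new.predecessor, succ.predecessor = pred, new
    moved = [k for k in succ.store                 # 4. hand over keys in (pred, new]
             if between(k, pred.id, new.id) or k == new.id]
    for k in moved:
        new.store[k] = succ.store.pop(k)
```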

Page 32: Peer To Peer Distributed Systems

Stabilization Protocol

To handle concurrent node joins/failures/leaves.

Keep successor pointers up to date, then verify and correct finger table entries.

Incorrect finger pointers may only increase latency, but incorrect successor pointers may cause lookup failure.

Nodes periodically run stabilization protocol.

Won’t correct a Chord system that has split into multiple disjoint cycles, or a single cycle that loops multiple times around the identifier space.
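A sketch of the periodic stabilize/notify pair that every node runs on a timer, using the RingNode and between() sketches above. Simplified: no failure detection and no finger repair, both of which the full protocol also performs.

```python
def stabilize(n: RingNode) -> None:
    """Adopt our successor's predecessor if a new node has slipped in between."""
    x = n.successor.predecessor
    if x is not n and between(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)

def notify(succ: RingNode, candidate: RingNode) -> None:
    """Tell succ about a node that might be its new predecessor."""
    if succ.predecessor is succ or between(candidate.id, succ.predecessor.id, succ.id):
        succ.predecessor = candidate
```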

Page 33: Peer To Peer Distributed Systems

Take Home Points

Hashing is used to uniformly distribute data and nodes across a range.

Random distribution balances load.

Awesome systems paper:
– identify commonality across algorithms
– restrict work to implementing that one simple abstraction
– use it as a building block

