Peer-to-Peer Networks and Distributed Hash Tables
2006
2
Peer-peer networking
• File sharing: files are stored at the end-user machines (peers) rather than at a central server (client/server); files are transferred directly between peers.
• Leverage: P2P is a way to leverage vast amounts of computing power, storage, and connectivity from personal computers (PCs) distributed around the world.
• Q: What are the new technical challenges?
• Q: What new services/applications are enabled?
• Q: Is it just "networking at the application level"?
• Everything old is new again?
3
Napster
 Napster -- free music over the Internet
 Key idea: share the content, storage, and bandwidth of individual (home) users
 Model: each user stores a subset of files; each user can access (download) files from all users in the system
• Application-level, client-server protocol (index server) over point-to-point TCP
• How does it work -- four steps:
• Connect to the Napster index server
• Upload your list of files (push) to the server
• Give the server keywords to search the full list with
• Select the "best" of the correct answers (pings)
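A minimal sketch of this centralized-index pattern, in Python (the class and method names are illustrative, not Napster's actual protocol; file transfers themselves would go peer-to-peer):

    # Hypothetical sketch of a Napster-style central index:
    # peers push their file lists; searches only touch the index.
    class IndexServer:
        def __init__(self):
            self.index = {}                    # filename -> set of peers

        def register(self, peer, filenames):   # step 2: push file list
            for name in filenames:
                self.index.setdefault(name, set()).add(peer)

        def search(self, keyword):             # step 3: keyword search
            return {name: peers for name, peers in self.index.items()
                    if keyword in name}

    server = IndexServer()
    server.register("m5", ["songE.mp3"])
    print(server.search("songE"))              # {'songE.mp3': {'m5'}}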
4
Napster: Example
[Figure: machines m1..m6 each store one file (A..F); the index server maps m1→A, m2→B, m3→C, m4→D, m5→E, m6→F. The query "E?" returns m5, and file E is then fetched directly from m5.]
5
Napster characteristics
 Advantages:
- Simplicity; easy to implement sophisticated search engines on top of the index system
 Disadvantages:
• centralized index server:
• single logical point of failure
• can load-balance among servers using DNS rotation
• potential for congestion
• Napster "in control" (freedom is an illusion)
• no security:
• passwords in plain text
• no authentication
• no anonymity
6
Main Challenge
 Find where a particular file is stored
 Scale: up to hundreds of thousands or millions of machines
- 7/2001 simultaneous online users: Napster ~160K, Gnutella ~40K, Morpheus ~300K
 Dynamicity: machines can come and go at any time
[Figure: the same peers holding files A..F; the challenge is locating which peer can answer the query "E?".]
7
Gnutella
 Peer-to-peer networking: peer applications
 Focus: a decentralized method of searching for files
 How to find a file: flood the request
- Send the request to all neighbors
- Neighbors recursively multicast the request
- Eventually a machine that has the file receives the request, and it sends back the answer
 Advantages:
- Totally decentralized, highly robust
 Disadvantages:
- Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL; see the sketch below)
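A minimal sketch of TTL-limited flooding, in Python (the topology is illustrative, and the recursion is sequential where real Gnutella queries neighbors in parallel):

    # Hypothetical sketch of Gnutella-style flooding with a TTL:
    # forward the query to all neighbors until the TTL expires or
    # a machine holding the file is reached.
    def flood_query(graph, files, node, wanted, ttl, seen=None):
        """graph: node -> neighbor list; files: node -> set of files."""
        seen = set() if seen is None else seen
        if node in seen or ttl == 0:
            return None
        seen.add(node)
        if wanted in files[node]:
            return node                        # found: answer travels back
        for neighbor in graph[node]:
            hit = flood_query(graph, files, neighbor, wanted, ttl - 1, seen)
            if hit is not None:
                return hit
        return None

    graph = {"m1": ["m2", "m3"], "m2": [], "m3": ["m4", "m5"],
             "m4": [], "m5": []}
    files = {m: set() for m in graph}
    files["m5"] = {"E"}
    print(flood_query(graph, files, "m1", "E", ttl=4))   # m5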
8
Gnutella: Example
 Assume m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; ...
[Figure: the query "E?" floods from m1 to m2 and m3, then from m3 to m4 and m5; m5 holds E and returns the answer.]
9
Gnutella
 What we care about:
- How much traffic does one query generate?
- How many hosts can it support at once?
- What is the latency associated with querying?
- Is there a bottleneck?
 Late 2000: only 10% of downloads succeeded
- 2001: more than 25% of downloads succeeded (is this success or failure?)
10
BitTorrent
 BitTorrent (BT) is a new-generation P2P system that makes downloads much faster
- The file to be distributed is split into pieces, and an SHA-1 hash is calculated for each piece
 Swarming: parallel downloads among a mesh of cooperating peers
- Scalable: capacity increases with the number of peers/downloaders
- Efficient: it utilises a large amount of the available network bandwidth
 Tracker
- a central server keeping a list of all peers participating in the swarm (handles peer discovery)
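A minimal sketch of the per-piece hashing step, in Python (the piece length is an assumed value; this is not the .torrent metadata format):

    # Hypothetical sketch: split content into fixed-size pieces and
    # compute one SHA-1 digest per piece, so each downloaded piece
    # can be verified independently.
    import hashlib

    def piece_hashes(data: bytes, piece_len: int = 256 * 1024):
        """Return the SHA-1 hex digest of each piece of data."""
        return [hashlib.sha1(data[i:i + piece_len]).hexdigest()
                for i in range(0, len(data), piece_len)]

    hashes = piece_hashes(b"x" * 600_000)
    print(len(hashes))   # 3: two full 256 KiB pieces plus a remainder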
11
BitTorrent... a picture
[Figure: a tracker coordinating several peers, each acting as both uploader and downloader.]
13
Freenet
 Additional goals beyond file location:
- Provide publisher anonymity and security
- Resistance to attacks: a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines
 Architecture:
- Each file is identified by a unique identifier
- Each machine stores a set of files and maintains a "routing table" to route individual requests
14
Data Structure
 Each node maintains a stack of routing entries with three fields:
- id: a file identifier
- next_hop: another node that stores the file with that id
- file: the file identified by id, if stored on the local node
 Forwarding:
- Each message carries the id of the file it refers to
- If the file id is stored locally, stop
- If not, search for the "closest" id in the stack and forward the message to the corresponding next_hop
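A minimal sketch of the "closest id" forwarding rule, in Python (numeric ids and the distance metric are illustrative assumptions):

    # Hypothetical sketch of Freenet-style forwarding: pick the entry
    # whose id is numerically closest to the requested id and forward
    # to its next_hop.
    def next_hop(routing_table, wanted_id):
        """routing_table: list of (id, next_hop, file_or_None) entries."""
        closest = min(routing_table, key=lambda e: abs(e[0] - wanted_id))
        return closest[1]

    table = [(4, "n1", "f4"), (12, "n2", "f12"), (5, "n3", None)]
    print(next_hop(table, 10))   # "n2", since 12 is closest to 10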
15
Query
 API: file = query(id)
 Upon receiving a query for document id:
- Check whether the queried file is stored locally
• If yes, return it
• If not, forward the query message
 Notes:
- Each query is associated with a TTL that is decremented each time the query message is forwarded; to obscure the distance to the originator:
• the TTL can be initialized to a random value within some bounds
• when TTL = 1, the query is forwarded with a finite probability
- Each node maintains state for all outstanding queries that have traversed it, which helps avoid cycles
- When the file is returned, it is cached along the reverse path
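A minimal sketch of the two TTL rules above, in Python (the bounds and the last-hop probability are assumed values, not Freenet's):

    # Hypothetical sketch: a random initial TTL and probabilistic
    # forwarding at TTL == 1 both obscure the originator's distance.
    import random

    def initial_ttl(lo=8, hi=16):
        return random.randint(lo, hi)           # assumed bounds

    def should_forward(ttl, p_last_hop=0.5):
        if ttl > 1:
            return True
        return random.random() < p_last_hop     # finite probability at TTL 1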
16
Query Example
Note: doesn’t show file caching on the reverse path
[Figure: query(10) issued at n1, with hops labeled 1-5 and one backtrack (4'). At each hop the node forwards to the next_hop of the id closest to 10 in its table, until the node holding entry (10, n5, f10) returns the file.]
17
Insert
 API: insert(id, file)
 Two steps:
- Search for the file to be inserted
- If not found, insert the file
 Searching: like a query, but nodes maintain state after a collision is detected and the reply is sent back to the originator
 Insertion:
- Follow the forward path; insert the file at all nodes along the path
- A node probabilistically replaces the originator with itself, obscuring the true originator (see the sketch below)
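A minimal sketch of the originator-rewriting step, in Python (the replacement probability is an assumed value):

    # Hypothetical sketch: while an insert propagates, each node may
    # replace the recorded originator with itself, so downstream nodes
    # cannot tell who truly published the file.
    import random

    def maybe_rewrite_originator(message, self_id, p=0.3):
        if random.random() < p:                 # assumed probability
            message["orig"] = self_id
        return message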
18
Insert Example
Assume the query returned failure along the "blue" path; insert f10
[Figure: insert(10, f10) issued at n1; routing tables (id, next_hop, file) as in the query example.]
19
Insert Example
[Figure: n1 adds the entry (10, n1, f10) and forwards the insert to n2 with orig = n1.]
20
Insert Example
 n2 replaces the originator (n1) with itself
[Figure: n2 stores (10, n2, f10) and forwards the insert with orig = n2.]
21
Insert Example
n2 replaces the originator (n1) with itself
[Figure: the insert continues along the forward path; downstream nodes n3, n4, and n5 each add an entry for id 10, so f10 is now stored along the entire insert path.]
23
Freenet Summary
 Advantages
- Provides publisher anonymity
- Totally decentralized architecture, robust and scalable
- Resistant against malicious file deletion
 Disadvantages
- Does not always guarantee that a file is found, even if the file is in the network
24
Solutions to the Location Problem
 Goal: make sure that an identified item (file) is always found
- Indexing scheme: used to map file names to their locations in the system
- Requires a scalable indexing mechanism
 Abstraction: a distributed hash-table data structure
- insert(id, item);
- item = query(id);
- Note: the item can be anything: a data object, document, file, pointer to a file...
 Proposals
- CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.
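A minimal single-process sketch of this abstraction, in Python (a stand-in showing the interface only; a real DHT spreads the keys across many machines):

    # Hypothetical sketch: the DHT exposes hash-table semantics even
    # though insert/query would normally be routed through the overlay.
    class DHT:
        def __init__(self):
            self.store = {}

        def insert(self, key, item):
            self.store[key] = item              # real DHTs route by key

        def query(self, key):
            return self.store.get(key)

    dht = DHT()
    dht.insert("id42", "pointer-to-file")
    print(dht.query("id42"))                    # pointer-to-file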
25
Internet-scale hash tables
 Hash tables
- an essential building block in software systems
 Internet-scale distributed hash tables
- equally valuable to large-scale distributed systems?
- peer-to-peer systems
- Napster, Gnutella, Groove, FreeNet, MojoNation...
- large-scale storage management systems
- Publius, OceanStore, PAST, Farsite, CFS...
- mirroring on the Web
 Content-Addressable Network (CAN)
- scalable
- operationally simple
- good performance
26
Content Addressable Network (CAN): basic idea
[Figure: insert(K1,V1) routed into a mesh of nodes, each storing (K,V) pairs.]
Interface
- insert(key, value): key is the id, value is the item
- value = retrieve(key)
27
CAN: basic idea
[Figure: retrieve(K1) routed through the mesh to the node storing (K1,V1).]
28
CAN: basic idea
Associate to each node and each item a unique id in a d-dimensional Cartesian space
- each item's key (id) maps to a point in the space; each node owns a zone of it
 Goals
- Scale to hundreds of thousands of nodes
- Handle rapid arrival and failure of nodes
 Properties
- Routing table size is O(d)
- Guarantees that a file is found in at most d·n^(1/d) steps, where n is the total number of nodes
29
CAN: solution
 A virtual d-dimensional Cartesian coordinate space
 The entire space is partitioned amongst all the nodes
- every node "owns" a zone in the overall space
 Abstraction
- can store data at "points" in the space
- can route from one "point" to another
- a point is served by the node that owns the enclosing zone
30
CAN Example: Two Dimensional Space
The space is divided between the nodes
 Together, the nodes cover the entire space
 Each node covers either a square or a rectangle with aspect ratio 1:2 or 2:1
 Example:
- Node n1:(1, 2) is the first node to join and covers the entire space
[Figure: an 8x8 coordinate grid (0-7 on each axis), entirely owned by n1.]
31
CAN Example: Two Dimensional Space
Node n2:(4, 2) joins: the space is divided between n1 and n2
[Figure: the grid split between n1 (left half) and n2 (right half).]
32
CAN Example: Two Dimensional Space
Node n3:(3, 5) joins: n1's zone is divided between n1 and n3
[Figure: the left half split again; n3 takes the upper portion, n1 keeps the lower.]
33
CAN Example: Two Dimensional Space
Nodes n4:(5, 5) and n5:(6,6) join
[Figure: the right half subdivided further to make room for n4 and n5.]
34
Node I::insert(K, V)
(1) a = h_x(K); b = h_y(K)
(2) route (K, V) --> (a, b)
(3) the node owning (a, b) stores (K, V)
Simple example: to store a pair (K1, V1), key K1 is mapped onto a point P in the coordinate space using a uniform hash function. The corresponding (key, value) pair is then stored at the node that owns the zone within which the point P lies. Data stored in the CAN is addressed by name (i.e., key), not location (i.e., IP address).
[Figure: node I routes the pair to the point (x = a, y = b).]
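A minimal sketch of this insert path, in Python (the hash construction and zone layout are illustrative; CAN does not prescribe a particular hash function):

    # Hypothetical sketch: hash a key to a point in a 2-d space, then
    # store the pair at the node whose zone contains that point.
    import hashlib

    SPACE = 8   # coordinate space [0, 8) per axis, as in the examples

    def h_point(key: str):
        """Derive (x, y) from two bytes of a SHA-1 digest."""
        d = hashlib.sha1(key.encode()).digest()
        return (d[0] % SPACE, d[1] % SPACE)

    def insert(zones, key, value):
        """zones: list of (store, (x0, y0, x1, y1)) half-open rectangles."""
        x, y = h_point(key)
        for store, (x0, y0, x1, y1) in zones:
            if x0 <= x < x1 and y0 <= y < y1:
                store[key] = value       # the owning node keeps the pair
                return

    n1, n2 = {}, {}
    zones = [(n1, (0, 0, 4, 8)), (n2, (4, 0, 8, 8))]
    insert(zones, "K1", "V1")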
35
CAN Example: Two Dimensional Space
Each item is stored by the node that owns the point it maps to
 Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
 Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5)
[Figure: the partitioned grid with items f1..f4 placed at their coordinates inside the owning zones.]
36
Simple example: to retrieve key K1

Node J::retrieve(K)
(1) a = h_x(K); b = h_y(K)
(2) route "retrieve(K)" to (a, b)

Any node can apply the same deterministic hash function to map K1 onto the point P and then retrieve the corresponding value from P. If the point P is not owned by the requesting node, the request must be routed through the CAN infrastructure until it reaches the node in whose zone P lies.
[Figure: node J routes the retrieve request to (x = a, y = b), where (K, V) is stored.]
37
CAN: Query/Routing Example
Each node knows its neighbors in the d-dimensional space
 A query is forwarded to the neighbor closest to the query id
 Example: assume n1 queries f4
 A node only maintains state for its immediate neighboring nodes
 Can route around some failures
[Figure: the query for f4:(7, 5) forwarded greedily across neighboring zones from n1 to the node owning (7, 5).]
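A minimal sketch of this greedy forwarding, in Python (zone centers stand in for full zones, and the squared-Euclidean metric is an illustrative choice):

    # Hypothetical sketch: forward to whichever neighbor lies closest
    # to the target point until the owner of the point is reached.
    def dist(p, q):
        """Squared Euclidean distance between two points."""
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    def route(neighbors, centers, owns, start, target):
        """neighbors: node -> neighbor list; centers: node -> zone
        center; owns(node, point) tells whether node owns the point."""
        node, path = start, [start]
        while not owns(node, target):
            node = min(neighbors[node],
                       key=lambda m: dist(centers[m], target))
            path.append(node)
        return path

    neighbors = {"n1": ["n2", "n3"], "n2": ["n1", "n4"],
                 "n3": ["n1", "n4"], "n4": ["n2", "n3"]}
    centers = {"n1": (2, 2), "n2": (6, 2), "n3": (2, 6), "n4": (6, 6)}
    owns = lambda node, p: all(dist(centers[node], p) <= dist(c, p)
                               for c in centers.values())
    print(route(neighbors, centers, owns, "n1", (7, 5)))
    # ['n1', 'n2', 'n4']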
38
CAN: node insertion
 Inserting a new node affects only a single other node and its immediate neighbors:
1) discover some node "I" already in the CAN
2) pick a random point (p, q) in the space
3) I routes to (p, q) and discovers node J
4) split J's zone in half; the new node owns one half
[Figure: the new node contacts I, which routes to (p, q) inside J's zone; J's zone is then split with the new node.]
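A minimal sketch of the zone split in step 4, in Python (splitting along the longer side is an assumed policy; CAN orders the split dimensions deterministically):

    # Hypothetical sketch: halve the zone containing the random point;
    # the joining node takes the half that contains the point.
    def contains(zone, p):
        x0, y0, x1, y1 = zone
        return x0 <= p[0] < x1 and y0 <= p[1] < y1

    def split_zone(zone, point):
        """Return (joiner_half, remaining_half) of a halved zone."""
        x0, y0, x1, y1 = zone
        if x1 - x0 >= y1 - y0:                  # split the longer side
            mid = (x0 + x1) // 2
            halves = [(x0, y0, mid, y1), (mid, y0, x1, y1)]
        else:
            mid = (y0 + y1) // 2
            halves = [(x0, y0, x1, mid), (x0, mid, x1, y1)]
        halves.sort(key=lambda z: not contains(z, point))
        return halves[0], halves[1]

    print(split_zone((0, 0, 8, 8), (5, 2)))
    # ((4, 0, 8, 8), (0, 0, 4, 8)): the joiner gets the right half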
39
CAN: Node Failure Recovery
Simple failures
- Know your neighbor’s neighbors
- When a node fails, one of its neighbors takes over its zone
More complex failure modes
- Simultaneous failure of multiple adjacent nodes
- Scoped flooding to discover neighbors
- Hopefully, a rare event
 Only the failed node's immediate neighbors are required for recovery
40
Evaluation
Scalability
Low-latency
Load balancing
Robustness
41
CAN: scalability
 For a uniformly partitioned space with n nodes and d dimensions:
- per node, the number of neighbors is 2d
- the average routing path is (d/4)·n^(1/d) hops
- simulations show that the above results hold in practice
 Can scale the network without increasing per-node state
 Chord/Plaxton/Tapestry/Buzz
- log(n) neighbors with log(n) hops
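A quick worked instance of these formulas: with d = 2 and n = 65,536 nodes, each node keeps 2d = 4 neighbors and the average path is (2/4)·65536^(1/2) = 128 hops; raising d to 10 shrinks the path to (10/4)·65536^(1/10) ≈ 7.6 hops at the cost of 2d = 20 neighbors per node.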
42
CAN: low-latency
 Problem
- latency stretch = (CAN routing delay) / (IP routing delay)
- application-level routing may lead to high stretch
 Solution
- increase the number of dimensions and realities (reduces the path length)
- heuristics (reduce the per-CAN-hop latency):
• RTT-weighted routing
• multiple nodes per zone (peer nodes)
• deterministically replicate entries
43
CAN: low-latency
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K) for #dimensions = 2, comparing routing with and without the heuristics.]
44
CAN: low-latency
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K) for #dimensions = 10, comparing routing with and without the heuristics.]
45
CAN: load balancing
Two pieces
- Dealing with hot spots
• popular (key, value) pairs
• nodes cache recently requested entries
• an overloaded node replicates popular entries at its neighbors
- Uniform coordinate-space partitioning
• uniformly spreads (key, value) entries
• uniformly spreads out the routing load
46
CAN: Robustness
Completely distributed
- no single point of failure (though this does not apply to the pieces of the database held by a node that fails)
 Database recovery is not explored here (relevant when multiple copies of the database exist)
 Resilience of routing
- can route around trouble
47
Strengths
More resilient than flooding/broadcast networks
 Efficient at locating information
 Fault-tolerant routing
 High node and data availability (with improvements)
 Manageable routing-table size and network traffic
48
Weaknesses
Impossible to perform a fuzzy search
 Susceptible to malicious activity
 Must maintain coherence of all the indexed data (network overhead, efficient distribution)
 Still relatively high routing latency
 Poor performance without the improvements
49
Suggestions
 Catalogs and meta-indexes to support the search function
 Extensions to handle mutable content efficiently, e.g. for web hosting
 Security mechanisms to defend against attacks
50
Ongoing Work
 Topologically-sensitive CAN construction
- distributed binning
 Goal
- bin nodes such that co-located nodes land in the same bin
 Idea
- a well-known set of landmark machines
- each CAN node measures its RTT to each landmark
- and orders the landmarks by increasing RTT
 CAN construction
- place nodes from the same bin close together in the CAN
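A minimal sketch of the binning rule, in Python (the landmark names and RTT values are illustrative):

    # Hypothetical sketch of distributed binning: a node's bin is the
    # ordering of the well-known landmarks by its measured RTTs.
    def bin_of(rtts):
        """rtts: dict landmark -> RTT (ms); returns the ordering."""
        return tuple(sorted(rtts, key=rtts.get))

    node_a = {"L1": 20, "L2": 90, "L3": 50}
    node_b = {"L1": 25, "L2": 80, "L3": 40}
    print(bin_of(node_a))   # ('L1', 'L3', 'L2')
    print(bin_of(node_b))   # ('L1', 'L3', 'L2') -- same bin, co-located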
51
Distributed Binning
- 4 landmarks (placed 5 hops away from each other)
- naive partitioning
[Figure: latency stretch vs. number of nodes (256, 1K, 4K), with and without binning, for #dimensions = 2 and #dimensions = 4.]
52
Ongoing Work (cont’d)
CAN Security (Petros Maniatis - Stanford)
- spectrum of attacks
- appropriate counter-measures
CAN Usage
- Application-level Multicast (NGC 2001)
- Grass-Roots Content Distribution
- Distributed databases using CANs (J. Hellerstein, S. Ratnasamy, S. Shenker, I. Stoica, S. Zhuang)
53
Summary
 CAN
- an Internet-scale hash table
- a potential building block in Internet applications
 Scalability
- O(d) per-node state
- average routing path of (d/4)·n^(1/d) hops
 Low-latency routing
- simple heuristics help a lot
 Robust
- decentralized; can route around trouble