Peer-to-Peer Networks and Distributed Hash Tables
2006
2
Peer-peer networking
• File sharing: files are stored at the end-user machines (peers) rather than at a central server (client/server); files are transferred directly between peers.
• Leverage: P2P is a way to leverage vast amounts of computing power, storage, and connectivity from personal computers (PCs) distributed around the world.
• Q: What are the new technical challenges?
• Q: What new services/applications are enabled?
• Q: Is it just "networking at the application level"?
• Everything old is new again?
3
Napster
 Napster -- free music over the Internet
 Key idea: share the content, storage, and bandwidth of individual (home) users
 Model: each user stores a subset of files; each user can access (download) files from all users in the system
• Application-level, client-server protocol (index server) over point-to-point TCP
• How does it work -- four steps:
• Connect to the Napster index server
• Upload your list of files (push) to the server
• Give the server keywords to search the full list with
• Select the "best" of the correct answers (pings)
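A minimal sketch of this centralized-index pattern, in Python (the class and method names are illustrative, not Napster's actual protocol; file transfers themselves would go peer-to-peer):

    # Hypothetical sketch of a Napster-style central index:
    # peers push their file lists; searches only touch the index.
    class IndexServer:
        def __init__(self):
            self.index = {}                    # filename -> set of peers

        def register(self, peer, filenames):   # step 2: push file list
            for name in filenames:
                self.index.setdefault(name, set()).add(peer)

        def search(self, keyword):             # step 3: keyword search
            return {name: peers for name, peers in self.index.items()
                    if keyword in name}

    server = IndexServer()
    server.register("m5", ["songE.mp3"])
    print(server.search("songE"))              # {'songE.mp3': {'m5'}}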
4
Napster: Example
[Figure: machines m1..m6 each store one file (A..F); the index server maps m1→A, m2→B, m3→C, m4→D, m5→E, m6→F. The query "E?" returns m5, and file E is then fetched directly from m5.]
5
Napster characteristics
 Advantages:
- Simplicity; easy to implement sophisticated search engines on top of the index system
 Disadvantages:
• centralized index server:
• single logical point of failure
• can load-balance among servers using DNS rotation
• potential for congestion
• Napster "in control" (freedom is an illusion)
• no security:
• passwords in plain text
• no authentication
• no anonymity
6
Main Challenge
 Find where a particular file is stored
 Scale: up to hundreds of thousands or millions of machines
- 7/2001 simultaneous online users: Napster ~160K, Gnutella ~40K, Morpheus ~300K
 Dynamicity: machines can come and go at any time
[Figure: the same peers holding files A..F; the challenge is locating which peer can answer the query "E?".]
7
Gnutella
 Peer-to-peer networking: peer applications
 Focus: a decentralized method of searching for files
 How to find a file: flood the request
- Send the request to all neighbors
- Neighbors recursively multicast the request
- Eventually a machine that has the file receives the request, and it sends back the answer
 Advantages:
- Totally decentralized, highly robust
 Disadvantages:
- Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL; see the sketch below)
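A minimal sketch of TTL-limited flooding, in Python (the topology is illustrative, and the recursion is sequential where real Gnutella queries neighbors in parallel):

    # Hypothetical sketch of Gnutella-style flooding with a TTL:
    # forward the query to all neighbors until the TTL expires or
    # a machine holding the file is reached.
    def flood_query(graph, files, node, wanted, ttl, seen=None):
        """graph: node -> neighbor list; files: node -> set of files."""
        seen = set() if seen is None else seen
        if node in seen or ttl == 0:
            return None
        seen.add(node)
        if wanted in files[node]:
            return node                        # found: answer travels back
        for neighbor in graph[node]:
            hit = flood_query(graph, files, neighbor, wanted, ttl - 1, seen)
            if hit is not None:
                return hit
        return None

    graph = {"m1": ["m2", "m3"], "m2": [], "m3": ["m4", "m5"],
             "m4": [], "m5": []}
    files = {m: set() for m in graph}
    files["m5"] = {"E"}
    print(flood_query(graph, files, "m1", "E", ttl=4))   # m5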
8
Gnutella: Example
 Assume m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; ...
[Figure: the query "E?" floods from m1 to m2 and m3, then from m3 to m4 and m5; m5 holds E and returns the answer.]
9
Gnutella
 What we care about:
- How much traffic does one query generate?
- How many hosts can it support at once?
- What is the latency associated with querying?
- Is there a bottleneck?
 Late 2000: only 10% of downloads succeeded
- 2001: more than 25% of downloads succeeded (is this success or failure?)
10
BitTorrent
 BitTorrent (BT) is a new-generation P2P system that makes downloads much faster
- The file to be distributed is split into pieces, and an SHA-1 hash is calculated for each piece
 Swarming: parallel downloads among a mesh of cooperating peers
- Scalable: capacity increases with the number of peers/downloaders
- Efficient: it utilises a large amount of the available network bandwidth
 Tracker
- a central server keeping a list of all peers participating in the swarm (handles peer discovery)
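A minimal sketch of the per-piece hashing step, in Python (the piece length is an assumed value; this is not the .torrent metadata format):

    # Hypothetical sketch: split content into fixed-size pieces and
    # compute one SHA-1 digest per piece, so each downloaded piece
    # can be verified independently.
    import hashlib

    def piece_hashes(data: bytes, piece_len: int = 256 * 1024):
        """Return the SHA-1 hex digest of each piece of data."""
        return [hashlib.sha1(data[i:i + piece_len]).hexdigest()
                for i in range(0, len(data), piece_len)]

    hashes = piece_hashes(b"x" * 600_000)
    print(len(hashes))   # 3: two full 256 KiB pieces plus a remainder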
11
BitTorrent... a picture
[Figure: a tracker coordinating several peers, each acting as both uploader and downloader.]
13
Freenet
 Additional goals beyond file location:
- Provide publisher anonymity and security
- Resistance to attacks: a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines
 Architecture:
- Each file is identified by a unique identifier
- Each machine stores a set of files and maintains a "routing table" to route individual requests
14
Data Structure
 Each node maintains a stack of routing entries with three fields:
- id: a file identifier
- next_hop: another node that stores the file with that id
- file: the file identified by id, if stored on the local node
 Forwarding:
- Each message carries the id of the file it refers to
- If the file id is stored locally, stop
- If not, search for the "closest" id in the stack and forward the message to the corresponding next_hop
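A minimal sketch of the "closest id" forwarding rule, in Python (numeric ids and the distance metric are illustrative assumptions):

    # Hypothetical sketch of Freenet-style forwarding: pick the entry
    # whose id is numerically closest to the requested id and forward
    # to its next_hop.
    def next_hop(routing_table, wanted_id):
        """routing_table: list of (id, next_hop, file_or_None) entries."""
        closest = min(routing_table, key=lambda e: abs(e[0] - wanted_id))
        return closest[1]

    table = [(4, "n1", "f4"), (12, "n2", "f12"), (5, "n3", None)]
    print(next_hop(table, 10))   # "n2", since 12 is closest to 10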
15
Query
 API: file = query(id)
 Upon receiving a query for document id:
- Check whether the queried file is stored locally
• If yes, return it
• If not, forward the query message
 Notes:
- Each query is associated with a TTL that is decremented each time the query message is forwarded; to obscure the distance to the originator:
• the TTL can be initialized to a random value within some bounds
• when TTL = 1, the query is forwarded with a finite probability
- Each node maintains state for all outstanding queries that have traversed it, which helps avoid cycles
- When the file is returned, it is cached along the reverse path
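A minimal sketch of the two TTL rules above, in Python (the bounds and the last-hop probability are assumed values, not Freenet's):

    # Hypothetical sketch: a random initial TTL and probabilistic
    # forwarding at TTL == 1 both obscure the originator's distance.
    import random

    def initial_ttl(lo=8, hi=16):
        return random.randint(lo, hi)           # assumed bounds

    def should_forward(ttl, p_last_hop=0.5):
        if ttl > 1:
            return True
        return random.random() < p_last_hop     # finite probability at TTL 1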
16
Query Example
Note: doesn’t show file caching on the reverse path
[Figure: query(10) issued at n1, with hops labeled 1-5 and one backtrack (4'). At each hop the node forwards to the next_hop of the id closest to 10 in its table, until the node holding entry (10, n5, f10) returns the file.]
17
Insert
 API: insert(id, file)
 Two steps:
- Search for the file to be inserted
- If not found, insert the file
 Searching: like a query, but nodes maintain state after a collision is detected and the reply is sent back to the originator
 Insertion:
- Follow the forward path; insert the file at all nodes along the path
- A node probabilistically replaces the originator with itself, obscuring the true originator (see the sketch below)
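A minimal sketch of the originator-rewriting step, in Python (the replacement probability is an assumed value):

    # Hypothetical sketch: while an insert propagates, each node may
    # replace the recorded originator with itself, so downstream nodes
    # cannot tell who truly published the file.
    import random

    def maybe_rewrite_originator(message, self_id, p=0.3):
        if random.random() < p:                 # assumed probability
            message["orig"] = self_id
        return message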
18
Insert Example
Assume the query returned failure along the "blue" path; insert f10
[Figure: insert(10, f10) issued at n1; routing tables (id, next_hop, file) as in the query example.]
19
Insert Example
[Figure: n1 adds the entry (10, n1, f10) and forwards the insert to n2 with orig = n1.]
20
Insert Example
 n2 replaces the originator (n1) with itself
[Figure: n2 stores (10, n2, f10) and forwards the insert with orig = n2.]
21
Insert Example
n2 replaces the originator (n1) with itself
[Figure: the insert continues along the forward path; downstream nodes n3, n4, and n5 each add an entry for id 10, so f10 is now stored along the entire insert path.]
23
Freenet Summary
 Advantages
- Provides publisher anonymity
- Totally decentralized architecture, robust and scalable
- Resistant against malicious file deletion
 Disadvantages
- Does not always guarantee that a file is found, even if the file is in the network
24
Solutions to the Location Problem
 Goal: make sure that an identified item (file) is always found
- Indexing scheme: used to map file names to their locations in the system
- Requires a scalable indexing mechanism
 Abstraction: a distributed hash-table data structure
- insert(id, item);
- item = query(id);
- Note: the item can be anything: a data object, document, file, pointer to a file...
 Proposals
- CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.
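A minimal single-process sketch of this abstraction, in Python (a stand-in showing the interface only; a real DHT spreads the keys across many machines):

    # Hypothetical sketch: the DHT exposes hash-table semantics even
    # though insert/query would normally be routed through the overlay.
    class DHT:
        def __init__(self):
            self.store = {}

        def insert(self, key, item):
            self.store[key] = item              # real DHTs route by key

        def query(self, key):
            return self.store.get(key)

    dht = DHT()
    dht.insert("id42", "pointer-to-file")
    print(dht.query("id42"))                    # pointer-to-file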
25
Internet-scale hash tables
 Hash tables
- an essential building block in software systems
 Internet-scale distributed hash tables
- equally valuable to large-scale distributed systems?
- peer-to-peer systems
- Napster, Gnutella, Groove, FreeNet, MojoNation...
- large-scale storage management systems
- Publius, OceanStore, PAST, Farsite, CFS...
- mirroring on the Web
 Content-Addressable Network (CAN)
- scalable
- operationally simple
- good performance
26
Content Addressable Network (CAN): basic idea
[Figure: insert(K1,V1) routed into a mesh of nodes, each storing (K,V) pairs.]
Interface
- insert(key, value): key is the id, value is the item
- value = retrieve(key)
27
CAN: basic idea
[Figure: retrieve(K1) routed through the mesh to the node storing (K1,V1).]
28
CAN: basic idea
Associate to each node and each item a unique id in a d-dimensional Cartesian space
- each item's key (id) maps to a point in the space; each node owns a zone of it
 Goals
- Scale to hundreds of thousands of nodes
- Handle rapid arrival and failure of nodes
 Properties
- Routing table size is O(d)
- Guarantees that a file is found in at most d·n^(1/d) steps, where n is the total number of nodes
29
CAN: solution
 A virtual d-dimensional Cartesian coordinate space
 The entire space is partitioned amongst all the nodes
- every node "owns" a zone in the overall space
 Abstraction
- can store data at "points" in the space
- can route from one "point" to another
- a point is served by the node that owns the enclosing zone
30
CAN Example: Two Dimensional Space
The space is divided between the nodes
 Together, the nodes cover the entire space
 Each node covers either a square or a rectangle with aspect ratio 1:2 or 2:1
 Example:
- Node n1:(1, 2) is the first node to join and covers the entire space
[Figure: an 8x8 coordinate grid (0-7 on each axis), entirely owned by n1.]
31
CAN Example: Two Dimensional Space
Node n2:(4, 2) joins: the space is divided between n1 and n2
[Figure: the grid split between n1 (left half) and n2 (right half).]
32
CAN Example: Two Dimensional Space
Node n3:(3, 5) joins: n1's zone is divided between n1 and n3
[Figure: the left half split again; n3 takes the upper portion, n1 keeps the lower.]
33
CAN Example: Two Dimensional Space
Nodes n4:(5, 5) and n5:(6,6) join
[Figure: the right half subdivided further to make room for n4 and n5.]
34
Node I::insert(K, V)
(1) a = h_x(K); b = h_y(K)
(2) route (K, V) --> (a, b)
(3) the node owning (a, b) stores (K, V)
Simple example: to store a pair (K1, V1), key K1 is mapped onto a point P in the coordinate space using a uniform hash function. The corresponding (key, value) pair is then stored at the node that owns the zone within which the point P lies. Data stored in the CAN is addressed by name (i.e., key), not location (i.e., IP address).
[Figure: node I routes the pair to the point (x = a, y = b).]
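A minimal sketch of this insert path, in Python (the hash construction and zone layout are illustrative; CAN does not prescribe a particular hash function):

    # Hypothetical sketch: hash a key to a point in a 2-d space, then
    # store the pair at the node whose zone contains that point.
    import hashlib

    SPACE = 8   # coordinate space [0, 8) per axis, as in the examples

    def h_point(key: str):
        """Derive (x, y) from two bytes of a SHA-1 digest."""
        d = hashlib.sha1(key.encode()).digest()
        return (d[0] % SPACE, d[1] % SPACE)

    def insert(zones, key, value):
        """zones: list of (store, (x0, y0, x1, y1)) half-open rectangles."""
        x, y = h_point(key)
        for store, (x0, y0, x1, y1) in zones:
            if x0 <= x < x1 and y0 <= y < y1:
                store[key] = value       # the owning node keeps the pair
                return

    n1, n2 = {}, {}
    zones = [(n1, (0, 0, 4, 8)), (n2, (4, 0, 8, 8))]
    insert(zones, "K1", "V1")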
35
CAN Example: Two Dimensional Space
Each item is stored by the node that owns the point it maps to
 Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
 Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5)
[Figure: the partitioned grid with items f1..f4 placed at their coordinates inside the owning zones.]
36
Simple example: to retrieve key K1

Node J::retrieve(K)
(1) a = h_x(K); b = h_y(K)
(2) route "retrieve(K)" to (a, b)

Any node can apply the same deterministic hash function to map K1 onto the point P and then retrieve the corresponding value from P. If the point P is not owned by the requesting node, the request must be routed through the CAN infrastructure until it reaches the node in whose zone P lies.
[Figure: node J routes the retrieve request to (x = a, y = b), where (K, V) is stored.]
37
CAN: Query/Routing Example
Each node knows its neighbors in the d-dimensional space
 A query is forwarded to the neighbor closest to the query id
 Example: assume n1 queries f4
 A node only maintains state for its immediate neighboring nodes
 Can route around some failures
[Figure: the query for f4:(7, 5) forwarded greedily across neighboring zones from n1 to the node owning (7, 5).]
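A minimal sketch of this greedy forwarding, in Python (zone centers stand in for full zones, and the squared-Euclidean metric is an illustrative choice):

    # Hypothetical sketch: forward to whichever neighbor lies closest
    # to the target point until the owner of the point is reached.
    def dist(p, q):
        """Squared Euclidean distance between two points."""
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

    def route(neighbors, centers, owns, start, target):
        """neighbors: node -> neighbor list; centers: node -> zone
        center; owns(node, point) tells whether node owns the point."""
        node, path = start, [start]
        while not owns(node, target):
            node = min(neighbors[node],
                       key=lambda m: dist(centers[m], target))
            path.append(node)
        return path

    neighbors = {"n1": ["n2", "n3"], "n2": ["n1", "n4"],
                 "n3": ["n1", "n4"], "n4": ["n2", "n3"]}
    centers = {"n1": (2, 2), "n2": (6, 2), "n3": (2, 6), "n4": (6, 6)}
    owns = lambda node, p: all(dist(centers[node], p) <= dist(c, p)
                               for c in centers.values())
    print(route(neighbors, centers, owns, "n1", (7, 5)))
    # ['n1', 'n2', 'n4']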
38
CAN: node insertion
 Inserting a new node affects only a single other node and its immediate neighbors:
1) discover some node "I" already in the CAN
2) pick a random point (p, q) in the space
3) I routes to (p, q) and discovers node J
4) split J's zone in half; the new node owns one half
[Figure: the new node contacts I, which routes to (p, q) inside J's zone; J's zone is then split with the new node.]
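A minimal sketch of the zone split in step 4, in Python (splitting along the longer side is an assumed policy; CAN orders the split dimensions deterministically):

    # Hypothetical sketch: halve the zone containing the random point;
    # the joining node takes the half that contains the point.
    def contains(zone, p):
        x0, y0, x1, y1 = zone
        return x0 <= p[0] < x1 and y0 <= p[1] < y1

    def split_zone(zone, point):
        """Return (joiner_half, remaining_half) of a halved zone."""
        x0, y0, x1, y1 = zone
        if x1 - x0 >= y1 - y0:                  # split the longer side
            mid = (x0 + x1) // 2
            halves = [(x0, y0, mid, y1), (mid, y0, x1, y1)]
        else:
            mid = (y0 + y1) // 2
            halves = [(x0, y0, x1, mid), (x0, mid, x1, y1)]
        halves.sort(key=lambda z: not contains(z, point))
        return halves[0], halves[1]

    print(split_zone((0, 0, 8, 8), (5, 2)))
    # ((4, 0, 8, 8), (0, 0, 4, 8)): the joiner gets the right half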
39
CAN: Node Failure Recovery
Simple failures
- Know your neighbor’s neighbors
- When a node fails, one of its neighbors takes over its zone
More complex failure modes
- Simultaneous failure of multiple adjacent nodes
- Scoped flooding to discover neighbors
- Hopefully, a rare event
 Only the failed node's immediate neighbors are required for recovery
40
Evaluation
Scalability
Low-latency
Load balancing
Robustness
41
CAN: scalability
 For a uniformly partitioned space with n nodes and d dimensions:
- per node, the number of neighbors is 2d
- the average routing path is (d/4)·n^(1/d) hops
- simulations show that the above results hold in practice
 Can scale the network without increasing per-node state
 Chord/Plaxton/Tapestry/Buzz
- log(n) neighbors with log(n) hops
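A quick worked instance of these formulas: with d = 2 and n = 65,536 nodes, each node keeps 2d = 4 neighbors and the average path is (2/4)·65536^(1/2) = 128 hops; raising d to 10 shrinks the path to (10/4)·65536^(1/10) ≈ 7.6 hops at the cost of 2d = 20 neighbors per node.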
42
CAN: low-latency
 Problem
- latency stretch = (CAN routing delay) / (IP routing delay)
- application-level routing may lead to high stretch
 Solution
- increase the number of dimensions and realities (reduces the path length)
- heuristics (reduce the per-CAN-hop latency):
• RTT-weighted routing
• multiple nodes per zone (peer nodes)
• deterministically replicate entries
43
CAN: low-latency
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K) for #dimensions = 2, comparing routing with and without the heuristics.]
44
CAN: low-latency
[Figure: latency stretch vs. number of nodes (16K, 32K, 65K, 131K) for #dimensions = 10, comparing routing with and without the heuristics.]
45
CAN: load balancing
Two pieces
- Dealing with hot spots
• popular (key, value) pairs
• nodes cache recently requested entries
• an overloaded node replicates popular entries at its neighbors
- Uniform coordinate-space partitioning
• uniformly spreads (key, value) entries
• uniformly spreads out the routing load
46
CAN: Robustness
Completely distributed
- no single point of failure (though this does not apply to the pieces of the database held by a node that fails)
 Database recovery is not explored here (relevant when multiple copies of the database exist)
 Resilience of routing
- can route around trouble
47
Strengths
More resilient than flooding/broadcast networks
 Efficient at locating information
 Fault-tolerant routing
 High node and data availability (with improvements)
 Manageable routing-table size and network traffic
48
Weaknesses
Impossible to perform a fuzzy search
 Susceptible to malicious activity
 Must maintain coherence of all the indexed data (network overhead, efficient distribution)
 Still relatively high routing latency
 Poor performance without the improvements
49
Suggestions
 Catalogs and meta-indexes to support the search function
 Extensions to handle mutable content efficiently, e.g. for web hosting
 Security mechanisms to defend against attacks
50
Ongoing Work
 Topologically-sensitive CAN construction
- distributed binning
 Goal
- bin nodes such that co-located nodes land in the same bin
 Idea
- a well-known set of landmark machines
- each CAN node measures its RTT to each landmark
- and orders the landmarks by increasing RTT
 CAN construction
- place nodes from the same bin close together in the CAN
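A minimal sketch of the binning rule, in Python (the landmark names and RTT values are illustrative):

    # Hypothetical sketch of distributed binning: a node's bin is the
    # ordering of the well-known landmarks by its measured RTTs.
    def bin_of(rtts):
        """rtts: dict landmark -> RTT (ms); returns the ordering."""
        return tuple(sorted(rtts, key=rtts.get))

    node_a = {"L1": 20, "L2": 90, "L3": 50}
    node_b = {"L1": 25, "L2": 80, "L3": 40}
    print(bin_of(node_a))   # ('L1', 'L3', 'L2')
    print(bin_of(node_b))   # ('L1', 'L3', 'L2') -- same bin, co-located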
51
Distributed Binning
- 4 landmarks (placed 5 hops away from each other)
- naive partitioning
[Figure: latency stretch vs. number of nodes (256, 1K, 4K), with and without binning, for #dimensions = 2 and #dimensions = 4.]
52
Ongoing Work (cont’d)
CAN Security (Petros Maniatis - Stanford)
- spectrum of attacks
- appropriate counter-measures
CAN Usage
- Application-level Multicast (NGC 2001)
- Grass-Roots Content Distribution
- Distributed databases using CANs (J. Hellerstein, S. Ratnasamy, S. Shenker, I. Stoica, S. Zhuang)
53
Summary
 CAN
- an Internet-scale hash table
- a potential building block in Internet applications
 Scalability
- O(d) per-node state
- average routing path of (d/4)·n^(1/d) hops
 Low-latency routing
- simple heuristics help a lot
 Robust
- decentralized; can route around trouble