PEER TO PEER AND DISTRIBUTED HASH TABLES (CS 271)
Transcript
Slide 1
PEER TO PEER AND DISTRIBUTED HASH TABLES (CS 271)

Slide 2: Distributed Hash Tables
Challenge: to design and implement a robust and scalable distributed system composed of inexpensive, individually unreliable computers in unrelated administrative domains. (Partial thanks to Idit Keidar.)

Slide 3: Searching for Distributed Data
Goal: make billions of objects available to millions of concurrent users, e.g., music files. We need a distributed data structure to keep track of objects on different sites, mapping each object to its locations. Basic operations: insert(key), lookup(key).

Slide 4: Searching
[Figure: nodes N1..N6 connected over the Internet; a publisher inserts (key = title, value = MP3 data); a client issues lookup(title).]

Slide 5: Simple Solution
First there was Napster: a centralized server/database for lookup. Only the file sharing is peer-to-peer; the lookup is not. Launched in 1999, Napster peaked at 1.5 million simultaneous users and shut down in July 2001.

Slide 6: Napster: Publish
[Figure: a peer at 123.2.21.23 announces "I have X, Y, and Z!" and publishes insert(X, 123.2.21.23) to the central server.]

Slide 7: Napster: Search
[Figure: a client asks the server "Where is file A?"; the server replies search(A) -> 123.2.0.18; the client fetches the file directly from 123.2.0.18.]

Slide 8: Overlay Networks
A virtual structure imposed over the physical network (e.g., the Internet): a graph with hosts as nodes and some chosen edges.
[Figure: a hash function maps node ids and keys onto the overlay network.]

Slide 9: Unstructured Approach: Gnutella
Build a decentralized, unstructured overlay. Each node has several neighbors and holds several keys in its local database. When asked to find a key X: check the local database whether X is known; if yes, return it; if not, ask your neighbors. Use a limiting threshold for propagation (sketched after the next slide).

Slide 10: Gnutella: Search
[Figure: the query "Where is file A?" floods from neighbor to neighbor; the node holding file A sends back a reply.]
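To make the flooding rule on Slide 9 concrete, here is a minimal sketch. The in-memory neighbor graph, the `flood_search` function, and the hop-count limit (`ttl`) are illustrative assumptions, not Gnutella's actual wire protocol.

```python
# Minimal sketch of unstructured (Gnutella-style) flooding search.
# Assumption: an in-memory neighbor graph and a TTL hop limit stand in
# for real network messages.

def flood_search(nodes, start, key, ttl=4):
    """nodes: {node_id: {"neighbors": [...], "keys": {...}}}.
    Returns the id of a node holding `key`, or None."""
    frontier = [(start, ttl)]
    visited = {start}
    while frontier:
        node_id, hops_left = frontier.pop(0)
        if key in nodes[node_id]["keys"]:
            return node_id                      # "I have file A" -> reply
        if hops_left == 0:
            continue                            # limiting threshold reached
        for nbr in nodes[node_id]["neighbors"]:
            if nbr not in visited:
                visited.add(nbr)
                frontier.append((nbr, hops_left - 1))
    return None

# Example: node C holds file "A"; the query floods out from "Start".
net = {
    "Start": {"neighbors": ["B", "C"], "keys": set()},
    "B":     {"neighbors": ["Start"], "keys": set()},
    "C":     {"neighbors": ["Start"], "keys": {"A"}},
}
print(flood_search(net, "Start", "A"))  # -> "C"
```

The TTL is what keeps flooding from swamping the network, at the cost of possibly missing data that sits more hops away.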
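A minimal sketch of the successor rule from Slide 16, assuming SHA-1 as the hash function (Chord's choice); `ring_id`, the example hostnames, and the key name are ours, not part of the slides.

```python
# Minimal sketch of consistent hashing: nodes and keys share one circular
# ID space, and a key lives at its successor (the next node clockwise).

import hashlib
from bisect import bisect_left

M = 160  # the slides use an m-bit ID space with m = 160

def ring_id(name: str) -> int:
    """Hash a name (node address or key) onto the 2^m circular ID space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

def successor(sorted_node_ids, key_id):
    """Node responsible for key_id: the first node ID >= key_id, wrapping."""
    i = bisect_left(sorted_node_ids, key_id)
    return sorted_node_ids[i % len(sorted_node_ids)]  # wrap past the top

nodes = sorted(ring_id(f"node-{i}.example.com") for i in range(4))
key = ring_id("some-song.mp3")
print("key", key, "is stored at node", successor(nodes, key))
```

Because both node IDs and key IDs come from the same hash, adding or removing a node only moves the keys in one arc of the circle, which is exactly the guarantee stated on the next slide.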
Slide 11: Structured vs. Unstructured
The examples we described are unstructured: there is no systematic rule for how edges are chosen; each node simply knows some other nodes. Any node can store any data, so the searched data might reside at any node. In a structured overlay, the edges are chosen according to some rule, data is stored at a pre-defined place, and tables define the next hop for lookup.

Slide 12: Hashing
A data structure supporting the operations void insert(key, item) and item search(key). The implementation uses a hash function to map keys to array cells. Expected search time is O(1), provided there are few collisions.

Slide 13: Distributed Hash Tables (DHTs)
Nodes store the table entries. lookup(key) returns the location of the node currently responsible for that key. We will mainly discuss Chord [Stoica, Morris, Karger, Kaashoek, and Balakrishnan, SIGCOMM 2001]. Other examples: CAN (Berkeley), Tapestry (Berkeley), Pastry (Microsoft Cambridge), etc.

Slide 14: CAN [Ratnasamy et al.]
Map nodes and keys to coordinates in a multi-dimensional Cartesian space; route through the shortest Euclidean path. For d dimensions, routing takes O(d * n^(1/d)) hops.
[Figure: a source routes to a key across rectangular zones.]

Slide 15: Chord Logical Structure (MIT)
An m-bit ID space (2^m IDs), usually m = 160. Nodes are organized in a logical ring according to their IDs.
[Figure: ring with nodes N1, N8, N10, N14, N21, N30, N38, N42, N48, N51, N56.]

Slide 16: DHT: Consistent Hashing
A key is stored at its successor: the node with the next-higher ID.
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80; key 5 is stored at node 105. Thanks to CMU for the animation.]
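A minimal sketch of building a finger table under the slide's convention (entry i of node n is the first node that succeeds or equals n + 2^i, mod 2^m); the tiny 3-bit ID space and node set are taken from the join example that follows.

```python
# Minimal sketch of Chord finger-table construction for a tiny ring.

M = 3  # 3-bit ID space (IDs 0..7), matching the join slides

def successor(ring, target):
    """First node ID >= target on the ring (mod 2**M), wrapping around."""
    target %= 2 ** M
    for node in sorted(ring):
        if node >= target:
            return node
    return min(ring)  # wrap around the circle

def finger_table(ring, n):
    """Entries (i, id+2^i, succ) for node n, per the slide's convention."""
    return [(i, (n + 2 ** i) % 2 ** M, successor(ring, n + 2 ** i))
            for i in range(M)]

# Nodes n0, n1, n2, n6 as on the join slides; print n1's table:
for i, start, succ in finger_table({0, 1, 2, 6}, 1):
    print(f"i={i}  id+2^i={start}  succ={succ}")
# -> (0, 2, 2), (1, 3, 6), (2, 5, 6), matching n1's table on Slide 22
```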
Slide 17: Consistent Hashing Guarantees
For any set of N nodes and K keys: a node is responsible for at most (1 + ε)K/N keys, and when an (N + 1)st node joins or leaves, responsibility for O(K/N) keys changes hands.

Slide 18: DHT: Chord Basic Lookup
Each node knows only its successor; routing proceeds around the circle, one node at a time.
[Figure: ring with N10, N32, N60, N90, N105, N120; "Where is key 80?"; N90 has K80.]

Slide 19: DHT: Chord Finger Table
Entry i in the finger table of node n is the first node that succeeds or equals n + 2^i. In other words, the i-th finger points 1/2^(m-i) of the way around the ring.
[Figure: N80 with fingers reaching 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.]
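A minimal sketch of the Slide 24 routing rule, run over the tables from Slide 22. One caveat: the fall-through to the immediate successor when every table entry overshoots the id is our addition to make the toy example terminate; the slide states only the forwarding rule itself.

```python
# Minimal sketch of Chord routing on the tiny ring from the join slides.

M = 3
tables = {  # node -> successor table [(i, id+2^i, succ), ...], Slide 22
    0: [(0, 1, 1), (1, 2, 2), (2, 4, 0)],
    1: [(0, 2, 2), (1, 3, 6), (2, 5, 6)],
    2: [(0, 3, 6), (1, 4, 6), (2, 6, 6)],
    6: [(0, 7, 0), (1, 0, 0), (2, 2, 2)],
}
items = {0: {7}, 1: {1}}  # f7 lives at n0, f1 at n1 (Slide 23)

def query(node, item_id, path=()):
    path += (node,)
    if item_id in items.get(node, set()):
        return node, path  # the item is stored locally
    def dist(a):  # clockwise distance from the current node
        return (a - node) % 2 ** M
    # forward to the largest table entry that does not overshoot item_id
    hops = [s for _, _, s in tables[node] if 0 < dist(s) <= dist(item_id)]
    if not hops:                      # every entry overshoots: the item
        hops = [tables[node][0][2]]   # belongs to our immediate successor
    return query(max(hops, key=dist), item_id, path)

print(query(1, 7))  # query(7) issued at n1 -> (0, (1, 6, 0))
```

The printed path n1 -> n6 -> n0 shows the distance to the target shrinking at each hop, which is the behavior the next two slides quantify.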
Slide 20: DHT: Chord Join
Assume an identifier space [0..8]. Node n1 joins.
Succ. Table for n1 (i: id+2^i -> succ): 0: 2 -> 1; 1: 3 -> 1; 2: 5 -> 1.

Slide 21: DHT: Chord Join
Node n2 joins.
n1: 0: 2 -> 2; 1: 3 -> 1; 2: 5 -> 1.
n2: 0: 3 -> 1; 1: 4 -> 1; 2: 6 -> 1.

Slide 22: DHT: Chord Join
Nodes n0 and n6 join.
n1: 0: 2 -> 2; 1: 3 -> 6; 2: 5 -> 6.
n2: 0: 3 -> 6; 1: 4 -> 6; 2: 6 -> 6.
n0: 0: 1 -> 1; 1: 2 -> 2; 2: 4 -> 0.
n6: 0: 7 -> 0; 1: 0 -> 0; 2: 2 -> 2.

Slide 23: DHT: Chord Join
Nodes: n1, n2, n0, n6. Items: f7, f1. Item 7 is stored at n0 (its successor) and item 1 at n1; the successor tables are as on the previous slide.

Slide 24: DHT: Chord Routing
Upon receiving a query for item id, a node checks whether it stores the item locally. If not, it forwards the query to the largest node in its successor table that does not exceed id.
[Figure: query(7) issued at n1 is routed around the ring using the tables above.]
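The arithmetic behind the final estimate, spelled out; the only number added here is log2(10) ≈ 3.32.

```latex
% Slides 26-27 combined: each hop at least halves the remaining
% clockwise distance, so with N uniformly spread nodes the search
% space halves per step and the walk ends after about \log_2 N hops:
\log_2 10^6 \;=\; 6\,\log_2 10 \;\approx\; 6 \times 3.32 \;\approx\; 20
```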
Slide 25: Chord Data Structures
Finger table (the first finger is the successor) and predecessor. What if each node knew all other nodes? O(1) routing, but expensive updates.

Slide 26: Routing Time
Node n looks up a key stored at node p, where p is in n's i-th interval: p ∈ ((n + 2^(i-1)) mod 2^m, (n + 2^i) mod 2^m]. n contacts f = finger[i]. The interval is not empty, so f ∈ ((n + 2^(i-1)) mod 2^m, (n + 2^i) mod 2^m]. Thus f is at least 2^(i-1) away from n, while p is at most 2^(i-1) away from f: the distance is halved at each hop.
[Figure: n, f = finger[i], and p on the ring, with n + 2^(i-1) and n + 2^i marked.]

Slide 27: Routing Time
Assuming a uniform node distribution around the circle, the number of nodes in the search space is halved at each step, so the expected number of steps is log N. Note that m = 160; for 1,000,000 nodes, log N ≈ 20.
Slide 28: P2P Lessons
Decentralized architecture: avoid centralization. Flooding can work. Logical overlay structures provide strong performance guarantees. Churn is a problem. Useful in many distributed contexts.