
Post on 01-Jan-2016


1

Peer-to-Peer Systems

2

Introduction

What is a peer?
One that is of equal standing with another
Peer-to-peer
A way of structuring distributed applications
Each node acts as both a client and a server

3

Client-server vs. peer-to-peer network
Example: how to find an object in the network
• Client-server approach
– Use a big server to store objects and provide a directory for lookup
• Peer-to-peer approach
– Data are fully distributed
– Each peer acts as both a client and a server
– By asking?
Client-server
• Client is dumb
• Server does most things, but…
Peer-to-peer
• The peers have equal functionality
– Client, server, router

4

Characteristics of P-to-P
Multiple peers participate in the network
The number of peers is large
Each peer holds some resources to share
Distributed, decentralized
Self-control
Ad hoc participation
Dynamic
Resource sharing, cost sharing

5

Applications

File sharing: Napster, Gnutella
Instant messaging: ICQ
Gaming
Information hiding
Etc…

6

General assumption in P-to-P

7

Topology
Distributing objects, centralizing the directory
Napster
• Most famous; it motivated the whole of P2P research
Distributing objects without centralizing the directory
Gnutella
• No centralized directory servers
• Pings the net to locate friends
• File requests are broadcast to friends
• When a provider is located, the file is transferred via HTTP
Freenet

8

Distributing objects and directories
Chord
CAN
Hypercube
• PRR
• Pastry
• Tapestry
• Etc…
YAPPERS
Distributing objects and multiple servers
Super-peer networks

9

Desirable properties
Deterministic location
If an object exists anywhere in the network, it should be found
Routing locality
Routes should have low stretch
Load balance
The load of storing objects (or object locations) and routing information should be evenly distributed over the network nodes
Dynamic membership
The network should adapt to joining and leaving nodes while maintaining the above properties

10

11

12

Gnutella: summary
Fully distributed
Simple, efficient, flexible queries
High network traffic
The cost of a search is unbounded
The lifetime of a message is unknown
Only its hop count is known, not its duration
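The traffic cost of broadcasting requests to friends can be illustrated with a small simulation. This is only a sketch under stated assumptions: the overlay is an in-memory adjacency dictionary, each query carries a hop budget `ttl`, and the names `flood_query`, `graph` and `files` are invented for this example rather than taken from the Gnutella protocol.

```python
from collections import deque

def flood_query(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (providers, messages_sent)."""
    seen = {start}
    queue = deque([(start, ttl)])
    providers, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            providers.append(node)
        if remaining == 0:
            continue                      # hop budget used up, stop forwarding
        for peer in graph[node]:
            messages += 1                 # every forwarded copy costs bandwidth
            if peer not in seen:          # duplicate copies are dropped on arrival
                seen.add(peer)
                queue.append((peer, remaining - 1))
    return providers, messages

# Hypothetical five-node overlay; "c" and "e" hold the wanted file.
graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b", "d"],
         "d": ["b", "c", "e"], "e": ["d"]}
files = {"c": {"song.mp3"}, "e": {"song.mp3"}}

print(flood_query(graph, files, "a", "song.mp3", ttl=2))  # (['c'], 8)
print(flood_query(graph, files, "a", "song.mp3", ttl=3))  # (['c', 'e'], 11)
```

Raising the hop budget by one finds the second provider, but the message count grows with the number of edges reached rather than with the number of results.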

13

Freenet

Selective routing
Queries for files follow a route biased by hints
Replication of data clustering
Key clustering
Improve data availability

14

Chord

A distributed lookup protocol
Routing table is distributed
Given a key, it maps the key onto a node

15

Base Chord protocol
Consistent hashing
• The consistent hash function assigns each node and key an m-bit identifier using a base hash function
– A node's identifier is chosen by hashing the node's IP address
– A key identifier is produced by hashing the key
– m must be large enough to make the probability of two nodes hashing to the same identifier negligible
– Identifiers are ordered on an identifier circle modulo 2^m
– Key k is assigned to the first node whose identifier is equal to or follows k in the identifier space
– That node is called successor(k)
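The key-to-node mapping can be sketched in a few lines. The code below is an illustration, not the Chord implementation: SHA-1 truncated to `M` bits stands in for the base hash function, `M = 16` is an arbitrary choice, and the helper names `chord_id` and `successor` are invented for this example.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an illustrative choice, not a value from the slides

def chord_id(text, m=M):
    """Hash a node address or a key to an m-bit identifier."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def successor(node_ids, k):
    """First node identifier equal to or following k on the circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, k)
    return ring[i % len(ring)]  # wrap around past the largest identifier

# Tiny circle with m = 3 and nodes 0, 1 and 3.
nodes = [0, 1, 3]
print(successor(nodes, 1))  # 1
print(successor(nodes, 2))  # 3
print(successor(nodes, 6))  # 0 (wraps around the circle)
```

In a real deployment the node list would come from `chord_id(ip_address)` for each participant; small literal identifiers are used here so the wrap-around case is easy to follow.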

16

17

What should be done when a node n joins or leaves the system?
Scalable key location
Routing information
• Each node need only be aware of its successor node on the circle
– Inefficient
• Each node, n, maintains a routing table with at most m entries
– Finger table
– A node's finger table generally does not contain enough information to determine the successor of an arbitrary key k.
– The finger pointers are at repeatedly doubling distances around the circle, so each forwarding step halves the distance to the target identifier
– Lookups take O(log N) hops
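A finger-table lookup can be simulated as follows. This is a sketch under assumptions: `build_fingers` uses global knowledge of the ring to fill in each table (a real node learns its fingers through the protocol), finger i points to the successor of n + 2^i for i = 0..m-1, and all function names are invented for this example.

```python
from bisect import bisect_left

def in_interval(x, a, b, size):
    """True if x lies in the circular open interval (a, b)."""
    x, b = (x - a) % size, (b - a) % size
    return 0 < x < b if b else x != 0  # b == a means the whole circle except a

def build_fingers(ring, m):
    """Finger i of node n is successor((n + 2**i) mod 2**m), for i = 0..m-1."""
    def succ(k):
        return ring[bisect_left(ring, k) % len(ring)]
    return {n: [succ((n + 2 ** i) % 2 ** m) for i in range(m)] for n in ring}

def find_successor(fingers, start, key, m):
    """Route from `start` toward `key`; return (responsible node, hops)."""
    size, node, hops = 2 ** m, start, 0
    while True:
        nxt = fingers[node][0]  # finger 0 is the immediate successor
        if key == nxt or in_interval(key, node, nxt, size):
            return nxt, hops    # key falls in (node, nxt]
        for f in reversed(fingers[node]):  # closest finger preceding the key
            if in_interval(f, node, key, size):
                node = f
                break
        else:
            node = nxt
        hops += 1

ring = [0, 1, 3]
fingers = build_fingers(ring, 3)
print(fingers)                            # {0: [1, 3, 0], 1: [3, 3, 0], 3: [0, 0, 0]}
print(find_successor(fingers, 3, 1, 3))   # (1, 1)
print(find_successor(fingers, 0, 6, 3))   # (0, 1)
```

With correct finger tables each hop at least halves the remaining distance to the key's predecessor, so a lookup never needs more than m hops in this simulation.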

18

19

Node joins
Each node maintains a predecessor pointer
When node n joins
• Initialize the predecessor and fingers of node n.
• Update the fingers and predecessors of existing nodes to reflect the addition of n
– Node n will become the ith finger of node p iff
– p precedes n by at least 2^(i-1) and
– the ith finger of node p succeeds n.
• Transfer and publish keys
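The finger-update rule for a join can be checked on a small circle. The sketch below applies the two conditions directly to every existing node, which assumes global knowledge that a real node would not have; the function names and the list-based table layout (entry i-1 holds the ith finger) are choices made for this example.

```python
def becomes_ith_finger(p, n, i, current_finger, m):
    """Should joining node n replace the i-th finger (1-based) of node p?"""
    size = 2 ** m
    dist_to_n = (n - p) % size
    # A finger equal to p itself means the pointer wrapped all the way round.
    dist_to_finger = (current_finger - p) % size or size
    return dist_to_n >= 2 ** (i - 1) and dist_to_finger > dist_to_n

def update_fingers(fingers, n, m):
    """Apply the rule to every existing node; fingers maps node -> list of m ids."""
    for p, table in fingers.items():
        for i in range(1, m + 1):
            if becomes_ith_finger(p, n, i, table[i - 1], m):
                table[i - 1] = n

# Circle with m = 3 and nodes 0, 1, 3; node 6 joins.
fingers = {0: [1, 3, 0], 1: [3, 3, 0], 3: [0, 0, 0]}
update_fingers(fingers, 6, 3)
print(fingers)  # {0: [1, 3, 6], 1: [3, 3, 6], 3: [6, 6, 0]}
```

Only the entries whose old target lay beyond the newcomer change; the third finger of node 3 still points to node 0 because node 3 precedes node 6 by only 3, which is less than 2^2.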

20

21

Failures
• When a node n fails, nodes whose finger tables include n must find n's successor.
• Each node maintains a "successor-list" of its r nearest successors
• Replication
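The successor-list idea can be sketched as follows: a node keeps its r nearest successors and, when the first one stops responding, falls back to the next live entry. The helper names and the `is_alive` callback are assumptions made for this illustration; a real node would detect failure through timeouts.

```python
def successor_list(ring, n, r):
    """The r nearest successors of node n on the identifier circle."""
    ring = sorted(ring)
    i = ring.index(n)
    count = min(r, len(ring) - 1)  # a node never lists itself
    return [ring[(i + j) % len(ring)] for j in range(1, count + 1)]

def first_live_successor(candidates, is_alive):
    """Skip failed entries and return the first successor that responds."""
    for node in candidates:
        if is_alive(node):
            return node
    return None  # all r successors failed at once

ring = [0, 1, 3, 6]
backups = successor_list(ring, 3, r=2)
print(backups)                                            # [6, 0]
print(first_live_successor(backups, lambda x: x != 6))    # 0
```

The lookup only breaks if all r listed successors fail together, which is why a larger r trades memory and maintenance traffic for robustness.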

22

CAN (Content-Addressable Network)

The entire CAN space is divided amongst the nodes currently in the system.

The new node must find a node already in the CAN
Using the CAN routing mechanisms, it must find a node whose zone will be split
The neighbors of the split zone must be notified so that routing can include the new node
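The join steps can be sketched for a two-dimensional unit square. Several simplifications are assumed here and are not from the slides: a linear scan over all zones stands in for CAN routing, a zone is halved along its longer side, the coordinate space does not wrap around, and the names `Zone`, `join` and `are_neighbors` are invented for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Zone:
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, x, y):
        return self.x0 <= x < self.x1 and self.y0 <= y < self.y1

    def split(self):
        """Halve the zone along its longer side (an assumed, simple rule)."""
        if self.x1 - self.x0 >= self.y1 - self.y0:
            mid = (self.x0 + self.x1) / 2
            return (Zone(self.x0, self.y0, mid, self.y1),
                    Zone(mid, self.y0, self.x1, self.y1))
        mid = (self.y0 + self.y1) / 2
        return (Zone(self.x0, self.y0, self.x1, mid),
                Zone(self.x0, mid, self.x1, self.y1))

def join(zones, new_node, point):
    """zones maps node name -> Zone; the owner of `point` gives up half its zone."""
    owner = next(n for n, z in zones.items() if z.contains(*point))
    kept, given = zones[owner].split()
    zones[owner], zones[new_node] = kept, given
    return owner

def are_neighbors(a, b):
    """Zones are neighbors if they share a border segment."""
    touch_x = a.x1 == b.x0 or b.x1 == a.x0
    touch_y = a.y1 == b.y0 or b.y1 == a.y0
    overlap_x = a.x0 < b.x1 and b.x0 < a.x1
    overlap_y = a.y0 < b.y1 and b.y0 < a.y1
    return (touch_x and overlap_y) or (touch_y and overlap_x)

zones = {"A": Zone(0.0, 0.0, 1.0, 1.0)}
join(zones, "B", (0.7, 0.2))   # A keeps the left half, B takes the right half
join(zones, "C", (0.7, 0.2))   # B's zone is split again, C takes the upper part
print(zones["C"])                               # Zone(x0=0.5, y0=0.5, x1=1.0, y1=1.0)
print(are_neighbors(zones["A"], zones["C"]))    # True, so A must learn about C
```

After each split, the nodes for which `are_neighbors` holds against the two new zones are the ones that would need to be notified.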

23

24

Node departure, recovery and CAN maintenance

One of the failed node's neighbors takes over the zone
• (key, value) pairs held by the departing node are lost until the state is refreshed by the holders of the data
Takeover algorithm
• Each neighbor of the failed node starts a takeover timer running independently.
• When its timer expires, the peer sends a TAKEOVER message conveying its own zone volume to all of the failed node's neighbors
• Compare the volumes
– The node that is still alive and has the smallest zone volume is chosen.
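The outcome of the takeover comparison can be sketched without simulating timers or messages. The assumption here is that each neighbor's timer runs in proportion to its own zone volume, so the live neighbor with the smallest zone sends its TAKEOVER message first and the others stand down. The function name and the tie-break by node name are choices made only for this example.

```python
def choose_takeover(neighbor_volumes, alive):
    """neighbor_volumes maps each neighbor of the failed node to its zone volume."""
    candidates = {n: v for n, v in neighbor_volumes.items() if n in alive}
    if not candidates:
        return None
    # Smallest volume wins; ties are broken by name to keep the sketch deterministic.
    return min(candidates, key=lambda n: (candidates[n], n))

volumes = {"A": 0.25, "B": 0.125, "C": 0.5}
print(choose_takeover(volumes, alive={"A", "B", "C"}))  # B
print(choose_takeover(volumes, alive={"A", "C"}))       # A
```

Preferring the smallest zone keeps zone sizes roughly balanced after the merge.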

25

Design improvements
Multi-dimensional coordinate spaces
• Increasing the number of dimensions reduces the routing path length
RTT (round-trip time) weighted routing
• Aims at reducing the latency of individual hops along the path, not at reducing the path length
Overloading coordinate zones
• Allow multiple nodes to share the same zone
– A node maintains a list of its peers in addition to its neighbor list
• Advantages
– Reduced per-hop latency
– Improved fault tolerance
Multiple hash functions
• Improve data availability
• Map a single key onto k points (replication)

26

Hypercube routing

Node and object IDs are drawn from the same ID space, which can be thought of as a ring

Each node's ID is represented by d digits of base b
• Example: 32-bit ID => 8 hex digits (b = 16)
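The digit representation can be made concrete with a short helper. This is a sketch; the name `id_digits` and the most-significant-digit-first ordering are choices made for this example.

```python
def id_digits(node_id, d, b):
    """Write node_id as d digits of base b, most significant digit first."""
    digits = []
    for _ in range(d):
        node_id, remainder = divmod(node_id, b)
        digits.append(remainder)
    if node_id:
        raise ValueError("ID does not fit in d digits of base b")
    return digits[::-1]

# A 32-bit ID becomes 8 hexadecimal digits.
print(id_digits(0xDEADBEEF, d=8, b=16))  # [13, 14, 10, 13, 11, 14, 14, 15]
```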

27

Neighbor table
Each node's neighbor table consists of d levels with b entries at each level
Each node also keeps track of its reverse-neighbors

28

Routing scheme
Example
Join protocol (index maintenance)
Single join
Multiple joins
• Sequential joins
• Concurrent joins
– Independent joins
– Dependent joins

29

YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology

Prasanna Ganesan, Qixiang Sun, Hector Garcia-Molina
IEEE INFOCOM 2003

30

Intro.

Build a small DHT consisting of nearby nodes and then provide an intelligent search mechanism that can traverse all the small DHTs
YAPPERS (Yet Another Peer-to-PEeR System) operates on top of an arbitrary overlay network.
Lookup service
Partial lookup
Total lookup

31

Key concept

If a node A wants to register a value for a white key <Kw, V1>, this pair can be stored at A itself, since A is also white.
For a gray key and its value <Kg, V2>, A looks for a neighboring gray node.
A query for a gray key needs to be forwarded only to gray nodes.

32

33

We call the nodes within h hops the immediate neighborhood of a node
The nodes within 2h+1 hops form the extended neighborhood
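Both neighborhoods can be computed with a bounded breadth-first search over the overlay. The sketch assumes an in-memory adjacency dictionary and counts a node as part of its own neighborhood (zero hops away); the function names are invented for this example.

```python
from collections import deque

def within_hops(graph, start, max_hops):
    """All nodes reachable from `start` in at most `max_hops` hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for peer in graph[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return set(dist)

def immediate_neighborhood(graph, node, h):
    return within_hops(graph, node, h)

def extended_neighborhood(graph, node, h):
    return within_hops(graph, node, 2 * h + 1)

# A six-node path 0 - 1 - 2 - 3 - 4 - 5 with h = 1.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(immediate_neighborhood(path, 0, h=1))  # {0, 1}
print(extended_neighborhood(path, 0, h=1))   # {0, 1, 2, 3}
```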

34

Basic algorithm

Consistency: if a node X is in two different neighborhoods IN(A) and IN(B), both A and B assign the same color to node X
Stability: X is assigned the same color regardless of how IN(A) changes dynamically when nodes enter or leave
Stability reduces data relocation
The key assignment:

35

The immediate neighborhood

Multiple nodes in IN(X) have the same color?

Allow X to pick any one of these nodes to store the key

No nodes in IN(X) have color C?
Use a backup assignment scheme

36

Backup assignment

When no node in IN(X) has color C_i, color C_i is assigned to a node with color C_((i+1) mod b). If there are multiple nodes with color C_((i+1) mod b), choose the node with the smallest IP address.
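The backup rule can be sketched as a lookup over the colors present in IN(X). Two points are assumptions of this example and not stated on the slide: node colors are passed in as a ready-made mapping, and if color (i+1) mod b is also missing the search simply continues to the next color. The function name is invented.

```python
import ipaddress

def nodes_for_color(neighborhood_colors, i, b):
    """neighborhood_colors maps IP string -> color for the nodes in IN(X).

    Returns the candidate nodes for color i: every node of that color if any
    exist, otherwise the single backup node chosen by smallest IP address.
    """
    for step in range(b):
        color = (i + step) % b
        matches = [ip for ip, c in neighborhood_colors.items() if c == color]
        if not matches:
            continue
        if step == 0:
            return matches  # X may pick any one of these
        # Compare addresses numerically, so 10.0.0.9 sorts before 10.0.0.10.
        return [min(matches, key=ipaddress.ip_address)]
    return []

colors = {"10.0.0.10": 1, "10.0.0.9": 1, "10.0.0.3": 3}
print(nodes_for_color(colors, 1, b=4))  # ['10.0.0.10', '10.0.0.9']
print(nodes_for_color(colors, 0, b=4))  # ['10.0.0.9'] (backup from color 1)
print(nodes_for_color(colors, 2, b=4))  # ['10.0.0.3'] (backup from color 3)
```

Picking the smallest IP address makes the backup choice deterministic, so every node that sees the same neighborhood agrees on where a key of the missing color is stored.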

37

In resolving the pitfalls mentioned above, our solution is no longer consistent and stable as envisioned earlier.
By probabilistic analysis, it can be shown that if a node A has b log b nodes in IN(A), then with high probability there exists a node of each color.
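The flavor of this analysis can be reproduced numerically. The sketch below assumes that each node's color is independent and uniform over the b colors, which is a modeling assumption of this example, and computes the exact probability that all colors appear by inclusion-exclusion. The function name is invented.

```python
from math import ceil, comb, log

def prob_all_colors(n, b):
    """P(each of b colors appears among n nodes with independent uniform colors)."""
    return sum((-1) ** j * comb(b, j) * (1 - j / b) ** n for j in range(b + 1))

b = 16
n = ceil(b * log(b))  # 45 nodes for b = 16
for multiple in (1, 2, 3):
    print(multiple * n, round(prob_all_colors(multiple * n, b), 4))
```

Under this model the probability of seeing every color rises quickly once the neighborhood holds a small constant multiple of b log b nodes, so the slide's figure is best read as an order of magnitude.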

38

Maintaining topology

Edge deletion: when deleting an edge (X,Y), both X and Y broadcast the deletion event to their surviving neighbors with a TTL of 2h.
Edge insertion: when adding an edge (X,Y), a “trim” technique will be performed by nodes connected to X and Y.

39

Enhancement

Solutions to the fringe node problem:
Pruning: if X is a fringe node, then X doesn’t participate in YAPPERS directly. It selects a nearby high-connectivity node Y as its proxy
Biased backup: forbid a node with a small immediate neighborhood from assigning backup colors to a node with a large immediate neighborhood

40

Requirements

Expressiveness
Work in P2P search has focused on answering simple queries
Types of queries
• Key lookup
• Keyword
• Range query
• Aggregates
• SQL

41

Autonomy, Efficiency and Robustness

Autonomy
• Nodes are free to join and leave
Efficiency
• Bandwidth, processing power
Robustness
• Stability in the presence of failures

42

Comprehensiveness
Quality of Service
Number of results
Response time
Relevance