
Survey on Load Balancing in Peer-to-Peer Distributed Hash Tables

Pascal Felber, Member, IEEE, Peter Kropf, Member, IEEE, Eryk Schiller, and Sabina Serbu

Abstract—Peer-to-peer systems represent a radical shift from the classical client-server paradigm in which a centralized server processes requests from all clients. In a peer-to-peer (P2P) system, every “peer” can play the role of a client and a server at the same time, hence sharing responsibilities among all parties. As in practice some peers or connecting links may be heavily loaded in comparison to others, load balancing algorithms are necessary to ensure a fair distribution of the load among participating peers. In this survey, we present load management solutions in P2P systems. According to the level at which they operate, we classify the different approaches into three categories: object placement, routing protocol, and underlay. The first two approaches tackle information lookup and retrieval in the overlay network, while the last one addresses traffic imbalance at the level of the underlying network.

Index Terms—Load balancing, peer-to-peer, distributed hash tables, decentralized systems.

I. INTRODUCTION

PEER-TO-PEER (P2P) systems are a class of decentralized distributed systems in which each participating node acts as both a client and a server for the other participating peers. Information storage, lookups and retrievals generate load that is shared among all peers. This is obviously an advantage from the point of view of reliability, robustness and scalability. P2P systems do, however, require load balancing algorithms able to fairly distribute the load among all participating peers, in order to avoid situations in which some peers or links would experience much heavier load than others. This survey focuses on IP-based P2P systems, wherein a peer may communicate directly with every other peer. Hence, we do not consider mobile ad-hoc networks. For larger coverage, we do not make any assumptions on the network management facilities provided by the underlying infrastructure.

A. Definitions

Before discussing load balancing mechanisms, we have to define the meaning of “load” and associated terms. In the context of P2P systems, the load may relate to objects, peers, or links. We denote by object a piece of information stored in the system, and its popularity is the frequency at which it is accessed. The object load can therefore be induced by its size and popularity. Each node (i.e., peer) has a limited capacity in terms of available storage space, processing time, or bandwidth [1]. The request load on a node is caused by the queries received for objects stored locally. It covers all aspects related to communication costs (i.e., sent and received messages) and computational power spent for request processing. As peers may forward messages to other peers during information lookup, they are also exposed to a given routing load for queries that only traverse them. The combination of both types of communication activities is referred to as traffic load [2] in the overlay network.

Manuscript received July 29, 2012; revised January 11, 2013. The authors are with the University of Neuchâtel, Switzerland (e-mail: [email protected]). Digital Object Identifier 10.1109/SURV.2013.060313.00157

Fig. 1. Taxonomy of surveyed load balancing solutions: Object Placement (namespace, virtual servers, multiple hashes, caching & replication), Routing (link reorganization, path redundancy), and Underlay (topology-based IDs, proximity neighbor selection, proximity routing).

B. Roadmap

This survey covers the recent work on load balancing in P2P systems based on DHTs. We classify the different approaches as depicted in Figure 1. Section II provides a short introduction to DHTs. In Section III, we introduce and discuss various causes of load imbalance. Section IV presents approaches that exploit object placement. Section V focuses on balancing traffic. Section VI discusses the use of information on the underlying network structure to optimize overlay communications. Finally, Sections VII and VIII end the survey with a short discussion and concluding remarks.

C. Related Surveys

Load balancing in distributed hash tables (DHTs) shares common challenges, and in some respects solutions, with other domains such as network load balancing, in which multiple interfaces are used for simultaneous data transmissions [3], or even multiprocessor scheduling problems [4] that must assign tasks to processors to obtain the lowest completion time. Yet, because of their decentralized structure, DHT overlays require dedicated load balancing algorithms that take into account their specific properties. The load balancing problem in DHT-based peer-to-peer systems and its peculiar aspects have not been tackled to date by any survey.

II. AN OVERVIEW OF DHTS

In this section, we start with a short description of DHT overlays. The overlays are generally designed to be well balanced under a uniform flow of requests, i.e., with objects having equal popularity, all nodes receive a similar amount of requests. In the following, we characterize a good overlay design, giving some insights into its structural organization, the maintenance of routing tables, and interactions with the underlying topology.

A. High-level Architecture

We consider an overlay network composed of a set of nodes associated with identifiers (addresses) that lie within a given identifier space. Links are established between selected pairs of nodes to form a connected structure. The overlay is designed in such a way that nodes collaboratively store a large number of objects, with each peer being responsible for a small fraction of them. To that end, the identifier space is partitioned into a set of non-overlapping ranges that are assigned to individual nodes. Each object also has an identifier (key) that lies in the same identifier space as the nodes. Storage of an object is under the responsibility of the node that owns the identifier range to which the object belongs.

When a node issues a lookup query to retrieve an object, the routing function redirects the query to the node responsible for this object. The request may traverse multiple nodes on its way, depending on the number of links established between peers in the overlay and on the lookup algorithm. Routing is predominantly based on simple greedy algorithms operating by means of proximity and distance functions defined on the identifier space. Typically, a node greedily forwards packets to the neighbor that is closest to the destination in the identifier space.

When nodes and keys are uniformly distributed over the identifier space, objects are expected to be shared among nodes in a fair manner. One classical approach to achieve this property, known as namespace balancing, is to derive identifiers and keys by means of hash functions (e.g., hashing the IP address of a node or an object name) and assign keys to the closest node in the identifier space. Alternatively, one can assign identifiers to nodes using random positions [5], Hilbert numbers [6], IP-address-based grouping [7], or a centralized identifier generator [8]. In Section IV-A, we discuss namespace balancing extensively.
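To make the hashing-based variant concrete, here is a minimal Python sketch of namespace balancing with a Chord-style "closest successor" rule; the 32-bit space, the peer addresses, and the helper names are illustrative, not from the paper.

```python
import hashlib

def ident(value: str, bits: int = 32) -> int:
    """Map a string (IP address, object name) to an identifier by hashing."""
    digest = hashlib.sha1(value.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

def responsible_node(key: int, node_ids: list[int], bits: int = 32) -> int:
    """Chord-style rule: the first node at or after the key on the ring."""
    space = 1 << bits
    return min(node_ids, key=lambda n: (n - key) % space)

nodes = [ident(f"10.0.0.{i}") for i in range(1, 9)]   # hypothetical peer addresses
key = ident("some_object_name")
print(f"key {key} -> node {responsible_node(key, nodes)}")
```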

B. The Routing Tables

In a typical DHT overlay, each node maintains in its “routing table” a list of outgoing links leading to its immediate neighbors. Conversely, every peer has a set of incoming links from neighboring nodes, although it may know neither the identity of the neighbors nor the number of links in case they are unidirectional. A routing algorithm executing at a forwarding node along a lookup path will select as next hop a neighbor from its routing table that is closer to the destination. Nodes with a small number of incoming links are thus expected to receive on average fewer requests than those with many incoming links, as explained in [9]. In order to fairly share the traffic load in the overlay, the routing tables should be organized in such a way that the number of incoming links per node is balanced.

To keep the routing tables up to date, DHTs typically use maintenance mechanisms that periodically verify whether neighbors are still reachable and in operation. When a dead neighbor is discovered, the obsolete entry is replaced by one of the nodes that conforms to the organization rules of the routing tables (e.g., taking into account restrictions on identifiers or distances in the underlying space).

One can improve load balancing in a DHT by considering multiple (redundant) paths towards the destination and choosing as next hop the least loaded among the possible alternatives [8]. Moreover, when the routing algorithm can use several entries leading to the same destination, the reliability of the whole system also increases, as there are fewer single points of failure.

C. Underlying Topology

It is important to also take into account the underlying network structure, because communication in the overlay is a direct source of traffic in the underlying network (underlay). Imagine a situation in which immediate neighbors in the overlay are placed in distant regions of the underlay. In this case, a multi-hop lookup query approaching the destination in the overlay may go back and forth in the underlay, dramatically increasing the underlay traffic. To properly account for this possible cause of overhead, network-friendly P2P systems restrict the identifier assignment or the routing table organization by using network-layer metrics like delay, hop count, etc. Pastry [10] prefers neighbors with the lowest latency or hop count. CAN-like systems [11] group nodes into bins based on network measurement techniques. TOPLUS [7] classifies nodes by using their IP addresses.

D. Examples of DHT overlays

In the following, we provide a few examples of well-known DHT designs.

1) Chord: In Chord [12], each node and key has an m-bit identifier in a ring-shaped space, which contains 2^m different addresses. The identifier is derived by hashing a node's IP address or a file name, respectively. The first node that follows an object key along the ring in a clockwise direction is responsible for this object. For routing purposes, each node has a routing table with m entries, in which entry i points towards the first following node along the ring at a distance of at least 2^i, where i = 0, ..., m−1. Sometimes, entry i in the routing table is referred to as finger i, while the corresponding links pointing at node n are n's incoming links. Figure 2 shows a schematic diagram of a Chord ring structure with 2^6 = 64 addresses and 15 nodes. Outgoing and incoming links of node 22 are respectively marked with solid and dashed arrows. Chord uses greedy routing, a routing strategy in which a request advances in a clockwise direction so that a forwarding node chooses the closest possible preceding neighbor to the destination as the next hop. In Chord, the network diameter (message dilation), i.e., the number of intermediate forwarding nodes along the shortest path between the most-distant source and destination, is equal to R = Θ(log N).

Fig. 2. Chord ring structure, fingers, and incoming links.

Fig. 3. Pastry ring structure, overview of the routing tables, path convergence.
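To illustrate how fingers induce incoming links and greedy clockwise forwarding, here is a small Python sketch of the Chord next-hop rule; the ring content is taken from Figure 2, while the helper names are ours, and a real implementation would query remote routing tables instead of a global node list.

```python
def finger_targets(n: int, m: int) -> list[int]:
    """Identifiers (n + 2^i) mod 2^m that finger i must cover, i = 0..m-1."""
    return [(n + (1 << i)) % (1 << m) for i in range(m)]

def successor(ident: int, ring: list[int], m: int) -> int:
    """First node at or after ident in the clockwise direction."""
    space = 1 << m
    return min(ring, key=lambda x: (x - ident) % space)

def next_hop(current: int, key: int, ring: list[int], m: int) -> int:
    """Greedy Chord step: the closest finger that still precedes the key."""
    space = 1 << m
    fingers = [successor(t, ring, m) for t in finger_targets(current, m)]
    # Keep fingers that strictly precede the key (clockwise), pick the closest.
    preceding = [f for f in fingers if 0 < (key - f) % space < (key - current) % space]
    return max(preceding, key=lambda f: (f - current) % space) if preceding else successor(key, ring, m)

ring = [5, 12, 13, 15, 16, 18, 19, 22, 24, 26, 30, 38, 52, 55, 61]  # nodes from Fig. 2
print(next_hop(22, 56, ring, m=6))  # forwards towards key 56: prints 55
```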

2) Pastry: Like Chord, Pastry [10] also uses a ring structure, where a node identifier is derived by hashing an IP address or a user public key, and the key of an object corresponds to a hash of its name. In Pastry, each identifier specifies a position along the ring. It consists of a sequence of digits in base 2^b, where a typical value for b is 4 (hexadecimal). Each node handles the numerically closest keys along the ring. Nodes maintain routing tables with information about other nodes with which they share common prefixes (the leftmost digits of an address) of different sizes, as well as the closest neighbors in the address space (see Figure 3). At each step, a forwarding node chooses a neighbor that shares a longer prefix with the destination as the next hop for a given message. If no such neighbor exists, the request is sent to a node that shares a prefix of the same length, but is numerically closer to the key. Forwarding takes Θ(log_{2^b} N) steps in a network of N nodes.
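A minimal Python sketch of Pastry's prefix-based next-hop choice may help; it assumes hexadecimal identifiers (b = 4) and a flat neighbor list standing in for the routing table and leaf set, which the real protocol keeps as separate structures.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two identifiers share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def pastry_next_hop(node: str, key: str, neighbors: list[str]) -> str:
    """Prefer a neighbor sharing a strictly longer prefix with the key;
    otherwise one with an equal prefix but numerically closer to the key."""
    p = shared_prefix_len(node, key)
    longer = [n for n in neighbors if shared_prefix_len(n, key) > p]
    if longer:
        return max(longer, key=lambda n: shared_prefix_len(n, key))
    equal = [n for n in neighbors if shared_prefix_len(n, key) == p]
    return min(equal + [node], key=lambda n: abs(int(n, 16) - int(key, 16)))

# hexadecimal identifiers (b = 4), hypothetical routing-table entries
print(pastry_next_hop("65a1", "d46a", ["d13d", "d462", "9a76"]))  # prints d462
```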

3) KAD: Designed for file sharing, KAD [13] is a peer-to-peer system that is part of the eDonkey framework. KAD is based on Kademlia [14], [15], [16]. It uses 128-bit addresses, and a routing function based on the XOR metric and prefix matching. There exist two kinds of keys, called source keys and keyword keys. To compute the address of a source key, one hashes the file content. This key contains the information about the file along with the locations of the publishing nodes that provide the content. The keyword keys used for searching are obtained by hashing tokens derived from the file name, e.g., a file called the_matrix contains two tokens, the and matrix. This sort of key informs about the object name, and provides the location of related source keys. To retrieve an object, a client runs a two-phase process (see Figure 4). In the first phase, the client searches for a keyword. It computes the hash of an introduced token, e.g., addr = hash('matrix'), and asks the nodes closest to addr for references to objects. A reference contains the object name along with its source keys. In the second phase, the user chooses an object of interest, and visits a node that handles the corresponding source key in order to get the IP address of the publishing node(s). The client may subsequently retrieve the file by directly contacting this IP address.

Fig. 4. KAD structure, publishing nodes, keyword keys, source keys.

Fig. 5. CAN hypercube structure, path convergence.
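The key derivation is easy to sketch in Python; note that the real KAD network derives its 128-bit keys with a different hash (MD4), so the MD5 call below is just a stand-in, and the split on underscores is a simplification of eDonkey's tokenizer.

```python
import hashlib

def kad_key(data: str) -> int:
    """128-bit KAD-style identifier (MD5 as a stand-in for the real hash)."""
    return int.from_bytes(hashlib.md5(data.encode()).digest(), "big")

def keyword_keys(file_name: str) -> dict[str, int]:
    """One keyword key per token of the file name, e.g. 'the_matrix' -> the, matrix."""
    return {tok: kad_key(tok) for tok in file_name.split("_")}

def xor_distance(a: int, b: int) -> int:
    """Kademlia closeness: queries go to nodes minimizing XOR distance to the key."""
    return a ^ b

print(keyword_keys("the_matrix"))
```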

4) CAN: CAN [17] is a DHT that uses a d-dimensional coordinate space divided into N zones, where each zone is handled by one node. Two nodes are neighbors if their zones abut along one dimension and overlap along all the other d−1 dimensions, so that each node maintains 2d neighbors. Each key identifier is a d-dimensional coordinate in the addressing space. Figure 5 shows an example of a CAN overlay with 21 zones, where only a few nodes have been depicted. Node 3 is considered a neighbor of node 1, because their zones overlap along the Y axis. CAN uses a greedy routing strategy: a node forwards a request to the neighbor whose zone is the closest to the requested key. The message dilation is Θ((d/4) N^{1/d}).
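Here is a sketch of CAN's greedy step under the assumption that each neighbor is summarized by its zone center on a d-dimensional unit torus (the real protocol compares zones, not centers; all names are illustrative).

```python
import math

def torus_dist(p: tuple, q: tuple, size: float = 1.0) -> float:
    """Euclidean distance on a d-dimensional unit torus (coordinates wrap)."""
    return math.sqrt(sum(min(abs(a - b), size - abs(a - b)) ** 2 for a, b in zip(p, q)))

def can_next_hop(neighbors: dict, key: tuple) -> str:
    """Greedy CAN step: forward to the neighbor whose zone center is closest to the key."""
    return min(neighbors, key=lambda n: torus_dist(neighbors[n], key))

# hypothetical 2-d zone centers of a node's neighbors
neighbors = {"n2": (0.25, 0.75), "n3": (0.75, 0.25), "n5": (0.25, 0.25)}
print(can_next_hop(neighbors, key=(0.9, 0.1)))  # prints n3
```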


TABLE I. CAUSES OF LOAD IMBALANCE.

Context             | Description of the cause                             | Type of load produced
Overlay namespace   | Unequal portion of namespace assigned to nodes       | Number of objects per node
Requests            | Keys with high popularity                            | Request load
Routing             | Links or paths used very frequently for routing      | Routing load
Underlying topology | Routing without awareness of the underlying topology | Traffic in the underlay

III. THE PROBLEM OF LOAD IMBALANCE

The possible causes of load imbalance are various and can be observed at every level of a P2P system. In this section, we discuss them and introduce our classification of the corresponding load balancing solutions.

A. Causes of Imbalance

There are several aspects to be considered.

1) Overlay Namespace: Every node and object typically has an identifier in an address space (also called namespace). An inappropriate node distribution over the identifier space can lead to load imbalance, as it may assign to individual nodes portions of the namespace of radically different size. Indeed, a non-uniform key distribution may create regions of the identifier space with high node density. Further, the load on a node may vary in time because of continuous insertions and deletions of objects in the system, skewed object patterns (distributions of object identifiers or sizes), and churn (nodes joining and leaving the system) [1]. We use the number of objects a node is responsible for as the node object load.

2) Requests: While some P2P systems are designed to handle a uniform pattern of requests, i.e., each object is requested with approximately the same frequency, real-world workloads tend to indicate that access patterns follow a power-law distribution, with a few very popular objects and many that are almost never requested. Furthermore, the popularity of objects may vary over time, which makes balancing objects across nodes very challenging. A node responsible for popular keys at a given time is consequently susceptible to becoming overloaded. This highlights the need for adaptive load balancing algorithms that can adjust their behavior during the life-span of the system. We express the request load as the number of processed requests per time unit.

3) Routing: The routing algorithm executing on a node selects one of its neighbors as the next hop for a given lookup message. Assume that a node predominantly forwards lookup requests to a single neighbor. This neighbor and its communication links become heavily loaded in comparison to others. Moreover, if such traffic aggregation repeats over several consecutive hops, these nodes may become overloaded. We express the routing load as the number of forwarded requests per time unit.

4) Underlying Topology: When the overlay is agnostic of the underlying topology, requests may go along paths with a huge stretch in comparison to the shortest paths in the underlay. This is obviously a source of traffic overhead. The different causes of imbalance are summarized and compared in Table I.

Fig. 6. Scenarios with imbalanced load: (a) overlay namespace, (b) request, (c) routing, (d) underlay topology.

B. Discussion of the Causes

We illustrate several scenarios of overlays that are not properly balanced in Figure 6. A grey circle represents a node, a lowercase letter indicates a node address, and a black square corresponds to an object key.

Scenario (a) shows an inappropriately organized overlay namespace. A horizontal line represents the identifier space shared by 4 nodes: na, nb, nc and nd. Each node handles the keys lying in between its own location and the closest node on its right. We depict two problems. First, nodes na and nb have namespaces of equal size; however, their object load is not balanced, since node nb manages more keys. Second, the namespace of node nc is much larger than the namespaces of nodes na or nb, and thus node nc gets more objects when the key distribution over the addressing space is uniform (balanced).

Scenario (b) shows an example of increased request load. In this example, each node handles two keys and thus the object load is balanced. However, node nb with a popular key receives more requests (curved arrows) than na or nc. Consequently, nb becomes far more loaded than the other nodes.

In Scenario (c), the overlay forwards a large number of requests between nodes na and nc. The traffic may go along two different paths, na → nb → nc or na → nd → nc, but the majority of requests use the second path, which may quickly become overloaded.

In Scenario (d), the overlay has no information on the underlying topology. The figure shows 4 nodes in view of the overlay and the underlay. Nodes in the underlay are connected by physical links shown with dotted lines (in a real topology, messages would actually traverse routers and nodes would act as endpoints). The curved solid lines represent the requests. A message sent by node na to node nc via nb is very inefficient, because the source and the destination lie very close to each other in the underlay, yet traversing nb requires crossing the whole physical network back and forth.

C. Classification of Load Balancing Mechanisms

There are two classes of approaches for improving load balancing in P2P systems. When a method is applicable to any existing overlay structure, we refer to it as overlay-independent, contrary to overlay-specific methods that are only applicable to a single organization of the overlay. Overlay-specific solutions may be far more efficient in comparison to overlay-independent approaches; however, their application is of course limited. The big advantage of overlay-independent methods is that they can operate as a complement to other optimization techniques, thus further improving the overall performance of the system. Examples of overlay-independent load balancing algorithms include caching, replication, or the use of super-nodes.1 In principle, surrounding a highly-loaded node by additional peers alleviates the load of its corresponding incoming and outgoing links. We are also aware of specific solutions designed to exclusively alleviate the load on particular links [18].

Approaches to DHT load balancing mainly interfere with the namespace, request rate, and routing at the overlay and underlay levels. We classify the different solutions into three categories (see Figure 1):

• Object placement deals with node and key placement in the identifier space, key-to-node mapping, the physical location of objects, and the size and popularity of objects (Section IV).

• Routing concerns lookup strategies and the organization of routing tables (Section V).

• Underlay relates to the traffic at the level of the underlying network (Section VI).

Table II summarizes the load balancing mechanisms presented in this survey. The approaches designed to solve a particular problem are marked with a “✓”. A “(✓)” indicates that performance is improved to some extent, but the problem is not completely solved. A “−” specifies that the problem is not addressed at all (not applicable).

1 Super-nodes are nodes of higher performance and capacity, which uninterruptedly operate in the system for a long time.

IV. OBJECT PLACEMENT

DHTs typically rely on hash functions to map objects and peers to specific positions in the identifier space. If these positions are not well balanced in the identifier space, or the popularity of objects is heavily biased, load imbalance may occur. In this section, we present mechanisms providing: namespace balancing, by assigning namespaces of equal size to peers; object load balancing, by means of virtual servers or multiple hashing functions storing an equal amount of information on each node; and request rate balancing (mainly for popular objects), by using caching or replication.

A. Namespace Balancing

There are several mechanisms assigning identifiers to nodes and keys to objects; however, a lot of research focuses on the uniform key distribution [19], assuming that the load of a node is proportional to the size of its range. To optimize the load balance, the algorithms try to fairly distribute parts of the addressing space among nodes. High performance of an algorithm is achieved with high probability (w.h.p.), meaning with probability at least 1 − 1/N, where N is a confidence parameter [20], [21]. N indicates the size of the system (e.g., the number of peers), so the probability is arbitrarily close to 1 for large networks. Some researchers measure the namespace balance by using smoothness, defined as the size ratio of the largest to the smallest range in the system. The smoothness σ of a well balanced peer-to-peer system should remain constant, σ = Θ(1); otherwise the system is not well balanced, having both large and small namespaces at the same time.

Consistent hashing is a method of namespace balancing presented by Karger et al. [20]. It introduces an appropriate hashing function (e.g., SHA-1) uniformly distributing identifiers and keys over the identifier space. Every node independently uses this function to choose its own identifier. Keys are then handled by nodes in such a way that the whole system obeys the organization rules of the overlay. In the case of consistent hashing over the unit circumference [0, 1), in which nodes get disjoint intervals, the mean interval length is equal to 1/N. However, each node actually gets an interval of length in the range (Ω(1/N²), O(log N/N)) w.h.p. [20], [21]; therefore, the overall smoothness of the system is estimated at σ = Θ(N log N). This method is easy to implement, because each node independently chooses its random identifier; however, it suffers from high smoothness.
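The Θ(N log N) smoothness of plain consistent hashing is easy to observe empirically; the following Python sketch (helper names are ours) samples N random identifiers on the unit circumference and reports the largest-to-smallest arc ratio.

```python
import random

def smoothness(node_positions: list[float]) -> float:
    """Ratio of the largest to the smallest arc on the unit circumference [0, 1)."""
    pts = sorted(node_positions)
    arcs = [b - a for a, b in zip(pts, pts[1:])] + [1 - pts[-1] + pts[0]]
    return max(arcs) / min(arcs)

random.seed(1)
N = 10_000
print(smoothness([random.random() for _ in range(N)]))  # grows roughly like N log N
```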

An improved namespace balancing solution has been proposed by Kenthapadi et al. [5]. It is based on initial probes of the namespace sizes assigned to a few nodes in the system. Once again, the authors considered a unit circumference as their identifier space. A node joining the system selects at random r points in the identifier space. It then finds the two “sandwiching” nodes surrounding each point, and performs several local inspections v of the namespace ranges (lengths of arcs) possessed by a few neighbors of each discovered node (see Figure 7). Finally, the node selects its position in the middle of the longest discovered interval. The authors have proven that σ = Θ(1) if rv = Ω(log N); however, the initial message cost of arc discovery is Θ(R · r + v), where R is the diameter of the network (message dilation). Typically, R = Θ(log N) for Pastry [10], Chord [12], Tapestry [22], Viceroy [23], and Kademlia [14]; R = Θ(log N / log log N) for Koorde [24]; and R = O(N^{1/d}) for CAN, where d is the “dimension” of the identifier space. The idea should be easily implementable in practice; however, it is only applicable to join operations, so it does not protect against load imbalance when nodes leave the system.

TABLE II. LOAD BALANCING SOLUTIONS.

Category         | Solutions               | Object load | Request load | Overlay traffic | Underlay traffic
Object Placement | Namespace Balancing     | ✓           | ✓            | ✓               | −
                 | Virtual Servers         | ✓           | ✓            | −               | ✓
                 | Multiple Hashes         | ✓           | −            | ✓               | ✓
                 | Caching and Replication | −           | ✓            | ✓               | ✓
Routing          | Link Reorganization     | −           | −            | ✓               | ✓
                 | Path Redundancy         | −           | −            | ✓               | ✓
Underlay         | Proximity Awareness     | −           | −            | −               | ✓

Fig. 7. Random and local probes for namespace ranges.

Fig. 8. Node nx installs its random marker in n0’s range. We depict two successors of n0, which maintain other markers in their addressing ranges.
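As a rough Python illustration of this probe-based join on the unit circumference (the parameters r and v, the arc bookkeeping, and the simplified "local inspection" of arcs adjacent to each sandwiching pair are ours, not the authors' exact procedure):

```python
import random

def arcs(pts: list[float]) -> list[tuple[float, float]]:
    s = sorted(pts)
    return list(zip(s, s[1:] + [s[0] + 1]))  # last arc wraps past 1

def join(nodes: list[float], r: int, v: int) -> float:
    """Probe r random points, inspect ~v arcs near each, split the longest arc found."""
    all_arcs = arcs(nodes)
    inspected = []
    for _ in range(r):
        p = random.random()
        i = next(j for j, (a, b) in enumerate(all_arcs) if a <= p < b or b > 1 and p < b - 1)
        # local inspection of roughly v arcs around the sandwiching nodes
        inspected += [all_arcs[(i + k) % len(all_arcs)] for k in range(-v // 2, v // 2 + 1)]
    a, b = max(inspected, key=lambda ab: ab[1] - ab[0])
    new_id = ((a + b) / 2) % 1        # middle of the longest discovered interval
    nodes.append(new_id)
    return new_id

random.seed(0)
nodes = [random.random() for _ in range(64)]
print(join(nodes, r=4, v=4))
```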

Bienkowski et al. [25] specified a protocol keeping constant smoothness σ = Θ(1) in a peer-to-peer system using a unit circumference. Every node installs a random marker in the identifier space by contacting the node handling the marker position. By counting the markers m and measuring the interval lengths l at O(log N) successors (see Figure 8), each node estimates the total number of nodes in the network. In a well balanced identifier space, each node gets an interval of length Θ(1/N). According to this value, each node classifies its own interval as being short, middle, or long. A node with a short namespace is forced to leave in order to re-join in the middle of a long interval handled by some other node in the system. This solution maintains constant smoothness Θ(1), but requires additional message overhead caused by the estimation of the number of nodes in the system, Θ(R + log N), and by the churn of leaves and re-joins. This is the price to pay for a solution that works equally well on join and leave operations.

B. Virtual Servers

To the best of our knowledge, virtual servers were first introduced by Stoica et al. in Chord [12]. By choosing several identifiers, each physical node maintains Θ(log N) instances, called virtual servers, along the unit circumference. This solution distributes identifiers more uniformly over the addressing space, and it achieves constant smoothness Θ(1), because the probability that a node is responsible for a long interval decreases. Its main disadvantages, however, include the additional effort of Θ(log N) join operations and the maintenance of Θ(log N) independent routing tables of size Θ(log N), one per identifier. This solution may easily be introduced in practice, because a physical node has to run O(log N) instances of a DHT application, each of which maintains only one identifier. Some DHTs activate an arbitrary number of virtual servers per physical node, proportional to the peer capacity [29]. To balance the load (i.e., the total amount of all objects on a physical node), nodes may exchange virtual servers while minimizing the amount of load transferred among the physical nodes [1], [26]. The main advantage of this solution is that each node stores an amount of data proportional to its resources; however, the logic for virtual server relocation adds more complexity to the system, because it has to move the whole virtual server infrastructure, including the ID, routing table organization, and content, between physical nodes. Relocations of large virtual servers may also require high bandwidth use.

Rao et al. [26] defined three schemes in which physical nodes contact each other to periodically exchange virtual servers and balance the load. They group the nodes into two categories: heavy (i.e., highly loaded) and light (i.e., less loaded). In the first independent searching strategy, called one-to-one, a light node contacts a randomly chosen node. If a heavy node is contacted, a load transfer from the heavy to the light node may take place (see Figure 9(a)). In the second scheme, called one-to-many, the network maintains a set of static centralized rendezvous directories known to all peers. Each light node reports its load to a randomly selected directory. A heavy node looks for the most appropriate light node in a randomly opened directory. If the targeted light node is found, the load transfer from the heavy node to the light node begins (see Figure 9(b)). In the third scheme, many-to-many, all nodes (i.e., light and heavy) report their load to a few directories. A directory, which resides on a single node, contains the load information on a fraction of the nodes in the system. Its main role is to compute and initiate the most efficient load transfer according to its local information. Independent searching is easy to implement, but it does not evenly distribute the load among participating peers, because it only involves two communicating entities that do not have a broader perspective on the global load distribution, so the whole procedure is inaccurate. The other, more complicated solutions depend on centralized directories maintaining information on the global load, which consume storage resources, bandwidth, and processing time. A directory node may distribute the load more precisely, because it possesses information on the load of multiple entities.

Fig. 9. Nodes exchange virtual servers among the physical machines: (a) independent searching, (b) load reporting.

TABLE III. COMPARISON BETWEEN LOAD BALANCING SOLUTIONS THAT USE VIRTUAL SERVERS.

Characteristic   | Rao et al. [26], Godfrey et al. [1] | Godfrey et al. [19]      | Karger et al. [27]    | Ledlie et al. [28]
Name             | —                                   | Y0                       | —                     | κ-Choices
VSs per node     | Any                                 | Θ(c_v log N)             | 1                     | κ/2
Position of VSs  | Free to move anywhere               | Fixed positions per node | Free to move anywhere | Free to move anywhere
Path length      | Θ(log N)                            | Θ(log N) + Θ(1)          | Θ(log N)              | Θ(log N)
Traffic          | Probing mechanism to find a heavy node, or one message to contact a directory | Generates churn: a node leaves and re-joins for its VSs to get new IDs | Generates churn: a node leaves and re-joins to take a new ID where appropriate | Only active nodes generate churn: a VS leaves and re-joins to take a new ID
Central entities | Relies on directories               | No                       | No                    | No
Routing tables   | One per VS                          | One of Θ(log N) size     | One per VS            | One per VS

Later on, Godfrey et al. [1] proposed a solution based on the one-to-many and many-to-many techniques. So-called periodic balancing relies on the rendezvous directories, in which each node reports its load to one of a few directories in the network. A node providing a directory computes the optimal transfer according to its local information and periodically initiates a load exchange. Since the optimal virtual server relocation problem is NP-complete (similar to multiprocessor scheduling), greedy longest-processing-time approximations are often used instead [1], [30]. Their complexity is estimated at O(r log r), where r is the number of virtual servers designated for relocation. When the load exchange is finished, each node randomly re-selects its directory. So-called emergency balancing, based on the one-to-many paradigm, is activated when the load of some heavy node reaches a predefined threshold value. In this case, a heavy node notifies its directory with an emergency request to call an immediate transfer that sheds its load onto light nodes. The rendezvous directory technique is more efficient than independent searching; however, because of information centralization, it is vulnerable to node-targeted attacks. The load balancing stops operating properly when the directory is overwhelmed by huge traffic or a malicious node takes over the directory position [30].

To solve this issue, Wu et al. [30] designed a Group Multicast Strategy combining the advantages of independent searching and rendezvous directories at the same time. The authors assume that a node-targeted attack is costly and the probability of a quick node compromise is low. They divide the network into disjoint groups. In a small system that contains up to a few hundred clients, there exists only one group, but when the system size grows, the number of groups increases in order to limit the number of group members. The nodal load information (total load, the load of virtual servers) is disseminated among the group members by using any multicast protocol. In the simplest dissemination strategy, every node broadcasts new information to all neighboring peers that belong to the same group, but to prevent redundant forwarding, nodes ignore previously seen messages. At every moment, there is only one group coordinator responsible for computing and initiating reassignments based on previously disseminated load information. To elect this peer, all nodes maintain a sequential counter seq_lb. When a node realizes that its addressing range covers Hash(seq_lb), it becomes responsible for one round in which it requests reassignments, increases the sequential counter by one, and broadcasts the new value to the other members. A receiving node accepts the value only if it is greater than the previously known seq_lb. There is a huge amount of load information continually exchanged among the group members, because at every moment all these nodes have to be ready to serve as the group coordinator. As a consequence, the network traffic may dramatically increase in comparison to directory reporting [26], [1].

Fig. 10. Routing in the Y0 overlay.

Godfrey et al. [19] refined the idea of maintaining several virtual identifiers per node. Their overlay, called Y0, is based on Chord. At the beginning, each node v estimates the number of nodes in the system N, computes a normalized node capacity c_v (i.e., the node capacity divided by the average capacity), and picks a random location p_v. It then spans an interval I_v of length Θ(c_v log N / N) with an upper endpoint at p_v, therefore containing Θ(c_v log N) consecutive subintervals of length Θ(1/N). Node v chooses Θ(c_v log N) virtual identifiers so that each identifier lies in a separate subinterval of I_v. A node maintains a single finger table of Θ(c_v log N) entries and a successor table containing Θ(c_v log N) successors (one per identifier). The routing function to a distant destination d ∉ I_v (see Figure 10) relies on the classical Chord routing mechanism; however, when a message arrives at a node v such that d ∈ I_v, the node forwards the message by using the successor table, originating at the closest anticlockwise local identifier to the destination (because of the clockwise routing direction). The maintenance cost of the routing tables is estimated at Θ(log N), the message dilation equals Θ(log N) + Θ(1), and the smoothness is Θ(1). The main drawback of Y0 is that it operates by means of addressing ranges, and it does not balance the load in case of biased key distributions.

Fig. 11. Competition for the most precious identifiers. nk’s identifier is closer to 1 than nj’s, so nk activates this location and covers 1 in its range.

Karger et al. [27] maintain Θ(log N) random identifiers per node, similar to virtual servers. As opposed to virtual servers, a single node activates only one virtual node at a time. Each node competes to cover the most precious position according to the ordering rule 1 ≻ 1/2 ≻ 1/4 ≻ 3/4 ≻ 1/8 ≻ ... by activating an appropriate identifier (see Figure 11). Notice that the node that maintains the closest identifier to the most precious position activates this location, and the other nodes have to compete for other addresses of lower value. This solution keeps constant smoothness of the addressing space, Θ(1). The most significant disadvantage of this scheme is that at each join or leave, O(log log N) nodes change addresses (virtual nodes). Complementarily, the authors suggest transferring load among nodes by slightly changing identifiers in some situations, e.g., when a node discovers a more/less loaded successor, both nodes should update their coordinates to equally share the load.

Ledlie et al. [28] study a system with restricted identifiers in which security plays a key role. A physical node benefits from κ possible locations in an overlay. A node obtains x_cert (i.e., a number) from a trusted authority and generates a fixed set of κ identifiers h(x_cert + i) in the addressing space, where h is a hashing function and i = 0, ..., κ−1. The algorithm is referred to as κ-Choices, because the system restricts the identifiers of a node to κ locations. κ-Choices may keep up to κ/2 activated virtual servers per physical machine; however, it stops generating new servers when the load on a node has already reached its targeted workload. When a node adds a new virtual server into the system, it probes the overlay to find which of its not yet activated identifiers would work under the lowest load, and activates this location. The authors distinguish two types of nodes: active and passive. An active node constantly adapts to dynamic conditions by probing the network and trying to activate a more suitable set of identifiers. Passive nodes do not produce a churn of leaves and re-joins, because they do not perform any actions after the initial join.
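A minimal sketch of the κ-Choices identifier discipline, assuming SHA-1 for h and a caller-supplied load probe (both illustrative stand-ins for the real system):

```python
import hashlib

def h(x: int, bits: int = 32) -> int:
    return int.from_bytes(hashlib.sha1(str(x).encode()).digest(), "big") % (1 << bits)

def kappa_choices(x_cert: int, kappa: int) -> list[int]:
    """Fixed, certifiable identifier set h(x_cert + i), i = 0..kappa-1."""
    return [h(x_cert + i) for i in range(kappa)]

def activate(ids: list[int], probe_load) -> int:
    """Probe each allowed location and activate the least loaded one."""
    return min(ids, key=probe_load)

ids = kappa_choices(x_cert=424242, kappa=8)        # hypothetical certificate value
print(activate(ids, probe_load=lambda i: i % 97))  # stand-in for a real load probe
```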


The strong point of this solution is that it directly probes the load before selecting identifiers. The weak sides are that κ does not depend on the node capacity (it is fixed), and some nodes may still receive huge loads because κ ≠ f(N), which means that the smoothness is not Θ(1).

Table III summarizes the differences between the various specifications of virtual servers (VSs) for load balancing through object placement. A “—” sign indicates that the corresponding information is neither supported nor relevant. The main idea of VSs is that each node maintains one or many identifiers in the identifier space. Both the virtual servers and their number on a node may vary in time.

Fig. 12. Examples of usage of multiple hash functions: (a) concurrent lookup requests from three different sources na, nb, nc; (b) parallel lookup requests from one single source nb.

C. Deployment of Multiple Hash Functions

Ratnasamy et al. [17] use several hashing functions. In their structured multi-dimensional torus-like overlay, called the Scalable Content-Addressable Network (CAN), a node storing an item computes several object locations by employing d ≥ 2 hashing functions and replicates the object among these positions. To find an object, d parallel lookup queries are launched towards the d potential destinations, or just a single query towards the closest location in the overlay.

Byers et al. [31] refine the idea of multiple hashes by using a variant of the power-of-two-choices paradigm (refer to [32] for a survey). The authors suggest computing several potential object locations by using d ≥ 2 different hashing functions and installing an item on the node with the lowest discovered load. To minimize the effort of launching several queries per item, the authors advise installing tiny redirection pointers towards the object location on all d−1 remaining nodes. In this case, only a single lookup query is necessary to find the object position. When a query arrives at a node holding the requested object's redirection pointer, it is immediately redirected to the object location. The authors also present two interesting mechanisms relying on the redirection pointers, called load-stealing and load-shedding. In case of load-stealing, a light node may take over an object handled by some heavy peer and instead install a redirection pointer on this node. In load-shedding, a heavy node passes its item to a light node and keeps the redirection pointer towards the new location.

Figure 12 shows an example of a system routing in the clockwise direction, equipped with 3 hashing functions h1, h2, h3. Object oi is handled by the node with the lowest load. There are 3 potential object locations, h1(oi), h2(oi), h3(oi), but imagine that the object resides at h1(oi). Dashed arrows represent redirection pointers and continuous lines depict lookup requests. In Figure 12(a), nodes na, nb and nc issue lookup queries to find object oi, by using respectively h2, h3 and h1. When the queries issued by na and nb arrive at their destinations h2(oi) or h3(oi), they are immediately forwarded by using the redirection pointers to the actual object location at h1(oi). In Figure 12(b), node nb issues three parallel requests by using the 3 different hashing functions to increase fault tolerance. The successful request reaches the destination either directly, when h1 is used, or through the redirection pointers, by employing h2 or h3. This solution avoids heavy peers when an object is installed, and consequently distributes objects more evenly among nodes. The idea is also resistant against biased key distributions; however, it requires a few load inspections prior to a key insert, which may be costly in case of small objects or a large number of insertions. The average query path length is not affected, because the scheme only requires one additional hop Θ(1), which is insignificant in comparison to the message dilation of a typical overlay, R = Θ(log N) or R = Θ(log N / log log N).
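The placement rule with redirection pointers can be sketched as follows; the salted hash family, the node_of mapping, and the in-memory load counters are all illustrative stand-ins for real overlay lookups.

```python
import hashlib

def h(i: int, key: str, bits: int = 32) -> int:
    """Family of hash functions h_1..h_d obtained by salting a base hash."""
    digest = hashlib.sha1(f"{i}:{key}".encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

def place(key: str, d: int, node_of, load: dict) -> None:
    """Store the object on the least loaded of d candidate nodes; the other
    d-1 candidates keep a small redirection pointer to the chosen location."""
    candidates = [node_of(h(i, key)) for i in range(1, d + 1)]
    target = min(candidates, key=lambda n: load.get(n, 0))
    load[target] = load.get(target, 0) + 1
    pointers = {n: target for n in candidates if n != target}
    print(f"{key}: stored on {target}, pointers on {sorted(set(pointers))}")

load: dict = {}
place("the_matrix", d=3, node_of=lambda ident: f"n{ident % 16}", load=load)
```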

Wu et al. [33] apply the idea of multiple hashes in KAD. The KAD system has a problem with popular source and keyword keys. When a particular key is very popular, the nodes in its vicinity keep a large number of references and respond to a multitude of lookup requests. To distribute the keys among several locations and alleviate the load, the authors suggest using many hash functions h1, ..., hm. A publishing or searching node uses a random hash function hi, where 0 < i ≤ m, derived by means of a single hash H and functional composition, such that h1(k) = H(k), h2(k) = H(H(k)), h3(k) = H(H(H(k))), and so on.
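The iterated composition is straightforward to express; a short sketch, with MD5 standing in for KAD's 128-bit hash H:

```python
import hashlib

def H(x: bytes) -> bytes:
    return hashlib.md5(x).digest()   # stand-in for KAD's 128-bit hash

def h_i(k: str, i: int) -> bytes:
    """h_1(k) = H(k), h_2(k) = H(H(k)), ...: i-fold composition of H."""
    out = k.encode()
    for _ in range(i):
        out = H(out)
    return out

# a publishing or searching node picks a random i in 1..m
print(h_i("matrix", 3).hex())
```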

D. Caching and Replication

Even under a uniform placement of objects, the popularity of certain items may cause imbalance in the request load, because nodes owning popular objects have to respond to a large number of lookup requests. Caching or replication distributes multiple copies of the same object among nodes, so some peers locally keep replicas of an object handled by another peer in the system. A peer that maintains a copy, called a replica peer, reduces the request load on exhausted machines by acting as a server for cached items. Caching is related to requests, in which a requesting peer leaves a copy of the requested item in a local cache, while replication is mainly related to owners actively pushing objects to selected nodes (updating objects when necessary). Such a copy is then accessible by requests issued by other peers. Since replicated objects tend to be found more quickly (in a smaller number of hops), the average message dilation decreases. The idea of copying objects to other locations is simple, and moreover, orthogonal to other load balancing techniques, which makes it an attractive choice as complementary load balancing. The main challenges for caching and replication include:

• selecting objects for replication;
• selecting replica peers;
• balancing the request load among the replica peers.

1) Various approaches for object replication: Cohen et al. [34] and Lv et al. [35] have studied the performance of three different replication strategies. They consider a system having N nodes (each node handles up to ρ items) and m distinct objects. In a system running a uniform replication strategy, every object, whether it is popular or not, has the same number of copies evenly distributed among nodes. This strategy reduces the lookup delay; however, it establishes a high number of rarely requested replicas, inefficiently using storage resources. A proportional strategy prefers popular objects for replication. When the number of copies linearly depends on an item's popularity, it supports the lookup queries for the most common objects at the cost of rarely used ones. It is then clear that unpopular objects are difficult to find. Square-root replication may be considered as a solution lying in between the uniform and proportional strategies, combining the strongest sides of these two. In this case, the number of replicas of object i is proportional to the square root of its normalized popularity q_i, where Σ_i q_i = 1, i = 1, ..., m.

Wang et al. [36] refined proportional replication by employing an adaptive popularity-aware prefetch, which installs replicas of items correlated to popular ones. The authors studied range queries for objects matching a given pattern. When a multitude of range queries request well and poorly replicated items at the same time, we experience a correlation slowing down the whole system, because the lookups for poorly replicated objects run for a very long time. It is then obvious that these poorly replicated correlated items have to be more intensely distributed among nodes to improve the performance. The authors distinguish between poorly and well replicated items by means of exploration along random walks with a limited time-to-live (TTL). They also define a distance function D in the range space, so that a node may compute the maximal distance τ_i between any poorly replicated item and its closest well replicated item for every lookup query. By averaging this distance over a few queries 〈τ_i〉, each node estimates its average correlation length τ. The authors spread the correlated items among nodes more quickly and improve the performance of range queries by using τ carried on top of the lookup queries. When a node responds to a range query for s, it additionally returns all items s′ correlated to s such that D(s, s′) < τ.

Fig. 13. Example of path replication.
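For intuition, the following Python sketch (our own helper, not from [34], [35]) computes how a fixed replica budget would be split over three objects under each strategy; note how square-root replication sits between the two extremes.

```python
import math

def replica_counts(popularity: list[float], total_replicas: int) -> dict[str, list[float]]:
    """Replica budget per object under the three replication strategies."""
    m = len(popularity)
    q = [p / sum(popularity) for p in popularity]        # normalized: sum q_i = 1
    sq = [math.sqrt(x) for x in q]
    return {
        "uniform":      [total_replicas / m] * m,
        "proportional": [total_replicas * x for x in q],
        "square-root":  [total_replicas * x / sum(sq) for x in sq],
    }

for name, counts in replica_counts([100, 10, 1], total_replicas=30).items():
    print(name, [round(c, 1) for c in counts])
```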

2) Caching along the path of lookup queries: Since every message contains information about the request path, the most suitable method is to cache along that path. This can be done on all the nodes, randomly, close to the destination node, or at the node that requested the object [37]. In the following, we describe each of these methods in detail.

In some situations, replication provides neither quicker access nor load balancing. Dabek et al. [29] provide CFS, based on the Chord DHT. They keep k replicas of the same object among a home node and its successors. This replication scheme does not aim at load balancing; instead, it increases the reliability of the whole system by eliminating a single point of failure. When a home node fails, its successors immediately start serving the object. The authors do, however, consider load balancing by using path replication, which caches a requested item on all nodes along the path of a recent lookup query for this object. At the same time, Rowstron et al. [38] independently adapted the same caching and replication concepts to Pastry [10] in PAST, such that a copy of each item resides on all k closest peers (leaves) to the home node, and additionally, nodes cache items along the paths of lookup or insert queries (PCX). In Figure 13, we present an example of such an operation. At the beginning, node ns initiates a lookup query (continuous arrow) for an object handled by nd, so that the query visits a few intermediate hops n1, n2, and n3, and finally reaches the destination nd. In turn, node nd issues replication requests (dashed arrows) populating the object on all nodes residing along the previously established path. This strategy is referred to either as Path Replication [9] or as Path Caching with Expiration (PCX) [39], because typical cache entries maintain expiration timeouts after which they are considered stale or invalid. Keeping cache entries for already deleted or modified items is not desired from a data consistency perspective; cache entries that are not synchronized with a home node cannot, in general, guarantee any data consistency.

Wang et al. [40] provide Distributed Caching and Adaptive Search (DiCAS), another adaptive mechanism, which installs tiny hint entries along the paths of previously issued lookup queries. Each peer picks a random group identifier in a predefined range 0 ... M−1. At the beginning, an item resides on a random peer in the group whose identifier is derived by means of GroupID = hash(ObjectID) mod M. A lookup query for an object should obviously be restricted to its group members; however, in some situations dead ends are encountered, i.e., nodes with no next hops matching the group ID. In these cases, the routing function prefers a next hop with the highest node degree (a large number of neighbors), because there is a high probability that such a peer neighbors at least one node that belongs to the group of interest. When the object is found, it is returned in the opposite direction along the path of the lookup query, so that peers equipped with the corresponding group ID may save the object location. Consequently, subsequent requests for the same object reach the destination more quickly, due to the saved routing hints that provide information on the object location.

Obviously, an increased number of replica peers offers quicker access to objects, implicit fault tolerance, and request load balancing. Yamamoto et al. [9] discovered, however, that the number of replicas installed by PCX may become very large, far beyond the limit required to improve the system performance. In order to limit the number of replicas, the authors propose Path Random Replication and Path Adaptive Replication. In Path Random Replication, a node lying along the path of a lookup query caches an item with a fixed probability p ≤ 1. Caching should also take into account the limits of storage capacity. In Path Adaptive Replication, p depends on a node's storage and consumed resources, such that a heavily loaded node or a node with limited capacity is less likely to cache items. The main disadvantage of these schemes is that unnecessary replicas of extremely well cached items may still occupy storage resources on a large fraction of high capacity peers, preventing the installation of more important replicas of poorly replicated items.
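Both variants fit in a few lines of Python (a sketch under stated assumptions: caches are plain dicts, and the linear capacity scaling in adaptive_probability is one plausible form, not the paper's exact formula):

import random

def path_random_replicate(path_caches, key, value, p=0.3):
    # Path Random Replication: every node on the lookup path caches the
    # item independently with the same fixed probability p <= 1.
    for cache in path_caches:
        if random.random() < p:
            cache[key] = value

def adaptive_probability(used: float, capacity: float, p_max=0.5) -> float:
    # Path Adaptive Replication, in one plausible form: scale the caching
    # probability by the remaining storage fraction, so heavily loaded or
    # small-capacity nodes are less likely to cache.
    return p_max * (1.0 - used / capacity)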

When an object is replicated among multiple locations, the request load is indeed shared among its owner and the corresponding replica peers. In some situations, however, the share is not equal, because some nodes may still receive more requests than others. To solve this issue, Bianchi et al. [37] constantly measure request rates for locally handled objects, including object request rates and object request rates per incoming neighbor. To equally share the load among nodes, an exhausted home or replica peer nf replicates its most popular object (MPO), the one with the highest number of requests per time unit, on the immediate incoming neighbor that forwards the largest number of queries for it to nf. This solution is easily implementable, because a node decides whether the replica on the incoming neighbor is necessary based on a local estimation. The algorithm only installs replicas when necessary and outperforms PCX; however, a replica is created by active pushing, so it may consume a lot of network resources.

Shen [41] uses a similar approach in the Efficient and Adaptive Decentralized file replication algorithm (EAD). In this scheme, each forwarding node monitors the query rate qf of object f, and issues a request for replication to the resource owner of f when the query rate for this item significantly exceeds the mean value, qf > αq, where α is an arbitrary number and q is the mean request rate over all objects. The request for replication indicates that the forwarding node is interested in holding a replica of f so as to directly respond to the lookup queries and alleviate the load on the home node. A heavily loaded resource owner immediately sends replicas to the interested peers (taking into account their query rates, i.e., a peer that forwards more queries per time unit is more likely to get the replica) until its targeted workload is reached. A lightly loaded home node only replicates its item when the benefit of the replication is greater than its cost.
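The trigger and the owner-side grant policy can be sketched as follows (our hedged illustration of EAD's rules; ALPHA = 2.0, the tuple layout, and the load-absorption estimate are assumptions):

ALPHA = 2.0  # assumed over-popularity factor

def needs_replica(rate_f: float, mean_rate: float) -> bool:
    # EAD trigger on a forwarding node: ask the owner of f for a replica
    # when q_f > alpha * q, with q the mean request rate over all objects.
    return rate_f > ALPHA * mean_rate

def grant_replicas(owner_rate: float, target_rate: float, requesters):
    # A heavily loaded owner serves requesters in decreasing order of the
    # query rate they forward, until the targeted workload is reached.
    granted = []
    for peer_id, fwd_rate in sorted(requesters, key=lambda r: r[1], reverse=True):
        if owner_rate <= target_rate:
            break
        granted.append(peer_id)
        owner_rate -= fwd_rate  # load the new replica is expected to absorb
    return granted, owner_rate

print(grant_replicas(100.0, 40.0, [("a", 30.0), ("b", 50.0), ("c", 10.0)]))
# -> (['b', 'a'], 20.0)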

Fig. 14. Replication on the node that recently retrieved an object.

Gopalakrishnan et al. [42] present the Lightweight, Adaptive, system-neutral Replication protocol (LAR). The idea is that each node locally measures its routing load. When forwarding node nf realizes that its load exceeds a predefined threshold value lhi, it attempts to create a new replica on the peer nq that recently requested an object from nd with a lookup query seen by nf (see Figure 14). This is a fairly good choice, because this peer should already have retrieved a copy of the object by then. A successfully created replica is pro-actively advertised by installing small-sized cache entries along a fraction of the lookup query path (between nq and nf). Each cache entry acts as a routing hint for newly arriving lookup queries. A forwarding node may efficiently balance the load by redirecting messages towards the home node or the cached replica peers. LAR outperforms the CFS replication scheme in terms of traffic overhead, because it only installs tiny cache entries along the paths of lookup queries, and it mainly depends on the retrieval process initiated by a client interested in a particular object.

Roussopoulos et al. [39], [43] propose the Controlled Update Propagation (CUP) protocol, tackling the consistency problem of cached items. Under dynamic network conditions, content is constantly modified, added, or removed, and it is then difficult to judge whether the encountered cache entries are still valid or expired. To solve this issue, CUP introduces a tree-like, object-specific downstream channel transferring cache updates to the peers that hold the cache entries. The tree is built reactively, such that it contains all peers along the paths of previously issued lookup queries. A lightly loaded peer may independently cut the downstream channel by refusing or not forwarding update messages when it discovers that the maintenance costs are greater than the benefits from the currently maintained cache entries, e.g., when a node receives a lot of update messages for an object, but does not see any lookup queries for it.

3) Caching on nearby locations: In the case of a popular object, it is beneficial to install replicas on the nodes placed near its home node, because otherwise they may forward a large portion of the lookup queries and overload the home node.

Fig. 15. Beehive replication strategy.

Ramasubramanian et al. [44] proposed Beehive, a general proactive replication framework for any DHT using prefix-based routing (e.g., Chord [12], Pastry [10], Tapestry [22]), also referred to as Longest-Common-Prefix (LCP)-based replication [45]. In a system with prefix routing, a destination shares a longer prefix, i.e., the leftmost leading part of the address, with each consecutive next hop. Normally, after traversing κ hops, a query reaches a node having κ leading digits matching the destination, so the search space reduces exponentially and the average message dilation is Θ(logN). To shorten the message dilation by κ hops, a home node handling the object may replicate its item on the other nodes whose addresses differ from the object location by at most κ terminating suffixes, i.e., the κ last hops along every potential path towards the destination. Because the replica peers directly reply to the lookup requests, the request load is shared among many nodes. At the beginning an object is only carried by its home node, and we may say that it resides at the home level k = O(logN). When the object popularity is too high, the overloaded home node lowers the request load by installing object replicas on all nearby peers residing at level k − 1, whose addresses only differ by 1 suffix from the object key. If the demand for this object drops below an acceptable value on all carrying nodes, the replication stops. Otherwise, the process continues by populating replicas at the lower level k − 2, where nodes differ by 2 suffixes. In general, the replication process pushes the replicas to lower levels until the targeted workload is finally reached (see Figure 15). The authors suggest that the targeted level of replication should guarantee a constant lookup latency O(1). They analytically derive the targeted level of replication for every object in the system by assuming a Zipf-like popularity distribution qi ∝ i^(−α), where i = 1, . . . , m; m is the number of distinct objects in the system; and α is the exponent characterizing the Zipf distribution. The appropriate replication level for object i is estimated as xi = fi(α, C), where C is an arbitrary message dilation; this, however, requires knowledge of α. To derive α, Beehive performs two phases called analysis and aggregation. During the analysis, each node estimates the request rates for locally maintained items. The local information does not reveal the overall object popularity, however, because the request rates are distributed among all nodes that handle a given object. Beehive uses an aggregation protocol to exchange the object popularity among all these nodes placed at different levels. The dissemination protocol periodically updates the information on the request rates in both directions, i.e., from lower replication levels to upper levels and the other way around. Having the overall object popularity, each node may locally estimate α by using a linear regression, because the assumed Zipf-like qi is a straight line in log scale. A node computes the desired replication level xi for every maintained object by using the analytically derived functions fi. The subsequent replication strategy is fully distributed: when a node at level i realizes that one of its objects is not replicated well enough, it populates the object among all known nodes in its routing table that reside at level i − 1 with respect to the object position. The main drawback of Beehive is the non-local estimation of the replication level, which requires a protocol exchanging the load information among a large number of nodes. This makes the algorithm impractical to use in large scale implementations. The algorithm may also fail to operate properly under different object popularities, due to its explicit dependence on the Zipf distribution.
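The analysis phase boils down to fitting the Zipf exponent: with x = log(rank) and y = log(rate), the slope of the regression line is −α. A minimal sketch of that fit (our illustration, not Beehive's code):

import math

def estimate_zipf_alpha(request_rates):
    # Fit q_i ~ i^(-alpha) by least squares in log-log space.
    rates = sorted(request_rates, reverse=True)
    xs = [math.log(i + 1) for i in range(len(rates))]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# Perfectly Zipf(1.0)-distributed rates yield alpha close to 1:
print(estimate_zipf_alpha([1.0 / i for i in range(1, 100)]))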

Xia et al. [45] studied different specifications of LCP-based replication to find out whether a replication pattern similar to Beehive's may be obtained by using only local measurements of request rates. Their starting point is Beehive, in which a home node pro-actively pushes frequently requested objects to the nodes at the lower levels according to the object position. The authors' strategy is based on the observation that in an overlay with prefix-based routing, the probability that node s forwards a query from a random node to destination d is equal to p = (1/2)^(e − l(s, d)), where e = log(m) is the address length, m is the size of the addressing space, and l(·, ·) is the length of the common leftmost prefix shared by two addresses. Notice that the probability of message forwarding goes up with the increasing number of prefix digits shared by a node with the destination (see also Figure 17). A simple replication strategy in which a node polls and caches an item when the locally measured object popularity exceeds a given threshold θf/2 gives similar performance as Beehive [44]: when l(s, d) is large, node s near the popular object location d observes a lot of requests and likely caches the item, yet the idea does not suffer from any non-local procedures.
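The forwarding probability and the local caching rule are directly computable; the sketch below is our illustration (binary string addresses and the fetch callback are assumptions):

def shared_prefix(a: str, b: str) -> int:
    # l(., .): length of the common leading prefix of two addresses.
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def forward_probability(s: str, d: str) -> float:
    # p = (1/2)^(e - l(s, d)): the longer the prefix s shares with d,
    # the larger the fraction of d's traffic it forwards.
    e = len(d)
    return 0.5 ** (e - shared_prefix(s, d))

def maybe_cache(cache, key, local_rate, threshold, fetch):
    # Local rule with a Beehive-like outcome: poll and cache the object
    # once the locally observed request rate crosses the threshold.
    if local_rate > threshold and key not in cache:
        cache[key] = fetch(key)

print(forward_probability("1010", "1011"), forward_probability("0101", "1011"))
# -> 0.5 and 0.0625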

Huang et al. [46] proposed the Logless File Replication Algorithm (LessLog). In their scheme, each node maintains a unique physical identifier (PID) in the addressing space. At the beginning, an item resides at its home node, which is also the root of a so-called virtual binomial tree. When the object popularity is significant, the object is distributed along the tree, descending from the root. A node does not need to maintain specific logs per client; in case of overload, it simply populates its objects down the tree. Every destination has its own distinct virtual binomial tree known to everybody in the system, because a node may easily derive its own position in any tree, along with the address of the parent leading towards the root or of the offspring nodes going down the tree. The conversion between real and virtual addresses relies on the eXclusive OR (XOR) of the nodal PID with the binary complement of the destination address (e.g., for m = 4, PIDd = 0100 has complement(PIDd) = 1011): VIDn = xor(complement(PIDdst), PIDn). A simple computation, xor(complement(PIDd), PIDd), leads to the immediate observation that a destination (tree root) gets a virtual identifier consisting of m consecutive 1s, where m is the address length. Consider an example in which node n with PIDn = 101 lies very close to the destination PIDdst = 100; we see that it also stands very close to the root of the destination's binomial tree, because 011 becomes 110 = xor(011, 101). A node deriving a next hop for a given destination starts by computing its virtual address in the tree. Then, the routing function replaces the leftmost "0" of the virtual address with "1" (e.g., 0011 → 1011), and converts the result to the physical location of the parent node by XORing it with the binary complement of the destination PID. The conversion between real and virtual addresses is so easy thanks to the following XOR property: xor(A, B) = C → xor(A, C) = B. The replication of popular items relies on pushing an object to the offspring nodes in the object's virtual binomial tree, so that the replicas may be encountered earlier on the route towards the destination. Consequently, the load is shared among several different locations. The authors do not need to maintain specific client logs like Gopalakrishnan et al. [42] or Bianchi et al. [37], which consume both memory and processing time, but they depend on active pushing, which consumes network resources when a replica is created.
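The virtual/physical conversions reduce to a few bit operations, shown here as a runnable sketch (our illustration; bit-string PIDs are an assumption):

def complement(bits: str) -> str:
    return "".join("1" if b == "0" else "0" for b in bits)

def xor_bits(a: str, b: str) -> str:
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

def virtual_id(pid_node: str, pid_dest: str) -> str:
    # Position of a node in the destination's binomial tree: XOR of the
    # node's PID with the binary complement of the destination's PID.
    return xor_bits(complement(pid_dest), pid_node)

def parent_pid(pid_node: str, pid_dest: str) -> str:
    # Next hop towards the root: flip the leftmost "0" of the virtual
    # address to "1", then map back to a physical PID; the mapping is its
    # own inverse because xor(A, B) = C implies xor(A, C) = B.
    vid = virtual_id(pid_node, pid_dest)
    i = vid.index("0")                  # raises ValueError at the root
    vid = vid[:i] + "1" + vid[i + 1:]
    return xor_bits(complement(pid_dest), vid)

# The destination itself maps to the all-ones virtual root, and node 101
# sits one hop below the root of destination 100's tree:
assert virtual_id("100", "100") == "111"
assert virtual_id("101", "100") == "110"
assert parent_pid("101", "100") == "100"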

4) Miscellaneous: Because the network may be heterogeneous, with both high and low capacity peers, it is beneficial to establish a notion of hierarchy in the network. Shen et al. [6] proposed the Proactive Low-overhead File Replication Scheme (Plover). The scheme uses nodes of high performance and capacity, i.e., super-nodes, as the basic building blocks of the peer-to-peer network. Each high capacity node owns a cluster of ordinary nodes. When an ordinary node joins the peer-to-peer network, it simply establishes a link with the closest discovered super-node. A super-node then manages the load balancing by exchanging hot files among light and heavy nodes, and replicates popular objects within the cluster.

V. ROUTING

In the previous section, we have shown how to distribute objects among peers. We now go further and study the paths of lookup queries. To transmit a message between any source and destination, an overlay specifies a distributed routing function in which every node is responsible for packet forwarding. Each node maintains a routing table, which contains identifiers and the corresponding IP addresses of the neighboring peers. This information is used by the routing algorithm when a node selects a next hop for a given destination, and by the forwarding process in which a node passes a message further by using the IP addresses and the underlying network. Each message visits a few intermediate nodes to finally reach the destination. Even under a uniform flow of requests, some paths may become overloaded [47], because they are used more frequently than others. Moreover, a skewed object popularity pattern may amplify this bias, making the problem even more challenging. To overcome this issue, routing load balancing is applied. The link organization of the overlay and the routing strategy both have a deep impact on the traversed paths, so we classify routing load balancing into the following disjoint sub-categories:

• Link reorganization, in which the routing strategy remains unchanged, but load balancing alters the link organization of the overlay.


• Path redundancy, in which the link organization remains unchanged, but the system refines the routing strategy.

A. Link Reorganization

The number of entries in the routing table is referred to as the degree of an overlay. An overlay of constant degree has low maintenance costs, but usually there do not exist many choices to fill in each routing entry. There are several examples of constant degree overlays, including de Bruijn-based overlays [24], Viceroy [23], and CAN [17]. Other DHTs like Chord [12], Pastry [10], Tapestry [22], or Kademlia [14] use a logarithmic node degree. These overlays have higher maintenance costs, since their routing tables contain more entries. Nevertheless, the routing algorithm may use multiple entries leading to the same destination, which is a good starting point for path redundancy. When there exist many eligible nodes to fill in a particular routing entry, as in Chord [12], Pastry [10], Tapestry [22], or Kademlia [14], one may easily replace an exhausted neighbor (i.e., a routing entry) with a new light peer without affecting the average path length of subsequent lookup queries. Other DHTs, e.g., CAN [17], Viceroy [23], or Koorde [24], do not allow multiple eligible nodes, i.e., there is only one eligible node that fits an entry, so there is no flexibility in link reorganization.

Fig. 16. Flexibility in the choice of neighbors.

We now provide a more detailed description of the situation in which there exist multiple eligible nodes for a particular routing entry. In Pastry [10], the routing table of a node n is organized in such a way that an entry at row i refers to a node whose address shares exactly i leading digits with n. Let us consider an identifier space of size 2^4, in which each identifier contains 4 consecutive bits, "0" or "1". In this case, the number of eligible nodes declines with i, because there are 2^(3−i) possible identifiers matching the ith entry; however, not all of them are necessarily used by other nodes. In Figure 16, we consider a node 1010. On the left side, we present its routing table, where a "*" indicates a wild card substituting "0" or "1". On the right side, we see a linear addressing space with 4 eligible identifiers matching the routing entry at row 1. There is, however, no node occupying 1110, thus node 1010 may arbitrarily use 1100, 1101, or 1111 to fill in row 1, according to any predefined rule.
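The eligibility test for a given row is a simple prefix check, sketched below (our illustration; bit-string identifiers are an assumption):

def eligible_for_row(node_id: str, candidate_id: str, row: int) -> bool:
    # A candidate fits row i of node n's routing table if it shares exactly
    # i leading digits with n, i.e., it matches the prefix and differs at
    # digit i.
    return (candidate_id[:row] == node_id[:row]
            and candidate_id[row] != node_id[row])

# Node 1010 can fill row 1 with any live peer matching 1 1 * *:
peers = ["1100", "1101", "1111", "0110"]
print([p for p in peers if eligible_for_row("1010", p, 1)])
# -> ['1100', '1101', '1111']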

Fig. 17. Example of path convergence towards the same destination, node n24.

Serbu et al. [48] benefit from Pastry's flexibility in neighbor selection. The idea is that a node updates the content of its routing table by using load information on other peers, which is piggybacked on top of lookup requests. Every node locally monitors its routing load, and injects its identifier and the measured load into the lookup message upon forwarding. A forwarding node also inspects all lookup queries, searching for load information about the preceding hops, which are obviously eligible to fill in some of the routing entries. The forwarding node updates a routing entry with one of the eligible nodes from the forwarded message when this eligible node claims to be less loaded than the neighbor presently occupying the entry. This technique reduces the load of heavy peers by increasing the message forwarding by light nodes. Highly loaded nodes receive fewer requests, because they are discarded from the routing tables all around the network. The algorithm consists of local procedures, so it may easily be implemented and deployed in a large scale system. It does not require additional control messages, and the increased size of the lookup query has a reasonable cost (a few bytes per node, multiplied by the O(logN) possible intermediate nodes along any lookup path).
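The per-hop procedure can be sketched as follows (a hedged illustration; the message layout and table representation are our assumptions, not the paper's data structures):

def row_of(node_id: str, other_id: str):
    # A peer is eligible for the row equal to the length of the prefix it
    # shares with the local node.
    if node_id == other_id:
        return None
    i = 0
    while node_id[i] == other_id[i]:
        i += 1
    return i

def on_forward(node_id, own_load, routing_table, message):
    # Piggyback our own (id, load), then learn from the loads reported by
    # the preceding hops and swap in any lighter eligible peer.
    for hop_id, hop_load in message["hops"]:
        row = row_of(node_id, hop_id)
        entry = routing_table.get(row)
        if row is not None and (entry is None or hop_load < entry[1]):
            routing_table[row] = (hop_id, hop_load)
    message["hops"].append((node_id, own_load))

# Example: node 1010 replaces its row-1 neighbor 1100 (load 9) with the
# lighter hop 1111 (load 2) observed in a forwarded lookup:
table = {1: ("1100", 9.0)}
on_forward("1010", 1.0, table, {"hops": [("1111", 2.0)]})
print(table)   # -> {1: ('1111', 2.0)}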

Xia et al. [45] studied the performance of caching mechanisms through simulations against two types of routing strategies. In the case of Single-Choice Randomized Routing Tables, each routing entry contains a random eligible node for the corresponding level and digit value; e.g., at level 0, a routing entry for "0" contains a random node of address "0 * . . . *", where a "*" is a wild card substituting "0" or "1". Contrary to the previous strategy, Multiple-Choice Randomized Routing Tables do not choose a random eligible node. Instead, a new routing entry prefers the node with the lowest number of upstream peers (incoming links). The number of upstream peers is not a local value, so it might be difficult to implement this method in a real system.

B. Path Redundancy

A routing strategy selects a next hop for a message on each consecutive forwarding node. Greedy routing (e.g., Pastry [10], Chord [12]), which only uses information local to a peer, is incontestably one of the most widely employed routing strategies. Greedy routing directs each request as close to the destination as possible, achieving an optimal path length in the absence of errors. One of its major drawbacks is that frequent requests destined to the same node overload the peers lying near the destination, since the messages on the "last mile" predominantly visit the same nodes.

An example of path convergence is shown in Figure 17. We present an identifier space of size 2^5, in which an identifier consists of 5 consecutive bits. In our scenario, a dashed arrow represents a single instance of a lookup request traveling between a pair of nodes. To present an aggregation of multiple queries, we use continuous arrows with a thickness proportional to the routing load. Several nodes, n2, n6, n10, and n14, issue lookup requests to node n24. The requests from n2 and n10 arrive at n18, from where they travel along the same path through n22 to reach the destination at n24. Additionally, n22 forwards the other requests coming from n6 and n14, so the traffic destined to n24 heavily overloads n18 and n22.

The routing load may be balanced thanks to multiple substitute next hops suitable for a given destination, so that lookup requests go along a few alternative paths. As an example, let us consider eQuus [49], in which a clique, i.e., a group of nearby nodes in terms of some proximity metric, shares the same identifier and a set of keys. In order to provide robust connectivity, a client maintains links towards all other nodes in its clique and towards k peers in each of O(logN) other cliques, where N is the total number of cliques in the system. To ensure robust communication, the content of the routing tables is periodically updated. A node updating a routing table entry for clique c contacts a random node in c, which in turn sends back a list of k nodes in c currently in operation. A node that has a message to forward to c may pick a random next hop out of the k known peers in c. This solution provides alternative paths, and therefore evenly distributes the routing load among all the nodes in the same clique. The algorithm constantly maintains a few eligible next hops with the same clique ID, so a random routing function choosing one of these nodes evenly distributes the routing load. The idea is simple and easily implementable in a real system.

Fig. 18. HyperCuP routing.

Another type of overlay maintains hypercube-like structures [50], [49], [51], [52], which provide alternative paths of similar lengths. HyperCuP [50] maintains a variable-dimensional distributed hypercube-like graph (i.e., a segment, square, cube, or n-cube) throughout the whole lifetime of the system. Nodes reside at the vertices of the hypercube, and the set of links is always restricted to the set of edges of the hypercube, e.g., 1 for a segment in 1D, 4 for a square in 2D, 12 for a cube in 3D, or 32 for a 4-cube in 4D; every node is therefore adjacent to a neighbor in every dimension. Notice that each identifier on the hypercube consists of a sequence of bits, "0" or "1", which indicates a bitwise position in every dimension (x, y, z, . . . ). There is a link (edge) between two nodes iff their binary identifiers differ in exactly one bit. As an example, let us consider a message traveling along an edge connecting 100 with 000. The message moves along the x-axis, because these addresses only differ in the first bit. Some peers have to maintain virtual nodes to populate all available vertices, because there are not always enough nodes to occupy all positions in the cube (which requires 2, 4, 8, 16, . . . nodes). The hypercube consistency is maintained by the join and leave operations. When the number of nodes exceeds the current number of vertices, the system opens up a new dimension for a newly arriving node. The diameter of the network is O(logN), and the join operation requires O(logN) messages. The authors have only specified a binomial tree broadcast (see Figure 18). It is well known that the binomial tree spans the hypercube graph [53]. To broadcast a message, one associates a unique number with every dimension, e.g., x = 0, y = 1, etc. When a node obtains a query on an edge along dimension di, it only forwards it on all links j of a "higher dimension", such that dj > di.
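This forwarding rule is compact enough to simulate directly; the sketch below (our illustration, with integer node IDs whose bits index the dimensions) reaches every vertex exactly once, as a binomial tree does:

def broadcast_targets(node_id: int, incoming_dim: int, dims: int):
    # HyperCuP rule: having received the message on the edge of dimension
    # incoming_dim, forward only along strictly higher dimensions, i.e., to
    # the neighbors whose IDs differ in one higher-order bit. The
    # originator calls this with incoming_dim = -1.
    return [(node_id ^ (1 << d), d) for d in range(incoming_dim + 1, dims)]

def broadcast(origin: int, dims: int):
    # Simulate the whole binomial-tree broadcast over the hypercube.
    reached, frontier = {origin}, [(origin, -1)]
    while frontier:
        node, came_on = frontier.pop()
        for nbr, d in broadcast_targets(node, came_on, dims):
            reached.add(nbr)
            frontier.append((nbr, d))
    return reached

assert len(broadcast(0b000, 3)) == 8   # all 2^3 vertices are reached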

Alvarez et al. [51] specify another network with a hypercube structure. Each node maintains a unique identifier of a few bits and a mask, which contains a few 1s on the left followed by 0s. The addressing space handled by a node is derived by means of a logical AND of the nodal address and the mask. The routing algorithm may either be proactive, in which a node maintains a next hop for each region of the addressing space, or reactive, mostly based on the standard routing in hypercubes, which corrects the differing bits one by one. As an example, node 0100 may forward a message on the route to 1111 to either 1100, 0110, or 0101, because each of them differs from 0100 in only one bit.

Han et al. [54] study hypercube networks in which the routing function relies on the binomial tree. The authors formulate a problem in which n clients access an object that is replicated among m distinct replica servers. Every server chooses its subset of clients and arranges the transmissions into sessions (each session contains simultaneously retrieving clients) by minimizing a so-called degree of interference, i.e., the amount of concurrent data streams on all overlay links. The algorithm fairly distributes the load among the links of the binomial tree.

The HYPEER overlay [52] loosely adapts a hypercube, in which the join and leave operations provide a node with exponentially spaced neighbors, referred to as aligned nodes. The distance between a node and its neighbor residing at row x in the routing table therefore equals 2^x. Imagine a situation in which a source requests a destination 2^2 + 2^3 + 2^4 away. At the first hop, an ordinary greedy routing takes the longest leap, selecting a node 2^4 closer to the destination. The subsequent moves respectively jump 2^3 and 2^2 further. Because every node has a set of aligned nodes, another logarithmic-length routing strategy may start with the shortest jump of 2^2, followed by increasingly longer consecutive steps. The authors provide a load balancing routing strategy in which a node selects a random eligible next hop for any given destination. In our example, a source advances the packet by 2^2, 2^3, or 2^4 with the same probability. This strategy balances the routing load at the cost of slightly longer paths. It also requires the maintenance of the aligned neighbors, because otherwise the routing is unable to alleviate the traffic on the nodes near the destination.
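The feasible jumps are simply the powers of two in the binary decomposition of the remaining distance; a minimal sketch of the randomized choice (our illustration, assuming perfectly aligned neighbors):

import random

def power_of_two_jumps(distance: int):
    # Each set bit of the remaining distance corresponds to one aligned
    # neighbor (2^x away, row x of the routing table).
    return [1 << x for x in range(distance.bit_length()) if distance >> x & 1]

def next_jump(distance: int) -> int:
    # Greedy routing would always take the largest jump first; HYPEER's
    # balancing strategy picks one feasible jump uniformly at random, which
    # spreads the traffic over several same-length paths.
    return random.choice(power_of_two_jumps(distance))

# A destination 2^2 + 2^3 + 2^4 = 28 away may be approached by jumping 4,
# 8, or 16 first; the hop count stays 3 in every order.
print(power_of_two_jumps(28))   # -> [4, 8, 16]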

Fig. 19. Load balancing of the forwarding traffic towards the same destination, node n24, using a random choice for the next hop.

In Figure 19, we have the same situation as in Figure 17, with the difference that we use the HYPEER load balancing routing strategy. As previously stated, there are 4 nodes issuing requests to n24. This time the lookup queries go along different paths, fairly distributing the load.

VI. LOAD BALANCING IN THE UNDERLAY

A peer-to-peer system that performs well at the overlay level, i.e., with short delays and completely balanced object and routing load, does not necessarily optimize traffic in the underlying network (e.g., the Internet). Sometimes underlays and overlays do not perfectly match with regard to nodes or links, e.g., some nearby nodes in the underlay get distant identifiers in the overlay. Moreover, the load balancing mechanisms applied to an overlay may not affect or balance the load at the underlying level. Even if the routing load is balanced, the traffic may still go along a small number of underlying links and overload the corresponding network resources. Although the overlay paths seem to have a short length, a message may go back and forth among distant regions of the underlying network. In this section, we consider different methods which exclusively minimize and balance the network traffic. To achieve this goal, one should build the overlay by using knowledge of the underlying topology; the performance of this mechanism is limited, however, because stub networks may only use one link to exchange information with remote destinations. There are three widely accepted approaches to exploit network proximity [55]:

• Topology-based IDs: mapping the overlay onto the underlying topology by using topology-aware IDs.

• Proximity neighbor selection: mapping the overlay links onto the underlay topology by selecting the physically closest neighbors.

• Proximity routing: selecting the physically closest next hop among several alternative eligible hops.

The first two modify the structure of the overlay, providing a proper mapping of nodes and links onto the underlying topology. Consequently, the traffic travels along similar paths in both the underlay and the overlay. The third choice relies on proximity routing, which takes into account the underlying distances to eligible neighbors.

A. Topology-based IDs

Nodal identifiers should contain residual information on the physical location in order to optimize or balance the traffic in the underlying network. This approach has several drawbacks [55]: the uniform distribution of nodes over the identifier space may be destroyed, inducing load balancing problems in the overlay, and close neighbors may suffer from correlated failures, reducing the robustness of the overlay network.

Fig. 20. The RTT is computed as the time difference between sending a query and receiving its reply. Node n1 estimates its x and y coordinates by measuring the RTTs to landmarks lx and ly, respectively.

Ratnasamy et al. [11] delivered a topologically aware addressing scheme for CAN [17]. They cluster nodes into bins based on network measurements and landmark numbering. Every node measures the Round-Trip Times (RTTs) to a few landmark machines in the Internet (see Figure 20). Based on this information, a node does not randomly select its identifier in the whole available addressing space; instead, it uses an n-dimensional vector of RTTs as its position 〈l1, l2, . . . , ln〉. Obviously, the nodes that fall into the same bin are relatively close to each other in terms of network latency, i.e., RTT. The landmark positioning does not exhibit high precision, i.e., the RTT varies in time and synthetic coordinates do not perfectly reflect real positions, but it is enough to optimize traffic, so that a request approaching the destination in the overlay does not go back and forth in the underlying network.
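The binning idea reduces to classifying each landmark RTT into a latency class; the sketch below is our own illustration (the two class boundaries, in milliseconds, are arbitrary assumptions):

def landmark_position(rtts):
    # Topology-aware identifier: the vector of RTTs to the landmarks is
    # used directly as the node's coordinates <l1, ..., ln>.
    return tuple(rtts)

def bin_of(rtts, boundaries=(50.0, 150.0)):
    # Coarser binning: map each RTT to a latency class, so that nodes that
    # are close in the underlay end up with the same bin label.
    def level(rtt):
        return sum(rtt > b for b in boundaries)
    return tuple(level(r) for r in rtts)

print(bin_of([23.0, 180.0]))   # -> (0, 2)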

Xu et al. [56] reduce the dimension of the location vectors estimated through landmark positioning by employing space filling curves [57]. This is beneficial, because a precise position estimation requires a large number of landmark machines, while ordinary peer-to-peer systems use low dimensional namespaces, e.g., the 1D Chord [12]. As an example, a node measures the distances (l1, . . . , ln) to n different locations in the network. By employing Hilbert curves, the n-dimensional vector is mapped onto a single identifier in R^1. This framework guarantees that two nodes that are nearby in terms of their high dimensional vectors stay together after the dimensional contraction. The algorithm partitions the high dimensional space into very small n-cubes of equal size. It then establishes a space-filling curve passing through all n-cubes one by one in a specific order. All points that belong to the same n-cube obtain the same Hilbert number, which is equal to the sequence number of the cube (see Figure 21). Hilbert curves are widely deployed in more complex systems. As an example, Shen et al. [6] measure delays to a few landmark nodes to derive addresses in their Chord-like system. In another example, Shen et al. [58] equipped the Cycloid overlay [59] with proximity-aware addressing referred to as Locality-Aware Randomized Load Balancing (LAR). In Cycloid, every node has an identifier which consists of two separate indices: a cyclic index in the range 0, . . . , d − 1 and a cubic index in the range 0, . . . , 2^d − 1. The authors use Hilbert curves to derive the cubic identifiers, which thus reflect the physical positions of the nodes. Notice that all the nodes that share the same cubic index reside along a small ring. The node with the highest capacity obtains the largest cyclic index and becomes a super-node managing load transfers among all the other peers attached to the same ring. When the super-node is unable to balance the load locally, i.e., among the nodes sharing the same cubic index, it contacts external super-nodes with different cubic addresses to spread the load globally among a larger number of peers.

Fig. 21. Hilbert curves for addressing. A filling curve passes through 16 small squares in a specific order. An address is equal to the sequence number of the n-cube. Two nearby points in the space also share nearby addresses. Hilbert curves may be generalized to any number of dimensions.

Dabek et al. [60] designed Vivaldi, a simple, light-weight system that derives synthetic network coordinates in a multidimensional space. A node locally updates its coordinates by lazily contacting some other peers in the system. The algorithm measures the RTT between any two communicating nodes and installs a virtual spring between them, whose rest length is equal to the RTT. The nodes then slightly change their coordinates so as to minimize the potential energy of all the springs attached so far. The objective is to derive artificial, lazily adjustable coordinates that maximize the precision of the position estimation. As an example, Canary [61] incorporates physical link latency into an overlay by employing these network coordinates [60]. It provides nodal identifiers for a CAN-like addressing space, and the zone of each node is constantly adjusted according to the movement of the nodal coordinates.
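A single Vivaldi adaptation step moves the local coordinates along the spring connecting the two sampled nodes; the sketch below is a minimal illustration of that step (the step size delta = 0.25 is an assumption, and the real protocol additionally weights it by confidence estimates):

def vivaldi_update(xi, xj, rtt, delta=0.25):
    # Move our coordinates xi along the spring to peer xj; the spring's
    # rest length is the measured RTT, so the move shrinks the prediction
    # error |rtt - dist|.
    dist = sum((a - b) ** 2 for a, b in zip(xi, xj)) ** 0.5
    if dist == 0.0:
        return list(xi)                     # coincident points: no direction
    error = rtt - dist                      # spring stretched (>0) or compressed
    unit = [(a - b) / dist for a, b in zip(xi, xj)]
    return [a + delta * error * u for a, u in zip(xi, unit)]

# A 100 ms sample against a peer estimated only 5 units away pushes the
# local node outwards, increasing the predicted distance:
print(vivaldi_update([0.0, 0.0], [3.0, 4.0], rtt=100.0))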

Garces-Erice et al. [7] provide the Topology-Centric Look-Up Service (TOPLUS). TOPLUS is an "extremist's design" for lookup services, because it directly uses IP addresses and does not maintain any upper layer identifiers. Each object is mapped onto the IP addressing space by applying a hashing function. The routing function relies on the eXclusive OR (XOR) metric, selecting the node closest to the destination. TOPLUS uses the Border Gateway Protocol (BGP) routing tables, enriched with information about LANs and Internet Service Providers (ISPs), to appropriately cluster the network into groups of nearby hosts recognized by IP addresses and masks, x.y.z.t/n. To keep the routing information limited, the network is divided into tiers. A tier may contain groups and smaller tiers in its interior. The authors provide a hierarchical partial-order tree in which smaller tiers are aggregated into larger structures. Routing in the tree resembles Pastry [10]; however, it is based on the IP addresses and the XOR metric. The main goal of TOPLUS is to provide lookup paths with a short stretch in comparison to the shortest path in the underlying network. The main drawbacks include a non-uniform distribution of nodes over the addressing space, the lack of virtual servers, and correlated failures of close neighbors.

B. Proximity Neighbor Selection

Proximity neighbor selection is used in overlays which allow multiple eligible nodes, e.g., Pastry [10]. The algorithm selects the closest node, in terms of hop count, delay, geographical distance, etc., to fill in a particular routing entry [55].


C. Proximity Routing

When the overlay construction process does not take into account the underlying topology, a forwarding node may advance a packet towards the physically closest eligible node to the destination, in order to minimize traffic in the underlying network [55].

Fig. 22. Requesting paths from na to nd, with and without proximity routing.

We illustrate an example of proximity routing in Figure 22. We present a few nodes participating in the peer-to-peer system; in our scenario, the separation between two nodes reflects their physical distance in the underlying network. We use dashed lines to represent overlay links, and black arrows to indicate lookup queries. In this example, node na sends two lookup requests destined to the same node nd; the first message uses proximity routing, while the other one greedily approaches the destination. Node na realizes that each request can be delivered further to neighbor nb or ne. They are both eligible, being closer to the destination in the identifier space; however, proximity routing, aware that node nb lies physically closer, forwards the message to nb. Contrary to this, greedy routing blindly selects node ne, which has the identifier closest to the destination. The neighbor selection is repeated at every intermediate peer, until the packet arrives at the destination. Finally, both paths have the same length in the overlay; the path established by proximity routing, however, is much shorter at the underlying layer, optimizing the consumption of network resources.
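The decision at each hop is a one-line selection once the overlay has produced the set of eligible next hops; a minimal sketch (our illustration, with assumed RTT probe values):

def proximity_next_hop(eligible, underlay_distance):
    # Among the overlay-eligible next hops (all closer to the destination
    # in the identifier space), pick the one with the smallest measured
    # underlay distance instead of the numerically closest identifier.
    return min(eligible, key=underlay_distance)

# With assumed RTT probes, na prefers nb over ne even though ne's
# identifier is closer to the destination:
rtts = {"nb": 12.0, "ne": 80.0}
print(proximity_next_hop(["nb", "ne"], rtts.get))   # -> nb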

The most convenient way of providing a notion of distance among nodes is direct probing, e.g., RTT, hop count, etc.; however, the cost of periodic measurements may be too high, and their precision is limited, e.g., the RTT varies in time. Other choices rely on network coordinates [60] or Hilbert curves. Alternatively, an ISP may provide an oracle that delivers information related to the proximity of peers [62]. The ISP has exact information concerning its internal structure, e.g., the quality of all internal links. The quality of external links may be approximated by means of BGP routing tables.

Zhu et al. [63] use the concept of virtual servers to balance the load among the physical nodes in the system. Their scheme is divided into 4 phases. At the beginning, the system aggregates the load information and classifies the nodes into light and heavy peers. Then the nodes select their virtual servers for relocation, and the load transfer between heavy and light nodes begins. To optimize the amount of traffic in the underlying network, the authors greedily exchange content among the physically closest nodes. The addresses of the nodes are derived by means of Hilbert curves and RTT estimation to a few landmark nodes in the network.

Efthymiopoulos et al. [64] specify proximity routing in a CAN-like system by using two addressing spaces. At the beginning, each node inspects its physical position in the network to restrict its random location choice to an appropriate zone (portion) of a so-called Locality CAN (L-CAN) [11], so that physically close nodes get nearby coordinates in terms of the L-CAN metric. Consequently, every node chooses one L-CAN position within a given zone of the L-CAN, and another Virtual CAN (V-CAN) coordinate uniformly selected from the entire V-CAN space. The system maintains two copies of the CAN structure in which the nodes keep disjoint areas of the L-CAN and the V-CAN, e.g., the L-CAN zone S^l_i of node i does not overlap with the L-CAN zone maintained by another node j: ∀i≠j : S^l_i ∩ S^l_j = ∅. Object load balancing requires distributing the objects over the V-CAN space with its uniform node distribution, because the L-CAN may not provide this property, as there may exist regions with both large and small node densities. Notice, however, that the V-CAN zone S^v_i handled by node i overlaps with the L-CAN zones of a few other peers. The authors use this property to install redirection pointers destined to node i on all the other nodes j that maintain portions of the L-CAN overlapping with S^v_i, i.e., on all j with S^l_j ∩ S^v_i ≠ ∅. To limit the network load by preventing back and forth traffic, a message for key k is forwarded among physically close neighbors in the L-CAN until it reaches the peer j with k ∈ S^l_j. By using the locally stored redirection pointers, node j searches for the destination node i having k ∈ S^v_i. When i is found, j delivers the message to the destination in one overlay hop. As an advantage, the system keeps the network traffic low and a reasonable object load on the nodes. This comes at the price of two different CAN-like structures maintained at the same time.

D. Inter-Layer Interactions

Seetharaman [18] studied the inter-layer interactions (i.e., "games") between routing in the overlay and Traffic Engineering (TE) in the underlay. Obviously, these are in general oblivious of each other. The layers may be focused on conflicting objectives, so they may cause instabilities and reduce the network performance. Let us consider an example of such a situation: an overlay operating in an almost ideal fit with the underlying network suddenly overloads a physical connection and causes a routing reconfiguration, i.e., a route flap. From this moment, the overlay has to adapt to the new conditions by enforcing traffic redirections. This entails a substantial performance loss, followed by a tedious process leading back to optimal operation. It is therefore necessary to prevent TE route reconfigurations, because subsequent route flaps degrade the service quality of the overlay. From the overlay perspective, the authors provide a load-constrained linear program. The program approximates the TE traffic matrix, i.e., the maximal load on every overlay path that does not trigger the TE actions, and determines the most efficient set of overlay paths for data transmission, while ensuring that the load of every path remains within the acceptable limit.

VII. SUMMARY

Starting from the early research papers on peer-to-peer systems, load balancing has been considered a primary goal, along with others like security and fault tolerance. This is not surprising, since peer-to-peer systems are inspired by similar techniques previously derived for the world wide web, e.g., consistent hashing [20] or high-performance distributed storage [65].

In this survey, we have presented a detailed view of the most important load balancing issues. We have identified the main causes skewing the load balance in peer-to-peer systems, grouping the different problems into 3 consistent classes: object placement, routing, and underlay related. We now shortly summarize the previously described ideas.

We have included namespace balancing, virtual servers, multiple hashing functions, caching, and replication in the object placement class. An assignment of identifiers to nodes or objects in such a way that each node handles a roughly equal number of objects is referred to as namespace balancing. The basic idea of consistent hashing is relatively simple, since each node or object gets its ID by means of a predefined hashing function, e.g., SHA-1. In more complex techniques, a node has to carefully select its identifier by probing intervals of the identifier space. Virtual servers relate to nodes maintaining several positions in the identifier space. Each node can activate or deactivate its identifiers and, moreover, move a virtual server along with its content, i.e., routing tables and stored objects, to a different physical machine. There exist many protocols constantly adapting to new dynamic conditions (i.e., arrivals or departures of nodes; insert, update, and remove operations on objects) by transferring virtual servers among the physical nodes. Employing several hashing functions to select the locations of an object and its redirection pointers is simple, since the lookup requests usually go towards a random destination. A more complicated task relies on selecting the peer that handles the object by using information on the load of the potential candidates. Unfortunately, this idea does not re-balance the load on object removal, because the algorithm only inspects the load on insert. Caching and replication do not modify the structure of existing overlays and are therefore widely applicable. Instead, they populate extra copies of the same object among several nodes in the system. There are various configuration issues, including the selection of objects for replication and of the replica peers handling the copies. Usually, nodes periodically repeat replication requests, because some of the replicas may have been discarded by cache replacement policies or considered stale or invalid. The low complexity of caching and replication makes them a good choice for complementary load balancing, at the price of extra storage resources.

Considering traffic or routing load balancing, there are two complementary techniques. We may either continuously optimize and replace the neighbors of each node while keeping a single routing strategy (i.e., link reorganization), or keep the same neighbors but continuously vary the routing function (e.g., a random routing strategy). In the case of link reorganization, the routing algorithm may use one eligible neighbor or link to reach a given destination. Thanks to link reorganization, a node may replace an exhausted neighbor with a light node, balancing the routing load. A routing strategy using different eligible links to forward a request to a given destination is referred to as path redundancy. Both solutions balance the routing load; however, they use different sources of information. Link reorganization requires reports on the load of neighboring peers, which implies the overhead of additional packets. Path redundancy methods locally keep track of the load on each outgoing link, hence the local load estimation may be highly inaccurate; they also incur higher maintenance costs, keeping multiple routing entries. Therefore, link reorganization offers high precision at the price of information overhead.

The overlay may overload the physical network, because the layers are, in general, oblivious of each other. The best solutions reduce the load at the underlying level by perfectly mapping the overlay topology onto the physical network, keeping both structures as close as possible. This may be achieved through careful neighbor selection, i.e., physically close neighbors in terms of RTT, hop count, geographical position, etc., are preferred to fill in the routing entries. Other researchers incorporate network coordinates containing residual information on a node's physical location as identifiers in the overlay. In some situations, it is too difficult or too expensive to derive a proper mapping. In these cases, proximity routing chooses the physically closest eligible neighbor to the destination. In all cases, the information on the underlying topology is derived through network measurements or provided by means of oracles established by ISPs.

VIII. CONCLUSION

Load balancing is a challenge that arises whenever a system comprises multiple components contributing to a common goal. The present survey focuses on load balancing in DHT-based peer-to-peer systems with respect to object placement, routing, and the underlay network structure. Load balancing mechanisms in such systems, be they DHT-based or not, always aim to optimize the system's operational performance, taking into account the limited capacities of its components in terms of available storage space, processing time, and bandwidth, as well as operational factors such as object popularity, lookup and request load, or maintenance activities. Tackling one or several of the balancing objectives at best might jeopardize the efficiency of another one. As an example, the use of virtual servers may well balance the storage load proportionally to the nodes' resources, but may require high bandwidth to allow for the efficient relocation of virtual servers. Furthermore, load balancing mechanisms may also influence other important system properties (not discussed in this survey), such as fault tolerance, dynamicity, or security.

A very careful selection of the mechanisms chosen for load balancing is necessary. Among the many issues to be considered are: additional maintenance costs, performance penalties, negative influence on other system properties, the need for continuous reorganization of the system, the consequences of frequent joins and leaves, jeopardizing the inherently distributed character of the system by introducing centralized components, the necessity of frequent and voluminous load information, or excessive load monitoring.

In conclusion, there exist many efficient load balancing mechanisms that cover various aspects of balancing the load. For selecting a particular mechanism it is, however, most important to carefully define the objectives desired to be achieved, e.g., focusing on balancing the request load, while taking into account the consequences with respect to the many other load and system issues. From that perspective, a major challenge remains the introduction of mechanisms that simultaneously balance the various aspects of load in such systems.

REFERENCES

[1] B. Godfrey, K. Lakshminarayanan, S. Surana, R. Karp, and I. Stoica,“Load balancing in dynamic structured P2P systems,” in Proc. IEEEInternational Conference on Computer Communications (INFOCOM),Hong Kong, 2004.

[2] S. Serbu, S. Bianchi, P. Kropf, and P. Felber, “Dynamic load sharingin peer-to-peer systems: When some peers are more equal than others,”IEEE Internet Comput., vol. 11, no. 4, pp. 53–61, 2007.

[3] S. Prabhavat, H. Nishiyama, N. Ansari, and N. Kato, “On load distri-bution over multipath networks,” IEEE Commun. Surveys Tuts., vol. 14,no. 3, pp. 662–680, quarter 2012.

[4] R. I. Davis and A. Burns, “A survey of hard real-time scheduling formultiprocessor systems,” ACM Comput. Surv., vol. 43, no. 4, pp. 35:1–35:44, Oct. 2011.

[5] K. Kenthapadi and G. S. Manku, “Decentralized algorithms usingboth local and random probes for p2p load balancing,” in Proc. 17thAnnual ACM Symposium on Parallelism in Algorithms and Architectures(SPAA). New York, NY, USA: ACM, 2005, pp. 135–144.

[6] H. Shen and Y. Zhu, “Plover: A proactive low-overhead file replicationscheme for structured p2p systems,” in IEEE International Conferenceon Communications (ICC), 2008, pp. 5619–5623.

[7] L. Garces-Erice, K. W. Ross, E. W. Biersack, P. A. Felber, and G. Urvoy-Keller, “Topology-centric look-up service,” in COST264/ACM 5th In-ternational Workshop on Networked Group Communications (NGC).Springer, 2003, pp. 58–69.

[8] S. Serbu, P. Kropf, and P. Felber, “Improving the dependability of prefix-based routing in dhts,” in Proc. International Conference on the moveto meaningful internet systems (OTM), ser. Lecture Notes in ComputerScience, no. 4803. Springer Berlin / Heidelberg, 2007, pp. 206–225.

[9] H. Yamamoto, D. Maruta, and Y. Oie, “Replication methods for loadbalancing on distributed storages in P2P networks,” IEICE Trans., vol.89-D, no. 1, pp. 171–180, 2006.

[10] A. Rowstron and P. Druschel, “Pastry: Scalable, distributed objectlocation and routing for large-scale peer-to-peer systems,” in IFIP/ACMInternational Conference on Distributed Systems Platforms (Middle-ware), ser. Lecture Notes in Computer Science, R. Guerraoui, Ed., no.2218. Springer Heidelberg, Germany, 2001, pp. 329–350.

[11] S. Ratnasamy, M. Handley, R. Karp, and S. Shenker, “Topologically-aware overlay construction and server selection,” in Proc. IEEE Inter-national Conference on Computer Communications (INFOCOM), June2002.

[12] I. Stoica, R. Morris, D. Karger, M. F. Kaashoek, and H. Balakrishnan,“Chord: A scalable peer-to-peer lookup service for internet applica-tions,” in Proc. ACM Special Interest Group on Data Communications(SIGCOMM), 2001, pp. 149–160.

[13] M. Steiner, T. En-Najjary, and E. W. Biersack, “A global view of KAD,”in Proc. 7th ACM SIGCOMM conference on Internet measurement, ser.IMC ’07. New York, NY, USA: ACM, 2007, pp. 117–122.

[14] P. Maymounkov and D. Mazieres, “Kademlia: A peer-to-peer infor-mation system based on the xor metric,” in Proc. 1st InternationalWorkshop on Peer-to-Peer Systems, 2002, pp. 53–65.

[15] M. Steiner, W. Effelsberg, and T. En-najjary, “Load reduction in the kadpeer-to-peer system,” in In Fifth International Workshop on Databases,Information Systems and Peer-to-Peer Computing (DBISP2P, 2007.

[16] D. Carra, M. Steiner, and P. Michiardi, “Adaptive load balancing inKAD,” in In Proc. Peer-to-Peer Computing 2011, 2011, pp. 92–101.

[17] S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, “Ascalable content addressable network,” in Proc. ACM Special InterestGroup on Data Communications (SIGCOMM), 2001, pp. 161–172.

[18] S. Seetharaman, V. Hilt, M. Hofmann, and M. Ammar, “Preemptivestrategies to improve routing performance of native and overlay layers,”in Proc. IEEE International Conference on Computer Communications(INFOCOM), 2007.

[19] P. B. Godfrey and I. Stoica, “Heterogeneity and load balance indistributed hash tables,” in Proc. IEEE International Conference onComputer Communications (INFOCOM), 2005.

[20] D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, and R. Pan-igrahy, “Consistent hashing and random trees: Distributed cachingprotocols for relieving hot spots on the World Wide Web,” in Proc.ACM Symposium on Theory of Computing (STOC), 1997, pp. 654–663.

[21] V. King and J. Saia, “Choosing a random peer,” in Proc. the twenty-third annual ACM symposium on Principles of distributed computing,ser. PODC ’04. New York, NY, USA: ACM, 2004, pp. 125–130.

[22] B. Y. Zhao, L. Huang, J. Stribling, S. C. Rhea, A. D. Joseph, andJ. D. Kubiatowicz, “Tapestry: A resilient global-scale overlay for servicedeployment,” IEEE J. Sel. Areas Commun., vol. 22, no. 1, pp. 41–53,2004.

[23] D. Malkhi, M. Naor, and D. Ratajczak, “Viceroy: A scalable anddynamic emulation of the butterfly,” in Proc. 21st ACM Symposium onPrinciples of Distributed Computing (PODC), 2002, pp. 183–192.

[24] M. F. Kaashoek and D. R. Karger, “Koorde: A simple degree-optimaldistributed hash table,” in Proc. 2nd International Workshop on Peer-to-Peer Systems, 2003, pp. 323–336.

[25] M. Bienkowski, M. Korzeniowski, and F. M. auf der Heide, “Dynamicload balancing in distributed hash tables,” in Proc. 4th InternationalWorkshop, IPTPS 2005. Springer-Verlag Berlin / Heidelberg, 2005,pp. 217–225.

[26] A. Rao, K. Lakshminarayanan, S. Surana, R. Karp, and I. Stoica,“Load balancing in structured p2p systems,” in Proc. 2nd InternationalWorkshop IPTPS 2003. Springer Berlin / Heidelberg, 2003, pp. 68–79.

[27] D. R. Karger and M. Ruhl, “Simple efficient load balancing algorithmsfor peer-to-peer systems,” in Proc. 16th Annual ACM Symposium onParallelism in Algorithms and Architectures (SPAA). New York, NY,USA: ACM, 2004, pp. 36–43.

[28] J. Ledlie and M. Seltzer, “Distributed, secure load balancing with skew,heterogeneity and churn,” in Proc. IEEE International Conference onComputer Communications (INFOCOM), vol. 2. IEEE, 2005, pp. 1419–1430.

[29] F. Dabek, M. F. Kaashoek, D. Karger, R. Morris, and I. Stoica, “Wide-area cooperative storage with CFS,” in Proc. ACM Symposium onOperating Systems Principles (SOSP), 2001, pp. 202–215.

[30] D. Wu, Y. Tian, and K.-W. Ng, “Resilient and efficient load balancing indistributed hash tables,” J. Network and Computer Applications, vol. 32,no. 1, pp. 45–60, Jan. 2009.

[31] J. Byers, J. Considine, and M. Mitzenmacher, “Simple load balancingfor distributed hash tables,” in 2nd International Workshop on Peer-to-Peer Systems (IPTPS ’03), 2003, pp. 80–87.

[32] M. Mitzenmacher, A. W. Richa, and R. Sitaraman, "The power of two random choices: A survey of techniques and results," in Handbook of Randomized Computing. Kluwer, 2000, pp. 255–312.

[33] T.-T. Wu and K. Wang, "An efficient load balancing scheme for resilient search in KAD peer-to-peer networks," in Proc. IEEE 9th Malaysia International Conference on Communications, 2009, pp. 759–764.

[34] E. Cohen and S. Shenker, "Replication strategies in unstructured peer-to-peer networks," in Proc. ACM Special Interest Group on Data Communications (SIGCOMM), Aug. 2002.

[35] Q. Lv, P. Cao, E. Cohen, K. Li, and S. Shenker, "Search and replication in unstructured peer-to-peer networks," in Proc. 16th International Conference on Supercomputing (ICS), 2002, pp. 84–95.

[36] Q. Wang, K. Daudjee, and M. T. Ozsu, "Popularity-aware prefetch in P2P range caching," in Proc. 8th International Conference on Peer-to-Peer Computing (P2P), 2008, pp. 53–62.

[37] S. Bianchi, S. Serbu, P. Felber, and P. Kropf, "Adaptive load balancing for DHT lookups," in Proc. 15th International Conference on Computer Communications and Networks (ICCCN), 2006, pp. 411–418.

[38] A. Rowstron and P. Druschel, "Storage management and caching in PAST, a large-scale, persistent peer-to-peer storage utility," in Proc. 18th ACM Symposium on Operating Systems Principles (SOSP), 2001, pp. 188–201.

[39] M. Roussopoulos and M. Baker, "CUP: Controlled update propagation in peer-to-peer networks," in Proc. USENIX Annual Technical Conference, 2003.

[40] C. Wang, L. Xiao, Y. Liu, and P. Zheng, "DiCAS: An efficient distributed caching mechanism for P2P systems," IEEE Trans. Parallel Distrib. Syst. (TPDS), vol. 17, no. 10, pp. 1097–1109, 2006.

[41] H. Shen, "An efficient and adaptive decentralized file replication algorithm in P2P file sharing systems (EAD)," in Proc. 8th International Conference on Peer-to-Peer Computing (P2P), 2008.

[42] V. Gopalakrishnan, B. Silaghi, B. Bhattacharjee, and P. Keleher, "Adaptive replication in peer-to-peer systems," in Proc. 24th International Conference on Distributed Computing Systems (ICDCS), 2004, pp. 360–369.

[43] L. Yin and G. Cao, "DUP: Dynamic-tree based update propagation in peer-to-peer networks," in Proc. 21st International Conference on Data Engineering (ICDE), 2005, pp. 258–259.

[44] V. Ramasubramanian and E. G. Sirer, "Beehive: O(1) lookup performance for power-law query distributions in peer-to-peer overlays," in Proc. 1st Symposium on Networked Systems Design and Implementation (NSDI), 2004.

[45] Y. Xia, A. Dobra, and S. C. Han, "Multiple-choice random network for server load balancing," in Proc. IEEE International Conference on Computer Communications (INFOCOM), 2007, pp. 1982–1990.

[46] K.-L. Huang, T.-Y. Huang, and J. C. Y. Chou, "LessLog: A logless file replication algorithm for peer-to-peer distributed systems," in Proc. 18th IEEE International Parallel and Distributed Processing Symposium (IPDPS), vol. 1, 2004, p. 82b.

[47] S. Serbu, P. Kropf, and P. Felber, "Fault-tolerant P2P networks: How dependable is greedy routing?" in Workshop on Dependable Application Support in Self-Organising Networks (DASSON), 2007.

[48] S. Serbu, S. Bianchi, P. Kropf, and P. Felber, "Dynamic load sharing in peer-to-peer systems: When some peers are more equal than others," in Proc. 2006 Montreal Conference on eTechnologies (MCETECH), 2006, pp. 149–156.

[49] T. Locher, S. Schmid, and R. Wattenhofer, "eQuus: A provably robust and locality-aware peer-to-peer system," in Proc. 6th IEEE International Conference on Peer-to-Peer Computing (P2P), 2006, pp. 3–11.

[50] M. Schlosser, M. Sintek, S. Decker, and W. Nejdl, "HyperCuP: Hypercubes, ontologies and efficient search on P2P networks," in Proc. 1st Workshop on Agents and P2P Computing (AP2PC), vol. 2530, 2002, pp. 133–134.

[51] J. I. Alvarez-Hamelin, A. C. Viana, and M. D. Amorim, "DHT-based functionalities using hypercubes," in Proc. World Computer Congress (IFIP WCC), vol. 212, 2006, pp. 157–176.

[52] S. Serbu, P. Felber, and P. Kropf, "HyPeer: Structured overlay with flexible-choice routing," Computer Networks, vol. 55, no. 1, pp. 300–313, 2011.

[53] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, 2nd ed. The MIT Press, 2001.

[54] S. C. Han and Y. Xia, "Network load-aware content distribution in overlay networks," Computer Communications, vol. 32, no. 1, pp. 51–61, 2009.

[55] M. Castro, P. Druschel, Y. C. Hu, and A. Rowstron, "Topology-aware routing in structured peer-to-peer overlay networks," in International Workshop on Future Directions in Distributed Computing (FuDiCo), June 2002.

[56] Z. Xu, M. Mahalingam, and M. Karlsson, "Turning heterogeneity into an advantage in overlay routing," in Proc. IEEE International Conference on Computer Communications (INFOCOM), vol. 2, 2003, pp. 1499–1509.

[57] T. Asano, D. Ranjan, T. Roos, E. Welzl, and P. Widmayer, "Space-filling curves and their use in the design of geometric data structures," Theoretical Computer Science, vol. 181, pp. 3–15, July 1997.

[58] H. Shen and C.-Z. Xu, "Locality-aware randomized load balancing algorithms for DHT networks," in Proc. 2005 International Conference on Parallel Processing (ICPP), 2005, pp. 529–536.

[59] H. Shen, C.-Z. Xu, and G. Chen, "Cycloid: A constant-degree and lookup-efficient P2P overlay network," Performance Evaluation, vol. 63, no. 3, pp. 195–216, 2006.

[60] F. Dabek, R. Cox, F. Kaashoek, and R. Morris, "Vivaldi: A decentralized network coordinate system," in Proc. ACM Special Interest Group on Data Communications (SIGCOMM), 2004, pp. 15–26.

[61] T. Kojima, M. Asahara, K. Kono, and A. Hayakawa, "Embedding network coordinates into the heart of distributed hash tables," in Proc. IEEE 9th International Conference on Peer-to-Peer Computing (P2P), 2009, pp. 155–158.

[62] V. Aggarwal, O. Akonjang, and A. Feldmann, "Improving user and ISP experience through ISP-aided P2P locality," in Proc. 11th IEEE Global Internet Symposium (GI), 2008.

[63] Y. Zhu and Y. Hu, "Efficient, proximity-aware load balancing for structured P2P systems," in Proc. 3rd International Conference on Peer-to-Peer Computing (P2P), 2003, p. 220.

[64] N. Efthymiopoulos, A. Christakidis, S. Denazis, and O. Koufopavlou, "Enabling locality in a balanced peer-to-peer overlay," in Proc. IEEE Global Telecommunications Conference (GLOBECOM), 2006, pp. 1–5.

[65] C. G. Plaxton, R. Rajaraman, and A. W. Richa, "Accessing nearby copies of replicated objects in a distributed environment," in Proc. 9th Annual ACM Symposium on Parallel Algorithms and Architectures (SPAA), 1997, pp. 311–320.

Sabina Serbu received her M.Sc. in Computer Science from the University Politehnica of Bucharest, Romania, and her Ph.D. in Computer Science from the University of Neuchâtel, Switzerland, in 2010. Her main research interests include load balancing, fault tolerance, routing strategies and gossip-based protocols in peer-to-peer systems.

Eryk Schiller received two M.Sc. degrees, one from the University of Science and Technology and the other from the Jagiellonian University, Cracow, Poland, in 2006 and 2007, respectively. He obtained a Ph.D. in Computer Science from the University of Grenoble, France, in 2010. Since 2010, he has held a post-doctoral research position at the University of Neuchâtel, Switzerland, working in the field of computer networks.

Pascal Felber received his M.Sc. and Ph.D. in Computer Science from the Swiss Federal Institute of Technology. From 1998 to 2002, he worked at Oracle Corporation and Bell Labs (Lucent Technologies) in the USA. From 2002 to 2004, he was an Assistant Professor at Institute EURECOM in France. Since October 2004, he has been a Professor of Computer Science at the University of Neuchâtel, Switzerland, working in the field of dependable and distributed systems. He has published over 100 research papers in various journals and conferences.

Peter Kropf received his M.Sc. in Mathematics and his Ph.D. in Computer Science from the University of Bern, Switzerland. From 1994 to 1999, he was an assistant and associate professor at Laval University, Quebec, Canada. Subsequently, he was appointed as an associate professor at the Department of Computer Science and Operations Research (DIRO) at the University of Montreal, Canada. Since October 2003, he has been a Professor at the University of Neuchâtel, Switzerland. Peter Kropf is currently Dean of the Faculty of Science. He has published over 90 research papers in the field of parallel and distributed systems, simulation and optimization.

