Peer-to-Peer Networks for Content Distribution

Page 1: Peer-to-Peer Networks for Content Distribution

February 1, 2005

Peer-to-Peer Networks for Content Distribution

Ernst Biersack, [email protected]

http://www.eurecom.fr/~erbi/

© Institut Eurécom 2005

2 © Institut Eurécom 2005

Overview

- Motivation
- Overview of distribution models
- Simple example for file replication
- Parallel download
- BitTorrent: a real P2P file download application
- How to organize peers for file distribution?
  - Theoretical analysis of different static organizations
  - Simulation of mesh-based organizations
- Conclusion

3 © Institut Eurécom 2005

Motivation: Context and Problem

- A growing number of well-connected users access increasing amounts of content
- But interest in content is often "Zipf" distributed (a small fraction of the content is very popular)
- Servers and links are overloaded by the number of clients, the size of content, and "flash crowds" (e.g., 9/11)
- Tremendous engineering (and cost!) is necessary to make server farms scalable and robust

[Figure: traditional client/server content distribution; a single source in the Internet suffers congestion, leading to degraded or lost service]

Problem: scalable distribution of content

4 © Institut Eurécom 2005

Zipf’s law

Zipf's law: the frequency P of an event as a function of its rank i is a power-law function:

P_i = Ω / i^α, where α ≤ 1

Observed to be true for:
- Frequency of written words in English texts
- Population of cities
- Frequency of access of Web pages
- Size of Web objects
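As a quick numerical illustration of the law above, the following sketch (with an illustrative catalogue of 1000 items and α = 1; both are assumptions, not slide values) normalizes the power-law weights into access probabilities and shows how a small fraction of items attracts a large share of the requests:

```python
# Zipf popularity: P_i is proportional to 1 / i**alpha (here alpha = 1).
def zipf_popularity(n_items, alpha=1.0):
    weights = [1.0 / i**alpha for i in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # normalized access probabilities

p = zipf_popularity(1000)
top10_share = sum(p[:10])  # the 10 most popular items draw ~39% of requests
```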

Page 2: Peer-to-Peer Networks for Content Distribution

5 © Institut Eurécom 2005

Motivation: Real-World Scenarios

- Quick distribution of critical content (e.g., antivirus definitions)
- Efficient distribution of large content (e.g., nightly update of a bank's branches, promotional movie from a manufacturer to all car dealers)
- Distribution of streaming content (e.g., live event, Internet TV)
- Classical approaches have high cost:
  - Source over-provisioning (for peak demand)
  - Content Delivery Networks (CDNs)
- Novel approach: cooperative networks

6 © Institut Eurécom 2005

Illustration: File Replication

Assume you have a set of N hosts

We initially have one or a few (n) copies of the file to distribute (n << N)

Q: how do we efficiently replicate the file on all N hosts?

7 © Institut Eurécom 2005

File Replication: Simple Scenario

- 3 asymmetric hosts:
  - upload rate: u = 0.5 file_unit/time_unit
  - download rate: d = 1 file_unit/time_unit
- Initially one copy
- File of size 1 file_unit

8 © Institut Eurécom 2005

File Replication: Simple Scenario

Tree:
- Upload in parallel to each host at rate u/2 = 0.25
- Download time = 1/(u/2) = 4 time_units

(Smart) full replication:
- Upload to host 2 at full rate u = 0.5; host 2 is done at 1/u = 2 time_units
- Replication on host 3, simultaneously:
  - Host 1 gives one half of the file at u = 0.5, which takes 1 time_unit
  - Host 2 gives the other half at u = 0.5, which takes 1 time_unit
- Completion after 3 time_units
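The two completion times can be checked with a few lines of arithmetic (a sketch using the scenario's parameters, u = 0.5 and a file of 1 file_unit):

```python
u = 0.5            # upload rate of each host (file_units per time_unit)
file_size = 1.0    # file_units

# Tree: the source uploads to hosts 2 and 3 in parallel at u/2 each.
t_tree = file_size / (u / 2)                    # 4 time_units

# Smart full replication: full-rate upload to host 2 (2 time_units),
# then hosts 1 and 2 each push half of the file to host 3 (1 time_unit).
t_smart = file_size / u + (file_size / 2) / u   # 3 time_units
```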


Page 3: Peer-to-Peer Networks for Content Distribution

9 © Institut Eurécom 2005

File Replication: Simple Scenario, Exercise

Assume that the file is broken into 4 equal-size pieces p1-p4:
- Host 1 first transmits p1 to host 2
- Then, simultaneously: host 1 transmits p2 to host 2 and host 2 transmits p1 to host 3
- Then, simultaneously: host 1 transmits p3 to host 2 and host 2 transmits p2 to host 3
- Then, simultaneously: host 1 transmits p4 to host 2 and host 2 transmits p3 to host 3
- Finally, host 2 transmits p4 to host 3

Questions:
- How long does it take to distribute the file to both host 2 and host 3?
- What kind of organization does this type of parallel, shifted transfer correspond to?

10 © Institut Eurécom 2005

File Replication: Simple Scenario

Partial replication, 4 actions in parallel:
- Upload f/2 from host 1 to host 2 at u/2
- Upload f/2 from host 1 to host 3 at u/2
- Host 2 uploads its f/2 to host 3
- Host 3 uploads its f/2 to host 2

Total time: (f/2) / (u/2) = 1/0.5 = 2 time_units


I am cheating in this example a bit: WHY?

11 © Institut Eurécom 2005

Observations

Designing an optimal policy is difficult in practice because of:
- Heterogeneity of hosts in terms of their upload/download capacity (e.g., host 1 has a 10 Mb/s access link whereas host 2 has modem access)
- Hosts come and leave at any point in time

12 © Institut Eurécom 2005

Observations

Hosts are not homogeneous:
- Different access links
  - Campus and corporate networks: firewalls might be a problem
  - xDSL: asymmetric upload and download capacity
- Different locations in the world (US, Europe, Asia)
  - Different RTTs ⇒ different TCP throughputs
  - Different availability of paths, hosts, ...

T_TCP ≈ cst · MSS / (RTT · √loss)   (Padhye et al., Sigcomm 98)


FastReplica
L. Cherkasova, J. Lee

Proceedings of the 4th USENIX Symposium on Internet Technologies and Systems, 2003

14 © Institut Eurécom 2005

FastReplica in the Small: Problem Statement

- Let N0 be a node which has an original file F of size Size(F)
- Let R = {N1, ..., Nn} be a replication set of nodes
- The problem consists in replicating file F across the nodes N1, ..., Nn while minimizing the overall replication time
- We assume n ∈ [10, 30] nodes
- File F is divided into n equal chunks F1, ..., Fn, where Size(Fi) = Size(F) / n
- FastReplica consists of two steps: a distribution step and a collection step

15 © Institut Eurécom 2005

N3

File F

F1 F2 F3 F n-1 F n

F1

F n-1

F n

F3F2

N0

N1

N2 N n-1

N n

FastReplica in the Small: Distribution Step

N0 sends to Ni: • chunk Fi .• a list of nodes R = N1, … , Nn\Ni to which chunk Fi must be sent in

the next step;

16 © Institut Eurécom 2005

FastReplica in the Small: Collection Step (View "from a Node")

[Diagram: node N1 relays its chunk F1 to N2, N3, ..., Nn while N0 distributes the remaining chunks]

After receiving Fi , node Ni opens (n-1) connections to its siblings and sends chunk Fi to them.
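The two steps can be sketched as follows (a toy bookkeeping model that only tracks which node holds which chunk; timing and bandwidth are ignored):

```python
def fastreplica(n):
    """Toy model of FastReplica in the small for n recipient nodes:
    returns, per node, the set of chunk indices it ends up holding."""
    held = {i: set() for i in range(1, n + 1)}
    # Distribution step: N0 sends chunk Fi to node Ni.
    for i in range(1, n + 1):
        held[i].add(i)
    # Collection step: every Ni relays its chunk Fi to its n-1 siblings.
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if j != i:
                held[j].add(i)
    return held

nodes = fastreplica(10)  # every node now holds all 10 chunks of F
```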

Page 5: Peer-to-Peer Networks for Content Distribution

17 © Institut Eurécom 2005

FastReplica in the Small: Collection Step (View "to a Node")

[Diagram: each node receives the remaining chunks F1, ..., Fn from its siblings in parallel]

Thus each node Ni has:
- (n - 1) outgoing connections to send chunk Fi
- (n - 1) incoming connections from the remaining nodes in the group to receive the other chunks F1, ..., Fi-1, Fi+1, ..., Fn

18 © Institut Eurécom 2005

What Is the Main Idea of FastReplica?

Instead of the typical replication of the entire file F to n nodes using n Internet paths, FastReplica exploits n × n different Internet paths within the replication group, where each path is used for transferring 1/n-th of file F.

Benefits:
- The impact of congestion along the involved paths is limited, since each path carries only 1/n-th of the file
- FastReplica takes advantage of the upload and download bandwidth of recipient nodes

Limitations:
- The organization of nodes is very static and does not scale to larger numbers of nodes

19 © Institut Eurécom 2005

Preliminary performance analysis of FastReplica in the small

Two performance metrics: average and maximum replication time.

Idealistic setting: all nodes and links are homogeneous, and each node can support n network connections to other nodes at B bytes/sec.

Time_distr = Size(F) / (n × B)
Time_collect = Size(F) / (n × B)

FastReplica: Time_FR = Time_distr + Time_collect = 2 × Size(F) / (n × B)
Multiple unicast: Time_MU = Size(F) / B

Replication_Time_Speedup = Time_MU / Time_FR = n / 2
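A small sanity check of the speedup formula (a sketch; the 700 MB file size, n = 20 nodes, and 1 MB/s rate are illustrative values, not from the slide):

```python
def fastreplica_times(size_bytes, n, b_bytes_per_sec):
    """Idealized replication times from the analysis above."""
    t_distr = size_bytes / (n * b_bytes_per_sec)    # N0 pushes n chunks in parallel
    t_collect = size_bytes / (n * b_bytes_per_sec)  # chunks relayed to n-1 siblings
    t_fr = t_distr + t_collect                      # FastReplica
    t_mu = size_bytes / b_bytes_per_sec             # multiple unicast
    return t_fr, t_mu

t_fr, t_mu = fastreplica_times(700e6, n=20, b_bytes_per_sec=1e6)
speedup = t_mu / t_fr   # = n / 2 = 10
```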

20 © Institut Eurécom 2005


Parallel-Access to Replicated Information in the Internet

Note: While this idea was originally proposed by P. Rodriguez and E. Biersack for content that exists in multiple copies on the Web, it turns out that the same concept is very useful in peer-to-peer systems. We first present the scheme as originally conceived and then come back to it when discussing BitTorrent.

22 © Institut Eurécom 2005

Overview

- Motivation and problem statement
- How it works
- Chunk and peer selection
- Performance study
- Implementation and deployment issues
- Conclusion

23 © Institut Eurécom 2005

Access to frequently requested content:
- Replicate the same content in several places (mirrors) in the network
- Clients are directly satisfied from one of the mirror servers

[Figure: mirror servers spread across the Internet]

24 © Institut Eurécom 2005

How are copies created?

- Web caches are widely deployed in the Internet to store copies of master documents from multiple content providers (in control of the ISP)
- Mirror sites fully replicate the content of a certain content provider (in control of the content provider)
- Content distribution networks (e.g., Akamai) replicate content from different content providers (in control of the content provider; multiple content providers share the same resources)

Page 7: Peer-to-Peer Networks for Content Distribution

25 © Institut Eurécom 2005

Issues

How to re-direct clients to the "best" copy?

26 © Institut Eurécom 2005

Selecting the Best "Copy"

The copy selected should:
- provide the client with the lowest possible access time, and
- achieve overall good load balancing between the different sites

Techniques for copy selection:
- IP administrative domain, Internet topology
- Number of hops, RTTs
- Application-level measurements

Problems:
- High complexity and overhead (periodic polls, state information)
- The selected copy may not always be the best one at this point in time

27 © Institut Eurécom 2005

Parallel Access

Instead, do a parallel access!
- Avoids non-trivial server selection
- Performs load balancing

[Figure: a client downloads a popular document from several mirror servers in parallel]

28 © Institut Eurécom 2005

How to choose the block size?
- The number of blocks should be larger than the number of mirror sites accessed in parallel
- Each block should be small enough to rapidly adapt to changing conditions and to ensure that the last block requested from each server terminates at about the same time
- Each block should be sufficiently large to reduce the influence of idle times and to reduce the number of negotiations (transmission time >> RTT)

[Diagram: without pipelining, the client sends "Get block i+1" only after block i has fully arrived, so each block costs one RTT of idle time; with pipelining, the request for block i+1 is sent while block i is still being transmitted, eliminating the idle time]
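The constraints above can be turned into a simple sizing heuristic (a sketch; the factor of 10 RTTs and the 4 blocks per server are illustrative choices, not values from the original scheme):

```python
def choose_block_count(file_bytes, n_servers, rate_bps, rtt_s,
                       rtt_factor=10, blocks_per_server=4):
    """Pick a number of blocks such that each block's transmission time
    is >> RTT, while still leaving several blocks per mirror server."""
    min_block = rate_bps / 8 * rtt_s * rtt_factor   # bytes; transmission >> RTT
    max_blocks = max(1, int(file_bytes // min_block))
    wanted = blocks_per_server * n_servers          # enough blocks for balancing
    return min(wanted, max_blocks)

n_blocks = choose_block_count(5_000_000, n_servers=4, rate_bps=1e6, rtt_s=0.1)
block_size = 5_000_000 // n_blocks   # each block still dwarfs one RTT
```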

Page 8: Peer-to-Peer Networks for Content Distribution

29 © Institut Eurécom 2005

Assumptions
- We consider popular documents that are identically replicated (bit-by-bit)
- Documents should be large enough (several hundreds of KBytes); for small documents, several documents can be grouped (e.g., all in-lined objects of a Web page) and a parallel access performed on the group
- The last block can be requested from several servers at the same time to avoid waiting for very slow servers
- We assume that the paths from the client to the mirror servers are bottleneck-disjoint
- Clients and servers are able to use range requests as specified in HTTP 1.1

30 © Institut Eurécom 2005

Performance Evaluation
- The current implementation is a Java client and does not perform pipelining
- We considered mirror sites for the Squid software (http://squid.nlanr.net)
- We evaluated the performance of a parallel access every 15 minutes during a period of 10 days, and averaged the performance over the 10-day period
- For every experiment, we calculate the optimum transmission time, which is the transmission time obtained when all servers send useful information until the document is fully received and there are no idle times

Mirror sites: Slovakia, Portugal, Greece, Spain, Austria, UK, Japan, Australia, Israel; client at Eurecom, France

31 © Institut Eurécom 2005

Parallel-access: Performance

- Drastically reduces download times and smooths out bandwidth variations
- The differences between the optimum and the parallel access are due to the idle times ⇒ pipelining

32 © Institut Eurécom 2005

Small Documents

- For small documents the transmission time is not very relevant
- Pipelining is difficult due to the small transmission times
- Even for a 10 KB file, a parallel access has very good performance

Page 9: Peer-to-Peer Networks for Content Distribution

33 © Institut Eurécom 2005

Parallel-Access on a Modem line

A parallel access over a modem line performs as well as the fastest server, with no server selection needed

34 © Institut Eurécom 2005

Simulation of pipelining

- With pipelining, the parallel rate is very close to the optimum
- However, pipelining is not crucial, since a parallel access with no pipelining is already very good

35 © Institut Eurécom 2005

Parallel-Access: Multiple sites vs One site

A parallel access to a single server may result in worse performance than a single access, since the connections compete among themselves

36 © Institut Eurécom 2005

Current Implementation

0-yasamin$ pa -n20 http://www.auth.gr/Squid/FAQ/Squid1.2.ps.gz http://www.uniovi.es/Squid/FAQ/Squid1.2.ps.gz
Source 0 => http://www.auth.gr/Squid/FAQ/FAQ.ps.gz
Source 1 => http://www.uniovi.es/~mirror/squid/FAQ/FAQ.ps.gz
Parts Requested/Received
Requested: 01101 01010 11001 10101
Received:  01101 01010 11001 10101
It took 42 sec to download 763 KB
Parallel Rate: 18 KBps (42 sec)
Rate Source 0: 8 KBps (91 sec)
Rate Source 1: 10 KBps (74 sec)
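The dynamic block scheduling behind such a run can be mimicked with a toy event-driven simulation (a sketch: each mirror is handed the next outstanding block as soon as it finishes the previous one; RTTs and idle times are ignored, and the two rates below are only loosely inspired by the trace above):

```python
import heapq

def parallel_access_time(file_kb, block_kb, server_rates_kbps):
    """Finish time when blocks are assigned on demand to the first free server."""
    n_blocks = file_kb // block_kb
    free = [(0.0, i) for i in range(len(server_rates_kbps))]  # (free time, server)
    heapq.heapify(free)
    finish = 0.0
    for _ in range(n_blocks):
        t, i = heapq.heappop(free)
        t += block_kb / server_rates_kbps[i]    # time to send one block
        finish = max(finish, t)
        heapq.heappush(free, (t, i))
    return finish

t = parallel_access_time(760, 20, [8, 10])   # two mirrors at 8 and 10 KBps
# t lands near 760 / (8 + 10) ≈ 42 s, well below the 76 s of the faster mirror alone
```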

Page 10: Peer-to-Peer Networks for Content Distribution

37 © Institut Eurécom 2005

Deployment Issues

What if everybody does the same thing?
- If all clients share the same bottleneck ⇒ the speedup is reduced
- If clients do not share the same bottleneck ⇒ the speedup remains high (for more info see: Christos Gkantsidis, Mostafa Ammar, Ellen Zegura, "On the Effect of Large-Scale Deployment of Parallel Downloading", IEEE Workshop on Internet Applications (WIAPP'03), 2003)

In any case:
- Clients experience performance at least equal to that of the fastest server
- Load is automatically shared among all servers: more powerful servers receive more requests, less powerful servers receive fewer
- There is no need for a server selection algorithm

38 © Institut Eurécom 2005

Conclusions

A parallel access for popular and large documents:
- Avoids non-trivial server selection
- Smoothes bandwidth fluctuations
- Speeds up document downloads
- Even in bandwidth-limited environments, the performance obtained with a parallel access is at least as good as that offered by the best server

Parallel access is applicable in P2P environments: clients that have a copy act as mirror servers. Parallel access to files is implemented today in various tools such as Morpheus, eDonkey, or BitTorrent.

39 © Institut Eurécom 2005

Handout

Pablo Rodriguez and Ernst W. Biersack. Dynamic Parallel-Access to Replicated Content in the Internet. IEEE/ACM Transactions on Networking, 10(4):455-464, August 2002.


BitTorrent

Peer-to-peer based system for distributing a file from one or more nodes that have a complete copy to a possibly large number of peers


41 © Institut Eurécom 2005

Overview

- Motivation
- How it works
- Chunk and peer selection
- Performance study:
  - Global performance (tracker log)
  - Client behavior and performance (client log)
- Summary

42 © Institut Eurécom 2005

Content Distribution Model: Cooperative Networking

In a cooperative network (“peer-to-peer”), all nodes are both client and server

Many nodes, but unreliable and heterogeneous

Takes advantage of distributed, shared resources (bandwidth, CPU, storage) on peer nodes

Fault-tolerant, self-organizing

Dynamic environment: frequent join and leave is the norm

[Figure: cooperative content distribution; peers redistribute the content among themselves, relieving the source]

43 © Institut Eurécom 2005

Cooperative Distribution: Intuition

Setting: source server at 100 Mb/s, clients at 10 Mb/s
1. Antivirus update: 100,000 clients, 4 MB file
2. Daily database update: 1,000 clients, 600 MB file

Distribution time, client/server:
1. 9h:52m
2. 14h:48m

Distribution time, cooperative:
1. 52s
2. 09m:54s
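The orders of magnitude can be reproduced with a back-of-the-envelope model (a sketch: the cooperative formula assumes the served population doubles once per file-transfer time, which is our own simplification, so the results come close to, but do not exactly match, the slide's figures):

```python
from math import ceil, log2

def client_server_time(n_clients, file_bits, server_bps):
    # The server link must carry every copy itself.
    return n_clients * file_bits / server_bps

def cooperative_time(n_clients, file_bits, client_bps):
    # Rough model: peers that have the file re-serve it, so the served
    # population doubles every file-transfer time (assumption).
    return ceil(log2(n_clients)) * file_bits / client_bps

# Antivirus update: 100,000 clients, 4 MB file, 100 Mb/s server, 10 Mb/s clients.
t_cs = client_server_time(100_000, 4e6 * 8, 100e6)    # ~32,000 s (~9 h)
t_coop = cooperative_time(100_000, 4e6 * 8, 10e6)     # ~54 s
```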

44 © Institut Eurécom 2005

Cooperative Distribution in BitTorrent

- Principle: capitalize on the bandwidth of edge computers
- Self-scaling network: more clients ⇒ more aggregate bandwidth ⇒ more scalability
- Cost-effective, robust against failures and flash crowds

How well does it work in practice?
- Study of a BitTorrent tracker log covering 5 months, a 1.77 GB file, and 180,000+ clients
- Tracing the behavior of a BitTorrent client


45 © Institut Eurécom 2005

Elements of a BitTorrent Session

One session = distribution of a single (generally large) file

Elements:
- An ordinary web server
- A static 'metainfo' file
- A tracker
- An original downloader
- On the end-user side: a web browser + BT plug-in

46 © Institut Eurécom 2005

Joining a BT Session

[Diagram: web server, tracker, and a new peer]

1. Download the torrent meta-info from the web server
2. Launch the BT client
3. The BT client contacts the tracker
4. The tracker picks 40 active peers at random for the new client
5. The BT client cooperates with the peers returned by the tracker

47 © Institut Eurécom 2005

Session Initiation

- Start by running a web server that hosts a torrent file; the torrent file contains the IP address of the tracker
- The tracker (often not on the web server) tracks all peers:
  - Initially, it must know at least one peer with the complete file
  - A peer that has the entire file: seed
  - A peer still downloading the file: leecher
- On the client side:
  - The BT client reads the tracker IP address and contacts the tracker (through HTTP or HTTPS)
  - The tracker provides the BT client with a set of active peers (leechers and seeds, typically 40) to cooperate with
  - Clients regularly report their state (% of download) to the tracker
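The tracker's role can be sketched in a few lines (illustrative only; the real protocol runs over HTTP/HTTPS with bencoded announce responses, which is not modeled here):

```python
import random

class Tracker:
    """Toy tracker: registers peers and hands newcomers a random peer set."""
    def __init__(self, peer_set_size=40):
        self.active = set()
        self.peer_set_size = peer_set_size

    def announce(self, peer_addr):
        candidates = list(self.active - {peer_addr})
        peer_set = random.sample(candidates,
                                 min(self.peer_set_size, len(candidates)))
        self.active.add(peer_addr)   # newcomer becomes pickable for others
        return peer_set

tracker = Tracker()
tracker.announce("seed:6881")               # the initial seed registers first
peers = tracker.announce("leecher1:6881")   # the newcomer learns about the seed
```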

48 © Institut Eurécom 2005

Peer Sets
- The tracker picks peers at random from its list
- Once a peer is incorporated into the BT session, it can also be picked to be in the peer set of another peer
- This technique allows a wide temporal diversity: a peer knows both older peers and newcomers, which ensures the transfer of chunks between "generations"
- Note: a peer communicates with its initial peer set and with the other peers that contacted it, but NOT with other peer sets

[Diagram: over time, peer p is connected both to the peers of its initial peer set and to later arrivals that have p in their initial peer set]


49 © Institut Eurécom 2005

File Transfer Algorithm

- The file is broken into chunks (typically 256 kB); the torrent file contains a SHA1 hash for each chunk, which allows checking the integrity of each chunk
- Reports are sent regularly (at start-up, at shutdown, and every 30 minutes) to the tracker: unique peer ID, IP, port, quantity of data uploaded and downloaded, status (started, completed, stopped), etc.
- Peers connect with each other over TCP, full duplex (data transits in both directions)
- Upon connection, peers exchange their lists of chunks
- Each time a peer has downloaded a chunk and checked its integrity, it advertises it to its peer set
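The per-chunk integrity check can be sketched with the standard SHA1 primitive (hedged example; real torrents store the concatenated digests inside the bencoded metainfo file, here they are simply kept in a list):

```python
import hashlib

CHUNK_SIZE = 256 * 1024   # 256 kB, the typical chunk size mentioned above

def torrent_hashes(data, chunk_size=CHUNK_SIZE):
    """SHA1 digest of every chunk, as a torrent's metainfo would record."""
    return [hashlib.sha1(data[i:i + chunk_size]).digest()
            for i in range(0, len(data), chunk_size)]

def chunk_is_valid(index, chunk, hashes):
    # A peer verifies a downloaded chunk before advertising it.
    return hashlib.sha1(chunk).digest() == hashes[index]

data = bytes(range(256)) * 4096    # 1 MiB stand-in for file content
hashes = torrent_hashes(data)      # 4 chunks of 256 kB
```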

50 © Institut Eurécom 2005

Connection States

On each side, a connection maintains 2 variables:
- "Interested": you have a chunk that I want (allows a peer to know its possible clients for upload)
- "Choked": I don't want to send you data at this time (possible reasons: I have found faster peers, you did not or cannot reciprocate enough, ...)

51 © Institut Eurécom 2005

Chunk Selection Algorithm

Which missing chunk should we request from other peers?
- Simple strategy: random selection
  - Choose at random among the chunks available in the peer set
  - Randomness ensures diversity
- Biased strategy: peers apply the rarest-first policy
  - Choose the least represented missing chunk in the peer set
  - Rare chunks can more easily be traded with others
  - Maximizes the minimum number of copies of any given chunk in each peer set

BT uses the rarest-first policy, except for newcomers, which use random selection to quickly obtain a first block.
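The rarest-first choice reduces to counting chunk replicas across the peer set (a sketch; real clients break ties at random, here the lowest chunk index wins):

```python
from collections import Counter

def rarest_first(my_chunks, peer_chunk_sets):
    """Return the missing chunk with the fewest copies in the peer set."""
    counts = Counter()
    for chunks in peer_chunk_sets:
        counts.update(chunks)
    missing = [(counts[c], c) for c in counts if c not in my_chunks]
    return min(missing)[1] if missing else None

peer_sets = [{0, 1, 2}, {0, 1}, {0, 2}]  # chunk 0 has 3 copies, 1 and 2 have 2
next_chunk = rarest_first({0}, peer_sets)  # picks one of the rarer chunks
```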

52 © Institut Eurécom 2005

Peer Selection Algorithm

- Serving too many peers simultaneously is not efficient: BT serves 5 hosts in parallel
- Which hosts to serve?
  - The ones that also serve us: tit-for-tat (leechers)
  - The ones that offer the best download rates (seeds)
- Can there be any better hosts?
  - Optimistically unchoke a random peer to possibly find another host that provides better service
  - Newcomers have less data to offer ⇒ give them "priority" in the optimistic unchoke
- BT reconsiders choking/unchoking every 10 s (long enough for TCP to reach steady state)
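The periodic unchoke decision can be sketched as follows (illustrative: 4 regular slots plus 1 optimistic slot, giving the 5 parallel uploads mentioned above; the newcomer bias of the real client is not modeled):

```python
import random

def pick_unchoked(upload_rates, n_regular=4, n_optimistic=1):
    """Recomputed every 10 s: unchoke the best uploaders (tit-for-tat),
    plus one random 'optimistic' peer that may turn out to be better."""
    ranked = sorted(upload_rates, key=upload_rates.get, reverse=True)
    unchoked = ranked[:n_regular]
    rest = ranked[n_regular:]
    if rest:
        unchoked += random.sample(rest, min(n_optimistic, len(rest)))
    return unchoked

rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 5}  # observed kb/s
chosen = pick_unchoked(rates)   # a-d by rate, plus one of e/f at random
```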


53 © Institut Eurécom 2005

BitTorrent Study
- Five-month (April to August 2003) tracker log of a very popular BT session: Linux RedHat 9, 1.77 GB file
- The tracker log contains all the reports of all the clients (ID, IP, amount of bytes uploaded and downloaded)
- In addition, we ran our own instrumented client on 3 different days to observe a given peer set
- The client log contains the blocks uploaded to and downloaded by each host (each time a host has a new block, it advertises it to its peer set)
- It exhibits the behavior of BT during the download phase and once the client becomes a seed

54 © Institut Eurécom 2005

Tracker Log
- 180,000 clients during the 5-month period
- Initial flash crowd: 51,000 clients during the first 5 days
- Flash crowd: one client every 80 s

55 © Institut Eurécom 2005

Tracker Log: Number of Clients

- Reaches 4000+ active clients in the first day (flash crowd)
- Remains in the interval [100, 200] afterwards

56 © Institut Eurécom 2005

Tracker Log: Clients’ behavior

Clients are very altruistic:
1. When they are leechers, they have no choice due to tit-for-tat
2. Once the download is completed, since they stay connected on average 3 hours after the download:
   - The transfer is long and may complete overnight
   - The content is legal (the RIAA will not sue!)
   - The user is very kind


57 © Institut Eurécom 2005

Tracker Log: Seeds

- The presence of seeds is a key feature of BT
- Over the 5 months, seeds contributed twice as much volume as leechers (about 40 TBytes vs. 20 TBytes)

58 © Institut Eurécom 2005

Tracker Log: Seeds vs. Leechers

- The percentage of seeds is consistently high (20+%), with a peak during the flash crowd
- Thus, two factors allow BT to sustain the flash crowd:
  - Its ability to quickly create seeds (i.e., complete downloads)
  - The fact that users are altruistic and seeds remain online

59 © Institut Eurécom 2005

Tracker Log: BT vs. Mirroring

- Throughput per leecher is always above 500 kb/s (at least an ADSL client)
- The aggregate throughput of the system (sum over all leechers at each instant) was higher than 800 Mb/s: the equivalent of more than 80 mirrors, each sustaining a 10 Mb/s service
- Considering only the 20,000 hosts that completed the download in a single session (BT allows resume):
  - Their throughput is better than average: 1.3 Mb/s
  - Their average download time is 30,000 s (8.3 h), yet 1.77 GB / 1.3 Mb/s ≈ 10,000 s (2.7 h)
  - Conclusion: a high variance in download throughputs!
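The variance claim follows from simple arithmetic (decimal gigabytes assumed for the 1.77 GB figure):

```python
file_bits = 1.77e9 * 8      # 1.77 GB file
mean_rate = 1.3e6           # mean throughput of complete sessions, 1.3 Mb/s

ideal_time = file_bits / mean_rate   # ~10,900 s (~3 h) if everyone ran at the mean
observed_mean_time = 30_000          # s, mean download time from the tracker log

# The observed mean download time is ~3x what the mean rate alone implies,
# which is only possible if throughputs vary widely across hosts.
spread_factor = observed_mean_time / ideal_time
```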

60 © Institut Eurécom 2005

Tracker Log: Complete Sessions

- The peak of the throughput distribution is close to 400 kb/s (ADSL speed)
- Some hosts have very high bandwidth


61 © Institut Eurécom 2005

Tracker Log: US vs. Europe

- In the first 4 weeks: 45% of the clients came from the US, 15% from Europe
- US clients have better access links than European clients (high-bandwidth peers)

62 © Institut Eurécom 2005

Tracker Log: Incomplete Sessions

- Causes of abortion (no interest, crash)?
- Assumption: abortions are due to experiencing bad service; valid if users receive almost nothing while online
- 60% of the incomplete sessions last less than 1,000 sec (<20 min)
- 90% of the incomplete sessions last less than 10,000 sec (<3 h)
- The throughput of incomplete sessions is smaller than that of complete sessions

63 © Institut Eurécom 2005

Tracker Log: Incomplete Sessions

90% of the non-completed clients downloaded less than 10% of the file

64 © Institut Eurécom 2005

Client Log

- Modified client behind a 10 Mb/s campus access link
- 3 transfers, on 3 different days of the 5th month (far from the flash crowd)
- Average transfer time: 4,500 s (1.25 h, a fast client!)
- We remained a seed for another 13 hours
- The number of connected clients drops after the end of the download phase; explanation: seeds disconnect


65 © Institut Eurécom 2005

Client Log: Tit-for-Tat
- A lot of straight lines: continuous service
- Steps: choke effect
- The large step: client disconnected?

66 © Institut Eurécom 2005

Client Log: Upload and Download

[Figure annotations: ramp-up period (obtain first chunks); connections reach full speed; end of download, start serving chunks]

- The client never gets stalled: we always find peers to serve and to download chunks from ⇒ good efficiency
- We uploaded as much as we downloaded after 10,000 sec, i.e., twice the download time
- Cooperation is enforced: the download rate increases because the upload rate increases

67 © Institut Eurécom 2005

Client Log: Tit-for-Tat

Who gave us the file, seeds or leechers?
- 40% from seeds and 60% from leechers
- 85% of the file was provided by only 25% of the peers
- Most of the file was provided by peers that connected to us (not from our original peer set)

How good is the tit-for-tat policy? Two conflicting goals:
- It must enforce cooperation among peers
- It must allow transfer even if bandwidth is not perfectly balanced (example: I don't give you anything because I can send you at 100 kb/s whereas you can only send at 80 kb/s)

68 © Institut Eurécom 2005

Client Log: Tit-for-Tat

We found that BT pays more attention to the amount of data transferred than to the balance of bandwidths, which is a very good property:
- High correlation of traffic volumes
- Low correlation of throughputs

(Legend: V = volume, T = throughput, d = download, u = upload)


69 © Institut Eurécom 2005

Client Log: Tit-for-Tat

- We received more than we gave (ratios between 2 and 4), even if we do not account for seed traffic
- Probably due to our good download capacity and to tit-for-tat enforcement

70 © Institut Eurécom 2005

BitTorrent Summary

- BT is very efficient for highly popular downloads
- Still, its performance might be affected if clients do not stay long enough as seeds, e.g., in the case of illegal content... what happened to the 160,000 incomplete downloads?
- BT is clearly able to sustain large flash crowds
- Some open questions: could we do better by using different peer and chunk selection strategies?

71 © Institut Eurécom 2005

Handout

M. Izal, G. Urvoy-Keller, E.W. Biersack, P. Felber, A. Al Hamra, and L. Garces-Erice. Dissecting BitTorrent: Five Months in a Torrent's Lifetime. In Passive and Active Measurements 2004, April 2004.


Performance Analysis of P2P Networks for File Distribution

Ernst W. Biersack, [email protected], Institut Eurecom, France


73 © Institut Eurécom 2005

Overview

- Motivation
- Model and metrics
- Three distribution techniques:
  - Linear chain
  - Tree
  - Parallel trees
- Performance comparison
- Further improvements
- Lessons learned and summary

74 © Institut Eurécom 2005

Context and Problem
- A growing number of well-connected users access increasing amounts of content
- But interest in content is often "Zipf" distributed (a small fraction of the content is very popular)
- Servers and links are overloaded by the number of clients, the size of content, and "flash crowds" (e.g., 9/11)
- Tremendous engineering (and cost!) is necessary to make server farms scalable and robust

[Figure: traditional client/server content distribution; source congestion leads to degraded or lost service]

Problem: scalable distribution of content

75 © Institut Eurécom 2005

Cooperative Networking

In a cooperative network (“peer-to-peer”), all nodes are both client and server

Many nodes, but unreliable and heterogeneous

Takes advantage of distributed, shared resources (bandwidth, CPU, storage) on peer nodes

Fault-tolerant, self-organizing

Dynamic environment: frequent join and leave is the norm

[Figure: cooperative content distribution]

76 © Institut Eurécom 2005

Cooperative Distribution

- Principle: capitalize on the bandwidth of edge computers
- Self-scaling network: more clients ⇒ more aggregate bandwidth ⇒ more scalability
- Self-organizing: robust against failures and flash crowds

How well does it work in practice?
- Study of BitTorrent over 5 months, 1.77 GB file, 180,000+ clients, 60+ TB transferred ⇒ scales very well

What are the best cooperative distribution strategies?
- Analytical models of distribution topologies


77 © Institut Eurécom 2005

Model and Metrics
- A single copy of the file is held by the source s; the source serves the file indefinitely
- N peers p1, ..., pN arrive at t0 and request the same file F
- File F is broken into C chunks
- Peers cooperate by exchanging chunks
- Peers upload F a limited number of times before leaving
- All peers and the source have identical upload and download rate b
- 1 round (unit of time) = time needed to download F at rate b; the time needed to download a chunk is 1/C

Metrics:
- T(N): number of rounds needed to fully serve N peers (≥ 1)
- N(t): number of peers served within t rounds

78 © Institut Eurécom 2005

Sequential Service

- Non-cooperative approach: iteratively serve the peers
- Independent of C; scales linearly with t

[Diagram: sequential service, C = 3; the source serves p1, then p2, then p3, one chunk every 1/3 of a round]

N_Sequential(t) = t
T_Sequential(N) = N

79 © Institut Eurécom 2005

Linear Chain

- Cooperative approach
- Every peer serves the whole file once to another peer and then disconnects
- A peer can start serving once it has 1 chunk
- Many chunks improve scalability

[Diagram: linear chain, C = 3; the source starts a new chain every round, while chunks pipeline down each existing chain]

- After t rounds: t + 1 chains
- A chain has 1 + tC peers (1 + (t - 1)C of them complete)

80 © Institut Eurécom 2005

Linear Chain

Performance depends on the node-to-chunk ratio N/C.
Solving N(T) = T + C·T·(T-1)/2 for T gives:

T_Linear(C, N) = (C-2)/(2C) + sqrt( ((C-2)/(2C))^2 + 2N/C )

N/C << 1 (e.g., 10^-3): T_Linear(C, N) ≈ 1
  One chain; all peers active most of the time
N/C = 1: T_Linear(C, N) ≈ 1/2 + sqrt(1/4 + 2) = 2
  One chain; the first peer finishes when the last one starts
N/C >> 1 (e.g., 10^4): T_Linear(C, N) ≈ sqrt(2N/C)
  Several chains; few peers active, many already complete or not yet started
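The linear-chain numbers can be checked with a short sketch; the closed form below is derived from the round-by-round count N(t) = t + C·t(t-1)/2 used on this slide (function names are mine):

```python
import math

def n_linear(t: int, C: int) -> int:
    """Peers completely served after t rounds: the source starts a new chain
    each round, and a chain that has run r rounds has 1 + (r-1)*C complete peers."""
    return sum(1 + (t - j - 1) * C for j in range(t))

def t_linear(C: float, N: float) -> float:
    """Rounds needed to serve N peers: solve N = T + C*T*(T-1)/2 for T."""
    a = (C - 2) / (2 * C)
    return a + math.sqrt(a * a + 2 * N / C)

# The closed form inverts the round-by-round count exactly.
assert abs(t_linear(3, n_linear(4, 3)) - 4) < 1e-9
# For N/C >> 1 the time approaches sqrt(2N/C).
assert abs(t_linear(10, 10**5) - math.sqrt(2 * 10**5 / 10)) < 1.0
```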

Page 21: Peer-to-Peer Networks for Content Distribution

81 © Institut Eurécom 2005

Tree_k

The source uploads the file to k peers in parallel at rate b/k
Every peer downloads the file at rate b/k, taking k rounds
Non-leaf peers upload the whole file to k peers at rate b/k

Each non-leaf peer serves the file k times
Reduces to the linear chain when k = 1

[Diagram: Tree_{k=2}, C = 3, N = 30; s uploads to p1 and p2 at rate b/2 each; every interior peer forwards to two children; p15 through p30 are leaves and upload nothing]

82 © Institut Eurécom 2005

Tree_k

Scales exponentially with C and t; T and N depend on k
Deep trees delay the engagement of peers
Flat trees have low per-peer throughput (b/k)

T_Tree(k, C, N) = k + ⌊log_k N⌋ · k/C
  (k rounds for the file transfer at rate b/k, plus a per-level start-up delay of k/C for each of the ⌊log_k N⌋ tree levels)

N_Tree(k, C, t) ≈ k^((t-k)·C/k + 1)
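The Tree_k formulas above translate directly into a small sketch (function names are mine; N_Tree is the slide's approximation, valid for t ≥ k):

```python
import math

def t_tree(k: int, C: int, N: int) -> float:
    """Rounds for Tree_k to serve N peers: k rounds to receive the file
    at rate b/k, plus a start-up delay of k/C per tree level."""
    return k + math.floor(math.log(N, k)) * k / C

def n_tree(k: int, C: int, t: float) -> float:
    """Approximate number of peers served by round t (for t >= k)."""
    return k ** ((t - k) * C / k + 1)

# At t = k the first level of k peers has just finished.
assert n_tree(2, 3, 2) == 2
# Here the deeper k=2 tree beats the flatter k=5 tree, which pays the
# low b/k throughput on every hop.
assert t_tree(2, 3, 30) < t_tree(5, 3, 30)
```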

83 © Institut Eurécom 2005

Tree_k

Optimal value for k depends on N and C:

k_opt = exp( (-ln N + sqrt( (ln N)^2 + 4(C-1)·ln N )) / (2(C-1)) )

Larger values of k yield more leaves ⇒ more non-cooperating peers
When N/C is large, the linear chain keeps few peers simultaneously active ⇒ the tree is better
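Rather than relying on the closed form, the best out-degree can also be found numerically by sweeping k over the T_Tree formula above (a sketch; names are mine):

```python
import math

def t_tree(k: int, C: int, N: int) -> float:
    # k rounds of file transfer at b/k, plus k/C start-up delay per level.
    return k + math.floor(math.log(N, k)) * k / C

def k_opt(C: int, N: int, k_max: int = 20) -> int:
    """Brute-force the out-degree k that minimizes Tree_k completion time."""
    return min(range(2, k_max + 1), key=lambda k: t_tree(k, C, N))

# Small out-degrees win: large k means many leaves, i.e. many
# non-cooperating peers and a low b/k transfer rate.
assert k_opt(100, 10**4) in (2, 3)
```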

84 © Institut Eurécom 2005

PTree_k

The source uploads 1/k-th of the file to each of k peers in parallel at rate b/k
Every peer downloads the file from k peers at rate b/k each
Every peer uploads its 1/k-th of the file to k peers at rate b/k

Each peer serves the file once in total
Creates k parallel spanning trees; reduces to the linear chain when k = 1
Each peer is an interior node of at most one tree ⇒ reliability

[Diagram: PTree_{k=2}, N = 15; two parallel spanning trees, each fed at rate b/2, one carrying the even chunks and one the odd chunks; p1 is interior in one tree and a leaf in the other, and symmetrically for p2]

Page 22: Peer-to-Peer Networks for Content Distribution

85 © Institut Eurécom 2005

PTree_k

Scales exponentially with C and t
Optimal value for k is e, independent of N and C
Larger values of k provide better resilience to failures

T_PTree(k, C, N) = 1 + ⌊log_k N⌋ · k/C
  (one round for the file transfer at aggregate rate b, plus a per-level start-up delay of k/C for each of the ⌊log_k N⌋ tree levels)

N_PTree(k, C, t) ≈ k^((t-1)·C/k)
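The two formulas side by side make the PTree_k advantage concrete (a sketch from the slide's closed forms; names are mine):

```python
import math

def t_ptree(k: int, C: int, N: int) -> float:
    """Rounds for PTree_k: one round for the file at aggregate rate b,
    plus k/C start-up delay per level of the k parallel trees."""
    return 1 + math.floor(math.log(N, k)) * k / C

def t_tree(k: int, C: int, N: int) -> float:
    """Rounds for Tree_k: k rounds at rate b/k, same per-level delay."""
    return k + math.floor(math.log(N, k)) * k / C

# PTree_k stays close to 1 round, while Tree_k always needs at least k.
assert t_ptree(3, 100, 10**4) < 1.3
assert t_tree(3, 100, 10**4) >= 3
```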

86 © Institut Eurécom 2005

PTree_k vs. Linear Chain

When N/C << 1, peers stay engaged a long time in the chain ⇒ the benefit of PTree_k diminishes
PTree_k outperforms the linear chain when N/C > 10^-1

[Graphs: completion time vs. N/C for PTree_{k=3} and Linear, for C = 10^2 and C = 10^6; the curves cross near N/C = 10^-1]
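The crossover claim can be sanity-checked with the two closed forms above (a sketch; names are mine):

```python
import math

def t_linear(C: float, N: float) -> float:
    a = (C - 2) / (2 * C)
    return a + math.sqrt(a * a + 2 * N / C)

def t_ptree(k: int, C: int, N: int) -> float:
    return 1 + math.floor(math.log(N, k)) * k / C

C = 100
# At N/C = 1 the parallel trees finish much earlier than the chain...
assert t_ptree(3, C, 100) < t_linear(C, 100)
# ...while for N/C << 1 both take about one round, so the benefit vanishes.
assert abs(t_linear(C, 1) - 1.0) < 0.05
assert t_ptree(3, C, 1) == 1.0
```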

87 © Institut Eurécom 2005

PTree_k vs. Tree_k

Download time for PTree_k is close to 1 round, almost independent of k
Tree_k always needs at least k rounds

[Graphs: completion time for PTree_{k=2,3}, Tree_{k=2}, and Tree_{k=3}, for C = 10^2 and C = 10^4]

88 © Institut Eurécom 2005

Recap

PTree_k performs best
Further, PTree_k is less sensitive to churn

Scheme    Down / up rate          Copies served  Service time T(N)                         Clients served N(t)
Linear    b / b (except last)     1              (C-2)/(2C) + sqrt(((C-2)/(2C))^2 + 2N/C)  ≈ t^2·C/2
Tree_k    b/k / b (0 at leaves)   k              k + ⌊log_k N⌋·k/C                         ≈ k^((t-k)·C/k + 1)
PTree_k   b/k × k = b / b         1              1 + ⌊log_k N⌋·k/C                         ≈ k^((t-1)·C/k)

Note: peers upload as fast as they download, the opposite of ADSL.
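The N(t) column can be compared directly; the sketch below uses the table's closed forms with integer rounds (Linear uses the t^2·C/2 approximation; names are mine):

```python
def n_linear(t: int, C: int) -> float:
    # Approximate peers served by the chains after t rounds.
    return t * t * C / 2

def n_tree(k: int, C: int, t: int) -> int:
    # Integer-round form of k^((t-k)*C/k + 1), valid for t >= k.
    return k ** ((t - k) * C // k + 1)

def n_ptree(k: int, C: int, t: int) -> int:
    # Integer-round form of k^((t-1)*C/k).
    return k ** ((t - 1) * C // k)

# After t = 4 rounds with C = 30 chunks, exponential growth dominates,
# and PTree_k's head start (1 round vs k) compounds.
assert n_ptree(3, 30, 4) > n_tree(3, 30, 4) > n_linear(4, 30)
```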


89 © Institut Eurécom 2005

Can we do Better?

Phase 1: send one chunk to every node

[Diagram: in steps of length 1/C, s sends a fresh chunk (c1, c2, c3, ...) to one peer, while every peer already holding a chunk forwards a copy to a peer that has none]

The number of copies of each chunk grows exponentially
Assuming N = 2^m - 1, after m steps (one step = 1/C round) each peer holds exactly one chunk

90 © Institut Eurécom 2005

Can we do Better?

Phase 2: peers exchange chunks

At each step, every peer receives one new chunk
C - 1 steps deliver the remaining C - 1 chunks to all peers

[Diagram: peers pair up each step and trade chunks; a peer holding c1 next holds c1+c2 or c1+c3, then c1+c2+c3, and so on until all C chunks are complete]

T(C, N = 2^m - 1) = (m + C - 1)/C = 1 + (log_2(N+1) - 1)/C ≈ 1 + (log_2 N)/C
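Both phases can be checked in a few lines; the doubling recurrence below models "every holder forwards its chunk once per step while the source injects one more copy" (function names are mine):

```python
def phase1_spread(steps: int) -> int:
    """Peers holding at least one chunk after the given number of steps:
    every holder forwards a copy, and the source seeds one more peer."""
    have = 0
    for _ in range(steps):
        have = 2 * have + 1  # doubling plus the source's injection
    return have

def t_two_phase(C: int, m: int) -> float:
    """Total time in rounds for N = 2**m - 1 peers: m seeding steps of
    phase 1 plus C - 1 exchange steps of phase 2, each of length 1/C."""
    return (m + C - 1) / C

assert phase1_spread(3) == 7            # N = 2**3 - 1 peers seeded in 3 steps
assert t_two_phase(200, 10) == 209 / 200  # barely more than one round
```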

91 © Institut Eurécom 2005

Lessons Learned

Engage peers as fast as possible and keep them engaged as long as possible

Keep the number of copies of each chunk about the same (so that peers rarely wait for a very rare chunk that only few peers own)
The linear chain can perform better than the tree (for N/C << 1)

The analysis is only a first step:
Deterministic analysis
Homogeneous clients and homogeneous bandwidth
No peer failures
Real systems must account for heterogeneity and churn

92 © Institut Eurécom 2005

Summary

Self-scaling: more clients ⇒ more aggregate bandwidth ⇒ more scalability
PTree_k is a very efficient tree-based architecture
The file should be split into many chunks:
Performance scales exponentially with the number of chunks C
But not too many: coordination and connection overhead

Limit the number of simultaneous uploads to k = 3 to 5
Higher values provide more robustness

Self-organizing network: degree, chunk selection strategy, peer selection strategy


93 © Institut Eurécom 2005

Handout

E. W. Biersack, P. Rodriguez, and P. Felber. Performance Analysis of Peer-to-Peer Networks for File Distribution. In Proceedings of the Fifth International Workshop on Quality of Future Internet Services (QofIS'04), Barcelona, Spain, September 2004.

94 © Institut Eurécom 2005

Mesh-based Architectures

95 © Institut Eurécom 2005

Overview

Motivation
Peer and chunk selection strategies
Model for performance evaluation
Results:
Simultaneous peer arrivals, rarest vs. random block selection
Mesh-based topologies vs. static (tree-like) topologies
Lessons learned and summary

96 © Institut Eurécom 2005

Mesh-based Architectures: Motivation

In practice, peers come and go (referred to as churn)
The network self-organizes according to 3 factors:

Indegree and outdegree: number of neighbors
Peer selection strategy: which peer to serve next?
Chunk selection strategy: which chunk to serve next?


97 © Institut Eurécom 2005

Cooperative Distribution Simulator

Simulates cooperative distribution of a large file
File size: 200 chunks of 256 kB = 51.2 MB
Number of connections per peer: 5 (more if free bandwidth)
Number of origin peers: 1, with 128 kb/s uplink
Peer capacities:
Homogeneous symmetric: all peers with 128 kb/s
Homogeneous asymmetric: 100% of the peers with 512/128 kb/s
Heterogeneous asymmetric: 50% of the peers with 512/128 kb/s and 50% with 128/64 kb/s

Peer lifetime:
Selfish: peers leave as soon as their download finishes
Altruistic: peers remain online 5 minutes after completion

98 © Institut Eurécom 2005

Cooperative Distribution Simulator

Peer arrivals:
Simultaneous: 5000 peers arrive at t0
Continuous: average inter-arrival time of 2.5 s

Block selection policy: random, rarest
Peer selection policy: random, least missing, most missing

99 © Institut Eurécom 2005

Simultaneous Arrivals: Completion Times

Least missing performs very poorly. Why?

Rarest chunk selection, simultaneous arrivals,homogeneous and symmetric bandwidth, selfish peers

100 © Institut Eurécom 2005

Simultaneous Arrivals: Download Duration

Rarest chunk selection, simultaneous arrivals, heterogeneous and asymmetric bandwidth, selfish peers

Two distinct classes of peers: fast peers and slow peers (most missing tends to even out the download durations)


101 © Institut Eurécom 2005

Simultaneous Arrivals: Completion Times

Random and least missing suffer from random chunk selection and peer selfishness

T_opt = (200 · 256 kB · 8) / 128 kb/s = 3200 s

With least missing, the first peer completes earliest

Random chunk selection, simultaneous arrivals, homogeneous and symmetric bandwidth, selfish peers
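The optimum on the slide is simply the time for one full upload of the file over the 128 kb/s uplink:

```python
# 200 chunks of 256 kB, pushed through a 128 kb/s uplink.
chunks, chunk_kB, uplink_kbps = 200, 256, 128
t_opt_s = chunks * chunk_kB * 8 / uplink_kbps  # kb divided by kb/s
assert t_opt_s == 3200  # seconds, matching the slide
```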

102 © Institut Eurécom 2005

Simultaneous Arrivals: Efficiency

Most missing and adaptive missing are the most efficient (they quickly engage many peers)

Random chunk selection, simultaneous arrivals, homogeneous and symmetric bandwidth, selfish peers

103 © Institut Eurécom 2005

Simultaneous Arrivals: Chunk Distribution

With random peer selection, a single chunk ends up missing from many peers (it is served sequentially by the source)

Random chunk selection, simultaneous arrivals, homogeneous and symmetric bandwidth, selfish peers

Random chunk selection quickly brings all peers close to completion, but they then block on the last few chunks

104 © Institut Eurécom 2005

Comparison of Tree-based and Mesh-based Approaches

We will show that:
Least missing provides performance similar to Tree_k
Most missing provides performance similar to PTree_k

Advantages of meshes:
They avoid the explicit construction of trees
They are more flexible in case of node failures or node heterogeneity


105 © Institut Eurécom 2005

Mesh-based Approaches

Least missing:
Chunk selection strategy: globally rarest
Peer selection strategy: closest to completion, i.e., serve with priority the peer that holds the highest number of chunks among all peers

Most missing:
Chunk selection strategy: globally rarest
Peer selection strategy: furthest from completion, i.e., serve with priority the peer that holds the lowest number of chunks among all peers
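The three selection rules reduce to simple argmin/argmax choices; a minimal sketch (function names and data are mine, not simulator code):

```python
def rarest_first(wanted, copies):
    """Chunk selection: among the chunks a peer still wants, pick the one
    with the fewest copies in the whole system (globally rarest)."""
    return min(wanted, key=lambda c: copies[c])

def least_missing(requesters, held):
    """Peer selection: serve the peer closest to completion."""
    return max(requesters, key=lambda p: held[p])

def most_missing(requesters, held):
    """Peer selection: serve the peer furthest from completion."""
    return min(requesters, key=lambda p: held[p])

copies = {"c1": 9, "c2": 2, "c3": 5}   # system-wide copy counts
held = {"p1": 180, "p2": 3, "p3": 77}  # chunks held per peer
assert rarest_first(["c1", "c2", "c3"], copies) == "c2"
assert least_missing(["p1", "p2", "p3"], held) == "p1"
assert most_missing(["p1", "p2", "p3"], held) == "p2"
```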

106 © Institut Eurécom 2005

Least-Missing with Pin = 1 and Pout = 2 ≈ Tree_{k=2}

107 © Institut Eurécom 2005

Least-Missing with Pin = 1 and Pout = 2 ≈ Tree_{k=2}

[Diagram: with Pin = 1 and Pout = 2, least missing self-organizes into a binary tree rooted at s; chunk c1 propagates one tree level per step at t = 0, 1/C, 2/C]

108 © Institut Eurécom 2005

Least-Missing with Pin = 1 and Pout = 1

What organization does this correspond to?
Linear chain? Tree? Parallel trees?


109 © Institut Eurécom 2005

Most-Missing

All peers have Pin = Pout = 1

[Diagram: at each step t = 0, 1/C, 2/C, 3/C the source hands a fresh chunk (c1, c2, c3, c4) to the peer holding the fewest chunks, while peers forward their chunks to one another, so all peers are engaged almost immediately]

110 © Institut Eurécom 2005

PTree_k

Idea: engage all peers as fast as possible, and keep all peers working as long as possible, so that all terminate at about the same time

[Diagram: PTree_{k=2}, N = 15; two parallel spanning trees fed at rate b/2 each, one carrying the even chunks and one the odd chunks]

111 © Institut Eurécom 2005

Conclusion

Selection strategy is important
Peer selection: most missing performs best
Block selection: rarest block first avoids the performance problems that can occur with random block selection, when many clients need the same (last) block

112 © Institut Eurécom 2005

Overall Conclusion

Peer-to-peer systems for file distribution perform very well in practice
BitTorrent serves thousands of clients using a mesh-based organization

Peer-to-peer systems for file distribution are:
Very cost-effective and robust against failures and flash crowds
Self-scaling: the more clients, the more "resources" to serve them

What are the best cooperative distribution strategies?
Mesh-based organization is most appropriate in heterogeneous environments (in terms of up- and downlink bandwidth)
The peer and block selection strategies have a major impact on performance


113 © Institut Eurécom 2005

Handout

P. A. Felber and E. W. Biersack. Self-scaling Networks for Content Distribution, September 2004.

February 1, 2005

Questions

115 © Institut Eurécom 2005

END

