
Date post: 14-Feb-2017
Delft University of Technology Parallel and Distributed Systems Report Series On Assessing Measurement Accuracy in BitTorrent Peer-to-Peer File-Sharing Networks Boxun Zhang, Alexandru Iosup, Johan Pouwelse, Dick Epema, and Henk Sips {B.Zhang,A.Iosup,J.Pouwelse,D.H.J.Epema,H.J.Sips}@tudelft.nl report number PDS-2009-005 PDS ISSN 1387-2109

Delft University of Technology

Parallel and Distributed Systems Report Series

On Assessing Measurement Accuracy in

BitTorrent Peer-to-Peer File-Sharing Networks

Boxun Zhang, Alexandru Iosup, Johan Pouwelse,

Dick Epema, and Henk Sips

{B.Zhang,A.Iosup,J.Pouwelse,D.H.J.Epema,H.J.Sips}@tudelft.nl

report number PDS-2009-005

PDS

ISSN 1387-2109


Published and produced by:
Parallel and Distributed Systems Section
Faculty of Information Technology and Systems
Department of Technical Mathematics and Informatics
Delft University of Technology
Zuidplantsoen 4
2628 BZ Delft
The Netherlands

Information about Parallel and Distributed Systems Report Series: [email protected]

Information about Parallel and Distributed Systems Section: http://pds.twi.tudelft.nl/

© 2009 Parallel and Distributed Systems Section, Faculty of Information Technology and Systems, Department of Technical Mathematics and Informatics, Delft University of Technology. All rights reserved. No part of this series may be reproduced in any form or by any means without prior written permission of the publisher.


B. Zhang et al. Wp

BitTorrent BitTorrent Measurement AccuracyWp

PDS

Wp

Wp

Abstract

The BitTorrent peer-to-peer file-sharing network is currently one of the dominant Internet applications. Understanding the characteristics of BitTorrent through real-world measurements is key to improving the quality of service for tens of millions of BitTorrent users, but the complexity and scale of BitTorrent make a single, complete measurement impractical. Thus, an increasing number of real measurements have employed diverse sampling techniques to study the BitTorrent network. However, there is no study that investigates the accuracy of the findings of the different measurement techniques used in practice. To address this gap, in this work we propose a thorough investigation of the accuracy of BitTorrent measurement techniques. To this end, we first introduce a taxonomy of inaccuracy sources. We then investigate the effect of these sources using 15 long-term BitTorrent datasets collected from 9 BitTorrent communities between 2004 and 2009. We find that most reported measurements are based on techniques that can lead to inaccurate characterization of system properties.

Wp 1 http://www.pds.ewi.tudelft.nl/∼iosup/


Contents

1 Introduction

2 Background
    2.1 Terminology
    2.2 BitTorrent

3 Estimating Measurement Inaccuracy
    3.1 Estimating Accuracy
    3.2 Data Source Selection
    3.3 Data Volume Reduction

4 The Collected Traces
    4.1 T1: BT-TUD-1, SuprNova
    4.2 T2: BT-TUD-2, PirateBay
    4.3 T3: LegalTorrents.com
    4.4 T4: etree.org
    4.5 T5: tlm-project.org
    4.6 T6: transamrit.net
    4.7 T7: unix-ag.uni-kl.de
    4.8 T8: idsoftware.com
    4.9 T9: boenielsen.dk

5 The Results
    5.1 Inaccuracy Due to Data Source Selection
        5.1.1 Measurement Level
        5.1.2 Community Type
        5.1.3 Passive vs. Active
    5.2 Inaccuracy Due to Data Volume Reduction
        5.2.1 Sampling rate and Duration
        5.2.2 Number of Communities and Number of Torrents
        5.2.3 Catching long-term dynamics

6 Related Work

7 Conclusion and Future Work

8 Acknowledgements


List of Figures

1 Swarm dynamics at swarm and peer level
2 Comparison of cumulative community throughput resulting from community-level measurement and swarm-level measurement
3 Comparison of file size distributions 2005
4 Comparison of file size distributions 2009
5 Download speed 2009
6 Upload speed 2009
7 Session length 2009
8 Sampling bias for one swarm
9 Sampling bias on swarm size, multi-swarm
10 Sampling bias stability
11 Comparison of overall peer coverage
12 Comparison of session length distributions
13 Comparison of download speed distributions
14 Comparison of download amount per peer distributions
15 Comparison of upload speed distributions
16 Comparison of download speed distributions
17 Comparison of main characteristics resulting from datasets acquired from a specific number of torrents coming from one community
18 Comparison of monthly file size distribution from T1
19 Comparison of monthly swarm size distributions from T1
20 Comparison of monthly throughput
21 Download speed of different continents
22 Download speed comparison
23 Torrent size comparison

List of Tables

1 Summary of the datasets


1 Introduction

Peer-to-Peer file-sharing networks such as BitTorrent serve millions of users daily and are responsible for a significant percentage of the total Internet traffic. Since BitTorrent is still relatively new, ongoing research still attempts to improve core functionality such as performance and tolerance to abuse. A deep understanding of the resources and of the usage patterns is the basis of system design and optimization; for BitTorrent, such understanding results from measuring real BitTorrent deployments. Despite a large number of BitTorrent measurements [18, 10, 16, 2], the lack of a thorough evaluation of the accuracy of these measurements prevents the comparison of measurement results. Our objective is to understand the accuracy of BitTorrent measurements.

The complexity, scale, and dynamics of P2P file-sharing networks make complete measurements impractical. In practice, measurements have to sample data from a part of the network, within resource and cost limits. Thus, the design of these measurements includes a hidden trade-off between cost and data accuracy. Much work [17, 18, 3, 5, 10, 16, 7] has been put into empirical measurements of P2P file-sharing systems, including BitTorrent, but only a few studies [8, 20, 2] consider the inaccuracy introduced by the measurements.

Understanding the accuracy of BitTorrent measurements facilitates the design, validation, and comparison of BitTorrent models and algorithms. Ultimately, using accurate data leads to improved user experience. In contrast, inaccurate data may highlight false problems and lead to impractical solutions.

Our work is further motivated by two real and immediate applications. First, we are continuing [22] our work to establish a publicly-accessible P2P Workloads Archive. This archive will include in a first phase the tens of P2P measurement datasets we have acquired since 2003, and in particular the 15 datasets we use in this work. Second, within the QLectives project¹ we are currently taking new measurements of the BitTorrent network, and more are planned. It is therefore important to develop a method and the tools to assess the measurement accuracy for each of these traces. To address this situation, in this work we investigate the accuracy of various BitTorrent measurement techniques used in practice, and show that existing measurement techniques must reconsider their data sources and the volume of acquired data, or else they produce inaccurate or even meaningless results. Our main contribution is twofold:

1. We propose a method for estimating the accuracy of BitTorrent measurements that focuses on two main axes and six main sources of measurement inaccuracy (Section 3);

2. We evaluate the effect of these sources using 15 long-term BitTorrent datasets collected from 9 BitTorrent communities between 2004 and 2009, and show evidence that the techniques used in practice today lead to inaccurate results (Section 5).

2 Background

In this section we introduce the background needed to understand the remainder of this work. Much of the P2P-related terminology and BitTorrent description in this section is adapted from our previous work on BitTorrent [16, 8, 22].

2.1 Terminology

A P2P system is a system that uses P2P technology to provide a set of services; together, these services form an application such as file sharing. We call peers the participants in a P2P system that contribute to or use the system’s resources and services. A peer is completely disconnected until it joins the system, and is active until it leaves the system. A real user may run several peer sessions; the sessions do not overlap in time. We call a swarm the group of peers, from all the peers in a P2P system, that interact with each other for a specific goal, such as transferring a file. A swarm starts being active when the first peer joins that swarm, and ends its activity when its last peer leaves. The lifetime of a swarm is the period between the start and the end of the swarm. A community is the group of peers who are or can easily become aware of the existence of each other’s swarms.

¹ QLectives (http://www.qlectives.eu/) is a 7 million Euro, four-year project, starting in March 2009, funded by the EU under the FP7 FET programme.

Our view on P2P systems considers three levels of operation. A P2P system includes at least a peer level, but may also include any of the community and swarm levels.

The definitions of community, swarm, and peers presented here are general for peer-to-peer systems, though their implementation may differ with the P2P protocol. For example, BitTorrent and eDonkey have different implementations and uses of the swarm concept.

2.2 BitTorrent

In this work we focus on BitTorrent, a popular P2P file-sharing application. BitTorrent is currently the largest P2P file-sharing application, with an estimated Internet traffic share of over 50% in 2008 [1], up from 30% in 2004 [14].

BitTorrent includes all three levels of operation defined in Section 2.1, that is, community, swarm, and peer. The files (torrents) transferred in BitTorrent contain two parts: the raw file data and a metadata (directory and information) part. Peers interested in a file obtain the file’s metadata from a web site (the community level of BitTorrent) and use the peer location services offered by a tracker (the swarm level of BitTorrent) to find other peers interested in sharing the file. The raw file is then exchanged between peers (the peer level of BitTorrent). To facilitate this exchange, the raw data are split into smaller parts, called chunks. Thus, to obtain a complete file a user has to obtain all the chunks that make up the file, using all three application levels.

In BitTorrent, a leecher is a peer who still needs chunks, and a seeder is a peer who has the complete torrent and shares it with other peers. A delicate part of the file-sharing process of BitTorrent is the contribution of bandwidth. In this context, freeriding is defined as the activity of a peer that does not contribute any bandwidth to the system, and hit-and-run is defined as the activity of a peer that does not contribute to the network after obtaining the complete file (that is, after becoming eligible to act as a seeder).
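The role definitions above can be illustrated with a minimal Python sketch. The `PeerRecord` type and all of its field names are hypothetical and serve only this illustration; they do not correspond to any trace format used in this report. The sketch also assumes that each record summarizes a finished session, so that "uploaded nothing after completion" can be read as hit-and-run.

```python
from dataclasses import dataclass


@dataclass
class PeerRecord:
    """Hypothetical summary of one finished peer session (illustrative fields)."""
    chunks_have: int                # chunks held at the end of the session
    chunks_total: int               # chunks in the complete torrent
    uploaded_bytes: int             # bytes uploaded during the whole session
    uploaded_after_completion: int  # bytes uploaded after the download completed


def role(p: PeerRecord) -> str:
    # A leecher still needs chunks; a seeder holds the complete torrent.
    return "seeder" if p.chunks_have >= p.chunks_total else "leecher"


def is_freerider(p: PeerRecord) -> bool:
    # Freeriding: the peer contributed no bandwidth at all.
    return p.uploaded_bytes == 0


def is_hit_and_run(p: PeerRecord) -> bool:
    # Hit-and-run: the peer completed the file but contributed nothing afterwards.
    return p.chunks_have >= p.chunks_total and p.uploaded_after_completion == 0
```

Under these assumptions a peer can be both a freerider and a hit-and-run peer, since the two definitions are not mutually exclusive.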

3 Estimating Measurement Inaccuracy

In this section we introduce a method for estimating the inaccuracy of BitTorrent measurements.

Our method focuses on two main questions that define a measurement process: What to measure? and How much to measure? The first question results from the complexity of BitTorrent. For example, there currently exist tens of BitTorrent communities to choose from, each with its own web sites and trackers, and possibly its own usage characteristics. The second question expresses the trade-off between accuracy and volume of measurement data. Since the data are often collected from sources within different administrative domains (i.e., users), BitTorrent measurements are inherently limited in size. Corresponding to the two main questions, two main aspects influence the measurement accuracy: the source of the data to be collected, and the volume of the data to be collected. We first introduce the characteristics of the system that we want to observe, and for which we want to understand the sources of inaccuracy. We then present in turn the sources of inaccuracy stemming from the data source selection and from the data volume reduction.

3.1 Estimating Accuracy

Following traditional work on modeling Internet traffic [11, 9], system designers can gain much by understanding, at the community level, the sizes of the files that are shared; at the swarm level, the arrival and departure processes; and at the peer level, the application-level bandwidth. We distinguish between the complete and the transient swarm population: we define the swarm population as the set of peers present in the swarm at any time during the measurement, and the swarm (dynamic) size as the transient set of peers present during a specified interval of the measurement.
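The distinction between the two notions can be sketched in Python as follows. This is an illustration under stated assumptions, not the tooling used for the traces: sessions are represented as hypothetical `(peer_id, join_time, leave_time)` tuples, and a peer counts as present in an interval if its session overlaps the half-open interval `[start, end)`.

```python
def swarm_population(sessions):
    """Peers present in the swarm at any time during the measurement."""
    return {peer for peer, _join, _leave in sessions}


def swarm_size(sessions, start, end):
    """Transient set of peers whose session overlaps the interval [start, end)."""
    return {
        peer
        for peer, join, leave in sessions
        if join < end and leave > start
    }
```

Taking `len()` of either result gives the corresponding count; the population never shrinks as the measurement proceeds, whereas the dynamic size can rise and fall from one interval to the next.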

We estimate the accuracy of a measurement with the following metrics:

• Coverage is the percentage of peers or events included in the dataset resulting from a real measurement, relative to the real number of peers or events comprised in the complete dataset. As a first approximation, a model of the system or the most complete dataset available can be used when the complete dataset is not available.

• Error/deviation of values is a metric that mimics traditional statistical approaches for comparing probability distributions of random variables. The Kolmogorov-Smirnov test [12] uses the D characteristic to estimate the maximum deviation between the cumulative distribution functions (CDFs) of two random variables. We use a graphical representation of the CDFs of the measured and the real characteristic under study, which gives us a visual estimation of the D characteristic.
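Both metrics can also be computed numerically. The sketch below is one possible reading of the definitions above, not the procedure used in this report (which estimates D visually from plotted CDFs); the function names are illustrative. It computes coverage as a percentage of the reference set, and D as the largest absolute difference between two empirical CDFs, evaluated at every observed value.

```python
import bisect


def coverage(measured, reference):
    """Percentage of reference peers (or events) that appear in the measured set."""
    reference = set(reference)
    return 100.0 * len(set(measured) & reference) / len(reference)


def ks_distance(xs, ys):
    """Maximum deviation D between the empirical CDFs of two samples."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    # Both empirical CDFs are step functions that change only at sample values,
    # so checking every observed value is enough to find the maximum gap.
    for v in set(xs) | set(ys):
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d
```

A value of D near 0 indicates that the measured distribution closely follows the reference distribution, while a value near 1 indicates almost no overlap.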

3.2 Data Source Selection

Depending on the selection of the data source, we distinguish three main sources of inaccuracy:

Measurement level. In Section 2 we have defined three levels for a P2P application: community, swarm, and peer. Measuring at any single level may result in measurement inaccuracy. For example, since peers have limited uptime (presence) in the system, measurements taken at peer level may not be able to contact all peers. Moreover, accurately estimating the time when a swarm becomes active is not possible when using only peer-level measurements.

Community level: Community type. There are many types of communities in BitTorrent. We define three characteristics that discriminate between BitTorrent communities: content coverage, legality, and openness. Communities may cover either general or specific content. The specific content may be further divided into traditional content sub-types such as video, audio, games, operating system, etc.; Garbacki et al. [4] identify up to 200 content sub-types for the SuprNova BitTorrent community. The correlation between content type popularity and individual file popularity has been observed and exploited for several P2P file-sharing networks, including BitTorrent [7, 4]. Independent of content type, communities may share only content that has been certified as legal, or content with any legal status. Legal communities are usually built around companies that aim to reduce the cost of distributing their content. Such companies install and maintain well-connected seeders with high availability; such communities may exhibit very different sharing behavior than communities that rely on voluntary seeding. Last, communities may be open to any user or closed. Closed communities require the registration of their users; this allows unique identification of peers and permits the enforcement of seeding quotas or minimal seeding-to-leeching ratios.

Peer level: Passive vs. Active Measurements. Following the terminology in our previous work [8], peer-level measurements are active if the measurement probes initiate contact with BitTorrent peers, and passive if the measurement probes wait for externally initiated contacts. In contrast to passive measurements, active measurements require that peers are accessible, that is, that they are not behind a firewall. The 2007 measurement by Xie et al. [21] shows that firewalls may affect up to 90% of the peers in a large live streaming application, and that less than 20% of the peers can bypass firewalling through user-initiated configuration (UPnP). Thus, active measurements may lead to significantly reduced peer coverage.

3.3 Data Volume Reduction

The data volume is another major discriminant for peer-to-peer measurements. We define the complete data as the dataset comprising the complete state of the system at the time when the measurement was taken, and all the events that changed the system state during the measurement. Complete data ensures the maximum accuracy possible for a specific data source, but raises the requirements of the measurement. Reducing the data volume, for example by sampling events and/or peers, ensures that the measurement is feasible and sometimes that the measurement is resource-efficient. Measuring complete data for large BitTorrent communities even for one week would require the use of thousands of measurement machines and petabytes of storage. In our previous work [16, 8] we have used various data volume reduction techniques to be able to track such large communities using "only" hundreds of nodes and terabytes of storage. We distinguish three main types of techniques for data volume reduction:

ID     Trace description                       Period                      Sampling  Torrents  Sessions    Traffic
T1’04  BT-TUD-1, SuprNova                      Dec 2003 to Dec 2004        Hourly    32,452    n/a         n/a
       (General, Any Legal Status)             06 Dec 2003 to 17 Jan 2004  2.5 min   120       28,423,470  n/a
T2’05  BT-TUD-2, PirateBay                     05-11 May 2005              2.5 min   2,000     35,881,338  12 PB/year
       (General/Movies, Any Legal Status)
T3’05  LegalTorrents.com                       22 Mar to 19 Jul 2005       5 min     41        n/a         698 GB/day
T3’09  (General, Legal Content)                24 Sep 2009 onwards         5 min     183       n/a         1.1 TB/day
T4’05  etree.org                               22 Mar to 19 Jul 2005       15 min    52        165,168     9 GB/day
T4’09  (Recorded events, Only Legal Content)   24 Sep 2009 onwards         15 min    45        169,768     143 GB/day
T5’05  tlm-project.org                         22 Mar to 30 Apr 2005       10 min    264       149,071     735 GB/day
T5’09  (Linux, Only Legal Content)             24 Sep 2009 onwards         10 min    74        21,529      15 GB/day
T6’05  transamrit.net                          22 Mar to 19 Jul 2005       5 min     14        130,253     258 GB/day
T6’09  (Slackware Unix, Only Legal Content)    24 Sep 2009 onwards         5 min     60        61,011      840 GB/day
T7’05  unix-ag.uni-kl.de                       22 Mar to 19 Jul 2005       5 min     11        279,323     493 GB/day
T7’09  (Knoppix, Only Legal Content)           24 Sep 2009 onwards         5 min     12        160,522     348 GB/day
T8’05  idsoftware.com                          22 Mar to 19 Jul 2005       5 min     13        48,271      19 GB/day
T8’09  (Game Demos, Only Legal Content)        24 Sep 2009 onwards         5 min     37        14,697      12 GB/day
T9’05  boegenielsen.net                        22 Mar to 19 Jul 2005       5 min     15        36,391      308 GB/day
       (Knoppix, Only Legal Content)

Table 1: Summary of the datasets used in this work. Only the datasets for traces T1’04 and T2’05 have been previously analyzed [16, 8].

Sampling rate and Duration. Since peer-to-peer systems have properties that evolve over time, measurements have to observe the same property over time. The data volume is then the product of the sampling rate and the duration of the measurement. Reducing the sampling rate and/or the duration leads to data volume reduction and possibly to lower accuracy. In practice, sampling intervals of 2.5 minutes [16, 8] and even 30 minutes [2], and durations of a few days [8] to a few months [10], are common.
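The trade-off can be made concrete with a small Python sketch. It rests on a simplified model that is an assumption of this example rather than a description of the measurement infrastructure: snapshots are taken at integer multiples of the sampling interval, starting at time 0, and a session is observed only if at least one snapshot falls inside it. The function names are illustrative.

```python
import math


def samples_per_swarm(interval_min, duration_days):
    """Number of snapshots of one swarm: sampling rate times duration."""
    return int(duration_days * 24 * 60 // interval_min)


def is_observed(join, leave, interval):
    """True if a snapshot at some multiple of `interval` falls in [join, leave)."""
    first_snapshot = math.ceil(join / interval) * interval
    return first_snapshot < leave


# One week at a 2.5-minute interval versus a 30-minute interval.
week_fine = samples_per_swarm(2.5, 7)
week_coarse = samples_per_swarm(30, 7)
```

In this model, moving from a 2.5-minute to a 30-minute interval cuts the number of snapshots per swarm by a factor of 12, and any session that starts and ends between two consecutive snapshots is missed entirely, which is one way in which a lower sampling rate can reduce coverage.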

Number of communities and Number of swarms. Complete data on BitTorrent comprise all the swarms from all the BitTorrent communities. Collecting such data is impractical; moreover, many communities may share properties, and within a community the most populated swarms account for most of the BitTorrent traffic. Thus, measurements may reduce the volume of acquired data by reducing one or both of the number of communities and the number of swarms. In practice, measurements have often focused on one community [16, 8], or even on only one swarm [10]. Recently, Andrade et al. have measured four communities [2], but their commendable approach remains an exception among BitTorrent measurement studies.

Long-term dynamics. In the past decade, the evolution of BitTorrent has proven surprising. Overall, BitTorrent has emerged as a dominant Internet traffic generator. However, many of the BitTorrent communities have changed over time; some have disappeared due to lack of interest or corporate pressure. Thus, to prevent reported characteristics from becoming stale, measurements should make efforts to catch long-term system dynamics, including monthly, seasonal, yearly, and multi-year patterns; studying time patterns is a well-established topic for the Internet community [11, 9]. In practice, the only long-term studies related to BitTorrent are the five-month study of Izal et al. [10] and our own year-long measurement of SuprNova [16].


4 The Collected Traces

To understand and characterize the accuracy of BitTorrent measurements we acquired long-term traces from ten BitTorrent communities. Overall, the collected traces describe communities of hundreds of thousands of peers that are responsible for over 13 petabytes of data per year, as summarized in Table 1.

To ensure heterogeneity among the limited number of traces, we have taken the following controllable factors into account during trace collection. From the community perspective, the traces focus on communities with either general or very specific types of content; for example, the id software community focuses on sharing demos of games commercialized by id software. From the community size perspective, the traces range from the largest communities in the world at the time of the data collection down to small communities, both in terms of number of users and number of shared files. In terms of duration, all the traces are long-term, with the notable exception of the BitTorrent traces collected during just a few days in mid-2005, which nevertheless represent the largest collection of BitTorrent information per unit of time. With respect to the biases introduced by the very long-term community evolution, several of the collected traces include two datasets, one acquired in 2005 and one acquired in 2009.

In the remainder of this section we describe each of the traces in turn.

4.1 T1: BT-TUD-1, SuprNova

Datasets (1): T1’04

This trace was collected from the SuprNova community between 2003 and 2004. This community distributes various types of content with any legal status, and the trace contains data at both swarm level and peer level. Swarm-level data was collected from 32,452 swarms with an hourly sampling interval; it contains the number of seeders and leechers of each measured swarm over time, and descriptive information about the torrents, including file name, info hash, added time, and file size. Peer-level data was collected from 120 swarms between 06 Dec 2003 and 17 Jan 2004 with a sampling interval of 2.5 minutes; in total 28,423,470 sessions were captured, each containing the peer’s IP address, port number, download progress (number of downloaded chunks), and error messages.

4.2 T2: BT-TUD-2, PirateBay

Datasets (1): T2’05

This trace was collected from the ThePirateBay community between 05 May 2005 and 11 May 2005. This community distributes various types of content with any legal status. The trace contains data at both swarm level and peer level. Swarm-level data was collected from 2,000 swarms; it contains the number of seeders and leechers of each measured swarm over time, and descriptive information about the torrents, including file name, info hash, added time, and file size. Peer-level data was collected from 2,000 swarms with a sampling interval of 2.5 minutes; in total 35,881,338 sessions were captured, each containing the peer’s IP address, port number, client ID, download progress (number of downloaded chunks), and error messages. The estimated annual throughput of this community during that period is 12 PB.

4.3 T3: LegalTorrents.com

Datasets (2): T3’05, T3’09

T3’05 was collected from LegalTorrents.com between 22 Mar 2005 and 17 Jul 2005, and T3’09 has been collected from this community since 24 Sep 2009 with a 5-minute sampling interval. This community mainly distributes general types of content and only provides legal content. Both datasets only contain community-level data, namely the number of leechers and seeders, the total number of completed downloads, and the traffic of each swarm. Both datasets also contain descriptive information about the measured torrents, including file name, added time, file size, number of files in each torrent, and description.

In 2005, 41 swarms were measured and the daily throughput of this community was 698 GB. In 2009, 183 swarms have been measured so far, and the daily throughput of this community is 1.1 TB.

4.4 T4: etree.org

Datasets (2): T4’05, T4’09

T4’05 was collected from etree.org between 22 Mar 2005 and 17 Jul 2005, and T4’09 has been collected from this community since 24 Sep 2009 with a 15-minute sampling interval. This community mainly distributes recorded events and only provides legal content. Both datasets only contain swarm-level data, namely, for each peer in each swarm, the IP address with the last byte blinded, client type, port number, download amount, upload amount, connected time, sharing ratio, download progress, download speed, and upload speed. Both datasets also contain descriptive information about the measured torrents, including file name, infohash, added time, file size, number of files in each torrent, and torrent description.

In 2005, 165,168 sessions in 52 swarms were measured and the daily throughput of this community was 9 GB. In 2009, 169,768 sessions in 45 swarms have been measured so far, and the daily throughput of this community is 143 GB.

4.5 T5: tlm-project.org

Datasets (2): T5’05, T5’09

T5’05 was collected from tlm-project.org during the period between 22 Mar 2005 and 30 Apr 2005 with,and T5’09 has been collected from this community since 24 Sep 2009 with 10 minute sampling interval. Thiscommunity mainly distributes various linux distributions and only provides legal contents. Both datasets containcommunity level and swarm level data: community level data contains the number of leechers and seeders, totalnumber of completed downloads and traffic of each measured swarm; peer level data contains peer’s ip addresswith last byte blinded, port number, download amount, upload amount, download progress, connected time,sharing ratio in each swarm, and T5’09 also includes peer’s download and upload speed. And both datasetscontain descriptive information of torrents including file name, infohash, added time, file size, number of filesin each torrent.

In 2005, 149,071 sessions in 264 swarms were measured and the daily throughput of this community was 735 GB. In 2009, 21,529 sessions in 74 torrents have been measured so far and the daily throughput of this community is 15 GB.

4.6 T6: transamrit.net

Datasets (2): T6’05, T6’09

T6’05 was collected from transamrit.net between 22 Mar 2005 and 19 Jul 2005, and T6’09 has been collected from this community since 24 Sep 2009 with a 5-minute sampling interval. This community mainly distributes Slackware Linux distributions and only provides legal content. Both datasets contain community-level and swarm-level data: community-level data contains the number of leechers and seeders, the total number of completed downloads, and the traffic of each measured swarm; peer-level data contains each peer’s IP address with the last byte blinded, port number, download amount, upload amount, connected time, sharing ratio, download progress, download speed, and upload speed in each measured swarm. Both datasets also contain descriptive information about the torrents, including file name, infohash, added time, file size, and number of files in each torrent.


In 2005, 130,253 sessions in 14 swarms were measured and the daily throughput of this community was 258 GB. In 2009, 61,011 sessions in 60 swarms have been measured so far and the daily throughput of this community is 840 GB.

4.7 T7: unix-ag.uni-kl.de

Datasets (2): T7’05, T7’09

T7’05 was collected from unix-ag.uni-kl.de between 22 Mar 2005 and 19 Jul 2005, and T7’09 has been collected from this community since 24 Sep 2009 with a 5-minute sampling interval. This community mainly distributes Knoppix Linux distributions and only provides legal content. Both datasets contain community-level and swarm-level data: community-level data contains the number of leechers and seeders, the total number of completed downloads, the total traffic, and the average download progress of all participating peers of each swarm; peer-level data contains each peer’s IP address with the last byte blinded, port number, download amount, upload amount, connected time, sharing ratio, download progress, download speed, and upload speed in each measured swarm. Both datasets also contain descriptive information about the torrents, including file name, infohash, added time, file size, and number of files in each torrent.

In 2005, 279,323 sessions in 11 swarms were measured and the daily throughput of this community was 493 GB. In 2009, 160,522 sessions in 12 swarms have been measured so far and the daily throughput of this community is 348 GB.

4.8 T8: idsoftware.com

Datasets (2): T8’05, T8’09

T8’05 was collected from idsoftware.com between 22 Mar 2005 and 19 Jul 2005, and T8’09 has been collected from this community since 24 Sep 2009 with a 5-minute sampling interval. This community distributes demos of games from id Software and only provides legal content. Both datasets contain community-level and swarm-level data: community-level data contains the number of leechers and seeders in each swarm; peer-level data contains each peer’s IP address with the last byte blinded, port number, download amount, upload amount, connected time, download progress, and sharing ratio in each measured swarm. Both datasets also contain descriptive information about the torrents, including file name, infohash, added time, file size, and number of files in each torrent.

In 2005, 48,271 sessions in 13 swarms were measured and the daily throughput of this community was 19 GB. In 2009, 14,697 sessions in 37 swarms have been measured so far and the daily throughput of this community is 12 GB.

4.9 T9: boenielsen.dk

Datasets (1): T9’05

T9’05 was collected from boenielsen.dk between 22 Mar 2005 and 19 Jul 2005 with a 5-minute sampling interval. This community mainly distributed Knoppix Linux distributions and only provided legal content. The dataset contains community-level and swarm-level data: community-level data contains the number of leechers and seeders, the total number of completed downloads, the total traffic, and the average download progress of all peers of each swarm; peer-level data contains each peer’s IP address with the last byte blinded, port number, download amount, upload amount, connected time, download progress, and sharing ratio in the measured swarms. The dataset also contains descriptive information about the torrents, including file name, infohash, added time, file size, and number of files in each torrent.

In 2005, 36,391 sessions in 15 swarms were measured and the daily throughput of this community was 308 GB.


Figure 1: Comparison of swarm dynamics resulting from swarm-level measurement and peer-level measurement. (Axes: number of peers, 0 to 4000, vs. time, 20/12/03 to 17/01/04. Series: Swarm Level, Peer Level. Annotations: flashcrowd, 30% difference, peer-level measurement failure.)

Figure 2: Comparison of cumulative community throughput resulting from community-level measurement and swarm-level measurement. (Axes: community throughput in GB, 0 to 12000, vs. time, 16/04 to 21/05. Series: T7 ’05 Community Level, T7 ’05 Swarm Level. Annotation: 55% difference.)

5 The Results

In this section we investigate the impact of the techniques for data source selection and data volume reduction on the accuracy of BitTorrent measurements.

5.1 Inaccuracy Due to Data Source Selection

Method: Throughout the evaluation of inaccuracy due to data source selection, we compare characteristics extracted from measured datasets.

5.1.1 Measurement Level

Figure 3: Comparison of file size distributions in six BitTorrent communities in 2005. (Axes: CDF vs. file size in MB, 0 to 1200. Series: T1 Dec ’04, T2 ’05, T3 ’05, T4 ’05, T5 ’05, T8 ’05.)

Figure 4: Comparison of file size distributions in four BitTorrent communities in 2009. (Axes: CDF vs. torrent size in MB, 0 to 1200. Series: T3 ’09, T4 ’09, T5 ’09, T8 ’09.)

Finding: Measurements focusing on a single operational level of BitTorrent may lead to very low accuracy. For example, swarm 003 of T1’04 was tracked at both swarm and peer level. During the flashcrowd exhibited at the beginning of the swarm activity (see Figure 1), the peer-level coverage drops below 70% of the swarm-level data. Later during the measurements, the infrastructure becomes overloaded and the resulting failure leads to a 50% peer coverage for about half the duration of the flashcrowd. A similar effect can be observed when taking measurements at both community and swarm level. For T7’05, different levels of information aggregation (community and swarm) lead to errors of over 50% and thus very high inaccuracy (see Figure 2). Recommendation: Measure swarms or communities at multiple levels simultaneously.
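The cross-level comparison in this finding can be sketched as a per-timestamp ratio between the number of peers actually observed at peer level and the number of peers reported at swarm level. The data layout (dictionaries keyed by timestamp), the function name, and the sample numbers below are assumptions made for illustration; they are not taken from the traces.

```python
def peer_coverage(swarm_level: dict, peer_level: dict) -> dict:
    """Per-timestamp ratio of peers seen at peer level to peers reported at swarm level.

    swarm_level: timestamp -> number of peers reported for the swarm
    peer_level:  timestamp -> number of distinct peers observed individually
    """
    coverage = {}
    for ts, reported in swarm_level.items():
        if reported == 0:
            continue  # coverage is undefined for an empty swarm
        coverage[ts] = peer_level.get(ts, 0) / reported
    return coverage


# Illustrative numbers only.
swarm = {0: 1000, 1: 3500, 2: 3000}
peers = {0: 950, 1: 2400, 2: 1500}

cov = peer_coverage(swarm, peers)
# Timestamps at which the peer-level view misses more than 30% of the swarm.
low = [ts for ts, c in cov.items() if c < 0.7]
print(cov, low)
```

Tracking this ratio during a measurement gives an early signal that the peer-level infrastructure is falling behind, for instance during a flashcrowd.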

5.1.2 Community Type

Finding: Measuring different BitTorrent communities may lead to very different results. For several BitTorrent communities, we show in Figures 3, 4, 5, 6, and 7 the cumulative distribution functions (CDFs) of file sizes, download speed, upload speed, and session length. For example, the statistical properties of file sizes (Figures 3 and 4) differ significantly between communities. For these communities we do not see a correlation of the characteristics with the community focus on general vs. specific content (T1.Dec’04, T2’05, and T3’05 vs. T4’05, T5’05, and T8’05), or on content legality concerns (T1.Dec’04 and T2’05 vs. T3’05, T4’05, T5’05, and T8’05). Recommendation: Include BitTorrent communities of different types in the same measurement.

Figure 5: Comparison of download speed in four BitTorrent communities. (Axes: CDF vs. download speed in KB/s, 0 to 300. Series: T5 ’09, T6 ’09, T7 ’09, T8 ’09.)

Figure 6: Comparison of upload speed in four BitTorrent communities. (Axes: CDF vs. upload speed in KB/s, 0 to 50. Series: T5 ’09, T6 ’09, T7 ’09, T8 ’09.)

5.1.3 Passive vs. Active

Finding: Passive and active measurement results differ significantly. The presence of firewalled peers is significant in BitTorrent. For example, less than 60% of the peers are non-firewalled in the T1’04 (SuprNova) trace [16]. An in-depth analysis of the presence and behavior of firewalled peers for four communities was presented by Mol et al. [13]; their analysis also covers the data of T2’05 (The Pirate Bay), which were collected using both active and passive measurements. It turns out that only 34% of the peers discovered using the active measurements are non-firewalled and that 96% of the swarms have over 50% of their peers firewalled. Most importantly, the same study shows that the characteristics of firewalled and non-firewalled peers differ significantly. Notably, as BitTorrent rewards peers with (good) connectivity, non-firewalled peers exhibit 80% less uptime than firewalled peers. To reach good coverage, passive measurements require the deployment of an impractically large number of measurement points, which act as peers and wait to be contacted.


Figure 7: Comparison of session length in four BitTorrent communities. (Axes: CDF vs. session length in minutes, 0 to 1400. Series: T5 ’09, T6 ’09, T7 ’09, T8 ’09.)

Recommendation: Use both passive and active measurements for peer-level measurements.
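Statistics such as the share of firewalled peers per swarm, and the share of swarms in which more than half of the peers are firewalled, can be derived from peer-level records as sketched below. The record layout and the `is_firewalled` flag are hypothetical; how a peer is classified as firewalled is not specified in this section and is treated here as given input.

```python
from collections import defaultdict


def firewalled_share_per_swarm(records):
    """records: iterable of (swarm_id, peer_id, is_firewalled) tuples.

    Returns swarm_id -> fraction of that swarm's peers flagged as firewalled.
    """
    total = defaultdict(int)
    firewalled = defaultdict(int)
    for swarm_id, _peer_id, is_firewalled in records:
        total[swarm_id] += 1
        if is_firewalled:
            firewalled[swarm_id] += 1
    return {swarm: firewalled[swarm] / total[swarm] for swarm in total}


def share_of_swarms_above(shares, threshold=0.5):
    """Fraction of swarms whose firewalled share exceeds the threshold."""
    if not shares:
        return 0.0
    return sum(1 for value in shares.values() if value > threshold) / len(shares)


# Illustrative records only.
records = [
    ("a", "p1", True), ("a", "p2", True), ("a", "p3", False),
    ("b", "p4", True), ("b", "p5", False),
]
shares = firewalled_share_per_swarm(records)
print(shares, share_of_swarms_above(shares))
```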

5.2 Inaccuracy Due to Data Volume Reduction

5.2.1 Sampling rate and Duration

Method: To understand the effect of the sampling rate, as a first step we select a dataset measured with a low sampling interval (the highest sampling rate), from which we extract a contiguous block with the complete data for one month. We consider this block the complete data (in the sense of perfectly accurate) for the remainder of the analysis; we call this block the original dataset, and the sampling interval of this dataset the original sampling interval. As a second step, we sample from the selected dataset at various intervals, which are multiples of the original sampling interval, and report the accuracy of the datasets obtained with the new sampling intervals relative to the original dataset. We use a similar approach to understand the effect of the measurement duration, with the following changes. In the first step we select a one-month dataset; we call the measurement duration of the selected dataset the original duration. In the second step, we consider the blocks of data from the beginning of the selected dataset and with various durations; the considered durations are power-of-two divisions of the original duration, such as 1/2, 1/4, 1/8, etc. By applying both the sampling interval enlargement and the duration reduction on the same dataset we can understand which of these two data volume reduction techniques has a bigger impact on the measurement accuracy.
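The two reductions described in this method can be sketched on a toy dataset as follows. The sketch assumes that a dataset is a list of snapshots taken at the original sampling interval, each snapshot being the set of peer identifiers seen at that time, and that coverage is the fraction of the original dataset's peers that still appear in the reduced dataset. These representations and all function names are illustrative assumptions, not the report's implementation.

```python
def peers_seen(snapshots):
    """Union of all peer identifiers observed in a list of snapshots."""
    seen = set()
    for snapshot in snapshots:
        seen |= snapshot
    return seen


def enlarge_interval(snapshots, factor):
    """Keep every factor-th snapshot, i.e. sample at factor times the original interval."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return snapshots[::factor]


def reduce_duration(snapshots, divisor):
    """Keep the first 1/divisor of the snapshots, counted from the beginning."""
    if divisor < 1:
        raise ValueError("divisor must be a positive integer")
    return snapshots[:max(1, len(snapshots) // divisor)]


def coverage(original, reduced):
    """Fraction of the peers in the original dataset that the reduced dataset still sees."""
    total = peers_seen(original)
    if not total:
        return 1.0
    return len(peers_seen(reduced) & total) / len(total)


# Toy dataset: eight snapshots at the original sampling interval.
original = [{"a", "b"}, {"b", "c"}, {"c"}, {"d"}, {"a"}, {"e"}, {"e", "f"}, {"g"}]

for k in (2, 4):
    by_interval = coverage(original, enlarge_interval(original, k))
    by_duration = coverage(original, reduce_duration(original, k))
    print(k, round(by_interval, 2), round(by_duration, 2))
```

Applying both reductions with the same factor to the same dataset, as in the loop above, is what allows the two techniques to be compared directly. The toy numbers carry no meaning beyond illustrating the procedure.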

Finding: When measuring at peer level, a higher sampling interval leads to lower accuracy. In particular, a sampling interval above 15 minutes may lead to very low accuracy. Figure 8 shows for an exemplary swarm, swarm 003 of T1’04, that the hourly peer coverage drops below 80% when the sampling interval is increased from 2.5 minutes (the original sampling interval) to 15 minutes, and below 60% when the sampling interval is increased to 30 minutes. Figure 9 confirms these results for more swarms in T1’04.

Finding: When measuring at peer level, a higher sampling interval also leads to higher accuracy variance. Figure 10 depicts as a boxplot the basic statistical properties (i.e., median, Q1, Q3) of the accuracy values observed for the whole dataset, for various sampling intervals. Only measurements taken with a sampling interval of 7.5 (12.5) minutes or lower have over 90% (80%) median accuracy. The expected variance, defined as the inter-quartile range and indicated in Figure 10 by boxes, indicates that only measurements with a sampling interval of at most 10 minutes result in over 80% accuracy. These results confirm the variance that can be visually observed in Figure 8. Recommendation: Measurements at peer level must be taken at a rate of more than one sample every 15 minutes, and should target at least one sample every 10 minutes.

Figure 8: Comparison of transient peer coverage resulting from measuring at different sampling intervals. Only the first 300 hours are displayed. (Data from T1’04.003.) (Axes: percentage (%) vs. time in hours since 2003.12.20, 15:00. Series: 2.5 minutes, 5 minutes, 15 minutes, 30 minutes.)

Figure 9: Average transient peer coverage resulting from measuring at different sampling intervals. (Data for four swarms of T1’04.) (Axes: average peer coverage (%) vs. normalized sampling interval, unit = 2.5 min. Series: T1 ’04 swarms 003, 005, 009, and 010. Annotations: original measurement, 15 min.)

Figure 10: Measurement results become more unstable with large sampling intervals. The accuracy levels of 90%, 80%, and 60% are emphasized. (Data from T1.003.)

Finding: Reducing the measurement duration quickly reduces the coverage of the measurements. A doubling of the sampling interval leads to lower accuracy loss than a halving of the measurement duration. Figure 11 depicts for several datasets the average peer coverage resulting from various measurement durations, including the original duration. The different datasets exhibit different losses of accuracy for initial reductions of the measurement duration, but quickly converge to over 80% loss of coverage. After the initial duration halving (to 1/2 of the original duration), swarm 003 from the trace T1’04 is the least affected, at over 80% coverage, but the coverage of the complete community in T5’05 is already below 40%. The large difference is the result of the system state: swarm 003 exhibits a large flashcrowd [16] in which the peers are caught for at least a week until obtaining the content they want, while in the tlm-project.org community the peers obtain results quickly and can then leave the swarm without returning. Recommendation: Measurements should be taken over a period of at least one month. Avoid reducing the volume of the data by reducing the measurement duration. Doubling the sampling interval is preferable to halving the measurement duration.

Figure 11: Comparison of overall peer coverage resulting from measuring for different sampling durations. (Axes: overall peer coverage (%) vs. measurement duration, from 4 weeks down to 1/16 week. Series: T1 ’04 swarm 003, T1 ’04 swarm 009, T5 ’05. Annotation: original measurement.)

5.2.2 Number of Communities and Number of Torrents

Figure 12: Comparison of session length distributions resulting from measuring different numbers of communities. (Axes: CDF vs. session length in minutes, 0 to 2000. Series: 6, 5, 4, 3, 2 communities, and 1 community. Annotation: 8% difference.)

Figure 13: Comparison of download speed distributions resulting from measuring different numbers of communities. (Axes: CDF vs. download speed in KB/s, 0 to 200. Series: 6, 5, 4, 3, 2 communities, and 1 community. Annotation: 8% difference.)

Method: To understand the effect of the number of communities included in the measurement, in Step 1 we select from six communities, T4 through T9, the first month of data from their 2005 datasets; the first month is the same for each of these datasets. In Step 2, we order the six communities by the total amount of traffic generated by each community during the selected month. In Step 3, we compute all the investigated characteristics from all the selected datasets. In Step 4, we iteratively remove from the considered datasets the dataset corresponding to the community with the lowest rank (total traffic) that was considered at the previous iteration, and repeat from Step 3 until only one community is left for analysis. We apply a similar four-step approach to understand the impact of the number of torrents included in the measurement. We conservatively assume that there exists some way to order the swarms a priori by the amount of traffic they generate; this is often possible, for example for highly anticipated torrents such as blockbuster movies, and this approach has been taken by many reported measurements [16, 8]. In Step 1, we select each of the swarms in a community as an independent dataset. In Step 2, we rank all the swarms in a community according to the total amount of traffic generated by each swarm. In Step 3, we compute all the investigated characteristics for all the considered datasets. In Step 4, we iteratively remove from the considered datasets the lowest ranked swarms that were considered at the previous iteration, and repeat from Step 3 until only one swarm is left for analysis. We are thus able to analyze both absolute (e.g., from 100 to 50) and relative (e.g., from 100% to 50%) reductions in the number of swarms included in the measurement.
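Steps 2 through 4 of this method can be sketched as a loop that ranks datasets by total traffic, computes a characteristic over the pooled data, and drops the lowest-ranked dataset until one remains. The sketch removes one dataset per iteration and uses the median as a stand-in for "all the investigated characteristics"; the names, the input layout, and the sample values are illustrative assumptions.

```python
from statistics import median


def ranked_by_traffic(traffic):
    """Dataset names ordered from highest to lowest total traffic (Step 2)."""
    return sorted(traffic, key=traffic.get, reverse=True)


def iterative_reduction(traffic, samples, characteristic):
    """Apply Steps 2 to 4 and return a list of (number of datasets, value) pairs.

    traffic: dataset name -> total traffic, used only for ranking
    samples: dataset name -> list of observed values (e.g. session lengths)
    characteristic: function computed over the pooled observations (Step 3)
    """
    order = ranked_by_traffic(traffic)
    results = []
    while order:
        pooled = [value for name in order for value in samples[name]]
        results.append((len(order), characteristic(pooled)))
        order = order[:-1]  # Step 4: drop the lowest-ranked dataset
    return results


# Illustrative values only.
traffic = {"A": 735, "B": 493, "C": 19}
samples = {"A": [10, 20, 30], "B": [40, 50], "C": [5]}

results = iterative_reduction(traffic, samples, median)
print(results)
```

Comparing consecutive entries of the result shows how much a characteristic shifts each time a community or swarm is left out of the measurement.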

Finding: Measuring only one community is insufficient to obtain representative BitTorrent results. Figures 12, 13, and 14 depict the session length, download speed, and download amount per session CDFs for a varying number of communities. The session length CDF stabilizes only after four or more communities are considered together. The download speed and download amount CDFs stabilize after more than one community is considered. The upload speed distribution, shown in Figure 15, is the only investigated characteristic that does not require more than one community to be measured. Recommendation: Measure at least four communities simultaneously.

Figure 14: Comparison of download amount per peer distributions resulting from measuring different numbers of communities. (Axes: CDF vs. download amount in MB, 0 to 1000. Series: 6, 5, 4, 3, 2 communities, and 1 community.)

Figure 15: Comparison of upload speed distributions resulting from measuring different numbers of communities. (Axes: CDF vs. upload speed in KB/s, 0 to 60. Series: 6, 5, 4, 3, 2 communities, and 1 community.)

Finding: Measuring a single swarm is insufficient to obtain representative results for the swarm’s community. Figure 16 depicts the download speed characteristic for various numbers of swarms from the Unix-AG community (trace T7’09): the characteristics of the top-ranked swarm and of the top-12 swarms are significantly different (over 8%). To better compare the impact of the number of selected swarms, we display a Kiviat diagram with six axes, one for each characteristic (Figure 17). Six datasets are considered, each corresponding to a number of selected swarms, from 1 to 100. For each dataset, the displayed value on each axis is the average value obtained from the dataset normalized by the largest average value found for all datasets. The figure shows the difference between the top-ranked swarm and the other datasets comprising multiple swarms, and characterizes the complex differences between the multiple-swarm datasets. Recommendation: Include more than one swarm in the measurements. A different number of swarms leads to different results. Depending on the measurement scenario, including 20 to 50 swarms in the measurements provides a good trade-off between result accuracy and data volume.
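The normalization used for the Kiviat diagram (each average divided by the largest average found for that characteristic across all datasets) can be sketched as follows. The dictionary layout, the function name, and the sample values are assumptions for illustration.

```python
def normalize_axes(averages):
    """Normalize each characteristic by its largest average across datasets.

    averages: dataset name -> {characteristic name: average value}
    Returns the same structure with values scaled into [0, 1] for non-negative input.
    """
    axes = {axis for values in averages.values() for axis in values}
    maxima = {
        axis: max(values[axis] for values in averages.values() if axis in values)
        for axis in axes
    }
    return {
        name: {
            axis: (value / maxima[axis] if maxima[axis] else 0.0)
            for axis, value in values.items()
        }
        for name, values in averages.items()
    }


# Illustrative averages only.
averages = {
    "top swarm": {"swarm size": 400.0, "download speed": 50.0},
    "top 5 swarms": {"swarm size": 200.0, "download speed": 80.0},
}
normalized = normalize_axes(averages)
print(normalized)
```

With this scaling, every axis has at least one dataset at value 1, so characteristics with very different units can share one diagram.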


Figure 16: Comparison of download speed distributions resulting from measuring different numbers of torrents in one community. (Axes: CDF vs. download speed in KB/s, 0 to 500. Series: T7 ’09 with 12, 6, 3, and 1 swarms. Annotation: 8% difference.)

Figure 17: Comparison of main characteristics resulting from datasets acquired from a specific number of torrents coming from one community. (Kiviat diagram axes: average completed downloads, average number of sessions, average swarm size, average download speed, average arrival rate, average session length. Series: top swarm (by number of peers), top 5, top 10, top 20, top 50, and top 100 swarms.)

5.2.3 Catching long-term dynamics

Method: To show evidence of the long-term evolution of BitTorrent, we first extract blocks of contiguous data from our long-term traces and then compare them.

Figure 18: Comparison of monthly file size distribution from T1 (SuprNova). From 12 months of data only every second month is depicted. (Axes: CDF vs. file size in MB, 0 to 4000. Series: T1 ’03 Dec, T1 ’04 Feb, T1 ’04 Apr, T1 ’04 Jun, T1 ’04 Aug, T1 ’04 Oct. Annotation: 8% difference.)

Figure 19: Comparison of monthly swarm size distributions from T1 (SuprNova). From 12 months of data only every second month is depicted. (Axes: CDF vs. swarm size, 0 to 1000. Series: T1 ’03 Dec, T1 ’04 Feb, T1 ’04 Apr, T1 ’04 Jun, T1 ’04 Aug, T1 ’04 Oct.)

Finding: Yearly and seasonal patterns exist in BitTorrent, but different communities exhibit different yearly and seasonal evolution trends. In Figure 3, the file size CDFs for T1’04 and T2’05 indicate that a significant file size decrease may occur over the course of a single year; we now investigate this effect. Figure 18 depicts the evolution of the file sizes from Dec 2003 to Nov 2004; for clarity, the figure only shows curves corresponding to every second month. The measurements taken in Dec 2003 reveal very different values of this characteristic compared with the other measurements (8%, or a D metric value of 0.08). Smaller differences appear between consecutive months, and overall the file sizes decrease slowly over time (the curves have similar shape and "move" towards the right side of the graph). The changes in the swarm size distribution for T1’04 are depicted in Figure 19. There are some indications of a seasonal pattern: high swarm sizes occur in April, June, and December, typical vacation months, and low swarm sizes occur in August and October, typical work months. We conclude that yearly and seasonal patterns exist in BitTorrent. Figure 20 depicts the evolution of the total monthly throughput for three communities, T3’05, T7’05, and T9’05, over a period of three months. The total monthly throughput evolves differently for the three communities; for example, for T7’05 the total monthly throughput first increases and then decreases. We conclude that the evolution trends are not community-independent. Recommendation: Measurements that follow several system characteristics should be longer than three months, and preferably at least a year long.
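The differences between CDFs quoted in this section (for example 8%, or a D value of 0.08) can be computed as sketched below, assuming that D denotes the maximum vertical distance between two empirical CDFs, as in the Kolmogorov-Smirnov statistic. The metric is defined elsewhere in the report, so this reading is an assumption, and the function names and sample values are illustrative.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the empirical CDF of a non-empty sample as a function of x."""
    if not sample:
        raise ValueError("sample must not be empty")
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        # Fraction of observations that are less than or equal to x.
        return bisect_right(ordered, x) / n

    return cdf


def d_metric(sample_a, sample_b):
    """Maximum vertical distance between the two empirical CDFs."""
    cdf_a, cdf_b = ecdf(sample_a), ecdf(sample_b)
    # Both CDFs are step functions that only change at observed values,
    # so the maximum distance is attained at one of those values.
    points = set(sample_a) | set(sample_b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)


# Illustrative file sizes in MB for two months.
month_a = [100, 200, 300, 400]
month_b = [100, 200, 300, 700]
print(d_metric(month_a, month_b))
```

Because the metric is bounded between 0 and 1, it can be read directly as a percentage difference between two distributions.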

Figure 20: Comparison of monthly throughput of 3 communities in 2005.

Figure 21: Change of download speed in different continents and in different years. (Data from trace BT-TUD-1, BT-TUD-2.) (Panels, each plotting CDF vs. download speed in KB/s, 0 to 80, for ’04 and ’05: World, Asia, Europe, South America, North America, Africa.)

Figure 22: Change of download speed from 2005 to 2009, by community. (Panels, each plotting CDF vs. download speed in KB/s, 0 to 300, for ’05 and ’09: T5, T6, T7, T8.)

Figure 23: Change of file size from 2005 to 2009, by community. (Panels, each plotting CDF vs. torrent size in MB, 0 to 1200, for ’05 and ’09: T3, T4, T5, T8.)

Finding: Multi-year evolution is present in BitTorrent, but it is difficult to characterize. In our previous work [8] we have observed that the average application-level download speed (the main QoS indicator for BitTorrent) doubled between T1’04 and T2’05. We now show that the evolution is not consistent across all users. Figure 21 depicts the download speed distributions for T1’04 and T2’05 with users grouped by continent. The top-left sub-graph confirms our previous remark; the doubling holds for the whole distribution of users, including the median. The median for EU users more than doubled, which compensates for the lower increase in download speed for the other continents. North American and Asian users show similar median download speed growth. There is a steep increase in the number of users with very high download bandwidth in both the EU and North America, but not in the remaining continents. We show a similar effect for traces T3-T8 in Figure 22: the download speed increases from 2005 to 2009, but the increase varies greatly by community. Similarly, we show in Figure 23 evidence that the file size distribution changed from 2005 to 2009, but the actual "direction" of the change varies greatly by community. Recommendation: Repeat the measurements yearly.
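A per-group comparison of this kind (by continent or by community) can be sketched as the ratio of median download speeds between two measurement years. The grouping keys, the function name, and the sample values below are illustrative assumptions and are not taken from the traces.

```python
from statistics import median


def median_growth(speeds_before, speeds_after):
    """Ratio of median download speed (after / before) for each group.

    Both arguments map a group name to a list of observed download speeds.
    Groups missing from either year, or with a zero median before, are skipped.
    """
    growth = {}
    for group in speeds_before.keys() & speeds_after.keys():
        before = median(speeds_before[group])
        if before > 0:
            growth[group] = median(speeds_after[group]) / before
    return growth


# Illustrative download speeds in KB/s.
speeds_2004 = {"Europe": [10, 20, 30], "Asia": [10, 20, 30]}
speeds_2005 = {"Europe": [30, 50, 70], "Asia": [20, 30, 40]}

growth = median_growth(speeds_2004, speeds_2005)
print(growth)
```

A wide spread of ratios across groups is one simple indication that an aggregate trend, such as a doubling of the overall median, does not hold uniformly for all users.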

6 Related Work

Much previous work was dedicated to measurements of real P2P file-sharing networks [17, 18, 3, 5, 10, 16, 7], but only a few reported results include references to the bias of the measurements [8, 20, 2]. Overall, prior to this work there exists no comprehensive study of the sources of bias, and only limited solutions for reducing bias have been proposed.

The studies that did not focus on bias include many of the bias sources analyzed in our work. Most of these studies span only a few days [3] or weeks. Similarly, many of these studies collect information from a particular location, such as a university [5] or a router in the Internet backbone [18]; thus, their results include a vantage point effect [15]. Other studies cover only a few files [10] or just one community [6].

The studies that did recognize the importance of addressing measurement biases focused on a limited subset of the sources analyzed in our work. In general, under the assumption that "more is better", these studies obtained data over long periods of time [20, 2], from more peers and from peers located all over the world [7, 8, 2], for more files [8] and more communities [2], and filtered the raw data before analysis to eliminate potential biases [7, 2]. However, these studies did not eliminate all the sources of bias investigated in our work, and did not quantify the extent of the bias for the sources they did consider.

Close to our work, Stutzbach et al. [19] assess the bias incurred by sampling data from unstructured P2P file-sharing networks, and propose the MRWB technique for collecting nearly unbiased samples from unstructured networks. Their technique is designed specifically for unstructured networks such as Gnutella, in that it relies on the ability of the measurement tools to select nodes based on their different connectivity degree in the network graph. Thus, MRWB cannot be applied to traditional BitTorrent networks, in which all nodes have a connectivity degree of 1 to the measurement point (the tracker). Also, this body of related work does not consider the case of disjoint networks, which are common in BitTorrent either as independent communities or as independent swarms within the same community.

Closest to our work, Stutzbach et al. [20] investigate the trade-off between sampling rate and the bias incurred by sampling data from unstructured P2P file-sharing networks. However, their analysis results are specific to unstructured networks in that the maximum crawling rate can exceed the rate of meaningful events such as peer arrival and departure. Thus, their results do not hold for BitTorrent networks, where the default tracker settings limit the "crawl" to less than 500 peers per hour.

7 Conclusion and Future Work

Accurate measurements of real BitTorrent deployments are required to improve quality of service for millions of BitTorrent users. In this work we have presented the first thorough investigation of the factors that influence BitTorrent measurement accuracy. Towards this end, we have first proposed a method for evaluating accuracy. Our method includes a taxonomy of sources of inaccurate results comprising two axes (data source selection and data volume reduction) totaling six inaccuracy sources, and two categories of metrics for quantifying accuracy. Then, we have evaluated the effects of the different sources of inaccuracy using 15 real traces taken from 9 BitTorrent communities. Our results indicate that current BitTorrent measurement techniques fail to consider most of the six sources of inaccuracy, and thus introduce high and often difficult-to-characterize data inaccuracies.

We plan to extend our work towards a complete method for accurate and low-volume BitTorrent measurements.

8 Acknowledgements

The research leading to this contribution has received funding from the European Community’s Seventh Framework Programme in the P2P-Next project under grant no. 216217.

References

[1] ipoque internet studies, 2006-2009. [Online] Available: www.ipoque.com/resources/internet-studies/.

[2] N. Andrade, E. Santos-Neto, F. V. Brasileiro, and M. Ripeanu. Resource demand and supply in bittorrent content-sharing communities. Computer Networks, 53(4):515–527, 2009.

[3] R. Bhagwan, S. Savage, and G. M. Voelker. Understanding availability. In IPTPS, pages 256–267, 2003.

[4] P. Garbacki, D. H. J. Epema, and M. van Steen. Optimizing peer relationships in a super-peer network. In ICDCS, page 31, 2007.

[5] K. Gummadi, R. Dunn, S. Saroiu, S. Gribble, H. Levy, and J. Zahorjan. Measurement, modeling, and analysis of a peer-to-peer file-sharing workload. In ACM Symp. on Operating Systems Principles (SOSP), 2003.

[6] L. Guo, S. Chen, Z. Xiao, E. Tan, X. Ding, and X. Zhang. Measurements, analysis, and modeling of bittorrent-like systems. In Internet Measurement Conference, pages 35–48, 2005.

[7] S. B. Handurukande, A.-M. Kermarrec, F. L. Fessant, L. Massoulie, and S. Patarin. Peer sharing behaviour in the edonkey network, and implications for the design of server-less file sharing systems. In EuroSys, pages 359–371, 2006.

[8] A. Iosup, P. Garbacki, J. A. Pouwelse, and D. H. J. Epema. Correlating topology and path characteristics of overlay networks and the internet. In IEEE/ACM Int’l. Symp. on Cluster Computing and the Grid (CCGrid) Workshops, GP2PC, page 10, 2006.

[9] A. Iyengar, M. S. Squillante, and L. Zhang. Analysis and characterization of large-scale web server access patterns and performance. World Wide Web, 2(1-2):85–100, 1999.

[10] M. Izal et al. Dissecting BitTorrent: Five Months in a Torrent’s Lifetime. In Proc. of PAM, pages 1–11, Antibes Juan-les-Pins, France, Apr 2004.

[11] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar nature of ethernet traffic (extended version). IEEE/ACM Trans. Netw., 2(1):1–15, 1994.

[12] H. W. Lilliefors. On the kolmogorov-smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62:399–402, 1967.

[13] J. Mol, J. Pouwelse, D. Epema, and H. Sips. Free-riding, fairness, and firewalls in p2p file-sharing. In IEEE Int’l. Conf. on Peer-to-Peer Computing (P2P), pages 301–310, 2008.

[14] A. Parker. The True Picture of Peer-To-Peer File-Sharing, 2005. Panel Presentation, IEEE Int’l. Workshop on Web Content Caching and Distribution.

[15] V. Paxson. Strategies for Sound Internet Measurement. In Proc. of ACM/USENIX IMC, pages 263–271, Oct 2004.

[16] J. A. Pouwelse, P. Garbacki, D. H. J. Epema, and H. J. Sips. The bittorrent p2p file-sharing system: Measurements and analysis. In IPTPS, volume 3640 of LNCS, pages 205–216. Springer, 2005.

[17] S. Saroiu, P. K. Gummadi, and S. Gribble. A measurement study of peer-to-peer file sharing systems. In Multimedia Computing and Networking (MMCN ’02), January 2002.

[18] S. Sen and J. Wang. Analyzing peer-to-peer traffic across large networks. In Proc. of ACM SIGCOMM IMW, pages 137–150, 2002.

[19] D. Stutzbach, R. Rejaie, N. G. Duffield, S. Sen, and W. Willinger. On unbiased sampling for unstructured peer-to-peer networks. IEEE/ACM Trans. Netw., 17(2):377–390, 2009.

[20] D. Stutzbach, R. Rejaie, and S. Sen. Characterizing unstructured overlay topologies in modern p2p file-sharing systems. IEEE/ACM Trans. Netw., 16(2):267–280, 2008.

[21] S. Xie, G. Y. Keung, and B. Li. A measurement of a large-scale peer-to-peer live video streaming system. In ICPPW ’07: Proceedings of the 2007 International Conference on Parallel Processing Workshops, page 57, Washington, DC, USA, 2007. IEEE Computer Society.

[22] B. Zhang, A. Iosup, P. Garbacki, and J. Pouwelse. A unified format for traces of peer-to-peer systems. In LSAP ’09: Proceedings of the 1st ACM workshop on Large-Scale system and application performance, pages 27–34, New York, NY, USA, 2009. ACM.
