
Collusive Piracy Prevention


    Collusive Piracy Prevention in P2P Content Delivery Networks

    Xiaosong Lou, Student Member, IEEE, and Kai Hwang, Fellow, IEEE

    Abstract: Collusive piracy is the main source of intellectual property violations within the boundary of a P2P network. Paid clients (colluders) may illegally share copyrighted content files with unpaid clients (pirates). Such online piracy has hindered the use of open P2P networks for commercial content delivery. We propose a proactive content poisoning scheme to stop colluders and pirates from alleged copyright infringements in P2P file sharing. The basic idea is to detect pirates in a timely manner with identity-based signatures and time-stamped tokens. The scheme stops collusive piracy without hurting legitimate P2P clients by targeting poisoning exclusively on detected violators. We developed a new peer authorization protocol (PAP) to distinguish pirates from legitimate clients. Detected pirates will receive poisoned chunks in their repeated attempts. Pirates are thus severely penalized with no chance to download successfully in tolerable time. Based on simulation results, we find a 99.9 percent prevention rate in Gnutella, KaZaA, and Freenet. We achieved an 85-98 percent prevention rate on eMule, eDonkey, Morpheus, etc. The scheme is shown to be less effective in protecting some poison-resilient networks like BitTorrent and Azureus. Our work opens up the low-cost P2P technology for copyrighted content delivery. The advantage lies mainly in minimum delivery cost, higher content availability, and copyright compliance in exploring P2P network resources.

    Index Terms: Peer-to-peer networks, content poisoning, copyright protection, network security.

    1 INTRODUCTION

    PEER-TO-PEER (P2P) networks are most cost-effective in delivering large files to a massive number of users [3], [33]. Unfortunately, today's P2P networks are grossly abused by illegal distributions of music, games, video streams, and popular software. These abuses have not only resulted in heavy financial loss in the media and content industry, but also hindered the legal commercial use of P2P technology.

    The main sources of illegal file sharing are peers who ignore copyright laws and collude with pirates. To solve this peer collusion problem, we propose a copyright-compliant system for legalized P2P content delivery. Our goal is to stop collusive piracy within the boundary of a P2P content delivery network. In particular, our scheme is suited to protecting large-scale perishable contents that diminish in value as time elapses.

    Traditional content delivery networks (CDNs) [13], [17], [24] use a large number of surrogate content servers over many globally distributed WANs. The content distributors need to replicate or cache contents on many servers. The bandwidth demand and resources needed to maintain these CDNs are very expensive.

    A P2P content network significantly reduces the distribution cost [27], since many content servers are eliminated and open networks are used. P2P networks improve the content availability, as any peer can serve as a content provider. P2P networks are inherently scalable, because more providers lead to faster content delivery.

    We use identity-based signatures (IBS) [4] to secure file indexes. IBS offers a similar level of security as PKI-based signatures with much less overhead. We apply discriminatory content poisoning against pirates. We focus on protection of decentralized P2P content networks. Protecting centralized P2P networks like Napster is much simpler than the scheme we propose because of centralized indexing.

    Honest or legitimate clients are those that comply with the copyright law not to share contents freely. Pirates are peers attempting to download some content files without paying or authorization. The colluders are those paid clients who share the contents with pirates. Pirates and colluders coexist with the law-abiding clients.

    Content poisoning is implemented by deliberate falsification of the file requested by a pirate. The media industry, backed by the Recording Industry Association of America (RIAA) and the Motion Picture Association of America (MPAA), has applied unscreened brute-force content poisoning to deter piracy in open P2P file-sharing networks. However, their prevention results are mixed and controversial.

    Overpeer [23] ceased operation in 2005 because its universal poisoning of all peer clients was ineffective. MediaDefender [1] caused a heated controversy in the P2P user community when its operations disrupted many P2P services. The eDonkey overlay was ordered to shut down in 2006 [24] due to uncontrollable piracy activities. The popular BitTorrent and Freenet networks are still facing many lawsuits against their content distribution operations [22].

    The media industry applies poisoning against all P2P file-sharing services by brute force. In contrast, our scheme detects unpaid pirates and uses discriminatory content poisoning to deter online piracy. Legitimate clients can still enjoy the flexibility and convenience provided by an open P2P network. Our scheme stops pirates from downloading

    The authors are with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089. E-mail: {xlou; kaihwang}@usc.edu.

    Manuscript received 28 Sept. 2007; revised 8 Apr. 2008; accepted 10 Sept. 2008; published online 27 Jan. 2009. Recommended for acceptance by X. Zhang. For information on obtaining reprints of this article, please send e-mail to: [email protected] and reference IEEECS Log No TC-2007-09-0492.

    Digital Object Identifier no. 10.1109/TC.2009.26.

    IEEE TRANSACTIONS ON COMPUTERS, VOL. 58, NO. 7, JULY 2010

    0018-9340/09/$25.00 2010 IEEE Published by the IEEE Computer Society


    copyrighted files, even in the presence of colluding peers. We use a reputation scheme to detect these colluders.

    A copyright-protected P2P network should benefit both the media industry and Internet user communities [6]. Our work leads to the development of a new generation of CDNs based on P2P technology. Table 1 lists important symbols and notations used in this paper. These terms are used to secure file indexes, generate access tokens, quantify poisoning effects, prevent collusion, and define the performance metrics.

    We focus on finding a solution to collusive piracy within the scope of a P2P network. Internetwork piracy between unprotected networks is a much more complex security problem. Our main purpose is to stop colluders from releasing content files freely and to disrupt pirates' efforts to accumulate clean chunks. There are many other forms of online or offline piracy that are beyond the scope of this study.

    For example, our protection scheme does not work on a private or enclosed network formed by pirate hosts exclusively. We did not solve the randomized piracy problems using email attachments, direct FTP download between colluders, or replicated CDs or DVDs. At present, these direct point-to-point copyright violation problems are mostly handled by digital rights management (DRM) techniques [21]; even so, the protection results are not considered satisfactory [10], as many hackers have posted DRM cracks on the Internet.

    2 RELATED WORK AND OUR APPROACH

    First, we review related work on copyrighted P2P content delivery. Then we identify our unique approach to solving the problem in P2P networks.

    2.1 Related Work

    A P2P network does not require many expensive servers to deliver contents. Instead, contents are distributed and shared among the peers. P2P networks improve on conventional CDNs in content availability and system scalability [3]. Many performance and security issues in P2P networks have been studied, such as in [5], [12], [30], [31], [32]. Digital content protection in P2P networks was also studied in [15], [29].

    Electronic publishing was hindered by the rapid growth of copyright violations [14], [26]. The major source of illegal P2P content distribution lies in peer collusion to share copyrighted content with other peers or pirates. Kalker et al. [16] proposed a copyrighted music distribution system over a P2P network. However, the system is ineffective when colluders are undetected.

    Digital watermarking is often considered for digital copyright protection [19]. A digital watermark is injected into the content file so that when a pirated copy is discovered, authorities can find the origin of piracy via a unique watermark in each copy. In a P2P network, all peers are sharing exactly the same file (if not poisoned), which effectively defeats the purpose of watermarking. Thus, watermarking is not a suitable technology for P2P file sharing.

    By subdividing a large file into small chunks, P2P content delivery allows a peer to download multiple chunks from different sources. File chunking increases the availability and shortens the download time. Based on the file chunking and hashing protocols built into popular open P2P networks, we classify them into three network families, as shown in Table 2.

    P2P networks in the same family have some common features. They are variants or descendants of their predecessors: BitTorrent [2], Gnutella [11], and eMule [18], respectively. These families are primarily distinguished by the file chunking or hashing protocols applied. Presently, none of these P2P networks has been built with satisfactory support for copyright protection.

    The BitTorrent family [2] applies the strongest hashing at the individual piece or chunk level, which is most resistant to poisoning. The Gnutella family applies file-level hashing, which is easily poisoned. The eMule family [18] applies


    TABLE 1. Parameters and Notations Used in Paper

    TABLE 2. File Chunking, Hashing, Poisoning, and Download Policies in P2P Content Networks


    part-level hashing with fixed chunking. Our analytical and experimental results show that the eMule family demonstrates a moderate level of resistance to content poisoning.

    The ability to detect and identify poisoned chunks differs across the three P2P network families. The BitTorrent family keeps clean chunks and discards poisoned chunks. The Gnutella family must download the entire file before any poisoned chunks can be detected. Because of the part-level hashing, the eMule family will either keep or discard the entire part of 53 chunks.
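The effect of hashing granularity on poison detection can be illustrated with a small sketch. It is not taken from the paper: the function name `surviving_chunks`, the 106-chunk file, and the single poisoned chunk are hypothetical choices made only for illustration. The one figure borrowed from the text is the eMule part size of 53 chunks. The sketch assumes that a downloader must discard every chunk that shares a hash with a poisoned chunk.

```python
def surviving_chunks(num_chunks, poisoned, hash_group_size):
    """Return indexes of chunks the downloader can keep.

    Chunks are hashed in groups of `hash_group_size`. If any chunk in a
    group is poisoned, the hash check fails and the whole group is dropped.
    """
    kept = []
    for start in range(0, num_chunks, hash_group_size):
        group = range(start, min(start + hash_group_size, num_chunks))
        if not any(index in poisoned for index in group):
            kept.extend(group)
    return kept


NUM_CHUNKS = 106   # hypothetical file size, two eMule-style parts
poisoned = {5}     # hypothetical: one poisoned chunk

chunk_level = surviving_chunks(NUM_CHUNKS, poisoned, 1)           # BitTorrent-like
part_level = surviving_chunks(NUM_CHUNKS, poisoned, 53)           # eMule-like
file_level = surviving_chunks(NUM_CHUNKS, poisoned, NUM_CHUNKS)   # Gnutella-like

print(len(chunk_level), len(part_level), len(file_level))  # 105 53 0
```

Under these assumptions, one poisoned chunk costs a chunk-level network a single chunk, a part-level network a whole part, and a file-level network the entire download, which matches the ordering of poison resistance described above.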

    2.2 Our Approach and Contributions

    Content poisoning is often treated as a security threat to P2P networks [7], [9]. To the best of our knowledge, using selective content poisoning to prevent collusive piracy has not been explored in the past. We offer the very first proactive poisoning approach to curtailing copyright violation in P2P networks. We make the following specific contributions towards P2P content delivery.

    2.2.1 Distributed Detection of Colluders and Pirates

    We develop a protocol that identifies a peer by its endpoint address. The file index format is changed to incorporate a digital signature based on this identity. A peer authorization protocol is developed to establish the legitimacy of a peer when it downloads and uploads the file. Using IBS, our system enables each peer to identify unauthorized peers or pirates without the need for communication with a central authority.

    2.2.2 Proactive Content Poisoning of Detected Pirates

    Our protocol requires sending poisoned chunks to any detected pirate requesting a protected file. If all clients simply deny the download request without poisoning, the pirates can still accumulate clean chunks from colluders that are willing to share. With poisoning, pirates are forced to discard even the clean chunks received. This will prolong their download time to a level beyond the practical limit. Experiments show that it is unlikely that a pirate can download a clean copy of the file.

    2.2.3 Containment of Peer Collusion to Stage Piracy

    Our system differs from existing P2P copyright protection schemes in that we recognize that peer collusion is inevitable: a paid customer may intentionally collude with pirates; a pirate may also hack into client hosts and turn them into unwilling colluders. Our system is designed so that even with a large number of colluders, a pirate will still suffer from an intolerably long download time. We also present a random collusion detection mechanism to further enhance our system.

    2.2.4 Trusted P2P Platform for Copyrighted Content Delivery

    Hardware investment for P2P content delivery is much lower than that required in any existing CDN. Our system only uses a few distribution agents to serve a large number of clients. The system is highly scalable, robust to peer and link failures, and easily deployed in Gnutella, KaZaA, eMule networks, etc. All claimed advantages are backed by performance analysis and simulation results.

    3 COPYRIGHT-PROTECTED P2P NETWORKS

    This section specifies the system architecture, client joining process, pirate poisoning mechanism, and colluder detection that we built into the newly proposed copyright-protection scheme for P2P content distribution in an open network environment.

    3.1 Trusted P2P Network Architecture

    Our copyright-protected P2P network is depicted conceptually in Fig. 1. The network is built over a large number of peers. Four types of peers coexist in the P2P network: clients (honest or legitimate peers), colluders (paid peers sharing contents with others without authorization), distribution agents (trusted peers operated by content owners for file distribution), and pirates (unpaid clients downloading content files illegally).

    To join the system, clients submit their requests to a transaction server that handles purchasing and billing matters. A private key generator (PKG) is installed to generate private keys with IBS for securing communication among the peers. The PKG has a role similar to that of a certificate authority (CA) in PKI services. The difference lies in the fact that a CA generates the public/private key pairs, while the PKG only generates the private key.

    The transaction server and PKG are only used initially when peers are joining the P2P network. With IBS, the communication between peers does not require an explicit public key, because the identity of each party is used as the public key. In our system, file distribution and copyright protection are completely distributed.

    Based on past experience, the number of peers sharing or requesting the same file at any point of time is in the hundreds. Depending on the variation of the swarm size, only a handful of distribution agents are needed. For example, it is sufficient to use 10 PC-based distribution agents to handle a swarm size of 2,000 peers. These agents authorize peers to download and prevent unpaid peers from getting the same contents.

    Paid clients, colluders, and pirates are all mixed up without visible labels. Our copyright-protection network is designed to distinguish them automatically. Each client is

    Fig. 1. A secured P2P platform for copyright-protected content delivery. This open network is accessed by a large number of paid clients, some colluders or pirates, and a few distribution agents. The system design prevents pirates from downloading copyrighted files from colluders.


    assigned a bootstrap agent, selected from the distribution agents, as its entry point. In current P2P networks, a peer can self-assert its username without verification. Therefore, we use the peer endpoint address (IP address + port number) instead of a username to identify a peer. A peer is considered fully connected if it is reachable via a listening port on its host.

    We use the endpoint address of the listening port as a peer identity. For simplicity, we assume that each peer has a statically configured listening port. Currently, most P2P users connect to the Internet via a home network. In such environments, statically configuring the NAT device to forward incoming ports to a peer node is the norm. The constraint occurs when a large number of peers are behind a single NAT device.

    Fig. 2 depicts an example: A peer has an IP address 192.168.0.2 leased from its local router. It is listening on port 5678, forwarded by the router. When communicating with the bootstrap agent, the peer announces its listening port number. The bootstrap agent calls an Observe() subroutine, which verifies that the same peer is indeed reachable via the claimed port, although its public IP address is actually 68.59.33.62. Hence, the peer is identified by 68.59.33.62:5678.

    The detail of Observe() is as follows: when a peer sends a message to its bootstrap agent through an outgoing port, the agent attaches a random number (nonce) in the reply. The agent then sends a message to the advertised listening port 68.59.33.62:5678, asking the peer to send back the nonce. If the peer replies correctly, then its endpoint is verified.
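The nonce exchange behind Observe() can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the `Peer` class, the `reachable` dictionary that stands in for the real network, and the second peer on port 9999 are hypothetical and do not appear in the paper. Only the endpoint 68.59.33.62:5678 comes from the example above.

```python
import secrets


class Peer:
    """Hypothetical peer that remembers the nonce from the agent's reply."""

    def __init__(self, public_ip, listening_port):
        self.endpoint = (public_ip, listening_port)
        self.nonce = None

    def receive_reply(self, nonce):
        # The agent's reply on the outgoing connection carries the nonce.
        self.nonce = nonce

    def answer_challenge(self):
        # Sent back when the agent contacts the advertised listening port.
        return self.nonce


def observe(requestor, source_ip, claimed_port, reachable):
    """Return the verified endpoint (ip, port), or None if verification fails.

    `reachable` maps endpoints to whichever peer actually listens there;
    it stands in for the real network in this sketch.
    """
    nonce = secrets.token_hex(16)
    requestor.receive_reply(nonce)
    listener = reachable.get((source_ip, claimed_port))
    if listener is not None and listener.answer_challenge() == nonce:
        return (source_ip, claimed_port)
    return None


peer = Peer("68.59.33.62", 5678)
other = Peer("68.59.33.62", 9999)   # hypothetical second peer behind the same NAT
network = {peer.endpoint: peer, other.endpoint: other}

verified = observe(peer, "68.59.33.62", 5678, network)
spoofed = observe(peer, "68.59.33.62", 9999, network)
print(verified, spoofed)
```

In this sketch, a peer that advertises a port it does not control fails the check, because the listener on that port never received the nonce.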

    The endpoint address is used as the peer's public key. There is no need to encrypt the file body. This reduces the system overhead. Enabling peers behind NAT without a static listening port requires a hole-punching mechanism. The system uses the bootstrap agent to forward the incoming requests. The identities of all agents, except the bootstrap agent, are hidden from clients. This stops a malicious node from blacklisting or attacking the distribution agents.

    3.2 Protection in Peer Joining Process

    Fig. 3 illustrates the process for a client to join a P2P network supported by a new peer authorization protocol (PAP). We will formally specify PAP in Section 4.3. Here, we first introduce the handshaking mechanisms used to protect the peer joining process. For a peer to join the network, it first logs in to a transaction server to purchase the content. After the transaction, the client receives a digital receipt containing the content title, client ID, etc. This receipt is encrypted such that only the content owner and distribution agents can decrypt it.

    The client receives the address of the bootstrap agent as its point of contact. The joining client authenticates with the bootstrap agent using the digital receipt. The session key assigned by the transaction server secures their communication. Since the bootstrap agent is set up by the content owner, it decrypts the receipt and authenticates the client's identity. The bootstrap agent requests a private key from the PKG and constructs an authorization token accordingly.

    Let k be the private key of the content owner and id be the identity of the content owner. We use $E_k(msg)$ to denote the encryption of message msg with key k, and $S_k(msg)$ to denote a digital signature of plaintext msg with key k. The client is identified by user ID and the file by file ID.

    Each legitimate peer has a valid token. The token is only valid for a short time so that a peer needs to refresh the token periodically. To ensure that peers do not share the content with pirates, the trusted P2P network modifies the file-index format to include a token and an IBS peer signature. Peers use this secured file index in inquiries and download requests. Seven messages are specified below to protect the peer joining process:

    Msg0: Content purchase request;
    Msg1: BootstrapAgentAddress, Ek(digital_receipt, BootstrapAgent_session_key);
    Msg2: Adding digital signature Ek(digital_receipt);
    Msg3: Authentication request with userID, fileID, Ek(digital_receipt);
    Msg4: Private key request with privateKeyRequest(observed peer address);
    Msg5: PKG replies with privateKey;
    Msg6: Assign the authentication token to the client.

    Peers identify the pirates by checking the validity of the extra signatures in file indexes. The trusted P2P network applies this protection to share clean contents exclusively among the peers, and uses content poisoning techniques against the pirates. Tokens are time-stamped and need to be refreshed periodically. Colluders detected by our system cannot receive a new token after their current token expires.

    3.3 Proactive Content Poisoning

    We summarize in Table 3 the key protocols and mechanisms used to construct the trusted P2P system. In this


    Fig. 2. The bootstrap agent observes end-point address p = 68.59.33.62:5678 in a trust-enhanced P2P network.

    Fig. 3. The protected peer joining process for copyrighted P2P content delivery. Seven messages are used to secure the communications among four parties involved.


    approach, a modified file index format enables pirate detection. PAP authorizes legitimate download privileges to clients. The content distributor applies content poisoning to disrupt illegal file distribution to unpaid clients. The system is enhanced by randomized collusion detection.

    In our system, a content file must be downloaded fully to be useful. Such a restraint is easily achievable by compressing and encrypting the file with a trivial password that is known to every peer. This encryption does not offer any protection of the content, except to package the entire file for distribution.

    Fig. 4 illustrates the proactive content poisoning mechanisms built into our enhanced P2P system. If a pirate sends a download request to a distribution agent or a client, then by protocol definition, it will receive poisoned file chunks. If the download request is sent to a colluder, then it will receive clean file chunks. If a pirate shares the file chunks with another pirate, then it could potentially spread the poison.

    Therefore, it is critical to send poisoned chunks to pirates, not simply deny their requests. Otherwise, even if all clients deny pirates' requests, a pirate can still assemble a clean copy from those colluders who have responded with clean chunks.

    With poisoning, we exploit the limited poison detection capability of P2P networks and force a pirate to discard the clean chunks downloaded with the poisoned chunks. The rationale behind such poisoning is that if a pirate keeps downloading corrupted files, the pirate will eventually give up the attempt out of frustration.
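The response policy described in this subsection can be written down as a short decision function. The sketch below is a hypothetical illustration: the `Role` enum and `chunk_to_send` are names invented here, and the function takes the requester's authorization status as a given input rather than computing it.

```python
from enum import Enum


class Role(Enum):
    CLIENT = "client"
    AGENT = "distribution agent"
    COLLUDER = "colluder"


def chunk_to_send(responder, requester_is_authorized):
    """Return 'clean' or 'poisoned' for one download request.

    Clients and distribution agents follow the protocol: clean chunks for
    authorized peers, poisoned chunks for detected pirates. A colluder
    ignores the protocol and serves clean chunks to anyone.
    """
    if requester_is_authorized or responder is Role.COLLUDER:
        return "clean"
    return "poisoned"


# A pirate (unauthorized) asks three different peers for chunks.
responses = [chunk_to_send(role, False)
             for role in (Role.CLIENT, Role.COLLUDER, Role.AGENT)]
print(responses)  # ['poisoned', 'clean', 'poisoned']
```

The mixed stream in the last line is the situation shown in Fig. 4: the pirate receives clean and poisoned chunks together and has to rely on its network's hashing scheme to tell them apart.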

    3.4 Randomized Colluder Detection

    Although our system is designed to tolerate the presence of colluders in the network, we show in later sections that reducing the number of colluders will improve system performance. Therefore, we introduce a reputation-based [8] colluder detection mechanism to secure our system from piracy.

    As reported in the GossipTrust paper [34], the gossip protocol and power nodes play a crucial role in speeding up the reputation aggregation process in a P2P network. Randomized gossiping can reach consensus among all peers in a distributed manner. This approach exploits massive concurrency among millions of active nodes in a very large P2P network. We design below a simplified GossipTrust system to identify colluders.

    The idea is to associate each {peer, file} pair with a collusion rate. A rate of 0 means that the peer was never reported as a colluder. Otherwise, the peer receives a collusion report of 1, meaning that it has shared clean content with illegal download requesters. This collusion rate is cumulative, similar to the way eBay accumulates peers' reputation scores. Fig. 5 illustrates the collusion detection process.

    Distribution agents randomly recruit clients, called decoys, to send illegal download requests to suspected peers. If an illegal request is returned with a clean file chunk, the decoy reports the collusion event. Since the decoy is randomly chosen, there exists a risk that the report is not trustworthy, either by error or by cheating. Thus, we need a reputation system to screen the peers.

    To choose honest decoys, we designed a lightweight reputation system. Consider a P2P network with n paid clients. We use a collusion vector $C = \{c_i\}$, where $0 \le c_i \le \theta$ is the collusion rate of peer i. The collusion threshold $\theta$ is used to bar detected colluders from getting new tokens.

    When a current token expires, the colluder is labeled as a pirate with denied access to the file. We define a trust vector $T = \{t_i\}$, where $t_i = 1 - c_i/\theta$ for all $1 \le i \le n$. When a decoy i probes a peer j for collusion, it sends j an illegal request and sends report $r_{ij}$ to the agent. The

    TABLE 3. Mechanisms for Copyright Protection

    Fig. 4. Proactive poisoning mechanisms built in trusted P2P network, where clean chunks (white) and poisoned chunks (shaded) are mixed in file streams received by pirates, but legitimate clients receive only clean chunks.

    Fig. 5. Distribution agent randomly recruits some clients to probe suspected peers. Collusion is reported when a peer replies clean content to an illegal download request.


    condition $r_{ij} = 1$ holds when j replies with clean content. The collusion rate for peer j is computed by the following expression:

    $c_j = \min\{c_j + t_i \times r_{ij},\ \theta\}$ for all $1 \le i, j \le n$.    (1)

    Peer j is identified as a colluder when its collusion rate reaches the threshold, i.e., $c_j \ge \theta$. With this reputation system, a distribution agent weighs each decoy's report by that decoy's trust score to determine the trustworthiness of the reported collusion event. Such a design ensures that a pirate will never be selected as a probing decoy.

    Consider a case when the collusion threshold is set to $\theta = 2.5$. Consider an honest peer i with an initial collusion rate $c_i = 0$, and thus, a complete trust $t_i = 1$ initially. A suspected client j has collusion rate $c_j = 1.6$. We recruit i to probe j, and i reports with $r_{ij} = 1$. We can identify peer j as a colluder since $c_j = \min\{1.6 + 1 \times 1,\ 2.5\} = 2.5$. This way, only high-reputation clients are hired as probing decoys. Thus, more credibility is given to ensure the accuracy of colluder detection.
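The update in (1) and the worked example can be reproduced with a few lines of code. This is a minimal sketch: the function names are hypothetical, and the threshold value 2.5 and the rates 0 and 1.6 are the ones from the example above.

```python
THETA = 2.5  # collusion threshold from the worked example


def trust(c_i, theta=THETA):
    """Trust score of a decoy: t_i = 1 - c_i / theta."""
    return 1.0 - c_i / theta


def update_collusion_rate(c_j, c_i, r_ij, theta=THETA):
    """Apply (1): c_j = min(c_j + t_i * r_ij, theta)."""
    return min(c_j + trust(c_i, theta) * r_ij, theta)


# Honest decoy i (c_i = 0, t_i = 1) reports suspect j (c_j = 1.6).
c_j = update_collusion_rate(c_j=1.6, c_i=0.0, r_ij=1)
print(c_j, c_j >= THETA)  # 2.5 True

# A decoy that is itself at the threshold has zero trust, so its report
# leaves the suspect's rate unchanged.
unchanged = update_collusion_rate(c_j=1.6, c_i=2.5, r_ij=1)
print(unchanged)  # 1.6
```

The second call shows why weighting by trust matters: a report from a peer already labeled as a colluder carries no weight in the update.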

    4 PEER AUTHORIZATION PROTOCOL

    In a P2P content distribution network, only the content owner can verify the user ID/password pair; peers cannot check each other's identity. Revealing a user's identity to other peers violates his or her privacy. To solve this problem, we developed the PAP protocol. First, we apply IBS to secure file indexing. Then we outline the procedure to generate tokens. Finally, we specify the PAP protocol that authorizes peers to access and download files.

    4.1 Secure File Indexing

    In a P2P file-sharing network, a file index is used to map a file ID to a peer endpoint address. When a peer requests to download a file, it first queries the indexes that match a given file ID. Then the requester downloads from selected peers pointed to by the indexes. To distinguish pirates from paid clients, we propose to modify the file index to include three interlocking components: an authorization token, a timestamp, and a peer signature.

    Each legitimate client has a valid token assigned by its bootstrap agent. The timestamp indicates the time when a token expires. Thus, the peer needs to refresh the token periodically. This short-lived token is designed for protecting copyright against colluders. The cost at each distribution agent to refresh the client tokens is rather limited, as shown via experiments. The peer signature is signed with the private key generated by the PKG. This signature proves the authenticity of a peer.

    Download requests make explicit references to file indexes. The combined effects of the three extra fields ensure that all references to the file indexes are secured. Peers identify the pirates by checking the validity of the token and the signature in a file index. These features secure the P2P network operations to safeguard the sharing of clean contents among the paid clients.

    4.2 File-Level Token Generation

    We assume that both the transaction server and the PKG are fully trusted. Their public keys are known to all peers. The PAP protocol consists of two integral parts: token generation and authorization verification. When a peer joins the P2P network, it first sends an authorization request to the bootstrap agent. All messages between a peer and its bootstrap agent are encrypted using the session key assigned by the transaction server at purchase time.

    The authorization token is generated by Algorithm 1 specified below. A token is a digital signature of a three-tuple: {peer endpoint, file ID, timestamp} signed by the private key of the content owner. Since the bootstrap agent has a copy of the digital receipt sent by the transaction server, verifying the receipt is done locally. The Decrypt(Receipt) function decrypts the digital receipt to identify the file $f$. Observe(requestor) returns the endpoint address $p$. The OwnerSign($f$, $p$, $ts$) function returns a token.

    Upon receiving a private key, the bootstrap agent digitally signs the file ID, endpoint address, and timestamp to create the token. The reply message contains a four-tuple: {endpoint address, peer private key, timestamp, token}. The reply message from the bootstrap agent is encrypted using the assigned session key.

    Algorithm 1. Token Generation

    Input: Digital Receipt
    Output: Encrypted authorization token T
    Procedures:
    01: if Receipt is invalid,
    02:     deny the request;
    03: else
    04:     f = Decrypt(Receipt);            // f is file identifier decrypted from receipt //
    05:     p = Observe(requestor);          // p is endpoint address as peer identity //
    06:     k = PrivateKeyRequest(p);        // Request a private key for user at p //
    07:     Token T = OwnerSign(f, p, ts)    // Sign the token T to access file f //
    08:     Reply = {k, p, ts, T}            // Reply with key, endpoint address, timestamp, and the token //
    09:     SendtoRequestor{Encrypt(Reply)}  // Encrypt reply with the session key //
    10: end if
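The control flow of Algorithm 1 can be sketched in Python under several loudly stated assumptions. The paper signs tokens with identity-based signatures; the Python standard library has no IBS, so this sketch substitutes an HMAC keyed with a made-up owner secret, which is a symmetric stand-in and not the paper's scheme. The receipt is passed in already decrypted, the session-key encryption of the reply is omitted, and `generate_token`, `owner_sign`, `OWNER_KEY`, and the key-issuing callback are all hypothetical names. The 10-minute token lifetime is the refresh interval quoted in the text.

```python
import hashlib
import hmac
import time

OWNER_KEY = b"demo-owner-key"   # hypothetical stand-in for the owner's private key
TOKEN_LIFETIME_S = 600          # tokens are refreshed every 10 minutes


def owner_sign(file_id, endpoint, expires_at, key=OWNER_KEY):
    """Stand-in for OwnerSign(f, p, ts): HMAC over the three-tuple, not IBS."""
    message = f"{file_id}|{endpoint}|{expires_at}".encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def generate_token(receipt, observed_endpoint, private_key_for, now=None):
    """Sketch of Algorithm 1; returns the reply dict, or None to deny.

    `receipt` is the already-decrypted receipt (None if invalid), and
    `private_key_for` is a callback standing in for the PKG request.
    """
    if receipt is None or "file_id" not in receipt:
        return None                                   # lines 01-02: deny
    now = time.time() if now is None else now
    file_id = receipt["file_id"]                      # line 04
    expires_at = int(now) + TOKEN_LIFETIME_S          # timestamp = expiry time
    peer_key = private_key_for(observed_endpoint)     # line 06
    token = owner_sign(file_id, observed_endpoint, expires_at)  # line 07
    return {"private_key": peer_key, "endpoint": observed_endpoint,
            "timestamp": expires_at, "token": token}  # line 08


reply = generate_token({"file_id": "file-42"}, "68.59.33.62:5678",
                       lambda endpoint: "key-for-" + endpoint, now=1000)
denied = generate_token(None, "68.59.33.62:5678", lambda endpoint: "unused")
print(reply["timestamp"], denied)  # 1600 None
```

Because the token binds the file ID, the observed endpoint, and the expiry time together, changing any one of the three yields a different signature.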

    The cost at each distribution agent to refresh the tokens is rather limited. In our experiments, there are 10 distribution agents to serve 1,000 clients/colluders. Each token refresh requires transmitting at most 2 KB of data and each peer is required to refresh its token every 10 minutes. For each agent, there are $1{,}000/10 = 100$ peers refreshing tokens in 10 minutes. Hence, we need to transmit only $100 \times 2$ KB $= 200$ KB to refresh the tokens every 10 minutes. Considering a standard broadband link capacity of 1.5 Mbps, such a low refreshing overhead is negligible.
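The refresh overhead can be checked with a short calculation. The input figures are the ones quoted above; the conversion to an average bit rate is an added step that assumes 1 KB equals 8 kilobits and that refreshes are spread evenly over the interval.

```python
clients = 1_000
agents = 10
token_size_kb = 2               # at most 2 KB per refresh
refresh_interval_s = 10 * 60    # every 10 minutes
link_capacity_kbps = 1_500      # 1.5 Mbps broadband link

peers_per_agent = clients // agents                    # 100 peers per agent
kb_per_interval = peers_per_agent * token_size_kb      # 200 KB per 10 minutes
avg_kbps = kb_per_interval * 8 / refresh_interval_s    # average refresh traffic
share = avg_kbps / link_capacity_kbps                  # fraction of the link

print(peers_per_agent, kb_per_interval, round(avg_kbps, 2), f"{share:.2%}")
# 100 200 2.67 0.18%
```

Under these assumptions, token refreshing uses well under one percent of each agent's link, which is consistent with calling the overhead negligible.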

    4.3 The Peer Authorization Protocol

    The PAP protocol is formally specified below. A client must

    verify the download privilege of a requesting peer before


    clean file chunks are shared with the requestor. If the requestor fails to present proper credentials, the client must send poisoned chunks, as shown in Fig. 6.

    In PAP, a download request carries a token $T$, file index $(f, p)$, timestamp $ts$, and the peer signature $S$. If any of the fields is missing, the download is stopped. A download client must have a valid token $T$ and signature $S$. Two pieces of critical information are needed: the public key $K$ of the PKG and the peer endpoint address $p$.

    Algorithm 2 verifies both token $T$ and signature $S$. File index $(f, p)$ contains the peer endpoint address $p$ and the file ID $f$. Token $T$ also contains the file index information and $ts$ indicating the expiration time of the token. Parse(input) extracts timestamp $ts$, token $T$, signature $S$, and the file index from a download request. The function Match($T$, $ts$, $K$) checks the token $T$ against public key $K$. Similarly, Match($S$, $p$) grants access if $S$ matches $p$.

    Algorithm 2. Peer Authorization Protocol

    Input: $T$ = token, $ts$ = timestamp, $S$ = peer signature, and $\phi(\lambda, p)$ = file index for file $\lambda$ at endpoint $p$

    Output: Peer authorization status
    True: authorization granted
    False: authorization denied

    Procedures:

    01: Parse(input) = {T, ts, S, $\lambda$, p} // Check all credentials from an input request //

    02: p = Observe(requestor); // Detect peer endpoint address p //

    03: if {Match(S, p) fails}, return false; // Fake endpoint address p detected //

    04: endif

    05: if {Match(T, ts, K) fails}, return false; // Invalid or expired token detected //

    06: endif

    07: return true;
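The control flow of Algorithm 2 can be sketched in Python. This is only an illustration under loud assumptions: an HMAC with a shared secret stands in for the identity-based signatures, so verification here needs the secret key, whereas in PAP a client verifies with the public key of the PKG. All names (`ISSUER_KEY`, `issue_token`, `peer_signature`, `authorize`) and the request layout are hypothetical.

```python
import hashlib
import hmac

ISSUER_KEY = b"hypothetical-issuer-secret"  # stand-in for the real signing key


def _mac(key: bytes, *fields: str) -> bytes:
    """Keyed digest over the fields, used here in place of an IBS signature."""
    return hmac.new(key, "|".join(fields).encode(), hashlib.sha256).digest()


def issue_token(file_id: str, endpoint: str, expires_at: int) -> bytes:
    """Token T binds {file ID, endpoint address, timestamp}."""
    return _mac(ISSUER_KEY, "token", file_id, endpoint, str(expires_at))


def peer_signature(endpoint: str) -> bytes:
    """Signature S tied to the peer endpoint address."""
    return _mac(ISSUER_KEY, "peer", endpoint)


def authorize(request: dict, observed_endpoint: str, now: int) -> bool:
    """Follow Algorithm 2: True grants clean chunks, False means poisoning."""
    required = ("token", "ts", "signature", "file_id")
    if any(field not in request for field in required):
        return False  # a missing field stops the download
    p = observed_endpoint  # step 02: trust the observed address, not a claimed one
    if not hmac.compare_digest(request["signature"], peer_signature(p)):
        return False  # step 03: signature does not match the observed endpoint
    expected = issue_token(request["file_id"], p, request["ts"])
    if not hmac.compare_digest(request["token"], expected) or request["ts"] < now:
        return False  # step 05: invalid or expired token
    return True


now = 1_000
req = {"file_id": "movie-1", "ts": now + 600,
       "token": issue_token("movie-1", "10.0.0.5:4662", now + 600),
       "signature": peer_signature("10.0.0.5:4662")}
print(authorize(req, "10.0.0.5:4662", now))  # paid client at its own address
print(authorize(req, "10.0.0.9:4662", now))  # same credentials replayed elsewhere
```

The second call fails because both the token and the signature are bound to the endpoint address that the verifying client observes, which is the property the protocol relies on to stop token sharing.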

    When a client downloads a file, it needs to authorize the peer to share the file. Otherwise, downloading from a pirate may be poisoned, as shown in Fig. 4. When responding to queries from honest peers, a client adopts a slightly reduced version of Algorithm 2. Because the inquiry is sent directly to endpoint $p$, the Observe() procedure is no longer required.

    4.4 Adversary and Security Analysis

    In contrast to a security-via-obscurity scheme, the PAP protocol is designed to be completely open. We provide an adversary analysis for security assurance of the proposed copyright-protected P2P networks. These assurances ensure that our PAP protocol is secured from common attacks, as explained below.

    4.4.1 Peer Endpoint Address Is Forgery Proof

    Collusive piracy is achievable only if the pirate manages to communicate with other peers. IP spoofing changes a pirate's endpoint address, but then the pirate does not receive any response. Therefore, spoofing the endpoint address during download is useless to a pirate. A pirate can also intercept the token sent to a client and masquerade its own endpoint address to match the token. However, using the Observe() subroutine illustrated in Fig. 2, other clients will notice the masqueraded peer identity, and the pirate fails endpoint verification.

    4.4.2 Authorization Tokens Cannot be Shared by Peers

    A token is generated after the verification of a digital receipt.

    This is used to authorize a client to download the content. It

    is designed to be a digital signature of a three-tuple: {file ID,

    endpoint address, timestamp}. Multiple peers cannot share this

    three-tuple because each peer has a different endpoint

    address. Sharing the same token on different endpoint

    addresses will result in signature mismatch. This is applied

    to stop a pirate from using a stolen token.

    4.4.3 Pirates Cannot Poison Legitimate Clients

    Our system modifies file index format to include tokens and

    signatures. When downloading from other peers, a client checks the file index for valid signatures. It only downloads

    file chunks from other legitimate clients that publish some

    valid file indexes. Therefore, even if a pirate attempts to

    poison other peers, no legitimate client will use it as a

    download source.

    4.4.4 Stolen Private Keys Are Useless to Pirates

    A pirate may hack into a peer's host to obtain its private keys. A colluder may even share these secrets with a pirate. However, sharing or stealing private keys does not help the pirate at all, because the endpoint address is used as the public key. Since other clients use the Observe() subroutine to obtain the peer endpoint address, stolen private keys are useless.

    5 PROTECTION PERFORMANCE ANALYSIS

    In this section, we analyze the performance of the P2P

    copyright protection scheme. First, we give the condition to

    secure the file index. Then, we calculate the poisoning rate, that is, the probability of receiving a poisoned chunk in response to a pirate's download request. Finally, we estimate the average file download times $T$ by legitimate clients and detected pirates for

    comparison. The protection success rate measures the

    percentage of pirates that fail to download the requested

    file within a given tolerance threshold.

    Fig. 6. The PAP enables instant detection of a pirate upon submitting an

    illegal download request.

    974 IEEE TRANSACTIONS ON COMPUTERS, VOL. 58, NO. 7, JULY 2010


    5.1 Secure File Indexes

    In current P2P networks, a file index $\phi(\lambda, p)$ associates a file identifier $\lambda$ with a peer endpoint address $p$. In PAP, we replace this index format with a tuple that adds the token, timestamp, and signature:

    $$\{\lambda, p, T, ts, S\}. \tag{2}$$

    This security-enhanced index format cannot be forged.

    Both $T$ and $S$ are collision-free signatures. A pirate cannot create its own token or signature via brute-force attack. Therefore, a pirate cannot create an index by itself. With Algorithm 2, any attempt to modify a single element of the index will fail in token verification, signature verification, or both. Therefore, the enhanced index is secured.

    Based on the above discussion and Section 4.4, there exists a one-to-one mapping between the enhanced index and the client digital receipt. This forgery-proof mapping is the foundation of our PAP protocol because it ensures distributed pirate detection at every client. Securing the digital receipt belongs to the realm of general network security, which is beyond the scope of this paper.

    We use IBS instead of a PKI service out of concern for overhead. In a P2P network with $n$ peers, each peer may need to contact all $n - 1$ other peers. If we use a PKI service for signature verification, the total CA communication overhead is $O(n^2)$. With an IBS system, this overhead is reduced to $O(n)$, because a peer needs to contact the PKG only once.
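The gap between the two overheads can be made concrete by counting exchanges. This is a rough sketch that assumes one CA exchange per ordered pair of peers under PKI and one PKG contact per peer under IBS; the function names are illustrative.

```python
def pki_exchanges(n: int) -> int:
    """Each of n peers may need a CA exchange for each of the other n - 1 peers."""
    return n * (n - 1)


def ibs_exchanges(n: int) -> int:
    """Each peer contacts the PKG only once."""
    return n


for n in (100, 2000):
    print(n, pki_exchanges(n), ibs_exchanges(n))
```

For the 2,000-peer overlay used in the simulations, this counting model gives roughly four million CA exchanges against 2,000 PKG contacts.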

    5.2 Chunk Poisoning Rate

    Our system has an integral function to randomly detect colluders. However, such detection can never be perfect. It is always possible that some colluders will evade the detection. Therefore, these undetected colluders become the real source of copyright violations. Let the collusion rate $\varepsilon$ be the percentage of paid clients acting as undetected colluders. The pirate receives clean content from undetected colluders.

    Under a randomized policy, the piracy rate $r$ is the percentage of pirates among all peers in the content delivery network. We define the chunk poisoning rate $\beta$ as the probability that a pirate receives a poisoned chunk. The following two theorems are obtained:

    Theorem 1. In a BitTorrent-like network, the chunk poisoning rate $\beta$ is expressed by

    $$\beta = (1 - r)(1 - \varepsilon). \tag{3}$$

    For eMule and Gnutella, the chunk poisoning rate is expressed by

    $$\beta = 1 - \varepsilon. \tag{4}$$

    Proof. In BitTorrent, only an honest client or a distribution agent can poison a pirate. There is no propagation of poisoned chunks among the pirates. The term $(1 - r)$ represents the percentage of nonpirates among all peers. Among these peers, $(1 - \varepsilon)$ is the percentage of noncolluding clients. Therefore, the poisoning rate $\beta$ is just the product of the two terms.

    A pirate cannot identify poisoned file chunks in eMule and Gnutella. The pirate stores undetected poisoned chunks in its local cache and unknowingly shares them with other pirates. We can express the poisoning rate by

    $$\beta = \Pr\{\text{poisoned by a client or an agent}\} + \Pr\{\text{poisoned by a pirate}\} = (1 - r)(1 - \varepsilon) + r\beta.$$

    Solving for $\beta$ gives $\beta = 1 - \varepsilon$. Q.E.D.

    Fig. 7 plots the poisoning rate $\beta$ in (3) as a function of $r$ and $\varepsilon$. This is a concave upward surface in the 3D space. The peak of the protection surface is at the point where we have a fully trusted P2P network with $r = 0$. The lowest point corresponds to a completely pirated network where no law-abiding peer exists.
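The two cases of Theorem 1 can be checked numerically. The sketch below evaluates the BitTorrent-like product form directly and iterates the recurrence from the proof for eMule and Gnutella, where a poisoned pirate passes poison on; the function names, the symbol name `beta` for the poisoning rate, and the starting value are illustrative assumptions.

```python
def beta_bittorrent(r: float, eps: float) -> float:
    """Product form: only nonpirate, noncolluding peers poison a pirate."""
    return (1 - r) * (1 - eps)


def beta_emule(r: float, eps: float, rounds: int = 200) -> float:
    """Iterate beta = (1 - r)(1 - eps) + r * beta until it settles."""
    beta = 1.0  # arbitrary start; the fixed point does not depend on it for r < 1
    for _ in range(rounds):
        beta = (1 - r) * (1 - eps) + r * beta
    return beta


print(beta_bittorrent(0.5, 0.1))
print(beta_emule(0.5, 0.1))
```

For any piracy rate below 1, the iteration settles at one minus the collusion rate, which is why the eMule and Gnutella result does not depend on the piracy rate.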

    5.3 Prolonged Download Time by Pirates

    In this study, we ignore the accidental corruption of content

    (content pollution) due to transmission error, etc. To capture

    the inherent poisoning resistance of a P2P network, we

    define a piracy penalty $\rho$ as the percentage of downloaded chunks that are discarded due to poisoning.

    Let $f$ be the size of a clean content file and $d$ be the actual size of the file downloaded, including clean and poisoned chunks. The ratio $f/d$ represents the percentage of downloaded chunks that are clean. Thus, we have

    $$\rho = 1 - f/d. \tag{5}$$

    Piracy penalty implies extra workload imposed on the

    pirates to receive some poisoned chunks. In estimating this

    piracy penalty, we ignore the effects of network topology,

    traffic congestions, peer locations, etc. To normalize the

    results, we consider a chunk the smallest unit of a file, whose

    hash value is used to verify its authenticity and integrity. A

    chunk is called a piece in BitTorrent and a chunk in the eMule and Gnutella families.

    We model the piracy penalty on all three P2P content networks: Gnutella, eMule, and BitTorrent. In the eMule network, every 53 chunks form a part. Peers compute the hash values of all parts and exchange the part-level hash sets inside the P2P network. Thus, the part hash sets are also susceptible to content poisoning.

    We estimate below the download time of a pirate. This

    estimation will be verified by simulation experiments.

    Consider a peer attempting to download a content file of

    size $f$. Let $b$ be the average download speed for a peer. A legitimate client does not receive any poisoned chunks. Thus, the download time of a legitimate client is simply calculated by $T_c = f/b$.


    Fig. 7. Variation of the poisoning rate in BitTorrent-like networks with respect to variation in piracy rate $r$ and collusion rate $\varepsilon$.


    By definition of the piracy penalty $\rho$, $1 - \rho$ represents the percentage of useful download effort. Every time a pirate attempts to download a file, only a $(1 - \rho)$ portion of the downloaded file is clean and useful. Therefore, to receive the entire file, the pirate must repeat the download attempts $1/(1 - \rho)$ times. This leads to the following download time estimated for a pirate, $T_p$:

    $$T_p = \frac{f}{b(1 - \rho)}. \tag{6}$$
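The relation between piracy penalty and download time can be tried out with a few lines of Python. This is an idealized sketch: file size is taken in megabytes, speed in megabits per second, and protocol overhead is ignored, so the client time comes out shorter than the roughly 1.5 hours measured in the simulations. Function names are illustrative.

```python
def piracy_penalty(clean_mb: float, downloaded_mb: float) -> float:
    """Share of downloaded data that is discarded due to poisoning (1 - f/d)."""
    return 1 - clean_mb / downloaded_mb


def client_time_hours(file_mb: float, speed_mbps: float) -> float:
    """T_c = f / b, converted from seconds to hours."""
    return file_mb * 8 / speed_mbps / 3600


def pirate_time_hours(file_mb: float, speed_mbps: float, penalty: float) -> float:
    """T_p = f / (b * (1 - penalty)); penalty must be below 1."""
    return client_time_hours(file_mb, speed_mbps) / (1 - penalty)


penalty = piracy_penalty(clean_mb=700, downloaded_mb=7000)
print(client_time_hours(700, 1.5))
print(pirate_time_hours(700, 1.5, penalty))
```

A pirate that has to fetch ten times the file size to assemble one clean copy needs ten times the client's download time.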

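    Theorem 2. The pirate is expected to experience the following download times in three existing networks with our proactive copyright-protection system:

    $$E[T_p] = \begin{cases} f/(b\,\varepsilon^{m}), & \text{Gnutella family}; \\ f/(b\,\varepsilon^{54}), & \text{eMule family}; \\ f/\bigl(b\,(r + \varepsilon - r\varepsilon)\bigr), & \text{BitTorrent family}. \end{cases} \tag{7}$$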

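    The protection success rate $\alpha$ for a tolerance threshold $\theta$ is then

    $$\alpha = \int_{\theta}^{\infty} g(T_p)\, dT_p. \tag{8}$$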
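The expected pirate download times in Theorem 2 span many orders of magnitude, so it is convenient to evaluate them on a logarithmic scale. The sketch below returns the base-10 logarithm of the time in seconds; it assumes that the exponent in the Gnutella case is the number of chunks in the file, takes file size in megabytes and speed in megabits per second, and uses illustrative names throughout.

```python
import math


def log10_pirate_seconds(family, file_mb, speed_mbps, eps, r=0.0, chunks=1):
    """log10 of the expected pirate download time, in seconds, per family."""
    base = math.log10(file_mb * 8 / speed_mbps)  # log10 of f / b
    if family == "gnutella":    # f / (b * eps**m), m assumed to be the chunk count
        return base - chunks * math.log10(eps)
    if family == "emule":       # f / (b * eps**54)
        return base - 54 * math.log10(eps)
    if family == "bittorrent":  # f / (b * (r + eps - r * eps))
        return base - math.log10(r + eps - r * eps)
    raise ValueError(f"unknown family: {family}")


SECONDS_PER_YEAR = 365 * 24 * 3600
emule = log10_pirate_seconds("emule", 700, 1.5, eps=0.1)
print(emule - math.log10(SECONDS_PER_YEAR))  # log10 of the time in years
bt = log10_pirate_seconds("bittorrent", 700, 1.5, eps=0.1, r=0.5)
print(10 ** bt / 60)  # minutes
```

With a 10 percent collusion rate, the eMule estimate is astronomically large while the BitTorrent estimate stays within hours, which matches the ranking of the three families discussed in Section 6.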

    The above expressions for the poisoning rate, piracy penalty, expected download times $T_c$ and $T_p$, and protection success rate are obtained by analytical reasoning. Their accuracy is verified by simulation experiments in Section 6, except in some cases where the pirate download time becomes so large that it cannot be simulated in finite time. For simplicity, we assume that pirates adopt a random peer selection policy. In Section 6, we will discuss other peer selection policies.

    6 SIMULATED P2P EXPERIMENTAL RESULTS

    It is very difficult to conduct copyright violation experiments in a real-life P2P network. We have to use simulated P2P experiments to verify the analytical results just presented. We simulate P2P network architectures of three major P2P network families introduced in Table 2.

    The trusted P2P features are built into simulators of these three popular P2P networks. The simulation process consists of three stages: First, we measure the chunk poisoning rate on simulated networks. Second, we measure the download time and protection success rate of the simulated P2P networks. Finally, we compare their differences in defense performance and protection overhead experienced.

    6.1 Simulation Setting and Experiments

    The simulated P2P network environment is illustrated in Fig. 8. The simulator consists of three layers. The lowest layer handles data collection and reporting. The middle layer simulates the behaviors of four peer categories: distribution agents, legitimate clients, peer colluders, and illegal pirates. The upper layer simulates the P2P transport.

    For the eMule simulation, we bypassed the eMule server connections so that the network operates in a completely decentralized mode. We assume that all peers are connected via broadband connections with a bandwidth limit of 1.5 Mbps. For simplicity, we ignored network delays and transmission errors. The simulator excludes the communication between distribution agents and the PKG, which is outside of the P2P network. The network topology is known to all peers.

    The simulator measures the shortest possible download time by a pirate or by a client. In reality, the download time by a pirate should be even longer than the measured value due to delays caused by other network factors. The simulator starts with distribution agents having clean chunks. The chunks are distributed using a rarest-first policy [20].

    We use 10 distribution agents in our simulated networks. By varying the numbers of peers, colluders, and pirates, we simulate the P2P network with different levels of piracy challenges. There are many other peer selection policies used by P2P client software. Essentially, these policies favor one group of peers against another. Therefore, we simply lump their effects into different values of the peer collusion rate $\varepsilon$ or of the piracy rate $r$.

    A P2P network may have millions of peer nodes. Our experience shows that, per content file, a peer accesses at most a few hundred peer nodes. Since the trusted P2P offers file-level copyright protection, it is sufficient to simulate a P2P overlay with 2,000 peer nodes. We test the distribution of a 700 MB CD-ROM file. In the eMule network, this file is divided into 4,017 chunks.

    To explore the limits of a trusted P2P network with the proposed copyright protection features, we also tested small files of sizes 10 and 20 MB, and a very large content file, a 4.5 GB movie. Based on the estimation in Theorem 2, the download times of large content files by pirates may reach millions of years, meaning that it is infeasible for a pirate to download a clean file from a trusted P2P network.

    Fig. 8. Copyright-protected P2P simulation experiment environment.


    6.2 Results on Chunk Poisoning Rate

    In all experiments, the chunk poisoning rate starts from 100 percent. This reflects the fact that initially only the distribution agents have clean file chunks in local caches. Since distribution agents must poison any request from a detected pirate, the pirate will receive only poisoned chunks. When colluders have the chunks to share, the poisoning rate is lowered to a stable value after about 2 hours. Fig. 9a reports the measured poisoning rate for the first two hours on a simulated eMule network. The y-axis shows the average value over all 1,000 pirates.

    The piracy rate $r$ assumes the value 50 percent. With a low $\varepsilon = 1$ percent, the poisoning rate is fairly close to the ideal case of 100 percent poisoning. With a higher $\varepsilon = 10$ percent, the poisoning rate starts at a value close to 1 and fluctuates in a decreasing trend. In about 1.5 hours, it converges to 0.9. Fig. 9b plots similar results on the Gnutella network.

    Fig. 10 plots the effects of changing $\varepsilon$ and $r$ on the poisoning rate. Fig. 10a reports results of varying the collusion rate $\varepsilon$ with $r = 50$ percent and 1,000 pirates out of 2,000 peers. Increasing the collusion rate $\varepsilon$ results in a sharp drop of the poisoning rate on both eMule and Gnutella.

    As the number of colluders increases to 800 ($\varepsilon = 0.8$), the average poisoning rate is reduced to only 20 percent. The decrease is linear with respect to the increase of $\varepsilon$, especially in the case of Gnutella. On the eMule network, the measured poisoning rate deviates a little from the theoretical prediction given by (4).

    The poisoning rate is insensitive to the increase of pirates, as shown in Fig. 10b. These results are plotted for the low collusion rate $\varepsilon = 10$ percent, corresponding to 100 colluders out of 1,000 paid peers. The simulation results show that when the number of pirates increases from 200 to 1,000, the change of the poisoning rate is very small.

    This result reinforces Theorem 1, namely, the poisoning rate is independent of the piracy rate $r$ on the eMule network. Although we simulated $\varepsilon$ up to 0.8, such an extremely high collusion rate is unlikely to occur in real-life P2P applications. We incorporate randomized detection to identify colluders. The detected colluders will not obtain a new token and are thus treated as new pirates after the current token expires.

    6.3 Download Time by Legitimate Clients

    Fig. 11 plots the average client download times of a content file with a file size $f = 700$ MB. Because clients are not poisoned, we measure their download time as a basis for studying the download time penalty of pirates. In both eMule and Gnutella, the average download time for clients remains quite flat around 1.5 hours.

    This flat download time implies that legal users do not suffer under peer collusion. In Fig. 11, a paid client on the


    Fig. 9. Chunk poisoning rate in two P2P networks under two collusion rates and a fixed piracy rate $r$ = 50 percent. (a) eMule network and (b) Gnutella network.

    Fig. 10. Poisoning rate decreases linearly with increasing peer collusion rate, independent of the piracy rate. (a) Effect of $\varepsilon$ under $r$ = 50 percent. (b) Effect of $r$ under $\varepsilon$ = 10 percent.

    Fig. 11. Average download time of a 700-MB CD file by legitimate clients in simulated eMule and Gnutella networks.


    Gnutella network downloads in 10 percent less time than a client on the eMule network. The shorter download time on Gnutella is attributed to the lower complexity of the file chunking protocol applied in Gnutella.

    6.4 Pirate Download Time and Success Rate

    Now, we report simulation results on the expected download time by pirates. To explore the limit of the trusted P2P system, we have experimented on various file sizes. We define the protection success rate by the failure rate of pirates to download the file within a tolerance threshold of time. The threshold is set at 20 days for a 700 MB CD-ROM image file and 30 days for a 4.5 GB movie file.
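The success-rate measurement defined above amounts to counting the pirates that have not finished by the threshold. A minimal sketch follows; the list of download times is made-up illustrative data rather than a simulation result, and `None` is an assumed marker for a pirate that never completes.

```python
def protection_success_rate(download_days, threshold_days):
    """Fraction of pirates that fail to finish within the tolerance threshold."""
    failed = sum(1 for t in download_days
                 if t is None or t > threshold_days)
    return failed / len(download_days)


# Hypothetical per-pirate download times in days; None means never completed.
times = [2, 30, None, 15, 400]
print(protection_success_rate(times, threshold_days=20))
```

Raising the threshold can only lower the measured success rate, which is why the curves in the experiments start at 100 percent and decrease as more time is tolerated.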

    In Fig. 12, we simulated all three P2P network families with 100 paid clients, 900 colluders, and 1,000 pirates. This scenario of 90 percent colluders is very unlikely in a real-life P2P network. We design this experiment to approximate a worst-case scenario in which all pirates adopt an aggressive peer selection policy. We measure the percentage of pirates that fail to download a clean copy within the time frame as an approximated success rate. All curves start at 100 percent. With increasing threshold, we evaluate the success rate of the three networks.

    Without poisoning, the average download time for a client is 1.5 hours. The Gnutella family, including KaZaA and LimeWire, has a near-perfect success rate (higher than 99.9 percent). The eMule network has an average success rate of 85 percent after tolerating up to 20 days. These success rates are satisfactory, because most pirates will give up trying after a few days without success.

    The Gnutella family, including the KaZaA and LimeWire networks, has the highest success rate, close to 99.9 percent for both file sizes. With a tolerance threshold of 20 days in Fig. 12, the eMule network has an average success rate of 98 percent for the 4.5 GB file and 85 percent for the 700 MB file.

    In comparison, an average pirate in a BitTorrent network takes about 100 minutes to download the 700 MB file, only marginally longer than a paid client. Hence, the success rate drops rapidly before 2 hours. This implies that our system does not protect BitTorrent well due to its strong resistance to content poisoning.

    Table 4 reports the expected download time $T_c$ by a regular client and $T_p$ expected by a pirate. These numbers are averaged over 1,000 pirates out of 2,000 peers. Some extraordinarily large download times by pirates on eMule or Gnutella networks exceed hundreds or thousands of years, far beyond what the simulation experiments can do. Their magnitudes are calculated using Theorem 2 and marked as greater than 1,000,000 years in Table 4. This implies that pirates cannot download the files from Gnutella successfully.

    Among the three P2P network families in Table 1, we find that the Gnutella family is best protected by our system. The eMule family is next, and BitTorrent-like networks are the most poison resistant. In reality, the tolerance threshold of pirates is only a few days at most. Pirates may try other means to steal from public networks. That kind of attack is beyond the scope of our P2P protection scheme.

    In general, we find that our system protects the eMule and Gnutella families rather satisfactorily if the tolerance threshold is set to 20 days for files up to 700 MB. For the 4.5 GB movie file, the tolerance level may have to be set as long as 4 months. This implies that all pirates will give up the attempt on P2P networks if they have to wait for so long and still cannot download a clean copy.

    6.5 Overheads in Token and Poison Processing

    In our experiments, a distribution agent assigns new tokens to about 100 clients/colluders every 10 minutes. Fig. 13 plots the traffic distribution of three message types from a single agent. In Table 4, the regular download time of a 700 MB CD image file to a paid client is about 90 minutes.

    The white bars for actual content delivery dominate the distribution, consuming almost 99.9 percent of the link bandwidth of 1.5 Mbps. The crossed bars are the total byte count for token distribution. During content uploading, the

    Fig. 12. Protection success rate of a simulated eMule network. Most users have a tolerance threshold far less than 5 days. (a) 700 MB CD-ROM file and (b) 4.5 GB movie file.

    TABLE 4. Expected Download Times by Pirates and Paid Clients in Three Simulated P2P Networks


    token and poisoning overhead is less than 0.1 percent (0.1 MB/100 MB), which is negligible.

    The responsibility of content poisoning is distributed to all paid clients and distribution agents. This distribution inevitably costs some upload bandwidth by agents and clients. Compared to clean (unpoisoned) file sharing, the upload bandwidth constitutes the network overhead for distributing poisoned chunks.

    We distinguish in Fig. 14 the bandwidth consumed to deliver clean versus poisoned file chunks. We report the normalized bandwidth allocation for both types of file chunks in the eMule network under two peer distributions. The shaded area represents the upload bandwidth of a client allocated to distribute poisoned chunks. The white area is the bandwidth used to distribute the content files to legitimate clients. Initially, since no client has any file chunks to share, the clients receive no request for either.

    Fig. 14a shows a high piracy rate of $r = 50$ percent, meaning that 1,000 pirates are present out of 2,000 peers in the simulated eMule network. Therefore, almost all upload bandwidth is used to handle the pirate requests after 1.5 hours. Fig. 14b corresponds to a low piracy rate of 200 pirates out of 2,000 peers. Thus, a lower portion of the bandwidth is allocated to distribute poisoned chunks. The poisoning overhead drops to 40 percent after one hour.

    Fig. 15 plots the upload bandwidth allocated for content poisoning at a distribution agent. The agents dedicate the upload bandwidth to content delivery when client download is in progress. When there are no legitimate requests, distribution agents are fully dedicated to the task of poisoning pirates. This switching of bandwidth allocation enhances a low-cost P2P file distribution network.

    For a small content file of 10 MB, the switching of bandwidth usage takes place at 2 minutes. The 50 MB file takes 7 minutes to switch from content delivery to complete poisoning. The 700 MB file takes 90 minutes to switch the bandwidth allocation. The differences in the switch time correspond to the average download time of a legitimate client.

    7 CONCLUSIONS AND SUGGESTIONS

    Based on the above performance results, the combined use of

    IBS-based indexing and selective chunk poisoning is indeed

    effective to detect colluders and pirates, and stop them from

    collusive piracy in major families of P2P networks.

    7.1 Major Research Findings

    Summarized below are the major research findings and

    discussions on deployment requirements of our proposed

    P2P copyright-protection system.

    7.1.1 Stop Piracy in the Presence of Colluders

    With secure file indexing and assistance from a peer reputation system, colluding peers are detectable. The system stops piracy by poisoning pirates with excessively long download overhead, even in the presence of a large number of colluders.


    Fig. 13. Traffic distribution for uploading a 700-MB CD file using authorization token and chunk poisoning over 100 peer nodes.

    Fig. 14. Normalized upload bandwidth for distributing clean and poisoned chunks in a simulated eMule network. (a) High piracy rate (50 percent). (b) Low piracy rate (10 percent).

    Fig. 15. Normalized upload poisoning overhead for delivering files of 10-700 MB over the ADSL links.


    7.1.2 Pirates Detected Immediately upon First Attempt

    With the time-stamped authorization token, the PAP proto-

    col enables clients to detect illegal download attempts from a

    pirate without communicating with a central authority. Our

    scheme heavily penalizes copyright violators without hurt-

    ing honest clients.

    7.1.3 Selecting High-Reputation Peers as Probing

    Decoys

    Using a reputation system, we select trusted clients to act as

    decoys to probe peers in collusive piracy. This mechanism

    accurately detects colluders when they send clean file

    chunks to illegal requests. Reducing number of colluders

    can improve the protection against pirates, as demonstrated

    by experiments.

    7.1.4 Gnutella-Like Networks Best Protected

    by Our Scheme

    Applying our protection scheme, the Gnutella family, including Gnutella, Ares, KaZaA, LimeWire, Freenet, BearShare, etc., demonstrates the highest penalty on pirates because poison detection is only possible at the file level. Even if only a few chunks are poisoned, the entire file must be discarded and downloaded repeatedly.

    This makes the file download time by pirates intolerably long (1,000 years or longer). Thus, this family of P2P networks is most suitable to apply the proposed copyright protection scheme, yielding an almost perfect protection success rate (>99.9 percent), as reported in Section 6.4.

    7.1.5 eMule-Like Networks Get Protected Satisfactorily

    The eMule-like P2P networks apply weak chunk hashing at the part level. Pirating on eMule networks experiences very high download overhead, as seen in Table 4. Our copyright-protection system works with 85-98 percent success rate on the eMule family of P2P networks, as listed in Table 2.

    7.1.6 BitTorrent Network Is Most Resistant to Poisoning

    Our system is less effective to protect poison-resistant P2P

    networks like BitTorrent, BNBT, Azureus, etc. Continued

    research is needed to extend the PAP protocol to overcome

    this difficulty.

    7.1.7 Protection Scheme Performs Better for Large Files

    The proposed system is more effective in protecting large files. Our PAP protocol is not tailored to protect short music or document files. A popular song in MP3 format is no more than a few MB in size. Content poisoning has less effect on such small files because the file contains only one or a few chunks.

    7.1.8 Negligible Detection and Poisoning Overhead

    The proposed PAP protocol detects colluders and pirates,

    and applies chunk poisoning selectively. These extra activities

    add only limited extra workload or traffic to the network.

    These overheads are distributed among all distribution

    agents and clients, making their effects almost negligible on

    individual clients.

    7.2 Discussions and Suggestions

    Our protection scheme gives higher priority to satisfying honest clients. Putting poisoning tasks at lower priority reduces the upload overhead. Our selective chunk poisoning outperforms the indiscriminate poisoning practiced by RIAA or MPAA enforcers. This system is fair to the majority of honest clients who enjoy P2P content delivery services.

    Perishable contents, such as real-time broadcasts of news or sports events, are well protected by our system. For these contents, a hashing mechanism to detect poisoning cannot be effective, because distributing chunk hashes ahead of content is impossible in real time. Essentially, all P2P networks, including BitTorrent, fail to resist poisoning of perishable contents.

    Existing DRM systems [10] focus on enforcing usage rules of digital content such as copy and replay, while our proposed system provides a protected P2P distribution platform to control the access of copyrighted content files. Combining DRM with P2P copyright protection should be the focus of future research.

    In a traditional encryption-based protection scheme, once a pirate discovers how the system works, he can break it and post the hack on the Internet. We enforce copyright protection via distributed content poisoning. Statistically, a pirate cannot download the file successfully in tolerable time once the PAP protocol is enforced. Two R&D tasks are suggested below as further work on copyright protection in P2P content delivery.

    7.2.1 Prototyping and Benchmark Experiments Needed

    in Real-Life Open P2P Networks

    Simulation results reported here can only prove the protection concept; they lack sustained accuracy. Proactive chunk poisoning can be made selective to reduce the processing overhead. However, further studies are needed to upgrade the performance of the copyright-protected system in real-life P2P benchmark applications.

    7.2.2 Integration of Trusted P2P System with Reputation

    System and DRM Scheme

    A file-level reputation system poses a new challenge in working with DRM systems in P2P content delivery. The integration of selective poisoning with a reputation system and DRM will widen the CDN application domains. Combining DRM and a reputation system to protect P2P content delivery networks will lead to a total solution of the online piracy problem.

    ACKNOWLEDGMENTS

    This work was supported by US National Science Foundation ITR Grant ACI-0325409 at the University of Southern California. The work was carried out at the USC Internet and P2P Computing Laboratory. The authors appreciate the technical support by GridSec team members.

    REFERENCES

    [1] N. Anderson, "Peer-to-Peer Poisoners: A Tour of MediaDefender," Ars Technica, Sept. 2007.

    [2] BitTorrent.org, "BitTorrent Protocol Specification," http://www.bittorrent.org/protocol.html, 2006.


    [3] S. Androutsellis-Theotokis and D. Spinellis, "A Survey of Peer-to-Peer Content Distribution Technologies," ACM Computing Surveys, vol. 36, pp. 335-371, 2004.

    [4] D. Boneh and M. Franklin, "Identity-Based Encryption from the Weil Pairing," Proc. Advances in Cryptology (Crypto '01), pp. 213-229, 2001.

    [5] S. Chen and X.D. Zhang, "Design and Evaluation of a Scalable and Reliable P2P Assisted Proxy for On-Demand Streaming Media Delivery," IEEE Trans. Knowledge and Data Eng., vol. 18, no. 5, pp. 669-682, May 2006.

    [6] A.K. Choudhury, N.F. Maxemchuk, S. Paul, and H.G. Schulzrinne, "Copyright Protection for Electronic Publishing over Computer Networks," IEEE Trans. Networking, vol. 9, no. 3, pp. 12-20, May/June 1995.

    [7] N. Christin, A.S. Weigend, and J. Chuang, "Content Availability, Pollution and Poisoning in File-Sharing P2P Networks," Proc. ACM Conf. e-Commerce, pp. 68-77, 2005.

    [8] E. Damiani, D.C. di Vimercati, S. Paraboschi, P. Samarati, and F. Violante, "A Reputation-Based Approach for Choosing Reliable Resources in Peer-to-Peer Networks," Proc. ACM Conf. Computer and Comm. Security (CCS '02), pp. 207-216, 2002.

    [9] D. Dumitriu, E. Knightly, A. Kuzmanovic, I. Stoica, and W. Zwaenepoel, "Denial-of-Service Resilience in Peer-to-Peer File Sharing Systems," Proc. Int'l Conf. Measurement and Modeling of Computer Systems, pp. 38-49, 2005.

    [10] M. Fetscherin and M. Schmid, "Comparing the Usage of Digital Rights Management Systems in the Music, Film, and Print Industry," Proc. Conf. e-Commerce, 2003.

    [11] J. Frankel and T. Pepper, "The Gnutella Protocol Spec. v0.4, Revision 1.2," http://www9.limewire.com/developer/gnutella_protocol_0.4.pdf, 2000.

    [12] B. Gedik and L. Liu, "A Scalable P2P Architecture for Distributed Information Monitoring Applications," IEEE Trans. Computers, vol. 56, no. 6, pp. 767-782, June 2005.

    [13] M. Hofmann and I. Beaumont, Content Networking, Architecture, Protocols, and Practice, S.F. Kaufmann, ed. 2005.

    [14] Y. Itakura, M. Yokozawa, and T. Shinohara, "Model Analysis of Digital Copyright Piracy on P2P Networks," Proc. Int'l Symp. Applications and the Internet Workshops (SAINT), pp. 84-89, Jan. 2004.

    [15] T. Iwata, T. Abe, K. Ueda, and H. Sunaga, "A DRM System Suitable for P2P Content Delivery and the Study on Its Implementation," Proc. Asia-Pacific Conf. Comm. (APCC), vol. 2, pp. 806-811, Sept. 2003.

    [16] T. Kalker, D.H.J. Epema, P.H. Hartel, R.L. Lagendijk, and M. Van Steen, "Music2Share - Copyright-Compliant Music Sharing in P2P Systems," Proc. IEEE, vol. 92, no. 6, pp. 961-970, June 2004.

    [17] B. Krishnamurthy, C. Wills, and Y. Zhang, On the Use andPerformance of Content Distribution Networks, Proc. SpecialInterest Group on Data Comm. on Internet Measurement Workshop(SIGCOMM), Nov. 2001.

    [18] Y. Kulbak and D. Bickson, The eMule Protocol Specification,Technical Report TR-2005-03, Hebrew Univ., Jan. 2005.

    [19] S.H. Kwok, Watermark-Based Copyright Protection SystemSecurity, Comm. ACM, pp. 98-101, Oct. 2003.

    [20] A. Legout, G. Urvoy-Keller, and P. Michiardi, Rarest First andChoke Algorithms Are Enough, Proc. ACM Special Interest Groupon Data Comm. on Internet Measure (SIGCOMM), pp. 203-216, 2006.

    [21] E. Luoma and H. Vahtera, Current and Emerging Requirements

    for Digital Rights Management Systems Through Examination ofBusiness Networks, Proc. 37th Ann. Hawai Intl Conf. SystemSciences, 2004.

    [22] D.P. Majoras, O. Swindle, T.B. Leary, and J. Harbour, Peer-to-Peer File-Sharing Technology: Consumer Protection and Competi-tion Issues, Federal Trade Commission Report, June 2005.

    [23] N. Mook, P2P Flooder Overpeer Cease Operation, Beta News,http://www.betanews.com/article/P2P_Flooder_Overpeer_Ceases_Operation/1134249644, Dec. 2005.

    [24] N. Mook, P2P Future Darkens as eDonkey Closes, http://www.betanews.com/article/P2P_Future_Darkens_as_eDonkey_Closes/1127953242, Sept. 2005.

    [25] G. Pallis and A. Vakali, Insight and Perspectives for ContentDelivery Networks, Comm. ACM, pp. 101-106, Jan. 2006.

    [26] P. Rodriguez et al., On the Feasibility of Commercial Legal P2PContent Distribution, SIGCOMM Computer Comm. Rev., vol. 36,

    [27] S. Saroiu et al., An Analysis of Internet Content DeliverySystems, SIGOPS Operating System Rev., pp. 315-327, 2002.

    [28] M. Srivatsa and L. Liu, Vulnerabilities and Security Threats inStructured Overlay Networks: A Quantitative Analysis, Proc.20th Ann. Computer Security Applications Conf., 2004.

    [29] J. Sung, J. Jeong, and K. Yoon, DRM Enabled P2P Architecture,Proc. Eighth Intl Conf. Advanced Comm. Technology (ICACT), vol. 1,pp. 487-490, Jan. 2006.

    [30] K. Walsh and E.G. Sirer, Fighting Peer-to-Peer SPAM and Decoys

    with Object Reputation, Proc. ACM Special Interest Group on DataComm. (SIGCOMM) Workshop Economics of Peer-to-Peer Systems,pp. 138-143, 2005.

    [31] C. Wang, B.A. Alqaralleh, B.B. Zhou, F. Brites, and A.Y. Zomaya,Self-Organizing Content Distribution in a Data Indexed DHTNetwork, Proc. Sixth IEEE Intl Conf. P2P Computing, pp. 241-248,2006.

    [32] L. Xiao, Y. Liu, and L.M. Ni, Improving Unstructured P2PSystems by Adaptive Connection Establishment, IEEE Trans.Computers, vol. 54, no. 9, pp. 1091-1103, Sept. 2005.

    [33] M. Yurkewych, B.N. Levine, and A.L. Rosenberg, On the Cost-Ineffectiveness of Redundancy in Commercial P2P Computing,Proc. 12th ACM Conf. Computer and Comm. Security, 2005.

    [34] R. Zhou and K. Hwang, GossipTrust for Fast ReputationAggregation in P2P Networks, IEEE Trans. Knowledge and DataEng., vol. 20, no. 9, pp. 1282-1295, Sept. 2008.

    Xiaosong Lou received the BS degree from Shanghai Jiao Tong University in 1994 and the MS degree in computer engineering in 2005 from the University of Southern California (USC), where he received the PhD degree in computer engineering in 2009. He is currently working at Yahoo as a performance engineer. His research covers P2P content networking, online video streaming, and network security. He is a student member of the IEEE.

    Kai Hwang received the PhD degree from the University of California, Berkeley, in 1972. He is a professor of electrical engineering and computer science at the University of Southern California. He specializes in computer architecture, parallel processing, network security, Web services, distributed computing, and Internet technology. He has led various research groups at Purdue and USC, where he has produced over 20 PhD students. Presently, he also serves as an EMC endowed visiting professor at Tsinghua University, Beijing, China. He has chaired numerous ACM/IEEE international conferences and presented over two dozen keynote addresses at various conferences. He has lectured worldwide and performed advisory work for IBM Fishkill, Intel Scalable System Division, MIT Lincoln Lab., JPL at Caltech, ETL in Japan, Academia Sinica in China, GMD in Germany, and INRIA in France. He is a fellow of the IEEE. More information regarding Dr. Hwang can be found at http://GridSec.usc.edu/Hwang.html.

