1
One-Click Hosting Services: A File-Sharing Hideout
Demetris [email protected]
Evangelos P. [email protected]
ICS-FORTH, Heraklion, Crete, Hellas
Constantine [email protected]
College of Computing, Georgia Tech
By X.F. Chen
2
Content
• File Sharing & OCH
• Clients Characteristics
• Service Architecture
• RapidShare vs. BitTorrent
• Content Indexing Sites
• Conclusion
3
File Sharing
• One of the most popular Internet user activities
• 60-70% of total traffic volume
• Since 2005, a large number of One-Click Hosting (OCH) services have appeared
– Mainly used for file-sharing
4
What's OCH
• Provide file hosting services at no cost
• Provide a unique URL to the uploader, which she can share with her friends & communities
• Provide no indexing for the hosted files
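The three properties above can be illustrated with a minimal in-memory sketch. The class name `OneClickHost`, the placeholder domain `och.example`, and the use of a random token are assumptions made for illustration only; they are not taken from RapidShare or any real service.

```python
import secrets


class OneClickHost:
    """Toy in-memory model of an OCH service (hypothetical, for illustration)."""

    BASE_URL = "https://och.example/files/"  # assumed placeholder domain

    def __init__(self):
        self._files = {}  # token -> file bytes; deliberately never listed

    def upload(self, data: bytes) -> str:
        """Store the file and return a hard-to-guess URL to the uploader."""
        token = secrets.token_urlsafe(16)
        self._files[token] = data
        return self.BASE_URL + token

    def download(self, url: str) -> bytes:
        """Serve the file only to someone who already knows its URL."""
        token = url[len(self.BASE_URL):]
        return self._files[token]


host = OneClickHost()
url = host.upload(b"holiday video")
data = host.download(url)
```

The class has no search or listing method, so a file can only be reached by someone who was given its URL. This is why external indexing sites matter for discovering content.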
5
Upload Phase
6
Download Phase
7
Collected Data
• Two monitoring points
– Monitor1: NREN, ~10K total users (~750 IP)
– Monitor2: University, ~1K total users (~450 IP)
8
Flow Sizes
• 90% of the flows < 150 KBytes
– Probably page access flows
• Upload flows range from several MB to 2 GB
• Two main drops
– Maximum upload file size limit
9
Free Vs. Paying Clients
• Free-user downloads are rate-limited to 0.2 Mb/s – 2.0 Mb/s
• Only 20% of the users experience greater download throughput: the subscribers
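The split above suggests a simple way to label a download from its measured throughput. The following is a minimal sketch, assuming that any download faster than the 2.0 Mb/s cap belongs to a subscriber; the function names are hypothetical and the slides do not describe the exact classification procedure.

```python
FREE_CAP_MBPS = 2.0  # upper end of the free-user rate limit quoted above


def throughput_mbps(num_bytes: int, seconds: float) -> float:
    """Average throughput of one download in megabits per second."""
    return num_bytes * 8 / seconds / 1_000_000


def looks_like_subscriber(num_bytes: int, seconds: float) -> bool:
    """Assumption: only subscribers can exceed the free-user cap."""
    return throughput_mbps(num_bytes, seconds) > FREE_CAP_MBPS


# 100 MB in 800 s is 1.0 Mb/s; 100 MB in 100 s is 8.0 Mb/s.
slow = looks_like_subscriber(100_000_000, 800)
fast = looks_like_subscriber(100_000_000, 100)
```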
10
File Popularity
Unique downloaders per file
• 75% of the files downloaded only once
• Only 0.05% downloaded by more than 5 users
Caching will be useless in OCH
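Figures of this kind can be derived from a download trace by counting unique downloaders per file. The sketch below assumes the trace is available as (file_id, client_id) pairs, which is a hypothetical format, and it treats "downloaded only once" as "exactly one unique downloader".

```python
from collections import defaultdict


def popularity_stats(downloads):
    """downloads: iterable of (file_id, client_id) pairs from a trace."""
    clients_per_file = defaultdict(set)
    for file_id, client_id in downloads:
        clients_per_file[file_id].add(client_id)

    counts = [len(clients) for clients in clients_per_file.values()]
    total = len(counts)
    once = sum(1 for c in counts if c == 1) / total
    more_than_five = sum(1 for c in counts if c > 5) / total
    return once, more_than_five


# Made-up trace: f1 and f2 have one downloader each, f3 has six.
trace = [("f1", "a"), ("f2", "b"), ("f1", "a")]
trace += [("f3", c) for c in "abcdef"]
once, more_than_five = popularity_stats(trace)
```

A file that is requested only once can never produce a cache hit, which is the reasoning behind the caching remark.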
11
Service Architecture
Try to infer the architecture of the RapidShare service by answering:
• What is the total number of servers used by RapidShare?
• Where are these servers located?
• How many copies of a file are stored in RapidShare?
• How does RapidShare balance the load between servers?
How is this architecture different from traditional content distribution networks?
12
Total Number of Servers Used
• 5,291 distinct server IP addresses
• 36 /24 subnets
• 8 different ISPs
• Large increase in the number of servers during Sep'08
– Infrastructure update
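Counts like these can be reproduced from a list of observed server addresses. The following sketch uses Python's standard `ipaddress` module; the function name and the sample addresses are assumptions, and the addresses come from the documentation ranges rather than from RapidShare.

```python
import ipaddress


def summarize_servers(ip_strings):
    """Return (number of distinct IPs, number of distinct /24 subnets)."""
    ips = {ipaddress.ip_address(s) for s in ip_strings}
    subnets = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in ips}
    return len(ips), len(subnets)


# Documentation-range addresses (RFC 5737), not real RapidShare servers.
observed = ["192.0.2.10", "192.0.2.11", "192.0.2.10", "198.51.100.7"]
n_ips, n_subnets = summarize_servers(observed)
```

As a sanity check on the reported numbers, 36 /24 subnets contain 36 × 256 = 9,216 addresses, so 5,291 distinct server addresses fit comfortably.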
13
Server Location
Discover the geographical location of the server infrastructure
• Performed a number of traceroutes from different PlanetLab locations
• Used minimum RTT to infer distance from landmarks
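The slides do not say how RTT is turned into distance. One common approximation, assumed here, is that signals travel through fiber at about two thirds of the speed of light, so half of the minimum RTT gives an upper bound on the distance. The landmark names and RTT values below are made up.

```python
SPEED_OF_LIGHT_KM_PER_MS = 299.792458
FIBER_FACTOR = 2 / 3  # assumed propagation speed in fiber relative to c


def max_distance_km(rtt_samples_ms):
    """Upper bound on the distance to a server from one landmark."""
    min_rtt = min(rtt_samples_ms)  # the minimum is least affected by queueing
    one_way_ms = min_rtt / 2
    return one_way_ms * SPEED_OF_LIGHT_KM_PER_MS * FIBER_FACTOR


def closest_landmark(rtts_by_landmark):
    """Pick the landmark with the smallest minimum RTT."""
    return min(rtts_by_landmark, key=lambda name: min(rtts_by_landmark[name]))


# Made-up RTT samples in milliseconds for two hypothetical landmarks.
samples = {"landmark-A": [12.0, 10.0, 15.0], "landmark-B": [48.0, 52.0]}
nearest = closest_landmark(samples)
bound_km = max_distance_km(samples[nearest])
```

With a 10 ms minimum RTT, the server can be at most roughly 1,000 km from the landmark under this assumption.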
14
Server Location cont.
• min-RTT values show a single central datacenter
• Datacenter is closest to the central-European landmarks
15
Content Replication
What is the number of servers that store each file?
• Used Tor as a geographically distributed downloader
– 421 different exit nodes
• Requested 22,000 RapidShare file URLs
Each file indexed by exactly 1 server
Each file served by exactly 12 servers (group)
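The replication count can be obtained by recording which server answers each request and counting distinct servers per file. This is a minimal sketch assuming the measurements are logged as (file_url, server_id) pairs; both the log format and the sample values are hypothetical.

```python
from collections import defaultdict


def servers_per_file(observations):
    """observations: iterable of (file_url, server_id) pairs."""
    servers = defaultdict(set)
    for file_url, server_id in observations:
        servers[file_url].add(server_id)
    return {url: len(ids) for url, ids in servers.items()}


# Made-up log: file "x" was served by three servers, file "y" by one.
log = [("x", "s1"), ("x", "s2"), ("x", "s1"), ("x", "s3"), ("y", "s9")]
replicas = servers_per_file(log)
```

The count is only a lower bound on the true number of replicas, since a server that was never selected during the measurement does not appear in the log.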
16
Server Upload Balancing
Which server group will host a newly uploaded file?
• 50,000 file upload requests
• Log upload group-id
Recently added groups have a higher likelihood of being selected as the upload group
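A skew of this kind becomes visible once the logged group-ids are turned into per-group shares. The sketch below assumes the log is simply a list of group-ids, one per upload request; the values are made up.

```python
from collections import Counter


def upload_group_shares(group_ids):
    """Fraction of upload requests assigned to each server group."""
    counts = Counter(group_ids)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Made-up log of group-ids returned for eight upload requests.
logged = [7, 7, 3, 7, 1, 7, 3, 7]
shares = upload_group_shares(logged)
most_used = max(shares, key=shares.get)
```

Under uniform selection every group would receive a similar share, so a group whose share is far above the others is being preferred.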
17
Server Download Balancing
Which download server of that group will be used upon a download request?
• 1,000 back-to-back file download requests
• Log download server
Indexing servers are less likely to be selected as download server
18
OCH Services vs. CDNs
One-Click Hosting services
• Data-center in a single location
• Focus on large transfers that are less sensitive to delay
• Get revenues from Premium users
• Content replicated on multiple servers
Content Distribution Networks
• Multiple geographically distributed servers so as to minimize delay observed by client
• Aim to minimize Web transaction delay
• Get revenues from large content producers
• Content replicated on multiple servers
19
Challenging the P2P Paradigm
P2P has been (and continues to be) the most popular File-Sharing mechanism
BitTorrent Vs. RapidShare.com
– Download Throughput
– Content Availability
20
BT Vs. RS: Download Throughput
• RS subscribers outperform open BitTorrent trackers in terms of throughput
• Free users have a comparable download experience
21
BT Vs. RS: Content Availability
• Searched for a number of different files in both networks
RapidShare.com holds at least as many objects as BitTorrent
22
Content Indexing Websites
Form an important component for the emergence of OCH services
Crawled 4 different Indexing Websites
• Identify the contributors of the traffic
• Identify the size of the shared objects
• Identify the types of shared objects
23
Indexing Websites
• Less than 20% of the files are not available
• Only a small number of users upload content
Name                 # Indexed Objects  RS Hosted Objects  # of Stale Files  # of Uploaders
egydown.com                        972                787  134 (17%)         N/A
rapidmega.info                     942                893  116 (13%)         9
rslinks.org                      12124              11841  64 (0.5%)         21
rapidshareindex.com              54327              36522  7052 (19.3%)      18
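The percentages in the stale-file column appear to be computed relative to the RS-hosted objects rather than to all indexed objects. This is an inference from the numbers, not something the slides state, and the short check below recomputes them under that assumption.

```python
# (site, RS hosted objects, stale files) copied from the table above.
rows = [
    ("egydown.com", 787, 134),
    ("rapidmega.info", 893, 116),
    ("rslinks.org", 11841, 64),
    ("rapidshareindex.com", 36522, 7052),
]

# Stale files as a percentage of RS-hosted objects, rounded to one decimal.
stale_pct = {site: round(100 * stale / hosted, 1) for site, hosted, stale in rows}
```

Every recomputed value matches the table after rounding and stays below 20%, in line with the first bullet of the slide.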
24
Content Contributors
• A small number of users are responsible for most of the uploaded content
25
Shared Objects
• At most 60% of objects consist of a single URL
• Users share mostly Videos and Applications
26
Copyrighted Material
• Manually inspected the 100 most recent objects uploaded to each website
• In all cases more than 84% of the objects are copyrighted
27
Conclusions
For OCH by this paper:
• Responsible for a significant share of daily Web traffic
• Most files are downloaded only once
• All servers at a multihomed single datacenter (vs. CDN)
• Free users experience performance similar to BitTorrent users & Subscribers (~20%) experience better performance
• Most users do not contribute to sharing files (only download)
Question:
• Will OCH services be an alternative to P2P for file-sharing in the future?
28
Backup slides