Object Placement in Video Content Distribution Networks
Mohammad Faraji, Kianoosh Mokhtarian
Department of Electrical and Computer Engineering, University of Toronto
December 2011
Background
8 years of video content added to YouTube every day
Terabytes a day; Petabytes a year
Growth is expected to accelerate further
Higher-quality video streams (currently only 10% are HD)
Content distribution infrastructure
Several datacentres around the world
User request sent to closest datacentre (DNS/HTTP redirect)
Motivation
Store video files across datacentres (DCs)
Generously replicate all videos on DCs?
Not viable
Growth of data volume >> storage cost
Good News from Measurement Studies
Popularity of video depends on geographical location
More than half of the time, only an initial portion of the video is downloaded
=> Place (partial) video files in selected locations
Modeling
Input: history of user requests (video v for IP address i)
Distance from i to each datacentre?
Use an Internet Coordinate System (ICS)
Delay(i, j) = Euclidean_distance[ ICS(i), ICS(j) ]
Make tracking of requests scalable
Cluster user IPs into regions in the Euclidean space of ICS
Popularity matrix P[region, video]
Distance matrix D[region, datacentre]
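A minimal sketch of how the two matrices could be built, assuming every IP and datacentre already has ICS coordinates and that region centres have been produced by some clustering step (for example k-means). The function names, the `ics` dictionary, and the `region_centers` list are illustrative and do not come from the slides.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    """Euclidean distance between two ICS coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(region_centers, dc_coords):
    """D[region][datacentre]: estimated delay as Euclidean distance in ICS space."""
    return [[euclidean(r, d) for d in dc_coords] for r in region_centers]

def nearest_region(coord, region_centers):
    """Index of the region centre closest to an ICS coordinate."""
    return min(range(len(region_centers)),
               key=lambda r: euclidean(coord, region_centers[r]))

def popularity_matrix(requests, ics, region_centers):
    """P[region][video]: number of logged requests for each video per region.

    requests: iterable of (ip, video) pairs (assumed log format)
    ics: dict mapping ip -> ICS coordinate tuple (assumed to be available)
    """
    P = defaultdict(lambda: defaultdict(int))
    for ip, video in requests:
        P[nearest_region(ics[ip], region_centers)][video] += 1
    return P
```

Aggregating per region rather than per IP keeps P at size (regions x videos), which is what makes request tracking scalable.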
Partial Video Files
First minute of video downloaded many more times
Store partial video files
More effective caching
Lower start-up delay
Partial popularity assumed independent of region
Download reports: (v, 1MB), (v, 2.3MB), (v, 0.5 MB), ...
Compress into a few entries for each video (dynamic alg)
PP[v] = (0...1MB, 100 times), (1MB...end, 50 times)
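As a rough illustration of how download reports might be turned into partial-popularity entries, the sketch below builds the exact step function: each byte range is labelled with the number of downloads that reached its end. The slides do not specify the compression algorithm, so this is only the uncompressed starting point; merging adjacent ranges into "a few entries" is left out. The function name and the MB units are assumptions.

```python
def partial_popularity(download_sizes_mb, video_size_mb):
    """Return [(start_mb, end_mb, times)] for one video.

    times = number of reported downloads that covered the whole range
    [start_mb, end_mb). Reports larger than the video are clipped.
    """
    sizes = [min(s, video_size_mb) for s in download_sizes_mb if s > 0]
    segments, start = [], 0.0
    for cut in sorted(set(sizes)):
        times = sum(1 for s in sizes if s >= cut)
        segments.append((start, cut, times))
        start = cut
    if start < video_size_mb:
        # Tail of the video that no reported download reached.
        segments.append((start, video_size_mb, 0))
    return segments

# Reports (v, 1MB), (v, 2.3MB), (v, 0.5MB) for a hypothetical 5 MB video.
print(partial_popularity([1.0, 2.3, 0.5], 5.0))
```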
Problem Statement
Assign (part of) each video to one or more DCs
Minimize distance of video to user (region), given:
The distance matrix D[region, datacentre]
The expected download pattern P[region, video]
Partial popularity PP[video]
The storage limitation of each DC
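One possible way to write this down, offered as a sketch rather than the authors' exact formulation. It assumes each video v is split into chunks c of size s_{v,c}, that w_{v,c} is the fraction of downloads of v that reach chunk c (derived from PP[v]), that S_d is the capacity of DC d, and that every chunk must be stored at least once. The binary variable x_{v,c,d} is 1 when chunk c of video v is stored on DC d. All of these symbols are introduced here for illustration.

```latex
\begin{aligned}
\min_{x}\quad & \sum_{r}\sum_{v}\sum_{c} P[r,v]\; w_{v,c}\; \min_{d\,:\,x_{v,c,d}=1} D[r,d] \\
\text{s.t.}\quad & \sum_{v}\sum_{c} s_{v,c}\, x_{v,c,d} \le S_d \quad \forall d, \\
& \sum_{d} x_{v,c,d} \ge 1 \quad \forall v, c, \\
& x_{v,c,d} \in \{0,1\}.
\end{aligned}
```

The inner minimum reflects that a request from region r is served by the nearest DC holding the chunk.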
Problem Hardness
Simpler alternatives
Store one video file on a few selected DCs
NP-Complete (min set cover, max coverage)
Store multiple video files on one DC
NP-Complete (knapsack)
Solution
Maintain a utility matrix U[v, d]
Utility of replicating "the next chunk of" video v on DC d
Auxiliary priority queues
1. Find the highest-utility video v*
2. Place the next chunk of v* on the best DC d*
3. Update row v* of U, and what the next chunk is for v*
Complexity: O[ (total video replicas) x (log[# videos] + log[# DCs] + log[max chunks/video]) ]
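The greedy loop above can be sketched as follows. The slides do not define the utility, so this sketch assumes it is the popularity-weighted distance saved per unit of storage. It also assumes each DC stores a prefix of a video's chunks, and it uses a single lazily updated heap in place of the auxiliary priority queues, so it does not claim the stated complexity. The `far` parameter (distance charged when a chunk is stored nowhere) and all names are hypothetical.

```python
import heapq

def greedy_placement(P, D, chunks, capacity, far=1e6):
    """Greedy chunk placement sketch.

    P[r][v]: expected requests from region r for video v
    D[r][d]: distance from region r to DC d
    chunks[v]: list of (size, weight); weight = share of downloads reaching the chunk
    capacity[d]: storage capacity of DC d
    Returns prefix[v][d] = number of leading chunks of v stored on d.
    """
    R, V, n_dc = len(P), len(chunks), len(capacity)
    free = list(capacity)
    prefix = [[0] * n_dc for _ in range(V)]
    # best[v][c][r]: distance from region r to the nearest stored copy of chunk c
    best = [[[far] * R for _ in chunks[v]] for v in range(V)]

    def utility(v, d):
        """Assumed utility: weighted distance saved per unit of storage."""
        c = prefix[v][d]
        size, weight = chunks[v][c]
        saved = sum(P[r][v] * max(0.0, best[v][c][r] - D[r][d]) for r in range(R))
        return weight * saved / size

    # One heap entry per (video, DC) pair; keys are negated for a max-heap.
    heap = [(-utility(v, d), v, d)
            for v in range(V) if chunks[v] for d in range(n_dc)]
    heapq.heapify(heap)
    while heap:
        neg_u, v, d = heapq.heappop(heap)
        c = prefix[v][d]
        size = chunks[v][c][0]
        if size > free[d]:
            continue  # free space only shrinks, so this candidate is dropped
        u = utility(v, d)
        if u <= 0:
            continue  # no benefit; dropping also stops later chunks on d (simplification)
        if u < -neg_u:
            heapq.heappush(heap, (-u, v, d))  # stale key: re-queue with fresh utility
            continue
        # Place the next chunk of v on d and update the affected state.
        free[d] -= size
        prefix[v][d] = c + 1
        for r in range(R):
            best[v][c][r] = min(best[v][c][r], D[r][d])
        if prefix[v][d] < len(chunks[v]):
            heapq.heappush(heap, (-utility(v, d), v, d))
    return prefix
```

Lazy re-evaluation is valid here because, under the assumed utility, placing a chunk can only lower the utility of other candidates, never raise it.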
Evaluation (in Progress): Data
File size and length of ~200K videos from [Cheng 2010]
Distances in Internet
Pairwise delay between 2500 nodes from [Wong 2005]
Video popularities
Global: Zipf-distributed (as repeatedly reported)
Local: synthetic
Partial video popularities
Generated according to [Qiu 2010]
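For the global popularities, a Zipf-distributed request stream could be generated along these lines. The exponent `s`, the seed, and the function names are assumptions for illustration; the slides do not give the parameters actually used.

```python
import random

def zipf_weights(n_videos, s=1.0):
    """Normalized Zipf weights: the video at rank k gets weight proportional to 1 / k**s."""
    raw = [1.0 / (rank ** s) for rank in range(1, n_videos + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def sample_requests(n_videos, n_requests, s=1.0, seed=0):
    """Draw video indices (0 = most popular) according to the Zipf weights."""
    rng = random.Random(seed)
    weights = zipf_weights(n_videos, s)
    return rng.choices(range(n_videos), weights=weights, k=n_requests)
```

With s = 1, the most popular video is requested about twice as often as the second and three times as often as the third.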
Evaluation (in Progress): Results
Total delay, given our placement
Delay with and without partial file storage
Comparison to simple threshold-based distributed caching
Running time
Estimated communication overhead
Take-Away
Benefits of storing partial video files on selected DCs
Future work
Several further details for a complete working system ...
Low-overhead collection of (sub-samples of) downloads
Estimate near-future download patterns
Carefully cluster users into a limited number of regions
Solving video placement by multiple nodes
Incremental algorithm; can't shuffle everything every night
Appendix: Previous Work
Cooperative web caching
Hierarchical, distributed, hybrid
CDN design (various flavors)
Video caching
On a single cache
To optimize for VCR-like functions