The evolution of internet
traffic optimisation
In the beginning
Communication was circuit-switched
Quality was maintained via connection-admission-
control on a per-session basis
Network is very oversubscribed at multiple points
Erlang is king of capacity prediction
• Very high quality ‘guarantees’ provided
Voice industry still uses this model today
Key simplifications that allow Erlang to work
• 1 fixed path
• Fixed capacity used
Neither simplification works well for packet data
access
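To make the Erlang model above concrete, here is a minimal sketch of the Erlang B blocking calculation commonly used to size circuit-switched capacity. It relies on exactly the two simplifications listed: one fixed path and a fixed capacity per session. The function names and example figures are illustrative assumptions, not taken from the slides.

```python
def erlang_b(offered_load_erlangs: float, circuits: int) -> float:
    """Blocking probability when `circuits` trunks are offered the given load.

    Uses the standard recurrence B(0) = 1, B(n) = A*B(n-1) / (n + A*B(n-1)),
    which avoids the large factorials of the closed-form expression.
    """
    if offered_load_erlangs < 0 or circuits < 0:
        raise ValueError("load and circuit count must be non-negative")
    blocking = 1.0
    for n in range(1, circuits + 1):
        blocking = (offered_load_erlangs * blocking) / (n + offered_load_erlangs * blocking)
    return blocking


def circuits_needed(offered_load_erlangs: float, target_blocking: float) -> int:
    """Smallest circuit count whose blocking is at or below the target."""
    if not 0 < target_blocking < 1:
        raise ValueError("target blocking must be strictly between 0 and 1")
    circuits = 0
    while erlang_b(offered_load_erlangs, circuits) > target_blocking:
        circuits += 1
    return circuits


# Hypothetical sizing question: 100 erlangs of offered load, 1% blocking target.
print(circuits_needed(100.0, 0.01))
```

Because each admitted session consumes one fixed unit of capacity, admission control can refuse a call rather than degrade the calls already in progress. Packet data offers no equivalent fixed unit.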
And then came data
Traditionally all oversubscription is modelled at the edge
• Modem banks in dialup
• DSLAM aggregation BRAS in DSL
• RF QAM in cable
• RF in wireless
But data is packet switched
• Multiple destinations simultaneously
• Short-lived ‘flows’
• Variable bit-rate applications even for VoIP and video
• Mix of application needs
• Internet is client->server, access is asymmetric
No ‘Erlang’ for data; everything is best effort
Capacity is added based on 95th-percentile peaks
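As a rough illustration of planning against 95th-percentile peaks, the sketch below uses the nearest-rank method: sort the utilisation samples, discard the top 5%, and take the largest remaining value. The sample interval and figures are assumptions made for the example.

```python
def percentile_95(samples_bps):
    """Return the 95th-percentile sample using the nearest-rank method."""
    ordered = sorted(samples_bps)
    if not ordered:
        raise ValueError("need at least one sample")
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical 5-minute utilisation samples (bps): mostly quiet, a few bursts.
samples = [40_000_000] * 95 + [90_000_000] * 5
planning_figure = percentile_95(samples)
print(planning_figure)
```

With this method the shortest bursts (the top 5% of samples) do not move the planning figure at all, which is one reason best-effort capacity planning says little about short-term congestion.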
And then came P2P filesharing
Asymmetric networks are driven to symmetric
needs
Oversubscription starts to become a transit
problem too
Oversubscription ratio changes from 5 kbps/sub to 10, then 20, then 50 in the space of a few years
Best effort is no longer good enough; consumers start to complain about quality and responsiveness
Enthusiasts using gaming and VoIP are the ‘canary’ in the coal mine
Access providers respond with Layer-4 policies giving lower priority to P2P
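A Layer-4 policy of this kind can be sketched as a lookup on transport-header fields only. The port list and class names below are assumptions for illustration; 6881-6889 is the range historically used by default by BitTorrent clients.

```python
# Illustrative only: the port list and class names are assumptions.
P2P_PORT_RANGES = [(6881, 6889)]  # historical default BitTorrent range


def classify_by_port(protocol: str, src_port: int, dst_port: int) -> str:
    """Return a priority class using Layer-4 header fields alone."""
    if protocol.lower() in ("tcp", "udp"):
        for low, high in P2P_PORT_RANGES:
            if low <= src_port <= high or low <= dst_port <= high:
                return "low"
    return "default"


print(classify_by_port("tcp", 51000, 6881))   # matched by port
print(classify_by_port("tcp", 51000, 51413))  # same application on another port
```

The second call shows the weakness of a pure port match: once the application moves to a port outside the list, the flow falls back to the default class.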
P2P filesharing responds
Dynamic ports
Distributed trackers
End-to-end encryption
Providers add packet inspection to their policy management
• Now they have statistics for better planning
• Policies become explicit
And then came convergence
Initially just used by enthusiasts, VoIP and Video
moved mainstream with Vonage, Skype, Hulu, …
Drove quality issues to the forefront
P2P filesharing keeps growing, but its pro-rata
share goes down
[Figure: Downstream bps per sub (US MSO amalgam), split by HTTP, Streaming and P2P; axis range 6500 to 15500]
And then came consumption billing
This is not typically used as a means of congestion alleviation
• You need a very low quota to have this effect
• Daily / monthly top users do not particularly over-contribute to peak usage
Commonly used for service creation
Variations in approach as to monetisation
User monthly usage histogram (US MSO amalgam, downstream)
[Figure: % Subscribers (axis 0% to 18%) and % Cumul Subscribers (axis 0% to 100%) per monthly usage bin, in power-of-two bins from 0-4KB up to 512GiB-1TiB]
But it’s all about policy management
Makes network policies explicit
• Forces people to codify things they would prefer not to think about
• Should subscriber tier trump type of application?
» E.g. is any traffic of a ‘gold’ user higher priority than the VoIP traffic of a ‘bronze’ user?
• How do I define ‘fair’?
» Equal? (back to circuit-switched, Erlang)
» Linear? Each user demanding bandwidth has an equal chance in each time slot?
» Non-linear? Lower weighting for historically high users (like the Unix scheduler)
» Equal chance of an application working well? (My chance of a good Skype call is equal to your chance of a good score in a game?)
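One way to make the ‘fair’ options above concrete is weighted max-min sharing, sketched below. Equal weights give the linear case; a reduced weight for a historically heavy user gives the non-linear case. This is a sketch under assumed names and figures, not a description of any particular product's scheduler.

```python
def weighted_max_min(capacity, demands, weights=None):
    """Share `capacity` among users by weighted max-min fairness.

    demands: {user: demanded rate}; weights: {user: positive weight}.
    No user receives more than it demands; capacity a user cannot use is
    redistributed to the remaining users in proportion to their weights.
    """
    if weights is None:
        weights = {user: 1.0 for user in demands}
    allocation = {user: 0.0 for user in demands}
    active = {user for user, demand in demands.items() if demand > 0}
    remaining = float(capacity)
    while active and remaining > 0:
        per_weight = remaining / sum(weights[u] for u in active)
        # Users whose whole demand fits inside their weighted share.
        satisfied = {u for u in active if demands[u] <= per_weight * weights[u]}
        if not satisfied:
            # Everyone left is bottlenecked: split what remains by weight.
            for u in active:
                allocation[u] = per_weight * weights[u]
            break
        for u in satisfied:
            allocation[u] = float(demands[u])
            remaining -= demands[u]
        active -= satisfied
    return allocation


# Hypothetical shared access of 10 Mbps with three active users (demands in Mbps).
demands = {"a": 2, "b": 8, "c": 8}
print(weighted_max_min(10, demands))                               # equal weights
print(weighted_max_min(10, demands, {"a": 1, "b": 1, "c": 0.5}))   # "c" is a historically heavy user
```

The choice of weights is the policy decision. The arithmetic is the same whether weights come from subscriber tier, application type, or usage history.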
Couple this with broadband growth slowing
» Opex becomes important
» Churn becomes expensive
» Investigating tiers to increase ARPU
Operators are incented to provide the best quality to prevent churn
But what should the policy be?
Good, bad, or ugly, technology allows for it!
• This isn’t strictly a technology problem; it relates to a balancing of interests
Need a set of ‘best practices’
• Avoid a lot of competing solutions to vaguely defined problems
• Lots of ideas abound
» Limit bulk to % of total
» Give a soft and a hard limit
» User-selected QoS and credit schemes
Simple works well
• WFQ @ each location, mark
Watch out for dynamic-bandwidth access technologies, tunnel re-routes, Mobile IP
So what does ‘DPI’ provide?
‘DPI’ is really about:
• Measurements to show what is going on now
• Policy management with more conditions & actions
» If ‘subscriber tier == x’ && ‘application == y’ && ‘time == z’ …
» So now you can make your policies very explicit
• More flexible queuing and shaping options
» Rather than the 2-4 queues a router port provides, give millions
» Subscriber aware and IP grouping aware
» Strict and weighted fair options
• Topology knowledge
• Congestion measurement
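The condition-and-action style of policy described above can be sketched as an ordered rule list where the first match wins. The rule fields, tier names, application labels and actions are all hypothetical, chosen only to mirror the ‘subscriber tier’, ‘application’ and ‘time’ conditions on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    """One policy rule; a field left as None matches anything."""
    action: str
    tier: Optional[str] = None
    application: Optional[str] = None
    hours: Optional[Tuple[int, int]] = None  # [start, end) in local hours, no wrap past midnight

    def matches(self, tier: str, application: str, hour: int) -> bool:
        if self.tier is not None and self.tier != tier:
            return False
        if self.application is not None and self.application != application:
            return False
        if self.hours is not None and not (self.hours[0] <= hour < self.hours[1]):
            return False
        return True


def decide(rules, tier, application, hour, default="best-effort"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(tier, application, hour):
            return rule.action
    return default


# Hypothetical rule set: rule order encodes the operator's priorities.
RULES = [
    Rule(action="priority", application="voip"),
    Rule(action="low", tier="bronze", application="p2p", hours=(18, 23)),
    Rule(action="assured", tier="gold"),
]

print(decide(RULES, "bronze", "voip", 20))
print(decide(RULES, "gold", "p2p", 20))
```

Rule order is where the tier-versus-application question gets answered. Here the VoIP rule precedes the gold-tier rule, so application trumps tier; swapping the two reverses that choice.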
High HTTP subscriber, 15-min interval
Top user w/ high HTTP examined [~10 hr period]
Peaks upload and download together
Media center PC is using megaupload.com
Our Asian customers have seen such network hard drives for some time
High NNTP subscriber, 1s interval
[Figure: Downstream rate (bps) @ 1s interval; y-axis 0 to 6000000, x-axis 0 to 120]
Tends to operate 24x7
Tends to be very few subscribers
Congestion and cost are on very short boundaries
• Let’s examine an NNTP user’s achieved BW @ 1s granularity
Upstream is solely due to TCP ACK!
Not a high % of bandwidth per network, but high per shared access where it’s in use
• Per-shared-access WFQ is most effective and fair
• COPS/ACL can also be effective but is not fair
• DSCP marking is also very effective
[Figure: Upstream rate (bps) @ 1s interval; y-axis 0 to 100000, x-axis 0 to 120]