AT&T Labs - Research

Information, Gravity, and Traffic Matrices

Yin Zhang, Matthew Roughan, Albert Greenberg, Nick Duffield, David Donoho

Problem

Have link traffic measurements
Want to know demands from source to destination

Example App: reliability analysis

Under a link failure, routes change
Want to find an invariant

Outline

❚ Part I: What do we have to work with – data sources
  ❙ SNMP traffic data
  ❙ Netflow, packet traces
  ❙ Topology, routing and configuration

❚ Part II: Algorithms
  ❙ Gravity models
  ❙ Tomography
  ❙ Combination and information theory

❚ Part III: Applications
  ❙ Network reliability analysis
  ❙ Capacity planning
  ❙ Routing optimization (and traffic engineering in general)

Part I: Data Sources

Traffic Data

Data Availability – packet traces

Packet traces: limited availability – like a high zoom snapshot
• special equipment needed (O&M expensive even if box is cheap)
• lower speed interfaces (only recently OC48 available, only just OC192)
• huge amount of data generated

Data Availability – flow level data

Flow level data: not available everywhere – like a home movie of the network
• historically poor vendor support (from some vendors)
• large volume of data (1:100 compared to traffic)
• feature interaction/performance impact

Data Availability – SNMP

SNMP traffic data – like a time lapse panorama
• MIB II (including IfInOctets/IfOutOctets) is available almost everywhere
• manageable volume of data (but poor quality)
• no significant impact on router performance

Part II: Algorithms

The problem

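Only measure at links

Want to compute the traffic y_j along route j from measurements on the links, x_i

[Figure: example network with a router, labels 1, 2, 3, and route 1, route 2, route 3]

$$
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
$$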

The problem

Only measure at links

[Figure: example network with a router, labels 1, 2, 3, and route 1, route 2, route 3]

Want to compute the traffic y_j along route j from measurements on the links, x_i

x = A^T y
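
As a minimal sketch of the relation x = A^T y, the Python below builds a routing matrix for a three-route example and computes the link loads that would be observed. The matrix entries, the traffic values and the names `ROUTING` and `link_loads` are illustrative assumptions, not taken from the slides.

```python
# Hypothetical routing matrix (the slide's A^T):
# ROUTING[i][j] is 1 if route j crosses link i, else 0.
ROUTING = [
    [1, 0, 1],  # link 1 carries routes 1 and 3
    [1, 1, 0],  # link 2 carries routes 1 and 2
    [0, 1, 1],  # link 3 carries routes 2 and 3
]


def link_loads(routing, route_traffic):
    """Compute x = A^T y: each link load is the sum of the routes using it."""
    return [sum(a * y for a, y in zip(row, route_traffic)) for row in routing]


y = [10.0, 20.0, 30.0]       # made-up traffic on routes 1..3
x = link_loads(ROUTING, y)   # what link measurements would show
print(x)
```

Going from y to x is the easy direction; the estimation problem is the reverse, recovering y when only x and the routing matrix are known.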

Naive approach

In real networks the problem is highly under-constrained
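
To illustrate what under-constrained means here, the sketch below assumes a simplified setting in which only the total traffic sent and received by each node is measured. It shows two different, made-up traffic matrices that produce identical measurements. The helper names `edge_totals` and `count_unknowns` are hypothetical.

```python
def edge_totals(tm):
    """Return (traffic sent by each node, traffic received by each node)."""
    n = len(tm)
    sent = [sum(tm[i]) for i in range(n)]
    received = [sum(tm[i][j] for i in range(n)) for j in range(n)]
    return sent, received


def count_unknowns(n_nodes):
    """Number of source-destination demands between distinct nodes."""
    return n_nodes * (n_nodes - 1)


# Two made-up traffic matrices (row = source, column = destination).
tm_a = [[0, 1, 2],
        [2, 0, 1],
        [1, 2, 0]]
tm_b = [[0, 2, 1],
        [1, 0, 2],
        [2, 1, 0]]

same_measurements = edge_totals(tm_a) == edge_totals(tm_b)
print(same_measurements, tm_a != tm_b)
print(count_unknowns(3), "unknowns vs", 2 * 3, "measured totals")
```

In this toy setting the unknowns grow as N(N-1) while the measured totals grow as 2N, so the gap widens quickly with network size.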

Gravity Model

❚ Assume traffic between sites is proportional to traffic at each site

  y_1 ∝ x_1 x_2
  y_2 ∝ x_2 x_3
  y_3 ∝ x_1 x_3

❚ Assumes there is no systematic difference between traffic in LA and NY
  ❙ Only the total volume matters
  ❙ Could include a distance term, but locality of information is not as important in the Internet as in other networks
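
A minimal sketch of a gravity estimate follows. The slide only states proportionality; the normalization by total traffic, the inclusion of same-site traffic, and the names `gravity_model`, `sent` and `received` are assumptions made for illustration, with made-up volumes.

```python
def gravity_model(sent, received):
    """Estimate T[i][j] as sent[i] * received[j] / total traffic."""
    total = sum(sent)
    if total <= 0:
        raise ValueError("total traffic must be positive")
    return [[s * r / total for r in received] for s in sent]


sent = [30.0, 50.0, 20.0]       # made-up traffic originating at each site
received = [40.0, 40.0, 20.0]   # made-up traffic terminating at each site
estimate = gravity_model(sent, received)
for row in estimate:
    print(row)
```

With this normalization, and when `sent` and `received` add up to the same total, each row of the estimate sums back to the corresponding `sent` value.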

Simple gravity model

Generalized gravity model

❚ Internet routing is asymmetric
❚ A provider can control exit points for traffic going to peer networks

[Figure: peer links and access links]

Generalized gravity model

❚ Internet routing is asymmetric
❚ A provider can control exit points for traffic going to peer networks
❚ Have much less control of where traffic enters

[Figure: peer links and access links]

Generalized gravity model

Tomographic approach

❚ Solve the constraints

[Figure: example network with a router, labels 1, 2, 3, and route 1, route 2, route 3]

x = A^T y

Direct Tomographic approach

❚ Under-constrained problem
❚ Find additional constraints
❚ Use a model to do so
  ❙ Typical approach is to use higher order statistics of the traffic to find additional constraints
❚ Disadvantage
  ❙ Complex algorithm – doesn’t scale
    ❘ ~1000 routers
    ❘ Can reduce size of problem (by looking at the core)
      • Still orders more routers than PoPs
  ❙ Model may not be correct -> result in problems
❚ Alternative: use the gravity model

Combining gravity model and tomography

❚ In general there aren’t enough constraints
❚ Constraints give a subspace of possible solutions

[Figure: constraint subspace and gravity model solution]

Solution

❚ Find a solution which
  ❙ Satisfies the constraint
  ❙ Is close to the gravity model (in some sense)

[Figure: constraint subspace and solution]
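
One possible reading of "close to the gravity model" is plain Euclidean distance. Under that assumption, the sketch below starts from a prior estimate and repeatedly projects it onto each link constraint (Kaczmarz-style row projections). For consistent constraints this converges to the point of the constraint subspace nearest the prior. It is a swapped-in textbook technique chosen for brevity, not necessarily the authors' algorithm, and it ignores non-negativity and any weighting. The routing matrix, loads and prior are made-up values.

```python
def project_onto_constraints(routing, loads, prior, sweeps=200):
    """Move `prior` to the nearest y (Euclidean) with routing * y = loads.

    Assumes the constraints are consistent; uses cyclic row projections.
    """
    y = list(prior)
    for _ in range(sweeps):
        for row, load in zip(routing, loads):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0:
                continue  # a link that carries no route adds no constraint
            residual = load - sum(a * v for a, v in zip(row, y))
            step = residual / norm_sq
            y = [v + step * a for v, a in zip(y, row)]
    return y


# Hypothetical example: two measured links, three unknown routes.
routing = [[1, 1, 0],   # link 1 carries routes 1 and 2
           [0, 1, 1]]   # link 2 carries routes 2 and 3
loads = [30.0, 50.0]
prior = [10.0, 10.0, 10.0]   # stand-in for a gravity model estimate

solution = project_onto_constraints(routing, loads, prior)
print(solution)
```

The result reproduces both link loads while changing the prior as little as possible in the least-squares sense.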

Validation

❚ Results good: ±20% bounds for larger flows
❚ Observables even better
❚ Robust
❚ Fast

Distribution of flow sizes

Estimates over time

Information Theory

❚ Natural relationship to information theory
  ❙ Max entropy:
    ❘ maximize uncertainty given a set of constraints
  ❙ Minimum Mutual Information:
    ❘ minimize the mutual information between source and destination
  ❙ No information
    ❘ The minimum is independence of source and destination
      • P(S,D) = P(S) P(D)
      • P(D|S) = P(D)
      • actually this corresponds to the gravity model
    ❘ Add tomographic constraints:
      • Including additional information as constraints
      • Natural algorithm is one that minimizes the Kullback-Leibler information number of P(S,D) with respect to P(S) P(D)
        – Max relative entropy (relative to independence)
  ❙ Provides a natural distance for use in the previous algorithm
      • Quadratic distances are a linear approximation to the KL distance
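
The link between independence and the gravity model can be checked numerically. The sketch below computes the mutual information of a traffic matrix, that is, the Kullback-Leibler divergence of P(S,D) from P(S) P(D). It is zero for a gravity-style matrix and positive otherwise. The function name `mutual_information` and both example matrices are illustrative assumptions.

```python
import math


def mutual_information(tm):
    """KL divergence (in nats) of the joint P(S,D) from P(S) P(D)."""
    total = sum(sum(row) for row in tm)
    n_src, n_dst = len(tm), len(tm[0])
    p_s = [sum(tm[i]) / total for i in range(n_src)]
    p_d = [sum(tm[i][j] for i in range(n_src)) / total for j in range(n_dst)]
    mi = 0.0
    for i in range(n_src):
        for j in range(n_dst):
            p = tm[i][j] / total
            if p > 0:  # zero entries contribute nothing
                mi += p * math.log(p / (p_s[i] * p_d[j]))
    return mi


# Gravity-style matrix: every entry is sent[i] * received[j] / total.
sent, received = [30.0, 50.0, 20.0], [40.0, 40.0, 20.0]
gravity_tm = [[s * r / 100.0 for r in received] for s in sent]

# A made-up matrix with strong source-destination structure.
structured_tm = [[0.0, 1.0, 2.0],
                 [2.0, 0.0, 1.0],
                 [1.0, 2.0, 0.0]]

print(mutual_information(gravity_tm))
print(mutual_information(structured_tm))
```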

Insights

❚ Gravity model = independence of source and destination
  ❙ Generalized gravity model = independence conditional on class of the source and destination
    ❘ Can now rigorously derive this model
❚ There is a natural distance metric for this problem
❚ The solution can now be seen as showing “how far” we are from the gravity model in a probabilistic sense
❚ We can quantify the distance of the solution from any particular model – e.g. generalized vs simple gravity model
  ❙ Provides a direct method for testing quality of priors, independent of algorithm used to get solution
  ❙ For example, choice model prior used by SprintLab
❚ We know how to add in additional information rigorously
  ❙ Isolated netflow
  ❙ Local traffic matrices

Part III: Applications

Existing Applications

❚ Network Reliability Analysis
  ❙ Consider the link loads in the network under failure
  ❙ Allows “what if” type questions to be asked about link failures (and span, or router failures)
  ❙ Allows comprehensive analysis of network risks
    ❘ What is the link most under threat of overload under likely failure scenarios
  ❙ Used in Planned Cable Intrusions (PCIs)
❚ Capacity planning
  ❙ Results have been used in backbone capacity planning
    ❘ Since Oct 2002 (in conjunction with other data)
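
A "what if" question of this kind can be sketched as follows: keep the estimated demands fixed, swap in the routing matrix that applies after a failure, and recompute the link loads. The triangle topology, capacities, demands and the names `link_loads` and `worst_link` are hypothetical, chosen only to show the mechanics.

```python
def link_loads(routing, demands):
    """Load on each link: sum of the demands whose route crosses it."""
    return [sum(a * d for a, d in zip(row, demands)) for row in routing]


def worst_link(names, loads, capacities):
    """Return (name, utilization) of the most heavily utilized link."""
    utilization = [load / cap for load, cap in zip(loads, capacities)]
    worst = max(range(len(names)), key=lambda i: utilization[i])
    return names[worst], utilization[worst]


LINKS = ["A-B", "B-C", "A-C"]
CAPACITY = [45.0, 45.0, 45.0]
demands = [10.0, 20.0, 30.0]   # made-up demands A->B, B->C, A->C

# Rows are links, columns are demands; 1 means the demand uses the link.
NORMAL = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
AC_FAILED = [[1, 0, 1], [0, 1, 1], [0, 0, 0]]  # A->C detours via B

for label, routing in [("normal", NORMAL), ("A-C failed", AC_FAILED)]:
    loads = link_loads(routing, demands)
    print(label, loads, worst_link(LINKS, loads, CAPACITY))
```

The demands are the invariant under the failure; only the routing matrix changes between scenarios.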

Routing optimization

❚ Used with OSPF optimization
  ❙ Get within 6% of OSPF optimum using true TM
  ❙ Get within 12% of absolute best (e.g. using MPLS)
❚ Has been used on a more limited basis, in connection with reliability analysis
  ❙ OSPF weights computed by trial and error
  ❙ Aim: prevent negative impact from failures
    ❘ Concern in 2002 over three large links in a shared risk group

Conclusion

❚ Nice algorithm
  ❙ Connection with transport theory
  ❙ Connection with information theory
❚ Practical applications
  ❙ Network reliability
  ❙ Capacity planning
  ❙ Routing optimization
❚ To Do
  ❙ Build better prior models
  ❙ Study the traffic matrices themselves
  ❙ Point-to-multipoint traffic matrices
  ❙ Other applications
    ❘ Anomaly detection

Additional slides

Netflow Measurements

❚ Detailed IP flow measurements
  ❙ Flow defined by
    ❘ Source, Destination IP,
    ❘ Source, Destination Port,
    ❘ Protocol,
    ❘ Time
  ❙ Statistics about flows
    ❘ Bytes, Packets, Start time, End time, etc.
  ❙ Enough information to get traffic matrix
❚ Semi-standard router feature
  ❙ Cisco, Juniper, etc.
  ❙ not always well supported
  ❙ potential performance impact on router
❚ Huge amount of data (500GB/day)

SNMP

❚ Pro
  ❙ Comparatively simple
  ❙ Relatively low volume
  ❙ It is used already (lots of historical data)
❚ Con
  ❙ Data quality – an issue with any data source
    ❘ Ambiguous
    ❘ Missing data
    ❘ Irregular sampling
  ❙ Octets counters only tell you link utilizations
    ❘ Hard to get a traffic matrix
    ❘ Can’t tell what type of traffic
    ❘ Can’t easily detect DoS, or other unusual events
  ❙ Coarse time scale (>1 minute typically; 5 min in our case)
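
As a rough sketch of how octet counters turn into link utilizations, the helper below converts two polls of a counter into an average bit rate. It uses the actual timestamps, so irregular sampling is handled, and it assumes a 32-bit counter that wrapped at most once between polls. The function name `octet_rate_bps` and the sample values are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # assumes a 32-bit octet counter


def octet_rate_bps(t0, c0, t1, c1, modulus=COUNTER32_MODULUS):
    """Average bits per second between polls (t0, c0) and (t1, c1).

    Assumes the counter wrapped at most once in the interval; additional
    wraps cannot be detected from two samples alone.
    """
    if t1 <= t0:
        raise ValueError("polls must be in increasing time order")
    delta_octets = (c1 - c0) % modulus  # modulo absorbs a single wrap
    return 8 * delta_octets / (t1 - t0)


# Made-up polls 300 seconds apart, with a counter wrap in between.
rate = octet_rate_bps(0.0, 4294967000, 300.0, 704)
print(rate)
```

A missing poll simply lengthens the interval, which averages the rate over a longer window and further coarsens the time scale.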

Topology and configuration

❚ Router configurations
  ❙ Based on downloaded router configurations, every 24 hours
    ❘ Links/interfaces
    ❘ Location (to and from)
    ❘ Function (peering, customer, backbone, …)
    ❘ OSPF weights and areas
    ❘ BGP configurations
  ❙ Routing
    ❘ Forwarding tables
    ❘ BGP (table dumps and route monitor)
    ❘ OSPF table dumps
❚ Routing simulations
  ❙ Simulate IGP and BGP to get routing matrices

Validation

Some Approaches

❚ Look at a real network
  ❙ Get SNMP from links
  ❙ Get Netflow to generate a traffic matrix
  ❙ Compare algorithm results with “ground truth”
  ❙ Problems:
    ❘ Hard to get Netflow along whole edge of network
      • If we had this, then we wouldn’t need SNMP approach
    ❘ Actually pretty hard to match up data
      • Is the problem in your data: SNMP, Netflow, routing, …
❚ Simulation
  ❙ Simulate and compare
  ❙ Problems
    ❘ How to generate realistic traffic matrices
    ❘ How to generate realistic network
    ❘ How to generate realistic routing
    ❘ Danger of generating exactly what you put in

Our method

❚ We have netflow around part of the edge (currently)
❚ We can generate a partial traffic matrix (hourly)
  ❙ Won’t match traffic measured from SNMP on links
❚ Can use the routing and partial traffic matrix to simulate the SNMP measurements you would get
❚ Then solve inverse problem
❚ Advantage
  ❙ Realistic network, routing, and traffic
  ❙ Comparison is direct, we know errors are due to algorithm not errors in the data
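
The validation loop can be sketched in three steps: take a known traffic matrix, simulate the measurements it would produce, then run an estimator and compare. Everything below is an illustrative assumption: the measurements are reduced to per-node totals, a simple gravity estimate stands in for the real inverse-problem solver, and the traffic values are made up.

```python
def edge_totals(tm):
    """Simulated measurements: traffic sent and received per node."""
    n = len(tm)
    sent = [sum(tm[i]) for i in range(n)]
    received = [sum(tm[i][j] for i in range(n)) for j in range(n)]
    return sent, received


def gravity_estimate(sent, received):
    """Stand-in estimator: T[i][j] = sent[i] * received[j] / total."""
    total = sum(sent)
    return [[s * r / total for r in received] for s in sent]


def relative_errors(truth, estimate):
    """Per-entry (estimate - truth) / truth; assumes no zero entries."""
    return [[(e - t) / t for t, e in zip(t_row, e_row)]
            for t_row, e_row in zip(truth, estimate)]


# Made-up "ground truth" matrix, standing in for a netflow-derived one.
true_tm = [[10.0, 20.0, 30.0],
           [5.0, 10.0, 5.0],
           [15.0, 10.0, 5.0]]

sent, received = edge_totals(true_tm)         # 1. simulate measurements
estimate = gravity_estimate(sent, received)   # 2. solve the inverse problem
errors = relative_errors(true_tm, estimate)   # 3. compare with the truth
for row in errors:
    print([round(e, 3) for e in row])
```

Because the measurements are generated from the same matrix used as ground truth, any error in the output comes from the estimator rather than from inconsistent data sources.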

