Coordinated Sampling sans Origin-Destination Identifiers:
Algorithms and Analysis
Vyas Sekar, Anupam Gupta, Michael K. Reiter, Hui Zhang
Carnegie Mellon University, Univ. of North Carolina Chapel-Hill
Flow Monitoring is critical for effective Network Management
Traffic Engineering
Analyze new user apps
Anomaly Detection
Network Forensics
Worm Detection
Accounting
Botnet analysis
…….
Need high-fidelity measurements
Respect resource constraints
High flow coverage
Provide network-wide goals
How do we meet the requirements?
Respect resource constraints
High flow coverage
Provide network-wide goals
Flow Sampling
Network-Wide Coordination & Optimization
cSamp [NSDI ’08]
Network-wide coordination
Assign non-overlapping ranges per OD-pair or path
All routers configured with same hash function/key
[Figure: routers each holding a sampling manifest with a hash range: [1,5], [1,3], [3,7], [1,2], [7,9], [5,8]]
Generating Sampling Manifests
Inputs:
• OD-pair info: Traffic, Path (routers)
• Router constraints, e.g., SRAM for flow records
Network-wide Optimization (@ NOC): Linear Program
Objective: $\max \sum_{i \in \mathrm{ODPairs}} \mathrm{Coverage}_i \times \mathrm{Traffic}_i$
Subject to achieving maximum $\min_{i \in \mathrm{ODPairs}} \{\mathrm{Coverage}_i\}$
Output:
• Sampling manifests: {<OD-Pair, Hash-range>} per router
cSamp algorithm on each router
1. Get OD-Pair from packet
2. Compute hash (flow = packet 5-tuple)
3. Look up hash-range for OD-pair from sampling manifest
4. Log if hash falls in range for this OD-pair
[Figure: router with a sampling manifest (columns: OD, Range; entries [5,10] and [1,4]) and a flow memory; callout: "Red vs. Green?"]
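The four steps above can be sketched in Python as follows. This is a minimal illustration, not the routers' actual implementation: the SHA-256 keyed hash, the 1..10 hash space, the `"red"`/`"green"` OD-pair labels, and the function names are all assumptions made for the example. The OD-pair (step 1) is taken as an input here.

```python
import hashlib

HASH_SPACE = 10  # assumed: hash values fall in 1..10, like the ranges on the slide


def flow_hash(five_tuple, key=b"shared-key"):
    """Keyed hash of the flow 5-tuple, mapped to an integer in 1..HASH_SPACE.

    SHA-256 and the key are stand-ins; the slides only say that all routers
    share the same hash function and key.
    """
    digest = hashlib.sha256(key + repr(five_tuple).encode()).digest()
    return 1 + int.from_bytes(digest[:8], "big") % HASH_SPACE


def should_log(od_pair, five_tuple, manifest):
    """Return True if this router should log the flow for the given OD-pair."""
    h = flow_hash(five_tuple)            # step 2: hash the 5-tuple
    hash_range = manifest.get(od_pair)   # step 3: look up the OD-pair's range
    if hash_range is None:
        return False                     # OD-pair not assigned to this router
    lo, hi = hash_range
    return lo <= h <= hi                 # step 4: log if the hash is in range


# Hypothetical manifest: OD-pair label -> inclusive hash range.
manifest = {"red": (5, 10), "green": (1, 4)}
flow = ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp")
```

Because the two ranges in this manifest do not overlap, the same packet is logged under one OD-pair and not the other, so the router has to know which OD-pair the packet belongs to before it can decide.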
1. Get OD-Pair from packet
Why is this challenging?
• OD-pair identification might be ambiguous
• Multi-exit peers (and prefix aggregation)
• (Even with MPLS)
How does cSamp overcome this?
• Ingresses compute and add this to packet headers
• Need to modify packet headers/add shim header
• Extra computation on ingresses
• May require overhauling routing infrastructure
Can we realize the benefits of cSamp without OD-pair identification?
• Use local information to make sampling decisions
• “Stitch” coverage across routers on a path
Outline
• Background and Motivation
• Problem Formulation
• Algorithms and Heuristics
• Evaluation
[Figure: routers R, R1, R2, R3]
What local info can I get from packet and routing table?
{Previous Hop, My Id, Next Hop}
SamplingSpec: Granularity at which sampling decisions are made
How much to sample for this SamplingSpec?
SamplingAtom: Discrete hash-ranges, select some to log
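One way to picture SamplingSpecs and SamplingAtoms is the sketch below. It assumes the hash space is split into `NUM_ATOMS` equal atoms and that each router stores, per SamplingSpec, the set of atoms it logs. The class, constant, and function names are hypothetical, and SHA-256 stands in for whatever shared hash function the routers use.

```python
import hashlib
from typing import NamedTuple

NUM_ATOMS = 10  # assumed: hash space divided into 10 equal SamplingAtoms


class SamplingSpec(NamedTuple):
    """Local information available at a router: {Previous Hop, My Id, Next Hop}."""
    prev_hop: str
    my_id: str
    next_hop: str


def atom_of(five_tuple, key=b"shared-key"):
    """Map a flow 5-tuple to the index of the SamplingAtom it falls in."""
    digest = hashlib.sha256(key + repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_ATOMS


def should_log(spec, five_tuple, selected_atoms):
    """Log the flow if its atom was selected for this SamplingSpec."""
    return atom_of(five_tuple) in selected_atoms.get(spec, set())


# Hypothetical configuration at router R2: log atoms 0-2 for traffic R1 -> R2 -> R3.
spec = SamplingSpec("R1", "R2", "R3")
selected_atoms = {spec: {0, 1, 2}}
flow = ("10.0.0.1", "10.0.0.2", 1234, 80, "tcp")
```

No OD-pair lookup is needed here: the key is built only from the previous hop, the router's own id, and the next hop.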
“Stitching” together coverage
[Figure: routers R1 to R7; coverage is combined by union across the routers on a path]
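The stitching idea can be illustrated with a small sketch. It assumes each router's configuration is a map from a (previous hop, router, next hop) tuple to the set of hash atoms it logs, and that a path's coverage is the fraction of atoms in the union of those sets along the path. The function name, the topology, and the atom sets are made up for the example.

```python
def path_coverage(path, selected_atoms, num_atoms):
    """Fraction of the hash space covered by the union of atoms logged on a path."""
    covered = set()
    for i, router in enumerate(path):
        prev_hop = path[i - 1] if i > 0 else None
        next_hop = path[i + 1] if i + 1 < len(path) else None
        # A router only sees local information, so the lookup key is the
        # (prev_hop, router, next_hop) tuple, not the OD-pair.
        covered |= selected_atoms.get((prev_hop, router, next_hop), set())
    return len(covered) / num_atoms


# Hypothetical configuration; the entry at R1 is shared by both paths below.
selected_atoms = {
    (None, "R1", "R2"): {0, 1},
    ("R1", "R2", "R3"): {1, 2, 3},
    ("R2", "R3", None): {4},
}
cov_a = path_coverage(["R1", "R2", "R3"], selected_atoms, num_atoms=10)
cov_b = path_coverage(["R1", "R2", "R4"], selected_atoms, num_atoms=10)
```

In this example the first path covers atoms {0, 1, 2, 3, 4}, half of the hash space, while the second path only benefits from the shared entry at R1. Atom 1 is logged twice on the first path, which is the kind of duplication the optimization tries to avoid.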
Problem Formulation
Coverage for path $P_i$: $C_i$
Load on router $R_j$: $\mathrm{Load}_j$
Maximize:
• Total flow coverage: $\sum_i T_i C_i$
• Minimum fractional coverage: $\min_i \{C_i\}$
Subject to: $\forall j,\ \mathrm{Load}_j \le L_j$
[Figure labels: SamplingAtom, SamplingSpec]
Outline
• Background and Motivation
• Problem Formulation
• Algorithms and Heuristics
• Evaluation
Maximize:
• Total flow coverage: $\sum_i T_i C_i$
• Min. frac coverage: $\min_i \{C_i\}$
Subject to: $\forall j,\ \mathrm{Load}_j \le L_j$
NP-hard!
Min: Hard to approximate!
Total flow coverage:
• Submodular maximization with partition-knapsack
• Efficient greedy algorithm is near-optimal
Min. fractional flow coverage:
• Need “resource augmentation”
• Intelligent resource augmentation
• Incrementally add OD-pair identifiers
Leveraging submodularity for $f_{tot}$
A function $F: 2^V \to \mathbb{R}$ is submodular if $\forall A \subseteq A' \subseteq V$ and $s \in V \setminus A'$: $F(A \cup \{s\}) - F(A) \ge F(A' \cup \{s\}) - F(A')$
“diminishing returns”
Why does it matter?
$\max F(A)$ s.t. $c(A) \le B$, where $F$ is monotone
• Greedy algorithm gives a constant-factor approximation
• Can do lazy evaluation to speed up
Maximize: $f_{tot} = \sum_i T_i C_i$, Subject to: $\forall j,\ \mathrm{Load}_j \le L_j$
Special case of above problem with “partition-knapsack”
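A simplified greedy for $f_{tot}$ might look like the sketch below. It is an illustration under stated assumptions, not the authors' exact algorithm: candidates are (SamplingSpec, atom) pairs, the cost of a candidate is the traffic crossing that spec divided by the number of atoms, and each round picks the feasible candidate with the largest marginal gain. Lazy evaluation is omitted, and the topology, traffic, and budgets are invented.

```python
def specs_on_path(routers):
    """Yield the (prev_hop, router, next_hop) SamplingSpec at each hop."""
    for i, r in enumerate(routers):
        prev_hop = routers[i - 1] if i > 0 else None
        next_hop = routers[i + 1] if i + 1 < len(routers) else None
        yield (prev_hop, r, next_hop)


def greedy_total_coverage(paths, traffic, budgets, num_atoms):
    """Greedily pick (spec, atom) pairs to maximize sum_i T_i * C_i."""
    spec_paths = {}  # SamplingSpec -> ids of the paths that traverse it
    for pid, routers in paths.items():
        for spec in specs_on_path(routers):
            spec_paths.setdefault(spec, []).append(pid)

    candidates = [(s, a) for s in spec_paths for a in range(num_atoms)]
    covered = {pid: set() for pid in paths}  # atoms already logged per path
    load = {r: 0.0 for r in budgets}

    while True:
        best, best_gain, best_cost = None, 0.0, 0.0
        for spec, atom in candidates:
            pids = spec_paths[spec]
            cost = sum(traffic[p] for p in pids) / num_atoms
            if load[spec[1]] + cost > budgets[spec[1]] + 1e-9:
                continue  # would exceed this router's budget L_j
            # Marginal gain: flows on paths where this atom is not yet covered.
            gain = sum(traffic[p] for p in pids if atom not in covered[p]) / num_atoms
            if gain > best_gain:
                best, best_gain, best_cost = (spec, atom), gain, cost
        if best is None:
            break  # no feasible candidate improves coverage
        spec, atom = best
        candidates.remove(best)
        load[spec[1]] += best_cost
        for p in spec_paths[spec]:
            covered[p].add(atom)

    f_tot = sum(traffic[p] * len(covered[p]) / num_atoms for p in paths)
    return covered, load, f_tot


# Hypothetical instance: two paths sharing routers R1 and R2.
paths = {"p1": ["R1", "R2", "R3"], "p2": ["R1", "R2", "R4"]}
traffic = {"p1": 4, "p2": 4}
budgets = {"R1": 4, "R2": 2, "R3": 1, "R4": 1}
covered, load, f_tot = greedy_total_coverage(paths, traffic, budgets, num_atoms=4)
```

The diminishing-returns property shows up in the gain computation: an atom already covered on a path contributes nothing more for that path, so a candidate's gain can only shrink as the solution grows.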
What about $f_{min}$?
$f_{min} = \min_i \{C_i\}$ is not submodular
Hard to approximate without violating constraints!
But, can get near-optimal, if we violate by a fixed factor
Main idea: Define $f' = \sum_i C'_i$ where $C'_i = \min\{C_i, T\}$
Note that $f' = N \cdot T$ iff each $C_i \ge T$
Run binary search over $T$ to find best solution
(Each iteration runs greedy with no resource constraints)
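The binary search over $T$ can be sketched as follows, under several assumptions that are not in the slides: $T$ is measured in whole atoms, the inner greedy ignores cost and stops once every path reaches the target, a target counts as feasible if every router's load stays within `alpha` times its budget (the "fixed factor" violation), and feasibility is treated as monotone in $T$. All names and the example instance are hypothetical.

```python
def specs_on_path(routers):
    """Yield the (prev_hop, router, next_hop) SamplingSpec at each hop."""
    for i, r in enumerate(routers):
        prev_hop = routers[i - 1] if i > 0 else None
        next_hop = routers[i + 1] if i + 1 < len(routers) else None
        yield (prev_hop, r, next_hop)


def greedy_saturate(paths, traffic, num_atoms, target):
    """Unconstrained greedy on f' = sum_i min(C_i, T); returns per-router load."""
    spec_paths = {}
    for pid, routers in paths.items():
        for spec in specs_on_path(routers):
            spec_paths.setdefault(spec, []).append(pid)
    covered = {pid: set() for pid in paths}
    load = {}

    def gain(candidate):
        # Marginal gain in the truncated objective, counted in atoms:
        # paths already at the target contribute nothing.
        spec, atom = candidate
        return sum(1 for p in spec_paths[spec]
                   if len(covered[p]) < target and atom not in covered[p])

    candidates = [(s, a) for s in spec_paths for a in range(num_atoms)]
    while any(len(c) < target for c in covered.values()):
        spec, atom = max(candidates, key=gain)
        for p in spec_paths[spec]:
            covered[p].add(atom)
        cost = sum(traffic[p] for p in spec_paths[spec]) / num_atoms
        load[spec[1]] = load.get(spec[1], 0.0) + cost
    return load


def best_min_coverage(paths, traffic, budgets, num_atoms, alpha=1.0):
    """Largest T (as a fraction) whose greedy solution fits alpha * budgets."""
    lo, hi = 0, num_atoms  # T = 0 needs no resources, so it is always feasible
    while lo < hi:
        mid = (lo + hi + 1) // 2
        load = greedy_saturate(paths, traffic, num_atoms, mid)
        if all(load.get(r, 0.0) <= alpha * budgets[r] + 1e-9 for r in budgets):
            lo = mid
        else:
            hi = mid - 1
    return lo / num_atoms


# Hypothetical instance: two paths sharing routers R1 and R2.
paths = {"p1": ["R1", "R2", "R3"], "p2": ["R1", "R2", "R4"]}
traffic = {"p1": 4, "p2": 4}
budgets = {"R1": 4, "R2": 2, "R3": 1, "R4": 1}
t_strict = best_min_coverage(paths, traffic, budgets, num_atoms=4, alpha=1.0)
t_augmented = best_min_coverage(paths, traffic, budgets, num_atoms=4, alpha=1.5)
```

On this toy instance the minimum coverage found rises from 0.5 to 0.75 once budgets are augmented by 1.5x. Because the inner greedy is cost-blind, it concentrates load on the shared router R1, so these values are only what this sketch finds, not the best achievable.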
Heuristic improvements:
1. Intelligent resource augmentation
2. Upgrade a few ingresses to add OD-pairs
Outline
• Motivation
• Problem Formulation
• Algorithms and Heuristics
• Evaluation
Total flow coverage
cSamp-T (tuple+) gives near-ideal total flow coverage vs. cSamp
Minimum fractional coverage (with intelligent resource augmentation)
Can get 75% of optimal performance with 1.5X total increase and a 5X max-per-router increase
Summary
• cSamp for efficient flow monitoring
• Network-wide coordination and optimization
• But needs OD-pair identification
• How to implement cSamp without OD-pair ids?
• Leverage submodularity for total coverage
• Targeted upgrades for minimum fractional coverage
• cSamp-T makes cSamp’s benefits more immediately available