Network Anomography
Yin Zhang, Zihui Ge, Albert Greenberg, Matthew Roughan
Internet Measurement Conference 2005, Berkeley, CA, USA
Presented by Huizhong Sun
Some slides borrowed from Yin Zhang
Network Anomaly Detection
• Is the network experiencing unusual conditions?
  – Call these conditions anomalies
  – Anomalies can often indicate network problems
    • DDoS attacks, network worms, flash crowds, misconfigurations, vendor implementation bugs, …
  – Need rapid detection and diagnosis
    • Want to fix the problem quickly
• Questions of interest
  – Detection
    • Is there an unusual event?
  – Identification
    • What's the best explanation?
  – Quantification
    • How serious is the problem?
Network Anomography
• What we want
  – Volume anomalies [Lakhina04]: significant changes in an Origin-Destination (O-D) flow, i.e., a traffic matrix element
  – Detect volume anomalies
  – Identify which O-D pair
[Figure: example network with nodes A, B and C]
Network Anomography
• Challenge
  – It is difficult to measure the traffic matrix directly
  – The anomaly detection problem is somewhat more complex and difficult than standard network tomography
    • First, anomaly detection is performed on a series of measurements over a period of time, rather than on a single snapshot.
    • Second, in addition to changes in the traffic, the solution must be able to deal with changes in routing.
• What we have
  – Link traffic measurements: Simple Network Management Protocol (SNMP) data on individual link loads is available almost ubiquitously.
• Network Anomography
  – Infer volume anomalies from link traffic measurements
An Illustration
Courtesy: Anukool Lakhina [Lakhina04]
Anomography = Anomalies + Tomography
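Mathematical Formulation
Problem: Infer changes in TM elements ($x_t$) given link measurements ($b_t$)
Only measure at links
[Figure: three routers (1, 2, 3) joined by link 1, link 2 and link 3, carrying route 1, route 2 and route 3; for example, $b_{1,t} = x_{2,t} + x_{3,t}$]
\[
\begin{pmatrix} b_{1,t} \\ b_{2,t} \\ b_{3,t} \end{pmatrix}
=
\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}
\begin{pmatrix} x_{1,t} \\ x_{2,t} \\ x_{3,t} \end{pmatrix}
\]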
Mathematical Formulation
$b_t = A_t x_t$ (t = 1, …, T)
Typically massively under-constrained!
Only measure at links
[Figure: the same three-router example, with $b_{1,t} = x_{2,t} + x_{3,t}$]
Static Network Anomography
Time-invariant $A_t$ (= A), $B = [b_1 \ldots b_T]$, $X = [x_1 \ldots x_T]$
Only measure at links
[Figure: the same three-router example, with $b_{1,t} = x_{2,t} + x_{3,t}$]
B = AX
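The static model B = AX can be tried out on the three-router example. The sketch below is only an illustration: the routing matrix is the one from the example, while the OD-flow volumes in X are made-up numbers.

```python
import numpy as np

# Routing matrix of the three-router example: A[i, j] = 1 when
# OD flow j crosses link i.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]])

# Hypothetical OD-flow volumes: X[j, t] is the volume of flow j in time bin t.
X = np.array([[10, 10, 10, 10],
              [20, 20, 50, 20],   # flow 2 spikes in the third time bin
              [30, 30, 30, 30]])

# Link loads as SNMP would report them: column t of B is b_t = A x_t.
B = A @ X
print(B)
```

The spike on a single OD flow shows up on both links that carry it. In this toy case A is square and invertible; in a real network there are far more OD flows than links, which is why the system is under-constrained.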
Anomography Strategies
• Early Inverse
  1. Inversion
     – Infer OD flows X by solving $b_t = A x_t$
  2. Anomaly extraction
     – Extract volume anomalies $\tilde{X}$ from the inferred X
  Drawback: errors in step 1 may contaminate step 2
• Late Inverse
  1. Anomaly extraction
     – Extract link traffic anomalies $\tilde{B}$ from B
  2. Inversion
     – Infer volume anomalies $\tilde{X}$ by solving $\tilde{b}_t = A \tilde{x}_t$
  Idea: defer "lossy" inference to the last step
Extracting Link Anomalies $\tilde{B}$
• Temporal Anomography:
  – Fourier / wavelet analysis
    • Link anomalies = the high-frequency components
  – ARIMA modeling
    • Diff
    • EWMA (Exponentially Weighted Moving Average) is ARIMA(0, 1, 1)
    • Holt-Winters is ARIMA(0, 2, 2)
  – Temporal PCA
    • PCA = Principal Component Analysis
    • Project columns onto principal link column vectors
• Spatial Anomography:
  – Spatial PCA [Lakhina04]
    • Project rows onto principal link row vectors
Extracting Link Anomalies $\tilde{B}$
• Fourier analysis
  – Fourier analysis decomposes a complex periodic waveform into a set of sinusoids with different amplitudes, frequencies and phases.
  – The sum of these sinusoids exactly matches the original waveform.
  – To extract anomalous link traffic with Fourier analysis, filter out the low-frequency components.
  – In general, low-frequency components capture the daily and weekly traffic patterns, while high-frequency components represent sudden changes in traffic behavior.
Extracting Link Anomalies $\tilde{B}$
• Fourier analysis
  – For a discrete-time signal $x_0, x_1, \ldots, x_{N-1}$, the Discrete Fourier Transform (DFT) is defined (with the 1/N normalization on the forward transform) by
\[ f_n = \frac{1}{N} \sum_{k=0}^{N-1} x_k \, e^{-j 2\pi k n / N}, \qquad 0 \le n \le N-1 \]
  – where $f_n$ is a complex number that captures the amplitude and phase of the signal at the n-th frequency
  – Lower n corresponds to a lower frequency component, with $f_0$ being the DC component
  – $f_n$ with n close to N/2 corresponds to high frequencies
Extracting Link Anomalies $\tilde{B}$
• Fourier analysis
  – The Inverse Discrete Fourier Transform (IDFT) is used to reconstruct the signal in the time domain (assuming the 1/N factor is applied in the forward transform) by
\[ x_n = \sum_{k=0}^{N-1} f_k \, e^{j 2\pi k n / N}, \qquad 0 \le n \le N-1 \]
  – An efficient way to implement the DFT and IDFT is the Fast Fourier Transform (FFT).
  – The computational complexity of the FFT is O(N log N).
Extracting Link Anomalies $\tilde{B}$
• FFT-based anomography
  – 1. Transform link traffic B into the frequency domain: F = FFT(B), i.e. apply the FFT to each row of B. (A row corresponds to the time series of traffic data on one link.)
  – 2. Remove low-frequency components: set $F_i = 0$ for $i \in [1, c] \cup [N-c, N]$, where c is a cut-off frequency.
    • (For example, using 10-minute aggregated link traffic data of one week's duration and c = 10N/60, corresponding to a frequency of one cycle per hour.)
  – 3. Transform back into the time domain: $\tilde{B}$ = IFFT(F). The result is the high-frequency components of the traffic data, which we use as the anomalous link traffic.
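The three FFT steps can be sketched with NumPy as follows. This is a rough illustration rather than the authors' implementation: the function name `fft_link_anomalies`, the synthetic link series and the cut-off `c=6` are assumptions, and the zeroed index ranges are translated to NumPy's 0-based layout (bin 0 is the DC component, bins k and N-k are mirror images) so that the filtered signal stays real.

```python
import numpy as np

def fft_link_anomalies(B, c):
    """High-pass filter each row of B (one row per link, one column per time bin)."""
    B = np.asarray(B, dtype=float)
    N = B.shape[1]
    F = np.fft.fft(B, axis=1)            # step 1: FFT of every link's time series
    F[:, :c + 1] = 0                     # step 2: drop DC and the c lowest bins
    F[:, N - c:] = 0                     #         and their mirror-image bins
    return np.fft.ifft(F, axis=1).real   # step 3: back to the time domain

# Synthetic example: one day of 10-minute bins with a daily cycle and one spike.
N = 144
t = np.arange(N)
link = 100 + 50 * np.sin(2 * np.pi * t / N)
link[70] += 80                           # injected anomaly (hypothetical)
anomalies = fft_link_anomalies(link[np.newaxis, :], c=6)
spike_bin = int(np.argmax(np.abs(anomalies[0])))
```

After filtering, the smooth daily cycle is gone and the largest remaining value sits at the bin where the spike was injected.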
Extracting Link Anomalies $\tilde{B}$
• Wavelet analysis
  – 1. Use wavelets to decompose B into different frequency levels: W = WAVEDEC(B), by applying a multi-level 1-D wavelet decomposition to each row of B.
  – 2. Remove the low- and mid-frequency components in W by setting all coefficients at frequency levels higher than $w_c$ to 0, giving W'. Here $w_c$ is a cut-off frequency level.
  – 3. Reconstruct the signal: $\tilde{B}$ = WAVEREC(W'). The result is the high-frequency components of the traffic data.
Extracting Link Anomalies $\tilde{B}$
• ARIMA modeling: the Box-Jenkins methodology, or AutoRegressive Integrated Moving Average (ARIMA)
• A class of linear time-series forecasting techniques that capture the linear dependency of future values on past values.
• It has been used extensively for anomaly detection in univariate time series.
• For anomaly detection, we simply take the forecast errors as the anomalous link traffic.
• Traffic behavior that the model cannot capture well is considered anomalous.
Extracting Link Anomalies $\tilde{B}$
• An ARIMA(p, d, q) model includes three parameters:
  – the autoregressive parameter (p),
  – the number of differencing passes (d),
  – the moving average parameter (q).
• Several models used for detecting anomalies in time series are special cases:
  – for example, the Exponentially Weighted Moving Average (EWMA) is ARIMA(0, 1, 1); Holt-Winters is ARIMA(0, 2, 2).
Extracting Link Anomalies $\tilde{B}$
• An ARIMA(p, d, q) model includes three parameters:
  – the autoregressive parameter (p),
  – the number of differencing passes (d),
  – the moving average parameter (q).
• The model can be written as
\[ z_k - \sum_{i=1}^{p} \phi_i z_{k-i} = e_k - \sum_{j=1}^{q} \theta_j e_{k-j} \]
  where $z_k$ is obtained by differencing the original time series d times (when d ≥ 1) or by subtracting the mean from the original time series (when d = 0), $e_k$ is the forecast error at time k, and $\phi_i$ (i = 1, ..., p) and $\theta_j$ (j = 1, ..., q) are the autoregression and moving average coefficients, respectively.
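As a small illustration of "forecast errors as anomalies", the sketch below uses the two simplest forecasters from this family: Diff (the forecast is the previous value) and EWMA (the forecast is an exponentially weighted average of past values). The function names, the treatment of the first sample and the smoothing weight `alpha=0.5` are assumptions made for this example, not settings taken from the talk.

```python
def diff_residuals(b):
    """Forecast f_t = b_{t-1}; return the forecast errors b_t - f_t."""
    # The first sample has no predecessor, so its error is set to 0 (assumption).
    return [0.0] + [b[t] - b[t - 1] for t in range(1, len(b))]

def ewma_residuals(b, alpha):
    """Forecast f_t = (1 - alpha) * f_{t-1} + alpha * b_{t-1}; return b_t - f_t."""
    forecast = b[0]          # assumed initialisation: f_0 = b_0
    errors = [0.0]
    for t in range(1, len(b)):
        forecast = (1 - alpha) * forecast + alpha * b[t - 1]
        errors.append(b[t] - forecast)
    return errors

link_load = [10.0, 10.0, 10.0, 30.0, 10.0]   # hypothetical load with one spike
diff_err = diff_residuals(link_load)          # [0.0, 0.0, 0.0, 20.0, -20.0]
ewma_err = ewma_residuals(link_load, alpha=0.5)  # [0.0, 0.0, 0.0, 20.0, -10.0]
```

Both forecasters flag the spike with an error of 20. They differ in how strongly the return to normal is flagged afterwards, because EWMA has partly absorbed the spike into its forecast.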
Diagnosing Network-Wide Traffic Anomalies
Anukool Lakhina, Mark Crovella, Christophe Diot
SIGCOMM '04
Extracting Link Anomalies $\tilde{B}$
• Spatial Anomography: Spatial PCA [Lakhina04]
  – 1. Identify the first principal axis: the direction along which the link traffic data have the greatest variance.
  – 2. Identify the second principal axis: the direction, orthogonal to the first, along which the data have the second greatest variance, and so forth.
Extracting Link Anomalies $\tilde{B}$
• Spatial Anomography: Spatial PCA [Lakhina04]
  – 3. Divide the link traffic space into the normal subspace and the anomalous subspace
    • by examining the projection of the time series of link traffic data onto each principal axis in order.
    • As soon as a projection is found that contains a 3σ deviation from the mean, that principal axis and all subsequent axes are assigned to the anomalous subspace.
    • All previous principal axes are assigned to the normal subspace.
Data Collected
Networks: Abilene, Sprint-Europe
Low Intrinsic Dimensionality of Link Traffic
Studied via Principal Component Analysis
Key result: Normal traffic is well approximated as occupying a low-dimensional subspace
Reasons:
1. Links share OD flows
2. Set of OD flows is also low-dimensional
The Subspace Method
• An approach to separate normal from anomalous traffic
• Normal subspace, $S$: space spanned by the first k principal components
• Anomalous subspace, $\tilde{S}$: space spanned by the remaining principal components
• Then, decompose traffic on all links by projecting onto $S$ and $\tilde{S}$ to obtain:
\[ y = \hat{y} + \tilde{y} \]
  – $y$: traffic vector of all links at a particular point in time
  – $\hat{y}$: normal traffic vector
  – $\tilde{y}$: residual traffic vector
A Geometric Illustration
[Figure: traffic on Link 1 vs. traffic on Link 2]
In general, anomalous traffic results in a large value of $\|\tilde{y}\|$
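Detection
[Figure: traffic on Link 1 vs. traffic on Link 2]
• Capture the size of the residual vector using the squared prediction error (SPE):
\[ \mathrm{SPE} \equiv \|\tilde{y}\|^2 \le \delta_\alpha^2 \]
  Traffic is considered normal while the SPE stays below the threshold $\delta_\alpha^2$.
Result due to [Jackson and Mudholkar, 1979]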
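The subspace decomposition and the SPE can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function name `subspace_spe`, the synthetic four-link data set and the choice k = 1 are invented for the example, the principal axes are obtained from an SVD of the centred data, and the Jackson-Mudholkar threshold itself is not computed.

```python
import numpy as np

def subspace_spe(B, k):
    """Return the SPE per time bin for link data B (rows = links, columns = time)."""
    Y = np.asarray(B, dtype=float).T       # rows = time bins, columns = links
    Y = Y - Y.mean(axis=0)                 # centre each link's time series
    # Rows of Vt are the principal axes, ordered by captured variance.
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    P = Vt[:k].T                           # basis of the normal subspace
    Y_normal = Y @ P @ P.T                 # projection onto the normal subspace
    Y_resid = Y - Y_normal                 # projection onto the anomalous subspace
    return np.sum(Y_resid ** 2, axis=1)    # SPE = squared norm of the residual

# Synthetic data: 4 links driven by one shared pattern plus small noise.
rng = np.random.default_rng(0)
T = 200
pattern = 100 + 20 * np.sin(2 * np.pi * np.arange(T) / 50)
B = np.outer([1.0, 2.0, 1.0, 3.0], pattern) + rng.normal(0, 0.5, size=(4, T))
B[2, 120] += 30                            # inject an anomaly on the third link
spe = subspace_spe(B, k=1)
worst_bin = int(np.argmax(spe))
```

Because the shared pattern lies almost entirely in the normal subspace, the SPE stays small everywhere except at the bin where traffic was injected on a single link.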
Detection Illustration
[Figure: value of $\|y\|^2$ over time (all traffic) and value of $\|\tilde{y}\|^2$ over time (SPE)]
The SPE at anomaly time points clearly stands out
Extracting Link Anomalies $\tilde{B}$
Temporal PCA
• PCA = Principal Component Analysis
• Similar to spatial PCA
• Project columns onto principal link column vectors
Solving $\tilde{b}_t = A \tilde{x}_t$
• Temporal Anomography: $\tilde{B} = A \tilde{X}$
• Now, given $\tilde{B}$, how do we solve for the anomalous O-D traffic $\tilde{X}$?
• (1) Pseudoinverse solution
• (2) Sparsity maximization
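Solving $\tilde{b}_t = A \tilde{x}_t$
• Pseudoinverse: $\tilde{x}_t = \mathrm{pinv}(A)\, \tilde{b}_t$
  – Minimum L2-norm solution
    • Among all $\tilde{x}_t$ for which $\|\tilde{b}_t - A \tilde{x}_t\|_2$ is minimal, take the one with the smallest $\|\tilde{x}_t\|_2$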
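A tiny under-constrained system shows what the pseudoinverse does. In this sketch the 2-link, 3-flow routing matrix and the link anomaly vector are made up; only `numpy.linalg.pinv` is the real library call.

```python
import numpy as np

# Two links, three OD flows: more unknowns than measurements.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b_anom = np.array([1.0, 1.0])          # hypothetical link anomalies in one time bin

x_pinv = np.linalg.pinv(A) @ b_anom    # minimum L2-norm solution of A x = b
x_sparse = np.array([0.0, 1.0, 0.0])   # another exact solution: one anomalous flow

print(x_pinv)                          # approximately [0.333, 0.667, 0.333]
```

Both vectors reproduce the link anomalies exactly, but the pseudoinverse spreads the anomaly over all three flows because that gives the smaller L2 norm. This is one reason to look for sparse solutions instead.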
Solving $\tilde{b}_t = A \tilde{x}_t$
• Maximize sparsity
  – In practice, we expect only a few anomalies at any one time, so $\tilde{x}_t$ typically has only a small number of large values.
  – Hence it is natural to proceed by maximizing the sparsity of $\tilde{x}_t$, i.e., solving the following $\ell_0$-norm minimization problem:
\[ \text{minimize } \|\tilde{x}_t\|_0 \quad \text{subject to } \tilde{b}_t = A \tilde{x}_t \]
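For very small systems the sparsest solution can be found by brute force, which makes the objective concrete. The sketch below is an illustration only: `sparsest_solution` is a hypothetical helper that tries every support of growing size, so its cost grows exponentially with the number of OD flows. It is not the method evaluated in the talk.

```python
import itertools
import numpy as np

def sparsest_solution(A, b, tol=1e-9):
    """Exhaustively search for a solution of A x = b with the fewest non-zeros."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    if np.linalg.norm(b) <= tol:
        return np.zeros(n)
    for size in range(1, n + 1):                 # try supports of growing size
        for support in itertools.combinations(range(n), size):
            cols = list(support)
            coef, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            if np.linalg.norm(A[:, cols] @ coef - b) <= tol:
                x = np.zeros(n)
                x[cols] = coef
                return x
    return None                                  # no exact solution exists

# Hypothetical 2-link, 3-flow routing matrix and link anomalies.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
x = sparsest_solution(A, [1.0, 1.0])
```

For this toy input the search attributes the whole anomaly to the single flow that crosses both links.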
Performance Evaluation
• Fix one anomaly extraction method
• Compare "real" and "inferred" anomalies
  – "real" anomalies: directly from OD flow data
  – "inferred" anomalies: from link data
• Order them by size
  – Compare the size
• How many of the top N do we find?
  – Gives detection rate: |top-N real ∩ top-N inferred| / N
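The detection rate is a plain set computation. In the sketch below, the helper names and the anomaly sizes (keyed by OD pair and time bin) are hypothetical values chosen for illustration.

```python
def top_n(sizes, n):
    """Return the keys of the n largest anomalies by absolute size."""
    ranked = sorted(sizes, key=lambda key: abs(sizes[key]), reverse=True)
    return set(ranked[:n])

def detection_rate(real, inferred, n):
    """|top-N real ∩ top-N inferred| / N."""
    return len(top_n(real, n) & top_n(inferred, n)) / n

# Hypothetical anomaly sizes keyed by (OD pair, time bin).
real = {("A-B", 3): 90.0, ("B-C", 7): 60.0, ("A-C", 1): 40.0, ("A-B", 9): 5.0}
inferred = {("A-B", 3): 85.0, ("A-C", 1): 55.0, ("A-B", 9): 50.0, ("B-C", 7): 20.0}
rate = detection_rate(real, inferred, n=2)   # 0.5: one of the top two is found
```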
Performance Evaluation: Anomography
• Hard to compare performance
  – Lack of ground truth: what is an anomaly?
• So compare events from different methods
  – Compute top M "benchmark" anomalies
    • Apply an anomaly extraction method directly on OD flow data
  – Compute top N "inferred" anomalies
    • Apply another anomography method on link data
  – Report min(M, N) − |top-M benchmark ∩ top-N inferred|
    • M < N: "false negatives"
      Number of big "benchmark" anomalies not considered big by anomography
    • M > N: "false positives"
      Number of big "inferred" anomalies not considered big by the benchmark method
  – Choose M, N similar to the number of anomalies a provider is willing to investigate, e.g. 30-50 per week
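The reported quantity can be computed directly from the two top lists. The sketch below assumes each anomaly is identified by a hashable key; the function name and the integer identifiers are made up for the example.

```python
def mismatch_count(benchmark_top_m, inferred_top_n):
    """min(M, N) - |top-M benchmark ∩ top-N inferred| for two sets of anomalies."""
    m, n = len(benchmark_top_m), len(inferred_top_n)
    return min(m, n) - len(set(benchmark_top_m) & set(inferred_top_n))

# Hypothetical anomaly identifiers (e.g. time bins), with M = 3 and N = 5.
benchmark = {12, 40, 77}
inferred = {12, 40, 51, 63, 90}
missed = mismatch_count(benchmark, inferred)   # M < N: counts "false negatives"
```

Here one of the three benchmark anomalies (77) is absent from the longer inferred list, so the count is 1.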
Anomography: "False Negatives"
"False Negatives" with Top 30 Benchmark (columns) for Top 50 Inferred (rows)
Top 50 Inferred   Diff  EWMA  H-W  ARIMA  Fourier  Wavelet  T-PCA  S-PCA
Diff                 0     0    1      1        5        5     17     12
EWMA                 0     0    1      1        5        5     17     12
Holt-Winters         1     1    0      0        6        4     18     12
ARIMA                1     1    0      0        6        4     18     12
Fourier              3     4    8      8        1        7     19     18
Wavelet              0     1    2      2        5        0     13     11
T-PCA               14    14   14     14       19       15      3     15
S-PCA               10    10   13     13       15       11      1     13
1. Diff/EWMA/H-W/ARIMA/Fourier/Wavelet are all largely consistent
2. PCA methods are not consistent (even with each other)
   - PCA cannot detect anomalies in the "normal" subspace
   - PCA is insensitive to reordering of [b_1…b_T], so it cannot utilize all temporal info
3. Spatial methods (e.g. spatial PCA) are not self-consistent
Anomography: "False Positives"
"False Positives" with Top 50 Benchmark (columns) for Top 30 Inferred (rows)
Top 30 Inferred   Diff  EWMA  H-W  ARIMA  Fourier  Wavelet  T-PCA  S-PCA
Diff                 3     3    6      6        6        4     14     14
EWMA                 3     3    6      6        7        5     13     15
Holt-Winters         4     4    1      1        8        3     13     10
ARIMA                4     4    1      1        8        3     13     10
Fourier              6     6    7      6        2        6     19     18
Wavelet              6     6    6      6        8        1     13     12
T-PCA               17    17   17     17       20       13      0     14
S-PCA               18    18   18     18       20       14      1     14
Conclusions
• Anomography = Anomalies + Tomography
  – Find anomalies in {$x_t$} given $b_t = A_t x_t$ (t = 1, …, T)
• Contributions
  1. A general framework for anomography methods
     – Decouple anomaly extraction and inference components
  2. A number of novel algorithms
     – Taking advantage of the range of choices for anomaly extraction and inference components
     – Choosing between spatial vs. temporal approaches
  3. Extensive evaluation on real traffic data
     – 6-month Abilene and 1-month Tier-1 ISP
• The method of choice: ARIMA + Sparsity-L1
Thank you!
Questions?
Extracting Link Anomalies $\tilde{B}$
• Temporal Anomography: $\tilde{B} = BT$
  – Fourier / wavelet analysis
    • Link anomalies = the high-frequency components
  – ARIMA modeling
    • Diff: $f_t = b_{t-1}$, $\tilde{b}_t = b_t - f_t$
    • EWMA: $f_t = (1-\alpha) f_{t-1} + \alpha b_{t-1}$, $\tilde{b}_t = b_t - f_t$
  – Temporal PCA
    • PCA = Principal Component Analysis
    • Project columns onto principal link column vectors
• Spatial Anomography: $\tilde{B} = TB$
  – Spatial PCA [Lakhina04]
    • Project rows onto principal link row vectors
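For the Diff method, the temporal transformation can be written out as an explicit matrix applied on the right of B. The sketch below is an illustration under assumptions: the matrix is named `M` here (rather than T) to avoid confusion with the number of time bins, the first time bin is given a zero anomaly, and the routing matrix and OD flows are made-up toy values.

```python
import numpy as np

def diff_matrix(num_bins):
    """Matrix M such that (B @ M)[:, t] = B[:, t] - B[:, t-1] for t >= 1."""
    M = np.eye(num_bins) - np.eye(num_bins, k=1)  # +1 on the diagonal, -1 just above it
    M[0, 0] = 0.0          # first bin has no predecessor: anomaly set to 0 (assumption)
    return M

# Toy routing matrix and hypothetical OD flows (flow 2 spikes in the third bin).
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])
X = np.array([[10.0, 10.0, 10.0, 10.0],
              [20.0, 20.0, 50.0, 20.0],
              [30.0, 30.0, 30.0, 30.0]])
B = A @ X

M = diff_matrix(X.shape[1])
link_anoms = B @ M         # anomalies extracted from the link data
flow_anoms = X @ M         # anomalies in the (normally unobserved) OD flows
```

Since B M = A (X M), the link anomalies and the OD-flow anomalies are related by the same routing matrix as the raw data, which is what allows the inversion step to be deferred until after extraction.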