Traffic Modeling
© Tallal Elshabrawy
Approaches to Constructing Traffic Models
Trace-Driven: Collect traces from the network using a sniffing tool and use them directly to drive simulations.
Empirical Distribution: Build an empirical distribution from the collected traces and generate random variates from it to drive the simulations.
Distribution Fitting: Fit the collected traces to a well-known distribution. Use the fitted distribution for both simulation and analysis.
Trace-Driven Traffic Modeling
[Figure: packet arrival behavior over time, taken directly from a collected trace]
Empirical Distribution Traffic Modeling
[Figure: packet arrival behavior over time, and the empirical cumulative distribution function (CDF) derived from it]
Generate Samples of an Empirical Model
Generate a uniform random variable U on [0, 1]. A random sample of the empirical distribution is obtained by inverting the CDF: return the value x whose CDF corresponds to the generated uniform output (i.e., the smallest x with F(x) ≥ U).
[Figure: empirical CDF with two sampled values (Sample #1, Sample #2) obtained by inversion]
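The inversion step above can be sketched in Python; the `trace` values and the helper name are hypothetical, and this assumes the empirical CDF assigns probability 1/n to each sorted observation.

```python
import random

def empirical_sample(data, u=None):
    """Draw one sample from the empirical distribution of `data` by
    inverse-transform sampling: generate U ~ Uniform[0, 1] and return
    the smallest sorted value whose empirical CDF reaches U."""
    xs = sorted(data)
    n = len(xs)
    if u is None:
        u = random.random()
    # xs[i] has empirical CDF (i + 1) / n, so invert by scaling u to an index
    i = min(int(u * n), n - 1)
    return xs[i]

trace = [1.2, 0.7, 3.4, 2.2, 0.9]  # hypothetical inter-arrival times
print(empirical_sample(trace, u=0.5))
```

Repeated calls with fresh uniform draws reproduce the trace's distribution, but can never generate a value outside the collected data — one of the limitations listed on the next slide.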
Distribution Fitting Traffic Modeling
[Figure: packet arrival behavior over time, with a fitted analytical distribution]
Advantages and Disadvantages

Trace-Driven
Advantages: data is guaranteed to come from the correct sample; practical, real-life results can be anticipated.
Disadvantages: the simulation is limited to the results produced by the collected data; the data may not be sufficient for long-enough runs.

Empirical Distribution
Advantages: usable even when there is not enough data to identify the distribution accurately; more flexibility in the data that can be generated; fairly simple to deduce from the data.
Disadvantages: may show irregularities (statistical abnormalities) if the collected sample is not large enough; the range of values that can be generated may be limited by the original data samples.

Fitted Standard Distribution
Advantages: same advantages as the empirical distribution; irregularities are smoothed out.
Disadvantages: can be difficult to deduce if the available data is limited; there is always a chance of abnormalities that were not accounted for.
Characterization of Distributions
Mean and central moments of distributions:

Expected value (mean): $\mu = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$ (continuous), $\mu = \sum_x x\, p(x)$ (discrete)
Variance: $\sigma^2 = E[(X-\mu)^2]$
Skewness: $\nu = E[(X-\mu)^3]/\sigma^3$
Expected Value of Random Variables
The expected value is the long-run average of the generated random samples.
If you want to replace the whole distribution by ONE value, the mean is the choice with the least mean square error.
Variance of Random Variables
The variance characterizes how far the random variable deviates from its mean value.
Skewness of Random Variables
Skewness is a measure of the asymmetry of a probability distribution.
Negative skewness: the left tail is longer; the mass of the distribution is concentrated on the right.
Positive skewness: the right tail is longer; the mass of the distribution is concentrated on the left.
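As a quick illustration, the three measures above can be estimated from a sample (the data values here are made up; population-style definitions are used, dividing by n):

```python
import math

def moments(xs):
    """Sample mean, variance, and skewness (dividing by n throughout)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    std = math.sqrt(var)
    # Third central moment normalized by the cubed standard deviation
    skew = sum((x - mean) ** 3 for x in xs) / (n * std ** 3)
    return mean, var, skew

data = [1, 2, 2, 3, 3, 3, 10]  # right tail is longer -> positive skewness
m, v, s = moments(data)
print(round(m, 3), round(v, 3), round(s, 3))
```

The single large value 10 stretches the right tail, so the computed skewness comes out positive, matching the definition above.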
Other Parameters for Continuous Distributions
Some additional parameters of continuous distributions can help guide distribution fitting:
Location (shift) parameter: specifies an abscissa (x-coordinate) location point of a distribution's range of values, often some kind of midpoint of the distribution.
Scale parameter: determines the scale of measurement, or spread, of a distribution.
Shape parameter: determines the basic form or shape of a distribution within the family of distributions of interest.
Location Parameter Examples
[Figure: probability density functions of the normal distribution for μ = 0 and μ = 2, and of the Pareto distribution for θ = 10 and θ = 20]

Normal distribution: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Pareto distribution: $f(x) = \frac{1}{\sigma}\left(1 + \frac{k(x-\theta)}{\sigma}\right)^{-1-\frac{1}{k}}$
Scale Parameter Example
If X is a random variable whose distribution has scale parameter 1, then Y = βX has a distribution with scale parameter β.
The standard deviation of a normal distribution is its scale parameter.
[Figure: normal density curves with different standard deviations, illustrating the effect of the scale parameter]
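The scaling relation f_Y(y) = (1/β) f_X(y/β) can be checked numerically with the normal density; the value β = 2 below is arbitrary:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density; sigma acts as the scale parameter."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

# If X is standard normal (scale 1), then Y = b*X has scale b,
# i.e. f_Y(y) = (1/b) * f_X(y / b). Verify for b = 2 at y = 1.5.
b, y = 2.0, 1.5
lhs = normal_pdf(y, sigma=b)       # density of Y directly, scale = b
rhs = normal_pdf(y / b) / b        # density of X, rescaled
print(abs(lhs - rhs) < 1e-12)
```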
Shape Parameter Characteristics
The normal and exponential distributions do not have a shape parameter; other distributions, such as the beta distribution, may have two shape parameters.
A change in the shape parameter generally alters a distribution's properties more fundamentally than a change in the shift or scale parameters.
Heavy-Tail Distributions
A heavy-tailed distribution has a tail heavier than the exponential: values far from the "mean" of the distribution occur with non-negligible probability.
Pareto principle: known as the 80-20 rule, i.e., 80% of the effects come from 20% of the causes.
Heavy or Light Tailed?
Pareto survivor function (1 − CDF): $S(x) = \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \ge x_m$
Plot on a log-log scale: a heavy-tailed survivor function is linear in the log-log domain.
[Figure: survivor functions (1 − CDF) on a log-log scale for an exponential distribution with mean 30 (light tail) and a Pareto distribution with α = 1.5, x_m = 10, mean 30 (heavy tail)]
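The comparison in the plot can be reproduced numerically; both distributions below have mean 30, matching the figure's legend (for the Pareto, mean = α·x_m/(α − 1) = 1.5·10/0.5 = 30):

```python
import math

def exp_sf(x, mean):
    """Survivor function (1 - CDF) of the exponential distribution."""
    return math.exp(-x / mean)

def pareto_sf(x, alpha, xm):
    """Survivor function of the Pareto distribution, for x >= xm."""
    return (xm / x) ** alpha

# Same mean = 30, very different tails: the exponential survivor function
# decays geometrically, the Pareto one only polynomially.
for x in (100, 1000, 10000):
    print(x, exp_sf(x, 30), pareto_sf(x, 1.5, 10))
```

Far in the tail the Pareto survivor function is many orders of magnitude larger; on a log-log plot it is a straight line with slope −α, while the exponential curve bends sharply downward.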
Discrete Probability Distributions
Binomial Distribution
Uniform (Discrete) Distribution
Geometric Distribution
Negative Binomial Distribution
Hypergeometric Distribution
Continuous Probability Distributions
Uniform (Continuous) Distribution
Triangular Distribution
Normal Distribution
Cauchy Distribution
Exponential Distribution
Continuous Probability Distributions (cont'd)
Lognormal Distribution
Weibull Distribution
Gamma Distribution
Estimation of Parameters
Suppose that some distribution shape has been deduced from the data set by any of the methods mentioned earlier.
The data set X1, X2, …, Xn that was used to deduce the distribution shape can also be used to estimate the parameters that define the distribution completely.
There are many methods for estimating the parameters. We will use the maximum likelihood estimation (MLE) method.
The method can be explained as follows: suppose we have decided that a certain discrete distribution is the closest to the data set, and that the distribution has one unknown parameter θ. The likelihood function L(θ) is defined as:

$L(\theta) = p(X_1)\, p(X_2) \cdots p(X_n)$

This is basically the joint probability mass function, since the data are assumed independent. It gives the probability of obtaining the data set as a whole if θ is the value of the unknown parameter.
Maximum Likelihood Estimator
The MLE of the unknown parameter θ, denoted θ*, is defined as the value of θ that maximizes L(θ).
In the continuous case, the probability mass function is replaced with the chosen probability density function.
Example: for an exponential distribution with parameter θ = β,

$f(x) = \frac{1}{\beta}\, e^{-x/\beta}$

The likelihood function is then:

$L(\beta) = \left(\frac{1}{\beta} e^{-X_1/\beta}\right)\left(\frac{1}{\beta} e^{-X_2/\beta}\right) \cdots \left(\frac{1}{\beta} e^{-X_n/\beta}\right) = \beta^{-n} \exp\!\left(-\frac{1}{\beta} \sum_{i=1}^{n} X_i\right)$
Maximum Likelihood Estimator (cont'd)
Because most theoretical distributions involve exponential functions, it is often easier to maximize the logarithm of the likelihood function instead of L(θ) itself.
Define the log-likelihood function as:

$l(\beta) = \ln(L(\beta)) = -n \ln \beta - \frac{1}{\beta} \sum_{i=1}^{n} X_i$

The problem reduces to maximizing the log-likelihood, since the value of β that maximizes one function must also maximize the other. Differentiating,

$\frac{dl}{d\beta} = -\frac{n}{\beta} + \frac{1}{\beta^2} \sum_{i=1}^{n} X_i$

This equals zero when $\beta = \frac{1}{n} \sum_{i=1}^{n} X_i$, i.e., the sample mean of the data set.
This should be expected for exponential random variables, since they are fully characterized by their means.
Maximum Likelihood Estimator (cont'd)
Suppose instead that the chosen distribution is the geometric distribution, a discrete distribution with pmf:

$p(x) = p(1-p)^x, \quad x = 0, 1, \ldots$

The likelihood function is:

$L(p) = p^n (1-p)^{\sum_{i=1}^{n} X_i}$

The log-likelihood function is:

$\ln(L(p)) = n \ln p + \left(\sum_{i=1}^{n} X_i\right) \ln(1-p)$

Differentiating and equating to zero gives:

$p = \frac{1}{\left(\sum_{i=1}^{n} X_i / n\right) + 1} = \frac{1}{\bar{X} + 1}$
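The closed-form result is simple enough to implement directly (the sample values below are made up):

```python
def geometric_mle(xs):
    """MLE of p for the geometric pmf p(x) = p * (1 - p)**x, x = 0, 1, 2, ...
    Setting d/dp [n*ln(p) + sum(x)*ln(1 - p)] = 0 gives p = 1 / (xbar + 1)."""
    xbar = sum(xs) / len(xs)
    return 1.0 / (xbar + 1.0)

xs = [0, 2, 1, 3, 0, 2]   # hypothetical counts
print(geometric_mle(xs))  # xbar = 4/3, so the estimate is 1 / (4/3 + 1) = 3/7
```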