Traffic Modeling
© Tallal Elshabrawy
Approaches to Constructing Traffic Models
Trace-Driven: Collect traces from the network using a sniffing tool and use them directly to drive simulations.
Empirical Distribution: Build an empirical distribution from the collected traces and generate random variates from it to drive the simulations.
Distribution Fitting: Fit the collected traces to a well-known distribution. Use the fitted distribution for both simulation and analysis.
Trace-Driven Traffic Modeling
[Figure: packet arrival behavior over time, taken directly from a collected trace]
Empirical Distribution Traffic Modeling
[Figure: packet arrival behavior over time, and the empirical cumulative distribution function (CDF) derived from it]
Generate Samples of an Empirical Model
Generate a uniform random variable U on [0, 1]. A random sample of the empirical distribution is obtained by inverting the CDF: return the value x whose CDF corresponds to the generated uniform output (i.e., the smallest x with F(x) ≥ U).
[Figure: empirical CDF with two sampled values (Sample #1, Sample #2) obtained by inversion]
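The inversion step above can be sketched in Python; the `trace` values and the helper name are hypothetical, and this assumes the empirical CDF assigns probability 1/n to each sorted observation.

```python
import random

def empirical_sample(data, u=None):
    """Draw one sample from the empirical distribution of `data` by
    inverse-transform sampling: generate U ~ Uniform[0, 1] and return
    the smallest sorted value whose empirical CDF reaches U."""
    xs = sorted(data)
    n = len(xs)
    if u is None:
        u = random.random()
    # xs[i] has empirical CDF (i + 1) / n, so invert by scaling u to an index
    i = min(int(u * n), n - 1)
    return xs[i]

trace = [1.2, 0.7, 3.4, 2.2, 0.9]  # hypothetical inter-arrival times
print(empirical_sample(trace, u=0.5))
```

Repeated calls with fresh uniform draws reproduce the trace's distribution, but can never generate a value outside the collected data — one of the limitations listed on the next slide.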
Distribution Fitting Traffic Modeling
[Figure: packet arrival behavior over time, with a fitted analytical distribution]
Advantages and Disadvantages

Trace-Driven
Advantages: data is guaranteed to come from the correct sample; practical, real-life results can be anticipated.
Disadvantages: the simulation is limited to the results produced by the collected data; the data may not be sufficient for long-enough runs.

Empirical Distribution
Advantages: usable even when there is not enough data to identify the distribution accurately; more flexibility in the data that can be generated; fairly simple to deduce from the data.
Disadvantages: may show irregularities (statistical abnormalities) if the collected sample is not large enough; the range of values that can be generated may be limited by the original data samples.

Fitted Standard Distribution
Advantages: same advantages as the empirical distribution; irregularities are smoothed out.
Disadvantages: can be difficult to deduce if the available data is limited; there is always a chance of abnormalities that were not accounted for.
Characterization of Distributions
Mean and central moments of distributions:

Expected value (mean): $\mu = E[X] = \int_{-\infty}^{\infty} x f(x)\,dx$ (continuous), $\mu = \sum_x x\, p(x)$ (discrete)
Variance: $\sigma^2 = E[(X-\mu)^2]$
Skewness: $\nu = E[(X-\mu)^3]/\sigma^3$
Expected Value of Random Variables
The expected value is the long-run average of the generated random samples.
If you want to replace the whole distribution by ONE value, the mean is the choice with the least mean square error.
Variance of Random Variables
The variance characterizes how far the random variable deviates from its mean value.
Skewness of Random Variables
Skewness is a measure of the asymmetry of a probability distribution.
Negative skewness: the left tail is longer; the mass of the distribution is concentrated on the right.
Positive skewness: the right tail is longer; the mass of the distribution is concentrated on the left.
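As a quick illustration, the three measures above can be estimated from a sample (the data values here are made up; population-style definitions are used, dividing by n):

```python
import math

def moments(xs):
    """Sample mean, variance, and skewness (dividing by n throughout)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    std = math.sqrt(var)
    # Third central moment normalized by the cubed standard deviation
    skew = sum((x - mean) ** 3 for x in xs) / (n * std ** 3)
    return mean, var, skew

data = [1, 2, 2, 3, 3, 3, 10]  # right tail is longer -> positive skewness
m, v, s = moments(data)
print(round(m, 3), round(v, 3), round(s, 3))
```

The single large value 10 stretches the right tail, so the computed skewness comes out positive, matching the definition above.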
Other Parameters for Continuous Distributions
Some additional parameters of continuous distributions can help guide distribution fitting:
Location (shift) parameter: specifies an abscissa (x-coordinate) location point of a distribution's range of values, often some kind of midpoint of the distribution.
Scale parameter: determines the scale of measurement, or spread, of a distribution.
Shape parameter: determines the basic form or shape of a distribution within the family of distributions of interest.
Location Parameter Examples
[Figure: probability density functions of the normal distribution for μ = 0 and μ = 2, and of the Pareto distribution for θ = 10 and θ = 20]

Normal distribution: $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

Pareto distribution: $f(x) = \frac{1}{\sigma}\left(1 + \frac{k(x-\theta)}{\sigma}\right)^{-1-\frac{1}{k}}$
Scale Parameter Example
If X is a random variable whose distribution has scale parameter 1, then Y = βX has a distribution with scale parameter β.
The standard deviation of a normal distribution is its scale parameter.
[Figure: normal density curves with different standard deviations, illustrating the effect of the scale parameter]
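The scaling relation f_Y(y) = (1/β) f_X(y/β) can be checked numerically with the normal density; the value β = 2 below is arbitrary:

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density; sigma acts as the scale parameter."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

# If X is standard normal (scale 1), then Y = b*X has scale b,
# i.e. f_Y(y) = (1/b) * f_X(y / b). Verify for b = 2 at y = 1.5.
b, y = 2.0, 1.5
lhs = normal_pdf(y, sigma=b)       # density of Y directly, scale = b
rhs = normal_pdf(y / b) / b        # density of X, rescaled
print(abs(lhs - rhs) < 1e-12)
```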
Shape Parameter Characteristics
The normal and exponential distributions do not have a shape parameter; other distributions, such as the beta distribution, may have two shape parameters.
A change in the shape parameter generally alters a distribution's properties more fundamentally than a change in the shift or scale parameters.
Heavy-Tail Distributions
A heavy-tailed distribution has a tail heavier than the exponential: values far from the "mean" of the distribution occur with non-negligible probability.
Pareto principle: known as the 80-20 rule, i.e., 80% of the effects come from 20% of the causes.
Heavy or Light Tailed?
Pareto survivor function (1 − CDF): $S(x) = \left(\frac{x_m}{x}\right)^{\alpha}, \quad x \ge x_m$
Plot on a log-log scale: a heavy-tailed survivor function is linear in the log-log domain.
[Figure: survivor functions (1 − CDF) on a log-log scale for an exponential distribution with mean 30 (light tail) and a Pareto distribution with α = 1.5, x_m = 10, mean 30 (heavy tail)]
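The comparison in the plot can be reproduced numerically; both distributions below have mean 30, matching the figure's legend (for the Pareto, mean = α·x_m/(α − 1) = 1.5·10/0.5 = 30):

```python
import math

def exp_sf(x, mean):
    """Survivor function (1 - CDF) of the exponential distribution."""
    return math.exp(-x / mean)

def pareto_sf(x, alpha, xm):
    """Survivor function of the Pareto distribution, for x >= xm."""
    return (xm / x) ** alpha

# Same mean = 30, very different tails: the exponential survivor function
# decays geometrically, the Pareto one only polynomially.
for x in (100, 1000, 10000):
    print(x, exp_sf(x, 30), pareto_sf(x, 1.5, 10))
```

Far in the tail the Pareto survivor function is many orders of magnitude larger; on a log-log plot it is a straight line with slope −α, while the exponential curve bends sharply downward.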
Discrete Probability Distributions
Binomial Distribution
Uniform (Discrete) Distribution
Geometric Distribution
Negative Binomial Distribution
Hypergeometric Distribution
Continuous Probability Distributions
Uniform (Continuous) Distribution
Triangular Distribution
Normal Distribution
Cauchy Distribution
Exponential Distribution
Continuous Probability Distributions (cont'd)
Lognormal Distribution
Weibull Distribution
Gamma Distribution
Estimation of Parameters
Suppose that some distribution shape has been deduced from the data set by any of the methods mentioned earlier.
The data set X1, X2, …, Xn that was used to deduce the distribution shape can also be used to estimate the parameters that define the distribution completely.
There are many methods for estimating the parameters. We will use the maximum likelihood estimation (MLE) method.
The method can be explained as follows: suppose we have decided that a certain discrete distribution is the closest to the data set, and that the distribution has one unknown parameter θ. The likelihood function L(θ) is defined as:

$L(\theta) = p(X_1)\, p(X_2) \cdots p(X_n)$

This is basically the joint probability mass function, since the data are assumed independent. It gives the probability of obtaining the data set as a whole if θ is the value of the unknown parameter.
Maximum Likelihood Estimator
The MLE of the unknown parameter θ, denoted θ*, is defined as the value of θ that maximizes L(θ).
In the continuous case, the probability mass function is replaced with the chosen probability density function.
Example: for an exponential distribution with parameter θ = β,

$f(x) = \frac{1}{\beta}\, e^{-x/\beta}$

The likelihood function is then:

$L(\beta) = \left(\frac{1}{\beta} e^{-X_1/\beta}\right)\left(\frac{1}{\beta} e^{-X_2/\beta}\right) \cdots \left(\frac{1}{\beta} e^{-X_n/\beta}\right) = \beta^{-n} \exp\!\left(-\frac{1}{\beta} \sum_{i=1}^{n} X_i\right)$
Maximum Likelihood Estimator (cont'd)
Because most theoretical distributions involve exponential functions, it is often easier to maximize the logarithm of the likelihood function instead of L(θ) itself.
Define the log-likelihood function as:

$l(\beta) = \ln(L(\beta)) = -n \ln \beta - \frac{1}{\beta} \sum_{i=1}^{n} X_i$

The problem reduces to maximizing the log-likelihood, since the value of β that maximizes one function must also maximize the other. Differentiating,

$\frac{dl}{d\beta} = -\frac{n}{\beta} + \frac{1}{\beta^2} \sum_{i=1}^{n} X_i$

This equals zero when $\beta = \frac{1}{n} \sum_{i=1}^{n} X_i$, i.e., the sample mean of the data set.
This should be expected for exponential random variables, since they are fully characterized by their means.
Maximum Likelihood Estimator (cont'd)
Suppose instead that the chosen distribution is the geometric distribution, a discrete distribution with pmf:

$p(x) = p(1-p)^x, \quad x = 0, 1, \ldots$

The likelihood function is:

$L(p) = p^n (1-p)^{\sum_{i=1}^{n} X_i}$

The log-likelihood function is:

$\ln(L(p)) = n \ln p + \left(\sum_{i=1}^{n} X_i\right) \ln(1-p)$

Differentiating and equating to zero gives:

$p = \frac{1}{\left(\sum_{i=1}^{n} X_i / n\right) + 1} = \frac{1}{\bar{X} + 1}$
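The closed-form result is simple enough to implement directly (the sample values below are made up):

```python
def geometric_mle(xs):
    """MLE of p for the geometric pmf p(x) = p * (1 - p)**x, x = 0, 1, 2, ...
    Setting d/dp [n*ln(p) + sum(x)*ln(1 - p)] = 0 gives p = 1 / (xbar + 1)."""
    xbar = sum(xs) / len(xs)
    return 1.0 / (xbar + 1.0)

xs = [0, 2, 1, 3, 0, 2]   # hypothetical counts
print(geometric_mle(xs))  # xbar = 4/3, so the estimate is 1 / (4/3 + 1) = 3/7
```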