Selecting Input Probability Distribution

Post on 14-Jan-2016


Simulation Machine

• Simulation can be considered as an engine with input and output, as follows:

Input → Simulation Engine → Output

Realizing Simulation

• Input Analysis is the analysis of the random variables involved in the model, such as:
– The distribution of interarrival times (IAT)
– The distribution of service times

• Simulation Engine is the way of realizing the model. This includes:
– Generating the random variables involved in the model
– Evaluating the required formulas

• Output Analysis is the study of the data produced by the simulation engine.

Input Analysis

• Collect data from the field.

• Analyze these data.

• Two ways to analyze the data:
– Build an empirical distribution and then sample from this distribution.
– Fit the data to a theoretical distribution (such as Normal, Exponential, etc.). See Chapter 6 of the text for more distributions.

How to select an Input Probability distribution

1. Hypothesize a family of distributions.

2. Estimate the parameters of the fitted distributions

3. Determine how representative the fitted distributions are

Repeat steps 1-3 until you get a fitted distribution that adequately represents the collected data. If no theoretical distribution fits, go with an empirical distribution.

Hypothesizing a Theoretical Distribution

To Fit a Theoretical Distribution

• Need a good background of the theoretical distributions (Consult your Text: Section 6.2)

• Histogram may not provide much insight into the nature of the distribution.

• Need Summary statistics

Summary Statistics

• Mean

• Median

• Variance $\sigma^2$

• Coefficient of variation ($cv = \sigma / \mu$) for continuous distributions

• Lexis ratio ($\tau = \sigma^2 / \mu$) for discrete distributions

• Skewness index $\nu = \dfrac{E[(X - \mu)^3]}{(\sigma^2)^{3/2}}$
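The summary statistics listed above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `summary_statistics` is hypothetical, and it assumes population-style moments (dividing by n), whereas a sample version would divide by n - 1.

```python
import math

def summary_statistics(data):
    """Return mean, median, variance, cv, Lexis ratio and skewness of a sample."""
    n = len(data)
    ordered = sorted(data)
    mean = sum(data) / n
    mid = n // 2
    # Median: middle value for odd n, average of the two middle values for even n.
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    # Assumption: moments are computed with a 1/n denominator.
    variance = sum((x - mean) ** 2 for x in data) / n
    third_moment = sum((x - mean) ** 3 for x in data) / n
    return {
        "mean": mean,
        "median": median,
        "variance": variance,
        "cv": math.sqrt(variance) / mean,            # sigma / mu
        "lexis_ratio": variance / mean,              # sigma^2 / mu
        "skewness": third_moment / variance ** 1.5,  # E[(X - mu)^3] / (sigma^2)^(3/2)
    }

# Small symmetric data set chosen only for illustration.
stats = summary_statistics([1.0, 2.0, 3.0, 4.0, 5.0])
print(stats)
```

For symmetric data such as this, the mean equals the median and the skewness is zero.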

Summary Stats. Cont.

• If the mean and the median are close to each other and the coefficient of variation is low, we would expect Normally distributed data.

• If the median is less than the mean and the standard deviation is very close to the mean (cv close to 1), we would expect an exponential distribution.

• If the skewness is very low ($\nu$ close to 0), then the data are symmetric.

Example

Consider the following data:

5.076808  5.050842  6.398492
4.895876  5.300643  4.615494
6.77878   6.236305  6.091197
6.909572  6.829625  6.121048
6.474918  4.524959  4.927547
7.607923  4.913438  6.651687
6.699065  4.965261  5.968593
6.019929  5.505035  4.147587
5.249301  5.170052  5.46882
4.653011  4.132489  6.241657

Example Cont.

• Mean 5.654198

• Median 5.486928

• Standard Deviation 0.910188

• Skewness 0.173392

• Range 3.475434

• Minimum 4.132489

• Maximum 7.607923
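As a check, the figures above can be recomputed from the 30 observations with Python's standard `statistics` module. This is a sketch under one assumption: the slide's standard deviation is taken to be the sample standard deviation (n - 1 denominator), which is what `statistics.stdev` computes.

```python
import statistics

data = [
    5.076808, 5.050842, 6.398492, 4.895876, 5.300643, 4.615494,
    6.77878, 6.236305, 6.091197, 6.909572, 6.829625, 6.121048,
    6.474918, 4.524959, 4.927547, 7.607923, 4.913438, 6.651687,
    6.699065, 4.965261, 5.968593, 6.019929, 5.505035, 4.147587,
    5.249301, 5.170052, 5.46882, 4.653011, 4.132489, 6.241657,
]

mean = statistics.mean(data)
median = statistics.median(data)
stdev = statistics.stdev(data)      # assumed: sample standard deviation
data_range = max(data) - min(data)
cv = stdev / mean                   # coefficient of variation

print(f"mean={mean:.6f} median={median:.6f} stdev={stdev:.6f}")
print(f"range={data_range:.6f} cv={cv:.3f}")
```

The mean and median are close and the coefficient of variation is well below 1, which is the pattern the previous slide associates with a Normal distribution.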

Example Continue

We might take these data and construct a histogram.

[Figure: histogram of the data; horizontal axis bins 4, 5, 6, 7, 8, More; vertical axis Frequency]

The given summary statistics and the histogram suggest a Normal Distribution

Empirical Distribution

[Figure: empirical distribution plot; vertical axis from 0 to 1, horizontal axis from 1 to 29]

Disadvantages of Empirical distribution

• The empirical data may not adequately represent the true underlying population because of sampling error

• The generated random variates are bounded: values below the smallest or above the largest observation can never be generated.

• To overcome these two problems, we attempt to fit a theoretical distribution.
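The boundedness problem is easy to see in code. The sketch below is one common way to sample from an empirical distribution, not necessarily the one intended on the slide: it assumes a piecewise-linear empirical CDF through the sorted observations and applies the inverse transform. The function name `sample_empirical` is hypothetical, and the observations are the first six values of the example data set.

```python
import random

def sample_empirical(sorted_data, rng):
    """Draw one variate from the piecewise-linear empirical CDF of sorted_data."""
    n = len(sorted_data)
    u = rng.random() * (n - 1)   # position along the n - 1 gaps between order statistics
    i = int(u)                   # index of the order statistic just below the position
    frac = u - i                 # fractional distance into the gap
    return sorted_data[i] + frac * (sorted_data[i + 1] - sorted_data[i])

observations = sorted([5.076808, 5.050842, 6.398492, 4.895876, 5.300643, 4.615494])
rng = random.Random(2016)        # fixed seed so the run is repeatable
samples = [sample_empirical(observations, rng) for _ in range(10_000)]

# No sample can fall outside the observed minimum and maximum.
print(min(samples), max(samples))
print(observations[0], observations[-1])
```

However many variates are drawn, none falls outside the observed range, which is the second disadvantage listed above.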

Estimation of Parameters of the fitted distributions

Suppose we have hypothesized a distribution. Then use the Maximum Likelihood Estimator (MLE) to estimate the parameters of the hypothesized distribution.

• Suppose that $\theta$ is the only parameter involved in the distribution (for example, the mean $1/\lambda$ in the exponential distribution).

• Let $L(\theta) = f_\theta(X_1)\, f_\theta(X_2) \cdots f_\theta(X_n)$

• Find the value of $\theta$ that maximizes $L(\theta)$; this is the required estimate.

• Example: the exponential distribution. Do in class
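The in-class exponential example is not reproduced in the slides. A sketch of the standard derivation follows, assuming the density $f_\lambda(x) = \lambda e^{-\lambda x}$ with rate $\lambda$ (so the mean is $1/\lambda$) and writing $\bar{X}$ for the sample mean:

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda X_i}
            = \lambda^{n} \exp\Bigl(-\lambda \sum_{i=1}^{n} X_i\Bigr) \\
\ln L(\lambda) &= n \ln \lambda - \lambda \sum_{i=1}^{n} X_i \\
\frac{d}{d\lambda} \ln L(\lambda) &= \frac{n}{\lambda} - \sum_{i=1}^{n} X_i = 0
  \quad\Longrightarrow\quad
  \hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}}
\end{align*}
```

The second derivative, $-n/\lambda^2$, is negative, so this stationary point is a maximum. The estimated mean is therefore $1/\hat{\lambda} = \bar{X}$.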

Determine how representative the fitted distributions are

• Goodness of Fit (Chi Squared method)

Goodness of Fit (Chi Square method)

1. Divide the range of the fitted distribution into k (k < 30) intervals $[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k]$. Let $N_j$ = the number of data points that belong to $[a_{j-1}, a_j)$.

2. Compute the expected proportion of the data that fall in the jth interval using the fitted distribution; call it $p_j$.

3. Compute the chi-square statistic, where n is the total number of data points:

$$\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}$$

Chi-square cont.

• Note that npj represents the expected number of data that would fall in the jth interval if the fitted distribution is correct.

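• If $\chi^2 < \chi^2_{k-r-1,\,1-\alpha}$, the critical value of the chi-square distribution with k - r - 1 degrees of freedom that is exceeded with probability $\alpha$,

• where r is the number of parameters estimated for the distribution (for the exponential distribution r = 1, namely $\lambda$),

• then do not reject the fitted distribution at significance level $\alpha$ (confidence level $(1-\alpha)100\%$).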

Example

• Consider the following data:

0.01, 0.07, 0.03, 0.23, 0.04,

0.10, 0.31, 0.10, 0.31, 1.17,

1.50, 0.93, 1.54, 0.19, 0.17,

0.36, 0.27, 0.46, 0.51, 0.11,

0.56, 0.72, 0.39, 0.04, 0.78

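Suppose we hypothesize an exponential distribution. Use the chi-square test, dividing the range into 5 subintervals.

• The estimate of $\lambda$ is 2.5.

• Since k = 5, we choose equiprobable intervals, so $p_j = 0.2$.

• For the exponential distribution, $F(t) = 1 - e^{-2.5t}$.

• Therefore

$$1 - e^{-2.5 a_1} = 0.2 \;\Rightarrow\; e^{-2.5 a_1} = 1 - 0.2 \;\Rightarrow\; a_1 = -\frac{\ln(1 - 0.2)}{2.5} = 0.089$$

The remaining endpoints follow in the same way, with 0.4, 0.6, and 0.8 in place of 0.2.

• Therefore chi-square = 0.4.

• From the chi-square table, $\chi^2_{3,\,0.05} = 7.81$, where 3 = k - r - 1 is the number of degrees of freedom and 0.05 is the upper-tail probability.

• Since 0.4 < 7.81, we do not reject the hypothesis at significance level 5%.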
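The arithmetic behind chi-square = 0.4 can be traced with a short Python sketch. It takes the rate 2.5 and the critical value 7.81 from the example as given; the variable names are illustrative and the half-open interval convention $[a_{j-1}, a_j)$ follows the earlier slide.

```python
import math

data = [
    0.01, 0.07, 0.03, 0.23, 0.04,
    0.10, 0.31, 0.10, 0.31, 1.17,
    1.50, 0.93, 1.54, 0.19, 0.17,
    0.36, 0.27, 0.46, 0.51, 0.11,
    0.56, 0.72, 0.39, 0.04, 0.78,
]
rate = 2.5   # estimate of lambda quoted in the example, taken as given
k = 5        # number of equiprobable intervals
n = len(data)

# Interior endpoints: F(a_j) = j / k  =>  a_j = -ln(1 - j/k) / rate, for j = 1..k-1.
endpoints = [-math.log(1 - j / k) / rate for j in range(1, k)]

# Count observations per interval [a_{j-1}, a_j), with a_0 = 0 and a_k = infinity.
counts = [0] * k
for x in data:
    j = sum(x >= a for a in endpoints)  # number of endpoints at or below x
    counts[j] += 1

expected = n / k                        # n * p_j with p_j = 1/k
chi_square = sum((c - expected) ** 2 / expected for c in counts)
critical_value = 7.81                   # 3 degrees of freedom, 5% level, from the example

print([round(a, 3) for a in endpoints])
print(counts, round(chi_square, 3), chi_square < critical_value)
```

With these endpoints the observed counts are 5, 5, 5, 4, and 6 against an expected 5 per interval, which gives the statistic quoted above.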

The Chi-square table

Degrees of    Probability, p
Freedom       0.99     0.95     0.05     0.01     0.001
1             0.000    0.004    3.84     6.64     10.83
2             0.020    0.103    5.99     9.21     13.82
3             0.115    0.352    7.82     11.35    16.27
4             0.297    0.711    9.49     13.28    18.47
5             0.554    1.145    11.07    15.09    20.52
6             0.872    1.635    12.59    16.81    22.46
7             1.239    2.167    14.07    18.48    24.32
8             1.646    2.733    15.51    20.09    26.13