
Risk Analysis Using Simulation

Date post: 02-Nov-2015
Description:
Process safety course work
  • Risk Analysis using Simulation Page 1 of 29

    Risk Analysis using simulation

    Introduction

    - Simulation is a process for understanding the behavior of a system or evaluating alternative strategies for operating it.
    - Simulation calculates multiple scenarios of a model by repeatedly sampling values from the probability distributions of the uncertain variables.
    - Simulation generates an artificial history of the system and observes that history to draw inferences about the operating characteristics of the real system it represents.

    Why Simulation?

    - When it is too costly to do physical studies on the system itself (e.g., trying an alternative factory layout or building a new facility).
    - When the corresponding analytic models are too complicated to study (e.g., a queuing or transportation network).
    - When uncertainty or non-stationarity dominates the output results.

    Steps in Simulation

    - Model development: system definition, objectives of the study, decision variables, output measures, input variables, and parameters.
    - Data collection: collect data from the real system and obtain probability distributions of the input parameters by statistical analysis.
    - Model translation: translate the physical model into computer simulation software.
    - Simulation runs and output data analysis: run the simulation up to a satisfactory level and use statistical analysis of the output data to estimate the performance measures.
    - Model verification and validation:
        - Verification: ensuring that the model is free from logical errors and does what it is intended to do.
        - Validation: ensuring that the model is a valid representation of the whole system. Model outputs are compared with the real system outputs.
    - Optimization/experimental design: analyze alternative strategies on the validated simulation model.
    - Sensitivity analysis: analyze how sensitive the performance measures of the system are to the input data, checking model robustness.

    Tools for simulation

    - @Risk
    - Crystal Ball
    - MATLAB
    - Minitab
    - Excel spreadsheet
    - and others

    Simulation using Excel Spreadsheet

    Basic Excel Skills

    To follow simulation in Excel, it is necessary to be familiar with the following features of Excel:

    - Copying formulas and cell references
    - Functions: used to perform special calculations in spreadsheet cells. Some of the more common functions used in simulation models include:
        - MIN(range): finds the smallest value in a range of cells.
        - MAX(range): finds the largest value in a range of cells.
        - SUM(range): finds the sum of the values in a range of cells.
        - AVERAGE(range): finds the average of the values in a range of cells.
        - STDEV(range): finds the standard deviation for a sample in a range of cells.
        - AND(condition1, condition2, ...): a logical function that returns TRUE if all conditions are true, and FALSE if not.
        - OR(condition1, condition2, ...): a logical function that returns TRUE if any condition is true, and FALSE if not.
        - IF(condition, value if true, value if false): a logical function that returns one value if the condition is true and another if the condition is false.
        - VLOOKUP(value, table range, column number): looks up a value in a table.
    - Charts and graphs
    - Excel has many other functions for statistical, financial, and other applications.

    Other useful features

    - Split Screen: divides the displayed spreadsheet screen.
    - Paste Special: controls what is pasted into the cells.
    - Column and row width: customizes column and row size.
    - Displaying formulas in worksheets: shows the actual formulas in the spreadsheet.
    - Displaying grid lines and row and column headers for printing
    - Filling a range with a series of numbers
    - Comment boxes

    Building Simulation Models in Excel

    A good design is essential to user understanding. A well-designed simulation model in any spreadsheet should include the following features:

    - A descriptive title
    - A separate input data section
    - A separate working space
    - A separate output section that provides the model results
    - Appropriate formatting, such as currency or comma formatting
    - Complex calculations divided across several cells to minimize the chance of error and enhance understanding
    - Comments placed next to formula cells or in comment boxes for explanation, where appropriate

    Example: A Simulation Model for Dave's Candies

    Dave's Candies is a small family-owned business that offers gourmet chocolates and ice cream fountain service. For special occasions such as Valentine's Day, the store must place orders for special packaging several weeks in advance from its supplier. One product, Valentine's Day Chocolate Massacre, is bought for $7.50 a box and sells for $12.00. Any boxes that are not sold by February 14 are discounted by 50% and can always be sold easily. Historically, Dave's Candies has sold between 40 and 90 boxes each year with no apparent trend. Dave's dilemma is deciding how many boxes to order for the Valentine's Day customers.

    If the order quantity Q is 70, what is the expected profit?

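    Formulation:

    Selling price = $12
    Cost = $7.50
    Discount price = $6

    \[ \text{If } D \le Q:\quad \text{Profit} = \text{selling price} \times D + \text{discount price} \times (Q - D) - \text{cost} \times Q \qquad [1] \]
    \[ \text{If } D > Q:\quad \text{Profit} = \text{selling price} \times Q - \text{cost} \times Q \qquad [2] \]

    where D = demand and Q = order quantity.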
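The two profit cases can be sketched in Python. This is a minimal illustration, not the worksheet itself: the function name `profit` and its parameter names are assumptions, and the leftover-box case assumes, as the example states, that every unsold box is sold at the 50% discount price.

```python
def profit(demand, quantity, price=12.00, cost=7.50, discount_price=6.00):
    """Profit for one season under the two cases of the formulation."""
    if demand <= quantity:
        # Case [1]: `demand` boxes sell at full price, the leftover boxes
        # sell at the discount price, and all ordered boxes are paid for.
        return price * demand + discount_price * (quantity - demand) - cost * quantity
    # Case [2]: demand exceeds the order, so every ordered box sells at full price.
    return price * quantity - cost * quantity


# Profit for Q = 70 at a few demand levels inside the historical 40-90 range.
for d in (40, 70, 90):
    print(d, profit(d, 70))
```

With Q = 70, any demand of 70 or more gives the same profit, because sales are capped by the order quantity.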


    Simulation model in Excel

    Model description

    - The input data are given in Columns A and B.
    - Simulation results are shown in Columns D through F.
    - An IF function is used in Column F to calculate the profit from the relation between demand and order quantity, using equations [1] and [2].
    - Each row in the results table represents one trial of the simulation.
    - Demand in each row was generated from a probability distribution.
    - Since sales range from 40 to 90 boxes per year, the values in Column E were generated by rolling a die and entered into the worksheet to verify the formulas for profit in Column F.

    Probability & Statistic in Simulation

    Fundamentals

    - Probability: A measure of how likely a value or event is to occur. It can be estimated from simulation data as a relative frequency: the number of occurrences of the value or event divided by the total number of occurrences. This calculation returns a value between 0 and 1, which can be converted to a percentage by multiplying by 100.
    - Probabilistic Risk Assessment (PRA): A risk assessment that yields a probability distribution for risk, generally by assigning a probability distribution to represent variability or uncertainty in one or more inputs to the risk equation.
    - Probability Density Function (PDF): A function representing the probability distribution of a continuous random variable. The density at a point refers to the probability that the variable will have a value in a narrow range about that point.
    - Probability Distribution: A set of probabilities associated with all possible outcomes of an uncertain event.
    - Probability Mass Function (PMF): A function representing the probability distribution of a discrete random variable. The mass at a point refers to the probability that the variable will have a value at that point.
    - Random Variable: A variable that may assume any value from a set of values according to chance. Discrete random variables can assume only a finite or countably infinite number of values (e.g., number of rainfall events per year). A random variable is continuous if its set of possible values is an entire interval of numbers (e.g., quantity of rain in a year).
    - Central Tendency Exposure (CTE): A risk descriptor representing the average or typical individual in a population, usually considered to be the mean or median of the distribution.
    - Confidence Interval: A range of values within which the true value of a parameter is expected to lie with a stated level of confidence.
    - Confidence Limit: The upper or lower value of a confidence interval.
    - Cumulative Distribution Function (CDF): Obtained by integrating the PDF, it gives the cumulative probability of occurrence for a random independent variable. The value of the function at c is the probability that a random observation will be less than or equal to c.
    - Frequency Distribution or Histogram: A graphic (plot) summarizing the frequency of the values observed or measured from a population. It conveys the range of values and the count (or proportion of the sample) that was observed across that range.
    - Monte Carlo Analysis (MCA) or Monte Carlo Simulation: A technique for characterizing the uncertainty and variability in risk estimates by repeatedly sampling the probability distributions of the risk equation inputs and using these inputs to calculate a range of risk values.
    - Point Estimate: In statistical theory, a quantity calculated from values in a sample to estimate a fixed but unknown population parameter. Point estimates typically represent a central tendency or upper bound estimate of variability.

    [Source: U.S. EPA]

    Types of Distribution:

    - Continuous distribution: A probability distribution is continuous when any value between the minimum and maximum is possible (has nonzero probability density). For example, an uncertainty function describing the possible annual rainfall in Ithaca, New York next year would be a continuous function, since any value between 0 and some upper limit is possible.
    - Discrete distribution: A discrete probability distribution has only a finite (or countably infinite) number of possible values. For example, an uncertainty function describing the outcome of a coin toss is discrete, since only two values are possible: heads or tails.

    Descriptive Statistics

    Parameter - A value that characterizes the distribution of a random variable.

    Parameters commonly characterize the location, scale, shape, or bounds of the

    distribution. For example, a truncated normal probability distribution may be

    defined by four parameters: arithmetic mean [location], standard deviation

    [scale], and min and max [bounds]. It is important to distinguish between a

    variable (e.g., ingestion rate) and a parameter (e.g., arithmetic mean ingestion

    rate).

    Measures of Central Tendency: Mean, Median, & Mode

    Mean: The mean of a set of values is the sum of all the values in the set divided by the total number of values in the set.

    \[ \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \]

    Median: The median is the middle value (50th percentile) in the ordered sequence of measured values. For highly skewed data sets the median can give a better representation of the middle of the data distribution than the mean. The median is not as comprehensive a measure of the data set as the mean.

    Mode: The most likely value, or mode, is the value that occurs most often in a set of values. In a histogram and a result distribution, it is the center value in the class or bar with the highest probability.

    Measures of Variation

    Standard deviation: The standard deviation is a measure of how widely dispersed the values are in a distribution. It equals the square root of the variance. With \bar{x} the mean of the N values:

    \[ s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}} \]

    Variance: The variance is a measure of how widely dispersed the values are in a distribution, and thus is an indication of the "risk" of the

    distribution. It is calculated as the average of the squared deviations about

    the mean. The variance gives disproportionate weight to "outliers", values

    that are far away from the mean. The variance is the square of the

    standard deviation.

    Range: The range is the absolute difference between the maximum and minimum values in a set of values. The range is the simplest measure of

    the dispersion or "risk" of a distribution

    Kurtosis: Kurtosis is a measure of the shape of a distribution. Kurtosis indicates how flat or peaked the distribution is. The higher the kurtosis value, the more peaked the distribution. With \bar{x} the mean and s the standard deviation, the coefficient of kurtosis is computed as:

    \[ CK = \frac{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^4}{s^4} \]


    Skewness: Skewness is a measure of the shape of a distribution. Skewness indicates the degree of asymmetry in a distribution. Skewed distributions have more values to one side of the peak or most likely value, so one tail is much longer than the other. A skewness of 0 indicates a symmetric distribution, while a negative skewness means the distribution is skewed to the left. Positive skewness indicates a skew to the right. With \bar{x} the mean and s the standard deviation, the coefficient of skewness is computed as:

    \[ CS = \frac{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^3}{s^3} \]

    Coefficient of Variation: It is the ratio of the standard deviation to the sample mean.

    \[ CV = \frac{s}{\bar{x}} \]

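    Correlation: It measures the interdependence of two variables. With s_x and s_y the sample standard deviations of x and y, the sample correlation coefficient is computed as:

    \[ r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{(N - 1)\, s_x s_y} \]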
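The measures of central tendency and variation described above can be checked numerically with a short Python sketch. The data values are made up for illustration, the helper names are hypothetical, and the skewness and kurtosis coefficients assume a 1/N average of the powered deviations divided by the matching power of the sample standard deviation (N - 1 denominator).

```python
import statistics


def coefficient_of_skewness(xs):
    """Average cubed deviation from the mean, divided by s cubed."""
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return sum((x - m) ** 3 for x in xs) / len(xs) / s ** 3


def coefficient_of_kurtosis(xs):
    """Average fourth-power deviation from the mean, divided by s to the fourth."""
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return sum((x - m) ** 4 for x in xs) / len(xs) / s ** 4


def correlation(xs, ys):
    """Sample correlation coefficient of two equally long sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cross = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cross / ((len(xs) - 1) * sx * sy)


data = [2.0, 4.0, 4.0, 5.0, 10.0]           # made-up sample
linear = [2.0 * x + 1.0 for x in data]      # perfectly correlated with data

print("mean    ", statistics.mean(data))
print("median  ", statistics.median(data))
print("mode    ", statistics.mode(data))
print("variance", statistics.variance(data))   # sample variance (N - 1)
print("stdev   ", statistics.stdev(data))
print("range   ", max(data) - min(data))
print("CV      ", statistics.stdev(data) / statistics.mean(data))
print("skewness", coefficient_of_skewness(data))
print("kurtosis", coefficient_of_kurtosis(data))
print("r       ", correlation(data, linear))
```

The sample is skewed to the right by the single large value, so its mean lies above its median and its coefficient of skewness is positive.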

    Random numbers & Probability distribution in Simulation

    - A random variable is a numerical description of the outcome of some experiment. Examples of experimental outcomes include profit values, failure rates, or the chance that an event occurs.
    - It can be generated from a probability distribution.
    - In simulation terminology, a random number is one that is uniformly distributed between 0 and 1.
    - Recall from statistics that the uniform probability distribution characterizes a random variable for which all outcomes between a minimum value a and a maximum value b are equally likely.
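A random number between 0 and 1 can be rescaled to a uniform value between any a and b. The Python sketch below illustrates this under stated assumptions: the function name `uniform_between` is hypothetical, and the limits 40 and 90 are borrowed from the Dave's Candies sales range purely as sample inputs.

```python
import random


def uniform_between(a, b, rng=random):
    """Rescale a random number u from [0, 1) to a value between a and b."""
    u = rng.random()
    return a + (b - a) * u


rng = random.Random(42)   # fixed seed so the run is repeatable
samples = [uniform_between(40, 90, rng) for _ in range(1000)]
print(min(samples), max(samples))
```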


    Some Important Continuous Distributions

    Exponential Distribution: The exponential distribution is characterized by the single parameter (λ). The exponential random variable T (t > 0) has the probability density function

    \[ f(t) = \lambda e^{-\lambda t} \quad \text{for } t > 0 \]

    where

    λ = mean number of occurrences per unit time
    t = number of time units until the next occurrence

    The cumulative distribution function is

    \[ F(t) = \int_0^t \lambda e^{-\lambda x}\, dx = \left[ -e^{-\lambda x} \right]_0^t = 1 - e^{-\lambda t} \quad \text{for } t > 0 \]

    Mean: \( \mu = \frac{1}{\lambda} \) and Variance: \( \sigma^2 = \frac{1}{\lambda^2} \)
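Because the exponential CDF has a closed form, it can be solved for t to turn a uniform random number into an exponential one: setting u = 1 - exp(-λt) gives t = -ln(1 - u)/λ. The Python sketch below illustrates this; the function names and the rate of 2 occurrences per unit time are assumptions chosen for the example.

```python
import math
import random


def exponential_cdf(t, lam):
    """F(t) = 1 - exp(-lam * t) for t > 0."""
    return 1.0 - math.exp(-lam * t)


def exponential_inverse_cdf(u, lam):
    """Solve u = F(t) for t; valid for 0 <= u < 1."""
    return -math.log(1.0 - u) / lam


lam = 2.0                 # assumed rate: 2 occurrences per unit time
rng = random.Random(0)    # fixed seed so the run is repeatable
samples = [exponential_inverse_cdf(rng.random(), lam) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print("sample mean:", sample_mean, "theoretical mean:", 1.0 / lam)
```

The sample mean should settle near 1/λ as the number of samples grows.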

    Lognormal Distribution: Distribution parameters (μ, σ)

    Normal Distribution: Distribution parameters (μ, σ)

    Triangular Distribution: Distribution parameters (min, most likely, max)

    Uniform Distribution: Distribution parameters (min, max)

    Some Important Discrete Distributions

    Binomial Distribution: Distribution parameters (n, p)

    Poisson Distribution: Distribution parameter (λ)

    Generating random variables in Excel

    Two properties of probability distributions are useful for generating random numbers:

    1. the probability of any outcome is always between 0 and 1, and
    2. the probabilities of all outcomes sum to 1.

    Random numbers from discrete distributions

    - Divide the range from 0 to 1 into intervals that correspond to the probabilities of the discrete outcomes.
    - Use the VLOOKUP function (lookup value, table array, column index number) or the Random Number Generation tool in Excel.
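The interval-lookup idea can be sketched in Python, with a binary search playing the role that VLOOKUP plays in the worksheet. The outcome labels and probabilities below are made up for illustration, and `make_lookup` is a hypothetical helper name.

```python
import bisect
import random
from itertools import accumulate


def make_lookup(outcomes, probabilities):
    """Return a function that maps a random number u in [0, 1) to an outcome."""
    # Upper bounds of the intervals, e.g. [0.25, 0.75, 1.0].
    upper_bounds = list(accumulate(probabilities))

    def lookup(u):
        # Index of the first interval whose upper bound exceeds u.
        index = bisect.bisect_right(upper_bounds, u)
        # Clamp in case rounding leaves the last bound slightly below 1.
        return outcomes[min(index, len(outcomes) - 1)]

    return lookup


# Made-up discrete distribution: three outcomes with probabilities summing to 1.
lookup = make_lookup(["low", "medium", "high"], [0.25, 0.50, 0.25])

rng = random.Random(1)    # fixed seed so the run is repeatable
draws = [lookup(rng.random()) for _ in range(10000)]
share_medium = draws.count("medium") / len(draws)
print("share of 'medium':", share_medium)
```

Each outcome owns a slice of the 0 to 1 range whose width equals its probability, so the long-run share of each outcome approaches that probability.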


    Example : Random number generation for Dave's Candies


    Random numbers from continuous distribution

    The inverse transform method is used when there is no direct function to generate random numbers from the distribution.

    The inverse transform method uses the following steps:

    1. Generate a random number from the Uniform distribution: u = Uniform(0,1).
    2. Calculate the inverse cumulative distribution function (ICDF) at u. In Excel, the ICDF function for the Normal distribution is NORMINV and for the LogNormal distribution is LOGINV. The RAND function in Excel generates random numbers from the Uniform distribution, which are then passed to these built-in functions to calculate the ICDF.

    Example: The following formula returns the inverse CDF of the Normal distribution with mean=1 and standard_deviation=2 evaluated at p=0.2 (the argument separator is ; or , depending on Excel's regional settings):

    =NORMINV(0.2; 1; 2)

    Replacing 0.2 with RAND() yields the Normal random number generation formula:

    =NORMINV(RAND(); 1; 2)

    Similarly, for the LogNormal distribution the function is

    LOGINV(probability, mean, standard_dev)

    where mean and standard_dev are the mean and standard deviation of ln(x).
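The same two steps can be sketched in Python, where `statistics.NormalDist.inv_cdf` stands in for NORMINV. This is an illustration rather than a translation of the worksheet: the helper name `inverse_transform_normal` is hypothetical, and the lognormal line assumes the same convention as LOGINV, in which the two parameters describe ln(x).

```python
import math
import random
from statistics import NormalDist

dist = NormalDist(mu=1.0, sigma=2.0)

# Counterpart of =NORMINV(0.2; 1; 2): the inverse CDF evaluated at p = 0.2.
x_at_p = dist.inv_cdf(0.2)
print("inverse CDF at 0.2:", x_at_p)


def inverse_transform_normal(rng, dist):
    """Step 1: draw u from Uniform(0,1). Step 2: apply the inverse CDF."""
    u = rng.random()
    while u == 0.0:           # inv_cdf is only defined for 0 < u < 1
        u = rng.random()
    return dist.inv_cdf(u)


rng = random.Random(7)        # fixed seed so the run is repeatable
normal_samples = [inverse_transform_normal(rng, dist) for _ in range(20000)]
sample_mean = sum(normal_samples) / len(normal_samples)
print("sample mean:", sample_mean)

# Lognormal draw: exponentiate a normal draw whose parameters describe ln(x).
lognormal_sample = math.exp(inverse_transform_normal(rng, dist))
print("one lognormal draw:", lognormal_sample)
```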

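    For the triangular distribution, an IF function may be needed to generate random numbers in Excel. The structure of the IF function is:

    =IF(expression, what is returned if true, what is returned if false)

    The IF function first determines which side of the distribution corresponds to the random number and then evaluates the appropriate formula:

    =IF(RAND() < (Mode - Min)/(Max - Min), Left formula, Right formula)

    In a triangular distribution with Min = a, Mode = b, and Max = c:

    \[ \text{Left formula} = a + \sqrt{\text{RAND()} \cdot (b - a)(c - a)}, \quad a \le x \le b \]
    \[ \text{Right formula} = c - \sqrt{(1 - \text{RAND()}) \cdot (c - b)(c - a)}, \quad b \le x \le c \]

    The same random number must be used in the condition and in the selected formula, so in practice the RAND() value is placed in one cell and that cell is referenced in all three places.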
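The IF-based triangular formula can be sketched in Python as a single function of one random number u, which makes it explicit that the side test and the chosen formula share the same u. The function name is hypothetical, the square-root form is the standard inverse CDF of the triangular distribution, and the parameter values 45, 70 and 120 are used only as sample inputs.

```python
import math
import random


def triangular_inverse_cdf(u, a, b, c):
    """Inverse CDF of a triangular distribution with min a, mode b, max c."""
    if u < (b - a) / (c - a):
        # Left side of the mode: a <= x <= b.
        return a + math.sqrt(u * (b - a) * (c - a))
    # Right side of the mode: b <= x <= c.
    return c - math.sqrt((1.0 - u) * (c - b) * (c - a))


a, b, c = 45.0, 70.0, 120.0   # sample parameters: min, mode, max
rng = random.Random(3)        # fixed seed so the run is repeatable
samples = [triangular_inverse_cdf(rng.random(), a, b, c) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print("sample mean:", sample_mean, "theoretical mean:", (a + b + c) / 3.0)
```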


    Monte-Carlo Simulation

    Figure 1: Monte Carlo analysis to a model.

    Monte-Carlo simulation uses the following steps for assessing parameter uncertainty:

    i. Select a distribution to describe possible values of a parameter.

    ii. Generate data from this distribution.

    iii. Use the generated data as probable values of the parameters in the model to

    produce output.

    Monte-Carlo simulation on Excel

    Monte-Carlo simulation on an Excel spreadsheet follows these steps:

    1. Develop the spreadsheet model, including separate input and output regions.
    2. Generate random numbers from the assigned probability distributions and use the random data in the appropriate formulas in the simulation model.
    3. Repeat step 2 until enough trials have been run to create a satisfactory distribution of results.
    4. Summarize the descriptive statistics and collect the output data in a frequency distribution or histogram for analysis.
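The four steps above can be sketched in Python for the Dave's Candies problem. This is an illustration under a loud assumption: the source only says that sales fall between 40 and 90 boxes, so the demand levels 40, 50, ..., 90 with equal probability are assumed here. The names `profit`, `simulate` and `DEMAND_LEVELS` are hypothetical.

```python
import random
import statistics
from collections import Counter

# Step 1: model inputs (prices from the example) kept apart from the outputs.
PRICE, COST, DISCOUNT_PRICE = 12.00, 7.50, 6.00
DEMAND_LEVELS = [40, 50, 60, 70, 80, 90]   # ASSUMPTION: equally likely levels


def profit(demand, quantity):
    """Profit for one trial: leftovers are sold at the discount price."""
    sold = min(demand, quantity)
    leftover = quantity - sold
    return PRICE * sold + DISCOUNT_PRICE * leftover - COST * quantity


def simulate(quantity, trials, seed=0):
    """Steps 2 and 3: draw a demand and compute the profit, once per trial."""
    rng = random.Random(seed)
    return [profit(rng.choice(DEMAND_LEVELS), quantity) for _ in range(trials)]


profits = simulate(quantity=70, trials=10000)

# Step 4: descriptive statistics and a frequency distribution of the output.
mean_profit = statistics.mean(profits)
print("mean profit:", mean_profit)
print("std dev    :", statistics.stdev(profits))
for value, count in sorted(Counter(profits).items()):
    print(value, count)
```

Under the assumed demand distribution the exact expected profit for Q = 70 is $255, so the simulated mean should land close to that value; a different demand assumption would change the figure.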

    Example: Monte-Carlo simulation on excel for Dave's Candies problem


    Example: Methyl-mercury has been inadvertently released to a nearby lake. Use Monte Carlo simulation. With 90% (subjective) confidence, what is the risk to the maximally exposed individual? The following table shows the (subjective) probability distributions for this problem.

    Solution:

    Step 1: Select distributions for the model parameters.

    - Table 1 is used as input for a Monte Carlo simulation.

    Table 1: Input data for MCS (input region)

    Parameter                              Distribution   Min.       Max.       Mean/Mode   STD
    Fish concentration (CF), mg/kg         Normal         -          -          2.06E-01    4.22E-02
    Intake of fish (IF), kg/d              Uniform        2.00E-02   1.30E-01   6.50E-02    -
    Methyl mercury RfD (RfDMM), mg/kg-d    Triangle       1.50E-04   3.00E-03   3.00E-04    -
    Body mass (BM), kg                     Triangle       4.50E+01   1.20E+02   7.00E+01    -

    Step 2: Generate data from these distributions.

    - Table 2 shows the generated data for this example.

    Table 2: Random number generation

    Sample Number   CF      IF      RfD        BM
    1               0.209   0.032   5.39E-04    59.203
    2               0.268   0.047   9.61E-04    66.066
    3               0.203   0.052   2.35E-03   109.907
    4               0.199   0.026   2.51E-04    95.667
    5               0.140   0.123   3.71E-04    94.443
    6               0.230   0.083   6.53E-04    60.822
    7               0.191   0.025   2.77E-03    78.394
    8               0.182   0.102   1.00E-03    80.310
    9               0.197   0.095   1.36E-03    57.432
    10              0.220   0.041   6.73E-04    84.090
    11              0.218   0.099   7.61E-04    59.938
    12              0.198   0.024   1.57E-03    88.117
    ...
    96              0.265   0.098   2.90E-04    90.694
    97              0.213   0.109   4.57E-04    84.283
    98              0.233   0.113   1.55E-03    64.637
    99              0.265   0.070   4.98E-04    93.816
    100             0.161   0.066   6.63E-04    87.898


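    Step 3: Use the generated data as probable values in the model to calculate HI.

    - The model used to calculate the Hazard Index (HI), with C_F the fish concentration (CF) and I_F the intake of fish (IF), is:

    \[ HI = \frac{C_F \times I_F}{RfD \times BM} \]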
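As a quick check of the model, the Hazard Index can be recomputed in Python for the first three rows of generated data. The formula is read here as HI = (CF x IF) / (RfD x BM), which is an interpretation of the garbled source equation that reproduces the tabulated values to within rounding; the function name `hazard_index` is hypothetical.

```python
def hazard_index(cf, intake, rfd, bm):
    """HI = (fish concentration * fish intake) / (reference dose * body mass)."""
    return (cf * intake) / (rfd * bm)


# (CF, IF, RfD, BM, tabulated HI) for samples 1 to 3 of the generated data.
rows = [
    (0.209, 0.032, 5.39e-04, 59.203, 2.12e-01),
    (0.268, 0.047, 9.61e-04, 66.066, 1.97e-01),
    (0.203, 0.052, 2.35e-03, 109.907, 4.07e-02),
]

for cf, intake, rfd, bm, tabulated in rows:
    print(round(hazard_index(cf, intake, rfd, bm), 4), "vs tabulated", tabulated)
```

Small differences from the tabulated values are expected, because the inputs shown in the table are rounded to three decimal places.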

    Table 3: MCS to calculate HI using 100 iterations

    Sample Number   CF      IF      RfD        BM        HI
    1               0.209   0.032   5.39E-04    59.203   2.12E-01
    2               0.268   0.047   9.61E-04    66.066   1.97E-01
    3               0.203   0.052   2.35E-03   109.907   4.07E-02
    4               0.199   0.026   2.51E-04    95.667   2.13E-01
    5               0.140   0.123   3.71E-04    94.443   4.92E-01
    6               0.230   0.083   6.53E-04    60.822   4.79E-01
    7               0.191   0.025   2.77E-03    78.394   2.16E-02
    8               0.182   0.102   1.00E-03    80.310   2.32E-01
    9               0.197   0.095   1.36E-03    57.432   2.40E-01
    10              0.220   0.041   6.73E-04    84.090   1.59E-01
    11              0.218   0.099   7.61E-04    59.938   4.71E-01
    12              0.198   0.024   1.57E-03    88.117   3.48E-02
    13              0.142   0.037   2.05E-03    64.939   3.92E-02
    14              0.268   0.099   1.83E-03    87.285   1.66E-01
    15              0.217   0.070   2.17E-03    76.220   9.23E-02
    ...
    94              0.216   0.119   2.01E-03    69.327   1.85E-01
    95              0.223   0.078   2.68E-03    74.652   8.68E-02
    96              0.265   0.098   2.90E-04    90.694   9.83E-01
    97              0.213   0.109   4.57E-04    84.283   6.04E-01
    98              0.233   0.113   1.55E-03    64.637   2.63E-01
    99              0.265   0.070   4.98E-04    93.816   3.95E-01
    100             0.161   0.066   6.63E-04    87.898   1.83E-01


    Output Summary:

    - The output from MCS is shown in Fig 1.
    - The frequency distribution for the output from MCS (Monte Carlo simulation) is shown in Fig 2.
    - From the Monte Carlo simulation, the 90% and 95% confidence intervals are [0.193, 0.251] and [0.188, 0.256], respectively.
    - This implies that, after taking into account the uncertainties in the parameters, one is highly confident (at a subjective level of 95%) that the true HI lies between 0.188 and 0.256.
    - Since the 95% upper confidence limit of HI is still below 1, there is high confidence that the maximally exposed individual for this scenario is not exposed to an unacceptable level of risk, and remediation is not warranted.
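The whole simulation can be sketched in Python with the distributions from Table 1. Two points are assumptions rather than facts from the source: the intervals are computed as normal-approximation confidence intervals on the mean HI (the source does not say how its intervals were obtained), and the function names are hypothetical. The run uses its own random draws, so it is not expected to reproduce the reported intervals.

```python
import math
import random
import statistics
from statistics import NormalDist


def simulate_hi(trials, seed=0):
    """Draw `trials` Hazard Index values from the Table 1 distributions."""
    rng = random.Random(seed)
    values = []
    for _ in range(trials):
        cf = rng.gauss(0.206, 0.0422)                  # Normal(mean, std), mg/kg
        intake = rng.uniform(0.02, 0.13)               # Uniform(min, max), kg/d
        rfd = rng.triangular(1.5e-4, 3.0e-3, 3.0e-4)   # (min, max, mode), mg/kg-d
        bm = rng.triangular(45.0, 120.0, 70.0)         # (min, max, mode), kg
        values.append(cf * intake / (rfd * bm))
    return values


def mean_interval(values, confidence):
    """ASSUMED method: normal-approximation interval for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    centre = statistics.mean(values)
    return centre - half_width, centre + half_width


hi_values = simulate_hi(trials=10000)
mean_hi = statistics.mean(hi_values)
interval_90 = mean_interval(hi_values, 0.90)
interval_95 = mean_interval(hi_values, 0.95)
print("mean HI     :", mean_hi)
print("90% interval:", interval_90)
print("95% interval:", interval_95)
print("share of trials with HI > 1:", sum(v > 1.0 for v in hi_values) / len(hi_values))
```

The last line reports how often an individual trial exceeds HI = 1, which is a different question from whether the mean HI is below 1.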

    Figure 1: MCS output

    [Histogram. Title: Forecast: Hazard Index (HI). x-axis: Hazard Index, ticks from 0.01 to 1.01; y-axis: Frequency, ticks from 0 to 25.]

    Figure 2: Frequency distribution of HI

