How do we generate the statistics of a function of a random variable?


Slide 1

How do we generate the statistics of a function of a random variable?
Why is the method called “Monte Carlo”?

How do we use the uniform random number generator to generate other distributions?
Are other distributions directly available in Matlab?

How do we accelerate the brute force approach?
Probability distributions and moments

Web links: http://www.riskglossary.com/link/monte_carlo_method.htm http://physics.gac.edu/~huber/envision/instruct/montecar.htm

Monte Carlo Simulation
SOURCE: http://pics.hoobly.com/full/AA7G6VQPPN2A.jpg

In this lecture we will study how to generate statistical information on functions of random variables. The basic method is called Monte Carlo simulation, after the famous casino in Monte Carlo. We will cover some of the functions that are available in Matlab, and we will also briefly discuss some alternatives to Monte Carlo simulation, which is a brute-force approach.

Slide 2

Basic Monte Carlo
Given a random variable X and a function h(X): sample X to obtain [x1, x2, …, xn]; calculate [h(x1), h(x2), …, h(xn)]; use this sample to approximate the statistics of h.
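A minimal Matlab sketch of this recipe (the function handle h and the sample size n below are illustrative choices, not taken from the slides):

h = @(x) x.^2;         % example function of the random variable (illustrative)
n = 1e5;               % Monte Carlo sample size (illustrative)
x = rand(n,1);         % sample X ~ U[0,1]
hx = h(x);             % evaluate the function at each sample
meanEst = mean(hx)     % sample mean approximates the mean of h(X)
stdEst = std(hx)       % sample standard deviation approximates the std of h(X)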

Example: X is U[0,1]. Use MCS to find the mean of X^2.
x=rand(10); y=x.^2; % generates a 10x10 random matrix
mean=sum(y)/10
x = 0.4017 0.5279 0.1367 0.3501 0.3072 0.3362 0.3855 0.3646 0.5033 0.2666
mean = 0.3580

What is the true mean?

SOURCE: http://schools.sd68.bc.ca/ed611/akerley/question.jpg

SOURCE: http://www.sz-wholesale.com/uploadFiles/041022104413s.jpg

The basic Monte Carlo simulation method consists of generating samples of the random variables and calculating the function at each sample, which produces a sample of the function values. From that sample we can then estimate statistics of the function.

As an illustration consider the function x^2, where x is uniformly distributed in [0,1]. Ten samples are generated, and they can be used, for example, to estimate the mean of the function. In the example on the slide the mean comes out to be 0.3580.

The true mean is the integral of the function times its probability density function (which is equal to 1 in [0,1]).
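Written out (a standard integral, not shown in the transcript): true mean = integral from 0 to 1 of x^2 dx = 1/3 ≈ 0.333, reasonably close to the 10-sample estimate of 0.3580 above.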

Slide 3

Obtaining distributions
Histogram: y=randn(100,1); hist(y)

One way of getting an idea of the shape of the distribution is to generate a histogram from the sample, as shown in the figure.

Slide 4

Cumulative distribution function
cdfplot(y)

[f,x]=ecdf(y);

It turns out that plotting the CDF of the sample is less tricky than using a histogram to get an idea of the pdf. In fact, smoothing the empirical CDF and differentiating it to get the pdf is usually a better idea.

From the empirical CDF:
[f,x]=ecdf(y);
f runs from 0 to 1 in steps of 0.01 (0, 0.0100, 0.0200, ...); x contains the sorted sample values, here ranging from -2.2529 up to 2.7335.
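A short Matlab sketch along these lines; ksdensity (kernel density estimation) is used here as one common smoothing choice and is not prescribed on the slide:

y = randn(100,1);       % 100 standard normal samples, as on the slide
[f,x] = ecdf(y);        % empirical CDF
stairs(x,f)             % plot the empirical CDF (similar to cdfplot(y))
[pd,xi] = ksdensity(y); % smoothed (kernel) estimate of the pdf
figure; plot(xi,pd)     % compare with the histogram from hist(y)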

Slide 5

Histogram of average
x=rand(100); y=sum(x)/100; hist(y)

An important property that is useful to keep in mind is what happens to the shape of the distribution when you start averaging. So here we look at the distribution of averages of 100 samples from the uniform distribution.

First we note that the distribution begins to resemble a normal one. It is also much narrower. Analytically it is easy to show that the standard deviation of the average is the original standard deviation divided by the square root of the sample size.

Slide 6

Histogram of average
x=rand(1000); y=sum(x)/1000; hist(y)
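In symbols (a standard result, not written out on the slide): for n independent samples with standard deviation sigma, the average has standard deviation sigma/sqrt(n), so averaging 100 uniform samples shrinks the spread by a factor of 10.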

What is the law of large numbers?

With averages of samples of 1,000 the tendency to look like a normal distribution is even more pronounced.

Slide 7

Distribution of x^2
x=rand(10000,1); x2=x.^2; hist(x2,20)
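For comparison, the analytic density behind this histogram (a standard change-of-variables result, not derived on the slide): with X uniform on [0,1] and Y = X^2, F_Y(y) = P(X <= sqrt(y)) = sqrt(y), so the pdf is f_Y(y) = 1/(2*sqrt(y)) for 0 < y <= 1, which explains the pile-up of samples near zero.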

And here is how the histogram from the Monte Carlo simulation of x^2 looks.

Slide 8

Other distributions
Other distributions available in Matlab
For example, the Weibull distribution

r=wblrnd(1,1,1000,1); hist(r,20) % 1000 samples, 20 bins

The Weibull distribution is often used as a strength distribution. Matlab has a random number generator for it; the first two arguments are the parameters a and b.

Slide 9

Correlated Variables
For the normal distribution we can use Matlab's mvnrnd

R = MVNRND(MU,SIGMA,N) returns a N-by-D matrix R of random vectors chosen from the multivariate normal distribution with 1-by-D mean vector MU, and D-by-D covariance matrix SIGMA.

We often assume that random variables are independent or at least uncorrelated, but we sometimes have data on correlation. For example, if we draw information from tests that measure more than one property, we may be able to estimate the correlation.

The correlation is defined as the covariance of the two random variables divided by the product of the two standard deviations. It can vary between -1 and 1. We get the extreme values when the variables are related linearly, e.g.

Y=aX+b

With -1 when a is negative and 1 when a is positive.
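In symbols (the standard definition, not written out on the slide): rho_XY = Cov(X,Y)/(sigma_X*sigma_Y), with -1 <= rho_XY <= 1.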

In terms of generating samples of correlated random variables, Matlab can do that for normal random variables using mvnrnd. For other distributions one has to use Markov Chain Monte Carlo, which is beyond the scope of this lecture.

Slide 10

Example
mu = [2 3]; sigma = [1 1.5; 1.5 3];
r = mvnrnd(mu,sigma,20);
plot(r(:,1),r(:,2),'+')

What is the correlation coefficient?
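A quick way to check the sample correlation of the points generated above (corrcoef is a standard Matlab function; the value varies from run to run):

rho = corrcoef(r);   % 2-by-2 matrix of sample correlation coefficients
rho(1,2)             % should be close to 1.5/sqrt(3) = 0.866 for large samples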

As an example consider the vector mu and the matrix sigma given as input to mvnrnd. The diagonal entries of sigma are the squares of the standard deviations, so the two standard deviations are 1 and sqrt(3). The correlation coefficient is 1.5/sqrt(3)=0.866, which is indeed reflected in the figure.

Slide 11

Problems Monte Carlo
1. Use Monte Carlo simulation to estimate the mean and standard deviation of x^2, when x follows a Weibull distribution with a=b=1.
2. Calculate by Monte Carlo simulation, and check by integration, the correlation coefficient between x and x^2, when x is uniformly distributed in [0,1].

Slide 12

Latin hypercube sampling
X = lhsnorm(mu,SIGMA,n) generates a latin hypercube sample X of size n from the multivariate normal distribution with mean vector mu and covariance matrix SIGMA. X is similar to a random sample from the multivariate normal distribution, but the marginal distribution of each column is adjusted so that its sample marginal distribution is close to its theoretical normal distribution.

Instead of the full randomness of mvnrnd, lhsnorm makes sure that the distribution of each variable separately (the marginal distribution) follows the target normal distribution closely.

Slide 13

Comparing MCS to LHS
mu = [2 2]; sigma = [1 0; 0 3];
r = lhsnorm(mu,sigma,20); sum(r)/20
ans = 1.9732 2.0259

r = mvnrnd(mu,sigma,20); sum(r)/20
ans = 2.3327 2.2184

The figures compare the distribution of points from mvnrnd and lhsnorm with 20 points, when the two variables have means of 2, standard deviations of 1 and sqrt(3), and no correlation. We see that mvnrnd has errors of more than 10% in the means, compared to only about 1% for lhsnorm. The lhsnorm points also appear to be distributed over a larger portion of the space.

Slide 14

Evaluating probabilities of failure
Failure is defined in terms of a limit state function that must satisfy g(r)>0, where r is a vector of random variables.
The probability of failure is estimated as the ratio of the number of negative g's, m, to the total MC sample size, N.
The accuracy of the estimate is poor unless N is much larger than 1/Pf.
For small Pf, the relative error (coefficient of variation) of the estimate is approximately 1/sqrt(N*Pf).

We often use Monte Carlo simulation to evaluate the probability of failure by counting how many points did not satisfy the success criterion. The estimate of the probability of failure is then the number of failures m divided by the total sample size N.

That estimate is not very accurate when m is small. Indeed, the standard deviation of the estimated probability of failure based on a sample of size N is given by the standard binomial result std(Pf) = sqrt(Pf(1-Pf)/N).

It is easy to check that when the probability of failure is very small this leads to std(Pf) ≈ sqrt(Pf/N), i.e., a coefficient of variation of 1/sqrt(N*Pf), which is why N must be much larger than 1/Pf for an accurate estimate.
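A minimal Matlab sketch of this counting procedure (the limit state g used here is only an illustrative placeholder, not one from the slides):

N = 1e5;                   % Monte Carlo sample size
r = randn(N,1);            % sample the random variable(s)
g = 3 - r;                 % illustrative limit state: safe when g > 0
m = sum(g <= 0);           % number of failed samples
Pf = m/N                   % estimated probability of failure
stdPf = sqrt(Pf*(1-Pf)/N)  % standard deviation of the estimate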

Slide 15

Problems probability of failure
1. Derive the formula for the standard deviation of the estimate of Pf.

2. If x is uniformly distributed in [0,1], use MCS to estimate the probability that x^2>0.95, and estimate the accuracy of your estimate from the formula.

3. Calculate the exact value of the answer to Problem 2 (that is, without MCS).

Source: Smithsonian Institution, Number: 2004-57325

Slide 16

Separable Monte Carlo
Usually the limit state function is written in terms of response vs. capacity: g = C(r) - R(r) > 0.
Failure typically corresponds to structures with extremely low capacity or extremely high response, but not both.
We can take advantage of that in separable MC.
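A rough Matlab sketch of the separable idea: capacity and response are sampled independently, and every capacity sample is compared with every response sample, so each sample is reused many times. The distributions below are only illustrative assumptions; see the reading assignment for the actual method and its error analysis.

nC = 1000; nR = 1000;                  % independent sample sizes for capacity and response
C = wblrnd(2,10,nC,1);                 % capacity samples (assumed Weibull distribution)
R = lognrnd(0,0.25,nR,1);              % response samples (assumed lognormal distribution)
fails = sum(sum(bsxfun(@lt,C,R.')));   % count all pairs with g = C - R < 0
Pf = fails/(nC*nR)                     % separable estimate of the probability of failure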

Slide 17

Reading assignment
Ravishankar, B., Smarslok, B.P., Haftka, R.T., and Sankar, B.V. (2010), "Error Estimation and Error Reduction in Separable Monte Carlo Method," AIAA Journal, Vol. 48(11), pp. 2225-2230.

Source: www.library.veryhelpful.co.uk/Page11.htm

