Blackboard CLT


The Central Limit Theorem

Mitja Stadje

September, 2011

Mitja Stadje, QM Additional Material

Central Limit Theorem, CLT

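Let $X_1, X_2, \ldots$ be i.i.d. with $E X_i = \mu$ and $0 < \mathrm{Var}\, X_i = \sigma^2 < \infty$. Then, with $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$,

$$\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty.$$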
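The statement can be checked empirically. The following is a minimal sketch, not part of the original slides: it assumes Uniform(0, 1) observations (mean 1/2, variance 1/12), and the function names, sample size, and trial count are illustrative choices. It estimates how often the standardized sample mean lands in $[-1.96, 1.96]$, which should be close to 95% if the normal approximation is good.

```python
import math
import random


def standardized_mean(n, rng):
    """Draw n Uniform(0, 1) values and return sqrt(n) * (mean - mu) / sigma."""
    mu = 0.5                        # mean of Uniform(0, 1)
    sigma = math.sqrt(1.0 / 12.0)   # standard deviation of Uniform(0, 1)
    xbar = sum(rng.random() for _ in range(n)) / n
    return math.sqrt(n) * (xbar - mu) / sigma


def fraction_within(z, n=50, trials=4000, seed=0):
    """Fraction of simulated standardized means that land in [-z, z]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(trials) if abs(standardized_mean(n, rng)) <= z)
    return hits / trials


if __name__ == "__main__":
    # For a standard normal, about 95% of the mass lies in [-1.96, 1.96].
    print(fraction_within(1.96))
```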



CLT, Application I: Calculating probabilities of sums

In case the distribution is known (in particular, the true mean $\mu$ and the true variance $\sigma^2$ are known), the CLT can be used to compute probabilities involving the sum of i.i.d. observations. For instance, suppose that the $X_i$ are i.i.d. with an exponential, t-, or F-distribution, or any other standard distribution. Then to compute $P[X_1 + X_2 + \ldots + X_{100} \le 5]$ we need to know the cdf of $X_1 + X_2 + \ldots + X_{100}$, i.e.,

$$P[X_1 + X_2 + \ldots + X_{100} \le 5] = F_{X_1 + X_2 + \ldots + X_{100}}(5).$$

The problem is that $F_{X_1 + X_2 + \ldots + X_{100}}$ often is very complicated to compute. However, using the CLT we know that approximately

$$\sum_{i=1}^{100} X_i \approx N(100\mu,\ 100\sigma^2).$$

Therefore

$$P\Big[\sum_{i=1}^{100} X_i \le 5\Big] = P\Big[\frac{\sum_{i=1}^{100} X_i - 100\mu}{10\sigma} \le \frac{5 - 100\mu}{10\sigma}\Big] \approx \Phi\Big(\frac{5 - 100\mu}{10\sigma}\Big),$$

where $\Phi$ denotes the cdf of the standard normal distribution.
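A minimal sketch of this calculation follows, under stated assumptions: the summands are taken to be Exp(1) (so $\mu = 1$ and $\sigma = 1$), and the threshold 110 replaces the slide's 5 so that the probability is not negligibly small. The function names and the Monte Carlo settings are illustrative and not from the slides.

```python
import math
import random


def normal_cdf(x):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def clt_sum_probability(c, n, mu, sigma):
    """CLT approximation of P[X_1 + ... + X_n <= c]."""
    return normal_cdf((c - n * mu) / (math.sqrt(n) * sigma))


def simulated_sum_probability(c, n, trials=5000, seed=1):
    """Monte Carlo estimate of the same probability for Exp(1) summands."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        total = sum(rng.expovariate(1.0) for _ in range(n))
        if total <= c:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Assumed example: n = 100 Exp(1) summands, threshold c = 110.
    print(clt_sum_probability(110, 100, mu=1.0, sigma=1.0))
    print(simulated_sum_probability(110, 100))
```

The two printed values should be close to each other; the gap reflects both the CLT approximation error and Monte Carlo noise.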


CLT, Application II: Confidence intervals for $\mu$

In case that $\mu$ is not known and we have i.i.d. observations $X_1, \ldots, X_n$, we can always estimate $\mu$ with $\bar{X}_n$. However, the question is: how good is this estimate? To answer this question one typically gives a whole interval around $\bar{X}_n$ which contains $\mu$ with a given probability, say 95%. This is called a confidence interval.

Confidence intervals:

- Give a better idea about the possible values of $\mu$ compared to just a point estimator $\bar{X}_n$.
- Can be used to evaluate the quality of the estimate. Larger intervals correspond to bad estimates and smaller intervals correspond to good estimates.




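CLT, Application II: Confidence intervals for $\mu$ if $\mu$ is unknown

Since $1 - \alpha = P[-z_{\alpha/2} \le Z \le z_{\alpha/2}]$ for $Z \sim N(0, 1)$, by the CLT also

$$1 - \alpha \approx P\Big[-z_{\alpha/2} \le \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \le z_{\alpha/2}\Big]. \quad (1)$$

Solving $\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \ge -z_{\alpha/2}$ for $\mu$ we get $\mu \le \bar{X}_n + z_{\alpha/2}\,\sigma/\sqrt{n}$.

Solving $\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \le z_{\alpha/2}$ we get $\mu \ge \bar{X}_n - z_{\alpha/2}\,\sigma/\sqrt{n}$. Therefore,

$$1 - \alpha \approx P\Big[\bar{X}_n - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}_n + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\Big].$$

So $\mu$ is with probability approximately $1 - \alpha$ in the interval

$$\Big[\bar{X}_n - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X}_n + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\Big].$$

In case $\sigma$ is not known, $\sigma^2$ is typically replaced by the sample variance $S^2$.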
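The interval can be computed directly. Below is a minimal sketch using only the Python standard library; the function name `clt_confidence_interval` and its parameters are hypothetical, chosen for illustration. When `sigma` is not supplied, the sample standard deviation $S$ is plugged in, as in the remark on replacing $\sigma^2$ by $S^2$.

```python
import math
from statistics import NormalDist, mean, stdev


def clt_confidence_interval(sample, alpha=0.05, sigma=None):
    """Approximate (1 - alpha) confidence interval for the mean via the CLT.

    If sigma is None, the sample standard deviation S is used in its place.
    """
    n = len(sample)
    xbar = mean(sample)
    if sigma is None:
        sigma = stdev(sample)  # S, the square root of the sample variance S^2
    # z_{alpha/2}: the (1 - alpha/2) quantile of the standard normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    lo, hi = clt_confidence_interval([1, 2, 3, 4, 5])
    print(lo, hi)
```

The half-width shrinks like $1/\sqrt{n}$, so quadrupling the sample size roughly halves the interval. For a sample as small as the one in the usage line the normal approximation is rough; it is shown only to make the arithmetic easy to follow.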



CLT, Application III: Hypothesis testing

Suppose that $\mu$ is unknown but $\sigma$ is known. We are given a constant $\mu_0$ (for instance $\mu_0 = 0$) and we want to test if the true expectation of our observations is equal to $\mu_0$. That is, we want to test the hypothesis

$$H_0: \mu = \mu_0 \quad \text{against} \quad H_1: \mu \ne \mu_0.$$

Strategy: Assume for a moment that $H_0$ were true. The CLT yields

$$T = \frac{\sqrt{n}\,(\bar{X}_n - \mu_0)}{\sigma} \approx Z \sim N(0, 1).$$

So if $H_0$ is true, then

$$1 - \alpha \approx P[-z_{\alpha/2} \le T \le z_{\alpha/2}].$$

Typically $\alpha$ is chosen such that $1 - \alpha = 99\%$ or $95\%$. Therefore, if we observe in our outcomes that $T \notin [-z_{\alpha/2}, z_{\alpha/2}]$, it seems unlikely that the hypothesis $H_0$ is true. (If it were true, then with 95% probability $T \in [-z_{\alpha/2}, z_{\alpha/2}]$, which was not the case.) So in this case we reject $H_0$.
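The decision rule can be sketched in a few lines. This is an illustrative sketch rather than the lecturer's code: the function name `z_test` and the toy data are assumptions, and $\sigma$ is treated as known, as on the slide.

```python
import math
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided test of H0: mu = mu0 with known sigma, based on the CLT.

    Returns the statistic T and True if H0 is rejected at level alpha.
    """
    n = len(sample)
    t_stat = math.sqrt(n) * (mean(sample) - mu0) / sigma
    # z_{alpha/2}: the (1 - alpha/2) quantile of the standard normal.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    reject = abs(t_stat) > z  # T outside [-z, z]
    return t_stat, reject


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]  # toy data, for illustration only
    print(z_test(data, mu0=3, sigma=1.0))
    print(z_test(data, mu0=0, sigma=1.0))
```

With the toy data, the first call has $T = 0$ and does not reject, while the second has $T = 3\sqrt{5}$ and rejects.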





So the CLT can be used to test the hypothesis $H_0$. If $H_0$ is rejected we can conclude that there is evidence supporting $H_1$. (The other direction does not hold.)

The region $[-z_{\alpha/2}, z_{\alpha/2}]$ (which under $H_0$ has approximately 95% probability when $\alpha = 5\%$) is also called the acceptance region. The region $\mathbb{R} \setminus [-z_{\alpha/2}, z_{\alpha/2}]$ is called the rejection region of the test. $\alpha$ is called the significance level of the test.

(a) In the setting above, what do you do if you do not know the mean and do not know the variance?

(b) How to test a one-sided hypothesis, for example $H_0: \mu \le \mu_0$?

(c) Above we only assumed that the observations $X_1, \ldots, X_n$ are i.i.d. What happens if we do not know the variance but we additionally know that our observations are normal? Could we then do even better?




Excursion

To prove the CLT we will need moment generating functions:

Definition

Suppose that $X$ is a random variable. The moment generating function of $X$ is defined as

$$M_X(t) = E[\exp(tX)].$$

Why the name? $M'(t) = E[X \exp(tX)]$, so $M'(0) = E[X]$. $M''(t) = E[X^2 \exp(tX)]$, so $M''(0) = E[X^2]$; and so on.

Example: If $X \sim N(0, 1)$ then

$$M_X(t) = E[\exp(tX)] = \int_{-\infty}^{\infty} \exp(tx)\, f_X(x)\, dx = \int_{-\infty}^{\infty} \exp(tx)\, \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx$$



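$$M_X(t) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\big(-(x^2 - 2tx)/2\big)\, dx$$

$$= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\big(-(x^2 - 2tx + t^2 - t^2)/2\big)\, dx$$

$$= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\big(-((x - t)^2 - t^2)/2\big)\, dx$$

$$= e^{t^2/2} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\big(-(x - t)^2/2\big)\, dx = e^{t^2/2}.$$

The last integral equals 1 because the integrand is the density of $N(t, 1)$.

Other properties: If $X$ and $Y$ are independent then

$$M_{X+Y}(t) = M_X(t)\, M_Y(t).$$

Why? Therefore, if $X_1, \ldots, X_n$ is a random sample,

$$M_{X_1 + X_2 + \ldots + X_n}(t) = M_{X_1}(t) \cdots M_{X_n}(t) = \big(M_{X_i}(t)\big)^n.$$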
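The closed form $e^{t^2/2}$ and the link between derivatives at zero and moments can be checked numerically. The sketch below is an illustration under stated assumptions: it approximates the defining integral with the trapezoidal rule on the truncated range $[-15, 15]$, and all function names and step sizes are hypothetical choices, not part of the slides.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def mgf_standard_normal(t, lo=-15.0, hi=15.0, steps=30000):
    """Trapezoidal approximation of the integral of exp(t x) f_X(x) dx."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * math.exp(t * x) * normal_pdf(x)
    return total * h


def second_derivative_at_zero(mgf, eps=1e-3):
    """Central finite difference for M''(0), which should be close to E[X^2]."""
    return (mgf(eps) - 2.0 * mgf(0.0) + mgf(-eps)) / (eps * eps)


if __name__ == "__main__":
    print(mgf_standard_normal(1.0), math.exp(0.5))          # should nearly agree
    print(second_derivative_at_zero(mgf_standard_normal))   # close to E[X^2] = 1
```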




We will need the following result:

Theorem

Let $Y_1, Y_2, \ldots$ be a sequence of random variables with distribution functions $F_1, F_2, \ldots$ and mgfs $M_1, M_2, \ldots$. If the random variable $Y$ has distribution function $F$ and mgf $M$, then $\lim_{n \to \infty} M_n(t) = M(t)$ for $t \in (-h, h)$, for some $h > 0$, implies $Y_n \xrightarrow{d} Y$.
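To see how this theorem connects to the CLT, one can watch the mgf of a standardized sum approach $e^{t^2/2}$. The sketch below assumes Exp(1) summands, an example distribution not taken from the slides. Their mgf is $1/(1 - s)$ for $s < 1$, so by the product rule for independent summands the standardized sum $(X_1 + \ldots + X_n - n)/\sqrt{n}$ has mgf $e^{-t\sqrt{n}}\,(1 - t/\sqrt{n})^{-n}$ for $t < \sqrt{n}$. The function names are hypothetical.

```python
import math


def mgf_standardized_exp_sum(t, n):
    """mgf at t of (X_1 + ... + X_n - n) / sqrt(n) for i.i.d. Exp(1) summands.

    Uses M_{X_i}(s) = 1 / (1 - s) for s < 1 and the product rule for
    independent summands; the result is only finite for t < sqrt(n).
    """
    root_n = math.sqrt(n)
    if t >= root_n:
        raise ValueError("the mgf is only finite for t < sqrt(n)")
    # Work on the log scale to avoid overflow for large n.
    log_mgf = -t * root_n - n * math.log1p(-t / root_n)
    return math.exp(log_mgf)


def gap_to_normal_mgf(t, n):
    """Absolute distance to exp(t^2 / 2), the mgf of N(0, 1)."""
    return abs(mgf_standardized_exp_sum(t, n) - math.exp(t * t / 2.0))


if __name__ == "__main__":
    for n in (10, 100, 10000):
        print(n, gap_to_normal_mgf(1.0, n))  # the gap shrinks as n grows
```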

