Sampling Distributions and The Central Limit Theorem

Date post: 30-Mar-2018
Upload: donhan
Transcript
Page 1

Sampling Distributions and The Central Limit Theorem

The BIG PICTURE of statistics is to make inferences about an UNKNOWN population using SAMPLE information.

For example, we use the sample mean as an estimate of the population mean in a study.

This chapter tells us how well a sample statistic, such as the sample mean, performs when it is used to estimate the unknown population mean.

Page 2

Recall the difference between 'statistic' and 'parameter'.

Population parameters do not change, since they describe the entire population.

Sample statistics vary from sample to sample. Therefore, a sample statistic such as the sample mean is a random variable.

For each sample, we can compute a sample mean, which will differ from sample to sample, and we can learn about the distribution of these sample means to see how sample means behave.

To characterize the behavior of sample means, we need to study the distribution of all possible sample means.

                      Population (parameter)    Sample (statistic)
Mean                  \(\mu\)                   \(\bar{X}\)
Variance              \(\sigma^2\)              \(s^2\)
Standard deviation    \(\sigma\)                \(s\)

Page 3

Sampling distribution of sample mean

In real-world situations, the entire population is often not available. All we can do is use sample information to estimate or predict the population characteristics.

How do we know if our estimate or prediction is a 'good' one?

Example: To estimate the average weekly grocery spending for a family in a city, a random sample of 25 families is surveyed. The sample average is $80 and the s.d. is $30.

Is $80 a 'good' estimate of average grocery spending per family in the city?

Suppose we take another random sample of 25 families and obtain an average of $90. Which one is better?

Q: How do we decide on a good way to estimate the average family grocery spending?

Page 4

How do we decide on a good way to estimate the average family grocery spending?

The Idea:

• Study the behavior of all potential sample means, each computed from the spending of 25 families.

• Then use the pattern of the general behavior of sample means to figure out how much confidence we have when we make our estimate or prediction.

The behavior of all possible sample means, in statistics, can be described by the distribution of the sample mean.

Based on the distribution of the sample mean, when we take a sample and obtain only one sample mean, we can tell how close the observed sample mean is likely to be to the unknown population mean.

So the first step is to learn the distributional behavior of all possible sample means. This is the sampling distribution of the sample mean.

Page 5

The sampling distribution of the sample mean is the probability distribution of all possible sample means, where each sample mean is obtained from a random sample of n observations drawn from a population with mean \(\mu\) and standard deviation \(\sigma\).

NOTE: The distribution of the sample mean depends on (a) the population from which we draw the sample and (b) the sample size, n.

The distributional behavior of sample means is characterized by four properties:

1. How do we determine the distribution of the sample mean?

2. What is the center of the distribution?

3. What is the variation of the distribution?

4. What is the shape of the distribution?

Page 6

How do we determine the sampling distribution of the sample mean?

[Figure: a POPULATION of individual SAT scores. Each sample is a random sample of 8 SAT scores from the entire population, and each sample yields one sample mean, \(\bar{x} = \sum x_i / 8\). Sample means shown: \(\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \bar{x}_5, \bar{x}_6\).]

In this example, you see only six samples and six sample means. That is not enough to demonstrate the distribution of sample means. If we continue the same process and obtain, say, 1000 sample means, then we can construct a histogram of these sample means. This histogram shows the distribution of the sample mean.

OUR GOAL is to describe the distribution of all possible sample means.
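The repeated-sampling process described on this page can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population below is a made-up list of 10,000 SAT-like scores (the numbers 500 and 100 are not from the slides), and the helper name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(population, sample_size, num_samples, seed=0):
    """Draw num_samples random samples and return the list of their means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population of SAT-like scores (assumed values, not slide data).
_pop_rng = random.Random(42)
population = [_pop_rng.gauss(500, 100) for _ in range(10_000)]

# 1000 samples of 8 scores each, one sample mean per sample.
sample_means = simulate_sample_means(population, sample_size=8, num_samples=1000)

# Crude text histogram of the sample means, using bins of width 20.
bins = {}
for m in sample_means:
    left = int(m // 20) * 20
    bins[left] = bins.get(left, 0) + 1
for left in sorted(bins):
    print(f"{left:4d}-{left + 20:4d} | {'#' * (bins[left] // 5)}")
```

The printed histogram is the simulated counterpart of the histogram of 1000 sample means mentioned above. It should be centered near the population mean and be noticeably narrower than the population itself.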

Page 7

A graphical illustration of the distribution of the population and the distribution of the sample mean

Figure A represents the weights for a sample of 26 pebbles, each weighed to the nearest gram. Figure B represents the mean weights of random samples of 3 pebbles each, with the mean weights rounded to the nearest gram. One value is circled in each distribution. Is there a difference between what is represented by the X circled in A and the X circled in B? Please select the best answer from the list below.

a) No, in both Figure A and Figure B, the X represents one pebble that weighs 6 grams.

b) Yes, Figure A has a larger range of values than Figure B.

c) Yes, the X in Figure A is the weight for a single pebble, while the X in Figure B represents the average weight of 3 pebbles.

Page 8

Dot plot (A): each dot represents the weight of an individual pebble. This is the distribution of the population.

Dot plot (B): each dot represents the AVERAGE weight of THREE pebbles in the sample. This is the sampling distribution of the sample mean, \(\bar{X}\).

Page 9

Must-know facts about the sampling distribution of \(\bar{X}\)

Suppose a random sample of size n is drawn from a population with mean \(\mu\) and s.d. \(\sigma\). Then we can describe the distribution of the sample mean \(\bar{X}\) based on the following two situations:

(A) If the population from which we draw our sample is normal:

\(\bar{X}\) will be normal with mean \(\mu\) and s.d. \(\sigma/\sqrt{n}\).

(B) If the population from which we draw our sample is not normal:

(B-1) When the sample size n is small (< 30):

\(\bar{X}\) has a distribution shape similar to that of the population, and the mean will be \(\mu\) and the s.d. will be \(\sigma/\sqrt{n}\).

(B-2) When n is large (>= 30), then, regardless of the distribution of the original population from which we draw our samples,

\(\bar{X}\) will be approximately normal with mean \(\mu\) and s.d. \(\sigma/\sqrt{n}\).

[Fact (B-2) is called the Central Limit Theorem.]

Page 10

The Distribution of Sample Mean When Population is NOT Normal [FACT B-2: Central Limit Theorem] [Similar exam questions]

Take a random sample of n observations from a population that is NOT normal. Then:

(1) The center (mean) of the sample means = the center (mean) of the population:

\[ \mu_{\bar{X}} = \mu \]

(2) The spread (s.d.) of the sample means = the spread (s.d.) of the population / sqrt(n):

\[ \text{s.d.}(\bar{X}) = \sigma_{\bar{X}} = \text{s.d.}(\text{Population})/\sqrt{n} = \sigma/\sqrt{n} \]

(3) If the population is not normal (it could be skewed to the right, skewed to the left, or something else), then the shape of the distribution of the sample mean depends on the sample size n.

The larger n is, the closer the distribution shape of the sample mean is to normal. This is the so-called Central Limit Theorem.

A general guideline is that when n > 30, we say the sampling distribution is approximately normal.

[Figure: the population is skewed to the right, with mean \(\mu\) and s.d. \(\sigma\). The distribution of sample means is still skewed, but not as skewed as the population, with mean \(\mu\) and s.d. \(\sigma/\sqrt{n}\).]
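The three facts on this page can be checked by simulation. The sketch below is an illustration only: it assumes an Exponential(1) population (chosen because it is skewed to the right and has mean 1 and s.d. 1), and the helper names `skewness` and `sample_means` are hypothetical.

```python
import math
import random
import statistics


def skewness(values):
    """Mean cubed z-score: about 0 for a symmetric distribution, > 0 if right-skewed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def sample_means(rng, n, num_samples):
    """Means of num_samples random samples of size n from an Exponential(1) population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


rng = random.Random(1)   # fixed seed for reproducibility
mu, sigma = 1.0, 1.0     # mean and s.d. of the assumed Exponential(1) population

results = {}
for n in (2, 10, 50):
    means = sample_means(rng, n, num_samples=5000)
    results[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    print(
        f"n={n:2d}  mean={results[n][0]:.3f}  s.d.={results[n][1]:.3f}  "
        f"sigma/sqrt(n)={sigma / math.sqrt(n):.3f}  skewness={results[n][2]:.2f}"
    )
```

For each n, the simulated mean should sit near the population mean, the simulated s.d. near sigma/sqrt(n), and the skewness should shrink toward 0 as n grows, which is the shape statement in (3).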

Page 11

FACT B-2: The Central Limit Theorem

If the population from which the samples are drawn is NOT normal, the shape of the sampling distribution of the sample mean depends on the sample size:

(a) If the sample size n is small, the distribution shape of the sample mean is similar to the population distribution shape.

(b) If the sample size n is large, the distribution shape of the sample mean is closer to normal. In general, when n is larger than 30, the distribution of the sample mean is approximately NORMAL, regardless of the distribution shape of the population.

\(\bar{X}\) is approximately normal with mean \(\mu_{\bar{X}} = \mu\) (the population mean), and the s.d. of \(\bar{X}\) is \(\sigma_{\bar{X}} = \text{Population s.d.}/\sqrt{n} = \sigma/\sqrt{n}\).

Page 12

Example: Sampling Distribution of Sample Mean [Similar exam problems]

1. Suppose we draw a random sample of size n = 10 from bank accounts in a large city. We are interested in the average amount of savings per 10 accounts.

The individual savings do not follow a normal curve. In fact, the distribution of individual savings is very skewed to the right. Suppose we know the population average savings is \(\mu\) = $3000 and the population s.d. is \(\sigma\) = $2000.

Q: What would be the distribution of sample means, where each is the average of 10 accounts drawn from this population?

Page 13

Answer

ANS: The sampling distribution of sample means, where each is the average of 10 account savings drawn from this very skewed population, would be as follows:

The shape of the distribution of sample means is still skewed, but less skewed than the individual account savings distribution. (This is FACT B-1.)

The mean of the distribution of sample means is

\[ \mu_{\bar{X}} = \mu = \$3000, \]

and the standard deviation is

\[ \sigma_{\bar{X}} = \sigma/\sqrt{n} = 2000/\sqrt{10} \approx \$632.46. \]

Page 14

Example: Sampling Distribution of Sample Mean [Similar exam problems]

2. Suppose we draw a random sample of size n = 50 from bank accounts in a large city. We are interested in the average amount of savings per 50 accounts.

The individual savings do not follow a normal curve. In fact, the distribution of individual savings is very skewed to the right. Suppose we know the population average savings is \(\mu\) = $3000 and the population s.d. is \(\sigma\) = $2000.

Question: What would be the distribution of sample means, where each is the average of 50 accounts drawn from this population?

Page 15

Answer

ANS: The sampling distribution of sample means, where each is the average of 50 account savings drawn from this very skewed population, would be as follows:

The shape of the distribution of sample means is approximately normal. (This is the Central Limit Theorem, FACT B-2.)

The mean of the distribution of sample means is

\[ \mu_{\bar{X}} = \mu = \$3000, \]

and the standard deviation is

\[ \sigma_{\bar{X}} = \sigma/\sqrt{n} = 2000/\sqrt{50} \approx \$282.84. \]
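The arithmetic in the two savings examples (n = 10 and n = 50) can be reproduced with a short Python sketch. The function name `standard_error` is a label chosen here for illustration; the slides call this quantity the s.d. of the sample mean.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 2000  # population s.d. of individual savings, from the examples
for n in (10, 50):
    print(f"n = {n}: s.d. of sample mean = ${standard_error(sigma, n):.2f}")
# prints:
# n = 10: s.d. of sample mean = $632.46
# n = 50: s.d. of sample mean = $282.84
```

Multiplying the sample size by 5 divides the s.d. of the sample mean by sqrt(5), not by 5.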

Page 16

Some Important Points Related to the Sampling Distribution of Sample Mean

• The difference between the distribution of the sample mean and the original population distribution is that the variation of the sample mean gets smaller as the sample size gets larger:

\[ \text{s.d.}(\bar{X}) = \sigma_{\bar{X}} = \text{s.d.}(\text{Population})/\sqrt{n} = \sigma/\sqrt{n} \]

• The formula \(\text{s.d.}(\bar{X}) = \sigma_{\bar{X}} = \sigma/\sqrt{n}\) tells us that sample means will be closer to the population mean when the sample size is larger.

• Applying the empirical rule to the distribution of the sample mean tells us that about 68% of sample means will be within one \(\sigma/\sqrt{n}\) of the population mean, \(\mu\). About 95% of sample means will be within two \(\sigma/\sqrt{n}\) of the population mean, \(\mu\). This works like magic, since it allows us to determine that one unit of error in using the sample mean to estimate the population mean is \(\sigma/\sqrt{n}\).

• As you can see, when the sample size is large, this error becomes smaller.
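As a hedged illustration of the empirical rule above, the sketch below reuses the numbers from the n = 50 savings example and assumes the sample mean is approximately normal (as the Central Limit Theorem suggests for that sample size). The variable names are illustrative only.

```python
import math
from statistics import NormalDist

mu, sigma, n = 3000, 2000, 50   # values from the n = 50 savings example
se = sigma / math.sqrt(n)       # one "unit of error": sigma / sqrt(n)
xbar = NormalDist(mu, se)       # assumed approximate distribution of the sample mean

coverage = {}
for k in (1, 2):
    low, high = mu - k * se, mu + k * se
    coverage[k] = xbar.cdf(high) - xbar.cdf(low)
    print(f"within {k} x sigma/sqrt(n): ${low:.2f} to ${high:.2f}, "
          f"probability {coverage[k]:.4f}")
```

The two probabilities come out close to the 68% and 95% figures quoted by the empirical rule. The rule rounds the exact normal values.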

Page 17

Examples: calculate probabilities based on the sampling distribution of the sample mean. [Similar exam questions]

A random sample of size n = 25 is chosen from a normal population with known mean \(\mu = 8\) and s.d. \(\sigma = 4\).

(a) Determine the sampling distribution of the sample mean.

(b) Determine the probability of having a sample mean less than 7.

(c) Determine the probability of having a sample mean between 7 and 9.

(d) What is the 75th percentile of the sample mean?

[Figures: panels (b) and (c)]

Page 18

Answer to Q(b)

From Q(a) we have \(\bar{X} \sim N(8, 0.8)\), where 0.8 is the s.d. of the sample mean, \(\sigma/\sqrt{n} = 4/\sqrt{25}\).

Q(b) asks for \(P(\bar{X} < 7)\). Note that the mean is 8 and the s.d. is 0.8. Now use your TI calculator or the table to find the answer.

Answer is .10565

Page 19

Answer to Q(c)

From Q(a) we have \(\bar{X} \sim N(8, 0.8)\).

Q(c) asks for \(P(7 < \bar{X} < 9)\). Note that the mean is 8 and the s.d. is 0.8. Then use your TI calculator or the table to get the answer.

Answer is .7887
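For readers without a TI calculator or a normal table, the answers to Q(b) and Q(c) can be checked with Python's standard library. This is a sketch of one possible way to do it, not the method used in the slides; `NormalDist` takes the mean and the s.d. (not the variance).

```python
import math
from statistics import NormalDist

mu, sigma, n = 8, 4, 25
xbar = NormalDist(mu, sigma / math.sqrt(n))   # N(8, 0.8): mean 8, s.d. 0.8

p_b = xbar.cdf(7)                  # (b) P(sample mean < 7)
p_c = xbar.cdf(9) - xbar.cdf(7)    # (c) P(7 < sample mean < 9)

print(f"(b) {p_b:.5f}")   # prints (b) 0.10565
print(f"(c) {p_c:.4f}")   # prints (c) 0.7887
```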

Page 20

Answer to Q(d)

From Q(a) we have \(\bar{X} \sim N(8, 0.8)\).

Q(d) asks for a value \(\bar{x}\) of the sample mean such that \(P(\bar{X} < \bar{x}) = .75\). Use mean 8 and s.d. 0.8 in your TI calculator or the table to get the answer.

Answer is: the 75th percentile = 8.5396
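A percentile is the inverse of the probability lookup: given the area to the left, find the value. As a sketch, again using Python's standard library rather than the TI calculator the slides assume:

```python
from statistics import NormalDist

xbar = NormalDist(mu=8, sigma=0.8)   # distribution of the sample mean from Q(a)
p75 = xbar.inv_cdf(0.75)             # value with 75% of sample means below it

print(f"75th percentile = {p75:.4f}")   # prints 75th percentile = 8.5396
```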

Page 21

Exercises for Sampling Distribution [Similar exam problems]

1. In a marketing study of gas prices for a state, a random sample of 16 prices will be observed. Suppose the individual prices follow a normal distribution with a mean price of $1.45 and a standard deviation of $0.20.

(a) What will be the distribution of the sample mean for samples of size n = 16?

(b) Suppose you observe 16 prices from a mid-sized city and compute their average, which turns out to be $1.38. What is the chance that the average price from a sample of 16 is lower than $1.38?

(c) The city manager claims that the average price of 16 stations, $1.38, is extremely low compared with all other averages, each from 16 prices. Is this claim correct?

(d) Can you find the 40th percentile of the average price of 16 prices?

Page 22

2. In a household income survey for a state, a random sample of 64 households will be observed. We do not know the distribution of individual household incomes, but we do have information about the overall average household income, \(\mu\) = $45,000, and s.d. = $16,000.

(a) Based on this information, what can you say about the distribution of the sample means, each from 64 household incomes?

(b) Is an average household income of $52,000 from 64 households an indication of an unusually high average?

(c) Find the 95th percentile of average household incomes from 64 households.

Page 23

3. Suppose that the mean time for an oil change at a "10-minute oil change joint" is 11.4 minutes with a standard deviation of 3.2 minutes.

(a) If a random sample of n = 35 oil changes is selected, describe the sampling distribution of the sample mean.

(b) If a random sample of n = 35 oil changes is selected, what is the probability the mean oil change time is less than 11 minutes?

(c) If a random sample of n = 50 oil changes is selected, what is the probability the mean oil change time is less than 11 minutes?

(d) What effect did increasing the sample size have on the probability?

Page 24

4. In a marketing study of gas prices for a state, a random sample of 16 prices will be observed. Suppose the individual prices follow a normal distribution with a mean price of $1.45 and a standard deviation of $0.20.

(a) What will be the distribution of the sample mean, where each sample is a random sample of n = 16 prices?

(b) Suppose you observe 16 prices from a mid-sized city and compute their average, which turns out to be $1.38. What is the chance that the average price from a sample of 16 is lower than $1.38?

(c) The city manager claims that the average price of 16 stations, $1.38, is extremely low compared with all other averages, each from 16 prices. Is this claim correct?

(d) Can you find the 40th percentile of the average price of 16 prices?

Page 25

5. A random sample of n = 64 observations is to be selected. Determine whether each of the following statements is correct or not:

• The sampling distribution of the sample mean in this case is the histogram of the 64 observations that are to be collected.

• The average of all possible sample means must be equal to the true population mean, that is, \(E(\bar{X}) = \mu\). (The center of the distribution of \(\bar{X}\) is the population mean, \(\mu\). This is the property called UNBIASED.)

• Since each sample mean is an average of 64 observations, different samples will result in different sample averages. Therefore, there will be variation among sample means.

• The standard deviation of the sample mean, \(\sigma_{\bar{X}}\), is less than \(\sigma\), the population standard deviation.

• The shape of the sampling distribution of the sample mean cannot be close to normal because the original population distribution shape is not known.

• The shape of the sampling distribution of the sample mean will be close to normal because the sample size is large.

• The Central Limit Theorem says: when the population is normal, the shape of the sampling distribution of the sample mean is close to normal, regardless of the size of the sample.

