SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Date post: 12-Jan-2016
Category:
Upload: elpida
Description:
SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS. PPSS. The situation in a statistical problem is that there is a population of interest, and a quantity or aspect of that population that is of interest. This quantity is called a parameter. The value of this parameter is unknown. - PowerPoint PPT Presentation
Transcript
Page 1: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Page 2: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

PPSS

The situation in a statistical problem is that there is a population of interest and a quantity or aspect of that population that is of interest. This quantity is called a parameter. The value of this parameter is unknown.

To learn about this parameter, we take a sample from the population and compute an estimate of the parameter, called a statistic.

Page 3: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Populations & Samples

Population
• All Saudis
• All inpatients in KKUH
• All depressed people

Sample
• A subset of Saudis
• A subset of inpatients
• The depressed people in Riyadh

Page 4: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Samples and Populations

Sample
• Relatively small number of instances that are studied in order to make inferences about a larger group from which they were drawn

Population
• The larger group from which a sample is drawn

Page 5: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Samples and Populations

• It is usually not practical to study an entire population.
• In a random sample, each member of the population has an equal chance of being chosen.
• A representative sample might have the same proportion of men and women as does the population.

Page 6: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Statistics and Parameters

a parameter is a characteristic of a population

e.g., the average heart rate of all Saudis.

a statistic is a characteristic of a sample

e.g., the average heart rate of a sample of Saudis.

We use statistics of samples to estimate parameters of populations.

Statistic | estimates | Parameter
$\bar{X}$ | estimates | $\mu$ ("mu")
$s$ | estimates | $\sigma$ ("sigma")
$s^2$ | estimates | $\sigma^2$
$r$ | estimates | $\rho$ ("rho")

Page 7: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Inference: extension of results obtained from an experiment (sample) to the general population
• Use of sample data to draw conclusions about the entire population

Parameter: number that describes a population
• Value is not usually known
• We are unable to examine the whole population

Statistic: number computed from sample data
• Computed to estimate unknown parameters
• Mean, standard deviation, variability, etc.

Page 8: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

SAMPLING DISTRIBUTION

The sampling distribution of the mean is the distribution of all possible sample means, for samples of a given size, that could be drawn from the population.

Page 9: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

SAMPLING DISTRIBUTIONS

What would happen if we took many samples of 10 subjects from the population?

Steps:

1. Take a large number of samples of size 10 from the population
2. Calculate the sample mean for each sample
3. Make a histogram of the mean values
4. Examine the distribution displayed in the histogram for shape, center, and spread, as well as outliers and other deviations
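The four steps can be tried out with a short simulation. This is only a sketch under stated assumptions: the population of 10,000 heart rates is invented for illustration (it is not data from the slides), and the helper names `sample_means` and `text_histogram` are hypothetical.

```python
import random
import statistics

def sample_means(population, n, num_samples, seed=0):
    """Steps 1-2: draw many simple random samples of size n; return each sample's mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]

def text_histogram(values, bins=10, width=40):
    """Step 3: bin the values and return one text bar per bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins or 1.0  # avoid a zero bin width if all values are equal
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / step), bins - 1)] += 1
    peak = max(counts)
    return [f"{lo + i * step:7.2f} | {'#' * round(width * c / peak)}"
            for i, c in enumerate(counts)]

# Hypothetical population (an assumption): 10,000 resting heart rates in beats per minute.
_pop_rng = random.Random(1)
population = [_pop_rng.gauss(72, 10) for _ in range(10_000)]

means = sample_means(population, n=10, num_samples=1_000)
print("\n".join(text_histogram(means)))

# Step 4: compare the centre and spread of the sample means with the population.
print("population mean:     ", round(statistics.fmean(population), 2))
print("mean of sample means:", round(statistics.fmean(means), 2))
print("population SD:       ", round(statistics.pstdev(population), 2))
print("SD of sample means:  ", round(statistics.pstdev(means), 2))
```

The sample means should centre near the population mean while spreading out much less than the individual values do.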

Page 10: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS
Page 11: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

How can experimental results be trusted? If $\bar{x}$ is rarely exactly right and varies from sample to sample, how can it be a reasonable estimate of the population mean $\mu$?

How can we describe the behavior of the statistics from different samples, e.g. the mean value $\bar{x}$?

Page 12: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

• Very rarely do sample values coincide with the population value (parameter).
• The discrepancy between the sample value and the parameter is known as sampling error, when this discrepancy is the result of random sampling.
• Fortunately, these errors behave systematically and have a characteristic distribution.

Page 13: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS
Page 14: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Take a sample of 3 students from a class (a population of 6 students) and measure each student's GPA.

Student | GPA
Susan | 2.1
Karen | 2.6
Bill | 2.3
Calvin | 1.2
Rose | 3.0
David | 2.4

Page 15: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Draw each possible sample from this ‘population’:

[Figure: the six students with their GPAs: Susan 2.1, Karen 2.6, Bill 2.3, Rose 3.0, David 2.4, Calvin 1.2]

Page 16: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

With samples of n = 3 from this population of N = 6, there are 20 different sample possibilities:

$$\binom{N}{n} = \frac{N!}{n!\,(N-n)!} = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(3 \cdot 2 \cdot 1)} = \frac{720}{36} = 20$$
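The count of 20 can be checked by brute force. The sketch below uses the six student names from the GPA table; the variable names are only illustrative, and the order in which `itertools.combinations` lists the samples is not necessarily the order used on the slides.

```python
import math
from itertools import combinations

students = ["Susan", "Karen", "Bill", "Calvin", "Rose", "David"]
n = 3

# N! / (n! (N - n)!) written out with factorials, and the same count via math.comb.
by_formula = math.factorial(len(students)) // (
    math.factorial(n) * math.factorial(len(students) - n)
)
num_samples = math.comb(len(students), n)

# Enumerate every distinct sample of 3 students (order within a sample is ignored).
all_samples = list(combinations(students, n))

print(by_formula, num_samples, len(all_samples))
for trio in all_samples:
    print(", ".join(trio))
```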

Page 17: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Note that every different sample would produce a different mean and s.d.

ONE SAMPLE: Susan, Karen, Bill

$\bar{X}$ = (2.1 + 2.6 + 2.3) / 3 = 7.0 / 3 = 2.33 ≈ 2.3

Standard Deviation:

(2.1 − 2.3)² = (−.2)² = .04

(2.6 − 2.3)² = .3² = .09

(2.3 − 2.3)² = 0² = 0

s² = .13/3 = .043 and s = √.043 = .21

So this one sample of 3 has a mean of 2.3 and a sd of .21. (Here the sum of squared deviations is divided by n = 3; dividing by n − 1 = 2 instead gives s ≈ .25.)
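As a cross-check of this arithmetic, the sketch below computes the mean and both versions of the standard deviation for the Susan, Karen, Bill sample. It uses Python's `statistics` module, where `pstdev` divides by n (as the slide does) and `stdev` divides by n − 1; the variable names are illustrative.

```python
import statistics

sample = [2.1, 2.6, 2.3]  # GPAs of Susan, Karen, Bill

mean = statistics.fmean(sample)

# Divide the sum of squared deviations by n (the calculation shown on the slide).
sd_divide_by_n = statistics.pstdev(sample)

# Divide by n - 1 instead (the usual sample standard deviation).
sd_divide_by_n_minus_1 = statistics.stdev(sample)

print(f"mean          = {mean:.2f}")
print(f"sd (/ n)      = {sd_divide_by_n:.2f}")
print(f"sd (/ (n - 1)) = {sd_divide_by_n_minus_1:.2f}")
```

The two divisors give noticeably different answers for a sample this small, which is why it matters to state which one is being used.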

Page 18: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

What about other samples?

A SECOND SAMPLE: Susan, Karen, Calvin

$\bar{X}$ = (2.1 + 2.6 + 1.2) / 3 = 1.97

SD = .58

20th SAMPLE: Karen, Rose, David

$\bar{X}$ = (2.6 + 3.0 + 2.4) / 3 = 2.67

SD = .25

Page 19: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Assume the true mean of the population is known. In this simple case of 6 people it can be calculated as $\mu$ = 13.6/6 = 2.27.

The mean of the sampling distribution (i.e., the mean of all 20 sample means) is also 2.27: each student appears in 10 of the 20 samples, so the sample means average out to exactly the population mean.
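Because the population has only 6 members, the whole sampling distribution can be enumerated rather than imagined. The following sketch (variable names are illustrative) lists all 20 sample means and compares their average with the population mean of 13.6/6 ≈ 2.27.

```python
import statistics
from itertools import combinations

gpa = {"Susan": 2.1, "Karen": 2.6, "Bill": 2.3,
       "Calvin": 1.2, "Rose": 3.0, "David": 2.4}

population_mean = statistics.fmean(list(gpa.values()))

# One mean per possible sample of 3 students: this list IS the sampling distribution.
sample_means = [
    statistics.fmean([gpa[name] for name in trio])
    for trio in combinations(gpa, 3)
]
mean_of_sample_means = statistics.fmean(sample_means)

print(f"population mean        = {population_mean:.4f}")
print(f"mean of 20 sample means = {mean_of_sample_means:.4f}")
print(f"smallest / largest sample mean = {min(sample_means):.2f} / {max(sample_means):.2f}")
```

Individual sample means range widely around the population mean, but their average reproduces it, which is what unbiasedness of the sample mean looks like in a fully enumerated case.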

Page 20: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

• Sample mean is a random variable.
• If the sample was randomly drawn, then any difference between the obtained sample mean and the true population mean is due to sampling error.
• Any difference between $\bar{X}$ and $\mu$ is due to the fact that different people show up in different samples.
• If $\bar{X}$ is not equal to $\mu$, the difference is due to sampling error.
• “Sampling error” is normal; it is the to-be-expected variability of samples.

Page 21: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

What is a Sampling Distribution?

• The distribution of a statistic (such as the sample mean) computed from every conceivable sample of a given size drawn from a population.
• A sampling distribution is almost always a hypothetical distribution, because typically you do not have, and cannot calculate, every conceivable sample mean.
• The mean of the sampling distribution of the sample mean equals the population mean, so the sample mean is an unbiased estimator of the population mean, with a computable standard deviation.

Page 22: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

LAW OF LARGE NUMBERS

If we keep taking larger and larger samples, a statistic such as the sample mean becomes increasingly likely to be close to the parameter value: the probability that it misses by more than any fixed margin shrinks toward zero.
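A small simulation can make the law of large numbers concrete. This sketch assumes a hypothetical population, rolls of a fair six-sided die with mean 3.5, which is an illustration and not data from the slides; `mean_of_rolls` is an illustrative helper name.

```python
import random
import statistics

POPULATION_MEAN = 3.5  # mean of a fair six-sided die (assumed population)

def mean_of_rolls(n, rng):
    """Return the mean of n simulated rolls of a fair die."""
    return statistics.fmean([rng.randint(1, 6) for _ in range(n)])

rng = random.Random(42)

# Absolute error of the sample mean for increasingly large samples.
errors = {n: abs(mean_of_rolls(n, rng) - POPULATION_MEAN)
          for n in (10, 1_000, 100_000)}

for n, err in errors.items():
    print(f"n = {n:>7}: |sample mean - 3.5| = {err:.4f}")
```

The error does not have to shrink at every single step, since each sample is random, but large errors become rare as n grows.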

Page 23: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

ILLUSTRATION OF SAMPLING DISTRIBUTIONS

Draw 500 different SRSs.

What happens to the shape of the sampling distribution as the size of the sample increases?

Page 24: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

500 Samples of n = 2

Page 25: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

500 Samples of n = 4

Page 26: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

500 Samples of n = 6

Page 27: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

500 Samples of n = 10

Page 28: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

500 Samples of n = 20

Page 29: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Key Observations

• As the sample size increases, the mean of the simulated sampling distribution comes to approximate the true population mean more closely, here known to be $\mu$ = 3.5.
• AND, this is critical, the standard error (that is, the standard deviation of the sampling distribution) gets systematically narrower.
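The illustration can be re-created approximately in code. The slides do not say which population was used; the sketch below assumes rolls of a fair six-sided die, which happens to have mean 3.5, and that choice is an assumption. It draws 500 samples at each sample size and compares the spread of the sample means with σ/√n.

```python
import math
import random
import statistics

FACES = [1, 2, 3, 4, 5, 6]          # assumed population: a fair die, mean 3.5
SIGMA = statistics.pstdev(FACES)    # population standard deviation, about 1.71

def simulate(n, rng, num_samples=500):
    """Draw num_samples samples of size n; return (mean, SD) of the sample means."""
    means = [statistics.fmean(rng.choices(FACES, k=n)) for _ in range(num_samples)]
    return statistics.fmean(means), statistics.pstdev(means)

rng = random.Random(7)
results = {n: simulate(n, rng) for n in (2, 4, 6, 10, 20)}

print(" n   mean of means   SD of means   sigma/sqrt(n)")
for n, (centre, spread) in results.items():
    print(f"{n:2d}   {centre:13.3f}   {spread:11.3f}   {SIGMA / math.sqrt(n):13.3f}")
```

The middle column stays near 3.5 for every n, while the spread column shrinks roughly in proportion to 1/√n.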

Page 30: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Three main points about sampling distributions

• Probabilistically, as the sample size gets bigger, the sampling distribution better approximates a normal distribution.
• The mean of the sampling distribution will more closely estimate the population parameter as the sample size increases.
• The standard error (SE) gets narrower and narrower as the sample size increases. Thus, we will be able to make more precise estimates of the whereabouts of the unknown population mean.

Page 31: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

• Don't confuse the terms STANDARD DEVIATION and STANDARD ERROR

Page 32: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Quantifying Uncertainty

• Standard deviation: measures the variation of a variable in the sample.

–Technically,

$$s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

Page 33: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

• Standard error of the mean is calculated by:

$$s_{\bar{x}} = \text{sem} = \frac{s}{\sqrt{n}}$$
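As a minimal sketch of the calculation, the standard error of the mean is the sample standard deviation s divided by √n. The data below are invented heart-rate readings used only for illustration, and `standard_error_of_mean` is a hypothetical helper name.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Return s / sqrt(n), where s is the sample SD computed with an n - 1 divisor."""
    if len(sample) < 2:
        raise ValueError("need at least two observations to compute s")
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical sample (an assumption): 10 resting heart rates in beats per minute.
heart_rates = [68, 72, 75, 70, 80, 66, 74, 71, 69, 77]

s = statistics.stdev(heart_rates)
sem = standard_error_of_mean(heart_rates)

print(f"n = {len(heart_rates)}, mean = {statistics.fmean(heart_rates):.1f}")
print(f"standard deviation s = {s:.2f}  (how individuals differ)")
print(f"standard error sem   = {sem:.2f}  (precision of the sample mean)")
```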

Page 34: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Standard deviation versus standard error

• The standard deviation (s) describes variability between individuals in a sample.

• The standard error describes variation of a sample statistic.

–The standard deviation describes how individuals differ.

–The standard error of the mean describes the precision with which we can make inference about the true mean.

Page 35: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Standard error of the mean

• Standard error of the mean (sem):

$$s_{\bar{x}} = \text{sem} = \frac{s}{\sqrt{n}}$$

• Comments:

–n = sample size

–even for large s, if n is large, we can get good precision for sem

–always smaller than standard deviation (s) when n > 1

Page 36: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

ESTIMATING THE POPULATION MEAN

We are unlikely ever to see a sampling distribution, because it is often impossible to draw every conceivable sample from a population, so we never know the actual mean or the actual standard deviation of the sampling distribution. But here is the good news:

We can estimate the whereabouts of the population mean from the sample mean and use the sample's standard deviation to calculate the standard error. The formula for computing the standard error changes depending on the statistic you are using, but for the sample mean you divide the sample's standard deviation by the square root of the sample size.

Page 37: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

P-Hat

• The situation in this section is that we are interested in the proportion of the population that has a certain characteristic.
• This proportion is the population parameter of interest, denoted by the symbol p.
• We estimate this parameter with the statistic p-hat: the number in the sample with the characteristic divided by the sample size n.

Page 38: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

P-Hat Definition

$$\hat{p} = X / n$$

Page 39: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sample proportions

The proportion of an “event of interest” can be more informative. In statistical sampling, the sample proportion of an event of interest is used to estimate the proportion p of an event of interest in a population.

For any SRS of size n, the sample proportion of an event is:

$$\hat{p} = \frac{\text{count of event in the sample}}{n} = \frac{X}{n}$$

In an SRS of 50 students in an undergrad class, 10 are O +ve blood group:

$\hat{p}$ = (10)/(50) = 0.2 (proportion of O +ve blood group in sample)

The 30 subjects in an SRS are asked to taste an unmarked brand of coffee and rate it “would buy” or “would not buy.” Eighteen subjects rated the coffee “would buy.”

$\hat{p}$ = (18)/(30) = 0.6 (proportion of “would buy”)

Page 40: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling Distribution of p-hat

• How does p-hat behave? To study the behavior, imagine taking many random samples of size n, and computing a p-hat for each of the samples.
• Then we plot this set of p-hats with a histogram.

Page 41: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling Distribution of p-hat

Page 42: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling distribution of the sample proportion

The sampling distribution of $\hat{p}$ is never exactly normal. But as the sample size increases, the sampling distribution of $\hat{p}$ becomes approximately normal.

The normal approximation is most accurate for any fixed n when p is close to 0.5, and least accurate when p is near 0 or near 1.

Page 43: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Reminder: Sampling variability

Each time we take a random sample from a population, we are likely to get a different set of individuals and calculate a different statistic. This is called sampling variability.

If we take a lot of random samples of the same size from a given population, the variation from sample to sample (the sampling distribution) will follow a predictable pattern.

Page 44: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Properties of p-hat

• When sample sizes are fairly large, the shape of the p-hat distribution will be approximately normal.
• The mean of the distribution is the value of the population parameter p.
• The standard deviation of this distribution is the square root of p(1-p)/n.

$$sd(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}$$
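The standard deviation formula is easy to explore numerically. The sketch below is illustrative only; `sd_p_hat` is a hypothetical helper name and the values of p and n are arbitrary choices, not figures from the slides.

```python
import math

def sd_p_hat(p, n):
    """Standard deviation of the sample proportion: sqrt(p (1 - p) / n)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return math.sqrt(p * (1.0 - p) / n)

# For a fixed n the spread is largest at p = 0.5 and shrinks toward p = 0 or 1.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p = {p:.1f}, n = 100: sd = {sd_p_hat(p, 100):.4f}")

# Quadrupling the sample size halves the standard deviation.
print(sd_p_hat(0.5, 100), sd_p_hat(0.5, 400))
```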

Page 45: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling Distribution for Proportion

Example: Proportion

• Suppose a large department store chain is considering opening a new store in a town of 15,000 people.
• Further, suppose that 11,541 of the people in the town are willing to utilize the store, but this is unknown to the department store chain managers.
• Before making the decision to open the new store, a market survey is conducted.
• 200 people are randomly selected and interviewed. Of the 200 interviewed, 162 say they would utilize the new store.

Page 46: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling Distribution for Proportion

Example: Proportion

• What is the population proportion p?
– 11,541/15,000 = 0.77
• What is the sample proportion p-hat?
– 162/200 = 0.81
• What is the approximate sampling distribution (of the sample proportion)?

$$\hat{p} \sim \text{Normal}\left(p,\ \left(\sqrt{\frac{p(1-p)}{n}}\right)^{2}\right) = \text{Normal}\left(0.77,\ 0.0298^{2}\right)$$

What does this mean?
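The numbers in this example can be reproduced in a few lines. The sketch below uses only the figures given for the town and the survey; the variable names are illustrative, and the z-score at the end is an added convenience for seeing how far the observed p-hat sits from p in standard-deviation units.

```python
import math

TOWN_SIZE = 15_000      # people in the town
WILLING = 11_541        # people willing to use the store (unknown to the managers)
n = 200                 # survey sample size
x = 162                 # respondents who say they would use the store

p = WILLING / TOWN_SIZE             # population proportion, about 0.77
p_hat = x / n                       # sample proportion, 0.81
se = math.sqrt(p * (1 - p) / n)     # standard deviation of p-hat, about 0.03
z = (p_hat - p) / se                # distance of p_hat from p in SD units

print(f"p     = {p:.4f}")
print(f"p_hat = {p_hat:.4f}")
print(f"sd(p_hat) = {se:.4f}")
print(f"z = {z:.2f}")
```

The sample fraction n/N is small here (200 of 15,000), so the formula is applied without a finite-population adjustment, as on the slides.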

Page 47: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Example: Proportions

What does this mean?

Population: 15,000 people, p = 0.77

Suppose we take many, many samples (of size 200). Then we find the sample proportion for each sample:

p-hat = 0.74, 0.78, 0.82, 0.82, 0.73, 0.76, and so forth…
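The thought experiment of taking many samples of 200 can be simulated directly. This is a sketch under stated assumptions: the town is modeled as a list of 11,541 ones (would use the store) and 3,459 zeros, each survey is a simple random sample drawn without replacement, and the number of repetitions (2,000) and the seed are arbitrary choices.

```python
import random
import statistics

# Model of the town: 1 = would use the store, 0 = would not.
town = [1] * 11_541 + [0] * (15_000 - 11_541)

def one_p_hat(rng, sample_size=200):
    """Survey sample_size residents at random (no repeats) and return p-hat."""
    return sum(rng.sample(town, sample_size)) / sample_size

rng = random.Random(2016)
p_hats = [one_p_hat(rng) for _ in range(2_000)]

print("first six p-hats:", [round(v, 2) for v in p_hats[:6]])
print(f"mean of p-hats = {statistics.fmean(p_hats):.4f}")
print(f"SD of p-hats   = {statistics.pstdev(p_hats):.4f}")
```

The simulated p-hats should centre near p ≈ 0.77 with a standard deviation close to 0.03, in line with the normal approximation for this example.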

Page 48: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling Distribution for Proportion

Example: Proportion

[Figure: normal curve for the sampling distribution of p-hat, centered at 0.77 with a standard deviation of about 0.03; the sample we took fell at 0.81, to the right of the center.]

Page 49: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Sampling Distribution for Proportion

Example: Proportion

• The managers didn't know the true proportion, so they took a sample.
• As we have seen, the samples vary.
• However, because we know how the sampling distribution behaves, we can get a good idea of how close we are to the true proportion.
• This is why we have looked so much at the normal distribution.
• For large samples, the normal distribution approximates the sampling distribution of the sample proportion and, as we have seen, the sampling distribution of the sample mean as well.

Page 50: SAMPLING DISTRIBUTION OF MEANS & PROPORTIONS

Two Steps in Statistical Inference Process

1. Calculation of “confidence intervals” from the sample mean and sample standard deviation within which we can place the unknown population mean with some degree of probabilistic confidence

2. Computation of a “test of statistical significance” (risk statement), which assesses how probable a sample result at least as extreme as the one observed would be if a hypothesized value of the unknown population mean were true.

