Central limit Theorem · 2016. 11. 28. · Central Limit Theorem Two assumptions 1. The sampled...

Central Limit Theorem

Sampling Distribution Models for Means and Proportions

Central Limit Theorem

Two assumptions

1. The sampled values must be independent

2. The sample size, n, must be large enough

• The mean of a random sample has a sampling distribution whose shape can be approximated by a Normal model.

• The larger the sample, the better the approximation will be.

• This holds regardless of the shape of the distribution of the population being sampled from, or the shape of the distribution of the sample itself.
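The bullets above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the exponential population, the sample sizes, the number of repetitions, and the helper names `sample_means` and `skewness` are all choices made for illustration, not part of the slides. It draws repeated samples from a strongly right-skewed population and reports how the skewness of the sample means shrinks toward 0 (the value for a Normal model) as n grows.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Draw `reps` samples of size `n` from a right-skewed population
    (exponential with mean 1, an illustrative choice) and return each sample's mean."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric, Normal-looking shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (2, 10, 50):
    means = sample_means(n, reps=4000, rng=rng)
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))
    print(f"n={n:>2}  mean={results[n][0]:.3f}  sd={results[n][1]:.3f}  skew={results[n][2]:.2f}")
```

The population here has mean 1 and standard deviation 1, so the sample means should center near 1 with spread near 1/√n, while the skewness falls as the sample size increases.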

Distribution of sample proportions

• The population has a fixed proportion

• To estimate the population proportion, a sample is taken and a sample proportion is calculated

If samples are repeatedly taken with the same sample size:

• The mean of the sampling distribution would be the population proportion, $\mu(\hat{p}) = p$

• The standard deviation would be $\sigma(\hat{p}) = \sqrt{\dfrac{pq}{n}}$

Conditions to check for the assumptions

1. Success/Failure: The expected numbers of successes and failures are both at least 10: $np \ge 10$ and $nq \ge 10$, where $q = 1 - p$

2. 10% Condition: Each sample is less than 10% of the population

3. Randomization: The sample was obtained through random sampling techniques, or we can at least assume that the sample is representative.

All conditions have been met to use the Normal model for the distribution of sample proportions.
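The two numeric conditions can be checked mechanically. The sketch below is an illustration only: the function name `check_conditions`, its return format, and the example inputs are hypothetical, and Randomization is left out because it is a judgment about how the data were collected rather than a calculation.

```python
def check_conditions(n, p, population_size=None):
    """Check Success/Failure and, when the population size is known, the 10% Condition."""
    q = 1 - p
    success_failure = n * p >= 10 and n * q >= 10
    # Without a known population size the 10% Condition cannot be computed,
    # so it is reported as None and must be argued in words instead.
    ten_percent = None if population_size is None else n < 0.10 * population_size
    return {"success_failure": success_failure, "ten_percent": ten_percent}

# Hypothetical inputs, chosen only to show a pass and a fail.
print(check_conditions(100, 0.50, population_size=5000))
print(check_conditions(100, 0.05))  # only 5 expected successes, so Success/Failure fails
```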

• If samples were repeatedly taken with the same sample size, then by the CLT the distribution of $\hat{p}$ would be approximately Normal:

$\hat{p} \sim N\left(p, \sqrt{\dfrac{pq}{n}}\right)$
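A quick simulation can show the model at work. In this sketch the values n = 200 and p = 0.30, the seed, and the helper name `simulate_proportions` are assumptions made for illustration. It takes many samples of the same size, computes each sample proportion, and compares the observed center and spread with p and $\sqrt{pq/n}$.

```python
import math
import random
import statistics

def simulate_proportions(n, p, reps, rng):
    """Return `reps` sample proportions, each from n independent trials
    with success probability p."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

rng = random.Random(1)   # fixed seed for repeatability
n, p = 200, 0.30         # hypothetical sample size and population proportion
p_hats = simulate_proportions(n, p, reps=3000, rng=rng)

observed_mean = statistics.fmean(p_hats)
observed_sd = statistics.pstdev(p_hats)
model_sd = math.sqrt(p * (1 - p) / n)

print(f"mean of p-hat: {observed_mean:.4f} (model: {p})")
print(f"sd of p-hat:   {observed_sd:.4f} (model: {model_sd:.4f})")
```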

Example: Skittles

• According to the manufacturer of the candy Skittles, 20% of the candy produced is red. Given a large bag of Skittles with 58 candies, what is the probability that we get at least 17 red?

Conditions:

1. 10% condition: 58 Skittles is less than 10% of all Skittles produced.

2. Success/Failure: $np = 58 \times 0.20 = 11.6 \ge 10$ and $nq = 58 \times 0.80 = 46.4 \ge 10$. There are at least 10 expected successes and failures.

3. Randomization: Though the bag is not a random sample, we can assume it is representative of the population.

All conditions have been met to use the Normal model for the distribution of sample proportions.

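• Mean: $\mu(\hat{p}) = p = 0.20$

• Standard Deviation: $\sigma(\hat{p}) = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{0.20 \times 0.80}{58}} = 0.0525$

• So the model for $\hat{p}$ becomes N(0.20, 0.0525)

• Sample proportion: $\hat{p} = \dfrac{17}{58} = 0.293$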

• Then to find the probability that we get a sample proportion of 0.293 or higher:

$P(\hat{p} \ge 0.293) = 0.0383$
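The Skittles calculation can be reproduced in a few lines. This sketch assumes Python's standard-library `statistics.NormalDist` as a stand-in for the calculator's Normal probability function; the variable names are illustrative. Because it carries unrounded values for the standard deviation and the sample proportion, its result may differ from the slide's 0.0383 in the last decimal place.

```python
import math
from statistics import NormalDist

n, p = 58, 0.20                    # bag size and the manufacturer's stated proportion
sd = math.sqrt(p * (1 - p) / n)    # standard deviation of the sampling model
p_hat = 17 / n                     # observed sample proportion

# Upper-tail area under N(p, sd) at or beyond the observed proportion.
model = NormalDist(mu=p, sigma=sd)
prob = 1 - model.cdf(p_hat)

print(f"sd = {sd:.4f}, p-hat = {p_hat:.3f}, P(p-hat >= {p_hat:.3f}) = {prob:.4f}")
```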

Confidence Intervals

1-Proportion z-intervals

Distribution of Sample Proportions

• From previous work: the distribution of sample proportions follows a Normal model, $N\left(p, \sqrt{\dfrac{pq}{n}}\right)$

• But most of the time we don’t know what the population proportion is.

• We take samples to try to find the population proportion.

• $\hat{p}$ is the estimate of p

• Since we don’t know p, we can’t find the standard deviation.

• We’ll estimate it with the Standard Error: $SE(\hat{p}) = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

Confidence Interval

• An interval, built around the sample proportion, in which we have a stated level of confidence that the true population proportion lies.

• Size of the interval is based on sample size and level of confidence.

• The larger the sample size, the narrower the interval

• The higher the level of confidence, the wider the interval

• Every confidence interval has the same basic setup: estimate ± ME

• ME is the margin of error

• For a one-proportion sample:

$\hat{p} \pm z^* \, SE(\hat{p}) = \hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

where z* is the critical value, the z value associated with the level of confidence, so that

$ME = z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
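The interval formula translates directly into a short function. This is a minimal sketch, not a reference implementation: the name `one_proportion_z_interval` and the inputs in the usage lines (40% of 100, and so on) are hypothetical, and `statistics.NormalDist().inv_cdf` is assumed as the source of z*.

```python
import math
from statistics import NormalDist

def one_proportion_z_interval(p_hat, n, confidence):
    """Return (low, high, margin_of_error) for a one-proportion z-interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    se = math.sqrt(p_hat * (1 - p_hat) / n)                  # standard error of p-hat
    me = z_star * se                                         # margin of error
    return p_hat - me, p_hat + me, me

# Hypothetical inputs: the margin of error shrinks with a larger sample
# and grows with a higher level of confidence.
_, _, me_base = one_proportion_z_interval(0.40, 100, 0.95)
_, _, me_bigger_n = one_proportion_z_interval(0.40, 400, 0.95)
_, _, me_higher_cl = one_proportion_z_interval(0.40, 100, 0.99)
print(f"{me_base:.4f} {me_bigger_n:.4f} {me_higher_cl:.4f}")
```

Quadrupling the sample size halves the margin of error, since n sits under a square root.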

Critical Values – some basics

Level of Confidence    z*

90%                    1.645

95%                    1.960

99%                    2.576

To find the critical value given a level of confidence

1. Subtract level of confidence from 1

2. Divide difference by 2

3. Apply the calculator’s invNorm( ) function to that result, then take the positive value

Ex. 90% confidence

1-.9 = 0.10

0.10/2 = 0.05

invNorm(0.05) = -1.645

z* = 1.645
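The same three steps can be sketched in Python, assuming the standard library's `statistics.NormalDist().inv_cdf` as a stand-in for the calculator's invNorm( ); the function name `critical_value` is illustrative.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Follow the three steps: subtract from 1, halve, invert, make positive."""
    tail = (1 - confidence) / 2               # steps 1 and 2: area in one tail
    return abs(NormalDist().inv_cdf(tail))    # step 3: invNorm, made positive

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {critical_value(level):.3f}")
# prints:
# 90%: z* = 1.645
# 95%: z* = 1.960
# 99%: z* = 2.576
```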

Conditions

• Randomization

• 10% Condition (Independence)

• Success/Failure: $n\hat{p} \ge 10$ and $n\hat{q} \ge 10$; this uses the sample proportion since we don’t know the population proportion

All conditions have been met to use the Normal model for a 1-proportion z-interval

An experiment finds that 27% of 53 subjects report improvement after using a new medicine. Create a 95% confidence interval for the actual cure rate.

$n = 53$, $\hat{p} = 0.27$

Conditions:

1) Random: assume representative sample

2) 10% Condition: It is safe to assume that 53 subjects is less than 10% of all subjects

3) Success/Failure: $n\hat{p} = 53 \times 0.27 = 14.31 \ge 10$ and $n\hat{q} = 53 \times 0.73 = 38.69 \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval.

Mechanics:

$n = 53$, $\hat{p} = 0.27$, CL = 95%, $z^* = 1.96$

CI: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.27 \pm 1.96 \sqrt{\dfrac{0.27 \times 0.73}{53}} = 0.27 \pm 0.1195 = (0.1505, 0.3895)$

Conclusion:

We are 95% confident that the true proportion of subjects who show improvement lies between 15.05% and 38.95%.

p. 456 #11. Contaminated Chicken

In January 2007 Consumer Reports published their study of bacterial contamination of chicken sold in the United States. They purchased 525 broiler chickens from various kinds of food stores in 23 states and tested them for types of bacteria that cause food-borne illnesses. Laboratory results indicated that 83% of these chickens were infected with Campylobacter. Construct a 95% confidence interval.

$n = 525$, $\hat{p} = 0.83$

Conditions:

• Random: assume sample is representative

• 10% Condition: 525 chickens is less than 10% of all chickens for sale

• Success/Failure: $n\hat{p} = 525(0.83) = 435.75 \ge 10$ and $n\hat{q} = 525(0.17) = 89.25 \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval.

CI: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.83 \pm 1.96 \sqrt{\dfrac{0.83 \times 0.17}{525}} = (0.7979, 0.8621)$

We are 95% confident that the true proportion of broiler chickens infected with Campylobacter lies between 79.8% and 86.2%.

p. 456 #18. Junk Mail

Direct mail advertisers send solicitations (a.k.a. “junk mail”) to thousands of potential customers in the hope that some will buy the company’s product. The acceptance rate is usually quite low. Suppose a company wants to test the response to a new flyer, and sends it to 1000 people randomly selected from their mailing list of over 200,000 people. They get orders from 123 of the recipients. Create a 90% confidence interval for the percentage of people the company contacts who may buy something.

$n = 1000$, $\hat{p} = \dfrac{123}{1000} = 0.123$

Conditions:

• Random: stated as a random sample

• 10% Condition: 1000 people is less than 10% of the more than 200,000 people on the mailing list

• Success/Failure: $n\hat{p} = 123 \ge 10$ and $n\hat{q} = 877 \ge 10$

All conditions have been met to use the Normal model for a 1-proportion z-interval.

CI: $\hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.123 \pm 1.645 \sqrt{\dfrac{0.123 \times 0.877}{1000}} = (0.1059, 0.1401)$

We are 90% confident that the true proportion of people contacted who buy something lies between 10.6% and 14.0%.
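As a cross-check, the three worked intervals in this section (new medicine, contaminated chicken, junk mail) can be recomputed from their stated n, sample proportion, and table value of z*. The helper name `z_interval` and the dictionary layout are illustrative assumptions; the inputs are the ones given in the examples.

```python
import math

def z_interval(p_hat, n, z_star):
    """One-proportion z-interval from a sample proportion, sample size, and critical value."""
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - me, p_hat + me

# (p-hat, n, z*) as stated in each worked example.
examples = {
    "new medicine": (0.27, 53, 1.96),
    "contaminated chicken": (0.83, 525, 1.96),
    "junk mail": (123 / 1000, 1000, 1.645),
}

intervals = {name: z_interval(*args) for name, args in examples.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: ({low:.4f}, {high:.4f})")
```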

What does __% confidence mean?

Stock Statement:

About ___% of random samples of size (n) will produce confidence intervals that contain the true proportion of ___

Ex. #18. What does 90% confidence mean?

About 90% of random samples of size 1000 will produce confidence intervals that contain the true proportion of people contacted who will buy something.
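The stock statement can be tested by simulation. In the sketch below the true proportion 0.12, the seed, the number of repetitions, and the helper name `covers` are assumptions chosen for illustration, since in practice the true proportion is unknown. It repeatedly draws samples of size 1000, builds a 90% interval from each, and counts how often the interval contains the true proportion.

```python
import math
import random

def covers(true_p, n, z_star, rng):
    """Draw one sample of size n and report whether its z-interval contains true_p."""
    successes = sum(rng.random() < true_p for _ in range(n))
    p_hat = successes / n
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - me <= true_p <= p_hat + me

rng = random.Random(2)   # fixed seed for repeatability
true_p = 0.12            # hypothetical true proportion, known only inside the simulation
reps = 2000

hits = sum(covers(true_p, n=1000, z_star=1.645, rng=rng) for _ in range(reps))
coverage = hits / reps
print(f"{coverage:.1%} of the {reps} intervals contain the true proportion")
```

The observed share should land near 90%, with some run-to-run variation because only a finite number of samples is drawn.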

d) Since 5% lies below the interval, it is suggested that the company run the mass mailing.

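From a previous experiment we found the cure rate to be 27%. How many subjects would we need in a new experiment to create a confidence interval with 98% confidence and a ME of only ±5%?

$p = 0.27$, CL = 98%, $z^* = 2.326$, $ME = 0.05$

$ME = z^* \sqrt{\dfrac{pq}{n}}$

$0.05 = 2.326 \sqrt{\dfrac{(0.27)(0.73)}{n}}$

Solving for n gives $n = \dfrac{(2.326)^2 (0.27)(0.73)}{(0.05)^2} \approx 426.5$, which rounds up to

$n = 427$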

• 95% Confidence, ME = 0.03, $\hat{p} = 0.36$

$0.03 = 1.96 \sqrt{\dfrac{(0.36)(0.64)}{n}}$

$n = 984$

• 90% Confidence, ME = 0.045, $\hat{p} = 0.27$

$0.045 = 1.645 \sqrt{\dfrac{(0.27)(0.73)}{n}}$

$n = 264$
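Solving $ME = z^* \sqrt{pq/n}$ for n gives $n = (z^*)^2 pq / ME^2$, rounded up to the next whole subject. The sketch below applies that rearrangement to the three sample-size problems above; the function name `required_sample_size` is an illustrative assumption, and the z* values are the rounded table values used in the slides.

```python
import math

def required_sample_size(p, z_star, me):
    """Smallest whole n whose margin of error is at most `me`,
    using a prior estimate p of the proportion."""
    return math.ceil(z_star ** 2 * p * (1 - p) / me ** 2)

print(required_sample_size(0.27, 2.326, 0.05))    # 427
print(required_sample_size(0.36, 1.96, 0.03))     # 984
print(required_sample_size(0.27, 1.645, 0.045))   # 264
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target.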
