
Chapters 18 - 19

Date post: 23-Feb-2016
Upload: italia
Chapters 18 - 19 Sampling Distribution Models and Confidence Intervals
Transcript
Page 1: Chapters 18 - 19

Chapters 18 - 19

Sampling Distribution Models and

Confidence Intervals

Page 2: Chapters 18 - 19

Example

• We want to find out the proportion of men in the U.S. population.

• We draw a sample and calculate the proportion of men in the sample. Suppose we find that the proportion is 60%.

• We conclude that 60% of the population are men. How much should we trust this estimate?

• Suppose the actual percentage is p = 52%.

Page 3: Chapters 18 - 19

Margin of Error

• Example:
  – 60% of the U.S. population are men, with a margin of error of ± 5%.
  – This yields an interval estimate from 55% to 65%.

The extent of the interval on either side of the sample proportion or mean is called the margin of error (ME).

In general, interval estimates have the form estimate ± ME.

Page 4: Chapters 18 - 19

Sampling Distribution

• Now imagine what would happen if we drew many samples and looked at the sample proportion from each one.

• The histogram we’d get if we could see the proportions from all possible samples is called the sampling distribution of the proportions.

• What would the histogram of all the sample proportions look like?

• It turns out that the histogram is unimodal, symmetric, and centered at p. It is well approximated by a Normal model.

Page 5: Chapters 18 - 19

Sampling Distribution Model

• The mean of the sampling distribution is at p.

• The standard deviation of the distribution is $\sqrt{pq/n}$, where q = 1 − p.

• So, the distribution of the sample proportions is modeled with the Normal model $N\left(p, \sqrt{pq/n}\right)$.
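The model on this slide can be checked with a small simulation. The sketch below is only an illustration: it uses p = 52% from the earlier example, while the sample size n = 100, the number of samples (5,000), and the function names are assumptions made here, not part of the slides.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return each sample's proportion of successes."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(num_samples)]

def model_sd(p, n):
    """Standard deviation of the sampling distribution model: sqrt(p*q/n), q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

# p comes from the earlier example; n and num_samples are illustrative assumptions.
p, n = 0.52, 100
props = simulate_sample_proportions(p, n, num_samples=5000)

print("mean of simulated proportions:", round(statistics.mean(props), 3))
print("SD of simulated proportions:  ", round(statistics.stdev(props), 3))
print("model SD, sqrt(pq/n):         ", round(model_sd(p, n), 3))
```

The simulated mean should land near p and the simulated SD near $\sqrt{pq/n}$, which is what the Normal model predicts.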

Page 6: Chapters 18 - 19

Assumptions

• Most models are useful only when specific assumptions are true.

• There are two assumptions in the case of the model for the distribution of sample proportions:

1. The Independence Assumption: The sampled values must be independent of each other.

2. The Sample Size Assumption: The sample size, n, must be large enough.

Page 7: Chapters 18 - 19

Assumptions and Conditions

• Assumptions are hard (often impossible) to check.

• Still, we need to check whether the assumptions are reasonable by checking conditions that provide information about the assumptions.

• The corresponding conditions to check before using the Normal model for the distribution of sample proportions are the Randomization Condition, the 10% Condition, and the Success/Failure Condition.

Page 8: Chapters 18 - 19

Conditions

1. Randomization Condition: The sample should be a simple random sample of the population.

2. 10% Condition: If sampling has not been made with replacement, then the sample size, n, must be no larger than 10% of the population.

3. Success/Failure Condition: The sample size has to be big enough so that both np and nq are at least 10.
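The two numeric conditions can be written as a quick check. This is a minimal sketch with a hypothetical helper name (`check_conditions`); the Randomization Condition is about how the data were collected and cannot be verified by arithmetic, so it is left out.

```python
def check_conditions(n, p, population_size=None):
    """Check the Success/Failure Condition and, if a population size is given, the 10% Condition.

    The Randomization Condition is not checked here: it depends on the sampling design.
    """
    q = 1 - p
    results = {"success_failure": n * p >= 10 and n * q >= 10}
    if population_size is not None:
        results["ten_percent"] = n <= 0.10 * population_size
    return results

# Illustrative values (assumptions): n = 100 drawn from a population of 10,000, p = 0.52.
print(check_conditions(100, 0.52, population_size=10_000))
# A sample of 15 is too small: np = 7.8 is below 10.
print(check_conditions(15, 0.52))
```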

Page 9: Chapters 18 - 19

A Sampling Distribution Model

• A proportion is no longer just a computation for a set of data.
  – It is now a random quantity that has a distribution.
  – This distribution is called the sampling distribution model for proportions.

• Even though we depend on sampling distribution models, we never actually get to see them.
  – We never actually take repeated samples from the same population and make a histogram. We only imagine or simulate them.


Page 10: Chapters 18 - 19

The Central Limit Theorem for a Proportion

Provided that the sampled values are independent and the sample size is large enough, the sampling distribution of $\hat{p}$ is modeled by a Normal model with

– Mean: $\mu(\hat{p}) = p$

– Standard deviation: $SD(\hat{p}) = \sqrt{pq/n}$

Page 11: Chapters 18 - 19

Means – The “Average” of One Die

• Let’s start with a simulation of 10,000 tosses of a die. A histogram of the results is:

Page 12: Chapters 18 - 19

Means – Averaging More Dice

• Looking at the average of two dice after a simulation of 10,000 tosses:

• The average of three dice after a simulation of 10,000 tosses looks like:

Page 13: Chapters 18 - 19

Means – Averaging Still More Dice

• The average of 5 dice after a simulation of 10,000 tosses looks like:

• The average of 20 dice after a simulation of 10,000 tosses looks like:

Page 14: Chapters 18 - 19

Means – What the Simulations Show

• As the sample size (number of dice) gets larger, each sample average is more likely to be closer to the population mean.
  – So, we see the shape continuing to tighten around 3.5.

• And, it probably does not shock you that the sampling distribution of a mean becomes Normal.
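The dice simulations described on the last few slides can be reproduced in a few lines. This is a sketch, not the code behind the original histograms; the function name and the seed are assumptions, and it prints summary statistics instead of drawing histograms.

```python
import random
import statistics

def average_of_dice(num_dice, num_trials=10_000, seed=0):
    """Simulate num_trials tosses of num_dice fair dice; return the average of each toss."""
    rng = random.Random(seed)
    return [
        sum(rng.randint(1, 6) for _ in range(num_dice)) / num_dice
        for _ in range(num_trials)
    ]

# Same numbers of dice as in the slides: 1, 2, 3, 5, and 20.
for k in (1, 2, 3, 5, 20):
    averages = average_of_dice(k)
    print(f"{k:2d} dice: mean = {statistics.mean(averages):.3f}, "
          f"SD = {statistics.stdev(averages):.3f}")
```

The means stay close to 3.5 while the SD shrinks as more dice are averaged, which is the tightening the histograms show.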

Page 15: Chapters 18 - 19

The Central Limit Theorem: The Fundamental Theorem of Statistics

• The sampling distribution of any mean becomes more nearly Normal as the sample size grows.
  – All we need is for the observations to be independent and collected with randomization.
  – We don’t even care about the shape of the population distribution!

• The Fundamental Theorem of Statistics is called the Central Limit Theorem (CLT).

Page 16: Chapters 18 - 19

The Central Limit Theorem (CLT)

The mean of a random sample has a sampling distribution whose shape can be approximated by a Normal model. The larger the sample, the better the approximation will be.

Page 17: Chapters 18 - 19

Assumptions and Conditions

The CLT requires essentially the same assumptions we saw for modeling proportions:

Independence Assumption: The sampled values must be independent of each other.

Sample Size Assumption: The sample size must be sufficiently large.

We can’t check these directly, but we can think about whether the Independence Assumption is plausible, and check

– Randomization Condition
– 10% Condition: n is less than 10% of the population.
– Large Enough Sample Condition: The CLT doesn’t tell us how large a sample we need. For now, you need to think about your sample size in the context of what you know about the population.

Page 18: Chapters 18 - 19

CLT and Standard Error

• The CLT says that the sampling distribution of any mean or proportion is approximately Normal.
  – For proportions, the sampling distribution is centered at the population proportion.
  – For means, it’s centered at the population mean.
  – For proportions, $SD(\hat{p}) = \sqrt{pq/n}$
  – For means, $SD(\bar{y}) = \sigma/\sqrt{n}$

Page 19: Chapters 18 - 19

Standard Error

• But we don’t know p or σ, so we’re stuck, right?

• Since we don’t know p or σ, we can’t find the true standard deviation of the sampling distribution model, so we need to estimate it from the sample. We call this estimate of the S.D. of a sampling distribution a standard error (SE).

• For a sample proportion, $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$

• For the sample mean, $SE(\bar{y}) = s/\sqrt{n}$
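Both standard errors are one-line computations. The sketch below uses hypothetical function names and made-up data for the mean; the proportion example reuses the 60% sample proportion from the opening example with an assumed sample size of 100.

```python
import math
import statistics

def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p_hat * q_hat / n), q_hat = 1 - p_hat."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def se_mean(values):
    """Standard error of a sample mean: s / sqrt(n), where s is the sample standard deviation."""
    return statistics.stdev(values) / math.sqrt(len(values))

# p_hat = 0.60 is from the opening example; n = 100 is an assumption for illustration.
print("SE(p_hat):", round(se_proportion(0.60, 100), 4))

# Made-up measurements, used only to show the call.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print("SE(y_bar):", round(se_mean(sample), 4))
```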

Page 20: Chapters 18 - 19

The Real World & the Model World

Be careful! Now we have two distributions to deal with.

– The first is the real world distribution of the sample, which we might display with a histogram.
– The second is the math world sampling distribution of the statistic, which we model with a Normal model based on the Central Limit Theorem.

Don’t confuse the two!

Page 21: Chapters 18 - 19

A Confidence Interval

• By the 68-95-99.7% Rule, we know
  – about 68% of all samples will have $\hat{p}$’s within 1 SE of p
  – about 95% of all samples will have $\hat{p}$’s within 2 SEs of p
  – about 99.7% of all samples will have $\hat{p}$’s within 3 SEs of p

Page 22: Chapters 18 - 19

A Confidence Interval

• Consider the 95% level:
  – There’s a 95% chance that p is no more than 2 SEs away from $\hat{p}$.
  – So, if we reach out 2 SEs, we are 95% sure that p will be in that interval. In other words, if we reach out 2 SEs in either direction of $\hat{p}$, we can be 95% confident that this interval contains the true proportion.

• This is called a 95% confidence interval.

Page 23: Chapters 18 - 19

A 95% Confidence Interval

Page 24: Chapters 18 - 19

What Does “95% Confidence” Mean?

• Each confidence interval uses a sample statistic to estimate a population parameter.

• But, since samples vary, the statistics we use, and thus the confidence intervals we construct, vary as well.

• Our confidence is in the process of constructing the interval, not in any one interval itself.

• “95% confidence” means that about 95% of samples of this size will produce intervals that contain the true parameter.
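One way to see that the confidence belongs to the process is to build many intervals from many samples and count how often they capture p. The sketch below assumes p = 0.52 from the opening example, a sample size of 400, and the "reach out 2 SEs" rule from the earlier slides; the function name and these settings are illustrative choices.

```python
import math
import random

def coverage_rate(p, n, num_samples=2000, num_se=2.0, seed=0):
    """Fraction of simulated samples whose interval p_hat ± num_se * SE(p_hat) contains p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error estimated from the sample
        if p_hat - num_se * se <= p <= p_hat + num_se * se:
            hits += 1
    return hits / num_samples

# p is from the opening example; n and num_samples are assumptions.
rate = coverage_rate(p=0.52, n=400)
print(f"Share of intervals that captured p: {rate:.3f}")
```

Each individual interval either contains p or it does not; the printed share describes the long-run behavior of the method and should come out close to 95%.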

Page 25: Chapters 18 - 19

ME: Certainty vs. Precision

• We can claim, with 95% confidence, that the interval $\hat{p} \pm 2\,SE(\hat{p})$ contains the true population proportion.

• The more confident we want to be, the larger our ME needs to be (makes the interval wider).

Page 26: Chapters 18 - 19

ME: Certainty vs. Precision

• To be more confident, we wind up being less precise.

• Because of this, every confidence interval is a balance between certainty and precision.

• The tension between certainty and precision is always there. Fortunately, in most cases we can be both sufficiently certain and sufficiently precise to make useful statements.

• The choice of confidence level is somewhat arbitrary, but keep in mind this tension between certainty and precision when selecting your confidence level.

• The most commonly chosen confidence levels are 90%, 95%, and 99% (but any percentage can be used).

Page 27: Chapters 18 - 19

Critical Values

• The ‘2’ in $\hat{p} \pm 2\,SE(\hat{p})$ (our 95% confidence interval) came from the 68-95-99.7% Rule.

• Using a table or technology, we find that a more exact value for our 95% confidence interval is 1.96 instead of 2. We call 1.96 the critical value and denote it z*.

• For any confidence level, we can find the corresponding critical value.

• Example: For a 90% confidence interval, the critical value is z* = 1.645.
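As one example of "technology", Python's standard library can produce these critical values. The sketch assumes a two-sided interval, so the confidence level C leaves (1 − C)/2 in each tail; the function name is hypothetical.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Two-sided critical value z* for a confidence level given as a fraction (e.g. 0.95)."""
    upper_tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - upper_tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.3f}")
```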

Page 28: Chapters 18 - 19

One-Proportion z-Interval

• When the conditions are met, we are ready to find the confidence interval for the population proportion, p.

• The confidence interval is $\hat{p} \pm z^{*} \times SE(\hat{p})$, where $SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$.

• The critical value, z*, depends on the particular confidence level, C, that you specify.
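Putting the pieces together, the interval can be computed as follows. This is a minimal sketch that assumes the conditions have already been checked; the function name and the example of 60 men in a sample of 100 are illustrative assumptions, chosen to match the 60% sample proportion from the opening example.

```python
import math
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p_hat ± z* × SE(p_hat)."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * se
    return p_hat - margin_of_error, p_hat + margin_of_error

# Assumed data: 60 men in a sample of 100.
low, high = one_proportion_z_interval(60, 100)
print(f"95% CI for p: ({low:.3f}, {high:.3f})")
```

Raising the `confidence` argument widens the interval, which is the certainty versus precision trade-off described earlier.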

Page 29: Chapters 18 - 19

What Can Go Wrong?

Margin of Error Too Large to Be Useful:

• We can’t be exact, but how precise do we need to be?

• One way to make the margin of error smaller is to reduce your level of confidence. (That may not be a useful solution.)

• You need to think about your margin of error when you design your study.
  – To get a narrower interval without giving up confidence, you need to have less variability.
  – You can do this with a larger sample…
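The effect of sample size on the margin of error follows from ME = z* × $\sqrt{\hat{p}\hat{q}/n}$. The sketch below holds the sample proportion at 60% and the confidence level at 95% (z* ≈ 1.96) and varies only n; the sample sizes are arbitrary illustrative values.

```python
import math

def margin_of_error(p_hat, n, z_star=1.96):
    """Margin of error for a one-proportion z-interval: z* × sqrt(p_hat * q_hat / n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

# Quadrupling the sample size halves the margin of error.
for n in (100, 400, 1600):
    print(f"n = {n:5d}: ME = {margin_of_error(0.60, n):.3f}")
```

Because n sits under a square root, cutting the margin of error in half takes four times as many observations.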

Page 30: Chapters 18 - 19

Homework Assignment

Chapter 18:
• Problem # 11, 15, 23, 29, 31, 47.

Chapter 19:
• Problem # 7, 9, 11, 13, 17, 27, 35, 37.

