Chapter 7 Sampling Distributions

Date post: 30-Dec-2015
Upload: kenneth-garza
Transcript
Page 1: Chapter 7 Sampling Distributions

Chapter 7 Sampling Distributions

7.1 What is a Sampling Distribution?

7.2 Sample Proportions

7.3 Sample Means

Page 2: Chapter 7 Sampling Distributions

Section 7.1 What Is a Sampling Distribution?

Learning Objectives

After this section, you should be able to…

DISTINGUISH between a parameter and a statistic

DEFINE sampling distribution

DISTINGUISH between population distribution, sampling distribution, and the distribution of sample data

DETERMINE whether a statistic is an unbiased estimator of a population parameter

DESCRIBE the relationship between sample size and the variability of an estimator

Page 3: Chapter 7 Sampling Distributions

What Is a Sampling Distribution?

Introduction

The process of statistical inference involves using information from a sample to draw conclusions about a wider population.

Different random samples yield different statistics. We need to be able to describe the sampling distribution of possible statistic values in order to perform statistical inference.

We can think of a statistic as a random variable because it takes numerical values that describe the outcomes of the random sampling process. Therefore, we can examine its probability distribution using what we learned in Chapter 6.

[Diagram: Population → Sample. Collect data from a representative sample, then make an inference about the population.]

Page 4: Chapter 7 Sampling Distributions

In Nov 2005, the Harris Poll asked 889 adults, “Do you believe in ghosts?”

40% said they did.

At almost the same time, CBS News polled 808 adults and asked the same question.

48% said they did.

Page 5: Chapter 7 Sampling Distributions

WHY THE DIFFERENCE?

What is the variability?

Why do sample proportions vary at all?

How can surveys conducted at the same time give different results?

The proportions vary from sample to sample because the samples are composed of different people.

Page 6: Chapter 7 Sampling Distributions

What would the histogram of all the sample proportions look like? We don’t know exactly where its center is, but we know that it will be at the true proportion in the population. Let us call that p.

So suppose 45% of all Americans believe in ghosts. So here p = 0.45. What is the shape of the histogram?

Don’t guess. Simulate a bunch of samples that we didn’t really draw.

Page 7: Chapter 7 Sampling Distributions

Here is the histogram of the proportions saying they believe in ghosts for simulated independent samples of 808 adults when the true proportion is p = 0.45.

[Histogram: vertical axis is # of samples]
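The simulation described above can be sketched in a few lines of Python. This is only an illustration, not the code behind the slide's histogram: the number of simulated samples (2000), the seed, and the function name `simulate_sample_proportions` are assumptions chosen for the example.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Return the sample proportion from each of num_samples simulated polls."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(num_samples):
        # Each of the n respondents says "yes" independently with probability p.
        yes_count = sum(rng.random() < p for _ in range(n))
        proportions.append(yes_count / n)
    return proportions

# Assumed settings: 2000 simulated polls of 808 adults, true proportion 0.45.
props = simulate_sample_proportions(p=0.45, n=808, num_samples=2000)

print("mean of sample proportions:", round(statistics.mean(props), 3))
print("std. dev. of sample proportions:", round(statistics.pstdev(props), 3))
print("smallest and largest:", min(props), max(props))
```

The simulated proportions cluster around 0.45 and vary from sample to sample, which is the behavior the histogram displays.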

Page 8: Chapter 7 Sampling Distributions

This histogram is a simulation of what we’d get if we could see all the proportions from all possible samples.

This is called the sampling distribution of the proportions.

Using a Normal model for this distribution, we can find how likely a sample proportion is to fall in any particular interval.

Page 9: Chapter 7 Sampling Distributions

Vocabulary

Parameter: a number that describes the population

Statistic: a number that can be computed from the sample data without making use of any unknown parameters

μ: population mean
x̄ (x-bar): sample mean
σ: population standard deviation
s: sample standard deviation
p: population proportion
p̂ (p-hat): sample proportion

Page 10: Chapter 7 Sampling Distributions

Definitions

parameter:

a number that describes the population

a parameter is a fixed number

in practice, we do not know its value because we cannot examine the entire population

Page 11: Chapter 7 Sampling Distributions

Definitions

statistic:

a number that describes a sample

the value of a statistic is known when we have taken a sample, but it can change from sample to sample

we often use a statistic to estimate an unknown parameter

Page 12: Chapter 7 Sampling Distributions

Compare

parameter
mean: μ
standard deviation: σ
proportion: p

Sometimes we call the parameters “true”: true mean, true proportion, etc.

statistic
mean: x̄
standard deviation: s
proportion: p̂

Sometimes we call the statistics “sample”: sample mean, sample proportion, etc.

Page 13: Chapter 7 Sampling Distributions

Population vs Samples

Population Parameters
Usually unknown and are estimated by sample statistics using techniques we will learn
Population Mean: μ
Population Standard Deviation: σ
Population Proportion: p

Sample Statistics
Used to estimate population parameters
Sample Mean: x̄
Sample Standard Deviation: s
Sample Proportion: p̂

Page 14: Chapter 7 Sampling Distributions

Sampling Distributions

The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

Page 15: Chapter 7 Sampling Distributions

Example: Tell which is a parameter and which is a statistic

1. On Tuesday, the bottles of Arizona Iced Tea in a plant were supposed to contain an average of 20 ounces of iced tea. Quality control inspectors sampled 50 bottles at random from the day’s production. These bottles contained an average of 19.6 ounces of iced tea.

Parameter is μ = 20 ounces of iced tea.

Statistic is x̄ = 19.6 ounces of iced tea.

Page 16: Chapter 7 Sampling Distributions

2. On a NY to Denver flight, 8% of the 125 passengers were selected for random security screening before boarding. According to the Transportation Security Administration, 10% of passengers at this airport are chosen for random screening.

Parameter is p = 0.10, or 10% of passengers.

Statistic is p-hat = 0.08, or 8% of the sample of passengers.

Page 17: Chapter 7 Sampling Distributions

Sampling variability

Given the same population, we may have multiple samples.

Should we expect the statistics for each sample to be the same?

While sample means or sample proportions are similar, they do vary. We call this sampling variability.

Page 18: Chapter 7 Sampling Distributions

To make sense of sampling variability, we ask, “What would happen if we took many samples?” Here’s how to answer:

Take a large number of samples from the same population.

Calculate the statistic (like the sample mean x-bar or sample proportion p-hat) for each sample.

Make a graph (histogram) of the values of the statistic (x-bar or p-hat).

Examine the graph for shape, center, spread, and outliers or other deviations.
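The four steps above can be tried out with a short simulation. The sketch below is hedged: the population of 10,000 bottle fill volumes is made up for illustration (it is not data from the iced tea example), and the helper names `draw_sample_means` and `text_histogram` are hypothetical.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 10,000 bottle fill volumes in ounces (made-up values).
population = [rng.gauss(20, 0.5) for _ in range(10_000)]

def draw_sample_means(population, n, num_samples, rng):
    """Steps 1 and 2: take many samples and compute x-bar for each."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(num_samples)]

def text_histogram(values, bins=8, width=40):
    """Step 3: print a rough histogram and return the bin counts."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the largest value lands in the last bin.
        counts[min(int((v - lo) / step), bins - 1)] += 1
    for i, count in enumerate(counts):
        bar = "#" * round(width * count / max(counts))
        print(f"{lo + i * step:6.2f} | {bar}")
    return counts

means = draw_sample_means(population, n=50, num_samples=1000, rng=rng)
counts = text_histogram(means)

# Step 4: summarize center and spread; the histogram above shows the shape.
print("center (mean of x-bar values):", round(statistics.mean(means), 3))
print("spread (std. dev. of x-bar values):", round(statistics.stdev(means), 3))
```

Changing `n` or the made-up population and rerunning is a quick way to see how the shape, center, and spread respond.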

Page 19: Chapter 7 Sampling Distributions

Parameters and Statistics

As we begin to use sample data to draw conclusions about a wider population, we must be clear about whether a number describes a sample or a population.

Definition:

A parameter is a number that describes some characteristic of the population. In statistical practice, the value of a parameter is usually not known because we cannot examine the entire population.

A statistic is a number that describes some characteristic of a sample. The value of a statistic can be computed directly from the sample data. We often use a statistic to estimate an unknown parameter.

Remember s and p: statistics come from samples and parameters come from populations.

We write µ (the Greek letter mu) for the population mean and x̄ ("x-bar") for the sample mean. We use p to represent a population proportion. The sample proportion p̂ ("p-hat") is used to estimate the unknown parameter p.

Page 20: Chapter 7 Sampling Distributions

Sampling Variability

This basic fact is called sampling variability: the value of a statistic varies in repeated random sampling.

To make sense of sampling variability, we ask, “What would happen if we took many samples?”

[Diagram: one population with many different samples drawn from it]

How can x̄ be an accurate estimate of µ? After all, different random samples would produce different values of x̄.

Page 21: Chapter 7 Sampling Distributions

Activity: Reaching for Chips

Follow the directions on Page 418

Take a sample of 20 chips, record the sample proportion of red chips, and return all chips to the bag.

Report your sample proportion to your teacher.

Teacher: Right-click (control-click) on the graph to edit the counts.

Page 22: Chapter 7 Sampling Distributions

Sampling Distribution

In the previous activity, we took a handful of different samples of 20 chips. There are many, many possible SRSs of size 20 from a population of size 200. If we took every one of those possible samples, calculated the sample proportion for each, and graphed all of those values, we’d have a sampling distribution.

Definition:

The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.

In practice, it’s difficult to take all possible samples of size n to obtain the actual sampling distribution of a statistic. Instead, we can use simulation to imitate the process of taking many, many samples.

One of the uses of probability theory in statistics is to obtain sampling distributions without simulation. We’ll get to the theory later.
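To see why simulation is used instead of listing every sample, the sketch below counts the possible SRSs of size 20 from 200 chips and then imitates the activity with 500 simulated samples. It assumes 100 of the 200 chips are red (matching the p = 0.5 used for the chips example); the 500 repetitions and the seed are arbitrary choices.

```python
import math
import random

POPULATION_SIZE = 200
RED_CHIPS = 100      # assumption: half the chips are red, so p = 0.5
SAMPLE_SIZE = 20

# Number of distinct SRSs of size 20 from 200 chips: "200 choose 20".
num_possible = math.comb(POPULATION_SIZE, SAMPLE_SIZE)
print(f"possible samples of size {SAMPLE_SIZE}: about {num_possible:.2e}")

# Imitate the activity: draw many samples without replacement,
# record the sample proportion of red chips, return chips to the bag.
bag = ["red"] * RED_CHIPS + ["other"] * (POPULATION_SIZE - RED_CHIPS)
rng = random.Random(7)
simulated = [
    rng.sample(bag, SAMPLE_SIZE).count("red") / SAMPLE_SIZE
    for _ in range(500)
]

print("first ten sample proportions:", simulated[:10])
print("mean of simulated proportions:", round(sum(simulated) / len(simulated), 3))
```

The count of possible samples is far too large to enumerate by hand, so a few hundred simulated samples stand in for the full sampling distribution.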

Page 23: Chapter 7 Sampling Distributions

Population Distributions vs. Sampling Distributions

There are actually three distinct distributions involved when we sample repeatedly and measure a variable of interest.

1) The population distribution gives the values of the variable for all the individuals in the population.

2) The distribution of sample data shows the values of the variable for all the individuals in the sample.

3) The sampling distribution shows the statistic values from all the possible samples of the same size from the population.
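A small simulation can make the three distributions concrete. Everything in this sketch is assumed for illustration: a made-up right-skewed population of 5,000 values, a sample size of 25, and 1,000 repeated samples whose means approximate the sampling distribution of x-bar.

```python
import random
import statistics

rng = random.Random(3)

# 1) Population distribution: a hypothetical right-skewed population (made up).
population = [rng.expovariate(1 / 10) for _ in range(5_000)]

# 2) Distribution of sample data: the individual values in ONE sample.
one_sample = rng.sample(population, 25)

# 3) Sampling distribution (approximated): x-bar from MANY samples of size 25.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(1_000)]

for label, values in [
    ("population", population),
    ("one sample of 25", one_sample),
    ("x-bar from 1,000 samples", sample_means),
]:
    print(
        f"{label:<26} count={len(values):>5} "
        f"mean={statistics.mean(values):6.2f} sd={statistics.pstdev(values):6.2f}"
    )
```

The first two lines describe individuals; the third describes a statistic, which is why its spread is much smaller.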

Page 24: Chapter 7 Sampling Distributions

Describing Sampling Distributions

The fact that statistics from random samples have definite sampling distributions allows us to answer the question, “How trustworthy is a statistic as an estimator of the parameter?” To get a complete answer, we consider the center, spread, and shape.

Definition:

A statistic used to estimate a parameter is an unbiased estimator if the mean of its sampling distribution is equal to the true value of the parameter being estimated.

Center: Biased and unbiased estimators

In the chips example, we collected many samples of size 20 and calculated the sample proportion of red chips. How well does the sample proportion estimate the true proportion of red chips, p = 0.5?

Note that the center of the approximate sampling distribution is close to 0.5. In fact, if we took ALL possible samples of size 20 and found the mean of those sample proportions, we’d get exactly 0.5.
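The claim that ALL possible samples average to exactly 0.5 can be checked without listing them. The sketch below assumes 100 red chips among 200 (so p = 0.5) and uses counting: the number of size-20 samples containing exactly k red chips is C(100, k) × C(100, 20 − k). Exact fractions avoid rounding error.

```python
from fractions import Fraction
from math import comb

N, K, n = 200, 100, 20   # chips in bag, red chips (assumed), sample size
total_samples = comb(N, n)

# Probability that an SRS contains exactly k red chips, for k = 0..20.
prob = {
    k: Fraction(comb(K, k) * comb(N - K, n - k), total_samples)
    for k in range(n + 1)
}

# Mean of the sampling distribution of p-hat = k / n.
mean_p_hat = sum(Fraction(k, n) * prob[k] for k in range(n + 1))

print("probabilities sum to:", sum(prob.values()))
print("mean of the sampling distribution of p-hat:", mean_p_hat)
```

Because the mean of the sampling distribution equals the true proportion, the sample proportion is an unbiased estimator in this setting.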

Page 25: Chapter 7 Sampling Distributions

Describing Sampling Distributions

Spread: Low variability is better!

To get a trustworthy estimate of an unknown population parameter, start by using a statistic that’s an unbiased estimator. This ensures that you won’t tend to overestimate or underestimate. Unfortunately, using an unbiased estimator doesn’t guarantee that the value of your statistic will be close to the actual parameter value.

Larger samples have a clear advantage over smaller samples. They are much more likely to produce an estimate close to the true value of the parameter.

Variability of a Statistic

The variability of a statistic is described by the spread of its sampling distribution. This spread is determined primarily by the size of the random sample. Larger samples give smaller spread. The spread of the sampling distribution does not depend on the size of the population, as long as the population is at least 10 times larger than the sample.

[Figure: sampling distributions for n = 100 and n = 1000]
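Both statements about spread can be explored by simulation. This sketch assumes a population in which half the individuals have the trait (p = 0.5) and compares sample sizes 100 and 1000 drawn from populations of 10,000 and 1,000,000; the function name `p_hat_spread`, the 500 repetitions, and the seed are illustrative choices.

```python
import random
import statistics

def p_hat_spread(pop_size, n, p=0.5, reps=500, seed=0):
    """Estimate the std. dev. of p-hat for SRSs of size n from a finite population."""
    rng = random.Random(seed)
    num_with_trait = int(pop_size * p)
    p_hats = []
    for _ in range(reps):
        # Sample individuals by index without replacement; indices below
        # num_with_trait stand for individuals who have the trait.
        chosen = rng.sample(range(pop_size), n)
        p_hats.append(sum(i < num_with_trait for i in chosen) / n)
    return statistics.pstdev(p_hats)

spreads = {}
for pop_size in (10_000, 1_000_000):
    for n in (100, 1000):
        spreads[(pop_size, n)] = p_hat_spread(pop_size, n)
        print(f"population={pop_size:>9,}  n={n:>4}  spread={spreads[(pop_size, n)]:.4f}")
```

In a run of this sketch, the spread should drop sharply when n goes from 100 to 1000 and change very little when the population grows a hundredfold.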

Page 26: Chapter 7 Sampling Distributions

Describing Sampling Distributions

Bias, variability, and shape

We can think of the true value of the population parameter as the bull’s-eye on a target and of the sample statistic as an arrow fired at the target. Both bias and variability describe what happens when we take many shots at the target.

Bias means that our aim is off and we consistently miss the bull’s-eye in the same direction. Our sample values do not center on the population value.

High variability means that repeated shots are widely scattered on the target. Repeated samples do not give very similar results.

The lesson about center and spread is clear: given a choice of statistics to estimate an unknown parameter, choose one with no or low bias and minimum variability.

Page 27: Chapter 7 Sampling Distributions

Describing Sampling Distributions

Bias, variability, and shape

Sampling distributions can take on many shapes. The same statistic can have sampling distributions with different shapes depending on the population distribution and the sample size. Be sure to consider the shape of the sampling distribution before doing inference.

Sampling distributions for different statistics used to estimate the number of tanks in the German Tank problem. The blue line represents the true number of tanks.

Note the different shapes. Which statistic gives the best estimator? Why?
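One way to explore the question is to simulate several candidate statistics and compare their bias and spread. The slide does not say which statistics were plotted, so the three below are assumptions chosen for illustration, as are the true number of tanks (300), the sample size (5), and the 5,000 repetitions.

```python
import random
import statistics

TRUE_N = 300       # hypothetical true number of tanks (serial numbers 1..300)
SAMPLE_SIZE = 5    # hypothetical number of captured serial numbers
rng = random.Random(11)

# Candidate statistics (assumed for illustration, not taken from the slide).
estimators = {
    "max": lambda s: max(s),
    "2 * mean - 1": lambda s: 2 * statistics.mean(s) - 1,
    "max * (n + 1) / n - 1": lambda s: max(s) * (len(s) + 1) / len(s) - 1,
}

results = {name: [] for name in estimators}
for _ in range(5000):
    serials = rng.sample(range(1, TRUE_N + 1), SAMPLE_SIZE)
    for name, estimate in estimators.items():
        results[name].append(estimate(serials))

summary = {}
for name, values in results.items():
    bias = statistics.mean(values) - TRUE_N      # center minus true value
    spread = statistics.pstdev(values)           # variability of the statistic
    summary[name] = (bias, spread)
    print(f"{name:<24} bias={bias:8.2f}  spread={spread:7.2f}")
```

A statistic whose bias is near zero and whose spread is small is the better choice, in line with the lesson about center and spread.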

Page 28: Chapter 7 Sampling Distributions

Section 7.1 What Is a Sampling Distribution?

Summary

In this section, we learned that…

A parameter is a number that describes a population. To estimate an unknown parameter, use a statistic calculated from a sample.

The population distribution of a variable describes the values of the variable for all individuals in a population. The sampling distribution of a statistic describes the values of the statistic in all possible samples of the same size from the same population.

A statistic can be an unbiased estimator or a biased estimator of a parameter. Bias means that the center (mean) of the sampling distribution is not equal to the true value of the parameter.

The variability of a statistic is described by the spread of its sampling distribution. Larger samples give smaller spread.

When trying to estimate a parameter, choose a statistic with low or no bias and minimum variability. Don’t forget to consider the shape of the sampling distribution before doing inference.

Page 29: Chapter 7 Sampling Distributions

Looking Ahead…

In the next Section…

We’ll learn how to describe and use the sampling distribution of sample proportions.

We’ll learn about

The sampling distribution of p̂

Using the Normal approximation for p̂
