Overview - Sam Houston State University

Post on 24-Oct-2021


Overview

7.1 Introduction to Sampling Distributions

7.2 Central Limit Theorem for Means

7.3 Central Limit Theorem for Proportions

7.1 Introduction to Sampling Distributions

Objectives:

By the end of this section, I will be able to…

1) Compute point estimates and sampling error.

2) Explain the sampling distribution of the sample mean.

3) Describe the sampling distribution of the sample mean when the population is normal.

4) Use normal probability plots to assess normality.

5) Find probabilities and percentiles for the sample mean when the population is normal.


Point Estimates

Use known statistics to estimate unknown parameters and report a single number as the estimate.

The value of the statistic is called the point estimate.

Table 7.1 Point estimation: Use statistics to estimate unknown population parameters

Sampling error

The distance between the point estimate and its target parameter

Table 7.3 Sampling error for common characteristics

Page 349

Example

Do problems 10 and 12

Solutions:

10) $|s - \sigma| = |0.9 - 1.0| = 0.1$

12) $|\hat{p} - p| = |0.70 - 0.75| = 0.05$

Example 7.3 - Commuting times for student government members

We are interested in how long it takes the five members of the student government to commute to school. The times (in minutes) are given in Table 7.4. Since these five people are all the members of the student government, we can consider them to constitute a population.

Example 7.3 continued

a. Calculate the population mean commuting time $\mu$.

b. Take a sample of the following student government members: Amber, Brandon, and Chantal. Find the sample mean commuting time $\bar{x}$ and the sampling error $|\bar{x} - \mu|$.

Table 7.4 Commuting times for the five members of the student government

Example 7.3 continued Solution

The mean commuting time of this population is

$\mu = \frac{\sum x}{N} = \frac{10 + 20 + 5 + 30 + 15}{5} = 16$ minutes

For Amber, Brandon, and Chantal, the sample mean commuting time is

$\bar{x}_1 = \frac{\sum x}{n} = \frac{10 + 20 + 5}{3} \approx 11.67$ minutes

The sampling error of this sample mean is

$|\bar{x}_1 - \mu| = |11.67 - 16| = 4.33$ minutes.
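The arithmetic in Example 7.3 can be checked with a short Python sketch. The variable names are illustrative, and treating the first three values as the times for Amber, Brandon, and Chantal is an assumption read off the sums in the solution (Table 7.4 itself is not reproduced in this transcript).

```python
from statistics import mean

# Commuting times in minutes for the five-member population.
# Values come from the sum in the solution; their order is an assumption.
commute_times = [10, 20, 5, 30, 15]

mu = mean(commute_times)            # population mean
sample = commute_times[:3]          # assumed: Amber, Brandon, Chantal
x_bar = mean(sample)                # sample mean
sampling_error = abs(x_bar - mu)    # distance between estimate and parameter

print(f"mu = {mu} minutes")
print(f"x-bar = {x_bar:.2f} minutes")
print(f"sampling error = {sampling_error:.2f} minutes")
```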

Table 7.5 All possible samples of size 3 from population of student government members

Consider the previous example and find all possible samples of student government members, the sample means, the population mean, and the sampling errors (page 339):

Sampling distribution of the sample mean $\bar{x}$

Collection of the sample means of all possible samples of size n

Calculate the mean of the sample means as the average of the sample means:

Note: the mean of the sample means equals the population mean in this example.
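As a rough illustration of what "all possible samples" means here, the sketch below enumerates every sample of size 3 from the five commuting times and averages the resulting sample means. The data values are taken from the Example 7.3 solution; the variable names and the use of `itertools.combinations` are choices made for this sketch, not part of the slides.

```python
from itertools import combinations
from statistics import mean, pstdev

commute_times = [10, 20, 5, 30, 15]   # population, in minutes
mu = mean(commute_times)

# Every possible sample of size 3 (order does not matter): 10 samples.
samples = list(combinations(commute_times, 3))
sample_means = [mean(s) for s in samples]

for s, m in zip(samples, sample_means):
    print(s, f"mean = {m:.2f}", f"sampling error = {abs(m - mu):.2f}")

mean_of_means = mean(sample_means)
sd_of_means = pstdev(sample_means)    # spread of the ten sample means
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"standard deviation of sample means = {sd_of_means:.4f}")
```

The mean of the ten sample means comes out equal to the population mean of 16 minutes, even though no individual sample mean has to equal it.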

Mean of the Sample Means

Fact 1

The mean of the sampling distribution of the sample mean $\bar{x}$ is the value of the population mean $\mu$.

Denoted as $\mu_{\bar{x}} = \mu$

Read as “the mean of the sampling distribution of $\bar{x}$ is $\mu$”

For the previous example

Mean of the Sample Means

$\mu_{\bar{x}} = \mu = 16$ minutes

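Standard Deviation of the Population

$\sigma = 8.6023$ minutes for the five commuting times; the standard deviation of the ten sample means is $\sigma_{\bar{x}} = 3.5119$ minutes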

Standard Deviation of the Sample Means

Fact 2

The standard deviation of the sampling distribution of the sample mean $\bar{x}$ is

$\sigma_{\bar{x}} = \sigma / \sqrt{n}$

where $\sigma$ is the population standard deviation and n is the sample size. The population is assumed to be very large relative to the sample (if it is not, see the note on page 340).

Previous example

x

Page 350

Example

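Solutions

a) $\mu_{\bar{x}} = 68$ inches

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 3 \text{ inches} / \sqrt{10} \approx 0.95$ inches

b) $\mu_{\bar{x}} = 68$ inches

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 3 \text{ inches} / \sqrt{100} = 0.30$ inches

c) $\mu_{\bar{x}} = 68$ inches

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 3 \text{ inches} / \sqrt{1000} \approx 0.09$ inches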
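The three parts above differ only in the sample size, so the pattern is easy to see in a few lines of Python. This is a minimal sketch; `standard_error` is a helper name introduced here for illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 3  # population standard deviation in inches, as in the example
for n in (10, 100, 1000):
    print(f"n = {n:4d}: sigma_xbar = {standard_error(sigma, n):.2f} inches")
```

The mean of the sampling distribution stays at 68 inches in every case; only the spread shrinks as n grows.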

Sampling Distribution of the Sample Mean for a Normal Population

Fact 3

Is itself normal, regardless of sample size

Sampling Distribution of the Sample Mean for a Normal Population

Fact 4

Distributed as normal with mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \sigma / \sqrt{n}$

Fact 5: Standardizing a Normal Sampling Distribution for Means

When the sampling distribution of $\bar{x}$ is normal, we may standardize to produce the standard normal random variable Z as follows:

$Z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

where $\mu$ is the population mean, $\sigma$ is the population standard deviation, and n is the sample size.

Page 350

Example

Solutions:

$\mu_{\bar{x}} = \mu = \$50,000$

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = \$5000 / \sqrt{25} = \$1000$

a) The Z-score for a sample mean of $52,000 is

$Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{52000 - 50000}{1000} = 2$

Look up Z = 2 in Table C:

$P(\bar{x} > \$52,000) = P(Z > 2) = 1 - P(Z < 2) = 1 - 0.9772 = 0.0228$

(b) calculator

$P(\bar{x} < \$47,000) = \text{normalcdf}(-10^{99}, 47000, 50000, 1000) = 0.0013$

(c) calculator

$P(\$52,000 < \bar{x} < \$53,000) = \text{normalcdf}(52000, 53000, 50000, 1000) = 0.0214$
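The three probabilities in this example can be reproduced without a table or a graphing calculator. The sketch below uses Python's `statistics.NormalDist` as a stand-in for the calculator's normalcdf; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma, n = 50_000, 5_000, 25

# Sampling distribution of the sample mean: normal(mu, sigma / sqrt(n)).
x_bar_dist = NormalDist(mu, sigma / n ** 0.5)

z = (52_000 - mu) / x_bar_dist.stdev                      # Z-score for 52,000
p_a = 1 - x_bar_dist.cdf(52_000)                          # P(x-bar > 52,000)
p_b = x_bar_dist.cdf(47_000)                              # P(x-bar < 47,000)
p_c = x_bar_dist.cdf(53_000) - x_bar_dist.cdf(52_000)     # P(52,000 < x-bar < 53,000)

print(f"Z = {z:.2f}")
print(f"(a) {p_a:.4f}")
print(f"(b) {p_b:.4f}")
print(f"(c) {p_c:.4f}")
```

Unrounded arithmetic gives about 0.0214 for the last probability; subtracting table values that were each rounded to four places can differ from this in the last digit.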

Page 350

Example

Do part (a)

Solution:

a) $50,000 (note: for a normal distribution, the mean and the median are equal)

Example

Page 350

Example

Do part (b)

Solutions:

b) Find $X_C$ so that

$P(\bar{x} < X_C) = 0.95$

for a normal distribution with

$\mu_{\bar{x}} = \$50,000$

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = \$5000 / \sqrt{25} = \$1000$

First find $Z_C$ so that

$P(Z < Z_C) = 0.95$

using the standard normal distribution. From Table C we get that:

$Z_C = 1.645$

Use $Z_C$ to get $X_C$ by using the z-score formula:

$Z_C = \frac{X_C - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$

$1.645 = \frac{X_C - \$50,000}{\$1000}$

$X_C = \$51,645$

Solutions (directly with calculator):

b) $X_C = \text{invNorm}(0.95, 50000, 1000) = \$51,644.85$

Page 350

Example

Do parts (c) and (d)

Solutions:

c) $48,355

$X_C = \text{invNorm}(0.05, 50000, 1000) = \$48,355.15$

d) $48,355 and $51,645
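The percentile calculations can be sketched in Python as well. Here `NormalDist.inv_cdf` plays the role of the calculator's invNorm; this is a stand-in chosen for the sketch, and the variable names are illustrative.

```python
from statistics import NormalDist

# Sampling distribution of the sample mean: normal(50,000, 5,000 / sqrt(25)).
x_bar_dist = NormalDist(mu=50_000, sigma=5_000 / 25 ** 0.5)

p95 = x_bar_dist.inv_cdf(0.95)   # 95th percentile of the sample mean
p05 = x_bar_dist.inv_cdf(0.05)   # 5th percentile of the sample mean

print(f"95th percentile: ${p95:,.2f}")
print(f"5th percentile:  ${p05:,.2f}")
```

Because the normal curve is symmetric, the two percentiles sit the same distance (about $1,645) on either side of the mean.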

Normal Probability Plots

A normal probability plot is a scatterplot of the estimated cumulative normal probabilities (expressed as percents) against the corresponding data values in a data set.

Normal Probability Plot

FIGURE 7.4 Normal probability plot of normal data.

Analyzing Normal Probability Plots

If the points in the normal probability plot either cluster around a straight line or nearly all fall within the curved bounds, then it is likely that the data set is normal.

Systematic deviations off the straight line are evidence against the claim that the data set is normal.
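To make the construction of a normal probability plot concrete, the sketch below computes the points that would be plotted. It rests on two assumptions not stated in the slides: the cumulative probability for the i-th smallest value is estimated as (i - 0.5)/n (one common convention; software packages differ), and the data values are made up for illustration.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each sorted data value with an estimated cumulative percent
    and the matching standard normal quantile."""
    n = len(data)
    points = []
    for i, value in enumerate(sorted(data), start=1):
        prob = (i - 0.5) / n               # assumed plotting position
        z = NormalDist().inv_cdf(prob)     # theoretical normal quantile
        points.append((value, 100 * prob, z))
    return points

# Hypothetical data, for illustration only.
data = [4.1, 5.0, 5.3, 5.9, 6.2, 6.8, 7.4]
for value, percent, z in normal_plot_points(data):
    print(f"{value:5.1f}  {percent:5.1f}%  z = {z:+.2f}")
```

If the data are roughly normal, plotting the data values against the z column gives points that fall close to a straight line.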

Normal Probability Plot

FIGURE 7.5 Normal probability plot of right-skewed data.

Summary

We can use sample statistics as point estimates of the unknown population parameters.

For each statistic, sampling error is the distance between the point estimate and its target parameter.

The sampling distribution of the sample mean $\bar{x}$ for a given sample size n consists of the collection of the sample means of all possible samples of size n from the population.

Summary

The mean of the sampling distribution of the sample mean $\bar{x}$ is the value of the population mean μ, that is, $\mu_{\bar{x}} = \mu$ (Fact 1).

The standard deviation of the sampling distribution of the sample mean is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$, where σ is the population standard deviation (Fact 2).

The sampling distribution of the sample mean for a normal population is itself normal, regardless of sample size (Fact 3).

Summary

For a normal population, the sampling distribution of the sample mean $\bar{x}$ is distributed as normal (μ, $\sigma / \sqrt{n}$), where μ is the population mean and σ is the population standard deviation (Fact 4).

Normal probability plots are used to assess the normality of a data set.

We can use Fact 4 to find probabilities and percentiles using sample means.

7.2 Central Limit Theorem for Means

Objectives:

By the end of this section, I will be able to…

1) Describe the sampling distribution of $\bar{x}$ for skewed and symmetric populations as the sample size increases.

2) Apply the Central Limit Theorem for Means to solve probability questions about the sample mean.

Main Idea

In this section we want to be able to describe the shape of the distribution of the sample means.

Symmetric Populations

For a symmetric distribution, at n=20 the sampling distribution of the mean is approximately normal.

Roll a fair die once. Make up a table that represents the probability distribution of X=number on the die. Also, plot the probability distribution in a bar chart.

Example

Distribution (table format) is:

x    P(x)
1    0.1667
2    0.1667
3    0.1667
4    0.1667
5    0.1667
6    0.1667

Example

$\mu = \sum x \cdot P(x) = 3.5$

FIGURE 7.11 Distribution of a single fair die roll is symmetric.

Roll a fair die one hundred times. Take random samples of size 10 from these 100 rolls. For each sample of size 10, calculate the sample mean. Plot the distribution of the means as a histogram.

Example

FIGURE 7.12 Sample means of size n = 10: already approaching normality.

Roll a fair die one hundred times. Take random samples of size 20 from these 100 rolls. For each sample of size 20, calculate the sample mean. Plot the distribution of the means as a histogram.

Example

FIGURE 7.13 Sample means of size n = 20: approximately normal.

FIGURE 7.14 Normal probability plot for n = 20: acceptable normality.
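The die-rolling experiment described above can be imitated with a small simulation. This is only a sketch: the seed, the number of repeated samples (1000), and the helper name `sample_means` are assumptions made here, and the histograms are replaced by printed summaries.

```python
import random
from statistics import mean, pstdev

random.seed(7)  # arbitrary seed so the run is repeatable

rolls = [random.randint(1, 6) for _ in range(100)]   # 100 fair die rolls

def sample_means(population, n, num_samples=1000):
    """Means of repeated random samples of size n drawn from `population`."""
    return [mean(random.sample(population, n)) for _ in range(num_samples)]

for n in (10, 20):
    means = sample_means(rolls, n)
    print(f"n = {n}: mean of sample means = {mean(means):.2f}, "
          f"standard deviation = {pstdev(means):.2f}")
```

The sample means cluster around the mean of the 100 rolls, and their spread is smaller for n = 20 than for n = 10, which is what the histograms in Figures 7.12 and 7.13 display.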

Skewed Populations

For a skewed population, the sampling distribution of the mean becomes approximately normal as the sample size approaches 30.

FIGURE 7.10 Sampling distribution of $\bar{x}$ and normal probability plots for n = 10, 20, and 30.

Central Limit Theorem for Means

Given a population with mean μ and standard deviation σ:

The sampling distribution of the sample mean $\bar{x}$ becomes approximately normal with mean μ and standard deviation $\sigma / \sqrt{n}$ as the sample size gets larger,

regardless of the shape of the population.

Rule of Thumb

We consider n ≥ 30 as large enough to apply the Central Limit Theorem for any population.

Three Cases for the Sampling Distribution of the Sample Mean $\bar{x}$

Case 1

The population is normal.

Then the sampling distribution of $\bar{x}$ is normal (Fact 3 from 7.1).

Three Cases for the Sampling Distribution of the Sample Mean $\bar{x}$ continued

Case 2

The population is either non-normal or of unknown distribution and the sample size is at least 30.

Apply Central Limit Theorem for Means:

The sampling distribution of the sample mean $\bar{x}$ is approximately normal.

Three Cases for the Sampling Distribution of the Sample Mean $\bar{x}$ continued

Case 3

The population is either non-normal or of unknown distribution and the sample size is less than 30.

There is insufficient information to conclude that the sampling distribution of the sample mean $\bar{x}$ is either normal or approximately normal.
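The three cases amount to a short decision rule, sketched here in Python. The function name and the wording of the returned strings are illustrative choices; the n ≥ 30 cutoff is the rule of thumb stated above.

```python
def sampling_distribution_case(population_is_normal, n):
    """Classify the sampling distribution of the sample mean by the three cases."""
    if population_is_normal:
        return "Case 1: normal"
    if n >= 30:
        return "Case 2: approximately normal (Central Limit Theorem)"
    return "Case 3: unknown (insufficient information)"

print(sampling_distribution_case(True, 5))
print(sampling_distribution_case(False, 64))
print(sampling_distribution_case(False, 16))
```

Note that the sample size matters only when the population is not known to be normal.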

Page 362-363

Example

Solutions

6) Case 2

8) Case 3

10) Case 1

Example

Page 363

Example

Solutions

16(a) $\mu_{\bar{x}} = \$60,000$

16(b) $\sigma_{\bar{x}} = \sigma / \sqrt{n} = \$10,000 / \sqrt{16} = \$2,500$

16(c) unknown

Page 363

Example

Solutions

20(a) $\mu_{\bar{x}} = 50$ miles per gallon

20(b) $\sigma_{\bar{x}} = \sigma / \sqrt{n} = 6 / \sqrt{64} = 0.75$ mpg

20(c) approximately normal

Page 363

Example

Solution:

First notice that it is possible to find the probability: the systolic blood pressure readings are normally distributed, so the distribution of the sample mean is also normal (Case 1).

$\mu_{\bar{x}} = 80$

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 8 / \sqrt{25} = 1.6$

Solution:

$P(78 < \bar{x} < 82) = \text{normalcdf}(78, 82, 80, 1.6) = 0.7887$

Page 363

Example

Solution

Not possible: since the pollen count is not normally distributed and the sample size is smaller than 30, the sampling distribution of the sample mean $\bar{x}$ is unknown (Case 3).

Example

Page 363

Example

Solution:

First notice that it is possible to find the probability: even though the pollen count distribution is not normal, the sample size is at least 30, so the distribution of the sample mean is approximately normal (Case 2).

$\mu_{\bar{x}} = 8$

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 1 / \sqrt{64} = 0.125$

Solution (directly with calculator):

Find $X_C$ so that $P(\bar{x} < X_C) = 0.75$

$X_C = \text{invNorm}(0.75, 8, 0.125) \approx 8.1$

Page 364

Example

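Solution:

(a) Yes, Case 1 applies.

$\mu_{\bar{x}} = 38.6^\circ$

$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 10^\circ / \sqrt{25} = 2^\circ$

$P(\bar{x} > 40^\circ) = 1 - P(\bar{x} < 40^\circ) = 1 - 0.7580 = 0.2420$

Solution:

(b) Case 2 does not apply since the sample size is less than 30.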

Example

Summary

In this section, we examine the behavior of the sample mean when the population is not normal.

The approximate normality of the sampling distribution of the sample mean kicks in much quicker when the original population is symmetric rather than skewed.

The Central Limit Theorem is one of the most important results in statistics and is stated as follows:

Summary

Given a population with mean μ and standard deviation σ, the sampling distribution of the sample mean $\bar{x}$ becomes approximately normal (μ, $\sigma / \sqrt{n}$) as the sample size gets larger, regardless of the shape of the population.

This approximation applies for smaller sample sizes when the original distribution is more symmetric.

7.3 Central Limit Theorem for Proportions

Objectives:

By the end of this section, I will be able to…

1) Explain the sampling distribution of the sample proportion $\hat{p}$.

2) Describe the sampling distribution of the sample proportion $\hat{p}$ for extreme and moderate values of p.

3) Apply the Central Limit Theorem for Proportions to solve probability questions about the sample proportion.

Sample Proportion $\hat{p}$

Suppose each individual in a population either has or does not have a particular characteristic.

For a sample of size n, the sample proportion $\hat{p}$ (read “p-hat”) is

$\hat{p} = \frac{x}{n}$

where x represents the number of individuals in the sample that have the particular characteristic.

Use $\hat{p}$ to estimate the unknown value of the population proportion p.

Example

Page 367, Example 7.14

Table 7.6 All possible samples of size 3 from population of student government members

Example 7.15 - Mean of sample proportions

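Calculate the mean of the ten sample proportions $\hat{p}$ from Table 7.6 (page 367).

Example 7.15 continued

Solution

The mean is

$\mu_{\hat{p}} = \frac{\frac{2}{3} + \frac{1}{3} + \frac{2}{3} + \frac{2}{3} + \frac{3}{3} + \frac{2}{3} + \frac{1}{3} + \frac{2}{3} + \frac{1}{3} + \frac{2}{3}}{10} = 0.6$

which equals the population proportion of females for the original population, p = 3/5 = 0.6.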

Fact 6: Mean of the Sampling Distribution of the Sample Proportion $\hat{p}$

The mean of the sampling distribution of $\hat{p}$ is the value of the population proportion p.

Denoted as $\mu_{\hat{p}} = p$

Read as “the mean of the sampling distribution of $\hat{p}$ is p”

Fact 7 - Standard Deviation of the Sampling Distribution of the Sample Proportion $\hat{p}$

$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{pq}{n}}$

where

p is the population proportion

$q = 1 - p$

n is the sample size

Example

Page 379

Do 8 and 10 parts (a) and (b)

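Solutions

8(a) $\mu_{\hat{p}} = p = 0.5$

8(b) $q = 1 - p = 0.5$

$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.5)(0.5)}{5}} = 0.2236$

Solutions

10(a) $\mu_{\hat{p}} = p = 0.01$

10(b) $q = 1 - p = 0.99$

$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.01)(0.99)}{500}} = 0.0044$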

Fact 8 - Conditions for Approximate Normality for the Sampling Distribution of the Sample Proportion $\hat{p}$

The sampling distribution of $\hat{p}$ may be considered approximately normal only if both the following conditions hold:

(1) np ≥ 5 and (2) n(1 - p) ≥ 5

Example

Page 379

Do part (c)

Solutions

8(c)

$np = 5(0.5) = 2.5 < 5$

$n(1 - p) = nq = 5(0.5) = 2.5 < 5$

Unknown (we cannot conclude that the sampling distribution of the proportion is normal in this case)

Solutions

10(c)

$np = 500(0.01) = 5 \ge 5$

$n(1 - p) = nq = 500(0.99) = 495 \ge 5$

We conclude that the sampling distribution of the proportion is approximately normal in this case
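The two conditions in Fact 8 can be checked mechanically. The sketch below uses a hypothetical helper named `approx_normal_ok`; it converts p to an exact fraction so that a boundary case such as np = 5 is not affected by floating-point round-off, which is a design choice of this sketch rather than anything in the slides.

```python
from fractions import Fraction

def approx_normal_ok(n, p):
    """Check Fact 8: np >= 5 and n(1 - p) >= 5."""
    p = Fraction(str(p))   # exact arithmetic at the boundary np = 5
    return n * p >= 5 and n * (1 - p) >= 5

print(approx_normal_ok(5, 0.5))      # np = 2.5 and nq = 2.5
print(approx_normal_ok(500, 0.01))   # np = 5 and nq = 495
```

Both conditions must hold; a large value of one product does not compensate for a small value of the other.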

Fact 8 continued

Given a value for p, the minimum sample size required to produce approximate normality in the sampling distribution of the proportion can be found by solving each of these for n:

$np \ge 5$ and $nq \ge 5$

and choosing the smallest integer value that is at least as large as both of these values for n.

Example

Page 379

Do problem 16

Solutions

16)

$np \ge 5 \Rightarrow n(0.05) \ge 5 \Rightarrow n \ge 100$

$nq \ge 5 \Rightarrow n(0.95) \ge 5 \Rightarrow n \ge 5.26$

The minimum sample size is 100
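Solving both inequalities for n and taking the larger bound can be written as a one-line rule. In this sketch the helper name `min_sample_size` is hypothetical, and exact fractions are used so that a bound such as 5/0.05 = 100 is not pushed to 101 by round-off.

```python
from fractions import Fraction
from math import ceil

def min_sample_size(p):
    """Smallest integer n with np >= 5 and n(1 - p) >= 5."""
    p = Fraction(str(p))
    return max(ceil(5 / p), ceil(5 / (1 - p)))

print(min_sample_size(0.05))   # driven by np >= 5
print(min_sample_size(0.87))   # driven by n(1 - p) >= 5
```

The binding condition is always the one involving the smaller of p and 1 - p, which is why extreme proportions need larger samples.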

Fact 9 - Standardizing a Normal Sampling Distribution for Proportions

When the sampling distribution of $\hat{p}$ is normal (or approximately normal), we can standardize to produce the standard normal Z:

$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\frac{p(1 - p)}{n}}}$

where p is the population proportion of successes and n is the sample size.

Example

Page 379

Do problem 30

Solutions

30) First check that we can assume a normal distribution:

$np = 5(0.5) = 2.5 < 5$

$n(1 - p) = nq = 5(0.5) = 2.5 < 5$

We cannot conclude that the sampling distribution of the proportion is normal in this case and we cannot find the probability.

Example

Page 379

Do problem 32

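Solutions

First check that we can assume a normal distribution for $\hat{p}$:

$np = 500(0.01) = 5 \ge 5$

$n(1 - p) = nq = 500(0.99) = 495 \ge 5$

Solutions

For normal distribution use

mean: $\mu_{\hat{p}} = p = 0.01$

Standard deviation: $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.01)(0.99)}{500}} = 0.0044$

Solutions

z-score method (Table)

$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.011 - 0.01}{0.0044} = 0.23$

Solutions

using standard normal distribution and table lookup (from Table T-10):

$P(Z > 0.23) = 1 - P(Z < 0.23) = 1 - 0.5910 = 0.4090$

Solutions

It is more accurate to compute Z without rounding the standard deviation first, as

$Z = \frac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}} = \frac{0.011 - 0.01}{\sqrt{(0.01)(0.99)/500}} = 0.22$

so that:

$P(Z > 0.22) = 1 - P(Z < 0.22) = 1 - 0.5871 = 0.4129$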
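The effect of rounding the standard deviation before dividing can be seen directly in a short sketch. The values n = 500, p = 0.01 and the sample proportion 0.011 are taken from the solution above; the variable names are illustrative, and `statistics.NormalDist` stands in for the table lookup.

```python
from math import sqrt
from statistics import NormalDist

n, p, p_hat = 500, 0.01, 0.011
sigma_p_hat = sqrt(p * (1 - p) / n)

z_rounded_sigma = (p_hat - p) / round(sigma_p_hat, 4)   # divides by 0.0044
z_full = (p_hat - p) / sigma_p_hat                      # no intermediate rounding

std_normal = NormalDist()
print(f"Z with rounded sigma: {z_rounded_sigma:.4f}")
print(f"Z without rounding:   {z_full:.4f}")
print(f"P(Z > z) without any rounding: {1 - std_normal.cdf(z_full):.4f}")
```

Rounding the standard deviation to 0.0044 pushes Z up to 0.23, while the unrounded ratio rounds to 0.22; carrying full precision all the way through gives a probability between the two table answers.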

Example

Page 379

Do problem 36

Solutions

First check that we can assume a normal distribution for $\hat{p}$:

$np = 400(0.5) = 200$

$n(1 - p) = nq = 400(0.50) = 200$

Yes- both are greater than 5

Solutions

Find the value $\hat{p}_C$ so that

$P(\hat{p} < \hat{p}_C) = 0.90$

$\hat{p}_C$ is the 90th percentile of values of $\hat{p}$

Solutions

For normal distribution use

mean: $\mu_{\hat{p}} = p = 0.5$

standard deviation: $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.5)(0.5)}{400}} = 0.025$

Solutions

Using calculator gives:

$\hat{p}_C = \text{invNorm}(0.90, 0.5, 0.025) = 0.532$

Central Limit Theorem for Proportions

The sampling distribution of the sample proportion $\hat{p}$ follows an approximately normal distribution with

mean $\mu_{\hat{p}} = p$

standard deviation $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{pq}{n}}$

when the following conditions are satisfied:

np ≥ 5 and n(1 - p) ≥ 5

Example

Page 379

Solutions

(a) Take p=0.87 and choose the smallest integer value of n so that both of the following are true:

$np = n(0.87) \ge 5$

$n(1 - p) = nq = n(0.13) \ge 5$

Answer: $n \ge n^* = 39$

Solutions

(b) Check that we can assume a normal distribution for $\hat{p}$ with a sample size of n=39

$np = 39(0.87) = 33.93$

$n(1 - p) = nq = 39(0.13) = 5.07$

Yes- both larger than 5

Solutions

(c) The Central Limit Theorem tells us that the sampling distribution of $\hat{p}$ is approximately normal when n=39

Use n=50

(d) The Central Limit Theorem gives that the sampling distribution of $\hat{p}$ is approximately normal with

$\mu_{\hat{p}} = p = 0.87$

$\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} = \sqrt{\frac{0.87(0.13)}{50}} = 0.0476$

(d) For n=50, find

$P\left(\hat{p} > \frac{45}{50}\right) = P(\hat{p} > 0.90) = \text{normalcdf}(0.90, 10^{99}, 0.87, 0.0476) = 0.2643$
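The final probability can be reproduced in Python as a check. This is a sketch under the stated values (p = 0.87, n = 50, cutoff 45/50); `statistics.NormalDist` stands in for the calculator's normalcdf, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 50, 0.87

# Approximate sampling distribution of p-hat: normal(p, sqrt(p(1 - p)/n)).
p_hat_dist = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

prob = 1 - p_hat_dist.cdf(45 / n)   # P(p-hat > 0.90)

print(f"sigma_p_hat = {p_hat_dist.stdev:.4f}")
print(f"P(p-hat > 0.90) = {prob:.4f}")
```

Because this sketch does not round the standard deviation to 0.0476 before computing the probability, its result can differ from the calculator line in the fourth decimal place.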

Summary

The sampling distribution of the sample proportion $\hat{p}$ for a given sample size n consists of the collection of the sample proportions of all possible samples of size n from the population.

The approximate normality of the sampling distribution of the sample proportion kicks in much quicker when the population proportion is moderate rather than extreme.

Summary

According to the Central Limit Theorem for Proportions, the sampling distribution of the sample proportion $\hat{p}$ follows an approximately normal distribution with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1 - p)/n}$ when the following conditions are satisfied: (1) np ≥ 5 and (2) n(1 - p) ≥ 5.

We can use Fact 9 to find probabilities and percentiles for sample proportions.