
1 ES Chapter 11: Goals Investigate the variability in sample statistics from sample to sample Find...

Date post: 06-Jan-2018
Upload: basil-benson
Page 1: 1 ES Chapter 11: Goals Investigate the variability in sample statistics from sample to sample Find measures of central tendency for sample statistics Find.

1

ES Chapter 11: Goals

• Investigate the variability in sample statistics from sample to sample

• Find measures of central tendency for sample statistics

• Find measures of dispersion for sample statistics

• Find the pattern of variability for sample statistics

ES Sampling Error and the Need for Sampling Distributions

Income Tax: The Internal Revenue Service (IRS) publishes annual figures on individual income tax returns in Statistics of Income, Individual Income Tax Returns. For the year 2005, the IRS reported that the mean tax of individual income tax returns was $10,319. In actuality, the IRS reported the mean tax of a sample of 292,966 individual income tax returns from a total of more than 130 million such returns.

a. Identify the population under consideration.

b. Identify the variable under consideration.

c. Is the mean tax reported by the IRS a sample mean or the population mean?

d. Should we expect the mean tax, \( \bar{x} \), of the 292,966 returns sampled by the IRS to be exactly the same as the mean tax, \( \mu \), of all individual income tax returns for 2005?

e. How can we answer questions about sampling error? For instance, is the sample mean tax, \( \bar{x} \), reported by the IRS likely to be within $100 of the population mean tax, \( \mu \)?

ES Sample Variability

[Figure: Empirical Distribution of Sample Means. Histogram with Sample Mean on the horizontal axis (6.8 to 11.2) and Frequency on the vertical axis (0 to 9).]

ES Those People!!!

This example involves those annoying people who spend what seems to us an unreasonable amount of time vacating the parking space we are waiting for. Ruback and Juieng (1997) ran a simple study in which they divided drivers into two groups of 100 participants each: those who had someone waiting for their space and those who did not.

They then recorded the amount of time that it took each driver to leave the parking space. For those drivers who had no one waiting, it took an average of 32.15 seconds to leave the space. For those who did have someone waiting, it took an average of 39.03 seconds. For each of these groups the standard deviation of waiting times was 14.6 seconds. Notice that, on average, a driver took 39.03 - 32.15 = 6.88 seconds longer to leave a space when someone was waiting for it. (If you think about it, 6.88 seconds is a long time if you are the person doing the waiting.)

ES The Chesapeake and Ohio Freight Study

Can relatively small samples actually provide results that are nearly as accurate as those obtained from a census? Statisticians have proven that such is the case, but a real study with sample and census results can be enlightening.

When a freight shipment travels over several railroads, the revenue from the freight charge is appropriately divided among those railroads. A waybill, which accompanies each freight shipment, provides information on the goods, route, and total charges. From the waybill, the amount due each railroad can be calculated.

Calculating these allocations for a large number of shipments is time-consuming and costly. If the division of total revenue to the railroads could be done accurately on the basis of a sample, as statisticians contend, considerable savings could be realized in accounting and clerical costs.

To convince themselves of the validity of the sampling approach, officials of the Chesapeake and Ohio Railroad Company (C&O) undertook a study of freight shipments that had traveled over its Pere Marquette district and another railroad during a 6-month period. The total number of waybills for that period (22,984) and the total freight revenue were known.

ES The Chesapeake and Ohio Freight Study

The study used statistical theory to determine the smallest number of waybills needed to estimate, with a prescribed accuracy, the total freight revenue due C&O. In all, 2072 of the 22,984 waybills, roughly 9%, were sampled. For each waybill in the sample, the amount of freight revenue due C&O was calculated and, from those amounts, the total revenue due C&O was estimated to be $64,568.

How close was the estimate of $64,568, based on a sample of only 2072 waybills, to the total revenue actually due C&O for the 22,984 waybills? Take a guess! We’ll discuss the answer at the end of this chapter.

ES Sampling Distributions

• To make inferences about a population, we need to understand sampling

• The sample mean varies from sample to sample

• The sample mean has a distribution; we need to understand how the sample mean varies and the pattern (if any) in the distribution

ES Sampling Distribution of a Sample Statistic

• Sampling Distribution of a Sample Statistic: The distribution of values for a sample statistic obtained from repeated samples, all of the same size and all drawn from the same population (e.g., four children who are outpatients in a community mental health center)

Example: Consider the set of their ages {1, 2, 3, 4}:

1) Make a list of all samples of size 2 that can be drawn from this set (sample with replacement)

2) Construct the sampling distribution for the sample mean for samples of size 2

3) Construct the sampling distribution for the minimum for samples of size 2
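The three steps of this example can be sketched in a few lines of Python. This is only an illustrative sketch: the helper name `sampling_distribution` and the variable names are assumptions made here, not part of the slides. It lists every ordered sample of size 2 drawn with replacement from {1, 2, 3, 4}, then counts how many of those equally likely samples give each value of the mean and of the minimum.

```python
from collections import Counter
from itertools import product

ages = [1, 2, 3, 4]

# Step 1: every ordered sample of size 2 drawn with replacement (4 * 4 = 16).
samples = list(product(ages, repeat=2))

def sampling_distribution(statistic):
    """Count how many of the equally likely samples give each statistic value."""
    counts = Counter(statistic(s) for s in samples)
    return dict(sorted(counts.items()))

# Step 2: sampling distribution of the sample mean.
mean_counts = sampling_distribution(lambda s: sum(s) / len(s))

# Step 3: sampling distribution of the sample minimum.
min_counts = sampling_distribution(min)

for value, count in mean_counts.items():
    print(f"mean {value:.1f}: {count}/{len(samples)}")
for value, count in min_counts.items():
    print(f"minimum {value}: {count}/{len(samples)}")
```

Each count divided by 16 is the probability of that value, because all 16 samples are equally likely.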

ES Table of All Possible Samples

This table lists all possible samples of size 2, the mean for each sample, the minimum for each sample, and the probability of each sample occurring (all equally likely).

Sample    Mean    Minimum    Probability
{1, 1}    1.0     1          1/16
{1, 2}    1.5     1          1/16
{1, 3}    2.0     1          1/16
{1, 4}    2.5     1          1/16
{2, 1}    1.5     1          1/16
{2, 2}    2.0     2          1/16
{2, 3}    2.5     2          1/16
{2, 4}    3.0     2          1/16
{3, 1}    2.0     1          1/16
{3, 2}    2.5     2          1/16
{3, 3}    3.0     3          1/16
{3, 4}    3.5     3          1/16
{4, 1}    2.5     1          1/16
{4, 2}    3.0     2          1/16
{4, 3}    3.5     3          1/16
{4, 4}    4.0     4          1/16

ES Sampling Distribution

• Summarize the information in the previous table to obtain the sampling distribution of the sample mean and the sample minimum:

Sampling Distribution of the Sample Mean

Mean    Probability
1.0     1/16
1.5     2/16
2.0     3/16
2.5     4/16
3.0     3/16
3.5     2/16
4.0     1/16

[Figure: Histogram: Sampling Distribution of the Sample Mean. Horizontal axis: sample mean \( \bar{x} \) (1.0 to 4.0). Vertical axis: \( P(\bar{x}) \) (0.00 to 0.25).]

ES Sampling Distribution

Sampling Distribution of the Sample Minimum:

m    P(m)
1    7/16
2    5/16
3    3/16
4    1/16

[Figure: Histogram: Sampling Distribution of the Sample Minimum. Horizontal axis: m (1 to 4). Vertical axis: P(m) (0.0 to 0.5).]

ES Example 1

Example: Consider the population consisting of six equally likely integers: 1, 2, 3, 4, 5, and 6. Empirically investigate the sampling distribution of the sample mean. Select 50 samples of size 5, find the mean for each sample, and construct the empirical distribution of the sample mean.

\( \mu = 3.5 \), \( \sigma = 1.7078 \)

[Figure: The Population: Theoretical Probability Distribution. Horizontal axis: x (1 to 6). Vertical axis: P(x) (0.00 to 0.18).]
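The empirical investigation described in Example 1 can be sketched as a small simulation. The seed and variable names below are arbitrary choices made for this sketch, and the draws are assumed to be independent (sampling with replacement). A different seed, or the samples actually used in the course, will give somewhat different numbers.

```python
import random
import statistics

population = [1, 2, 3, 4, 5, 6]
rng = random.Random(11)  # arbitrary seed, used only so the run is repeatable

# Select 50 samples of size 5 and record the mean of each sample.
sample_means = []
for _ in range(50):
    sample = rng.choices(population, k=5)  # 5 independent, equally likely draws
    sample_means.append(sum(sample) / len(sample))

# Summarize the empirical distribution of the 50 sample means.
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of the sample means: {mean_of_means:.3f}")
print(f"standard deviation of the sample means: {sd_of_means:.3f}")
```

The mean of the sample means should land near the population mean of 3.5, and their standard deviation should be noticeably smaller than the population standard deviation.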

ES Empirical Distribution of the Sample Mean

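• Samples of Size 5

\( \bar{\bar{x}} = 3.352 \), \( s_{\bar{x}} = 0.714 \)

[Figure: Histogram of the 50 sample means. Horizontal axis: Sample Mean \( \bar{x} \) (1.8 to 5.3). Vertical axis: Frequency (0 to 14).]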

ES Important Notes & Random Sample

Random Sample: A sample obtained in such a way that each possible sample of a fixed size n has an equal probability of being selected

1. \( \bar{\bar{x}} \): the mean of the sample means

2. \( s_{\bar{x}} \): the standard deviation of the sample means

3. The theory involved with sampling distributions described in the remainder of this chapter requires random sampling

– (Every possible handful of size n has the same probability of being selected)

ES Where Does This Lead Us?

• We are about to describe the most important idea in all of statistics

• It describes the sampling distribution of the sample mean

• Examples suggest: the sample mean (and sample total) tend to be normally distributed

ES Important Definition & Theorem

Central Limit Theorem
The sampling distribution of sample means will more closely resemble the normal distribution as the sample size increases.

Sampling Distribution of Sample Means
If all possible random samples, each of size n, are taken from any population with a mean \( \mu \) and a standard deviation \( \sigma \), the sampling distribution of sample means will:

1. have a mean \( \mu_{\bar{x}} \) equal to \( \mu \)

2. have a standard deviation \( \sigma_{\bar{x}} \) equal to \( \sigma / \sqrt{n} \)

Further, if the sampled population has a normal distribution, then the sampling distribution of \( \bar{x} \) will also be normal for samples of all sizes.
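The two properties of the sampling distribution of sample means can be checked exactly on a small case. The sketch below reuses the population {1, 2, 3, 4} with samples of size 2 drawn with replacement. The variable names are assumptions made here for illustration.

```python
import math
import statistics
from itertools import product

population = [1, 2, 3, 4]
n = 2  # sample size

mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

# The mean of every equally likely sample of size n (drawn with replacement).
means = [sum(s) / n for s in product(population, repeat=n)]

mu_xbar = statistics.mean(means)       # mean of the sampling distribution
sigma_xbar = statistics.pstdev(means)  # its standard deviation

print("mu =", mu, " mean of sample means =", mu_xbar)
print("sigma / sqrt(n) =", sigma / math.sqrt(n), " sd of sample means =", sigma_xbar)
```

Both pairs of printed values agree, which is what properties 1 and 2 claim for this population.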

ES Summary

• The standard deviation of the sampling distribution of \( \bar{x} \) (also called the standard error of the mean) is equal to the standard deviation of the original population divided by the square root of the sample size: \( \sigma_{\bar{x}} = \sigma / \sqrt{n} \)

Notes:

– The distribution of \( \bar{x} \) becomes more compact as n increases. (Why?)

– The variance of \( \bar{x} \): \( \sigma_{\bar{x}}^{2} = \sigma^{2} / n \)

• The distribution of \( \bar{x} \) is (exactly) normal when the original population is normal

• The CLT says: the distribution of \( \bar{x} \) is approximately normal regardless of the shape of the original distribution, when the sample size is large enough!

• The mean of the sampling distribution of \( \bar{x} \) is equal to the mean of the original population: \( \mu_{\bar{x}} = \mu \)
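One way to see why the distribution of the sample mean becomes more compact is a short derivation. It assumes the n observations \( X_1, \dots, X_n \) are drawn independently from a population with variance \( \sigma^{2} \) (as in sampling with replacement). The symbols \( X_i \) are introduced here for illustration only.

```latex
\begin{aligned}
\sigma_{\bar{x}}^{2}
  &= \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
   = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
   = \frac{n\,\sigma^{2}}{n^{2}}
   = \frac{\sigma^{2}}{n}, \\
\sigma_{\bar{x}} &= \frac{\sigma}{\sqrt{n}}.
\end{aligned}
```

Because n appears in the denominator, a larger sample size gives a smaller spread for the sample mean.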

ES Standard Error of the Mean

Standard Error of the Mean: The standard deviation of the sampling distribution of sample means: \( \sigma_{\bar{x}} = \sigma / \sqrt{n} \)

Notes:

• The n in the formula for the standard error of the mean is the size of the sample

• The following example illustrates the results of the Central Limit Theorem

• Lab 5 will have you generate several sampling distributions

ES Graphical Illustration of the Central Limit Theorem

[Figure: four panels on a common horizontal axis (ticks at 10, 20, 30): Original Population (x); Distribution of \( \bar{x} \): n = 2; Distribution of \( \bar{x} \): n = 10; Distribution of \( \bar{x} \): n = 30.]
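The behavior that this illustration depicts can be imitated with a simulation. The sketch below is not the population from the slide: it assumes a hypothetical right-skewed (exponential) population with mean 10, and the seed, the 5000 repetitions, and the helper names are arbitrary choices. For each sample size it reports the mean, standard deviation, and skewness of the simulated sample means.

```python
import math
import random
import statistics

rng = random.Random(2018)  # arbitrary seed for a repeatable run
MEAN = 10.0                # hypothetical population mean; for an exponential
                           # population, sigma equals the mean

def sample_mean(n):
    """Mean of n independent draws from the skewed (exponential) population."""
    return sum(rng.expovariate(1 / MEAN) for _ in range(n)) / n

def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

results = {}
for n in (2, 10, 30):
    means = [sample_mean(n) for _ in range(5000)]
    results[n] = (statistics.mean(means), statistics.pstdev(means), skewness(means))
    center, spread, skew = results[n]
    print(f"n={n:2d}  mean={center:.2f}  sd={spread:.2f}  "
          f"sigma/sqrt(n)={MEAN / math.sqrt(n):.2f}  skewness={skew:.2f}")
```

As n grows, the simulated standard deviation tracks sigma/sqrt(n) and the skewness moves toward 0, which matches the narrowing and increasingly bell-shaped panels.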

ES Applications of the Central Limit Theorem

• When the sampling distribution of the sample mean is (exactly) normally distributed, or approximately normally distributed (by the CLT), we can answer probability questions using the standard normal distribution, Table A

ES Example 2

Example: Consider a normal population with \( \mu = 50 \) and \( \sigma = 15 \). Suppose a sample of size 9 is selected at random. Find:

1) \( P(45 \le \bar{x} \le 60) \)

2) \( P(\bar{x} \le 47.5) \)

Solutions: Since the original population is normal, the distribution of the sample mean is also (exactly) normal

1) \( \mu_{\bar{x}} = \mu = 50 \)

2) \( \sigma_{\bar{x}} = \sigma / \sqrt{n} = 15 / \sqrt{9} = 15 / 3 = 5 \)

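ES Example 2

1)

\( z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)

\( P(45 \le \bar{x} \le 60) = P\left( \dfrac{45 - 50}{5} \le z \le \dfrac{60 - 50}{5} \right) = P(-1.00 \le z \le 2.00) = 0.9772 - 0.1587 = 0.8185 \)

[Figure: normal curve with \( \bar{x} \) axis marked at 45, 50, 60 and z axis marked at -1.00, 0, 2.00. Shaded areas: 0.3413 between z = -1.00 and 0, and 0.4772 between z = 0 and 2.00. The area below z = -1.00 is 0.1587.]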

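ES Example 2

2)

\( z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)

\( P(\bar{x} \le 47.5) = P\left( z \le \dfrac{47.5 - 50}{5} \right) = P(z \le -0.50) = 0.3085 \)

[Figure: normal curve with \( \bar{x} \) axis marked at 47.5, 50 and z axis marked at -0.50, 0. Areas: 0.1915 between z = -0.50 and 0, and 0.3085 below z = -0.50.]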
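Both probabilities in Example 2 can be cross-checked without a table. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The variable names are illustrative. Because the table-based answer rounds each area to four decimals, the computed values may differ from it in the last digit.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 15, 9

# Sampling distribution of the sample mean: mean mu, standard error sigma/sqrt(n).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_between = xbar_dist.cdf(60) - xbar_dist.cdf(45)  # P(45 <= xbar <= 60)
p_below = xbar_dist.cdf(47.5)                      # P(xbar <= 47.5)

print(f"P(45 <= xbar <= 60) = {p_between:.4f}")
print(f"P(xbar <= 47.5)     = {p_below:.4f}")
```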

ES Example 3

Example: A recent report stated that the day-care cost per week in Boston is $109. Suppose this figure is taken as the mean cost per week and that the standard deviation is known to be $20.

1) Find the probability that a sample of 50 day-care centers would show a mean cost of $105 or less per week.

2) Suppose the actual sample mean cost for the sample of 50 day-care centers is $120. Is there any evidence to refute the claim of $109 presented in the report?

Solutions:

• The shape of the original distribution is unknown, but the sample size, n, is large. The CLT applies.

• The distribution of \( \bar{x} \) is approximately normal

\( \mu_{\bar{x}} = \mu = 109 \), \( \sigma_{\bar{x}} = \sigma / \sqrt{n} = 20 / \sqrt{50} \approx 2.83 \)

ES Example 3

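1)

\( z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)

\( P(\bar{x} \le 105) = P\left( z \le \dfrac{105 - 109}{2.83} \right) = P(z \le -1.41) = 0.0793 \)

[Figure: normal curve with \( \bar{x} \) axis marked at 105, 109 and z axis marked at -1.41, 0. Areas: 0.4207 between z = -1.41 and 0, and 0.0793 below z = -1.41.]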

ES Example 3

2)

• To investigate the claim, we need to examine how likely an observation the sample mean of $120 is

• Consider how far out in the tail of the distribution of the sample mean $120 is

\( z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} \)

\( P(\bar{x} \ge 120) = P\left( z \ge \dfrac{120 - 109}{2.83} \right) = P(z \ge 3.89) = 1.0000 - 0.9998 = 0.0002 \)

• Since the probability is so small, this suggests the observation of $120 is very rare (if the mean cost is really $109)

• There is evidence (the sample) to suggest the claim of \( \mu \) = $109 is likely wrong
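Both parts of Example 3 can be cross-checked with a short computation. The sketch uses `statistics.NormalDist` (Python 3.8 or later) with the unrounded standard error, so its results differ slightly from the table-based figures, which round the standard error to 2.83 and read the areas from a four-decimal table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 109, 20, 50

# By the CLT, the sample mean is approximately normal with mean mu
# and standard error sigma/sqrt(n).
xbar_dist = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_105_or_less = xbar_dist.cdf(105)      # part 1: P(xbar <= 105)
p_120_or_more = 1 - xbar_dist.cdf(120)  # part 2: P(xbar >= 120)

print(f"standard error   = {xbar_dist.stdev:.2f}")
print(f"P(xbar <= 105)   = {p_105_or_less:.4f}")
print(f"P(xbar >= 120)   = {p_120_or_more:.6f}")
```

The upper-tail probability for $120 comes out even smaller than the table-based 0.0002, so the conclusion about the $109 claim is unchanged.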

