Sampling Distributions
Chapter 9

Introduction
In this chapter we study some relationships between population and sample characteristics.
Generally, we are interested in population parameters such as:
- Mean return
- Variability of demand
- Proportion of defectives in a production line
Such parameters are usually unknown. Therefore, we draw samples from the population and use them to make inferences about the parameters.
This is done by constructing sample statistics that have a close relationship to the population parameters.
Samples are random, so a sample statistic is a random variable.
As such, it has a sampling distribution. Sampling distributions for various statistics are studied in this chapter.

9.1 Sampling Distribution of the Mean
Example 1
A die is thrown infinitely many times. Let X represent the number of spots showing on any throw.
The probability distribution of X is

x      1     2     3     4     5     6
p(x)   1/6   1/6   1/6   1/6   1/6   1/6

\[ E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots + 6(1/6) = 3.5 \]
\[ V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \dots + (6 - 3.5)^2(1/6) = 2.92 \]
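
The two sums above can be checked with a few lines of Python. This is only a minimal sketch; the variable names (`faces`, `p`, `mean`, `variance`) are illustrative choices, and exact fractions are used so that no rounding occurs until the final print.

```python
from fractions import Fraction

# Probability distribution of a fair die: each face has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

# E(X) = sum of x * p(x)
mean = sum(x * p for x in faces)

# V(X) = sum of (x - E(X))^2 * p(x)
variance = sum((x - mean) ** 2 * p for x in faces)

print(float(mean))                # 3.5
print(round(float(variance), 2))  # 2.92
```

The exact variance is 35/12, which rounds to the 2.92 quoted in Example 1.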

Throwing a die twice: sample mean
Suppose we want to estimate \(\mu\) from the mean \(\bar{x}\) of a sample of size n = 2.
What is the distribution of \(\bar{x}\)?
These are all the possible pairs of values for the 2 throws, together with the mean of each pair:

Sample  Throws  Mean    Sample  Throws  Mean    Sample  Throws  Mean
1       1,1     1       13      3,1     2       25      5,1     3
2       1,2     1.5     14      3,2     2.5     26      5,2     3.5
3       1,3     2       15      3,3     3       27      5,3     4
4       1,4     2.5     16      3,4     3.5     28      5,4     4.5
5       1,5     3       17      3,5     4       29      5,5     5
6       1,6     3.5     18      3,6     4.5     30      5,6     5.5
7       2,1     1.5     19      4,1     2.5     31      6,1     3.5
8       2,2     2       20      4,2     3       32      6,2     4
9       2,3     2.5     21      4,3     3.5     33      6,3     4.5
10      2,4     3       22      4,4     4       34      6,4     5
11      2,5     3.5     23      4,5     4.5     35      6,5     5.5
12      2,6     4       24      4,6     5       36      6,6     6

Notice there are 36 possible pairs of values:
1,1  1,2  ...  1,6
2,1  2,2  ...  2,6
...
6,1  6,2  ...  6,6

The distribution of \(\bar{x}\) when n = 2
Calculating the relative frequency of each value of \(\bar{x}\), we have the following results:

\(\bar{x}\)            1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
Frequency            1     2     3     4     5     6     5     4     3     2     1
Relative frequency   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

For example:
- \(\bar{x} = 1\) arises from (1+1)/2 = 1
- \(\bar{x} = 1.5\) arises from (1+2)/2 = 1.5 and (2+1)/2 = 1.5
- \(\bar{x} = 2\) arises from (1+3)/2 = 2, (2+2)/2 = 2, and (3+1)/2 = 2
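
The frequency table for n = 2 can be reproduced by brute-force enumeration. The sketch below lists all 36 ordered pairs, counts how often each sample mean occurs, and then computes the mean and variance of the sample mean. The names `pairs`, `freq`, and `dist` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs from two throws of a fair die.
pairs = list(product(range(1, 7), repeat=2))

# Frequency of each possible sample mean (a + b) / 2.
freq = Counter(Fraction(a + b, 2) for a, b in pairs)

# Relative frequency = frequency / 36 (fractions are shown in lowest terms).
dist = {m: Fraction(c, len(pairs)) for m, c in sorted(freq.items())}
for m, prob in dist.items():
    print(float(m), freq[m], prob)

# Mean and variance of the sample mean itself.
mean_of_means = sum(m * prob for m, prob in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * prob for m, prob in dist.items())
print(float(mean_of_means), round(float(var_of_means), 4))
```

The mean of the sample mean comes out equal to the population mean of 3.5, while its variance is half the population variance, which hints at the role the sample size plays.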

The Relationship between the Sample Size and the Sampling Distribution of the Sample Mean

[Figure: sampling distributions of \(\bar{x}\) for three sample sizes]
n = 5:   \(\mu_{\bar{x}} = 3.5\),  \(\sigma^2_{\bar{x}} = .5833 \; (= \sigma^2_x / 5)\)
n = 10:  \(\mu_{\bar{x}} = 3.5\),  \(\sigma^2_{\bar{x}} = .2917 \; (= \sigma^2_x / 10)\)
n = 25:  \(\mu_{\bar{x}} = 3.5\),  \(\sigma^2_{\bar{x}} = .1167 \; (= \sigma^2_x / 25)\)

As the sample size changes, the mean of the sample mean does not change!
As the sample size increases, the variance of the sample mean decreases!
Also, note the interesting relationship between the sample size and the variance of the sample mean. We'll formalize this relationship soon.
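
The values for n = 5, 10, and 25 can be verified exactly rather than by simulation. The sketch below builds the distribution of the sum of n throws by repeated convolution and then derives the mean and variance of the sample mean. The helper names `sum_distribution` and `mean_and_variance_of_sample_mean` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def sum_distribution(n):
    """Exact distribution of the sum of n fair-die throws (repeated convolution)."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                key = total + face
                new[key] = new.get(key, Fraction(0)) + prob * Fraction(1, 6)
        dist = new
    return dist

def mean_and_variance_of_sample_mean(n):
    """Mean and variance of x-bar = (sum of n throws) / n."""
    dist = sum_distribution(n)
    mean = sum(Fraction(total, n) * prob for total, prob in dist.items())
    var = sum((Fraction(total, n) - mean) ** 2 * prob for total, prob in dist.items())
    return mean, var

for n in (5, 10, 25):
    mean, var = mean_and_variance_of_sample_mean(n)
    print(n, float(mean), round(float(var), 4))
```

In each case the mean stays at 3.5 and the variance equals the population variance 35/12 divided by n.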

The Sample Variance
Demonstration: why is the variance of the sample mean smaller than the population variance?
Population: 1, 2, 3
Let us take samples of two observations:
- Sample (1, 2): mean = 1.5
- Sample (2, 3): mean = 2.5
- Sample (1, 3): mean = 2
Compare the range of the population (1 to 3) to the range of the sample mean (1.5 to 2.5).

The Central Limit Theorem
If a random sample is drawn from any population, the sampling distribution of the sample mean is:
- Normal if the parent population is normal,
- Approximately normal if the parent population is not normal, provided the sample size is sufficiently large.
The larger the sample size, the more closely the sampling distribution of \(\bar{x}\) will resemble a normal distribution.
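
A small simulation can illustrate the theorem. The sketch below is an illustration under stated assumptions, not part of the original example set: it takes an exponential population (strongly right-skewed, chosen here only as a convenient non-normal case), draws many samples of size n, and measures the skewness of the resulting sample means. Skewness near 0 is consistent with a symmetric, normal-like shape. The function names and the seed are arbitrary.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population with mean 1."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

def skewness(values):
    """Sample skewness: close to 0 for a symmetric (e.g. normal) distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

rng = random.Random(0)  # fixed seed so the run is repeatable
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 5, 50)}
for n, sk in skew_by_n.items():
    print(n, round(sk, 2))
```

The skewness shrinks toward 0 as n grows, in line with the statement that larger samples give a sampling distribution closer to normal.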

The Parameters of the Sampling Distribution of \(\bar{X}\)
The mean of \(\bar{X}\) is equal to the mean of the parent population:
\[ \mu_{\bar{x}} = \mu_x \]
The variance of \(\bar{X}\) is equal to the parent population variance divided by n:
\[ \sigma^2_{\bar{x}} = \frac{\sigma^2_x}{n} \]
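
Both parameters follow from the laws of expected value and variance. The derivation below is a sketch that assumes the n observations \(X_1, \dots, X_n\) are independent draws from the same population with mean \(\mu_x\) and variance \(\sigma^2_x\); the indexing notation \(X_i\) is introduced here only for illustration.

```latex
\begin{align*}
\mu_{\bar{x}} &= E\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
              = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
              = \frac{1}{n}\, n\,\mu_x = \mu_x \\
\sigma^2_{\bar{x}} &= V\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
              = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i)
              = \frac{n\,\sigma^2_x}{n^2} = \frac{\sigma^2_x}{n}
\end{align*}
```

For the die in Example 1 with n = 2 this gives a variance of about 2.92/2 = 1.46 for the sample mean.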

The Sampling Distribution of \(\bar{X}\): Example
Example 2
The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of .3 ounces.
Find the probability that a bottle bought by a customer will contain more than 32 ounces.
Example 2: Solution
The random variable X is the amount of soda in a bottle.
\[ P(x > 32) = P\left(\frac{x - \mu}{\sigma_x} > \frac{32 - 32.2}{.3}\right) = P(z > -.67) = 0.7486 \]
Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.
Solution
Define the random variable as the mean amount of soda per bottle.
\[ P(\bar{x} > 32) = P\left(\frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{.3/\sqrt{4}}\right) = P(z > -1.33) = 0.9082 \]
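
Both parts of Example 2 can be checked with Python's standard-library `statistics.NormalDist`. This is a minimal sketch; the variable names are illustrative. Because the code does not round z to two decimals the way a normal table does, the results may differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3  # population mean and standard deviation (ounces)

# One bottle: X ~ Normal(mu, sigma).
p_single = 1 - NormalDist(mu, sigma).cdf(32)

# Carton of four: x-bar ~ Normal(mu, sigma / sqrt(n)).
n = 4
p_carton = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(32)

print(round(p_single, 4), round(p_carton, 4))
```

The mean of four bottles is much more likely to exceed 32 ounces than a single bottle, because the standard deviation of the mean is only .3/2 = .15.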

Example 3
The average weekly income of B.B.A. graduates one year after graduation is $600.
Suppose the distribution of weekly income has a standard deviation of $100. What is the probability that 35 randomly selected graduates have an average weekly income of less than $550?
Solution
\[ P(\bar{x} < 550) = P\left(\frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < \frac{550 - 600}{100/\sqrt{35}}\right) = P(z < -2.96) = 0.0015 \]
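
The calculation in Example 3 can be sketched as follows, again using `statistics.NormalDist` from the standard library. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 35    # claimed mean, standard deviation, sample size

std_error = sigma / sqrt(n)    # standard deviation of the sample mean
z = (550 - mu) / std_error     # standardized value of x-bar = 550
p = NormalDist().cdf(z)        # P(x-bar < 550)

print(round(std_error, 2), round(z, 2), round(p, 4))
```

The standard error is about 16.9, so a sample mean of 550 lies almost three standard errors below the claimed mean.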
Example 3 (continued)
If a random sample of 35 graduates actually had an average weekly income of $550, what would you conclude about the validity of the claim that the average weekly income is $600?
Solution
With \(\mu = 600\), the probability of observing a sample mean as low as 550 is very small (0.0015). The claim that the mean weekly income is $600 is probably unjustified.
It is more reasonable to assume that \(\mu\) is smaller than $600, because then a sample mean of $550 becomes more probable.

9.2 Sampling Distribution of a Sample Proportion (\(\hat{p}\))
The parameter of interest for qualitative (nominal) data is the proportion of times a particular outcome (success) occurs for a given population.
This is the motivation for studying the distribution of the sample proportion.
Let X be the number of times an event of interest takes place (we can call such an event a success, just as in the definition we used for the binomial experiment).
The sample proportion is the number of successes divided by the sample size:
\[ \hat{p} = \frac{X}{n} \]
Since X is binomial, probabilities for \(\hat{p}\) can be calculated from the binomial distribution.
Yet, for inference about \(\hat{p}\) we prefer to use the normal approximation to the binomial.

Approximate Sampling Distribution of a Sample Proportion
From the laws of expected value and variance, it can be shown that \(E(\hat{p}) = p\) and \(\sigma^2_{\hat{p}} = p(1-p)/n\).
Z is calculated by:
\[ Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \]
If both np > 5 and n(1-p) > 5, then Z is approximately standard normal.
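
The mean and variance of \(\hat{p}\) follow directly from the binomial results \(E(X) = np\) and \(V(X) = np(1-p)\). The short derivation below is a sketch of that step, using only the definition \(\hat{p} = X/n\).

```latex
\begin{align*}
E(\hat{p}) &= E\!\left(\frac{X}{n}\right) = \frac{E(X)}{n} = \frac{np}{n} = p \\
\sigma^2_{\hat{p}} &= V\!\left(\frac{X}{n}\right) = \frac{V(X)}{n^2}
                   = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n}
\end{align*}
```

Taking the square root of the variance gives the denominator used in the Z formula.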

Example 5
A state representative received 52% of the votes in the last election.
One year later the representative wanted to study his popularity.
If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?
Example 5: Solution
The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 > 5 and n(1-p) = 300(1-.52) = 144 > 5. The normal approximation can be applied here:
\[ P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{.50 - .52}{.0288}\right) = P(z > -.69) = .7549 \]
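
The sketch below recomputes Example 5 with the normal approximation and, for comparison, with the exact binomial sum. The comparison is an addition for illustration and is not part of the original solution; variable names are arbitrary. Since the normal approximation here uses no continuity correction, the two answers are expected to be close but not identical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Rule of thumb from the text: np > 5 and n(1 - p) > 5.
rule_ok = n * p > 5 and n * (1 - p) > 5

# Normal approximation: p-hat ~ Normal(p, sqrt(p(1 - p) / n)).
std_error = sqrt(p * (1 - p) / n)
approx = 1 - NormalDist(p, std_error).cdf(0.50)

# Exact binomial: P(p-hat > .50) = P(X >= 151).
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(151, n + 1))

print(rule_ok, round(std_error, 4), round(approx, 4), round(exact, 4))
```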

Using Sampling Distributions for Inference
Sampling distributions can be used to make an inference about population parameters.
For example, let us look at an inference about the population mean.
Generally, we'll compare the actual sample mean with a hypothesized value of the unknown population mean, and make an informed decision about the likelihood of this hypothesis.
Let us guess what the value of \(\mu\) is, and build a symmetrical interval around \(\mu\) large enough to make it very likely that the sample mean falls inside it.
If the sample mean falls outside the interval (although this is very unlikely), we tend to believe that \(\mu\) is different from the value we guessed.
The sampling distribution of the sample mean helps in performing the calculations.
[Figure: sampling distribution of \(\bar{x}\) centered at \(\mu\), with a bracketed interval around \(\mu\) inside which \(\bar{x}\) falls with large probability]
Suppose .95 is considered a sufficiently large probability that the sample mean falls inside the interval.
Let us build a symmetrical interval around \(\mu\). Denoting the lower and upper limits of the interval by \(\bar{x}_L\) and \(\bar{x}_U\), we have:
\[ P(\bar{x}_L \le \bar{x} \le \bar{x}_U) = .95 \]
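Performing the usual standardization, we find that the interval covering 95% of the distribution of the sample mean is:
\[ P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]
[Figure: sampling distribution of \(\bar{x}\) with central area 0.95 between \(\mu - 1.96\,\sigma/\sqrt{n}\) and \(\mu + 1.96\,\sigma/\sqrt{n}\)]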
Now let us apply this interval to Example 3.
\[ P\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right) = .95 \]
\[ P\left(600 - 1.96\frac{100}{\sqrt{35}} \le \bar{x} \le 600 + 1.96\frac{100}{\sqrt{35}}\right) = .95 \]
which reduces to
\[ P(566.9 \le \bar{x} \le 633.1) = .95 \]
Conclusion
There is a 95% chance that the sample mean falls within the interval [566.9, 633.1] if the population mean is 600.
Since the sample mean was 550, the population mean is probably not 600.
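
The interval calculation can be sketched as a small helper. The function name `central_interval` and its parameters are hypothetical, introduced only for this illustration. The call below uses the Example 3 values with n = 35 as stated in that example, which gives a half-width of about 33.1; with n = 25 the half-width would be 39.2.

```python
from math import sqrt
from statistics import NormalDist

def central_interval(mu, sigma, n, coverage=0.95):
    """Symmetric interval around mu that contains x-bar with the given probability."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)  # about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width

low, high = central_interval(mu=600, sigma=100, n=35)
sample_mean = 550
inside = low <= sample_mean <= high

print(round(low, 1), round(high, 1), inside)
```

The observed sample mean of 550 falls outside the interval, which supports the conclusion that a population mean of 600 is doubtful.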

Optional: Sampling Distribution of the Difference Between Two Means
The difference between two means can become a parameter of interest when the comparison between two populations is studied.
To make an inference about \(\mu_1 - \mu_2\) we observe the distribution of \(\bar{x}_1 - \bar{x}_2\).

9.3 Normal Distribution of the Difference Between Two Sample Means
The distribution of \(\bar{x}_1 - \bar{x}_2\) is normal if
- The two samples are independent, and
- The parent populations are normally distributed.
If the two populations are not both normally distributed, but the sample sizes are 30 or more, the distribution of \(\bar{x}_1 - \bar{x}_2\) is approximately normal.
Applying the laws of expected value and variance we have:
\[ \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \]
\[ \sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \]
We can define:
\[ Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]

Example 6
The mean starting salaries of MBA students from two universities (WLU and UWO) are $62,000 (standard deviation = $14,500) and $60,000 (standard deviation = $18,300), respectively.
What is the probability that the sample mean of WLU students will exceed the sample mean of UWO students? (\(n_{WLU} = 50\); \(n_{UWO} = 60\))
Example 6: Solution
We need to determine \(P(\bar{x}_1 - \bar{x}_2 > 0)\).
\[ \mu_1 - \mu_2 = 62,000 - 60,000 = \$2,000 \]
\[ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{14,500^2}{50} + \frac{18,300^2}{60}} = \$3,128 \]
\[ P(\bar{x}_1 - \bar{x}_2 > 0) = P\left(\frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} > \frac{0 - 2000}{3128}\right) = P(z > -.64) = .5 + .2389 = .7389 \]
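
Example 6 can be checked with the same standard-library tools. This is a minimal sketch with illustrative variable names. Because z is not rounded to two decimals, the probability may differ from the table-based answer in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 62_000, 14_500, 50  # WLU
mu2, sigma2, n2 = 60_000, 18_300, 60  # UWO

# Mean and standard deviation of the difference between the two sample means.
mean_diff = mu1 - mu2
std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# P(x-bar_1 - x-bar_2 > 0)
p = 1 - NormalDist(mean_diff, std_error).cdf(0)

print(mean_diff, round(std_error), round(p, 4))
```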