Chapter 8
Confidence Interval Estimation
David Chow
Oct 2014
Learning Objectives
To construct and interpret confidence interval estimates for the mean and the proportion.
To determine the necessary sample size for a confidence interval.
Section 8.5: Applications in Auditing (NOT covered)
Basic Concepts
A point estimate is a single number
Eg: For the population mean (μ), a point estimate is ____
A confidence interval is an interval estimate. It provides additional information about variability.
Eg: Giant pandas' mean age = 22 yrs old
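[Figure: a confidence interval centered on the point estimate, running from the lower confidence limit to the upper confidence limit; the distance between the two limits is the width of the confidence interval.]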
Eg: According to a survey, the 95% confidence interval for the mean wage of private tutoring is $110 to $150 per hour
I.e., μ = $130 ± $20
Basic Concepts
The general formula for all confidence intervals (C.I.) is:
Point Estimate ± Margin of Error, where
Margin of Error (e) = (Critical Value) x (Standard Error)
Critical values (Z) are related to the level of confidence (1 - α, also called the confidence level).
Eg: 95% confidence: (1 - α) = 0.95, or α = 5%.
With a given α, critical values can be obtained from the Z-table.
Then, a C.I. can be computed.
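As an alternative to reading the Z-table, the lookup can be sketched in Python. This is only an illustration: the helper name `z_critical` is made up here, and it relies on `NormalDist.inv_cdf` from the standard library.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-tailed standard normal critical value for a level such as 0.95."""
    alpha = 1 - confidence
    # An area of alpha/2 sits in each tail, so take the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
# 90% confidence: Z = 1.645
# 95% confidence: Z = 1.960
# 99% confidence: Z = 2.576
```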
Remarks
This chapter focuses on two parameters, μ and π
Let's start with the easiest case: estimating μ with a known population standard deviation (σ)
A more realistic case (σ unknown) follows
Concepts versus Computation:
As always, statistical concepts can be a bit abstract at first, but computations have standard steps to follow
We will work out a few examples and master the computations first, then go back to think about the rationale and interpretation behind the math
Estimating μ (σ Known)
Confidence Interval for μ (σ Known)
Assume the population standard deviation σ is known.
Also assume n is large enough (n > 30), or the population is normally distributed. Such assumptions ensure ____.
A two-tailed confidence interval estimate:
$$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$$
Z, also written as $Z_{\alpha/2}$, is the standardized normal distribution critical value for a probability of ____ in each tail.
Critical Values of Z
Consider a 95% confidence interval:
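[Figure: standard normal curve centered at 0, with area 1 - α = .95 in the middle and α/2 = .025 in each tail. The critical values Z = -1.96 and Z = 1.96 mark the lower and upper confidence limits; the point estimate sits at the center.]
Find the critical values $Z_{0.05}$ and $Z_{0.005}$.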
Eg: Length of A4 Paper
A paper producer wants to check if the paper produced has the correct mean length of 11 inches
Find the 95% confidence interval of the population mean paper length based on a sample of 100, sample mean x̄ = 10.998 in
σ is known to be 0.02 in
Eg: Length of A4 Paper
The 95% confidence interval is given by:
$$\mu = \bar{x} \pm Z_{\alpha/2}\,\sigma_{\bar{x}}$$
Step 1: Find $Z_{0.025}$ = 1.96
Step 2: $Z_{\alpha/2}\,\sigma_{\bar{x}}$ = 1.96 (0.02)/10 = 0.00392
The required confidence interval is:
μ = 10.998 ± 0.00392 inches, or
10.99408 < μ < 11.00192
Find the 99% interval. What is the effect of raising the confidence level?
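The two steps above can be sketched in Python for the A4 paper figures. The function name `z_interval` is a made-up helper, and it uses the exact normal quantile from the standard library instead of the rounded table value 1.96, so the last decimals may differ slightly from the hand calculation.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sigma / sqrt(n)                        # margin of error
    return xbar - margin, xbar + margin

# A4 paper example: n = 100, sample mean 10.998 in, sigma = 0.02 in.
for level in (0.95, 0.99):
    low, high = z_interval(10.998, 0.02, 100, level)
    print(f"{level:.0%}: {low:.5f} to {high:.5f}, width {high - low:.5f}")
```

Running both levels side by side makes it easy to compare the widths of the 95% and 99% intervals.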
Eg: Mean Resistance
A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. Past testing shows that the population standard deviation is 0.35 ohms.
Determine a 95% confidence interval for the true mean resistance of the population.
$$\bar{X} \pm Z_{0.025} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$$
We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
I.e., 95% of intervals formed in this manner will contain the true population mean.
Is it correct to use the Z-distribution?
Recap: Choosing Confidence Level
A bigger confidence level raises the confidence (of the interval containing the true mean)
But a wider interval estimate also means ____ precision
95% is the most common choice
It provides a good balance between precision and confidence
Example: Body Temperature
n = 106, x̄ = 98.20°F, σ = 0.62°F
1. Find the 95% confidence interval
2. How to obtain a narrower interval estimate?
ANSWER
1. Margin of error = ____ = 0.12
CI: 98.08 to 98.32
2. Smaller σ, bigger n, or smaller (1 - α)
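Confidence Intervals and Sampling Distribution
[Figure: sampling distribution of the sample mean, centered at μ, with area 1 - α in the middle and α/2 in each tail; below it, confidence intervals built around different sample means (x̄1, x̄2, ...).]
Intervals extend from
$$\bar{X} - Z \frac{\sigma}{\sqrt{n}} \quad \text{to} \quad \bar{X} + Z \frac{\sigma}{\sqrt{n}}$$
(1 - α) x 100% of intervals constructed contain μ;
(α) x 100% do not.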
Interpreting Confidence Level
Suppose we select many different samples of size n from a population.
A 95% confidence interval is constructed for each sample.
Then 95% of those interval estimates would actually contain the true value of μ.
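This repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration: the "true" values μ = 2.2, σ = 0.35 and n = 11 are assumed for the demo (borrowed from the resistance example), and `coverage_rate` is a made-up helper name.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage_rate(mu, sigma, n, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals (sigma known) that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and build its interval.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# mu, sigma and n are assumed "true" values chosen only for this illustration.
print(coverage_rate(mu=2.2, sigma=0.35, n=11))
```

The printed fraction should land close to 0.95, though not exactly on it, because only a finite number of samples is drawn.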
Estimating μ (σ Unknown)
Confidence Interval for μ (σ Unknown)
Usually σ is unknown
Use sample standard deviation S instead
This will introduce extra uncertainty
because S varies from sample to sample
So another distribution (the t distribution) is used
It is flatter than the standard normal distribution
The t distribution requires that the original population is normally distributed
This is assumed in most cases
Strictly speaking, this assumption should be checked first
Confidence Interval for μ (σ Unknown)
With an unknown σ, you need to be sure that
(1) the sample size is large enough (n ≥ 30), or
(2) the population is normal
Such assumptions enable the use of Student's t distribution.
Confidence Interval Estimate:
$$\bar{X} \pm t_{n-1} \frac{S}{\sqrt{n}}$$
where t, also written as $t_{\alpha/2,\,n-1}$, is the critical value of the t distribution with n-1 degrees of freedom and an area of α/2 in each tail.
Critical Values of t
The critical value of t is characterized by two elements:
The confidence level (1 - α), and
The degrees of freedom (df).
What is d.f.?
It is the number of observations that are free to vary after the sample mean has been calculated.
In this section, df = n-1.
Degrees of Freedom
Eg: Suppose the mean of 3 numbers is 8.0
Let X1 = 7, X2 = 8
What is X3?
Given a mean value of 8.0, X3 must be 9 (i.e., X3 is not free to vary)
Here, n = 3, so degrees of freedom = n - 1 = 3 - 1 = 2
In this example d.f. = 2. What does it mean?
You are free to choose 2 values (X1 and X2), but the third is set for a given mean.
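The same arithmetic can be written as a tiny Python sketch. The helper name `last_value` is made up for this illustration; it shows that once the mean and n - 1 of the values are fixed, the remaining value has no freedom left.

```python
def last_value(mean, known_values):
    """Return the one remaining value forced by a given mean of n numbers."""
    n = len(known_values) + 1
    # The n values must sum to n * mean, so the last one is whatever is left over.
    return n * mean - sum(known_values)

print(last_value(8.0, [7, 8]))  # 9.0
```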
Degrees of Freedom
[Figure: density curves centered at 0 for t (df = 5), t (df = 13), and the standard normal (t distribution with df = ∞).]
t-distributions: bell-shaped, symmetric, but with fatter tails than Z
Note: t → Z as n increases
Critical Values of t
      Upper Tail Area
df    .25     .10     .05
1     1.000   3.078   6.314
20    0.687   1.325   1.724
21    0.686   1.323   1.721
[Figure: t curve centered at 0 with the upper-tail area to the right of t = 1.724.]
The body of the table contains t values, not ____
Suppose n = 21, and α = 0.10.
Then df = ____, upper-tail area = ____
α/2 = 0.05
d.f. = 20
Eg: Mean Age of Retirement
A random sample of 25 retirees has mean age = 50 and std = 8. Find the 95% confidence interval for μ.
Must assume a normal population.
From t-table, $t_{0.025,\,24}$ = 2.0639
The confidence interval is
$$\bar{X} \pm t_{\alpha/2,\,n-1} \frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}} = (46.698,\ 53.302)$$
Eg: Heating Oil Consumption
A random sample of 35 households has mean consumption of heating oil x̄ = 1122.75 gallons, and S = 295.72 gallons.
Find the 95% confidence interval for μ.
ANSWER
The critical value is $t_{0.025,\,34}$ = 2.0322.
μ = 1122.75 ± 101.58 gallons.
Based on the sample evidence, we are 95% confident that the interval 1122.75 ± 101.58 gallons covers the population mean.
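Both t-based examples follow the same computation, which can be sketched in Python. The helper name `t_interval` is made up here, and the critical values are passed in as looked up from the t-table (2.0639 and 2.0322 above), since the Python standard library has no t quantile function.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the table value t(alpha/2, n - 1), supplied by the caller.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# Retirement age: n = 25, mean 50, S = 8, t(0.025, 24) = 2.0639
low, high = t_interval(50, 8, 25, 2.0639)
print(f"{low:.3f} to {high:.3f}")   # 46.698 to 53.302

# Heating oil: n = 35, mean 1122.75, S = 295.72, t(0.025, 34) = 2.0322
low, high = t_interval(1122.75, 295.72, 35, 2.0322)
print(f"margin of error = {(high - low) / 2:.2f}")   # margin of error = 101.58
```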
NOTE: Z or t?
If n ≥ 30, it is commonly acceptable to use Z (instead of t) as an approximation.
But if you can find a more precise answer (using t-values), why not?
Estimating Population Proportion
Confidence Intervals for the Population Proportion
Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation
$$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$$
We will estimate this with sample data:
$$\sqrt{\frac{p(1-p)}{n}}$$
Confidence Intervals for the Population Proportion
The confidence interval for the population proportion is given by:
$$p \pm Z \sqrt{\frac{p(1-p)}{n}}$$
where
Z = critical Z-value given the level of confidence
p = sample proportion
n = sample size
Such an interval estimate for π is based on a point estimate (p), plus an allowance for uncertainty arising from sampling
Example: Vegetarians
1. A random sample of 100 people shows that 25 of them are vegetarians. Form a 95% confidence interval for the true proportion of vegetarians in the population.
2. Compute the 95% confidence interval if n=1000.
$$p \pm Z\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{0.25(0.75)/100} = 0.25 \pm 1.96\,(0.0433) = (0.1651,\ 0.3349)$$
Interpretation
95% of intervals formed from samples of size 100 in this manner will cover the true proportion
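The vegetarian calculation can be sketched in Python as follows. The function name `proportion_interval` is made up for this illustration, and for part 2 the sketch assumes the sample proportion stays at 0.25 (250 vegetarians out of 1000), which the slide does not state explicitly.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p = successes / n                                   # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

low, high = proportion_interval(25, 100)
print(f"n = 100:  {low:.4f} to {high:.4f}")   # n = 100:  0.1651 to 0.3349

# Assumption for part 2: the sample proportion is still 0.25.
low, high = proportion_interval(250, 1000)
print(f"n = 1000: {low:.4f} to {high:.4f}")
```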
Sample Size Determination
Recall that sample size (n) affects the margin of error (e, also called sampling error), where
$$e = Z \frac{\sigma}{\sqrt{n}}$$
If e is set before conducting a survey, this equation helps you determine the sample size for a pre-set value of e (the acceptable level of error):
$$n = \frac{Z^2 \sigma^2}{e^2}$$
Sample Size Determination
If σ = 45, what sample size is needed to estimate the mean within ±5 with 90% confidence?
$$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$$
Round up to the next integer to get the required sample size n = 220
Eg: A4 Paper Again
In the paper manufacturer example, σ = 0.02, n = 100, and the 95% interval estimate is μ = 10.998 ± 0.00392 inches.
Suppose the manufacturer wants to limit the error to ±0.003 by choosing a larger sample. What is n?
ANSWER
$$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.96)^2 (0.02)^2}{0.003^2} = 170.7$$
The required sample size is n = 171.
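Both sample-size calculations for the mean can be sketched in Python. The helper name `sample_size_for_mean` is made up here, and it uses the exact normal quantile instead of the rounded table values, which does not change the rounded-up answers in these two examples.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence):
    """Smallest n whose margin of error Z * sigma / sqrt(n) is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)   # always round up to the next integer

print(sample_size_for_mean(45, 5, 0.90))        # 220
print(sample_size_for_mean(0.02, 0.003, 0.95))  # 171
```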
ANSWER
Sample Size Determination
To determine the required sample size for the proportion, you must know:
The critical value Z (from a confidence level of 1-α),
The acceptable sampling error (e), and
The true proportion π.
If π is unknown, use the sample value p, or set π = 0.50.
$$e = Z \sqrt{\frac{\pi(1-\pi)}{n}}$$
Now solve for n to get
$$n = \frac{Z^2\,\pi(1-\pi)}{e^2}$$
Eg: Quality Control
Out of a population of 1,000 light bulbs, we randomly selected 100, of which 30 were defective. What sample size is needed to be within ±0.05 with 90% confidence?
(a) Since the true population proportion is unknown, use the sample value here.
(b) Now, set π = 0.50 and compare the result with (a).
ANSWER
(a)
$$n = \frac{Z^2\,p(1-p)}{\text{Error}^2} = \frac{1.645^2\,(0.3)(0.7)}{0.05^2} = 227.3 \rightarrow 228$$
(b) The required sample size increases to 271.
NOTE: The product π(1-π) ranges from 0 to 0.25. By assuming a value of 0.25, we are in fact playing safe by sampling more than necessary.
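Parts (a) and (b) can be compared with a short Python sketch. The function name `sample_size_for_proportion` is made up for this illustration; it applies the formula for n above and rounds up.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(pi, e, confidence):
    """Smallest n whose margin of error for a proportion is at most e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)

print(sample_size_for_proportion(0.30, 0.05, 0.90))  # (a) 228
print(sample_size_for_proportion(0.50, 0.05, 0.90))  # (b) 271
```

Trying other values of `pi` between 0 and 1 shows that 0.50 gives the largest n, which is why it is the cautious default.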
More on the t Distribution
t Distribution
The t distribution is a family of probability distributions. It is bell-shaped, symmetric, and flatter than the Z distribution.
A specific t distribution depends on a parameter known as the degrees of freedom (d.f.).
Degrees of freedom refer to the number of independent pieces of information that go into the computation of s.
A t distribution with more degrees of freedom has ____ dispersion. As the number of d.f. increases, the difference between the t distribution and the Z distribution becomes smaller and smaller.
t Distribution
Degrees       Area in Upper Tail
of Freedom    .20     .10     .05     .025    .01     .005
.             .       .       .       .       .       .
50            .849    1.299   1.676   2.009   2.403   2.678
60            .848    1.296   1.671   2.000   2.390   2.660
80            .846    1.292   1.664   1.990   2.374   2.639
100           .845    1.290   1.660   1.984   2.364   2.626
∞             .842    1.282   1.645   1.960   2.326   2.576
What is this 2.009?
Look familiar (last row)? They are ____.
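To see how quickly the tabled t values approach the standard normal value, the .025 column can be compared with the exact normal quantile. This is a small sketch that uses only the four table rows shown above; the variable names are made up for the illustration.

```python
from statistics import NormalDist

# Upper-tail area .025 column of the t table above, keyed by degrees of freedom.
t_025 = {50: 2.009, 60: 2.000, 80: 1.990, 100: 1.984}
z_025 = NormalDist().inv_cdf(1 - 0.025)   # standard normal value, about 1.960

# Gap between each tabled t value and the normal value.
gaps = {df: t - z_025 for df, t in t_025.items()}
for df, gap in gaps.items():
    print(f"df = {df:>3}: t = {t_025[df]:.3f}, t - Z = {gap:.3f}")
```

The gap stays positive but shrinks as the degrees of freedom grow.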
Review Questions
1. A population has a standard deviation of 50. A random sample of 100 from this population is selected, and the sample mean is 600. At 95% confidence, the margin of error is ____
2. As the number of degrees of freedom for a t distribution ____, the difference between the t distribution and the standard normal distribution becomes smaller
3. For the interval estimation of μ when σ is known and the sample is large, the proper distribution to use is ____
ANSWER
1. 9.8
2. Increases
3. The normal distribution
Review Questions
4. The t value for a 95% confidence interval estimation with 24 degrees of freedom is ____
5. A 95% confidence interval for a population mean is determined to be 100 to 120. If the confidence coefficient is reduced to 0.90, the interval for μ
a. becomes narrower
b. becomes wider
c. does not change
d. becomes 0.1
6. In a random sample of 144 observations, sample proportion p = 0.6. The 95% confidence interval for π is
a. 0.52 to 0.68
b. 0.144 to 0.200
c. 0.60 to 0.70
d. 0.50 to 0.70
ANSWER
4. 2.064
5. A
6. A