Date post: | 03-Jan-2016 |
Category: |
Documents |
Upload: | nurgazy-nazhimidinov |
1
INTRODUCTION TO HYPOTHESIS TESTING
2
HYPOTHESIS TESTING
• STATISTICAL TEST: The statistical procedure to draw an appropriate conclusion from sample data about a population parameter.
• HYPOTHESIS: Any statement concerning an unknown population parameter.
• Aim of a statistical test: Test a hypothesis concerning the values of one or more population parameters.
3
Concepts of Hypothesis Testing
• The critical concepts of hypothesis testing.
– Example:
• An operations manager needs to determine whether the mean demand during lead time is greater than 350.
• If so, changes in the ordering policy are needed.
– There are two hypotheses about the population mean μ:
• H0 (the null hypothesis): μ = 350
• H1 (the alternative hypothesis): μ > 350. This is what you want to prove.
4
HYPOTHESIS TESTING
• Examples
– Is there statistical evidence in a random sample of inside diameters of a certain type of PVC pipe that supports the hypothesis that the true average inside diameter of all such pipes is 0.75?
– Is there statistical evidence in a random sample of circuit boards that supports the hypothesis that less than 10% of all circuit boards produced by a certain manufacturer are defective?
5
NULL AND ALTERNATIVE HYPOTHESIS
• NULL HYPOTHESIS = H0 states that a treatment has no effect or that there is no change compared with the previous situation. The parameter is equal to a single value.
• ALTERNATIVE HYPOTHESIS = HA states that a treatment has a significant effect or that there is a change compared with the previous situation. The parameter can be greater than, less than, or different from the value shown in H0.
6
TEST STATISTIC AND REJECTION REGION
• TEST STATISTIC: The sample statistic on which we base our decision to reject or not reject the null hypothesis.
• REJECTION REGION: Range of values such that, if the test statistic falls in that range, we decide to reject the null hypothesis; otherwise, we do not reject it. The probability that the (standardized) test statistic falls in the rejection region when H0 is true is the PROBABILITY OF TYPE I ERROR or SIGNIFICANCE LEVEL OF THE TEST, denoted by α.
7
Concepts of Hypothesis Testing
• Assume the null hypothesis is true (μ = 350).
–Sample from the demand population, and build a statistic related to the parameter hypothesized (the sample mean).
–Pose the question: How probable is it to obtain a sample mean at least as extreme as the one observed from the sample, if H0 is correct?
8
Concepts of Hypothesis Testing
• Assume the null hypothesis is true (μ = 350).
– If x̄ = 450: since x̄ is much larger than 350, the mean μ is likely to be greater than 350. Reject the null hypothesis.
– If x̄ = 355: in this case the mean μ is not likely to be greater than 350. Do not reject the null hypothesis.
9
Types of Errors
• Two types of errors may occur when deciding whether to reject H0 based on the statistic value.
– Type I error: Reject H0 when it is true.
– Type II error: Do not reject H0 when it is false.
• Example continued
– Type I error: Reject H0 (μ = 350) in favor of H1 (μ > 350) when the real value of μ is 350.
– Type II error: Believe that H0 is correct (μ = 350) when the real value of μ is greater than 350.
10
Controlling the Probability of Committing a Type I Error
• Recall:
– H0: μ = 350 and H1: μ > 350.
– H0 is rejected if x̄ is sufficiently large.
• Thus, a type I error is made if x̄ exceeds the critical value when μ = 350.
• By properly selecting the critical value we can limit the probability of committing a type I error to an acceptable level.
[Figure: sampling distribution of x̄ centered at μ = 350, with the critical value marked on the x̄ axis.]
11
RESULTS OF A TEST OF HYPOTHESIS
• Tests are based on the following principle: Fix α, minimize β.

Decision            H0 is True                            H0 is False
Reject H0           Type I error, P(Type I error) = α     Correct Decision
Do not reject H0    Correct Decision                      Type II error, P(Type II error) = β
12
PROCEDURE OF STATISTICAL TEST
1. Determining H0 and HA.
2. Choosing the best test statistic.
3. Deciding the rejection region (Decision Rule).
4. Conclusion.
13
POWER OF THE TEST AND P-VALUE
• 1 - β = Power of the test = P(Reject H0 | H0 is not true)
• p-value = Observed significance level = The smallest level of significance α at which the null hypothesis can be rejected with the observed data. It is compared with the maximum value of α that you are willing to tolerate.
14
EXAMPLE 1
• For each of the following assertions, state whether it is a legitimate statistical hypothesis and why.
a) H: σ > 100. Yes, it is an assertion about the value of a parameter.
b) H: s ≤ 0.20. No. The sample standard deviation is not a parameter.
c) H: x̃ = 45. No. The sample median is not a parameter.
d) H: σ1/σ2 < 1. Yes. It is about the values of two population standard deviations.
e) H: X̄ - Ȳ = 5. No. They are statistics.
15
EXAMPLE 2
• To determine whether the pipe welds in a nuclear power plant meet specifications, a random sample of welds is selected, and tests are conducted on each weld in the sample. Weld strength is measured as the force required to break the weld. Suppose the specifications state that the mean strength of welds should exceed 100 lb/in²; the inspection team decides to test H0: μ = 100 versus HA: μ > 100. Explain why it might be preferable to use this HA rather than μ < 100.
16
EXAMPLE 2
• In this formulation, H0 states that the welds do not conform to specifications. This assertion will not be rejected unless there is strong evidence to the contrary. Thus the burden of proof is on those who wish to assert that the specification is satisfied. Using HA: μ < 100 results in the welds being believed to be in conformance unless proved otherwise, so the burden of proof is on the non-conformance claim.
17
EXAMPLE 3
• Before agreeing to purchase a large order of polyethylene sheaths for a particular type of high pressure oil-filled submarine power cable, a company wants to see conclusive evidence that the true standard deviation of sheath thickness is less than 0.05 mm. What hypotheses should be tested, and why? In this context, what are the type I and type II errors?
18
Solution 3
σ is the population standard deviation. So, the appropriate hypotheses are
H0: σ = 0.05 mm.
HA: σ < 0.05 mm.
With this formulation the burden of proof is on the data to show that the requirement has been met.
Type I error: Conclude that σ < 0.05 mm when it is really equal to 0.05 mm.
Type II error: Conclude that σ = 0.05 mm when it is really less than 0.05 mm.
19
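HYPOTHESIS TEST FOR POPULATION MEAN, μ
σ known and X ~ N(μ, σ²)
Two-sided Test
H0: μ = μ0
HA: μ ≠ μ0
Test Statistic:
$$ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} $$
• Reject H0 if z < -z_{α/2} or z > z_{α/2}.
Rejecting Area: [Figure: standard normal curve with area α/2 in each tail, below -z_{α/2} and above z_{α/2} (reject H0), and area 1 - α between them (do not reject H0).]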
20
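HYPOTHESIS TEST FOR POPULATION MEAN, μ
One-sided Tests
Test Statistic (both cases):
$$ z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} $$
1. H0: μ = μ0
HA: μ > μ0
Reject H0 if z > z_α.
Rejecting Area: [Figure: standard normal curve with area α above z_α (reject H0) and area 1 - α below it (do not reject H0).]
2. H0: μ = μ0
HA: μ < μ0
Reject H0 if z < -z_α.
Rejecting Area: [Figure: standard normal curve with area α below -z_α (reject H0) and area 1 - α above it (do not reject H0).]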
21
CALCULATION OF P-VALUE
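• Determine the value of the test statistic:
$$ z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} $$
• For One-Tailed Test:
p-value = P(z > z0) if HA: μ > μ0
p-value = P(z < z0) if HA: μ < μ0
• For Two-Tailed Test:
p = p-value = 2·P(z > z0) for z0 > 0
p = p-value = 2·P(z < z0) for z0 < 0
[Figure: standard normal curves showing the p-value as the tail area beyond z0 for the one-tailed tests, and as two tail areas of p/2 each beyond -z0 and z0 for the two-tailed test.]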
22
DECISION RULE BY USING P-VALUES
• REJECT H0 IF p-value < α
• DO NOT REJECT H0 IF p-value ≥ α
23
EXAMPLE 4
• Do the contents of bottles of catsup have a net weight below an advertised threshold of 16 ounces?
• To test this, 25 bottles of catsup were selected. They gave a sample mean net weight of x̄ = 15.9. It is known that the standard deviation is σ = 0.4. We want to test this at significance levels 1% and 5%.
24
Solution 4
The z-score is:
$$ Z = \frac{15.9 - 16}{0.4/\sqrt{25}} = -1.25 $$
The p-value is the probability of getting a score worse than this (relative to the alternative hypothesis), i.e.,
$$ P(Z < -1.25) = 0.1056 $$
Compare the p-value to the significance level. Since it is bigger than both 1% and 5%, we do not reject the null hypothesis.
25
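P-value for this one-tailed Test
• The p-value for this test is 0.1056.
• Thus, do not reject H0 at the 1% and 5% significance levels. There is not enough evidence to conclude that the contents of bottles of catsup have a mean net weight below the advertised threshold of 16 ounces.
[Figure: standard normal curve with the area 0.1056 to the left of z = -1.25 shaded, compared with tail areas of 0.10 and 0.05.]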
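The z-score and p-value of Example 4 can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the function name `z_test_mean` and its `alternative` argument are hypothetical names chosen for illustration and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a z-test of H0: mu = mu0 with known sigma.

    `alternative` (a hypothetical argument) is "less", "greater" or "two-sided".
    """
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if alternative == "less":
        p_value = std_normal.cdf(z)                      # P(Z < z0)
    elif alternative == "greater":
        p_value = 1.0 - std_normal.cdf(z)                # P(Z > z0)
    elif alternative == "two-sided":
        p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))   # 2 P(Z > |z0|)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Example 4: n = 25 bottles, sample mean 15.9, sigma = 0.4, HA: mu < 16
z, p_value = z_test_mean(xbar=15.9, mu0=16.0, sigma=0.4, n=25, alternative="less")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
for alpha in (0.01, 0.05):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```

Because the p-value exceeds both significance levels, the decision rule gives "do not reject H0" in both cases, in agreement with the hand calculation.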
26
Conclusions of a Test of Hypothesis
• If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.
• If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.
The alternative hypothesis is the more important one. It represents what we are investigating.
27
EXAMPLE 5
The melting point of each of 16 samples of a certain brand of hydrogenated vegetable oil was determined, resulting in x̄ = 94.32. Assume that the melting point is normal with σ = 1.20.
a) Test whether the true average melting point of a certain brand of hydrogenated vegetable oil is 95 when α = 0.01.
28
b) If a level .01 test is used, what is the probability of a Type II error when the true mean is 94?
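The slides give no worked solution for Example 5, so the following is only a sketch of one way to approach it. It assumes a two-sided alternative (HA: μ ≠ 95), which is how "test whether the mean is 95" is read here, and uses the standard-library `statistics.NormalDist`. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Data from Example 5
n, xbar, sigma = 16, 94.32, 1.20
mu0, alpha = 95.0, 0.01
se = sigma / sqrt(n)                      # standard error of the sample mean

# (a) two-sided z-test of H0: mu = 95 vs HA: mu != 95 (assumed alternative)
z = (xbar - mu0) / se
z_crit = std_normal.inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

# (b) P(Type II error) when the true mean is 94:
# H0 is not rejected when -z_crit <= (Xbar - mu0)/se <= z_crit, and under the
# true mean (Xbar - mu_true)/se is standard normal.
mu_true = 94.0
shift = (mu0 - mu_true) / se
beta = std_normal.cdf(z_crit + shift) - std_normal.cdf(-z_crit + shift)

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
print(f"beta(94) = {beta:.4f}, power = {1 - beta:.4f}")
```

Under these assumptions the test statistic stays inside the critical values, so H0 is not rejected, and the Type II error probability at μ = 94 is the probability that the standardized sample mean still lands in the non-rejection region.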
29
EXAMPLE 6
At a certain production facility that assembles computer keyboards, the assembly time is known (from experience) to follow a normal distribution with a mean of 130 seconds and a standard deviation of 15 seconds. The production supervisor suspects that the average time to assemble the keyboards does not quite follow the specified value. To examine this problem, he measures the times for 100 assemblies and finds that the sample mean assembly time (x̄) is 126.8 seconds. Can the supervisor conclude at the 5% level of significance that the mean assembly time of 130 seconds is incorrect?
30
Solution 6
• We want to prove that the time required to do the assembly is different from what experience dictates:
HA: μ ≠ 130, with x̄ = 126.8
• Since the standard deviation is σ = 15,
• The standardized test statistic value is:
$$ Z = \frac{126.8 - 130}{15/\sqrt{100}} = -2.13 $$
31
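Two-Tail Hypothesis:
H0: μ = 130
HA: μ ≠ 130
[Figure: standard normal curve for the test statistic z. Do not reject H0 when -z_{α/2} ≤ z ≤ z_{α/2} (central area 1 - α); reject H0 when z < -z_{α/2} or z > z_{α/2}. The combined area of the two tails is α, the Type I error probability.]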
32
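Test Statistic:
$$ z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{126.8 - 130}{15/\sqrt{100}} = -2.13 $$
[Figure: standard normal curve centered at 0 with the rejection region in the two tails, below -z_{α/2} and above z_{α/2}.]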
33
Conclusion 6
• Since -2.13 < -1.96, the test statistic falls in the rejection region.
• Hence, we reject the null hypothesis that the time required to do the assembly is still 130 seconds. The evidence suggests that the task now takes either more or less than 130 seconds.
34
Test of Hypothesis for the Population Mean (σ unknown)
For samples of size n drawn from a Normal Population, the test statistic
$$ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} $$
has a Student t-distribution with n - 1 degrees of freedom.
35
EXAMPLE 7
• Five measurements of the tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4, 14.3 and 14.6 mg per cigarette. Show that the difference between the mean of this sample and the average tar content claimed by the manufacturer, μ = 14.0, is significant at α = 0.05.
$$ \bar{x} = 14.4 $$
$$ s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{n - 1} = \frac{(14.5 - 14.4)^2 + \dots + (14.6 - 14.4)^2}{5 - 1} = 0.025 $$
$$ s = 0.158 $$
36
Solution 7
• H0: μ = 14.0
HA: μ ≠ 14.0
Decision Rule: Reject H0 if t < -t_{α/2} or t > t_{α/2}.
$$ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{14.4 - 14.0}{0.158/\sqrt{5}} = 5.66 $$
$$ t_{\alpha/2,\,n-1} = t_{0.025,\,4} = 2.776 $$
37
Conclusion 7
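• Reject H0 at α = 0.05. The difference is significant.
[Figure: t distribution with 4 degrees of freedom; the rejection regions, of area 0.025 each, lie below -2.776 and above 2.776, and the observed t = 5.66 falls in the upper rejection region.]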
38
P-value of This Test
• p-value = 2·P(t > 5.66) = 2(0.0024) = 0.0048
Since p-value = 0.0048 < α = 0.05, reject H0.
Minitab Output
T-Test of the Mean
Test of mu = 14.0000 vs mu not = 14.0000
Variable N Mean StDev SE Mean T P-Value
C1 5 14.4000 0.1581 0.0707 5.66 0.0048
39
CONCLUSION USING THE CONFIDENCE INTERVALS
MINITAB OUTPUT:
Confidence Intervals
Variable N Mean StDev SE Mean 95.0 % C.I.
C1 5 14.4000 0.1581 0.0707 ( 14.2036, 14.5964)
• Since μ = 14 is not in the confidence interval, reject H0.
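The t statistic and the 95% confidence interval of Example 7 can be reproduced with a short script. This is a sketch using only the Python standard library; the critical value 2.776 is the tabulated t_{0.025,4} quoted in Solution 7 rather than something the code computes, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

tar = [14.5, 14.2, 14.4, 14.3, 14.6]   # mg per cigarette, from Example 7
mu0 = 14.0                              # manufacturer's claimed mean
t_crit = 2.776                          # t_{0.025, 4}, taken from a t-table

n = len(tar)
xbar = mean(tar)
s = stdev(tar)                          # sample standard deviation (n - 1 divisor)
se = s / sqrt(n)                        # standard error of the mean

t = (xbar - mu0) / se
ci = (xbar - t_crit * se, xbar + t_crit * se)

reject_by_t = abs(t) > t_crit
reject_by_ci = not (ci[0] <= mu0 <= ci[1])

print(f"mean = {xbar:.4f}, s = {s:.4f}, SE = {se:.4f}, t = {t:.2f}")
print(f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"reject H0 (t rule): {reject_by_t}, reject H0 (CI rule): {reject_by_ci}")
```

For a two-sided test at level α, the t rule and the (1 - α) confidence interval rule always lead to the same decision, which is why both flags agree here.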
G. Baker, Department of StatisticsUniversity of South Carolina; Slide
40
Internal Combustion Engine
• The nominal power produced by a student-designed internal combustion engine should be 100 hp. The student team that designed the engine conducted 10 tests to determine the actual power. The data follow:
98, 101, 102, 97, 101, 98, 100, 92, 98, 100
Assume data came from a normal distribution.
41
Internal Combustion Engine
Summary Data:
Column   n    Mean   Std. Dev.
hp       10   98.7   2.9
What is the probability of getting a sample mean of 98.7 hp or less if the true mean is 100 hp?
Internal Combustion Engine
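[Figure: t distribution with df = 9, horizontal axis from -4 to 4, with the lower tail area shaded.]
$$ P(\bar{y} \le 98.7 \mid \mu = 100) = P\left(t_{df=9} \le \frac{98.7 - 100}{2.9/\sqrt{10}}\right) = P(t_{df=9} \le -1.418) = 0.0949 $$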
What did we assume when doing this analysis?
Are you comfortable with the assumption?
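The tail probability for the engine data can be approximated without statistical tables. The sketch below integrates the Student t density numerically with Simpson's rule, using only the Python standard library. The helper names `t_pdf` and `t_cdf` are hypothetical, and the summary values 98.7 and 2.9 are the rounded figures from the slide.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) with Simpson's rule on [0, |x|] plus symmetry.

    `steps` must be even for Simpson's rule.
    """
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


# Summary data from the slide: n = 10, mean = 98.7, std. dev. = 2.9
n, ybar, s, mu0 = 10, 98.7, 2.9, 100.0
t = (ybar - mu0) / (s / sqrt(n))
p = t_cdf(t, df=n - 1)
print(f"t = {t:.3f}, P(t_9 <= t) = {p:.4f}")
```

The calculation rests on the stated assumption that the data come from a normal distribution, which is what justifies using the t distribution with n - 1 = 9 degrees of freedom.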
43
EXAMPLE 8
The amount of shaft wear (.0001 in.) after a fixed mileage was determined for each of n = 8 internal combustion engines having copper lead as a bearing material, resulting in x̄ = 3.72 and s = 1.25. Assuming that the distribution of shaft wear is normal with mean μ, test whether the mean shaft wear is greater than 3.5 at the 5% significance level.
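No solution is given for Example 8 in the slides; one possible way to set it up is an upper-tailed one-sample t-test of H0: μ = 3.5 against HA: μ > 3.5. The sketch below assumes that formulation and uses the tabulated critical value t_{0.05,7} = 1.895 from a standard t-table.

```python
from math import sqrt

# Summary data from Example 8
n, xbar, s = 8, 3.72, 1.25
mu0 = 3.5
t_crit = 1.895                     # t_{0.05, 7}, from a standard t-table

t = (xbar - mu0) / (s / sqrt(n))   # one-sample t statistic, df = n - 1 = 7
reject = t > t_crit                # upper-tailed rule: reject H0 if t > t_{alpha, n-1}

print(f"t = {t:.3f}, critical value = {t_crit}, reject H0: {reject}")
```

Under these assumptions the statistic is well below the critical value, so the data do not provide enough evidence that the mean shaft wear exceeds 3.5.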
44
EXAMPLE 9
To obtain information on the corrosion-resistance properties of a certain type of steel conduit, 25 specimens are buried in soil for a 2-year period. The maximum penetration (in mils) for each specimen is then measured, yielding a sample average penetration of x̄ = 52.7 and a sample standard deviation of s = 4.8. The conduits were manufactured with the specification that the true average penetration be at most 50 mils. They will be used unless it can be demonstrated conclusively that the specification has not been met. What would you conclude?
45
TESTING HYPOTHESIS ABOUT POPULATION PROPORTION, p
• ASSUMPTIONS:
1. The experiment is binomial.
2. The sample size is large enough.
x: The number of successes
The sample proportion is
$$ \hat{p} = \frac{x}{n} \sim N\left(p, \frac{p(1-p)}{n}\right) $$
approximately for large n (np ≥ 5 and n(1 - p) ≥ 5).
46
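HYPOTHESIS TEST FOR p
Two-sided Test
H0: p = p0
HA: p ≠ p0
Test Statistic:
$$ z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \quad \text{where } np \ge 5 \text{ and } n(1-p) \ge 5 $$
with p set to the hypothesized value p0.
• Reject H0 if z < -z_{α/2} or z > z_{α/2}.
Rejecting Area: [Figure: standard normal curve with area α/2 in each tail, below -z_{α/2} and above z_{α/2} (reject H0); do not reject H0 between them.]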
47
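HYPOTHESIS TEST FOR p
One-sided Tests
Test Statistic (both cases):
$$ z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \quad \text{where } np \ge 5 \text{ and } n(1-p) \ge 5 $$
with p set to the hypothesized value p0.
1. H0: p = p0
HA: p > p0
Reject H0 if z > z_α.
Rejecting Area: [Figure: standard normal curve; reject H0 above z_α, do not reject H0 below it.]
2. H0: p = p0
HA: p < p0
Reject H0 if z < -z_α.
Rejecting Area: [Figure: standard normal curve; reject H0 below -z_α, do not reject H0 above it.]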
48
EXAMPLE 10
• Mom’s Home Cokin’ claims that 70% of the customers are able to dine for less than $5. Mom wishes to test this claim at the 92% level of confidence. A random sample of 110 patrons revealed that 66 paid less than $5 for lunch.
H0: p = 0.70
HA: p ≠ 0.70
49
Solution 10
• x = 66, n = 110 and p = 0.70
α = 0.08, z_{α/2} = z_{0.04} = 1.75
• Test Statistic:
$$ \hat{p} = \frac{x}{n} = \frac{66}{110} = 0.6 $$
$$ z = \frac{0.6 - 0.7}{\sqrt{(0.7)(0.3)/110}} = -2.289 $$
50
Conclusion 10
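• DECISION RULE:
Reject H0 if z < -1.75 or z > 1.75.
• CONCLUSION: Reject H0 at α = 0.08. Mom’s claim is not true.
[Figure: standard normal curve with rejection regions of area α/2 each below -1.75 and above 1.75; the observed z = -2.289 falls in the lower rejection region.]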
51
P-Value
• p-value = 2·P(z < -2.289) = 2(0.011) = 0.022
The smallest value of α at which H0 can be rejected is 0.022.
Since p-value = 0.022 < α = 0.08, reject H0.
[Figure: standard normal curve with tail areas of 0.011 each below -2.289 and above 2.289.]
52
CONFIDENCE INTERVAL APPROACH
• Find the 92% CI for p:
$$ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.6 \pm 1.75\sqrt{\frac{(0.6)(0.4)}{110}} $$
• 92% CI for p: 0.518 ≤ p ≤ 0.682
• The hypothesis should be two-sided to use the confidence interval approach.
• Since p = 0.7 is not in the above interval, reject H0. Mom has underestimated the cost of her meal.
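The three approaches used for Example 10 (rejection region, p-value, confidence interval) can be checked together. This is a minimal sketch with the standard-library `statistics.NormalDist`; the variable names are illustrative, and the critical value is computed from the normal quantile function rather than read from a table, so it differs slightly from the rounded 1.75.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

# Example 10: 66 of 110 patrons paid less than $5; H0: p = 0.70, alpha = 0.08
x, n, p0, alpha = 66, 110, 0.70, 0.08

p_hat = x / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0 under H0
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # z_{alpha/2}
p_value = 2 * std_normal.cdf(-abs(z))        # two-sided p-value

# The confidence interval uses the estimated proportion in the standard error
half_width = z_crit * sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - half_width, p_hat + half_width)

print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, z_crit = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}")
print(f"92% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print("reject H0:", abs(z) > z_crit, p_value < alpha, not (ci[0] <= p0 <= ci[1]))
```

Note that the test statistic and the confidence interval use different standard errors (p0 versus p̂), so the two approaches agree here but are not guaranteed to agree in borderline cases.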
53
EXAMPLE 11
• Scientists think that robots will play a crucial role in factories in the next several decades. Suppose that in an experiment to determine whether the use of robots to weave computer cables is feasible, a robot was used to assemble 500 cables. The cables were examined and there were 15 defectives. If human assemblers have a defect rate of 0.035, does this data support the hypothesis that the proportion of defectives is lower for robots than humans? Use a 1% significance level.
54
Solution 11
H0: p = 0.035
HA: p < 0.035
It is given that x = 15 and n = 500. Thus,
$$ \hat{p} = \frac{x}{n} = \frac{15}{500} = 0.03 $$
X ~ Bin(n = 500, p = 0.035)
Since np = 17.5 > 5 and n(1 - p) = 482.5 > 5, we can use the normal approximation to the binomial.
55
Solution 11 (continued)
• Test statistic:
$$ z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{0.03 - 0.035}{\sqrt{0.035(1 - 0.035)/500}} = -0.6084 $$
• Rejection region:
Reject H0 if z < -z_α = -z_{0.01} = -2.33.
• Conclusion: Since z = -0.6084 > -z_{0.01} = -2.33, do not reject H0. Robots do not demonstrate their superiority.
56
Example 12 (Predicting the winner on election day)
– Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.
– Based on the data (where 1 = Democrat and 2 = Republican), can the network conclude that the Republican candidate will win the state college vote?
57
Solution 12
• The problem objective is to describe the population of votes in the state.
– The data are nominal.
– The parameter to be tested is ‘p’.
– Success is defined as “Vote Republican”.
– The hypotheses are:
H0: p = .5
H1: p > .5 (More than 50% vote Republican)
58
Solving by hand
The rejection region is z > z_α = z_{.05} = 1.645.
From the file we count 407 successes. The number of voters participating is 765.
The sample proportion is
$$ \hat{p} = 407/765 = .532 $$
The value of the test statistic is
$$ Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{.532 - .5}{\sqrt{.5(1 - .5)/765}} = 1.77 $$
The p-value is P(Z > 1.77) = .0382
59
Testing the Proportion
z-Test: Proportion
Sample Proportion          0.532
Observations               765
Hypothesized Proportion    0.5
z Stat                     1.77
P(Z<=z) one-tail           0.0382
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.0764
z Critical two-tail        1.96
There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis (one-tail p-value 0.0382 <= 0.05). At the 5% significance level we can conclude that more than 50% voted Republican.
Simple formula for difference in proportions
$$ n = \frac{2\,\bar{p}(1-\bar{p})\,(Z_\beta + Z_{\alpha/2})^2}{(p_1 - p_2)^2} $$
• n: Sample size in each group (assumes equal sized groups)
• Z_β: Represents the desired power (typically .84 for 80% power).
• Z_{α/2}: Represents the desired level of statistical significance (typically 1.96).
• p̄(1 - p̄): A measure of variability (similar to standard deviation)
• p1 - p2: Effect Size (the difference in proportions)
Simple formula for difference in means
$$ n = \frac{2\sigma^2 (Z_\beta + Z_{\alpha/2})^2}{\text{difference}^2} $$
• n: Sample size in each group (assumes equal sized groups)
• Z_β: Represents the desired power (typically .84 for 80% power).
• Z_{α/2}: Represents the desired level of statistical significance (typically 1.96).
• σ: Standard deviation of the outcome variable
• difference: Effect Size (the difference in means)
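The two sample-size formulas can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, p̄ is assumed to be the average of the two group proportions, and the input values in the usage lines (σ = 10, difference = 5, p1 = 0.5, p2 = 0.4) are made-up numbers, not data from the slides.

```python
from math import ceil
from statistics import NormalDist

std_normal = NormalDist()


def n_per_group_means(sigma, difference, alpha=0.05, power=0.80):
    """Per-group sample size for detecting a difference in means."""
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)            # about 0.84 for 80% power
    return 2 * sigma**2 * (z_beta + z_alpha) ** 2 / difference**2


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for detecting a difference in proportions.

    Assumes p_bar is the average of the two proportions.
    """
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    return 2 * p_bar * (1 - p_bar) * (z_beta + z_alpha) ** 2 / (p1 - p2) ** 2


# Hypothetical inputs, chosen only to exercise the formulas
n_means = n_per_group_means(sigma=10.0, difference=5.0)
n_props = n_per_group_proportions(p1=0.5, p2=0.4)
print(f"means: n = {n_means:.2f} per group, round up to {ceil(n_means)}")
print(f"proportions: n = {n_props:.2f} per group, round up to {ceil(n_props)}")
```

Rounding up is the usual convention, since a fractional sample size rounded down would fall slightly short of the requested power.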
Sample size calculators on the web…
• http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize
• http://calculators.stat.ucla.edu
• http://hedwig.mgh.harvard.edu/sample_size/size.html