
11. Hypothesis Testing

Date post: 03-Jan-2016
Upload: nurgazy-nazhimidinov
Page 1: 11. Hypothesis Testing

1

INTRODUCTION TO HYPOTHESIS TESTING

Page 2: 11. Hypothesis Testing

2

HYPOTHESIS TESTING

• STATISTICAL TEST: The statistical procedure to draw an appropriate conclusion from sample data about a population parameter.

• HYPOTHESIS: Any statement concerning an unknown population parameter.

• Aim of a statistical test: Test a hypothesis concerning the values of one or more population parameters.

Page 3: 11. Hypothesis Testing

3

Concepts of Hypothesis Testing

• The critical concepts of hypothesis testing.

– Example:

• An operation manager needs to determine if the mean demand during lead time is greater than 350.

• If so, changes in the ordering policy are needed.

– There are two hypotheses about a population mean:

• H0 (the null hypothesis): $\mu = 350$

• H1 (the alternative hypothesis): $\mu > 350$. This is what you want to prove.

Page 4: 11. Hypothesis Testing

4

HYPOTHESIS TESTING

• Examples

– Is there statistical evidence in a random sample of inside diameters of a certain type of PVC pipe that supports the hypothesis that the true average of all the inside diameters of this type of PVC pipe is 0.75?

– Is there statistical evidence in a random sample of circuit boards that supports the hypothesis that less than 10% of the circuit boards are defective among all circuit boards produced by a certain manufacturer?

Page 5: 11. Hypothesis Testing

5

NULL AND ALTERNATIVE HYPOTHESIS

• NULL HYPOTHESIS=H0 states that a treatment has no effect or there is no change compared with the previous situation. The parameter is equal to a single value.

• ALTERNATIVE HYPOTHESIS=HA states that a treatment has a significant effect or there is a change compared with the previous situation. The parameter can be greater than, less than, or different from the value shown in H0.

Page 6: 11. Hypothesis Testing

6

TEST STATISTIC AND REJECTION REGION

• TEST STATISTIC: The sample statistic on which we base our decision to reject or not reject the null hypothesis.

• REJECTION REGION: The range of values such that, if the test statistic falls in that range, we reject the null hypothesis; otherwise, we do not reject it. The probability that the (standardized) test statistic falls in the rejection region when H0 is true is the PROBABILITY OF TYPE I ERROR, or SIGNIFICANCE LEVEL OF THE TEST, denoted $\alpha$.

Page 7: 11. Hypothesis Testing

7

Concepts of Hypothesis Testing

• Assume the null hypothesis is true ($\mu = 350$).

– Sample from the demand population, and build a statistic related to the parameter hypothesized (the sample mean).

– Pose the question: How probable is it to obtain a sample mean at least as extreme as the one observed from the sample, if H0 is correct?

Page 8: 11. Hypothesis Testing

8

Concepts of Hypothesis Testing

• Assume the null hypothesis is true ($\mu = 350$).

– If $\bar{x} = 450$: since the sample mean is much larger than 350, the mean $\mu$ is likely to be greater than 350. Reject the null hypothesis.

– If $\bar{x} = 355$: the sample mean is close to 350, so in this case the mean $\mu$ is not likely to be greater than 350. Do not reject the null hypothesis.

Page 9: 11. Hypothesis Testing

9

Types of Errors

• Two types of errors may occur when deciding whether to reject H0 based on the statistic value.

– Type I error: Reject H0 when it is true.

– Type II error: Do not reject H0 when it is false.

• Example continued

– Type I error: Reject H0 ($\mu = 350$) in favor of H1 ($\mu > 350$) when the real value of $\mu$ is 350.

– Type II error: Believe that H0 is correct ($\mu = 350$) when the real value of $\mu$ is greater than 350.

Page 10: 11. Hypothesis Testing

10

Controlling the Probability of Committing a Type I Error

• Recall:

– H0: $\mu = 350$ and H1: $\mu > 350$.

– H0 is rejected if $\bar{x}$ is sufficiently large.

• Thus, a type I error is made if $\bar{x}$ exceeds the critical value when $\mu = 350$.

• By properly selecting the critical value we can limit the probability of committing a type I error to an acceptable level.

[Figure: sampling distribution of $\bar{x}$ centered at $\mu = 350$, with the critical value marked in the upper tail.]
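As a minimal sketch of how a critical value could be chosen, the snippet below solves $P(\bar{x} > c \mid \mu = 350) = \alpha$ for $c$, assuming a normal sampling distribution with known standard deviation. The values `sigma=20`, `n=25` and `alpha=0.05` are hypothetical; the slides do not give them for this example.

```python
from math import sqrt
from statistics import NormalDist


def critical_value(mu0, sigma, n, alpha):
    """Sample mean above which H0: mu = mu0 is rejected in favor of H1: mu > mu0."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail standard normal quantile
    return mu0 + z_alpha * sigma / sqrt(n)


# Hypothetical inputs: sigma, n and alpha are assumptions, not taken from the slides.
c = critical_value(mu0=350, sigma=20, n=25, alpha=0.05)
print(f"Reject H0 when the sample mean exceeds {c:.2f}")
```

A smaller `alpha` pushes the critical value further from 350, which lowers the chance of a type I error at the cost of making H0 harder to reject.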

Page 11: 11. Hypothesis Testing

11

RESULTS OF A TEST OF HYPOTHESIS

• Tests are based on the following principle:

Fix $\alpha$, minimize $\beta$.

                    H0 is True                                  H0 is False
Reject H0           Type I error, P(Type I error) = $\alpha$    Correct Decision
Do not reject H0    Correct Decision                            Type II error, P(Type II error) = $\beta$

Page 12: 11. Hypothesis Testing

12

PROCEDURE OF STATISTICAL TEST

1. Determining H0 and HA.

2. Choosing the best test statistic.

3. Deciding the rejection region (Decision Rule).

4. Conclusion.

Page 13: 11. Hypothesis Testing

13

POWER OF THE TEST AND P-VALUE

• $1 - \beta$ = Power of the test = P(Reject H0 | H0 is not true)

• p-value = Observed significance level = The smallest level of significance at which the null hypothesis can be rejected. It is compared with $\alpha$, the maximum probability of a type I error that you are willing to tolerate.

Page 14: 11. Hypothesis Testing

14

EXAMPLE 1

• For each of the following assertions, state whether it is a legitimate statistical hypothesis and why.

a) H: $\sigma > 100$. Yes, it is an assertion about the value of a parameter.

b) H: $s \leq 0.20$. No. The sample standard deviation is not a parameter.

c) H: $\tilde{X} = 45$. No. The sample median is not a parameter.

d) H: $\sigma_1 / \sigma_2 < 1$. Yes. It is about the values of two population standard deviations.

e) H: $\bar{X} - \bar{Y} = 5$. No. They are statistics.

Page 15: 11. Hypothesis Testing

15

EXAMPLE 2

• To determine whether the pipe welds in a nuclear power plant meet specifications, a random sample of welds is selected, and tests are conducted on each weld in the sample. Weld strength is measured as the force required to break the weld. Suppose the specifications state that mean strength of welds should exceed 100 lb/in²; the inspection team decides to test H0: $\mu = 100$ versus HA: $\mu > 100$. Explain why it might be preferable to use this HA rather than $\mu < 100$.

Page 16: 11. Hypothesis Testing

16

EXAMPLE 2

• In this formulation, H0 states the welds do not conform to specifications. This assertion will not be rejected unless there is strong evidence to the contrary. Thus the burden of proof is on those who wish to assert that the specification is satisfied. Using HA: $\mu < 100$ results in the welds being believed in conformance unless proved otherwise, so the burden of proof is on the non-conformance claim.

Page 17: 11. Hypothesis Testing

17

EXAMPLE 3

• Before agreeing to purchase a large order of polyethylene sheaths for a particular type of high pressure oil-filled submarine power cable, a company wants to see conclusive evidence that the true standard deviation of sheath thickness is less than 0.05 mm. What hypotheses should be tested, and why? In this context, what are the type I and type II errors?

Page 18: 11. Hypothesis Testing

18

Solution 3

$\sigma$ is the population standard deviation. So, the appropriate hypotheses are

H0: $\sigma = 0.05$ mm.

HA: $\sigma < 0.05$ mm.

With this formulation the burden of proof is on the data to show that the requirement has been met.

Type I error: Conclude that $\sigma < 0.05$ mm when it is really equal to 0.05 mm.

Type II error: Conclude that $\sigma = 0.05$ mm when it is really less than 0.05 mm.

Page 19: 11. Hypothesis Testing

19

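HYPOTHESIS TEST FOR POPULATION MEAN, $\mu$

$\sigma$ known and $X \sim N(\mu, \sigma^2)$

Two-sided Test

H0: $\mu = \mu_0$

HA: $\mu \neq \mu_0$

Test Statistic:

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

• Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.

[Figure: Rejecting Area. Standard normal curve with area $1 - \alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$ (do not reject H0) and area $\alpha/2$ in each tail (reject H0).]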

Page 20: 11. Hypothesis Testing

20

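HYPOTHESIS TEST FOR POPULATION MEAN, $\mu$

One-sided Tests

Test Statistic (both cases):

$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

1. H0: $\mu = \mu_0$

HA: $\mu > \mu_0$

Reject H0 if $z > z_{\alpha}$.

[Figure: Rejecting Area. Standard normal curve with area $1 - \alpha$ to the left of $z_{\alpha}$ (do not reject H0) and area $\alpha$ in the upper tail (reject H0).]

2. H0: $\mu = \mu_0$

HA: $\mu < \mu_0$

Reject H0 if $z < -z_{\alpha}$.

[Figure: Rejecting Area. Standard normal curve with area $\alpha$ in the lower tail below $-z_{\alpha}$ (reject H0) and area $1 - \alpha$ to the right (do not reject H0).]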
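The rejection-region rules for the z test with known standard deviation can be sketched in a few lines of Python. The function name `z_test_decision` and the input numbers are hypothetical, chosen only for illustration; the critical values come from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_test_decision(xbar, mu0, sigma, n, alpha, alternative="two-sided"):
    """Return (z, reject) for H0: mu = mu0 when sigma is known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()
    if alternative == "two-sided":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, reject


# Hypothetical data, not from the slides.
z, reject = z_test_decision(xbar=52.0, mu0=50.0, sigma=5.0, n=25,
                            alpha=0.05, alternative="greater")
print(f"z = {z:.2f}, reject H0: {reject}")
```

The same statistic is used for all three alternatives; only the tail (or tails) in which the rejection region lies changes.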

Page 21: 11. Hypothesis Testing

21

CALCULATION OF P-VALUE

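• Determine the value of the test statistic:

$$z_0 = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

• For One-Tailed Test:

p-value = $P(z > z_0)$ if HA: $\mu > \mu_0$

p-value = $P(z < z_0)$ if HA: $\mu < \mu_0$

• For Two-Tailed Test:

p = p-value = $2 \cdot P(z > z_0)$ for $z_0 > 0$

p = p-value = $2 \cdot P(z < z_0)$ for $z_0 < 0$

[Figure: standard normal curves showing the p-value as the tail area beyond $z_0$ for one-tailed tests, and as two tail areas of $p/2$ each beyond $-z_0$ and $z_0$ for the two-tailed test.]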
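A small sketch of the p-value calculation for a z statistic, covering the one-tailed and two-tailed cases. The helper name `z_p_value` and the value `z0 = -1.25` are illustrative assumptions.

```python
from statistics import NormalDist


def z_p_value(z0, alternative):
    """p-value of an observed standard normal test statistic z0."""
    phi = NormalDist().cdf
    if alternative == "greater":      # HA: mu > mu0
        return 1 - phi(z0)
    if alternative == "less":         # HA: mu < mu0
        return phi(z0)
    if alternative == "two-sided":    # HA: mu != mu0, both tails beyond |z0|
        return 2 * (1 - phi(abs(z0)))
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


z0 = -1.25  # illustrative value
for alt in ("less", "greater", "two-sided"):
    print(alt, round(z_p_value(z0, alt), 4))
```

Using `abs(z0)` in the two-sided branch handles both signs of $z_0$ with a single expression.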

Page 22: 11. Hypothesis Testing

22

DECISION RULE BY USING P-VALUES

• REJECT H0 IF p-value < $\alpha$

• DO NOT REJECT H0 IF p-value $\geq \alpha$

Page 23: 11. Hypothesis Testing

23

EXAMPLE 4

• Do the contents of bottles of catsup have a net weight below an advertised threshold of 16 ounces?

• To test this, 25 bottles of catsup were selected. They gave a net sample mean weight of $\bar{X} = 15.9$. It is known that the standard deviation is $\sigma = 0.4$. We want to test this at significance levels 1% and 5%.

Page 24: 11. Hypothesis Testing

24

Solution 4

The z-score is:

$$Z = \frac{15.9 - 16}{0.4/\sqrt{25}} = -1.25$$

The p-value is the probability of getting a score worse than this (relative to the alternative hypothesis), i.e.,

$$P(Z < -1.25) = 0.1056$$

Compare the p-value to the significance level. Since it is bigger than both 1% and 5%, we do not reject the null hypothesis.

Page 25: 11. Hypothesis Testing

25

P-value for this one-tailed Test

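• The p-value for this test is 0.1056.

• Thus, do not reject H0 at the 1% and 5% significance levels. There is not enough evidence to conclude that the contents of bottles of catsup have a net weight below the advertised threshold of 16 ounces.

[Figure: standard normal curve with the area 0.1056 to the left of $z = -1.25$ shaded.]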

Page 26: 11. Hypothesis Testing

26

Conclusions of a Test of Hypothesis

• If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.

• If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.

The alternative hypothesis is the more important one. It represents what we are investigating.

Page 27: 11. Hypothesis Testing

27

EXAMPLE 5

The melting point of each of 16 samples of a certain brand of hydrogenated vegetable oil was determined, resulting in $\bar{x} = 94.32$. Assume that the melting point is normal with $\sigma = 1.20$.

a) Test whether the true average melting point of a certain brand of hydrogenated vegetable oil is 95 when $\alpha = 0.01$.

Page 28: 11. Hypothesis Testing

28

b) If a level 0.01 test is used, what is the probability of a type II error when the true mean is 94?
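The slides pose Example 5 without a worked solution. The following is one possible sketch, assuming a two-sided test (H0: mean = 95 versus HA: mean not equal to 95) with known standard deviation. The function name `beta_two_sided` is hypothetical, and the printed numbers are computed here rather than quoted from the slides.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()


def beta_two_sided(mu0, mu_true, sigma, n, alpha):
    """P(do not reject H0: mu = mu0) for the two-sided z test when the mean is mu_true."""
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu0 - mu_true) / (sigma / sqrt(n))
    # Z stays inside (-z_crit, z_crit) although its true center is at -shift.
    return std_normal.cdf(z_crit + shift) - std_normal.cdf(-z_crit + shift)


# Part (a): test statistic and decision at alpha = 0.01 (two-sided test assumed).
z = (94.32 - 95) / (1.20 / sqrt(16))
z_crit = std_normal.inv_cdf(1 - 0.01 / 2)
print(f"z = {z:.2f}, critical value = {z_crit:.3f}, reject H0: {abs(z) > z_crit}")

# Part (b): probability of a type II error when the true mean is 94.
beta = beta_two_sided(mu0=95, mu_true=94, sigma=1.20, n=16, alpha=0.01)
print(f"beta(94) = {beta:.3f}, power = {1 - beta:.3f}")
```

Under these assumptions the statistic does not reach the critical value, so H0 is not rejected in part (a), and part (b) gives a type II error probability of roughly 0.22.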

Page 29: 11. Hypothesis Testing

29

EXAMPLE 6

At a certain production facility that assembles computer keyboards, the assembly time is known (from experience) to follow a normal distribution with mean of 130 seconds and standard deviation of 15 seconds. The production supervisor suspects that the average time to assemble the keyboards does not quite follow the specified value. To examine this problem, he measures the times for 100 assemblies and finds that the sample mean assembly time ($\bar{x}$) is 126.8 seconds. Can the supervisor conclude at the 5% level of significance that the mean assembly time of 130 seconds is incorrect?

Page 30: 11. Hypothesis Testing

30

Solution 6

• We want to prove that the time required to do the assembly is different from what experience dictates:

HA: $\mu \neq 130$

• The sample mean is $\bar{X} = 126.8$.

• Since the standard deviation is $\sigma = 15$, the standardized test statistic value is:

$$Z = \frac{126.8 - 130}{15/\sqrt{100}} = -2.13$$

Page 31: 11. Hypothesis Testing

31

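Two-Tail Hypothesis:

H0: $\mu = \mu_0$

HA: $\mu \neq \mu_0$

• Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$, where z is the test statistic value.

• Do not reject H0 if $-z_{\alpha/2} \leq z \leq z_{\alpha/2}$.

[Figure: standard normal curve centered at 0 with area $1 - \alpha$ between $-z_{\alpha/2}$ and $z_{\alpha/2}$; the two tails together carry the type I error probability $\alpha$.]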

Page 32: 11. Hypothesis Testing

32

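Test Statistic:

$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{126.8 - 130}{15/\sqrt{100}} = -2.13$$

[Figure: standard normal curve centered at 0 with the rejection region in both tails, beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$.]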

Page 33: 11. Hypothesis Testing

33

Conclusion 6

• Since $-2.13 < -1.96$, it falls in the rejection region.

• Hence, we reject the null hypothesis that the time required to do the assembly is still 130 seconds. The evidence suggests that the task now takes either more or less than 130 seconds.

Page 34: 11. Hypothesis Testing

34

Test of Hypothesis for the Population Mean ($\sigma$ unknown)

For samples of size n drawn from a Normal Population, the test statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

has a Student t-distribution with $n - 1$ degrees of freedom.

Page 35: 11. Hypothesis Testing

35

EXAMPLE 7

• Five measurements of the tar content of a certain kind of cigarette yielded 14.5, 14.2, 14.4, 14.3 and 14.6 mg per cigarette. Show that the difference between the mean of this sample and the average tar content claimed by the manufacturer, $\mu = 14.0$, is significant at $\alpha = 0.05$.

$$\bar{x} = 14.4$$

$$s^2 = \frac{\sum_{i=1}^{5} (x_i - \bar{x})^2}{n - 1} = \frac{(14.5 - 14.4)^2 + \cdots + (14.6 - 14.4)^2}{5 - 1} = 0.025$$

$$s = 0.158$$

Page 36: 11. Hypothesis Testing

36

Solution 7

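• H0: $\mu = 14.0$

HA: $\mu \neq 14.0$

Decision Rule: Reject H0 if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{14.4 - 14.0}{0.158/\sqrt{5}} = 5.66$$

$$t_{\alpha/2,\,n-1} = t_{0.025,\,4} = 2.776$$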
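The hand calculation for the tar-content data can be checked with the Python standard library. This is only a sketch: the critical value 2.776 is taken from the slide rather than computed, since the standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

tar = [14.5, 14.2, 14.4, 14.3, 14.6]  # mg per cigarette, from Example 7
mu0 = 14.0                            # manufacturer's claimed mean
t_crit = 2.776                        # t_{0.025, 4} as quoted on the slide

n = len(tar)
xbar = mean(tar)
s = stdev(tar)                        # sample standard deviation (n - 1 divisor)
t = (xbar - mu0) / (s / sqrt(n))
reject = abs(t) > t_crit

print(f"xbar = {xbar:.1f}, s = {s:.3f}, t = {t:.2f}, reject H0: {reject}")
```

Note that `statistics.stdev` uses the $n - 1$ divisor, matching the formula for $s^2$ in the example; `statistics.pstdev` would divide by $n$ and give a different statistic.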

Page 37: 11. Hypothesis Testing

37

Conclusion 7

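• Reject H0 at $\alpha = 0.05$. Difference is significant.

[Figure: t distribution with rejection regions of area 0.025 in each tail, beyond $-2.776$ and $2.776$; the observed $t = 5.66$ falls in the upper rejection region.]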

Page 38: 11. Hypothesis Testing

38

P-value of This Test

• p-value = $2 \cdot P(t > 5.66) = 2(0.0024) = 0.0048$

Since p-value = 0.0048 < $\alpha$ = 0.05, reject H0.

Minitab Output

T-Test of the Mean

Test of mu = 14.0000 vs mu not = 14.0000

Variable    N    Mean       StDev     SE Mean    T       P-Value
C1          5    14.4000    0.1581    0.0707     5.66    0.0048

Page 39: 11. Hypothesis Testing

39

CONCLUSION USING THE CONFIDENCE INTERVALS

MINITAB OUTPUT:

Confidence Intervals

Variable    N    Mean       StDev     SE Mean    95.0 % C.I.
C1          5    14.4000    0.1581    0.0707     (14.2036, 14.5964)

• Since $\mu = 14$ is not in the interval, reject H0.
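The interval in the Minitab output can be reproduced approximately from the summary statistics. This sketch assumes the tabulated critical value $t_{0.025,4} = 2.776$ quoted earlier in the slides, so the last digit may differ slightly from Minitab's.

```python
from math import sqrt

n, xbar, s = 5, 14.4, 0.1581   # summary statistics from the Minitab output
t_crit = 2.776                 # t_{0.025, 4}, tabulated value used in the slides
mu0 = 14.0                     # hypothesized mean

margin = t_crit * s / sqrt(n)
ci = (xbar - margin, xbar + margin)
reject = not (ci[0] <= mu0 <= ci[1])   # two-sided test at alpha = 0.05

print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f}), reject H0: {reject}")
```

Checking whether the hypothesized mean lies in the 95% interval gives the same decision as the two-sided t test at the 5% level.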

Page 40: 11. Hypothesis Testing

G. Baker, Department of Statistics, University of South Carolina; Slide 40

Internal Combustion Engine

• The nominal power produced by a student-designed internal combustion engine should be 100 hp. The student team that designed the engine conducted 10 tests to determine the actual power. The data follow:

98, 101, 102, 97, 101, 98, 100, 92, 98, 100

Assume data came from a normal distribution.

Page 41: 11. Hypothesis Testing


Internal Combustion Engine

Summary Data:

Column    n     Mean    Std. Dev.
hp        10    98.7    2.9

What is the probability of getting a sample mean of 98.7 hp or less if the true mean is 100 hp?

Page 42: 11. Hypothesis Testing

Internal Combustion Engine

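[Figure: t distribution with df = 9, horizontal axis from $-4$ to 4.]

$$P(\bar{y} \leq 98.7 \mid \mu = 100) = P\left(t_{df=9} \leq \frac{98.7 - 100}{2.9/\sqrt{10}}\right) = P(t_{df=9} \leq -1.418) = 0.0949$$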
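The tail probability of the t distribution can be approximated without a statistics package by integrating the Student t density numerically. This is a sketch using composite Simpson's rule; the helper names `t_pdf` and `t_cdf` are hypothetical, and a library routine such as a t-distribution CDF from a statistics package would normally be used instead.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) via composite Simpson's rule on [0, |x|] and symmetry about 0."""
    b = abs(x)
    h = b / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


t_obs = (98.7 - 100) / (2.9 / sqrt(10))   # summary statistics from the slide
p = t_cdf(t_obs, df=9)
print(f"t = {t_obs:.3f}, P(t_9 <= t) = {p:.4f}")
```

The calculation relies on the same assumption the slide asks about: the power measurements are treated as a random sample from a normal distribution.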

What did we assume when doing this analysis?

Are you comfortable with the assumption?

Page 43: 11. Hypothesis Testing

43

EXAMPLE 8

The amount of shaft wear (0.0001 in.) after a fixed mileage was determined for each of n = 8 internal combustion engines having copper lead as a bearing material, resulting in $\bar{x} = 3.72$ and $s = 1.25$. Assuming that the distribution of shaft wear is normal with mean $\mu$, test whether the mean shaft wear is greater than 3.5 at the 5% significance level.
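The slides leave Example 8 unsolved. One possible worked solution is sketched below, assuming the one-sided t test for a normal population with unknown standard deviation. The critical value $t_{0.05,7} \approx 1.895$ is a standard t-table value and does not appear in the slides.

```latex
\begin{align*}
H_0 &: \mu = 3.5, \qquad H_A : \mu > 3.5 \\
t &= \frac{\bar{x} - \mu_0}{s/\sqrt{n}}
   = \frac{3.72 - 3.5}{1.25/\sqrt{8}} \approx 0.50 \\
t_{\alpha,\,n-1} &= t_{0.05,\,7} \approx 1.895
\end{align*}
```

Since $0.50 < 1.895$, the statistic does not fall in the rejection region, so under these assumptions H0 is not rejected at the 5% level: the data do not show that mean shaft wear exceeds 3.5.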

Page 44: 11. Hypothesis Testing

44

EXAMPLE 9

To obtain information on the corrosion-resistance properties of a certain type of steel conduit, 25 specimens are buried in soil for a 2-year period. The maximum penetration (in mils) for each specimen is then measured, yielding a sample average penetration of $\bar{x} = 52.7$ and a sample standard deviation of s = 4.8. The conduits were manufactured with the specification that true average penetration be at most 50 mils. They will be used unless it can be demonstrated conclusively that the specification has not been met. What would you conclude?

Page 45: 11. Hypothesis Testing

45

TESTING HYPOTHESIS ABOUT POPULATION PROPORTION, p

• ASSUMPTIONS:

1. The experiment is binomial.

2. The sample size is large enough.

x: The number of successes

The sample proportion is

$$\hat{p} = \frac{x}{n} \sim N\left(p, \frac{p(1-p)}{n}\right)$$

approximately for large n ($np \geq 5$ and $n(1-p) \geq 5$).

Page 46: 11. Hypothesis Testing

46

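HYPOTHESIS TEST FOR p

Two-sided Test

H0: $p = p_0$

HA: $p \neq p_0$

Test Statistic:

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$

evaluated at the hypothesized value $p = p_0$, where $np \geq 5$ and $n(1-p) \geq 5$.

• Reject H0 if $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$.

[Figure: Rejecting Area. Standard normal curve with area $\alpha/2$ in each tail beyond $-z_{\alpha/2}$ and $z_{\alpha/2}$ (reject H0) and the central region between them (do not reject H0).]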

Page 47: 11. Hypothesis Testing

47

HYPOTHESIS TEST FOR p

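One-sided Tests

Test Statistic (both cases):

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}$$

evaluated at the hypothesized value $p = p_0$, where $np \geq 5$ and $n(1-p) \geq 5$.

1. H0: $p = p_0$

HA: $p > p_0$

Reject H0 if $z > z_{\alpha}$.

[Figure: Rejecting Area. Standard normal curve with the rejection region in the upper tail, beyond $z_{\alpha}$.]

2. H0: $p = p_0$

HA: $p < p_0$

Reject H0 if $z < -z_{\alpha}$.

[Figure: Rejecting Area. Standard normal curve with the rejection region in the lower tail, below $-z_{\alpha}$.]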
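The large-sample test for a proportion can be sketched as a small Python function. The name `proportion_z_test` and the counts used below are hypothetical; the function checks the rule of thumb that both expected counts are at least 5 before applying the normal approximation.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alternative="two-sided"):
    """Return (p_hat, z, p_value) for H0: p = p0 using the normal approximation."""
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("need n*p0 >= 5 and n*(1 - p0) >= 5 for the normal approximation")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0, not p_hat
    phi = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - phi(z)
    elif alternative == "less":
        p_value = phi(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z)))
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return p_hat, z, p_value


# Hypothetical counts, not from the slides.
p_hat, z, p_value = proportion_z_test(x=60, n=100, p0=0.5, alternative="greater")
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
```

The standard error is computed from the hypothesized proportion because the test statistic's distribution is derived under H0.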

Page 48: 11. Hypothesis Testing

48

EXAMPLE 10

• Mom’s Home Cokin’ claims that 70% of the customers are able to dine for less than $5. Mom wishes to test this claim at the 92% level of confidence. A random sample of 110 patrons revealed that 66 paid less than $5 for lunch.

H0: $p = 0.70$

HA: $p \neq 0.70$

Page 49: 11. Hypothesis Testing

49

Solution 10

• x = 66, n = 110 and p = 0.70

$\alpha = 0.08$, $z_{\alpha/2} = z_{0.04} = 1.75$

• Test Statistic:

$$\hat{p} = \frac{x}{n} = \frac{66}{110} = 0.6$$

$$z = \frac{0.6 - 0.7}{\sqrt{(0.7)(0.3)/110}} = -2.289$$

Page 50: 11. Hypothesis Testing

50

Conclusion 10

• DECISION RULE:

Reject H0 if z < -1.75 or z > 1.75.

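• CONCLUSION: Reject H0 at $\alpha = 0.08$. The data do not support Mom’s claim.

[Figure: standard normal curve with rejection regions of area $\alpha/2$ in each tail, beyond $-1.75$ and $1.75$; the observed $z = -2.289$ falls in the lower rejection region.]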

Page 51: 11. Hypothesis Testing

51

P-Value

• p-value = $2 \cdot P(z < -2.289) = 2(0.011) = 0.022$

The smallest value of $\alpha$ to reject H0 is 0.022.

Since p-value = 0.022 < $\alpha$ = 0.08, reject H0.

[Figure: standard normal curve with area 0.011 in each tail, beyond $-2.289$ and $2.289$.]

Page 52: 11. Hypothesis Testing

52

CONFIDENCE INTERVAL APPROACH

• Find the 92% CI for p.

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.6 \pm 1.75\sqrt{\frac{(0.6)(0.4)}{110}}$$

• 92% CI for p: $0.518 \leq p \leq 0.682$

• Hypothesis should be two sided to use confidence interval approach.

• Since p = 0.7 is not in the above interval, reject H0. Mom has underestimated the cost of her meal.
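The 92% interval can be reproduced with a short calculation. This sketch uses the rounded critical value 1.75 from the slide; an exact normal quantile would shift the limits only slightly.

```python
from math import sqrt

x, n = 66, 110          # patrons who paid less than $5, sample size
z_half_alpha = 1.75     # z_{0.04}, rounded value used in the slides
p0 = 0.70               # claimed proportion

p_hat = x / n
margin = z_half_alpha * sqrt(p_hat * (1 - p_hat) / n)   # interval uses p_hat, not p0
lower, upper = p_hat - margin, p_hat + margin
reject = not (lower <= p0 <= upper)

print(f"92% CI: ({lower:.3f}, {upper:.3f}), reject H0: {reject}")
```

Unlike the test statistic, the interval's standard error is based on the sample proportion, so the two approaches can occasionally disagree for values of $p_0$ near the interval's edge.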

Page 53: 11. Hypothesis Testing

53

EXAMPLE 11

• Scientists think that robots will play a crucial role in factories in the next several decades. Suppose that in an experiment to determine whether the use of robots to weave computer cables is feasible, a robot was used to assemble 500 cables. The cables were examined and there were 15 defectives. If human assemblers have a defect rate of 0.035, does this data support the hypothesis that the proportion of defectives is lower for robots than humans? Use a 1% significance level.

Page 54: 11. Hypothesis Testing

54

Solution 11

H0: p = 0.035

HA: p < 0.035

It is given that x = 15 and n = 500. Thus,

X ~ Bin(n = 500, p = 0.035)

Since $np = 17.5 > 5$ and $n(1-p) = 482.5 > 5$, we can use the normal approximation to the binomial.

$$\hat{p} = \frac{x}{n} = \frac{15}{500} = 0.03$$

Page 55: 11. Hypothesis Testing

55

Solution 11 (continued)

• Test statistic:

$$z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{0.03 - 0.035}{\sqrt{0.035(1 - 0.035)/500}} = -0.6084$$

• Rejection region:

Reject H0 if $z < -z_{\alpha} = -z_{0.01} = -2.33$.

• Conclusion: Since $z = -0.6084 > -z_{0.01} = -2.33$, do not reject H0. Robots do not demonstrate their superiority.

Page 56: 11. Hypothesis Testing

56

Example 12 (Predicting the winner on election day)

– Voters are asked by a certain network to participate in an exit poll in order to predict the winner on election day.

– Based on the data (where 1 = Democrat and 2 = Republican), can the network conclude that the Republican candidate will win the state college vote?

Page 57: 11. Hypothesis Testing

57

Solution 12

• The problem objective is to describe the population of votes in the state.

– The data are nominal.

– The parameter to be tested is ‘p’.

– Success is defined as “Vote Republican”.

– The hypotheses are:

H0: $p = 0.5$

H1: $p > 0.5$ (more than 50% vote Republican)

Page 58: 11. Hypothesis Testing

58

Solving by hand

The rejection region is $z > z_{\alpha} = z_{0.05} = 1.645$.

From file we count 407 successes. Number of voters participating is 765.

The sample proportion is

$$\hat{p} = \frac{407}{765} = 0.532$$

The value of the test statistic is

$$Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} = \frac{0.532 - 0.5}{\sqrt{0.5(1 - 0.5)/765}} = 1.77$$

The p-value is $P(Z > 1.77) = 0.0382$.

Page 59: 11. Hypothesis Testing

59

Testing the Proportion

z-Test: Proportion

Sample Proportion          0.532
Observations               765
Hypothesized Proportion    0.5
z Stat                     1.77
P(Z<=z) one-tail           0.0382
z Critical one-tail        1.6449
P(Z<=z) two-tail           0.0764
z Critical two-tail        1.96

There is sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis (one-tail p-value = 0.0382 < $\alpha$ = 0.05). At the 5% significance level we can conclude that more than 50% voted Republican.

Page 60: 11. Hypothesis Testing

Simple formula for difference in proportions

$$n = \frac{2\,\bar{p}(1 - \bar{p})\,(Z_{\beta} + Z_{\alpha/2})^2}{(p_1 - p_2)^2}$$

• $n$: Sample size in each group (assumes equal sized groups)

• $Z_{\beta}$: Represents the desired power (typically 0.84 for 80% power).

• $Z_{\alpha/2}$: Represents the desired level of statistical significance (typically 1.96).

• $\bar{p}(1 - \bar{p})$: A measure of variability (similar to standard deviation)

• $p_1 - p_2$: Effect Size (the difference in proportions)

Page 61: 11. Hypothesis Testing

Simple formula for difference in means

$$n = \frac{2\sigma^2 (Z_{\beta} + Z_{\alpha/2})^2}{\text{difference}^2}$$

• $n$: Sample size in each group (assumes equal sized groups)

• $Z_{\beta}$: Represents the desired power (typically 0.84 for 80% power).

• $Z_{\alpha/2}$: Represents the desired level of statistical significance (typically 1.96).

• $\sigma$: Standard deviation of the outcome variable

• difference: Effect Size (the difference in means)
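The two sample-size formulas can be sketched as small Python functions. The function names and the example inputs are hypothetical, and $\bar{p}$ is assumed here to be the average of the two proportions, which the slides do not state explicitly. Results are rounded up to the next whole subject.

```python
from math import ceil


def n_per_group_means(sigma, difference, z_beta=0.84, z_half_alpha=1.96):
    """Per-group sample size to detect a given difference in means."""
    return ceil(2 * sigma ** 2 * (z_beta + z_half_alpha) ** 2 / difference ** 2)


def n_per_group_proportions(p1, p2, z_beta=0.84, z_half_alpha=1.96):
    """Per-group sample size to detect a difference between two proportions."""
    p_bar = (p1 + p2) / 2   # assumption: p-bar is the average of the two proportions
    return ceil(2 * p_bar * (1 - p_bar) * (z_beta + z_half_alpha) ** 2 / (p1 - p2) ** 2)


# Hypothetical planning values; defaults give 80% power at two-sided alpha = 0.05.
print(n_per_group_means(sigma=10, difference=5))
print(n_per_group_proportions(p1=0.5, p2=0.4))
```

Halving the effect size roughly quadruples the required sample size, because the difference enters the denominator squared.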

Page 62: 11. Hypothesis Testing

Sample size calculators on the web…

• http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize

• http://calculators.stat.ucla.edu

• http://hedwig.mgh.harvard.edu/sample_size/size.html

