
Hypothesis Testing

Date post: 06-Jan-2016
Upload: affrica
Transcript
Page 1: Hypothesis Testing

Hypothesis Testing

ESM 206

6 Feb. 2002

Page 2: Hypothesis Testing

Example: Gas Mileage

SMALL               COMPACT
Eagle Summit        Audi 80
Ford Escort         Buick Skylark
Ford Festiva        Chevrolet LeBaron
Honda Civic         Ford Tempo
Mazda Protégé       Honda Accord
Mercury Tracer      Mazda 626
Nissan Sentra       Mitsubishi Galant
Pontiac LeMans      Mitsubishi Sigma
Subaru Loyale       Nissan Stanza
Subaru Justy        Oldsmobile Calais
Toyota Corolla      Peugeot 405
Toyota Tercel       Subaru Legacy
Volkswagen Jetta    Toyota Camry

Do “Small” cars have a different average gas mileage than “Compact” cars?

Data on mileage of 13 small and 15 compact cars.

[Figure: Mileage (y-axis, ticks at 20, 25, 30, 35) by Type (x-axis)]

Page 3: Hypothesis Testing

Example: gas consumption

Which coefficients are different from zero?

Data from 36 years in US.

$G = \beta_0 + \beta_1 P + \beta_2 I + \beta_3 N + \beta_4 U$

Page 4: Hypothesis Testing

Hypothesis testing

Define null hypothesis (H0)

Does direction matter?

Choose test statistic, T

Distribution of T under H0

Calculate test statistic, S

Probability of obtaining value at least as extreme as S under H0 (P)

P small: reject H0
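The steps on this slide can be sketched in code. The example below is a minimal illustration, not the method used in the lecture: it swaps in an exact permutation test (so that the distribution of T under H0 can be enumerated with the standard library alone), and the two groups of measurements are hypothetical.

```python
from itertools import combinations
from statistics import mean

# HYPOTHETICAL measurements for two groups (not the lecture data).
group_a = [31.0, 33.0, 28.0, 35.0, 32.0]
group_b = [24.0, 25.0, 27.0, 23.0, 26.0]

# 1. H0: both groups come from the same population (no difference in means).
# 2. Direction does not matter here, so the test is two-sided.
# 3. Test statistic T: absolute difference between the group means.
def statistic(a, b):
    return abs(mean(a) - mean(b))

# 4. Distribution of T under H0: every relabelling of the pooled values
#    into groups of the original sizes is equally likely.
pooled = group_a + group_b
n_a = len(group_a)
null_values = []
for idx in combinations(range(len(pooled)), n_a):
    chosen = set(idx)
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    null_values.append(statistic(a, b))

# 5. Observed value S of the test statistic.
s_observed = statistic(group_a, group_b)

# 6. P: share of relabellings with a statistic at least as extreme as S.
#    The small tolerance guards against floating-point ties.
p_value = sum(t >= s_observed - 1e-12 for t in null_values) / len(null_values)

# 7. P small: reject H0.
alpha = 0.05
print(f"S = {s_observed:.2f}, P = {p_value:.4f}, reject H0: {p_value < alpha}")
```

With samples this small the enumeration covers only 252 relabellings; for larger samples a random subset of relabellings, or a parametric test such as the t-test, is the usual choice.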

Page 5: Hypothesis Testing

The null hypothesis

Statement about underlying parameters of the population

We will either reject or fail to reject H0

Usually a statement of no pattern or of not exceeding some criterion

Examples

Page 6: Hypothesis Testing

The alternate hypothesis

Written HA

Is the logical complement of H0

Examples

Page 7: Hypothesis Testing

One- and two-sided tests

One-sided test: direction matters
  Pick a direction based on regulatory criteria or knowledge of processes
  Direction must be chosen a priori

Two-sided: all that matters is a difference

One-sided has greater power

Must make decision before analyzing data
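A small sketch of why the one-sided test has greater power: for the same test statistic, the one-sided P counts only one tail of the null distribution. The z-score below is hypothetical, and a standard normal null distribution is assumed for simplicity (the lecture's own examples use the t distribution).

```python
from statistics import NormalDist

# HYPOTHETICAL standardized test statistic (a z-score).
z = 1.8

std_normal = NormalDist()  # mean 0, standard deviation 1

# One-sided test, H_A: the true value is greater than the H0 value.
# Only large positive z counts as "extreme".
p_one_sided = 1.0 - std_normal.cdf(z)

# Two-sided test, H_A: the true value differs from the H0 value.
# Large |z| in either direction counts as "extreme".
p_two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))

print(f"one-sided P = {p_one_sided:.4f}")
print(f"two-sided P = {p_two_sided:.4f}")
```

Here the one-sided P is half the two-sided P, so at a threshold of 0.05 the one-sided test rejects H0 and the two-sided test does not. That advantage is only legitimate if the direction was fixed before looking at the data.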

Page 8: Hypothesis Testing

Comparing means: the t-test

Compare sample mean to fixed value (eqs. 1-4)

Compare regression coefficient to fixed value (eq. 5)

Compare the difference between two sample means to a fixed value (usually 0) (eqs. 6-7)
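The equations this slide refers to (eqs. 1-7) are not reproduced in the transcript. For orientation, the standard textbook forms of the three statistics are sketched below; the notation is an assumption and may differ from the course handout.

```latex
% One sample mean against a fixed value \mu_0 (df = n - 1)
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}

% Regression coefficient against a fixed value \beta^{*}
% (df = n minus the number of fitted coefficients)
t = \frac{\hat{\beta} - \beta^{*}}{\mathrm{SE}(\hat{\beta})}

% Difference between two sample means against a fixed value d_0,
% pooled variance (df = n_1 + n_2 - 2)
t = \frac{(\bar{x}_1 - \bar{x}_2) - d_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}
```

In each case the statistic is an estimate minus its hypothesized value, divided by the standard error of the estimate.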

Page 9: Hypothesis Testing

Assumptions of the t-test

The data in each sample are normally distributed

The populations have the same variance
  Can correct for violations of this with the Welch modification of df
  Test for difference among variances with F-test
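For reference, the Welch modification and the variance-ratio F statistic are usually written as follows. These are the standard textbook forms, given here as an assumption about what the slide has in mind.

```latex
% Welch t statistic: each sample keeps its own variance
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}

% Welch-Satterthwaite approximation to the degrees of freedom
\mathrm{df} \approx
\frac{\left( \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2} \right)^2}
     {\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}

% F statistic for comparing two sample variances
F = \frac{s_1^2}{s_2^2},
\quad \text{with } (n_1 - 1,\; n_2 - 1) \text{ degrees of freedom}
```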

Page 10: Hypothesis Testing

The P-value

P is the probability of observing data at least as extreme as yours if the null hypothesis is true

P is the type I error rate you would be accepting if you rejected the null hypothesis on data this extreme

P is not the probability that the null hypothesis is true

Page 11: Hypothesis Testing

Critical values of P

Reject H0 if P is less than threshold

P < 0.05 commonly used
  Arbitrary choice

Other values: 0.1, 0.01, 0.001

Always report P, so others can draw own conclusions
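One way to see what the threshold means is to simulate many experiments in which H0 is true and count how often P falls below it. The sketch below assumes a z-test on normal data with known standard deviation; the sample size, number of experiments, and seed are arbitrary choices for illustration.

```python
import random
from statistics import NormalDist

rng = random.Random(206)        # fixed seed so the run is repeatable
std_normal = NormalDist()
n, n_experiments, threshold = 20, 2000, 0.05

rejections = 0
for _ in range(n_experiments):
    # H0 is true by construction: data come from a normal with mean 0, SD 1.
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = (sum(sample) / n) * n ** 0.5          # z statistic with known SD = 1
    p = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # two-sided P
    if p < threshold:
        rejections += 1

rejection_rate = rejections / n_experiments
print(f"rejected a true H0 in {rejection_rate:.3f} of experiments")
```

The rejection rate should come out close to the threshold itself: choosing P < 0.05 means accepting that about 1 in 20 true null hypotheses will be rejected.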

Page 12: Hypothesis Testing

Example: Gas Mileage

SMALL               COMPACT
Eagle Summit        Audi 80
Ford Escort         Buick Skylark
Ford Festiva        Chevrolet LeBaron
Honda Civic         Ford Tempo
Mazda Protégé       Honda Accord
Mercury Tracer      Mazda 626
Nissan Sentra       Mitsubishi Galant
Pontiac LeMans      Mitsubishi Sigma
Subaru Loyale       Nissan Stanza
Subaru Justy        Oldsmobile Calais
Toyota Corolla      Peugeot 405
Toyota Tercel       Subaru Legacy
Volkswagen Jetta    Toyota Camry

Do “Small” cars have a different average gas mileage than “Compact” cars?

Data on mileage of 13 small and 15 compact cars.

[Figure: Mileage (y-axis, ticks at 20, 25, 30, 35) by Type (x-axis)]

Page 13: Hypothesis Testing

Gas mileage: variances are unequal

             Small      Compact
Min:         25.000000  21.000000
1st Qu.:     28.000000  23.000000
Mean:        31.000000  24.133333
Median:      32.000000  24.000000
3rd Qu.:     33.000000  25.500000
Max:         37.000000  27.000000
Total N:     13.000000  15.000000
NA's:         0.000000   0.000000
Variance:    14.500000   3.552381
Std Dev.:     3.807887   1.884776

Page 14: Hypothesis Testing

Gas mileage

Test Name: Welch Modified Two-Sample t-Test

Estimated Parameter(s): mean of x = 31

mean of y = 24.13333

Data: x: Small in DS2 , and y: Compact in DS2

Test Statistic: t = 5.905054

Test Statistic Parameter: df = 16.98065

P-value: 0.00001738092

95 % Confidence Interval: LCL = 4.413064

UCL = 9.32027
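The test statistic and degrees of freedom in this output can be checked by hand from the summary statistics on the previous slide. The sketch below applies the standard Welch formulas (an assumption about what the software computed). The P-value and confidence interval also need the t distribution, which the Python standard library does not provide, so they are left out.

```python
from math import sqrt

# Summary statistics as reported on the "variances are unequal" slide.
mean_small, var_small, n_small = 31.0, 14.5, 13
mean_compact, var_compact, n_compact = 24.133333, 3.552381, 15

# Squared standard error contributed by each sample.
se2_small = var_small / n_small
se2_compact = var_compact / n_compact

# Welch t statistic: difference in means over its standard error.
t = (mean_small - mean_compact) / sqrt(se2_small + se2_compact)

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (se2_small + se2_compact) ** 2 / (
    se2_small ** 2 / (n_small - 1) + se2_compact ** 2 / (n_compact - 1)
)

print(f"t = {t:.3f}, df = {df:.3f}")
```

Both values should agree with the reported t = 5.905054 and df = 16.98065 to within the rounding of the inputs.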

Page 15: Hypothesis Testing

Example: gas consumption

Which coefficients are different from zero?

Data from 36 years in US.

$G = \beta_0 + \beta_1 P + \beta_2 I + \beta_3 N + \beta_4 U$

Page 16: Hypothesis Testing

Gas consumption

                  Value   Std. Error   t value   Pr(>|t|)
(Intercept)     -0.0898       0.0508   -1.7687     0.0868
GasPrice        -0.0424       0.0098   -4.3058     0.0002
Income           0.0002       0.0000   23.4189     0.0000
New.Car.Price   -0.1014       0.0617   -1.6429     0.1105
Used.Car.Price  -0.0432       0.0241   -1.7913     0.0830
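Each t value in this output tests H0: coefficient = 0, so it should equal the estimate divided by its standard error. A quick check on the printed figures, as a sketch: the numbers are copied from the table, and small discrepancies are expected because the table rounds to four decimals.

```python
# (estimate, standard error, reported t value) copied from the table.
# Income is left out: its standard error is printed as 0.0000, so the
# rounded figures cannot reproduce its t value.
rows = {
    "(Intercept)":    (-0.0898, 0.0508, -1.7687),
    "GasPrice":       (-0.0424, 0.0098, -4.3058),
    "New.Car.Price":  (-0.1014, 0.0617, -1.6429),
    "Used.Car.Price": (-0.0432, 0.0241, -1.7913),
}

# For H0: coefficient = 0, the t value is estimate / standard error.
recomputed = {name: est / se for name, (est, se, _) in rows.items()}

for name, (est, se, reported) in rows.items():
    print(f"{name:15s} recomputed t = {recomputed[name]:8.4f}   reported t = {reported:8.4f}")
```

If all 36 years enter the fit, the residual degrees of freedom would be 36 - 5 = 31; that figure is an inference from the slide, not something the output states.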

Page 17: Hypothesis Testing

Interpreting model coefficients

Is there statistical evidence that the independent variable has an effect?
  Is the parameter estimate significantly different from zero?

Is the coefficient large enough that the effect is important?
  Must take into account the variation in the independent variable
  Use a linear measure of variation: SD, interquartile range, etc.
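A minimal sketch of the second question: multiply the coefficient by a typical amount of variation in the independent variable to get the size of the effect in units of the response. The coefficient is the GasPrice estimate from the regression output in this lecture; the standard deviation is hypothetical, since the transcript does not report it.

```python
# Coefficient for GasPrice from the regression output in this lecture.
coef_gas_price = -0.0424

# HYPOTHETICAL standard deviation of GasPrice over the sample period;
# the transcript does not report it, so this number is illustration only.
sd_gas_price = 0.5

# Change in the response associated with a one-SD change in the predictor.
effect_per_sd = coef_gas_price * sd_gas_price
print(f"effect of a one-SD change in GasPrice: {effect_per_sd:.4f}")
```

Comparing this scaled effect with the spread of the response itself gives a sense of whether a statistically significant coefficient is also practically important.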

Page 18: Hypothesis Testing

Types of error

Type I: reject null hypothesis when it’s really true
  Desired level: α

Type II: fail to reject null hypothesis when it’s really false
  Desired level: β
  Is associated with a given effect size
  E.g., want a probability 0.1 of failing to reject when true difference between means is 0.35.

Page 19: Hypothesis Testing

Types of error

                        In reality, H0 is
Your test says that
H0 should be:           True                  False
Not rejected            Correct conclusion    Type II error
Rejected                Type I error          Correct conclusion

Page 20: Hypothesis Testing
Page 21: Hypothesis Testing

Controlling error levels

α is controlled by setting critical P-value

β is controlled by α, sample size, sample variance, effect size

Tradeoff between α and β

Need to balance costs associated with type I and type II errors

Power is 1 - β
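The dependence of power on the type I error rate (alpha), sample size, variance, and effect size can be sketched with a normal approximation. The functions below are illustrative helpers, not part of the lecture. They assume a two-sided, two-sample z-test with equal group sizes and a known common standard deviation. The effect size 0.35 and type II error rate 0.1 echo the example given earlier in the lecture; the standard deviation is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()

def power_two_sample(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test with n per group."""
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(delta) / (sigma * sqrt(2.0 / n))   # effect size in SE units
    # Probability that the statistic lands beyond either critical value.
    return (1.0 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

def n_per_group(delta, sigma, alpha=0.05, beta=0.1):
    """Smallest n per group from the usual normal-approximation formula."""
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)
    z_beta = std_normal.inv_cdf(1.0 - beta)
    return ceil(2.0 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# sigma = 0.5 is a HYPOTHETICAL population standard deviation.
delta, sigma = 0.35, 0.5
n = n_per_group(delta, sigma, alpha=0.05, beta=0.1)
print(f"n per group = {n}, power = {power_two_sample(delta, sigma, n):.3f}")

# Tradeoff: a stricter alpha lowers power for the same n.
print(f"power at alpha = 0.01: {power_two_sample(delta, sigma, n, alpha=0.01):.3f}")
```

Raising n, or accepting a larger alpha, increases power; a smaller effect size or larger variance decreases it. A t-based calculation would give a slightly larger n than this approximation.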

