Inferential Statistics Part 2: Hypothesis Testing Chapter 9 p. 280 - 306.

Date post: 26-Dec-2015
Upload: janice-mcdowell

Introduction

Hypothesis testing is closely related to estimation (i.e., what we studied last week)

The difference is that now we are posing a hypothesis that we want to test

For example, rather than just estimating a population parameter using a sample, we may hypothesize that a sample is different than the population in some way

Based on a sample statistic we can either accept or reject the hypothesis

Steps in Classical Hypothesis Testing

1: Formulate a hypothesis

2: Specify the sampling statistic and its distribution

3: Select a level of significance

4: Construct a decision rule

5: Compute a value of the test statistic

6: Decide to accept or reject the hypothesis

Formulate a hypothesis

Null Hypothesis (H0) – when the sample statistic follows the population parameter (e.g., when characteristics from a sample more or less match those from the population)

Alternative Hypothesis (HA) – When the sample statistic does not follow the population parameter

Possible statements:

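$$H_0: \hat{\theta} = \theta_0 \qquad H_A: \hat{\theta} \neq \theta_0$$

$$H_0: \hat{\theta} \geq \theta_0 \qquad H_A: \hat{\theta} < \theta_0$$

$$H_0: \hat{\theta} \leq \theta_0 \qquad H_A: \hat{\theta} > \theta_0$$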

Formulate a hypothesis

Which type of hypothesis (null or alternative) are we typically concerned with?

How do “tails” of a distribution fit the statements?

What does it mean to say these hypotheses are mutually exclusive & exhaustive?

Formulate a hypothesis

Remember that the hypotheses are being tested using sample data that may contain sampling error

This is why hypothesis testing falls under the category of inferential statistics

We have to infer results based on a sample

We can’t be completely certain of the results, so there is a degree of uncertainty associated with our answers

To estimate this uncertainty we rely upon probability

Types of error

Type 1 Error: when we falsely reject a null hypothesis; the probability of doing so is labeled α (i.e., α = P(type 1 error))

Type 2 Error: when we falsely accept a null hypothesis; the probability of doing so is labeled β (i.e., β = P(type 2 error))

             H0 is true      H0 is false
Reject H0    Type 1 Error    No Error
Accept H0    No Error        Type 2 Error
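To make α concrete, here is a minimal simulation sketch that is not part of the slides. It assumes an upper-tailed z-test with hypothetical values (μ0 = 50, σ = 3, n = 30, α = 0.05), draws sample means from a world in which H0 is true, and counts how often the test rejects anyway. The function name, the number of trials, and the seed are illustrative choices.

```python
import random
from statistics import NormalDist


def simulated_type1_rate(mu0=50.0, sigma=3.0, n=30, alpha=0.05,
                         trials=20_000, seed=0):
    """Estimate P(reject H0 | H0 true) for an upper-tailed z-test."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5                      # standard deviation of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha)   # upper critical z-value
    rejections = 0
    for _ in range(trials):
        xbar = rng.gauss(mu0, se)              # a sample mean drawn with H0 true
        if (xbar - mu0) / se > z_crit:
            rejections += 1                    # rejecting here is a type 1 error
    return rejections / trials


if __name__ == "__main__":
    print(f"estimated type 1 error rate: {simulated_type1_rate():.3f}")
```

The estimated rate should land close to the chosen α, which is what "α = P(type 1 error)" means in practice.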

Specify the sampling statistic and its distribution

What sampling statistic should you choose for μ, σ, and π respectively?

What distributions will the sampling statistics have and how do we know?

FYI, when used to test a hypothesis, sampling statistics are also called test statistics

Select a level of significance

In classical hypothesis testing we are only concerned with type 1 error (α)

For example: alpha of 0.1, 0.05, or 0.01

The value for alpha is called the significance level

This means that if we reject H0 we will be very confident that it is false

How confident depends on the significance level

The flip-side of this approach is that we are more likely to not reject a null hypothesis that is false
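The trade-off between α and β can be sketched numerically. The example below is an illustration under assumed values, not something taken from the slides: an upper-tailed z-test with μ0 = 50, a standard error of 3/√30, and a hypothetical true mean of 51. The helper name `type2_error` is made up for this sketch.

```python
from statistics import NormalDist


def type2_error(alpha, mu0, mu_true, std_error):
    """Beta for an upper-tailed z-test when the true mean is mu_true."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    critical = mu0 + z_crit * std_error          # reject H0 above this value
    # Beta = P(sample mean <= critical | true mean is mu_true)
    return NormalDist(mu_true, std_error).cdf(critical)


if __name__ == "__main__":
    se = 3 / 30 ** 0.5                           # hypothetical standard error
    for alpha in (0.10, 0.05, 0.01):
        beta = type2_error(alpha, mu0=50.0, mu_true=51.0, std_error=se)
        print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

Under these assumptions β grows as α shrinks, which is the flip-side described above.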

Select a level of significance

How does this fit with the idea that we are typically concerned with HA rather than H0?

Answer: since the significance is tied to rejecting H0 it is also linked with accepting HA

This means that the hypotheses we make should be structured so that we are testing HA (i.e., rejecting H0 should be scientifically interesting)

To make this more clear, think about the opposite case: if we were really interested in accepting H0 we would have no idea about the significance because we are ignoring type 2 error

Select a level of significance

Whenever we report a decision about the null hypothesis (to reject it or not) we also report the statistical significance

Example: The null hypothesis is rejected at the 0.05 significance level

Select a level of significance

Which significance level we actually choose depends on the application

When might we want a very small α?

In geography 0.1, 0.05, and 0.01 are pretty typical

It is also common to see results reported for multiple alphas

Construct a decision rule

For this step we take the hypothesis we’ve defined and the significance level we’ve selected and determine the critical region and the critical values

In other words, we take our values, and determine the thresholds for accepting or rejecting H0

Construct a decision rule

Critical Regions: if the sample statistic falls within these area(s) we will reject H0

Critical Values: the thresholds that divide the critical region(s) from the non-critical region

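Construct a decision rule

For a test statistic with a normal distribution (e.g., x̄ and p) we make our decision rule using:

$$z_{\text{Critical}} = \frac{\text{Critical} - \text{Value in } H_0}{\sigma_{\text{test statistic}}}$$

For p, the equation is:

$$z_{\text{Critical}} = \frac{\text{Critical} - \pi_0}{\sigma_p}$$

For x̄, the equation is:

$$z_{\text{Critical}} = \frac{\text{Critical} - \mu_0}{\sigma_{\bar{X}}}$$

Key things to remember

How to calculate σ
The number of tails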
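Solving the decision-rule equation for Critical gives the threshold in the units of the test statistic. The sketch below is one way to write that down; the function name `critical_values` and its `tail` argument are illustrative, and it assumes a normally distributed test statistic with a known standard error.

```python
from statistics import NormalDist


def critical_values(h0_value, std_error, alpha, tail="upper"):
    """Solve z_Critical = (Critical - h0_value) / std_error for Critical.

    tail is "upper", "lower", or "two"; returns a tuple of critical values.
    """
    if tail == "two":
        z = NormalDist().inv_cdf(1 - alpha / 2)   # alpha/2 in each tail
        return (h0_value - z * std_error, h0_value + z * std_error)
    z = NormalDist().inv_cdf(1 - alpha)           # all of alpha in one tail
    if tail == "upper":
        return (h0_value + z * std_error,)
    if tail == "lower":
        return (h0_value - z * std_error,)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the call.
    print(critical_values(100.0, 2.0, 0.05, tail="two"))
```

A 2-tailed test splits α between the tails, which is why the number of tails is one of the key things to remember.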

Compute a value of the test statistic

Here we just compute the values using equations we’re familiar with (e.g., x̄ and p)

Note that constructing a decision rule and computing the values of a test statistic can also be done using z-values for the critical values and for the test statistic (see p. 289 for details)

Decide to accept or reject the hypothesis

Now we just compare the test statistic with the critical values and make our decision to reject H0 or not

Classical Hypothesis Testing Example

Has the mean temperature of Charlotte increased over the last 30 years?

This is an example for μ

Example Data

Suppose Charlotte’s annual mean temp for the last:

150 years is 50°F
30 years is 53°F

Suppose the population variance, σ², for these 150 years is 9 (so σ = 3)

Assumptions:
Each year is independent of other years
The last 30 years act as a sample of the population of years since greenhouse gases have been emitted into the Earth’s atmosphere (these 30 are all we have access to)
These 30 years come from the same distribution

Steps in Classical Hypothesis Testing

1: Formulate a hypothesis

2: Specify the sampling statistic and its distribution

3: Select a level of significance

4: Construct a decision rule

5: Compute a value of the test statistic

6: Decide to accept or reject the hypothesis

Step 1: Formulate a hypothesis

Scientifically, we say our hypothesis is: the mean temperature of Charlotte has increased over the last 30 years

Statistically, we develop:
Null hypothesis H0: Θ ≤ Θ0
Alternative hypothesis HA: Θ > Θ0

When we apply the data:
Null hypothesis H0: x̄ ≤ 50°F
Alternative hypothesis HA: x̄ > 50°F

This is a 1-sided test

Step 2: Specify the sampling statistic and its distribution

What sampling statistic should we use?

What distribution will it have?

Answers:
The sample mean (in this case 53°F)
A normal distribution

• Our sample size is 30, which is just large enough to use the z rather than the t distribution

• This is an application of the central limit theorem

The sample statistic & the hypothesis

If x̄ is below or near 50, we do not reject the null hypothesis:

H0: x̄ ≤ 50°F

If x̄ is far greater than 50, we reject the null hypothesis in favor of the alternative hypothesis:

HA: x̄ > 50°F

Why isn’t this simple comparison sufficient?
Answer: because x̄ is computed from just a sample and may contain sampling error

We set a cutoff point for x̄, above which we reject our null hypothesis.
This cutoff is set at a point where, if the null hypothesis were true, a value of x̄ this large or larger would be very unlikely (due to sampling variation alone).

Step 3: Select a level of significance

This step is always somewhat arbitrary, but we’ll just use 0.05

This means that we’re willing to accept a 5% chance of having a type 1 error (i.e., rejecting H0 when we should not)

Step 4: Construct a decision rule

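$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{30}} = \frac{3}{5.477} = 0.5477$$

$$z_{\text{Critical}} = \frac{\text{Critical} - \mu_0}{\sigma_{\bar{X}}} \quad\Rightarrow\quad z_{0.05} = 1.65 = \frac{\text{Critical} - 50}{0.5477}$$

$$\text{Critical} = 1.65 \times 0.5477 + 50 = 50.9037$$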

Step 4: Construct a decision rule

So we say that we will reject H0 if x̄ > 50.9037, with a significance level of 0.05

Steps 5 & 6

Step 5: Compute a value of the test statistic

In this case we already have the test statistic (x̄ = 53)

Step 6: Decide to reject the null hypothesis (or not)

Now we just compare our test statistic with the critical value
Since 53 > 50.9037 we will reject the null hypothesis and accept the alternative hypothesis
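Steps 4 through 6 of the Charlotte example can be checked with a few lines of arithmetic. This is a sketch that plugs in the numbers given on the slides, including the rounded table value z = 1.65; the variable names are illustrative.

```python
import math

# Values from the Charlotte example.
mu0 = 50.0         # long-run mean temperature in H0 (deg F)
xbar = 53.0        # mean of the last 30 years (deg F)
sigma = 3.0        # population standard deviation
n = 30
z_critical = 1.65  # rounded table value for alpha = 0.05, one tail

std_error = sigma / math.sqrt(n)            # standard deviation of the sample mean
critical = mu0 + z_critical * std_error     # cutoff for the sample mean
reject_h0 = xbar > critical                 # upper-tailed decision rule

print(f"std error = {std_error:.4f}")
print(f"critical  = {critical:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Using an unrounded z-value (about 1.645) would move the cutoff slightly but would not change the decision here.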

Shortcomings of the classical approach

The decision to reject the null hypothesis is binary

No detail is given for how far the test statistic is from the critical value (e.g., is it just above it, or way above it)

Different α values might lead to different decisions

The PROB-VALUE approach

This approach fixes the shortcomings of the classical approach

Basically it involves using the same equations, but flipping them around so that we solve for α

In other words:
At what level is the test statistic significant?
What is the α (i.e., the probability of making a type 1 error)?
Should we reject H0, how likely are we to be wrong?

The PROB-VALUE approach

This is based on the equation:

$$z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$

The difference from the classical approach is that now we look up the z-value to tell us the alpha (α)

PROB-VALUE example

Charlotte Example

$$z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} = \frac{53 - 50}{0.5477} = 5.477$$

Using a z-table, what alpha is associated with this z?

Answer: α = 0.000000021602
This value is actually from Excel; the z-table in the book does not go up to 5.477

In other words, there is a 2.16 in 100 million chance of the null being falsely rejected
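Because printed z-tables stop well short of 5.477, the tail probability has to come from software. The slides used Excel; the sketch below is an alternative that uses the complementary error function from Python's standard library. The helper name `upper_tail_prob` is made up for illustration.

```python
import math


def upper_tail_prob(z):
    """P(Z >= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z = (53 - 50) / (3 / math.sqrt(30))   # Charlotte example
prob_value = upper_tail_prob(z)
print(f"z = {z:.3f}, PROB-VALUE = {prob_value:.3e}")
```

Working with `erfc` rather than `1 - cdf(z)` avoids losing precision when the tail area is this small.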

PROB-VALUE & alpha

Remember that the PROB-VALUE is equivalent to finding the alpha associated with a z-value

Therefore we can also use the PROB-VALUE to reject H0 (or not)

Example:
If our selected significance level is 0.05
And our PROB-VALUE is 0.00001
We’d reject the null hypothesis since 0.00001 < 0.05

Additional things to consider

As with confidence intervals, when conducting a hypothesis test using μ we should use t instead of z when:

n < 30
we have s instead of σ (with n > 30 either is ok)

As with confidence intervals, when conducting a hypothesis test using π we should use the binomial distribution instead of z when:

n < 100
Example 9-4 in the book solves such a problem
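These rules of thumb can be written as a small lookup. The function below is only a sketch of the guidance on this slide; the name `choose_distribution`, its arguments, and the decision to treat n = 30 as "large" (as the Charlotte example does) are assumptions made for illustration.

```python
def choose_distribution(parameter, n, sigma_known=True):
    """Encode the slide's rules of thumb for picking a reference distribution.

    parameter: "mean" for tests on mu, "proportion" for tests on pi.
    """
    if parameter == "mean":
        if n < 30:
            return "t"
        # With a large sample either distribution is acceptable when only s is known.
        return "z" if sigma_known else "t or z"
    if parameter == "proportion":
        return "binomial" if n < 100 else "z"
    raise ValueError("parameter must be 'mean' or 'proportion'")


if __name__ == "__main__":
    print(choose_distribution("mean", 16, sigma_known=False))
    print(choose_distribution("proportion", 160))
```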

Sample Problems Galore!

We’re going to go through several examples that are reminiscent of problems on your homework and what will be on the exam

Key questions to ask before starting

What is the test statistic?
x̄ and p have slightly different equations, particularly for their standard deviations

How many tails does the test have?
Determines whether we use α or α/2
Determines whether we multiply the PROB-VALUE by 2

If we are doing a 1-tailed test, which critical value are we concerned about?
$H_A: \hat{\theta} < \theta_0$: lower critical value
$H_A: \hat{\theta} > \theta_0$: upper critical value

What distribution should we use (t, z, or binomial)?

Sample Problem #1

A census of UNC students found that students had, on average, 3.4 pets each while growing up, with a standard deviation of 1.9 pets.

A single dorm with 220 students had an average of 3.65 pets growing up. Assuming the students are assigned to the dorm at random (i.e., they are statistically independent), does this dorm have a higher than normal “pet history” at a 0.01 significance level?

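Sample Problem #1

What is the test statistic?

How many tails does the test have?

Which critical value are we concerned about?

Putting these together - what are H0 and HA?

$$H_0: \hat{\theta} \leq \theta_0 \;::\; \bar{x} \leq 3.4$$

$$H_A: \hat{\theta} > \theta_0 \;::\; \bar{x} > 3.4 \qquad (\bar{x} = 3.65)$$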

Sample Problem #1

What are n, σ, and α?
n = 220
σ = 1.9
α = 0.01

What distribution should we use and why?
The z-distribution since n > 30

What is the z-value associated with α?
$z_{0.01} = 2.33$

What is the standard deviation of x̄?

$$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 1.9 / \sqrt{220} = 1.9 / 14.83 = 0.128$$

Sample Problem #1

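Critical Value

$$z_{\text{Critical}} = \frac{\text{Critical} - \mu_0}{\sigma_{\bar{X}}} \quad\Rightarrow\quad z_{0.01} = 2.33 = \frac{\text{Critical} - 3.4}{0.128}$$

$$\text{Critical} = 2.33 \times 0.128 + 3.4 = 3.698$$

Should we reject the null hypothesis?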

Sample Problem #1

What would happen to the critical value if we changed the significance level to 0.05?

$$z_{0.05} = 1.65 = \frac{\text{Critical} - 3.4}{0.128}$$

$$\text{Critical} = 1.65 \times 0.128 + 3.4 = 3.6112$$

Does this make us more or less likely to reject the null hypothesis?

Sample Problem #1

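PROB-VALUE

What values go in this equation?

$$z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$

$$z = \frac{3.65 - 3.4}{0.128} = 1.953$$

What do we do with the resulting z-value?

$$z_{\text{PROB-VALUE}} = z = 1.953$$

What is the PROB-VALUE?

$$\text{PROB-VALUE} = 0.0255$$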
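Sample Problem #1 can be worked end to end in a short script. This is a sketch using the problem's numbers with unrounded intermediate values, so the z-statistic comes out marginally different from the slide's 1.953 (which uses the rounded 0.128); the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma = 3.4, 1.9      # census mean and standard deviation
xbar, n = 3.65, 220        # dorm sample

std_error = sigma / math.sqrt(n)
z = (xbar - mu0) / std_error
prob_value = 1 - NormalDist().cdf(z)          # upper tail, 1-tailed test

# Classical approach at two significance levels.
for alpha in (0.01, 0.05):
    z_crit = NormalDist().inv_cdf(1 - alpha)
    critical = mu0 + z_crit * std_error
    decision = "reject H0" if xbar > critical else "do not reject H0"
    print(f"alpha = {alpha}: critical = {critical:.3f} -> {decision}")

print(f"z = {z:.3f}, PROB-VALUE = {prob_value:.4f}")
```

The PROB-VALUE falls between 0.01 and 0.05, so the two significance levels lead to different decisions, which is the shortcoming of the classical approach noted earlier.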

Sample Problem #2

A census of UNC students found that students had, on average, a 12 minute commute (walking, bicycling, bus, car, etc.) to their first class of the day.

16 randomly sampled students living off campus had an average commute of 17 minutes with a sample standard deviation of 4.5 minutes. Do students living off campus have a longer commute at a 0.05 significance level?

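Sample Problem #2

What is the test statistic?

How many tails does the test have?

Which critical value are we concerned about?

Putting these together - what are H0 and HA?

$$H_0: \hat{\theta} \leq \theta_0 \;::\; \bar{x} \leq 12$$

$$H_A: \hat{\theta} > \theta_0 \;::\; \bar{x} > 12 \qquad (\bar{x} = 17)$$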

Sample Problem #2

What are n, s, and α?
n = 16
s = 4.5
α = 0.05

What distribution should we use and why?
The t-distribution since n < 30 and we have s instead of σ

What is the t-value associated with α?
$t_{0.05,15} = 1.75$

What is the standard deviation of x̄?

$$\sigma_{\bar{x}} = s / \sqrt{n} = 4.5 / \sqrt{16} = 4.5 / 4 = 1.125$$

Sample Problem #2

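Critical Value

$$t_{\alpha,df} = \frac{\text{Critical} - \mu_0}{\sigma_{\bar{X}}} \quad\Rightarrow\quad t_{0.05,15} = 1.75 = \frac{\text{Critical} - 12}{1.125}$$

$$\text{Critical} = 1.75 \times 1.125 + 12 = 13.969$$

Should we reject the null hypothesis?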

Sample Problem #2

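PROB-VALUE

What values go in this equation?

$$t = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$

$$t = \frac{17 - 12}{1.125} = 4.444$$

What do we do with the resulting t-value?

$$t_{\text{PROB-VALUE},\,df} = t = 4.444$$

What is the PROB-VALUE?

$$\text{PROB-VALUE} = 0.000$$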
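A printed t-table only brackets a PROB-VALUE this small, so it helps to compute the tail area directly. Python's standard library has no t-distribution, so the sketch below integrates the t density numerically with Simpson's rule. This is an illustrative approach rather than the method used in the slides or the book, and the helper names `t_pdf` and `t_upper_tail` are made up here.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("t must be non-negative")
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t           # half of the probability lies above 0


xbar, mu0, s, n = 17.0, 12.0, 4.5, 16  # Sample Problem #2
std_error = s / math.sqrt(n)
t_stat = (xbar - mu0) / std_error
prob_value = t_upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.3f}, PROB-VALUE = {prob_value:.5f}")
```

The result is far below α = 0.05 and rounds to 0.000 at three decimal places.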

Sample Problem #3

A botanical index states that the average weight of a northern red oak acorn is 6 grams.

A random sample of 101 acorns was collected from the red oaks in the quad and the acorns had an average weight of 5.6 grams and a sample standard deviation of 1.3 grams.

Do the oak trees in the quad differ from typical trees at a significance level of 0.05?

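Sample Problem #3

What is the test statistic?

How many tails does the test have?

Which critical value are we concerned about?

Putting these together - what are H0 and HA?

$$H_0: \hat{\theta} = \theta_0 \;::\; \bar{x} = 6$$

$$H_A: \hat{\theta} \neq \theta_0 \;::\; \bar{x} \neq 6 \qquad (\bar{x} = 5.6)$$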

Sample Problem #3

What are n, s, and α?
n = 101
s = 1.3
α = 0.05

What distribution should we use and why?
Either one would be ok, but since we’re using s we’ll go with t

What is the t-value associated with α/2?
$t_{0.025,100} = 1.98$
Note how close this is to $z_{0.025} = 1.96$

What is the standard deviation of x̄?

$$\sigma_{\bar{x}} = s / \sqrt{n} = 1.3 / \sqrt{101} = 1.3 / 10.05 = 0.13$$

Sample Problem #3

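Critical Value

$$t_{\alpha/2,\,df} = \frac{\text{Critical} - \mu_0}{\sigma_{\bar{X}}} \quad\Rightarrow\quad t_{0.025,100} = 1.98 = \frac{\text{Critical} - 6}{0.13}$$

$$\text{Lower Critical} = 6 - 1.98 \times 0.13 = 5.7426$$

$$\text{Upper Critical} = 6 + 1.98 \times 0.13 = 6.2574$$

Should we reject the null hypothesis?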
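For a 2-tailed test the decision rule has two cutoffs, and H0 is rejected when the sample mean falls outside them. The sketch below checks Sample Problem #3 this way; it takes the t-value 1.98 from the slide's table and rounds the standard error to 0.13 as the slide does. The variable names are illustrative.

```python
import math

mu0 = 6.0                 # index weight in H0 (grams)
xbar, s, n = 5.6, 1.3, 101
t_critical = 1.98         # t(0.025, df=100), as read from the table on the slide

std_error = round(s / math.sqrt(n), 2)     # the slide rounds this to 0.13
lower = mu0 - t_critical * std_error
upper = mu0 + t_critical * std_error

# Two-tailed rule: reject H0 when the sample mean falls outside [lower, upper].
reject_h0 = xbar < lower or xbar > upper
print(f"critical values: {lower:.4f}, {upper:.4f}")
print("reject H0" if reject_h0 else "do not reject H0")
```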

Sample Problem #3

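PROB-VALUE

What values go in this equation?

$$t = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$

$$t = \frac{5.6 - 6}{0.13} = -3.077$$

What do we do with the resulting t-value?

$$t_{\text{PROB-VALUE}/2,\,df} = |t| = 3.077$$

What is the PROB-VALUE?

The tail area beyond 3.077 with df = 100 is about 0.001. Because this is a 2-tailed test, that area is PROB-VALUE/2, so it is multiplied by 2 to obtain the PROB-VALUE.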

Sample Problem #4

Suppose a census of UNC students found that 8 percent of students bike to class regularly.

A random sample of 160 business majors found that 7 biked regularly.

It would seem that business majors bike less than other students. What significance level does this statement have?

Sample Problem #4

What is the test statistic?

How many tails does the test have?

Which critical value are we concerned about?

Putting these together - what are H0 and HA?

$$H_0: \hat{\theta} \geq \theta_0 \;::\; p \geq 0.08$$

$$H_A: \hat{\theta} < \theta_0 \;::\; p < 0.08 \qquad (p = 0.04375)$$

Sample Problem #4

What are n, π, and p?
n = 160
π = 0.08
p = 7/160 = 0.04375

What distribution should we use and why?
The z-distribution since we have proportions and a large n

What is the standard deviation of p?

$$\sigma_p = \sqrt{\pi (1 - \pi) / n} = \sqrt{0.08 \times 0.92 / 160} = \sqrt{0.00046} = 0.02145$$

Sample Problem #4

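PROB-VALUE

What values go in this equation?

$$z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$$

$$z = \frac{0.04375 - 0.08}{0.02145} = -1.69$$

What do we do with the resulting z-value?

$$z_{\text{PROB-VALUE}} = |z| = 1.69$$

What is the PROB-VALUE?

$$\text{PROB-VALUE} = 0.046$$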
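The proportion test in Sample Problem #4 follows the same pattern, with the standard deviation of p computed from π, the value under H0. The sketch below uses the problem's numbers without intermediate rounding; the variable names are illustrative.

```python
import math
from statistics import NormalDist

pi0 = 0.08            # census proportion in H0
n, bikers = 160, 7
p = bikers / n        # sample proportion, 0.04375

std_error = math.sqrt(pi0 * (1 - pi0) / n)   # uses pi0, the value under H0
z = (p - pi0) / std_error
prob_value = NormalDist().cdf(z)             # lower tail, since HA says p < pi0
print(f"sigma_p = {std_error:.5f}, z = {z:.2f}, PROB-VALUE = {prob_value:.4f}")
```

The computed tail area is about 0.0455, in line with the 0.046 read from the z-table for z = 1.69.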

Sample Problem #4

What does a PROB-VALUE of 0.046 indicate about our statement?

Statistical Significance vs. Practical Significance

What are all these tests really telling us?
They tell us about the presence of a difference (<, >, =), which on its own can be scientifically uninteresting

Two approaches for managing this situation:

Test only important hypotheses
Use confidence intervals rather than hypothesis tests

