Experimental Statistics - week 2

• Sampling Distributions
– Chi-square
– F

• Statistical Inference
– Confidence Intervals
– Hypothesis Tests

Review Continued

Chi-Square Distribution (distribution of the sample variance)

IF:
• Data are Normally Distributed
• Observations are Independent

Then:

$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

has a Chi-Square distribution with n - 1 degrees of freedom
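A small simulation can make this result concrete. The sketch below draws repeated normal samples and checks that the average of (n - 1)S^2 / sigma^2 is close to n - 1, which is the mean of a Chi-Square distribution with n - 1 degrees of freedom. The sample size, mean, standard deviation, seed, and function names are illustrative assumptions, not values from the slides.

```python
import random

def scaled_sample_variance(n, mu, sigma, rng):
    """Draw n independent Normal(mu, sigma) values; return (n - 1) * S^2 / sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    # (n - 1) * S^2 is the sum of squared deviations from the sample mean
    sum_sq_dev = sum((x - xbar) ** 2 for x in sample)
    return sum_sq_dev / sigma ** 2

def simulate_mean(n=10, mu=50.0, sigma=4.0, reps=20000, seed=1):
    """Average the scaled sample variance over many simulated samples."""
    rng = random.Random(seed)
    values = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(reps)]
    return sum(values) / reps

sim_mean = simulate_mean()
print(f"simulated mean: {sim_mean:.2f} (theory: n - 1 = 9)")
```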


Chi-square Distribution, Figure 7.10, page 357

F-Distribution

IF:
• $S_1^2$ and $S_2^2$ are sample variances from 2 samples
• samples independent
• populations are both normal

Then:

$$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$

has an F-distribution with $n_1 - 1$ and $n_2 - 1$ df
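As a rough check of the F result, the sketch below simulates the ratio of scaled sample variances from two independent normal samples and compares its average with d2 / (d2 - 2), the mean of an F distribution with d2 = n2 - 1 denominator df. The sample sizes, standard deviations, and seed are assumptions chosen only for illustration.

```python
import random

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def f_ratio(n1, sigma1, n2, sigma2, rng):
    """(S1^2 / sigma1^2) / (S2^2 / sigma2^2) for two independent normal samples."""
    s1_sq = sample_variance([rng.gauss(0.0, sigma1) for _ in range(n1)])
    s2_sq = sample_variance([rng.gauss(0.0, sigma2) for _ in range(n2)])
    return (s1_sq / sigma1 ** 2) / (s2_sq / sigma2 ** 2)

rng = random.Random(2)
n1, n2 = 8, 12  # assumed sample sizes
ratios = [f_ratio(n1, 3.0, n2, 5.0, rng) for _ in range(20000)]
sim_mean = sum(ratios) / len(ratios)
theory_mean = (n2 - 1) / (n2 - 3)  # d2 / (d2 - 2) with d2 = n2 - 1
print(f"simulated mean {sim_mean:.3f} vs theoretical mean {theory_mean:.3f}")
```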


F-distribution, Figure 7.10, page 357

(1-α)x100% Confidence Intervals for μ

Setting:
• Data are Normally Distributed
• Observations are Independent

Case 1: σ known

$$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Case 2: σ unknown

$$\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}$$

(df = n - 1)

CI Example

An insurance company is concerned about the number and magnitude of hail damage claims it received this year. A random sample of 20 of the thousands of claims it received this year resulted in an average claim amount of $6,500 and a standard deviation of $1,500. What is a 95% confidence interval for the mean claim damage amount?

Suppose that company actuaries believe the company does not need to increase insurance rates for hail damage if the mean claim damage amount is no greater than $7,000. Use the above information to make a recommendation regarding whether rates should be raised.
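One way to work the interval numerically is sketched below, using the σ-unknown (t) form with n - 1 = 19 df. The critical value 2.093 is taken from a standard t table and is hard-coded as an assumption; the function name is hypothetical.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# t_{0.025} with 19 df, read from a standard t table (assumed, not computed here)
T_CRIT_95_DF19 = 2.093

lower, upper = t_confidence_interval(xbar=6500.0, s=1500.0, n=20, t_crit=T_CRIT_95_DF19)
print(f"95% CI for the mean claim: ({lower:,.0f}, {upper:,.0f}) dollars")
```

The interval runs from roughly $5,800 to $7,200, so $7,000 lies inside it. The data are therefore consistent with a mean at or below $7,000, although the interval alone does not rule out a somewhat larger mean.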

Interpretation of 95% Confidence Interval

100 different 95% CI plotted in the case for which true mean is 80

i.e. about 95% of these confidence intervals should “cover” the true mean
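The coverage idea can be reproduced with a short simulation. The sketch below builds 100 t-based 95% intervals from normal samples with true mean 80 and counts how many cover it. The population standard deviation (10), sample size (20), and seed are assumptions, since the slide does not state them; 2.093 is the tabled t value for 19 df.

```python
import math
import random

T_CRIT = 2.093  # t_{0.025} with 19 df from a standard t table (assumes n = 20)

def count_covering(true_mean=80.0, sigma=10.0, n=20, n_intervals=100, seed=3):
    """Count how many simulated 95% t intervals contain the true mean."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        half_width = T_CRIT * s / math.sqrt(n)
        if xbar - half_width <= true_mean <= xbar + half_width:
            covered += 1
    return covered

covered = count_covering()
print(f"{covered} of 100 intervals cover the true mean of 80")
```

Any single run will not give exactly 95; the count varies from run to run around that value.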

Concern has been mounting that SAT scores are falling.

• 3 years ago -- National AVG = 955

• Random Sample of 200 graduating high school students this year (sample average = 935) (each year the standard deviation is about 100)

Question: Have SAT scores dropped ?

Procedure: Determine how “extreme” or “rare” our sample AVG of 935 is if population AVG really is 955.

We must decide:

• The sample came from population with population AVG = 955 and just by chance the sample AVG is “small.”

OR

• We are not willing to believe that the pop. AVG this year is really 955. (Conclude SAT scores have fallen.)
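To judge how "rare" a sample average of 935 would be, one option is a z calculation that treats the standard deviation of 100 as the known population value (an assumption; the slide only says it is "about 100"). The sketch computes the standardized distance and the chance of a sample average this low or lower if the population average really is 955.

```python
import math
from statistics import NormalDist

mu0 = 955.0    # population average under the "no change" assumption
xbar = 935.0   # observed sample average
sigma = 100.0  # treated as the known population standard deviation (assumption)
n = 200

# Standardize: how many standard errors is the sample average below 955?
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Probability of a sample average this small or smaller if the mean is 955
p_lower = NormalDist().cdf(z)
print(f"z = {z:.2f}, P(Z <= z) = {p_lower:.4f}")
```

The sample average sits almost 3 standard errors below 955, and the lower-tail probability is only about 0.002, so chance alone is an unlikely explanation under these assumptions.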

Hypothesis Testing Terminology

Statistical Hypothesis
- statement about the parameters of one or more populations

Null Hypothesis ($H_0$)
- hypothesis to be "tested" (standard, traditional, claimed, etc.)
- hypothesis of no change, effect, or difference (usually what the investigator wants to disprove)

Alternative Hypothesis ($H_a$)
- null is not correct (usually the hypothesis the investigator suspects or wants to show)

Basic Hypothesis Testing Question:

Do the Data provide sufficient evidence to refute the Null Hypothesis?

Hypothesis Testing (cont.)

Critical Region (Rejection Region)
- region of test statistic that leads to rejection of null (i.e. t > c, etc.)

Critical Value
- endpoint of critical region

Significance Level (α)
- probability that the test statistic will be in the critical region if null is true
- probability of rejecting $H_0$ when it is true

Types of Hypotheses

One-Sided Tests

$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$

$H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$

Two-sided Tests

$H_0: \mu = \mu_0$ vs. $H_a: \mu \ne \mu_0$

Rejection Regions for One- and Two-Sided Alternatives

[Figure: rejection regions with the critical values marked]

$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$: Reject $H_0$ if $t < -t_{\alpha}$

$H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$: Reject $H_0$ if $t > t_{\alpha}$

$H_0: \mu = \mu_0$ vs. $H_a: \mu \ne \mu_0$: Reject $H_0$ if $|t| > t_{\alpha/2}$

A Standard Hypothesis Test Write-up

1. State the null and alternative

2. Give significance level, test statistic, and the rejection region

3. Show calculations

4. State the conclusion
- statistical decision
- give conclusion in language of the problem

Hypothesis Testing Example 1

A solar cell requires a special crystal. If properly manufactured, the mean weight of these crystals is .4g. Suppose that 25 crystals are selected at random from a batch of crystals and it is calculated that for these crystals, the average is .41g with a standard deviation of .02g. At the α = .01 level of significance, can we conclude that the batch is bad?
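A possible calculation for this example is sketched below. It assumes a two-sided alternative (a "bad" batch is read as a mean weight different from .4g in either direction) and uses the tabled critical value t_{0.005} = 2.797 for 24 df; both choices are assumptions, and the function name is hypothetical.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for testing H0: mu = mu0 with sigma unknown."""
    return (xbar - mu0) / (s / math.sqrt(n))

# Two-sided test at alpha = .01 with df = 25 - 1 = 24 (assumed alternative)
T_CRIT = 2.797  # t_{0.005} with 24 df from a standard t table

t_stat = one_sample_t(xbar=0.41, mu0=0.40, s=0.02, n=25)
reject = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.2f}, reject H0 at alpha = .01: {reject}")
```

Under these assumptions t = 2.5 does not exceed 2.797, so the sample does not give sufficient evidence at the .01 level that the batch is bad.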

Hypothesis Testing Example 2

A box of detergent is designed to weigh on the average 3.25 lbs per box. A random sample of 18 boxes taken from the production line on a single day has a sample average of 3.238 lbs and a standard deviation of 0.037 lbs. Test whether the boxes seem to be underfilled.
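The sketch below works this example as a lower-tailed test, since "underfilled" points to a mean below 3.25 lbs. The example does not give a significance level, so α = .05 is assumed here, with the tabled value t_{0.05} = 1.740 for 17 df.

```python
import math

mu0 = 3.25     # designed mean weight (lbs)
xbar = 3.238   # sample average
s = 0.037      # sample standard deviation
n = 18

# Lower-tailed test: H0: mu = 3.25 vs. Ha: mu < 3.25
t_stat = (xbar - mu0) / (s / math.sqrt(n))

ALPHA = 0.05    # assumed; not stated in the example
T_CRIT = 1.740  # t_{0.05} with 17 df from a standard t table
reject = t_stat < -T_CRIT
print(f"t = {t_stat:.3f}, reject H0 at alpha = {ALPHA}: {reject}")
```

With t of about -1.38, the statistic does not fall below -1.740, so under the assumed α = .05 the data do not show that the boxes are underfilled.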

Errors in Hypothesis Testing

                         Actual Situation
Conclusion               Null is True                Null is False
Do Not Reject H0         Correct Decision (1 - α)    Type II Error (β)
Reject H0                Type I Error (α)            Correct Decision (1 - β) (Power)
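The meaning of the Type I error probability can be illustrated by simulation: when the null is true, a level-α test should reject in about a fraction α of repeated samples. The sketch below uses a two-sided z test with known σ; the sample size, number of trials, and seed are arbitrary assumptions.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, mu0=0.0, sigma=1.0, n=25, trials=5000, seed=5):
    """Fraction of samples drawn under H0 for which a two-sided z test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Data are generated with mean mu0, so the null hypothesis is true
        xbar = sum(rng.gauss(mu0, sigma) for _ in range(n)) / n
        z = (xbar - mu0) / (sigma / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f} (nominal alpha = 0.05)")
```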

p-Value

$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$

Reject $H_0$ if $t < -t_{\alpha}$

Suppose t = -2.39 is observed from data for test above

Note: “Large negative values” of t make us believe alternative is true

p-value: the probability of an observation as extreme or more extreme than the one observed when the null is true

[Figure: observed value of t (-2.39) and the corresponding p-value]

Note:
-- if the p-value is less than or equal to α, then we reject the null at the α significance level

-- the p-value is the smallest level of significance at which the null hypothesis would be rejected


Find the p-values for Examples 1 and 2
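Without statistical tables or software, t-distribution tail areas can be approximated by integrating the t density numerically. The sketch below does this with Simpson's rule and applies it to the two examples, assuming a two-sided alternative for Example 1 and a lower-tailed alternative for Example 2 (assumptions about how the problems are set up). Function names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Example 1: crystals, two-sided alternative assumed, df = 24
t1 = (0.41 - 0.40) / (0.02 / math.sqrt(25))
p1 = 2 * t_upper_tail(abs(t1), 24)

# Example 2: detergent, lower-tailed alternative assumed, df = 17
t2 = (3.238 - 3.25) / (0.037 / math.sqrt(18))
p2 = t_upper_tail(abs(t2), 17)  # by symmetry, P(T < t2) = P(T > |t2|) for negative t2

print(f"Example 1: t = {t1:.2f}, two-sided p-value = {p1:.4f}")
print(f"Example 2: t = {t2:.3f}, lower-tail p-value = {p2:.4f}")
```

Under these assumptions the first p-value is just under .02 (larger than α = .01) and the second is a little over .09.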

Two Independent Samples

• Assumptions: Measurements from Each Population are
– Mutually Independent
    Independent within Each Sample
    Independent Between Samples
– Normally Distributed (or the Central Limit Theorem can be Invoked)

• Analysis Differs Based on Whether the Two Populations Have the Same Standard Deviation

Two Types of Independent Samples

• Population Standard Deviations Equal
– Can Obtain a Better Estimate of the Common Standard Deviation by Combining or “Pooling” Individual Estimates

• Population Standard Deviations Different
– Must Estimate Each Standard Deviation
– Very Good Approximate Tests are Available

If Unsure, Do Not Assume Equal Standard Deviations

Equal Population Standard Deviations

Test Statistic

$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$

df = n1 + n2 - 2

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad s_p = \sqrt{s_p^2}$$
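The pooled test statistic can be computed directly from summary statistics. The sketch below is one way to code it; the function name and the numbers in the usage lines are hypothetical and are not data from the slides.

```python
import math

def pooled_t(n1, ybar1, s1, n2, ybar2, s2, delta0=0.0):
    """Pooled two-sample t statistic for H0: mu1 - mu2 = delta0.

    Returns (t, df, s_p). Assumes equal population standard deviations.
    """
    df = n1 + n2 - 2
    sp_sq = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    sp = math.sqrt(sp_sq)
    t = ((ybar1 - ybar2) - delta0) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, df, sp

# Hypothetical summary statistics, for illustration only
t_stat, df, sp = pooled_t(n1=10, ybar1=12.0, s1=2.0, n2=10, ybar2=10.0, s2=2.0)
print(f"s_p = {sp:.3f}, t = {t_stat:.3f}, df = {df}")
```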

Behrens-Fisher Problem

If $\sigma_1 \ne \sigma_2$:

$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \nsim t$$

(this statistic does not follow an exact t distribution)

Satterthwaite’s Approximate t Statistic

If $\sigma_1 \ne \sigma_2$:

$$t' = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \approx t$$

$$df = \frac{(a + b)^2}{\frac{a^2}{n_1 - 1} + \frac{b^2}{n_2 - 1}}, \qquad a = \frac{s_1^2}{n_1}, \quad b = \frac{s_2^2}{n_2}$$

(Approximate t df)

(i.e. approximate t)
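A sketch of the approximate t statistic and its degrees of freedom follows, with a = s1^2/n1 and b = s2^2/n2 and df = (a + b)^2 / (a^2/(n1 - 1) + b^2/(n2 - 1)). The function name and the summary statistics in the usage lines are hypothetical.

```python
import math

def satterthwaite_t(n1, ybar1, s1, n2, ybar2, s2, delta0=0.0):
    """Approximate t statistic and df when standard deviations may differ."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    t = ((ybar1 - ybar2) - delta0) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical summary statistics with unequal spreads, for illustration only
t_stat, df = satterthwaite_t(n1=10, ybar1=12.0, s1=2.0, n2=15, ybar2=10.0, s2=4.0)
print(f"t' = {t_stat:.3f}, approximate df = {df:.1f}")

# With equal sample sizes and equal sample standard deviations,
# the approximate df reduces to n1 + n2 - 2
_, df_equal = satterthwaite_t(10, 12.0, 2.0, 10, 10.0, 2.0)
print(f"equal-case df = {df_equal:.1f}")
```

The df is generally not an integer; a common convention is to round it down before consulting a t table.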

Often-Recommended Strategy for Tests on Means

Test Whether σ1 = σ2 (F-test)
– If the test is not rejected, use the 2-sample t statistic, assuming equal standard deviations
– If the test is rejected, use Satterthwaite’s approximate t statistic

NOTE: This is Not a Wise Strategy
– the F-test is highly susceptible to non-normality

Recommended Strategy:
– If uncertain about whether the standard deviations are equal, use Satterthwaite’s approximate t statistic

Example 3: Comparing the Mean Breaking Strengths of 2 Plastics

Plastic A:

Plastic B:

.= , s.=y , = n AAA 3332835

Assumptions:
• Mutually independent measurements
• Normal distributions for measurements from each type of plastic
• Equal population standard deviations

.= , s.=y , = n AAA 9472640

Question: Is there a difference between the 2 plastics in terms of mean breaking strength?

New diet -- Is it effective?

Design:

50 people: randomly assign 25 to go on diet and 25 to eat normally for next month.

Assess results by comparing weights at end of 1 month.

Diet: $\bar{X}_D$, $S_D$        No Diet: $\bar{X}_{ND}$, $S_{ND}$

Run 2-sample t-test using guidelines we have discussed.

Is this a good design?

Better Design:

Randomly select subjects and measure them before and after 1-month on the diet.

Subject   Before   After   Difference
1         150      147     3
2         210      195     15
:         :        :       :
n         187      190     -3

Procedure: Calculate differences, and analyze differences using a 1-sample test

“Paired t-Test”

Example 4: International Gymnastics Judging

Data:

Contestant       1    2    3    4    5    6    7    8    9    10   11   12
Native Judge     6.8  4.5  8.0  7.2  8.7  4.5  6.6  5.8  6.0  8.8  8.7  4.4
Foreign Judges   6.7  4.3  8.1  7.2  8.3  4.6  5.4  5.9  6.1  9.1  8.7  4.3

Question: Do judges from a contestant’s country rate their own contestant higher than do foreign judges?

i.e. test $H_0: \mu_N = \mu_F$ vs. $H_a: \mu_N > \mu_F$
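Because each contestant is scored by both kinds of judges, the gymnastics scores are paired, and the differences (native minus foreign) can be analyzed with a 1-sample t statistic. The sketch below does this for the 12 contestants, assuming an upper-tailed alternative and α = .05 (the significance level is not given in the example); 1.796 is the tabled t_{0.05} for 11 df.

```python
import math

native = [6.8, 4.5, 8.0, 7.2, 8.7, 4.5, 6.6, 5.8, 6.0, 8.8, 8.7, 4.4]
foreign = [6.7, 4.3, 8.1, 7.2, 8.3, 4.6, 5.4, 5.9, 6.1, 9.1, 8.7, 4.3]

# Paired differences: native score minus foreign score for each contestant
diffs = [nat - frn for nat, frn in zip(native, foreign)]
n = len(diffs)
mean_d = sum(diffs) / n
s_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))

# 1-sample t statistic on the differences, H0: mean difference = 0
t_stat = mean_d / (s_d / math.sqrt(n))

T_CRIT = 1.796  # t_{0.05} with 11 df from a standard t table (alpha = .05 assumed)
reject = t_stat > T_CRIT
print(f"mean difference = {mean_d:.3f}, s_d = {s_d:.3f}, t = {t_stat:.3f}, reject H0: {reject}")
```

Under these assumptions t is a little below 1, well short of 1.796, so these scores alone do not show that native judges rate their own contestants higher.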
