Breaking Statistical Rules: How bad is it really? Presented by Sio F. Kong Joint work with: Janet...

Breaking Statistical Rules: How bad is it really?

Presented by Sio F. Kong

Joint work with: Janet Locke, Samson Amede

Advisor: Dr. C. K. Chauhan

Background

We make inferences about populations based on information from random samples.

This process is called Hypothesis Testing. It is used in many areas, such as Biology, Psychology, and Business.

Examples

– Mean heart rates: white newborns vs. African American newborns.

– Mean daily intake of saturated fat: among a vegetarian population vs. 15 grams.

– Mean SAT score: in a particular county vs. the national average.

Notations

Population means: μ1, μ2 (unknown most of the time)

Sample means: x̄1, x̄2

Population standard deviations: σ1, σ2 (unknown most of the time)

Sample standard deviations: S1, S2

Pooled standard deviation: Sp

Sample sizes: n1, n2

Two-Sample Hypothesis Testing

Example:

Null Hypothesis: H0: μ1 - μ2 = 0 (two means are equal)

Research Hypothesis: H1: μ1 - μ2 ≠ 0 (two means are not equal)

If x̄1 - x̄2 is significantly away from 0, reject the Null Hypothesis. That is, conclude that the two means are NOT equal.

The corresponding test statistic has a t-distribution.

Important

This test statistic has a t-distribution under certain conditions:

– Samples are drawn randomly.
– If samples are small, populations need to be normally distributed.
– The two populations have equal variances, σ1 = σ2.

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

chauhan

Samples are random. If samples are small, populations need to be normally distributed. Sp is a function of the sample standard deviations.

chauhan

The populations have equal variances, σ1 = σ2.

Objective

To investigate how violating the equal-variances assumption affects the testing procedure.

Our textbook suggests that the effect of the violation is minimal when the sample sizes are equal.

Measurement for a GOOD test

Two types of errors:
– Type 1 error: rejecting a true null hypothesis
– Type 2 error: failing to reject a false null hypothesis

α = the probability of a type 1 error. It is selected in advance, usually 5%.

Power = 1 - Pr(type 2 error). It can be calculated under various alternatives.

A test is good if the power is high under various alternatives while α stays at the selected level.

chauhan

In hypothesis testing, we make decisions about populations based on the samples. We can make two types of errors. 1) Reject a true null hypothesis, i.e., the two populations have equal means and you conclude that they do not.

In this research…

1000 tests are generated by simulation in each situation.

Simulation studies are done to calculate:
– α: probability of rejecting a true null hypothesis
– Power: probability of rejecting a false null hypothesis

Both are calculated under various alternatives when the equal-variances assumption is violated.
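The simulation described above can be sketched roughly as follows. This is a minimal illustration, not the presenters' actual program: the slides do not say which software, random number generator, or population distribution was used, so the normal populations, the seed, and the function name `rejection_rate` are all assumptions. The critical value 2.101 is the two-sided 5% point of the t-distribution with 18 degrees of freedom, which matches n1 + n2 = 20.

```python
import math
import random
from statistics import mean, stdev

T_CRIT_DF18 = 2.101  # two-sided 5% critical value of t with 18 d.f.

def rejection_rate(mu1, mu2, sigma1, sigma2, n1=10, n2=10, n_tests=1000, seed=0):
    """Fraction of simulated pooled t-tests that reject H0: mu1 - mu2 = 0.

    Assumption: both populations are normal; the slides do not specify this.
    """
    if n1 + n2 - 2 != 18:
        raise ValueError("the hard-coded critical value is only valid for 18 d.f.")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rejections = 0
    for _ in range(n_tests):
        x = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        y = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        # Pooled standard deviation from the two sample standard deviations.
        s_p = math.sqrt(
            ((n1 - 1) * stdev(x) ** 2 + (n2 - 1) * stdev(y) ** 2) / (n1 + n2 - 2)
        )
        t = (mean(x) - mean(y)) / (s_p * math.sqrt(1 / n1 + 1 / n2))
        if abs(t) > T_CRIT_DF18:
            rejections += 1
    return rejections / n_tests

# Equal means: the rejection rate estimates alpha.
alpha_hat = rejection_rate(10, 10, 2, 2)
# Unequal means: the rejection rate estimates power.
power_hat = rejection_rate(10, 14, 2, 2)
```

Calling `rejection_rate(10, 10, 2, 5, n1=14, n2=6)` would estimate α for one of the violated settings. Estimates from 1000 tests carry simulation noise, so they should not be expected to reproduce the slide values exactly.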

chauhan

Our textbook suggests that the effect of the violation is minimal if the two sample sizes are equal.

Effect when σ1 ≠ σ2:

               For α                 For power
               Pop1      Pop2       Pop1      Pop2
Mean           μ1 = 10   μ2 = 10    μ1 = 10   μ2 = 14
Sample size    n1 = 10   n2 = 10    n1 = 10   n2 = 10

                   α       power
σ1 = 2, σ2 = 3     4.4%    89.8%
σ1 = 2, σ2 = 4     5.4%    75.0%
σ1 = 2, σ2 = 5     6.0%    60.1%
σ1 = 2, σ2 = 10    8.0%    24.2%

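$$ S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}} $$

$$ t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} $$

Reject if $|t| > t_{\alpha/2,\ \mathrm{d.f.}}$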
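As a worked illustration of the pooled test statistic and the rejection rule, here is a small sketch. The two samples are made up for illustration only and do not come from the study. The helper name `pooled_t` is hypothetical, and the critical value 2.306 is the two-sided 5% point of the t-distribution with 8 degrees of freedom.

```python
import math
from statistics import mean, stdev

def pooled_t(x, y):
    """Return (t, s_p, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    s1, s2 = stdev(x), stdev(y)  # sample standard deviations (n - 1 denominator)
    df = n1 + n2 - 2
    # Pooled standard deviation: weighted combination of the sample variances.
    s_p = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    # Test statistic for H0: mu1 - mu2 = 0.
    t = (mean(x) - mean(y) - 0) / (s_p * math.sqrt(1 / n1 + 1 / n2))
    return t, s_p, df

# Hypothetical data, made up for illustration only.
x = [9.1, 10.4, 11.2, 8.7, 10.9]
y = [13.5, 14.8, 12.9, 15.1, 13.2]

t, s_p, df = pooled_t(x, y)
T_CRIT_DF8 = 2.306  # two-sided 5% critical value of t with 8 d.f.
reject = abs(t) > T_CRIT_DF8
```

For these made-up samples, t is roughly -5.8 with 8 degrees of freedom, so the null hypothesis would be rejected at the 5% level.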

Condition not violated:

σ1 = σ2

In this example: σ1 = σ2 = 2

n1 = n2 = 10:

α       power
5.2%    98.5%

n1 ≠ n2:

n1, n2    α       power
12, 8     5.2%    98.9%
13, 7     5.0%    98.6%
14, 6     5.0%    97.7%

Conclusion:

When σ1 = σ2, it does not matter whether n1 = n2, since equal sample sizes are not a requirement of the test.

Condition violated:

σ1 ≠ σ2

In this example: σ1 = 2 and σ2 = 5

n1 = n2 = 10:

α       power
6.5%    60.1%

n1 ≠ n2:

n1, n2    α        power
12, 8     9.6%     65.2%
13, 7     12.2%    66.6%
14, 6     15.4%    64.2%

Conclusion:

When σ1 ≠ σ2, the effect on α is even larger if n1 ≠ n2.

Result

Conclusion

As the difference between σ1 and σ2 gets larger, α goes up and power goes down.

Other interesting observations:
– If the smaller sample has the larger standard deviation, α goes up.
– If the larger sample has the larger standard deviation, α goes down.
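One way to see why these observations are plausible is to compare the true variance of x̄1 - x̄2, which is σ1²/n1 + σ2²/n2 for independent samples, with the quantity the pooled denominator estimates on average, E[Sp²](1/n1 + 1/n2). The sketch below is a heuristic check rather than part of the original study. The function names are hypothetical, and the (6, 14) setting is added here only for illustration.

```python
def true_variance(sigma1, sigma2, n1, n2):
    """Actual variance of the difference of two independent sample means."""
    return sigma1 ** 2 / n1 + sigma2 ** 2 / n2

def pooled_variance_target(sigma1, sigma2, n1, n2):
    """Expected value of S_p^2 * (1/n1 + 1/n2), the squared pooled denominator."""
    expected_sp2 = ((n1 - 1) * sigma1 ** 2 + (n2 - 1) * sigma2 ** 2) / (n1 + n2 - 2)
    return expected_sp2 * (1 / n1 + 1 / n2)

# sigma1 = 2, sigma2 = 5 as in the violated example on the slides.
ratios = {}
for n1, n2 in [(10, 10), (14, 6), (6, 14)]:
    ratios[(n1, n2)] = (
        pooled_variance_target(2, 5, n1, n2) / true_variance(2, 5, n1, n2)
    )
```

The ratio is 1 for n1 = n2 = 10, about 0.53 for (14, 6), and about 1.86 for (6, 14). A ratio below 1 means the pooled denominator tends to be too small, which inflates |t| and pushes α up. A ratio above 1 has the opposite effect. This matches the direction of the observations above, though it does not by itself give the exact α values.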

chauhan

Instead of alpha = prob. of type 1 error. Effect on power: if n1 = n2, power is not as high as it was when the assumption was not violated.

Note

This conclusion is only based on what this simulation study has shown. By selecting different parameters and choosing different alternatives, the result may be different.

Thank You!

Date post:	29-Dec-2015
Category:	Documents
Upload:	corey-lloyd