Date post: | 29-Dec-2015 |
Breaking Statistical Rules: How bad is it really?
Presented by Sio F. Kong
Joint work with: Janet Locke, Samson Amede
Advisor: Dr. C. K. Chauhan
Background
Hypothesis testing is the process of making inferences about populations based on information from random samples.
It is used in many areas, such as biology, psychology, and business.
Examples
– Mean heart rates: white newborns vs. African American newborns.
– Mean daily intake of saturated fat: a vegetarian population vs. 15 grams.
– Mean SAT score: a particular county vs. the national average.
Notations
Population means: μ1, μ2 (unknown most of the time)
Sample means: x̄1, x̄2
Population standard deviations: σ1, σ2 (unknown most of the time)
Sample standard deviations: S1, S2
Pooled standard deviation: Sp
Sample sizes: n1, n2
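The slides name the pooled standard deviation Sp without defining it. Assuming the usual textbook definition for the equal-variance two-sample test (an assumption, since the source does not state it), Sp combines the two sample standard deviations weighted by their degrees of freedom:

```latex
S_p = \sqrt{\frac{(n_1 - 1)\,S_1^2 + (n_2 - 1)\,S_2^2}{n_1 + n_2 - 2}}
```

When n1 = n2, this reduces to the square root of the average of S1² and S2².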
2-Samples Hypotheses Testing
Example:
Null hypothesis: H0: μ1 − μ2 = 0 (the two means are equal)
Research hypothesis: H1: μ1 − μ2 ≠ 0 (the two means are not equal)
If x̄1 − x̄2 is significantly far from 0, reject the null hypothesis; that is, conclude that the two means are NOT equal.
The corresponding test statistic has a t-distribution.
Important
This test statistic has a t-distribution under certain conditions:
– Samples are drawn randomly.
– If samples are small, the populations need to be normally distributed.
– The two populations have equal variances, σ1 = σ2.
Test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
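As a rough illustration, the pooled two-sample t statistic can be computed as in the sketch below. The function name `pooled_t_statistic` and the two small data sets are made up for this example and do not come from the study; the pooled standard deviation follows the usual textbook definition, which is assumed here.

```python
import math
import statistics

def pooled_t_statistic(sample1, sample2):
    """Return (t, df) for the equal-variance two-sample t test of H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = statistics.fmean(sample1), statistics.fmean(sample2)
    # statistics.variance gives the sample variance (divisor n - 1), i.e. S1^2 and S2^2.
    var1, var2 = statistics.variance(sample1), statistics.variance(sample2)
    df = n1 + n2 - 2
    # Pooled standard deviation Sp (assumed textbook definition).
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)
    t = ((mean1 - mean2) - 0) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, df

# Hypothetical data, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 3.0, 4.0, 5.0, 6.0]
t_value, dof = pooled_t_statistic(group_a, group_b)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

The null hypothesis is rejected when |t| exceeds the critical value of the t-distribution with n1 + n2 − 2 degrees of freedom at the chosen α.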
Objective
To investigate the effect of violating the equal-variance assumption on the testing procedure.
Our textbook suggests that the effect of the violation is minimal when sample sizes are equal.
Measurement for a GOOD test
Two types of errors:
– Type 1 error: rejecting a true null hypothesis
– Type 2 error: failing to reject a false null hypothesis
α = the probability of a type 1 error; it is selected in advance, usually 5%.
Power = 1 − Pr(type 2 error); it can be calculated under various alternatives.
A test is good if the power is high under various alternatives while α stays at the selected level.
In this research…
1000 tests are generated by simulation in each situation.
Simulation studies are done to calculate:
– α: probability of rejecting a true null hypothesis
– Power: probability of rejecting a false null hypothesis
Both are estimated under various alternatives when the equal-variance assumption is violated.
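A simulation of this kind can be sketched as follows. This is a minimal reconstruction under stated assumptions, not the authors' actual program: the helper names `rejects_null` and `rejection_rate`, the random seeds, and the use of normally distributed populations are all assumptions. The critical value 2.101 is the standard two-sided 5% value of the t-distribution with 18 degrees of freedom, so the sketch only applies when n1 + n2 = 20.

```python
import math
import random
import statistics

# Two-sided 5% critical value for t with 18 degrees of freedom (n1 + n2 - 2 = 18).
# Hard-coded from standard t tables; only valid when n1 + n2 = 20.
T_CRIT_DF18 = 2.101

def rejects_null(sample1, sample2, t_crit=T_CRIT_DF18):
    """Run the pooled two-sample t test of H0: mu1 = mu2; True if H0 is rejected."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    diff = statistics.fmean(sample1) - statistics.fmean(sample2)
    t = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return abs(t) > t_crit

def rejection_rate(mu1, sigma1, n1, mu2, sigma2, n2, n_tests=1000, seed=0):
    """Fraction of simulated tests that reject H0 (normal populations assumed)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_tests):
        s1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        s2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        rejections += rejects_null(s1, s2)
    return rejections / n_tests

# Equal means: the rejection rate estimates alpha.
alpha_hat = rejection_rate(10, 2, 10, 10, 2, 10)
# Unequal means (10 vs. 14): the rejection rate estimates power.
power_hat = rejection_rate(10, 2, 10, 14, 2, 10, seed=1)
print(f"estimated alpha = {alpha_hat:.3f}, estimated power = {power_hat:.3f}")
```

Changing `sigma2`, `n1`, and `n2` (keeping n1 + n2 = 20) gives the other scenarios; with only 1000 tests per scenario, estimates vary by a percentage point or so from run to run.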
Effect when σ1 ≠ σ2:

                α                      power
                Pop1       Pop2        Pop1       Pop2
Mean            µ1 = 10    µ2 = 10     µ1 = 10    µ2 = 14
Sample size     n1 = 10    n2 = 10     n1 = 10    n2 = 10

σ1, σ2               α       power
σ1 = 2, σ2 = 3       4.4%    89.8%
σ1 = 2, σ2 = 4       5.4%    75.0%
σ1 = 2, σ2 = 5       6.0%    60.1%
σ1 = 2, σ2 = 10      8.0%    24.2%
Condition not violated:
σ1 = σ2
In this example: σ1 = σ2 = 2

            n1, n2    α       power
n1 = n2     10, 10    5.2%    98.5%
n1 ≠ n2     12, 8     5.2%    98.9%
            13, 7     5.0%    98.6%
            14, 6     5.0%    97.7%
Conclusion:
When σ1 = σ2, it does not matter whether n1 = n2, since equal sample sizes are not a requirement of the test.
Condition violated:
σ1 ≠ σ2
In this example: σ1 = 2 and σ2 = 5

            n1, n2    α        power
n1 = n2     10, 10    6.5%     60.1%
n1 ≠ n2     12, 8     9.6%     65.2%
            13, 7     12.2%    66.6%
            14, 6     15.4%    64.2%
Conclusion:
When σ1 ≠ σ2 and n1 ≠ n2, the effect on α is even larger.
Result
Conclusion
If the difference between σ1 and σ2 gets larger, α goes up and power goes down.
Other interesting observations:
– If the smaller sample has the larger standard deviation, α goes up.
– If the larger sample has the larger standard deviation, α goes down.
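One way to see why the direction depends on which sample has the larger standard deviation is to compare the variance that the pooled test implicitly assumes for x̄1 − x̄2 with its true variance. The sketch below is a back-of-the-envelope check, not part of the original study; the function name `pooled_to_true_variance_ratio` is made up, and it relies only on the fact that each sample variance is an unbiased estimate of its population variance.

```python
def pooled_to_true_variance_ratio(sigma1, n1, sigma2, n2):
    """Ratio of the variance the pooled test assumes for (xbar1 - xbar2) to the true one."""
    # True variance of the difference of two independent sample means.
    true_var = sigma1**2 / n1 + sigma2**2 / n2
    # Expected value of the pooled variance Sp^2, using E[S^2] = sigma^2.
    expected_pooled_var = ((n1 - 1) * sigma1**2 + (n2 - 1) * sigma2**2) / (n1 + n2 - 2)
    # Variance the test statistic's denominator is built on.
    assumed_var = expected_pooled_var * (1 / n1 + 1 / n2)
    return assumed_var / true_var

# sigma1 = 2 and sigma2 = 5, as in the example above.
for n1, n2 in [(10, 10), (12, 8), (8, 12)]:
    ratio = pooled_to_true_variance_ratio(2, n1, 5, n2)
    print(f"n1 = {n1}, n2 = {n2}: ratio = {ratio:.3f}")
```

A ratio below 1 means the denominator of t tends to be too small, so |t| is inflated and α rises; this happens when the smaller sample has the larger standard deviation. A ratio above 1 has the opposite effect, and with equal sample sizes the ratio is 1, which is consistent with the textbook's remark that the violation matters least when n1 = n2.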
Note
This conclusion is based only on what this simulation study has shown. With different parameters and different alternatives, the results may differ.