The logic behind a statistical test.
A statistical test is the comparison of the probabilities in favour of a hypothesis H1 with the respective probabilities of an appropriate null hypothesis H0.
                      Hypothesis rejected        Hypothesis accepted
Hypothesis correct    Type II error (β)          Power of a test (1 − β)
Hypothesis wrong      1 − α                      Type I error (α)

Accepting the wrong hypothesis H1 is termed type I error. Rejecting the correct hypothesis H1 is termed type II error.
Lecture 11: Parametric hypothesis testing
Testing simple hypotheses
Karl Pearson tossed a coin 24000 times to see whether real-world results deviate from the expectation of 12000 heads and 12000 tails. He obtained heads 12012 times. Does this result deviate from our expectation?
The exact solution of the binomial:

P(X ≥ 12012) = Σ_{i=12012}^{24000} C(24000, i) · 0.5^24000

The normal approximation:

μ = n·p = 24000 · 0.5 = 12000
σ² = n·p·q = n·p·(1 − p) = 24000 · 0.5 · 0.5 = 6000

P(X ≥ 12012) = 1 − P(X < 12012) ≈ 1 − Φ((12012 − 12000)/√6000) = 0.4386

CL₀.₉₅ = x̄ ± 1.96·σ = 12000 ± 1.96·√6000 = 12000 ± 151.8

χ² = (12012 − 12000)²/12000 + (11988 − 12000)²/12000 = 0.024

12012 heads lies well within the 95% confidence limits: the result does not deviate significantly from expectation.
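As a check, the exact binomial tail probability and its normal approximation can be computed with scipy. This sketch is not part of the original slides:

```python
# Pearson's coin data: P(X >= 12012) for X ~ Binomial(24000, 0.5),
# exactly and via the normal approximation.
from math import sqrt
from scipy.stats import binom, norm

n, p, observed = 24000, 0.5, 12012

p_exact = binom.sf(observed - 1, n, p)          # exact P(X >= 12012)
z = (observed - n * p) / sqrt(n * p * (1 - p))  # (12012 - 12000) / sqrt(6000)
p_normal = norm.sf(z)                           # normal approximation

print(p_exact, p_normal)  # both close to 0.44: no significant deviation
```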
χ² test

Assume a sum of squared Z-transformed (standardized) variables:

χ²ₙ = z₁² + z₂² + … + zₙ²

E[χ²ₙ] = E[z₁²] + E[z₂²] + E[z₃²] + … + E[zₙ²] = n

Each variance is one; thus the expected value of χ²ₙ is n.

The χ² distribution is a family of distributions of such sums of variances, in dependence on the number of elements n.

Observed values of χ² can be compared to predicted ones and allow for statistical hypothesis testing.
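This construction can be illustrated with a short simulation (a sketch, not part of the slides): summing n squared standard normal draws yields a χ² variable whose mean is close to n.

```python
# The chi-square variable as a sum of squared z-transformed variables;
# its expected value equals the number of summed variables n.
import numpy as np

rng = np.random.default_rng(0)
n = 5                                   # number of summed variables
z = rng.standard_normal((100_000, n))   # 100000 samples of n standard normals
chi2 = (z ** 2).sum(axis=1)             # one chi-square draw per row

print(chi2.mean())  # close to n = 5
```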
Pearson’s coin example
Probability of H0
9 times green, yellow seed
3 times green, green seed
3 times yellow, yellow seed
1 time yellow, green seed

Combination   Ratio   Observed   Predicted
GY            9       61         65.25
GG            3       16         21.75
YY            3       28         21.75
YG            1       11         7.25
Sum           16      116        116
[Bar chart: number of observations per character combination (GY, GG, YY, YG), observed vs. predicted.]
Does the observation confirm the prediction?
χ² = Σ_{i=1}^{K} (expected value − observed value)² / expected value

χ² = (65.25 − 61)²/65.25 + (21.75 − 16)²/21.75 + (21.75 − 28)²/21.75 + (7.25 − 11)²/7.25 ≈ 5.53

The χ² test has K − 1 degrees of freedom.
All statistical programs give the probability of the null hypothesis, H0.
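For example, the 9:3:3:1 test above can be reproduced with scipy (a sketch, not from the slides); `chisquare` returns the χ² value and the probability of H0:

```python
# Chi-square goodness-of-fit test for the 9:3:3:1 seed example.
from scipy.stats import chisquare

observed = [61, 16, 28, 11]
expected = [65.25, 21.75, 21.75, 7.25]  # 116 * 9/16, 3/16, 3/16, 1/16

chi2, p = chisquare(observed, f_exp=expected)  # df = K - 1 = 3
print(chi2, p)  # chi2 ≈ 5.53; p is the probability of H0
```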
Advice for applying a χ²-test
• χ²-tests compare observations and expectations. The total numbers of observations and expectations must be equal.
• The absolute values should not be too small (as a rule, the smallest expected value should be larger than 10). For small event numbers the Yates correction should be used.
• The classification of events must be unequivocal.
• χ²-tests are quite robust. That means they are conservative and rather favour H0, the hypothesis of no deviation.
• The applicability of the χ²-test does not depend on the underlying distributions. They need not be normally or binomially distributed.
Dealing with frequencies:

χ² = N · Σ_{i=1}^{K} (expected frequency − observed frequency)² / expected frequency

For small event numbers, the Yates correction is applied:

χ² = Σ_{i=1}^{K} (|expected value − observed value| − 0.5)² / expected value
2x2 contingency table

1000 Drosophila flies with normal and curled wings and two alleles A and B supposed to influence wing form. Do flies with allele A have curled wings more often than flies with allele B?

          A      B      Sum
Curled    110    475    585
Normal    90     325    415
Sum       200    800    1000

Combination   Observed   Predicted   χ²
A-curled      110        117         0.418803
A-normal      90         83          0.590361
B-curled      475        468         0.104701
B-normal      325        332         0.14759
Sum           1000       1000        1.261456

With 1 degree of freedom, P(χ² ≥ 1.26) ≈ 0.26: no significant association.
χ² distribution

χ² = (117 − 110)²/117 + (83 − 90)²/83 + (468 − 475)²/468 + (332 − 325)²/332 = 1.26

A contingency table χ² test with n rows and m columns has (n − 1)·(m − 1) degrees of freedom. The 2×2 table has 1 degree of freedom.

Predicted number of flies with allele A and curled wings:

P(A, curled) = 585 · 200 / 1000 = 117
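The same table can be analysed with scipy's `chi2_contingency` (a sketch, not from the slides); `correction=False` switches off the Yates correction so the result matches the hand computation:

```python
# Chi-square test of the Drosophila 2x2 contingency table.
from scipy.stats import chi2_contingency

table = [[110, 475],   # curled: allele A, allele B
         [90, 325]]    # normal: allele A, allele B

chi2, p, df, expected = chi2_contingency(table, correction=False)
print(chi2, p, df)   # chi2 ≈ 1.26, df = 1
print(expected)      # predicted counts: [[117, 468], [83, 332]]
```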
Bivariate comparisons of means

F-test for comparing two variances:

F = s₁² / s₂²

Student's t-test for equal sample sizes and similar variances. The variance of the difference of two means is the sum of the squared standard errors, so

t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂)

with df = n₁ + n₂ − 2 degrees of freedom. For equal sample sizes n₁ = n₂ = n this simplifies to

t = (x̄₁ − x̄₂)·√n / √(s₁² + s₂²)

Welch t-test for unequal variances and sample sizes: the same t, but with

df = (s₁²/n₁ + s₂²/n₂)² / [(s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1)]
In a physiological experiment mean metabolism rates were measured. The first treatment gave mean = 100, variance = 45; the second treatment gave mean = 120, variance = 55. In the first case 30 animals and in the second case 50 animals were tested. Do means and variances differ?

Degrees of freedom: N₁ + N₂ − 2 = 78. The t value gives the probability level for the null hypothesis:

t = (x̄₁ − x̄₂) / √(s₁²/n₁ + s₂²/n₂) = (120 − 100) / √(45/30 + 55/50) = 12.4
The comparison of variances

F = s₁² / s₂²

F(50; 30) = 55/45 = 1.22

Degrees of freedom: N − 1. The F value gives the probability for the null hypothesis of no difference, H0.

One-sided test: 1 − 0.287 = 0.713 is the probability that the first variance (55) is larger than the second (45).

Two-sided test: 2 · 0.287 = 0.574. PAST gives the probability for a two-sided test that one variance is either larger or smaller than the second.
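Both comparisons can be reproduced from the summary statistics alone (a sketch, not from the slides), using scipy's `ttest_ind_from_stats` and its F-distribution:

```python
# Metabolism example: Welch t-test from summary statistics, plus the F-test.
from math import sqrt
from scipy.stats import ttest_ind_from_stats, f

# Welch t-test: t = (120 - 100) / sqrt(45/30 + 55/50) = 12.4
t, p = ttest_ind_from_stats(mean1=100, std1=sqrt(45), nobs1=30,
                            mean2=120, std2=sqrt(55), nobs2=50,
                            equal_var=False)
print(abs(t))            # ≈ 12.4: the means differ clearly

# F-test of the variances: F = 55/45 with (49, 29) degrees of freedom
F = 55 / 45
p_one_sided = f.sf(F, 49, 29)
print(F, p_one_sided)    # F ≈ 1.22; variances do not differ
```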
Power analysis

Solving the t-test formula for the sample size N gives the number of replicates needed to detect a given difference between means (the effect size):

N = t² · (s₁² + s₂²) / (x̄₁ − x̄₂)²

Effect size: in an experiment you estimated two means,

x̄₁ = 180; s₁ = 50        x̄₂ = 150; s₂ = 20

Each time you took 20 replicates. Was this sample size large enough to confirm differences between both means? We use the t-distribution with 19 degrees of freedom (t₀.₀₅ ≈ 2.09):

N = 2.09² · (50² + 20²) / (180 − 150)² ≈ 14.1

You needed 15 replicates to confirm a difference at the 5% error level.

The t-test can be used to estimate the number of observations needed to detect a significant signal for a given effect size.
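The calculation above can be automated by iterating, since the critical t itself depends on the degrees of freedom and hence on N (a sketch, not from the slides; `required_n` is a hypothetical helper name):

```python
# Iterate N = t^2 * (s1^2 + s2^2) / (x1 - x2)^2, updating the critical
# t value with the current degrees of freedom, until N converges.
from math import ceil
from scipy.stats import t

def required_n(x1, s1, x2, s2, alpha=0.05, n_start=20):
    n = n_start
    for _ in range(100):
        t_crit = t.ppf(1 - alpha / 2, df=n - 1)  # two-sided critical value
        n_new = ceil(t_crit**2 * (s1**2 + s2**2) / (x1 - x2)**2)
        if n_new == n:
            return n
        n = n_new
    return n

print(required_n(180, 50, 150, 20))  # 15 replicates, as in the slide
```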
From a physiological experiment we want to test whether a certain medicament enhances short-term memory. How many persons should you test (with and without the treatment) to confirm a difference in memory of about 5%?

We don't know the variances and assume a Poisson random sample, hence s² = x̄. With means x̄₁ = 1.05 and x̄₂ = 1.00:

N = t² · (1.05 + 1.00) / 0.05² = 820 · t²

We don't know the degrees of freedom. We use a large number and get t = 1.96:

N = 820 · (1.96)² ≈ 3150
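The arithmetic is quickly checked (a sketch, not from the slides), assuming Poisson variances equal to the means:

```python
# Sample-size estimate for the memory experiment: variances assumed
# equal to the means (1.05 and 1.00), critical t taken as 1.96.
t_crit = 1.96
n = t_crit**2 * (1.05 + 1.00) / 0.05**2
print(round(n))  # 3150
```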
Correlation and linear regression

y = ax + b

Minimize the sum of squared deviations:

D = Σ_{i=1}^{n} (yᵢ − ŷᵢ)² = Σ_{i=1}^{n} [yᵢ − (axᵢ + b)]²

∂D/∂a = −2 Σ_{i=1}^{n} xᵢ(yᵢ − axᵢ − b) = 0
∂D/∂b = −2 Σ_{i=1}^{n} (yᵢ − axᵢ − b) = 0

a = (Σ xᵢyᵢ − n·x̄·ȳ) / (Σ xᵢ² − n·x̄²)

b = ȳ − a·x̄

The regression line passes through the centroid (x̄, ȳ): y − ȳ = a(x − x̄).

The least squares method of Carl Friedrich Gauß.
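The derived formulas can be checked numerically against numpy's least-squares fit (a sketch on made-up data, not from the slides):

```python
# Least-squares slope and intercept from the derived formulas,
# checked against numpy's polyfit.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 20, 50)
y = 0.8 * x + 2 + rng.normal(0, 1, 50)   # noisy line, made-up data

n = len(x)
a = (np.sum(x * y) - n * x.mean() * y.mean()) / (np.sum(x**2) - n * x.mean()**2)
b = y.mean() - a * x.mean()              # the line passes through (x-bar, y-bar)

a_np, b_np = np.polyfit(x, y, deg=1)
print(np.allclose([a, b], [a_np, b_np]))  # True
```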
[Scatter plot: OLRy regression of Y on X, minimizing the vertical squared deviations Δy².]
a = (Σ xᵢyᵢ − n·x̄·ȳ) / (Σ xᵢ² − n·x̄²)
  = [1/(n−1) · Σ (xᵢ − x̄)(yᵢ − ȳ)] / [1/(n−1) · Σ (xᵢ − x̄)²]
  = s_xy / s_x²

The numerator is the covariance s_xy; the denominator is the variance s_x².
Correlation coefficient:

r = s_xy / (s_x·s_y)        r² = s_xy² / (s_x²·s_y²)

Coefficient of determination:

R² = explained variance / total variance

Since a = s_xy/s_x², slope and correlation are linked by

r = a·s_x / s_y

Slope a and coefficient of correlation r are zero if the covariance is zero.

−1 ≤ r ≤ 1        0 ≤ r² ≤ 1
[Scatter plots: (left) brachypterous vs. macropterous species, y = 0.192x + 0.4671, R² = 0.1723; (right) dimorphic vs. macropterous species, y = 0.3875x + 3.7188, R² = 0.4455.]
Relationships between macropterous, dimorphic and brachypterous ground beetles on 17 Mazurian lake islands.

Brachypterous vs. macropterous: positive correlation, r = 0.41. The regression is weak: macropterous species richness explains only 17% of the variance in brachypterous species richness. We have some islands without brachypterous species. We really don't know which is the independent variable; there is no clear-cut logical connection.

Dimorphic vs. macropterous: positive correlation, r = 0.67. The regression is moderate: macropterous species richness explains only 45% of the variance in dimorphic species richness. The relationship appears to be non-linear; log-transformation is indicated (no zero counts). We really don't know which is the independent variable; there is no clear-cut logical connection.
[Scatter plots: (left) brachypterous species vs. isolation, y = −36.203x + 5.5585, R² = 0.2311; (right) brachypterous species vs. ln area, y = 0.4894x + 22.094, R² = 0.0037.]
Brachypterous species vs. isolation: negative correlation, r = −0.48. The regression is weak: island isolation explains only 23% of the variance in brachypterous species richness. We have two apparent outliers; without them the whole relationship would vanish, that is, R² ≈ 0. Outliers have to be eliminated from regression analysis. We have a clear hypothesis about the logical relationship: isolation should be the predictor of species richness.

Brachypterous species vs. ln area: no correlation, r = 0.06. The regression slope is nearly zero: area explains less than 1% of the variance in brachypterous species richness. We have a clear hypothesis about the logical relationship: area should be the predictor of species richness.
Model I regression

[Scatter plots: OLRy regression minimizes the vertical squared deviations Δy²; OLRx regression minimizes the horizontal squared deviations Δx².]

a_OLRy = s_xy / s_x²        a_OLRx = s_y² / s_xy

a_OLRy · a_OLRx = (s_xy/s_x²) · (s_y²/s_xy) = s_y² / s_x²
What distance to minimize?
Model II regression

Reduced major axis regression is the geometric average of a_OLRy and a_OLRx:

a_RMA² = a_OLRy · a_OLRx = s_y² / s_x²

a_RMA = ± s_y / s_x = a_OLRy / r

Since |r| ≤ 1, the RMA slope is always at least as steep as the OLRy slope: |a_RMA| ≥ |a_OLRy|.
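These identities are easy to verify numerically (a sketch on made-up data, not from the slides):

```python
# Reduced major axis slope as sy/sx, equal to the geometric mean of the
# two OLS slopes and to a_OLRy / r.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 20, 40)
y = 0.6 * x + rng.normal(0, 2, 40)       # made-up data

sx, sy = x.std(ddof=1), y.std(ddof=1)
sxy = np.cov(x, y, ddof=1)[0, 1]
r = sxy / (sx * sy)

a_olry = sxy / sx**2                     # OLRy slope (y on x)
a_olrx = sy**2 / sxy                     # OLRx slope, expressed in the y-x plane
a_rma = np.sign(r) * sy / sx             # reduced major axis slope

print(np.isclose(a_rma**2, a_olry * a_olrx))  # True
print(np.isclose(a_rma, a_olry / r))          # True
```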
[Scatter plot: the OLRy, OLRx and RMA regression lines through the same data.]
PAST standard output of linear regression (reduced major axis): parameters and standard errors, and the parametric probability for r = 0.
t = r·√(n − 2) / √(1 − r²)        df = n − 2

F = t² = (n − 2)·r² / (1 − r²)
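Applied to the island data (observed r = 0.415, n = 17), this parametric test can be sketched as follows (not from the slides):

```python
# Parametric test of r = 0: t = r * sqrt(n - 2) / sqrt(1 - r^2).
from math import sqrt
from scipy.stats import t as t_dist

r, n = 0.415, 17
t_val = r * sqrt(n - 2) / sqrt(1 - r**2)
p = 2 * t_dist.sf(t_val, df=n - 2)   # two-sided probability of H0: r = 0

print(t_val, p)  # t ≈ 1.77, p ≈ 0.10: not significant
```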
We don't have a clear hypothesis about the causal relationships. In this case RMA is indicated.

Permutation test for statistical significance

Both tests indicate that Brach and Macro are not significantly correlated. The RMA regression slope is not significant.
[Spreadsheet: the observed Macro and Brach counts for the 17 islands, alongside five of the 1000 random permutations of Macro, each with its coefficient of correlation.]

Macro: 7, 12, 13, 18, 10, 14, 7, 22, 9, 7, 15, 13, 8, 10, 8, 14, 6
Brach: 4, 6, 3, 4, 1, 4, 2, 5, 1, 0, 0, 0, 1, 4, 2, 6, 2

N = 1000; observed r = 0.415; mean randomized r = 0.061
Lower CL = −0.538; Upper CL = 0.768
Permutation test for statistical significance
Randomize x or y 1000 times. Calculate r each time. Plot the statistical distribution and calculate the lower and upper confidence limits.
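The procedure can be sketched in Python (not from the slides), using the observed island data:

```python
# Permutation test: shuffle one variable 1000 times and take the
# 25th and 975th ranked r values as the null 95% interval.
import numpy as np

macro = np.array([7, 12, 13, 18, 10, 14, 7, 22, 9, 7, 15, 13, 8, 10, 8, 14, 6])
brach = np.array([4, 6, 3, 4, 1, 4, 2, 5, 1, 0, 0, 0, 1, 4, 2, 6, 2])

r_obs = np.corrcoef(macro, brach)[0, 1]          # ≈ 0.415

rng = np.random.default_rng(0)
r_rand = np.sort([np.corrcoef(rng.permutation(macro), brach)[0, 1]
                  for _ in range(1000)])
lower, upper = r_rand[24], r_rand[974]           # rank positions 25 and 975

print(r_obs)
print(lower, upper)  # null 95% interval; contains 0
```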
[Histogram of the 1000 randomized coefficients of correlation, with the lower and upper confidence limits and the observed r marked.]

Calculating confidence limits

Rank all 1000 coefficients of correlation and take the values at rank positions 25 and 975 (2.5% of N = 1000 gives 25 values in each tail).
The RMA regression has a much steeper slope. This slope is often intuitively better. The coefficient of correlation is independent of the regression method.

The 95% confidence limits of the regression slope mark the 95% probability that the regression slope is within these limits. The lower CL is negative, hence the zero slope lies within the 95% CL.

In OLRy regression, insignificance of the slope also means insignificance of r and R².
[Scatter plot: a single outlier pulls the OLRy regression line toward itself.]

Outliers have a disproportionate influence on correlation and regression. Outliers should be eliminated from regression analysis. Instead of the Pearson coefficient of correlation, use Spearman's rank order correlation.
[Scatter plot of the data after ranking.]

Spearman's rank order correlation is the normal (Pearson) correlation computed on ranked data: rPearson = 0.79, rSpearman = 0.77.
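That equivalence can be checked directly (a sketch, not from the slides), here on the island beetle data:

```python
# Spearman's rho equals the Pearson correlation of the ranks.
import numpy as np
from scipy.stats import spearmanr, pearsonr, rankdata

macro = [7, 12, 13, 18, 10, 14, 7, 22, 9, 7, 15, 13, 8, 10, 8, 14, 6]
brach = [4, 6, 3, 4, 1, 4, 2, 5, 1, 0, 0, 0, 1, 4, 2, 6, 2]

rho, _ = spearmanr(macro, brach)
r_on_ranks, _ = pearsonr(rankdata(macro), rankdata(brach))

print(np.isclose(rho, r_on_ranks))  # True
```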
Home work and literature

Refresh:
• Coefficient of correlation
• Pearson correlation
• Spearman correlation
• Linear regression
• Non-linear regression
• Model I and model II regression
• RMA regression

Prepare for the next lecture:
• F-test
• F-distribution
• Variance

Literature:
Łomnicki: Statystyka dla biologów
http://statsoft.com/textbook/