Introduction to hypothesis testing
Review: Logic of Hypothesis Tests
• Usually, we test (attempt to falsify) a null hypothesis (H0):
– includes all possibilities except the prediction in the hypothesis (HA)
• If the hypothesis (HA) is that an experimental treatment has an effect:
– null hypothesis is that there is no effect
• Disproving H0 = evidence that actual hypothesis is true
Decision criterion
• How low a probability should make us reject H0?
• If probability is less than significance level (critical p-value, α), then reject H0; otherwise do not reject
• Convention sets significance level: α = 0.05 (5%)
• Arbitrary:
– other significance levels might be valid. Context specific
Three special types of Hypothesis Tests based on the t distribution
1. The mean of a distribution is different from a constant (one sample t test)
2. The mean difference in pairs of observations is different from a constant (paired t test)
3. Two distributions differ (i.e. the means from two sets of observations do not come from the same distribution of means). Two sample t test.
t statistic
General form of t statistic:
$$t = \frac{St - \theta}{SE}$$
where St is sample statistic, θ is parameter value specified in H0 and SE is standard error of sample statistic.
Specific form for population mean:
$$t = \frac{\bar{y} - \mu}{s/\sqrt{n}}$$
where μ is the value of the mean specified in H0.
Test statistics
• Sampling distributions of t, one for each sample size, when H0 true
– use degrees of freedom (df = n - 1)
• Area under each sampling (probability) distribution equals one
• Probabilities of obtaining particular ranges of t when H0 is true
Simple null hypothesis
• Test of hypothesis that population mean equals a particular value (H0: μ = μ0)
• These values may be from literature or other research or legislation
One sample t-test
Populations are fairly stable if the ratio of births to deaths is close to 1.25.
Ho: B/D ratios = 1.25
HA: B/D ratios ≠ 1.25
1) Are the B/D ratios for any of these groups different from 1.25?
2) Test using a one sample t-test
[Figure: Ourworld data. Mean(B_To_D) by Group (Europe, Islamic, NewWorld).]
One sample t-tests
Single population:
H0: μ = 0 (or any other pre-specified value: here 1.25)
df = n - 1
$$t = \frac{\bar{y} - 1.25}{s_{\bar{y}}} = \frac{\bar{y} - 1.25}{s/\sqrt{n}}$$
Results
1. Box plot
2. Normal approximation
3. Histogram
[Figure: Europe. Box plot, normal approximation and histogram of the B/D ratios (values from about 0.9 to 1.7, with a Probability axis).]
More Results
Test Mean (groups in the order labelled on the slide: Islamic, New World)

                     Islamic    New World
Hypothesized Value   1.25       1.25
Actual Estimate      3.47825    3.95091
DF                   15         20
Std Dev              1.17943    1.50949

t Test
Test Statistic       7.5570     8.1995
Prob > |t|           <.0001*    <.0001*
Prob > t             <.0001*    <.0001*
Prob < t             1.0000     1.0000
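The reported test statistics can be checked by hand from the summary values. The sketch below is only an illustration: the function name `one_sample_t` is made up for this example, and n is assumed to be DF + 1.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """Return (t, df) for a one sample t test from summary statistics."""
    se = sd / math.sqrt(n)          # standard error of the mean
    return (mean - mu0) / se, n - 1

# Summary values as reported in the two result tables (n assumed = DF + 1)
t_first, df_first = one_sample_t(3.47825, 1.17943, 16, 1.25)
t_second, df_second = one_sample_t(3.95091, 1.50949, 21, 1.25)

print(round(t_first, 3), df_first)
print(round(t_second, 3), df_second)
```

Both values should agree with the reported t Test statistics to about three decimal places.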
Even more – a way to present the results
[Figure: Births / deaths (95% CI) by group, with Ho marked.]
Two sample t-test
• Used to compare two populations, each of which has been sampled
• The simplest form of tests among multiple populations
• Example: does the average annual income differ for males and females?
– Ho: income (males) = income (females)
[Figure: Survey2 data plotted by SEX (Female, Male).]
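H0: μ1 = μ2, i.e. μ1 - μ2 = 0
- independent observations
df = (n1 - 1) + (n2 - 1) = n1 + n2 - 2
$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_{\bar{y}_1 - \bar{y}_2}} = \frac{\bar{y}_1 - \bar{y}_2}{s_{\bar{y}_1 - \bar{y}_2}}$$
Where sp = the pooled standard deviation (more later), and
$$s_{\bar{y}_1 - \bar{y}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
Calculation:
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$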
Logic of the two sample t test
Ho: μ1 = μ2
HA: μ1 > μ2
Assume
1) If Ho is true then the null distribution is known (for a set df)
2) If HA is true, we don't know the distribution but we do know that it is not the null distribution
[Figure: Probability of t (axis -5 to 9). Ho true: central t. HA true: non-central t.]
Assume: Ho: μ1 = μ2, 4 df
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
[Figure: Probability of t with Ho true (central t).]
t 0.05, 4 df = 2.13 (one tailed)
Any t > 2.13 will lead to incorrect rejection of Ho
1. This means that the difference between the sample means y1 and y2 is greater than 2.13 standard errors (pooled)
2. This will happen 5% of the time
Assume: HA: μ1 > μ2, 4 df
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
[Figure: Probability of t with HA true (non-central t).]
t 0.05, 4 df = 2.13 (one tailed)
Any t < 2.13 will lead to incorrect rejection of HA (i.e. failure to reject Ho)
1. This means that the difference between the sample means y1 and y2 is less than 2.13 standard errors (pooled)
2. The probability that this will happen is dependent on n and the true difference between μ1 and μ2
Results of example
The unequal variance t-test is based on the Satterthwaite adjustment (of degrees of freedom). It is not recommended unless the variance terms are very different and the sample sizes (n) are very different.
What is the conclusion?
[Figure: Difference in Means, and Annual Income (mean +- SE) by SEX (Female, Male).]
Paired t-tests: the logic
1. Often there is interest in comparisons of observations that can be considered "paired" within a subject or replicate
a) For example:
i. A comparison of activity level before and after eating in the same individual
ii. A comparison of longevity of males vs females, where county is the replicate
2. In such cases there is often benefit in accounting for variance that could be caused by differences among subjects (or replicates)
Paired observations: Paired t-test
H0: μd = 0
where d is difference between paired observations
$$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d/\sqrt{n_d}}$$
Where sd = standard deviation of the sample of differences, and df = n - 1 where n is number of pairs
Paired t-test – example II
• Pisaster comes in two colors along the west coast: purple and orange:
– Ho: density of purple per site = density of orange
– Individual reefs are the replicates of interest
– Looks like a no brainer
Sea star colors, all sites, two sample
[Figure: Density by COLOR (Orange, Purple).]
Results of a 2 sample test
[Figure: Density (95% CI) by color of seastars (Orange, Purple), and NUMBER by COLOR (Purple, Orange) with counts.]
Marginally significant. WHY?
GROUP  ¦ N    Mean        Standard Deviation
-------+------------------------------------
Orange ¦ 7    144.71429   101.75086
Purple ¦ 7    457.28571   353.47829

Pooled Variance
Difference in Means        : -312.57143
95.00% Confidence Interval : -615.48591 to -9.65695
t                          : -2.24827
df                         : 12.00000
p-value                    : 0.04413
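As a rough check, the pooled variance t can be rebuilt from the group summaries alone. This is a minimal sketch, not the software's own routine; `pooled_t` is a name invented here.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two sample t with a pooled standard deviation, from summary statistics."""
    ss1 = (n1 - 1) * sd1 ** 2             # sum of squares, group 1
    ss2 = (n2 - 1) * sd2 ** 2             # sum of squares, group 2
    df = n1 + n2 - 2
    sp = math.sqrt((ss1 + ss2) / df)      # pooled standard deviation
    se = sp * math.sqrt(1 / n1 + 1 / n2)  # SE of the difference in means
    return (mean1 - mean2) / se, df, se

# Orange vs Purple, using the reported group summaries
t_two, df_two, se_two = pooled_t(144.71429, 101.75086, 7,
                                 457.28571, 353.47829, 7)
print(round(t_two, 3), df_two, round(se_two, 2))
```

The p-value and confidence interval need the t distribution for 12 df, which is left out here to keep the sketch free of third-party libraries.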
Consider the variability added at the level of replicate (site)
[Figure: Density by Site (Govpt, Boat, Stair, Shell Beach, Hazards, Cayucos, PSN).]
Given that observations are paired at the level of site, can this be accounted for?
[Figure: Density by COLOR (Orange, Purple), and Density by SITE for each COLOR (Purple, Orange).]
Paired test: Details of calculation

Site          Purple   Orange   difference
Govpt         1023     306      717
Boat          585      155      430
Stair         476      143      333
PSN           233      142      91
Cayucos       107      31       76
Hazards       728      222      506
Shell Beach   49       14       35

mean     312.5714
Sediff   97.25882
t        3.21381
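The paired calculation can be reproduced from the site counts. A minimal sketch using only the standard library; the helper name `paired_t` is an assumption made for this illustration.

```python
import math
import statistics

# Counts per site, in the order Govpt, Boat, Stair, PSN, Cayucos, Hazards, Shell Beach
purple = [1023, 585, 476, 233, 107, 728, 49]
orange = [306, 155, 143, 142, 31, 222, 14]

def paired_t(a, b):
    """Return (mean difference, SE of differences, t, df) for paired samples."""
    d = [x - y for x, y in zip(a, b)]                # one difference per site
    mean_d = statistics.mean(d)
    se_d = statistics.stdev(d) / math.sqrt(len(d))   # s_d / sqrt(n_d)
    return mean_d, se_d, mean_d / se_d, len(d) - 1

mean_d, se_d, t_paired, df_paired = paired_t(purple, orange)
print(round(mean_d, 4), round(se_d, 5), round(t_paired, 5), df_paired)
```

The mean difference is the same as in the two sample test. Only the standard error changes, because variation among sites is removed by differencing within site.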
[Figure: Value by Index of Case (ORANGE, PURPLE), one line per site.]
Note slopes: are they the same?
Perhaps rates are a better comparison
1) Convert to rates or
2) Log transform
Paired test: Details of calculation: use of Log transformed data
Note slopes: much more similar
Indicates that:
1) Purples are more common
• By a constant ratio, rather than by a constant amount

Site          Purple(log)   Orange(log)   difference
Govpt         3.0098756     2.4857214     0.524154
Boat          2.7671559     2.1903317     0.576824
Stair         2.677607      2.155336      0.522271
PSN           2.3673559     2.1522883     0.215068
Cayucos       2.0293838     1.4913617     0.538022
Hazards       2.8621314     2.346353      0.515778
Shell Beach   1.6901961     1.146128      0.544068

mean     0.490884
Sediff   0.046604
t        10.53299
[Figure: Value by Index of Case (LORANGE, LPURPLE).]
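The effect of the transformation can be sketched by taking logs of the site counts before differencing. Base 10 is assumed here because log10(1023) matches the tabulated 3.0098756; the variable names are illustrative only.

```python
import math
import statistics

# Counts per site, in the order Govpt, Boat, Stair, PSN, Cayucos, Hazards, Shell Beach
purple = [1023, 585, 476, 233, 107, 728, 49]
orange = [306, 155, 143, 142, 31, 222, 14]

# Difference of logs = log of the purple:orange ratio at each site
log_diff = [math.log10(p) - math.log10(o) for p, o in zip(purple, orange)]

mean_log_d = statistics.mean(log_diff)
se_log_d = statistics.stdev(log_diff) / math.sqrt(len(log_diff))
t_log = mean_log_d / se_log_d

print(round(mean_log_d, 6), round(se_log_d, 6), round(t_log, 3))
```

Back-transforming the mean difference (10 ** mean_log_d) gives a purple:orange ratio of roughly 3.1, which is the "constant ratio" reading of the plot.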
Review – calculations of t for
• One sample test
$$t = \frac{\bar{y} - \mu}{s/\sqrt{n}}$$
• Two sample test
$$t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
• Paired test
$$t = \frac{\bar{d}}{s_d/\sqrt{n_d}}$$
Calculations of Standard Error
1) One sample t-test
$$SE = \frac{s}{\sqrt{n}}, \qquad s^2 = \frac{SS}{n - 1}$$
2) Paired t-test
$$SE = \frac{s_d}{\sqrt{n_d}}, \qquad s_d^2 = \frac{SS_d}{n_d - 1}$$
3) Two sample t-test (calculation based on pooled variance term)
$$SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad s_p^2 = \frac{SS_1 + SS_2}{(n_1 - 1) + (n_2 - 1)} = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$$
Testing statistical null hypotheses
Hypothesis construction
General Hypothesis
• A hypothesis that addresses the general question of interest
Ho: There will be no difference in the density of urchins on vertical vs horizontal surfaces
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
Specific hypotheses
• A hypothesis that represents the specific question addressed in your study. The specifics include
– Location of study
– Time period
– Replication
– Simple description of design
Specific Hypothesis
Ho: There will be no difference in the density of (species name) on vertical vs horizontal surfaces based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
HA: There will be a difference in the density of (species name) on vertical vs horizontal surfaces based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
Note: much of this can be placed in the methods section, which would remove the need to state these details in the hypotheses. However, also note that the hypotheses above are what are actually being tested.
Depiction of hypotheses
[Figure: Axis of Horizontal Density – Vertical Density of Urchins, running from - through 0 to +, with Ho at 0. The likelihood that Ho is incorrect increases in both directions away from 0.]
Ho: There will be no difference in the density of (species name) on vertical vs horizontal surfaces based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
Depiction of hypotheses: what should the units be?
• Goal
– To use same units for all assessments, irrespective of species or system
– To have same set of probabilities based on those units
– Hence, units should link to estimate of confidence
• Most common form are t-values, which provide an estimate of the difference in mean values calibrated by an estimate of error in the assessment of the mean values
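T-statistic
$$T = \frac{\bar{X}_1 - \bar{X}_2}{SE}, \qquad SE = \frac{SD}{\sqrt{N}} \quad \text{and} \quad SD = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}}$$
where SE is the standard error, SD the standard deviation and N the number of replicates.
Example data: 30, 40, 45, 37
X̄ = 38.000
SD = 6.272
SE = 3.136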
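The example values can be reproduced directly from the four observations. A minimal sketch with the standard library; the variable names are illustrative.

```python
import math
import statistics

x = [30, 40, 45, 37]            # the four replicate values on the slide

x_bar = statistics.mean(x)      # sample mean
sd = statistics.stdev(x)        # sample SD, divisor N - 1
se = sd / math.sqrt(len(x))     # standard error of the mean

print(x_bar, round(sd, 3), round(se, 3))
```

Note that `statistics.stdev` uses the N - 1 divisor, matching the SD formula for a sample.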
Depiction of hypotheses: what should the units be?
[Figure: The same axis (Horizontal Density – Vertical Density of Urchins) expressed in units of T = difference / SE, scaled from -3 to 3, with Ho at 0. The likelihood that Ho is incorrect increases in both directions away from 0.]
T-distribution (central t) is a null probability distribution
• Depicts the probability of obtaining each value of t when the null hypothesis is correct
• One use is to estimate confidence levels
Depiction of hypotheses:
[Figure: Central t distribution drawn over the T axis (-3 to 3) for Horizontal Density – Vertical Density of Urchins, with Ho at 0.]
Ho: There will be no difference in the density of urchins on vertical vs horizontal surfaces
Including error yields a confidence interval, e.g. 95% confident that the true t value is between….
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: t distribution with the central 95% CI and 2.5% in each tail; the 100% CI spans the whole distribution.]
The importance of directionality of the alternative hypothesis (HA)
Consider:
Ho: There will be no difference in the density of urchins on vertical vs horizontal surfaces
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
vs
Ho1: Urchin density on horizontal surfaces will be greater than or equal to that on vertical surfaces
HA1: Urchins will be more dense on vertical than on horizontal surfaces
[Figure: One tailed depiction of Ho1 and HA1. t distribution over T (-3 to 3) for Horizontal Density – Vertical Density of Urchins, with 5% in a single tail and a 95% CI; the 100% CI spans the whole distribution.]
One vs two tailed hypotheses
HA1: Urchins will be more dense on vertical than on horizontal surfaces
[Figure: One tailed. t distribution over T (-3 to 3) for Horizontal Density – Vertical Density of Urchins, with 5% in a single tail and a 95% CI.]
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: Two tailed. t distribution over T (-3 to 3), with 2.5% in each tail and a 95% CI.]
1. Which is more interesting?
2. Which is more informed?
3. Which is more powerful?
Example
• Replication on horizontal and vertical surfaces = 50 (100 total)
• Mean on Horizontal surfaces = 33.54
• Mean on Vertical Surfaces = 45.32
• Pooled standard deviation = 66.49
$$T = \frac{\bar{X}_h - \bar{X}_v}{SE} = \frac{33.54 - 45.32}{66.49/\sqrt{100}} = -1.79$$
One vs two tailed hypotheses
Which is more powerful?
HA1: Urchins will be more dense on vertical than on horizontal surfaces
[Figure: One tailed, 5% in a single tail and a 95% CI. T = -1.79, p = 0.04]
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: Two tailed, 2.5% in each tail and a 95% CI. T = -1.79, p = 0.08]
One vs two tailed hypotheses: Conversion to original units
HA1: Urchins will be more dense on vertical than on horizontal surfaces
[Figure: One tailed, 5% in a single tail and a 95% CI. Difference = -11.78, p = 0.04]
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: Two tailed, 2.5% in each tail and a 95% CI. Difference = -11.78, p = 0.08]
Both axes show Horizontal Density – Vertical Density of Urchins in original units: -19.5, -13.3, -6.65, 0, 6.65, 13.3, 19.5.
This is the difference between 1 and 2 tailed hypotheses. Make sure you know which you are dealing with.
• Always strive for one tailed hypotheses
• Is there a directional prediction (e.g. > or separately <)?
– One tailed
• If not
– Two tailed
Assumptions of t test
• The t test is a parametric test
• The t statistic only follows t distribution if:
– variable has normal distribution (normality assumption)
– two groups have equal population variances (homogeneity of variance assumption)
– observations are independent or specifically paired (independence assumption)
Normality assumption
• Data in each group are normally distributed
• Checks:
– Frequency distributions (be careful)
– Boxplots
– Probability plots
– formal tests for normality
• Solutions:
– Transformations
– Don't worry, run it anyway (just kidding, but not entirely)
Homogeneity of variance
• Population variances equal in 2 groups
• Checks:
– subjective comparison of sample variances
– boxplots
– F-ratio test of H0: σ1² = σ2²
• Solutions
– Transformations
– Don't worry, run it anyway (just kidding again, but again not entirely)
F-test on variances
• H0: σ1² = σ2²
• F statistic (F-ratio) = ratio of 2 sample variances
– F = s1² / s2²
– Reject H0 if F is sufficiently less than or greater than 1
• If H0 is true, F-ratio follows F distribution
• Usual logic of statistical test
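As an illustration of the F-ratio, the sketch below applies it to the sea star counts from the paired example earlier in these slides. That choice of data is an assumption made here, not part of this slide. Only the ratio is computed; the p-value would come from an F distribution with (n1 - 1, n2 - 1) df.

```python
import statistics

# Sea star counts per site (from the paired t-test example)
purple = [1023, 585, 476, 233, 107, 728, 49]
orange = [306, 155, 143, 142, 31, 222, 14]

var_purple = statistics.variance(purple)   # s1^2, divisor n - 1
var_orange = statistics.variance(orange)   # s2^2, divisor n - 1

f_ratio = var_purple / var_orange          # F = s1^2 / s2^2
df_f = (len(purple) - 1, len(orange) - 1)

print(round(f_ratio, 2), df_f)
```

A ratio this far from 1 is consistent with the earlier observation that a log transformation makes the two colors more comparable.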
Boxplot
[Figure: Boxplot of LENGTH (axis 50 to 350), labelled with the largest value, smallest value, median, and the 25% of values on each side of the median.]
[Figure: Histogram of limpet numbers per quadrat (Count).]
[Figure: Four boxplot patterns: 1. IDEAL, 2. SKEWED, 3. OUTLIERS, 4. UNEQUAL VARIANCES.]
Use of transformations to control departures from normality and homogeneity of variances assumptions
Ourworld: Variance

Group            Pop_1990   Lpop1990
Europe           441        0.17
Islamic          1378       0.30
Newworld         1042       0.34
Greatest ratio   3.12 : 1   2 : 1

[Figure: POP_1990 and LPOP1990 by GROUP (Europe, Islamic, NewWorld), with probability plots of Pop_1990 on linear and log axes.]
Nonparametric tests
• Usually based on ranks of the data
• H0: samples come from populations with identical distributions
– equal means or medians
• Don't assume particular underlying distribution of data
– normal distributions not necessary
• Equal variances and independence still required
• Typically much less powerful than parametric tests
Mann-Whitney-Wilcoxon test
• Calculates sum of ranks in 2 samples
– should be similar if H0 is true
• Compares rank sum to sampling distribution of rank sums
– distribution of rank sums when H0 true
• Equivalent to t test on data transformed to ranks
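The rank-sum step can be sketched as follows. The data are the sea star counts from the paired example earlier in these slides (an assumption made for illustration), and `rank_sums` is a hypothetical helper that gives tied values their average rank. Comparing the rank sum to its null sampling distribution is not shown.

```python
def rank_sums(a, b):
    """Sum of ranks for each sample after ranking the pooled data."""
    pooled = sorted(a + b)
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # positions i..j-1 hold the same value: ranks i+1..j, averaged
        ranks[pooled[i]] = (i + 1 + j) / 2
        i = j
    return sum(ranks[v] for v in a), sum(ranks[v] for v in b)

orange = [306, 155, 143, 142, 31, 222, 14]
purple = [1023, 585, 476, 233, 107, 728, 49]

r_orange, r_purple = rank_sums(orange, purple)
# Mann-Whitney U for the orange sample: rank sum minus n(n + 1)/2
u_orange = r_orange - len(orange) * (len(orange) + 1) / 2

print(r_orange, r_purple, u_orange)
```

With equal sample sizes, similar rank sums would be expected if H0 were true; here they are 39 and 66.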
Additional slides
A brief digression to re-sampling theory
       Number inside   Number outside
       3               10
       5               7
       2               9
       8               12
       7               8
Mean   5               9.2
Traditional evaluation would probably involve a t test: another approach is re-sampling.
Treatment   Number
Inside      3
Inside      5
Inside      2
Inside      8
Inside      7
Outside     10
Outside     7
Outside     9
Outside     12
Outside     8

1) Assume both treatments come from the same distribution
2) Resample groups of 5 observations, with replacement, but irrespective of treatment
3) Calculate mean for each group (for example, 7.6)
4) Repeat many times
5) Calculate differences between pairs of means (remember the null hypothesis is that there is no effect of treatment). This generates a distribution of differences.
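The steps above can be sketched as below. This is an illustration under stated assumptions, not the procedure used to produce the slides: the function name `resample_p`, the seed and the number of resamples are all choices made here, and the result is a one-tailed proportion.

```python
import random

inside = [3, 5, 2, 8, 7]
outside = [10, 7, 9, 12, 8]

def resample_p(a, b, n_resamples=10000, seed=1):
    """Proportion of resampled differences at least as large as the real one."""
    rng = random.Random(seed)
    pooled = a + b                      # step 1: one common distribution
    # Sums are compared instead of means to avoid floating point ties;
    # this is equivalent because both groups have the same size.
    observed = sum(b) - sum(a)
    hits = 0
    for _ in range(n_resamples):        # step 4: repeat many times
        g1 = rng.choices(pooled, k=len(a))   # step 2: with replacement
        g2 = rng.choices(pooled, k=len(b))
        if sum(g2) - sum(g1) >= observed:    # steps 3 and 5
            hits += 1
    return observed / len(a), hits / n_resamples

real_difference, p_one_tailed = resample_p(inside, outside)
print(real_difference, p_one_tailed)
```

The exact proportion depends on the seed and the number of resamples, so it will not match any single run exactly.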
Resampling
Mean 1 Mean 2 Difference
8 7.8 0.2
5.6 8.2 ‐2.6
6 9 ‐3
8 5 3
6 6 0
7 8 ‐1
6 6.8 ‐0.8
8 7.2 0.8
8 6.6 1.4
7 8.4 ‐1.4
6 5.4 0.6
7 6.4 0.6
6.4 6.8 ‐0.4
5 3.4 1.6
6.8 4.8 2
6.4 7.2 ‐0.8
7.2 8 ‐0.8
6.4 4.6 1.8
8.4 6 2.4
7.4 6.6 0.8
5.6 8.4 ‐2.8
8.2 6.2 2
7.8 8.4 ‐0.6
8.6 6.6 2
6 10.2 ‐4.2
6.8 5.6 1.2
6.4 7.8 ‐1.4
7.2 4.8 2.4
6.6 7.2 ‐0.6
7 5.2 1.8
6.6 9.8 ‐3.2
8.4 7.8 0.6
[Figure: Distribution of differences (1000 observations). Difference in Means (-10 to 10) against Proportion per Bar and Number of Observations.]
OK, now what?
Compare distribution of differences to real difference
Number inside: 3, 5, 2, 8, 7 (mean 5)
Number outside: 10, 7, 9, 12, 8 (mean 9.2)
Real difference = 4.2
Estimate likelihood that real difference comes from two similar distributions
Mean 1   Mean 2   Difference   Proportion of differences less than current
10.2     3.6      6.6          1
10       3.8      6.2          0.999
10.2     4.4      5.8          0.998
9.2      3.6      5.6          0.997
9.8      4.8      5            0.996
8.8      4.2      4.6          0.995
9.6      5.2      4.4          0.994
9.8      5.6      4.2          0.993
9.8      5.8      4            0.992
9.4      5.4      4            0.991
And on through 1000 differences
Likelihood is 0.007 (1 - 0.993) that a difference as large as the real one arises when the distributions are the same
What are constraints of this sort of approach?
T-test vs resampling
Test         P-value
Resampling   0.007
T-test       0.0093
Why the difference?
Additional examples
Worked example
• Fecundity of predatory gastropods:
– sample of 37 and 42 egg capsules of Lepsiella from littorinid zone and mussel zone respectively
• Counted number of eggs per capsule
• Null hypothesis:
– no difference between zones in mean number of eggs per capsule
• Ward & Quinn (1988), Quinn & Keough (2002) Box 3.1
• Specify H0 and choose test statistic:
H0: μM = μL, i.e. population mean numbers of eggs per capsule from both zones are equal
The t statistic is the appropriate test statistic for comparing 2 population means
• Specify a priori significance (probability) level (α):
By convention, use α = 0.05 (5%).
• Collect data, check assumptions, calculate test statistic from sample data:

              Mean    SD     n
Littorinid:   8.70    3.03   37
Mussel:       11.36   2.33   42

t = -5.39, df = 77
• Compare value of t statistic to its sampling distribution, the probability distribution of statistic (for specific df) when H0 is true
– what is probability of obtaining t value of 5.39 or greater from a t distribution with 77 df?
– what is probability of taking samples with observed or greater mean difference from 2 populations with same means?
• Probability (from JMP)
P < 0.001
• Look up in t table
P < 0.05
• If probability of obtaining this value or larger is less than α, conclude H0 is "unlikely" to be true and reject it:
– statistically significant result
• Our probability (<0.001) is less than 0.05 so reject H0:
– statistically significant result.
• If probability of obtaining this value or larger is greater than α, conclude that H0 is "likely" to be true and do not reject it:
– statistically non-significant result
Presenting results of t test
• Methods:
– An independent t test was used to compare the mean number of eggs per capsule from the two zones. Assumptions were checked with….
• Results:
– The mean number of eggs per capsule from the mussel zone was significantly greater than that from the littorinid zone (t = 5.39, df = 77, P < 0.001; see Fig. 2).