Estimating the reliability of a tree
Reconstructed phylogenetic trees are almost certainly wrong. They are estimates of the true tree. But how reliable are they?
Reliability
Most of the time, 'reliability' refers to the topology, not to branch lengths.
reliability = probability that the members of a given clade are always members of that clade
Methods
Phylogeneticists use different methods to test the reliability of trees:
• Bootstrapping
• Jackknife
• Permutation tests
• Likelihood ratio tests ((a)LRT)
Bootstrapping
Bootstrapping uses random sampling with replacement to obtain properties of an estimator.
[Figure: the data are resampled 1000-10000 times; the estimates x obtained from the resamples form a frequency distribution (f against x).]
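The resampling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values, the choice of the mean as the estimator, and the function name `bootstrap_means` are all made up for the example.

```python
import random
import statistics

def bootstrap_means(sample, n_replicates=1000, seed=0):
    """Return the mean of each bootstrap replicate of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_replicates):
        # Draw n observations WITH replacement from the original sample.
        replicate = rng.choices(sample, k=n)
        means.append(statistics.mean(replicate))
    return means

# Made-up data, for illustration only.
sample = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7, 5.9, 4.4]
means = bootstrap_means(sample)

# The spread of the replicate means approximates the standard error of the mean.
print(statistics.mean(means), statistics.stdev(means))
```

The distribution of `means` plays the role of the frequency distribution of the estimator: its centre and spread describe how much the estimate would vary under resampling.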
In phylogenetic bootstrapping, the alignment is resampled.
[Figure: an original alignment of five sequences and ten columns (labelled 1-9, 0) next to a pseudo alignment built from columns 1, 4, 8, 3, 6, 0, 2, 4, 9, 5 of the original. Column 4 is drawn twice and column 7 is not drawn at all.]
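A minimal sketch of how a pseudo alignment could be built, assuming the alignment is stored as a dictionary from species name to sequence. The toy sequences and the function name `bootstrap_alignment` are hypothetical and not taken from the slides.

```python
import random

def bootstrap_alignment(alignment, seed=0):
    """Build one pseudo alignment by sampling columns with replacement.

    `alignment` maps a species name to its aligned sequence; all
    sequences are assumed to have the same length.
    """
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    # Pick n_sites column indices; the same column may be picked repeatedly.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    pseudo = {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }
    return pseudo, columns

# Made-up toy alignment: 5 species, 10 sites.
alignment = {
    "species1": "ATGGGTTTAC",
    "species2": "CCGGGACGTC",
    "species3": "ATGGACATTA",
    "species4": "ATGGCGTTTA",
    "species5": "GTGGGAATTT",
}
pseudo, columns = bootstrap_alignment(alignment)
print(columns)
for name, seq in pseudo.items():
    print(name, seq)
```

Whole columns are resampled, so every species keeps its own state at each sampled site; only the set of sites changes from one pseudo alignment to the next.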
[Figure: the original tree (tip order A B C D E F) compared with bootstrapped trees (tip order A C B D E F). Each clade of the original tree scores +1 when it also appears in a bootstrapped tree and +0 when it does not.]
[Figure: the original tree (A B C D E F) and the bootstrapped trees (A C B D E F), annotated with clade support values ranging from 0.23 to 0.95.]
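The +1/+0 counting and the resulting support values can be sketched as follows. The representation is an assumption made for the example: each tree is reduced to the set of clades it contains, and each clade is a frozenset of tip labels. The trees themselves are invented and do not reproduce the values in the figure.

```python
def clade_support(original_clades, bootstrap_trees):
    """Fraction of bootstrapped trees that contain each original clade."""
    n_trees = len(bootstrap_trees)
    support = {}
    for clade in original_clades:
        # A clade scores +1 for every bootstrapped tree that recovers it.
        hits = sum(1 for tree in bootstrap_trees if clade in tree)
        support[clade] = hits / n_trees
    return support

# Hypothetical clades of an original tree.
original_clades = [frozenset("AB"), frozenset("ABC"), frozenset("EF")]

# Hypothetical bootstrapped trees, each given as a set of clades.
bootstrap_trees = [
    {frozenset("AB"), frozenset("ABC"), frozenset("EF")},
    {frozenset("AC"), frozenset("ABC"), frozenset("EF")},
    {frozenset("AB"), frozenset("ABC"), frozenset("DE")},
    {frozenset("AC"), frozenset("ABC"), frozenset("EF")},
]

support = clade_support(original_clades, bootstrap_trees)
for clade, value in support.items():
    print("".join(sorted(clade)), value)
```

In practice the bootstrapped trees would come from re-running the tree reconstruction on each pseudo alignment; that step is outside this sketch.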
Jackknife methods
The jackknife procedure uses random sampling without replacement to obtain properties of an estimator.
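For comparison with the bootstrap, here is a sketch of jackknife resampling applied to alignment columns. Keeping half of the columns (`keep_fraction=0.5`) is an assumption chosen for the example, as are the toy sequences and the function name `jackknife_alignment`.

```python
import random

def jackknife_alignment(alignment, keep_fraction=0.5, seed=0):
    """Build one jackknife replicate by keeping a random subset of columns."""
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    n_keep = max(1, int(n_sites * keep_fraction))
    # rng.sample draws WITHOUT replacement, so no column appears twice.
    columns = sorted(rng.sample(range(n_sites), n_keep))
    replicate = {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }
    return replicate, columns

# Made-up toy alignment: 5 species, 10 sites.
alignment = {
    "species1": "ATGGGTTTAC",
    "species2": "CCGGGACGTC",
    "species3": "ATGGACATTA",
    "species4": "ATGGCGTTTA",
    "species5": "GTGGGAATTT",
}
replicate, columns = jackknife_alignment(alignment)
print(columns)
```

Unlike a bootstrap pseudo alignment, a jackknife replicate is shorter than the original and never contains the same column twice.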
Permutation methods
Permutation tests are standard in non-parametric statistics. They reorder the data to obtain a null distribution.
[Figure: two observed groups (N = 18, x = 20 and N = 10, x = 25; Dif = 5), and the same data after reshuffling the group labels (N = 18, x = 23 and N = 10, x = 19.6; Dif = 3.4).]
[Figure: frequency (f) of the differences obtained from many permutations, with the 5% largest and the 5% smallest differences marked and the actual difference indicated.]
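A two-group permutation test of this kind could be sketched as below. The group values are made up (they are not the data behind the figure), the two-sided comparison is an assumption, and `permutation_test` is a hypothetical name.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reorder the pooled data, then split it into groups of the original sizes.
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Fraction of permuted differences at least as extreme as the actual one.
    return observed, extreme / n_permutations

# Made-up measurements for two groups.
group_a = [18, 20, 21, 19, 22, 20]
group_b = [24, 26, 25, 23, 27]
observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

The permuted differences form the null distribution; the actual difference is judged by how far into its tails it falls.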
In phylogenetics, species can be permuted within characters.
[Figure: an alignment of five species (species 1 to species 5) and ten characters (labelled 1-9, 0). The states of each character are reshuffled among the species, one character at a time, until every column of the alignment has been permuted.]
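Permuting species within characters can be sketched like this, again assuming a dictionary from species name to sequence. The toy sequences and the name `permute_within_characters` are hypothetical.

```python
import random

def permute_within_characters(alignment, seed=0):
    """Shuffle the states of every character (column) among the species."""
    rng = random.Random(seed)
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    shuffled_columns = []
    for i in range(n_sites):
        column = [alignment[name][i] for name in names]
        # Each column is shuffled independently of the others.
        rng.shuffle(column)
        shuffled_columns.append(column)
    # Reassemble one sequence per species from the shuffled columns.
    return {
        name: "".join(column[k] for column in shuffled_columns)
        for k, name in enumerate(names)
    }

# Made-up toy alignment: 5 species, 10 characters.
alignment = {
    "species1": "ATGGGTTTAC",
    "species2": "CCGGGACGTC",
    "species3": "ATGGACATTA",
    "species4": "ATGGCGTTTA",
    "species5": "GTGGGAATTT",
}
permuted = permute_within_characters(alignment)
for name, seq in permuted.items():
    print(name, seq)
```

Each column keeps its state frequencies, but the association between columns is broken, which is what removes the shared hierarchical signal from the permuted data.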
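Likelihood ratio tests
[Figure: a tree of twelve taxa (A to L) in which one internal branch joins four subtrees: X (ABCDEF), Y (GHI), W (J) and Z (KL).]
Standard likelihood ratio tests compare trees with and without the branch.
[Figure: the tree with the branch (likelihood = L1) next to the tree in which the branch is collapsed (likelihood = L0).]
test statistic for the existence of the branch = 2 * [ln L1 - ln L0]
The statistic is not itself a probability: the larger it is, the stronger the evidence that the branch exists.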
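The arithmetic of the test can be sketched as follows. The log-likelihood values are invented, and comparing the statistic with a chi-square distribution with one degree of freedom is an assumption made for illustration; because a zero-length branch lies on the boundary of the parameter space, a mixture distribution is often preferred in practice.

```python
import math

def lrt_statistic(ln_l1, ln_l0):
    """Likelihood ratio statistic 2 * (ln L1 - ln L0)."""
    return 2.0 * (ln_l1 - ln_l0)

def chi2_upper_tail_1df(x):
    """Upper-tail probability of a chi-square distribution with 1 df."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))

# Invented log-likelihoods of the tree with (L1) and without (L0) the branch.
ln_l1 = -1234.5
ln_l0 = -1237.0

statistic = lrt_statistic(ln_l1, ln_l0)
p_value = chi2_upper_tail_1df(statistic)
print(statistic, p_value)
```

A small p-value means the tree with the branch fits the data significantly better than the tree without it.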
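Approximate likelihood ratio test
The approximate likelihood ratio test (aLRT) is fast, accurate and powerful.
[Figure: three arrangements of the four subtrees X, Y, W and Z around the branch, with likelihoods L1, L2 and L3.]
approximate test statistic for the existence of the branch = 2 * [ln L1 - ln L2]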
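A minimal sketch of the aLRT arithmetic, assuming that L1 is the likelihood of the best of the three arrangements and L2 that of the second best. The slide's labels suggest this reading but do not state it explicitly. The log-likelihood values and the function name `alrt_statistic` are invented.

```python
def alrt_statistic(log_likelihoods):
    """Return 2 * (ln L1 - ln L2) for the two best arrangements.

    `log_likelihoods` holds the log-likelihoods of the three possible
    arrangements of the subtrees around the branch.
    """
    ranked = sorted(log_likelihoods, reverse=True)
    ln_l1, ln_l2 = ranked[0], ranked[1]
    return 2.0 * (ln_l1 - ln_l2)

# Invented log-likelihoods for the three arrangements.
ln_likelihoods = [-1000.0, -1003.5, -1010.0]
statistic = alrt_statistic(ln_likelihoods)
print(statistic)  # 7.0
```

Only three likelihoods per branch are needed, which is why this test is much cheaper than re-estimating trees from thousands of resampled alignments.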