The adephylo package
S. Dray
Univ. Lyon 1
2015, Lausanne
SD (Univ. Lyon 1) 2015, Lausanne 1 / 24
Introduction
The ade family
analyse de données écologiques (ecological data analysis)
Introduction
A package to analyse phylogenetic signal in trait data
Reimplementation and development of ade4 functionalities
Use of the phylo (ape) and phylo4d (phylobase) classes instead of phylog
New methods (e.g., ppca) and functions
ade4 (multivariate) → adephylo ← ape, phylobase (phylogeny)
Introduction
Two ingredients
species, traits
Multivariate analysis
Summarizing data
variables
individuals
what are the relationships between the variables ?
what are the resemblances/differences between the individuals ?
Multivariate analysis
Summarizing data with multivariate methods
variables
individuals
[Figure: PCA maps (grid scale d = 2) of the individuals, labelled 1 to 30, and of the variables dfs, alt, slo, flo, pH, har, pho, nit, amm, oxy, bdo]
what are the relationships between the variables ?
what are the resemblances/differences between the individuals ?
Multivariate analysis
One table, two geometric viewpoints
[Figure: the table X seen as a cloud of n rows (individuals) plotted on the axes variable 1, ..., variable p (individuals hyperspace), and as a cloud of p columns (variables) plotted on the axes individual 1, ..., individual n (variables hyperspace)]
Multivariate methods aim to answer these two questions and seek low-dimensional hyperspaces (few axes) where the representations of individuals and variables are as close as possible to the original ones.
Multivariate analysis
Principal Component Analysis
In ade4 : dudi.pca(df)
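\[ X = \left[ \frac{x_{ij} - \bar{x}_j}{s(x_j)} \right], \qquad Q = I_p, \qquad D = \frac{1}{n} I_n \]
Maximization of :
\[ Q(a) = a^T Q^T X^T D X Q a = \| XQa \|^2_D = \mathrm{var}(XQa) \]
\[ S(k) = k^T D^T X Q X^T D k = \| X^T D k \|^2_Q = \sum_{j=1}^{p} \mathrm{cor}^2(k, x_j) \]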
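The first identity can be checked numerically. The sketch below is plain Python, not ade4 code, and runs on a made-up 4 x 2 table; the helper names `standardize`, `criterion` and `variance` are illustrative assumptions. It builds the standardized table X with Q = I_p and D = (1/n) I_n, then compares a^T X^T D X a with var(Xa) for an arbitrary unit vector a.

```python
import math


def standardize(table):
    """Centre each column and scale it to unit variance (uniform weights 1/n)."""
    n = len(table)
    columns = []
    for col in zip(*table):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        columns.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*columns)]


def criterion(X, a):
    """a^T X^T D X a with D = (1/n) I and Q = I."""
    n, p = len(X), len(a)
    # C = X^T D X is the correlation matrix of the raw table.
    C = [[sum(X[i][j] * X[i][k] for i in range(n)) / n for k in range(p)]
         for j in range(p)]
    return sum(a[j] * C[j][k] * a[k] for j in range(p) for k in range(p))


def variance(v):
    """Variance with divisor n, matching D = (1/n) I."""
    mean = sum(v) / len(v)
    return sum((x - mean) ** 2 for x in v) / len(v)


raw = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 4.0]]  # made-up data
X = standardize(raw)
a = [math.cos(0.3), math.sin(0.3)]                        # arbitrary unit vector
scores = [sum(x * a_j for x, a_j in zip(row, a)) for row in X]

lhs = criterion(X, a)
rhs = variance(scores)
print(abs(lhs - rhs) < 1e-12)
```

PCA then looks for the unit vector a that makes this quantity as large as possible; the sketch only evaluates the criterion, it does not perform the eigen-decomposition.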
Multivariate analysis
Lizards data
18 species, 8 traits :
mean.L : mean adult female length (mm)
matur.L : female length at maturity (mm)
max.L : maximum length of adult female (mm)
hatch.L : hatchling length (mm)
hatch.m : hatchling mass (g)
clutch.S : clutch size (n. eggs)
age.mat : age at maturity (months)
clutch.F : clutch frequency (n. per year)
Demo
Bauwens, D. and Díaz-Uriarte, R. 1997. Covariation of life-history traits in lacertid lizards : a comparative study. American Naturalist, 149, 91–111.
Phylogeny
Management in R
The ape and phylobase packages provide functions, methods and classes to deal with phylogenetic data
Import : read.tree
Classes for a tree : phylo (ape), phylo4 (phylobase)
Class for a tree + data : phylo4d (phylobase)
Graphic : plot
Demo
Phylogeny and traits
species
traits
Phylogenetic structures (i.e. phylogenetic autocorrelation or signal) : the values of biological traits observed in a set of taxa are not independent from their position in the phylogenetic tree.
positive : closely related taxa tend to share similar trait values
negative : strong contrasts between sister taxa
Need for mathematical representations of the phylogenetic relatedness
Phylogeny and traits
Phylogeny as a distance/similarity matrix
Function distTips computes distances. The argument method can take different values :
patristic : patristic distance, i.e. sum of branch lengths on the shortest path between two tips
nNodes : number of nodes on the shortest path between two tips
Abouheif : Abouheif’s distance
sumDD : sum of the number of direct descendants of all nodes on the shortest path between two tips
Function proxTips returns phylogenetic proximities $w_{ij}$ based on a phylogenetic distance $d_{ij}$ using $w_{ij} = 1 / d_{ij}^{a}$
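To make the patristic distance and the distance-to-proximity conversion concrete, here is a minimal sketch in plain Python. It is not the adephylo implementation: the toy tree, the `PARENT` dictionary and the function names are assumptions made for illustration only.

```python
# Toy rooted tree ((A:1,B:1)N1:2,C:3)R stored as child -> (parent, branch length).
# The tree and every name below are illustrative assumptions.
PARENT = {"A": ("N1", 1.0), "B": ("N1", 1.0), "N1": ("R", 2.0), "C": ("R", 3.0)}
TIPS = ["A", "B", "C"]


def distances_to_ancestors(tip):
    """Map the tip and each of its ancestors to their distance from the tip."""
    dist, node, out = 0.0, tip, {tip: 0.0}
    while node in PARENT:
        parent, length = PARENT[node]
        dist += length
        out[parent] = dist
        node = parent
    return out


def patristic(tip_i, tip_j):
    """Sum of branch lengths on the path joining two tips."""
    up_i = distances_to_ancestors(tip_i)
    up_j = distances_to_ancestors(tip_j)
    # Among shared ancestors, the most recent one minimises the summed distance.
    return min(up_i[node] + up_j[node] for node in up_i if node in up_j)


def proximity(dist, a=1.0):
    """Turn a distance into a proximity, w = 1 / d**a."""
    return 1.0 / dist ** a


D = {(i, j): patristic(i, j) for i in TIPS for j in TIPS if i != j}
W = {pair: proximity(d) for pair, d in D.items()}
print(D[("A", "B")], D[("A", "C")], W[("A", "B")])
```

Raising the exponent a makes the proximities drop faster with distance, so more weight goes to close relatives.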
Phylogeny and traits Measuring and testing the phylogenetic signal
Moran’s index
The n-by-1 vector $x = [x_1 \cdots x_n]^T$ contains the measurements of a quantitative trait for n species and $W = [w_{ij}]$ is the n-by-n phylogenetic proximity matrix.
\[ MC(x) = \frac{n \sum_{(i,j)} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{(i,j)} w_{ij} \, \sum_{i=1}^{n} (x_i - \bar{x})^2} \]
see moran.idx, abouheif.moran
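The formula translates directly into a few lines of code. The sketch below is plain Python written from the definition above, not the code behind moran.idx; the function name `moran_index`, the proximity matrix and the trait values are made up for illustration.

```python
def moran_index(x, w):
    """Moran's index of trait values x for a proximity matrix w (list of rows)."""
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    total_w = sum(sum(row) for row in w)
    return n * cross / (total_w * sum(d * d for d in dev))


# Four tips of a toy tree ((A,B),(C,D)): only sister tips are neighbours.
W = [[0, 1, 0, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 1, 0]]

clustered = moran_index([1.0, 1.0, 5.0, 5.0], W)    # sisters share the same value
alternating = moran_index([1.0, 5.0, 1.0, 5.0], W)  # sisters are contrasted
print(clustered, alternating)
```

With these toy weights the clustered trait reaches the positive extreme and the alternating trait the negative one, which mirrors the positive and negative phylogenetic structures described earlier.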
Phylogeny and traits Measuring and testing the phylogenetic signal
Moran’s index and Abouheif’s Cmean
Abouheif’s test of phylogenetic signal is exactly a test of Moran’s index with phylogenetic proximities defined as :
\[ w_{ij} = \frac{a_{ij}}{\sum_{j,\, i \neq j} a_{ij}} \qquad \text{with} \qquad a_{ij} = \Big( \prod_{p \in P_{ij}} f(p) \Big)^{-1} \]
where $P_{ij}$ is the set of nodes on the shortest path from tip i to tip j and $f(p)$ is the number of direct descendants of node p.
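A small worked example may help to read the product over the path. The sketch below is plain Python on a toy tree ((A,B),C) and is not the adephylo code. It assumes that the path between two tips includes their most recent common ancestor, and all names (`PARENT`, `path_nodes`, `a_ij`) are illustrative.

```python
# Toy tree ((A,B),C) stored as child -> parent; names are illustrative assumptions.
PARENT = {"A": "N1", "B": "N1", "N1": "R", "C": "R"}
TIPS = ["A", "B", "C"]


def n_children(node):
    """f(p): number of direct descendants of a node."""
    return sum(1 for parent in PARENT.values() if parent == node)


def ancestors(tip):
    """Interior nodes from the tip up to the root, nearest first."""
    out, node = [], tip
    while node in PARENT:
        node = PARENT[node]
        out.append(node)
    return out


def path_nodes(tip_i, tip_j):
    """Interior nodes on the path between two tips (common ancestor included)."""
    anc_i, anc_j = ancestors(tip_i), ancestors(tip_j)
    mrca = next(node for node in anc_i if node in anc_j)
    return anc_i[:anc_i.index(mrca) + 1] + anc_j[:anc_j.index(mrca)]


def a_ij(tip_i, tip_j):
    """Inverse of the product of f(p) over the nodes of the path."""
    product = 1
    for node in path_nodes(tip_i, tip_j):
        product *= n_children(node)
    return 1.0 / product


# Row-normalised proximities w_ij for i != j.
W = {i: {j: a_ij(i, j) / sum(a_ij(i, k) for k in TIPS if k != i)
         for j in TIPS if j != i}
     for i in TIPS}
print(a_ij("A", "B"), a_ij("A", "C"), W["A"])
```

Here A and B are separated by one binary node (a = 1/2), while A and C are separated by two binary nodes (a = 1/4), so after normalisation A gives twice as much weight to B as to C.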
Demo
Pavoine, S., Ollier, S., Pontier, D. and Chessel, D. 2008. Testing for phylogenetic signal in phenotypic traits : new matrices of phylogenetic proximities. Theoretical Population Biology, 73, 79–91.
Phylogeny and traits Describing the phylogenetic signal
Moran’s index allows testing for phylogenetic autocorrelation
Phylogenetic structure is summarized by a single number
Different stories can lead to the same value
Measuring → Describing
How is the variance of a quantitative trait decomposed along the phylogenetic tree ?
Phylogeny and traits Describing the phylogenetic signal
Phylogeny as an orthonormal basis
Tools to represent the structure of a tree. An orthonormal basis allows a simple and unique decomposition of the variance.
Dummy variables
Moran’s eigenvectors
Phylogeny and traits Describing the phylogenetic signal
Dummy variables
They define partitions of tips reflecting the topology of the tree : each node (except the root) is translated into a dummy variable having one value for each tip (1 if the tip descends from this node and 0 otherwise).
Not an orthonormal basis
Only based on the topology
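The construction, and the reason why it does not give an orthonormal basis, can be seen on a toy tree. The sketch below is plain Python, not adephylo code. It codes the interior nodes of (((A,B),C),D), and the names `PARENT`, `descends_from` and `dummies` are illustrative assumptions.

```python
# Toy tree (((A,B),C),D) stored as child -> parent; names are illustrative.
PARENT = {"A": "N1", "B": "N1", "N1": "N2", "C": "N2", "N2": "R", "D": "R"}
TIPS = ["A", "B", "C", "D"]


def descends_from(tip, node):
    """True if `node` is an ancestor of `tip`."""
    current = tip
    while current in PARENT:
        current = PARENT[current]
        if current == node:
            return True
    return False


# The root is the only node that has children but no parent.
root = (set(PARENT.values()) - set(PARENT)).pop()
interior = sorted(set(PARENT.values()) - {root})

# One 0/1 variable per interior node except the root, one value per tip.
dummies = {node: [1 if descends_from(tip, node) else 0 for tip in TIPS]
           for node in interior}

dot = sum(u * v for u, v in zip(dummies["N1"], dummies["N2"]))
print(dummies, dot)
```

Nested nodes share descendants, so their dummy variables have a non-zero scalar product: the variables describe the topology but are not orthogonal.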
Ollier, S., Chessel, D. and Couteron, P. 2005. Orthonormal Transform to Decompose the Variance of a Life-History Trait across a Phylogenetic Tree. Biometrics, 62, 471–477.
Phylogeny and traits Describing the phylogenetic signal
Moran’s eigenvectors
The eigenvectors (B) of a doubly centred matrix of phylogenetic proximities :
\[ H \left( \tfrac{1}{2} (W^T + W) \right) H \]
where $H = I_n - 1_n 1_n^T / n$
The n − 1 column-vectors of B (sorted by decreasing eigenvalue) are orthonormal variables ranging from the largest to the lowest possible phylogenetic autocorrelation as measured by Moran’s index.
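The link between eigenvalues and Moran's index can be illustrated on a toy proximity matrix. The sketch below is plain Python and only extracts the first eigenvector, using a shifted power iteration rather than a full eigen-decomposition; it is not how adephylo computes the basis. The matrix values and the names `double_centre` and `leading_eigenpair` are assumptions for illustration.

```python
import math

# Symmetric toy proximities for tips A, B, C, D of ((A,B),(C,D)):
# 1 between sister tips, 0.25 across the root (illustrative values).
W = [[0.0, 1.0, 0.25, 0.25],
     [1.0, 0.0, 0.25, 0.25],
     [0.25, 0.25, 0.0, 1.0],
     [0.25, 0.25, 1.0, 0.0]]
n = len(W)


def centre(v):
    mean = sum(v) / len(v)
    return [x - mean for x in v]


def double_centre(mat):
    """H M H with H = I - 11^T/n: subtract row and column means, add grand mean."""
    size = len(mat)
    row = [sum(r) / size for r in mat]
    col = [sum(mat[i][j] for i in range(size)) / size for j in range(size)]
    grand = sum(row) / size
    return [[mat[i][j] - row[i] - col[j] + grand for j in range(size)]
            for i in range(size)]


def leading_eigenpair(mat, iters=200):
    """Power iteration on mat + shift * I, restricted to centred vectors."""
    size = len(mat)
    shift = max(sum(abs(v) for v in r) for r in mat)  # shifted eigenvalues are >= 0
    vec = centre([float(i * i) for i in range(size)])  # arbitrary centred start
    for _ in range(iters):
        vec = [sum(mat[i][j] * vec[j] for j in range(size)) + shift * vec[i]
               for i in range(size)]
        vec = centre(vec)
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
    value = sum(vec[i] * mat[i][j] * vec[j]
                for i in range(size) for j in range(size))
    return value, vec


sym = [[0.5 * (W[i][j] + W[j][i]) for j in range(n)] for i in range(n)]
value, vec = leading_eigenpair(double_centre(sym))

# Moran's index of the eigenvector (vec is centred, so deviations equal vec).
total_w = sum(sum(row) for row in W)
moran = n * sum(W[i][j] * vec[i] * vec[j] for i in range(n) for j in range(n)) \
    / (total_w * sum(x * x for x in vec))
print(value, vec, moran)
```

The first eigenvector opposes the two clades, and its Moran's index equals the eigenvalue multiplied by n and divided by the sum of the weights, which is why sorting by eigenvalue sorts by autocorrelation.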
Phylogeny and traits Describing the phylogenetic signal
Decomposition of a trait on an orthonormal basis
The vector of squared correlations $[\mathrm{cor}^2(x, b_1), \ldots, \mathrm{cor}^2(x, b_{n-1})]$ provides a decomposition of a quantitative trait on the phylogeny.
[Figure: bar plot of r2 (axis from 0.00 to 0.30) for each Moran’s eigenvector (ticks ME 1 to ME 15)]
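Because the basis is orthonormal and centred, the squared correlations add up to 1, so they split the variance of the trait into shares. The sketch below is plain Python with a hand-built basis for four tips, standing in for the n − 1 Moran's eigenvectors; the basis, the trait values and the function name are illustrative assumptions.

```python
import math

# Hand-built orthonormal, centred basis for the four tips of ((A,B),(C,D)).
# It stands in for the n - 1 Moran's eigenvectors; values are illustrative.
s = 1 / math.sqrt(2)
BASIS = [[0.5, 0.5, -0.5, -0.5],  # contrast between the two clades
         [s, -s, 0.0, 0.0],       # contrast within (A,B)
         [0.0, 0.0, s, -s]]       # contrast within (C,D)


def squared_correlations(x, basis):
    """cor^2(x, b) for each basis vector b."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    ss = sum(d * d for d in dev)
    # Each b is centred with unit norm, so cor^2(x, b) = (b . dev)^2 / ss.
    return [sum(b_i * d for b_i, d in zip(b, dev)) ** 2 / ss for b in basis]


trait = [1.0, 2.0, 6.0, 7.0]  # made-up values, similar within each clade
r2 = squared_correlations(trait, BASIS)
print(r2, sum(r2))
```

For this made-up trait almost all the variance (25/26) sits on the first vector, that is, on the split between the two clades.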
Demo
Phylogeny and traits Describing the phylogenetic signal
Associated tests
The function orthogram provides different statistics for detecting phylogenetic signal :
The maximum squared correlation :
\[ R2Max(x) = \max(r_1^2, \ldots, r_{n-1}^2) \]
The deviation from an ordered uniform distribution (KS) :
\[ Dmax(x) = \max_{1 \le m \le n-1} \left( \sum_{i=1}^{m} r_i^2 - \frac{m}{n-1} \right) \]
The skewness (to the root or to the tips) of the variance decomposition :
\[ SkR2k(x) = \sum_{i=1}^{n-1} i \, r_i^2 \]
The average local variation :
\[ SCE(x) = \sum_{i=2}^{n-1} (r_i^2 - r_{i-1}^2)^2 \]
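The four statistics are simple functions of the ordered vector of squared correlations. The sketch below is plain Python computing the observed values only, written from the formulas above; it does not reproduce the tests carried out by orthogram, and the function name and input values are illustrative assumptions.

```python
def orthogram_statistics(r2):
    """Observed statistics for an ordered vector of squared correlations."""
    k = len(r2)  # k = n - 1 basis vectors
    cumulative, running = [], 0.0
    for value in r2:
        running += value
        cumulative.append(running)
    return {
        "R2Max": max(r2),
        # Cumulative sum of the first m terms minus its uniform expectation m / (n - 1).
        "Dmax": max(c - (m + 1) / k for m, c in enumerate(cumulative)),
        # Rank-weighted sum: small when the variance sits on the first vectors.
        "SkR2k": sum((i + 1) * v for i, v in enumerate(r2)),
        # Sum of squared differences between consecutive terms.
        "SCE": sum((r2[i] - r2[i - 1]) ** 2 for i in range(1, k)),
    }


stats = orthogram_statistics([0.5, 0.25, 0.25, 0.0])  # made-up decomposition
print(stats)
```

In this example half of the variance sits on the first vector, so the cumulative curve rises above the uniform line and Dmax is positive.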
Phylogeny and traits Multivariate data
From univariate to multivariate data
Phylogenetic tools are mainly adapted to univariate data → indirect approach :
summarize multivariate data by PCA
apply phylogenetic analysis on PCA scores
Not optimal as PCA identifies the main resemblances/differences between the individuals but these differences are not constrained by the phylogenetic relatedness
Demo
Phylogeny and traits Multivariate data
From PCA to phylogenetic PCA
pPCA is an extension of PCA that includes the matrix of phylogenetic proximities in the algorithm. It modifies the criterion maximized by the analysis
PCA maximizes
\[ Q(a) = a^T Q^T X^T D X Q a = \mathrm{var}(XQa) \]
pPCA maximizes
\[ Q(a) = a^T Q^T X^T \tfrac{1}{2} (W^T D^T + D W) X Q a = \mathrm{var}(XQa) \cdot MC(XQa) \]
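The pPCA identity can be checked numerically in the same way as the PCA one. The sketch below is plain Python, not the ppca function. It assumes Q = I_p, D = (1/n) I_n and a row-normalised proximity matrix W (each row sums to 1, so the weights sum to n); the data, the matrix and all helper names are made up for illustration.

```python
import math


def standardize(table):
    """Centre each column and scale it to unit variance (uniform weights 1/n)."""
    n = len(table)
    columns = []
    for col in zip(*table):
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        columns.append([(v - mean) / sd for v in col])
    return [list(row) for row in zip(*columns)]


def variance(v):
    mean = sum(v) / len(v)
    return sum((x - mean) ** 2 for x in v) / len(v)


def moran_index(x, w):
    n = len(x)
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return n * cross / (sum(map(sum, w)) * sum(d * d for d in dev))


# Row-normalised, non-symmetric toy proximities (assumption: rows sum to 1).
W = [[0.0, 0.6, 0.2, 0.2],
     [0.5, 0.0, 0.25, 0.25],
     [0.25, 0.25, 0.0, 0.5],
     [0.1, 0.1, 0.8, 0.0]]
raw = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 4.0]]  # made-up traits
X = standardize(raw)
n = len(X)
a = [math.cos(0.3), math.sin(0.3)]                        # arbitrary unit vector
scores = [sum(x * a_j for x, a_j in zip(row, a)) for row in X]

# With D = (1/n) I, the matrix (1/2)(W^T D^T + D W) has entries (w_ji + w_ij) / (2n).
lhs = sum(scores[i] * (W[j][i] + W[i][j]) / (2 * n) * scores[j]
          for i in range(n) for j in range(n))
rhs = variance(scores) * moran_index(scores, W)
print(abs(lhs - rhs) < 1e-12)
```

The product form explains the behaviour of the method: an axis scores highly only if it carries a lot of variance and that variance is phylogenetically structured, positively or negatively.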
Jombart, T., Pavoine, S., Devillard, S. and Pontier, D. 2010. Putting phylogeny into the analysis of biological traits : a methodological approach. Journal of Theoretical Biology, 264(3), 693–701.
Demo
Conclusion
vignette("adephylo")