12-11-06
Phylogenetics 1: An overview
“The affinities of all beings of the same class have sometimes been represented by a great tree. I believe this simile largely speaks the truth. The green budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species...and this connection of the former and present buds by ramifying branches may well represent the classification of all extinct and living species in groups subordinate to groups.” Charles Darwin, in Chapter IV of On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life.
Unrooted tree diagram drawn in the margin of one of Charles Darwin’s notebooks
Phylogenetic tree used in The Origin of Species. Darwin wasn’t just thinking about classification based on phylogenies. He used them to visualize the process of divergence within species and the splitting of populations into separate species. Darwin used this figure to illustrate divergence of variants within species; over time successively more variation accumulates. Eventually some of this variation forms the basis for new species.
Phylogenetics: The biological discipline devoted to reconstructing gene or genome phylogenies
Growth of phylogenetics:
1. Phylogenetic methods (1960s)
2. Recognition that phylogenies were relevant to nearly all disciplines of biology (1970s?)
3. Molecular biotechnology revolution [PCR] (1980s)
4. Economics of computational capacity (1990s)
[Figure: example phylogenetic tree with a 0.1 scale bar and a labelled clade]
• Analogy
• Homology
• Polarity
• Ancestral character
• Derived character
[Figure: two rooted trees of the same nine mammal genera (Felis, Canis, Ursus, Bos, Hippopotamus, Physeter, Balaenoptera, Rhinoceros, Equus), each with a 0.1 scale bar]
Tree 1: Branch lengths estimated under the assumption of the molecular clock. Tips are contemporary; the distance from the root to each tip is the same.
Tree 2: Branch lengths estimated without the assumption of the molecular clock. Tips are NOT contemporary; the distance from the root to each tip is NOT the same.
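The difference between the two trees can be checked numerically: under a molecular clock, every root-to-tip path has the same total length. The sketch below is a minimal illustration, assuming a tree stored as nested (name, branch_length, children) tuples. The tuple layout, the function names, and the branch lengths are all hypothetical; they are not read from the figure.

```python
def root_to_tip_distances(node, depth=0.0):
    """Return {tip name: summed branch length from the root to that tip}."""
    name, length, children = node
    depth += length
    if not children:              # a tip has no children
        return {name: depth}
    distances = {}
    for child in children:
        distances.update(root_to_tip_distances(child, depth))
    return distances


def is_clocklike(tree, tol=1e-9):
    """True if all root-to-tip distances agree within a small tolerance."""
    distances = root_to_tip_distances(tree).values()
    return max(distances) - min(distances) <= tol


# Hypothetical branch lengths (substitutions per site), for illustration only.
clock_tree = ("root", 0.0, [
    ("Felis", 0.3, []),
    ("", 0.1, [("Equus", 0.2, []), ("Rhinoceros", 0.2, [])]),
])
non_clock_tree = ("root", 0.0, [
    ("Felis", 0.3, []),
    ("", 0.1, [("Equus", 0.35, []), ("Rhinoceros", 0.2, [])]),
])

clock_ok = is_clocklike(clock_tree)
non_clock_ok = is_clocklike(non_clock_tree)
```

Here clock_ok is True and non_clock_ok is False, because the Equus branch in the second tree is longer than the Rhinoceros branch.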
The phylogenetic comparative method
[Figure: scatterplot of a hypothetical dataset for a phenotype (Y) against an ecological variable (X)]
[Figure: two-point dataset from early in evolutionary history, on the same Y and X axes]
Hypothetical example:
Y: size of a primate’s big toe
X: the stubbiness of the habitat
[Figure: the hypothetical dataset (Y against X) with points coloured according to clade of origin: “little-toed” clade and “big-toed” clade]
[Figure: phylogeny of two groups of close relatives, showing the old divergence of “big-toed” and “little-toed” primates and the recent diversifications within the “big-toe clade” and the “little-toe clade”]
Species are NOT drawn independently from the same distribution.
“Phylogenies are fundamental to comparative biology; there is no doing it without taking them into account.” Joseph Felsenstein
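The non-independence problem can be made concrete with a small calculation. The sketch below uses invented (X, Y) values for two clades and compares the correlation across all species with the correlation inside each clade. The numbers and names are hypothetical, and this is only an illustration of the problem, not a phylogenetic comparative method such as independent contrasts.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical (X, Y) values for four species in each clade.
little_toed = [(1, 2), (2, 1), (1, 1), (2, 2)]
big_toed = [(5, 6), (6, 5), (5, 5), (6, 6)]


def correlation(points):
    xs, ys = zip(*points)
    return pearson(xs, ys)


r_pooled = correlation(little_toed + big_toed)   # all species treated as independent
r_little = correlation(little_toed)              # within the little-toed clade
r_big = correlation(big_toed)                    # within the big-toed clade
```

With these values the pooled correlation is about 0.94, while the correlation within each clade is 0. The apparent relationship comes entirely from the single old divergence between the two clades.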
Applications of phylogenetics
1. Systematics, classification, and taxonomy
2. Biogeography
3. Health Sciences
4. Agriculture
5. Conservation
6. Linguistics
Applications of phylogenetics: systematics
ERNST HAECKEL’S “TREE OF LIFE”, DRAWN SOMETIME IN THE LATE 1800s
Placed Menschen (“Men”) at the “top” of the tree among the Affen (“Apes”). Haeckel was first to suggest man’s ancestry was among the Great Apes. This tree was a tree of “men”, and Haeckel’s placement of Menschen at the top was intentional.
[Figure labels on the tree: non-mammalian vertebrates, invertebrates, protozoa]
This tree and associated system of classification is different from modern ones in that it is based on the notion of linear progress (like a ladder) from the most primitive single-celled organisms “upwards” to man (at the very top). Haeckel considered the things near the top as “more evolved” and things near the bottom as “primitive”.
Ernst Haeckel (1834-1919) was a German biologist and scientific illustrator. He was one of the first popularizers of Darwin’s Theory of Evolution. The tree to the left is from his book “General Morphology – founded on the descent theory”.
Monophyly, paraphyly and polyphyly
[Figure: tree with tips A, B, C, D, E and internal nodes F, G, H, J, shown twice]
First tree: paraphyletic group (AHJGFDE) and a polyphyletic group (BC)
Second tree: monophyletic group [clade]
The old Reptilia as an example of classification based on a paraphyletic group.
[Figure: amniote phylogeny. Labels: Mammals (Synapsids), with a diversity of extinct mammal-like reptiles; Anapsids (turtles and relatives); Lepidosauromorphs (lizards, snakes, etc.); Crocodylomorphs (gators and crocs); Ornithischia (some plant-eating dinosaurs), with lots of dinosaur diversity; Aves (birds)]
Old Reptilia is a GRADE
Amniota is a clade
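Whether a named group is a clade can be tested mechanically: a group is monophyletic only if some node of the tree has exactly that set of tips as its descendants. The sketch below is a minimal illustration using nested tuples. The topology is a simplified, assumed amniote tree with extant groups only, and the function names are hypothetical.

```python
def leaves(node):
    """Set of tip names below a node; a tip is a plain string."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))


def clades(node):
    """Yield the tip set of every clade (every node) in the tree."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if the group is exactly the tip set of some node."""
    return frozenset(group) in set(clades(tree))


# Simplified, assumed topology for illustration only.
amniota = ("Mammals", ("Turtles", ("Lepidosaurs", ("Crocodylians", "Birds"))))

old_reptilia = {"Turtles", "Lepidosaurs", "Crocodylians"}   # excludes birds
reptilia_with_birds = old_reptilia | {"Birds"}

old_reptilia_is_clade = is_monophyletic(amniota, old_reptilia)
with_birds_is_clade = is_monophyletic(amniota, reptilia_with_birds)
amniota_is_clade = is_monophyletic(amniota, leaves(amniota))
```

On this tree old_reptilia_is_clade is False, while the group that also includes birds and the whole of Amniota are both clades.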
I am a synapsid too!
Applications of phylogenetics: systematics
http://www.tolweb.org/tree/
Applications of phylogenetics: biogeography
WEST: low elevation and dry
EAST: high elevation and wet
Phylogeography allows one to test hypotheses such as whether geographic or environmental factors have been historically important barriers to gene flow.
Applications of phylogenetics: biogeography of mouse lemurs
Figure adapted from separate figures in A. D. Yoder (2004) In press
Phylogeographic analysis of mouse lemurs contradicts the expected east-west disjunction for Madagascar, and suggests a completely novel north-south disjunction. The observed phylogenetic tree was inferred from mitochondrial DNA gene sequences.
Applications of phylogenetics: biogeography ⇒ conservation
Applications of phylogenetics: Anne Yoder’s research group
Applications of phylogenetics: Health Sciences and HIV
HIV-1 genome:
HIV transmission in health care:
1. Patient ⇒ health care worker: well known
2. Health care worker ⇒ patient: unknown
CDC: epidemiological investigation of a dentist with infected patients in the 1990s
• only risk factor was a common dentist
• phylogenetics of HIV env gene sequences
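[Figure: phylogenetic tree of HIV sequences from the dentist, the patients, and local controls. Tip labels, in the order extracted: Dentist, Patient C, Patient A, Patient G, Patient G, Patient B, Patient E, Patient A, Dentist, Local No2, Local No3, Patient F, Local No9, Local No35, Local No3, Patient D]
Figure annotations:
• No other risk factors. All had invasive dental procedures.
• Sex partner with HIV
• Behavioral risk for HIV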
Applications of phylogenetics: agriculture
1. What was the origin of a pest or agricultural disease species?
2. How did some pest organisms evolve resistance to pesticides?
3. How did a pest species spread through agriculture?
4. Are there species that are closely related to known pests that might also cause problems?
Fusarium: an economically significant fungal crop pathogen (also relevant to health science)
Produces a powerful toxin that inhibits eukaryotic protein synthesis
Figure adapted from O’Donnell et al. (2000) PNAS, 97:7905-7910.
Genetic divergence among strains of Fusarium indicates that movement of crops among different agricultural settings must be carefully monitored to prevent introduction of “foreign strains”. Local crops are likely to be much less resistant to the “foreign” strains of Fusarium, as compared with the local strain.
Phylogenetic tree inferred from the combined sequences of six single-copy nuclear genes (7,120 bp) using maximum parsimony. Numbers above the nodes are bootstrap proportions.
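A bootstrap proportion is the fraction of resampled datasets that recover a given grouping. The sketch below is a minimal illustration of the resampling step: alignment columns are drawn with replacement, and each replicate is scored by whether two sequences form the strictly closest pair. Note the swapped-in criterion: it uses a simple Hamming-distance nearest pair, not maximum parsimony as in the figure. The sequences, strain names, and function names are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def bootstrap_support(alignment, pair, replicates=1000, seed=1):
    """Fraction of bootstrap replicates in which `pair` is the strictly closest pair."""
    rng = random.Random(seed)
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    target_key = frozenset(pair)
    hits = 0
    for _ in range(replicates):
        # Resample alignment columns with replacement.
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {name: [alignment[name][i] for i in cols] for name in names}
        # Pairwise distances on the resampled alignment.
        dists = {}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dists[frozenset((a, b))] = hamming(resampled[a], resampled[b])
        target = dists[target_key]
        if all(target < d for key, d in dists.items() if key != target_key):
            hits += 1
    return hits / replicates


# Hypothetical 10-site alignment of three strains.
alignment = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ATGAACTTGC",
}

support = bootstrap_support(alignment, ("strain_A", "strain_B"))
```

A fixed seed makes the result repeatable. In a real analysis each replicate would be passed to a tree-building method, and support would be tallied for every node of the tree.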
Fusarium graminearum is a fungal pathogen of commercially important species of grains. Phylogenetic analysis indicates substantial genetic divergence among strains in different agricultural settings.
Applications of phylogenetics: conservation
This article highlights three uses of the comparative method in conservation:
(i) developing predictive models for risk assessment
(ii) identifying the general ecological principles that cause conservation problems
(iii) identifying and using endangering traits as triage to prioritize research and conservation efforts
Potential pitfalls are:
(i) large and expensive sample sizes required for high power of the method
(ii) problems with using correlation-based methods to identify causal mechanisms
Despite the limitations, it seems that the comparative method will grow to be one of many essential tools for conservation research. A hypothetical example from this paper is presented below. It illustrates how applying Fisher’s exact test to the raw data (ignoring phylogenetic non-independence) overestimates the relationship between extinction risk and body size.
Should we use a Fisher exact test?
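The effect of ignoring non-independence can be sketched with a two-sided Fisher exact test on a 2x2 table (large or small body size against threatened or not threatened). The counts below are invented for illustration and are not taken from the paper. The second table crudely stands in for a phylogenetically informed analysis by counting assumed independent evolutionary events rather than species. All names are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))


# Hypothetical species counts: rows = large / small body size,
# columns = threatened / not threatened.
p_species = fisher_exact_two_sided(8, 2, 2, 8)

# Hypothetical counts of independent evolutionary events behind the same pattern.
p_events = fisher_exact_two_sided(2, 1, 1, 2)
```

With these counts p_species is about 0.023 and p_events is 1.0. Treating every species as an independent data point makes the association look significant, while the handful of independent events provides no evidence for it.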
Applications of phylogenetics: linguistics
Language phylogeny and divergence dates support the Anatolian-origin theory of the Indo-European language family.
Gray and Atkinson (2003) Nature 426:435-439
Data: Cognate word forms were sampled from 87 languages. Three extinct languages thought to be more distantly related than the extant languages were included for the purpose of rooting the tree. Cognates were coded as present or absent (1 or 0) for each language. The final dataset was a binary matrix of 2,449 cognates.
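The presence/absence coding can be sketched as follows. Each cognate class becomes one column of a binary matrix and each language one row. The cognate classes, language assignments, and function names below are invented for illustration; they are not taken from the published dataset.

```python
# Hypothetical cognate classes: each maps to the set of languages that have it.
cognate_sets = {
    "water_A": {"English", "German"},
    "water_B": {"French", "Spanish"},
    "hand_A": {"English", "German"},
    "hand_B": {"French", "Spanish"},
    "night_A": {"English", "German", "French", "Spanish"},
}


def build_matrix(cognate_sets):
    """Return {language: [1 or 0 per cognate class]}, in dictionary order."""
    languages = sorted(set().union(*cognate_sets.values()))
    return {
        lang: [1 if lang in members else 0 for members in cognate_sets.values()]
        for lang in languages
    }


def differences(row_a, row_b):
    """Number of cognate classes on which two languages disagree."""
    return sum(x != y for x, y in zip(row_a, row_b))


matrix = build_matrix(cognate_sets)
english_vs_german = differences(matrix["English"], matrix["German"])
english_vs_french = differences(matrix["English"], matrix["French"])
```

Here English is coded as [1, 0, 1, 0, 1], it differs from German on 0 classes, and it differs from French on 4. A matrix of this shape is the input that a model of binary character evolution would then analyse.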
Methods: Phylogenetic analysis was conducted under a stochastic model of binary character evolution that allowed for unequal character state frequencies and heterogeneous rates of evolution among cognates. Bayesian methods were used to infer the tree topology shown to the left. Values above each branch (in black) are Bayesian posterior probabilities. Divergence times were estimated by first assuming maximum and minimum divergence dates for 11 “calibration nodes” on the phylogeny. A semi-parametric, likelihood-based method was then used to infer the divergence dates for the nodes of the phylogeny.
[Figure labels: extinct languages used as outgroups; root; estimated date of ancestral node]
Applications of phylogenetics: manuscript evolution
..discover relationships between different manuscript versions of a text