Jonathan A. Eisen
U. C. Davis
Famous Arrowhead Quotes 2006
• Publications, student degrees, etc.
• Not trying to say anything bad about anyone
• The human guts are a real milieu
• Where’s your evening gown?
• You better kiss everybody
• This is how you do metagenomics on 50 dollars, and that’s Canadian dollars
Lessons Learned at Arrowhead 2006
• Open availability of genome sequences has revolutionized microbiology
• Open Source software tools can help non-informatics specialists make use of genome data
• Open genome databases can make sense of disparate information
Lessons Not Yet Learned
• What about Open Access to publications?
• Open Access can revolutionize the use of the literature just as GenBank revolutionized the use of sequence data
Outline
• Introduction
• Three endosymbiont genomics tales
– Wolbachia
– Sharpshooter
– Calyptogena
• Where next?
Sources for the Origin of Novelty
• New biochemical functions
– Evolution of new genes
– Old genes with new functions
– Mixing and matching existing genes
• Changes in uses of existing genes
– Targeting
– Regulation
• Acquisition of new functions
– Recombination
– Lateral gene transfer
– Symbioses
Endosymbioses Drove Eukaryotic Evolution
But ….
• Key questions unresolved
– Was the pre-organelle ancestor free-living?
– Was the ancestor a mutualist? A parasite?
– What happened early in the evolution of the symbiosis?
• The problems with organelles
– Symbioses were so long ago that it is nearly impossible to figure out what the early events were
– May represent frozen accidents
• Solution?
– Study more recent and more diverse symbioses
Wolbachia pipientis wMel
• Wolbachia are obligate, maternally transmitted intracellular symbionts
• Wolbachia infect many invertebrate species
– Many cause male-specific deleterious effects
– Model system for studying sex ratio changes in hosts
– Some are mutualistic (e.g., in filarial nematodes)
• wMel selected as model system because it infects Drosophila melanogaster
Wolbachia Metagenomic Sequencing
[Figure: shotgun sequencing workflow (labels: shotgun, sequence)]
Analysis led by Martin Wu in collaboration with the lab of Scott O’Neill
Genome Completed
Wu et al., PLOS Biology 2004
[Figure: predicted metabolism and transport in wMel, annotated with WD locus tags. Labeled components: transporters (alanine/glycine, proline/betaine, glutamate/aspartate, ornithine/putrescine, Mg2+, Zn2+/Cd2+, Zn2+, Fe3+, K+, glycerol-3-phosphate/hexose-6-phosphate, phosphate, heme, drugs, unidentified metabolites), F-type ATPase, glycolytic intermediates from fructose-1,6-P2 to pyruvate, TCA cycle, non-oxidative pentose phosphate pathway, amino acid catabolism, fatty acid biosynthesis, thiamine metabolism, purine metabolism]
Mitochondrial Origin Unresolved
Wolbachia Evolutionary Rate is Accelerated
Endosymbiont Trends
• Compared to free-living relatives
– Smaller genomes
– Lower GC content
– Higher pIs
– Higher rates of sequence evolution
• Wolbachia shows ALL of these
– However ….
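The GC-content trend listed above is easy to check directly from sequence. The following is a minimal sketch, assuming plain DNA strings as input; the function name and the toy sequences in the usage note are illustrative and not taken from the talk.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Ambiguity codes such as N are ignored rather than counted as AT.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)
```

For example, `gc_content("ATGC")` gives 0.5. Comparing this value for an endosymbiont genome against a free-living relative is one way to quantify the "lower GC content" trend.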
Explanations for Endosymbiont Differences with Free-Living Relatives
• Repair hypothesis
– Loss of DNA repair genes leads to increased mutation rate
– Trends are the direct and indirect result of this increased mutation rate
• Population genetics hypothesis
– Smaller effective population size leads to more genetic drift
– Trends are mostly the result of accumulation of slightly deleterious mutations
• PopGen explanations favored
– Wolbachia has full suite of repair genes
Wolbachia Overrun by Mobile Elements
Repeat class | Size (median) | Copies | Protein motifs/families | IS family | Possible terminal inverted repeat sequence
1 | 1512 | 3 | Transposase | IS4 | 5’ ATACGCGTCAAGTTAAG 3’
2 | 360 | 12 | - | New | 5’ GGCTTTGTTGCAT CGCTA 3’
3 | 858 | 9 | Transposase | IS492/IS110 | 5’ GGCTTTGTTGCAT 3’
4 | 1404.5 | 4 | Conserved hypothetical, phage terminase | New | 5’ ATACCGCGAWTSAWTCGCGGTAT 3’
5 | 1212 | 15 | Transposase | IS3 | 5’ TGACCTTACCCAGAAAAAGTGGAGAGAAAG 3’
6 | 948 | 13 | Transposase | IS5 | 5’ AGAGGTTGTCCGGAAACAAGTAAA 3’
7 | 2405.5 | 8 | RT/maturase | - |
8 | 468 | 45 | - | - |
9 | 817 | 3 | Conserved hypothetical, transposase | ISBt12 |
10 | 238 | 2 | ExoD | - |
11 | 225 | 2 | RT/maturase | - |
12 | 1263 | 4 | Transposase | ??? |
13 | 572.5 | 2 | Transposase | ??? | None detected
14 | 433 | 2 | Ankyrin | - |
15 | 201 | 2 | - | - |
16 | 1400 | 6 | RT/maturase | - |
17 | 721 | 2 | Transposase | IS630 |
18 | 1191.5 | 2 | EF-Tu | - |
19 | 230 | 2 | Hypothetical | - |
Wu et al. 2004
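A terminal inverted repeat is a sequence at one end of an element whose reverse complement appears at the other end, so candidate repeats like those in the table can be compared with a reverse-complement helper. The sketch below is illustrative only; the function names are assumptions, and it is not the method used in Wu et al. 2004. It handles the IUPAC ambiguity codes (such as W and S) that appear in the class 4 sequence.

```python
# IUPAC complement table, including ambiguity codes.
# W (A/T) and S (C/G) are their own complements.
COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an IUPAC DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)
```

As transcribed in the table, the class 4 sequence `ATACCGCGAWTSAWTCGCGGTAT` is its own reverse complement, while the class 1 sequence `ATACGCGTCAAGTTAAG` is not.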
Wu et al. PLoS Biology 2006
Glassy Winged Sharpshooter Symbiont
• Vector for Pierce’s disease in grapes
• Potential bioterror agent
• Feeds on nutrient-poor xylem sap
• Like aphids, needs to get amino acids and other nutrients from symbionts
Sharpshooter Shotgun Sequencing
[Figure: shotgun sequencing workflow (labels: shotgun, sequence) and genome map with coordinates from 1 to 600,000]
Genome Helps Resolve Phylogeny
Higher Evolutionary Rates in Clade
Endosymbiont Trends
• Compared to free-living relatives
– Smaller genomes
– Lower GC content
– Higher pIs
– Higher rates of sequence evolution
• Baumannia shows ALL of these
Explanations for Endosymbiont Differences with Free-Living Relatives
• Repair hypothesis
• Population genetics hypothesis
• PopGen explanations favored
Variation in Evolution Rates Correlated with Repair Gene Presence
MutS MutL
+ +
+ +
+ +
+ +
_ _
_ _
Explanations for Endosymbiont Differences with Each Other
• Repair hypothesis
• Population genetics hypothesis
• Repair explanations favored
Polymorphisms in Metapopulation
• Data from ~200 hosts
– 104 SNPs
– 2 indels
• PCR surveys show that this is between host variation
• Much lower ratio of transitions:transversions than in Blochmannia
• Consistent with absence of MMR from Blochmannia
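The transition:transversion comparison above rests on a simple classification of each SNP. The following is a minimal sketch, assuming SNPs are given as (reference base, alternate base) pairs; the function names are illustrative, and the data in the usage note are invented, not the Baumannia or Blochmannia data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps) -> float:
    """Ratio of transitions to transversions for a list of (ref, alt) pairs."""
    snps = list(snps)
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv
```

With the toy input `[("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]`, two changes are transitions and two are transversions, so the ratio is 1.0.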
Baumannia Predicted Metabolism
No Amino-Acid Synthesis
???????
Binning by Phylogeny
• Identified putative genes
• Built phylogenetic trees of genes
• Examined and classified trees
Binning by Phylogeny
• Four main “phylotypes”
– Gamma proteobacteria (Baumannia)
– Arthropoda (sharpshooter)
– Bacteroidetes (Sulcia)
– Alpha-proteobacteria (Wolbachia)
Binning by Phylogeny
• Four main “phylotypes”
– Gamma proteobacteria (Baumannia)
– Arthropoda (sharpshooter)
– Bacteroidetes (Sulcia): only amino acid genes here
– Alpha-proteobacteria (Wolbachia)
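Once each gene tree has been classified into a phylotype, the per-gene calls can be rolled up to bin whole contigs. The sketch below is one hypothetical way to do that roll-up by majority vote; the function name, the `min_fraction` threshold, and the input layout are assumptions for illustration, not the procedure used in the study.

```python
from collections import Counter


def bin_contigs(gene_calls, min_fraction=0.5):
    """Assign each contig to the phylotype supported by most of its genes.

    gene_calls: iterable of (contig_id, phylotype) pairs, one per classified tree.
    A contig is left "unassigned" unless its top phylotype accounts for
    strictly more than min_fraction of its classified genes.
    """
    per_contig = {}
    for contig, phylotype in gene_calls:
        per_contig.setdefault(contig, Counter())[phylotype] += 1

    bins = {}
    for contig, counts in per_contig.items():
        top, n = counts.most_common(1)[0]
        total = sum(counts.values())
        bins[contig] = top if n / total > min_fraction else "unassigned"
    return bins
```

A contig with two genes called Baumannia and one called Sulcia would be binned as Baumannia; a contig split evenly between two phylotypes stays unassigned and would need a closer look.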
Finished 130 kb of Sulcia
Co-Symbiosis?
Metagenomics, Symbionts and Binning
• Understanding multisymbiont ecosystems is greatly aided by binning
• Multiple methods can be combined for better signal
– Assembly
– Composition
– Phylogeny
– Depth of coverage
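As a rough illustration of combining two of these signals, the sketch below places each contig in a (GC fraction, log10 depth of coverage) space and assigns it to the nearest bin centroid. Everything here is hypothetical: the function names, the choice of features, the unscaled Euclidean distance, and the idea of supplying centroids up front are assumptions made for the example, not the approach used in the work described.

```python
import math


def gc_fraction(seq: str) -> float:
    """Fraction of A/C/G/T bases that are G or C."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


def features(seq: str, depth: float):
    """Composition and coverage features for one contig (depth must be > 0)."""
    return (gc_fraction(seq), math.log10(depth))


def assign_bins(contigs, centroids):
    """Assign each contig to the nearest centroid.

    contigs:   {contig_id: (sequence, mean depth of coverage)}
    centroids: {bin_name: (gc fraction, log10 depth)}  # hypothetical seed values
    """
    assignments = {}
    for name, (seq, depth) in contigs.items():
        point = features(seq, depth)
        assignments[name] = min(
            centroids, key=lambda b: math.dist(point, centroids[b])
        )
    return assignments
```

In practice the two features would need sensible scaling, and phylogeny or assembly linkage could serve as additional evidence; the point is only that independent signals can be placed in one feature space.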
Binning in More Complex Systems?
Venter et al., 2004
What is Next I: Autotrophic Endosymbioses
Calyptogena magnifica symbionts
C. magnifica symbiont sequencing
• Collaboration between Cavanaugh Lab at Harvard (Irene Newton led analysis), Eisen lab, and JGI (Woyke and others).
• Funded by DOE through the CSP program; sequencing and closure done at JGI
• Annotation and analysis involved DOE (JGI, ORNL), Harvard, TIGR, Davis, et al.
A Streamlined Chemoautotrophic Machine
What is Next II+?
• Vertical vs. environmental transmission
– Two chemosymbioses
– P.I. C. M. Cavanaugh, Harvard
• Chordate symbionts
– Prochloron - tunicate
– P.I. Jacques Ravel, TIGR
• Epibionts
– AMH epibiont of Alvinella worm
– P.I. Craig Cary, U. Delaware
• Commensals
TIGR
Other people
Mom and Dad
H. Ochman
F. Robb
J. Battista
E. Orias
D. Bryant
S. O’Neill
M. Eisen
N. Moran
R. Myers
C. M. Cavanaugh
P. Hanawalt
J. Heidelberg
N. Ward
J. Venter
C. Fraser
S. Salzberg
I. Paulsen
$$$$$$
NSF
DOE
NIH
M. Wu
D. Wu
S. Chatterji
H. Huse
A. Hartman
Moore
VI
D. Rusch
A. Halpern
Eisen Group
J. Morgan
JGI
E. Eisenstadt
M. Frazier
T. Woyke
E. Rubin
Correlation of Endosymbiont Features
• Correlation makes it difficult to tease apart cause and effect
• Need examples where properties are decoupled
• May be the case in Baumannia with genome size
Ruthia magnifica
Sulcia Role Categories
Long Term Effects of Repair Loss
• Endosymbionts are model systems for understanding the consequences of loss of repair activities
• RecA lost in Buchnera and Blochmannia but kept in Baumannia and Wigglesworthia
• MutSL loss mentioned previously
• RecBCD present even in species without RecA
• Mfd present in many species without UvrABCD
Famous Arrowhead 2004 Quotes
• Space-time continuum of genes and genomes
• Gene sequences are the wormhole that allows one to tunnel into the past
• The human mind can conceive of things with no basis in physical reality
• Thoughts can go faster than the speed of light