Evolution of eukaryotic genomes: remarkable conservation and massive loss of genes and introns

Eugene V. Koonin

National Center for Biotechnology Information, NIH, Bethesda, MD

“In my own subjects, genetics and molecular biology, research has become so directed toward medical problems and the needs of the pharmaceutical companies that most people do not recognize that the most challenging intellectual problem of all time, the reconstruction of our biological past, can now be tackled with some hope of success.”

Sydney Brenner, Science 282, 1411-1412 (20 Nov 1998)

Comprehensive evolutionary classification of genes from sequenced genomes

Ancient conserved eukaryotic genes

Current status of evolutionary classification of proteins from 7 complete eukaryotic genomes:

112920 proteins = 65170 in KOGs + 23436 in LSEs (lineage-specific expansions) + 24314 singletons

Tatusov et al., BMC Bioinformatics, 2003 Sep 11;4(1):41.
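
The breakdown above can be checked with a few lines of arithmetic. This is only an illustrative sketch: the counts are copied from the slide, while the variable names are made up here.

```python
# Counts copied from the slide above.
total = 112920
parts = {"KOGs": 65170, "LSEs": 23436, "singletons": 24314}

# The three classes should partition the full protein set.
assert sum(parts.values()) == total

# Share of each class, as a percentage of all proteins.
shares = {name: 100 * n / total for name, n in parts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Roughly 58% of the proteins fall into KOGs, with the remainder split almost evenly between LSEs and singletons.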

Breakdown of eukaryotic proteins into KOGs, LSEs and singletons

[Figure: stacked bars (0% to 100%) for E. cuniculi, S. cerevisiae, S. pombe, A. thaliana, C. elegans, D. melanogaster and H. sapiens; categories: Singletons, LSEs, 2-species KOGs, >3 species KOGs]

Current status of evolutionary classification of proteins from 7 complete genomes

Define a phyletic pattern
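
As a minimal sketch, a phyletic pattern can be written as a presence/absence string of one KOG across the analyzed species. The species order, the function name and the three example KOGs below are assumptions made for illustration; they are not taken from the KOG database.

```python
# Species order is an assumption (the seven genomes named on the earlier slide).
SPECIES = ["Ec", "Sc", "Sp", "At", "Ce", "Dm", "Hs"]

def phyletic_pattern(members):
    """Return a 0/1 string marking which species have at least one member."""
    present = set(members)
    return "".join("1" if sp in present else "0" for sp in SPECIES)

# Hypothetical KOG memberships, for illustration only.
kogs = {
    "KOG_a": ["Ec", "Sc", "Sp", "At", "Ce", "Dm", "Hs"],  # found in all species
    "KOG_b": ["Ce", "Dm", "Hs", "Hs"],                    # animals only, two paralogs in Hs
    "KOG_c": ["Sc", "Sp"],                                # the two yeasts only
}
patterns = {name: phyletic_pattern(m) for name, m in kogs.items()}
```

Paralogs within a species do not change the pattern; only presence or absence per species is recorded.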

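Phyletic patterns of eukaryotic KOGs

[Figure: numbers of KOGs per phyletic pattern. Categories: All; All-Ec; Animals-Fungi; Plant+fungi; Plant+animals; All animals; All fungi; Other patterns. Extracted values (assignment to categories is not recoverable from the transcript): 858, 921, 186, 188, 1421109, 271, 1947]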

Phyletic patterns of KOGs and phenotypic effect of knockouts: S. cerevisiae

[Figure: stacked bars (0% to 100%) of non-essential and essential genes for KOGs represented in 1, 2-5, 6 and 7 species. Extracted counts: 717; 497, 1004; 273, 1120; 115, 1463; 221]

Essential genes tend not to be lost during evolution

Phyletic patterns of KOGs and phenotypic effect of knockouts: C. elegans

[Figure: stacked bars (0% to 100%) of non-essential and essential genes for KOGs represented in 1, 2-5, 6 and 7 species. Extracted counts: 736; 312, 917; 154; 3602; 181, 7282; 163]

Essential genes tend not to be lost during evolution

The traditional application of the evolutionary parsimony principle:

Given the distribution of a set of binary characters in a set of species, construct the shortest tree (maximum parsimony tree)

A 10111100
B 00110111
C 00010111
D 10111010

[Figure: resulting tree; leaves as extracted: A, D, B, C]
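
The search for the shortest tree can be sketched for this four-species example by scoring each of the three possible groupings with Fitch's small-parsimony count. Fitch's algorithm is a standard choice made here for illustration; the slides do not say which program was used, and the function names are hypothetical.

```python
# Binary characters from the slide.
chars = {
    "A": "10111100",
    "B": "00110111",
    "C": "00010111",
    "D": "10111010",
}

def fitch_site(tree, site):
    """Fitch pass on a nested-tuple tree; returns (state set, change count)."""
    if isinstance(tree, str):                      # leaf
        return {chars[tree][site]}, 0
    (ls, lc), (rs, rc) = (fitch_site(child, site) for child in tree)
    common = ls & rs
    if common:                                     # children agree: no extra change
        return common, lc + rc
    return ls | rs, lc + rc + 1                    # children disagree: one change

def tree_length(tree):
    """Total number of changes over all characters."""
    n_sites = len(next(iter(chars.values())))
    return sum(fitch_site(tree, s)[1] for s in range(n_sites))

# The three possible groupings of four species.
topologies = {
    "(A,B),(C,D)": (("A", "B"), ("C", "D")),
    "(A,C),(B,D)": (("A", "C"), ("B", "D")),
    "(A,D),(B,C)": (("A", "D"), ("B", "C")),
}
lengths = {name: tree_length(t) for name, t in topologies.items()}
best = min(lengths, key=lengths.get)
```

For these data the grouping of A with D and B with C needs 6 changes, against 9 for either alternative, which matches the leaf order shown in the figure.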

However, parsimony can be used with equal ease to address the reverse task: given the distribution of a set of binary characters in a set of species AND the *true* tree topology, construct the most parsimonious scenario of evolution (which, of course, might include many more events than the overall most economical scenario).

A 10111100
B 00110111
C 00010111
D 10111010

[Figure: tree with leaves A, B, C, D. Extracted branch labels: 2 1 32; 2 2. Extracted node states: 10111010, 00010111]
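
A minimal sketch of the reverse task: the topology is fixed in advance and only the events are inferred. The topology ((A,B),(C,D)) is an assumption read off the leaf order in the figure, and the tie-break at ambiguous nodes (prefer state 0) is an arbitrary choice made here; a different tie-break changes the split between gains and losses but not the total.

```python
chars = {"A": "10111100", "B": "00110111", "C": "00010111", "D": "10111010"}
true_tree = (("A", "B"), ("C", "D"))   # assumed "true" topology

def up(tree, site):
    """Bottom-up Fitch pass: annotate every node with its candidate state set."""
    if isinstance(tree, str):
        return {"set": {chars[tree][site]}, "kids": []}
    kids = [up(child, site) for child in tree]
    a, b = kids[0]["set"], kids[1]["set"]
    return {"set": (a & b) or (a | b), "kids": kids}

def down(node, parent_state, events):
    """Top-down pass: keep the parent's state when allowed, else log one event."""
    state = parent_state if parent_state in node["set"] else min(node["set"])
    if parent_state is not None and state != parent_state:
        events.append((parent_state, state))       # ("0", "1") is a gain
    for kid in node["kids"]:
        down(kid, state, events)

def scenario(tree):
    """List of (from_state, to_state) events over all characters."""
    events = []
    for site in range(len(chars["A"])):
        down(up(tree, site), None, events)
    return events

events = scenario(true_tree)
n_events = len(events)
```

On this topology the scenario contains 9 events, more than the 6 changes needed on the shortest tree for the same data, which is the point made in parentheses on the slide.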

Maximum parsimony (Dollo) tree for eukaryotes based on the phyletic patterns of KOGs

[Figure: tree with leaves Ec, Sc, Sp, Ce, Dm, Hs, At; extracted support values: 100%, 100%]

The phylogenetic parsimony tree built on the basis of KOG phyletic patterns did not follow the species tree. However, the parsimony principle can be applied in the opposite direction: given a species tree topology, construct the most parsimonious scenario for the evolution of eukaryotic gene repertoire (mapping of gene (KOG) gain and loss events on the tree branches):

[Figure legend as extracted: 1/0, 0/1; gain, loss]
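
Mapping gains and losses onto a fixed species tree can be sketched under Dollo parsimony, the criterion named on the earlier tree slide: a gene is gained once and can then only be lost. The tree below is a hypothetical six-species topology used for illustration, not the tree from the slides, and the function names are made up here.

```python
# Hypothetical rooted species tree (nested tuples); not taken from the slides.
TREE = ((("Dm", "Hs"), "Ce"), (("Sc", "Sp"), "At"))

def leaves(node):
    """Set of species names under a node."""
    return {node} if isinstance(node, str) else leaves(node[0]) | leaves(node[1])

def count_losses(node, present):
    """Each maximal clade without the gene counts as one loss."""
    if not (leaves(node) & present):
        return 1
    if isinstance(node, str):
        return 0
    return sum(count_losses(child, present) for child in node)

def dollo(node, present):
    """Return (gain_node, losses) for one gene; `present` must be non-empty.

    The single gain is placed at the last common ancestor of all species
    that carry the gene.
    """
    if not isinstance(node, str):
        for child in node:
            if present <= leaves(child):           # all carriers sit in one child
                return dollo(child, present)
    return node, count_losses(node, present)

gain_at, n_losses = dollo(TREE, {"Hs", "Dm", "At"})
fungal_gain, fungal_losses = dollo(TREE, {"Sc", "Sp"})
```

A gene found in Hs, Dm and At is placed at the root and requires two losses (in Ce and in the Sc/Sp clade), whereas a gene found only in Sc and Sp is a single gain on the branch leading to their ancestor.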

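The most parsimonious scenario of gene loss and birth in eukaryotic evolution and ancestral gene sets

[Figure: species tree with leaves Dm, Hs, Ce, Sc, Sp, At, Ec, annotated with gene gain and gene loss counts and ancestral gene-set sizes. The assignment of values to branches is not recoverable from the transcript. Extracted values: 3491 520; 13688 162; 4503 541; -; 3711; 398 37; 1358 193; 422 -; 55; 32605361; 5000 3048; 3835; 3413; 15802; 1679299 1969202; 842 586; 267]

Legend: gene gain, gene loss

Koonin et al. 2004. Genome Biol. 5: R7.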

Exon/intron structure of eukaryotic genes

Eukaryotic nuclear, protein-coding genes usually contain multiple spliceosomal introns that are spliced out of pre-mRNAs by an RNA-protein complex, the spliceosome.

[Diagram: exon1 | GU ... intron ... AG | exon2]
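
As a toy illustration of the diagram, the sketch below removes introns from a pre-mRNA string and checks the GU...AG boundary dinucleotides. The sequence, the coordinates and the function name are invented for the example; real splice-site recognition involves far more than the two terminal dinucleotides.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) 0-based half-open intervals."""
    exons, cursor = [], 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Canonical spliceosomal introns begin with GU and end with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"non-canonical intron boundaries at {start}-{end}")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)

# Hypothetical pre-mRNA: exon1 + intron + exon2.
pre = "AUGGCC" + "GUAAGUCCUUAG" + "GAAUAA"
mature = splice(pre, [(6, 18)])
```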

Evolution of introns and the exonic structure of eukaryotic genes

• Tempo and mode of intron evolution remain poorly understood.

• When did introns invade eukaryotic genes: prior to the origin of eukaryotes (introns early), early in eukaryotic evolution, or late?

• The common ancestor of animals, plants and fungi: intron-rich or intron-poor?

• What fraction of introns is conserved over long evolutionary spans?

Origin of introns

• The "intron-early" hypothesis suggests that introns existed before the divergence of prokaryotes and eukaryotes (W. Gilbert).

• The "intron-late" hypothesis posits that introns were inserted into eukaryotic genes after this divergence (T. Cavalier-Smith, Doolittles, J. Palmer).

Loss and sliding

Gain and loss

Mechanisms of intron evolution

Three mechanisms of intron evolution have been invoked by proponents of both theories:

- intron loss

- intron gain

- intron sliding

Mechanisms of intron evolution: intron loss

Complete loss of introns: re-integration of reverse-transcribed mRNAs into the genome

Loss of one or few introns: recombination/gene conversion between cDNAs and genomic sequences (Feiber et al. 2002)

Mechanisms of intron evolution: intron gain

A common event?

Mechanisms of intron evolution

Why is our understanding of intron evolution so limited?

- Lack of information on exon/intron structure of orthologous genes

Can we use completely sequenced genomes?

- This is a great source of information but… they are not necessarily easy to work with...

Analysis of introns in completely sequenced genomes

We used sets of orthologous genes (from the KOG database) which contained a member from each of 8 eukaryotic genomes:

Human (HS)
Fly (DM)
Mosquito (AG)
Worm (CE)
Plant (Arabidopsis) (AT)
Baker’s yeast (SC)
Fission yeast (SP)
Malaria Plasmodium (PF)

Pipeline for analysis of evolution of intron-exon structure

[Flowchart boxes, in extraction order:]

KOG analysis (8 species)

Multiple alignment (MAP)

Identification of conserved blocks

Projection of introns on alignment

Extraction of intron positions from genomes
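
The projection step can be sketched as a coordinate conversion from positions in each ungapped sequence to columns of the multiple alignment. The coordinate convention, the function name and the toy alignment are assumptions made here; the slides do not describe the actual implementation.

```python
def project_introns(aligned_seq, intron_positions):
    """Map 1-based residue positions of introns to 1-based alignment columns.

    aligned_seq      -- one row of the alignment, '-' marks a gap
    intron_positions -- residue indices in the ungapped sequence at which an
                        intron sits (this convention is an assumption)
    """
    col_of_residue = [col for col, ch in enumerate(aligned_seq, start=1) if ch != "-"]
    return [col_of_residue[p - 1] for p in intron_positions]

# Toy alignment rows and intron coordinates (hypothetical).
rows = {"sp1": "MK-TAYLV", "sp2": "MKQTA-LV"}
introns = {"sp1": [3, 5], "sp2": [4]}
columns = {sp: project_introns(rows[sp], introns[sp]) for sp in rows}
```

Residue 3 of sp1 and residue 4 of sp2 both land in alignment column 4, so after projection they count as an intron at the same position in both species.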

HS …ATGTCGATCGTGCTCGTCGTACTCTCGTAC…
DM …ATGTGGATCGTGCTCGTCGTACTCTCGTAC…
CE …ATGTGGATTGTGCTCGTCGTACTCTCGTAC…
AT …ATGTTGATGGTGCTCGTCGTACTCTCGTAC…
SC …ATGTTGATTGTGCTCGTCGTACTCTCGTAC…
SP …ATGTTGATT---CTCGTCGTACTCTCGTAC…

All positions with gaps were deleted to ensure robustness of the analysis… but we also analyzed the complete alignments

Conserved introns (found in two or more species)

Non-conserved introns (one species only)

Statistical analysis: shuffling of intron positions, Monte Carlo simulation

HS …ATGTCGATCGTGCTCGTCGTACTCTCGTAC…

DM …ATGTGGATCGTGCTCGTCGTACTCTCGTAC…

CE …ATGTGGATTGTGCTCGTCGTACTCTCGTAC…

AT …ATGTTGATGGTGCTCGTCGTACTCTCGTAC…

SC …ATGTTGATTGTGCTCGTCGTACTCTCGTAC…

SP …ATGTTGATTGTCCTCGTCGTACTCTCGTAC…
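
A minimal sketch of the shuffling idea: intron positions of each species are re-drawn at random along the alignment many times, and the number of positions shared by 1, 2, 3... species is averaged over the trials to obtain an expectation. Uniform re-drawing, the function names and the toy data are assumptions made here; the actual shuffling scheme is not specified on the slide.

```python
import random
from collections import Counter

def shared_counts(intron_sets):
    """Histogram: number of species sharing a position -> number of positions."""
    per_position = Counter(pos for s in intron_sets for pos in s)
    return Counter(per_position.values())

def expected_by_shuffling(intron_sets, n_columns, n_trials=1000, seed=0):
    """Average histogram after re-drawing each species' introns uniformly."""
    rng = random.Random(seed)
    total = Counter()
    for _ in range(n_trials):
        shuffled = [set(rng.sample(range(n_columns), len(s))) for s in intron_sets]
        total.update(shared_counts(shuffled))
    return {k: v / n_trials for k, v in sorted(total.items())}

# Toy data (hypothetical): three species, 200 alignment columns.
observed_sets = [{10, 50, 120}, {10, 50, 77}, {10, 150}]
observed = shared_counts(observed_sets)
expected = expected_by_shuffling(observed_sets, n_columns=200)
```

Comparing `observed` with `expected` shows whether shared positions occur more often than chance placement would produce.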

CONSERVATION OF INTRON POSITIONS IN 8 EUKARYOTIC SPECIES

Number of species (a)                          1      2      3      4      5      6      7      8

Number of introns – total
  Observed (b)                             13406   2047    719    275    104     25      1      0
  Expected                                 21368     33      0      0      0      0      0      0
  Expected - 10%                           20083    662      8      0      0      0      0      0

Number of introns – conserved blocks
  Observed (b)                              5446   1122    411    163     74     19      1      0
  Expected                                  9982     42      0      0      0      0      0      0
  Expected - 10%                            8613    689     25      0      0      0      0      0

Number of introns – conserved blocks, ±1
  Observed                                  9808    123      2      0      0      0      0      0
  Expected                                  9834    116      0      0      0      0      0      0

Number of introns – conserved blocks, ±2
  Observed                                  9956     55      0      0      0      0      0      0
  Expected                                  9838    114      0      0      0      0      0      0

Number of introns – conserved blocks, ±3
  Observed                                  9920     70      2      0      0      0      0      0
  Expected                                  9844    111      0      0      0      0      0      0

Number of introns – conserved blocks, ±4
  Observed                                  9973     42      3      0      0      0      0      0
  Expected                                  9848    109      0      0      0      0      0      0

Conservation of intron positions among eukaryotes

           Pf         Sc         Sp         At         Ce         Dm         Ag         Hs
Pf    450/971          2         48        137         50         46         54        145
Sc          1      22/46          7          3          3          3          4          6
Sp         34          6    450/839        209         98        114        111        308
At         97          2        147  2933/5589        353        255        254       1148
Ce         33          2         63        240  1468/3465        315        312        948
Dm         32          1         72        161        179   723/1826        787        802
Ag         36          1         62        158        176        382   675/1768        771
Hs        104          3        207        787        557        433        403  3345/6930

Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Curr Biol. 2003 Sep 2;13(17):1512-7.

Example: KOG0473 – ribosomal protein L37

Intron positions 33 55 144 169 233

Pf MSRRTKKVGLTGKYGTRYGSSLRKQIKKIELMQHAKYLCTFCGKTATKRTCVGIWKCK--KCKRKVCGGAWSLTTPAAVAAKSTIIRLRKQKEEAQKS
At MTKRTKKARIVGKYGTRYGASLRKQIKKMEVSQHNKYFCEFCGKYSVKRKVVGIWGCKD--CGKVKAGGAYTMNTASAVTVRSTIRRLREQTES
Sc MAKRTKKVGITGKYGVRYGSSLRRQVKKLEIQQHARYDCSFCGKKTVKRGAAGIWTCSC--CKKTVAGGAYTVSTAAAATVRSTIRRLREMVEA
Sp MTKRTKKVGVTGKYGVRYGASLRRDVRKIEVQQHSRYQCPFCGRLTVKRTAAGIWKCSGKGCSKTLAGGAWTVTTAAATSARSTIRRLREMVEV
Ce MAKRTKKVGIVGKYGTRYGASLRKMAKKLEVAQHSRYTCSFCGKEAMKRKATGIWNCA--KCHKVVAGGAYVYGTVTAATVRSTIRRLRDLKE
Dm MAKRTKKVGIVGKYGTRYGASLRKMVKKMEITQHSKYTCSFCGKDSMKRAVVGIWSCK--RCKRTVAGGAWVYSTTAAASVRSAVRRLRETKEQ
Ag YLPKMAKRTRKVGIVGKYGTRYGASLRKMVKKMEITQHAKYTCTFCGKDAMKRSCVGIWSCK--RCNRVVAGGAWVYSTTAAASVRSAVRRLREM
Hs MAKRTKKVGIVGKYGTRYGASLRKMVKKIEISQHAKYTCSFCGKTKMKRRAVGIWHCG--SCMKTVAGGAWTYNTTSAVTVKSAIRRLKELKDQ

Alignment with mapped intron positions is converted to a matrix of intron presence/absence

      33   55  144  169  233
Pf     1    0    1    0    0
At     0    1    1    0    0
Sc     0    0    0    0    0
Sp     0    0    0    1    0
Ce     0    0    0    0    0
Dm     0    0    1    0    0
Ag     0    0    1    0    0
Hs     0    0    1    0    1
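
Once the alignment is reduced to a presence/absence matrix, conservation of each intron position is simply a column sum. The sketch below uses the KOG0473 matrix from this slide; the variable names are illustrative, and the threshold of two species follows the definition of conserved introns given on the earlier slide.

```python
positions = [33, 55, 144, 169, 233]
matrix = {
    "Pf": [1, 0, 1, 0, 0],
    "At": [0, 1, 1, 0, 0],
    "Sc": [0, 0, 0, 0, 0],
    "Sp": [0, 0, 0, 1, 0],
    "Ce": [0, 0, 0, 0, 0],
    "Dm": [0, 0, 1, 0, 0],
    "Ag": [0, 0, 1, 0, 0],
    "Hs": [0, 0, 1, 0, 1],
}

# Number of species carrying an intron at each position (column sums).
species_per_position = {
    pos: sum(row[i] for row in matrix.values()) for i, pos in enumerate(positions)
}

# "Conserved" means present in two or more species.
conserved = [pos for pos, n in species_per_position.items() if n >= 2]
```

In this KOG only position 144 is conserved, shared by five species (Pf, At, Dm, Ag, Hs); the other four positions are each found in a single species.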

Conserved intron positions - phylogenetic signal

Example: KOG0419 - TCP-1a subunit of chaperonin complex

     82  130  195  216  284  285  345  372  432  443  483  554  579  645  736  869  966 1045 1053 1169 1181 1188 1293 1362 1422 1538 1629
Pf    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
At    0    1    1    0    0    1    0    1    1    0    1    0    1    1    0    1    1    0    1    0    0    1    1    0    1    1    1
Sc    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
Sp    1    1    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
Ce    0    1    0    0    0    0    0    0    0    0    0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    1    0
Dm    0    1    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
Ag    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
Hs    0    1    0    1    0    0    1    0    0    1    0    1    0    0    1    1    0    1    0    1    0    0    0    1    0    1    0

The only intron among 684 genes conserved in 7 species (position 130)

Matrices for all analyzed genes were concatenated and employed to build a single tree - 684 KOGs, 7236 intron positions

Phylogenetic tree of crown group eukaryotes based on conservation of intron positions: parsimony

[Figure: parsimony tree with leaves Dm, Ag, Hs, Ce, Sc, Sp, At, Pf; extracted support values: 100%, 100%, 100%, 100%, 99%]

The topology of this tree is a bit unexpected...

The phylogenetic parsimony tree built on the basis of the pattern of intron conservation did not follow the species tree. However, the parsimony principle can be applied in the opposite direction: given a species tree topology, construct the most parsimonious scenario for the evolution of eukaryotic gene structure: distribution of intron gain and loss events over the tree branches

1/0

0/1

gain

loss

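Parsimonious evolutionary scenario for the most realistic topology of the eukaryotic tree

[Figure: tree with leaves Dm, Ag, Hs, Ce, Sc, Sp, At, Pf, annotated with intron loss and intron gain counts. The assignment of values to branches is not recoverable from the transcript. Extracted values: 147 156; 137 194; 1844 77; 798 735; 15 247; 197 1; 2001 46; 307 -; 87 933; 244 71; 386 27; 92 24; 835 -; 3 795; 143]

Legend: intron loss, intron gain

Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Curr Biol. 2003 Sep 2;13(17):1512-7.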

Roy SW, Fedorov A, Gilbert W. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci U S A. 2003 Jun 10;100(12):7158-62.

A. S. Kondrashov, personal communication

There seems to have been virtually no intron gain and limited intron loss during mammalian evolution

[Figure: tree with leaves Human, mouse, rat and Fish; ~100 Mya; ~100 introns lost, ~0 introns gained]

A conundrum of intron evolution:

• practically no intron gain during (at least) ~100 mln yrs of mammalian evolution

• apparent massive gain during evolution of animal phyla (e.g., chordates), ~500-700 mln yr scale

Are major transitions in eukaryotic evolution associated with bursts of intron insertion?

Correlation between gain of genes and introns

[Figure: scatter plot of gene gain (y axis, -5000 to 15000) against intron gain (x axis, 0 to 2000); R=0.96]

Correlation between the loss of genes and introns

[Figure: scatter plot of gene loss (y axis, 0 to 1000) against intron loss (x axis, 0 to 1000); R=0.93]

Koonin, 2004, Cell Cycle 3, 280

Gain/loss of genes and gain/loss of introns in conserved genes occur in parallel in eukaryotic evolution – probably a manifestation of the same, general lineage-specific trends

‘…by magnifying the power of random genetic drift, reduced population size provides a permissive environment for the proliferation of various genomic features that would otherwise be eliminated by purifying selection.’

Lynch, M., Conery, J.S. (2003) The Origins of Genome Complexity. Science 302, 1401-4.

Comparing old and new introns: gaining insight into the origin of introns

Sverdlov, Babenko, Rogozin, Koonin. Curr. Biol. (2003); Gene (2004, in press)

Distribution of old and new introns along the gene length

All genomes pooled

[Figure: numbers of old and new introns (y axis, 0 to 1600) along the gene length (x axis, 1 to 8); series: all old, all new, each with a linear fit. Extracted fit equations: y = -42.679x + 937.18; y = 6.4048x + 1308.4]

Distribution of old and new introns along the gene length

S. pombe – an intron-poor genome – nearly identical distributions of old and new introns

[Figure: numbers of old and new introns (y axis, 0 to 100) along the gene length (x axis, 1 to 8); series: sp old, sp new, each with a linear fit. Extracted fit equations: y = -5.4643x + 63.714; y = -5.9762x + 58.893]

Distribution of old and new introns along the gene length

H. sapiens – an intron-rich genome – enrichment for new introns in the 3’-region

[Figure: numbers of old and new introns (y axis, 0 to 600) along the gene length (x axis, 1 to 8); series: hs old, hs new, each with a linear fit. Extracted fit equations: y = -11.679x + 293.18; y = 7.6071x + 422.89]

A reverse-transcription based model of intron insertion – almost the same as for intron loss (Fink, 1987) but includes an error of reverse transcription

[Diagram labels as extracted: reverse transcription; duplication; TTTTTTTT; GT ... AG; AAAAAAAAA (5’ to 3’); genomic DNA; homologous recombination; new intron]

Introns seem to be preferentially lost AND inserted near the 3’-end of the coding region – could there be similar mechanisms for intron loss AND insertion?

Role of duplication in the origin of alternative exons has been demonstrated:
Kondrashov, F.A., Koonin, E.V. Hum. Molec. Genet., 2001
Letunic, I. et al., Hum. Molec. Genet., 2002

Conclusions

• Evolutionary classification of genes from sequenced genomes (orthologs and paralogs) allows us to address genome-wide evolutionary trends by applying rather straightforward adaptations of known phylogenetic approaches

• Introns invaded protein-coding genes very early in evolution of eukaryotes - prior to the origin of multicellular forms - and many of these ancient introns survive to this day

• Remarkable conservation of ancestral introns in some eukaryotic lineages, with as many as 25-30% of the introns in humans and Arabidopsis being apparently inherited from the common ancestor of animals, fungi and plants, and ~30% of Plasmodium introns conserved in the crown group. Even the earliest ancestral eukaryotes seem to have had many genes and introns.

Conclusions

• Massive gene and intron loss occurred on multiple, independent occasions during eukaryotic evolution, especially in fungi, but also in arthropods and nematodes (and probably many more lineages).

• Classification of introns by age allows one to follow the evolution of splice signals, intron sequences themselves… and might even suggest mechanisms of intron insertion

• Lineage-specific expansion of paralogous gene families is accompanied by substantial loss and even more extensive acquisition of introns

• Loss and gain of introns and genes occur in parallel, reflecting the same lineage-specific trends in genome evolution – perhaps largely dramatic changes in characteristic population sizes entailing changes in selection strength

Acknowledgments

Igor Rogozin (NCBI)
Yuri Wolf (NCBI)
Boris Mirkin (Birkbeck College, London)
Alexander Sorokin (NCBI)
Alexander Sverdlov (NCBI, now Columbia U)
Vladimir Babenko (NCBI)
Fyodor Kondrashov (NCBI, now UC Davis)
Alexei Kondrashov (NCBI)

The COG group (NCBI):

Natalie D. Fedorova, John D. Jackson, Aviva R. Jacobs, Dmitri M. Krylov, Kira S. Makarova, Raja Mazumder (1), Sergei L. Mekhedov, Anastasia N. Nikolskaya (1), B. Sridhar Rao, Sergei Smirnov, Alexander V. Sverdlov, Roman L. Tatusov, Sona Vasudevan, Jodie J. Yin, Darren A. Natale (1)

(1) Currently PIR, Georgetown University

