Comparative Genomics of Thermophilic Bacteria and Archaea

T. Satyanarayana et al. (eds.), Thermophilic Microbes in Environmental and Industrial Biotechnology: Biotechnology of Thermophiles, DOI 10.1007/978-94-007-5899-5_12, © Springer Science+Business Media Dordrecht 2013

Abstract Elucidation of the origin and early evolution of life is fundamental to our understanding of ancient living systems and of the ancient global environment in which early life evolved. A number of molecular phylogenetic trees have been constructed by comparing homologous gene sequences.

In this chapter, we review universal trees constructed from different types of genetic information. The tree topology differs depending on the type of gene analyzed as well as on the method used. The root of the universal tree is most likely placed between the bacterial branch and the common ancestor of Archaea and Eucarya. However, the root may instead lie within the bacterial branches.

The monophyly of Archaea is controversial. Although the rRNA tree suggested monophyly, other tree topologies have also been reported. Whether Eucarya originated within or outside the archaeal branch has not yet been conclusively resolved.

The growth temperature of the ancient organism has long interested many scientists. Theoretical studies have suggested a mesophilic, thermophilic, or hyperthermophilic origin of life, depending on the report. Experimental tests analyzing the effect of individual ancestral amino acid residues, or of combinations of them, suggested a hyperthermophilic origin of life. However, we cannot completely exclude possible artifacts arising from the method used to estimate the sequences possessed by the ancestral organisms.

1432-1 Horinouchi, Hachioji-shi, Tokyo 192-0392, Japan
e-mail: [email protected]

Chapter 12 Comparative Genomics of Thermophilic Bacteria and Archaea

Satoshi Akanuma, Shin-ichi Yokobori, and Akihiko Yamagishi


12.1 Introduction

Elucidation of the origin and early evolution of life is fundamental to our understanding of ancient living systems and of the ancient global environment in which early life evolved. Extant genes are evolutionary descendants of ancient genes. Consequently, information on the traits of ancient genes is embedded in the sequences of extant genes. A number of molecular phylogenetic trees have been constructed by comparing homologous gene sequences. However, the information used in constructing the phylogenetic trees has been limited, and the topologies of the trees largely depend on the genes analyzed. In this chapter, we first review the phylogenetic trees built in several different ways and their possible interpretations. We then discuss the nature of the last universal common ancestor predicted from such phylogenetic analyses. Finally, we introduce several studies in which ancient proteins were reconstructed by combining computational prediction with experimental resurrection of

1998).

12.2 Topology of Universal Trees

In this section, we review the points that must be considered to obtain the true universal tree, as well as the genes used to construct it.

12.2.1 Ribosomal RNA Gene Trees

used for phylogenetic analyses. All living organisms contain rRNAs, which are the main components of the ribosome, the site of protein synthesis. Although ribosomal RNA genes are often present in multiple copies, the copies are almost identical within an organism (see Hillis and Dixon 1991) [...] -ism as well as isolated ones have been extensively analyzed (Barns et al. 1994; Ward et al. 1990) [...] rRNA (gene) sequences and consequently suggested the monophyletic status of Bacteria, Archaea, and Eucarya (Woese et al. 1990).

the basal position of Archaea and Bacteria (e.g., Woese et al. 1990; Stetter 2006), suggesting the (hyper)thermophilic ancestry of Archaea and Bacteria (Woese et al. 1990; Stetter 2006 [...] 1998). However, hyperthermophilic and thermophilic

2002). Because varied nucleotide compositions among operational taxonomic units


1990; Hasegawa and Hashimoto 1993), it could be that placing hyperthermophilic and thermophilic organisms at the basal position of Archaea and of Bacteria is the


than those of other organisms (see Woese’s tree: Woese et al. 1990evolutionary rate among taxa can also cause the unreliable phylogenetic tree (e.g.,

1998group and slow-evolving taxa form another group in a phylogenetic tree. Thus, fast-

Fast-evolving taxa tend to be placed near the basal position of the phylogenetic

genes generally show faster evolutionary rates than do bacterial and archaeal genes (e.g., the tree reported in Woese et al. (1990)). Therefore, it is difficult to determine the precise phylogenetic position of Eucarya in the universal tree.

Organisms with a parasitic lifestyle often show accelerated evolutionary rates. If an evolutionary model is used in which the evolutionary pattern and rate are invariable among sites and across time, the substitution rate may be underestimated for fast-evolving sites and branches, and overestimated for invariant or slow-evolving sites and branches. Accordingly, fast-evolving taxa tend to be placed near the basal position of the tree

Nanoarchaeum equitans, a parasite of another archaeon, is the only member of Nanoarchaeota and is often placed as the basal group of Archaea (Huber et al. 2002; Waters et al. 2003). Because N. equitans has a long branch in the archaeal tree, the basal position of N. equitans

12.2.2 Protein Gene Trees

Many protein-coding genes have also been used for building universal trees. Elongation factor (EF) trees (see below for more details) suggested the monophyly of each of Archaea, Bacteria, and Eucarya (Iwabe et al. 1989; Baldauf et al. 1996). The monophyly of the three groups was also suggested by the analysis of RNA polymerase sequences (Iwabe et al. 1991) and of ribosomal protein sequence analysis

2010 2010) suggested that Archaea and Bacteria tend to show similar phylogenetic trend based on about 100 universal trees.

12.2.3 Genome Trees

Increasing number of complete genome sequences (see public databases such as gen-

information. To obtain a reliable phylogenetic tree using genome-level information, all of the genes to be analyzed must be orthologous. However, it is not easy to judge if the protein genes are orthologous or not. For example, the archaeal elongation factor 1α (aEF-1α) is generally regarded as the ortholog of eukaryotic EF-1α (eEF-1α). However, aEF-1α is functionally similar to eukaryotic release factor 3 and HBS1, which are paralogs of eEF-1α (Saito et al. 2010). Can we regard aEF-1α as the ortholog of eEF-1α? It is not easy to answer this question. We need to remember, however, that aEF-1α might be under selection pressure different from that on eEF-1α

-tributes to ribosomal recycling in the termination process of translation (Zavialov et al. 2005) [...] that show different characteristics (Suematsu et al. 2010) [...] share the same origin, evolutionary constraints (e.g., rate of substitution, invariable residues) are expected to be different if the roles of these proteins are different.

absent from the analysis. Certain bacterial species have natural transformation ability (e.g., Thermus spp., Claverys et al. 2009; Koyama et al. 1986; Hidaka et al. 1994). Horizontal gene transfer events occurred frequently during the early evolution of Bacteria and Archaea. For instance, 24% of the protein genes within the Thermotoga maritima genome are likely to be descendants of archaeal genes (Nelson et al. 1999). Horizontal gene transfer between Eucarya and Bacteria may also have occurred during the early stage of eukaryote evolution. One example involves Tunicata (or Urochordata), one of the three subphyla belonging to Chordata. Tunicates are the only multicellular animals that produce cellulose. Recent studies suggested that the common ancestor of tunicates might have acquired bacterial cellulose synthase genes (Sagane et al. 2010).

The number of genes suitable for phylogenetic analysis is limited. Only 31 protein gene families were used for the analysis by Ciccarelli et al. (2006), and most of them are protein families related to translation and transcription. Ciccarelli et al. reported an unrooted tree including Archaea, Bacteria, and Eucarya. In this tree, Bacteria can be divided into three groups: the basal group is represented by Firmicutes, including Bacilli, Clostridia, and Mycoplasmatales; the second group includes Actinobacteria and Bacteroidetes; and the third group consists of Proteobacteria, Cyanobacteria, the Deinococcus-Thermus group, and the thermophilic Thermotogales and Aquificales [...] suggests that the common ancestor of Bacteria was not (hyper)thermophilic, although Firmicutes include thermophilic species (it should be noted that the root cannot be

Harris et al. (2003) proposed a different type of tree obtained from genome data. They analyzed 80 clusters of genes conserved throughout the three domains and

A problem for the genome-based phylogenetic analyses with primary sequences is how to select the genes (regions) for analyses. Although more than 1,000 com-

the functions of several tens of percent of the predicted protein genes are not known. As mentioned above, only tens of protein genes were used for the phylogenetic analyses by Ciccarelli et al. (2006) and Harris et al. (2003).


12.2.4 Other Approaches

Other approaches for reconstructing the universal tree have also been reported. Wang et al. (2007) used protein structures to infer the relationships among living organisms. They compared the presence or absence of each protein structure (or protein structure family) among species whose complete genome sequences are known. By counting the number of protein structures conserved among species, they estimated the relationships among the species. They concluded that Eucarya and Archaea each appear as a monophyletic group and do not seem to form a single group in their (unrooted) tree.
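The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the species names and fold labels are made up, and the Jaccard distance is one common choice of measure for presence/absence data, not necessarily the one used by Wang et al. (2007).

```python
from itertools import combinations

# Hypothetical presence/absence data: each species maps to the set of
# protein structure families found in its genome (all names are made up).
profiles = {
    "species_A": {"fold1", "fold2", "fold3", "fold4"},
    "species_B": {"fold1", "fold2", "fold3"},
    "species_C": {"fold1", "fold5", "fold6"},
}


def shared_count(a, b):
    """Number of structure families present in both species."""
    return len(a & b)


def jaccard_distance(a, b):
    """1 - |shared| / |union|; 0 for identical repertoires, 1 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(profiles):
    """Distance for every unordered pair of species, keyed by sorted name pair."""
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(sorted(profiles), 2)
    }


if __name__ == "__main__":
    for pair, dist in pairwise_distances(profiles).items():
        print(pair, round(dist, 3))
```

A distance matrix of this kind could then be passed to any distance-based tree-building method; that step is omitted here.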

Another way to determine the direction of evolution is to use evolutionary events that themselves indicate a direction. The presence or absence of a retrotransposon at a certain locus has been used for phylogenetic analyses (e.g., Shimamura et al. 1997), because a retrotransposon is first transcribed and then, after reverse transcription of the RNA, inserted into a genomic position different from the original one. If two species share the same retrotransposon sequence at the same locus in their genomes, they are likely to have diverged after the retrotransposon was inserted at that position. However, this kind of phylogenetic marker is rare.
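The logic of such presence/absence markers can be illustrated with a small Python sketch. The loci and species names are hypothetical, and the nested-or-disjoint compatibility check is a generic way to test whether several shared insertions fit on a single tree; it is offered as an assumption-laden illustration, not as the procedure of Shimamura et al. (1997).

```python
from itertools import combinations

# Hypothetical data: for each locus, the set of species that carry the
# retrotransposon insertion at that locus (all names are made up).
insertions = {
    "locus_1": {"sp_A", "sp_B"},
    "locus_2": {"sp_A", "sp_B", "sp_C"},
    "locus_3": {"sp_D", "sp_E"},
}


def compatible(clade1, clade2):
    """Two clades fit on one tree if they are nested or share no species."""
    return clade1 <= clade2 or clade2 <= clade1 or not (clade1 & clade2)


def supported_clades(insertions):
    """Map each group of species sharing an insertion to the supporting loci."""
    clades = {}
    for locus, carriers in insertions.items():
        if len(carriers) >= 2:  # an insertion in a single species groups nothing
            clades.setdefault(frozenset(carriers), []).append(locus)
    return clades


def conflicts(insertions):
    """Pairs of insertion-defined clades that cannot both be true clades."""
    clades = list(supported_clades(insertions))
    return [(a, b) for a, b in combinations(clades, 2) if not compatible(a, b)]
```

In this toy data set, the insertion at locus_1 groups sp_A with sp_B, and the insertion at locus_2 places that pair inside a larger group with sp_C, so the markers are mutually consistent.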

12.3 Placing the Root on the Universal Tree

To identify the root of the tree of extant organisms, composite trees have been constructed from paralogous genes (proteins) that are thought to have duplicated before the age of the Commonote, the last universal common ancestor. The Commonote has been positioned on different branches, depending on the type of gene and the analytical method used.

12.3.1 The Root on the Bacterial Branch

The root is most often placed between Bacteria and the common ancestor of Archaea and Eucarya. Iwabe et al. (1989) reconstructed the composite gene tree of EF-Tu/EF-1α [...] of translation, the date of diversification to EF-Tu/EF-1α [...] assumed to be before the age of the Commonote. Therefore, the Commonote is expected to be located on the branch connecting the EF-Tu/EF-1α

EF-1α clade. In turn, the EF-Tu/EF-1α [...] EF-2. In the trees of Iwabe et al. (1989), Archaea is the sister group of Eucarya, and Bacteria is their sister group. A similar result was reported using a larger dataset of these proteins (Baldauf et al. 1996) [...] 1989) also reported the close relationship between Archaea and Eucarya based on the H+

However, in the case of the H+


1993 2006), 1990) adopted the position

of Commonote in the Bacteria branch to the small subunit ribosomal RNA tree and proposed the three domains of life, Archaea, Bacteria and Eucarya.

Brown and Doolittle (1995) also reconstructed a composite tree for three closely related aminoacyl-tRNA synthetases, namely valyl tRNA synthetase, leucyl tRNA synthetase, and isoleucyl tRNA synthetase. From the isoleucyl tRNA synthetase part of the tree, Brown and Doolittle (1995) suggested the Archaea/Eucarya clade. Later, however, Brown et al. (2003) noticed that two types of bacterial isoleucyl tRNA synthetase exist. One of them confers mupirocin resistance and is found in a limited number of bacterial lineages. Brown et al. (2003) suggested that the mupirocin-resistant bacterial isoleucyl tRNA synthetase arose independently of the mupirocin-sensitive one, and that the eukaryotic isoleucyl tRNA synthetase may have originated from the mupirocin-resistant type. The earlier work of Brown and Doolittle (1995) did not include mupirocin-resistant isoleucyl tRNA synthetases. Therefore, at least from the isoleucyl tRNA synthetase data, we cannot conclude that the Archaea/Eucarya clade is monophyletic. On the other hand, other analyses of aminoacyl-tRNA synthetases (seryl, tyrosyl, and tryptophanyl tRNA synthetases) suggested a relationship among Archaea, Bacteria, and Eucarya similar to that in the tree of Iwabe et al. (1989) (Kollman and Doolittle 2000), although the analysis of threonyl tRNA synthetase did not.

When we use nucleotide or amino acid sequences to reconstruct molecular phylogenetic trees, we have to distinguish orthologs from paralogs. Orthologs have common ancestry and share the same function in biological processes. Paralogs also have common ancestry but have different functions. For example, human elongation factor 1α (EF-1α) and chimpanzee EF-1α are orthologs, since they share their ancestry and the same biological function in translation. On the other hand, EF-1α and mitochondrial EF-Tu are paralogs. Although both proteins act in the elongation step of translation by bringing aminoacylated tRNAs to the A site of the ribosome, eukaryotic EF-1α works in the cytoplasm, while mitochondrial EF-Tu works in mitochondria. When we want to reconstruct a species tree, accidental inclusion of paralogous genes (proteins) may lead to wrong trees. However, discriminating orthologs from paralogs is not straightforward (see the discussion above on aEF-1α).

12.3.2 The Commonote as a Member of Bacteria

12.3.2.1 Cavalier-Smith’s Hypothesis

Several recent studies have suggested that the Commonote was a member of Bacteria. In other words, the root was placed within the bacterial branches. Cavalier-Smith (2002, 2006a, b, 2010), for example, has suggested that the Commonote is in Eobacteria. In his hypothesis, the oldest extant lineage is Eobacteria, which in his terminology include Chloroflexi. Negibacteria (overlapping with gram-negative bacteria), which have a two-layered surface membrane, include Eobacteria and Glycobacteria (consisting of Cyanobacteria, Proteobacteria, and so on). Eobacteria are the older group in the course of evolution in Cavalier-Smith’s hypothesis. Posibacteria (overlapping with gram-positive bacteria), which have a single-layered membrane, originated from Negibacteria (Glycobacteria) in his view. The common ancestor of Eukaryotes and Archaebacteria may then have arisen from Posibacteria.

His hypothesis on the evolution of life depends on several observations, one being the structure of the membranes surrounding cells. Most of the others also concern the presence or absence of certain structures (proteins and other high-molecular-weight molecules) in the cell. Cavalier-Smith divided life into two classes: two-membrane organisms and single-membrane organisms (Cavalier-Smith 2002, 2006a, b, 2010). In addition, he predicted that evolution proceeded from two-membrane organisms to single-membrane organisms. This

Smith 2001).

The hypothesis of Cavalier-Smith on the early evolution of life and the evolution of Bacteria, Archaea, and Eucarya depends on the various observations he gathered. However, in general, it is very difficult to tell the evolutionary directions of traits. Although the discussion of Cavalier-Smith (2001, 2002, 2006a, b, 2010) is fruitful for research on the early evolution of life, his standpoint is very different from that of others, most of whom base their arguments on molecular phylogenetic analyses. Only if we accept his obcell hypothesis can the direction of evolution of traits, and hence the direction of evolution of life, be accepted.

12.3.2.2 Lake’s Hypothesis

living organisms based on the indel analyses of various pairs of protein genes

et al. 2007 2008, 2009). They also delineated the early evolution of 2004). They suggested the pos-

sible close relationship between Archaea and Firmicutes (in particular Bacilli), based on the indel analyses (absence/presence of residues at the well-conserved region). Then, they placed the root of all of life between Actinobacteria and Clostridia 2008, 2009). In other words, Actinobacteria and Clostridia

Commonote was a gram-positive bacterium similar to Actinobacteria and/or Clostridia. This conclusion is opposite to that of Cavalier-Smith (2002, 2006a, b, 2010) on the early evolution of life, although both conclusions suggest that the Commonote was not an archaeon but a bacterium. We need to note

(e.g., Valas and Bourne 2009).


12.4 Are Archaea Monophyletic?

Another issue that remains to be answered is whether Eucarya is a subgroup of Archaea [...] 1990) suggested

that Eucarya and Archaea are distinct monophyletic groups.

12.4.1 On the Origin of Eucarya

Determining the origin of the eukaryotic cell is challenging. Substantial evidence has accumulated for the bacterial origin of mitochondria and plastids (chloroplasts). Molecular phylogenetic analyses have suggested that the mitochondrion is derived from Alphaproteobacteria (e.g., Andersson et al. 1998) and the plastid from Cyanobacteria (e.g., Rodríguez-Ezpeleta et al. 2005). Therefore, the early eukaryotic cells incorporated bacterial genes through mitochondrial and plastid symbiosis. Because mitochondria are the organelles responsible for respiration and related metabolism, and because plastids are responsible for photosynthesis, many eukaryotic metabolic genes are descendants of early bacterial genes.

Very little evidence is available regarding the origin of the cytoplasm and nucleus. The transcription and translation systems of Eucarya are similar to those of Archaea rather than to those of Bacteria (e.g., Werner 2007 [...] 2009). Therefore, Eucarya are thought to be relatives of Archaea rather than of Bacteria. However, there are tens of hypotheses on the origin of the nucleus (see the review by Martin 2005). Some of them are reviewed in the following sections.

12.4.2 Eocyte Hypothesis

1984; 1992). Eocytes are one of the subgroups of Archaea, which include

some groups in Crenarchaeota. They analyzed the phylogeny using indels as the trait. Since nucleotide insertion/deletion events are thought to occur much less frequently than base substitutions during evolution, parallel evolution does not seriously affect the phylogenetic analyses. Accordingly, an insertion/deletion shared between orthologous sequences can be a good phylogenetic signal. By comparing the existence of indels in the EF-Tu/EF-1α [...] concluded that the Eocytes (Crenarchaeota) are the sister group of Eucarya.
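The idea of reading a shared indel as a phylogenetic signal can be sketched as follows. This is a minimal illustration, assuming an already aligned set of sequences with `-` marking gaps; the sequences and taxon labels are invented for the example and are not real EF-Tu/EF-1α data.

```python
from collections import defaultdict

# Hypothetical aligned protein fragments ('-' marks an alignment gap).
alignment = {
    "Bacteria_1":      "MKTAYIAKQR----GHVDHGK",
    "Euryarchaeota_1": "MKSAYIAKQR----GHVDHGK",
    "Crenarchaeota_1": "MKTAYLAKERPSGLGHVDHGK",
    "Eucarya_1":       "MKTAYLAKERPTGLGHVDHGK",
}


def gap_runs(seq):
    """Return (start, end) column intervals of consecutive gaps, end exclusive."""
    runs, start = [], None
    for i, ch in enumerate(seq):
        if ch == "-" and start is None:
            start = i
        elif ch != "-" and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(seq)))
    return runs


def shared_indels(alignment):
    """Group taxa by identical gap intervals; each group shares one indel state."""
    groups = defaultdict(set)
    for name, seq in alignment.items():
        for run in gap_runs(seq):
            groups[run].add(name)
    return dict(groups)
```

In the toy alignment, the two sequences lacking the four-residue segment fall into one group and the two sequences that carry it form the complementary group, which is the kind of two-state character that an indel analysis scores.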

Recent phylogenetic analyses of combined data of large and small ribosomal RNA genes, as well as of concatenated protein genes, supported the Eocyte hypothesis (Cox et al. 2008; Foster et al. 2009). In these analyses, a method allowing heterogeneity of nucleotide composition through time was adopted. The evolutionary rates of ribosomal RNA genes are accelerated in the eukaryotic lineage (Cox et al. 2008). Therefore, we cannot rule out the possibility that Eucarya appear as a group distinct from Archaea in Woese’s tree (Woese 1987; Woese et al. 1990) because of long-branch attraction. In the case of Woese’s molecular phylogenetic

1969) was used for the estimation of evolutionary distances. This is the simplest substitution model for nucleotide sequences, and the effects of transversions on the sequence evolution might be overestimated. In addition, rate heterogeneity among sites and branches was not considered in Woese’s original tree because of the limitations of the analytical techniques of those days.
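The citation for the 1969 model is truncated in the text above. The sketch below assumes that it refers to the Jukes-Cantor model, the standard one-parameter nucleotide substitution model published in 1969, in which every substitution type occurs at the same rate. Under that assumption, the corrected distance is d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The function names and the example sequences are illustrative only.

```python
import math


def p_distance(seq1, seq2):
    """Observed proportion of differing sites, ignoring columns with gaps."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(1 for a, b in pairs if a != b) / len(pairs)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance (substitutions per site) for proportion p."""
    if p >= 0.75:
        # The correction is undefined at saturation (p >= 3/4).
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor_distance(p))
```

Because the model has a single rate for all sites, all branches, and all substitution types, it cannot accommodate the rate heterogeneity or transition/transversion differences discussed in the paragraph above.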

12.4.3 Other Hypotheses on the Archaeal Origin of Eucarya

Euryarchaeota is also a candidate of the closest relative to Eucarya. Martin and his -

chondria and hydrogenosomes (Martin and Müller 1998). According to Martin and Müller (1998 -biosis and the origin of eukaryotic cells. In agreement to the hypothesis, several molecular studies proposed that Eucarya are the closest relatives to methanogens (e.g., Sandman et al. 1990).

It has been argued that Thermoplasmatales or their close relatives were the hosts of eukaryotic cells (e.g., Searcy et al. 1978; Margulis 1996). Thermoplasmatales lack a cell wall (Darland et al. 1970) and therefore can be good hosts for intracellular symbiosis. Currently, there is little molecular evidence that directly supports a close relationship between Eucarya and Thermoplasmatales [...] 2007; see also Shimizu et al. 2007). In addition, MreB of Thermoplasma (and Archaeoglobus), a bacterial/archaeal homolog of actin, is more closely related to eukaryotic actin than to the MreB of methanogens (Hara et al. 2007), although the direct ancestor of eukaryotic actin may be different.

The universal tree has been used to obtain the information regarding the origin of 2008) analyzed eukaryotic protein genes that were the descen-

dants of archaeal genes and found that most of them were the sister group of all archaeal orthologs. The result suggests that the Archaea and Eucarya form different

et al. (2007) reported that eukaryotic genes show high affinity to Alphaproteobacteria, Cyanobacteria, and Thermoplasmatales. If the affinity to Alphaproteobacteria is caused by the mitochondrial origin of the genes, and if the affinity to Cyanobacteria is caused by the plastid origin of the genes, the nuclear genes of Eucarya may be most closely related to those of Thermoplasmatales.

Recently, Kelly et al. (2011) suggested the Thaumarchaeal origin of Eucarya based on the presence/absence of protein gene families. They also suggested the ancestral characteristics in methanogens in Archaea.


12.5 Was the Commonote a Thermophile?

12.5.1 Theoretical Analyses on the Growth Temperature of Commonote

The growth temperature of the ancient organism has long been a topic that has 2010). In a well-accepted phylogenetic

are represented in the deepest and shortest branches (Woese 1987; Achenbach-Richter et al. 1988). Based on this observation, Stetter described that the common ancestors of Archaea and of Bacteria were likely hyperthermophilic (Stetter 2006).

Commonote is also parsimoniously thought to have been thermophilic. However, we cannot rule out the hypothesis that the most ancestral organism lived in a

Warwicker 2007).

positive supercoils into circular DNAs in vitro (Kikuchi and Asai 1984). Although the precise role of reverse gyrase in vivo is still unknown, the facts that the protein is found only in thermophiles and that all known hyperthermophiles contain it suggest an essential role of reverse gyrase in the adaptation of life to very high temperatures (Forterre 2002). Indeed, a reverse gyrase knockout mutant of Thermococcus kodakaraensis can grow at 90 °C but not at 93 °C, a temperature at which wild-type T. kodakaraensis still grows (Atomi et al. 2004). Therefore, although reverse gyrase is not an absolute requirement, the emergence of this enzyme was crucial in the origin of hyperthermophiles. An important structural feature of reverse gyrase is that the protein is composed of two unrelated domains, a topoisomerase domain and a helicase domain (Declais et al. 2000). These two domains could not have been fused to produce a single-chain reverse gyrase molecule before the topoisomerase and helicase families had diverged. Therefore, assuming that reverse gyrase is essential for hyperthermophilic organisms (Heine and Chandra 2009), the primitive microorganisms could not have been hyperthermophilic. In addition, eukaryotic type I DNA topoisomerase interacts with helicases in vivo, suggesting that type I topoisomerases and helicases originated and evolved independently in mesophiles or thermophiles and were later recruited to hyperthermophiles (Forterre 1996). This argument suggests that hyperthermophiles descended from less thermophilic organisms, but it does not preclude the idea that reverse gyrase had evolved before the appearance of the last universal common ancestor.

1999) established a model of sequence evolution and estimated the

-fore concluded that the Commonote was likely a mesophile. However, a different


same genome data set by using maximum parsimony and then claimed that the last 2001).

the hypothesis that the last universal common ancestor was a hyperthermophile 2003a, b).

Ancestral amino acid compositions were also computed. Brooks et al. estimated amino acid compositions of a set of proteins postulated to have existed in the last universal common ancestor using an expectation-maximization method (Brooks et al. 2004). The calculated amino acid composition of this protein set was more similar to the observed composition of the same set in extant thermophilic species than in extant mesophilic species. They therefore concluded that the Commonote lived in a thermophilic environment.
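The final comparison step of such an analysis can be sketched in Python. This is a hedged illustration only: it does not implement the expectation-maximization estimation used by Brooks et al. (2004), the toy sequences and reference labels are invented, and the Euclidean distance between composition vectors is an assumed measure chosen for simplicity.

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequences):
    """Fraction of each of the 20 standard amino acids across the sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def euclidean(comp1, comp2):
    """Euclidean distance between two 20-dimensional composition vectors."""
    return math.sqrt(sum((comp1[aa] - comp2[aa]) ** 2 for aa in AMINO_ACIDS))


def closer_reference(query, references):
    """Name of the reference composition nearest to the query composition."""
    return min(references, key=lambda name: euclidean(query, references[name]))


# Toy data (hypothetical): reference protein sets and an inferred ancestral set.
references = {
    "thermophile_like": composition(["EEKKRRIVVY"]),
    "mesophile_like": composition(["QQNNSSTTAA"]),
}
ancestral = composition(["EKKRIVVYEA"])

if __name__ == "__main__":
    print(closer_reference(ancestral, references))
```

With real data, the reference compositions would be computed from the same protein set in extant thermophilic and mesophilic species, and the query from the estimated ancestral composition.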

Becerra et al. focused on the evolution of protein disulfide oxidoreductases and from it drew inferences about the thermostability of proteins in the Commonote. Their results imply that a disulfide oxidoreductase sequence was missing from the genome of the last universal common ancestor, suggesting a non-thermophilic ancestry (Becerra et al. 2007). However, it should be noted that disulfide bond formation is not necessarily required for the high thermostability of thermophilic proteins. Indeed, a number of thermophilic and hyperthermophilic proteins lack cysteine residues or contain only a few (Cambillau and Claverie 2000).

Recently, Boussau et al. conducted computational analyses of both rRNA and protein sequences (Boussau et al. 2008). The results suggested that the Commonote was mesophilic and that its descendants subsequently diverged into thermophilic ancestors of Bacteria and of Archaea-Eucarya that were adapted to high temperature, possibly in response to a climate change on the early Earth.

Thus, a number of theoretical studies have addressed the growth temperature of the last universal common ancestor, but these studies remained inferential because of the lack of empirical testing. In the next section, we describe some experimental studies, performed in the authors’ laboratory, that assess whether the Commonote was thermophilic.

12.5.2 Experimental Testing of Whether the Commonote Was (Hyper)thermophilic

Ancestral sequences of a particular protein can be inferred by comparison with extant homologous protein sequences (Messier and Stewart 1997; Bielawski and

2003; Thornton 2004). We have developed an experimental way to assess the antiquity of hyperthermophilic organisms using an inferred amino acid sequence of a protein postulated to exist in the Commonote. In this experimental method, inferred ancestral residues are introduced into an extant protein and then the thermal stabilities of the resulting mutant proteins are examined. If the Commonote was thermophilic, the mutant proteins, each of which contains one or a few inferred ancestral residues, are expected to show a trend toward enhanced thermal stability when compared to the wild-type protein.
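The design step of this method can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure used in the cited studies: the two aligned sequences are made-up toy strings, and the function name `ancestral_point_mutants` and the mutation labels are hypothetical choices made for this example.

```python
def ancestral_point_mutants(extant: str, ancestral: str) -> list[tuple[str, str]]:
    """Return (label, mutant sequence) pairs, one per differing position.

    Both sequences are assumed to be pre-aligned, gap-free, and of equal
    length. Labels follow the usual notation: wild-type residue, 1-based
    position, introduced residue (for example "A3V").
    """
    if len(extant) != len(ancestral):
        raise ValueError("sequences must be aligned to the same length")
    mutants = []
    for i, (wild_type, inferred) in enumerate(zip(extant, ancestral)):
        if wild_type != inferred:
            label = f"{wild_type}{i + 1}{inferred}"
            # Introduce a single inferred ancestral residue into the extant protein.
            mutants.append((label, extant[:i] + inferred + extant[i + 1:]))
    return mutants


# Toy sequences (hypothetical, for illustration only).
extant_seq = "MKALIVG"
ancestral_seq = "MKVLIAG"
for label, seq in ancestral_point_mutants(extant_seq, ancestral_seq):
    print(label, seq)
```

Each mutant produced this way would then be expressed and its thermal stability compared experimentally with that of the wild-type protein; the code covers only the bookkeeping of which substitutions to make.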

Miyazaki et al. inferred an ancestral amino acid sequence of 3-isopropylmalate

-thermophilic archaeon, Sulfolobus tokodaii (Miyazaki et al. 2001). When the thermal stabilities of the resulting mutant proteins were investigated by measuring the remaining activity after heat treatment and the change in 222-nm ellipticity upon thermal unfolding, at least five of the seven ancestral mutants tested showed thermal

extremely thermophilic bacterium, Thermus thermophilus, was also used for the experimental testing as another model protein. Watanabe et al. designed 12 ancestral mutants, each containing an ancestral amino acid residue that was postulated to be present in the common ancestor of Bacteria and Archaea (Watanabe et al. 2006). When the thermal stabilities of the designed mutants were compared to the wild-

T. thermophilus, at least 6 of the 12 ancestral mutants designed exhibited enhanced thermal stability. A similar trend was also observed when we constructed ancestral mutant proteins of isocitrate dehydrogenase (ICDH) from the extremely thermophilic archaeon, Caldococcus noboribetus (Iwabata et al. 2005). At least four of the five ancestral mutants, each containing an ancestral amino acid residue, showed thermal stability higher than that of the wild-type ICDH. Thus, the ancestral amino acid residues tend to increase the thermostability of metabolic proteins originating from thermophilic and even hyperthermophilic organisms. The results provide experimental evidence for the existence of extremely thermostable proteins in the last universal common ancestor, supporting the hypothesis that the Commonote was a hyperthermophile.
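The counts quoted above can be turned into success rates with a few lines of Python. This is only a bookkeeping sketch: the counts are the "at least" figures from the text, the S. tokodaii sentence is truncated in this copy and is assumed here to report enhanced stability, and the dictionary keys are labels chosen for this example.

```python
# (mutants with enhanced thermal stability, mutants tested), lower bounds from the text
results = {
    "S. tokodaii (Miyazaki et al. 2001)": (5, 7),      # assumed: truncated sentence reports enhanced stability
    "T. thermophilus (Watanabe et al. 2006)": (6, 12),
    "C. noboribetus ICDH (Iwabata et al. 2005)": (4, 5),
}

# Per-study fraction of ancestral mutants that were stabilized.
rates = {name: stabilized / tested for name, (stabilized, tested) in results.items()}

# Pooled fraction across the three studies.
total_stabilized = sum(s for s, _ in results.values())
total_tested = sum(t for _, t in results.values())
pooled_rate = total_stabilized / total_tested

for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print(f"pooled: {pooled_rate:.1%}")
```

Because the counts are lower bounds, the resulting percentages are lower bounds as well.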

A similar experiment was also performed using a protein involved in the translation system of T. thermophilus (Shimizu et al. 2007proteins involved in translation system often show the same topology as the rRNA tree, which is frequently used in phylogenetic analysis. The function of the translation system is universal because all organisms on Earth have a translation system. Furthermore, aminoacyl-tRNA synthetases must be primordial proteins that emerged early in evolution. Therefore, the evolution of an aminoacyl-tRNA synthetase has likely coincided with the evolution of its host organisms. In addition, probably because mutations occurring in aminoacyl-tRNA synthetases would affect the survival of the organisms, the sequences of these proteins are well conserved. Therefore, it is unlikely that modification of their functions or horizontal transfer of their genes has been frequent during evolution. Thus, it is advantageous to use an aminoacyl-tRNA synthetase for a phylogenetic analysis. Shimizu et al. deduced a possible ancestral

-hood tree of 2 2007). An individual or pairs of the

T. thermophilus, and the thermal stabilities of the resulting eight mutant proteins were evaluated by monitoring the change in 222-nm ellipticity upon thermal unfolding. As a result,

the Commonote possessed extremely thermophilic translation enzymes. The result is again compatible with a hyperthermophilic common ancestor. However, as discussed below, it cannot be fully excluded that the observed trend toward enhanced thermostability of the mutant proteins is an artifact of the ancestral design method (Williams et al. 2006).

As described above, introduction of ancestral residues further enhanced the thermostabilities of proteins involved in a metabolic pathway or the translation system of (hyper)thermophiles with a probability between 50 and 80%. Therefore, the ancestral design method is a useful technique for designing mutant enzymes with higher thermostability that relies only on the primary amino acid sequences of homologous proteins. We also found that the extent to which an introduced ancestral residue enhances the thermostability of a mutant is directly correlated with the degree to which resi-

2010).

The consensus approach is a very similar way to improve the thermal stability of a protein using a multiple amino acid sequence alignment of homologous proteins. This method is based on the hypothesis that, at a given position of a multiple sequence alignment of homologous proteins, the most frequently occurring amino acid contributes to the thermostability of the protein more than other, less frequently occurring amino acids. In 1994, Steipe et al. first rationalized the feasibility of this approach using statistical thermodynamics (Steipe et al. 1994). They analyzed the amino acid

basis of their design is that randomly occurring mutations are often destabilizing and, therefore, mutations tend to destabilize proteins if selection pressure is absent. However, during actual evolution, mutations that reduced stability to a level insufficient to maintain the protein's specific tertiary structure have rarely been selected. Consequently, the frequency of a given residue in the multiple sequence alignment of a protein correlates with the contribution of that amino acid to protein stability. Hence, the most frequent amino acid at any position among homologous immunoglobulin

than that with an amino acid rarely seen in the homologous sequences. They calculated a statistical free energy from the frequencies of occurrence of a particular amino acid at a given site and designed proteins with specific amino acid residue substitu-

the consensus approach was used to enhance other proteins: for example, 2000, 2002), the SH3 domain from the yeast actin-binding protein 1 (Rath and Davidson 2000), the DNA-binding domain of the tumor suppressor p53 protein (Nikolova et al. 19981999). The consensus design concept was also applied to improve the thermal stability of chorismate mutase from Escherichia coli by using artificially generated functional protein sequences selected from binary-patterned libraries (Jackel et al. 2010).
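The consensus idea described above can be illustrated with a short Python sketch. It rests on several assumptions that are not taken from the cited papers: the alignment is a gap-free toy example, the function names are hypothetical, and the statistical free energy is written in one common Boltzmann-like form, ΔG(a) = -RT ln(f_a / f_consensus), which may differ in detail from the expression used by Steipe et al.

```python
import math
from collections import Counter

R = 8.314462618e-3  # gas constant in kJ/(mol*K)


def column_frequencies(alignment: list[str], pos: int) -> dict[str, float]:
    """Relative frequency of each residue observed at one alignment column."""
    column = [seq[pos] for seq in alignment]
    counts = Counter(column)
    return {aa: n / len(column) for aa, n in counts.items()}


def consensus_sequence(alignment: list[str]) -> str:
    """Most frequent residue at every column (ties broken alphabetically)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    consensus = []
    for pos in range(length):
        freqs = column_frequencies(alignment, pos)
        consensus.append(min(freqs, key=lambda aa: (-freqs[aa], aa)))
    return "".join(consensus)


def statistical_free_energy(alignment: list[str], pos: int, residue: str,
                            temperature: float = 298.15) -> float:
    """Assumed form: -RT ln(f_residue / f_consensus), in kJ/mol (zero for the consensus)."""
    freqs = column_frequencies(alignment, pos)
    if residue not in freqs:
        raise ValueError("residue is not observed at this position")
    f_consensus = max(freqs.values())
    return -R * temperature * math.log(freqs[residue] / f_consensus)


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = ["MKVLA", "MKVIA", "MRVLA", "MKALG"]
print(consensus_sequence(toy_alignment))
print(statistical_free_energy(toy_alignment, 1, "R"))
```

Under this assumed form, a rare residue receives a positive value relative to the consensus residue, which is the quantity a consensus design would try to reduce by substituting the more frequent amino acid.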

The consensus design approach and the ancestral design approach frequently result in the same residue substitutions because consensus residues often originate from ancestral residues. Therefore, it remains unclear whether the enhanced thermostability of proteins with an ancestral amino acid can be ascribed to the antiquity of the residue or is attributable to the statistical free energy. To clarify the reason why ancestral mutations tended to improve protein stability, Watanabe et al.and ICDHs designed to date (Watanabe et al. 2006). In the authors' laboratory, the ther-

Bacillus subtilis and Saccharomyces cerevisiae had been improved by an evolutionary molecular engineering technique that consisted of random mutagenesis and selection (Akanuma et al. 1998, 1999; Tamakoshi et al. 2001). Some of the mutants isolated by evolutionary engineering have an ancestral residue at the mutated site. Therefore, the thermostabilizing ancestral amino acids found in the experimental evolution were also incorporated into the analysis. Watanabe et al. classified the ancestral mutations into two groups from the viewpoint of the consensus approach, that is, dominant ancestral residues and minor ancestral residues (Watanabe et al. 2006). The dominant residues are the residues that occupy a given site most frequently in the amino acid sequence alignment of

residue does not coincide with the ancestral residue at a site, the ancestral residue was designated as a minor ancestral residue. Among the 15 mutants with a dominant ancestral residue, ten showed improved thermal stability. Similarly, of the six mutants with a minor ancestral residue, four showed improved thermal stability. Because the rate of improving thermal stability by introducing an ancestral residue was not related to whether the ancestral residue was dominant, the stabilization effect of the ancestral residues cannot be attributed to their being consensus residues, that is, to the statistical free energy. However, the analyzed data are limited and therefore not sufficient to establish that the increased stability of mutant proteins carrying an ancestral amino acid is related only to the inherent nature of ancestral sequences.

Very recently, we predicted the sequence for the deepest nodal position of a phylogenetic tree composed of 16 gyrase B subunit sequences, which was then synthesized and characterized (Akanuma et al. 2011of the reconstructed gyrase B is more thermally stable than is a corresponding sequence containing the most frequently occurring amino acids among the 16 gyrases. The thermal stability of the designed protein is likely due in part to the antiquity of some of the inferred residues. However, it is also possible that the ancestral design algorithm simply corrected for the potential inclusion of erroneous residues in the reconstructed sequence that would have been caused by the use of a limited number of homologous amino acid sequences (Akanuma et al. 2011). Further evidence is, therefore, required to conclude that the results of our experimental testing really support the (hyper)thermophilic ancestor hypothesis.
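The comparison between dominant and minor ancestral residues can be checked with a short calculation. The counts (10 of 15 and 4 of 6) come from the text; the two-sided Fisher exact test is our own illustrative addition and is not an analysis reported by Watanabe et al. The function name `fisher_exact_two_sided` is hypothetical.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denominator = comb(n, row1)

    def prob(k: int) -> Fraction:
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return Fraction(comb(col1, k) * comb(n - col1, row1 - k), denominator)

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    return float(sum(prob(k) for k in range(lowest, highest + 1) if prob(k) <= p_observed))


# Counts from the text: stabilized vs. not stabilized mutants.
dominant_rate = Fraction(10, 15)
minor_rate = Fraction(4, 6)
p_value = fisher_exact_two_sided(10, 5, 4, 2)

print(dominant_rate, minor_rate)
print(p_value)
```

Both groups have the same success rate of two thirds, so the test finds no difference between them; with only 21 mutants, however, such a test has little power, which is consistent with the caution expressed above about the limited data.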

12.6 Computer Prediction and Experimental Reconstruction of Ancient Proteins

Information about the ancient environment of Earth is often obtained from fossil records. In contrast, no tangible remnants of the primitive protein forms hosted by ancient organisms that lived more than 3,500 million years ago are preserved (Schopf 1993). However, in addition to the currently available genome information, which has provided a growing database of homologous protein sequences, recent advances in phylogenetic analysis and whole-gene-synthesis techniques have made it possible to reconstruct the genes encoding ancient proteins in laboratories. Therefore, predicting ancestral protein sequences and characterizing the properties of the reconstructed proteins is one of the most powerful means available for studying

examples of resurrection experiments are discussed in greater detail in an excellent review by Thornton (2004).

The empirical reconstruction of ancient proteins was used as a novel tool for improving our knowledge of environmental temperatures experienced by ancient

.et al. 2003, 2008). They estimated the growth temperature of the common ancestor of Bacteria according to the concept that the denaturation temperature of a protein

experiment, they reported that the common ancestor of Bacteria was thermophilic, rather than hyperthermophilic or mesophilic. However, it is well known that a single random mutation, insertion, deletion, or substitution can drastically decrease the thermal stability of a protein. Therefore, it cannot be ruled out that the common ancestor of Bacteria was a hyperthermophilic organism. Conversely, Williams et al. have pointed out that an inaccurate estimation of ancestral amino acids risks overestimating the thermostability and other properties of ancestral proteins (Williams et al. 2006). To assess the reliability of the properties of ancestral proteins reconstructed by various methods, they performed an evolution simulation of

thermodynamic properties of the true ancestral sequences with those of ancestral sequences inferred by maximum parsimony, maximum likelihood, and Bayesian inference. As a result, they found that reconstruction by maximum parsimony or maximum likelihood tends to overestimate thermodynamic stabilities, although the two methods can effectively predict accurate ancestral amino acids. In contrast, Bayesian inference sometimes predicts less probable ancestral amino acids, but the method is a more reliable guide to ancestral thermodynamic properties. Nevertheless, the concern remains that an incorrect evolutionary model may be used, even with Bayesian inference. It is therefore important to keep in mind that none of the reconstruction methods predicts ancestral amino acid residues with perfect accuracy. Thus, although phylogenetic reconstruction of ancestral protein sequences is a powerful way of studying the early evolution of life, any conclusion obtained from such studies relies largely on the accuracy of the reconstructed sequences.
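To make the simplest of these inference methods concrete, the following Python sketch applies the bottom-up pass of Fitch's small-parsimony algorithm to a single alignment column. It is an illustration under stated assumptions, not the procedure of Williams et al.: the rooted binary tree and the residues at its leaves are hypothetical, the function name `fitch` is our own, and maximum likelihood and Bayesian inference (which require an explicit substitution model) are not shown.

```python
def fitch(node):
    """Bottom-up Fitch pass for one alignment column.

    `node` is either a one-letter string (the residue observed at a leaf)
    or a tuple (left, right) of child nodes. Returns the set of equally
    parsimonious states at this node and the minimum number of
    substitutions required in the subtree below it.
    """
    if isinstance(node, str):
        return {node}, 0
    left, right = node
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra substitution needed.
        return shared, left_changes + right_changes
    # The children disagree: keep both options and count one substitution.
    return left_states | right_states, left_changes + right_changes + 1


# Hypothetical rooted tree with five extant residues at one site.
toy_tree = ((("A", "A"), "V"), ("A", "I"))
root_states, substitutions = fitch(toy_tree)
print(root_states, substitutions)
```

For this toy tree the most parsimonious root state is alanine, with two substitutions along the tree. When the root set contains more than one residue, parsimony alone cannot choose between them, which is one reason model-based methods are used in practice.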

Similar resurrection experiments have also been applied to eukaryotic proteins; ancestral reconstruction has been used to understand the evolution of ethanol production/consumption in yeast (Thomson et al. 2005) and the evolutionary trajectory of changes in substrate specificity of hormone receptors (Bridgham et al. 2006; Ortlund et al. 2007). Thus, the reconstruction method is currently a common technique for studying the molecular evolution of genes, proteins, and life.

12.7 Conclusions

In this chapter, we have reviewed the universal trees constructed based on different types of genetic information. The tree topology differed depending on the type of gene analyzed as well as the method used. The root of the universal tree is most likely placed between the bacterial branch and the common ancestor of Archaea and Eucarya. However, it remains possible that the root lies within the bacterial branches.

The monophyly of Archaea is rather controversial. Although the rRNA tree suggested monophyly, other types of trees have also been reported. Whether Eucarya originated within or outside the archaeal branch has yet to be conclusively resolved.

The growth temperature of the ancient organism has long been a topic that has interested many scientists. Theoretical works suggested mesophilic, thermophilic, and hyperthermophilic origins of life, depending on the report. Experimental tests analyzing the effect of individual ancestral amino acid residues, or combinations of them, suggested a hyperthermophilic origin of life. However, we cannot completely exclude a possible artifact originating from the method used to estimate the ancestral sequences possessed by the ancestral organisms.

References

Cavalier-Smith T (2006b) Biol Direct 1:19

Hasegawa M, Hashimoto T (1993) Nature 361:23

Jukes TH, Cantor CR (1969) In: Munro HN (ed) Mammalian protein metabolism. Academic,

Martin W (2005) Archaebacteria (archaea) and the origin of the eukaryotic nucleus. Curr Opin

Zavialov AV, Hauryliuk VV, Ehrenberg M (2005) Splitting of the posttermination ribosome into
