Genome Sequences of Three Agrobacterium Biovars Help Elucidate the Evolution of Multi-Chromosome Genomes in Bacteria†

Running Title: Agrobacterium Biovar II and III genome sequences

Steven C. Slater (1), Barry S. Goldman (2), Brad Goodner (3), João C. Setubal (4,5)*, Stephen K. Farrand (6), Eugene W. Nester (7), Thomas J. Burr (8), Lois Banta (9), Allan W. Dickerman (5), Ian Paulsen (10), Leon Otten (11), Garret Suen (12)‡, Roy Welch (12), Nalvo F. Almeida (5,13), Frank Arnold (3), Oliver T. Burton (9), Zijin Du (2), Adam Ewing (3), Eric Godsy (2), Sara Heisel (2), Kathryn L. Houmiel (14,15), Jinal Jhaveri (5)§, Jing Lu (2), Nancy M. Miller (2), Stacie Norton (2), Qiang Chen (14), Waranyoo Phoolcharoen (14), Victoria Ohlin (3), Dan Ondrusek (3), Nicole Pride (3), Shawn L. Stricklin (2), Jian Sun (5)¶, Cathy Wheeler (3)||, Lindsey Wilson (3), Huijun Zhu (2) and Derek W. Wood (7,15)

(1) Great Lakes Bioenergy Research Center, 1550 Linden Dr., University of Wisconsin, Madison, WI 53706; (2) Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, MO 63167; (3) Department of Biology, Hiram College, Hiram, OH 44234; (4) Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA 24060; (5) Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24060; (6) Department of Microbiology, University of Illinois at Urbana-Champaign, Urbana, IL 61801; (7) Department of Microbiology, University of Washington, Seattle, WA 98195; (8) Department of Plant Pathology, Cornell University, NYSAES, Geneva, NY 14456; (9) Department of Biology, Williams College, Williamstown, MA 01267; (10) Department of Chemistry and Biomolecular Sciences, Macquarie University, North Ryde, NSW 2109, Australia; (11) Institute of Plant Molecular Biology, Strasbourg, 67084 France; (12) Department of Biology, Syracuse University, Syracuse, NY 13244; (13) Department of Computing and Statistics, Federal University of Mato Grosso do Sul, Campo Grande, Brazil; (14) The Biodesign Institute, Arizona State University, 1001 S. McAllister Ave., Tempe, AZ 85287; (15) Department of Biology, Seattle Pacific University, Seattle, WA 98119

Current addresses: ‡Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI USA 53706-1521; §Weather Bill, San Francisco, CA 94108; ¶La Jolla Institute for Allergy & Immunology, La Jolla, CA 92037; ||Cathy Wheeler, Department of Biology, John Carroll University, Cleveland, OH 44118

*Corresponding Author. Mailing Address: Virginia Bioinformatics Institute, Washington St., MC 0477, Blacksburg, VA 24060. Phone: (540) 231-9464. Fax: (540) 231-2606. E-mail: [email protected].

†Supplemental material can be found at http://www.agrobacterium.org and http://agro.vbi.vt.edu/public

Copyright © 2009, American Society for Microbiology and/or the Listed Authors/Institutions. All Rights Reserved. J. Bacteriol. doi:10.1128/JB.01779-08. JB Accepts, published online ahead of print on 27 February 2009.

ABSTRACT

The family Rhizobiaceae contains plant-associated bacteria with critical roles in ecology and agriculture. Within this family, many Rhizobium and Sinorhizobium strains are nitrogen-fixing plant mutualists, while many strains designated as Agrobacterium are plant pathogens. These contrasting lifestyles are primarily dependent on the transmissible plasmids each strain harbors. Members of Rhizobiaceae also have diverse genome architectures that include single chromosomes, multiple chromosomes, and plasmids of various sizes. Agrobacterium strains have been divided into three Biovars, based on physiological and biochemical properties. The genome of a Biovar I strain, A. tumefaciens C58, has been previously sequenced. In this study the genomes of the Biovar II strain A. radiobacter K84, a commercially available biological control strain that inhibits certain pathogenic agrobacteria, and the Biovar III strain A. vitis S4, a narrow host range strain that infects grapes and invokes a hypersensitive response on non-host plants, were fully sequenced and annotated. Comparison with other sequenced members of the α-proteobacteria provides new data on evolution of multi-partite bacterial genomes. Primary chromosomes show extensive conservation of both gene content and order. In contrast, secondary chromosomes share smaller percentages of genes, and conserved gene order is restricted to short blocks. We propose that secondary chromosomes originated from an ancestral plasmid to which genes have been transferred from a progenitor primary chromosome. Similar patterns are observed in select β- and γ-proteobacteria species. Together these results define the evolution of chromosome architecture and gene content among the Rhizobiaceae and support a generalized mechanism for second chromosome formation among bacteria.

INTRODUCTION

The family Rhizobiaceae (order Rhizobiales) of the α-proteobacteria includes the plant pathogens Agrobacterium and the nitrogen-fixing plant mutualists Rhizobium and Sinorhizobium. Members house single and multiple chromosome arrangements, linear replicons, and plasmids of various sizes. Genes of pathogenicity, mutualism, and other symbiotic properties are primarily encoded on large transmissible plasmids. Given the promiscuous nature of these elements, different genomic lineages within the Rhizobiaceae exhibit a variety of symbiotic phenotypes that range from pathogenesis to nitrogen-fixing mutualism.

Agrobacterium taxonomy and phylogeny display a marked disparity. Empirically, Agrobacterium is grouped into five species based on the disease phenotype associated with the resident disease-inducing plasmid: A. tumefaciens causes crown gall on dicotyledonous plants including stone fruit and nut trees, A. rubi causes crown gall on raspberry, A. vitis causes gall formation that is limited to grape, A. rhizogenes causes hairy root disease, and A. radiobacter is avirulent. An alternative classification scheme groups Agrobacterium into three biovars based on physiological and biochemical properties without consideration of disease phenotype. Whole genome and molecular marker comparisons indicate that Agrobacterium strains are derived from multiple chromosomal lineages (see below; (19, 26, 51, 52)). The species and biovar classification schemes do not coincide well, in large part because the disease-inducing plasmids are readily transmissible. The history of Agrobacterium classification was recently reviewed by Young (52).

Representative genomes from all three Agrobacterium biovars are now available. The genome of the biovar I strain A. tumefaciens C58 (C58) has one circular and one linear chromosome (19, 51). The genome sequences for representatives of the two remaining biovars are presented here. Agrobacterium radiobacter K84 (K84), an avirulent biovar II strain, is a widely used biological control agent for preventing crown gall disease in the field (25, 35). A. vitis S4 (S4), a virulent biovar III strain, is phenotypically distinct from strains of A. tumefaciens in two significant ways. First, whereas A. tumefaciens infects many host species, A. vitis causes crown gall only on grapevines (2, 4). Second, A. vitis induces necrosis on grapevine roots and a hypersensitive response on non-host plants (3, 22).

This study examines the evolution of genome architecture among Agrobacterium, selected sequenced members of the Rhizobiales, and additional bacteria that harbor multiple chromosomes. The biovar I genome of C58 harbors a linear chromosome II derived from a plasmid to which large blocks of DNA, including rRNA operons and other essential genes, have transferred from chromosome I (19, 51). Although the sequencing of S4 and K84 was motivated by the need for full genomic sequences of at least one biovar II representative and at least one biovar III representative, we found that their genomes, together with those of C58 and other Rhizobiales species, enabled us to infer a general model for bacterial genome evolution. Crucial for this inference is the complex (for bacteria) replicon architecture of all three Agrobacterium genomes. Data provided here, and additional evidence (41, 49), support our model as a generalized mechanism of genome evolution among bacteria that harbor multiple chromosomes.

MATERIALS AND METHODS

DNA sequencing and assembly. Two DNA libraries (insert sizes 2-4 kb and 4-8 kb) were generated for each Agrobacterium genome by mechanical shearing of DNA and cloning into pUC18, followed by a shotgun sequencing approach. The reads (~87,000 for K84 and ~82,000 for S4) were assembled and edited using Phred, Phrap, and Consed (13, 14, 20). Gaps were closed by sequencing specific products. All rRNA operons were amplified with specific flanking primers, sequenced and assembled individually. All nucleotides with Phred scores less than 40 were re-sequenced using an independent PCR fragment as template. The error rate is estimated to be less than 1:10,000.
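The Phred 40 resequencing threshold and the stated error rate of 1:10,000 are connected by the standard Phred definition, Q = -10 log10(P), where P is the probability that a base call is wrong. The short sketch below illustrates that relationship only; it is not part of the authors' pipeline, and the function names are hypothetical.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is wrong (P = 10^(-q/10))."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred score corresponding to an error probability p (Q = -10 * log10(p))."""
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    return -10 * math.log10(p)


# Threshold taken from the text: bases scoring below 40 were re-sequenced.
RESEQUENCING_THRESHOLD = 40

if __name__ == "__main__":
    for q in (20, 30, RESEQUENCING_THRESHOLD):
        p = phred_to_error_probability(q)
        print(f"Q{q}: about 1 error in {round(1 / p):,} bases")
```

Under this definition a score of 40 corresponds to one expected error in 10,000 bases, which matches the upper bound on the error rate quoted above.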

Comparative genomics analyses. Ortholog families were obtained with orthoMCL (32). Ortholog alignments were obtained with custom Perl scripts. Circular representations of these alignments were obtained with the tool genomeViz (17). Analysis of potential intragenome transfers (Tables S6-S24) involved the Multi-Genome Homology Comparison (38) and Phylogenetic Profiler (33) Web-based tools. Completed bacterial genomes listed in NCBI as having more than one chromosome were initially examined, and only those cases where the additional chromosome(s) carried a substantial number of essential genes were retained. Within this subset, three cases in which two or more closely related genera appear to share a common origin of additional chromosomes were analyzed in greater detail. If intragenome transfer is a robust explanation for the origin of additional chromosomes, then the “transferred” genes should occur in clusters within which the synteny from the initial ancestral chromosome I was maintained. The additional chromosomes of two related genera A and B were searched for such shared gene clusters that are present on chromosome I of a unichromosomal relative C but are no longer found on chromosome I in genera A and B. An initial similarity lower limit of 60% identity was used, and once clusters were identified the lower limit was adjusted to 40% identity to determine the fullest extent of each shared gene cluster. Preliminary versions of Tables S6-S24 were checked against the ortholog alignments, and minor corrections and additions were made to obtain the final set of tables.
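The two-threshold cluster search described above (seed at 60% identity, then extend at 40%) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Web-based tools the authors used: the Hit record, the function name, the minimum seed size of two genes, and the max_ref_gap collinearity rule are all hypothetical choices made for the example.

```python
from typing import List, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    """Best match of a secondary-chromosome gene on chromosome I of relative C."""
    ref_index: int    # gene position on chromosome I of the unichromosomal relative
    identity: float   # percent identity of the match


def find_shared_clusters(
    hits: List[Optional[Hit]],      # one entry per gene, in secondary-chromosome order
    seed_identity: float = 60.0,    # initial lower limit (from the text)
    extend_identity: float = 40.0,  # relaxed lower limit (from the text)
    min_seed_genes: int = 2,        # assumption: a seed needs two adjacent genes
    max_ref_gap: int = 2,           # assumption: tolerated gap in reference order
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of syntenic clusters."""

    def can_add(inside: int, candidate: int, threshold: float) -> bool:
        a, b = hits[inside], hits[candidate]
        if a is None or b is None or b.identity < threshold:
            return False
        # Collinear: neighbouring genes here are also near neighbours on chromosome I.
        return 1 <= abs(a.ref_index - b.ref_index) <= max_ref_gap

    clusters = []
    n = len(hits)
    i = 0
    while i < n:
        hit = hits[i]
        if hit is None or hit.identity < seed_identity:
            i += 1
            continue
        # Pass 1: grow a seed run using the strict threshold.
        j = i
        while j + 1 < n and can_add(j, j + 1, seed_identity):
            j += 1
        if j - i + 1 < min_seed_genes:
            i = j + 1
            continue
        # Pass 2: extend both ends using the relaxed threshold.
        start, end = i, j
        while start > 0 and can_add(start, start - 1, extend_identity):
            start -= 1
        while end + 1 < n and can_add(end, end + 1, extend_identity):
            end += 1
        clusters.append((start, end))
        i = end + 1
    return clusters
```

For example, a run of four collinear genes with identities 45, 72, 65 and 41 is seeded by the two middle genes and then extended outward to cover all four, whereas an isolated high-identity hit with no collinear neighbour is not reported.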

Analysis of the repABC systems of Agrobacterium. The RepA, RepB, and RepC protein sequences from Agrobacterium tumefaciens were used as queries against the NCBI database as of May 2007 using the NCBI BlastP program (1). The top 100 matches were used for analysis. The sequences of each protein were aligned using the MUSCLE program (11). Phylogenetic and molecular evolutionary analyses were conducted using MEGA, version 4 (44).

Phylogenetic comparisons among the Rhizobiaceae. Phylogenetic analysis was performed on a dataset of 507 homologous protein groups selected from 19 species of Rhizobiales organisms (taxa listed in Table S4 (50); results in Fig. 1). The genes were selected strictly from the primary chromosome of each genome. A group was allowed to lack a member from one or two genomes. Three hundred and seventeen homologous groups contained all genomes, 146 were missing one genome, and 45 were missing two genomes (Table S5 (50)). Homolog groups with more than one entry for a genome were not used. Sequences in each homolog group were trimmed by fit to a hidden Markov model (HMM) using the HMMer package (10) and then aligned using MUSCLE (11) with default parameters, as described previously (45). The concatenation of 119,758 aligned positions was analyzed using the program RAxML (43) with the GAMMA-distributed WAG substitution model. Bootstrapping was performed using the non-parametric (slow) method for 100 replicates.
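The concatenation step can be illustrated with a small sketch. The text does not say how genomes missing from a homolog group were represented in the concatenated alignment; the example below assumes the common practice of padding them with gap characters. The function name and the data layout (one dictionary of taxon to aligned sequence per group) are hypothetical.

```python
from typing import Dict, List


def concatenate_alignments(
    groups: List[Dict[str, str]],  # one dict per homolog group: taxon -> aligned sequence
    taxa: List[str],               # every genome in the analysis
    max_missing: int = 2,          # from the text: at most two genomes may lack the gene
    gap: str = "-",                # assumption: absent genomes are padded with gaps
) -> Dict[str, str]:
    """Join per-group alignments into a single alignment with one row per taxon."""
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for alignment in groups:
        widths = {len(sequence) for sequence in alignment.values()}
        if len(widths) != 1:
            raise ValueError("all sequences in a group must have the same aligned length")
        missing = [taxon for taxon in taxa if taxon not in alignment]
        if len(missing) > max_missing:
            continue  # too many genomes lack this gene; skip the group
        width = widths.pop()
        for taxon in taxa:
            parts[taxon].append(alignment.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}
```

Every row of the result has the same length, so the output can be written directly as a single alignment for a maximum likelihood program.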

Comparative single protein analysis. Genomes were clustered by gene content by constructing a matrix of pairwise distances between bacterial proteomes (results in Fig. S1). Pairwise distances were estimated using the following procedure. Using NCBI BlastP, each protein in genome A was compared to the proteome of genome B. The similarity of the top hit in genome B was noted for each protein in genome A. All such A to B comparisons were summarized by calculating the percentage of the proteins in A that had a match in B of at least a given similarity. That is, a table was created showing what fraction of proteome A had matches of at least 100% identity in B, what fraction had matches of at least 99% identity, what fraction had matches of at least 98% identity, and so on. If A and B were the same proteome, this table would contain values of 1.0 for all percent identity values from 100% to 1%. A histogram was generated for each genomic comparison and the area above the histogram was measured. This area represents the sum of the differences between the actual fractions observed and those that would arise from identical proteomes, and it serves as a distance measure. The proteomic comparison was repeated for all possible pairwise comparisons. To generate the actual distances used in the phylogeny reconstruction we compared the pairs of organisms in both directions (A→B and B→A) and averaged the histogram areas.
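The distance measure described in this paragraph can be written down compactly. The sketch below is a minimal reading of the procedure, not the authors' code: it assumes that the top-hit percent identities have already been extracted from the BlastP output, that a protein with no hit is recorded as 0, and that the histogram uses one bin per percent-identity threshold from 100 down to 1. All function names are hypothetical.

```python
from typing import List, Sequence


def identity_profile(top_hit_identities: Sequence[float]) -> List[float]:
    """Fraction of proteins whose top hit reaches at least t% identity, for t = 100..1."""
    total = len(top_hit_identities)
    if total == 0:
        raise ValueError("the proteome must contain at least one protein")
    return [
        sum(1 for identity in top_hit_identities if identity >= threshold) / total
        for threshold in range(100, 0, -1)
    ]


def area_above_histogram(profile: Sequence[float]) -> float:
    """Sum of the shortfalls from 1.0, the value expected for identical proteomes."""
    return sum(1.0 - fraction for fraction in profile)


def proteome_distance(a_vs_b: Sequence[float], b_vs_a: Sequence[float]) -> float:
    """Average the areas of the two directed comparisons (A to B and B to A)."""
    area_ab = area_above_histogram(identity_profile(a_vs_b))
    area_ba = area_above_histogram(identity_profile(b_vs_a))
    return (area_ab + area_ba) / 2
```

With these conventions the distance is 0 for identical proteomes and 100 when no protein in either proteome has a hit in the other; a matrix of such values for all genome pairs is the input to the neighbor-joining step.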

A dendrogram illustrating how the genomes cluster (and their relative distances) with this scheme can be easily derived from the matrix of pairwise distances using the neighbor-joining method implemented in PHYLIP (15). This proteomic comparison method also appears robust with respect to highly divergent and even largely disjoint protein sets: the archaeon Pyrococcus furiosus branches deeply from the γ-proteobacteria, while the small genome of Buchnera aphidicola, for example, clusters with Wigglesworthia (data not shown). In spirit, this clustering method is similar to the more rigorous average amino acid identity (AAI) measure proposed by Konstantinidis and Tiedje (31); like theirs, our method shows that entire proteome comparisons largely recapitulate standard 16S rRNA phylogeny yet provide insights into the correlation of genome and ecological role as well as highlighting possible horizontal gene transfer.

A. radiobacter K84 and A. vitis S4 genome sequences. The annotated genome sequences of both A. vitis S4 (GenBank CP000633 through CP000639) and A. radiobacter K84 (GenBank CP000628 through CP000632) are available from GenBank and from the Agrobacterium Genomes Database (40).

RESULTS

Sequencing and annotation of representative genomes from Agrobacterium Biovars II and III. Representative genomes from all three Agrobacterium biovars are now available. The genome of the Biovar I strain A. tumefaciens C58 (C58) was sequenced by our group and has been recently revised and updated (19, 42, 51). The genome sequences for representatives of the two remaining biovars are presented here.

Table 1 compares the general features of C58, K84, and S4, and Tables S1-S3 provide a more detailed picture of each genome. The three sequenced Agrobacterium biovars have distinct genome architectures. The genomes of C58 and S4 contain two true chromosomes, which we define as replicons containing both rRNA operons and genes essential for prototrophic growth. C58, however, has one circular and one linear chromosome (19, 51) while S4 has two circular chromosomes. In both strains, the larger chromosome (chromosome I) contains an origin of replication that is similar to other chromosomal origins within the α-proteobacteria (24), while chromosome II has a repABC origin of replication typical of the large plasmids within the Rhizobiaceae. C58 contains two plasmids, pTiC58 and pAtC58 (19, 51; Tables 1 and S1), whereas S4 has five plasmids (Tables 1 and S2). K84, in contrast, has a single circular chromosome, a second 2.65 Mb replicon and three plasmids (Tables 1 and S3): pAgK84 (44 kb; (30)), pAgK84b (185 kb, pNOC (7)) and pAgK84c (388 kb, pAgK434 (9)). Like the second chromosomes of C58 and S4, the 2.65 Mb replicon contains a plasmid-type repABC origin. However, it lacks the rRNA operons and does not contain the extensive sets of essential metabolic genes found on the second chromosomes of C58 and S4. It does contain at least one gene that is likely to be essential: L-seryl-tRNA selenium transferase (Arad7947).

Multi-protein phylogeny of new genomes shows Agrobacterium to be paraphyletic. The relationships among C58, S4, K84 and 16 previously sequenced genomes in the Rhizobiales were investigated by maximum likelihood phylogenetic analysis. Protein alignments were performed for 507 single-copy orthologous gene families located on primary chromosomes that are likely to have tracked the vertical component of ancestry (Fig. 1; Tables S4 and S5 (50)). Analysis of the concatenated dataset produces a single topology with 100% a posteriori support for all branches within the Rhizobiaceae, which is consistent with results of Williams et al. (48). This phylogenetic reconstruction finds S4 to group with C58 and K84 to group with two Rhizobium genomes (R. leguminosarum and R. etli). The lineage uniting K84 with Rhizobium has a substantial branch length, while S4 and C58 appear to have separated soon after the divergence of Sinorhizobium.

Whole-genome similarity plots support these findings (Fig. S1). The neighbor-joining tree of the distances measured from these plots gives the same topology and similar relative branch lengths within the Rhizobiaceae as the maximum likelihood tree analysis (Fig. S2). These large-scale investigations provide a well-defined phylogenetic basis for uniting biovar II (represented by K84) with Rhizobium.

RepABC replication origins are not linearly descendant among secondary chromosomes and large plasmids. Plasmid replication among the Rhizobiaceae is generally under the control of the RepABC system (5, 47). Plasmid origins of replication are typically considered characteristic of a plasmid since replication is required for transmission. Thus, we would predict that the repABC genes, which are generally found in an operon, evolved as a single unit on the plasmids and second chromosomes for which they mediate replication. Phylogenetic analyses of these gene lineages, however, indicate a lack of evolutionary congruence with the species tree (Fig. 1) among the repABC systems of plasmids and of second largest replicons of the three biovars (Figs. S5-S6). Therefore one cannot infer an ancestry for repABC genes that does not invoke continuous horizontal gene transfer of these genes. Individual repABC genes show a similar lack of evolutionary congruence within replicons (the RepA and RepB trees, while congruent to each other, are not congruent to the RepC tree, Figs. S7-S8), suggesting that plasmid evolution is mediated both by the frequent movement of plasmids among strains and by exchange of the individual repABC genes within replicons. We note that the repABC genes in the second largest replicons of C58, S4 and K84 do not form an operon. In a wider evolutionary perspective, congruence among repABC genes generally does hold. For example, even though the repC genes appear to move easily within families, they move less easily within orders, and rarely outside of an order (Fig. 2). These findings are consistent with recent work by Cevallos et al. (5) and confirm that the intragenomic movement of genes across replicons includes the replication systems.

Conservation of gene content and order is much greater on primary chromosomes than on secondary chromosomes. The C58 chromosome I shares large-scale synteny with the chromosome of Sinorhizobium meliloti 1021 and with the chromosome of the more distantly related Mesorhizobium loti MAFF303099 (19, 51). Subsequent analyses show conservation of gene order and content among primary ancestral chromosomes of other Rhizobiales (Brucella, Bradyrhizobium, Mesorhizobium, and Rhizobium strains, Ochrobactrum anthropi, and Azorhizobium caulinodans) (37, 41). Given these relationships, we might expect the secondary chromosomes and large replicons within Agrobacterium and across the Rhizobiales to display similar syntenic relationships. Although some conservation of gene content is apparent, these replicons lack the large-scale conservation of gene order seen among the primary chromosomes (Fig. 3). Where gene order has been retained, it is limited to small blocks of genes. These contrasting findings led us to examine the origins of the large secondary replicons.

Secondary chromosomes originated from intragenomic transfers from primary chromosomes to ancestral plasmids. In spite of the lack of large-scale synteny across the secondary chromosomes and large replicons of the Rhizobiales, evidence supports a common origin for chromosomes II of C58 and S4 and the 2.65 Mbp replicon of K84. Of the 3,382 genes shared by all three genomes, 291 are located on chromosomes II of C58 and S4, and on the 2.65 Mbp replicon of K84. This represents 16%, 27%, and 12% of the genes on each of the respective DNA molecules (40). In addition, six gene clusters are shared by chromosomes II of C58 and S4, by the 2.65 Mbp replicon of K84, and by plasmids p42e of R. etli and pRL11 of R. leguminosarum (Fig. 3, Tables S6 and S10 (50)).

Comparisons among the Rhizobiales suggest that gene transfer from primary chromosomes to ancestral plasmids resulted in secondary chromosomes. Because these transfers occur within the same genome (and can potentially occur between any pair of replicons), we term them intragenomic gene transfers. Under this model, translocated genes would be expected to occur in clusters that retain synteny with the ancestral chromosome, and this is clearly observed (Fig. 3). All fully sequenced genomes in the Brucella/Ochrobactrum clade (5 sequenced strains), two members of the genus Sinorhizobium and the mixed Agrobacterium/Rhizobium clade (5 sequenced strains) possess multiple chromosomes or a large replicon with some chromosomal characteristics. Moreover, except for the Brucellae, all these members carry one or more plasmids.

All fully sequenced Rhizobiales species that harbor multiple replicons have at least one RepABC replicon. We suggest that the common ancestor of this order was a uni-chromosomal strain that acquired a single ancestral plasmid of this class, here referred to as the Intragenomic Translocation Recipient (ITR) (Fig. 4). The best evidence for the existence of this ancestral plasmid is three gene clusters shared by almost all fully sequenced Rhizobiales (in addition to repABC). As shown in Fig. 5 and Table S6 (50), in 29 out of 32 cases these four clusters are found in secondary large replicons. The three exceptions (A. vitis (minCDE), O. anthropi (hutIHGU) and A. radiobacter (hutIHGU)) can be explained by subsequent retrotransfers to the primary chromosome from the ITR, based on analysis of adjacent syntenic regions shared with chromosome II of their nearest sequenced relatives. Moreover, three of these clusters (minCDE, hutIHGU, and repABC) are not seen in the uni-chromosomal genome of Azorhizobium caulinodans, a Rhizobiales member, suggesting that the ITR plasmid brought those genes to the ancestral strain and that the fourth gene cluster (pca) later moved from the ancestral chromosome to the ITR plasmid.

At some point the Brucella/Ochrobactrum clade diverged from the lineage that gave rise to the family Rhizobiaceae (Fig. 1). The transfer of chromosomal genes to the ITR plasmid took place independently in the Brucella/Ochrobactrum clade (also hypothesized in (36)) and in the Rhizobiaceae family. In the Brucella/Ochrobactrum clade there have been 25 intragenomic transfers from the primary chromosome to the ITR plasmid, as shown by the fact that these 25 clusters are shared by all of the sequenced members of the Brucella/Ochrobactrum clade (Table S7 (50)) and that these clusters are still found in the primary chromosome of S. meliloti. Twenty more transfers have occurred since Brucella diverged from Ochrobactrum (Table S8 (50)). In fact, the recently sequenced genome of Brucella suis ATCC 23445 (NC_010169.1) shows that another 220 kb section, found in chromosome I for all other fully sequenced Brucellae, is now part of its chromosome II (46). In Sinorhizobium meliloti the ancestral ITR plasmid evolved into the pSymB plasmid, with one intragenomic transfer event from the chromosome to the ITR plasmid occurring prior to its divergence from the Agrobacterium/Rhizobium clade and three events after (Table S9 (50)).

Among the Rhizobiaceae, at least two gene clusters transferred to the ancestral ITR plasmid prior to the divergence of the clade that includes the biovar I/III strains from the biovar II clade that includes K84, Rhizobium etli CFN42, and R. leguminosarum bv. viciae 3841. These transfers include a cluster containing genes encoding a glutamate synthase and glutamine synthetase III (Fig. 3B; Table S10 (50)). After this divergence, there was at least one intragenomic transfer to the ITR plasmid before it became chromosome II for Agrobacterium biovar I/III strains (Fig. 3B; Table S11 (50)). Subsequently, transfers to chromosome II have occurred that are unique to biovars I or III (19). For example, there have been at least seven large-scale gene transfer events, ranging from 10 kbp to 220 kbp, and a few smaller transfer events between the ancestral chromosome and chromosome II of C58 that did not occur in S4 (Fig. 3B; Table S12 (50)). In a separate but parallel track, there was at least one intragenomic transfer to the ITR plasmid ancestral to K84 (2.65 Mbp replicon), R. etli (plasmid p42e) and R. leguminosarum (plasmid pRL11) (Table S13 (50)). None of the secondary replicons in this branch has reached chromosome status yet.

We observe that among Rhizobiales, another evolutionary path seems to be integration of the ancestral ITR plasmid into the main chromosome. The best example of this path is found in Bradyrhizobium strains. All fully sequenced Bradyrhizobium strains have very large chromosomes (B. japonicum USDA 110 has a single chromosome larger than 9 Mbp (29)) and only one strain (Bradyrhizobium sp. BTAi1) has a plasmid that might serve to nucleate a second chromosome. However, the presence of ITR plasmid gene clusters and other plasmid genes in the chromosomes of these species (also seen in Mesorhizobium main chromosomes) suggests integration of one or more plasmids into the ancestral chromosome (Fig. 4).

347

Intragenomic flow from chromosomes to large plasmids mediates second chromosome formation in other bacteria. A plasmid-based mechanism of secondary chromosome formation was first proposed with the genome sequence of the two chromosomes of Vibrio cholerae, based solely on the presence of plasmid replication functions (12). The extensive data for the Rhizobiales just described go well beyond replication functions, and we now provide evidence for two more examples of extensive intragenomic gene transfer to a new chromosome based on published genome sequences. First, among the γ-Proteobacteria, the example of Vibrio is much older and more complex than first proposed. Strains of Photobacterium were once considered to be within the genus Vibrio, and multiple lines of evidence support Vibrio and Photobacterium as sister genera. Both genera have two chromosomes, and sequences are available for P. profundum and four Vibrio species. Phylogenetic analysis of several conserved proteins showed that among the available sequenced genomes, Aeromonas hydrophila is the closest relative with a single chromosome. Comparative analyses support six gene cluster transfers from the ancestral chromosome I to the plasmid progenitor of chromosome II (itself defined by seven unique gene clusters) prior to the divergence of the sister genera Photobacterium and Vibrio; seven additional gene cluster transfers to chromosome II of the common ancestor of all the sequenced Vibrio strains; and 29 transfers unique to the Photobacterium side (Fig. S3; Tables S14-S17 (50)). Second, in the β-Proteobacteria, the genus Burkholderia was subdivided several years ago, with some members of Burkholderia along with some stragglers from other genera reclassified into the genus Ralstonia. Several lines of evidence support a very close relationship between Burkholderia and Ralstonia, and they each consist of species with two or three chromosomes. The most closely related sequenced genomes with a single chromosome are those from the genus Bordetella; B. bronchiseptica was used as the comparison genome for this analysis. Analysis of chromosome II sequences from five different Burkholderia species and three different Ralstonia species indicates that the second chromosomes of Burkholderia and Ralstonia share a common origin, with 11 gene cluster transfers from the ancestral chromosome to a plasmid progenitor (defined by two unique gene clusters) (Fig. S4; Tables S18 and S19 (50)). After the divergence of these two clades, 12 additional transfers to chromosome II are unique to the Burkholderia bichromosome ancestor, and 24 transfers to the Ralstonia bichromosome ancestor (Fig. S4; Tables S20 and S21 (50)). Within a subset of Burkholderia strains there is a third plasmid-based chromosome to which four gene clusters were transferred from either chromosome I or chromosome II (Fig. S4; Table S22 (50)). Taken together, these data support a generalized mechanism of secondary chromosome formation among bacteria.


DISCUSSION

Within the Rhizobiaceae, available evidence strongly supports a mixed Agrobacterium/Rhizobium clade containing two subclades. One subclade includes the biovar II agrobacteria (e.g., K84) and certain of the fast-growing rhizobia, including R. etli and R. leguminosarum. The second subclade includes the biovar I (e.g., C58) and biovar III (e.g., S4) lineages, which separated after diverging from the biovar II lineage. Linearization of the biovar I chromosome appears to have been a seminal event in this radiation (42).

Analysis of complete genome sequences within the Rhizobiales allows a more precise definition of phylogenetic relationships. While it has long been known that gene transfer can occur between organisms, the picture that results from our study shows a group characterized by composite genomes in which genes of all classes are migrating not only between organisms (19, 51) but also intracellularly among chromosomal and plasmid replicons. In the Rhizobiaceae, such movements, as well as chromosomal rearrangements, have not completely disrupted the backbone of the ancestral chromosome. In contrast, while second chromosomes and evolving plasmid-based large replicons have some overlapping gene content, they display significant loss of gene order. In biovar I and III agrobacteria these movements produced second chromosomes derived from plasmids, while in the biovar II strain K84 the plasmid-based replicon has yet to reach second chromosome status.

Although it is clear that the 2.65 Mb replicon of K84, the second chromosomes of C58 and S4, and large plasmids in other members of the Rhizobiales have evolved from a common plasmid ancestor, the repABC genes involved in replication initiation, copy number control, and partition on these molecules are phylogenetically distinct even within a single organism. These findings show that repABC genes, like other genes, are being exchanged among replicons. This may reflect selective pressure to move from incompatibility to coexistence in genomes with multiple repABC-based replicons. It also means there is no internal standard by which to directly compare replicon lineages among these plasmids.

Our data show a common mechanism of secondary chromosome formation in the Rhizobiaceae and other bacteria. A prerequisite for this evolution is the intracellular presence of a second replicon capable of stably and efficiently replicating large DNA molecules. The repABC-type replicons that are widely distributed among the Rhizobiales fall into this class, and have produced second chromosomes in addition to large replicons such as the 2.65 Mb K84 replicon and the Sym plasmids of nitrogen-fixing members of the Rhizobiaceae (6, 8, 16, 18, 19, 21, 37, 51, 53). In A. tumefaciens, it has been shown that chromosome II is replicated concurrently with chromosome I; such overall genome synchrony probably allowed intragenomic transfers to be maintained (27, 28). Most of the large gene movements have been from the ancestral chromosome to plasmid replicons, with only rare retrotransfers. While plasmids can undergo large gene rearrangements and losses/insertions, available evidence suggests there are some constraints to large-scale rearrangements of the bacterial chromosome (23, 34, 39).

The advantage of multiple chromosomes is unclear, but we speculate that they may permit further accumulation of genes when the primary replicon cannot support further chromosome enlargement. Within the Rhizobiaceae, different species appear to handle gene accumulation in different ways. Bradyrhizobium and Mesorhizobium species have very large chromosomes with few, if any, relatively small plasmids. In contrast, Agrobacterium and Rhizobium strains have multiple chromosomes or large replicons that show gene accumulation, and anywhere from one to six plasmids. These differences may suggest that chromosomal origins have differing abilities to replicate molecules larger than about five or six Mbp, with multiple chromosomes providing an alternative reservoir for newly acquired DNA. Alternatively, the initial movement of a few essential gene clusters to a plasmid replicon may be simply a historical contingency with no attached selective advantage. Additional essential gene transfers would simply solidify the essential nature of the new replicon. An evaluation of the selective advantage hypothesis is needed, but regardless of the reason, it is clear that genetic organization of even essential genes in bacteria is much more complex and fluid than has been imagined.


ACKNOWLEDGMENTS

This work was supported by National Science Foundation grants 0333297 and 0603491 to EWN and 0736671 to SCS, by grants from the M. J. Murdock Charitable Trust Life Sciences program (2004262:JVZ and 2006245:JVZ) to DWW, by a science education grant from the Howard Hughes Medical Institute to BG (52005125), by a Conselho Nacional de Desenvolvimento Científico e Tecnológico fellowship to NFA (#200447/2007-6), and by the Monsanto Company. Special thanks to the more than 450 undergraduate students at Hiram College, Oregon State University, Seattle Pacific University, Arizona State University, University of North Carolina, Washington University in St. Louis, and Williams College who contributed to the deep annotation of all three Agrobacterium genomes between 2004 and 2008.


REFERENCES

1. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-402.

2. Burr, T. J., C. Bazzi, S. Sule, and L. Otten. 1998. Crown gall of grape: biology of Agrobacterium vitis and the development of disease control strategies. Plant Dis. 82:1288-1297.

3. Burr, T. J., A. L. Bishop, B. H. Katz, L. M. Blanchard, and C. Bazzi. 1987. A root-specific decay of grapevine caused by Agrobacterium tumefaciens and A. radiobacter biovar 3. Phytopathology 77:1424-1427.

4. Burr, T. J., and L. Otten. 1999. Crown Gall of grape: Biology and Disease Management. Annu. Rev. Phytopathol. 37:53-80.

5. Cevallos, M. A., R. Cervantes-Rivera, and R. M. Gutierrez-Rios. 2008. The repABC plasmid family. Plasmid 60:19-37.

6. Chain, P. S., D. J. Comerci, M. E. Tolmasky, F. W. Larimer, S. A. Malfatti, L. M. Vergez, F. Aguero, M. L. Land, R. A. Ugalde, and E. Garcia. 2005. Whole-genome analyses of speciation events in pathogenic Brucellae. Infect. Immun. 73:8353-61.

7. Clare, B. G., A. Kerr, and D. A. Jones. 1990. Characteristics of the nopaline catabolic plasmid in Agrobacterium strains K84 and K1026 used for biological control of crown gall disease. Plasmid 23:126-37.


8. DelVecchio, V. G., V. Kapatral, R. J. Redkar, G. Patra, C. Mujer, T. Los, N. Ivanova, I. Anderson, A. Bhattacharyya, A. Lykidis, G. Reznik, L. Jablonski, N. Larsen, M. D'Souza, A. Bernal, M. Mazur, E. Goltsman, E. Selkov, P. H. Elzer, S. Hagius, D. O'Callaghan, J. J. Letesson, R. Haselkorn, N. Kyrpides, and R. Overbeek. 2002. The genome sequence of the facultative intracellular pathogen Brucella melitensis. Proc. Natl. Acad. Sci. U. S. A. 99:443-8.

9. Donner, S. C., D. A. Jones, N. C. McClure, G. M. Rosewarne, M. E. Tate, A. Kerr, N. N. Fajardo, and B. G. Clare. 1993. Agrocin 434, a new plasmid encoded agrocin from the biocontrol Agrobacterium strains K84 and K1026, which inhibits biovar 2 agrobacteria. Physiol. Mol. Plant Pathol. 42:185-194.

10. Eddy, S. R. 1998. Profile hidden Markov models. Bioinformatics 14:755-763.

11. Edgar, R. C. 2004. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5:113.

12. Egan, E. S., M. A. Fogel, and M. K. Waldor. 2005. Divided genomes: negotiating the cell cycle in prokaryotes with multiple chromosomes. Molecular Microbiology 56:1129-1138.

13. Ewing, B., and P. Green. 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8:186-94.

14. Ewing, B., L. Hillier, M. C. Wendl, and P. Green. 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8:175-85.


15. Felsenstein, J. 1989. PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5:164-166.

16. Galibert, F., T. M. Finan, S. R. Long, A. Puhler, P. Abola, F. Ampe, F. Barloy-Hubler, M. J. Barnett, A. Becker, P. Boistard, G. Bothe, M. Boutry, L. Bowser, J. Buhrmester, E. Cadieu, D. Capela, P. Chain, A. Cowie, R. W. Davis, S. Dreano, N. A. Federspiel, R. F. Fisher, S. Gloux, T. Godrie, A. Goffeau, B. Golding, J. Gouzy, M. Gurjal, I. Hernandez-Lucas, A. Hong, L. Huizar, R. W. Hyman, T. Jones, D. Kahn, M. L. Kahn, S. Kalman, D. H. Keating, E. Kiss, C. Komp, V. Lelaure, D. Masuy, C. Palm, M. C. Peck, T. M. Pohl, D. Portetelle, B. Purnelle, U. Ramsperger, R. Surzycki, P. Thebault, M. Vandenbol, F.-J. Vorholter, S. Weidner, D. H. Wells, K. Wong, K.-C. Yeh, and J. Batut. 2001. The Composite Genome of the Legume Symbiont Sinorhizobium meliloti. Science 293:668-672.

17. Ghai, R., and T. Chakraborty. 2007. Comparative microbial genome visualization using GenomeViz. Methods Mol. Biol. 395:97-108.

18. Gonzalez, V., R. I. Santamaria, P. Bustos, I. Hernandez-Gonzalez, A. Medrano-Soto, G. Moreno-Hagelsieb, S. C. Janga, M. A. Ramirez, V. Jimenez-Jacinto, J. Collado-Vides, and G. Davila. 2006. The partitioned Rhizobium etli genome: Genetic and metabolic redundancy in seven interacting replicons. Proc. Natl. Acad. Sci. U. S. A. 103:3834-3839.

19. Goodner, B., G. Hinkle, S. Gattung, N. Miller, M. Blanchard, B. Qurollo, B. S. Goldman, Y. W. Cao, M. Askenazi, C. Halling, L. Mullin, K. Houmiel, J. Gordon, M. Vaudin, O. Iartchouk, A. Epp, F. Liu, C. Wollam, M. Allinger, D. Doughty, C. Scott, C. Lappas, B. Markelz, C. Flanagan, C. Crowell, J. Gurson, C. Lomo, C. Sear, G. Strub, C. Cielo, and S. Slater. 2001. Genome sequence of the plant pathogen and biotechnology agent Agrobacterium tumefaciens C58. Science 294:2323-2328.

20. Gordon, D., C. Abajian, and P. Green. 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8:195-202.

21. Halling, S. M., B. D. Peterson-Burch, B. J. Bricker, R. L. Zuerner, Z. Qing, L. L. Li, V. Kapur, D. P. Alt, and S. C. Olsen. 2005. Completion of the genome sequence of Brucella abortus and comparison to the highly similar genomes of Brucella melitensis and Brucella suis. J. Bacteriol. 187:2715-26.

22. Herlache, T. C., H. S. Zhang, C. L. Ried, S. A. Carle, P. Basaran, M. Thaker, A. T. Burr, and T. J. Burr. 2001. Mutations that affect Agrobacterium vitis-induced grape necrosis also alter its ability to cause a hypersensitive response on tobacco. Phytopathology 91:966-972.

23. Hughes, D. 2000. Evaluating genome dynamics: the constraints on rearrangements within bacterial genomes. Genome Biol. 1:REVIEWS0006.

24. Ioannidis, P., J. C. Hotopp, P. Sapountzis, S. Siozios, G. Tsiamis, S. R. Bordenstein, L. Baldo, J. H. Werren, and K. Bourtzis. 2007. New criteria for selecting the origin of DNA replication in Wolbachia and closely related bacteria. BMC Genomics 8:182.

25. Jones, D. A., M. H. Ryder, B. G. Clare, S. K. Farrand, and A. Kerr. 1991. Biological control of crown gall using Agrobacterium strains K84 and K1026, p. 161-170. In H. Komada, K. Kiritani, and J. Bay-Peterson (ed.), The biological control of plant diseases. Food and Fertilizer Technology Center for the Asian and Pacific Region, Taipei, Taiwan.

26. Jumas-Bilak, E., S. Michaux-Charachon, G. Bourg, M. Ramuz, and A. Allardet-Servent. 1998. Unconventional genomic organization in the alpha subgroup of the Proteobacteria. J. Bacteriol. 180:2749-55.

27. Kahng, L. S., and L. Shapiro. 2001. The CcrM DNA methyltransferase of Agrobacterium tumefaciens is essential, and its activity is cell cycle regulated. J. Bacteriol. 183:3065-3075.

28. Kahng, L. S., and L. Shapiro. 2003. Polar localization of replicon origins in the multipartite genomes of Agrobacterium tumefaciens and Sinorhizobium meliloti. J. Bacteriol. 185:3384-91.

29. Kaneko, T., Y. Nakamura, S. Sato, K. Minamisawa, T. Uchiumi, S. Sasamoto, A. Watanabe, K. Idesawa, M. Iriguchi, K. Kawashima, M. Kohara, M. Matsumoto, S. Shimpo, H. Tsuruoka, T. Wada, M. Yamada, and S. Tabata. 2002. Complete genomic sequence of nitrogen-fixing symbiotic bacterium Bradyrhizobium japonicum USDA110. DNA Research 9:189-197.

30. Kim, J. G., B. K. Park, S. U. Kim, D. Choi, B. H. Nahm, J. S. Moon, J. S. Reader, S. K. Farrand, and I. Hwang. 2006. Bases of biocontrol: Sequence predicts synthesis and mode of action of agrocin 84, the Trojan Horse antibiotic that controls crown gall. Proc. Natl. Acad. Sci. U. S. A. 103:8846-51.

31. Konstantinidis, K. T., and J. M. Tiedje. 2005. Towards a genome-based taxonomy for prokaryotes. J. Bacteriol. 187:6258-64.


32. Li, L., C. J. Stoeckert, Jr., and D. S. Roos. 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13:2178-89.

33. Markowitz, V. M., E. Szeto, K. Palaniappan, Y. Grechkin, K. Chu, I. M. Chen, I. Dubchak, I. Anderson, A. Lykidis, K. Mavromatis, N. N. Ivanova, and N. C. Kyrpides. 2008. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions. Nucleic Acids Res. 36:D528-33.

34. Miesel, L., A. Segall, and J. R. Roth. 1994. Construction of chromosomal rearrangements in Salmonella by transduction: inversions of non-permissive segments are not lethal. Genetics 137:919-32.

35. Moore, L. W., and G. Warren. 1979. Agrobacterium radiobacter strain K84 and biological control of crown gall. Annu. Rev. Phytopathol. 17:163-179.

36. Moreno, E., A. Cloeckaert, and I. Moriyon. 2002. Brucella evolution and taxonomy. Vet. Microbiol. 90:209-27.

37. Paulsen, I. T., R. Seshadri, K. E. Nelson, J. A. Eisen, J. F. Heidelberg, T. D. Read, R. J. Dodson, L. Umayam, L. M. Brinkac, M. J. Beanan, S. C. Daugherty, R. T. Deboy, A. S. Durkin, J. F. Kolonay, R. Madupu, W. C. Nelson, B. Ayodeji, M. Kraul, J. Shetty, J. Malek, S. E. Van Aken, S. Riedmuller, H. Tettelin, S. R. Gill, O. White, S. L. Salzberg, D. L. Hoover, L. E. Lindler, S. M. Halling, S. M. Boyle, and C. M. Fraser. 2002. The Brucella suis genome reveals fundamental similarities between animal and plant pathogens and symbionts. Proc. Natl. Acad. Sci. U. S. A. 99:13148-13153.

38. Peterson, J. D., L. A. Umayam, T. Dickinson, E. K. Hickey, and O. White. 2001. The Comprehensive Microbial Resource. Nucleic Acids Res. 29:123-5.


39. Rocha, E. P. 2006. Inference and analysis of the relative stability of bacterial chromosomes. Mol. Biol. Evol. 23:513-22.

40. Setubal, J. C. 2008. JCSlab Genome Databases. http://agro.vbi.vt.edu/public [Online.]

41. Setubal, J. C., D. Wood, T. Burr, S. Farrand, B. Goldman, B. Goodner, L. Otten, and S. Slater. 2009. The Genomics of Agrobacterium: Insights into Pathogenicity, Biocontrol, and Evolution, p. 91-112. In R. Jackson (ed.), Plant Pathogenic Bacteria: Genomics and Molecular Biology. Caister Academic Press, Norfolk, UK.

42. Slater, S., J. C. Setubal, B. Goodner, Y. Zhou, K. Houmiel, J. Sun, B. S. Goldman, S. K. Farrand, W. M. Huang, S. Casjens, R. Kaul, Q. Chen, T. Burr, E. Nester, R. Kadoi, T. Ostheimer, N. Pride, A. Sabo, E. Henry, E. Telepak, L. Wilson, A. Harkleroad, and D. Wood. Submitted. Evolution and distribution of linear chromosomes in plant symbionts of the Rhizobiaceae. BMC Genomics.

43. Stamatakis, A., T. Ludwig, and H. Meier. 2005. RAxML-III: a fast program for maximum likelihood-based inference of large phylogenetic trees. Bioinformatics 21:456-63.

44. Tamura, K., J. Dudley, M. Nei, and S. Kumar. 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24:1596-9.

45. Tian, Y., and A. W. Dickerman. 2007. GeneTrees: a phylogenomics resource for prokaryotes. Nucleic Acids Res. 35:D328-31.


46. Wattam, A. R., K. P. Williams, E. E. Snyder, N. F. Almeida Jr., M. Shukla, A. W. Dickerman, O. R. Crasta, R. Kenyon, J. Lu, J. M. Shallom, H. Yoo, T. A. Ficht, R. M. Tsolis, C. Munk, R. Tapia, C. S. Han, J. C. Detter, D. Bruce, T. S. Brettin, B. W. Sobral, S. M. Boyle, and J. C. Setubal. 2009. Analysis of ten Brucella genomes reveals evidence for horizontal gene transfer despite a preferred intracellular lifestyle. J. Bacteriol. (submitted)

47. Weaver, K. E. 2007. Emerging plasmid-encoded antisense RNA regulated systems. Curr. Opin. Microbiol. 10:110-6.

48. Williams, K. P., B. W. Sobral, and A. W. Dickerman. 2007. A robust species tree for the alphaproteobacteria. J. Bacteriol. 189:4578-86.

49. Wong, K., and G. B. Golding. 2003. A phylogenetic analysis of the pSymB replicon from the Sinorhizobium meliloti genome reveals a complex evolutionary history. Can. J. Microbiol. 49:269-80.

50. Wood, D. W. 2008. Agrobacterium.org: An online resource for the Agrobacterium research community. http://www.agrobacterium.org [Online.]

51. Wood, D. W., J. C. Setubal, R. Kaul, D. E. Monks, J. P. Kitajima, V. K. Okura, Y. Zhou, L. Chen, G. E. Wood, N. F. Almeida, L. Woo, Y. C. Chen, I. T. Paulsen, J. A. Eisen, P. D. Karp, D. Bovee, P. Chapman, J. Clendenning, G. Deatherage, W. Gillet, C. Grant, T. Kutyavin, R. Levy, M. J. Li, E. McClelland, A. Palmieri, C. Raymond, G. Rouse, C. Saenphimmachak, Z. N. Wu, P. Romero, D. Gordon, S. P. Zhang, H. Y. Yoo, Y. M. Tao, P. Biddle, M. Jung, W. Krespan, M. Perry, B. Gordon-Kamm, L. Liao, S. Kim, C. Hendrick, Z. Y. Zhao, M. Dolan, F. Chumley, S. V. Tingey, J. F. Tomb, M. P. Gordon, M. V. Olson, and E. W. Nester. 2001. The genome of the natural genetic engineer Agrobacterium tumefaciens C58. Science 294:2317-2323.

52. Young, J. M. 2008. Agrobacterium: Taxonomy of plant-pathogenic Rhizobium species, p. 184-220. In T. Tzfira and V. Citovsky (ed.), Agrobacterium: From biology to biotechnology. Springer, New York.

53. Young, J. P., L. C. Crossman, A. W. Johnston, N. R. Thomson, Z. F. Ghazoui, K. H. Hull, M. Wexler, A. R. Curson, J. D. Todd, P. S. Poole, T. H. Mauchline, A. K. East, M. A. Quail, C. Churcher, C. Arrowsmith, I. Cherevach, T. Chillingworth, K. Clarke, A. Cronin, P. Davis, A. Fraser, Z. Hance, H. Hauser, K. Jagels, S. Moule, K. Mungall, H. Norbertczak, E. Rabbinowitsch, M. Sanders, M. Simmonds, S. Whitehead, and J. Parkhill. 2006. The genome of Rhizobium leguminosarum has recognizable core and accessory components. Genome Biol. 7:R34.


FIGURE LEGENDS

Figure 1. Phylogenetic tree relating 19 genomes in the Rhizobiales. The tree was inferred from 119,758 aligned protein positions from 509 genes located strictly on the primary chromosome in each genome. Bootstrap support was 100% for all nodes except that linking Bradyrhizobium and Nitrobacter, which was 98 out of 100.

Figure 2. Phylogenetic analysis of RepC proteins among the Rhizobiaceae. Each organism name is followed by the NCBI Gene Identification Number. Red indicates membership in the Rhizobiales, purple Sphingomonadales, blue Rhodospirillales, green Rhodobacterales, and orange Caulobacterales.

Figure 3. Gene conservation among replicons of the Rhizobiales. The graphic depicts ortholog gene alignments, shown from the outer circle moving inward: Sinorhizobium meliloti 1021 (NC_003047.1), Rhizobium leguminosarum bv. viciae 3841 (NC_008380.1), Rhizobium etli CFN42 (NC_007761.1), K84, S4, C58, Ochrobactrum anthropi ATCC 49188 (NC_009668.1), and Brucella suis 1330 (NC_004310.3). Top: the alignment is anchored by C58 chromosome I; bottom: the alignment is anchored by C58 chromosome II. The anchor replicon itself is represented by the circle bordered by scales with marks every 1/8 of its total size. Each gene is colored according to its replicon of origin: blue for primary chromosome, green for secondary chromosomes (including the K84 2.65 Mb replicon), and orange for plasmids. Note that in all circles except the anchor, the location of a gene in the figure is not tied to its physical position in that genome. At higher resolution (40) it is possible to see that many genes in the non-anchor circles occur consecutively in their respective replicons, thus representing syntenic blocks or clusters. The positions of clusters that occur in C58 and are listed in supplementary Tables S9 and S13-S15 (51) are indicated by outermost arc sections painted in black. Each such arc is labeled as Sx-y, where x is the supplementary table number and y is the order of the cluster in the table. The top alignment is predominantly blue, suggesting the high degree of conservation among Rhizobiales primary chromosomes. The bottom alignment is a mixture of blue, green, and orange, suggesting the mosaic nature of chromosome II and hinting at the various genomic transfers hypothesized to have taken place, as explained in the text.

Figure 4. Reconstruction of the origin of secondary chromosomes and related large replicons within the Rhizobiales through transfers of gene clusters from the primordial chromosome to what originally was a repABC-type plasmid (called here the Intragenomic Translocation Recipient or ITR plasmid).

Figure 5. Key gene clusters present on the ITR plasmid progenitor of chromosome II and related large replicons during evolution of the Rhizobiales. C58 is the reference, and its genes are represented as arrows consistent with the strand they are found on in the deposited genome sequence. Genes for the other genomes were aligned with the C58 genes and are represented with circles or squares. Circles/squares are connected with lines when corresponding genes are consecutive. A circle (black or gray) means that the represented gene is in a secondary chromosome or plasmid; a square (black or gray) means that the represented gene is in the primary chromosome. A black circle or square means that the alignment to the C58 ortholog covered 80% or more of both genes; a gray circle or square means the alignment covered less than 80%. Gene numbering is shown for C58, S4, K84, R. etli CFN42, R. leguminosarum bv. viciae 3841, S. meliloti 1021, B. suis 1330, and O. anthropi ATCC 49188.
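The marker encoding described in the Figure 5 legend can be written as a small rule. The Python sketch below is illustrative only: the `OrthologHit` class, its field names, and the representation of alignment coverage as fractions between 0 and 1 are assumptions made here and are not taken from the authors' analysis pipeline. It maps the replicon type to the marker shape and the alignment coverage to the marker shade, using the 80% threshold stated in the legend.

```python
from dataclasses import dataclass

# Threshold from the Figure 5 legend: 80% or more of both genes aligned.
COVERAGE_THRESHOLD = 0.80


@dataclass(frozen=True)
class OrthologHit:
    """Hypothetical record for one gene aligned to its C58 ortholog."""
    on_primary_chromosome: bool   # False means secondary chromosome or plasmid
    coverage_c58: float           # fraction of the C58 gene covered (assumed 0..1)
    coverage_other: float         # fraction of the aligned gene covered (assumed 0..1)


def marker(hit: OrthologHit) -> tuple:
    """Return (shape, shade) following the legend's encoding."""
    # Shape encodes the replicon type.
    shape = "square" if hit.on_primary_chromosome else "circle"
    # Shade encodes whether the alignment covers >= 80% of BOTH genes.
    covers_both = min(hit.coverage_c58, hit.coverage_other) >= COVERAGE_THRESHOLD
    shade = "black" if covers_both else "gray"
    return shape, shade


if __name__ == "__main__":
    print(marker(OrthologHit(True, 0.95, 0.90)))
    print(marker(OrthologHit(False, 0.95, 0.60)))
```

Taking the minimum of the two coverage values is one way to express "of both genes"; a gene on a plasmid with a well-covered C58 side but a poorly covered partner is therefore drawn as a gray circle.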


Organism                   A. tumefaciens C58   A. radiobacter K84   A. vitis S4
biovariant                 Biovar I             Biovar II            Biovar III
genome size (bp)           5,674,260            7,273,300            6,320,946
%GC content                59.0                 59.9                 57.5
chromosomes (a)            2                    1                    2
plasmids                   2                    4 (b)                5
protein coding genes
  total                    5,385                6,752                5,479
  functionality assigned   3,516                5,099                3,897
  conserved hypothetical   1,287                1,201                1,282
  hypothetical             582                  452                  300
pseudogenes                28                   68                   90
RNA genes
  rRNA operons             4                    3                    4
  tRNAs                    56                   51                   54
  other RNAs (c)           26                   23                   30
genomic islands
  total                    38                   59                   20
  average size (kb)        23.3                 28.2                 33.0

(a) A chromosome is defined here as a replicon harboring rRNA operons and essential genes.

(b) The 2.65 Mb replicon, which does not meet our definition of a chromosome, is included here (see Results).

(c) Includes tmRNAs, SRP RNAs, suhB, riboswitches, and miscellaneous features.

Table 1. Summary of genome features from sequenced Agrobacterium strains.
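As a quick arithmetic check on Table 1, the three annotation categories (functionality assigned, conserved hypothetical, hypothetical) should add up to the total number of protein coding genes for each strain. The Python sketch below uses values copied from the table; the dictionary layout and the function names are illustrative assumptions, not part of the original analysis.

```python
# Protein coding gene counts copied from Table 1.
# The dictionary layout and key names are illustrative only.
TABLE1 = {
    "C58": {"total": 5385, "assigned": 3516, "conserved_hypothetical": 1287, "hypothetical": 582},
    "K84": {"total": 6752, "assigned": 5099, "conserved_hypothetical": 1201, "hypothetical": 452},
    "S4":  {"total": 5479, "assigned": 3897, "conserved_hypothetical": 1282, "hypothetical": 300},
}


def category_sum(strain: str) -> int:
    """Sum the three annotation categories for one strain."""
    row = TABLE1[strain]
    return row["assigned"] + row["conserved_hypothetical"] + row["hypothetical"]


def fraction_assigned(strain: str) -> float:
    """Fraction of protein coding genes with an assigned function."""
    row = TABLE1[strain]
    return row["assigned"] / row["total"]


if __name__ == "__main__":
    for strain, row in TABLE1.items():
        consistent = category_sum(strain) == row["total"]
        print(f"{strain}: categories sum to total: {consistent}; "
              f"fraction assigned: {fraction_assigned(strain):.3f}")
```

The sums match the totals for all three strains, which also indicates that pseudogenes are counted separately from the protein coding gene total.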


[Figure 1 image not recovered. Tip labels of the tree: Parvibaculum lavamentivorans DS-1; Xanthobacter autotrophicus Py2; Azorhizobium caulinodans ORS 571; Rhodopseudomonas palustris BisA53; Bradyrhizobium japonicum USDA 110; Nitrobacter hamburgensis X14; Aurantimonas sp. SI85-9A1; Mesorhizobium loti MAFF303099; Mesorhizobium sp. BNC1; Bartonella quintana str. Toulouse; Ochrobactrum anthropi ATCC 49188; Brucella abortus bv. 1 str. 9-941; Brucella melitensis 16M; Sinorhizobium meliloti 1021; Agrobacterium vitis S4; Agrobacterium tumefaciens str. C58; Agrobacterium radiobacter K84; Rhizobium leguminosarum bv. viciae; Rhizobium etli CFN 42. Scale bar: 0.1.]


[Figure: image not recovered. Its labels pair a taxon or an Agrobacterium locus tag with a numeric sequence identifier or a replicon name, for example "R. etli 86360943", "Avi8201 p259", "Atu3922 linear", and "Arad7003 p2651". Taxa named in the labels include A. rhizogenes, A. tumefaciens, S. medicae, S. meliloti, R. etli, R. leguminosarum, R. tropici, R. sp NGR234, Mesorhizobium, Brucella, Xanthobacter, Nitrobacter, Bradyrhizobium, R. sphaeroides, Oceanicola, Roseovarius, Sulfitobacter, Acidiphilium, Paracoccus, Sphingomonas, Ruegeria, Roseobacter, Caulobacter, Gluconobacter, Oligotropha, and Rhodobacterales.]

[Figure: image not recovered. It shows C58 Chromosome I and C58 Chromosome II, with labels S3-1 to S3-4, S7-1, S7-2, S8-1, and S9-1 to S9-19.]

[Figure: diagram not recovered; its text labels follow.]

Diverged before ITR plasmid entry OR loss of ITR plasmid (e.g., Azorhizobium, Xanthobacter)

Integration of ITR plasmid into chromosome (e.g., Mesorhizobium, Bradyrhizobium)

Loss of ITR plasmid (e.g., Bartonella, Parvibaculum)

Unichromosome Rhizobiales ancestor with Intragenome Transfer Recipient (ITR) plasmid

ITR with 3 key gene clusters (Figure 4, Supplementary Table S3)

ITR with 4 key gene clusters (Figure 4, Supplementary Table S3)

Unichromosome Rhizobiaceae ancestor with ITR plasmid

Unichromosome Agrobacterium/Rhizobium ancestor with ITR plasmid

Bichromosomal Biovar 1/Biovar 3 ancestor

Transfer of pca gene cluster

25 shared cluster transfers by Brucella-Ochrobactrum ancestor (Supplementary Table S4)

20 shared cluster transfers (Supplementary Table S5)

1 shared cluster transfer

3 cluster transfers (Supplementary Table S6)

2 shared cluster transfers (Supplementary Table S7)

Bv2-specific cluster transfers (Supplementary Table S10)

1 shared cluster transfer + LGT (Supplementary Table S8)

Bv3-specific cluster transfers

Bv1-specific cluster transfers (Supplementary Table S9)

Ancestral bichromosome Brucella

Ancestral bichromosome Ochrobactrum

Ancestral Sinorhizobium with ITR plasmid = pSymB

Ancestral Biovar 2

Ancestral bichromosomal Biovar 3

Ancestral bichromosomal Biovar 1

[Figure: image not recovered. Locus tags for four gene clusters, listed by organism and replicon.]

pcaGHCD,Q,R,IJF
  Agro C58 cII: Atu4538-4549
  Agro vitis S4 cII: Avi6130-6117
  Agro K84 2651r: Arad9505-9490
  R.etli p42e: PE00055-60, CH03444-3, PE00200-00203
  R.leg pRL11: pRL110085-90, RL3905-4, pRL110296-89
  S.meli pSymB: SMb20575-80, SMb20583, SMb20586-20589
  B.suis 1330 cII: Bra0647-0636
  O.anthropi cII: Oant_3729-3718

repABC
  Agro C58 cII: Atu3922-3924
  Agro vitis S4 cII: Avi5002-5000
  Agro K84 2651r: Arad7003-7000
  R.etli p42e: PE00459-457
  R.leg pRL11: pRL110003-01
  S.meli pSymB: SMb20044-20046
  B.suis 1330 cII: Bra0001, 1203-02
  O.anthropi cII: Oant_4454-4456

minCDE
  Agro C58 cII: Atu3247-3249
  Agro vitis S4 cI: Avi3506-3508
  Agro K84 2651r: Arad8858-8856
  R.etli p42e: PE00407-00409
  R.leg pRL11: pRL110544-0546
  S.meli pSymB: SMb21522-21524
  B.suis 1330 cII: Bra0323-0321
  O.anthropi cII: Oant_2972-2970

hutIHGU
  Agro C58 cII: Atu3931-3936
  Agro vitis S4 cII: Avi5956-60,63
  Agro K84 cI: Arad4562-4566
  R.etli p42e: PE00070-00075
  R.leg pRL11: pRL110203-0208
  S.meli pSymB: SMb21163-21166, SMc00673, SMb20048
  B.suis 1330 cII: Bra0932-0927
  O.anthropi cI: Oant_1433-1438
