
This supplement contains: Supplemental Results: Analysis ......details is presented in Table S1)....

Date post: 04-Jun-2020

This supplement contains:

Supplemental Results:

    Analysis of enzymes related to wood degradation

    Identification of antifungal targets using comparative genomics

    Virulence of a GFP transformant of C. gattii strain WM276

    Comparisons of mitochondrial genomes of Cryptococcus species

    Analysis of gene family evolution

Materials and Methods

Supplemental figures with legends

Phylogenetic trees in text notation

Listing of supplemental tables

Data Set S1 description

References (in this supplement)

SUPPLEMENTAL RESULTS

Analysis of enzymes related to wood degradation

Given the consistent association of C. gattii with decaying wood and eucalyptus trees (1), we

looked for genes in C. gattii that facilitate wood rot as in the fungus P. chrysosporium (2). Although

the C. gattii strains lacked the lignin modifying enzymes typically associated with wood-rotting fungi

(lignin/manganese peroxidase), we found auxiliary enzymes belonging to the glyoxal oxidase family (3

genes), the cellulase family (1 gene), and the multi-copper oxidase/laccase family (5 genes). This

incomplete system of ligninolytic enzymes in C. gattii suggests a possible dependence on partial

degradation of lignin by other coexisting fungi/bacteria.

Identification of antifungal targets using comparative genomics

Comparative genomics offers the promise of novel antifungal drug target discovery, by

assisting in the identification of fungal-specific gene products that are absent from mammalian

genomes. To identify antifungal drug targets in the C. gattii strains WM276 and R265, predicted

polypeptide sequences were compared to those in other Cryptococcus strains (H99, JEC21 and

B3501A), four basidiomycetous fungi (M. globosa, C. cinereus, P. chrysosporium and U. maydis) and

S. cerevisiae. We collated proteins with highly similar homologs in all genomes within our dataset

(BLASTP e-value < 1E-45) (3), and having no significant sequence similarity (best BLASTP e-value >

1E-10) to proteins in either humans or mice (proteins downloaded from Ensembl BioMart, release 57;

human reference sequence (GRCh37), Mus musculus genes NCBIM37). Data Set S1 tab T and tab U

(see supplemental materials) respectively show lists of WM276 orthologs (116) and R265 orthologs

(127) that are potential antifungal targets. These lists include receptors for a variety of small

molecules, as well as proteins with enzymatic functions, suggesting that these targets could be

harnessed in the design of broad-spectrum small molecule fungal growth inhibitors. The finding of a

1,3-beta-glucan synthase (WM276 ortholog: CGB_N3410W, R265 ortholog: CNBG_4964), a target

for the cell wall agents called echinocandins (4), validates our computational approach to the whole-genome

identification of antifungal target proteins. The drug targets identified here may be just as

appropriate for animal basidiomycete pathogens like M. globosa as for the development of novel

agricultural fungicides against plant basidiomycete pathogens. We recognize that our target discovery

strategy may have missed some target genes due to the high stringency requirement of protein

similarity in our intrafungal proteome comparisons.
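The two-sided filter described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the input dictionaries and the example e-values are hypothetical, and only the two thresholds (1E-45 and 1E-10) come from the text.

```python
# Thresholds quoted in the text above.
FUNGAL_MAX_EVALUE = 1e-45  # a homolog must score better than this in every fungal genome
HOST_MIN_EVALUE = 1e-10    # the best human/mouse hit must be weaker than this


def is_candidate_target(fungal_best, host_best, fungal_genomes):
    """Return True if a protein passes both filters (hypothetical helper).

    fungal_best: dict mapping genome name -> best BLASTP e-value (missing = no hit)
    host_best:   best e-value against human or mouse proteins, or None if no hit
    """
    conserved = all(
        genome in fungal_best and fungal_best[genome] < FUNGAL_MAX_EVALUE
        for genome in fungal_genomes
    )
    absent_from_host = host_best is None or host_best > HOST_MIN_EVALUE
    return conserved and absent_from_host


# Hypothetical example values, not taken from Data Set S1.
genomes = ["R265", "H99", "JEC21"]
hits = {"R265": 1e-120, "H99": 1e-110, "JEC21": 1e-115}
print(is_candidate_target(hits, None, genomes))   # conserved, no host hit
print(is_candidate_target(hits, 1e-30, genomes))  # conserved, but similar to a host protein
```

Requiring a strong hit in every fungal genome is what makes the screen stringent, which is why some genuine targets may be missed.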

Virulence of a GFP transformant of C. gattii strain WM276

The WM276gfp2 strain was constructed as part of an experiment designed to monitor green

fluorescent protein expression from an iron-regulated promoter fused to gfp in infected mice. A

number of transformants were tested for virulence and gfp expression and, fortuitously, the

WM276gfp2 strain was found to not cause disease (data not shown). Subsequent analysis by

electrophoretic karyotyping and CGH revealed a change in chromosome 11 in this strain.

Comparisons of mitochondrial genomes of Cryptococcus species

The outbreak of cryptococcosis on Vancouver Island has been attributed to an enhanced

intracellular proliferation rate driven by the up-regulation of several genes, including many encoded by

the mitochondrial genome or associated with mitochondrial activities (5). To survey mitochondrial

gene content for all of the sequenced Cryptococcus genomes, we compared the mitochondrial genome

sequences for the two C. gattii strains, and the C. neoformans strains B3501A/JEC21 (var.

neoformans) and H99 (var. grubii) (see Data Set S1, tab E, TextS2, TextS3 and Materials and Methods

below). Our results indicated that the C. gattii mitochondrial genomes were similar in size to the C.

neoformans var. neoformans mitochondrial genomes (WM276: 34,342 bp, R265: 34,790 bp, B3501A

and JEC21: 33,199 bp), but differed from the C. neoformans var. grubii strain H99 (24,874 bp).

Although we expected gene content differences, given the size differences (especially for strain H99),

comparisons by BLAST (3) revealed that the C. gattii and C. neoformans var. neoformans genomes

were syntenic and similar in gene content to the H99 mitochondrial genome. This finding is consistent

with previous examinations of mitochondrial genomes of C. neoformans strains that revealed

substantial inter- and intra-serotype size differences; these were mainly attributed to size differences

for introns in some genes and between intergenic regions (6-8). No definitive role in virulence has

been attributed to mitochondrial genomes from the C. neoformans varieties (9), and it has been

suggested that virulence may in part be due to changes in nuclear-encoded proteins that affect

mitochondrial morphology and gene expression (8).

Analysis of gene family evolution

Fungal genome sequencing provides an opportunity to conduct an inventory of multi-gene

families as an initial step in assessing their role in host colonization and virulence. We used the

Markov Cluster (MCL) algorithm (run independently from OrthoMCL) for accurate assignment of

proteins into distinct multi-gene families based on pre-computed sequence similarity information (10).

Each predicted polypeptide sequence from all five annotated Cryptococcus genomes included in the

dataset was compared to every sequence in the set. Sequences were clustered into a family with

default OrthoMCL criteria (BLASTp alignment (3) E-value < 1E-5) and after the application of a

cluster correction filter implemented in MCL (see Materials and Methods below). When proteins from

the five Cryptococcal genomes were clustered in this manner, 19% of the WM276 ORFs belonged to

families with at least two paralogs (17.4% in R265). From a total of 32,808 genes among the five

Cryptococcus strains, this approach yielded 5,742 clusters, of which 4,462 clusters had representative

genes from all five strains - the core Cryptococcus gene set. The largest gene families encoded protein

kinases (76 genes), sugar transporters (43 genes), DEAD-box proteins (24 genes), and Ras proteins (24

genes) (see Data Set S1, tab F, in supplemental material). We found 296 genes that were specific to C.

gattii (WM276: 222 genes, R265: 74 genes). Most of the genes were annotated as hypothetical

proteins, but the R265 specific set included genes encoding a transmembrane protein 167A, a

deoxyribose-phosphate aldolase, and an arsenic resistance protein. Further analyses of the species-

specific genes will be needed to explore functions relevant to ecological adaptation and virulence.

Among the 5,741 gene clusters, 1,789 were identified as having a size differential across the

five genomes. To examine gene family evolution across entire genomes in our dataset, we employed

the algorithm CAFE (11, 12), which takes as input a phylogenetic tree of the five Cryptococcus

genomes (described above) and a matrix of gene family sizes for each strain derived from the MCL

output (see Materials and Methods below). We carried out phylogenetic analysis on the 1,789 gene

family clusters, out of which 46 families had statistically significant size variance. This revealed four

families, lipoprotein, nucleotidyltransferase, Pfam-B_209, and Transposase_21 (see Data Set S1, tab

G, in supplemental material) significantly contracted (P < 0.05) in C. gattii strains versus C.

neoformans strains, and none that were significantly expanded. The biological significance of

contraction of these families in the C. gattii lineage remains to be determined.

To identify statistically significant variances in gene family sizes across ten fungal genomes

(including the four additional basidiomycetes) used in this study, 14,376 gene families were identified

from a total of 73,740 genes among the ten taxa. Phylogenetic analysis was carried out with 13,454

clusters having differential gene counts, out of which 264 clusters had statistically significant size

variance. This revealed genes from 108 clusters that occur in at least one Cryptococcus species but

were absent in the other basidiomycetes and S. cerevisiae; these included the NAD-binding Rossmann

fold and NADH-Ubiquinone/plastoquinone (complex I) oxidoreductase families (see Data Set S1, tab

H, in supplemental material). Moreover, an amino acid transporter gene family was significantly

expanded in the pathogenic basidiomycetes (gene count in WM276: 6, R265: 4, H99: 3, B3501A: 4,

JEC21: 4, M. globosa: 1, U. maydis: 4, P. chrysosporium: 2) relative to the non-pathogens in our

dataset (C. cinereus: 0, S. cerevisiae: 0).

MATERIALS AND METHODS

Strains

Strain | Species | Origin | Capsular serotype | Molecular type | Source

WM276 | C. gattii | Environmental, Australia | B | VGI | Wieland Meyer

R794 | C. gattii | Clinical, Vancouver Island, BC | B | VGI | Karen Bartlett

KB3864 | C. gattii | Environmental, Vancouver Island, BC | B | VGI | Karen Bartlett

R265 | C. gattii | Clinical, Vancouver Island, BC | B | VGIIa | Karen Bartlett

R272 | C. gattii | Clinical, Vancouver Island, BC | B | VGIIb | Karen Bartlett

RB28 | C. gattii | Environmental, Vancouver Island, BC | B | VGIIb | Karen Bartlett

R1412 (B5765) | C. gattii | Environmental, India | B | VGI | June Kwon-Chung

R1413 (B-5788) | C. gattii | Environmental, India | B | VGI | June Kwon-Chung

R-1346 (RB-13) | C. gattii | Environmental, Canada | B | VGII | June Kwon-Chung

R-1347 (RB-14) | C. gattii | Environmental, Canada | B | VGII | June Kwon-Chung

R-1402 (VPB571-058) | C. gattii | Australia | B | VGII | June Kwon-Chung/Wieland Meyer

R-1401 (RAM-15) | C. gattii | Australia | B | VGII | June Kwon-Chung/Wieland Meyer

B3501A | C. neoformans var. neoformans | F1 progeny, USA | D | VNIV | June Kwon-Chung

H99 | C. neoformans var. grubii | Clinical, USA | A | VNI | Joseph Heitman

Genome sequencing

For WM276, the whole-genome shotgun sequence assembler ARACHNE was used to assemble

reads generated by Sanger sequencing chemistry from plasmid (2.2, 2.3, 13 kb libraries), fosmid and

BAC clones (see Table S1 in supplemental materials) (13). A high level of completion was achieved

for WM276 through several rounds of finishing, and the contiguity of the whole-genome shotgun

sequence (WGS) assembly was improved using information from Bacterial Artificial Chromosome

(BAC)-based physical maps (14, 15). The WM276 strain was sequenced at 5-6X coverage and

assembled into 14 chromosomes with a combined size of 18.4 Mb (a summary of the sequencing

details is presented in Table S1). Automated annotation of the WM276 genome was performed with

an in-house genome annotation algorithm called Pegasys (16), based on comparisons with annotated

gene models and ESTs for the C. neoformans strain JEC21, followed by manual curation facilitated by

the genome browser and editor Apollo (17). Whole-genome sequences and manual annotations of each

chromosome have been submitted to GenBank (accessions: chr1: CP000286, chr2: CP000287, chr3:

CP000288, chr4: CP000289, chr5: CP000290, chr6: CP000291, chr7: CP000292, chr8: CP000293,

chr9: CP000294, chr10: CP000295, chr11: CP000296, chr12: CP000297, chr13: CP000298, chr14:

CP000299). Note that while chromosomes are denoted by numerals throughout the manuscript, we

used the corresponding letters in WM276 gene names for clarity.

The shotgun sequence of the genome of strain R265 was obtained using Sanger paired-end

sequences of 4 kb and 10 kb plasmid libraries, and assembled with ARACHNE (13). The resulting

6.5X depth assembly consisted of 139 supercontigs totaling 17.6 Mb with N50 of 1.1 Mb, and 968

contigs totaling 17.5 Mb with N50 of 426 kb. The R265 genes were annotated primarily by

transferring annotations from WM276 and also calling a small number of novel genes. R265 was

aligned to WM276 using PatternHunter (18) and gene calls in aligned blocks were mapped from

WM276 to R265 using an in-house mapping program. To call genes specific to R265, candidate gene

structures were identified using GENEID (19), FGENESH (20), and GLEAN (21), and the resulting

108 genes were supported by predictions of a PFAM protein domain or alignment with an EST

sequence. The assembled R265 genome and annotations were submitted to GenBank (project

accession: AAFP01000000). The mitochondrial genome for R265 was assembled as a single scaffold

with one contig, totaling 34,790 bp. All genes are present in this contig; however, NADH4L is

interrupted by in-frame stop codons due to sequencing errors. The R265 nuclear and mitochondrial

genomic sequences are available at

http://www.broadinstitute.org/annotation/genome/cryptococcus_neoformans_b/MultiDownloads.html.

Data Sources

The C. neoformans var. grubii H99 polypeptide sequences and mitochondrial genomic sequence

are available at the following website:

http://www.broadinstitute.org/annotation/genome/cryptococcus_neoformans/MultiDownloads.html

The C. neoformans var. neoformans JEC21 polypeptide sequences are available at:

http://www.ncbi.nlm.nih.gov/genomeprj/10698

The C. neoformans var. neoformans B3501A polypeptide sequences and mitochondrial genomic

sequence are available at:

http://www-sequence.stanford.edu/group/C.neoformans/files/cneo040623.cds.pep.gz and

http://www-sequence.stanford.edu/group/C.neoformans/files/cnmito.fa.gz

The C. gattii R265 polypeptide sequences and mitochondrial genomic sequence are available at the following website:

http://www.broadinstitute.org/annotation/genome/cryptococcus_neoformans_b/MultiDownloads.html

Annotated genomic data used in this work were downloaded from the following sources:

Saccharomyces cerevisiae annotated proteins:

http://downloads.yeastgenome.org/sequence/genomic_sequence/orf_protein/ (January 5, 2010 release)

Ustilago maydis annotated proteins: ftp://ftp.ncbi.nih.gov/genomes/Fungi/Ustilago_maydis/ (July 25, 2007 release)

ftp://ftpmips.gsf.de/ustilago/Umaydis_valid/Umaydis_valid_orf_prot_121009 (October 12, 2009)

Phanerochaete chrysosporium annotated proteins:

http://genome.jgi-psf.org/Phchr1/Phchr1.download.ftp.html (February 2005 release)

Coprinus cinereus (C. cinereus okayama7#130) annotated proteins:

http://www.broadinstitute.org/annotation/genome/coprinus_cinereus/MultiDownloads.html (January

13, 2009 release)

Malassezia globosa CBS 7966 annotated proteins: (http://www.ncbi.nlm.nih.gov/sites/entrez; Refseq

id NZ_AAYY00000000)

For consistency, mitochondrial genes were removed from the dataset.

Software

BLAST searches were carried out using blast-2.2.22 from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/2.2.22/

MCL: http://micans.org/mcl/mcl-02-063/

trimAl v1.2: http://trimal.cgenomics.org/downloads

Muscle3.6: http://www.drive5.com/muscle/downloads.htm

PATHd8: http://www2.math.su.se/PATHd8/

Seaview4: http://pbil.univ-lyon1.fr/software/seaview.html

Seaview was used because PhyML3.0 by itself generates unrooted trees, but Seaview has an option to

obtain rooted trees. Custom software implemented in Python was also used to carry out the analysis.

Genomic synteny analysis was carried out with Cross_Match (http://www.phrap.org), and the

alignments were visualized with XMatchView

(http://www.bcgsc.ca/platform/bioinfo/software/xmatchview). Alignments of the AGO1, AGO2 and

CPR2 loci between C. gattii strains WM276 and R265 were generated by running BLASTn followed

by the visualization tool Artemis (22).

PCR assay

After two days of growth on YPD plates, C. gattii strains (see Table S2 in supplemental materials)

were collected directly for genomic DNA isolation using the MasterPure™ Yeast DNA purification kit

(Epicentre Biotechnologies).

PCR primers were selected from the conserved DNA region between strains R265 and WM276. These

included:

CPR2 (CGB_A1720W): (PCR1_F: ATACTTCTGTCTTTTGCA, PCR1_R:

GAAAGAGTGAGGAACATGAGA)

CGB_A1730W (Hypothetical protein): (PCR2_F: GGCCATTTTCAACCTCATGT, PCR2_R:

GACGCACATGGAGTGCAG)

AGO2: (PCR3_F: TACTTCTCCAACTACGGCAAATG, PCR3_R:

GCTTTCCAGTACTTCCCAAATCT)

AGO1: (PCR4_F: ACTTTGCGGAGAGGAGTCAGTAT, PCR4_R:

GGAACACCGTACACACATAGACA)

PCR assays were conducted in a PTC-200 automated thermal cycler (MJ Research, Waltham,

MA). Genomic DNA (300 ng) was used in a 25 μl PCR reaction mixture containing 10 pmol of each

primer, 2 mM of each nucleotide (dATP, dCTP, dGTP and dTTP), 2.5 μl of 10x ExTaq buffer, 0.125 μl

of ExTaq polymerase (Takara) and an appropriate volume of distilled water. The following conditions

were used for the amplification: an initial 1 min of denaturation at 98°C, followed by 35 cycles of

denaturation for 15 sec at 98°C, annealing for 15 sec at 54°C, and extension for 1 min

(PCR1 and PCR2 reactions) or 4 min (PCR3 and PCR4 reactions) at 72°C. The amplification was

completed by a final extension period of 5 min at 72°C. Sterile water served as a negative control in each

assay. PCR products were analyzed on 1% agarose gels.

Identification and alignment of orthologs, and phylogenetic analysis of Cryptococcus strains and other fungal taxa

Different orthology data sets were generated including 10-way clusters among selected

basidiomycetes and S. cerevisiae, and five-way orthology clusters between the sequenced

Cryptococcus spp. For each data set, all predicted protein sequences from the appropriate genomes

were searched against each other with BLASTP (3) and clustered into orthologous groups using

OrthoMCL (23) with the default criteria (E-value <1E-5). Among the five Cryptococcus strains, 5171

single-copy orthologs were identified as the clusters with exactly one member per species. Multiple

sequence alignments were constructed with MUSCLE (24) and the alignments were trimmed using a

heuristic method implemented in trimAl (25), with the automated option that selects optimal

parameters to trim the input alignment. Alignments for all 5171 clusters were concatenated into a

single file containing 2817121 characters and converted to the PHYLIP format. Phylogenetic analysis of

the five Cryptococcus strains was performed using the maximum likelihood method of PhyML3.0 (26)

as implemented in Seaview4 (27). The JTT amino-acid substitution model (28) was used along with the

tree topology search operation that combines NNI and SPR moves, the proportion of invariable sites

and category of substitution rate were optimized by the program, and gaps were treated as unknown

characters. The starting tree to be refined by the maximum likelihood algorithm was a distance-based

BIONJ tree estimated by the program (26). Statistical support for phylogenetic grouping was assessed

by approximate likelihood-ratio tests based on a Shimodaira-Hasegawa-like procedure (SH-aLRT) (29)

and by bootstrap analysis (500 re-samplings). An ultrametric tree was generated using PATHd8 (30)

with the maximum likelihood tree as a starting point and fixing the age of the most recent common

ancestor (mrca) involved in the H99-C. gattii split at 34 Myr. This age was derived using the neutral

mutation rate of 2E-9 per nucleotide per year for protein coding genes (see calculation below). While

PATHd8 does not assume a molecular clock, it runs a clock test allowing for substitution rate variation

along all lineages. Molecular clock tests indicated that 3/4 nodes were rejected at a confidence level of

0.95.
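The selection of single-copy orthologs (clusters with exactly one member per strain) can be illustrated with a short Python sketch. The cluster representation, the function name and the example identifiers are assumptions made for illustration; the actual OrthoMCL output format and the custom scripts used in this work are not described here.

```python
def single_copy_clusters(clusters, strains):
    """Keep clusters with exactly one member from every strain and nothing else.

    clusters: dict mapping cluster id -> list of (strain, protein id) tuples
    strains:  iterable of strain names that must each appear exactly once
    """
    wanted = set(strains)
    selected = {}
    for cluster_id, members in clusters.items():
        present = [strain for strain, _ in members]
        # Same number of members as strains, and every strain represented.
        if len(present) == len(wanted) and set(present) == wanted:
            selected[cluster_id] = members
    return selected


# Hypothetical clusters; the protein ids are placeholders.
strains = ["WM276", "R265", "H99", "JEC21", "B3501A"]
clusters = {
    "c1": [(s, s + "_p1") for s in strains],                          # single copy
    "c2": [(s, s + "_p2") for s in strains] + [("WM276", "WM276_p3")],  # has a paralog
    "c3": [(s, s + "_p4") for s in strains[:4]],                      # missing in B3501A
}
print(sorted(single_copy_clusters(clusters, strains)))
```

Only clusters that pass this test would be aligned, trimmed and concatenated for tree building.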

The five Cryptococcus strains were also tested for phylogenetic relationships with additional

fungi, namely S. cerevisiae, U. maydis, C. cinereus, M. globosa, and P. chrysosporium. Single-copy

orthologs from OrthoMCL clusters were aligned with Muscle (24). Alignments for all 1519 clusters of

single copy orthologs derived from the ten fungal candidates were concatenated into a single file

containing 837857 characters, and converted to the PHYLIP format to construct phylogenetic trees as

described above with modification in Seaview4 to obtain a rooted tree. Divergence time estimation was

performed with the maximum likelihood tree as a starting point and calibrated based on a recent

estimate of ~500 million years of divergence between ascomycetous and basidiomycetous fungi (31)

(see below). Molecular clock tests indicated that 8/9 nodes were rejected at a confidence level of 0.95.

Cryptococcus taxa PATHd8 parameters:

Sequence length = 2817121 amino acids;

(((B3501A:0.00171,JEC21:0.00389):0.03707,H99:0.04223):0.02645,(R265:0.02734,WM276:0.02160):0.04264);

mrca: H99, WM276, fixage=34;

Average branch length between H99 and C. gattii strains (WM276 and R265) =

0.042 + 0.026 + 0.043 + (0.027 + 0.022)/2

= 0.1355

Applying a neutral mutation rate of 2E-9 per nucleotide per year for protein coding genes,

Divergence time between H99 and C. gattii strains (Myr) = (avg. branch length/2E-9)/2 = ~34 Myr
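The arithmetic behind the 34 Myr calibration can be checked with a few lines of Python. This is only a worked restatement of the calculation, using the rounded branch lengths quoted in the text; the variable names are illustrative. The stated total of 0.1355 is reproduced when the two C. gattii tip branches are averaged before being added to the rest of the path.

```python
# Rounded branch lengths on the path from H99 to the C. gattii strains.
h99_branch = 0.042       # H99 tip
internal_branch = 0.026  # branch joining the C. neoformans clade to the root
gattii_stem = 0.043      # stem of the C. gattii clade
r265_tip = 0.027
wm276_tip = 0.022

# Average the two C. gattii tips, then sum along the path.
avg_branch_length = (
    h99_branch + internal_branch + gattii_stem + (r265_tip + wm276_tip) / 2
)

# Neutral mutation rate from the text; dividing by 2 splits the path between both lineages.
rate_per_year = 2e-9
divergence_myr = (avg_branch_length / rate_per_year) / 2 / 1e6

print(round(avg_branch_length, 4), round(divergence_myr, 1))
```

The result is a little under 34 Myr, which matches the fixed age supplied to PATHd8.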

Ten taxa PATHd8 parameters:

Sequence length = 837857 amino acids;

(((Mglobosa:0.44018,Umaydis:0.32365):0.16255,((Pchrysospo:0.22060,Ccinereus:0.25953):0.20450,((WM276:0.01294,R265:0.01698):0.02215,(H99:0.02471,(B3501A:0.00093,JEC21:0.00196):0.02150):0.01596):0.41020):0.09215):0.16200,Scerevisiae:0.70276);

mrca: Scerevisiae, WM276, fixage=500;

MCL clustering of annotated proteins into families

We used the MCL algorithm to globally identify gene families in the fungal genomes in our

dataset. MCL detects proteins with very similar domain architectures rather than attempt to detect

each domain individually, thus accurately assigning proteins (even ones with different domain

structures) into distinct multi-gene families (10). The assumption is that proteins with near-identical

sets of domains may have very similar biochemical roles. Prior to clustering, we assembled a FASTA

file containing the sequences to be clustered that was then compared against itself using BLASTp (E-

value < 1E-5) (3). The all-against-all sequence similarities generated by this analysis were parsed and

stored in a square matrix that represents sequence similarities as a connection graph where nodes

represent proteins and weighted edges represent their relationships indicated by the average –log10(E-

value) for each pair of sequences. From the symmetric matrix generated, weights were transformed

into transition probabilities arranged in a Markov matrix that was then supplied to the MCL algorithm

(mcl-09-308). An inflation value parameter I = 1.5 was employed along with the --force-connected=y

option to break up spurious linkages in clusters, and the resultant matrix was interpreted as a protein

family clustering.
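The conversion of BLASTp E-values into the Markov matrix supplied to MCL can be sketched as follows. This is a minimal pure-Python illustration, not the mcl program or the authors' parser: the function names, the E-value floor for hits reported as 0, the self-loop weights and the example pairs are all assumptions.

```python
import math


def similarity_weights(pair_evalues, floor=1e-200):
    """Average -log10(E-value) over both BLAST directions for each protein pair."""
    totals, counts = {}, {}
    for (query, subject), evalue in pair_evalues.items():
        key = tuple(sorted((query, subject)))
        # The floor (an assumption) avoids log10(0) for E-values reported as 0.
        totals[key] = totals.get(key, 0.0) - math.log10(max(evalue, floor))
        counts[key] = counts.get(key, 0) + 1
    return {key: totals[key] / counts[key] for key in totals}


def markov_matrix(proteins, weights):
    """Column-normalize the symmetric weight matrix into transition probabilities."""
    index = {p: i for i, p in enumerate(proteins)}
    n = len(proteins)
    matrix = [[0.0] * n for _ in range(n)]
    for (a, b), w in weights.items():
        matrix[index[a]][index[b]] = w
        matrix[index[b]][index[a]] = w
    for i in range(n):
        # Self-loops are an assumption here; they keep isolated proteins well defined.
        matrix[i][i] = max(max(matrix[i]), 1.0)
    for j in range(n):
        col_sum = sum(matrix[i][j] for i in range(n))
        for i in range(n):
            matrix[i][j] /= col_sum
    return matrix


# Hypothetical all-against-all hits (query, subject) -> E-value.
pairs = {("A", "B"): 1e-50, ("B", "A"): 1e-40, ("B", "C"): 1e-6}
weights = similarity_weights(pairs)
matrix = markov_matrix(["A", "B", "C"], weights)
for row in matrix:
    print([round(x, 3) for x in row])
```

Each column of the result sums to 1, so it can be read as the probabilities of a random walk stepping from one protein to its neighbours; MCL then alternates expansion and inflation on such a matrix.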

Likelihood analysis of gene family expansion and contraction

Gene family size was determined for the following genomes in the Cryptococcus lineage:

WM276, R265, H99, JEC21, and B3501A. The MCL algorithm (described above) produced a matrix

indicating the numbers of genes in each family cluster for each strain/taxon in the dataset. Then, using

the phylogeny based on 5171 concatenated ortholog alignments (described above), gene family sizes

were associated with each taxon to determine the magnitude of gene family expansions and

contractions. The algorithm CAFE (Computational Analysis of gene Family Evolution), which uses a random

birth-and-death model of gene family evolution to assess gene gain and loss across a user-specified

phylogenetic tree, was used to detect significant gene family size changes between any two

lineages as defined in (11, 12). Taking the phylogenetic tree and a matrix of gene family sizes in

extant species as input, CAFE2.1 inferred the most likely gene family size at internal nodes, tested for

accelerated rates of gene family expansions or contractions and identified significantly evolving

branches at a user-specified P-value cut-off = 0.05.
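The family-size matrix passed to CAFE can be derived from cluster membership roughly as shown below. This sketch covers only the input preparation, not the CAFE likelihood model itself; the data layout, function names and example families are hypothetical.

```python
from collections import Counter


def family_size_matrix(clusters, strains):
    """Count the genes per strain in each family.

    clusters: dict mapping family id -> list of (strain, protein id) tuples
    Returns a dict mapping family id -> list of counts, ordered as `strains`.
    """
    matrix = {}
    for family_id, members in clusters.items():
        counts = Counter(strain for strain, _ in members)
        matrix[family_id] = [counts.get(strain, 0) for strain in strains]
    return matrix


def differential_families(matrix):
    """Keep only families whose size differs between at least two strains."""
    return {fid: sizes for fid, sizes in matrix.items() if len(set(sizes)) > 1}


# Hypothetical families; the protein ids are placeholders.
strains = ["WM276", "R265", "H99", "JEC21", "B3501A"]
clusters = {
    "fam1": [("WM276", "a"), ("R265", "b"), ("H99", "c"), ("JEC21", "d"), ("B3501A", "e")],
    "fam2": [("WM276", "f"), ("H99", "g"), ("H99", "h"), ("JEC21", "i"), ("B3501A", "j")],
}
sizes = family_size_matrix(clusters, strains)
print(sizes)
print(sorted(differential_families(sizes)))
```

Families with identical counts in every strain carry no signal for expansion or contraction, which is why only the differential ones need to be tested.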

Isolation of fluconazole-heteroresistant Cryptococcus gattii strains

Two representative strains each from Canada (RB-13, RB-14), Australia (RAM-15, VPB571-

058) and India (B-5765, B-5788) were used to study resistance to fluconazole and for CGH analysis

(32). The isolates were stored in glycerol (25%) at -80°C until use and were maintained on YPD (2%

glucose, 1% yeast extract, 2% peptone) agar at 25°C during this study. Stock solutions of fluconazole

(FLC) powder (gift from Pfizer Global Research & Development, Groton, CT) were prepared in

dimethyl sulfoxide (Sigma, St. Louis, MO) at a concentration of 50 mg/ml. A single colony from each

isolate grown on YPD agar was suspended in sterile saline and plated on plates of YPD agar (control)

or YPD with 8 µg/ml of FLC. After 72 h at 30°C, clones that grew were sequentially passaged on YPD

agar containing stepwise increases in the concentrations of FLC (up to 256 µg/ml). The Canadian and

Australian isolates were resistant to fluconazole levels as high as 256 µg/ml while the Indian isolates

were resistant to a maximum level of 64 µg/ml.

Genomic DNA was extracted from all six strains prior to and after acquisition of resistance to 64

µg/ml fluconazole. Genomic DNA was also extracted from one Canadian strain (RB-14) and one

Australian strain (RAM-15) following acquisition of resistance to 128 and 256 µg/ml fluconazole. Cells of

each wild type strain were inoculated in 50 ml YEPD broth while fluconazole-resistant strains were patched

on YEPD agar media supplemented with the desired concentration of fluconazole and grown overnight at 30°C.

The cells were suspended at a density of 10^7 per ml, harvested by centrifugation, lyophilized, and vortexed

with 3 mm glass beads. The powder was resuspended in 10 ml CTAB extraction buffer (100 mM Tris,

pH 7.5; 0.7 M NaCl; 10 mM EDTA, pH 8; 1% CTAB (Alkyl trimethyl ammonium bromide); 1% beta-mercaptoethanol)

with 300 mg/ml Proteinase K. After incubation at 65°C for 30 min, DNA was extracted

with equal volumes of chloroform and isopropanol, washed with 70% ethanol and resuspended in TE buffer.

We have re-designated the previously isolated C. gattii strains (in brackets) as follows: R-1346 (RB-13), R-

1347 (RB-14), R-1401 (RAM-15), R-1402 (VPB571-058), R-1412 (B-5765), and R-1413 (B-5788).

Comparative genome hybridization analysis of clinical, environmental and fluconazole-heteroresistant isolates of C. gattii

A custom CGH array for each of the C. gattii strains was constructed by Roche NimbleGen Inc.

(Madison, WI) containing 45- to 85-mer oligonucleotides spaced every 44 bp (on average) across each

chromosome. The NimbleGen custom hybridization service carried out comparative hybridization of

DNA of all strains under study to the appropriate custom array and provided results in several formats

(including the plots of log2 ratios). For fluconazole-heteroresistant isolates, DNA from the isolates and

cognate parental strains were differentially labeled with fluorescent dyes prior to competitive

hybridization to the VGI or VGII reference genome array. Log2 ratios for assessing relative

hybridization were averaged in windows of 400 bp, 800 bp and 2000 bp for most strains (400 bp for

fluconazole-resistant isolates). Probes for the R265 CGH reference array were designed based on the

sequence provided by the Broad Institute. Custom Python scripts were used to delineate aberrations at

the level of individual genes and to correlate variant genes with families identified as significantly

evolving as described above. For automated calls on regional variations, we inferred deletions or

regions of divergence in the test genome if the log2 ratio was ≤ -1, and regions of insertions/amplifications

from log2 ratios ≥ 1. For the clinical and environmental isolate comparisons to either WM276 or

R265, genes were considered amplified, deleted or highly sequence-diverged if the trend was found for

all three log2 ratio-averages.
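The calling rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the custom scripts used in the study: the function names, the use of fixed non-overlapping windows, and the way windows are assigned to a gene (any overlap with the gene interval) are all hypothetical choices made for the example. Only the thresholds (log2 ratio ≤ -1 or ≥ 1) and the three window sizes (400, 800 and 2000 bp) come from the text.

```python
from statistics import mean

def window_means(positions, log2_ratios, window_bp):
    """Average probe log2 ratios in consecutive, non-overlapping windows.

    Returns {window start (bp): mean log2 ratio}. Fixed windows are an
    assumption; the text only states the window sizes.
    """
    bins = {}
    for pos, ratio in zip(positions, log2_ratios):
        bins.setdefault(pos // window_bp, []).append(ratio)
    return {idx * window_bp: mean(vals) for idx, vals in sorted(bins.items())}

def call_window(avg, threshold=1.0):
    """Apply the thresholds from the text to one averaged log2 ratio."""
    if avg <= -threshold:
        return "deleted_or_diverged"
    if avg >= threshold:
        return "amplified"
    return "unchanged"

def call_gene(start, end, positions, log2_ratios, windows=(400, 800, 2000)):
    """Call a gene only when all three window sizes show the same trend."""
    calls = set()
    for w in windows:
        means = window_means(positions, log2_ratios, w)
        # Hypothetical rule: use every window that overlaps the gene interval.
        overlapping = [m for s, m in means.items() if s < end and s + w > start]
        if not overlapping:
            return "unchanged"
        calls.add(call_window(mean(overlapping)))
    return calls.pop() if len(calls) == 1 else "unchanged"
```

Requiring agreement across the three window sizes makes the call conservative: a short dip that reaches the threshold only in the 400 bp average is not reported as a deleted or diverged gene.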

Ploidy determination by FACS analysis

Fluconazole-resistant C. gattii strains were maintained and passaged in YPD medium containing

fluconazole. Cells were harvested from liquid media at log phase and processed for flow cytometry as

described previously (33, 34). Briefly, cells were washed once with water and 10⁷ cells were fixed in 1

ml of 70% ethanol overnight at 4°C. Fixed cells were subsequently washed once with 1 ml of NS

buffer (10 mM Tris–HCl (pH 7.6), 250 mM sucrose, 1 mM EDTA (pH 8.0), 1 mM MgCl2, 0.1 mM

CaCl2, 0.1 mM ZnCl2) and stained with 0.5 ml of NS buffer containing 10 μg/ml propidium iodide

and 1 mg/ml RNaseA at 4°C overnight. Stained cells were then diluted (40x; 0.05 ml cells in 2 ml

diluent) in 50 mM Tris–HCl (pH 8.0) prior to FACS analysis. Flow cytometry was performed with 10⁴

cells and analyzed on the FL3 channel with a Becton–Dickinson FACSCalibur (BD Biosciences,

Canada). Graphs and histograms were generated using FlowJo v8.8.6 (Ashland, OR).

Virulence assays

Two sets of virulence assays were performed using a mouse model of cryptococcosis. First, the

virulence of two green-fluorescent protein (GFP) fusion strains of C. neoformans var. grubii and C.

gattii was tested. Ten C57BL/6 female mice were inoculated by the intratracheal method with 10⁵ cells

for each test strain and the control strains WM276 and C. neoformans var. grubii strain H99. The mice

were monitored for illness over two months and the assay was repeated three times. The second assay

analyzed the virulence of VGI and VGII strains of C. gattii. In this experiment, ten A/Jcr female mice

were inoculated intranasally with 5 × 10⁴ cells of each strain (RB28, R794, and KB3864) and

monitored for illness over two months. The experiment also included the VGI strain WM276, the

VGIIa strain R265 and the VGIIb strain R272. The virulence data for these three strains were

previously reported, as described in the legend to Figure 3. The virulence assays were performed twice

for all of the strains, with similar results. Survival data were analyzed using Kaplan-Meier curves and

the groups of mice infected with different strains were compared by the log-rank test to assess statistical

significance and confidence of the virulence data. The protocols for virulence assays were approved

by the University of British Columbia Committee on Animal Care.
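The text does not name the software used for the survival analysis. As a rough illustration of the two named methods, the following pure-Python sketch computes a Kaplan-Meier curve and a two-group log-rank test. The function names are hypothetical, and any survival times passed to them in an example would be made-up values, not data from the study.

```python
import math

def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with a death.

    events[i] is True for a death and False for a censored animal
    (for example, one still alive when monitoring ended).
    """
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - deaths / at_risk
        curve.append((t, survival))
    return curve

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square, p-value), 1 d.f."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e in zip(all_times, all_events) if e}):
        n_a = sum(1 for u in times_a if u >= t)   # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)   # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n        # observed minus expected
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0.0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

For a group of ten mice, `kaplan_meier` would be called with ten survival times in days and ten death/censoring flags; `logrank` then compares two such groups.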

SUPPLEMENTAL FIGURES

Fig. S1. Crossmatch alignments – C. gattii VGI versus VGII (all chromosomes)

Alignments of the chromosomes of the VGI strain WM276 versus VGII strain R265 were

performed using Cross_Match. The majority of chromosomes were co-linear but some rearranged

chromosomes and regions of inversion were also present between the two genomes. The labeling of the

figure is described in the legend to Figure 1.

[Figure: alignment panels for VGI (WM276) versus VGII (R265), chromosomes 1 to 14. Sequence orientation is shown as direct or inverted.]

Fig. S2. Truncation of the AGO1 and AGO2 genes is fixed in the C. gattii molecular type VGII

population

Synteny analysis of the AGO1 and AGO2 genes and flanking regions in VGI strain WM276

versus VGII strain R265. A 4122 bp deletion of the AGO2 gene and a 5167 bp deletion of the AGO1

gene were identified in strain R265. A PCR assay using primers based on conserved sequences

between strains R265 and WM276 demonstrated these two deletions are common in 46 tested VGII

strains, including representatives from the VGIIa, VGIIb, and VGIIc genotypes (see Table S2 in

supplemental materials). Deletion of the AGO1 and AGO2 genes is fixed in the C. gattii molecular

type VGII population.

[Figure: synteny diagram of the AGO2 and AGO1 regions in VGI and VGII. WM276: Chr4: 2045123 - 2105215. R265: Chr4: 1303650 - 1347874. Two sequencing gaps are marked. Size labels in the figure: 5730 bp, 1608 bp, 1412 bp and 6579 bp.]
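The four size labels in this figure (5730, 1608, 1412 and 6579 bp) appear consistent with the deletion sizes given in the legend. The short check below assumes, and this pairing is inferred rather than stated in the extracted labels, that 5730 bp and 1608 bp are the VGI and VGII sizes of the AGO2 region and that 6579 bp and 1412 bp are the VGI and VGII sizes of the AGO1 region.

```python
# Region sizes in bp; the assignment of sizes to genes and molecular
# types is an assumption inferred from the deletion sizes in the legend.
region_sizes = {
    "AGO2": {"VGI": 5730, "VGII": 1608},
    "AGO1": {"VGI": 6579, "VGII": 1412},
}

# Deletion sizes reported in the figure legend for strain R265 (VGII).
reported_deletions = {"AGO2": 4122, "AGO1": 5167}

def size_difference(gene):
    """VGI size minus VGII size for the given gene region."""
    return region_sizes[gene]["VGI"] - region_sizes[gene]["VGII"]

for gene, deletion in reported_deletions.items():
    status = "matches" if size_difference(gene) == deletion else "differs from"
    print(f"{gene}: size difference {size_difference(gene)} bp {status} "
          f"the reported {deletion} bp deletion")
```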

Fig. S3. Deletions of the CPR2 gene and downstream gene are fixed in the C. gattii molecular type

VGII population

Synteny analysis of the CPR2 gene and its flanking regions in VGI strain WM276 and VGII

strain R265. An 880 bp truncation of the CPR2 gene and a 3,109 bp deletion of the gene downstream

of CPR2 were identified in strain R265. A PCR assay using primers based on conserved sequences

between strains R265 and WM276 demonstrated these two deletions are common in 48 tested strains,

including representatives from the VGIIa, VGIIb, and VGIIc genotypes (see Table S2 in supplemental

materials).

Fig. S4a. CrossMatch alignments for all of the C. gattii chromosomes versus the chromosomes of

C. neoformans B3501A

Alignments of the chromosomes of the VGI strain WM276 versus C. neoformans strain

B3501A were performed using Cross_Match (continued in Fig. S4b). In contrast to the alignments

observed for the C. gattii strains (Fig. S1), many rearrangements were observed. In particular, a

striking three-part chromosomal rearrangement was observed that involved chromosomes 4, 9 and 10.

The labeling of the figure is described in the legend to Figure 1.

[Figure: alignment panels with axes labeled B and D for chromosomes 1 to 8, plus panels pairing Chr4 with Chr10 and Chr4 with Chr9. Sequence orientation is shown as direct or inverted.]

Fig. S4b. CrossMatch alignments for all of the C. gattii chromosomes versus the chromosomes of

C. neoformans B3501A (continued)

[Figure: alignment panels with axes labeled B and D for chromosomes 8 to 14, including cross-chromosome panels involving Chr4, Chr9, Chr10 and Chr12. Sequence orientation is shown as direct or inverted.]

Fig. S5. Electrophoretic karyotyping of C. gattii strain WM276 and the WM276gfp2 mutant.

The chromosomes of the strains were separated by Contour-clamped Homogeneous Electric

Field Pulsed Field Gel Electrophoresis (CHEF-PFGE). Lane 1 contains the chromosomes from a

strain of Saccharomyces cerevisiae as size markers. Lane 2 contains the chromosomes from the

parental strain WM276 and lane 3 contains the chromosomes from the avirulent mutant WM276gfp2.

Note that a 1 Mb chromosome is missing in lane 3 and the CGH analysis indicates that a deletion has

occurred in this chromosome; this deletion likely changes the mobility of the chromosome such that it

now co-migrates with a smaller chromosome.

Fig. S6. Comparative genome hybridization for C. gattii strain WM276gfp2

A CGH plot of log2 ratios for chromosome 11 is shown for the WM276gfp2 strain. A deletion

of ~75 kb bearing 24 genes was detected at the right end of chromosome 11. Note that the region

adjacent to the deletion appears to have a log2 ratio slightly higher than zero, perhaps indicating an

amplification of sequences. For comparisons of all WM276gfp2 chromosomes see Fig. S7.

[Figure: log2 ratio plot for chr11, annotated "75 kb region: 24 genes deleted or diverged".]

Fig. S7. Comparative genome hybridization to examine VGI strains WM276gfp2, R794 and

KB3864 (all chromosomes)

Most of the variations observed in these strains are associated with subtelomeric regions.

Fig. S8. Comparative genome hybridization for VGI strain R794

Genomic DNA of strains WM276 and R794 were hybridized to the WM276 reference array

and the plots of log2 ratios are shown for chromosomes 5, 7, 8, 10, 12, and 13 to illustrate regions of

divergence. A log2 ratio of zero indicates a close match between the two sequences and the associated

black dashes indicate the 800 bp windows used for assessing relative hybridization. Deletions or

regions of divergence in the experimental genome (R794) give hybridization results with a log2 ratio of

less than zero. For comparisons of all R794 chromosomes, see Fig. S7.

[Figure: log2 ratio plots for chr5, chr7, chr8, chr10, chr12 and chr13.]

Fig. S9. Comparative genome hybridization for VGII strains R272 and RB28

Genomic DNA of strains R265 and RB28, and R265 and R272 were hybridized to the R265

reference array to yield plots of log2 ratios for each chromosome. CGH plots of log2 ratios are shown

for chromosomes 11 and 12 in strains R272 and RB28. A log2 ratio of zero indicates a close match

between the two sequences and the associated black dashes indicate the 800 bp windows used for

assessing relative hybridization. Deletions or regions of divergence in the experimental genome

(RB28 or R272) give hybridization results with a log2 ratio of less than zero. Large-scale deletions or

sequence divergence can be detected in both of these chromosomes for strain R272. For RB28, an

amplification is detected in chromosome 11 and a terminal deletion in chromosome 12. For plots

showing all chromosomes, see Fig. S10.

[Figure: panel A, R272; panel B, RB28. Plot labels as extracted: chr11, chr12, chr11, chr13.]

Fig. S10. Comparative genome hybridization to examine VGII strains R272 and RB28 (all

chromosomes)

[Figure: log2 ratio plots across chromosomes 1 to 14 for strains R272 and RB28.]

Fig. S11. Comparative genome hybridization to examine fluconazole-resistant VGII strains

R1401F, R1346F (all chromosomes)

Fig. S12. FACS analysis to examine ploidy of fluconazole-resistant strains of C. gattii (VGI

molecular type)

The VGI strains R1412 and R1413 and their corresponding fluconazole-resistant derivatives

were subjected to DNA content analysis by FACS. The DNA content of cells was measured by

determining the relative fluorescence intensities. R1413F was baseline haploid while R1412F was

diploid by flow cytometry.

[Figure: FACS histograms for R1413, R1413F, R1412 and R1412F.]

Fig. S13. FACS analysis to examine ploidy of fluconazole-resistant strains of C. gattii (VGII

molecular type)

The VGII strains R1346 and R1347 parental strains and their corresponding fluconazole-

resistant derivatives R1346F and R1347F were subjected to FACS analysis. The DNA content of cells

was measured by determining the relative fluorescence intensities. R1346F was haploid while R1347F

appeared to contain a mixed population of 1N and 2N cells.

[Figure: FACS histograms for R1347, R1347F, R1346 and R1346F.]

Fig. S14. FACS analysis to examine ploidy of fluconazole-resistant strains of C. gattii (VGII molecular type), continued.

The VGII strains R1401 and R1402 parental strains and their corresponding fluconazole-

resistant derivatives R1401F and R1402F were subjected to FACS analysis. The DNA content of cells

was measured by determining the relative fluorescence intensities. The results indicated that R1401F

and R1402F are haploid.

[Figure: FACS histograms for R1401, R1401F, R1402 and R1402F.]

PHYLOGENETIC TREES IN TEXT NOTATION

Phylogenetic trees with estimates of divergence times used in the current study.

Tree 1a. Rooted, uncalibrated phylogenetic tree for the five Cryptococcus strains

(((B3501A:0.00171,JEC21:0.00389)1.00:0.03707,H99:0.04223):0.02645,(R265:0.02734,WM276:0.02160):0.04264)1.00;

Tree 1b. Calibrated phylogenetic tree and estimates of divergence time in millions of years for the five Cryptococcus strains.

(((B3501A:1.418539,JEC21:1.418539):19.179990,H99:20.598529):13.401471,(R265:12.397622,WM276:12.397622):21.602378);

Tree 2a. Rooted, uncalibrated phylogenetic tree for the five Cryptococcus strains, four basidiomycetes and S. cerevisiae

(((Mglobosa:0.44018,Umaydis:0.32365)1.0000000000:0.16255,((Pchrysosporium:0.22060,Ccinereus:0.25953)1.0000000000:0.20450,((WM276:0.01294,R265:0.01698)1.0000000000:0.02215,(H99:0.02471,(B3501A:0.00093,JEC21:0.00196)1.0000000000:0.02150)1.0000000000:0.01596)1.0000000000:0.41020)1.0000000000:0.09215):0.16200,Scerevisiae:0.70276);

Tree 2b. Phylogenetic tree and estimates of divergence time in millions of years for all included fungi based tree

(((Mglobosa:271.726891,Umaydis:271.726891):113.012516,((Pchrysosporium:170.802939,Ccinereus:170.802939):147.619386,((WM276:10.643127,R265:10.643127):16.776544,(H99:16.742889,(B3501A:1.027926,JEC21:1.027926):15.714963):10.676783):291.002653):66.317083):115.260592,Scerevisiae:500.000000);
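The trees above are written in Newick-style text notation, where each `name:value` pair gives a branch length and a value directly after a closing parenthesis is a node label (here, a support value). As an illustration of how to read them, the sketch below sums branch lengths from the root to each tip. The parser is a minimal, hypothetical helper written for this example (it handles only the features used in these four trees) and is not software cited in the study.

```python
def tip_depths(newick):
    """Parse a Newick string; return {tip name: root-to-tip branch length sum}."""
    text = newick.strip().rstrip(";")
    pos = 0
    depths = {}

    def read_label_and_length():
        """Read an optional label and an optional ':length' at the cursor."""
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        label, length = text[start:pos], 0.0
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            length = float(text[start:pos])
        return label, length

    def read_node():
        """Parse one node; return the tip names found beneath it."""
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # consume "("
            tips = read_node()
            while text[pos] == ",":
                pos += 1                  # consume ","
                tips += read_node()
            pos += 1                      # consume ")"
            _support, length = read_label_and_length()
        else:
            name, length = read_label_and_length()
            depths[name] = 0.0
            tips = [name]
        for name in tips:                 # add this branch to every tip below
            depths[name] += length
        return tips

    read_node()
    return depths

# Tree 1b, copied from the text above (branch lengths in millions of years).
TREE_1B = ("(((B3501A:1.418539,JEC21:1.418539):19.179990,H99:20.598529)"
           ":13.401471,(R265:12.397622,WM276:12.397622):21.602378);")

for tip, depth in sorted(tip_depths(TREE_1B).items()):
    print(f"{tip}: {depth:.6f}")
```

In a calibrated tree such as Tree 1b, every tip should lie at the same distance from the root, so equal sums across tips are a quick check that the string was copied intact.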

LISTING OF SUPPLEMENTAL TABLES

Table S1: WM276 genome statistics

Table S2: Strains used in PCR analysis of AGO1, AGO2 and CPR2

Table S3: Summary of genomic variations in the clinical, environmental and

fluconazole-resistant strains

DATA SET S1 DESCRIPTION (Excel file)

tab A: WM276 annotations

tab B: Correspondence of genes between WM276 and R265

tab C: JEC21 genes not found in WM276

tab D: WM276 genes not found in JEC21

tab E: WM276 mitochondrial genome annotation

(Also, see WM276 mitochondrial genome vs. R265 mitochondrial genome alignment in TextS2 and

H99 mitochondrial genome vs. B3501A mitochondrial genome alignment in TextS3)

tab F: Gene families among the five Cryptococcus strains

tab G: Five Cryptococcus strains CAFE results

tab H: Ten fungal strains CAFE results

tab I: WM276gfp2vsWM276 variations

tab J: R794vsWM276 variations

tab K: KB3864vsWM276 variations

tab L: R272vsR265 variations

tab M: RB28vsR265 variations

tab N: R1412-F64vsR1412 variations

tab O: R1413-F64vsR1413 variations

tab P: R1346-F64vsR1346_variations

tab Q: R1347-F64vsR1347_variations

tab R: R1401-F64vsR1401_variations

tab S: R1402-F64vsR1402_variations

tab T: WM276 antifungal targets

tab U: R265 antifungal targets

REFERENCES

1. Sorrell, T. C. 2001. Cryptococcus neoformans variety gattii. Med. Mycol. 39:155-168.
2. Martinez, D., L. F. Larrondo, N. Putnam, M. D. Gelpke, K. Huang, J. Chapman, K. G. Helfenbein, P. Ramaiya, J. C. Detter, F. Larimer, P. M. Coutinho, B. Henrissat, R. Berka, D. Cullen, and D. Rokhsar. 2004. Genome sequence of the lignocellulose degrading fungus Phanerochaete chrysosporium strain RP78. Nat. Biotechnol. 22:695-700. doi: 10.1038/nbt967.
3. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410. doi: 10.1006/jmbi.1990.9999.
4. Boucher, H. W., A. H. Groll, C. C. Chiou, and T. J. Walsh. 2004. Newer systemic antifungal agents: pharmacokinetics, safety and efficacy. Drugs. 64:1997-2020.
5. Ma, H., F. Hagen, D. J. Stekel, S. A. Johnston, E. Sionov, R. Falk, I. Polacheck, T. Boekhout, and R. C. May. 2009. The fatal fungal outbreak on Vancouver Island is characterized by enhanced intracellular parasitism driven by mitochondrial regulation. Proc. Natl. Acad. Sci. U. S. A. 106:12980-12985. doi: 10.1073/pnas.0902963106.
6. Litter, J., A. Keszthelyi, Z. Hamari, I. Pfeiffer, and J. Kucsera. 2005. Differences in mitochondrial genome organization of Cryptococcus neoformans strains. Antonie Van Leeuwenhoek. 88:249-255. doi: 10.1007/s10482-005-8544-x.
7. Toffaletti, D. L., M. Del Poeta, T. H. Rude, F. Dietrich, and J. R. Perfect. 2003. Regulation of cytochrome c oxidase subunit 1 (COX1) expression in Cryptococcus neoformans by temperature and host environment. Microbiology. 149:1041-1049.
8. Ma, H., and R. C. May. 2010. Mitochondria and the regulation of hypervirulence in the fatal fungal outbreak on Vancouver Island. 1:197-201.
9. Toffaletti, D. L., K. Nielsen, F. Dietrich, J. Heitman, and J. R. Perfect. 2004. Cryptococcus neoformans mitochondrial genomes from serotype A and D strains do not influence virulence. Curr. Genet. 46:193-204. doi: 10.1007/s00294-004-0521-9.
10. Enright, A. J., S. Van Dongen, and C. A. Ouzounis. 2002. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 30:1575-1584.
11. Hahn, M. W., T. De Bie, J. E. Stajich, C. Nguyen, and N. Cristianini. 2005. Estimating the tempo and mode of gene family evolution from comparative genomic data. Genome Res. 15:1153-1160. doi: 10.1101/gr.3567505.
12. De Bie, T., N. Cristianini, J. P. Demuth, and M. W. Hahn. 2006. CAFE: a computational tool for the study of gene family evolution. Bioinformatics. 22:1269-1271. doi: 10.1093/bioinformatics/btl097.
13. Batzoglou, S., D. B. Jaffe, K. Stanley, J. Butler, S. Gnerre, E. Mauceli, B. Berger, J. P. Mesirov, and E. S. Lander. 2002. ARACHNE: a whole-genome shotgun assembler. Genome Res. 12:177-189. doi: 10.1101/gr.208902.
14. Schein, J. E., K. L. Tangen, R. Chiu, H. Shin, K. B. Lengeler, W. K. MacDonald, I. Bosdet, J. Heitman, S. J. Jones, M. A. Marra, and J. W. Kronstad. 2002. Physical maps for genome analysis of serotype A and D strains of the fungal pathogen Cryptococcus neoformans. Genome Res. 12:1445-1453. doi: 10.1101/gr.81002.
15. Warren, R. L., D. Varabei, D. Platt, X. Huang, D. Messina, S. P. Yang, J. W. Kronstad, M. Krzywinski, W. C. Warren, J. W. Wallis, L. W. Hillier, A. T. Chinwalla, J. E. Schein, A. S. Siddiqui, M. A. Marra, R. K. Wilson, and S. J. Jones. 2006. Physical map-assisted whole-genome shotgun sequence assemblies. Genome Res. 16:768-775. doi: 10.1101/gr.5090606.
16. Shah, S. P., D. Y. He, J. N. Sawkins, J. C. Druce, G. Quon, D. Lett, G. X. Zheng, T. Xu, and B. F. Ouellette. 2004. Pegasys: software for executing and integrating analyses of biological sequences. BMC Bioinformatics. 5:40. doi: 10.1186/1471-2105-5-40.
17. Lewis, S. E., S. M. Searle, N. Harris, M. Gibson, V. Iyer, J. Richter, C. Wiel, L. Bayraktaroglu, E. Birney, M. A. Crosby, J. S. Kaminker, B. B. Matthews, S. E. Prochnik, C. D. Smith, J. L. Tupy, G. M. Rubin, S. Misra, C. J. Mungall, and M. E. Clamp. 2002. Apollo: a sequence annotation editor. Genome Biol. 3:RESEARCH0082.
18. Ma, B., J. Tromp, and M. Li. 2002. PatternHunter: faster and more sensitive homology search. Bioinformatics. 18:440-445.
19. Blanco, E., G. Parra, and R. Guigo. 2007. Using geneid to identify genes. Curr. Protoc. Bioinformatics. Chapter 4:Unit 4.3. doi: 10.1002/0471250953.bi0403s18.
20. Salamov, A. A., and V. V. Solovyev. 2000. Ab initio gene finding in Drosophila genomic DNA. Genome Res. 10:516-522.
21. Elsik, C. G., A. J. Mackey, J. T. Reese, N. V. Milshina, D. S. Roos, and G. M. Weinstock. 2007. Creating a honey bee consensus gene set. Genome Biol. 8:R13. doi: 10.1186/gb-2007-8-1-r13.
22. Carver, T., M. Berriman, A. Tivey, C. Patel, U. Bohme, B. G. Barrell, J. Parkhill, and M. A. Rajandream. 2008. Artemis and ACT: viewing, annotating and comparing sequences stored in a relational database. Bioinformatics. 24:2672-2676. doi: 10.1093/bioinformatics/btn529.
23. Li, L., C. J. Stoeckert Jr, and D. S. Roos. 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13:2178-2189. doi: 10.1101/gr.1224503.
24. Edgar, R. C. 2004. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics. 5:113. doi: 10.1186/1471-2105-5-113.
25. Capella-Gutierrez, S., J. M. Silla-Martinez, and T. Gabaldon. 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics. 25:1972-1973. doi: 10.1093/bioinformatics/btp348.
26. Guindon, S., and O. Gascuel. 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696-704.
27. Gouy, M., S. Guindon, and O. Gascuel. 2010. SeaView version 4: A multiplatform graphical user interface for sequence alignment and phylogenetic tree building. Mol. Biol. Evol. 27:221-224. doi: 10.1093/molbev/msp259.
28. Jones, D. T., W. R. Taylor, and J. M. Thornton. 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8:275-282.
29. Anisimova, M., and O. Gascuel. 2006. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst. Biol. 55:539-552. doi: 10.1080/10635150600755453.
30. Britton, T., C. L. Anderson, D. Jacquet, S. Lundqvist, and K. Bremer. 2007. Estimating divergence times in large phylogenetic trees. Syst. Biol. 56:741-752. doi: 10.1080/10635150701613783.
31. Lucking, R., S. Huhndorf, D. H. Pfister, E. R. Plata, and H. T. Lumbsch. 2009. Fungi evolved right on track. Mycologia. 101:810-822.
32. Varma, A., and K. J. Kwon-Chung. 2010. Heteroresistance of Cryptococcus gattii to fluconazole. Antimicrob. Agents Chemother. 54:2303-2311. doi: 10.1128/AAC.00153-10.
33. Tanaka, R., H. Taguchi, K. Takeo, M. Miyaji, and K. Nishimura. 1996. Determination of ploidy in Cryptococcus neoformans by flow cytometry. J. Med. Vet. Mycol. 34:299-301.
34. Sia, R. A., K. B. Lengeler, and J. Heitman. 2000. Diploid strains of the pathogenic basidiomycete Cryptococcus neoformans are thermally dimorphic. Fungal Genet. Biol. 29:153-163. doi: 10.1006/fgbi.2000.1192.

