
Historical variations in mutation rate in an epidemic pathogen, Yersinia pestis

Yujun Cui a,b,1, Chang Yu b,1, Yanfeng Yan a,b,1, Dongfang Li b,1, Yanjun Li a,1, Thibaut Jombart c,1, Lucy A. Weinert d,1, Zuyun Wang e, Zhaobiao Guo a, Lizhi Xu b, Yujiang Zhang f, Hancheng Zheng b, Nan Qin b, Xiao Xiao a, Mingshou Wu g, Xiaoyi Wang a, Dongsheng Zhou a, Zhizhen Qi e, Zongmin Du a, Honglong Wu b, Xianwei Yang b, Hongzhi Cao b, Hu Wang e, Jing Wang h, Shusen Yao i, Alexander Rakin j, Yingrui Li b, Daniel Falush k, Francois Balloux d,2, Mark Achtman k,2, Yajun Song a,k,2, Jun Wang b,2, and Ruifu Yang a,b,2

a State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China; b Beijing Genomics Institute-Shenzhen, Shenzhen 518083, China; c Medical Research Council Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, Imperial College, London W2 1PG, United Kingdom; d University College London Genetics Institute, Department of Genetics Evolution and Environment, University College London, London WC1E 6BT, United Kingdom; e Qinghai Institute for Endemic Diseases Prevention and Control, Xining 811602, Qinghai Province, China; f Xinjiang Center for Disease Control and Prevention, Urumiqi 830002, Xijiang province, China; g Yunnan Institute for Epidemic Diseases Control and Research, Dali 671000, Yunnan Province, China; h Institute of Health Quarantine, Chinese Academy of Inspection and Quarantine, Beijing 100025, China; i Military Center for Disease Control and Prevention, Beijing 100850, China; j Max von Pettenkofer Institute, Ludwig-Maximilians-University, Munich 80336, Germany; and k Environmental Research Institute, University College Cork, Cork, Ireland

Edited* by W. Ford Doolittle, Dalhousie University, Halifax, NS, Canada, and approved November 29, 2012 (received for review April 10, 2012)

The genetic diversity of Yersinia pestis, the etiologic agent of plague, is extremely limited because of its recent origin coupled with a slow clock rate. Here we identified 2,326 SNPs from 133 genomes of Y. pestis strains that were isolated in China and elsewhere. These SNPs define the genealogy of Y. pestis since its most recent common ancestor. All but 28 of these SNPs represented mutations that happened only once within the genealogy, and they were distributed essentially at random among individual genes. Only seven genes contained a significant excess of nonsynonymous SNPs, suggesting that the fixation of SNPs mainly arises via neutral processes, such as genetic drift, rather than Darwinian selection. However, the rate of fixation varies dramatically over the genealogy: the number of SNPs accumulated by different lineages was highly variable and the genealogy contains multiple polytomies, one of which resulted in four branches near the time of the Black Death. We suggest that demographic changes can affect the speed of evolution in epidemic pathogens even in the absence of natural selection, and hypothesize that neutral SNPs are fixed rapidly during intermittent epidemics and outbreaks.

infectious disease | molecular clock | phylogenomics | NGS | molecular epidemiology

Plague caused by Yersinia pestis has claimed millions of human lives as recently as the 20th century, and sylvatic plague remains enzootic (endemic) in multiple rodent foci around the world (1, 2). These Gram-negative bacteria can be transmitted from rodents to humans and animals by fleabites, physical contact, or respiratory droplets, and human infection is usually lethal in the absence of antibiotic treatment.

In 1894, microbiologists identified Y. pestis as the cause of modern plague (3). They also attributed historical pandemics of plague to this bacterium, including the Black Death (4), which reached Europe in 1347, and the Justinianic Pandemic (5), which began in 541. This interpretation has been questioned by historians and epidemiologists because the Black Death differed in epidemiology and symptoms from modern plague at the beginning of the 20th century (6–8). Furthermore, the hosts (rats) and vectors (fleas, Xenopsylla cheopis) that are currently associated with efficient transmission of bubonic plague were rare or absent in Europe during the Black Death (8). However, Y. pestis DNA and antigens were detected in Black Death skeletons by multiple groups (9, 10) and sequences of Y. pestis genomes have been reconstructed from ancient DNA in skeletons that were buried in London in 1348–1349, soon after the Black Death reached Europe (11).

Y. pestis evolves clonally and has only very limited genetic diversity (12). Phylogeographic analysis of a global collection of isolates showed that these bacteria have spread repeatedly from China (13), and that the Black Death genotypes map to the base of phylogenetic branch 1 (9, 11). However, multiple questions remain unresolved, including whether plague spread globally during or after the Black Death, and whether the Justinianic Pandemic was caused by Y. pestis (11).

Here we describe the genomic sequences of >100 Y. pestis strains from China and East Asia. Our results show that the medieval spread of plague coincided with a “big bang” in Y. pestis diversity, during which several extant lineages arose within a short time period. We conclude that sequence evolution is largely neutral and mutations have accumulated at a very uneven pace, possibly because of the alternation of endemic and epidemic periods. Finally, we show that Y. pestis lineages that can infect humans already existed by the time of the Justinianic Pandemic.

Author contributions: Y.C., Y.S., Jun Wang, and R.Y. designed research; Y.C., Yanjun Li, Z.G., X.X., X.W., D.Z., and Z.D. performed research; T.J., Z.W., Y.Z., N.Q., M.W., Z.Q., H. Wang, Jing Wang, S.Y., and A.R. contributed new reagents/analytic tools; Y.C., C.Y., Y.Y., D.L., T.J., L.A.W., L.X., H.Z., H. Wu, X.Y., H.C., Yingrui Li, D.F., F.B., M.A., Y.S., and R.Y. analyzed data; and Y.C., T.J., L.A.W., D.F., F.B., M.A., Y.S., and R.Y. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Data deposition: Raw genome data have been deposited in the National Center for Biotechnology Information/Short Read archive database, www.ncbi.nlm.nih.gov/sra (accession no. SRA010790; individual genome assemblies have been deposited under accession nos. ADOV00000000 47685–ADTI00000000 47685). The accession numbers for each strain are listed in Dataset S1.

1 Y.C., C.Y., Y.Y., D.L., Yanjun Li, T.J., and L.A.W. contributed equally to this work.

2 To whom correspondence may be addressed. E-mail: [email protected], m.achtman@ucc.ie, [email protected], [email protected], or [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1205750110/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1205750110 PNAS | January 8, 2013 | vol. 110 | no. 2 | 577–582

Results

Core-Genome and Pan-Genome of Y. pestis. We sequenced 118 genomes of isolates from China, Mongolia, the former Soviet Union, Myanmar, and Madagascar using Illumina short-read sequencing (Dataset S1). The 107 Chinese isolates represent the diversity revealed by molecular genotyping of >900 strains (14, 15), which in turn reflect the geographic and temporal diversity of >5,000 isolates collected during annual surveillance of all sylvatic plague foci in China. Thus, these genomes should provide a good representation of the geographic and genetic diversity of Y. pestis in China. Of the 118 genomes, 103 are from a variety of rodent genera and other nonprimate mammals, 14 are from human patients, and 1 is from a human flea (Dataset S1).

The sequence reads were assembled into genomes with SOAPdenovo (16), resulting in an average coverage per genome of 61-fold and an average genome assembly size of 4.6 Mb (Dataset S2). A pan-genome was created from all genomes, and reads containing putative SNPs relative to the pan-genome were filtered to exclude potential sequencing errors. We tested the reliability of these procedures by comparing the results for two strains, the genomes of which had previously been independently sequenced with traditional Sanger sequencing (17). The two genomes differed at positions 129314 and 1559993 for strain K11973002, whereas the two genomes of strain E1979001 were indistinguishable. Sequencing of PCR products showed that the two differences in K11973002 represented sequencing errors in the Sanger genome.

We also analyzed 15 other, previously published genomes of Y. pestis (Dataset S1), for a total of 133 genomic sequences. The Y. pestis core-genome, consisting of segments present in all 133 genomes, is 3.53-Mb long, subdivided into 990 genomic blocks because most assemblies contain multiple contigs (Datasets S2 and S3). This genome contains 3,450 annotated genes spanning 3.03 Mb. Our comparisons further identified 1.92 Mb (1,700 annotated genes spanning 1.16 Mb) that were variably present in some isolates (Dataset S4), of which 421 kb were assigned to six previously known plasmids (SI Appendix, Table S1). The Y. pestis pan-genome (core-genome plus accessory genome) is 5.46 Mb in length (SI Appendix, Fig. S1 and Table S1).
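The relationship between core-genome, accessory genome, and pan-genome described above can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: the function name, the strain names, and the segment identifiers are all hypothetical, and real analyses operate on aligned sequence blocks rather than labels. The reported sizes are consistent with this partition, since 3.53 Mb + 1.92 Mb = 5.45 Mb, which agrees with the stated 5.46 Mb up to rounding.

```python
def partition_pan_genome(segments_by_genome):
    """Split segment IDs into core (present in every genome) and accessory.

    `segments_by_genome` maps a genome name to the set of segment IDs it carries.
    All names here are hypothetical illustration data.
    """
    all_sets = [set(segments) for segments in segments_by_genome.values()]
    if not all_sets:
        return set(), set(), set()
    pan = set().union(*all_sets)        # every segment seen in any genome
    core = set.intersection(*all_sets)  # segments shared by all genomes
    accessory = pan - core              # variably present segments
    return core, accessory, pan


# Toy input: three hypothetical strains and made-up segment labels.
example = {
    "strain_A": {"seg1", "seg2", "seg3"},
    "strain_B": {"seg1", "seg2", "plasmid_x"},
    "strain_C": {"seg1", "seg2", "seg3", "seg4"},
}
core, accessory, pan = partition_pan_genome(example)
print(sorted(core), sorted(accessory), len(pan))
```

Adding a genome can only shrink or preserve the core set and grow or preserve the pan set, which is why the core-genome is defined relative to the full collection of 133 genomes.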

Y. pestis Lineages in Relation to Geography and Host. Previously (13, 18, 19) we interpreted the population structure of Y. pestis as a clonal lineage with three branches, designated 0, 1, and 2, which evolved from Yersinia pseudotuberculosis 2,600–28,000 y ago. According to this interpretation, the Y. pestis genealogy is rooted by Y. pseudotuberculosis at the base of branch 0, and SNPs have accumulated serially along branch 0 and subsequently along branches 1 and 2. All Y. pestis isolates, including the genomes analyzed here, cluster within one multilocus sequence type that is embedded within the Y. pseudotuberculosis population structure (20). These observations show that Y. pestis is a genetically monomorphic clone of Y. pseudotuberculosis, which has greater diversity. A highly robust maximum-likelihood (ML) tree of the 133 genomes of Y. pestis plus four published genomes of Y. pseudotuberculosis confirmed that Y. pseudotuberculosis is a suitable outgroup for defining the order of sequential mutations during the genealogy of Y. pestis, and that the most recent common ancestor (MRCA) of all sampled isolates of Y. pestis is at the base of branch 0 (SI Appendix, Fig. S2). The four Y. pseudotuberculosis genomes share the same nucleotide for all SNPs within Y. pestis, which defines the ancestral state. Almost all core SNPs within the genealogy of Y. pestis correspond to substitutions that have occurred only once, and thus define a unique, fully parsimonious path of changes. The core-genome of Y. pestis contains 2,298 such SNPs (Dataset S5), which were used to generate a rooted minimum spanning tree (MSTree) based on sequential changes since the MRCA (Fig. 1A and SI Appendix, Fig. S3). The MSTree and ML tree are fully concordant (SI Appendix, Fig. S4). Twenty-eight other SNPs were excluded from these analyses because they are homoplasies (repeated identical mutations in independent branches) (SI Appendix, Table S2).

This comparative analysis of a large number of genomes revealed additional genealogical substructure. The genealogy now includes branches 3 and 4, which split from branch 0 concurrently with branches 1 and 2 in an extensive “big bang” polytomy at the junction designated node N07 (Fig. 1A and SI Appendix, Fig. S3B). The Black Death sequences that were defined by ancient DNA studies (9, 11) are also very close to this polytomy. The sequences are located on branch 1, one SNP and three SNPs after it diverged from node N07 (Fig. 1A). Two other SNPs identified by Bos et al. (11) were ignored in our genealogical analyses because they are located in repetitive segments.

We previously concluded that plague spread on multiple occasions from China or its vicinity (13). Our new data support this hypothesis because the deepest branching lineage of the genealogy, 0.PE7, has only been isolated in China. In contrast, isolates from the former Soviet Union belong to lineages (0.PE2 and 0.PE4B) that evolved more recently. The Y. pestis strains from China cluster geographically (MANOVA: Pillai-Bartlett trace = 1.75, P < 2.2 × 10⁻¹⁶), but also cluster by host genera (nonparametric χ² test, 999 permutations: χ² = 64.61, P = 0.001) (SI Appendix, Fig. S5). Because these two factors are largely confounded statistically, we were unable to evaluate the relative importance of geography and ecology in shaping the population structuring of Y. pestis in China. However, we note that the deepest branching lineage of the genealogy, 0.PE7, has only been isolated in the Qinghai-Tibet Plateau in China, and that most isolates from western and southern China were located near ancient trade routes designated as the “Silk” and “Tea-Horse” Roads, which crossed in the Qinghai-Tibet Plateau (Fig. 1B). These observations suggest that plague may have originated in or near the Qinghai-Tibet Plateau and was transmitted between rodent populations by human travel along the ancient trade routes.

[Figure 1. Panel A: MSTree with branches 0 to 4, the MRCA of the Y. pestis sample, the ancient genomes, and nodes N07, N11, N12, N14, N36, and N49 labeled. Panel B: map covering China, Mongolia, Myanmar, Yunnan, and the Qinghai-Tibet Plateau, with ancient trade routes (I and II, subroutes of the Silk Road; III and IV, subroutes of the Tea-Horse Road).]

Fig. 1. Population structure of Y. pestis revealed by core genome SNP analysis. (A) MSTree of 133 Y. pestis genomes based on 2,298 SNPs, with Y. pseudotuberculosis IP 32953 as the outgroup to the MRCA. Branch lengths are logarithm transformed for visual effects (see SI Appendix, Fig. S3A for the untransformed tree). Branches are indicated by distinct symbol shapes and populations within branches are distinguished by colors. Symbols with an asterisk are publicly available genomes, not included in B. (B) Geographic sources of strains sequenced in this study. Ancient trade routes are illustrated by gray lines. The circled area is the Qinghai-Tibet Plateau, which encompasses the most diverse isolates and may be the original source of Y. pestis.

Our data also show that most Y. pestis lineages are capable of causing human plague. Although we focused on Y. pestis from Chinese sylvatic foci in this study, 1,500 cases of human plague have been reported in China since 1955 (21). Our collection included 14 strains from human patients and one from a human flea. These 15 genomes included four on branch 0, as well as 11 others on branches 1, 2, and 3 (SI Appendix, Fig. S5B). One isolate from human bubonic plague lies on the deepest branch, 0.PE7, and the isolate from a human flea was found in the 0.ANT1 clade. Even though biovar Microtus (0.PE4C) is not associated with human disease (22, 23), the deepest branch within 0.PE4, 0.PE4A, includes two isolates from human disease. These observations thus show that the potential to cause human disease has probably been present since Y. pestis first evolved, and all known lineages can potentially cause plague, except 0.PE4C. Contrary to the conclusions of Bos et al. (11), neither the Black Death nor the polytomy at N07 was associated with a sudden acquisition of the potential to cause human disease.

Overdispersion of Branch Lengths in Y. pestis. According to the expectations of a constant molecular clock, the core-genome of all modern strains should be approximately equidistant from the MRCA. However, this was not the case, as the number of SNPs to the MRCA varied between 96 (0.PE7) and 548 (0.PE3). Even after excluding these two extremes, the number ranged from 183 to 320 (SI Appendix, Fig. S3A and Dataset S1). Similarly, the number of SNPs accumulated since the polytomy at N07 by individual genomes on branch 1 ranged from 27 (1.IN1a) to 100 (1.ANTb). A constant molecular clock predicts a Poisson distribution of the root-to-tip distances, which results in similar values for the mean and the variance of the number of SNPs. However, in the Y. pestis phylogeny, the variance [1,575; 99% confidence interval (CI): 456, 3,970] of the number of SNPs accumulated from root to tips was much higher than the mean (248; 99% CI: 239, 257), using quantiles of the distributions obtained by bootstrapping root-to-tip distances to calculate 99% CIs. Strong heterogeneity in branch length from the MRCA was also apparent in the ML tree (SI Appendix, Fig. S2).

Overdispersion of branch lengths can occur when mutations accumulate progressively over the time period of sampling, as observed with H1N1 influenza virus (24) and methicillin-resistant Staphylococcus aureus (25, 26). However, both these organisms evolve very rapidly and have far more recent MRCAs than Y. pestis. As such, a substantial proportion of their diversity arose during the time covered by the sampling period. In contrast, the period during which the Y. pestis isolates were collected represents only a small fraction of the time during which they diversified from their MRCA, and the branch lengths to the MRCA for individual strains of Y. pestis were uncorrelated to their year of isolation (R² = 0.01) (SI Appendix, Fig. S6).
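The overdispersion argument in this section compares the mean and the variance of root-to-tip SNP counts, with bootstrap quantiles as confidence intervals. The sketch below shows one way such a comparison could be set up; it is an illustration under stated assumptions, not the authors' code. The function name is hypothetical, the percentile bootstrap is one of several possible bootstrap schemes, and the toy distances are invented except for the values 96, 183, 320, and 548 quoted in the text.

```python
import random
import statistics


def bootstrap_mean_variance(distances, n_boot=2000, level=0.99, seed=0):
    """Percentile-bootstrap CIs for the mean and variance of root-to-tip distances."""
    rng = random.Random(seed)
    n = len(distances)
    means, variances = [], []
    for _ in range(n_boot):
        # Resample the same number of genomes, with replacement.
        sample = [rng.choice(distances) for _ in range(n)]
        means.append(statistics.mean(sample))
        variances.append(statistics.variance(sample))

    def percentile_interval(values):
        ordered = sorted(values)
        lower = ordered[int((1 - level) / 2 * (len(ordered) - 1))]
        upper = ordered[int((1 + level) / 2 * (len(ordered) - 1))]
        return lower, upper

    return {
        "mean": statistics.mean(distances),
        "variance": statistics.variance(distances),
        "mean_ci": percentile_interval(means),
        "variance_ci": percentile_interval(variances),
    }


# Hypothetical SNP counts to the MRCA; only 96, 183, 320 and 548 appear in the text.
toy_distances = [96, 183, 200, 210, 230, 240, 250, 255, 260, 270, 290, 320, 548]
result = bootstrap_mean_variance(toy_distances)
# Under a Poisson (strict clock) model this ratio should be close to 1.
dispersion_index = result["variance"] / result["mean"]
print(result, dispersion_index)
```

A dispersion index far above 1, with a variance interval that does not overlap the mean interval, is the pattern the text describes for the real data (variance 1,575 versus mean 248).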

Variable Clock Rate. An alternative explanation for overdispersion in root-to-tip distance is that the substitution rate varies across lineages. We performed an analysis using the Bayesian relaxed-clock model in BEAST (27–30) on all genomes (SI Appendix), including the ancient DNAs, but excluding Angola (0.PE3), which prevented the analysis from converging. The analysis was performed with the known dates of isolation of the bacterial strains, using 1348 as the date for the ancient DNAs and based on a lognormal distribution of clock rate. This analysis indicated that the genealogy has indeed undergone substantial rate variation, with a nearly 40-fold difference between the slowest and the fastest evolving branches (Fig. 2 and SI Appendix, Fig. S7). Comparable calculations using a strict clock model, where rates were constant across the tree, yielded log10 marginal likelihoods of 109.6 with the Bayes Factor approach implemented in Tracer, which provides overwhelming support for a relaxed clock (27).

According to our calculations, rate accelerations and decelerations are interspersed in the tree, with the two fastest rates occurring at the big-bang polytomy at N07 and within 1.ORI, which was responsible for the global spread in 1894 of all three waves of the third pandemic (13). The date estimations indicate that 1.ORI evolved early in the 19th century, several decades before epidemic disease manifested within China, and that the polytomy at N07 preceded the Black Death by about 80 y (Table 1). The MRCA of all modern isolates of Y. pestis was calculated as 1322 BC (95% CI: 4394 BC to 510 AD). The nodes flanking Angola were estimated to have originated several centuries before the Justinianic Pandemic of 541 AD, with 95% CIs that overlap the beginning of that pandemic. Our dating estimates now cast doubt on our previous proposal that major voyages from China to Africa at the beginning of the 14th century brought 1.ANT to East Africa because those voyages predated the age of 1.ANT (1499 AD; 95% CI: 1377–1650) (Table 1 and SI Appendix, Fig. S8).
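The model comparison in this section rests on a Bayes factor. As a reading aid, and assuming that the reported value of 109.6 is the log10 Bayes factor of the relaxed-clock model over the strict-clock model, the relationship can be written as follows. The symbols D (the sequence data), M_relaxed, and M_strict are notation introduced here for illustration and do not appear in the paper.

```latex
\log_{10} \mathrm{BF}
  = \log_{10} p(D \mid M_{\mathrm{relaxed}})
  - \log_{10} p(D \mid M_{\mathrm{strict}})
  \approx 109.6
\quad\Longrightarrow\quad
\mathrm{BF} \approx 10^{109.6}
```

Under this reading, the data are roughly 10^109.6 times more probable under the relaxed clock than under the strict clock, which is why the support is described as overwhelming.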

[Figure 2. Time-scaled tree. Horizontal axis: years before present (0 to 3,500). Rate key: 3.1e-09, 7.9e-09, 2.0e-08, 5.2e-08, 1.3e-07. Legend groups: 0.PE, 0.ANT, 1.ANT, 1.IN, 1.ORI, 2.ANT, 2.MED, 3.ANT, 4.ANT, and ancient genomes; tip labels run from 0.ANT1a to 4.ANT1a and include ES1 and ES2.]

Fig. 2. A maximum clade credibility tree of the core genomes of Y. pestis. The tree was estimated using the uncorrelated lognormal relaxed model in BEAST (27) with core-genomes. Angola (0.PE3) was excluded from this estimation because its anomalously fast rate could not be fitted by the lognormal model. Two sources of temporal information were used to date the tree. First, the two Black Death genomes were placed at 1348, the date of burial of the skeletons. Second, we constrained the mean of the lognormal distribution of rates to 1 × 10⁻⁸ substitutions per site per year (13). Estimates using these two parameters alone provided consistent dates, but with wider credibility intervals. Branch lengths are scaled to years. Branch colors indicate the estimated branch substitution rate on the logarithmic scale shown in the Rate Key at the left. Population designations are as in Dataset S1.

Internal Nodes and Multiple Polytomies. As clock rates were highly variable across the phylogeny, we investigated whether variation in branch length was correlated with features of the tree topology. Interestingly, we observed a strong positive correlation between the number of nodes and the number of SNPs from each genome to the MRCA of the genealogy (R² = 0.56) (Fig. 3A), indicating that lineages with higher mutation rates tend to have experienced more branching events over time. The relationship between SNPs and nodes can be biased by pseudoreplication, because the numbers of nodes and SNPs to the MRCA of the phylogeny largely overlap for closely related isolates. To circumvent this issue, we repeated this analysis with increasingly smaller random subsets of isolates (from 90 to 10% of the original dataset) and with 1,000 replicates for each dataset size. As expected, the range of R² was larger in small datasets. However, the relationship between nodes and SNPs to the MRCA remained remarkably robust, with median R² values ranging from 0.5 to 0.6 (Fig. 3B). A similar relationship between nodes and SNPs, the “node-density effect” (31), can arise under limited sampling of taxa with repeated forward and back mutations, because of mutation saturation in undersampled areas of a genealogy. However, homoplasies are rare in Y. pestis, which excludes mutation saturation. Thus, despite the variable mutation rate, SNPs are a good indication of branching events, and epidemics that have resulted in more SNPs are also associated with more historical branches. We note, however, that this may reflect an indirect correlation, because lineages that have experienced more epidemics have been more successful, and might therefore represent a greater proportion of our sample. In turn, isolates from clades that are more common in the dataset are also expected to be characterized by higher numbers of nodes from root to tip.

A remarkable feature of our genealogy of Y. pestis is the presence of multiple polytomies, including the big-bang polytomy at node N07. Some of these nodes are the ancestors of populations that have only been identified in China (0.ANT1, 1.IN1, 1.IN2, 2.ANT3, and 2.MED3) but the basal node of 1.ORI (N14), the cause of the third pandemic, also represents a polytomy (Fig. 1 and SI Appendix, Fig. S3B). Thus, radial expansions (starburst genealogies) have occurred in Y. pestis on multiple occasions, interspersed with the binary branching structure that is otherwise normally associated with clonal diversification. As starburst genealogies are often interpreted as signatures for rapid expansions, it is tempting to speculate that each polytomy corresponds to an ancient plague epidemic. In this respect, the association between faster clock rates and branching events is particularly interesting, as it would suggest that lineages responsible for more epidemics might also accumulate genetic diversity more quickly.

Drivers of Molecular Clock Rate Variation in Y. pestis. What are the causes behind the strong clock rate variation?

Mutations accumulate rapidly among “mutator” strains with elevated spontaneous mutation rates (32). However, Angola in 0.PE3, the strain whose genome had the most SNPs, is not a mutator (33), nor did any of the genomes contain traces of the epistatic mutations that can result in reversion of a transient mutator to wild-type. Furthermore, only one mutator was identified among 294 Y. pestis strains from global sources (13). Thus, transient mutators are unlikely to have caused the numerous rate accelerations and decelerations we observed throughout the Y. pestis phylogeny.

An alternative explanation would be numerous events of diversifying (natural) selection leading to the rapid fixation of favorable mutations, possibly coinciding with the spatial expansion of Y. pestis and host jumps to new host species. However, we found only limited evidence in the genomes for the effects of past natural selection having driven the rate of evolution of Y. pestis. The ratio of nonsynonymous to synonymous mutation rates (dN/dS) is 0.91 for all 2,298 SNPs, ranging from 0.79 to 1.37 for all nodes in the MSTree (Dataset S6). The near-parity dN/dS ratios are consistent with neutral evolution, and provide no obvious signs for diversifying selection at the genome level. However, dN/dS ratios are insensitive to a mixture of diversifying and purifying selection, and we thus considered other typical signatures of diversifying selection.

Diversifying selection might have resulted in frequent homoplasies, as has been observed for mutations to antibiotic resistance (34) and resistance to antivirals (35), but only 28 SNPs were homoplastic in the 133 genomes (SI Appendix, Table S2). Diversifying selection is also frequently hallmarked by genes with multiple, clustered nonsynonymous SNPs (nsSNPs). The distribution of the number of nsSNPs per gene deviated significantly from theoretical expectations under neutral evolution (Fig. 4A) (parametric and nonparametric χ²_obs tests: χ²_obs = 3,582, p_param = 4.2 × 10⁻²¹; p_nonparam = 9.9 × 10⁻⁴), but this deviation was largely associated with only seven outlier genes (including two with homoplasies) that exhibited an excess of nsSNPs (Fig. 4B and C). Only two of the 3,450 core genes displayed both homoplasies and excess nsSNPs (YPO0348 and YPO2210) and 20 others contained only one of these typical signatures of diversifying selection (SI Appendix, Table S3). Those 22 genes did not possess obvious unifying features, except that one (YPO0348) is thought to be important for the virulence of Y. pestis (36) and two are related to motility. Thus, our analyses provided little evidence for widespread natural selection having shaped the genomic diversity in Y. pestis, and natural selection cannot explain the extensive variation in substitution rate.

We now consider a third possibility, namely that clock rate variation is simply the result of different bacterial replication rates between epidemic spread versus latent endemic disease. Sylvatic plagues consist of alternate cycles of enzootic disease, in which Y. pestis is rare in both fleas and hosts, and epizootic disease, in which it is common (1). During the enzootic phase, Y. pestis can survive for long periods in a dormant state, the basis of which is still poorly understood (37). In contrast, during a typical epizootic rodent-flea cycle, infected rodents suffer from intense bacteremia with >10⁷ CFU/mL of blood, which is essential for transmission because the flea blood meal is so small and transmission by fleas is so inefficient (38, 39). Human epidemics result from epizootic plague in an intermediate host, such as rats, whose fleas then transmit Y. pestis to humans, and back to the rodents who are the maintenance hosts for future evolution. The mutational clock rate is constant in the epizootic phase, at least for

Table 1. Date estimation of nodes potentially associated with pandemic plagues

Clade                                     Median date (95% CI)
1.ORI                                     1808 AD (1735, 1863)
1.ANT vs. branch 1                        1499 AD (1377, 1650)
Ancient genomes                           1346 AD (1312, 1353)
Polytomy N07                              1268 AD (1142, 1339)
N03, Descendent node of Angola (0.PE3)    68 AD (932 BC – 806 AD)
N01, Parent node of Angola (0.PE3)        816 BC (2775 BC – 590 AD)
MRCA                                      1322 BC (4394 BC – 510 AD)

The genome of strain Angola (0.PE3) was not included in the BEAST analysis because of its excessive length; the table lists the estimated ages of the flanking nodes in the genealogy, N01 and N03 (SI Appendix, Fig. S3).
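The Discussion argues that the 95% intervals of the two nodes flanking Angola overlap with the start of the Justinianic Pandemic in 541. A minimal sketch of that check follows, using only the intervals listed in Table 1. Encoding BC years as negative integers is an illustrative convention chosen here, not something taken from the paper, and it ignores the absence of a year zero.

```python
# 95% intervals from Table 1, written as signed calendar years
# (negative = BC, positive = AD). The signed encoding is an assumption
# made for this illustration only.
FLANKING_NODES = {
    "N03": (-932, 806),   # descendent node of Angola (0.PE3)
    "N01": (-2775, 590),  # parent node of Angola (0.PE3)
}
JUSTINIANIC_START = 541  # AD, the onset year cited in the Discussion


def contains(interval, year):
    """Return True when `year` lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= year <= high


overlaps = {node: contains(ci, JUSTINIANIC_START)
            for node, ci in FLANKING_NODES.items()}
print(overlaps)  # {'N03': True, 'N01': True}
```

By the same test, the interval for polytomy N07 (1142, 1339) does not contain 541, which is consistent with the text linking that node to the Black Death rather than to the earlier pandemic.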

[Figure 3 plots. Panel A: SNPs to MRCA (y axis) against nodes to MRCA (x axis). Panel B: coefficient of determination (R²) (y axis) against percentage of strain numbers (%) (x axis).]

Fig. 3. Linear correlation between SNPs and nodes to the MRCA. (A) Relationship between number of nodes and SNPs to the MRCA. The black line corresponds to a linear regression based on the points indicated by black dots. Open circles represent isolates with three or fewer nodes, which were excluded from the analysis because they belong to poorly sampled clades in the phylogeny. (B) Correlation between SNPs and nodes to the MRCA after randomly removing individual isolates in addition to the outliers. Variable proportions of isolates were randomly selected to construct ML trees based on SNPs, from which the coefficient of determination (R²) with nodes to the MRCA was calculated. Each box is based on 1,000 replicates.
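The subsampling check summarized in Fig. 3B can be illustrated with a small sketch. It is a simplification under stated assumptions: the data are synthetic stand-ins (not the paper's isolates), the helper names `r_squared` and `subsample_r2` are hypothetical, and the node counts are held fixed for every subset, whereas the authors rebuilt an ML tree for each random subset before recomputing R².

```python
import random
import statistics


def r_squared(xs, ys):
    """Coefficient of determination for a simple linear regression of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate subset: no variance to explain
    return (sxy * sxy) / (sxx * syy)


def subsample_r2(nodes, snps, fraction, replicates, rng):
    """R^2 values from random subsets holding `fraction` of the isolates."""
    size = max(3, round(fraction * len(nodes)))
    values = []
    for _ in range(replicates):
        picked = rng.sample(range(len(nodes)), size)
        values.append(r_squared([nodes[i] for i in picked],
                                [snps[i] for i in picked]))
    return values


# Synthetic stand-in data (NOT the paper's): node counts and noisy SNP counts.
rng = random.Random(0)
all_nodes = [rng.randint(1, 20) for _ in range(150)]
all_snps = [15 * n + rng.gauss(0, 60) for n in all_nodes]

# Mirror the exclusion in Fig. 3A: drop isolates with three or fewer nodes.
kept = [(n, s) for n, s in zip(all_nodes, all_snps) if n > 3]
nodes = [n for n, _ in kept]
snps = [s for _, s in kept]

summary = {}
for percent in (90, 70, 50, 30, 10):
    r2 = subsample_r2(nodes, snps, percent / 100, 200, rng)
    summary[percent] = (statistics.median(r2), max(r2) - min(r2))
    print(f"{percent:>3}%  median R2 = {summary[percent][0]:.2f}  "
          f"range = {summary[percent][1]:.2f}")
```

The sketch uses 200 replicates per subset size to stay fast; the figure legend reports 1,000. The qualitative pattern to look for is the one described in the text: a wider spread of R² in small subsets around a broadly stable median.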

580 | www.pnas.org/cgi/doi/10.1073/pnas.1205750110 Cui et al.


microsatellite loci (40), similar to the rate measured in laboratory serial transfer experiments, but data are lacking on clock rates in enzootic plague. The same is true of human epidemics, except that SNPs can be fixed at exceptionally high rates during transmission to new geographic areas (13). However, based on their epidemiological differences, epidemics and epizootics should be associated with much higher rates of bacterial replication per unit time than are typical during the long periods of intervening enzootic disease, even if the mutation rate per bacterial replication cycle is constant. Such periods of rapid replication are expected to result in faster observed clock rates per unit time, even in the absence of natural selection.

A similar “generation-time effect” has been proposed for higher taxa, wherein species with more generations per unit time accumulate more mutations than species with longer generation times (41). However, for Y. pestis, this hypothesis indicates that lineages that have been associated with more epidemic disease should have higher clock rates, and conversely that higher clock rates are an indicator for an association with epidemic disease, even in the absence of historical evidence.
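The replication-rate argument reduces to simple arithmetic: if the mutation rate per replication is constant, the clock rate per year scales with the number of replications per year. The sketch below works through that relationship. Every number in it is hypothetical and chosen only for illustration; the paper reports no replication counts, and the function names are not from the source.

```python
import math


def clock_rate_per_year(mu_per_replication, replications_per_year):
    """Substitutions per site per year for a constant per-replication mutation rate."""
    return mu_per_replication * replications_per_year


# Hypothetical values chosen only for illustration; none come from the paper.
MU = 1e-10            # mutations per site per replication, same in both phases
ENZOOTIC_REPL = 10    # replications per year during enzootic persistence
EPIDEMIC_REPL = 300   # replications per year during epidemic spread

enzootic_rate = clock_rate_per_year(MU, ENZOOTIC_REPL)
epidemic_rate = clock_rate_per_year(MU, EPIDEMIC_REPL)


def lineage_rate(fraction_epidemic):
    """Time-averaged clock rate for a lineage spending this share of time in epidemics."""
    return (fraction_epidemic * epidemic_rate
            + (1 - fraction_epidemic) * enzootic_rate)


for fraction in (0.0, 0.1, 0.5):
    print(f"epidemic fraction {fraction:.1f}: "
          f"{lineage_rate(fraction):.2e} substitutions/site/year")
```

Under these assumptions the ratio of the two clock rates equals the ratio of the replication rates, so a lineage with more epidemic history shows a faster average clock without any change in the per-replication mutation rate and without selection.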

Discussion

Many genomic comparisons of multiple isolates within a bacterial species have focused on the diversity of gene content. In some bacteria with extensive HGT, such as Escherichia coli and Streptococcus agalactiae, the pan-genome is open-ended, and its size grows continuously with the number of sequenced genomes (42, 43). For asexual bacteria, such as Salmonella enterica serovar Typhi (44), clones of S. aureus (25, 26), and the endosymbiont Buchnera (45), the genome is closed and any acquisition of foreign genes seems to be largely restricted to bacteriophages and plasmids. The pan-genome of Y. pestis is also closed, with a size only slightly larger than the average size of the sequenced genomes (SI Appendix, Fig. S1). Most of the accessory genome is variable in its parental species, Y. pseudotuberculosis, and the only genomic island which might have been acquired after the earliest nodes in the genealogy is a filamentous bacteriophage (46). In contrast, the polymorphic nucleotides in the core genome provide an unambiguous record of a clonal genealogy, allowing analyses of changes that might be associated with historical epidemics and pandemics.

This genealogical clock ticks at a strikingly variable rate, as demonstrated by variable substitution rates and branching events along individual branches of the genealogy. These rates resulted in great heterogeneity in the number of SNPs that modern genomes and ancient DNA have accumulated since the MRCA, and multiple polytomies that probably reflect bursts of fixed mutation that occurred over brief periods. We provide evidence that these variations do not reflect either mutators or diversifying selection and propose that the number of mutations per unit time varies with transmission history: epidemics result in more replication cycles and more SNPs than enzootic phases. This interpretation is supported by an extensive polytomy that gave rise to branches 1 through 4 shortly before the Black Death, and a second polytomy that gave rise to 1.ORI shortly before its global spread from Hong Kong in 1894 resulted in the third pandemic. We also infer the possible occurrence of historically undocumented epidemics in China because multiple other such polytomies and branch-specific high clock rates were observed for populations of Y. pestis that have only been isolated in China or Mongolia until now. Finally, we raise the possibility that strain Angola (0.PE3) might have been associated with the Justinianic Pandemic spread from Africa to all of Europe.

The Angola genome contains an extraordinarily high number of SNPs (SI Appendix, Fig. S3), which according to our interpretation might have arisen during the two centuries of epidemic waves in Europe and Africa that were associated with the Justinianic Pandemic. The age of Angola is also compatible with this interpretation: the 95% CIs of our estimates for the ages of the nodes that flank Angola in the genealogy overlap with the beginning of that pandemic in 541 (Table 1). Although the causative agent of the Justinianic Pandemic is still uncertain, the successful

[Figure 4 plots. Panel A, "Chi squared goodness of fit test": density (y axis) against chi squared (x axis). Panel B, "Gene contribution to Chi-square value, 30 most contributing genes": contribution (%) (y axis) per gene, colored by gene length (legend: 100, 200, 500, 1000, 2000, 4000), with an inset for all genes. Gene labels on the x axis of panel B, in extraction order: YPO0348, YPO2210, YPO3041, YPO3561, YPO2387, YPO2639, YPO0821, YPO3506, YPO2980, YPO0060, YPO2678, YPO1634, YPO1394, YPO2390, YPO0672, YPO3675, YPO0850, YPO3776, YPO4128, YPO0328, YPO2752, YPO3223, YPO1348, YPO1571, YPO0791, YPO1573, YPO0544, YPO2634, YPO0415, YPO0955. Panel C: number of nsSNPs (y axis) against gene length (bp) (x axis), with CI α thresholds 0.001, 0.01, and 0.05.]

Fig. 4. Identification of genes with unexpected numbers of nsSNPs. (A) Parametric and nonparametric χ² goodness-of-fit test of the expected number of nsSNPs per gene. The observed χ² (red arrow) is distinct from both the parametric distribution (black line) and the simulated distribution (blue histograms). (B) Contribution of each gene to the deviation from the expected number of nsSNPs. The bar plot shows the percentage contribution of each gene to the χ² goodness-of-fit statistic, ordered by decreasing contribution. The entire distribution is shown in the inset, and the main figure shows the 30 genes with the highest contributions. Gene lengths are indicated by colors. The arrow differentiates outliers (left of the arrow) from other genes. (C) Numbers of nsSNPs per gene as a function of gene length. Each circle represents a gene, the symbol size of which is proportional to the deviation from theoretical expectations (as measured by a χ² statistic). Confidence intervals of the theoretical expectations are represented in shades of blue for different α-thresholds (0.05, 0.01, 0.001). Red symbols indicate genes identified as outliers. The inner white line indicates the mean expectation.
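The logic behind Fig. 4A and 4B can be sketched in a few lines: compute a χ² goodness-of-fit statistic for nsSNP counts per gene, estimate a simulation-based (nonparametric) p-value, and rank genes by their percentage contribution to the statistic. This is an illustration under assumptions, not the authors' procedure. The null model here places nsSNPs on genes in proportion to gene length, which the source does not spell out, and the six genes, their lengths, and their counts are hypothetical. The parametric p-value is omitted because it needs a χ² survival function that the Python standard library does not provide.

```python
import random


def expected_counts(gene_lengths, total_snps):
    """Neutral expectation (assumed): nsSNPs fall on genes in proportion to length."""
    total_length = sum(gene_lengths)
    return [total_snps * length / total_length for length in gene_lengths]


def chi2_terms(observed, expected):
    """Per-gene terms of the chi-square statistic; their sum is the statistic."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


def monte_carlo_p(observed, gene_lengths, n_sim, rng):
    """Simulation-based p-value: share of neutral replicates at least as extreme."""
    total = sum(observed)
    expected = expected_counts(gene_lengths, total)
    stat_obs = sum(chi2_terms(observed, expected))
    at_least_as_extreme = 0
    for _ in range(n_sim):
        simulated = [0] * len(gene_lengths)
        draws = rng.choices(range(len(gene_lengths)), weights=gene_lengths, k=total)
        for gene in draws:
            simulated[gene] += 1
        if sum(chi2_terms(simulated, expected)) >= stat_obs:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return stat_obs, (at_least_as_extreme + 1) / (n_sim + 1)


# Hypothetical toy genes (lengths in bp) and nsSNP counts; gene_a is a planted outlier.
names = ["gene_a", "gene_b", "gene_c", "gene_d", "gene_e", "gene_f"]
lengths = [300, 600, 900, 1200, 1500, 1500]
observed = [9, 2, 3, 4, 5, 5]

stat, p_value = monte_carlo_p(observed, lengths, 999, random.Random(0))
terms = chi2_terms(observed, expected_counts(lengths, sum(observed)))
contribution = {name: 100 * term / stat for name, term in zip(names, terms)}
ranked = sorted(contribution, key=contribution.get, reverse=True)

print(f"chi-square = {stat:.2f}, simulated p = {p_value:.3f}")
for name in ranked:
    print(f"{name}: {contribution[name]:.1f}% of the statistic")
```

In this toy case a single short gene with many nsSNPs accounts for most of the statistic, which mirrors the paper's observation that the genome-wide deviation was driven by a handful of outlier genes. With the add-one estimator the smallest attainable p-value is 1/(n_sim + 1), so the simulated p-value is bounded by the number of replicates rather than by the strength of the signal.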


amplification of Y. pestis-specific genes from skeletons from the fifth to seventh centuries has been reported by two independent groups (47–49), whereas the biovar or genotype designations of these strains are unknown (50). Our dating implies that multiple extant Y. pestis lineages that were capable of causing human disease likely existed at that time, including the ancestors of Angola.

Similar demographic effects are expected to be true for other epidemic pathogens, especially for zoonotic diseases with similar lifestyles, such as Bacillus anthracis and Francisella tularensis (51). Variable mutation rates have also been reported for S. enterica serovar Montevideo, which was associated with a recent food-borne outbreak (52). However, we note that although a demographic explanation may account for the observed patterns, we cannot exclude alternative scenarios, such as host density changes driven by climatic factors (53, 54), which will also change the number of bacterial replication cycles.

Many of our conclusions were only possible because our sampling was so extensive. An analysis of fewer genomes, or many genomes from a more restricted geographical source, might have lacked the statistical power to demonstrate the extensive rate variation and the multiple polytomies we describe here. We anticipate that our data will enrich the understanding of evolutionary mechanisms in pathogens alternating between endemic and epidemic demographies. The genomic data presented in this research will also be of help for further exploration of the pathogenicity of Y. pestis and its historical patterns of spread.

Materials and Methods

Details of materials and methods are included in SI Appendix, SI Materials and Methods and Dataset S7, which lists the SNPs excluded because they are in repetitive regions.

ACKNOWLEDGMENTS. We thank Baizhong Cui, Zhizhong Song, Xingqi Dong, Xiang Dai, and Baolin Wang for DNA extraction; and Feng Xi, Rui Chen, Juan Wang, and Bo Wang for DNA library construction and sequencing. Support was provided by grants from the State Key Development Program for Basic Research of China (2009CB522600), the National Natural Science Foundation of China (Contract 30930001), Industry Research Special Foundation of China Ministry of Health (201202021), National Key Program for Infectious Diseases of China (2013ZX10004221-002), Shenzhen Biological Industry Development Special Foundation-Basic Research Key Projects (JC201005250088A), European Research Council (No. 260801-BIG_IDEA), and Science Foundation of Ireland (05/FE1/B882).

1. Gage KL, Kosoy MY (2005) Natural history of plague: Perspectives from more than a century of research. Annu Rev Entomol 50:505–528.
2. Stenseth NC, et al. (2008) Plague: Past, present, and future. PLoS Med 5(1):e3.
3. Yersin A (1894) La peste bubonique à Hong-Kong [Bubonic plague in Hong Kong]. Ann Inst Pasteur (Paris) 2:428–430.
4. Benedictow OJ (2004) The Black Death 1346–1353 (Boydell Press, Woodbridge).
5. Little LK (2007) Plague and the End of Antiquity: The Pandemic of 541-750 (Cambridge Univ Press, New York), pp xiv, 360.
6. Scott S, Duncan CJ (2001) Biology of Plagues: Evidence from Historical Populations (Cambridge Univ Press, Cambridge, New York), pp xiv, 420.
7. Cohn SK (2002) The Black Death Transformed: Disease and Culture in Early Renaissance Europe (Arnold and Oxford Univ Press, London, New York), pp x, 318.
8. Twigg G (1984) The Black Death: A Biological Reappraisal (Batsford, London).
9. Haensch S, et al. (2010) Distinct clones of Yersinia pestis caused the black death. PLoS Pathog 6(10):e1001134.
10. Tran TN, et al. (2011) High throughput, multiplexed pathogen detection authenticates plague waves in medieval Venice, Italy. PLoS ONE 6(3):e16735.
11. Bos KI, et al. (2011) A draft genome of Yersinia pestis from victims of the Black Death. Nature 478(7370):506–510.
12. Achtman M (2008) Evolution, population structure, and phylogeography of genetically monomorphic bacterial pathogens. Annu Rev Microbiol 62:53–70.
13. Morelli G, et al. (2010) Yersinia pestis genome sequencing identifies patterns of global phylogenetic diversity. Nat Genet 42(12):1140–1143.
14. Li Y, et al. (2008) Different region analysis for genotyping Yersinia pestis isolates from China. PLoS ONE 3(5):e2166.
15. Li Y, et al. (2009) Genotyping and phylogenetic analysis of Yersinia pestis by MLVA: Insights into the worldwide expansion of Central Asia plague foci. PLoS ONE 4(6):e6000.
16. Li R, et al. (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 20(2):265–272.
17. Eppinger M, et al. (2009) Draft genome sequences of Yersinia pestis isolates from natural foci of endemic plague in China. J Bacteriol 191(24):7628–7629.
18. Achtman M, et al. (1999) Yersinia pestis, the cause of plague, is a recently emerged clone of Yersinia pseudotuberculosis. Proc Natl Acad Sci USA 96(24):14043–14048.
19. Achtman M, et al. (2004) Microevolution and history of the plague bacillus, Yersinia pestis. Proc Natl Acad Sci USA 101(51):17837–17842.
20. Laukkanen-Ninios R, et al. (2011) Population structure of the Yersinia pseudotuberculosis complex according to multilocus sequence typing. Environ Microbiol 13(12):3114–3127.
21. WHO (2000) WHO Report on Global Surveillance of Epidemic-Prone Infectious Disease (WHO, Geneva), pp 25–37. Available at http://www.who.int/csr/resources/publications/surveillance/WHO_CDS_CSR_ISR_2000_1/en.
22. Song Y, et al. (2004) Complete genome sequence of Yersinia pestis strain 91001, an isolate avirulent to humans. DNA Res 11(3):179–197.
23. Zhou D, et al. (2004) Genetics of metabolic variations between Yersinia pestis biovars and the proposal of a new biovar, microtus. J Bacteriol 186(15):5147–5152.
24. Smith GJ, et al. (2009) Origins and evolutionary genomics of the 2009 swine-origin H1N1 influenza A epidemic. Nature 459(7250):1122–1125.
25. Harris SR, et al. (2010) Evolution of MRSA during hospital transmission and intercontinental spread. Science 327(5964):469–474.
26. Nübel U, et al. (2010) A timescale for evolution, population expansion, and spatial spread of an emerging clone of methicillin-resistant Staphylococcus aureus. PLoS Pathog 6(4):e1000855.
27. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.
28. Drummond AJ, Suchard MA (2010) Bayesian random local clocks, or one rate to rule them all. BMC Biol 8:114.
29. Drummond AJ, Ho SY, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4(5):e88.
30. Ho SY, Phillips MJ, Cooper A, Drummond AJ (2005) Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Mol Biol Evol 22(7):1561–1568.
31. Venditti C, Meade A, Pagel M (2006) Detecting the node-density artifact in phylogeny reconstruction. Syst Biol 55(4):637–643.
32. Barrick JE, et al. (2009) Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461(7268):1243–1247.
33. Eppinger M, et al. (2010) Genome sequence of the deep-rooted Yersinia pestis strain Angola reveals new insights into the evolution and pangenome of the plague bacterium. J Bacteriol 192(6):1685–1699.
34. Roumagnac P, et al. (2006) Evolutionary history of Salmonella typhi. Science 314(5803):1301–1304.
35. Kryazhimskiy S, Dushoff J, Bazykin GA, Plotkin JB (2011) Prevalence of epistasis in the evolution of influenza A surface proteins. PLoS Genet 7(2):e1001301.
36. Bearden SW, et al. (2009) Attenuated enzootic (pestoides) isolates of Yersinia pestis express active aspartase. Microbiology 155(Pt 1):198–209.
37. Drancourt M, Houhamdi L, Raoult D (2006) Yersinia pestis as a telluric, human ectoparasite-borne organism. Lancet Infect Dis 6(4):234–241.
38. Hinnebusch BJ (2005) The evolution of flea-borne transmission in Yersinia pestis. Curr Issues Mol Biol 7(2):197–212.
39. Lorange EA, Race BL, Sebbane F, Joseph Hinnebusch B (2005) Poor vector competence of fleas and the evolution of hypervirulence in Yersinia pestis. J Infect Dis 191(11):1907–1912.
40. Girard JM, et al. (2004) Differential plague-transmission dynamics determine Yersinia pestis population genetic structure on local, regional, and global scales. Proc Natl Acad Sci USA 101(22):8408–8413.
41. Laird CD, McConaughy BL, McCarthy BJ (1969) Rate of fixation of nucleotide substitutions in evolution. Nature 224(5215):149–154.
42. Tettelin H, et al. (2005) Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan-genome”. Proc Natl Acad Sci USA 102(39):13950–13955.
43. Touchon M, et al. (2009) Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet 5(1):e1000344.
44. Holt KE, et al. (2008) High-throughput sequencing provides insights into genome variation and evolution in Salmonella Typhi. Nat Genet 40(8):987–993.
45. Moran NA, McLaughlin HJ, Sorek R (2009) The dynamics and time scale of ongoing genomic erosion in symbiotic bacteria. Science 323(5912):379–382.
46. Derbise A, et al. (2007) A horizontally acquired filamentous phage contributes to the pathogenicity of the plague bacillus. Mol Microbiol 63(4):1145–1157.
47. Wiechmann I, Grupe G (2005) Detection of Yersinia pestis DNA in two early medieval skeletal finds from Aschheim (Upper Bavaria, 6th century A.D.). Am J Phys Anthropol 126(1):48–55.
48. Drancourt M, et al. (2004) Genotyping, Orientalis-like Yersinia pestis, and plague pandemics. Emerg Infect Dis 10(9):1585–1592.
49. Drancourt M, et al. (2007) Yersinia pestis Orientalis in remains of ancient plague patients. Emerg Infect Dis 13(2):332–333.
50. Vergnaud G (2005) Yersinia pestis genotyping. Emerg Infect Dis 11(8):1317–1318, author reply 1318–1319.
51. Keim PS, Wagner DM (2009) Humans and evolutionary and ecological forces shaped the phylogeography of recently emerged diseases. Nat Rev Microbiol 7(11):813–821.
52. Bakker HC, et al. (2011) A whole-genome single nucleotide polymorphism-based approach to trace and identify outbreaks linked to a common Salmonella enterica subsp. enterica serovar Montevideo pulsed-field gel electrophoresis type. Appl Environ Microbiol 77(24):8648–8655.
53. Stenseth NC, et al. (2006) Plague dynamics are driven by climate variation. Proc Natl Acad Sci USA 103(35):13110–13115.
54. Xu L, et al. (2011) Nonlinear effect of climate on plague during the third pandemic in China. Proc Natl Acad Sci USA 108(25):10214–10219.
