
Dated molecular phylogenies indicate a Miocene origin for Arabidopsis thaliana

Mark A. Beilstein (a,1), Nathalie S. Nagalingum (a,2), Mark D. Clements (a), Steven R. Manchester (b), and Sarah Mathews (a,1)

(a) The Arnold Arboretum of Harvard University, Jamaica Plain, MA 02138; and (b) Florida Museum of Natural History, University of Florida, Gainesville, FL 32611

Edited* by Michael J. Donoghue, Yale University, New Haven, CT, and approved September 7, 2010 (received for review August 26, 2009)

Dated molecular phylogenies are the basis for understanding species diversity and for linking changes in rates of diversification with historical events such as restructuring in developmental pathways, genome doubling, or dispersal onto a new continent. Valid fossil calibration points are essential to the accurate estimation of divergence dates, but for many groups of flowering plants fossil evidence is unavailable or limited. Arabidopsis thaliana, the primary genetic model in plant biology and the first plant to have its entire genome sequenced, belongs to one such group, the plant family Brassicaceae. Thus, the timing of A. thaliana evolution and the history of its genome have been controversial. We bring previously overlooked fossil evidence to bear on these questions and find the split between A. thaliana and Arabidopsis lyrata occurred about 13 Mya, and that the split between Arabidopsis and the Brassica complex (broccoli, cabbage, canola) occurred about 43 Mya. These estimates, which are two- to threefold older than previous estimates, indicate that gene, genomic, and developmental evolution occurred much more slowly than previously hypothesized and that Arabidopsis evolved during a period of warming rather than of cooling. We detected a 2- to 10-fold shift in species diversification rates on the branch uniting Brassicaceae with its sister families. The timing of this shift suggests a possible impact of the Cretaceous–Paleogene mass extinction on their radiation and that Brassicales codiversified with pierid butterflies that specialize on mustard-oil–producing plants.

Bayesian dating | Brassica | Brassicaceae | chronogram | fossil calibration

Author contributions: M.A.B. and S.M. designed research; M.A.B., N.S.N., and S.R.M. performed research; M.D.C. contributed new reagents/analytic tools; M.A.B., N.S.N., M.D.C., and S.R.M. analyzed data; and M.A.B. and S.M. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Freely available online through the PNAS open access option.

Data deposition: All accession numbers for sequences downloaded from GenBank and sequences deposited in GenBank as part of this study appear in Table S2.

(1) To whom correspondence may be addressed. E-mail: [email protected] or [email protected].

(2) Present address: University of California Museum of Paleontology, University of California, Berkeley, CA 94720.

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.0909766107/-/DCSupplemental.

18724–18728 | PNAS | October 26, 2010 | vol. 107 | no. 43 www.pnas.org/cgi/doi/10.1073/pnas.0909766107

The most important genetic model in plant biology is Arabidopsis thaliana. It is the first plant to have its entire genome sequenced, and it serves as a key comparison point with other eukaryotic genomes. A. thaliana is diploid and has a small genome distributed on just five chromosomes, considerations in its choice as a model (1). The age of the Arabidopsis crown group (CG), previously estimated at 5.8–3 Mya (2, 3), and of splits within Brassicaceae have been used to understand the pace of evolution in genes affecting self-incompatibility (4, 5), the rate of change in signal transduction and gene expression (6, 7), the persistence of shared chromosomal rearrangements in A. thaliana and Brassica oleracea (8), the tempo of evolution of miRNA sequences (9), the evolution of pierid butterflies specializing in plants that produce mustard oils (10), and the ages of whole-genome duplication (WGD) events giving rise to gene pairs in Arabidopsis (11). As genomes of additional Brassicaceae (e.g., Capsella rubella) and other Brassicales (e.g., Carica papaya) (12) are sequenced, the importance of robust estimates of divergence dates relating these genomes to one another and to the geological record increases substantially.

The accuracy of divergence times inferred from sequence data depends on valid, verifiable fossils to calibrate phylogenetic trees. Previous dates for the origin of Arabidopsis relied on the report of fossil pollen assigned to the genus Rorippa (Brassicaceae) (2, 3). We discovered that this report is not linked to a physical specimen, published image, or description; thus, its validity for calibration of a Brassicaceae phylogeny cannot be evaluated. Reexamination of fossils from the order Brassicales (Table S1) revealed six fossil taxa that are sufficiently documented to serve as age constraints; however, only one has been used in previous estimations of divergence times (13). Among the taxa that had been overlooked is Thlaspi primaevum (Brassicaceae) (14), an Oligocene fossil with angustiseptate winged fruits (Fig. 1A) from the Ruby Basin Flora of southwestern Montana, dated at 30.8–29.2 Mya (15). It may have been overlooked because many of the generic determinations from the Ruby Basin Flora remain to be verified, or there may have been concerns about convergence of winged fruits from unrelated angiosperm families. Although the placement of T. primaevum in Brassicaceae now has been confirmed (16), this fruit form evolved independently multiple times within the family (17). However, among extant Brassicaceae, angustiseptate fruit combined with concentrically striate seeds (Fig. 1 B, C, and E) is unique to species of Thlaspi. We examined the seed chamber of T. primaevum and found seed striations in the same pattern as those of extant Thlaspi seeds (Fig. 1D). Alliaria petiolata is the only other Brassicaceae with striated seeds (18) (Fig. S1), but it has longitudinally oriented striations, and the fruit is latiseptate; latiseptate fruits are ancestral for the clade defined by the coalescence of A. petiolata and Thlaspi arvense (Fig. S1). Thus, T. primaevum is a valid age constraint within CG Brassicaceae, and it was placed at the coalescence of T. arvense and A. petiolata as a minimum age constraint for this split (Fig. 2 and Figs. S2–S4).

The other five potentially useful Brassicales fossils are placed outside the Brassicaceae crown group. Capparidoxylon holleisii is a Miocene wood fossil with affinities to extant Capparis, dated at 17–16.3 Mya (Table S1) (19). We explored its use as a constraint within Core Brassicales (Brassicaceae + Cleomaceae + Capparaceae) (SI Materials and Methods). Dressiantha bicarpellata is a Turonian fossil dated at 93.6–89.3 Mya (20), the oldest known putative brassicalean fossil; it provided the single age constraint used by Couvreur et al. (13). However, membership of D. bicarpellata within Brassicales is contentious (21), and thus its validity as an age constraint for estimating divergence times in the order remains unclear. To explore these issues we (i) assessed the impact of using it to constrain four nodes along the backbone of the Brassicales tree, based on results from analyses of combined nucleotide and morphological data (SI Materials and Methods) and (ii) estimated ages in Brassicales without D. bicarpellata. Finally, three species of Akania were used to constrain the ages of nodes within the Bretschneidera/Akania/Tropaeolum clade (Fig. 2). To take full advantage of these age constraints deeper in Brassicales, we expanded our sample of the plastid locus NADH dehydrogenase subunit F (ndhF) and the nuclear locus phytochrome A (PHYA), which have provided a robust phylogenetic framework for the Brassicaceae (17, 22), to include members of the Core Brassicales, giving us a matrix of 179 species (SI Materials and Methods and Table S2). We estimated divergence times for trees inferred from each locus and from the combined data.

Most estimates of divergence times in Brassicaceae have assumed that rates of nucleotide evolution are equal across the tree and have been based on either the mutation rate at synonymous sites (2, 3) or the estimated times of genome duplications (23–26). Two recent analyses have allowed rates to be uncorrelated across the tree, but one of these analyses relied on D. bicarpellata as a single age constraint (13), and the other relied on a single secondary age constraint (27). Neither of these analyses took into account uncertainties in the topology of the tree (13, 27). Our analyses used multiple age constraints, allowed rates of nucleotide evolution to be uncorrelated across the tree, and accounted for uncertainty in the phylogenetic hypothesis and in the placement of the fossil age constraints.
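The contrast between equal rates across the tree and uncorrelated rates can be illustrated with a minimal sketch. It is not the BEAST implementation: the function names, the mean rate, the branch durations, and the lognormal spread are all hypothetical values chosen for illustration. Under a strict clock every branch shares one rate; under an uncorrelated lognormal model each branch draws its own rate independently of its neighbors.

```python
import random


def strict_clock_rates(n_branches, mean_rate):
    """Equal-rates assumption: every branch shares one substitution rate."""
    return [mean_rate] * n_branches


def uncorrelated_lognormal_rates(n_branches, mean_rate, sigma, rng):
    """Draw each branch rate independently from a lognormal distribution.

    sigma is the standard deviation of log(rate); sigma = 0 recovers the
    strict clock. mu is chosen so the rate multiplier has mean 1.
    """
    mu = -0.5 * sigma ** 2
    return [mean_rate * rng.lognormvariate(mu, sigma) for _ in range(n_branches)]


def expected_substitutions(rates, durations_myr):
    """Expected substitutions per site on each branch: rate x elapsed time."""
    return [rate * t for rate, t in zip(rates, durations_myr)]


# Hypothetical inputs: three branches and a rate in substitutions/site/Myr.
rng = random.Random(1)
durations = [5.0, 5.0, 10.0]
strict = strict_clock_rates(3, 1e-3)
relaxed = uncorrelated_lognormal_rates(3, 1e-3, 0.5, rng)
print(expected_substitutions(strict, durations))
print(expected_substitutions(relaxed, durations))
```

Because the same observed branch length can arise from a fast rate over a short time or a slow rate over a long time, relaxing the clock makes the choice of fossil constraints more influential, which is one reason multiple calibrations matter here.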

Results and Discussion

Using Bayesian approaches (28), we estimated the origin of CG Arabidopsis at ≈13.0 Mya [95% highest probability density (HPD): 17.9–8.0] (Fig. 2 and Table 1), considerably older than the frequently cited estimate of 5.8–3 Mya (Table 1) (2, 3). When a penalized likelihood approach (29, 30) was used, the estimate for this node was even older (Table S3). In the subsequent discussion, we focus on estimates from the Bayesian analyses (Table 1 and Table S3). The placement or exclusion of D. bicarpellata had little impact on the age of this node and other nodes in the tree (Table S3). Conversely, the T. primaevum constraint did lead to slightly older ages for most nodes in the tree (Table S4). Nonetheless, even without this constraint within CG Brassicaceae, age estimates were substantially older than previous estimates (Table 1), possibly resulting from our use of multiple constraints, dense taxon sampling (but see ref. 13), from our allowing for phylogenetic uncertainty, alternative placement of fossil calibrations, from the relative completeness of our molecular datasets (cf. ref. 13), or from a combination of these factors. The previous estimate for CG Arabidopsis is primarily Pliocene (5.3–1.8 Mya), whereas the revised age falls within the Miocene (23.03–5.3 Mya) and spans a particularly warm period in recent earth history that included the Middle Miocene climatic optimum (31). Thus, warming may have played a role in the divergence of A. thaliana from other Arabidopsis. Moreover, the much older optimal age estimate of 13.0 Mya suggests that the pace of chromosomal rearrangements (1), divergent gene regulation (6), and the breakdown of self-incompatibility (4) may have proceeded more slowly than has been appreciated.

The first split within CG Brassicaceae occurred in the early Eocene, ≈54.3 Mya (95% HPD: 64.2–45.2) (Fig. 2 and Figs. S2–S4). CG Lineages I, II, and III, which contain the majority of Brassicaceae (17, 22), radiated in the mid to late Eocene, from 43.4 to 33.3 Mya (Table 1). Because A. thaliana and B. oleracea (broccoli and related plants) occur in Lineages I and II, respectively, the relatively deep coalescence of these scientifically and economically important species dates to ≈43.2 Mya (95% HPD: 50.7–36.6; Table 1), indicating that conservation of the large chromosomal blocks shared by these two species (8) has persisted for longer than previously thought (2, 3, 13). The B. oleracea genome, along with several close relatives, including Brassica rapa (turnip) and Brassica napus (canola), has been the subject of intense study because of the agricultural importance of the group. The whole-genome triplication event that likely facilitated the origin of these crops was dated previously at 14 Mya (32). Our age estimate for the triplication is centered at 22.5 Mya (95% HPD: 28.3–15.6; Table 1).

Our data suggest that the Core Brassicales stem group (SG) originated around 71.3 Mya (95% HPD: 83.2–59.7), shortly before the end of the Cretaceous, when SG Capparaceae split from SG Brassicaceae and SG Cleomaceae (Fig. 2 and Table 1). Furthermore, using LASER (33), we detected a twofold shift in net diversification rate along the branch to core Brassicales when extinction rates are low or a 10-fold shift when extinction rates are high (Table S5). SG Brassicaceae split from SG Cleomaceae at the Cretaceous–Paleogene boundary (KPB; ≈65.5 Mya) (Table 1 and Fig. 2). Thus, stem members of all three families may have survived the KPB mass extinction event before their subsequent radiations in the Eocene (Fig. 2 and Fig. S4). A pattern of origin before the KPB and radiation afterward has been noted in animal (34) and other plant lineages (35), although the lag time before recovery is greater in Brassicaceae than has been inferred in the other groups (35).

A great deal of interest has centered on the timing of WGD events in the history of A. thaliana, particularly on the most recent of these events, the α WGD, estimated to have occurred between 100 and 20 Mya (11, 36–39), and the β WGD, estimated to have occurred between 235 and 112 Mya (36, 38, 39) (Fig. 2). Recent lines of evidence place the α WGD within Brassicales (12), perhaps within Brassicaceae (26, 27, 36, 40), and place the β WGD within Brassicales (12, 40). These placements have fueled speculations that genome-doubling events are linked with diversification in Brassicaceae (13, 41) and with survival of the KPB mass extinction event (24). These hypotheses are attractive, because WGD may result in the colonization of new habitats (42), major ecological transitions (43), invasiveness (44), and the generation of morphological novelty (45). Our chronogram (Fig. 2 and Fig. S4) provides the framework for testing these hypotheses. For example, if the β WGD occurred around 70 Mya, placing it on the stem lineage of Core Brassicales (Fig. 2), we would expect to find that paralogs stemming from this event are shared by members of Brassicaceae, Cleomaceae, and Capparaceae but not by other families in Brassicales. If a WGD maps to this branch, the locus of the shift in net diversification rate is consistent with a link between genome doubling and survival across the KPB mass extinction and perhaps to subsequent diversification.

Fig. 1. Fossil and extant angustiseptate winged Thlaspi fruits and striated seeds. (A) T. primaevum fossil from the Ruby Basin Flora, western Montana. Extant T. arvense fruit backlit to show placement of seeds in the two locules (wings) (B) and with a portion of the valve removed to show striated seeds (C). Locules are separated by the replum. (D) Scanning electron micrograph of the fossil seed chamber (indicated by rectangle in A) showing impressions of striated seeds. (E) T. arvense striated seed. S, seed; R, replum; W, wing. (Dotted scale bars, 5 mm; solid scale bars, 1 mm.)

Fig. 2. Brassicales chronogram inferred using BEAST (28). Clades with >50 species are represented by wedges proportional to species diversity. See figure for key to symbols. Thickened branch leading to Core Brassicales marks an inferred 2- to 10-fold shift in diversification rate. Putative intervals for the α and β WGD are based on refs. 12, 24, and 26. Oli, Oligocene; Pal, Paleocene; Pl, Pliocene; Q, Quaternary.

Despite the interest in genome doubling and its effects, it is important to consider other factors that might promote diversification. Ehrlich and Raven (46) cited the interaction between Brassicales and pierid butterflies as a prime example of coevolution between plants and insects. Pierids specializing on Brassicales produce nitrile-specifier protein essential for detoxification of glucosinolates, a chemical defense found almost exclusively in Brassicales (10). This key innovation facilitated a host switch onto Brassicales, dated at ≈85 Mya (10, 47), and was followed by the subsequent diversification of pierids (10). The locus of the shift in Brassicales net diversification rate and the timing of the pierid butterfly radiation suggest that they codiversified (Fig. 2 and Fig. S4), supporting a central tenet of coevolutionary theory (46). However, the extent to which diversification in Brassicales may have driven pierid diversification, and vice-versa, requires further study.

Carefully designed comparative studies reveal the processes that generate and maintain biodiversity. In this study we identify a major shift in diversification rate that may correlate with survival across and diversification following the KPB mass extinction event, WGD events, and the evolution of pierid butterflies. Our results suggest a number of ways in which our understanding of the history of CG Arabidopsis should be revised, notably that it evolved during a period of warming rather than cooling and that genome structure and developmental processes have been slower to evolve than has been appreciated. The evolutionary history of A. thaliana and its neighborhood described here will allow more precise application of our understanding of this model organism to other flowering plants, land plants, and all eukaryotes.
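The epoch assignments and the "two- to threefold older" comparison discussed in this section reduce to simple arithmetic, sketched below. The epoch boundaries and ages are those quoted in the text and in Table 1; the function names and the convention that a boundary age belongs to the younger epoch are assumptions made for illustration.

```python
# Epoch boundaries in Mya as quoted in the text: (older bound, younger bound).
EPOCHS = {"Miocene": (23.03, 5.3), "Pliocene": (5.3, 1.8)}


def epoch_of(age_mya):
    """Return the epoch whose (younger, older] interval contains the age.

    Ages outside the two epochs listed above return None.
    """
    for name, (older, younger) in EPOCHS.items():
        if younger < age_mya <= older:
            return name
    return None


def fold_older(new_age, old_age):
    """How many times older the revised age is than a previous estimate."""
    return new_age / old_age


# CG Arabidopsis: revised optimal age versus the bounds of the earlier estimate.
print(epoch_of(13.0), epoch_of(5.8), epoch_of(3.0))

# Arabidopsis-Brassica split: 43.2 Mya here versus 21-16 Mya in refs. 2 and 3.
print(fold_older(43.2, 21.0), fold_older(43.2, 16.0))
```

For the Arabidopsis–Brassica split the two ratios are roughly 2.1 and 2.7, which matches the two- to threefold range stated in the abstract.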

Materials and Methods

GenBank accession numbers, specifics of the evaluation of published fossils, and detailed methods can be found in SI Materials and Methods.

Phylogenetic Inference. We used RAxML (48) to infer phylogeny in Brassicales from three alignments. The plastid ndhF alignment comprised 2,067 nucleotide sites representing 170 species; the nuclear PHYA alignment comprised 1,824 nucleotide sites representing 139 species; the combined alignment comprised 3,798 nucleotide sites from 177 species (Table S2). Analyses assumed a general time-reversible model of sequence evolution, with γ-distributed rate heterogeneity; partitions in the combined matrix were allowed to evolve independently. The topology and clade support in the resulting trees are consistent with previously published phylogenetic estimates using ndhF (17) and other plastid (49) and nuclear (3, 22, 50, 51) markers.

Ultrametric Tree and Divergence Date Estimation. We calculated divergence dates for Brassicales from all three alignments using r8s v. 1.71 (30) and BEAST v. 1.5.3 (28). These methods account in different ways for variation in substitution rates among branches on the tree. We used r8s to explore the impact of placing the Dressiantha constraint at different nodes along the Brassicales backbone; potential placements were inferred in analyses of combined ndhF and morphological data (details are given in SI Materials and Methods). Because the placement of Dressiantha had little impact on age estimates within Core Brassicales (Table S3), we used a single placement in our BEAST analyses.

We allowed BEAST to infer topology, branch lengths, and dates for ndhF and combined data. For the PHYA data, we fixed the tree to that used in the r8s analyses and allowed BEAST to alter branch lengths while inferring dates (SI Materials and Methods). We used a uniform distribution for all three fossil calibrations with the lower hard bound of the distribution set to the youngest age of the fossil (see text, SI Materials and Methods, and Table S1) and the upper hard bound set to the first fossil record for eudicot pollen (125 Mya) (52) for D. bicarpellata and Akania sp. For other fossil calibrations the upper hard bound was determined by the age of other fossils used in the analysis (SI Materials and Methods). BEAST runs of 3 × 10^7 generations, saving data every 1,000 generations, produced 30,000 estimates of dates under a Yule speciation prior and an uncorrelated relaxed clock (28) for the single-gene datasets. Convergence statistics for each single-gene run were analyzed in Tracer, resulting in 27,000 post–burn-in trees. BEAST runs of 6 × 10^7 generations, saving data every 1,000 generations, produced 60,000 estimates under a Yule speciation (53) prior and uncorrelated relaxed clock (28) for the combined data. Also, for these data, we resampled at a lower frequency using LogCombiner v. 1.5.3 (28), resulting in a tree file with 30,000 trees. We used TreeAnnotator v. 1.5.3 (28) to produce maximum clade credibility trees from the post–burn-in trees and to determine the 95% probability density of ages for all nodes in the tree (Figs. S2–S4).
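The sampling arithmetic and the hard-bounded uniform calibration described above can be sketched as follows. This is an illustration, not BEAST code: the function names are hypothetical, and the 10% burn-in fraction is an assumption inferred from the reported drop from 30,000 samples to 27,000 post–burn-in trees rather than a value stated in the text. The D. bicarpellata bounds (89.3 Mya youngest fossil age, 125 Mya upper bound) are those given in the text.

```python
import math


def n_samples(generations, sample_every):
    """Number of states saved when logging every `sample_every` generations."""
    return generations // sample_every


def post_burnin(samples, burnin_fraction):
    """Samples remaining after discarding the initial burn-in fraction."""
    return samples - int(round(samples * burnin_fraction))


def uniform_log_prior(age_mya, lower, upper):
    """Log density of a uniform calibration with hard bounds [lower, upper].

    Ages outside the bounds have zero prior probability (log density -inf).
    """
    if lower <= age_mya <= upper:
        return -math.log(upper - lower)
    return float("-inf")


single_gene = n_samples(3 * 10**7, 1000)   # 30,000 saved estimates
kept = post_burnin(single_gene, 0.10)      # 27,000 if burn-in is 10% (assumed)
combined = n_samples(6 * 10**7, 1000)      # 60,000 saved estimates
print(single_gene, kept, combined)

# D. bicarpellata calibration: lower = youngest fossil age, upper = 125 Mya.
print(uniform_log_prior(100.0, 89.3, 125.0))  # inside the bounds
print(uniform_log_prior(80.0, 89.3, 125.0))   # younger than the fossil: -inf
```

A hard lower bound at the youngest fossil age encodes the fossil as a strict minimum age for the node: no sampled node age younger than the fossil is ever accepted.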

Shifts in Net Diversification Rate. We used LASER v. 2.2 (33) to test for significant shifts in net diversification rates over the history of Brassicales. Clade diversity for each group was determined from the Angiosperm Phylogeny Website (54). We calculated the likelihoods of our phylogenetic data under a model of constant diversification rate over time and under a two-rate model in which the diversification rate is permitted to change (Table S5). Because analyses of diversification rate can be sensitive to extinction, we determined whether our results were sensitive to differences in extinction fraction by reanalyzing the data under low (a = 0) and high (a = 0.99) extinction (Table S5).
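To see why the assumed extinction fraction matters, the sketch below uses the method-of-moments estimator for a stem-group age attributed to Magallón and Sanderson, r = ln[n(1 − a) + a] / t, where n is extant species richness, t is the stem age, and a is the extinction fraction. This is a simpler stand-in, not the likelihood comparison that LASER performs, and the clade size and age are hypothetical values, not data from this study.

```python
import math


def net_diversification_stem(n_species, stem_age_myr, extinction_fraction):
    """Net diversification rate r = ln[n(1 - a) + a] / t for a stem-group age.

    a is the extinction fraction (extinction rate / speciation rate), 0 <= a < 1.
    """
    a = extinction_fraction
    return math.log(n_species * (1.0 - a) + a) / stem_age_myr


# Hypothetical clade: 100 extant species and a 50-Myr stem age.
for a in (0.0, 0.99):
    print(a, net_diversification_stem(100, 50.0, a))
```

For the same richness and age, a high extinction fraction yields a much lower inferred net rate, so a rate shift between clades can look quite different under a = 0 and a = 0.99. That sensitivity is the motivation for reporting results under both extremes.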

Table 1. Comparison of published dates with those estimated in this study using Bayesian inference in BEAST on the ndhF + PHYA Brassicales chronogram

| Node age | Koch et al. 2000 (2), 2001 (3): synonymous substitution rate, single fossil constraint (Mya) | Franzke et al. 2009 (27): BEAST, nad4, single secondary constraint (Mya) | Couvreur et al. 2010 (13): BEAST, supermatrix, single fossil constraint (Mya) | This study: BEAST, ndhF + PHYA, four fossil constraints (Mya) |
| --- | --- | --- | --- | --- |
| Arabidopsis CG | 5.8–3 | ND | ND | 17.9 – 13.0 – 8.0 |
| Brassiceae CG (whole genome triplication) | 14 | ND | ND | 28.3 – 22.5 – 15.6 |
| Lineage I CG | 19–13 | 19.0 – 8.0 – 0.5 | 36.1 – 27.3 – 18.2 | 42.8 – 35.6 – 28.5 |
| Lineage II CG | 18.1 | ND | 37.2 – 28.2 – 18.1 | 37.8 – 30.8 – 23.7 |
| Lineage III CG | ND | ND | 29.3 – 21.4 – 14.8 | 42.7 – 35.5 – 28.3 |
| Arabidopsis–Brassica MRCA | 21–16 | ND | ≈34* | 50.7 – 43.2 – 36.6 |
| Core Brassicaceae CG | ND | 28.0 – 11.0 – 1.0 | 42.8 – 32.3 – 20.9 | 54.3 – 46.9 – 39.4 |
| Brassicaceae CG | 60–30 | 35.0 – 15.0 – 1.0 | 49.4 – 37.6 – 24.2 | 64.2 – 54.3 – 45.2 |
| Brassicaceae–Cleomaceae MRCA | ND | 35.0 – 19.0 – 1.0 | ND | 76.5 – 64.5 – 54.4 |
| Core Brassicales CG | ND | ND | ND | 83.2 – 71.3 – 59.7 |

In each triplet of ages, the central value is the optimal reconstruction and is bracketed by the upper and lower bounds of the 95% HPD interval. CG, crown group; MRCA, most recent common ancestor; ND, not determined.
*Approximate mean date inferred from chronogram and for which HPD data were not available (13).

ACKNOWLEDGMENTS. We thank Elisabeth Wheeler for her expertise in evaluating Capparidoxylon holleisii using Inside Wood (http://insidewood.lib.ncsu.edu), Aleksej Hvalj (Komarov Botanical Institute, Russian Academy of Sciences) for help identifying paleobotanical references from Russian scientific literature, Jocelyn Hall (University of Alberta, Edmonton, AB, Canada) for DNA material of Brassicales species, and Paul Forster (Queensland Herbarium, Brisbane, QLD, Australia) and David Woodlee (Burringbar Rainforest Nursery, Burringbar, NSW, Australia) for plant material of Akania bidwillii. This work was supported by a Mercer Postdoctoral Fellowship (to M.A.B.) awarded by the Arnold Arboretum of Harvard University.

1. Berr A, et al. (2006) Chromosome arrangement and nuclear architecture but not centromeric sequences are conserved between Arabidopsis thaliana and Arabidopsis lyrata. Plant J 48:771–783.

2. Koch MA, Haubold B, Mitchell-Olds T (2000) Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis, and related genera (Brassicaceae). Mol Biol Evol 17:1483–1498.

3. Koch M, Haubold B, Mitchell-Olds T (2001) Molecular systematics of the Brassicaceae: Evidence from coding plastidic matK and nuclear Chs sequences. Am J Bot 88:534–544.

4. Sherman-Broyles S, et al. (2007) S locus genes and the evolution of self-fertility in Arabidopsis thaliana. Plant Cell 19:94–106.

5. Foxe JP, et al. (2009) Recent speciation associated with the evolution of selfing in Capsella. Proc Natl Acad Sci USA 106:5241–5245.

6. Ha M, Kim ED, Chen ZJ (2009) Duplicate genes increase expression diversity in closely related species and allopolyploids. Proc Natl Acad Sci USA 106:2295–2300.

7. Seoighe C, Gehring C (2004) Genome duplication led to highly selective expansion of the Arabidopsis thaliana proteome. Trends Genet 20:461–464.

8. Ziolkowski PA, Kaczmarek M, Babula D, Sadowski J (2006) Genome evolution in Arabidopsis/Brassica: Conservation and divergence of ancient rearranged segments and their breakpoints. Plant J 47:63–74.

9. Felippes FF, Schneeberger K, Dezulian T, Huson DH, Weigel D (2008) Evolution of Arabidopsis thaliana microRNAs from random sequences. RNA 14:2455–2459.

10. Wheat CW, et al. (2007) The genetic basis of a plant-insect coevolutionary key innovation. Proc Natl Acad Sci USA 104:20427–20431.

11. Blanc G, Hokamp K, Wolfe KH (2003) A recent polyploidy superimposed on older large-scale duplications in the Arabidopsis genome. Genome Res 13:137–144.

12. Ming R, et al. (2008) The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature 452:991–996.

13. Couvreur TLP, et al. (2010) Molecular phylogenetics, temporal diversification, and principles of evolution in the mustard family (Brassicaceae). Mol Biol Evol 27:55–71.

14. Becker HF (1961) Oligocene Plants from the Upper Ruby River Basin, Southwestern Montana. Geological Society of America Memoir (Geological Society of America), Vol 82, pp 1–127.

15. Wing SL (1987) Eocene and Oligocene floras and vegetation of the Rocky Mountains. Ann Mo Bot Gard 74:748–784.

16. Manchester S, O’Leary E (2010) Phylogenetic distribution and identification of fin-winged fruits. Bot Rev 76:1–82.

17. Beilstein MA, Al-Shehbaz IA, Kellogg EA (2006) Brassicaceae phylogeny and trichome evolution. Am J Bot 93:607–619.

18. Appel O, Al-Shehbaz IA (2002) Cruciferae. Families and Genera of Vascular Plants, ed Kubitzki K (Springer, Berlin), Vol 5, pp 75–174.

19. Selmeier A (2005) Capparidoxylon holleisii nov. spec., a silicified Capparis (Capparaceae) wood with insect coprolites from the Neogene of southern Germany. Zitteliana 45:199–209.

20. Gandolfo MA, Nixon KC, Crepet WL (1998) A new fossil flower from the Turonian of New Jersey: Dressiantha bicarpellata gen. et sp. nov. (Capparales). Am J Bot 85:964–974.

21. De Craene LPR, Haston E (2006) The systematic relationships of glucosinolate-producing plants and related families: A cladistic investigation based on morphological and molecular characters. Bot J Linn Soc 151:453–494.

22. Beilstein MA, Al-Shehbaz I, Mathews S, Kellogg E (2008) Brassicaceae phylogeny inferred from phytochrome A and ndhF sequence data: Tribes and trichomes revisited. Am J Bot 95:1307–1327.

23. Ermolaeva MD, Wu M, Eisen JA, Salzberg SL (2003) The age of the Arabidopsis thaliana genome duplication. Plant Mol Biol 51:859–866.

24. Fawcett JA, Maere S, Van de Peer Y (2009) Plants with double genomes might have had a better chance to survive the Cretaceous-Tertiary extinction event. Proc Natl Acad Sci USA 106:5737–5742.

25. Henry Y, Bedhomme M, Blanc G (2006) History, protohistory and prehistory of the Arabidopsis thaliana chromosome complement. Trends Plant Sci 11:267–273.

26. Schranz ME, Mitchell-Olds T (2006) Independent ancient polyploidy events in the sister families Brassicaceae and Cleomaceae. Plant Cell 18:1152–1165.

27. Franzke A, German D, Al-Shehbaz IA, Mummenhoff K (2009) Arabidopsis family ties: Molecular phylogeny and age estimates in Brassicaceae. Taxon 58:425–437.

28. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.

29. Sanderson MJ (2002) Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Mol Biol Evol 19:101–109.

30. Sanderson MJ (2003) r8s: Inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19:301–302.

31. Zachos J, Pagani M, Sloan L, Thomas E, Billups K (2001) Trends, rhythms, and aberrations in global climate 65 Ma to present. Science 292:686–693.

32. Lysak MA, Koch MA, Pecinka A, Schubert I (2005) Chromosome triplication found across the tribe Brassiceae. Genome Res 15:516–525.

33. Rabosky DL (2006) LASER: A maximum likelihood toolkit for detecting temporal shifts in diversification rates from molecular phylogenies. Evol Bioinf Online 257–260.

34. Montgelard C, Forty E, Arnal V, Matthee CA (2008) Suprafamilial relationships among Rodentia and the phylogenetic effect of removing fast-evolving nucleotides in mitochondrial, exon and intron fragments. BMC Evol Biol 8:321.

35. McElwain JC, Punyasena SW (2007) Mass extinction events and the plant fossil record. Trends Ecol Evol 22:548–557.

36. Bowers JE, Chapman BA, Rong JK, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422:433–438.

37. Lynch M, Conery JS (2000) The evolutionary fate and consequences of duplicate genes. Science 290:1151–1155.

38. Simillion C, Vandepoele K, Van Montagu MCE, Zabeau M, Van de Peer Y (2002) The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci USA 99:13627–13632.

39. Vision TJ, Brown DG, Tanksley SD (2000) The origins of genomic duplications in Arabidopsis. Science 290:2114–2117.

40. Barker MS, Vogel H, Schranz ME (2009) Paleopolyploidy in the Brassicales: Analyses of the Cleome transcriptome elucidate the history of genome duplications in Arabidopsis and other Brassicales. Genome Biol Evol 1:391–399.

41. Soltis DE, et al. (2009) Polyploidy and angiosperm diversification. Am J Bot 96:336–348.

42. Rieseberg LH, et al. (2007) Hybridization and the colonization of novel habitats by annual sunflowers. Genetica 129:149–165.

43. Rieseberg LH, et al. (2003) Major ecological transitions in wild sunflowers facilitated by hybridization. Science 301:1211–1216.

44. Ellstrand NC, Schierenbeck KA (2000) Hybridization as a stimulus for the evolution of invasiveness in plants? Proc Natl Acad Sci USA 97:7043–7050.

45. Levin DA (1983) Polyploidy and novelty in flowering plants. Am Nat 122:1–25.

46. Ehrlich P, Raven P (1964) Butterflies and plants: A study in coevolution. Evolution 18:586–608.

47. Braby MF, Vila R, Pierce NE (2006) Molecular phylogeny and systematics of the Pieridae (Lepidoptera: Papilionoidea): Higher classification and biogeography. Zool J Linn Soc 147:238–275.

48. Stamatakis A (2006) RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

49. Koch MA, et al. (2007) Supernetwork identifies multiple events of plastid trnF(GAA) pseudogene evolution in the Brassicaceae. Mol Biol Evol 24:63–73.

50. Bailey CD, et al. (2006) Toward a global phylogeny of the Brassicaceae. Mol Biol Evol 23:2142–2160.

51. Galloway GL, Malmberg RL, Price RA (1998) Phylogenetic utility of the nuclear gene arginine decarboxylase: An example from Brassicaceae. Mol Biol Evol 15:1312–1320.

52. Brenner GJ (1996) Evidence of the earliest stage of angiosperm pollen evolution: A paleoequatorial section from Israel. Flowering Plant Origin, Evolution and Phylogeny, eds Taylor DW, Hickey LJ (Chapman and Hall, New York), pp 91–115.

53. Yule GU (1924) A mathematical theory of evolution based on the conclusions of Dr. J.C. Willis. Philos Trans R Soc Lond B Biol Sci 213:21–87.

54. The Angiosperm Phylogeny Group (2003) An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG II. Bot J Linn Soc 141:399–436.

18728 | www.pnas.org/cgi/doi/10.1073/pnas.0909766107 Beilstein et al.


Supporting Information

Beilstein et al. 10.1073/pnas.0909766107

SI Materials and Methods

Evaluation of Potential Fossil Calibrations. We searched the paleobotanical literature and identified 32 fossils assigned to Brassicales (Table S1). Only six (Akania americana, Akania patagonica, Akania sp., Capparidoxylon holleisii, Dressiantha bicarpellata, Thlaspi primaevum) could be placed confidently in Brassicales. A fossil was considered acceptable for use as an age constraint only if its record included a clear citation with photographic evidence or accurate reproduction, fossil collection number, and morphological characters that support the proposed placement.
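The acceptance criteria above amount to a checklist that every fossil record must pass in full. A minimal sketch of such a filter follows; the class `FossilRecord`, its field names, and the two example records are hypothetical and are not taken from Table S1.

```python
from dataclasses import dataclass


@dataclass
class FossilRecord:
    """One literature record for a candidate fossil (hypothetical schema)."""
    name: str
    has_clear_citation: bool
    has_photo_or_reproduction: bool
    collection_number: str          # empty string if none is reported
    characters_support_placement: bool


def is_acceptable(record: FossilRecord) -> bool:
    """A record qualifies as an age constraint only if all criteria hold."""
    return (
        record.has_clear_citation
        and record.has_photo_or_reproduction
        and bool(record.collection_number)
        and record.characters_support_placement
    )


# Placeholder records for illustration only.
records = [
    FossilRecord("hypothetical fossil A", True, True, "XX-0001", True),
    FossilRecord("hypothetical fossil B", True, False, "", True),
]

accepted = [r.name for r in records if is_acceptable(r)]
print(accepted)
```

Treating the criteria as a strict conjunction mirrors the wording "only if"; a record that fails any single criterion is excluded.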

Ultrametric Tree and Divergence Date Estimation. To calculate divergence dates for Brassicales, we first inferred trees from plastid ndhF and the nuclear locus phytochrome A (PHYA) data separately and then from combined ndhF and PHYA data (Table S2). The ndhF alignment comprised 2,067 nucleotide positions representing 170 species in the order Brassicales plus Gossypium gossypioides (Malvales). The PHYA alignment comprised 1,824 nucleotide positions representing 149 accessions from 139 species in the order Brassicales. The combined alignment comprised 3,798 nucleotide positions representing 177 species of Brassicales plus G. gossypioides. The combined matrix included species missing data for one of the two genes (i.e., seven species lack ndhF data; 38 species lack PHYA data). However, for all major lineages of Brassicales, there is at least one taxon in the matrix represented by both ndhF and PHYA data.

We inferred trees with branch lengths from all three datasets using the general time reversible (GTR) model and Γ distributed rate heterogeneity in RAxML v. 7.2.3 (1). In analyses of the combined dataset, we allowed each gene to evolve independently. Trees resulting from these analyses were used to calculate ultrametric trees with divergence dates using the penalized likelihood (PL) approach in r8s v. 1.71 (2). PL employs a smoothing parameter to correlate substitution rates across the tree (3), thereby accounting for variation in substitution rates among branches. We performed cross-validation (2) over a range of smoothing parameters to determine that the best smoothing value for the ndhF tree is 32 (10^1.5), for the PHYA tree is 25 (10^1.4), and for the combined tree is 32 (10^1.5). For all analyses in r8s, Dressiantha bicarpellata was designated a fixed age constraint, and all other fossil dates were used as minimum age constraints.

Because the position of D. bicarpellata is uncertain, and because we used it as a fixed age constraint in all r8s analyses, we explored the impact on age estimates from r8s of placing this fossil at different nodes in the tree. We determined the most likely positions by analyzing a matrix of combined ndhF and morphology data developed from the published morphological dataset for extant Brassicales (4) and scoring characters for D. bicarpellata as described in Gandolfo et al. (5). We analyzed the combined data using a mixed model in MrBayes 3.2 (6) and used Mesquite 2.71 (7) to sort the resulting 19,000 post–burn-in trees. The most frequently visited topology from this analysis placed D. bicarpellata sister to Moringa oleifera. Two other frequently visited topologies placed D. bicarpellata sister to Carica papaya or Batis maritima. Based on these results, we analyzed the ndhF, PHYA, and combined trees, placing the D. bicarpellata constraint at different nodes in the inferred trees (Table S3). We found that the position of the D. bicarpellata fixed age constraint has only a moderate impact on the ages of more-derived nodes in the tree. For example, in the ndhF analyses, the optimal age of the Arabidopsis crown group is 19 Mya when D. bicarpellata is placed at the deepest node of the tree and 20.8 Mya for the most-derived node calibration (Table S3).

All other fossils were used as minimum age constraints in r8s.
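The smoothing values quoted above (32 and 25) are rounded powers of ten, which is what a cross-validation search over a log10-spaced grid would produce. The sketch below shows that arithmetic and a generic "pick the lowest cross-validation error" step. The error values are invented for illustration, and the function name `pick_smoothing` is hypothetical; this is not r8s code.

```python
def pick_smoothing(cv_errors):
    """Given {log10(smoothing): cross-validation error}, return the exponent
    with the lowest error and the corresponding smoothing value."""
    best_exponent = min(cv_errors, key=cv_errors.get)
    return best_exponent, 10 ** best_exponent


# Invented cross-validation errors on a log10 grid (illustration only).
cv_errors = {1.0: 410.2, 1.5: 395.7, 2.0: 402.9}

best_exponent, smoothing = pick_smoothing(cv_errors)
print(best_exponent, round(smoothing))

# The reported values correspond to these exponents:
print(round(10 ** 1.5))  # smoothing reported for the ndhF and combined trees
print(round(10 ** 1.4))  # smoothing reported for the PHYA tree
```

Searching on a log scale is a common choice because the useful range of a smoothing parameter typically spans several orders of magnitude.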

We calibrated two different nodes with the Akania fossils; the Akania americana/A. patagonica fossils are from a more recent deposit than Akania sp. (Table S1), and thus we used the younger date for these fossils to constrain the divergence of Akania bidwillii and Bretschneidera sinensis. Akania sp. was used to constrain the node defined by A. bidwillii and Tropaeolum majus, which is deeper in the tree than the split constrained by A. americana/A. patagonica. This strategy allowed us to use all Akania fossils as calibrations in the ndhF and combined analyses. We lacked PHYA data for B. sinensis, precluding the use of A. americana/A. patagonica as a calibration in PHYA analyses. Morphological analysis of Capparidoxylon holleisii using Inside Wood (8) indicated affinities between C. holleisii and extant Capparis, and thus this fossil was placed at the coalescence of Capparis hastata with other Capparaceae. However, exploration of the effect of this calibration by its removal and subsequent reanalysis of node ages indicated that it had no effect on the dates of other nodes in the tree, and thus it was excluded from subsequent analyses. We placed the fossil Thlaspi primaevum at the coalescence of Thlaspi arvense and Alliaria petiolata based on the presence of striated seeds (unique to these two genera among extant Brassicaceae; Fig. S1) and on likelihood ancestral character state reconstruction of fruit morphology using the Mk1 model of evolution (9) in Mesquite (7) with character state transition rates estimated from the data. Ancestral traits were inferred on a likelihood tree with branch lengths generated from an analysis of combined ndhF and PHYA data for Brassicaceae using RAxML (1). The ancestral character state reconstruction indicates that within this clade angustiseptate fruits evolved along the branch leading to T. arvense (Fig. S1). We also explored the effect of T. primaevum as a calibration point in r8s by excluding it from the r8s analyses (Table S4).

We inferred divergence dates on the ndhF tree using five fossil calibrations (A. americana/A. patagonica, 47.5 Mya; Akania sp., 61.7 Mya; Dressiantha bicarpellata, 89.3 Mya; and T. primaevum, 29.0 Mya; Table S1). For all ndhF analyses in r8s, we rooted with G. gossypioides, and thus it was pruned from the tree before divergence dates were estimated.

We explored alternative tree-rooting strategies to infer divergence dates for the PHYA tree in r8s. We found only a partial PHYA fragment for G. gossypioides from an expressed sequence tag, and inclusion of this sequence in the PHYA phylogenetic analysis produced trees with ingroup topologies inconsistent with trees inferred from other data. Instead, we rooted our Brassicales PHYA tree with Akania bidwillii + Tropaeolum majus [the first diverging lineage resolved in analyses by Hall et al. (10) and Rodman et al. (4, 11)], which yielded an ingroup topology consistent with other trees. However, it precluded the use of the Akania fossil constraint, because the root node was pruned before inferring dates. Thus, our PHYA r8s analysis used two age constraints: D. bicarpellata as a fixed age constraint and T. primaevum as a minimum age constraint.

Our analysis of the combined ndhF/PHYA tree using r8s was similar to the ndhF single-gene analysis. The tree was rooted with G. gossypioides, which was pruned in r8s before dates were inferred. The four fossil calibration points used in the ndhF analysis were used to calibrate nodes in the combined analysis with D. bicarpellata as a fixed age constraint and all others as minimum age constraints.
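The distinction between a fixed age constraint and minimum age constraints can be made concrete with a small check. The sketch below stores the four calibration ages given above and tests a set of candidate node ages against them. The node labels, the `Calibration` class, the `violations` function, and the candidate ages are all hypothetical; this is not how r8s represents constraints internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    fossil: str
    age_mya: float
    kind: str  # "fixed" (node age equals fossil age) or "min" (node at least this old)


# Ages are those listed in the text; node labels are hypothetical.
CALIBRATIONS = {
    "node_dressiantha": Calibration("Dressiantha bicarpellata", 89.3, "fixed"),
    "node_akania_sp": Calibration("Akania sp.", 61.7, "min"),
    "node_akania_americana": Calibration("Akania americana/A. patagonica", 47.5, "min"),
    "node_thlaspi": Calibration("Thlaspi primaevum", 29.0, "min"),
}


def violations(node_ages, calibrations=CALIBRATIONS, tol=1e-9):
    """Return the labels of calibrated nodes whose ages break their constraint."""
    broken = []
    for node, cal in calibrations.items():
        age = node_ages[node]
        if cal.kind == "fixed" and abs(age - cal.age_mya) > tol:
            broken.append(node)
        elif cal.kind == "min" and age < cal.age_mya - tol:
            broken.append(node)
    return broken


# Hypothetical candidate ages (Mya); the Thlaspi node is deliberately too young.
candidate = {
    "node_dressiantha": 89.3,
    "node_akania_sp": 70.0,
    "node_akania_americana": 52.0,
    "node_thlaspi": 25.0,
}
print(violations(candidate))
```

A minimum constraint only rules out ages younger than the fossil, so the node is free to be older; a fixed constraint pins the node to the fossil age exactly.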

Beilstein et al. www.pnas.org/cgi/content/short/0909766107 1 of 7


We also used BEAST v. 1.5.3 (12, 13) to calculate divergence dates from each gene individually and from the combined ndhF and PHYA data (Figs. S2–S4). We allowed BEAST to infer the tree, branch lengths, and dates in analyses of the ndhF and combined data. Because of sparser taxon sampling and the presence of a long branch to Gyrostemon tepperi, attempts to infer trees and dates simultaneously from the PHYA data using BEAST resulted in topologies and dates inconsistent with our RAxML, r8s, and BEAST analyses of the ndhF and combined data. Thus, we fixed the PHYA topology to that which was used to infer dates using r8s but allowed BEAST to alter all other aspects of the model of evolution while inferring dates.

All our analyses in BEAST accounted for rate variation by using an uncorrelated relaxed clock drawn from a lognormal distribution that allows different rates to be optimized independently on each branch of the tree (13). In addition, the likelihood of speciation was described by the Yule process (14), and branch rates were determined under a GTR + Γ model of nucleotide evolution with four rate categories (13). In the analysis of combined data we unlinked the substitution model and clock model, allowing each data partition to evolve independently across the same tree. We used a uniform distribution for all four fossil calibrations with the lower hard bound of the distribution set to the age of the fossil (i.e., A. americana/A. patagonica, 47 Mya; Akania sp., 61 Mya; D. bicarpellata, 89.3 Mya; T. primaevum, 29 Mya). Upper hard bounds were set to the age of the oldest eudicot pollen (125 Mya) (15) for D. bicarpellata and Akania sp. and were defined by the age of other fossils in the analysis for A. americana/A. patagonica (no older than Akania sp., 61 Mya) and T. primaevum (no older than D. bicarpellata, 89.3 Mya). We also used a prior on the age of the root node which was set with a hard upper bound defined by the age of the oldest fossil eudicot pollen and the lower bound set to be no younger than the age of D. bicarpellata (89.3 Mya). The A. americana/A. patagonica fossil calibration was not used in the PHYA analysis because we lacked PHYA data from Bretschneidera sinensis.

We allowed BEAST to run for 3 × 10^7 generations, saving data every 1,000 generations for ndhF, PHYA, and combined analyses. Convergence statistics for each run were analyzed in Tracer v. 1.4.1 (16), resulting in 27,000 post–burn-in generations. We tuned the performance of the BEAST analyses from the initial runs using performance-tuning suggestions generated in the initial analyses. Then a second set of BEAST analyses with tuned operators was performed. The ndhF and PHYA analyses were run for 3 × 10^7 generations, and the combined analysis was run for 6 × 10^7 generations. Convergence statistics for each run were analyzed in Tracer. We used TreeAnnotator v. 1.5.3 (13) to produce a maximum clade credibility tree from the post–burn-in trees and to determine the 95% probability density of ages for all nodes in the tree (Figs. S2–S4).
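The calibration priors and chain settings described in this section can be summarized in a short sketch. It tabulates the uniform hard bounds given above, evaluates a uniform log-prior, and works through the sampling arithmetic. It assumes that the "27,000 post–burn-in generations" are logged samples, which would imply a 10% burn-in; that reading is an inference, not a statement from the text. All function and variable names are hypothetical, and none of this is BEAST code.

```python
import math

OLDEST_EUDICOT_POLLEN = 125.0  # Mya, upper hard bound quoted in the text

# (lower, upper) hard bounds in Mya for each uniform calibration prior.
UNIFORM_BOUNDS = {
    "Akania americana/A. patagonica": (47.0, 61.0),
    "Akania sp.": (61.0, OLDEST_EUDICOT_POLLEN),
    "Dressiantha bicarpellata": (89.3, OLDEST_EUDICOT_POLLEN),
    "Thlaspi primaevum": (29.0, 89.3),
    "root": (89.3, OLDEST_EUDICOT_POLLEN),
}


def log_uniform_prior(age, lower, upper):
    """Log density of a uniform prior with hard bounds; -inf outside them."""
    if lower <= age <= upper:
        return -math.log(upper - lower)
    return -math.inf


def samples_logged(generations, sample_every):
    """Number of states written to the log for a chain of this length."""
    return generations // sample_every


def burnin_fraction(total_samples, kept_samples):
    """Fraction of logged samples discarded as burn-in."""
    return (total_samples - kept_samples) / total_samples


total = samples_logged(30_000_000, 1_000)
print(total, burnin_fraction(total, 27_000))
print(log_uniform_prior(28.0, *UNIFORM_BOUNDS["Thlaspi primaevum"]))
```

Hard bounds mean that any sampled node age outside the interval receives zero prior probability, which is why the sketch returns negative infinity on the log scale.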

Species Diversification Rate. We used LASER v. 2.2 (17) in R (18) to test for significant shifts in net diversification rate over the history of Brassicales. LASER applies a likelihood framework based on a birth–death model to identify whether a constant rate or two-rate model better fits the data (a chronogram for which the species diversity of the terminal taxa is known). The most likely shift point in the chronogram is identified by splitting the tree at each branch to determine whether the data are better described by a constant rate model (a single diversification rate across the tree) or a two-rate model (one rate before the split and a second rate after). The node giving the maximum combined log-likelihood for the two partitions (i.e., before and after the rate shift) is the maximum likelihood estimate of the shift point on the tree. Inputs for the analysis include (i) a pruned chronogram and (ii) a table of species diversity for each lineage retained in the pruned chronogram.

For all three datasets, we pruned the Brassicales chronogram from BEAST (Figs. S2–S4) to a single representative from each family, except Brassicaceae where we retained Aethionema as an independent lineage to test whether a significant rate shift occurred after the divergence of Aethionema from all other Brassicaceae. Other lineages retained in the ndhF and combined analyses included Forchhammeria and Tirania; these lineages are not present in the PHYA tree. Branch lengths were maintained during the pruning process. Clade diversity for each group was determined from the Angiosperm Phylogeny Website (19). We calculated the likelihoods of our phylogenetic data under a model of constant diversification rate over time and under a two-rate model in which the diversification rate is permitted to change (Table S5). Because analyses of diversification rate can be sensitive to extinction, we determined whether our results were sensitive to differences in extinction fraction by reanalyzing the data under low (a = 0) and high (a = 0.99) extinction fractions (Table S5). The branch leading to the core Brassicales was the locus of the shift under both high and low extinction for the combined tree, high extinction for the ndhF tree, and low extinction for the PHYA tree. Analyses assuming a low extinction fraction across the ndhF tree or a high extinction fraction across the PHYA tree placed the locus of the rate shift one node deeper (toward the base) of the tree (Table S5). In all analyses, the two-rate model was significantly better than the constant rate model, regardless of extinction fraction (Table S5). Although the two-rate and rate-decrease models cannot be compared by likelihood ratio test, our results suggest that the two-rate model may fit the data better than the rate-decrease model (Table S5). Taken together, our results indicate a significant shift in species diversification rate either along the branch leading to core Brassicales or before the divergence of core Brassicales from its sister clade.
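To see why the assumed extinction fraction matters, consider a standard method-of-moments estimator of net diversification rate for a clade with known stem age and species richness under a birth–death process: r = ln[n(1 − a) + a] / t. The sketch below evaluates it at the two extinction fractions used above. This estimator is a swapped-in illustration, not the likelihood search that LASER performs, and the clade age and richness are hypothetical.

```python
import math


def net_diversification_rate(n_species, stem_age, extinction_fraction):
    """Method-of-moments estimate of the net diversification rate for a
    stem-group clade under a birth-death process with a fixed extinction
    fraction a (ratio of extinction rate to speciation rate)."""
    if not 0.0 <= extinction_fraction < 1.0:
        raise ValueError("extinction fraction must be in [0, 1)")
    a = extinction_fraction
    return math.log(n_species * (1.0 - a) + a) / stem_age


# Hypothetical clade: 1,000 species with a stem age of 50 My.
for a in (0.0, 0.99):
    rate = net_diversification_rate(1000, 50.0, a)
    print(f"a = {a}: r = {rate:.4f} per My")
```

With a = 0 the expression reduces to the pure-birth estimate ln(n)/t; a high extinction fraction lowers the inferred net rate for the same clade, which is why results are checked at both extremes.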

1. Stamatakis A (2006) RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.

2. Sanderson MJ (2003) r8s: Inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19:301–302.

3. Sanderson MJ (2002) Estimating absolute rates of molecular evolution and divergence times: A penalized likelihood approach. Mol Biol Evol 19:101–109.

4. Rodman JE, Karol KG, Price RA, Sytsma KJ (1996) Molecules, morphology, and Dahlgren’s expanded Order Capparales. Systematic Botany 21:289–307.

5. Gandolfo MA, Nixon KC, Crepet WL (1998) A new fossil flower from the Turonian of New Jersey: Dressiantha bicarpellata gen. et sp. nov. (Capparales). Am J Bot 85:964–974.

6. Ronquist FR, Huelsenbeck JP (2003) MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

7. Maddison WP, Maddison DR (2009) Mesquite: A modular system for evolutionary analysis. Version 2.6. Available at http://mesquiteproject.org.

8. Wheeler EA (2004) Inside Wood. Available at: http://insidewood.lib.ncsu.edu/. Accessed June 18, 2009.

9. Lewis PO (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology 50:913–925.

10. Hall JC, Iltis HH, Sytsma KJ (2004) Molecular phylogenetics of core Brassicales, placement of orphan genera Emblingia, Forchhammeria, Tirania, and character evolution. Systematic Botany 29:654–669.

11. Rodman J, et al. (1993) Nucleotide sequences of the rbcL gene indicate monophyly of mustard oil plants. Ann Mo Bot Gard 80:686–699.

12. Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed phylogenetics and dating with confidence. PLoS Biol 4:e88.

13. Drummond AJ, Rambaut A (2007) BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214.

14. Yule GU (1924) A mathematical theory of evolution based on the conclusions of Dr. J.C. Willis. Philos Trans R Soc Lond B Biol Sci 213:21–87.

15. Brenner GJ (1996) Evidence of the earliest stage of angiosperm pollen evolution: A paleoequatorial section from Israel. Flowering Plant Origin, Evolution and Phylogeny, eds Taylor DW, Hickey LJ (Chapman and Hall, New York), pp 91–115.

16. Rambaut A, Drummond AJ (2003) Tracer 1.4. Available at http://tree.bio.ed.ac.uk/software/tracer. Accessed June 18, 2009.

17. Rabosky DL (2006) LASER: A maximum likelihood toolkit for detecting temporal shifts in diversification rates from molecular phylogenies. Evol Bioinf Online 247–250.

18. R Development Core Team (2008) R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, Vienna).

19. Stevens PF (2001 onwards) Angiosperm Phylogeny Website. Version 9. (http://www.mobot.org/MOBOT/research/APweb/). Accessed June 18, 2009.


[Fig. S1, panels D and E: tree images with tip labels and bootstrap values not reproduced here. Fruit shape key for panel D: angustiseptate, latiseptate, heteroarthrocarpous, ovoid, terete, orbicular.]

Fig. S1. Comparison of Alliaria petiolata and Thlaspi arvense, with likelihood character state reconstruction using the Mk1 (1) model in Mesquite (2). Seeds of (A) A. petiolata with longitudinal striations and (B) T. arvense with concentrically curved striations. (Scale bar, 1 mm.) (C) Latiseptate A. petiolata fruit (Left) and angustiseptate T. arvense fruit (Right). (Scale bar, 5 mm.) (D) Likelihood reconstruction for fruit type in Brassicaceae indicating that angustiseptate fruits evolved along the branch to T. arvense in the clade defined by T. arvense (arrow) and A. petiolata. (E) Likelihood branch-length tree showing bootstrap support from 100 replicates for the clade that includes T. arvense (arrow) and A. petiolata.

1. Lewis PO (2001) A likelihood approach to estimating phylogeny from discrete morphological character data. Systematic Biology 50:913–925.
2. Maddison WP, Maddison DR (2009) Mesquite: A modular system for evolutionary analysis. Version 2.6. http://mesquiteproject.org. Accessed June 18, 2009.
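For readers unfamiliar with the Mk1 model named in the caption: it assumes a single rate of change between any two character states, which gives closed-form transition probabilities along a branch. The sketch below computes them for a k-state character. The rate, the branch length, and the choice of k = 6 (matching the six fruit shapes in the figure key) are illustrative assumptions, and the function name is hypothetical; this is not Mesquite code.

```python
import math


def mk1_transition_probs(k, rate, branch_length):
    """Return (p_same, p_diff) for a k-state Mk1 model, where `rate` is the
    instantaneous rate of change from one state to each other state.

    p_same: probability of ending a branch in the starting state.
    p_diff: probability of ending in one particular other state.
    """
    decay = math.exp(-k * rate * branch_length)
    p_same = 1.0 / k + (k - 1.0) / k * decay
    p_diff = 1.0 / k - decay / k
    return p_same, p_diff


# Illustrative values only: six states, arbitrary rate and branch length.
p_same, p_diff = mk1_transition_probs(k=6, rate=0.05, branch_length=2.0)
print(p_same, p_diff)
```

On a very short branch the character almost certainly stays in its starting state, and on a very long branch every state approaches probability 1/k; ancestral state likelihoods are built from these probabilities along each branch of the tree.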


Fig. S2. Maximum clade credibility chronogram of ndhF from BEAST with 95% highest probability density (HPD) for node ages (blue bars along branches). The red HPD bar denotes the branch along which a significant shift in speciation rate occurred when extinction is low. The yellow HPD bar denotes the branch when extinction is high (Table S5). Numbers above nodes are means of the probability density distribution. Black bars indicate putative range of α and β whole-genome duplications. See figure for key to symbols. Abbreviated taxonomic groups are AK, Akaniaceae; BAT, Bataceae; CAP, Capparaceae; CAR, Caricaceae; CLE, Cleomaceae; GY, Gyrostemonaceae; KOE, Koeberlineaceae; MAL, Malvaceae; MOR, Moringaceae; RE, Resedaceae; Tro, Tropaeolaceae; and FOR, the genus Forchhammeria. Abbreviated geologic periods are Oli, Oligocene; Pal, Paleocene; PI, Pliocene; and Q, Quaternary.


Fig. S3. Maximum clade credibility chronogram of PHYA from BEAST with 95% HPD for node ages (blue bars along branches). The red HPD bar denotes the branch along which a significant shift in speciation rate occurred when extinction is low. The yellow HPD bar denotes the branch when extinction is high (Table S5). Numbers above nodes are means of the probability density distribution. Black bars indicate putative range of α and β whole-genome duplications. See figure for key to symbols. Abbreviated taxonomic groups are AK, Akaniaceae; BAT, Bataceae; CAP, Capparaceae; CAR, Caricaceae; CLE, Cleomaceae; GY, Gyrostemonaceae; KOE, Koeberlineaceae; MOR, Moringaceae; RE, Resedaceae; TRO, Tropaeolaceae; and FOR, the genus Forchhammeria. Abbreviated geologic periods are Oli, Oligocene; Pal, Paleocene; PI, Pliocene; and Q, Quaternary.


Fig. S4. Maximum clade credibility chronogram of ndhF/PHYA combined data from BEAST with 95% HPD for node ages (blue bars along branches). The yellow HPD bar denotes the branch along which a significant shift in speciation rate occurred (Table S5). Numbers above nodes are means of the probability density distribution. Black bars indicate putative range of α and β whole-genome duplications. See figure for key to symbols. Abbreviated taxonomic groups are AK, Akaniaceae; BAT, Bataceae; CAP, Capparaceae; CAR, Caricaceae; CLE, Cleomaceae; GY, Gyrostemonaceae; KOE, Koeberlineaceae; MAL, Malvaceae; MOR, Moringaceae; RE, Resedaceae; TRO, Tropaeolaceae; and FOR, the genus Forchhammeria. Abbreviated geologic periods are Oli, Oligocene; Pal, Paleocene; PI, Pliocene; and Q, Quaternary.
