Phylogeny of Conradina and Related Southeastern Scrub Mints (Lamiaceae) Based on GapC Gene Sequences
Author(s): Christine E. Edwards, David Lefkowitz, Douglas E. Soltis, and Pamela S. Soltis
Source: International Journal of Plant Sciences, Vol. 169, No. 4 (May 2008), pp. 579-594
Published by: The University of Chicago Press
Stable URL: http://www.jstor.org/stable/10.1086/528758
PHYLOGENY OF CONRADINA AND RELATED SOUTHEASTERN SCRUB MINTS (LAMIACEAE) BASED ON GapC GENE SEQUENCES

Christine E. Edwards,1,*,† David Lefkowitz,‡ Douglas E. Soltis,† and Pamela S. Soltis*

*Florida Museum of Natural History, Gainesville, Florida 32611, U.S.A.; †Department of Botany, University of Florida, Gainesville, Florida 32611, U.S.A.; and ‡College of Medicine, University of Florida, Gainesville, Florida 32611, U.S.A.

Phylogeny reconstruction at the species level, especially using organellar markers, is often complicated by problems such as incomplete lineage sorting and interspecific hybridization. Single-copy nuclear genes may be useful for these cases because they have higher mutation rates and are biparentally inherited. One plant group in which hybridization and incomplete lineage sorting have been proposed based on analyses of internal transcribed spacer (ITS) and plastid data is a clade of mints from the southeastern United States: Conradina and the related genera Dicerandra, Piloblephis, Stachydeoma, and Clinopodium (Lamiaceae). To clarify the phylogeny in this clade and investigate the possibility of incomplete lineage sorting and interspecific hybridization, we isolated three members of the nuclear GapC gene family and used two to reconstruct phylogeny. Separate phylogenetic analyses of the two GapC loci did not resolve species relationships. We then used two approaches to concatenate the two heterozygous GapC loci with ITS and plastid data sets from a previous study and carried out combined analyses. Trees resulting from the two concatenation approaches were similar in the resolution and support of generic relationships, but they differed drastically in resolution and support for relationships within Conradina. Conradina species are probably very recently derived, and it may be unreasonable to reconstruct species relationships in Conradina using DNA sequence data due to widespread hybridization or lack of coalescence. Rapidly evolving microsatellite data may be more useful for detecting hybridization and clarifying species boundaries in Conradina.

Keywords: nuclear genes, species phylogeny, GapC, Conradina, Lamiaceae.

Introduction

The use of single-copy nuclear genes to resolve phylogenetic relationships among closely related species has become increasingly common because the introns of these genes often have a much more rapid evolutionary rate than other commonly used regions, such as ribosomal or organellar markers (Sang 2002; Small et al. 1998, 2004). Introns can therefore yield a high proportion of variable characters that may lead to increased resolution and support in phylogenetic analyses. However, because of their short length or lack of coalescence, nuclear gene introns still may not have sufficient information to resolve evolutionary relationships, especially at shallow phylogenetic levels (Zhang and Hewitt 2003). In these cases, phylogenetic analyses that combine intron data with sequence data from other genes may improve resolution and support. The purpose of this study was to use separate and combined analyses of single-copy nuclear genes to reconstruct the phylogeny of species in the genus Conradina and related mints (Lamiaceae).

Conradina, comprising six described species (fig. 1), is a morphologically homogeneous group of diploid (all species n = 12; Gray 1965), aromatic shrubs that are easily distinguished from other mints by a distinctively geniculate corolla (Gray 1965). Conradina is related to four other genera of Lamiaceae from the southeastern United States: Dicerandra (nine species), four woody species of Clinopodium (C. ashei, C. georgianum, C. coccineum, and C. dentatum), and the monotypic genera Stachydeoma and Piloblephis (Crook 1998; Trusty et al. 2004; Edwards et al. 2006). Because most species in this clade are restricted to the sand pine scrub or sandhill habitats of Florida or the southeastern United States, we will hereafter refer to them as the southeastern (SE) scrub mint clade (Edwards et al. 2006). The SE scrub mint clade (Mentheae; Nepetoideae) is part of a larger group containing only New World species; molecular phylogenetic analyses indicate that these species diversified after a single introduction from the Old World (Crook 1998; Trusty et al. 2004).

Despite the fact that several Conradina species (C. brevifolia, C. etonia, C. glabra, and C. verticillata) are federally endangered or threatened, species limits are unclear. Hybridization may occur among species (Edwards et al. 2006), and the taxonomic status of several species is questionable. For example, C. brevifolia is listed as federally endangered yet taxonomically questionable because it is morphologically similar to the relatively widespread, disjunct species, Conradina canescens (USFWS 1996). Likewise, populations of Conradina found in Santa Rosa County, Florida (referred to hereafter as the Santa Rosa populations), are problematic because they are morphological intermediates between the federally endangered C. glabra and the widespread C. canescens.

1 Author for correspondence; e-mail: [email protected].

Manuscript received June 2007; revised manuscript received October 2007.

Int. J. Plant Sci. 169(4):579–594. 2008.

© 2008 by The University of Chicago. All rights reserved.

1058-5893/2008/16904-0010$15.00 DOI: 10.1086/528758


The relationship of Conradina to other members of the SE scrub mint clade is also unclear. Hybridization was reported between Clinopodium ashei (as Calamintha ashei) and C. brevifolia based on morphological evidence (Huck 1994), and a phylogenetic analysis of the SE scrub mint clade (with limited taxon sampling) using plastid regions found that Conradina was nonmonophyletic with regard to Clinopodium and Piloblephis (Crook 1998). In a subsequent phylogenetic study of Conradina and the SE scrub mint clade (Edwards et al. 2006), Conradina was found to be monophyletic based on internal transcribed spacer (ITS) sequence data, in agreement with morphology (Crook 1998); however, trees based on plastid data were weakly supported but largely failed to support the monophyly of each Conradina species, the genus Conradina, or most other genera (Edwards et al. 2006). The nonmonophyly of species and genera in the plastid tree may be the result of low levels of informative nucleotide variation, introgression, or incomplete sorting of ancestral polymorphism (Edwards et al. 2006). The combined ITS and plastid tree largely agreed with relationships found in the ITS data set, perhaps due to the larger number of informative ITS characters or a more informative signal.

Because ITS and plastid data yielded trees with low levels of resolution and support and because hybridization or incomplete lineage sorting (or both) have been hypothesized, more data are needed to clarify phylogenetic relationships in Conradina and the SE scrub mint clade. The quickly evolving introns of biparentally inherited, single-copy nuclear genes are ideal in cases such as these. To this end, we focused on the glyceraldehyde 3-phosphate dehydrogenase gene family (GapC or g3pdh), a nuclear-encoded gene family involved in sugar phosphate regulation in the cytosol (Figge et al. 1999; Martin et al. 1993a). Previous studies using partial sequences of GapC for phylogeny reconstruction (Martin et al. 1993b; Wall 2002; Howarth and Baum 2005) or analyses of within-species diversity (Olsen and Schaal 1999; Perusse and Schoen 2004; Ingvarsson 2005; Morrell et al. 2005) have revealed that GapC introns have high substitution rates and are phylogenetically informative at low taxonomic levels. In this study, we isolated and sequenced multiple GapC loci from Conradina and related mints, identified the GapC gene copy number, quantified levels of sequence divergence and heterozygosity in the SE scrub mint clade, and carried out tests to detect recombination. We then focused on two GapC loci, both of which were heterozygous in many individuals, for phylogeny reconstruction of the SE scrub mint clade and compared these results to previous studies, using ITS and plastid sequence data. Because combined analyses often increase support and resolution of relationships, we then used two methods to concatenate the two heterozygous GapC data sets with the ITS and plastid data partitions of Edwards et al. (2006) and carried out combined analyses of the four data partitions. Finally, we conducted combined analyses with each data partition removed, one at a time, to understand the contribution of each data partition to the combined analyses and whether nuclear genes increase resolution and support of phylogenetic relationships.

Fig. 1 Distributions of Conradina species.

Material and Methods

Initial Isolation of GapC in the SE Scrub Mint Clade

Total DNA was isolated from leaf tissue as described by Edwards et al. (2006). Initially, the GapC gene was amplified from the genomic DNA of several representatives of the Mentheae, using the PCR primers gpdx7F and gpd9r (Strand et al. 1997). These primers amplify exons 7–9, including introns 7 and 8, of the GapC gene. Amplifications were carried out in 25-µL volumes with 1 unit of Amplitaq Gold DNA polymerase (Applied Biosystems, Foster City, CA) or Invitrogen Platinum Taq High Fidelity DNA polymerase (Invitrogen, Carlsbad, CA), 1× buffer (provided by the manufacturers), 1 M betaine, and dNTP and primer concentrations that followed the manufacturers' specifications. PCR conditions were as follows: (1) an initial denaturation step at 95°C for 5 min, (2) 94°C for 1 min, (3) annealing at 53°C for 1 min, (4) elongation at 72°C for 2.5 min, and (5) a final elongation step of 72°C for 2.5 min. Steps 2–4 were repeated for five cycles, except for dropping the annealing temperature by 1°C in each of the five cycles until reaching 48°C, and then annealing temperature was maintained at 48°C as steps 2–4 were repeated for a total of 35 cycles. PCR products were visualized on 1.5% agarose gels. Amplifications produced several PCR bands ranging from 700 to 1000 base pairs (bp) in length. PCR products were cloned into pCR 4.0 TOPO vector for sequencing (Invitrogen). Eight to 16 colonies per PCR reaction were amplified using the plasmid primers M13F and M13R and the PCR reaction mixtures specified above. PCR conditions were as follows: (1) an initial denaturation step at 95°C for 3 min, (2) 95°C for 45 s, (3) annealing at 50°C for 45 s, and (4) elongation at 72°C for 45 s. Steps 2–4 were repeated for 30 cycles, followed by a final step of 72°C for 10 min. PCR products were cleaned and sequenced using the M13 primers following Edwards et al. (2006).
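The touchdown annealing scheme described above can be written out as a per-cycle schedule. The following is only an illustrative sketch: the function name is hypothetical, and it assumes that the 35 cycles include the initial touchdown cycles, which the text does not state unambiguously.

```python
def touchdown_schedule(start_c=53, floor_c=48, step_c=1, total_cycles=35):
    """Return the annealing temperature (deg C) for each PCR cycle.

    Assumption: `total_cycles` counts the touchdown cycles as well as the
    cycles held at the floor temperature.
    """
    temps = []
    current = start_c
    for _ in range(total_cycles):
        temps.append(current)
        # Drop by one step per cycle until the floor is reached, then hold.
        current = max(floor_c, current - step_c)
    return temps


schedule = touchdown_schedule()
print(schedule[:7])   # first few cycles of the touchdown phase
print(len(schedule))  # total number of cycles
```

If the 35 cycles were instead run after the touchdown phase, the same function could be called with a larger `total_cycles`.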

Determination of GapC Copy Number

Our initial screening recovered two copies of GapC, which were then used to design specific primers for the Mentheae. For GapC-1, the primers are ConAmF (5′ CAGCCTCGTTCAACATCATC)/ConAmR (5′ CTTGAGCTTCGTCTCCGATG), and for GapC-2, they are CONGPD-F (5′ GATGGTCCGTCGAGCAAGGATT)/CONGPD-R (5′ CCTGCTGTCACGAAGTCTGT). To verify the GapC copy number, we selected individuals from 11 species: Mentha sp., Pycnanthemum muticum, Monarda fistulosa, Stachydeoma graveolens, Clinopodium dentatum, Clinopodium ashei 147-1, Dicerandra thinicola, Dicerandra densiflora, Conradina brevifolia 198-2, Conradina etonia (Dunns Creek), and Conradina grandiflora 135-1. Accession information for these individuals is given by Edwards et al. (2006). To increase taxonomic range of the copy number analyses, we also added accessions of Blephilia hirsuta (Cantino 1421, BHO) and Hedeoma pulegioides (Cantino 1424, BHO).

We used the two specific primer pairs to amplify and sequence all possible GapC alleles from these individuals. We used the same PCR reagents as in the initial screening, and the temperature cycling was as follows: (1) an initial heating step at 95°C for 5 min, (2) 94°C for 1 min, (3) annealing for 1 min at a gradient between 48.2° and 53.1°C for ConGPDF/R and between 55° and 58°C for ConAmF/R, (4) elongation at 72°C for 2.5 min, (5) steps 2–4 were repeated for 30 cycles, and (6) a final elongation step of 72°C for 10 min. To avoid preferential PCR of only some copies (Wagner et al. 1994) and to increase the possibility that all alleles were amplified, at least four reactions per individual per primer pair were amplified simultaneously following Emshwiller and Doyle (2002). The four reactions were pooled, cloned, amplified using the plasmid primers, and sequenced as specified above. We sequenced six to 15 colonies per cloning reaction, each to at least 2× coverage, in an attempt to recover all possible alleles, although there is a possibility that some alleles were not recovered via PCR. Sequences were assembled and edited using Sequencher 4.0.5 (Gene Codes, Ann Arbor, MI). All resulting sequences were first subjected to BLAST searches (Altschul et al. 1997) of GenBank (http://www.ncbi.nlm.nih.gov/) to verify that they are indeed most similar to cytosolic GapC. Next, to ensure that the multiple sequence types recovered were not a result of PCR-mediated recombination (Cronn et al. 2002), we inspected and discarded any divergent sequences that were a chimera of two copy types. In many cases, we repeated PCR, cloning, and sequencing using independent genomic DNA extractions to verify our original sequences. We also inspected sequences for Taq error; especially when using Amplitaq Gold, we found that some colonies had 1-bp substitutions that were not repeatable after sequencing many colonies or resequencing colonies from a new DNA extraction. Because Taq polymerases, even those with proofreading ability, have a small rate of error (Cline et al. 1996), we assumed that these nonrepeatable 1-bp substitutions were due to Taq error and did not include these in our analyses.
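The caveat that some alleles may go unrecovered can be roughed out numerically. The sketch below is not part of the authors' method; it assumes that every allele present in the pooled product is cloned with equal probability, which ignores the preferential amplification discussed above, and the function name is hypothetical.

```python
from math import comb


def prob_missing_allele(n_colonies, n_alleles=2):
    """Probability that at least one of `n_alleles` equally likely alleles
    is absent from `n_colonies` randomly picked clones.

    Uses inclusion-exclusion over the alleles that go unsampled.
    Assumption: alleles are cloned with equal probability.
    """
    total = 0.0
    for k in range(1, n_alleles + 1):
        sign = (-1) ** (k + 1)
        total += sign * comb(n_alleles, k) * ((n_alleles - k) / n_alleles) ** n_colonies
    return total


# One heterozygous locus (two alleles), at the two ends of the sampling range.
for n in (6, 15):
    print(n, prob_missing_allele(n))

# Two co-amplified heterozygous loci (four sequence types) with six colonies.
print(prob_missing_allele(6, n_alleles=4))
```

Under these assumptions the risk falls quickly with colony number for a single locus but stays appreciable when one primer pair amplifies more than one locus.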

All other alleles were retained and treated as terminals to allow for the possibility of multiple GapC loci. All alignments were initially produced using ClustalX (Thompson et al. 1997) and then adjusted "by eye" using the program Se-Al (Rambaut 2003). When all alleles recovered from these 13 individuals were aligned, the alignment was ambiguous in the intron regions due to high levels of sequence divergence and length variation among allele types, so introns were excluded from the initial copy number analysis.

All phylogenetic analyses were carried out using the Phyloinformatics Cluster for High Performance Computing facility at the Florida Museum of Natural History (http://cluster.flmnh.ufl.edu). The exon-only data set was analyzed using parsimony, as implemented in PAUP* 4.0b10 (Swofford 2002). We conducted parsimony analyses with heuristic searches, using 1000 random addition replicates and tree bisection-reconnection (TBR) branch swapping and saving 100 trees per replicate. Bootstrap analyses (Felsenstein 1985) with 1000 replicates were used to assess branch support, using a heuristic search with TBR branch swapping and one random addition per replicate and saving no more than 100 trees per replicate.
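The resampling step behind a nonparametric bootstrap can be illustrated in a few lines. This is a generic sketch of column resampling only, not PAUP*'s implementation; the toy alignment and the function name are hypothetical, and the tree search that would follow each pseudoreplicate is omitted.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement.

    `alignment` maps terminal names to aligned sequences of equal length.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


# Hypothetical toy alignment; real data would be the exon-only matrix.
toy = {"allele_a": "ACGTAC", "allele_b": "ACGTTC", "allele_c": "ATGTTC"}
rng = random.Random(0)  # fixed seed so the sketch is repeatable
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]
print(len(replicates), replicates[0])
```

Each pseudoreplicate would then be searched for its own most parsimonious trees, and the percentage of replicates recovering a clade is reported as its bootstrap support.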

We also analyzed the data set using Bayesian phylogenetic analysis (Yang and Rannala 1997; Larget and Simon 1999; Huelsenbeck et al. 2001) as implemented in MrBayes 3.1.1 (Ronquist and Huelsenbeck 2003). We ran two analyses, using four chains each, three hot and one cold, with the temperature set to the default of 0.2. All analyses were run using flat priors, using the optimal model of evolution chosen for each data set by the Akaike Information Criterion as implemented in Modeltest (Posada and Crandall 1998). The optimal model chosen for the exon-only data set was GTR + G. We ran analyses for 10 million generations, sampling a tree every 1000 generations. To determine the burn-in value, we checked the log-likelihood scores of the resulting trees and the split frequencies between the two independent runs and then discarded any trees that were saved before the stabilization of the scores or that had a split frequency of greater than 0.1. For the two independent runs, posterior probabilities for each branch were found by constructing a majority-rule consensus of trees in the posterior distribution, using PAUP* 4.0b10 (Swofford 2002). The two identical, independent analyses were checked for convergence in topology and branch lengths.
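Posterior probabilities read off a majority-rule consensus are clade frequencies among the post-burn-in trees. The sketch below shows that bookkeeping on a hypothetical toy sample in which each tree is reduced to its set of clades; it is not the PAUP* or MrBayes code, and all names are illustrative. Sampling every 1000 of 10 million generations gives roughly 10,000 trees per run before burn-in is discarded.

```python
from collections import Counter


def clade_support(sampled_trees, burn_in):
    """Fraction of post-burn-in trees that contain each clade.

    Each tree is represented as a set of clades, and each clade as a
    frozenset of terminal names (a simplification for illustration).
    """
    kept = sampled_trees[burn_in:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(support, threshold=0.5):
    """Keep only clades found in more than `threshold` of the trees."""
    return {clade: p for clade, p in support.items() if p > threshold}


# Hypothetical sample: the first tree is treated as burn-in.
AB, CD, AC = frozenset("AB"), frozenset("CD"), frozenset("AC")
trees = [{AC}, {AB, CD}, {AB, CD}, {AB}, {AB, CD}]
support = clade_support(trees, burn_in=1)
consensus = majority_rule(support)
print(support)
```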

Unrooted trees placed alleles into two clades separated by an extremely long branch (see "Results"), with alleles of every species found in each clade. We inferred that the high level of divergence between the exons of the two clusters of alleles was due to the fact that there are at least two putative gene copies, and we rooted the tree between these two clades. We will hereafter refer to one of these putative loci as GapC-1 (see "Results"). The other clade placed alleles into two sister clades, each containing alleles from most individuals, possibly corresponding to two additional putative loci. To determine whether intron differences supported these two latter clades/putative loci separate from GapC-1, we conducted an additional analysis of the introns + exons of these alleles, using both parsimony and Bayesian phylogenetic analyses, with the settings described above but with Bayesian analyses instead performed using the HKY + G model as selected by Modeltest. The intron + exon data strongly supported the two sister clades that were recovered using only exon data (see "Results"), and we will hereafter refer to these clades as GapC-2 and GapC-3. GapC-3 was not recovered in all individuals due to poor amplification (the locus was amplified by the primers that were designed to amplify GapC-2); thus, we did not pursue GapC-3 for phylogeny reconstruction and focused only on GapC-1 and GapC-2.

Taxon Sampling for Phylogeny Reconstruction Using GapC-1 and GapC-2

For phylogenetic analysis of the SE scrub mint clade, we sampled all species of Conradina, including multiple accessions of most species. We also sampled individuals from all other genera of the SE scrub mint clade, i.e., Clinopodium, Dicerandra, Piloblephis, Stachydeoma, and the outgroups M. fistulosa, P. muticum, and Mentha sp. The sampling strategy, collection locations, and voucher information are the same as in the work by Edwards et al. (2006; fig. 2). Because evidence supports the radiation of the New World Mentheae following a single introduction from the Old World (Trusty et al. 2004) and because relationships among New World mints have yet to be resolved, we used Old World Mentha sp. as our designated outgroup in all analyses to ensure that possible New World ingroup taxa were not designated as an outgroup. Using this taxon sampling strategy, we attempted to amplify all possible alleles of GapC-1 and GapC-2, screened all resulting sequences for DNA polymerase error or PCR recombination (see above), and aligned all alleles recovered from all individuals using the protocols specified above. During alignment, the introns of all GapC loci were easily distinguishable from one another due to large differences in length and sequence motif, and we designated the loci accordingly.

Analyses of Recombination

Because recombination may have a significant impact on the phylogenetic accuracy of a topology (Posada and Crandall 2002), it is important to understand whether and how alleles of nuclear genes have recombined before their use in phylogeny reconstruction. We thus tested the two GapC loci for the presence of recombination before phylogenetic analysis. Simulations and empirical studies have revealed a wide range in the power of different methods, depending on levels of genetic diversity, recombination rate, and rate variation among sites; therefore, the use of multiple methods of detecting recombination may help minimize false positive results and maximize the power of the methods (Posada 2002). We tested for recombination in GapC-1 and GapC-2 separately, using the following methods: the phi test for recombination (Bruen et al. 2006) in Splitstree, version 4.0 (Huson and Bryant 2006), and RDP, MaxChi, Chimera, and Geneconv in the program RDP2 (Martin et al. 2005). For all analyses using RDP2, we used settings for linear sequences, a P value of 0.05 as the highest acceptable P value, Bonferroni multiple comparison correction, polished breakpoints, and consensus breakpoints and consensus daughter sequences. For RDP, we used "internal references only" and windows of 10, 20, 30, 40, and 50 bp. For Geneconv, we scanned sequence triplets, with indel blocks treated as one polymorphism and a g-scale value of 0. For MaxChi, we scanned sequence triplets with a variable window size, the fraction of variable sites per window, set to 0.1, 0.15, and 0.2, and 1000 permutations, with a P value of 0.05. For Chimera, we scanned with a variable window size, the fraction of variable sites per window, set to 0.1, 0.15, and 0.2, and 1000 permutations, with a P value of 0.05.
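The Bonferroni correction mentioned above simply divides the acceptable P value by the number of comparisons. The sketch below shows the arithmetic on hypothetical P values; how RDP2 counts comparisons internally is not described in the text, so the number of tests here is an assumption.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that remain significant after Bonferroni correction.

    The per-test threshold is alpha divided by the number of tests.
    """
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Hypothetical uncorrected P values from four comparisons.
p_values = [0.0004, 0.012, 0.03, 0.2]
flags = bonferroni_significant(p_values)
print(flags)
```

With four comparisons the per-test threshold is 0.0125, so a raw P value of 0.03 would no longer count as significant.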

Separate Analyses of GapC-1 and GapC-2, Congruence Analyses, and Combined Phylogenetic Analyses

The GapC-1 and GapC-2 data sets were separately aligned using Se-Al (Rambaut 2003), and were separately analyzed using parsimony and Bayesian methods, with the same settings used in the analyses of copy number (see "Determination of GapC Copy Number") but with the appropriate model as chosen by Modeltest; for GapC-1, the model was GTR + G, and for GapC-2, the model was HKY + G (Hasegawa et al. 1985; Yang 1994).
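Model choice by the Akaike Information Criterion compares AIC = 2k - 2 ln L across candidate models and keeps the lowest score. The sketch below uses hypothetical log-likelihoods, and the parameter counts (5 for HKY + G, 9 for GTR + G) assume that only substitution-model parameters are counted; neither reflects values reported in the paper.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical (ln L, free parameters) pairs; not values from the study.
candidates = {"HKY+G": (-1520.4, 5), "GTR+G": (-1511.9, 9)}
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In this made-up case the extra parameters of GTR + G are justified by the likelihood gain; with a smaller gain the simpler model would win.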

To concatenate the data matrices, we chose two methods with which to code heterozygous sites. The first method combines two alleles of a heterozygote into a single sequence, by substituting any heterozygous sites with the appropriate International Union of Biochemistry (IUB) ambiguity code following Howarth and Baum (2005). During parsimony analysis with PAUP* the program can treat the IUB codes as either "uncertainties" or "polymorphisms." If the "uncertainties" option is selected, this assumes that the state is not known but is one of the options designated by the IUB code, and the terminal is assigned the most parsimonious state (Swofford and Begle 1993; Wiens 1999). The "polymorphism" option is very similar in that PAUP* assigns the most parsimonious of the present states to the node; however, it differs in that it treats all polymorphic sites as present in the terminal and thus always places the less parsimonious character states as extra steps along the branch (Swofford and Begle 1993). The trees resulting from the polymorphism option will therefore be longer than those using the uncertainty option, but they will be topologically identical. Thus, for simplicity, we only carried out analyses treating the IUB codes as uncertainties. MrBayes also treats IUB codes as uncertainties (Ronquist et al. 2005).
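The first coding method can be sketched as a site-by-site merge of two aligned alleles. This is an illustration only: the function name is hypothetical, and the sketch handles just the four bases, so gaps and pre-existing ambiguity codes would need separate treatment that the text does not specify.

```python
# Two-base IUB (IUPAC) ambiguity codes.
IUB = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def merge_alleles(allele_1, allele_2):
    """Collapse two aligned alleles into one sequence with ambiguity codes.

    Assumption: both alleles contain only A, C, G, and T.
    """
    if len(allele_1) != len(allele_2):
        raise ValueError("alleles must be aligned to the same length")
    merged = []
    for a, b in zip(allele_1.upper(), allele_2.upper()):
        # Identical bases pass through; differing bases become a code.
        merged.append(a if a == b else IUB[frozenset((a, b))])
    return "".join(merged)


print(merge_alleles("ACGTAC", "ATGTGC"))
```

A homozygote passes through unchanged, while each heterozygous site becomes a single ambiguous character.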

The second concatenation method includes all possible combinations of alleles from an individual as terminals in the data matrix (Sota and Vogler 2003). For example, if an individual is homozygous for GapC-1 and heterozygous for GapC-2, the individual would be represented twice in the matrix; the two concatenated sequences would be identical for ITS, plastid, and GapC-1, and they would differ for GapC-2. If an individual is heterozygous for both GapC loci, all four combinations of alleles would be concatenated with the ITS and plastid data partitions and included in the analysis. We chose these two concatenation approaches because they represent two extremes; the IUB code method removes all variation resulting from heterozygosity, while the "all combinations" approach includes all possible information. The difference in the results therefore should reflect the influence of including heterozygosity on the resulting topology.
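The "all combinations" coding amounts to a Cartesian product of the alleles at the two GapC loci, each joined to the individual's ITS and plastid sequences. The sketch below uses hypothetical short sequences and a hypothetical function name, and the partition order is an assumption.

```python
from itertools import product


def all_combinations(its, plastid, gapc1_alleles, gapc2_alleles):
    """Return one concatenated terminal per GapC-1 x GapC-2 allele pair.

    Duplicate alleles (a homozygote listed twice) are collapsed first.
    """
    unique_1 = list(dict.fromkeys(gapc1_alleles))
    unique_2 = list(dict.fromkeys(gapc2_alleles))
    return [its + plastid + g1 + g2 for g1, g2 in product(unique_1, unique_2)]


# Hypothetical individual: homozygous at GapC-1, heterozygous at GapC-2.
one_het = all_combinations("AAAA", "CCCC", ["GGTT", "GGTT"], ["TTAA", "TTAC"])
# Hypothetical individual: heterozygous at both loci.
two_het = all_combinations("AAAA", "CCCC", ["GGTT", "GGTA"], ["TTAA", "TTAC"])
print(len(one_het), len(two_het))
```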

We then used the two concatenated data matrices to test for incongruence among data partitions (GapC-1, GapC-2, ITS, and plastid). We conducted incongruence length difference (ILD) tests (Farris et al. 1994), as implemented in PAUP*, as the partition homogeneity test (Swofford 2002). We conducted an ILD analysis to compare all possible pairs of the four data partitions, as well as one that simultaneously compared all four data partitions to one another. We conducted ILD analyses with 1000 replicates, heuristic searches, TBR branch swapping, gaps coded as missing data, and 10 random additions per replicate, saving no more than 100 trees per replicate. However, although the ILD test is a good first approximation of incongruence among data partitions (Hipp et al. 2004), the rejection of the null using the ILD test does not necessarily predict whether the trees resulting from the combined data sets will be accurate (Cunningham 1997). It is a conservative test that often rejects congruence because of factors such as differences in evolutionary rates, among-site rate variation among partitions, high amounts of noise, or differences in partition length (Dolphin et al. 2000; Barker and Lutzoni 2002; Darlu and Lecointre 2002; Dowton and Austin 2002). Even if data sets have different evolutionary histories, combined phylogenetic analysis can improve resolution and support (Wiens 1998). Thus, we generally followed the methodology of Wiens (1998) for combining data sets; we compared trees resulting from analysis of separate data sets to identify conflicting topological differences that received >80% bootstrap support. Because there were almost no cases in which topological congruence was strongly supported in the different data sets (see "Results"), we analyzed the combined data sets. However, any groupings that were found to be conflicting in the various individual data sets were treated as tentative in all combined analyses.

Fig. 2 Collection locations of accessions included in this study (figure reproduced from Edwards et al. 2006). Numbers following taxon names indicate the accession number of the population and the number of the individual in the population. Accessions obtained from native nurseries or botanical gardens are not shown (these include Clinopodium georgianum, Conradina brevifolia from Bok Tower Garden, and outgroups Mentha sp., Pycnanthemum muticum, and Monarda fistulosa).
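The ILD statistic itself is a simple difference of tree lengths, compared against the same statistic for randomly repartitioned data. The sketch below shows only that arithmetic with hypothetical tree lengths; the P-value formula is a generic permutation convention and may differ from the exact calculation in PAUP*, and the tree searches that produce the lengths are omitted.

```python
def ild(combined_length, partition_lengths):
    """Incongruence length difference: extra steps required when the
    partitions are forced onto one tree instead of their own trees."""
    return combined_length - sum(partition_lengths)


def permutation_p_value(observed, null_values):
    """Share of statistics (null replicates plus the observed value itself)
    that are at least as large as the observed one.
    Assumption: a generic permutation convention, not necessarily PAUP*'s."""
    as_extreme = sum(1 for value in null_values if value >= observed)
    return (as_extreme + 1) / (len(null_values) + 1)


# Hypothetical most-parsimonious tree lengths for two partitions.
observed = ild(combined_length=250, partition_lengths=[100, 140])
# Hypothetical ILD values from four random repartitions (real tests use 1000).
p_value = permutation_p_value(observed, [2, 3, 10, 12])
print(observed, p_value)
```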

For each concatenation method employed, we conducted a simultaneous analysis of all four data partitions (GapC-1, GapC-2, ITS, and plastid), using both parsimony and Bayesian phylogenetic methods. For parsimony searches, we used the same search strategy as specified in the copy number analysis above. For Bayesian analysis, we used the same search settings as specified above, but we used mixed-models analysis (Nylander et al. 2004) to assign the appropriate evolutionary model to analyze each data partition. The GapC-1 and GapC-2 models were those that were chosen in the analyses of separate GapC loci, the model used for the ITS data set was GTR + G, and the model chosen for the plastid data set was HKY + G. For each concatenation method, we performed four additional analyses, each with one data partition removed at a time, to explore the topological influence that each data partition exerts on the total-evidence topology. Each combination of data partitions was analyzed using the parsimony and Bayesian methods as described above.

Results

GapC Copy Number in New World Mentheae

All sequences of GapC-1, GapC-2, and GapC-3 were deposited in GenBank (GapC-1: EU179328-EU179374, GapC-2: EU183126-EU183184, GapC-3: EU183185-EU183196). Phylogenetic analysis of the exon-only data set to infer copy number contained 49 GapC alleles from 13 individuals, representing 13 species in nine genera. Across all genera and all copy types, the aligned length of the coding regions was 301 characters, of which 216 were invariant, 65 were parsimony-informative, and 20 were variable but parsimony-uninformative. Parsimony analyses resulted in 59,873 trees of length = 113, consistency index (CI) = 0.876, retention index (RI) = 0.973, and rescaled consistency index (RC) = 0.852. One of the most parsimonious trees resulting from the exon-only data set, chosen at random, is presented in figure 3A.
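The reported indices can be cross-checked, since the rescaled consistency index is the product of the consistency and retention indices and the three character classes should sum to the aligned length. The short check below uses the exon-only values quoted above; the function name is illustrative.

```python
def rescaled_consistency(ci, ri):
    """RC is defined as the product CI x RI."""
    return ci * ri


# Exon-only values as reported in the text.
ci, ri, rc_reported = 0.876, 0.973, 0.852
rc_computed = round(rescaled_consistency(ci, ri), 3)

# Invariant + parsimony-informative + uninformative variable characters.
aligned_length = 216 + 65 + 20

print(rc_computed, aligned_length)
```

Both checks agree with the published figures (RC = 0.852 and 301 aligned characters).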

The unrooted exon-only tree placed alleles into two clades separated by a long branch with more than 20 substitutions; the clades received 100% bootstrap support (BS) and 1.0 Bayesian posterior probability (PP). Alleles of all individuals were found in both clades, with alleles of Old World Mentha placed as sister to the remainder of the alleles in each clade. We used midpoint rooting to root the tree between the two divergent clades (fig. 3A), which we interpreted to correspond putatively to GapC-1 and GapC-2+3 (fig. 3A). Within the GapC-2+3 clade, Mentha (100% BS, 1.0 PP) was sister to a clade of New World mints (80% BS, 0.86 PP). The New World mints were divided into two closely related clades, with alleles of all individuals placed in both clades. In the subsequent analysis of alleles, using both intron and exon sequence data, the aligned length of this data set was 528 characters, of which 402 were invariant, 95 were parsimony-informative, and 31 were variable but parsimony-uninformative. Parsimony analyses resulted in one most parsimonious tree of length = 169, CI = 0.846, RI = 0.930, and RC = 0.787 (fig. 3B). In the most parsimonious tree, the two clades recovered in the exon-only trees were very strongly supported (100% and 99% BS, respectively, each with PP = 1.0; fig. 3B). We interpret these clades to correspond to two additional putative loci: GapC-2 and GapC-3. However, while we identified three putative loci, the primers did not consistently amplify GapC-3; therefore, for phylogeny reconstruction, we used only GapC-1 and GapC-2.

Analyses of Recombination

For GapC-1, the phi test (P = 0.18), RDP, and Geneconv did not detect significant evidence for recombination. However, Chimera and MaxChi found two and five recombination events, respectively, in GapC-1. All these recombination events were detected in one allele of Mentha; parental sequences for the recombination events were identified to be distantly related ingroup accessions that have had no physical opportunity to interbreed with Mentha. Furthermore, the two types of recombination analyses did not identify the same recombination locations, visual inspection did not reveal any obvious recombination, and including or excluding the Mentha allele did not alter the results of phylogenetic analyses (results not shown). Thus, we believe that the recombination events detected by these programs are most likely false positives resulting from oversensitivity of the analytical methods. None of the methods detected any significant evidence of recombination in GapC-2.

Phylogenetic Analysis of GapC-1 and GapC-2

GapC-1 and GapC-2 each have a higher percentage of variable, parsimony-informative characters than either ITS or a combined matrix (~2000 bp) of five plastid regions; characteristics of the data matrices and tree searches for all DNA regions are presented in table 1. For both GapC loci, the Bayesian 50% majority-rule consensus trees and the parsimony strict consensus topologies were nearly identical, varying only in the extent to which they collapse different short branches.

Fig. 3 Relationships among GapC loci in the SE scrub mint clade. Parsimony bootstrap values >50%/Bayesian posterior probabilities are indicated above the branches, arrows indicate branches that collapse in the strict consensus, numbers following taxon names indicate the accession number of the population and number of the individual in the population, and numbers following locus names represent the allele number for that locus. An asterisk denotes individuals that are heterozygous for a particular locus. A, One of the most parsimonious trees resulting from analysis of the exon-only data set, which includes all alleles recovered from a reduced data set of 13 taxa. The two hypothesized duplication events and putative loci are indicated. B, The most parsimonious tree resulting from analysis of intron and exon data of GapC-2 and GapC-3 to test whether intron differences also support clades recovered in the exon-only trees.

Characteristics of the GapC-1 data matrix and analyses are presented in table 1. GapC-1 contained 45 alleles from 35 individuals; 12 individuals were heterozygous, and 23 were homozygous. Parsimony strict consensus and Bayesian majority-rule consensus trees were almost identical; we thus present only one of the most parsimonious trees selected at random and designate the branches that collapse in the strict consensus tree (fig. 4A). Resolution among genera was poor in trees based on the GapC-1 data set (fig. 4A); Monarda fistulosa, Pycnanthemum, and the SE scrub mint clade formed a large polytomy, and the SE scrub mint clade was subdivided into many subclades, several of which collapsed either in the Bayesian trees or in the parsimony strict consensus (fig. 4A). One clade found in the parsimony consensus tree (BS <50%), which collapsed in the Bayesian topology, did not correspond to generic boundaries; it contained one allele of Conradina verticillata, two alleles of Conradina glabra 109, three alleles of Stachydeoma graveolens, and two alleles of Clinopodium dentatum; sister to this clade was a clade containing all Clinopodium ashei alleles and a single Clinopodium coccineum allele. The remaining clades corresponded to generic boundaries; one contained all alleles of the two Dicerandra species, one contained both alleles of Clinopodium georgianum, one contained 13 alleles from five Conradina species, and the last contained eight alleles from all six described Conradina species. The clades containing Conradina alleles had no apparent connection with geography.

Characteristics of the GapC-2 data matrix and analyses are presented in table 1. GapC-2 contained 59 alleles from 35 individuals; the amount of heterozygosity at GapC-2 was double that of GapC-1, with heterozygotes totaling 24 of 35 individuals, and 11 were homozygous for GapC-2. There was a 96-bp insertion in exon 9 of the C. verticillata GapC-2 allele; this insertion was identical in sequence to a portion of GapC-2 intron 8 and was not found in any other taxon. Because it was not informative, this insertion was removed from all analyses. Parsimony strict consensus and the Bayesian majority-rule consensus trees were almost identical; we present one of the most parsimonious trees selected at random and designate the branches that collapse in the strict consensus tree in figure 4B. Using Mentha as the sole designated outgroup, a clade of M. fistulosa and Pycnanthemum muticum was sister to the SE scrub mint clade, although the latter received <50% BS and 0.52 PP. The SE scrub mint clade formed a large polytomy consisting of many clades of varying size, some of which had high BS support and PP values, but relationships among these clades were poorly resolved and collapsed in consensus trees (fig. 4B). Many small clades consisted of multiple accessions or multiple alleles of a single species. In Conradina, several large clades contained alleles from multiple species of Conradina, two of which loosely corresponded to geographic location. For example, except for one allele from the Santa Rosa populations from the Florida panhandle, one large clade (13 alleles) was composed entirely of alleles from species from peninsular Florida (Conradina grandiflora, Conradina brevifolia, and Conradina etonia). Another geographically coherent clade contained alleles from species distributed in the Florida panhandle and northern regions, including an allele each from C. verticillata, C. glabra, Santa Rosa population 133-30, and C. etonia. The final Conradina clade did not correspond to geographic boundaries and contained two alleles from C. brevifolia, two alleles from the Santa Rosa populations, and one allele from Conradina canescens. Interestingly, C. canescens and the taxa of uncertain status, C. brevifolia and the Santa Rosa populations, were involved in all instances in which alleles did not correspond to geographic locations.

Congruence Analyses

All pairwise ILD analyses rejected the null hypothesis of partition homogeneity. However, visual inspection of the four gene trees identified only one accession, C. canescens 142-1, that was placed with >80% BS support in conflicting locations; it was placed in a clade (85% BS) with C. glabra, C. grandiflora, and C. canescens 130-2 in the plastid data set, as sister (83% BS) to C. canescens 128-2 in the GapC-2 data set, and in a clade with C. etonia, C. brevifolia 198-2, C. verticillata, and the Santa Rosa population 133-1 in the GapC-1 data set. Because there was only one instance of strongly supported incongruence, we carried out combined analyses using all four data sets. However, given the conflicting placements of C. canescens 142-1, we regarded it with caution in all combined analyses.

Table 1

Characteristics of DNA Matrices and Results of Phylogenetic Analyses

Data set                            No.        Percent         Length  Percent parsimony-  No. most      Tree    CI    RI    RC
                                    terminals  heterozygosity          informative         parsimonious  length
                                                                       characters          trees
GapC-1                              45         34.3            472     13.7                441           171     .848  .884  .750
GapC-2                              59         68.5            512     18.3                92,800        176     .795  .877  .697
ITS (from Edwards et al. 2006)      35         0               624     8.8                 99,400        164     .841  .864  .727
Plastid (from Edwards et al. 2006)  35         0               2034    1.4                 96,300        111     .946  .921  .821
ITS + plastid combined              35         0               2658    3.1                 2001          284     .856  .846  .724
IUB ambiguity code method           35         na              3642    4.4                 46            589     .793  .751  .595
"All combinations" method           80         na              3642    11.4                220           770     .684  .880  .602

Note. CI = consistency index; RI = retention index; RC = rescaled consistency index; ITS = internal transcribed spacer; IUB = International Union of Biochemistry; na = not applicable.

Fig. 4 Trees based on separate analyses of the GapC data sets. Parsimony bootstrap values >50%/Bayesian posterior probabilities are indicated above the branches, arrows indicate branches that collapse in the strict consensus, numbers following taxon names indicate the accession number of the population and the number of the individual in the population, and asterisks denote heterozygotes. A, Phylogram of one of the most parsimonious trees based on analysis of GapC-1, and B, phylogram of one of the most parsimonious trees based on analysis of GapC-2.

Combined Phylogenetic Analyses

Characteristics of the data matrices and tree searches for the combined analyses are presented in table 1. When all four data sets were concatenated with the two alleles of each heterozygote merged into a single sequence, in which heterozygous sites were coded using IUB ambiguity codes, the Bayesian and parsimony trees were almost identical and almost completely resolved; one of the most parsimonious trees selected at random, with the branches that collapse in the strict consensus tree indicated, is presented in figure 5. Both parsimony and Bayesian topologies placed alleles from M. fistulosa and P. muticum as successive sisters to a clade composed of alleles from all SE scrub mint clade taxa. A clade of Dicerandra alleles was sister to the remainder of the SE scrub mint clade. The remaining SE scrub mint taxa were divided into two clades. The first clade was composed of all alleles of Clinopodium ashei, with Clinopodium coccineum, Clinopodium dentatum, and Stachydeoma as successive sisters. The other large clade was composed of a monophyletic Conradina, with Piloblephis rigida and Clinopodium georgianum placed as successive sisters to Conradina. Within Conradina, very few branches received >50% BS support, but most received PP values of greater than 0.95, and relationships were well resolved; all taxa were grouped into two geographically structured clades. One clade contained only alleles from species that inhabit the Florida panhandle and northern habitats: C. glabra, C. canescens, C. verticillata, and one accession from the Santa Rosa populations. The second Conradina clade, with the exception of the second accession from the Santa Rosa populations, was composed of alleles from peninsular Conradina species: C. brevifolia, C. grandiflora, and C. etonia. No species was monophyletic in either clade.

Fig. 5 Phylogram of one of the most parsimonious trees resulting from combined analysis of GapC-1, GapC-2, and internal transcribed spacer and plastid data sets, with heterozygotes concatenated using the International Union of Biochemistry code method. Parsimony bootstrap values >50%/Bayesian posterior probabilities are indicated above the branches, arrows indicate branches that collapse in the strict consensus, and numbers following taxon names indicate the accession number of the population and the number of the individual in the population.
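To make the IUB code concatenation concrete, the sketch below merges the two aligned alleles of a heterozygote into one sequence, writing a two-base ambiguity code wherever the alleles differ. It is an illustration only and not the script used in this study: the function names are hypothetical, the toy sequences are invented, and it assumes equal-length, already aligned alleles containing only A, C, G, and T (a site where a gap meets a base is not handled and raises a KeyError).

```python
# Standard two-base nucleotide ambiguity codes.
TWO_BASE_CODES = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def iub_site(base1, base2):
    """Return the shared base, or the ambiguity code covering both bases."""
    if base1 == base2:
        return base1
    return TWO_BASE_CODES[frozenset((base1, base2))]


def iub_consensus(allele1, allele2):
    """Merge two aligned alleles of one individual into a single sequence."""
    if len(allele1) != len(allele2):
        raise ValueError("alleles must be aligned to the same length")
    return "".join(iub_site(a, b)
                   for a, b in zip(allele1.upper(), allele2.upper()))


# Hypothetical heterozygote differing at the second and fourth sites.
merged = iub_consensus("AAGT", "AGGC")
print(merged)
```

Each individual then contributes a single terminal, which is why the IUB ambiguity code matrix in table 1 has 35 terminals.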

All four data partitions were also concatenated using the all-combinations method and analyzed simultaneously; one of the most parsimonious trees selected at random is presented in figure 6. Relationships among genera recovered from analyses of the all-combinations method were almost identical to those found in trees based on the IUB code method, with the only difference being that instead of alleles of C. georgianum and P. rigida being placed as successive sisters to Conradina, a clade of C. georgianum alleles was sister to Conradina, and P. rigida collapsed into a polytomy. Within Conradina, parsimony and Bayesian trees exhibited a large amount of topological variation that collapsed in consensus trees. In comparison to the high level of resolution and geographical structuring found in Conradina in analyses using the IUB code concatenation method (fig. 5), trees based on the all-combinations method (fig. 6) did not group Conradina into two geographically structured clades; Conradina alleles formed a large polytomy composed of several clades, which generally included alleles or accessions of a single species.
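The all-combinations method can be illustrated in the same spirit. In the sketch below, every combination of GapC-1 and GapC-2 alleles from one individual becomes its own terminal, joined to that individual's ITS and plastid sequences, so an individual heterozygous at both nuclear loci yields four terminals. The function name, the underscore-plus-letter naming scheme, and the toy sequences are assumptions made for illustration; the partition order is likewise arbitrary.

```python
from itertools import product
from string import ascii_uppercase


def all_combinations(name, nuclear_alleles, other_partitions):
    """Build one concatenated terminal per combination of nuclear alleles.

    nuclear_alleles: one list per nuclear locus, holding one allele for a
        homozygote or two for a heterozygote.
    other_partitions: sequences shared by every terminal (e.g., ITS, plastid).
    """
    shared = "".join(other_partitions)
    terminals = {}
    for label, combo in zip(ascii_uppercase, product(*nuclear_alleles)):
        terminals[f"{name}_{label}"] = "".join(combo) + shared
    return terminals


# Hypothetical individual heterozygous at both nuclear loci.
double_het = all_combinations(
    "ind1",
    nuclear_alleles=[["AC", "AT"], ["GG", "GA"]],
    other_partitions=["TT", "CC"],
)
for terminal, sequence in double_het.items():
    print(terminal, sequence)
```

An individual homozygous at both loci yields a single terminal under this scheme, so the total number of terminals in the combined matrix exceeds the number of individuals only where heterozygotes occur.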

Removal of Single Partitions

Analyses with one data partition removed using the data matrices resulting from the two concatenation methods produced similar results (results not shown). When GapC-1 was removed, resolution among SE scrub mint genera decreased in comparison to the total evidence tree, and relationships among many Conradina species collapsed. When GapC-2 or the plastid data set was removed, relationships within Conradina were much less resolved and lost their geographic structure, and the sister group to Conradina became unresolved. When ITS was removed, relationships among SE scrub mint genera were drastically altered compared to the total evidence tree, indicating that the signal in the ITS data set strongly influenced relationships in the combined analyses. In these trees, overall resolution decreased markedly, and in the IUB code tree, Conradina became nonmonophyletic, with some Conradina alleles forming a clade with Stachydeoma, C. ashei, C. dentatum, and C. coccineum.

Discussion

Copy Number and Evolution of GapC in New World Mentheae

At least three putative GapC loci are present in the New World Mentheae sampled in this study, concordant with a study of GapC evolution in another asterid, Amsinckia (Boraginaceae; Perusse and Schoen 2004). However, while three copies were found in both the SE scrub mint clade and Amsinckia spectabilis, we aligned and carried out phylogenetic analysis of the sequences from both studies and found that they are not orthologous. All putative copy types found in Amsinckia are very homogeneous in sequence and structure and clustered with GapC-1, whereas the three copies in New World Mentheae are much more divergent from one another. In the New World Mentheae, an initial duplication event resulted in GapC-1 and the ancestor of GapC-2 and GapC-3. Because these loci are present in all mint taxa sampled in this study, this initial gene duplication must predate the origin of these genera and may be ancient, as evidenced by the large amount of sequence divergence between the two copy types. Indeed, the intron regions are so divergent that sequence alignment and homology assessment between the two copy types were impossible, and in 301 bp of protein-coding regions, the two loci differ by >20 substitutions (fig. 3A). A wide survey of GapC copy number in the asterids, or perhaps the eudicots, would probably be necessary to understand the phylogenetic origin of this duplication.

A second duplication event yielded GapC-2 and GapC-3. This is a much more recent duplication, as evidenced by the fact that the exons of the two loci are almost identical (fig. 3A) and their introns are similar enough to be aligned. Because all New World mint taxa have both GapC-2 and GapC-3, while only one copy is found in the Old World Mentha sp., the duplication event may have occurred since the divergence of the New World taxa from Old World Mentha. However, the single copy type in Mentha may also be due to other processes such as gene loss, so a more thorough sampling of GapC-3 in the Mentheae would be necessary to pinpoint the exact origin of this duplication. One possible cause of the multiple loci in the New World Mentheae, given the relatively high base chromosome number of many members of this clade (e.g., in Conradina, n = 12; Gray 1965), is an ancient polyploidization event. However, regardless of whether the multiple loci are paralogs or homeologs, the multiple GapC loci found in this study also highlight the importance of ensuring that only orthologous genes are included in comparative evolutionary studies.

Separate Phylogenetic Analysis of GapC-1 and GapC-2

Although both GapC loci have a much higher percentage of parsimony-informative characters than ITS or plastid sequence data (table 1), neither contained enough information to resolve phylogenetic relationships in the New World Mentheae, even between morphologically divergent genera. For example, both GapC loci only weakly differentiated the SE scrub mint clade from other morphologically distinct New World mints such as Monarda and Pycnanthemum, and GapC loci did not contain enough information to clarify phylogenetic relationships among SE scrub mint taxa (fig. 4). The monophyly of most SE scrub mint genera is unresolved; some genera are placed as nonmonophyletic in the strict consensuses of both GapC trees, but none of these clades received high BS and/or PP values. Instead, all clades with high BS and/or PP values were composed of alleles from a single genus; however, none of these strongly supported clades contained all alleles of the entire genus. Thus, because the monophyly and relationships among the genera remain unclear, this precludes most assessments of intergeneric hybridization (Edwards et al. 2006) based on GapC data.

One clade in GapC-1 does suggest that hybridization among genera may have occurred; alleles from Conradina glabra, Conradina verticillata, Stachydeoma graveolens, and Clinopodium dentatum were placed in the same clade. Although this clade did not receive bootstrap support >50%, it was recovered in both parsimony and Bayesian trees. Plastid haplotypes for C. dentatum and one accession of S. graveolens are identical, this clade corresponds to geography (fig. 1), and the species flower during the same time period, allowing for the possibility of hybridization. Crossing studies may be the best way to evaluate this hypothesis.

Fig. 6 Phylogram of one of 220 most parsimonious trees resulting from combined analysis of GapC-1, GapC-2, and internal transcribed spacer and plastid data sets, with heterozygotes concatenated in all possible combinations (maximum of four per individual). Each unique combination of alleles of heterozygotes is denoted at the end of the taxon name as A, B, C, or D. Parsimony bootstrap values >90%/Bayesian posterior probabilities (PPs) are indicated above the branches. PP values below 0.90 are not included due to variation among Bayesian runs for the branches that have lower than 0.90 PP. Arrows indicate branches that collapse in the strict consensus, and numbers following taxon names indicate the accession number of the population and the number of the individual in the population.

Within Conradina, GapC-1 and GapC-2 individually provided very little resolution. Reconstructions based on GapC-1 did not place Conradina alleles into clades corresponding to species limits or geographic location. Based on GapC-2, several multispecies Conradina clades corresponded to geographic location, but no clade contained all alleles that would be expected on the basis of geographic location. One noteworthy clade in the GapC-2 trees contained some alleles of Conradina canescens and some alleles from the two taxa of uncertain status, the Santa Rosa populations and Conradina brevifolia. This clade may indicate that C. canescens, C. brevifolia, and the Santa Rosa populations share a common ancestor or, alternatively, experienced gene flow at some time in their evolutionary history. However, in general, there was little resolution of relationships among the various Conradina clades; many of these clades were poorly supported, and thus we are unable to draw strong conclusions regarding species relationships or taxonomic circumscriptions within Conradina using either GapC data set alone.

Combined Analyses: Congruence and the Effect of Concatenation Approach

Although many studies have hailed single- or low-copy nuclear genes as a solution for resolving relationships among closely related taxa such as these (Sang 2002), the alleles of individuals in the GapC data sets either do not coalesce or are of insufficient length to resolve relationships among closely related Conradina species. Depending on substitution rate, a much longer gene region would probably be necessary to resolve relationships. In many cases, the most reasonable solution to this problem is to combine nuclear genes with sequence data from other genomes or gene regions for a total evidence analysis. However, congruence among data partitions should first be assessed, and heterozygosity of the nuclear genes must be taken into account before phylogenetic analysis.

Significant levels of incongruence exist between ITS and plastid data partitions based on ILD tests (Edwards et al. 2006), and in this study we also found significant levels of incongruence among all four data partitions. However, because many different factors could cause a significant ILD result, we visually inspected the topologies and found only one instance of "hard incongruence" (Seelanan et al. 1997): C. canescens 142-1 was placed with >80% bootstrap support in conflicting positions in different trees. Because there was very little well-supported conflict, we carried out combined analysis of our data sets.

Trees based on the two concatenation methods were generally very similar in their resolution and support along the backbone of the trees (i.e., relationships among genera). Within Conradina, however, the two methods produced very different topologies; the topology derived using the IUB code method (fig. 5) is well resolved and strongly supported, but the trees resulting from the all-combinations method (fig. 6) are markedly less resolved. The variation in the topologies resulting from the concatenation methods may be the result of differing treatment of conflict and heterozygosity in the Conradina sequence data. If the data sets conflict because of hybridization or incomplete lineage sorting, both of which may be present in Conradina, the all-combinations method may support multiple differing placements of a taxon due to conflicting signal across the length of a sequence, causing branches to collapse in consensus trees. The all-combinations method (fig. 6) also allows an individual to occupy multiple terminals, which permits alternative placements of an individual due to heterozygosity. Conversely, the IUB code approach ignores both heterozygosity and conflict. Using this method, the most parsimonious of the states contained in the ambiguity code is assigned to a branch after analysis (Swofford and Begle 1993). Thus, even if the two alleles of an individual should be placed in differing positions in a tree, using this method, the individual can occupy only one branch and the variation will be ignored. This reduces conflict and results in a more strongly supported and well-resolved tree, but if conflict truly exists in a data set, using this method may result in a well-resolved and strongly supported tree that is perhaps not an accurate portrayal of relationships. Furthermore, it is assumed that all alleles represented in a terminal are monophyletic (Swofford and Begle 1993), and if the two alleles of a heterozygote are not monophyletic because of processes such as hybridization or incomplete sorting, the IUB code method violates the assumptions of phylogenetic analysis. Thus, if either hybridization or incomplete lineage sorting is suspected, it may be more valid to use the all-combinations method to portray the uncertainty involved in the phylogenetic hypothesis.

Because the true evolutionary tree of the SE scrub mint clade is unknown and we are unable to compare the results of the concatenation methods with a known tree, it is impossible to ascertain which, if either, of the concatenation approaches is better for combining data. One of the most interesting prospects for further research on combined analysis of heterozygous loci would be a simulation study to explore how varying methods of concatenating heterozygotes would affect phylogenetic accuracy of results under bifurcating evolution, after hybridization, or after incomplete sorting. Simulations would enable us to compare the results of analyses based on the concatenation approaches against a true tree and would provide valuable information on how concatenation methods perform with heterozygosity caused by different processes, so that we may understand whether any of these methods can recover accurate phylogenetic relationships when heterozygosity is present. Furthermore, a simulation study could test our hypotheses of whether disagreement among the concatenation methods results from conflicting signal among data sets and whether the IUB code produces a misleading, well-resolved tree because it ignores conflict in the data set or, conversely, it is actually an accurate representation of phylogenetic relationships.

Combined Phylogenetic Analyses: Relationships in the SE Scrub Mint Clade

In general, combined analyses of the four data partitions fully resolved relationships among genera of the SE scrub mint clade, with much higher BS and PP values than obtained with individual data sets. The resolution of these higher-level relationships in the topologies is largely due to the signal found in the ITS and GapC-1 data sets; when either data partition was removed, particularly ITS, relationships among SE scrub mint taxa collapsed. Topologies derived from all concatenation methods are congruent and support a monophyletic SE scrub mint clade, with Dicerandra as sister to the remainder of the clade. One morphological feature that may unite this clade is habit: most New World mint species are herbaceous, Dicerandra species are herbaceous and suffrutescent, and the remainder of the SE scrub mint clade is woody (Edwards et al. 2006). The remaining SE scrub mint taxa were divided into two clades. The first clade consisted of S. graveolens, C. dentatum, and Clinopodium coccineum as successive sisters to Clinopodium ashei. Interestingly, principal components analysis of quantitative morphological characters found a high degree of morphological similarity among C. dentatum, C. coccineum, and C. ashei, while other Clinopodium species, notably Clinopodium georgianum, were morphologically distinct (Crook 1998). Stachydeoma graveolens was not included in Crook's (1998) analysis; however, it is morphologically distinct. The three Clinopodium species are upright shrubs up to 0.5 m tall and have flowers in loose axillary clusters, whereas Stachydeoma is a sprawling shrub up to 0.2 m tall that has a much more compressed floral axis.

The composition of the second clade of SE scrub mints varies depending on the combination method, indicating that conflict in placement may exist in the sequences of C. georgianum and Piloblephis rigida. The IUB code (fig. 5) placed C. georgianum as sister to Piloblephis + Conradina, whereas the all-combinations method (fig. 6) placed C. georgianum as sister to Conradina. Clinopodium georgianum was not placed with the remaining three Clinopodium species in any of our trees, in agreement with Crook (1998), who found that C. georgianum was morphologically distinct from the other three Clinopodium species; however, Crook (1998) found no morphological similarity between C. georgianum and Conradina or Piloblephis. Regardless, the woody Clinopodium species from the southeastern United States do not appear to be monophyletic. The close relationship between Conradina and Piloblephis has also been found in other studies; for example, the two taxa have identical plastid DNA restriction site profiles (Wagstaff et al. 1995). However, Crook et al. (1998) found them to be very distinct morphologically; for example, instead of flowers grouped in loose axillary clusters, as in Conradina, the terminal flowering branches of P. rigida are compressed to form a tight head, and the species is unique in having rigid, triangular hairs covering all surfaces of the plant. The only obvious characters that may unite Conradina and Piloblephis are similar habit and the fact that both have reduced, needlelike leaves with tightly rolled margins.

Both concatenation methods strongly support the monophyly of Conradina. This relationship is also supported by morphology; species of Conradina are readily distinguishable from other species in the SE scrub mint clade, united by the morphological synapomorphy of a distinctively geniculate corolla (Crook 1998). Species of Conradina are morphologically very similar to each other and are generally distinguishable by only small differences in leaf, flower size, or pubescence characters (Gray 1965).

Combined Phylogenetic Analyses—Relationships among Conradina Species

Signal in the GapC-2 and plastid sequence data contributed most to resolution of phylogenetic relationships among Conradina species. However, the two concatenation methods differ in their amount of resolution within Conradina. The all-combinations method did not resolve relationships among Conradina species (fig. 6), while trees based on the IUB code method (fig. 5) were almost completely resolved. Although the IUB code method ignores polymorphic sites and violates the assumptions of analysis because the two alleles of many Conradina individuals included in a terminal are not monophyletic, the resulting trees are difficult to ignore because they present a hypothesis of evolutionary relationships that is in agreement with geography. With only one exception, species of Conradina are placed into two clades that correspond to geographic location; one clade contains the peninsular species, and the other contains the northern/panhandle species. This is a common biogeographical split, described as ‘‘the Apalachicola River Basin discontinuity’’ (Soltis et al. 2006), and is hypothesized to have arisen because the panhandle and peninsular areas became separate islands during the high sea levels of Pleistocene interglacial periods (Soltis et al. 2006). The panhandle and peninsular clades of Conradina may have diverged on these islands and subsequently dispersed to their current distributions after sea levels dropped. However, because this geographic structure was supported using only one of the concatenation methods, this hypothesis should be tested further using different types of data and more rigorous statistical methods.
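
The contrast between the two concatenation approaches can be illustrated with a short Python sketch. It assumes that the IUB code method collapses the two aligned alleles of an individual into one sequence, writing an ambiguity code wherever they differ, and that the all-combinations method concatenates every pairing of alleles across loci; the procedure actually used is the one defined in the methods of this study. The function names, the toy sequences, and the choice to score gap-versus-base mismatches as N are hypothetical and for illustration only.

```python
from itertools import product

# IUB/IUPAC ambiguity codes for unordered pairs of differing nucleotides.
IUB_PAIR = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def iub_consensus(allele_a, allele_b):
    """Collapse two aligned alleles into one ambiguity-coded sequence."""
    if len(allele_a) != len(allele_b):
        raise ValueError("alleles must be aligned to the same length")
    sites = []
    for x, y in zip(allele_a.upper(), allele_b.upper()):
        if x == y:
            sites.append(x)
        else:
            # Assumption: any mismatch that is not a two-base pair
            # (for example gap vs. base) is scored as N.
            sites.append(IUB_PAIR.get(frozenset((x, y)), "N"))
    return "".join(sites)


def all_combinations(loci):
    """Concatenate one allele per locus in every possible combination.

    `loci` is a list with one (allele_1, allele_2) pair per locus.
    """
    return ["".join(choice) for choice in product(*loci)]


# Hypothetical heterozygous individual sampled at two loci.
locus_1 = ("ACGT", "ATGC")
locus_2 = ("GG", "GT")

print(iub_consensus(*locus_1) + iub_consensus(*locus_2))
print(all_combinations([locus_1, locus_2]))
```

The first call yields a single terminal per individual in which heterozygous sites carry no phylogenetic signal, whereas the second yields four terminals for the same individual; this difference is one way to see why the two methods can disagree in resolution.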

Although there may be some geographical signal within the genus Conradina, none of the individual data sets or the combined analyses recovered the monophyly of any Conradina species. The apparent lack of species monophyly reported here based on sequence data correlates with the high degree of morphological similarity in Conradina, as well as the ability of Conradina species to hybridize in crossing experiments (Gray 1965). One explanation for the nonmonophyly of the described Conradina species recovered in this study is that the taxonomic groupings are not representative of true biological entities. However, another explanation is that Conradina species are closely related and probably very recently derived, so the alleles of each species may not yet have coalesced. It may therefore be unreasonable to expect species monophyly using DNA sequence data. Other types of data, such as rapidly evolving microsatellite data, will probably be more useful for determining whether interspecific hybridization has occurred and for clarifying species boundaries and taxonomic groupings in Conradina.
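
The monophyly criterion discussed above can be stated operationally: on a rooted tree, a species is monophyletic if some node subtends exactly the alleles sampled from that species and no others. The following is a minimal sketch of that check, assuming a rooted tree encoded as nested Python tuples; the tip labels (spA_1, spB_1, and so on) are hypothetical placeholders and do not correspond to the accessions or topologies of this study.

```python
def tips(node):
    """Return the set of tip labels below a node of a nested-tuple tree."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(tips(child) for child in node))


def is_monophyletic(tree, group):
    """True if some node of the rooted tree subtends exactly `group`."""
    group = frozenset(group)

    def visit(node):
        if tips(node) == group:
            return True
        if isinstance(node, str):
            return False
        return any(visit(child) for child in node)

    return visit(tree)


# Hypothetical rooted tree: the two alleles of species A are separated
# by an allele of species B, whereas the alleles of species C form a clade.
tree = ((("spA_1", "spB_1"), "spA_2"), ("spC_1", "spC_2"))

print(is_monophyletic(tree, {"spC_1", "spC_2"}))  # True
print(is_monophyletic(tree, {"spA_1", "spA_2"}))  # False
```

A failed check of this kind does not by itself distinguish between poorly circumscribed species and incomplete coalescence of alleles, which is why the two explanations above remain open.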

Acknowledgments

We thank Javier Francisco-Ortega, Andrew Doust, David Reed, Walter Judd, Josh Clayton, Michael Moore, members of the Soltis lab, and an anonymous reviewer for comments on previous versions of this manuscript; Phil Cantino and Luiz Oliveira for plant material; Matt Gitzendanner for help with parallelized Bayesian analyses; Zera Damji for assistance with lab work; Alan Prather and Rachel Williams for access to unpublished data; and the numerous people who have issued collection permits and assisted with collecting. Funding for this project was provided by the Florida Native Plant Society, the Florida Division of Forestry’s Florida Statewide Endangered and Threatened Plant Conservation Program, the Garden Club of America’s Catherine Beattie Fellowship, Sigma Xi Grants-in-Aid of Research, a Graduate Student Research Award from the American Society of Plant Taxonomists, and a Botanical Society of America Genetics Section Award.

Literature Cited

Altschul SF, TL Madden, AA Schaffer, JH Zhang, Z Zhang, W Miller, DJ Lipman 1997 Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.

Barker FK, FM Lutzoni 2002 The utility of the incongruence length difference test. Syst Biol 51:625–637.

Bruen TC, H Philippe, D Bryant 2006 A simple and robust statistical test for detecting the presence of recombination. Genetics 172:2665–2681.

Cline J, J Braman, H Hogrefe 1996 PCR fidelity of pfu DNA polymerase and other thermostable DNA polymerases. Nucleic Acids Res 24:3546–3551.

Cronn R, M Cedroni, T Haselkorn, C Grover, JF Wendel 2002 PCR-mediated recombination in amplification products derived from polyploid cotton. Theor Appl Genet 104:482–489.

Crook R 1998 Systematics of Conradina (Lamiaceae). PhD diss. University of Georgia, Athens.

Cunningham CW 1997 Is congruence between data partitions a reliable predictor of phylogenetic accuracy? empirically testing an iterative procedure for choosing among phylogenetic methods. Syst Biol 46:464–478.

Darlu P, G Lecointre 2002 When does the incongruence length difference test fail? Mol Biol Evol 19:432–437.

Dolphin K, R Belshaw, CDL Orme, DLJ Quicke 2000 Noise and incongruence: interpreting results of the incongruence length difference test. Mol Phylogenet Evol 17:401–406.

Dowton M, AD Austin 2002 Increased congruence does not necessarily indicate increased phylogenetic accuracy: the behavior of the incongruence length difference test in mixed-model analyses. Syst Biol 51:19–31.

Edwards CE, DE Soltis, PS Soltis 2006 Molecular phylogeny of Conradina and other scrub mints (Lamiaceae) from the southeastern USA: evidence for hybridization in Pleistocene refugia? Syst Bot 31:193–207.

Emshwiller E, JJ Doyle 2002 Origins of domestication and polyploidy in oca (Oxalis tuberosa: Oxalidaceae). 2. Chloroplast-expressed glutamine synthetase data. Am J Bot 89:1042–1056.

Farris JS, M Kallersjo, AG Kluge, C Bult 1994 Testing significance of incongruence. Cladistics 10:315–319.

Felsenstein J 1985 Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.

Figge RM, M Schubert, H Brinkmann, R Cerff 1999 Glyceraldehyde-3-phosphate dehydrogenase gene diversity in eubacteria and eukaryotes: evidence for intra- and inter-kingdom gene transfer. Mol Biol Evol 16:429–440.

Gray TC 1965 A monograph of the genus Conradina. PhD diss. Vanderbilt University, Nashville.

Hasegawa M, H Kishino, TA Yano 1985 Dating of the human-ape split by a molecular clock of mitochondrial DNA. J Mol Evol 22:160–174.

Hipp AL, JC Hall, KJ Sytsma 2004 Congruence versus phylogenetic accuracy: revisiting the incongruence length difference test. Syst Biol 53:81–89.

Howarth DG, DA Baum 2005 Genealogical evidence of homoploid hybrid speciation in an adaptive radiation of Scaevola (Goodeniaceae) in the Hawaiian Islands. Evolution 59:948–961.

Huck RB 1994 Complex patterns of evolution in perennial labiates in Florida. Am J Bot 81(suppl):168–169.

Huelsenbeck JP, F Ronquist, R Nielsen, JP Bollback 2001 Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294:2310–2314.

Huson DH, D Bryant 2006 Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 23:254–267.

Ingvarsson PK 2005 Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L., Salicaceae). Genetics 169:945–953.

Larget B, DL Simon 1999 Markov chain Monte Carlo algorithms for the Bayesian analysis of phylogenetic trees. Mol Biol Evol 16:750–759.

Martin DP, C Williamson, D Posada 2005 RDP2: Recombination detection and analysis from sequence alignments. Bioinformatics 21:260–262.

Martin W, H Brinkmann, C Savonna, R Cerff 1993a Evidence for a chimeric nature of nuclear genomes: eubacterial origin of eukaryotic glyceraldehyde-3-phosphate dehydrogenase genes. Proc Natl Acad Sci USA 90:8692–8696.

Martin W, D Lydiate, H Brinkmann, G Forkmann, H Saedler, R Cerff 1993b Molecular phylogenies in angiosperm evolution. Mol Biol Evol 10:140–162.

Morrell PL, DM Toleno, KE Lundy, MT Clegg 2005 Low levels of linkage disequilibrium in wild barley (Hordeum vulgare ssp. spontaneum) despite high rates of self-fertilization. Proc Natl Acad Sci USA 102:2442–2447.

Nylander JAA, F Ronquist, JP Huelsenbeck, JL Nieves-Aldrey 2004 Bayesian phylogenetic analysis of combined data. Syst Biol 53:47–67.

Olsen KM, BA Schaal 1999 Evidence on the origin of cassava: phylogeography of Manihot esculenta. Proc Natl Acad Sci USA 96:5586–5591.

Perusse JR, DJ Schoen 2004 Molecular evolution of the GapC gene family in Amsinckia spectabilis populations that differ in outcrossing rate. J Mol Evol 59:427–436.

Posada D 2002 Evaluation of methods for detecting recombination from DNA sequences: empirical data. Mol Biol Evol 19:708–717.

Posada D, KA Crandall 1998 MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.

Posada D, KA Crandall 2002 The effect of recombination on the accuracy of phylogeny estimation. J Mol Evol 54:396–402.

Rambaut A 2003 Se-Al: a manual sequence alignment editor. Version 2.0a11. http://tree.bio.ed.ac.uk/software/seal/.

Ronquist F, JP Huelsenbeck 2003 MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572–1574.

Ronquist F, JP Huelsenbeck, P van der Mark 2005 MrBayes 3.1 manual. http://mrbayes.csit.fsu.edu/manual.php.

Sang T 2002 Utility of low-copy nuclear gene sequences in plant phylogenetics. Crit Rev Biochem Mol Biol 37:121–147.

Seelanan T, A Schnabel, JF Wendel 1997 Congruence and consensus in the cotton tribe (Malvaceae). Syst Bot 22:259–290.

Small RL, RC Cronn, JF Wendel 2004 Use of nuclear genes for phylogeny reconstruction in plants. Aust Syst Bot 17:145–170.

Small RL, JA Ryburn, RC Cronn, T Seelanan, JF Wendel 1998 The tortoise and the hare: choosing between noncoding plastome and nuclear Adh sequences for phylogeny reconstruction in a recently diverged plant group. Am J Bot 85:1301–1315.

Soltis DE, AB Morris, JS McLachlan, PS Manos, PS Soltis 2006 Comparative phylogeography of unglaciated eastern North America. Mol Ecol 15:4261–4293.

Sota T, AP Vogler 2003 Reconstructing species phylogeny of the carabid beetles Ohomopterus using multiple nuclear DNA sequences: heterogeneous information content and the performance of simultaneous analyses. Mol Phylogenet Evol 26:139–154.

Strand AE, J Leebens-Mack, BG Milligan 1997 Nuclear DNA-based markers for plant evolutionary biology. Mol Ecol 6:113–118.

Swofford DL 2002 PAUP*: phylogenetic analysis using parsimony (*and other methods), version 4.10b10. Sinauer, Sunderland, MA.

Swofford DL, DP Begle 1993 Phylogenetic analysis using parsimony, version 3.1. User’s manual. Illinois Natural History Survey, Champaign. 257 pp.

Thompson JD, TJ Gibson, F Plewniak, F Jeanmougin, DG Higgins 1997 The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882.

Trusty JL, RG Olmstead, DJ Bogler, A Santos-Guerra, J Francisco-Ortega 2004 Using molecular data to test a biogeographic connection of the Macaronesian genus Bystropogon (Lamiaceae) to the New World: a case of conflicting phylogenies. Syst Bot 29:702–715.

USFWS (U.S. Fish and Wildlife Service) 1996 Recovery plan for nineteen central Florida scrub and high pineland plants (revised). U.S. Fish and Wildlife Service, Atlanta.

Wagner A, N Blackstone, P Cartwright, M Dick, B Misof, P Snow, GP Wagner, J Bartels, M Murtha, J Pendleton 1994 Surveys of gene families using polymerase chain-reaction: PCR selection and PCR drift. Syst Biol 43:250–261.

Wagstaff SJ, RG Olmstead, PD Cantino 1995 Parsimony analysis of cpDNA restriction site variation in subfamily Nepetoideae (Labiatae). Am J Bot 82:886–892.

Wall DP 2002 Use of the nuclear gene glyceraldehyde 3-phosphate dehydrogenase for phylogeny reconstruction of recently diverged lineages in Mitthyridium (Musci: Calymperaceae). Mol Phylogenet Evol 25:10–26.

Wiens JJ 1998 Combining data sets with different phylogenetic histories. Syst Biol 47:568–581.

Wiens JJ 1999 Polymorphism in systematics and comparative biology. Annu Rev Ecol Syst 30:327–362.

Yang ZH 1994 Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol 39:306–314.

Yang ZH, B Rannala 1997 Bayesian phylogenetic inference using DNA sequences: a Markov chain Monte Carlo method. Mol Biol Evol 14:717–724.

Zhang DX, GM Hewitt 2003 Nuclear DNA analyses in genetic studies of populations: practice, problems and prospects. Mol Ecol 12:1687–1687.
