A multilocus phylogeny of the Sulidae (Aves: Pelecaniformes)


Molecular Phylogenetics and Evolution 58 (2011) 181–191


journal homepage: www.elsevier.com/locate/ympev


S.A. Patterson, J.A. Morris-Pocock ⇑, V.L. Friesen

Department of Biology, Queen’s University, Kingston, Ontario, Canada K7L 3N6

Article info

Article history: Received 10 April 2010; Revised 18 November 2010; Accepted 23 November 2010; Available online 7 December 2010

Keywords: Booby; Gannet; Gene tree; Intron; Species tree; Sulidae

Abstract

Gene trees will often differ from the true species history, the species tree, as a result of processes such as incomplete lineage sorting. New methods such as Bayesian Estimation of the Species Tree (BEST) use the multispecies coalescent to model lineage sorting, and directly infer the species tree from multilocus DNA sequence data. The Sulidae (Aves: Pelecaniformes) is a family of ten booby and gannet species with a global distribution. We sequenced five nuclear intron loci and one mitochondrial locus to estimate a species tree for the Sulidae using both BEST and by concatenating nuclear loci. We also used fossil calibrated strict and relaxed molecular clocks in BEAST to estimate divergence times for major nodes in the sulid phylogeny. Individual gene trees showed little phylogenetic conflict but varied in resolution. With the exception of the mitochondrial gene tree, no gene tree was completely resolved. On the other hand, both the BEST and concatenated species trees were highly resolved, strongly supported, and topologically consistent with each other. The three sulid genera (Morus, Sula, Papasula) were monophyletic and the relationships within genera were mostly consistent with both a previously estimated mtDNA gene tree and the mtDNA gene tree estimated here. However, our species trees conflicted with the mtDNA gene trees in the relationships among the three genera. Most notably, we find that the endemic and endangered Abbott’s booby (Papasula abbotti) is likely basal to all other members of the Sulidae and diverged from them approximately 22 million years ago.

© 2010 Elsevier Inc. All rights reserved.

⇑ Corresponding author. Fax: +1 613 533 6617.
E-mail addresses: [email protected] (S.A. Patterson), [email protected] (J.A. Morris-Pocock), [email protected] (V.L. Friesen).

1055-7903/$ - see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.ympev.2010.11.021

1. Introduction

The evolutionary history of speciation events within a group of organisms (the species tree) is often inferred by equating it with the evolutionary history of individual genes (gene trees). However, gene trees can differ from species trees for at least three reasons: horizontal gene transfer between species, hybridization/introgression between species, and incomplete lineage sorting (Maddison, 1997). Incomplete lineage sorting can cause both the topology and branch lengths of gene trees to differ from the species tree (Degnan and Rosenberg, 2006; Edwards, 2009), and the probability that a given gene tree will differ from the species tree increases with the effective population sizes of the species and decreases with the inter-node distances (i.e., the length of time between speciation events; Degnan and Rosenberg, 2006). Species trees with high effective population sizes and/or short inter-node distances are more likely to be affected by incomplete lineage sorting and to have gene trees that do not match the species tree. Although gene trees estimated from mitochondrial DNA (mtDNA) are less likely to differ from the species tree than those estimated from nuclear DNA due to the smaller effective population size of mtDNA (Zink and Barrowclough, 2008), all mtDNA loci are effectively linked and at most one gene tree can be estimated (Ballard and Whitlock, 2004).
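The dependence of gene tree discordance on effective population size and inter-node distance can be illustrated with the standard three-taxon coalescent result, in which the probability that a gene tree mismatches the species tree is (2/3)·exp(−T), with T the internal branch length in coalescent units. The sketch below is illustrative only: the function name and the example values are hypothetical, and it assumes a diploid nuclear locus (T = generations / 2Ne), constant population size, and no gene flow.

```python
import math

def discordance_probability(internode_generations, effective_size):
    """Probability that a three-taxon gene tree mismatches the species tree."""
    # Internal branch length in coalescent units (assumes a diploid nuclear locus).
    t = internode_generations / (2.0 * effective_size)
    # With probability exp(-t) the two sister lineages fail to coalesce on the
    # internal branch; two of the three equally likely resolutions in the
    # ancestral population then disagree with the species tree.
    return (2.0 / 3.0) * math.exp(-t)

# Hypothetical values chosen only to show the trend.
for ne in (10_000, 100_000):
    for gens in (20_000, 200_000):
        p = discordance_probability(gens, ne)
        print(f"Ne={ne:>7}, internode={gens:>7} generations: P(discordant)={p:.3f}")
```

Larger effective sizes and shorter inter-node distances both push the probability toward its upper limit of 2/3.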

Recent research has focused on using multiple nuclear loci for species tree estimation, rather than equating gene trees with species trees (e.g., Edwards, 2009). Although single nuclear gene trees may also not provide an accurate representation of the species tree (e.g., Belfiore et al., 2008; Irestedt and Ohlson, 2008; Liu et al., 2008; Njabo et al., 2008), the approach has the advantage of assaying variation at several unlinked loci and combining these data into a single estimate of the species tree (cf. mtDNA). Until recently, two general categories of methods were used to estimate species trees from multilocus data: (1) Consensus methods estimate a gene tree for each locus and then construct the species tree such that only clades that appear in a certain percentage (e.g., 50% or 100%) of the gene trees are represented on the species tree (Maddison, 1997). These methods assume that, if enough gene trees are sampled, the most common gene tree topology will reflect the underlying species tree. However, Degnan and Rosenberg (2006) show that some species histories are more likely to generate gene trees that are inconsistent with the species tree than gene trees that match the species trees. For species trees that lie in this ‘‘anomaly zone’’, consensus methods are likely to fail. (2) Concatenation methods construct a single alignment for all loci together. A single tree is estimated from the concatenated alignment and is used as an estimate of the species tree (Rokas et al., 2003; de Queiroz and Gatesy, 2007). However, concatenation methods cannot accommodate the conflicting phylogenetic signal that may be present in unlinked loci and may result in overestimated support values for species tree nodes (Kubatko and Degnan, 2007).

Several new phylogenetic methods use an extension of coalescent theory, the multispecies coalescent, to estimate the species tree directly from a multilocus DNA sequence data set (e.g., Edwards et al., 2007; Liu et al., 2008; Kubatko et al., 2009; Heled and Drummond, 2010). While coalescent theory models the consequences of genetic drift within species or populations (Kingman, 1982), the multispecies coalescent extends this framework to model the lineage sorting process within a group of closely related species (Degnan and Rosenberg, 2009). One new method, Bayesian Estimation of the Species Tree (BEST; Edwards et al., 2007; Liu and Pearl, 2007; Liu et al., 2008) uses the multispecies coalescent to make a joint estimate of the gene trees for all loci included in the analysis, as well as the species tree. Although BEST is a relatively new approach, it has been used to estimate species level phylogenies for rodents, birds, and turtles (Belfiore et al., 2008; Brumfield et al., 2008; Spinks and Shaffer, 2009), and appears to perform well for species histories where other species tree estimation methods sometimes fail (e.g., concatenation; Edwards et al., 2007).

The Sulidae (Aves: Pelecaniformes) is a family of seabirds consisting of ten booby and gannet species. Sulids have been subdivided into as many as three genera: Morus (gannets – three species), Sula (boobies – six species) and the monotypic Papasula abbotti (Abbott’s booby; Olson and Warheit, 1988). Other authors considered Abbott’s booby to be part of Sula and maintained Morus, or grouped all species into a single genus (Nelson, 1978). Of the Sula species, brown (Sula leucogaster), red-footed (Sula sula), and masked boobies (Sula dactylatra) have highly overlapping pantropical distributions, while Peruvian (Sula variegata), blue-footed (Sula nebouxii), and Nazca (Sula granti) boobies are restricted to the Eastern Pacific Ocean (Fig. 1a). Blue-footed and Peruvian boobies share a parapatric distribution, with blue-footed boobies distributed from the Gulf of California south to the Lobos Islands of Peru where they overlap with Peruvian boobies on two islands. The distribution of Peruvian boobies extends southward from the Lobos Islands to northern Chile (Fig. 1a). Nazca boobies have recently been split from masked boobies (Pitman and Jehl, 1998; Friesen et al., 2002) and breed in the Galápagos Islands and on some other eastern Pacific islands. The gannets have been considered to form three allospecies (Nelson, 1978) and have non-overlapping distributions (Fig. 1a): northern gannets (Morus bassanus) breed in the Northern Atlantic Ocean, Cape gannets (Morus capensis) breed on a few small islands off the southwest coast of Africa, and Australasian (Morus serrator) gannets breed on islands south of Australia and around New Zealand. Abbott’s booby is endemic to Christmas Island in the tropical Indian Ocean (Fig. 1a).

Fig. 1. Global breeding distribution and previously proposed phylogeny of the Sulidae. (a) Approximate global breeding distribution of the Sulidae. The rectangle outlined in black represents the approximate boundaries of the breeding distributions for the three pantropical species: brown, red-footed, and masked boobies. See Nelson (1978) for exact locations of breeding colonies. (b) Previously proposed mitochondrial cytochrome b phylogeny of the Sulidae (Friesen and Anderson, 1997). Reprinted from Molecular Phylogenetics and Evolution, 7(2), V.L. Friesen and D.J. Anderson, Phylogeny and Evolution of the Sulidae (Aves: Pelecaniformes): A test of alternative modes of speciation, pp. 252–260, 1997, with permission from Elsevier. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Friesen and Anderson (1997) estimated a molecular phylogeny of the Sulidae using the mitochondrial cytochrome b gene. Although their phylogeny was completely resolved and well supported (Fig. 1b), masked boobies were not included in their analysis (the ‘‘masked booby’’ individual in their study would now be considered a Nazca booby), it was based on a single mtDNA gene, and two relationships remain contentious and/or warrant reevaluation:

(1) Within Morus, Australasian and Cape gannets were sister species, whereas Nelson (1978) considered that northern gannets were derived from Cape gannets.

(2) Papasula formed a monophyletic group with Morus, to the exclusion of Sula, with moderately high support (maximum parsimony bootstrap of 84). Nelson (1978) was unsure of the relationships among the three genera; however, he did not consider a sister relationship between Morus and Papasula to be a possibility. Moreover, skeletal evidence suggests that Papasula may be basal to all other sulids (Olson and Warheit, 1988).

In the present study we used five nuclear intron loci and one mitochondrial locus to test these competing hypotheses about sulid evolutionary history. We estimated the individual gene trees for these loci and used both concatenation and BEST to estimate species trees. We also used fossil calibrated strict and relaxed molecular clocks in a Bayesian phylogenetic framework to estimate divergence times for major clades.

2. Materials and methods

2.1. Sample collection and DNA extraction

We collected tissue samples from all extant sulid species in a series of field seasons between 1988 and 2008 (Table 1). Samples were collected from two individuals of each species and were collected from geographically distant sites to maximize sampling of intraspecific genetic diversity, if possible (Steeves et al., 2003, 2005a; Morris-Pocock et al., 2010; Taylor et al., in press). We also obtained a tissue sample from one red-faced cormorant (Phalacrocorax urile) and one Christmas Island frigatebird (Fregata andrewsi) to use as outgroups. Recent molecular evidence supports a clade consisting of the Phalacrocoracidae and Anhingidae (cormorants and anhingas) as the sister taxon to the Sulidae, and Fregata (frigatebirds) as sister to the cormorant/anhinga/sulid clade (Hackett et al., 2008). These clades are both strongly supported (bootstrap support of 100 in both cases), and we therefore feel that these two species provide adequate sampling of appropriate outgroup taxa. All samples consisted of blood collected from breeding adults or juveniles on nests, with the exception of the red-faced cormorant sample which consisted of liver tissue. All samples are archived at Queen’s University, Kingston, Ontario. We extracted DNA using either a standard phenol–chloroform extraction protocol (Friesen et al., 1997) or the PureLink Genomic DNA Mini Kit (Invitrogen, Carlsbad, CA).

Table 1. Collection locations for all samples included in the analysis. Index is a number (1 or 2) used to refer to a specific individual throughout the text and figures. For example Masked booby 1 is from Isla Monito.

| Species | Common name | Index | Sampling location | Latitude | Longitude |
|---|---|---|---|---|---|
| Fregata andrewsi | Christmas Island frigatebird | | Christmas Island, Indian Ocean | 10°30′S | 105°40′E |
| Phalacrocorax urile | Red-faced cormorant | | St. Paul Island, Alaska | 57°11′N | 170°17′W |
| Papasula abbotti | Abbott’s booby | 1 | Christmas Island, Indian Ocean | 10°30′S | 105°40′E |
| | | 2 | Christmas Island, Indian Ocean | 10°30′S | 105°40′E |
| Morus bassanus | Northern gannet | 1 | Cape St. Mary’s, Newfoundland | 46°49′N | 54°11′W |
| | | 2 | Cape St. Mary’s, Newfoundland | 46°49′N | 54°11′W |
| M. capensis | Cape gannet | 1 | Bird Island, Lambert Bay, South Africa | 32°05′S | 18°18′E |
| | | 2 | Bird Island, Lambert Bay, South Africa | 32°05′S | 18°18′E |
| M. serrator | Australasian gannet | 1 | Cape Kidnappers, New Zealand | 39°38′S | 177°40′E |
| | | 2 | Pedra Branca, Australia | 43°51′S | 146°58′E |
| Sula sula | Red-footed booby | 1 | Isla Monito, Puerto Rico | 18°90′N | 67°56′W |
| | | 2 | Tern Island, Pacific Ocean | 23°52′N | 166°16′W |
| S. leucogaster | Brown booby | 1 | Raso Island, Cape Verde, Atlantic Ocean | 16°37′N | 24°35′W |
| | | 2 | Palmyra Atoll, Pacific Ocean | 6°20′N | 162°25′W |
| S. dactylatra | Masked booby | 1 | Isla Monito, Puerto Rico | 18°90′N | 67°56′W |
| | | 2 | Tern Island, Pacific Ocean | 23°52′N | 166°16′W |
| S. granti | Nazca booby | 1 | Española Island, Galápagos, Pacific | 1°22′S | 89°41′W |
| | | 2 | Isla de la Plata, Ecuador | 1°16′S | 81°40′W |
| S. nebouxii | Blue-footed booby | 1 | Isla de la Plata, Ecuador | 1°16′S | 81°40′W |
| | | 2 | Isla Lobos de Tierra, Peru | 6°26′S | 80°50′W |
| S. variegata | Peruvian booby | 1 | Mazorca Island, Peru | 11°23′S | 77°45′W |
| | | 2 | Isla Lobos de Tierra, Peru | 6°26′S | 80°50′W |

2.2. PCR amplification and DNA sequencing of nuclear introns

We amplified five nuclear intron loci (Table 2): δ-crystallin intron 7 (δ-cryst), α-enolase intron 8 (α-enol), ornithine decarboxylase introns 6 and 7 (OD67), triosephosphate isomerase intron 4 (Tim4), and lipoprotein lipase intron 2 (Lipo2). Of these loci, four are autosomal and map to different chromosomes within the chicken (Gallus gallus) genome and one is Z-linked in the chicken genome (Table 2; Hillier et al., 2004). All loci map to the same chromosomes in the recently released zebra finch (Taeniopygia guttata) genome (assembly: Taeniopygia_guttata-3.2.4, July 2008). Therefore, we assume that all loci are located on different chromosomes and are unlinked.

All PCR reactions were performed in 15 µL reactions containing 10 mM Tris pH 8.4, 50 mM KCl, 1.5 mM MgCl2, 1.6 µM bovine serum albumin, 2% gelatin, 0.2 mM each of the four dNTPs, 0.4 mM each of the forward and reverse primers, and 0.5 U of Thermus aquaticus (Taq) DNA polymerase (Qiagen, Mississauga, ON). Each amplification was performed with an initial denaturation at 95 °C for 3 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing for 45 s, and extension at 72 °C for 45 s. A final extension step of 3 min was also included. Annealing temperatures were 60 °C for OD67, Tim4 and Lipo2, 61.8 °C for δ-cryst, and 59.4 °C for α-enol, with the following exceptions: the cormorant sample was amplified at 59.4 °C for δ-cryst, and all booby samples were amplified at 61.8 °C for α-enol. All PCR products were sequenced with both the forward and reverse primers using a 3730XL DNA Analyzer (Applied Biosystems, Foster City, CA) at the Genome Quebec Innovation Center (Montreal, QC).

2.3. Collection of mitochondrial cytochrome b data

We downloaded the sulid cytochrome b data from Friesen and Anderson (1997; accession numbers U90000–U90008) to include it in some of our analyses, and to facilitate a comparison of mtDNA
and nuclear DNA variation. Cytochrome b sequence was not available for any red-faced cormorants, Christmas Island frigatebirds, or masked boobies. Therefore, we amplified the same fragment used by Friesen and Anderson (1997) in one individual of each species using the primers b3 and b6 (Table 2), and the same PCR conditions as above (annealing temperature = 50 °C). PCR products were then sequenced with forward and reverse primers, as above.

Table 2. Primer sequences, sources of primers,a and characteristics of sequence variation of nuclear intron and mitochondrial loci used in the study. ‘‘Chromosome’’ refers to the chromosome that the locus maps to in the chicken genome. Tim-F39C and Tim-F39T are allele-specific primers (see text Section 3.1). ‘‘No recombination length’’ is the length of the fragment used in the final analysis. GTR, HKY, and JC are the General Time Reversible, Hasegawa–Kishino–Yano and Jukes–Cantor models respectively. GTR + I + G represents a General Time Reversible model with a proportion of invariant sites and gamma distributed rate variation. NA represents loci for which no exon sites were included in the analysis.

| Locus | Abbreviation | Intron | Chromosome | Primer sequences | Longest aligned length (bp) | No recombination length (bp) | Substitution model (exon / intron) | # of variable sites in final analysis |
|---|---|---|---|---|---|---|---|---|
| δ-crystallin | δ-cryst | 7 | 19 | F: 5′-GCCCATCAGATGGAGCCAGTTC-3′; R: 5′-CCCAGGCGCTCAGAGTCACGGG-3′ | 341 | 241 | NA / GTR | 51 |
| α-enolase | α-enol | 8 | 21 | F: 5′-TGGACCTTCAAATCCCCCGATGATCCCCAGC-3′; R: 5′-CCAGGCACCCCAGTCTACCTGGTCAAA-3′ | 350 | 131 | NA / GTR | 22 |
| Lipoprotein lipase | Lipo2 | 2 | Z | F: 5′-AGTAAAACCTTTGTGGTGATCCAT-3′; R: 5′-CATGGCAACATCCTTTCCCACCAGCTT-3′ | 280 | 264 | JC / HKY | 17 |
| Ornithine decarboxylase | OD67 | 6 and 7 | 3 | F: 5′-GCAAAAGAACTTGACCTTGC-3′; R: 5′-AAGCAGATACATATGAAGCC-3′ | 601 | 337 | JC / HKY | 53 |
| Triosephosphate isomerase | Tim4 | 4 | 1 | F: 5′-ATCGCCTGCATTGGGGAGAAGCT-3′; R: 5′-ATAGGCAAGAACCACCTTACTCCAGTC-3′; Tim-F39C: 5′-TTTTTGAACAGACCAAGGCC-3′; Tim-F39T: 5′-TTTTTGAACAGACCAAGGCT-3′ | 311 | 68 | NA / HKY | 14 |
| Cytochrome b | Cyt b | | mtDNA | b3: 5′-GGACGAGGCTTTTACTACGGCTC-3′; b6: 5′-GTCTTCAGTTTTTGGTTTACAAGAC-3′ | 807 | 807 | GTR + I + G | 243 |

a δ-cryst, Morris-Pocock et al. (2008); α-enol, Friesen et al. (1999); Lipo2, Tim4, and cyt b, Friesen et al. unpublished; OD67 and allele-specific primers for Tim4, this study.

2.4. Haplotype reconstruction, alignment, and recombination tests

We visually verified chromatograms and for each nuclear locus identified heterozygous individuals based on the presence of two peaks of similar size at a single nucleotide site in both sequencing directions. We reconstructed haplotypes for each individual (i.e., we performed gametic phasing) using the following process. First, the haplotypic phase of individuals that were either homozygous or heterozygous with only one polymorphic site could be determined unambiguously. Second, if a heterozygous individual had two or more polymorphic sites, we used PHASE (Version 2.1; Stephens et al., 2001) to infer haplotypes statistically. We ran each PHASE analysis three times with different starting seeds to ensure convergence. Finally, if PHASE was unable to infer haplotypes with posterior probability higher than 0.95, we designed allele-specific primers to amplify each allele within an individual separately (Bottema et al., 1993).
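The first step of the phasing process, in which phase is unambiguous for homozygotes and for individuals with a single heterozygous site, can be sketched as follows. This is a hypothetical illustration rather than the procedure actually used: it assumes the genotype is written as one sequence with heterozygous sites given as two-base IUPAC codes, and the function name is invented for the example. Genotypes with two or more heterozygous sites return None, which stands for the cases passed on to statistical phasing or allele-specific amplification.

```python
# IUPAC codes for two-base heterozygous calls.
IUPAC_HET = {"R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC"}

def phase_if_unambiguous(genotype):
    """Return the two haplotypes when phase is certain, otherwise None."""
    het_sites = [i for i, base in enumerate(genotype) if base in IUPAC_HET]
    if len(het_sites) > 1:
        # Two or more heterozygous sites: phase cannot be read off directly.
        return None
    if not het_sites:
        # Homozygous individual: both haplotypes are identical.
        return genotype, genotype
    i = het_sites[0]
    first, second = IUPAC_HET[genotype[i]]
    return (genotype[:i] + first + genotype[i + 1:],
            genotype[:i] + second + genotype[i + 1:])

print(phase_if_unambiguous("ACRT"))
print(phase_if_unambiguous("RCYT"))
```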

We aligned DNA sequences for each locus using ClustalW (Thompson et al., 1994) as implemented in BioEdit (Version 7.0.5.3; Hall, 1999). To detect recombination within each nuclear locus, we used the four-gamete test (Hudson and Kaplan, 1985) as implemented in DnaSP (Version 5.10; Rozas et al., 2003) to determine the minimum number of presumed recombination events. For each locus, we determined the largest non-recombining fragment and discarded the remaining sequence for all subsequent analyses.
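The logic of the four-gamete test can be sketched in a few lines: under an infinite-sites model without recombination, two biallelic sites can show at most three of the four possible allele combinations, so observing all four implies at least one recombination event between them. The code below is a simplified illustration and not DnaSP's implementation; it only lists incompatible site pairs and does not compute the minimum number of recombination events. The function name and the toy haplotypes are hypothetical, and gaps or ambiguity codes are not handled.

```python
from itertools import combinations

def four_gamete_pairs(haplotypes):
    """Return pairs of site indices at which all four gametes are observed."""
    n_sites = len(haplotypes[0])
    # The test is defined for sites with exactly two states.
    biallelic = [s for s in range(n_sites)
                 if len({h[s] for h in haplotypes}) == 2]
    incompatible = []
    for i, j in combinations(biallelic, 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            incompatible.append((i, j))
    return incompatible

# Toy aligned haplotypes: sites 0 and 1 show all four combinations.
print(four_gamete_pairs(["AAT", "ACT", "GAT", "GCT"]))
```

A non-recombining fragment is then a stretch of the alignment that contains no incompatible pair.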

2.5. Phylogenetic analysis

We estimated Bayesian phylogenetic gene trees for each nuclear locus and also for cytochrome b using MrBayes (Version 3.1.2; Ronquist and Huelsenbeck, 2003). Two loci (OD67, Lipo2) included substantial flanking exon sequence in addition to intron sequence. Therefore, we performed partitioned analyses for these loci applying separate nucleotide substitution models for the intron and the exon. We also partitioned the cytochrome b data, allowing separate substitution models for each codon position. We used the substitution models that best fit the data as determined by Akaike’s information criterion (AIC) in MrModelTest (Version 2.2; Nylander, 2004). Parameters of the nucleotide substitution models were allowed to vary during runs, and insertion and/or deletion polymorphisms (indels) in nuclear loci were coded as binary characters in an additional partition. If an individual was heterozygous at a nuclear locus, both haplotypes were included in the gene tree estimation. Each analysis included one cold chain and three incrementally heated chains to explore parameter space, and was run for 1.00 × 10^7 generations, sampling every 1000 generations. To ensure that the MCMC process was converging, we monitored the average standard deviation of split frequencies between two simultaneous runs and investigated the trend lines of each parameter. We discarded the first 2000 sampled trees as burnin and constructed a majority-rules consensus tree in MrBayes using a final sample of 8000 trees. To further ensure convergence, we repeated each analysis three times. Results were consistent among runs, and only one run from each analysis is presented here.
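The relationship between run length, sampling interval, burnin, and the final tree sample is simple arithmetic, shown here as a small check. The function name is hypothetical, and the sketch assumes that samples are counted per run and that the starting state is not counted (some programs also write the generation-0 tree, which would add one sample).

```python
def post_burnin_samples(generations, sample_every, burnin_samples):
    """Number of sampled trees left after discarding the burnin."""
    total = generations // sample_every  # starting state not counted (assumption)
    if burnin_samples > total:
        raise ValueError("burnin exceeds the number of sampled trees")
    return total - burnin_samples

# Settings described above for the gene tree analyses.
print(post_burnin_samples(10_000_000, 1_000, 2_000))
```

With these settings the burnin corresponds to the first 20% of each run.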

To determine whether the phylogenetic signals in mtDNA vs. nuclear DNA were in conflict, we first performed a Bayesian phylogenetic analysis of a concatenated nuclear data set in MrBayes. In the concatenated analysis, all single nucleotide polymorphisms (within individuals) were coded with the standard IUPAC ambiguity codes. As a result, every individual is represented by a single sequence. Every intron and every exon was assigned its own partition, and the indels from each locus were coded as binary characters and included in a single indel partition, resulting in a total of eight partitions (five intron, two exon, and one indel). All other MrBayes settings were identical to the individual gene tree analyses and we based our estimate of the concatenated species tree on a majority-rules consensus tree constructed in MrBayes from a final set of 8000 trees. The analysis was repeated three times with consistent results. Because we found that the phylogenetic placement of Abbott’s booby differed between the mtDNA gene tree and the concatenated nuclear tree (see Section 3), we explicitly tested the support for the alternative topologies using Bayes Factors (BF; Kass and Raftery, 1995). Bayes Factors can be used in a Bayesian phylogenetic framework to evaluate the relative support of two phylogenetic hypotheses by performing: (i) a standard Bayesian phylogenetic analysis (as above), and (ii) a second analysis with the same data, but with the topology constrained in some way. We re-estimated the cytochrome b gene tree in MrBayes and constrained the topology such that Abbott’s booby was basal to all other boobies and gannets (the relationship found by the concatenated nuclear analysis, see Section 3). Following Brandley et al. (2005), we calculated two times the natural logarithm of the Bayes Factor as 2 ln BF = 2[ln(hm0) – ln(hm1)]; where hm0 and hm1 are the harmonic means of the post-burnin likelihood values for the constrained and unconstrained phylogenetic models, respectively. We estimated these harmonic means using the sump command in MrBayes. We took a value of 2 ln BF > 10 as strong evidence against the constrained model (Kass and Raftery, 1995). We also performed a similar analysis with the nuclear data, this time constraining Abbott’s booby and gannets to be monophyletic (the relationship supported by the mtDNA data).
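The Bayes Factor comparison can be illustrated with a short sketch that computes the log of the harmonic mean of the likelihoods from a list of post-burnin log-likelihood values and then forms 2 ln BF. This is not the MrBayes sump implementation; the function names and the numbers are hypothetical. The difference is ordered here as unconstrained minus constrained so that positive values count against the constrained model, in line with the threshold stated above; that ordering is an assumption about the intended sign convention.

```python
import math

def log_harmonic_mean(log_likelihoods):
    """Log of the harmonic mean of the likelihoods, computed in log space."""
    # HM = n / sum(1 / L_i), so ln HM = ln n - logsumexp(-ln L_i).
    negated = [-x for x in log_likelihoods]
    largest = max(negated)
    log_sum = largest + math.log(sum(math.exp(v - largest) for v in negated))
    return math.log(len(log_likelihoods)) - log_sum

def two_ln_bf(ln_hm_unconstrained, ln_hm_constrained):
    """2 ln BF, positive when the unconstrained model fits better."""
    return 2.0 * (ln_hm_unconstrained - ln_hm_constrained)

# Hypothetical harmonic-mean log likelihoods for two analyses.
value = two_ln_bf(-5000.0, -5010.0)
print(value, "strong evidence against the constraint" if value > 10 else "not strong")
```

Working in log space avoids numerical underflow, since raw likelihoods for sequence alignments are far too small to represent directly.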

We estimated a species tree for the Sulidae using two approaches. First, we used the tree estimated from the concatenated nuclear intron analysis to infer the species tree. We did not include the cytochrome b data in this analysis as we detected conflicting phylogenetic signal in the mtDNA and nuclear DNA data (see Section 3). Moreover, because the length of the cytochrome b sequence is approximately four times the average non-recombining intron length, including mtDNA in a concatenation framework may bias the results. As a species tree based on mtDNA alone already exists (Friesen and Anderson, 1997), we focused on a concatenated species tree based on nuclear variation. Second, we estimated the species tree using BEST (Version 2.3.1; Liu, 2008). Although the most recent version of BEST allows species tree estimation to be based on multiple alleles per species per locus (Liu et al., 2008), our preliminary analyses using the complete data set had difficulty converging, even after very long runs. Therefore, we generated a sub-sampled data set with one allele randomly selected per species per locus for analysis with BEST. The BEST analysis allows only one outgroup to root the tree. We chose to use the red-faced cormorant for the outgroup as cormorants are more closely related to the ingroup than are frigatebirds (Hackett et al., 2008). However, re-running the analysis with the Christmas Island frigatebird as the outgroup did not change the results. Using the sub-sampled data set, we estimated the species tree using four chains and more than 8.00 × 10^7 generations, with trees sampled every 1000 generations. We discarded 40,000 trees as burnin and based our final estimate of the species tree on the remaining 40,000 trees. We monitored the average standard deviation of split frequencies between two simultaneous runs and reviewed the traces of all parameter estimates to ensure convergence. We re-ran the analysis three times with consistent results and only one is presented here. We also performed the sub-sampling of alleles a second time and repeated the analysis, with consistent results. As BEST explicitly accommodates conflicting phylogenetic signal in a coalescent framework, we ran the BEST analysis both with and without the cytochrome b data, with consistent results (the topology of the species tree and all posterior probabilities for nodes in the tree were identical when including or excluding cytochrome b). Therefore, we present only the analysis that included cytochrome b.
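The sub-sampling step described above (one allele drawn at random per species per locus) can be illustrated with a short Python sketch. This is not the authors' script: the function name `subsample_alleles`, the nested-dictionary layout, and the toy sequences are assumptions made purely for illustration.

```python
import random


def subsample_alleles(alleles, seed=None):
    """Pick one allele at random per species per locus.

    `alleles` maps locus name -> species name -> list of allele sequences.
    The layout is a hypothetical convenience, not a format used by BEST.
    """
    rng = random.Random(seed)
    return {
        locus: {species: rng.choice(seqs) for species, seqs in by_species.items()}
        for locus, by_species in alleles.items()
    }


# Toy data (invented sequences); heterozygous individuals contribute two alleles.
toy = {
    "Tim4": {"Cape gannet": ["ACGT", "ACGA"], "northern gannet": ["ACGT"]},
    "a-enol": {"Peruvian booby": ["TTGA", "TTGC"], "masked booby": ["TTGA"]},
}

# Two independent draws, mirroring the repeated sub-sampling in the text.
first = subsample_alleles(toy, seed=1)
second = subsample_alleles(toy, seed=2)
```

Drawing with different seeds and comparing the downstream species trees is one simple way to check that a result does not depend on which alleles happened to be sampled.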

2.6. Divergence time estimation

Divergence times for major nodes on the species tree were estimated using the sub-sampled data in BEAST (Version 1.5.1; Drummond and Rambaut, 2007). We used the sub-sampled data set for our BEAST analyses because the Yule tree prior implemented by the program may be more appropriate when every species is represented by a single sequence. We used PAUP* (Version 4.0b10; Swofford, 2003) to determine whether we could reject the molecular clock hypothesis for each locus. Because the molecular clock was rejected for one of our loci (d-cryst; χ² = 37.11, d.f. = 21, p < 0.05) we performed two separate BEAST analyses: one using a strict molecular clock and one using a relaxed lognormal clock. None of the BEAST analyses included the cytochrome b data due to the conflicting signal between mtDNA and nuclear DNA.

The fossil record for seabirds in general, and for Pelecaniformes specifically, is relatively well known due to the high likelihood of fossil preservation in marine habitats (Warheit, 2002). Therefore, we used the pelecaniform fossil record to calibrate both the strict and relaxed clock analyses in BEAST. The earliest phalacrocoracid and sulid fossils both appear in the late Eocene–early Oligocene fossil record (Warheit, 2002). The earliest probable sulid, S. ronzoni, has been dated to approximately 34 million years ago (Mya; Olson, 1985); however, the relative placement of S. ronzoni as part of either the stem Sulidae lineage or the stem Phalacrocoracidae lineage has been debated (Warheit, 2002). Regardless, it can serve as a lower bound for the Sulidae/Phalacrocoracidae split (i.e., the time to most recent common ancestor, tmrca, of all sulids and phalacrocoracids must be older than 34 million years). The earliest fossil cormorant is likely an undescribed specimen dating to a similar time (Warheit, 2002). Earlier possible cormorants and sulids have been found (e.g., Masillastega rectirostris; Mayr, 2002), but the taxonomic affinities of these fossils have not been confirmed. Therefore, we calibrated the strict molecular clock using S. ronzoni as a minimum estimate of the sulid/phalacrocoracid split. We calibrated the relaxed clock using S. ronzoni and two other fossil sulids, to include the maximum possible amount of information about the fossil record. The earliest fossils belonging to the stem Sula and stem Morus groups both appear in the early Miocene in the Atlantic Ocean (S. universitatis and M. loxostylus; Warheit, 2002; K.I. Warheit, personal communication). We used lognormal priors on the sulid/phalacrocoracid and Sula/Morus tmrcas to model our fossil knowledge (Ho, 2007). For the sulid/phalacrocoracid tmrca prior, we used S. ronzoni to place the zero offset of the lognormal distribution at 33 Mya and set the standard deviation and mean of the distribution so that 95% of the prior probability was within the late Eocene. Similarly, we set a prior on the tmrca of all Sula and Morus species with a zero offset at the early Miocene (16 Mya), and set the standard deviation and mean of the distribution so that 95% of the prior probability was assigned to the early Miocene.
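To make the offset-lognormal calibration concrete, the following Python sketch shows one way to choose the log-scale mean so that 95% of the prior mass falls between the zero offset and a chosen older bound. The 33 Mya offset comes from the text; the 38 Mya upper bound and the log-scale standard deviation of 1.0 are illustrative assumptions, not values reported by the authors, and the function names are hypothetical. The way BEAST itself parameterises the prior may differ.

```python
import math
from statistics import NormalDist


def offset_lognormal_mu(offset, upper95, sigma):
    """Log-scale mean placing 95% of the prior mass between offset and upper95."""
    z95 = NormalDist().inv_cdf(0.95)  # standard normal 95th percentile
    return math.log(upper95 - offset) - z95 * sigma


def offset_lognormal_cdf(age, offset, mu, sigma):
    """Cumulative prior probability that the node age is at most `age`."""
    if age <= offset:
        return 0.0  # ages younger than the fossil receive no prior mass
    return NormalDist(mu, sigma).cdf(math.log(age - offset))


OFFSET = 33.0   # zero offset in Mya (from the text)
UPPER95 = 38.0  # assumed older bound in Mya, for illustration only
SIGMA = 1.0     # assumed log-scale standard deviation

mu = offset_lognormal_mu(OFFSET, UPPER95, SIGMA)
mass_below_upper = offset_lognormal_cdf(UPPER95, OFFSET, mu, SIGMA)
print(f"mu = {mu:.3f}; prior mass between offset and upper bound = {mass_below_upper:.3f}")
```

Because the density is zero at and below the offset, the fossil acts as a hard minimum age, while the long right tail of the lognormal still allows the true divergence to be considerably older than the fossil.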

186 S.A. Patterson et al. / Molecular Phylogenetics and Evolution 58 (2011) 181–191

For both BEAST analyses we used a Yule tree prior, and the topology and branch lengths of trees were allowed to vary during the analysis. The program was run for 1.00 × 10^8 generations, sampling trees every 1000 generations. We excluded all exon sites from the analysis, and each intron was given its own nucleotide substitution and molecular clock models. We ran each analysis three times and, once we verified that each run converged on the same stationary distribution, we used the program LogCombiner (Version 1.5.3; Drummond and Rambaut, 2007) to combine the results from all three runs (with 25% of the trees from each run eliminated as burnin). We visualized results using Tracer (Version 1.5; Drummond and Rambaut, 2007) and based our final estimates on a sample of 225,000 trees.
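The tree counts reported for the MCMC analyses follow from the chain length, the sampling interval, and the burnin fraction. The short Python check below reproduces that arithmetic; the helper name `retained_trees` is hypothetical. It treats the BEST chain as exactly 8.00 × 10^7 generations (the text says "more than") and ignores any initial state that a sampler might also record.

```python
def retained_trees(generations, sample_every, burnin_fraction, runs=1):
    """Number of post-burnin trees kept across `runs` identical runs."""
    sampled = generations // sample_every
    kept_per_run = sampled - int(sampled * burnin_fraction)
    return kept_per_run * runs


# BEAST: 1.00 x 10^8 generations, sampled every 1000, 25% burnin, three runs combined.
beast_total = retained_trees(100_000_000, 1000, 0.25, runs=3)

# BEST: 8.00 x 10^7 generations, sampled every 1000, 40,000 of 80,000 trees discarded.
best_total = retained_trees(80_000_000, 1000, 0.5)

print(beast_total, best_total)
```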

3. Results

3.1. Sequence variation, gametic phasing and recombination tests

We obtained an average of 377 base pairs of DNA sequence data per nuclear locus (range: 311–601; Table 2) and all five loci were highly variable with an average of 31 variable sites per locus (range: 14–53). All nuclear DNA sequence data have been deposited in GenBank (accession nos. H379742–H379851). We also obtained 807 base pairs of cytochrome b sequence for one red-faced cormorant, Christmas Island frigatebird, and masked booby and deposited these sequences in GenBank (H379739–H379741). Alignment of the new cytochrome b sequence with the sequences from Friesen and Anderson (1997) revealed 243 variable sites. All nuclear loci included phylogenetically informative indels. One Peruvian booby (Peruvian booby 2) was heterozygous at a-enol and the two alleles differed at two nucleotide sites. One Cape gannet (Cape gannet 2) was heterozygous at Tim4 and the two alleles differed at three sites. PHASE was able to unambiguously estimate the gametic phase of the Peruvian booby at a-enol (posterior probability = 0.96); however, the gametic phasing for the Cape gannet at Tim4 was less certain (posterior probability = 0.75). Therefore, we designed allele-specific primers to amplify the two Tim4 alleles of the Cape gannet separately (Table 2). PCR conditions were identical to those mentioned above, with an annealing temperature of 60 °C. Direct sequencing of the PCR products from the allele-specific primers fully resolved haplotypes and was consistent with the original non-phased sequence (e.g., polymorphic sites with both a C and T in the original sequence were unambiguously C and T in the allele-specific primer based sequences).

Four-gamete tests revealed potential recombination events for all five nuclear loci. Therefore, we used the longest non-recombining block for each locus for all subsequent analyses. The most likely model of nucleotide substitution was the General Time Reversible model (GTR; Tavaré, 1986) for the d-cryst and a-enol introns, and the Hasegawa–Kishino–Yano model (HKY; Hasegawa et al., 1985) for the Lipo2, Tim4, and OD67 introns. The flanking exons for Lipo2 and OD67 also contained variable sites and the most likely model of nucleotide substitution for both of these regions was the Jukes–Cantor model (JC; Jukes and Cantor, 1969). The most likely model of nucleotide substitution for cytochrome b was the GTR model with a proportion of invariant sites and gamma distributed rate variation (GTR + I + G).
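The logic of a four-gamete test can be sketched in a few lines of Python. For any pair of biallelic sites, observing all four two-site combinations among the sampled haplotypes implies either recombination between the sites or recurrent mutation. The function below is an illustrative sketch only: it assumes phased haplotypes of equal length with no gaps or missing data, the toy haplotypes are invented, and it stops short of extracting the longest non-recombining block that was used in the analyses.

```python
from itertools import combinations


def four_gamete_violations(haplotypes):
    """Return pairs of site indices at which all four two-site gametes occur."""
    n_sites = len(haplotypes[0])
    violations = []
    for i, j in combinations(range(n_sites), 2):
        alleles_i = {h[i] for h in haplotypes}
        alleles_j = {h[j] for h in haplotypes}
        if len(alleles_i) != 2 or len(alleles_j) != 2:
            continue  # the test applies only to pairs of biallelic sites
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            violations.append((i, j))
    return violations


# Toy phased haplotypes (invented for illustration).
haps = ["AATG", "ACTG", "GATC", "GCTC"]
print(four_gamete_violations(haps))  # [(0, 1), (1, 3)]
```

In this toy example, sites 0 and 3 are compatible with each other, whereas site 1 conflicts with both, so a non-recombining block could not span site 1 together with either neighbour.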

3.2. Gene tree estimation

Although resolution varied among loci, the gene trees showed little phylogenetic conflict (Fig. 2). One area of conflict was the placement of Abbott’s booby alleles. Abbott’s booby alleles were basal to all other ingroup alleles on two nuclear gene trees (d-cryst and OD67), and formed an unresolved basal trichotomy with the clades containing Morus and Sula alleles on the other three nuclear gene trees (a-enol, Lipo2, and Tim4). On the other hand, the cytochrome b tree strongly supported Abbott’s booby as sister taxon to Morus, consistent with the previously estimated tree (Friesen and Anderson, 1997). All gene trees strongly supported monophyly of Sula. Resolution varied among gene trees within the Sula clade, but most gene trees placed red-footed booby alleles basal to all other booby alleles (but see OD67), and brown booby alleles basal to all other booby alleles except for red-footed booby alleles. Alleles from the four remaining Sula species (masked, Nazca, blue-footed and Peruvian boobies) grouped together on all gene trees, but the relationships among these taxa were not completely resolved on any single nuclear gene tree (the relationships in this clade were fully resolved on the cytochrome b gene tree). There was very little resolution among Morus alleles on nuclear gene trees; however, northern gannet alleles grouped together to the exclusion of all other gannet alleles with high support on the d-cryst tree and this relationship is also supported by cytochrome b. One Australasian gannet allele appeared basal to all other gannet alleles on the d-cryst tree (but with low posterior probability).

3.3. Species tree estimation

Both species tree estimation methods generated highly resolved and strongly supported species trees (Fig. 3). Moreover, both species trees were topologically consistent with each other despite inclusion of cytochrome b data in the BEST but not the concatenated analysis. Nodes on both species trees had similar posterior probabilities. Abbott’s booby was basal to all other boobies and gannets, and this relationship was supported with 1.00 posterior probability in both the concatenation and BEST analyses. Other than the placement of Abbott’s booby and the inclusion of one new species (masked booby), our species trees were topologically identical to the mtDNA tree (Friesen and Anderson, 1997; our Fig. 1b). Not surprisingly, masked boobies formed a sister relationship with Nazca boobies. Although there was very little variation among Morus alleles at any locus other than cytochrome b, both species trees supported a sister relationship between Cape and Australasian gannets, with moderate to high posterior probability (0.92 and 1.00, for concatenation and BEST, respectively).

We found significant conflict between the mtDNA (cytochrome b) data and the combined nuclear data using Bayes Factor analyses. When we constrained Abbott’s booby to be basal to all other sulid species on the mtDNA gene tree, the harmonic mean of the log likelihood was −3647.98 (compared to −3459.43 for the unconstrained analysis), resulting in a value of 2 ln BF = 377.1, strong evidence against the constrained tree. Similarly, when we constrained Abbott’s booby alleles and Morus alleles to be monophyletic on the concatenated nuclear tree, the harmonic mean of the log likelihood was −2564.09 (compared to −2552.09 for the unconstrained analysis), resulting in a value of 2 ln BF = 24, strong evidence against the constrained tree.
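The 2 ln BF values follow directly from the reported harmonic-mean log likelihoods, since 2 ln BF is twice the difference between the unconstrained and constrained values. The Python snippet below simply reproduces that arithmetic; the function name `two_ln_bayes_factor` is hypothetical.

```python
def two_ln_bayes_factor(lnl_unconstrained, lnl_constrained):
    """2 ln BF in favour of the unconstrained over the constrained topology."""
    return 2.0 * (lnl_unconstrained - lnl_constrained)


# Harmonic-mean log likelihoods as reported in the text.
mtdna = two_ln_bayes_factor(-3459.43, -3647.98)    # Abbott's booby forced basal
nuclear = two_ln_bayes_factor(-2552.09, -2564.09)  # Abbott's booby + Morus forced monophyletic

print(f"mtDNA: {mtdna:.1f}; nuclear: {nuclear:.1f}")
```

Both values exceed 10, the threshold on the 2 ln BF scale that is commonly read as very strong evidence following Kass and Raftery (1995).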

3.4. Divergence time estimation

Divergence time estimates using strict and relaxed molecular clock models were similar and the 95% highest posterior density (HPD) intervals overlapped in all cases (Table 3). Divergence times estimated using a relaxed clock were consistently older than strict clock estimates (except for the Cape gannet/Australasian gannet divergence); however, dates estimated using the strict clock may be underestimates because the clock was calibrated using a minimum estimate of the sulid/phalacrocoracid split (see Section 2.6), and the lognormal priors used in the relaxed clock analysis explicitly consider divergence times that are older than the fossil date used for the calibration to be more likely (Ho, 2007). For these reasons, and also because a molecular clock could be explicitly rejected for one of our loci, we consider the relaxed clock tmrca estimates to be more reliable. Therefore, we discuss only these results (but for both strict and relaxed clock tmrcas, including 95% HPDs, see Table 3). The estimated tmrca of all extant sulids was approximately 22 Mya. Following the initial divergence of Abbott’s booby from all other sulids, the estimated tmrca of all modern Morus and Sula species was approximately 17 Mya. The estimated tmrcas for the extant gannet and Sula species were approximately 2.5 Mya and 6 Mya, respectively. The three extant sister species pairs in the Sulidae (masked/Nazca boobies, Peruvian/blue-footed boobies, Cape/Australasian gannets) all appear to have diverged within the last 1.1 million years; however, the 95% HPDs around these divergences include zero (Table 3).

Fig. 2. Bayesian estimates of gene trees. Majority-rule consensus trees from the Bayesian estimates of the gene trees for (a) d-cryst, (b) a-enol, (c) Lipo2, (d) OD67, (e) Tim4, and (f) cytochrome b. Nodal support is indicated with Bayesian posterior probabilities. Heterozygous individuals are represented by two alleles, denoted “a” and “b”.

4. Discussion

4.1. Phylogeny and biogeography of the Sulidae

In general, our study supports most of the phylogenetic hypotheses of sulid evolution proposed by Friesen and Anderson (1997). The divergence times estimated in our analysis were based only on the nuclear intron data, and are generally consistent with estimates based on mtDNA (Friesen and Anderson, 1997). Morus and Sula are each monophyletic and appear to have diverged in the early Miocene approximately 17 Mya (95% HPD: 16.0–19.4 Mya). Contrary to Friesen and Anderson (1997), Papasula was basal to the Sula + Morus clade and appears to have originated 22 Mya (95% HPD: 17.1–27.4 Mya). This timing of early sulid diversification corresponds to major changes in paleoceanography, such as the beginning of the Paratethys seaway closure between the Atlantic and Indian Oceans (Rögl, 1998; Harzhauser and Piller, 2007) and the establishment of major thermal gradients on a global latitudinal scale (Warheit, 1992). Therefore, the early phases of sulid diversification may have been predominantly driven by major ocean re-structuring and vicariant events. An early to mid-Miocene origin of the early sulid lineages is also supported by mtDNA evidence (Friesen and Anderson, 1997) although these authors considered only two major lineages, Sula and Morus + Papasula (see below). The early to mid-Miocene also appears to have been an important time for diversification in other marine vertebrates (e.g., cetaceans; McGowen et al., 2009) and invertebrates (gastropods; Williams and Duda, 2008).

Fig. 3. Concatenation and BEST species trees. Estimates of the sulid species tree from the (a) concatenation and (b) BEST analyses. Nodal support is indicated with Bayesian posterior probabilities. Numbers in open circles refer to node numbers that are used in Table 3.

The phylogenetic relationships among the three major sulid lineages were previously unclear. Some authors suggested that similarities between Abbott’s and red-footed boobies (e.g., tree nesting; Nelson, 1978) argue for a close ancestry between Papasula and Sula. However, Nelson (2005) argued that Abbott’s booby and Morus share behavioral affinities (e.g., a prolonged face-to-face meeting ceremony between males and females upon arrival at the nest), and an analysis of the scleral rings of all extant sulids and many other pelecaniform birds led Warheit et al. (1989) to suggest a basal relationship of Abbott’s booby to all other sulids. Our results support the hypothesis of Warheit et al. (1989): both species tree estimation methods placed Papasula basal to all other sulids with strong support, a result in conflict with the sister relationship of Abbott’s booby and Morus gannets on the mtDNA gene tree. Importantly, Papasula was still strongly supported as basal to all other sulids even when we included the cytochrome b data in the BEST analysis and when we varied the outgroups that were used to root the tree.

Table 3. Divergence times estimated by BEAST using strict and relaxed molecular clock models. Point estimates are the medians of the posterior distributions, and 95% highest posterior density (HPD) intervals are given in brackets. All dates are in millions of years ago (Mya). Refer to Fig. 3 for a graphical representation of the node numbering.

Node  Description                                                 Relaxed clock (Mya)  Strict clock (Mya)
1     Abbott’s booby splits from all other sulids                 21.8 [17.1–27.4]     17.6 [11.7–23.8]
2     Gannets and Sula boobies split                              17.2 [16.0–19.4]     12.7 [11.2–17.8]
3     Origin of extant Sula boobies                               5.9 [1.8–10.7]       4.5 [2.5–7.0]
4     Origin of lineage leading to brown boobies                  3.6 [0.7–7.4]        2.7 [1.2–4.3]
5     Origin of extant gannets                                    2.5 [0.0–9.2]        2.1 [0.5–4.4]
6     Origin of masked, Nazca, blue-footed, Peruvian booby clade  2.1 [0.1–5.1]        1.6 [0.7–2.8]
7     Masked/Nazca booby divergence                               1.1 [0.0–3.2]        0.8 [0.1–1.7]
8     Blue-footed/Peruvian booby divergence                       1.1 [0.0–3.2]        0.8 [0.1–1.7]
9     Australasian/Cape gannet divergence                         0.5 [0.0–4.1]        0.7 [0.0–2.0]


The conflicting results between Friesen and Anderson (1997) and the present study may reflect incomplete lineage sorting of mtDNA. Although the divergences between Papasula, Sula, and Morus occurred a long time ago, incomplete lineage sorting can still confound phylogenetic analysis if inter-node lengths are short and/or ancestral population sizes were large (Degnan and Rosenberg, 2006). If the initial divergences between Papasula, Morus, and Sula were tightly spaced in time, potentially during a period of ocean re-structuring, we might expect incomplete lineage sorting among these lineages to persist to the present. A second reason why the mtDNA gene tree and the nuclear DNA species tree may differ is long branch attraction (Felsenstein, 1978; Hendy and Penny, 1989; Bergsten, 2005). Due to the relatively high substitution rate of mtDNA (Brown et al., 1979), multiple mutations at single sites may occur along long branches in a phylogeny (e.g., the long branch that connects Abbott’s booby to other sulids). This can confound phylogenetic analysis and may result in strong support for relationships that do not match the true species history. Because the BEST analysis explicitly models incomplete lineage sorting in a coalescent framework and because our results are robust to the particular outgroup used, we argue that our estimate of the basal relationships within the Sulidae is a better representation of the true species history than the previously estimated mtDNA gene tree.

Within the Sula clade, we recovered the same topology as estimated from mtDNA (Friesen and Anderson, 1997; this study). The most recent common ancestor to Sula apparently lived in the late Miocene, approximately 6 Mya (95% HPD: 1.8–10.7 Mya), and since then sulid speciation has proceeded at a relatively constant pace (Fig. 3 and Table 3). The pantropical distribution of half of the extant Sula species confounds many biogeographic interpretations of the topology and divergence times (Friesen and Anderson, 1997); however, several major barriers to gene flow existed throughout the tropics during this time and may have isolated incipient species and promoted allopatric divergence (e.g., the closing of the Central American Seaway approximately 4 million years ago and/or the periodic emergence of the Sunda and Sahul shelves in the Indo-West Pacific; Voris, 2000; Kuhnt et al., 2004; Jain and Collins, 2007). More recently, Sula speciation has been concentrated in the Eastern Tropical Pacific where two sister species pairs (masked/Nazca boobies and blue-footed/Peruvian boobies) both appear to have diverged approximately 1 million years ago. No obvious terrestrial barriers to gene flow exist in this area and some evidence suggests that blue-footed and Peruvian boobies diverged from a common ancestor with gene flow via the parapatric model of speciation (Friesen and Anderson, 1997). The strong environmental gradient that exists where the Equatorial Counter Current meets the Humboldt Current in northern Peru may have acted as a partial barrier to gene flow between these species, facilitating divergence (Taylor et al., unpublished results). Recent intraspecific divergence in the Eastern Pacific also appears to have been driven by non-terrestrial barriers to gene flow in brown, red-footed, and blue-footed boobies (Steeves et al., 2005b; Morris-Pocock et al., 2010; Taylor et al., in press). Therefore, the complex oceanography of the eastern Pacific may be a general driver of population differentiation and ultimately speciation in the Sulidae.

The most recent common ancestor of the gannets appears to have lived approximately 2.5 million years ago, but we note the credibility interval surrounding this estimate is wide. Our data support an initial divergence into northern (M. bassanus) and southern hemisphere (M. capensis and M. serrator) lineages followed by the recent divergence of the two southern hemisphere species. A sister relationship between M. capensis and M. serrator is also supported by plumage similarities (e.g., amount of black on tail and wings; Nelson, 1978), and is consistent with the mtDNA gene tree (Friesen and Anderson, 1997). However, nodal support for this sister relationship was moderate on our concatenated species trees (Fig. 3). This may have been due to the relatively low substitution rate for nuclear introns and the recent ancestry of all gannets. We suggest that the relationships among the gannets might be best resolved by using a population genetic approach and sequencing both more individuals per species and more loci.

4.2. The use of nuclear intron data for species tree estimation

Sequence data from nuclear introns are becoming increasingly useful for phylogenetic analysis as a result of at least three advances: (1) the ability to infer statistically the gametic phase of alleles directly sequenced from PCR (e.g., PHASE; Stephens et al., 2001), (2) the increased availability of genomic resources (e.g., the chicken genome; Hillier et al., 2004) to design intron primers with widespread applicability (e.g., Friesen et al., 1999; Backström et al., 2008; Kimball et al., 2009), and (3) the ability of new species tree estimation methods (e.g., BEST) to take advantage of multilocus DNA sequence data sets. However, the low substitution rate of nuclear DNA compared to mtDNA means that on average less variation per base pair is available to inform phylogeny estimation (Moore, 1995; Zink and Barrowclough, 2008). Our data suggest that the use of multiple nuclear intron markers, in concert with new species tree estimation methods, is a powerful approach to phylogeny estimation. Similar to previous avian studies that used BEST (e.g., Brumfield et al., 2008), we found that although individual nuclear gene trees were not completely resolved and sometimes conflicting, the Bayesian estimate of the species tree using a moderate number of loci (here five or six) generated a highly resolved species tree. This tree was completely resolved whether we included mtDNA data or not. The potential for nuclear introns to inform species tree estimation in more recent, or rapid, speciation events (e.g., adaptive radiations) is not clear. In our study, individual nuclear introns were often not able to resolve relationships among the most recently diverged taxa. One potential extension to the approach used here is the inclusion of anonymous nuclear loci (Karl and Avise, 1993). Anonymous loci appear to have substitution rates higher than intron loci and appear to be informative at both phylogeographic (Lee and Edwards, 2008) and phylogenetic levels (Liu and Pearl, 2007).

5. Conclusions

Our study highlights that species tree estimation methods that use multiple loci and explicitly model lineage sorting in a coalescent framework offer a significant improvement over equating single gene trees with the species tree. In our analysis, no single gene tree was completely resolved; however, both concatenation and BEST species trees were very well resolved. While previous studies (e.g., Belfiore et al., 2008) have found that species trees estimated using concatenation were discordant with species trees that were estimated using BEST, the trees inferred using concatenation and BEST methods in our study were topologically identical. However, the agreement between concatenation and BEST probably reflects the relative agreement among individual nuclear gene trees, itself a result of a relatively simple species history (i.e., very few short inter-node lengths), and the fact that we did not include the cytochrome b data in the concatenated analysis. One of the more striking results from our study is that Abbott’s booby, an endangered species with as few as 5000 individuals remaining, is most likely basal to all other boobies and gannets and is the last living representative of a clade that is approximately 22 million years old. Abbott’s booby is therefore evolutionarily distinct and warrants special conservation concern (May, 1990).

Acknowledgments

We would like to thank D.J. Anderson, J. Awkerman, A. Baker, T. Birt, N. Brothers, G. Del’ommo, C. Depkin, E. Gómez-Días, J. Gonzales-Solis, J. Hennicke, W.A. Montevecchi, M. Peck, T. Steeves, S. Taylor, H. Walsh, and C. Zavalaga for help with sample collection. Funding was supplied by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (V.L.F.), NSERC Alexander Graham Bell Canada Graduate Scholarship (J.A.M.P.), EG Bauman Award (J.A.M.P.), Queen’s Graduate Award (J.A.M.P.) and the National Geographic Society (Grant 8331-07 to D.J. Anderson and V.L.F.). We would also like to thank L. Liu for assistance with the BEST analysis, Christmas Island National Park for logistical support, T. Birt, P. Deane, G. Ibarguchi, L. Maclagen, S. Taylor, and K. Warheit for helpful discussion and/or lab support, and R. Tashian and an anonymous reviewer for helpful comments on the manuscript.

References

Backström, N., Fagerberg, S., Ellegren, H., 2008. Genomics of natural bird populations: a gene-based set of reference markers evenly spread across the avian genome. Mol. Ecol. 17, 964–980.

Ballard, J.W.O., Whitlock, M.C., 2004. The incomplete natural history of mitochondria. Mol. Ecol. 13, 729–744.

Belfiore, N.M., Liu, L., Moritz, C., 2008. Multilocus phylogenetics of a rapid radiation in the genus Thomomys (Rodentia: Geomyidae). Syst. Biol. 57, 294–310.

Bergsten, J., 2005. A review of long branch attraction. Cladistics 21, 163–193.

Bottema, C.D.K., Sarkar, G., Cassady, J.D., It, S., Dutton, C.M., Sommer, S.S., 1993. Polymerase chain reaction amplification of specific alleles: a general method of detection of mutations, polymorphisms, and haplotypes. Method Enzymol. 218, 388–402.

Brandley, M.C., Schmitz, A., Reeder, T.W., 2005. Partitioned Bayesian analyses, partition choice, and the phylogenetic relationships of scincid lizards. Syst. Biol. 54, 373–390.

Brown, W.M., George Jr., M., Wilson, A.C., 1979. Rapid evolution of animal mitochondrial DNA. Proc. Natl. Acad. Sci. USA 76, 1967–1971.

Brumfield, R.T., Liu, L., Lum, D.E., Edwards, S.V., 2008. Comparison of species tree methods for reconstructing the phylogeny of bearded manakins (Aves: Pipridae, Manacus) from multilocus sequence data. Syst. Biol. 57, 719–731.

Degnan, J.H., Rosenberg, N.A., 2006. Discordance of species trees with their most likely gene trees. PLoS Genet. 2, 762–768.

Degnan, J.H., Rosenberg, N.A., 2009. Gene tree discordance, phylogenetic inference and the multispecies coalescent. Trends Ecol. Evol. 24, 332–340.

de Queiroz, A., Gatesy, J., 2007. The supermatrix approach to systematics. Trends Ecol. Evol. 22, 34–41.

Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214–221.

Edwards, S.V., 2009. Is a new and general theory of molecular systematics emerging? Evolution 63, 1–19.

Edwards, S.V., Liu, L., Pearl, D.K., 2007. High-resolution species trees without concatenation. Proc. Natl. Acad. Sci. USA 104, 5936–5941.

Felsenstein, J., 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27, 401–410.

Friesen, V.L., Anderson, D.J., 1997. Phylogeny and evolution of the Sulidae (Aves: Pelecaniformes): a test of alternative modes of speciation. Mol. Phylogenet. Evol. 7, 252–260.

Friesen, V.L., Congdon, B.C., Walsh, H.E., Birt, T.P., 1997. Intron variation in marbled murrelets detected using analyses of single-stranded conformational polymorphisms. Mol. Ecol. 6, 1047–1058.

Friesen, V.L., Congdon, B.C., Kidd, M.G., Birt, T.P., 1999. Polymerase chain reaction (PCR) primers for the amplification of five nuclear introns in vertebrates. Mol. Ecol. 8, 2147–2149.

Friesen, V.L., Anderson, D.J., Steeves, T.E., Jones, H., Schreiber, E.A., 2002. Molecular support for species status of the Nazca booby (Sula granti). Auk 119, 820–826.

Hackett, S.J., Kimball, R.T., Reddy, S., Bowie, R.C.K., Braun, E.L., Braun, M.J., Chojnowski, J.L., Cox, W.A., Han, K.-L., Harshman, J., Huddleston, C.J., Marks, B.D., Miglia, K.J., Moore, W.S., Sheldon, F.H., Steadman, D.W., Witt, C.C., Yuri, T., 2008. A phylogenomic study of birds reveals their evolutionary history. Science 320, 1763–1768.

Hall, T.A., 1999. BioEdit: a user friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98.

Harzhauser, M., Piller, W.E., 2007. Benchmark data of a changing sea – palaeogeography, palaeobiogeography and events in the Central Paratethys during the Miocene. Palaeogeogr. Palaeocl. 253, 8–31.

Hasegawa, M., Kishino, H., Yano, T., 1985. Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. J. Mol. Evol. 22, 160–174.

Heled, J., Drummond, A.J., 2010. Bayesian inference of species trees from multilocus data. Mol. Biol. Evol. 27, 570–580.

Hendy, M.D., Penny, D., 1989. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38, 297–309.

Hillier, L.W., Miller, W., Birney, E., Warren, W., Hardison, R.C., Ponting, C.P., Bork, P., Burt, D.W., Groenen, M.A.M., Delany, M.E., Dodgson, J.B., Chinwalla, A.T., Cliften, P.F., Clifton, S.W., Delehaunty, K.D., Fronick, C., Fulton, R.S., Graves, T.A., Kremitzki, C., Layman, D., Magrini, V., McPherson, J.D., Miner, T.L., Minx, P., Nash, W.E., Nhan, M.N., Nelson, J.O., Oddy, L.G., Pohl, C.S., Randall-Maher, J., Smith, S.M., Wallis, J.W., Yang, S.P., Romanov, M.N., Rondelli, C.M., Paton, B., Smith, J., Morrice, D., Daniels, L., Tempest, H.G., Robertson, L., Masabanda, J.S., Griffin, D.K., Vignal, A., Fillon, V., Jacobbson, L., Kerje, S., Andersson, L., Crooijmans, R.P.M., Aerts, J., van der Poel, J.J., Ellegren, H., Caldwell, R.B., Hubbard, S.J., Grafham, D.V., Kierzek, A.M., McLaren, S.R., Overton, I.M., Arakawa, H., Beattie, K.J., Bezzubov, Y., Boardman, P.E., Bonfield, J.K., Croning, M.D.R., Davies, R.M., Francis, M.D., Humphray, S.J., Scott, C.E., Taylor, R.G., Tickle, C., Brown, W.R.A., Rogers, J., Buerstedde, J.M., Wilson, S.A., Stubbs, L., Ovcharenko, I., Gordon, L., Lucas, S., Miller, M.M., Inoko, H., Shiina, T., Kaufman, J., Salomonsen, J., Skjoedt, K., Wong, G.K.S., Wang, J., Liu, B., Wang, J., Yu, J., Yang, H.M., Nefedov, M., Koriabine, M., de Jong, P.J., Goodstadt, L., Webber, C., Dickens, N.J., Letunic, I., Suyama, M., Torrents, D., von Mering, C., Zdobnov, E.M., Makova, K., Nekrutenko, A., Elnitski, L., Eswara, P., King, D.C., Yang, S., Tyekucheva, S., Radakrishnan, A., Harris, R.S., Chiaromonte, F., Taylor, J., He, J.B., Rijnkels, M., Griffiths-Jones, S., Ureta-Vidal, A., Hoffman, M.M., Severin, J., Searle, S.M.J., Law, A.S., Speed, D., Waddington, D., Cheng, Z., Tuzun, E., Eichler, E., Bao, Z.R., Flicek, P., Shteynberg, D.D., Brent, M.R., Bye, J.M., Huckle, E.J., Chatterji, S., Dewey, C., Pachter, L., Kouranov, A., Mourelatos, Z., Hatzigeorgiou, A.G., Paterson, A.H., Ivarie, R., Brandstrom, M., Axelsson, E., Backstrom, N., Berlin, S., Webster, M.T., Pourquie, O., Reymond, A., Ucla, C., Antonarakis, S.E., Long, M.Y., Emerson, J.J., Betran, E., Dupanloup, I., Kaessmann, H., Hinrichs, A.S., Bejerano, G., Furey, T.S., Harte, R.A., Raney, B., Siepel, A., Kent, W.J., Haussler, D., Eyras, E., Castelo, R., Abril, J.F., Castellano, S., Camara, F., Parra, G., Guigo, R., Bourque, G., Tesler, G., Pevzner, P.A., Smit, A., Fulton, L.A., Mardis, E.R., Wilson, R.K., 2004. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432, 695–716.

Ho, S.Y.W., 2007. Calibrating molecular estimates of substitution rates and divergence times in birds. J. Avian Biol. 38, 409–414.

Hudson, R.R., Kaplan, N.L., 1985. Statistical properties of the number of recombination events in the history of a sample of DNA sequences. Genetics 111, 147–164.

Irestedt, M., Ohlson, J.I., 2008. The division of the major songbird radiation into Passerida and ‘core Corvoidea’ (Aves: Passeriformes) – the species tree vs. gene trees. Zool. Scripta 37, 305–313.

Jain, S., Collins, L.S., 2007. Trends in Caribbean paleoproductivity related to the Neogene closure of the Central American Seaway. Mar. Micropaleontol. 63, 57–74.

Jukes, T.H., Cantor, C.R., 1969. Evolution of protein molecules. In: Munro, H.N. (Ed.), Mammalian Protein Metabolism, vol. III. Academic Press, New York, pp. 21–132.

Kass, R.E., Raftery, A.E., 1995. Bayes factors. J. Am. Stat. Assoc. 90, 773–795.

Karl, S.A., Avise, J.C., 1993. PCR-based assays of Mendelian polymorphisms from anonymous single-copy nuclear DNA: techniques and applications for population genetics. Mol. Biol. Evol. 10, 342–361.

Kimball, R.T., Braun, E.L., Barker, F.K., Bowie, R.C.K., Braun, M.J., Chojnowski, J.L., Hackett, S.J., Han, K.-L., Harshman, J., Heimer-Torres, V., Holznagel, W., Huddleston, C.J., Marks, B.D., Miglia, K.J., Moore, W.S., Reddy, S., Sheldon, F.H., Smith, J.V., Witt, C.C., Yuri, T., 2009. A well-tested set of primers to amplify regions spread across the avian genome. Mol. Phylogenet. Evol. 50, 654–660.

Kingman, J.F.C., 1982. The coalescent. Stoch. Proc. Appl. 13, 235–248.

Kubatko, L.S., Degnan, J.H., 2007. Inconsistency of phylogenetic estimates from concatenated data under coalescence. Syst. Biol. 56, 17–24.

Kubatko, L.S., Carstens, B.C., Knowles, L.L., 2009. STEM: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics 25, 971–973.

Kuhnt, W., Holbourn, A., Hall, R., Zuvela, M., Käse, R., 2004. Neogene history of the Indonesian throughflow. In: Cliff, P.D. (Ed.), Continent–ocean interactions within East Asian marginal seas. American Geophysical Union, Washington, pp. 299–320.

Lee, J.Y., Edwards, S.V., 2008. Divergence across Australia’s Carpentarian barrier: statistical phylogeography of the red-backed fairy wren (Malurus melanocephalus). Evolution 62, 3117–3134.

Liu, L., 2008. BEST: Bayesian estimation of species trees under the coalescent model. Bioinformatics 24, 2542–2543.

Liu, L., Pearl, D.K., 2007. Species trees from gene trees: reconstructing Bayesian posterior distributions of a species phylogeny using estimated gene tree distributions. Syst. Biol. 56, 504–514.

Liu, L., Pearl, D.K., Brumfield, R.T., Edwards, S.V., 2008. Estimating species trees using multiple-allele DNA sequence data. Evolution 62, 2080–2091.

Maddison, W.P., 1997. Gene trees in species trees. Syst. Biol. 46, 523–536.

May, R.M., 1990. Taxonomy as destiny. Nature 347, 129–130.

Mayr, G., 2002. A skull of a new pelecaniform bird from the Middle Eocene of Messel, Germany. Acta Palaeontol. Pol. 47, 507–512.

McGowen, M.R., Spaulding, M., Gatesy, J., 2009. Divergence date estimation and a comprehensive molecular tree of extant cetaceans. Mol. Phylogenet. Evol. 53, 891–906.

Moore, W.S., 1995. Inferring phylogenies from mtDNA variation: mitochondrial-gene trees versus nuclear-gene trees. Evolution 49, 718–726.

Morris-Pocock, J.A., Taylor, S.A., Birt, T.P., Damus, M., Piatt, J.F., Warheit, K.I., Friesen, V.L., 2008. Population genetic structure in Atlantic and Pacific Ocean common murres (Uria aalge): natural replicate tests of post-Pleistocene evolution. Mol. Ecol. 17, 4859–4873.

Morris-Pocock, J.A., Steeves, T.E., Estela, F.A., Anderson, D.J., Friesen, V.L., 2010. Comparative phylogeography of brown (Sula leucogaster) and red-footed boobies (S. sula): the influence of physical barriers and habitat preference on gene flow in pelagic seabirds. Mol. Phylogenet. Evol. 54, 883–896.

Nelson, J.B., 1978. The Sulidae: Gannets and Boobies. Oxford University Press, Oxford.

Nelson, J.B., 2005. Pelicans, Cormorants and Their Relatives. Oxford University Press, New York.

Njabo, K.Y., Bowie, R.C.K., Sorenson, M.D., 2008. Phylogeny, biogeography and taxonomy of the African wattle-eyes (Aves: Passeriformes: Platysteiridae). Mol. Phylogenet. Evol. 48, 136–149.

Nylander, J.A.A., 2004. MrModelTest v2. Program Distributed by the Author. Evolutionary Biology Centre, Uppsala University.

Olson, S.L., 1985. The fossil record of birds. In: Farner, D.S., King, J.R., Parkes, K.C. (Eds.), Avian Biology, vol. 8. Academic Press, Orlando, FL, pp. 79–252.

Olson, S.L., Warheit, K.W., 1988. A new genus for Sula abbotti. Bull. Brit. Ornithol. Club 108, 9–12.

Pitman, R.L., Jehl Jr., J.R., 1998. Geographic variation and reassessment of species limits in the “masked” boobies of the Eastern Pacific Ocean. Wilson Bull. 110, 155–170.

Rögl, F., 1998. Mediterranean and Paratethys. Facts and hypotheses of an Oligocene to Miocene paleogeography (short overview). Geol. Carpath. 50, 339–349.

Rokas, A., Williams, B.L., King, N., Carroll, S.B., 2003. Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature 425, 798–804.

Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.

Rozas, J., Sánchez-DelBarrio, J.C., Messeguer, Z., Rozas, R., 2003. DnaSP: DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19, 2496–2497.

Spinks, P.Q., Shaffer, H.B., 2009. Conflicting mitochondrial and nuclear phylogenies for the widely disjunct Emys (Testudines: Emydidae) species complex, and what they tell us about biogeography and hybridization. Syst. Biol. 58, 1–20.

Steeves, T.E., Anderson, D.J., McNally, H., Kim, M.H., Friesen, V.L., 2003. Phylogeography of Sula: the role of physical barriers to gene flow in the diversification of tropical seabirds. J. Avian Biol. 34, 217–223.

Steeves, T.E., Anderson, D.J., Friesen, V.L., 2005a. The Isthmus of Panama: a major physical barrier to gene flow in a highly mobile pantropical seabird. J. Evol. Biol. 18, 1000–1008.

Steeves, T.E., Anderson, D.J., Friesen, V.L., 2005b. A role for nonphysical barriers to gene flow in the diversification of a highly vagile seabird, the masked booby (Sula dactylatra). Mol. Ecol. 14, 3877–3887.

Stephens, M., Smith, N.J., Donnelly, P., 2001. A new statistical method for haplotype reconstruction from population data. Am. J. Human Genet. 68, 978–989.

Swofford, D.L., 2003. PAUP*: Phylogenetic Analysis Using Parsimony (*and other methods). Version 4. Sinauer, Sunderland, MA.

Tavaré, S., 1986. Some probabilistic and statistical problems in the analysis of DNA sequences. Lect. Math. Life Sci. 17, 57–86.

Taylor, S.A., Maclagan, L., Anderson, D.J., Friesen, V.L., in press. Could specialization to cold water upwelling systems influence gene flow and population differentiation in marine organisms? A case study using the blue-footed booby, Sula nebouxii. J. Biogeogr.

Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.

Voris, H.K., 2000. Maps of Pleistocene sea levels in southeast Asia: shorelines, river systems and time durations. J. Biogeogr. 27, 1153–1167.

Warheit, K.I., 1992. A review of the fossil seabirds from the Tertiary of the North Pacific: plate tectonics, paleoceanography, and faunal change. Paleobiology 18, 401–424.

Warheit, K.I., 2002. The seabird fossil record and the role of paleontology in understanding seabird community structure. In: Schreiber, E.A., Burger, J. (Eds.), Biology of Marine Birds. CRC Press, Boca Raton, FL, pp. 17–55.

Warheit, K.I., Good, D.A., de Queiroz, K., 1989. Variation in numbers of scleral ossicles and their phylogenetic transformations within the Pelecaniformes. Auk 106, 383–388.

Williams, S.T., Duda Jr., T.F., 2008. Did tectonic activity stimulate Oligo-Miocene speciation in the Indo-West Pacific? Evolution 62, 1618–1634.

Zink, R.M., Barrowclough, G.F., 2008. Mitochondrial DNA under siege in avian phylogeography. Mol. Ecol. 17, 2107–2121.

