Complex adaptation in Zea

Date post: 13-Dec-2014
Upload: jrossibarra
Description:
Some thoughts on the complexities of adaptation in a big plant genome.
Page 1: Complex adaptation in Zea

Complex adaptation in Zea

Jeffrey Ross-Ibarra @jrossibarra • www.rilab.org

Dept. Plant Sciences • Center for Population Biology • Genome Center University of California Davis

Page 2: Complex adaptation in Zea

acknowledgements

Tanja Pyhäjärvi (U. Oulu)

Shohei Takuno (Sokendai)

John Doebley (U Wisconsin)

Vince Buffalo

Michelle Stitzer

Paul Bilinski

Anne Lorant

Sofiane Mezmouk (KWS)

Nathan Springer (U Minnesota)

Page 3: Complex adaptation in Zea
Page 4: Complex adaptation in Zea

Matthew Hufford (Iowa State)

Pyhäjärvi T, Hufford MB, Mezmouk S, Ross-Ibarra J§ (2013) Complex patterns of local adaptation in teosinte. Genome Biology and Evolution 5: 1594-1609.†

Hufford MB, Lubinsky P, Pyhäjärvi T, Devengenzo MT‡, Ellstrand NC, Ross-Ibarra J§ (2013) The genomic signature of crop-wild introgression in maize. PLoS Genetics 9(5): e1003477.

Kanizay LB, Pyhäjärvi T, Lowry E, Hufford MB, Peterson DG, Ross-Ibarra J, Dawe RK (2013) Diversity and abundance of the Abnormal chromosome 10 meiotic drive complex in Zea mays. Heredity 110: 570-577.

Hufford MB, Bilinski P, Pyhäjärvi T, Ross-Ibarra J§ (2012) Teosinte as a model system for population and ecological genomics. Trends in Genetics 28(12): 606-615†

van Heerwaarden J§, Hufford MB, Ross-Ibarra J§ (2012) Historical genomics of North American maize. PNAS 109: 12420-12425

Swanson-Wagner R, Briskine R, Schaefer R, Hufford MB, Ross-Ibarra J, Myers CL, Tiffin P, Springer NM. Reshaping of the maize transcriptome by domestication. (2012) PNAS 109: 11878-11883

Hufford MB∗, Xun X∗, van Heerwaarden J∗, Pyhäjärvi T∗, Chia J-M, Cartwright RA, Elshire RJ, Glaubitz JC, Guill KE, Kaeppler S, Lai J, Morrell PL, Shannon LM, Song C, Springer NM, Swanson-Wagner RA, Tiffin P, Wang J, Zhang G, Doebley J, McMullen MD, Ware D, Buckler ES§, Yang S§, Ross-Ibarra J§ (2012) Comparative population genomics of maize domestication and improvement. Nature Genetics 44:808-811†

Chia J-M∗, Song C∗, Bradbury P, Costich D, de Leon N, Doebley JC, Elshire RJ, Gaut BS, Geller L, Glaubitz JC, Gore M, Guill KE, Holland J, Hufford MB, Lai J, Li M, Liu X, Lu Y, McCombie R, Nelson R, Poland J, Prasanna BM, Pyhäjärvi T, Rong T, Sekhon RS, Sun Q, Tenaillon M, Tian F, Wang J, Xu X, Zhang Z, Kaeppler S, Ross-Ibarra J, McMullen M, Buckler ES, Zhang G, Xu Y, Ware D (2012) Maize HapMap2 identifies extant variation from a genome in flux. Nature Genetics 44:803-807†

Hufford MB§, Gepts P, Ross-Ibarra J (2011) Influence of cryptic population structure on observed mating patterns in the wild progenitor of maize (Zea mays ssp. parviglumis). Molecular Ecology 20: 46-55

Tenaillon MI, Hufford MB, Gaut BS, Ross-Ibarra J§ (2011) Genome size and TE content as determined by high-throughput sequencing in maize and Zea luxurians. Genome Biology and Evolution 3: 219-229

Page 5: Complex adaptation in Zea

how do plants adapt?

Clausen, Keck, and Hiesey 1940

Jane Shelby Richardson

Page 6: Complex adaptation in Zea

hard sweep

[Figure; y-axis: Diversity]

what evolutionary processes are involved?

Page 7: Complex adaptation in Zea

hard sweep • multiple mutations

[Figure; y-axis: Diversity]

what evolutionary processes are involved?

Page 8: Complex adaptation in Zea

hard sweep • multiple mutations

[Figure; y-axis: Diversity]

standing variation

what evolutionary processes are involved?

Page 9: Complex adaptation in Zea

hard sweep • multiple mutations • polygenic adaptation

[Figure; y-axis: Diversity]

standing variation

what evolutionary processes are involved?

Page 10: Complex adaptation in Zea

what is the genetic basis of adaptation?

Lowry & Willis 2010 PLoS Biology

Page 11: Complex adaptation in Zea

Zea: teosinte & maize

Hufford et al. 2012 Trends in Genetics

Zea mays ssp. mays

Zea mays ssp. parviglumis

Zea mays ssp. mexicana

Zea nicaraguensis

Zea luxurians

Tripsacum dactyloides

Zea mays ssp. huehuetenangensis

Zea perennis

Zea diploperennis

Page 12: Complex adaptation in Zea

Zea as an evolutionary model

Purugganan and Fuller 2010 Evolution

Brandon Gaut

M. D. PURUGGANAN AND D. Q. FULLER

Figure 4. Comparison of evolutionary rate estimates. Box plots of the rates of evolution in (A) log (darwins) and (B) log (haldanes) for domestication (DOM) as well as plants (PLAN) (from Bone and Farres 2001) and anthropogenic (AN) and natural (NAT) conditions for wild animal species (Hendry et al. 2008). The asterisk indicates domestication rates under the assumption of the shortened 2000-year period for legume species. The vertical lines give the estimate ranges, whereas the boxes span the minimum and maximum quartile range. The horizontal line within the box gives the median rate.

while for grain/seed increase is 0.68 ± 0.15 × 10⁻³ haldanes. Using the log (rate) estimates, we find that the evolutionary rates are not significantly different (t = 0.63, df = 18, P < 0.54). In contrast, however, making the assumption of an ∼2000-year domestication period for legumes leads to a nearly twofold higher estimate for grain/seed size increase across all species (mean = 1.15 × 10⁻³ ± 0.26 haldanes), but this rate difference between nonshattering and seed size increase is still not significant (t = 0.34, df = 18, P < 0.74). This analysis does not reveal a significantly higher rate for nonshattering compared to grain size evolution.

EVOLUTIONARY RATES DURING DOMESTICATION ARE SIMILAR TO THOSE EXPERIENCED BY WILD SPECIES

The real interest in calculating evolutionary rates for these different domestication traits lies in the comparison with estimates observed in wild species that are subject to natural selection. There have been extensive compilations of phenotypic rates of evolution from contemporary microevolutionary (Hendry and Kinnison 1999; Bone and Farres 2001; Kinnison and Hendry 2001; Hendry et al. 2008) and paleontological data (Gingerich 2001), and we can use these to compare domesticated versus wild species.

The rate estimates for evolution during domestication fall within the range observed in these microevolutionary studies, but on the lower side of the range, either those for plants (Bone and Farres 2001) or under natural and anthropogenic conditions in animals (Hendry et al. 2008; see Figure 4). Comparison with the plant data (Bone and Farres 2001), for example, reveals that the mean rates of evolution based on log-transformed data are much higher than those observed for the rate of evolution during domestication (see Fig. 4 and Table 4). Indeed, the mean evolutionary rates we observe during domestication are significantly lower than mean rates of phenotypic evolution of plant species in the wild (t = 7.48, df = 91, P < 0.0001).

Most of the studies that document rapid evolution in wild plant species, however, represent cases of very strong selection (e.g., growth in serpentine soils, herbicide resistance, Bone and Farres 2001). To examine evolution of other wild species, we also compare our results to data from wild animal species (Hendry et al. 2008). Moreover, the large size of the dataset (Hendry et al. 2008) allows us to focus on allochronic data (e.g., change across a time series, which is comparable to our archaeological data), and to partition the data for wild species into those that have experienced natural versus anthropogenic conditions. This comparison is based on the assumption that, in contrast to the compiled data from plants under strong selection pressures (Bone and Farres 2001), the rates of evolution under less-stringent selective conditions are at least broadly comparable between plants and animals.

Not surprisingly, the mean rates of evolution for wild species under anthropogenic conditions are higher than those under natural conditions (see Fig. 4 and Table 4). Nevertheless, the mean rate of evolution under domestication is significantly lower than

Table 4. Mean rates of evolution.

Mean Rates               Rate (darwins)     Rate × 10³ (haldanes)

Domestication
  Overall                144.56 ± 33.55     0.73 ± 0.14
  Overall¹               200.82 ± 44.04     1.17 ± 0.26
  Nonshattering          835.44 ± 196.13    0.98 ± 0.16
  Grain/seed size        107.92 ± 20.7      0.69 ± 0.16
  Grain/seed size¹       158.35 ± 32.15     1.21 ± 0.29

Natural
  Plant                  8893.8 ± 2604.8    33.4 ± 10.5
  Anthropogenic animal   5906.7 ± 781       71.3 ± 18.4
  Natural animal         626.2 ± 86.4       29.5 ± 5.4

¹Rates calculated under the assumption that legume domestication occurred over a 2000-year period.
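
The two rate units in Table 4 can be made concrete with a short sketch. It assumes the standard definitions: a darwin is the change in the natural log of a trait value per million years, and a haldane is the change in the mean of ln-transformed trait values, in pooled standard deviations, per generation. The trait values, standard deviation, and time span below are hypothetical and are not taken from the paper.

```python
import math


def rate_in_darwins(x1, x2, years):
    """Proportional change per million years: (ln x2 - ln x1) / (years / 1e6)."""
    return (math.log(x2) - math.log(x1)) / (years / 1e6)


def rate_in_haldanes(ln_mean1, ln_mean2, pooled_sd, generations):
    """Change in pooled standard deviations per generation."""
    return (ln_mean2 - ln_mean1) / pooled_sd / generations


# Hypothetical example: a seed dimension grows from 2.0 mm to 3.0 mm
# over 2,000 years in an annual crop (one generation per year).
d = rate_in_darwins(2.0, 3.0, years=2000)
h = rate_in_haldanes(math.log(2.0), math.log(3.0), pooled_sd=0.2, generations=2000)

print(f"{d:.1f} darwins")
print(f"{h * 1e3:.2f} x 10^-3 haldanes")
```

Because darwins scale with elapsed time and haldanes with generations and trait variance, the same change can rank differently in the two columns, which is why the table reports both.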

178 EVOLUTION JANUARY 2011

maize

Arabidopsis

Hufford et al. 2012 Trends in Genetics

Kew C-Value Database

Page 13: Complex adaptation in Zea

Wills et al. 2013 PLoS Genetics

grassy tillers1 (gt1): evolution of plant architecture

Page 14: Complex adaptation in Zea

prolificacy mapped to upstream of gt1

Wills et al. 2013 PLoS Genetics

markers SBM07 (AGP v2: 23,232,048) and SBM08 (AGP v2: 23,234,775) (Figure 3, Figure S1). Correspondingly, all members of the teosinte phenotypic class carry teosinte chromosome between these two markers. No other chromosomal region shows this absolute correspondence with phenotype. Thus, substitution mapping based on the recombination breakpoints indicates that prol1.1 or the factor that governs prolificacy maps to this interval. This interval, which we will refer to as the “causative region,” is approximately 7.5 kb upstream of gt1 and measures 2720 bp in W22, 3142 bp in our teosinte parent, and 2736 bp in the B73 reference genome (Figure 3, Figure S1). The sequence alignment of W22 and the teosinte parent expands to ∼4.2 kb because there are several large insertions unique to either W22 or teosinte (see below).
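
The substitution-mapping logic in the excerpt above (find the markers whose genotype matches the phenotypic class in every recombinant line) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the marker names `m1` to `m4`, the genotype codes, and the four recombinant lines are all hypothetical.

```python
def concordant_markers(lines, markers):
    """Return the markers whose genotype equals the phenotypic class in every line."""
    return [
        m for m in markers
        if all(line["genotype"][m] == line["phenotype_class"] for line in lines)
    ]


# Hypothetical ordered markers and recombinant lines.
# "M" = maize genotype or maize-like phenotype, "T" = teosinte.
markers = ["m1", "m2", "m3", "m4"]
lines = [
    {"genotype": {"m1": "M", "m2": "M", "m3": "M", "m4": "T"}, "phenotype_class": "M"},
    {"genotype": {"m1": "T", "m2": "M", "m3": "M", "m4": "M"}, "phenotype_class": "M"},
    {"genotype": {"m1": "M", "m2": "T", "m3": "T", "m4": "T"}, "phenotype_class": "T"},
    {"genotype": {"m1": "T", "m2": "T", "m3": "T", "m4": "M"}, "phenotype_class": "T"},
]

interval = concordant_markers(lines, markers)
print(interval)
```

In this toy data only the two middle markers track phenotype in every line, so the causal factor would be placed in the interval they bound. More recombinants with breakpoints near that interval narrow it further.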

The decrease in prolificacy in maize is correlated with an increase in kernel weight

The maize allele of prol1.1 confers a reduction in ear number, which by itself would cause a reduction in yield. To test whether there is a compensatory increase in either the number of kernels per ear or kernel weight, we assayed plants of the BC2S3 family used for fine-mapping to determine if prol1.1 has associated effects on these traits. The prol1.1 maize allele is not associated with an increase in ear size as measured by the total number of spikelets (kernel forming units) produced in the primary ear (maize = 418, heterozygous = 423, teosinte = 421, p = 0.86; Table S2). However, the maize allele is associated with an increase in kernel weight (maize = 0.216 g, heterozygous = 0.208 g, teosinte = 0.187 g, p < 0.0001; Table S2). Other aspects of plant architecture such

Figure 3. Fine-mapping of prol1.1 on chromosome 1S. At the top, there is a map of the prol1.1 chromosomal region with genetic markers and their AGP v2 positions. The upper set of 25 horizontal bars represents the 23 recombinant chromosome lines and the maize and teosinte control lines. White segments indicate maize genotype, black segments teosinte genotype, and gray segments unknown or regions where maize and teosinte are identical. Prolificacy trait values and standard errors for each recombinant and control line are shown by the blue column graphs on the right. The lower set of 25 bars is a close-up view of the region near gt1 to which prol1.1 localized. At the bottom, a fine-scale map showing the location of prol1.1 between SBM07 and SBM08 and its position relative to the gt1 coding sequence. See also Figure S1 and Table S1. doi:10.1371/journal.pgen.1003604.g003

Genetics of Prolificacy during Maize Domestication

PLOS Genetics | www.plosgenetics.org 4 June 2013 | Volume 9 | Issue 6 | e1003604

Page 15: Complex adaptation in Zea

prolificacy mapped to upstream of gt1

Wills et al. 2013 PLoS Genetics

Page 16: Complex adaptation in Zea

gt1 controls lateral bud formation

Wills et al. 2013 PLoS Genetics; Whipple et al. 2011 PNAS

Page 17: Complex adaptation in Zea

gt1 controls lateral bud formation

Wills et al. 2013 PLoS Genetics; Whipple et al. 2011 PNAS

greater efficiency of harvest is achieved by having all seed mature synchronously. Similarly, harvesting a single large inflorescence or fruit from a plant is easier than harvesting dozens of smaller ones [18]. Thus, diverse crops have been selected to produce smaller numbers of larger seeds, fruits or inflorescences as a means of improving harvestability [2]. In the terminology of modern day maize breeders, crops were selected to be less prolific.

Our QTL mapping for prolificacy confirms the results of three prior studies that indicated this trait is controlled by a relatively small number of QTL including one of large effect on the short arm of chromosome 1. First, in an F2 cross of Chalco teosinte (Zea mays ssp. mexicana) with a Mexican maize landrace (Chapalote), one of the four detected QTL was located on the short arm of chromosome 1 and accounted for upwards of 19% of the phenotypic variance in prolificacy [19]. Second, in an F2 cross of Balsas teosinte with a different Mexican maize landrace (Reventador), one of the seven detected QTL was located on the short arm of chromosome 1 and accounted for 25% of the phenotypic variance [20]. Finally, in a maize-teosinte BC1 cross of Balsas teosinte by a US inbred line (W22), seven prolificacy QTL were detected [21]. All seven QTL had small effects, but the one that explained the greatest portion of the variance (4.5% averaged over two environments) was on the short arm of chromosome 1. As in these prior studies, the QTL mapping reported here indicates that prolificacy is under relatively simple genetic control, involving only 8 QTL but including one QTL (prol1.1) of large effect. prol1.1 accounted for 36.7% of the variation in the number of ears and reduces the number of ears from 7.2 for teosinte homozygous class to 2.4 for the maize homozygous class.

The genetic architecture of the change in prolificacy during domestication appears to be relatively simple in several other crops as well. In tomato, five QTL of roughly equal effects for the number of flowers per truss between wild and domesticated tomato were detected [22,23]. In the common bean, three QTL were detected for the reduction in the number of pods per plant in a cross of wild and domesticated bean [24]. The QTL of largest effect confers a reduction from 29 to 17 pods per plant and accounts for 32% of trait variation. In pearl millet, the reduction in

Figure 5. Longitudinal sections of ear-forming primary lateral branches hybridized with antisense gt1 RNA probe. (A) M:M and (B) M:T genotypes, showing gt1 expressed at low levels in the nodes. (C) T:M and (D) T:T genotypes in which there is no viable gt1 expression in the nodes. Weak gt1 expression is seen in the leaves surrounding the branch in all sections. doi:10.1371/journal.pgen.1003604.g005

Whipple et al. 2011 PNAS

Page 18: Complex adaptation in Zea

partial sweep upstream of gt1

[Figure: density plot, density.default(x = teopi - lrpi, from = -0.02, to = 0.02, breaks = 100); x-axis: πteo − πmz, from −0.02 to 0.02]

Wills et al. 2013 PLoS Genetics
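
The x-axis statistic on this slide, πteo − πmz, can be illustrated with a small sketch. It assumes π is nucleotide diversity in the usual sense (average pairwise differences per site) and that the plot compares teosinte with maize for windows across the genome. The aligned sequences below are made up for illustration, and the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site among aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


# Made-up 10-bp alignments for one window (not real data).
teosinte = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC", "ACGTACGTCC"]
maize = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]

pi_teo = nucleotide_diversity(teosinte)
pi_mz = nucleotide_diversity(maize)
diff = pi_teo - pi_mz

print(round(pi_teo, 3), round(pi_mz, 3), round(diff, 3))
```

A window where maize has lost diversity relative to teosinte gives a positive difference. Comparing one region's value against the genome-wide distribution of such differences is one way to read the density plot.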

Page 19: Complex adaptation in Zea

partial sweep upstream of gt1

[Figure: density plot, density.default(x = teopi - lrpi, from = -0.02, to = 0.02, breaks = 100); x-axis: πteo − πmz, from −0.02 to 0.02; labels: MAIZE, TEO, genome-wide, gt1 upstream]

Wills et al. 2013 PLoS Genetics

Page 21: Complex adaptation in Zea

partial sweep upstream of gt1

Wills et al. 2013 PLoS Genetics

Page 22: Complex adaptation in Zea

Wills et al. 2013 PLoS Genetics

convergent evolution at gt1

[Figure, panels A and B; genotype classes: T/T, M/T, M/M]

3’ UTR

5’ control region

Page 23: Complex adaptation in Zea

Wills et al. 2013 PLoS Genetics

convergent evolution at gt1

Multiple Mutations

Standing Variation

Page 24: Complex adaptation in Zea

maize colonization of highlands

Matsuoka et al. 2002; Piperno 2006; Perry et al. 2006; Piperno et al. 2009; van Heerwaarden et al. 2011 PNAS

Mexico highland: 6,000 BP

Mexico lowland: 9,000 BP

Page 25: Complex adaptation in Zea

maize colonization of highlands

Matsuoka et al. 2002; Piperno 2006; Perry et al. 2006; Piperno et al. 2009; van Heerwaarden et al. 2011 PNAS

Mexico highland: 6,000 BP

S. America lowland: 6,000 BP

Mexico lowland: 9,000 BP

Page 26: Complex adaptation in Zea

maize colonization of highlands

Matsuoka et al. 2002; Piperno 2006; Perry et al. 2006; Piperno et al. 2009; van Heerwaarden et al. 2011 PNAS

Mexico highland: 6,000 BP

S. America lowland: 6,000 BP

S. America highland: 4,000 BP

Mexico lowland: 9,000 BP

Page 27: Complex adaptation in Zea

Mexico

Monthon Wachirasettakul

Andes

Matt Hufford

Results

Patterns of Genetic Structure and Differentiation. Principal components analysis (PCA) (17) of the maize SNP data identifies 58 significant principal components (PCs) (explaining 37.6% of total variance), probably reflecting isolation by distance (18) and linkage effects (19). We use the first nine PCs, which present the strongest spatial autocorrelation (Fig. S2) and explain a large portion of the total variance (18.7%), to cluster the accessions into 10 geographically distinct groups (Fig. 1A). Meso-American maize falls into three groups: the Meso-American Lowland group, which includes predominantly lowland accessions from southeast Mexico and the Caribbean; the West Mexico group, representing both lowlands and highlands; and the Mexican Highland group, encompassing most of Matsuoka et al.’s highland Mexican accessions (5) as well as accessions from highland Guatemala. These clusters also confirm the presence of US-derived varieties in South America (20); we excluded these accessions from further analysis.

In the joint PCA analysis of the three subspecies, the first PC (10.8% of variance) separates maize from its wild relatives and confirms the similarity between maize from the Mexican Highland group and parviglumis (Fig. 1B). The second PC (4.8% of variance) mainly separates the genetic groups of maize along a north–south axis, with the Northern United States and Andean Highlands at the extremes. The third PC (2.7% of variance) predominately reflects the difference between parviglumis and mexicana. The Mexican Highland cluster extends toward mexicana along both PC 1 and 3, suggesting that the similarity of highland maize to parviglumis may reflect admixture with mexicana.

Admixture Analysis. Simulation of gene flow of mexicana into the Meso-American Lowland maize group suggests that 13% cumulative historical introgression is sufficient to explain observed differences between lowland and highland maize in terms of heterozygosity and differentiation from parviglumis (Fig. S3). Structure analysis (21) of all Mexican accessions lends support for this magnitude of introgression (Fig. 2). The three subspecies form clearly separated clusters, but evidence of admixture is evident in all three groups, and the two wild relatives show clear signs of bidirectional introgression at altitudes where their ranges overlap (Fig. 2). Highland maize shows strong signs of mexicana introgression, with 20% admixture observed in the Mexican Highland cluster, but below 1,500 m mexicana introgression drops to less than 1%. Introgression from parviglumis into maize is much lower overall, reaching its highest average value (3%) in the lowland West Mexico group.

Drift Analysis. Because introgression from mexicana may affect ancestry inference based on genetic distance from parviglumis, we took an approach that does not require reference to the wild relatives. Under models of historical range expansion, genetic differentiation increases away from the population of origin (22, 23), and estimates of drift from ancestral frequencies have been applied successfully to identify ancestral populations (24). We therefore applied the method of Nicholson et al. (25) to estimate simultaneously ancestral frequencies and F, a measure of genetic drift away from these frequencies, for sets of predefined populations.

To illustrate the potential impact of mexicana introgression, we first performed a standard analysis that includes each maize population in turn in conjunction with the two wild relatives. Average drift away from the inferred common ancestor of maize, parviglumis, and mexicana is higher for maize (F = 0.24) than for mexicana (F = 0.15) or parviglumis (F = 0.07), probably due to changes in allele frequency following the domestication bottleneck. Because the inferred ancestral frequencies are closer to those of the wild relatives than to present-day maize, comparison with this ancestor is sensitive to introgression from these subspecies. It therefore is not surprising that estimates of F between individual maize populations and the common ancestor of all three taxa identify the Mexican Highland group as being most similar (Fig. 3A). This pattern is maintained in an analysis excluding mexicana, in which Mexican Highland maize is tied with the West Mexico group as the most ancestral population (Fig. 3B).

To mitigate the impact of introgression, we used a slightly modified approach that excludes both parviglumis and mexicana and calculates genetic drift with respect to ancestral frequencies inferred from domesticated maize alone. Because the genetic

Fig. 1. (A) Map of sampled maize accessions colored by genetic group. (B) First three genetic PCs of all sampled accessions.

van Heerwaarden et al. PNAS | January 18, 2011 | vol. 108 | no. 3 | 1089

van Heerwaarden et al. 2011 PNAS

Page 29: Complex adaptation in Zea

• 96 samples from four highland/lowland populations

• 100K SNPs: GBS, Maize SNP50

independent genetic origins

Takuno et al. 2014 10.5281/zenodo.11692

Page 30: Complex adaptation in Zea

demography explains most differentiation

[Figure 2 diagram labels, as recoverable from the slide]

Populations: Mexico Lowland, Mexico Highland, mexicana, SA Lowland, SA Highland

Model IA: sizes NA, NB, NC, N1, N2, N2P; times tD, tE, tF; NC = αNA, N1 = βNC, N2 = (1 − β)NC, N2P = γN2

Model IB: sizes NA, NB, NC, N1, N2, N2P, Nmex; times tD, tE, tF, tmex; NC = αNA, N1 = βNC, N2 = (1 − β)NC, N2P = γN2

Model II: sizes NA, NB, NC, N1, N2, N3, N4, N4P; times tD, tE, tF, tG; NC = αNA, N1 = β1NC, N2 = (1 − β1)NC, N3 = β2N2, N4 = (1 − β2)N2, N4P = γN4

Figure 2 Demographic models of maize low- and highland populations. Parameters in bold were estimated in this study. See text for details.

A HWE cut-off of P < 0.005 was used for each subpopulation due to our under-calling of heterozygotes. In total, we included 18,745 silent SNPs for the Mexican populations in Models IA and IB, 14,508 for the S. American populations in Model I and 11,305 for the Mexican lowland population and the S. American populations in Model II. We obtained similar results under more or less stringent thresholds for significance (P < 0.05 ∼ 0.0005; data not shown), though the number of SNPs was very small at P < 0.005. Demographic parameters were inferred with the software ∂a∂i (Gutenkunst et al. 2009), which uses a diffusion method to calculate an expected JFD and evaluates the likelihood of the data using a multinomial assumption.

Model IA: This model is applied to the Mexican and S. American populations. We assume the ancestral diploid population representing parviglumis follows a standard Wright-Fisher model with constant size. The size of the ancestral population is denoted by NA. At tD generations ago, the bottleneck event begins at domestication, and at tE generations ago, the bottleneck ends. The population size and duration of the bottleneck are denoted by NB and tB = tD − tE, respectively. The population size recovers to NC = αNA in the lowlands. Then, the highland population is differentiated from the lowland population at tF generations ago. The size of the low- and highland populations at time tF is determined by a parameter β such that the population is divided by βNC and (1 − β)NC. We assume that the population size in the lowlands is constant but that the highland population experiences exponential expansion after divergence: its current population size is γ times larger than that at tF.

[Comment: isn’t this really a shrinking population in the lowlands, since βNC < NC? wouldn’t we want instead for lowlands to stay at NC and a new population branching off? how much do we worry about this?]

[Comment: actually, our conclusion holds when I assumed the pop size of lowlands stays at NC. However, the likelihood is a bit better in my original model.]

Model IB: We expand Model IA for the Mexican populations by incorporating admixture from the teosinte mexicana to the highland Mexican maize population.

[Comment: do we say “Mexico population” or “Mexican” (and thus “South American”) “population” throughout? as long as we’re consistent probably OK either way.]

[Comment: vote to Mexican population second]

The time of differentiation between parviglumis and mexicana occurs at tmex generations ago. The mexicana population size is assumed to be constant at Nmex. At tF generations ago, the Mexican highland population is derived from admixture between the Mexican lowland population and a portion Pmex from the teosinte mexicana.

Model II: The final model is for the Mexican lowland, S. American lowland and highland populations. This model was used for simulating SNPs with ascertainment bias (see below). At time tF, the Mexican and S. American lowland populations are differentiated, and the sizes of populations after splitting are determined by β1. At time tG, S. American lowland and highland populations are differentiated, and the sizes of populations at this time are determined by β2. As in Model IA, the S. American highland population is assumed to experience population growth with the parameter γ.
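
The size relations in Model IA can be written out as a short sketch. It assumes the Greek symbols garbled in this transcript are α, β and γ, that N1 is the lowland population and N2 the highland population (as the Figure 2 labels appear to indicate), and that N2P is the present-day highland size. The function name is hypothetical. The inputs are NA = 150,000 from the text and the first Mexico column of Table 2 (0.92, 0.38, 1).

```python
def model_ia_sizes(n_a, alpha, beta, gamma):
    """Population sizes implied by Model IA, given NA and the three fitted ratios."""
    n_c = alpha * n_a        # size after recovery from the bottleneck
    n_1 = beta * n_c         # lowland size after the split at tF (assumed labeling)
    n_2 = (1 - beta) * n_c   # highland size at the split
    n_2p = gamma * n_2       # present-day highland size after exponential growth
    return {"NC": n_c, "N1": n_1, "N2": n_2, "N2P": n_2p}


# Values as read from the text and Table 2; the symbol assignment is an assumption.
sizes = model_ia_sizes(n_a=150_000, alpha=0.92, beta=0.38, gamma=1.0)
for name, value in sizes.items():
    print(name, round(value))
```

With β < 1 the lowland population drops from NC to βNC at the split, which is the point raised in the draft comment above.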

Estimates of a number of our model parameters were avail-able from previous work. NA was set to 150,000 using esti-mates of the composite parameter 4NAµ ⇠ 0.018 from parvig-lumis (Eyre-Walker et al. 1998; Tenaillon et al. 2001, 2004;Wright et al. 2005; Ross-Ibarra et al. 2009) and an estimateof the mutation rate µ ⇠ 3 ⇥ 10

�8 (Clark et al. 2005) persite per generation. The severity of the domestication bottle-neck is represented by k = NB/tB (Eyre-Walker et al. 1998;Wright et al. 2005), and following Wright et al. (2005) we as-sumed k = 2.45 and tB = 1, 000 generations. Taking intoaccount archaeological evidence (Piperno et al. 2009), we as-sume tD = 9, 000 and tE = 8, 000. We further assumedtF = 6, 000 for Mexican populations in Models IA and IB(Piperno 2006), tF = 4, 000 for S. American populationsin Model lA (Perry et al. 2006; Grobman et al. 2012), andtmex = 60, 000, Nmex = 160, 000 (Ross-Ibarra et al. 2009),and Pmex = 0.2 (van Heerwaarden et al. 2011) for ModelIB. For both Models IA and IB, we inferred three parameters(↵, � and �), and, for Model II, we fixed tF = 6, 000 andtG = 4, 000 (Piperno 2006; Perry et al. 2006; Grobman et al.2012) and estimated the remaining four parameters (↵, �1, �2

and �).tF for model II is listed as 4,000 and 6,000 above. 6,000 is the number that matches

the lit best. is that what was used? if so, we should cite (Grobman et al. 2012) fixed
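The fixed parameters above follow from simple arithmetic on the quoted values. The sketch below recomputes them; the function and constant names are illustrative and do not come from the original analysis, and the bottleneck size N_B is only implied by k and t_B rather than stated in the text.

```python
# Fixed demographic parameters, recomputed from the values quoted in the text.
THETA = 0.018   # composite parameter 4 * N_A * mu estimated from parviglumis
MU = 3e-8       # mutation rate per site per generation
K = 2.45        # bottleneck severity, k = N_B / t_B (Wright et al. 2005)
T_D = 9_000     # generations ago at which the bottleneck begins
T_E = 8_000     # generations ago at which the bottleneck ends


def ancestral_size(theta: float, mu: float) -> float:
    """Solve theta = 4 * N_A * mu for N_A."""
    return theta / (4.0 * mu)


def bottleneck_size(k: float, t_b: float) -> float:
    """Solve k = N_B / t_B for N_B (implied by the text, not quoted in it)."""
    return k * t_b


N_A = ancestral_size(THETA, MU)
T_B = T_D - T_E
N_B = bottleneck_size(K, T_B)

print(f"N_A = {N_A:,.0f}")
print(f"t_B = {T_B} generations")
print(f"N_B = {N_B:,.0f}")
```

The recovered N_A matches the 150,000 used in the text, and t_D − t_E reproduces the assumed bottleneck duration of 1,000 generations.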

Differentiation between low- and highland populations

We used our inferred demographic model to generate a null distribution of F_ST. As implemented in ∂a∂i (Gutenkunst

Takuno et al. 2014 10.5281/zenodo.11692

Table 2 Inference of demographic parameters

Mexico
  Model I                   Model II
  Likelihood  −5592.80      Likelihood  −4654.79
  α           0.92          α           1.5
  β           0.38          β           0.76
  γ           1             γ           1

South America
  Model I                   Model III
  Likelihood  −3855.28      Likelihood  −8044.71
  α           0.52          α           1.0
  β           0.97          β_1         0.64
  γ           88            β_2         0.95
                            γ           54

Population structure

We performed a STRUCTURE analysis (Pritchard et al. 2000; Falush et al. 2003) of our landrace sample, varying the number of groups from K = 2 to 6 (Figure 1, Figure S3). Most landraces were assigned to groups consistent with a priori population definitions, but admixture between highland and lowland populations was evident at intermediate elevations (∼1700 m). Consistent with previously described scenarios for maize diffusion (Piperno 2006), we find evidence of shared ancestry between lowland Mexican maize and both Mexican highland and S. American lowland populations. Pairwise F_ST among populations reveals low overall differentiation (Table 1), and the higher F_ST values observed in S. America are consistent with decreased admixture seen in STRUCTURE. Archaeological evidence supports a more recent colonization of the highlands in S. America (Piperno 2006; Perry et al. 2006; Grobman et al. 2012), suggesting that the observed differentiation may be the result of a stronger bottleneck during colonization of the S. American highlands.

Population differentiation under inferred demography

To provide a null expectation for allele frequency differentiation, we used the joint site frequency distribution (JFD) of lowland and highland populations to estimate parameters of two demographic models using the maximum likelihood method implemented in ∂a∂i (Gutenkunst et al. 2009). All models incorporate a domestication bottleneck (Wright et al. 2005) and population differentiation between lowland and highland populations, but differ in their consideration of admixture and ascertainment bias (Figure 2; see Materials and Methods for details).

Estimated parameter values are listed in Table 2; while the observed and expected JFDs were quite similar for both models, residuals indicated an excess of rare variants in the observed JFDs in all cases (Figure 3). Under both models IA and IB, we found expansion in the highland population in Mexico to be unlikely, but a strong bottleneck followed by population expansion is supported in S. American maize in both models IA and II. The likelihood value of model IB was higher than the likelihood of model IA by 850 units of log-likelihood (Table 2), consistent with analyses suggesting that introgression from mexicana played a significant role during the spread of maize into the Mexican highlands (Hufford et al. 2013).

Figure 3 Observed and expected joint distributions of minor allele frequencies in low- and highland populations in (A) Mexico and (B) S. America. Residuals are calculated as (model − data)/√model.
[Panels: (A) Mexico, Model IA and Model IB; (B) South America, Model IA and Model II. Columns: Observation, Expectation, Residual. Axes: Lowlands, Highlands. Density scale 0 to 10^−1; residual scale −40 to 40.]
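The residual definition given in the Figure 3 caption, (model − data)/√model, can be sketched as follows. This is a minimal illustration on a hypothetical 2 × 2 grid of counts, not the frequency spectra analysed in the study; the function names are assumptions made for the example. Under this sign convention a negative residual means the observed count exceeds the model expectation.

```python
import math


def poisson_residual(model: float, data: float) -> float:
    """Residual for one JFD cell: (model - data) / sqrt(model)."""
    if model <= 0.0:
        # The residual is undefined where the model expects no SNPs.
        return math.nan
    return (model - data) / math.sqrt(model)


def residual_grid(model, data):
    """Apply the residual cell by cell to two equally shaped 2D lists."""
    return [
        [poisson_residual(m, d) for m, d in zip(model_row, data_row)]
        for model_row, data_row in zip(model, data)
    ]


# Hypothetical expected and observed SNP counts (rows: lowlands, columns: highlands).
expected = [[4.0, 9.0], [16.0, 25.0]]
observed = [[6.0, 9.0], [12.0, 30.0]]

residuals = residual_grid(expected, observed)
for row in residuals:
    print(row)
```

Dividing by √model puts cells with very different expected counts on a comparable scale, which is why the residual panels can share one colour range.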

In addition to the parameters listed in Figure 2, we investigated the impact of varying the domestication bottleneck size (N_B). Surprisingly, N_B was estimated to be equal to N_C, the population size at the end of the bottleneck, and the likelihood of N_B < N_C was much smaller than for alternative parameterizations (Table 2, S2). This result appears to contradict ear-


Figure 2 Demographic models of maize low- and highland populations. Parameters in bold were estimated in this study. See text for details.
[Panels: Model IA, Model IB, Model II.
Model IA (Mexico Lowland, Mexico Highland): sizes N_A, N_B, N_C, N_1, N_2, N_2P; times t_D, t_E, t_F; N_C = αN_A, N_1 = βN_C, N_2 = (1 − β)N_C, N_2P = γN_2.
Model IB (Lowland, Highland, mexicana): as Model IA, with the additional parameters t_mex and N_mex.
Model II (Mexico Lowland, SA Lowland, SA Highland): sizes N_A, N_B, N_C, N_1, N_2, N_3, N_4, N_4P; times t_D, t_E, t_F, t_G; N_C = αN_A, N_1 = β_1N_C, N_2 = (1 − β_1)N_C, N_3 = β_2N_2, N_4 = (1 − β_2)N_2, N_4P = γN_4.]
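The size relations in Figure 2 for Model IA can be turned into absolute population sizes once α, β and γ are known. The sketch below does this for the two "Model I" columns of Table 2, on the assumption that those columns correspond to Model IA; the class and function names are illustrative, and the growth rate is derived here from the exponential-expansion assumption rather than reported in the source.

```python
import math
from dataclasses import dataclass

N_A = 150_000  # ancestral population size fixed in the text


@dataclass
class TwoPopSizes:
    n_c: float          # size after recovery from the bottleneck
    n_1: float          # lowland size after the split
    n_2: float          # highland size at the split
    n_2p: float         # present-day highland size
    growth_rate: float  # per-generation exponential rate in the highlands


def model_ia_sizes(alpha: float, beta: float, gamma: float,
                   t_f: float, n_a: float = N_A) -> TwoPopSizes:
    """Apply N_C = alpha*N_A, N_1 = beta*N_C, N_2 = (1-beta)*N_C, N_2P = gamma*N_2."""
    n_c = alpha * n_a
    n_1 = beta * n_c
    n_2 = (1.0 - beta) * n_c
    n_2p = gamma * n_2
    # Exponential growth over t_F generations: N_2P = N_2 * exp(r * t_F).
    rate = math.log(gamma) / t_f
    return TwoPopSizes(n_c, n_1, n_2, n_2p, rate)


# Parameter estimates copied from Table 2; t_F values are those fixed in the text.
mexico = model_ia_sizes(alpha=0.92, beta=0.38, gamma=1.0, t_f=6_000)
south_america = model_ia_sizes(alpha=0.52, beta=0.97, gamma=88.0, t_f=4_000)

for name, sizes in (("Mexico", mexico), ("South America", south_america)):
    print(name, sizes)
```

With these inputs the S. American highland population starts from a small fraction of N_C and grows 88-fold, which is the strong bottleneck followed by expansion described in the results, while γ = 1 in Mexico implies no highland growth.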


[Figure: observed and expected joint frequency distributions for Mexico; axes: lowlands, highlands; colour scale: density]

Page 31: Complex adaptation in Zea

Yi et al. 2010 Science

little evidence for convergent sweeps

of altitude adaptation. The strongest such signals include several genes with known roles in oxygen transport and regulation (Table 1 and table S3). Overall, the 34 genes in our data set that fell under the gene ontology category "response to hypoxia" had significantly greater PBS values than the genome-wide average (P = 0.00796).

The strongest signal of selection came from the endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) gene. On the basis of frequency differences among the Danes, Han, and Tibetans, EPAS1 was inferred to have a very long Tibetan branch relative to other genes in the genome (Fig. 2). In order to confirm the action of natural selection, PBS values were compared against neutral simulations under our estimated demographic model. None of one million simulations surpassed the PBS value observed for EPAS1, and this result remained statistically significant after accounting for the number of genes tested (P < 0.02 after Bonferroni correction). Many other genes had uncorrected P values below 0.005 (Table 1), and, although none of these were statistically significant after correcting for multiple tests, the functional enrichment suggests that some of these genes may also contribute to altitude adaptation.
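The population branch statistic (PBS) referred to in this excerpt combines three pairwise F_ST values into the length of the branch leading to a focal population. The sketch below uses the standard definition, T = −log(1 − F_ST) and PBS = (T_focal,A + T_focal,B − T_A,B)/2, with Tibetans as the focal population and Han and Danes as the two comparison populations. The function names and the F_ST inputs are hypothetical and are not values from the study.

```python
import math


def branch_length(fst: float) -> float:
    """Log-transformed divergence, T = -log(1 - F_ST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("F_ST must lie in [0, 1)")
    return -math.log(1.0 - fst)


def pbs(fst_focal_a: float, fst_focal_b: float, fst_a_b: float) -> float:
    """Population branch statistic for the focal population.

    fst_focal_a: F_ST between the focal population and population A
    fst_focal_b: F_ST between the focal population and population B
    fst_a_b:     F_ST between populations A and B
    """
    t_focal_a = branch_length(fst_focal_a)
    t_focal_b = branch_length(fst_focal_b)
    t_a_b = branch_length(fst_a_b)
    return (t_focal_a + t_focal_b - t_a_b) / 2.0


# Hypothetical F_ST values for one gene: Tibetan-Han, Tibetan-Dane, Han-Dane.
example = pbs(fst_focal_a=0.30, fst_focal_b=0.35, fst_a_b=0.10)
print(f"PBS = {example:.3f}")
```

A gene diverged in the focal population but not between the two comparison populations gets a long focal branch, which is the pattern reported for EPAS1.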

EPAS1 is also known as hypoxia-inducible factor 2α (HIF-2α). The HIF family of transcription factors consist of two subunits, with three

Fig. 1. Two-dimensional unfolded site frequency spectrum for SNPs in Tibetan (x axis) and Han (y axis) population samples. The number of SNPs detected is color-coded according to the logarithmic scale plotted on the right. Arrows indicate a pair of intronic SNPs from the EPAS1 gene that show strongly elevated derived allele frequencies in the Tibetan sample compared with the Han sample.

Table 1. Genes with strongest frequency changes in the Tibetan population. The top 30 PBS values for the Tibetan branch are listed. Oxygen-related candidate genes within 100 kb of these loci are noted. For FXYD, F indicates Phe; Y, Tyr; D, Asp; and X, any amino acid.

Gene | Description | Nearby candidate | PBS | P value
EPAS1 | Endothelial PAS domain protein 1 (HIF-2α) | (Self) | 0.514 | <0.000001
C1orf124 | Hypothetical protein LOC83932 | EGLN1 | 0.277 | 0.000203
DISC1 | Disrupted in schizophrenia 1 | EGLN1 | 0.251 | 0.000219
ATP6V1E2 | Adenosine triphosphatase (ATPase), H+ transporting, lysosomal 31 kD, V1 | EPAS1 | 0.246 | 0.000705
SPP1 | Secreted phosphoprotein 1 | | 0.238 | 0.000562
PKLR | Pyruvate kinase, liver, and RBC | (Self) | 0.230 | 0.000896
C4orf7 | Chromosome 4 open reading frame 7 | | 0.227 | 0.001098
PSME2 | Proteasome activator subunit 2 | | 0.222 | 0.001103
OR10X1 | Olfactory receptor, family 10, subfamily X | SPTA1 | 0.218 | 0.000950
FAM9C | Family with sequence similarity 9, member C | TMSB4X | 0.216 | 0.001389
LRRC3B | Leucine-rich repeat-containing 3B | | 0.215 | 0.001405
KRTAP21-2 | Keratin-associated protein 21-2 | | 0.213 | 0.001470
HIST1H2BE | Histone cluster 1, H2be | HFE | 0.212 | 0.001568
TTLL3 | Tubulin tyrosine ligase-like family, member 3 | | 0.206 | 0.001146
HIST1H4B | Histone cluster 1, H4b | HFE | 0.204 | 0.001404
ACVR1B | Activin A type IB receptor isoform a precursor | ACVRL1 | 0.198 | 0.002041
FXYD6 | FXYD domain-containing ion transport regulator | | 0.192 | 0.002459
NAGLU | Alpha-N-acetylglucosaminidase precursor | | 0.186 | 0.002834
MDH1B | Malate dehydrogenase 1B, nicotinamide adenine dinucleotide (NAD) (soluble) | | 0.184 | 0.002113
OR6Y1 | Olfactory receptor, family 6, subfamily Y | SPTA1 | 0.183 | 0.002835
HBB | Beta globin | (Self), HBG2 | 0.182 | 0.003128
OTX1 | Orthodenticle homeobox 1 | | 0.181 | 0.003235
MBNL1 | Muscleblind-like 1 | | 0.179 | 0.002410
IFI27L1 | Interferon, alpha-inducible protein 27-like 1 | | 0.179 | 0.003064
C18orf55 | Hypothetical protein LOC29090 | | 0.178 | 0.002271
RFX3 | Regulatory factor X3 | | 0.176 | 0.002632
HBG2 | G-gamma globin | (Self), HBB | 0.170 | 0.004147
FANCA | Fanconi anemia, complementation group A | (Self) | 0.169 | 0.000995
HIST1H3C | Histone cluster 1, H3c | HFE | 0.168 | 0.004287
TMEM206 | Transmembrane protein 206 | | 0.166 | 0.004537

2 JULY 2010 VOL 329 SCIENCE www.sciencemag.org76


Maize

Han Chinese

Tibetan

Takuno et al. 2014 10.5281/zenodo.11692

Figure 4 Observed and expected distributions of F_ST values in GBS (A) and MaizeSNP50 data (B). The x-axes represent F_ST values. The y-axes represent the frequency of SNPs with F_ST values within a bin of 0.05 size. Red dots and solid lines indicate observed and expected distributions, respectively.
[Panels: (A) GBS data and (B) MaizeSNP50 data, each showing Mexico (Model IA, Model IB) and South America (Model IA, Model II). x-axis: F_ST from 0 to 1; y-axis: Frequency from 0.0001 to 1.]

lier work using sequences from coding regions to infer a maize domestication bottleneck (Eyre-Walker et al. 1998; Tenaillon et al. 2004; Wright et al. 2005). Consistent with Hufford et al. (2012b), our genome-wide SNP data show an excess of rare variants relative to expectations under the bottleneck model of Wright et al. (2005) (Figure 3), suggesting a domestication model involving a weaker bottleneck or more rapid population growth.

Figure 5 Scatter plot of −log10 P-values of observed F_ST values based on simulation from estimated demographic models. P-values are shown for each SNP in both Mexico (Model IB; P_M on x-axis) and South America (Model II; P_S on y-axis). Red, blue, orange and gray dots represent SNPs showing significance in both Mexico and South America, only in Mexico, only in South America, and in neither, respectively (see text for details). The number of SNPs in each category is shown in the same color as the points: 19 SNPs (red), 668 SNPs (blue), 390 SNPs (orange) and 90,702 SNPs (gray).

Comparisons of our empirical F_ST values to the null expectation simulated under our demographic models allowed us to identify significantly differentiated SNPs between low- and highland populations. In all cases, observed F_ST values were quite similar to those generated under our null models (Figure 4), and model choice, including the parameterization of the domestication bottleneck, had little impact on the distribution of estimated P-values (Figure S4). Thus, hereafter, we show the results under Model IB for Mexican populations and Model II for S. American populations. We chose P < 0.01 as an arbitrary cut-off for significant differentiation between low- and highland populations, and identified 687 SNPs in Mexico (687/76,989 = 0.89%) and 409 SNPs in South America (409/63,160 = 0.65%) as outliers (Figure 5).
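The outlier scan described above compares each observed F_ST with a null distribution simulated under the fitted demographic model. The sketch below illustrates that logic with a simple heterozygosity-based F_ST for one biallelic SNP and an empirical P-value. The estimator, the function names and the short list of null values are assumptions made for illustration; the source does not specify which F_ST estimator was used.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """F_ST = (H_T - H_S) / H_T for one biallelic SNP in two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0.0:
        return 0.0  # monomorphic site: no differentiation to measure
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population value
    return (h_t - h_s) / h_t


def empirical_p(observed: float, null_values) -> float:
    """Fraction of simulated F_ST values at least as large as the observed one."""
    exceed = sum(1 for value in null_values if value >= observed)
    return exceed / len(null_values)


def outlier_percent(n_outliers: int, n_snps: int) -> float:
    """Share of SNPs called as outliers, in percent."""
    return 100.0 * n_outliers / n_snps


# Hypothetical lowland and highland allele frequencies and a toy null distribution.
observed_fst = fst_two_pops(0.1, 0.9)
null_fst = [0.01, 0.02, 0.05, 0.10, 0.70]
print(f"F_ST = {observed_fst:.2f}, P = {empirical_p(observed_fst, null_fst):.2f}")

# Outlier fractions quoted in the text.
print(f"Mexico: {outlier_percent(687, 76_989):.2f}%")
print(f"South America: {outlier_percent(409, 63_160):.2f}%")
```

In a real scan the null list would hold many simulated values per SNP, so that a P < 0.01 cut-off can be resolved.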

Patterns of adaptation

Highland versus lowland adaptation:

Given the historical spread of maize from an origin in the lowlands, it is tempting to assume that significant population differentiation should be primarily due to an increase in frequency of adaptive alleles in the highlands. To test this hypothesis, we sought to identify the adaptive allele at each locus using comparisons between Mexico and S. America as well as to parviglumis (see Supplementary Text for details). Consistent with predictions, we infer that differentiation at 72.3% (264) and 76.7% (230) of SNPs in Mexico and S. America, respectively, is due to adaptation in the highlands after excluding the SNPs with ambiguous patterns (probably due to recombination). The majority of these SNPs show patterns of haplotype variation (by PHS test) consistent with our inference (Supplementary Text

−log(p) Mexico

−log(p) S. America


Page 32: Complex adaptation in Zea

• Build on models of parallel adaptation

• Model new mutation vs. gene flow

theory predicts little convergence

Peter Ralph (USC)

[Figure: probability of survival (0.000 to 0.006) against distance in km (−1000 to 1000); legend: truth, 2*s/var, cline location]

ACTGCTG

ACTCCTG

Takuno et al. 2014 10.5281/zenodo.11692

Page 33: Complex adaptation in Zea

• Build on models of parallel adaptation

• Model new mutation vs. gene flow

theory predicts little convergence

Peter Ralph (USC)

[Figure: probability of survival (0.000 to 0.006) against distance in km (−1000 to 1000); legend: truth, 2*s/var, cline location]

ACTGCTG

ACTCCTG

ACTGCTG

T_mut = 1/λ_mut = ξ²/(2μρA s_b) ≈ 10⁴ gens

Takuno et al. 2014 10.5281/zenodo.11692

Page 34: Complex adaptation in Zea

• Build on models of parallel adaptation

• Model new mutation vs. gene flow

theory predicts little convergence

Peter Ralph (USC)

[Figure: probability of survival (0.000 to 0.006) against distance in km (−1000 to 1000); legend: truth, 2*s/var, cline location]

ACTGCTG

ACTCCTG

ACTGCTG

T_mut = 1/λ_mut = ξ²/(2μρA s_b) ≈ 10⁴ gens

T_mig = (2/N) exp(R√(2 s_m)/σ) ≈ 5 × 10³⁴ gens

Takuno et al. 2014 10.5281/zenodo.11692

Page 35: Complex adaptation in Zea

Takuno et al. 2014 10.5281/zenodo.11692

theory predicts little convergence

Page 36: Complex adaptation in Zea

polygenic adaptation

standing variation

Takuno et al. 2014 10.5281/zenodo.11692

theory predicts little convergence

Page 37: Complex adaptation in Zea

no change in frequency for growth SNPs

[Figure: Biomass (Hot-Cold). Allele frequency difference (highland − lowland, y-axis from −1.0 to 1.0) plotted against lowland allele frequency categories (0.05 to 0.95), comparing GWAS and random SNPs. Panels: S. America (indHei−SA_10−3) and Mexico (indHei−Mex_10−3).]

Sofiane Mezmouk, unpublished

Page 38: Complex adaptation in Zea

in progress: mapping pops

M. Hufford (ISU), R. Sawers (Langebio) Summer 2013

S. Flint-Garcia (MU) Winter 2012

MX x MX F2

SA x SA F2

Highland Landrace (PT) x B73 BC2 NILs

Highland x Lowland Landrace F2 populations

Page 39: Complex adaptation in Zea

the genome’s a mess

[Figure: sequence comparison plots. Axes: query position (0 to 4e+05) vs. subject position (165,700,000 to 166,100,000), colored by chromosome (1 to 8); Mo17 contig position vs. B73 genome position. Panel title: A2. Divergence, repeats, or PAVs?]

Brunner et al. 2005 Plant Cell

Page 40: Complex adaptation in Zea

…and that mess is important

Wallace et al. unpublished

all SNPs GWAS hits

Page 41: Complex adaptation in Zea

…and that mess is important

Wallace et al. unpublished

all SNPs GWAS hits

outliers (P value cutoff 0.01) showed significant overlap (P value < 0.001) with PC1 BAYENV, FFT, and FST outliers, due primarily to haplotype structure created by the putative inversions on chromosomes 4 (position 168,832,447–182,596,678) and 9 (positions 0–29,403,252 and 118,862,807–143,163,957). Significant overlap was also observed between FFT outliers and PHS outliers identified in tests of individual populations (11 out of 21 populations had P values < 0.01 based on permutations). These combined results indicate that SNPs identified as differentiated in a specific population often reside in a region with unexpectedly long shared haplotypes.

Characterization of Candidate Loci

We assessed additional evidence for local adaptation at candidate loci by further dissecting signals in putative inversions and gauging enrichment of candidates in described functions. All four putative inversions contained an excess of differentiated SNPs or SNPs associated with environmental variables (fig. 6 and table 2). To partially control for the nonindependence of loci within inversions, each SNP was given a maximum rank across all environmental association candidate lists (see Materials and Methods for details). Inversions were 2-fold enriched for candidate SNPs (P value < 0.001), containing 5.6% of all SNPs but 11% of SNPs in the 99th percentile of the maximum rank distribution of these lists.
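The 2-fold figure quoted above follows directly from the two percentages in the text. A minimal sketch of that arithmetic, where the function name `fold_enrichment` is a hypothetical helper and not something taken from the paper:

```python
def fold_enrichment(frac_candidates_in_region: float, frac_all_in_region: float) -> float:
    """Ratio of a region's share of candidate SNPs to its share of all SNPs."""
    if frac_all_in_region <= 0:
        raise ValueError("background fraction must be positive")
    return frac_candidates_in_region / frac_all_in_region


# Values quoted in the text: inversions hold 5.6% of all SNPs but 11% of the
# SNPs in the 99th percentile of the maximum-rank distribution.
enrichment = fold_enrichment(0.11, 0.056)
print(f"fold enrichment = {enrichment:.2f}")
```

The ratio lands just under 2, which is consistent with the "2-fold enriched" wording; the reported P value comes from the authors' own analysis and is not reproduced here.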

To further control for nonindependence of SNPs within putative inversions, we conducted an additional BAYENV analysis with inversions treated as single biallelic loci. We assigned each individual a genotype based on neighbor-joining trees of SNPs within the inversion (supplementary fig. S5, Supplementary Material online). All four putative inversions were identified in environmental association analysis when treated as biallelic loci. Three of these inversions (Inv1n, Inv4m, and Inv9d) showed altitudinal clines (supplementary fig. S8, Supplementary Material online), though the signal at Inv1n was less clear than originally reported in Fang et al. (2012) as the derived (inverted) arrangement was absent from the low elevation Crucero Lagunitas population of parviglumis.

We tested candidate SNPs for enrichment in functional categories (genic vs. nongenic, synonymous vs. nonsynonymous) as well as for association with maize phenotypic traits (Flint-Garcia et al. 2005; Hung et al. 2012), putative adaptive introgressions from mexicana into maize (Hufford et al. 2013), and !5 Mb around centromeres (Wolfgruber et al. 2009). PHS outliers were the only set of candidates enriched for genic SNPs (P value < 0.001 for outliers based on the total sample, with 11 of 21 populations having P < 0.01). In contrast, FST outliers were strongly enriched for nongenic SNPs (P values < 0.001, supplementary table S7, Supplementary Material online), and both FST and FCT outliers were farther from genes than other SNPs (P values < 0.001, two-sample Kolmogorov–Smirnov test). FST outliers were also enriched for SNPs associated with the architecture of the male inflorescence in a diverse maize panel (Flint-Garcia et al. 2005;

[Figure 6: y-axis, fold enrichment]

FIG. 6. Fold enrichment of observed ratios of genic/nongenic SNPs among candidates. Fold of enrichment for each set of candidates (red line) from BAYENV, FST, and FCT, SNPs significant in any of the population-based FFT and PHS analysis, and phenotypic association analysis (GWAS). Expected distribution of fold of enrichment ratios (gray dots) was obtained by random sampling of the same amount of SNPs as in the candidate set 1000 times and calculating the ratio and fold enrichment. SNPs within inversions were excluded.
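The resampling described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the toy SNP background, and the empirical P value step are all hypothetical additions; only the idea of drawing 1000 random SNP sets of the same size as the candidate set comes from the caption.

```python
import random


def fold_enrichment_null(is_genic, n_candidates, n_draws=1000, seed=0):
    """Null distribution of genic/nongenic fold enrichment from random SNP sets."""
    rng = random.Random(seed)
    n_genic = sum(is_genic)
    background_ratio = n_genic / (len(is_genic) - n_genic)
    null = []
    for _ in range(n_draws):
        # Draw as many SNPs as the candidate set, without replacement.
        genic = sum(rng.sample(is_genic, n_candidates))
        nongenic = n_candidates - genic
        if nongenic == 0:
            continue  # ratio undefined for this draw
        null.append((genic / nongenic) / background_ratio)
    return null


def empirical_p(observed, null):
    """One-sided empirical P value with the usual +1 correction (an assumption)."""
    exceed = sum(value >= observed for value in null)
    return (exceed + 1) / (len(null) + 1)


# Toy background (hypothetical): 10,000 SNPs, 30% of them genic.
toy_snps = [True] * 3000 + [False] * 7000
null = fold_enrichment_null(toy_snps, n_candidates=200)
p_value = empirical_p(3.0, null)  # 3.0 is a made-up observed fold enrichment
```

In the figure, the observed value for each candidate set (red line) is compared against this kind of resampled distribution (gray dots).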

Complex Patterns of Local Adaptation in Teosinte GBE

Genome Biol. Evol. 5(9):1594–1609. doi:10.1093/gbe/evt109 Advance Access publication July 30, 2013 1603


Pyhäjärvi et al. 2013 GBE

fold nongenic enrichment

Page 42: Complex adaptation in Zea

transposable elements: 85% of maize

McClintock 1984 Science; Baucom et al. 2009 PLoS Genetics

Damon Lisch

Page 43: Complex adaptation in Zea

TEs impact morphology, flowering time

Studer et al. 2011 Nature Genetics; Ducrocq et al. 2008 Genetics

Yang et al. 2013 PNAS

tb1

ZmCCT

alleles at these loci. Two studies conducted by Blanc et al. (2006) and Austin and Lee (1996) involved parental lines that were identical at both mite and CGindel587, which is consistent with no QTL of major effect being detected at Vgt1. Finally, no QTL has been reported in bin 8.05 in three distinct mapping populations involving F2 and MBS847 (Mechin et al. 2001; Poupard et al. 2001; Bouchez et al. 2002). These inbred lines differ at mite but share the same allele at CGindel587.

Relationship between allele frequencies and geographical origin: Data obtained in the inbred line panel suggested that Vgt1 could have been subjected to differential selection, with mite and CGindel587 allele frequencies varying from 0.3 in the late tropical group to 0.87 in the European and Northern Flint groups and with an intermediate frequency of 0.45 in Stiff Stalk and Corn Belt Dent groups. For ease of genotyping and considering its high LD with CGindel587 when a broad genetic diversity is addressed, we used mite as a proxy for CGindel587 and analyzed its frequency in a 256-landrace collection (Figure 2). This collection exhibits a latitudinal cline for flowering time (supplemental Figure 3)

TABLE 1

Comparison of QTL mapping studies with genotypes observed in parental lines

Study (a) | Cross | Trait (d) | QTL at Vgt1 locus | R2 (%) (e) | LOD (e) | Additive effect (days) (e) / late parent | mite (f) | CGindel587 (f)
Vladutu et al. (1999) (b) | N28 × E20 | DPS | Detected | NA | ~15 | 2.82/N28 | +/− | +/−
Barriere et al. (2005) | F838 × F286 | SD | Detected | 28.8 | 17.8 | 2.3/F838 | +/− | +/−
Beavis et al. (1994) | B73 × Mo17 | DPS | Detected | 31 | 11.2 | NA/B73 | +/− | +/−
Blanc et al. (2006) | DE × F283 × F9005 × F810 | SD | Detected | 5 | NA | 0.4/F9005 | −/−/−/− | −/−/−/−
Austin and Lee (1996) | H99 × Mo17 | SD/DPS | Not detected | - | - | - | −/− | −/−
Poupard et al. (2001) | F2 × MBS847 | SD | Not detected | - | - | - | −/+ | −/−
Mechin et al. (2001) | F2 × MBS847 | SD | Not detected | - | - | - | −/+ | −/−
Bouchez et al. (2002) (c) | F2 × MBS847 (c) | SD | Not detected | - | - | - | −/+ | −/−

(a) Other studies reported a QTL for days to flowering in the Vgt1 region (e.g., Abler et al. 1991; Koester et al. 1993; Tuberosa et al. 1997; Jiang et al. 1999) but, due to incomplete information, they cannot be included in our comparison.

(b) Mean value of two environments.
(c) In Bouchez et al. (2002) the iodent line (Io) was MBS847 (A. Charcosset, personal communication).
(d) Days to pollen shed (DPS); silking date (SD).
(e) NA, not available.
(f) +, late allele; −, early allele, in parental line1/parental line2/. . .

Figure 2. Geographical distribution of mite frequency (A) for 256 European and American landraces and (B) for a subset of 77 landraces from Central America and the Caribbean considering both latitude and elevation. The genotypic analysis of landraces was focused on mite due to its strong association with flowering time and its high level of LD with CGindel587. Moreover it could be easily genotyped by size analysis on a standard agarose gel. Genotyping was performed on a bulk of 15 plants per population as previously described by Dubreuil et al. (2006). Additional information is given in supplemental Materials and Methods.

Note 2435

vgt1

Page 44: Complex adaptation in Zea

stress response associated with TEs

[Figure: length of the longest root (cm) and length of the 4th leaf under control and cold conditions; hierarchical clustering of Log2(Stress/Control) (scale −3 to 3) for B, M, and O under heat, salt, chill, and UV stress.]

Makarevitch et al. 2014 bioRxiv

[Figure: pie charts for Cold (3624 genes), Heat (2454 genes), High salt (4267 genes), and UV (3450 genes), dividing genes into stress activated or stress up-regulated, and near TEs or not near TEs.]

Page 45: Complex adaptation in Zea

enrichment of specific TEs near genes

[Figure: TE families and Log2(stress/control) expression under cold, heat, salt, and UV stress. TE families shown: odoj, flip, dagaf, raider, Zm02117, Zm05382, Zm03238, nihep, riiryl, uwum, jeli, ubel, alaw, ipiki, etug, gyma, naiba, joemon, pebi, Zm00346.]

Makarevitch et al. 2014 bioRxiv

Page 46: Complex adaptation in Zea

enrichment of specific TEs near genes

[Figure: TE families and Log2(stress/control) expression under cold, heat, salt, and UV stress, as on the previous slide.]

[Figure S1 panels: (A) proportion of stress-responsive genes by distance of the TE insertion from the TSS, per TE family; (B) proportion with CBF binding site and average # CBF binding sites per element for Cold, Heat, Salt, UV, random TEs, and random genomic regions.]

Figure S1. Properties of TE insertions that condition stress-responsive expression. (A) In our initial screening we only analyzed TE insertions located within 1kb of the TSS. Here we assessed the proportion of genes that exhibit stress-responsive expression for TE insertions located at different distances from the TSS (for the stress condition most associated with each TE family). Some of the TE families appear to only affect genes if they are inserted quite near the TSS while others can have influences at distances. (B) The CBF/DREB transcription factors have been associated with stress-responsive expression in a number of plant species [46]. We identified consensus CBF/DREB binding sites (A/GCCGACNT) in the consensus TE sequences (maizetedb.org) for the TEs associated with each of the stresses as well as in 40 randomly selected TEs that were not associated with gene expression responses to stress or 40 randomly selected 5kb genomic regions. The proportion of sequences that contained a CBF/DREB binding site and the average number of sites per element are shown. The TEs associated with cold, heat and salt stress are all enriched for containing CBF/DREB binding sites.

Makarevitch et al. 2014 bioRxiv
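The Figure S1 caption describes counting consensus CBF/DREB binding sites (A/GCCGACNT) in TE consensus sequences. A rough sketch of such a scan is below. It assumes the motif is searched on both strands and that overlapping matches count separately; neither detail is stated in the caption. The function names and the toy sequences are hypothetical and are not real TE consensus sequences.

```python
import re

# A/GCCGACNT written as a regular expression; the lookahead lets
# overlapping matches be counted.
CBF_MOTIF = re.compile(r"(?=([AG]CCGAC[ACGT]T))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_cbf_sites(seq: str) -> int:
    """Number of motif matches on the forward and reverse strands."""
    seq = seq.upper()
    return sum(len(CBF_MOTIF.findall(s)) for s in (seq, reverse_complement(seq)))


def summarize(seqs):
    """Proportion of sequences with at least one site, and mean sites per sequence."""
    counts = [count_cbf_sites(s) for s in seqs]
    proportion = sum(c > 0 for c in counts) / len(counts)
    mean_sites = sum(counts) / len(counts)
    return proportion, mean_sites


# Toy sequences (made up for illustration).
toy = ["TTTACCGACATGG", "CCAAGTCGGCTT", "ATATATATAT"]
proportion, mean_sites = summarize(toy)
```

The two returned values correspond to the two quantities plotted in panel B: the proportion of elements with a site and the average number of sites per element.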

Page 47: Complex adaptation in Zea

new insertions activate expression

Makarevitch et al. 2014 bioRxiv

[Figure: Log2(stress/control) expression in lines with the TE insertion versus lines without the TE insertion for GRMZM2G071206, GRMZM2G400718, GRMZM2G102447, GRMZM2G108057, and GRMZM2G108149 (panels A to E); Log2(stress/control) and TE presence (+/−) in Oh43, B73, and Mo17 for genes 1 to 10; percent of conserved genes (0% to 100%) per TE family (alaw, dagaf, etug, flip, gyma, ipiki, jeli, joemon, naiba, nihep, odoj, pebi, raider, riiryl, ubel, uwum, Zm00346, Zm02117, Zm03238, Zm05382) under salt, UV, heat, and cold stress.]

Page 48: Complex adaptation in Zea

evolutionary patterns differ among TEs

Michelle Stitzer, unpublished

Page 49: Complex adaptation in Zea

fitness cost of inversions

Griffiths et al. 2010 10th Ed.

Page 50: Complex adaptation in Zea

1072 M. P. MAGUIRE

Maguire 1966 Genetics

limited underdominance in maize

46%

34%

20%

nonhomologous none loop

Page 51: Complex adaptation in Zea

Inv1n common and old

Fang et al. 2012 Genetics

bridges and acentric fragments at anaphase I (Dawe and Cande 1996).

Results

We examined the level of LD in each of the three subspecies of Z. mays with a genome-wide set of 941 SNPs from 2782 samples. Using computationally phased genotypic data, we searched for pairs of markers in high LD (r² > 0.6) and separated by >1 Mb. Our scan identified two such regions, an ~50-Mb region on chromosome 1 and an ~15-Mb span of chromosome 8. Because the region on chromosome 8 is near a likely assembly error in the reference genome (J. Glaubitz, unpublished data), we focused our analysis on chromosome 1. The region of high LD on chromosome 1 in our data corresponds closely to the 65- to 115-Mb region on the physical map of the reference mays genome (B73 RefGen v2, release 5a.59, 2010–2011) recently reported by Hufford et al. (2012) as a putative inversion. Our data reveal high LD (mean r² = 0.24) among the 17 SNPs from Mb 65.09 to 106.16 (Figure 1), compared to a genome-wide average of 0.004. Gametic disequilibrium, as estimated from unphased SNP genotyping data, also demonstrates this excess of LD (data not shown). Finally, high levels of LD are also evident in genotypic data from a panel of 13 individuals of parviglumis genotyped using the 55,000 SNPs on the MaizeSNP50 Illumina Infinium Assay (Hufford et al. 2012), suggesting that the LD observed is not an artifact of the genotyping platform used.
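The scan described above (marker pairs with r² above 0.6 that sit more than 1 Mb apart) can be illustrated with a small sketch. It assumes phased biallelic haplotypes coded 0/1; the function names, toy positions, and toy haplotypes are hypothetical, and the paper's actual pipeline may differ.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """r^2 between two biallelic sites coded 0/1 across phased haplotypes."""
    n = len(site_a)
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(a & b for a, b in zip(site_a, site_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0  # a monomorphic site carries no LD information
    d = p_ab - p_a * p_b
    return d * d / denom


def long_range_ld_pairs(positions, sites, min_r2=0.6, min_dist=1_000_000):
    """Marker pairs in high LD that are separated by more than min_dist bp."""
    hits = []
    for i, j in combinations(range(len(positions)), 2):
        if abs(positions[j] - positions[i]) > min_dist:
            r2 = r_squared(sites[i], sites[j])
            if r2 > min_r2:
                hits.append((positions[i], positions[j], r2))
    return hits


# Toy data (hypothetical): four SNPs typed on eight phased chromosomes.
block = [0, 0, 0, 0, 1, 1, 1, 1]
unlinked = [0, 1, 0, 1, 0, 1, 0, 1]
toy_positions = [100_000, 100_500, 5_000_000, 9_000_000]
toy_sites = [block, block, block, unlinked]
hits = long_range_ld_pairs(toy_positions, toy_sites)
```

In the toy data the first two SNPs are in perfect LD but only 500 bp apart, so they are not reported as a pair; only pairs that are both distant and strongly correlated pass the filter, which is the pattern that flags a candidate inversion.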

The extended region of high LD on chromosome 1 is a putative inversion

Because mays and the teosintes are outcrossing taxa with large effective population sizes, LD in the genome generally declines rapidly with distance (r² < 0.1 within 1500 bp in domesticated mays) (Remington et al. 2001). The region of high LD is distinct from both the centromere (Wolfgruber et al. 2009) and known heterochromatic knobs (Buckler et al. 1999) and exhibits relatively low recombination (Figure 1). An ~50-Mb span of high LD is unexpected, and while parviglumis and mexicana show evidence of high LD in this chromosomal region, levels of LD in our large sample of domesticated mays are similar to genome-wide averages (Figure 1). Other wild taxa also do not show an excess of LD on the short arm of chromosome 1, although our power to measure LD in these samples is likely hampered by smaller sample size and SNP ascertainment bias. Finally, a recent genetic map from a BC2S3 population derived from a cross between a mays line and a parviglumis line with the putatively inverted arrangement shows no crossovers inside the ~50-Mb span in the 881 progeny genotyped, consistent with the putative inversion suppressing recombination in heterozygotes (L. Shannon and J. Doebley, unpublished data). Although final validation will require demonstrating differential marker order in the progeny of self-fertilized individuals

Figure 1 Population genetic evidence for the Inv1n inversion. Top, cumulative genetic distance by physical position along chromosome 1. The dashed curve is based on the teosinte–maize backcross map of Briggs et al. (2007) and the solid curve is from the maize nested association-mapping (NAM) population (Yu et al. 2008). Bottom, haplotype number (blue curve) and FST between the inverted and standard arrangements (red curve). The number of haplotypes present across chromosome 1 was calculated in overlapping 10-SNP windows with 1-SNP increments. The inverted region is marked in gray and the centromere in green dashed lines. Below, LD (r²) is plotted across the chromosome for parviglumis, mexicana, and mays.
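The haplotype-number curve in the Figure 1 caption is computed in overlapping 10-SNP windows with 1-SNP increments. A minimal sketch of that windowing, assuming each chromosome is given as a string of phased alleles (the function name and toy haplotypes are hypothetical):

```python
def haplotype_counts(haplotypes, window=10, step=1):
    """Number of distinct haplotypes in each sliding window of SNPs."""
    n_snps = len(haplotypes[0])
    counts = []
    for start in range(0, n_snps - window + 1, step):
        distinct = {hap[start:start + window] for hap in haplotypes}
        counts.append(len(distinct))
    return counts


# Toy data (hypothetical): four chromosomes typed at 12 SNPs.
toy_haplotypes = [
    "000000000000",
    "000000000001",
    "111111111111",
    "111111111110",
]
window_counts = haplotype_counts(toy_haplotypes)
```

A region where recombination is suppressed between two arrangements would show fewer distinct haplotypes per window than the surrounding chromosome, which is what the blue curve in the figure is meant to reveal.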

886 Z. Fang et al.


Z. Fang et al. 4 SI

Figure S3 Geographic distribution of the 33 parviglumis populations. The size of the circle is proportional to the Inv1n frequency, and color represents elevation. The study area in Mexico is shown in the inset.

Page 52: Complex adaptation in Zea

We used multiple approaches to estimate the age of Inv1n-I from the resequencing data. Using the MCMC approach of Becquet and Przeworski (2007), which estimates divergence time from patterns of shared polymorphism under an isolation model, divergence was estimated to be ~296,000 generations, with a 95% confidence interval

Figure 3 (A) Neighbor-joining tree for all SNPs outside Inv1n, using 15 parviglumis inbred lines. (B) Neighbor-joining tree for all SNPs inside Inv1n, using 15 parviglumis inbred lines. (C) Neighbor-joining tree for all unique haplotypes in each taxon, using all SNPs inside Inv1n. The haplotypes in the gray region represent the Inv1n-I arrangement.

Table 1 Mean (and standard deviation) of summary statistics for 7 resequenced loci inside and 88 loci outside Inv1n

Region | No. loci | n | L | Ssh | Sf | Sp | h | H | θπ | Taj D | Fay and Wu’s H
Inside (Inv1n-I) | 7 (6) | 14.6 (1.5) | 307 (88) | 0.3 (0.8) | 4.3 (3.1) | 2.9 (3.2) | 2.3 (1.4) | 0.49 (0.41) | 0.004 (0.004) | 0.37 (1.21) | −0.001 (0.003)
Inside (Inv1n-S) | 7 (6) | 14.6 (1.5) | 307 (88) | 0.3 (0.8) | 4.3 (3.1) | 2.9 (2.5) | 3.3 (1.7) | 0.59 (0.36) | 0.004 (0.003) | −0.70 (0.56) | 0 (0.003)
Outside (Inv1n-I) | 88 (68) | 13.5 (1.9) | 414 (107) | 4.7 (4.6) | 0.1 (0.9) | 4.1 (4.1) | 3.9 (1.2) | 0.89 (0.21) | 0.011 (0.009) | −0.22 (0.65) | −0.003 (0.017)
Outside (Inv1n-S) | 88 (68) | 13.5 (1.9) | 414 (107) | 4.7 (4.6) | 0.1 (0.9) | 4.7 (3.6) | 4.7 (1.8) | 0.88 (0.21) | 0.010 (0.007) | −0.34 (0.62) | −0.008 (0.029)

The number of loci with an outgroup is listed in parentheses in the “No. loci” column. The numbers in parentheses in other columns are standard deviations: n, number of samples; L, length of the locus; Ssh, number of shared SNPs between Inv1n-I and Inv1n-S; Sf, number of fixed SNPs; Sp, number of private SNPs; h, number of haplotypes; H, haplotype diversity; θπ, pairwise difference per base pair.


Inv1n common and old

Fang et al. 2012 Genetics

Page 53: Complex adaptation in Zea

Inv1n: selection and associations

Fang et al. 2012 Genetics

homozygous for alternate arrangements (Mano et al. 2012), we view these multiple lines of evidence as a strong case that recombination is suppressed due to an inversion in this region, henceforth identified as Inv1n.

To test for evidence of pairing and recombination within the large Inv1n region, we examined male meiocytes from six F1 plants derived from two crosses between mays and an inbred parviglumis line containing Inv1n. Both hybrids revealed a low frequency of dicentric bridge formation at ~4% (7/167), but no acentric fragments were observed (Table S5). Although such bridges were rare, an anaphase I bridge in a plant heterozygous for Inv1n was observed (Figure S1). In addition, we observed no obvious reduction in pollen viability or seed set in a total of five F1 plants (data not shown).

Haplotype variation and divergence time

STRUCTURE (Pritchard et al. 2000; Falush et al. 2003) analysis of SNPs on all 1936 parviglumis chromosomes inside Inv1n shows the highest likelihood for K = 2 clusters, a pattern not seen from the full set of genome-wide SNPs (data not shown). These groups are hereafter referred to as Inv1n-I and Inv1n-S for the inverted and standard arrangements, respectively (Figure 2). Recombination among loci within a chromosomal arrangement should be unaffected, and levels of LD within Inv1n-I (mean r² = 0.11) and Inv1n-S arrangements (mean r² = 0.07) are indeed low and similar to background levels (Figure S2). Average FST between chromosomes with alternate arrangements is notably higher inside the Inv1n region (0.54) than across the rest of the genome (0.01) (Figure 1). Genetic distance among accessions for SNPs along chromosome 1 outside the Inv1n region shows little evidence of haplotype structure (Figure 3A), while genetic distance for SNPs inside Inv1n divides parviglumis into two clear haplotypic groups representing Inv1n-I and Inv1n-S (Figure 3B). The Inv1n-S cluster includes all taxa of Zea and Tripsacum investigated, and it is parsimonious to assume that the Inv1n-I cluster, present only in parviglumis and mexicana, represents the derived inverted arrangement (Figure 3C). Despite strong differentiation, the two arrangements share polymorphic SNPs (Figure 2), even in homozygous individuals unaffected by haplotype phasing (data not shown). Among the 968 parviglumis samples, 345 (35.6%) are heterozygous at Inv1n, while 369 (38.1%) and 254 (26.3%) are homozygous for the Inv1n-I and Inv1n-S arrangements, respectively. Inv1n-I consists of a smaller number of distinct haplotypes and shows a paucity of rare haplotype variants compared to Inv1n-S (Figure 2).
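The genotype counts quoted above are enough to recover the number of chromosomes sampled and the overall frequency of each arrangement. The sketch below is only that bookkeeping; the function name is hypothetical and the frequency itself is derived here, not reported in the excerpt.

```python
def arrangement_frequencies(n_hom_inverted, n_het, n_hom_standard):
    """Chromosome count and arrangement frequencies from diploid genotype counts."""
    n_individuals = n_hom_inverted + n_het + n_hom_standard
    n_chromosomes = 2 * n_individuals
    # Each inverted homozygote carries two Inv1n-I copies, each heterozygote one.
    freq_inverted = (2 * n_hom_inverted + n_het) / n_chromosomes
    return n_chromosomes, freq_inverted, 1 - freq_inverted


# Counts quoted in the text: 369 Inv1n-I homozygotes, 345 heterozygotes,
# 254 Inv1n-S homozygotes among 968 parviglumis samples.
n_chrom, freq_i, freq_s = arrangement_frequencies(369, 345, 254)
print(n_chrom, round(freq_i, 3), round(freq_s, 3))
```

The chromosome count of 1936 matches the number of parviglumis chromosomes used in the STRUCTURE analysis.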

Resequencing data from seven loci within Inv1n mirror these results (Table 1). Four loci (PZA00692, PZA00593, PZA03014, and PZA00146) show distinct haplotype clusters consistent with the SNP genotyping data (data not shown), dividing parviglumis into two groups representing Inv1n-I and Inv1n-S. A comparison of the two groups reveals a higher number of fixed differences, fewer shared derived SNPs, and higher average FST (0.53 vs. 0.05) inside the Inv1n region than outside. Average Tajima’s D of the entire sample is higher inside Inv1n (0.58 vs. −0.29), and the lack of rare haplotypes on the Inv1n-I background observed in the SNP data is reflected in the positive Tajima’s D at sequences from these chromosomes (Table 1). All alleles private to Inv1n-I are derived on the basis of the Sorghum outgroup sequence, but 30% of the alleles private to Inv1n-S are ancestral.
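The link between a lack of rare variants and positive Tajima's D can be illustrated with a short sketch of the standard statistic (Tajima 1989), computed from a set of aligned haplotypes. The function name and the toy haplotypes are hypothetical and are not data from the paper. The sketch assumes biallelic sites and no missing data.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D from equal-length haplotype strings (one character per site).

    Returns None when the statistic is undefined (fewer than 4 sequences,
    or no segregating sites).
    """
    n = len(haplotypes)
    if n < 4:
        return None  # the variance term is zero for n < 4
    n_sites = len(haplotypes[0])
    seg_sites = sum(
        1 for j in range(n_sites) if len({h[j] for h in haplotypes}) > 1
    )
    if seg_sites == 0:
        return None
    # pi: mean number of pairwise differences
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    ) / n_pairs
    # Constants from Tajima (1989)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Toy examples: intermediate-frequency variants versus singletons only.
balanced = ["0011", "0011", "1100", "1100"]
singletons = ["100", "010", "001", "000"]
print(tajimas_d(balanced), tajimas_d(singletons))
```

In the first toy sample every variant sits at intermediate frequency, so pairwise diversity exceeds the Watterson estimate and D is positive. In the second, every variant is a singleton and D is negative.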

Figure 2 Diagram of haplotype diversity in parviglumis based on the 17 SNPs within Inv1n. Haplotypes are divided into the two clusters identified by STRUCTURE. Each SNP is represented by either the ancestral state (solid) or the derived state (shaded). The frequency of each of the haplotypes from the inverted (top) and standard (bottom) arrangements is shown on the right. The middle bar shows the physical position of each of the 17 SNPs inside Inv1n.

Teosinte Inversion 887

Page 54: Complex adaptation in Zea

Inv1n: selection and associations

Fang et al. 2012 Genetics

Nielsen 2004; Nielsen et al. 2005; McVean 2007). However, the largest sweep identified in maize to date is only 1.1 Mb (Tian et al. 2009), and both the age of the inversion and common tests for departures from neutrality do not provide evidence of strong selection. Another alternative explanation would be the presence of strong negative interactions between distantly linked loci, potentially due to synthetic lethality (Boone et al. 2007). Such interactions should not generate extended patterns of elevated LD among intervening SNPs, as crossing over among haplotypes not carrying alleles involved in the negative interaction should not be affected. Both selective sweeps and negative interactions are inconsistent with the presence of only two major haplotypes in the Inv1n region and fail to explain the clinal variation in haplotype frequencies seen at Inv1n-I.

To our knowledge, the only prior evidence for Inv1n is a report of high LD and high FST from a much smaller sample of parviglumis (Hufford et al. 2012), but a number of other large inversions have been previously reported in mays and its wild relatives (Ting 1965, 1967, 1976; Maguire 1966; Kato 1975). These include an ~50-Mb inversion on the long arm of chromosome 3 in Z. luxurians (Ting 1965) and an ~35-Mb inversion that covers most of the short arm of chromosome 8 in both mays (McClintock 1960) and mexicana (Ting 1976). While some of these inversions were experimentally induced (McClintock 1931; Morgan 1950), several have also been identified in natural populations of multiple taxa (Kato 1975; Ting 1976).

One of the factors that may limit the geographic spread of large inversions is the potential fitness cost of crossing over. The frequency of chromosome loss is dependent on the inversion size and efficiency of synapsis over the inverted region (Burnham 1962; Maguire and Riess 1994; Lamb et al. 2007). When gene density is low, such as in pericentromeric regions, or there is a lack of continuous homology, chromosomes will often synapse in a nonhomologous manner without recombination (McClintock 1933). In maize, for example, an inversion on the long arm of chromosome 1 similar in size to Inv1n (19 cM) was seen to undergo homologous pairing in only about one-third of cases (Maguire 1966). Since Inv1n is located in a pericentromeric region with low gene density and covers a short genetic distance (2–13 cM), we anticipated that it would rarely pair and recombine with a noninverted chromosome. Our data are consistent with these arguments. We observed repressed recombination around Inv1n and no cytological evidence of crossing over in inversion heterozygotes. SNP data indicate no deviations from expected Hardy–Weinberg genotype frequencies at Inv1n, and we see no obvious evidence of effects on fertility. Given these observations, we suspect that inversion polymorphisms may be relatively common in natural plant populations, especially in regions of the genome with low recombination rates such as pericentromeres. Low recombination has also been offered as an explanation for the lack of underdominance in many pericentromeric inversions in Drosophila (Coyne et al. 1993). As dense genotyping becomes more cost effective, we predict that numerous common inversions will be identified in natural populations of Zea and other organisms.

Origin and age of Inv1n

Our evidence suggests that Inv1n-I is the derived, inverted arrangement. Inv1n-I is not found in Tripsacum or Zea taxa except for parviglumis and mexicana (Figure 3C), and, unlike in Inv1n-S, all SNPs private to Inv1n-I are derived in

Figure 5 (A) Bayes factors for correlation between allele frequencies and altitude in 33 natural parviglumis populations. Inv1n is indicated by red vertical lines. The 99th percentile of the distribution of Bayes factors is indicated by a horizontal dashed line. Chromosomes 1–10 are plotted in order and in different colors. (B) Association between all SNPs and culm diameter. SNPs significant at 5% FDR are above the dashed line.
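The caption does not say how the 5% FDR threshold in panel B was obtained. As one common approach, a minimal sketch of the Benjamini-Hochberg step-up procedure is shown here. Treating it as the procedure behind the figure is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking p-values significant at the given FDR.

    Benjamini-Hochberg step-up rule: find the largest rank k such that
    p_(k) <= (k / m) * fdr, then call the k smallest p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            largest_passing_rank = rank
    significant = [False] * m
    for idx in order[:largest_passing_rank]:
        significant[idx] = True
    return significant


# Hypothetical per-SNP association p-values.
example = [0.001, 0.04, 0.03, 0.2]
print(benjamini_hochberg(example))
```

Because the rule is step-up, a p-value can be called significant even if it misses its own rank threshold, as long as a larger p-value passes at a higher rank.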

890 Z. Fang et al.

Temperature

Culm diameter

Page 55: Complex adaptation in Zea

Inv1n: selection and associations

Fang et al. 2012 Genetics


[Plot: inversion frequency (0.0 to 1.0) against altitude (600 to 1600 m); r² = 0.34]

Temperature

Culm diameter
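The r² reported for the altitude plot on this slide is the share of variance in inversion frequency explained by a straight-line fit on altitude. A minimal sketch of that calculation follows. The population values are invented for illustration and are not the data behind the slide, and the function name is hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept, plus r squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical populations: altitude in metres, inversion frequency in [0, 1].
altitude_m = [600, 800, 1000, 1200, 1400, 1600]
inversion_freq = [0.15, 0.40, 0.30, 0.55, 0.45, 0.70]

slope, intercept, r_squared = linear_fit(altitude_m, inversion_freq)
print(f"slope per 100 m: {slope * 100:.3f}, r^2: {r_squared:.2f}")
```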

Page 56: Complex adaptation in Zea

Inv9d

Figure S8 Altitudinal clines of three inversions presented as a relationship between altitude and haplotype distance within each inversion. Distance (as a number of SNPs for which they differ) of each haplotype from the most distal haplotype in the main low diversity haplotype group is on the y-axis. Colors indicate populations. A) Inv1n, B) Inv4m and C) Inv9d.

[Plot axes: altitude (x), inversion frequency (y)]

Inv9d

Pyhäjärvi et al. 2013 GBE

large inversions common, clinal

Inv1n

Inv4m Inv9e

Inv9d

Page 57: Complex adaptation in Zea

all inversions show signs of selection

Pyhäjärvi et al. 2013 GBE

temperature

FCT FST

Page 58: Complex adaptation in Zea

adaptive introgression of Inv4m

Hufford et al. 2013 PLoS Genetics Lauter et al. 2004 Genetics

Page 59: Complex adaptation in Zea

extensive variation in genome size

Diez et al. 2013 New Phytologist

Page 60: Complex adaptation in Zea

[Plot: 1C genome size (Gb; axis range about 2.50 to 3.75) for groups MH, ML, SAH, SAL, mexicana, and parviglumis; legend: altitude (highland, lowland)]

altitudinal cline in genome size in Zea

Paul Bilinski, unpublished

Page 61: Complex adaptation in Zea

altitudinal cline in genome size in Zea

Paul Bilinski, unpublished

Page 62: Complex adaptation in Zea

altitudinal cline in genome size in Zea

Paul Bilinski, unpublished

the geographically derived groups (NA and SA) to the ancestral group (ME). The average individual size is higher for the NA (t = 5.65, P < 0.0001) and SA groups (t = 3.74, P < 0.0003) than for the ancestral ME group (fig. 1). No difference was detected between the NA and SA groups (t = 0.82, P = 0.41). These data indicate an increase in allele size in the geographically derived NA and SA groups relative to the ancestral ME group.

Because genome size has been reported to be negatively correlated with altitude in maize (Poggio et al. 1998), and because microsatellite loci are one component of genome size, we examined whether average individual allele size is correlated with altitude and found that it was (fig. 2; R = 0.35, P = 3 × 10⁻⁶). We found the same correlation if we considered the three groups separately: ME (R = 0.43, P < 0.0002), SA (R = 0.30, P < 0.016), and NA (R = 0.54, P < 0.001). The SA group was derived from low-altitude maize of Guatemala, and the NA group was derived from maize of northern Mexico (Matsuoka et al. 2002b). These independent histories argue that the correlations were established independently in ME, SA, and NA.

Next we asked whether the differences between average individual size among groups could result from a difference in the mean altitude for the groups. The average altitude is 1,430 m for the ME group, 1,480 m for the SA group, and 1,030 m for the NA group. First, we determined the regression slopes of altitude onto allele size for each group and found that they are all similar (F = 2.41, df = 2, P = 0.093). Knowing that the relationship between the average individual size and altitude is similar for all groups, we then asked whether the intercepts are significantly different between groups and found that they are (F = 19.9, df = 2, P < 0.001). Based on all the tests, we conclude that there are two distinct phenomena: (1) a significant difference between groups in average individual size and (2) a significant correlation between average individual size and altitude.

Discussion

The observation that the average individual allele size for microsatellite loci is larger in North and South American maize than in the ancestral population in Mexico supports directional evolution of microsatellite size in maize. Because the microsatellites were obtained by screening North American maize, one could argue that ascertainment bias resulted in a larger average size in North America (Ellegren, Primmer, and Sheldon 1995). However, one would not expect to observe the same phenomenon for the South American sample. Thus, our data suggest that directional evolution of microsatellite size occurs in maize. The mean difference between the SA and ME, and the NA and ME populations is 4.1 bp and 3.3 bp per locus, respectively. Therefore, we conclude that maize microsatellite loci have not remained at equilibrium for size but have tended to increase in size from the “ancestral” to the geographically derived populations.

Directional evolution of this type could be explained in several ways. First, there could be a change in the degree of mutational bias to larger alleles in the derived groups. In our case, this would have had to occur twice, independently in North and South America. Second, given that mutations are more likely to cause an increase in allele size in maize (Vigouroux et al. 2002), a change in the mutation rate with movement into a new environment could cause directional evolution (Rubinsztein et al. 1995). Third, one could also propose a demographic explanation. For example, if the ancestral population is stable in size and at Hardy-Weinberg equilibrium, then the coalescence tree for a sample of alleles could be shorter than the coalescence tree for a similar sample in an expanding, nonequilibrium, derived population. The longer tree implies more opportunity for mutation, and given that mutation tends to increase allele size in maize, the result would be a larger average allele size in the derived population. This explanation implies that the formation of the derived population was not associated with a severe bottleneck. Other demographic scenarios could give the opposite outcome, i.e., a shorter coalescence tree for a derived population.

We have also observed a negative correlation between allele size and altitude in Mexico and North and South America that must have been independently derived given the known phylogeography of maize (Matsuoka et al. 2002b). Moreover, the strength of the correlation is similar in all three regions. There is an average decrease of 1.8 bp

FIG. 1. Average individual microsatellite allele size for the three maize populations: Mexican (ME), South American (SA), and North American (NA). The mean, the standard error, and the number of plants per population are presented.

FIG. 2. Correlation between average individual microsatellite allele size and altitude.

Directional Evolution in Maize Microsatellites 1481

Vigouroux et al. 2002 MBE

Normalize Repeat Length

Page 63: Complex adaptation in Zea

cline present even for meiotic drive loci

Kanizay et al. 2013 Heredity

342

Fig. 1a,b. a Pachytene chromosomes of the KYS stock of maize, showing its five knobs and the heterochromatin of the nucleolus organizer region (NOR). S and L denote the short and long arms of chromosomes. b In situ hybridization of cloned knob DNA repeating unit to mitotic prometaphase chromosomes of maize stock P100. Hybridization is localized to knobs. In situ hybridization conditions were as in Peacock et al. (1981)

DNA sequence component which is repeated tandemly in each knob (Peacock et al. 1981). The repeat length of the sequence is 180 bp and in situ hybridization experiments showed that the sequence was restricted to knob heterochromatin and was not detectable in any other heterochromatic region (Fig. 1).

We have now demonstrated that this same sequence is the major component of knob heterochromatin in teosinte and Tripsacum. Analysis of the sequences of cloned individual repeats provides strong confirmation that maize is derived from teosinte. Tripsacum is a related but distinct species. In all three species there is a 27-bp region of the 180-bp repeat sequence that shows no nucleotide variation among repeats.

Methods

Plant Stocks

Maize. Black Mexican, P100, KYS and 34704 were provided by Dr. Marcus Rhoades, Indiana University.

Teosinte. Diploid lines: El Salado, Chalco, El Progresso and San Antonio Huista were provided by Dr. G. Beadle, University of Chicago. C-1-78 (near Chalco) was a gift of Dr. T. Angel Kato, Chapingo, Mexico.

Tetraploid line: Tetraploid Perennial (Hitchcock) was provided by Dr. G. Beadle.

Other Species. Z. diploperennis stocks were from Dr. G. Beadle, University of Chicago, and Dr. H. Iltis, Madison, Wisconsin, USA. Tripsacum dactyloides was from Dr. W. Galinat, Waltham, Massachusetts, USA. Coix aquatica, C. lacryma-jobi and C. gigantea were supplied by Dr. P.N. Rap, Visakhapatnam, India. Sorghum intrans was from Dr. R. Dowries, CSIRO, Canberra, Australia.

Isolation of DNA

Shoots from 6-day-old seedlings were frozen in liquid nitrogen and ground to a fine powder in a mortar and pestle. Five times the weight of extraction buffer [0.1 M NaCl, 50 mM ethylenediaminetetraacetate (EDTA), pH 8.5, 2% Sarkosyl] was added, and then NaClO4 to 1 M and Bacovin to 0.1%. Nucleic acid was deproteinized with phenol and chloroform, and then ethanol precipitated. The pellet was redissolved and digested with autodigested protease (5 h at 37°C at a concentration of 0.5 mg/ml protease) and centrifuged to equilibrium in a CsCl gradient containing ethidium bromide.

Cloning of Repeat Sequences

The individual repeat units were cloned from maize, teosinte and Tripsacum by digesting purified satellite DNA (prepared as in Peacock et al. 1981) with restriction endonuclease HaeIII, adding HindIII linkers, and eluting individual bands from preparative acrylamide gels and cloning into the HindIII site of pBR322 as described previously (Peacock et al. 1981). Bands corresponding to the 180-bp monomer, the 202-bp length variant and the dimer were all eluted and cloned separately.

DNA Sequence Analysis

Segments were 3' end-labelled using [α-32P] deoxynucleotide triphosphates and 'filling in' of a recessed 3' OH end using the large (Klenow) subunit of DNA polymerase I, and were sequenced by chemical methods (Maxam and Gilbert 1980). Alternatively, clones were made in M13 mp8 (Messing and Vieira 1982) and sequenced using the chain termination method (Sanger et al. 1977).

Kelly Dawe

Page 64: Complex adaptation in Zea

cline present even for meiotic drive loci

Kanizay et al. 2013 Heredity


Dennis and Peacock 1984 J Mol Evol

Kelly Dawe

Page 65: Complex adaptation in Zea

Rayburn et al. 1994 Plant Breeding Francis et al. 2008. Ann. Bot.

mechanism: cell cycle, flowering time?

cycle time that did not exceed 20 h compared with a much greater spread of cycle times for the monocots. If DNA mass per se is the limiting factor for cell cycle time, we hypothesize that cycle times would be the same for dicots and monocots of comparable C-value. This is so even if the data for Scilla sibirica and Trillium grandiflorum are excluded. Indeed, if we ignore the marked discontinuity of the y-axis caused by their inclusion, then the nucleotypic effect is strong for all species regardless of phylogeny. To test the rigour of these hypotheses would require data to plug the gap between Trillium grandiflorum and the majority of C-value/cell cycle times analysed here.

Separate plots for diploids and polyploids show a strong nucleotypic effect on CCT in diploids (Fig. 3; Table 2). Removing the five diploid outliers (>25 pg) reduced the slope (b = 0.27) by approximately four-fold but the regression continued to be significant (P < 0.001). For the polyploids, a nucleotypic effect on CCT was also detected (Fig. 3; Table 2); however, removing the two polyploid outliers rendered the regression non-significant (y = 0.03x − 13.5). This confirms previous work in which the slope/rate of increase in CCT with increasing DNA was higher in diploids than in autopolyploids (Evans et al., 1972). With the exception of Scilla sibirica, CCT in polyploids is generally more buffered than in diploids (Fig. 3).

We acknowledge that some traditionally classified diploids are not necessarily so (see Soltis and Soltis, 1999). For example, there are strong arguments that Zea mays is actually an allotetraploid (2n = 4x = 20; Gaut and Doebley, 1997). However, in the data reported here we have assigned ploidy level as listed by the authors of the papers and reviews we have consulted.

The longest CCTs (>20 h) are exhibited by the perennials (Fig. 4). Indeed, the data for perennials overall had a nearly seven-fold steeper slope (b = 1.37) than a comparable regression for annuals (b = 0.20; Table 2). These data are consistent with findings of Bennett (1972) where the mean CCT in 19 annuals was significantly shorter than in eight obligate perennials. Where our analyses differ from Bennett (1972) is in relation to the broad range of CCTs shown by perennials compared with annuals (Fig. 4). However, in Fig. 4 the longer CCTs

FIG. 3. DNA C-value (pg) and cell cycle time (h) in the root apical meristem of a range of diploid and polyploid angiosperms. See Table 2 for regression analyses.

FIG. 2. DNA C-value (pg) and cell cycle time (h) in the root apical meristem of a range of (A) eudicots and monocots (n = 110), and (B) eudicots (n = 60). See Table 2 for regression analyses.

TABLE 2. Regression analyses of all data presented in Figs. 2–4 together with the percentage variance accounted for by the regression (R2), the level of probability (P) for each regression

Group              Regression (y = bx + a)   R2      P     n
All measurements   y = 1.09x + 5.39          54.2    ***   110
Monocots           y = 1.29x + 2.44          58.7    ***   48
Eudicots           y = 0.32x + 10.2          15.4    ***   62
Diploids           y = 1.04x + 4.95          49.86   ***   86
Polyploids         y = 1.14x + 3.12          56.3    ***   24
Annuals            y = 0.20x + 10.7          19.9    ***   75
Perennials         y = 1.37x + 4.13          63.6    ***   35

*** P < 0.001; n, number of replicates.
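As a quick check on the fold-changes quoted in the text, the Table 2 regression lines can be written out directly. This is a minimal sketch that takes x as DNA C-value (pg) and y as cell cycle time (h), following the figure captions. The dictionary and function names are hypothetical, and the 3 pg input is an arbitrary illustrative value, not a measurement for any species.

```python
# (slope b, intercept a) for y = b*x + a, copied from Table 2.
REGRESSIONS = {
    "All measurements": (1.09, 5.39),
    "Monocots": (1.29, 2.44),
    "Eudicots": (0.32, 10.2),
    "Diploids": (1.04, 4.95),
    "Polyploids": (1.14, 3.12),
    "Annuals": (0.20, 10.7),
    "Perennials": (1.37, 4.13),
}


def predicted_cct(group, c_value_pg):
    """Cell cycle time (h) predicted by a group's regression line."""
    slope, intercept = REGRESSIONS[group]
    return slope * c_value_pg + intercept


# Slope ratio behind "nearly seven-fold steeper" (perennials vs. annuals).
perennial_vs_annual = REGRESSIONS["Perennials"][0] / REGRESSIONS["Annuals"][0]

# Slope ratio behind "approximately four-fold" (all diploids vs. the
# b = 0.27 slope reported after removing the five diploid outliers).
diploid_slope_drop = REGRESSIONS["Diploids"][0] / 0.27

print(perennial_vs_annual, diploid_slope_drop)
print(predicted_cct("Annuals", 3.0))  # arbitrary 3 pg C-value
```

Both ratios round to the fold-changes stated in the excerpt, which suggests the flattened table was read correctly.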

Francis et al. — DNA C-value and the Cell Cycle750


Francis et al. 2008. Ann. Bot.

[Plot: axes labeled DNA, plants, and cycle]

Page 66: Complex adaptation in Zea

Paul Bilinski, unpublished Pyhäjärvi et al. 2013 GBE

opposing clines in teosinte genome size

Page 67: Complex adaptation in Zea

different-sized ascertainment panels (supplementary fig. S2, Supplementary Material online).

Overall, heterozygosity differed only slightly between the two subspecies, but considerable variation was evident among populations in both taxa (fig. 2 and supplementary table S4, Supplementary Material online). Both parviglumis and mexicana showed generally high levels of heterozygosity and little evidence of inbreeding. Within parviglumis, the population of Los Guajes showed high diversity, an inbreeding coefficient (FIS) close to zero, no deviation from Hardy–Weinberg equilibrium, and relatively short runs of ROH (supplementary fig. S3, Supplementary Material online), as might be expected of a large, outcrossing population. Other populations, however, showed evidence of recent demographic changes: diversity in San Lorenzo parviglumis was only half that of Los Guajes, and mexicana individuals from Xochimilco were characterized by extremely long ROH and extensive within-population haplotype sharing, consistent with a recent population bottleneck.

Patterns of LD varied greatly along the genome. Although LD generally decays quickly in teosinte (Hufford et al. 2012) and the median LD within 5-kb windows in our data is low (r² = 0.04), we observed several discrete blocks of elevated LD in multiple populations. We identified four large (>10 Mb) regions of high (r² ≥ 0.2) LD (fig. 3 and supplementary fig. S4, Supplementary Material online). Three of these regions appear to correspond to inversions previously described cytologically (Inv9e; Ting 1964), in mapping populations (Inv4m, Mano et al. 2012), or by population genetic analysis (Inv1n, Fang et al. 2012; Hufford et al. 2012). The size of these blocks and their chromosomal location effectively rules out recent selective sweeps or centromeric regions as alternative explanations, and we interpret all four as inversion polymorphisms. Clear haplotype structure was observed in IBS analysis for three of these putative inversions, and simple genetic-distance-based clustering, including Tripsacum and maize, suggested that the nonmaize haplotype was likely the derived state (supplementary fig. S5, Supplementary Material
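For readers less familiar with the LD measure used in this excerpt, here is a minimal sketch of r² between two biallelic SNPs computed from phased haplotypes coded 0/1. The excerpt does not describe how r² was estimated from the genotype data, so the phased-haplotype form is an assumption, and the function name and toy haplotypes are hypothetical.

```python
def ld_r2(snp_a, snp_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    snp_a, snp_b: equal-length lists of 0/1 alleles, one entry per phased
    haplotype. Returns None if either SNP is monomorphic.
    """
    n = len(snp_a)
    p_a = sum(snp_a) / n
    p_b = sum(snp_b) / n
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # coefficient of disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom


# Toy haplotypes: perfectly linked SNPs versus independent SNPs.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

SNPs on opposite arrangements of an inversion rarely recombine, so pairs spanning the inverted region tend toward the first case, which is how a long block of elevated r² can flag a candidate inversion.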

FIG. 2. Diversity statistics. (A) Proportion of SNPs deviating from Hardy–Weinberg Equilibrium (HWE), proportion of polymorphic SNPs, and mean inbreeding coefficient FIS. (B) Length and number of ROH and average pairwise length of genomic segments IBS.

Complex Patterns of Local Adaptation in Teosinte GBE

Genome Biol. Evol. 5(9):1594–1609. doi:10.1093/gbe/evt109 Advance Access publication July 30, 2013 1599


Paul Bilinski, unpublished Pyhäjärvi et al. 2013 GBE

opposing clines in teosinte genome size

Page 68: Complex adaptation in Zea

• simple scenarios of strong selection on new protein-coding mutations are probably rare

• much adaptation occurs via selection on quantitative traits, standing variation, and/or multiple mutations

• noncoding, structural variation likely play important roles in adaptation

concluding thoughts

