
Copyright: Gilean McVean, 2001 1

Population structure

• The evolutionary significance of structure

• Detecting and describing structure

– Wright’s F statistics

• Implications for genetic variability

– Inbreeding effects of structure
– The Wahlund effect
– Drift and founder effects

• Island models of population structure

– Identity by descent
– Diffusion methods
– The coalescent with structure

• Selection in subdivided populations

– Local adaptation
– Clines
– Wright’s Shifting-Balance theory


Population structure

• Non-random location

• Non-random mating

Genetic and phenotypic divergence due to

– Chance
– Selection
– Selection plus chance

[Figure: distribution of the surname Hannah. Goodacre and Sykes]


Detecting and describing genetic structure

Wright’s FST statistic

$$F_{ST} = \frac{H_T - H_S}{H_T}$$

$H_S$: average heterozygosity within subpopulations

$H_T$: heterozygosity over all populations

Testing by permutation
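The definition above, together with a permutation test, can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the lecture: it assumes a single biallelic locus with alleles coded 0/1, uses the expected heterozygosity 2p(1 − p), weights $H_S$ by sample size, and the function names (`heterozygosity`, `fst`, `permutation_p_value`) are hypothetical.

```python
import random


def heterozygosity(alleles):
    """Expected heterozygosity 2p(1 - p) for a biallelic sample coded 0/1."""
    p = sum(alleles) / len(alleles)
    return 2.0 * p * (1.0 - p)


def fst(subpops):
    """F_ST = (H_T - H_S) / H_T, with H_S weighted by subpopulation sample size."""
    pooled = [a for pop in subpops for a in pop]
    h_t = heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0  # no polymorphism, so no differentiation to measure
    h_s = sum(len(pop) * heterozygosity(pop) for pop in subpops) / len(pooled)
    return (h_t - h_s) / h_t


def permutation_p_value(subpops, n_perm=1000, seed=1):
    """Share of random relabellings whose F_ST is at least the observed value."""
    rng = random.Random(seed)
    observed = fst(subpops)
    pooled = [a for pop in subpops for a in pop]
    sizes = [len(pop) for pop in subpops]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if fst(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical samples: allele frequency 0.8 in one subpopulation, 0.2 in the other.
pop1 = [1] * 80 + [0] * 20
pop2 = [1] * 20 + [0] * 80
print(fst([pop1, pop2]))
print(permutation_p_value([pop1, pop2]))
```

Shuffling alleles across subpopulations while keeping the sample sizes fixed gives the distribution of F_ST expected when there is no structure, which is what the observed value is compared against.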


The hierarchical nature of F statistics

• F statistics can be used to contrast structure at different levels

e.g.

$$F_{IS} = \frac{H_S - H_I}{H_S}$$

$H_I$: average within-individual heterozygosity

$F_{IS}$: measure of inbreeding

$$H_{Individual} < H_{Subpopulation} < H_{Population} < H_{Region} < H_{Total}$$


FST in natural populations

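Allozymes (Nei 1975)

| Organism | $H_T$ | $H_S$ | $F_{ST}$ |
|---|---|---|---|
| Human (major races) | 0.130 | 0.121 | 0.069 |
| Human (Yanomama) | 0.039 | 0.036 | 0.077 |
| House mouse | 0.097 | 0.086 | 0.113 |
| Jumping rodent | 0.037 | 0.012 | 0.676 |

SNPs

| Organism | $H_T$ | $H_S$ | $F_{ST}$ |
|---|---|---|---|
| Human (major races) | 0.195 | 0.201 | 0.067 |
| Drosophila melanogaster (a) | 0.0154 | 0.0151 | 0.023 |

(a) Based on pairwise diversity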


The inbreeding effect of population structure

• Differences in allele frequency between populations lead to an excess of homozygotes

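Subpopulation 1 (HW eqm): homozygosity $p_1^2 + q_1^2$

Subpopulation 2 (HW eqm): homozygosity $p_2^2 + q_2^2$

Combined samples

Expected homozygosity: $\bar{p}^2 + \bar{q}^2$

Observed homozygosity: $\bar{p}^2 + \bar{q}^2 + \sigma_p^2 + \sigma_q^2$

Heterozygosity = 1 − Homozygosity

$$F_{ST} = \frac{H_T - H_S}{H_T} = \frac{\sigma_p^2 + \sigma_q^2}{1 - \bar{p}^2 - \bar{q}^2}$$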
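A short numerical check can make the homozygote excess concrete. The sketch below assumes equally sized subpopulations, each at Hardy-Weinberg equilibrium, and two alleles (so the variance of q equals the variance of p). The function name `homozygosity_excess` and the example frequencies are illustrative only.

```python
def homozygosity_excess(p_list):
    """Compare pooled Hardy-Weinberg homozygosity with the subpopulation average.

    p_list holds the frequency of one allele in each equally sized subpopulation.
    Returns (expected, observed, variance of p, F_ST).
    """
    k = len(p_list)
    p_bar = sum(p_list) / k
    q_bar = 1.0 - p_bar
    var_p = sum((p - p_bar) ** 2 for p in p_list) / k  # equals var_q for two alleles
    expected = p_bar ** 2 + q_bar ** 2                 # pooled sample, random mating
    observed = sum(p * p + (1.0 - p) ** 2 for p in p_list) / k
    f_st = 2.0 * var_p / (1.0 - expected)              # (var_p + var_q) / H_T
    return expected, observed, var_p, f_st


expected, observed, var_p, f_st = homozygosity_excess([0.2, 0.8])
print(expected, observed, observed - expected, 2.0 * var_p, f_st)
```

For frequencies 0.2 and 0.8 the excess homozygosity equals twice the variance in allele frequency, and dividing by the pooled heterozygosity gives F_ST.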


The Wahlund effect

• Increase in heterozygosity following mixing of isolated populations

• Medical implications for disease incidence in admixed populations

– Recessive disease reduced by mixing

| Disease | High risk population | Disease allele frequency |
|---|---|---|
| Cystic fibrosis | Caucasians | 0.022 |
| Albinism | Hopi | 0.07 |
| Tay-Sachs disease | Ashkenazi Jews | 0.013 |

Combine → random mating
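The reduction in recessive disease can be illustrated with simple arithmetic. The sketch below assumes random mating within each group before mixing and within the combined pool afterwards. The mixing share (0.5) and the disease allele frequency in the second group (0) are hypothetical choices for illustration; only the allele frequency 0.07 comes from the table.

```python
def recessive_incidence(q):
    """Incidence of a recessive disease under random mating: q squared."""
    return q * q


def incidence_before_and_after_mixing(q_high, q_other, share_high):
    """Average incidence with separate mating, then with one random-mating pool."""
    before = (share_high * recessive_incidence(q_high)
              + (1.0 - share_high) * recessive_incidence(q_other))
    q_mixed = share_high * q_high + (1.0 - share_high) * q_other
    after = recessive_incidence(q_mixed)
    return before, after


# Assumed scenario: equal mixing with a group in which the allele is absent.
before, after = incidence_before_and_after_mixing(q_high=0.07, q_other=0.0, share_high=0.5)
print(before, after)
```

Under these assumptions the average incidence halves, because affected homozygotes scale with the square of the pooled allele frequency rather than with the average of the squares.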


Differences between allozymes and DNA?

• American oysters (Crassostrea virginica)

[Figure: two panels, allozymes and mtDNA, each with a vertical axis from 0 to 1 and sampling sites running from MA, SC and GA through FL to LA. Avise (1994)]


Differences between allozymes?

Checkerspot butterfly, Euphydryas editha (McKechnie et al. 1975)

| Locus | $F_{ST}$ |
|---|---|
| pgm | 0.028 |
| pgi | 0.052 |
| got | 0.017 |
| ak | 0.062 |
| bdh | 0.034 |
| α-gpdh | 0.027 |
| to | 0.035 |
| hk | 0.291 (unusually high differentiation) |

Problems with FST

• Arbitrary a priori choice of structure to test

• High sampling variance when polymorphism low

• Throws away much information


Population genetics models of structure

• Quantify relationship between genetic drift, selection and population differentiation

• Assumptions

– Infinite mainland population (island)
– Equal population size (n-island)
– Constant population size
– Proportion m of population replaced by migrants each generation
– Symmetric migration (n-island)

[Figure: schematics of the island model and the n-island model]


Identity by descent in the island model

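| Event | Same parent | Different parents | Migration event |
|---|---|---|---|
| Identity | 1 | $f_{t-1}$ | 0 |
| Probability | $1/2N_e$ | $1 - 1/2N_e - 2m$ | $2m$ |

At equilibrium

$$f = \frac{1}{1 + 4N_e m}$$

$4N_e m = 2 \times$ number of migrants per generation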

Only a few migrants each generation are required to prevent a build up of identity within the island population
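The approach to equilibrium can be checked by iterating the recursion implied by the three events: identity 1 with probability 1/2N_e, identity f_{t-1} with probability 1 − 1/2N_e − 2m, and identity 0 with probability 2m. The sketch below is illustrative: the function names and the parameter values N_e = 1000 and m = 0.001 are assumptions, not values from the slides.

```python
def ibd_trajectory(n_e, m, generations, f0=0.0):
    """Iterate f_t = 1/(2 N_e) + (1 - 1/(2 N_e) - 2 m) * f_{t-1}."""
    same_parent = 1.0 / (2.0 * n_e)
    different_parents = 1.0 - same_parent - 2.0 * m
    f = f0
    trajectory = [f]
    for _ in range(generations):
        f = same_parent + different_parents * f
        trajectory.append(f)
    return trajectory


def ibd_equilibrium(n_e, m):
    """Equilibrium identity by descent, 1 / (1 + 4 N_e m)."""
    return 1.0 / (1.0 + 4.0 * n_e * m)


# Assumed values: N_e = 1000 and m = 0.001, so 4 N_e m = 4.
trajectory = ibd_trajectory(n_e=1000, m=0.001, generations=10000)
print(trajectory[-1], ibd_equilibrium(1000, 0.001))
```

With 4N_e m = 4 the identity settles at 0.2, compared with 1 in the absence of migration.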


Relationship between FST and migration rate

• Can estimate scaled migration rate from estimated FST (assuming equilibrium, etc.)

$$E[F_{ST}] \approx \frac{1}{1 + 4N_e m}$$

[Figure: $N_e m$ (log scale, 0.01 to 100) plotted against $F_{ST}$ (0 to 1)]

E.g. in humans, $F_{ST} \approx 0.067$, giving $N_e m \approx 3.5$

NB: This is NOT a good estimator – do not trust the answer!
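The arithmetic behind the human example is just the equilibrium relationship rearranged, N_e m ≈ (1/F_ST − 1)/4. The sketch below shows that calculation only; as the warning above says, the result should not be read as a reliable estimate. The function name is hypothetical.

```python
def scaled_migration_from_fst(fst):
    """Invert E[F_ST] ~ 1 / (1 + 4 N_e m) to get N_e m (island model at equilibrium)."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must lie in (0, 1]")
    return (1.0 / fst - 1.0) / 4.0


print(round(scaled_migration_from_fst(0.067), 2))  # human SNP value from the slides
```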


Wright’s diffusion model for allele frequencies with migration

[Figure: probability density of the allele frequency on the island (0 to 1) when the allele frequency on the mainland is 0.5, for $4N_e m = 10$ and $4N_e m = 0.2$]

Mainland frequency = $x_m$; island frequency = $x$

Deterministic:

$$M_{\delta x} = m(x_m - x)$$

Drift:

$$V_{\delta x} = \frac{x(1 - x)}{2N_e}$$

Wright (1951)
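With this mean and variance of the change in allele frequency, the stationary density of the diffusion is a beta distribution with parameters 4N_e m x_m and 4N_e m (1 − x_m). The sketch below evaluates that density with the standard library. The function name `island_frequency_density` is hypothetical, and the two parameter settings are the ones named in the figure.

```python
import math


def island_frequency_density(x, four_nem, x_m):
    """Stationary density of island allele frequency x: Beta(4Nem*x_m, 4Nem*(1 - x_m))."""
    a = four_nem * x_m
    b = four_nem * (1.0 - x_m)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))


for four_nem in (10.0, 0.2):
    densities = [island_frequency_density(x, four_nem, 0.5) for x in (0.05, 0.5, 0.95)]
    print(four_nem, [round(d, 3) for d in densities])
```

With 4N_e m = 10 the density peaks at the mainland frequency. With 4N_e m = 0.2 it is U-shaped, so the island is usually close to loss or fixation.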


Example: SNP frequencies in African Americans

• Goddard et al. (2000)

– 114 SNPs in 33 genes
– 190 African Americans sampled

• Likelihood estimation of Nem from sample

– assume independence between SNPs
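One way such a likelihood could be built, offered as an assumption rather than as the method actually used for these data, is to treat the worldwide frequency as the mainland frequency, take the island frequency to follow the stationary beta distribution of the island model, and sample alleles binomially. Each SNP then contributes a beta-binomial term, and independence between SNPs lets the log-likelihoods be summed. The toy counts below are invented for illustration and are not the Goddard et al. data.

```python
import math


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_likelihood(four_nem, snps):
    """Sum of beta-binomial log-likelihoods over independent SNPs.

    snps holds (k, n, x_m): k copies of the allele among n sampled chromosomes,
    with reference (mainland) frequency x_m strictly between 0 and 1.
    """
    total = 0.0
    for k, n, x_m in snps:
        a = four_nem * x_m
        b = four_nem * (1.0 - x_m)
        log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        total += log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)
    return total


def grid_estimate(snps, grid):
    """Value of 4 N_e m on the grid with the highest log-likelihood."""
    return max(grid, key=lambda value: log_likelihood(value, snps))


# Hypothetical toy data: (copies, chromosomes sampled, reference frequency).
toy_snps = [(50, 100, 0.5), (20, 100, 0.2), (75, 100, 0.7)]
grid = [0.1, 1.0, 10.0, 100.0]
best = grid_estimate(toy_snps, grid)
print(best, [round(log_likelihood(v, toy_snps) - log_likelihood(best, toy_snps), 2) for v in grid])
```

The second printed list is the log-likelihood relative to its maximum on the grid, the same kind of quantity as a ΔLn(L) curve. The sketch is parameterised by 4N_e m, so divide by 4 to report N_e m.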

[Figure: African American frequency (0 to 1) plotted against worldwide frequency (0 to 1)]

[Figure: $\Delta \mathrm{Ln}(L)$ (0 to −50) plotted against $N_e m$ (0 to 15), with $N_e m = 0.5$ marked]


The coalescent in structured populations

• Two-island model

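[Figure: genealogy of lineages in Population 1 and Population 2]

With $n_i$ lineages in population $i$:

$$\Pr\{\text{coalescence}\} = \frac{n_i(n_i - 1)}{4N_e}$$

$$\Pr\{\text{migration}\} = m n_i$$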


The time to coalescence for two sequences sampled from the same population

Pr{1st event is a coalescence}

$$\frac{1/2N_e}{1/2N_e + 2m} = \frac{1}{1 + 4N_e m}$$

Pr{1st event is a migration}

$$\frac{2m}{1/2N_e + 2m} = \frac{4N_e m}{1 + 4N_e m}$$

Expected time to coalescence = $4N_e$

Two populations of size $N_e$ ≡ one population of size $2N_e$ for expected pairwise diversity (within population)

BUT

Variance affected by population structure

[Figure: distribution of average pairwise differences (0 to 24), subdivided ($4N_e m = 0.2$) versus single population]
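A small simulation can check the expected coalescence time for two sequences sampled from the same population. The sketch below assumes the two-population model with continuous-time (exponential) waiting times: a coalescence rate of 1/2N_e while the lineages share a population, and a total migration rate of 2m. The function names and the values N_e = 1000 and m = 0.00025 are illustrative assumptions.

```python
import random


def coalescence_time_same_population(n_e, m, rng):
    """Generations until two lineages sampled from one population coalesce."""
    coal_rate = 1.0 / (2.0 * n_e)  # n_i (n_i - 1) / (4 N_e) with n_i = 2
    mig_rate = 2.0 * m             # each of the two lineages migrates at rate m
    t = 0.0
    together = True
    while True:
        if together:
            total = coal_rate + mig_rate
            t += rng.expovariate(total)
            if rng.random() < coal_rate / total:
                return t           # first event was a coalescence
            together = False       # first event was a migration
        else:
            t += rng.expovariate(mig_rate)
            together = True        # a migration reunites the lineages


def mean_coalescence_time(n_e, m, reps=20000, seed=1):
    """Monte Carlo mean of the coalescence time over reps replicates."""
    rng = random.Random(seed)
    return sum(coalescence_time_same_population(n_e, m, rng) for _ in range(reps)) / reps


n_e, m = 1000, 0.00025  # assumed values giving 4 N_e m = 1
print(mean_coalescence_time(n_e, m) / (4 * n_e))  # ratio to the expected 4 N_e
```

The mean stays close to 4N_e whatever the migration rate, while the spread of individual times grows as m falls.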


Effect on allele frequency spectrum

[Figure: genealogy showing rapid coalescence within population, slow coalescence between populations, and a mutation at high frequency]

[Figure: frequency of derived allele (1 to 19), subdivided ($4N_e m = 0.1$) versus single population]


Effect on neutrality statistics within populations

• Tajima’s D statistic

• Fu and Li D statistic

[Figures: distributions of the two statistics (horizontal axes −4 to 4 and −4.5 to 3), subdivided ($4N_e m = 0.2$) versus single population]

Main effect is to increase the variance

Other statistics (e.g. Fay and Wu, 2000) more sensitive


Effect on polymorphism between populations

• Tajima’s D statistic

[Figure: distribution of Tajima’s D (−4 to 4), subdivided ($4N_e m = 0.2$) versus single population]

• Frequency distribution

[Figure: frequency distribution (1 to 19), subdivided ($4N_e m = 0.2$) versus single population]


Effect on linkage disequilibrium

• Linkage disequilibrium measures correlations between alleles at different loci

• Population structure increases linkage disequilibrium between linked loci

• Population structure creates linkage disequilibrium between unlinked loci in different populations

[Figure: distribution of $r^2$ (0.05 to 0.95) for $4N_e r = 1$, subdivided ($4N_e m = 0.1$) versus single population]

$$D = f_{AB} - f_A f_B$$

Admixture

Population 1: $f_A = 0.2$, $f_B = 0.8$, $D = 0$

Population 2: $f_A = 0.8$, $f_B = 0.2$, $D = 0$

Naive analysis: $D = 0.09$


Admixture dynamics

• Combination of two previously separated populations

• Over time random mating returns population to equilibrium

• Disequilibrium between unlinked loci can persist for several generations, while Hardy-Weinberg equilibrium is restored after a single generation of random mating

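$$D_t = D_0 (1 - r)^t$$

$$D_0 = \frac{1}{4} \delta_A \delta_B$$

$$f_{A1} - f_{A2} = \delta_A, \qquad f_{B1} - f_{B2} = \delta_B$$

[Figure: $D_t / D_0$ against generation (0 to 10), for unlinked loci and for loci 1 cM apart]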
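The disequilibrium created by admixture and its decay can be worked through numerically. The sketch below assumes the two populations are mixed in equal proportions (the case in which D_0 = δ_A δ_B / 4) and that each source population has no disequilibrium of its own. The function names are hypothetical, and r = 0.5 and r = 0.01 stand for unlinked loci and loci 1 cM apart.

```python
def admixture_ld(f_a1, f_a2, f_b1, f_b2):
    """D = f_AB - f_A f_B in an equal mixture of two populations with D = 0."""
    f_ab = 0.5 * (f_a1 * f_b1 + f_a2 * f_b2)
    f_a = 0.5 * (f_a1 + f_a2)
    f_b = 0.5 * (f_b1 + f_b2)
    return f_ab - f_a * f_b


def ld_after(d0, r, t):
    """D_t = D_0 (1 - r)^t after t generations of random mating."""
    return d0 * (1.0 - r) ** t


d0 = admixture_ld(0.2, 0.8, 0.8, 0.2)
print(d0, 0.25 * (0.2 - 0.8) * (0.8 - 0.2))  # direct calculation and delta formula
for r in (0.5, 0.01):                         # unlinked, and about 1 cM
    print(r, [round(ld_after(d0, r, t) / d0, 3) for t in (0, 2, 4, 6, 8, 10)])
```

With frequencies of 0.2 and 0.8 the disequilibrium has magnitude 0.09, and its sign depends only on which allele at each locus is labelled A or B. For unlinked loci it halves each generation, whereas at 1 cM about 90% remains after ten generations.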


Selection in a subdivided population

• Maruyama (1970)

– The fixation probability of an unconditionally beneficial mutation is unaffected by population structure (Pfix ≈ 2s)

• Levene (1953)

– Environmental heterogeneity can maintain genetic polymorphism

• BUT

– If migration high, selection has to be strong and finely balanced to habitat frequencies to maintain polymorphism

• Low migration rates can promote local adaptation

– Heavy metal tolerance in plants
– Melanism in the peppered moth
– Milk tolerance in humans


Selection at different scales

• Evidence for local adaptation from gradients in allele frequency: clines

• Continental clines in Adh activity and allozyme variation in Drosophila

• Clines in genetic and morphological characters in the toad Bombina

Driven by scale of environmental heterogeneity

Balance between selection against hybrids and migration, following secondary contact

[Figure: frequency (0 to 0.6) against latitude (22 to 47) for F/S and ∇1. Berry & Kreitman (1993)]

[Figure: frequency of B. variegata characters (0 to 1), morphological and genetic, against distance (√km, −10 to 10). Szymura & Barton (1991)]


Indirect evidence for local adaptation?

• Local hitch-hiking?

• But the structured coalescent also leads to variation in coalescence times

[Figure: microsatellite diversity by locus in samples from India, Zimbabwe, China and the Antilles. Schlötterer et al. (1997)]


The interaction between selection, gene flow and genetic drift

• Wright’s Shifting Balance theory

• Epistasis between alleles at different loci

• The adaptive landscape

– Epistasis creates adaptive valleys between peaks of fitness

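[Figure: adaptive landscape of population fitness against frequency of allele A and frequency of allele B, showing the adaptive valley and the starting point of the population]

[Figure: fitness of two-locus genotypes, Locus 1 (AA, Aa, aa) by Locus 2 (BB, Bb, bb), shaded from least fit to most fit]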


The Shifting Balance theory

• Drift allows population to cross adaptive valley due to stochastic processes in finite populations

• Evidence for widespread epistasis?

– F2 hybrid breakdown

– Coadapted gene complexes

• Theoretical issues

– Very difficult for a subpopulation that has crossed a valley to spread its new genotype throughout the rest of the population

– The interaction between epistatic selection and genetic drift may be important in reproductive isolation

• e.g. recessive epistatic interactions important in Haldane’s rule of unisexual hybrid sterility

Subpopulations are natural experiments, allowing species to evolve across complex adaptive landscapes


Future directions

• Theoretical and statistical issues

– Methods for discriminating between local adaptation and chance effects of coalescence in a structured population

– The relationship between population structure and linkage disequilibrium

– Selection on polygenic traits in subdivided populations

• Empirical challenges

– Describing patterns of gene diversity at many loci across genomes (from a well-chosen sample)

– Comparing differentiation for different types of mutation (e.g. silent v replacement)

– Mapping genetic variation to phenotypic variation

