Population Genetics 2:
Linkage disequilibrium
Genotype frequencies in a population

A gene: fAA = p², fAa = 2pq, faa = q², with p + q = 1
B gene: fBB = x², fBb = 2xy, fbb = y², with x + y = 1
Consider two loci and 1 generation of random mating:
A gene: AA, Aa, and aa
B gene: BB, Bb, and bb
Random association of alleles at a single locus: HWE
What about random association of alleles at different loci after random mating?
Random association in gametes

                              Alleles at A locus
                              A (p)        a (q)
Alleles at B locus   B (x)    AB (px)      aB (qx)
                     b (y)    Ab (py)      ab (qy)

Remember: p + q = 1 and x + y = 1
Consider two loci and 1 generation of random mating:
Random association of alleles at different loci:
LINKAGE EQUILIBRIUM
GAMETIC PHASE EQUILIBRIUM
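A minimal Python sketch of the gamete table under linkage equilibrium, using the slide's symbols (p and q for alleles A and a, x and y for alleles B and b). The function name `equilibrium_gametes` and the allele frequencies in the usage lines are illustrative assumptions, not part of the lecture.

```python
def equilibrium_gametes(p, x):
    """Expected gamete frequencies when alleles at the two loci associate at random."""
    q = 1.0 - p  # frequency of allele a
    y = 1.0 - x  # frequency of allele b
    return {"AB": p * x, "Ab": p * y, "aB": q * x, "ab": q * y}


# Allele frequencies chosen only for illustration.
freqs = equilibrium_gametes(p=0.6, x=0.3)
for gamete, f in freqs.items():
    print(f"{gamete}: {f:.2f}")
print("total:", round(sum(freqs.values()), 10))
```

The four products always sum to 1 because (p + q)(x + y) = 1.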
Consider two loci and 1 generation of random mating:
Surprisingly common result:
Gene A: HWE
Gene B: HWE
Gene A + Gene B: disequilibrium
Consider two loci (on different chromosomes) and 1 generation of random mating:

Example:
Population 1: 100% AABB
Population 2: 100% aabb
Mix populations equally: 50% AABB + 50% aabb

1 generation of random mating (only three matings possible):
AABB x AABB = AABB
aabb x aabb = aabb
AABB x aabb = AaBb

Nine genotypes are possible:
Observed: AABB aabb AaBb
Missing: AaBB aaBB aaBb AABb AAbb Aabb

We only see 3 of the 9 genotypes (1/3) after 1 generation of random mating!

The population did not reach equilibrium after one generation of random mating. With continued random mating the “missing” genotypes would appear, but not immediately at their equilibrium frequencies!
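The admixture example can be checked with a short Python sketch. It assumes random union of gametes: the mixed population (50% AABB + 50% aabb) produces only AB and ab gametes, each at frequency 0.5. The helper names `genotype_label` and `random_mating` are illustrative.

```python
from collections import defaultdict
from itertools import product


def genotype_label(g1, g2):
    """Combine two gametes such as 'AB' and 'ab' into a genotype label such as 'AaBb'."""
    locus_a = "".join(sorted(g1[0] + g2[0]))  # uppercase sorts before lowercase
    locus_b = "".join(sorted(g1[1] + g2[1]))
    return locus_a + locus_b


def random_mating(gamete_freqs):
    """Offspring genotype frequencies from random union of gametes."""
    offspring = defaultdict(float)
    for (g1, f1), (g2, f2) in product(gamete_freqs.items(), repeat=2):
        offspring[genotype_label(g1, g2)] += f1 * f2
    return dict(offspring)


# Gamete pool of the mixed population: AABB gives only AB, aabb gives only ab.
gametes = {"AB": 0.5, "ab": 0.5}
offspring = random_mating(gametes)
for genotype, f in sorted(offspring.items()):
    print(genotype, f)
print(len(offspring), "of 9 possible genotypes observed")
```

Only AABB, AaBb and aabb appear, because no Ab or aB gametes exist until the AaBb offspring themselves reproduce.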
Consider two loci and 1 generation of random mating:
- attainment of linkage equilibrium is gradual
- for unlinked loci (r = 0.5), about 50% of the disequilibrium “breaks down” per generation
- linkage disequilibrium (LD) persists in populations for many generations
- LD = gametic phase disequilibrium
LD in individuals (BIOL 2030 stuff):
Case 1: AB gamete + ab gamete = AaBb
Case 2: Ab gamete + aB gamete = AaBb
We need a new symbolism.
New symbolism: AB/ab indicates the union of AB gamete + ab gamete
LD in individuals: Let’s take an AB/ab individual as an example:
What types of gametes can the AB/ab individual make?
Parental or non-recombinant gametes: (1) AB and (2) ab
Non-parental or recombinant gametes: (3) Ab and (4) aB
By the way, let’s assume physical linkage.
A B
a b
Physical linkage:
Notation = AB/ab
Review meiosis in a single meiocyte:
Parental configuration

Gametes                        Notation
Parental (non-recombinant)     A B
Parental (non-recombinant)     a b
Recombinant                    A b
Recombinant                    a B

1 crossing-over event: 50% parental and 50% recombinant!
LD in individuals:
A B
a b
Physical linkage:
Notation = AB/ab
When genes are on the same chromosome:
fAB = fab ≥ fAb = faB
f (non-recombinant) ≥ f (recombinant)
Recombination fraction (r) is the proportion of recombinant gametes produced by an individual.
When r = 0: fAB + fab = 100% [fAb + faB = 0%]
When r = 0.5: fAB + fab = 50% [fAb + faB = 50%]
From: P. J. Russell (2002), iGenetics, page 350
LD in individuals:
When genes are on different chromosomes:
fAB = fab = fAb = faB
f (non-recombinant) = f (recombinant)
Unlinked genes: A / a on one chromosome pair, B / b on another
r = 0.5:
• when genes are on different chromosomes
• when genes are on the same chromosome and recombination is high enough for independent assortment
LD in individuals:
Individual AB/ab produces the following:
(1) AB: fAB = 0.38
(2) ab: fab = 0.38
(3) Ab: fAb = 0.12
(4) aB: faB = 0.12
r = 0.12 + 0.12 = 0.24
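The relationship between r and the four gamete frequencies of an AB/ab individual can be sketched in Python. It assumes the symmetry stated earlier (fAB = fab and fAb = faB), so each recombinant class has frequency r/2 and each parental class (1 − r)/2. The function name is illustrative.

```python
def gametes_from_double_heterozygote(r):
    """Gamete frequencies produced by an AB/ab individual with recombination fraction r."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    parental = (1.0 - r) / 2.0   # each of AB and ab
    recombinant = r / 2.0        # each of Ab and aB
    return {"AB": parental, "ab": parental, "Ab": recombinant, "aB": recombinant}


# r = 0 (complete linkage), r = 0.24 (the slide's example), r = 0.5 (unlinked)
for r in (0.0, 0.24, 0.5):
    print(r, gametes_from_double_heterozygote(r))
```

With r = 0.24 the sketch returns the slide's values (0.38, 0.38, 0.12, 0.12), and with r = 0.5 all four gametes are equally frequent.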
What do we expect for individuals if allelic association is random?
What do we expect in a population if allelic association is random?
LD in populations:
Random association in gametes

                              Alleles at A locus
                              A (p)        a (q)
Alleles at B locus   B (x)    AB (px)      aB (qx)
                     b (y)    Ab (py)      ab (qy)

Remember: p + q = 1 and x + y = 1
fAB = px, fab = qy, fAb = py, faB = qx
fAB + fab + fAb + faB = 1
LD in populations: The frequency of an AB gamete in a population has two sources:
• Some individuals have this configuration (A B): non-recombinant [parental]
• Some individuals produce this as a recombinant configuration.

$$f'_{AB} = (1 - r)\,f_{AB} + r\,(px)$$

Non-recombinants, $(1 - r)\,f_{AB}$: $(1 - r)$ is the probability of no recombination and $f_{AB}$ is the frequency of AB gametes in the last generation.
Recombinants, $r\,(px)$: $r$ is the probability of recombination and $px$ is the probability of putting A and B together at random from recombinants.
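The recursion f'AB = (1 − r) fAB + r(px) can be iterated in a few lines of Python. This is a sketch under stated assumptions: the starting values come from the earlier admixture example (gamete pool of 50% AB and 50% ab, so p = x = 0.5 and fAB = 0.5), the loci are unlinked (r = 0.5), and allele frequencies stay constant. The function name `next_f_AB` is illustrative.

```python
def next_f_AB(f_AB, p, x, r):
    """One generation of random mating: f'AB = (1 - r) * fAB + r * (p * x)."""
    return (1.0 - r) * f_AB + r * p * x


p, x = 0.5, 0.5   # allele frequencies of A and B in the mixed population
r = 0.5           # unlinked loci
f_AB = 0.5        # starting frequency of AB gametes

history = [f_AB]
for generation in range(1, 6):
    f_AB = next_f_AB(f_AB, p, x, r)
    history.append(f_AB)
    print(f"generation {generation}: fAB = {f_AB:.4f}, fAB - px = {f_AB - p * x:.4f}")
```

The excess of AB gametes over px halves each generation (0.25, 0.125, 0.0625, ...), so fAB approaches px = 0.25 gradually rather than in one step.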
LD in populations:
$$f'_{AB} = (1 - r)\,f_{AB} + r\,(px)$$

Observed − expected:

$$f'_{AB} - px = (1 - r)\,(f_{AB} - px)$$

$$D = (1 - r)\,(f_{AB} - px)$$

D = the linkage disequilibrium parameter

In excess due to LD: fAB = px + D, fab = qy + D
Deficient due to LD: fAb = py − D, faB = qx − D
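A short Python sketch of the four gamete frequencies written in terms of D. The allele frequencies and the value of D in the usage lines are hypothetical, chosen only to illustrate that adding D to AB and ab while subtracting it from Ab and aB leaves the allele frequencies unchanged. The function name `gametes_with_D` is an assumption.

```python
def gametes_with_D(p, x, D):
    """Gamete frequencies for allele frequencies p (A), x (B) and disequilibrium D."""
    q, y = 1.0 - p, 1.0 - x
    return {
        "AB": p * x + D,  # in excess due to LD
        "ab": q * y + D,  # in excess due to LD
        "Ab": p * y - D,  # deficient due to LD
        "aB": q * x - D,  # deficient due to LD
    }


# Hypothetical values for illustration only.
g = gametes_with_D(p=0.6, x=0.3, D=0.05)
freq_A = g["AB"] + g["Ab"]   # recovers p
freq_B = g["AB"] + g["aB"]   # recovers x
total = sum(g.values())
print(g)
print(freq_A, freq_B, total)
```

D therefore changes how alleles are packaged into gametes without changing how common each allele is.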
Remember:
Individual AB/ab produces the following:
(1) AB: fAB = 0.38
(2) ab: fab = 0.38
(3) Ab: fAb = 0.12
(4) aB: faB = 0.12
r = 0.12 + 0.12 = 0.24
What do we expect for an individual if association is random?
What do we expect in a population if association is random?
Forces that increase D in populations:
1. Migration
2. Natural selection
3. Genetic drift
LD in populations:
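$$D = f_{AB} \times f_{ab} - f_{Ab} \times f_{aB}$$

Here $f_{AB} \times f_{ab}$ is the pair in excess due to LD and $f_{Ab} \times f_{aB}$ is the pair deficient due to LD.

• Comparing D among populations is difficult.
• Standardize D as a fraction of the theoretical maximum for the population: $D / D_{max}$

$$D_{max} = qx \ \text{or} \ py \quad \text{(whichever is smaller)}$$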
LD in a population:
Example: blood group polymorphism in a sample of 1000 British people

MN blood group: fM = p = 0.5425, fN = q = 0.4575
Ss blood group: fS = x = 0.3080, fs = y = 0.6920

Gamete   Observed frequency     Expected        Obs. − exp.
MS       474/2000 = 0.2370      px = 0.1671     (+)
Ms       611/2000 = 0.3055      py = 0.3754     (−)
NS       142/2000 = 0.0710      qx = 0.1409     (−)
Ns       773/2000 = 0.3865      qy = 0.3166     (+)

$$D = f_{MS} \times f_{Ns} - f_{Ms} \times f_{NS}$$

(first product: non-recombinant; second product: recombinant)

D = (0.2370)(0.3865) − (0.3055)(0.0710) = 0.07

Dmax = qx or py (whichever is smaller): qx = 0.14 or py = 0.38; so Dmax = 0.14
D is (0.07/0.14)*100 = 50% of the theoretical maximum
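The arithmetic of this example can be reproduced with a small Python sketch. It uses only the gamete counts given on the slide (474, 611, 142, 773 out of 2000); the variable names and the use of uppercase N for the second MN allele are choices made here for readability.

```python
# Observed gamete counts from the slide (2000 gametes from 1000 people).
counts = {"MS": 474, "Ms": 611, "NS": 142, "Ns": 773}
n = sum(counts.values())
f = {gamete: c / n for gamete, c in counts.items()}

p = f["MS"] + f["Ms"]   # frequency of M
x = f["MS"] + f["NS"]   # frequency of S
q, y = 1.0 - p, 1.0 - x

# D as the difference of the two gamete-frequency products.
D = f["MS"] * f["Ns"] - f["Ms"] * f["NS"]
# Theoretical maximum for positive D: the smaller of qx and py.
D_max = min(q * x, p * y)

print(f"p = {p:.4f}, x = {x:.4f}")
print(f"D = {D:.4f}, Dmax = {D_max:.4f}, D/Dmax = {D / D_max:.2f}")
```

The same D is obtained as observed minus expected for the MS gamete (fMS − px), which is a quick consistency check on the two definitions.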
Homework:
Genotype counts in the population:
MN locus: MM = 298, MN = 489, NN = 213
Ss locus: SS = 99, Ss = 418, ss = 483
Gametes: MS = 474, Ms = 611, NS = 142, Ns = 773

Use a chi-square test to:
1. determine if each locus is in HWE
2. determine if gamete frequencies are in equilibrium
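One way to set up the two chi-square tests is sketched below in plain Python, as a starting point rather than a model answer. The function names are illustrative, the HWE test is shown for the MN locus only, and both tests are compared against 3.841, the chi-square critical value for 1 degree of freedom at the 0.05 level (3 genotype classes − 1 − 1 estimated allele frequency; 4 gamete classes − 1 − 2 estimated allele frequencies).

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Chi-square for Hardy-Weinberg proportions at one locus with two alleles."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # allele frequency estimated from the counts
    q = 1.0 - p
    expected = [p * p * n, 2 * p * q * n, q * q * n]
    return chi_square([n_hom1, n_het, n_hom2], expected)


def gametic_chi_square(n_AB, n_Ab, n_aB, n_ab):
    """Chi-square for random association of alleles in gametes (D = 0)."""
    n = n_AB + n_Ab + n_aB + n_ab
    p = (n_AB + n_Ab) / n  # frequency of the first allele at locus 1
    x = (n_AB + n_aB) / n  # frequency of the first allele at locus 2
    expected = [p * x * n, p * (1 - x) * n, (1 - p) * x * n, (1 - p) * (1 - x) * n]
    return chi_square([n_AB, n_Ab, n_aB, n_ab], expected)


CRITICAL_1DF = 3.841  # chi-square critical value, 1 df, alpha = 0.05

chi2_mn = hwe_chi_square(298, 489, 213)               # MM, MN, NN
chi2_gametes = gametic_chi_square(474, 611, 142, 773)  # MS, Ms, NS, Ns

print(f"MN locus: chi-square = {chi2_mn:.3f}, reject HWE: {chi2_mn > CRITICAL_1DF}")
print(f"Gametes:  chi-square = {chi2_gametes:.1f}, reject equilibrium: {chi2_gametes > CRITICAL_1DF}")
```

The same `hwe_chi_square` call can be applied to the Ss genotype counts.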
Recombination reduces LD:
[Figure: Rate of decay of LD under various recombination rates. x-axis: generations (1 to 97); y-axis: standardized disequilibrium, D/Dmax (0 to 1); one curve each for r = 0.001, r = 0.01, r = 0.1 and r = 0.5.]
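The decay curves can be approximated with a closed form that follows from losing a fraction r of the disequilibrium each generation: after t generations, (1 − r)^t of the starting value remains. The Python sketch below assumes D/Dmax starts at 1 and that allele frequencies (and so Dmax) stay constant; the function names are illustrative.

```python
import math


def standardized_ld(t, r, start=1.0):
    """D/Dmax remaining after t generations of random mating."""
    return start * (1.0 - r) ** t


def generations_to_fraction(fraction, r):
    """Generations needed for D to fall to `fraction` of its starting value."""
    return math.log(fraction) / math.log(1.0 - r)


for r in (0.001, 0.01, 0.1, 0.5):
    remaining = standardized_ld(100, r)
    half_life = generations_to_fraction(0.5, r)
    print(f"r = {r}: D/Dmax after 100 generations = {remaining:.4f}, "
          f"generations to halve = {half_life:.1f}")
```

Unlinked loci (r = 0.5) halve their disequilibrium every generation, while tightly linked loci (r = 0.001) retain most of it even after 100 generations.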
Recombination reduces LD:
“Hitchhiking” of a mutator gene with and without recombination
[Figure: two panels, “No recombination” (r = 0) and “Recombination” (r = 0.5). Legend: mutator allele that increases the mutation rate; beneficial allele subject to strong positive selection.]
Adapted from Sniegowski et al. (2000) BioEssays 22:1057-1066.
Mapping disease genes:
1. Family studies
• Uses family pedigrees
• Co-segregation of disease and a marker on the pedigree
2. Allelic association studies (LD mapping)
• Uses population data
• Relies on strong LD only among closely linked loci
• Sample affected and unaffected individuals
• Very large samples are required!
• Look for markers with more LD in affected individuals
Both approaches have powers and pitfalls.
Mapping disease genes:
r = 0.01 (1 centiMorgan) is about 1 million bp in humans
Female-limited Batesian mimicry
[Figure labels: bitter tasting; tasty mimics (Papilio memnon females); male]
5-locus linkage group (supergene):
• Tail length
• Hind-wing pattern
• Forewing pattern
• Epaulet color
• Body color
Selection: Only certain complex color morphs provide gains in fitness
LD: Few maladaptive patterns produced per generation due to linkage

The MHC locus on human chromosome 6
[Figure labels: Class I, Class III, Class II]
LD over 10s-100s million of bp due to selection for combinations of loci
Sexual reproduction reduces LD
Linkage disequilibrium: Keynotes
• Attainment of equilibrium at different loci is gradual; it takes > 1 generation of random mating.
• Physical linkage slows the approach to equilibrium even more!
• “r” determines the rate of approach to equilibrium: the lower the recombination fraction, the longer it takes to reach equilibrium.
• When r = 0.5 the loci are said to be unlinked; such loci are very far apart on the same chromosome, or on different chromosomes. When r < 0.5 the genes are said to be linked. When r = 0 the loci are in permanent disequilibrium.
• Disequilibrium can arise from sources other than linkage:
  o Admixture of populations
  o Natural selection acting on one or more of the loci
  o Inbreeding in plants that regularly undergo self-fertilization
  o Genes located in a chromosomal inversion (SUPERGENE)
• The term LINKAGE DISEQUILIBRIUM is used to describe any source of disequilibrium, regardless of whether the two genes are physically linked or not.