9/17/18
Hardy-Weinberg Equilibrium
Gregor Mendel (1822-1884)
G. H. Hardy (1877-1947)
Wilhelm Weinberg (1862-1937)
Lectures 4-11: Mechanisms of Evolution (Microevolution)
• Hardy Weinberg Principle (Mendelian Inheritance)
• Genetic Drift
• Mutation
• Sex: Recombination and Random Mating
• Epigenetic Inheritance
• Natural Selection
These are mechanisms acting WITHIN populations,
hence called “population genetics”—EXCEPT for epigenetic modifications, which act on individuals
in a Lamarckian manner
Evolution acts through changes in allele frequency at each generation
Leads to average change in characteristic of the population
Recall from Previous Lectures (History of Evolutionary Thought): Darwin’s Observation
HOWEVER, Darwin did not understand how genetic variation was passed on from generation to generation
Gregor Mendel, “Father of Modern Genetics”
• Mendel presented a mechanism for how traits got passed on
“Individuals pass alleles on to their offspring intact”
(the idea of particulate inheritance, i.e. genes)
Gregor Mendel
(1822-1884)
http://www.biography.com/people/gregor-mendel-39282#synopsis
Mendel’s Laws of Inheritance
• Law of Segregation: only one allele passes from each parent on to an offspring
• Law of Independent Assortment: different pairs of alleles are passed to offspring independently of each other
Gregor Mendel
• In cross-pollinating plants with either yellow or green peas, Mendel found that the first generation (F1) always had yellow seeds (dominance). However, the following generation (F2) consistently had a 3:1 ratio of yellow to green.
Using 29,000 pea plants, Mendel discovered the 3:1 ratio of phenotypes, due to dominant vs. recessive alleles
• Mendel uncovered the underlying mechanism, that there are dominant and recessive alleles
• Mathematical description of Mendelian inheritance
Hardy-Weinberg Principle
Godfrey Hardy (1877-1947), Wilhelm Weinberg (1862-1937)
Testing for Hardy-Weinberg equilibrium can be used to assess whether a population is
evolving
The Hardy-Weinberg Principle
• A population that is not evolving shows allele and genotypic frequencies that are in Hardy Weinberg equilibrium
• If a population is not in Hardy-Weinberg equilibrium, it can be concluded that the population is evolving
Evolutionary Mechanisms (will put population out of HW Equilibrium):
• Genetic Drift
• Natural Selection
• Mutation
• Migration
*Epigenetic modifications change expression of alleles but not the frequency of alleles themselves, so they won’t affect the actual inheritance of alleles
However, if you count the phenotype frequencies, and not the genotype frequencies, you might see phenotypic frequencies out of HW Equilibrium due to epigenetic silencing of alleles (epigenetic modifications can change phenotype, not genotype).
Requirements of HW         Violation (Evolution)
Large population size      Genetic drift
Random mating              Inbreeding & other
No mutations               Mutations
No natural selection       Natural selection
No migration               Migration
An evolving population is one that violates Hardy-Weinberg assumptions
[Fig. 23-5a: Map of the Porcupine herd range and the Fortymile herd range (map area: Alaska, Yukon, Northwest Territories, Canada; Beaufort Sea)]
• What is a “population”? A group of individuals within a species that is capable of interbreeding and producing fertile offspring (definition for sexual species)
Patterns of inheritance should always be in “Hardy Weinberg Equilibrium”
Following the transmission rules of Mendel
In the absence of Evolution…
Hardy-Weinberg Equilibrium
• According to the Hardy-Weinberg principle, frequencies of alleles and genotypes in a population remain constant from generation to generation
• Also, the genotype frequencies you see in a population should be the Hardy-Weinberg expectations, given the allele frequencies
“Null Model”
• No Evolution: the null model to test whether evolution is happening should simply be a population in Hardy-Weinberg Equilibrium
• No Selection: the null model to test whether Natural Selection is occurring should have no selection, but should include Genetic Drift
– This is because Genetic Drift is operating even when there is no Natural Selection
Example: Is this population in Hardy Weinberg Equilibrium?
               AA     Aa     aa
Generation 1   0.25   0.50   0.25
Generation 2   0.20   0.60   0.20
Generation 3   0.10   0.80   0.10
Hardy-Weinberg Theorem
In a non-evolving population, frequency of alleles and genotypes remain constant over generations
You should be able to predict the genotype frequencies, given the allele frequencies
Important concepts
• gene: A region of genome sequence (DNA or RNA) that is the unit of inheritance, the product of which contributes to phenotype
• locus: Location in a genome (used interchangeably with “gene” if the location is at a gene, but a locus can be anywhere, so its meaning is broader than gene)
• loci: Plural of locus
• allele: Variant forms of a gene (e.g. alleles for different eye colors, BRCA1 breast cancer allele, etc.)
• genotype: The combination of alleles at a locus (gene)
• phenotype: The expression of a trait, as a result of the genotype and regulation of genes (green eyes, brown hair, body size, finger length, cystic fibrosis, etc.)
• We are diploid (two copies of each chromosome), so we have 2 alleles at a locus (any location in the genome)
• However, there can be many alleles at a locus in a population.
– For example, you might have inherited a blue eye allele from your mom and a brown eye allele from your dad. You can’t have more alleles than that (only two copies of each chromosome, one from each parent)
– BUT, there could be many alleles at this locus in the population: blue, green, grey, brown, etc.
• Alleles in a population of diploid organisms
[Figure: alleles A1, A2, A3, A4 carried in sperm and eggs combine through random mating (sex) to form zygote genotypes such as A1A1, A1A3, A2A1, A2A4, A3A1, A3A4]
So then can we predict the % of alleles and genotypes in the population at each generation?
Hardy-Weinberg Theorem
In a non-evolving population, frequency of alleles and genotypes remain constant over generations
[Fig. 23-6: Frequencies of alleles in the population and gametes produced. p = frequency of CR allele = 0.8; q = frequency of CW allele = 0.2. Each egg and each sperm has an 80% chance of carrying CR and a 20% chance of carrying CW.]
Hardy-Weinberg proportions indicate the expected allele and genotype frequencies, given the starting frequencies
• By convention, if there are 2 alleles at a locus, p and q are used to represent their frequencies
• The frequency of all alleles in a population will add up to 1
– For example, p + q = 1
If p and q represent the relative frequencies of the only two possible alleles in a population at a particular locus, then for a diploid organism (two copies of each chromosome),
$(p + q)^2 = p^2 + 2pq + q^2 = 1$
– where $p^2$ and $q^2$ represent the frequencies of the homozygous genotypes and $2pq$ represents the frequency of the heterozygous genotype
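The expansion above can be sketched in a few lines of Python. This is only an illustrative helper (the function name `hw_genotype_frequencies` is made up for this example), assuming one locus with exactly two alleles in a diploid population:

```python
def hw_genotype_frequencies(p):
    """Expected (AA, Aa, aa) frequencies, given the frequency p of allele A.

    Assumes one locus, two alleles, diploid organisms (hypothetical helper).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p                     # p + q = 1
    return p * p, 2 * p * q, q * q  # p^2, 2pq, q^2


f_AA, f_Aa, f_aa = hw_genotype_frequencies(0.8)
print(round(f_AA, 2), round(f_Aa, 2), round(f_aa, 2))  # 0.64 0.32 0.04
```

The three values always sum to 1, because they are the terms of $(p + q)^2$.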
What about for a triploid organism?
• $(p + q)^3 = p^3 + 3p^2q + 3pq^2 + q^3 = 1$
Potential offspring: ppp, ppq, pqp, qpp, qqp, pqq, qpq, qqq
How about tetraploid? You work it out.
Hardy-Weinberg Theorem
ALLELES
Probability of A = p
Probability of a = q
$p + q = 1$
GENOTYPES
AA: $p \times p = p^2$
Aa: $p \times q + q \times p = 2pq$
aa: $q \times q = q^2$
$p^2 + 2pq + q^2 = 1$
More General HW Equations
• One locus, three alleles: $(p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr$
• One locus, n alleles: $(p_1 + p_2 + \dots + p_n)^2 = p_1^2 + p_2^2 + \dots + p_n^2 + 2p_1p_2 + 2p_1p_3 + 2p_2p_3 + \dots + 2p_{n-1}p_n$
• For a polyploid (more than two sets of chromosomes): $(p + q)^c$, where c = number of chromosome sets (ploidy)
• If multiple loci (genes) code for a trait, each locus follows the HW principle independently, and then the alleles at each locus interact to influence the trait
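The multi-allele expansion can be sketched as follows for a diploid locus. The function name `hw_genotypes` and the example frequencies (0.5, 0.3, 0.2) are assumptions chosen for illustration, not values from the lecture:

```python
from itertools import combinations_with_replacement


def hw_genotypes(allele_freqs):
    """Expected diploid genotype frequencies for any number of alleles.

    allele_freqs: dict mapping allele name -> frequency (must sum to 1).
    Homozygotes get p_i^2, heterozygotes get 2 * p_i * p_j.
    """
    if abs(sum(allele_freqs.values()) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    expected = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        product = allele_freqs[a] * allele_freqs[b]
        expected[a + b] = product if a == b else 2 * product
    return expected


# Hypothetical three-allele locus
three = hw_genotypes({"A1": 0.5, "A2": 0.3, "A3": 0.2})
for genotype, freq in three.items():
    print(genotype, round(freq, 3))
```

With n alleles there are n(n + 1)/2 genotype classes, so three alleles give six genotypes whose frequencies sum to 1.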
ALLELE Frequencies
Frequency of A = p = 0.8
Frequency of a = q = 0.2
$p + q = 1$
Expected GENOTYPE Frequencies
AA: $p \times p = p^2 = 0.8 \times 0.8 = 0.64$
Aa: $p \times q + q \times p = 2pq = 2 \times (0.8 \times 0.2) = 0.32$
aa: $q \times q = q^2 = 0.2 \times 0.2 = 0.04$
$p^2 + 2pq + q^2 = 0.64 + 0.32 + 0.04 = 1$
Expected Allele Frequencies at 2nd Generation
p = AA + Aa/2 = 0.64 + (0.32/2) = 0.8
q = aa + Aa/2 = 0.04 + (0.32/2) = 0.2
Allele frequencies remain the same at next generation
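The same bookkeeping can be repeated over several generations as a quick check. This is a minimal sketch assuming random mating and no drift, selection, mutation, or migration; the helper name `allele_freqs_from_genotypes` is hypothetical:

```python
def allele_freqs_from_genotypes(f_AA, f_Aa, f_aa):
    """Allele frequencies recovered from genotype frequencies."""
    p = f_AA + f_Aa / 2  # each heterozygote carries one A
    q = f_aa + f_Aa / 2  # and one a
    return p, q


p, q = 0.8, 0.2
for generation in range(1, 4):
    # Random mating: genotypes form in Hardy-Weinberg proportions
    f_AA, f_Aa, f_aa = p * p, 2 * p * q, q * q
    p, q = allele_freqs_from_genotypes(f_AA, f_Aa, f_aa)
    print(generation, round(p, 4), round(q, 4))
```

Each pass through the loop returns p = 0.8 and q = 0.2 (up to floating-point rounding), which is the sense in which the population is at equilibrium.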
Similar example, but with different starting allele frequencies
[Figure: genotype frequencies $p^2$, $2pq$, $q^2$ for allele frequencies p and q]
• The frequency of an allele in a population can be calculated from the number of individuals:
– For diploid organisms, the total number of alleles at a locus is the total number of individuals x 2
– The total number of dominant alleles at a locus is 2 alleles for each homozygous dominant individual, plus 1 allele for each heterozygous individual; the same logic applies for recessive alleles
Calculating Allele and Genotype Frequencies from # of Individuals
AA     Aa    aa
120    60    35    (# of individuals; 215 in total, so 430 alleles)

#A = (2 x AA) + Aa = 240 + 60 = 300
#a = (2 x aa) + Aa = 70 + 60 = 130
Proportion A = 300/total = 300/430 ≈ 0.70
Proportion a = 130/total = 130/430 ≈ 0.30
A + a = 0.70 + 0.30 = 1

Proportion AA = 120/215 ≈ 0.56
Proportion Aa = 60/215 ≈ 0.28
Proportion aa = 35/215 ≈ 0.16
AA + Aa + aa = 0.56 + 0.28 + 0.16 = 1
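The counting procedure above can be written as a short function. This is a sketch only; the name `frequencies_from_counts` is made up for this example:

```python
def frequencies_from_counts(n_AA, n_Aa, n_aa):
    """Allele and genotype frequencies from genotype counts (diploid locus)."""
    n = n_AA + n_Aa + n_aa             # number of individuals
    total_alleles = 2 * n              # two alleles per diploid individual
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    genotype_freqs = (n_AA / n, n_Aa / n, n_aa / n)
    return p, q, genotype_freqs


p, q, genotype_freqs = frequencies_from_counts(120, 60, 35)
print(round(p, 2), round(q, 2))
print([round(f, 2) for f in genotype_freqs])
```

For the counts 120, 60, 35 this reproduces the hand calculation: allele frequencies of about 0.70 and 0.30, and genotype frequencies of about 0.56, 0.28, 0.16.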
Applying the Hardy-Weinberg Principle
• Example: estimate frequency of a disease allele in a population
• Phenylketonuria (PKU) is a metabolic disorder that results from homozygosity for a recessive allele
• Individuals that are homozygous for the deleterious recessive allele cannot break down phenylalanine, which builds up and leads to intellectual disability
• The occurrence of PKU is 1 per 10,000 births
• How many carriers of this disease are in the population?
– Rare deleterious recessives often remain in a population because they are hidden in the heterozygous state (the “carriers”)
– Natural selection can only act on the homozygous individuals where the phenotype is exposed (individuals who show symptoms of PKU)
– We can assume HW equilibrium if:
• There is no migration from a population with different allele frequency
• Random mating
• No genetic drift
• Etc.
• The occurrence of PKU is 1 per 10,000 births, which is the frequency of the homozygous recessive genotype:
$q^2 = 0.0001$
• The frequency of the disease allele is:
$q = \sqrt{q^2} = \sqrt{0.0001} = 0.01$
• The frequency of normal alleles is:
$p = 1 - q = 1 - 0.01 = 0.99$
• The frequency of carriers (heterozygotes) of the deleterious allele is:
$2pq = 2 \times 0.99 \times 0.01 = 0.0198$
or approximately 2% of the U.S. population
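The carrier estimate can be reproduced with a few lines of Python. This is a minimal sketch that assumes the population is in Hardy-Weinberg equilibrium at this locus; the function name `carrier_frequency` is hypothetical:

```python
import math


def carrier_frequency(incidence):
    """Estimate allele and carrier frequencies for a recessive disorder.

    incidence: frequency of affected (homozygous recessive) births, i.e. q^2.
    Assumes Hardy-Weinberg equilibrium at the locus.
    """
    q = math.sqrt(incidence)  # disease allele frequency
    p = 1.0 - q               # normal allele frequency
    return q, p, 2 * p * q    # 2pq = heterozygous carriers


q, p, carriers = carrier_frequency(1 / 10_000)
print(round(q, 4), round(p, 4), round(carriers, 4))  # 0.01 0.99 0.0198
```

Note how much more common carriers (about 1 in 50) are than affected individuals (1 in 10,000).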
So, let’s calculate HW frequencies
Conditions for Hardy-Weinberg Equilibrium
• The Hardy-Weinberg theorem describes a hypothetical population
• The five conditions for nonevolving populations are rarely met in nature:
– No mutations
– Random mating
– No natural selection
– Extremely large population size
– No gene flow
• So, in real populations, allele and genotype frequencies do change over time
DEVIATION from Hardy-Weinberg Equilibrium indicates that EVOLUTION is happening
• In natural populations, some loci might be out of HW equilibrium, while being in Hardy-Weinberg equilibrium at other loci
• For example, some loci might be undergoing natural selection and become out of HW equilibrium, while the rest of the genome remains in HW equilibrium
Hardy-Weinberg across a Genome
Allele A1 Demo
How can you tell whether a population is out of HW Equilibrium?
• Perform HW calculations to see if it looks like the population is out of HW equilibrium
• Then apply statistical tests to see if the deviation is significantly different from what you would expect by random chance
Example: Does this population remain in Hardy Weinberg Equilibrium across Generations?
               AA     Aa     aa
Generation 1   0.25   0.50   0.25
Generation 2   0.20   0.60   0.20
Generation 3   0.10   0.80   0.10
■ In this case, allele frequencies (of A and a) did not change.
■ ***However, the population did go out of HW equilibrium because you can no longer predict genotypic frequencies from allele frequencies
■ For example, p = 0.5, so the expected frequency of AA is $p^2$ = 0.25, but in Generation 3 the observed frequency of AA is 0.10
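The comparison for each generation can be sketched as below. The helper names (`hw_expectation`, `in_hw_proportions`) and the tolerance are assumptions for illustration; with real sample counts a statistical test would replace the simple tolerance check:

```python
# Genotype frequencies (AA, Aa, aa) from the table above
generations = {
    1: (0.25, 0.50, 0.25),
    2: (0.20, 0.60, 0.20),
    3: (0.10, 0.80, 0.10),
}


def hw_expectation(f_AA, f_Aa, f_aa):
    """Return p and the HW-expected genotype frequencies for that p."""
    p = f_AA + f_Aa / 2
    q = 1 - p
    return p, (p * p, 2 * p * q, q * q)


def in_hw_proportions(observed, tol=1e-6):
    """True if observed frequencies match HW expectations within tol."""
    _, expected = hw_expectation(*observed)
    return all(abs(o - e) <= tol for o, e in zip(observed, expected))


for gen, observed in generations.items():
    p, expected = hw_expectation(*observed)
    print(gen, round(p, 3), [round(e, 3) for e in expected],
          in_hw_proportions(observed))
```

In all three generations p comes out as 0.5, but only Generation 1 matches the expected 0.25, 0.50, 0.25.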
How can you tell whether a population is out of HW Equilibrium?
1. When allele frequencies are changing across generations
2. When you cannot predict genotype frequencies from allele frequencies (means there is an excess or deficit of genotypes than what would be expected given the allele frequencies)
Testing for Deviation from Hardy-Weinberg Expectations
• A $\chi^2$ goodness-of-fit test can be used to determine if a population is significantly different from the expectations of Hardy-Weinberg equilibrium.
• If we have a series of genotype counts from a population, then we can compare these counts to the ones predicted by the Hardy-Weinberg model.
$\chi^2 = \sum \frac{(O - E)^2}{E}$
• O = observed counts, E = expected counts, sum across genotypes
Example
• Genotype counts: AA 30, Aa 55, aa 15
• Calculate the $\chi^2$ value:
Genotype   Observed   Expected   (O-E)^2/E
AA         30         33         0.27
Aa         55         49         0.73
aa         15         18         0.50
Total      100        100        1.50
• Since $\chi^2$ = 1.50 < 3.841 (from Chi-square table, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population is in Hardy-Weinberg equilibrium.
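The test can be sketched in plain Python. The function name `hw_chi_square` is hypothetical, and the critical value 3.841 is simply the table value quoted above (alpha = 0.05, 1 degree of freedom). Because this sketch keeps the expected counts unrounded (33.06, 48.88, 18.06), it gives a statistic of about 1.57 rather than the 1.50 obtained from the rounded expected counts in the table; the conclusion is the same.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit statistic against HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # allele frequency estimated from the data
    q = 1 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


CRITICAL_VALUE = 3.841  # from the chi-square table: alpha = 0.05, df = 1

chi2, expected = hw_chi_square(30, 55, 15)
print(round(chi2, 2), [round(e, 2) for e in expected])
print("reject HW" if chi2 > CRITICAL_VALUE else "cannot reject HW")
```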
• We test our $\chi^2$ value against the Chi-square distribution (the distribution of a sum of squared standard normal variables), which represents the theoretical distribution of sample values under HW equilibrium
• And determine how likely it is to get our result simply by chance (e.g. due to sampling error); i.e., do our Observed values differ from our Expected values more than what we would expect by chance (= significantly different)?
→ Less likely to get these values by chance
Test for Deviation from HW equilibrium
• Genotype counts, Generation 4: AA 65, Aa 31, aa 4
• Calculate the $\chi^2$ value:
Genotype   Observed   Expected   (O-E)^2/E
AA         65         64.8       0.00062
Aa         31         31.4       0.0051
aa         4          3.8        0.0105
Total      100        100        0.016
• Since $\chi^2$ = 0.016 < 3.841 (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population are not significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
• The chi-squared distribution is used because it is the distribution of a sum of squared normal variables
• Calculate the Chi-squared test statistic
• Figure out the degrees of freedom
• Select a significance level (alpha, the P-value threshold)
• Compare your Chi-squared value to the critical value of the theoretical distribution (from the table), and reject or fail to reject the null hypothesis.
– If the test statistic > the critical value, the null hypothesis (H0 = there is no difference between the distributions) can be rejected at the selected significance level, and the alternative hypothesis (H1 = there is a difference between the distributions) can be accepted.
– If the test statistic < the critical value, the null hypothesis cannot be rejected
Test for Significance of Deviation from HW Equilibrium
Degrees of freedom = 1 for two alleles (p, q): 3 genotype classes - 1 - 1 allele frequency estimated from the data
Testing for significance
• The results come out not significantly different from HW equilibrium
• This does not necessarily mean that genetic drift is not happening, but that we cannot conclude that genetic drift is happening
• Either we do not have enough power (not enough data, small sample size), or genetic drift is not happening
• Sometimes it is difficult to test whether evolution is happening, even when it is happening. The signal needs to be sufficiently large to be sure that you can’t get the results by chance (like by sampling error)
Test for Deviation from HW equilibrium
• Genotype counts, Generation 4 → increase sample size: AA 65000, Aa 31000, aa 4000
• Calculate the $\chi^2$ value:
Genotype   Observed   Expected   (O-E)^2/E
AA         65000      64800      0.617
Aa         31000      31400      5.10
aa         4000       3800       10.53
Total      100,000    100,000    16.24
• Since $\chi^2$ = 16.24 > 3.841 (from Chi-square table for critical values, alpha = 0.05), we conclude that the genotype frequencies in this population ARE significantly different from what would be expected if the population were in Hardy-Weinberg equilibrium.
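The effect of sample size can be seen by running the same statistic on the same genotype proportions at two sample sizes. This is a sketch under stated assumptions: the helper name `hw_chi_square` is hypothetical, 3.841 is the table value for alpha = 0.05 with 1 degree of freedom, and expected counts are left unrounded, so the values (roughly 0.016 and 15.8) differ slightly from those computed with rounded expected counts.

```python
def hw_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing genotype counts to HW expectations."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_VALUE = 3.841  # alpha = 0.05, df = 1

chi2_small = hw_chi_square(65, 31, 4)           # n = 100
chi2_large = hw_chi_square(65000, 31000, 4000)  # n = 100,000, same proportions

for label, chi2 in (("n = 100", chi2_small), ("n = 100,000", chi2_large)):
    verdict = "significant" if chi2 > CRITICAL_VALUE else "not significant"
    print(label, round(chi2, 3), verdict)
```

With identical proportions the statistic scales in direct proportion to the sample size, so a deviation too small to detect in 100 individuals becomes significant in 100,000.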
• One generation of Random Mating could put a population back into Hardy Weinberg Equilibrium
Examples of Deviation from Hardy-Weinberg Equilibrium
What would Genetic Drift look like?
• Most populations are experiencing some level of genetic drift, unless they are incredibly large
AA Aa aa
Generation 1 0.64 0.32 0.04
Generation 2 0.63 0.33 0.04
Generation 3 0.64 0.315 0.045
Generation 4 0.65 0.31 0.04
Is this population in HW equilibrium?
If not, how does it deviate?
What could be the reason?
This is a case of Genetic Drift, where
allele frequencies are fluctuating
randomly across generations
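One way to see the fluctuation is to compute the frequency of allele A in each generation of the table. A minimal sketch (the variable names are made up for this example):

```python
# Genotype frequencies (AA, Aa, aa) for Generations 1-4, from the table above
drift_table = {
    1: (0.64, 0.32, 0.04),
    2: (0.63, 0.33, 0.04),
    3: (0.64, 0.315, 0.045),
    4: (0.65, 0.31, 0.04),
}

# p = AA + Aa/2 for each generation
p_by_generation = {
    gen: f_AA + f_Aa / 2 for gen, (f_AA, f_Aa, f_aa) in drift_table.items()
}

for gen, p in p_by_generation.items():
    print(gen, round(p, 4))
```

The values come out as 0.8, 0.795, 0.7975, and 0.805: small changes up and down with no consistent direction, which is the pattern expected from drift rather than from directional selection.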
Examples of Deviation from Hardy-Weinberg Equilibrium
AA     Aa     aa
0.64   0.36   0
Is this population in HW equilibrium?
If not, how does it deviate?
What could be the reason?
Here this appears to be Directional Selection favoring AA
Or, Negative Selection disfavoring aa
Examples of Deviation from Hardy-Weinberg Equilibrium
AA     Aa     aa
0.25   0.70   0.05
Is this population in HW equilibrium?
If not, how does it deviate?
What could be the reason?
This appears to be a case of Heterozygote Advantage (or Overdominance)
Examples of Deviation from Hardy-Weinberg Equilibrium
AA     Aa     aa
0.10   0.10   0.80
Is this population in HW equilibrium?
If not, how does it deviate?
What could be the reason?
Selection appears to be favoring aa
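To answer "how does it deviate?" for the three single-generation examples, one can compare each observed genotype frequency with its Hardy-Weinberg expectation. The sketch below uses a hypothetical helper, `hw_deviation`; it only reports the excess or deficit of each genotype and does not by itself identify the cause.

```python
def hw_deviation(f_AA, f_Aa, f_aa):
    """Observed minus expected frequency for (AA, Aa, aa)."""
    p = f_AA + f_Aa / 2
    q = 1 - p
    expected = (p * p, 2 * p * q, q * q)
    return tuple(o - e for o, e in zip((f_AA, f_Aa, f_aa), expected))


examples = {
    "0.64 / 0.36 / 0": (0.64, 0.36, 0.0),
    "0.25 / 0.70 / 0.05": (0.25, 0.70, 0.05),
    "0.10 / 0.10 / 0.80": (0.10, 0.10, 0.80),
}

for label, observed in examples.items():
    deviation = hw_deviation(*observed)
    print(label, [round(d, 4) for d in deviation])
```

Positive numbers mean a genotype is more common than expected, and negative numbers mean it is rarer. The first two examples show an excess of heterozygotes (and, in the first, a complete absence of aa), while the third shows a deficit of heterozygotes.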
Summary
(1) A nonevolving population is in HW Equilibrium
(2) Evolution occurs when the requirements for HW Equilibrium are not met
(3) HW Equilibrium is violated when there is Genetic Drift, Migration, Mutation, Natural Selection, or Nonrandom Mating
[Fig. 23-7: Random mating with p = 0.8 (CR) and q = 0.2 (CW). Sperm: CR (80%), CW (20%). Eggs: CR (80%), CW (20%). Genotypes formed: 64% ($p^2$) CRCR, 16% (pq) + 16% (qp) = 32% CRCW, 4% ($q^2$) CWCW. Gametes of this generation: 64% CR + 16% CR = 80% CR = 0.8 = p; 4% CW + 16% CW = 20% CW = 0.2 = q. Genotypes in the next generation: again 64% CRCR, 32% CRCW, and 4% CWCW plants.]
Perform the same calculations using percentages
1. Nabila is a Saudi Princess who is arranged to marry her first cousin. Many in her family have died of a rare blood disease, which sometimes skips generations, and thus appears to be recessive. Nabila thinks that she is a carrier of this disease. If her fiancé is also a carrier, what is the probability that her offspring will have (be afflicted with) the disease?
(A) 1/4
(B) 1/3
(C) 1/2
(D) 3/4
(E) zero
The following are numbers of pink and white flowers in a population.
                Pink    White
Generation 1:    901     302
Generation 2:   1204     403
Generation 3:   1510     504
2. Which of the following is most likely to be TRUE?
(A) The heterozygotes are probably pink
(B) The recessive allele here (probably white) is clearly deleterious
(C) Evolution is occurring, as allele frequencies are changing greatly over time
(D) Clearly there is a heterozygote advantage
(E) The frequencies above violate Hardy-Weinberg expectations
The following are numbers of purple and white peas in a population.
                (A1A1)   (A1A2)   (A2A2)
                Purple   Purple   White
Generation 1:    360      480      160
Generation 2:    100      200      200
Generation 3:      0      100      300
3. What are the genotype frequencies at each generation?
(A) Generation 1: 0.30, 0.50, 0.20
    Generation 2: 0.20, 0.40, 0.40
    Generation 3: 0, 0.333, 0.666
(B) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.10, 0.20, 0.20
    Generation 3: 0, 0.10, 0.30
(C) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.20, 0.40, 0.40
    Generation 3: 0, 0.25, 0.75
(D) Generation 1: 0.36, 0.48, 0.16
    Generation 2: 0.36, 0.48, 0.16
    Generation 3: 0.36, 0.48, 0.16
4. From the example on the previous slide, what are the frequencies of alleles at each generation?
(A) Generation 1: Dominant allele (A1) = 0.6, Recessive allele (A2) = 0.4
    Generation 2: Dominant allele = 0.4, Recessive allele = 0.6
    Generation 3: Dominant allele = 0.125, Recessive allele = 0.875
(B) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 2: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 3: Dominant allele = 0.6, Recessive allele = 0.4
(C) Generation 1: Dominant allele = 0.6, Recessive allele = 0.4
    Generation 2: Dominant allele = 0.5, Recessive allele = 0.5
    Generation 3: Dominant allele = 0.25, Recessive allele = 0.75
(D) Generation 1: Dominant allele = 0.4, Recessive allele = 0.6
    Generation 2: Dominant allele = 0.5, Recessive allele = 0.5
    Generation 3: Dominant allele = 0.25, Recessive allele = 0.75
5. From the example two slides ago, which evolutionary mechanism might be operating across generations?
(A) Mutation
(B) Selection favoring A1
(C) Heterozygote advantage
(D) Selection favoring A2
(E) Inbreeding