Date post: | 02-Apr-2015 |
Upload: | amanullah-tak |
Population Genetics: An Introduction
B.M. Prasanna
ICAR National Fellow, Division of Genetics
Indian Agricultural Research Institute, New Delhi
New Delhi; April 1, 2007
Levels of Genetic Diversity
• Individual genotypes (Inbred lines; Pure lines; Clones)
• Populations (Within/Between)
• Germplasm Accessions
• Species
• Ecosystem
Definition
• ‘Population genetics’ is the study of the frequency of occurrence of alleles within and between populations.
• The measurement of variability by describing changes in allele frequency for a particular trait over time.
• The analysis of causes leading to those changes.
• Closely related to evolutionary genetics because evolution depends heavily on changes in gene frequencies.
Population
• Ecologically: A group of individuals of the same species living within a restricted geographical area that allows any two individuals to interbreed.
• Genetically: A group of individuals who share a common gene pool and have the potential to interbreed.
Genetic Analysis of Populations
• Traditionally, identification of different alleles through observation of the expressed traits, broadly called the phenotype.
• Advances in molecular genetics facilitated identification of single genes at the molecular or biochemical level.
• Irrespective of the method used, population genetics requires statistical analyses of allele frequencies to understand and make predictions about gene flow in populations - past, present, and future.
Two major facets:
• Describing the pattern of genetic diversity
• Investigating the processes that generate this diversity
Population genetics is a result of the “synthesis” of Darwin’s theory of evolution and Mendel’s laws of heredity
• Darwin’s theory of evolution through natural selection can be summarized in three principles (Origin of Species 1859):
• Principle of variation: Among individuals within any population, there is variation in morphology, physiology, and behaviour.
• Principle of heredity: Offspring resemble their parents more than they resemble unrelated individuals.
• Principle of selection: Some forms are more successful at surviving and reproducing than other forms in a given environment.
Hardy-Weinberg Equilibrium
[Figure: allele frequency plotted against time]
Assumptions:
• Population is large.
• Mating is random.
• Isolation from other populations.
• Effects of migration, mutation and selection are negligible.
Demonstrating Hardy-Weinberg Principle
Source: IPGRI
When HWE assumptions are met
• the frequency of a genotype is equal to the product of the allele frequencies.
Genotype: AA, Aa, aa
Frequency: $p^2$, $2pq$, $q^2$
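The genotype frequencies above can be sketched in a few lines of Python. This is a minimal illustration, assuming a single locus with two alleles (A at frequency p, a at frequency q = 1 - p); the function name `hardy_weinberg_frequencies` is hypothetical and not part of the original slides.

```python
def hardy_weinberg_frequencies(p):
    """Expected (AA, Aa, aa) genotype frequencies under HWE.

    p is the frequency of allele A; q = 1 - p is the frequency of allele a.
    Assumes one locus with exactly two alleles.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


# Illustrative usage: the three genotype frequencies always sum to 1.
freq_AA, freq_Aa, freq_aa = hardy_weinberg_frequencies(0.7)
print(freq_AA, freq_Aa, freq_aa)
```

Comparing these expected values with observed genotype frequencies is one way to check whether a sample departs from HWE.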
Calculating Allele Frequencies
P(A) = [2(AA) + (Aa)] / (2n)
• Twice the number of homozygous genotypes with that allele (because homozygotes carry two copies each of the same allele), plus
• the number of heterozygous genotypes with that allele (because heterozygotes carry only one copy of a particular allele),
• divided by two times the total number of individuals in the sample (because each individual carries two alleles per locus).
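The counting rule above translates directly into code. The sketch below assumes a diploid sample scored at one biallelic locus; the function name and the genotype counts in the usage line are made up for illustration.

```python
def allele_frequency_A(n_AA, n_Aa, n_aa):
    """Frequency of allele A from genotype counts: [2(AA) + (Aa)] / (2n)."""
    n = n_AA + n_Aa + n_aa  # total number of individuals sampled
    if n == 0:
        raise ValueError("the sample must contain at least one individual")
    # Homozygotes AA contribute two copies of A, heterozygotes one copy;
    # each individual carries two alleles, hence the 2n in the denominator.
    return (2 * n_AA + n_Aa) / (2 * n)


# Hypothetical sample: 30 AA, 50 Aa and 20 aa individuals.
p = allele_frequency_A(30, 50, 20)
q = 1.0 - p
print(p, q)
```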
Factors affecting Population Diversity
• Movement of seed (migration) or pollen (gene flow) in or out of the population.
• Mutation
• Recombination (creating new combinations of existing diversity)
• Genetic Drift
• Selection
Genetic Drift
[Figure: gametic sampling from the gene pool to form zygotes over successive generations]
Allele frequencies by generation:
G0: p = 0.5, q = 0.5
G1: p = 0.625, q = 0.375
G2: p = 0.3125, q = 0.6875
G3: p = 0.0, q = 1.0
Genetic Drift
• Random change in allele frequencies due to finite population size.
• Reflected in The History and Geography of Human Genes (i.e. different populations have different allele frequencies).
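Random change due to finite population size can be illustrated with a small simulation. The sketch below assumes a Wright-Fisher style model (non-overlapping generations, random union of gametes, no selection, mutation or migration), which is a common textbook idealisation and not something the slides specify; the function name and parameters are hypothetical.

```python
import random


def simulate_drift(p0, n_individuals, generations, seed=None):
    """Track the frequency of allele A under random gametic sampling.

    Assumed model: each generation, 2 * n_individuals gametes are drawn at
    random from the current gene pool (Wright-Fisher style sampling).
    """
    rng = random.Random(seed)
    n_gametes = 2 * n_individuals
    p = p0
    trajectory = [p]
    for _generation in range(generations):
        # Each sampled gamete carries allele A with probability p.
        count_A = sum(1 for _gamete in range(n_gametes) if rng.random() < p)
        p = count_A / n_gametes
        trajectory.append(p)
    return trajectory


# Illustrative run: a very small population drifts quickly and may lose
# or fix an allele; the exact path depends on the random seed.
print(simulate_drift(p0=0.5, n_individuals=4, generations=10, seed=1))
```

Once p reaches 0 or 1 it stays there, because the lost allele can no longer be sampled; smaller values of `n_individuals` tend to reach that point sooner.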
Founder Effect
• An extreme form of genetic drift
• It is a particular example of the influence of random sampling.
• Defined by Ernst Mayr as:
"The establishment of a new population by a few original founders (in an extreme case, by a single fertilized female) which carry only a small fraction of the total genetic variation of the parental population."
Bottleneck Effect
• Natural calamities can drastically reduce the size of a population, usually unselectively.
• The result: the genetic composition of the small surviving population is unlikely to be representative of the original population.
[Figure: alleles in original population → bottleneck event → surviving population]
Gene Flow
Source: IPGRI
Gene Flow
• HWE requires the gene pool to be a closed system – most populations are not completely isolated.
• A population can gain or lose alleles by gene flow: genetic exchange due to migration of fertile individuals or gametes between populations.
• Gene flow tends to reduce differences between populations that have accumulated because of natural selection or genetic drift.
• If gene flow is extensive enough, it can amalgamate neighbouring populations into a single population.
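The homogenising effect of gene flow can be sketched numerically. The example below assumes a simple two-population model with symmetric migration, in which a fraction m of each gene pool is replaced by migrants from the other population every generation; this model, the function name and the parameter values are illustrative assumptions, not taken from the slides.

```python
def gene_flow(p1, p2, m, generations):
    """Allele frequencies in two populations exchanging migrants.

    Assumed model: each generation a fraction m of each population's gene
    pool comes from the other population (symmetric migration, no drift,
    mutation or selection).
    """
    history = [(p1, p2)]
    for _generation in range(generations):
        p1, p2 = (1 - m) * p1 + m * p2, (1 - m) * p2 + m * p1
        history.append((p1, p2))
    return history


# Two populations that start out strongly differentiated.
history = gene_flow(p1=0.9, p2=0.1, m=0.05, generations=50)
start_gap = abs(history[0][0] - history[0][1])
end_gap = abs(history[-1][0] - history[-1][1])
print(start_gap, end_gap)
```

Under this model the difference between the two populations shrinks by a factor of (1 - 2m) per generation, while the average allele frequency across both stays the same.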
Mutation
• Mutation at any given genetic locus is usually very rare.
• A rate of one mutation per $10^5$ to $10^6$ gametes is typical.
Example: an allele has a frequency of 0.5 in the gene pool and mutates to another allele at a rate of 0.00001 per generation. It takes about 2000 generations to reduce the frequency of the original allele from 0.50 to 0.49.
• If a new allele increases its frequency significantly in a population, it is usually because it confers a selective advantage.
• Mutations at a particular locus are indeed rare. For any one gene, mutation does not have much of an effect on a large population in a single generation.
• However, the combined impact of mutation across all genes is significant, because each individual carries thousands of genes.
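The worked example above (0.50 to 0.49 in roughly 2000 generations) can be checked with a short calculation. The sketch assumes one-way mutation with no back mutation, drift or selection, so that the frequency after t generations is p0 * (1 - mu)^t; the function names are hypothetical.

```python
import math


def frequency_after(p0, mu, generations):
    """Frequency of the original allele after one-way mutation at rate mu."""
    return p0 * (1.0 - mu) ** generations


def generations_to_reach(p0, p_target, mu):
    """Generations needed for the frequency to fall from p0 to p_target."""
    return math.log(p_target / p0) / math.log(1.0 - mu)


# Values from the example: p0 = 0.5, mutation rate 0.00001 per generation.
print(frequency_after(0.5, 0.00001, 2000))
print(generations_to_reach(0.5, 0.49, 0.00001))
```

Under these assumptions the answer comes out slightly above 2000 generations, consistent with the rounded figure in the example.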
Selection
• HWE requires that all individuals in a population be equal in their ability to survive and produce viable offspring (very unusual in reality).
• Microevolution through natural selection - adaptation to environment.
• Alternatively, balancing selection acts to maintain genetic polymorphism (multiple alleles) in the population.
• Multiple alleles are maintained by:
– heterozygote advantage/overdominance
– selective advantage of certain allele combinations
– environmental heterogeneity
Population Structure
• A population is considered structured if:
– genetic drift is occurring in some of its subpopulations,
– migration does not happen uniformly throughout the population, or
– mating is not random throughout the population.
• A population’s structure affects the extent of genetic variation and its patterns of distribution.
Standard Genetic Parameters
• Proportion of polymorphic loci
• Allele richness (A): no. of variants in a sample
• Average no. of alleles per locus
• Effective number of alleles [AE = 1/(1-HE)]
• Observed heterozygosity (HO)
• Expected heterozygosity (HE)
• Fixation Index (FIS)
• Within-population gene diversity (HI)
• Mean within-population gene diversity (HS)
• Total diversity (HT)
• Coefficient of gene differentiation among populations (GST)
Quantifying Intrapopulation Genetic Diversity
• Based on the number of variants
– Polymorphism or rate of polymorphism (Pj)
– Proportion of polymorphic loci
– Number of alleles and allelic richness
• Based on the frequency of variants
– Average expected heterozygosity (He; Nei’s genetic diversity)
Quantifying Interpopulation Genetic Diversity
• Wright’s F statistics (Wright)
• Interpopulation differentiation for one locus (gST)
• Interpopulation differentiation for several loci (GST)
• Population’s contribution to total genetic diversity
• Analysis of molecular variance (AMOVA)
Effective Number of Alleles (Ae)
• Number of alleles that can be present in a population
$A_e = \dfrac{1}{1 - h} = \dfrac{1}{\sum p_i^2}$
where $p_i$ = frequency of the $i$th allele in a locus, and $h = 1 - \sum p_i^2$ = heterozygosity in a locus
• Relevant for germplasm collection strategy
– first sampling
– second sampling
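A small sketch of the effective number of alleles, using the relation AE = 1/(1 - HE) given earlier with h = 1 - Σpi². The function name and the allele frequencies in the usage lines are invented for illustration.

```python
def effective_number_of_alleles(allele_freqs):
    """Effective number of alleles at one locus: 1 / (1 - h).

    allele_freqs holds the frequency of every allele at the locus and is
    expected to sum to 1; h = 1 - sum(p_i^2) is the heterozygosity.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    h = 1.0 - sum(p * p for p in allele_freqs)
    return 1.0 / (1.0 - h)


# Two equally frequent alleles versus two alleles at unequal frequencies.
print(effective_number_of_alleles([0.5, 0.5]))
print(effective_number_of_alleles([0.9, 0.1]))
```

With equally frequent alleles the effective number equals the actual number of alleles; rare alleles contribute much less, which is why the measure falls below the simple allele count when frequencies are uneven.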
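Wright’s F Statistics
• Population substructure results in a reduction of heterozygosity
• FST is a fixation index that compares the average heterozygosity within groups to the total heterozygosity
$$H_T = 2\hat{p}_T(1 - \hat{p}_T)$$
$$H_S = \frac{1}{n_{Pop}} \sum_{i=1}^{n_{Pop}} 2\hat{p}_i(1 - \hat{p}_i)$$
$$F_{ST} = \frac{H_T - H_S}{H_T}$$
where $\hat{p}_i$ is the estimated allele frequency in subpopulation $i$, $\hat{p}_T$ is the estimated allele frequency in the total population, and $n_{Pop}$ is the number of subpopulations.
Interpretation (Wright 1978):
• 0-0.05: little differentiation
• 0.05-0.15: moderate
• 0.15-0.25: large
• >0.25: very large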
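The FST calculation for a biallelic locus can be sketched as follows. This example assumes equally sized subpopulations, so that the total allele frequency is the simple mean of the subpopulation frequencies; that weighting, the function names and the handling of the exact category boundaries are assumptions made for illustration.

```python
def fst_biallelic(subpop_freqs):
    """FST = (HT - HS) / HT for one biallelic locus.

    subpop_freqs: frequency of one allele in each subpopulation.
    Assumes equally sized subpopulations (unweighted mean frequency).
    """
    n_pop = len(subpop_freqs)
    p_total = sum(subpop_freqs) / n_pop
    h_t = 2.0 * p_total * (1.0 - p_total)  # total expected heterozygosity
    h_s = sum(2.0 * p * (1.0 - p) for p in subpop_freqs) / n_pop  # mean within
    if h_t == 0.0:
        raise ValueError("FST is undefined when the locus is monomorphic")
    return (h_t - h_s) / h_t


def interpret_fst(fst):
    """Qualitative categories after Wright (1978).

    The ranges share their end points, so how a value exactly on a
    boundary is classified here is an arbitrary choice.
    """
    if fst < 0.05:
        return "little differentiation"
    if fst < 0.15:
        return "moderate"
    if fst <= 0.25:
        return "large"
    return "very large"


fst = fst_biallelic([0.8, 0.2])
print(fst, interpret_fst(fst))
```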
Genetic Differentiation of Populations
• Three different F coefficients:
– Correlation of genes within individuals over all populations (FIT)
– Correlation of genes of different individuals in the same population (FST)
– Correlation of genes within individuals within populations (FIS)
• FIT = 1 – (HI / HT)
• FIS = 1 – (HI / HS)
• FST = 1 – (HS / HT)
• FST, FIT and FIS are interrelated so that:
– 1 – FIT = (1 – FST)(1 – FIS)
– FST = (FIT – FIS) / (1 – FIS)
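The three coefficients and the identity linking them can be checked numerically. The sketch below takes HI, HS and HT as given inputs; the function name and the heterozygosity values are hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (FIS, FST, FIT) from the heterozygosities HI, HS and HT."""
    f_is = 1.0 - h_i / h_s
    f_st = 1.0 - h_s / h_t
    f_it = 1.0 - h_i / h_t
    return f_is, f_st, f_it


# Made-up heterozygosity values, used only to verify the relationships.
f_is, f_st, f_it = f_statistics(h_i=0.3, h_s=0.4, h_t=0.5)
print(f_is, f_st, f_it)

# 1 - FIT = (1 - FST)(1 - FIS), and FST = (FIT - FIS) / (1 - FIS)
print(abs((1 - f_it) - (1 - f_st) * (1 - f_is)) < 1e-12)
print(abs(f_st - (f_it - f_is) / (1 - f_is)) < 1e-12)
```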
F Statistics
• FST is always greater than or equal to 0
– 0 when sub-populations are identical in allele frequencies
– 1 when sub-populations are fixed for different alleles
• FIS and FIT: measures of departure from HWE
– >0: excess of homozygotes
– <0: excess of heterozygotes
Calculating F Statistics
Source: IPGRI
F Statistics: Another Example
Source: IPGRI
Interpopulation differentiation for one locus (gST)
• gST = 1 – (hS/hT)
where hS = population diversity and hT = total diversity
• Can be used with codominant markers and, with restrictions, with dominant markers, because it is a measure of heterozygosity.
Calculating gST
Source: IPGRI
Interpopulation Differentiation for Several Loci (GST)
• GST = DST / HT
where,
HT = total genic diversity = HS + DST
HS = intrapopulation genic diversity
DST = interpopulation diversity
(HT/HT) = (HS/HT) + (DST/HT) = 1
• GST measures the proportion of gene diversity that is distributed among populations.
• A larger number of loci must be sampled.
• Equations are complex and should be calculated with specific computer software.
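The partition HT = HS + DST can be illustrated with a simplified calculation. The sketch below averages gene diversity over loci, weights all subpopulations equally and applies no sample-size correction, so it is a teaching simplification and not the full estimator implemented in dedicated software; the data layout and function names are assumptions.

```python
def gene_diversity(allele_freqs):
    """Gene diversity at one locus: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_freqs)


def gst(loci):
    """Return (HS, HT, DST, GST) averaged over loci.

    loci: one entry per locus; each entry is a list with one allele-frequency
    list per subpopulation (same allele order in every subpopulation).
    Simplifications: equal subpopulation weights, no sample-size correction.
    """
    hs_values, ht_values = [], []
    for subpops in loci:
        n_pop = len(subpops)
        n_alleles = len(subpops[0])
        # Mean within-population diversity at this locus.
        hs_values.append(sum(gene_diversity(f) for f in subpops) / n_pop)
        # Diversity of the pooled (mean) allele frequencies at this locus.
        mean_freqs = [sum(f[a] for f in subpops) / n_pop for a in range(n_alleles)]
        ht_values.append(gene_diversity(mean_freqs))
    h_s = sum(hs_values) / len(hs_values)
    h_t = sum(ht_values) / len(ht_values)
    d_st = h_t - h_s
    return h_s, h_t, d_st, d_st / h_t


# Two hypothetical loci scored in two subpopulations.
example = [
    [[0.8, 0.2], [0.2, 0.8]],  # locus 1: strongly differentiated
    [[0.5, 0.5], [0.5, 0.5]],  # locus 2: identical frequencies
]
print(gst(example))
```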
Effective Population Size (Ne)
• Number of individuals in an ideal population with the same decrease in heterozygosity due to genetic drift (Hartl and Clark, 1997)
• Ne often lower than total number of individuals in a population (N)
• Human: N is roughly 6 billion; Ne estimated to be only 10,000 individuals
• Heterozygosity and mutation rate (µ) directly related to Ne
• Minimum Ne in conservation of an endangered species: 500-1000 individuals
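The link between Ne and loss of heterozygosity can be sketched with the standard textbook expectation that, in an ideal population, heterozygosity declines by a factor of (1 - 1/(2Ne)) per generation through drift alone. That formula is not stated on the slides and is used here as an assumption; the function name and values are illustrative.

```python
def heterozygosity_after(h0, ne, generations):
    """Expected heterozygosity after drift in an ideal population.

    Assumed relation: H_t = H_0 * (1 - 1 / (2 * Ne)) ** t, ignoring
    mutation, migration and selection.
    """
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


# Compare a small and a larger effective population size over 100 generations.
print(heterozygosity_after(h0=0.5, ne=50, generations=100))
print(heterozygosity_after(h0=0.5, ne=500, generations=100))
```

Under this assumption, the smaller the effective population size, the faster heterozygosity is expected to erode, which is the rationale behind minimum Ne targets in conservation.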
Analysis of Molecular Variance (AMOVA)
• Different from ANOVA in its evolutionary assumptions and in its use of permutational methods (a normal distribution is not necessary)
• Analysis of different hierarchical levels of genetic diversity:
– Continents
– Geographical regions within a continent
– Areas within a region in a continent
– Populations within an area of a region in a continent
– Individuals within a population in an area of a region in a continent
High Resolution Genotyping
• Necessary for:
– Genetic diversity analysis
– Germplasm security
• Not so necessary for:
– Mapping (if parental alleles are known)
– Marker-assisted selection (MAS)
Population Genomics
A Few Case Studies
Diversity in Maize Landraces is Organized along Ecogeographical Lines
Analysis of 1200 maize landraces and 200 wild teosintes by John Doebley’s group.
• 45 landraces and 462 SSRs
• Comparison of maize and teosintes
• Fst and relative deficit of genetic diversity (GD)
• Relationship between SSR diversity and domestication QTLs
• Bottleneck effect with smaller effect from selection
• C.S. Chord distance
• Neighbor-joining method (PowerMarker)
• Population structure using STRUCTURE
• Graphical display of STRUCTURE results using Distruct software
• Gene diversity, Ne, PIC & Fst using PowerMarker
• BOTTLENECK used to test each group for deviation from mutation-drift equilibrium under SMM
235 modern wheat cultivars, LCs and T. tauschii accessions
90 SSRs
Loss of genetic diversity during domestication and transition from LCs to modern cultivars
Introgression of novel materials in breeding: enhancement of genetic diversity from 1990 to 1997
Genetic Resources : Beyond Conservation
Genetic Resources
Bioinformatics
Genomics
Crop Improvement
Systematics
Functional Diversity
Population Genetics
Statistics
Conclusions
• The statistical measures of population genetics aid in elucidating population genetic structure and history.
• No natural population can possibly meet all the requirements for HWE, but in most cases, we begin the study of a population with a priori (prior) knowledge of what dynamics may be influencing the population.
• Information on the population's effective population size, heterozygosity levels, and inbreeding coefficients for particular individuals can be used to design breeding programmes which will maximize the genetic variation in successive generations.