Date post: | 27-Mar-2015 |
Upload: | jackson-mclain |
QTL Analysis: Concept
Generation procedure: Parents (A × B) → F1 → F2 → F2:3
Alternatives: BC1, RIL, DHL

Field: phenotype each of the N progeny
  PHT [cm]: 210 190 203 159 206 . . 171

Laboratory: genotype each of the N progeny at M markers
  Progeny   Marker #  1   2   3   4   5   ..  M
  1                   B   B   H   H   A   ..  A
  2                   H   A   H   A   A   ..  H
  3                   B   B   H   H   H   ..  A
  4                   H   H   B   B   B   ..  H
  5                   H   B   H   H   A   ..  B
  .                   .   .   .   .   .   ..  .
  N                   A   H   H   H   A   ..  A

Office: LOD score profile for PHT along Chromosome 1
QTL Analysis: Single Marker Analysis
[Figure: distribution of plant height (cm), axis from 160 to 240; total mean 196 cm]

Marker class means, X(MC), for plant height (cm):
  Marker    AA    Aa    aa    F test
  umc130    201   196   191   F = 6.47**
  umc157    195   197   195   F = 0.48 ns
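
The F test used in single marker analysis is a one-way analysis of variance across the three marker classes. The sketch below shows the computation on hypothetical individual heights (the slides give only class means, so these values are invented and merely chosen to reproduce the umc130 class means of 201, 196 and 191 cm). The function name `anova_f` is illustrative.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of trait values."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)       # number of marker classes
    n = len(all_values)   # number of individuals
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical plant heights (cm) per marker class; NOT data from the slides.
heights = {
    "AA": [205.0, 199.0, 202.0, 198.0],
    "Aa": [197.0, 194.0, 198.0, 195.0],
    "aa": [190.0, 193.0, 189.0, 192.0],
}

f_value = anova_f(list(heights.values()))
print(f"F = {f_value:.2f} on 2 and 9 degrees of freedom")
```

A large F indicates that the marker class means differ by more than the within-class variation would suggest, which is the pattern seen for umc130 but not for umc157.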
QTL Analysis: Single Marker Model (F2)
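[Diagram: marker locus (alleles M, m) linked to a QTL (alleles Q, q) at recombination frequency r]
Additive effect: $(\mu_{MM} - \mu_{mm})/2 = a(1 - 2r)$
Dominance effect: $\mu_{Mm} - (\mu_{MM} + \mu_{mm})/2 = d(1 - 2r)^2$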
F tests on the contrasts of marker classes test the following hypotheses:
a > 0, d > 0, r < 0.5

Frequencies of QTL genotypes within each marker class:
          QQ          Qq                qq          Mean
  MM      (1-r)^2     2r(1-r)           r^2         μ(MM)
  Mm      r(1-r)      (1-r)^2 + r^2     r(1-r)      μ(Mm)
  mm      r^2         2r(1-r)           (1-r)^2     μ(mm)
  Mean    μ1          μ2                μ3
Schön, 2002
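
The two contrasts follow directly from the table of QTL genotype frequencies within marker classes. As a minimal numerical check, the sketch below weights assumed QTL genotype values by those frequencies. The parametrization (QQ = mu + a, Qq = mu + d, qq = mu - a) and the example values of mu, a, d and r are assumptions chosen for illustration, not values from the slides.

```python
def marker_class_means(mu, a, d, r):
    """Expected trait means of marker classes MM, Mm, mm in an F2.

    Assumes QTL genotype values QQ = mu + a, Qq = mu + d, qq = mu - a.
    """
    qtl_values = (mu + a, mu + d, mu - a)  # order: QQ, Qq, qq
    frequencies = {
        "MM": ((1 - r) ** 2, 2 * r * (1 - r), r ** 2),
        "Mm": (r * (1 - r), (1 - r) ** 2 + r ** 2, r * (1 - r)),
        "mm": (r ** 2, 2 * r * (1 - r), (1 - r) ** 2),
    }
    return {
        marker: sum(p * g for p, g in zip(probs, qtl_values))
        for marker, probs in frequencies.items()
    }


# Illustrative values only.
mu, a, d, r = 196.0, 10.0, 4.0, 0.2
means = marker_class_means(mu, a, d, r)

additive_contrast = (means["MM"] - means["mm"]) / 2
dominance_contrast = means["Mm"] - (means["MM"] + means["mm"]) / 2

print(additive_contrast, a * (1 - 2 * r))        # both equal a(1 - 2r)
print(dominance_contrast, d * (1 - 2 * r) ** 2)  # both equal d(1 - 2r)^2
```

Both contrasts shrink toward zero as r approaches 0.5, which is why a single marker cannot separate a small QTL effect nearby from a large effect further away.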
QTL Analysis: Single Marker Model (F2)
Example: Plant height, umc130
Case 1: marker and QTL at the same position (r = 0)
Case 2: marker and QTL linked (r = 0.2)
Marker class means: X(MM) = 201 cm, X(Mm) = 196 cm, X(mm) = 191 cm

  PHT (cm)       r = 0    r = 0.2   r = 0.4
  Add. Effect      5.0       8.3      25.0
  X(QQ)          201.0     204.3     221.0
  X(Qq)          196.0     196.0     196.0
  X(qq)          191.0     187.7     171.0
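
The numbers in this example can be reproduced by inverting the marker contrasts, a = (X(MM) - X(mm)) / (2(1 - 2r)) and d = (X(Mm) - midpoint) / (1 - 2r)^2. The following is a small sketch of that arithmetic; the function name and return structure are illustrative, not part of any QTL package.

```python
def qtl_effects_from_marker_means(mean_MM, mean_Mm, mean_mm, r):
    """Back out QTL effects and genotype means for an assumed r (F2 model)."""
    if not 0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    midpoint = (mean_MM + mean_mm) / 2
    a = (mean_MM - mean_mm) / (2 * (1 - 2 * r))       # additive effect
    d = (mean_Mm - midpoint) / (1 - 2 * r) ** 2       # dominance effect
    return {"a": a, "d": d, "QQ": midpoint + a, "Qq": midpoint + d, "qq": midpoint - a}


# Marker class means for umc130 from the slide (cm).
for r in (0.0, 0.2, 0.4):
    result = qtl_effects_from_marker_means(201.0, 196.0, 191.0, r)
    print(r, {key: round(value, 1) for key, value in result.items()})
```

The same marker contrast of 5 cm is compatible with an additive effect anywhere between 5 cm and 25 cm depending on r, which illustrates how effect and position are confounded.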
4. Association Analysis
Concepts
Dissecting A Quantitative Trait: Time Versus Resolution
[Figure: research time in years (1 to 5) plotted against resolution in bp (1x10^7 down to 1, with 1x10^4 marked). Approaches shown: F2 QTL mapping, RI QTL mapping, NILs, positional cloning, associations.]
Resolution Versus Allelic Range
[Figure: alleles evaluated (1 to >40) plotted against resolution in bp (1x10^7 down to 1, with 1x10^4 marked). Approaches shown: pedigree, NIL, F2 or RIL mapping, positional cloning, associations in narrow germplasm, associations in diverse germplasm.]
Association Tests
• Evaluate whether nucleotide polymorphisms associate with phenotype
• Natural populations
• Exploit extensive recombination
[Figure: six plants with heights 1.3 m, 1.5 m, 1.4 m, 1.8 m, 2.0 m and 2.0 m, each shown with its multi-site haplotype]
Association mapping
• Mainstay of human genetics
  – One of a few possible approaches
  – Reproducibility was an issue
• Cystic fibrosis
  – Kerem et al. (1989). Science 245, 1073-1080.
• Alzheimer's disease
  – Corder et al. (1994). Nature Genet. 7, 180-184.
Associations may result from at least three causes
1. The locus is the cause of the phenotype
2. The locus is in linkage disequilibrium with the cause of the phenotype (linked and highly correlated)
[Figure: three panels showing haplotype counts at Locus 1 versus Locus 2. Adapted from Rafalski (2002) Curr Opin Plant Biol 5:94-100.]
Complete Linkage Disequilibrium: D' = 1, r2 = 1 (haplotype counts 6 and 6). Same mutational history and no recombination. No resolution.
Linkage Disequilibrium: D' = 1, r2 = 0.33 (haplotype counts 3, 6 and 3). Different mutational history and no recombination. Some resolution.
Linkage Equilibrium: D' = 0, r2 = 0 (haplotype counts 3, 3, 3 and 3). Same mutational history with recombination. Resolution.
3. Population structure can produce associations
[Figure: lines from the Andes and the U.S. genotyped at a G/T polymorphism. Panels: plant height (axis 80 to 200) and kernel hue (axis 0 to 10) by genotype, with P = 0.04 and P << 0.001.]
These non-functional associations can be accounted for by estimating the population structure using random markers.
5. QTL mapping analysis
QTL Analysis: Interval Mapping
[Figure: PlabQTL LOD profile plotted against map position (0 to 150 cM); peak at 96 with LOD = 4.7]
[Diagram: QTL (alleles Q, q) between flanking markers M1/m1 and M2/m2, with recombination frequencies r1 (M1 to QTL), r2 (QTL to M2) and r (M1 to M2)]
Approaches: Simple Interval Mapping, Composite Interval Mapping
Software: PlabQTL
QTL Analysis: Power of QTL detection
Power: Probability of finding a QTL
Heritability: $h^2 = \sigma_g^2 / \sigma_p^2$
[Figure: power (%) of QTL detection, 0 to 100, plotted against heritability (0.4 to 1.0) for population sizes N = 600, N = 300 and N = 100. Utz and Melchinger, 1994]
QTL Analysis: Conclusions
A trait is usually affected by a number of QTL. The largest ones are the easiest to detect, BUT
their presence makes detection of the others difficult.
Models can adjust for the large QTL and so detect the others.
QTL Analysis: Conclusions
QTL mapping combines qualitative linkage analysis with quantitative genetic analysis.
  – Association between marker genotypes and phenotypic trait values.
Single marker analysis is easy to perform but QTL effect and position are confounded. This results in low power of QTL detection.
Interval mapping approaches increase power of QTL detection and allow the estimation of QTL effects and position.
QTL Analysis: Conclusions
Estimates of QTL effects and the proportion of the genotypic variance explained by QTL are biased due to genotypic and environmental sampling.
Estimates of QTL position show low precision.
With large populations a large number of QTL is found for complex traits.
A large population size is therefore advisable when conducting a QTL study.
6. Candidate Genes
Functional Genomics Using Diversity
Forward Genetics
Trait
Positionally clone gene
Reverse Genetics
Trait
Candidate gene
QTL
Candidate Polymorphism
Comparative Genomics
Candidate Genes
Mutagenesis
Molecular & Biochemical Analyses
Expression
Positional Candidate Genes
Evolutionary Association Tests
Identify Genes with Phenotypic Effects
Move Alleles into Elite Lines with Transgenics and Introgression
Survey Diverse Races For:
1. Phenotype
2. Candidate Gene Sequence
3. Population History
Evaluate Phenotypic Effects and Make Germplasm Available to Breeders
Morphology
Physiology
QTL Mapping
Association Analysis
Identification of More Favorable Alleles
Enhanced Marker Assisted Breeding
7. Linkage Disequilibrium
Analysis
Properties of LD
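Haplotype frequencies for locus A (alleles A, a) and locus B (alleles B, b):

           B                        b                        Total
  A        P_AB = p_A p_B + D_AB    P_Ab = p_A p_b - D_AB    p_A
  a        P_aB = p_a p_B - D_AB    P_ab = p_a p_b + D_AB    p_a
  Total    p_B                      p_b                      1

The basic measure of LD is:
$D_{AB} = P_{AB} - p_A p_B$, with $D_{AB} = -D_{Ab} = -D_{aB} = D_{ab}$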
Linkage Disequilibrium versus Generations Since its Creation
[Figure: disequilibrium $r_{AB}$ (0.1 to 1) plotted against generation $g$ (0 to 500) for recombination rates c = 0.1, 0.02, 0.01, 0.005 and 0.001. The curves follow $r_{AB} \propto (1 - c)^g$.]
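
The decay of disequilibrium with generations can be sketched numerically. The code below assumes a purely geometric decay by a factor (1 - c) per generation from a starting value `r0`, and ignores drift, selection and new mutation; the function names are illustrative.

```python
import math


def ld_after(generations, c, r0=1.0):
    """Disequilibrium remaining after a number of generations.

    Assumes geometric decay by a factor (1 - c) per generation from r0.
    """
    return r0 * (1 - c) ** generations


def generations_to_reach(target, c, r0=1.0):
    """Generations needed for disequilibrium to fall from r0 to target."""
    return math.log(target / r0) / math.log(1 - c)


# Recombination rates shown in the figure legend.
for c in (0.1, 0.02, 0.01, 0.005, 0.001):
    remaining = ld_after(100, c)
    half_life = generations_to_reach(0.5, c)
    print(f"c = {c}: after 100 generations {remaining:.3f}, half-life {half_life:.0f} generations")
```

Tightly linked loci (small c) keep their disequilibrium for hundreds of generations, which is what makes old populations useful for high-resolution association mapping.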
Other Measures of LD
Can divide $D_{AB}$ by the maximum value it can obtain:
$D'_{AB} = D_{AB} / \max(-p_A p_B, -p_a p_b)$ if $D_{AB} < 0$
$D'_{AB} = D_{AB} / \min(p_A p_b, p_a p_B)$ if $D_{AB} > 0$
The sampling properties of $D'_{AB}$ are not well understood.
$r^2_{AB} = D^2_{AB} / (p_A p_B p_a p_b)$
$E(r^2) = 1 / (1 + 4Nc)$
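
D, D' and r2 can be computed directly from the four haplotype counts of a pair of biallelic loci. The sketch below follows the definitions given above; the function name is illustrative, and the example counts are one assumed arrangement of haplotypes that reproduces the D' and r2 values quoted for the three LD scenarios (complete LD, LD with three haplotypes, and equilibrium). Both loci are assumed to be polymorphic.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r2) from haplotype counts at two biallelic loci."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / n
    p_B = (n_AB + n_aB) / n
    p_a, p_b = 1 - p_A, 1 - p_B
    D = n_AB / n - p_A * p_B
    # Largest magnitude D could take given the allele frequencies.
    if D < 0:
        d_max = max(-p_A * p_B, -p_a * p_b)
    else:
        d_max = min(p_A * p_b, p_a * p_B)
    d_prime = D / d_max if d_max != 0 else 0.0
    r2 = D ** 2 / (p_A * p_B * p_a * p_b)
    return D, d_prime, r2


# Assumed haplotype arrangements (AB, Ab, aB, ab), for illustration only.
examples = {
    "complete LD": (6, 0, 0, 6),
    "LD, three haplotypes": (3, 0, 3, 6),
    "equilibrium": (3, 3, 3, 3),
}
for label, counts in examples.items():
    D, d_prime, r2 = ld_measures(*counts)
    print(f"{label}: D = {D:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

The second case shows why the two measures are not interchangeable: D' stays at 1 whenever one of the four haplotypes is absent, while r2 also reflects how well one locus predicts the other.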
LD generally decays rapidly with distance
[Figure: r2 (0.00 to 1.00) plotted against distance in bp (0 to 10000) for the maize genes d8, id1, sh1, tb1, d3, fae2, su1, bt2, sh2 and wx1]
Remington, D. L., et al. 2001. PNAS-USA 98:11479-11484, and unpublished data.
Population Effect on Linkage Disequilibrium in Maize

  Investigator   Population Studied   Extent of LD
  Gaut           Landraces            <1000 bp
  Buckler        Diverse Inbreds      2000 bp
  Rafalski       Elite Lines          100 kb? (6 kb euchromatin?)

Reviewed in Flint-Garcia, S. A. et al. 2003. Annual Review of Plant Biology 54:357-374.
8. Association Analysis
Allele Case-Control Test
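                allele 1      allele 2      Total
  Affected      n_1|aff       n_2|aff       2 n_aff
  Unaffected    n_1|unaff     n_2|unaff     2 n_unaff
  Total         n_1           n_2           2N (N individuals)

$X^2 = \sum_i \frac{(n_{i|aff} - n_{i|unaff})^2}{n_{i|aff} + n_{i|unaff}} \sim \chi^2_{(k-1)}$ if $n_{aff} = n_{unaff}$, where the sum runs over the k alleles of the marker.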
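
As a minimal sketch of the allele case-control statistic for equal numbers of affected and unaffected individuals, the function below sums (difference squared) over (total) across alleles. The function name and the allele counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def allele_case_control_chi2(aff_counts, unaff_counts):
    """Allele-based case-control statistic and its degrees of freedom.

    aff_counts / unaff_counts: allele counts per allele in affected and
    unaffected individuals. Valid only when both groups have equal size.
    """
    if sum(aff_counts) != sum(unaff_counts):
        raise ValueError("this simplified form requires n_aff == n_unaff")
    statistic = sum(
        (n_aff - n_unaff) ** 2 / (n_aff + n_unaff)
        for n_aff, n_unaff in zip(aff_counts, unaff_counts)
        if n_aff + n_unaff > 0
    )
    return statistic, len(aff_counts) - 1


# Hypothetical counts: 50 affected and 50 unaffected individuals (100 alleles each).
stat, df = allele_case_control_chi2([60, 40], [40, 60])
print(f"X2 = {stat:.1f} with {df} degree(s) of freedom")
```

For this example the statistic equals the ordinary Pearson chi-square on the 2 x 2 allele table, since the expected count in every cell is 50.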
Population Stratification: American Indian and Diabetes

Proportion with NIDDM by heritage and marker status:
  Index of Indian Heritage   Gm3;5,13,14 +   Gm3;5,13,14 -
  0                          17.8%           19.9%
  4                          28.3%           28.8%
  8                          35.9%           39.3%

Gm3;5,13,14 haplotype frequency by population:
  Population                       +       -       NIDDM Prevalence
  Full heritage American Indian    ~1%     ~99%    40%
  Caucasian                        ~66%    ~34%    15%

Study without knowledge of genetic background:
  Gm3;5,13,14 haplotype   Cases    Controls
  +                       7.8%     29.0%
  -                       92.2%    71.0%
  OR = 0.27, 95% CI = 0.18 to 0.40

Knowler 1988 Am J Hum Genet 43, 520-526.
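
The way stratification alone can create an association is easy to see numerically. The sketch below pools two populations using the approximate haplotype frequencies and NIDDM prevalences quoted on this slide, under two loud assumptions that are not from the source: the haplotype has no effect on disease within either population, and the two populations contribute equally to the sample. The function name is illustrative.

```python
def pooled_odds_ratio(strata):
    """Odds ratio (marker + versus -) in a pooled sample of several strata.

    strata: list of (weight, marker_frequency, disease_prevalence).
    Assumes marker and disease are independent WITHIN each stratum.
    """
    case_pos = case_neg = control_pos = control_neg = 0.0
    for weight, marker_freq, prevalence in strata:
        case_pos += weight * marker_freq * prevalence
        case_neg += weight * (1 - marker_freq) * prevalence
        control_pos += weight * marker_freq * (1 - prevalence)
        control_neg += weight * (1 - marker_freq) * (1 - prevalence)
    return (case_pos * control_neg) / (case_neg * control_pos)


# (weight, Gm3;5,13,14 frequency, NIDDM prevalence); equal weights are an assumption.
american_indian = (0.5, 0.01, 0.40)
caucasian = (0.5, 0.66, 0.15)

print(pooled_odds_ratio([american_indian]))             # no association within a population
print(pooled_odds_ratio([american_indian, caucasian]))  # spurious "protective" association
```

Under these assumptions the pooled odds ratio is well below 1 even though the marker is unrelated to disease in each population, which is the pattern that disappears once heritage is taken into account.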
Use SSR Markers to Estimate Population Structure
[Figure: % Stiff Stalk (0% to 100%) plotted against % Non-Stiff Stalk (0% to 100%). Groups: 8 Stiff Stalk, 38 Non-Stiff Stalk, 30 Sub-Tropical.]
Method: Pritchard, J. K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155:945-59.
Example: Remington, D. L., et al. 2001. Proc Natl Acad Sci U S A 98:11479-11484.
Logistic Regression Ratio Test For Association
• Adapted from Pritchard case-control approach
• Test statistic: $\Pr_1(C; T, \hat{Q}) / \Pr_0(C; \hat{Q})$
• Where:
  – C = candidate polymorphism distribution
  – T = trait value
  – Q = matrix of population membership
• Evaluated by logistic regression
• Significance evaluated by permutation based on haplotype distribution in populations
Pritchard, J. K., M. Stephens, N. A. Rosenberg, and P. Donnelly. 2000. Am J Hum Genet 67:170-181.
Population Structure Estimates Greatly Reduce Estimated Type I Error Rates
[Figure: SSR-estimated Type I error rate (0.00 to 0.25) for flowering time and height in fields 1 to 4, comparing three analyses: no population structure estimate, with population structure estimate, and population structure with rescaling]
Su1
• Sugary1 is an isoamylase, a starch debranching enzyme
• Sequenced fully from 32 diverse lines
• Sampled 2 small parts of gene from 102 lines
[Gene diagram; scale label: 11100bp]
Whitt, S. R., et al. 2002. PNAS-USA 99:12959-12962.
su1 Promoter & 1st Exon
• Two distinct alleles
• Sweet phenotype not associated
[Table: haplotype counts by group (Sweet, Dent + Flint, Pop); polymorphism label as extracted: 4564:D E. Individual counts are not recoverable from the extracted text.]
su1 Coding Region
• Two distinct alleles
• Sweet phenotype associated with W578R
[Table: haplotype counts by group (Sweet, Dent + Flint, Pop); polymorphism labels as extracted: 662:K E, 92163:F L, 578:W R. Individual counts are not recoverable from the extracted text.]
Su1 578:W R
Based on survey of 12 kbp from 32-102 lines.
Dwarf8 functional variation
[Gene diagram labels: MITE, Indel, 2 Amino Acid Deletion, SH2 Domain]
When controlling for population structure, associates with flowering time & plant height across 12 environments.
Thornsberry et al. 2001 Nat. Genet.
[Figure: days to silking relative to B73 (axis 0.6 to 1.8) by D8 SH2 variant]
9. Type I and Type II Error
Statistics - Hypothesis Test

                                   Null Hypothesis True   Null Hypothesis False
  Reject Null Hypothesis           Type I Error (α)       Correct
  Fail to Reject Null Hypothesis   Correct                Type II Error (β)

Power = 1 - β
P-value threshold = α (the null hypothesis is rejected when the P-value falls below α)
Experimentwise P value
• Each statistical test has a Type I error rate
  – Test 20 independent SNPs and, on average, one will be significant at P<0.05
• Bonferroni correction essentially divides the significance threshold by the number of tests
  – Often too conservative (no power), as markers are correlated
• Churchill and Doerge permutation helps estimate the experimentwise P
  – Permutes the entire genotype relative to the phenotypes
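
The three ideas above (error inflation, Bonferroni, and a permutation-based experimentwise threshold) can be sketched in a few lines. This is a simplified illustration in the spirit of the permutation approach, not the published procedure: the test statistic (absolute difference in class means for markers coded 0/1), the toy data, and all function names are assumptions made for the example.

```python
import random


def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experimentwise rate at or below alpha."""
    return alpha / n_tests


def abs_mean_difference(marker, phenotypes):
    """Toy statistic: |mean(class 1) - mean(class 0)| for a 0/1 coded marker."""
    ones = [p for g, p in zip(marker, phenotypes) if g == 1]
    zeros = [p for g, p in zip(marker, phenotypes) if g == 0]
    return abs(sum(ones) / len(ones) - sum(zeros) / len(zeros))


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Experimentwise threshold from the permutation distribution of the maximum.

    Phenotypes are shuffled relative to whole genotypes, so the correlation
    among markers is preserved in every permutation.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        maxima.append(max(abs_mean_difference(m, shuffled) for m in markers))
    maxima.sort()
    return maxima[min(int((1 - alpha) * n_perm), n_perm - 1)]


print(familywise_error(0.05, 20))      # chance of at least one false positive in 20 tests
print(bonferroni_threshold(0.05, 20))  # per-test threshold

# Hypothetical toy data: two markers scored on six individuals.
toy_markers = [[0, 0, 0, 1, 1, 1], [0, 1, 0, 1, 0, 1]]
toy_phenotypes = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
print(permutation_threshold(toy_markers, toy_phenotypes, n_perm=200))
```

Because the permutation threshold is derived from the maximum statistic over all markers, correlated markers do not make it as conservative as Bonferroni, which treats every test as independent.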
Power of approaches
• Sample size
  – 100 to 1000 are typical
• Heritability of trait
  – H2 = 10% - 90%
  – Depends on ability to measure trait
  – Interactions with environment
• Depends on statistical properties of test
Association Approaches Complement QTL Linkage Mapping
                                 Linkage (RILs)    Association
  Resolution                     10,000,000 bp     2000 bp
  Genome Scan                    High Power        Little Power
  Statistical Power per Allele   High              Low
  Allelic Range                  Low (1 or 2)      High (10s)