Date post: | 14-Jan-2016 |
Selective mapping and simulation study
high-density genome maps
Are used for:
• Comparative mapping
• Map-based cloning
• Genome sequencing
But genotyping costs time and money
And density can surpass resolution because of cosegregation (i.e. bins)
bins and map resolution
[Figure: bins in a random sample versus a selected sample]
high-density mapping strategies
Selective mapping
• Optimizes map resolution
• Requires less genotyping
Bin mapping
• Provides “honest” marker placement
• Aids map integration
selective mapping
Genotype a base population (many individuals, few markers)
Construct a precise framework map
Select a subsample with high resolution
Genotype subsequent markers on subsample
Vision et al. (2000) Genetics 155:407-420.
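The bins that the selection step relies on can be illustrated with a minimal sketch. It assumes doubled-haploid genotypes coded 0/1 at ordered framework markers and places each breakpoint at the midpoint of the flanking markers; the function names, the coding, and the midpoint convention are illustrative assumptions, not taken from the slides or from MapPop.

```python
from typing import List, Sequence


def breakpoints(genotypes: Sequence[int], positions: Sequence[float]) -> List[float]:
    """Estimated crossover positions (cM) for one individual.

    Assumption: a breakpoint sits at the midpoint of each pair of adjacent
    framework markers whose alleles differ.
    """
    points = []
    for i in range(len(genotypes) - 1):
        if genotypes[i] != genotypes[i + 1]:
            points.append((positions[i] + positions[i + 1]) / 2.0)
    return points


def bin_lengths(sample: Sequence[Sequence[int]], positions: Sequence[float]) -> List[float]:
    """Lengths of the bins defined by the pooled breakpoints of a sample."""
    cuts = sorted({p for ind in sample for p in breakpoints(ind, positions)})
    edges = [positions[0]] + cuts + [positions[-1]]
    return [b - a for a, b in zip(edges, edges[1:])]


positions = [0.0, 10.0, 20.0, 30.0, 40.0]  # framework marker positions in cM
sample = [
    [0, 0, 1, 1, 1],  # crossover between 10 and 20 cM
    [1, 1, 1, 0, 0],  # crossover between 20 and 30 cM
    [0, 0, 1, 1, 1],  # same interval as the first individual: no new bin
]
print(bin_lengths(sample, positions))  # [15.0, 10.0, 15.0]
```

The third individual shares its breakpoint with the first, so genotyping it adds no resolution; this is the redundancy that selecting a subsample tries to avoid.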
summarizing bin length distribution
Average (ABL)
• Most breakpoints
Maximum (MBL)
• Minimize areas of low resolution
Sum of Squares (SSBL) or Expected (EBL) = SSBL/G
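As a worked example, the four summaries can be computed from a list of bin lengths. This sketch assumes that G is the total map length, taken here as the sum of the bin lengths; the slide does not define G, and the function name is hypothetical.

```python
from typing import Dict, Sequence


def summarize_bins(lengths: Sequence[float]) -> Dict[str, float]:
    """Return ABL, MBL, SSBL and EBL for a list of bin lengths (cM)."""
    g = sum(lengths)                    # assumed: G = total map length
    ssbl = sum(x * x for x in lengths)  # sum of squared bin lengths
    return {
        "ABL": g / len(lengths),        # average bin length
        "MBL": max(lengths),            # maximum bin length
        "SSBL": ssbl,
        "EBL": ssbl / g,                # expected bin length = SSBL/G
    }


summary = summarize_bins([15.0, 10.0, 15.0])
print(summary["MBL"], summary["SSBL"], summary["EBL"])  # 15.0 550.0 13.75
```

EBL weights each bin by its own length, so it can be read as the length of the bin that a randomly placed marker would be expected to fall into.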
discrete optimization strategies
Mathematical programming
• Integer program for exact breakpoints
• Linear program
  • To derive lower bound
  • Randomized rounding to obtain candidate samples
Semi-greedy algorithm
• Avoids local optima
• Uses mixed objective function
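One common reading of "semi-greedy" is a greedy construction that, at each step, picks at random among the few best candidates instead of always taking the single best. The sketch below follows that reading and is not necessarily the algorithm used in MapPop. The mixed objective (an equal-weight blend of EBL and MBL), the `top_k` parameter, and the representation of each individual as a set of breakpoint positions are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Set


def bin_lengths_from_cuts(cuts: Set[float], genome_length: float) -> List[float]:
    """Bin lengths for a genome of the given length cut at the given positions."""
    edges = [0.0] + sorted(cuts) + [genome_length]
    return [b - a for a, b in zip(edges, edges[1:])]


def mixed_objective(cuts: Set[float], genome_length: float, weight: float = 0.5) -> float:
    """Hypothetical mixed objective: weight * EBL + (1 - weight) * MBL (lower is better)."""
    lengths = bin_lengths_from_cuts(cuts, genome_length)
    ebl = sum(x * x for x in lengths) / genome_length
    return weight * ebl + (1.0 - weight) * max(lengths)


def semi_greedy_sample(
    population: Sequence[Set[float]],
    genome_length: float,
    sample_size: int,
    top_k: int = 3,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Pick individuals one at a time, choosing randomly among the top_k best additions."""
    rng = rng or random.Random(0)
    chosen: List[int] = []
    cuts: Set[float] = set()
    remaining = list(range(len(population)))
    while len(chosen) < sample_size and remaining:
        ranked = sorted(
            remaining,
            key=lambda i: mixed_objective(cuts | population[i], genome_length),
        )
        pick = rng.choice(ranked[:top_k])  # random choice among the best few
        chosen.append(pick)
        cuts |= population[pick]
        remaining.remove(pick)
    return chosen


# Each individual is the set of its breakpoint positions (cM) on a 100 cM genome.
population = [{10.0}, {50.0}, {50.0, 90.0}, {25.0, 75.0}, set()]
print(semi_greedy_sample(population, 100.0, sample_size=2, top_k=2))
```

With `top_k=1` the procedure reduces to a plain greedy search; larger values trade some quality per step for the chance to escape a poor early choice, and the search can be repeated with different seeds to keep the best sample found.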
comparing results
Performance Ratio (PR) = Lsample/Lpopulation
• Optimum is 1
• Higher values are inferior
Also useful: Lsample/Lrandom
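A small worked example of both ratios, with L standing for any one of the bin length summaries (for instance MBL). The numeric values are made up for illustration and the function name is hypothetical.

```python
def performance_ratio(l_sample: float, l_reference: float) -> float:
    """Ratio of a bin length summary in the sample to the same summary in a reference."""
    if l_reference <= 0:
        raise ValueError("reference bin length must be positive")
    return l_sample / l_reference


# Illustrative MBL values in cM (not from the slides).
mbl_sample, mbl_population, mbl_random = 15.0, 12.0, 20.0

print(performance_ratio(mbl_sample, mbl_population))  # 1.25: 25% worse than the whole population
print(performance_ratio(mbl_sample, mbl_random))      # 0.75: better than a random sample
```

Against the whole population the ratio cannot drop below 1, because a subsample cannot contain breakpoints that the population lacks; against a random sample of the same size, values below 1 indicate that selection helped.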
simulation: 100 doubled haploids, 1000 cM genome
[Figure: performance ratio (MBL)]
simulation:cumulative bin length distribution
Legend: - whole pop. ▼ expected × maximum ◊ average Δ non-optimized
sample size=30
[Figure: performance ratio (MBL)]
barley IGRI x Franka cross: 150 doubled haploids
1100 cM genome
data from http://wheat.pw.usda.gov/ggpages/maps.html
Simulation study
[Figure: number of breakpoints (y-axis, 10 to 22) against number of markers (x-axis, 0 to 1200); legend text as extracted: cart 2 times pedigree]
Number of breakpoints in different simulations
[Figure: histogram of V2, values from .02 to 3.98, frequency axis 0 to 800; Std. Dev = .49, Mean = .50, N = 10000]
Additive effect: gamma(1,2)
[Figure: histogram of V1, values from .01 to .99, frequency axis 0 to 140; Std. Dev = .29, Mean = .50, N = 10000]
Epistasis effect: beta(1,1), only two level interaction
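A minimal sketch of drawing effects from the two distributions named above. It assumes that gamma(1,2) means shape 1 and rate 2 (scale 0.5), which is the parameterization consistent with the reported mean of .50 and standard deviation of .49; beta(1,1) is the uniform distribution on [0, 1]. The seed and variable names are arbitrary.

```python
import random
import statistics

rng = random.Random(42)
N = 10000

# random.gammavariate takes (shape, scale); rate 2 corresponds to scale 0.5 (assumed reading).
additive = [rng.gammavariate(1.0, 0.5) for _ in range(N)]

# beta(1,1) is uniform on [0, 1]; used here for the two-way interaction effects.
epistatic = [rng.betavariate(1.0, 1.0) for _ in range(N)]

for name, values in (("additive", additive), ("epistatic", epistatic)):
    print(f"{name}: mean = {statistics.mean(values):.2f}, "
          f"std. dev = {statistics.pstdev(values):.2f}")
```

The sample means and standard deviations should land close to the values printed under the two histograms, which is a quick check that the parameterization matches.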
Simulation procedure
Total: 500
Sample: 100 (MapPop), 50 (MapPop), 100 (Random), 10 (Random)
Analysis: LR, IM for each sample
Simulation:
Random:
1. 2QTL, h2:0.3,0.7,total size:500,marker:101;epi:0.5;
2. 10QTL, h2:0.3,0.7,total size:500, marker:101; epi:0.2;
3. Using QTL Cartographer to do 1,2.
Fixed QTL position:
1. 5QTL,h2:0.7,total size:500, marker:101; no epi
2. 5QTL ,h2:0.7, total size:1000, marker:101; no epi
3. 10 QTL h2:0.7, total size:500, marker:101; no epi
Set QTL far apart (100 cM away from each other):
1. Random 5 QTL positions, h2:0.5, total size:500; marker:101; epi:0.3
2. Similar to 1; the only difference is marker:201
3. Marker: 101 to 201.
4. Using mean traits for each line (2,5,10,100);
Fixed population and samples for mapping different sets of QTLs.
*Each has 100 replicates; threshold values were obtained from 300 rounds under the corresponding H0
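The threshold step can be sketched as taking an upper quantile of the genome-wide maximum test statistic over the null rounds. The 5% level, the function names, and the toy null model (independent squared normal draws at 101 markers) are assumptions for illustration; in the actual study the statistics would come from LR or IM analyses of data simulated without QTL, where linked markers are correlated.

```python
import math
import random
from typing import List, Sequence


def empirical_threshold(null_stats: Sequence[float], alpha: float = 0.05) -> float:
    """Empirical (1 - alpha) quantile of the null statistics."""
    ordered = sorted(null_stats)
    k = math.ceil((1.0 - alpha) * len(ordered)) - 1
    return ordered[max(k, 0)]


def null_round(rng: random.Random, n_markers: int = 101) -> float:
    """Toy stand-in for one H0 round: the largest statistic across all markers."""
    return max(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_markers))


rng = random.Random(1)
null_stats: List[float] = [null_round(rng) for _ in range(300)]  # 300 rounds under H0
threshold = empirical_threshold(null_stats)
exceed = sum(s > threshold for s in null_stats)
print(f"threshold = {threshold:.2f}; null rounds above it: {exceed} of {len(null_stats)}")
```

Using the maximum over the genome in each round controls the genome-wide false positive rate instead of the per-marker rate, which is why the resulting threshold is higher than a single-test critical value.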
How about the power
[Figure: QTL number (y-axis, 0 to 2.5) for 500, m100, r100, m50, r50; series: q203LR, q203IM, q207LR, q207IM]
*2QTL, random position h2:0.3,0.7,total size:500,marker:101;epi:0.5;
Average number of QTL detected
[Figure: QTL number (y-axis, 0 to 10) for 500, m100, r100, m50, r50; series: q1003LR, q1003IM, q1007LR, q1007IM]
*10QTL, random position ,h2:0.3,0.7,total size:500,marker:101;epi:0.5;
Average number of QTL detected
How about the power with increased population size
*F5500: 5QTL,h2:0.7,total size:500, marker:101; no epi
*F51000: 5QTL ,h2:0.7, total size:1000, marker:101; no epi
[Figure: number of QTL detected (y-axis, 0 to 6) for total, m100, r100, m50, r50; series: F5500LR, F5500IM, F51000LR, F51000IM]
Average number of QTL detected
How about the breakpoint number
*F500: total size is 500
*F1000: total size is 1000
Mean of breakpoint number for each individual
[Figure: number of breakpoints (y-axis, 0 to 25) for total, m100, r100, m50, r50; series: F500, F1000]
How about the detected QTL position
*5QTL far apart (100 cM away)
MSD of mapping position
[Figure: square of morgan (y-axis, 0 to 0.012) for 500, m100, r100, m50, r50; series: R5101M, R5201M]
How about the QTL effects
MSD of QTL effects
[Figure: additive effects (y-axis, 0 to 1) for 500, m100, r100, m50, r50; series: R5101M, R5201M]
*5QTL far apart (100 cM away)
Fixed population and sample, mapping different trait genes
the mean of detected QTL number
[Figure: QTL number (y-axis, 0 to 5) for 500, m100, r100, m50, r50; series: R5101LR, R5101IM]
*5QTL;far apart(100cM away);h2:0.5;epi:0.3
MSD of mapping position
[Figure: squared morgan (y-axis, 0 to 0.045) for 500, m100, r100, m50, r50; series: R5101IM]
*5QTL;far apart(100cM away);h2:0.5;epi:0.3
MSD of QTL effects
[Figure: additive effects (y-axis, 0 to 14) for 500, m100, r100, m50, r50; series: R5101IM]
*5QTL;far apart(100cM away);h2:0.5;epi:0.3
Future work:
1. Power vs. resolution.
2. Sample size needed to achieve the same power in a random sample.
3. Analyzing the breakpoint number vs. resolution theoretically.
4. Simulating to see how MapPop works in fine mapping.