
Experimental design and statistical analysis

Date post: 20-Oct-2014
Description:
Lecture of Dr Jiankang Wang about statistical analysis and QTL mapping
Transcript
Page 1: Experimental design and statistical analysis

1

Lecture 12

Genetic Linkage Analysis and Map Construction

Page 2: Experimental design and statistical analysis

2

Page 3: Experimental design and statistical analysis

Experiments with Plant Hybrids (1866)

Seed shape: 5474 round vs 1850 wrinkled
Cotyledon color: 6022 yellow vs 2001 green
Seed coat color: 705 grey-brown vs 224 white
Pod shape: 882 inflated vs 299 constricted
Unripe pod color: 428 green vs 152 yellow
Flower position: 651 axial vs 207 terminal
Stem length: 787 long (185-230 cm) vs 277 short (20-50 cm)
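The counts above can be checked against the expected 3:1 ratio with a Pearson chi-square statistic. The sketch below is only an illustration: the function name and the dictionary layout are assumptions, and 3.841 is the standard 5% critical value for one degree of freedom.

```python
# Dominant:recessive F2 counts for the seven traits listed above.
COUNTS = {
    "Seed shape": (5474, 1850),
    "Cotyledon color": (6022, 2001),
    "Seed coat color": (705, 224),
    "Pod shape": (882, 299),
    "Unripe pod color": (428, 152),
    "Flower position": (651, 207),
    "Stem length": (787, 277),
}

CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 degree of freedom


def chi_square_3_to_1(dominant, recessive):
    """Pearson chi-square statistic against an expected 3:1 ratio."""
    n = dominant + recessive
    expected = (0.75 * n, 0.25 * n)
    observed = (dominant, recessive)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CHI2 = {trait: chi_square_3_to_1(*pair) for trait, pair in COUNTS.items()}

for trait, (dom, rec) in COUNTS.items():
    print(f"{trait:18s} {dom / rec:.2f}:1  chi2 = {CHI2[trait]:.3f}")
```

Every statistic falls well below the critical value, which is the closeness of fit that Fisher later questioned.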

Rediscovered in 1900

Page 4: Experimental design and statistical analysis

4

Page 5: Experimental design and statistical analysis

Ear length of maize (East 1911)

5

P1: 7 cm; P2: 17 cm

One locus: a = (17-7)/2 = 5
F2: 1/4 aa (7) + 2/4 Aa (12) + 1/4 AA (17)

Two loci: a = (17-7)/4 = 2.5
F2: 1/16 (7) + 4/16 (9.5) + 6/16 (12) + 4/16 (14.5) + 1/16 (17)
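The one-locus and two-locus F2 distributions above are binomial in the number of increasing alleles. A minimal sketch, assuming equal additive effects, no dominance, and unlinked loci (the function names are hypothetical):

```python
from math import comb


def f2_distribution(p1, p2, loci):
    """Return [(genotypic value, frequency)] in F2 for `loci` equal additive loci."""
    alleles = 2 * loci
    a = (p2 - p1) / alleles  # additive effect per locus, as computed above
    return [(p1 + i * a, comb(alleles, i) / 2 ** alleles)
            for i in range(alleles + 1)]


def variance(dist):
    """Variance of a discrete distribution given as (value, frequency) pairs."""
    mean = sum(v * f for v, f in dist)
    return sum(f * (v - mean) ** 2 for v, f in dist)


one_locus = f2_distribution(7, 17, loci=1)
two_loci = f2_distribution(7, 17, loci=2)
print(one_locus, variance(one_locus))
print(two_loci, variance(two_loci))
```

Under these assumptions the F2 genetic variance equals k·a²/2 for k loci, so it halves when the same parental difference is spread over twice as many loci.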

Page 6: Experimental design and statistical analysis

6

Page 7: Experimental design and statistical analysis

7

Page 8: Experimental design and statistical analysis

8

Page 9: Experimental design and statistical analysis

$V_{F_2} = \frac{1}{2} k a^2$

Page 10: Experimental design and statistical analysis

$k = \frac{(P_1 - P_2)^2}{8\left[V_{F_2} - \frac{1}{2}(V_{P_1} + V_{P_2})\right]}$

$P_1 = \mu + ka, \quad P_2 = \mu - ka$

Page 11: Experimental design and statistical analysis

$V_A = \frac{1}{2}\sum a^2 = \frac{1}{2} k a^2$

$k = \frac{(P_1 - P_2)^2}{8 V_A}$

Page 12: Experimental design and statistical analysis

12

Page 13: Experimental design and statistical analysis

Mendel and Fisher

Annals of Science 1:115- [...] close to the values that Mendel expected under his theory that there must have been some manipulation, or omission, of data

Dominant trait: 1/3 AA + 2/3 Aa
Family size: 10
Mendel: Non-segregating (AA) : Segregating (Aa) = 1 : 2
Fisher: Pr{Aa family classified as AA} = 0.75^10 = 0.0563
Pr{family classified as segregating (Aa)} = 2/3 × (1 - 0.0563) = 0.6291
Non-segregating : Segregating = 0.3709 : 0.6291 = 1 : 1.696
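Fisher's correction can be reproduced in a few lines. This is a sketch of the arithmetic only; the variable names are illustrative.

```python
family_size = 10

# An Aa family is misread as AA when none of its offspring is recessive.
p_missed = 0.75 ** family_size

# Families expected to be classified as segregating and as non-segregating.
p_segregating = (2 / 3) * (1 - p_missed)
p_non_segregating = 1 - p_segregating

ratio = p_segregating / p_non_segregating
print(f"missed = {p_missed:.4f}, expected ratio = 1 : {ratio:.3f}")
```

The expected ratio is therefore closer to 1 : 1.7 than to the 1 : 2 that Mendel reported.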

Page 14: Experimental design and statistical analysis

14

Page 15: Experimental design and statistical analysis

Genetic populations and pair-wise linkage analysis

15

Page 16: Experimental design and statistical analysis

Populations handled in QTL IciMapping

Parents: P1 and P2
Legends: hybridization, selfing, repeated selfing, doubled haploids
F1 and backcross generations: P1BC2F1, P1BC1F1, F1, P2BC1F1, P2BC2F1

1. P1BC1F1     2. P2BC1F1     3. F1DH        4. F1RIL
5. P1BC1RIL    6. P2BC1RIL    7. F2          8. F3
9. P1BC2F1     10. P2BC2F1    11. P1BC2RIL   12. P2BC2RIL
13. P1BC1F2    14. P2BC1F2    15. P1BC2F2    16. P2BC2F2
17. P1BC1DH    18. P2BC1DH    19. P1BC2DH    20. P2BC2DH

BC3F1, BC4F1 etc. with marker-assisted selection: CSS lines or introgression lines

One NAM population: P1 × CP, P2 × CP, P3 × CP, ..., Pn × CP (CP = common parent), giving RIL family 1, RIL family 2, RIL family 3, ..., RIL family i, ..., RIL family n

Page 17: Experimental design and statistical analysis

17

Marker C263 R830 R3166 XNpb387 R569 R1553 C128 C1402 XNpb81 C246 R2953 C1447 Grain width (mm)

Position (cM) 0.0 3.5 8.5 19.5 32.0 66.6 74.1 78.6 81.8 91.9 92.7 96.8

RIL1 0 0 0 0 0 0 0 0 0 0 0 0 2.33

RIL2 2 2 2 2 2 0 0 0 0 2 2 2 1.99

RIL3 0 2 2 2 2 2 2 2 2 2 2 2 2.24

RIL4 0 0 0 0 0 0 2 2 2 2 2 2 1.94

RIL5 0 0 0 0 0 2 2 0 0 0 0 0 2.76

RIL6 0 0 0 2 2 2 2 2 2 2 2 2 2.32

RIL7 0 0 0 0 0 0 0 0 0 0 0 0 2.32

RIL8 2 2 0 2 2 0 0 0 0 2 2 2 2.08

RIL9 0 0 0 0 2 2 0 0 0 0 0 0 2.24

RIL10 0 0 0 0 2 2 0 0 0 0 0 0 2.45

Example: 10 RILs in a rice population (Linkage map of Chr. 5)

Page 18: Experimental design and statistical analysis

Genetic markers in linkage analysis

Morphological traits

hybridization experiments

Cytogenetic and bio-chemistry markers (e.g. isozyme) DNA molecular markers

RFLP, SSR, SNP etc.

Page 19: Experimental design and statistical analysis

The four gametes (haplotypes) of an F1

P1: AABB × P2: aabb
F1: AaBb
Meiosis in the F1 gives four gametes:

Gamete   Frequency   Type
AB       (1-r)/2     Parental type
ab       (1-r)/2     Parental type
Ab       r/2         Recombinant type
aB       r/2         Recombinant type

Page 20: Experimental design and statistical analysis

Expected genotypic frequency in backcross and DH populations

P1: AABB; P2: aabb

20

Page 21: Experimental design and statistical analysis

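MLE of recombination frequency

Likelihood function:
$L = \frac{n!}{n_1!\,n_2!\,n_3!\,n_4!}\left[\tfrac{1}{2}(1-r)\right]^{n_1}\left[\tfrac{1}{2}r\right]^{n_2}\left[\tfrac{1}{2}r\right]^{n_3}\left[\tfrac{1}{2}(1-r)\right]^{n_4} = C\,(1-r)^{n_1+n_4}\,r^{n_2+n_3}$

Logarithm of likelihood:
$\ln L = \ln C + (n_1+n_4)\ln(1-r) + (n_2+n_3)\ln r$

MLE of r:
$\hat r = \frac{n_2+n_3}{n_1+n_2+n_3+n_4} = \frac{n_2+n_3}{n}$

Fisher information:
$I = -E\left[\frac{d^2\ln L}{dr^2}\right] = E\left[\frac{n_1+n_4}{(1-r)^2} + \frac{n_2+n_3}{r^2}\right] = \frac{n}{r(1-r)}$

Variance of estimated r:
$V_{\hat r} = \frac{1}{I} = \frac{r(1-r)}{n}$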

Page 22: Experimental design and statistical analysis

Significance test of linkage

Null hypothesis H0: r = 0.5 (no genetic linkage, or locus A-a and B-b are independent)
Alternative hypothesis HA: r ≠ 0.5
Likelihood ratio test (LRT) or LOD score:

$LRT = -2\ln\left[\frac{L(r=0.5)}{L(\hat r)}\right] \sim \chi^2(df=1)$

$LOD = \log_{10}\left[\frac{L(\hat r)}{L(r=0.5)}\right]$

Page 23: Experimental design and statistical analysis

An example P1BC1 population

Genotypes of two inbred parents P1 and P2 are AABB and aabb
Observed samples of the four genotypes in P1BC1:

AABB 162    AABb 40    AaBB 41    AaBb 158

$\hat r = \frac{40+41}{162+40+41+158} = \frac{81}{401} = 20.20\%$

$V_{\hat r} = \frac{\hat r(1-\hat r)}{n} = 4.02\times10^{-4}$

Page 24: Experimental design and statistical analysis

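Test of linkage

Null hypothesis H0: r = 0.5
Alternative hypothesis HA: r ≠ 0.5
Likelihood ratio test (LRT) (P<0.0001) and LOD score:

$\frac{L(\hat r)}{L(r=0.5)} = \frac{(1-\hat r)^{n_1+n_4}\,\hat r^{\,n_2+n_3}}{(1/2)^{n_1+n_2+n_3+n_4}} = 1.2\times10^{33}$

$LRT = 2\ln\left[\frac{L(\hat r)}{L(r=0.5)}\right] = 152.37$

$LOD = \log_{10}\left[\frac{L(\hat r)}{L(r=0.5)}\right] = 33.09$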
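The estimate and the test can be checked numerically. The sketch below assumes the likelihood ratio is formed from the full genotype frequencies, so the multinomial constant and the common factor (1/2)^n cancel and the ratio reduces to (1-r)^(n1+n4) r^(n2+n3) / 0.5^n; the function name is hypothetical.

```python
from math import log


def backcross_linkage(n1, n2, n3, n4):
    """MLE of r and linkage test from four backcross (or DH) genotype counts.

    n1 and n4 are the parental classes, n2 and n3 the recombinant classes.
    """
    n = n1 + n2 + n3 + n4
    rec, par = n2 + n3, n1 + n4
    r_hat = rec / n
    var_r = r_hat * (1 - r_hat) / n
    # ln[L(r_hat) / L(0.5)] after cancelling the constant terms
    log_ratio = n * log(2.0)
    if rec:
        log_ratio += rec * log(r_hat)
    if par:
        log_ratio += par * log(1 - r_hat)
    return {"r": r_hat, "var": var_r,
            "LRT": 2 * log_ratio, "LOD": log_ratio / log(10)}


result = backcross_linkage(162, 40, 41, 158)
print(result)
```

A useful sanity bound: with r estimated as 0, the LOD score cannot exceed n·log10(2), about 120.7 for n = 401.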

Page 25: Experimental design and statistical analysis

Genotypic frequencies in RIL populations, compared with DH

Genotype   DH population             RIL population
           (theoretical frequency)   (theoretical frequency)
AABB       f1 = (1-r)/2              f1 = (1-R)/2
AAbb       f2 = r/2                  f2 = R/2
aaBB       f3 = r/2                  f3 = R/2
aabb       f4 = (1-r)/2              f4 = (1-R)/2

R = 2r/(1+2r)

Page 26: Experimental design and statistical analysis

RIL     Marker 1 (C263)   Marker 2 (XNpb387)   Parent type or recombinant
RIL1    0 or A            0 or A               P1 type
RIL2    2 or B            2 or B               P2 type
RIL3    0 or A            2 or B               Recombinant
RIL4    0 or A            0 or A               P1 type
RIL5    0 or A            0 or A               P1 type
RIL6    0 or A            2 or B               Recombinant
RIL7    0 or A            0 or A               P1 type
RIL8    2 or B            2 or B               P2 type
RIL9    0 or A            0 or A               P1 type
RIL10   0 or A            0 or A               P1 type

n1 = 6, n2 = 2, n3 = 0, n4 = 2

R = 2/10 = 0.2

r = R/(2-2R) = 0.125

LRT = 3.85 (P ≈ 0.05)

LOD = 0.84
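For RILs the observed recombinant fraction estimates R, which is then converted back to the one-meiosis r. A minimal sketch, assuming the test compares R with 0.5 using the same likelihood ratio as for DH lines, and using the fact that the upper tail of a chi-square with one degree of freedom equals erfc(sqrt(x/2)); the function name is hypothetical.

```python
from math import erfc, log, sqrt


def ril_linkage(n_parental, n_recombinant):
    """Return (R, r, LRT, LOD, P) for two markers scored in RILs."""
    n = n_parental + n_recombinant
    big_r = n_recombinant / n            # accumulated recombination frequency R
    small_r = big_r / (2 - 2 * big_r)    # inverse of R = 2r / (1 + 2r)
    log_ratio = n * log(2.0)             # ln[L(R_hat) / L(0.5)]
    if n_recombinant:
        log_ratio += n_recombinant * log(big_r)
    if n_parental:
        log_ratio += n_parental * log(1 - big_r)
    lrt = 2 * log_ratio
    p_value = erfc(sqrt(lrt / 2))        # chi-square upper tail, 1 df
    return big_r, small_r, lrt, log_ratio / log(10), p_value


R, r, lrt, lod, p = ril_linkage(n_parental=8, n_recombinant=2)
print(f"R={R:.3f} r={r:.3f} LRT={lrt:.2f} LOD={lod:.2f} P={p:.4f}")
```

With only 10 lines the LOD score cannot exceed 10·log10(2), about 3.01, so a small sample gives weak evidence even for closely linked markers.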

Page 27: Experimental design and statistical analysis

Expected genotypic frequencies in F2 populations

Page 28: Experimental design and statistical analysis

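MLE of r in F2: dominant markers

Logarithm of the likelihood, with $k = (1-r)^2$:
$\ln L = \ln C + n_1\ln(3-2r+r^2) + (n_3+n_7)\ln(2r-r^2) + n_9\ln(1-2r+r^2)$
$\ln L = \ln C + n_1\ln(2+k) + (n_3+n_7)\ln(1-k) + n_9\ln k$

MLE of r:
$\hat k = \frac{(n_1-2n_3-2n_7-n_9) + \sqrt{(n_1-2n_3-2n_7-n_9)^2 + 8\,n\,n_9}}{2n}, \quad \hat r = 1 - \sqrt{\hat k}$

Variance of the estimated r:
$V_{\hat r} = \frac{(1-k)(2+k)}{2n(1+2k)} = \frac{(2r-r^2)(3-2r+r^2)}{2n(3-4r+2r^2)}$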

Page 29: Experimental design and statistical analysis

MLE of r in F2: co-dominant markers (Newton-Raphson algorithm)

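Log-likelihood function:
$\ln L = \ln C + (2n_1+n_2+n_4+n_6+n_8+2n_9)\ln(1-r) + (n_2+2n_3+n_4+n_6+2n_7+n_8)\ln r + n_5\ln(1-2r+2r^2)$

The first-order derivative of LogL, f'(r):
$\frac{d\ln L}{dr} = -\frac{2n_1+n_2+n_4+n_6+n_8+2n_9}{1-r} + \frac{n_2+2n_3+n_4+n_6+2n_7+n_8}{r} + \frac{n_5(4r-2)}{1-2r+2r^2}$

The second-order derivative of LogL, f''(r):
$\frac{d^2\ln L}{dr^2} = -\frac{2n_1+n_2+n_4+n_6+n_8+2n_9}{(1-r)^2} - \frac{n_2+2n_3+n_4+n_6+2n_7+n_8}{r^2} + \frac{2n_5(4r-4r^2)}{(1-2r+2r^2)^2}$

The iteration algorithm:
$r_{i+1} = r_i - f'(r_i)/f''(r_i)$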
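The Newton-Raphson iteration can be sketched as follows. The genotype order n1..n9 (AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb) is an assumption, and the counts are hypothetical: they are the expected counts for r = 0.2 and n = 1000, so the iteration should return 0.2.

```python
def newton_raphson_r(counts, r0=0.25, tol=1e-10, max_iter=50):
    """MLE of r from the nine F2 genotype counts of two co-dominant markers."""
    n1, n2, n3, n4, n5, n6, n7, n8, n9 = counts
    a = 2 * n1 + n2 + n4 + n6 + n8 + 2 * n9   # coefficient of ln(1 - r)
    b = n2 + 2 * n3 + n4 + n6 + 2 * n7 + n8   # coefficient of ln r
    r = r0
    for _ in range(max_iter):
        q = 1 - 2 * r + 2 * r ** 2
        first = -a / (1 - r) + b / r + n5 * (4 * r - 2) / q
        second = (-a / (1 - r) ** 2 - b / r ** 2
                  + n5 * 8 * r * (1 - r) / q ** 2)
        step = first / second
        r -= step
        if abs(step) < tol:
            break
    return r


# Hypothetical counts: expected values for r = 0.2 in an F2 of 1000 plants.
counts = [160, 80, 10, 80, 340, 80, 10, 80, 160]
r_hat = newton_raphson_r(counts)
print(round(r_hat, 6))
```

The starting value must lie strictly between 0 and 1, because both derivatives are undefined at the boundaries.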

Page 30: Experimental design and statistical analysis

MLE of r in F2: co-dominant markers (EM algorithm)

EM stands for expectation and maximization
E-step: for an initial r0, calculate the probability of crossover in each marker type
M-step: update r, and repeat from the E-step

$r' = \frac{1}{n}\sum_{k} n_k\,P(R \mid G_k)$

Page 31: Experimental design and statistical analysis

Expected probability of crossover

r = [n1×0 + n2×0.5 + n3×1 + ... + n8×0.5 + n9×0]/n
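The E-step and M-step can be sketched as below. Two assumptions are made: the genotype order n1..n9 is AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb, and the double heterozygote AaBb carries two recombinant gametes with probability r²/(1-2r+2r²). The counts are hypothetical (expected values for r = 0.2, n = 1000).

```python
def em_recombination(counts, r0=0.5, n_iter=3):
    """Return the sequence of r estimates produced by EM in an F2."""
    n1, n2, n3, n4, n5, n6, n7, n8, n9 = counts
    n = sum(counts)
    r = r0
    history = [r]
    for _ in range(n_iter):
        # E-step: expected proportion of recombinant gametes in AaBb
        p5 = r * r / (1 - 2 * r + 2 * r * r)
        # M-step: weighted proportion of recombinant gametes over all classes
        r = (0.5 * (n2 + n4 + n6 + n8) + 1.0 * (n3 + n7) + p5 * n5) / n
        history.append(r)
    return history


counts = [160, 80, 10, 80, 340, 80, 10, 80, 160]
for start in (0.5, 0.25, 0.0):
    print(start, [round(x, 4) for x in em_recombination(counts, start)])
```

Only the AaBb class depends on the current r, so all three starting values move toward the same estimate within a few iterations.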

Page 32: Experimental design and statistical analysis

Estimated r after 3 EM iterations (r0=0.5)

Page 33: Experimental design and statistical analysis

Estimated r after 3 EM iterations (r0=0.25)

Page 34: Experimental design and statistical analysis

Estimated r after 3 EM iterations (r0=0.0)

Page 35: Experimental design and statistical analysis

Co-dominant markers in other populations

R=2r/(1+2r)

Page 36: Experimental design and statistical analysis

More populations (e.g. BC1F2, F3 etc): Generation transition matrix of

Page 37: Experimental design and statistical analysis

Distortion has little effect on linkage analysis!

DH pop   Theo. freq.      Distortion     Freq. in distortion
AABB     f1 = (1-r)/2     (1-r)/2        (1-r)/(1+s)
AAbb     f2 = r/2         r/2            r/(1+s)
aaBB     f3 = r/2         s r/2          r s/(1+s)
aabb     f4 = (1-r)/2     s (1-r)/2      (1-r) s/(1+s)
Sum      1                (1+s)/2        1

$\hat r = r/(1+s) + rs/(1+s) = r(1+s)/(1+s) = r$
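A quick numerical check of the table: the sketch assumes that distortion acts at locus A only, with relative fitness s for the aa genotypes, as in the table above. The function names are illustrative.

```python
def dh_frequencies(r, s):
    """DH genotype frequencies (AABB, AAbb, aaBB, aabb) under distortion s."""
    raw = [(1 - r) / 2, r / 2, s * r / 2, s * (1 - r) / 2]
    total = sum(raw)  # equals (1 + s) / 2
    return [x / total for x in raw]


def recombinant_fraction(r, s):
    """Expected frequency of the two recombinant classes AAbb and aaBB."""
    freqs = dh_frequencies(r, s)
    return freqs[1] + freqs[2]


for s in (0.2, 0.5, 1.0, 2.0):
    print(s, round(recombinant_fraction(0.15, s), 6))
```

The recombinant fraction stays at r for every s, although the marker segregation ratio at locus A shifts from 1:1 to 1:s.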

Page 38: Experimental design and statistical analysis

Three-point analysis and linkage map construction

38

Page 39: Experimental design and statistical analysis

Linkage analysis of three markers

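$r_{13} = r_{12} + r_{23} - 2(1-\delta)\,r_{12}r_{23}$, where $\delta$ is the interference

When $\delta = 0$ (no interference),
$1 - r_{13} = (1-r_{12})(1-r_{23}) + r_{12}r_{23}$
$r_{13} = r_{12}(1-r_{23}) + (1-r_{12})r_{23} = r_{12} + r_{23} - 2r_{12}r_{23}$

When $\delta = 1$ (complete interference),
$r_{13} = r_{12} + r_{23}$

The order of the three loci can be determined after linkage analysis (3!/2 = 3 potential orders):
1 2 3, or 1 3 2, or 2 1 3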

Page 40: Experimental design and statistical analysis

Mapping distance and recombination frequency

Mapping distance is additive: $m_{13} = m_{12} + m_{23}$

Unit of mapping distance: M (Morgan) or cM (centi-Morgan), 1 M = 100 cM

The function of mapping distance on recombination frequency (mapping function): $m = f(r)$

Page 41: Experimental design and statistical analysis

Common mapping functions

Morgan function (complete interference)
In M: m = r (M)
In cM: m = r × 100 (cM)

Haldane function (no interference)
In M: $m = f(r) = -\frac{1}{2}\ln(1-2r)$, $r = \frac{1}{2}(1-e^{-2m})$
In cM: $m = f(r) = -50\ln(1-2r)$, $r = \frac{1}{2}(1-e^{-m/50})$

Kosambi function (interference depends on length of interval)
In M: $m = \frac{1}{4}\ln\frac{1+2r}{1-2r}$, $r = \frac{1}{2}\,\frac{e^{4m}-1}{e^{4m}+1}$
In cM: $m = 25\ln\frac{1+2r}{1-2r}$, $r = \frac{1}{2}\,\frac{e^{m/25}-1}{e^{m/25}+1}$
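The three mapping functions and their inverses, in cM, can be written directly. This is a minimal sketch with illustrative function names; r must stay below 0.5 for the Haldane and Kosambi functions.

```python
from math import exp, log


def morgan_cm(r):
    """Morgan function: complete interference."""
    return 100 * r


def haldane_cm(r):
    """Haldane function: no interference."""
    return -50 * log(1 - 2 * r)


def haldane_r(m):
    """Inverse Haldane function, m in cM."""
    return 0.5 * (1 - exp(-m / 50))


def kosambi_cm(r):
    """Kosambi function: interference decreases with interval length."""
    return 25 * log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(m):
    """Inverse Kosambi function, m in cM."""
    return 0.5 * (exp(m / 25) - 1) / (exp(m / 25) + 1)


for r in (0.05, 0.1, 0.2, 0.3, 0.4):
    print(r, round(morgan_cm(r), 2), round(kosambi_cm(r), 2),
          round(haldane_cm(r), 2))
```

For any r between 0 and 0.5 the distances are ordered Morgan < Kosambi < Haldane, and the three nearly coincide for small r.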

Page 42: Experimental design and statistical analysis

Comparison of the three functions

[Figure: mapping distance (cM) plotted against recombination frequency for the three mapping functions]

Page 43: Experimental design and statistical analysis

Three steps in linkage map construction

Step 1: Grouping. Grouping can be based on
(i) a threshold of LOD score
(ii) a threshold of marker distance (cM)
(iii) anchor information

Step 2: Ordering. Three ordering algorithms are
(i) SER: SERiation (Buetow and Chakravarti, 1987. Am J Hum Genet 41:180-188)
(ii) RECORD: REcombination Counting and ORDering (Van Os et al., 2005. Theor Appl Genet 112:30-40)
(iii) nnTwoOpt: nearest neighbor is used for tour construction, and two-opt for tour improvement, similar to the Travelling Salesman Problem (TSP) (Lin and Kernighan, 1973. Oper. Res. 21:498-516)

Page 44: Experimental design and statistical analysis

Three steps in linkage map construction

Due to the large number of markers (n), it is impossible to compare all possible orders (say n = 50, the number of possible orders is n!/2 = 1.52×10^64). Orders from the above algorithms are regional optimizations.

Step 3: Rippling. Five rippling criteria are
(i) SARF (Sum of Adjacent Recombination Frequencies)
(ii) SAD (Sum of Adjacent Distances)
(iii) SALOD (Sum of Adjacent LOD scores)
(iv) COUNT (number of recombination events)
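The idea of nearest-neighbor construction followed by two-opt improvement can be sketched on a toy example. This is not the QTL IciMapping implementation: the marker positions are hypothetical, the pairwise recombination frequencies are generated from the Haldane function, and SARF is used as the criterion to minimize.

```python
from math import exp, factorial


def sarf(order, rec):
    """Sum of adjacent recombination frequencies for a marker order."""
    return sum(rec[a][b] for a, b in zip(order, order[1:]))


def nearest_neighbour(rec, start=0):
    """Tour construction: repeatedly append the closest unplaced marker."""
    order = [start]
    remaining = [m for m in range(len(rec)) if m != start]
    while remaining:
        nxt = min(remaining, key=lambda m: rec[order[-1]][m])
        order.append(nxt)
        remaining.remove(nxt)
    return order


def two_opt(order, rec):
    """Tour improvement: reverse segments while that lowers SARF."""
    improved = True
    while improved:
        improved = False
        for i in range(len(order) - 1):
            for j in range(i + 1, len(order)):
                candidate = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
                if sarf(candidate, rec) < sarf(order, rec) - 1e-12:
                    order, improved = candidate, True
    return order


# Hypothetical positions (cM) of five markers, listed in scrambled order.
positions = [20.0, 0.0, 40.0, 10.0, 30.0]
rec = [[0.5 * (1 - exp(-abs(p - q) / 50)) for q in positions]
       for p in positions]

start_order = nearest_neighbour(rec)
final_order = two_opt(start_order, rec)
orders_for_50_markers = factorial(50) // 2
print(start_order, final_order, f"{orders_for_50_markers:.3e}")
```

Here nearest neighbor alone ends in a wrong order because it starts in the middle of the chromosome, and one segment reversal repairs it. A map order and its mirror image are equivalent, which is why only n!/2 orders are distinct.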

Page 45: Experimental design and statistical analysis

The MAP functionality in QTL IciMapping

45

Page 46: Experimental design and statistical analysis

Interface of the MAP functionality

Page 47: Experimental design and statistical analysis

Map outputs:

Linkage map for each chromosome (A) or all chromosomes (B)

A. Map of one chromosome B. Map of all chromosomes

Page 48: Experimental design and statistical analysis

An example map of seven chromosomes or groups

48

Page 49: Experimental design and statistical analysis

Linkage map and physical map

Species       Size of haploid genome (kb)   Size of linkage map (cM)   kb/cM
Yeast         2.2×10^4                      3700                       6
Neurospora    4.2×10^4                      500                        80
Arabidopsis   7.0×10^4                      500                        140
Drosophila    2.0×10^5                      290                        700
Tomato        7.2×10^5                      1400                       510
Human         3.0×10^6                      2710                       1110
Wheat         1.6×10^7                      2575                       6214
Rice          4.4×10^5                      1575                       279
Corn          3.0×10^6                      1400                       2140

Page 50: Experimental design and statistical analysis
Page 51: Experimental design and statistical analysis
Page 52: Experimental design and statistical analysis
Page 53: Experimental design and statistical analysis
Page 54: Experimental design and statistical analysis

What is QTL Mapping? The procedure to map individual genetic factors with small effects on the quantitative traits, to specific chromosomal segments in the genome

The key questions in QTL mapping studies are: How many QTL are there? Where are they in the marker map? How large an influence does each of them have on the trait of interest?

Page 55: Experimental design and statistical analysis
Page 56: Experimental design and statistical analysis

Marker C263 R830 R3166 XNpb387 R569 R1553 C128 C1402 XNpb81 C246 R2953 C1447 Grain width (mm)

Position (cM) 0.0 3.5 8.5 19.5 32.0 66.6 74.1 78.6 81.8 91.9 92.7 96.8

RIL1 0 0 0 0 0 0 0 0 0 0 0 0 2.33

RIL2 2 2 2 2 2 0 0 0 0 2 2 2 1.99

RIL3 0 2 2 2 2 2 2 2 2 2 2 2 2.24

RIL4 0 0 0 0 0 0 2 2 2 2 2 2 1.94

RIL5 0 0 0 0 0 2 2 0 0 0 0 0 2.76

RIL6 0 0 0 2 2 2 2 2 2 2 2 2 2.32

RIL7 0 0 0 0 0 0 0 0 0 0 0 0 2.32

RIL8 2 2 0 2 2 0 0 0 0 2 2 2 2.08

RIL9 0 0 0 0 2 2 0 0 0 0 0 0 2.24

RIL10 0 0 0 0 2 2 0 0 0 0 0 0 2.45

Page 57: Experimental design and statistical analysis

Bi-parental mapping populations (linkage mapping)

Temporary population: F2 and BC
Permanent population: RIL, DH, CSSL
Secondary population

Association mapping
Natural populations: human and animals

Page 58: Experimental design and statistical analysis

Single marker analysis (Sax 1923; Soller et al. 1976)

The single marker analysis identifies QTLs based on the difference between the mean phenotypes for different marker groups, but cannot separate the estimates of recombination fraction and QTL effect.

Interval mapping (IM) (Lander and Botstein 1989)

IM is based on maximum likelihood parameter estimation and provides a likelihood ratio test for QTL position and effect. The major disadvantage of IM is that the estimates of locations and effects of QTLs may be biased when QTLs are linked.

Regression interval mapping (RIM) (Haley and Knott 1992; Martinez and Curnow 1992)

RIM was proposed to approximate maximum likelihood interval mapping and to save computation time at one or multiple genomic positions.

Page 59: Experimental design and statistical analysis

Composite interval mapping (CIM) (Zeng 1994)

CIM combines IM with multiple marker regression analysis, which controls the effects of QTLs in other intervals or on other chromosomes on the QTL being tested, and thus increases the precision of QTL detection.

Multiple interval mapping (MIM) (Kao et al. 1999)

MIM is a state-of-the-art gene mapping procedure. But implementation of the multiple-QTL model is difficult, since the number of QTL defines the dimension of the model and is itself an unknown parameter of interest.

Bayesian model (Sillanpää and Corander 2002)

In any Bayesian model, a prior distribution has to be considered. Based on the prior, Bayesian statistics derives the posterior, and then conducts inference based on the posterior distribution. However, Bayesian models have not been widely used in practice, partially due to the complexity of computation and the lack of user-friendly software.

Page 60: Experimental design and statistical analysis

[Figure: two panels, A and B, each showing a QTL and the marker genotype classes mm, Mm, and MM]

Page 61: Experimental design and statistical analysis

Backcrosses (P1BC1 and P2BC1) of P1: MMQQ and P2: mmqq

BC1                                         BC2
Genotype   Frequency   Genotypic value      Genotype   Frequency   Genotypic value
MMQQ       (1-r)/2     μ+a                  MmQq       (1-r)/2     μ+d
MMQq       r/2         μ+d                  Mmqq       r/2         μ-a
MmQQ       r/2         μ+a                  mmQq       r/2         μ+d
MmQq       (1-r)/2     μ+d                  mmqq       (1-r)/2     μ-a

Page 62: Experimental design and statistical analysis

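Two marker types:
$\mu_{MM} = (1-r)\mu_{MMQQ} + r\mu_{MMQq} = (1-r)(\mu+a) + r(\mu+d) = \mu + (1-r)a + rd$
$\mu_{Mm} = r\mu_{MmQQ} + (1-r)\mu_{MmQq} = r(\mu+a) + (1-r)(\mu+d) = \mu + ra + (1-r)d$

Difference in phenotype between the two types:
$\mu_{MM} - \mu_{Mm} = (1-2r)(a-d)$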
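The difference between marker classes mixes the recombination frequency with the QTL effect, which is why single marker analysis cannot separate the two. A small sketch with hypothetical parameter values:

```python
def marker_class_means(mu, a, d, r):
    """Expected means of marker classes MM and Mm in the P1BC1 backcross."""
    mean_mm = (1 - r) * (mu + a) + r * (mu + d)
    mean_het = r * (mu + a) + (1 - r) * (mu + d)
    return mean_mm, mean_het


def marker_contrast(mu, a, d, r):
    """Difference between the two marker class means."""
    mean_mm, mean_het = marker_class_means(mu, a, d, r)
    return mean_mm - mean_het


# A small QTL at the marker and a larger QTL 0.25 away give the same contrast.
tight_small = marker_contrast(mu=10.0, a=1.0, d=0.0, r=0.0)
loose_large = marker_contrast(mu=10.0, a=2.0, d=0.0, r=0.25)
print(tight_small, loose_large)
```

Both settings give a contrast of 1.0, so the marker data alone cannot tell them apart.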

Page 63: Experimental design and statistical analysis

Linear model (j = 1, 2, ..., n):
$y_j = b_0 + b^* x_j^* + e_j$

b* represents the QTL effect; $x_j^*$ is the indicator variable (0 or 1) for QTL genotype

Likelihood profile

Support interval: One-LOD interval

Page 64: Experimental design and statistical analysis

[Figure: chromosome diagrams of a QTL (Q/q) between the flanking markers Mi and Mi+1 in P1, P2, F1, and the backcross of the F1 to P1, showing the parental and recombinant haplotypes]

Page 65: Experimental design and statistical analysis
Page 66: Experimental design and statistical analysis

Assumption: No more than one QTL per chromosome or linkage group

Large confidence interval
Biased effect estimation

Composite interval mapping (CIM) (Zeng 1994)

Page 67: Experimental design and statistical analysis

In the algorithm of CIM, both QTL effect at the current testing position and regression coefficients of the marker variables used to control genetic background were estimated simultaneously in an expectation and maximization (EM) algorithm.

Thus, this algorithm could not completely ensure that the effect of QTL at current testing interval was not absorbed by the background marker variables and therefore may result in biased estimation of the QTL effect.

Page 68: Experimental design and statistical analysis

Theoretical basis of ICIM

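$G = \mu + \sum_{j=1}^{m} a_j g_j + \sum_{j<k} aa_{jk}\,g_j g_k$

$E(g_j \mid \mathbf{X}) = \lambda_j x_j + \rho_j x_{j+1}$

$E(g_j g_k \mid \mathbf{X}) = \lambda_j\lambda_k x_j x_k + \lambda_j\rho_k x_j x_{k+1} + \rho_j\lambda_k x_{j+1}x_k + \rho_j\rho_k x_{j+1}x_{k+1}$

$y_i = b_0 + \sum_{j=1}^{m+1} b_j x_{ij} + \sum_{j<k} b_{jk}\,x_{ij}x_{ik} + e_i$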

Page 69: Experimental design and statistical analysis

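One-dimensional scanning (interval mapping):
$\Delta y_i = y_i - \sum_{j \ne k,\,k+1} \hat b_j x_{ij}$

Two-dimensional scanning (interval mapping):
$\Delta y_i = y_i - \sum_{r \ne j,\,j+1,\,k,\,k+1} \hat b_r x_{ir} - \sum_{r<s;\ r \ne j,\,j+1;\ s \ne k,\,k+1} \hat b_{rs}\,x_{ir}x_{is}$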

Page 70: Experimental design and statistical analysis

[Figure: three pairs of panels plotting LOD score and estimated effect against scanning position along the genome (chromosomes 1 to 4)]

Page 71: Experimental design and statistical analysis

Detecting epistasis where the interacting

significant additive effects

Page 72: Experimental design and statistical analysis
Page 73: Experimental design and statistical analysis
Page 74: Experimental design and statistical analysis

One-locus model in F2

One-locus model:
$G = \mu + aw + dv$

where μ is the mean of the two homozygous genotypes QQ and qq, a is the additive effect, and d is the dominance effect. w and v are the indicators for genotypes at the QTL, valued at 1 and 0 for QQ, 0 and 1 for Qq, and -1 and 0 for qq, respectively.

Page 75: Experimental design and statistical analysis

The expected genotypic value of an individual with known marker types

$E(G \mid x_1,x_2,y_1,y_2) = \mu + a\,E(w \mid x_1,x_2,y_1,y_2) + d\,E(v \mid x_1,x_2,y_1,y_2)$

Page 76: Experimental design and statistical analysis

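Probability of the three QTL genotypes under given marker types

QTL genotypes: QQ (w=1, v=0), value μ+a; Qq (w=0, v=1), value μ+d; qq (w=-1, v=0), value μ-a

Left marker AA, right marker BB:
  QQ: $\frac{1}{4}(1-r_1)^2(1-r_2)^2$
  Qq: $\frac{1}{2}r_1(1-r_1)r_2(1-r_2)$
  qq: $\frac{1}{4}r_1^2r_2^2$

Left marker AA, right marker Bb:
  QQ: $\frac{1}{2}(1-r_1)^2r_2(1-r_2)$
  Qq: $\frac{1}{2}r_1(1-r_1)(1-r_2)^2 + \frac{1}{2}r_1(1-r_1)r_2^2$
  qq: $\frac{1}{2}r_1^2r_2(1-r_2)$

Left marker AA, right marker bb:
  QQ: $\frac{1}{4}(1-r_1)^2r_2^2$
  Qq: $\frac{1}{2}r_1(1-r_1)r_2(1-r_2)$
  qq: $\frac{1}{4}r_1^2(1-r_2)^2$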

Page 77: Experimental design and statistical analysis

Estimation of marker class mean

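Marker class AABB: sample size n1; indicators x1 = 1, x2 = 1, y1 = 0, y2 = 0; frequency $\frac{1}{4}(1-r)^2$
  $E(w \mid x_1,x_2,y_1,y_2) = \frac{1-r_1-r_2}{1-r} = f_1$
  $E(v \mid x_1,x_2,y_1,y_2) = \frac{2r_1(1-r_1)r_2(1-r_2)}{(1-r)^2} = g_1$
  Genetic mean of the class: $\mu + f_1 a + g_1 d$

Marker class AABb: sample size n2; indicators x1 = 1, x2 = 0, y1 = 0, y2 = 1; frequency $\frac{1}{2}r(1-r)$
  $E(w \mid x_1,x_2,y_1,y_2) = \frac{(1-2r_1)\,r_2(1-r_2)}{r-r^2} = f_2$
  $E(v \mid x_1,x_2,y_1,y_2) = \frac{r_1(1-r_1)(1-2r_2+2r_2^2)}{r-r^2} = g_2$
  Genetic mean of the class: $\mu + f_2 a + g_2 d$

Marker class AAbb: sample size n3; indicators x1 = 1, x2 = -1, y1 = 0, y2 = 0; frequency $\frac{1}{4}r^2$
  $E(w \mid x_1,x_2,y_1,y_2) = \frac{r_2-r_1}{r} = f_3$
  $E(v \mid x_1,x_2,y_1,y_2) = \frac{2r_1(1-r_1)r_2(1-r_2)}{r^2} = g_3$
  Genetic mean of the class: $\mu + f_3 a + g_3 d$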
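The conditional expectations of w and v can be verified numerically from the joint genotype frequencies. The sketch below assumes no interference, so that r = r1 + r2 - 2·r1·r2, and covers only the three marker classes with left marker AA; the function name and the test values of r1 and r2 are hypothetical.

```python
def qtl_given_markers(r1, r2):
    """E(w) and E(v) for marker classes AABB, AABb, and AAbb in an F2.

    r1: recombination between left marker and QTL
    r2: recombination between QTL and right marker
    """
    r = r1 + r2 - 2 * r1 * r2
    s1, s2 = 1 - r1, 1 - r2
    # Joint frequencies of QTL genotypes (QQ, Qq, qq) within each marker class.
    joint = {
        "AABB": (s1**2 * s2**2 / 4, r1 * s1 * r2 * s2 / 2, r1**2 * r2**2 / 4),
        "AABb": (s1**2 * r2 * s2 / 2, r1 * s1 * (s2**2 + r2**2) / 2,
                 r1**2 * r2 * s2 / 2),
        "AAbb": (s1**2 * r2**2 / 4, r1 * s1 * r2 * s2 / 2, r1**2 * s2**2 / 4),
    }
    expect = {}
    for marker_class, (p_hom_q, p_het, p_hom_small_q) in joint.items():
        total = p_hom_q + p_het + p_hom_small_q
        e_w = (p_hom_q - p_hom_small_q) / total   # w = 1 for QQ, -1 for qq
        e_v = p_het / total                       # v = 1 for Qq
        expect[marker_class] = (total, e_w, e_v)
    return r, expect


r, expect = qtl_given_markers(r1=0.10, r2=0.05)
for marker_class, (freq, f, g) in expect.items():
    print(marker_class, round(freq, 4), round(f, 4), round(g, 4))
```

The class totals reproduce the marker frequencies (1-r)²/4, r(1-r)/2, and r²/4, which confirms that the joint frequencies are consistent with the two-marker model.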

Page 78: Experimental design and statistical analysis

Relationship between marker class mean and marker effect

(including marker interactions)

[Matrix equation: the genetic means of the marker classes, of the form μ + f a + g d, written as a design matrix times a constant and the marker effects (Aa)1, (Dd)1, (Aa)2, (Dd)2, (AAd)12, and (DDd)12]

Page 79: Experimental design and statistical analysis

Relationship between marker effects and QTL effects

[Equations: the constant and the marker effects (Aa)1, (Dd)1, (Aa)2, (Dd)2, (AAd)12, and (DDd)12 expressed in terms of a, d, and the coefficients f and g of the marker class means]

Page 80: Experimental design and statistical analysis

The linear model of genotypic values on markers in F2

[Equations: E(w | x1, x2, y1, y2) is a linear function of x1 and x2; E(v | x1, x2, y1, y2) is a linear function of y1, y2, x1x2, and y1y2]

Page 81: Experimental design and statistical analysis

The linear model of genotypic values on markers in F2

$E(G \mid x_1,x_2,y_1,y_2) = \text{constant} + (Aa)_1 x_1 + (Dd)_1 y_1 + (Aa)_2 x_2 + (Dd)_2 y_2 + (AAd)_{12}\,x_1x_2 + (DDd)_{12}\,y_1y_2$

Page 82: Experimental design and statistical analysis

Properties of the linear model in F2

The additive and dominance effects of the flanked QTL are completely absorbed by the six variables in the model above. Interactions between marker variables may be mistaken for interactions between QTL when using ANOVA, but our analysis shows that interactions between marker variables can be caused simply by the dominance effects of QTL.

Page 83: Experimental design and statistical analysis

Multiple QTL model in F2

For multiple QTL, assume there are m QTL located on m intervals defined by m+1 markers on one chromosome; then the genotypic value of an F2 individual is defined as:

$G = \mu + \sum_{j=1}^{m}\,[a_j w_j + d_j v_j]$

Page 84: Experimental design and statistical analysis

The linear model in F2 under multiple QTL

The genotypic value of an F2 individual with known marker types can be re-organized as:

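[Equation: E(G | x, y) equals a constant, plus sums over the markers j = 1, ..., m+1 of terms in x_j and in y_j, plus sums over the intervals j = 1, ..., m of terms in x_j x_{j+1} and in y_j y_{j+1}]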

Page 85: Experimental design and statistical analysis

The linear model for QTL mapping in F2

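[Equation: the phenotype P equals E(G | x, y) plus a residual, that is, a constant, plus sums over the markers j = 1, ..., m+1 of terms in x_j and in y_j, plus sums over the intervals j = 1, ..., m of terms in x_j x_{j+1} and in y_j y_{j+1}, plus a random error]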

Page 86: Experimental design and statistical analysis

Property of the linear model for QTL mapping in F2

Page 87: Experimental design and statistical analysis

ICIM (Inclusive Composite Interval Mapping) in F2

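[Equation: the adjusted phenotype ΔP_i equals P_i minus the fitted x_ij and y_ij terms of all markers except the two flanking the current interval (j ≠ k, k+1), and minus the fitted x_ij x_i,j+1 and y_ij y_i,j+1 terms of all intervals except the current one (j ≠ k)]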

Page 88: Experimental design and statistical analysis

Hypothesis test of QTL mapping in F2

The two hypotheses used to test the existence of QTL at the scanning position are:
$H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_A$: at least two of $\mu_1$, $\mu_2$, and $\mu_3$ are not equal

The logarithm likelihood under HA is
$L_A = \sum_{j=1}^{9}\sum_{i \in S_j}\log\left[\sum_{k=1}^{3}\pi_{jk}\,f(\Delta P_i;\ \mu_k, \sigma^2)\right]$

where $S_j$ denotes individuals belonging to the jth marker class (j = 1, ..., 9), $\pi_{jk}$ (k = 1, 2, 3) is the proportion of the kth QTL genotype in the jth class, and $f(\cdot;\ \mu_k, \sigma^2)$ is the density function of the normal distribution $N(\mu_k, \sigma^2)$.

Page 89: Experimental design and statistical analysis

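EM algorithm of QTL mapping in F2

Use the EM algorithm to get the estimates of $\mu_1$, $\mu_2$, and $\mu_3$. The genetic effects in $G = \mu + aw + dv$ are then estimated by
$\hat a = \frac{1}{2}(\hat\mu_1 - \hat\mu_3)$, $\hat d = \hat\mu_2 - \frac{1}{2}(\hat\mu_1 + \hat\mu_3)$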

Page 90: Experimental design and statistical analysis

EM algorithm of QTL mapping in F2

Parameters under H0 were calculated as:
$\hat\mu_0 = \frac{1}{n}\sum_{i=1}^{n}\Delta P_i$, $\hat\sigma_0^2 = \frac{1}{n}\sum_{i=1}^{n}(\Delta P_i - \hat\mu_0)^2$

From these, the maximum likelihood under H0 and the LOD score between HA and H0 can be calculated.

Page 91: Experimental design and statistical analysis

QTL distribution models in simulation

Page 92: Experimental design and statistical analysis

QTL distribution models in simulation

Page 93: Experimental design and statistical analysis

QTL distribution models in simulation

F2 populations were simulated by the genetics and breeding simulation tool of QuLine. QTL mapping using ICIM was implemented by the software QTL IciMapping.

Page 94: Experimental design and statistical analysis

Theoretical marker effects in the genetic model used in simulation

The expected additive, dominance, additive by additive, and dominance by dominance effects of the two flanking markers associated with each QTL are shown in the following table. It indicates that the dominance of a QTL can complicate the coefficients of the two flanking markers and cause interactions between markers.

Page 95: Experimental design and statistical analysis

The expected marker effects in simulation

QTL    Constant (d term)   (Aa)1    (Aa)2    (Dd)1    (Dd)2    (AAd)12   (DDd)12   Interaction variation (%)
QTL1    0.000              0.498    0.498    0.000    0.000    0.000     0.000     0.0
QTL2    0.253              0.000    0.000    0.248    0.248   -0.248     0.243     21.8
QTL3    0.253              0.498    0.498    0.248    0.248   -0.248     0.243     5.7
QTL4   -0.253              0.498    0.498   -0.248   -0.248    0.248    -0.243     5.7
QTL5    0.379              0.498    0.499    0.371    0.371   -0.371     0.364     9.6
QTL6   -0.379              0.498    0.498   -0.371   -0.371    0.371    -0.364     9.6

Page 96: Experimental design and statistical analysis

QTL mapping in simulated F2 populations

Page 97: Experimental design and statistical analysis

QTL distribution model I

QTL    LOD score   PVE (%)   True position (cM)   Est. position (cM)   True add.   Est. add.   True dom.   Est. dom.
QTL1   16.52       6.67      25                   28                   1           0.88        0           -0.11
QTL2   7.67        3.27      55                   53                   0           0.03        1           0.85
QTL3   25.11       11.28     25                   24                   1           0.86        1           1.08
QTL4   35.46       16.43     55                   57                   1           0.74        -1          -1.58
QTL5   37.12       16.74     25                   26                   1           1.05        1.5         1.38
QTL6   28.44       13.16     55                   55                   1           0.84        -1.5        -1.22

Page 98: Experimental design and statistical analysis
Page 99: Experimental design and statistical analysis
Page 100: Experimental design and statistical analysis
Page 101: Experimental design and statistical analysis

180 individuals
The cross was made in Chengdu, China, in July 2002 between the indica rice variety and Nipponbare.
137 SSR markers. The whole genome map was 2046.2 cM long, and the average marker distance was 17.1 cM.
A number of agronomic traits were investigated in the field.

Page 102: Experimental design and statistical analysis

QTL mapping in the actual F2 population

Page 103: Experimental design and statistical analysis

QTL distribution by absolute degree of dominance (|d/a|)

Trait   R2 of additive (%)   R2 of additive and dominance (%)   <=0.25   (0.25, 0.75]   (0.75, 1.25]   >1.25   Total
PH      25.84                51.56                              2        1              1              5       9
HD      16.12                41.37                              1        1              1              3       6
PL      25.58                61.26                              5        3              1              8       17
FL      20.86                40.00                              0        2              0              3       5
SPK     25.64                27.09                              1        1              1              1       4
TKW     20.11                20.11                              2        0              2              1       5
DP      19.45                24.87                              1        1              0              1       3
GL      30.69                41.96                              1        1              0              0       2
GW      26.63                26.63                              2        2              0              0       4
RLW     37.63                45.70                              1        3              1              1       6
Total                                                           16       15             7              23      61

Page 104: Experimental design and statistical analysis

PVE distribution

[Figure: histogram of frequency across traits against phenotypic variation explained (%)]

Page 105: Experimental design and statistical analysis

Trait               QTL      Chr   Distance to left marker   Add     Dom     LOD     PVE (%)
Plant height (Ph)   QPh1-1   1     12                        -0.57   -7.98   8.04    12.03
                    QPh1-2   1     19.5                      -8.59   0.59    15.54   25.57
                    QPh3-1   3     16.9                      4.35    -4.86   6.51    13.30
                    QPh3-2   3     11.4                      -4.69   -1.00   5.04    6.84
                    QPh4     4     13.7                      -3.56   -2.09   4.61    5.53
                    QPh5     5     13                        -0.44   -4.48   3.13    3.86
                    QPh6     6     6.2                       -0.79   -5.05   3.17    4.96
                    QPh7     7     7                         0.26    6.48    5.27    7.56
                    QPh12    12    2.4                       -1.66   3.93    3.98    5.44
Heading date (Hd)   QHd1     1     22.1                      1.74    -0.30   3.65    7.27
                    QHd3     3     19.9                      0.88    -3.70   6.04    21.09
                    QHd4     4     0.2                       -0.77   1.85    3.58    5.24
                    QHd8     8     5.7                       -1.41   -1.46   4.79    8.20
                    QHd10    10    0.3                       -1.78   -0.80   4.85    7.21
                    QHd11    11    6.2                       0.15    -3.03   5.71    11.70

Page 106: Experimental design and statistical analysis

Conclusions

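[Equation: the phenotype P equals E(G | x, y) plus a residual, that is, a constant, plus sums over the markers j = 1, ..., m+1 of terms in x_j and in y_j, plus sums over the intervals j = 1, ..., m of terms in x_j x_{j+1} and in y_j y_{j+1}, plus a random error]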

Page 107: Experimental design and statistical analysis
Page 108: Experimental design and statistical analysis

Six methods in BIP

SMA: single marker analysis (Soller et al., 1976. Theor. Appl. Genet. 47: 35-39)
IM-ADD: the conventional simple interval mapping (Lander and Botstein, 1989. Genetics 121: 185-199)
ICIM-ADD: inclusive composite interval mapping of additive (and dominant) QTL (Li et al., 2007. Genetics 175: 361-374. Zhang et al., 2008. Genetics 180: 1177-1190)
IM-EPI: interval mapping of digenic epistatic QTL
ICIM-EPI: inclusive composite interval mapping of digenic epistatic QTL (Li et al., 2008. Theor. Appl. Genet. 116: 243-260)
SGM: selective genotyping mapping (Lebowitz et al., 1987. Theor. Appl. Genet. 73: 556-562)

Page 109: Experimental design and statistical analysis

Interface of the BIP functionality

Page 110: Experimental design and statistical analysis

LOD profile of ICIM additive mapping (ICIM-ADD)

Page 111: Experimental design and statistical analysis

Figures of interacting QTL from ICIM epistatic mapping (ICIM-EPI)

