Introduction to QTL analysis. Peter Visscher, University of Edinburgh, peter.visscher@ed.ac.uk

Date post: 12-Jan-2016
Page 1: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Introduction to QTL analysis

Peter Visscher

University of Edinburgh

peter.visscher@ed.ac.uk

Page 2: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Overview

• Principles of QTL mapping

• QTL mapping using sibpairs

• IBD estimation from marker data

• Improving power
– ML variance components
– Selective genotyping
– Large(r) pedigrees

Page 3: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

[Figure: trait distributions for the three QTL genotypes (QQ, Qq, qq), with genotypic values m-a, m+d and m+a marked on the Trait axis]

[Fisher, Wright]

Quantitative Trait Locus = a segment of DNA that affects a quantitative trait

Page 4: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Mapping QTL

• Determining the position of a locus causing variation in the genome.

• Estimating the effect of the alleles and mode of action.

Page 5: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Why map QTL?

• To provide knowledge towards a fundamental understanding of individual gene actions and interactions

• To enable positional cloning of the gene

• To improve breeding value estimation and selection response through marker assisted selection (plants, animals)

Science; Medicine; Agriculture

Page 6: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Principles of QTL mapping

• Co-segregation of QTL alleles and linked marker alleles in pedigrees

[Figure: a pair of chromosomes, one carrying Q and M, the other q and m. The QTL alleles (Q, q) are unobserved; the marker alleles (M, m) are observed.]

Page 7: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Linkage = Co-segregation

[Figure: pedigree with marker genotypes A2A4, A3A4, A1A3, A1A2, A2A3, A1A2, A1A4, A3A4, A3A2]

Marker allele A1 cosegregates with dominant disease

Page 8: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Recombination

[Figure: parental genotypes with marker alleles A1, A2 and QTL alleles Q1, Q2; the gametes are split into likely gametes (non-recombinants) and unlikely gametes (recombinants)]

“Linkage analysis = counting recombinants”

Page 9: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Map distance

Map distance between two loci (Morgans)

= Expected number of crossovers per meiosis

Note: Map distances are additive. Recombination frequencies are not.

1 Morgan = 100 cM; 1 cM ~ 1 Mb

Page 10: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Recombination & map distance

$r = \tfrac{1}{2}\left(1 - e^{-2m}\right)$

Haldane (1919) Map Function

[Figure: recombination fraction (0 to 0.5) plotted against map distance (0 to 1 M)]
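As a minimal sketch of the Haldane map function and its inverse (the function names `haldane_r` and `haldane_m` are illustrative, not from the slides), the following also checks the earlier note that map distances are additive while recombination fractions are not. It assumes no crossover interference, which is the assumption behind Haldane's function.

```python
import math

def haldane_r(m):
    """Recombination fraction for a map distance m in Morgans (Haldane)."""
    return 0.5 * (1.0 - math.exp(-2.0 * m))

def haldane_m(r):
    """Inverse map function: distance in Morgans for 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)

# Three loci A-B-C with 10 cM between neighbours: distances add (20 cM A-C),
# but recombination fractions do not.
r_ab = haldane_r(0.10)
r_bc = haldane_r(0.10)
r_ac = haldane_r(0.20)

print(f"r_ab + r_bc = {r_ab + r_bc:.4f}, r_ac = {r_ac:.4f}")
# Without interference, r_ac = r_ab + r_bc - 2 * r_ab * r_bc
print(f"r_ab + r_bc - 2*r_ab*r_bc = {r_ab + r_bc - 2 * r_ab * r_bc:.4f}")
```

For short distances the two quantities are nearly equal, which is why recombination fractions are often read directly as cM for closely linked loci.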

Page 11: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Principles of QTL mapping

• Co-segregation of phenotypes and genotypes in pedigrees
– Genetic markers give information on IBD sharing between relatives [genotypes]
– Association between phenotypes and genotypes gives information on QTL location and effect [linkage]

• Need informative mapping population

Page 12: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Mapping populations

Population | Features | Example species

Inbred lines
Backcross (BC) | Simplest design; powerful if dominance in ‘right’ direction | mice, plants
F2 | Estimation of additive and dominance effects; more powerful than BC for additive effects | mice, rats
Advanced intercross line (AIL) | As for F2 but with increased resolution of map location | mice
Recombinant inbred lines (RIL) | F1 followed by inbreeding; homozygous comparisons only; powerful for additive effects; less environmental noise | mice, plants
Congenic lines (= Nearly isogenic lines) | Backcrossing followed by inbreeding; homozygous comparisons only after inbreeding. Lines contain ~1% of donor genome | mice, rats, plants
Double haploid lines (DHL) | Instant homozygosity through doubling of F1 gametes; homozygous comparisons only; powerful for additive effects and QTLxE interactions | plants
F2:3 | Inbred progeny of F2; increased precision through progeny means | plants

Structured outbred populations
BC / F2 / AIL | As for inbred lines; mapping variation between lines | livestock, outbreeding trees/plants
Large fullsib families | Estimating contrasts between parental alleles. Allows for dominance estimation. | trees, fish, poultry
Halfsib families | Estimating contrasts between common parent alleles | cattle, pigs, poultry, trees
Nuclear families, including sibpairs | Detection of variance explained by markers | humans, livestock

Unstructured outbred populations
Complex pedigrees | Detection of variance explained by markers | humans, livestock

Page 13: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Informative pig pedigree

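[Figure: pig pedigree from a cross (X) between QQ and qq parents, with QQ, Qq and qq genotypes among the descendants. ©Roslin Institute]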

Page 14: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

[Figure: three-generation Large White x Meishan cross (parental lines, F1, F2), with marker alleles 1 and 2 and the QTL allele q traced through the pedigree]

Page 15: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Line cross

• Only two QTL alleles segregating

• QTL effect can be estimated as the mean difference between genotype groups

• Power depends on sample size & effect of QTL

• Ascertain divergent lines

• Resolution of QTL map is low: ~10-40 Mb

Page 16: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Figure 1.2: Sample size for QTL detection in genome scans

[Figure: sample size (0 to 4000) against proportion of variance due to QTL (0 to 0.25), with curves for an additive QTL and a dominant QTL. $\alpha$ = 0.0001, power = 90%, F2 population]

Page 17: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Outbred populations: Complications

Markers not fully informative (segregating in the parental generation)

QTL not segregating in all families (All F1 segregate in inbred line cross)

Association between marker and QTL at the family rather than population level (i.e. linkage phase differs between families)

Additional variance between families due to other loci

Page 18: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Line cross vs. outbred population

 | Cross | Outbred
# QTL alleles | 2 | 2
# Generations | 3 | 2
Required sample size | 100s | 1000s
QTL Estimation | Mean | Variance

Page 19: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

QTL as a random effect

$y_i = \mu + Q_i + A_i + E_i$

$Q_i$ = QTL genotype contribution for chrom. segment

$A_i$ = Contribution from rest of genome

$\mathrm{var}(y) = \sigma_q^2 + \sigma_a^2 + \sigma_e^2$

Page 20: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Logical extension of linear models used during the course

• This week: partitioning (co)variances into (causal) components

• QTL mapping: partitioning genetic variance into underlying components
– Linkage analysis: dissecting within-family genetic variation

Page 21: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Genetic covariance between relatives

$\mathrm{cov}(y_i, y_j) = \pi_{ij}\sigma_q^2 + a_{ij}\sigma_a^2$

$a_{ij}$ = average prop. of alleles shared in the genome (kinship matrix)

$\pi_{ij}$ = proportion of alleles IBD at QTL (0, ½ or 1)

$E(\pi_{ij}) = a_{ij}$

Page 22: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

$\pi_{ij}$ = Pr(2 alleles IBD) + ½Pr(1 allele IBD)

= proportion of alleles IBD in non-inbred pedigree

Estimate $\pi_{ij}$ with genetic markers

Page 23: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Fully informative marker

• Determine IBD sharing between sibpairs unambiguously

• Example: Dad = 1/2, Mum = 3/4
– Transmitted allele from Dad is either 1 or 2
– Transmitted allele from Mum is either 3 or 4

Page 24: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Sibpairs & fully informative marker

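# Alleles IBD | $\pi$ | Pr.
0 | 0 | ¼
1 | ½ | ½
2 | 1 | ¼

$E(\pi) = \sum \pi \Pr(\pi) = 1/2$

$E(\pi^2) = \sum \pi^2 \Pr(\pi) = 3/8$

$\mathrm{var}(\pi) = E(\pi^2) - E(\pi)^2 = 1/8$

$CV = \sqrt{1/8} \,/\, (1/2) = 0.5\sqrt{2} \approx 70\%$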
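The moments of the IBD proportion can be checked with exact arithmetic. This is a small sketch using the sharing probabilities from the table (¼, ½, ¼ for 0, 1, 2 alleles IBD); the variable names are illustrative.

```python
from fractions import Fraction
import math

# IBD proportion and its probability for sib pairs at a fully informative marker
pi_values = [Fraction(0), Fraction(1, 2), Fraction(1)]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]

mean_pi = sum(p * x for x, p in zip(pi_values, probs))
mean_pi_sq = sum(p * x * x for x, p in zip(pi_values, probs))
var_pi = mean_pi_sq - mean_pi ** 2
# Coefficient of variation = standard deviation / mean
cv_pi = math.sqrt(var_pi) / float(mean_pi)

print(mean_pi, mean_pi_sq, var_pi, round(cv_pi, 3))
```

The large coefficient of variation is what makes sib pairs usable for linkage: realised sharing at a locus varies substantially around its expectation of ½.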

Page 25: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Haseman-Elston (1972)

“The more alleles pairs of relatives share at a QTL, the greater their phenotypic similarity”

or

“The more alleles they share IBD, the smaller the difference in their phenotype”

Page 26: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Population sib-pair trait distribution

Page 27: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

No linkage

Page 28: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Under linkage

Page 29: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Sib pair (or DZ twins) design to map QTL

• Multiple ‘families’ of two (or more) sibs

• Phenotypes on sibs

• Marker genotypes on sibs (& parents)

• Correlate phenotypes and genotypes of sibs

Page 30: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Data structure is simple

Pair | Phenotypes | Prop. alleles IBD
1 | $y_{11}$, $y_{12}$ | $\pi_1$
2 | $y_{21}$, $y_{22}$ | $\pi_2$
... | ... | ...
n | $y_{n1}$, $y_{n2}$ | $\pi_n$

$\pi$ = 0, ½ or 1 for fully informative markers

Page 31: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Notation

Y

$D = (y_1 - y_2)$

$D^2 = (y_1 - y_2)^2$

$S = [(y_1 - \mu) + (y_2 - \mu)]$

$S^2 = [(y_1 - \mu) + (y_2 - \mu)]^2$

$CP = (y_1 - \mu)(y_2 - \mu)$

Page 32: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Proposed analysis…

Data | Method | Reference
$y_1$ & $y_2$ | ML ‘LOD’ | Parametric linkage analysis
$D^2$ | Regression | Haseman & Elston (1972)
$D^2$ & $S^2$ | Regression | Drigalenko (1998); Xu et al. (2000); Sham & Purcell (2001); Forrest (2001)
CP | Regression | Elston et al. (2000)
$y_1$ & $y_2$ | ML VC | Goldgar (1990); Schork (1993)
D | ML | Kruglyak & Lander (1995)
D & S | ML VC | Fulker & Cherny (1996); Wright (1997)

Page 33: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Properties of squared differences

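$E[(Y_1 - Y_2)^2] = \mathrm{var}(Y_1 - Y_2) + [E(Y_1 - Y_2)]^2$

$\mathrm{var}(Y_1 - Y_2) = \mathrm{var}(Y_1) + \mathrm{var}(Y_2) - 2\,\mathrm{cov}(Y_1, Y_2)$

If $E(Y_i) = 0$ and $\mathrm{var}(Y_1) = \mathrm{var}(Y_2)$, then

$E[(Y_1 - Y_2)^2] = 2(1 - r)\,\mathrm{var}(Y)$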
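The intermediate step can be written out as follows, assuming that r on this slide denotes the correlation between Y1 and Y2, so that cov(Y1, Y2) = r var(Y) when the two variances are equal:

```latex
\begin{aligned}
E[(Y_1 - Y_2)^2]
  &= \operatorname{var}(Y_1) + \operatorname{var}(Y_2)
     - 2\operatorname{cov}(Y_1, Y_2) + [E(Y_1) - E(Y_2)]^2 \\
  &= 2\operatorname{var}(Y) - 2r\operatorname{var}(Y) + 0 \\
  &= 2(1 - r)\operatorname{var}(Y)
\end{aligned}
```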

Page 34: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

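Haseman-Elston method

• Phenotype on relative pair j:

$Y_j = (y_{1j} - y_{2j})^2$

$E(Y_j) = E[(Q_{1j} - Q_{2j} + A_{1j} - A_{2j} + e_{1j} - e_{2j})^2]$

$= E[(Q_{1j} - Q_{2j})^2] + \{2(1 - a_{ij})\sigma_a^2 + 2\sigma_e^2\}$

$= 2[\sigma_q^2 - \mathrm{cov}(Q_{1j}, Q_{2j})] + \{\sigma_\varepsilon^2\}$

$= (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\pi_{jt}\sigma_q^2$

where $\sigma_\varepsilon^2 = 2(1 - a_{ij})\sigma_a^2 + 2\sigma_e^2$ collects the non-QTL terms

$\pi_{jt}$ = proportion of alleles IBD at QTL (trait, t) for relative pair j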

Page 35: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Conditional expectation

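$E(Y_j \mid \pi_{jt}) = (2\sigma_q^2 + \sigma_\varepsilon^2) - 2\pi_{jt}\sigma_q^2$

• negative slope of Y on $\pi$ if $\sigma_q^2 > 0$

• estimate $\pi_{jt}$ from marker data ($\pi_{jm}$)

• use simple linear regression to detect QTL:

$E(Y_j \mid \pi_{jm}) = \alpha + \beta\pi_{jm}$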
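The regression can be illustrated with simulated data. The sketch below is not the original analysis: it assumes a deliberately simple model in which each sib's phenotype is a QTL effect plus independent noise, the marker sits exactly at the QTL and is fully informative, and `simulate_he`, `var_q` and `var_e` are hypothetical names. Under that model the expected intercept is 2·var_q + 2·var_e and the expected slope is −2·var_q.

```python
import random

def simulate_he(n_pairs, var_q, var_e, seed=1):
    """Simulate sib pairs and regress squared differences on IBD proportion.

    Returns (intercept, slope) from ordinary least squares.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_pairs):
        # pi = 0, 1/2, 1 with probabilities 1/4, 1/2, 1/4
        pi = rng.choice([0.0, 0.5, 0.5, 1.0])
        # QTL effects with variance var_q and covariance pi * var_q
        shared = rng.gauss(0.0, (pi * var_q) ** 0.5)
        q1 = shared + rng.gauss(0.0, ((1.0 - pi) * var_q) ** 0.5)
        q2 = shared + rng.gauss(0.0, ((1.0 - pi) * var_q) ** 0.5)
        y1 = q1 + rng.gauss(0.0, var_e ** 0.5)
        y2 = q2 + rng.gauss(0.0, var_e ** 0.5)
        xs.append(pi)
        ys.append((y1 - y2) ** 2)
    # Ordinary least squares for y = intercept + slope * x
    mean_x = sum(xs) / n_pairs
    mean_y = sum(ys) / n_pairs
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = simulate_he(20000, var_q=1.0, var_e=1.0)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
```

Even with a QTL explaining half of the variance, the scatter of squared differences is wide, so the slope is estimated imprecisely unless the number of pairs is large.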

Page 36: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Haseman-Elston regression

[Figure: scatter plot of squared difference (0 to 40) against IBD (0, 0.5, 1) with fitted line y = -1.3577x + 3.1252, R² = 0.0173]

A significant negative slope indicates linkage to a QTL

Page 37: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Single fully informative marker

$\beta = -2(1 - 2r)^2\sigma_q^2$

The $(1 - 2r)^2\sigma_q^2$ term is analogous to variance explained by a single marker in a backcross/F2 design

$\alpha = 2[1 - 2(1 - r)r]\sigma_q^2 + \sigma_\varepsilon^2$

r = recombination fraction between marker & QTL

Statistical test: $\beta = 0$ versus $\beta < 0$

• Disadvantage of method
– not powerful
– confounding between QTL location and effect
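The confounding between location and effect can be seen by evaluating the expected regression coefficients for several recombination fractions. This is a sketch of the two expressions on this slide; the function name and the argument `var_resid` (the non-QTL variance term in the intercept) are illustrative.

```python
def he_expected_coefficients(r, var_q, var_resid):
    """Expected (intercept, slope) of the Haseman-Elston regression on a
    single fully informative marker at recombination fraction r from the QTL."""
    slope = -2.0 * (1.0 - 2.0 * r) ** 2 * var_q
    intercept = 2.0 * (1.0 - 2.0 * (1.0 - r) * r) * var_q + var_resid
    return intercept, slope

for r in (0.0, 0.1, 0.2, 0.5):
    a, b = he_expected_coefficients(r, var_q=1.0, var_resid=2.0)
    print(f"r = {r:.1f}: intercept = {a:.3f}, slope = {b:.3f}")
```

A large QTL far from the marker and a small QTL close to it can give the same slope, which is the motivation for using flanking markers.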

Page 38: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Interval mapping for sibpair analysis (Fulker & Cardon, 1994)

• Estimate $\pi_{jt}$ from IBD status at flanking markers

• Allows genome screen, separating effect & location
– regression with largest R² indicates map position of QTL

Page 39: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Example from Cardon et al. (1994)

[Lynch & Walsh, page 520]

Page 40: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Calculating $\pi_{jt} \mid \pi_{jm}$

For $\pi_{jt}$ midway between two flanking markers:

$\pi_{jt} \approx r^2/c + \tfrac{1}{2}[(1 - 2r)/c]\,\pi_{jm1} + \tfrac{1}{2}[(1 - 2r)/c]\,\pi_{jm2}$

$c = 1 - 2r + 2r^2$

r = recombination fraction between markers

$\pi_{jmk}$ = $\pi_{jm}$ at flanking marker k

Assumption: flanking markers are fully informative

Page 41: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Examples

r | c | $\pi_{jt}$
0.5 | 0.5 | 0.5
0.2 | 17/25 | (2/34) + (15/34)$\pi_{jm1}$ + (15/34)$\pi_{jm2}$

[if $\pi_{jm1}$ and $\pi_{jm2}$ are 1, $\pi_{jt}$ = 32/34 < 1]
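The midpoint formula is easy to evaluate exactly. The following sketch (the function name `midpoint_pi` is illustrative) reproduces the two examples using rational arithmetic; it covers only the midpoint case with fully informative flanking markers, as on the slide.

```python
from fractions import Fraction

def midpoint_pi(r, pi_m1, pi_m2):
    """Approximate IBD proportion midway between two fully informative
    flanking markers, where r is the recombination fraction between them."""
    c = 1 - 2 * r + 2 * r * r
    weight = (1 - 2 * r) / c / 2
    return r * r / c + weight * pi_m1 + weight * pi_m2

# r = 0.2 with both flanking markers fully shared (prints 16/17, i.e. 32/34)
print(midpoint_pi(Fraction(1, 5), 1, 1))
# Unlinked markers (r = 0.5) carry no information: the estimate is 1/2
print(midpoint_pi(Fraction(1, 2), 1, 0))
```

Passing `Fraction` values keeps the result exact; floats work equally well when r comes from a map function.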

Page 42: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Exercise

Calculate $\pi_{jt}$ for a location midway between two markers that are 30 cM apart, when the proportions of alleles shared at the flanking markers are 1.0 and 0.5. Use the Haldane mapping function to calculate the recombination rate between the markers.

$\pi_{jm1} = 1$, $\pi_{jm2} = 0.5$

Page 43: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Extensions to Haseman-Elston method

• Interval mapping

• Alternative models
– QTL with dominance

• Other methods to estimate $\pi_{jt}$
– Using all markers on a chromosome (Merlin)
– Monte Carlo sampling methods
– Using both markers info & phenotypic info

• Add linkage information from:
– $Z_j = [(y_{1j} - \mu) + (y_{2j} - \mu)]^2$

Page 44: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Sample size for QTL detection in genome scans using sibpairs

[Figure: number of pairs (0 to 40,000) against proportion of variance due to QTL (0 to 0.5), with curves for sib-correlation = 0 and sib-correlation = 0.5. Power = 90%; Type-I error = $10^{-5}$]

Page 45: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Estimating $\pi$ when marker is not fully informative

• Using:
– Mendelian segregation rules
– Marker allele frequencies in the population

Page 46: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

IBD can be trivial…

[Figure: pedigree with marker alleles 1 and 2 in which the sib pair shares no alleles: IBD=0]

Page 47: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Two Other Simple Cases…

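[Figure: two further pedigrees with marker alleles 1 and 2 in which IBD sharing can be read directly from the genotypes; the surviving label is IBD=2]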

Page 48: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

A little more complicated…

[Figure: pedigree with marker genotypes 1/2, 2/2, 1/2 and 1/2: IBD=1 (50% chance) or IBD=2 (50% chance)]

Page 49: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

And even more complicated…

[Figure: sib pair with marker genotypes 1/1 and 1/1: IBD=?]

Page 50: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Bayes Theorem for IBD Probabilities

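$P(IBD = i \mid G) = \dfrac{P(IBD = i,\, G)}{P(G)} = \dfrac{P(IBD = i)\,P(G \mid IBD = i)}{P(G)} = \dfrac{P(IBD = i)\,P(G \mid IBD = i)}{\sum_j P(IBD = j)\,P(G \mid IBD = j)}$

prior: $P(IBD = i)$

Prob(data): $P(G)$

posterior: $P(IBD = i \mid G)$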

Page 51: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

P(Marker Genotype|IBD State)

[Assumes Hardy-Weinberg proportions of genotypes in the population]

Page 52: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Worked Example

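Sib pair genotypes: 1/1 and 1/1

$p_1 = 0.5$

$P(G \mid IBD = 0) = p_1^4 = 1/16$

$P(G \mid IBD = 1) = p_1^3 = 1/8$

$P(G \mid IBD = 2) = p_1^2 = 1/4$

$P(G) = \tfrac{1}{4}p_1^4 + \tfrac{1}{2}p_1^3 + \tfrac{1}{4}p_1^2 = 9/64$

$P(IBD = 0 \mid G) = \tfrac{1}{4}p_1^4 / P(G) = 1/9$

$P(IBD = 1 \mid G) = \tfrac{1}{2}p_1^3 / P(G) = 4/9$

$P(IBD = 2 \mid G) = \tfrac{1}{4}p_1^2 / P(G) = 4/9$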
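The worked example can be reproduced with a few lines of exact arithmetic. This sketch assumes untyped parents, Hardy-Weinberg proportions and the sib-pair prior of ¼, ½, ¼ for IBD = 0, 1, 2; `posterior_ibd` is an illustrative helper name. It also computes the estimated proportion of alleles IBD as P(IBD=2) + ½P(IBD=1).

```python
from fractions import Fraction

# Prior IBD probabilities for a sib pair
PRIOR = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def posterior_ibd(likelihood):
    """Posterior P(IBD=i | G) from P(G | IBD=i) using Bayes' theorem."""
    joint = {i: PRIOR[i] * likelihood[i] for i in PRIOR}
    p_g = sum(joint.values())  # P(G), the probability of the data
    return {i: joint[i] / p_g for i in joint}

# Both sibs have genotype 1/1; allele 1 has population frequency p1
p1 = Fraction(1, 2)
lik = {0: p1 ** 4, 1: p1 ** 3, 2: p1 ** 2}
post = posterior_ibd(lik)
pi_hat = post[2] + Fraction(1, 2) * post[1]

print(post[0], post[1], post[2], pi_hat)
```

With a rarer allele (smaller p1) the same genotypes shift more posterior weight towards IBD = 2, because sharing a rare allele by chance alone is less likely.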

Page 53: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Exercise

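Sib pair genotypes: 1/2 and 1/2

$p_1 = 0.5,\ p_2 = 0.01$

$P(G \mid IBD = 0) =$

$P(G \mid IBD = 1) =$

$P(G \mid IBD = 2) =$

$P(G) = \tfrac{1}{4}\ldots + \tfrac{1}{2}\ldots + \tfrac{1}{4}\ldots$

$P(IBD = 0 \mid G) =$

$P(IBD = 1 \mid G) =$

$P(IBD = 2 \mid G) =$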

Page 54: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Using multiple markers

• Mendelian segregation rules

• Marker allele frequencies in the population

• Linkage between markers

• Efficient multi-marker (multi-point) algorithms available (e.g., Merlin, Genehunter)

Page 55: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Software for QTL analysis of sibpairs

• Mx
• Merlin
• Genehunter
• S.A.G.E. ($)
• QTL Express (regression)
• Solar (complex pedigrees)
• Lots of others…

http://www.nslij-genetics.org/soft/

Page 56: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

George Seaton, Sara Knott, Chris Haley, Peter Visscher

Roslin Institute University of Edinburgh

http://QTL.cap.ed.ac.uk/

QTL Express: User-friendly web-based software to map QTL in outbred populations

Page 57: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Conclusions (sibpairs)

• Power of sib pair design is low
– more relative pairs needed
• more contrasts e.g. extended pedigrees
• selective genotyping
– extreme phenotypes are most informative for linkage
– more powerful analysis methods
• ML variance component analysis

Page 58: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

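Maximum likelihood for sibpairs (assuming bivariate normality | $\pi$ & fully informative marker)

Full model:

$-2\ln(L) = n\ln|V| + (y - \mu)' V^{-1} (y - \mu)$

$V = \begin{pmatrix} \sigma_f^2 + \sigma_q^2 + \sigma_r^2 & \sigma_f^2 + \pi\sigma_q^2 \\ \sigma_f^2 + \pi\sigma_q^2 & \sigma_f^2 + \sigma_q^2 + \sigma_r^2 \end{pmatrix}$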
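A minimal sketch of evaluating this likelihood for a set of sib pairs is given below. It is an illustration under assumptions, not the method of any particular package: the additive constant is dropped, the log-determinant and quadratic form are summed pair by pair because the covariance depends on each pair's IBD proportion, and the names (`minus2_loglik`, `var_f` for the familial component, `var_q` for the QTL, `var_r` for the residual) are hypothetical.

```python
import math

def minus2_loglik(pairs, mu, var_f, var_q, var_r):
    """-2 ln L (constant dropped) summed over sib pairs.

    pairs: iterable of (y1, y2, pi), with pi the IBD proportion at the QTL.
    """
    total = 0.0
    for y1, y2, pi in pairs:
        diag = var_f + var_q + var_r   # variance of each sib
        off = var_f + pi * var_q       # covariance between the sibs
        det = diag * diag - off * off  # |V| for the 2x2 matrix
        d1, d2 = y1 - mu, y2 - mu
        # Quadratic form (y - mu)' V^{-1} (y - mu) for the 2x2 matrix V
        quad = (diag * d1 * d1 - 2.0 * off * d1 * d2 + diag * d2 * d2) / det
        total += math.log(det) + quad
    return total

data = [(1.2, 0.8, 1.0), (-0.5, 1.1, 0.0), (0.3, 0.1, 0.5)]
full = minus2_loglik(data, mu=0.0, var_f=0.2, var_q=0.3, var_r=0.5)
reduced = minus2_loglik(data, mu=0.0, var_f=0.2, var_q=0.0, var_r=0.8)
print(full, reduced)
```

In practice the parameters would be chosen by numerical maximisation under each model; setting `var_q=0` gives the reduced model without a QTL.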

Page 59: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Maximum likelihood

Reduced model:

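$-2\ln(L) = n\ln|V| + (y - \mu)' V^{-1} (y - \mu)$

$V = \begin{pmatrix} \sigma_f^2 + \sigma_r^2 & \sigma_f^2 \\ \sigma_f^2 & \sigma_f^2 + \sigma_r^2 \end{pmatrix}$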

Page 60: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Test statistic

$LRT = 2\ln(ML_{full}) - 2\ln(ML_{reduced})$

$H_0\ (\sigma_q^2 = 0)$: $LRT \sim \tfrac{1}{2}\chi^2_{(1)} + \tfrac{1}{2}(0)$
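Because the null value of the QTL variance lies on the boundary of the parameter space, the p-value is half the usual chi-square(1) tail probability. The sketch below computes it with the standard library, using the identity that the upper tail of a chi-square with 1 degree of freedom equals erfc(sqrt(x/2)); the conversion LOD = LRT / (2 ln 10) is the usual one. The function names are illustrative.

```python
import math

def mixture_p_value(lrt):
    """P-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if lrt <= 0.0:
        return 1.0
    # P(chi-square with 1 df > x) = erfc(sqrt(x / 2))
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))

def lod_from_lrt(lrt):
    """Convert a likelihood-ratio test statistic to a LOD score."""
    return lrt / (2.0 * math.log(10.0))

for lrt in (2.71, 3.84, 13.8):
    print(f"LRT = {lrt}: p = {mixture_p_value(lrt):.2e}, LOD = {lod_from_lrt(lrt):.2f}")
```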

Page 61: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

[Fisher et al. 1999]

Example: QTL analysis for dyslexia on chromosome 6p using sib-pairs

Phenotype: Irregular word test

181 sib-pairs

~15 Mb

Page 62: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

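$\hat{\pi}$ or distribution approach in analysis?

$\hat{\pi} = \hat{p}_{IBD2} + \tfrac{1}{2}\hat{p}_{IBD1} + 0 \cdot \hat{p}_{IBD0}$

$\hat{p}_{IBD2} + \hat{p}_{IBD1} + \hat{p}_{IBD0} = 1$

Expectation approach: use $\hat{\pi}$

Distribution approach: use IBD probabilities and mixture distribution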

Page 63: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Selective genotyping & sibpairs

• Concordant pairs
– both sibs in upper or lower tail of the phenotypic distribution

• Discordant pairs
– one sib in upper tail, other in lower tail

• Powerful design
– requires many (cheap) phenotypes

Page 64: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Anxiety QTLs

[Fullerton et al. 2003]

Selection from ~30,000 sibpairs

Page 65: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Results

[Fullerton et al. 2003]

~5 QTLs detected

Page 66: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Variance component analysis in complex pedigrees

• Partition observed variation in quantitative traits into causal components, e.g.,
– Polygenic
– Common environment (‘household’)
– QTL
– Residual, including measurement error

• IBD proportions ($\pi$) estimated from multiple markers

“ACEQ” model

Page 67: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

[Blackwood et al. 1996] Bipolar pedigree

Page 68: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

[Figure: test statistic (LOD score, 0 to 4) against map position (0 to 25 cM), comparing two-point linkage analysis with variance component analysis. Blackwood et al. (1996) data]

Page 69: Introduction to QTL analysis Peter Visscher University of Edinburgh peter.visscher@ed.ac.uk.

Example: QTL analysis for BMI using a complex pedigree

[Deng et al. 2002]

