
Association Mapping

Date post: 24-Feb-2016
Upload: shauna
Transcript
Page 1: Association Mapping

1

Association Mapping

• LD
• Methods
• Germplasm
• Definition
• Causes
• Haplotype Blocks
• Marker Density
• Recombination Hotspots
• Model-based or PCA?
• Candidate loci or whole genome?
• Sub-population structure
• Extent of LD
• Breeding System
• Gene identification or Marker-assisted selection?
• Regression
• Genomic selection
• Multiple testing vs. Shrinkage
• Signatures of selection
• Species
• Panel diversity
• Confounded structure and polymorphism

Page 2: Association Mapping

2

Outline

• Association mapping is regression
• Accounting for structure
– Estimating structure using markers
– Truly multi-factorial models
• Miscellaneous topics:
– Genomic control; TDT; Confounding with structure; Haplotype predictors; Genetic heterogeneity; Missing heritability; NAM; Validation

Page 3: Association Mapping

3

Association Mapping

• It’s the same thing as linkage mapping in a bi-parental population but in a population that has not been carefully designed and generated experimentally

• Because the experiment has not been designed, it is messy. Statistical methods are needed to deal with the mess

Page 4: Association Mapping

4

Regression

• xi is the allelic state at a marker
• Consider the total genotypic effect of individual i

• qi is the allelic state at a QTL with which the marker is (hopefully) in LD

• Now estimate β

Page 5: Association Mapping

5

Estimate of Beta

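[Equation lost in extraction: the estimate of β is shown as the sum of two terms, one labelled “Part having to do with LD” and the other “Multi-factorial trait / structure”]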
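The two labelled parts can be written out under assumptions that are not in the recovered slide text. Suppose the phenotype is y_i = μ + α q_i + g_i + e_i, where α (a symbol introduced here) is the QTL effect, g_i is the rest of the genotypic effect, and e_i is noise independent of the marker. The least-squares slope of y on the marker x then splits as sketched below. This is a reconstruction, not the slide's own equation.

```latex
\hat{\beta}
  = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)}
  = \underbrace{\alpha \, \frac{\operatorname{cov}(x, q)}{\operatorname{var}(x)}}_{\text{part having to do with LD}}
  + \underbrace{\frac{\operatorname{cov}(x, g)}{\operatorname{var}(x)}}_{\text{multi-factorial trait / structure}}
```

The first term is non-zero only when the marker and the QTL are in LD. The second term is the nuisance: it is non-zero whenever cov(x, g) is non-zero, even for a marker with no nearby QTL.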

Page 6: Association Mapping

6

When is cov(x, g) non-zero?

• Differences in allele frequencies at the marker between subpopulations AND difference in phenotypic mean between subpopulations
– The difference in mean can be due to a single or many loci
• Difference in the frequency of alleles between families AND difference in family phenotypic means within a (sub)population
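The first case can be illustrated with a small simulation. This is a minimal sketch under invented settings (two subpopulations, marker allele frequencies 0.2 and 0.8, a one-unit difference in phenotypic mean); none of these numbers come from the slides. The marker has no effect on the phenotype, yet the naive regression slope is clearly non-zero, while adding the subpopulation as a covariate brings it back towards zero.

```python
import numpy as np

def marker_slopes(n_per_subpop=2000, seed=0):
    """Return (naive, adjusted) regression slopes of phenotype on a null marker."""
    rng = np.random.default_rng(seed)
    subpop = np.repeat([0, 1], n_per_subpop)        # subpopulation label
    freq = np.where(subpop == 0, 0.2, 0.8)          # allele frequency differs by subpopulation
    x = rng.binomial(2, freq)                       # marker allele count (0, 1, or 2)
    # The phenotypic mean differs between subpopulations; the marker itself has NO effect.
    y = 1.0 * subpop + rng.normal(0.0, 1.0, size=subpop.size)

    ones = np.ones_like(y)
    naive_design = np.column_stack([ones, x])             # y ~ marker
    adjusted_design = np.column_stack([ones, subpop, x])  # y ~ subpopulation + marker
    naive = np.linalg.lstsq(naive_design, y, rcond=None)[0][-1]
    adjusted = np.linalg.lstsq(adjusted_design, y, rcond=None)[0][-1]
    return naive, adjusted

naive_slope, adjusted_slope = marker_slopes()
print(f"naive slope: {naive_slope:.3f}, adjusted slope: {adjusted_slope:.3f}")
```

In this toy setting the subpopulation label is known. In practice it has to be estimated from markers, which is where the following slides go.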

Page 7: Association Mapping

7

Structure possibilities

[Figure: population structure (vertical axis) against familial relatedness (horizontal axis)]

Yu, J., Pressoir, G., et al. 2006. Nat Genet 38:203-208

Page 8: Association Mapping

8

Controlling for structure

• Basic quantitative genetics:
– Two individuals who share many alleles should resemble each other phenotypically
– Use markers to figure out how many alleles individuals share and then use that to adjust statistically for their phenotypic resemblance

Page 9: Association Mapping

9

Controlling for structure

• The “mixed model”

Yu, J., Pressoir, G., et al. 2006. Nat Genet 38:203-208

Page 10: Association Mapping

10

Controlling for structure

• Structure => large differences in allele frequencies across many markers

[Figure: average marker Set2 score against average marker Set1 score (both axes 0 to 1); annotations: “First PCA axis” and “Potential Phenotypic Gradient”]

Regression coefficients of the phenotype on the PCA values

Page 11: Association Mapping

11

Use of PCA

• Results are not sensitive to the number of PCs, provided you have enough
– Price, A.L. et al. 2006. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38:904-909
• The number of significant PCs can be determined
– Patterson, N. et al. 2006. Population Structure and Eigenanalysis. PLoS Genetics 2:e190

• Use a “Screeplot”
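As a rough illustration of how PCs and scree-plot values can be obtained from marker data, here is a minimal sketch using a plain SVD of the centred marker matrix. The simulated panel (two subpopulations of 100 individuals, 200 markers, independently drawn allele frequencies) is hypothetical, and the sketch does not implement the formal significance test of Patterson et al.

```python
import numpy as np

def marker_pca(genotypes):
    """PCA of an individuals x markers matrix of allele counts, via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                       # PC scores, one column per component
    explained = s**2 / np.sum(s**2)      # share of variance: the values drawn in a scree plot
    return scores, explained

rng = np.random.default_rng(1)
n_per_subpop, n_markers = 100, 200
labels = np.repeat([0, 1], n_per_subpop)
# Hypothetical allele frequencies, drawn independently for each subpopulation.
freqs = rng.uniform(0.1, 0.9, size=(2, n_markers))
genotypes = rng.binomial(2, freqs[labels]).astype(float)

scores, explained = marker_pca(genotypes)
print("share of variance, first five PCs:", np.round(explained[:5], 3))

# The first PC should separate the two simulated subpopulations.
pc1_label_corr = abs(np.corrcoef(scores[:, 0], labels)[0, 1])
print("|correlation| of PC1 with subpopulation label:", round(pc1_label_corr, 3))
```

The leading columns of `scores` can then be added as covariates in the marker regression, in the same way a known subpopulation label would be.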

Page 12: Association Mapping

12

Historical footnote

• PCA achieves what the Pritchard program Structure does

• PCA is faster and more robust

Pritchard, J.K. et al. 2000. Genetics 155:945-959
Price, A.L. et al. 2006. Nat Genet 38:904-909
Patterson, N. et al. 2006. PLoS Genetics 2:e190

Page 13: Association Mapping

13

Kinship

• We are all a little bit related:
– Two unrelated people: go back 1 generation, all four parents must be different people.
– Go back 2 generations, all eight grandparents must be different people.
– Go back 30 generations, all 2.1 billion ancestors would need to be different people: Impossible!

Page 14: Association Mapping

14

Identity by Descent

• Two alleles that are copies (through reproduction) of the same ancestral allele

Coefficient of Coancestry

• Choose a locus
• Pick an allele from Ed and one from Peter
• Probability that the alleles are IBD = Ed and Peter’s Coefficient of Coancestry, θEP

Page 15: Association Mapping

15

Coef. of Coancestry –> A matrix

• A is the additive relationship or kinship matrix

[Figure: A matrix with groups labelled Winter, Two-Row, Six-Row, and “Bison”]

Page 16: Association Mapping

16

A constrains u

• Two individuals who share many alleles should resemble each other phenotypically

• u is the polygenic effect
• Its covariance matrix is Var(u) = Aσ²_u
• If aij has a high value, then ui and uj should have similar values (they have high covariance)

• A constrains the values that are possible for u

Page 17: Association Mapping

17

Single locus, additive model: cov(ui, uj)

Page 18: Association Mapping

18

A matrix from the pedigree

• The cells in the A matrix are aij = 2θij, the additive relationship coefficients between i in the row and j in the column

• Coefficient of coancestry θij: the probability that two alleles, one drawn at random from i and one from j, are IBD

• Calculate from the pedigree by recursion:
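The recursion itself did not survive extraction. One standard way to do it is the tabular method, sketched here as an assumption rather than as the slide's own formula: for individual i with parents s and d, aij = 0.5 (ajs + ajd) for every earlier individual j, and aii = 1 + 0.5 asd. Founders are assumed unrelated and non-inbred, and the function name and pedigree format are invented for the illustration.

```python
import numpy as np

def additive_relationship(pedigree):
    """Build the A matrix by the tabular method.

    pedigree: list of (name, sire, dam), parents listed before their offspring;
    None marks an unknown parent (founders assumed unrelated and non-inbred).
    """
    index = {name: k for k, (name, _, _) in enumerate(pedigree)}
    n = len(pedigree)
    A = np.zeros((n, n))
    for i, (_, sire, dam) in enumerate(pedigree):
        s = index.get(sire)   # None when the parent is unknown
        d = index.get(dam)
        for j in range(i):
            a_js = A[j, s] if s is not None else 0.0
            a_jd = A[j, d] if d is not None else 0.0
            A[i, j] = A[j, i] = 0.5 * (a_js + a_jd)
        parents_related = A[s, d] if (s is not None and d is not None) else 0.0
        A[i, i] = 1.0 + 0.5 * parents_related   # 1 + inbreeding coefficient of i
    return A

# Four unrelated founders and one offspring of c and d.
pedigree = [("a", None, None), ("b", None, None), ("c", None, None),
            ("d", None, None), ("e", "c", "d")]
A = additive_relationship(pedigree)
coancestry = A / 2.0    # theta_ij = a_ij / 2
print(A)
```

For this pedigree the offspring e has a relationship of 0.5 with each parent and 0 with a and b, so its coancestry with each parent is 0.25.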

Page 19: Association Mapping

19

A matrix from marker data

[Formula lost in extraction; it is defined in terms of the homozygosities over all markers and alleles]

Page 20: Association Mapping

20

With inbreeding, parental contributions NOT 50:50

• Maize intermated population

• Drift during intermating and inbreeding

• Markers can give more accurate θ than pedigree

[Figure: histogram with horizontal axis from 0 to 1 in steps of 0.05 and vertical axis from 0 to 90; axis labels not recovered]

Page 21: Association Mapping

21

Mixed Model Example

• Five individuals: a, b, c, d, and e.
• a and b are in subpop 1; c, d, and e are in subpop 2.
• a, b, c, and d are unrelated; e is the offspring of c and d.
• a and d carry the 0 allele; b, c, and e carry the 1 allele.

y = μ + Xβ + Qv

Page 22: Association Mapping

22

Mixed Model Example

y = μ + Xβ + Qv


Page 23: Association Mapping

23

Mixed Model Example


y = μ + Xβ + Qv + Zu + e

Page 24: Association Mapping

24

Mixed Model Example


For the Zu term: var(u) = Aσ²_u, with A = [5 × 5 matrix lost in extraction]

Page 25: Association Mapping

25

Mixed Model Example

• There is a polygenic effect u for each individual => overdetermined model?

• NO: u is a random effect, constrained by Aσ²_u

Page 26: Association Mapping

26

Mixed Model Example

y = μ + Xβ + Qv + Zu + e

[Mixed model equations lost in extraction: (coefficient matrix)⁻¹ × (right-hand side) = solutions]
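To make the five-individual example concrete, the sketch below builds the design matrices for y = μ + Xβ + Qv + Zu + e and solves Henderson's mixed model equations. Several things are assumptions added for illustration: the phenotypes, the variance ratio σ²_e / σ²_u = 1, coding Q as a single indicator for subpop 2 (so that it is not collinear with μ), and treating v as a fixed effect. The A matrix follows from the stated pedigree with unrelated, non-inbred founders.

```python
import numpy as np

# Individuals in the order a, b, c, d, e.
# Fixed-effect columns: intercept (mu), subpop-2 indicator (Q), marker allele (X).
F = np.array([
    [1, 0, 0],   # a: subpop 1, allele 0
    [1, 0, 1],   # b: subpop 1, allele 1
    [1, 1, 1],   # c: subpop 2, allele 1
    [1, 1, 0],   # d: subpop 2, allele 0
    [1, 1, 1],   # e: subpop 2, allele 1
], dtype=float)
Z = np.eye(5)                        # one polygenic effect per individual

A = np.eye(5)                        # a, b, c, d unrelated and non-inbred
A[4, 2] = A[2, 4] = 0.5              # e is offspring of c ...
A[4, 3] = A[3, 4] = 0.5              # ... and d

y = np.array([10.0, 12.0, 15.0, 13.0, 16.0])   # hypothetical phenotypes
lam = 1.0                                      # assumed ratio sigma2_e / sigma2_u

def solve_mme(F, Z, A, y, lam):
    """Solve Henderson's mixed model equations for fixed effects and u."""
    lhs = np.block([
        [F.T @ F, F.T @ Z],
        [Z.T @ F, Z.T @ Z + lam * np.linalg.inv(A)],
    ])
    rhs = np.concatenate([F.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = F.shape[1]
    return solution[:n_fixed], solution[n_fixed:]

fixed_effects, u_hat = solve_mme(F, Z, A, y, lam)
print("mu, v, beta:", np.round(fixed_effects, 3))
print("u:", np.round(u_hat, 3))
```

There are eight unknowns and only five observations, yet the system has a unique solution because the term λA⁻¹ shrinks u towards zero. That is the sense in which A constrains u and the model is not overdetermined.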

Page 27: Association Mapping

27

Control false positives from structure

[Figure: cumulative P against observed P (both axes 0 to 0.5) for the Simple, Q, K, Q + K, and GC models. Panels: a. Flowering time (high population structure); b. Ear height (moderate population structure); c. Ear diameter (low population structure)]

A straight diagonal line indicates an appropriate control of false positives.

Q + K model has best Type I error control, most important when trait is related to population structure (e.g., flowering time).

Page 28: Association Mapping

28

Statistical power

[Figure: adjusted average power (0 to 1) against genetic effect, expressed as phenotypic variation explained in % (0, 0.8, 3.3, 7.1, 11.9, 17.4), for the Simple, Q, K, Q + K, and GC models. Panels: d. Flowering time (high population structure); e. Ear height (moderate population structure); f. Ear diameter (low population structure)]

Q + K model had highest power to detect SNPs with true effects.

Page 29: Association Mapping

29

Controlling for Structure

[Figure panels: Original; P Matrix; K Matrix]

Page 30: Association Mapping

30

FDR vs. Power for 300 lines, 10 QTL

Page 31: Association Mapping

31

Effect of line number, P-only

10 QTL, 0.75 heritability

Page 32: Association Mapping

32

Effect of Reduced Population Diversity

Page 33: Association Mapping

33

Take homes on diversity

• At equal population size
– A less diverse population can increase power because, relative to the extent of LD, the average marker distance is lower
– Given that you are testing fewer markers, the multiple testing problem is reduced

• Avoid as much as possible reducing population size for the sake of obtaining a more homogeneous population

Page 34: Association Mapping

34

Guidelines

• More lines and more markers are better
• For a diverse population, 800+ lines
• For a narrower population, 300+ (?)
• FDR is a reasonable method of determining significance, but probably conservative
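The slides do not say which FDR procedure was used. A common choice is the Benjamini-Hochberg step-up procedure, sketched here as an assumption; the function name and the p-values are invented for the illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, fdr=0.05):
    """Return a boolean array marking the tests declared significant."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                          # ranks p-values from smallest to largest
    thresholds = fdr * np.arange(1, m + 1) / m     # k / m * fdr for rank k
    passed = p[order] <= thresholds
    significant = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])          # largest rank that meets its threshold
        significant[order[:k + 1]] = True          # step-up: all smaller p-values also pass
    return significant

p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(p_values)
print(calls.sum(), "of", len(p_values), "tests declared significant")
```

With these ten p-values only the two smallest are declared significant at a 5% FDR, although five of them are below 0.05.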

Page 35: Association Mapping

35

Q constant, K estimated

• Q estimated with all markers, K estimated with varying fraction of markers available

[Figure: variance ratio (0.00 to 0.80) against marker number (0% to 100% of available markers), for SSR and SNP markers. Panels: d. Flowering time; e. Ear height; f. Ear diameter]

Page 36: Association Mapping

36

Q estimated, K constant

• Q estimated with varying fraction of markers available, K estimated with all markers

[Figure: variance ratio (0.00 to 0.80) against marker number (0% to 100% of available markers), for SSR and SNP markers. Panels: d. Flowering time; e. Ear height; f. Ear diameter]

Page 37: Association Mapping

37

History / future of controlling for structure

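[Equation lost in extraction: the estimate of β is shown as the sum of two terms, one labelled “Part having to do with LD” and the other “Multi-factorial trait / structure”]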

Page 38: Association Mapping

38

Single locus: model mis-specification

• “the problem is better thought of as model mis-specification: when we carry out GWA analysis using a single SNP at a time, we are in effect modeling a multifactorial trait as if it were due to a single locus”
– Atwell S. et al. 2010. Nature 465:627-631

Page 39: Association Mapping

39

History: Candidate locus studies

• AM started out with candidate locus studies where the effects of few loci could be fitted

• The biotechnology was not there to type more than a few loci

• The genetic background needed to be accounted for somehow (see above)

• In any event, the computational power was not there to fit all 10^6 loci simultaneously

Page 40: Association Mapping

40

Future: GWAS fitting all loci

• These methods could displace mixed models accounting for structure

Logsdon, B. et al. 2010. BMC Bioinformatics 11:58.

Page 41: Association Mapping

41

Sundry topics

• Other methods to control structure
• QTL confounded with structure
• Single markers or haplotypes?
• Genetic heterogeneity
• Missing heritability
• Linkage disequilibrium / Linkage analysis
• Validation

Page 42: Association Mapping

42

Genomic Control

• Calculate bias in distribution of test statistic using “neutral” loci, then account for bias

• Devlin, B. and Roeder, K. 1999. Genomic Control for Association Studies. Biometrics 55:997-1004.

• Works best for candidate genes: test loci can be distinguished from neutral control loci. Works less well for whole genome scans
• Marchini, J. et al. 2004. Nat. Genet. 36:512-517
• Devlin, B. et al. 2004. Nat. Genet. 36:1129-1131
• Marchini, J. et al. 2004. Nat. Genet. 36:1131-1131
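A minimal sketch of the idea, assuming 1-df chi-square association statistics at the “neutral” loci and a median-based inflation factor: divide the median of the null-locus statistics by the median of a 1-df chi-square (about 0.455), then divide the test statistics by that factor. The simulated statistics, inflated by a factor of 1.5, are invented for the illustration.

```python
import numpy as np

CHI2_1DF_MEDIAN = 0.4549364   # median of a chi-square distribution with 1 df

def genomic_control(chi2_stats):
    """Estimate the inflation factor from null-locus statistics and rescale them."""
    stats = np.asarray(chi2_stats, dtype=float)
    inflation = np.median(stats) / CHI2_1DF_MEDIAN
    inflation = max(inflation, 1.0)          # never inflate statistics that look deflated
    return inflation, stats / inflation

rng = np.random.default_rng(2)
# Hypothetical "neutral" loci whose statistics are inflated 1.5-fold by structure.
null_stats = 1.5 * rng.chisquare(df=1, size=20000)
lam_hat, corrected = genomic_control(null_stats)
print("estimated inflation factor:", round(lam_hat, 3))
```

In a candidate-gene study the same factor would then be applied to the statistics at the candidate loci. In a whole genome scan the set of truly neutral loci is not known in advance, which is the limitation the slide points to.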

Page 43: Association Mapping

43

Transmission Disequilibrium Test

• Experimental rather than statistical control of effects of structure

• Originally conceived for dichotomous (e.g., disease / no disease) traits

• Affected offspring and both parents, of which one must be heterozygous

• Test whether a putative causal allele is transmitted more often than 50% of the time

• Spielman, R.S. et al. 1993. Am. J. Hum. Genet. 52:506-516
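The test statistic can be sketched as follows. Among heterozygous parents of affected offspring, count how often the putative causal allele was transmitted (b) and how often it was not (c); under no association both are expected to be equal, and (b − c)² / (b + c) is compared with a chi-square distribution with 1 df. The function name and the counts are made up for the example.

```python
def tdt_statistic(transmitted, not_transmitted):
    """McNemar-type TDT chi-square (1 df) from heterozygous-parent transmissions."""
    total = transmitted + not_transmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - not_transmitted) ** 2 / total

# 100 heterozygous parents: the allele was transmitted 60 times, not transmitted 40 times.
statistic = tdt_statistic(60, 40)
print(statistic)   # 4.0, above the 5% critical value of 3.84 for 1 df
```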

Page 44: Association Mapping

44

TDT

• Extensions for quantitative traits
• Allison, D.B. 1997. Am. J. Hum. Genet. 60:676-690
• Extensions for larger-than-trio pedigrees
• Monks, S.A., and N.L. Kaplan. 2000. Am J Hum Genet 66:576-92
• Using for populations under artificial selection
• Bink, M.C.A.M. et al. 2000. Genetical Res. 75:115-121

Page 45: Association Mapping

45

QTL confounded with structure

• Particularly important for QTL affecting adaptation, e.g., flowering time

Camus-Kulandaivelu, L. et al. 2006. Genetics 172:2449–2463

Page 46: Association Mapping

46

Also in rice…

Ghd7-0a: Non-functional
Ghd7-2: Weak allele
Ghd7-0: Deleted
Ghd7-1, Ghd7-3: Functional

Given geographic distribution and role in adaptation, selection using this locus will have marginal utility

Xue, W. et al. 2008. Nat Genet 40:761-767

Page 47: Association Mapping

47

Confounded QTL with structure

• Association analysis will have difficulty identifying such QTL: the QTL needs to be polymorphic within subpopulations

• Traditional linkage studies of crosses between members of different subpopulations should be very effective in this case

• e.g., Xue, W. et al. 2008. Nat Genet 40:761-767

• Multi-factorial methods will have difficulty identifying loci under strong structure

Page 48: Association Mapping

48

Dwarf8: Confounded with structure

• Thornsberry, J.M. et al. 2001. Nat. Genet. 28:286-289
– First structured association test applied to plants

Camus-Kulandaivelu, L. et al. 2006. Genetics 172:2449–2463

Page 49: Association Mapping

49

Single markers or haplotypes?

• The jury is still out
• Infinite ways to simulate and analyze
– Ne, QTL MAF, QTL effect, quantitative vs. binary, age of mutation

• Ex. 1: Dramatically more power for haplotypes vs single markers

• Durrant, C. et al. 2004. Am J Hum Genet 75:35-43

Page 50: Association Mapping

50

Single markers or haplotypes?

• Ex. 2: Similar or lower power for haplotype method relative to single marker method

• Zhao, H.H. et al. 2007. Genetics 175:1975-1986

• The process of sorting out which method is most appropriate in which situation still has to happen

Page 51: Association Mapping

51

Exploiting Haplotype Blocks

• Objective: reduce the genotyping cost while capturing polymorphism at all (most) loci

• Haplotype: series of alleles at adjacent polymorphic loci

• Blocks: majority of diversity in few haplotypes
• => Strong LD between loci within blocks; weak LD between loci across blocks
• Knowledge of the allele at one locus provides much information on the alleles at other loci

Page 52: Association Mapping

52

What causes blocks?

• Recombination heterogeneity: coldspots within blocks, hotspots between blocks

• Random sampling of alleles and timing of mutation relative to recombination events

Page 53: Association Mapping

53

Evidence for mechanisms

• Humans: High marker density resources
– Observed LD structure reproduced best if recombination hotspots every ~ 100 kbp
• Reich, D.E. et al. 2002. Nat Genet 32:135-142.
• Wall, J.D., and J.K. Pritchard. 2003. Am J Hum Genet 73:502-15.
– Block boundaries correspond with positions of high current recombination
• Jeffreys, A.J. et al. 2005. Nat Genet 37:601-606
– Block boundaries consistent across different human populations
• De La Vega, F.M. et al. 2005. Genome Res. 15:454-462

Page 54: Association Mapping

54

Hotspots exist in plants (Arabidopsis) too

Kim, S. et al. 2007. Nat Genet 39:1151-1155

70 kbp

Page 55: Association Mapping

55

Blocks also arise randomly

• No relation between historic recombination (histogram) and block boundaries (dark bars)
• Verhoeven, K.J.F. and K.L. Simonsen. 2005. Mol. Biol. Evol. 22:735-740

Page 56: Association Mapping

56

Blocks in barley

40 ?? Mbp

Page 57: Association Mapping

57

Block cause matters

• If blocks arise from a recombination process, they will be consistent across populations

• Markers that tag blocks identified in one population will therefore be useful in others

• If blocks arise from random processes, tags useful in one population will not be so in another

Page 58: Association Mapping

58

Haplotypes for discovery in barley

• 2198 mapped SNP in 1807 lines across barley
• Five methods
– Traditional single SNP
– Four gamete: use D’ to determine boundaries
– Tree scan: single df contrasts based on parsimony
– HapBlock: group to capture diversity
– Sliding window of 3 SNP
• Simulation: mask a SNP and pretend it’s a QTL
• Real data: heading date on 1040 lines

Page 59: Association Mapping

59

Results

• Simulation: single SNP best in 5 / 8 cases and never worse than the best haplotype method

• CAUTION: the QTL had the same properties as the SNP: ideal for single SNP discovery

• IF QTL simulated as recent mutations on blocks with haplotype properties THEN haplotype methods had higher power
– Even then, single SNP did pretty well

Page 60: Association Mapping

60

[Figure: real data (heading date) along chromosomes 1H to 7H; legend: All methods, Only TreeScan, Only 4gamete]

Page 61: Association Mapping

61

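4gamete success

• Rare recombinants split off early-heading lines

[Figure values as recovered; the column names are not in the source and are a best reading]

Haplotype   Value   Lines
000         -0.53   64
011          0.16   489
110         -0.48   386
010         -0.44   82
111         -0.69   12
001         -0.99   3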

Page 62: Association Mapping

62

Take-homes

• Simulations don’t support use of haplotype blocks
– But we don’t know how to simulate the true nature of QTL
• With real data, a diversity of approaches might produce the most useful candidate list

Page 63: Association Mapping

63

Block vs tag identification

• Blocks require position, tags do not
• General tag marker approach:
– Identify markers in high LD with each other
– Retain only one
– Aggressive: among tags, see if combinations can be used instead of single tags

de Bakker, P.I.W. et al. 2005. Nat Genet 37:1217-1223
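The general tag marker approach can be sketched as a greedy selection on pairwise r². This is an illustration under stated assumptions, not the algorithm of de Bakker et al.: LD is measured as the squared correlation of allele counts, the threshold of 0.8 is a conventional but arbitrary choice, and the “aggressive” multi-marker step is not attempted.

```python
import numpy as np

def greedy_tags(genotypes, r2_threshold=0.8):
    """Pick tag markers so that every marker has r^2 >= threshold with some tag."""
    r2 = np.corrcoef(genotypes, rowvar=False) ** 2
    linked = r2 >= r2_threshold
    np.fill_diagonal(linked, True)        # a marker always tags itself (also guards against NaN)
    untagged = np.ones(genotypes.shape[1], dtype=bool)
    tags = []
    while untagged.any():
        # For each candidate, count the still-untagged markers it would cover.
        coverage = (linked & untagged).sum(axis=1)
        best = int(np.argmax(coverage))
        tags.append(best)
        untagged &= ~linked[best]         # everything linked to the new tag is now covered
    return tags

rng = np.random.default_rng(3)
base = rng.binomial(2, 0.5, size=(200, 3)).astype(float)   # three independent markers
groups = [0, 0, 1, 1, 1, 2]                                # six markers in three perfect-LD groups
genotypes = base[:, groups]
tags = greedy_tags(genotypes)
print("tag markers:", tags)
```

In this toy panel one marker is retained per LD group, so six markers are reduced to three without losing information.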

Page 64: Association Mapping

64

Reducing marker numbers

• Tag SNP and Imputation:
• Tag marker approach
– Identify markers in high LD with each other
– Retain only one
– Aggressive: among tags, see if combinations can be used instead of single tags

Page 65: Association Mapping

Tagging works

• Power is maintained; genotyping is reduced

65

[Figure: power against average marker spacing (kbp) for four marker sets: Greedy, Best N, Random Tags, and No LD]

de Bakker, P.I.W. et al. 2005

Page 66: Association Mapping

66

Tags serve as a base for imputation

• Model-based imputation using fastPHASE

Scheet and Stephens. 2006

Page 67: Association Mapping

67

Imputation on tag markers works

Jannink et al. 2007

Page 68: Association Mapping

68

Imputation can increase power

[Figure: horizontal axis is chromosomal position (Mb)]

Marchini et al. 2007

Page 69: Association Mapping

69

Imputation can increase power

Guan and Stephens 2008

Page 70: Association Mapping

70

Genome scans with low LD

• Numerous species have too low LD to perform (as of yet) whole genome scans
– You would need too many SNP on too many genotypes
• “Nested Association Mapping”
• Known as “Linkage disequilibrium linkage analysis” (LDLA) in animal genetics

Yu, J. et al. 2008. Genetics 178:539-551

Meuwissen, T.H.E. et al. 2002. Genetics 161:373-379.

Page 71: Association Mapping

71

NAM Design

• B73 is the reference parent, crossed to 26 other inbred lines, representing a large part of maize diversity

[Figure: crossing scheme. B73 × CML52 → F1 → RIL1, RIL2, …, RIL199, RIL200; likewise B73 × P39 → F1 → RIL1, RIL2, …, RIL199, RIL200; label: “26 Times”]

Page 72: Association Mapping

72

[Figure: NAM crossing scheme. B73 × 25 DL → F1s → SSD → RIL 1, 2, …, 200. The 25 DL are: B97, CML103, CML228, CML247, CML277, CML322, CML333, CML52, CML69, Hp301, Il14H, Ki11, Ki3, Ky21, M162W, M37W, Mo18W, MS71, NC350, NC358, Oh43, Oh7B, P39, Tx303, Tzi8]

Page 73: Association Mapping

73

NAM Genotyping

• Type parents at high density (2.5 M SNP…)
• Type RIL at low density (10 k SNP): know, on a sub-cM scale, which parental allele was inherited

Page 74: Association Mapping

74

NAM linear models

• P: matrix indicating which parent contributed the allele to each offspring. α: vector of effects of parental alleles.

• Eq. 1
• A linear model for the parental allele effects:
• Eq. 2
• This latter model is what Yu et al. call “Projecting parental SNP on to the progeny.”
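Eq. 1 and Eq. 2 did not survive extraction. Given the definitions of P and α above, one plausible form is sketched below. It is an assumption, not the slide's own notation: s (the vector of parental allelic states at a SNP inside the QTL interval), b (that SNP's effect), μ_α, and ε are symbols introduced here.

```latex
\begin{align}
  y      &= \mathbf{1}\mu + P\alpha + e
         && \text{(Eq. 1, assumed form: progeny phenotypes on parental-allele indicators)} \\
  \alpha &= \mathbf{1}\mu_{\alpha} + s\,b + \varepsilon
         && \text{(Eq. 2, assumed form: parental allele effects on a parental SNP)}
\end{align}
```

Under this reading, Eq. 1 is the linkage step fitted on the RIL, and Eq. 2 is the association step fitted across the parental alleles only.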

Page 75: Association Mapping

75

NAM / LDLA on Maize

• Consider:
– 2.5 Gbp genome with LD extending to 1000 bp
– Requires 2.5 M SNP…
• Apply Eq. 1 to identify QTL with, say, 3 cM C.I.
– Within interval there are ~ 3000 parental SNP
• Apply Eq. 2 to dissect the QTL to its causal SNP
– Feasible with 25 genotypes (apparently)
– Note that α will be accurately estimated

Page 76: Association Mapping

76

Advantages of NAM / LDLA

• Adds power without adding huge genotyping burden

• Reduces / eliminates problems related to structure: the linkage part of the analysis removes long-distance LD

Page 77: Association Mapping

77

Sugary1: Genetic heterogeneity

Tracy, W.F. et al. 2006. Crop Sci 46:S-49-54

Page 78: Association Mapping

78

Genetic heterogeneity hinders AM

• Distinct mutations at the B locus

• If B2 and B3 cause a phenotype (e.g. loss of function at isoamylase), it will be associated with A1 in one case and A2 in the other case.

• B2 and B3 can be identified by linkage mapping

B2 associated with A1

A1B1 Exists A2B1 Exists

A1B2 New A2B2 --

B3 associated with A2

A1B1 Exists A2B1 Exists

A1B3 -- A2B3 New

Page 79: Association Mapping

79

How prevalent is heterogeneity?

Buckler et al. 2009. Science 325:714-718

Page 80: Association Mapping

80

Multiple hits => Allelic series

Buckler et al. 2009. Science 325:714-718

Page 81: Association Mapping

81

Heterogeneity and Pop. History

• If a population has gone through a severe bottleneck, polymorphic loci are unlikely to have > 2 alleles…

• Heterogeneity is less likely in domesticated populations with low Ne

Page 82: Association Mapping

82

Page 83: Association Mapping

83

Missing Heritability

• Heritability for height in humans is ~ 0.80.
• Very large GWAS studies find ~ 50 SNP together accounting for 5% of that heritability
• Where’s the rest?
– Infinitesimal effects
– Low frequency SNP in same causal genes
– Epigenetics
– Genotype x environment interaction
– Epistasis

Maher, B. 2008. Nature 456:18-21
Manolio T.A. et al. 2009. Nature 461:747-753.

Page 84: Association Mapping

84

Plants are not like humans

• Atwell et al. 2010. Nature 465:627-631
– Just 192 lines!
– Some large effect variants (intermediate frequency and explain 20% of variation…)
– Inbred lines enable noise reduction
– Extended association peaks because of low Ne

• Less evidence of missing heritability

Page 85: Association Mapping

85

Mouse composite not like humans

• Valdar et al. 2006. Nature Genetics 38:879-887

• QTL account for 73% of observed heritability

Page 86: Association Mapping

86

Humans are not like humans

• Yang et al. 2010. Nat. Genet. 10.1038/ng.608
– Common SNP accounted for 45% of variation if all SNP included in the model
• i. Many very small QTL effects
• ii. QTL generally have lower MAF than arrayed SNP
• Dickson S.P. et al. 2010. PLoS Biol 8:e1000294
– Several rare variants can combine to produce an association with a common SNP

Page 87: Association Mapping

87

Validation

• All genome-wide studies raise the question of validation

• In candidate studies, independent evidence from biological reasoning for candidate choice

• In Zhao et al. 2007, used previous linkage analyses of parents in the association panel

Page 88: Association Mapping

88

[Figure: real data (heading date) compared with linkage studies along chromosomes 1H to 7H; VRN3 is labelled]

Page 89: Association Mapping

89

Arabidopsis: Residual structure

Residual Confounding: no bi-parental QTL found despite it segregating in the cross

Low Power: No association found despite large effect in the cross

Page 90: Association Mapping

90

Recap

• Model has focused on one locus at a time
• The locus has been treated as a fixed effect
– Makes sense in the candidate locus context
• We have dealt with residual “polygenic” effects that, through structure, wreak havoc
• Going forward, statistical models will be multi-factorial
• Linkage mapping needed to find loci associated with structure
• LD exhibits block-like structure: what to do with that?
• Potential for genetic heterogeneity depends on population history
• GWAS can miss substantial heritability
• If you have very low LD, nested association, or LDLA, is a good idea

