Date post: 11-Dec-2021
Estimation of genetic variation and SNP-heritability from GWAS data

Key concepts

• Dense SNP panels allow the estimation of the expected genetic covariance between distant relatives

• A model based upon estimated relationships from SNPs is equivalent to a model fitting all SNPs simultaneously

• The total genetic variance due to LD between common SNPs and (unknown) causal variants can be estimated

• Genetic variance captured by common SNPs can be partitioned across the genome

• Different methods to estimate relatedness from SNPs assume different genetic trait architectures

Estimation of SNP-heritability from GWAS data

Background
– 2008: GWAS was perceived by many to have failed as an experimental design
– Missing heritability: discrepancy between pedigree heritability and variance captured by associated SNPs

Where is the Dark Matter?

Disease                             Number of loci   Percent of heritability measure explained   Heritability measure
Age-related macular degeneration    5                50%                                         Sibling recurrence risk
Crohn’s disease                     32               20%                                         Genetic risk (liability)
Systemic lupus erythematosus        6                15%                                         Sibling recurrence risk
Type 2 diabetes                     18               6%                                          Sibling recurrence risk
HDL cholesterol                     7                5.2%                                        Phenotypic variance
Height                              40               5%                                          Phenotypic variance
Early onset myocardial infarction   9                2.8%                                        Phenotypic variance
Fasting glucose                     4                1.5%                                        Phenotypic variance

Hypothesis testing vs. estimation

GWAS = hypothesis testing
– Stringent p-value threshold
– Estimates of effects biased (“Winner’s Curse”)

Can we estimate the total proportion of variation accounted for by all SNPs?
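The upward bias that comes from selecting on a stringent threshold can be illustrated with a small simulation. This is a sketch under assumed values: the true effect, its standard error, and the |z| cut-off of about 5.45 (roughly p = 5×10⁻⁸, two-sided) are illustrative choices, not numbers from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

true_b = 0.05        # hypothetical true SNP effect
se = 0.02            # hypothetical standard error of the estimate
z_threshold = 5.45   # roughly the genome-wide significance cut-off

# Repeat the same association study many times
b_hat = rng.normal(true_b, se, size=200_000)

# Keep only the replicates that pass the significance threshold
selected = b_hat[np.abs(b_hat / se) > z_threshold]

mean_all = float(b_hat.mean())            # close to the true effect
mean_selected = float(selected.mean())    # inflated by selection

print(f"mean of all estimates:      {mean_all:.4f}")
print(f"mean of selected estimates: {mean_selected:.4f}")
```

Averaged over all replicates the estimate is unbiased; averaged only over the replicates that reach significance it overstates the effect, which is the Winner's Curse.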

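A model for a single causal variant

Genotype                    AA                   AB                     BB
Frequency                   (1−p)²               2p(1−p)                p²
x                           0                    1                      2
Effect                      0                    b                      2b
w = [x − E(x)]/σ_x          −2p/√{2p(1−p)}       (1−2p)/√{2p(1−p)}      2(1−p)/√{2p(1−p)}

$y_j = \mu' + x_{ij} b_i + e_j$, with $x = 0, 1, 2$ {standard association model}

$y_j = \mu + w_{ij} u_i + e_j$, with $u = b\sigma_x$ and $\mu = \mu' + b\,E(x) = \mu' + 2pb$

Here $E(x) = 2p$ and $\sigma_x = \sqrt{2p(1-p)}$, so substituting $x = \sigma_x w + E(x)$ into the first model gives the second.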
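The scaling from x to w can be checked numerically. The sketch below assumes Hardy-Weinberg genotype frequencies and an arbitrary allele frequency p = 0.3; the function name `standardise` is illustrative.

```python
import numpy as np

def standardise(x, p):
    """Scale 0/1/2 allele counts by E(x) = 2p and sd(x) = sqrt(2p(1-p))."""
    return (x - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))

p = 0.3                                    # assumed frequency of allele B
genotypes = np.array([0.0, 1.0, 2.0])      # x for AA, AB, BB
freqs = np.array([(1 - p) ** 2, 2 * p * (1 - p), p ** 2])

w = standardise(genotypes, p)

# Moments of w over the Hardy-Weinberg genotype distribution
mean_w = float(np.sum(freqs * w))
var_w = float(np.sum(freqs * w ** 2) - mean_w ** 2)

print(w, mean_w, var_w)
```

Under these assumptions w has mean 0 and variance 1, which is what lets a single variance parameter describe the effects u.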

Multiple (M) causal variants

$y_j = \mu + \sum_{i=1}^{M} w_{ij} u_i + e_j = \mu + g_j + e_j$

$\mathbf{y} = \mu\mathbf{1} + \mathbf{g} + \mathbf{e} = \mu\mathbf{1} + \mathbf{W}\mathbf{u} + \mathbf{e}$

Weighting scheme 1

Equivalence

Let u be a random variable, $u \sim N(0, \sigma_u^2)$

Then $\sigma_g^2 = M\sigma_u^2$

$\operatorname{var}(\mathbf{y}) = \mathbf{W}\mathbf{W}'\sigma_u^2 + \mathbf{I}\sigma_e^2 = \mathbf{W}\mathbf{W}'(\sigma_g^2/M) + \mathbf{I}\sigma_e^2 = \mathbf{G}\sigma_g^2 + \mathbf{I}\sigma_e^2$

A model with individual genome-wide additive values using relationships (G) at the causal variants is equivalent to a model fitting all causal variants.

We can estimate genetic variance just as we would using pedigree relationships.
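The intermediate steps of the equivalence can be written out as follows. This is a sketch that assumes the effects $u_i$ are independent with common variance $\sigma_u^2$, that each standardised genotype has $E(w_{ij}^2) = 1$, and that g and e are independent.

```latex
\begin{aligned}
\operatorname{var}(g_j)
  &= \sum_{i=1}^{M} E(w_{ij}^2)\,\sigma_u^2 = M\sigma_u^2 = \sigma_g^2 \\
\operatorname{var}(\mathbf{g})
  &= \operatorname{var}(\mathbf{W}\mathbf{u})
   = \mathbf{W}\operatorname{var}(\mathbf{u})\mathbf{W}'
   = \mathbf{W}\mathbf{W}'\sigma_u^2
   = \frac{\mathbf{W}\mathbf{W}'}{M}\,\sigma_g^2
   = \mathbf{G}\sigma_g^2 \\
\operatorname{var}(\mathbf{y})
  &= \operatorname{var}(\mathbf{g}) + \operatorname{var}(\mathbf{e})
   = \mathbf{G}\sigma_g^2 + \mathbf{I}\sigma_e^2
\end{aligned}
```

The matrix $\mathbf{G} = \mathbf{W}\mathbf{W}'/M$ therefore plays the same role as a pedigree relationship matrix in a standard mixed model.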

But we don’t have the causal variants

If we estimate G from SNPs:
– lose information due to imperfect LD between SNPs and causal variants
– how much we lose depends on
  • density of SNPs
  • allele frequency spectrum of SNPs vs. causal variants
– estimate of variance → missing heritability

G from M SNPs:

$G_{jk} = \frac{1}{M}\sum_{i=1}^{M} \frac{(x_{ij} - 2p_i)(x_{ik} - 2p_i)}{2p_i(1-p_i)} = \frac{1}{M}\sum_{i=1}^{M} w_{ij} w_{ik}$

Methods (Yang et al. 2010)

• Estimate realised relationship matrix from SNPs
• Estimate additive genetic variance

$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e} = \mathbf{W}\mathbf{u} + \mathbf{e}$, $\operatorname{var}(\mathbf{y}) = \mathbf{G}\sigma_g^2 + \mathbf{I}\sigma_e^2$

$G_{jk} = \frac{1}{M}\sum_{i=1}^{M} \frac{(x_{ij} - 2p_i)(x_{ik} - 2p_i)}{2p_i(1-p_i)} = \frac{1}{M}\sum_{i=1}^{M} w_{ij} w_{ik}$

• Base population = current population
• Weighting scheme 1
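The formula for G maps directly onto a few lines of matrix code. The sketch below uses simulated genotypes; the function name `grm`, the sample sizes, and the allele frequency range are assumptions for illustration, and allele frequencies are estimated from the sample itself (base population = current population).

```python
import numpy as np

def grm(x):
    """Realised relationship matrix from an N x M matrix of 0/1/2 allele counts."""
    p = x.mean(axis=0) / 2.0                 # sample allele frequencies
    keep = (p > 0.0) & (p < 1.0)             # monomorphic SNPs carry no information
    x, p = x[:, keep], p[keep]
    w = (x - 2.0 * p) / np.sqrt(2.0 * p * (1.0 - p))   # standardised genotypes
    return w @ w.T / w.shape[1]              # G = WW'/M

# Simulated unrelated individuals (hypothetical data)
rng = np.random.default_rng(1)
n, m = 200, 1000
p_true = rng.uniform(0.05, 0.5, size=m)
x = rng.binomial(2, p_true, size=(n, m)).astype(float)

G = grm(x)
diag_mean = float(np.mean(np.diag(G)))
offdiag_mean = float((G.sum() - np.trace(G)) / (n * (n - 1)))

print(diag_mean, offdiag_mean)
```

For unrelated individuals the diagonal of G averages close to 1 and the off-diagonals scatter around 0; it is the small deviations of the off-diagonals from 0 that carry the information about genetic variance.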

Statistical analysis

$\operatorname{var}(\mathbf{y}) = \mathbf{V} = \mathbf{G}\sigma_g^2 + \mathbf{I}\sigma_e^2$

y standardised ~ N(0, 1)

No fixed effects other than mean

G estimated from SNPs

Residual maximum likelihood (REML)
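To make the estimation step concrete, here is a minimal sketch that fits the same variance structure on simulated data. It is deliberately simpler than the analysis on the slide: it uses plain maximum likelihood with a grid search over h², not REML, and handles the mean by centring y. The function name `ml_h2`, the sample sizes, and the true h² of 0.5 are all assumptions.

```python
import numpy as np

def ml_h2(y, G, grid=None):
    """Grid-search ML estimate of h2 for y ~ N(0, s2 * (h2*G + (1 - h2)*I))."""
    if grid is None:
        grid = np.linspace(0.0, 0.99, 100)
    y = (y - y.mean()) / y.std()        # standardise; the mean is the only fixed effect
    lam, U = np.linalg.eigh(G)          # G = U diag(lam) U'
    yt2 = (U.T @ y) ** 2                # squared phenotypes rotated onto the eigenvectors
    best_h2, best_ll = 0.0, -np.inf
    for h2 in grid:
        d = h2 * lam + (1.0 - h2)       # eigenvalues of h2*G + (1 - h2)*I
        s2 = np.mean(yt2 / d)           # phenotypic variance profiled out
        ll = -0.5 * (y.size * np.log(s2) + np.sum(np.log(d)))
        if ll > best_ll:
            best_h2, best_ll = float(h2), ll
    return best_h2

# Simulated genotypes and phenotypes (hypothetical data)
rng = np.random.default_rng(2)
n, m, h2_true = 600, 500, 0.5
p = rng.uniform(0.1, 0.5, size=m)
x = rng.binomial(2, p, size=(n, m)).astype(float)
w = (x - 2 * p) / np.sqrt(2 * p * (1 - p))
G = w @ w.T / m

u = rng.normal(0.0, np.sqrt(h2_true / m), size=m)            # sigma_u^2 = sigma_g^2 / M
y = w @ u + rng.normal(0.0, np.sqrt(1 - h2_true), size=n)    # y = Wu + e

h2_hat = ml_h2(y, G)
print(h2_hat)
```

The eigendecomposition turns the N×N covariance into a diagonal one, so each candidate h² costs only a vector operation. Production software uses REML with iterative updates instead of a grid.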

Results

$h^2 \approx 0.5$ (SE 0.1)

[Yang et al. 2010, Nature Genetics]

Checking for population structure

[Visscher et al. 2010, Twin Research and Human Genetics]

Conclusions Yang et al. 2010

Genetic variance associated with all SNPs can be estimated from GWAS data
– use SNPs to estimate G
– use phenotypes on “unrelated” individuals and G to estimate genetic variance

Empirical results: most additive genetic variation for height is captured by common SNPs
– little ‘missing’ heritability
– GWAS works fine

Partitioning of genetic variation

$y = \text{mean} + g_1 + g_2 + g_3 + g_4 + g_5 + e$

$\operatorname{var}(\mathbf{g}_i) = (\mathbf{W}_i\mathbf{W}_i'/M_i)\,\sigma_i^2$ for SNPs in group i

Examples of groupings:
• chromosome
• genome annotation
• MAF
• LD
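Building one relationship matrix per SNP group is the first step of such a partition. The sketch below assigns simulated SNPs to five hypothetical groups (standing in for chromosomes) and checks that the genome-wide G is the M_i-weighted average of the group matrices; all names and sizes are illustrative.

```python
import numpy as np

def grm_from_w(w):
    """G = WW'/M for a matrix of standardised genotypes (N x M)."""
    return w @ w.T / w.shape[1]

rng = np.random.default_rng(4)
n, m = 100, 500
p = rng.uniform(0.05, 0.5, size=m)
x = rng.binomial(2, p, size=(n, m)).astype(float)
w = (x - 2 * p) / np.sqrt(2 * p * (1 - p))

# Hypothetical group labels 1..5, e.g. five chromosomes
group = rng.integers(1, 6, size=m)

# One GRM per group, each scaled by its own SNP count M_i
group_grms = {int(c): grm_from_w(w[:, group == c]) for c in np.unique(group)}
group_share = {int(c): float(np.mean(group == c)) for c in np.unique(group)}

# The genome-wide GRM is the M_i/M weighted average of the group GRMs
G_total = grm_from_w(w)
G_recombined = sum(group_share[c] * group_grms[c] for c in group_grms)

print(np.allclose(G_total, G_recombined))
```

Fitting the group matrices jointly, each with its own variance component, is what attributes variance to chromosomes, annotations, or MAF and LD bins.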

Application (2): partitioning variation

If we can estimate the variance captured by SNPs genome-wide, we should be able to partition it and attribute variance to regions of the genome

“Population based linkage analysis”

Example on quantitative traits

[Yang et al. 2011, Nature Genetics]

Partitioning on chromosomes

Longer chromosomes explain more variation

[Figure: variance explained by each chromosome (y-axis) against chromosome length in Mb (x-axis), chromosomes 1 to 22. Height (combined): slope = 1.6×10⁻⁴, P = 1.4×10⁻⁶, R² = 0.695. BMI (combined): slope = 2.3×10⁻⁵, P = 0.214, R² = 0.076.]

[Yang et al. 2011, Nature Genetics]

Results are consistent with reported GWAS

[Figure, two panels, chromosomes 1 to 22. Height (11,586 unrelated): variance explained by GIANT height SNPs on each chromosome (y-axis) against variance explained by each chromosome (x-axis), R² = 0.511. BMI (11,586 unrelated): variance explained by chromosome adjusted for the FTO SNP (y-axis) against variance explained by chromosome with no adjustment (x-axis); FTO is labelled.]

[Yang et al. 2011, Nature Genetics]

Inference robust with respect to genetic architecture

[Figure, two panels, chromosomes 1 to 22. vWF (ARIC): variance explained by each chromosome (y-axis) against chromosome length in Mb (x-axis); slope = 6.9×10⁻⁵, P = 0.524, R² = 0.021. vWF (6,662 unrelated): variance explained by chromosome adjusted for the ABO SNP (y-axis) against variance explained by chromosome with no adjustment (x-axis); ABO is labelled.]

[Yang et al. 2011, Nature Genetics]

Partitioning on genome annotation

Genic regions explain variation disproportionately

[Figure, two panels: variance explained per chromosome (1 to 22), split into genic (± 20 Kb) and intergenic (± 20 Kb) regions; 17,277 protein coding genes.]

Height (combined): $h^2_{Gg} = 0.328$ (s.e. = 0.024), $h^2_{Gi} = 0.126$ (s.e. = 0.022); coverage of genic regions = 49.4%; P (observed vs. expected) = 2.1×10⁻¹⁰

BMI (combined): $h^2_{Gg} = 0.117$ (s.e. = 0.023), $h^2_{Gi} = 0.047$ (s.e. = 0.022); coverage of genic regions = 49.4%; P (observed vs. expected) = 0.022

[Yang et al. 2011, Nature Genetics]

Application (3): Using imputed sequence data

How much information is gained by using SNP array data imputed to a fully sequenced reference?

How much is lost relative to whole genome sequencing?

Partition variation according to MAF and LD

[Yang et al. 2015 Nature Genetics]

Accounting for LD and MAF spectrum allows unbiased estimation of genetic variance

[Figure, two panels: heritability estimate (y-axis) for four x-axis categories (Random, More common, Rarer, Rarer & DHS). Panel 1 compares GREML-SC, GREML-MS, LDAK, LDAK-MS, LDres, LDres-MS and GREML-LDMS. Panel 2 compares the stratifications 7MAF_4LD, 7MAF_3LD, 7MAF_2LD and 2MAF_2LD.]

[Yang et al. 2015 Nature Genetics]

Very little difference in “taggability” between SNP chips

Genetic variation captured after imputation:
96% due to common variants
73% due to rare variants

[Figure: proportion of variation captured (y-axis) against imputation R² threshold (x-axis, 0 to 0.9), shown separately for common and rare variants on five arrays: Affymetrix 6, Affymetrix Axiom, Illumina OmniExpress, Illumina Omni2.5, Illumina CoreExome.]

[Yang et al. 2015 Nature Genetics]

n = 45k data on height and BMI

Totals
~60% for height
~30% for BMI

[Figure, three panels. Panel 1: variance explained by MAF-stratified variant group (< 0.1, 0.1~0.2, 0.2~0.3, 0.3~0.4, 0.4~0.5) for height and BMI. Panels 2 and 3: variance explained by MAF bin (2.5e-5~0.001, 0.001~0.01, 0.01~0.1, 0.1~0.2, 0.2~0.3, 0.3~0.4, 0.4~0.5), split by LD quartile from 1st (low LD) to 4th (high LD).]

[Yang et al. 2015 Nature Genetics]

Partitioning variance of height

[Figure: nested bars showing how the variance of height is partitioned. Levels listed in the legend, with the percentages shown on the slide:]

100%  Total variance
80%   Heritability (based on twin or family studies)
60%   SNP heritability from imputation to sequenced reference
45%   SNP-heritability (variance explained by all genotyped SNPs on the chip)
16%   Variance explained by genome-wide significant SNPs

Annotations on the figure: missing heritability; h² overestimation? untagged rare variants?; better tagging of ungenotyped variants; sample size / power

Slide by Robert Maier

Estimated relatedness and trait architecture

$G_{jk} = \frac{1}{\sum_{i=1}^{m} [2p_i(1-p_i)]^{1+S}} \sum_{i=1}^{m} z_{ij} z_{ik}\,[2p_i(1-p_i)]^{S}$

If G describes the genetic covariance between individuals [$\operatorname{var}(\mathbf{g}) = \mathbf{G}\sigma_g^2$], then what is the equivalent linear model in terms of SNP effects?

Equivalent models

$\mathbf{y} = \mathbf{1}\mu + \sum_j \mathbf{X}_j \beta_j + \mathbf{e}$

$\beta_j \sim N\!\left(0, [2p_j(1-p_j)]^{S}\,\sigma_\beta^2\right)$

$h_j^2 = 2p_j(1-p_j)\,E(\beta_j^2) = [2p_j(1-p_j)]^{1+S}\,\sigma_\beta^2$

S = −1

$\beta_j \sim N\!\left(0, [2p_j(1-p_j)]^{-1}\,\sigma_\beta^2\right)$

$h_j^2 = [2p_j(1-p_j)]^{0}\,\sigma_\beta^2 = \sigma_\beta^2$

• Weighting scheme 1
• All SNPs contribute equally to heritability
• Rare variants have bigger effects
• “Purifying selection model”

S = 0

$\beta_j \sim N\!\left(0, [2p_j(1-p_j)]^{0}\,\sigma_\beta^2\right) = N(0, \sigma_\beta^2)$

$h_j^2 = [2p_j(1-p_j)]^{1}\,\sigma_\beta^2 = 2p_j(1-p_j)\,\sigma_\beta^2$

• Weighting scheme 2
• Common SNPs contribute more to heritability
• Rare and common variants have same effects
• “Neutral model”
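Both weighting schemes are special cases of a relationship matrix indexed by S. The sketch below computes that matrix on simulated genotypes and checks the two cases; the function names `grm_s` and `h2_share`, the data, and the use of known allele frequencies are assumptions for illustration.

```python
import numpy as np

def grm_s(x, p, s):
    """Relationship matrix with SNP weights [2p(1-p)]^S (x is N x M allele counts)."""
    het = 2.0 * p * (1.0 - p)
    z = x - 2.0 * p                                  # centred genotypes
    return (z * het ** s) @ z.T / np.sum(het ** (1.0 + s))

def h2_share(p, s):
    """Share of heritability per SNP implied by S: proportional to [2p(1-p)]^(1+S)."""
    contrib = (2.0 * p * (1.0 - p)) ** (1.0 + s)
    return contrib / contrib.sum()

rng = np.random.default_rng(3)
n, m = 50, 400
p = rng.uniform(0.05, 0.5, size=m)
x = rng.binomial(2, p, size=(n, m)).astype(float)

G_scheme1 = grm_s(x, p, -1.0)    # S = -1: standardised genotypes, G = WW'/M
G_scheme2 = grm_s(x, p, 0.0)     # S = 0: centred genotypes, one common scaling

share1 = h2_share(p, -1.0)       # every SNP contributes 1/M
share2 = h2_share(p, 0.0)        # contribution grows with heterozygosity

print(share1[:3], share2[:3])
```

Choosing how to scale genotypes when building G is therefore an implicit assumption about how effect size varies with allele frequency.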

Weighting scheme and genetic architecture

• Weighting schemes 1 and 2 can be justified in two ways:
– IBD vs IBS
– A priori assumption about the relationship between allele frequency and effect size (natural selection)

• Can we estimate genetic architecture from the data?

Bayesian mixture model (BayesS)

$\mathbf{y} = \mathbf{1}\mu + \sum_j \mathbf{X}_j \beta_j + \mathbf{e}$

$\beta_j \sim N\!\left(0, [2p_j q_j]^{S}\,\sigma_\beta^2\right)\pi + \phi\,(1-\pi)$

• S measures the relationship between effect size and MAF $p_j$
– S = 0: independence
– S < 0: negatively related (rare variant tends to have large effect)
– S > 0: positively related (common variant tends to have large effect)
– GCTA default: S = −1

• $\pi$ is the polygenicity (proportion of SNPs with non-zero effects)

• $h^2_{SNP} = \operatorname{Var}(g)/\sigma_y^2$ where $g = \sum_j \mathbf{X}_j \beta_j$

• Simultaneously estimate SNP effects and genetic architecture parameters using MCMC

• Account for LD between SNPs

Zeng …. Yang 2017 (BioRxiv)
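The roles of S and π in the prior are easiest to see by drawing effects from it. The sketch below samples from the mixture prior only; it is not the MCMC sampler, and the function name `sample_effects`, the parameter values, and the MAF cut-offs are illustrative assumptions.

```python
import numpy as np

def sample_effects(p, s, pi, sigma_b2, rng):
    """Draw SNP effects: N(0, [2pq]^S * sigma_b2) with probability pi, else exactly 0."""
    het = 2.0 * p * (1.0 - p)
    nonzero = rng.random(p.size) < pi            # which SNPs have an effect
    sd = np.sqrt(het ** s * sigma_b2)            # effect sd depends on MAF through S
    return np.where(nonzero, rng.normal(0.0, sd), 0.0)

rng = np.random.default_rng(6)
m = 20_000
p = rng.uniform(0.01, 0.5, size=m)               # minor allele frequencies

beta = sample_effects(p, s=-1.0, pi=0.05, sigma_b2=1e-4, rng=rng)

causal = beta != 0.0
frac_nonzero = float(causal.mean())
rare_ms = float(np.mean(beta[causal & (p < 0.1)] ** 2))     # mean squared effect, rare
common_ms = float(np.mean(beta[causal & (p > 0.3)] ** 2))   # mean squared effect, common

print(frac_nonzero, rare_ms, common_ms)
```

With S < 0 the rare causal variants have larger effects on average than the common ones, and about a fraction π of all SNPs have non-zero effects.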

Direction of S distinguishes stabilising selection from directional and disruptive selection

[Figure: variance of coded allele effects (y-axis) against MAF bin (x-axis, 0.0 to 0.5) under five scenarios: Neutral, Directional (+), Directional (−), Stabilizing, Disruptive. S values annotated on the panels: S = 0, S = 0.14, S = 0.13, S = −0.19, S = 0.09, S = 3.70.]

$\beta_j \sim N\!\left(0, [2p_j q_j]^{S}\,\sigma_\beta^2\right)$

Zeng …. Yang 2017 (BioRxiv)

Height vs. BMI

Posterior distribution of genetic architecture parameters

[Figure: posterior densities of polygenicity, heritability and S for height and BMI.]

• Both height and BMI have been under selection

• Selection has been stronger for height than BMI-associated SNPs

• Height is more heritable than BMI.

• BMI is more polygenic than height.

Zeng …. Yang 2017 (BioRxiv)

29 traits in UKB

[Figure: highest probability density of S, heritability and polygenicity for each trait; legend gives sample size N from 40,000 to 120,000. Traits as listed on the slide: Fluid intelligence score, Neuroticism score, MDD, Birth weight, Body fat percentage, Diastolic blood pressure, BMI, Weight, T2D, Baldness, Systolic blood pressure, Peak expiratory flow, Educational attainment, Forced expiratory volume, Basal metabolic rate, Hand grip strength left, Age menarche, Age at first live birth, Heel BMD T score, Heel QUI, Hand grip strength right, Forced vital capacity, Height, HCadjBMI, Mean time to correctly identify matches, WHRadjBMI, WCadjBMI, Pulse rate, Age at menopause.]

Zeng …. Yang 2017 (BioRxiv)

24 traits with significant $\hat{S}$

[Figure, two panels. Panel 1: cumulative genetic variance explained (y-axis, 0 to 1) against MAF (x-axis, 0 to 0.5) for each trait. Panel 2: AUC (y-axis) against absolute value of S (x-axis). Traits as listed on the slide: Age at menopause, Pulse rate, Mean time to correctly identify matches, WCadjBMI, Age at first live birth, WHRadjBMI, Hand grip strength right, HCadjBMI, Hand grip strength left, Forced vital capacity, Age menarche, Educational attainment, Forced expiratory volume, Basal metabolic rate, Systolic blood pressure, Heel QUI, Peak expiratory flow, Heel BMD T score, Height, Weight, Diastolic blood pressure, BMI, Body fat percentage, Baldness.]

Zeng …. Yang 2017 (BioRxiv)

Summarize over 29 traits

[Figure, four panels. Histogram of posterior modes (count against posterior mode) for S, heritability and polygenicity. Scatter plots: absolute value of S against heritability (r = 0.048); absolute value of S against polygenicity (r = −0.359); heritability against polygenicity (r = −0.021).]

            Mean      Median
S           −0.350    −0.367
h²_SNP      0.223     0.222
π           5.8%      5.1%

Zeng …. Yang 2017 (BioRxiv)

Multiple methods to estimate additive genetic variance

Individual-level data
- GREML
- Haseman-Elston regression: $y_j y_k = \text{mean} + b\,G_{jk} + e$

Summary data
- LD score regression

Considerations:
- data availability
- model assumptions
- computation
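Haseman-Elston regression is simple enough to sketch in a few lines: regress the cross-products of standardised phenotypes on the off-diagonal relationships, and read the slope as an estimate of the genetic variance. The data below are simulated and the function name `he_regression`, sample sizes and true h² are assumptions.

```python
import numpy as np

def he_regression(y, G):
    """Slope of y_j * y_k on G_jk over all pairs j < k (y standardised first)."""
    y = (y - y.mean()) / y.std()
    upper = np.triu_indices(y.size, k=1)         # each pair of individuals once
    products = np.outer(y, y)[upper]
    slope, intercept = np.polyfit(G[upper], products, 1)
    return float(slope)

# Simulated genotypes and phenotypes (hypothetical data)
rng = np.random.default_rng(5)
n, m, h2_true = 800, 800, 0.5
p = rng.uniform(0.1, 0.5, size=m)
x = rng.binomial(2, p, size=(n, m)).astype(float)
w = (x - 2 * p) / np.sqrt(2 * p * (1 - p))
G = w @ w.T / m

u = rng.normal(0.0, np.sqrt(h2_true / m), size=m)
y = w @ u + rng.normal(0.0, np.sqrt(1 - h2_true), size=n)

h2_he = he_regression(y, G)
print(h2_he)
```

Because y is standardised, the slope is on the heritability scale. The estimator needs no likelihood and is cheap to compute, at the cost of a larger sampling variance than a likelihood-based fit of the same model.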

Key concepts

• Dense SNP panels allow the estimation of the expected genetic covariance between distant relatives

• A model based upon estimated relationships from SNPs is equivalent to a model fitting all SNPs simultaneously

• The total genetic variance due to LD between common SNPs and (unknown) causal variants can be estimated

• Genetic variance captured by common SNPs can be partitioned across the genome

• Different methods to estimate relatedness from SNPs assume different genetic trait architectures