Post on 15-Apr-2017
transcript
Heritability and detection of expression QTL (eQTL)
by performing Genome-Wide Association studies
Abhilash Kannan, MSc Analytical Genomics, University of Birmingham
(c) Abhilash Kannan - to be used only for educational purposes
Overview
• What is eQTL?
• Performing Genome-Wide Association (GWAS) studies in eQTL analysis
• Important things to be addressed before performing GWAS
• Issue of missing heritability
• Aim of the study
• Approaches taken
• Some results obtained so far
eQTL
An eQTL is a particular site or position in the genome where variation in the nucleotide sequence between two genotypes leads to a significant difference in gene expression levels between the individuals with these genotypes.
eQTL analysis aims to relate variation in the DNA sequence to individual differences in gene expression.
Measured by microarray
Genotyped markers covering the entire genome
Types of eQTL
Classified by their physical distance from the regulated gene.
Cis-linkage: the eQTL is located near the target gene itself (the distance cut-off depends on the gene size).
Trans-linkage: the eQTL is located far from the target gene, for example on a different chromosome.
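As a minimal sketch of the distinction above, the Python function below labels a SNP-gene pair as cis or trans from physical distance. The function name, the use of the transcription start site as the reference point, and the 1 Mb window are assumptions made for illustration; the slides only say that the cut-off depends on the gene size.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_tss, cis_window=1_000_000):
    """Label a SNP-gene pair as 'cis' or 'trans'.

    cis_window is a hypothetical cut-off in base pairs (an assumption,
    not a value taken from the slides).
    """
    # A SNP on another chromosome can only act in trans.
    if snp_chrom != gene_chrom:
        return "trans"
    # Same chromosome: cis if the SNP lies within the window around the TSS.
    if abs(snp_pos - gene_tss) <= cis_window:
        return "cis"
    return "trans"


# Toy positions, for illustration only.
print(classify_eqtl("20", 1_500_000, "20", 1_200_000))
print(classify_eqtl("9", 1_500_000, "20", 1_200_000))
```

In practice the window would be chosen per study; a wider window labels more associations as cis.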
Things to be addressed: Problem of Population Structure
Population structure is a systematic difference in allele frequency between subpopulations in a study, due to ancestry differences between study subjects.
In the case of unrelated individuals, population stratification inflates the test statistics (type 1 error). If unnoticed, population structure can result in false association signals.
Most notable methods for correcting the structure:
• Principal Components Analysis
• Genomic Control
Missing heritability
Only a small proportion of the genetic variation in any one expression trait is explained by the detected variants.
A large proportion is attributed to gene*gene or gene*environment interactions.
[Figure: PCA plot and MDS plot of the samples, with clusters labelled CEU, YRI and CHB+JPT; EV = 12.269 and EV = 5.81279]
Main objective of the study
• To detect cis- and trans-acting expression QTLs (eQTLs) in a mixture of HapMap populations
• To estimate the heritability of the gene expression traits
• To explore the relationship between the heritability of a gene's expression level and the power to identify cis- and trans-acting eQTLs
Samples used for the study:
210 unrelated HapMap individuals:
• 60 European founders (CEU)
• 45 unrelated Chinese (CHB)
• 45 unrelated Japanese (JPT)
• 60 Africans (Nigeria, YRI)
Normalized gene expression values from all 210 individuals (Illumina BeadArray).
More than 2 million markers covering the entire genome (obtained from HapMap Phase 2 & 3).
Approach taken
1. Selecting the genes to study: BLAT search, unique RefSeq number. Avoiding genes with alternate splice forms and multiple transcription sites.
2. Investigation and correction of the population structure (PLINK and Golden Helix): using the PCA approach.
3. Test for associations between the gene expression traits and SNP marker genotypes (R (GenABEL), PLINK and Golden Helix): perform linear regression/correlation.
4. Calculating the measure of heritability (genetic relationship matrix, associated with the gene expression phenotype; program: GCTA).
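The association step can be illustrated with a minimal sketch in plain Python, assuming genotypes are coded as allele counts (0, 1, 2) and expression values are already normalized. The function name and toy data are hypothetical, and covariates such as principal components are left out for brevity; this is not the GenABEL, PLINK or Golden Helix implementation.

```python
import math


def snp_association(genotypes, expression):
    """Simple linear regression of expression on allele count for one SNP.

    Returns (slope, r_squared, t_statistic). Hypothetical helper for
    illustration; no covariates (e.g. principal components) are included.
    """
    n = len(genotypes)
    if n != len(expression) or n < 3:
        raise ValueError("need matching inputs with at least 3 individuals")
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((e - mean_e) ** 2 for e in expression)
    sxy = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("genotype or expression has no variation")
    slope = sxy / sxx                      # expression change per allele copy
    r_squared = sxy * sxy / (sxx * syy)    # proportion of variance explained
    rss = max(syy - slope * sxy, 0.0)      # residual sum of squares
    if rss == 0.0:
        return slope, r_squared, math.inf  # perfect fit
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, r_squared, slope / std_err  # t statistic on n - 2 df


# Toy data, for illustration only.
slope, r2, t_stat = snp_association([0, 1, 2, 0, 1, 2], [1.0, 2.1, 2.9, 1.1, 1.9, 3.0])
print(slope, r2, t_stat)
```

A genome-wide scan repeats this test for every SNP-gene pair, which is why a strict significance threshold is needed afterwards.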
Some promising results
[Figure: association plot for POMZP3, cis-regulation]
FDR p-value = 1.153e-057
EV1 = 12.269, EV2 = 5.8129
45 significantly associated markers
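The slides report FDR-adjusted p-values without naming the procedure. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up adjustment, which is an assumption about the method used; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumed procedure: the slides do not say which FDR method was used.
    """
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy p-values, for illustration only.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```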
[Figure: association plot for PEX6, cis-regulation]
55 significantly associated markers
EV1 = 12.269, EV2 = 5.8129
FDR p-value = 3.38e-005
Q-Q plot for gene ‘hmm28636’ after population stratification correction
The majority of the p-values are in agreement with random expectation
λ = 1
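For readers unfamiliar with the λ reported on this slide, the sketch below shows one common way to compute the genomic inflation factor and the points of a Q-Q plot from a list of p-values. It assumes 1-degree-of-freedom tests and p-values strictly greater than 0; the function names are hypothetical and the slides do not state how λ was obtained.

```python
import math
from statistics import NormalDist, median


def genomic_inflation(pvalues):
    """Genomic inflation factor: median observed chi-square (1 df)
    divided by the median expected under the null.

    Assumes every p-value satisfies 0 < p <= 1.
    """
    nd = NormalDist()
    # For a 1-df test, chi-square = z**2 with z the two-sided normal quantile.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in pvalues]
    expected_median = nd.inv_cdf(0.75) ** 2  # median of chi-square, 1 df
    return median(chi2) / expected_median


def qq_points(pvalues):
    """Pairs of (expected, observed) -log10 p-values, sorted ascending."""
    n = len(pvalues)
    observed = sorted(-math.log10(p) for p in pvalues)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


# Evenly spaced p-values mimic the null, so lambda should be close to 1.
null_like = [(i - 0.5) / 1001 for i in range(1, 1002)]
print(genomic_inflation(null_like))
```

A value of λ well above 1 would suggest residual stratification; points lying on the diagonal of the Q-Q plot correspond to λ close to 1.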
Localization of cis-eQTL signals relative to the transcription start site (TSS)
[Figure: association results with Phase I and Phase II genotypes; panel labels: "7 SNPs above the threshold", "No Significant Association"]
Multipopulation analysis vs. single population analysis
[Figure: association plots; four panels labelled "No Significant Association" and one panel labelled "Many SNPs significantly associated with the gene"]
Comparison of adjusted R² values
[Figure panels, adjusted R² ranges: 0.30 to 1; 0.35 to 1; 0.43 to 1; 0.28 to 1; 0.12 to 0.73; 0.12 to 0.45]
Cis effects in the multipopulation analysis are much smaller
Example of trans-eQTL
• Gene ‘hmm26702’ is located on chromosome 20
• It shows significant associations with SNPs present on chromosome 9
Trans-associations vs. cis-associations
• Trans-associations were much weaker compared to cis-associations
• The majority of the cis-associations (60%) had a -log P score greater than 10
Heritability of gene expression traits
• Most of the genes have heritability between 0.15 and 0.60
• 507 genes (83%) had heritability higher than 0.2
• Very few genes had heritability greater than 0.5
• 112 genes (18%) had heritability higher than 0.5
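Heritability estimates of this kind are derived from a genetic relationship matrix (GRM), as noted in the approach. The sketch below computes a simplified GRM from allele counts using the standardized-genotype formula; it is an illustration under stated assumptions, not GCTA's code. In particular, the same formula is used here for the diagonal, whereas GCTA treats the diagonal separately, and the function name and toy genotypes are hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Simplified GRM from allele counts.

    genotypes: list of SNPs, each a list of allele counts (0, 1, 2),
    one entry per individual. Monomorphic SNPs are skipped.
    Illustrative only: the diagonal is not computed the way GCTA does it.
    """
    n_ind = len(genotypes[0])
    grm = [[0.0] * n_ind for _ in range(n_ind)]
    n_used = 0
    for snp in genotypes:
        p = sum(snp) / (2 * n_ind)          # allele frequency
        if p <= 0.0 or p >= 1.0:
            continue                         # no variation, skip this SNP
        denom = 2 * p * (1 - p)              # expected genotype variance
        centered = [x - 2 * p for x in snp]
        for j in range(n_ind):
            for k in range(j, n_ind):
                grm[j][k] += centered[j] * centered[k] / denom
        n_used += 1
    if n_used == 0:
        raise ValueError("no polymorphic SNPs")
    # Average over SNPs and mirror the upper triangle.
    for j in range(n_ind):
        for k in range(j, n_ind):
            grm[j][k] /= n_used
            grm[k][j] = grm[j][k]
    return grm


# Toy genotypes: 2 SNPs x 4 individuals, for illustration only.
toy = [[0, 1, 2, 1], [2, 1, 0, 1]]
print(genetic_relationship_matrix(toy))
```

Estimating heritability from such a matrix additionally requires fitting a mixed model to the expression phenotype, which is the part a program such as GCTA handles.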
Can we rely on heritability to detect eQTLs?
• Heritability measures show a reasonable correlation (r = 0.2) with the cis-association significance
• But genes with very low heritability estimates (heritability < 0.1) also show significant cis-associations
• The maximum -log10 P values of these associated SNPs exceed the -log10 P values of some of the SNPs showing strong cis-associations with genes having high heritability (heritability > 0.5)
Heritability estimates of cis-associated genes vs. heritability estimates of trans-associated genes
• No significant difference between them with respect to their heritability estimates
• p-value = 0.2792
eQTL heritability
• The mean heritability of the SNP most strongly associated with each gene expression trait (the peak SNP) was 0.16
• The mean heritability of the overall gene expression trait was 0.36
• About 44% of the heritability in gene expression can be explained by the peak SNP
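The 44% figure is consistent with the ratio of the two means above. Writing $h^2_{\text{peak}}$ for the mean heritability attributed to the peak SNP and $h^2_{\text{trait}}$ for the mean heritability of the expression trait (symbols introduced here for illustration only):

```latex
\frac{h^2_{\text{peak}}}{h^2_{\text{trait}}} = \frac{0.16}{0.36} \approx 0.44
```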
The proportion of variation explained by the cis eQTLs (cis eQTL heritability) is significantly higher than the variation explained by trans eQTLs (trans eQTL heritability) (p-value = 1.190e-10).
Only 2 trans-eQTLs have heritability > 0.3
Summary of results
• The four different populations used in this analysis gave rise to significant stratification
• A large number of cis-regulatory variants were obtained from the analysis
• Weaker regulatory effects shared across the four population groups could be identified from the multipopulation analysis
• Association signals in some of the genes could easily pass the stringent significance threshold when the Phase II genotypes were used
• The majority of the cis-associated SNPs were located near the TSS
• Compared to cis effects, very few genes had significant trans associations with the SNPs
• Trans effects were not as strong as cis effects
• The median heritability of the gene expression traits having eQTLs was 34.5%, which is consistent with previous studies
• Genes with heritability estimates below 0.2 were significantly associated with the SNPs and had eQTLs
• Cis-eQTLs have higher estimates of heritability compared to the trans-eQTLs
Notable observations
• 579 genes had significant cis associations and 32 genes had significant trans associations
• The sensitivity of the analysis was greatly improved by using the pooled sample of four population groups (multipopulation)
• Because of the small sample size in the single population analysis, the genetic effects captured by the association testing were large
• Slow decay of linkage disequilibrium with the causal variants in Phase II HapMap
• Most of the variants with cis effects are present in genic or immediate intergenic regions of the human genome
• In spite of the weak trans effects, several distant associations between the SNPs and gene expression phenotypes were observed
• SNPs explained from as low as 0% to as high as 86% of the variation among the phenotypes
• The heritability estimate of a gene expression trait does not necessarily determine the presence or absence of eQTLs for a particular gene
Challenges
• The analysis was mainly focused on a single cell type
• Limited power of the analysis
• Trans regulations are considered to be more indirect (a gene regulated by a large number of SNPs, which fail to pass the significance threshold)
• Use of a small sample size: less power to detect many trans association signals
• A human cell is a minute part of the whole organism. Therefore the majority of the trans effects mediated by intercellular events can be difficult to detect
• Use of a strict significance threshold to avoid false positives
• Causal variants could still have a very low minor allele frequency (MAF) and poor LD with the assayed SNPs, hence low heritability
• Many causal variants for a particular gene: only a small percentage of variance is explained by the majority of the causal variants
• Heritability estimates depend on the state of the cell/tissue
• Loci having small effects are more difficult to capture than those with large effects
Future directions
• In some diseases, like cancer, a single gene is not the causative factor
• Functional SNPs affecting a single gene/allele may have little effect on their own
• High-throughput sequencing technologies: large datasets, clearly defined SNPs and LD structure
• Detailed mapping and annotations, large collection of samples
• Integration of resequenced data within different ethnic groups
• Inclusion of variants with MAF < 0.001
• Establish the genetic network
• Prioritize the Quantitative Trait Genes (QTGs)
• Can be applied to data from methylation, miRNA and other experiments to completely characterize the molecular basis of complex traits