Geuvadis RNAseq analysis at UNIGE
Analysis plans
Tuuli Lappalainen
University of Geneva
Geuvadis Analysis Group Meeting, April 16 2012, Geneva
What we will do: Overview
Coordinate everything
Get the data together: QC, normalization, data sharing
Regulatory quantitative trait loci (rQTL): common and rare cis-regulatory variants
Participate in Loss-of-Function analyses
Functional annotation of both common and rare regulatory variants
Population and evolutionary genetic analyses
Genetic effects on regulatory variation
common/rare cis-variants
independent effects
trans-eQTLs
splicing QTLs
miRNA/mRNA interactions
Fine-mapping the causal regulatory variants
eQTL analysis
ASE analysis
splicing QTL analyses
Finding many needles and little hay
Technical variation reduces our power in eQTL analysis: correct for covariates such as library size, sequencing batch, GC content, % mapped reads…
Linear regression on the covariates
Linear regression on ~10 PCs, which are expected to summarize the technical covariates
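A minimal sketch of regressing covariates out of one gene's expression values, assuming the covariates (measured ones such as library size, or PC scores) are already available as one list per covariate. The function names and the toy numbers are illustrative, not part of the Geuvadis pipeline. The residuals are what would go into the eQTL test.

```python
def center(v):
    """Subtract the mean, which accounts for the regression intercept."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def regress_out(y, covariates, tol=1e-12):
    """Residuals of y after ordinary least squares on the covariates.

    The centered covariates are orthogonalized (Gram-Schmidt) and each
    orthogonal component is projected out of the centered y in turn.
    """
    basis = []
    for cov in covariates:
        c = center(cov)
        for b in basis:
            coef = dot(c, b) / dot(b, b)
            c = [ci - coef * bi for ci, bi in zip(c, b)]
        if dot(c, c) > tol:  # skip covariates that add no new direction
            basis.append(c)
    r = center(y)
    for b in basis:
        coef = dot(r, b) / dot(b, b)
        r = [ri - coef * bi for ri, bi in zip(r, b)]
    return r

# Hypothetical example: six samples, two technical covariates.
library_size = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
batch = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
expression = [3.5, 2.0, 4.0, 7.5, 9.5, 6.0]
corrected = regress_out(expression, [library_size, batch])
```

The same function works for PCs: pass the per-sample scores of each PC as one covariate list.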
Population stratification may lead to false genetic associations
analyze EUR & YRI separately and correct for population structure within EUR with Eigenstrat
Reference allele mapping bias
[Figure: reference genome; SNP, INDEL, cSNP]
ALT reads map worse or not at all
Simulation results of biased reads & sites -> remove from ASE
Test: filter biased reads from SAMs, redo quantifications & eQTL analysis
eQTLs: genotype association to regulatory phenotypes
The classical cis-eQTL analysis:
all genetic variants with MAF > 5%
within 1 Mb of the transcription start site
Spearman rank correlation with (normalized) exon read counts
permutations to assess significance
Expect a few thousand genes with an eQTL
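The classical test above can be sketched for a single variant-gene pair as follows. This is an illustration under stated assumptions, not the group's actual code: genotypes are 0/1/2 dosages, positions are plain base-pair coordinates, and the names `cis_eqtl`, `n_perm` and `seed` are made up for the example. Significance comes from permuting expression values across individuals.

```python
import random

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def maf(genotypes):
    """Minor allele frequency from 0/1/2 dosages."""
    f = sum(genotypes) / (2 * len(genotypes))
    return min(f, 1 - f)

def cis_eqtl(genotypes, expression, var_pos, tss,
             n_perm=1000, min_maf=0.05, window=1_000_000, seed=0):
    """Return (rho, permutation p-value), or None if the variant is filtered."""
    if maf(genotypes) <= min_maf or abs(var_pos - tss) > window:
        return None
    rho = spearman(genotypes, expression)
    rng = random.Random(seed)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-expression link
        if abs(spearman(genotypes, shuffled)) >= abs(rho):
            hits += 1
    return rho, (hits + 1) / (n_perm + 1)

# Hypothetical data: 12 individuals, expression rising with dosage.
geno = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
expr = [1.0, 1.2, 0.9, 1.1, 2.0, 2.1, 1.9, 2.2, 3.0, 3.1, 2.9, 3.2]
result = cis_eqtl(geno, expr, var_pos=150_000, tss=100_000)
```

A genome-wide run would repeat this over every variant in each gene's window and would need a multiple-testing scheme on top, which this sketch leaves out.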
Taking the eQTL approach further
Other phenotypes:
Gene expression levels: exon read counts or transcript quantifications?
splicing variation: links between exons (Halit Ongen @ UNIGE), Barcelona’s transcript ratios
miRNA quantifications
Variation QTLs: variation between independent measures of an individual’s gene expression levels = stochastic variation in gene expression
[Figure: genotype vs. expression variance]
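One way to picture a variation QTL, sketched under the assumption that each individual has several independent expression measurements for the gene: compute the per-individual variance across those measurements and use it as the phenotype. The data and the simple slope used as an association measure are illustrative only; the slides do not specify the test.

```python
from statistics import mean, variance

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx

# Hypothetical data: genotype dosage and three replicate
# expression measurements per individual.
genotypes = [0, 0, 1, 1, 2, 2]
replicates = [
    [10.0, 10.1, 9.9],
    [10.0, 10.1, 9.9],
    [10.0, 10.5, 9.5],
    [10.0, 10.5, 9.5],
    [10.0, 11.0, 9.0],
    [10.0, 11.0, 9.0],
]

# Phenotype: sample variance across an individual's replicates.
expr_variance = [variance(r) for r in replicates]

# Positive slope: variance increases with the number of alternate alleles.
vqtl_slope = slope(genotypes, expr_variance)
```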
common/rare cis-variants
independent effects
trans-eQTLs
splicing QTLs
miRNA/mRNA interactions
Independent regulatory variants affecting the same gene
Regress out the first eQTL effect and redo the analysis
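A sketch of this conditional step, assuming 0/1/2 genotype dosages and using a simple linear fit to remove the first eQTL. The residuals then become the phenotype for the second pass. Pearson correlation stands in for the association test here to keep the example short; all names and numbers are hypothetical.

```python
def mean(v):
    return sum(v) / len(v)

def residualize(expression, genotype):
    """Residuals of expression after a linear fit on one variant."""
    mx, my = mean(genotype), mean(expression)
    sxx = sum((g - mx) ** 2 for g in genotype)
    beta = sum((g - mx) * (e - my) for g, e in zip(genotype, expression)) / sxx
    return [(e - my) - beta * (g - mx) for g, e in zip(genotype, expression)]

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical gene with two independent regulatory variants.
snp1 = [0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2]   # first (strongest) eQTL
snp2 = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]   # candidate second eQTL
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1,
         -0.1, 0.1, -0.05, 0.05, -0.1, 0.1]
expr = [5 + 2 * a + 1 * b + n for a, b, n in zip(snp1, snp2, noise)]

resid = residualize(expr, snp1)          # remove the first eQTL effect
r_first = pearson(resid, snp1)           # ~0: nothing left for snp1
r_second = pearson(resid, snp2)          # still clearly associated
```

A variant in strong LD with the first eQTL would lose its signal after this step, while a truly independent one keeps it.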
How to integrate eQTLs, sQTLs, vQTLs and miQTLs?
ASE analysis
[Figure: cis eQTL (*), coding SNP, mRNA-sequencing]
Statistical testing for ASE
Is the allelic ratio different from 0.5 / 0.5?
Thousands of data points per individual
Less noisy than expression levels
No direct information of the causal variant
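The question "is the allelic ratio different from 0.5 / 0.5?" can be framed, for one heterozygous site, as an exact two-sided binomial test on the reference and alternate read counts. The slides do not name the test used, so this is an assumed, minimal version; the function name and the `p_null` parameter are illustrative. `p_null` could be moved away from 0.5 to account for reference mapping bias.

```python
from math import exp, lgamma, log

def ase_binomial_p(ref_reads, alt_reads, p_null=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one site."""
    n = ref_reads + alt_reads

    def pmf(k):
        # Computed in log space so deep coverage does not overflow.
        log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        return exp(log_choose + k * log(p_null) + (n - k) * log(1 - p_null))

    probs = [pmf(k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum all outcomes at most as likely as the observed one
    # (small tolerance so numerically tied outcomes are included).
    p = sum(q for q in probs if q <= observed * (1 + 1e-7))
    return min(1.0, p)

balanced = ase_binomial_p(5, 5)     # no evidence of ASE
skewed = ase_binomial_p(10, 0)      # all reads from one allele
```

With thousands of sites tested per individual, the resulting p-values would still need multiple-testing correction.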
ASE applications: population genetics of regulatory effects
Clustering of individuals (and populations)
Expression distance
ASE distance
Genetic distance
Epistasis between regulatory and coding variants
Deficiency of putatively deleterious coding variants with high expression of the derived allele
(Lappalainen et al. 2011)
ASE applications: rare regulatory variants
Sharing of a rare ASE effect leads to excess sharing of the haplotype
We have developed a statistical method that looks for ASE-genotype concordance to characterize rare regulatory variants (Montgomery et al., PLoS Genetics 2011)
[Figure: pool of individuals; two with ASE, four with no ASE]
Stephen Montgomery
Functional annotation of regulatory variants
Functional annotation of the genome: 1000g annotations, ENCODE, conservation, etc
-> overlap with rQTLs
Can we finally get the causal variants?