Genotype-Phenotype Association Studies
A G A G T T C T G C T C G
A G G G T T A T G C G C G
A G A G T T C T G C T C G
A G G G T T A T G C G C G
A G A G T T C T G C T C G
A G G G T T A T G C G C G
A G A G T T C T G C T C G
A G G G T T A T G C G C G
Association studies:
Phenotypic effect of SNPs
Database of human genetic diversity
BioBanks: Studies of cohorts at a great scale
•deCODE (Iceland)
•Estonia
•Germany
•Canada
•Japan
•China
•USA
Genetic association -> Guilt by association
Patients vs Controls
              Patients        Controls
SNP1 (G/C)    40% G, 60% C    40% G, 60% C
SNP2 (A/T)    100% A, 0% T    0% A, 100% T
SNP3 (T/G)    80% T, 20% G    60% T, 40% G
SNPn
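The "guilt by association" idea can be sketched in a few lines of Python: for each SNP, compare the frequency of one allele between patients and controls and flag large differences. The frequencies are the ones listed above. The assignment of the first pair of percentages to patients and the second to controls, the function names, and the 0.5 cutoff are assumptions made for illustration only; a real study would use a statistical test rather than a fixed cutoff.

```python
# Frequency of the first-listed allele of each SNP, as (patients, controls).
# Assumption: the first pair of percentages on the slide refers to patients,
# the second pair to controls.
FIRST_ALLELE_FREQ = {
    "SNP1 (G)": (0.40, 0.40),
    "SNP2 (A)": (1.00, 0.00),
    "SNP3 (T)": (0.80, 0.60),
}


def frequency_differences(freqs):
    """Return the absolute patient-vs-control frequency difference per SNP."""
    return {snp: abs(patients - controls)
            for snp, (patients, controls) in freqs.items()}


def flag_candidates(freqs, cutoff=0.5):
    """Hypothetical helper: list SNPs whose difference exceeds an arbitrary cutoff."""
    diffs = frequency_differences(freqs)
    return [snp for snp, diff in diffs.items() if diff > cutoff]


if __name__ == "__main__":
    for snp, diff in frequency_differences(FIRST_ALLELE_FREQ).items():
        print(f"{snp}: difference = {diff:.2f}")
    print("Candidates:", flag_candidates(FIRST_ALLELE_FREQ))
```

With these numbers only SNP2 is flagged: SNP1 shows no difference at all, and SNP3 shows a modest one that would need a formal test.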
Association Studies
Genome-Wide Association Studies, GWAS
•Study design
•Statistical analyses
Association Studies
1st phase: Design
Study designs
Statistical analysis methods
2nd phase: Statistical analysis
Steps
1. Data validation
2. Genetic description
   1. Unidimensional (SNP by SNP)
   2. Multidimensional
3. Test for genotype-phenotype association
   1. SNP by SNP
   2. Multi-SNP / haplotype / tagSNP
   3. Power assessment
4. Predictive model
Step 1. Data validation (error sources: sampling, genotyping)
• Checking with SNPref
• Hardy-Weinberg proportions (separately for controls and cases)
• Consistency among samples
• Stratification (genetic markers)
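The Hardy-Weinberg check in the list above can be sketched as a chi-square goodness-of-fit test with 1 degree of freedom, run separately on each group. This is a minimal sketch using only the standard library; the function name and the example counts are hypothetical, and a production pipeline may prefer an exact test for small samples.

```python
import math


def hardy_weinberg_chi2(n_hom1, n_het, n_hom2):
    """Chi-square goodness-of-fit test (1 df) of genotype counts against
    the Hardy-Weinberg proportions p^2, 2pq, q^2.

    Assumes both alleles are present (otherwise an expected count is zero).
    """
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom1, n_het, n_hom2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, Aa, aa), not taken from the slides.
    chi2, p_value = hardy_weinberg_chi2(50, 40, 10)
    print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A small p-value in controls usually points to genotyping or sampling problems, which is why this check belongs to data validation.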
Genetic description: SNP by SNP
SNP rs1137933

Genotype frequencies
          CT             CC             TT
Control   38 (29.5%)     76 (58.9%)     15 (11.6%)
Case      105 (43.8%)    122 (50.8%)    13 (5.4%)

Allele frequencies
          C              T
Control   190 (73.6%)    68 (26.4%)
Case      349 (72.7%)    131 (27.3%)
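The allele counts for rs1137933 follow directly from the genotype counts: each homozygote contributes two copies of its allele and each heterozygote one copy of each. The short sketch below reproduces the allele figures from the genotype figures; the function and variable names are illustrative.

```python
def allele_counts(n_cc, n_ct, n_tt):
    """Count C and T alleles from genotype counts (two alleles per individual)."""
    return {"C": 2 * n_cc + n_ct, "T": 2 * n_tt + n_ct}


def frequencies(counts):
    """Convert a dict of counts into a dict of relative frequencies."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


# Genotype counts for SNP rs1137933 as given on the slide.
GENOTYPES = {
    "Control": {"CC": 76, "CT": 38, "TT": 15},
    "Case": {"CC": 122, "CT": 105, "TT": 13},
}

if __name__ == "__main__":
    for group, g in GENOTYPES.items():
        counts = allele_counts(g["CC"], g["CT"], g["TT"])
        freqs = frequencies(counts)
        shown = {allele: f"{100 * f:.1f}%" for allele, f in freqs.items()}
        print(group, counts, shown)
```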
Haplotype inference
Haplotype 1 acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
Haplotype 2 acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
Haplotype 3 acgtagcatcgtatgcgttagacgggggggtagcaccagtacag
Haplotype 4 acgtagcatcgtttgcgttagacgggggggtagcaccagtacag
Haplotype 5 acgtagcatcgtttgcgttagacgggggggtagcaccagtacag
Haplotype 6 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Haplotype 7 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Haplotype 8 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Haplotype 9 acgtagcatcgtttgcgttagacggcatggcaccggcagtacag
Genetic description: Multi-SNP
Genotypes: a/t g/c
Possible haplotypes:
a) a g
   t c
b) a c
   t g
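The phase ambiguity shown above (a double heterozygote is compatible with two different haplotype pairs) can be enumerated mechanically. The sketch below lists every haplotype pair compatible with a set of unphased genotypes. It only enumerates the possibilities; it does not estimate which pair is more likely, which is what haplotype inference methods do. The function name is illustrative.

```python
from itertools import product


def possible_haplotype_pairs(genotypes):
    """Enumerate haplotype pairs compatible with unphased genotypes.

    `genotypes` is a list of (allele1, allele2) tuples, one per SNP.
    """
    # A homozygous SNP has one arrangement; a heterozygous SNP has two,
    # depending on which chromosome carries which allele.
    options = [[(a, b)] if a == b else [(a, b), (b, a)] for a, b in genotypes]
    pairs = set()
    for combo in product(*options):
        hap1 = "".join(first for first, _ in combo)
        hap2 = "".join(second for _, second in combo)
        # Sorting makes (hap1, hap2) and (hap2, hap1) the same pair.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


if __name__ == "__main__":
    for pair in possible_haplotype_pairs([("a", "t"), ("g", "c")]):
        print(pair)
```

An individual heterozygous at k SNPs has 2^(k-1) compatible pairs, so the ambiguity grows quickly with the number of SNPs.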
Haplotype frequency estimates
SNP columns, left to right: rs1042522, rs12951053, rs8064946, rs6541003, rs4846049, rs4646421, rs4986885, rs915907, rs4147567, rs2266633

Haplotype   SNP alleles            Total frequency
1           G A G G T C G C G G    0.1056
2           G A G A G C G C G G    0.0767
3           G A G A G C G C A G    0.0485
4           G A G A G C G A G G    0.0423
5           G A C G G T G A A A    0.0378
6           G A C A G T A A A A    0.0282
7           G A G G G C G C A G    0.0276
Test for association (SNP by SNP)
SNP rs1137933
Chi-square independence test

Genotypic
          CT             CC             TT
Control   38 (29.5%)     76 (58.9%)     15 (11.6%)
Case      105 (43.8%)    122 (50.8%)    13 (5.4%)
Chi-square (2 df) = 9.71**, p = 0.00779
G (likelihood ratio) (2 df) = 9.67**, p = 0.00795

Allelic
          C              T
Control   190 (73.6%)    68 (26.4%)
Case      349 (72.7%)    131 (27.3%)
Chi-square (1 df) = 0.07, p = 0.79134
G (likelihood ratio) (1 df) = 0.07, p = 0.79134
Odds ratio (OR) = 1.05
Risk ratio (RR) = 1.02
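The statistics reported for rs1137933 can be recomputed from the count tables. The sketch below, standard library only, computes the Pearson chi-square and the likelihood-ratio G statistic for a contingency table, plus the odds ratio and risk ratio for the 2x2 allele table. The slides do not state how OR and RR were defined; the definitions used here are an assumption, chosen because they reproduce the reported values of 1.05 and 1.02. Function names are illustrative.

```python
import math


def chi2_and_g(table):
    """Pearson chi-square and likelihood-ratio G statistics for an r x c table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    chi2 = g = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
            if obs > 0:
                g += 2.0 * obs * math.log(obs / exp)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, g, df


def chi2_p_value(x, df):
    """Upper-tail chi-square probability, using closed forms for 1 and 2 df."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


def odds_and_risk_ratio(table):
    """OR and RR for [[a, b], [c, d]] with rows control/case and columns C/T.

    Assumed definitions: OR = (a*d)/(b*c); RR = proportion of cases among
    T alleles divided by proportion of cases among C alleles.
    """
    (a, b), (c, d) = table
    odds_ratio = (a * d) / (b * c)
    risk_ratio = (d / (b + d)) / (c / (a + c))
    return odds_ratio, risk_ratio


GENOTYPE_TABLE = [[38, 76, 15], [105, 122, 13]]  # rows: control, case; cols: CT, CC, TT
ALLELE_TABLE = [[190, 68], [349, 131]]           # rows: control, case; cols: C, T

if __name__ == "__main__":
    for name, table in (("Genotypic", GENOTYPE_TABLE), ("Allelic", ALLELE_TABLE)):
        chi2, g, df = chi2_and_g(table)
        print(f"{name}: chi2 = {chi2:.2f}, G = {g:.2f}, df = {df}, "
              f"p(chi2) = {chi2_p_value(chi2, df):.5f}")
    odds_ratio, risk_ratio = odds_and_risk_ratio(ALLELE_TABLE)
    print(f"OR = {odds_ratio:.2f}, RR = {risk_ratio:.2f}")
```

The contrast between the two tables is the point of the slide: the genotype distribution differs between cases and controls, while the allele frequencies are almost identical.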
Links
•http://bioinfo.iconcologia.net/SNPstats (Web tool for association studies)
•http://www.mep.ki.se/genestat/tl/genass_ldmap (Tutorial for association studies)
•http://linkage.rockefeller.edu/soft (Software for genetic analysis)
•http://www.broad.mit.edu/personal/jcbarret/haploview (Haploview)
•http://www.genome.gov/26525384 (Catalog of published GWA studies)
•http://geneticassociationdb.nih.gov (Database of human disease association studies)
•PLINK: whole-genome association analysis toolset. Author: Shaun Purcell. URL: http://pngu.mgh.harvard.edu/purcell/plink/
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ & Sham PC (2007) PLINK: a toolset for whole-genome association and population-based linkage analysis. American Journal of Human Genetics, 81.
Genome-Wide Association Studies, GWAS
500,568 SNPs
25 signals exceed the threshold p = 10^-7
12 had been detected previously
58 signals exceed the threshold p = 10^-5
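Thresholds this stringent are needed because hundreds of thousands of SNPs are tested at once. One common way to arrive at a threshold of this order is a Bonferroni correction, which divides the desired family-wise error rate by the number of tests. The slides do not say that the thresholds above were derived this way, and the value alpha = 0.05 is an assumption; the sketch only shows that the arithmetic lands near 10^-7 for 500,568 SNPs.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n_snps = 500_568  # number of SNPs given on the slide
    alpha = 0.05      # assumed family-wise error rate
    print(f"{bonferroni_threshold(alpha, n_snps):.2e}")
```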
•Validation of the GWA approach: findings comparable to all previous studies based on candidate genes or positional cloning
•Importance of quality control
•Dominance of variants of small effect
•Sample sizes of 2000 cases and 3000 controls are needed
The hidden heritability of the genome
1. LIMITATIONS OF GWA (Rare variants)
2. ‘OUT OF SIGHT’ (Low penetrance)
3. GENOME ARCHITECTURE (Structural variation)
4. NETWORK AMONG GENES (Epistasis)
5. HERITABILITY ESTIMATES IN DOUBT (Epigenetics)
6. LOST IN DIAGNOSIS (Different diseases)
Brendan Maher
5 November 2008 | Nature 456, 18-21 (2008)
Hints of hidden heritability in GWAS. Greg Gibson. Nature Genetics 2010. 42:558-560
http://www.nature.com/ng/journal/v42/n7/full/ng0710-558.html
http://www.genome.gov/26525384
Published Genome-Wide Associations through 12/2012
Published GWA at p ≤ 5×10^-8 for 17 trait categories