Some current issues in QTL identification
Lon Cardon
Wellcome Trust Centre for Human Genetics
University of Oxford
Acknowledgements: Goncalo Abecasis, Stacey Cherny, Twin course faculty
Positional Cloning
[Figure: Sib pairs (LOD), Chromosome Region, Association Study]
Genetics
Genomics
Physical Mapping/Sequencing
Candidate Gene Selection/Polymorphism Detection
Mutation Characterization/Functional Annotation
Inflammatory Bowel Disease Genome Screen
Hampe et al., Am J Hum Genet, 64:808-816, 1999
Susceptibility locus mapped for Crohn’s Disease
Genome Screens for Linkage in Sib-pairs
1997/98
- Diabetes (IDDM + NIDDM)
- Asthma/atopy
- Osteoporosis
- Obesity
- Multiple Sclerosis
- Rheumatoid arthritis
- Systemic lupus erythematosus
- Ankylosing spondylitis
- Epilepsy
- Inflammatory Bowel Disease
- Celiac Disease
- Psychiatric Disorders (incl. Scz, bipolar)
- Behavioral traits (incl. Personality, panic)
- others missed...
1999
- NIDDM
- Asthma/atopy
- Psoriasis
- Inflammatory Bowel Disease
- Osteoporosis/Bone Mineral Density
- Obesity
- Epilepsy
- Thyroid disease
- Pre-eclampsia
- Blood pressure
- Psychiatric disorders (incl. Scz, bipolar)
- Behavioral traits (incl. smoking, alcoholism, autism)
- Familial combined hyperlipidemia
- Tourette syndrome
- Systemic lupus erythematosus
- others missed…
Human QTL Linkage Gene Identification Successes
0
Well, at least < 5
Why so few successes in human QTL mapping?
Many valid reasons proposed:
• Phenotypic complexity (not measured well)
• Genetic complexity (many genes of small effect, GxE, epistasis)
• Genotype error
• Sampling design
• Statistical methods
• ….
Most linkage studies have been under-powered (and over-hyped)
QTL mapping has very low power!
1000 sibs, no parents; markers every 10 cM, each marker H = 0.8
QTL h2 = 0.33
Kruglyak L, Lander ES. (1995). Am J Hum Genet 57: 439-454
Increasing power to detect linkage in sib-pairs
• Phenotypic selection
– Carey & Williamson, 1991, AJHG
– Eaves & Meyer, 1994, Behav Genet
– Cardon & Fulker, 1994, AJHG
– Risch & Zhang, 1996, AJHG
Equivalent full sample N for 200 selected pairs from 10,000 (QTL allele freq = 0.2)
             Concordant   Discordant   Combined
Additive     1400         3300         5000
Recessive    6000         3100         9500
Dominant     1400         3100         4400
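The selection idea behind the table above can be illustrated with a minimal sketch. It assumes one simple rule, picking pairs in which both sibs fall in the same extreme decile (concordant) or in opposite extreme deciles (discordant); this is not necessarily the scheme used in the cited papers, and the names `decile_rank` and `select_extreme_pairs` are hypothetical.

```python
from bisect import bisect_left


def decile_rank(value, sorted_scores):
    """Return the decile (1..10) of value within the sorted cohort scores."""
    rank = bisect_left(sorted_scores, value)
    return min(10, rank * 10 // len(sorted_scores) + 1)


def select_extreme_pairs(pairs, low=1, high=10):
    """Split sib pairs into extreme concordant and extreme discordant sets.

    pairs: list of (sib1_score, sib2_score) tuples from the full cohort.
    Assumed rule: concordant = both sibs in the same extreme decile,
    discordant = one sib in the lowest and one in the highest decile.
    """
    cohort = sorted(score for pair in pairs for score in pair)
    selected = {"concordant": [], "discordant": []}
    for sib1, sib2 in pairs:
        d1 = decile_rank(sib1, cohort)
        d2 = decile_rank(sib2, cohort)
        if d1 == d2 and d1 in (low, high):
            selected["concordant"].append((sib1, sib2))
        elif {d1, d2} == {low, high}:
            selected["discordant"].append((sib1, sib2))
    return selected
```

In practice the cut-offs would be tuned to the cohort size and the number of pairs that can be genotyped; the decile thresholds here are only for illustration.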
Information Score for Additive Gene Action (p=0.5)
[Figure: information score (100 to 350) by decile ranking of Sib 1 and Sib 2]
Linkage Analysis of QTLs - Summary -
• Spotted history. Few, if any, bona fide successes
• Power has been a large problem
• Of the few replicated loci, most have used some form of selection
• EDAC, other selection schemes from large cohorts now underway
• Genome scans coming soon
Promising beginning for QTL linkage mapping
Association Analysis
• Simple genetic basis
– Short unit of resemblance
– Population-specific
• One of the easiest genetic study designs
– Correlate allele frequencies with traits/diseases
– At core of monogenic & oligo/polygenic trait models
• Widely used in past 20 years
– HLA, candidate genes, pharmacogenetics, positional cloning
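As a small illustration of correlating allele frequencies with disease status, the sketch below runs a standard Pearson chi-square test (1 df) on a 2x2 table of allele counts in cases and controls. The function names and the example counts are made up for illustration; a real analysis would also need to consider stratification and multiple testing.

```python
import math


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) for allele counts in cases vs controls."""
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column of the table needs counts")
    cross = case_a * ctrl_b - case_b * ctrl_a
    return n * cross ** 2 / (row_case * row_ctrl * col_a * col_b)


def chi_square_p_value_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts: allele A seen 60 times in cases and 40 times in controls.
chi2 = allelic_chi_square(60, 40, 40, 60)
p_value = chi_square_p_value_1df(chi2)
```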
Angiotensin-I Converting Enzyme
Keavney et al. (1998) Hum Mol Genet, 7:1745-1751
Evidence for Linkage
[Figure: LOD (0 to 10) at ACE polymorphisms T-5991C, A-5466C, T-3892C, A-240T, T-93C, T1237C, G2215A, I/D, G2350A, 4656(CT)3/2]
Results of ACE analysis using VC association model
[Figure: LOD (0 to 15) for linkage and for association at ACE polymorphisms T-5991C, A-5466C, T-3892C, A-240T, T-93C, T1237C, G2215A, I/D, G2350A, 4656(CT)3/2]
Alzheimer's and ApoE4
Roses, Nature 2000
Association Resolution by Position
Roses, Nature 2000
Decay of Linkage Disequilibrium in a Small Set of Genes
Toward a linkage disequilibrium map of the human genome
History of LD studies in humans:
• > 10 years ago: emphasis mainly on theory - LD measures, decay, population comparisons, …
• 1989: first use of LD for disease mapping: cystic fibrosis
• Recent years: gene-based haplotypes used widely for monogenic mapping
• Last 2 years: larger-scale assessment of common alleles in reference populations
LD/haplotype map objective: find regions of high and low ancestral conservation to clarify signal/noise in allelic association studies
Haplotype Map: Data/Interpretations
Distribution of pairwise LD ‘average extent of LD’
LD differences in genes
Eaves et al, Nat Genet 2000
Taillon-Miller et al, Nat Genet 2000
Stephens et al, Science 2001
Reich et al, Nature 2001
Johnson et al, Nat Genet 2001
Abecasis et al, AJHG 2001
Haplotype Map: Data/Interpretations
Local patterns of LD … Conserved haplotype segments ... ‘Blocks’
5q31. Daly et al, Nat Genet 2001
MHC class II. Jeffreys et al, Nat Genet 2001
Chr21. Patil et al, Science 2001
Current Status: Data/Interpretations
• How to define ‘useful’ LD is still unclear
• Easier to focus on pairwise LD rather than haplotypes. Is this efficient?
• For common alleles, D’ measure, LD extends ~ 50-60 kb on average
– For rare alleles, ?
• There is great variability in regional patterns of LD
– Explanations, predictors yet unknown
• Haplotype blocks are detectable and present broadly
• Size of blocks? How best to define them? Utility of htSNPs?
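For reference, the pairwise measures discussed here can be computed from one haplotype frequency and the two allele frequencies. The sketch below uses the standard definitions of D, D' and r2; the function and argument names are illustrative only.

```python
def pairwise_ld(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Unequal allele frequencies: D' is complete while r2 is much lower.
d, d_prime, r2 = pairwise_ld(p_ab=0.2, p_a=0.5, p_b=0.2)
```

The example shows why the apparent extent of LD depends on the measure: with allele frequencies of 0.5 and 0.2, D' reaches 1 while r2 is only 0.25.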
Human Genome Haplotype Map
1. NIH/TSC/Wellcome Trust funded international collaboration (likely)
- follow-on from human sequencing project & SNP consortium
2. Hierarchical strategy
- ‘sparse-map’ then more fine
- Initially use available SNPs
3. Multiple populations
- some family-based, most likely to be unrelateds
4. Aim is to catalog regions of high LD down to very fine scale (i.e., find big and small blocks)
Human Chromosome 22
• First human chromosome to be “fully” sequenced
• Extensive knowledge of genomic landscape
• Abundance of SNPs and other variants/bp
~34.5 Mb on q-arm; p-arm mostly structural RNA; 679 genes on q
Dunham et al, Nature, 1999
Samples
• 7 x 3 generation CEPH families
– 77 Individuals
– 59 founder chromosomes
– 1505 SNPs successfully genotyped
• 90 Unrelated Caucasian Individuals
– 1286 SNPs genotyped (1261 overlapping with CEPHs)
• 51 Unrelated Estonian Individuals
– 908 SNPs genotyped (594 overlapping with CEPHs)
N = 1505 markers. Median spacing = 15.07kb. 4 gaps > 200 kb. Smallest = 12 bp; largest = 293 kb.
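Summary figures of this kind (median spacing, smallest and largest interval, number of large gaps) can be derived from a list of marker positions. The sketch below is one way to do it; `spacing_summary` and the toy positions are hypothetical and are not the chromosome 22 data.

```python
from statistics import median


def spacing_summary(positions_bp, gap_threshold_bp=200_000):
    """Summarise distances between adjacent markers (positions in base pairs)."""
    ordered = sorted(positions_bp)
    if len(ordered) < 2:
        raise ValueError("need at least two markers")
    spacings = [right - left for left, right in zip(ordered, ordered[1:])]
    return {
        "n_markers": len(ordered),
        "median_bp": median(spacings),
        "smallest_bp": min(spacings),
        "largest_bp": max(spacings),
        "n_large_gaps": sum(s > gap_threshold_bp for s in spacings),
    }


# Toy example with five markers, one of them after a 300 kb gap.
summary = spacing_summary([0, 12, 15_012, 315_012, 330_012])
```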
Marker spacing
[Figure: histogram of marker count (0 to 600) by spacing bin, from < 5 kb to > 150 kb; N = 1505]
Allele frequencies on Chromosome 22: CEPH founders
[Figure: frequency (0 to 0.25) by allele frequency category (< 0.10, .11-.20, .21-.30, .31-.40, < 0.50)]
Variability in Pairwise LD
[Figure: pairwise D’ and r2 (0 to 1) against physical distance (0 to 1000 kb)]
Decay of LD on chromosome 22
Means in CEPHs, Unrelateds, Combined & Estonian Samples
Representing LD along a chromosome
Following several trends in genetics, genotyping technology outpaced ability to analyze LD information…
How to characterize regions of ‘interesting’ linkage disequilibrium?
1. Simply examine average levels across region/chromosome?
2. Fit models to data, look at expectations & specific predictions
3. Consider ‘interesting’ LD tracts as long runs of LD – borrow from extant statistical approaches
4. Look for ‘blocks’ of LD in the genome
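Option 2 above (fitting a model to the data) can be sketched with a deliberately simple assumption: that D' decays exponentially with distance, D'(d) = exp(-k d), so that the half-life is ln(2)/k. This model and the function name `ld_half_life` are assumptions made for illustration; the model actually fitted in the analysis may differ.

```python
import math


def ld_half_life(distances_kb, d_primes):
    """Estimate the LD half-life (kb) under an assumed exponential decay.

    Fits ln(D') = -k * distance by least squares through the origin,
    then returns ln(2) / k.
    """
    numerator = 0.0
    denominator = 0.0
    for distance, d_prime in zip(distances_kb, d_primes):
        if distance <= 0 or not 0 < d_prime <= 1:
            raise ValueError("need distance > 0 and 0 < D' <= 1")
        numerator += distance * math.log(d_prime)
        denominator += distance * distance
    if denominator == 0:
        raise ValueError("no data points supplied")
    k = -numerator / denominator
    if k <= 0:
        raise ValueError("no decay detected in the data")
    return math.log(2) / k


# Synthetic data generated with a 50 kb half-life.
distances = [25.0, 50.0, 100.0, 200.0]
values = [0.5 ** (d / 50.0) for d in distances]
half_life_kb = ld_half_life(distances, values)
```

Fitting this in sliding windows along a chromosome would give a per-region half-life, one way to compare regions of high and low LD.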
LD Along Chromosome 22
[Figure: average D’ (0 to 1) and predicted D’ half-life (0 to 600 kb) against position (0 to 30 Mb)]
Disequilibrium Fingerprint
Plus 3 individual blocks:
Position      SNPs   Haplos   Length
4.6-4.8 M     11     6        231 kb
8.2-8.4 M     8      4        264 kb
34.3 M        11     3        82 kb
Chromosome 22 Haplotype Blocks
Chr22 High LD: 22-27 Mb
Chr22 Low LD: 27-32 Mb
Recombination Pattern on Chromosome 22
[Figure: microsatellite distance (0 to 60 cM) against sequence position (0 to 35 Mb), with series labelled 1 Mb/cM]
Recombination and Gene Density on Chromosome 22
[Figure: same axes, with gene density added]
Linkage Disequilibrium Map of Chromosome 22 - Summary -
• LD ‘half-length’ ~ 50 kb, but depends on measure & what is “useful” LD
• Family & unrelated samples yield consistent patterns
• Different analytical tools provide complementary views of long blocks
• 15% of chromosome 22 is in long LD blocks in these samples (40% in shorter blocks)
– Why? Selection, selective sweeps? Chromosome structure? Population age?
• LD correlated with gene density, GC content and related repeats.
– Gene/GC correlations almost entirely collinear with genetic distance.
LD patterns can immediately assist positional association studies:
– Prioritise candidate regions.
– Use extant genetic maps and simple repeat structures in design & power.
Mapping QTLs in families: Summary
• Linkage and association studies follow directly from fundamental biometrical principles.
• Linkage studies of complex traits can work: All principles of this course apply
- power, study design, careful phenotype selection/modelling, comparison of statistical models
• New information about LD patterns should facilitate association studies
- help form a priori hypotheses and guide replication.
16th Annual Course on Methodology for Twins and Families
Advanced workshop: Boulder, Colorado, March 2003
Monday, 5 March 2001
Eaves 9:00-10:30 Introduction: Cause of human variation
Amos & Heath 11:00-12:00 Basic Statistics: Likelihood models
Lessem 12:00-12:30 Introduction: Computer System P
Eaves & Sham 13:30-15:00 Genetic Theory
Neale, Martin & Boomsma 15:30-17:00 MX practical P
Tuesday, 6 March 2001
Sham 9:00-10:30 Linkage: Basic Principles
Abecasis, Cherny & Cardon 11:00-12:30 IBD estimation: Theory and Practice P
Martin & Maes 13:30-15:00 QTL Linkage Analysis in Sibships P
Eaves 15:30-17:00 Introduction to Bayesian Methods P
Wednesday, 7 March 2001
Neale & Heath 9:00-10:30 Linkage on Selected Samples P
Purcell & Sham 11:00-12:30 Power Calculation in Linkage Analysis P
Boomsma & van Baal 13:30-15:00 Multivariate Applications P
Purcell & Sham 15:30-17:00 Epistasis/Multi-locus modelling P
Thursday, 8 March 2001
Rice & Heath 9:00-10:30 Association Study Principles P
Cherny & Abecasis 11:00-12:30 Family Based Association Studies P
van den Oord 13:30-15:00 Population Stratification and General Association P
Sham & Abecasis 15:30-17:00 Power for Association Analysis P
Friday, 9 March 2001
Cardon & Sham 9:00-10:30 Bioinformatics and Genome Patterns of Disequilibrium
Rice 11:00-12:30 Multiple Testing: Power and Type I Error
Flint 13:30-15:00 Animal models of complex traits
Cherny, Purcell & Abecasis 15:30-17:00 General computational issues P
http://ibgwww.colorado.edu/twins2001/schedule.html