African American disease gene discovery utilizing mapping by admixture linkage disequilibrium: progress and prospects
Michael W. Smith
Linkage disequilibrium resolution
Family studies    20 cM
ALD               3-7 cM
HAP map           <0.1 cM
ALD around FY locus
MALD Map SNP Sources
Unique                    ~450,000
AB Validated SNPs          180,178
Public Database             66,204
Celera/AB Resequencing     266,135
MALD SNP map samples

Group               Population       Sample size
European American   Chicago          40
                    Baltimore        40
African American    Chicago          19
                    Pittsburgh       23
                    Baltimore        45
                    North Carolina   23
Africa              Senegal          46
                    Ghana            33
                    Cameroon         20
                    Botswana         21
Chinese             Canton           40
Amerindian          Zapotec          29
MALD map loci assessed
3669 SNPs assayed
35 replicated SNPs
61 discrepancies in 12,447 comparisons
WICGR   1847
LGD     1857
Shannon Information Content

            African               European          TOTAL
Allele 1    (1-m)fA       [a00]   mfE       [a01]   (1-m)fA + mfE           [a0*]
Allele 2    (1-m)(1-fA)   [a10]   m(1-fE)   [a11]   (1-m)(1-fA) + m(1-fE)   [a1*]
TOTAL       1-m           [a*0]   m         [a*1]   1
\[
\mathrm{SIC} = \sum_{i=0}^{1}\sum_{j=0}^{1} a_{ij}\log_2 a_{ij}
  - \sum_{i=0}^{1} a_{i*}\log_2 a_{i*}
  - \sum_{j=0}^{1} a_{*j}\log_2 a_{*j}
\]
Same as “informativeness of assignment” (Rosenberg et al., 2003)
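A minimal Python sketch of the SIC calculation, assuming the joint allele-by-ancestry table given above (a_ij built from the parental allele frequencies fA and fE and the European admixture proportion m). The function name and argument names are illustrative, not from the talk.

```python
import math


def shannon_information_content(f_a: float, f_e: float, m: float) -> float:
    """SIC in bits for one biallelic marker.

    f_a, f_e: allele 1 frequency in the African and European parental populations.
    m: European admixture proportion.
    """
    # Joint probabilities a[i][j]: i = allele (0 or 1), j = ancestry (0 African, 1 European).
    a = [
        [(1 - m) * f_a, m * f_e],
        [(1 - m) * (1 - f_a), m * (1 - f_e)],
    ]
    row_totals = [a[i][0] + a[i][1] for i in range(2)]  # a_i*
    col_totals = [a[0][j] + a[1][j] for j in range(2)]  # a_*j

    def x_log2_x(x: float) -> float:
        # Use the convention 0 * log(0) = 0.
        return x * math.log2(x) if x > 0 else 0.0

    joint = sum(x_log2_x(a[i][j]) for i in range(2) for j in range(2))
    return (
        joint
        - sum(x_log2_x(x) for x in row_totals)
        - sum(x_log2_x(x) for x in col_totals)
    )
```

Under this definition a marker with identical frequencies in both parental populations carries no information, while a fully diagnostic marker carries as much information as the ancestry label itself (1 bit when m = 0.5).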
MALD Marker informativeness (panels: Single, Multi)
Admixture in African Americans
Admixture linkage disequilibrium decay
MALD Marker Map, part 1
MALD Marker Map, part 2
SNPs for estimating individual ancestry
Comparison             Average allele frequency difference
African/European       78%
African/Amerindian     85%
European/Amerindian    56%
European/East Asian    57%

100 markers, each spaced by at least 25 cM
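One way such a panel could be assembled is sketched below: rank candidate markers by allele frequency difference between two populations, then accept a marker only if it lies at least 25 cM from every marker already chosen on the same chromosome. This greedy rule, the tuple layout, and the function names are assumptions for illustration; the slide does not say how the 100 markers were actually picked.

```python
def select_spaced_markers(markers, min_gap_cm=25.0, limit=100):
    """Greedy pick of high-difference markers with a minimum spacing.

    markers: list of (name, chromosome, position_cm, freq_pop1, freq_pop2).
    """
    # Largest allele frequency difference first.
    ranked = sorted(markers, key=lambda mk: abs(mk[3] - mk[4]), reverse=True)
    chosen = []
    for mk in ranked:
        if len(chosen) >= limit:
            break
        # Keep the marker only if it is far enough from all chosen markers
        # on the same chromosome.
        if all(c[1] != mk[1] or abs(c[2] - mk[2]) >= min_gap_cm for c in chosen):
            chosen.append(mk)
    return chosen


def mean_frequency_difference(markers):
    """Average absolute allele frequency difference across markers."""
    return sum(abs(mk[3] - mk[4]) for mk in markers) / len(markers)


# Made-up example data (names, positions and frequencies are hypothetical).
example = [
    ("snp_a", "1", 10.0, 0.90, 0.10),
    ("snp_b", "1", 20.0, 0.95, 0.05),
    ("snp_c", "1", 50.0, 0.80, 0.20),
    ("snp_d", "2", 5.0, 0.70, 0.30),
]
panel = select_spaced_markers(example)
```

In the toy data, snp_a is dropped because it sits only 10 cM from the more differentiated snp_b.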
Structure Analysis
• Pritchard et al. 2001 and Falush et al. 2003
• Bayesian Approach of Modeling
  – Number of populations
  – Admixture
  – Estimates contributions from each population
    • Per sampled population
    • Per individual
MCMC Model
• Estimate ancestral origin along genome
  – Composite genotype of linked markers
  – Utilize parental allele frequencies
  – Estimate number of generations since admixture for individuals
  – Estimate admixture fraction for individuals
• Evaluation
  – Locus genome statistic
  – Locus case-control
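The idea of estimating ancestral origin along the genome from linked markers and parental allele frequencies can be illustrated with a much simpler stand-in than the MCMC model itself: a two-state hidden Markov model on a single haploid chromosome, with the admixture fraction m and the number of generations tau treated as known rather than estimated. Everything in this sketch (function name, haploid simplification, fixed m and tau) is an assumption made for illustration.

```python
import math


def ancestry_posteriors(alleles, positions_morgans, freq_afr, freq_eur, m, tau):
    """Posterior probability of European ancestry at each marker.

    alleles: observed allele (0 or 1) per marker on one haploid chromosome.
    positions_morgans: marker positions in Morgans, ascending.
    freq_afr, freq_eur: allele 1 frequencies in the parental populations
        (assumed strictly between 0 and 1 to avoid zero likelihoods).
    m: European admixture fraction; tau: generations since admixture.
    """
    prior = [1.0 - m, m]  # state 0 = African, state 1 = European
    n = len(alleles)
    emit = [
        [fa if a == 1 else 1 - fa, fe if a == 1 else 1 - fe]
        for a, fa, fe in zip(alleles, freq_afr, freq_eur)
    ]

    def transition(d):
        # No ancestry break with probability exp(-tau * d); after a break the
        # new ancestry is drawn from the admixture proportions.
        stay = math.exp(-tau * d)
        return [[stay * (j == k) + (1 - stay) * prior[k] for k in range(2)]
                for j in range(2)]

    def normalise(v):
        s = sum(v)
        return [x / s for x in v]

    # Forward pass (scaled at each step).
    fwd = [normalise([prior[k] * emit[0][k] for k in range(2)])]
    for t in range(1, n):
        tr = transition(positions_morgans[t] - positions_morgans[t - 1])
        fwd.append(normalise([
            emit[t][k] * sum(fwd[t - 1][j] * tr[j][k] for j in range(2))
            for k in range(2)
        ]))

    # Backward pass (scaled at each step).
    bwd = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        tr = transition(positions_morgans[t + 1] - positions_morgans[t])
        bwd[t] = normalise([
            sum(tr[j][k] * emit[t + 1][k] * bwd[t + 1][k] for k in range(2))
            for j in range(2)
        ])

    posteriors = []
    for t in range(n):
        w = [fwd[t][k] * bwd[t][k] for k in range(2)]
        posteriors.append(w[1] / (w[0] + w[1]))
    return posteriors
```

A run of tightly linked markers that all carry the European-typical allele pushes the posterior toward European ancestry far more strongly than any single marker would, which is the reason for using composite genotypes of linked markers.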
Mapping by segment ancestry
ALD around specific genes
Examples of chromosomal admixture segments
Power of MALD analysis
For 80% power
Samples needed to detect a gene
3,000 Locus MALD Screen on one individual
[Picture: AB 3100]
MALD candidate diseases

Cause of Death or Disease                Relative Risk (African/European) (1)   95% CI
Hepatitis C clearance (2)                0.19                                   (0.10, 0.38)
Melanoma                                 0.24                                   (0.09, 0.66)
HIV vertical transmission (3)            0.30                                   (0.10, 0.90)
HIV progression (5)                      1.41                                   (1.06, 1.86)
Multiple sclerosis (4)                   0.5
Suicide                                  0.64                                   (0.45, 0.93)
Lung cancer                              1.48                                   (1.30, 1.67)
Stroke                                   1.57                                   (1.27, 1.94)
End-stage renal disease (6)              1.87                                   (1.47, 2.39)
Intracranial hemorrhage                  2.10                                   (1.44, 3.06)
Focal segmental glomerulosclerosis (7)   2.49                                   (1.05, 5.95)
Prostate cancer                          2.73                                   (2.13, 3.52)
Hypertensive heart disease               2.80                                   (2.03, 3.86)
Myeloma                                  3.14                                   (2.00, 4.93)

(1) Daley Smith et al. 1998, except (2) Thomas et al. 2000; (3) Tess et al. 1998; (4) Hogancamp 1997; (5) McGinniss et al. 2003; (6) Klag et al. 1997; and (7) Lopes et al. 2000
Acknowledgements
Jeffrey Kopp (NIDDK)
David Thomas (JHMI)
Chloe Thio (JHMI)
(NCI/LGD): Holli Hutchinson, Mike Malasky, Mary McNally, Bailey Kessing, Joanne Clarke
Thanks to my lab too!
Yvette Berthier-Schaad, Taras Oleksyk, Sadeep Shrestha, Ann Truelove, Kai Zhao, Joanne Clarke