Association Mapping

Date post: 24-Feb-2016
Page 1: Association Mapping

Association Mapping

David Evans

Page 2: Association Mapping

Outline

• Definitions / Terminology
• What is (genetic) association?
• How do we test for association?
• When to use association
• HapMap and tagging
• Genome-wide Association
• Sequencing and Rare variants

Page 3: Association Mapping

Definitions

Locus: Location on the genome

SNP: “Single Nucleotide Polymorphism”, a mutation that produces a single base pair change in the DNA sequence

Alleles: alternate forms of a SNP (mutation)

Genotypes: both alleles at a locus form a genotype

Haplotypes: the pattern of alleles on a chromosome

QTL: “Quantitative trait locus”, a region of the genome that changes the mean value of a quantitative phenotype

[Figure: a pair of chromosomes with example alleles, illustrating alleles, genotypes and haplotypes]

Page 4: Association Mapping

What is (genetic) association?

Correlation between an allele/genotype/haplotype and a trait of interest

Page 5: Association Mapping

Genetic Association: Three Common Forms

• Direct Association
  – Mutant or ‘susceptible’ polymorphism
  – Allele of interest is itself involved in phenotype
  – ~70% of Cystic Fibrosis patients have a deletion of 3 base pairs resulting in the loss of a phenylalanine amino acid at position 508 of the CFTR protein

Page 6: Association Mapping

Genetic Association: Three Common Forms

• Direct Association
  – Mutant or ‘susceptible’ polymorphism
  – Allele of interest is itself involved in phenotype
  – ~70% of Cystic Fibrosis patients have a deletion of 3 base pairs resulting in the loss of a phenylalanine amino acid at position 508 of the CFTR protein

• Indirect Association
  – Allele itself is not involved, but a nearby correlated variant changes phenotype

Page 7: Association Mapping

Indirect association and Linkage disequilibrium

Page 8: Association Mapping

Indirect association and Linkage disequilibrium

[Figure: axis labelled “time”]

Page 9: Association Mapping

Linkage Disequilibrium

Linkage disequilibrium means that we don’t need to genotype the exact aetiological variant, but only a variant that is correlated with it
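How strongly two variants are correlated is commonly summarised by the coefficient D and by r2. The sketch below shows one way to compute both from haplotype counts at two biallelic loci. The function name and the counts are invented for illustration and do not come from the slides.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) for two biallelic loci from haplotype counts.

    n_AB is the number of chromosomes carrying allele A at locus 1 and
    allele B at locus 2, and so on for the other three haplotypes.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2
    p_AB = n_AB / total           # frequency of the A-B haplotype
    d = p_AB - p_A * p_B          # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared


# Hypothetical counts: 100 chromosomes, A and B tend to travel together.
d, r2 = linkage_disequilibrium(40, 10, 10, 40)
print(f"D = {d:.3f}, r^2 = {r2:.3f}")  # D = 0.150, r^2 = 0.360
```

An r2 of 1 would mean the genotyped variant is a perfect proxy for the aetiological one; lower values mean more individuals are needed to detect the same effect indirectly.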

Page 10: Association Mapping

Genetic Association: Three Common Forms

• Direct Association
  – Mutant or ‘susceptible’ polymorphism
  – Allele of interest is itself involved in phenotype

• Indirect Association
  – Allele itself is not involved, but a nearby correlated marker changes phenotype

• Spurious association
  – Apparent association not related to genetic aetiology (e.g. population stratification)

Page 11: Association Mapping

Population Stratification

Marchini, Nat Genet. 2004

Page 12: Association Mapping

How do we test for association?

Page 13: Association Mapping

Genetic Case Control Study

[Figure: genotypes (T/T, T/G, G/G) of individuals grouped as Controls and Cases]

Allele G is ‘associated’ with disease

Page 14: Association Mapping

Allele-based tests

• Each individual contributes two counts to 2x2 table.

          Cases    Controls   Total
G         n1A      n1U        n1·
T         n0A      n0U        n0·
Total     n·A      n·U        n··

• Test of association

$$X^2 = \sum_{i=0,1} \; \sum_{j=A,U} \frac{\left(n_{ij} - E[n_{ij}]\right)^2}{E[n_{ij}]}$$

where

$$E[n_{ij}] = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}$$

• $X^2$ has a $\chi^2$ distribution with 1 degree of freedom under the null hypothesis.
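As an illustration of the allele-based test, the Python sketch below turns genotype counts into allele counts (two per individual) and evaluates the X2 statistic defined above. The function name and the counts are hypothetical. The p-value uses the identity that the upper tail of a chi-square distribution with 1 degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def allelic_test(case_genotypes, control_genotypes):
    """Allele-based chi-square test.

    Each argument is a tuple (n_GG, n_GT, n_TT) of genotype counts.
    Returns (X2, p_value). Assumes every row and column total is non-zero.
    """
    def allele_counts(n_gg, n_gt, n_tt):
        # Each individual contributes two alleles to the 2x2 table.
        return 2 * n_gg + n_gt, 2 * n_tt + n_gt

    case_g, case_t = allele_counts(*case_genotypes)
    control_g, control_t = allele_counts(*control_genotypes)
    observed = [[case_g, control_g],   # row G
                [case_t, control_t]]   # row T
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    x2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(x2 / 2))  # chi-square upper tail, 1 df
    return x2, p_value


# Hypothetical data: 50 cases and 50 controls.
x2, p = allelic_test((20, 20, 10), (10, 20, 20))
print(f"X2 = {x2:.2f}, p = {p:.4f}")
```

Treating the two alleles of one person as independent observations relies on Hardy-Weinberg equilibrium in the combined sample, which is one reason genotype-based tests are often reported alongside.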

Page 15: Association Mapping

Genotypic tests

• SNP marker data can be represented in 2x3 table.

          Cases    Controls   Total
GG        n2A      n2U        n2·
GT        n1A      n1U        n1·
TT        n0A      n0U        n0·
Total     n·A      n·U        n··

• Test of association

$$X^2 = \sum_{i=0,1,2} \; \sum_{j=A,U} \frac{\left(n_{ij} - E[n_{ij}]\right)^2}{E[n_{ij}]}$$

where

$$E[n_{ij}] = \frac{n_{i\cdot}\, n_{\cdot j}}{n_{\cdot\cdot}}$$

• $X^2$ has a $\chi^2$ distribution with 2 degrees of freedom under the null hypothesis.
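The genotypic test can be sketched in the same way, this time counting individuals rather than alleles. The names and counts below are again hypothetical. For 2 degrees of freedom the chi-square upper tail has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def genotypic_test(case_genotypes, control_genotypes):
    """Genotype-based chi-square test on a 3x2 table.

    Each argument is a tuple (n_GG, n_GT, n_TT) of genotype counts.
    Returns (X2, p_value). Assumes every row and column total is non-zero.
    """
    # Rows are genotypes GG, GT, TT; columns are cases, controls.
    observed = [list(pair) for pair in zip(case_genotypes, control_genotypes)]
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    x2 = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (count - expected) ** 2 / expected
    p_value = math.exp(-x2 / 2)  # chi-square upper tail, 2 df
    return x2, p_value


# Hypothetical data: 50 cases and 50 controls.
x2, p = genotypic_test((20, 20, 10), (10, 20, 20))
print(f"X2 = {x2:.2f}, p = {p:.4f}")
```

Because it spends 2 degrees of freedom, this test makes no assumption about the mode of inheritance, at the cost of some power when the true effect is additive.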

Page 16: Association Mapping

Simple Regression Model of Association (Unrelated individuals)

Yi = a + bXi + ei

where
Yi = trait value for individual i
Xi = number of ‘A’ alleles an individual has

[Figure: Y (axis from 0 to 1.2) plotted against X (0, 1, 2)]

Association test is whether b > 0
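A minimal sketch of fitting this regression by ordinary least squares is given below. The function name and the toy data are assumptions made for illustration. The returned t statistic for b would be compared with a t distribution on n - 2 degrees of freedom; that last step is left out because it needs a statistics library.

```python
import math


def additive_regression(allele_counts, trait_values):
    """Fit Y = a + b*X by least squares, where X is the number of 'A' alleles.

    Returns (a, b, t_statistic_for_b). Assumes X is not constant and the
    fit is not perfect (residual variance greater than zero).
    """
    n = len(allele_counts)
    mean_x = sum(allele_counts) / n
    mean_y = sum(trait_values) / n
    s_xx = sum((x - mean_x) ** 2 for x in allele_counts)
    s_xy = sum((x - mean_x) * (y - mean_y)
               for x, y in zip(allele_counts, trait_values))
    b = s_xy / s_xx                 # per-allele effect on the trait
    a = mean_y - b * mean_x         # expected trait value with zero A alleles
    residual_ss = sum((y - (a + b * x)) ** 2
                      for x, y in zip(allele_counts, trait_values))
    se_b = math.sqrt(residual_ss / (n - 2) / s_xx)
    return a, b, b / se_b


# Toy data: six unrelated individuals with 0, 1 or 2 copies of allele A.
x = [0, 0, 1, 1, 2, 2]
y = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1]
a, b, t = additive_regression(x, y)
print(f"a = {a:.3f}, b = {b:.3f}, t = {t:.2f}")
```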

Page 17: Association Mapping

Transmission Disequilibrium Test

[Figure: parent-offspring trio with parental genotypes AC and AA and offspring genotype AC]

• Rationale: Related individuals have to be from the same population

• Compare number of times heterozygous parents transmit “A” vs “C” allele to affected offspring
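The comparison of transmissions from heterozygous parents is usually carried out as a McNemar-type test. The sketch below assumes that standard form, with statistic (b - c)^2 / (b + c) referred to a chi-square distribution with 1 degree of freedom; the function name and the counts are hypothetical.

```python
import math


def tdt(transmitted_a, transmitted_c):
    """Transmission disequilibrium test for AC heterozygous parents.

    transmitted_a: times such a parent passed "A" to an affected offspring
    transmitted_c: times such a parent passed "C" to an affected offspring
    Returns (X2, p_value). Assumes at least one informative transmission.
    """
    b, c = transmitted_a, transmitted_c
    x2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(x2 / 2))  # chi-square upper tail, 1 df
    return x2, p_value


# Hypothetical counts from 100 informative transmissions.
x2, p = tdt(60, 40)
print(f"X2 = {x2:.1f}, p = {p:.4f}")
```

Only heterozygous parents are informative, and each transmission is compared with the allele the same parent did not transmit, which is why the test is protected against population stratification.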

Page 18: Association Mapping

Transmission Disequilibrium Test

[Figure: parent-offspring trio with parental genotypes AC and AA and offspring genotype AC]

• Difficult to gather families
• Difficult to get parents for late onset / psychiatric conditions
• Inefficient for genotyping (particularly GWA)

Page 19: Association Mapping

Case-control versus TDT

[Figure, panel 1: “N units for 90% power” (y-axis, 0 to 1800) against allele frequency (x-axis, 0 to 0.25) for CC (K=0.1), CC (K=0.01) and TDT]

[Figure, panel 2: “N individuals for 90% power” (y-axis, 0 to 6000) against allele frequency (x-axis, 0 to 0.25) for CC (K=0.1), CC (K=0.01) and TDT]

p = 0.1; RAA = RAa = 2

Page 20: Association Mapping
Page 21: Association Mapping

When to use association...

Page 22: Association Mapping

Methods of gene hunting

[Figure: Effect Size (y-axis) against Frequency (x-axis), with two regions labelled “rare, monogenic (linkage)” and “common, complex (association)”]

Page 23: Association Mapping

Association Summary

1. Families or unrelateds

2. Matching/ethnicity crucial

3. Many markers required for genome coverage (10^5 to 10^6 SNPs)

4. Powerful design

5. Ok for initial detection; good for fine-mapping

6. Powerful for common variants; rare variants difficult

Page 24: Association Mapping

HapMap and Tagging

Page 25: Association Mapping

Historical gene mapping

Glazier et al, Science (2002).

Page 26: Association Mapping

Reasons for Failure?

[Figure: diagram with nodes Marker, Gene1, Gene2, Gene3, Polygenic background, Common environment, Individual environment and Complex Phenotype, and connections labelled Linkage, Linkage disequilibrium, Mode of inheritance and Association]

Weiss & Terwilliger (2000) Nat Genet

Inadequate Marker Coverage (Candidate gene studies)

Page 27: Association Mapping

Enabling association studies: HapMap

Page 28: Association Mapping

Visualizing empirical LD

Page 29: Association Mapping

Pairwise tagging

[Figure: six SNPs with alleles A/T (SNP 1), G/A (SNP 2), G/C (SNP 3), T/C (SNP 4), G/C (SNP 5) and A/C (SNP 6), shown across example haplotypes; groups of SNPs in high r2 are marked]

Tags: SNP 1, SNP 3, SNP 6 (3 in total)

Test for association: SNP 1, SNP 3, SNP 6

Carlson et al. (2004) AJHG 74:106
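The idea of pairwise tagging can be sketched as a greedy selection: repeatedly pick the SNP that captures the most not-yet-tagged SNPs at or above an r2 threshold. The code below is a simplified illustration of that idea, not a reproduction of the algorithm in the cited paper. The function names, the 0.8 threshold and the haplotype data are all assumptions.

```python
from itertools import combinations


def r_squared(x, y):
    """r^2 between two SNPs given 0/1 allele codes on the same haplotypes."""
    n = len(x)
    p_x = sum(x) / n
    p_y = sum(y) / n
    p_xy = sum(1 for a, b in zip(x, y) if a == 1 and b == 1) / n
    denom = p_x * (1 - p_x) * p_y * (1 - p_y)
    if denom == 0:                      # monomorphic SNP carries no LD information
        return 0.0
    d = p_xy - p_x * p_y
    return d * d / denom


def greedy_tags(haplotypes, threshold=0.8):
    """Return indices of tag SNPs so every SNP has r^2 >= threshold with a tag."""
    n_snps = len(haplotypes[0])
    columns = [[h[j] for h in haplotypes] for j in range(n_snps)]
    captured_by = {j: {j} for j in range(n_snps)}   # each SNP captures itself
    for i, j in combinations(range(n_snps), 2):
        if r_squared(columns[i], columns[j]) >= threshold:
            captured_by[i].add(j)
            captured_by[j].add(i)

    untagged = set(range(n_snps))
    tags = []
    while untagged:
        # Pick the untagged SNP capturing most untagged SNPs (lowest index on ties).
        best = max(sorted(untagged), key=lambda j: len(captured_by[j] & untagged))
        tags.append(best)
        untagged -= captured_by[best]
    return tags


# Hypothetical haplotypes (rows) at six SNPs (columns), forming three LD groups.
haplotypes = [
    (0, 0, 0, 1, 0, 0), (0, 0, 0, 1, 1, 1),
    (0, 0, 1, 0, 0, 0), (0, 0, 1, 0, 1, 1),
    (1, 1, 0, 1, 0, 0), (1, 1, 0, 1, 1, 1),
    (1, 1, 1, 0, 0, 0), (1, 1, 1, 0, 1, 1),
]
print(greedy_tags(haplotypes))  # [0, 2, 4]
```

Only the three tag SNPs would then need to be genotyped and tested, since each remaining SNP is captured by one of them.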

Page 30: Association Mapping

Genome-wide Association

Page 31: Association Mapping

Enabling Genome-wide Association Studies

HAPlotype MAP

High throughput genotyping

Large cohorts

Page 32: Association Mapping

Genome-wide Association Studies

The Australo-Anglo-American Ankylosing Spondylitis Consortium (2010) Nature Genetics

Page 33: Association Mapping

Meta-analysis

Repapi et al. (2009) Nature Genetics

Page 34: Association Mapping

[Figure: HapMap Phase II reference haplotypes (rows of 0/1 alleles, e.g. 1 1 0 1 1 0 1 0 1 1 0 1 1 0 …) shown with genotypes of Cases and Controls (coded 0/1/2, with “?” marking missing genotypes, e.g. 2 1 1 2 ? 2 1 ? ? 1 ? 2 2 0 …) and a recombination rate track; labelled “Imputation”]

Page 35: Association Mapping

Imputation

Page 36: Association Mapping

Genomic control

[Figure: χ2 against its expectation E(χ2) at the test locus and at unlinked ‘null’ markers, in two panels: “No stratification” and “Stratification”]

Stratification: adjust test statistic
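One common way to carry out this adjustment is to estimate an inflation factor lambda as the median χ2 at the null markers divided by the median of a 1 degree of freedom chi-square distribution (about 0.455), and then divide the test statistic by lambda. The sketch below assumes that median-based form; the function name and the numbers are invented for illustration.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(test_statistic, null_statistics):
    """Adjust a 1-df chi-square statistic for stratification.

    null_statistics: chi-square values observed at unlinked 'null' markers.
    Returns (adjusted_statistic, inflation_factor).
    """
    inflation = median(null_statistics) / CHI2_1DF_MEDIAN
    inflation = max(1.0, inflation)     # never inflate the test statistic
    return test_statistic / inflation, inflation


# Hypothetical null markers whose statistics run higher than expected.
adjusted, lam = genomic_control(10.0, [0.3, 0.6, 0.9, 1.4, 2.5])
print(f"lambda = {lam:.2f}, adjusted X2 = {adjusted:.2f}")
```

With no stratification the null markers should give lambda close to 1 and the test statistic is left essentially unchanged.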

Page 37: Association Mapping

PCA

Page 38: Association Mapping

Replication

Replication studies should be of sufficient size to demonstrate the effect

Replication studies should be conducted in independent datasets

Replication should involve the same phenotype

Replication should be conducted in a similar population

The same SNP should be tested

The replicated signal should be in the same direction

Joint analysis should lead to a lower p value than the original report

Well designed negative studies are valuable

Page 39: Association Mapping

Programs for performing association analysis

• Mx (Neale)
  – Fully flexible, ordinal data
  – Not ideal for large pedigrees or GWAs

• PLINK (Purcell, Neale, Ferreira)
  – GWA

• Haploview (Barrett)
  – Graphical visualization of LD, tagging, basic tests of association

• MERLIN, QTDT (Abecasis)
  – Association and linkage in families

Page 40: Association Mapping

Sequencing and Rare Variants

Page 41: Association Mapping
Page 42: Association Mapping

Metzker (2010) Nature Reviews Genetics

Page 43: Association Mapping
Page 44: Association Mapping

Analysis of Rare Variants

• How to combine rare variants?
  – “Ordinary” tests of association won’t work
  – Collapse across all SNPs?

• Which SNPs to include?
  – Frequency?
  – Function?

• How to define a region?
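One simple reading of “collapse across all SNPs” is to score each individual as a carrier if they hold at least one rare allele anywhere in the region, and then compare carrier frequencies between cases and controls. The sketch below assumes that reading; it is only one of several possible collapsing schemes, and the function names and genotypes are hypothetical.

```python
def collapse_carriers(genotypes):
    """Collapse rare variants in a region to one carrier indicator per person.

    genotypes: one list per individual of minor-allele counts (0, 1 or 2)
    at each rare SNP in the region.
    """
    return [1 if any(count > 0 for count in person) else 0
            for person in genotypes]


def carrier_chi_square(case_genotypes, control_genotypes):
    """2x2 chi-square statistic comparing carrier status in cases vs controls."""
    case_flags = collapse_carriers(case_genotypes)
    control_flags = collapse_carriers(control_genotypes)
    a = sum(case_flags)                  # carrier cases
    b = len(case_flags) - a              # non-carrier cases
    c = sum(control_flags)               # carrier controls
    d = len(control_flags) - c           # non-carrier controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:                       # an empty margin gives no information
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical region with three rare SNPs, four cases and four controls.
cases = [[0, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 1]]
controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(collapse_carriers(cases))             # [1, 0, 1, 1]
print(carrier_chi_square(cases, controls))  # 2.0
```

The questions on the slide map directly onto the inputs: which SNPs are included (by frequency or function) decides the columns, and the definition of the region decides where those columns start and stop. With counts as small as these an exact test would be preferable to the chi-square approximation.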

Page 45: Association Mapping

Summary

1. Genetic association studies can be used to locate common genetic variants that increase risk of disease/affect quantitative phenotypes

2. Genome-wide association has been spectacularly successful in identifying common variants underlying complex traits and disease

3. The next challenge is to explain the “missing heritability” in the genome. Genome-wide sequencing and the analysis of rare variants will play a major part in this effort

