Association mapping for Mendelian and complex disorders. January 16. Bafna, BfB.

Date post: 19-Jan-2016
Page 1

Association mapping for Mendelian and complex disorders

Apr 21, 2023 Bafna, BfB

Page 2

UG Bioinformatics specialization at UCSD

Page 3

Abstraction of a causal mutation

Page 4

Looking for the mutation in populations

A possible strategy is to collect cases (affected individuals) and controls, and look for a mutation that consistently separates the two classes. Next, identify the gene.

Page 5

Looking for the causal mutation in populations

[Figure: Case and Control groups]

Problem 1: There are many unrelated common mutations, around one every 1000 bp.

Page 6

[Figure: Case and Control groups]

Page 7

Looking for the causal mutation in populations

[Figure: Case and Control groups]

Problem 2: We may not sample the causal mutation.

Page 8

How to hunt for disease genes

• We are guided by two simple facts governing these mutations:
1. Nearby mutations are correlated.
2. Distal mutations are not.

[Figure: Case and Control groups]

Page 9

This lecture

1. The bottom line: How do these facts help in finding disease genes?
2. The genetics: Why should this happen?
3. The computation
4. The challenge of complex diseases

[Figure: Case and Control groups, annotated with the two facts: 1. Nearby mutations are correlated. 2. Distal mutations are not.]

Page 10

The basics of association mapping

• Sample a population of individuals at variant locations across the genome. Typically, these variants are single nucleotide polymorphisms (SNPs).

• Create a new bi-allelic variant that encodes case/control status, and test each sampled variant for correlation with it.

• By our assumptions, only the proximal variants (those near the causal mutation) will be correlated.

• Investigate genes near the correlated variants.

[Figure: Case and Control groups with a 0/1 case-control column]
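The steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the haplotype matrix and the status vector are made-up toy data, plain Pearson correlation stands in for the statistical tests introduced later in the lecture, and the names `pearson_r`, `haplotypes`, `status`, and `best_snp` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric vectors.

    Returns 0.0 when either vector is constant (correlation is undefined).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical toy data: one row per sampled chromosome, one column per SNP.
haplotypes = [
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
]
# The "new bi-allelic variant": 1 = case, 0 = control.
status = [1, 1, 1, 1, 0, 0, 0, 0]

# Correlate every SNP column with the case/control column.
correlations = [pearson_r(snp, status) for snp in zip(*haplotypes)]

# The SNP with the strongest correlation marks the region to investigate.
best_snp = max(range(len(correlations)), key=lambda i: abs(correlations[i]))
print(correlations, best_snp)
```

In this toy matrix the third column tracks the status vector exactly, the second column tracks it partially (as a nearby SNP might), and the other two are unrelated to it.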

Page 11

So, why should the proximal SNPs be correlated, and distal SNPs not?

Page 12

A bit of evolution

• Consider a population (of chromosomes) of fixed size evolving in time.

• Each individual arises from a unique, randomly chosen parent from the previous generation.

[Figure: successive generations of the population along a Time axis]
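A minimal simulation of this model, assuming a constant population size and uniformly random parent choice (the usual Wright-Fisher setup). The function names and the parameters `pop_size`, `generations`, and `seed` are hypothetical choices for illustration.

```python
import random

def simulate_parents(pop_size, generations, seed=0):
    """parents[g][i] is the index of the parent of individual i in
    generation g + 1, chosen uniformly from generation g."""
    rng = random.Random(seed)
    return [
        [rng.randrange(pop_size) for _ in range(pop_size)]
        for _ in range(generations)
    ]

def ancestors(parents, individual):
    """Trace one extant individual back in time, one generation per step."""
    lineage = [individual]
    for generation in reversed(parents):
        individual = generation[individual]
        lineage.append(individual)
    return lineage

def surviving_founders(parents):
    """Founders (generation 0) that still have descendants today."""
    pop_size = len(parents[0])
    return {ancestors(parents, i)[-1] for i in range(pop_size)}

parents = simulate_parents(pop_size=10, generations=50, seed=1)
print(ancestors(parents, 3))
print(surviving_founders(parents))
```

Running this for more generations typically leaves fewer surviving founders, because lineages traced backwards keep merging. That merging is what the genealogy slides that follow depict.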

Page 13

Genealogy of a chromosomal population

[Figure: genealogy along a Time axis, ending at the current (extant) population]

Page 14

Adding mutations

Infinite sites assumption: A mutation occurs at most once at a site.

Page 15

SNPs

The collection of acquired mutations in the extant population describes the SNPs.

Page 16

Fixation and elimination

• Not all mutations survive.
• Some mutations get fixed, and are no longer polymorphic.
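Fixation and elimination can be watched in a small drift simulation. The sketch below assumes a neutral mutation in a population of constant size, where each child copies a uniformly chosen parent; `drift_until_absorbed` and its parameters are hypothetical names, not something defined in the slides.

```python
import random

def drift_until_absorbed(pop_size, initial_copies, seed=0,
                         max_generations=100_000):
    """Follow the number of chromosomes carrying a mutation until it is
    lost (0 copies) or fixed (pop_size copies).

    Returns (final_copies, generations_elapsed).
    """
    rng = random.Random(seed)
    copies = initial_copies
    generation = 0
    while 0 < copies < pop_size and generation < max_generations:
        freq = copies / pop_size
        # Each child inherits the mutation iff its random parent carries it.
        copies = sum(1 for _ in range(pop_size) if rng.random() < freq)
        generation += 1
    return copies, generation

outcomes = [drift_until_absorbed(20, 5, seed=s)[0] for s in range(200)]
fixed = sum(1 for c in outcomes if c == 20)
print(f"fixed in {fixed} of {len(outcomes)} runs")
```

For a neutral mutation the standard result is that the fixation probability equals the starting frequency (here 5/20), so roughly a quarter of the runs should end in fixation and the rest in loss.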

Page 17

Removing extinct genealogies

Page 18

Removing fixed mutations

Page 19

The coalescent

Page 20

Disease mutation

• We drop the ancestral chromosomes, and place the mutations on the internal branches.

Page 21

Disease mutation

• A causal mutation creates a clade of affected descendants.

Page 22

Disease mutation

• Note that the tree (genealogy) is hidden.
• However, the underlying tree topology introduces a correlation between each pair of SNPs.

Page 23

What have we learnt?

• The underlying genealogy creates a correlation between SNPs.

• By itself, this is not sufficient, because distal SNPs might also be correlated.

• Fortunately for us, the correlation between distal SNPs is quickly destroyed.

Page 24

Recombination

Page 25

Recombination

• In our idealized model, we assume that each individual chromosome chooses two parental chromosomes from the previous generation.

Page 26

Multiple recombinations change the local genealogy

Page 27

A bit of evolution

• Proximal SNPs are correlated, distal SNPs are not.
• The correlation (linkage disequilibrium) decays rapidly after 20-50 kb.
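One common way to quantify linkage disequilibrium between two SNPs is the coefficient D and its normalized form r². The slides do not say which measure they use, so the sketch below is an assumption, and the toy haplotypes and function names are hypothetical. The decay function uses the textbook relation D_t = (1 - c)^t D_0 for recombination fraction c, which is also not taken from the slides.

```python
def ld_stats(haplotypes, i, j):
    """Return (D, r^2) for SNP columns i and j of a 0/1 haplotype matrix."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n            # freq of allele 1 at i
    p_b = sum(h[j] for h in haplotypes) / n            # freq of allele 1 at j
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b                               # excess co-occurrence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2

def expected_d(d0, recomb_fraction, generations):
    """Expected D after random mating: each generation keeps a
    (1 - recomb_fraction) share of the previous disequilibrium."""
    return d0 * (1 - recomb_fraction) ** generations

# Hypothetical haplotypes: SNPs 0 and 1 always agree, SNP 2 is independent.
haplotypes = [
    [0, 0, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 1, 1],
]
print(ld_stats(haplotypes, 0, 1))   # perfectly linked pair
print(ld_stats(haplotypes, 0, 2))   # unlinked pair
print(expected_d(0.25, 0.01, 100))  # slow decay for a small recombination fraction
print(expected_d(0.25, 0.5, 10))    # fast decay for a large one
```

Nearby SNPs have a small recombination fraction, so their D survives many generations, while for distant SNPs it shrinks toward zero quickly.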

Page 28

BASIC STATISTICS

Page 29

Testing for correlation

• In the absence of correlation

Page 30

Testing for correlation

• When correlated

Page 31

Assigning confidence

Expected      Observed
2    2        4    0
2    2        0    4

Page 32

Assigning confidence

Expected      Observed
2    2        4    0
2    2        0    4

Page 33

Assigning confidence

Expected      Observed
2.5  2.5      3    2
1.5  1.5      1    2
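The expected tables above follow from the row and column totals of the observed tables: under no correlation, each cell is expected to hold (row total × column total) / N. A small sketch, with `expected_counts` as a hypothetical helper name:

```python
def expected_counts(observed):
    """Expected cell counts under independence, from the table margins."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# The two observed tables from the slides.
print(expected_counts([[4, 0], [0, 4]]))  # [[2.0, 2.0], [2.0, 2.0]]
print(expected_counts([[3, 2], [1, 2]]))  # [[2.5, 2.5], [1.5, 1.5]]
```

The first observed table is far from its expectation and the second is close to it, which is the difference the tests in the next section put a number on.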

Page 34

STATISTICAL TESTS OF ASSOCIATION

Page 35

Tests for association: Pearson

• Case-control phenotype:
– Build a 3×2 contingency table
– Pearson test (2 df): $\chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i}$, where $E_i$ is the count expected in cell $i$ under no association

      Cases  Controls
MM    O1     O2
Mm    O3     O4
mm    O5     O6

Page 36

The χ2 test

      Cases  Controls
MM    O1     O2
Mm    O3     O4
mm    O5     O6

• Under the null hypothesis, the statistic approximately follows a χ2 distribution (here with 2 degrees of freedom).

• A p-value can be computed directly.
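A sketch of the Pearson test on a 3×2 genotype table. The counts are invented for illustration and `pearson_chi2_3x2` is a hypothetical name. The p-value line relies on the fact that the χ2 tail probability with exactly 2 degrees of freedom is exp(-x/2); other table shapes would need a general χ2 tail function.

```python
from math import exp

def pearson_chi2_3x2(table):
    """Pearson statistic and p-value for a 3x2 (genotype x status) table.

    Assumes every row and column total is positive, so all expected
    counts are positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    p_value = exp(-stat / 2)  # chi-square upper tail, valid for 2 df only
    return stat, p_value

# Hypothetical counts: rows MM, Mm, mm; columns cases, controls.
table = [
    [30, 10],
    [20, 20],
    [10, 30],
]
stat, p_value = pearson_chi2_3x2(table)
print(stat, p_value)
```

Here every expected count is 20, so the statistic is 4 × (10² / 20) = 20, which is far out in the tail of a χ2 distribution with 2 degrees of freedom.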

Page 37

χ2 distribution properties

A related distribution is the F-distribution.

Page 38

Likelihood ratio

• Another way to check how extreme the observed counts are is to compute a (log) likelihood ratio.

• We have two competing hypotheses. Let N be the total number of observations.

Page 39

LLR

• An LLR value close to 0 is consistent with the null hypothesis. Asymptotically, the LLR statistic (scaled by 2 under the usual definition) also follows the chi-square distribution.
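The slide's formula did not survive in this transcript, so the sketch below assumes the usual form for contingency tables: LLR = Σ O_i ln(O_i / E_i), with 2·LLR (often called the G statistic) compared against the chi-square distribution. The function name and the counts are hypothetical.

```python
from math import log

def g_statistic(table):
    """2 * sum(O * ln(O / E)) over the cells of a contingency table.

    Cells with O = 0 contribute nothing (the limit of O * ln O is 0).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    llr = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:
                expected = row_totals[i] * col_totals[j] / total
                llr += observed * log(observed / expected)
    return 2 * llr

# Hypothetical counts: rows MM, Mm, mm; columns cases, controls.
associated = [[30, 10], [20, 20], [10, 30]]
independent = [[10, 10], [20, 20], [5, 5]]
print(g_statistic(associated))
print(g_statistic(independent))  # 0.0: observed equals expected everywhere
```

On the first table the value lands close to the Pearson statistic for the same counts, as expected when the counts are not small.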

Page 40

Exact test

• The chi-square test does not work so well when the numbers are small.

• How can we compute an exact probability of seeing a specific distribution of values in the cells?

• Remember: we know the marginals (# cases, # controls,

Page 41

Fisher exact test

      Cases  Controls
MM    a      b
Mm    c      d
mm    e      f

• Num: # ways of getting the configuration (a,b,c,d,e,f)

• Den: # ways of ensuring that the row sums and column sums are fixed

Page 42

Fisher exact test

• Remember that the probability of seeing any specific values in the cells is going to be small.

• To get a p-value, we must sum over all similarly extreme values. How?

Page 43

Test for association: Fisher exact test

• Here P is the probability of seeing the exact count.
• The actual significance is computed by summing over all such tables that are at least this extreme.

      Cases  Controls
MM    a      b
Mm    c      d
mm    e      f

Page 44

Test for association: Fisher exact test

      Cases  Controls
MM    a      b
Mm    c      d
mm    e      f
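The counting argument can be made concrete for a 3×2 table. The sketch assumes the standard result that, with all margins fixed, the table probability is multivariate hypergeometric: P = C(R1,a)·C(R2,c)·C(R3,e) / C(N, #cases), where R1, R2, R3 are the row totals. It also assumes that "at least this extreme" means "no more probable than the observed table", which is one common convention. `fisher_exact_3x2` is a hypothetical name.

```python
from fractions import Fraction
from math import comb

def fisher_exact_3x2(table):
    """Exact test for a 3x2 table [[a, b], [c, d], [e, f]].

    Returns (probability of the observed table, p-value), both as exact
    fractions, conditioning on the row and column totals.
    """
    rows = [sum(row) for row in table]
    cases = sum(row[0] for row in table)
    total = sum(rows)

    def weight(a, c, e):
        # Ways to pick which individuals of each genotype are cases.
        return comb(rows[0], a) * comb(rows[1], c) * comb(rows[2], e)

    observed = weight(table[0][0], table[1][0], table[2][0])
    extreme = 0
    # Enumerate every table with the same margins.
    for a in range(min(rows[0], cases) + 1):
        for c in range(min(rows[1], cases - a) + 1):
            e = cases - a - c
            if e > rows[2]:
                continue
            w = weight(a, c, e)
            if w <= observed:  # at least as extreme as the observed table
                extreme += w
    denom = comb(total, cases)  # ways to choose the cases overall
    return Fraction(observed, denom), Fraction(extreme, denom)

p_table, p_value = fisher_exact_3x2([[2, 0], [1, 1], [0, 2]])
print(p_table, p_value)  # 1/10 3/5
```

With only six individuals even the most lopsided table is not significant, which is the small-sample situation where an exact test is preferable to the chi-square approximation.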

Page 45

Continuous outcomes

• Instead of discrete (case/control) data, we have real-valued phenotypes
– Ex: Diastolic blood pressure

• In this case, how do we test for association?

Page 46

Continuous outcome ANOVA

• Often, the phenotypes are not offered as case-controls but as a continuous variable
– Ex: blood-pressure measurements

• Question: Are the mean values of the two groups significantly different?

[Figure: phenotype values for the MM and mm groups]

Page 47

Two-sided t-test

• For two categories, ANOVA is equivalent to the t-test

• Assume that the variables from the two sets are drawn from Normal distributions
– Different means, equal variances

• Null hypothesis is that they are both from the same distribution

Page 48

t-test continued

Page 49

Two-sample t-test

• As the variance is not known, we use an estimate S, defined by
$S^2 = \frac{(n-1) S_1^2 + (m-1) S_2^2}{n + m - 2}$
where $n$ and $m$ are the two sample sizes and $S_1^2$, $S_2^2$ the two sample variances

• The T-statistic is given by
$T = \frac{\bar{X}_1 - \bar{X}_2}{S \sqrt{\frac{1}{n} + \frac{1}{m}}}$

• Significant deviations from 0 are used to reject the null hypothesis

Page 50

Two-sample t-test (unequal variances)

• If the variances cannot be assumed to be equal, we use the two sample variances $S_1^2$ and $S_2^2$ separately

• The T-statistic is given by
$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n} + \frac{S_2^2}{m}}}$

• Significant deviations from 0 are used to reject the null hypothesis
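Both T-statistics can be computed with the standard library. The sketch assumes the usual pooled-variance form for the equal-variance case and the usual unequal-variance (Welch) form otherwise. The phenotype values are invented and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Two-sample T-statistic assuming equal variances (pooled estimate)."""
    n, m = len(x), len(y)
    s2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    return (mean(x) - mean(y)) / sqrt(s2 * (1 / n + 1 / m))

def welch_t(x, y):
    """Two-sample T-statistic without the equal-variance assumption."""
    n, m = len(x), len(y)
    return (mean(x) - mean(y)) / sqrt(variance(x) / n + variance(y) / m)

# Hypothetical phenotype values for two genotype groups.
group_mm = [1, 2, 3, 4, 5, 6]
group_MM = [4, 6, 8]
print(pooled_t(group_mm, group_MM))
print(welch_t(group_mm, group_MM))
```

The two statistics coincide when the groups have equal size and differ otherwise, as in this example. Turning either into a p-value requires the t distribution with the appropriate degrees of freedom, which is left out here.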

Page 51

CONFOUNDING ASSOCIATION

Page 52

Confounding association

• Association tests can be confounded in many ways.

• We will explore a few of these, at a high level, and point to a few algorithmic problems.

Page 53

Confounding association with population substructure

If the cases and controls are drawn from different subpopulations, then any site whose allele frequency differs between the subpopulations will appear associated with the disease, confounding the test.
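A numeric illustration of this effect, using made-up allele counts. In each hypothetical subpopulation the allele is independent of case/control status, yet pooling the two produces a large Pearson statistic because subpopulation A has both more cases and a higher frequency of allele M. The helper `chi_square` is an illustrative name.

```python
def chi_square(table):
    """Pearson statistic for a contingency table with positive margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: allele M, allele m. Columns: cases, controls. All counts hypothetical.
pop_a = [[80, 16], [20, 4]]   # M frequency 0.8 in cases and in controls
pop_b = [[4, 20], [16, 80]]   # M frequency 0.2 in cases and in controls
pooled = [
    [a + b for a, b in zip(row_a, row_b)]
    for row_a, row_b in zip(pop_a, pop_b)
]

print(chi_square(pop_a), chi_square(pop_b))  # no association within either
print(chi_square(pooled))                    # strong spurious association
```

Testing within each subpopulation (or otherwise accounting for membership) removes the signal, which is why the next slide asks how to recover the subpopulations from genotype data.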

Page 54

The algorithmic problem

• Given a collection of individual genotypes, separate them into sub-populations.

• Idea: take markers that are very far apart, so that no LD is expected between them.

• LD between such markers indicates structure.
• Problem: Partition individuals into sub-populations so that, within each sub-population, the correlation across pairs of distant markers is minimized. Penalty for increasing the number of sub-populations?

Page 55

Confounding associations with genotypes

[Figure: A recombination event]

Distinct haplotypes can create identical genotypes, confounding association.
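A two-SNP example makes the ambiguity concrete. The sketch assumes the common encoding in which a genotype records, per site, how many of the two chromosomes carry allele 1 (0, 1, or 2). The function names are hypothetical.

```python
from itertools import product

def genotype(hap1, hap2):
    """Per-site count of allele 1 across the two haplotypes."""
    return tuple(a + b for a, b in zip(hap1, hap2))

def compatible_pairs(geno):
    """All unordered haplotype pairs that produce the given genotype."""
    het_sites = [i for i, g in enumerate(geno) if g == 1]
    pairs = set()
    for bits in product((0, 1), repeat=len(het_sites)):
        # Homozygous sites are forced: 0 -> (0, 0) and 2 -> (1, 1).
        hap1 = [g // 2 for g in geno]
        hap2 = list(hap1)
        # Each heterozygous site can be phased either way.
        for site, bit in zip(het_sites, bits):
            hap1[site], hap2[site] = bit, 1 - bit
        pairs.add(frozenset((tuple(hap1), tuple(hap2))))
    return pairs

# Two different haplotype pairs, one genotype.
print(genotype((0, 0), (1, 1)))  # (1, 1)
print(genotype((0, 1), (1, 0)))  # (1, 1)
print(len(compatible_pairs((1, 1, 1))))  # 4
```

With k heterozygous sites there are 2^(k-1) possible phasings, so genotype data alone quickly loses track of which alleles travel together on a chromosome.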

Page 56

Confounding association with interactions

• Individually, the markers do not correlate.
• Together, they perfectly predict genes.
• Find interacting partners that associate with genes
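A toy case of such an interaction, assuming the outcome being predicted is a 0/1 phenotype that equals the exclusive-or of two markers. This is an invented example, not the slide's figure, and all names are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation; 0.0 if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical markers over eight individuals.
snp_a = [0, 0, 1, 1, 0, 0, 1, 1]
snp_b = [0, 1, 0, 1, 0, 1, 0, 1]
# Assumed phenotype: present exactly when the two markers differ.
phenotype = [a ^ b for a, b in zip(snp_a, snp_b)]

print(pearson_r(snp_a, phenotype))  # 0.0: marker A alone is uninformative
print(pearson_r(snp_b, phenotype))  # 0.0: marker B alone is uninformative

combined = [a ^ b for a, b in zip(snp_a, snp_b)]
print(pearson_r(combined, phenotype))  # 1.0: the pair predicts perfectly
```

A single-marker scan would miss both SNPs here. Finding them requires testing pairs (or larger sets), and the number of candidates grows quadratically or worse with the number of markers.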

Page 57

Confounding association with rare variants

• Not only can we have multiple interacting SNPs, but each SNP individually may occur with very low frequency (< 1%).

• Can you detect associations with rare variants?

Page 58

Other problems

• Can we reconstruct the phylogeny?
• Useful for computing recombination bounds.

Page 59

Conclusion

• As individual genomes are sequenced, the association of variations with phenotypes presents many confounding challenges.

• Some of these challenges can be modeled as algorithmic problems.

• Population genetics should be part of a bioinformatics undergraduate curriculum.

Page 60

Thank you

• Homework (due Monday, March 15)
– Describe an algorithm to detect associations of interacting, rare variants with a complex disease phenotype, in the presence of population substructure in the case-control population.

