Page 1: 2019 SISG Module 8: Bayesian Statistics for Genetics (faculty.washington.edu/kenrice/sisgbayes/Lecture3-JW-2019.pdf)

2019 SISG Module 8: Bayesian Statistics for Genetics

Lecture 3: Binomial Sampling

Jon Wakefield

Departments of Statistics and Biostatistics, University of Washington

1 / 67


Outline

Introduction and Motivating Example

Bayesian Analysis of Binomial Data
The Beta Prior
Bayes Factors

Analysis of ASE Data

Conclusions


Introduction


Introduction

In this lecture we will consider the Bayesian modeling of binomial data.

Two motivations for a binomial model:

I A so-called allele specific expression (ASE) experiment will be considered.
I A time series of counts, in order to model the prevalence of a condition.

Conjugate priors will be described in detail.

Sampling from the posterior will be emphasized as a method for flexible inference.


Motivating Example: Allele Specific Expression

I Gene expression variation is an important contribution to phenotypic variation within and between populations.
I Expression variation may be due to genetic or environmental sources.
I Genetic variation may be due to cis- (local) or trans- (distant) acting mechanisms.
I Polymorphisms that act in cis affect expression in an allele-specific manner.
I RNA-Seq is a high-throughput technology that allows allele-specific expression (ASE) to be measured.


Motivating Example: An Example of ASE

I The data we consider are from yeast: a controlled experiment in which two strains, BY and RM, are hybridized.
I Consider a gene with one exon and five SNPs within that exon.
I Suppose the BY allele of the gene is expressed at a high level.
I In contrast, the RM allele has a mutation in a transcription factor binding site upstream of the gene that greatly reduces expression of this allele.
I Then, in the mRNA isolated from the yeast, when we look just at this gene, there are many more BY mRNA molecules than RM mRNA molecules.


Example of ASE

Figure: In the top figure (the BY allele) the transcription factor (blue) leads to high transcription. In the bottom figure (the RM allele) an upstream polymorphism (red star) prevents the transcription factor from binding.


Specifics of ASE Experiment

Details of the data:

I Two “individuals” from genetically divergent yeast strains, BY and RM, are mated to produce a diploid hybrid.
I Three replicate experiments: same individuals, but separate samples of cells.
I Two technologies: Illumina and ABI SOLiD.
I Each of a few trillion cells are processed.
I Pre- and post-processing steps are followed by fragmentation to give millions of 200–400 base pair long molecules, with short reads obtained by sequencing.
I Need SNPs since otherwise the reference sequences are identical and we cannot tell which strain a read arises from.
I Strict criteria to call each read as a match are used, to reduce read-mapping bias.
I Data from 25,652 SNPs within 4,844 genes.
I More details in Skelly et al. (2011).


The Data

Table: First few rows of ASE data.

BY Count   Total Count   MLE θ̂
62         107           0.58
33         59            0.56
658        1550          0.42
14         61            0.23
57         153           0.37
218        451           0.48
10         19            0.53
...        ...           ...


Simple Approach to Testing for ASE

For a generic gene:

I Let N be the total number of counts at a particular gene, and Y the number of reads to the BY strain.
I Let θ be the probability of a map to BY.
I A simple approach is to assume

      Y | θ ∼ Binomial(N, θ),

  and carry out a test of H0 : θ = 0.5, which corresponds to no allele specific expression.
I A non-Bayesian approach might use an exact test, i.e., enumerate the probability, under the null, of all the outcomes that are equal to or more extreme than that observed.
I Issues:
  I p-values are not uniform under the null, due to the discreteness of Y.
  I How to pick a threshold? In general, and when there are multiple tests.
  I Do we really want a point null, i.e., θ = 0.5?
  I How would a Bayesian perform inference for this problem?
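The exact test described above can be sketched in a few lines. This is a minimal illustration in Python (the lecture's own computing is in R), defining "as or more extreme" as all outcomes whose null probability is no larger than that of the observed count:

```python
from math import comb

def exact_binomial_test(y, N, theta0=0.5):
    """Two-sided exact test: sum the null probabilities of all
    outcomes no more likely than the observed count y."""
    pmf = [comb(N, k) * theta0**k * (1 - theta0)**(N - k) for k in range(N + 1)]
    p_obs = pmf[y]
    # "equal to or more extreme" = outcomes with null probability <= that of y
    return sum(p for p in pmf if p <= p_obs + 1e-12)

# First gene in the data table: y = 62 BY reads out of N = 107
p = exact_binomial_test(62, 107)
```

Because Y is discrete, the attainable p-values form a finite set, which is why the p-value histogram on the next slide is not uniform under the null.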


Figure: Histogram of p-values from 4,844 exact tests.


Motivating Example: Smoothing/Penalization

I When looking at estimates over space or time, we want to know if the differences we see are “real”, or simply reflect sampling variability.
I In data-sparse situations, when one expects similarity, smoothing local patterns (in time, space, or both) can be highly beneficial.
I This can equivalently be thought of as penalization, in which large deviations from “neighbors”, suitably defined, are discouraged.
I In the examples that follow we will generically think of modeling prevalence.
I We give an example of temporal modeling.


Motivation for Smoothing: Temporal Case

I Temporal setting: Even if the underlying prevalence is the same over time, we will see differences in the empirical estimates.
I Figure 3 demonstrates this: we sampled binomial data with n = 10, 20, 200 and p = 0.2 (shown in blue) in all cases.
I In the top plot in particular, we might conclude there is large temporal variation, but all we are seeing is sampling variation.
I Figure 4 summarizes estimates from a second simulation in which there is a real temporal pattern; here we would not want to oversmooth and remove the trend.
I Later (Lecture 5) I will apply temporal smoothing models to these two sets of data.


Figure: Prevalence estimates over time (months) from simulated data with true prevalence of p = 0.2 (blue solid lines). Three panels: n1 = 10, n2 = 20, n3 = 200.


Figure: Prevalence estimates over time (months) from simulated data; the true prevalence corresponds to the curved blue solid line. Three panels: n1 = 10, n2 = 20, n3 = 200.


Bayesian Analysis of Binomial Data


Bayes Theorem Recap

I We derive the posterior distribution via Bayes theorem:

      p(θ|y) = Pr(y|θ) × p(θ) / Pr(y).   (1)

I The denominator:

      Pr(y) = ∫ Pr(y|θ) × p(θ) dθ = E[Pr(y|θ)]

  is a normalizing constant that ensures the RHS of (1) integrates to 1 (we assume a continuous parameter θ).
I More colloquially:

      Posterior ∝ Likelihood × Prior = Pr(y|θ) × p(θ),

  since in considering the posterior we only need to worry about terms that depend on the parameter θ.


Overview of Bayesian Inference

Simply put, to carry out a Bayesian analysis one must specify a likelihood (a probability distribution for the data) and a prior (beliefs about the parameters of the model).

And then do some computation... and interpretation...

The approach is therefore model-based, in contrast to approaches in which only the mean and the variance of the data are specified (e.g., weighted least squares, quasi-likelihood).


Overview of Bayesian Inference

To carry out inference, integration is required, and a large fraction of the Bayesian research literature focuses on this aspect. Bayesian approaches to:

1. Estimation: marginal posterior distributions on parameters of interest.
2. Hypothesis Testing: Bayes factors give the evidence in the data with respect to two or more hypotheses, and provide one approach.
3. Prediction: via the predictive distribution.

These three endeavors will now be described in the context of a binomial model.


Elements of Bayes Theorem for a Binomial Model

We assume independent responses with a common “success” probability θ.

In this case, the contribution of the data is through the binomial probability distribution:

      Pr(Y = y|θ) = (N choose y) θ^y (1 − θ)^(N−y)   (2)

which tells us the probability of seeing Y = y, y = 0, 1, . . . , N, given the probability θ.

For fixed y, we may view (2) as a function of θ: this is the likelihood function.

The maximum likelihood estimate (MLE) is the value

      θ̂ = y/N

that gives the highest probability to the observed data, i.e., maximizes the likelihood function.


Figure: Binomial distributions for two values of θ (0.5 and 0.3) with N = 10.


Figure: Binomial likelihoods for y = 5 (left) and y = 3 (right), with N = 10. The MLEs are indicated in red.


The Beta Distribution as a Prior Choice for Binomial θ

I Bayes theorem requires the likelihood, which we have already specified as binomial, and the prior.
I For a probability 0 < θ < 1 an obvious candidate prior is the uniform distribution on (0, 1): but this is too restrictive in general.
I The beta distribution, Beta(a, b), is more flexible and so may be used for θ, with a and b specified in advance, i.e., a priori. The uniform distribution is a special case with a = b = 1.
I The form of the beta distribution is

      p(θ) = [Γ(a + b) / (Γ(a)Γ(b))] θ^(a−1) (1 − θ)^(b−1)

  for 0 < θ < 1, where Γ(·) is the gamma function¹.
I The distribution is valid² for a > 0, b > 0.

¹ Γ(z) = ∫₀^∞ t^(z−1) e^(−t) dt.
² A distribution is valid if it is non-negative and integrates to 1.


The Beta Distribution as a Prior Choice for Binomial θ

How can we think about specifying a and b?

For the normal distribution the parameters µ and σ² are just the mean and variance, but for the beta distribution a and b have no such simple interpretation.

The mean and variance are:

      E[θ] = a / (a + b)

      var(θ) = E[θ](1 − E[θ]) / (a + b + 1).

Hence, increasing a and/or b concentrates the distribution about the mean.

The quantiles, e.g. the median or the 10% and 90% points, are not available as a simple formula, but are easily obtained within software such as R using the function qbeta(p, a, b).
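The moment formulas above can be cross-checked by simulation, in the spirit of the sampling-based summaries used throughout this lecture. A minimal sketch in Python (the slides use R's qbeta; here quantiles are simply read off the sorted samples):

```python
import random
random.seed(1)

a, b = 4, 2
draws = [random.betavariate(a, b) for _ in range(200_000)]

mean = sum(draws) / len(draws)
var = sum((x - mean) ** 2 for x in draws) / len(draws)

# Closed-form moments from the slide
exact_mean = a / (a + b)                                  # 2/3
exact_var = exact_mean * (1 - exact_mean) / (a + b + 1)   # ~0.0317

# Empirical quantiles (cf. R's qbeta) from the sorted samples
draws.sort()
q10, q50, q90 = (draws[int(p * len(draws))] for p in (0.10, 0.50, 0.90))
```

With 200,000 draws the sample mean and variance agree with the closed forms to about three decimal places.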


Figure: Beta distributions, Beta(a, b), for (a, b) = (1, 1), (1, 2), (1, 5), (2, 2), (4, 2), (5, 5); the red lines indicate the means.


Samples to Summarize Beta Distributions

Probability distributions can be investigated by generating samplesand then examining histograms, moments and quantiles.

In Figure 8 we show histograms of beta distributions for different choices of a and b.


Figure: Random samples from beta distributions with (a, b) = (1, 1), (1, 2), (1, 5), (2, 2), (4, 2), (5, 5); sample means shown as red lines.


Samples for Describing Weird Parameters

I So far the samples we have generated have produced summaries we can easily obtain anyway.
I But what about functions of the probability θ, such as the odds θ/(1 − θ)?
I Once we have samples for θ we can simply transform the samples to the functions of interest.
I We may have clearer prior opinions about the odds than the probability.

Figure: Samples from the prior on the odds θ/(1 − θ) with θ ∼ Beta(10, 10); the red line indicates the sample mean.
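The transformation trick described in the bullets above is one line of code once the θ samples exist. A sketch in Python (the lecture's own code is in R); note that the mean odds exceed 1 even though E[θ] = 0.5, because the transform is nonlinear:

```python
import random
random.seed(2)

# Draw from the Beta(10, 10) prior on θ, then transform each
# draw to the odds θ/(1 − θ)
theta = [random.betavariate(10, 10) for _ in range(100_000)]
odds = [t / (1 - t) for t in theta]

mean_odds = sum(odds) / len(odds)
# mean_odds is about 10/9, not 1: means do not map through a
# nonlinear transform
```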


Issues with Uniformity

We might think that if we have little prior opinion about a parameter then we can simply assign a uniform prior, i.e. a prior

      p(θ) ∝ const.

There are two problems with this strategy:

I We can’t be uniform on all scales since, if φ = g(θ):

      p_φ(φ) = p_θ(g⁻¹(φ)) × |dθ/dφ|,

  where p_φ(φ) is the prior for φ, p_θ(g⁻¹(φ)) is the prior for θ, and |dθ/dφ| is the Jacobian. If g(·) is a nonlinear function, the Jacobian will be a function of φ, and hence the prior for φ will not be uniform.
I If the parameter is not on a finite range, an improper distribution will result (that is, the form will not integrate to 1). This can lead to an improper posterior distribution, and without a proper posterior we can’t do inference.


Are Priors Really Uniform?

I We illustrate the first (non-uniform on all scales) point.
I In the binomial example a uniform prior for θ seems a natural choice.
I But suppose we are going to model on the logistic scale, so that

      φ = log[θ/(1 − θ)]

  is a quantity of interest.
I A uniform prior on θ produces the very non-uniform distribution on φ in Figure 10.
I Not being uniform on all scales is not necessarily a problem, and is correct probabilistically, but one should be aware of this characteristic.

Figure: Samples from the prior on the log odds φ = log[θ/(1 − θ)] with θ ∼ Beta(1, 1); the red line indicates the sample mean.


Posterior Derivation: The Quick Way

I When we want to identify a particular probability distribution we only need to concentrate on terms that involve the random variable.
I For example, if the random variable is X and we see a density of the form

      p(x) ∝ exp(c₁x² + c₂x),

  for constants c₁ and c₂, then we know that the random variable X must have a normal distribution.


Posterior Derivation: The Quick Way

I For the binomial-beta model we concentrate on the terms that involve θ.
I The posterior is

      p(θ|y) ∝ Pr(y|θ) × p(θ)
             = θ^y (1 − θ)^(N−y) × θ^(a−1) (1 − θ)^(b−1)
             = θ^(y+a−1) (1 − θ)^(N−y+b−1).

I We recognize this as the important part of a Beta(y + a, N − y + b) distribution.
I We know what the normalizing constant must be, because we have a distribution which must integrate to 1.
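The conjugate update is simple enough to state in code. A minimal sketch in Python (the lecture uses R), applied to the first gene in the ASE table with a uniform Beta(1, 1) prior:

```python
# Conjugate beta-binomial update: a Beta(a, b) prior combined with
# y successes in N trials gives a Beta(y + a, N - y + b) posterior.
def posterior_params(y, N, a=1.0, b=1.0):
    return y + a, N - y + b

# First gene in the ASE table: y = 62 BY reads out of N = 107,
# with a uniform Beta(1, 1) prior
a_post, b_post = posterior_params(62, 107)   # (63.0, 46.0)
post_mean = a_post / (a_post + b_post)       # 63/109, close to the MLE 0.58
```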


Posterior Derivation: The Long (Unnecessary) Way

I The posterior can also be calculated by keeping in all the normalizing constants:

      p(θ|y) = Pr(y|θ) × p(θ) / Pr(y)
             = [1/Pr(y)] (N choose y) θ^y (1 − θ)^(N−y) × [Γ(a + b)/(Γ(a)Γ(b))] θ^(a−1) (1 − θ)^(b−1).   (3)

I The normalizing constant is

      Pr(y) = ∫₀¹ Pr(y|θ) × p(θ) dθ
            = (N choose y) [Γ(a + b)/(Γ(a)Γ(b))] ∫₀¹ θ^(y+a−1) (1 − θ)^(N−y+b−1) dθ
            = (N choose y) [Γ(a + b)/(Γ(a)Γ(b))] [Γ(y + a)Γ(N − y + b)/Γ(N + a + b)].

I The integrand on line 2 is a Beta(y + a, N − y + b) distribution, up to a normalizing constant, and so we know what this constant has to be.


Posterior Derivation: The Long (and Unnecessary) Way

I The normalizing constant is therefore:

      Pr(y) = (N choose y) [Γ(a + b)/(Γ(a)Γ(b))] [Γ(y + a)Γ(N − y + b)/Γ(N + a + b)].

I This is a probability distribution, i.e. Σ_{y=0}^{N} Pr(y) = 1 with Pr(y) > 0.
I For a particular y value, this expression tells us the probability of that value given the model, i.e. the likelihood and prior we have selected: this will reappear later in the context of hypothesis testing.
I Substitution of Pr(y) into (3) and canceling the terms that appear in both the numerator and denominator gives the posterior:

      p(θ|y) = [Γ(N + a + b)/(Γ(y + a)Γ(N − y + b))] θ^(y+a−1) (1 − θ)^(N−y+b−1),

  which is a Beta(y + a, N − y + b).


The Posterior Mean: A Summary of the Posterior

I Recall the mean of a Beta(a, b) is a/(a + b).
I The posterior mean of a Beta(y + a, N − y + b) is therefore

      E[θ|y] = (y + a)/(N + a + b)
             = y/(N + a + b) + a/(N + a + b)
             = (y/N) × N/(N + a + b) + a/(a + b) × (a + b)/(N + a + b)
             = MLE × W + Prior Mean × (1 − W).

I The weight W is

      W = N/(N + a + b).

I As N increases, the weight tends to 1, so that the posterior mean gets closer and closer to the MLE.
I Notice that the uniform prior a = b = 1 gives a posterior mean of

      E[θ|y] = (y + 1)/(N + 2).


The Posterior Mode

I First, note that the mode of a Beta(a, b) is

      mode(θ) = (a − 1)/(a + b − 2).

I As with the posterior mean, the posterior mode takes a weighted form:

      mode(θ|y) = (y + a − 1)/(N + a + b − 2)
                = (y/N) × N/(N + a + b − 2) + (a − 1)/(a + b − 2) × (a + b − 2)/(N + a + b − 2)
                = MLE × W* + Prior Mode × (1 − W*).

I The weight W* is

      W* = N/(N + a + b − 2).

I Notice that the uniform prior a = b = 1 gives a posterior mode of

      mode(θ|y) = y/N,

  the MLE. Which makes sense, right?


Other Posterior Summaries

I We will rarely want to report a point estimate alone, whether it be a posterior mean or posterior median.
I Interval estimates are obtained in the obvious way.
I A simple way of performing testing of particular parameter values of interest is via examination of interval estimates.
I For example, does a 95% interval contain the value θ₀ = 0.5?


Other Posterior Summaries

I In our beta-binomial running example, a 90% posterior credible interval (θ_L, θ_U) results from the points

      0.05 = ∫₀^{θ_L} p(θ|y) dθ       0.95 = ∫₀^{θ_U} p(θ|y) dθ.

I The quantiles of a beta are not available in closed form, but are easy to evaluate in R:

      y <- 7; N <- 10; a <- b <- 1
      qbeta(c(0.05, 0.5, 0.95), y + a, N - y + b)
      [1] 0.4356258 0.6761955 0.8649245

I The 90% credible interval is (0.44, 0.86) and the posterior median is 0.68.


Prior Sensitivity

I For small datasets in particular it is a good idea to examine the sensitivity of inference to the prior choice, particularly for those parameters for which there is little information in the data.
I An obvious way to determine the latter is to compare the prior with the posterior, but experience often aids the process.
I Sometimes one may specify a prior that reduces the impact of the prior.
I In some situations, priors can be found that produce point and interval estimates that mimic a standard non-Bayesian analysis, i.e. have good frequentist properties.
I Such priors provide a baseline against which to compare analyses with more substantive priors.
I Other names for such priors are objective, reference and non-subjective.
I We now describe another approach to specification, via subjective priors.


Choosing a Prior, Approach One

I To select a beta, we need to specify two quantities, a and b.
I The posterior mean is

      E[θ|y] = (y + a)/(N + a + b).

I Viewing the denominator as a sample size suggests a method for choosing a and b within the prior.
I We need to specify two numbers, but rather than a and b, which are difficult to interpret, we may specify the mean m_prior = a/(a + b) and the prior sample size N_prior = a + b.
I We then solve for a and b via

      a = N_prior × m_prior
      b = N_prior × (1 − m_prior).

I Intuition: a is like a prior number of successes and b like the prior number of failures.
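The solve step above is a two-line function. A minimal sketch in Python (the lecture uses R), checked against the slides' example of N_prior = 5 and m_prior = 2/5:

```python
# Choose a and b from a prior mean and a prior "sample size"
def beta_from_mean_and_size(m_prior, n_prior):
    a = n_prior * m_prior          # prior number of successes
    b = n_prior * (1 - m_prior)    # prior number of failures
    return a, b

# The slides' example: N_prior = 5, m_prior = 2/5 gives Beta(2, 3)
a, b = beta_from_mean_and_size(2 / 5, 5)
```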


An Example

I Suppose we set N_prior = 5 and m_prior = 2/5.
I It is as if we saw 2 successes out of 5.
I Suppose we obtain data with N = 10 and y/N = 7/10.
I Hence W = 10/(10 + 5) and

      E[θ|y] = (7/10) × 10/(10 + 5) + (2/5) × 5/(10 + 5) = 9/15 = 3/5.

I Solving:

      a = N_prior × m_prior = 5 × 2/5 = 2
      b = N_prior × (1 − m_prior) = 5 × 3/5 = 3

I This gives a Beta(y + a, N − y + b) = Beta(7 + 2, 3 + 3) posterior.


Beta Prior, Likelihood and Posterior

Figure: The prior is Beta(2, 3), the likelihood is proportional to a Beta(7, 3), and the posterior is Beta(7 + 2, 3 + 3).


Choosing a Prior, Approach Two

I An alternative convenient way of choosing a and b is by specifying two quantiles for θ with associated (prior) probabilities.
I For example, we may wish Pr(θ < 0.1) = 0.05 and Pr(θ > 0.6) = 0.05.
I The values of a and b may be found numerically. For example, we may minimize

      [p₁ − Pr(θ < q₁ | a, b)]² + [p₂ − Pr(θ < q₂ | a, b)]²

  over a, b.
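The figure for this slide quotes Beta(2.73, 5.67) as the numerical solution. As a check, the sketch below evaluates the beta CDF by numerical integration (Python stdlib only, since the lecture's pbeta/qbeta live in R) and confirms the two tail conditions approximately hold; the midpoint-rule beta_cdf is a rough stand-in for R's pbeta, not a production routine:

```python
import math

def beta_cdf(q, a, b, n=4000):
    """Pr(θ < q) for θ ~ Beta(a, b), by midpoint-rule integration of
    the beta density (a rough stand-in for R's pbeta; assumes a, b > 1)."""
    logc = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = q / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(logc + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))
    return total * h

# Check Beta(2.73, 5.67) against Pr(θ < 0.1) = 0.05, Pr(θ > 0.6) = 0.05
lo = beta_cdf(0.1, 2.73, 5.67)       # close to 0.05
hi = 1 - beta_cdf(0.6, 2.73, 5.67)   # close to 0.05
```

In practice one would minimize the sum-of-squares criterion above with a numerical optimizer over (a, b); this snippet only verifies the quoted solution.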

Figure: Beta(2.73, 5.67) prior with the 5% and 95% quantiles highlighted.


Bayesian Sequential Updating

I We show how probabilistic beliefs are updated as we receive more data.
I Suppose the data arrive sequentially via two experiments:
  1. Experiment 1: (y₁, N₁).
  2. Experiment 2: (y₂, N₂).
I Prior 1: θ ∼ Beta(a, b).
I Likelihood 1: y₁|θ ∼ Binomial(N₁, θ).
I Posterior 1: θ|y₁ ∼ Beta(a + y₁, b + N₁ − y₁).
I This posterior forms the prior for experiment 2.
I Prior 2: θ ∼ Beta(a*, b*) where a* = a + y₁, b* = b + N₁ − y₁.
I Likelihood 2: y₂|θ ∼ Binomial(N₂, θ).
I Posterior 2: θ|y₁, y₂ ∼ Beta(a* + y₂, b* + N₂ − y₂).
I Substituting for a*, b*:

      θ|y₁, y₂ ∼ Beta(a + y₁ + y₂, b + N₁ − y₁ + N₂ − y₂).
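The sequential and batch updates give identical posteriors, which is easy to confirm in code. A sketch in Python with hypothetical counts (y₁ = 3 of N₁ = 10, y₂ = 5 of N₂ = 12 are made up for illustration):

```python
# Sequential vs. batch conjugate updating give the same posterior
def update(a, b, y, n):
    """Beta(a, b) prior + y successes in n trials -> posterior params."""
    return a + y, b + n - y

a0, b0 = 1, 1                    # Beta(1, 1) prior (assumed)
y1, n1, y2, n2 = 3, 10, 5, 12    # hypothetical experiments

seq = update(*update(a0, b0, y1, n1), y2, n2)   # two updates in turn
batch = update(a0, b0, y1 + y2, n1 + n2)        # one combined update
# seq == batch == (9, 15)
```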


Bayesian Sequential Updating

I Schematically:

      (a, b) → (a + y₁, b + N₁ − y₁) → (a + y₁ + y₂, b + N₁ − y₁ + N₂ − y₂)

I Suppose instead we obtain the data in one go, as y* = y₁ + y₂ successes from N* = N₁ + N₂ trials.
I The posterior is

      θ|y* ∼ Beta(a + y*, b + N* − y*),

  which is the same as when we receive the data in two separate instances.


Predictive Distribution

I Suppose we see y successes out of N trials, and now wish to obtain a predictive distribution for a future experiment with M trials.
I Let Z = 0, 1, . . . , M be the number of successes.
I Predictive distribution:

      Pr(z|y) = ∫₀¹ p(z, θ|y) dθ
              = ∫₀¹ Pr(z|θ, y) p(θ|y) dθ
              = ∫₀¹ Pr(z|θ) × p(θ|y) dθ,

  where Pr(z|θ) is the binomial and p(θ|y) the posterior; we move between lines 2 and 3 because z is conditionally independent of y given θ.


Predictive Distribution

Continuing with the calculation:

      Pr(z|y) = ∫₀¹ Pr(z|θ) × p(θ|y) dθ
              = ∫₀¹ (M choose z) θ^z (1 − θ)^(M−z) × [Γ(N + a + b)/(Γ(y + a)Γ(N − y + b))] θ^(y+a−1) (1 − θ)^(N−y+b−1) dθ
              = (M choose z) [Γ(N + a + b)/(Γ(y + a)Γ(N − y + b))] ∫₀¹ θ^(y+a+z−1) (1 − θ)^(N−y+b+M−z−1) dθ
              = (M choose z) [Γ(N + a + b)/(Γ(y + a)Γ(N − y + b))] [Γ(a + y + z)Γ(b + N − y + M − z)/Γ(a + b + N + M)]

for z = 0, 1, . . . , M.

A likelihood approach would take the predictive distribution as Binomial(M, θ̂) with θ̂ = y/N: this does not account for estimation uncertainty.

In general, we have sampling uncertainty (which we can’t get away from) and estimation uncertainty.


Predictive Distribution

Figure: Likelihood and Bayesian predictive distributions of seeing z = 0, 1, . . . , M = 10 successes, after observing y = 2 successes out of N = 20 trials (with a = b = 1).


Predictive Distribution

The posterior and sampling distributions won’t usually combine soconveniently.

In general, we may form a Monte Carlo estimate of the predictivedistribution:

p(z|y) =

∫p(z|θ)p(θ|y)dθ

= Eθ|y [p(z|θ)]

≈ 1S

S∑s=1

p(z|θ(s))

where θ(s) ∼ p(θ|y), s = 1, . . . ,S, is a sample from the posterior.

This provides an estimate of the predictive distribution at the point z.
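This Monte Carlo average is a few lines of code. A sketch in Python (not the course's R), reusing the conjugate Beta(a + y, b + N − y) posterior and the earlier example values y = 2, N = 20, M = 10, a = b = 1:

```python
import numpy as np
from scipy.stats import beta, binom, betabinom

rng = np.random.default_rng(seed=1)
y, N, M, a, b = 2, 20, 10, 1, 1
S = 100_000

# Draw S samples from the posterior theta | y ~ Beta(a + y, b + N - y).
theta = beta.rvs(a + y, b + N - y, size=S, random_state=rng)

# Average the binomial pmf over the posterior draws: estimate of Pr(z | y) at z = 1.
z = 1
mc_estimate = binom.pmf(z, M, theta).mean()

# In this conjugate case the exact answer is beta-binomial, so we can check.
exact = betabinom.pmf(z, M, a + y, b + N - y)
print(mc_estimate, exact)
```

Here the exact beta-binomial answer is available as a check; in non-conjugate problems only the Monte Carlo estimate would be.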

49 / 67

Predictive Distribution

I Alternatively, we may sample from p(z|θ^(s)) a large number of times to reconstruct the predictive distribution.
I First sample from the posterior:

      θ^(s)|y ∼ p(θ|y).

I Next sample from the likelihood:

      z^(s)|θ^(s) ∼ p(z|θ^(s)),

  for s = 1, . . . , S, to give a sample z^(s) from the predictive distribution; this is illustrated in the figure below.

[Figure omitted: histogram of sampled z values.]

Figure: Sampling version of prediction in Figure 13, based on S = 10,000 samples.
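The two-step sampling scheme above can be sketched directly. A Python version (the lecture uses R) with the running example values y = 2, N = 20, M = 10, a = b = 1 and S = 10,000:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
y, N, M, a, b = 2, 20, 10, 1, 1
S = 10_000

# Step 1: sample theta^(s) from the Beta(a + y, b + N - y) posterior.
theta = rng.beta(a + y, b + N - y, size=S)

# Step 2: sample z^(s) | theta^(s) ~ Binomial(M, theta^(s)).
z = rng.binomial(M, theta)

# The tabulated frequencies approximate the predictive distribution.
counts = np.bincount(z, minlength=M + 1)
print(counts / S)
```

The relative frequencies approximate the beta-binomial predictive probabilities, and the draws z^(s) can feed any further summary (means, intervals, tail probabilities).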

50 / 67

Difference in Binomial Proportions

I It is straightforward to extend the methods presented for a single binomial sample to a pair of samples.
I Suppose we carry out two binomial experiments:

      Y1|θ1 ∼ binomial(N1, θ1)   for sample 1
      Y2|θ2 ∼ binomial(N2, θ2)   for sample 2

I Interest focuses on θ1 − θ2, and often on examining the possibility that θ1 = θ2.
I With a sampling-based methodology, and independent beta priors on θ1 and θ2, it is straightforward to examine the posterior p(θ1 − θ2|y1, y2).

51 / 67

Difference in Binomial Proportions

I Savage et al. (2008) give data on allele frequencies within a gene that has been linked with skin cancer.
I It is of interest to examine differences in allele frequencies between populations.
I We examine one SNP and extract data on Northern European (NE) and United States (US) populations.
I Let θ1 and θ2 be the allele frequencies in the NE and US populations from which the samples were drawn, respectively.
I The allele frequencies were 10.69% and 13.21%, with sample sizes of 650 and 265, in the NE and US samples, respectively.
I We assume independent Beta(1,1) priors on each of θ1 and θ2.
I The posterior probability that θ1 − θ2 is greater than 0 is 0.12 (computed as the proportion of the samples θ1^(s) − θ2^(s) that are greater than 0), so there is little evidence of a difference in allele frequencies between the NE and US samples.
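This posterior probability is straightforward to reproduce by simulation. A sketch in Python (the lecture's analysis is in R); the counts y1 ≈ 69 and y2 ≈ 35 are back-calculated here from the reported frequencies and sample sizes, so they are approximations rather than the published data:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
S = 100_000

# Approximate allele counts: 10.69% of 650 (NE) and 13.21% of 265 (US).
y1, N1 = 69, 650   # NE sample (assumed count)
y2, N2 = 35, 265   # US sample (assumed count)

# Independent Beta(1,1) priors give independent beta posteriors.
theta1 = rng.beta(1 + y1, 1 + N1 - y1, size=S)
theta2 = rng.beta(1 + y2, 1 + N2 - y2, size=S)

# Posterior probability that theta1 - theta2 > 0.
p_diff = np.mean(theta1 - theta2 > 0)
print(p_diff)
```

With these approximate counts the simulated probability lands close to the reported 0.12.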

52 / 67

Binomial Two Sample Example

[Figure omitted: three histograms, for θ1, θ2 and θ1 − θ2.]

Figure: Histogram representations of p(θ1|y1), p(θ2|y2) and p(θ1 − θ2|y1, y2). The red line in the right plot is at the reference point of zero.

53 / 67

Bayes Factors for Hypothesis Testing

I The Bayes factor provides a summary of the evidence for a particular hypothesis (model) as compared to another.
I The Bayes factor is

      BF = Pr(y|H0) / Pr(y|H1)

  and so is simply the probability of the data under H0 divided by the probability of the data under H1.
I Values of BF > 1 favor H0, while values of BF < 1 favor H1.
I Note the similarity to the likelihood ratio

      LR = Pr(y|H0) / Pr(y|θ̂),

  where θ̂ is the MLE under H1.
I If there are no unknown parameters in H0 and H1 (for example, H0 : θ = 0.5 versus H1 : θ = 0.3), then the Bayes factor is identical to the likelihood ratio.

54 / 67

Calibration of Bayes Factors

I Kass and Raftery (1995) suggest intervals of Bayes factors for reporting:

      1/Bayes Factor   Evidence Against H0
      1 to 3.2         Not worth more than a bare mention
      3.2 to 20        Positive
      20 to 150        Strong
      >150             Very strong

I These provide a guideline, but should not be followed without question.
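The table translates directly into a small helper function. A sketch (the function name is ours, not from the lecture):

```python
def kass_raftery_category(bf_reciprocal):
    """Map 1/BF (evidence against H0) to the Kass and Raftery (1995) label."""
    if bf_reciprocal > 150:
        return "Very strong"
    if bf_reciprocal > 20:
        return "Strong"
    if bf_reciprocal > 3.2:
        return "Positive"
    return "Not worth more than a bare mention"

print(kass_raftery_category(200))  # -> Very strong
```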

55 / 67

Example: Bayes Factors for Binomial Data

For each gene in the ASE dataset we may be interested in H0 : θ = 0.5 versus H1 : θ ≠ 0.5.

The numerator and denominator of the Bayes factor are:

    Pr(y|H0) = (N choose y) 0.5^y 0.5^(N−y)

    Pr(y|H1) = ∫₀¹ (N choose y) θ^y (1−θ)^(N−y) × [Γ(a+b) / (Γ(a) Γ(b))] θ^(a−1) (1−θ)^(b−1) dθ
             = (N choose y) [Γ(a+b) / (Γ(a) Γ(b))] × [Γ(y+a) Γ(N−y+b) / Γ(N+a+b)]

We have already seen the denominator calculation, when we normalized the posterior.
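Both terms have closed forms, so the Bayes factor is a short calculation. A sketch in Python (the course code is in R), working on the log scale with gammaln for numerical stability:

```python
from math import comb, exp, log
from scipy.special import gammaln

def log_bf_binomial(y, N, a=1, b=1):
    """log BF = log Pr(y | H0: theta = 0.5) - log Pr(y | H1: theta ~ Beta(a, b))."""
    log_m0 = log(comb(N, y)) + N * log(0.5)
    log_m1 = (log(comb(N, y))
              + gammaln(a + b) - gammaln(a) - gammaln(b)
              + gammaln(y + a) + gammaln(N - y + b) - gammaln(N + a + b))
    return log_m0 - log_m1

# y = 10 out of N = 20 is perfectly consistent with H0: theta = 0.5,
# so the Bayes factor exceeds 1 (favors H0).
print(exp(log_bf_binomial(10, 20)))
```

With a = b = 1 the denominator reduces to 1/(N + 1), the marginal likelihood under a uniform prior.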

56 / 67

Values Taken by the Negative Log Bayes Factor, as a Function of y

[Figure omitted: −log Bayes factor plotted against y, with horizontal reference lines at the "Bare Mention", "Positive", "Strong" and "Very Strong" thresholds.]

Figure: Negative log Bayes factor as a function of y, for y|θ ∼ Binomial(20, θ), y = 0, 1, . . . , 20 and a = b = 1. High values indicate evidence against the null.

57 / 67


Analysis of ASE Data

58 / 67


Three Approaches to Inference for the ASE Data

1. Posterior Probabilities:
   I A simple approach to testing is to calculate the posterior probability that θ < 0.5.
   I We can then pick a threshold for indicating worthy of further study, e.g. if Pr(θ < 0.5|y) < 0.01 or Pr(θ < 0.5|y) > 0.99.
2. Bayes Factors:
   I Calculate the Bayes factor.
   I Pick a threshold for indicating worthy of further study, e.g. if the reciprocal of the Bayes factor is greater than 150.
3. Decision theory:
   I Place priors on the null and alternative hypotheses.
   I Calculate the posterior odds:

         Pr(H0|y) / Pr(H1|y) = [Pr(y|H0) / Pr(y|H1)] × [Pr(H0) / Pr(H1)]
         Posterior Odds = Bayes Factor × Prior Odds

   I Pick a threshold R, so that if the posterior odds are less than R we choose H1.
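For a single gene, all three quantities can be computed side by side. A sketch in Python (the lecture's computations are in R); the counts y = 5 out of N = 30 are hypothetical, and the prior odds of 1 are an illustrative choice:

```python
from math import comb, exp, log
from scipy.stats import beta
from scipy.special import betaln

y, N, a, b = 5, 30, 1, 1   # hypothetical gene counts

# 1. Posterior probability that theta < 0.5, from the conjugate beta posterior.
post_prob = beta.cdf(0.5, a + y, b + N - y)

# 2. Bayes factor for H0: theta = 0.5 versus H1: theta ~ Beta(a, b).
log_m0 = log(comb(N, y)) + N * log(0.5)
log_m1 = log(comb(N, y)) + betaln(y + a, N - y + b) - betaln(a, b)
bf = exp(log_m0 - log_m1)

# 3. Posterior odds = Bayes factor x prior odds (here prior odds = 1).
prior_odds = 1.0
posterior_odds = bf * prior_odds

print(post_prob, bf, posterior_odds)
```

Here y/N is well below 0.5, so Pr(θ < 0.5|y) is near 1 and the Bayes factor is well below 1, i.e. all three summaries point away from H0.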

59 / 67

Bayesian Analysis of the ASE Data

I Here we give a histogram of the posterior probabilities Pr(θ < 0.5|y), and we see that large numbers of genes have probabilities close to 0 and 1, indicating allele-specific expression (ASE).

[Figure omitted: histogram of posterior probabilities of θ < 0.5, frequency on the vertical axis.]

Figure: Histogram of 4,844 posterior probabilities of θ < 0.5.

60 / 67

Bayesian Analysis of the ASE Data

I Here we plot Pr(θ < 0.5|y) versus the p-values, and the general pattern is what we would expect: small p-values have posterior probabilities close to 0 and 1.
I The odd-looking lines are due to the discreteness of the data.

[Figure omitted: scatterplot of the posterior probability of θ < 0.5 against the p-value.]

Figure: Posterior probabilities of θ < 0.5 and p-values from exact tests.

61 / 67

Bayesian Analysis of the ASE Data

I Here we plot the negative log Bayes factor against Pr(θ < 0.5|y).
I Large values of the former correspond to strong evidence of ASE.
I Again we see agreement in inference, with large values of the negative log Bayes factor corresponding to Pr(θ < 0.5|y) close to 0 and 1.

[Figure omitted: scatterplot of the negative log Bayes factor against the posterior probability of θ < 0.5.]

Figure: Negative log Bayes factor versus posterior probabilities of θ < 0.5.

62 / 67

ASE Example

Applying a Bonferroni correction to control the family-wise error rate at 0.05 gives a p-value threshold of 0.05/4844 ≈ 10^(−5) and 111 rejections. More on this later!

There were 278 genes with Pr(θ < 0.5|y) < 0.01 and 242 genes with Pr(θ < 0.5|y) > 0.99.

Following the guideline of requiring very strong evidence, there were 197 genes with the reciprocal Bayes factor greater than 150.

Requiring less stringent evidence, i.e. strong and very strong (reciprocal BF greater than 20), there were 359 genes.

We later consider a formal decision theory approach to testing.

In this example, the rankings from the different approaches are similar, but the calibration, i.e. picking a threshold, is not straightforward.

63 / 67

ASE Output Data

I Below are some summaries from the ASE analysis; we order with respect to the variable logBFr, the log of the reciprocal Bayes factor (so that high numbers correspond to strong evidence against the null).
I The postprob variable is the posterior probability of θ < 0.5.

    allvals <- data.frame(Nsum, ysum, pvals, postprob, logBFr)
    oBF <- order(-logBFr)
    orderallvals <- allvals[oBF, ]
    head(orderallvals)

         Nsum ysum         pvals     postprob   logBFr
    4751  437    6 5.340324e-119 1.000000e+00 267.9572
    4041  625   97  1.112231e-72 1.000000e+00 161.1355
    2370  546  468  8.994944e-69 2.621622e-69 152.2517
    2770  256  245  1.127211e-58 2.943484e-59 129.6198

    tail(orderallvals)

         Nsum ysum     pvals  postprob    logBFr
    824   761  382 0.9422103 0.4567334 -2.086604
    2163  776  390 0.9142477 0.4429539 -2.091955
    3153  769  384 1.0000000 0.5143722 -2.097079
    2860 1076  546 0.6474878 0.3129473 -2.146555

64 / 67


Conclusions

65 / 67


Conclusions

Monte Carlo sampling provides flexibility of inference.

All of this lecture considered binomial sampling, for which there is only a single parameter. For more parameters, prior specification and computing become more interesting... as we shall see.

Multiple testing is considered in Lecture 9.

For estimation, and with moderate to large sample sizes, conclusions from Bayesian and non-Bayesian approaches often coincide.

For testing it’s more complex, as discussed in Lecture 9.

66 / 67

Conclusions

Benefits of a Bayesian approach:
I Inference is based on probability, and output is very intuitive.
I The framework is flexible, and so complex models can be built.
I Can incorporate prior knowledge!
I If the sample size is large, prior choice is less crucial.

Challenges of a Bayesian analysis:
I Require a likelihood and a prior, and inference is only as good as the appropriateness of these choices.
I Computation can be daunting, though software is becoming more user-friendly and flexible; later we will describe and illustrate a number of approaches, including INLA and Stan.
I One should be wary of the model becoming too complex: we have the technology to contemplate complicated models, but do the data support the complexity?

67 / 67

References

Kass, R. and Raftery, A. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773–795.

Savage, S. A., Gerstenblith, M. R., Goldstein, A., Mirabello, L., Fargnoli, M. C., Peris, K., and Landi, M. T. (2008). Nucleotide diversity and population differentiation of the melanocortin 1 receptor gene, MC1R. BMC Genetics, 9, 31.

Skelly, D., Johansson, M., Madeoy, J., Wakefield, J., and Akey, J. (2011). A powerful and flexible statistical framework for testing hypotheses of allele-specific gene expression from RNA-Seq data. Genome Research, 21, 1728–1737.

67 / 67

