J. Daunizeau Brain and Spine Institute, Paris, France


J. Daunizeau

Brain and Spine Institute, Paris, France

Wellcome Trust Centre for Neuroimaging, London, UK

Bayesian inference

Overview of the talk

1 Probabilistic modelling and representation of uncertainty
1.1 Bayesian paradigm
1.2 Hierarchical models
1.3 Frequentist versus Bayesian inference

2 Numerical Bayesian inference methods
2.1 Sampling methods
2.2 Variational methods (ReML, EM, VB)

3 SPM applications
3.1 aMRI segmentation
3.2 Decoding of brain images
3.3 Model-based fMRI analysis (with spatial priors)
3.4 Dynamic causal modelling

Degree of plausibility desiderata:
- should be represented using real numbers (D1)
- should conform with intuition (D2)
- should be consistent (D3)

[Figure: example with a=2, b=5]

• normalization: $\sum_a p(a) = 1$

• marginalization: $p(b) = \sum_a p(a, b)$

• conditioning (Bayes rule): $p(a, b) = p(a \mid b)\, p(b) = p(b \mid a)\, p(a)$

Bayesian paradigm: probability theory basics

Bayesian paradigm: deriving the likelihood function

- Model of data with unknown parameters:

$y = f(\theta)$, e.g., GLM: $f(\theta) = X\theta$

- But data is noisy: $y = f(\theta) + \epsilon$

- Assume noise/residuals are ‘small’:

4 0.05P

→ Distribution of data, given fixed parameters:

$p(y \mid \theta) = p_\epsilon(y - f(\theta))$, where $p_\epsilon$ is the density of the noise

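Likelihood: $p(y \mid \theta, m)$

Prior: $p(\theta \mid m)$

Bayes rule: $p(\theta \mid y, m) = \frac{p(y \mid \theta, m)\, p(\theta \mid m)}{p(y \mid m)}$

Bayesian paradigm: likelihood, priors and the model evidence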
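To make the roles of the likelihood, the prior and the model evidence concrete, here is a minimal sketch that applies Bayes' rule numerically on a grid. It assumes a single-regressor GLM, Gaussian noise and a zero-mean Gaussian prior; the data, the design and the names `gaussian_pdf` and `grid_posterior` are hypothetical and chosen only for illustration.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def grid_posterior(y, x, noise_sd, prior_sd, grid, step):
    """Apply Bayes' rule on a regular grid of theta values.

    Returns the posterior density on the grid and the model evidence p(y|m).
    """
    joint = []
    for theta in grid:
        # Likelihood p(y|theta,m): independent Gaussian residuals y_i - theta * x_i.
        lik = 1.0
        for yi, xi in zip(y, x):
            lik *= gaussian_pdf(yi, theta * xi, noise_sd)
        # Prior p(theta|m): zero-mean Gaussian (an assumption of this sketch).
        joint.append(lik * gaussian_pdf(theta, 0.0, prior_sd))
    # Evidence p(y|m): likelihood times prior, integrated over theta (Riemann sum).
    evidence = sum(joint) * step
    posterior = [j / evidence for j in joint]
    return posterior, evidence

step = 0.001
grid = [-5.0 + step * i for i in range(10001)]
x = [1.0, 2.0, 3.0]   # hypothetical design (one regressor)
y = [1.1, 1.9, 3.2]   # hypothetical data
posterior, evidence = grid_posterior(y, x, 0.5, 2.0, grid, step)
posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
```

For this conjugate Gaussian case the posterior mean is also available in closed form, which gives a simple check on the grid approximation.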

generative model m

Bayesian paradigm: forward and inverse problems

forward problem: likelihood $p(y \mid \theta, m)$

inverse problem: posterior distribution $p(\theta \mid y, m)$

Principle of parsimony: “plurality should not be assumed without necessity”

“Occam’s razor”:

[Figure; axis label: space of all data sets]

Model evidence: $p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta$

Bayesian paradigm: model comparison

[Figure: hierarchy of model levels, with arrows labelled “inference” and “causality”]

Hierarchical models: principle

Hierarchical models: directed acyclic graphs (DAGs)

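classical SPM

• define the null, e.g.: $H_0: \theta = 0$

• estimate parameters (obtain test stat.): $t \equiv t(Y)$

• apply decision rule, i.e.: if $P(t > t^* \mid H_0) \le \alpha$ then reject $H_0$

[Figure: null distribution $p(t \mid H_0)$, with threshold $t^*$ and tail probability $P(t > t^* \mid H_0)$]

Bayesian Model Comparison

• define two alternative models, e.g.: $m_0$: $p(\theta \mid m_0) = 1$ if $\theta = 0$, $0$ otherwise; $m_1$: $p(\theta \mid m_1) = \mathcal{N}(0, \Sigma)$

• apply decision rule, e.g.: if $\frac{P(m_0 \mid y)}{P(m_1 \mid y)} \ge 1$ then accept $m_0$

[Figure: $p(Y \mid m_0)$ and $p(Y \mid m_1)$ over the space of all datasets]

Frequentist versus Bayesian inference: a (quick) note on hypothesis testing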
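The Bayesian decision rule can be sketched for the simplest case: n noisy observations of a scalar θ, with a point null (m0: θ = 0) against a Gaussian prior (m1). The sketch assumes Gaussian noise of known variance and works with the sample mean, since the remaining factor of the likelihood is common to both models and cancels in the ratio. The function names and the numerical settings are hypothetical.

```python
import math

def log_gaussian_pdf(x, mean, var):
    """Log density of a univariate Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def posterior_prob_m0(ybar, n, noise_var, prior_var, prior_m0=0.5):
    """Posterior probability of the point-null model m0 given the sample mean ybar."""
    se2 = noise_var / n
    # Evidence under m0: theta is fixed at 0, so ybar ~ N(0, noise_var / n).
    log_ev0 = log_gaussian_pdf(ybar, 0.0, se2)
    # Evidence under m1: theta ~ N(0, prior_var) is integrated out,
    # so ybar ~ N(0, noise_var / n + prior_var).
    log_ev1 = log_gaussian_pdf(ybar, 0.0, se2 + prior_var)
    log_odds = log_ev0 - log_ev1 + math.log(prior_m0 / (1.0 - prior_m0))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical settings: 20 observations, unit noise variance, unit prior variance.
p_null_small_effect = posterior_prob_m0(ybar=0.0, n=20, noise_var=1.0, prior_var=1.0)
p_null_large_effect = posterior_prob_m0(ybar=1.0, n=20, noise_var=1.0, prior_var=1.0)
accept_m0 = p_null_small_effect >= 0.5
```

Unlike the classical test, this rule can accept the null: when the sample mean is close to zero, the simpler model m0 has the higher evidence.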

Family-level inference: trading model resolution against statistical power

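[Figure: four models over nodes A and B, with posterior probabilities P(m1|y) = 0.04, P(m2|y) = 0.25, and 0.7 and 0.01 for the remaining two models]

model selection error risk: $P(e = 1 \mid y) = 1 - \max_m P(m \mid y)$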


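[Figure: the models grouped into two families, with P(f1|y) = 0.05 and P(f2|y) = 0.95]

family inference (pool statistical evidence): $P(f \mid y) = \sum_{m \in f} P(m \mid y)$

model selection error risk: $P(e = 1 \mid y) = 1 - \max_f P(f \mid y)$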
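The arithmetic behind this slide can be sketched in a few lines. The four posterior probabilities are those shown on the slide; the labels `m3` and `m4` and the assignment of models to families are assumptions, inferred from the fact that 0.04 + 0.01 = 0.05 and 0.25 + 0.70 = 0.95.

```python
# Posterior model probabilities from the slide (labels m3 and m4 are assumed).
model_posteriors = {"m1": 0.04, "m2": 0.25, "m3": 0.70, "m4": 0.01}

# Assumed partition of the models into two families.
families = {"f1": ["m1", "m4"], "f2": ["m2", "m3"]}

def selection_error_risk(posteriors):
    """Probability of error when selecting the most probable hypothesis."""
    return 1.0 - max(posteriors.values())

# Family inference: pool the evidence by summing posteriors within each family.
family_posteriors = {
    name: sum(model_posteriors[m] for m in members)
    for name, members in families.items()
}

model_risk = selection_error_risk(model_posteriors)    # risk at the model level
family_risk = selection_error_risk(family_posteriors)  # risk at the family level
```

Under these assumptions the error risk drops from about 0.30 at the model level to about 0.05 at the family level, at the price of a coarser conclusion.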

Sampling methods: MCMC example (Gibbs sampling)
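As an illustration of Gibbs sampling, the sketch below draws from a bivariate Gaussian by alternately sampling each variable from its conditional distribution given the other. The target (zero means, unit variances, correlation `rho`) is an assumption chosen because its conditionals are known in closed form; it is not taken from the slides.

```python
import math
import random

def gibbs_bivariate_gaussian(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a zero-mean, unit-variance bivariate Gaussian."""
    rng = random.Random(seed)
    # Each conditional is Gaussian: theta1 | theta2 ~ N(rho * theta2, 1 - rho^2).
    cond_sd = math.sqrt(1.0 - rho * rho)
    theta1, theta2 = 0.0, 0.0
    samples = []
    for i in range(n_samples + burn_in):
        theta1 = rng.gauss(rho * theta2, cond_sd)  # sample theta1 given theta2
        theta2 = rng.gauss(rho * theta1, cond_sd)  # sample theta2 given theta1
        if i >= burn_in:                           # discard the initial transient
            samples.append((theta1, theta2))
    return samples

def sample_correlation(samples):
    """Empirical correlation between the two coordinates of the samples."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    cov = sum((a - m1) * (b - m2) for a, b in samples) / n
    v1 = sum((a - m1) ** 2 for a, _ in samples) / n
    v2 = sum((b - m2) ** 2 for _, b in samples) / n
    return cov / math.sqrt(v1 * v2)

samples = gibbs_bivariate_gaussian(rho=0.8, n_samples=20000)
estimated_rho = sample_correlation(samples)
```

Successive samples are correlated, so the effective sample size is smaller than the number of draws; the stronger the posterior correlation, the slower the chain mixes.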

Variational methods: VB / EM / ReML

→ VB : maximize the free energy F(q) w.r.t. the approximate posterior q(θ) under some (e.g., mean field, Laplace) simplifying constraint

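[Figure: approximate marginal posteriors $q(\theta_1)$ and $q(\theta_2)$, exact marginal posteriors $p(\theta_1 \mid y, m)$ and $p(\theta_2 \mid y, m)$, and exact joint posterior $p(\theta_1, \theta_2 \mid y, m)$]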
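A minimal sketch of the mean-field idea, assuming the exact posterior is a bivariate Gaussian with unit variances and correlation `rho` (a textbook case, not the model on the slide). Under the factorisation q(θ1)q(θ2), each free-energy update has a closed form: each factor is Gaussian, its mean depends on the current mean of the other factor, and its variance is the inverse of the corresponding diagonal precision. The function name is hypothetical.

```python
def mean_field_gaussian(mu, rho, n_iter=50):
    """Coordinate-wise mean-field updates for a bivariate Gaussian target.

    The target has mean mu, unit variances and correlation rho.
    Returns the means and variances of the factors q(theta1) and q(theta2).
    """
    # Entries of the precision matrix (inverse covariance) of the target.
    lam11 = lam22 = 1.0 / (1.0 - rho ** 2)
    lam12 = -rho / (1.0 - rho ** 2)
    m1, m2 = 0.0, 0.0  # arbitrary initialisation of the factor means
    for _ in range(n_iter):
        # Update q(theta1) given the current mean of q(theta2), then vice versa.
        m1 = mu[0] - (lam12 / lam11) * (m2 - mu[1])
        m2 = mu[1] - (lam12 / lam22) * (m1 - mu[0])
    # The factor variances are fixed by the diagonal of the precision matrix.
    return (m1, m2), (1.0 / lam11, 1.0 / lam22)

q_means, q_variances = mean_field_gaussian(mu=(1.0, -1.0), rho=0.8)
```

In this example the factor means converge to the true posterior means, while the factor variances (1 - rho^2) are smaller than the true marginal variances (1): the mean-field approximation is over-confident when parameters are strongly correlated.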

[Figure: SPM analysis pipeline: realignment, smoothing, normalisation (template), general linear model, statistical inference (Gaussian field theory, p < 0.05); Bayesian components: segmentation and normalisation, posterior probability maps (PPMs), dynamic causal modelling, multivariate decoding]

[Figure: MoG model with classes grey matter, white matter and CSF; nodes: class means, class variances, class frequencies, ith voxel value, ith voxel label]

aMRI segmentation: mixture of Gaussians (MoG) model
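The MoG model can be illustrated on one-dimensional intensities. The sketch below fits class means, class variances and class frequencies with plain EM, then assigns each voxel the label of its most probable class. EM is used here as one standard fitting scheme for illustration; it is not a reproduction of the SPM segmentation routine, and the synthetic intensities and the name `fit_mog` are hypothetical.

```python
import math
import random

def fit_mog(values, n_classes=3, n_iter=100):
    """Fit a 1-D mixture of Gaussians by EM; return means, variances, frequencies, labels."""
    n = len(values)
    ordered = sorted(values)
    # Initialise class means at evenly spaced quantiles of the data.
    means = [ordered[int((k + 0.5) * n / n_classes)] for k in range(n_classes)]
    grand_mean = sum(values) / n
    spread = sum((v - grand_mean) ** 2 for v in values) / n
    variances = [spread / n_classes ** 2] * n_classes
    freqs = [1.0 / n_classes] * n_classes
    for _ in range(n_iter):
        # E-step: posterior probability of each class label for each voxel.
        resp = []
        for v in values:
            w = [freqs[k] * math.exp(-0.5 * (v - means[k]) ** 2 / variances[k])
                 / math.sqrt(2.0 * math.pi * variances[k]) for k in range(n_classes)]
            total = sum(w)
            resp.append([wk / total for wk in w])
        # M-step: update class means, variances and frequencies.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, 1e-6)  # floor to avoid a degenerate class
            freqs[k] = nk / n
    # Label each voxel with its most probable class (from the last E-step).
    labels = [max(range(n_classes), key=lambda k: r[k]) for r in resp]
    return means, variances, freqs, labels

# Hypothetical intensities for three tissue classes (values are illustrative only).
random.seed(0)
true_means = (0.2, 0.5, 0.8)
voxel_values = [random.gauss(mu, 0.05) for mu in true_means for _ in range(300)]
means, variances, freqs, labels = fit_mog(voxel_values)
```

The per-voxel class probabilities computed in the E-step play the role of the voxel labels in the graphical model; taking their maximum gives a hard segmentation.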

Decoding of brain images: recognizing brain states from fMRI

[Figure: trial structure: fixation cross, pace, response]

[Figure: log-evidence of X-Y sparse mappings: effect of lateralization]

[Figure: log-evidence of X-Y bilateral mappings: effect of spatial deployment]

fMRI time series analysis: spatial priors and model comparison

[Figure: hierarchical model with nodes: fMRI time series, GLM coeff, prior variance of GLM coeff, prior variance of data noise, AR coeff (correlated noise)]

[Figure: short-term memory design matrix (X) and long-term memory design matrix (X)]

[Figure: PPM: regions best explained by short-term memory model; PPM: regions best explained by long-term memory model]

[Figure: four candidate models (m1, m2, m3, m4), each a network of V1 and V5 with stim and attention inputs; bar plot of the models' marginal likelihood $\ln p(y \mid m)$; estimated effective synaptic strengths for the best model (m4): 0.39, 0.26, 0.10]

Dynamic Causal Modelling: network structure identification

$\dot{x} = f(x, u, \theta)$

DCMs and DAGs: a note on causality

[Figure: log-evidence differences $\ln p(y \mid m_1) - \ln p(y \mid m_2)$ across subjects]

fixed effect: assume all subjects correspond to the same model

random effect: assume different subjects might correspond to different models

Dynamic Causal Modelling: model comparison for group studies
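Under the fixed-effect assumption, and treating subjects as independent, the evidence for a model multiplies across subjects, so the subject-wise log-evidence differences simply add up. A minimal sketch with made-up log-evidences (the numbers are hypothetical, not results from the talk); the random-effect analysis needs a hierarchical model over subjects and is not shown.

```python
import math

# Hypothetical log-evidences ln p(y_s | m) for four subjects and two models.
log_ev_m1 = [-120.3, -98.7, -143.2, -110.5]
log_ev_m2 = [-123.1, -99.0, -141.9, -115.2]

# Subject-wise log Bayes factors: ln p(y_s | m1) - ln p(y_s | m2).
log_bf_per_subject = [a - b for a, b in zip(log_ev_m1, log_ev_m2)]

# Fixed effect: all subjects share the same model, so the log-evidences sum.
group_log_bf = sum(log_bf_per_subject)

# Posterior probability of m1 at the group level, assuming equal model priors.
group_posterior_m1 = 1.0 / (1.0 + math.exp(-group_log_bf))
```

Because the differences are summed, a single subject with a very large log Bayes factor can dominate the fixed-effect result, which is one motivation for the random-effect alternative.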

I thank you for your attention.

A note on statistical significance: lessons from the Neyman-Pearson lemma

• Neyman-Pearson lemma: the likelihood ratio (or Bayes factor) test $\Lambda = \frac{p(y \mid H_1)}{p(y \mid H_0)} \ge u$ is the most powerful test of size $\alpha = p(\Lambda \ge u \mid H_0)$ to test the null.

• what is the threshold u, above which the Bayes factor test yields a type I error rate of 5%?

[Figure: ROC analysis (axis: type I error rate). MVB (Bayes factor): u = 1.09, power = 56%. CCA (F-statistics): F = 2.20, power = 20%.]
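One way to answer the threshold question is by simulation: draw data under the null, compute the likelihood ratio for each draw, and take the 95th percentile as u. The sketch below does this for a toy problem with a single observation, H0: y ~ N(0, 1) against H1: y ~ N(1, 1). This toy setting and the function names are assumptions for illustration; it is unrelated to the MVB and CCA analyses in the figure.

```python
import math
import random

def likelihood_ratio(y, mu1=1.0):
    """p(y | H1) / p(y | H0) for H0: y ~ N(0, 1) versus H1: y ~ N(mu1, 1)."""
    return math.exp(mu1 * y - 0.5 * mu1 * mu1)

def calibrate_threshold(alpha, n_sim, rng, mu1=1.0):
    """Threshold u such that the ratio exceeds u with probability about alpha under H0."""
    null_ratios = sorted(likelihood_ratio(rng.gauss(0.0, 1.0), mu1) for _ in range(n_sim))
    return null_ratios[int((1.0 - alpha) * n_sim)]

def estimate_power(u, n_sim, rng, mu1=1.0):
    """Fraction of data sets simulated under H1 for which the test rejects H0."""
    hits = sum(likelihood_ratio(rng.gauss(mu1, 1.0), mu1) >= u for _ in range(n_sim))
    return hits / n_sim

rng = random.Random(1)
u = calibrate_threshold(alpha=0.05, n_sim=100000, rng=rng)
power = estimate_power(u, n_sim=100000, rng=rng)
```

Sweeping `alpha` from 0 to 1 and recording the corresponding power traces out the ROC curve of the test.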
