Guillaume Flandin
Wellcome Trust Centre for Neuroimaging
University College London
SPM Course, Zurich, February 2008
Bayesian Inference
[Figure: SPM analysis pipeline. Components shown: image time-series, realignment, smoothing (kernel), normalisation (template), general linear model (design matrix), parameter estimates, statistical parametric map, statistical inference (Gaussian field theory, p < 0.05). The figure is annotated with:]
• Bayesian segmentation and normalisation
• Spatial priors on activation extent
• Posterior probability maps (PPMs)
• Dynamic Causal Modelling
Overview
Introduction
– Bayes' rule
– Gaussian case
– Bayesian Model Comparison
Bayesian inference
– aMRI: Segmentation and Normalisation
– fMRI: Posterior Probability Maps (PPMs), Spatial prior (1st level)
– MEEG: Source reconstruction
Summary
Classical approach shortcomings
In SPM, the p-value reflects the probability of getting the observed data in the effect's absence. If sufficiently small, this p-value can be used to reject the null hypothesis that the effect is negligible.
$p(f(Y) \mid H_0)$: probability of the data, given no activation
Shortcomings of this approach:
• One can never accept the null hypothesis
• Given enough data, one can always demonstrate a significant effect at every voxel
Solution: use the probability distribution of the activation given the data.
$p(\theta \mid Y)$: probability of the effect, given the observed data (posterior probability)
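Bayes' Rule
Given $p(Y)$, $p(\theta)$ and $p(Y, \theta)$, the conditional densities are given by
$p(Y \mid \theta) = \frac{p(Y, \theta)}{p(\theta)}$, $p(\theta \mid Y) = \frac{p(Y, \theta)}{p(Y)}$
Eliminating $p(Y, \theta)$ gives Bayes' rule:
$p(\theta \mid Y) = \frac{p(Y \mid \theta)\, p(\theta)}{p(Y)}$
Posterior = Likelihood × Prior / Evidence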
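As a minimal sketch of Bayes' rule, the following applies it to a discrete set of hypotheses. The function name and the numbers are hypothetical and chosen only for illustration; they do not come from the slides.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a discrete set of hypotheses.

    prior[h]      : p(theta = h)
    likelihood[h] : p(Y | theta = h) for the observed data Y
    """
    # Joint density p(Y, theta) = p(Y | theta) p(theta)
    joint = {h: likelihood[h] * prior[h] for h in prior}
    # Evidence p(Y): sum of the joint density over all hypotheses
    evidence = sum(joint.values())
    # Posterior p(theta | Y) = p(Y, theta) / p(Y)
    return {h: joint[h] / evidence for h in joint}


# Hypothetical numbers: activation is a priori unlikely (10%), but the
# observed data are five times more probable under activation.
prior = {"active": 0.1, "inactive": 0.9}
likelihood = {"active": 0.5, "inactive": 0.1}
post = posterior(prior, likelihood)
print(post)
```

Even with data favouring activation, the posterior stays below one half here because the prior is weighted against it.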
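Gaussian Case
Likelihood and Prior:
$y = \theta^{(1)} + \epsilon^{(1)}$
$\theta^{(1)} = \theta^{(2)} + \epsilon^{(2)}$
$p(y \mid \theta^{(1)}) = N(y;\ \theta^{(1)},\ 1/\lambda^{(1)})$
$p(\theta^{(1)}) = N(\theta^{(1)};\ \theta^{(2)},\ 1/\lambda^{(2)})$
Posterior:
$p(\theta^{(1)} \mid y) = N(\theta^{(1)};\ m,\ 1/p)$
$p = \lambda^{(1)} + \lambda^{(2)}$
$m = \frac{\lambda^{(1)}}{p}\, y + \frac{\lambda^{(2)}}{p}\, \theta^{(2)}$
Relative Precision Weighting: the posterior mean $m$ lies between the prior mean $\theta^{(2)}$ and the data, each weighted by its relative precision.
[Figure: prior, likelihood and posterior densities, with $\theta^{(2)}$, $m$ and $\theta^{(1)}$ marked on the axis]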
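The relative precision weighting can be sketched in a few lines for the univariate Gaussian case: precisions add, and the posterior mean is the precision-weighted average of the data and the prior mean. The function and argument names are illustrative assumptions.

```python
def gaussian_posterior(y, prior_mean, lik_precision, prior_precision):
    """Posterior mean and precision for a Gaussian likelihood and prior.

    y               : observed data
    prior_mean      : mean of the prior
    lik_precision   : precision (1 / variance) of the likelihood
    prior_precision : precision (1 / variance) of the prior
    """
    # Posterior precision is the sum of the two precisions
    p = lik_precision + prior_precision
    # Posterior mean weights data and prior mean by their relative precisions
    m = (lik_precision * y + prior_precision * prior_mean) / p
    return m, p


# Hypothetical values: the data are three times more precise than the prior,
# so the posterior mean sits three quarters of the way towards the data.
m, p = gaussian_posterior(y=2.0, prior_mean=0.0,
                          lik_precision=3.0, prior_precision=1.0)
print(m, p)
```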
Multivariate Gaussian
Bayesian Inference
Three steps:
• Observation of data Y
• Formulation of a generative model: likelihood $p(Y \mid \theta)$ and prior distribution $p(\theta)$
• Update of beliefs based upon observations, given a prior state of knowledge:
$p(\theta \mid Y) = \frac{p(Y \mid \theta)\, p(\theta)}{p(Y)}$
Bayesian Model Comparison
Select the model m with the highest probability given the data:
$p(m \mid Y) = \frac{p(Y \mid m)\, p(m)}{p(Y)}$
Model evidence (marginal likelihood), which balances accuracy against complexity:
$p(Y \mid m) = \int p(Y \mid \theta_m, m)\, p(\theta_m \mid m)\, d\theta_m$
Model comparison and Bayes factor:
$B_{12} = \frac{p(Y \mid m_1)}{p(Y \mid m_2)}$

B12          p(m1|Y) (%)   Evidence
1 to 3       50-75         Weak
3 to 20      75-95         Positive
20 to 150    95-99         Strong
≥ 150        ≥ 99          Very strong
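The link between the Bayes factor and the posterior model probability can be checked with a short sketch. It assumes only two candidate models, with equal prior probabilities by default; the function name is illustrative.

```python
def posterior_model_probability(bayes_factor, prior_m1=0.5):
    """p(m1 | Y) from the Bayes factor B12, assuming two candidate models."""
    # Posterior odds = Bayes factor x prior odds
    prior_odds = prior_m1 / (1.0 - prior_m1)
    posterior_odds = bayes_factor * prior_odds
    # Convert odds back to a probability
    return posterior_odds / (1.0 + posterior_odds)


# Boundaries of the evidence categories in the table
for b12 in (1, 3, 20, 150):
    print(b12, round(100 * posterior_model_probability(b12), 1))
```

With equal priors, Bayes factors of 3, 20 and 150 give posterior probabilities of roughly 75%, 95% and 99%, which is where the category boundaries in the table come from.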
Bayes and Spatial Preprocessing
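Normalisation
$\log p(\theta \mid y) = \log p(y \mid \theta) + \log p(\theta) + \text{const}$
• $\log p(y \mid \theta)$: mean square difference between template and source image (goodness of fit)
• $\log p(\theta)$: squared distance between parameters and their expected values (regularisation)
Here $\theta$ are the deformation parameters. Bayesian regularisation penalises unlikely deformations.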
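The trade-off between goodness of fit and regularisation can be sketched as a negative log posterior. This is a toy illustration under Gaussian assumptions, not SPM's normalisation code: the "deformation" is a single intensity shift, and all names and values are hypothetical.

```python
def map_objective(theta, residual_fn, theta_expected,
                  noise_var=1.0, prior_var=1.0):
    """Negative log posterior (up to a constant) for parameters theta."""
    # Goodness of fit: squared difference between template and source
    residuals = residual_fn(theta)
    goodness_of_fit = sum(r * r for r in residuals) / (2.0 * noise_var)
    # Regularisation: squared distance of parameters from expected values
    regularisation = sum((t - e) ** 2
                         for t, e in zip(theta, theta_expected)) / (2.0 * prior_var)
    return goodness_of_fit + regularisation


# Toy data: the source equals the template shifted by 0.5
template = [1.0, 2.0, 3.0]
source = [1.5, 2.5, 3.5]


def residual_fn(theta):
    # Residual after removing the candidate shift theta[0] from the source
    return [s - theta[0] - t for s, t in zip(source, template)]


perfect_fit = map_objective([0.5], residual_fn, theta_expected=[0.0])
no_shift = map_objective([0.0], residual_fn, theta_expected=[0.0])
print(perfect_fit, no_shift)
```

Lowering `prior_var` makes departures from the expected parameters more costly, which is how the prior discourages unlikely deformations.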
Bayes and Spatial Preprocessing
[Figure: template image compared with three registrations of the source image]
• Affine registration ($\chi^2 = 472.1$)
• Non-linear registration without Bayes constraints ($\chi^2 = 287.3$)
• Non-linear registration using Bayes ($\chi^2 = 302.7$)
Without Bayesian constraints, the non-linear spatial normalisation can introduce unnecessary warps.
Bayes and Spatial Preprocessing
Segmentation
• Intensities are modelled by a mixture of K Gaussian distributions.
• Prior belonging probability maps are overlaid to assist the segmentation: the prior probability of each voxel being of a particular tissue type is derived from segmented images of 151 subjects (empirical priors).
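As a rough sketch of how a spatial prior assists a mixture-of-Gaussians segmentation, the following computes the posterior belonging probabilities of one voxel. The class means, standard deviations and prior values are hypothetical and not taken from SPM.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))


def tissue_posterior(intensity, classes, spatial_prior):
    """Posterior probability of each tissue class at a single voxel.

    classes       : {name: (mean, std)} of the Gaussian for each class
    spatial_prior : {name: prior belonging probability at this voxel}
    """
    # Prior x likelihood for each class
    weighted = {k: spatial_prior[k] * gaussian_pdf(intensity, *classes[k])
                for k in classes}
    # Normalise so that the probabilities sum to one
    total = sum(weighted.values())
    return {k: w / total for k, w in weighted.items()}


# Hypothetical intensity model and prior map values for one voxel
classes = {"GM": (0.5, 0.1), "WM": (0.8, 0.1), "CSF": (0.2, 0.1)}
spatial_prior = {"GM": 0.6, "WM": 0.3, "CSF": 0.1}
post = tissue_posterior(0.65, classes, spatial_prior)
print(post)
```

The intensity 0.65 is equally compatible with grey and white matter in this toy model, so the decision is driven by the spatial prior.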
Unified segmentation & normalisation
Circular relationship between segmentation & normalisation:
– Knowing which tissue type a voxel belongs to helps normalisation.
– Knowing where a voxel is (in standard space) helps segmentation.
Build a joint generative model:
– model how voxel intensities result from a mixture of tissue type distributions
– model how tissue types of one brain have to be spatially deformed to match those of another brain
Using a priori knowledge about the parameters: adopt a Bayesian approach and maximise the posterior probability.
Ashburner & Friston 2005, NeuroImage
Bayesian fMRI
General Linear Model:
$Y = X\beta + \epsilon$ with $\epsilon \sim N(0, C_\epsilon)$
What are the priors?
• In "classical" SPM, no (flat) priors
• In "full" Bayes, priors might be from theoretical arguments or from independent data
• In "empirical" Bayes, priors derive from the same data, assuming a hierarchical model for generation of the data
Parameters of one level can be made priors on the distribution of parameters at the lower level.
Bayesian fMRI with spatial priors
Even without applied spatial smoothing, activation maps (and maps of e.g. AR coefficients) have spatial structure.
[Figure: maps of the AR(1) coefficient and of the contrast]
Definition of a spatial prior via a Gaussian Markov Random Field.
Automatic spatial regularisation of regression coefficients and AR coefficients.
The Generative Model
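[Figure: graphical model of the generative model, including the AR coefficients A and the data Y]
General Linear Model with Auto-Regressive error terms (GLM-AR):
$Y = X\beta + E$, where $E$ is an AR(p) process:
$y_t = X_t \beta + \sum_{i=1}^{p} a_i\, \epsilon_{t-i} + e_t$
Priors:
$p(\beta_k) = N(0,\ \alpha_k^{-1} D^{-1})$
$p(a_p) = N(0,\ \gamma_p^{-1} D^{-1})$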
Spatial prior
Over the regression coefficients:
$p(\beta_k) = N(0,\ \alpha_k^{-1} D^{-1})$
• Zero mean: shrinkage prior
• $D$: spatial kernel matrix
• $\alpha_k$: spatial precision, determines the amount of smoothness
Same prior on the AR coefficients.
Gaussian Markov Random Field priors D
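$D = \begin{pmatrix} 1 & \cdots & d_{ij} \\ \vdots & \ddots & \vdots \\ d_{ji} & \cdots & 1 \end{pmatrix}$
• $d_{ii} = 1$ on the diagonal
• $d_{ij} > 0$ if voxels i and j are neighbours, 0 elsewhere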
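The structure of the spatial kernel matrix can be sketched for a 1-D chain of voxels, where each voxel neighbours the one before and after it. The slides do not give the values of the off-diagonal entries, so the `weight` argument is a hypothetical placeholder, and real images would use a 2-D or 3-D neighbourhood.

```python
def spatial_kernel_1d(n_voxels, weight):
    """Build D for a chain of voxels: 1 on the diagonal, `weight` between
    neighbouring voxels, 0 elsewhere.

    `weight` is an assumed value used only for illustration.
    """
    D = [[0.0] * n_voxels for _ in range(n_voxels)]
    for i in range(n_voxels):
        D[i][i] = 1.0
        # In a 1-D chain, the neighbours of voxel i are i - 1 and i + 1
        for j in (i - 1, i + 1):
            if 0 <= j < n_voxels:
                D[i][j] = weight
    return D


D = spatial_kernel_1d(4, weight=0.25)
for row in D:
    print(row)
```

The matrix is sparse and symmetric: each row has non-zero entries only for the voxel itself and its immediate neighbours.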
Prior, Likelihood and Posterior
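The prior:
$p(\beta, A, \lambda, \alpha, \gamma) = \prod_k p(\beta_k \mid \alpha_k)\, p(\alpha_k \mid q_1, q_2)\ \prod_p p(a_p \mid \gamma_p)\, p(\gamma_p \mid r_1, r_2)\ \prod_n p(\lambda_n \mid u_1, u_2)$
The likelihood:
$p(Y \mid \beta, A, \lambda) = \prod_n p(y_n \mid \beta_n, a_n, \lambda_n)$
The posterior $p(\beta \mid Y)$?
The posterior over $\beta$ doesn't factorise over k or n. Exact inference is intractable.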
Variational Bayes
Approximate posterior that allows for factorisation:
$q(\beta, A, \lambda, \alpha, \gamma \mid Y) = \prod_k q(\alpha_k \mid Y)\ \prod_p q(\gamma_p \mid Y)\ \prod_n q(\beta_n \mid Y)\, q(a_n \mid Y)\, q(\lambda_n \mid Y)$
Variational Bayes Algorithm:
Initialisation
While (ΔF > tol)
    Update Suff. Stats. for β
    Update Suff. Stats. for A
    Update Suff. Stats. for λ
    Update Suff. Stats. for α
    Update Suff. Stats. for γ
End
Event related fMRI: familiar versus unfamiliar faces
[Figure: maps obtained with a global prior, with a spatial prior, and with smoothing]
Convergence & Sensitivity
[Figure: convergence plot (F against iteration number) and ROC curve (sensitivity against 1-specificity), comparing global prior, spatial prior and smoothing]
SPM5 Interface
Posterior Probability Maps
Posterior distribution $p(\beta \mid y)$: probability of getting an effect, given the data
• mean: size of effect
• precision: variability
Posterior probability map: images of the probability or confidence that an activation exceeds some specified threshold, given the data
Two thresholds:
• activation threshold: percentage of whole brain mean signal (physiologically relevant size of effect)
• probability that voxels must exceed to be displayed (e.g. 95%)
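The two thresholds can be illustrated with a short sketch that assumes a Gaussian posterior at each voxel, described by its mean and standard deviation. This is not SPM's implementation; the function names and numbers are hypothetical.

```python
import math


def prob_exceeds(mean, std, activation_threshold):
    """p(effect > activation_threshold | y) for a Gaussian posterior."""
    z = (activation_threshold - mean) / std
    # Upper tail of the standard normal distribution
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def ppm(means, stds, activation_threshold, prob_threshold=0.95):
    """Keep the posterior probability at voxels that pass the probability
    threshold; other voxels are set to None (not displayed)."""
    probs = [prob_exceeds(m, s, activation_threshold)
             for m, s in zip(means, stds)]
    return [p if p >= prob_threshold else None for p in probs]


# Hypothetical posterior means and standard deviations at three voxels,
# with an activation threshold of 2 (e.g. 2% of the global mean)
result = ppm(means=[3.0, 2.1, 0.5], stds=[0.5, 0.5, 0.5],
             activation_threshold=2.0)
print(result)
```

Only the first voxel survives: its posterior mean is well above the activation threshold relative to its variability, whereas the second voxel has a mean just above the threshold and a probability of exceeding it that is well below 95%.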
Posterior Probability Maps
Mean (Cbeta_*.img)
Std dev (SDbeta_*.img)
PPM (spmP_*.img)
[Figure: posterior probability distribution $p(\beta \mid Y)$, with the activation threshold and the probability marked]
Bayesian Inference
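Posterior ∝ Likelihood × Prior:
$p(\beta \mid y) \propto p(y \mid \beta)\, p(\beta)$
Classical T-test (SPMs): statistic $t = f(y)$, compared with a threshold $u$ under the null distribution $p(t \mid \beta = 0)$.
Bayesian test (PPMs): based on the posterior $p(\beta \mid y)$.
SPMs: show voxels with non-zero activations
PPMs: show activations greater than a given size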
Example: auditory dataset
[Figure: two overlays with colour bars, Active > Rest and Active != Rest]
Active > Rest: overlay of effect sizes at voxels where SPM is 99% sure that the effect size is greater than 2% of the global mean.
Active != Rest: overlay of $\chi^2$ statistics. This shows voxels where the activation is different between active and rest conditions, whether positive or negative.
PPMs: Pros and Cons
Advantages
■ One can infer that a cause DID NOT elicit a response
■ SPMs conflate effect size and effect variability, whereas PPMs allow inference on the effect size of interest directly
Disadvantages
■ Use of priors over voxels is computationally demanding
■ Practical benefits are yet to be established
■ Threshold requires justification
MEG/EEG Source Reconstruction (1)
[Figure: forward modelling maps the distributed source model to the data; the inverse procedure maps the data back to the sources]
$Y = KJ + E$
with Y [n x t], K [n x p], J [p x t], E [n x t]
n: number of sensors
p: number of dipoles
t: number of time samples
- under-determined system
- priors required: Bayesian framework
Mattout et al, 2006
MEG/EEG Source Reconstruction (2)
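$p(J \mid Y) \propto p(Y \mid J)\, p(J)$
posterior ∝ likelihood × prior
MAP estimate: minimise
$U(J) = \| C_e^{-1/2} (Y - KJ) \|^2 + \| WJ \|^2$
(first term: likelihood; second term: WMN prior)
with $p(J) \sim N(0, C_j)$ and $C_j = (W^T W)^{-1}$
Possible priors: minimum norm, functional prior, smoothness prior
2-level hierarchical model:
$Y = KJ + E_1$, $E_1 \sim N(0, C_e)$
$J = 0 + E_2$, $E_2 \sim N(0, C_p)$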
Mattout et al, 2006
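To see how a Gaussian prior makes the under-determined problem solvable, here is a minimal sketch of the MAP estimate for the smallest possible case: one sensor, one time sample and several dipoles. It uses the standard closed form J = C_j K^T (K C_j K^T + C_e)^{-1} Y and assumes a diagonal source covariance. The names and numbers are hypothetical, and this is not the SPM implementation.

```python
def map_sources_single_sensor(y, gain, source_var, noise_var):
    """MAP estimate of p dipole amplitudes from one sensor reading.

    y          : sensor measurement (scalar)
    gain       : lead field row K, one gain per dipole
    source_var : diagonal of the prior source covariance C_j (assumed diagonal)
    noise_var  : sensor noise variance C_e (scalar)
    """
    # Predicted data variance K C_j K^T + C_e (a scalar for a single sensor)
    data_var = sum(k * k * v for k, v in zip(gain, source_var)) + noise_var
    # J = C_j K^T (K C_j K^T + C_e)^-1 y
    return [v * k * y / data_var for k, v in zip(gain, source_var)]


# One measurement, two unknown dipoles: no unique solution without a prior
J = map_sources_single_sensor(y=6.0, gain=[1.0, 2.0],
                              source_var=[1.0, 1.0], noise_var=1.0)
predicted = sum(k * j for k, j in zip([1.0, 2.0], J))
print(J, predicted)
```

With an identity source covariance (a minimum norm prior), the estimate spreads the measurement across dipoles in proportion to their gains, and the predicted data (5.0) is shrunk relative to the measurement (6.0) because part of it is attributed to noise.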
Summary
• Bayesian inference:
  – Incorporation of some prior beliefs
  – Preprocessing vs. Modelling
  – Concept of Posterior Probability Maps
• Variational Bayes for single-subject analyses:
  – Spatial prior on regression and AR coefficients
• Drawbacks:
  – Computation time: MCMC, Variational Bayes
• Bayesian framework also allows:
  – Bayesian Model Comparison
References
■ Classical and Bayesian Inference, Penny and Friston, Human Brain Function (2nd edition), 2003.
■ Classical and Bayesian Inference in Neuroimaging: Theory/Applications, Friston et al., NeuroImage, 2002.
■ Posterior Probability Maps and SPMs, Friston and Penny, NeuroImage, 2003.
■ Variational Bayesian Inference for fMRI time series, Penny et al., NeuroImage, 2003.
■ Bayesian fMRI time series analysis with spatial priors, Penny et al., NeuroImage, 2005.
■ Comparing Dynamic Causal Models, Penny et al., NeuroImage, 2004.