Multivariate models and machine learning for fMRI
Methods and Models in fMRI, 15.11.2016
Jakob [email protected] Neuromodeling Unit (TNU), Institute for Biomedical Engineering (IBT), University and ETH Zürich
Many thanks to Sudhir Raman and Kay Brodersen for material
Translational Neuromodeling Unit
Overview
fMRI Analysis and Classification
Motivation
Learning from data
Multivariate Bayes in SPM
Generative Embedding
Modelling Terminology
Why multivariate?
Univariate approaches are excellent for localizing activations in individual voxels.
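[Figure: responses of voxels v1 and v2 for reward vs. no reward; one comparison is marked significant (*), the other not significant (n.s.).]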
Why multivariate?
Multivariate approaches can be used to examine responses that are jointly encoded in multiple voxels.
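[Figure: responses of voxels v1 and v2 for orange juice vs. apple juice; the per-voxel comparisons are not significant (n.s.); a scatter plot of v1 against v2 shows both conditions.]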
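A toy sketch of the point above, with invented numbers (not data from the slides): each voxel taken alone shows overlapping responses for the two conditions, while the joint feature v1 - v2 separates them. The helper name `ranges_overlap` is hypothetical.

```python
# Hypothetical (v1, v2) responses per trial; all numbers are invented.
orange = [(1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0)]
apple = [(2.0, 1.0), (3.0, 2.0), (4.0, 3.0), (5.0, 4.0)]


def ranges_overlap(a, b):
    """Return True if the value ranges of two samples overlap."""
    return min(a) <= max(b) and min(b) <= max(a)


# Univariate view: look at each voxel separately.
v1_overlap = ranges_overlap([t[0] for t in orange], [t[0] for t in apple])
v2_overlap = ranges_overlap([t[1] for t in orange], [t[1] for t in apple])

# Multivariate view: combine both voxels into one joint feature.
orange_joint = [v1 - v2 for v1, v2 in orange]
apple_joint = [v1 - v2 for v1, v2 in apple]
joint_overlap = ranges_overlap(orange_joint, apple_joint)

print(v1_overlap, v2_overlap, joint_overlap)
```

The joint feature is chosen by hand here; a classifier would learn such a weighting from the data.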
A bit of history – Multidimensional scaling
Edelman et al, Psychobiology, 1998
Psychophysical rating fMRI
Two-dimensional projection of similarity measure for both psychophysical rating and fMRI response.
A bit of history – Classification Studies
Haxby et al, Science, 2001
A bit of history – Classification Studies
Kamitani and Tong, Nat Neurosci, 2005
Representational similarity analysis
Idea: Compare the similarity of representations (correlation between activation patterns) between different stimuli. Allows for a comparison between monkey (neural firing pattern) and human (fMRI activation patterns).
Kriegeskorte et al, Neuron, 2008
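A minimal sketch of the idea, assuming correlation distance (1 - Pearson r) as the dissimilarity measure and invented activation patterns. The stimulus names and the function names `pearson` and `rdm` are illustrative and not taken from Kriegeskorte et al.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equally long activation patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def rdm(patterns):
    """Representational dissimilarity matrix as a dict keyed by stimulus pairs."""
    names = list(patterns)
    return {(i, j): 1.0 - pearson(patterns[i], patterns[j])
            for i in names for j in names}


# Hypothetical activation patterns (one value per voxel or neuron).
patterns = {
    "face": [1.0, 2.0, 3.0, 4.0],
    "house": [4.0, 3.0, 2.0, 1.0],
    "cat": [1.0, 2.0, 3.0, 5.0],
}
d = rdm(patterns)
print(d[("face", "house")], d[("face", "cat")])
```

Because the matrix is indexed by stimuli rather than by voxels or neurons, two such matrices (for example one from monkey recordings and one from human fMRI) can be compared even though the underlying measurements differ.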
Overview
Motivation
Learning from data
Multivariate Bayes in SPM
Generative Embedding
Modelling Terminology
Analysis steps
Feature Extraction
Modelling
Classification
Clustering
Regression
Prediction
Model Selection
Cross validation
Performance
Inference
Feature space
Data points (rows) × features (columns):

        F1    F2    . . .   FP
S1      1     0.5
S2      0     5.7
.       1     4
.       1     5.3
SN      1     6.6

Features can be:
• Discrete
• Continuous
Feature selection for fMRI multivariate analysis
Different features answer different questions. Reducing the dimensionality might reduce noise, but could also reduce relevant information.
• Raw data
• Mean values
• Model parameters, e.g. DCM
• Correlations between regions
Model selection - Generalizability
Model Fit
Model Complexity
Bishop (2006), Pitt & Myung (2002), TICS
Encoding and decoding models
context (cause or consequence): $X_t \in \mathbb{R}^d$
(e.g. condition, stimulus, response, prediction error)
BOLD signal: $Y_t \in \mathbb{R}^v$
encoding model: $g: X_t \to Y_t$
decoding model: $h: Y_t \to X_t$
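The two directions can be sketched with a deliberately simple model, assuming the context X_t is a condition label and the encoding model g predicts the mean voxel pattern per condition. The decoder h shown here simply inverts g by picking the condition with the closest predicted pattern; this is one illustrative choice, not the method of any study cited in these slides, and all data are invented.

```python
def fit_encoding(X, Y):
    """Encoding model g: map each condition to its mean voxel pattern."""
    sums, counts = {}, {}
    for x, y in zip(X, Y):
        acc = sums.setdefault(x, [0.0] * len(y))
        for i, value in enumerate(y):
            acc[i] += value
        counts[x] = counts.get(x, 0) + 1
    return {x: [s / counts[x] for s in sums[x]] for x in sums}


def decode(g, y):
    """Decoding model h: map a voxel pattern to the best-matching condition."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(g, key=lambda x: sqdist(g[x], y))


# Hypothetical data: condition per scan and a two-voxel BOLD pattern per scan.
X = ["stim", "rest", "stim", "rest"]
Y = [[2.0, 0.1], [0.0, 0.2], [1.8, -0.1], [0.2, 0.0]]

g = fit_encoding(X, Y)
print(g["stim"], decode(g, [1.5, 0.0]))
```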
Modelling goals
• Prediction
$h: Y \to X$
Predictive Density
Modelling goals
• Model Selection
Sparse Coding Distributed Coding
Model Evidence
Overview
Motivation
Learning From Data
Multivariate Bayes in SPM
Generative Embedding
Modelling Concepts
Learning from data
• Supervised Learning: labels for training data are known!
• Unsupervised Learning: labels for training data are NOT known!
• Semi-supervised Learning
• Reinforcement Learning
Supervised learning
Function f: X → Y
Independent variables: X
Dependent variable: Y (categorical or continuous)
Classification
Support Vector Machines
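• Kernel function: $K(\mathbf{x}_i, \mathbf{x}_j) = \boldsymbol{\phi}(\mathbf{x}_i) \cdot \boldsymbol{\phi}(\mathbf{x}_j)$
• Feature map: $\boldsymbol{\phi}$
• Function f: X → Y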
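A worked check of the kernel identity, assuming a degree-2 polynomial kernel on two-dimensional inputs (a textbook example, not one shown on the slides): the kernel value computed in input space equals the dot product of the explicitly mapped feature vectors.

```python
from math import isclose, sqrt


def dot(a, b):
    """Ordinary dot product of two vectors."""
    return sum(p * q for p, q in zip(a, b))


def phi(x):
    """Explicit feature map for the kernel K(x, y) = (x . y)^2 in 2-D."""
    x1, x2 = x
    return (x1 * x1, sqrt(2.0) * x1 * x2, x2 * x2)


def kernel(x, y):
    """Degree-2 polynomial kernel, evaluated without computing phi."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))   # dot product in feature space
implicit = kernel(x, y)          # same value, computed in input space
print(explicit, implicit, isclose(explicit, implicit))
```

The practical benefit is that the feature space never has to be constructed, which matters when it is very high dimensional.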
Kernel Methods
Shawe-Taylor & Cristianini, Kernel Methods for Pattern Analysis, 2004
Other popular classifiers
• Gaussian Processes
  C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, 2006
• Deep Belief Networks
  G. E. Hinton, S. Osindero, and Y. Teh, "A fast learning algorithm for deep belief nets", Neural Computation, vol. 18, 2006
  http://deeplearning.net/tutorial/DBN.html
Generative and Discriminative classifiers
• Generative classifiers
  • Learn the parameters for the functions p(Y) and p(X|Y), e.g. Naïve Bayes classifier
• Discriminative classifiers
  • Learn the parameters for p(Y|X), e.g. logistic regression, SVM
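The two families are linked by Bayes' rule. A generative classifier obtains the posterior p(Y|X) from the learned p(Y) and p(X|Y), whereas a discriminative classifier models that posterior (or a decision boundary) directly. For discrete class labels Y:

```latex
p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{\sum_{Y'} p(X \mid Y')\, p(Y')}
```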
Cross-validation
The generalization ability of a classifier can be estimated using a resampling procedure known as cross-validation. One example is 2-fold cross-validation:
[Figure: 100 examples (1, 2, 3, ..., 99, 100) are split into training examples and test examples in each of 2 folds, followed by performance evaluation.]
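The splitting scheme can be sketched as follows. This is a generic k-fold index generator with interleaved assignment of examples to folds; the function name `k_fold_indices` is hypothetical, and the assignment rule is an assumption since the slide does not say how examples are distributed across folds.

```python
def k_fold_indices(n_examples, n_folds):
    """Yield (train, test) index lists; every example is tested exactly once."""
    indices = list(range(n_examples))
    for fold in range(n_folds):
        test = indices[fold::n_folds]      # every n_folds-th example
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# 2-fold cross-validation on 100 examples, as in the slide.
folds = list(k_fold_indices(100, 2))
for train, test in folds:
    print(len(train), "training examples,", len(test), "test examples")
```

A classifier would be trained on `train` and scored on `test` in each fold, and the scores averaged for the performance evaluation.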
• Model selection
• Performance evaluation
  • Balanced accuracy
  • F1 score
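Both measures can be computed from the confusion counts of a binary classifier. The sketch below uses the standard definitions (balanced accuracy as the mean of sensitivity and specificity, F1 as 2TP / (2TP + FP + FN)) on an invented, imbalanced example; returning 0 when a denominator is zero is a convention chosen here for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, fp, tn, fn


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return (sensitivity + specificity) / 2.0


def f1_score(tp, fp, tn, fn):
    """Harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Invented example: 2 positives, 8 negatives, classifier always predicts 0.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
counts = confusion_counts(y_true, y_pred)
accuracy = (counts[0] + counts[2]) / len(y_true)
print(accuracy, balanced_accuracy(*counts), f1_score(*counts))
```

Plain accuracy looks respectable here only because the classes are imbalanced, which is the case these two measures are meant to expose.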
Cross-validation
Another commonly used variant is leave-one-out cross-validation.
[Figure: 100 examples (1, 2, 3, ..., 98, 99, 100); in each of the 100 folds, a single example is the test example and all others are training examples, followed by performance evaluation.]
In fMRI, the variant often used is leave-one-run-out.
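A sketch of leave-one-run-out splitting, assuming each example carries the label of the scanning run it came from. The function name `leave_one_run_out` and the run labels are hypothetical.

```python
def leave_one_run_out(run_labels):
    """Yield (held_out_run, train_indices, test_indices) for each run."""
    for held_out in sorted(set(run_labels)):
        train = [i for i, r in enumerate(run_labels) if r != held_out]
        test = [i for i, r in enumerate(run_labels) if r == held_out]
        yield held_out, train, test


# Hypothetical session: 3 runs with 3 examples each.
runs = [1, 1, 1, 2, 2, 2, 3, 3, 3]
splits = list(leave_one_run_out(runs))
for held_out, train, test in splits:
    print("test run", held_out, "train", train, "test", test)
```

Keeping whole runs together avoids having temporally adjacent, and therefore correlated, scans on both sides of the train/test split.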
Performance – Single Subject
$p = P(X \ge k \mid H_0) = 1 - B(k \mid n, \pi_0)$
Brodersen et al. 2013, NeuroImage
Binomial Test
k=30
!!! Cross-validated data are not necessarily binomially distributed. Permutation tests are better !!!
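For illustration, the tail probability P(X ≥ k | H0) can be computed directly as a sum over the binomial probability mass function. The slide gives k = 30 but not the number of trials, so n = 50 and the chance level π0 = 0.5 below are assumptions; the function name is hypothetical.

```python
from math import comb


def binomial_p_value(k, n, pi0=0.5):
    """P(X >= k) for X ~ Binomial(n, pi0): chance of k or more correct trials."""
    return sum(comb(n, i) * pi0 ** i * (1.0 - pi0) ** (n - i)
               for i in range(k, n + 1))


# Assumed example: 30 correct predictions out of 50 trials at chance level 0.5.
p = binomial_p_value(30, 50, 0.5)
print(p)
```

The warning above still applies: this calculation treats the trials as independent, which cross-validated predictions need not be.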
Performance – Multiple subjects
Brodersen et al. 2013, NeuroImage
Fixed effects
Random effects
http://www.translationalneuromodeling.org/tapas/
Confounds – GLM vs. MVPA
Todd et al. 2013, NeuroImage
Second level t-tests for accuracies?
True β-values are normally distributed.
True accuracies are not normally distributed and are truncated at chance.
A possible solution is given by Allefeld et al.
Allefeld et al. Neuroimage, 2016
Statistical testing with classification
• Within subjects:
  – Permutation statistics
  – Parametric tests are not valid (assumptions not met), e.g. binomial or t-test (cf. Schreiber and Krekelberg, 2013).
• Across subjects:
  – Assumptions for t-tests are not met
  – Full Bayesian model (Brodersen et al. 2013, but assumptions are not met for CV)
  – Use prevalence statistic proposed in Allefeld et al., 2016
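A generic sketch of a within-subject label permutation test. The argument `score_fn` is a hypothetical placeholder: in a real analysis it would rerun the complete cross-validated classification with the permuted labels. The toy scorer below only compares fixed, invented predictions with the labels, to keep the example self-contained.

```python
import random


def permutation_p_value(labels, score_fn, n_permutations=1000, seed=0):
    """Fraction of label permutations scoring at least as well as the true labels."""
    rng = random.Random(seed)
    observed = score_fn(labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)              # new random relabelling
        if score_fn(shuffled) >= observed:
            count += 1
    # Adding 1 counts the observed labelling itself as one permutation.
    return (count + 1) / (n_permutations + 1)


# Invented example: 20 trials, fixed predictions with 2 errors.
labels = [0] * 10 + [1] * 10
predictions = [1] + [0] * 9 + [1] * 9 + [0]


def accuracy_given(candidate_labels):
    """Toy stand-in for a full cross-validated accuracy."""
    hits = sum(p == l for p, l in zip(predictions, candidate_labels))
    return hits / len(candidate_labels)


p = permutation_p_value(labels, accuracy_given)
print(accuracy_given(labels), p)
```

With run-wise cross-validation, labels would normally be permuted within runs so that the permutations respect the structure of the design.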
Research questions for classification
• Overall classification accuracy
  e.g. classification tasks: Truth or lie? Left or right button? Healthy or ill?
  [Figure: accuracy (50 % to 100 %) per classification task.]
• Spatial deployment of discriminative regions
  [Figure: map of regions with accuracies between 55% and 80%.]
• Temporal evolution of discriminability
  [Figure: accuracy (50 % to 100 %) over within-trial time; markers: "Accuracy rises above chance" and "Participant indicates decision".]
• Model-based classification
  { group 1, group 2 }
Pereira et al. (2009) NeuroImage, Brodersen et al. (2009) The New Collection
Decoding «hidden» intentions – searchlight approach
Haynes et al., Current Biology, 2007
Decoding of free decisions
Soon et al., Nat Neurosci, 2008
Decoding of finger presses (red line). Participants freely choose timing and hand.
Earliest information about left vs. right is available long before execution – free will?
Decoding task preparation – connectivity-based decoding
Heinzle et al., J Neurosci, 2012
SV-Classifier on connectivity graph (correlation)
Discriminative maps
Unsupervised learning
Building a representation of data
• Dimensionality reduction
• Time series
• Clustering
  • K-means
  • Mixture models
K-means clustering
• Cost function
• Algorithm
  1. Initialize
  2. Estimate assignments
  3. Estimate cluster centroids
  4. Repeat 2, 3 until convergence
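The four steps can be sketched as below, assuming squared Euclidean distance and the usual K-means cost, the sum of squared distances between each point and its assigned centroid (the distortion measure in Bishop, 2006). The cost-function formula on the original slide did not survive extraction, so this form is an assumption; the data are invented.

```python
def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def k_means(points, initial_centroids, max_iter=100):
    """Return (assignments, centroids, cost) after running to convergence."""
    centroids = [tuple(c) for c in initial_centroids]        # 1. initialize
    assignments = None
    for _ in range(max_iter):
        # 2. assign every point to its nearest centroid
        new_assignments = [
            min(range(len(centroids)), key=lambda k: sqdist(p, centroids[k]))
            for p in points
        ]
        if new_assignments == assignments:                   # 4. converged
            break
        assignments = new_assignments
        # 3. move every centroid to the mean of its assigned points
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    cost = sum(sqdist(p, centroids[a]) for p, a in zip(points, assignments))
    return assignments, centroids, cost


# Invented 2-D data with two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
assignments, centroids, cost = k_means(points, [(0, 0), (10, 10)])
print(assignments, centroids, cost)
```

The result depends on the initialization, so in practice the algorithm is usually restarted from several initial centroids and the solution with the lowest cost is kept.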
Bishop PRML (2006)
Clustering – Mixture of Gaussians
Bishop PRML (2006)
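For reference, the mixture-of-Gaussians density in the notation of Bishop (2006), with mixing coefficients π_k, means μ_k and covariances Σ_k for K clusters. The second line gives the posterior probability (responsibility) that cluster k generated a data point x_n, which replaces the hard assignments of K-means. The symbols are assumptions taken from that textbook, since the slide's own formulas did not survive extraction.

```latex
p(\mathbf{x}) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k),
\qquad \sum_{k=1}^{K} \pi_k = 1, \quad 0 \le \pi_k \le 1

\gamma(z_{nk}) = \frac{\pi_k \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}
                      {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(\mathbf{x}_n \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}
```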
Interpretation
• Cluster parameters
• Internal criterion – model evidence
• External criterion – purity
[Figure: subjects assigned to Cluster 1 and Cluster 2; inferred labels compared with external labels.]
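Purity, the external criterion named above, can be computed as follows under its standard definition: each cluster is credited with its most frequent external label, and the credited subjects are divided by the total number of subjects. The labels below are invented.

```python
from collections import Counter


def purity(inferred, external):
    """Share of subjects whose external label is the majority label of their cluster."""
    per_cluster = {}
    for cluster, label in zip(inferred, external):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(inferred)


# Invented example: 6 subjects, 2 inferred clusters, external diagnosis labels.
inferred = [1, 1, 1, 2, 2, 2]
external = ["patient", "patient", "control", "control", "control", "control"]
score = purity(inferred, external)
print(score)
```

Purity rises trivially as the number of clusters grows, so it should only be compared between solutions with the same number of clusters.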
Overview
Motivation
Learning from Data
Multivariate Bayes in SPM
Generative Embedding
Modelling
Encoding vs. Decoding models
Coding Hypotheses
• Spatial vectors
• Smooth vectors
• Sparse vectors
• Singular vectors of data: $UDV^T = RY^T$
• Support vectors: $U = RY^T$
• Distributed vectors
Coding Hypotheses
Friston et al. 2008 NeuroImage
Solved with variational Bayes
Friston et al. 2008 NeuroImage
Example – Decoding of motion.
Experimental factors:
1. Photic
2. Motion
3. Attention
Attention to motion dataset - Büchel & Friston 1999 Cerebral Cortex
Friston et al. 2008 NeuroImage
Results
Friston et al. 2008 NeuroImage
Multivariate Bayes in SPM
[Figure: SPM multivariate Bayes (MVB) output for MVB_Motion (prior: sparse). Panels:
• log-evidence over partitions (1 to 5); maximum p = 100.00%
• distribution of weights (frequency against voxel-weight)
• PPM: MVB_Motion (Motion)
• target and prediction (adjusted response against scans)
• observed and predicted contrast (prediction against contrast); SNR (variance) 0.64
• SPM{T338} maximum intensity projection for the Motion contrast at [-36, -87, -3]; SPMresults: .\SPM-practical\attention\GLM; height threshold T = 4.874226 {p<0.05 (FWE)}]

Posterior probabilities at maxima

p(|w| > 0)    location (x,y,z)          weight (w)
p = 0.993     -39.0,-90.0,-3.0mm        q = 0.0254
p = 0.983     -33.0,-99.0,-3.0mm        q = -0.0216
p = 0.983     -30.0,-99.0,3.0mm         q = 0.0211
p = 0.982     -42.0,-90.0,9.0mm         q = 0.0201
p = 0.980     -45.0,-75.0,-3.0mm        q = 0.0168
p = 0.979     -30.0,-84.0,6.0mm         q = -0.0187
p = 0.977     -39.0,-87.0,3.0mm         q = -0.0196
p = 0.973     -30.0,-84.0,-6.0mm        q = -0.0204
p = 0.972     -39.0,-81.0,-15.0mm       q = 0.0166
p = 0.946     -36.0,-84.0,12.0mm        q = -0.0144
p = 0.933     -48.0,-84.0,-3.0mm        q = -0.0119
p = 0.929     -39.0,-75.0,3.0mm         q = -0.0160

506 voxels; 360 scans
Laminar activity related to novelty and episodic memory
Maass et al. 2014 Nature Communications
Overview
Motivation
Learning from Data
Multivariate Bayes in SPM
Generative Embedding
Modelling Principles
Classifying Groups of Subjects
[Figure: per-subject voxel activity (Subject 1, Subject 2, ..., Subject N) and, via a dynamic causal model (DCM), per-subject connectivity serve as input to classification or clustering into Group 1 and Group 2.]
• High dimensionality
• Unusual cluster distributions
• Lack of interpretation
Generative Embedding
Brodersen et al. PLoS Computational Biology 2011.
DCM for speech processing
Working memory in Schizophrenia
• 41 schizophrenia patients (DSM-IV, ICD-10), 42 controls
• Visual numeric n-back working memory task
Deserno et al (2012) The Journal of Neuroscience
[Figure: sequence of digits shown in the n-back task; timing labels: 900 ms and 500 ms.]
Model based clustering
Brodersen et al 2014 Neuroimage
Results healthy vs. schizophrenia patients
Brodersen et al 2014 Neuroimage
Within patients clustering
Brodersen et al 2014 Neuroimage
Be aware
• Interpretation of decoding or classification results is difficult.
• The decoded information must be in the data, but exactly which features carry it is often hard to find out …
Summary
Learning from Data
Multivariate Bayes in SPM
Generative Embedding
Modelling Principles
Acknowledgments
Many thanks to K.E. Stephan, Sudhir S. Raman and K. Brodersen for sharing their teaching material.