Analysis of Functional MRI Timeseries Data Using Signal Processing Techniques
Sea Chen
Department of Biomedical Engineering
Advisors: Dr. Charles A. Bouman and Dr. Mark J. Lowe
S. Chen – Final Exam – October 7, 2002 – p.1/39
Overview
• Introduction
• Update: Supertemporal Resolution Analysis
  - Review
  - New simulations
  - New data
• Clustered Components Analysis
  - Motivation
  - Theory
  - Methods
  - Results
• Conclusions

Goals
We would like to aid in the understanding of the blood-oxygenation-level-dependent (BOLD) contrast mechanisms used in functional magnetic resonance imaging (fMRI) through
• achieving a high signal-to-noise ratio (SNR) estimate of the BOLD response.
• achieving a high temporal resolution estimate of the BOLD response.

fMRI: The basic idea
• Experimental paradigm designed to activate neuronal metabolism
• Changes in blood oxygenation during activation → changes in physical parameters affecting MR signal
• Contrast produced by difference between active and control states
• Data set is volume of pixels repeated over time
[Figure: stimulus signal and response signal for one pixel of one slice through time; time axis from 0 to 40]

Supertemporal Resolution: Motivation
• Short TR (repetition time)
  - Better time resolution
  - Lower SNR due to saturation effects
  - BOLD signal is distorted by blood inflow effects
• Long TR
  - Poorer time resolution
  - BOLD effect more dominant in activation signal

Supertemporal Resolution: Review
• Assumption: Voxels exhibiting the same generating activation signal span different slices in a 2D acquisition
• Method exploits the timing characteristics of the 2D acquisition
• Bayesian prior used to implement temporal regularization

Supertemporal Resolution: Review
• MAP estimate for Supertemporal Resolution (STR)
  [Equation: MAP cost function and its auxiliary definition; not recoverable from the extracted text]
• Optimization performed using conjugate gradient
• Regularization parameter found by a cross-validation strategy

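A minimal sketch of a temporally regularized estimate solved with conjugate gradient. It assumes a quadratic smoothness prior on first differences and an identity observation model; neither is confirmed as the STR cost function, which did not survive extraction. The names `conjugate_gradient`, `map_smooth`, and `lam` are illustrative.

```python
import numpy as np

def conjugate_gradient(apply_A, b, n_iter=200, tol=1e-10):
    """Solve A x = b for a symmetric positive-definite operator A."""
    x = np.zeros_like(b)
    r = b - apply_A(x)          # residual
    p = r.copy()                # search direction
    rs = r @ r
    for _ in range(n_iter):
        if np.sqrt(rs) < tol:
            break
        Ap = apply_A(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def map_smooth(y, lam):
    """Minimize ||y - x||^2 + lam * ||D x||^2 (D = first difference).

    Assumed cost, for illustration only. The minimizer satisfies the
    normal equations (I + lam * D^T D) x = y.
    """
    def apply_A(x):
        d = np.diff(x)              # D x
        dtd = np.zeros_like(x)      # accumulates D^T (D x)
        dtd[:-1] -= d
        dtd[1:] += d
        return x + lam * dtd
    return conjugate_gradient(apply_A, np.asarray(y, dtype=float))
```

Larger `lam` gives a smoother estimate; in practice the weight would be chosen by cross-validation, as the slide states.
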
Supertemporal Resolution: Updates
• Reduction in computation time
  - Minor software revisions
  - New hardware
• New simulations
  - Introduced amplitude amplification factors (1x, 2x, 4x, 6x, 8x), simulating an increase in the B0 field
  - Generated multiple datasets (20 per amplification factor)
• New human visual system data
  - Three-inch surface coil
  - Multiple runs (3 at TR = 2.0 s, 3 at TR = 0.5 s) of 3.5 cycles

Performance comparison
• Simple averaging (SA) method
  - Alignment of slices into timeframe of first slice
  - No regularization
  - Closed-form solution
  - 0.5 s time resolution estimate from TR = 0.5 s dataset
• Interpolation with regularization (IWR) method
  - Alignment of slices into timeframe of first slice
  - Regularization applied and chosen with cross-validation
  - Numerical optimization with conjugate gradient
  - 0.5 s time resolution estimate from TR = 0.5 s dataset
• Supertemporal resolution (STR) method
  - Slice timing considered in data model
  - Regularization applied and chosen with cross-validation
  - Numerical optimization with conjugate gradient
  - 0.5 s time resolution estimate from TR = 2.0 s dataset

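A small sketch of why slice timing can yield a 0.5 s estimate from TR = 2.0 s data. It assumes 4 slices acquired sequentially and evenly spaced within each TR; the slice count, the ordering, and the name `slice_sample_times` are assumptions made for illustration.

```python
import numpy as np

def slice_sample_times(n_slices, tr, n_volumes):
    """Acquisition time of every slice in every volume.

    Assumes sequential, evenly spaced slice acquisition within each TR.
    Returns an array of shape (n_volumes, n_slices).
    """
    offsets = np.arange(n_slices) * tr / n_slices
    volume_starts = np.arange(n_volumes) * tr
    return volume_starts[:, None] + offsets[None, :]

# Hypothetical acquisition: 4 slices, TR = 2.0 s, 5 volumes.
times = slice_sample_times(n_slices=4, tr=2.0, n_volumes=5)
# Pooling the sample times of all slices gives a grid finer than the TR.
merged = np.sort(times.ravel())
```

If voxels in different slices share the same generating signal, their pooled samples cover the signal on the finer grid, which is the property the STR data model relies on.
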
Simulation results: Performance
[Figure: Mean square error of simulation results for different analysis methods plotted against amplitude amplification factor. Axes: amplitude amplification factor, 1 to 8; mean square error (AU), 0 to 0.5. Legend: mean and individual errors for SA, IWR, and STR, plus e0.5 error and e2.0 error.]

Simulation results: Examples
[Figure: Examples of 0.5 second estimates on synthetic datasets at 4x template amplitudes. Left: normalized IWR estimate on TR = 0.5 s data versus injected BOLD signal. Right: normalized STR estimate on TR = 2.0 s data versus injected BOLD signal. Axes: Time (s), 40 to 120; Intensity (AU), −0.8 to 0.8.]

Human data results: Simple averaging method
[Figure: Simple averaging estimates on the TR = 0.5 second dataset (3 experiments). Left: SA estimates for the V(r1)0.5, V(r2)0.5, and V(r3)0.5 data series. Right: mean and mean ± std of the SA estimates. Axes: Time (s), 40 to 120; normalized intensity (AU), −0.15 to 0.15.]

Human data results: Interpolation with regularization method
[Figure: Interpolation with regularization estimates on the TR = 0.5 second dataset (3 experiments). Left: IWR estimates for the V(r1)0.5, V(r2)0.5, and V(r3)0.5 data series. Right: mean and mean ± std of the IWR estimates. Axes: Time (s), 40 to 120; normalized intensity (AU), −0.15 to 0.15.]

Human data results: Supertemporal Resolution method
[Figure: Supertemporal resolution estimates on the TR = 2.0 second dataset (3 experiments). Left: STR estimates for the V(r1)2.0, V(r2)2.0, and V(r3)2.0 data series. Right: mean and mean ± std of the STR estimates. Axes: Time (s), 40 to 120; normalized intensity (AU), −0.15 to 0.15.]

Discussion and conclusions
• Simulated data
  - Initially, at low SNR, STR performs worse than IWR because small features are masked by noise
  - At increasing SNR, STR performs better than IWR/SA as its inherent physical advantage becomes apparent
• In human data, STR estimates are qualitatively different from IWR/SA estimates
• Conclusion: STR may be a valuable tool in characterizing small features in the BOLD signal at higher static field strengths or higher SNR

Clustered components analysis: Objectives
Hypothesis: Activation by specific functional tasks → Distinct neural responses in different parts of the brain
Therefore, we propose the following goals:
• Design and run fMRI experiment activating visual, auditory, and motor cortices.
• Estimate number of distinct neural responses (# of classes/clusters)
• Extract an estimate for each response
• Determine voxel memberships

Existing approaches to signal estimation
• Principal component analysis (PCA)
  - Extracts orthogonal signals
  - Disadvantage: Signals not usually orthogonal
• Independent component analysis (ICA)
  - Extracts spatially independent signals
  - Disadvantage: Signals may not be independent
• Conventional clustering
  - Groups signal vectors as spheres about a mean
  - Disadvantage: Signals may not form spherical clusters
• General comment: None of these methods starts with an explicit model of the data. All estimate the distinct signals in an ad hoc way.

Analysis framework
• Dimensionality Reduction
  - Signal subspace is orthogonal to noise subspace
  - Noise can be accurately modeled in fMRI
  - Separate signal subspace from noise subspace
• Clustered Components Analysis
  - Useful information is in shape of signal; amplitude unimportant
  - Component direction is found instead of mean
  - Amplitude can vary in cluster so long as shape is preserved
  - Clusters found in cylinders instead of spheres

Interpretation of Clustered Components Analysis
• Because the amplitude of the voxel signal is not important, the method clusters around component directions, not component means.
• This means the clusters can be thought of as cylinders rather than the traditional spheres.

Dimensionality reduction: Harmonic decomposition
Data model for harmonic decomposition
  $Y = A\Theta + \nu$
• $Y$: detrended voxel timecourse matrix (# of timepoints × # of voxels)
• $A$: matrix of sampled sines and cosines (harmonic components)
• $\Theta$: harmonic image
• $\nu$: matrix of residuals from the least squares fit

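A minimal sketch of the least squares harmonic fit, assuming the sines and cosines are sampled at integer multiples of the paradigm frequency. The function names and the choice of 3 harmonics in the usage note are illustrative, not taken from the slides.

```python
import numpy as np

def harmonic_design(n_timepoints, samples_per_cycle, n_harmonics):
    """Matrix of sampled cosines and sines, one pair per harmonic."""
    t = np.arange(n_timepoints)
    cols = []
    for h in range(1, n_harmonics + 1):
        phase = 2.0 * np.pi * h * t / samples_per_cycle
        cols.extend([np.cos(phase), np.sin(phase)])
    return np.column_stack(cols)      # shape (timepoints, 2 * n_harmonics)

def harmonic_fit(Y, A):
    """Least squares fit of Y ~ A @ theta.

    Y has one detrended voxel timecourse per column. Returns the harmonic
    image (coefficients) and the matrix of residuals.
    """
    theta, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return theta, Y - A @ theta
```

For example, `harmonic_design(144, 32, 3)` would correspond to 144 samples with 32 samples per paradigm cycle, which matches a 64 second cycle sampled at TR = 2 seconds.
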
Dimensionality reduction: Signal subspace estimation
• Signal + noise covariance $\hat{R}_{sn}$: [equation not recoverable from the extracted text]
• Noise covariance $\hat{R}_n$: [equation not recoverable from the extracted text; it involves the trace of a residual term]
• Signal covariance: $\hat{R}_s = \hat{R}_{sn} - \hat{R}_n$
• Eigen decomposition: $\hat{R}_s = U_s \Lambda_s U_s^t$
• Only the columns of the eigenvector matrix $U_s$ corresponding to the positive eigenvalues of $\hat{R}_s$ are retained, yielding the modified eigenvector matrix.
• Reduced dimensionality feature vector matrix: [equation not recoverable from the extracted text; it applies a whitening matrix]

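A hedged sketch of the subspace selection step. The covariance estimates below are assumptions, since the slide's own expressions did not survive extraction: the signal + noise covariance is taken as the sample covariance of the harmonic image, and the noise covariance as the white-noise covariance of least squares coefficients, $\sigma^2 (A^t A)^{-1}$. The name `signal_subspace` is illustrative.

```python
import numpy as np

def signal_subspace(theta, resid, A):
    """Keep eigenvectors of the estimated signal covariance with positive eigenvalues.

    theta: (n_coef, N) harmonic image, resid: (P, N) residuals, A: (P, n_coef).
    The two covariance estimates are assumptions, not the slide's formulas.
    """
    P, N = resid.shape
    n_coef = A.shape[1]
    R_sn = theta @ theta.T / N                        # assumed signal + noise covariance
    sigma2 = np.sum(resid ** 2) / (N * (P - n_coef))  # assumed white-noise variance
    R_n = sigma2 * np.linalg.inv(A.T @ A)             # noise covariance of LS coefficients
    R_s = R_sn - R_n                                  # signal covariance
    evals, evecs = np.linalg.eigh(R_s)                # eigenvalues in ascending order
    keep = evals > 0
    return evecs[:, keep], evals[keep]
```

Subtracting the noise covariance can push eigenvalues below zero in directions that carry no signal, which is why only the positive ones are kept.
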
Data Model for Clustered Component Analysis
• Assumptions
  - Only shape of the response is important
  - Amplitude is NOT important
  - Noise independent in space and time (time-independence can be relaxed)
• Our Model
  - $y_n$ is the $M$-dimensional column vector representation of the $n$-th timecourse
      $y_n = a_n e_{x_n} + w_n$
  - $a_n$ is the unknown scalar amplitude for pixel $n$
  - $E = [e_1, \ldots, e_K]$ are the $K$ component directions
  - $1 \le x_n \le K$ is the class of the pixel
  - $w_n$ is a Gaussian noise vector
[Figure: axes $e_1$, $e_2$, $e_3$ showing the vector $a_n e_{x_n}$ for $a_n = 1.5$, $x_n = 1$, and the noisy observation $a_n e_{x_n} + w_n$]

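To make the model concrete, a short sketch that draws voxel vectors as an amplitude times a component direction plus Gaussian noise. The function name, the zero-based class labels, and the white-noise assumption are illustrative choices.

```python
import numpy as np

def simulate_voxels(E, classes, amplitudes, noise_std, rng):
    """Draw voxel vectors from: amplitude * component direction + noise.

    E: (M, K) matrix whose columns are component directions.
    classes: length-N integer array of zero-based class labels.
    amplitudes: length-N array of scalar amplitudes.
    Returns an (M, N) matrix with one voxel vector per column.
    """
    classes = np.asarray(classes)
    signal = E[:, classes] * np.asarray(amplitudes)   # scale each selected column
    noise = noise_std * rng.standard_normal(signal.shape)
    return signal + noise
```

With `noise_std = 0`, a voxel of class 0 and amplitude 1.5 lies exactly on the first component direction, which mirrors the example drawn in the figure.
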
Clustered components approach
• Goal: Minimize the minimum description length (MDL) criterion
    $\mathrm{MDL} = -(\text{log-likelihood}) + \tfrac{1}{2}\,(\text{\# of parameters}) \cdot \log(\text{\# of datapoints})$
• Unknown model parameters
  - $K$ is the model order (number of clusters)
  - $a_n$ is the amplitude of each pixel
  - $E = [e_1, \ldots, e_K]$ is the set of distinct neural responses
  - $\Pi = [\pi_1, \ldots, \pi_K]$ are the prior probabilities for each class
• Use the maximum likelihood (ML) estimate $\hat{a}_n$ implicitly
• Find ML estimates $\hat{E}$ and $\hat{\Pi}$ using the Expectation-Maximization (EM) algorithm for each model order $K$
• Estimate model order $\hat{K}$ by cluster merging and minimizing the MDL criterion

Voxel likelihood function
• Likelihood for each voxel: [equation not recoverable from the extracted text]
• ML estimate of the scalar amplitude: [equation not recoverable from the extracted text]
• Voxel log-likelihood: [equation not recoverable from the extracted text]

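A sketch of what these quantities look like under explicit assumptions: white Gaussian noise with known variance `sigma2` and the model "voxel = amplitude × direction + noise". This is the standard Gaussian result under those assumptions and is not confirmed to match the slide's notation; `ml_amplitude` and `voxel_loglik` are illustrative names.

```python
import numpy as np

def ml_amplitude(y, e):
    """Least squares (ML under white Gaussian noise) amplitude of y along e."""
    return (e @ y) / (e @ e)

def voxel_loglik(y, e, sigma2=1.0):
    """Gaussian log-likelihood of y given direction e, with the ML amplitude plugged in.

    Assumes white noise with variance sigma2 (an assumption for this sketch).
    """
    r = y - ml_amplitude(y, e) * e     # residual after removing the component
    M = y.size
    return -0.5 * M * np.log(2.0 * np.pi * sigma2) - 0.5 * (r @ r) / sigma2
```

For a unit-norm direction the residual energy reduces to `y @ y - (e @ y) ** 2`, so only the squared projection onto the direction distinguishes one class from another.
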
Maximum likelihood estimate
• Log-likelihood of the entire dataset: [equation not recoverable from the extracted text]
• ML estimate of the parameters: [equation not recoverable from the extracted text]

Expectation-maximization equations
• Posterior probability: [equation not recoverable from the extracted text]
• E-step: [equations not recoverable from the extracted text]
• M-step: [equations not recoverable from the extracted text]

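One plausible form of the EM iteration, sketched under stated assumptions: unit-variance white noise (for example after whitening), unit-norm component directions, and amplitudes replaced by their ML estimates. Under these assumptions the class posterior depends on the squared projection onto each direction, and each direction is updated as the principal eigenvector of a posterior-weighted correlation matrix. The slide's own update equations are not recoverable, so this may differ from them; `cca_em` is an illustrative name.

```python
import numpy as np

def cca_em(Y, E0, n_iter=50):
    """EM sketch for clustering voxel vectors around directions.

    Y: (M, N) voxel vectors, assumed to have unit-variance white noise.
    E0: (M, K) initial component directions.
    Returns directions E, class priors pi, and posteriors of shape (K, N).
    """
    E = E0 / np.linalg.norm(E0, axis=0)
    K = E.shape[1]
    N = Y.shape[1]
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: posterior proportional to pi_k * exp(0.5 * (e_k . y_n)^2)
        proj = E.T @ Y
        logp = np.log(np.maximum(pi, 1e-12))[:, None] + 0.5 * proj ** 2
        logp -= logp.max(axis=0)            # stabilize the exponentials
        post = np.exp(logp)
        post /= post.sum(axis=0)
        # M-step: update priors and directions
        pi = post.sum(axis=1) / N
        for k in range(K):
            R_k = (Y * post[k]) @ Y.T       # posterior-weighted correlation matrix
            _, vecs = np.linalg.eigh(R_k)
            E[:, k] = vecs[:, -1]           # principal eigenvector
    return E, pi, post
```

Hard classifications follow from `post.argmax(axis=0)`. The sign of each returned direction is arbitrary, which is consistent with amplitude being a free scalar in the model.
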
Order Estimation through Cluster Merging
1. Start with a large number of clusters ($K$) and initialize
2. Run EM algorithm to convergence
3. Choose the two clusters that minimize the distance function (which is the upper bound on the change in MDL)
   [Equation: distance function; not recoverable from the extracted text]
4. Merge clusters using
   [Equation: merge rule; not recoverable from the extracted text]
5. Decrement $K$ and initialize next iteration with new clusters
6. Repeat 2 through 5 until $K = 1$
7. Choose number of components minimizing the MDL criterion
   [Equation: MDL criterion; not recoverable from the extracted text]

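The final selection step can be sketched as follows, using the generic MDL form (negative log-likelihood plus half the parameter count times the log of the number of datapoints). How parameters are counted for this model is not recoverable from the slides, so the counts are left as inputs; `mdl` and `select_order` are illustrative names.

```python
import numpy as np

def mdl(log_likelihood, n_params, n_datapoints):
    """Generic MDL score: lower is better."""
    return -log_likelihood + 0.5 * n_params * np.log(n_datapoints)

def select_order(loglik_by_order, params_by_order, n_datapoints):
    """Return the model order with the smallest MDL score, plus all scores.

    loglik_by_order: dict mapping model order to the converged log-likelihood.
    params_by_order: dict mapping model order to its parameter count (caller-supplied).
    """
    scores = {
        k: mdl(ll, params_by_order[k], n_datapoints)
        for k, ll in loglik_by_order.items()
    }
    best = min(scores, key=scores.get)
    return best, scores
```

In the merging procedure, the log-likelihood recorded after EM converges at each model order would populate `loglik_by_order`.
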
Synthetic fMRI Images
• Synthetic data
  - Baseline control images created at each sample point
  - During periods of activation, 3 different realistic signals with varying amplitudes were injected
  - Gaussian white noise added to each voxel at each timepoint
• Verification and comparison using different analysis methods applied before and after signal subspace estimation (SSE)
  - PCA, using 3 components corresponding to the 3 largest variances
  - Spatial ICA constrained to yield 3 components
  - Spatial ICA unconstrained, using 3 best components
  - Fuzzy C-means (FCM) clustering constrained to yield 3 clusters
  - CCA

Paradigm Design
• For our dimensionality reduction, activation must be periodic
  - Block activation scheme
  - 1 cycle = 32 seconds control (rest state), 32 seconds activation
  - 1 scan = 16 seconds lead-in, 4.5 paradigm cycles, 16 seconds lead-out (only samples during the paradigm are used)
  - 0.5 Hz sample rate (TR = 2 seconds)
• To illustrate the power of the clustering method, many different types of functional cortex must be activated
  - Visual: Flashing 8 Hz checkerboard
  - Auditory: Forward vs. backward speech (backward is the control)
  - Motor: Self-paced finger tapping
[Figure: paradigm timeline with lead-in, alternating On/Off blocks, and lead-out; block boundaries at 0:16, 0:48, 1:20, 1:52, 2:24, 2:56, 3:28, 4:00, 4:32, and 5:04, ending at 5:20]

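The timing arithmetic of the scan can be checked with a short sketch. It assumes each cycle starts with the control block followed by the activation block, as the cycle definition suggests; the constant and function names are illustrative.

```python
import numpy as np

TR = 2.0          # seconds per sample (0.5 Hz)
LEAD_IN = 16.0    # seconds
LEAD_OUT = 16.0   # seconds
BLOCK = 32.0      # seconds per control or activation block
N_CYCLES = 4.5    # paradigm cycles per scan

def paradigm_timing():
    """Sample times, a mask for samples inside the paradigm, and an activation mask."""
    paradigm = N_CYCLES * 2 * BLOCK                   # 288 s of paradigm
    total = LEAD_IN + paradigm + LEAD_OUT             # 320 s = 5:20
    t = np.arange(0.0, total, TR)
    in_paradigm = (t >= LEAD_IN) & (t < LEAD_IN + paradigm)
    block_index = np.floor((t - LEAD_IN) / BLOCK).astype(int)
    # Assumed ordering: even blocks are control, odd blocks are activation.
    active = in_paradigm & (block_index % 2 == 1)
    return t, in_paradigm, active
```

Under these settings a scan has 160 samples, of which 144 fall inside the paradigm and are kept for analysis.
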
Simulated data: Hard classifications
[Figure: Results for CCA applied to synthetic data]

Simulated data: Qualitative results
[Figure: Estimates after application of SSE for PCA, FCM, constrained ICA, unconstrained ICA, and CCA; three estimated timecourses per method. Axes: horizontal 10 to 110; vertical −0.5 to 0.5.]

Simulated data: Quantitative results
Mean squared error for analyses on synthetic data before and after signal subspace estimation (SSE)
             PCA   FCM   ICA (c)   ICA (u)   CCA
Before SSE   [values not recoverable from the extracted text]
After SSE    [values not recoverable from the extracted text]
Number of voxels classified correctly on synthetic data before and after signal subspace estimation (SSE), out of 192 total voxels
             PCA   ICA (c)   ICA (u)   FCM   CCA
Before SSE    61       113        38    95   167
After SSE    111       162        77   151   169

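The classification counts can be turned into percentages with a few lines. The counts are copied from the table above; the helper name `percent_correct` is illustrative.

```python
TOTAL_VOXELS = 192

# (before SSE, after SSE) counts of correctly classified voxels, from the table.
correct = {
    "PCA": (61, 111),
    "ICA (c)": (113, 162),
    "ICA (u)": (38, 77),
    "FCM": (95, 151),
    "CCA": (167, 169),
}

def percent_correct(count, total=TOTAL_VOXELS):
    """Percentage of voxels classified correctly."""
    return 100.0 * count / total

for method, (before, after) in correct.items():
    print(f"{method:8s} before: {percent_correct(before):5.1f}%  after: {percent_correct(after):5.1f}%")
```

Read this way, CCA changes little with SSE, while the other methods gain substantially from it.
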
Human data: Timesequence realizations
[Figure: First 5 clusters; timesequences for Class 1 through Class 5. Axes: horizontal 20 to 220; vertical −2 to 2.]

Human data: Timesequence realizations
[Figure: Clusters 6-9; timesequences for Class 6 through Class 9. Axes: horizontal 20 to 220; vertical −2 to 2.]

Human data: Hard classifications
[Figure: Motor cortex (first 5 clusters): (L) Upper slice, (R) Lower slice; image axes 50 to 250]

Human data: Hard classifications
[Figure: Auditory cortex (first 5 clusters): (L) Upper slice, (R) Lower slice; image axes 50 to 250]

Human data: Hard classifications
[Figure: Visual cortex (first 5 clusters): (L) Upper slice, (R) Lower slice; image axes 50 to 250]

Conclusions
• Clustered component analysis is a new method of extracting signals where only shape, not amplitude, is important
• CCA has been shown to perform well on simulated data
• The experimental results show the following:
  - The distinct neuronal signals do not correlate strongly with the known functional cortices
  - The clusters tend to lie along sulcal-gyral boundaries, possibly correlated with vasculature
• CCA can be used with dimensionality reduction strategies other than the ones used in our experiments
• CCA may also be adapted for use with applications other than fMRI

Acknowledgements
• Major Professors: Dr. Mark J. Lowe and Dr. Charles A. Bouman
• Committee Members: Dr. Peter C. Doerschuk and Dr. Edward J. Delp
• Department of Biomedical Engineering and Division of Imaging Sciences