
IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 2, NO. 6, DECEMBER 2008 919

An Evolutionary Approach to the Identification of Informative Voxel Clusters for Brain State Discrimination

Malin Björnsdotter Åberg, Student Member, IEEE, and Johan Wessberg

Abstract—We present a novel multivariate machine learning approach to the identification of voxel clusters containing brain state discriminating information, serving as a potentially more sensitive alternative to univariate activation detection. The proposed method consists of an evolutionary algorithm that, in conjunction with a classifier, extracts voxel clusters with a classification score above a pre-defined, above-chance threshold. The results can be displayed as two- or three-dimensional voxel discrimination relevance maps (VDRMs), indicating where and to what degree brain state classification is possible. When applied to a finger-tapping dataset, numerous voxel clusters with impressive classification rates were identified, at best scoring an area under the receiver operating characteristic (ROC) curve (AUC) of 1 within as well as between subjects. The location of high-scoring regions correlated well with functionally relevant areas as defined by the general linear model (GLM). Combining clusters for maximal classification scores as a feature selection approach outperformed the GLM T-map voxel ranking method (e.g., group level AUC of 0.908 compared to 0.785 for one cluster/200 voxels). Moreover, on data from a tactile study we show that the proposed algorithm can produce significant brain state discrimination scores where both the GLM and ROI-based classification fail to detect significantly activated voxels. Finally, we demonstrate that the algorithm can be successfully applied to data with more than two conditions and hence produce multiclass voxel relevance maps. The proposed evolutionary classification scheme has thus proven excellent in identifying voxel clusters that contain information about given brain states, which can be utilized not only for maximal single-volume fMRI classification, but also for multivariate, multiclass, highly sensitive functional brain mapping.

Index Terms—Classification, evolutionary algorithms, fMRI, pattern recognition, support vector machines.

I. INTRODUCTION

A key issue in the analysis of fMRI data has long been the identification of brain areas involved in the processing of given conditions, typically addressed using the general linear model (GLM) where data is temporally averaged and analyzed on a voxel-by-voxel basis [1]. The recent introduction of machine learning concepts and multivoxel pattern analysis (MVPA) to neuroscience has, however, made possible the mapping of a single fMRI volume acquisition to an individual’s cognitive state at that point in time [2], [3]. Importantly, the MVPA approach is multivariate, that is, it takes the information encoded over numerous voxels into account. This approach is therefore potentially more sensitive to the discrimination and characterization of brain states than univariate approaches like the GLM. Several studies have established the utility of pattern analysis methods in various fMRI implementations, including the tracking of mental states over time [4], lie detection [5], the decoding of single-trial visual stimuli (visible [6]–[8] as well as invisible [9]), biofeedback [10], [11], real-time fMRI analysis [12], and the increased sensitivity on account of the inherent multivariate encoding of condition differences in the fMRI [13].

Manuscript received March 15, 2008; revised October 02, 2008. Current version published January 23, 2009. This work was supported by the Swedish Research Council under Grant K2007-63X-3548 and by the Sahlgrenska University Hospital under Grant ALFGBG 3161. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Martin McKeown.

The authors are with the Institute of Neuroscience and Physiology, University of Gothenburg, 413 90 Göteborg, Sweden (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSTSP.2008.2007788

In the MVPA approach, a classifier is typically trained to identify brain patterns associated with a set of given cognitive conditions, such as the brain state induced by viewing a photo of a teapot, as opposed to a garden gnome or other type of object [7]. A successfully trained classifier can then be applied to new, unknown data, that is, to report when the subject sees a teapot at a later stage.

The success of the trained classifier in classifying the fMRI volumes is essentially a measure of the brain state category information contained in the voxels the classifier receives as input. If the classifier is significantly better than chance at guessing the brain states, there must be information about the conditions in the data which the classifier can interpret. Thus, by applying classifiers to various areas of the brain, it should be possible to map the brain state category information content to anatomical regions. Given a typical brain volume of voxels and the combinatorial explosion of possible voxel subsets, however, mapping the brain in this fashion is a daunting task. Moreover, in smaller regions of interest (ROIs), where whole-region classification produces chance-level rates, the existence of one (or more), potentially well-defined, optimal voxel subsets yielding significant classification scores is possible. Again, the excessive number of possible voxel combinations renders any exhaustive search virtually impossible.

To address this issue, we have developed a machine learning optimization method based on evolutionary algorithms [14] that extracts voxel clusters yielding brain state classification results above a specified threshold in an intelligent and efficient fashion. Thus, the algorithm serves as a multivariate, and thereby potentially more sensitive, complement to univariate GLM activation detection. Notably, the proposed method is fundamentally different from our previous evolutionary scheme for voxel selection [15] in that the aim is to identify biologically interpretable voxel clusters, that is, informative brain regions, as opposed to spatially distributed voxel subsets for optimal classification.

1932-4553/$25.00 © 2008 IEEE

The algorithm employs a classifier, of any type, for direct evaluation of training (known data) as well as generalization (new data) performance. The generalization score, and, when multiple clusters are obtained, the voxel selection frequency, can be displayed as two- or three-dimensional voxel discrimination relevance maps (VDRMs) [15], akin to GLM T-maps. These can be projected on anatomical scans, indicating cluster location and corresponding condition classification relevance. By adjusting the threshold, the range and specificity of the obtained clusters can be manipulated. A high threshold can, for example, be used for feature selection, where the aim is to find the best possible voxel subset to use for maximal brain state classification. A threshold slightly above chance, on the other hand, yields a more representative map of brain areas where any, albeit not maximal, brain state classification is possible.

A number of methods for deriving classification activation maps specifically from the support vector machine have been described in the literature [16]–[18]. Our approach is, however, a general purpose method, which can be used with any given classifier. Importantly, nonlinear classifiers, where the relationship between classifier parameters (weights) and the input feature space is highly complex, can be easily incorporated in our approach. Moreover, multiclass cluster maps can be constructed in an intuitive manner, whereas the previous studies only manage binary data. The proposed algorithm can, instead, be seen as a refinement of the searchlight approach proposed by Kriegeskorte and colleagues, where the brain is scanned with a volume of predetermined size whose content is multivariately evaluated [19].

In order to illustrate the potential of the algorithm, we applied it to two different experimental paradigms. First, a standard finger-tapping scheme was employed to generate robust, well-defined activations (mainly in the primary somatosensory and motor cortices as well as in the supplementary motor area). Using this data we illustrate that the evolutionary algorithm is efficient and accurate in finding above-threshold voxel clusters in whole brain volumes, within as well as between subjects, and that the cluster locations correspond well to both physiologically expected areas and those identified by the GLM. We also use the finger-tapping dataset to show that cluster-based feature selection for maximal brain state discrimination outperforms a GLM T-map ranking method used extensively in fMRI classification situations [9], [20].

In the second paradigm, subjects were brushed on the thigh and arm with a soft brush. This type of stimulus is primarily thought to activate the primary and secondary somatosensory cortices through thick, myelinated fibers. It is also believed that certain thin unmyelinated fibers, called C-tactile fibers (CT-fibers), projecting to the insular cortex, are activated by such pleasant touch [21]. Applying the clustering scheme to an ROI localized to the insular cortex, our algorithm does in fact find significant differences between brushing and rest in all subjects, including those where the GLM fails to detect active voxels. Using the same dataset, we also show that it is possible to utilize the multiclass capability of pattern recognition by generating voxel cluster maps displaying three-way voxel discrimination relevance scores.

II. METHODS

A. Data Acquisition

A 1.5 T fMRI scanner (Philips Intera, Eindhoven, The Netherlands) was first used to collect anatomical scans using a high-resolution T1-weighted anatomical protocol ( ; ; flip angle, 30°; FOV 256 mm). Functional scans were then acquired using a blood oxygenation level dependent (BOLD) protocol and a T2*-weighted gradient echo-planar imaging sequence (flip angle 90°, , ). The scanning planes were oriented parallel to the line between the anterior and posterior commissure and covered the brain from the top of the cortex to the base of the cerebellum. Each scan volume contained 25 slices at a grid resolution of 128 × 128 voxels.

For the first dataset we employed a finger-tapping, block-design paradigm, conventionally used for mapping of primary sensorimotor cortex areas in brain surgery candidates. In nine healthy volunteers (four female, all right-handed, aged 22–35 yrs), 120 volumes of alternating three-volume cue-based finger-tapping, where the subjects tapped the fingers of their right hand on to the thumb, and three volumes of rest were acquired.

In the second paradigm, the experimenter, following a cue from the scanner, applied 16-cm long soft brush strokes of 3 s duration in the distal direction on the right thigh or forearm during three volumes. The stimulations were alternated in a random fashion and an equal number of three consecutive volumes of rest were interleaved. 120 volumes were acquired per scan and six scans were obtained per subject. Six healthy subjects (three female, aged 22–28 years) were included in the study.

The Regional Ethical Review Board at University of Gothenburg approved the study, and the experiments were performed in accordance with the Declaration of Helsinki.

B. Data Pre-Processing

The data was pre-processed with the Neurolens software package (developed at the Neurovascular Imaging Lab, UNF Montreal, QC, Canada. Online: www.neurolens.org). All fMRI data was motion corrected. For the cluster analysis, spatial smoothing (6-mm Gaussian kernel) was applied only to the brushing data set. A stereotactic normalization of individual data to fit the Montreal Neurological Institute (MNI) standard brain was performed where appropriate for group analysis [22]. Voxels not containing brain matter were removed by thresholding the intensity values appropriately. Hemodynamic delay was accounted for by discarding the first volume in each three-volume condition and forming an average over the remaining two volumes. To investigate the effect of temporal drift on the performance, the data was linearly detrended where stated in the results section. Moreover, each volume was scaled to the range [0,1]. For the ROI-study, the left insular cortex was extracted using an anatomical reference [23].

Fig. 1. Simplified example of an evolutionary iteration.
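The hemodynamic-delay handling and the scaling step in the pre-processing description can be illustrated with a minimal Python sketch. It is not the Neurolens pipeline: the function names, the representation of a volume as a flat list of voxel intensities, and the order of the two steps are assumptions made for illustration. With 120 volumes and three-volume blocks, block averaging yields 40 patterns, that is, 20 per class for the tap/rest design.

```python
def block_average(volumes, block_len=3, discard=1):
    """Collapse each block of `block_len` volumes into one pattern.

    The first `discard` volumes of every block are dropped (hemodynamic
    delay) and the remaining volumes are averaged voxel by voxel.
    """
    patterns = []
    for start in range(0, len(volumes) - block_len + 1, block_len):
        kept = volumes[start + discard:start + block_len]
        n_vox = len(kept[0])
        patterns.append([sum(v[i] for v in kept) / len(kept) for i in range(n_vox)])
    return patterns


def scale_unit_range(volume):
    """Linearly rescale one volume so that its intensities span [0, 1]."""
    lo, hi = min(volume), max(volume)
    if hi == lo:  # constant volume: avoid division by zero
        return [0.0 for _ in volume]
    return [(x - lo) / (hi - lo) for x in volume]


# Toy data: six volumes of two voxels each (illustrative values only).
toy = [[0, 0], [2, 4], [4, 8], [1, 1], [3, 5], [5, 7]]
patterns = [scale_unit_range(p) for p in block_average(toy)]
```

Whether scaling was applied before or after averaging is not stated in the text; the sketch scales the averaged patterns.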

Thus, a finger-tapping dataset, containing nine subjects with 20 patterns from each class (tap/rest), was formed, as well as a brushing dataset with six subjects and 78 volumes per stimulus category (arm brushing/thigh brushing/rest). A standard general linear model analysis was performed on spatially smoothed (6-mm Gaussian kernel) data for comparison [24]. For the group analysis, a fixed effect model was used. The GLM maps were thresholded by controlling the false discovery rate (FDR), so that [25]. For data visualization, the programs MRIcron (by Chris Rorden, www.sph.sc.edu/comd/rorden/mricron/) and Cartool (by Denis Brunet, http://www.brainmapping.unige.ch/Cartool.htm) were used.

C. Cluster Identification Using Evolutionary Algorithms

An evolutionary algorithm is an optimization scheme inspired by Darwinian evolution, where potential problem solutions are encoded as individuals in a population [14]. A fitness value, measuring the quality of each solution, drives the evolution towards a best possible solution.

The aim of the present algorithm is to, in a whole brain volume, find a cluster of voxels that in conjunction with a classifier can discriminate between brain conditions to (at least) a specified degree. Thus, each individual in the population corresponds to one voxel cluster. Due to the exceedingly high dimensionality of fMRI data (in the order of tens to hundreds of thousands of voxels), the clusters are encoded sparsely as indexed lists.

The proposed algorithm is illustrated in Fig. 2. First, the population of individuals is initialized in a stochastic fashion. Here, for each individual, one seed voxel is randomly selected. The voxel cluster is then constructed by the addition of random voxels which neighbor the seed voxel, and, subsequently, any voxel already in the cluster.
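The stochastic initialization described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' Matlab/C implementation: voxels are (x, y, z) index tuples in a rectangular volume, neighbors are taken from the 26-voxel cube around a voxel, no brain mask is applied, and the names `neighbors`, `frontier`, and `init_cluster` are hypothetical.

```python
import random
from itertools import product

# The 26 offsets of the 3 x 3 x 3 cube around a voxel, excluding the voxel itself.
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def neighbors(voxel, shape):
    """Return the in-bounds voxels of the 26-voxel cube surrounding `voxel`."""
    out = []
    for offset in OFFSETS:
        n = tuple(c + d for c, d in zip(voxel, offset))
        if all(0 <= c < s for c, s in zip(n, shape)):
            out.append(n)
    return out


def frontier(cluster, shape):
    """Voxels outside the cluster that touch at least one cluster voxel."""
    touching = {n for v in cluster for n in neighbors(v, shape)}
    return sorted(touching - set(cluster))


def init_cluster(shape, size, rng):
    """Pick a random seed voxel, then grow by adding random frontier voxels."""
    cluster = {tuple(rng.randrange(s) for s in shape)}
    while len(cluster) < size:
        candidates = frontier(cluster, shape)
        if not candidates:  # the cluster already fills the whole volume
            break
        cluster.add(rng.choice(candidates))
    return cluster


cluster = init_cluster(shape=(5, 5, 5), size=10, rng=random.Random(0))
```

Storing each cluster as a set of voxel indices mirrors the sparse, indexed-list encoding mentioned in the text.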

Three mutation operations are implemented in the algorithm: the addition of a number of voxels, the deletion of a number of voxels, and the substitution of a voxel with another voxel. All voxel additions and substitutions are performed on neighboring voxels, that is, voxels that are within the 26 voxel cube surrounding any voxel already contained in the cluster. The frequency of mutation is regulated by a constant mutation rate parameter for each mutation operation. Also, a voxel cluster in the population is occasionally replaced by a new, randomly generated cluster to add fresh genetic material and aid in escaping potential local maxima. A standard tournament scheme is used for parent selection. Since all individuals in the population represent different locations and crossover thus would destroy the spatial integrity of the voxel clusters, reproduction is asexual.
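A minimal sketch of the three mutation operations and of tournament parent selection is given below. Several details are assumptions rather than facts from the text: each operation changes a single voxel (the text speaks of "a number of voxels"), the rates `p_add`, `p_del`, and `p_sub` and the tournament size `k` are hypothetical values, and no check is made that a deletion leaves the cluster connected.

```python
import random
from itertools import product

OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def frontier(cluster, shape):
    """In-bounds voxels within the 26-voxel cube of any cluster voxel."""
    cand = set()
    for voxel in cluster:
        for offset in OFFSETS:
            n = tuple(c + d for c, d in zip(voxel, offset))
            if all(0 <= c < s for c, s in zip(n, shape)):
                cand.add(n)
    return sorted(cand - cluster)


def mutate(cluster, shape, rng, p_add=0.3, p_del=0.3, p_sub=0.3, max_size=50):
    """Return a mutated copy of `cluster`; the parent set is left untouched."""
    child = set(cluster)
    if rng.random() < p_add and len(child) < max_size:   # addition
        cand = frontier(child, shape)
        if cand:
            child.add(rng.choice(cand))
    if rng.random() < p_del and len(child) > 1:          # deletion
        child.remove(rng.choice(sorted(child)))
    if rng.random() < p_sub:                             # substitution
        cand = frontier(child, shape)
        if cand:
            child.remove(rng.choice(sorted(child)))
            child.add(rng.choice(cand))
    return child


def tournament(population, fitness, rng, k=2):
    """Draw k distinct individuals at random and return the fittest one."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]


rng = random.Random(1)
parent = {(2, 2, 2), (2, 2, 3)}
child = mutate(parent, (5, 5, 5), rng)
```

Because reproduction is asexual, a new individual is obtained simply as `mutate(tournament(...), ...)`, with no crossover step.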

The fitness value, that is, the brain state identification success, of each individual cluster was computed using a classifier. Any classifier can be applied, and in the present study we have used linear support vector machines [26] for binary data and standard multiple linear regression (MLR) for multiclass data. To ensure high generalization capability, the algorithm is supplied with three datasets. The first is used in classifier training (training data, 35% of the total volumes) while the second is used for fitness estimation (testing data, 45%). The third dataset is exclusively used with an already trained and optimized classifier and voxel cluster (validation data, 20%). The fitness is computed by obtaining classifier parameters from the training data input, and applying the trained classifier to the testing data to estimate the condition classes. As a fitness measure indicative of classification performance, the receiver operating characteristic (ROC) curve (a plot of the sensitivity versus 1 − specificity for varying classifier thresholds) is computed and the area under the curve (AUC) is obtained for binary data classified with the SVM. For multiclass data, where the MLR was applied, the proportion of correctly classified volumes is used. A simplified example of an algorithm iteration is shown in Fig. 1.
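The three-way data split and the AUC fitness measure can be sketched as follows. The AUC is computed here through its rank interpretation (the probability that a randomly chosen positive volume receives a higher classifier output than a randomly chosen negative one, with ties counted as one half), which equals the area under the ROC curve. The random, unstratified split and both function names are assumptions; the text gives only the 35/45/20% proportions.

```python
import random


def split_indices(n, rng, fractions=(0.35, 0.45, 0.20)):
    """Shuffle volume indices and cut them into training/testing/validation parts."""
    idx = list(range(n))
    rng.shuffle(idx)
    n_train = round(fractions[0] * n)
    n_test = round(fractions[1] * n)
    return idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]


def auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count positive/negative pairs ranked correctly; a tie counts as one half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


train, test, validation = split_indices(40, random.Random(0))
example = auc([0.8, 0.4, 0.6, 0.2], [1, 1, 0, 0])
```

In the procedure described above, the classifier outputs on the testing part would be passed to `auc` to obtain the fitness, while the validation part is held back for the final score.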

The algorithm is run for either a pre-determined maximum number of generations or until a cluster yielding testing data classification rates above a given threshold (fitness threshold) is obtained. The algorithm can, however, overtrain when allowed to run the full course. Therefore, the cluster with the best result on the mean of the training and testing data performance (typically, but not always, the cluster from the last generation) was extracted and used in the validation classification.

Fig. 2. Flowchart describing the proposed evolutionary approach.
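The stopping rule and the overall generational loop can be sketched generically. The sketch is a simplification under stated assumptions: individuals, `init`, `mutate`, and `fitness` are supplied by the caller, a single fitness value stands in for the mean of training and testing performance used in the text, and the population size, tournament size of two, and random-replacement rate `p_fresh` are hypothetical values.

```python
import random


def evolve(init, mutate, fitness, threshold, pop_size=20, max_gen=100,
           p_fresh=0.05, rng=None):
    """Asexual evolutionary loop that stops at `threshold` or after `max_gen`.

    Returns the best individual seen, its fitness, and the number of
    generations that were produced.
    """
    rng = rng or random.Random()
    population = [init(rng) for _ in range(pop_size)]
    best, best_fit, generations = None, float("-inf"), 0
    while True:
        fits = [fitness(ind) for ind in population]
        top = max(range(pop_size), key=lambda i: fits[i])
        if fits[top] > best_fit:                 # remember the best so far
            best, best_fit = population[top], fits[top]
        if best_fit >= threshold or generations >= max_gen:
            return best, best_fit, generations
        offspring = []
        for _ in range(pop_size):
            if rng.random() < p_fresh:           # inject fresh genetic material
                offspring.append(init(rng))
                continue
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            parent = population[a] if fits[a] >= fits[b] else population[b]
            offspring.append(mutate(parent, rng))
        population = offspring
        generations += 1


# Toy problem standing in for cluster search: find the integer 5.
def toy_fitness(x):
    return -abs(x - 5)


best, best_fit, gens = evolve(
    init=lambda r: r.randrange(10),
    mutate=lambda x, r: x + r.choice((-1, 1)),
    fitness=toy_fitness,
    threshold=0,
    rng=random.Random(0),
)
```

Keeping the best individual outside the population, rather than returning the last generation, reflects the observation that the best cluster is typically, but not always, found in the final generation.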

Due to the stochastic nature of evolutionary algorithms and the potential presence of several local, above-chance maxima (each of which is of interest from a physiological point of view), the algorithm does not find the same cluster at every attempt. In order to obtain discriminating clusters representative of distributed brain activations the algorithm is therefore iterated numerous times, producing as many clusters as iterations. Also, the fitness threshold can be set to obtain clusters with different minimum levels of classification rates.

The algorithm was implemented in Matlab (The Mathworks, Natick, MA) and C on a standard desktop computer by one of the authors (M. Åberg). For the multiple linear regression classifier, the least-squares method was used to obtain the model coefficients and the output was thresholded to obtain discrete categories. For binary classification, the Matlab support vector machine package LS-SVMlab, developed by the group SCD/sista in the department ESAT at the KULeuven, Belgium, was used [26].

All reported results refer to classification results obtained using trained and optimized classifiers and clusters on the validation data. All datasets are balanced, resulting in a chance classification accuracy of 0.5 (AUC) for binary data and 0.333 (proportion of correctly classified volumes) for three-class data.

D. Voxel Discrimination Relevance Maps

The obtained clusters and corresponding classification rates can be plotted on an anatomical brain scan to visualize the degree of brain state discrimination relevancy in different regions. Two different kinds of voxel discrimination relevance maps (VDRMs) are used. The classification performance maps display clusters with corresponding classification capability, measured in area under the ROC-curve (AUC) for binary classification and proportion of correctly classified volumes (pCV) for multiclass data. For voxels contained in more than one cluster, the maximum classification result is shown. These maps give a direct understanding not only of which areas of the brain contain information pertaining to the discrimination between the conditions, but also the degree of discrimination.

The significance levels were established using nonparametric permutation tests, where, under the null hypothesis that there is no significant difference between conditions, the class labels were permuted 1000 times and the corresponding classification scores were computed [27].
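A label-permutation test of this kind can be sketched as below. The callable `score_fn` is a placeholder for the full train-and-classify procedure run with a given label vector; in the toy usage it is simply the accuracy of fixed predictions. The `(hits + 1) / (n_perm + 1)` estimate is a common convention and an assumption here, since the text does not state how the p-value was computed from the 1000 permutations.

```python
import random


def permutation_p_value(score_fn, labels, n_perm=1000, rng=None):
    """Estimate how often permuted labels score at least as well as the true ones."""
    rng = rng or random.Random()
    observed = score_fn(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                  # break the label/volume pairing
        if score_fn(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy usage: predictions that match the true labels perfectly.
true_labels = [1] * 10 + [0] * 10
predictions = list(true_labels)


def accuracy(labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


p_value = permutation_p_value(accuracy, true_labels, rng=random.Random(0))
```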

The classification yield of an entire cluster might not, however, be representative of the information content of each voxel contained in the cluster. Support vector machines are particularly adept at dealing with high-dimensional datasets, meaning that a voxel cluster can achieve high classification results despite containing irrelevant (“noise”) voxels. Thus, the resulting clusters can be larger than strictly necessary. These noise voxels are, however, less likely to be included over several clusters than voxels essential for above-threshold classification. A second type of map was therefore computed, using the voxel selection frequency as a relevancy measure. These maps indicate the number of times any given voxel was selected over a number of algorithm iterations for a given fitness threshold, representing the relative importance within clusters given a certain classification threshold.
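Both map types described in this section reduce to simple aggregations over the clusters returned by repeated runs. The sketch below assumes that clusters are sets of voxel indices and that each cluster comes with one validation score; the function names and the toy values are illustrative only.

```python
from collections import Counter


def performance_map(clusters, scores):
    """Map each voxel to the maximum score of any cluster that contains it."""
    best = {}
    for cluster, score in zip(clusters, scores):
        for voxel in cluster:
            if score > best.get(voxel, float("-inf")):
                best[voxel] = score
    return best


def frequency_map(clusters):
    """Count in how many clusters each voxel was selected."""
    return Counter(voxel for cluster in clusters for voxel in set(cluster))


# Toy clusters with made-up scores (not results from the study).
clusters = [{(0, 0, 0), (0, 0, 1)}, {(0, 0, 1), (0, 1, 1)}]
scores = [0.7, 0.9]
perf = performance_map(clusters, scores)
freq = frequency_map(clusters)
```

Voxels absent from every cluster do not appear in either dictionary, which corresponds to leaving them blank on the anatomical overlay.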

III. RESULTS

A. Summary of Methods

A finger-tapping fMRI dataset, containing nine subjects with 20 volumes from each class (tap/rest), was acquired, as well as a brushing dataset with six subjects and 78 volumes per stimulus category (arm brushing/thigh brushing/rest). An evolutionary algorithm was applied to the datasets to obtain clusters of voxels classifying brain conditions above a specified (fitness) threshold. All reported results refer to the classification performance obtained using trained and optimized classifiers and clusters on the validation data, measured in the area under the receiver operating characteristic (ROC) curve (AUC) for binary and proportion correctly classified volumes (pCV) for multiclass data. Chance classification accuracies are 0.5 for binary and 0.333 for the three-class data, whereas 1 indicates perfect classification. Two different types of voxel discrimination relevance maps are shown: the classification performance cluster maps, showing clusters with corresponding classification performance (in AUC or pCV), and frequency maps, indicating the number of times each voxel was selected by the algorithm.


Fig. 3. (a)–(c) Various parameters from the finger-tapping individual analysis, including the subject mean number of generations required to find a cluster satisfying the fitness threshold, cluster size, and classification performance (measured in area under the ROC-curve, AUC, where 1 equals perfect classification). The cluster size is stable between different fitness thresholds, whereas the AUC (from the validation data) follows the fitness goal (evaluated on the testing data) nicely. Few generations are required to fulfill the fitness goal, but as the demands increase more generations are required. (d)–(f) Voxel discrimination relevance maps and GLM T-maps for the finger-tapping individual analysis, subject one. The cluster classification performance map is thresholded to only show clusters yielding an AUC of 1, and the GLM T-map displays voxels surviving the FDR threshold. The obtained clusters coincide well with the GLM maps, especially around the contralateral primary motor (MI) and sensory (SI) cortices as well as the supplementary motor area (SMA).

B. Finger-Tapping Versus Rest

Single Subject Analysis: The algorithm was applied to obtain ten voxel clusters for each of the fitness thresholds 0.6–0.9 with interval 0.1 on each subject, for temporally detrended as well as nondetrended data. In order to minimize the number of noise voxels in the clusters the maximum cluster size was set to 50 voxels. The subject mean classification achieved on the validation data for finger-tapping vs. rest, measured in area under the ROC curve, was excellent: in each subject at least one cluster with an AUC score of 1, that is, perfect classification, was found. The algorithm generally found clusters with at least the specified threshold in few generations (under 10), although the number of required generations increased with higher fitness thresholds [see Fig. 3(a)]. Due to the relative ease of finding an above-threshold cluster, the algorithm often found voxels where the classification exceeded an AUC of 0.95 for all thresholds. The subject mean cluster size is stable at approximately 30 voxels for all fitness thresholds [Fig. 3(b)]. Larger clusters are not over-represented, meaning that the 50 voxel maximum was not limiting classification performance. As the fitness threshold is increased, the subject mean validation performance also increases [Fig. 3(c)], indicating that the algorithm succeeds in finding clusters and classifiers that can generalize reasonably well to new data. It does, however, level out at higher thresholds, in part due to over-training, in part to the lower frequency of above-fitness-threshold clusters. There was no significant difference in any metric between temporally detrended and nondetrended data, consistent with the prior observation that support vector machine performance is relatively insensitive to pre-processing choices [16].

Representative cluster maps for a fitness threshold of 0.9 and 500 iterations on detrended data for subject one and the corresponding GLM map are shown in Fig. 3(d)–(f). The cluster classification performance map has been thresholded to only show voxels yielding a classification score of 1. The major cluster areas coincide well with the GLM maps, especially around the primary motor (MI) and sensory (SI) cortices as well as the supplementary motor area (SMA). However, as mentioned previously, it is difficult to estimate the relative significance of each voxel within the high-performing clusters. The frequency map [Fig. 3(e)], on the other hand, clearly indicates which voxels are essential to high classification scores, again corresponding well with the GLM map. The occipital clusters correspond to highly negative T-values in the GLM map, not shown in the figure. The maps thus clearly locate important areas on the multivoxel level, identifying groups of voxels that, when jointly analyzed, differentiate the conditions in the cluster performance maps, as well as on the single-voxel level, with the ranking obtained in the frequency maps. Taken together, this information can effectively aid in the planning of neurosurgical procedures in order to protect cortical areas for motor and sensory control.

Group Analysis: For the group analysis, a leave-one-out cross-validation was performed on the MNI-transformed data. Due to computer memory limitations, only the axial slices with MNI z-coordinates of 36–76, containing MI, SI, and the SMA, were included in the analysis. The algorithm was iterated 100 times for each cross-validation fold, thus resulting in 900 total voxel clusters, and the fitness threshold was set to 0.6. The maximum cluster size was set to 200 voxels.

Fig. 4. (a) Detected above-threshold clusters with corresponding classification performance (measured in area under the ROC-curve, AUC, where 1 equals perfect classification) and (b) frequency of selection for finger-tapping group analysis. The contralateral primary motor and somatosensory cortices (MI/SI) produce higher classification results than any other area and are also more frequently selected by the algorithm. The supplementary motor area (SMA) and ipsilateral MI/SI are also detected, but less frequently and yield lower classification scores. The cluster maps coincide well with the standard GLM T-maps; (c) thresholded by controlling the FDR. (d) 3-D rendering of clusters with a corresponding AUC of 0.7 for orientation.
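The leave-one-out scheme of the group analysis, read here as leaving out one subject per fold, can be sketched as a fold generator. The function name and the flat per-pattern subject index are assumptions; with nine subjects and 100 algorithm iterations per fold, the scheme yields the 900 clusters mentioned in the text.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, validation_indices) per fold."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        held = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, held


# Nine subjects with 40 patterns each (20 tap, 20 rest), as in the first dataset.
subject_ids = [subject for subject in range(9) for _ in range(40)]
folds = list(leave_one_subject_out(subject_ids))
clusters_total = len(folds) * 100  # 100 algorithm iterations per fold
```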

The classification scores achieved for the group analysis were lower than in the single subject analysis, at a cluster mean AUC of 0.814 (range: 0.375–1). The optimization task thus appears more difficult between than within subjects, as is expected considering individual differences in brain physiology. However, out of 900 attempts, the algorithm found 788 clusters with an AUC above the fitness threshold 0.6, of which 348 scored above 0.90. The average cluster size was 126.51 (range: 2–200), approximately four times larger than on the individual level. Again, the maximum cluster size did not appear to affect classification performance, since only five out of 900 clusters contained 200 voxels. Fig. 4(a)–(c) shows the obtained cluster classification performance and frequency of selection maps as well as the corresponding GLM map. As in the individual level analysis, the contralateral primary motor and somatosensory cortices produce higher classification results than any other area and are also more frequently selected by the algorithm. The SMA and ipsilateral MI/SI are also detected, but less frequently and yield lower classification scores. The cluster classification performance maps include large areas which are expected to be activated in the specific tasks, and, moreover, which are active according to the GLM. Additionally, the highly selected voxels in the frequency maps are virtually identical to the most active GLM T-values. Again, the cluster classification performance map provides an easy understanding of which voxels are important in combination, whereas the GLM merely shows the degree of difference between conditions in any single voxel.

Cluster Ranking for Feature Selection: Feature selection, that is, the process of selecting a relevant subsample of the total population of data variables (features), is an important step in high-dimensional classification problems to alleviate the curse of dimensionality [28], enable faster classifier training and improve generalization [29]. In order to simulate feature selection, the evolutionary algorithm was run ten times on training and testing data sets containing eight subjects. Combinations of the extracted clusters were then used on the validation dataset consisting only of the ninth subject. The performance as a function of the (unique) accumulated voxels from the highest through lowest ranked cluster is presented in Fig. 5. The same classification validation scheme was applied to the corresponding number of voxels ranked according to the GLM T-map obtained on the training subjects. The cluster algorithm outperforms the GLM ranking method with an AUC of 0.908 compared to 0.785 for 200 voxels, and although the difference between the two approaches decreases as the number of voxels increases, cluster feature selection performs consistently higher for all voxel subset sizes. A closer inspection of the selected voxels with respective methods shows that, although the general areas are similar and the overlap is large, the cluster algorithm generated voxel subsets slightly more medial and posterior than the GLM T-map ranking method. Also, the latter included voxels in the SMA at an early stage, whereas all of the ten evolutionary clusters remained in the MI/SI area.

Fig. 5. Comparison between feature selection using cluster maps and GLM voxel ranking on the finger-tapping group data as a function of number of clusters/voxels included in the classification. Cluster feature selection outperforms the GLM-based method, especially in the lower end of voxel subset sizes. An AUC of 1 equals perfect classification.

TABLE I
SUMMARY OF ROI CLASSIFICATION RESULTS FOR ARM/REST BRUSHING

TABLE II
SUMMARY OF ROI CLASSIFICATION RESULTS FOR THIGH/REST BRUSHING
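The accumulation of unique voxels from the highest through the lowest ranked cluster, as used in the cluster-based feature selection of the Cluster Ranking for Feature Selection paragraph, can be sketched as follows. Ranking the clusters by a per-cluster score is an assumption (the text does not state the ranking criterion), and the integer voxel indices and scores below are toy values.

```python
def accumulate_ranked_voxels(clusters, scores):
    """Return the growing list of unique voxels after adding each ranked cluster."""
    order = sorted(range(len(clusters)), key=lambda i: scores[i], reverse=True)
    selected, seen, cumulative = [], set(), []
    for i in order:
        for voxel in sorted(clusters[i]):
            if voxel not in seen:          # count every voxel only once
                seen.add(voxel)
                selected.append(voxel)
        cumulative.append(list(selected))
    return cumulative


# Toy example: three clusters of voxel indices with made-up scores.
subsets = accumulate_ranked_voxels([{1, 2}, {2, 3}, {4}], [0.7, 0.9, 0.8])
```

Each entry of `subsets` is the voxel subset that would be handed to the classifier after one more cluster has been included.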

C. Arm and Thigh Brushing Versus Rest

ROI Analysis: For the localized analysis, the algorithm was applied to the left insular cortex ROI in the 12 data sets (six subjects, two condition pairs: arm/rest and thigh/rest). The fitness threshold was set to 0.7, and, again, a linear support vector machine was used for classification. The algorithm was iterated ten times to obtain ten clusters per dataset.

Cortical responses to soft brushing on the skin are substantially less conspicuous and robust than to the finger-tapping task above, which is reflected in the classification results: in only eight out of the 12 datasets an above-threshold cluster was found. However, in all subjects at least one cluster with a ROC area above 0.6 was obtained, and the subject mean best cluster AUC was 0.736 (range: 0.618–0.858) for arm/rest and 0.730 (range: 0.609–0.880) for thigh/rest. In order to validate the general significance of the voxel cluster scores, all voxels within clusters with an AUC larger than 0.5 were combined and a hold-out validation (using a random 80% of the data for training and 20% for validation, ten repetitions) was performed on each data set. For comparison, the same validation approach was applied to the whole ROI. Tables I and II show the resulting classification AUCs as well as the maximum ROI GLM T-value and the corresponding number of voxels included for classification. In six data sets (subjects 2, 5, and 6 for arm/rest and subjects 1, 4, and 5 for thigh/rest) no significantly activated voxels (false discovery rate 0.05) were found in the ROI. Using the whole ROI for classification, only three data sets achieved significant classification scores, whereas the cluster-based classification achieved significant results in all cases ( ; permutation test with 1000 iterations).

D. Tables I and II: Summary of ROI Classification Results for Brushing Versus Rest

The ROI classification parameters, including the maximum GLM T-value, classification results for whole ROI classification (measured in area under the ROC curve, AUC, where 0.5 corresponds to chance), the number of voxels contained in the entire ROI, the AUC for the cluster analysis and the number of voxels used by the cluster algorithm (voxel subset size). Stars denote significant scores (for the GLM T-value and for the AUC, the latter by permutation test).

Multiclass Analysis: In the multiclass analysis, whole brain volumes for all three conditions were combined into one dataset for each subject (containing 78 patterns of each category and around 60,000 voxels). The evolutionary algorithm, with a linear regression classifier and a fitness threshold of 0.6, was applied to the dataset 100 times to extract 100 clusters. Since the AUC is applicable only for binary data, we instead report the proportion of correctly classified volumes (pCV), where 0.333 corresponds to chance.

Again, the dataset was difficult to classify, but the subject mean best cluster score was still well above chance at 0.624 (range: 0.578–0.667). The corresponding subject mean confusion matrix is shown in Table III, illustrating that the categories were evenly classified. Moreover, the confusion matrix shows that when the actual category was “thigh” and the algorithm guessed wrong, it more often identified the volume as “arm” (four times) than “rest” (2.33 times). Similarly, arm-brushing volumes are more often mistaken for thigh-brushing than “rest”. A reasonable interpretation is that the brain states induced by brushing on the arm and thigh are more similar to each other than to those produced by the rest condition. Interestingly, however, the classifier also mistakes “rest” patterns for “thigh” over five times as often as for “arm”.

TABLE III. SUBJECT MEAN CONFUSION MATRIX FOR MULTICLASS ANALYSIS OF BRUSHING DATA

Fig. 6. (a) Example of a three-class map on the brushing dataset (arm/thigh/rest) for subject two, showing the obtained clusters with corresponding proportion of correctly classified fMRI volumes (pCV, 0.333 equals chance). Clusters producing high classification scores are mainly found in the primary somatosensory cortex (SI), where there is a clear somatotopic organization, also shown in the arm/rest and thigh/rest GLM maps (b)–(c), extending from the thigh area to the arm area. Interestingly, high-classification clusters are also found in the secondary somatosensory cortex (SII), where no clear somatotopy is detected in the GLM.

Fig. 6(a)–(c) shows the cluster map obtained for subject one, as well as the corresponding GLM maps for arm/rest and thigh/rest. The clustering algorithm has identified the primary somatosensory cortex (SI), where the well-established somatotopy is clearly visible in the GLM maps. The clusters yielding above-chance classification rates extend surprisingly far down the postcentral gyrus. Interestingly, areas in the secondary somatosensory cortex (SII) also produce above-chance classification results, despite the similarity between arm/rest and thigh/rest in the GLM maps. Also, the best cluster obtained in the insular cortex produced a classification score of 0.49 (again, chance equals 0.333), and out of the 15 validation patterns per category, five arm, ten thigh, and seven rest volumes were correctly identified. Although far from ideal classification, this result still shows that voxels in the insular region contain information about the stimulus categories, and thus process these differently.
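As an illustration of how a confusion matrix and the pCV relate, the following sketch tallies predictions per category and computes the proportion of correctly classified volumes. The helper names and the four toy labels are hypothetical; only the insular cortex counts at the end (five arm, ten thigh, and seven rest volumes correct out of 15 per category) are taken from the text.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual categories, columns are predicted categories."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def proportion_correct(matrix):
    """pCV: diagonal entries divided by the total number of volumes."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Toy labels, made up purely to show the calling convention.
classes = ["arm", "thigh", "rest"]
toy = confusion_matrix(["arm", "arm", "thigh", "rest"],
                       ["arm", "thigh", "thigh", "rest"], classes)
toy_pcv = proportion_correct(toy)  # 3 of 4 volumes correct

# Insular cortex cluster: correct counts as reported, 15 volumes per category.
insular_pcv = (5 + 10 + 7) / (3 * 15)
print(round(insular_pcv, 2))  # 0.49
```

The reported counts thus reproduce the stated score of 0.49, against a chance level of 0.333 for three equally sized categories.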

IV. DISCUSSION

This study demonstrates the effectiveness of evolutionary algorithms in identifying clusters of voxels that produce above-threshold brain state classification scores, thus mapping cortical conditions to single volume fMRI recordings as well as stimulus processing to anatomy. The algorithm has been successfully demonstrated on four different problems, namely the identification of expected areas of activation on both individual and group levels in standard whole-brain, two-condition datasets, for brain state classification feature selection, in the ROI-based identification of a significant difference between experimental conditions where the GLM failed to find significant activations, and, lastly, in the generation of multiclass maps.

The resulting VDRMs showed clear correlation with general linear model maps and areas of expected activations, within subjects as well as between subjects. Where the SPMt displays statistical voxel-wise difference in BOLD response between stimulus conditions, the clustering approach searches for voxel subsets that explicitly discriminate between brain states, that is, identifies brain areas containing (multivariately coded) information about the conditions. Moreover, the information content can be quantified, so the maps also show the degree to which each area is informative. The feasibility of information-based functional brain mapping, as opposed to the activation-based univariate approach, was shown by Kriegeskorte et al. with their “searchlight” methodology [19]. Also, our brain mapping approach is model-free and data-driven, that is, no assumptions about the data are required. The algorithm can thus be used in situations where a model-based method is inappropriate, such as when the haemodynamic function is not known (for example in species other than humans), or for the analysis of EEG inverse solutions.

A major advantage of the multivariate approach is the potentially increased sensitivity to differences between cognitive states in comparison with univariate methods [2]. Our findings agree with this statement: the clustering algorithm could identify voxels that, in combination, differed significantly between experimental conditions where the GLM could not. Interestingly, whole ROI classification did not consistently perform above chance level, indicating the importance of proper feature selection, even with a classifier, such as support vector machines, that deals well with high-dimensional data.

A potential disadvantage of the insensitivity to high dimensions is, moreover, apparent when considering the cluster performance maps: even high-performing clusters are, consequently, likely to contain irrelevant voxels. Classifiers, on the other hand, that are highly sensitive to the number and constitution of included features (such as the multiple linear regression scheme used for the multiclass data) will yield maps with smaller, more distinct clusters.

The generation of multiclass discrimination maps is a novel concept, utilizing the full potential of the classification approach. Interestingly, the three-way classification of arm or thigh brushing and rest produced well above chance classification clusters both in the secondary somatosensory and insular cortices. Whether these are generalized differences reflecting a somatotopic organization in the processing of tactile input requires further investigation. Similarly, the significance of the classifier consistently mistaking rest volumes for thigh but not for arm brushing volumes (Table III) remains to be explained. A simple multiple linear regression classifier was used on this dataset, and yet clusters with well-balanced classifications far above chance were obtained. Despite the lack of built-in multiclass mechanisms for support vector machines, several algorithms exist to enable the classification of datasets with more than two categories. It is very likely that such schemes will produce substantially higher classification results. A more sophisticated clustering analysis can be performed by introducing nonlinear classifiers, such as artificial neural networks or support vector machines with nonlinear kernels.

Spatial smoothing is generally avoided in fMRI pattern analysis, due to the inevitable loss of information. On the single-subject brushing dataset, however, we did in fact achieve a slightly better classification performance after applying a standard fMRI low-pass filter (6-mm Gaussian kernel) to the data. This finding is also consistent with previous research into the effects of smoothing on fMRI data for classification [12], [17].
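For readers who wish to reproduce a comparable filter, the sketch below converts a kernel width to a Gaussian standard deviation and builds a normalized one-dimensional kernel. It assumes that the 6-mm figure refers to the full width at half maximum (FWHM), which is the usual fMRI convention but is not stated explicitly here; the 3-mm voxel size is a purely hypothetical value, and the function names are illustrative.

```python
import math

def fwhm_to_sigma(fwhm_mm):
    """Standard deviation of a Gaussian with the given FWHM (same units)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_kernel_1d(sigma_vox, radius):
    """Normalized 1-D Gaussian weights for offsets -radius..radius (in voxels)."""
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

sigma_mm = fwhm_to_sigma(6.0)        # roughly 2.55 mm
voxel_size_mm = 3.0                  # hypothetical voxel size
kernel = gaussian_kernel_1d(sigma_mm / voxel_size_mm, radius=3)
```

Because a Gaussian is separable, applying such a kernel along each of the three axes in turn is equivalent to a full three-dimensional Gaussian smoothing.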

The algorithm showed excellent results when used for feature selection aimed at maximal classification performance, an issue specifically addressed by Norman et al. [2]. We have previously shown that evolutionary feature selection without spatial restraint produces information-dense feature subsets that are spatially distributed and virtually impossible to interpret biologically, unless the algorithm is repeated numerous times and discrimination maps showing the voxelwise frequency of selection are plotted [15]. The clustering algorithm devised in this study, on the contrary, is aimed at establishing physiologically identifiable relevant areas by restricting potential voxel subsamples to contain only spatially close voxels. This algorithm is thus limited as a feature selection method, but it nonetheless performs drastically better than the GLM ranking feature selection popularly used in fMRI classification studies [20]. It should be noted that since the cluster ranking was performed on the whole dataset, the validation data was in fact included in the ranking of the features and the results may thus be overestimated. The comparison with the GLM is, however, still valid, since the GLM maps were also generated using all data. It would be of interest to implement the clustering algorithm for feature selection in an appropriate fashion, and compare the results with algorithms allowing distributed voxel subsets [15].
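One simple way to restrict a voxel subset to spatially close voxels is to grow it from a seed by repeatedly adding face-adjacent neighbours. The sketch below illustrates that idea only; it is not the mutation operator used in this study, and the names `neighbours` and `grow_cluster`, the six-neighbour adjacency, and the cubic toy mask are all assumptions made for the example.

```python
import random

# Face-adjacent (6-connected) offsets in a 3-D voxel grid.
NEIGHBOUR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                     (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def neighbours(voxel):
    x, y, z = voxel
    return [(x + dx, y + dy, z + dz) for dx, dy, dz in NEIGHBOUR_OFFSETS]

def grow_cluster(seed_voxel, size, mask, rng):
    """Grow a connected cluster of up to `size` voxels inside `mask`."""
    cluster = {seed_voxel}
    while len(cluster) < size:
        # Candidate voxels: inside the mask, adjacent to the cluster, not yet in it.
        frontier = sorted({n for v in cluster for n in neighbours(v)
                           if n in mask and n not in cluster})
        if not frontier:
            break  # the mask is exhausted around this cluster
        cluster.add(rng.choice(frontier))
    return cluster

# Toy brain mask: a 5 x 5 x 5 cube of voxel coordinates.
mask = {(x, y, z) for x in range(5) for y in range(5) for z in range(5)}
cluster = grow_cluster((2, 2, 2), 30, mask, random.Random(0))
```

Every voxel added this way touches the existing cluster, so the resulting subset is contiguous and can be displayed as a single anatomical region.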

Despite the drastic reduction of voxels included for classification (e.g., from on the order of 70 000 whole-volume voxels to clusters containing around 30 voxels per 20 samples for single-subject finger-tapping), there are still more voxels than available training samples. Thus, a hyperplane separating the training data classes highly accurately can always be found, although it may generalize poorly to new samples, especially with highly variable data such as fMRI. An important yet expected finding in the present study is indeed that the algorithm consistently selected substantially fewer voxels than allowed (e.g., on the order of 30 voxels for single-subject finger-tapping, where the maximum allowed cluster size was 50 voxels), that is, performance is higher at a relatively low number of voxels. Moreover, in order to monitor potential overfitting, the algorithm was supplied with no fewer than three datasets: one for classifier parameter estimation, a second for evolutionary fitness computation, and a third for assessment of final classifier and subset generalization ability. Additionally, the support vector machine package utilized in this study incorporates a regularization parameter, allowing for some overfitting control.
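The three-dataset arrangement can be sketched as a random partition of volume indices into disjoint subsets. The subset sizes below are hypothetical, since the proportions used are not given in this section, and the function name `three_way_split` is introduced only for illustration.

```python
import random

def three_way_split(n_samples, n_train, n_fitness, seed=0):
    """Partition sample indices into training, fitness, and final-test subsets."""
    if n_train + n_fitness >= n_samples:
        raise ValueError("no samples left for the final test subset")
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    train = indices[:n_train]                        # classifier parameter estimation
    fitness = indices[n_train:n_train + n_fitness]   # evolutionary fitness computation
    final = indices[n_train + n_fitness:]            # final generalization assessment
    return train, fitness, final

# Hypothetical sizes: 100 volumes split 60 / 20 / 20.
train_idx, fitness_idx, final_idx = three_way_split(100, 60, 20)
```

Keeping the third subset untouched until the search has finished is what allows the final score to serve as an estimate of generalization rather than of fit to the data that guided the evolution.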

A limitation of the algorithm, due to the inherent stochastic nature, is the inability to guarantee that all possible, above-threshold clusters are found. Algorithm repetitions and fitness threshold manipulations can, as was demonstrated, aid in the localization of multiple clusters more representative of actual brain activations than one single, maximally discriminative cluster. Also, parameters such as the evolutionary population size and number of initial feature subsets (randomized throughout the brain volume) can improve cluster distribution. Further research in this area is required, however.

An advantage of evolutionary algorithms is the flexibility in implementing various additional optimization variables, such as classifier parameters. For instance, when using support vector machines, the cost parameter could be subject to evolution, as could the coefficients of multiple linear regression or network structure and weights when using an artificial neural network [30]. A drawback, on the other hand, of these algorithms is the numerous mutation parameters that require empirical tweaking. We are therefore currently investigating particle swarm optimization, where there are substantially fewer parameters, as an alternative [31]. Also, the evolutionary algorithm is iterative in nature and in combination with excessive amounts of data, typical for fMRI studies, the computer resource and time requirements can be daunting. However, limiting the number of voxels included in any classification session to a given few decreases the time required for fitness computation substantially. On a standard PC (3.20-GHz processor, 3-GB RAM) and the code implementation used here, the clusters shown in Fig. 3(d)–(f), ten whole brain iterations on the individual level for finger-tapping (for a maximum of 20 generations, a population size of 50 individuals, a fitness threshold of 0.9 and 100 iterations) took 20 minutes to run. Also, parallelization of evolutionary algorithms is fairly trivial in nature, and although fast implementation potentially requires several standard computers it does not pose technical difficulties.

The term voxel discrimination relevance map (VDRM), or discrimination map for short, previously proposed by us in [15] is intended as a general purpose pattern recognition-based alternative to the word “activation map” typically used for maps generated by univariate methods. The term aims to cover all types of maps quantifying the amount of relevance (of arbitrary units) fMRI voxels possess in terms of discriminating between brain conditions. We prefer the word “discrimination” to “classification” since it distinctly implies the representation of the differences between brain states.

V. CONCLUSION

This study has shown that the proposed evolutionary classification scheme is highly effective in identifying voxel clusters that contain information about given brain states, which can be used for multivariate, model-free functional brain mapping as well as for maximal single-volume fMRI pattern classification, for binary or multiclass data. Moreover, our approach proved more sensitive to small differences in activation between conditions than the general linear model.

ACKNOWLEDGMENT

The authors are grateful to L. Löken for supplying the brushing data, to L. Lundblad for assisting in the finger-tapping data acquisition, and to neurologist Dr. H. Krämer for expert views on the results.

REFERENCES

[1] K. J. Friston, A. P. Holmes, K. J. Worsley, J. P. Poline, C. D. Frith, and R. S. J. Frackowiak, “Statistical parametric maps in functional imaging: A general linear approach,” Human Brain Mapping, vol. 2, no. 4, pp. 189–210, 1994.

[2] K. A. Norman, S. M. Polyn, G. J. Detre, and J. V. Haxby, “Beyond mind-reading: Multi-voxel pattern analysis of fMRI data,” Trends in Cog. Sci., vol. 10, no. 9, pp. 424–430, 2006.

[3] J.-D. Haynes and G. Rees, “Decoding mental states from brain activity in humans,” Nature Rev. Neurosci., vol. 7, no. 7, pp. 523–534, July 2006 [Online]. Available: http://dx.doi.org/10.1038/nrn1931

[4] S. M. Polyn, V. S. Natu, J. D. Cohen, and K. A. Norman, “Category-specific cortical activity precedes retrieval during memory search,” Science, vol. 310, no. 5756, pp. 1963–1966, 2005.

[5] C. Davatzikos, K. Ruparel, Y. Fan, D. Shen, M. Acharyya, J. Loughead, R. Gur, and D. Langleben, “Classifying spatial patterns of brain activity with machine learning methods: Application to lie detection,” NeuroImage, vol. 28, pp. 663–668, 2005.

[6] J. V. Haxby, M. I. Gobbini, M. L. Furey, A. Ishai, J. L. Schouten, and P. Pietrini, “Distributed and overlapping representations of faces and objects in ventral temporal cortex,” Science, vol. 293, pp. 2425–2430, 2001.

[7] D. D. Cox and R. L. Savoy, “Functional magnetic resonance imaging (fMRI) ‘brain reading’: Detecting and classifying distributed patterns of fMRI activity in human visual cortex,” NeuroImage, vol. 19, no. 2, pt. 1, pp. 261–270, Jun. 2003.

[8] Y. Kamitani and F. Tong, “Decoding the visual and subjective contents of the human brain,” Nature Neuroscience, vol. 8, no. 5, pp. 679–685, May 2005 [Online]. Available: http://www.nature.com/neuro/journal/v8/n5/abs/nn1444.html

[9] J.-D. Haynes and G. Rees, “Predicting the orientation of invisible stimuli from activity in human primary visual cortex,” Nature Neuroscience, vol. 8, no. 5, pp. 686–691, Apr. 2005 [Online]. Available: http://www.nature.com/neuro/journal/v8/n5/abs/nn1445.html

[10] R. deCharms, K. Christoff, G. Glover, J. Pauly, S. Whitfield, and J. Gabrieli, “Learned regulation of spatially localized brain activation using real-time fMRI,” NeuroImage, vol. 21, no. 1, pp. 436–443, 2004.

[11] S. S. Yoo, H. M. O’Leary, T. Fairneny, N. K. Chen, L. Panych, H. Park, and F. A. Jolesz, “Increasing cortical activity in auditory areas through neurofeedback functional magnetic resonance imaging,” Neuroreport, vol. 17, no. 12, pp. 1273–1278, 2006.

[12] S. M. M. LaConte, S. J. J. Peltier, and X. P. P. Hu, “Real-time fMRI using brain-state classification,” Human Brain Mapping, Nov. 2006.

[13] B. Ng, R. Abugharbieh, S. J. Palmer, and M. J. McKeown, “Characterizing task-related temporal dynamics of spatial activation distributions in fMRI BOLD signals,” in MICCAI (1), 2007, pp. 767–774.

[14] C. R. Reeves and J. E. Rowe, Genetic Algorithms—Principles and Perspectives: A Guide to GA Theory. Norwell, MA: Kluwer, 2002.

[15] M. C. Åberg, L. Löken, and J. Wessberg, “An evolutionary approach to multivariate feature selection for fMRI pattern analysis,” in Proc. Int. Conf. Bio-Inspired Systems and Signal Processing, 2008.

[16] S. LaConte, S. Strother, V. Cherkassky, J. Anderson, and X. Hu, “Support vector machines for temporal classification of block design fMRI data,” NeuroImage, vol. 26, no. 2, pp. 317–329, Jun. 2005.

[17] J. Mourão-Miranda, A. L. Bokde, C. Born, H. Hampel, and M. Stetter, “Classifying brain states and determining the discriminating activation patterns: Support vector machine on functional MRI data,” NeuroImage, vol. 28, no. 4, pp. 980–995, 2005.

[18] F. De Martino, G. Valente, N. Staeren, J. Ashburner, R. Goebel, and E. Formisano, “Combining multivariate voxel selection and Support Vector Machines for mapping and classification of fMRI spatial patterns,” NeuroImage, vol. 43, pp. 44–58, 2008.

[19] N. Kriegeskorte, R. Goebel, and P. Bandettini, “Information-based functional brain mapping,” PNAS, vol. 103, pp. 3863–3868, 2006.

[20] T. M. Mitchell, R. Hutchinson, R. S. Niculescu, F. Pereira, X. Wang, M. Just, and S. Newman, “Learning to decode cognitive states from brain images,” Mach. Learn., vol. 57, no. 1–2, pp. 145–175, 2004.

[21] H. Olausson, Y. Lamarre, H. Backlund, C. Morin, B. G. Wallin, G. Starck, S. Ekholm, I. Strigo, K. Worsley, A. B. Vallbo, and M. C. Bushnell, “Unmyelinated tactile afferents signal touch and project to insular cortex,” Nature Neurosci., vol. 5, no. 9, pp. 900–904, 2002.

[22] A. C. Evans, D. L. Collins, S. R. Mills, E. D. Brown, R. L. Kelly, and T. M. Peters, “3D statistical neuroanatomical models from 305 MRI volumes,” in Proc. IEEE Nuclear Science Symp. and Medical Imaging Conf., 1993, pp. 1813–1817.

[23] T. Naidich, E. Kang, G. Fatterpekar, B. Delman, S. Gultekin, D. Wolfe, O. Ortiz, I. Yousry, M. Weismann, and T. Yousry, “The insula: Anatomic study and MR imaging display at 1.5 T,” Amer. J. Neuroradiol., vol. 25, no. 2, pp. 222–32, 2004.

[24] K. J. Worsley, C. H. Liao, J. Aston, V. Petre, G. H. Duncan, F. Morales, and A. C. Evans, “A general statistical analysis for fMRI data,” NeuroImage, vol. 15, no. 1, pp. 1–15, Jan. 2002.

[25] C. R. Genovese, N. A. Lazar, and T. Nichols, “Thresholding of statistical maps in functional neuroimaging using the false discovery rate,” NeuroImage, vol. 15, no. 4, pp. 870–878, Apr. 2002.

[26] J. Suykens, T. V. Gestel, J. D. Brabanter, B. D. Moor, and J. Vandewalle, Least Squares Support Vector Machines. Singapore: World Scientific, 2002.

[27] S. Siegel and N. J. Castellan, Jr., Nonparametric Statistics for the Behavioral Sciences, 2nd ed. New York: McGraw-Hill, 1988.

[28] R. E. Bellman, Adaptive Control Processes. Princeton, NJ: Princeton Univ. Press, 1961.

[29] A. Blum and P. Langley, “Selection of relevant features and examples in machine learning,” Artific. Intell., vol. 97, no. 1–2, pp. 245–271, 1997.

[30] M. C. Åberg and J. Wessberg, “Evolutionary optimization of classifiers and features for single trial EEG discrimination,” Biomed. Eng. Online, vol. 6, no. 32, 2007.

[31] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proc. IEEE Int. Conf. Neural Networks, 1995, vol. 4, pp. 1942–1948.

Malin Björnsdotter Åberg (S’07) was born in Gullholmen, Sweden, in 1980. She received the M.S. degree in automation and mechatronics engineering, with specialization in bio-mechatronics, from Chalmers University of Technology, Gothenburg, Sweden, in 2004. She is currently pursuing the Ph.D. degree at the University of Gothenburg in machine learning and neuroscience.

Her research interests include pattern recognition, predictive modeling, data mining, brain-machine interfacing, and the perception of pleasant touch.

Johan Wessberg received the M.D. and Ph.D. degrees in physiology from the University of Gothenburg, Göteborg, Sweden, in 1989 and 1995, respectively.

He is currently an Associate Professor in physiology at the University of Gothenburg. His laboratory is interested in the peripheral and central nervous system mechanisms of the human sense of touch, which is investigated using a combination of peripheral nerve recording from tactile receptors and functional brain imaging. During 1999 to 2000, he was a Research Associate at Duke University, Durham, NC, in the laboratory of Prof. Miguel Nicolelis, where he worked on a brain-machine interface which allowed monkeys to control a robot arm using chronically recorded nerve cell activity from the brain. An important line of research in his current laboratory is the novel application of machine learning techniques to study patterns of human brain activity.
