MRI Brain Tissue Segmentation
Submitted by Sergi Valverde Valverde
Supervisors: Dr. Xavier Llado, Dr. Arnau Oliver, Mariano Cabezas
Department of Computer Architecture and Technology, University of Girona
A Thesis Submitted for the Degree of MSc Erasmus Mundus in Vision and Robotics (VIBOT)
· 2012 ·
Abstract
Manual segmentation of brain tissue is both challenging and time-consuming due to the large
number of MRI slices that compose the three-dimensional information for each patient, and
also due to the intra/inter-observer variability of manually segmented scans. The development of
robust automated MS segmentation methods, which can segment large amounts of MRI data
and do not suffer from intra/inter-observer variability, is nowadays an active research field.
However, automated segmentation of brain tissue is still a challenging problem due to the
complexity of the images, differences in tissue intensities, noise, intensity non-uniformities, partial
volume effects and the absence of anatomical models that fully capture the possible deformations
of each structure. The motivation of this master thesis is two-fold: first, to perform an
exhaustive comparative evaluation of existing state-of-the-art brain tissue segmentation methods
using T1w data, which is the most widely used for tissue classification; and second, to extend the
evaluation with a quantitative analysis of how MS lesions affect tissue classification. We have
selected 4 publicly available segmentation approaches from the state of the art, some of which,
such as FAST, SPM5, SPM8 or GAMIXTURE, are currently used by the neuroimaging
community for tissue segmentation and volumetric analysis. Moreover, we extend the list with
our implementation of 4 more methods selected from the state of the art: two
fuzzy clustering techniques, one neural network method based on self-organized maps and
one KNN classifier auto-trained with the subject itself. Quantitative analysis is carried out on synthetic
and real T1w data from publicly available datasets such as Brainweb and IBSR20.
Furthermore, scans from the SALEM project dataset with different MS lesion loads are employed
to evaluate how well the methods segment brain tissue in the presence of MS lesions.
Results on synthetic data show good accuracy for all the analyzed approaches and
agree with previous studies using one or more of these methods. Results on IBSR20
show slightly better performance for the KNN classifier and GMM approaches. Finally,
results on the SALEM dataset with MS lesions indicate that, in general, methods tend to
misclassify WM as GM at a rate of at least 17%. This rate ranges from 17% for SPM8 to 37%
for KNN.
A good runner leaves no footprints...
Lao-Tzu
Contents
Acknowledgments
1 Introduction
1.1 Overview
1.2 Research framework
1.3 Objectives
1.4 Planning
1.5 Document structure
2 Problem definition
2.1 Magnetic Resonance Imaging
2.1.1 MRI concepts
2.2 Computer vision aspects
2.2.1 Skull stripping
2.2.2 Partial volume effects
2.2.3 Intensity inhomogeneity
3 State of the art
3.1 Classification of segmentation approaches
3.1.1 Supervised methods
3.1.2 Unsupervised methods
3.2 Reported results
3.2.1 Brainweb
3.2.2 IBSR
3.2.3 Evaluation measures
3.2.4 Results analysis
4 Proposal
4.1 Preprocessing
4.1.1 Skull stripping
4.1.2 Intensity correction
4.2 Tissue segmentation methods
4.2.1 Tissue classification with FAST
4.2.2 Tissue classification with SPM
4.2.3 Tissue classification with GAGMM
4.2.4 Tissue classification with SOM
4.2.5 Tissue classification with FCM
4.2.6 Tissue classification with RFCM
4.2.7 Tissue classification with KNN
4.3 Tissue classification in the presence of lesions
4.4 Evaluation metrics
5 Results
5.1 Datasets
5.2 Synthetic data results
5.2.1 Discussion
5.3 Real data results
5.3.1 Discussion
5.4 MS lesion results
5.4.1 Discussion
6 Conclusions
6.1 Future Work
A Results tables
Bibliography
List of Figures
2.1 Different MRI acquisition sequences
2.2 Brain skull stripping
2.3 Partial volume effects representation
2.4 Different acquired MRI scans with image artifacts
3.1 Brainweb generated dataset for different noise levels (n) and biases (b)
4.1 Proposed pipeline for brain tissue segmentation
4.2 Scan preprocessing output example with selected tools
4.3 FAST segmentation output example with PVE
4.4 SPM prior tissue atlas
4.5 Scan preprocessing output example with selected tools
4.6 Membership classification output in FCM
4.7 Proposed modified pipeline for brain tissue segmentation with MS lesions
5.1 Dice metrics boxplots computed from all Brainweb scans
5.2 Segmentation results for various methods on Brainweb scans
5.3 Dice metrics boxplots computed from all IBSR20 scans
5.4 Dice metrics plots evaluated for each scan and method
5.5 Segmentation results for various methods on 5_8 IBSR scan
5.6 WM FBP for masked and not masked scans SALEM
5.7 Segmentation results for all methods on 201 SALEM scan
5.8 BPF for masked and not masked scans SALEM
List of Tables
3.1 Selected state-of-the-art automatic brain tissue segmentation methods
3.2 Available IBSR datasets for segmentation analysis
3.3 Surveyed works based on Brainweb database and Dice or Jaccard indexes
3.4 Surveyed works based on IBSR database and Dice or Jaccard indexes
5.1 FBT evaluation on healthy subjects SALEM
5.2 FBT evaluation on subjects with MS disease masking lesions SALEM
5.3 FBT evaluation on subjects with MS disease without masking lesions SALEM
5.4 Lesion tissue classification, SALEM
A.1 Dice metrics computed from segmented Brainweb scans with different intensity inhomogeneity
A.2 BrainWeb synthetic database statistical evaluation
A.3 Average Dice metrics computed from segmented Brainweb scans with different noise levels
A.4 IBSR20 database statistical evaluation
Acknowledgments
Invisible steps... I guess we have lost the path...
To the eyes: Yago Diez, Mariano Cabezas, Arnau Oliver and Xavier Llado. Thank you, it
was a real pleasure.
To the professors who made me start this path with the Bachelor computer vision course
and put their signature to make this Master thesis possible two years later: Joan Martí, Xavier
Munoz and again Xavier Llado. Thank you.
To the staff: David, Yohan, Fabrice, Herma, David, Aina, Robert, Yvan. Thank you.
To the present: Darko, Reinhard, Adriyana, Sergio, Xin, Igor, Joven, Saeed, Hashim, Lukas,
Mari, Taro, Mark, Cedric, Taleb, Bernat and Dani. Thank you.
To the past: Moises, Alfredo, Antonio, Julio, Francesc, Moi, Edu. Thank you for making me
grow.
To the roots: Pol, mother and father: Thank you, it would have been impossible without you.
To the dreams: Alicia. I guess we have lost the path...
We are new, and we are the same...
Chapter 1
Introduction
1.1 Overview
The central nervous system (CNS) is the part of the human nervous system that integrates and
coordinates the processing of incoming neural information. The CNS comprises the brain and
the spinal cord and is made up of two tissue components: gray matter (GM), which is the
main CNS element and consists of neuronal cell bodies; and white matter (WM), which is
the second CNS component and is mainly composed of myelinated axon tracts. WM and GM
occupy the largest part of the brain. Cerebrospinal fluid (CSF) is a bodily
fluid present throughout the brain that envelops both CNS tissues. The central nervous system may
be damaged by different conditions: infections such as encephalitis, neurodegenerative
diseases such as Alzheimer's disease, or autoimmune and inflammatory diseases such as multiple sclerosis.
Multiple sclerosis (MS) is the most common immune-mediated disabling neurological disease
of the CNS. It is an inflammatory disorder in which the structure of the neurons is
progressively injured [41]. MS is the most frequent non-traumatic neurological cause of
disability in young adults. It is relatively common in Europe, the United States, Canada,
New Zealand, and parts of Australia, but rare in Asia, and in the tropics and subtropics of all
continents [16]. As in other putative autoimmune diseases, it affects twice as many women
as men. MS has an incidence of about seven per
100,000 every year, a prevalence of around 120 per 100,000, and a lifetime risk of one in 400 [24].
Incidence is low in childhood, increases rapidly after the age of 18, reaches a peak between 25 and 35, and
then slowly declines, becoming rare at 50 and older. The world estimate is 1.3 to 2.5 million
cases of MS, with Western Europe having 350,000 [31].
Histopathology implicates a reduction in myelin and axonal degeneration as the major
contributors to the accumulation of disability [54]. Myelin is a dielectric material that forms a
layer, called the myelin sheath, around the axon of the neuron. The disease damages the
fatty myelin sheaths around axons in the brain and spinal cord, leading to demyelination. Demyelinated
axons conduct impulses at reduced or spontaneous velocity, causing impairment in sensation,
movement and cognition [24]. MS has four internationally recognized forms [55]: 1) The
Relapsing/Remitting (RRMS) form is characterized by exacerbation periods in which symptoms are present.
These periods are followed by periods of remission, in which the patient recovers partially or totally
from the disease symptoms. 2) The Secondary Progressive (SPMS) form is characterized by a
gradual intensification of symptoms between relapses. After 10 years of RRMS, 50% of MS
patients develop the SPMS stage. This percentage rises to 90% after 25 years [59].
3) The Progressive Remitting (PRMS) form is typified by an increase in the relapsing times with
significant recovery but with worsening symptoms in new relapsing intervals. 4) Lastly, the
Primary Progressive (PPMS) form is characterized by a severe decrease of remitting times with special
localization in the brain.
Magnetic Resonance Imaging (MRI) is a medical imaging technique commonly used to visualize the
internal structures of the body. The composition of each brain tissue permits different image
acquisition types such as T1, T2 and PD. T1 scans clearly separate gray matter from
white matter and are often used in inter-tissue classification. T2 scans are often used in the
intra-tissue classification of abnormal fluid against normal tissue, which makes them suitable for lesion
detection. The ability of MRI to detect MS lesions has led to its general acceptance
as a tool for diagnosing the disease [35]. Conventional MRI is highly sensitive for detecting
MS plaques and can provide a quantitative assessment of inflammatory activity and lesion load
[16]. More advanced acquisition methodologies are being used today, such as fluid attenuated
inversion recovery (FLAIR) [9], diffusion techniques such as diffusion weighted imaging (DWI) [58]
and diffusion tensor imaging (DTI) [25], or magnetization transfer (MT) [86]. MS detection techniques
are based on the assumption that a classification of the brain tissue is known. This classification is
essential for analyzing brain tissue atrophy, tissue volumes and their evolution.
Manual segmentation of brain tissue is both challenging and time-consuming because of the
large number of MRI slices that compose the three-dimensional information for each patient
[16]. The intra/inter-observer variability of manually segmented scans can also be significant in some
cases. Consequently, the development of robust automated MS segmentation methods, which can
segment large amounts of MRI data and do not suffer from intra/inter-observer variability, is
nowadays an active research field [78] [85]. However, automated segmentation of brain tissue is
still a challenging problem due to the complexity of the images, differences in tissue intensities,
noise, intensity non-uniformities, partial volume effects [3] or the absence of anatomical models
that fully capture the possible deformations in each structure [52].
1.2 Research framework
This master thesis is framed within the following research projects, in which the Computer
Vision and Robotics (VICOROB) group of the Universitat de Girona is involved:
Project/contract title: AVALEM: evaluation of atrophy
in patients with multiple sclerosis lesions
Funding body: FUNDACIO ESCLEROSI MULTIPLE
Project/contract number: CEM-CAT2011 Amount: 40,000.00 Duration:
from 2011 to 2013
Principal investigator: Xavier Llado Bardera
Project/contract title: SALEM: automatic segmentation of
multiple sclerosis lesions in magnetic resonance images
Funding body: Instituto de Salud Carlos III
Project/contract number: PI09/91018 Amount: 79,860.00 Duration:
from 2010 to 2012
Principal investigator: Xavier Llado Bardera
Project/contract title: SALEM: Toolkit for the automatic
segmentation of MS lesions in MRI.
Funding body: Centre d’Innovacio i Desenvolupament
Empresarial (CIDEM)
Project/contract number: VALTEC09-1-0025 Amount: 100,000.00 Duration:
from 2009 to 2011
Principal investigator: Xavier Llado Bardera
The project is based on collaboration with relevant hospitals and medical expert teams
in the field of multiple sclerosis, such as the Hospital Vall d’Hebron, the Clínica Girona and the
Hospital Dr. Josep Trueta. These 3 hospitals provide the data from real patients used to evaluate
the implemented segmentation methods.
1.3 Objectives
Brain tissue segmentation is essential for analyzing brain atrophy and its
evolution in MS patients. In this document, several segmentation methods are analyzed
for automatic brain tissue classification and volumetric quantification. Various
existing key segmentation methods are studied, while new methods are also implemented, tested,
and compared. The effect of multiple sclerosis lesions on the tissue segmentation and
quantification process is also investigated using baseline and 12-month scans from healthy
subjects and MS patients.
Four brain tissue MRI scan sets will be used to evaluate the performance of the tissue
segmentation tools:
• 9 T1w synthetic volumes from the Brainweb 3D simulated intracranial MR image
generator [22].
• 20 T1w scans of normal subjects with manual segmentations from the Internet Brain
Segmentation Repository (IBSR, see footnote 1).
• 9 T1w healthy subjects from SALEM, each with a second scan after 12 months for
temporal comparison.
• 10 T1w subjects from SALEM with different lesion loads, each with a second scan after
12 months for temporal comparison.
The main goal of this master thesis is subdivided into the following objectives:
• Analyze the state of the art of brain tissue segmentation techniques. This objective aims
to review the whole brain tissue segmentation state of the art to better understand the
advantages and drawbacks of each technique.
• Select a representative set of the best state-of-the-art techniques for brain tissue
segmentation. This objective aims to implement some automatic segmentation methods to be
added to the best publicly available methods.
• Perform a quantitative evaluation of the selected methods on simulated and real brain 3D
MRI scans. This objective aims to assess the accuracy of the methods in relation to the
provided ground truth of the public datasets.
• Evaluate the effect of MS on the quantification of brain tissue atrophy. This objective
aims to evaluate tissue volume differences and extend the evaluation to the presence of MS with
different lesion loads. Segmented volumes are evaluated in consecutive scans taken 12 months apart
for both healthy subjects and MS patients with different levels of disease.
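The accuracy assessment against ground truth named in the third objective is commonly expressed with overlap measures such as the Dice and Jaccard indexes, which the result tables of this thesis also refer to. The following is a minimal NumPy sketch of both measures, not the evaluation code used in this work; the label convention (1 = CSF, 2 = GM, 3 = WM), the function names and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def dice(seg, gt, label):
    """Dice index 2*|A and B| / (|A| + |B|) for one tissue label."""
    a = (seg == label)
    b = (gt == label)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # label absent from both volumes: treated as full agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

def jaccard(seg, gt, label):
    """Jaccard index |A and B| / |A or B| for one tissue label."""
    a = (seg == label)
    b = (gt == label)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return np.logical_and(a, b).sum() / union

# Toy one-dimensional "volumes" (assumed labels: 1 = CSF, 2 = GM, 3 = WM).
gt = np.array([1, 1, 2, 2, 2, 3, 3, 3])
seg = np.array([1, 2, 2, 2, 3, 3, 3, 3])

for name, label in (("CSF", 1), ("GM", 2), ("WM", 3)):
    print(name, round(dice(seg, gt, label), 3), round(jaccard(seg, gt, label), 3))
```

Both indexes range from 0 (no overlap) to 1 (perfect overlap) and are related by J = D / (2 - D), so either one can be converted to the other when comparing with published results.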
1.4 Planning
According to the objectives described in 1.3, the master thesis will be developed in different
steps. Those steps are summarized as follows with the proposed realization time:
1 The IBSR database is publicly accessible at http://www.cma.mgh.harvard.edu/ibsr
• Analyze the existing state-of-the-art methods proposed by the scientific
community for brain tissue segmentation. During the first stage of the master thesis, the
state of the art in MRI brain tissue segmentation will be studied. The analysis is extended
to various image preprocessing tools commonly used by the community. Expected date:
1st February - 15th February 2012.
• Implementation of brain tissue segmentation methods. Some of the best
state-of-the-art brain tissue segmentation methods will be implemented. These methods will
be added to the best publicly available methods to complete a representative list
of segmentation methods to be used in the quantitative evaluation. Expected date: 16th
February - 31st March 2012.
• Quantitative accuracy evaluation on simulated data. The segmentation methods
will be evaluated using 3D simulated brain scans from the BrainWeb dataset. The methods
will be assessed for different levels of noise and intensity inhomogeneity. Expected date:
1st April - 15th April 2012.
• Quantitative accuracy evaluation on real data. The segmentation methods will be
evaluated using 3D real brain scans with different levels of difficulty and intensity
inhomogeneity from the IBSR dataset. The accuracy of the methods is evaluated against
the provided ground truth and compared with previous studies found in the brain
tissue segmentation literature. Expected date: 1st April - 30th April 2012.
• Quantitative evaluation of volumetric analysis. The volumetric evaluation of the
segmentation methods in the presence of MS will be carried out using healthy subjects and MS
patients with different lesion loads. The experiments will be carried out using baseline
and 12-month scans from the SALEM and AVALEM projects. Expected date: 16th April -
15th May 2012.
• Documentation. All details regarding those steps will be documented in this master
thesis document. The documentation step will be concluded with a scientific paper and
a poster of the proposed work. Expected date: 16th May - 30th May 2012.
1.5 Document structure
This document is structured in 6 chapters. Chapter 1 gives a brief summary of the
background, objectives and planning of the master thesis. Chapter 2 defines in more detail the
necessary research background in automatic brain tissue segmentation. MRI is introduced with
emphasis on the acquisition sequences used for brain tissue classification. Moreover, specific issues
and concepts related to computer vision are defined. Chapter 3 reviews the state of the art in
the tissue segmentation research field. It pays attention to the most recent techniques dealing
with automatic brain tissue segmentation and focuses on the advantages and drawbacks of each
technique. A classification of those techniques into supervised and unsupervised methodologies
is introduced. Results from surveyed works are organized by the dataset employed and compared
with each other. Chapter 4 describes in more detail the set of selected techniques used in
the study. In particular, publicly available segmentation tools such as SPM5 and SPM8 [5], FSL
FAST [89] and GAMIXTURE [79] are reviewed. The review of selected techniques is completed
by analyzing four implemented methods: a K-nearest neighbor classifier auto-trained with the same input
scan and initialized with a probabilistic tissue atlas [27], a self-organized map (SOM) approach [77],
and two fuzzy clustering approaches, one with the classic membership function and one with spatial
information weighting [61]. Chapter 5 performs a complete quantitative evaluation of the
results obtained with different datasets. Firstly, synthetic and real data are used to evaluate
the quantitative accuracy of the selected methods. Secondly, consecutive scans taken 12 months apart are
used to evaluate the volumetric analysis in the presence of the disease, with both healthy subjects and
MS patients with different lesion loads. Finally, Chapter 6 presents conclusions summarizing the
developed work. Based on these conclusions, possible solutions are also proposed
to be implemented as future work.
Chapter 2
Problem definition
2.1 Magnetic Resonance Imaging
Imaging is usually preferred over biopsy in clinical practice when the collateral risks of surgical
procedures are significant. Unlike X-rays and computed tomography (CT) scans, MRI
does not use ionizing radiation. Instead, electromagnetic waves stimulate hydrogen atoms in molecules
using the property of nuclear magnetic resonance (NMR). NMR is the phenomenon in which
magnetic nuclei in a magnetic field absorb and re-emit electromagnetic radiation at a specific
resonance frequency. The radiation energy depends on the magnetic field and the properties of
the atoms, which permits different imaging configurations. For several decades, MRI has been widely used
in scientific research and medical care. 3D processing of medical brain images is an active
research topic in computer vision, where MRI has provided meaningful information about brain
tissue at very high resolutions for use in fields like reparative surgery, radiotherapy,
stereotactic neurosurgery and others [50]. In particular, automatic 3D brain segmentation of white
matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) is especially important for
quantitative analysis and tissue volume measurements. Those quantitative measurements are
a key factor in assessing the progress or remission of several diseases of the central nervous system,
such as MS. As a consequence of the high resolution data provided by MRI, its application to the
study of MS lesions has greatly improved the ability to diagnose the disease and monitor its
evolution [48]. Indeed, MRI is the most sensitive technique for detecting demyelinating lesions of
the CNS in MS patients [64]. New diagnostic criteria based on MRI make it possible to demonstrate the
dissemination of the pathology in space and time, allowing early diagnosis [68] and
the exclusion of other possible pathologies.
2.1.1 MRI concepts
MRI takes advantage of the high amount of water in body tissue. Briefly, the underlying principle
is the magnetization of protons, which become aligned in the presence of a large magnetic
field. Water molecules have two hydrogen nuclei, or protons. When they are stimulated by
a radio-frequency (RF) coil pulse, the average magnetic moment of the protons of the body
region exposed to magnetization becomes aligned with the direction of the field. Varying
the electromagnetic field according to the resonance frequency flips the spins of the water
molecules in the direction of the induced magnetic field. After the induced magnetic field is
discontinued, the spins progressively realign with the initial static field, returning to what is known
as thermodynamic equilibrium. The time needed to return to equilibrium is defined as the relaxation time.
During this relaxation, a radio-frequency signal is generated, which can be measured with
receiver coils. The magnetization is then repeated by applying a new pulse sequence to
the water molecules. The repetition time (TR) is defined as the amount of time
between successive pulse sequences applied to the same body region.
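The resonance frequency mentioned above is the Larmor frequency, which for hydrogen nuclei is proportional to the strength of the static field. As a small sketch: the gyromagnetic ratio is the standard value for 1H, while the field strengths of 1.5 T and 3 T are assumed here only as typical examples and are not taken from this thesis.

```python
# Gyromagnetic ratio of the hydrogen nucleus divided by 2*pi, in MHz per tesla.
GAMMA_BAR_1H_MHZ_PER_T = 42.577

def larmor_frequency_mhz(b0_tesla):
    """Resonance (Larmor) frequency f = gamma_bar * B0 for 1H, in MHz."""
    return GAMMA_BAR_1H_MHZ_PER_T * b0_tesla

for b0 in (1.5, 3.0):  # assumed example field strengths, in tesla
    print(f"B0 = {b0} T -> f = {larmor_frequency_mhz(b0):.2f} MHz")
```

The RF pulse has to be tuned to this frequency (roughly 64 MHz at 1.5 T and 128 MHz at 3 T) for the protons to absorb energy, which is why the excitation falls in the radio-frequency range.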
The composition of each brain tissue defines the relaxation times and permits different
image acquisition types. The T1-weighted acquisition sequence depicts differences in the
spin-lattice relaxation time of various tissues within the body. The spin-lattice relaxation time (T1
relaxation time) is defined as the time taken by each tissue to return to its thermodynamic
equilibrium. The T2-weighted acquisition sequence depicts differences in the spin-spin relaxation
time. The spin-spin relaxation time (T2 relaxation time) characterizes the signal decay as
the time it takes for the magnetic resonance signal to reach a certain peak. The echo time
(TE) represents the amount of time between the application of the RF pulse and that peak
of the echo signal. The voxel intensity of the acquired image is determined by the previously
defined time variables (TR, TE, T1 and T2). Each acquisition sequence is
defined by these time variables. T1w sequences are defined by a short TR and a short TE
(TR < 1000 ms, TE < 30 ms). T2w sequences are defined by a long TR and a long TE (TR > 2000 ms,
TE > 80 ms). T1 scans clearly separate gray matter from white matter and are often
used in brain tissue segmentation when only one modality is used (monospectral). This is the
case for all the methods analyzed in this study, which are run only on T1 scans. Moreover, T1
is also combined with other sequences in multispectral tissue segmentation
methods. T2 scans are often used in the intra-tissue classification of abnormal fluid against
normal tissue, which makes T2 suitable for lesion detection. However, the correlation between
the burden of lesions observed on conventional MRI scans and the clinical evidence of lesions
remains weak. In particular, discrepancies between clinical and conventional MRI findings in
MS are explained, at least partially, by the limited ability of conventional MRI to characterize
and quantify the heterogeneous features of MS pathology [35]. Consequently, more advanced
acquisition methods are being used today, such as fluid attenuated inversion recovery (FLAIR)
[9], diffusion techniques such as diffusion weighted imaging (DWI) [58] and diffusion tensor imaging (DTI) [25],
or magnetization transfer (MT) [86]. FLAIR sequences suppress fluids from the image, for
example restraining the effects of CSF on the acquired image. Those acquisition sequences
have been widely used to classify periventricular hyperintense lesions, such as MS plaques
[43]. In turn, diffusion (DWI, DTI) and magnetization transfer (MT) techniques have been used
in histopathology studies of MS [65] [36]. DWI and DTI techniques use the rate of
diffusion of water molecules, known as Brownian motion, and derive neural tract data from
their apparent diffusivity. In magnetization transfer, the image contrast is improved based on
the observed changes caused by magnetization transfer from hydrogen nuclei of water
with restricted motion (hydration or bound water) to hydrogen nuclei of water that move with many degrees
of freedom (free or bulk water).
(a) Brain tissue (b) T1w (c) T2w (d) FLAIR
Figure 2.1: Different acquisition sequences. (a) Brain tissue segmentation with CSF (red), GM (green) and WM (blue). (b) T1w image, (c) T2w image and (d) FLAIR image.
2.2 Computer vision aspects
Figure 2.1 shows a set of typical MRI images acquired with T1w, T2w and FLAIR modalities.
In T1 (figure 2.1(b)), CSF has the darkest intensities while WM has the brightest.
Conversely, in T2 (figure 2.1(c)), CSF has the brightest intensities while WM is the darkest. In
both sequences, GM has an intermediate gray level. In FLAIR acquisition sequences
(figure 2.1(d)), however, WM and GM have an intermediate gray level and lesions appear brighter.
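The intensity ordering described above can be reproduced with the standard spin-echo approximation, in which the signal is proportional to PD * (1 - exp(-TR/T1)) * exp(-TE/T2). The sketch below is illustrative only: the tissue parameters are rough, assumed values, the TR/TE pairs are assumed examples of a short-TR/short-TE and a long-TR/long-TE sequence, and the function names are hypothetical.

```python
import math

# Rough, assumed tissue parameters: relative proton density, T1 and T2 in ms.
TISSUES = {
    "CSF": {"pd": 1.00, "t1": 4500.0, "t2": 2200.0},
    "GM": {"pd": 0.80, "t1": 950.0, "t2": 100.0},
    "WM": {"pd": 0.70, "t1": 600.0, "t2": 80.0},
}

def spin_echo_signal(pd, t1, t2, tr, te):
    """Approximate spin-echo signal: PD * (1 - exp(-TR/T1)) * exp(-TE/T2)."""
    return pd * (1.0 - math.exp(-tr / t1)) * math.exp(-te / t2)

def contrast(tr, te):
    """Signal of every tissue for one (TR, TE) pair, in arbitrary units."""
    return {
        name: spin_echo_signal(p["pd"], p["t1"], p["t2"], tr, te)
        for name, p in TISSUES.items()
    }

def brightness_order(signals):
    """Tissue names sorted from darkest to brightest."""
    return sorted(signals, key=signals.get)

t1w = contrast(tr=500.0, te=15.0)     # assumed short TR, short TE
t2w = contrast(tr=4000.0, te=100.0)   # assumed long TR, long TE

print("T1w, darkest to brightest:", brightness_order(t1w))
print("T2w, darkest to brightest:", brightness_order(t2w))
```

With a short TR the slowly recovering CSF contributes little signal, so it appears dark; with a long TR and long TE the slowly decaying CSF dominates and the ordering is reversed.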
(a) Synthetic representation (b) Real volume
Figure 2.2: Skull stripping of the brain. (a) Non-brain parts such as eyes, skull and fat are also present in MRI images. (b) Same scan after the skull-stripping process.
3D MRI volumes are a stack of individual 2D magnetic resonance images captured at
different slices with a constant distance between them. Every voxel, also known as a volumetric pixel,
represents a value on the regular grid defined by the MR images and slices. Normally, voxels
encode the gray level intensity with an 8-bit or 16-bit numerical range.
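As a sketch of how such a voxel grid is handled in practice, the following NumPy code stacks 2D slices into a 3D volume and converts a voxel count into a physical volume, which is the basic operation behind volumetric measurements. The slice size, the voxel spacing of 1 x 1 x 1.5 mm, the synthetic intensities and the helper names are all assumptions for illustration.

```python
import numpy as np

def stack_slices(slices):
    """Stack equally sized 2D slices into a volume of shape (n_slices, rows, cols)."""
    return np.stack(slices, axis=0)

def volume_ml(mask, spacing_mm):
    """Physical volume, in millilitres, of the voxels set in a boolean mask."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return mask.sum() * voxel_mm3 / 1000.0  # 1 ml = 1000 mm^3

# Ten synthetic 16-bit slices of 4 x 4 pixels; slice k has constant intensity 100 * k.
slices = [np.full((4, 4), 100 * k, dtype=np.uint16) for k in range(10)]
volume = stack_slices(slices)

# Select the voxels of the five brightest slices (intensity >= 500).
bright = volume >= 500

print(volume.shape, volume.dtype)
print(volume_ml(bright, (1.0, 1.0, 1.5)), "ml")
```

The same `volume_ml` helper can be applied to the mask of any segmented tissue class, provided the voxel spacing is read from the scan header rather than assumed.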
In what follows, several issues related to MRI images are explained. Basically, processing acquired
MRI volumes has to deal with the removal of the skull and other non-brain tissue, partial volume effects
and intensity inhomogeneities.
2.2.1 Skull stripping
Acquired brain MRI volumes include non-brain parts of the head such as the eyes, fat,
spinal cord or skull. The segmentation of brain tissue from non-brain tissue in MRI is
commonly referred to as skull stripping, and it is an important image processing step in many
neuroimaging studies. Studies have reported that differences in skull stripping can lead to
unexpected results in tissue classification if the skull or eyes are included as brain tissue [1].
Figure 2.2 depicts the skull stripping process. Figure 2.2(a) shows an MRI scan in which the eyes,
fat and skull are present. Figure 2.2(b) shows the preprocessed MRI scan with the skull and
non-brain parts removed.
2.2.2 Partial volume effects
Automatic brain tissue segmentation algorithms classify the voxels into their possible classes
(CSF, GM and WM). However, one of the most important problems is the classification of
voxels where more than one tissue is present. This phenomenon is referred to as partial volume
effects (PVE). PVE blur the intensity distinction between tissue classes at their borders. For
example, a T1 image voxel containing a boundary between CSF and WM can be misclassified
as GM because its blurred intensity lies between those of the two tissues (figure 2.3(a)). A
real example is shown in figure 2.3(b), where the highlighted region is blurred due to PVE at
the tissue boundaries.
(a) Synthetic representation (b) Real volume
Figure 2.3: Partial volume effects representation. (a) Synthetic representation of intensity
blurring on boundaries and (b) real volume.
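The CSF/WM example above can be made concrete with a small numerical sketch. It assumes a linear mixing model for the intensity of a voxel shared by two tissues and uses hypothetical mean T1w intensities; both are assumptions for illustration only and are not taken from the reviewed methods.

```python
# Hypothetical mean T1w intensities for each pure tissue (arbitrary units).
means = {"CSF": 30.0, "GM": 100.0, "WM": 160.0}

def mixed_intensity(fraction_csf: float) -> float:
    """Intensity of a voxel holding CSF and WM under a linear mixing assumption."""
    return fraction_csf * means["CSF"] + (1.0 - fraction_csf) * means["WM"]

def nearest_tissue(intensity: float) -> str:
    """Hard classification: pick the tissue whose mean intensity is closest."""
    return min(means, key=lambda tissue: abs(means[tissue] - intensity))

# A voxel on the CSF/WM border, half filled with each tissue.
border = mixed_intensity(0.5)
label = nearest_tissue(border)
```

With these values the border voxel has an intensity of 95, which is closest to the GM mean, so a hard intensity-based classifier labels it as GM although it contains no GM at all.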
2.2.3 Intensity inhomogeneity
The MRI acquisition process can be corrupted by several image artifacts. These artifacts have a
direct impact on segmentation results. Automated brain segmentation pipelines usually
incorporate a preprocessing step in which these image inhomogeneities are removed. Sources
of inhomogeneities have been studied extensively [72]. The causes of the artifacts have been
divided into two main groups [82]: those inherent to the MRI device and those induced by the
scanned object. The main causes in the first group derive especially from radio frequency (RF)
transmission and reception, but also from differences in the magnetic field, bandwidth
filtering of the data or eddy currents driven by field gradients. The causes in the second
group are related to the imaged object itself (position, shape and orientation of the object
inside the magnet) or to the dielectric properties of the object. Figure 2.4 depicts four volumes
with different intensity inhomogeneity levels from the IBSR dataset¹: figure 2.4(a) shows a
volume with low bias while figure 2.4(b) depicts a volume with high bias localized in one stripe
in the image center. Figure 2.4(c) shows a very high bias with different stripes along the volume
while figure 2.4(d) shows a typical corrupted image acquisition. Intensity inhomogeneities
inherent to the MRI scanner can be corrected by shimming techniques [82], special imaging
sequences and different sets of coils, or by calibrating the MRI device with a phantom or
a mathematical model. However, correcting inhomogeneities related to the scanned object is
still a hard problem. The frequency used to excite the imaged object increases linearly with
the field strength, so high magnetic field MR scanners accentuate the effects of RF standing
waves and penetration, which depend strongly on the object itself.
(a) Low bias (b) High bias (c) Very high bias (d) Corrupted volume
Figure 2.4: Different acquisition images with intensity inhomogeneity biases from the IBSR dataset.
(a) Low bias, (b) high bias localized in one stripe in the image center, (c) very high bias with
different stripes, (d) corrupted image.
¹ The IBSR dataset is sorted by level of difficulty. Some volumes come with a very high intensity inhomogeneity load.
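The effect of intensity inhomogeneity on an intensity-based classification can be sketched as follows. The example assumes the commonly used multiplicative model, in which the observed signal is the true signal multiplied by a smooth bias field; this model, the 1D profile, the tissue intensities and the threshold are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 1D intensity profile: 50 GM voxels followed by 50 WM voxels.
true_signal = np.concatenate([np.full(50, 100.0), np.full(50, 160.0)])
is_wm = np.arange(true_signal.size) >= 50

# Smooth multiplicative bias field falling from 1.4 to 0.6 across the profile
# (assumed model: observed = true * bias).
bias = np.linspace(1.4, 0.6, true_signal.size)
observed = true_signal * bias

# A single global threshold separates the tissues in the unbiased signal...
threshold = 130.0
errors_true = np.count_nonzero((true_signal > threshold) != is_wm)
# ...but misclassifies voxels at both ends of the biased profile.
errors_observed = np.count_nonzero((observed > threshold) != is_wm)

# Dividing by the bias field (known here, estimated in practice) restores the profile.
corrected = observed / bias
```

In a real pipeline the bias field is unknown and has to be estimated, which is what the correction methods referenced in this section aim to do.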
Chapter 3
State of the art
This chapter reviews the state of the art of brain tissue segmentation methods, pointing out
their advantages and disadvantages. Different strategies have been defined in previous literature
reviews to classify those methods. Bezdek et al. [12] studied 90 papers on MRI segmentation
using pattern recognition techniques and divided them into supervised and unsupervised methods.
Clarke et al. [21] expanded this classification with pre-processing and registration techniques
as well as validation methods. Pham et al. [60] analyzed the most common algorithms used in
the segmentation of anatomical structures in medical imaging. The authors classified previous
work directly by approach, e.g. clustering, classifiers, thresholding, Markov models,
etc. In another review, Zhang et al. [90] categorized the methods by classification, assessment
criteria and performance metrics proposed by each method. More recently, Balafar et al. [10]
added several imaging modalities and methods for noise reduction and inhomogeneity correction
to classify the different approaches.
3.1 Classification of segmentation approaches
In machine learning, classification can be based on supervised or unsupervised learning.
Supervised learning uses a training set of correctly identified observations. Those observations
are used as prior information to perform the brain tissue segmentation. On the contrary,
unsupervised learning, also known as cluster analysis, does not use any prior information in the
classification task and involves grouping data into categories based on some measure of
inherent similarity, e.g. distance-based measures. Following Bezdek et al. [12], we also
categorize the methods by learning type as supervised (S) and unsupervised (U). Moreover,
we extend the categorization by adding algorithm and evaluation characteristics such as intensity
uniformity correction (IUC), image type (T1, T2, PD), dataset and statistical measure used.
Although a large number of works have been published, the study focuses only on those which
use common databases such as IBSR¹ and Brainweb (BW) [22]. This characteristic allows
us to compare quantitatively the accuracy of the methods. 30 works published from 2001 to
the present have been reviewed. Table 3.1 summarizes the characteristics of the selected
works.
3.1.1 Supervised methods
Supervised methods (S) described in table 3.1 can be organized in two main groups. The first
group of methods makes use of the Bayesian formulation, which implies an iterative
simultaneous estimation of tissue classes and intensities. They model the intensity distribution of
brain tissues by a Gaussian mixture model (GMM), classifying voxels according to the intensity
distribution of the data. Prior information such as probabilistic atlases is introduced in the
methodology to estimate the initial parameters of the model. Given the weighted mixture
distribution, the segmentation is commonly estimated by an expectation maximization (EM)
optimization algorithm with a maximum a posteriori (MAP) or a maximum likelihood (ML)
criterion [5] [33] [57]. Moreover, spatial coherence assumptions can be added using a Bayesian
approach, in the form of Markov Random Field (MRF) models [8] [89] [14] or multi-scale
information [3]. The second group is composed of statistical supervised classification
approaches. Those methods typically develop different strategies to train non-parametric
[23] [27] or high-dimensional feature space classifiers [49] [87] [85], where training data is
extracted from prior information given by statistical or expert-labelled atlases.
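The GMM and EM combination described above can be sketched for scalar intensities as follows. This is a plain maximum likelihood EM under simplifying assumptions: there is no atlas prior, no spatial (MRF) term and no bias correction, and the synthetic intensities and initial means are hypothetical stand-ins for real data and atlas-derived initial values. It is not the implementation of any of the reviewed methods.

```python
import numpy as np

def em_gmm_1d(x, means, n_iter=200):
    """Fit a 1D Gaussian mixture by EM; returns weights, means, variances, responsibilities."""
    x = np.asarray(x, dtype=float)
    means = np.asarray(means, dtype=float).copy()
    k = means.size
    weights = np.full(k, 1.0 / k)
    variances = np.full(k, x.var())
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each voxel.
        diff = x[:, None] - means[None, :]
        lik = weights * np.exp(-0.5 * diff**2 / variances) / np.sqrt(2 * np.pi * variances)
        resp = lik / lik.sum(axis=1, keepdims=True)
        # M-step: update the mixture parameters from the soft assignments.
        nk = resp.sum(axis=0)
        weights = nk / x.size
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return weights, means, variances, resp

# Synthetic intensities for three tissues (hypothetical means, standard deviation 10).
rng = np.random.default_rng(0)
intensities = np.concatenate([rng.normal(m, 10.0, 500) for m in (30.0, 100.0, 160.0)])

# Rough initial means, standing in for values that a registered atlas would provide.
est_weights, est_means, est_vars, resp = em_gmm_1d(intensities, means=[20.0, 90.0, 170.0])

# Hard labels: the class with the highest posterior probability for each voxel.
labels = resp.argmax(axis=1)
```

The responsibilities `resp` are soft class memberships; taking the `argmax` yields a hard segmentation, which discards the partial volume information they carry.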
Probabilistic methods
Marroquin et al. (2002) [57] proposed a variant of the Bayesian estimation framework with
parameter estimation via the Expectation Maximization (EM) algorithm. Initial tissue
probabilities were computed using a robust registration of a standard brain, and intensity
inhomogeneities for each class were eliminated by separate parametric smooth models. However,
the method produced a hard segmentation result which does not deal with PVE. Similarly,
Dugas-Phocion et al. (2004) [33] introduced an improved EM algorithm for brain tissue
segmentation with partial volume effect quantification. It was based on a Gaussian Mixture
Model (GMM) with prior information from a probabilistic brain atlas. According to the authors,
vessels were wrongly classified as CSF, so they introduced a new class for vessels. The reported
results showed an improvement in the global accuracy of the method. Ashburner et al. (2005) [5]
also presented a probabilistic brain segmentation framework based on GMM with prior atlas
information. Here, an ICBM Tissue Probabilistic Atlas² was registered to obtain the initial
tissue probabilities³. The method combines image registration, tissue classification and bias
correction at the same time through a derivation of the log-likelihood function⁴.
¹ http://www.cma.mgh.harvard.edu/ibsr/data.html

Author                     Year  Algorithm  S-U  IUC  Image type  Database  Measure
Ashburner et al. [5]       2005  GMM        S    yes  all         BW        Dice
Askelrod et al. [3]        2007  SWA        S    no   T1          IBSR      Dice
Askelrod et al. [4]        2007  SVM        S    no   T1          IBSR      Dice
Awate et al. [8]           2006  AMRF       S    no   T1          BW,IBSR   Dice
Bazin et al. [11]          2008  TBMS       S    no   T1          IBSR      Dice
Boer et al. [27]           2010  KNN        S    no   T1,PD       other     Dice
Bricq et al. [14]          2008  HMRF       S    yes  T1,T2       BW,IBSR   Dice
Cocosco et al. [23]        2003  KNN        S    no   all         other     Dice
Jimenez et al. [49]        2006  MS         S    no   T1          IBSR      Overlap
Marroquin et al. [57]      2002  EM         S    yes  T1,T2,PD    BW        Dice
Dugas-Phocion et al. [33]  2004  EM         S    no   T1,T2       other     Dice
Scherrer et al. [69]       2010  MRF        S    yes  T1          BW        Dice
Wels et al. [85]           2011  DMC-EM     S    yes  T1,T2,PD    BW,IBSR   Dice
Yi et al. [87]             2009  RF         S    yes  T1          IBSR      Overlap
Zhang et al. [89]          2001  HMRF+EM    S    yes  all         IBSR      -
Demirhan et al. [28]       2011  WSOM       U    no   T1          IBSR      Overlap
Hasanzadeh et al. [42]     2007  GFCM       U    no   all         BW        Dice
He et al. [44]             2008  AFCM       U    no   T1          BW        Dice
Kalaiselvi et al. [51]     2011  HFCM       U    no   T1          IBSR      -
Krinidis et al. [53]       2010  FLICM      U    no   T1          other     -
Pham et al. [61]           2001  RFCM       U    no   T1          BW        -
Tohka et al. [79]          2007  GA-GMM     U    no   T1,T2       BW        -
Tian et al. [78]           2011  GA-VEM     U    no   T1          IBSR      Overlap
Yuanfeng et al. [88]       2011  PSOM       U    no   T1          BW        Dice
Caldairou et al. [18]      2009  RFCM       U    no   T1          BW,IBSR   Dice
Shen et al. [71]           2005  SOM+FCM    U    no   T1          BW,IBSR   -

Table 3.1: Selected state-of-the-art automatic brain tissue segmentation methods. Author and Year
give the article reference; Algorithm, S-U, IUC and Image type give the algorithm characteristics;
Database and Measure describe the experiment. The acronyms for the algorithms stand for:
Adaptive Fuzzy C-means (AFCM), Adaptive Markov Random Fields (AMRF), Discriminative Model
Constrained Expectation Maximization (DMC-EM), Expectation Maximization (EM), Gaussian Mixture
Model (GMM), Genetic Algorithm (GA), Genetic Fuzzy C-means (GFCM), Hidden Markov Random Fields
(HMRF), Histogram-based Fuzzy C-means (HFCM), Hybrid Fuzzy C-means (HbFCM), K-nearest Neighbor
(KNN), Markov Random Fields (MRF), Mean-shift (MS), Random Forests (RF), Probabilistic Self
Organized Map (PSOM), Robust Fuzzy C-Means (RFCM), Segmentation by Weighted Aggregation (SWA),
Self Organized Map (SOM), Support Vector Machines (SVM), Topology-Preserving Fast Marching
Segmentation (TBMS), Variational Expectation Maximization (VEM), Weighted Self Organizing Map (WSOM).
Zhang et al. (2001) [89] proposed a parametric EM-based approach with Hidden Markov
Random Fields (HMRF) and thresholding-based initialization⁵ as a substitute for the widely used
GMM, which is more sensitive to noise. The method was based on an HMRF-EM framework, a
combination of the HMRF model, which encoded spatial information through the mutual influences
of neighboring voxels, and the associated Markov Random Field-Maximum a Posteriori (MRF-MAP)
estimation by EM fitting procedures. Bias correction was incorporated by introducing
the algorithm proposed by Guillemaud et al. [39] into the EM parameter estimation. Awate et
al. (2006) [8] presented a segmentation method built on an adaptive statistical model of image
neighborhoods and atlas-based initialization. Although the method did not use training data,
it adapted to the data starting from an initial configuration generated from an atlas of
labels, and then performed tissue classification by adaptively learning the image-neighborhood
statistics via data-driven nonparametric density estimation (nonparametric MRF).
Additionally, Bricq et al. (2008) [14] introduced another Bayesian approach where prior
information was obtained both from a probabilistic brain atlas containing prior expectations
about the spatial localization of the different tissue classes and from neighborhood information
using a Hidden Markov Chain (HMC) model. The bias field was estimated online as a linear
combination of smooth basis functions, as proposed by Van Leemput et al. [81]. Scherrer et al.
(2010) [69] provided a joint model framework for carrying out tissue and structure segmentations
by distributing a set of local agents that cooperatively estimate local MRF models. The approach
was based on a fully Bayesian joint model that integrates local tissue and structure
segmentations and local intensity distribution modeling within a multi-agent framework. The
joint model was built on the specification of three conditional Markov Random Field (MRF)
models: two encoding cooperations between tissue and structure segmentations with a priori
anatomical knowledge, and another specifying a Markovian spatial prior over the model
parameters that enables local estimations and consistently handles intensity non-uniformities
without explicit bias field modeling.
Askelrod et al. (2007) [3] proposed a method based on Segmentation by Weighted Aggregation
(SWA). This method incorporated prior knowledge into a multi-scale framework
through a Bayesian formulation. Atlas priors provided probabilistic information and were added
to a likelihood function estimated from manual training. The method constructed a pyramid
of image graph representations at different resolutions. This configuration allowed the authors
to adaptively represent progressively larger aggregates of voxels with similar properties.
² Atlas freely available from http://www.loni.ucla.edu/ICBM/ICBM_Probabilistic.html
³ This method is developed in more detail in the next chapter as part of our selected segmentation methods list.
⁴ The proposed method is publicly available as Statistical Parametric Mapping (SPM). http://www.fil.ion.ucl.ac.uk/spm/
⁵ Publicly available as part of FSL http://www.fmrib.ox.ac.uk/fsl/index.html
Statistical classification
Cocosco et al. (2003) [23] proposed a non-parametric K-Nearest Neighbor (KNN) classifier with
adaptive training. A set of training samples was generated from prior tissue probability maps
registered to the subject. The training set was customized by using a pruning strategy,
such that the classification was robust against anatomical variability and pathology. Similarly,
Boer et al. (2009) [27] introduced another KNN classifier, also automatically trained on the
subject itself, using T1 and PD volume intensities. The training dataset was built in the
same manner as in Cocosco et al. (2003) [23], but here the training samples for the classifier
were obtained by atlas-based registration of single or multiple label atlases
to the subject. The transformations were obtained by registration of the grayscale images and
applied to the labeled images. Jimenez et al. (2006) [49] proposed another non-parametric
estimation strategy, based on the Mean-Shift (MS) algorithm, which defines the cluster centers
using the local modes of the underlying joint space-range density function. Tissue boundary
information was improved by including an edge confidence map representing the confidence
of truly being in the presence of a border between adjacent regions. The confidence measure
was used to iteratively fuse a region adjacency map, merging regions with weak
edges and pruning very small regions. Class labeling was performed by a maximum a posteriori
(MAP) decision, based on prior label atlases. For each homogeneous region found in the graph,
a probability map associated with each of the tissue classes was computed.
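The KNN classification step used by the first two methods above can be sketched as a majority vote among the closest training samples in feature space. The two-channel features (hypothetical T1 and PD intensities), the tissue centers and the value of k are assumptions for illustration; the atlas registration and pruning steps that produce the training set in the cited works are not reproduced here.

```python
import numpy as np

def knn_classify(train_x, train_y, query_x, k=5):
    """Label each query sample by majority vote among its k nearest training samples."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y)
    query_x = np.asarray(query_x, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k closest training samples for every query voxel.
    nearest = np.argsort(d2, axis=1)[:, :k]
    votes = train_y[nearest]
    # Count the votes per class and keep the most voted one.
    n_classes = int(train_y.max()) + 1
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

# Hypothetical (T1, PD) feature centers for CSF (0), GM (1) and WM (2).
rng = np.random.default_rng(0)
centers = np.array([[30.0, 150.0], [100.0, 120.0], [160.0, 90.0]])
train_x = np.concatenate([c + rng.normal(0.0, 5.0, size=(50, 2)) for c in centers])
train_y = np.repeat(np.arange(3), 50)

query_x = np.array([[32.0, 148.0], [98.0, 121.0], [158.0, 92.0]])
predicted = knn_classify(train_x, train_y, query_x, k=5)
```

In the subject-trained variants described above, `train_x` and `train_y` would come from voxels of the subject itself whose labels are given by the registered atlas.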
Askelrod et al. (2006) [4] proposed the combination of a fast pyramidal multichannel 3D
segmentation algorithm with a high-dimensional feature space support vector machine (SVM).
The pyramid was constructed by adding aggregates at different scales, based on sets of weighted
voxels with similar intensity. The pyramid enriched the voxel gray level intensity with
attributes describing texture and shape. The feature set for the SVM classifier was composed of
these expanded attributes and registered prior knowledge of anatomic structures from
probabilistic atlases. Yi et al. (2009) [87] proposed a learning-based method based on
discriminative Random Decision Forest (RF) classification, which took into account partial
volume effects and intensity non-uniformity correction through a smooth multiplicative factor.
The authors built an RF feature space of context-rich visual information, based on raw voxel
intensities, image gradient, atlas-based probabilities and the output of the maximum a
posteriori (MAP) classifier. Partial volume voxels were modeled as additional classes and
re-assigned according to their mixing fractions. Wels et al. (2010) [85] proposed another brain
segmentation pipeline, the Discriminative Model Constrained Expectation Maximization (DMC-EM)
method, which also included probabilistic boosting trees (PBT). It integrated unsupervised
statistical EM segmentation and MRI modality-specific discriminative modeling into a Bayesian
framework. The algorithm estimated intensity non-uniformities via EM and regularized
segmentation and parameter estimation using a Markov random field (MRF) prior model which
provides knowledge about spatial and appearance-related homogeneity of segments in terms of
pairwise clique potentials of adjacent voxels. Moreover, the method incorporated into the
segmentation process unary clique potentials composed of patient-specific knowledge about the
global spatial distribution of brain tissue, obtained from MRI-specific Haar-like features and
rigidly aligned probabilistic atlas-based features. Those clique potentials were used to
classify image voxels via a probabilistic boosting tree (PBT).
3.1.2 Unsupervised methods
Unsupervised methods (U), also described in table 3.1, can be organized in three main groups.
Fuzzy c-means (FCM) is the most common clustering technique applied to MRI brain tissue
segmentation. Fuzzy partitioning is carried out through an iterative optimization of an
objective function very similar to the one used in hard k-means clustering, but weighted by a
fuzziness degree. Because of its fuzzy nature, voxels are allowed to belong to several classes,
making FCM techniques intrinsically able to deal with partial volume effects. However, fuzzy
clustering is commonly very sensitive to noise and intensity non-uniformities, since spatial
information is not taken into account in the partitioning process. Some works have addressed
this aspect by weighting the membership functions with local [2] [20] [76] [17] [53]
or non-local [18] neighboring information. Furthermore, other FCM-based strategies have been
presented to overcome intensity non-uniformities by modifying cluster spaces [44] or through
complex neighboring approaches [71]. A second group of strategies is based on evolutionary
optimization. Genetic algorithms (GA) are one family of unsupervised evolutionary optimization
techniques. They implement an adaptive heuristic global optimization inspired by the evolutionary
ideas of natural selection and genetics. GA have been used in MRI segmentation for the prior
parameter estimation of finite mixture models such as GMM [79] [42] or, more recently, to
initialize the parameters of proposed variations of the EM algorithm [78]. The third group is
composed of unsupervised artificial neural network (ANN) classifiers. Those are computational
models inspired by functional aspects of biological neural networks. In particular, self
organized maps (SOM) have been widely used in brain tissue segmentation, either as clustering
methods [28] or as optimization processes [88] [71] for Probabilistic Neural Networks (PNN).
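The iterative FCM optimization mentioned above alternates a membership update and a centroid update. The sketch below implements the standard, purely intensity-based FCM for scalar intensities, without any of the spatial regularization terms of the reviewed variants; the fuzziness degree m = 2, the synthetic intensities and the initial centroids are assumptions for illustration.

```python
import numpy as np

def fuzzy_c_means(x, centroids, m=2.0, n_iter=100, eps=1e-9):
    """Standard fuzzy c-means on scalar intensities; returns memberships and centroids."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Distance from every voxel to every centroid, shape (n_voxels, n_clusters).
        d = np.abs(x[:, None] - v[None, :]) + eps
        # Membership update: normalized inverse distances, sharpened by the fuzziness m.
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Centroid update: mean of the intensities weighted by u**m.
        um = u ** m
        v = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
    return u, v

# Synthetic intensities for three tissues (hypothetical means, standard deviation 10).
rng = np.random.default_rng(0)
intensities = np.concatenate([rng.normal(mu, 10.0, 500) for mu in (30.0, 100.0, 160.0)])

memberships, centers = fuzzy_c_means(intensities, centroids=[20.0, 90.0, 170.0])
```

Each row of `memberships` sums to one, so a voxel lying between two centroids receives a graded membership to both classes instead of a single hard label.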
Fuzzy Clustering
Several modifications of the FCM algorithm have been proposed to incorporate intensity
inhomogeneity correction. Pham et al. (2001) [61] generalized the FCM objective function to
include a spatial penalty on the membership functions (RFCM). The authors introduced a smoothing
function to control the fuzzy membership of voxels to a label by a penalty based on the
membership of a neighborhood to the other classes. The trade-off between the classic fuzzy
objective function and the smoothing was controlled by a parameter β, tuned either
experimentally or by regression. Similarly, Ahmed et al. (2002) [2] presented the FCM_S
algorithm, which also proposed a smoothing factor, modifying the objective function by the
addition of a regularization term based on neighborhood membership to the same class. The
amount of regularization had to be tuned experimentally. Chen et al. (2004) [20] proposed a
variation of FCM_S (FCM_S1), simplifying the regularization term. In this approach, spatial
information was obtained from a pre-filtered image containing the average of neighboring voxels.
Since pre-filtering was done before the clustering process, the method improved the execution
time. Szilagyi et al. (2003) [76] introduced a new approach for regularization by computing in
advance a linearly weighted sum image derived from the original image and its local
neighborhood average image (EnFCM). This process sped up the clustering, since regularization
was done in advance and the sum image was used in the minimization function instead of the
input image. However, those methods were handicapped by the manual tuning of regularization
parameters. Cai et al. (2007) [17] extended EnFCM with a fast generalized fuzzy c-means
(FGFCM) algorithm to improve the clustering results, as well as to facilitate the choice of the
neighboring control parameter. The precomputed image proposed in EnFCM was replaced by a new
image based on a different local similarity measure, combining both spatial and gray level image
information into its objective function. Krinidis et al. (2010) [53] presented a fuzzy local
information approach (FLICM). The method made use of a new fuzzy local spatial and gray level
similarity factor to guarantee noise insensitivity, image detail preservation and a neighbor
influence that depends on the distance to the voxel. Moreover, the algorithm did not require any
regularization weight to be set. Finally, Caldairou et al. (2009) [18] integrated into the FCM
segmentation methodology a regularization from a non-local (NL) denoising framework [15]. NL
exploited the similarity of small neighborhoods around a voxel within the same scene in order
to handle large neighborhoods without prior knowledge. The method combined non-local data
and regularization terms to handle intensity inhomogeneities and image noise, respectively.
Shen et al. (2005) [71] proposed a technique based on an extension of FCM, where segmentation
performance was improved by neighborhood attraction with artificial neural network
(ANN) optimization. During clustering, each voxel attempted to attract its neighboring voxels
toward its own cluster. Neighborhood attraction was based on two properties: voxel intensities
(feature attraction) and the spatial position of the neighbors (distance attraction). The
classical distance from voxels to clusters in FCM was modified to incorporate both attractions.
The degree of attraction was controlled by two parameters, λ and ξ, optimized by a simple
ANN. He et al. (2008) [44] extended the Adaptive Fuzzy C-Means (AFCM) approach
to multi-spectral segmentation with efficient intensity non-uniformity correction. The
Mahalanobis distance replaced the classical Euclidean distance between voxels and clusters in
FCM, which tends to generate clusters of equal volume with spherical occupancy in the feature
space. Conversely, the Mahalanobis distance allows clusters of different volume and
non-spherical shape in the feature space. The proposed method introduced the size and density
of the clusters into the AFCM approach using the algorithm of Gath and Geva [37] to overcome the
equal cluster volume limitation. Kalaiselvi et al. (2011) [51] proposed to modify the cluster
initialization of the FCM algorithm. The MRI intensity characteristics of brain regions were
used to initialize the centroids. The lowest value in the image histogram was used as the
cluster center for the background, while the highest intensity, 255, was fixed as the centroid
for WM. In between, an equal interval based on the peaks was assumed for the centroids of CSF
and GM.
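The difference between the Euclidean and the Mahalanobis distance discussed above can be illustrated with a small sketch. The two-dimensional feature space, the cluster center and the covariance matrix are hypothetical values chosen for the example and are not taken from He et al. [44].

```python
import numpy as np

def euclidean(x, center):
    """Euclidean distance between a feature vector and a cluster center."""
    return float(np.linalg.norm(np.asarray(x, dtype=float) - np.asarray(center, dtype=float)))

def mahalanobis(x, center, cov):
    """Mahalanobis distance, which rescales each direction by the cluster covariance."""
    diff = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return float(np.sqrt(diff @ np.linalg.inv(cov) @ diff))

# Hypothetical cluster elongated along the first feature (variance 9 against 1).
center = np.array([0.0, 0.0])
cov = np.array([[9.0, 0.0], [0.0, 1.0]])

along = np.array([3.0, 0.0])    # displaced along the elongated direction
across = np.array([0.0, 3.0])   # displaced across it

d_euclid = (euclidean(along, center), euclidean(across, center))
d_mahal = (mahalanobis(along, center, cov), mahalanobis(across, center, cov))
```

Both points are at Euclidean distance 3 from the center, but the Mahalanobis distance is 1 along the elongated direction and 3 across it, so the cluster is no longer treated as spherical.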
Evolutionary optimization
Hasanzadeh et al. (2007) [42] introduced a method based on a genetic fuzzy system for modeling
different tissues in brain MRI. A fuzzy classifier was trained for each tissue, with an
evolutionary process defined for training the classifiers. The output probability distribution
functions of these classifiers were modeled by a GMM. The Gaussian parameters were used for
voxel classification via maximum likelihood (ML) estimators and Bayesian classifiers. Similarly,
Tohka et al. (2007) [79] proposed a Gaussian Mixture Model (GMM) approach based on real-coded
genetic algorithm initialization (GA-GMM)⁶. The authors proposed to use a blended crossover to
reduce the premature convergence problem of evolutionary algorithms and a new permutation
operator specifically meant for GMM parameter estimation. The permutation operator
made it possible to impose biologically meaningful constraints on the GMM parametrization⁷. More
recently, Tian et al. (2010) [78] introduced a hybrid genetic and variational EM algorithm
for GMM (GA-VEM). The algorithm aimed to overcome the intrinsic overfitting found in EM
by combining the global optimization provided by GA with the capability of VEM to avoid
overfitting. In the variational EM approach, GMM parameters are assumed to be stochastic
variables governed by "hyperparameters", which in conjunction with the voxel labels can be
estimated in a variational extension of the EM technique. Here, GA was employed to initialize
the prior distributions of the hyperparameters involved in the VEM algorithm used to estimate
the GMM.
⁶ The implementation of the algorithm is publicly available from www.cs.tut.fi/~jupeto/gamixture.html
⁷ This method is developed in more detail in the next chapter as part of our selected segmentation methods
Neural networks
Yuanfeng et al. (2011) [88] proposed a classification method based on a 3D probabilistic neural
network (PNN). Spatial information was incorporated into the denoising process by using a
3D neighborhood system to build a robust training set. In order to make the PNN
classifier more robust to noise, the training set was used to train a SOM, and the reference
output vectors from each class were used to train the PNN. Demirhan et al. (2011) [28] presented
a method which combines the unsupervised learning of self-organizing maps (SOM)
with supervised learning vector quantization (LVQ). The authors proposed to distinguish
different tissues using a stationary wavelet transform (SWT) applied to the input volumes.
Spatial filtering of the wavelet coefficients permitted the extraction of statistical
information about brain tissues. The SOM was used to segment the input images, with feature
vectors composed of this statistical information and the SWT coefficients. LVQ optimized the
weight vectors obtained from the trained SOM to produce optimal decision boundaries.
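To give an idea of how a SOM is trained as a clustering method, the following sketch trains a one-dimensional map with three nodes on scalar intensities. It is a generic online SOM under stated assumptions (exponentially decaying learning rate and Gaussian neighborhood width, synthetic intensities, three nodes); it does not reproduce the wavelet features, the LVQ refinement or the PNN stage of the methods above.

```python
import numpy as np

def train_som_1d(x, n_nodes=3, n_steps=5000, lr0=0.3, sigma0=1.0, seed=0):
    """Train a 1D self-organizing map on scalar samples and return the node weights."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Initialize the node weights evenly over the intensity range.
    w = np.linspace(x.min(), x.max(), n_nodes)
    positions = np.arange(n_nodes)
    for t in range(n_steps):
        frac = t / n_steps
        # Learning rate and neighborhood width decay exponentially over time.
        lr = lr0 * (0.01 / lr0) ** frac
        sigma = sigma0 * (0.1 / sigma0) ** frac
        sample = x[rng.integers(x.size)]
        # Best matching unit: the node whose weight is closest to the sample.
        bmu = np.argmin(np.abs(w - sample))
        # Gaussian neighborhood on the map: nodes near the BMU are updated as well.
        h = np.exp(-((positions - bmu) ** 2) / (2.0 * sigma**2))
        w += lr * h * (sample - w)
    return w

# Synthetic intensities for three tissues (hypothetical means, standard deviation 10).
rng = np.random.default_rng(1)
intensities = np.concatenate([rng.normal(mu, 10.0, 500) for mu in (30.0, 100.0, 160.0)])

weights = train_som_1d(intensities)
# Each voxel is assigned to its best matching node.
labels = np.argmin(np.abs(intensities[:, None] - weights[None, :]), axis=1)
```

Once the neighborhood has shrunk, each node weight acts as a prototype of one intensity cluster, and assigning every voxel to its closest node gives a hard segmentation.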
3.2 Reported results
Automatic brain tissue segmentation methods are usually evaluated using different quantitative
measures on both synthetic and real MRI volumes. Among these measures, common coefficients
found in the literature include the Jaccard [47] and Dice [30] similarity indexes, as well as
statistical measures used in pattern recognition such as sensitivity and specificity [32].
Although several published works reported an evaluation using real patient scans from hospitals,
public MRI databases are commonly used to compare reported results between studies. In
particular, simulated scans from the Brainweb database [22] and real T1w scans from the Internet
Brain Segmentation Repository (IBSR)⁸ are widely used.
3.2.1 Brainweb
Standard simulations from the Brainweb database⁹ have parameter settings fixed to 3 modalities
(T1, T2 and PD), 5 slice thicknesses (1, 3, 5, 7 or 9 mm), 6 levels of noise (0, 1, 3, 5, 7 or
9%) and 3 levels of intensity non-uniformity (0, 20 or 40%), defining a volume of 181x217x181
voxels (x, y, z). An anatomical brain model is employed to generate the simulated brain MRI
data. It consists of a set of 3-dimensional "fuzzy" tissue membership volumes, one for each
tissue class. Tissue classes include GM, WM and CSF but also muscle, fat or skin. The voxel
values in these volumes reflect the proportion of tissue present in that voxel¹⁰. The brain
model used to generate the simulations can also be employed as ground truth. 1 mm simulations
provide a discrete ground-truth model, while for other slice thicknesses the anatomical model is
fuzzy. Figure 3.1 depicts the effect of noise and inhomogeneity on simulated data. From scans
without noise (figs. (a), (b) and (c)), it can be observed how bias increases the apparent
intensity of voxels due to abrupt changes in the distribution of voxel intensities. Furthermore,
noise corruption can be recognized in volumes with 0% bias (figs. (a), (d) and (g)) as a decrease
of intensity due to noise addition.
(a) n=0%, b=0% (b) n=0%, b=20% (c) n=0%, b=40% (d) n=3%, b=0%
(e) n=3%, b=20% (f) n=3%, b=40% (g) n=7%, b=20% (h) n=7%, b=40%
Figure 3.1: Brainweb generated dataset for different noise levels (n) and biases (b).
⁸ http://www.cma.mgh.harvard.edu/ibsr/data.html
⁹ http://mouldy.bic.mni.mcgill.ca/brainweb/selection_normal.html
¹⁰ http://mouldy.bic.mni.mcgill.ca/brainweb/anatomic_normal.html
3.2.2 IBSR
IBSR is a public dataset from the Center for Morphometric Analysis at Massachusetts General
Hospital¹¹. The IBSR provides two datasets with 20 (IBSR20) and 18 (IBSR18) raw image
scans from healthy subjects. Table 3.2 shows the specifications of both datasets. IBSR20 data
are available in 8 or 16 bits while IBSR18 data are only available in 8-bit format. Labelled
volumes for evaluation in IBSR20 were produced by trained investigators using a semi-automated
intensity contour mapping algorithm [34] and signal intensity histograms. IBSR18
data are provided with segmentation results of 43 individually labeled principal gray and white
matter structures of the brain.

Table 3.2: Summary of publicly available databases on IBSR repository [85]
                  IBSR18                               IBSR20
Source            http://www.cma.mgh.harvard.edu/ibsr  http://www.cma.mgh.harvard.edu/ibsr
Volume size       256 × 256 × 128                      256 × 65 × 256
Voxel spacing     0.84 × 0.84 × 1.5 mm³                1.0 × 3.1 × 1.0 mm³
                  0.94 × 0.94 × 1.5 mm³
                  1.00 × 1.00 × 1.5 mm³
Modality          T1                                   T1
Number of scans   18                                   20

¹¹ Available at http://www.cma.mgh.harvard.edu/ibsr
3.2.3 Evaluation measures
Quantitative evaluations are commonly based on the comparison between the segmentation
results and a volume manually labelled by an expert, or ground truth. Usually, intra/inter-observer
variability is mitigated by using labelled volumes from more than one expert. Still,
this is not a sufficient condition and it is difficult to find a consensus among experts. Warfield
et al. (2004) [84] proposed an algorithm for simultaneous truth and performance level
estimation (STAPLE). The method took a collection of segmentations of an image, and computed
simultaneously a probabilistic estimate of the true segmentation and a measure of the
performance level represented by each segmentation. However, most of the reviewed studies report
evaluations based on statistical measures derived from classification rates with respect
to the ground truth such as true positive (TPR), true negative (TNR), false positive (FPR)
and false negative (FNR) rates. For the classification of a single tissue, these rates are defined as:
• TPR is the percentage of voxels classified as tissue by the method that are labeled as
tissue by the expert.
• TNR is the percentage of voxels classified as non-tissue by the method that are labeled
as non-tissue by the expert.
• FPR is the percentage of voxels classified as tissue by the method that are labeled as
non-tissue by the expert.
• FNR is the percentage of voxels classified as non-tissue by the method that are labeled
as tissue by the expert.
From these rates, sensitivity or true positive fraction (TPF) is the classifier's ability to
correctly identify tissue voxels [32]. It can be defined as:
\[ \mathrm{sensitivity} = \frac{|TPR|}{|TPR| + |FNR|} \tag{3.1} \]
Similarly, specificity is defined as the classifier's ability to identify non-tissue voxels:
\[ \mathrm{specificity} = \frac{|TNR|}{|TNR| + |FPR|} \tag{3.2} \]
The accuracy of the classifier is usually computed as the rate of correctly predicted voxels over
all predicted voxels. Hence,
\[ \mathrm{accuracy} = \frac{|TPR| + |TNR|}{|TPR| + |TNR| + |FNR| + |FPR|} \tag{3.3} \]
and conversely, the error rate of the classifier is given by the misclassified voxels over all
predicted voxels as:
\[ \mathrm{error} = \frac{|FPR| + |FNR|}{|TPR| + |TNR| + |FNR| + |FPR|} \tag{3.4} \]
Furthermore, similarity indexes can be used to measure the accuracy of the method. The Dice
coefficient [30] measures the set agreement between the classification and the ground truth:
\[ \mathrm{dsc} = \frac{2 \cdot |TPR|}{2 \cdot |TPR| + |FPR| + |FNR|} \tag{3.5} \]
Analogously, the Jaccard similarity index [47], also known as the Tanimoto coefficient, measures
the overlap between the segmentation results and the ground truth as:
\[ j = \frac{|TPR|}{|TPR| + |FPR| + |FNR|} \tag{3.6} \]
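As an illustration, the measures of Eqs. 3.1 to 3.6 can be computed from two binary masks as in the following minimal Python sketch. It is not part of the evaluation code of this work: the function name `overlap_measures`, the flattened 0/1 mask representation and the NaN convention for empty denominators are assumptions made for the example, and the quantities |TPR|, |TNR|, |FPR| and |FNR| are taken as voxel counts.

```python
from typing import Dict, Sequence


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def overlap_measures(pred: Sequence[int], truth: Sequence[int]) -> Dict[str, float]:
    """Compute Eqs. 3.1-3.6 for one tissue from two flattened binary masks.

    pred  : 1 where the method labels the voxel as tissue, 0 otherwise.
    truth : 1 where the expert labels the voxel as tissue, 0 otherwise.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of voxels")
    # Voxel counts for the four possible outcomes.
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    total = tp + tn + fp + fn
    return {
        "sensitivity": _ratio(tp, tp + fn),          # Eq. 3.1
        "specificity": _ratio(tn, tn + fp),          # Eq. 3.2
        "accuracy": _ratio(tp + tn, total),          # Eq. 3.3
        "error": _ratio(fp + fn, total),             # Eq. 3.4
        "dice": _ratio(2 * tp, 2 * tp + fp + fn),    # Eq. 3.5
        "jaccard": _ratio(tp, tp + fp + fn),         # Eq. 3.6
    }


# Toy example with 8 voxels: 3 TP, 3 TN, 1 FP and 1 FN.
example = overlap_measures([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 0, 1, 1])
```

In a multi-class setting, the same function would be called once per tissue, using the voxels of that tissue as the positive class.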
Other measures based on intensity, distance or connectivity can be used, as reported by
Cardenas et al. [19]. However, the Dice coefficient is the most widely used measure to
quantitatively evaluate the accuracy of brain tissue segmentation. This can also be seen in table
3.1, where Dice is employed in 16 out of 26 reviewed methods. We extend the list of measures with
two metrics commonly used in volumetric quantification of brain tissue and tissue atrophy in
MS disease. The Fractional Brain Tissue (FBT) of a given class returns the normalized fraction of
the given tissue in the brain. It is defined as the number of voxels which are classified as the given
class divided by all brain voxels. Hence:
\[ FBT_{CSF} = \frac{CSF}{CSF + GM + WM} \tag{3.7} \]
\[ FBT_{GM} = \frac{GM}{CSF + GM + WM} \tag{3.8} \]
\[ FBT_{WM} = \frac{WM}{CSF + GM + WM} \tag{3.9} \]
Moreover, experts usually measure brain atrophy in MS patients using the Brain Parenchymal
Fraction (BPF), which is defined as the number of GM, WM and lesion tissue voxels L
divided by all the brain voxels. A decrease in BPF over time might give early diagnostic
clues about the onset of MS disease:
\[ BPF = \frac{GM + WM + L}{CSF + GM + WM} \tag{3.10} \]
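The volumetric measures of Eqs. 3.7 to 3.10 reduce to voxel counting on a labelled volume. The following minimal Python sketch shows one way to compute them. The label coding (0 = background, 1 = CSF, 2 = GM, 3 = WM, 4 = lesion) and the function name `tissue_fractions` are assumptions made for the example, and the denominator follows Eq. 3.10 exactly as written, that is, without the lesion voxels.

```python
from collections import Counter
from typing import Dict, Sequence


def tissue_fractions(
    labels: Sequence[int], csf: int = 1, gm: int = 2, wm: int = 3, lesion: int = 4
) -> Dict[str, float]:
    """Compute FBT (Eqs. 3.7-3.9) and BPF (Eq. 3.10) from a flattened label volume."""
    counts = Counter(labels)
    n_csf, n_gm, n_wm, n_les = counts[csf], counts[gm], counts[wm], counts[lesion]
    brain = n_csf + n_gm + n_wm  # denominator shared by Eqs. 3.7-3.10
    if brain == 0:
        raise ValueError("the volume contains no brain voxels")
    return {
        "FBT_CSF": n_csf / brain,
        "FBT_GM": n_gm / brain,
        "FBT_WM": n_wm / brain,
        "BPF": (n_gm + n_wm + n_les) / brain,
    }


# Toy volume: 5 background, 2 CSF, 4 GM and 4 WM voxels, no lesions.
healthy = tissue_fractions([0] * 5 + [1] * 2 + [2] * 4 + [3] * 4)
```

Background voxels are ignored because they do not appear in any numerator or denominator.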
3.2.4 Results analysis
Published results are grouped by database. Although some works introduce other accuracy
measurements, only Jaccard and Dice similarity indexes are considered. Moreover, T1 modality
results are preferred to multispectral results when possible, to obtain an unbiased comparison
between methods. Some of the surveyed works compare their segmentation accuracy with
other approaches present in our study. Results for those approaches are included here when the
original work does not report results for the same database and similarity index.
Brainweb database
Table 3.3 summarizes the results obtained by surveyed works using the Brainweb database. In all
works, WM and GM segmentation accuracy is evaluated with the Dice metric. A single
study additionally reports overlap measures based on the Jaccard index. CSF is not evaluated
in all methods because most studies are only interested in GM and WM volumetric measures.
Since Brainweb simulated scans can be configured with different noise and intensity
inhomogeneity levels, information about the experimental setup is also included in the table.
In general, methods achieved very high similarity indexes (≥ 0.83) even with high
levels of inhomogeneity and noise. With 0% inhomogeneity bias, Caldairou et al. (2009) [18]
implemented the RFCM approach defined in Pham et al. (2001) [61]. The RFCM method
achieved Dice overlap coefficients of 0.91 and 0.93 for GM and WM respectively. The
AMRF strategy from Awate et al. (2006) [8] reported a similar index for GM but a higher
Dice overlap for WM (0.92 and 0.96). Those results were outperformed by Marroquin et al.
BrainWeb
                                      Jaccard index       Dice index
Algorithm (author and year)           csf   gm    wm      csf   gm    wm      Observations
GMM (Ashburner et al. (2005) [5])     -     -     -       -     0.93  0.96    40% bias, 0% noise
AMRF (Awate et al. (2006) [8])        -     -     -       -     0.92  0.96    0% bias, 0% noise
HMRF (Bricq et al. (2008) [14])       -     -     -       -     0.975 0.98    20% bias, 0% noise
EM (Marroquin et al. (2002) [57])     -     -     -       -     0.96  0.97    0% bias, 1% noise
MRF (Scherrer et al. (2008) [69])     -     -     -       0.79  0.91  0.93    Avg all vols
DMC-EM (Wels et al. (2011) [?])       -     -     -       0.77  0.92  0.94    20% bias, 0% noise
SOM (Hasanzadeh et al. (2007) [42])   0.91  0.90  0.91    0.95  0.94  0.95    20% bias, 3% noise
RFCM (Pham et al. (2001) [61])        -     -     -       0.92  0.91  0.93    From [18]
AFCM (He et al. (2008) [44])          -     -     -       0.98  0.90  0.91    40% bias, 3% noise
PSOM (Yuanfeng et al. (2011) [88])    -     -     -       -     0.91  0.93
RFCM (Caldairou et al. (2009) [18])   -     -     -       0.87  0.83  0.86    20% bias, 9% noise
Table 3.3: Surveyed works based on Brainweb database and Dice or Jaccard indexes. The acronyms
for the algorithms stand for: Adaptive Fuzzy C-means (AFCM), Adaptive Markov Random Fields
(AMRF), Discriminative Model Constrained Expectation Maximization (DMC-EM), Expectation
Maximization (EM), Gaussian Mixture Model (GMM), Markov Random Fields (MRF), Probabilistic
Self Organized Map (PSOM), Robust Fuzzy C-Means (RFCM), Self Organized Map (SOM), Support
Vector Machines (SVM).
(2002) [57] with a classical EM implementation (0.96 and 0.97). Bricq et al. (2008) [14] and
Wels et al. (2011) [?] employed simulations with 20% inhomogeneity bias and 0% Rician noise,
with dissimilar results. Bricq et al. reported higher indexes for GM and WM (0.975 and
0.98) than Wels et al. (0.92 and 0.94). Results lower than those of Bricq et al. were also obtained
by the SOM approach of Hasanzadeh et al. (2007) [42] when the Rician noise was increased to 3%
(0.94 and 0.95). The same behavior was seen in Caldairou et al. (2009) [18] with 9% Rician noise,
where results were significantly lower (0.83 and 0.86). Two studies used simulations with 40%
intensity inhomogeneity. The GMM method of Ashburner et al. (2005) [5] reported Dice values of
0.93 and 0.96 for GM and WM respectively without Rician noise. The AFCM of He et al. (2008) [44]
increased the Rician noise to 3%, again with lower results (0.90 and 0.91).
Since not all the works reported CSF results, Dice indexes for CSF are discussed
together. The best result was obtained by the AFCM approach of He et al. (2008) [44], with
a Dice metric of 0.98 even on a simulation with 40% bias. Similar results were reported
by Hasanzadeh et al. (2007) [42] (0.95), and significantly lower results by Caldairou et al.
(2009) [18] (0.87). The worst result was found for the DMC-EM approach of Wels et al. (2011)
[85], with a Dice similarity index for CSF of 0.77.
IBSR database
Table 3.4 summarizes the results obtained from surveyed works using real T1 scans from the IBSR
database. Again, CSF is not evaluated in all methods. From 18 surveyed works, 10 studies
evaluate segmentation results using the Dice index, 8 with the Jaccard index and 4 with both
measures. The IBSR database of normal subjects is composed of 20 scans with different levels of
difficulty. Some of the volumes are corrupted by acquisition artifacts, as shown in section 2.2.3.
Table 3.4 shows the number of scans employed in each experiment.
IBSR
                                      Jaccard index       Dice index
Algorithm (author and year)           csf   gm    wm      csf   gm    wm      Observations
GMM (Ashburner et al. (2005) [5])     -     -     -       -     0.78  0.85    n=18. From [18]
SWA (Askelrod et al. (2007) [3])      -     -     -       0.83  0.86  0.87    n=18
SVM (Askelrod et al. (2006) [4])      0.34  0.68  0.66    0.51  0.81  0.80    n=20
AMRF (Awate et al. (2006) [8])        -     -     -       -     0.80  0.88    n=18
HMRF (Bricq et al. (2008) [14])       -     -     -       -     0.8   0.86    n=18
MS (Jimenez et al. (2006) [49])       0.21  0.59  0.62    -     -     -       n=20
DMC-EM (Wels et al. (2011) [85])      0.62  0.73  0.77    0.76  0.83  0.87    n=18
RFCM (Pham et al. (2001) [61])        -     -     -       -     0.84  0.86    n=18. From [18]
RF (Yi et al. (2009) [87])            0.61  0.83  0.73    0.69  0.90  0.83    n=20
HMRF+EM (Zhang et al. (2001) [89])    0.03  0.52  0.49    -     -     -       n=20. From [78]
WSOM (Demirhan et al. (2011) [28])    -     0.65  0.54    -     0.70  0.78    n=10
GA-GMM (Tohka et al. (2007) [79])     0.07  0.63  0.60    -     -     -       n=20. From [78]
GA-VEM (Tian et al. (2011) [78])      0.20  0.70  0.57    -     -     -       n=20
RFCM (Caldairou et al. (2009) [18])   -     -     -       -     0.83  0.84    n=18
Table 3.4: Surveyed works based on IBSR database and Dice or Jaccard indexes. The acronyms
for the algorithms stand for: Adaptive Markov Random Fields (AMRF), Discriminative Model
Constrained Expectation Maximization (DMC-EM), Gaussian Mixture Model (GMM), Genetic Algorithm
(GA), Hidden Markov Random Fields (HMRF), Markov Random Fields (MRF), Mean-shift (MS), Random
Forests (RF), Robust Fuzzy C-Means (RFCM), Segmentation by Weighting Aggregates (SWA),
Self Organized Map (SOM), Support Vector Machines (SVM), Variational Expectation Maximization
(VEM), Weighted Self Organizing Map (WSOM).
There is a certain correlation between Jaccard and Dice indexes. Typically, Jaccard similarity
values are lower than Dice values. This behavior can also be seen in surveyed methods reporting
both similarity measures [4] [85] [28] [87]. Among those reporting the Dice coefficient, Askelrod
et al. (2006) [4] and Yi et al. (2009) [87] employed the IBSR20 dataset. The RF method proposed
by Yi et al. (2009) [87] outperformed the SVM classifier proposed by Askelrod et al. (2006) [4],
with Dice values of 0.90 and 0.83 for GM and WM respectively, compared with those
obtained by SVM (0.81 and 0.80). Awate et al. (2006) [8], Ashburner et al. (2005) [5],
Askelrod et al. (2007) [3], Bricq et al. (2008) [14], Pham et al. (2001) [61] and Caldairou et
al. [18] employed the IBSR18 dataset. The SWA method from Askelrod et al. (2007) reported the
best overall results for GM and WM (0.86 and 0.87). MRF approaches from Awate et al. and
Bricq et al. both reported close values, around (0.8 and 0.88) and (0.84 and 0.86) respectively.
Similarly, the Dice coefficients of the RFCM approaches of Pham et al. and Caldairou et al. were
also similar (0.84 and 0.86 versus 0.83 and 0.84, see Table 3.4). However, the Gaussian Mixture
Model (GMM) approach seemed to perform worse on GM, as reported by Caldairou [18], with Dice
values of 0.78 and 0.85.
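The correlation between both indexes noted at the beginning of this discussion follows directly from Eqs. 3.5 and 3.6: for a single segmentation, eliminating |FPR| + |FNR| between the two equations gives j = dsc / (2 − dsc), so the Jaccard index can never exceed the Dice coefficient. The short Python sketch below illustrates the conversion. The function names are chosen for the example only, and the relation is exact per segmentation but only approximate for values averaged over several subjects, as in the tables above.

```python
def jaccard_from_dice(dsc: float) -> float:
    """Convert a Dice coefficient into the equivalent Jaccard index."""
    return dsc / (2.0 - dsc)


def dice_from_jaccard(j: float) -> float:
    """Convert a Jaccard index into the equivalent Dice coefficient."""
    return 2.0 * j / (1.0 + j)


# Dice value reported for GM by the SVM approach [4] in the IBSR table.
svm_gm_jaccard = jaccard_from_dice(0.81)
```

Rounded to two decimals, the converted value agrees with the Jaccard index of 0.68 reported for GM by the same work.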
All the methods evaluated only on the Jaccard index employed IBSR20 scans. Tian et al.
(2011) reported results for the GA-VEM approach [78] and also implemented the GA-GMM
approach developed by Tohka et al. [79] and the HMRF+EM approach from Zhang et al. (2001) [89],
comparing their work with both. In their comparison, the GA-VEM approach outperformed GA-GMM
and HMRF+EM when segmenting GM, with Jaccard indexes for GM and WM of (0.70 and 0.57)
compared with GA-GMM (0.63 and 0.60) and HMRF+EM (0.52 and 0.49). Conversely, GA-GMM seemed
to slightly outperform the other methods when segmenting WM. Furthermore, the MS method from
Jimenez et al. (2006) [49] reported the best overlap results when segmenting WM (0.59 and 0.62).
Some methods returned a very low Jaccard index for CSF, such as GA-VEM [78], GA-GMM
[79], HMRF+EM [89] and MS [49]. Low indexes are caused by differences between
the ground truth and the CSF probabilities expected by those methods. However, the SWA
method from Askelrod et al. (2007) again reported the best overall results for CSF. In general,
methods that returned very high Dice indexes for GM and WM also reported a high index for
CSF, as seen in Wels et al. [85] and Yi et al. [87].
Chapter 4
Proposal
The main motivation of this master thesis is two-fold: first, to perform an exhaustive
comparative evaluation of existing state-of-the-art brain tissue segmentation methods using T1w
data, which is the most used in tissue classification; and second, to extend the evaluation with a
quantitative analysis of how MS lesions affect tissue classification. In order to generalize our
findings as much as possible, methods are selected to balance state-of-the-art methods
that follow different segmentation strategies with the methods most commonly used in neuroimaging.
Our analysis of the most recent techniques presented in chapter 3 has shown accurate re-
sults using different strategies. Methods which used a priori information to introduce spatial
information into the segmentation process returned higher Dice metrics. Similarly, clustering
methods with spatial regularization also reported high accuracy both in synthetic and real data.
In this chapter, we propose an evaluation pipeline based on 8 state-of-the-art tissue
segmentation methods, which includes skull extraction, intensity inhomogeneity
correction and tissue segmentation. Four of these methods are publicly available and include
two segmentation packages widely used by the neuroimaging community: FAST¹
and SPM, in both its SPM5 and SPM8 versions². We also run an implementation of a GMM
approach with heuristic optimization (GAGMM)³. This method performs tissue classification
by optimizing the tissue probabilities of the GMM with a modified real-coded GA instead of
the classic EM algorithm. Moreover, we complete the set of methods by implementing two fuzzy
clustering approaches developed in Pham et al. [60], a KNN classifier derived from De Boer et
al. [27] and an unsupervised Neural Network method based on the work of Tian et al. [77].
Our pipeline comprises 3 different stages, as illustrated in figure 4.1. From a T1w scan, a
¹ http://www.fmrib.ox.ac.uk/fsl/fast4/index.html
² http://www.fil.ion.ucl.ac.uk/spm/software/
³ http://www.loni.ucla.edu/Software/GAMixture
Figure 4.1: Scheme for our pipeline approach. From a T1w scan, non-brain parts of the MRI scan are
stripped and brain voxels are corrected for intensity inhomogeneities. Corrected voxels are classified
into one of the three tissue classes (CSF, GM and WM). Accuracy is evaluated by comparing
the returned tissue classification with the provided ground truth.
preprocessing stage is carried out before tissue classification. The first step is to strip the
non-brain parts of the MRI scan. After skull stripping, intensity inhomogeneity correction is
applied to all scans in order to reduce image artifacts. In the second stage, intensity-corrected
voxels are classified as CSF, GM or WM tissue. Finally, in the evaluation stage, the accuracy of
the segmentation is measured by comparing the returned tissue classification with the provided
ground truth. Every stage is described in more detail in the next sections.
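The three stages described above can be read as a simple function composition. The following Python sketch shows only this structure. The stage functions are toy stand-ins written for the example (a positive-intensity mask, an identity bias correction and fixed intensity thresholds) and are hypothetical: they do not reproduce BET, N3 or any of the evaluated segmentation methods.

```python
from typing import Callable, List, Sequence


def run_pipeline(
    scan: Sequence[float],
    truth: Sequence[int],
    skull_strip: Callable[[Sequence[float]], List[int]],
    correct_bias: Callable[[Sequence[float], Sequence[int]], List[float]],
    segment: Callable[[Sequence[float], Sequence[int]], List[int]],
    evaluate: Callable[[Sequence[int], Sequence[int]], float],
) -> float:
    """Chain preprocessing, tissue classification and evaluation."""
    mask = skull_strip(scan)              # stage 1a: brain mask
    corrected = correct_bias(scan, mask)  # stage 1b: intensity correction
    labels = segment(corrected, mask)     # stage 2: CSF/GM/WM labels
    return evaluate(labels, truth)        # stage 3: accuracy measure


# Toy stand-ins (hypothetical, for illustration only).
def toy_skull_strip(scan: Sequence[float]) -> List[int]:
    return [1 if v > 0 else 0 for v in scan]


def toy_correct_bias(scan: Sequence[float], mask: Sequence[int]) -> List[float]:
    return [float(v) if m else 0.0 for v, m in zip(scan, mask)]


def toy_segment(scan: Sequence[float], mask: Sequence[int]) -> List[int]:
    # 0 = background, 1 = CSF, 2 = GM, 3 = WM, using arbitrary T1 thresholds.
    return [0 if not m else 1 if v < 50 else 2 if v < 100 else 3
            for v, m in zip(scan, mask)]


def toy_dice_gm(labels: Sequence[int], truth: Sequence[int]) -> float:
    tp = sum(1 for a, b in zip(labels, truth) if a == 2 and b == 2)
    size = list(labels).count(2) + list(truth).count(2)
    return 2.0 * tp / size if size else float("nan")


gm_dice = run_pipeline([0, 30, 80, 90, 120, 130], [0, 1, 2, 3, 3, 3],
                       toy_skull_strip, toy_correct_bias, toy_segment, toy_dice_gm)
```

Passing the stages as arguments keeps the preprocessing identical for every segmentation method, which is the purpose of using tools external to the evaluated methods.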
4.1 Preprocessing
The preprocessing step is composed of two main processes: skull stripping and intensity
inhomogeneity correction. These preprocessing steps are required to reduce misclassification
errors in the later tissue estimation. Normally, the skull is stripped before intensity correction to
avoid deviations in tissue intensity distributions caused by the skull. Both concepts have been
introduced in chapter 2, as part of the intrinsic problems inherent to MRI. Here, we focus more
on the implementation aspects of the pipeline.
4.1.1 Skull stripping
Brain skull extraction from MRI images has been studied in several works comparing
state-of-the-art techniques. Publicly available methods such as statistical parametric mapping
(SPM2) [6], the brain extraction tool (BET) [74] or the brain surface extractor (BSE) [66] have been
evaluated against new proposals. Boesen et al. [13] compared those methods with their own
McStrip proposal [63] on T1w data. Hartley et al. [40] compared only the accuracy of BET
and BSE on 296 PDw images of Asian subjects.
Like the FAST segmentation approach, BET [74] is part of the FSL package⁴, but both tools are
independent. Furthermore, SPM5 extracts brain tissue internally, with the option to save the
skull-stripped volume. However, using an internal feature of SPM5 as the skull-stripping step
for all methods could benefit this algorithm to the detriment of the others. In order to be as
independent as possible, BET is used as the skull-stripping preprocessing tool. In all the
experiments, we run the BET2 version of the program⁵ with the default configuration, only setting
the output option to obtain a mask of the brain tissue. Brain masks make it possible to discard
non-tissue voxels such as background, which also speeds up the segmentation process.
4.1.2 Intensity correction
Three out of eight methods (FAST, SPM5 and SPM8) implement an internal intensity
inhomogeneity correction. Omitting intensity correction as a step prior to segmentation would
favor these methods over the others in noisy images. Here, we propose to correct image
artifacts as a previous step and to disable the internal feature of those 3 methods. This strategy
has also been used previously in a quantitative study of those 3 methods together with the KNN
approach proposed by De Boer et al. [26].
Two different studies have recently reviewed different ways to perform intensity correction
[83] [45]. Both studies classify these methods into various groups: segmentation-based,
filtering-based, surface fitting-based, histogram-based and other specific techniques. However, as
pointed out by Hou [45], none of the methods has been shown to be superior to the others and
exclusively applicable. Again, FAST, SPM5 and SPM8 return a corrected image as part of their
segmentation output. Consistent with the skull stripping tool selection, we decided to use
the external bias correction tool N3 proposed by Sled et al. [73], which is part of the BIC package
⁴ http://www.fmrib.ox.ac.uk/fsl/
⁵ BET2 is the default version of the program in the FSL 4.1 package
Figure 4.2: Proposed preprocessing stage: (a) Input scan from SALEM. (b) Skull-stripped scan
returned by BET. (c) Corrected scan returned by N3. (d) Bias field returned by N3.
developed by the McConnell Brain Imaging Center of the Montreal Neurological Institute⁶.
Figure 4.2 shows a real example of the preprocessing stage using the selected tools. For an input
scan (Figure 4.2(a)), non-tissue parts are removed using BET (Figure 4.2(b)) and the skull-free
scan is corrected with N3 (Figure 4.2(c)). As it is very hard to appreciate the differences between
the original and corrected versions, Figure 4.2(d) shows the bias field returned by N3.
4.2 Tissue segmentation methods
In the following section, we analyze the 8 methods which are introduced into the pipeline for
brain tissue classification. FAST, SPM5, SPM8 and GAGMM are publicly available to download. In
these cases, we introduce the software and briefly explain their implementation approaches.
Furthermore, we have implemented two fuzzy clustering approaches, a SOM-derived Neural
Network and a KNN classifier, which are introduced in more detail. The set of methods defines
a representative sample of the different strategies found in the state of the art. All of them
will be evaluated on synthetic and real T1w scans of healthy subjects and also on scans with
different MS lesion loads.
4.2.1 Tissue classification with FAST
The FMRIB’s Automated Segmentation Tool (FAST) is a segmentation program included in the
FSL package and developed by the FMRIB group in Oxford. FAST is a standalone program
which can be used either from a GUI or from the command line. The whole process is
fully automated and can also produce a bias field-corrected input image and a probabilistic
and/or partial volume tissue segmentation. Because the method is robust and reliable, it is
⁶ N3 is available as part of the MINC-TOOLS package. www.bic.mni.mcgill.ca/ServicesSoftware/HomePage
commonly used by the neuroimaging community as a quantitative tissue evaluation tool, but
also to compare results with other proposed approaches.
FAST is based on the work of Zhang et al. [89]. The authors proposed a parametric
EM-based approach with Hidden Markov Random Fields (HMRF) and thresholding-based
initialization as a substitute for the widely used GMM, which is more sensitive to noise. MRF
theory was used to include spatial information from neighboring voxels. In an MRF, spatial
information in an image is modeled through contextual constraints that characterize the mutual
influences among neighboring voxels using conditional MRF distributions. Moreover, the HMRF
parameters are estimated using the iterative Expectation Maximization (EM) algorithm. Bias
correction was incorporated by introducing the algorithm proposed by Guillemaud et al. [39]
into the EM parameter estimation.
Basically, the HMRF-EM framework for brain MRI tissue classification follows the
EM approach. A first segmentation is done by image thresholding, which also provides an
initial parameter estimation for the HMRF. With this parameter initialization, the iterative
EM algorithm is started. The algorithm seeks a solution for three dependent unknowns: the
bias field, the image classification and the involved model parameters. At each iteration, the
bias field and class labels are estimated by MRF-MAP estimation, and then posterior
probabilities are computed taking into account the obtained bias field correction and estimated
labels. Finally, the parameters of the model are updated for the next iteration. The algorithm
stops when a maximum number of iterations has been performed.
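To illustrate how neighboring labels can influence the MRF-MAP labeling step, the following Python sketch applies iterated conditional modes (ICM), one common way to approximate a MAP estimate, to a one-dimensional signal with a Potts-like neighborhood penalty. It is a strongly simplified illustration and not the FAST implementation: the class parameters are kept fixed, there is no bias field or EM parameter update, the initialization uses the nearest class mean instead of thresholding, and the names `icm_labels` and `beta` are assumptions made for the example.

```python
import math
from typing import List, Sequence


def icm_labels(
    x: Sequence[float],
    mu: Sequence[float],
    var: Sequence[float],
    beta: float = 1.0,
    n_sweeps: int = 5,
) -> List[int]:
    """Label a 1-D signal with Gaussian class models and a Potts-like prior."""
    k = len(mu)
    # Initial labels: nearest class mean (no spatial information yet).
    labels = [min(range(k), key=lambda c: abs(xi - mu[c])) for xi in x]
    for _ in range(n_sweeps):
        changed = False
        for i, xi in enumerate(x):
            neighbours = [labels[j] for j in (i - 1, i + 1) if 0 <= j < len(x)]

            def energy(c: int) -> float:
                # Negative Gaussian log-likelihood (up to a constant).
                data = (xi - mu[c]) ** 2 / (2.0 * var[c]) + 0.5 * math.log(var[c])
                # Penalty for every neighbour carrying a different label.
                prior = beta * sum(1 for n in neighbours if n != c)
                return data + prior

            best = min(range(k), key=energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


signal = [1.0, 1.2, 0.9, 5.2, 1.1, 1.0, 9.0, 8.8, 9.1, 9.2]
smoothed = icm_labels(signal, mu=[1.0, 9.0], var=[1.0, 1.0], beta=1.0)
unsmoothed = icm_labels(signal, mu=[1.0, 9.0], var=[1.0, 1.0], beta=0.0)
```

With `beta = 0` the isolated value 5.2 keeps the label of the closest mean, whereas with `beta = 1` its two neighbors pull it into the first class, which is the kind of contextual constraint described above.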
Along with the discrete segmentation of brain tissue and the intensity-corrected volumes, FAST
can be set to provide outputs for each class with tissue probabilities and PVE. Figure 4.3 depicts
the segmentation output for the input image (Figure 4.3(a)). The program returns three volumes
with tissue classification probabilities (Figure 4.3(b)), a discrete segmentation (Figure 4.3(c))
and a PVE segmentation with added labels for CSF/GM and GM/WM tissues (Figure 4.3(d)).
Although we have implemented a wrapper to call all the functionalities of FAST from the MATLAB
environment, we call the program with default options in all the experiments, except that
intensity uniformity correction is disabled.
4.2.2 Tissue classification with SPM
Statistical Parametric Mapping (SPM) is open source software developed in the MATLAB
environment by the Wellcome Department of Imaging Neuroscience at University College London.
The SPM software package has been designed for the analysis of brain imaging data sequences.
The sequences can be a series of images from different cohorts, or time-series from the same
subject. The current version of the software is SPM8, which implements two segmentation
approaches: a default segmentation method also present in the previous version SPM5 and based
Figure 4.3: FAST segmentation output. (a) Input image. (b) Probability tissue output (GM). (c) Discrete
segmentation. (d) Segmentation with PVE.
Figure 4.4: MNI prior tissue atlas used in SPM5 and SPM8. (a) Axial view. (b) Sagittal view. (c)
Coronal view.
on the work of Ashburner and Friston [5], and an improvement of this method called New Segment,
which is only available in this new version. In the following, we will denote the first approach
as SPM5, while the New Segment version will be called SPM8. SPM8 is essentially the same
algorithm as that described in [5], with several improvements in registration and extended tissue
maps.
The algorithm present in SPM5 is basically a probabilistic brain segmentation framework
based on a GMM with prior atlas information. Brain tissue distributions are estimated by a
Gaussian Mixture Model, where the initial probabilities for each tissue are obtained by
registering a probabilistic atlas to the input scan. Figure 4.4
shows the prior tissue probability atlases used in SPM. The method combines
image registration, tissue classification and bias correction in a single model, through an
objective function derived from the log-likelihood.
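The role of the atlas priors in a GMM can be illustrated with a reduced EM sketch in which the usual global mixing weights are replaced by a per-voxel prior probability for each class. This Python example is an assumption-laden simplification and not the SPM algorithm: the priors are taken as already registered and fixed, there is no bias correction or registration update, intensities are one-dimensional, and the function name `em_gmm_with_priors` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian(x: float, mu: float, var: float) -> float:
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_with_priors(
    x: Sequence[float],
    priors: Sequence[Sequence[float]],
    mu: Sequence[float],
    var: Sequence[float],
    n_iter: int = 20,
) -> Tuple[List[float], List[float], List[List[float]]]:
    """Fit class means/variances; priors[i][k] is the prior of class k at voxel i."""
    mu, var = list(mu), list(var)
    n_classes = len(mu)
    resp: List[List[float]] = []
    for _ in range(n_iter):
        # E-step: posterior of each class, weighted by the spatial prior.
        resp = []
        for xi, pi in zip(x, priors):
            w = [pi[k] * gaussian(xi, mu[k], var[k]) for k in range(n_classes)]
            s = sum(w)
            resp.append([wk / s for wk in w] if s > 0 else list(pi))
        # M-step: update the Gaussian parameters of each class.
        for k in range(n_classes):
            nk = sum(r[k] for r in resp)
            if nk > 0:
                mu[k] = sum(r[k] * xi for r, xi in zip(resp, x)) / nk
                spread = sum(r[k] * (xi - mu[k]) ** 2 for r, xi in zip(resp, x)) / nk
                var[k] = max(spread, 1e-6)  # guard against zero variance
    return mu, var, resp


intensities = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
flat_priors = [[0.5, 0.5] for _ in intensities]
means, variances, posteriors = em_gmm_with_priors(
    intensities, flat_priors, mu=[0.0, 6.0], var=[1.0, 1.0]
)
```

With flat priors the sketch reduces to an ordinary two-class mixture with fixed equal weights. Setting a prior entry to zero at a voxel excludes the corresponding class there, which is how an atlas can constrain the classification spatially.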
The program permits several output configuration options for segmented tissue: by default,
segmented images are aligned with the input image, but it is also possible to align the output
images with the prior atlas to produce normalized segmentations to be used in other processes.
In our experiments, the program has been run with default options, except that intensity
uniformity correction was disabled.
4.2.3 Tissue classification with GAGMM
GAGMM is part of the GAMIXTURE package, publicly available from the Laboratory of Neuro
Imaging at UCLA. The implementation runs in MATLAB and is based on the
work of Tohka et al. (2007) [79]. The method combines the estimation of tissue distributions by
finite mixture models (FMM) with heuristic optimization of the tissue distribution parameters. In
genetic algorithms (GA), a population of chromosomes encodes different solutions to an
optimization problem, which are improved by iterative recombination. Chromosomes are usually
represented as vectors of binary genes. However, real-coded GAs use vectors of floating point
numbers, which are able to adapt better to data distributions. A set of chromosomes forms
a population, and at each generation chromosomes are stochastically selected and recombined
by crossover and mutation to form new generations, which iteratively minimize
the objective of the optimization problem.
Basically, in GAGMM tissue segmentation is carried out by estimating Gaussian distributions
for the tissue classes. However, instead of using approaches like the EM algorithm to optimize
the distribution parameters, a GA optimizes the distributions. Population chromosomes
are here defined as vectors containing the parameters of the mixture models to be fitted to
the data. The authors proposed two modifications to the GA: a blended
crossover to avoid the premature convergence associated with flat crossovers, and a reduced
permutation operator with prior information designed to reduce the parameter search space. The
fact that the tissue distributions are modeled by Gaussian distributions gives the algorithm the
possibility to estimate the PVE as new classes that are a posteriori reclassified as GM, WM or
CSF.
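As an illustration of a blended crossover on real-coded chromosomes, the following Python sketch implements the generic BLX-α operator: each child gene is drawn uniformly from the interval spanned by the two parent genes, extended on both sides by a fraction α of its width. With α = 0 it reduces to a flat crossover, which can only sample between the parents. This is a generic textbook operator written under the assumption that it resembles the one used in GAGMM. It is not the GAMIXTURE code, and the chromosome layout in the example (means followed by variances) is hypothetical.

```python
import random
from typing import List, Optional, Sequence


def blend_crossover(
    parent_a: Sequence[float],
    parent_b: Sequence[float],
    alpha: float = 0.5,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one child produced by BLX-alpha crossover of two real-coded parents."""
    rng = rng or random.Random()
    child = []
    for a, b in zip(parent_a, parent_b):
        low, high = min(a, b), max(a, b)
        width = high - low
        # Sample inside the parent interval extended by alpha * width on each side.
        child.append(rng.uniform(low - alpha * width, high + alpha * width))
    return child


# Hypothetical chromosomes: three class means followed by three variances.
mother = [30.0, 80.0, 120.0, 25.0, 16.0, 9.0]
father = [40.0, 90.0, 110.0, 36.0, 16.0, 4.0]
child = blend_crossover(mother, father, alpha=0.5, rng=random.Random(0))
```

Because the extended interval can leave the feasible range (for example, a negative variance), a practical implementation would need to clip or repair the child before evaluating its fitness.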
GAGMM parameters can be set extensively. For the GA optimization, the population size,
crossover rates and chromosome selection can be tuned. Furthermore, the usual parameters
of an FMM, such as the number of Gaussians to estimate or the number of PVE classes, are also
available. These input parameters make the software highly configurable for a range of different
problems, but also sensitive to changes. Figure 4.5 depicts the segmentation output for different
settings of PVE. In our experiments, we use the default configuration for the GA, the number of
Gaussians is set to 3 and PVE to 2.
Figure 4.5: GAGMM output segmentation: (a) Input scan. (b) Output set not to deal with PVE.
(c) PVE=1. (d) PVE=2.
4.2.4 Tissue classification with SOM
Self organized maps (SOM) or Kohonen networks cluster data based on an iterative process of
comparison of these data with related changes within them. The SOM organizes unknown data
into groups of similar patterns, according to a similarity criterion (e.g. Euclidean distance).
Such networks can learn to detect regularities and correlations in their input and adapt their
future responses to that input accordingly, even in the presence of certain noisy data [77]. The
map preserves topological relationships between inputs in a way that neighboring inputs in the
input space are mapped to neighboring neurons in the map space.
With this approach, a competitive learning algorithm iteratively readjusts a matrix of
weights W. W is defined as f × k, where f is the
number of features of the input vector and k is the number of expected output classes. Elements
in the weighting matrix form a network representation, where each element of W works as a
weight and each column as an output neuron. In practice, network training is done by computing
the Euclidean distance between each input vector and each column of W and selecting the closest
column. The weights of the winning column are updated for every input vector, with the size of
the update scaled by a learning rate (or geometrical decrease) parameter α. Once all the input
vectors have updated W, the learning rate is decreased and the process is repeated for all the
input vectors again (epoch). Decreasing the learning rate reduces the capability of each epoch to
modify the weighting matrix. When W has not changed significantly between epochs or a maximum
number of epochs has been reached, the algorithm stops.
Brain tissue classification with SOM can be achieved by training a weighting matrix W and
then multiplying the input vectors of the input image by W. Let us say that three classes
have to be found (k = 3), and that the feature space of voxels is represented by input vectors
vi = [v1, . . . , vf ] with f feature elements. According to what has been said, W, arranged with
one row per class, will have a size of k × f. Hence, the classification of a voxel i will be given
by the vector Ci = W · vi, where
the classification result will be the element of Ci with the highest value.
SOMs have been used in some works for brain tissue classification, such as Yuanfeng et al. [88]
and Tian et al. [77] [78] [62]. Tian et al. [?] used three MRI modalities (T1, T2 and PD) as the
input vector for normal brain tissue classification (CSF, GM and WM). However, we restrict
our study to T1w images. Hence, we base our SOM approach on a one-dimensional T1 feature
space self-trained with the subject itself. Given that our feature space is reduced to f = 1 and
three main tissue classes have to be classified, the learned weighting matrix in our experiments
is reduced to the centers of the clusters of each brain tissue. After learning the final clusters,
labels are assigned to voxels by finding the cluster center at minimum absolute distance from
each image voxel.
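For the one-dimensional case described above, the competitive learning scheme can be sketched in a few lines of Python. The example uses the standard winner-take-all update c ← c + α (x − c) without a neighborhood function, which is an assumption about how the update is applied. The evenly spread initialization, the decay factor and the function names are also chosen for the illustration and are not taken from the implementation evaluated in this work.

```python
from typing import List, Sequence


def train_som_1d(
    intensities: Sequence[float],
    k: int = 3,
    alpha: float = 0.5,
    decay: float = 0.9,
    max_epochs: int = 50,
    tol: float = 1e-4,
) -> List[float]:
    """Learn k cluster centers from voxel intensities by competitive learning."""
    low, high = min(intensities), max(intensities)
    # Initial centers spread evenly over the intensity range.
    centers = [low + (high - low) * (i + 0.5) / k for i in range(k)]
    for _ in range(max_epochs):
        previous = list(centers)
        for x in intensities:
            # The winner is the center closest to the input.
            win = min(range(k), key=lambda c: abs(x - centers[c]))
            # Move only the winner towards the input.
            centers[win] += alpha * (x - centers[win])
        alpha *= decay  # geometrical decrease of the learning rate
        if max(abs(a - b) for a, b in zip(centers, previous)) < tol:
            break
    return centers


def classify(intensities: Sequence[float], centers: Sequence[float]) -> List[int]:
    """Assign each voxel to the center at minimum absolute distance."""
    return [min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            for x in intensities]


voxels = [10.0, 11.0, 9.0, 50.0, 51.0, 49.0, 90.0, 91.0, 89.0]
learned = train_som_1d(voxels)
voxel_labels = classify(voxels, learned)
```

Since the centers are initialized in increasing order, class indices 0, 1 and 2 correspond to the darkest, intermediate and brightest cluster, which on T1w data would usually map to CSF, GM and WM.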
4.2.5 Tissue classification with FCM
Fuzzy C-Means clustering techniques integrate the fuzzy sets approach into the classical k-means
clustering algorithm. Basically, in their application to brain tissue classification, a set of
voxels with given intensity values has to be segmented into three different classes (CSF, GM and
WM). In contrast to k-means, voxels do not necessarily belong to one of the classes, but
can “partially” belong to several of them. In the basic k-means clustering approach, the
segmentation process for tissue classification is defined as the minimisation of the energy
function J [56]:
\[ J_{KM} = \sum_{k=1}^{3} \sum_{x_j \in S_k} \|x_j - \mu_k\|^2 \tag{4.1} \]
where the intensity of the voxel is denoted as xj and µk is the centroid of tissue class k,
with k ∈ {1, . . . , 3}. Classification is achieved by a refinement algorithm [70] which produces
an optimal partition into 3 classes by iteratively minimizing the squared error objective function
within cluster elements. The algorithm returns the centers of each cluster µk.
Fuzzy C-means introduces into classical k-means a membership function of a voxel j
for a given tissue class k, denoted Ujk, controlled by a fuzzy parameter q which sets the
fuzziness of the system. Hence, voxel classification with fuzzy clustering is defined by modifying
the minimization function JKM (4.1) as:
\[ J_{FCM} = \sum_{k=1}^{3} \sum_{x_j \in S_k} U_{jk}^{q} \|x_j - \mu_k\|^2 \tag{4.2} \]
In contrast to k-means, the algorithm also returns Ujk, which provides the probability of each voxel
belonging to each tissue class. This makes the FCM approach interesting for brain tissue
classification because it is intrinsically able to deal with PVE artifacts, since voxels
presenting PVE are simply allowed to belong to several classes. Figure 4.6 shows
the typical output of the algorithm, where the probability of each voxel belonging to each of the
three tissues is returned. Given the probability for each class, defuzzification is done by labeling
each voxel with the tissue of maximum probability among the output segmentations.

Figure 4.6: Membership classification output in FCM. The algorithm returns the probability of each voxel to belong to one of three tissues. (a) Synthetic scan. (b) Output for CSF. (c) Output for GM. (d) Output for WM.
Our implementation is based on the work of Pham et al. [61]. The authors describe
the basic steps of the FCM algorithm as a preliminary step to their own proposal, which
incorporates spatial regularization to deal with artifacts. We describe this modification in more
detail in subsection 4.2.6.
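The alternating scheme behind FCM can be sketched as follows. The code uses the standard fuzzy C-means update rules (memberships proportional to the inverse distance raised to 1/(q-1), centers as means weighted by the memberships raised to q) on scalar intensities. The function names, the value q = 2 and the toy data are assumptions for illustration, not the exact implementation of [61].

```python
def fcm_1d(intensities, centers, q=2.0, n_iter=100, tol=1e-6):
    """Standard fuzzy C-means on scalar intensities (fuzziness q > 1)."""
    centers = list(centers)
    eps = 1e-12  # avoids division by zero when a voxel sits on a center
    memberships = []
    for _ in range(n_iter):
        # Membership update: U_jk is proportional to (d_jk^2)^(-1/(q-1)).
        memberships = []
        for x in intensities:
            inv = [((x - c) ** 2 + eps) ** (-1.0 / (q - 1.0)) for c in centers]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Center update: mean of the intensities weighted by U_jk^q.
        new_centers = []
        for k in range(len(centers)):
            w = [u[k] ** q for u in memberships]
            new_centers.append(sum(wi * x for wi, x in zip(w, intensities)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def defuzzify(memberships):
    """Label each voxel with the class of maximum membership."""
    return [max(range(len(u)), key=lambda k: u[k]) for u in memberships]


# The last voxel (30) mimics a CSF/GM partial volume voxel.
voxels = [10, 12, 11, 50, 52, 49, 90, 91, 89, 30]
centers, U = fcm_1d(voxels, centers=[0.0, 45.0, 100.0])
labels = defuzzify(U)
```

In this toy run the partial volume voxel keeps a substantial membership to both the first and the second class, which is the behaviour that makes the fuzzy output attractive for PVE.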
4.2.6 Tissue classification with RFCM
Some authors have proposed modifications to FCM in order to add spatial information [61] [18]
[44] [53]. The FCM objective function 4.2 does not take into consideration any spatial dependence
between observations. Hence, membership functions can be distorted by noisy voxels in
the observed image. Robust Fuzzy C-Means (RFCM) modifies the FCM objective function
by including a penalty term based on the membership of neighboring voxels to the other classes. This penalty term
makes the algorithm more robust to noise. The RFCM equations are only slightly different
from the objective function 4.2. The new objective function is defined as:
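$$J_{RFCM} = \sum_{k=1}^{3} \sum_{x_j \in S_k} U_{jk}^{q} \|x_j - \mu_k\|^2 + \frac{\beta}{2} \sum_{k=1}^{3} \sum_{x_j \in S_k} U_{jk}^{q} \sum_{l \in N_j} \sum_{m \in M_k} U_{lm}^{q} \qquad (4.3)$$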
where Nj is the set of neighbors of a given voxel j, and Mk is the set of classes different from the
currently evaluated class k. Parameter β weights the amount of regularization introduced by the
algorithm. Basically, the penalty term reduces the membership of a voxel to a class when there is
a discrepancy between the class of the voxel and that of its neighbors.
Our RFCM implementation is based on the update equations for the membership Ujk and the cluster
center µk proposed by Pham et al. [61]. We have introduced these modifications into our
previous FCM approach to implement this new version. The authors propose different choices for the number
of neighbors Nj and the regularization parameter β. Moreover,
they developed a method based on cross-correlation to set the parameter β optimally. However,
this method is excessively time consuming, and for our experiments β and Nj have been
experimentally set to 5 and 7 respectively for all the tests, after several runs on different datasets.
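The effect of the penalty term on the memberships can be sketched on a one-dimensional signal. The update below is the form obtained by minimising an objective of the type of equation 4.3 with respect to the memberships under the sum-to-one constraint: the squared distance to each center is increased by β times the neighbors' memberships to the other classes. The two-neighbor window (j-1, j+1), the normalised intensities and the value β = 0.1 are assumptions chosen for this toy example and differ from the settings used in our experiments.

```python
def rfcm_memberships(intensities, centers, prev_u, beta, q=2.0):
    """One RFCM-style membership update on a 1D signal (neighbors: j-1, j+1)."""
    n, c = len(intensities), len(centers)
    eps = 1e-12  # avoids division by zero when a voxel sits on a center
    new_u = []
    for j, x in enumerate(intensities):
        neighbours = [l for l in (j - 1, j + 1) if 0 <= l < n]
        terms = []
        for k in range(c):
            # Penalty: neighbor memberships to every class other than k.
            penalty = sum(prev_u[l][m] ** q
                          for l in neighbours for m in range(c) if m != k)
            cost = (x - centers[k]) ** 2 + beta * penalty + eps
            terms.append(cost ** (-1.0 / (q - 1.0)))
        total = sum(terms)
        new_u.append([t / total for t in terms])
    return new_u


# Homogeneous region (0.5) with one noisy voxel (0.7) halfway to the other class.
signal = [0.5, 0.5, 0.7, 0.5, 0.5]
centers = [0.5, 0.9]
uniform = [[0.5, 0.5] for _ in signal]
u_plain = rfcm_memberships(signal, centers, uniform, beta=0.0)  # plain FCM memberships
u_reg = rfcm_memberships(signal, centers, u_plain, beta=0.1)    # regularised memberships
```

Without regularization the noisy voxel is split evenly between both classes, whereas the penalty pulls it towards the class of its neighbors.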
4.2.7 Tissue classification with KNN
Based on the approach of De Boer et al. [27], CSF, GM and WM are segmented using an
automatically trained kNN classifier which is an extension of the work by Cocosco et al. [23].
Training samples for the kNN classifier are obtained from the subject itself by probabilistic
atlas-based registration as proposed in [27].
Initially, all the voxels are considered for training the classifier. By registering the tissue prior probability
atlases to the input image, a label can be assigned to each voxel from the maximum probability
obtained from the tissue atlases. However, a threshold parameter is introduced so that only
voxels with a probability greater than the threshold are added to the training set. The selection of
this parameter value defines the amount of training voxels of a specific brain region that
will be considered as a certain tissue in the training stage. In De Boer et al. [27], PDw and T1w
intensities are included in the feature vector, and sampling and pruning steps are applied to
the dataset to remove incorrect samples from both modalities. Here, we are limited to T1w
images, and hence the feature vector is based only on the intensities of one modality. Given
this limitation, we decided to include in the training dataset the intensities of all the voxels that
pass the threshold.
Although our implementation is one-dimensional, including all the voxels with
high probability increases the classification time considerably, given that for each voxel we have
to compute the distance to the rest of the voxels. Similarly to the work of De Boer, we based our
implementation on an Approximate Nearest Neighbor searching approach (ANN), which preprocesses d-dimensional
data points into a data structure that reports the k closest elements of the training
dataset efficiently (see footnote 7).
7 Our implementation uses the ANN library, which can be downloaded from http://www.cs.umd.edu/mount/ANN/

The kNN classifier based on ANN performs the final classification based on thresholded
training samples from the same image. It was shown in [75] [29] that if the number of neighbors
k satisfies both equations 4.4 and 4.5, the kNN classifier will have the minimum possible error
probability of a generic classifier (achieved when the classifier knows the true data distributions).
$$\lim_{n \to \infty} k = \infty \qquad (4.4)$$
$$\lim_{n \to \infty} \frac{k}{n} = 0 \qquad (4.5)$$
Although different values of k could have been selected, both De Boer [27] and Cocosco [23]
fixed k to 45 neighbors. In our experiments, we decided to use the same number
of neighbors.
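The self-training strategy can be summarised with the following sketch. It selects training samples from atlas priors with a probability threshold and then classifies each voxel by majority vote among its k nearest training intensities. For brevity it uses an exact brute-force search instead of the ANN library, and the function names, the 0.7 threshold, k = 3 and the toy priors are assumptions for illustration only.

```python
from collections import Counter


def build_training_set(intensities, atlas_probs, threshold=0.7):
    """Keep voxels whose maximum atlas prior exceeds the threshold."""
    samples = []
    for x, probs in zip(intensities, atlas_probs):
        label = max(range(len(probs)), key=lambda k: probs[k])
        if probs[label] > threshold:
            samples.append((x, label))
    return samples


def knn_classify(x, samples, k):
    """Majority vote among the k training samples closest in intensity."""
    nearest = sorted(samples, key=lambda s: abs(s[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data: registered atlas priors (CSF, GM, WM) for eight voxels.
intensities = [10, 12, 48, 52, 88, 92, 28, 75]
atlas_probs = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.0],   # confident CSF
    [0.1, 0.8, 0.1], [0.0, 0.9, 0.1],   # confident GM
    [0.0, 0.1, 0.9], [0.0, 0.2, 0.8],   # confident WM
    [0.5, 0.5, 0.0], [0.0, 0.4, 0.6],   # uncertain voxels, excluded from training
]
samples = build_training_set(intensities, atlas_probs)
labels = [knn_classify(x, samples, k=3) for x in intensities]
```

The uncertain voxels do not contribute to training but are still classified from the intensities of the confident ones, which is the purpose of training on the subject itself.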
4.3 Tissue classification in the presence of lesions
The proposed pipeline was designed to evaluate the methods by comparing the obtained results
with a ground truth. However, we are also interested in analyzing the efficiency of the methods when classifying
brain tissue in the presence of MS lesions. WM atrophy is usually measured
by experts by computing the volume of WM tissue or the FBP. The presence of WM lesions
which are not known a priori can introduce errors in the volume measurement if those lesions
are misclassified as GM or CSF by the segmentation algorithms. We propose a modification
of the basic pipeline to include lesion data, as shown in Figure 4.7. The segmentation framework
is modified as follows:
• The MS lesion masks provided with the SALEM subjects are used to disable
lesion regions in the scans before segmenting them. The segmentation output is then modified
by adding the lesion regions as WM. This masking process is the same methodology followed
by radiologists and neurologists in hospitals.
• At the same time, scans with lesions are also segmented without masking, so that lesion
voxels are classified as normal tissue.
Comparison between both sets will be used to assess the efficiency, as will be explained
in the following section within the experiments done using the real patients from the SALEM
project.
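The two branches of the modified pipeline can be sketched as below. The helper `segment_with_masked_lesions`, the label codes and the threshold-based `toy_segment` classifier are hypothetical stand-ins for any of the evaluated methods. The toy lesion intensities are chosen to look GM-like, which is an assumption of the example.

```python
CSF, GM, WM = 1, 2, 3


def segment_with_masked_lesions(intensities, lesion_mask, segment):
    """Segment non-lesion voxels only, then add lesion voxels back as WM."""
    kept = [x for x, is_lesion in zip(intensities, lesion_mask) if not is_lesion]
    kept_labels = iter(segment(kept))
    return [WM if is_lesion else next(kept_labels) for is_lesion in lesion_mask]


def toy_segment(intensities):
    """Stand-in classifier: fixed intensity thresholds (illustration only)."""
    return [CSF if x < 30 else GM if x < 70 else WM for x in intensities]


def wm_fraction(labels):
    """Fraction of WM voxels over all labelled voxels."""
    return sum(1 for lab in labels if lab == WM) / len(labels)


intensities = [10, 50, 90, 55, 92, 60]
lesion_mask = [False, False, False, True, False, True]

masked = segment_with_masked_lesions(intensities, lesion_mask, toy_segment)
unmasked = toy_segment(intensities)  # lesion voxels treated as normal tissue
difference = wm_fraction(masked) - wm_fraction(unmasked)
```

The difference between both WM fractions is the kind of quantity compared between the masked and not masked branches.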
Figure 4.7: Scheme for our modified pipeline approach. From a T1w scan, non-brain parts of the MRI scan are stripped and brain voxels are corrected for intensity inhomogeneities. MS lesions are masked before classifying voxels into one of the three tissue classes (CSF, GM and WM). Lesion voxels are added as WM. At the same time, scans with lesions are segmented without taking into account the lesion regions. Evaluation is assessed by comparing the FBT of both segmentations.

4.4 Evaluation metrics
The third stage of the pipeline is the evaluation of the segmentation results obtained by the
methods. Given the proposed basic pipeline and the modification introduced in the previous
section to evaluate the efficiency of the methods in the presence of MS lesions, we assess the 8
analyzed approaches using the following metrics:
• Dice, sensitivity and specificity metrics between segmentation results and database ground
truths on real and synthetic healthy T1w scans from the Brainweb and IBSR20 datasets.
• Differences in Fractional Brain Tissue (FBT) on real T1w scans of healthy subjects from the
SALEM database, with an initial and a 12-month follow-up scan for each subject.
• Differences in FBT and Brain Parenchymal Fraction (BPF) on real T1w scans of subjects
with different levels of MS disease from the SALEM database, with an initial and a 12-month
follow-up scan. The efficiency of the methods in the presence of lesions is assessed by comparing
the obtained results with the FBT and BPF values of the same subjects when the WM
lesions are masked before segmentation and added afterwards as WM.
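The metrics listed above can be computed from label volumes as in the following sketch. Dice, sensitivity and specificity use their usual definitions from true and false positives and negatives. FBT is taken here as the volume of each tissue divided by the total brain volume (CSF + GM + WM) and BPF as (GM + WM) / (CSF + GM + WM); these formulas, the label codes and the function names are assumptions stated for the example.

```python
CSF, GM, WM = 1, 2, 3  # label 0 is background


def overlap_metrics(segmentation, ground_truth, tissue):
    """Dice, sensitivity and specificity for one tissue label."""
    pairs = list(zip(segmentation, ground_truth))
    tp = sum(1 for s, g in pairs if s == tissue and g == tissue)
    fp = sum(1 for s, g in pairs if s == tissue and g != tissue)
    fn = sum(1 for s, g in pairs if s != tissue and g == tissue)
    tn = len(pairs) - tp - fp - fn
    return {
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def tissue_fractions(segmentation):
    """FBT per tissue and BPF, both relative to CSF + GM + WM."""
    counts = {t: sum(1 for s in segmentation if s == t) for t in (CSF, GM, WM)}
    total = sum(counts.values())
    fbt = {t: counts[t] / total for t in counts}
    bpf = (counts[GM] + counts[WM]) / total
    return fbt, bpf


ground_truth = [1, 1, 2, 2, 2, 3, 3, 3, 3, 0]
segmentation = [1, 2, 2, 2, 3, 3, 3, 3, 3, 0]

gm_metrics = overlap_metrics(segmentation, ground_truth, GM)
fbt, bpf = tissue_fractions(segmentation)
```

Because FBT is normalised by the total brain volume, the three tissue fractions of a scan add up to one.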
Chapter 5
Results
This chapter presents the segmentation results obtained by the 8 proposed automatic brain
tissue segmentation methods. First, a complete description of the employed datasets is given.
Afterwards, quantitative evaluation of all methods is carried out by computing the following
metrics: sensitivity, specificity and Dice coefficient on synthetic and real datasets of healthy
subjects. Finally, the effects of MS lesions on segmentation are presented as differences in
fractions of brain tissue (FBT) and Brain Parenchymal Fraction (BPF) between scans of the
same MS patient with WM lesions.
5.1 Datasets
The 8 analyzed segmentation methods are evaluated using synthetic and real data. As seen
in chapter 3, Brainweb [22] and the Internet Brain Segmentation Repository (IBSR) have become
standard datasets for inter-study comparison. In this study, the Brainweb and IBSR datasets are
also employed to evaluate the methods on synthetic and real data respectively. Moreover, one of
the objectives of the study is to evaluate the effect of MS lesions on the segmentation methods.
Lesion data is provided by relevant hospitals and medical expert teams in the field of multiple
sclerosis from the SALEM project. MS patients selected within the project with different lesion
loads are used to evaluate the effect of lesions on brain tissue segmentation. Those patients
include an initial scan and its follow-up (after either 6 months or one year) using the 4 main
conventional MRI techniques (PD-w, T1-w, T2-w and FLAIR images). The number of subjects
and the characteristics of each dataset are the following:
• 9 simulated scans from Brainweb with ground truth. Modality is fixed to T1w, since some
of the evaluated methods are mono-spectral. Slice thickness is also fixed to 1 mm in order to
benefit from the discrete ground truth. Then, 9 different configurations are generated
by combining 3 out of 5 noise options (0, 3% and 7%) with all intensity inhomogeneity levels
(0, 20% and 40%).
• 20 real T1w scans from IBSR20 with ground truth. The IBSR20 dataset is chosen over
IBSR18 because its MRI scans have different levels of difficulty and inhomogeneity. Furthermore,
the ground truths provided in IBSR20, based only on WM, GM and CSF tissues, avoid
manual grouping of the 43 brain structures provided by the IBSR18 segmentations.
• 19 T1w scans from different hospitals of SALEM Database with basal scan and its follow-
up (after 12 months). This set comprises both healthy (9 subjects) and MS lesion patients
(10 subjects) with different lesion loads from two hospitals.
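For reference, the 9 Brainweb configurations described in the first item of the list above are the combinations of the three selected noise levels with the three inhomogeneity levels. The short sketch below enumerates them; the variable names are illustrative.

```python
from itertools import product

noise_levels = [0, 3, 7]            # percent noise
inhomogeneity_levels = [0, 20, 40]  # percent intensity inhomogeneity

# Every noise level is paired with every inhomogeneity level.
configurations = list(product(noise_levels, inhomogeneity_levels))
for noise, inhomogeneity in configurations:
    print(f"T1w, 1 mm slices, noise={noise}%, inhomogeneity={inhomogeneity}%")
```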
5.2 Synthetic data results
The eight segmentation methods have been run on synthetic data and the results are evaluated
with the ground truth provided by BrainWeb. Since different levels of noise and intensity
inhomogeneity have been used, this database permits us to measure the sensitivity and specificity
of each method in the presence of different levels of noise and intensity inhomogeneities.
Noise percentage is the percent ratio of the standard deviation of the white
Gaussian noise versus the signal. Higher sensitivity and specificity values indicate that the
segmentation algorithm correctly identified more tissue voxels, and was also better at rejecting
voxels that were not related to the tissue class of interest. Figure 5.1 depicts the average
Dice coefficient for each method computed from all BrainWeb scans. The complete table of
measures for Dice, sensitivity and specificity metrics for all 9 simulated scans can be consulted
in table A.2 of the appendix.
GAGMM, RFCM, FCM and SOM seem to outperform the other methods with very similar
Dice values on CSF tissue (0.77 ± 0.05). Both SPM approaches report similar Dice values
(0.73 ± 0.02), while FAST reports lower similarity (0.71 ± 0.03). KNN seems to underperform
(0.22 ± 0.09) compared to the other methods. The best sensitivity (0.73 ± 0.03) and specificity (0.88 ± 0.03) are found for FAST and GAGMM respectively.
On GM tissue, 6 methods reported Dice values close to 0.93 ± 0.04. SPM5 and FAST
returned lower similarity measures (0.89 and 0.87 ± 0.02 respectively), while KNN was again the worst
method (0.83 ± 0.03). The best sensitivity (0.94 ± 0.05) was obtained by GAGMM and the
best specificity by SOM (0.95 ± 0.03). On WM tissue, GAGMM and SOM also outperformed
the other methods with a Dice value of 0.95 ± 0.04. KNN, RFCM and FCM returned similar
values (0.94 ± 0.04), while FAST and SPM5 both reported the lowest values with a Dice coefficient
of 0.91 ± 0.04. The best sensitivity and specificity were found for RFCM (0.97 ± 0.03) and SPM5
(0.96 ± 0.05) respectively. All methods reported very low standard deviations, with values < 0.09 for the
worst case.

Figure 5.1: Dice metrics boxplots computed from all methods with all Brainweb scans. (a) CSF tissue, (b) GM tissue, (c) WM tissue. Red line depicts the median value; green cross depicts the mean value.
Furthermore, noise and bias artifacts were evaluated on all 9 scans. The obtained
results show that the segmentation methods reported similar values on this dataset
independently of the amount of intensity inhomogeneity. The bias correction introduced in the
pipeline seems to eliminate simulated intensity inhomogeneities (see footnote 1). However, Rician noise
clearly seems to affect the accuracy of the methods. For CSF tissue, all methods seem to follow
the same trend, with accuracy decreasing as noise increases. On GM and WM tissue,
SPM8 seems to be affected only by higher amounts of noise. On the contrary, SPM5 and
FAST increase their accuracy with 3% of noise but then report results similar to the other methods
for 7%. The other strategies seem to follow the same trend, with accuracy decreasing as
noise increases. The complete table of results can be consulted in table A.3 of the appendix.
5.2.1 Discussion
Reviewed works usually reported results based on a single level of noise and intensity inhomogeneity.
Here, statistical measures are taken as averages over the whole generated subset, which
comprises different levels of noise and inhomogeneity. Additionally, average Dice metrics for
different levels of noise and bias have been included in our study. The results obtained
for FAST and SPM5 are similar to those reported in previous studies [80] [46]. Tsang et al.
(2008) [80] also showed a similar trend in the results for different levels of noise, where scans
with 3% Rician noise performed better than those without noise. According to the authors, the
reason is that imaging noise is an inevitable part of image acquisition, and methods deal
with it intrinsically. Ashburner et al. (2005) [5] reported slightly higher results for SPM5 in GM
and WM tissues. This difference is explained by the intensity inhomogeneity correction applied
in our experiments. In order to compare the different methods as fairly as possible, N3 was
employed to correct intensity bias and the internal corrections of the methods were disabled. Results very close
to those reported by Ashburner have been obtained by repeating the segmentation process
in SPM5 with default options on an original simulated image. External intensity correction can
also explain the lower values obtained in our experiments for CSF tissue on all methods. Although
Ashburner did not provide results for CSF, the results obtained for CSF with default options
follow those reported by Huang [46] and Tsang [80]. Repetition of the FAST segmentations with
default options has confirmed our hypothesis, again yielding results close to other published
works for all tissues. In general, GMM approaches such as SPM8, SPM5 and GAGMM performed
better than the Hidden MRF method implemented in FAST. From the results obtained,
and comparing them with other GMM approaches, GA optimization of the GMM model was
more accurate than atlas-based methods such as SPM5 and SPM8. Clustering methods such as RFCM,
FCM or SOM reported similar values for all tissues, even with higher levels of
noise. This can again be explained by the segmentation pipeline used. Since all the scans are
preprocessed for intensity bias correction, the RFCM regularization is very low and the model is
equal to FCM. The results obtained for GM and WM with the SOM method are in accordance with
the values obtained by Hasanzadeh et al. (2007) [42] using a similar approach. Finally, KNN
clearly underperforms on CSF tissue. Figure 5.2 depicts segmentation outputs for the KNN
classifier (Figure 5.2(c)) and SPM5 (Figure 5.2(b)) for comparison. The KNN training dataset is
based on the registration of the MNI atlas prior (Figure 5.2(d)) to the input scan, where a MAP
criterion with minimum probability > 0.7 is applied to add new elements to the training
set. The ground truth provided by Brainweb (Figure 5.2(a)) considers the borders of the brain as CSF,
while the MNI atlas reports probabilities < 0.6. Therefore, KNN classified border tissue
as GM, which explains why GM sensitivity and specificity values are lower as well. Taking into
account the low percentage of CSF tissue in the brain, relatively small changes in the segmentation of
CSF tissue considerably decrease the reported similarity. Previous tests in which the MAP
minimum threshold was lowered to increase CSF sensitivity reported an increase in
misclassification between GM and WM.

1 Table A.1 shows Dice metrics for each method performing on different levels of intensity inhomogeneity. This table has not been included here for simplicity.

Figure 5.2: Segmentation results for various methods on Brainweb scans. (a) Brainweb ground truth, (b) SPM5 output, (c) KNN output, (d) MNI probability CSF tissue atlas. Tissue labels are CSF (red), GM (green) and WM (red). The KNN classifier self-trained on the same subject with atlas registration misclassifies CSF tissue as GM at the brain tissue borders because the MNI atlas returns low probabilities for CSF at the borders.
5.3 Real data results
As a second experiment, the segmentation methods are run on real T1w data from the IBSR20
dataset introduced in section 3.2.2. Labelled volumes, produced by trained
experts using a semi-automated intensity contour mapping algorithm, are also provided for evaluation. The results obtained with
this dataset permit us to evaluate the segmentation methods on real data with different
levels of intensity inhomogeneity and real acquisition artifacts. IBSR scans have been sorted in
decreasing order of difficulty. Average Dice coefficients by method on all 20 scans are depicted
in figure 5.3. The complete average measures for Dice, sensitivity and specificity can be found
in table A.4 of the appendix.
All methods reported very low values for CSF tissue. As reported
in the previous section, the amount of CSF tissue in the brain is small compared to GM and
WM, and small differences in segmentation results produce considerable changes in the reported
metrics. However, KNN performed slightly better (0.41 ± 0.01) than the other methods, which
reported Dice values lower than 0.25 for CSF tissue. Again, KNN clearly outperformed the
other strategies with a Dice value of 0.86 ± 0.05 for GM tissue. GMM methods such as GAGMM,
SPM5 and SPM8 reported similar results (0.76 ± 0.06). RFCM returned lower similarity than
the GMM approaches (0.76 ± 0.06), while the other methods performed with values lower than 0.70.
On WM, both SPM methods, FAST and KNN reported Dice values close to 0.80. FCM and
SOM returned similar, lower Dice values (0.77), while RFCM underperformed with respect to
the other approaches (0.72 ± 0.14).
The boxplots in figure 5.3 show larger distances between the first and third quartiles for all
methods than those found with synthetic data. For CSF tissue, the largest differences were found
for the KNN classifier, and the smallest for FCM and FAST. In general, all the methods showed
high variability for CSF classification compared with their response to the other tissues. On
GM, the largest variability was found for the GAGMM approach and the lowest for SPM5 and SPM8. On
WM, the highest variability was found for RFCM, followed by the other methods, which returned
similar lower variability, and again the SPM approaches with very low deviation values.
Variability between scans for the same segmentation method was also analyzed. Figure
5.4 shows the Dice metrics returned by all methods, evaluated for each scan. Scans are sorted by
decreasing difficulty. For GM and WM tissue, all methods showed an ascending accuracy
trend, which corresponds to the ordering of the scans from most difficult to easiest. In GM,
intra-scan accuracy differences were higher than in the WM plot. Despite the ascending
trend in accuracy, differences for the same subject when classifying GM remained stable along the
20 scans for all methods. Conversely, the WM tissue plot showed a larger discrepancy between
methods on difficult scans and closer Dice metrics between methods as the scans became
easier to classify. Finally, CSF tissue followed a similar trend for all methods and scans, with
larger differences in KNN for the scans of medium difficulty. From the plots, unexpected values can
be observed for the SPM5 method on volume 1-24 in all tissues, and for RFCM on 16-3 in GM.

Figure 5.3: Dice metrics boxplots computed from all methods with all IBSR20 scans. (a) CSF tissue, (b) GM tissue, (c) WM tissue. Red line depicts the median value; green cross depicts the mean value.
5.3.1 Discussion
Similar results for all tissues have been found in other works where one or more methods are evaluated with IBSR20.
Tian et al. (2011) [78] analyzed FSL, SPM8 and GAGMM
with the same database but with different bias preprocessing. Their reported Dice values for all
tissues were in agreement with our findings, with small differences probably due to changes
in the segmentation pipeline. Caldairou et al. [18] analyzed different fuzzy methods, which
included both FCM and RFCM. Their reported Dice metrics were higher than the results obtained in
our experiments. However, those experiments were done on IBSR18, and the results cannot be
directly compared. The results provided for SPM5 and FAST by Tsang et al. (2008) [80] for GM
and WM were also similar to those obtained in our experiments.
The low values obtained on CSF tissue by all methods are directly related to the ground
truth database. The labelled scans provided by IBSR classify the border tissue of the brain
as GM, while 7 out of 8 segmentation methods tend to classify these voxels as CSF, which
considerably decreases the Dice coefficient for CSF tissue. The higher results obtained for KNN are
explained by the fact that the MNI probabilistic atlases assign low probability to CSF at the brain borders,
forcing the algorithm to classify those voxels as GM. However, the reported sensitivity provides a
good estimator of the capability of each method to classify tissue as CSF. Indeed, 7 out of
8 methods (KNN underperformed for sensitivity) reported high values for sensitivity (> 0.79),
which indicates that the methods are actually classifying correctly the CSF tissue located in the lateral
ventricles. On the contrary, the sensitivity of KNN to detect CSF tissue
in the same brain region is lower than that of the other methods. However, the classifier tends to detect
non-tissue classes more accurately, as shown by the higher specificity obtained. Brain tissue
at the borders of the brain was usually misclassified as GM. Hence, the obtained Dice coefficients
for CSF affect the accuracy of the method for GM tissue classification. This phenomenon
explains the high accuracy obtained by KNN for GM tissue classification. Along the same
lines, GMM approaches obtain better classification rates for GM than clustering strategies or
FAST. These results are related to those obtained for CSF, where GMM approaches reported
better classification rates as well. Similarly, FAST, SOM and FCM, which obtained lower
values for CSF, report the lowest Dice coefficients for GM classification. Moreover, on difficult
scans, GM misclassification is also related to volume artifacts. Figure 5.5(a) shows the
most difficult volume provided in the IBSR20 database and Figure 5.5(b) the expected ground truth.
Figures 5.5(c) and (d) show the FAST and KNN tissue classification results obtained in our
experiment. The input scan is heavily affected by acquisition artifacts. 7 out of 8 methods misclassify
those artifacts basically as WM, affecting the GM accuracy. On the contrary, KNN and
SPM5 were not altered by those artifacts, because they used prior information from probabilistic
atlases which modeled the bias as brain tissues. However, SPM8 performed significantly
worse than SPM5 on scan 5_8. The algorithm did not consider those voxels and classified
them as background. Clustering methods performed very similarly on GM classification.
Spatial regularization introduced in RFCM outperformed basic clustering approaches like SOM
and FCM in scans with a moderate amount of artifacts.

Figure 5.4: Dice metrics plots evaluated for each scan and method. (a) CSF tissue, (b) GM tissue, (c) WM tissue. Methods are labeled by color: FAST (red), SPM5 (green), SPM8 (blue), GAGMM (pink), SOM (yellow), RFCM (cyan), FCM (black), KNN (orange).

Figure 5.5: Segmentation results for various methods on the IBSR 5_8 scan. (a) Scan as provided in IBSR, (b) IBSR ground truth, (c) FAST output, (d) KNN output. Tissue labels are CSF (red), GM (green) and WM (red). Scan 5_8 is provided with real hard artifacts.
The WM tissue plot in Figure 5.4(c) revealed larger discrepancies in Dice metrics between
methods on difficult scans. With scans of medium and low difficulty, the methods agreed more closely
when predicting WM tissue as the scans became easier to classify. As
mentioned above, those discrepancies were directly related to image artifacts because the methods
tended to misclassify them as WM. Again, the best results were obtained by methods with spatial
probabilistic priors (KNN, SPM5 and SPM8). Although GAGMM returned accurate results
for images of low and medium difficulty, its reported average Dice metrics were lower than those of the other
GMM approaches because the GA initialization and optimization was penalized in the presence
of big artifacts. Regularization in fuzzy clustering did not improve the results of simple clustering
methods, especially in images with a high amount of bias. This could be caused by a wrong
penalization introduced in the fuzzy membership computations. Because the artifacts are long stripes
along the posterior part of the brain, neighboring windows collect information from voxels that are
also corrupted. This could be avoided by considerably increasing the window size, at the cost of
excessive computation time.
Analyzing the results more globally, KNN with a self-trained dataset based on prior atlases
provided the best results on all tissues. This technique, although simple, was able to considerably reduce
the effect of image artifacts in scans. Furthermore, GMM approaches modeled
brain tissue probabilities more accurately. Atlas prior initialization proved more robust
than heuristic GA initialization and optimization, which was penalized in images with high amounts of
image artifacts. FAST returned high results for WM tissue, which could indicate that its lower GM
results were caused by differences with the ground truth in the interpretation of CSF, in addition to
the presence of big artifacts which weaken the MRF priors. Clustering methods seem to perform
well on images with no artifacts or a low level of artifacts, as seen in the WM plots. Spatial regularization
introduced in RFCM seems to outperform basic clustering approaches like SOM and FCM in
scans with a moderate amount of artifacts.

Table 5.1: FBT evaluation on 9 healthy subjects of the SALEM dataset for all methods. Reported values are mean ± standard deviation. Diff refers to the absolute difference between basal and 12 months for each method.

Method | CSF basal   | CSF 12 months | CSF diff | GM basal    | GM 12 months | GM diff | WM basal    | WM 12 months | WM diff
FAST   | 0.19 ± 0.02 | 0.19 ± 0.01   | 0.01     | 0.39 ± 0.02 | 0.39 ± 0.01  | 0.00    | 0.42 ± 0.02 | 0.43 ± 0.01  | 0.01
SPM5   | 0.19 ± 0.02 | 0.18 ± 0.01   | 0.01     | 0.44 ± 0.02 | 0.44 ± 0.02  | 0.00    | 0.37 ± 0.03 | 0.38 ± 0.02  | 0.01
SPM8   | 0.17 ± 0.01 | 0.16 ± 0.01   | 0.01     | 0.45 ± 0.01 | 0.45 ± 0.01  | 0.00    | 0.38 ± 0.02 | 0.38 ± 0.01  | 0.00
GAGMM  | 0.11 ± 0.02 | 0.10 ± 0.02   | 0.00     | 0.64 ± 0.14 | 0.59 ± 0.14  | 0.05    | 0.26 ± 0.15 | 0.31 ± 0.15  | 0.05
SOM    | 0.16 ± 0.02 | 0.16 ± 0.01   | 0.00     | 0.41 ± 0.04 | 0.41 ± 0.02  | 0.00    | 0.43 ± 0.05 | 0.44 ± 0.02  | 0.00
RFCM   | 0.15 ± 0.04 | 0.12 ± 0.01   | 0.03     | 0.51 ± 0.21 | 0.37 ± 0.02  | 0.14    | 0.34 ± 0.25 | 0.51 ± 0.02  | 0.17
FCM    | 0.11 ± 0.01 | 0.11 ± 0.01   | 0.00     | 0.36 ± 0.04 | 0.34 ± 0.03  | 0.02    | 0.52 ± 0.04 | 0.55 ± 0.03  | 0.03
KNN    | 0.01 ± 0.01 | 0.01 ± 0.01   | 0.00     | 0.64 ± 0.03 | 0.62 ± 0.03  | 0.03    | 0.34 ± 0.03 | 0.37 ± 0.03  | 0.03
5.4 MS lesion results
Finally, the third experiment evaluates the effect of different loads of MS disease on brain tissue
segmentation. Our SALEM dataset consists of 19 subjects: 9 healthy subjects and 10 patients
with different lesion loads mainly located in WM tissue. For each subject, both initial and 12-month
scans are available with the corresponding lesion masks. Lesion masks are used to localize
the zones of the brain where disease is present. With these data, three tests are computed:
1) fractional brain tissue (FBT) is evaluated on healthy scans and, for each subject, differences
between the FBT of both scans are computed. The same test is done twice on subjects with
lesions: 2) masking lesions so that they are not considered by the methods and adding them after segmentation
as WM to evaluate FBT, and 3) not masking lesions and computing FBT as if they were normal
tissue. Table 5.1 shows the FBT of the healthy subjects on both consecutive scans. The difference
between FBT values for each subject is reported. Tables 5.3 and 5.2 show the same results for not
masked and masked lesion scans respectively.
From table 5.1, it can be observed that the methods do not yield significant differences
in FBT between basal and 12-month scans on healthy data. The maximum difference
reported for CSF is 0.03 for the RFCM method. The same behavior is seen for GM and WM
tissues. However, RFCM seems to return larger differences between scans, with 0.14 and 0.17
in GM and WM respectively. Ge et al. (2002) [38] reported common FBT values for brain
tissue in adults (GM = 0.50, WM = 0.35, and CSF = 0.15). With healthy subjects, most of the
methods report values in concordance with those normal FBT values. On the contrary, since
KNN misclassifies CSF tissue as GM, its reported values for GM are not in accordance
with normal FBT values.
Consecutive scans from the same subject in which lesions have not been masked
showed small differences between them for all methods. This behavior is seen
independently of the tissue. However, more significant differences were obtained between methods.
In CSF, the minimum estimation of FBT for all scans was given by KNN (0.01 ± 0.02) and
the maximum by FAST (0.21 ± 0.04). The minimum FBT estimation for GM was given by the
RFCM method (0.32 ± 0.06) and the maximum by KNN (0.63 ± 0.03). On WM, the minimum estimation
is given by KNN (0.37 ± 0.03) and the maximum by RFCM (0.57 ± 0.07). Results from table
5.2 again showed small differences in FBT between consecutive scans when they are segmented
masking the lesions and adding them afterwards as WM. 6 out of 8 methods reported average differences
lower than 0.05 in FBT in all tissues between masked and not masked segmented
scans of the same subject. The overlap between masked and not masked scans is represented in
figure 5.6.
FBT values for the initial (basal) and 12-month (12m) scans of each subject where lesions
have been masked are depicted in red and blue respectively. Similarly, for the same subjects,
values for initial and 12-month scans without masking the lesions are depicted with × and
○ respectively. The graph reveals differences in WM tissue estimation between
subjects which are hidden in the averaged measures of the tables. From the plot, differences
for GAGMM are found. The GA initialization seems not to provide a good model for the GMM when
it is run with default options. Although the method allows different GA parameters to be set,
the default options have been kept here. Regarding SOM, masked lesion voxels that are expected
to be WM seem to modify the trained weight matrix and force the algorithm to misclassify
WM.
Furthermore, we focus the study on how the methods classify lesions. Figure 5.7(a) depicts
a SALEM scan with high lesion load (420 mm³). Figures 5.7(c) to (j) show the segmentations
provided by each method. From the plots we observe that the methods tend to classify lesion
voxels as WM (green) or GM (red). Focusing on the lesion regions, smoother results seem
to be obtained for the plotted slice with GMM models such as SPM5, SPM8 and GAGMM than with
clustering methods or FAST, which seem to classify the two big spots as GM. Table 5.4 shows
quantitatively the average fraction of lesion tissue assigned to each class by each method
for all scans in the basal study². We observe that the analyzed methods classify most lesion
voxels as WM, but at least 16% of them (SOM) as GM and a smaller proportion as CSF.

Figure 5.6: WM tissue FBT for lesion-masked and non-masked scans. On masked scans, lesions are
added as WM when computing FBT. On non-masked scans, lesions are classified into normal
tissues. Red values refer to masked scans. Blue values refer to non-masked scans. × refers to
the initial scan, ○ refers to the 12-month scan.
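The lesion fractions of Table 5.4 (number of lesion voxels classified as a tissue divided by the total number of lesion voxels) can be sketched as follows. This is only an illustration on a flat list of voxel labels; the function name and the toy data are hypothetical.

```python
def lesion_tissue_fractions(labels, lesion_mask, tissues=("CSF", "GM", "WM")):
    """Fraction of lesion voxels that a segmentation assigned to each tissue.

    labels      : tissue label of every voxel, obtained without masking lesions.
    lesion_mask : booleans of the same length, True for lesion voxels.
    """
    lesion_labels = [
        label for label, is_lesion in zip(labels, lesion_mask) if is_lesion
    ]
    if not lesion_labels:
        raise ValueError("the mask contains no lesion voxels")
    return {t: lesion_labels.count(t) / len(lesion_labels) for t in tissues}


# Toy example (assumed data): five lesion voxels, one of them labelled as GM.
labels = ["WM", "WM", "GM", "WM", "CSF", "GM", "WM", "WM"]
lesion_mask = [True, True, True, True, False, False, True, False]
fractions = lesion_tissue_fractions(labels, lesion_mask)
```

Here four of the five lesion voxels are labelled as WM and one as GM, so the fractions are 0.8 for WM, 0.2 for GM and 0.0 for CSF.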
Finally, the brain parenchymal fraction (BPF) is computed on all scans with MS disease. BPF
is the fraction of GM, WM and lesion volume with respect to the whole brain volume. This
measure is used by experts to evaluate tissue atrophy in MS diagnosis. Figure 5.8 depicts
the BPF values obtained for each segmentation method. The plots seem to indicate small changes
between masked and non-masked scans in 6 out of 8 methods. GAGMM and SOM seem to
fluctuate between subjects, which is caused by misclassified scans, as explained before. KNN
obtains considerably higher values than the other methods, which is explained by the low
FBT it reports for CSF tissue. The BPF values reported in our experiment are in concordance
with studies evaluating WM tissue atrophy. Atkins et al. [7] and Rudick et al. [67] reported
BPF values from 0.83 to 0.82 in MS patients and 0.87 in healthy subjects.
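As a minimal sketch of this definition, BPF can be computed from tissue volumes as shown below. It assumes that the whole brain volume is the sum of the GM, WM, lesion and CSF volumes; the function name and the volumes are hypothetical.

```python
def brain_parenchymal_fraction(csf, gm, wm, lesion=0.0):
    """BPF = (GM + WM + lesion) / (GM + WM + lesion + CSF).

    All arguments are volumes in the same unit (for instance voxel counts).
    """
    parenchyma = gm + wm + lesion
    whole_brain = parenchyma + csf
    if whole_brain <= 0:
        raise ValueError("the whole brain volume must be positive")
    return parenchyma / whole_brain


# Toy volumes (assumed): lesions kept as a separate class.
reference = brain_parenchymal_fraction(csf=150.0, gm=500.0, wm=330.0, lesion=20.0)
# Same volumes, but with the lesion volume labelled as GM by the method.
relabelled = brain_parenchymal_fraction(csf=150.0, gm=520.0, wm=330.0)
```

Both calls give 0.85: because GM, WM and lesions share the numerator, moving volume between them does not change the BPF.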
² Since there are no significant differences between consecutive scans, we base the classification on the basal study.

Table 5.2: FBT evaluation on 9 subjects with lesion loads from the SALEM dataset for all methods.
Lesions are not segmented and are added as WM when computing FBT. Reported values are
mean ± standard deviation. Diff refers to the absolute difference between basal and 12 months
for each method.

        CSF                             GM                              WM
Method  basal        12 months    diff  basal        12 months    diff  basal        12 months    diff
FAST    0.21 ± 0.04  0.21 ± 0.04  0.00  0.34 ± 0.06  0.34 ± 0.06  0.00  0.46 ± 0.05  0.46 ± 0.04  0.00
SPM5    0.20 ± 0.03  0.20 ± 0.03  0.00  0.44 ± 0.04  0.46 ± 0.05  0.02  0.37 ± 0.05  0.35 ± 0.06  0.02
SPM8    0.16 ± 0.02  0.17 ± 0.01  0.00  0.46 ± 0.01  0.46 ± 0.01  0.00  0.38 ± 0.01  0.38 ± 0.01  0.00
GAGMM   0.09 ± 0.02  0.09 ± 0.02  0.00  0.43 ± 0.14  0.46 ± 0.16  0.03  0.48 ± 0.16  0.45 ± 0.19  0.03
SOM     0.17 ± 0.04  0.09 ± 0.07  0.07  0.48 ± 0.17  0.33 ± 0.11  0.15  0.36 ± 0.19  0.59 ± 0.19  0.23
RFCM    0.12 ± 0.03  0.13 ± 0.02  0.00  0.34 ± 0.07  0.33 ± 0.06  0.01  0.55 ± 0.07  0.56 ± 0.06  0.01
FCM     0.14 ± 0.03  0.14 ± 0.03  0.00  0.37 ± 0.06  0.36 ± 0.05  0.00  0.51 ± 0.05  0.51 ± 0.04  0.00
KNN     0.01 ± 0.02  0.01 ± 0.01  0.00  0.63 ± 0.03  0.63 ± 0.02  0.01  0.37 ± 0.03  0.37 ± 0.02  0.00

Table 5.3: FBT evaluation on 9 subjects with lesion loads from the SALEM dataset for all methods.
Lesions have not been masked and the segmentation methods try to classify lesions as normal
tissue. Reported values are mean ± standard deviation. Diff refers to the absolute difference
between basal and 12 months for each method.

        CSF                             GM                              WM
Method  basal        12 months    diff  basal        12 months    diff  basal        12 months    diff
FAST    0.21 ± 0.04  0.21 ± 0.04  0.00  0.34 ± 0.06  0.34 ± 0.06  0.00  0.46 ± 0.05  0.46 ± 0.04  0.00
SPM5    0.20 ± 0.03  0.20 ± 0.03  0.00  0.44 ± 0.04  0.46 ± 0.05  0.02  0.37 ± 0.05  0.35 ± 0.06  0.02
SPM8    0.16 ± 0.02  0.17 ± 0.01  0.00  0.46 ± 0.01  0.46 ± 0.01  0.00  0.38 ± 0.01  0.38 ± 0.01  0.00
GAGMM   0.09 ± 0.02  0.09 ± 0.02  0.00  0.43 ± 0.14  0.46 ± 0.16  0.03  0.48 ± 0.16  0.45 ± 0.19  0.03
SOM     0.15 ± 0.03  0.16 ± 0.03  0.00  0.36 ± 0.07  0.36 ± 0.07  0.00  0.49 ± 0.07  0.49 ± 0.07  0.00
RFCM    0.12 ± 0.03  0.13 ± 0.02  0.00  0.34 ± 0.07  0.33 ± 0.06  0.01  0.55 ± 0.07  0.56 ± 0.06  0.01
FCM     0.14 ± 0.03  0.14 ± 0.03  0.00  0.37 ± 0.06  0.36 ± 0.05  0.00  0.51 ± 0.05  0.51 ± 0.04  0.00
KNN     0.01 ± 0.02  0.01 ± 0.01  0.00  0.63 ± 0.03  0.63 ± 0.02  0.01  0.37 ± 0.03  0.37 ± 0.02  0.00

Table 5.4: Lesion tissue classification in the basal study. Percentage of lesion (number of voxels
classified as tissue / total lesion voxels) which is segmented by the methods as CSF, GM and WM
when lesions are not masked. Reported values are mean ± standard deviation of the percentage of
tissue for hospital 1 (H1) and 2 (H2).

Method  CSF          GM           WM
FAST    0.06 ± 0.05  0.22 ± 0.11  0.72 ± 0.11
SPM5    0.04 ± 0.03  0.20 ± 0.14  0.76 ± 0.14
SPM8    0.03 ± 0.02  0.17 ± 0.09  0.80 ± 0.09
GAGMM   0.01 ± 0.01  0.22 ± 0.30  0.76 ± 0.30
SOM     0.01 ± 0.02  0.16 ± 0.09  0.83 ± 0.09
RFCM    0.02 ± 0.02  0.22 ± 0.13  0.76 ± 0.13
FCM     0.02 ± 0.02  0.22 ± 0.29  0.76 ± 0.29
KNN     0.00 ± 0.00  0.37 ± 0.16  0.63 ± 0.16

Figure 5.7: Segmentation results for all methods on SALEM scan 201 without masking the lesions.
(a) Scan 201 as provided by SALEM, (b) scan with lesions highlighted, (c) FAST output, (d) SPM5
output, (e) SPM8 output, (f) GAGMM output, (g) SOM output, (h) RFCM output, (i) FCM output,
(j) KNN output.

Figure 5.8: Brain Parenchymal Fraction for lesion-masked and non-masked scans.

5.4.1 Discussion
Evaluating tissue classification methods in the presence of MS lesions is not an easy task.
GAGMM and SOM reported problems segmenting SALEM scans. GAGMM performed with high accuracy
on both previous databases using default settings (3 classes with 2 PVE). However, here it
fails in some scans, not necessarily those with lesions. We suspect that the settings of the GA
algorithm have to be tuned in order to adapt the method to a different MRI scanner, which makes
the algorithm unsuitable for doctors. This method is not commented on in the following discussion.
The tables in the previous section have shown that there are only small differences between
consecutive scans of the same subject, both for healthy subjects and for subjects with MS. Very
close FBT results in a second scan at 12 months, which is known to come from the same patient
and to be without pathology, can be used to assess the reproducibility of the methods. All
analyzed methods obtained very close results for all follow-ups in healthy scans, which
indicates their capability to reproduce results.
The provided WM lesion masks permit the evaluation of the methods by excluding the affected
regions and adding them back as WM after segmentation. One important point to consider is that
the methods tend to report small differences in FBT for GM and WM between scans without masking
and scans with lesions added to WM. All the methods misclassify at least 16% of lesion voxels
as GM, and 20% or more in 6 methods (Table 5.4). This effect would lead to wrong
estimations of WM and GM where tissue volume is needed, because lesion voxels which are
supposed to be WM are misclassified as GM. If we analyze each method, SOM and SPM8
reported the smallest misclassification values (0.16 ± 0.09 and 0.17 ± 0.09 respectively),
while KNN returned the highest average misclassification rate from WM to GM (0.37 ± 0.16).
Moreover, BPF does not help to identify changes in atrophy in this case, since BPF will not
reflect differences in WM tissue if WM is misclassified as GM: the GM+WM fraction
will remain similar and no changes will be detected. However, our analysis is based only on
quantitative measurements. Further work has to be done in order to incorporate into the
analysis a qualitative evaluation made by radiologists and neurologists to determine which
methods better classify brain tissue. This qualitative evaluation had not been done
at the time of closing this report, due to time constraints in the doctors' agendas.
Chapter 6
Conclusions
This master thesis has been carried out in 4 main parts. Firstly, we have reviewed the
state of the art in automatic brain tissue segmentation. After an extensive analysis of
recent papers we have presented a classification based on supervised and unsupervised
methods. Moreover, we have focused on published works whose evaluation has been done with
public databases such as Brainweb and IBSR. In order to evaluate the accuracy of the reviewed
methods, we have divided both supervised and unsupervised strategies by the dataset used,
and discussed the best results obtained.
Secondly, we have proposed a framework to compare brain tissue segmentation
techniques which includes preprocessing, segmentation and evaluation stages. For
the preprocessing step, we have introduced different techniques used to deal with inherent
problems in MRI such as skull-stripping and intensity inhomogeneities, and we have proposed
some of them to be incorporated into our pipeline, justifying our choice in each case. For the
segmentation stage, we have selected 4 publicly available segmentation approaches from
the state of the art, where some of them such as FAST, SPM5 and SPM8 are widely used
by the neuroimaging community for tissue segmentation and volumetric analysis. We have also
introduced the GAGMM approach, which implements tissue classification by optimizing
the parameters of the tissue distributions in a GMM with genetic algorithms. Moreover, we have
selected 4 more works from the state of the art and we have implemented them.
First, we have presented two FCM approaches based on the work of Pham et al. [61]. The first
one is based on basic FCM theory while the second one modifies the FCM energy function with
a penalization term to include spatial information in the membership function. The
modification is designed to improve the performance of the basic FCM approach by penalizing
voxels with high variance with respect to their neighborhood. Furthermore, we have proposed a
SOM clustering approach based on the work of Tian et al. [77]. The SOM matrix is trained on the
subject itself, restricting the feature vector dimensions to T1w intensities. It has been seen
that, given the limitation in the feature space, the learned weight matrix in practice returns
the cluster centers of the tissue distributions, and classification is achieved by computing
the minimum absolute distance from voxels to the cluster centers. Finally, we have developed a
modified version of the KNN classifier proposed by De Boer et al. [27]. The classifier is also
auto-trained on the subject itself by registering prior probability tissue atlases to the
subject scan. After selecting the voxels with probability higher than a given threshold, the
training dataset is built from the voxel intensities and the labels of the tissue with the
highest probability.
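The construction of the auto-training set for the KNN classifier can be sketched as below. This is a simplified illustration on one-dimensional T1w intensities and is not the implementation used in this work: the threshold of 0.7, the value of k, the function names and the toy data are all assumptions.

```python
from collections import Counter


def build_training_set(intensities, priors, threshold=0.7):
    """Keep voxels whose highest atlas prior exceeds the threshold.

    intensities : T1w intensity of every voxel.
    priors      : one dict {tissue: prior probability} per voxel, as read
                  from the registered tissue atlases.
    Returns a list of (intensity, tissue) training samples.
    """
    samples = []
    for value, prior in zip(intensities, priors):
        tissue, probability = max(prior.items(), key=lambda item: item[1])
        if probability > threshold:
            samples.append((value, tissue))
    return samples


def knn_classify(value, samples, k=3):
    """Label an intensity by majority vote among its k nearest samples."""
    nearest = sorted(samples, key=lambda s: abs(s[0] - value))[:k]
    votes = Counter(tissue for _, tissue in nearest)
    return votes.most_common(1)[0][0]


# Toy data (assumed): CSF is dark, GM intermediate and WM bright in T1w.
intensities = [10, 12, 11, 50, 52, 49, 90, 92, 88, 30]
priors = (
    [{"CSF": 0.9, "GM": 0.1, "WM": 0.0}] * 3
    + [{"CSF": 0.1, "GM": 0.8, "WM": 0.1}] * 3
    + [{"CSF": 0.0, "GM": 0.1, "WM": 0.9}] * 3
    + [{"CSF": 0.4, "GM": 0.5, "WM": 0.1}]  # ambiguous voxel, discarded
)

training = build_training_set(intensities, priors)
label = knn_classify(47, training)
```

The ambiguous voxel does not reach the threshold and is left out, so the classifier is trained only on voxels for which the atlas is confident.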
Thirdly, we have proposed a modification of the framework in order to measure
the efficiency of the methods in the presence of MS lesions. This modification is based
on the addition of a lesion masking step before segmentation. Hence, the methods have been run
twice: first with lesion-masked scans and afterwards with the same scans without modification.
The capability of the methods to deal with WM lesions is evaluated by comparing
the FBT coefficients obtained from both sets.
Finally, we have evaluated the 8 methods on synthetic and real T1w scans of healthy
subjects from Brainweb and IBSR20 respectively, and on scans with different loads of MS lesions
from the SALEM project database. Results on Brainweb and IBSR20 have been presented using
quantitative measures (the Dice similarity index, sensitivity and specificity), while results on
SALEM scans have been evaluated by the Fraction of Brain Tissue and the Brain Parenchymal
Fraction coefficients. Results on synthetic data have shown that, in general, all the methods
performed with very high accuracy, especially Gaussian Mixture Models such as SPM8 and GAGMM.
The results obtained with synthetic data were in accordance with previous studies using one or
more of these methods. The preprocessing bias correction step completely removed the different
levels of intensity inhomogeneity present in the data, and the methods reported very similar
results independently of the amount of bias. In contrast, the accuracy of all methods
decreased when noise was increased considerably. Results on the IBSR data have shown
that KNN provided the best results on all tissues. In general, GMM approaches have again
been able to model brain tissue probabilities more accurately than clustering methods or FAST.
Atlas prior initialization has proved more robust than heuristic GA optimization, which
is penalized in images with large amounts of artifacts. However, FAST has returned high
results for WM tissue, and its lower GM results may be explained by a penalization from the way
the ground truth segments CSF, in addition to the presence of large artifacts which weaken
the MRF priors. Clustering methods have performed well on images with no or low levels of
artifacts. The spatial regularization introduced in RFCM has outperformed basic clustering
approaches like SOM and FCM in scans with a moderate amount of artifacts, but it has been
reported to fail in the presence of more bias.
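For reference, the three overlap measures can be sketched for one tissue from a pair of binary masks. The sketch uses the usual confusion-matrix definitions, which is an assumption here: the measures used in this work are defined in an earlier chapter and its specificity in particular may be defined differently. The function name and the toy masks are hypothetical.

```python
def overlap_metrics(predicted, reference):
    """Dice, sensitivity and specificity between two binary masks.

    predicted, reference : equal-length sequences of 0/1 (or booleans), one
    entry per voxel, for a single tissue class.
    """
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same length")
    tp = fp = fn = tn = 0
    for p, r in zip(predicted, reference):
        if p and r:
            tp += 1
        elif p:
            fp += 1
        elif r:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty class) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


# Toy masks (assumed): 3 true positives, 1 false positive,
# 1 false negative and 3 true negatives.
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
reference = [1, 1, 0, 0, 0, 0, 1, 1]
metrics = overlap_metrics(predicted, reference)
```

With these masks the three measures all equal 0.75.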
Finally, experiments with initial scans and 12-month follow-ups of healthy subjects of the
SALEM project have shown the capability of the methods to repeat FBT estimations. On
the other hand, when the methods have been run on scans without masking the lesions, the
results have indicated that in general the methods tend to misclassify lesion voxels that are
expected to be WM as GM, in at least 16% of cases. These results vary from SOM (16%) and
SPM8 (17%) to KNN (37%). However, these evaluations are
based on segmentation results and on the provided labeled scans, which can differ between
experts. Therefore, the evaluations have to be weighed by experts to decide the real accuracy of
each approach.
6.1 Future Work
We present here some improvements to be made in future work. Some of them have not been
implemented in this work due to time constraints. Others are part of new projects in the
research framework of SALEM and AVALEM. First, with respect to the work presented here:
1. Qualitative evaluation of the obtained results by radiologists and neurologists. Add a
qualitative evaluation to our quantitative evaluation to balance the obtained results.
2. Modify the implemented algorithms with improvements: three out of four implemented
methods do not deal with spatial information. Some modifications could be made,
especially in building the training datasets, to include spatial information and improve the
robustness of the methods.
3. Optimize the implemented methods: although time is not a hard constraint in brain
tissue segmentation, all the methods have been implemented in MATLAB, and their running
time could be improved significantly by implementing them with other common tools such as
C++ with ITK.
4. MS lesion evaluation in tissue classification: lesion effects on tissue segmentation methods
have been evaluated by masking lesions as WM.
Appendix A
Results tables
Table A.1: Dice metrics computed from segmented Brainweb scans with different noise levels.
Reported values are mean ± standard deviation.

0% noise
        CSF                  GM                   WM
Method  0%     20%    40%    0%     20%    40%    0%     20%    40%
FAST    0.743  0.743  0.751  0.853  0.854  0.855  0.856  0.858  0.856
SPM5    0.751  0.748  0.755  0.876  0.874  0.875  0.883  0.883  0.883
SPM8    0.793  0.793  0.801  0.932  0.932  0.931  0.951  0.951  0.948
GAGMM   0.820  0.819  0.826  0.967  0.967  0.967  0.985  0.984  0.983
SOM     0.826  0.826  0.832  0.962  0.962  0.961  0.976  0.976  0.975
RFCM    0.823  0.823  0.831  0.960  0.961  0.961  0.974  0.975  0.975
FCM     0.827  0.826  0.832  0.965  0.965  0.964  0.980  0.979  0.978
KNN     0.267  0.254  0.265  0.842  0.841  0.839  0.964  0.963  0.960

3% noise
        CSF                  GM                   WM
Method  0%     20%    40%    0%     20%    40%    0%     20%    40%
FAST    0.703  0.701  0.699  0.901  0.900  0.900  0.958  0.957  0.955
SPM5    0.731  0.730  0.727  0.917  0.917  0.917  0.957  0.956  0.956
SPM8    0.764  0.763  0.759  0.933  0.933  0.933  0.954  0.955  0.955
GAGMM   0.778  0.776  0.771  0.948  0.948  0.948  0.964  0.964  0.964
SOM     0.783  0.781  0.777  0.944  0.944  0.943  0.959  0.960  0.960
RFCM    0.781  0.780  0.777  0.945  0.945  0.945  0.960  0.961  0.961
FCM     0.784  0.782  0.778  0.946  0.946  0.945  0.961  0.961  0.961
KNN     0.303  0.307  0.304  0.854  0.854  0.853  0.955  0.955  0.953

7% noise
        CSF                  GM                   WM
Method  0%     20%    40%    0%     20%    40%    0%     20%    40%
FAST    0.670  0.677  0.678  0.857  0.865  0.867  0.911  0.917  0.920
SPM5    0.698  0.700  0.704  0.878  0.877  0.881  0.901  0.902  0.904
SPM8    0.701  0.708  0.709  0.883  0.884  0.886  0.898  0.899  0.901
GAGMM   0.707  0.715  0.704  0.872  0.882  0.884  0.885  0.894  0.898
SOM     0.709  0.715  0.716  0.860  0.869  0.873  0.881  0.890  0.894
RFCM    0.707  0.715  0.716  0.876  0.882  0.883  0.894  0.901  0.902
FCM     0.709  0.716  0.716  0.864  0.873  0.876  0.883  0.892  0.895
KNN     0.091  0.108  0.117  0.790  0.796  0.799  0.883  0.892  0.896
Table A.2: Statistical evaluation on the BrainWeb synthetic database. Reported values are
mean ± standard deviation. dsc, Dice similarity; sens, sensitivity; spec, specificity.

        CSF                                    GM                                     WM
Method  dsc          sens         spec         dsc          sens         spec         dsc          sens         spec
FAST    0.71 ± 0.03  0.73 ± 0.06  0.68 ± 0.01  0.87 ± 0.02  0.86 ± 0.04  0.89 ± 0.06  0.91 ± 0.04  0.87 ± 0.09  0.96 ± 0.05
SPM5    0.73 ± 0.02  0.72 ± 0.06  0.74 ± 0.02  0.89 ± 0.02  0.88 ± 0.02  0.90 ± 0.04  0.91 ± 0.03  0.88 ± 0.07  0.96 ± 0.05
SPM8    0.75 ± 0.04  0.70 ± 0.08  0.83 ± 0.02  0.92 ± 0.02  0.91 ± 0.03  0.92 ± 0.02  0.93 ± 0.03  0.93 ± 0.03  0.94 ± 0.04
GAGMM   0.77 ± 0.05  0.68 ± 0.07  0.88 ± 0.03  0.93 ± 0.04  0.94 ± 0.05  0.93 ± 0.03  0.95 ± 0.04  0.95 ± 0.03  0.94 ± 0.05
SOM     0.77 ± 0.05  0.71 ± 0.07  0.85 ± 0.03  0.92 ± 0.04  0.90 ± 0.06  0.95 ± 0.03  0.94 ± 0.04  0.97 ± 0.03  0.92 ± 0.05
RFCM    0.77 ± 0.05  0.70 ± 0.07  0.87 ± 0.02  0.93 ± 0.04  0.92 ± 0.05  0.94 ± 0.03  0.94 ± 0.03  0.97 ± 0.02  0.92 ± 0.05
FCM     0.77 ± 0.05  0.71 ± 0.07  0.86 ± 0.03  0.93 ± 0.04  0.91 ± 0.05  0.95 ± 0.03  0.94 ± 0.04  0.97 ± 0.03  0.92 ± 0.05
KNN     0.22 ± 0.09  0.13 ± 0.06  0.86 ± 0.06  0.83 ± 0.03  0.95 ± 0.04  0.74 ± 0.02  0.94 ± 0.03  0.93 ± 0.02  0.94 ± 0.04
ASOM    0.77 ± 0.05  0.71 ± 0.06  0.77 ± 0.05  0.92 ± 0.04  0.90 ± 0.05  0.92 ± 0.04  0.95 ± 0.04  0.96 ± 0.03  0.93 ± 0.05
Table A.3: Dice metrics computed from segmented Brainweb scans with different noise levels
(0%, 3% and 7%). Reported values are mean ± standard deviation.

        CSF                                          GM                                           WM
Method  0%             3%             7%             0%             3%             7%             0%             3%             7%
FAST    0.746 ± 0.004  0.701 ± 0.002  0.675 ± 0.004  0.854 ± 0.001  0.900 ± 0.001  0.863 ± 0.005  0.857 ± 0.001  0.957 ± 0.001  0.916 ± 0.005
SPM5    0.751 ± 0.004  0.729 ± 0.002  0.701 ± 0.003  0.875 ± 0.001  0.917 ± 0.000  0.879 ± 0.002  0.883 ± 0.000  0.956 ± 0.000  0.902 ± 0.002
SPM8    0.796 ± 0.005  0.762 ± 0.003  0.706 ± 0.004  0.931 ± 0.001  0.933 ± 0.000  0.884 ± 0.002  0.950 ± 0.001  0.955 ± 0.000  0.899 ± 0.002
GAGMM   0.822 ± 0.004  0.775 ± 0.004  0.709 ± 0.006  0.967 ± 0.000  0.948 ± 0.000  0.879 ± 0.007  0.984 ± 0.001  0.964 ± 0.000  0.892 ± 0.007
SOM     0.828 ± 0.003  0.780 ± 0.003  0.713 ± 0.004  0.961 ± 0.001  0.943 ± 0.000  0.867 ± 0.007  0.976 ± 0.001  0.960 ± 0.000  0.888 ± 0.007
RFCM    0.826 ± 0.004  0.779 ± 0.002  0.713 ± 0.004  0.961 ± 0.001  0.945 ± 0.000  0.880 ± 0.004  0.975 ± 0.001  0.960 ± 0.000  0.899 ± 0.004
FCM     0.829 ± 0.003  0.781 ± 0.003  0.713 ± 0.004  0.965 ± 0.001  0.945 ± 0.000  0.871 ± 0.006  0.979 ± 0.001  0.961 ± 0.000  0.890 ± 0.006
KNN     0.262 ± 0.007  0.305 ± 0.002  0.106 ± 0.013  0.840 ± 0.002  0.854 ± 0.000  0.795 ± 0.005  0.963 ± 0.002  0.954 ± 0.001  0.890 ± 0.007
Table A.4: Statistical evaluation on the real T1 IBSR20 database. Reported values are
mean ± standard deviation. dsc, Dice similarity; sens, sensitivity; spec, specificity.

IBSR dataset
        CSF                                    GM                                     WM
Method  dsc          sens         spec         dsc          sens         spec         dsc          sens         spec
FAST    0.13 ± 0.04  0.91 ± 0.04  0.07 ± 0.03  0.68 ± 0.06  0.56 ± 0.05  0.85 ± 0.08  0.79 ± 0.10  0.82 ± 0.12  0.76 ± 0.08
SPM5    0.17 ± 0.07  0.86 ± 0.20  0.09 ± 0.04  0.76 ± 0.06  0.69 ± 0.08  0.85 ± 0.02  0.80 ± 0.04  0.78 ± 0.04  0.84 ± 0.09
SPM8    0.21 ± 0.07  0.89 ± 0.04  0.12 ± 0.05  0.78 ± 0.06  0.70 ± 0.07  0.88 ± 0.05  0.81 ± 0.08  0.82 ± 0.08  0.79 ± 0.08
GAGMM   0.25 ± 0.12  0.79 ± 0.10  0.16 ± 0.09  0.77 ± 0.09  0.71 ± 0.08  0.85 ± 0.12  0.74 ± 0.16  0.77 ± 0.22  0.75 ± 0.07
ANN     0.15 ± 0.06  0.89 ± 0.05  0.08 ± 0.03  0.69 ± 0.09  0.59 ± 0.08  0.85 ± 0.12  0.77 ± 0.14  0.81 ± 0.18  0.74 ± 0.07
RFCM    0.25 ± 0.05  0.81 ± 0.05  0.08 ± 0.03  0.73 ± 0.09  0.64 ± 0.07  0.84 ± 0.12  0.72 ± 0.14  0.82 ± 0.18  0.75 ± 0.08
FCM     0.14 ± 0.10  0.90 ± 0.08  0.15 ± 0.07  0.69 ± 0.12  0.58 ± 0.13  0.88 ± 0.11  0.77 ± 0.16  0.80 ± 0.22  0.67 ± 0.10
KNN     0.41 ± 0.12  0.34 ± 0.10  0.56 ± 0.20  0.86 ± 0.05  0.84 ± 0.07  0.88 ± 0.02  0.80 ± 0.05  0.83 ± 0.03  0.77 ± 0.08
Bibliography
[1] Julio Acosta-Cabronero, Guy B. Williams, João M.S. Pereira, George Pengas, and Peter J.
Nestor. The impact of skull-stripping and radio-frequency bias correction on grey-matter
segmentation for voxel-based morphometry. NeuroImage, 39(4):1654 – 1665, 2008.
[2] M.N. Ahmed, S.M. Yamany, N. Mohamed, A.A. Farag, and T. Moriarty. A modified fuzzy
c-means algorithm for bias field estimation and segmentation of mri data. Medical Imaging,
IEEE Transactions on, 21(3):193 –199, march 2002.
[3] Ayelet Akselrod-Ballin, Meirav Galun, John Moshe Gomori, Achi Brandt, and Ronen
Basri. Prior knowledge driven multiscale segmentation of brain mri. In Proceedings of the
10th international conference on Medical image computing and computer-assisted interven-
tion, MICCAI’07, pages 118–126, Berlin, Heidelberg, 2007. Springer-Verlag.
[4] Ayelet Akselrod-Ballin, Meirav Galun, Moshe Gomori, Ronen Basri, and Achi Brandt.
Atlas guided identification of brain structures by combining 3d segmentation and svm
classification. In Rasmus Larsen, Mads Nielsen, and Jon Sporring, editors, Medical Image
Computing and Computer-Assisted Intervention – MICCAI 2006, volume 4191 of Lecture
Notes in Computer Science, pages 209–216. Springer Berlin / Heidelberg, 2006.
[5] J. Ashburner and K.J. Friston. Unified segmentation. NeuroImage, 26:839–851, 2005.
[6] John Ashburner and Karl J. Friston. Voxel-based morphometry – the methods. NeuroImage,
11(6):805 – 821, 2000.
[7] M. Stella Atkins, Jeff J. Orchard, Ben Law, and Melanie K. Tory. Robustness of the brain
parenchymal fraction for measuring brain atrophy.
[8] Suyash P. Awate, Tolga Tasdizen, Norman Foster, and Ross T. Whitaker. Adaptive markov
modeling for mutual-information-based, unsupervised mri brain-tissue classification. Medical
Image Analysis, 10(5):726 – 739, 2006. The Eighth International Conference on Medical
Imaging and Computer Assisted Intervention – MICCAI 2005.
[9] Rohit Bakshi, Suzie Ariyaratana, Ralph H. B. Benedict, and Lawrence Jacobs. Fluid-
attenuated inversion recovery magnetic resonance imaging detects cortical and juxtacortical
multiple sclerosis lesions. Arch Neurol, 58(5):742–748, 2001.
[10] M. Balafar, A. Ramli, M. Saripan, and S. Mashohor. Review of brain mri image segmenta-
tion methods. Artificial Intelligence Review, 33:261–274, 2010. 10.1007/s10462-010-9155-0.
[11] Pierre-Louis Bazin and Dzung L. Pham. Homeomorphic brain image segmentation
with topological and statistical atlases. Medical Image Analysis, 12(5):616 – 625, 2008.
Special issue on the 10th international conference on medical imaging and computer
assisted intervention - MICCAI 2007.
[12] J. C. Bezdek, L. O. Hall, and L. P. Clarke. Review of MR image segmentation techniques
using pattern recognition. Medical Physics, 20:1033–1048, July 1993.
[13] Kristi Boesen, Kelly Rehm, Kirt Schaper, Sarah Stoltzner, Roger Woods, Eileen Liders,
and David Rottenberg. Quantitative comparison of four brain extraction algorithms. Neu-
roImage, 22(3):1255 – 1261, 2004.
[14] S. Bricq, Ch. Collet, and J.P. Armspach. Unifying framework for multimodal brain mri
segmentation based on hidden markov chains. Medical Image Analysis, 12(6):639 – 652,
2008. Special issue on information processing in medical imaging 2007.
[15] Antoni Buades, Bartomeu Coll, and Jean-Michel Morel. Nonlocal image and movie de-
noising. International Journal of Computer Vision, 76:123–139, 2008. 10.1007/s11263-007-
0052-1.
[16] Mariano Cabezas, Arnau Oliver, Xavier Llado, Jordi Freixenet, and Meritxell Bach Cuadra.
A review of atlas-based segmentation for magnetic resonance brain images. Computer
Methods and Programs in Biomedicine, 104(3):e158 – e177, 2011.
[17] Weiling Cai, Songcan Chen, and Daoqiang Zhang. Fast and robust fuzzy c-means clustering
algorithms incorporating local information for image segmentation. Pattern Recognition,
40(3):825 – 838, 2007.
[18] Benoît Caldairou, François Rousseau, Nicolas Passat, Piotr Habas, Colin Studholme, and
Christian Heinrich. A non-local fuzzy segmentation method: Application to brain mri.
In Xiaoyi Jiang and Nicolai Petkov, editors, Computer Analysis of Images and Patterns,
volume 5702 of Lecture Notes in Computer Science, pages 606–613. Springer Berlin /
Heidelberg, 2009.
[19] Ruben Cardenes, Rodrigo de Luis-Garcia, and Meritxell Bach-Cuadra. A multidimensional
segmentation evaluation for medical image data. Computer Methods and Programs in
Biomedicine, 96(2):108 – 124, 2009.
[20] Songcan Chen and Daoqiang Zhang. Robust image segmentation using fcm with spatial
constraints based on new kernel-induced distance measure. Systems, Man, and Cybernetics,
Part B: Cybernetics, IEEE Transactions on, 34(4):1907 –1916, aug. 2004.
[21] L. P. Clarke, R. P. Velthuizen, M. A. Camacho, J. J. Heine, M. Vaidyanathan, L. O. Hall,
R. W. Thatcher, and M. L. Silbiger. MRI segmentation: Methods and applications. Magn
Reson Imaging, 13:343–368, 1995.
[22] Chris A. Cocosco, Vasken Kollokian, Remi K.-S. Kwan, G. Bruce Pike, and Alan C. Evans.
Brainweb: Online interface to a 3d mri simulated brain database. NeuroImage, 5:425, 1997.
[23] Chris A. Cocosco, Alex P. Zijdenbos, and Alan C. Evans. A fully automatic and robust
brain mri tissue classification method. Medical Image Analysis, 7(4):513 – 527, 2003.
Medical Image Computing and Computer Assisted Intervention.
[24] Alastair Compston and Alasdair Coles. Multiple sclerosis. Lancet, 372(9648):1502–17,
2008.
[25] Thomas E. Conturo, Robert C. McKinstry, Joseph A. Aronovitz, and Jeffrey J. Neil.
Diffusion mri: Precision, accuracy and flow effects. NMR in Biomedicine, 8(7):307–332,
1995.
[26] Renske de Boer, Henri A. Vrooman, M. Arfan Ikram, Meike W. Vernooij, Monique M.B.
Breteler, Aad van der Lugt, and Wiro J. Niessen. Accuracy and reproducibility study of
automatic mri brain tissue segmentation methods. NeuroImage, 51(3):1047 – 1056, 2010.
[27] Renske De Boer, Henri A Vrooman, Fedde Van Der Lijn, Meike W Vernooij, M Arfan
Ikram, Aad Van Der Lugt, Monique M B Breteler, and Wiro Niessen. White matter lesion
extension to automatic brain tissue segmentation on mri. NeuroImage, 45(4):1151–1161,
2009.
[28] Ayse Demirhan and Inan Gulan. Combining stationary wavelet transform and self-
organizing maps for brain mr image segmentation. Engineering Applications of Artificial
Intelligence, 24(2):358 – 367, 2011.
[29] Luc Devroye. On the almost everywhere convergence of nonparametric regression function
estimates, 1981.
[30] L. R. Dice. Measures of the amount of ecologic association between species. Ecology,
26(3):297–302, July 1945.
[31] Tarun Dua, Paul Rompani, World Health Organization, and Multiple Sclerosis International
Federation. Atlas: multiple sclerosis resources in the world, 2008. World Health
Organization, Geneva, Switzerland, 2008.
[32] Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification (2nd Edition).
Wiley-Interscience, 2 edition, November 2001.
[33] Guillaume Dugas-Phocion, Miguel Ballester, Gregoire Malandain, Christine Lebrun, and
Nicholas Ayache. Improved em-based tissue segmentation and partial volume effect quan-
tification in multi-sequence brain mri. In Christian Barillot, David Haynor, and Pierre
Hellier, editors, Medical Image Computing and Computer-Assisted Intervention – MICCAI
2004, volume 3216 of Lecture Notes in Computer Science, pages 26–33. Springer Berlin /
Heidelberg, 2004.
[34] P. A. Filipek, C. Richelme, D. N. Kennedy, and V. S. Caviness. The young adult human
brain: an MRI-based morphometric analysis. Cereb Cortex, 4(4):344–360, 1994.
[35] M Filippi and F Agosta. Imaging biomarkers in multiple sclerosis. J Magn Reson Imaging,
31(4):770–88, 2010.
[36] M. Filippi, G. Iannucci, C. Tortorella, L. Minicucci, M.A. Horsfield, B. Colombo, M.P.
Sormani, and G. Comi. Comparison of ms clinical phenotypes using conventional and
magnetization transfer mri. Neurology, 52(3):588, 1999.
[37] Gath and Geva. Unsupervised optimal fuzzy clustering. Pattern Analysis and Machine
Intelligence, IEEE Transactions on, 11(7):773 –780, jul 1989.
[38] Yulin Ge, Robert I. Grossman, James S. Babb, Marcie L. Rabin, Lois J. Mannon, and
Dennis L. Kolson. Age-related total gray matter and white matter changes in normal
adult brain. part i: Volumetric mr imaging analysis. American Journal of Neuroradiology,
23(8):1327–1333, 2002.
[39] R. Guillemaud and M. Brady. Estimating the bias field of mr images. Medical Imaging,
IEEE Transactions on, 16(3):238 –251, june 1997.
[40] S.W. Hartley, A.I. Scher, E.S.C. Korf, L.R. White, and L.J. Launer. Analysis and validation
of automated skull stripping tools: A validation study based on 296 mr images from the
honolulu asia aging study. NeuroImage, 30(4):1179 – 1186, 2006.
[41] Khader M. Hasan, Indika S. Walimuni, Humaira Abid, Sushmita Datta, Jerry S. Wolin-
sky, and Ponnada A. Narayana. Human brain atlas-based multimodal mri analysis of
volumetry, diffusimetry, relaxometry and lesion distribution in multiple sclerosis patients
and healthy adult controls: Implications for understanding the pathogenesis of multiple
sclerosis and consolidation of quantitative mri results in ms. Journal of the Neurological
Sciences, 313(1–2):99 – 109, 2012.
[42] M. Hasanzadeh and S. Kasaei. Multispectral brain mri segmentation using genetic fuzzy
systems. In Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International
Symposium on, pages 1 –4, feb. 2007.
[43] R H Hashemi, W G Bradley, D Y Chen, J E Jordan, J A Queralt, A E Cheng, and
J N Henrie. Suspected multiple sclerosis: Mr imaging with a thin-section fast flair pulse
sequence. Radiology, 196(2):505–510, 1995.
[44] Renjie He, Balasrinivasa Sajja, Sushmita Datta, and Ponnada Narayana. Volume and shape
in feature space on adaptive fcm in mri segmentation. Annals of Biomedical Engineering,
36:1580–1593, 2008. 10.1007/s10439-008-9520-1.
[45] Zujun Hou. A review on MR image intensity inhomogeneity correction. International
Journal of Biomedical Imaging, 2006.
[46] A. Huang, R. Abugharbieh, R. Tam, and A. Traboulsee. Automatic mri brain tissue
segmentation using a hybrid statistical and geometric model. In Biomedical Imaging:
Nano to Macro, 2006. 3rd IEEE International Symposium on, pages 394 –397, april 2006.
[47] Paul Jaccard. The distribution of the flora in the alpine zone.1. New Phytologist, 11(2):37–
50, 1912.
[48] K. A. Jellinger. New frontiers of mr-based technique in multiple sclerosis. European Journal
of Neurology, 10(4):467–467, 2003.
[49] J.R. Jimenez-Alaniz, V. Medina-Banuelos, and O. Yanez-Suarez. Data-driven brain mri
segmentation supported on edge confidence and a priori tissue information. Medical Imag-
ing, IEEE Transactions on, 25(1):74 –83, jan. 2006.
[50] M. Joliot and B.M. Mazoyer. Three-dimensional segmentation and interpolation of mag-
netic resonance brain images. Medical Imaging, IEEE Transactions on, 12(2):269 –277,
jun 1993.
[51] T. Kalaiselvi and K. Somasundaram. Fuzzy c-means technique with histogram based
centroid initialization for brain tissue segmentation in mri of head scans. In Humanities,
Science Engineering Research (SHUSER), 2011 International Symposium on, pages 149
–154, june 2011.
[52] Tina Kapur, W.Eric L. Grimson, William M. Wells III, and Ron Kikinis. Segmentation
of brain tissue from magnetic resonance images. Medical Image Analysis, 1(2):109 – 127,
1996.
[53] Stelios Krinidis and Vassilios Chatzis. A robust fuzzy local information c-means clustering
algorithm. Trans. Img. Proc., 19(5):1328–1337, May 2010.
[54] Cornelia Laule, Irene M Vavasour, Esther Leung, David KB Li, Piotr Kozlowski, Anthony L
Traboulsee, Joel Oger, Alex L MacKay, and GR Wayne Moore. Pathological basis of
diffusely abnormal white matter: insights from magnetic resonance imaging and histology.
Multiple Sclerosis Journal, 17(2):144–150, 2011.
[55] F. D. Lublin and S. C. Reingold. Defining the clinical course of multiple sclerosis: results
of an international survey. National Multiple Sclerosis Society (USA) Advisory Committee
on Clinical Trials of New Agents in Multiple Sclerosis. Neurology, 46(4):907–911, April
1996.
[56] J. B. MacQueen. Some methods for classification and analysis of multivariate observa-
tions. In L. M. Le Cam and J. Neyman, editors, Proc. of the fifth Berkeley Symposium on
Mathematical Statistics and Probability, volume 1, pages 281–297. University of California
Press, 1967.
[57] J.L. Marroquin, B.C. Vemuri, S. Botello, E. Calderon, and A. Fernandez-Bouzas. An
accurate and efficient bayesian method for automatic segmentation of brain mri. Medical
Imaging, IEEE Transactions on, 21(8):934 –945, aug. 2002.
[58] Karla L. Miller and John M. Pauly. Nonlinear phase correction for navigated diffusion
imaging. Magnetic resonance in medicine : official journal of the Society of Magnetic Res-
onance in Medicine / Society of Magnetic Resonance in Medicine, 50(2):343–353, August
2003.
[59] J M Minderhoud, J H van der Hoeven, and A J Prange. Course and prognosis of chronic
progressive multiple sclerosis. results of an epidemiological study. Acta Neurol Scand,
78(1):10–5, 1988.
[60] D. L. Pham, C. Xu, and J. L. Prince. Current methods in medical image segmentation.
Annual review of biomedical engineering, 2(1):315–337, 2000.
[61] Dzung L. Pham. Spatial models for fuzzy clustering. Computer Vision and Image Under-
standing, 84(2):285 – 297, 2001.
[62] W.E. Reddick, J.O. Glass, E.N. Cook, T.D. Elkin, and R.J. Deaton. Automated segmenta-
tion and classification of multispectral magnetic resonance images of brain using artificial
neural networks. Medical Imaging, IEEE Transactions on, 16(6):911 –918, dec. 1997.
[63] Kelly Rehm, Kirt Schaper, Jon Anderson, Roger Woods, Sarah Stoltzner, and David Rot-
tenberg. Putting our heads together: a consensus approach to brain/non-brain segmenta-
tion in t1-weighted mr volumes. NeuroImage, 22(3):1262 – 1270, 2004.
[64] M. A. Rocca, N. Anzalone, A. Falini, and M. Filippi. Contribution of magnetic resonance
imaging to the diagnosis and monitoring of multiple sclerosis. La Radiologia Medica, pages
1–14, March 2012.
[65] M. Rovaris, A. Gass, R. Bammer, S. J. Hickman, O. Ciccarelli, D. H. Miller, and M. Filippi.
Diffusion mri in multiple sclerosis. Neurology, 65(10):1526–1532, 2005.
[66] S. Ruan, C. Jaggi, J. Xue, J. Fadili, and D. Bloyet. Brain tissue classification of magnetic
resonance images using partial volume modeling. Medical Imaging, IEEE Transactions on,
19(12):1179 –1187, dec. 2000.
[67] R.A. Rudick, E. Fisher, J.-C. Lee, J. Simon, L. Jacobs, and the Multiple Sclerosis Collab-
orative Research Group. Use of the brain parenchymal fraction to measure whole brain
atrophy in relapsing-remitting ms. Neurology, 53(8):1698, 1999.
[68] Ali Sahraian and E.-W. Radue. Mri atlas of ms lesions. 2008.
[69] Benoit Scherrer, Florence Forbes, Catherine Garbay, and Michel Dojat. Fully bayesian joint
model for mr brain scan tissue and structure segmentation. In Dimitris Metaxas, Leon Axel,
Gabor Fichtinger, and Gabor Szekely, editors, Medical Image Computing and Computer-
Assisted Intervention – MICCAI 2008, volume 5242 of Lecture Notes in Computer Science,
pages 1066–1074. Springer Berlin / Heidelberg, 2008.
[70] G. A. F. Seber. Multivariate Distributions, pages 17–58. John Wiley & Sons, Inc., 2008.
[71] Shan Shen, W. Sandham, M. Granat, and A. Sterr. Mri fuzzy segmentation of brain tissue
using neighborhood attraction with neural-network optimization. Information Technology
in Biomedicine, IEEE Transactions on, 9(3):459 –467, sept. 2005.
[72] Andrew Simmons, Paul S. Tofts, Gareth J. Barker, and Simon R. Arridge. Sources of
intensity nonuniformity in spin echo images at 1.5 t. Magnetic Resonance in Medicine,
32(1):121–128, 1994.
[73] J.G. Sled, A.P. Zijdenbos, and A.C. Evans. A nonparametric method for automatic cor-
rection of intensity nonuniformity in mri data. Medical Imaging, IEEE Transactions on,
17(1):87 –97, feb. 1998.
[74] S.M Smith. Fast robust automated brain extraction. Hum. Brain Mapp., 17(3):143–155,
November 2002.
[75] M Styner, J Lee, B Chin, M Chin, O Commowick, H Tran, S Markovic-Plese, V Jewells,
and S Warfield. 3d segmentation in the clinic: A grand challenge ii: Ms lesion segmentation.
MIDAS, pages 1–6, 2008.
[76] L. Szilagyi, Z. Benyo, S.M. Szilagyi, and H.S. Adam. Mr brain image segmentation using
an enhanced fuzzy c-means algorithm. In Engineering in Medicine and Biology Society,
2003. Proceedings of the 25th Annual International Conference of the IEEE, volume 1,
pages 724 – 726 Vol.1, sept. 2003.
[77] D. Tian and L. Fan. A brain mr images segmentation method based on som neural network.
In Bioinformatics and Biomedical Engineering, 2007. ICBBE 2007. The 1st International
Conference on, volume 2, pages 686 –689, july 2007.
[78] GuangJian Tian, Yong Xia, Yanning Zhang, and Dagan Feng. Hybrid genetic and vari-
ational expectation-maximization algorithm for gaussian-mixture-model-based brain mr
image segmentation. Information Technology in Biomedicine, IEEE Transactions on,
15(3):373 –380, may 2011.
[79] J. Tohka, E. Krestyannikov, I.D. Dinov, A.M. Graham, D.W. Shattuck, U. Ruotsalainen,
and A.W. Toga. Genetic algorithms for finite mixture model based voxel classification in
neuroimaging. Medical Imaging, IEEE Transactions on, 26(5):696 –711, may 2007.
[80] On Tsang, Ali Gholipour, Nasser Kehtarnavaz, Kaundinya Gopinath, Richard Briggs, and
Issa Panahi. Comparison of tissue segmentation algorithms in neuroimage analysis software
tools. In Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th Annual
International Conference of the IEEE, pages 3924 –3928, aug. 2008.
[81] K. Van Leemput, F. Maes, D. Vandermeulen, and P. Suetens. Automated model-based
bias field correction of mr images of the brain. Medical Imaging, IEEE Transactions on,
18(10):885 –896, oct. 1999.
[82] U. Vovk, F. Pernus, and B. Likar. A review of methods for correction of intensity inhomo-
geneity in mri. Medical Imaging, IEEE Transactions on, 26(3):405 –421, march 2007.
[83] U. Vovk, F. Pernus, and B. Likar. A review of methods for correction of intensity inhomo-
geneity in mri. Medical Imaging, IEEE Transactions on, 26(3):405 –421, march 2007.
[84] S.K. Warfield, K.H. Zou, and W.M. Wells. Simultaneous truth and performance level esti-
mation (staple): an algorithm for the validation of image segmentation. Medical Imaging,
IEEE Transactions on, 23(7):903 –921, july 2004.
[85] Michael Wels, Yefeng Zheng, Martin Huber, Joachim Hornegger, and Dorin Comaniciu.
A discriminative model-constrained em approach to 3d mri brain tissue classification and
intensity non-uniformity correction. Physics in Medicine and Biology, 56(11):3269, 2011.
[86] Steven D. Wolff and Robert S. Balaban. Magnetization transfer contrast (mtc) and tissue
water proton relaxation in vivo. Magnetic Resonance in Medicine, 10(1):135–144, 1989.
[87] Zhao Yi, Antonio Criminisi, Jamie Shotton, and Andrew Blake. Discriminative, seman-
tic segmentation of brain tissue in mr images. In Guang-Zhong Yang, David Hawkes,
Daniel Rueckert, Alison Noble, and Chris Taylor, editors, Medical Image Computing and
Computer-Assisted Intervention – MICCAI 2009, volume 5762 of Lecture Notes in Com-
puter Science, pages 558–565. Springer Berlin / Heidelberg, 2009.
[88] Lian Yuanfeng and Wu Falin. Three-dimensional probabilistic neural network using for mr
image segmentation. In Electronic Measurement Instruments (ICEMI), 2011 10th Inter-
national Conference on, volume 3, pages 127 –131, aug. 2011.
[89] Y. Zhang, M. Brady, and S. Smith. Segmentation of brain mr images through a hidden
markov random field model and the expectation-maximization algorithm. Medical Imaging,
IEEE Transactions on, 20(1):45 –57, jan 2001.
[90] Yu Jin Zhang. A review of recent evaluation methods for image segmentation. In Signal
Processing and its Applications, Sixth International, Symposium on. 2001, volume 1, pages
148 –151 vol.1, 2001.