GHENT UNIVERSITY
FACULTY OF PHARMACEUTICAL SCIENCES
Department of Bio Analysis
Laboratory of Food Analysis
Master thesis performed at:
NATIONAL INSTITUTE OF OCCUPATIONAL
HEALTH
Department of the Chemical and
Biological Work Environment
Academic year 2014-2015
UNTARGETED METABOLOMICS IN OCCUPATIONAL HEALTH – THE SEWAGE WORKER CASE
Florence GOETHALS
First Master of Pharmaceutical Care
Promoter:
Prof. Dr. Apr. S. De Saeger
Co-promoter:
Dr. S. Uhlig
Commissioners:
Dr. M. De Boevre
Prof. Dr. K. Audenaert
COPYRIGHT
“The author and the promoters give the authorization to consult and to copy parts of this
thesis for personal use only. Any other use is limited by the laws of copyright, especially
concerning the obligation to refer to the source whenever results from this thesis are cited.”
May …, 2015
Promoter Author
Prof. Dr. S. De Saeger Florence Goethals
SUMMARY
Growing concern about the occupational safety of sewage workers has created a need for
better insight into these workers’ state of health. Previous investigations found that these
individuals suffer from headache, reduced lung function, irritation of the respiratory tract
and other symptoms due to daily exposure to potentially harmful contaminants in sewage. With
the major objective of investigating these workers’ health more thoroughly, differences
between exposed individuals and others, who work in safe and healthy environments, had to be
established. Therefore, an untargeted HPLC-HRMS metabolomics approach using serum samples
was chosen in order to discover metabolic changes in sewage workers as a result of the
exposures in their working environment.
Serum samples were analyzed by two orthogonal HPLC-HRMS methods employing
either hydrophilic interaction liquid chromatography or reversed-phase HPLC. Raw data
were preprocessed using MZmine in order to create data sets consisting of “true” metabolic
features. The two groups (i.e. exposed vs. control) were then compared using multivariate
data analyses, including principal component analysis (PCA) and orthogonal partial least
squares – discriminant analysis (OPLS-DA). Extraction of the most significant variables from
the OPLS-DA models finally resulted in 13 potential metabolic markers out of thousands. The
identity of eight of these could tentatively be established based on calculation of elemental
formulae, database searches and study of MS2 product ion spectra obtained from
data-dependent scanning using ion trap MS. The tentatively identified metabolites were two
amino acids (phenylalanine, tyrosine), a dipeptide (Phe-Phe) and phosphocholines. Whether or
not these metabolites can be used for further elucidation of the adverse effects connected to
working in a sewage environment needs to be shown in the future.
SAMENVATTING
Growing uncertainty about the safety of workers in sewage and wastewater treatment plants
has created a need for better insight into the state of health of these workers. Earlier
research has already shown that these individuals can be prone to headache, reduced lung
function, irritation of the airways etc. as a result of daily exposure to potentially harmful
contaminants in waste and sewage water. With the main objective of investigating the health
of these workers in more depth, a comparison had to be made between these exposed
individuals and other employees working under safe and healthy conditions. For this purpose
an untargeted HPLC-HRMS metabolomics method was used, with the intention of detecting
metabolic changes in the metabolome of these workers as a result of exposure in their
working environment.
Serum samples were analyzed using two orthogonal HPLC-HRMS methods, hydrophilic
interaction liquid chromatography on the one hand and reversed-phase HPLC on the other. The
raw data were processed with MZmine in order to produce data sets containing information on
the optimally usable metabolites. The two groups (i.e. exposed vs. control) could then be
compared by means of multivariate data analysis, namely principal component analysis (PCA)
and orthogonal partial least squares – discriminant analysis (OPLS-DA). Extraction of the
most significant variables from the OPLS-DA models finally resulted in 13 potential
metabolomic biomarkers out of thousands of metabolites. The identity of eight of these
metabolites could be established tentatively, based on determination of the elemental
composition, database searches and study of the MS2 ion spectra. The tentatively identified
metabolites were two amino acids (phenylalanine, tyrosine), a dipeptide
(phenylalanine-phenylalanine) and phosphocholines. Whether or not these metabolites can be
used to further clarify the harmful effects associated with the hazardous working
environment remains to be demonstrated in the future.
THANKS TO
First of all, I would like to thank Prof. Dr. S. De Saeger for giving me the opportunity to work
on and write my thesis abroad. In particular, I would like to thank Dr. S. Uhlig for the excellent
guidance throughout the work, for everything I learned during the past few months and,
most of all, for taking so much time to correct and give feedback on my thesis.
I would also like to thank everybody at STAMI for being warm-hearted and
helpful and especially for all the experience I gained during my work. In addition, I would like to
thank all the people I met during my stay in Norway for all the experiences and the beautiful
memories. I want to thank my family for giving me the opportunity to study abroad and for
their faith in me. Finally, I want to thank my boyfriend for his visits and his support.
TABLE OF CONTENTS
1 INTRODUCTION
1.1 METABOLOMICS
1.1.1 Background
1.1.2 Systems Biology
1.1.2.1 Genomics
1.1.2.2 Transcriptomics
1.1.2.3 Proteomics
1.1.2.4 Metabolomics
1.1.3 Analytical Methodologies
1.1.3.1 Target Analysis
1.1.3.2 Metabolite Profiling
1.1.4 Analytical platforms for detection of metabolites
1.1.4.1 MS based metabolomics
1.1.4.2 NMR based metabolomics
1.2 THE SEWAGE WORKERS PROJECT
1.2.1 Background
1.2.2 Contaminants
1.2.2.1 Non-infectious biological agents
1.2.2.2 Toxic gases
1.2.2.3 Infectious bacteria (pathogens) and viruses
1.2.2.4 Chemical agents
2 OBJECTIVES
3 METHODS AND MATERIALS
3.1 SAMPLING (SAMPLE COLLECTION)
3.2 SAMPLE PREPARATION
3.2.1 Protein precipitation and preparation of samples for HILIC-HRMS
3.2.2 Preparation of samples for RP-HPLC-MS
3.2.3 Lyophilisation
3.3 LIQUID CHROMATOGRAPHY-MASS SPECTROMETRY (DATA ACQUISITION)
3.3.1 Abstract
3.3.1.1 UHPLC
3.3.1.2 The Q Exactive benchtop Orbitrap mass spectrometer
3.3.2 RP-HPLC
3.3.3 HILIC
3.3.4 High-resolution mass spectrometry (HRMS)
3.3.5 Linear ion trap mass spectrometry (ITMS)
3.4 MASS SPECTROMETRY DATA PROCESSING
3.4.1 Raw data file conversion
3.4.2 Data processing in MZmine
3.4.2.1 Peak detection/ peak picking
3.4.2.2 Deisotoping
3.4.2.3 Peak list alignment
3.4.2.4 Gap filling
3.4.2.5 Peak list filtering
3.4.2.6 Identification
3.4.3 File export and normalization
3.5 DATA ANALYSIS
3.5.1 Principal component analysis (PCA)
3.5.2 Orthogonal partial least squares – discriminant analysis
3.5.3 Identification of potential metabolite markers
4 RESULTS AND DISCUSSION
4.1 SAMPLE SELECTION AND SAMPLE PREPARATION
4.2 LC-HRMS ANALYSES AND DATA PROCESSING
4.3 MULTIVARIATE DATA ANALYSES
4.4 SELECTION OF POTENTIAL METABOLOMIC MARKERS OF EXPOSURE
4.5 TENTATIVE IDENTIFICATION OF METABOLITES
4.6 INSTRUMENTAL DRIFT AND REPRODUCIBILITY
5 CONCLUSION
6 BIBLIOGRAPHY
7 APPENDIX
LIST OF ABBREVIATIONS
CRP C-Reactive Protein
CSF Cerebrospinal Fluid
ESI Electrospray Ionization
FT-IR Fourier Transform Infrared Spectroscopy
GC Gas Chromatography
H2S Hydrogen Sulfide
HESI Heated-Electrospray Ionization
HILIC Hydrophilic Interaction Liquid Chromatography
ITMS Ion Trap Mass Spectrometry
LC Liquid Chromatography
LPS Lipopolysaccharide
MS Mass Spectrometry
MVA Multivariate Statistical Analysis
NMR Nuclear Magnetic Resonance
OPLS-DA Orthogonal Partial Least Squares – Discriminant Analysis
PAR Pareto-scaling
PC Principal Component
PCA Principal Component Analysis
QC Quality Control
RDB Ring Double Bond Equivalent
RP Reversed Phase
RSD Relative Standard Deviation
RT Retention Time
UHPLC Ultra High-Performance Liquid Chromatography
UV Unit Variance
VIP-plot Variable Importance in the Projection-plot
1 INTRODUCTION
1.1 METABOLOMICS
1.1.1 Background
Metabolomics is the comprehensive analysis in which all small-molecule metabolites
of a biological system are systematically identified and quantified. Such an approach aims to
reveal the metabolome (or part of it) of a biological system1 2. The metabolome is defined
as the sum of all small metabolites in a biological system such as a cell culture or a living
organism. It is generally accepted that it is impossible to detect all of the metabolites in a
biological sample with a single technology3 4 5. Metabolites show an enormous variation in
chemical structure and, consequently, in physicochemical properties. This diversity is
attributed to the existence of various types of molecules, e.g. proteins, lipids,
nucleotides, sugars etc. In addition to differing in functional groups, size and
hydrophilicity, these molecules also occur over a very broad concentration range6.
It is estimated that the human metabolome spans nearly 11 orders of magnitude of
concentration (approximately pmol-mmol)7.
Metabolomics was first introduced by Oliver et al. in 1998, and research on the
topic began to emerge at the end of the 20th century, a few decades after the
introduction of genomics and proteomics8 9. This implies that both the technology and the
research in genomics and proteomics are as yet much more advanced than in
metabolomics6. This lag in the development of metabolomics technologies, in comparison
with the other, more advanced ‘omics’ fields, is consistent with the fact that no single
technology is able to identify all metabolites of the metabolome at once6. Also, to improve
and broaden the application of metabolomic studies, limitations in metabolomics technology,
such as those in sample preparation and the lack of LC-MS databases and metabolite
standards, need to be resolved10.
The most commonly analyzed biofluids in metabolomics investigations are plasma,
serum, urine and cerebrospinal fluid (CSF), but other tissues or fluids such as saliva or
seminal fluid can also be used11. Of these, urine is used most often because it can be
collected non-invasively, is simple to prepare and has a low protein content12. Collecting
CSF samples is very invasive and therefore undesirable when a large number of samples has
to be acquired. Plasma and serum sampling has the advantage of being less invasive than CSF
collection, and these fluids also cover the largest part of the metabolome of a living
organism11.
1.1.2 Systems Biology
Together with several other ‘omics’, e.g. genomics, transcriptomics and proteomics,
metabolomics forms part of what Dettmer et al. call ‘the omics cascade’6 (Fig. 1.1). All
these components can be used to illustrate the association between the genotype and the
phenotype of organisms3.
Figure 1.1 The ‘omics cascade’6.
The metabolomics approach has created a new dimension for research in many
branches of life science. Its importance for the inspection of food quality has been
confirmed by Ryan et al. (2005), and it also offers major opportunities for the discovery of
novel drugs and bioactive molecules13. It has also become a useful complementary tool for
functional genomics, which tries to identify the function and the activity of genes by
establishing a better understanding of the association between genes and the functional
phenotype9 13 14. Metabolomics is also an essential part of systems biology and has helped
to reveal how biological systems work. In fact, functional genomics and systems biology use
analogous approaches, but the main goal of the latter is to integrate genomics,
transcriptomics, proteomics and metabolomics for a more global comprehension of biological
systems and to discover the structure of the entire system under investigation13 15. More
specifically, systems biology focuses on the behavior and the relationships of all
interacting elements and the environment in a biological system rather than examining
individual genes, proteins or metabolites. Thus, systems biology tries to reveal the impact
of particular perturbations, such as biological, genetic or chemical perturbations, at the
genomic, proteomic and metabolomic level5 13 16.
1.1.2.1 Genomics
‘The omics cascade’ starts with genomics (Fig. 1.1). Genomics research deals
with investigating an organism’s complete set of DNA. Commonly the entire genome is
studied even when the intention is to clarify the function of single genes, their activity and
their abundance in the genome17. Hence genomics data are associated primarily with the
genotype.
1.1.2.2 Transcriptomics
The second part of the ‘omics cascade’ is transcriptomics. The transcriptome includes
the whole set of transcripts in a cell, i.e. mRNA, small RNA and non-coding RNA18. The
transcriptome reflects the expression levels of genes under a variety of developmental
or physiological conditions and forms the basis for the synthesis of proteins9 19.
1.1.2.3 Proteomics
The proteomics approach deals with the characterization of the proteome, i.e. the
collection of all proteins in an organism including their downstream modifications20 21.
The analysis of proteins is not straightforward. One reason is that proteins are composed
of amino acids with different physicochemical properties, which complicates their
analysis22. Another important reason is that the proteome is rather complex, as proteins may
degrade, become enzymatically or chemically modified, get spliced or form complexes with
other proteins22.
1.1.2.4 Metabolomics
The aim of the metabolomics approach is to identify and characterize endogenous
and exogenous low-molecular-weight metabolites (typically lower than 1500 Da) in
biological samples3 8 23. The metabolites found in such samples can be either organic or
inorganic compounds and either primary or secondary metabolites3 4. A metabolomics study
tries to elucidate either the whole or part of the metabolome. It has been stated that
metabolomics may supply the most ‘functional’ information of all ‘omics’ because
metabolites are the end products of regulatory processes in cells1 13. Metabolite levels
can be considered the final response of biological systems to environmental and genetic
changes and hence define the phenotype of an organism more closely1 24 25. The measured
(part of the) metabolome does not solely comprise metabolites that originate from the
organism’s own, genome-encoded metabolism. Exogenously acquired molecules and their
biotransformation products, among which are drugs and compounds from foods such as food
additives, will also be present in the sample25. As already mentioned, the enormous
complexity of the metabolome means that metabolomics analysis is not straightforward6.
While the proteome and the transcriptome are, on the molecular scale, assemblies of 20
amino acids and four nucleotides, respectively, the metabolome contains a much greater
variability. Thus, its analysis requires particular sample preparation steps, depending on
the choice of the analytical technique(s)1 6.
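As a small illustration of the molecular-weight criterion mentioned above, the following Python sketch computes monoisotopic masses from elemental formulae and checks them against the 1500 Da rule of thumb. It is only a sketch under stated assumptions: the function names and the cut-off constant are hypothetical and not part of any software used in this work, and the two amino acids serve merely as familiar examples of low-molecular-weight metabolites.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Hypothetical cut-off, taken from the "typically lower than 1500 Da" rule of thumb.
SMALL_MOLECULE_LIMIT_DA = 1500.0


def monoisotopic_mass(formula):
    """Sum the isotope masses for an elemental formula given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())


def is_small_molecule(formula, limit=SMALL_MOLECULE_LIMIT_DA):
    """Return True if the neutral monoisotopic mass lies below the cut-off."""
    return monoisotopic_mass(formula) < limit


# Elemental formulae of two amino acids: C9H11NO2 and C9H11NO3.
phenylalanine = {"C": 9, "H": 11, "N": 1, "O": 2}
tyrosine = {"C": 9, "H": 11, "N": 1, "O": 3}

for name, formula in [("phenylalanine", phenylalanine), ("tyrosine", tyrosine)]:
    mass = monoisotopic_mass(formula)
    print(f"{name}: {mass:.4f} Da, below cut-off: {is_small_molecule(formula)}")
```

Both compounds lie far below the cut-off, whereas a typical protein of several kDa would be excluded by the same check.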
Metabolome analysis enables, among other things, the identification of metabolites
involved in disease pathophysiology. Such an analysis may reveal metabolic
changes in the studied individuals and thereby identify predictive biomarkers that could
provide valuable insights into disease mechanisms and result in earlier intervention26 27
28. Robust analytical methods are required to detect those metabolic differences29.
Kume et al. (2015), for example, used metabolomics analysis to identify diagnostic
biomarkers of chronic fatigue in humans. Their results showed that the metabolites that
differed significantly between an affected and a control group may be
promising diagnostic biomarkers for chronic fatigue27.
1.1.3 Analytical Methodologies
There are two basic analytical approaches in metabolomics: target
analysis and metabolite profiling. Metabolite profiling can be subdivided into metabolic
fingerprinting and metabolic footprinting5 24.
1.1.3.1 Target Analysis
Target analysis is the analytical approach that aims to detect and quantify one
predefined metabolite, or a small set of predefined metabolites, related to a specific
metabolic reaction or pathway in biological samples4 5 13 24. Signals from the
other, non-targeted metabolites are disregarded because the approach is not aimed at
detecting as many metabolites as possible13 30. For that reason, the sample preparation
and separation methods are designed to remove unneeded metabolites, which might
otherwise interfere with the analysis1 4. This approach is
often used when low detection limits are needed1 4. The number of
metabolites that can be included in a target approach is nowadays high, i.e. up to several
hundred different metabolites can be detected with acceptable precision and accuracy5 30 31.
1.1.3.2 Metabolite Profiling
Metabolite profiling, also known as metabolomic profiling, is generally used
for the qualitative and semi-quantitative screening of a large number of metabolites of
known or unknown identity5 13 24. This approach is widely used in drug discovery, in studies
of metabolic biotransformation and for the elucidation of metabolic responses to
therapeutic treatments4.
1.1.3.2.1 Metabolic Fingerprinting
This subclass is a non-targeted methodology that involves the global high-throughput
analysis of biological samples in order to classify them according to
their origin (e.g. healthy/disease) or biological status (e.g. control/case)4 1 32. The
intention of this kind of investigation is to compare patterns of metabolites, or
fingerprints, that may differ in response to a disease or to genetic or
environmental alterations6 33. This approach is very useful for diagnostic
purposes, quality control of products and the screening of mutant collections1.
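The idea of classifying samples by their overall metabolite pattern can be sketched in a few lines of Python. The example below uses a simple nearest-centroid rule, which is chosen here purely for illustration and is not the multivariate method applied in this thesis (PCA and OPLS-DA are described in section 3.5). All intensities, group labels and function names are invented for the sketch.

```python
import math


def centroid(profiles):
    """Mean intensity per feature over a list of equally long profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def distance(a, b):
    """Euclidean distance between two fingerprints."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(profile, centroids):
    """Assign a fingerprint to the group whose centroid is closest."""
    return min(centroids, key=lambda label: distance(profile, centroids[label]))


# Invented, unitless intensities for three metabolic features per sample.
training = {
    "control": [[1.0, 2.0, 0.5], [1.2, 1.8, 0.6], [0.9, 2.1, 0.4]],
    "case": [[2.0, 1.0, 1.5], [2.2, 0.9, 1.4], [1.9, 1.1, 1.6]],
}
centroids = {label: centroid(profiles) for label, profiles in training.items()}

unknown = [2.1, 1.0, 1.3]
print(classify(unknown, centroids))
```

In a real fingerprinting study the profiles would contain hundreds to thousands of features, which is why dimension-reducing multivariate methods are normally preferred over a direct distance comparison.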
1.1.3.2.2 Metabolic Footprinting
From a technical point of view, metabolic footprinting is similar to metabolic
fingerprinting. The main difference is that metabolic fingerprinting focuses on the
intracellular metabolites (i.e. the endometabolome), whereas metabolic footprinting focuses
on the extracellular metabolites that are left behind or secreted into the medium by the
cells (i.e. the exometabolome)5 24 34 35.
1.1.4 Analytical platforms for detection of metabolites
Several analytical platforms that are useful for metabolomics investigations have been
developed, e.g. nuclear magnetic resonance (NMR) spectroscopy, Fourier transform infrared
spectroscopy (FT-IR) and MS-based techniques, including GC-MS, LC-MS, FT-MS and UPLC-MS36.
Among those, NMR spectroscopy and gas chromatography (GC) or liquid chromatography (LC)
coupled to mass spectrometry (MS) are the most widely applied. Each of these three
techniques has its own advantages and drawbacks for use in metabolomics investigations23 36.
1.1.4.1 MS based metabolomics
Mass spectrometry coupled with chromatographic techniques is the most commonly
used technology in metabolomics studies4 36. In general, this technology makes it possible
to identify metabolites in samples by rapid, highly sensitive and highly selective
quantitative and qualitative analyses4. MS-based techniques without the use of
high-throughput separation techniques also offer these advantages, but the sample
preparation in this case can cause metabolite loss36. Although it is better to combine MS
with a separation technique, there are differences depending on the chromatographic
technique applied, of which GC and LC are used most often.
The main principle of GC-MS is to separate thermally stable and volatile compounds
by GC before they are detected by MS4. However, most compounds need to be
derivatized to acquire this thermal stability and volatility before analysis by GC-MS4 23.
The elaborate sample preparation is a significant disadvantage for the applicability of this
technique4. On the other hand, GC-MS provides efficient and reproducible
analyses23 36.
LC-MS is the other MS-based technology that is widely applied in metabolomics
studies. Since it avoids the extensive chemical derivatization of molecules, it is used
more often than GC-MS36. The LC-MS principle is explained in section 3.3.1.
1.1.4.2 NMR based metabolomics
NMR spectroscopy is another frequently used analytical platform. It is known to be
less sensitive than the MS platform, but it has the benefit of requiring no or only
simple sample preparation and no derivatization37. The technique is highly reproducible and
non-destructive, and the sample does not come into direct contact with the instrument’s
operational components6 23 30. Thus, contamination is minimal, which also enables the
routine and high-throughput analysis of large numbers of samples23.
1.2 THE SEWAGE WORKERS PROJECT
1.2.1 Background
To counteract the ever-increasing pollution of the environment, and to prevent
possible diseases in the human population and unhygienic conditions, there is a considerable
need for waste control. Treatment of sewage is an important part of this waste control
and includes the removal of various harmful contaminants from sewage by biological,
physical and chemical processes38. Physical processes for the removal of contaminants are
operations based on sedimentation of contaminants in wastewater. Chemical
processes are those in which chemical reactions occur, such as flocculation into bigger
particles before sedimentation. All these processes contribute to the conversion of sewage
into clean water and a residual product, sludge. As a consequence of these processes,
chemicals and microorganisms may become airborne in sewage treatment plants
and may be harmful to workers when inhaled38.
Rising doubt about the health and safety of workers in sewage treatment
plants has already led to various investigations concerning these sewage workers’
health. These were carried out because sewage workers may be chronically exposed to a
variety of contaminants during work. These contaminants comprise toxic gases from
microbial degradation, chemical agents, infectious biological agents as well as non-infectious
biological agents39 40. A brief description of these pollutants can be found in section
1.2.2. The complex and variable exposure to different contaminants and the broad diversity
of health hazards reported among workers handling waste water make it very difficult
to study the associations between exposure and health effects40.
1.2.2 Contaminants
1.2.2.1 Non-infectious biological agents
Endotoxins or lipopolysaccharides (LPS) are the most important exposure agents among
sewage workers and are therefore the most studied causative agents for changes in the state
of health among these individuals. Endotoxins are outer membrane components of
gram-negative bacteria, the type of bacteria that dominates the bacterial exposure in
sewage39. These components are released when the bacteria die and the cell wall
disintegrates41. They can be present in sewage treatment plants in amounts that exceed
those that give rise to symptoms and illness38 42. The primary target of endotoxins is the
lungs. Because endotoxins have a high pro-inflammatory potency and produce an
inflammatory response when inhaled, the inhalation of endotoxin aerosols can lead to acute
airflow obstruction, shock and death38 43. Besides the respiratory intake of endotoxins,
traces of these components can be found in food and water, but this type of exposure is less
common in sewage treatment plants41. Rylander (1999) confirmed data from previous
studies on an increased incidence of diarrhea, fatigue and airway symptoms among sewage
workers exposed to endotoxins43.
1.2.2.2 Toxic gases
Potentially harmful gases such as hydrogen sulfide (H2S) and ammonia are produced
by microbial degradation in sewage and sludge38 39. Of these toxic gases, hydrogen sulfide is
probably the most examined in this type of study, but the overall number of studies that
have measured the H2S concentration is low. H2S is well known as a colorless gas with the
specific odor of rotten eggs that is harmful to human health44 45. It is the sulfur analog
of water, and its formation depends on the presence of sulfur in organic matter and the
depletion of oxygen in a specific environment44 46. Hence, this gas is frequently
encountered in sewage treatment plants44.
Given that most organ systems are susceptible to the effects of H2S, it is often
regarded as a broad-spectrum toxicant. The degree of susceptibility varies among the
different organ systems46. Tissues with a high oxygen demand are usually most susceptible to
H2S toxicity44 46. Acute exposure to low H2S concentrations can elicit irritation of the eyes
and respiratory tract. Acute exposure to higher concentrations can cause symptoms
such as dizziness, headache, convulsions and unconsciousness, or even death44 46. Chronic
exposure to low concentrations may affect lung function47. Symptoms after chronic
exposure to H2S are less well known, but symptoms of the central nervous
system, such as tiredness and concentration difficulties, have also been reported among
workers handling waste water46. Based on self-reported symptoms, Douwes et al.
(2001) studied the associations between symptoms and possible exposure among sewage
workers40. The results suggested that neurological symptoms such as chronic fatigue and
forgetfulness may result from exposure, especially to neurotoxic gases such as
H2S. Other studies have also suggested an association between exposure to toxic gases and
neurological symptoms44 48.
1.2.2.3 Infectious bacteria (pathogens) and viruses
Besides exposure to non-infectious biological agents, workers are also exposed to infectious
and parasitic biological agents. These comprise, for example, hepatitis
A virus, Helicobacter bacteria and Giardia protozoans40. A number of studies have
investigated whether sewage workers run a greater risk of contracting infectious
diseases, especially hepatitis A49. Although some of those studies showed that sewage
workers have an increased risk of acquiring hepatitis A infection, disease risks are generally
low40 50 51.
1.2.2.4 Chemical agents
Finally, an enormous diversity of chemical agents is present in sewage treatment plants.
These include chemicals that are used for the treatment of sewage, chemicals
that are used for cleaning and maintaining the plants, and chemical contaminants with
which the sewage has been polluted38 52. The association between chemical exposure and
health effects is usually studied after industrial accidents in which sewage
workers were exposed to extraordinarily high concentrations40. This type of contaminant is
less important for exposure and less studied than the others.
2 OBJECTIVES
Biomarker identification is an important domain in metabolomics, as it helps to
clarify metabolic changes due to disease or exposure to xenobiotics among individuals.
Once significant biomarkers have been discovered, they could be used to predict the state of
health among humans, allowing earlier intervention at the onset of a
possible disease, and could help to elucidate disease mechanisms.
The aim of this investigation was to 1) test the suitability of an untargeted HPLC-HRMS
based metabolomics approach for an occupational health problem and 2) reveal
predictive biomarkers in a group of individuals that is chronically exposed to sewage and
its contaminants. Clarification of the significant metabolic changes among this group of
workers will be important for a better understanding of the diseases frequently encountered
among sewage workers.
3 METHODS AND MATERIALS
This chapter comprises the discussion of the materials, the instruments and the
methodology.
Table 3. 1: List of chemicals used during the sample and mobile phase preparation.
Chemicals
Name                 Specification                Manufacturer                                        Origin
Acetonitrile         Optima LC/MS                 Fisher Scientific, Thermo Fisher Scientific, Inc.   Sunnyvale, CA, USA
Ammonium carbonate   For HPLC, ≥ 30 % NH3 basis   Sigma-Aldrich                                       St. Louis, MO, USA
Formic acid          puriss. p.a., ≥ 98 %         Sigma-Aldrich                                       St. Louis, MO, USA
Methanol             for analysis                 Merck KGaA                                          Darmstadt, Germany
3.1 SAMPLING (SAMPLE COLLECTION)
Serum samples were taken from 146 sewage workers from sewage plants and sewage
net systems situated in big cities and small communities in the surroundings of the cities.
Based on a questionnaire on job operations, 21 of these workers were characterized as little
or not exposed and served as a control group. Personal exposure to endotoxins and bacteria
was measured during different work operations. C-reactive protein (CRP), a biomarker of
systemic inflammation, was measured by HS-MicroCRP assay, and lung function was assessed by
spirometry (SPIRARE). The blood was obtained by venipuncture,
and the blood samples were kept for 60-120 minutes at room temperature in 10
mL tubes (vacutainers) without additives (BD Diagnostics, Plymouth, UK) to allow coagulation.
After this period, the samples were centrifuged for 15 minutes at 1500 g, and the serum
supernatant of each tube was collected into NUNC® cryotubes (NUNC, Roskilde,
Denmark). The obtained serum samples were stored at -80°C until analysis.
3.2 SAMPLE PREPARATION
Serum samples from 50 exposed workers and 21 controls were chosen for inclusion
in our investigation. From these, two sets of samples were prepared for two
different types of HPLC, i.e. hydrophilic interaction chromatography (HILIC) and
reversed-phase chromatography (RP-HPLC).
3.2.1 Protein precipitation and preparation of samples for HILIC-HRMS
Samples were allowed to thaw at room temperature for 30 minutes, followed by
precipitation of the high-molecular-weight compounds present in the samples. This is described
as the deproteinization step23. To achieve this, an organic solvent, namely methanol,
was added to the samples in a 3:1 (vol/vol) ratio.
In more detail, 250 µL of each sample was transferred to labeled Eppendorf tubes.
Then, 750 µL of methanol was added, and the samples were vortexed for about 15 seconds
and then centrifuged for 15 minutes at 20 °C and at 15,000 × g using a Sigma 4K15 centrifuge
(Sigma, Osterode, Germany) to separate the supernatant, which was assumed to contain the
majority of all metabolites, and the precipitate23. After centrifugation, 200 µL of the
supernatant from each Eppendorf tube was transferred to separate chromatography vials
with inserts and sealed. This sample set was used for HILIC-HRMS.
A blank sample was prepared by replacing the serum with purified water. The
blank was otherwise treated in the same way as the serum samples. A quality control (QC)
sample was prepared by pooling 20 µL from each Eppendorf tube in a chromatography
vial23.
3.2.2 Preparation of samples for RP-HPLC-MS
In order to prepare samples for RP-HPLC-HRMS the samples were processed further.
After the deproteinization step (cfr. 3.2.1), 600 µL of each sample was transferred into new
labeled Eppendorf tubes. The samples were then evaporated under a gentle stream of
nitrogen until approximately 150 µL of solvent remained in the tubes.
The remaining solvent consisted of water. All Eppendorf tubes were sealed with
Parafilm before storing them in the freezer at -80°C.
3.2.3 Lyophilisation
The residual water in the samples was removed by lyophilisation. Lyophilisation or
freeze-drying is the technique in which water or another solvent of high boiling point (e.g.
dimethyl sulfoxide) is sublimed under vacuum. The frozen samples were placed in a HetoVac
freeze-dryer (Heto InterMed, Birkerod, Denmark) for one day, after which all of the water
present in the samples and the blank had been removed. According to the literature it is
advantageous to dry biological fluids using lyophilisation in order to avoid metabolite
degradation23.
In order to assure good chromatography from RP-HPLC the samples were
reconstituted in 200 µL of purified water by vortexing for about 15 seconds followed by
sonication for 5 minutes23. Finally, all the samples were centrifuged again at 15 000 × g for
15 minutes at room temperature, and then 150 µL of each centrifuged sample was
transferred to a chromatography vial. Also, a QC was prepared for the RP-HPLC sample set,
for which 15 µL of each sample was pooled in a chromatography vial.
3.3 LIQUID CHROMATOGRAPHY-MASS SPECTROMETRY (DATA ACQUISITION)
3.3.1 Abstract
In this study, human serum samples have been analyzed using two slightly different
methods. All the compounds in the samples were at first separated by high-performance
liquid chromatography (HPLC) before analyzing the metabolites by high resolution mass
spectrometry (HRMS). A Dionex Ultimate™ 3000 RS UHPLC (Thermo Fisher Scientific,
Sunnyvale, CA, USA) and either an Exactive or a Q Exactive™ Mass Spectrometer (Thermo
Fisher Scientific, Bremen, Germany) were used for the analysis of the samples. While the
former instrument was used for previous analyses of the serum samples at the University of
Strathclyde, Glasgow, the latter instrument was used at the Norwegian Veterinary Institute,
Oslo.
Two different types of columns (RP and pHILIC) have been used for HPLC and for that
reason, different types of mobile phases were used. The mass spectrometry part is the same
for the two methodologies. The Xcalibur software (version 2.2 for instrument control, version
2.3 for data processing, Thermo Fisher Scientific, Inc., Waltham, MA, USA) was used for
instrument control and basic MS data processing. The serum samples were injected in randomized order.
Each sequence started with two blank injections followed by four QC injections. The QC sample
was then re-injected every 20 to 25 injections in order to monitor instrumental drift.
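The injection scheme described above can be sketched as follows. This is only an illustration and not the sequence file that was actually used: the function name, the sample labels and the fixed QC interval of 20 are assumptions (the text states an interval of 20 to 25 injections).

```python
import random

def build_run_sequence(sample_ids, qc_interval=20, n_blanks=2, n_initial_qc=4, seed=0):
    """Return an injection order: blanks, initial QCs, then randomized samples
    with one QC injection after every `qc_interval` samples (assumed interval)."""
    order = list(sample_ids)
    random.Random(seed).shuffle(order)  # reproducible randomization of the samples
    sequence = ["blank"] * n_blanks + ["QC"] * n_initial_qc
    for position, sample in enumerate(order, start=1):
        sequence.append(sample)
        if position % qc_interval == 0:
            sequence.append("QC")
    return sequence

# 71 hypothetical sample labels (50 exposed workers + 21 controls)
samples = [f"S{i:02d}" for i in range(1, 72)]
run_sequence = build_run_sequence(samples)
```

Fixing the random seed makes the randomized order reproducible, which may be convenient when the same sequence has to be documented or repeated.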
3.3.1.1 UHPLC
The separation of the metabolites in the serum samples has been achieved by HPLC.
This section gives a brief explanation of the main principle of HPLC and UHPLC (Figure 3. 1).
In an HPLC/UHPLC instrument, the mobile phase from the reservoir is pumped through
the column with a continuous flow, most often at a constant flow rate and at high
backpressure. A small amount of a test sample (commonly 1-20 µL for analytical
applications) is injected into the mobile phase before it enters the column. The mobile
phase and the sample solution then pass together through the column. Depending on
the polarity of the analytes in the sample, the mobile phase and the stationary phase in the
column, analyte molecules are separated in time. For example, analytes with a
polarity similar to that of the packing material of the stationary phase will generally elute later due to
interactions with it. The separation is, however, also dependent on secondary effects such as
π-interactions. For this reason, the analytes spend different amounts of time in the column, and different
retention times are obtained. When the analytes leave the column, they are detected by a
detector. In this study the detector was a mass spectrometer.
Figure 3. 1: Schematic presentation of the main principle of HPLC/UHPLC. (http://www.waters.com/waters/en_US/How-Does-High-Performance-Liquid-Chromatography-Work%3F/nav.htm?cid=10049055 (20-04-2015))
3.3.1.2 The Q Exactive benchtop Orbitrap mass spectrometer
After HPLC separation, the eluting metabolites were analyzed using an Exactive™ or
Q Exactive™ HRMS (Thermo Scientific, Bremen, Germany) (Figure 3. 2).
Figure 3. 2: Representation of a Q Exactive mass spectrometer. (http://www.textronica.com/lcline/q_exactive_prodspec.pdf (20-04-2015))
The compounds that elute from the column are first ionized in a heated-electrospray
ionization probe (HESI-II). The generation of ions with such a probe is shown in Figure 3. 3.
The HESI is used as the ion source to create ions in the gas phase out of a liquid sample. A
high voltage is applied when the liquid reaches the probe, and as a result the liquid is
sprayed out of the probe as aerosols with an electric charge. Heated gas (nitrogen of 97-99%
purity) is applied to help the evaporation of solvents in the droplets. When the solvent
becomes more evaporated, the density of the charge at the surface of the droplets
increases. At a certain point, the charge density becomes so high that the electrical
repulsion, due to similar charges, overcomes the surface tension. Hence, the droplets will
shatter into smaller droplets. The shattering of droplets into smaller droplets continues for a
certain time and every time a droplet “explodes”, small gas phase ions are generated. The
probe can be used both in positive and negative polarity mode53. The formed ions are now
guided via an ion transfer tube to the S-lens. Here they are focused into a small ion beam,
which is led to the hyperbolic quadrupole mass filter54.
Figure 3. 3: H-ESI in the positive mode. (http://www.thermoscientific.com/content/dam/tfs/ATG/CMD/cmd-support/tsq-quantum-access-max/manuals/HESI-II-Probe-User.pdf (20-04-2015))
The quadrupole mass filter is implemented in the circuit to filter the formed ions
coming from HESI, based on their mass-to-charge ratio (m/z). The quadrupole consists of
four hyperbolic rods with a small space in the middle where the ions are passing through.
The pairs of opposite rods are each held at the same potential. There is the possibility to
apply different voltages to the rods, and therefore, depending on the mass-to-charge ratio,
some of the ions will reach the end of the quadrupole, yet others will get defocused from
their track and strike a rod. By applying a specific voltage, ions with a prescribed
mass-to-charge ratio will get focused, while the others get eliminated. Additionally, the
voltage can be varied so that ions within a certain range of mass-to-charge ratios
are transmitted55.
For a full scan analysis, the ions are accumulated in the C-trap after they have passed
through the quadrupole and are then led to the orbitrap, while clustering into a small ion
cloud. In the orbitrap, the ions circulate in an orbital motion between a central and a coaxial
electrode. This motion creates an image current that is detected, and the mass spectra
are obtained by Fourier transformation of the measured current. In order to perform MS
fragmentation experiments, selected ions may be transferred to a higher-energy collisional
dissociation (HCD) cell where fragmentation occurs, before the product ions are analyzed in
the orbitrap55 56.
3.3.2 RP-HPLC
Two different RP-HPLC columns have been used in this study. In previous analyses,
performed at the University of Strathclyde, an ACE Excel3 Super C18 column was used
(Advanced Chromatography Technologies Ltd., Aberdeen, Scotland; 150 × 3.0 mm i.d.). For
RP-HPLC-HRMS analyses in Oslo, the column of choice was a Kinetex™ XB-C18 column
(Phenomenex, Torrance, CA, USA; 100 × 2.1 mm i.d., 1.7 µm particle size). Both stationary
phases had a pore size of 100 Å.
The mobile phase consisted of 0.1% formic acid in purified water (mobile phase A)
and acetonitrile (mobile phase B)23 57.
A 5-µL aliquot of each sample was injected, and the column was kept at 30°C during
the entire run. The column was eluted using a linear gradient as shown in Table 3. 2.
Table 3. 2: The multi-step gradient for the RP-HPLC analysis using the Kinetex XB-C18 column.
Multi-step gradient RP-HPLC
Time (min)   Flow (ml/min)   % A    % B
0.000        0.300           95.0   5.0
0.500        0.300           95.0   5.0
30.000       0.300           2.0    98.0
35.000       0.300           2.0    98.0
35.200       0.300           95.0   5.0
40.000       0.300           95.0   5.0
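To illustrate how the gradient table translates into a mobile phase composition at any time point, the following minimal sketch interpolates % B between the programmed time points of Table 3. 2. It assumes that the pump changes the composition linearly between consecutive table entries; the function and variable names are hypothetical.

```python
# (time in min, % B) pairs taken from Table 3. 2
GRADIENT_RP = [(0.0, 5.0), (0.5, 5.0), (30.0, 98.0),
               (35.0, 98.0), (35.2, 5.0), (40.0, 5.0)]

def percent_b(t, gradient=GRADIENT_RP):
    """Return % B at time t (min), assuming linear ramps between table entries."""
    if t <= gradient[0][0]:
        return gradient[0][1]
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    return gradient[-1][1]  # after the last entry the final composition is held

def percent_a(t, gradient=GRADIENT_RP):
    """% A is the remainder of the binary mobile phase."""
    return 100.0 - percent_b(t, gradient)
```

For example, halfway through the main ramp (15.25 min) this sketch gives 51.5 % B.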
3.3.3 HILIC
For HILIC-HRMS, a ZIC®-pHILIC HPLC column (Merck KGaA, Darmstadt, Germany;
150 × 4.6 mm, 5 µm particle size) has been used. In this type of column the stationary phase
contains zwitterionic sulphobetaine functional groups.
The mobile phase consisted of 20 mM ammonium carbonate in purified water
(mobile phase A) and acetonitrile (mobile phase B). The linear gradient used for elution of
the column is shown in Table 3. 3.
Table 3. 3: The multi-step gradient for the HILIC analyses.
Multi-step gradient HILIC
Time (min)   Flow-rate (ml/min)   % A    % B
0.000        0.3                  20.0   80.0
30.000       0.3                  80.0   20.0
31.000       0.3                  92.0   8.0
36.000       0.3                  92.0   8.0
37.000       0.3                  20.0   80.0
46.000       0.3                  20.0   80.0
As listed in Table 3. 3, the flow-rate of the eluents remained constant i.e. 0.3 ml/min
during the entire run. A 5-µL aliquot of each sample was injected, and the column was kept
at 30°C during the entire run.
3.3.4 High-resolution mass spectrometry (HRMS)
As described in section 3.3.1.2, the metabolites have been analyzed either with an
Exactive™ or with a Q Exactive™ mass spectrometer. The instrument has been run in the full
scan mode with a scan range of 75-1125 m/z in connection with HILIC or 100-1200 m/z in
connection with RP-HPLC. Table 3. 4 contains important instrumental parameters for the
analyses. The polarity of the electrospray interface was continuously switched between
positive and negative polarity, i.e. the instrument performed one scan in the positive ion
mode followed by one scan in the negative ion mode.
Table 3. 4: The HRMS parameters for the RP-HPLC and HILIC analyses.
HRMS-Parameters
                        RP-HPLC        HILIC-HPLC
Resolution              70,000         70,000
AGC target              3 × 10^6       1 × 10^6
Maximum inject time     200 ms         200 ms
Scan range              100-1200 m/z   75-1125 m/z
HESI Source parameters
                        RP-HPLC        HILIC-HPLC
Sheath gas flow rate    35             35
Aux gas flow rate       10             10
Sweep gas flow rate     1              0
Spray voltage (kV)      4.00           4.00
Capillary temperature   250°C          250°C
S-lens RF level         50.0           50.0
Heater temperature      300°C          300°C
3.3.5 Linear ion trap mass spectrometry (ITMS)
In order to acquire MS2 data of most metabolites HILIC was also performed in
connection with a linear ion trap mass spectrometer (ITMS). The HPLC-ITMS instrument used
consisted of a Finnigan LTQ linear ion trap mass spectrometer coupled to a Finnigan
Surveyor MS Pump Plus and Autosampler Plus (all Thermo Fisher Scientific Inc., Waltham,
MA, USA). The ITMS was run in the full-scan mode in the mass range 75-1125 m/z.
Simultaneous fragmentation of the three most intense ions was achieved using
data-dependent scanning. Ions above an intensity threshold of 5 × 10^4 were isolated with a 2 m/z
isolation width; the activation Q was set to 0.25, and the activation time was set to 30 ms for
fragmentation with a relative fragmentation energy of 35 (arbitrary units).
3.4 MASS SPECTROMETRY DATA PROCESSING
3.4.1 Raw data file conversion
In order to make the Xcalibur raw data files available for automatized processing in
MZmine (see below) they were converted into the mzML standard mass spectrometry data
format using ProteoWizard (http://proteowizard.sourceforge.net/). The file converter splits
the raw data files into two new files, containing either the data from positive or negative
ionization. ProteoWizard also centroids the data.
3.4.2 Data processing in MZmine
MZmine 2 is an open-source software for mass-spectrometry data processing, with
the main focus on LC-MS data58. The processing of the data has been done separately for the
positive and the negative ionization mode. All the processing steps were, however, identical
for each set of data, except for the peak list filtering and the identification (see below). In
this study, MZmine version 2.10 has been used.
3.4.2.1 Peak detection/ peak picking
This step is crucial and itself consists of three steps: mass detection,
chromatogram building and deconvolution of peaks59. Mass detection generates a list of
masses (ions) for each data file. All data points above the specified noise level are detected
as m/z peaks (ions) and all those below this intensity level are ignored58 60. The set
parameters can be found in Table 3. 5.
Table 3. 5: MZmine parameters for mass detection.
Mass detection
Criteria        Parameter settings
Mass detector   Centroid
Noise level     5.0 × 10^3
MS level        1
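The principle of mass detection on centroided data can be illustrated with a short sketch. It is not the MZmine implementation; the function name and the example scan are hypothetical, and treating a data point exactly at the noise level as detected is an assumption.

```python
NOISE_LEVEL = 5.0e3  # noise level from Table 3. 5

def detect_masses(centroids, noise_level=NOISE_LEVEL):
    """Keep the (m/z, intensity) centroids of one scan that reach the noise level.

    Data points below the noise level are ignored; including points exactly
    at the noise level is an assumption of this sketch.
    """
    return [(mz, intensity) for mz, intensity in centroids if intensity >= noise_level]

# hypothetical centroided scan: (m/z, intensity)
scan = [(132.1019, 2.4e5), (133.1052, 1.3e4), (150.0583, 3.1e3), (181.0707, 7.9e4)]
mass_list = detect_masses(scan)
```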
During chromatogram building, ion chromatograms were extracted for each m/z with
a span over a predefined minimum peak width. These chromatograms are assembled in
peak lists for each data file meaning that a peak list is created for each sample containing
extracted chromatograms for every ion above the set threshold (Table 3.5).
(http://mzmine.sourceforge.net/features.shtml(03-05-2015))
Table 3. 6: MZmine parameters for chromatogram building.
Chromatogram building
Criteria                  Parameter settings
Minimum time span (min)   0.2 min
Minimum height            5 × 10^4
m/z tolerance             0.0010 m/z or 5.00 ppm
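The m/z tolerance is given both as an absolute value and in ppm. A minimal sketch of how such a combined tolerance can be applied is shown below. It assumes that the larger of the two windows is used at a given m/z, which is an interpretation of the "or" in the table and should be checked against the MZmine documentation; the function names are hypothetical.

```python
def mz_tolerance(mz, abs_tol=0.0010, ppm_tol=5.0):
    """Tolerance window (in m/z units) at a given m/z.

    Assumption: the larger of the absolute and the ppm-based tolerance applies.
    """
    return max(abs_tol, mz * ppm_tol * 1e-6)

def within_tolerance(mz_a, mz_b, abs_tol=0.0010, ppm_tol=5.0):
    """True if two m/z values are close enough to belong to the same ion."""
    return abs(mz_a - mz_b) <= mz_tolerance(mz_a, abs_tol, ppm_tol)
```

Under this assumption the absolute tolerance of 0.0010 m/z dominates below m/z 200, while the 5 ppm window is wider above m/z 200.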
Some of the constructed chromatograms may contain several peaks due to structural
isomers. Chromatogram deconvolution separates these into individual chromatograms
ideally containing only one single peak. Table 3. 7 shows that the ‘local minimum search’
algorithm has been used, which attempts to identify local minima as border points between
individual peaks in the chromatogram58 60.
Table 3. 7: MZmine parameters for chromatogram deconvolution.
Chromatogram deconvolution
Criteria                           Parameter settings
Algorithm                          Local minimum search
Chromatographic threshold          1 %
Search minimum in RT range (min)   0.4 min
Minimum relative height            5 %
Minimum absolute height            5 × 10^4
Minimum ratio of peak top/edge     5
Peak duration range (min)          0.2-5
3.4.2.2 Deisotoping
Following peak detection, all the peaks were deisotoped. The aim of the deisotoping
is to remove chromatograms that arise from isotope peaks thereby reducing the data set.
The deisotoping algorithm tries to find the most likely charge state for each peak by
comparing the number of identified isotopes for each charge. Peaks are regarded as isotopes
when they comply with the predefined RT tolerance and m/z tolerance limits for a given
charge state. The applied parameters for deisotoping are shown in Table 3. 8. The isotope
pattern is generated for the charge state with the highest number of detected isotopes.
Then, the most intense isotope is kept in the peak list, while the other isotopes are removed60.
Table 3. 8: MZmine parameters for deisotoping.
Isotopic peaks grouper
Criteria                   Parameter settings
m/z tolerance              0.0010 m/z or 5.00 ppm
Retention time tolerance   0.1 absolute (min)
Maximum charge             2
Representative isotope     Most intense
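The deisotoping logic described above can be sketched as follows. This is a simplified illustration and not the MZmine code: it assumes an isotope spacing of 1.003355 Da (the 13C-12C mass difference) divided by the charge, uses the larger of the absolute and ppm tolerance, and all names and example peaks are hypothetical.

```python
ISOTOPE_SPACING = 1.003355  # 13C - 12C mass difference in Da (assumed spacing)

def find_pattern(start, peaks, used, charge, rt_tol, abs_tol, ppm_tol):
    """Collect indices of consecutive isotope peaks above peaks[start] for one charge."""
    pattern, k = [start], 1
    while True:
        target = peaks[start]["mz"] + k * ISOTOPE_SPACING / charge
        tol = max(abs_tol, target * ppm_tol * 1e-6)
        match = next((i for i, p in enumerate(peaks)
                      if i not in used and i not in pattern
                      and abs(p["mz"] - target) <= tol
                      and abs(p["rt"] - peaks[start]["rt"]) <= rt_tol), None)
        if match is None:
            return pattern
        pattern.append(match)
        k += 1

def deisotope(peaks, max_charge=2, rt_tol=0.1, abs_tol=0.0010, ppm_tol=5.0):
    """Keep one representative (most intense) peak per detected isotope pattern."""
    used, kept = set(), []
    for i in sorted(range(len(peaks)), key=lambda j: peaks[j]["mz"]):
        if i in used:
            continue
        # choose the charge state that explains the largest number of isotopes
        best = max((find_pattern(i, peaks, used, z, rt_tol, abs_tol, ppm_tol)
                    for z in range(1, max_charge + 1)), key=len)
        used.update(best)
        kept.append(max((peaks[j] for j in best), key=lambda p: p["height"]))
    return kept

# hypothetical peak list: a monoisotopic peak, its 13C isotope and an unrelated peak
example = [{"mz": 180.0634, "rt": 5.00, "height": 1.0e6},
           {"mz": 181.0668, "rt": 5.02, "height": 6.5e4},
           {"mz": 200.1000, "rt": 5.00, "height": 3.0e5}]
deisotoped = deisotope(example)
```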
3.4.2.3 Peak list alignment
Peak list alignment aims to match corresponding peaks from separate data files into
one new aligned peak list. When this has been done, the new aligned peak list will contain
several rows and several columns. Each column represents one individual file and every row
represents one peak (“metabolic feature”) that has been matched to the corresponding
peaks in the other files60. A representation of such an aligned (and gap filled) peak list is
shown in Figure 3. 4. The chosen alignment algorithm was the ‘RANSAC aligner’, and the
parameters can be found in Table 3. 9.
Table 3. 9: MZmine parameters for peak list alignment.
Peak list alignment
Criteria                                    Parameter settings
Algorithm                                   RANSAC aligner
m/z tolerance                               0.001 m/z or 5 ppm
Retention time tolerance after correction   0.8 min
Retention time tolerance                    0.8 min
RANSAC iterations                           15 000
Minimum number of points                    20.00 %
Threshold value                             2
Linear model                                No
3.4.2.4 Gap filling
The peak list alignment is never perfect, and thus not every peak is matched,
leaving ‘gaps’ in peak rows for some samples. In some cases this is because a peak remained
undetected by the previous algorithms, e.g. due to errors in the alignment or insufficient
peak detection. Such errors are accounted for by a process called “gap filling” (Figure 3. 4).
In this case the ‘same m/z and RT range gap filler’ has been applied to detect the potentially
missing peaks and to add these to the aligned peak lists (Table 3. 10). The ranges for the m/z
and retention time for the gap filling process are automatically defined according to the
already detected peaks in the same row60.
Figure 3. 4: Screenshot of MZmine 2.10 showing the aligned and gap filled (and filtered) peak list. Every row represents a metabolic feature with its corresponding m/z and extracted ion chromatogram. The columns with colored dots represent individual samples (i.e. blank, QC and test samples). Green dots represent detected features, yellow dots represent features that were only detected during gap-filling and red dots represent undetected features (not shown in the figure).
Table 3. 10: MZmine parameters for gap filling.
Gap filling
Criteria        Parameter settings
Algorithm       Same m/z and RT range gap filler
m/z tolerance   0.0010 m/z or 5.00 ppm
After the gap filling, peaks that were also present in all blank samples at an intensity
of approximately 5% or higher relative to the QC samples were deleted, as it could be
anticipated that these were due to contamination.
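The blank filter described above can be expressed as a simple rule. The sketch below is one possible reading of it and is not taken from the original workflow: it compares each blank peak area with 5% of the mean QC peak area, and the function name and example values are hypothetical.

```python
from statistics import mean

def is_background(blank_areas, qc_areas, ratio=0.05):
    """True if a feature is detected in every blank at >= ratio x mean QC area.

    Assumptions: an area of 0 means 'not detected', and the QC reference
    is the mean peak area over all QC injections.
    """
    threshold = ratio * mean(qc_areas)
    return bool(blank_areas) and all(a > 0 and a >= threshold for a in blank_areas)

# hypothetical feature: peak areas in two blanks and three QC injections
contaminant = is_background([6.0e4, 7.0e4], [1.0e6, 1.1e6, 0.9e6])
```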
3.4.2.5 Peak list filtering
During peak list filtering, rows that do not comply with the set criteria are removed
from the peak list (Table 3. 11). In this study, the peak list filtering was carried out in order to
remove peaks that were detected in only a rather low number of samples (<45) and to
exclude peaks at the beginning of a chromatogram that lack chromatographic
resolution60.
Table 3. 11: MZmine parameters for peak list filtering.
Peak list filtering
Criteria                              Parameter settings
Algorithm                             Rows filter
Minimum peaks in a row                45
Minimum peaks in an isotope pattern   1
m/z range                             RP: 100-1200 m/z; HILIC: 75-1125 m/z
RT range                              3-40 min
Peak duration range                   0.2-5 min
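As an illustration of the two main criteria of the rows filter, the following sketch keeps a row only if it contains a peak in at least 45 samples and its retention time lies within 3-40 min. The data structure and function name are hypothetical, and the m/z and peak duration criteria of Table 3. 11 are left out for brevity.

```python
def keep_row(row, min_peaks=45, rt_range=(3.0, 40.0)):
    """Return True if an aligned peak-list row passes the (simplified) rows filter.

    row: dict with 'rt' (min) and 'areas' (one peak area per sample,
    None or 0 where no peak was found). This layout is an assumption.
    """
    n_detected = sum(1 for area in row["areas"] if area is not None and area > 0)
    return n_detected >= min_peaks and rt_range[0] <= row["rt"] <= rt_range[1]

# hypothetical rows from an aligned peak list of 71 samples
rows = [{"rt": 10.2, "areas": [1.0e5] * 50 + [None] * 21},
        {"rt": 10.2, "areas": [1.0e5] * 44 + [None] * 27},
        {"rt": 1.5, "areas": [1.0e5] * 71}]
filtered = [row for row in rows if keep_row(row)]
```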
3.4.2.6 Identification
This identification step consisted of two individual tasks, the ‘adduct search’ and the
‘peak complex search’. The adduct search function in MZmine aims to find possible
predefined adducts in the peak list, e.g. formate or solvent adducts (Table 3. 12). Adducts
have been identified by two important criteria. Criterion 1 requests that the mass difference
between the adduct and the original ion must be equal to one of the chosen adducts and
criterion 2 requests that the RT of the original ion and the ion of the adduct must be the
same60.
Table 3. 12: MZmine parameters for identification.
Adduct search
Criteria                              Parameter settings
Algorithm                             Adduct search
RT tolerance                          0.2 absolute (min)
Adducts                               positive mode: Na, K, NH4 and ACN+H; negative mode: HCOO and ACN+H
m/z tolerance                         0.0010 m/z or 5.00 ppm
Maximum relative adduct peak height   50 %
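The two adduct criteria, together with the maximum relative peak height from Table 3. 12, can be sketched as below. This is not the MZmine implementation. It assumes that the reference ion is the protonated molecule, so that the m/z shifts are the monoisotopic mass differences between the adduct ion and a proton; only three positive-mode adducts are included, and all names and example peaks are hypothetical.

```python
PROTON = 1.007276  # mass of a proton in Da

# m/z shift of each adduct relative to [M+H]+ (assumed reference ion)
ADDUCT_SHIFTS = {
    "Na": 22.989218 - PROTON,   # [M+Na]+
    "K": 38.963158 - PROTON,    # [M+K]+
    "NH4": 18.033823 - PROTON,  # [M+NH4]+
}

def find_adducts(peaks, rt_tol=0.2, abs_tol=0.0010, ppm_tol=5.0, max_rel_height=0.5):
    """Return (parent m/z, adduct m/z, adduct name) for every matching peak pair."""
    hits = []
    for parent in peaks:
        for cand in peaks:
            if cand is parent or abs(cand["rt"] - parent["rt"]) > rt_tol:
                continue  # criterion 2: same retention time
            if cand["height"] > max_rel_height * parent["height"]:
                continue  # adduct peak must not exceed 50 % of the parent height
            for name, shift in ADDUCT_SHIFTS.items():
                target = parent["mz"] + shift  # criterion 1: expected mass difference
                tol = max(abs_tol, target * ppm_tol * 1e-6)
                if abs(cand["mz"] - target) <= tol:
                    hits.append((parent["mz"], cand["mz"], name))
    return hits

# hypothetical pair: a protonated ion and its sodium adduct
peaks = [{"mz": 181.0707, "rt": 5.00, "height": 1.0e6},
         {"mz": 203.0526, "rt": 5.05, "height": 2.0e5}]
adducts = find_adducts(peaks)
```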
As ions can form ion complexes, a ‘peak complex search’ was
carried out (Table 3. 13). In order to be identified as a complex, the following two criteria
needed to be fulfilled: first, the RT of the complex ion and of the separate ions must be
the same; second, the m/z of the ion complex must be equal to the sum of the m/z of the
separate ions, taking the mass change caused by ionization into account60.
Table 3. 13: MZmine parameters for complex search.
Peak complex search
Criteria                      Parameter settings
Algorithm                     Peak complex search
Ionization method             ESI positive: [M+H]+; ESI negative: [M-H]-
RT tolerance                  0.2 absolute (min)
m/z tolerance                 0.0010 m/z or 5.00 ppm
Maximum complex peak height   50 %
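The m/z criterion of the complex search can be written out explicitly. The sketch below assumes singly charged ions formed by gain of a proton in the positive mode or loss of a proton in the negative mode; the function names are hypothetical and the calculation is an illustration rather than the MZmine code.

```python
PROTON = 1.007276  # mass of a proton in Da

def complex_mz(mz_a, mz_b, ion_shift=PROTON):
    """Expected m/z of a singly charged complex of the molecules behind two ions.

    ion_shift is the mass change caused by ionization: +PROTON for protonated
    ions (positive mode), -PROTON for deprotonated ions (negative mode).
    """
    neutral_a = mz_a - ion_shift
    neutral_b = mz_b - ion_shift
    return neutral_a + neutral_b + ion_shift

def is_complex(cand, ion_a, ion_b, ion_shift=PROTON, rt_tol=0.2, abs_tol=0.0010, ppm_tol=5.0):
    """Check the RT and m/z criteria for a candidate complex ion."""
    same_rt = all(abs(cand["rt"] - ion["rt"]) <= rt_tol for ion in (ion_a, ion_b))
    target = complex_mz(ion_a["mz"], ion_b["mz"], ion_shift)
    tol = max(abs_tol, target * ppm_tol * 1e-6)
    return same_rt and abs(cand["mz"] - target) <= tol
```

For two protonated ions at m/z 181.0707, for example, the expected protonated dimer is found at 2 × 181.0707 - 1.007276 = 361.1341.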
3.4.3 File export and normalization
After finishing the data processing in MZmine the remaining peak lists were exported
and saved as .csv files, which can be read by Microsoft Excel. Then, the following steps have
been carried out for both .csv files (from positive and negative ionization) before they were
combined into one file.
The blank samples were deleted and the data arranged so that the columns containing
metabolite peak areas from QC samples were in front, followed by the samples from
controls and finally the samples of the affected workers. The columns were arranged in
exactly the same way for the positive and the negative data set. The normalization was
performed by dividing the peak area of a certain metabolic feature by the sum of peak areas
for all the features in one sample.
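The normalization step can be illustrated with a few lines of Python. The original calculation was carried out in Microsoft Excel; the function name and the example values below are hypothetical.

```python
def normalize_total_area(areas):
    """Divide each feature's peak area by the summed peak area of the sample."""
    total = sum(areas)
    return [area / total for area in areas]

# hypothetical peak areas of three features in one sample
normalized = normalize_total_area([2.0e5, 6.0e5, 2.0e5])
```

After this step the normalized areas of each sample sum to 1, which makes samples with different overall signal intensities comparable.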
The positive and the negative data set were finally combined and subjected to
multivariate statistical analyses.
3.5 DATA ANALYSIS
The software Simca (version 14, Umetrics, Umeå, Sweden) was used for the multivariate
statistical analyses (MVA) of the data sets obtained from HILIC-HRMS and RP-HPLC-HRMS. By
using principal component analysis (PCA) and orthogonal partial least squares
discriminant analysis (OPLS-DA), the data could be visualized and analyzed.
The transposed data table from Microsoft Excel was copied into Simca and the first row,
containing a row ID and polarity, was defined as the ‘primary ID’. The second row and the
third row, which contained information on the m/z and the RT, respectively, were defined as
secondary IDs.
3.5.1 Principal component analysis (PCA)
For unsupervised PCA, used to reveal the total variation of the dataset, all
variables were unit variance (UV) scaled, which means that each variable was centered
and divided by its standard deviation computed around the mean61. The scores for the
affected workers, the controls and the QC samples were colored differently for better
visualization.
3.5.2 Orthogonal partial least squares – discriminant analysis
Supervised discriminatory analysis was performed to reveal potential markers of
response in sewage workers relative to the control group. The variables were Pareto-scaled
(Par) for OPLS-DA, which means that each variable was centered and divided by the
square root of its standard deviation. The QC samples were excluded from the
analysis.
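The difference between the two scaling methods used in 3.5.1 and 3.5.2 can be illustrated for a single variable. The scaling itself was performed in Simca; the sketch below only shows the definitions, uses the sample standard deviation (n - 1) as an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def uv_scale(values):
    """Unit variance scaling: center, then divide by the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pareto_scale(values):
    """Pareto scaling: center, then divide by the square root of the standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / sqrt(s) for v in values]

# hypothetical normalized peak areas of one metabolic feature in three samples
feature = [2.0, 4.0, 6.0]
uv = uv_scale(feature)
par = pareto_scale(feature)
```

UV scaling gives every variable the same variance, whereas Pareto scaling reduces the dominance of intense features only partially, so that large peaks keep more weight in the model than small ones.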
3.5.3 Identification of potential metabolite markers
Metabolic features that contributed most to the discrimination of the affected
sewage workers and the control group were selected from the S-plot and the variable
importance in the projection (VIP) plot, which visualize and score the contribution of the loadings (i.e.
metabolic features) to the OPLS-DA model (cfr. 4.3). Potential metabolic markers for which
the relative standard deviation within the QC samples or in one of the groups exceeded 30%
were excluded. A two-tailed t-test was then performed using MS Excel in order to test whether the
difference in the potential metabolic marker between the groups was statistically significant, and the significance
level was set to 0.05.
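The exclusion criterion based on the relative standard deviation (RSD) can be sketched as follows. The function names and the example values are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation in percent."""
    return 100.0 * stdev(values) / mean(values)

def passes_rsd(groups, max_rsd=30.0):
    """True if the RSD stays at or below max_rsd in every group of samples.

    groups: dict mapping a group name (e.g. QC, controls, exposed) to the
    normalized peak areas of one potential marker in that group.
    """
    return all(rsd_percent(values) <= max_rsd for values in groups.values())

# hypothetical normalized peak areas of one potential marker
marker = {"QC": [98.0, 100.0, 102.0],
          "controls": [90.0, 100.0, 110.0],
          "exposed": [130.0, 150.0, 170.0]}
keep_marker = passes_rsd(marker)
```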
The elemental composition of the remaining potential metabolite markers was
determined in Xcalibur using both positive and negative mode data. Mass uncertainty was
set to 3 ppm and obtained elemental compositions were verified using isotope peaks.
Elements included were C, H, O, N, P, Cl, Na and S. Elemental compositions were searched
against PubChem, Chemspider, Metlin, HMDB and KEGG online databases. The identity of
potential metabolite markers was further verified from MS2 fragmentation spectra.
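The comparison of a measured m/z with a candidate elemental composition can be illustrated with a short calculation. The elemental compositions in this study were determined in Xcalibur; the sketch below only shows the underlying arithmetic, using standard monoisotopic masses rounded to six decimals. The function names and the example (a C6H12O6 composition measured as [M+H]+) are hypothetical.

```python
# monoisotopic masses (Da) of the elements included in the search
MONOISOTOPIC = {"C": 12.000000, "H": 1.007825, "O": 15.994915, "N": 14.003074,
                "P": 30.973762, "S": 31.972071, "Cl": 34.968853, "Na": 22.989770}
PROTON = 1.007276  # mass of a proton in Da

def neutral_mass(composition):
    """Monoisotopic mass of a neutral molecule, e.g. {'C': 6, 'H': 12, 'O': 6}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())

def protonated_mz(composition):
    """Theoretical m/z of the singly charged [M+H]+ ion."""
    return neutral_mass(composition) + PROTON

def ppm_error(measured_mz, theoretical_mz):
    """Mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6

# hypothetical example: does a measured m/z of 181.0712 fit C6H12O6 within 3 ppm?
theoretical = protonated_mz({"C": 6, "H": 12, "O": 6})
accepted = abs(ppm_error(181.0712, theoretical)) <= 3.0
```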
4 RESULTS AND DISCUSSION
4.1 SAMPLE SELECTION AND SAMPLE PREPARATION
The sewage worker study had not originally been designed for a metabolomics study
and the sample set might thus not be suited for an untargeted metabolomics approach.
Instead, the original aim was to survey the impact of such a working environment on the
concentration of pneumoproteins in the serum of these workers as well as lung function,
primarily because of the expected exposure of sewage workers to endotoxins and bacteria in
sewage dust62. In that investigation, all workers had to complete a questionnaire, an
interview and a basic medical examination. Based on this, they were divided into an
affected group and a control group (cfr. 3.1). For our investigation, 50 samples from the
affected group and 21 samples from the control group were chosen based on the
concentration of C-reactive protein (CRP) in the samples. From the
affected group, samples from the individuals with the highest CRP concentrations were
chosen.
In the preparation of serum samples, methanol was used in a 3:1 (vol/vol) ratio
for the deproteinization (cfr. 3.2.1). The use of this solvent in this ratio has been shown to be very
efficient in the removal of proteins at room temperature23. Want et al. examined
different protein precipitation methods, ranging from organic solvents and acids to
heat denaturation, in order to find the method that is most efficient for
metabolite profiling studies. Their investigation demonstrated that pure methanol
and methanol/acetone mixtures were best suited, as they retained the most reproducible
features while removing proteins efficiently. Furthermore, methanol is cheap
and uncomplicated to use11.
Two different types of chromatographic columns have been applied in this study, for
RP-HPLC and HILIC. These types of columns exhibit orthogonal selectivity, meaning that
metabolites with a low to medium polarity will have higher retention on the former, while
highly polar metabolites will have higher retention on the latter column63 64. The HILIC
column differs from the RP columns in that its stationary phase consists of a
zwitterionic sulphobetaine-functionalized polymer instead of rather lipophilic
octadecyl-modified (C18) particles. Weak electrostatic
interactions between the sulphobetaine stationary phase and polar analytes result in a unique selectivity. This type of
column is excellently suited to highly polar compounds, which are barely retained on the RP
column65. Thus, HILIC has the advantage of improving the retention of hydrophilic
compounds64 66.
The first set of samples, which was used for the HILIC column, was ready for
use after the deproteinization step (cfr. 3.2.1). For this type of column, a high proportion of
organic solvent was needed in order to obtain better separation and better peak shapes. Thus, the
methanol present in these samples did not need to be evaporated. In contrast, the set of
samples intended for the RP column needed to be highly aqueous. Therefore, this set
required some additional steps to remove the methanol and to redissolve the residue in an
aqueous solvent.
Besides the two different sample sets for the two chromatographic
approaches, a quality control (QC) sample and a blank sample were prepared for both
HILIC-HRMS and RP-HPLC-HRMS. The blank samples were used for the identification of
“background features”, while the QC samples were used to monitor instrumental drift.
In this study, a pooled QC was used. This kind of QC is favorable because it is highly
representative of the study samples, although it might not always be possible to include
one in a metabolomics study23 67. A pooled QC contains an aliquot of each
test sample, and for that reason it represents more or less an average of the composition of
all the test samples, both qualitatively and quantitatively23 67.
In untargeted metabolomics studies such as this one, the QC samples are principally
used to evaluate the potential drift of the instrument68. Another important reason for the
use of QCs is that they can be injected at the start of the batch in order to
condition the analytical platform23 68 69.
4.2 LC-HRMS ANALYSES AND DATA PROCESSING
In this investigation, two different types of columns with different selectivity were
applied in order to achieve chromatographic separation for a wide range of metabolites. As
already mentioned in section 4.1, the applicability of RP-HPLC is limited for highly polar
compounds. However, biological fluids such as serum contain polar compounds, e.g. amino
acids and carnitines, that are better retained by HILIC63 64 70. In fact, RP-HPLC was
originally widely used in connection with HRMS, but recently the use of HILIC has gained more
interest as the selection of stationary phases has increased70 71.
Serum samples are a complex mixture of different types of compounds. This means
that good separation of all these compounds with different polarities cannot be achieved in
isocratic mode. Therefore, HPLC was run in a multi-step gradient mode for both
types of columns. The multi-step gradient for the HILIC column (cfr. 3.3.3) started with a high
proportion of organic mobile phase, and the proportion of aqueous mobile phase was
gradually increased in the course of the chromatographic run. In the case of RP-HPLC, the
gradient started with a high proportion of aqueous mobile phase (cfr. 3.3.2), while the
proportion of organic mobile phase was gradually increased in the course of the
chromatographic run.
Using an untargeted LC-HRMS approach yields a very high number of potential
metabolic features. It is impossible to handle such a high number of features manually.
MZmine typically extracted more than 10,000 potential metabolic features from the raw
data. During the processing this number was reduced to about 1,500 features. The data
processing itself is explained in section 3.4.2. One of the final steps of the processing is an
alignment step. This enabled the direct comparison of samples within one method. An
example of an aligned and gap filled peak list is shown in Figure 3. 4.
The principal aim of the normalization of the data (cfr. 3.4.3), prior to multivariate
statistical analysis (MVA), was to correct for the observed instrumental drift (cfr. 4.6).
4.3 MULTIVARIATE DATA ANALYSES
The data set was visualized and analyzed by unsupervised principal component analysis
(PCA) and supervised orthogonal partial least squares projections – discriminant analysis
(OPLS-DA). These techniques offer dimension reduction and reveal associations between
data6 64. This section is based on the analysis of the original raw data set from HPLC-HRMS
analyses performed at the University of Strathclyde, Glasgow.
The first step was the visualization of the data sets in unit-variance scaled PCA score scatter
plots (Figure 4. 1 and Figure 4. 2). No clear separation between the affected and the control
samples could be observed in the PCA plots of PC1 versus PC2 (Figure 4. 1 and Figure 4. 2).
The variation in the observations explained by the first two PCs was
relatively low, i.e. 26% and 21% for RP-HPLC- and HILIC-HRMS, respectively (Table 4. 1). In
order to maximize the predictive ability, Q2, of the models, 11 PCs had to be included for
each of the two instrumental methods (Table 4. 1). The need for such a high number of principal
components to reach maximum predictive ability indicates a weak model and low correlation
between variables and observations. Furthermore, Q2 remained relatively low, keeping in
mind that according to the literature this parameter should be larger than 0.5 for ‘good’
models72 73. However, PCA groups the samples solely on the information in the
measurement data, and the two study groups, i.e. the controls and the affected,
were not specified24 74. Thus, the model was used to detect inherent trends within the
data64. PCA is also a good method to detect possible outliers that could affect the data64. A
Hotelling’s T2 outlier test is part of the PCA score scatter plot in SIMCA and reveals strong
outliers. In the unit-variance scaled PCA score scatter plots of the normalized data sets from
RP-HPLC-HRMS and HILIC-HRMS, three and two outliers can be seen, respectively (Figure 4. 1
and Figure 4. 2). The QCs in the PCA score plots are discussed in
section 4.6.
Table 4. 1: Summary of statistical measures from unsupervised PCA of the normalized, original dataset obtained from RP-HPLC- and HILIC-HRMS at the University of Strathclyde, Glasgow. The variance explained by PC 1 + PC 2 is given, as well as the variance explained by the number of PCs that maximized Q2.
PCA-X (UV)
Data set        PC             R2X(cum)a  Q2(cum)b
RP GLASGOW      PC 1 + PC 2    0.261      0.157
RP GLASGOW      PC 1 – PC 11   0.595      0.268
HILIC GLASGOW   PC 1 + PC 2    0.209      0.103
HILIC GLASGOW   PC 1 – PC 11   0.545      0.225
a Fraction of the variation explained by the model
b Predictive ability of the model
Figure 4. 1: Score scatter plot from unsupervised, unit variance scaled PCA for the normalized data set from RP-HPLC-HRMS obtained at the University of Strathclyde, Glasgow. Colors represent: red, affected workers; blue, control samples; green, QC samples. Observations outside the ellipse are strong outliers from the Hotelling’s T2 test.
The PCA score scatter plot of the normalized, original data set from HILIC-HRMS
obtained at the University of Strathclyde, Glasgow showed two strong outliers, identified by
the Hotelling’s T2 test (Figure 4. 2). Interestingly, the outliers observed in the RP-HPLC-HRMS
data set were different from those observed in the HILIC-HRMS data set. The QC samples did
not cluster together in the PCA score plots, which indicates instrumental drift. Thus, the
normalization of the data did not entirely correct for instrumental drift, indicating that
individual features in the samples did not drift equally during the analyses.
Figure 4. 2: Score scatter plot from unsupervised, unit variance scaled PCA for the normalized data set from HILIC-HRMS obtained at the University of Strathclyde, Glasgow. Colors represent: red, affected workers; blue, control samples; green, QC samples. Observations outside the ellipse are strong outliers from the Hotelling’s T2 test.
OPLS-DA was used for the selection of possible biomarkers in the sewage workers.
This is a supervised MVA method, which means that it correlates the data to a certain property,
in this case ‘affected’ or ‘control’75. OPLS-DA was carried out using pareto
scaling. Pareto scaling is very similar to unit variance scaling. However, in pareto scaling the
square root of the standard deviation is used as the scaling factor, and not the standard
deviation itself as in unit variance scaling. By using pareto scaling, large fold changes are
decreased more than small fold changes, and thus the large fold changes are less
dominant76. The QC samples were excluded from the OPLS-DA, since this type of MVA is
intended to obtain information about the variance and covariance between the affected
and control groups. The QC samples, representing an ‘average’ of the entire data set,
obviously cannot contribute to this.
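The difference between the two scaling methods can be illustrated with a minimal sketch. The peak areas below are hypothetical and are not data from this study, and the function name scale_column is an assumption made for illustration only; the scaling in this work was performed in SIMCA.

```python
import math
import statistics


def scale_column(values, method="uv"):
    """Mean-center one feature and divide by s (UV) or sqrt(s) (pareto)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    if method == "uv":
        factor = sd
    elif method == "pareto":
        factor = math.sqrt(sd)
    else:
        raise ValueError("method must be 'uv' or 'pareto'")
    return [(v - mean) / factor for v in values]


# Hypothetical peak areas of a low- and a high-abundance feature
low_abundance = [10.0, 12.0, 14.0]
high_abundance = [1000.0, 1200.0, 1400.0]

uv_low = scale_column(low_abundance, "uv")
uv_high = scale_column(high_abundance, "uv")
par_low = scale_column(low_abundance, "pareto")
par_high = scale_column(high_abundance, "pareto")

print("UV:    ", uv_low, uv_high)
print("Pareto:", par_low, par_high)
```

After unit variance scaling both features have the same spread, whereas after pareto scaling the high-abundance feature still varies more than the low-abundance one, although far less than in the raw data.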
The OPLS-DA score scatter plot for the RP-HPLC-HRMS data set is shown in Figure 4.
3. Such plots are, in the same way as PCA score scatter plots, based on the summarization
of observations77. The plot shows some separation between both groups. However, it can
also be seen from the plot that the differentiation between the groups was not complete. This
indicated a weak model, which was further confirmed by the following statistical measures:
the model explained 48.6 percent of the variation between both groups (i.e.
R2Y(cum)) (Table 4. 2). However, a negative value for the predictive ability of the model,
Q2, shows an especially poor fit, and any identification of putative marker metabolites based
on the model must be handled with care77.
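Why Q2 can become negative while R2Y stays positive may be easier to see from a small sketch. It assumes the common definition 1 - (residual sum of squares)/(total sum of squares), applied to fitted values for R2Y and to cross-validated predictions for Q2. The class labels and predictions are hypothetical, and the exact cross-validation procedure in SIMCA may differ.

```python
def explained_fraction(y_obs, y_pred):
    """Return 1 - RSS/SS_total for observed and predicted y values."""
    mean = sum(y_obs) / len(y_obs)
    ss_total = sum((y - mean) ** 2 for y in y_obs)
    rss = sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))
    return 1.0 - rss / ss_total


# Hypothetical dummy-coded classes: 0 = control, 1 = affected
y = [0.0, 0.0, 1.0, 1.0]
fitted = [0.2, 0.3, 0.7, 0.8]           # values from the fitted model
cross_validated = [0.6, 0.7, 0.4, 0.3]  # predictions for left-out samples

r2y = explained_fraction(y, fitted)
q2 = explained_fraction(y, cross_validated)
print(f"R2Y = {r2y:.2f}, Q2 = {q2:.2f}")
```

A negative Q2 in this sketch means that the cross-validated predictions are worse than simply predicting the mean of y for every sample.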
Figure 4. 3 Score scatter plot from supervised OPLS-DA (pareto scaling) of the observations from RP-HPLC-HRMS performed at the University of Strathclyde, Glasgow. Colors represent: blue, affected workers; green, control samples.
The OPLS-DA score scatter plot for the HILIC-HRMS data set is shown in Figure 4. 4. At
first sight, this looks slightly better than the corresponding score scatter plot from the
RP-HPLC-HRMS data set. Table 4. 2, showing selected statistical measures of the model,
supports that this model is considerably stronger than the above, as indicated by a higher
R2Y(cum). Also, the higher predictive ability of the model, Q2, and thus the smaller
difference between Q2 and R2Y(cum), support a stronger model.
Figure 4. 4: Score scatter plot from supervised OPLS-DA (pareto scaling) of the observations from HILIC-HRMS performed at the University of Strathclyde, Glasgow. Colors represent: blue, affected workers; green, control samples.
Exclusion of the outliers in RP-HPLC-HRMS (S114, S146, 02_S127) and HILIC-HRMS
(S196, S109), identified from the PCA plots, resulted in slightly better values for R2X, R2Y
and Q2 (Table 4. 2). In the first case, the previously negative predictive ability of
the model improved, but the fit still remained very weak. In HILIC-HRMS an increase was
noticed as well, but in this case an extra component had to be included in order to
maximize Q2. In general, this correction resulted in only a small, insignificant
improvement and, moreover, new outliers were revealed in the new score plots.
Investigation of the S-plots after exclusion of the outliers did not change the selection of
putative biomarkers for further identification. Therefore, the analysis was continued with the
complete sample set.
Table 4. 2: Comparison of statistical measures for supervised OPLS-DA models obtained from RP-HPLC-HRMS and HILIC-HRMS data sets obtained at the University of Strathclyde, Glasgow.
OPLS-DA (PAR)
Data set                              PC            R2X(cum)a  R2Y(cum)b  Q2(cum)c
RP GLASGOW                            PC 1 + PC 2   0.237      0.486      -0.0344
RP GLASGOW adjusted for outliers      PC 1 + PC 2   0.243      0.512      0.0236
HILIC GLASGOW                         PC 1 + PC 2   0.220      0.589      0.302
HILIC GLASGOW adjusted for outliers   PC 1 – PC 3   0.262      0.831      0.356
a Fraction of the variation explained by the model
b Fraction of the variation between both groups explained by the model
c Predictive ability of the model
In order to select putative metabolic markers of exposure, the S-plots for both OPLS-
DA models were studied77. An S-plot is a loadings plot visualizing and scoring the variables
(i.e. metabolic features) according to their significance for the model77. S-plots have been
constructed out of the pareto scaled OPLS-DA models. The potential metabolic markers of
exposure were selected according to their location in the S-plot (Figure 4. 5 and Figure 4. 6).
This type of loadings plot combines the modelled covariance (p[1]-axis) and the modelled
correlation (p(corr)[1]-axis) from OPLS-DA77. Putative biomarker molecules are characterized
by large variable magnitude and good reliability, and in an S-plot such variables are located
at the bottom left or upper right corner of the plot (Figure 4. 5 and Figure 4. 6)77.
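The two axes of an S-plot can be approximated for a single feature as the covariance and the correlation between the predictive score vector and the scaled feature intensities. The sketch below illustrates this under that assumption; the scores and intensities are hypothetical, and the function name s_plot_point is not part of SIMCA.

```python
import math


def s_plot_point(scores, intensities):
    """Return (covariance, correlation) between the scores and one feature."""
    n = len(scores)
    mean_t = sum(scores) / n
    mean_x = sum(intensities) / n
    cov = sum((t - mean_t) * (x - mean_x)
              for t, x in zip(scores, intensities)) / (n - 1)
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in scores) / (n - 1))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in intensities) / (n - 1))
    return cov, cov / (sd_t * sd_x)


# Hypothetical predictive scores for four samples and two features
scores = [-2.0, -1.0, 1.0, 2.0]
feature_a = [1.0, 2.0, 4.0, 5.0]  # follows the scores closely
feature_b = [3.0, 1.0, 1.0, 3.0]  # unrelated to the scores

cov_a, corr_a = s_plot_point(scores, feature_a)
cov_b, corr_b = s_plot_point(scores, feature_b)
print(cov_a, corr_a)
print(cov_b, corr_b)
```

A feature such as feature_a, with both a large covariance and a correlation close to 1 or -1, would lie in one of the outer corners of the S-plot, whereas feature_b would lie near the origin.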
Figure 4. 5: S-plot of supervised, pareto scaled OPLS-DA of RP-HPLC-HRMS profiled serum samples run at the University of Strathclyde, Glasgow. The blue and red colored loadings correspond to potential marker metabolites that were selected for T-testing and tentative identification.
Figure 4. 6: S-plot of supervised, pareto scaled OPLS-DA of HILIC-HRMS profiled serum samples run at the University of Strathclyde, Glasgow. The blue and red colored loadings correspond to potential marker metabolites that were selected for T-testing and tentative identification.
Figure 4. 7: Variable Importance Plot (VIP) of supervised, pareto scaled OPLS-DA of RP-HPLC-HRMS profiled serum samples run at the University of Strathclyde, Glasgow. Coloration of variables is according to the S-plot.
The Variable Importance Plot (VIP) is a scoring feature of the SIMCA software that
allows verification of the selection of potential metabolite markers from the S-plot. The
higher the VIP score (>1), the more important the metabolic feature is for the
difference between the two groups. The VIP plot is a coefficient plot that summarizes
the relationship between the X and Y variables, but the algorithm is not known (Figure 4. 7
and Figure 4. 8)78.
Figure 4. 8: Variable Importance Plot (VIP) of supervised, pareto scaled OPLS-DA of HILIC-HRMS profiled serum samples run at the University of Strathclyde, Glasgow. Coloration of variables is according to the S-plot.
4.4 SELECTION OF POTENTIAL METABOLOMIC MARKERS OF EXPOSURE
The selected features from supervised OPLS-DA were tested for statistical significance
using a two-tailed T-test in Excel. The statistically significant (P<0.05) potential metabolic
markers are summarized as m/z and retention time pairs in Table 4. 3 and Table 4. 4,
together with their group ratio as well as the relative standard deviation (RSD) of the
metabolic features in the QC samples and within the two groups. The ratio indicates
whether the metabolite is higher (>1) or lower (<1) in the control group than in the
affected group. Features with poor repeatability, i.e. features with RSDs > 30 % in the
QCs, were removed. However, where the same feature was identified as a statistically
significant putative metabolic marker in the re-analyses carried out at STAMI, Oslo, it was
not rejected from the original (i.e. University of Strathclyde) list of metabolic markers, even
though it had an RSD > 30 %.
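The ratio and RSD columns of Table 4. 3 and Table 4. 4 can be reproduced in principle with a few lines of code. The following is a minimal sketch with hypothetical peak areas; the function names and the dictionary keys are assumptions for illustration, and the T-test itself (performed in Excel in this work) is not reimplemented here.

```python
import statistics


def percent_rsd(values):
    """Relative standard deviation in percent (sample standard deviation)."""
    return 100.0 * statistics.stdev(values) / statistics.fmean(values)


def summarize_feature(control, affected, qc, max_qc_rsd=30.0):
    """Group ratio, RSD values and the QC repeatability decision for one feature."""
    rsd_qc = percent_rsd(qc)
    return {
        "ratio_control_affected": statistics.fmean(control) / statistics.fmean(affected),
        "rsd_qc": rsd_qc,
        "rsd_control": percent_rsd(control),
        "rsd_affected": percent_rsd(affected),
        "keep": rsd_qc <= max_qc_rsd,  # reject features with poor repeatability
    }


# Hypothetical peak areas (not data from this study)
summary = summarize_feature(
    control=[90.0, 100.0, 110.0],
    affected=[100.0, 125.0, 150.0],
    qc=[98.0, 100.0, 102.0],
)
print(summary)
```

In this example the ratio is below 1, i.e. the hypothetical feature is lower in the control group than in the affected group, and the feature passes the QC repeatability criterion.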
Table 4. 3: Putative metabolic markers selected from the RP-HPLC-HRMS data set.
RP-HPLC Potential Metabolic Markers
Primary ID  Row m/z    Row retention time (min)  P-value   Ratio control/affected  %RSD of QC  %RSD of control  %RSD of affected
P1          166.0862   6.82                      0.025     0.90                    3.74        18.3             14.7
N6          167.0213   3.59                      0.037     0.91                    2.50        18.1             17.4
P9          182.0812   4.61                      0.010     0.86                    3.55        20.9             24.0
P33         204.1230   3.38                      0.032     0.79                    13.3        41.6             44.0
N20         311.1406   11.72                     0.001     0.70                    4.17        38.9             40.1
P17         313.1544   11.72                     0.001     0.69                    2.75        43.2             42.4
P89         480.3450   29.14                     0.017     1.4                     11.7        49.6             29.8
P3          520.3396   26.08                     0.020     1.1                     2.20        20.8             18.9
P7          522.3552   28.82                     0.026     1.1                     4.70        21.5             22.1
P53         522.3552   28.15                     0.019     1.2                     5.58        28.0             29.2
N30         588.3314   26.11                     0.023     0.87                    1.33        24.9             20.0
N11         617.7375   13.92                     0.001     0.77                    6.31        29.2             33.2
N15         653.2669   11.79                     0.004     0.74                    3.73        41.2             39.4
N23         661.2536   13.12                     0.033     0.84                    6.20        27.5             37.0
N50         862.3950   11.79                     2.3E-05   0.59                    4.93        47.1             46.5
N71         942.3451   7.57                      1.7E-04   0.60                    3.38        50.2             54.0
P55         1043.704   28.82                     0.037     1.4                     7.61        55.3             53.4
N54         1083.663   26.08                     0.012     1.2                     0.81        25.0             24.3
Table 4. 4: Putative metabolic markers selected from the HILIC-HRMS data set.
HILIC Potential Metabolic Markers
Primary ID  Row m/z    Row retention time (min)  P-value   Ratio control/affected  %RSD of QC  %RSD of control  %RSD of affected
N34         187.0074   4.57                      0.028     1.6                     62.5        61.0             108
N27         253.2178   4.09                      0.040     0.70                    11.8        68.9             68.0
N59         269.2128   3.96                      0.046     1.3                     49.2        49.1             61.5
N133        287.2311   3.94                      3.0E-07   3.1                     49.8        44.5             131
N54         296.2363   3.91                      0.031     1.3                     49.5        44.1             53.3
N24         297.2440   3.92                      0.042     1.4                     49.2        52.8             64.4
P4          496.3395   4.97                      0.053a    1.1                     3.05        15.4             21.0
P10         520.3396   4.94                      0.0037    1.2                     13.0        22.0             26.6
P23         522.3555   4.89                      0.015     1.1                     8.51        17.3             29.5
P19         524.3710   4.85                      0.022     1.1                     3.18        18.3             23.9
P18         760.5854   4.32                      0.0056    0.78                    6.78        34.9             34.1
P12         786.6006   4.27                      0.022     0.83                    7.78        33.3             28.4
P39         810.6008   4.22                      0.0034    0.77                    7.43        37.8             28.8
P77         812.6160   4.23                      0.0025    0.62                    9.01        69.4             54.5
a The P-value of compound 496.3395 exceeds the limiting value slightly, but has been kept.
4.5 TENTATIVE IDENTIFICATION OF METABOLITES
The selection criteria, i.e. P < 0.05 and within-QC and within-group RSD < 30%, gave 12
metabolic features that were potential metabolic markers of exposure (Table 4. 7). Two of
these features were significant in both the RP-HPLC-HRMS and the HILIC-HRMS data sets (i.e.
m/z 520.3396 and 522.3552). This was regarded as a strong indicator that the group difference
was real rather than simply due to chance. However, when the sample preparation and
instrumental analysis were repeated in Oslo, the majority of the features that had been
identified as putative metabolic markers in the data sets obtained in Glasgow were not
identified as such in Oslo. Only two features, with m/z 496.3395 and 520.3396, could again be
identified as putative metabolic markers in the HILIC-HRMS data set obtained in Oslo
(P < 0.05, within-QC and within-group RSD < 30%). Furthermore, four of the putative
metabolic markers from the RP-HPLC-HRMS data set acquired in Glasgow (N20, P17, N11
and N15) showed within-group RSDs higher than 30%, but were again detected as putative
metabolic markers in the analyses performed in Oslo. These were thus kept in the list of final
putative metabolic markers (Table 4. 7).
The elemental composition was determined using the Xcalibur software for all final
putative metabolic markers (Table 4. 7). A QC sample was used to extract the mass
spectra of all putative metabolic markers, because the data from these samples should give
the best representation of all features (cfr. 4.1). QC data files from RP were used for RP
features and QC data files from HILIC for HILIC features. The scan filter during ion extraction
was set according to the polarity mode of the electrospray interface. Each mass spectrum
was inspected in order to find a plausible elemental composition consistent with this
mass spectrum.
The tentative structure determination for the putative metabolic markers is
demonstrated in more detail in the following for the metabolic feature P1 (Table 4. 3 and
Table 4. 7). This putative metabolic marker afforded ions with m/z 166.0862 in the positive
ionization mode. Therefore, the scan filter was set to positive mode. The mass
tolerance for extracting the peak from the full-scan chromatogram was set to 5 ppm (Figure
4. 9). Figure 4. 9 shows the extracted ion chromatogram and the mass spectrum of the
metabolic feature. The mass peak at m/z 166.0862 is regarded as the M peak, while the
isotope mass peaks resulting from the exchange of one or two 12C isotopes by 13C can also
be seen at m/z 167.0892 and 168.0926, respectively (Figure 4. 9).
Figure 4. 9: The extracted chromatogram and mass spectrum for the positively ionized metabolic feature with m/z 166.0862.
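As a minimal sketch of the two calculations behind this extraction, the code below converts a 5 ppm tolerance into an m/z window and compares the spacing of the isotope peaks with the 13C/12C mass difference. The function name and the acceptance threshold of 0.001 are assumptions made for illustration; the m/z values are those reported in the text.

```python
C13_C12_DIFF = 1.003355  # mass difference between 13C and 12C in Da


def ppm_window(mz, tol_ppm):
    """Return the lower and upper m/z limits for a given ppm tolerance."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta


# Extraction window for the M peak of feature P1 at 5 ppm
low, high = ppm_window(166.0862, 5.0)
print(f"Extraction window: {low:.4f} - {high:.4f}")

# Spacing between the M, M+1 and M+2 peaks reported for feature P1
isotope_peaks = [166.0862, 167.0892, 168.0926]
spacings = [b - a for a, b in zip(isotope_peaks, isotope_peaks[1:])]
consistent_with_13c = all(abs(s - C13_C12_DIFF) < 0.001 for s in spacings)
print(spacings, consistent_with_13c)
```

Both spacings agree with the 13C/12C mass difference within the chosen threshold, which is consistent with a singly charged ion.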
After mass spectrum extraction, Xcalibur was used to generate conceivable elemental
compositions for the given m/z value. The elements C, H, O, N, P, Cl, Na and S were included
for the generation of elemental formulae, and the mass tolerance was set to 3 ppm. From the
isotope pattern of feature P1, it was already possible to exclude chlorine
from the list of elements, because otherwise an M+2 peak with about 30 % intensity relative
to the M peak should be present. Sulfur could likely also be removed from the list, as
otherwise the size of the M+2 peak would be expected to increase by approximately 5 %.
According to the nitrogen rule, molecules with even molecular mass have zero or an
even number of nitrogen atoms in their formula, and molecules with odd molecular mass have an
odd number of nitrogen atoms. Feature P1 was therefore expected to contain an odd number of
nitrogen atoms: the observed base peak with m/z 166.0862 corresponds to the protonated
molecule (positive ionization), and the monoisotopic mass of the neutral molecule would
therefore be 165.0789 Da. Based on the size of the molecule, the expected number of atoms of
each element was estimated for the
determination of elemental formulae using Xcalibur (Table 4. 5); the resulting
formulae are listed in Table 4. 6.
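The reasoning above, i.e. deriving the neutral monoisotopic mass from the protonated ion and then applying the nitrogen rule, can be sketched as follows. The function names are assumptions for illustration. Rounding the monoisotopic mass to obtain the nominal mass is only reliable for small molecules, where the accumulated mass defect is well below 0.5 Da.

```python
PROTON_MASS = 1.007276  # Da


def neutral_mass_from_protonated(mz):
    """Monoisotopic mass of M for a singly charged [M+H]+ ion."""
    return mz - PROTON_MASS


def nitrogen_count_parity(nominal_mass):
    """Nitrogen rule: an odd nominal mass implies an odd number of N atoms."""
    return "odd" if nominal_mass % 2 else "even"


neutral = neutral_mass_from_protonated(166.0862)
parity = nitrogen_count_parity(round(neutral))
print(f"Neutral mass: {neutral:.4f} Da, number of nitrogen atoms: {parity}")
```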
Table 4. 5: Overview of the elements included in the calculation of the elemental composition for m/z 166.0862, with the estimated minimum and maximum number of atoms.
Elements in use
Isotope  Min  Max  DB eq.  Mass
14N      1    3    0.5     14.003
16O      0    6    0.0     15.995
12C      0    20   1.0     12.000
1H       0    30   -0.5    1.008
31P      0    1    0.5     30.974
Table 4. 6: Possible elemental formulae for m/z 166.0862 calculated in Xcalibur based on the parameters in Table 4. 5.
As feature P1 corresponded to a metabolite with odd molecular mass, an odd number
of nitrogen atoms was expected, and thus only the first elemental composition was likely. The
‘delta ppm’ is equivalent to the mass accuracy and shows the difference between the
theoretical and experimental m/z; a difference of 0.3 ppm can be considered very
good79. For metabolites with larger m/z values, more theoretical elemental formulae are
expected to be calculated because there are more possibilities for the combination of
chemical elements. A mass spectrum for an ion with the elemental formula C9H12O2N
was generated in order to compare its isotope pattern with the measured mass
spectrum (Figure 4. 10). The simulated mass spectrum resembled the observed mass
spectrum, supporting that the calculated elemental composition was correct.
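How a ‘delta ppm’ value arises can be illustrated with a short calculation for the ion C9H12O2N. This is a sketch using standard monoisotopic masses and subtracting one electron mass for the positive charge; it is not the Xcalibur implementation, and the function names are assumptions.

```python
# Standard monoisotopic masses in Da
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON_MASS = 0.00054858


def cation_mz(composition, charge=1):
    """Theoretical m/z of a positive ion given its elemental composition."""
    mass = sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())
    return (mass - charge * ELECTRON_MASS) / charge


def delta_ppm(measured, theoretical):
    """Mass error of the measured m/z relative to the theoretical m/z."""
    return (measured - theoretical) / theoretical * 1e6


theoretical = cation_mz({"C": 9, "H": 12, "O": 2, "N": 1})
error = delta_ppm(166.0862, theoretical)
print(f"Theoretical m/z: {theoretical:.4f}, delta ppm: {error:.2f}")
```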
Based on the m/z value of the base peak and the calculated elemental formula, the
databases PubChem, ChemSpider, Metlin, HMDB and KEGG were searched for metabolites
that would comply with these data. Then, the MS2 fragmentation pattern of the metabolite
was studied in order to find out which of the database compounds fitted the
fragmentation pattern of the metabolite. A QC sample was used to acquire MS2 data for the
putative metabolite markers.
Results
Index  Formula      RDB  Delta ppm
1      C9H12O2N     4.5  -0.332
2      C5H15O2N2P   0.0  -2.202
Figure 4. 10: Isotope pattern of generated elemental composition (above) compared to isotope pattern of metabolite with m/z 166.0862 (below). The patterns are very similar.
The databases gave several suggestions for the structure of the metabolite, e.g.
phenylalanine, benzocaine, 3 amino-phenylpropionic acid (which would be identical to
phenylalanine) etc. The ring double bond equivalent (RDB) of the neutral equivalent of P1
was 5 (cfr. Table 4. 6), which suggested that the molecule could have a benzene ring and a
carbonyl group, as in phenylalanine. Examination of the fragmentation spectrum revealed a major
product ion at m/z 120, corresponding to a loss of 46 Da, and a minor product ion at m/z 149
(Figure 4. 11). The loss of 46 Da was attributed to loss of the carboxyl group as formic acid,
which is diagnostic for the presence of a carboxylic acid (Figure 4. 11). The latter product
ion, corresponding to a loss of 17 Da (ammonia), can be explained by the presence of an
amine function in the structure.
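The RDB values and the neutral losses discussed above follow from simple arithmetic, sketched below. The RDB formula assumes trivalent nitrogen and phosphorus and ignores oxygen and sulfur, in line with the double bond equivalents listed in Table 4. 5; the function name rdb is an assumption for illustration.

```python
def rdb(c=0, h=0, n=0, p=0, halogens=0):
    """Ring double bond equivalent: C - (H + X)/2 + (N + P)/2 + 1."""
    return c - (h + halogens) / 2 + (n + p) / 2 + 1


# Ion C9H12O2N as calculated for m/z 166.0862, and the neutral molecule C9H11O2N
rdb_ion = rdb(c=9, h=12, n=1)
rdb_neutral = rdb(c=9, h=11, n=1)
print(rdb_ion, rdb_neutral)

# Nominal neutral losses for the product ions at m/z 120 and m/z 149
precursor = 166
losses = {fragment: precursor - fragment for fragment in (120, 149)}
print(losses)
```

The half-integer RDB of the ion is typical for an even-electron ion such as a protonated molecule, while the neutral molecule gives the integer value used in the structural interpretation.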
For all other final putative metabolic markers the chromatograms, mass spectra and
MS2 product ion spectra are shown in the appendix. However, the obtained mass spectral
data for the remaining putative metabolic markers will be briefly discussed in the following
sections (Table 4. 7).
Figure 4. 11: Product ion spectrum from HILIC-ion trap mass spectrometry and possible explanation for the observed product ions for the metabolite with m/z 166.0862, tentatively identified as phenylalanine.
Another metabolic feature, which was identified as a putative metabolic marker,
afforded negatively charged ions with m/z 167.0213. Provided this m/z corresponds to the
deprotonated molecule, the metabolite contains an even number of nitrogen atoms, if
any. However, the mass spectrum was noisy, and it was therefore difficult to identify an
isotope pattern (Figure 7. 1). Furthermore, the elemental formulae returned from
calculations made in Xcalibur were meaningless, and the MS2 product ion spectrum showed
only two ions corresponding to losses of 43 and 44 Da (Figure 7. 2). Thus, the compound
remained unidentified.
The elemental formula C9H12O3N was found to be the most likely for the metabolite
affording positively charged ions with m/z 182.0812 (Table 4. 7). The elemental composition
differed by one oxygen atom from that of the tentatively identified phenylalanine. The MS2
product ion spectrum was likewise similar and showed product ions corresponding to losses
of 17 and 46 Da (Figure 7. 4). These product ions are likely the result of loss of ammonia and
formic acid, indicating the presence of amine and carboxylic acid functions. It is thus most
likely that m/z 182.0812 corresponds to the amino acid tyrosine (4-hydroxyphenylalanine).
The metabolic features m/z 311.1406 and m/z 313.1544 from negative and positive
ionization, respectively, eluted at the same retention time from the RP-HPLC columns (Table
4. 7). The calculated elemental compositions for the two ions were C18H19N2O3 and
C18H21N2O3 showing that the ions were the deprotonated and protonated molecular ions,
respectively. Database searches suggested a phenylalanine-phenylalanine dipeptide for the
elemental formulae. Several of the product ions observed in the MS2 product ion spectrum
supported a phenylalanine dipeptide (Figure 7. 6). For example, the m/z 147 product ion is
likely due to cleavage of the amide linkage between the monomers, and the m/z 164
product ion could arise from fragmentation of the bond on the other side of the amide
linkage resulting in a phenylalanine-amide (Figure 7. 6).
Compound 496.3395 was tentatively identified as 2- or 3-hydroxy-palmitoyl-glycerophosphocholine
(C24H51O7NP) from database searches. First, the formula was
generated by including at least one nitrogen and no chlorine in the calculations, based on
the observed isotope pattern and on the molecular mass, which indicated the presence of an
odd number of nitrogen atoms. The MS2 product ion spectrum showed a major product ion
with m/z 184, likely corresponding to a phosphocholine moiety, which supports a
phosphocholine-type phospholipid (Figure 7. 9). Whether the molecule is a 2- or 3-hydroxy
compound can only be established by comparison with authentic standards.
The m/z 520.3396 metabolic feature was significantly different between the affected
and control workers in both the RP and the HILIC approach for the samples run at the
University of Strathclyde, Glasgow. Calculating the elemental composition revealed
C26H51O7NP as the most probable elemental formula, and again sn-glycero-3-phosphocholine
molecules were suggested in the databases. The elemental composition differed by C2 from
that of the putative hydroxy-palmitoyl-glycerophosphocholine. The fragmentation pattern showed
the same characteristics as for the previous compound (Figure 7. 12), and it could therefore be
concluded that this compound is a similar phospholipid with a longer fatty acid chain. Thus,
m/z 520.3396 is probably a 2- or 3-hydroxy-octadecadienoylglycerophosphocholine.
Table 4.7: Overview of the final potential metabolic markers from untargeted RP-HPLC- and HILIC-HRMS based metabolomics and tentative identification of metabolites based on calculation of the elemental composition and study of MS2 product ion spectra. Superscripts on the row m/z values mean: 1: significant features from both the RP-HPLC and HILIC data sets from Glasgow; 2: significant features from the RP-HPLC data set with inter-group RSD >30%, but which were identified both in Glasgow and in Oslo; 3: significant features from the HILIC data sets from both Glasgow and Oslo; 4: significant feature from the HILIC and RP-HPLC data sets from Glasgow and the HILIC data set from Oslo.
Primary ID | Row m/z | Retention time (min) | Type of ion | Elemental composition | RDBᵃ | Delta ppm | Tentative structure | Type
P1 | 166.0862 | 6.82 | [M+H]+ | C9H12O2N | 5 | -0.332 | Phenylalanine | Amino acid
N6 | 167.0213 | 3.60 | [M-H]- | / | / | / | Unknown | /
P9 | 182.0812 | 4.61 | [M+H]+ | C9H12O3N | 5 | 0.166 | Tyrosine | Amino acid
N20 | 311.1406² | 11.72 | [M-H]- | C18H19N2O3 | 10 | 2.928 | Phenylalanine-phenylalanine | Dipeptide
P17 | 313.1544² | 11.72 | [M+H]+ | C18H21N2O3 | 10 | -0.859 | Phenylalanine-phenylalanine | Dipeptide
P4 | 496.3395³ | 4.97 | [M+H]+ | C24H51O7NP | 1 | -0.846 | 2-hydroxy-palmitoyl-glycerophosphocholine or 3-hydroxy-palmitoyl-glycerophosphocholine | Phospholipid
P3 | 520.3396¹ (RP) | 26.08 | [M+H]+ | C26H51O7NP | 3 | -0.319 | 2-hydroxy-octadecadienoylglycerophosphocholine or 3-hydroxy-octadecadienoylglycerophosphocholine | Phospholipid
P10 | 520.3396⁴ (HILIC) | 4.94 | [M+H]+ | C26H51O7NP | 2 | -0.319 | 2-hydroxy-octadecadienoylglycerophosphocholine or 3-hydroxy-octadecadienoylglycerophosphocholine | Phospholipid
P7 | 522.3552¹ (RP) | 28.82 | [M+H]+ | C26H53O7NP | 2 | -0.413 | 2-hydroxy-octadecenoylglycerophosphocholine or 3-hydroxy-octadecenoylglycerophosphocholine | Phospholipid
P53 | 522.3552¹ (RP) | 28.15 | [M+H]+ | C26H53O7NP | 2 | -0.413 | 2-hydroxy-octadecenoylglycerophosphocholine or 3-hydroxy-octadecenoylglycerophosphocholine | Phospholipid
P23 | 522.3555¹ (HILIC) | 4.89 | [M+H]+ | C26H53O7NP | 2 | -0.873 | 2-hydroxy-octadecenoylglycerophosphocholine or 3-hydroxy-octadecenoylglycerophosphocholine | Phospholipid
P19 | 524.3710 | 4.85 | [M+H]+ | C26H55O7NP | 1 | -0.965 | 2-hydroxy-octadecanoylglycerophosphocholine or 3-hydroxy-octadecanoylglycerophosphocholine | Phospholipid
N30 | 588.3314 | 26.11 | [M-H]- | C24H48O4N9P2 | 6 | 1.070 | Unknown | /
N11 | 617.7375² | 13.92 | [M-H]- | / | / | / | Unknown | /
N15 | 653.2669² | 11.79 | [M-H]- | / | / | / | Unknown | /
N54 | 1083.663 | 26.08 | [M-H]- | / | / | / | Unknown | /
ᵃ The ring double bond equivalent (RDB) is for the neutral molecule.
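The RDB values in Table 4.7 can be reproduced from the elemental composition of the neutral molecule. The sketch below uses the common valence-based formula RDB = C - H/2 + N/2 + P/2 + 1, treating nitrogen and phosphorus as trivalent. This convention and the function name `rdb` are assumptions made for illustration and do not describe the software used in this work. Because the elemental compositions in the table refer to the detected ions, one hydrogen is removed from a protonated ion (or added to a deprotonated ion) to obtain the neutral molecule first.

```python
def rdb(c, h, n=0, p=0):
    """Ring and double bond equivalents of a neutral molecule.

    Assumes trivalent N and P; divalent oxygen does not contribute.
    """
    return c - h / 2 + n / 2 + p / 2 + 1

# Neutral formulas derived from the ion formulas listed in Table 4.7
print(rdb(c=9, h=11, n=1))        # phenylalanine, C9H11NO2
print(rdb(c=26, h=50, n=1, p=1))  # neutral form of the m/z 520.3396 ion (C26H51O7NP minus H)
```

Under these assumptions the two calls give 5 and 3, which agrees with the RDB values listed for P1 and P3.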
The metabolic feature with m/z 522.3552 was also present in both the RP and the
HILIC data sets from the samples run at the University of Strathclyde, Glasgow.
Furthermore, the extracted ion chromatograms showed two closely eluting isomers that
differed significantly between the affected and control workers in both RP-HPLC-HRMS
data sets (Glasgow and Oslo). The calculated elemental formula for m/z 522.3552
(C26H53O7NP) showed that this metabolite contained two more hydrogen atoms than m/z
520.3396, while its MS2 product ion spectrum showed that it was a phosphocholine
(Figure 7.15). The fatty acid chain of this phospholipid was therefore likely
mono-unsaturated, and the metabolite was thus likely a 2- or
3-hydroxy-octadecenoylglycerophosphocholine.
The MS characteristics for m/z 524.3710 were similar to those of the latter two, and its
elemental formula (C26H55O7NP) again indicated the presence of two additional hydrogen
atoms relative to m/z 522.3552 (Figure 7.17). Thus, this metabolite was likely a 2- or
3-hydroxy-octadecanoylglycerophosphocholine.
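The mass relationships used in this reasoning can be checked numerically. The sketch below computes the theoretical m/z of a singly charged positive ion from its elemental composition and the mass error in ppm, defined here as (measured - theoretical) / theoretical x 10^6. The monoisotopic masses are standard values rounded to eight decimals. The helper names `ion_mz` and `delta_ppm` are illustrative assumptions and are not part of the software used in this work.

```python
# Monoisotopic masses in Da (standard values, rounded)
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307400,
                "O": 15.99491462, "P": 30.97376200}
ELECTRON = 0.00054858  # electron mass in Da

def ion_mz(formula, charge=1):
    """Theoretical m/z of a positive ion whose formula already includes the added proton(s)."""
    mass = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return (mass - charge * ELECTRON) / charge

def delta_ppm(measured, theoretical):
    """Mass error of a measured m/z relative to the theoretical m/z, in ppm."""
    return (measured - theoretical) / theoretical * 1e6

mz_520 = ion_mz({"C": 26, "H": 51, "O": 7, "N": 1, "P": 1})
mz_522 = ion_mz({"C": 26, "H": 53, "O": 7, "N": 1, "P": 1})

print(round(delta_ppm(520.3396, mz_520), 2))
print(round(delta_ppm(522.3552, mz_522), 2))
print(round(mz_522 - mz_520, 4))  # mass of two hydrogen atoms
```

With these inputs the mass errors come out close to the -0.319 and -0.413 ppm listed in Table 4.7, and the difference between the two theoretical m/z values equals the mass of two hydrogen atoms (about 2.0157 Da).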
The most likely elemental composition for m/z 588.3314 was C24H48O4N9P2. The databases
did not contain any metabolite with this elemental composition, so no structure could be
proposed for this compound.
For the metabolic features m/z 617.7375 and 653.2669, observed in negative
ionization mode, no meaningful elemental formulae were found. Both ions were
doubly charged, as can be seen from their isotope patterns (Figure 7.19 and Figure 7.20)⁸⁰.
These metabolites were thus of high molecular mass, which allows many possible
combinations of chemical elements. The data-dependent scanning did not yield MS2 product
ion spectra for these two metabolic features. Therefore, the identity of these two features
remained unknown.
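The charge state of an ion can be read from the spacing of its isotope peaks: successive 13C isotopologues differ by about 1.00335 Da, so the peaks lie about 1.00335/z apart on the m/z axis. The sketch below illustrates this reasoning. The peak list is synthetic and only mimics a doubly charged pattern starting at m/z 617.7375 (it is not read from Figure 7.19), and the conversion to a neutral mass assumes an [M-2H]2- ion, which is an assumption and not something established in this work.

```python
C13_C12_DIFF = 1.0033548   # mass difference between 13C and 12C in Da
PROTON = 1.00727647        # proton mass in Da

def charge_from_isotope_spacing(peaks_mz):
    """Estimate the charge state from the mean spacing of consecutive isotope peaks."""
    peaks = sorted(peaks_mz)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12_DIFF / mean_spacing)

def neutral_mass_negative_mode(mz, z):
    """Neutral monoisotopic mass, assuming the ion is [M - zH]^z- (an assumption)."""
    return z * mz + z * PROTON

# Synthetic isotope pattern (hypothetical values, spaced 0.5017 apart)
synthetic_peaks = [617.7375, 618.2392, 618.7409]
z = charge_from_isotope_spacing(synthetic_peaks)
print(z, round(neutral_mass_negative_mode(617.7375, z), 4))
```

A spacing of about 0.5 m/z units corresponds to z = 2. Under the stated adduct assumption this gives a neutral mass of roughly 1237.49 Da, which illustrates why so many elemental compositions remain possible.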
The compound with m/z 1083.6634 was not identified either. This molecule was very
large, so many elemental compositions would fit the mass spectrum. Since the mass
spectrum showed no distinctive characteristics, no elements could be excluded from the
calculation of the elemental composition. No fragmentation spectrum was obtained for
this molecule either.
All these observations gave a first indication of the structures of the tentative
biomarkers. Nevertheless, further verification is needed to confirm these findings.
Standards of the proposed metabolites should be ordered, and their mass spectra and
fragmentation patterns compared with those of the detected features, before a conclusion
can be drawn. At this stage it is only possible to indicate which compounds the features
most likely are; no further conclusion can be made.
4.6 INSTRUMENTAL DRIFT AND REPRODUCIBILITY
Initially, data analysis was carried out on the data acquired from the samples
prepared in Oslo. During the investigation, the decision was made to focus on the data
acquired at the University of Strathclyde, Glasgow. Although both data sets were subject
to drift, the Oslo samples unfortunately went through two unnecessary freeze-thaw cycles
before sample preparation. This suggests that these data were less trustworthy for
detecting changes in the metabolome than the data from Glasgow.
Instrumental drift is a typical phenomenon and a major confounding factor in long-term
metabolomics investigations²³. Instrumental instability results in poor data quality,
which complicates the comparison of data between different laboratories or of data
collected over time. Drift can also occur within a single run⁸¹. It is caused principally
by samples coming into direct contact with components of the analytical platform.
This can change retention times and measured response over time through contamination
of the ion source and through changes in chromatographic performance such as column
aging⁸¹,⁸². Longer analysis times generally lead to more drift⁸¹.
To deal with this drift, all data sets were normalized to improve reproducibility.
After normalization, the QCs showed that the data sets were still subject to some
instrumental drift. Since both the Oslo and Glasgow data sets were still dominated by
this drift after normalization, the results should be interpreted cautiously.
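One common way to judge from the QC injections whether drift remains after normalization is the relative standard deviation (RSD) of each feature across the QCs, since all QC injections have the same content. The sketch below is a minimal illustration of such a check and not the procedure used in this work. The intensities are hypothetical, and the 30% cut-off is borrowed from the RSD limit mentioned in the caption of Table 4.7.

```python
import statistics

def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100

def unstable_features(qc_table, threshold=30.0):
    """Return the names of features whose RSD across QC injections exceeds the threshold."""
    return [name for name, values in qc_table.items()
            if rsd_percent(values) > threshold]

# Hypothetical normalized QC intensities, one list per feature, in injection order
qc_intensities = {
    "feature_A": [90.0, 100.0, 110.0],   # stable
    "feature_B": [200.0, 120.0, 60.0],   # steadily decreasing, as expected for drift
}
print({name: round(rsd_percent(v), 1) for name, v in qc_intensities.items()})
print(unstable_features(qc_intensities))
```

Here only the steadily decreasing feature exceeds the cut-off. In practice the threshold would be chosen for the study at hand.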
PCA score plots were very useful for obtaining practical information about the
QC samples and about the instrumental drift in general. In the RP PCA score plot
for the samples prepared in Oslo (Figure 4.12), QC1, QC2, QC3 and QC4 were located
very far from the other QCs and from all other samples. They were even detected as
strong outliers. It was also notable that they did not cluster. Ideally, the QC samples
should group as a cluster in the plot, because they all derive from the same vial and
thus have the same content²³,⁶⁷,⁶⁸. As described in section 4.1, QCs are injected at the
start of the batch in order to condition the analytical platform and counteract the
drift. This was done in our case and is probably the reason why the first QCs are
dispersed and isolated from the others. Therefore, QC1 to QC4 could be excluded⁶⁸,⁶⁹.
A new score plot (Figure 4.13), in which the first four QCs were left out, was
constructed in order to analyze the variability of the remaining QCs without the
influence of the conditioning injections. Drift between these remaining QCs could be
established. Since the QCs were analyzed periodically throughout the analytical run
(cf. 3.3.1), all samples were presumably affected by this drift.
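The way drift shows up in a unit variance scaled PCA can be sketched on a toy example. The code below autoscales a small table of hypothetical QC intensities (rows in injection order, columns are features) and computes the scores on the first principal component by power iteration. Power iteration is used here only to keep the example free of dependencies; it is not how the multivariate software used in this work computes PCA, and all values and function names are assumptions.

```python
import math
import statistics

def autoscale(rows):
    """Mean-centre each column and divide by its sample SD (unit variance scaling)."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def first_pc_scores(scaled, n_iter=200):
    """Scores on the first principal component via power iteration on X^T X."""
    n_features = len(scaled[0])
    w = [1.0 / math.sqrt(n_features)] * n_features      # initial loading vector
    for _ in range(n_iter):
        t = [sum(x * wi for x, wi in zip(row, w)) for row in scaled]        # t = X w
        w_new = [sum(row[j] * ti for row, ti in zip(scaled, t))
                 for j in range(n_features)]                                # X^T t
        norm = math.sqrt(sum(v * v for v in w_new))
        w = [v / norm for v in w_new]
    return [sum(x * wi for x, wi in zip(row, w)) for row in scaled]

# Hypothetical QC intensities in injection order; every feature decreases over the run
qc_rows = [
    [100.0, 50.0, 200.0],
    [98.0, 49.0, 195.0],
    [95.0, 47.0, 190.0],
    [93.0, 46.0, 184.0],
    [90.0, 44.0, 180.0],
    [88.0, 43.0, 176.0],
]
scores = first_pc_scores(autoscale(qc_rows))
print([round(s, 2) for s in scores])
```

Injections of identical content would ideally give nearly identical scores. In this toy data the scores instead change monotonically with injection order, which is the pattern expected when QCs drift rather than cluster.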
Figure 4.12: Score scatter plot from unsupervised, unit variance scaled PCA for the normalized data set from RP-HPLC-HRMS obtained in Oslo. The red dots are the affected samples, the blue dots are the control samples and the green dots are the QC samples.
Figure 4.13: Score scatter plot from unsupervised, unit variance scaled PCA for the normalized data set from RP-HPLC-HRMS obtained in Oslo. Because they served to condition the analytical platform, QC1, QC2, QC3 and QC4 have been excluded. The red dots are the affected samples, the blue dots are the control samples and the green dots are the QC samples.
Figure 4.14 shows the PCA-X score plot for the Oslo samples that were run on the
HILIC column. The same observations about the first QCs and about the drift as for the
previous score plot (Figure 4.13) were confirmed, except that the QCs were now
unexpectedly located closer to the center of the plot and closer to each other. The
instrumental drift was therefore probably more pronounced with RP than with HILIC.
Figure 4.14: Score scatter plot from unsupervised, unit variance scaled PCA for the normalized data set from HILIC-HPLC-HRMS obtained in Oslo. The red dots are the affected samples, the blue dots are the control samples and the green dots are the QC samples.
Several approaches exist for dealing with instrumental drift in order to obtain
consistent results. One is to split the study into small analytical blocks⁶⁸,⁸³.
Zelena et al. reported that analytical blocks of fewer than 90 samples gave better
results⁸². This recommendation does not apply in our case because the study contained
fewer than 90 samples. Another approach uses labeled internal standards of the analytes
of interest. This is impractical here because all analytes are of interest in an
untargeted analysis⁸². For untargeted metabolomics studies, Sysi-Aho et al. therefore
suggested optimized multiple internal standards, which enable the correction of each
feature according to the best fit from the internal standard collection⁸²,⁸⁴. Finally,
real-time monitoring of the system's performance allows instant corrections to be made,
which would be highly advantageous for achieving stable, high-quality data over time⁸¹.
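The idea of correcting each feature with the best-fitting internal standard can be illustrated with a deliberately simplified sketch. It is not the method of Sysi-Aho et al.: it merely picks, for each feature, the internal standard whose intensity profile over the run correlates best with that feature, and divides the feature by the relative response of that standard. All names and intensities are hypothetical.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correct_with_best_standard(feature, standards):
    """Divide a feature by the relative response of its best-correlated internal standard."""
    best = max(standards, key=lambda name: pearson(feature, standards[name]))
    reference = standards[best]
    mean_ref = statistics.mean(reference)
    corrected = [f / (r / mean_ref) for f, r in zip(feature, reference)]
    return best, corrected

# Hypothetical intensities in injection order
standards = {"IS_A": [1.0, 0.9, 0.8, 0.7], "IS_B": [1.0, 1.1, 0.9, 1.0]}
feature = [200.0, 180.0, 160.0, 140.0]   # drifts in parallel with IS_A
best, corrected = correct_with_best_standard(feature, standards)
print(best, [round(v, 1) for v in corrected])
```

In this toy case the feature follows the first standard exactly, so the corrected intensities become constant over the run. Real data would only be partially corrected, and the choice of standards per feature is the part that the published method optimizes.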
5 CONCLUSION
In this study, 13 different metabolites were revealed as potential metabolic
markers in sewage workers by comparing their serum samples with those of a control group
using an untargeted HPLC-HRMS metabolomics approach. The study thus also demonstrated
the potential of this technique for research in occupational health.
The major problem encountered was instrumental drift. Even though efforts were made
to counteract this drift, such as including a pre-run for instrument equilibration and
normalizing the raw data, it could not be entirely avoided or corrected for. This is
likely the principal reason for the inconsistency of the results observed when the
analyses were repeated. Future research should thus focus on minimizing the problems
related to instrumental drift. Since the potential metabolic markers have only been
identified tentatively, their structures still need to be confirmed by comparison with
authentic standards.
6 BIBLIOGRAPHY
1. Fiehn O. Metabolomics - The link between genotypes and phenotypes. Plant Mol Biol. 2002;48(1-2):155-171.
2. Griffiths WJ, Karu K, Hornshaw M, Woffendin G, Wang Y. Metabolomics and metabolite profiling: past heroes and future developments. Eur J Mass Spectrom (Chichester, Eng). 2007;13(1):45-50.
3. Vulimiri S V, Pachkowski B, Bale AS, Sonawane B. Metabolomics Approach for Hazard Identification in Human Health Assessment of Environmental Chemicals. Metabolomics. 2012:349-364.
4. Dunn WB, Ellis DI. Metabolomics: Current analytical platforms and methodologies. TrAC - Trends Anal Chem. 2005;24(4):285-294.
5. Jewett JN and MC. The role of metabolomics in systems biology. Stress Protein Kinases. 2008;20(August 2007):51-79.
6. Dettmer K, Aronov P a, Hammock BD. Mass spectrometry-based metabolomics. Mass Spectrom Rev. 2007;26(1):51-78.
7. Psychogios N, Hau DD, Peng J, et al. The human serum metabolome. PLoS One. 2011;6(2).
8. Courant F, Antignac J-P, Dervilly-Pinel G, Le Bizec B. Basics of mass spectrometry based metabolomics. Proteomics. 2014;14(21-22):2369-2388.
9. Oliver SG, Winson MK, Kell DB, Baganz F. Systematic functional analysis of the yeast genome. Trends Biotechnol. 1998;16(9):373-378.
10. Fang ZZ, Gonzalez FJ. LC-MS-based metabolomics: An update. Arch Toxicol. 2014;88(8):1491-1502.
11. Want EJ, O'Maille G, Smith C a., et al. Solvent-dependent metabolite distribution, clustering, and protein extraction for serum profiling with mass spectrometry. Anal Chem. 2006;78(3):743-752.
12. Zhang A, Sun H, Wang P, et al. Recent and potential developments of biofluid analyses in metabolomics. J Proteomics. 2012;75:1079-1088.
13. Ryan D, Robards K. Metabolomics: The greatest omics of them all? Anal Chem. 2006;78(23):7954-7958.
14. Bino RJ, Hall RD, Fiehn O, et al. Potential of metabolomics as a functional genomics tool. Trends Plant Sci. 2004;9(9):418-425.
15. Rochfort S. Biology and Implications for Natural Products Research. 2005:1813-1820.
16. Ideker T, Galitski T, Hood L. A new approach to decoding life: systems biology. Annu Rev Genomics Hum Genet. 2001;2:343-372.
17. Dettmer K, Hammock BD. Metabolomics - A new exciting field within the “omics” sciences. Environ Health Perspect. 2004;112(7):396-397.
18. Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10(1):57-63.
19. Hegde PS, White IR, Debouck C. Interplay of transcriptomics and proteomics. Curr Opin Biotechnol. 2003;14(6):647-651.
20. Bensimon A, Heck AJR, Aebersold R. Mass Spectrometry–Based Proteomics and Network Biology. Annu Rev Biochem. 2012;81(1):379-405.
21. Choudhary C, Mann M. Decoding signalling networks by mass spectrometry-based proteomics. Nat Rev Mol Cell Biol. 2010;11(6):427-439.
22. Altelaar a FM, Munoz J, Heck AJR. Next-generation proteomics: towards an integrative view of proteome dynamics. Nat Rev Genet. 2013;14(1):35-48.
23. Dunn WB, Broadhurst D, Begley P, Zelena E, Francis-McIntyre S, et al. Procedures for large-scale metabolic profiling of serum and plasma using gas chromatography and liquid chromatography coupled to mass spectrometry. Nat Protoc. 2011;6(7):1060-1083.
24. Villas-Bôas SG, Mas S, Åkesson M, Smedsgaard J, Nielsen J. Mass spectrometry in metabolome analysis. Mass Spectrom Rev. 2005;24(5):613-646.
25. Kell DB. Systems biology, metabolic modelling and metabolomics in drug discovery and development. Drug Discov Today. 2006;11(23-24):1085-1092.
26. Mickiewicz B, Villemaire ML, Sandercock LE, Jirik FR, Vogel HJ. Metabolic changes associated with selenium deficiency in mice. Biometals. 2014:1137-1147.
27. Kume S, Yamato M, Tamura Y, et al. Potential biomarkers of fatigue identified by plasma metabolome analysis in rats. PLoS One. 2015;10(3):e0120106.
28. Zhang A, Sun H, Wang X. Serum metabolomics as a novel diagnostic approach for disease: a systematic review. 2012; 404: 1239-1245
29. Zhang T, Watson DG, Wang L, et al. Application of Holistic Liquid Chromatography-High Resolution Mass Spectrometry Based Urinary Metabolomics for Prostate Cancer Detection and Biomarker Discovery. PLoS One. 2013;8(6):1-10.
30. Halket JM, Waterman D, Przyborowska AM, Patel RKP, Fraser PD, Bramley PM. Chemical derivatization and mass spectral libraries in metabolic profiling by GC/MS and LC/MS/MS. J Exp Bot. 2005;56(410):219-243.
31. Lu W, Bennett BD, Rabinowitz JD. Analytical strategies for LC-MS-based targeted metabolomics. J Chromatogr B Anal Technol Biomed Life Sci. 2008;871(2):236-242.
32. Ellis DI, Dunn WB, Griffin JL, Allwood JW, Goodacre R. Metabolic fingerprinting as a diagnostic tool. Pharmacogenomics. 2007;8(9):1243-1266.
33. García-Pérez I, Vallejo M, García a., Legido-Quigley C, Barbas C. Metabolic fingerprinting with capillary electrophoresis. J Chromatogr A. 2008;1204(2):130-139.
34. Allen J, Davey HM, Broadhurst D, Rowland JJ, Oliver SG, Kell DB. Discrimination of modes of action of antifungal substances by use of metabolic footprinting. Appl Environ Microbiol. 2004;70(10):6157-6165.
35. Kell DB, Brown M, Davey HM, Dunn WB, Spasic I, Oliver SG. Metabolic footprinting and systems biology: the medium is the message. Nat Rev Microbiol. 2005;3(7):557-565.
36. Zhang A, Sun H, Wang P, Han Y, Wang X. Modern analytical techniques in metabolomics analysis. Analyst. 2012;137(2):293.
37. John C. Lindon EH and JKN. So, what’s the deal with metabolomics? Analytical Chemistry 2003:384-391
38. Spaan S, Smit L a M, Eduard W, et al. Endotoxin exposure in sewage treatment workers: investigation of exposure variability and comparison of analytical techniques. 2008:251-261.
39. Heldal KK, Madsø L, Huser PO, Eduard W. Exposure, symptoms and airway inflammation among sewage workers. 2010:263-268.
40. Douwes J, Mannetje a., Heederik D. Work-related symptoms in sewage treatment workers. Ann Agric Environ Med. 2001;8(1):39-45.
41. Gattie DK, Lewis DL. A high-level disinfection standard for land-applied sewage sludges (biosolids). Environ Health Perspect. 2004;112(2):126-131.
42. Thorn J, Beijer L, Jonsson T, Rylander R. Measurement strategies for the determination of airborne bacterial endotoxin in sewage treatment plants. Ann Occup Hyg. 2002;46(6):549-554.
43. Rylander R. Health effects among workers in sewage treatment plants. 1999:354-357.
44. Svendsen K. Dutch Expert Committee on Occupational Standards: Hydrogen Sulphide. Health Based Recommended Occupational Exposure Limit in the Netherlands.; 2001.
45. Weng H, Dai Z, Ji Z, Gao C, Liu C. Release and control of hydrogen sulfide during sludge thermal drying. J Hazard Mater. 2015;296:61-67.
46. Reiffenstein RJ, Hulbert WC, Roth SH. Toxicology of hydrogen sulfide. Annu Rev Pharmacol Toxicol. 1992;32(5):109-134.
47. Richardson DB. Respiratory effects of chronic hydrogen sulfide exposure. Am J Ind Med. 1995;28(1):99-108.
48. Watt MM, Watt SJ, Seaton a. Episode of toxic gas exposure in sewer workers. Occup Environ Med. 1997;54(4):277-280.
49. Thorn È. Health Effects Among Employees in Sewage Treatment Plants : A Literature Survey. 2001;179(May):170-179.
50. Heng BH, Gohr KT, Doraisingham S, Quek GH. Prevalence of hepatitis A virus infection among sewage workers in Singapore. 1994:121-128.
51. Brugha R, Heptonstall J, Farrington P, Andren S, Perry K, Parry J. Risk of hepatitis A infection in sewage workers. Occup Environ Med. 1998;55(8):567-569.
52. Grady CPL, Jr., Daigger GT, Love NG, Filipe CDM. Biological Wastewater Treatment, Third Edition.; 2011.
53. Fenn JB, Mann M, Meng CKAI, Wong SF, Whitehouse CM. Electrospray Ionization for Mass Spectrometry of Large Biomolecules. 2007: 64-71
54. Bromirski M, Exactive PM. Exactive Plus.
55. Miller PE, Denton MB. The quadrupole mass filter: Basic operating concepts. J Chem Educ. 1986;63(7):617.
56. Bharti A, Ma PC, Salgia R. Biomarker discovery in lung cancer--promises and challenges of clinical proteomics. Mass Spectrom Rev. 2008;26(3):451-466.
57. De Vos R, Moco S, Lommen A, Keurentjes J, Bino R, Hall R. Untargeted large-scale plant metabolomics using liquid chromatography coupled to mass spectrometry. 2007;2(4):778-791
58. Pluskal T, Castillo S, Villar-Briones A, Oresic M. MZmine 2: modular framework for processing, visualizing, and analyzing mass spectrometry-based molecular profile data. BMC Bioinformatics. 2010;11:395.
59. Alexandrov T, Steinhorst K. Peak detection in mass spectrometry data using sparse coding. 2008.
60. Mzmine A. About MZmine 2. 2010:1-107.
61. Simca manual. http://www.sartorius.com/fileadmin/media/global/products/Manual_BioPAT_SIMCA_SBI6011-e.pdf. Accessed March 12, 2015.
62. Heldal KK, Barregard L, Larsson P, Ellingsen DG. Pneumoproteins in sewage workers exposed to sewage dust. Int Arch Occup Environ Health. 2013;86(1):65-70.
63. Theodoridis G, Gika HG, Wilson ID. LC-MS-based methodology for global metabolite profiling in metabonomics/metabolomics. TrAC - Trends Anal Chem. 2008;27(3):251-260.
64. Cubbon S, Bradbury T, Wilson J, Thomas-Oates J. Hydrophilic interaction chromatography for mass spectrometric metabonomic studies of urine. 2007;79(23):8911-8918.
65. Sample Solvent and Solvent Strength ZIC - p HILIC HPLC Column General Instructions for Care and Use. 2008;49(0):6427.
66. Nguyen HP, Schug K a. The advantages of ESI-MS detection in conjunction with HILIC mode separations: Fundamentals and applications. J Sep Sci. 2008;31(9):1465-1480.
67. Dunn W, Wilson I, Nicolls A, Broadhurst D. The importance of experimental design and QC samples in large-scale and MS-driven untargeted metabolomic studies of humans. 2012;4(18):2249-2264
68. Zelena E, Dunn WB, Broadhurst D, et al. Development of a robust and repeatable UPLC-MS method for the long-term metabolomic study of human serum. Anal Chem. 2009;81(4):1357-1364.
69. Begley P, Francis-McIntyre S, Dunn WB, et al. Development and performance of a gas chromatography-time-of-flight mass spectrometry analysis for large-scale nontargeted metabolomic studies of human serum. Anal Chem. 2009;81(16):7038-7046.
70. Bharti A, Ma PC, Salgia R. Biomarker discovery in lung cancer--promises and challenges of clinical proteomics. Mass Spectrom Rev. 2010;26(3):451-466.
71. Zhang R, Watson DG, Wang L, Westrop GD, Coombs GH, Zhang T. Evaluation of mobile phase characteristics on three zwitterionic columns in hydrophilic interaction liquid chromatography mode for liquid chromatography-high resolution mass spectrometry based untargeted metabolite profiling of Leishmania parasites. J Chromatogr A. 2014;1362:168-179.
72. Eriksson L, Johansson E, Kettaneh-Wold N, Trygg C, Wikström C, Wold S. Pca. Multi- Megavariate Data Anal Part 1, Basic Princ Appl. 2006:39-62.
73. Triba MN, Le Moyec L, Amathieu R, et al. PLS/OPLS models in metabolomics: the impact of permutation of dataset rows on the K-fold cross-validation quality parameters. Mol BioSyst. 2015;11(1):13-19.
74. Locci E, Scano P, Rosa MF, et al. A metabolomic approach to animal vitreous humor topographical composition: a pilot study. PLoS One. 2014;9(5):e97773.
75. Jung JY, Lee HS, Kang DG, et al. 1 H-NMR-based metabolomics study of cerebral infarction. Stroke. 2011;42(5):1282-1288.
76. Van den Berg R a, Hoefsloot HCJ, Westerhuis J a, Smilde AK, van der Werf MJ. Centering, scaling, and transformations: improving the biological information content of metabolomics data. BMC Genomics. 2006;7:142.
77. Wiklund S. Multivariate Data Analysis for Omics. 2008:228.
78. Trivedi DK, Iles RK. The Application of SIMCA P+ in Shotgun Metabolomics Analysis of ZIC®HILIC-MS Spectra of Human Urine - Experience with the Shimadzu IT-TOF and Profiling Solutions Data Extraction Software. J Chromatogr Sep Tech. 2012;03(06):145.
79. Analysis Q, Guide U. Xcalibur. 2012;(August).
80. Trauger S a., Webb W, Siuzdak G. Peptide and protein analysis with mass spectrometry. Spectroscopy. 2002;16(1):15-28.
81. Hällqvist J. Investigation of parameters causing drift in metabolomic analyzes. 2014.
82. Kamleh MA, Ebbels TMD, Spagou K, Masson P, Want EJ. Optimizing the use of quality control samples for signal drift correction in large-scale urine metabolic profiling studies. Anal Chem. 2012;84(6):2670-2677.
83. Bijlsma, S.; Bobeldijk, I.; Verheij, E. R.; Ramaker, R.; Kochhar, S.; Macdonald, I. A.; van Ommen, B.; Smilde AK. Large-Scale Human Metabolomics Studies: A strategy for Data (Pre-) Processing and Validation. 2006:567-574.
84. Sysi-Aho M, Katajamaa M, Yetukuri L, Oresic M. Normalization method for metabolomics data using optimal selection of multiple internal standards. BMC Bioinformatics. 2007;8:93.
85. http://www.sartorius.com/fileadmin/media/global/products/Manual_BioPAT_SIMCA_SBI6011-e.pdf (12-04-2015).
86. http://www.textronica.com/lcline/q_exactive_prodspec.pdf (20-04-2015).
87. http://www.thermoscientific.com/content/dam/tfs/ATG/CMD/cmd-support/tsq-quantum-access-max/manuals/HESI-II-Probe-User.pdf (20-04-2015).
88. http://www.waters.com/waters/en_US/How-Does-High-Performance-Liquid-Chromatography-Work%3F/nav.htm?cid=10049055 (20-04-2015).
7 APPENDIX
In this part, the chromatograms, the mass spectra, the simulated mass spectra calculated
from the potential elemental formulae and the fragmentation spectra can be found.
Figure 7.1: The extracted chromatogram and mass spectrum for metabolites affording negatively charged ions with m/z 167.0213. For this molecule, no elemental composition has been found.
[Panels for sample QC5 (RP-HPLC, FTMS negative ESI, full MS m/z 80-1200): base peak chromatogram for m/z 167.0205-167.0221, time (min) versus relative abundance, with a peak at 0.43 min; mass spectrum averaged over 0.41-0.45 min, m/z versus relative abundance.]
Figure 7.2: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 167.0213.
[Panel: ion trap MS2 spectrum (negative ESI, scan range m/z 50-180), m/z versus relative abundance, with labelled ions at m/z 123.2, 124.0 and 167.1.]
Figure 7.3: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 182.0812. The simulated isotope pattern for C9H12O3N is shown for comparison with the original isotope pattern.
[Panels for sample QC8 (RP-HPLC, FTMS positive ESI, full MS m/z 80-1200): base peak chromatogram for m/z 182.0803-182.0821 over 0-40 min, time (min) versus relative abundance, with the main peak at 1.07 min; mass spectrum averaged over 0.78-1.56 min together with the simulated isotope pattern for C9H12O3N, m/z versus relative abundance.]
Figure 7.4: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 182.0812.
[Panel: ion trap MS2 spectrum (positive ESI, scan range m/z 50-195), m/z versus relative abundance, with labelled ions at m/z 136.2 and 165.0.]
Figure 7.5: The extracted chromatogram and mass spectrum for metabolites affording negatively charged ions with m/z 311.1406. The simulated isotope pattern for C18H19N2O3 is shown for comparison with the original isotope pattern.
[Panels for sample QC8 (RP-HPLC, FTMS negative ESI, full MS m/z 80-1200): base peak chromatogram for m/z 311.1390-311.1422 over 5.23-8.39 min, time (min) versus relative abundance, with the main peak at 6.60 min; mass spectrum averaged over 6.50-6.71 min together with the simulated isotope pattern for C18H19N2O3, m/z versus relative abundance.]
Figure 7.6: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 311.1406.
[Panel: ion trap MS2 spectrum (negative ESI, scan range m/z 75-325), m/z versus relative abundance, with labelled ions at m/z 146.1, 147.1, 163.1, 164.1, 175.1, 221.1, 223.1, 249.3, 250.2, 267.2, 279.2, 293.3 and 295.2.]
Figure 7.7: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 313.1544. No MS2 spectrum was found for this compound. The simulated isotope patterns for different elemental compositions are shown.
[Panels for sample QC6 (RP-HPLC, FTMS positive ESI, full MS m/z 80-1200): extracted ion chromatogram for m/z 313.1528-313.1560 over 0.93-8.48 min, time (min) versus relative abundance, with peaks at 6.68 and 6.93 min; mass spectrum averaged over 6.60-6.81 min together with simulated isotope patterns for C18H21O3N2, C10H25O5N4S, C14(13C)H26ON2P2 and C22(13C)H20O, m/z versus relative abundance.]
Figure 7.8: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 496.3395. The simulated isotope pattern for C24H51O7NP is shown for comparison with the original isotope pattern.
[Panels for QC file 0327201527 (FTMS positive ESI, full MS m/z 75-1125): base peak chromatogram for m/z 495.8395-496.8395 over 1.79-7.28 min, time (min) versus relative abundance, with the main peak at 4.82 min; mass spectrum averaged over 4.53-5.27 min together with the simulated isotope pattern for C24H51O7NP, m/z versus relative abundance.]
Figure 7.9: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 496.3395.
[Panel: ion trap MS2 spectrum (positive ESI, scan range m/z 130-535), m/z versus relative abundance, with labelled ions at m/z 184.2 and 502.4.]
Figure 7.10: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 520.3396 from RP. The simulated isotope pattern for C26H51O7NP is shown for comparison with the original isotope pattern.
[Panels for sample QC6 (RP-HPLC, FTMS positive ESI, full MS m/z 80-1200): extracted ion chromatogram for m/z 520.3370-520.3422 over 0-40 min, time (min) versus relative abundance, with peaks at 18.85 and 19.18 min; mass spectrum averaged over 18.68-20.05 min together with the simulated isotope pattern for C26H51O7NP, m/z versus relative abundance.]
Figure 7.11: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 520.3396 from HILIC. The simulated isotope pattern for C26H51O7NP is shown for comparison with the original isotope pattern.
[Panels for QC file 0327201527 (FTMS positive ESI, full MS m/z 75-1125): base peak chromatogram for m/z 520.3370-520.3422 over 1.53-18.81 min, time (min) versus relative abundance, with the main peak at 4.74 min; mass spectrum averaged over 4.49-4.95 min together with the simulated isotope pattern for C26H51O7NP, m/z versus relative abundance.]
Figure 7.12: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 520.3396.
[Panel: ion trap MS2 spectrum (positive ESI, scan range m/z 130-535), m/z versus relative abundance, with labelled ions at m/z 184.1 and 502.4.]
Figure 7. 13: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 522.3552 from RP. The simulated isotope pattern for C26H53O7NP is shown for comparison with the original isotope pattern.
Figure 7. 14: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 522.3555 from HILIC. The simulated isotope pattern for C26H53O7NP is shown for comparison with the original isotope pattern.
[Figure 7.13 image, QC sample (data file QC6, RP-HPLC, 20/02/2015 00:02:43). Chromatogram panel: trace for m/z 522.3526-522.3578 (FTMS + p ESI, full MS, m/z 80.00-1200.00), relative abundance versus time (min), RT window 0-40.0168 min, labelled peaks at 17.8780, 20.4366, 20.8050, 22.5333 and 26.7454 min, NL 2.64E8. Mass spectrum panels: relative abundance versus m/z (522-527); measured spectrum averaged over RT 20.05-22.01 min with monoisotopic peak at m/z 522.3553 (NL 3.42E7), and simulated pattern for C26H53O7NP, charge 1, with monoisotopic peak at m/z 522.3554 (NL 7.36E5).]
[Figure 7.14 image, QC sample (data file 0327201527, 28/03/2015 08:30:11). Chromatogram panel: base peak trace for m/z 522.3529-522.3581 (FTMS + p ESI, full MS, m/z 75.00-1125.00), relative abundance versus time (min), RT window 0.00-7.28 min, labelled peaks at 4.69, 5.25, 6.71 and 7.21 min, NL 2.17E8. Mass spectrum panels: relative abundance versus m/z (522-525); measured spectrum averaged over RT 4.53-4.93 min (background subtracted over 5.11-6.92 min) with monoisotopic peak at m/z 522.3549 (NL 1.01E8), and simulated pattern, labelled C26H52O7NP in the figure output, charge 1, with first peak at m/z 522.3509 (NL 2.07E5).]
Figure 7.15: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 522.3552.
Figure 7.16: The extracted chromatogram and mass spectrum for metabolites affording positively charged ions with m/z 524.3710. The simulated isotope pattern for C26H55O7NP is shown for comparison with the measured isotope pattern.
[Figure 7.15 image (data file 20150416fgESIpos doubleplay). Product ion spectrum, ITMS + c ESI, data-dependent full MS2, m/z 130.00-535.00, averaged over RT 4.29-6.92 min, NL 9.92E4; relative abundance versus m/z. Labelled product ions at m/z 502.4 and 184.1.]
[Figure 7.16 image, QC sample (data file 0327201527, 28/03/2015 08:30:11). Chromatogram panel: base peak trace for m/z 524.3684-524.3736 (FTMS + p ESI, full MS, m/z 75.00-1125.00), relative abundance versus time (min), RT window 0.00-7.28 min, labelled peaks at 3.41, 4.11, 4.65, 5.05, 5.35 and 7.13 min, NL 2.15E8. Mass spectrum panels: relative abundance versus m/z (524-528); measured spectrum averaged over RT 4.42-4.93 min (background subtracted over 5.11-6.92 min) with monoisotopic peak at m/z 524.3706 (NL 7.69E7), and simulated pattern for C26H55O7NP, charge 1, with monoisotopic peak at m/z 524.3711 (NL 7.36E5).]
Figure 7.17: Product ion spectrum from HILIC-ion trap mass spectrometry for the metabolite with m/z 524.3710.
Figure 7.18: The extracted chromatogram and mass spectrum for metabolites affording negatively charged ions with m/z 588.3314. No MS² pattern was found for this compound.
[Figure 7.17 image (data file 20150416fgESIpos doubleplay). Product ion spectrum, ITMS + c ESI, data-dependent full MS2, m/z 130.00-535.00, averaged over RT 4.21-6.88 min, NL 1.96E4; relative abundance versus m/z. Labelled product ions at m/z 506.4 and 184.1.]
[Figure 7.18 image, QC sample (data file QC6, RP-HPLC, 20/02/2015 00:02:43). Chromatogram panel: base peak trace for m/z 588.3285-588.3343 (FTMS - p ESI, full MS, m/z 80.00-1200.00), relative abundance versus time (min), RT window 17.74-21.08 min, labelled peaks at 17.85, 18.49, 18.59, 18.94, 19.23, 19.77, 20.29 and 20.45 min, NL 4.20E7. Mass spectrum panels: relative abundance versus m/z (588-591); measured spectrum averaged over RT 19.19-19.36 min with monoisotopic peak at m/z 588.3305 (NL 2.71E7), and two simulated patterns, charge 1: C27H42O6N9 with monoisotopic peak at m/z 588.3253 (NL 7.10E5) and C24H48O4N9P2 with monoisotopic peak at m/z 588.3299 (NL 7.36E5).]
Figure 7.19: The extracted chromatogram and mass spectrum for metabolites affording negatively charged ions with m/z 617.7375. No MS² pattern was found for this compound.
Figure 7.20: The extracted chromatogram and mass spectrum for metabolites affording negatively charged ions with m/z 653.2669. No MS² pattern was found for this compound.
[Figure 7.19 image, QC sample (data file QC6, RP-HPLC, 20/02/2015 00:02:43). Chromatogram panel: trace for m/z 617.7344-617.7406 (FTMS - p ESI, full MS, m/z 80.00-1200.00), relative abundance versus time (min), RT window 5.1345-24.4502 min, labelled peaks at 5.5503, 9.3071, 9.8162, 11.7943, 15.3339, 18.4136 and 23.0280 min, NL 4.23E7. Mass spectrum panel: relative abundance versus m/z (618-628), averaged over RT 8.99-9.42 min, NL 5.78E6, labelled peaks at m/z 617.7368, 618.2380, 618.7391 and 619.2401.]
[Figure 7.20 image, QC sample (data file QC6, RP-HPLC, 20/02/2015 00:02:43). Chromatogram panel: trace for m/z 653.2636-653.2702 (FTMS - p ESI, full MS, m/z 80.00-1200.00), relative abundance versus time (min), RT window 2.2005-30.3997 min, labelled peaks at 6.5681, 7.9397, 8.2335, 12.7706, 19.3821 and 23.6678 min, NL 2.30E7. Mass spectrum panels: relative abundance versus m/z (653-659); measured spectrum averaged over RT 7.84-8.06 min with labelled peaks at m/z 653.2660, 653.7674 and 654.2685 (NL 5.14E6), and two simulated patterns, charge 1, labelled C23 13C H49O14N2PS and C34 13C H44O6N2S2 (NL 6.08E5 each), both printed with a first peak at m/z 653.2669.]
Figure 7.21: The extracted chromatogram and mass spectrum for metabolites affording negatively charged ions with m/z 1083.663. No MS² pattern was found for this compound.
[Figure 7.21 image, QC sample (data file QC6, RP-HPLC, 20/02/2015 00:02:43). Chromatogram panel: trace for m/z 1083.6580-1083.6688 (FTMS - p ESI, full MS, m/z 80.00-1200.00), relative abundance versus time (min), RT window 2.2005-30.3997 min, labelled peaks at 18.8601, 19.1701 and 19.6536 min, NL 1.44E7. Mass spectrum panel: relative abundance versus m/z (1082-1092), averaged over RT 19.02-19.60 min, NL 2.69E6, labelled peaks at m/z 1083.6624, 1084.6665, 1085.6688, 1086.6716 and 1087.6748.]