Leonardo Tenori
CERM/CIRMM
FiorGen Foundation
Metabolomics
Systems Biology and the rise of the “-omics”
Omics technologies such as genomics and high-throughput DNA sequencing were introduced in parallel with the Human Genome Project from the 1990s onward.
According to one etymological analysis, the suffix '-ome' is derived from the Sanskrit OM ("completeness and fullness") (Lederberg and McCray, 2001).
Omics technologies, and the various neologisms that define their application contexts, are however more than a simple play on words.
They substantially transformed both the throughput and the design of scientific experiments. The omics technologies allow the generation of copious amounts of data at multiple levels of biology, from gene sequence and expression to protein and metabolite patterns underlying variability in cellular networks and the function of whole organ systems (Nicholson and Lindon, 2008; Wilke et al., 2008).
"Systems biology...is about putting together rather than taking apart, integration rather than reduction. It requires that we develop ways of thinking about integration that are as rigorous as our reductionist programmes, but different....It means changing our philosophy, in the full sense of the term" (Denis Noble).
Genomics
Study of genes
A branch of biotechnology concerned with applying the techniques of genetics and molecular biology to the genetic mapping and DNA sequencing of sets of genes or the complete genomes of selected organisms, with organizing the results in databases, and with applications of the data (as in medicine or biology).
Metagenomics
The genomics of the microbial community inside our body.
Microbial communities play a key role in preserving human health, but their composition and the mechanisms by which they do so remain mysterious. Metagenomic sequencing is being used to characterize the microbial communities from 15-18 body sites from at least 250 individuals.
Changes in the human microbiome can be correlated with human health.
The human body carries about 100 trillion microorganisms in its intestines, a number ten times greater than the total number of human cells in the body. The metabolic activities performed by these bacteria resemble those of an organ, leading some to liken gut bacteria to a "forgotten" organ.
It is estimated that these gut flora have around a hundred times as many genes in aggregate as there are in the human genome.
Epigenomics
Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell (Russell 2010 p. 217 & 230).
Epigenetic modifications are reversible modifications on a cell's DNA or histones that affect gene expression without altering the DNA sequence (Russell 2010 p. 475). Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as differentiation/development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.
Transcriptomics
The study of the transcriptome.
The term can be applied to the total set of transcripts in a given organism, or to the specific subset of transcripts present in a particular cell type.
Unlike the genome, which is roughly fixed for a given cell line, the transcriptome can vary with external environmental conditions.
Because it includes all mRNA transcripts in the cell, the transcriptome reflects the genes that are being actively expressed at any given time.
The study of transcriptomics, also referred to as expression profiling, examines the expression level of mRNAs in a given cell population, often using high-throughput techniques based on DNA microarray technology.
The use of next-generation sequencing technology to study the transcriptome at the nucleotide level is known as RNA-Seq.
Proteomics
The study of the proteome.
A branch of biotechnology concerned with applying the techniques of molecular biology, biochemistry, and genetics to analyzing the structure, function, and interactions of all the proteins produced by the genes of a particular cell, tissue, or organism, with organizing the information in databases, and with applications of the data.
Characterizing the human plasma proteome has become a major goal in the proteomics arena. The plasma proteome is without doubt the most complex proteome in the human body. It contains immunoglobulins, cytokines, protein hormones, secreted proteins and proteins indicative of infection, on top of resident, hemostatic proteins. It also contains tissue leakage proteins, because the blood circulates through the different tissues of the body.
The blood thus contains information on the physiological state of all tissues; combined with its accessibility, this makes the blood proteome invaluable for medical purposes.
Communicomics
The analysis of the communicome.
The communicome is the whole set of communication factors in a biological specimen, and it is a more specific subset of the proteome.
This aspect of proteomics will consist of targeted measurements of circulating plasma proteins with endocrine activity, including cytokines, hormone-like proteins, growth factors and so forth, which have been termed the “communicome” (Ray 2007). These factors are involved in inter-cellular and inter-organ communication, and their changes with age will carry crucial information regarding biological aging and neurodegeneration.
Metallomics
The term metallome has been introduced by analogy with proteome, as the distribution of free metal ions in each cellular compartment.
Subsequently, the term metallomics has been coined for the study of the metallome. Szpunar (2005) defined metallomics as "comprehensive analysis of the entirety of metal and metalloid species within a cell or tissue type".
Lipidomics
Lipidomics is the analysis of the lipidome. Basically, a lipidome is the comprehensive and quantitative description of the set of lipid species present in an organism. Lipidomics involves systems-level identification and quantitation of thousands of pathways and networks of cellular lipid molecular species and their interactions with other lipids, proteins and other moieties in vivo.
Lipids are hydrophobic or amphipathic molecules which include fats, waxes, sterols, fat-soluble vitamins (such as vitamins A, D, E and K), monoglycerides, diglycerides and phospholipids.
The crucial role of lipids in cell, tissue and organ physiology is evident from their unique membrane-organizing properties, which provide cells with functionally distinct subcellular membrane compartments.
The main biological functions of lipids include:
• Energy storage and structural components of cellular membranes.
• Cell signaling (e.g. phospholipase C and phospholipase A2 in modulating immunological responses).
• Endocrine actions (e.g. steroid hormones).
• Essential role in signal transduction, membrane trafficking and morphogenesis.
Glycomics
Glycomics is the comprehensive study of glycomes (the entire complement of sugars, whether free or present in more complex molecules, of an organism), including genetic, physiologic, pathologic, and other aspects.
The term glycomics is derived from the chemical prefix for sweetness or a sugar, "glyco-", and was formed to follow the naming convention established by genomics (which deals with genes) and proteomics (which deals with proteins).
Glycoproteins
Glycolipids
Foodomics
Foodomics is a discipline that studies the food and nutrition domains through the application of advanced omics technologies to improve consumers' well-being, health, and confidence.
Foodomics is a global discipline that includes all the working areas in which food (including nutrition) and advanced omics tools are put together.
The interest in Foodomics also coincides with a clear shift in medicine and biosciences toward prevention of future diseases through adequate food intakes, and the development of the so-called functional foods.
And metabolomics ?
Now it is easy to answer….
“Metabolomics is a further “omic” science whose purpose is to carry out a comprehensive analysis of the metabolome, which is the complete set of metabolites in a biological fluid, cell, tissue, organ or organism”.
Metabolomics
• Metabolomics can also provide tissue-specific information
• Biological fluids such as blood and urine can provide information at the whole-body level
Genomics tells you what could happen. Metabolomics tells you what has happened.
Genomics: the complete blueprint of an individual. What more do we need?
There are 6 million parts in a 747 plane. If someone showed you the blueprints of all of them one after the other, would you be able to tell what the plane looks like?
Proteomics: only 30-40,000 proteins. However, millions of potential interactions make an “individual”. And the analysis is still very difficult…
Metabolomics: only a few thousand metabolites. However, the influence of the external environment is not negligible.
Genomics is “only” the start!
“Genomics and proteomics tell you what might happen, but metabolomics tells you what actually is happening”
Bill Lasley - University of California, Davis
“If you have a disease, it’s likely that your metabolism is going to be affected. The same is true if you get hit with a toxicant. To be honest, the diagnostic potential is staggering”
Mark Viant - University of Birmingham
Benefits of analyzing the metabolome
• The number of metabolites is lower than the number of genes and proteins in a cell, so sample complexity is reduced
• Although the concentration of enzymes and the metabolic flux may not change significantly during a biochemical reaction, the concentration of metabolites can change significantly
• Metabolites reflect the functional level of a cell more accurately
• Metabolic fluxes are regulated not only by gene expression but also by environmental stresses, hence it is worth measuring the downstream products (i.e. metabolites)
• Metabolomic experiments are estimated to be 2 to 3 times less expensive than proteomic and transcriptomic experiments
Metabolomics is More Time Sensitive Than Other “Omics”
[Figure: response versus time curves for metabolomics, proteomics and genomics]
Effect of diet
[Figure: 1H NMR spectral region (3.60 to 4.00 ppm) comparing urine profiles]
• Fish consumption: trimethylamine N-oxide
• No fish consumption
• Consumption of chewing gum, candies, etc.: mannitol
Challenges when analyzing metabolomes
• Metabolomes extend over 7 to 9 orders of magnitude in concentration (picomolar to millimolar)
• It is currently not possible to analyze all metabolites in a single analysis
• Several analytical strategies are needed (MS in combination with different chromatographic separations, NMR)
• High throughput is required
Term: Definition
Metabolism: The whole ensemble of all chemical reactions that occur in living organisms, including digestion and the transport of substances into and between different cells. The set of reactions within the cells is called intermediary metabolism.
Metabolite: Substance produced during or taking part in metabolism.
Metabolome: The full complement of metabolites present in a cell, tissue, or organism in a particular physiological, pathological or developmental state.
Metabonomics: The quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification (Nicholson et al., 1999).
Metabolomics: The identification and quantification of the complete set of metabolites/low-molecular-weight intermediates, which are context dependent, varying according to the physiology, developmental or pathological state of the cell, tissue, organ or organism (Oliver 2002).
Metabolic profiling: Identification and quantification of all metabolites, which are generally related to a specific metabolic pathway.
Metabolic fingerprinting: Global, high-throughput, rapid analysis to provide sample classification. Also utilized as a screening tool to discriminate between samples of different biological status or origin (i.e., case vs. control, disease vs. healthy).
Since the late 1990s, metabolomic studies have undergone explosive growth, and this trend is still continuing, with more than a thousand papers published in 2010!
[Figure: entries in PubMed per year (1998 to 2012, scale 0 to 1400) for the search terms Metabonom* and Metabolom*]
What is a Metabolite?
• Any organic molecule detectable in the body with a MW < 1000 Da
• Includes peptides, oligonucleotides, sugars, nucleosides, organic acids, ketones, aldehydes, amines, amino acids, lipids, steroids, alkaloids and drugs (xenobiotics)
• Includes human & microbial products
• Concentration > 1 µM
Why 1 µM?
• Equals roughly 200 ng/mL
• Limit of detection by NMR
• Limit of easy isolation/separation by many analytical methods
• Excludes environmental pollutants
• Most disease indicators have concentrations > 1 µM
• Need to draw the line somewhere
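The "1 µM is roughly 200 ng/mL" figure can be checked with a short unit conversion. The sketch below assumes a typical metabolite of about 200 Da, which is what the slide's number implies; the function name is illustrative only.

```python
def micromolar_to_ng_per_ml(conc_um: float, mw_da: float) -> float:
    """Convert a concentration in µM to ng/mL for a molecule of mass mw_da (g/mol)."""
    # 1 µM = 1e-6 mol/L; multiplied by g/mol this is mw * 1e-6 g/L,
    # i.e. mw µg/L, which is numerically the same as mw ng/mL.
    return conc_um * mw_da


# Assumed molecular weight of ~200 Da for a "typical" small metabolite.
print(micromolar_to_ng_per_ml(1.0, 200.0))   # 1 µM of a 200 Da compound
# Glycine (75.07 Da) at the same molar threshold is a smaller mass concentration.
print(micromolar_to_ng_per_ml(1.0, 75.07))
```

The same molar cutoff therefore corresponds to different mass concentrations depending on the molecular weight, which is why the slide says "roughly".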
Examples of metabolites
[Figure: chemical structures of glycine, tryptophan, arginine, adenosine-5'-triphosphate, pyruvic acid, succinic acid, oxaloacetic acid and acetyl CoA]
Small Molecules Count…
• >95% of all diagnostic clinical assays test for small molecules
• 89% of all known drugs are small molecules
• 50% of all drugs are derived from pre-existing metabolites
• 30% of identified genetic disorders involve diseases of small molecule metabolism
• Small molecules serve as cofactors and signaling molecules to 1000’s of proteins
Metabolomics Applications
• Nutritional Analysis
• Drug Compliance
• Toxicology Testing
• Clinical Trial Testing
• Fermentation Monitoring
• Food & Beverage Tests
• Nutraceutical Analysis
• Drug Phenotyping
• Water Quality Testing
• Petrochemical Analysis
Medical Metabolomics
• Generate metabolic “signatures” for disease states or host responses
• Obtain a more “holistic” view of metabolism (and treatment)
• Diagnosis: more rapidly and accurately (and cheaply) assess/identify disease phenotypes
• Monitor gene/environment interactions
• Rapidly track effects from drugs/surgery
Metabolomics: some objectives
• Evaluate possible correlations between metabolic fingerprint and disease (this would provide new tools for deepening our knowledge of specific pathologies)
• Try to understand whether it is possible to diagnose a disease and assess its stage of progression (an earlier diagnosis of tumors than is currently possible, for example, would make it possible to save 30% of patients using the drugs currently available)
• Discover new biomarkers (those currently used for the diagnosis of some pathologies might not be the only ones and/or the most efficient)
• Study the metabolites connected to specific metabolic pathways (this would make it possible to define new targets for future drugs and to evaluate the impact of current ones, allowing an advanced personalization of therapy)
Metabolic profiling is not new. Profiling for clinical detection of human disease using blood and urine samples has been carried out for centuries.
This urine wheel was published in 1506 by Ullrich Pinder, in his book Epiphanie Medicorum. The wheel describes the possible colors, smells and tastes of urine, and uses them to diagnose disease.
Nicholson, J. K. & Lindon, J. C. Nature 455, 1054–1056 (2008).
History
Traditional clinical analysis: a few already known metabolites for some diseases (e.g. glucose for diabetes, etc.)
Metabolomics: all metabolites are analyzed together without prior knowledge
Data acquisition
• MS based techniques
– Gas Chromatography (GC-MS)
– High Performance Liquid Chromatography (HPLC-MS)
– Ultra Performance Liquid Chromatography (UPLC-MS)
– Capillary Electrophoresis (CE-MS)
• NMR
– Nuclear Magnetic Resonance Spectroscopy (NMR)
– High resolution magic angle spinning (HR-MAS)
NMR versus MS
NMR:
• Quantitative, very fast
• Requires no work-up or separation
• Allows analysis of 200+ compounds at once
• Relatively insensitive technique
• Lower limit of detection 1-5 µM
• Usually large sample size (500 µL)
MS:
• Quite fast
• Very sensitive
• Allows analysis or ID of 3000+ compounds (not at once, but using several different analytical strategies)
• Not quantitative
• Requires complex work-up for sample preparation
Metabolic Profiling Methods
Main Analytical Techniques
How can one decide which analytical platform should be used?
- Should be rapid, reproducible, with easy sample preparation.
- Selection based on objectives, target metabolites, availability, etc.
Scale from - to +++ for major disadvantages to major advantages
Phytochem Rev (2008) 7:525–537
Quantitative (targeted), the preferred MS way:
Sample Prep -> Metabolite Identification & Quantification -> Biological Interpretation
Fingerprinting (untargeted), the preferred NMR way:
Sample Prep -> Data Collection -> Data Reduction
[Figure: PCA scores plot (PC1 versus PC2) with the groups PAP, ANIT and Control]
COMPLEMENTARY TECHNIQUES
If NMR identifies some abundant metabolites belonging to specific metabolic cycles, MS can then be used to look for target metabolites that are not detected by NMR but are known to be related to that specific biological cycle.
NMR generates hypotheses; MS can confirm them and go into deeper detail.
NMR Metabolomics
Metabolic fingerprint
We mainly use NMR
1H NMR spectrum of ethanol: a series of points with an associated intensity
Our data: NMR spectra
NMR spectroscopy is usually used to detect hydrogen nuclei in metabolites. Thus, in a typical biological sample, all hydrogen-containing molecules in the sample will give a 1H NMR spectrum, as long as they are present in concentrations above the detection limit. The NMR spectrum is therefore the superposition of the spectra of all of the metabolites in the sample.
1H NMR profile of human urine
[Figure: 1H NMR spectrum of human urine (1 to 7 ppm) with assigned signals: hippurate, urea, allantoin, creatinine, 2-oxoglutarate, citrate, TMAO, succinate, fumarate, water, taurine; alongside a PCA scores plot (PC1 versus PC2)]
Two approaches:
• Quantitative methods: identify as many metabolites as possible
• Chemometric methods (fingerprinting and pattern recognition): use the whole spectrum as a fingerprint (statistics)
2 Routes to Metabolomics
Quantitative vs. Chemometric
Quantitative:
• Identifies compounds
• Quantifies compounds
• Concentration range of 1 µM to 1 M
• Handles wide range of samples/conditions
• Allows identification of diagnostic patterns
• Limited by DB size
Chemometric:
• No compound ID
• No compound concentration
• No compound concentration range
• Requires strict sample uniformity
• Allows identification of diagnostic patterns
• Limited by training set
Metabolomics steps
• Handling and preparation of samples
• NMR analysis
• Data processing and bucketing
• Statistical analysis
• Metabolites identification
Typical biofluids used in metabolomics
• Urine
• Serum/plasma
• Saliva
• Fecal extracts
• Exhaled breath condensate
• Cell/tissue extracts
And also tears, sweat, vaginal fluid, seminal fluid, synovial fluid, bile, cerebrospinal fluid, …
Handling and preparation of the samples
A prerequisite for the further improvement of the diagnosis and prognosis of diseases is the development of systems and procedures for all stages of the process, from specimen collection through to the analysis.
A critical point in the process is the treatment of the sample material before the analysis itself takes place: the so-called pre-analytical phase. If the sample is not stabilized as soon as possible after its collection from the human body, the subsequent analysis loses reliability and reproducibility, because the biological or chemical structure of the sample material will have changed between collection and analysis and will no longer reflect the real situation in the patient's body.
SOPs are fundamental
Every step of a metabolomic experiment must be carefully standardized
• Samples arrive at CERM frozen and are immediately stored at -80°C
• On the day of the analysis, samples (serum or urine) are thawed at room temperature, and then a buffer containing deuterated water is added
• Samples are then pipetted into 4.5 mm tubes
• The automatic sample changer allows us to work in (moderately) high throughput: ~10 min for a urine sample, ~30 min for a serum sample (more experiments)
Acquisition of the NMR spectra
• 600 MHz: standard field for metabolomics
• CPTI 1H-13C/31P-2H cryo-probe with automatic tuning and matching
Most common NMR techniques in metabolomics:
• 1D NOESY with water presaturation
• 1D CPMG (Carr-Purcell-Meiboom-Gill T2 relaxation editing)
• 1D diffusion-edited spectra
• 2D JRES (J-resolved experiments)
1H NMR analysis of a serum sample
[Figure: serum spectra (0.5 to 4.0 ppm) acquired with four experiments]
• 1D NOESY: low molecular weight and proteins profile
• CPMG: low molecular weight metabolite profile
• Diffusion edited: lipids and proteins profile
• J-resolved: identification purposes
[Figure: panels labeled serum, urine, saliva and fecal extract]
Data Preparation
• NMR spectra with 64k or 128k points, after phasing, baseline correction…
• Binned spectra, n ≈ 400 bins
Bucketing is a means to reduce the total number of variables and to compensate for small shifts in the spectra.
• The collected data are represented in a matrix: at this point, the data have been transformed into a matrix with the samples in rows and the variables (bins) in columns
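The bucketing step described above can be sketched as follows. This is a minimal illustration, not the processing pipeline actually used for these data: the spectral window (0 to 8 ppm) and the bucket width (0.02 ppm) are assumptions chosen so that the result has the roughly 400 bins mentioned on the slide, and the function names are hypothetical.

```python
def bucket_spectrum(ppm, intensity, lo=0.0, hi=8.0, width=0.02):
    """Sum the intensities of a spectrum into equal-width chemical-shift buckets."""
    n_bins = round((hi - lo) / width)
    buckets = [0.0] * n_bins
    for x, y in zip(ppm, intensity):
        if lo <= x < hi:
            # Index of the bucket that contains chemical shift x.
            idx = min(int((x - lo) / width), n_bins - 1)
            buckets[idx] += y
    return buckets


def build_data_matrix(spectra):
    """One row per sample, one column per bucket."""
    return [bucket_spectrum(ppm, intensity) for ppm, intensity in spectra]


# Two toy "spectra": the same peak shifted by 0.004 ppm falls in the same bucket.
sample_1 = ([1.005, 1.015, 3.031], [2.0, 3.0, 5.0])
sample_2 = ([1.009, 1.019, 3.035], [2.0, 3.0, 5.0])
matrix = build_data_matrix([sample_1, sample_2])
print(len(matrix), len(matrix[0]))  # samples x buckets
```

Because both toy samples end up with identical rows, the example also shows how bucketing absorbs small peak shifts between spectra.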
Univariate statistics
Variables are analyzed independently from each other; no interaction is taken into account.
– statistical tests (t-test, Wilcoxon test, Kruskal-Wallis test, etc.)
– correlations
– ROC curves
Multivariate statistics
Variables are analyzed together, taking into account the relationships between them.
Unsupervised methods: no information about group membership is used
– Projection-based methods (principal component analysis, independent component analysis)
– Cluster analysis (k-means, hierarchical clustering, spectral clustering, …)
Supervised methods
– Projection-based methods (LDA, PLS, …)
– Machine learning (SVM, random forest, neural networks)
Univariate Statistics
• Univariate means a single variable
• If you measure a population using some single measure such as height, weight, test score, IQ, you are measuring a single variable
Multivariate Statistics
• Multivariate means multiple variables
• If you measure a population using multiple measures at the same time, such as height, weight, hair colour, clothing colour, eye colour, etc., you are performing multivariate statistics
• Multivariate statistics requires more complex, multidimensional analyses or dimensional reduction methods
Multivariate Statistics
Unsupervised methods: no information on the samples
– Projection methods: principal component analysis, independent component analysis, ISOMAP, locally linear embedding, diffusion maps, locality preserving projection, Laplacian eigenmap
– Clustering methods: K-means, K-medoids, hierarchical clustering
Supervised methods: information on samples (e.g. disease/healthy)
– Projection methods: PLS, OPLS, LDA
– Machine learning: K-NN, support vector machines, random forest, neural networks
PCA
• PCA: Principal Component Analysis
• Process that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components
• Reduces 1000’s of variables to 2-3 key features
[Figure: scores plot]
Data Presentation
• Example: 53 blood and urine measurements (wet chemistry) from 65 people (33 alcoholics, 32 non-alcoholics).
• Matrix format:

      H-WBC    H-RBC    H-Hgb     H-Hct     H-MCV      H-MCH     H-MCHC
A1    8.0000   4.8200   14.1000   41.0000   85.0000    29.0000   34.0000
A2    7.3000   5.0200   14.7000   43.0000   86.0000    29.0000   34.0000
A3    4.3000   4.4800   14.1000   41.0000   91.0000    32.0000   35.0000
A4    7.5000   4.4700   14.9000   45.0000   101.0000   33.0000   33.0000
A5    7.3000   5.5200   15.4000   46.0000   84.0000    28.0000   33.0000
A6    6.9000   4.8600   16.0000   47.0000   97.0000    33.0000   34.0000
A7    7.8000   4.6800   14.7000   43.0000   92.0000    31.0000   34.0000
A8    8.6000   4.8200   15.8000   42.0000   88.0000    33.0000   37.0000
A9    5.1000   4.7100   14.0000   43.0000   92.0000    30.0000   32.0000
[Figure: example plots of the same data. Value versus measurement; univariate: H-Bands versus person; bivariate: C-LDH versus C-Triglycerides; trivariate: M-EPI versus C-Triglycerides and C-LDH]
• Better presentation than ordinate axes?
• Do we need a 53 dimension space to view data?
• How to find the ‘best’ low dimension space that conveys maximum useful information?
• One answer: Find “Principal Components”
The Goal
We wish to explain/summarize the underlying structure of a large set of variables through a few linear combinations of these variables.
Applications
• Uses:
– Data Visualization
– Data Reduction
– Data Classification
– Trend Analysis
– Noise Reduction
• Examples:
– How many unique “sub-sets” are in the sample?
– How are they similar / different?
– What are the underlying factors that influence the samples?
– Which time / temporal trends are (anti)correlated?
– Which measurements are needed to differentiate?
– How to best present what is “interesting”?
– Which “sub-set” does this new sample rightfully belong to?
From k original variables $x_1, x_2, \dots, x_k$, produce k new variables $y_1, y_2, \dots, y_k$:
$y_1 = a_{11}x_1 + a_{12}x_2 + \dots + a_{1k}x_k$
$y_2 = a_{21}x_1 + a_{22}x_2 + \dots + a_{2k}x_k$
...
$y_k = a_{k1}x_1 + a_{k2}x_2 + \dots + a_{kk}x_k$
such that:
• the $y_i$ are uncorrelated (orthogonal)
• $y_1$ explains as much as possible of the original variance in the data set
• $y_2$ explains as much as possible of the remaining variance
• etc.
The $y_i$ are the Principal Components.
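The construction above can be illustrated with a small from-scratch sketch. It finds the coefficient vectors (the rows $a_{i1}, \dots, a_{ik}$) by power iteration on the covariance matrix followed by deflation; this numerical route is an assumption made to keep the example dependency-free, since production software usually relies on SVD routines. The data set and all function names are illustrative.

```python
import math


def pca(X, n_components=2, n_iter=500):
    """Return (scores, loadings, variances) for the leading principal components."""
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    Xc = [[row[j] - means[j] for j in range(k)] for row in X]  # mean-centered data
    # Sample covariance matrix of the centered variables.
    C = [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(k)] for i in range(k)]
    loadings, variances = [], []
    for _ in range(n_components):
        # Power iteration; the fixed start vector is a simplification and would
        # fail if it happened to be orthogonal to the leading direction.
        v = [1.0 / math.sqrt(k)] * k
        for _ in range(n_iter):
            w = [sum(C[i][j] * v[j] for j in range(k)) for i in range(k)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(C[i][j] * v[j] for j in range(k)) for i in range(k))
        loadings.append(v)
        variances.append(lam)
        # Deflation: remove the variance already explained by this component.
        C = [[C[i][j] - lam * v[i] * v[j] for j in range(k)] for i in range(k)]
    # Each score is a linear combination y = a1*x1 + ... + ak*xk of centered data.
    scores = [[sum(r[j] * a[j] for j in range(k)) for a in loadings] for r in Xc]
    return scores, loadings, variances


# Small two-variable example with strongly correlated columns.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]
scores, loadings, variances = pca(data)
print(loadings[0], variances)
```

The first component captures almost all of the variance of this toy data set, and the two coefficient vectors come out orthogonal, as required by the conditions listed above.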
PCA
[Figure: 2D Gaussian dataset with the 1st and 2nd PCA axes]
PCA Plot Nomenclature
• PCA generates 2 kinds of plots, the scores plot and the loadings plot
• Scores plot (on right) plots the data using the main principal components
$Z = XA$, where $Z$ contains the scores, $X$ the original data and $A$ the loadings
PCA Loadings Plot
• Loadings plot shows how much each of the variables (metabolites) contributed to the different principal components
• Variables at the extreme corners contribute most to the scores plot separation
PCA Details/Advice
• In some cases PCA will not succeed in identifying any clear clusters or obvious groupings no matter how many components are used. If this is the case, it is wise to accept the result and assume that the presumptive classes or groups cannot be distinguished with PCA
• As a general rule, if a PCA analysis fails to achieve even a modest separation of classes, then it is probably better to use other statistical techniques to try to separate them
PLS
• Supervised learning method.
• Same principles as PCA. But in PLS, a second piece of information is used, namely, the labeled set of class identities.
• Two data tables are considered, namely X (input data from samples) and Y (containing qualitative values, such as class membership or treatment of samples)
• The PLS algorithm maximizes the covariance between the X variables and the Y variables
How PLS works (Concept)
PLS is also a step-wise process. This is how it works conceptually:
• PLS finds a set of orthogonal components that:
– maximize the variance of both X and Y
– provide a predictive equation for Y in terms of the X’s
• This is done by:
– fitting a set of components to X (as in PCA)
– similarly fitting a set of components to Y
– reconciling the two sets of components so as to maximize the covariance of X and Y
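The covariance-maximizing idea can be made concrete for the simplest case: a single response variable (PLS1, for example a 0/1 class label) and a single component. In that case the unit-length weight vector that maximizes the covariance between the X scores and y is proportional to $X^{T}y$ computed on centered data. The sketch below covers only this first step; the data and function name are invented for illustration.

```python
import math


def pls1_first_component(X, y):
    """First PLS component for one response: weights w, scores t, regression coef q."""
    n, k = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(k)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(k)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector.
    t = [sum(row[j] * w[j] for j in range(k)) for row in Xc]
    # Inner regression of y on the scores gives the predictive equation.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q


# Hypothetical data: only the first variable is related to the class label.
X = [[1.0, 5.0], [2.0, 3.0], [1.5, 4.0], [5.0, 4.0], [6.0, 5.0], [5.5, 3.0]]
y = [0, 0, 0, 1, 1, 1]
w, t, q = pls1_first_component(X, y)
print(w, q)
```

Unlike PCA, the direction found here is driven by the class labels: the uninformative second variable receives zero weight even though it varies as much as a purely variance-based method might pick up.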
• The OPLS method is a recent modification of the PLS method to help overcome pitfalls
• Main idea: to separate the systematic variation in X into two parts, one linearly related to Y and one unrelated (orthogonal)
• Comprises two modeled variations, the Y-predictive ($T_p P_p^{T}$) and the Y-orthogonal ($T_o P_o^{T}$) components
• Only the Y-predictive variation is used for modeling of Y
• $X = T_p P_p^{T} + T_o P_o^{T} + E$
• $Y = T_p C_p^{T} + F$
• E and F are the residual matrices of X and Y
• OPLS-DA compared to PLS-DA
KNN Classification
k-nearest neighbours
[Figure: loan amount ($0 to $250,000) versus age (0 to 80) for Default and Non-Default cases. Source: www.ismartsoft.com]
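A k-nearest-neighbours classifier assigns a new sample to the class that is most common among its k closest training samples. The following is a minimal sketch with Euclidean distance and majority voting; the toy coordinates and labels are made up and are not the data shown in the figure.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Sort training samples by Euclidean distance to the query point.
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-class training set in two dimensions.
train_X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, (0.5, 0.5)))
print(knn_predict(train_X, train_y, (5.5, 5.5)))
```

When the variables have very different scales, as with age and loan amount, they are usually standardized first so that one variable does not dominate the distance.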
Our interest in metabolomics
• Metabolic signature of individuals: metabolic phenotype
• Metabolic signature of diseases
– Celiac disease
– Tumor metastasis (breast, colorectal)
– Cardiovascular diseases
– Pulmonary diseases
– …
• Metabolomics in agriculture and for nutritional studies
• Metabolomics for biobank samples
– Sensitive reporters of stability
– Assess sample preparation and preanalytical procedures
– …
METabolomic REFerence
Experimental scheme:
• 22 individuals, 11 males & 11 females
• 40 urine samples each, over a period of 2-3 months
• First urine of the morning, preprandial
• Collection suspended in case of illness; otherwise no restrictions
• Metadata recording: diet; drugs; lifestyle, general habits; smoker / non-smoker
• NMR analysis: 1D 1H spectra
Why?
• Training
• Urine samples are easy to collect
• Large number of samples
• Potential intrinsic value of the information
Getting a first feeling…
Visual inspection suggests that it should be interesting to look for individual fingerprints by statistical analysis.
We believe the human eye is very sensitive to differences in patterns.
[Figure: 1H NMR spectra (10.00 to 2.50 ppm) of Ind 1 and Ind 2]
METabolomic REFerence
Convex hulls of 22 donors in the three most significant PCA-CA dimensions
Assfalg, Bertini, Colangiuli, Luchinat, Schäfer, Schütz, Spraul, PNAS, 2008, 105, 1420-4
• PCA for data reduction
• CA to obtain well separated clusters
• KNN for classification
• 99% accuracy in Monte Carlo cross-validation
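Monte Carlo cross-validation estimates accuracy by repeatedly splitting the samples at random into a training set and a test set and averaging the test accuracy. The sketch below illustrates only that validation loop around a plain k-NN classifier; it omits the PCA and CA steps of the published pipeline, and the split fraction, number of repetitions and toy data are assumptions.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in neighbours[:k]).most_common(1)[0][0]


def monte_carlo_cv(X, y, n_splits=100, test_fraction=0.3, k=3, seed=0):
    """Mean test accuracy over repeated random training/test splits."""
    rng = random.Random(seed)
    n = len(X)
    n_test = max(1, round(test_fraction * n))
    accuracies = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)  # a new random partition for every repetition
        test_idx, train_idx = idx[:n_test], idx[n_test:]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
        accuracies.append(correct / n_test)
    return sum(accuracies) / len(accuracies)


# Hypothetical data: two well separated "donors" with ten samples each.
X = [(0.1 * i, 0.0) for i in range(10)] + [(10.0 + 0.1 * i, 0.0) for i in range(10)]
y = ["donor_1"] * 10 + ["donor_2"] * 10
accuracy = monte_carlo_cv(X, y)
print(accuracy)
```

Any data-driven preprocessing, such as PCA, should be fitted on the training part of each split only, otherwise the estimated accuracy is optimistic.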
[Figure: “natural” gender discrimination (male versus female)]
Dendrogram of the 22 donors on the 21-dimensional PCA-CA subspace
[Table: results for the methods MM-SIMCA, PCA/K-NN and PCA/CA/K-NN under leave-one-out and majority-rule evaluation; extracted values: 38%, 69%, 99%, 100%, 95%, 81%]
Concentrations of 12 selected metabolites for each donor. Absolute creatinine concentration (Crea) and relative metabolite concentrations (relative to creatinine) are scaled with respect to their median values averaged over the total set of samples.
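The normalization in this caption can be sketched in two steps: divide each metabolite concentration by the creatinine concentration of the same sample, then divide each resulting series by its median. The example below takes the median directly over all samples, which is one possible reading of the caption and should be treated as an assumption; the concentrations and function names are made up.

```python
from statistics import median


def relative_to_creatinine(metabolite, creatinine):
    """Per-sample ratio of a metabolite concentration to creatinine."""
    return [m / c for m, c in zip(metabolite, creatinine)]


def scale_by_median(values):
    """Express each value as a multiple of the median over all samples."""
    med = median(values)
    return [v / med for v in values]


# Hypothetical concentrations (arbitrary units) in three urine samples.
creatinine = [10.0, 20.0, 5.0]
hippurate = [5.0, 4.0, 5.0]

ratios = relative_to_creatinine(hippurate, creatinine)
scaled_hippurate = scale_by_median(ratios)
scaled_creatinine = scale_by_median(creatinine)  # creatinine itself stays absolute
print(scaled_hippurate, scaled_creatinine)
```

Dividing by creatinine compensates for urine dilution, and median scaling puts metabolites with very different abundances on a comparable scale.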
PCA/CA/K-NN on 12 metabolites: 73%
An individual metabolic fingerprint exists!
But it is hidden inside the daily noise
METabolomic REFerence 2
Why Healthy Individuals Again?
• Expanding the dataset
• Trying to learn more about the relevance of genetic vs lifestyle contributions
• Checking the constancy of metabolic phenotypes over time
Experimental Scheme (2 years later)
• 20 individuals, 9 males & 11 females
• 11 individuals (6 M + 5 F) already in the first screening
• 40 samples each, over a period of 2-3 months
• First urine of the morning, preprandial
• Collection suspended in case of illness
• Data recording: diet; drugs; lifestyle; smoker / non-smoker
• NMR analysis: 1D 1H spectra
METREF 1,2,3
[Figure: overlap of donors among the collections MetRef1 (2005, 22 donors), MetRef2 (2007, 20 donors) and MetRef3 (2008), with twins and a father and son among the donors]
Average recognitions using different test/training combinations
Distances MetRef 1,2,3
Heat map of the inter-distances between pools of spectra belonging to 46 pseudo-individuals
[Figure labels: genes; lifestyle etc.]
Bernini, P.; Bertini, I.; Luchinat, C.; Nepi, S.; Saccenti, E.; Schäfer, H.; Schütz, B.; Spraul, M.; Tenori, L. Individual human phenotypes in metabolic space and time, J. Prot. Res. 2009
• The metabotype consists of a variable part (environment) and an invariant part (genetics + environment)
• The invariant part persists for at least two to three years
• The discovery of the existence of individual metabotypes is the baseline for biomedical and nutritional studies
Metabolic profile evolution of an individual
[Figure: metabolic profile of one individual across MetRef1, MetRef2 and MetRef3, showing a metabolic “jump”; labeled metabolite: hippurate]
MetRef 4
At present, a new sample collection is ongoing, to study the evolution of the metabolic fingerprint over an even longer time scale (6-9 years).
• 11 individuals (6 males & 5 females) who participated in at least one previous collection
For the first three individuals for whom the collection of 20 samples has been completed, the accuracies are 19/20, 18/20 and 20/20.
Celiac Disease Metabolomics
What is Celiac Disease?
• Celiac Disease (CD), or celiac sprue, is a permanent intolerance to gluten
• Gluten is found in wheat, rye, barley and other cereals
• Gliadin and glutenin comprise about 80% of the protein contained in wheat seeds
• Gluten is present in bread, pasta, pizza, biscuits…
The ONLY therapy is a totally gluten-free diet
Aim: define the metabolome of celiac disease; obtain hints on its biochemistry
Celiac Disease Metabolomics
• Study subjects: 34
• Control subjects: 34
• Samples: serum and urine
NMR spectra acquired:
• 1D NOESY (standard 1D 1H spectra) for serum and urine samples
• CPMG: to remove signals due to macromolecules (on serum samples)
• Acquired on a Bruker 600 MHz spectrometer
Experimental scheme:
Statistical Analysis
• Projection to Latent Structures (PLS) to reduce the data dimension; the optimal number of components is obtained by minimizing the cross-validated (CV) error
• Canonical Analysis (CA) to obtain two well separated clusters
• Support Vector Machines (SVM) for classification
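The first step listed above, choosing the number of PLS components by minimizing the cross-validated error, can be sketched as follows. This is only an illustration under stated assumptions, not the code used in the study: it handles a two-class problem coded 0/1, uses a plain NIPALS PLS1 with a 0.5 threshold on the predicted value in place of the CA and SVM steps, and runs on synthetic data. All function names and the data generator are hypothetical.

```python
import math
import random

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model with the NIPALS algorithm (X: list of rows, y: 0/1 labels)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered data
    f = [v - y_mean for v in y]                                # centered labels
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # no covariance left between X and y
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict the (continuous) class score of one new sample."""
    x_mean, y_mean, components = model
    e = [a - m for a, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

def cv_error(X, y, n_components, k=5):
    """k-fold cross-validated misclassification rate."""
    n = len(X)
    errors = 0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = pls1_fit([X[i] for i in train], [y[i] for i in train], n_components)
        for i in range(fold, n, k):
            predicted = 1 if pls1_predict(model, X[i]) > 0.5 else 0
            errors += predicted != y[i]
    return errors / n

def make_data(n=40, p=6, shift=3.0, seed=0):
    """Synthetic two-class data: class 1 is shifted on the first two variables."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        row[0] += shift * label
        row[1] += shift * label
        X.append(row)
        y.append(label)
    return X, y

X, y = make_data()
cv_errors = {a: cv_error(X, y, a) for a in range(1, 5)}
best = min(cv_errors, key=cv_errors.get)  # number of components with lowest CV error
print(cv_errors, "-> chosen number of components:", best)
```

In a real analysis the PLS scores obtained with the chosen number of components would then be passed to the CA and SVM steps; the thresholded PLS prediction is used here only to keep the sketch short.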
Celiac Disease Metabolomics
Clustering of serum spectra of celiac and healthy subjects
Note: both groups of subjects are asymptomatic!
Bertini, I.; Calabrò, A.; De Carli, V.; Luchinat, C.; Nepi, S.; Porfirio, B.; Renzi, D.; Saccenti, E.; Tenori, L. The metabonomic signature of celiac disease, J. Proteome Res. 2009, 8(1), 170
Accuracy between 80% and 90%
Celiac Disease Metabolomics
Significantly different metabolites in serum (p<0.01)
Already known NAC = N-acetyl-
Significantly different metabolites in urine (p<0.01)
Celiac Disease Metabolomics
Celiac disease is often associated with fatigue: why?
Increased glucose, decreased pyruvate and lactate: impaired glycolysis, impaired energy production
Lipid beta-oxidation + use of ketone bodies: alternative, less efficient energy production
Celiac Disease Metabolomics
Clustering of celiac and healthy subject serum spectra
Bertini, I.; Calabrò, A.; De Carli, V.; Luchinat, C.; Nepi, S.; Porfirio, B.; Renzi, D.; Saccenti, E.; Tenori, L. The metabonomic signature of celiac disease, J. Proteome Res. 2009, 8(1), 170
Celiac Disease Metabolomics
Clustering of celiac and healthy subject serum spectra, and follow-up
Bertini, I.; Calabrò, A.; De Carli, V.; Luchinat, C.; Nepi, S.; Porfirio, B.; Renzi, D.; Saccenti, E.; Tenori, L. The metabonomic signature of celiac disease, J. Proteome Res. 2009, 8(1), 170
Celiac – Healthy Subjects – Cross: predicted Potential Celiac
Bernini P, Bertini I, Calabrò A, la Marca G, Lami G, Luchinat C, Renzi D, Tenori L. Are patients with potential celiac disease really potential? The answer of metabonomics. J. Proteome Res. 2010
There exists a metabolic fingerprint of celiac disease
These alterations are also present in potential celiac subjects, so they precede the intestinal damage
Potential CD largely shares the metabonomic signature of overt CD. Most metabolites found to be significantly different between control and CD subjects were also altered in potential CD. Our results suggest early institution of a gluten-free diet (GFD) in patients with potential CD
Celiac Disease Metabolomics
Subjects: 134
Celiacs: 59 - Potential Celiacs: 25 - Healthy: 50
                      Sensitivity   Specificity   Accuracy
CMD vs CMS            45.52%        68.29%        61.19%
NYHA 1 vs NYHA 2      61.88%        71.42%        67.71%
NYHA 2 vs NYHA 3/4    73.62%        56.44%        68.04%
NYHA 1 vs NYHA 3/4    74.83%        68.55%        72.15%
Classification between different subgroups of heart failure patients (1D CPMG spectra).
Patients are separated from healthy subjects, but there is no significant difference between the disease grades that could reflect the clinical severity of the disease.
Although good discrimination between healthy subjects and HF patients with severe disease, if not expected, could easily have been hypothesized, a comparably good discrimination between healthy subjects and HF patients with mild disease was unexpected and appears rather counter-intuitive.
Heart failure metabolomics
Patients vs Healthy: sensitivity 85.11%, specificity 91.04%, accuracy 87.29%
Int. J. Cardiol. 2013 Oct 9;168(4):e113-5
HF patients with stable disease and under state-of-the-art therapy (131 males, 54 females, mean age 62.93 ± 12.9 years) and an age- and gender-matched cohort of 111 healthy volunteers (86 males, 25 females, mean age 61.00 ± 3.28 years)
Heart failure metabolomics
Our data support the hypothesis that the HF fingerprint may be an on/off phenomenon in the HF scenario, and that it correlates with the presence of HF irrespective of disease severity. The presence of an HF fingerprint in an asymptomatic subject could therefore be more significant than the potential risk established from the presence of certain genomic features.
Int. J. Cardiol. 2013 Oct 9;168(4):e113-5
[Figure: bar chart, axis from -80 to 60, with one bar per metabolite: TMAO, Creatine, Lipid, Isoleucine, Formate, Lipoprotein, Hypoxanthine, Proline, Phenylalanine, Pyruvate, Urea, Dimethylamine, Serine, Tyrosine, Acetate, Histidine, Methanol, Valine, Choline, Arginine, Creatinine, Dimethylsulfone, Gln+Glu, Alanine, L-dopa, Dimethylglycine, Citrate, Lactate, Lysine, Uridine, Methionine]
Heart failure metabolomics
The model for prediction of heart failure was developed for each of the 3 kinds of available spectra: CPMG, NOESY and diffusion-edited.
753 new healthy samples (blood donors) were tested against each model.
20 subjects were predicted as patients in all three kinds of spectra.
We were able to recall 11 of them for an echocardiographic screening.
6 out of 11 showed altered parameters
Lower bound
Upper bound
Thickness of the left ventricular wall
Breast cancer metabolomics
In breast cancer the tumor-host interaction is small, and the systemic effects of the tumor are expected to be modest
Aims:
1) To define a systemic metabolic fingerprint of breast cancer
2) To find a predictive signature of metastatic disease
3) To predict the survival in metastatic patients
4) To find a prognostic score for the risk of relapse in post-operative patients
Breast cancer metabolomics
Healthy vs Met: accuracy 73.44%
Healthy vs Post-op: accuracy 75.80%
Post-op vs Met: accuracy 74.96%
NOESY
Healthy vs Met: accuracy 72.67%
Healthy vs Post-op: accuracy 70.00%
Post-op vs Met: accuracy 70.00%
CPMG
Classification between Pre-Op and Metastatic subjects.
Accuracy ~80%
Other comparisons
Ann Oncol (2011) 22 (6): 1295-1301
Breast Cancer Metabolomics
Phenylalanine Tyrosine Aspartate Methylamine
Lipids Lipoproteins
Choline
Bones: 29
Non Bones: 19
                      Bone Metastasis   Non Bone Metastasis
Bone Metastasis       23                6
Non Bone Metastasis   6                 13
Accuracy: 75.50%
Specificity: 68.42%
Sensitivity: 79.31%
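As a worked check of how such figures follow from a 2×2 table, the sketch below computes sensitivity, specificity and accuracy from the bone-metastasis counts. It assumes that the rows are the actual classes and the columns the predicted ones (the row sums, 29 and 19, match the group sizes); the function name is hypothetical. Under that assumption the sensitivity (23/29) and specificity (13/19) reproduce the reported 79.31% and 68.42%, while the accuracy from this single table is 36/48 = 75.0%; the reported 75.50% may come from a different averaging scheme, which the slide does not specify.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from the four cells of a 2x2 table."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # true positives / all actual positives
        "specificity": tn / (tn + fp),   # true negatives / all actual negatives
        "accuracy": (tp + tn) / total,   # correct calls / all samples
    }

# Bone metastasis table; rows are assumed to be the actual classes.
bone = binary_metrics(tp=23, fn=6, fp=6, tn=13)
for name, value in bone.items():
    print(f"{name}: {100 * value:.2f}%")
```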
Different Kinds of Metastases
Soft Tissue: 32
Non Soft Tissue: 16
                             Soft tissues   Non Soft tissues metastasis
Soft tissues                 28             4
Non Soft Tiss. Metastasis    9              7
Accuracy: 75.5%
Specificity: 63.6%
Sensitivity: 75.7%
Visceral: 32
Non Visceral: 16
                             Visceral Metastasis   Non Visceral Metastasis
Visceral Metastasis          29                    3
Non Visceral Metastasis      5                     11
Accuracy: 83.4%
Specificity: 85.3%
Sensitivity: 78.6%
The Glaxo trial
• We received samples from clinical trial EGF30001 (GSK):
• Paclitaxel + lapatinib vs. paclitaxel + placebo
• women ≥ 18 years old
• histologically confirmed stage III or IV breast cancer
• negative or unknown Her-2 status
• good performance status (ECOG 0-1)
• ≥ 6/12 separating completion of post-operative chemotherapy and disease relapse
• 579 patients
• ~ 500 baseline samples
• ~ 400 with at least a second sample
• In the whole set of samples, no correlations with outcome, therapy or toxicity were found
• In on-treatment metastatic women with HER- status, significant results for survival and time to progression were found.
Breast Cancer Metabolomics
Paclitaxel plus lapatinib-treated patients with HER2-positive disease (N=22)
On-treatment week 9 serum samples. TTP prediction
                    Outcome as predicted by metabolomics:
Actual outcome:     Short TTP   Long TTP
Short TTP           87.3%       12.7%
Long TTP            8.1%        91.9%
Predictive accuracy: 89.6%
Paclitaxel plus lapatinib-treated patients with HER2-positive disease (N=16)
On-treatment week 9 serum samples. OS prediction
                    Outcome as predicted by metabolomics:
Actual outcome:     Short OS    Long OS
Short OS            73.0%       27.0%
Long OS             15.6%       84.4%
Predictive accuracy: 78.0%
Molecular Oncology Volume 6, Issue 4, August 2012, 437–444
MSKCC Project
Project in collaboration with Monica Fornier, MD, from the Memorial Sloan-Kettering Cancer Center – New York
Samples were collected between 2003 and 2009
95 serum samples from metastatic breast cancer patients
80 serum samples from early breast cancer patients, with available risk assessment calculated with the Adjuvant On-line clinical tool (21 of them had documented disease relapse)
The 80 early samples are split into a training set and an independent test set for validation
AIM: Testing the potential use of metabolomics as an improved relapse risk assessment tool
MSKCC Project
Random Forest was used to derive a score for relapse prediction.
ROC curve for CPMG spectra is much better than Adjuvant On-line (AUC<0.80).
Mol Oncol. 2014 Aug 10. pii: S1574-7891(14)00167-7
A composite score was then devised combining the CPMG score and Adjuvant On-line.
MSKCC Project
AUC: 0.8983
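For readers unfamiliar with the AUC quoted here, the sketch below shows one standard way to compute it: the area under the ROC curve equals the probability that a randomly chosen positive case (here, a relapse) receives a higher score than a randomly chosen negative one, counting ties as one half. The scores and labels are invented for illustration and are not data from the study; the function name is hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

# Hypothetical relapse scores (1 = relapse, 0 = no relapse).
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2, 0.7, 0.1]
labels = [1, 1, 1, 0, 0, 0, 1, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```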
Colorectal Cancer Metabolomics
Early diagnosis of metastatic colorectal cancer is often elusive and there are no robust markers for the survival time of patients affected by this disease.
We test here whether metabolomic profiling of serum samples can provide a signature of the disease and give hints on the outcome of its progression.
We compared these results against already assessed prognostic indicators: ECOG-PS [1], KRAS mutation [2], CRP [3], CEA [4] and YKL-40 [5].
1. Oken MM, Creech RH, Tormey DC, et al.: Toxicity and response criteria of the Eastern Cooperative Oncology Group. Am J Clin Oncol 5:649-655, 1982
2. McGrath JP, Capon DJ, Smith DH, et al.: Structure and organization of the human Ki-ras proto-oncogene and a related processed pseudogene. Nature 304:501-506, 1983
3. Erlinger TP, Platz EA, Rifai N, et al.: C-reactive protein and the risk of incident colorectal cancer. JAMA 291:585-590, 2004
4. Duffy MJ: Carcinoembryonic antigen as a marker for colorectal cancer: Is it clinically useful? Clin Chem 47:624-630, 2001
5. Johansen JS, Bojesen SE, Mylin AK, et al.: Elevated Plasma YKL-40 Predicts Increased Risk of Gastrointestinal Cancer and Decreased Survival After Any Cancer Diagnosis in the General Population. J Clin Oncol 27:572-578, 2009
Colorectal Cancer Metabolomics
155 metastatic colorectal cancer (CRC) patients participated in the study.
139 serum samples from healthy subjects (HS) served as a control group.
Patients
Variable                 No.           %
Age, years
  Median                 64
  Range                  36-87
Sex
  Male                   96            61.9
  Female                 59            38.1
Distant metastasis
  Liver                  102           66.8
  Lung                   51            32.9
  Nodes                  33            21.3
  Skin                   1             0.6
  Peritoneal             7             4.5
  Other location         44            28.4
ECOG-PS
  Score 0                77            49.7
  Score 1                54            34.8
  Score 2                24            15.5
KRAS mutation            53            36.1
CRP level†, mg/L
  Median                 14.5
  Range                  3.0-188.0
  High value             73            64.6
CEA level†, ku/L
  Median                 43.9
  Range                  1.1-9807.0
  High value             108           85.7
YKL-40 level†, µg/L
  Median                 126.0
  Range                  1.5-176.6
  High value             73            47.1
ECOG-PS = Eastern Cooperative Oncology Group performance status
KRAS = V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog
CRP = C-reactive protein
CEA = carcinoembryonic antigen
YKL-40 = glycosyl hydrolase YKL-40
Colorectal Cancer Metabolomics
[Figure: two score plots, with samples numbered 1 to 35]
Accuracy 100.0%; Accuracy 84.5%
Good survival: > 24 months
Poor survival: < 3 months
Cancer Res. 2012 Jan 1;72(1):356-64
Colorectal Cancer Metabolomics
Kaplan-Meier plots of overall survival data
Bioclinical markers
[Figure: Kaplan-Meier curves, survival probability (0.0 to 1.0) versus time (0 to 44 months), one panel per comparison]
• Good survival vs poor survival: P < .00001
• ECOG-PS (grade 0, grade 1, grade 2): P = .00048
• KRAS mutation (wild-type vs mutated): P = .04310
• CRP level (low vs high): P = .00007
• CEA level (low vs high): P = .68700
• YKL-40 level (low vs high): P = .00198
Cancer Res. 2012 Jan 1;72(1):356-64
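The Kaplan-Meier curves referred to above are step functions estimated from follow-up times and event indicators. A minimal sketch of the estimator is given below, assuming the usual product-limit form S(t) = Π (1 − d_i / n_i), where d_i is the number of deaths at time t_i and n_i the number of patients still at risk. The follow-up data in the example are invented, not taken from the study, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate; returns [(time, survival)] at each death time.

    events[i] is 1 if the patient died at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve

# Hypothetical follow-up in months.
times = [2, 3, 3, 5, 8, 12]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2} months  S(t) = {s:.3f}")
```

Comparing two such curves (for example low versus high CRP) is usually done with a log-rank test, which is not implemented in this sketch.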
Colorectal Cancer Metabolomics
Univariate Cox Regression Analysis for the Validation Set:
HR: 3.30
95% CI: 2.02 to 5.37
P: 1.75 × 10^-6
Kaplan-Meier plots of overall survival data
Metabolomics
Cancer Res. 2012 Jan 1;72(1):356-64
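A hazard ratio, its confidence interval and its P value are tied together when the interval is a Wald interval on the log-hazard scale. The slide does not say how the interval was computed, so this is an assumption; under it, the standard error of ln(HR) can be recovered from the interval width, and the resulting two-sided P value for HR 3.30 (95% CI 2.02 to 5.37) comes out on the order of 10^-6, the same order as the reported value. The function name is hypothetical.

```python
import math

def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover SE, z and two-sided P from a hazard ratio and its 95% Wald CI."""
    log_hr = math.log(hr)
    # The CI on the log scale is log_hr +/- z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

se, z, p = wald_from_hr(3.30, 2.02, 5.37)
print(f"SE(ln HR) = {se:.3f}, z = {z:.2f}, P = {p:.2e}")
```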
The future of medicine
Intra-individual metabolomes to determine evolution towards diseases
Before disease onset people may start therapies
Perspectives in Medicine
Metabolomics can monitor the same individual in a multidimensional space
Inflammatory bowel disease
Hypertension
Hepatocarcinoma
Steatosis
Cirrhosis
Diabetes
Metabolic syndrome
Colorectal cancer
Heart failure
Healthy aging
Today: Presence of symptoms; Search for specific markers; Disease onset
Future: periodic -omics check-ups will make it possible to predict disease propensities and possibly undertake preventive therapies
From reactive to predictive and preventive medicine
Spectral analysis and metabolites provide information on diseases, and on the tendency to diseases, of the “virtual patients”
The future of medicine
Allow me a little dreaming...
From general to personalized medicine
Metabolomics in food science
Metabolomics & Food science
• Food components analysis
• Food quality & authenticity detection
• Food consumption monitoring
• Monitoring in food intervention studies
Adapted from Wishart DS. Metabolomics: applications to food science and nutrition research. Trends in food science & technology 19 (2008) 482-493
On the consumers
Metabolite variation before and after consumption of particular products (PATHWAY-27 project) or comparison between groups of individuals on different diets (CHANCE project)
On the products
Different profiles of different products or of different processing
The CHANCE specific objectives are to:
• Identify the main nutritional criticalities and barriers to healthy eating
• Select ingredients and raw materials
• Develop CHANCE foods
• Produce CHANCE food prototypes
Partner 9, WP3
Evaluate the actual impact of the evidenced nutritional criticalities on the metabolic profile by using NMR metabolomics
Comparison of NMR profiles of individuals belonging to different dietary classes (survey from WP2) and from the reference healthy diet
Fingerprints/biomarkers associated with risk-of-poverty groups
CERM in the CHANCE project
Investigate the metabolic state of volunteers recruited within each selected population group.
RECRUITMENT CENTER
• ITALY: 581 samples
177 AFF; 369 ROP
• SERBIA: 602 samples
201 AFF; 401 ROP
• UK: 410 samples
• FINLAND: 471 samples
• LITHUANIA: 400 samples
AFF=affluent (control group)
ROP=Risk Of Poverty
• No strong differences between AFF and ROP using metabolomics profiles
• Strong differences between Serbia and Italy in urinary metabolomics profiles.
Serbia Italy
PLS/CA - Accuracy ca. 60%
PLS/CA - Accuracy 92%
PRELIMINARY RESULTS
NMR Spectra
PATHWAY-27 project
AIM: a better understanding of the role and mechanisms of selected bioactives and bioactive-enriched foods, performing in vitro and in vivo studies by means of advanced omics techniques (including metabolomics)
Mugello milk analysis
The value of dairy products is associated with the origin, species and composition of milk. Breed, metabolism, season, health, nutrition and milking habits produce variability in the composition of milk. 1H-NMR spectroscopy can be used to profile milk samples to extract information about traceability and composition.
Milk fingerprint
Mugello milk analysis
%    a     b      c      d      e     f     g     h       i     j
a    100   0.0    0.0    0.0    0     0     0     0.0     0     0.0
b    0     95.2   0.0    0.0    3     0     0     1.8     0     0.0
c    0     0.0    10.2   89.8   0     0     0     0.0     0     0.0
d    0     0.0    79.2   20.8   0     0     0     0.0     0     0.0
e    0     0.0    0.0    0.0    100   0     0     0.0     0     0.0
f    0     0.0    0.0    0.0    0     100   0     0.0     0     0.0
g    0     0.0    0.0    0.0    0     0     100   0.0     0     0.0
h    0     0.0    0.0    0.0    0     0     0     100.0   0     0.0
i    0     0.0    0.0    0.0    0     0     0     9.0     91    0.0
j    0     0.0    0.0    1.2    0     0     0     0.0     0     98.8
c (green), d (blue)
200 raw milk samples coming from Mugello’s stables (20 for each stable)
PLS/CA/kNN: accuracy 90.6%
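The last step of the PLS/CA/kNN scheme, the k-nearest-neighbours assignment, can be sketched as follows. The sketch assumes that each sample has already been reduced to a few score coordinates by the PLS and CA steps (not shown); the coordinates, stable labels and function name below are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Assign the query to the most common label among its k nearest neighbours."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2D score coordinates for samples from two stables.
train_points = [(0.0, 0.1), (0.3, -0.2), (-0.1, 0.4),
                (5.0, 5.2), (4.7, 4.9), (5.3, 5.1)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (0.5, 0.2)))
print(knn_predict(train_points, train_labels, (4.6, 5.1)))
```

In a cross-validated setting, each held-out sample would be classified this way and the results collected into a confusion matrix like the one shown for the ten stables.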
Mugello milk analysis
%     a     b    c+d   e      f     g     h     i     j
a     100   0    0     0.0    0     0.0   5.0   0.0   0
b     0     95   0     10.0   0     0.0   1.0   0.0   0
c+d   0     0    100   0.0    0     0.0   0.0   0.0   0
e     0     0    0     100    0     0.0   0.0   0.0   0
f     0     0    0     0.0    100   0.0   0.0   0.0   0
g     0     0    0     0.7    0     100   0.0   0.0   0
h     0     0    0     0.0    0     0.0   100   0.0   0
i     0     0    0     0.0    0     0.0   7.1   91    0
j     0     0    0     0.0    0     0.0   0.0   0.0   98.8
PLS/CA/kNN: accuracy 97.6%
c+d = the same farm, two brands
Mugello milk analysis
Granarolo
Coop
Mukki Mugello (from the shelf)
We can perfectly discriminate the three brands. Pasteurized Mugello samples from the tank (crosses) fall in the correct cluster
%                    COOP   GRANAROLO   MUKKI sel. MUGELLO
COOP                 95.9   4.1         0.0
GRANAROLO            8.0    90.2        1.8
MUKKI sel. MUGELLO   0.0    3.3         96.7
60 samples of pasteurized milk produced by three Italian milk brands (20 for each brand, collected on different days).
PLS/CA/kNN: accuracy 95.8%
[Figure: score plot with three clusters labelled 1, 2 and 3]
Silage, energy and salt supplements: 0%
FEEDING-TYPE 1: SILAGE and HAYS
FEEDING-TYPE 2: HAYS and FLOURS
FEEDING-TYPE 3: SILAGE and FLOURS
Mugello milk analysis
Metabolomics can be an accurate tool to evaluate the traceability of different milks, allowing the identification of some metabolites that correlate with the components of cow feeding (feedomics?)
The ability to discriminate cows fed with silage from those that are not could have important economic applications: the production rules for Parmesan forbid the use of milk from cows fed with silage. We can imagine a new analytical technique able to detect fraud.
Thesis projects available in METABOLOMICS
at the Magnetic Resonance Center (Centro di Risonanze Magnetiche) of the University of Florence
Research topic:
Metabolomic analysis of biological fluids by nuclear magnetic resonance
The offer is addressed to degree candidates (first-level and/or second-level degree) in Chemistry, Pharmaceutical Chemistry, Biology, Biotechnology
For information: [email protected]