
Leonardo Tenori, CERM/CIRMM

FiorGen Foundation

Metabolomics

Systems Biology and the rise of the “-omics”

Omics technologies such as genomics and high-throughput DNA sequencing were introduced in parallel with the Human Genome Project in the 1990s.

According to one etymological analysis, the suffix 'ome' is derived from the Sanskrit OM ("completeness and fullness") (Lederberg and McCray, 2001).

Omics technologies and the various neologisms that define their application contexts, however, are more than a simple play on words.

They substantially transformed both the throughput and the design of scientific experiments. The omics technologies allow the generation of copious amounts of data at multiple levels of biology, from gene sequence and expression to protein and metabolite patterns underlying variability in cellular networks and the function of whole organ systems (Nicholson and Lindon, 2008; Wilke et al., 2008).

"Systems biology...is about putting together rather than taking apart, integration rather than reduction. It requires that we develop ways of thinking about integration that are as rigorous as our reductionist programmes, but different....It means changing our philosophy, in the full sense of the term" (Denis Noble).

Genomics

Study of genes

A branch of biotechnology concerned with applying the techniques of genetics and molecular biology to the genetic mapping and DNA sequencing of sets of genes or the complete genomes of selected organisms, with organizing the results in databases, and with applications of the data (as in medicine or biology).

Metagenomics

The genomics of the microbial community inside our body.

Microbial communities play a key role in preserving human health, but their composition and the mechanism by which they do so remain mysterious. Metagenomic sequencing is being used to characterize the microbial communities from 15-18 body sites from at least 250 individuals.

Changes in the human microbiome can be correlated with human health.

The human body carries about 100 trillion microorganisms in its intestines, a number ten times greater than the total number of human cells in the body. The metabolic activities performed by these bacteria resemble those of an organ, leading some to liken gut bacteria to a "forgotten" organ.

It is estimated that these gut flora have around a hundred times as many genes in aggregate as there are in the human genome.

Epigenomics

Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell (Russell 2010 p. 217 & 230).

Epigenetic modifications are reversible modifications on a cell’s DNA or histones that affect gene expression without altering the DNA sequence (Russell 2010 p. 475). Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as differentiation/development and tumorigenesis. The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.

Transcriptomics

The study of the transcriptome.

The term can be applied to the total set of transcripts in a given organism, or to the specific subset of transcripts present in a particular cell type.

Unlike the genome, which is roughly fixed for a given cell line, the transcriptome can vary with external environmental conditions. Because it includes all mRNA transcripts in the cell, the transcriptome reflects the genes that are being actively expressed at any given time.

The study of transcriptomics, also referred to as expression profiling, examines the expression level of mRNAs in a given cell population, often using high-throughput techniques based on DNA microarray technology.

The use of next-generation sequencing technology to study the transcriptome at the nucleotide level is known as RNA-Seq.

Proteomics

The study of the proteome.

A branch of biotechnology concerned with applying the techniques of molecular biology, biochemistry, and genetics to analyzing the structure, function, and interactions of all the proteins produced by the genes of a particular cell, tissue, or organism, with organizing the information in databases, and with applications of the data.

Characterizing the human plasma proteome has become a major goal in the proteomics arena. The plasma proteome is without doubt the most complex proteome in the human body. It contains immunoglobulins, cytokines, protein hormones, and secreted proteins indicative of infection, on top of resident hemostatic proteins. It also contains tissue leakage proteins, due to the circulation of blood through the different tissues of the body.

The blood thus contains information on the physiological state of all tissues, which, combined with its accessibility, makes the blood proteome invaluable for medical purposes.

Communicomics

The analysis of the communicome.

The communicome is the whole set of communication factors in a biological specimen, and it is a more specific subset of the proteome.

This aspect of proteomics will consist of targeted measurements of circulating plasma proteins with endocrine activity, including cytokines, hormone-like proteins, growth factors and so forth, which have been termed the “communicome” (Ray 2007). These factors are involved in inter-cellular and inter-organ communication, and their changes with age will carry crucial information regarding biological aging and neurodegeneration.

Metallomics

The term metallome has been introduced by analogy with proteome, as the distribution of free metal ions in each cellular compartment.

Subsequently, the term metallomics was coined for the study of the metallome. Szpunar (2005) defined metallomics as "comprehensive analysis of the entirety of metal and metalloid species within a cell or tissue type".

Lipidomics

Lipidomics is the analysis of the lipidome. Basically, a lipidome is the comprehensive and quantitative description of the set of lipid species present in an organism. Lipidomics involves systems-level identification and quantitation of thousands of pathways and networks of cellular lipid molecular species and their interactions with other lipids, proteins and other moieties in vivo.

Lipids are hydrophobic or amphipathic molecules which include fats, waxes, sterols, fat-soluble vitamins (such as vitamins A, D, E and K), monoglycerides, diglycerides and phospholipids.
The crucial role of lipids in cell, tissue and organ physiology is evident from their unique membrane-organizing properties, which provide cells with functionally distinct subcellular membrane compartments.

The main biological functions of lipids include:
• Energy storage and structural components of cellular membranes.
• Cell signaling (e.g. phospholipase C and phospholipase A2 in modulating immunological responses).
• Endocrine actions (e.g. steroid hormones).
• Essential role in signal transduction, membrane trafficking and morphogenesis.

Glycomics

Glycomics is the comprehensive study of glycomes (the entire complement of sugars, whether free or present in more complex molecules, of an organism), including genetic, physiologic, pathologic, and other aspects.

The term glycomics is derived from the chemical prefix for sweetness or a sugar, "glyco-", and was formed to follow the naming convention established by genomics (which deals with genes) and proteomics (which deals with proteins).

Glycoproteins

Glycolipids

Foodomics

Foodomics is a discipline that studies the food and nutrition domains through the application of advanced omics technologies to improve consumers’ well-being, health, and confidence.

Foodomics is a global discipline that includes all the working areas in which food (including nutrition) and advanced omics tools are put together.

The interest in Foodomics also coincides with a clear shift in medicine and biosciences toward the prevention of future diseases through adequate food intake, and the development of so-called functional foods.

And metabolomics?

Now it is easy to answer…

“Metabolomics is a further “omic” science whose purpose is the comprehensive analysis of the metabolome, which is the complete set of metabolites in a biological fluid, cell, tissue, organ or organism”.

Metabolomics

Metabolomics can also provide tissue-specific information

Biological fluids such as blood and urine can provide information at the whole-body level

Genomics tells you what could happen.
Metabolomics tells you what has happened.
Only a few thousand metabolites.
Non-negligible influence of the external environment.

Genomics: the complete blueprint of an individual. What more do we need?
There are 6 million parts in a 747 plane. If someone showed you the blueprints of all of them one after the other, would you be able to tell what the plane looks like?

Proteomics: only 30,000-40,000 proteins.
However, millions of potential interactions make an “individual”. And the analysis is still very difficult…

Metabolomics: only a few thousand metabolites.
However, non-negligible external variability.

Genomics is “only” the start!

“Genomics and proteomics tell you what might happen, but metabolomics tells you what actually is happening”

Bill Lasley - University of California, Davis

“If you have a disease, it’s likely that your metabolism is going to be affected. The same is true if you get hit with a toxicant. To be honest, the diagnostic potential is staggering”

Mark Viant - University of Birmingham

Benefits of analyzing the metabolome

The number of metabolites is lower than the number of genes and proteins in a cell, so sample complexity is reduced

Although the concentration of enzymes and the metabolic flux may not change significantly during a biochemical reaction, the concentration of metabolites can change significantly

Metabolites reflect the functional level of a cell more accurately

Metabolic fluxes are regulated not only by gene expression but also by environmental stresses, hence it is worth measuring downstream products (i.e. metabolites)

Metabolomic experiments are estimated to be 2x to 3x less expensive than proteomic and transcriptomic experiments

Metabolomics is More Time Sensitive Than Other “Omics”

[Figure: response versus time for metabolomics, proteomics and genomics]

Effect of diet

Fish consumption: trimethylamine N-oxide

No fish consumption

Consumption of chewing gum, candies, etc.: mannitol

[Figure: 1H NMR spectra, region from 4.00 to 3.60 ppm]

Challenges when analyzing metabolomes

Metabolomes extend over 7 to 9 orders of magnitude in concentration (picomolar to millimolar)

Currently it is not possible to analyze all metabolites in a single analysis

Several analytical strategies are needed (MS in combination with different chromatographic separations, NMR)

Requires high throughput

Term: Definition

Metabolism: The whole ensemble of all chemical reactions that occur in living organisms, including digestion and the transport of substances into and between different cells. The set of reactions within the cells is called intermediary metabolism.

Metabolite: Substance produced during or taking part in metabolism.

Metabolome: The full complement of metabolites present in a cell, tissue, or organism in a particular physiological, pathological or developmental state.

Metabonomics: The quantitative measurement of the dynamic multiparametric metabolic response of living systems to pathophysiological stimuli or genetic modification (Nicholson et al., 1999).

Metabolomics: The identification and quantification of the complete set of metabolites/low-molecular-weight intermediates, which are context dependent, varying according to the physiology, developmental or pathological state of the cell, tissue, organ or organism (Oliver 2002).

Metabolic profiling: Identification and quantification of all metabolites, which are generally related to a specific metabolic pathway.

Metabolic fingerprinting: Global, high-throughput, rapid analysis to provide sample classification. Also utilized as a screening tool to discriminate between samples of different biological status or origin (i.e., case vs. control, disease vs. healthy).

Since the late 1990s, metabolomic studies have undergone explosive growth, and this trend is still continuing, with more than a thousand papers published in 2010!

[Figure: entries in PubMed per year, 1998-2012, for the search terms Metabonom* and Metabolom*]

What is a Metabolite?

Any organic molecule detectable in the body with a MW < 1000 Da

Includes peptides, oligonucleotides, sugars, nucleosides, organic acids, ketones, aldehydes, amines, amino acids, lipids, steroids, alkaloids and drugs (xenobiotics)

Includes human & microbial products

Concentration > 1 µM

Why 1 µM?

• Equals roughly 200 ng/mL
• Limit of detection by NMR
• Limit of easy isolation/separation by many analytical methods
• Excludes environmental pollutants
• Most disease indicators have concentrations > 1 µM
• Need to draw the line somewhere
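The "roughly 200 ng/mL" figure is unit arithmetic, provided one assumes a molecular weight of about 200 g/mol. That value is an assumption made here for illustration and is not stated on the slide. A minimal sketch (the function name is hypothetical):

```python
def micromolar_to_ng_per_ml(conc_um: float, mw_g_per_mol: float) -> float:
    """Convert a molar concentration (µM) into a mass concentration (ng/mL).

    µmol/L multiplied by g/mol gives µg/L, and 1 µg/L equals 1 ng/mL,
    so the conversion is a plain product.
    """
    return conc_um * mw_g_per_mol


# Assumed molecular weight of 200 g/mol (illustrative, not from the slide)
print(micromolar_to_ng_per_ml(1.0, 200.0))  # 200.0
```

A lighter metabolite reaches the same 1 µM threshold at a lower mass concentration. Glycine (about 75 g/mol) would correspond to roughly 75 ng/mL.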

Examples of metabolites

[Figure: chemical structures of glycine, tryptophan, arginine, adenosine-5'-triphosphate, pyruvic acid, succinic acid, oxaloacetic acid and acetyl CoA]

Small Molecules Count…

• >95% of all diagnostic clinical assays test for small molecules

• 89% of all known drugs are small molecules

• 50% of all drugs are derived from pre-existing metabolites

• 30% of identified genetic disorders involve diseases of small molecule metabolism

• Small molecules serve as cofactors and signaling molecules to 1000’s of proteins

Metabolomics Applications

• Nutritional Analysis
• Drug Compliance
• Toxicology Testing
• Clinical Trial Testing
• Fermentation Monitoring
• Food & Beverage Tests
• Nutraceutical Analysis
• Drug Phenotyping
• Water Quality Testing
• Petrochemical Analysis

Medical Metabolomics

• Generate metabolic “signatures” for disease states or host responses
• Obtain a more “holistic” view of metabolism (and treatment)
• Diagnosis: more rapidly and accurately (and cheaply) assess/identify disease phenotypes
• Monitor gene/environment interactions
• Rapidly track effects from drugs/surgery

Metabolomics: some goals

Assess possible correlations between metabolic fingerprint and disease

(this would provide new tools to deepen our knowledge of specific pathologies)

Try to understand whether it is possible to diagnose a disease and assess its stage of progression

(diagnosing tumors earlier than is currently possible, for example, would make it possible to save 30% of patients using the drugs currently available)

Discover new biomarkers

(those currently used to diagnose some pathologies might not be the only ones and/or the most effective)

Study the metabolites connected to specific metabolic pathways

(this would make it possible to define new targets for future drugs and to assess the impact of current ones, allowing advanced personalization of therapy)

Metabolic profiling is not new. Profiling for clinical detection of human disease using blood and urine samples has been carried out for centuries.

This urine wheel was published in 1506 by Ullrich Pinder, in his book Epiphanie Medicorum. The wheel describes the possible colors, smells and tastes of urine, and uses them to diagnose disease.

Nicholson, J. K. & Lindon, J. C. Nature 455, 1054–1056 (2008).

History

Traditional clinical analysis: a few already known metabolites for some diseases (e.g. glucose for diabetes, etc.)

Metabolomics: all metabolites are analyzed together, without prior knowledge

Data acquisition

• MS-based techniques
– Gas Chromatography (GC-MS)
– High Performance Liquid Chromatography (HPLC-MS)
– Ultra Performance Liquid Chromatography (UPLC-MS)
– Capillary Electrophoresis (CE-MS)

• NMR
– Nuclear Magnetic Resonance Spectroscopy (NMR)
– High resolution magic angle spinning (HR-MAS)

NMR versus MS

NMR
• Quantitative, very fast
• Requires no work-up or separation
• Allows analysis of 200+ compounds at once
• Relatively insensitive technique
• Lower limit of detection 1-5 µM
• Usually large sample size (500 µL)

MS
• Quite fast
• Very sensitive
• Allows analysis or ID of 3000+ compounds (not at once, but using several different analytical strategies)
• Not quantitative
• Requires complex work-up for sample preparation

Metabolic Profiling Methods: Main Analytical Techniques

How can one decide which analytical platform should be used?

- Should be rapid, reproducible, with easy sample preparation.

- Selection based on objectives, target metabolites, availability, etc.

Scale from - to +++ for major disadvantages to major advantages

Phytochem Rev (2008) 7:525–537

Quantitative (Targeted): the preferred MS way
Sample Prep, Metabolite Identification & Quantification, Biological Interpretation

Fingerprinting (Untargeted): the preferred NMR way
Sample Prep, Data Collection, Data Reduction

[Figure: PCA scores plot, PC1 vs. PC2, with the groups PAP, ANIT and Control]

If NMR identifies some abundant metabolites belonging to specific metabolic cycles, then MS can be used to look for target metabolites that are not detected by NMR but are known to be related to that specific biological cycle.

NMR generates hypotheses; MS can confirm them and go into deeper detail.

COMPLEMENTARY TECHNIQUES

NMR Metabolomics

Metabolic fingerprint

We mainly use NMR

1H NMR spectrum of ethanol: a series of points with an associated intensity

Our data: NMR spectra

NMR spectroscopy is usually used to detect hydrogen nuclei in metabolites. Thus, in a typical biological sample, all hydrogen-containing molecules in the sample will give a 1H NMR spectrum, as long as they are present in concentrations above the detection limit. The NMR spectrum is therefore the superposition of the spectra of all of the metabolites in the sample.

1H NMR profile of human urine

[Figure: 1H NMR spectrum of human urine, 7 to 1 ppm, with assigned signals: hippurate, urea, allantoin, creatinine, 2-oxoglutarate, citrate, TMAO, succinate, fumarate, water, taurine]

[Figure: PCA scores plot, PC1 vs. PC2]

2 Routes to Metabolomics

Two approaches:
• Quantitative methods: identify as many metabolites as possible
• Chemometric methods (fingerprinting and pattern recognition): use the whole spectrum as a fingerprint (statistics)

Quantitative vs. Chemometric

Quantitative
• Identifies compounds
• Quantifies compounds
• Concentration range of 1 µM to 1 M
• Handles wide range of samples/conditions
• Allows identification of diagnostic patterns
• Limited by DB size

Chemometric
• No compound ID
• No compound concentrations
• No compound concentration range
• Requires strict sample uniformity
• Allows identification of diagnostic patterns
• Limited by training set

Metabolomics steps

• Handling and preparation of samples
• NMR analysis
• Data processing and bucketing
• Statistical analysis
• Metabolite identification

Typical biofluids used in metabolomics

• Urine
• Serum/plasma
• Saliva
• Fecal extracts
• Exhaled breath condensate
• Cell/tissue extracts

And also tears, sweat, vaginal fluid, seminal fluid, synovial fluid, bile, cerebrospinal fluid, …

Handling and preparation of the samples

A prerequisite for the further improvement of the diagnosis and prognosis of diseases is the development of systems and procedures for all stages of the process, from specimen collection through to the analysis.

A critical point in the process is the treatment of the sample material before the analysis itself takes place, the so-called pre-analytical phase. If the sample is not stabilized as soon as possible after its collection from the human body, the subsequent analysis loses reliability and reproducibility, because the biological or chemical structure of the sample material will have changed between collection and analysis and will no longer reflect the real situation in the patient's body.

SOPs are fundamental

Every step of a metabolomic experiment must be carefully standardized

Samples arrive at CERM frozen and are immediately stored at -80°C

On the day of the analysis, samples (serum or urine) are thawed at room temperature, and then a buffer containing deuterated water is added

Samples are then pipetted into 4.5 mm tubes

The automatic sample changer allows us to work in (moderately) high throughput

~10 min for a urine sample
~30 min for a serum sample (more experiments)

Acquisition of the NMR spectra

600 MHz: the standard field for metabolomics

CPTI 1H-13C/31P-2H cryoprobe with automatic tuning and matching

Most common NMR techniques in metabolomics:

1D-NOESY with water presaturation

1D-CPMG (Carr-Purcell-Meiboom-Gill T2 relaxation editing)

1D diffusion-edited spectra

2D-JRES (J-resolved experiments)

1H NMR analysis of a serum sample

[Figure: serum spectra, 4.0 to 0.5 ppm]
• 1D-NOESY: low molecular weight and proteins profile
• CPMG: low molecular weight metabolite profile
• Diffusion edited: lipids and proteins profile
• J-resolved: identification purposes

[Figure: 1H NMR spectra of serum, urine, saliva and fecal extract]

Data Preparation

NMR spectra with 64k or 128k points undergo phasing, baseline correction… and are then binned (n ≈ 400 bins)

Bucketing is a means to reduce the number of total variables and to compensate for small shifts in the spectra.

• The collected data are represented in a matrix

At this point, the data have been transformed into a matrix with the samples in rows and the variables (bins) in columns
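As an illustration of bucketing, the sketch below sums the points of each spectrum into equal-width ppm bins and stacks the results into a samples-by-bins matrix. The function names, the ppm range and the bin width are assumptions chosen for the example, not the settings used for the data in these slides.

```python
def bucket_spectrum(ppm, intensity, lo=0.0, hi=10.0, width=0.02):
    """Sum intensities into equal-width ppm bins covering [lo, hi)."""
    n_bins = int(round((hi - lo) / width))
    bins = [0.0] * n_bins
    for shift, value in zip(ppm, intensity):
        if lo <= shift < hi:
            # Guard against rounding pushing a point past the last bin
            idx = min(int((shift - lo) / width), n_bins - 1)
            bins[idx] += value
    return bins


def build_matrix(spectra, **kwargs):
    """One row per sample (spectrum), one column per bin."""
    return [bucket_spectrum(ppm, inten, **kwargs) for ppm, inten in spectra]


# Toy spectrum with four points, reduced to two bins of 0.5 ppm
toy = ([0.1, 0.2, 0.6, 0.9], [1.0, 2.0, 3.0, 4.0])
print(build_matrix([toy, toy], lo=0.0, hi=1.0, width=0.5))
```

Because neighbouring points fall into the same bin, a peak that moves slightly between samples still contributes to the same variable. This is the compensation for small shifts mentioned above.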

Univariate statistics
Variables are analyzed independently from each other. No interaction is taken into account.
- statistical tests (t-test, Wilcoxon test, Kruskal-Wallis test, etc.)
- correlations
- ROC curves
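A univariate analysis of a binned data matrix can be sketched as one test per column. The example below computes Welch's t statistic for each bin, comparing two groups of samples. It is a minimal illustration with hypothetical function names. P-values and multiple-testing correction, which a real analysis would need, are left out.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


def t_per_bin(matrix_a, matrix_b):
    """One t statistic per column; every bin is tested on its own."""
    stats = []
    for j in range(len(matrix_a[0])):
        col_a = [row[j] for row in matrix_a]
        col_b = [row[j] for row in matrix_b]
        stats.append(welch_t(col_a, col_b))
    return stats


# Toy data: bin 0 is identical in both groups, bin 1 differs clearly
controls = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0]]
cases = [[1.0, 20.0], [2.0, 21.0], [3.0, 22.0]]
print(t_per_bin(controls, cases))
```

A bin whose values are constant in both groups has zero variance and would cause a division by zero, so such columns should be removed beforehand.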

Multivariate statistics
Variables are analyzed together, taking into account the relationships between them.

Unsupervised methods: no information about groups
- Projection-based methods (principal component analysis, independent component analysis)
- Cluster analysis (k-means, hierarchical clustering, spectral clustering, …)

Supervised methods
- Projection-based methods (LDA, PLS, …)
- Machine learning (SVM, random forest, neural networks)

Univariate Statistics
• Univariate means a single variable
• If you measure a population using some single measure such as height, weight, test score, IQ, you are measuring a single variable

Multivariate Statistics
• Multivariate means multiple variables
• If you measure a population using multiple measures at the same time such as height, weight, hair colour, clothing colour, eye colour, etc. you are performing multivariate statistics
• Multivariate statistics requires more complex, multidimensional analyses or dimensional reduction methods

Multivariate Statistics

Unsupervised methods: no information on the samples

Projection methods:
Principal component analysis, independent component analysis, ISOMAP, locally linear embedding, diffusion maps, locality preserving projection, Laplacian eigenmap

Clustering methods:
K-means, K-medoids, hierarchical clustering

Supervised methods: information on samples (e.g. disease/healthy)

Projection methods:
PLS, OPLS, LDA

Machine learning:
K-NN, support vector machines, random forest, neural networks

PCA

• PCA – Principal Component Analysis

• Process that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called principal components

• Reduces 1000’s of variables to 2-3 key features

[Figure: scores plot]

Data Presentation

• Example: 53 blood and urine measurements (wet chemistry) from 65 people (33 alcoholics, 32 non-alcoholics).

• Matrix Format

      H-WBC    H-RBC    H-Hgb     H-Hct     H-MCV      H-MCH     H-MCHC
A1    8.0000   4.8200   14.1000   41.0000   85.0000    29.0000   34.0000
A2    7.3000   5.0200   14.7000   43.0000   86.0000    29.0000   34.0000
A3    4.3000   4.4800   14.1000   41.0000   91.0000    32.0000   35.0000
A4    7.5000   4.4700   14.9000   45.0000   101.0000   33.0000   33.0000
A5    7.3000   5.5200   15.4000   46.0000   84.0000    28.0000   33.0000
A6    6.9000   4.8600   16.0000   47.0000   97.0000    33.0000   34.0000
A7    7.8000   4.6800   14.7000   43.0000   92.0000    31.0000   34.0000
A8    8.6000   4.8200   15.8000   42.0000   88.0000    33.0000   37.0000
A9    5.1000   4.7100   14.0000   43.0000   92.0000    30.0000   32.0000

[Figures: value versus measurement; univariate plot of H-Bands per person; bivariate plot of C-LDH versus C-Triglycerides; trivariate plot of M-EPI versus C-Triglycerides and C-LDH]

Data Presentation

• Better presentation than ordinate axes?
• Do we need a 53-dimensional space to view the data?
• How to find the ‘best’ low-dimensional space that conveys maximum useful information?
• One answer: Find “Principal Components”

Data Presentation

The Goal

We wish to explain/summarize the underlying structure of a large set of variables through a few linear combinations of these variables.

Applications

• Uses:
– Data Visualization
– Data Reduction
– Data Classification
– Trend Analysis
– Noise Reduction

• Examples:
– How many unique “sub-sets” are in the sample?
– How are they similar / different?
– What are the underlying factors that influence the samples?
– Which time / temporal trends are (anti)correlated?
– Which measurements are needed to differentiate?
– How to best present what is “interesting”?
– To which “sub-set” does this new sample rightfully belong?

From k original variables $x_1, x_2, \dots, x_k$, produce k new variables $y_1, y_2, \dots, y_k$:

$$
\begin{aligned}
y_1 &= a_{11}x_1 + a_{12}x_2 + \dots + a_{1k}x_k \\
y_2 &= a_{21}x_1 + a_{22}x_2 + \dots + a_{2k}x_k \\
&\;\;\vdots \\
y_k &= a_{k1}x_1 + a_{k2}x_2 + \dots + a_{kk}x_k
\end{aligned}
$$

such that:

• the $y_i$ are uncorrelated (orthogonal)
• $y_1$ explains as much as possible of the original variance in the data set
• $y_2$ explains as much as possible of the remaining variance
• etc.

The $y_i$ are the Principal Components
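To make the definition concrete, the sketch below finds the coefficients of the first principal component for a tiny two-variable data set. It uses power iteration on the covariance matrix, one simple way to obtain the leading eigenvector; in practice a library routine based on SVD would normally be used. The function names and the toy data are hypothetical.

```python
from math import sqrt


def center(X):
    """Subtract the column means so every variable has zero mean."""
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    return [[row[j] - means[j] for j in range(k)] for row in X]


def covariance(Xc):
    """Sample covariance matrix of centered data."""
    n, k = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(k)]
            for i in range(k)]


def first_component(C, iters=200):
    """Leading eigenvector of C by power iteration (unit length).

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    k = len(C)
    a = [1.0 / sqrt(k)] * k
    for _ in range(iters):
        a = [sum(C[i][j] * a[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(v * v for v in a))
        a = [v / norm for v in a]
    return a


# Two strongly correlated variables measured on four samples (toy data)
X = [[2.0, 1.9], [1.0, 1.1], [3.0, 3.2], [4.0, 3.8]]
Xc = center(X)
a1 = first_component(covariance(Xc))                        # a_11, a_12
y1 = [sum(x * a for x, a in zip(row, a1)) for row in Xc]    # scores on PC1
print(a1, y1)
```

Both coefficients come out positive and of similar size, so the first component is close to the average of the two correlated variables and carries almost all of their variance.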

PCA

[Figure: 2D Gaussian dataset with the 1st and 2nd PCA axes]

PCA Plot Nomenclature

• PCA generates 2 kinds of plots, the scores plot and the loadings plot

• The scores plot shows the data using the main principal components

$Z = XA$, where $Z$ holds the scores, $X$ the original data and $A$ the loadings

PCA Loadings Plot

• The loadings plot shows how much each of the variables (metabolites) contributed to the different principal components

• Variables at the extreme corners contribute most to the scores plot separation

PCA Details/Advice

• In some cases PCA will not succeed in identifying any clear clusters or obvious groupings, no matter how many components are used. If this is the case, it is wise to accept the result and assume that the presumptive classes or groups cannot be distinguished with PCA

• As a general rule, if a PCA analysis fails to achieve even a modest separation of classes, then it is probably better to use other statistical techniques to try to separate them

PLS

• Supervised learning method.
• Same principles as PCA. But in PLS, a second piece of information is used, namely the labeled set of class identities.
• Two data tables are considered, namely X (input data from samples) and Y (containing qualitative values, such as class membership or treatment of samples)
• The PLS algorithm maximizes the covariance between the X variables and the Y variables

How PLS works (Concept)

PLS is also a step-wise process. This is how it works conceptually:

• PLS finds a set of orthogonal components that:
– maximize the explained variance of both X and Y
– provide a predictive equation for Y in terms of the X’s

• This is done by:
– fitting a set of components to X (as in PCA)
– similarly fitting a set of components to Y
– reconciling the two sets of components so as to maximize the covariance of X and Y
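For a single response variable, the first PLS component can be written down directly: the weight vector is proportional to X'y (computed on centered data), which is the direction in X space whose scores have maximal covariance with y. The sketch below shows only this first step. The function name and the toy data are hypothetical, and the deflation needed for further components is omitted.

```python
from math import sqrt


def pls_first_component(X, y):
    """Weights, scores and regression coefficient of the first PLS component."""
    n, k = len(X), len(X[0])
    x_means = [sum(r[j] for r in X) / n for j in range(k)]
    y_mean = sum(y) / n
    Xc = [[r[j] - x_means[j] for j in range(k)] for r in X]
    yc = [v - y_mean for v in y]
    # Weight vector proportional to X'y, normalised to unit length
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(k)]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each sample onto the weight vector
    t = [sum(x * wj for x, wj in zip(row, w)) for row in Xc]
    # Regress the centered response on the scores
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q


# Toy data: variable 0 separates the classes, variable 1 does not
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 4.5], [3.1, 3.5], [2.9, 4.0]]
y = [0, 0, 0, 1, 1, 1]  # class labels coded as 0/1
w, t, q = pls_first_component(X, y)
print(w, t, q)
```

The weight vector points almost entirely along variable 0, the one that tracks the class labels. PCA on the same data would instead favour whichever variable has the larger variance, whether or not it is related to the labels.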

• The OPLS method is a recent modification of the PLS method to help overcome pitfalls
• The main idea is to separate the systematic variation in X into two parts, one linearly related to Y and one unrelated (orthogonal) to Y
• It comprises two modeled variations, the Y-predictive ($T_p P_p^T$) and the Y-orthogonal ($T_o P_o^T$) components
• Only the Y-predictive variation is used for modeling Y
• $X = T_p P_p^T + T_o P_o^T + E$
• $Y = T_p C_p^T + F$
• E and F are the residual matrices of X and Y
• OPLS-DA compared to PLS-DA

KNN Classification: k-nearest neighbours

[Figure: example scatter plot of loan amount ($0 to $250,000) versus age (0 to 80), with points labelled Default and Non-Default; source: www.ismartsoft.com]
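The idea behind k-nearest neighbours is that a new sample receives the label held by the majority of its k closest training samples. A minimal sketch, assuming Euclidean distance and hypothetical names and data:

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Majority label among the k training points closest to the query."""
    order = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Toy two-dimensional data (for instance, scores on two components)
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
train_y = ["control", "control", "control", "case", "case", "case"]

print(knn_predict(train_X, train_y, (1.1, 1.0)))
print(knn_predict(train_X, train_y, (3.9, 4.1)))
```

An odd k avoids ties in two-class problems. Because the method relies on distances, the variables should be on comparable scales before it is applied.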

Our interest in metabolomics

Metabolic signature of individuals: metabolic phenotype

Metabolic signature of diseases
• Celiac disease
• Tumor metastasis (breast, colorectal)
• Cardiovascular diseases
• Pulmonary diseases
• …

Metabolomics in agriculture and for nutritional studies

Metabolomics for biobank samples
• Sensitive reporters of stability
• Assess sample preparation and preanalytical procedures
• …

METabolomic REFerence

Experimental scheme:

• 22 individuals, 11 males & 11 females
• 40 urine samples each, over a period of 2-3 months
• First urine of the morning, preprandial
• Collection suspended in case of illness; otherwise no restrictions

• Metadata recording:
– Diet
– Drugs
– Lifestyle, general habits
– Smoker / non-smoker

• NMR analysis: 1D 1H spectra

Why?

• Training
• Urine samples are easy to collect
• Large number of samples
• Potential intrinsic value of the information


Getting a first feeling…

Visual inspection suggests that it should be interesting to look for individual fingerprints by statistical analysis

We believe the human eye is very sensitive to differences in patterns

[Figure: 1H NMR spectra of Ind 1 and Ind 2, 10.00 to 2.50 ppm]

METabolomic REFerence

Convex hulls of 22 donors in the three most significant PCA-CA dimensions

Assfalg, Bertini, Colangiuli, Luchinat, Schäfer, Schütz, Spraul, PNAS, 2008, 105, 1420-4

• PCA for data reduction
• CA to obtain well-separated clusters
• KNN for classification
• 99% accuracy in Monte Carlo cross-validation
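Monte Carlo cross-validation estimates accuracy by repeating a random split into training and test samples many times and averaging the results. The sketch below illustrates only that resampling loop, with a plain 1-nearest-neighbour classifier standing in for the full PCA/CA/KNN chain. The names, the split fraction, the number of rounds and the data are assumptions, not the settings of the published study.

```python
import random
from math import dist


def nearest_neighbour(train, query):
    """Label of the training sample closest to the query (1-NN)."""
    return min(train, key=lambda item: dist(item[0], query))[1]


def monte_carlo_cv(samples, n_rounds=100, test_fraction=0.2, seed=0):
    """Mean accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, int(len(samples) * test_fraction))
    accuracies = []
    for _ in range(n_rounds):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        hits = sum(nearest_neighbour(train, x) == label for x, label in test)
        accuracies.append(hits / n_test)
    return sum(accuracies) / n_rounds


# Toy data: two hypothetical donors with well-separated "spectra"
data = ([((0.1 * i, 0.0), "donor_A") for i in range(5)]
        + [((5.0 + 0.1 * i, 5.0), "donor_B") for i in range(5)])
print(monte_carlo_cv(data))
```

In a full pipeline, any data-driven step such as PCA or CA should be fitted on the training part of each split only. Otherwise information from the test samples leaks into the model and the accuracy estimate becomes optimistic.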

“Natural” gender discrimination

[Figure: clusters labelled MALE and FEMALE]



Dendrogram of the 22 donors on the 21-dimensional PCA-CA subspace


[Table: recognition accuracy of MM-SIMCA, PCA/K-NN and PCA/CA/K-NN under leave-one-out and majority-rule evaluation; values shown: 38%, 69%, 99%, 100%, 95%, 81%]



Concentrations of 12 selected metabolites for each donor. Absolute creatinine concentration (Crea) and relative metabolite concentrations (relative to creatinine) are scaled with respect to their median values averaged over the total set of samples.
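The two normalisation steps named in this caption can be sketched as follows: each metabolite concentration is divided by the creatinine concentration of the same sample, and each resulting ratio is then divided by its median over all samples. This is one reading of the caption, and the exact procedure used for the figure may differ. The function names and the numbers are hypothetical.

```python
from statistics import median


def relative_to_creatinine(sample):
    """Divide every metabolite concentration by the sample's creatinine."""
    crea = sample["creatinine"]
    return {name: value / crea
            for name, value in sample.items() if name != "creatinine"}


def scale_by_median(samples):
    """Scale each creatinine-relative concentration by its median over all samples."""
    relative = [relative_to_creatinine(s) for s in samples]
    names = list(relative[0])
    medians = {n: median(r[n] for r in relative) for n in names}
    return [{n: r[n] / medians[n] for n in names} for r in relative]


# Hypothetical concentrations (arbitrary units) for three urine samples
samples = [
    {"creatinine": 10.0, "hippurate": 5.0},
    {"creatinine": 20.0, "hippurate": 20.0},
    {"creatinine": 5.0, "hippurate": 5.0},
]
print(scale_by_median(samples))
```

Dividing by creatinine is a common way to compensate for differences in urine dilution. After median scaling, a value of 1 means "typical for this data set".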

PCA/CA/K-NN on 12 metabolites: 73%

METabolomic REFerence

An individual metabolic fingerprint exists!

But it is hidden inside the daily noise


METabolomic REFerence 2

Why Healthy Individuals Again?

• Expanding the dataset

• Trying to learn more about the relevance of genetic vs lifestyle contributions

• Checking the constancy of metabolic phenotypes over time

METabolomic REFerence 2

Experimental Scheme (2 years later)

• 20 individuals, 9 males & 11 females

• 11 individuals (6 M + 5 F) already in the first screening

• 40 samples each, over a period of 2-3 months

• First urine of the morning, preprandial

• Collection suspended in case of illness

• Data recording:
– Diet
– Drugs
– Lifestyle
– Smoker / non-smoker

• NMR analysis: 1D 1H spectra

METREF 1, 2, 3

[Figure: overlap of donors among the MetRef1 (2005), MetRef2 (2007) and MetRef3 (2008) collections; the donors include twins and a father and son]


Average recognitions using different test/training combinations

Distances MetRef 1, 2, 3

Heat map of the inter-distances between pools of spectra belonging to 46 pseudo-individuals

[Figure labels: genes; lifestyle etc.]

Bernini, P.; Bertini, I.; Luchinat, C.; Nepi, S.; Saccenti, E.; Schäfer, H.; Schütz, B.; Spraul, M.; Tenori, L. Individual human phenotypes in metabolic space and time, J. Proteome Res. 2009

• The metabotype consists of a variable part (environment) and an invariant part (genetics + environment)
• The invariant part persists for at least two to three years
• The discovery of the existence of individual metabotypes is the baseline for biomedical and nutritional studies

METabolomic REFerence

Metabolic profile evolution of an individual

[Figure labels: NMR instrumentation; MetRef1; MetRef2; MetRef3; a metabolic “jump”; hippurate]

MetRef 4

At present, a new sample collection is ongoing, to study the evolution of the metabolic fingerprint over an even longer time scale (6-9 years).

• 11 individuals, 6 males & 5 females, each of whom participated in at least one previous collection

For the first three individuals for whom the collection of 20 samples has been completed, the accuracies are

19/20, 18/20, 20/20

Celiac Disease Metabolomics

What is Celiac Disease?

• Celiac Disease (CD), or sprue, is a permanent intolerance to gluten

• Gluten is found in wheat, rye, barley and other cereals

• Gliadin and glutenin comprise about 80% of the protein contained in wheat seeds

• Gluten is present in bread, pasta, pizza, biscuits…

The ONLY therapy is a totally gluten-free diet

Aim: define the metabolome of celiac disease; obtain hints on its biochemistry

Celiac Disease Metabolomics

• Study subjects: 34

• Control subjects: 34

• Samples: serum and urine

NMR spectra acquired:

• 1D NOESY (standard 1D 1H spectra) for serum and urine samples

• CPMG: to remove signals due to macromolecules (on serum samples)

• Acquired on a Bruker 600 MHz spectrometer

Experimental scheme:

Statistical Analysis

• Projection to Latent Structures (PLS) to reduce the data dimensionality; the optimal number of components is obtained by minimizing the cross-validated (CV) error

• Canonical Analysis (CA) to obtain two well separated clusters

• Support Vector Machines (SVM) for classification
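The idea of choosing a model parameter by minimizing the cross-validated error, mentioned in the first bullet above, can be sketched in a few lines. This is not the PLS/CA/SVM pipeline of the study: as a stand-in, the tuned parameter is the number of leading features kept by a simple nearest-centroid classifier, and all function names, labels and data are invented for illustration.

```python
from math import dist

def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds interleaved test folds."""
    return [list(range(f, n_samples, n_folds)) for f in range(n_folds)]

def cv_error(fit, predict, X, y, param, n_folds):
    """Fraction of misclassified held-out samples for one parameter value."""
    errors = 0
    for test_idx in k_fold_indices(len(X), n_folds):
        held_out = set(test_idx)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [t for i, t in enumerate(y) if i not in held_out]
        model = fit(train_X, train_y, param)
        errors += sum(predict(model, X[i]) != y[i] for i in test_idx)
    return errors / len(X)

def select_by_cv(fit, predict, X, y, candidates, n_folds):
    """Return the candidate with the lowest CV error (ties go to the first) and all errors."""
    scores = {p: cv_error(fit, predict, X, y, p, n_folds) for p in candidates}
    return min(candidates, key=lambda p: scores[p]), scores

# Stand-in model: nearest centroid on the first `n_features` coordinates.
def fit_centroids(train_X, train_y, n_features):
    groups = {}
    for x, label in zip(train_X, train_y):
        groups.setdefault(label, []).append(x[:n_features])
    centroids = {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in groups.items()
    }
    return n_features, centroids

def predict_centroid(model, x):
    n_features, centroids = model
    return min(centroids, key=lambda label: dist(centroids[label], x[:n_features]))

# Made-up data: the first feature separates the classes, the second is large noise.
X = [(1.0, 5.0), (0.0, -4.0), (1.1, -6.0), (0.1, 5.0),
     (0.9, 4.0), (-0.1, -5.0), (1.0, -5.0), (0.0, 6.0)]
y = ["CD", "HS", "CD", "HS", "CD", "HS", "CD", "HS"]
best, scores = select_by_cv(fit_centroids, predict_centroid, X, y, candidates=[1, 2], n_folds=4)
```

On this toy data, keeping only the informative feature gives zero held-out errors, while adding the noisy one makes the classifier fail on held-out samples, so the selection returns 1. The same loop applies unchanged when the parameter is the number of latent components.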


Clustering of serum spectra of celiac and healthy subjects

Note: both subjects are asymptomatic!

Bertini, I.; Calabrò, A.; De Carli, V.; Luchinat, C.; Nepi, S.; Porfirio, B.; Renzi, D.; Saccenti, E.; Tenori, L. The metabonomic signature of celiac disease, J. Proteome Res. 2009, 8(1), 170

Accuracy between 80% and 90%

Celiac Disease Metabolomics

Significantly different metabolites in serum (p<0.01)

Already known NAC = N-acetyl-

Significantly different metabolites in urine (p<0.01)


Celiac disease is often associated with fatigue: why?

Increased glucose, decreased pyruvate and lactate: impaired glycolysis, impaired energy production

Lipid beta-oxidation + use of ketone bodies: alternative, less efficient energy production


Clustering of Celiac and Healthy subject serum spectra

Clustering of Celiac and Healthy subject serum spectra and Follow-up


Celiac – Healthy Subjects –Cross: predicted Potential Celiac

Bernini P, Bertini I, Calabrò A, la Marca G, Lami G, Luchinat C, Renzi D, Tenori L. Are patients with potential celiac disease really potential? The answer of metabonomics. J. Proteome Res. 2010

There exists a metabolic fingerprint of celiac disease.

These alterations are also present in potential celiac subjects, so they precede the intestinal damage.

Potential CD largely shares the metabonomic signature of overt CD. Most metabolites found to be significantly different between control and CD subjects were also altered in potential CD. Our results suggest early institution of a gluten-free diet (GFD) in patients with potential CD.

Celiac Disease Metabolomics

Subjects: 134

Celiacs: 59 - Potential Celiacs: 25 - Healthy: 50

                      Sensitivity   Specificity   Accuracy
CMD vs CMS            45.52%        68.29%        61.19%
NYHA 1 vs NYHA 2      61.88%        71.42%        67.71%
NYHA 2 vs NYHA 3/4    73.62%        56.44%        68.04%
NYHA 1 vs NYHA 3/4    74.83%        68.55%        72.15%

Classification between different subgroups of heart failure patients (1D CPMG spectra).

Patients are separated from healthy subjects, but there is no significant difference between the disease grades that could reflect the clinical severity of the disease.

Although good discrimination between healthy subjects and HF subjects with severe disease was, if not expected, easy to hypothesize, a comparably good discrimination between healthy subjects and HF subjects with mild disease was unexpected and appears rather counter-intuitive.

Heart failure metabolomics

Patients vs Healthy: sensitivity 85.11%, specificity 91.04%, accuracy 87.29%

Int. J. Cardiol. 2013 Oct 9;168(4):e113-5

HF patients with a stable disease and under a state-of-the-art therapy (131 males, 54 females, mean age 62.93 ± 12.9 years) and an age- and gender-matched cohort of 111 healthy volunteers (86 males, 25 females, mean age 61.00 ± 3.28 years)

Our data support the hypothesis that the HF fingerprint may be an on/off phenomenon in the HF scenario, and the HF fingerprint correlates with the presence of HF irrespective of the disease severity. The presence of a HF fingerprint in an asymptomatic subject could therefore be more significant than the potential risk established in the presence of certain genomic features.

Int. J. Cardiol. 2013 Oct 9;168(4):e113-5

[Figure: bar chart, axis from -80 to 60, showing the metabolites TMAO, Creatine, Lipid, Isoleucine, Formate, Lipoprotein, Hypoxanthine, Proline, Phenylalanine, Pyruvate, Urea, Dimethylamine, Serine, Tyrosine, Acetate, Histidine, Methanol, Valine, Choline, Arginine, Creatinine, Dimethylsulfone, Gln+Glu, Alanine, L-dopa, Dimethylglycine, Citrate, Lactate, Lysine, Uridine, Methionine.]

A model for the prediction of heart failure was developed for each of the 3 kinds of available spectra: CPMG, NOESY and diffusion-edited.

753 new healthy samples (blood donors) were tested against each model.

20 subjects were predicted as patients in all three kinds of spectra.

We were able to recall 11 of them for an echocardiographic screening.

6 out of 11 showed altered parameters

Lower bound

Upper bound

Thickness of the left ventricular wall

Breast cancer metabolomics

In breast cancer the tumor-host interaction is small, and the systemic effects of the tumor are expected to be modest

Aims:

1) To define a systemic metabolic fingerprint of breast cancer

2) To find a predictive signature of metastatic disease

3) To predict the survival in metastatic patients

4) To find a prognostic score for the risk of relapse in post-operative patients

Breast cancer metabolomics

Healthy vs Met: accuracy 73.44%

Healthy vs Post-op: accuracy 75.80%

Post-op vs Met: accuracy 74.96%

NOESY

Healthy vs Met: accuracy 72.67%

Healthy vs Post-op: accuracy 70.00%

Post-op vs Met: accuracy 70.00%

CPMG

Classification between Pre-Op and Metastatic subjects.

Accuracy ~80%

Other comparisons Ann Oncol (2011) 22 (6): 1295-1301

Breast Cancer Metabolomics

Phenylalanine Tyrosine Aspartate Methylamine

Lipids Lipoproteins

Choline

Bones: 29; Non Bones: 19

                       Bone Metastasis   Non Bone Metastasis
Bone Metastasis        23                6
Non Bone Metastasis    6                 13

Accuracy: 75.50%
Specificity: 68.42%
Sensitivity: 79.31%
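The figures quoted with the bone-metastasis table can be recomputed from its four counts. A short sketch, assuming that the rows of the table are the actual classes and that "bone metastasis" is the positive class (the function name is chosen for this example):

```python
def classification_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy (in %) from confusion-matrix counts."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# Counts from the bone vs non-bone table (rows assumed to be the actual classes).
sens, spec, acc = classification_metrics(tp=23, fn=6, fp=6, tn=13)
print(f"sensitivity {sens:.2f}%  specificity {spec:.2f}%  accuracy {acc:.2f}%")
```

The sensitivity (23/29) and specificity (13/19) reproduce the reported 79.31% and 68.42%. The raw counts give an accuracy of 36/48 = 75.00%, slightly below the reported 75.50%; the slide does not say how its accuracy was averaged, so the source of this difference is not recoverable here.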

Different Kinds of Metastases

Soft Tissue: 32; Non Soft Tissue: 16

                              Soft Tissue Metastasis   Non Soft Tissue Metastasis
Soft Tissue Metastasis        28                       4
Non Soft Tissue Metastasis    9                        7

Visceral: 32; Non Visceral: 16

                           Visceral Metastasis   Non Visceral Metastasis
Visceral Metastasis        29                    3
Non Visceral Metastasis    5                     11


The Glaxo trial

• We received samples from clinical trial EGF30001 (GSK):

• Paclitaxel + lapatinib vs. paclitaxel + placebo

• women ≥ 18 years old

• histologically confirmed stage III or IV breast cancer

• negative or unknown HER2 status

• good performance status (ECOG 0-1)

• ≥ 6 months (6/12) separating completion of post-operative chemotherapy and disease relapse

• 579 patients

• ~ 500 baseline samples

• ~ 400 with at least a second sample

• In the whole set of samples, no correlations with outcome, therapy or toxicity were found

• In on-treatment metastatic women with HER- status, significant results for survival and time to progression were found.

Breast Cancer Metabolomics

Paclitaxel plus lapatinib-treated patients with HER2-positive disease (N=22). On-treatment week 9 serum samples. TTP prediction.

                    Outcome as predicted by metabolomics
Actual outcome      Short TTP     Long TTP
Short TTP           87.3%         12.7%
Long TTP            8.1%          91.9%

Predictive accuracy: 89.6%

Paclitaxel plus lapatinib-treated patients with HER2-positive disease (N=16). On-treatment week 9 serum samples. OS prediction.

                    Outcome as predicted by metabolomics
Actual outcome      Short OS      Long OS
Short OS            73.0%         27.0%
Long OS             15.6%         84.4%

Predictive accuracy: 78.0%

Molecular Oncology Volume 6, Issue 4, August 2012, 437–444

MSKCC Project

Project in collaboration with Monica Fornier, MD, from the Memorial Sloan-Kettering Cancer Center – New York

Samples were collected between 2003 and 2009

95 serum samples from metastatic breast cancer patients

80 serum samples from early breast cancer patients, with available risk assessment calculated with the Adjuvant Online clinical tool (21 of them had documented disease relapse)

The 80 early breast cancer samples are split into a training set and an independent test set for validation

AIM: Testing the potential use of metabolomics as an improved relapse risk assessment tool

MSKCC Project

Random Forest was used to derive a score for relapse prediction.

The ROC curve for the CPMG spectra is much better than that of Adjuvant Online (AUC<0.80).

Mol Oncol. 2014 Aug 10. pii: S1574-7891(14)00167-7

A composite score was then devised combining the CPMG score and Adjuvant Online.

MSKCC Project

AUC: 0.8983
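For readers unfamiliar with the AUC values quoted here, the area under the ROC curve equals the probability that a randomly chosen relapsing patient receives a higher score than a randomly chosen non-relapsing one. A minimal sketch of that definition follows; the scores and labels are invented and are not data from the MSKCC project.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up relapse scores: 1 = relapse, 0 = no relapse.
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

In this toy case 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is 8/9. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.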

Colorectal Cancer Metabolomics

Early diagnosis of metastatic colorectal cancer is often elusive and there are no robust markers for the survival time of patients affected by this disease.

We want to test here whether metabolomic profiling of serum samples can provide a signature of the disease and give hints on the outcome of its progression.

We compared these results against already assessed prognostic indicators: ECOG-PS (1), KRAS mutation (2), CRP (3), CEA (4) and YKL-40 (5).

1. Oken MM, Creech RH, Tormey DC, et al.: Toxicity and response criteria of the Eastern Cooperative Oncology Group. Am J Clin Oncol 5:649-655, 1982
2. McGrath JP, Capon DJ, Smith DH, et al.: Structure and organization of the human Ki-ras proto-oncogene and a related processed pseudogene. Nature 304:501-506, 1983
3. Erlinger TP, Platz EA, Rifai N, et al.: C-reactive protein and the risk of incident colorectal cancer. JAMA 291:585-590, 2004
4. Duffy MJ: Carcinoembryonic antigen as a marker for colorectal cancer: Is it clinically useful? Clin Chem 47:624-630, 2001
5. Johansen JS, Bojesen SE, Mylin AK, et al.: Elevated Plasma YKL-40 Predicts Increased Risk of Gastrointestinal Cancer and Decreased Survival After Any Cancer Diagnosis in the General Population. J Clin Oncol 27:572-578, 2009

155 metastatic colorectal cancer (CRC) patients participated in the study.

139 serum samples from healthy subjects (HS) served as a control group.

Patients

Variable                 No.          %
Age, years
  Median                 64
  Range                  36-87
Sex
  Male                   96           61.9
  Female                 59           38.1
Distant metastasis
  Liver                  102          66.8
  Lung                   51           32.9
  Nodes                  33           21.3
  Skin                   1            0.6
  Peritoneal             7            4.5
  Other location         44           28.4
ECOG-PS
  Score 0                77           49.7
  Score 1                54           34.8
  Score 2                24           15.5
KRAS mutation            53           36.1
CRP level†, mg/L
  Median                 14.5
  Range                  3.0-188.0
  High value             73           64.6
CEA level†, ku/L
  Median                 43.9
  Range                  1.1-9807.0
  High value             108          85.7
YKL-40 level†, µg/L
  Median                 126.0
  Range                  1.5-176.6
  High value             73           47.1

ECOG-PS = Eastern Cooperative Oncology Group performance status
KRAS = V-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog
CRP = C-reactive protein
CEA = carcinoembryonic antigen
YKL-40 = glycosyl hydrolase YKL-40


[Figure: two score plots with points numbered 1 to 35.]

Accuracy 100.0% Accuracy 84.5%

Good survival > 24 months

Poor survival < 3 months

Cancer Res. 2012 Jan 1;72(1):356-64


Kaplan-Meier plots of overall survival data

Bioclinical markers

[Figure: six Kaplan-Meier panels. x-axis: time (months), 0 to 44; y-axis: survival probability, 0.0 to 1.0.]

• Good survival vs poor survival: P < .00001

• ECOG-PS (grade 0, grade 1, grade 2): P = .00048

• KRAS mutation (wild-type vs mutated): P = .04310

• CRP level (low vs high): P = .00007

• CEA level (low vs high): P = .68700

• YKL-40 level (low vs high): P = .00198

Cancer Res. 2012 Jan 1;72(1):356-64


Univariate Cox Regression Analysis for the Validation Set:

HR: 3.30
95% CI: 2.02 to 5.37
P: 1.75 × 10^-6
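As a consistency check on the three numbers above, the standard error of log(HR), the Wald z statistic and an approximate two-sided P value can be recovered from the hazard ratio and its 95% confidence interval. This sketch assumes a symmetric Wald interval on the log scale with the usual 1.96 multiplier; the function name is chosen for this example.

```python
from math import erfc, log, sqrt

def wald_from_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Recover SE(log HR), z and the two-sided Wald P value from a hazard ratio and its CI."""
    se = (log(ci_high) - log(ci_low)) / (2.0 * z_crit)
    z = log(hr) / se
    p_value = erfc(abs(z) / sqrt(2.0))  # two-sided tail of the standard normal
    return se, z, p_value

se, z, p_value = wald_from_ci(3.30, 2.02, 5.37)
```

With the rounded inputs from the slide this gives z of about 4.8 and a P value of about 1.7 × 10^-6, in line with the reported 1.75 × 10^-6; the small gap is expected given the rounding of HR and CI.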

Kaplan-Meier plots of overall survival data

Metabolomics

Cancer Res. 2012 Jan 1;72(1):356-64
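The Kaplan-Meier curves referred to above are step functions built from follow-up times and censoring flags. A minimal sketch of the estimator, with invented follow-up data that are not taken from the study:

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where a death is observed.

    `events[i]` is 1 if the death was observed at `times[i]`, 0 if the patient was censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set after time t.
        at_risk -= leaving
    return curve

# Made-up follow-up times in months; event 0 marks a censored patient.
times = [2, 3, 3, 8, 12, 20]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
```

Censored patients lower the number at risk without producing a step, which is why the curve only drops at months 2, 3, 8 and 20 in this toy example. Comparing two such curves (for example good versus poor predicted survival) is what yields the P values quoted on the plots.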

The future of medicine

Intra-individual metabolomes to determine evolution towards diseases

Before disease onset people may start therapies

Perspectives in Medicine

Metabolomics can monitor the same individual in a multidimensional space

Inflammatory bowel disease

Hypertension

hepatocarcinoma

steatosis

cirrhosis

Diabetes

Metabolic syndrome

Colorectal cancer

Heart failure

Healthy aging

Today: disease onset, presence of symptoms, search for specific markers

Future: periodic -omics check-ups will make it possible to predict disease propensities and possibly undertake preventive therapies

From reactive to predictive and preventive medicine

Spectral analysis and metabolites provide info on diseases, and on tendency to diseases, of the “virtual patients”

The future of medicine

Allow me a little dreaming...

From general to personalized medicine

Metabolomics in food science

Metabolomics & Food science

• Food components analysis

• Food quality & authenticity detection

• Food consumption monitoring

• Monitoring in food intervention studies

Adapted from Wishart DS. Metabolomics: applications to food science and nutrition research. Trends in food science & technology 19 (2008) 482-493

On the consumers

Metabolite variation before and after consumption of particular products (PATHWAY-27 project) or comparison between groups of individuals on different diets (CHANCE project)

On the products

Different profiles of different products or of different processing

The CHANCE specific objectives are to:

• Identify the main nutritional criticalities and barriers to healthy eating

• Select ingredients and raw materials

• Develop CHANCE foods

• Produce CHANCE food prototypes

Partner 9, WP3

Evaluate the actual impact of the evidenced nutritional criticalities on the metabolic profile by using NMR metabolomics

Comparison of NMR profiles of individuals belonging to different dietary classes (survey from WP2) and from reference healthy diet

Fingerprints/biomarkers associated with risk-of-poverty groups

CERM in the CHANCE project

Investigate the metabolic state of volunteers recruited within each selected population group.

RECRUITMENT CENTER

• ITALY: 581 samples

177 AFF; 369 ROP

• SERBIA: 602 samples

201 AFF; 401 ROP

• UK: 410 samples

• FINLAND: 471 samples

• LITHUANIA: 400 samples

AFF=affluent (control group)

ROP=Risk Of Poverty

• No strong differences between AFF and ROP using metabolomics profiles

• Strong differences between Serbia and Italy in urinary metabolomics profiles.

Serbia Italy

PLS/CA - Accuracy 60% ca.

PLS/CA – Accuracy 92%

PRELIMINARY RESULTS

NMR Spectra

PATHWAY-27 project

AIM: a better understanding of the role and mechanisms of selected bioactives and bioactive-enriched food, performing in vitro and in vivo studies by means of advanced omics techniques (including metabolomics)

Mugello milk analysis

The value of dairy products is associated with the origin, species and composition of milk. Breed, metabolism, season, health, nutrition and milking habits produce variability in the composition of milk. 1H-NMR spectroscopy can be used to profile milk samples to extract information about traceability and composition.

Milk fingerprint



%     a      b      c      d      e      f      g      h      i      j
a     100    0.0    0.0    0.0    0      0      0      0.0    0      0.0
b     0      95.2   0.0    0.0    3      0      0      1.8    0      0.0
c     0      0.0    10.2   89.8   0      0      0      0.0    0      0.0
d     0      0.0    79.2   20.8   0      0      0      0.0    0      0.0
e     0      0.0    0.0    0.0    100    0      0      0.0    0      0.0
f     0      0.0    0.0    0.0    0      100    0      0.0    0      0.0
g     0      0.0    0.0    0.0    0      0      100    0.0    0      0.0
h     0      0.0    0.0    0.0    0      0      0      100.0  0      0.0
i     0      0.0    0.0    0.0    0      0      0      9.0    91     0.0
j     0      0.0    0.0    1.2    0      0      0      0.0    0      98.8

c (green), d (blue)

200 raw milk samples coming from Mugello’s stables (20 for each stable)

PLS/CA/kNN: accuracy 90.6%


%     a      b      c+d    e      f      g      h      i      j
a     100    0      0      0.0    0      0.0    5.0    0.0    0
b     0      95     0      10.0   0      0.0    1.0    0.0    0
c+d   0      0      100    0.0    0      0.0    0.0    0.0    0
e     0      0      0      100    0      0.0    0.0    0.0    0
f     0      0      0      0.0    100    0.0    0.0    0.0    0
g     0      0      0      0.7    0      100    0.0    0.0    0
h     0      0      0      0.0    0      0.0    100    0.0    0
i     0      0      0      0.0    0      0.0    7.1    91     0
j     0      0      0      0.0    0      0.0    0.0    0.0    98.8


c+d = the same farm, two brands
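Why merging c and d removes the confusion can be seen by folding the two classes together in the first confusion matrix. The sketch below does this for a row-normalised matrix, assuming the two classes contain the same number of samples (20 per stable on the slide); the function name and the 3-class excerpt are chosen for this example.

```python
def merge_classes(matrix, labels, a, b):
    """Merge classes `a` and `b` of a row-normalised (%) confusion matrix.

    Assumes both classes contain the same number of samples, so the merged
    row is the plain average of the two original rows.
    """
    ia, ib = labels.index(a), labels.index(b)
    keep = [i for i in range(len(labels)) if i not in (ia, ib)]

    def fold(row):
        # Sum the two merged columns and keep the other columns unchanged.
        return [row[ia] + row[ib]] + [row[i] for i in keep]

    merged_row = [(x + y) / 2.0 for x, y in zip(fold(matrix[ia]), fold(matrix[ib]))]
    new_matrix = [merged_row] + [fold(matrix[i]) for i in keep]
    new_labels = [a + "+" + b] + [labels[i] for i in keep]
    return new_matrix, new_labels

# Rows and columns c, d, e of the first table (the other entries in these rows are zero).
labels = ["c", "d", "e"]
matrix = [
    [10.2, 89.8, 0.0],
    [79.2, 20.8, 0.0],
    [0.0, 0.0, 100.0],
]
merged, merged_labels = merge_classes(matrix, labels, "c", "d")
```

Samples of c are mostly assigned to d and vice versa, but never to any other stable, so the merged class c+d is recognised 100% of the time, as in the second table.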


Granarolo

Coop

Mukki Mugello (from the shelf)

We can perfectly discriminate the three brands. Pasteurized Mugello samples from the tank (crosses) fall in the correct cluster.

%                     COOP    GRANAROLO   MUKKI sel. MUGELLO
COOP                  95.9    4.1         0.0
GRANAROLO             8.0     90.2        1.8
MUKKI sel. MUGELLO    0.0     3.3         96.7

60 samples of pasteurized milk produced by three Italian milk brands (20 for each brand, collected on different days).


FEEDING-TYPE 1: SILAGE and HAYS

FEEDING-TYPE 2: HAYS and FLOURS

FEEDING-TYPE 3: SILAGE and FLOURS

Silage, energy and salt supplements: 0%


Metabolomics can be an accurate tool to evaluate the traceability of different milks, allowing the identification of some metabolites that correlate with the components of cow feed (feedomics?)

The ability to discriminate cows fed with silage from those that are not could have important economic applications: the production rules for Parmesan forbid the use of milk from cows fed with silage. We can imagine a new analytical technique able to detect frauds.

Thesis opportunities in METABOLOMICS

at the Magnetic Resonance Center of the University of Florence

Research topic: Metabolomic Analysis of Biological Fluids by Nuclear Magnetic Resonance

The offer is addressed to students preparing their thesis (first-level and/or second-level degree) in Chemistry, Pharmaceutical Chemistry, Biology, Biotechnology

For information: [email protected]