Post on 23-Feb-2016
Metabolomics & Biomarker discovery
Anika Vaarhorst (a.a.m.vaarhorst@lumc.nl), Section of Molecular Epidemiology, Leiden University Medical Centre, Leiden, The Netherlands
What is Metabolomics
• The nonbiased identification and quantification of all metabolites in a biological system
• Metabolites are the biochemicals of < 2000 Dalton found in biological fluids, including lipids, sugars, nucleotides, amino acids and related amines
• All metabolites combined make up the human metabolome
Why metabolomics
• More than 4000 metabolites can be measured by different platforms in blood. Not all at high throughput yet.
• Blood is the highway for degraded, secreted, discarded and synthesized molecules.
• Blood indicates tissue lesions, organ dysfunction and pathological states
• As an -omics technology, metabolomics is close to biomedical phenotypes.
Epigenome
Pathman.smpdb.ca
Suhre et al. PLoS ONE | November 2010 | Volume 5 | Issue 11
Metabolites marking diabetes in patients
[Figure labels: environment, genotype, metabolome, phenotype]
Wang et al., Nat Med 2011: markers of a 4-fold increased T2D risk: branched-chain amino acids, tyrosine and phenylalanine
Suhre et al., Nature 2011, Genetically Determined Metabotypes: 37 genetic loci accounting for 10-60% variance in level
Administration of branched-chain amino acids increased insulin resistance
Psychogios et al. 2011 PloS One
A step-by-step approach
Biological experiment → (sample extraction, NMR analysis) → Raw data
Raw data → (data preprocessing) → Clean data
Clean data → (data pretreatment) → Data fit for analysis
Data fit for analysis → (data analysis) → Rank the important metabolites
Van den Berg et al. 2006 BMC Genomics
Sample analysis: 1H-NMR spectroscopy
[Figure: cross-section of the NMR magnet; labels: vacuum, liquid nitrogen, liquid helium, coil, core]
The sample is in the tube, which is in the probe, which is in the core of the magnetic field.
Metabolomics, NMR
1, imidazole; 2, urea; 3, D-glucose; 4, L-lactic acid; 5, glycerol; 6, L-glutamine; 7, L-alanine; 8, DSS; 9, glycine; 10, L-glutamic acid; 11, L-valine; 12, L-proline; 13, L-lysine; 14, L-histidine; 15, L-threonine; 16, propylene glycol; 17, L-leucine; 18, L-tyrosine; 19, L-phenylalanine; 20, methanol; 21, creatinine; 22, 3-hydroxybutyric acid; 23, ornithine; 24, L-isoleucine; 25, citric acid; 26, acetic acid; 27, carnitine; 28, 2-hydroxybutyric acid; 29, creatine; 30, betaine; 31, formic acid; 32, isopropyl alcohol; 33, pyruvic acid; 34, choline; 35, acetone; 36, glycerol.
Analyse known variables 50
Data pretreatment
• Check for outliers
• Check for distribution
• Centering
• Scaling
• Transformations
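The pretreatment steps listed above can be sketched in a few lines of plain Python. This is only an illustration: the function names and the three example concentrations are made up, and autoscaling and Pareto scaling are shown as two common scaling choices, not necessarily the ones used in the course.

```python
import math
import statistics


def center(values):
    """Mean-centre: subtract the mean so the variable fluctuates around zero."""
    mean = statistics.mean(values)
    return [v - mean for v in values]


def autoscale(values):
    """Autoscaling: centre, then divide by the sample standard deviation."""
    sd = statistics.stdev(values)
    return [v / sd for v in center(values)]


def pareto_scale(values):
    """Pareto scaling: centre, then divide by the square root of the standard deviation."""
    root_sd = math.sqrt(statistics.stdev(values))
    return [v / root_sd for v in center(values)]


def log_transform(values):
    """Log transformation; only defined for strictly positive values."""
    return [math.log(v) for v in values]


# Hypothetical concentrations of one metabolite in three samples.
concentrations = [2.0, 4.0, 6.0]
print(center(concentrations))        # [-2.0, 0.0, 2.0]
print(autoscale(concentrations))     # [-1.0, 0.0, 1.0]
print(pareto_scale(concentrations))
print(log_transform(concentrations))
```

Autoscaling gives every metabolite the same weight, whereas Pareto scaling keeps part of the original differences in magnitude.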
Data analysis
• Univariate analysis
• Univariate analysis combined with stepwise regression
– multicollinearity
• LASSO regression, elastic net, ridge regression, PLS-DA
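To give a feel for what LASSO regression does, here is a minimal sketch of LASSO fitted by coordinate descent in plain Python. It assumes centred predictors and a centred outcome, minimises (1/2n)·||y − Xb||² + penalty·||b||₁, and uses made-up data; in practice one would use an established statistics package and choose the penalty by cross-validation.

```python
def soft_threshold(value, penalty):
    """Shrink value towards zero; anything within the penalty becomes exactly zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(columns, y, penalty, n_iter=100):
    """Fit LASSO coefficients for centred predictor columns and a centred outcome."""
    n = len(y)
    n_predictors = len(columns)
    coefs = [0.0] * n_predictors
    for _ in range(n_iter):
        for j, column in enumerate(columns):
            # Residual of y after removing the contribution of all other predictors.
            partial = [
                y[i] - sum(coefs[k] * columns[k][i] for k in range(n_predictors) if k != j)
                for i in range(n)
            ]
            rho = sum(column[i] * partial[i] for i in range(n)) / n
            scale = sum(x * x for x in column) / n
            coefs[j] = soft_threshold(rho, penalty) / scale
    return coefs


# Hypothetical centred data: the outcome depends on metabolite_1 only.
metabolite_1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
metabolite_2 = [1.0, -2.0, 0.0, 2.0, -1.0]
outcome = [-6.0, -3.0, 0.0, 3.0, 6.0]

coefs = lasso_coordinate_descent([metabolite_1, metabolite_2], outcome, penalty=1.0)
print(coefs)  # [2.5, 0.0]
```

The coefficient of the irrelevant metabolite is set to exactly zero, and the coefficient of the relevant one is shrunk from 3 to 2.5. This combination of selection and shrinkage is what distinguishes LASSO from ridge regression, which only shrinks.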
Multiple testing
• Bonferroni correction
– 100 tests, each tested at a significance level of 0.05
– Significance threshold after Bonferroni correction: 0.05/100 = 0.0005
– Too conservative for metabolomics (metabolite levels are correlated, so the tests are not independent)
• Replicate your findings in independent studies
• Cross-validation
Storey and Tibshirani 2003, PNAS
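As an illustration of why Bonferroni is called conservative, the sketch below compares it with the Benjamini-Hochberg procedure, which controls the false discovery rate. Note that Benjamini-Hochberg is used here as a simpler stand-in; it is related to, but not the same as, the q-value approach of Storey and Tibshirani. The p-values are invented.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per test: True if significant at false discovery rate alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        # Largest rank k for which the k-th smallest p-value is <= (k / m) * alpha.
        if p_values[index] <= rank / m * alpha:
            cutoff_rank = rank
    significant = [False] * m
    for index in order[:cutoff_rank]:
        significant[index] = True
    return significant


print(bonferroni_threshold(0.05, 100))

# Hypothetical p-values for ten metabolites.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
threshold = bonferroni_threshold(0.05, len(p_values))
n_bonferroni = sum(p <= threshold for p in p_values)
n_fdr = sum(benjamini_hochberg(p_values))
print(n_bonferroni, n_fdr)  # 1 2
```

With these numbers Bonferroni keeps one metabolite and the false discovery rate procedure keeps two.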
Confounding
• Confounder variable: a variable other than the predictor variables that potentially affects the outcome variable
• Prevent confounding:
– Matching
– Stratification
• Controlling for confounding
– Include the known confounders as covariates in your model
[Diagram labels: metabolite, outcome variable, confounder]
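A small, entirely hypothetical example shows how stratification can expose confounding. Statin use is taken as the confounder here purely for illustration: it lowers the metabolite level and is more common among cases, so a crude comparison suggests a case-control difference that disappears within each stratum.

```python
import statistics

# Hypothetical records: (statin_use, group, metabolite_level).
records = [
    ("no", "case", 10.0), ("no", "control", 10.0),
    ("no", "control", 10.0), ("no", "control", 10.0),
    ("yes", "case", 6.0), ("yes", "case", 6.0),
    ("yes", "case", 6.0), ("yes", "control", 6.0),
]


def mean_difference(rows):
    """Mean metabolite level in cases minus mean level in controls."""
    cases = [level for _, group, level in rows if group == "case"]
    controls = [level for _, group, level in rows if group == "control"]
    return statistics.mean(cases) - statistics.mean(controls)


def stratified_differences(rows):
    """Case-control difference computed separately within each level of the confounder."""
    strata = sorted({stratum for stratum, _, _ in rows})
    return {s: mean_difference([r for r in rows if r[0] == s]) for s in strata}


crude = mean_difference(records)
by_stratum = stratified_differences(records)
print(crude)       # -2.0
print(by_stratum)  # {'no': 0.0, 'yes': 0.0}
```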
Problems: Confounding
• Brindle JT et al., 2002. Nat Med. 8(12), 1439-45. → NMR spectroscopy is diagnostic for the occurrence and severity of CAD
• But according to Kirschenlohr et al. 2006. Nat Med. 12(6), 705-10:
– Gender & statin treatment affect the ‘biomarkers’ of disease → groups must be stratified
– NMR analysis of plasma is a weak predictor for CAD
BBMRI Rainbow RP4 Metabolomics
Applying Metabolomics in Dutch cohorts
Reference populations
• Leiden Longevity Study (LLS)
• Netherlands Twin Register (NTR)
• Erasmus Rucphen Family study (ERF)
• Selection based on existing metabolomics data
• Extensive phenotypic data
High throughput / high resolution NMR: LUMC, Deelder et al.
Mass spectrometry: Biocrates platform, Gieger et al.
Mass spectrometry: Nederlands Metabolomics Centre, lipid platform, Hankemeier et al.
             Netherlands Twin Registry   Leiden Longevity Study   Erasmus Rucphen Study
Lipidomics
  Matrix     Citrate plasma              Citrate plasma           Citrate plasma
  Stored at  -30°C                       -80°C                    -80°C
  Fasted     Yes                         No                       Yes
  N          3000                        2201                     3000
1H-NMR
  Matrix     EDTA plasma                 EDTA plasma              Serum
  Stored at  -30°C                       -80°C                    -80°C
  Fasted     Yes                         No                       Yes
  N          3000                        2487                     3000
Biocrates
  Matrix     Serum                       Serum                    Serum
  Stored at  -30°C                       -80°C                    -80°C
  Fasted     Yes                         267 (yes) / 390 (no)     Yes
  N          1900                        657                      994
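327 metabolites measured
[Venn diagram: Biocrates N=163, Lipidomics N=129, 1H-NMR N=52. Measured on one platform only: 146 (Biocrates), 124 (Lipidomics), 40 (1H-NMR). Shared: 5 (Biocrates and Lipidomics), 12 (Biocrates and 1H-NMR).]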
The practical
Long-lived siblings
Offspring of long-lived siblings
Spouses as controls
Which metabolites differ between controls and offspring of long-lived siblings?
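One simple way to start on this question is a univariate comparison per metabolite, ranking metabolites by the size of a two-sample test statistic. The sketch below uses Welch's t statistic as an assumed choice; the metabolite names and values are invented, and it deliberately ignores confounders and multiple testing, which still have to be handled as discussed earlier.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for a difference in means (unequal variances allowed)."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_diff / standard_error


def rank_metabolites(offspring, controls):
    """Sort metabolites by absolute t statistic, largest difference first."""
    scores = {name: welch_t(offspring[name], controls[name]) for name in offspring}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical levels of two metabolites in offspring and in controls.
offspring = {"metabolite_a": [1.0, 2.0, 3.0], "metabolite_b": [1.0, 2.0, 3.0]}
controls = {"metabolite_a": [1.0, 2.0, 3.0], "metabolite_b": [4.0, 5.0, 6.0]}

ranked = rank_metabolites(offspring, controls)
for name, t_value in ranked:
    print(name, round(t_value, 3))
```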