Model-based investigation of bacterial metabolism using gene essentiality data.
PhD defense – Maxime Durot
PhD prepared in the
Computational Systems Biology Group at Genoscope
under the supervision of
Vincent Schachter & Jean Weissenbach
Maxime Durot – PhD defense – October 12, 2009
Motivation & goals of the thesis
Metabolism
[Picture: Roche Applied Science : http://www.expasy.org/tools/pathways/]
Information from two scales
genome metabolism phenotype
molecular scale cellular scale
Mutant phenotyping experiments
[Figure: a wild-type bacterium (genome, gene) with its wild-type growth phenotype, and a knock-out mutant (deleted gene) with its mutant growth phenotype]
Mutant phenotype:
- No growth = gene is essential in the tested environment
- Growth = gene is dispensable in the tested environment
Experiments are performed genome-wide for a growing number of organisms (Gerdes et al, Curr Opin Biotechnol 2006)
Confronting the two scales is complex
Modeling metabolism can help
(Stelling, Curr Opin Microbiol. 2004)
The constraint-based modeling framework
Key concepts:
- variable of interest = reaction fluxes
- constraint-based approach: applying constraints to the model reduces the possible flux distributions
- explore the space of admissible flux distributions
Classical constraints:
- metabolism in steady state: metabolite concentrations remain constant
- some reactions are irreversible
- flux values are bounded by a maximal value
Applicable at genome scale
[Figure: toy network with external metabolites A(ext), B(ext), P(ext), internal metabolites A, B, C, D, P and reactions R1 to R9, shown with an example flux distribution; the admissible flux distributions are drawn in a space with axes v1, v2, v3]
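The classical constraints can be illustrated on a very small example. The sketch below is only an illustration: the four-reaction network, its bounds and the names `S`, `BOUNDS` and `is_admissible` are assumptions made for this example, not the network drawn on the slide. It checks whether a candidate flux distribution satisfies the steady-state, irreversibility and maximal-flux constraints.

```python
# Hypothetical toy network (an assumption for illustration only):
#   R1: (uptake) -> A     irreversible
#   R2: A -> B            irreversible
#   R3: A <-> B           reversible
#   R4: B -> (secretion)  irreversible
# Stoichiometric coefficients of the internal metabolites A and B.
S = {
    "A": {"R1": 1, "R2": -1, "R3": -1},
    "B": {"R2": 1, "R3": 1, "R4": -1},
}

# Flux bounds: a lower bound of 0 encodes irreversibility,
# the upper bound encodes the maximal flux value.
BOUNDS = {"R1": (0, 10), "R2": (0, 10), "R3": (-10, 10), "R4": (0, 10)}


def is_admissible(fluxes, tol=1e-9):
    """Return True if the flux distribution satisfies all constraints."""
    # Steady state: production and consumption of each metabolite balance.
    for coefficients in S.values():
        balance = sum(c * fluxes[r] for r, c in coefficients.items())
        if abs(balance) > tol:
            return False
    # Irreversibility and maximal flux values.
    return all(lb - tol <= fluxes[r] <= ub + tol for r, (lb, ub) in BOUNDS.items())


print(is_admissible({"R1": 1, "R2": 0.5, "R3": 0.5, "R4": 1}))   # balanced, within bounds
print(is_admissible({"R1": 1, "R2": 1, "R3": 0.5, "R4": 1}))     # A is not balanced
print(is_admissible({"R1": 1, "R2": -1, "R3": 2, "R4": 1}))      # R2 runs backwards
```

Each constraint removes candidate flux distributions; what remains is the space of admissible flux distributions that the framework explores.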
Models and gene essentiality datasets
Constraint-based models can predict growth phenotypes for genetic and environmental perturbations (Price et al, Nat Rev Microbiol 2004) (Durot et al, FEMS Microbiol Rev 2009)
Gene essentiality datasets have been used to provide rough assessments of metabolic models (Covert et al, Nature 2004) (Joyce et al, J Bacteriol 2006)
- Compute predictive accuracy for gene essentiality prediction
- List of inconsistencies, used as a starting point for curation
Can gene essentiality datasets be used more systematically for metabolic model assessment & refinement?
Objectives of the thesis
1. Develop a framework for the refinement of metabolic models using gene essentiality data
Context: the Metabolic Thesaurus project
Experimental context:
- Reliable genome annotation (Barbe et al, Nucleic Acids Res 2004)
- Comprehensive knock-out mutant collection (de Berardinis et al, Mol Syst Biol 2008)
- Phenotyping capability: complete conditional essentiality datasets on several media (de Berardinis et al, Mol Syst Biol 2008)
Acinetobacter baylyi ADP1
- γ-proteobacteria, Pseudomonadales group
- Nutritionally versatile, strictly aerobic
- Non-pathogenic
- Evidence of xenobiotic degradation capabilities
Objectives of the thesis
1. Develop a framework for the refinement of metabolic models using gene essentiality data
2. Application to Acinetobacter baylyi metabolism
- reconstruct a global metabolic model from its genome annotation
- assess and refine the model using mutant phenotypes
- point out poorly understood metabolic events requiring further experimental investigation
Outline
A/ A formal framework for comparing predicted and experimental gene essentialities
B/ Reconstruction and refinement of A. baylyi metabolic model using mutant phenotypes
C/ Automated reasoning with metabolic models and essentiality data
A/ A formal framework for comparing predicted and experimental gene essentialities
Model refinement using experimental data
[Figure: iterative refinement loop. An initial metabolic reconstruction yields model predictions, which are compared with experimental results from (large-scale) experiments; model assessment & refinement produces an improved metabolic reconstruction. The loop is repeated with a second set of (large-scale) experiments (refinement step 2).]
Formal representation of a metabolic model
Model refinement using large-scale genetics data requires:
- Computer generation of variants of models
- Understanding the impact of model variations on phenotype predictions
Problem: constraint-based models appear to be complex mathematical objects
An appropriate representation of metabolic models is required to perform automated reasoning with essentiality data
Formal representation of a metabolic model
Boolean gene-reaction associations (GPR)
Boolean rules:
r1: g1
r2: g1 and g2
[Figure: the GPR maps the genetic background to the set of reactions fulfilling the modeling constraints. Example with genes g1, g2, proteins p1, p2, complex c1 and reactions r1, r2. Legend: gene; protein; complex; reaction]
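The two Boolean rules of this slide can be evaluated directly for any genetic background. The sketch below is a minimal illustration; the names `GPR`, `ALL_GENES` and `active_reactions` are chosen for this example only.

```python
# Each GPR is written as a Boolean function of the set of genes that are present.
GPR = {
    "r1": lambda present: "g1" in present,
    "r2": lambda present: "g1" in present and "g2" in present,
}
ALL_GENES = {"g1", "g2"}


def active_reactions(deleted_genes):
    """Reactions that remain active once the given genes are deleted."""
    present = ALL_GENES - set(deleted_genes)
    return {reaction for reaction, rule in GPR.items() if rule(present)}


print(active_reactions([]))       # wild type: both reactions are active
print(active_reactions(["g2"]))   # only r1 remains
print(active_reactions(["g1"]))   # no reaction remains
```

Deleting g2 only inactivates r2, whereas deleting g1 inactivates both reactions because g1 appears in both rules.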
Formal representation of a metabolic model
- Boolean gene-reaction associations (GPR)
- Set of metabolic reactions (NETWORK)
- List of essential biomass precursors (BIOMASS)
[Figure: the GPR maps the genetic background to the set of reactions fulfilling the modeling constraints; these reactions turn the metabolites of the medium into producible metabolites, which are compared with the essential biomass precursors]
Predicting mutant phenotypes
Gene deletion (genetic perturbation) → inactivated reactions (through the GPR) → reduction of producible metabolites space
[Figure: genetic background, GPR, reactions fulfilling the modeling constraints, metabolites of the medium, producible metabolites, essential biomass precursors]
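The chain from gene deletion to predicted phenotype can be sketched on a toy model. Everything below is hypothetical and simplified: the network, medium and gene names are invented, each GPR is reduced to a set of genes that must all be present, and producibility is computed by simple reachability from the medium rather than by a constraint-based computation. A gene is predicted essential when its deletion makes at least one essential biomass precursor non-producible.

```python
def producible(medium, reactions):
    """Metabolites reachable from the medium through (substrates, products) reactions."""
    produced = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= produced and not products <= produced:
                produced |= products
                changed = True
    return produced


# Hypothetical NETWORK: reaction -> (substrates, products)
NETWORK = {
    "r1": ({"glc"}, {"a"}),
    "r2": ({"a"}, {"b"}),
    "r3": ({"a"}, {"b"}),  # alternate route to b
    "r4": ({"b"}, {"c"}),
}
# Hypothetical GPR, simplified to "all listed genes are required".
GPR = {"r1": {"g1"}, "r2": {"g2"}, "r3": {"g3"}, "r4": {"g4"}}
GENES = {"g1", "g2", "g3", "g4"}
MEDIUM = {"glc"}
BIOMASS = {"b", "c"}  # essential biomass precursors


def predict_essential(gene):
    """True if deleting the gene prevents production of a biomass precursor."""
    present = GENES - {gene}
    active = [NETWORK[r] for r in NETWORK if GPR[r] <= present]
    return not BIOMASS <= producible(MEDIUM, active)


for gene in sorted(GENES):
    print(gene, "essential" if predict_essential(gene) else "dispensable")
```

In this toy model g2 and g3 are predicted dispensable because r2 and r3 back each other up, while g1 and g4 are predicted essential.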
Confronting model predictions with experiments
Comparison of predictions with experiments reveals inconsistencies

                              Predictions
                              Essential          Dispensable
Experiments    Essential      True Essential     False Dispensable
               Dispensable    False Essential    True Dispensable
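The four outcomes of this comparison can be computed mechanically. The sketch below uses the category names of the slide; the gene names and their predicted and observed statuses are hypothetical data invented for the example.

```python
from collections import Counter


def classify(predicted_essential, observed_essential):
    """Name the outcome of comparing one prediction with one experiment."""
    if observed_essential:
        return "True Essential" if predicted_essential else "False Dispensable"
    return "False Essential" if predicted_essential else "True Dispensable"


# Hypothetical data: True means essential, False means dispensable.
predictions = {"gA": True, "gB": True, "gC": False, "gD": False}
experiments = {"gA": True, "gB": False, "gC": True, "gD": False}

counts = Counter(classify(predictions[g], experiments[g]) for g in predictions)
inconsistencies = [
    g for g in predictions if predictions[g] != experiments[g]
]
print(counts)
print(inconsistencies)
```

The False Essential and False Dispensable cases are the inconsistencies that serve as starting points for model correction.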
Classifying inconsistencies according to likely cause & correction type

Type of inconsistency: False essential
GPR: decrease impact of gene deletion on reaction set
- add an alternate enzyme
- gene is a non-essential subunit of a complex
- reaction may occur spontaneously
NETWORK: augment reaction set
- add an alternate pathway
BIOMASS: reduce biomass requirements
- remove a biomass precursor

Type of inconsistency: False dispensable
GPR: increase impact of gene deletion on reaction set
- remove an isozyme
- form a complex instead of isozyme
- gene has an additional essential role
NETWORK: reduce reaction set
- remove or block an alternate pathway
BIOMASS: augment biomass requirements
- add a biomass precursor
B/ Reconstruction and refinement of A. baylyi metabolic model using mutant phenotypes
A. baylyi model reconstruction
Two-step process
1. Identify all metabolic reactions occurring in the cell
2. Adapt representation to modeling requirements
1/ Metabolic network reconstruction
2/ Adapt to modeling requirements
Specific developments made for A. baylyi model
- Automated expansion of generic pathways
- Inference of enzyme complexes by homology to E. coli
Initial model reconstruction
[Figure: two pie charts breaking down the model content into metabolic categories: central metabolism, amino acids synthesis, nucleotides synthesis, transport, lipid metabolism, degradation pathways, cofactor synthesis]
859 reactions using 697 metabolites, linked with 787 genes
109 metabolites that are exchangeable with the environment
Evidence supporting the enzymatic function of model genes
Experimental datasets

Dataset 1
Growth phenotypes of wild-type strain on 190 carbon sources
Results:
- Growth on 45 carbon sources
- No growth on remaining 145 carbon sources

Dataset 2
Genome-wide gene essentialities from A. baylyi mutant collection construction
Selection on succinate minimal medium
[Figure: gene essentiality results]
(de Berardinis et al, Mol Syst Biol 2008)

Dataset 3
Growth phenotypes of A. baylyi mutants on 8 defined environments
7 alternate C sources, 1 alternate N source
Quantitative growth measure (OD)
[Figure: two histograms, y-axis: frequency]
Iterative refinement of A. baylyi model
Initial reconstruction from:
• genome annotation
• pathway databases
• literature
iAbaylyi v1
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
Model refinement using dataset 1

                                          iAbaylyi v1        iAbaylyi v2
overall prediction accuracy               86%                91%
correctly predicted carbon sources        24 / 45 (53%)      33 / 45 (73%)
correctly predicted non carbon sources    140 / 145 (97%)    140 / 145 (97%)

Corrected inconsistencies: GPR 0, NETWORK 9, BIOMASS 0
Iterative refinement of A. baylyi model
Initial reconstruction from:
• genome annotation
• pathway databases
• literature
iAbaylyi v1
Model accuracy
• 88% on dataset 1
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v2
Model accuracy
• 91% on dataset 1
Iterative refinement of A. baylyi model
Initial reconstruction from:
• genome annotation
• pathway databases
• literature
iAbaylyi v1
Model accuracy
• 88% on dataset 1
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v2
Model accuracy
• 91% on dataset 1
Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable
Model refinement using dataset 2

                                          iAbaylyi v2        iAbaylyi v3
overall prediction accuracy               88%                94%
correctly predicted essential genes       187 / 251 (75%)    217 / 251 (86%)
correctly predicted dispensable genes     489 / 516 (95%)    495 / 505 (98%)

Corrected inconsistencies: GPR 26, NETWORK 11, BIOMASS 10
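The overall prediction accuracy on this slide is consistent with pooling the essential and dispensable counts. The short sketch below recomputes it from the figures shown; the function name `accuracy` is chosen for this example, and pooling the two classes is an assumption about how the overall figure was obtained.

```python
def accuracy(correct_essential, total_essential, correct_dispensable, total_dispensable):
    """Fraction of genes whose essentiality is correctly predicted."""
    correct = correct_essential + correct_dispensable
    total = total_essential + total_dispensable
    return correct / total


v2 = accuracy(187, 251, 489, 516)  # iAbaylyi v2 on dataset 2
v3 = accuracy(217, 251, 495, 505)  # iAbaylyi v3 on dataset 2
print(f"iAbaylyi v2: {v2:.0%}, iAbaylyi v3: {v3:.0%}")
```

Because dispensable genes are about twice as numerous as essential ones, the overall figure is dominated by the dispensable class, which is why the per-class rates are reported alongside it.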
Iterative refinement of A. baylyi model
Initial reconstruction from:
• genome annotation
• pathway databases
• literature
iAbaylyi v1
Model accuracy
• 88% on dataset 1
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v2
Model accuracy
• 91% on dataset 1
• 88% on dataset 2
Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable
iAbaylyi v3
Model accuracy
• 91% on dataset 1
• 94% on dataset 2
Iterative refinement of A. baylyi model
Initial reconstruction from:
• genome annotation
• pathway databases
• literature
iAbaylyi v1
Model accuracy
• 88% on dataset 1
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v2
Model accuracy
• 91% on dataset 1
• 88% on dataset 2
Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable
iAbaylyi v3
Model accuracy
• 91% on dataset 1
• 94% on dataset 2
Dataset 3: growth phenotypes of A. baylyi mutant collection on 8 minimal media (2350 strains x 8 media), quantitative growth measure
Model refinement using dataset 3

                                                             iAbaylyi v3        iAbaylyi v4
overall prediction accuracy                                  93%                94%
correctly predicted gene phenotypes with ≥ 1 essentiality    16 / 36 (44%)      18 / 36 (50%)
correctly predicted gene phenotypes with no essentiality     406 / 419 (97%)    408 / 416 (98%)

Corrected inconsistencies: GPR 8, NETWORK 1, BIOMASS 0
Iterative refinement of A. baylyi model
Initial reconstruction from:
• genome annotation
• pathway databases
• literature
iAbaylyi v1
Model accuracy
• 88% on dataset 1
Dataset 1: growth phenotypes of wild-type strain on 190 carbon sources (1 strain x 190 media)
iAbaylyi v2
Model accuracy
• 91% on dataset 1
• 88% on dataset 2
Dataset 2: genome-wide gene essentialities from A. baylyi mutant collection construction (3093 strains x 1 medium)
Gene status excerpt: ACIAD0001 NA, ACIAD0002 Essential, ACIAD0003 Dispensable, ACIAD0004 Essential, ACIAD0005 Dispensable, ACIAD0006 Dispensable
iAbaylyi v3
Model accuracy
• 91% on dataset 1
• 94% on dataset 2
• 93% on dataset 3
Dataset 3: growth phenotypes of A. baylyi mutant collection on 8 minimal media (2350 strains x 8 media), quantitative growth measure
iAbaylyi v4
Model accuracy
• 91% on dataset 1
• 94% on dataset 2
• 94% on dataset 3
GPR correction example
ATP phosphoribosyltransferase
ACIAD0661 (hisG) and ACIAD1257 (hisZ) were initially assigned as isozymes of the ATP phosphoribosyltransferase reaction.
The observed essentiality of both genes suggests that both are necessary for the activity.
Further examination of the literature confirms that both proteins form an enzymatic complex (Sissler et al, PNAS 1999)
[Figure: pathway PRPP → phosphoribosyl-ATP → histidine → protein, with the first reaction carrying the initial GPR "ACIAD0661 OR ACIAD1257". Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
GPR correction example
ATP phosphoribosyltransferase
[Figure: pathway PRPP → phosphoribosyl-ATP → histidine → protein, shown before and after correction. Before: GPR "ACIAD0661 OR ACIAD1257". After: GPR "ACIAD0661 AND ACIAD1257". Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
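The effect of this correction on single-deletion predictions can be checked in a few lines. The sketch below encodes the initial (OR) and corrected (AND) rules for the two genes named on the slide; the function names are chosen for this example.

```python
GENES = {"ACIAD0661", "ACIAD1257"}  # hisG and hisZ


def initial_gpr(present):
    """Initial rule: the two genes are treated as isozymes."""
    return "ACIAD0661" in present or "ACIAD1257" in present


def corrected_gpr(present):
    """Corrected rule: the two proteins form a complex."""
    return "ACIAD0661" in present and "ACIAD1257" in present


def genes_whose_deletion_inactivates(gpr):
    """Genes whose single deletion switches the reaction off."""
    return {gene for gene in GENES if not gpr(GENES - {gene})}


print(genes_whose_deletion_inactivates(initial_gpr))
print(genes_whose_deletion_inactivates(corrected_gpr))
```

Under the OR rule no single deletion inactivates the reaction, so both genes are predicted dispensable; under the AND rule either deletion inactivates it, which agrees with the observed essentiality of both genes.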
Network correction example
ACIAD0822-0824 (gatABC) are annotated as an aspartyl/glutamyl-tRNA amidotransferase
gatABC are essential: only way to produce asparagine.
ACIAD1920 (glnS) catalyzes direct charging of glutamine on its tRNA
Essentiality of ACIAD1920 suggests that the gatABC pathway is not effective for glutamine
[Figure: initial network, with charged tRNAs feeding protein synthesis.
glutamine → glutamine-tRNA(gln): ACIAD1920
glutamate → glutamate-tRNA(gln): ACIAD3371 OR ACIAD0272
glutamate-tRNA(gln) → glutamine-tRNA(gln): ACIAD0822 AND ACIAD0823 AND ACIAD0824
aspartate → aspartate-tRNA(asn): ACIAD0609
aspartate-tRNA(asn) → asparagine-tRNA(asn): ACIAD0822 AND ACIAD0823 AND ACIAD0824
Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
Network correction example
[Figure: network before and after correction, with charged tRNAs feeding protein synthesis.
Before:
glutamine → glutamine-tRNA(gln): ACIAD1920
glutamate → glutamate-tRNA(gln): ACIAD3371 OR ACIAD0272
glutamate-tRNA(gln) → glutamine-tRNA(gln): ACIAD0822 AND ACIAD0823 AND ACIAD0824
aspartate → aspartate-tRNA(asn): ACIAD0609
aspartate-tRNA(asn) → asparagine-tRNA(asn): ACIAD0822 AND ACIAD0823 AND ACIAD0824
After (the route through glutamate-tRNA(gln) is removed):
glutamine → glutamine-tRNA(gln): ACIAD1920
aspartate → aspartate-tRNA(asn): ACIAD0609
aspartate-tRNA(asn) → asparagine-tRNA(asn): ACIAD0822 AND ACIAD0823 AND ACIAD0824
Legend: essential gene or reaction; dispensable gene or reaction; biomass precursor]
A. baylyi model refinement
Online prediction of mutant phenotypes
(Le Fèvre et al, Bioinformatics 2009)
C/ Automated reasoning with metabolic models and essentiality data
Automated reasoning on gene-reaction associations
Use phenotypes as specifications for gene-reaction associations
Assume NETWORK and BIOMASS parts of the model are correct
For each inconsistency: search all GPRs compatible with experimental data
1/ Deduce impact scenarios from phenotypes
Equivalent view of gene-reaction associations: deletion impact
Impact (deletion of {G1,…,Gn}) = {R1,…,Rp} inactivated
Key idea: phenotypes of reaction deletions can be predicted
Compatible deletion impacts must follow the rules:
- lethal gene deletions must impact an essential reaction set
- viable gene deletions must not impact any essential reaction set
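The two compatibility rules can be turned into a brute-force enumeration. The sketch below is an illustration under stated assumptions: the candidate gene-reaction links, the essential reaction set and the observed phenotypes are hypothetical, only single-gene deletions are considered, and all names (`CANDIDATES`, `ESSENTIAL_REACTION_SETS`, `compatible_scenarios`) are invented for this example.

```python
from itertools import combinations, product

# Hypothetical candidate links: reactions each gene may impact.
CANDIDATES = {"G1": {"R1"}, "G2": {"R1"}, "G3": {"R1", "R2"}, "G4": {"R2"}}
# Hypothetical predicted reaction essentialities: sets of reactions whose
# joint inactivation is predicted lethal (here, losing R1 alone is lethal).
ESSENTIAL_REACTION_SETS = [{"R1"}]
# Hypothetical observed gene essentialities (True = lethal deletion).
OBSERVED_ESSENTIAL = {"G1": False, "G2": False, "G3": True, "G4": False}


def subsets(items):
    """All subsets of a collection, as sets."""
    items = sorted(items)
    return [set(c) for n in range(len(items) + 1) for c in combinations(items, n)]


def is_lethal(impacted):
    """True if the impacted reactions cover an essential reaction set."""
    return any(essential <= impacted for essential in ESSENTIAL_REACTION_SETS)


def compatible_scenarios():
    """Yield every assignment gene -> impacted reactions obeying both rules."""
    genes = sorted(CANDIDATES)
    options = [subsets(CANDIDATES[g]) for g in genes]
    for choice in product(*options):
        if all(is_lethal(impact) == OBSERVED_ESSENTIAL[g] for g, impact in zip(genes, choice)):
            yield dict(zip(genes, choice))


scenarios = list(compatible_scenarios())
for scenario in scenarios:
    print(scenario)
```

The enumeration grows exponentially with the number of candidate links, so it is only practical for the small groups of genes and reactions involved in a single inconsistency.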
1/ Deduce impact scenarios from phenotypes
For each inconsistency, generate all possible impact scenarios
Closed-world assumption: the set of genes potentially linked to a reaction is known
[Figure: genes G1 to G4 with their observed gene essentialities and reactions R1, R2 with their predicted reaction essentialities, connected by predefined gene-reaction links; successive panels show impact scenarios 1 to 4. Legend: reaction; gene; predefined gene-reaction link; gene/reaction set is essential; gene/reaction set is dispensable]
2/ Implement proposed impacts with GPR
Choose an impact scenario
For each reaction, find Boolean rules implementing the impacts (analogy to logic circuit design)
GPR specificity: no negation rule
- monotonic increasing Boolean function (F(0,0) ≤ F(1,0) ≤ F(1,1))
- constrains the possible implementations
2/ Implement proposed impacts with GPR
Specifications for R1 (scenario 1):
- G1 deletion does not impact R1
- G2 deletion does not impact R1
- G3 deletion does impact R1

Truth table for R1:
G1  G2  G3  GPR
0   0   0
1   0   0
0   1   0
1   1   0   0
0   0   1
1   0   1   1
0   1   1   1
1   1   1

After applying monotony:
G1  G2  G3  GPR
0   0   0   0
1   0   0   0
0   1   0   0
1   1   0   0
0   0   1
1   0   1   1
0   1   1   1
1   1   1   1

[Figure: scenario 1, links between genes G1 to G4 and reactions R1, R2]
2/ Implement proposed impacts with GPR
Multiple solutions

G1  G2  G3  GPR
0   0   0   0
1   0   0   0
0   1   0   0
1   1   0   0
0   0   1   ?
1   0   1   1
0   1   1   1
1   1   1   1

Generate all possible cases:

G1  G2  G3  GPR = G3   GPR = G3 and (G1 or G2)
0   0   0   0          0
1   0   0   0          0
0   1   0   0          0
1   1   0   0          0
0   0   1   1          0
1   0   1   1          1
0   1   1   1          1
1   1   1   1          1
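The two solutions can be recovered by exhaustive search over Boolean functions of three genes. The sketch below is a brute-force illustration, not the AutoGPR implementation: it enumerates all 256 truth tables, keeps the monotone ones that satisfy the three specifications for R1, and reports the states on which the surviving functions disagree. Names such as `SPEC` and `compatible_functions` are chosen for this example.

```python
from itertools import product

# A state is (G1, G2, G3), with 1 = gene present and 0 = gene deleted.
STATES = list(product((0, 1), repeat=3))

# Specifications for R1: value of the GPR after each single deletion.
SPEC = {
    (0, 1, 1): 1,  # G1 deletion does not impact R1
    (1, 0, 1): 1,  # G2 deletion does not impact R1
    (1, 1, 0): 0,  # G3 deletion does impact R1
}


def is_monotone(f):
    """No negation: adding genes can never switch the reaction off."""
    return all(
        f[a] <= f[b]
        for a in STATES
        for b in STATES
        if all(x <= y for x, y in zip(a, b))
    )


def compatible_functions():
    """Yield every monotone truth table that satisfies the specifications."""
    for values in product((0, 1), repeat=len(STATES)):
        f = dict(zip(STATES, values))
        if all(f[state] == value for state, value in SPEC.items()) and is_monotone(f):
            yield f


solutions = list(compatible_functions())
# States on which the compatible functions disagree.
undetermined = [s for s in STATES if len({f[s] for f in solutions}) > 1]
print(len(solutions), undetermined)
```

Exhaustive enumeration is only workable for a handful of genes, since the number of truth tables is 2^(2^n) for n genes.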
2/ Implement proposed impacts with GPR
Multiple solutions
- Generate all possible cases
- Choose closest behavior to the original GPR
- Propose an experiment to fully determine the Boolean rule: {G1, G2} double deletion here (the undetermined row has G1 = 0 and G2 = 0)

G1  G2  G3  GPR
0   0   0   0
1   0   0   0
0   1   0   0
1   1   0   0
0   0   1   ?
1   0   1   1
0   1   1   1
1   1   1   1
Comparing AutoGPR proposals with expert interpretations
Comparison with manual corrections of A. baylyi model

Type of expert interpretation                        Inconsistencies   Having AutoGPR proposals   Corrected using AutoGPR proposal
Model corrected
GPR                                                  34                24 (71%)                   22 (65%)
  activity simultaneously requiring all genes        3                 3 (100%)                   3 (100%)
  isozyme not functional                             22                19 (86%)                   19 (86%)
  gene associated to another essential reaction      1                 0 (0%)                     0 (0%)
  presence of an alternate enzyme                    6                 1 (17%)                    0 (0%)
  spontaneously occurring reaction                   1                 0 (0%)                     0 (0%)
  wrong complex subunit                              1                 1 (100%)                   0 (0%)
NETWORK                                              12                0 (0%)                     0 (0%)
BIOMASS                                              10                0 (0%)                     0 (0%)
Model not corrected
Validated explanation                                6                 0 (0%)                     0 (0%)
Hypothetical explanation                             21                5 (24%)                    0 (0%)
No precise interpretation                            28                3 (11%)                    0 (0%)
Comparing AutoGPR proposals with expert interpretations
Comparison for S. cerevisiae model
- iND750 model predictions compared with gene essentiality data on 8 environments (Duarte et al, Genome Res 2004)
- Inconsistent predictions were manually interpreted (not corrected)

Type of expert interpretation   Inconsistencies   Having AutoGPR proposals
GPR                             21                12 (60%)
NETWORK                         18                0 (0%)
BIOMASS                         41                1 (2%)
Other interpretation            128               12 (9%)
No precise interpretation       29                1 (3%)
Number of generated proposals for A. baylyi
[Figure: histogram. x-axis: number of generated corrections (logarithmic scale, 1 to 1E+06); y-axis: number of cases (0 to 20)]
Reducing complexity
First, simply test the existence of GPR corrections
Impose similar reactions to have similar GPR
Examining corrections across environments
GPR corrections can contradict each other across environments

Model           Environments                   Fraction of cases where AutoGPR corrections are inconsistent across environments
A. baylyi       8 minimal media                3 / 8 (37%)
E. coli         2 minimal media                4 / 22 (18%) (Joyce et al, J Bacteriol 2006)
S. cerevisiae   8 minimal and complex media    21 / 22 (95%) (Duarte et al, Genome Res 2004)

Possible interpretations
- Inconsistencies between experimental conditions
- Error in NETWORK or BIOMASS model components
- GPR are not constant across environments
  - Conditional expression of genes
  - Regulatory interactions intervene
(Durot et al, BMC Syst Biol 2008)
Conclusion & perspectives
Main contributions
Reconstruction of a global metabolic model of A. baylyi
Development of a framework for interpreting inconsistent growth phenotype predictions
Systematic interpretation of A. baylyi mutant phenotypes using its metabolic model
Design of an automated method to reason on GPR corrections from gene essentialities
Perspectives
A. baylyi metabolic model
- Tool to integrate further experimental data: RNA-seq, metabolomics on A. baylyi and mutants
Metabolic model reconstruction
- Automate the reconstruction process from genome annotation
- Systematically assess model correctness using high-throughput experimental data
=> Microme European project to be started
Acknowledgments
Supervisors: Vincent Schachter & Jean Weissenbach
Computational Systems Biology group: François Le Fèvre, Gilles Vieira, Richard Baran*, Pierre-Yves Bourguignon*, Serge Smidtas* (* former members)
Acinetobacter baylyi annotation: Claudine Médigue, David Vallenet, Valérie Barbe, Georges Cohen, Nuria Fonknechten, Annett Kreimeyer
Metabolic Thesaurus experimental work: Marcel Salanoubat, Véronique de Berardinis, Alain Perret, Marielle Besnard, Christophe Lechaplais, Agnès Pinet
Discussion