François Fages Rocquencourt, Sep. 2007
Semantical and Algorithmic Aspects of the Living
François FagesConstraint Programming project-team,
INRIA Paris-Rocquencourt
To tackle the complexity of biological systems, investigate• Programming Theory Concepts• Formal Methods of Circuit and Program Verification• Automated Reasoning Tools
Prototype Implementation in the Biochemical Abstract Machine BIOCHAMmodeling environment available at http://contraintes.inria.fr/BIOCHAM
François Fages Rocquencourt, Sep. 2007
Systems Biology ?
“Systems Biology aims at systems-level understanding [which]
requires a set of principles and methodologies that links the
behaviors of molecules to systems characteristics and functions.”
H. Kitano, ICSB 2000
• Analyze (post-)genomic data produced with high-throughput technologies (stored in databases like GO, KEGG, BioCyc, etc.);
• Integrate heterogeneous data about a specific problem;
• Understand and Predict behaviors or interactions in large networks of genes and proteins.
Systems Biology Markup Language (SBML) : exchange format for reaction models
François Fages Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
François Fages Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
François Fages Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
2) Models for making predictions : the more abstract the better !
François Fages Rocquencourt, Sep. 2007
Issue of Abstraction
Models are built in Systems Biology with two contradictory perspectives :
1) Models for representing knowledge : the more concrete the better
2) Models for making predictions : the more abstract the better !
These perspectives can be reconciled by organizing formalisms and models into hierarchies of abstractions.
To understand a system is not to know everything about it but to know
abstraction levels that are sufficient for answering questions about it
François Fages Rocquencourt, Sep. 2007
Semantics of Living Processes ?
Formally, “the” behavior of a system depends on our choice of observables.
? ?
Mitosis movie [Lodish et al. 03]
François Fages Rocquencourt, Sep. 2007
Boolean Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Presence/absence of molecules
Boolean transitions
0 1
François Fages Rocquencourt, Sep. 2007
Continuous Differential Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Concentrations of molecules
Rates of reactions
x ý
François Fages Rocquencourt, Sep. 2007
Stochastic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Numbers of molecules
Probabilities of reaction
n
François Fages Rocquencourt, Sep. 2007
Temporal Logic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Presence/absence of molecules
Temporal logic formulas
F xF x
F (x ^ F ( x ^ y))
FG (x v y)
…
François Fages Rocquencourt, Sep. 2007
Constraint Temporal Logic Semantics
Formally, “the” behavior of a system depends on our choice of observables.
Concentrations of molecules
Constraint LTL temporal formulas
F x>1F (x >0.2)
F (x >0.2 ^ F (x<0.1 ^ y>0.2))
FG (x>0.2 v y>0.2)
…
François Fages Rocquencourt, Sep. 2007
Language-based Approaches to Cell Systems Biology
Qualitative models: from diagrammatic notation to• Boolean networks [Kaufman 69, Thomas 73]
• Petri Nets [Reddy 93, Chaouiya 05]
• Process algebra π–calculus [Regev-Silverman-Shapiro 99-01, Nagasali et al. 00] • Bio-ambients [Regev-Panina-Silverman-Cardelli-Shapiro 03]
• Pathway logic [Eker-Knapp-Laderoute-Lincoln-Meseguer-Sonmez 02]
• Reaction rules [Chabrier-Fages 03] [Chabrier-Chiaverini-Danos-Fages-Schachter 04]
Quantitative models: from ODEs and stochastic simulations to• Hybrid Petri nets [Hofestadt-Thelen 98, Matsuno et al. 00]
• Hybrid automata [Alur et al. 01, Ghosh-Tomlin 01] HCC [Bockmayr-Courtois 01]
• Stochastic π–calculus [Priami et al. 03] [Cardelli et al. 06]
• Reaction rules with continuous time dynamics [Fages-Soliman-Chabrier 04]
François Fages Rocquencourt, Sep. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems 1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2. Temporal Logic Language for Formalizing Biological Properties1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3. Automated Reasoning Tools1. Inferring kinetic parameter values from Constraint-LTL
specification
2. Inferring reaction rules from CTL specification
L. Calzone, N. Chabrier, F. Fages, S. Soliman. TCSB VI, LNBI 4220:68-94. 2006.
François Fages Rocquencourt, Sep. 2007
Molecules of the living
Small molecules: covalent bonds 50-200 kcal/mol
• 70% water
• 1% ions
• 6% amino acids (20), nucleotides (5),
• fats, sugars, ATP, ADP, …
Macromolecules: hydrogen bonds, ionic, hydrophobic, Waals 1-5 kcal/mol
• 20% proteins (50-104 amino acids)
• RNA (102-104 nucleotides AGCU)
• DNA (102-106 nucleotides AGCT)
François Fages Rocquencourt, Sep. 2007
Structure Levels of Proteins
1) Primary structure: word of n amino acids residues (20n possibilities)
linked with C-N bonds
Example: INRIA
Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
François Fages Rocquencourt, Sep. 2007
Structure Levels of Proteins
1) Primary structure: word of n amino acids residues (20n possibilities)
linked with C-N bonds
Example: INRIA
Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
2) Secondary: word of m helix, strands, random coils,… (3m-10m)
stabilized by hydrogen bonds H---O
François Fages Rocquencourt, Sep. 2007
Structure Levels of Proteins
1) Primary structure: word of n amino acids residues (20n possibilities)
linked with C-N bonds
Example: INRIA
Isoleucine-asparagiNe-aRginine-Isoleucine-Alanine
2) Secondary: word of m helix, strands, random coils,… (3m-10m)
stabilized by hydrogen bonds H---O
3) Tertiary 3D structure: spatial folding
stabilized by
hydrophobic
interactions
François Fages Rocquencourt, Sep. 2007
Syntax of proteins
Cyclin dependent kinase 1 Cdk1
(free, inactive)
Complex Cdk1-Cyclin B Cdk1–CycB
(low activity)
Phosphorylated form Cdk1~{thr161}-CycB
at site threonine 161
(high activity)
mitosis promotion factor
François Fages Rocquencourt, Sep. 2007
BIOCHAM Syntax of Objects
E == compound | E-E | E~{p1,…,pn}
Compound: molecule, #gene binding site, abstract @process…
- : binding operator for protein complexes, gene binding sites, …
Associative and commutative.
~{…}: modification operator for phosphorylated sites, …
Set of modified sites (Associative, Commutative, Idempotent).
O == E | E::location
Location: symbolic compartment (nucleus, cytoplasm, membrane, …)
S == _ | O+S
+ : solution operator (Associative, Commutative, Neutral _)
François Fages Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
François Fages Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
François Fages Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _
(not for cycE-cdk2 which is stable)
François Fages Rocquencourt, Sep. 2007
Elementary Reaction Rules
Complexation: A + B => A-B Decomplexation A-B => A + B cdk1+cycB => cdk1–cycB
Phosphorylation: A =[C]=> A~{p} Dephosphorylation A~{p} =[C]=> A
Cdk1-CycB =[Myt1]=> Cdk1~{thr161}-CycB
Cdk1~{thr14,tyr15}-CycB =[Cdc25~{Nterm}]=> Cdk1-CycB
Synthesis: _ =[C]=> A. Degradation: A =[C]=> _. _ =[#E2-E2f13-Dp12]=> CycA cycE =[@UbiPro]=> _
(not for cycE-cdk2 which is stable)
Transport: A::L1 => A::L2Cdk1~{p}-CycB::cytoplasm => Cdk1~{p}-CycB::nucleus
François Fages Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R (import/export SBML format)
In SBML : no semantics (exchange format)
François Fages Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R (import/export SBML format)
In SBML : no semantics (exchange format)
In BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules 1. Concurrent Transition System (asynchronous, non-
deterministic)
François Fages Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R (import/export SBML format)
In SBML : no semantics (exchange format)
In BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules 1. Concurrent Transition System (asynchronous, non-
deterministic)
2. Differential Semantics: concentration• Ordinary Differential Equations or Hybrid system
(deterministic)
François Fages Rocquencourt, Sep. 2007
From Syntax to Semantics
R ::= S=>S | S =[O]=> S | S <=> S | S <=[O]=> S where A =[C]=> B stands for A+C => B+C
A <=> B stands for A=>B and B=>A, etc.
| kinetic for R (import/export SBML format)
In SBML : no semantics (exchange format)
In BIOCHAM : three abstraction levels
1. Boolean Semantics: presence-absence of molecules 1. Concurrent Transition System (asynchronous, non-deterministic)
2. Differential Semantics: concentration 1. Ordinary Differential Equations or Hybrid system (deterministic)
• Stochastic Semantics: number of molecules • Continuous time Markov chain
François Fages Rocquencourt, Sep. 2007
1. Differential Semantics
Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1
volume of diffusion …
François Fages Rocquencourt, Sep. 2007
1. Differential Semantics
Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1
volume of compartment
Compiles a set of rules { ei for Si=>S’I }i=1,…,n (by default ei is MA(1))
into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak}
dA/dt = Σni=1 ri(A)*ei - Σn
j=1 lj(A)*ej
where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i)
multiplied by the volume ratio of the location of A.
François Fages Rocquencourt, Sep. 2007
1. Differential Semantics
Associates to each molecule its concentration [Ai]= | Ai| / volume ML-1
volume of compartment
Compiles a set of rules { ei for Si=>S’I }i=1,…,n (by default ei is MA(1))
into the system of ODEs (or hybrid automaton) over variables {A1,…,Ak}
dA/dt = Σni=1 ri(A)*ei - Σn
j=1 lj(A)*ej
where ri(A) (resp. li(A)) is the stoichiometric coefficient of A in Si (resp. S’i)
multiplied by the volume ratio of the location of A.
volume_ratio (15,n),(1,c).
mRNAcycA::n <=> mRNAcycA::c.
means 15*Vn = Vc and is equivalent to 15*mRNAcycA::n <=> mRNAcycA::c.
François Fages Rocquencourt, Sep. 2007
Numerical Integration
• Adaptive step size 4th order Runge-Kutta can be weak for stiff systems
• Rosenbrock implicit method using the Jacobian matrix ∂x’ i/∂xj
computes a (clever) discretization of time
and a time series of concentrations and their derivatives
(t0, X0, dX0/dt), (t1, X1, dX1/dt), …, (tn, Xn, dXn/dt), …
at discrete time points
François Fages Rocquencourt, Sep. 2007
2. Stochastic Semantics
Associate to each molecule its number |Ai| in its location of volume Vi
François Fages Rocquencourt, Sep. 2007
2. Stochastic Semantics
Associate to each molecule its number |Ai| in its location of volume Vi
Compile the rule set into a continuous time Markov chain
over vector states X=(|A1|,…, |Ak|)
and where the transition rate τi for the reaction ei for Si=>S’I
(giving probability after normalization) is obtained from ei by replacing concentrations by molecule numbers
François Fages Rocquencourt, Sep. 2007
2. Stochastic Semantics
Associate to each molecule its number |Ai| in its location of volume Vi
Compile the rule set into a continuous time Markov chain
over vector states X=(|A1|,…, |Ak|)
and where the transition rate τi for the reaction ei for Si=>S’I
(giving probability after normalization) is obtained from ei by replacing concentrations by molecule numbers
Stochastic simulation [Gillespie 76, Gibson 00]
computes realizations as time series (t0, X0), (t1, X1), …, (tn, Xn), …
François Fages Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in its location
François Fages Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in its location
Compile the rule set into an asynchronous transition system
François Fages Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in its location
Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants:
A+BA+B+C+D
A+BA+B +C+D
A+BA+B+C+D
A+BA+B+C+D
François Fages Rocquencourt, Sep. 2007
3. Boolean Semantics
Associate to each molecule a Boolean denoting its presence/absence in its location
Compile the rule set into an asynchronous transition system where a reaction like A+B=>C+D is translated into 4 transition rules taking into account the possible complete consumption of reactants:
A+BA+B+C+D
A+BA+B +C+D
A+BA+B+C+D
A+BA+B+C+D
Necessary for over-approximating the possible behaviors under the stochastic/discrete semantics (abstraction N {zero, non-zero})
François Fages Rocquencourt, Sep. 2007
Hierarchy of Semantics
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Theory of abstract Interpretation
[Cousot Cousot POPL’77]
[Fages Soliman TCSc’07]
Syntactical
model
Models for answering queries:
The more abstract the better
Optimal abstraction w.r.t. query
François Fages Rocquencourt, Sep. 2007
Query: what are the stationary states ?
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Syntactical
model
Jacobian circuit analysis
Discrete circuit analysis
Boolean circuit analysisabstraction
abstraction
abstraction
Positive circuits are a
necessary condition
for multistationarity
[Thomas 73] [de Jong 02]
[Soulé 03]
[Remy Ruet Thieffry 05]
François Fages Rocquencourt, Sep. 2007
Type Inference / Type Checking
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Syntactical
model
[Fages Soliman CMSB’06]
Influence graph of proteins
Protein functions
(kinase, phosphatase,…)
Compartments topology
François Fages Rocquencourt, Sep. 2007
Type Inference / Type Checking
Stochastic model
Differential model
Discrete model
abstraction
concretization
Boolean model
Syntactical
model
[Fages Soliman CMSB’06]
Influence graph of proteins
Protein functions
(kinase, phosphatase,…)
Compartments topology
Influence graph of proteins
(activation/inhibition)
François Fages Rocquencourt, Sep. 2007
Cell Cycle: G1 DNA Synthesis G2 Mitosis
G1: CdK4-CycD S: Cdk2-CycA G2,M: Cdk1-CycA
Cdk6-CycD Cdk1-CycB (MPF)
Cdk2-CycE
François Fages Rocquencourt, Sep. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(K7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages Rocquencourt, Sep. 2007
Interaction Graph
François Fages Rocquencourt, Sep. 2007
Stochastic Simulation
François Fages Rocquencourt, Sep. 2007
Differential Simulation
François Fages Rocquencourt, Sep. 2007
Boolean Simulation
François Fages Rocquencourt, Sep. 2007
François Fages Rocquencourt, Sep. 2007
Mammalian Cell Cycle Control Map [Kohn 99]
François Fages Rocquencourt, Sep. 2007
Transcription of Kohn’s Map
_ =[ E2F13-DP12-gE2 ]=> cycA....cycB =[ APC~{p1} ]=>_.cdk1~{p1,p2,p3} + cycA => cdk1~{p1,p2,p3}-cycA.cdk1~{p1,p2,p3} + cycB => cdk1~{p1,p2,p3}-cycB....cdk1~{p1,p3}-cycA =[ Wee1 ]=> cdk1~{p1,p2,p3}-cycA.cdk1~{p1,p3}-cycB =[ Wee1 ]=> cdk1~{p1,p2,p3}-cycB.cdk1~{p2,p3}-cycA =[ Myt1 ]=> cdk1~{p1,p2,p3}-cycA.cdk1~{p2,p3}-cycB =[ Myt1 ]=> cdk1~{p1,p2,p3}-cycB....cdk1~{p1,p2,p3} =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}.cdk1~{p1,p2,p3}-cycA =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}-cycA.cdk1~{p1,p2,p3}-cycB =[ cdc25C~{p1,p2} ]=> cdk1~{p1,p3}-cycB.
165 proteins and genes, 500 variables, 800 rules [Chiaverini Danos 02]
François Fages Rocquencourt, Sep. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems 1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2.2. Temporal Logic Language for Formalizing Biological PropertiesTemporal Logic Language for Formalizing Biological Properties1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3. Automated Reasoning Tools1. Inferring kinetic parameter values from Constraint-LTL
specification
2. Inferring reaction rules from CTL specification
François Fages Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
Formalize properties of the biological system in:
• Computation Tree Logic CTL for the boolean semantics
• Linear Time Logic with numerical constraints for the concentration semantics
• Probabilistic CTL with numerical constraints for the stochastic semantics
Evaluate the formulas by model checking techniques
[Lincoln et al. PSB’02] [Chabrier Fages CMSB’03] [Bernot et al. TCS’04] …
François Fages Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model: BIOCHAM
- Boolean - simulation
- Differential
- Stochastic
(SBML)
François Fages Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model: BIOCHAM Biological Properties:
- Boolean - simulation - CTL
- Differential - query evaluation - LTL with constraints
- Stochastic - PCTL with constraints
(SBML)
François Fages Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model: BIOCHAM Biological Properties:
- Boolean - simulation - CTL
- Differential - query evaluation - LTL with constraints
- Stochastic - PCTL with constraints
(SBML)
François Fages Rocquencourt, Sep. 2007
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
In the Biochemical Abstract Machine environment,
Model: BIOCHAM Biological Properties:
- Boolean - simulation - CTL
- Differential - query evaluation - LTL with constraints
- Stochastic - rule learning - PCTL with constraints
(SBML) - parameter search
François Fages Rocquencourt, Sep. 2007
2.1 Computation Tree Logic CTL
Extension of propositional (or first-order) logic with operators for time and choices [Clarke et al. 99]
Time
Non-determinism E, A
F,G,U EF
EU
AG
Choice
Time
E
exists
A
always
X
next time
EX(f)
¬ AX(¬ f)
AX(f)
F
finally
EF(f)
¬ AG(¬ f)
AF(f)
G
globally
EG(f)
¬ AF(¬ f)
AG(f)
U
untilE (f1 U f2) A (f1 U f2)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing by another set of states s2? EF(s2^EFs)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing by another set of states s2? EF(s2^EFs)
• Is it possible to produce P without Q? E(Q U P)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing by another set of states s2? EF(s2^EFs)
• Is it possible to produce P without Q? E(Q U P)• Is (set of) state s2 a necessary checkpoint for reaching (set of) state s?
checkpoint(s2,s)== E(s2U s)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (1/3)
About reachability:
• Can the cell produce some protein P? reachable(P)==EF(P)
• Can the cell produce P, Q and not R? reachable(P^Q^R)
• Can the cell always produce P? AG(reachable(P))
About pathways:
• Can the cell reach a (partially described) set of states s while passing by another set of states s2? EF(s2^EFs)
• Is it possible to produce P without Q? E(Q U P)• Is (set of) state s2 a necessary checkpoint for reaching (set of) state s?
checkpoint(s2,s)== E(s2U s)
• Is s2 always a checkpoint for s? AG(s -> checkpoint(s2,s))
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
• Can the cell reach a stable state s? EF(stable(s)) not in LTL
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
• Can the cell reach a stable state s? EF(stable(s)) not in LTL
• Must the cell reach a stable state s? AG(stable(s))
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (2/3)
About stationarity:
• Is a (set of) state s a stable state? stable(s)== AG(s)
• Is s a steady state (with possibility of escaping) ? steady(s)==EG(s)
• Can the cell reach a stable state s? EF(stable(s)) not in LTL
• Must the cell reach a stable state s? AG(stable(s))
• What are the stable states? Not expressible in CTL. Needs to combine CTL with search (e.g. constraint programming [Thieffry et al. 05] )
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (3/3)
About oscillations:
• Can the system exhibit a cyclic behavior w.r.t. the presence of P ? oscil(P)== EG(F P ^ F P)
CTL* formula that can be approximated in CTL by
oscil(P)== EG((P EF P) ^ (P EF P))
(necessary but not sufficient condition for oscillation)
François Fages Rocquencourt, Sep. 2007
Biological Properties formalized in CTL (3/3)
About oscillations:
• Can the system exhibit a cyclic behavior w.r.t. the presence of P ? oscil(P)== EG((P EF P) ^ (P EF P))
(necessary but not sufficient condition)
• Can the system loops between states s and s2 ?
loop(P,Q)== EG((s EF s2) ^ (s2 EF s))
François Fages Rocquencourt, Sep. 2007
Symbolic Model-Checking
Still for finite Kripke structures, use boolean constraints to represent
1. sets of states as a boolean constraint c(V)
2. the transition relation as a boolean constraint r(V,V’)
Binary Decision Diagrams BDD [Bryant 85] provide canonical forms to Boolean formulas (decide Boolean equivalence)
(x⋁¬y)⋀(y⋁¬z)⋀(z⋁¬x)
and
(x⋁¬z)⋀(z⋁¬y)⋀(y⋁¬x)
are equivalent, they
have the same BDD(x,y,z)
François Fages Rocquencourt, Sep. 2007
Mammalian Cell Cycle Control Map [Kohn 99]
François Fages Rocquencourt, Sep. 2007
Cell Cycle Model-Checking (with BDD NuSMV)
biocham: check_reachable(cdk46~{p1,p2}-cycD~{p1}). Ei(EF(cdk46~{p1,p2}-cycD~{p1})) is truebiocham: check_checkpoint(cdc25C~{p1,p2}, cdk1~{p1,p3}-cycB). Ai(!(E(!(cdc25C~{p1,p2}) U cdk1~{p1,p3}-cycB))) is truebiocham: nusmv(Ai(AG(!(cdk1~{p1,p2,p3}-cycB) -> checkpoint(Wee1, cdk1~{p1,p2,p3}-cycB))))). Ai(AG(!(cdk1~{p1,p2,p3}-cycB)->!(E(!(Wee1) U cdk1~{p1,p2,p3}-cycB)))) is falsebiocham: why.-- Loop starts here cycB-cdk1~{p1,p2,p3} is present cdk7 is present cycH is present cdk1 is present Myt1 is present cdc25C~{p1} is presentrule_114 cycB-cdk1~{p1,p2,p3}=[cdc25C~{p1}]=>cycB-cdk1~{p2,p3}. cycB-cdk1~{p2,p3} is present cycB-cdk1~{p1,p2,p3} is absentrule_74 cycB-cdk1~{p2,p3}=[Myt1]=>cycB-cdk1~{p1,p2,p3}. cycB-cdk1~{p2,p3} is absent cycB-cdk1~{p1,p2,p3} is present
François Fages Rocquencourt, Sep. 2007
Mammalian Cell Cycle Control Benchmark
500 variables, 2500 states. 800 rules.
BIOCHAM NuSMV model-checker time in sec. [Chabrier et al. TCS 04]
Initial state G2 Query: Time:
compiling 29 s
Reachability G1 EF CycE 2 s
Reachability G1 EF CycD 1.9 s
Reachability G1 EF PCNA-CycD 1.7 s
Checkpoint
for mitosis complex
EF ( Cdc25~{Nterm}
U Cdk1~{Thr161}-CycB)
2.2 s
Oscillation EG ( (CycA => EF CycA) ^
( CycA => EF CycA))
31.8 s
François Fages Rocquencourt, Sep. 2007
2.2 LTL with Constraints for the Differential Semantics
• Constraints over concentrations and derivatives as FOL formulae over the reals:
• [M] > 0.2
• [M]+[P] > [Q]
• d([M])/dt < 0
François Fages Rocquencourt, Sep. 2007
LTL with Constraints for the Differential Semantics
• Constraints over concentrations and derivatives as FOL formulae over the reals:
• [M] > 0.2
• [M]+[P] > [Q]
• d([M])/dt < 0
• Linear Time Logic LTL operators for time X, F, U, G• F([M]>0.2)
• FG([M]>0.2)
• F ([M]>2 & F (d([M])/dt<0 & F ([M]<2 & d([M])/dt>0 & F(d([M])/dt<0))))
• oscil(M,n) defined as at least n alternances of sign of the derivative
• Period(A,75)= t v F(T = t & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)
& F(T = t + 75 & [A] = v & d([A])/dt > 0 & X(d([A])/dt < 0)))…
François Fages Rocquencourt, Sep. 2007
How to Evaluate a Constraint LTL Formula ?
• Consider the ODE’s of the concentration semantics dX/dt = f(X)
François Fages Rocquencourt, Sep. 2007
How to Evaluate a Constraint LTL Formula ?
• Consider the ODE’s of the concentration semantics dX/dt = f(X)
• Numerical integration methods produce a discretization of time (adaptive step size Runge-Kutta or Rosenbrock method for stiff syst.)
• The trace is a linear Kripke structure:
(t0,X0,dX0/dt), (t1,X1,dX1/dt), …, (tn,Xn,dXn/dt), …
over concentrations and their derivatives at discrete time points
• Evaluate the formula on that Kripke structure with a model checking alg.
François Fages Rocquencourt, Sep. 2007
Simulation-Based Constraint LTL Model Checking
Hypothesis 1: the initial state is completely known
Hypothesis 2: the formula can be checked over a finite period of time [0,T]
1. Run the numerical integration from 0 to T producing values at a finite sequence of time points
2. Iteratively label the time points with the sub-formulae of that are true:
Add to the time points where a FOL formula is true,
Add F (X ) to the (immediate) previous time points labeled by Add U to the predecessor time points of while they satisfy (Add G to the states satisfying until T)
Model checker and numerical integration methods implemented in Prolog
François Fages Rocquencourt, Sep. 2007
Constraint-LTL Instanciation Algo. [Fages Rizk CMSB’07]
François Fages Rocquencourt, Sep. 2007
2.3 PCTL Model Checker for the Stochastic Semantics
Compute the probability of realisation of a TL formula (with constraints) by Monte Carlo method
Perform several stochastic simulations
Evaluate the probability of realization of the TL formula
Costly…
PRISM [Kwiatkowska et al. 04] : PCTL model checker based on BDDs or using Monte Carlo method.
François Fages Rocquencourt, Sep. 2007
Overview of the Talk
1. Rule-based Modeling of Biochemical Systems 1. Syntax of molecules, compartments and reactions
2. Semantics at three abstraction levels: boolean, differential, stochastic
3. Cell cycle control models
2. Temporal Logic Language for Formalizing Biological Properties1. CTL for the boolean semantics
2. Constraint LTL for the differential semantics
3. PCTL for the stochastic semantics
3.3. Automated Reasoning ToolsAutomated Reasoning Tools1. Inferring kinetic parameter values from Constraint-LTL
specification
2. Inferring reaction rules from CTL specification
François Fages Rocquencourt, Sep. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(K7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages Rocquencourt, Sep. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3),150).
François Fages Rocquencourt, Sep. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3),150).
First values found :
parameter(k3,10).
parameter(k4,70).
François Fages Rocquencourt, Sep. 2007
3.1 Inferring Parameters from Temporal Properties
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
oscil(Cdc2-Cyclin~{p1},3) & F([Cdc2-Cyclin~{p1}]>0.15), 150).
First values found :
parameter(k3,10).
parameter(k4,120).
François Fages Rocquencourt, Sep. 2007
3.1 Inferring Parameters from LTL Specification
biocham: learn_parameter([k3,k4],[(0,200),(0,200)],20,
period(Cdc2-Cyclin~{p1},35), 150).
First values found:
parameter(k3,10).
parameter(k4,280).
François Fages Rocquencourt, Sep. 2007
Leloup and Goldbeter (1999)
MPF preMPF
Wee1
Wee1P
Cdc25
Cdc25PAPC
APC
....
....
........
Cell cycle
Linking the Cell and Circadian Cycles through Wee1
BMAL1/CLOCK
PER/CRY
Circadian cycle
Wee1 mRNA
L [L. Calzone, S. Soliman 2006]
François Fages Rocquencourt, Sep. 2007
PCN
Wee1m
Wee1MPF
BN
Cdc25
François Fages Rocquencourt, Sep. 2007
entrainmententrainment
Condition on Wee1/Cdc25 for the Entrainment in Period
Entrainment in period constraint expressed in LTL with the period formula
François Fages Rocquencourt, Sep. 2007
3.2. Inferring Rules from Temporal Properties
Given
• a BIOCHAM model (background knowledge)
• a set of properties formalized in temporal logic
learn
• revisions of the reaction model, i.e. rules to delete and rules to add such that the revised model satisfies the properties
François Fages Rocquencourt, Sep. 2007
Model Revision from Temporal Properties
• Background knowledge T: BIOCHAM model • reaction rule language: complexation, phosphorylation, …
• Examples φ: biological properties formalized in temporal logic language• Reachability
• Checkpoints
• Stable states
• Oscillations
• Bias R: Reaction rule patterns or parameter ranges• Kind of rules to add or delete
Find a revision T’ of T such that T’ |= φ
François Fages Rocquencourt, Sep. 2007
Model Revision Algorithm
General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.
Anticipate whether one has to add or remove a rule.
• Positive ECTL formula: if false, remains false after removing a rule• EF(φ) where φ is a boolean formula (pure state description)
• Oscil(φ)
• Negative ACTL formula: if false, remains false after adding a rule• AG(φ) where φ is a boolean formula,
• Checkpoint(a,b): ¬E(¬aUb)
• Remove a rule on the path given by the model checker (why command)
• Unclassified CTL formulae
François Fages Rocquencourt, Sep. 2007
Example: Cell Cycle Control Model [Tyson 91]
MA(k1) for _ => Cyclin.
MA(k2) for Cyclin => _.
MA(K7) for Cyclin~{p1} => _.
MA(k8) for Cdc2 => Cdc2~{p1}.
MA(k9) for Cdc2~{p1} =>Cdc2.
MA(k3) for Cyclin+Cdc2~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k4p) for Cdc2~{p1}-Cyclin~{p1} => Cdc2-Cyclin~{p1}.
k4*[Cdc2-Cyclin~{p1}]^2*[Cdc2~{p1}-Cyclin~{p1}] for
Cdc2~{p1}-Cyclin~{p1} =[Cdc2-Cyclin~{p1}] => Cdc2-Cyclin~{p1}.
MA(k5) for Cdc2-Cyclin~{p1} => Cdc2~{p1}-Cyclin~{p1}.
MA(k6) for Cdc2-Cyclin~{p1} => Cdc2+Cyclin~{p1}.
François Fages Rocquencourt, Sep. 2007
Automatic Generation of True CTL PropertiesEi(reachable(Cyclin)))Ei(reachable(!(Cyclin))))Ai(oscil(Cyclin)))Ei(reachable(Cdc2~{p1})))Ei(reachable(!(Cdc2~{p1}))))Ai(oscil(Cdc2~{p1})))Ai(AG(!(Cdc2~{p1})->checkpoint(Cdc2,Cdc2~{p1}))))Ei(reachable(Cdc2-Cyclin~{p1,p2})))Ei(reachable(!(Cdc2-Cyclin~{p1,p2}))))Ai(oscil(Cdc2-Cyclin~{p1,p2})))Ei(reachable(Cdc2-Cyclin~{p1})))Ei(reachable(!(Cdc2-Cyclin~{p1}))))Ai(oscil(Cdc2-Cyclin~{p1})))Ai(AG(!(Cdc2-Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1,p2},Cdc2-Cyclin~{p1})))Ei(reachable(Cdc2)))Ei(reachable(!(Cdc2))))Ai(oscil(Cdc2)))Ei(reachable(Cyclin~{p1})))Ei(reachable(!(Cyclin~{p1}))))Ai(oscil(Cyclin~{p1})))Ai(AG(!(Cyclin~{p1})->checkpoint(Cdc2-Cyclin~{p1},Cyclin~{p1}))))
François Fages Rocquencourt, Sep. 2007
Rule Deletion
biocham: delete_rules(Cdc2 => Cdc2~{p1}).
biocham: check_all.
First formula not satisfied
Ei(EF(Cdc2-Cyclin~{p1}))
François Fages Rocquencourt, Sep. 2007
Model Revision from Temporal Properties
biocham: revise_model.
Rules to delete:
Rules to add:
Cdc2 => Cdc2~{p1}.
François Fages Rocquencourt, Sep. 2007
Model Revision from Temporal Properties
biocham: revise_model.
Rules to delete:
Rules to add:
Cdc2 => Cdc2~{p1}.
biocham: learn_one_addition.
(1) Cdc2 => Cdc2~{p1}.
(2) Cdc2 =[Cdc2]> Cdc2~{p1}.
(3) Cdc2 =[Cyclin]> Cdc2~{p1}.
François Fages Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both qualitative and quantitative biological properties of systems
François Fages Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both qualitative and quantitative biological properties of systems
• Three levels of abstraction in BIOCHAM :
Boolean semantics CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
François Fages Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both qualitative and quantitative biological properties of systems
• Three levels of abstraction in BIOCHAM :
Boolean semantics CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
• Parameter search from temporal properties proved useful and complementary to bifurcation theory tools (Xppaut)
François Fages Rocquencourt, Sep. 2007
Conclusion
• Temporal logic with constraints is powerful enough to express both qualitative and quantitative biological properties of systems
• Three levels of abstraction in BIOCHAM :
Boolean semantics CTL formulas (rule learning)
Differential semantics LTL with constraints over reals (parameter search)
Stochastic semantics Probabilistic CTL with integer constraints
• Parameter search from temporal properties proved useful and complementary to bifurcation theory tools (Xppaut)
• Rule inference from temporal properties still in infancy, to be optimized and improved by types (e.g. protein functions, computed by abstract interpretation)
François Fages Rocquencourt, Sep. 2007
Collaborations
STREP APrIL2 : Luc de Raedt, Univ. Freiburg, Stephen Muggleton, IC ,…
• Learning in a probabilistic logic setting (finished)
ARC MOCA :
• modularity, compositionality and abstraction
NoE REWERSE : semantic web, François Bry, Münich, Rolf Backofen,
• Connecting Biocham to gene and protein ontologies (types)
STREP TEMPO : Cancer chronotherapies, INSERM Villejuif, F. Lévi; Bang: J. Clairambault, Contraintes: S. Soliman
• Coupled models of cell cycle, circadian cycle, cytotoxic drugs.
INRA Tours : E. Reiter, D. Heitzler, Sysiphe : F. Clément
• Model of FSH signalling.