François Fages MPRI Bio-info 2007
Formal Biology of the Cell
Inferring Reaction Rules from Temporal Properties
François Fages, Constraint Programming Group,
INRIA Rocquencourt, mailto:[email protected], http://contraintes.inria.fr/
Overview of the Lectures
1. Introduction. Formal molecules and reactions in BIOCHAM.
2. Formal biological properties in temporal logic. Symbolic model-checking.
3. Continuous dynamics. Kinetic models.
4. Learning kinetic parameter values. Constraint-based model checking.
5. Abstract Interpretation for systems biology I: hierarchy of semantics
6. Abstract Interpretation for systems biology II: types
7. Locations, transport and intercellular signalling
8. Inferring reaction rules from temporal properties
9. …
10. Protein structure prediction in constraint logic programming
A Logical Paradigm for Systems Biology
Biological model = Transition System
Biological property = Temporal Logic Formula
Biological validation = Model-checking
Initial state: experimental conditions, wild-type/mutated organisms, …
Boolean semantics (propositional CTL)
reachable(P) = EF(P)    checkpoint(s2,s) = ¬E(¬s2 U s)
stable(s) = AG(s)    steady(s) = EG(s)
oscil(P) = EG(F ¬P ^ F P)
Differential semantics (constraint LTL)
Reach threshold concentration: F([M]>0.2); on derivative: F(d([M])/dt>0.2)
Reach and stay above threshold: FG([M]>0.2)
oscil(M,n) = F(d([M])/dt>0 & F(d([M])/dt<0 & … )) n times
Learning Model Revision from Temporal Properties
• Theory T: BIOCHAM model
  • molecule declarations
  • reaction rules: complexation, phosphorylation, …
• Examples φ: CTL specification of biological properties
  • Reachability
  • Checkpoints
  • Stable states
  • Oscillations
• Bias R: Rule pattern
  • Kind of rules to add or delete
Find a revision T’ of T such that T’ |= φ
Kripke Semantics of CTL*
Kripke structure K=(S,R) where S is a set of states and R ⊆ S×S is total.
s |= φ if the propositional formula φ is true in s,
s |= Eψ if there is a path π from s such that π |= ψ,
s |= Aψ if for every path π from s, π |= ψ,
π |= φ if s |= φ where s is the starting state of π,
π |= Xψ if π^1 |= ψ,
π |= ψ U ψ’ iff there exists k ≥ 0 such that π^k |= ψ’ and for all j < k, π^j |= ψ,
π |= ψ W ψ’ iff for all j, π^j |= ψ, or there exists k ≥ 0 such that π^k |= ψ’ and for all j < k, π^j |= ψ,
Fψ = (true U ψ): π |= Fψ if there exists k ≥ 0 such that π^k |= ψ,
Gψ = (ψ W false): π |= Gψ if for every k ≥ 0, π^k |= ψ.
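The semantics above can be turned into a small explicit-state checker for the CTL operators used in these slides. The following is only a sketch under stated assumptions: states are arbitrary hashable values, R is a set of (source, target) pairs assumed to be total, and the function names (pre_exists, check_EF, check_EU, check_EG, check_AG) are illustrative, not BIOCHAM commands. Each function returns the set of states satisfying the formula, computed as a least or greatest fixpoint.

```python
def pre_exists(R, X):
    """States that have at least one successor in X."""
    return {s for (s, t) in R if t in X}


def check_EU(R, P, Q):
    """E(P U Q): least fixpoint of Z = Q or (P and pre(Z))."""
    Z = set(Q)
    while True:
        new = Z | (set(P) & pre_exists(R, Z))
        if new == Z:
            return Z
        Z = new


def check_EF(S, R, P):
    """EF P = E(true U P)."""
    return check_EU(R, S, P)


def check_EG(R, P):
    """EG P: greatest fixpoint of Z = P and pre(Z) (R is assumed total)."""
    Z = set(P)
    while True:
        new = Z & pre_exists(R, Z)
        if new == Z:
            return Z
        Z = new


def check_AG(S, R, P):
    """AG P = not EF(not P), using the duality between E and A."""
    return set(S) - check_EF(S, R, set(S) - set(P))


# Toy structure: 0 <-> 1, 1 -> 2, 2 -> 2.
S = {0, 1, 2}
R = {(0, 1), (1, 0), (1, 2), (2, 2)}
print(check_EF(S, R, {2}))        # states from which state 2 is reachable
print(check_EU(R, {0, 2}, {2}))   # E(not 1 U 2): reaching 2 while avoiding 1
```

On this toy structure, state 0 does not satisfy E(¬1 U 2), so the checkpoint formula ¬E(¬1 U 2) holds in state 0: every path from 0 to 2 goes through 1.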
Duality in CTL*
¬Eψ = A¬ψ
¬Xψ = X¬ψ
¬(ψ U ψ’) = ¬ψ’ W (¬ψ ∧ ¬ψ’)
¬Fψ = G¬ψ
CTL*(X) : fragment of CTL* without U, W, F, G
CTL*(U) : fragment of CTL* without X
CTL : fragment of CTL* with E, A immediately before X, F, G, U, W. A CTL formula φ can be identified with the set of states where it is true: φ ~ {s ∈ S : s |= φ}
LTL : fragment of CTL* without E, A
LTL(U) : fragment of LTL without X
LTL(F) : fragment of LTL without X, U, W
Complexity of Model-checking and Satisfiability
Model-checking: given an explicit Kripke structure K and a formula φ, does K,s |= φ ?
Satisfiability: given a formula φ, does there exist a structure K,s such that K,s |= φ ?

Logic          Model-checking     Satisfiability
LTL, LTL(U)    Pspace complete    Pspace complete
LTL(F)         NP-complete        NP-complete
CTL            Ptime              DetExpTime complete
CTL*           Pspace complete    DetExpExpTime complete
Simple Model of Cell Cycle Control
[Tyson et al. 91] model over 6 variables, initial state present(cdc2).
_=>Cyclin.
Cyclin=>_.
Cyclin+Cdc2~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2-Cyclin~{p1,p2}=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2-Cyclin~{p1}=>Cyclin~{p1}+Cdc2.
Cyclin~{p1}=>_.
Cdc2=>Cdc2~{p1}.
Cdc2~{p1}=>Cdc2.
(Aut. Generated) CTL Specification of the Model
biocham: add_genCTL.
reachable(Cyclin).
reachable(!(Cyclin)).
oscil(Cyclin).
reachable(Cdc2~{p1}).
reachable(!(Cdc2~{p1})).
checkpoint(Cdc2, Cdc2~{p1}).
oscil(Cdc2).
…
reachable(Cyclin~{p1}).
reachable(!(Cyclin~{p1})).
oscil(Cyclin~{p1}).
checkpoint(Cdc2-Cyclin~{p1}, Cyclin~{p1}).
Model Compression
biocham: reduce_model.
1: deleting Cyclin=>_
2: deleting Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}
3: deleting Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}
4: deleting Cdc2~{p1}=>Cdc2
After reduction, 6 rules remain corresponding to the bias ? => ?
Deletion(s):
Cyclin=>_.
Cdc2-Cyclin~{p1,p2}=[Cdc2-Cyclin~{p1}]=>Cdc2-Cyclin~{p1}.
Cdc2-Cyclin~{p1}=>Cdc2-Cyclin~{p1,p2}.
Cdc2~{p1}=>Cdc2.
Theory Revision
biocham: delete_rules(Cdc2=>Cdc2~{p1}).
Cdc2=>Cdc2~{p1}
biocham: revise_model.
1: adding Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}
2: adding Cdc2=>_
2: backtracking on previous add -> deleting Cdc2=>_
2: adding Cdc2=[Cyclin]=>_
2: backtracking on previous add -> deleting Cdc2=[Cyclin]=>_
2: adding Cdc2=[Cdc2-Cdc2~{p1}]=>_
3: adding Cdc2=>Cdc2~{p1}
4: deleting Cdc2=[Cdc2-Cdc2~{p1}]=>_
5: deleting Cdc2-Cdc2~{p1}=>Cdc2+Cdc2~{p1}
Modifications found:
Deletion(s):
Addition(s):
Cdc2=>Cdc2~{p1}.
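To see why deleting Cdc2=>Cdc2~{p1} breaks the specification, the Boolean reachability part can be replayed outside BIOCHAM. The sketch below is not BIOCHAM code; it assumes a simplified Boolean semantics in which the products of an applied rule become present and each reactant that is not also a product may or may not be consumed. The helper names (rule, successors, reachable_molecules) and the constants DIMER_P1 and DIMER_P1P2 are introduced here for illustration only.

```python
from itertools import product


def rule(lhs, rhs, cat=()):
    """A rule as (reactants, products); catalysts appear on both sides."""
    return (frozenset(lhs) | frozenset(cat), frozenset(rhs) | frozenset(cat))


def successors(state, rules):
    """Assumed Boolean semantics: products appear, reactants may or may not vanish."""
    out = set()
    for reactants, products in rules:
        if reactants <= state:
            consumable = sorted(reactants - products)
            for choice in product([False, True], repeat=len(consumable)):
                removed = {m for m, gone in zip(consumable, choice) if gone}
                out.add(frozenset((state - removed) | products))
    return out


def reachable_molecules(init, rules):
    """Union of all molecules present in some state reachable from init."""
    seen, todo = {frozenset(init)}, [frozenset(init)]
    while todo:
        for t in successors(todo.pop(), rules):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return set().union(*seen)


DIMER_P1, DIMER_P1P2 = "Cdc2-Cyclin~{p1}", "Cdc2-Cyclin~{p1,p2}"
PHOSPHO = rule(["Cdc2"], ["Cdc2~{p1}"])          # the rule deleted on the slide
MODEL = [
    rule([], ["Cyclin"]), rule(["Cyclin"], []),
    rule(["Cyclin", "Cdc2~{p1}"], [DIMER_P1P2]),
    rule([DIMER_P1P2], [DIMER_P1]),
    rule([DIMER_P1P2], [DIMER_P1], cat=[DIMER_P1]),
    rule([DIMER_P1], [DIMER_P1P2]),
    rule([DIMER_P1], ["Cyclin~{p1}", "Cdc2"]),
    rule(["Cyclin~{p1}"], []),
    PHOSPHO, rule(["Cdc2~{p1}"], ["Cdc2"]),
]
BROKEN = [r for r in MODEL if r != PHOSPHO]

print(sorted(reachable_molecules({"Cdc2"}, MODEL)))
print(sorted(reachable_molecules({"Cdc2"}, BROKEN)))
```

Under these assumptions, reachable(Cdc2~{p1}) fails in the broken model, and re-adding either Cdc2=>Cdc2~{p1} or the catalysed variant Cdc2=[Cyclin]=>Cdc2~{p1} restores it. The oscillation and checkpoint formulae of the specification are not checked by this sketch.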
Search for all Solutions
biocham: learn_one_addition(elementary_interaction_rules).
Time: 5.00 s
Rules tested: 112
Good rules to be added: 2
Cdc2=>Cdc2~{p1}
Cdc2=[Cyclin]=>Cdc2~{p1}
CTL Equivalence of Boolean Models
For a class C of CTL formulae, and an initial state s,
two Kripke structures K=(S,R), K’=(S,R’) are equivalent
K ~C K’ iff {φ ∈ C : K,s |= φ} = {φ ∈ C : K’,s |= φ}
Which model transformations preserve a class of CTL properties?
Model refinement or simplification preserving a CTL specification
Which model transformations can make a CTL property true?
Learning of rules to add or to delete to satisfy a CTL specification
CTL Equivalence for a Simple Enzymatic Reaction
Two Biocham models: M1 = {A+B<=>D, D=>A+C} ∪ M
M2 = {B =[A]=> C} ∪ M
D having no occurrence in M nor in the initial state s, φ is atomic.
Proposition If M2,s |= EF(φ) then M1,s |= EF(φ).
Proof In M2 the transitions A+B → A+C (resp. A+B → A+B+C) can be replaced in M1 by A+B → D → A+C (resp. A+B → B+D → B+A+C).
Proposition If M1,s |= EF(φ) then M2,s |= EF(φ) whenever A and B do not appear negatively (i.e. under an odd number of negations) in φ and D does not appear at all in φ.
Proof Let π be a path in M1 such that π^k |= φ. If π^k does not contain D then one can easily mimic π with a path π’ in M2 such that π’^k’ = π^k for some k’ ≤ k. Otherwise, the last transition on D is either D → D+A+C and can be replaced by D → A+C, or A+B → D and can be erased. In both cases the path is mimicked in M2.
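Both propositions can be checked by brute force on the smallest instance, with M empty and initial state {A, B}. The sketch below assumes a simplified Boolean semantics (products become present, each reactant that is not also a product may or may not be consumed) and writes B =[A]=> C as A+B => A+C; the function names are illustrative and not part of BIOCHAM.

```python
from itertools import product


def successors(state, rules):
    """Assumed Boolean semantics of a rule set given as (reactants, products) pairs."""
    out = set()
    for reactants, products in rules:
        if reactants <= state:
            consumable = sorted(reactants - products)
            for choice in product([False, True], repeat=len(consumable)):
                removed = {m for m, gone in zip(consumable, choice) if gone}
                out.add(frozenset((state - removed) | products))
    return out


def reachable_states(init, rules):
    """All states reachable from init, i.e. the states witnessing EF formulae."""
    seen, todo = {frozenset(init)}, [frozenset(init)]
    while todo:
        for t in successors(todo.pop(), rules):
            if t not in seen:
                seen.add(t)
                todo.append(t)
    return seen


fs = frozenset
M1 = [(fs("AB"), fs("D")), (fs("D"), fs("AB")), (fs("D"), fs("AC"))]
M2 = [(fs("AB"), fs("AC"))]            # B =[A]=> C, with A kept on both sides

r1 = reachable_states("AB", M1)
r2 = reachable_states("AB", M2)
print(r2 <= r1)                         # every state reachable in M2 is reachable in M1
print({s for s in r1 if "D" not in s} == r2)
print(fs("D") in r1)                    # a state without A exists only in M1
```

The last line illustrates why the second proposition excludes negative occurrences of A: in this instance EF(¬A) holds in M1, through the state {D}, but not in M2, where A is present in every reachable state.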
Proposition For ψ atomic, if M2,s |= ¬E(¬φ U ψ) then M1,s |= ¬E(¬φ U ψ) whenever A and B do not appear negatively in ψ and D does not appear positively in ψ.
Proposition If M1,s |= ¬E(¬φ U ψ) then M2,s |= ¬E(¬φ U ψ) whenever A and B do not appear negatively in φ and D does not appear positively in φ.
Positive and Negative CTL Formulae
Let K = (S,R,L) and K’ = (S,R’,L) be two Kripke structures such that R ⊆ R’.
Def. An ECTL (positive) formula is a CTL formula with no occurrence of A (nor negative occurrence of E).
Ex.: reachability EF(φ), steady EG(φ)
Def. An ACTL (negative) formula is a CTL formula with no occurrence of E (nor negative occurrence of A).
Ex.: checkpoint ¬E(¬φ2 U φ), stable AG(φ)
Monotonicity of Positive ECTL Formulae
Let K = (S,R) and K’ = (S,R’) be two Kripke structures such that R ⊆ R’.
Proposition For any ECTL formula φ, if K’,s |≠ φ then K,s |≠ φ.
Proof We show that K,s |= φ implies K’,s |= φ by induction on the proof of K,s |= φ.
If φ is propositional, s |= φ hence K’,s |= φ;
If φ = φ1 & φ2 (resp. φ1 | φ2) then by induction K’,s |= φ1 and (resp. or) K’,s |= φ2.
If φ = EXφ1 then K,π |= Xφ1 for some path π in K, hence in K’ since R ⊆ R’, so K,π^1 |= φ1 and by induction K’,π^1 |= φ1, hence K’,π |= Xφ1.
If φ = E(φ1 U φ2) then K,π |= φ1 U φ2 for some path π in K, hence in K’, so there exists k such that K,π^k |= φ2 and for all j<k K,π^j |= φ1. By induction K’,π^k |= φ2 and for all j<k K’,π^j |= φ1, hence K’,π |= φ1 U φ2.
Anti-monotonicity of Negative ACTL Formulae
Let K = (S,R) and K’ = (S,R’) be two Kripke structures such that R ⊆ R’.
Proposition For any ACTL formula φ, if K,s |≠ φ then K’,s |≠ φ.
Proof If K,s |≠ φ then K,s |= ¬φ where ¬φ is an ECTL formula.
By the previous proposition, K’,s |= ¬φ hence K’,s |≠ φ.
Theory Revision Algorithm
General idea of constraint programming: replace a generate-and-test algorithm by a constrain-and-generate algorithm.
Anticipate whether one has to add or remove a rule?
• Positive ECTL formula: if false, remains false after removing a rule
  • Reachability, steady states
  • Need to add rules
• Negative ACTL formula: if false, remains false after adding a rule
  • Checkpoints
  • Need to remove a rule on the path given by the model checker
• Unclassified CTL formulae
  • oscil(a) = AG((a ⇒ EF ¬a) ^ (¬a ⇒ EF a))
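The classification used to anticipate additions and deletions amounts to tracking the polarity of each path quantifier. The sketch below assumes formulae are encoded as nested tuples such as ("EF", ("atom", "a")); this encoding and the function names are illustrative choices, not BIOCHAM syntax. A quantifier under an odd number of negations counts as its dual.

```python
E_OPS = {"EX", "EF", "EG", "EU"}
A_OPS = {"AX", "AF", "AG", "AU"}


def quantifier_kinds(f, positive=True):
    """Effective path quantifiers ('E' or 'A') once negations are pushed inward."""
    op = f[0]
    if op == "atom":
        return set()
    if op == "not":
        return quantifier_kinds(f[1], not positive)
    if op == "imp":                       # a => b is read as (not a) or b
        return quantifier_kinds(f[1], not positive) | quantifier_kinds(f[2], positive)
    kinds = set()
    if op in E_OPS:
        kinds.add("E" if positive else "A")
    elif op in A_OPS:
        kinds.add("A" if positive else "E")
    for sub in f[1:]:                     # operands of and, or, and temporal operators
        kinds |= quantifier_kinds(sub, positive)
    return kinds


def classify(f):
    """'ECTL', 'ACTL' or 'unclassified'; purely propositional formulae count as ECTL."""
    kinds = quantifier_kinds(f)
    if kinds <= {"E"}:
        return "ECTL"
    if kinds <= {"A"}:
        return "ACTL"
    return "unclassified"


a, b = ("atom", "a"), ("atom", "b")
reachable = ("EF", a)
checkpoint = ("not", ("EU", ("not", b), a))
oscil = ("AG", ("and", ("imp", a, ("EF", ("not", a))),
                       ("imp", ("not", a), ("EF", a))))
print(classify(reachable), classify(checkpoint), classify(oscil))
```

With this encoding, the reachability formula is classified as ECTL, the checkpoint formula as ACTL because its E sits under a negation, and oscil(a) as unclassified because it mixes AG with a positive EF.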
Theory Revision Algorithm Rules
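A configuration is a triple <satisfied formulae, untreated formulae, rules>, where each set of formulae is split into its ECTL, UCTL (unclassified) and ACTL parts.
Initial state: <(∅,∅,∅), (E,U,A), R>
E transition: <(E,U,A), (E’∪{e},U’,A’), R> → <(E∪{e},U,A), (E’,U’,A’), R> if R |= e
E’ transition: <(E,U,A), (E’∪{e},U’,A’), R> → <(E∪{e},U,A), (E’,U’,A’), R∪{r}>
if R |≠ e and ∀f ∈ {e}∪E∪U∪A, R∪{r} |= f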
U transition: <(E,U,A), (∅,U’∪{u},A’), R> → <(E,U∪{u},A), (∅,U’,A’), R> if R |= u
U’ transition: <(E,U,A), (∅,U’∪{u},A’), R> → <(E,U∪{u},A), (∅,U’,A’), R∪{r}>
if R |≠ u and ∀f ∈ {u}∪E∪U∪A, R∪{r} |= f
U” transition: <(E,U,A), (∅,U’∪{u},A’), R∪Re> → <(E,U∪{u},A), (∅,U’,A’), R>
if R∪Re |≠ u and ∀f ∈ {u}∪E∪U∪A, R |= f
A transition: <(E,U,A), (∅,∅,A’∪{a}), R> → <(E,U,A∪{a}), (∅,∅,A’), R> if R |= a
A’ transition: <(E∪Ep,U∪Up,A), (∅,∅,A’∪{a}), R∪Re> → <(E,U,A∪{a}), (Ep,Up,A’), R> if R∪Re |≠ a, ∀f ∈ {a}∪E∪U∪A, R |= f, and Ep∪Up is the set of formulae no longer satisfied after the deletion of the rules in Re.
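The transition rules can be read as a loop that treats ECTL formulae first, then unclassified ones, then ACTL ones. The sketch below is a greedy, first-fit reading of these rules and not the BIOCHAM implementation: it never backtracks, the model checker holds(model, formula) is a caller-supplied function, and the toy checker used in the example (rules as graph edges, formulae as ("reach", x) or ("never", x) from a start node "s") is invented for illustration.

```python
from itertools import combinations


def revise(rules, E, U, A, candidates, holds):
    """Return a revised rule set, or None when no transition applies."""
    R = set(rules)
    done_E, done_U, done_A = [], [], []                  # satisfied sets
    todo_E, todo_U, todo_A = list(E), list(U), list(A)   # untreated sets

    def ok(model, new):
        # The new formula and every already satisfied formula must hold.
        return all(holds(model, f) for f in [new] + done_E + done_U + done_A)

    def one_addition(f):
        return next((r for r in sorted(candidates - R) if ok(R | {r}, f)), None)

    while todo_E or todo_U or todo_A:
        if todo_E:                                       # E and E' transitions
            e = todo_E.pop()
            if not holds(R, e):
                r = one_addition(e)
                if r is None:
                    return None
                R.add(r)
            done_E.append(e)
        elif todo_U:                                     # U, U' and U'' transitions
            u = todo_U.pop()
            if not holds(R, u):
                r = one_addition(u)
                if r is not None:
                    R.add(r)
                else:
                    r = next((r for r in sorted(R) if ok(R - {r}, u)), None)
                    if r is None:
                        return None
                    R.remove(r)
            done_U.append(u)
        else:                                            # A and A' transitions
            a = todo_A.pop()
            if not holds(R, a):
                Re = next((set(c) for k in range(1, len(R) + 1)
                           for c in combinations(sorted(R), k)
                           if all(holds(R - set(c), f) for f in [a] + done_A)),
                          None)
                if Re is None:
                    return None
                R -= Re
                # Ep, Up: formulae broken by the deletion become untreated again.
                todo_E += [f for f in done_E if not holds(R, f)]
                todo_U += [f for f in done_U if not holds(R, f)]
                done_E = [f for f in done_E if holds(R, f)]
                done_U = [f for f in done_U if holds(R, f)]
            done_A.append(a)
    return R


def holds(model, formula):
    """Toy checker: ("reach", x) is EF x and ("never", x) is AG not x, from node "s"."""
    seen, todo = {"s"}, ["s"]
    while todo:
        n = todo.pop()
        for src, dst in model:
            if src == n and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    kind, x = formula
    return (x in seen) if kind == "reach" else (x not in seen)


print(revise({("s", "bad"), ("bad", "g")}, [("reach", "g")], [], [("never", "bad")],
             {("s", "g")}, holds))
```

In this example the A’ step deletes the edge from s to bad, which breaks the reachability of g; that formula is moved back to the untreated set and repaired by an E’ step that adds the edge from s to g.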
Termination
Proposition The model revision algorithm terminates.
Proof
The termination of the algorithm is proved by considering the lexicographic
ordering over the pair < a, n >
where a is the number of unsatisfied ACTL formulae,
and n is the number of unsatisfied ECTL and UCTL formulae.
Each transition strictly decreases a,
or leaves a unchanged and strictly decreases n.
Correctness
Proposition If the terminal configuration is of the form <(E,U,A), (∅,∅,∅), R> then the model R satisfies the initial CTL specification.
Proof
Each transition maintains only true formulae in the satisfied set,
and preserves the complete CTL specification
in the union of the satisfied set and the untreated set.
Incompleteness
Two reasons:
1) The satisfaction of ECTL and UCTL formulae is sought by adding only one rule to the model (transitions E’ and U’).
2) The Kripke structure associated to a Biocham set of rules adds loops on terminal states. Hence adding or removing a rule may have the opposite effect of deleting or adding those loops.
Optimisations
Restrict the search space for rules to add by:
• Considering type information on molecular species
  • Kinase(A): B=[A]=>B~{p}. for any B
  • Phosphatase(A): B~{p}=[A]=>B. for any B
  • Kinase(A,B)
  • Phosphatase(A,B)
• Considering the influence graph between molecular species
  • Activates(A,B): _=[A]=>B. A+B’=>B. B~{p}=[A]=>B. B’=[A]=>B.
  • Inhibits(A,B): B=[A]=>_. A+B=>A-B. B=[A]=>B~{p}. B=[A]=>B’.
• Considering the topology of locations
  • Neighbor(L,L’): A:L+…=>B:L’+…
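As an illustration of how type information narrows the search, the sketch below enumerates candidate rules from kinase and phosphatase declarations. It is a hypothetical helper, not a BIOCHAM command: the function name, its parameters, and the reading of Kinase(A,B) and Phosphatase(A,B) as restricting the pattern to the single substrate B are assumptions made here.

```python
def candidate_rules(molecules, kinases=(), phosphatases=(),
                    kinase_pairs=(), phosphatase_pairs=()):
    """Candidate rule strings in BIOCHAM-like syntax, derived from type declarations."""
    rules = []
    for a in kinases:                      # Kinase(A): any substrate B
        rules += [f"{b}=[{a}]=>{b}~{{p}}." for b in molecules]
    for a in phosphatases:                 # Phosphatase(A): any substrate B
        rules += [f"{b}~{{p}}=[{a}]=>{b}." for b in molecules]
    for a, b in kinase_pairs:              # Kinase(A,B): assumed to fix the substrate
        rules.append(f"{b}=[{a}]=>{b}~{{p}}.")
    for a, b in phosphatase_pairs:         # Phosphatase(A,B): assumed likewise
        rules.append(f"{b}~{{p}}=[{a}]=>{b}.")
    return rules


for r in candidate_rules(["B1", "B2"], kinases=["A"], phosphatase_pairs=[("P", "B1")]):
    print(r)
```

With n molecules, an untyped catalysed phosphorylation pattern yields on the order of n² candidates, whereas a single Kinase(A) declaration yields n and a Kinase(A,B) declaration yields one.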