The Pennsylvania State University
The Graduate School
Eberly College of Science
UNDERSTANDING SIGNAL TRANSDUCTION IN BIOLOGICAL SYSTEMS
WITH NETWORK-BASED DYNAMIC MODELING
A Dissertation in
Physics
by
Xiao Gan
© 2019 Xiao Gan
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Doctor of Philosophy
August 2019
The dissertation of Xiao Gan was reviewed and approved* by the following:
Réka Albert
Distinguished Professor of Physics and Biology
Dissertation Advisor and Chair of Committee
Carina Curto
Associate Professor of Mathematics
Dezhe Jin
Associate Professor of Physics
Sarah M. Assmann
Waller Professor of Biology
Richard Robinett
Professor of Physics, Associate Department Head, Director of Undergraduate
Studies, and Director of Graduate Studies
*Signatures are on file in the Graduate School.
Abstract
Complex biological systems are composed of simple, low-level elements. A promising
avenue toward understanding how system-level behavior arises from the interactions of lower-
level components is network-based dynamical modeling. For example, dynamic modeling of
molecular interaction networks can capture cell behavior or phenotype as an emergent property
that arises from the dynamics of the system. In a dynamic model, a node is associated with a
state and a regulatory function that describes its time-evolution. The attractors (long-time
behavior) of a network-based dynamical model represent significant biological phenotypes, e.g.
cell fates. It is therefore important to know the attractors of a network model, so that one may
design interventions to avoid undesired attractors and keep the system in the desired attractor.
The main challenge in determining the complete dynamic repertoire is the enormous size of the state space.
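As a hedged illustration of these notions (not taken from the dissertation), the following minimal Python sketch enumerates the state transition graph of a hypothetical two-node Boolean model, A* = B and B* = A, and reports its attractors as terminal strongly-connected components. The rules, the function names, and the brute-force enumeration are all assumptions chosen for brevity; exhaustive enumeration of this kind is feasible only for very small networks.

```python
from itertools import product

# Hypothetical two-node Boolean model (assumed for illustration only):
# A* = B, B* = A
RULES = {
    "A": lambda s: s["B"],
    "B": lambda s: s["A"],
}
NODES = sorted(RULES)


def successors(state, scheme):
    """Return the set of states reachable from `state` in one update step."""
    s = dict(zip(NODES, state))
    if scheme == "synchronous":
        # All nodes are updated simultaneously.
        return {tuple(RULES[n](s) for n in NODES)}
    # General asynchronous update: exactly one node is updated per step.
    nxt = set()
    for n in NODES:
        t = dict(s)
        t[n] = RULES[n](s)
        nxt.add(tuple(t[m] for m in NODES))
    return nxt


def reachable(start, graph):
    """Return all states reachable from `start` (including `start` itself)."""
    seen, stack = {start}, [start]
    while stack:
        for t in graph[stack.pop()]:
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def attractors(scheme):
    """Return the attractors (terminal SCCs of the state transition graph)."""
    states = list(product((0, 1), repeat=len(NODES)))
    graph = {s: successors(s, scheme) for s in states}
    reach = {s: reachable(s, graph) for s in states}
    found = set()
    for s in states:
        # s lies in an attractor iff every state it can reach can reach it back.
        if all(s in reach[t] for t in reach[s]):
            found.add(frozenset(reach[s]))
    return found
```

Under these assumed rules, `attractors("synchronous")` contains the two fixed points (0,0) and (1,1) plus a two-state cycle between (0,1) and (1,0), whereas `attractors("asynchronous")` contains only the two fixed points. This illustrates how a complex attractor can depend on the update scheme.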
The unifying theme of my dissertation research is to understand signal transduction in
complex biological systems. All of my projects used discrete dynamic modeling, which can
recapitulate biological knowledge with minimal kinetic parameterization and is
thus simple enough to apply to large biological systems. In my first project I analyzed the
attractor landscape of a 70-node multi-level biological network model. This model described
plant guard cell signaling during the process in which microscopic pores on the surface of the
leaves (called stomata) open in response to light of different wavelengths. Due to the size of the
network and the multiple states of a portion of the nodes, this model has a huge state space
(~10^31 states). Using a combination of network reduction analysis techniques, I found the
model’s complete dynamic repertoire, revealing the stability of signal transduction in the
stomatal opening process.
In a follow-up project, I developed a general method to automatically identify the attractors
of any finite discrete model, based on a Boolean method developed by our group previously.
The idea is to exploit an expanded network representation that incorporates regulatory rules into
the interaction network. A certain type of subgraph of the expanded network determines a trap
subspace of the state space (i.e. a subspace that the system cannot leave once it has entered). These
motifs are the dynamic cores of a model. Iterative identification of stable motifs yields the
attractors of the system. The method finds not only steady states, but also complex (oscillating)
attractors. I proved this mathematically and validated the method on synthetic network ensembles and
on a set of existing multi-level models from the literature.
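The notion of a trap subspace can be made concrete with a small brute-force check. The sketch below is an assumption-laden illustration, not the motif-based method described above: it uses hypothetical three-node Boolean rules and a hypothetical helper `is_trap_subspace`, and it tests whether fixing some nodes at given values is self-sustaining for every setting of the remaining nodes. The expanded-network approach is meant to avoid exactly this kind of enumeration.

```python
from itertools import product

# Hypothetical three-node Boolean rules (assumed for illustration only):
# A* = B, B* = A, C* = NOT A
RULES = {
    "A": lambda s: s["B"],
    "B": lambda s: s["A"],
    "C": lambda s: 1 - s["A"],
}


def is_trap_subspace(rules, fixed):
    """Check whether the partial state `fixed` defines a trap subspace.

    `fixed` maps node names to values. The subspace is a trap if, for every
    assignment of the remaining (free) nodes, the regulatory function of each
    fixed node returns that node's fixed value, so no update can leave it.
    """
    free = [n for n in rules if n not in fixed]
    for values in product((0, 1), repeat=len(free)):
        state = {**fixed, **dict(zip(free, values))}
        if any(int(rules[n](state)) != v for n, v in fixed.items()):
            return False
    return True
```

With these assumed rules, `is_trap_subspace(RULES, {"A": 1, "B": 1})` holds because A and B sustain each other regardless of C, while `is_trap_subspace(RULES, {"A": 1})` fails because the value of A depends on the unfixed node B.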
My third project models plant responses to environmental stress, in collaboration with
wet-bench biologists. Plants close their stomata in response to high CO2 concentration or to
phytohormones such as ABA (abscisic acid) induced by drought. We aim to understand how
different signaling components participate in the crosstalk of ABA and CO2 in inducing
stomatal closure. We are also interested in the different signaling mechanisms involving
canonical and non-canonical subunits of the G-protein (a membrane protein involved in many
types of trans-membrane signal transduction). The network model integrates previous work on
ABA signaling with existing knowledge on CO2 signaling, and predicts necessary regulations
of the G-protein based on necessary conditions for the model to be consistent with experimental
observations. Our motif analysis explains the mechanism by which different signals induce
closure. The model also predicts stomatal closure patterns under interventions. These
predictions will be assessed experimentally by our collaborative team.
In summary, my dissertation research has provided a general way to analyze complex
discrete dynamic models, and has expanded the understanding of plant responses to
environmental stress.
Table of Contents
List of Figures ..................................................................................................... vii
List of Tables ..................................................................................................... xiv
Acknowledgments .............................................................................................. xvii
Chapter 1 Review of biological networks and dynamic modeling ...................... 1
Introduction ................................................................................................................. 1
Networks in biology .................................................................................................... 2
Dynamic modeling ...................................................................................................... 6
Modeling T cell survival ........................................................................................... 12
Modeling epithelial to mesenchymal transition (EMT) ............................................ 14
Integration of the interaction network and regulatory rules...................................... 15
Chapter 2 Analysis of a dynamic model of guard cell signaling reveals the
stability of signal propagation ............................................................................. 25
Background ............................................................................................................... 25
Methods..................................................................................................................... 28
Results ....................................................................................................................... 30
Discussion ................................................................................................................. 43
Chapter 3 A general method to find the attractors of discrete dynamic models
of biological systems............................................................................................ 45
Introduction ............................................................................................................... 45
Methods..................................................................................................................... 46
Results ....................................................................................................................... 60
Discussion ................................................................................................................. 63
Chapter 4 Modeling ABA and CO2 crosstalk in inducing stomatal closure ..... 66
Introduction ............................................................................................................... 66
Construction and simulation methods of the crosstalk network and dynamic model
................................................................................................................................... 69
Predicting XLG related regulations by reproducing known wild type and G-protein
mutants’ stomatal response to ABA, CO2, and external Calcium ............................ 72
Motifs analysis identifies key feedback loops, shows the attractor of the system, and
explains the effect of different G-protein alpha subunits ........................................... 78
Multiple intervention scenarios predict potential G-protein regulation effectors, and
mutant response to signals ........................................................................................ 81
The crosstalk model offers a potential explanation for the seemingly contradictory
stomatal response to CO2 in presence and absence of mesophyll cells .................... 84
Time-course simulation reveals a knowledge gap in CO2 early signaling ................. 86
Discussion ................................................................................................................. 87
Chapter 5 Conclusions and outlook ................................................................. 89
Appendix A Analysis of a dynamic model of guard cell signaling reveals the
stability of signal propagation ............................................................................. 93
A1 Regulatory Functions of the Reduced Stomatal Opening Model ............ 93
A2 Table of stomatal opening levels for simulated single node knockouts in
the reduced model ..................................................................................................... 99
Appendix B A general method to find the attractors of discrete dynamic models
of biological systems.......................................................................................... 102
B1 Runtime performance of the multi-level Quine-McCluskey algorithm 102
B2 Description of the multi-level Quine-McCluskey algorithm ................. 103
B3 Mathematical foundations of the motif-based attractor identification
algorithm 104
B4 Oscillating Motif Examples ................................................................... 114
B5 Generation of regulatory functions in synthetic networks ..................... 119
Appendix C Modeling ABA and CO2 crosstalk in inducing stomatal closure 120
C1 Node name, abbreviation and regulatory rule for each node ................. 120
C2 Systematic single node intervention of the crosstalk model.................. 122
C3 Selected triple intervention of the crosstalk model ............................... 126
References ......................................................................................................... 128
List of Figures
Figure 1.1. Signal transduction network corresponding to the process of stomatal
opening in plants, adapted from [2]. This network has 32 nodes and 81 directed
edges. Arrows represent positive edges, and terminal filled circles represent
negative edges. The network contains three strongly-connected components
(SCCs), marked with dotted lines. The thick edges indicate a path from the
source node ‘Blue light’ to the sink node ‘Stomatal Opening’. ....................... 4
Figure 1.2. Flow chart of the main steps of constructing and analyzing a dynamic
model of a signal transduction network. The key to the construction and
validation of the model is experimental data. Different types of data are used
for model construction and for model validation: interaction data and initial
state data are used as inputs; and time-course or long-term state data are used
for model validation. This separation of information helps avoid overfitting. 8
Figure 1.3. Example of Boolean and multi-level regulatory functions in truth table
representation. A truth table is generated by enumerating all input
combinations and indicating the corresponding outputs. The output of the
regulatory function will become the next state of the target node. Here and
throughout the chapter we represent the next state of node A as A*. .............. 9
Figure 1.4. Example of a toy Boolean network model and its dynamics under
synchronous update (when both nodes are updated simultaneously) and under
general asynchronous update (when one node is updated at each time). The
dynamics of the model is represented by a state transition graph (STG), in
which system states are represented by nodes and state transitions are
represented by edges. Terminal strongly-connected components (including
nodes with only a self-loop) in an STG are attractors of the system. This model
exemplifies that complex attractors may depend on update schemes.
Specifically, under synchronous update, there is a complex attractor formed
by two states that differ in the value of both nodes. As state transitions that
change the value of two nodes are not possible under general asynchronous
update, this complex attractor disappears under asynchronous update. ........ 11
Figure 1.5. T-LGL survival signaling network by Zhang et al., reproduced with
permission from [22], copyright (2008) National Academy of Sciences, U.S.A.
The network contains 58 nodes and 123 edges. Up-regulated or constitutively
active nodes are in red, down-regulated or inhibited nodes are in green, nodes
that have been suggested to be deregulated (either up-regulation or down-
regulation) are in blue, and the states of white nodes are unknown or
unchanged compared with normal. Blue edges with arrowheads indicate
activation and red edges that terminate in diamonds indicate inhibition. The
shape of the nodes indicates the cellular location of the corresponding proteins,
transcripts or molecules: rectangles indicate intracellular components, ellipses
indicate extracellular components, and diamonds indicate receptors.
Conceptual nodes (Stimuli, Cytoskeleton signaling, Proliferation, and
Apoptosis) are orange. ................................................................................... 13
Figure 1.6. EMT network by Steinway et al., reproduced with permission from [24].
The network has 70 nodes and 135 edges. Nodes that represent extracellular
signals are shown in blue, green nodes are transcription factors, and the single
output node EMT is shown in red. Multiple molecules that serve as
extracellular signals are also produced by the cell, thus these nodes have
incoming edges. ............................................................................................. 15
Figure 1.7. Examples of expanded network construction in the Boolean and multi-
level case. Each virtual node is labeled with the state it represents. Each
composite node is black, with a label indicating which node combination it
represents. The complete expanded network is obtained by expanding all
regulatory functions of the original model. ................................................... 17
Figure 1.8. Example of elementary signaling modes (ESMs) in a partial expanded
network. The labeled virtual nodes correspond to the ON state of the respective
nodes in the original signal transduction network; the black node is a
composite node. There are two ESMs in the network: the path A1 B1 D1 E1,
shown as the dotted line, and the subgraph that contains A1, C1, B1, the
composite node, and E1, shown with a dashed line. Each is sufficient for the
signal to activate the outcome. This figure was adapted from [29]. .............. 18
Figure 1.9. Example of stable motif identification from a three-node Boolean
dynamic model. The regulatory functions of the virtual nodes are given. The
black nodes in the expanded network are composite nodes. Three stable motifs
can be identified from the expanded representation of the network. The first
stable motif represents the simultaneous activation (state 1) of nodes A and B.
The second and third stable motifs represent the sustained inactivation (state
0) of A and C, respectively. Notice that a stable motif corresponds to a positive
feedback loop (or SCC) in the original network, but not all positive feedback
loops are stable motifs. .................................................................................. 19
Figure 1.10. Example of attractor identification with iterative stable motif guided
network reduction using the same model as in Figure 1.9. There are three
stable motifs in the model. In the iterative reduction process, each of them is
plugged into the regulatory functions (represented by indicating the stable
motif above an arrow), resulting in a reduced model (indicated by the
interaction network and regulatory functions), where further stable motif
analysis is performed. For simplicity of representation of the A1, B1 stable
motif, we do not show the composite node. When all nodes’ states are
identified in the process, the reduction is complete and an attractor is obtained.
....................................................................................................................... 20
Figure 1.11. Part of the stable motif succession diagram of the T-LGL network,
adapted from [31]. The state of the nodes in each motif is indicated by a
number, separated from the node name by an underscore (e.g. S1P_0
represents S1P at state 0). A stable motif sequence determines the attractor, i.e.
Apoptosis or T-LGL leukemia (cancer). For example, the activation of the
Ceramide=0, S1P=1, PDGFR=1, SPHK1=1 motif leads to the reduction of the
whole network and convergence into the T-LGL leukemia attractor. ............ 21
Figure 1.12. The logic backbone of the EMT model, reproduced from [34]. This is
a condensed version of the EMT network, where each stable motif of the
model is represented by a single node (in blue), and its causal relationships
with the signals and the outcome node EMT (in yellow) are visualized. All
edges are sufficient activations, i.e. the activity (sustained ON state) of the
input node/motif will activate the target node or motif. Any signal, or any
stable motif is sufficient to drive EMT. ......................................................... 22
Figure 1.13. An example ESM from the EMT network, from the signal PDGF to
output EMT. The states of the nodes are marked at the end of each node label
(e.g. PDGF_1 means PDGF at state 1). The existence of this ESM indicates
that the sustained presence of the PDGF signal alone is sufficient to drive EMT.
Note that this ESM contains three composite nodes. .................................... 22
Figure 1.14. Stable motif associated with the epithelial state in the EMT network
and illustration of control sets that guarantee convergence to the epithelial state,
reproduced from [25]. The entire graph is the epithelial stable motif. Nodes in
black are OFF, and nodes in white are ON. Controlling one node in each
yellow rectangle, e.g. SMAD, SNAI1, RAS, SHH knockout combined with β-
catenin_memb constitutive activation, ensures convergence to the epithelial
state. The nodes highlighted in blue represent SMAD and the nine nodes
whose knockout in combination with SMAD is able to prevent TGFβ-driven
EMT. The fact that these blue nodes are either part of a yellow rectangle
(SMAD, RAS), on a path that ends in a node of a yellow rectangle (DELTA,
NOTCH, NOTCH_ic, CSL) or on a path that starts with a node of a yellow
rectangle (PI3K, AKT) indicates the inclusive relationship between node sets
whose control prevents or, respectively, reverses EMT. ................................ 24
Figure 2.1 The signal transduction network responsible for stomatal opening, as
reconstructed by Sun et al.[1]. The color of a node marks which signal
regulates this node. Red nodes are regulated solely by red light. Blue nodes
are regulated solely by blue light. Yellow nodes are regulated solely by ABA.
Grey nodes are regulated by CO2. Purple nodes are regulated by both blue and
red light. Green nodes are regulated by blue (and potentially, red) light and
ABA. White nodes are source nodes not regulated by any of the four signals.
To improve visualization, certain pairs of edges with the same starting or end
nodes overlap. Nodes with multiple levels in the dynamic model are
represented by red shadows; the others are Boolean. The full names of the
network components denoted by abbreviated node names are given in Table 1.
This figure and part of its caption are reproduced from Sun Z, Jin X, Albert R,
Assmann SM (2014) Multi-level Modeling of Light-Induced Stomatal
Opening Offers New Insights into Its Regulation by Drought. PLoS Comput
Biol 10(11): e1003930. doi:10.1371/journal.pcbi.1003930. ......................... 26
Figure 2.2 The stomatal opening network after model reduction, with 32 nodes and
81 edges. Nodes with shadows have multiple states; other nodes are binary.
The three strongly-connected components (SCCs) of the network are indicated
by rectangles with dashed contours. .............................................................. 33
Figure 2.3 The Ion SCC after reducing all edges that depend on Calcium. All
regulators of this sub-network have been omitted. On the left, [Ca2+]c related
nodes form a sink sub-network. ..................................................................... 39
Figure 3.1 Demonstration of the construction of a quasi-Boolean regulatory
function. A 3-level node A has regulatory function: fA =B+C, where B and C
both have 2 levels. From the truth table, one can identify the regulatory
function for each virtual node of A, by connecting all conjunctive clauses that
yield the same state of A with the Boolean ‘or’ operator. In this way, each
virtual node’s regulatory function will have a Boolean disjunctive normal form.
....................................................................................................................... 49
Figure 3.2 Example of the multi-level Quine-McCluskey algorithm. A Boolean
node D is regulated by a Boolean node A and two 3-state nodes B and C. The
original function of D is shown in a truth table on the top left, in a form
summarizing all input combinations that yield fD(1) =1. The top right table
shows the minterms sorted according to the number of zeros in them. From
this table, one can merge the terms between layers that are different by 1 digit,
if all states of the difference node are present within the two layers. The result
of the merging is shown below. Merged terms are represented by an ‘X’. There
are 5 leftover terms after 1st order merging, and there is 1 leftover term after
0th order merging. The sum of all six terms is the final expression. .............. 51
Figure 3.3 Construction of an expanded network from a regulatory function. Virtual
node A0 has function fA(0) = B0 or (C1 and B1), so in the expanded network,
B0 is connected directly to A0; C1 and B1 are connected indirectly to A0 via
composite node 'C1 and B1'. A1 has function fA(1) = C0 and B1, so C0 and B1
are connected indirectly to A1 via composite node 'C0 and B1'. .................. 52
Figure 3.4 Illustration of stable motif identification in a three-node network. (A)
The original network and the regulatory functions of each node; (B) The
expanded network is constructed according to the steps in sub-section E of
Methods, and then the stable motifs are found by their definition in I.F. (C)
Stable motifs found in this example. The first stable motif, A0, B0,
corresponds to a fixed point attractor of the system A=0, B=0, C=0. The state
C=0 is found by plugging A=B=0 into the regulatory function of C. The 2nd
stable motif corresponds to another fixed point attractor A=2, B=2, C=0. ... 53
Figure 3.5 An example of an oscillating motif in a multi-level network. Panel (A)
shows the network and regulatory functions; panel (B) indicates the expanded
network and motifs. A0 and B0 form a stable motif, indicating a fixed point
A=0, B=0; while A1, A2, B1 and B2 form an oscillating motif, indicating a
possible complex attractor involving states A=1, A=2, B=1 and B=2. Panel (C)
indicates the state transition graph of the system when using general
asynchronous update. The stable motif and oscillating motif identified in 5B
correspond to a fixed point and a complex attractor, respectively................. 55
Figure 3.6 An example of an oscillating motif that contains a stabilized node. (A)
The network and regulatory functions. (B) The expanded network and motifs.
The oscillating motif contains only one virtual node of B, meaning that B will
stabilize at 1 in the complex attractor. (C) The state transition graph using
general asynchronous update. There are two attractors: a fixed point attractor,
and a complex attractor. ................................................................................. 56
Figure 3.7 Attractor identification for a four-node network by a motif succession
diagram. A. The network and the regulatory function of each node. B. Motif
succession diagram. Three motifs are found from the original network,
including 2 stable motifs (A0, B0), (C1, D1), and one oscillating motif (A1,
A2, B1, B2). For each motif, the values of the nodes in the motif are plugged
into the regulatory functions, reducing the network. Then new motifs are
identified from the reduced networks. The sequences corresponding to the
three motifs are labeled (1), (2) and (3). ........................................................ 58
Figure 4.1 The ABA-CO2 crosstalk network. The network has 28 nodes and 58
edges. Nodes with red labels are CO2 related. Red edges are assumed
regulations. Among them, directed red edges are inferred regulations that are
necessary for CO2 induced closure; undirected red edges are based on
observed protein-protein interactions (see the next section for details). The
sole strongly-connected component, marked with the “SCC” label, contains 18 nodes.
A table of node names and abbreviations can be found in Appendix C1. .... 70
Figure 4.2 Time course simulation of closure in response to ABA, CO2 and external
Calcium signals. The horizontal axis is the simulation time step, and the
vertical axis is the closure averaged over 1000 simulations. The small
peak at time step ~1 is due to randomized initial conditions. ......................... 72
Figure 4.3 Two representative edge/regulation settings of the CO2 signaling sub-
network. Substituting this into Figure 4.1 will complete the network. Black
edges are known and red edges are assumptions/predictions. The main
difference between these two network settings is the opposite direction of the
regulatory relationship between XLGs and HT1. .......................................... 77
Figure 4.4 Result of motifs analysis of the crosstalk model. These motifs are shown
in the expanded network representation (described in Chapter 3) here. Node
states are represented by color: grey nodes are in their
OFF states and white nodes are in their ON states. Black
nodes without labels are composite nodes, which represent combinatorial regulation
(i.e. the “AND” logical operation). “Rboh” is the short-hand notation for
“AtRbohD/F”. A & B. Stable motifs found under the ABA and CO2 signals, respectively.
C. Two stable motifs are found in the absence of any signals: one associated
with closure and the other associated with non-closure. The dotted line means
that either XLG or GPA1 is sufficient to complete the motif. D. Two-node
oscillating motif found in all closure ON attractors. The left-hand side is the
original network; the right-hand side is the motif in expanded network
representation. ................................................................................................ 79
Figure 4.5 Flow chart of activation sequence of components in the network. “CaIM”
is short for Calcium influx through the membrane. “AtRboh stable motif” is
defined in the previous section and can be interpreted as ROS (reactive oxygen
species) production. The ABA response is fastest because ABA early signaling
can activate the AtRboh stable motif. External Calcium directly activates the downstream
part of the CO2 signaling pathway, and its effect is therefore faster than that of CO2
signaling. The fact that the CO2 response is fast in experiments may suggest a
Calcium-independent pathway from CO2 signaling to the downstream components, as
indicated in the figure with the dotted edge(s). ............................................. 87
Figure B.1 Histogram of QM transformation runtime on 100 randomly generated
heterogeneous networks with 50 nodes. The result shows that the runtime
of the QM transformation is much lower than that of motif identification. ....................... 103
Figure B.2 An example of a timing-dependent complex attractor. (A) The network
and regulatory functions. (B) The state transition graph under synchronous
update. Each node of the state transition graph is a state, given in the order A,
B, and each edge is a state transition allowed by synchronous update. The
system has two fixed points, (0,0) and (1,1). It also has a complex attractor
formed by the states (0,1) and (1,0). (C) The state transition graph under
general asynchronous update (i.e. when one node is updated at a time). Only
the two fixed point attractors exist. The synchronous complex attractor is
timing-dependent and does not exist in this update scheme. ....................... 114
Figure B.3 An example of an oscillating motif without a complex attractor. (A) The
network and regulatory functions. (B) The expanded network and motifs.
There is a stable motif formed by A0 and B0, and an oscillating motif made
up by A1, A2, B1. (C) The state transition graph using general asynchronous
update. There is only one attractor, which is a fixed point. The transient
oscillation between states (2,1) and (1,1) will eventually converge into the
fixed point. ................................................................................................... 115
Figure B.4 Examples of stabilized nodes downstream of oscillating node(s). (A) A
Boolean example where A and B oscillate but their downstream C is stable
under that oscillation. (B) The general asynchronous state transition graph of
nodes A and B. The state (A=1,B=1) is not visited in the long term, leading to
the stabilization of C=0. (C) A multi-level example where A is oscillating
between 1 and 2, leading to B stabilizing at 1. This example arises because of
asymmetry in the nodes’ number of states: A has three states but B only has
two states. .................................................................................................... 116
Figure B.5 An example of an unstable oscillation. The system has a fixed point and
a complex attractor. (A) The network and regulatory functions. (B) The
expanded network and motifs. The entire expanded network forms an
oscillating motif, which contains the stable motif formed by the two nodes A1 and B1 and one
composite node. (C) The state transition graph using general asynchronous
update. There is a fixed point attractor A=1, B=1, and a complex attractor.
Note that in the complex attractor, although both A and B are allowed to enter
state 1, they cannot be in state 1 simultaneously. ........................................ 117
Figure B.6 An example of an oscillating motif containing two complex attractors.
(A) The network and regulatory functions. (B) The expanded network and
motifs. The entire expanded network forms an oscillating motif. (C) The state
transition graph. For simplicity self-loops representing self-transitions are not
shown in the graph. There are two complex attractors: the first is B=0 with
A=0 or 1, and the second is B=1 with A=2 or 3. ................................. 118
List of Tables
Table 2.1 Grouping of the stomatal opening values by the level of [K+]v and sucrose.
The first two columns indicate the [K+]v and sucrose levels. The third column
is the possible values of stomatal opening in the Sun et al. model for the given
[K+]v and sucrose levels. Note that here we only show [K+]v, sucrose and
stomatal opening value combinations observed in the simulations of the 66
experimentally studied scenarios reported by Sun et al.[1]. More stomatal
opening values are possible when considering node perturbations. The 4th
column shows the simplified stomatal opening level after grouping. The
update function for the simplified stomatal opening level covers all possible
values of [K+]v and sucrose (see Appendix A1). ........................................... 31
Table 2.2 Example of Boolean conversion. The multi-level node shown in the 1st
column is mapped into two Boolean nodes, shown in the 2nd and 3rd columns,
using the binary representation of the corresponding integer. ....................... 35
Table 2.3 Summary of the attractors found using the stable motif algorithm. The
first 5 columns indicate the input signal combination. The setting CO2_high=1
and CO2=0 is not included because it is not biologically meaningful. The “SO
(Bool)” column indicates the state of the Boolean node combination
representing stomatal opening. The “SO” column is the state of stomatal
opening when converted back to an integer. Note that the stomatal opening
level of four is not defined, and no attractors have a stomatal opening level of
two. The next column indicates whether Ca2+ oscillation can possibly happen
under the given signal combination. The last column indicates whether
bistability of PMV_pos can be observed under this setting. In those cases, two
stable steady states with (PMV_pos=0, Kout=0) and (PMV_pos=1, Kout=1) can
be observed. The rest of the nodes are unaffected by this two-node bistability.
....................................................................................................................... 36
Table 2.4 Summary of systematic perturbation results. The first set of columns, with
the header ‘Light, CO2 and ABA condition’, indicate the input signal
combinations. The abbreviation “Mod.” means moderate CO2 concentration.
Note that we do not list the four input combinations (high CO2 with ABA and
with any type of light, or moderate CO2 with ABA and red light) wherein all
simulated stomatal opening values are zero. The 2nd column is the simulated
stomatal opening (SO) level in the unperturbed system. The 3rd column set
shows the percentage of single-node knockouts that yield the corresponding
SO level. There is no stomatal opening level 4 in the reduced model. No entry
means zero percentage. The last column is the percentage of settings where
the stomatal opening remains at the same level as the unperturbed case. A
complete table of perturbation results is provided in Appendix A2. ............. 41
Table 2.5 Nodes whose knockouts diminish ABA’s inhibition of stomatal opening.
The first set of columns, with the header ‘Light, CO2 and ABA condition’,
indicate the input signal combinations. The 2nd column is the stomatal opening
without perturbations. The 3rd column set indicates the nodes whose knockout
would yield a stomatal opening level that is higher than the unperturbed value
of 0. CO2 knockout means CO2 being set to zero (CO2 free air). No entry
means the setting does not cause partial reversal. ......................................... 44
Table 3.1 Benchmark runtime of the motif-based algorithm on synthetic networks
of different sizes (number of nodes). For each size, 50-100 random networks
with in-degree k=2 are generated. For multi-level networks, each node has a 50%
chance of having 2 levels and a 50% chance of having 3 levels. In all runs,
the attractors found by the algorithm are identical or highly consistent with the
attractors found with the sampling method. .................................................. 61
Table 3.2 Summary of the runtime of the two algorithms. The networks fall into
three categories. The first column is the number of networks in each category.
The second column is the range of the network sizes in each category. The 3rd
and 4th columns indicate whether motif analysis and GINsim STG/HTG
generation was successfully completed or not. For completed analysis, the
range of computational time is shown in the table. Otherwise, we indicate
DNC (meaning “did not complete”), which includes cases that ran out of
memory or did not finish in 6 hours. All tests were run on a personal computer.
There is no model where GINsim succeeds and the motif-based algorithm fails.
The motif algorithm is successful in 18 of 19 models, while GINsim
STG/HTG only works in the small networks of the first category. ............... 63
Table 4.1 Simulation of the closure pattern compared with experimental
observation. The first two columns indicate the status of the ABA and CO2
signal. The third column is the intervention applied to the system. External
Calcium is a treatment; XLG and GPA KO represent the xlg triple mutant or
gpa1 mutant, respectively. The “Observed” column indicates the qualitative
outcome of the experiments. “Closure” indicates a significantly decreased
stomatal aperture compared to the control setting that lacks any signal or
intervention. “Loss of closure” indicates that the relevant intervention causes
a substantial decrease in the effect of the relevant signal, thus the combined
outcome of the signal and intervention is closer to the control (no closure) than
to the effect of the signal alone (closure). The “Simulation” column records
the simulated closure value at the end of the simulation (i.e. after 40 time steps)
under each condition, averaged over 100 simulations. A value less than 1 in
the simulation column is consistent with a loss of closure. The table shows that
the model reproduces experimental observations. Notation “KO” means
knockout. ....................................................................................................... 73
Table 4.2 Closure response to interventions of early CO2 signaling components. The
first row is experimental observation of closure response, and the second row
is the model simulation. Additional edges (e.g. RHC1 XLGs) are required
to make the two rows consistent. ................................................................... 76
Table 4.3 Example of single node intervention. The number is the closure value after
50 time steps, averaged over 500 simulations. This set of simulations predicts
that ROS treatment can restore loss of closure in xlg triple mutants or
atrbohD/F mutants. ....................................................................................... 81
Table 4.4 Selected double interventions under each signal: A. External Ca2+; B. CO2;
C. ABA. Each row is a genotype (wildtype or the indicated mutant), and each
column is a treatment (including no special treatment). All simulated closure
values are reported after 50 time steps, averaged over 500 simulations.
Yellowed slots are those that display a significantly different value compared
with no treatment. Conclusions about the observations in each row/column are
given in the last column/row of each sub-table. ......................................... 84
Table 4.5 Simulation of the closure response to CO2 without or with the ‘mesophyll
signal’ node, together with the assumption that mesophyll produces ABA. . 86
Table A.1 Full names of the network components denoted by abbreviated node
names in Figure 2.1. The same abbreviations are used in the original Sun et al.
model and the reduced model. ....................................................................... 95
Table A.2 Stomatal opening levels for simulated single node knockouts in the
reduced model .............................................................................................. 101
Acknowledgments
The research described in this dissertation was supported by the National Science
Foundation grants NSF IIS 1161007, NSF PHY 1205840, NSF PHY 1545832, NSF
MCB 1715826. The findings and conclusions do not necessarily reflect the view of the
funding agency.
I would like to thank my advisor, Dr. Réka Albert, for her exemplary expertise and
professionalism, for her extensive mentoring, and for her firm support throughout my
Ph.D. career. I thank the committee members, Dr. Carina Curto, Dr. Dezhe Jin, Dr. Sarah
M. Assmann, plus Dr. John Fricks and Dr. Costas D. Maranas as former committee
members. I would also like to thank Dr. Colin Campbell, Dr. Zhongyao Sun, Dr. Sarah
M. Assmann, Dr. Jorge G.T. Zañudo, Dr. Gang Yang, Dr. István Albert, Nianyuan Bao,
Parul Maheshwari, Jordan Rozum, Dr. David Chakravorty, Dr. Palanivelu Sengottaiyan,
Dr. Yotam Zait, Dr. David Wooten and Dávid Deritei for helpful discussions related to
my projects.
Chapter 1 Review of biological networks
and dynamic modeling
The content of this chapter is based on a book chapter “Modeling biological information
processing networks” where I am the first author. The book chapter has been submitted
to the book Physics of Molecular and Cellular Systems, edited by Krastan B. Blagoev
and Herbert Levine. A subset of the figures is reproduced in this chapter.
Introduction
Interacting systems abound at every level of biological organization (molecular,
cellular, organ, organism or population). For example, molecular interacting systems
consist of genes, their transcripts (mRNAs), proteins, small molecules; their interactions
include gene transcription, protein translation, protein-protein interactions and chemical
reactions. A fundamental goal of biology is to understand why biological systems
behave the way they do. One promising avenue toward this goal is to realize that
interacting biological systems at each level can determine the behavior at the next level.
For example, cellular decisions, behaviors, and phenotypes arise from the interactions
of numerous molecular components. Similarly, interactions among cells determine how
multi-cellular organisms develop and how tissues and organs function; interactions
among individuals form the basis of social communities; and interactions among species
underlie ecological communities.
A higher-level function, behavior, or phenotype is an emergent property that arises
from the totality of the lower-level elements and interactions. That is, one usually cannot
attribute a cell behavior to a single gene or protein. This does not necessarily mean,
however, that all the elements and interactions are equally important in determining a
higher-level behavior. Biological networks offer a visual and effective way to represent
the lower level elements and interactions; the analysis of these networks is a key step
toward the elucidation of higher-level emergent properties. Specifically, network
analysis and network-based dynamic modeling can be used to determine the repertoire
of cellular behaviors associated with a within-cell network, and to identify the
sub-networks that play a key role in the cell adopting a certain behavior.
Another aspect of understanding biology is that, despite the vast amounts of recent
information about regulatory relationships among genes, proteins, and small molecules,
many knowledge gaps still exist. Networks and network-based modeling can integrate
fragmentary and qualitative interaction information, and can make powerful predictions
about undiscovered biology.
In this chapter we focus on the application of network-related methods and techniques
in understanding biology. A variety of networks can be defined at the cellular,
organismal and ecosystem levels. The examples in this chapter will focus on the
molecular to cellular level. We aim to illustrate how to connect the properties of within-
cell information processing networks to cellular phenotypes. We start by introducing
network concepts and measures such as paths, cycles and strongly-connected
components, and their biological interpretation. Then we introduce network-based
dynamic modeling, which offers in-depth insights into dynamical processes and the
effect of perturbations. We describe the construction of dynamic models, and
demonstrate their predictive power through two examples. We also introduce
methodologies that reveal structure-dynamics connections through the construction of
so-called expanded networks.
Networks in biology
A network (also called graph) consists of nodes (also called vertices) and edges that
connect pairs of nodes. In a biological network, nodes represent biological elements, for
example proteins and molecules in a cell signaling process; edges represent interactions
or regulatory effects between these elements. The edges of a network may be directed
or undirected. An undirected edge connects a node pair without order, that is, edge (x,
y) is identical to edge (y, x). For a directed edge the order of the node pair matters: a
directed edge (x, y) starts from x and ends in y. One can refer to x as the tail or source
of the edge and refer to y as the head or target of the edge; y is also said to be a direct
successor of x and x is said to be a direct predecessor of y. Edges can also be
characterized by positive or negative signs. In biological networks, the sign of an edge
represents the effect of the regulation. A positive edge stands for positive regulation (i.e.,
activation); a negative edge stands for negative regulation (i.e., inhibition). Biological
networks are often directed and signed; this way the network is a reflection of the flow
of mass and information in the system. During the construction of the network, certain
nodes may be designated as markers or proxies for higher-level behavior. For example,
certain genes or proteins can be used as markers of cell types (as it is also done in
experimental investigation), and abstract nodes can be added as proxies of the
phenotypic outcome of a signal transduction network.
We use as first illustration a signal transduction network inside plant guard cells. This
network will be described in more detail in Chapter 2. Guard cells border the stomata,
which are pores on leaf surfaces that allow the plant to exchange carbon dioxide (CO2)
and oxygen with the atmosphere. The shape change of the guard cells determines
stomatal opening (increased aperture) or closure. This shape change is elicited by
environmental signals, including light of different wavelengths, CO2 concentration, as
well as internal signals (hormones) such as abscisic acid (ABA). Thus, the within-guard
cell signal transduction network can be defined as the elements and interactions that
respond to the external and internal signals and yield stomatal opening (or closure). Sun
et al. constructed a signaling network of light-induced stomatal opening, which
contained more than 70 nodes and 150 directed and signed edges [1]. Figure 1.1 shows
a reduced version of this network, with 32 nodes, including the outcome node Stomatal
Opening, and 81 edges [2].
The organizational features of a network reflect the properties that are critical for
emergent behavior. One way to connect the micro-scale (node) properties to the macro-
scale (network) properties is to look at the connectivity patterns of a network. First, one
can analyze the patterns of how the edges are distributed among nodes. A local measure
of this is the node degree, which in directed networks can be separated into in- and out-
degree. The in-degree of a node is the number of its incoming edges; the out-degree of
a node is the number of its outgoing edges. Nodes with a high in-degree have many
regulators and nodes with a high out-degree regulate many other nodes. A node can also
have a high degree (sum of in- and out-degree) by having intermediate values of in- and
out-degree. All of these types of high-degree nodes (also called hub nodes) have
biological meaning.
A node with only outgoing edges and no incoming edges is called a source node.
These nodes represent external signals. A node with only incoming edges and no
outgoing edges is called a sink node; these nodes represent outcomes of the network. In
the reduced stomatal opening network there are four source nodes, each representing a
signal, namely CO2, Blue Light, Red Light and ABA. There is a single sink node,
Stomatal Opening. The highest-degree (hub) nodes include AnionCh (referring to
multiple anion channels) with in-degree 6 and out-degree 2; and [Ca2+]c (cytosolic Ca2+
concentration) with in-degree 4 and out-degree 7.
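The degree-based measures above can be computed directly from an edge list. The following is a minimal sketch in plain Python on a hypothetical four-node signed network (the node names and edges are assumptions made for illustration, not the network of Figure 1.1).

```python
# Hypothetical signed, directed network (not the network of Figure 1.1).
# Each edge is (source, target, sign): +1 = activation, -1 = inhibition.
edges = [
    ("Signal", "A", +1),
    ("Signal", "B", -1),
    ("A", "B", +1),
    ("B", "A", -1),
    ("A", "Outcome", +1),
    ("B", "Outcome", +1),
]

nodes = sorted({n for s, t, _ in edges for n in (s, t)})

# Count incoming and outgoing edges of every node.
in_degree = {n: 0 for n in nodes}
out_degree = {n: 0 for n in nodes}
for s, t, _ in edges:
    out_degree[s] += 1
    in_degree[t] += 1

# Source nodes have only outgoing edges; sink nodes have only incoming edges.
source_nodes = [n for n in nodes if in_degree[n] == 0 and out_degree[n] > 0]
sink_nodes = [n for n in nodes if out_degree[n] == 0 and in_degree[n] > 0]

# Total degree is the sum of in- and out-degree.
total_degree = {n: in_degree[n] + out_degree[n] for n in nodes}

print(source_nodes, sink_nodes, total_degree)
```

In this toy network the single source node plays the role of a signal and the single sink node plays the role of an outcome, while A and B share the highest total degree.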
Figure 1.1. Signal transduction network corresponding to the process of stomatal
opening in plants, adapted from [2]. This network has 32 nodes and 81 directed edges.
Arrows represent positive edges, and terminal filled circles represent negative edges.
The network contains three strongly-connected components (SCCs), marked with
dotted lines. The thick edges indicate a path from the source node ‘Blue light’ to the
sink node ‘Stomatal Opening’.
In order to characterize the flow of information from source nodes (signals) to sink
nodes (outcomes), we can use the concept of the path. A path is a sequence of distinct
nodes in which each node is adjacent (connected by an edge) to the next one. If there is
a path from node A to node B, B is reachable from A, meaning that information may be
transmitted from A to B. For example, the thick edges in the reduced stomatal opening
network form a path from the input signal Blue Light to the outcome node Stomatal
Opening. All the edge signs in this path are positive, making the path positive. More
generally, a path is positive if it contains an even number of negative edges.
Conversely, a path is negative if it contains an odd number of negative edges. The
indirect connection between any two nodes can be characterized by the distance
between them, defined as the number of edges along the shortest path connecting them.
The thick path shown above has a length of 6. However, it is not the shortest path from
Blue Light to Stomatal Opening. The shortest path has four edges, thus the distance
between these two nodes is 4.
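As an illustration of path sign and distance, the sketch below multiplies edge signs along a given path and uses breadth-first search to find the number of edges on a shortest directed path. The network, its node names, and the helper names path_sign and distance are hypothetical and chosen only for this example.

```python
from collections import deque

# Hypothetical signed edges: (source, target) -> sign (+1 or -1).
signed_edges = {
    ("Light", "X"): +1,
    ("X", "Y"): -1,
    ("Y", "Z"): -1,
    ("Z", "Opening"): +1,
    ("X", "Opening"): +1,
}

def path_sign(path):
    """Product of the edge signs along a path given as a list of nodes."""
    sign = 1
    for u, v in zip(path, path[1:]):
        sign *= signed_edges[(u, v)]
    return sign

def distance(start, goal):
    """Number of edges on a shortest directed path, or None if unreachable."""
    successors = {}
    for u, v in signed_edges:
        successors.setdefault(u, []).append(v)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return dist[u]
        for v in successors.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

long_path = ["Light", "X", "Y", "Z", "Opening"]
print(path_sign(long_path), len(long_path) - 1, distance("Light", "Opening"))
```

Here the four-edge path contains two negative edges and is therefore positive, yet it is not the shortest path: the distance between the two end nodes is set by the two-edge route through X.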
A pair of nodes can be connected by multiple paths (as we have seen for Blue Light
and Stomatal Opening). If all these paths have the same sign, the regulatory relationship
between the two nodes can be unambiguously characterized as positive or negative. A
network wherein the relationship between all pairs of nodes is unambiguous is called
sign-consistent (or structurally balanced). It is also possible that paths of both signs exist
between a pair of nodes, making their relationship ambiguous. This ambiguity can be
resolved by additional, dynamic information (which will be described later). Finally, it
is possible that two nodes are disconnected (there are no paths between them); these
nodes do not influence each other.
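For small networks, the regulatory relationship between two nodes can be classified by enumerating all simple directed paths between them and collecting their signs. The sketch below does this by brute force on a hypothetical network; the helper names simple_paths and relationship are assumptions for illustration, and exhaustive path enumeration is only practical for small networks.

```python
# Hypothetical signed edges: (source, target) -> sign (+1 or -1).
signed_edges = {
    ("S", "X"): +1,
    ("X", "T"): +1,
    ("S", "Y"): +1,
    ("Y", "T"): -1,
    ("X", "U"): -1,
}

def simple_paths(start, goal):
    """Yield every directed path from start to goal that repeats no node."""
    successors = {}
    for u, v in signed_edges:
        successors.setdefault(u, []).append(v)
    stack = [[start]]
    while stack:
        path = stack.pop()
        for v in successors.get(path[-1], []):
            if v == goal:
                yield path + [v]
            elif v not in path:
                stack.append(path + [v])

def relationship(start, goal):
    """Classify the effect of start on goal from the signs of all simple paths."""
    signs = set()
    for path in simple_paths(start, goal):
        sign = 1
        for u, v in zip(path, path[1:]):
            sign *= signed_edges[(u, v)]
        signs.add(sign)
    if not signs:
        return "no path"
    if signs == {1}:
        return "positive"
    if signs == {-1}:
        return "negative"
    return "ambiguous"

print(relationship("S", "T"), relationship("S", "U"), relationship("T", "S"))
```

In this example S reaches T through one positive and one negative path, so their relationship is ambiguous from the network structure alone.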
A special type of path is the cycle: it starts and ends at the same node and does not
revisit any nodes. Another way to refer to a directed cycle is feedback loop. For example,
in the stomatal opening network, the nodes NO, PLD, and ROS form the NO cycle,
circumscribed by the dotted rectangle in Figure 1.1. In a directed and signed network,
one can define the sign of a cycle (feedback loop) depending on whether the number of
negative (inhibitory) edges is odd or even. If a feedback loop has an even number of
negative edges, it is a positive feedback loop. Thus, mutual inhibition between two
nodes is an example of a positive feedback loop. If a directed cycle has an odd number
of negative edges, it is a negative feedback loop. The NO cycle is a positive feedback
loop. In the stomatal opening network there is a negative feedback loop between [Ca2+]c
and the node Ca2+ ATPase, which represents the pumps and transport mechanisms
aiming to prevent a sustained high cytosolic Ca2+ concentration, which would be
detrimental to the cell. The signs of feedback loops are significant for
predicting emergent properties of a network: positive feedback loops are necessary for
multi-stability, and negative feedback loops are necessary for sustained oscillations [3].
We will talk about dynamics in detail in the next section of this chapter.
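The sign rule for feedback loops can be illustrated with a short sketch that enumerates the simple directed cycles of a small network and multiplies the edge signs around each one. The network below is hypothetical: it contains a mutual inhibition between A and B and an activation-inhibition pair between Ca and Pump, with edge signs assumed for illustration. Each cycle is reported once, starting from its alphabetically smallest node.

```python
# Hypothetical signed edges: (source, target) -> sign (+1 or -1).
signed_edges = {
    ("A", "B"): -1,
    ("B", "A"): -1,
    ("Ca", "Pump"): +1,
    ("Pump", "Ca"): -1,
    ("B", "Ca"): +1,
}

def simple_cycles():
    """Enumerate simple directed cycles, each rooted at its smallest node."""
    successors = {}
    for u, v in signed_edges:
        successors.setdefault(u, []).append(v)
    nodes = sorted({n for edge in signed_edges for n in edge})
    cycles = []
    for start in nodes:
        stack = [[start]]
        while stack:
            path = stack.pop()
            for v in successors.get(path[-1], []):
                if v == start:
                    cycles.append(path)
                elif v > start and v not in path:
                    stack.append(path + [v])
    return cycles

def cycle_sign(cycle):
    """Product of edge signs around the cycle, including the closing edge."""
    sign = 1
    for u, v in zip(cycle, cycle[1:] + cycle[:1]):
        sign *= signed_edges[(u, v)]
    return sign

loops = {tuple(c): cycle_sign(c) for c in simple_cycles()}
print(loops)
```

The mutual inhibition has two negative edges and is classified as a positive feedback loop, while the Ca and Pump pair has one negative edge and is classified as a negative feedback loop.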
One especially important connectivity pattern of a network is its strongly connected
component (SCC). An SCC is a sub-network in which every node is reachable from
every other node. As each SCC is made up of cycles, it can serve as an information
processing, decision-making unit [4]. If node A is reachable from node B but node B is
not reachable from A, these nodes are called weakly connected. This chapter will focus
on networks that are at least weakly connected. For a strongly-connected component,
one can identify its in-component as the nodes that can reach the SCC but cannot be
reached from the SCC, and its out-component as the nodes that can be reached from the
SCC but cannot reach the SCC. These components often have functional interpretations.
For example, many biological networks contain a dominant SCC. The in-component of
this SCC contains the signal(s) and its out-component contains the outcome(s); most of
the paths from signal(s) to outcome(s) pass through the SCC. In the stomatal opening
network, there are three strongly-connected components, namely the Ci SCC, the NO
cycle, and the ion SCC, as shown in Figure 1.1. The ion SCC is the dominant SCC. Its
in-component includes the four signals, the other two smaller SCCs, and 7 other nodes
(i.e. all nodes above the ion SCC in the figure). Its out-component is a single node,
Stomatal Opening. The node sucrose is neither in the in-component nor out-component
of the ion SCC.
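The decomposition into an SCC, its in-component, and its out-component follows directly from reachability. The sketch below computes these sets for a hypothetical six-edge network using a simple reachability search; it is meant for small examples, and a dedicated algorithm (for instance, the SCC routines of a graph library) would be preferable for large networks.

```python
# Hypothetical directed network; edge signs are omitted because only
# reachability matters for this decomposition.
edges = [("S", "A"), ("A", "B"), ("B", "A"), ("B", "O"), ("S", "M"), ("M", "O")]

successors = {}
for u, v in edges:
    successors.setdefault(u, []).append(v)
nodes = sorted({n for edge in edges for n in edge})

def reachable(start):
    """Nodes that can be reached from start along at least one directed edge."""
    seen, stack = set(), [start]
    while stack:
        for v in successors.get(stack.pop(), []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

reach = {n: reachable(n) for n in nodes}

def scc_of(node):
    """The node together with all nodes that are mutually reachable with it."""
    return frozenset({node} | {m for m in reach[node] if node in reach[m]})

sccs = {scc_of(n) for n in nodes}
dominant = max(sccs, key=len)

# In-component: nodes outside the SCC that can reach it.
in_component = {n for n in nodes if n not in dominant and reach[n] & dominant}
# Out-component: nodes outside the SCC that can be reached from it.
out_component = {
    n for n in nodes if n not in dominant and any(n in reach[m] for m in dominant)
}
neither = set(nodes) - dominant - in_component - out_component

print(sorted(dominant), in_component, out_component, neither)
```

In this toy network the node M lies on a path from the signal S to the outcome O that bypasses the SCC, so it belongs to neither the in-component nor the out-component.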
A variety of software for network visualization and analysis exists. For example, yEd
excels in visualizing mid-size networks using a number of effective layouts. Cytoscape
is an open source software platform for visualizing molecular interaction networks and
integrating these networks with multiple types of data [5]. NetworkX is a Python
package for the creation and analysis of complex networks [6].
Dynamic modeling
Networks defined as in the previous section indicate which biological entities interact
with and regulate each other, but do not provide details about the results of multiple
regulatory relationships that are incident on the same node. This is especially
problematic if the network is not sign-consistent, meaning that the regulatory
relationships of a subset of the nodes are ambiguous. For example, in the Stomatal
Opening network, the node CO2 positively and directly regulates Ci, but it also has an
indirect negative effect on Ci via carbon fixation. If one wants to evaluate the aggregated
result of multiple interactions like this, one must consider the temporal and quantitative
aspects of information propagation on the network. This is done by network-based
dynamic modeling.
After a network is established, one associates each node with a variable to represent
its state. For example, if the node represents a protein in a cell signaling network, the
state variable can represent this protein’s concentration or activation level; if a node
represents a species in a food web, the state variable can represent the population of this
species. Then one constructs a regulatory function for this variable, based on the
regulators of the node indicated in the network. In this way, information (realized as a
state change of a node) propagates through the network. Each node’s state will evolve
over time, eventually converging into a long-term behavior such as a steady state or a
sustained oscillation. The phenotype of the system can then be characterized by the
long-term state of all nodes, or of a subset of the nodes. For example, if there is a sink
node that represents a phenotypic outcome, the long-term state of this node may be a
sufficient proxy to describe the whole system.
To construct a dynamic model, the modeler starts by identifying the process
to be modeled, which specifies the signals and outcomes of the network. The next step is to
identify the additional nodes involved in the process, and the interactions among them.
Experimental interaction data, such as physical interactions, chemical reactions, post-
translational modifications, causal effects of knockouts, are used in this process. Then
the regulatory function for each node needs to be determined, and is usually
parameterized using experimental interaction data. In the vast majority of cases, there
isn’t enough information to fully characterize and parameterize each regulatory function.
Once a model is established, one can validate it by simulating the model and
comparing the results with experimental data. A simulation starts at an initial state that
represents the resting (pre-stimulus) status of the system, and it identifies the
consecutive states by applying the regulatory functions. The simulation result should
agree with the experimentally known response of the system to the signal(s).
Intervention or perturbation scenarios can also be simulated and analyzed. Comparing
the model’s results with existing experimental results in these scenarios is an additional
test of the model. If there are discrepancies, one or more regulatory functions need to
be adjusted until a reasonable percentage of simulations is consistent with experiments.
This adjustment process decreases the uncertainty of the regulatory functions.
After the model is validated, it can be used to make predictions about situations that
were not studied before. For example, the model can identify key nodes, whose
perturbation disrupts a certain behavior of the system. If this behavior is undesired (e.g.
it represents uncontrolled growth of cancer cells), these key nodes serve as intervention
targets. The model’s predictions should be tested by follow-up experiments, which may
confirm the predictions or contradict them. Both cases represent a gain of knowledge.
An invalidated prediction spurs further revision of the regulatory functions and
increases their certainty. Figure 1.2 presents a flow chart of the modeling process.
Figure 1.2. Flow chart of the main steps of constructing and analyzing a dynamic
model of a signal transduction network. The key to the construction and validation of
the model is experimental data. Different types of data are used for model construction
and for model validation: interaction data and initial state data are used as inputs; and
time-course or long-term state data are used for model validation. This separation of
information helps avoid overfitting.
There are multiple frameworks for dynamic modeling, categorized by the type of
their state variables, the type of their time variable, or by the incorporation of
stochasticity in the model. For example, in continuous modeling the state variables as
well as time are continuous, and the regulatory functions describe the rate of change of
the state variables by differential equations. In discrete modeling, the state variables are
discrete, with regulatory functions that indicate the value of the state variables after a
time delay (which usually is given as a multiple of a discrete time unit). The major
advantage of discrete modeling is that it can reflect the functional repertoire of a
biological system without the need for a lot of kinetic information. If one wants to
construct a continuous regulatory function as part of a differential equation model, few
parameter values are known beforehand, and one needs to estimate a lot of parameters
by fitting experimental data. However, such experimental data is difficult to obtain, and
far less of it is available than would be needed for the construction and
validation of continuous models. On the other hand, discrete models, especially Boolean
models, require a minimal number of parameters, and have been shown to be useful in
biological modeling, especially in modeling large systems [7-11]. This chapter will
focus on discrete dynamic modeling using discrete time.
The simplest discrete dynamic model is the Boolean model, where each node can
only take two states. The 0 (OFF) state refers to a concentration or activity
insufficient (below threshold) to initiate downstream processes; conversely, the 1 (ON)
state represents a sufficient (above-threshold) concentration or activity. The Boolean
regulatory functions can be expressed with the Boolean operators ‘AND’, ‘OR’ and
‘NOT’; they can also be expressed as truth tables. When a node has more than two but
finite levels, the 0 state refers to inactivity, and different levels of activity are usually
represented by positive integers, e.g. 1, 2, up to the maximal activity. The choice of
number of levels to use is determined by experimental evidence (e.g. if there are
different outcomes when a node has an intermediate activity compared to when it is
fully active). For example, the stomatal opening model has more than two levels for
about one-third of the nodes, informed by observations of additive or synergistic
relationships between blue and red light in regulating nodes of the network. The rest of
the nodes, for which no such evidence exists, are binary. The regulatory functions of
multi-level discrete dynamic models can be expressed in multiple ways [8, 12],
including a truth table, as shown in Figure 1.3.
Figure 1.3. Example of Boolean and multi-level regulatory functions in truth table
representation. A truth table is generated by enumerating all input combinations and
indicating the corresponding outputs. The output of the regulatory function will become
the next state of the target node. Here and throughout the chapter we represent the next
state of node A as A*.
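A truth table of this kind can be generated programmatically by enumerating all input combinations. The sketch below is a minimal illustration; the two regulatory functions (a Boolean rule A* = B AND NOT C, and a multi-level rule in which B has three levels and C switches A off) are hypothetical and are not the functions shown in Figure 1.3.

```python
from itertools import product

def truth_table(function, levels):
    """Enumerate all input combinations and append the function's output.

    levels maps each input name to its number of discrete levels.
    Each returned row is (input values..., next state of the target node).
    """
    names = list(levels)
    rows = []
    for values in product(*(range(levels[name]) for name in names)):
        output = function(**dict(zip(names, values)))
        rows.append(values + (output,))
    return rows

def boolean_rule(B, C):
    """Hypothetical Boolean rule: A* = B AND NOT C."""
    return int(B and not C)

def multilevel_rule(B, C):
    """Hypothetical multi-level rule: A* follows B (0, 1 or 2) unless C is ON."""
    return B * (1 - C)

boolean_table = truth_table(boolean_rule, {"B": 2, "C": 2})
multilevel_table = truth_table(multilevel_rule, {"B": 3, "C": 2})

for row in boolean_table:
    print(row)
```

The last entry of each row is the next state A* of the target node for the corresponding input combination.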
Discrete models use different implementations of time evolution, called update
schemes. A synchronous update scheme is where all nodes are evaluated at once, and
each node will take its regulatory function-given value as its state in the next time step
(e.g. the next state of node A, denoted A*, is given in the last column of the truth table
in Figure 1.3). This update scheme is realistic if the synthesis and decay processes of
each node are the same; for example this may apply to certain gene regulatory networks,
as the timing of gene transcription and mRNA degradation is similar, on the order of
minutes. Asynchronous update schemes allow different nodes to update at different
rates, which is necessary in networks that include both pre- and post-translational events.
There are many ways to implement an asynchronous update. Some are deterministic,
for example updating nodes according to a fixed order; others are stochastic, for
example in general asynchronous update, at each time step a randomly chosen node is
updated.
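The difference between these update schemes can be made concrete with a small sketch. The three-node Boolean rules below are hypothetical, and the function names are chosen for illustration; the sketch implements one step of synchronous update, one step of a deterministic fixed-order asynchronous update, and one step of general asynchronous update.

```python
import random

# Hypothetical Boolean rules: A* = B, B* = A AND NOT C, C* = NOT A.
rules = {
    "A": lambda s: s["B"],
    "B": lambda s: int(s["A"] and not s["C"]),
    "C": lambda s: 1 - s["A"],
}

def synchronous_step(state):
    """All nodes are evaluated at once on the current state."""
    return {node: rule(state) for node, rule in rules.items()}

def fixed_order_step(state, order):
    """Nodes are updated one after another; later nodes see earlier updates."""
    new_state = dict(state)
    for node in order:
        new_state[node] = rules[node](new_state)
    return new_state

def general_asynchronous_step(state, rng):
    """A single randomly chosen node is updated."""
    node = rng.choice(sorted(rules))
    new_state = dict(state)
    new_state[node] = rules[node](state)
    return new_state

state = {"A": 1, "B": 0, "C": 0}
sync_next = synchronous_step(state)
ordered_next = fixed_order_step(state, ["A", "B", "C"])
async_next = general_asynchronous_step(state, random.Random(0))
print(sync_next, ordered_next, async_next)
```

Starting from the same state, the synchronous and fixed-order steps already lead to different successor states, while a general asynchronous step changes at most one node.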
Given an initial state and an update scheme, the system’s state will eventually evolve
into an attractor. An attractor is a minimal set of states of the system, from which only
states in the same set can be reached. The simplest attractor, called a fixed point, consists
of a single state. This state is also referred to as a steady state (in analogy with
continuous models). Attractors consisting of more than one state, which the system
keeps revisiting, are called complex attractors or oscillating attractors.
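Under synchronous update each state has a single successor, so an attractor can be found by following the trajectory until a state repeats. The sketch below assumes a hypothetical three-node Boolean model (A* = B, B* = A AND NOT C, C* = NOT A); the function name find_attractor is likewise an assumption for illustration.

```python
def step(state):
    """Synchronous update of the hypothetical rules
    A* = B, B* = A AND NOT C, C* = NOT A; the state is the tuple (A, B, C)."""
    a, b, c = state
    return (b, int(a and not c), 1 - a)

def find_attractor(initial_state):
    """Follow the trajectory until a state repeats and return the repeating part."""
    first_visit = {}
    trajectory = []
    state = initial_state
    while state not in first_visit:
        first_visit[state] = len(trajectory)
        trajectory.append(state)
        state = step(state)
    # The states from the first visit of the repeated state onward form the attractor.
    return trajectory[first_visit[state]:]

attractor_1 = find_attractor((1, 0, 0))
attractor_2 = find_attractor((1, 1, 0))
print(attractor_1, attractor_2)
```

A returned list with a single state is a fixed point; a longer list would be a complex attractor. This trajectory-following approach relies on the update being deterministic, so it does not apply to stochastic asynchronous schemes.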
The evolution of a system can be effectively summarized into a state transition graph
(STG), whose nodes are the states of the system, and whose edges represent allowed
state transitions. In the state transition graph, attractors have one-to-one correspondence
with sink states or terminal SCCs (SCCs that do not have any successor nodes). That is,
a fixed-point attractor is a sink state and a complex attractor is a terminal SCC in the
STG. The intuition of this is simple: if the system gets into an attractor, it cannot escape
from it as there are no out-going state transitions. Since discrete dynamic models of
biological systems have a finite number of nodes and finite number of states, the system
will eventually evolve into an attractor, and then stay in this attractor unless disrupted
by a change in external signals or an internal perturbation. The biological significance
of this is that the attractors represent biological phenotypes. For example, in the stomatal
opening model, one attractor represents stomatal opening, while another attractor
represents stomatal closure.
State transitions depend on the update scheme. For example, in synchronous update,
each state transitions into exactly one state, i.e. each state has one and only one out-going
edge in the STG; while in some stochastic asynchronous update schemes one state can
transition into different states. This means that the attractors that involve state transitions,
i.e. complex attractors, will depend on the update scheme, too. Fixed points are the same
under different update schemes. Figure 1.4 demonstrates an example where a Boolean
model’s complex attractor depends on the update scheme.
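The dependence of complex attractors on the update scheme can be checked by brute force
on a small example. The following is a minimal Python sketch under stated assumptions: the
two-node network (A* = B, B* = A) and all function names are hypothetical choices made for
illustration, and the network is not necessarily the model shown in Figure 1.4. The sketch
builds the STG under each update scheme and reports its terminal SCCs.

```python
from itertools import product

def successors(state, rules, names, scheme):
    """Return the set of states reachable from `state` in one update step."""
    current = dict(zip(names, state))
    if scheme == "synchronous":
        # All nodes are updated simultaneously: exactly one successor.
        return {tuple(int(rules[n](current)) for n in names)}
    # General asynchronous: exactly one node is updated per step.
    result = set()
    for i, n in enumerate(names):
        nxt = list(state)
        nxt[i] = int(rules[n](current))
        result.add(tuple(nxt))
    return result

def reachable(start, stg):
    """All states reachable from `start` (including `start` itself)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in stg[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def attractors(rules, scheme):
    """Terminal SCCs of the STG, found by brute-force reachability."""
    names = sorted(rules)
    stg = {s: successors(s, rules, names, scheme)
           for s in product((0, 1), repeat=len(names))}
    found = set()
    for s in stg:
        reach = reachable(s, stg)
        # s lies in a terminal SCC iff every state it can reach can return to it.
        if all(s in reachable(t, stg) for t in reach):
            found.add(frozenset(reach))
    return found

# Hypothetical toy model (an assumption, not taken from Figure 1.4).
rules = {"A": lambda s: s["B"], "B": lambda s: s["A"]}
sync_attractors = attractors(rules, "synchronous")
async_attractors = attractors(rules, "asynchronous")
print(sync_attractors)
print(async_attractors)
```

In this toy model both schemes share the fixed points (0, 0) and (1, 1), while the two-state
cycle between (0, 1) and (1, 0) exists only under synchronous update. Exhaustive enumeration
of this kind is feasible only for very small networks.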
Figure 1.4. Example of a toy Boolean network model and its dynamics under
synchronous update (when both nodes are updated simultaneously) and under general
asynchronous update (when one node is updated at each time step). The dynamics of the
model is represented by a state transition graph (STG), in which system states are
represented by nodes and state transitions are represented by edges. Terminal strongly-
connected components (including nodes with only a self-loop) in an STG are attractors
of the system. This model exemplifies that complex attractors may depend on update
schemes. Specifically, under synchronous update, there is a complex attractor formed
by two states that differ in the value of both nodes. As state transitions that change the
value of two nodes are not possible under general asynchronous update, this complex
attractor disappears under asynchronous update.
In order to determine the complete repertoire of dynamic trajectories of a network-
based model, one needs to identify all possible state transitions. This is computationally
challenging, as the number of states increases exponentially with the number of nodes
(e.g. 2^N for a Boolean network with N nodes). An effective way to reduce the state space
is network reduction; of course this reduction needs to preserve the dynamic repertoire
of the system. Two types of nodes can be reduced (eliminated or merged): source nodes
that have a sustained state, and simple mediator nodes that have one incoming and/or
one outgoing edge. In the reduction, the source node’s state is directly plugged into the
regulatory function of all of its direct successor nodes; then the source node is
eliminated. For a simple mediator node with one direct predecessor (regulator) and/or
one direct successor (target), its regulator is connected directly to its target, and the
mediator node is merged into the regulator. This reduction method is proven to conserve
attractors [13, 14].
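The effect of this reduction on fixed points can be illustrated with a small check. The
sketch below is only an illustration under stated assumptions: the four-node model, its
rules, and the helper name fixed_points are hypothetical, and the comparison covers fixed
points only, not complex attractors. It reduces a sustained source node I and a simple
mediator node M by hand and compares the fixed points before and after reduction.

```python
from itertools import product

def fixed_points(rules):
    """Enumerate all states and keep those that every rule maps to itself."""
    names = sorted(rules)
    found = set()
    for values in product((0, 1), repeat=len(names)):
        state = dict(zip(names, values))
        if all(int(rules[n](state)) == state[n] for n in names):
            found.add(tuple(state[n] for n in names))
    return names, found

# Hypothetical original model: I is a source node sustained at 1,
# M is a simple mediator between A and B.
original = {
    "I": lambda s: s["I"],             # source node keeps its state
    "A": lambda s: s["I"] and s["B"],
    "M": lambda s: s["A"],             # one regulator (A), one target (B)
    "B": lambda s: s["M"],
}

# Reduced model: I = 1 is plugged into the rule of A, and M is bypassed so
# that its regulator A acts directly on its target B.
reduced = {
    "A": lambda s: s["B"],
    "B": lambda s: s["A"],
}

names_o, fps_o = fixed_points(original)
names_r, fps_r = fixed_points(reduced)

# Keep the original fixed points with the source ON, projected onto (A, B).
keep = [names_o.index(n) for n in names_r]
src = names_o.index("I")
projected = {tuple(fp[i] for i in keep) for fp in fps_o if fp[src] == 1}
print(projected == fps_r)
```

The projected fixed points of the original model coincide with those of the reduced model,
while the reduced model has 4 states instead of 16.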
A variety of software tools exist to facilitate discrete dynamical modeling. Model and
software development efforts are coordinated by the Consortium of Logical Models and
Tools (CoLoMoTo), an international open community that aims to develop standards
for model representation and interchange, establish criteria for the comparison of
methods, models and tools, and promote these methods, tools and models [15].
CoLoMoTo members have developed the Qualitative Models Package (“qual”) of the
Systems Biology Markup Language (SBML) [16]. The Cell Collective is a web-based
platform that enables collective model construction and real-time model simulation [17];
GINsim allows asynchronous and/or multi-level dynamics and STG construction [18].
The Python library BooleanNet allows simulation of Boolean models with different
update schemes [19]; SimBoolNet is a Cytoscape app that benefits from the
functionalities and friendly graphic user interface of Cytoscape [20]; the R package
BoolNet can construct and simulate Boolean models and analyze attractors using
exhaustive or heuristic search methods [21].
In the following we present two published models to demonstrate the power of
dynamical modeling of biological networks.
Modeling T cell survival
This model reflects the survival and proliferation of cytotoxic T cells in the context
of the disease T-LGL leukemia. Cytotoxic T cells are generated to fight an infection by
eliminating infected cells, and after the infection is over they usually undergo the
process of activation-induced cell death. However, in T-LGL leukemia they survive,
adopt a cell state different both from resting and from activated T cells, and start
attacking healthy cells. Zhang et al. synthesized the pathways involved in activation-
induced cell death and cell proliferation, as well as the pathways that are known to be
different in T-LGL cells compared to normal cytotoxic T cells (Figure 1.5) [22]. They
formulated a Boolean model of the process and simulated its trajectories, starting from
a just-stimulated T cell, using stochastic timing. The model reproduces the survival of
a fraction of the initial stimulated cells and the known markers of this process, for
example the activation of JAK in every surviving cell. The model has two fixed points:
the normal fixed point that corresponds to programmed cell death, and the disease fixed
point that reproduces the T-LGL survival state. The model predicts that a small subset
of the known deregulations (abnormal node states) is sufficient to cause all the others,
thus preventative efforts should focus on this subset. The model predicts 12 additional
nodes whose state stabilizes in the T-LGL state. The model also predicts several key
nodes whose state change can ensure the apoptosis of the whole population; these key
nodes are potential therapeutic targets for T-LGL leukemia. Several of these predictions
have been verified experimentally.
Figure 1.5. T-LGL survival signaling network by Zhang et al., reproduced with
permission from [22], copyright (2008) National Academy of Sciences, U.S.A. The
network contains 58 nodes and 123 edges. Up-regulated or constitutively active nodes
are in red, down-regulated or inhibited nodes are in green, nodes that have been
suggested to be deregulated (either up-regulation or down-regulation) are in blue, and
the states of white nodes are unknown or unchanged compared with normal. Blue edges
with arrowheads indicate activation and red edges that terminate in diamonds indicate
inhibition. The shape of the nodes indicates the cellular location of the corresponding
proteins, transcripts or molecules: rectangles indicate intracellular components, ellipses
indicate extracellular components, and diamonds indicate receptors. Conceptual nodes
(Stimuli, Cytoskeleton signaling, Proliferation, and Apoptosis) are orange.
In a follow-up project, Saadatpour et al. reduced the system to 6 nodes (in a way that
preserves the attractor repertoire) and determined its state transition graph [23]. They
found that the basin of attraction of the normal fixed point is larger than the basin of the
T-LGL fixed point, but there is a significant overlap between the basins, meaning that
there exist states from which certain trajectories lead to the normal fixed point and other
trajectories lead to the T-LGL fixed point, depending on the order of events. They also
performed systematic single-node perturbation analysis starting from the T-LGL state,
wherein a node is driven and maintained into the state opposite of its state in the T-LGL
survival state. They found that the perturbation of any one of 19 nodes leads to
disappearance of the T-LGL attractor, meaning that the only possible long-term outcome
is apoptosis. Thus these 19 nodes are potential therapeutic targets, whose control
(knockout or constitutive activity) leads to apoptosis of the T-LGL cells. The majority
(68%) of these predictions are corroborated by experimental evidence; the rest have not
yet been assessed. This work illustrates how network-based modeling can be used for
predictions that can potentially lead to new therapeutic targets.
Modeling epithelial to mesenchymal transition (EMT)
The epithelial to mesenchymal transition (EMT) is the process where epithelial cells
lose their cell polarity and cell-cell adhesion, and gain migratory and invasive properties,
to ultimately become mesenchymal cells. The loss of the expression of the protein E-
cadherin is considered the hallmark of the EMT transition. This cell fate change is
beneficial during embryonic development and wound healing, but it also is the first step
of cancer metastasis. Steinway et al. constructed a signal transduction network and
Boolean model of this process (Figure 1.6) [24]. The model uses stochastic update with
separate update probabilities (and thus separate time-scales) for nodes regulated at the
protein and mRNA level.
Simulations of the model start from the epithelial state, after which a sustained input
signal, TGFβ, is provided. During the simulation, most nodes in the model change states,
and the system converges into a fixed point attractor that recapitulates the mesenchymal
state, including the inactivity of E-cadherin. The model reproduces known molecular
markers of the transition and captures the importance of known key mediators, for
example the transcription factors that downregulate the E-cadherin mRNA. The model
also predicts that several pathways which were previously thought to be independent of
TGFβ are also activated through the process. In the sustained presence of TGFβ, the
EMT network can be simplified to 16 nodes, which enables the determination of a state
transition graph (STG). Model simulations and the STG both indicate that despite the
timing (update) stochasticity, all the trajectories end in the mesenchymal state,
indicating that the EMT transition is a robust process. Based on the model, the authors
predicted interventions that can block the transition, and validated several of these
predictions experimentally [25]. This work is important because EMT is the first step
of cancer metastasis so therapies that block it have high clinical potential.
Figure 1.6 EMT network by Steinway et al., reproduced with permission from [24]. The
network has 70 nodes and 135 edges. Nodes that represent extracellular signals are
shown in blue, green nodes are transcription factors, and the single output node EMT is
shown in red. Multiple molecules that serve as extracellular signals are also produced
by the cell, thus these nodes have incoming edges.
Integration of the interaction network and regulatory rules
As we have seen in the previous section, determination of the attractor repertoire of
a dynamical system, and of the ways in which this attractor repertoire changes in
response to perturbations and interventions, is a key step of connecting molecular
interaction networks with cellular behaviors. One of the methods to determine the
attractor repertoire is to use the state transition graph, which contains all the trajectories
of the system. However, the STG can have an enormous number of nodes and edges if
the biological system is large. An alternative way to determine the attractor repertoire
of a system is to exploit the connectivity patterns of the network. Indeed, it has been
shown in multiple dynamic frameworks, including discrete dynamic systems, that
positive feedback loops are necessary for multi-stability, while negative feedback loops
are necessary for sustained oscillations [3, 26-28]. A recently proposed family of
methods to connect between structural and dynamic analysis is based on integrating the
signal transduction network with its regulatory functions, into an expanded network. By
using this approach, one can determine elementary and independent signal transduction
pathways, find centers of stability in the network, reveal the attractor repertoire, and
drive the system into beneficial attractors or away from undesired ones [29-32].
The regulatory logic is integrated into the signaling network in two steps. First, one
creates a virtual node for each state of a node. This virtual node will be Boolean, with
the 1 (True) value indicating that the original node is in this state and the 0 (False) value
indicating that the original node is not in this state. One can construct this virtual node’s
regulatory function by summarizing the corresponding input combinations. In the
Boolean case, the virtual nodes’ regulatory functions can be straightforwardly obtained
from the original node’s regulatory function. For example, a Boolean function A* = not
B will now be represented as two functions, A1* = B0, and A0* =B1. For multi-level
models, the regulatory functions can be constructed from the truth table by summing up
the corresponding input combinations. For example, the virtual nodes and regulatory
functions of the multi-level truth table of Figure 1.3 are A0* = B0 and C0, A1* = B1
and C0 or B0 and C1, A2* = B1 and C1. The resulting regulatory functions are in a
Boolean disjunctive form [33]. Second, one eliminates AND/OR ambiguity by
representing each ‘AND’ clause with a composite node. The nodes in the clause will
have edges pointing to the composite node, and the composite node will have an edge
pointing to the regulated node. This expanded network contains positive edges only, and
explicitly identifies interactions of a combinatorial nature. Examples of expanded
network construction for both Boolean and multi-level functions are shown in Figure
1.7. A more detailed description of the construction of the expanded network of multi-
level dynamic models will be given in Chapter 3.
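The two construction steps described above can be sketched in a few lines of Python. This
is a minimal illustration, not the implementation used in this work: the data layout (a
dictionary mapping each virtual node to a list of AND clauses), the naming of composite
nodes, and the function name expand are assumptions. The rules are the Boolean example
A* = not B and the multi-level example quoted in the text.

```python
def expand(dnf_rules):
    """Build the expanded network from virtual-node rules in disjunctive form.

    dnf_rules maps a virtual node to a list of clauses; each clause is a set of
    virtual nodes joined by AND, and the clauses of a node are joined by OR.
    """
    edges = set()
    composites = set()
    for target, clauses in dnf_rules.items():
        for clause in clauses:
            if len(clause) == 1:
                # A single regulator state is sufficient: direct edge.
                (source,) = clause
                edges.add((source, target))
            else:
                # An AND clause becomes a composite node fed by all its inputs.
                composite = "&".join(sorted(clause))
                composites.add(composite)
                for source in clause:
                    edges.add((source, composite))
                edges.add((composite, target))
    return edges, composites

# Boolean example from the text: A* = not B.
boolean_rules = {"A1": [{"B0"}], "A0": [{"B1"}]}

# Multi-level example from the text (virtual nodes of A with regulators B, C).
multilevel_rules = {
    "A0": [{"B0", "C0"}],
    "A1": [{"B1", "C0"}, {"B0", "C1"}],
    "A2": [{"B1", "C1"}],
}

edges_b, composites_b = expand(boolean_rules)
edges_m, composites_m = expand(multilevel_rules)
print(sorted(edges_b))
print(sorted(composites_m))
```

Every edge produced this way is positive, because inhibition is encoded in the choice of
virtual node (e.g. B0 rather than B1) instead of in the sign of the edge.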
The expanded network makes it easy to identify a sufficient condition to activate a
virtual node (i.e. to make the original node attain the state represented by the virtual
node): a virtual node will have state 1 if any of its regulator virtual nodes has state 1, or
if any of its regulator composite nodes has all its input virtual nodes in state 1. If either
of these conditions is satisfied, the target virtual node will have state 1, regardless of the
state of other regulators of the target node. Following this intuition to more distant
virtual nodes, one can see that a path or subgraph in the expanded network satisfying
the above criterion allows signal propagation from the first node of the path/subgraph
to the last node of the path/subgraph, independent of other nodes; and a cycle in the
expanded network satisfying the above criterion will be self-sufficient to stabilize.
Figure 1.7 Examples of expanded network construction in the Boolean and multi-level
case. Each virtual node is labeled with the state it represents. Each composite node is
black, with a label indicating which node combination it represents. The complete
expanded network is obtained by expanding all regulatory functions of the original
model.
Connectivity patterns of the expanded network lead to the definition of elementary
signaling modes and stable motifs, which reveal important dynamical properties of the
system. An elementary signaling mode (ESM) is defined as a minimal set of
components that can perform signal transduction from signals (source nodes) to
outcome nodes (proxies for cellular responses) [29, 32]. A key property of an elementary
signaling mode is that if it includes a composite node, it must include all the regulators
of the composite node as well (see Figure 1.8). There are many applications of the ESMs:
one can evaluate the importance of signaling components by the effect of their
perturbation on the ESMs of the network; the number of node-independent elementary
signaling modes also shows the redundancy of a network. In many signaling networks
the number of node-independent elementary signaling modes is one, meaning that there
is no more than one independent modality of signaling, and loss of a single node can
disrupt signaling.
Figure 1.8. Example of elementary signaling modes (ESMs) in a partial expanded
network. The labeled virtual nodes correspond to the ON state of the respective nodes
in the original signal transduction network; the black node is a composite node. There
are two ESMs in the network: the path A1 B1 D1 E1, shown as the dotted line, and the
subgraph that contains A1, C1, B1, the composite node, and E1, shown with a dashed
line. Each is sufficient for the signal to activate the outcome. This figure was adapted
from [29].
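The sufficient-activation rule of the expanded network can be turned into a simple
propagation procedure. The sketch below is an illustration under stated assumptions: the
edge list is our reading of the description of Figure 1.8 (A1 activates B1 and C1, B1
activates D1, and E1 is activated by D1 or by the composite node B1 and C1), and the
function name propagate and the knockout handling are hypothetical. It computes which
virtual nodes are guaranteed to turn ON when the signal is sustained, and shows how
knocking out a node affects the two ESMs.

```python
def propagate(dnf_rules, sustained_on, knocked_out=()):
    """Return the virtual nodes guaranteed to reach state 1.

    A virtual node turns ON as soon as one of its clauses (a single regulator
    or a composite node with all inputs ON) is satisfied. Virtual nodes listed
    in knocked_out are never allowed to turn ON.
    """
    blocked = set(knocked_out)
    on = set(sustained_on) - blocked
    changed = True
    while changed:
        changed = False
        for target, clauses in dnf_rules.items():
            if target in on or target in blocked:
                continue
            if any(clause <= on for clause in clauses):
                on.add(target)
                changed = True
    return on

# Assumed reading of the partial expanded network of Figure 1.8.
rules = {
    "B1": [{"A1"}],
    "C1": [{"A1"}],
    "D1": [{"B1"}],
    "E1": [{"D1"}, {"B1", "C1"}],   # path via D1, or composite node B1 and C1
}

intact = propagate(rules, {"A1"})
without_d = propagate(rules, {"A1"}, knocked_out={"D1"})
without_b = propagate(rules, {"A1"}, knocked_out={"B1"})
print("E1" in intact, "E1" in without_d, "E1" in without_b)
```

In this reading, the loss of D leaves the second ESM intact, whereas the loss of B disrupts
both ESMs because they share this node; the two ESMs are therefore not node-independent.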
A stable motif can be defined as a smallest strongly connected component in
the expanded network that satisfies two criteria: 1. It does not contain multiple virtual
nodes that correspond to the same original node; 2. If it contains composite nodes, it
also contains these nodes’ inputs [30, 33]. This definition guarantees that a stable motif
is a self-sufficient cycle, so that it can stabilize on its own, regardless of the rest of the
network. Figure 1.9 is an example of stable motif identification in a three-node model.
It is important to note that a stable motif is both a network motif and an associated state,
encoded in the names of the virtual nodes that form the stable motif. For example, the
first stable motif in Figure 1.9 indicates that the positive feedback loop between A and
B is sufficient to sustain both nodes in the ON state. Stable motifs are centers of stability
in the system and have a one-to-one correspondence to the partial fixed points of the
system. Specifically, each stable motif determines a partial fixed point in which the
nodes of the stable motif, and potentially additional nodes, stabilize. Conversely, each
partial fixed point (i.e. fixed state of a subset of the nodes) corresponds to one or more
stable motif(s).
This one-to-one correspondence indicates that identifying stable motifs is enough to
determine the stabilized part of any attractor of the system. A node must either stabilize
or oscillate in an attractor. Since stabilized nodes are associated with stable motifs, the
nodes not associated with stable motifs must be oscillating or influenced by an
oscillating regulator. In this way, one can identify the attractors of the system by finding
the stable motifs. The main advantage of this method is that it allows identification of
all attractors without enumerating the entire state space. As the size of the expanded
network is smaller than the size of the state space, stable-motif-based attractor
identification is more efficient computationally than state-space-based attractor
identification.
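For very small models, the two criteria of the stable motif definition can be checked by
brute force. The following sketch is an illustration only and is not the algorithm of
[30, 33]: it enumerates sets of virtual nodes instead of searching for cycles in the
expanded network, the three-node model (A* = B and not C, B* = A, C* = not A) is a
hypothetical example rather than the model of Figure 1.9, and the naming convention
(node name followed by a one-digit state) is an assumption. A minimal set in which every
virtual node has a fully contained clause is self-sufficient, and by minimality it is also
strongly connected.

```python
from itertools import combinations

def is_self_sufficient(candidate, dnf_rules):
    """Check the two stable motif criteria on a set of virtual nodes."""
    originals = [v[:-1] for v in candidate]       # strip the state digit
    if len(set(originals)) < len(originals):
        return False                              # two states of one node
    # Every member needs one clause (with all its inputs) inside the set.
    return all(any(clause <= candidate for clause in dnf_rules[v])
               for v in candidate)

def stable_motifs(dnf_rules):
    """Minimal self-sufficient sets, found by increasing set size."""
    nodes = sorted(dnf_rules)
    motifs = []
    for size in range(1, len(nodes) + 1):
        for combo in combinations(nodes, size):
            candidate = frozenset(combo)
            if any(m <= candidate for m in motifs):
                continue                          # not minimal
            if is_self_sufficient(candidate, dnf_rules):
                motifs.append(candidate)
    return motifs

# Hypothetical model: A* = B and not C, B* = A, C* = not A.
rules = {
    "A1": [{"B1", "C0"}], "A0": [{"B0"}, {"C1"}],
    "B1": [{"A1"}],       "B0": [{"A0"}],
    "C1": [{"A0"}],       "C0": [{"A1"}],
}

motifs = stable_motifs(rules)
print([sorted(m) for m in motifs])
```

The enumeration over subsets of virtual nodes is exponential and serves only to make the
definition concrete; the efficiency argument in the text relies on cycle search in the
expanded network.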
Figure 1.9. Example of stable motif identification from a three-node Boolean
dynamic model. The regulatory functions of the virtual nodes are given. The black nodes
in the expanded network are composite nodes. Three stable motifs can be identified
from the expanded representation of the network. The first stable motif represents the
simultaneous activation (state 1) of nodes A and B. The second and third stable motifs
represent the sustained inactivation (state 0) of A and C, respectively. Notice that a stable
motif corresponds to a positive feedback loop (or SCC) in the original network, but not
all positive feedback loops are stable motifs.
The implementation of the attractor identification is an iterative network reduction
based on stabilized components. The idea is simple: if a node is known to stabilize, one
can plug its state into the regulatory functions of its direct successors and eliminate the
node. Similarly, after identifying a stable motif, one can plug in the corresponding states,
identify additional stabilized nodes and reduce them until no more nodes stabilize. After
each step of reduction, new stable motifs may be found and can be plugged in. If at the
end of this iterative process there are any nodes left that cannot be reduced, they must
be related to oscillations. The stable motif sequence (regardless of order) found in the
reduction process determines the attractor [31]. Figure 1.10 demonstrates the complete
attractor identification process of the network example presented in Figure 1.9. Note
that this process is the same for both Boolean and multi-level models; this will be
explained in more detail in Chapter 3. The resulting diagram is referred to as a stable
motif succession diagram. This diagram also reflects the system’s natural dynamical
repertoire: starting from an arbitrary initial condition, sooner or later one of the possible
stable motifs will stabilize, which will make other nodes stabilize, and so on. When a
system allows multiple stable motifs, the timing of events determines which stable motif
stabilizes first, which may make other stable motifs unattainable. For example, in Figure
1.10, the initial condition and timing determine whether both A and B stabilize at 1
(first row) or A stabilizes at 0 (second row). These two stable motifs are mutually
exclusive. The system wherein A stabilized at 0 may achieve stabilization of C at 1 or
at 0, reaching attractor 2, or attractor 3, respectively.
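The "plug in and reduce" step of this process can be sketched as follows. The code is a
minimal illustration under stated assumptions: it covers only the propagation of already
stabilized states, not the search for stable motifs or the construction of the succession
diagram; the function name percolate is hypothetical; and the three-node model
(A* = B and not C, B* = A, C* = not A) is a made-up example, not the model of Figures 1.9
and 1.10. A node is declared stabilized when its regulatory function yields the same value
for every assignment of the nodes that are still free.

```python
from itertools import product

def percolate(rules, fixed):
    """Plug in fixed node states and collect all nodes that stabilize."""
    fixed = dict(fixed)
    changed = True
    while changed:
        changed = False
        free = [n for n in rules if n not in fixed]
        for node in free:
            outcomes = set()
            for values in product((0, 1), repeat=len(free)):
                state = {**dict(zip(free, values)), **fixed}
                outcomes.add(int(rules[node](state)))
            if len(outcomes) == 1:
                # The rule no longer depends on any free node.
                fixed[node] = outcomes.pop()
                changed = True
                break                  # recompute the list of free nodes
    return fixed

# Hypothetical model: A* = B and not C, B* = A, C* = not A.
rules = {
    "A": lambda s: s["B"] and not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: not s["A"],
}

# Plug in the states of a self-sustaining set (A = 0 together with C = 1).
after_motif = percolate(rules, {"A": 0, "C": 1})
# Without any stabilized input, nothing can be reduced.
nothing_fixed = percolate(rules, {})
print(after_motif, nothing_fixed)
```

Here plugging in A = 0 and C = 1 forces B to 0, so all node states are identified and the
reduction ends in a fixed point.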
Figure 1.10. Example of attractor identification with iterative stable motif guided
network reduction using the same model as in Figure 1.9. There are three stable motifs
in the model. In the iterative reduction process, each of them is plugged into the
regulatory functions (represented by indicating the stable motif above an arrow),
resulting in a reduced model (indicated by the interaction network and regulatory
functions), where further stable motif analysis is performed. For simplicity of
representation of the A1, B1 stable motif, we do not show the composite node. When
all nodes’ states are identified in the process, the reduction is complete and an attractor
is obtained.
We illustrate stable motif and ESM analysis on our two previously introduced
examples, the T-LGL network and the EMT network (Figure 1.11, Figure 1.12, and
Figure 1.13). The complete expanded networks are too large and complex to be visually
parsed, so we illustrate the stable motifs in each network. Figure 1.11 is a part of the
stable motif succession diagram of the T-LGL network, illustrating one motif sequence
whose stabilization leads to the normal, apoptosis attractor, and a motif whose
stabilization leads to the T-LGL leukemia attractor. The complete succession diagram
contains more motif sequences. Note that the stable motif that is first in the apoptosis-
inducing sequence and the T-LGL-causing stable motif contain opposite states of the
nodes S1P, PDGFR, and SPHK1. This suggests that the positive feedback among these
nodes, coupled with the mutual inhibition between S1P and Ceramide, is an attractor-
determining connectivity pattern in the T-LGL leukemia network.
Figure 1.11. Part of the stable motif succession diagram of the T-LGL network,
adapted from [31]. The state of the nodes in each motif is indicated by a number,
separated from the node name by an underscore (e.g. S1P_0 represents S1P at state 0).
A stable motif sequence determines the attractor, i.e. Apoptosis or T-LGL leukemia
(cancer). For example, the activation of the Ceramide=0, S1P=1, PDGFR=1, SPHK1=1
motif leads to the reduction of the whole network and convergence into the T-LGL
leukemia attractor.
There are 8 stable motifs associated with the mesenchymal state in the EMT network,
ranging in size from four to eleven nodes. Stabilization of any one of these stable motifs
can independently drive the system into the mesenchymal state. Figure 1.12 shows the
logic backbone of the EMT network, where stable motifs are represented with blue
nodes [34]. All edges of the backbone represent sufficient activation, mediated by a path
or subgraph of the EMT network. The figure indicates that any input signal is sufficient
to drive all stable motifs, any of which is sufficient to drive EMT. An example ESM is
given in Figure 1.13.
Figure 1.12. The logic backbone of the EMT model, reproduced from [34]. This is a
condensed version of the EMT network, where each stable motif of the model is
represented by a single node (in blue), and its causal relationships with the signals and
the outcome node EMT (in yellow) are visualized. All edges are sufficient activations,
i.e. the activity (sustained ON state) of the input node/motif will activate the target node
or motif. Any signal, or any stable motif is sufficient to drive EMT.
Figure 1.13. An example ESM from the EMT network, from the signal PDGF to
output EMT. The state of each node is marked at the end of its node label (e.g.
PDGF_1 means PDGF at state 1). The existence of this ESM indicates that the
sustained presence of the PDGF signal alone is sufficient to drive EMT. Note that this
ESM contains three composite nodes.
The existence of eight stable motifs and their connectivity, illustrated in Figure 1.12,
indicate that EMT is a very robust process. Steinway et al. analyzed the effect of single
and multiple-node knockout (sustained OFF state) on TGFβ-driven EMT, focusing on
the status of the outcome node EMT. They found that the only EMT-blocking single-node
interventions are knockout of the TGFβ receptor or of one of the seven transcription
factors that downregulate E-cadherin. The effective double-node interventions include
knockout of SMAD combined with knockout of another node out of nine, marked with
blue color in Figure 1.14.
Stable motifs also offer a way to control the network. Generally, control can have two
meanings: 1. to be able to drive the system into an arbitrary state (but the system may
not necessarily stay there); 2. to be able to drive the system into an arbitrary attractor.
Because the cellular phenotypes are the attractors of molecular interaction systems, the
second meaning is more natural, and will be our focus. Since stable motifs correspond
to (partial) fixed points of the system, a sequence of stable motifs will determine an
attractor. Therefore, controlling one or more stable motifs (i.e. eliciting their
stabilization by maintaining one or more nodes in a fixed state) is enough to drive the
system into one of its attractors. The number of nodes that need to be controlled
(maintained in a fixed state) can be minimized in two ways. First, not all stable motifs
in a sequence need to be controlled. Specifically, stable motifs whose stabilization
inevitably follows from the stabilization of a previous motif do not need independent
control. Second, to control a stable motif, one does not need to control all of its
nodes, but only a subset of nodes called driver nodes. These two criteria can be used to
predict a small set of driver nodes that can drive the entire system into a desired attractor.
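The notion of driver nodes can be made concrete with a small search. The sketch below is
an illustration under stated assumptions, not the procedure of the cited work: the motif
(A1, B1, C0) and its rules come from a hypothetical three-node model, the function names
closure and minimal_drivers are made up, and the search is exhaustive over subsets of the
motif. A subset counts as a driver set if maintaining its virtual nodes in state 1 is
sufficient, through the expanded network, to turn ON every node of the motif.

```python
from itertools import combinations

def closure(dnf_rules, sustained_on):
    """Virtual nodes that are guaranteed to turn ON given sustained nodes."""
    on = set(sustained_on)
    changed = True
    while changed:
        changed = False
        for target, clauses in dnf_rules.items():
            if target not in on and any(clause <= on for clause in clauses):
                on.add(target)
                changed = True
    return on

def minimal_drivers(motif, dnf_rules):
    """All smallest subsets of the motif whose closure covers the motif."""
    for size in range(1, len(motif) + 1):
        hits = [set(combo) for combo in combinations(sorted(motif), size)
                if motif <= closure(dnf_rules, combo)]
        if hits:
            return hits
    return []

# Hypothetical stable motif and the rules of its virtual nodes
# (from the made-up model A* = B and not C, B* = A, C* = not A).
rules = {
    "A1": [{"B1", "C0"}],
    "B1": [{"A1"}],
    "C0": [{"A1"}],
}
motif = {"A1", "B1", "C0"}

drivers = minimal_drivers(motif, rules)
print(drivers)
```

In this example, maintaining A in state 1 alone stabilizes the whole motif, whereas
maintaining only B = 1 or only C = 0 does not, because A1 requires both inputs of its
composite node.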
Consider the EMT network again, but now focusing on the epithelial state. The
stable motif associated with the epithelial state, shown in Figure 1.14, is quite large (it is
the entire SCC of the EMT network). Yet to control this motif, one only needs to control
as few as five nodes: one node in each yellow rectangle. Maintaining these five nodes
in their epithelial states is enough to ensure convergence to the epithelial state from any
initial state of the system. Taken together, stable motif analysis of the EMT model
allowed the prediction of two types of interventions: interventions that block TGFβ-
driven EMT, thus suppressing features of invasive tumors, and interventions that revert
mesenchymal cells to their epithelial state.
Figure 1.14. Stable motif associated with the epithelial state in the EMT network and
illustration of control sets that guarantee convergence to the epithelial state, reproduced
from [25]. The entire graph is the epithelial stable motif. Nodes in black are OFF, and
nodes in white are ON. Control of one node in each yellow rectangle, e.g. SMAD,
SNAI1, RAS, SHH knockout combined with β-catenin_memb constitutive activation,
ensures convergence to the epithelial state. The nodes highlighted in blue represent
SMAD and the nine nodes whose knockout in combination with SMAD is able to
prevent TGFβ-driven EMT. The fact that these blue nodes are either part of a yellow
rectangle (SMAD, RAS), on a path that ends in a node of a yellow rectangle (DELTA,
NOTCH, NOTCH_ic, CSL) or on a path that starts with a node of a yellow rectangle
(PI3K, AKT) indicates the inclusive relationship between node sets whose control
prevents or, respectively, reverses EMT.
Chapter 2 Analysis of a dynamic model of guard cell
signaling reveals the stability of signal propagation
Most of this chapter is based on previously published work for which I am the first
author [2]. Parts of the published work are reproduced in this chapter from BMC
Systems Biology (open access).
Background
Modeling offers a comprehensive way to understand biological processes by
integrating the components involved in them and the interactions between components.
Models can recapitulate and explain the emergent outcome(s) of the process [35, 36].
Representing cellular processes that involve many proteins and small molecules by a
signal transduction network can reveal indirect relationships between components and
provide new insight [37-39]. Such a network usually consists of nodes representing
biological entities, and edges representing interactions. Once a network has been
constructed, dynamic modeling, where each node in the network is associated with a
variable representing its abundance or activity, can further describe the behavior of the
network. Dynamic models can have continuous variables whose change is described by
differential equations [40], discrete variables described by discrete (logical) regulatory
functions [3, 41], or a combination of continuous and discrete variables [42]. The major
advantage of discrete dynamic and continuous-discrete hybrid modeling is that they use
far fewer parameters than continuous models and thus need less parameter estimation
[11, 18, 43]. Modeling allows one to analyze the biological system represented by the
network in silico, when performing the relevant experiment is infeasible. It also helps
identify general principles of biological systems [44, 45].
The biological process of stomatal opening in plants is a good example of a complex
system wherein modeling leads to significant gain in understanding [1, 46]. Stomata are
pores on leaf surfaces that allow the plant to exchange carbon dioxide (CO2) and oxygen
with the atmosphere. Stomata are formed by two guard cells that can change shape:
swelling of guard cells leads to stomatal opening; their shrinking leads to stomatal
closure. The shape of each guard cell is directly controlled by water flow through the
membrane, which is in turn controlled by ion flow. Different signals can affect the guard
cell, changing its ion concentration in direct and indirect ways, resulting in stomatal
opening or closure [47-49]. These signals include light of different wavelengths, CO2
concentration in the air, and plant hormones like abscisic acid (ABA). The regulation of
stomatal opening is essential to plants, as it controls vital activities like the uptake of
CO2 for photosynthesis, and the unavoidable water loss through evaporation [50].
Through extensive experimentation over several decades, more than seventy proteins
and small molecules have been identified to participate in this process.
Sun et al. [1] constructed a signal transduction network based on conclusions from
more than 85 articles in the literature, describing how more than 70 nodes (proteins,
small molecules, ions) interact with each other in the stomatal opening process. The
network, reproduced as Figure 2.1 [1], includes four source nodes that correspond to the
signals red light, blue light, CO2, and ABA. The more than 150 edges are directed and
signed, with arrowheads indicating activation and terminal black circles indicating
inhibition.
Figure 2.1 The signal transduction network responsible for stomatal opening, as
reconstructed by Sun et al.[1]. The color of a node marks which signal regulates this
node. Red nodes are regulated solely by red light. Blue nodes are regulated solely by
blue light. Yellow nodes are regulated solely by ABA. Grey nodes are regulated by CO2.
Purple nodes are regulated by both blue and red light. Green nodes are regulated by blue
(and potentially, red) light and ABA. White nodes are source nodes not regulated by any
of the four signals. To improve visualization, certain pairs of edges with the same
starting or end nodes overlap. Nodes with multiple levels in the dynamic model are
represented by red shadows; the others are Boolean. The full names of the network
components denoted by abbreviated node names are given in Table 1. This figure and
part of its caption are reproduced from Sun Z, Jin X, Albert R, Assmann SM (2014)
Multi-level Modeling of Light-Induced Stomatal Opening Offers New Insights into Its
Regulation by Drought. PLoS Comput Biol 10(11): e1003930.
doi:10.1371/journal.pcbi.1003930.
Translating this network into a dynamic model, Sun et al. characterized each node
with a discrete variable describing its activity and with a discrete (logical) regulatory
function describing its regulation. Twenty-one out of the 70 nodes in the model are
multi-level, the rest are Boolean (binary). The levels reflect relative and qualitative
information: a level of 2 is a higher level than 1, but should not be interpreted as twice
as high. A few discrete values are not integers; e.g. stomatal opening is a weighted sum
with non-integer weights. The dynamic model has ~10^31 states. The logical regulatory
functions, describing each node’s future state based on the states of the node’s regulators,
use a combination of Boolean logic operators (And, Or, Not), algebraic operations, and
input-output tables. For example, the regulatory function of PRSL1 is:
PRSL1* = phot1complex Or phot2.
Here for simplicity the node states are denoted by the node names; the asterisk in
“PRSL1*” indicates that this will be the next state of PRSL1. The “Or” Boolean
operator expresses that either of the blue light receptors, i.e. the phot1 complex or phot2,
can independently activate PRSL1.
The Sun et al. model starts from an initial condition representative of closed stomata.
Then a combination of the four input signals is applied. Red light, blue light, and ABA
are represented as binary variables, and external CO2 is represented with three states: 0
(CO2 free air), 1 (ambient CO2) and 2 (high CO2). The system’s response is simulated
through repetitive re-evaluation of each node’s state until a stable value of stomatal
opening is observed. The model successfully captured stomatal opening in response to
combinations of the signals. It also successfully reproduces stomatal opening under
most of the experimentally studied perturbation scenarios (i.e. genetic knockouts or
external supply of components). In total, the model is consistent with 63 out of 66
experimental observations collected by Sun et al. [1]. The model predicts the outcome
of a large number of scenarios that have not been explored experimentally so far. It also
revealed a gap of knowledge regarding the cross-talk of red light and ABA signaling,
and filled it with a newly predicted interaction.
Although the Sun et al. model recapitulates existing knowledge and offers new
predictions, the model's full dynamic repertoire could not be characterized due to its
large state space. Instead, Sun et al. focused on tracking the output node, stomatal
opening, and a few selected internal nodes, in time. In this chapter we apply multiple
methods to analyze the model and aim to fully map all its potential long-term behaviors,
or in other words, attractors.
Methods
Attractors of a dynamical system
An attractor is a set of states from which only states in the same set can be reached.
Attractors that consist of a single state are called stable steady states or fixed points;
attractors that contain multiple states are called complex attractors or oscillations [11].
In biological networks, attractors often have significant biological meaning. In a cell
signaling network, attractors correspond to cell types, cell fates or behaviors [51]. For
example, one attractor can represent a healthy differentiated cell, while another attractor
can represent an abnormally motile cancer cell [24].
Update scheme of a discrete time model
In the Sun et al. model, as in most discrete dynamic models, time is an implicit
variable. As there is very little information about the kinetics of the nodes in the stomatal
opening network, the model incorporates an element of stochasticity in timing. The
timing does not affect a system’s fixed point attractors, but it can change the complex
attractors and the possibility of reaching a given attractor from a given initial state [11].
In the Sun et al. model, a random–order asynchronous update is used. Specifically, at
each time step, a random order of nodes (excluding the four input nodes and the output
node stomatal opening) is generated, and each node’s state is reevaluated in this order;
stomatal opening is always updated last. In the next time step a different order is selected
randomly. In this chapter, we use a different type of stochastic update, called general
asynchronous update, wherein a randomly selected node is updated at each time step.
This is required by the network reduction method we use. Although this theoretically
could cause a difference in complex attractors, we will show that in this specific model
the two methods yield the same attractors.
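The difference between the two stochastic update schemes can be illustrated with a minimal sketch. Only the PRSL1 rule is taken from the text; the node "target", the initial state, the step counts, and the function names are hypothetical choices made for illustration, and this is not the simulation code used for the Sun et al. model.

```python
import random

# Toy rules. Only the PRSL1 rule comes from the text; "target" is a
# hypothetical downstream node added so that update order matters
# during the transient.
RULES = {
    "PRSL1": lambda s: s["phot1complex"] or s["phot2"],
    "target": lambda s: s["PRSL1"],
}

def random_order_step(state, rules, rng):
    """One time step of random-order asynchronous update: every node is
    re-evaluated once, in a freshly shuffled order, using the latest states."""
    order = list(rules)
    rng.shuffle(order)
    for node in order:
        state[node] = int(rules[node](state))
    return state

def general_async_step(state, rules, rng):
    """One time step of general asynchronous update: a single randomly
    selected node is re-evaluated."""
    node = rng.choice(list(rules))
    state[node] = int(rules[node](state))
    return state

def simulate(step, n_steps, seed=0):
    rng = random.Random(seed)
    # Inputs are held fixed; internal nodes start OFF (an assumed initial state).
    state = {"phot1complex": 0, "phot2": 1, "PRSL1": 0, "target": 0}
    for _ in range(n_steps):
        step(state, RULES, rng)
    return state

final_random_order = simulate(random_order_step, n_steps=10)
final_general_async = simulate(general_async_step, n_steps=200)
```

In this toy case both schemes end in the same fixed point (PRSL1 = 1, target = 1); only the number of time steps needed differs, because a general asynchronous step updates a single node.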
Network reduction
To reduce the Sun et al. model’s state space, we apply a network reduction method
developed by Saadatpour et al. [13] that is proven to preserve the attractors in a Boolean
model. Two types of nodes can be reduced (eliminated or merged): source nodes with
no incoming edges, and simple mediator nodes that have one incoming and/or one
outgoing edge. In the reduction, the source node’s state is directly plugged into the
regulatory function of all of its direct successor nodes; then the source node is
eliminated. For a simple mediator node with one predecessor (regulator) and one
successor (target), its regulator is connected to its target and the mediator node is merged
into the regulator. If there is one regulator and several targets of the mediator node, but
no direct edges between the regulator and any of the targets, the mediator node is merged
into the regulator. Conversely, if there are several regulators and one target of the
mediator node, but no direct edges among any of the regulators and the target, the
mediator node is merged into its target. Although this method is not proven in the multi-
level case, we conjecture that attractors are also conserved for a multi-level model, and
will show from the results that in the Sun et al. model this reduction method preserved
all attractors.
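The merging of a simple mediator node can be sketched on a toy Boolean network. The nodes A, M, T, and B and their rules are hypothetical and are not part of the Sun et al. model; the check below compares fixed points only, so it illustrates the idea rather than the full attractor-preservation result of [13].

```python
from itertools import product

def fixed_points(rules, input_value):
    """Enumerate fixed points of a Boolean network with one fixed input A.
    The mediator M, if present, is dropped from the reported states."""
    nodes = sorted(rules)
    found = set()
    for values in product([0, 1], repeat=len(nodes)):
        state = dict(zip(nodes, values), A=input_value)
        if all(int(rules[n](state)) == state[n] for n in nodes):
            found.add(tuple((n, state[n]) for n in nodes if n != "M"))
    return found

# Original toy network: A -> M -> T, with a feedback T -> B -> T.
original = {
    "M": lambda s: not s["A"],        # simple mediator: one regulator, one target
    "T": lambda s: s["M"] or s["B"],
    "B": lambda s: s["T"],
}
# Reduced network: the rule of M is substituted into the rule of T,
# and M is removed.
reduced = {
    "T": lambda s: (not s["A"]) or s["B"],
    "B": lambda s: s["T"],
}

same_for_all_inputs = all(
    fixed_points(original, a) == fixed_points(reduced, a) for a in (0, 1)
)
```

For both values of the input A, the fixed points of the reduced network coincide with those of the original network restricted to the remaining nodes.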
Elimination of redundant edges
During the process of creating a discrete dynamic model from biological data, when
an influence is weaker than other influences, the modeler may choose to omit this
influence or, alternatively, include it in a redundant way. The latter choice was made by
Sun et al. in four cases, leading to four regulatory functions that contain an input that
does not affect the outcome of the regulatory function. One of these is
ROS* = NADPH And AtrbohD/F Or NADPH And AtrbohD/F And CDPK Or Not
Atnoa1
The italicized words “And”, “Or” and “Not” are Boolean logic operators; the non-
italicized words represent node names. In this regulatory function every node is Boolean
(binary). The first clause “NADPH And AtrbohD/F” and the second “NADPH And
AtrbohD/F And CDPK” are connected with an “Or” rule, with the result that the node
“CDPK” does not have any influence on the outcome. Therefore, we can reduce the
edge from CDPK to ROS without changing the model’s dynamics. We similarly prune
three additional redundant edges.
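The redundancy of CDPK in the ROS rule can be checked by exhaustive enumeration. The sketch below assumes that “And” binds more tightly than “Or”, consistent with the clause structure described above; the function names are ours.

```python
from itertools import product

def ros_full(nadph, atrboh, cdpk, atnoa1):
    # Rule as written in the model, with "And" evaluated before "Or".
    return (nadph and atrboh) or (nadph and atrboh and cdpk) or (not atnoa1)

def ros_pruned(nadph, atrboh, atnoa1):
    # Same rule after removing the clause that contains CDPK.
    return (nadph and atrboh) or (not atnoa1)

# Compare the two rules on all 16 combinations of the four regulators.
cdpk_is_redundant = all(
    bool(ros_full(n, a, c, t)) == bool(ros_pruned(n, a, t))
    for n, a, c, t in product([False, True], repeat=4)
)
```

The two rules agree on every input combination, which is an instance of the absorption law X Or (X And Y) = X.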
Converting a multi-level model to Boolean
There are several possibilities to convert a multi-level model to Boolean [52]. The
standard method used in the case of logical models of regulatory networks is the Van
Ham mapping [53, 54]. It preserves the dynamics of the original model if the variables
in the original model can be represented by integers and if the original model only
allows state transitions in which one node changes its state by one level [54]. The Sun
et al. model does not satisfy these criteria. However, one general result still applies:
all types of conversion maintain the fixed points and the reachability of states
(i.e. if there is a sequence of state transitions from state A to state B before conversion,
there must be a sequence of state transitions from the corresponding state A’ to state
B’ after the conversion) [54]. So the worst distortion of attractors due to the conversion
is the merging of two complex attractors into one. In this light we choose to use an
economical mapping of each multi-level node into as many Boolean nodes as necessary
for the binary representation of the corresponding integer. We will show that in this
specific model, the conversion did not change the attractors.
Results
Network reduction
The Sun et al. model has a huge state space of ~10^31 states, making its analysis
difficult. To obtain a smaller state space, we reduce the size of the network by applying
a network reduction technique developed by Saadatpour et al. [13] that is proven to
preserve the attractors of Boolean models (see Methods). All source nodes other than
the four signals (blue light, red light, CO2, and abscisic acid) and all simple mediator
nodes are identified and reduced. This process is done iteratively until it cannot be done
any more. A total of 7 source nodes (14-3-3 proteinphot1, PIP2C, AtNOA1, Nitrate, PP1cn,
mitochondria, and CHL1), and 19 simple mediator nodes (phot1, phot2, NIA1, H+-
ATPase, LPL, ATP, acid. of apoplast, [NO3-]v, [Cl-]v, NADPH, [malate2-]v, PA, ABA
receptors, OST1, PRSL1, PIP2PM, AtrbohD/F, Nitrite, and phot1complex) are eliminated.
Several of the simple mediator nodes form linear paths (e.g. phot1, OST1); thus their
iterative reduction shortens the linear paths in the network. In addition, 16 of the 19
reduced mediators have a regulatory function of the form “B* = A”. It is intuitive that
reduction of this node type preserves the attractors.
We do not eliminate the four signal nodes because we want to simultaneously
explore all the combinations of input signals. We also choose to not reduce the five
nodes (Kin, Kout, Kc, Ca2+-ATPase, mesophyll cell photosynthesis) whose merging with
their sole regulator would result in a self-loop (self-regulation), because such self-loops
may be difficult to interpret. Two additional nodes with significant biological meaning
to the network (sucrose, stomatal opening), are not reduced either.
Another form of network reduction is the elimination of redundant edges (see
Methods). After removal of redundant edges, the node CDPK becomes a sink node, thus
it can also be eliminated. The reduction of the above-described nodes and redundant
edges simplifies the network from 70 nodes to 42 nodes, with an estimated state space
of ~10^22 states.
Simplification of regulatory functions
In order to further reduce the state space from ~10^22 to a manageable size, we
grouped state values so that nodes are represented with fewer states. This grouping was
guided by the 66 experimental observations summarized in Sun et al.; we aimed to
maintain the reduced model’s results consistent with these experimental observations.
For example, in the Sun et al. model [1] the regulatory function of Stomatal
Opening is a weighted sum of different ions and sucrose:
Stomatal opening* = [Cl-]v contribution + [NO3-]v contribution + [K+]v + [malate2-]v
contribution+ sucrose – RIC7/6
The weights of the anion contributions to the osmotic potential were chosen based on
the literature. Also, the anion contributions must not exceed a proportion of [K+]v due
to charge balance. The anion contributions are [malate2-]v contribution ≤ 0.425 × [K+]v ;
[NO3-]v contribution ≤0.10 × [K+]v ; [Cl-]v contribution ≤ 0.05 × [K+]v . The primary
contributions come from [K+]v and sucrose. We grouped the stomatal opening values
into 6 groups with different [K+]v and sucrose values (see Table 2.1 and Appendix A1).
[K+]v   sucrose   Stomatal Opening value      Simplified Stomatal
                  in the Sun et al. model     Opening value
0       0         0                           0
0       1 or 2    1 or 2                      1
1       0         1.58                        1
1.8     1         3.84                        2
1.5     2         4.36                        2
2       0 or 1    3.15 or 4.15                3
4.5     0 or 2    5.18 or 8.92                3
6       0         9.28 or 9.45                5
6       2         11.28 or 11.45              5
9       0 or 2    14.01 or 16.01              6
Table 2.1 Grouping of the stomatal opening values by the level of [K+]v and sucrose.
The first two columns indicate the [K+]v and sucrose levels. The third column is the
possible values of stomatal opening in the Sun et al. model for the given [K+]v and
sucrose levels. Note that here we only show [K+]v, sucrose and stomatal opening value
combinations observed in the simulations of the 66 experimentally studied scenarios
reported by Sun et al.[1]. More stomatal opening values are possible when considering
node perturbations. The 4th column shows the simplified stomatal opening level after
grouping. The update function for the simplified stomatal opening level covers all
possible values of [K+]v and sucrose (see Appendix A1).
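As a worked illustration of the weighted sum and the charge-balance caps, the sketch below computes the stomatal opening value that results when every anion contribution sits exactly at its cap and the RIC7 term is zero. Both conditions are assumptions made for this example, as are the function and variable names; the caps themselves are the values quoted in the text.

```python
# Caps on the anion contributions, as fractions of [K+]v (values from the text).
ANION_CAPS = {"malate": 0.425, "NO3": 0.10, "Cl": 0.05}

def max_stomatal_opening(k_v, sucrose, ric7_term=0.0):
    """Weighted sum with every anion contribution at its cap (an assumption)."""
    anions = sum(cap * k_v for cap in ANION_CAPS.values())
    return k_v + anions + sucrose - ric7_term

# Keys are ([K+]v, sucrose) pairs.
examples = {
    (1, 0): max_stomatal_opening(1, 0),      # 1.575
    (2, 0): max_stomatal_opening(2, 0),      # 3.15
    (1.8, 1): max_stomatal_opening(1.8, 1),  # 3.835
    (1.5, 2): max_stomatal_opening(1.5, 2),  # 4.3625
}
```

Up to rounding, these values agree with the entries 1.58, 3.15, 3.84, and 4.36 listed in Table 2.1 for the same [K+]v and sucrose levels.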
Similarly to the original model, the simplified states represent qualitative, relative
categories. For example, a stomatal opening level of 2 is not twice as high as level 1.
We choose the simplified stomatal opening values so that there is no state “4”, to better
reflect an experimentally observed synergistic effect between blue and red light [48, 49,
55]. Simulations with the simplified regulatory function give stomatal opening = 1 under
monochromatic red light, stomatal opening = 3 under monochromatic blue light, and
stomatal opening = 5 under dual beam illumination, which is larger than the sum
“1+3”. This qualitatively reproduces the experimental observation that under dual beam
illumination stomata open to a size much larger than the sum of opening under
monochromatic blue or red light.
Figure 2.2 The stomatal opening network after model reduction, with 32 nodes and 81
edges. Nodes with shadows have multiple states; other nodes are binary. The three
strongly-connected components (SCCs) of the network are indicated by rectangles with
dashed contours.
We find by simulation of the reduced model, using the same initial condition as the
Sun et al. model, that the simplification of the stomatal opening regulatory function
results in only 3 additional cases of inconsistency with experimental observations out
of a total of 66 experimentally studied scenarios. Additional File 2 of [2] lists all
experimental observations and compares them to the relevant simulation results.
Ignoring the contribution of malate2-, NO3-, and RIC7 to stomatal opening each causes
one additional discrepancy; ignoring Cl- does not cause any additional discrepancy.
Ignoring these nodes trades a decrease in accuracy for a significant increase in simplicity.
The simplification of the stomatal opening regulatory function eliminates the effect
of vacuolar anions and of RIC7 on stomatal opening. As a result we can further simplify
the Sun et al. model by eliminating 10 nodes in total: [malate2-]a, [malate2-]c, starch,
[Cl-]c, [NO3-]c, [NO3-]a, ROP2, RIC7, ABC, and PEPC. The only edge from these nodes
to other nodes is [malate2-]a → AnionCh. In section 3, Additional File 3 of [2] we show
that eliminating this edge does not change the system’s long-term behavior, i.e.
attractors. Also, the regulatory function describing the cytosolic K+ concentration, [K+]c,
can be simplified without loss, as described in section 3, Additional File 3 of [2]. After
this simplification we have a network of 32 nodes and 81 edges, shown in Figure 2.2.
We will refer to this model as the “reduced model”. A list of nodes and their regulatory
functions is provided in Appendix A1.
Identifying strongly connected components (SCCs) is important for attractor
analysis, as complex dynamic behavior such as oscillations or multi-stability requires
feedback loops [3]. There are three SCCs in the network of the reduced model, as
marked in Figure 2.2. The NO cycle contains three nodes and three positive edges. The
Ci SCC contains three nodes, which form two negative feedback loops. The Ion SCC is
the most complex, containing 13 nodes and 26 edges, 7 of which are negative.
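Strongly connected components can be identified with a standard graph algorithm. The sketch below uses Kosaraju's algorithm on a toy graph. The two feedback loops of the Ci SCC follow the description in this chapter, whereas the orientation of the NO cycle edges and the CO2 → Ci edge are assumptions made for illustration. This is not the procedure used to produce Figure 2.2.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm; graph maps each node to a list of successors."""
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    order, seen = [], set()

    def visit(start):
        # Iterative depth-first search that records nodes in order of completion.
        stack = [(start, iter(graph.get(start, ())))]
        seen.add(start)
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                order.append(node)
                stack.pop()

    for node in sorted(nodes):
        if node not in seen:
            visit(node)

    # Build the reversed graph.
    reverse = {n: [] for n in nodes}
    for u, targets in graph.items():
        for v in targets:
            reverse[v].append(u)

    # Second pass: explore the reversed graph in decreasing completion order.
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        component, stack = set(), [root]
        assigned.add(root)
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(frozenset(component))
    return components

toy = {
    "CO2": ["Ci"],                       # assumed input edge
    "Ci": ["carbon_fixation", "MCPS"],
    "carbon_fixation": ["Ci"],
    "MCPS": ["Ci"],
    "PLD": ["ROS"], "ROS": ["NO"], "NO": ["PLD"],   # assumed orientation
}
nontrivial = {c for c in strongly_connected_components(toy) if len(c) > 1}
```

The source node CO2 forms a single-node component and is filtered out, leaving the two multi-node SCCs of the toy graph.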
Next we perform attractor analysis using two methods: 1. by converting the reduced
model to Boolean and applying two analysis tools; 2. by analyzing the regulatory
functions theoretically. The former method finds all stable steady states and candidate
oscillations; the latter confirms the results of the first method and gives insight about
perturbation scenarios.
Conversion of nodes from multi-level to Boolean states and attractor analysis
We perform the conversion to Boolean to enable attractor analysis by existing
software tools. Zañudo et al. [30] proposed an algorithm to find the attractors of a
Boolean network based on the concept of “stable motif”, a strongly-connected group of
nodes that can stabilize regardless of their inputs. The algorithm finds all stable motifs,
which determine the part of the network that stabilizes in an attractor. After a stable
motif is found, one can plug in its stabilized state into the network, and obtain a smaller
remaining network. After repeating this, eventually the remaining part is either nothing
(indicating a fixed point/steady state) or a candidate oscillating sub-network. Compared
with other software tools [16, 56], the major advantage of this algorithm is that it finds
all the attractors of Boolean networks with hundreds of nodes [30]. Application of this
powerful method requires a Boolean model, so we convert the multi-level model into
Boolean first (see Methods). An example of conversion is given in Table 2.2.
Level of the     State of Bool.   State of Bool.
original node    node_2           node_1
0                0                0
1                0                1
2                1                0
3                1                1
Table 2.2 Example of Boolean conversion. The multi-level node shown in the 1st
column is mapped into two Boolean nodes, shown in the 2nd and 3rd columns, using
the binary representation of the corresponding integer.
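The binary mapping of Table 2.2 can be written as a pair of small helper functions. The function names and the fixed width of two bits are choices made for this sketch.

```python
def to_boolean(level, n_bits=2):
    """Return the bits of `level`, most significant first (node_2, node_1)."""
    return tuple((level >> i) & 1 for i in reversed(range(n_bits)))

def to_level(bits):
    """Inverse mapping: rebuild the integer level from its bits."""
    level = 0
    for bit in bits:
        level = (level << 1) | bit
    return level

# Reproduces the rows of Table 2.2 as {level: (node_2, node_1)}.
table = {level: to_boolean(level) for level in range(4)}
```

A node with three levels still needs two Boolean nodes under this mapping, so one of the four Boolean combinations is left unused.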
More detailed examples of the conversion of the states and regulatory function of
specific nodes are given in the Additional File 4 of [2]. We will refer to the reduced
model after conversion to Boolean variables as the “Boolean-converted reduced model”.
When simulating the Boolean-converted reduced model, all the Boolean nodes that
represent the same entity (the same multi-level node) are updated simultaneously. In
this way the state transitions of the reduced model will be kept the same in the Boolean-
converted reduced model, and therefore the Boolean conversion will not cause
additional discrepancies from experimental observations.
We apply the stable motif algorithm’s implementation, downloaded from
http://github.com/jgtz/StableMotifs/ [30], to the Boolean-converted reduced model. The
algorithm uses the Boolean regulatory functions of the converted model (in a special
format) as input. We consider every combination of sustained states of the five signal
nodes (blue light, red light, ABA, CO2, CO2_high). We find two possible stable motifs,
corresponding to the self-regulatory node PMV_pos (one of the two Boolean nodes
associated with the multi-level node PMV), in conditions where the H+-ATPasecomplex is
inactive. These two stable motifs indicate the bistability of PMV. Under its influence,
another node, Kout, will also be bistable. The algorithm also indicates that for any signal
combination, every node, except [Ca2+]c and Ca2+-ATPase, will stabilize in a fixed state.
[Ca2+]c has three states, and in the Boolean-converted model it is represented by two
nodes, Cac and Cac_high. Cac_high, which represents the higher level of [Ca2+]c,
stabilizes at zero in all situations. Cac and Ca2+-ATPase may oscillate in conditions
where blue light is present and ABA is absent (a total of six cases, two of which allow
PMV bistability). Table 2.3 summarizes key features of the attractors found by the stable
motif algorithm for all 24 input combinations. Attractors where Ca2+ oscillation is not
possible are fixed points (stable steady states).
BL   RL    CO2   CO2_high   ABA   SO (Bool)   SO   Ca2+ Oscillation   PMV_pos
                                                   Possible?          bistability
0    0     Any   Any        Any   000         0    No                 Yes
0    1     0     0          1     000         0    No                 No
0    1     1     Any        1     000         0    No                 Yes
1    Any   1     0          1     000         0    No                 No
1    Any   1     1          1     000         0    No                 Yes
0    1     1     Any        0     010         1    No                 Yes
1    Any   1     1          0     010         1    Yes                Yes
0    1     0     0          0     101         3    No                 No
1    0     1     0          0     101         3    Yes                No
1    Any   0     0          1     101         3    No                 No
1    0     0     0          0     110         5    Yes                No
1    1     1     0          0     110         5    Yes                No
1    1     0     0          0     111         6    Yes                No
Table 2.3 Summary of the attractors found using the stable motif algorithm. The first 5
columns indicate the input signal combination. The setting CO2_high=1 and CO2=0 is
not included because it is not biologically meaningful. The “SO (Bool)” column
indicates the state of the Boolean node combination representing stomatal opening. The
“SO” column is the state of stomatal opening when converted back to an integer. Note
that the stomatal opening level of four is not defined, and no attractors have a stomatal
opening level of two. The next column indicates whether Ca2+ oscillation can possibly
happen under the given signal combination. The last column indicates whether
bistability of PMV_pos can be observed under this setting. In those cases, two stable
steady states with (PMV_pos=0, Kout=0) and (PMV_pos=1, Kout=1) can be observed.
The rest of the nodes are unaffected by this two-node bistability.
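The count of 24 input combinations follows from enumerating the five Boolean signal nodes and discarding the excluded setting CO2_high=1 with CO2=0. A short sketch, with variable names chosen for illustration:

```python
from itertools import product

combinations = [
    {"BL": bl, "RL": rl, "ABA": aba, "CO2": co2, "CO2_high": co2_high}
    for bl, rl, aba, co2, co2_high in product([0, 1], repeat=5)
    if not (co2_high == 1 and co2 == 0)   # excluded as not biologically meaningful
]
n_combinations = len(combinations)        # 2 * 2 * 2 * 3

# Conditions under which the Ca2+ oscillation is possible: blue light ON, ABA OFF.
ca_oscillation_candidates = [
    c for c in combinations if c["BL"] == 1 and c["ABA"] == 0
]
```

The enumeration gives 24 combinations, six of which have blue light present and ABA absent.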
We verified the obtained attractors with GINsim [18], a software suite capable of
model construction, simulation, and analysis. GINsim can compute all stable steady
states (called stable states in GINsim), or determine complex attractors by mapping the
state transitions. The stable steady states found by GINsim are identical to those found
by the stable motif algorithm. To verify and further explore the complex attractors, we
use the simulation function of GINsim, starting from a state in the complex attractor.
The result that the system oscillates between four states, where only the state of Cac and
Ca2+-ATPase changes, is consistent with the findings of the stable motif algorithm. We
provide the summary of GINsim computation/simulation results in Additional File 7 of
[2]. Additional File 8 of [2] provides the Boolean-converted reduced model in SBML-
qual format [16], a general format for biological models that can be analyzed using
various tools including GINsim.
We can also connect the stable motif analysis results to network reduction. We have
previously decided to not reduce the four nodes that correspond to input signals. If we
do consider a specific input combination when using network reduction, e.g. blue light
and red light with normal CO2 without ABA, we can reduce much more of the network:
two of the three SCCs, namely the NO cycle and the Ci SCC, will stabilize and get
reduced. Only the Ion SCC and its sole output stomatal opening remain, indicating that
this SCC is not driven solely by the external signals and has the capacity for oscillations
or multi-stability. This is consistent with the results found by stable motif analysis,
according to which the NO cycle and the Ci SCC attain a steady state and the Ion SCC
admits a [Ca2+]c - Ca2+-ATPase oscillation and PMV bistability. This consistency
supports the appropriateness of the network reduction method and of the Boolean
conversion.
Theoretical analysis of the reduced model
To gain additional insight into the attractors of the reduced model and their potential
changes due to node perturbations, we analyze the reduced model theoretically.
Specifically, we aim to answer the question: Can there be other types of oscillation, or
can there be additional multi-stability, if a node is knocked out (fixed in the OFF state)
or is constitutively active (fixed in the ON state)?
We first test whether the network and regulatory rules allow multi-stability or
oscillations. This analysis is based on R. Thomas’s conjectures [3]: The presence of a
positive (negative) feedback loop - a cycle with an even (odd) number of inhibitory
edges - in the network is a necessary but not sufficient condition for the occurrence of
multiple steady states (oscillations). The conjectures have been proven in the case of
discrete dynamic systems [26-28, 57]. Since only feedback loops are candidates for
potential multi-stability or oscillations, we analyze the regulatory functions of each
strongly connected component of the network. For each feedback loop, we identify a
sufficient condition for the nodes to stabilize in a specific state. The violation of this
condition becomes a further necessary condition of multi-stability or oscillation. Here
we describe the main steps and results of the analysis; the detailed analysis is in
Additional File 3 of [2].
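The sign rule used in Thomas's conjectures amounts to a parity count of inhibitory edges along a cycle. The following sketch encodes it; the function names are ours, and the returned label states only the necessary condition, not a guarantee that the behavior occurs.

```python
def loop_sign(edge_signs):
    """Return +1 for a positive feedback loop, -1 for a negative one.
    edge_signs lists the sign (+1 activation, -1 inhibition) of each edge."""
    n_inhibitory = sum(1 for s in edge_signs if s < 0)
    return 1 if n_inhibitory % 2 == 0 else -1

def may_show(edge_signs):
    """Behavior for which the loop satisfies the necessary condition."""
    return "multi-stability" if loop_sign(edge_signs) > 0 else "oscillation"

no_cycle = [+1, +1, +1]   # NO cycle: three positive edges
ca_loop = [+1, -1]        # [Ca2+]c activates Ca2+-ATPase, which inhibits [Ca2+]c
```

With these inputs, the NO cycle is classified as a positive loop (a candidate for multi-stability only) and the [Ca2+]c - Ca2+-ATPase loop as a negative loop (a candidate for oscillation only).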
The NO cycle is composed of the nodes PLD, ROS, NO, and the three positive
edges between them. It does not have any negative edges, so it cannot oscillate. A fixed
ABA value is sufficient to stabilize each node of the cycle in a specific state, thus the
cycle does not admit multi-stability under any perturbation.
The Ci SCC has three nodes, Ci, mesophyll cell photosynthesis (MCPS), carbon
fixation, and four edges that form two negative feedback loops, one between carbon
fixation and Ci, and the other between Ci and MCPS. Despite the existence of negative
feedback, this cycle will stabilize if given a fixed CO2 value. From this we know that
this cycle cannot oscillate or admit multi-stability under any perturbation.
The Ion SCC has 13 nodes. To reduce its complexity we show that the key node
[Ca2+]c, which has states 0,1, and 2, cannot enter state 2 in the long term under any
perturbation. Since most nodes respond to [Ca2+]c only if [Ca2+]c =2, we can eliminate
all edges that depend only on “[Ca2+]c =2”, and obtain a simplified Ion SCC, as shown
in Figure 2.3. The Ca2+ SCC ([Ca2+]c, Ca2+ ATPase, PLC, CaR) now becomes a sink
SCC. The only negative edge in this sub-network is from Ca2+-ATPase to [Ca2+]c. These
two nodes are known to oscillate. The positive feedback loop formed by [Ca2+]c, PLC,
and CaR will stabilize if given fixed inputs. So there cannot be multi-stability. For the
nodes outside of the Ca2+ feedback loops, we show that the edges from KEV and [K+]v
are redundant in the long term, so there are no feedback loops except the PMV self-loop.
PMV is not capable of having oscillations, but can have bistability (as also indicated by
the stable motif analysis). The bistability can affect at most one other node, Kout, under
any perturbation. This means that the bistability has very limited effect on the attractor
of the reduced model.
Figure 2.3 The Ion SCC after reducing all edges that depend on Calcium. All
regulators of this sub-network have been omitted. On the left, [Ca2+]c related nodes
form a sink sub-network.
Now we can summarize our conclusions and return to the question we sought to
answer: there is no oscillation except in the calcium nodes; there is no multi-stability
except in the nodes PMV and Kout. These statements are true under any perturbation.
Moreover, for the calcium oscillation, [Ca2+]c cannot enter the state 2, so the sub-
network between [Ca2+]c and Ca2+-ATPase is a negative feedback loop between two
Boolean nodes, with the regulatory functions Ca2+ ATPase* = [Ca2+]c; [Ca2+]c* = not
Ca2+ ATPase. It results in the simplest type of oscillation, as also found by GINsim
simulation. For the PMV bistability, even if the bistability exists, most nodes, especially
the output node stomatal opening, still have a unique value. Thus the theoretical analysis,
in agreement with the computational analysis, leads to very strong conclusions about
the reduced model’s dynamic repertoire.
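The two-node negative feedback loop can be examined directly by building its state transition graph under general asynchronous update. The sketch below uses the two regulatory functions given above; the node labels "Cac" and "CaATPase" and the helper names are chosen for illustration.

```python
RULES = {
    "Cac": lambda s: int(not s["CaATPase"]),   # [Ca2+]c* = not Ca2+ ATPase
    "CaATPase": lambda s: s["Cac"],            # Ca2+ ATPase* = [Ca2+]c
}

def successors(state):
    """States reachable in one general asynchronous step (one node updated)."""
    result = set()
    for node, rule in RULES.items():
        new_state = dict(state)
        new_state[node] = rule(state)
        result.add((new_state["Cac"], new_state["CaATPase"]))
    return result

# States are written as (Cac, CaATPase).
transitions = {}
for cac in (0, 1):
    for pump in (0, 1):
        current = (cac, pump)
        nxt = successors({"Cac": cac, "CaATPase": pump})
        transitions[current] = nxt - {current}   # drop updates that change nothing

fixed_points = [s for s, nxt in transitions.items() if not nxt]
```

Every state has exactly one successor that differs from it, and the successors chain into the cycle (0,0) → (1,0) → (1,1) → (0,1) → (0,0); there is no fixed point, in line with the four-state oscillation found by simulation.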
We can also show that the reduction or Boolean conversion did not change the
attractors of the Sun et al. model. Although the reduction we used is only proven in the
Boolean case, Naldi et al. showed that for multi-valued models, removal of non-
autoregulated nodes, like in our reduction, preserves crucial dynamical properties [14],
including fixed point attractors and the two-node simple oscillation we found. So our
reduction is valid in this specific model. To confirm that the Boolean conversion
preserved attractors, we note that in the Boolean-converted reduced model we found
fixed point attractors and a complex attractor in which only two nodes oscillate. Because
the only potential change to attractors as a consequence of the conversion is merging of
complex attractors [54], it is straightforward that the attractors have been conserved
during the conversion, as the two-node oscillation found is the simplest type of complex
attractor and cannot be a result of attractor merging. In addition, using general
asynchronous update instead of random order asynchronous update does not cause any
changes to the attractor, because the update schemes do not affect fixed points or the
two-node simple oscillation we found.
Stability of guard cell signal transduction
Our previous results indicate the stability of the system in the sense that all the
initial conditions lead to the same attractor except for up to four nodes. We also examine
another facet of the system’s stability: the robustness of the stomatal opening in response
to node perturbations that render them non-functional. We perform a systematic set of
single-node knockouts of every non-signal node in the reduced model, under all
combinations of light, CO2 and ABA conditions. For each signal combination, we set
the perturbed node’s initial state and regulatory function to 0, initialize the rest of the
nodes in the condition representative of closed stomata, and then simulate the reduced
model until it reaches its attractor. In the absence of ABA under each light and CO2
condition, 60%-90% of perturbation scenarios produce the same stomatal opening value
as the unperturbed system (Table 2.4). These results are similar to those reported by Sun
et al. for the original model [1] (Appendix A2). In the presence of ABA, 50%-90% of
perturbation scenarios produce the same stomatal opening value as the unperturbed
system, and 4-16% of knockouts lead to a higher stomatal opening value. Perturbations in
the ABA=1 case were not studied by Sun et al., but our simulations of the original model
give the same qualitative results as the reduced model. These results indicate the
closeness of the perturbed attractor (at least in terms of the stomatal opening value) to
the unperturbed attractor in more than 50% of single node perturbations. They also
suggest the resilience of the stomatal opening process against internal failures and
perturbations.
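The knockout procedure can be sketched on a toy network. Everything in this example is hypothetical: the four-node network and its rules are not the reduced stomatal opening model, the update is synchronous for simplicity (the chapter uses stochastic asynchronous update), and the output node itself is excluded from the knockouts.

```python
def simulate(rules, inputs, knocked_out=None, max_steps=50):
    """Synchronous update until the state stops changing (toy simplification)."""
    state = {node: 0 for node in rules}      # all internal nodes start OFF
    state.update(inputs)
    for _ in range(max_steps):
        new_state = dict(state)
        for node, rule in rules.items():
            # A knocked-out node keeps state 0 regardless of its regulators.
            new_state[node] = 0 if node == knocked_out else int(rule(state))
        if new_state == state:
            return state
        state = new_state
    raise RuntimeError("no fixed point reached within max_steps")

# Hypothetical toy network with output node SO.
RULES = {
    "A": lambda s: s["light"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] or s["B"],
    "SO": lambda s: s["C"],
}
inputs = {"light": 1}

baseline = simulate(RULES, inputs)["SO"]
candidates = [n for n in RULES if n != "SO"]
unchanged = [n for n in candidates if simulate(RULES, inputs, n)["SO"] == baseline]
percent_unchanged = 100 * len(unchanged) / len(candidates)
```

In this toy network only the knockout of B leaves the output unchanged, because C can still be activated through A; the same bookkeeping, applied per signal combination, yields percentages of the kind summarized in Table 2.4.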
Light, CO2 and ABA condition   Unperturbed   Percentage of single knockouts    Percentage of cases
                               SO level      that lead to each SO level        with unchanged
                                             (nonzero entries, in order of     SO value
                                             increasing simplified SO level
                                             among 0, 1, 2, 3, 5, 6)
Dual Beam    Mod. CO2   ABA OFF   5          4%, 31%, 65%                      65%
Dual Beam    Low CO2    ABA OFF   6          31%, 4%, 65%                      65%
Dual Beam    High CO2   ABA OFF   1          4%, 96%                           96%
Blue Light   Mod. CO2   ABA OFF   3          35%, 65%                          65%
Blue Light   Low CO2    ABA OFF   5          31%, 4%, 65%                      65%
Blue Light   High CO2   ABA OFF   1          4%, 96%                           96%
Red Light    Mod. CO2   ABA OFF   1          4%, 96%                           96%
Red Light    Low CO2    ABA OFF   3          35%, 65%                          65%
Red Light    High CO2   ABA OFF   1          4%, 96%                           96%
Dual Beam    Mod. CO2   ABA ON    0          85%, 4%, 8%, 4%                   85%
Dual Beam    Low CO2    ABA ON    3          46%, 50%, 4%                      50%
Blue Light   Mod. CO2   ABA ON    0          85%, 4%, 8%, 4%                   85%
Blue Light   Low CO2    ABA ON    3          46%, 50%, 4%                      50%
Red Light    Low CO2    ABA ON    0          96%, 4%                           96%
Table 2.4 Summary of systematic perturbation results. The first set of columns, with the
header ‘Light, CO2 and ABA condition’, indicate the input signal combinations. The
abbreviation “Mod.” means moderate CO2 concentration. Note that we do not list the
four input combinations (high CO2 with ABA and with any type of light, or moderate
CO2 with ABA and red light) wherein all simulated stomatal opening values are zero.
The 2nd column is the simulated stomatal opening (SO) level in the unperturbed system.
The 3rd column set shows the percentage of single-node knockouts that yield the
corresponding SO level. There is no stomatal opening level 4 in the reduced model. No
entry means zero percentage. The last column is the percentage of settings where the
stomatal opening remains at the same level as the unperturbed case. A complete table of
perturbation results is provided in Appendix A2.
Extending the conclusions to the original model
We found that in the reduced model there is no oscillation except in the calcium
nodes; there is no multi-stability except in the nodes PMV and Kout. Because the
reduction we used has been shown to conserve attractors [13, 14], we know that our
attractor conclusions can be immediately extended to all nodes in the original model
except the reduced nodes and stomatal opening. Next we extend the attractor analysis
to include the reduced nodes as well.
First we consider the nodes reduced during the first step of network reduction, i.e.
non-signal source nodes and simple mediator nodes. These nodes are trivially incapable
of having multi-stability and oscillations themselves, so we need only to consider their
perturbations. Perturbation of a simple mediator node can always be replaced by a
corresponding (set of) perturbation(s) in the mediator node’s direct successor(s), so
these perturbations have already been considered. Perturbing a non-signal source node
may theoretically cause a difference; however, the nodes in this category in the Sun et
al. model represent molecules that are abundant in the cell or cell environment, so
their perturbation is not biologically relevant or practical.
Next we consider the anion nodes reduced due to the simplified stomatal opening
rule. Recall that these nodes do not affect other nodes except stomatal opening in the
long term. There cannot be multi-stability in anion nodes unless the assumptions of
sufficient initial [NO3-]a and starch concentration, and sufficient initial mitochondrial
TCA cycle activity are violated (details are provided in Additional File 3 of [2], sections
5 and 6). Since there is no support for interventions that would lead to the violation of
these assumptions, it is reasonable to conclude that no multi-stability can be found in
the reduced nodes under biologically relevant situations. We also found that there can
be an additional oscillation in the RIC7 path (involving the nodes ROP2, RIC7 and SO)
when a special set of perturbations is applied. In that case, the nodes RIC7 and SO
will oscillate. Since the effect of this behavior is small (within 5% of the unperturbed
SO value in the Sun et al. model [1]), it has little biological significance. There are no
more possible oscillations as there are no more negative feedback loops. To conclude,
the original Sun et al. model has oscillations only in cytosolic Ca2+ ([Ca2+]c) and Ca2+
ATPase, and has multi-stability only in PMV and Kout, under situations that are
biologically meaningful.
Discussion
The conclusions we obtained can tell us how to control this network model.
Generally in engineering applications, control means to drive a system into an arbitrary
state [58, 59]. However, in biological systems, it is more meaningful to drive the system
into one of its natural attractors rather than into an arbitrary state, as the attractors
correspond to stable phenotypes [60]. To control the attractor of a Boolean system, one
needs to control only its input nodes and a subset of nodes in each stable motif [31]. Our
integrated analysis, involving Boolean conversion, indicates that to control the attractor
that the stomatal opening network evolves into, one only needs to control the input
signals and PMV, even in case of perturbations. In particular, to control the stomatal
opening value, one only needs to control the input signals, under any perturbation.
The reduced model provides new biological insights. Normally, when ABA is
present, stomata will close. However, in some knockout mutants, stomata can open to a
certain extent in the presence of ABA, although the opening level is not as much as in
the case without ABA [1]. Such partial reversals of the effect of ABA are important for
understanding the mechanism of stomatal opening. For example, Sun et al. reported that
OST1 knockout (OST1 is kept 0) and inhibition of the NADPH oxidase (AtrbohD/F is
kept 0) yielded partially restored SO level in simulations, in agreement with
experimental observations. Simplification of the Sun et al. model allows easier
simulation of more perturbation scenarios, e.g. the systematic identification of possible
partial reversals. Table 2.5 indicates all the partial reversals due to single node
knockouts in the reduced model.
Light, CO2 and ABA condition | Unperturbed SO level | Nodes whose knockout results in a partially restored SO, and the corresponding SO value (columns: CO2, NO, PLD, ROS, AnionCh)
Dual Beam, Moderate CO2, ABA is present | 0 | 3 3 5 3 2
Blue Light, Moderate CO2, ABA is present | 0 | 3 2 3 2 1
Red Light, Moderate CO2, ABA is present | 0 | 3 1
Table 2.5 Nodes whose knockouts diminish ABA’s inhibition of stomatal opening. The
first set of columns, with the header ‘Light, CO2 and ABA condition’, indicate the input
signal combinations. The 2nd column is the stomatal opening without perturbations. The
3rd column set indicates the nodes whose knockout would yield a stomatal opening level
that is higher than the unperturbed value of 0. CO2 knockout means CO2 being set to
zero (CO2 free air). No entry means the setting does not cause partial reversal.
Our results reproduce the observation that knockout of nodes in the ABA pathway
(PLD, NO, ROS) can cause partial reversals of ABA’s effect. We find that AnionCh
knockout can partially restore stomatal opening inhibited by ABA, a result not reported
by Sun et al., but which is supported by experimental evidence [61]. In addition, Table
2.5 offers a new biological prediction: low CO2 concentration can partially restore
stomatal opening when ABA is present. This is consistent with the knowledge that CO2-
free air promotes stomatal opening in the absence of ABA [62]. This CO2 effect suggests
a mechanism of cross-talk between CO2 and ABA. We will study this cross-talk in
Chapter 4. Importantly, apart from the five nodes listed in Table 2.5, no other node’s
knockout can reverse ABA’s inhibition of stomatal opening. The perturbation results of
Table 2.4 offer many more new predictions.
Our combination of techniques offers a powerful framework for determining the
dynamic repertoire of a multi-level dynamic model. Multi-level models are more
accurate than Boolean models in describing the quantitative characteristics of dynamic
systems, but there are few general methods to analyze multi-level models [11, 18]. By
combining different existing methods, we were able to overcome the limitations of each
method. Our successful combination of existing methods offers a promising way to
analyze multi-level models, and might point towards a general strategy to analyze the
attractors of multi-level models, biological or non-biological.
A notable future direction for this work is to develop an alternative way to
determine the attractors of multi-level models by extending the concept of stable motifs.
Compared with converting to a Boolean model and then applying the Boolean stable motif
algorithm, extending the stable motif algorithm to multi-level models can avoid
potential changes to the attractors. Development of such a technique will allow easy and
powerful attractor analysis for multi-level models. In the next chapter, I will describe
how to establish a general framework in discrete network models that allows a
generalized version of motifs, and thus identification of attractors.
Chapter 3 A general method to find the attractors of
discrete dynamic models of biological systems
Most of this chapter is based on previously published work for which I am the first
author [33]. Parts of the published work are reproduced in this chapter from Phys Rev
E, 2018. 97(4-1), Copyright 2018 APS.
Introduction
Dynamic modeling is a valuable avenue for understanding the emergent properties of
interacting biological systems [63, 64]. Networks, with their nodes representing
biological entities and their edges representing interactions, can connect the interactions
among cellular constituents (e.g. mRNAs, proteins or small molecules) to cell-level
functions or behaviors [44, 65]. Once a network is constructed, a dynamic model can be
created next. Each node is characterized with a state variable, representing its abundance,
concentration or activation level [40]. The state variable will evolve over time according
to a regulatory function that depends on the regulators of the node. The state variables
and regulatory functions of a dynamic model can be discrete or continuous. Discrete
modeling is particularly powerful in biological models in that it can capture the system’s
behavior without the need for much kinetic detail [66-68]. Such detail, including
reaction stoichiometry and kinetic rates, is often difficult to obtain in experiments, and
for most systems, especially large networks, the existing knowledge is insufficient to
effectively inform continuous models [69]. In this work we focus on discrete dynamic
models.
Attractors are long-term behaviors of a dynamic system, and represent system-level
outcomes. They are especially important for biological systems because they represent
biological phenotypes. For example, in a cell signaling network, attractors can
correspond to cell types, cell fates or behaviors, including cyclic behaviors such as
circadian rhythms and the cell cycle [22, 70]. Therefore, finding the attractor repertoire
of a network model is an important goal. However, finding all attractors (including cyclic
and complex attractors) is challenging due to the complex dynamics of networks [71].
Thanks to the strong advances in understanding network structure [72-75], a promising
way to tackle this problem is to try to find the attractors based on the network topology
and the key features of the network’s dynamics, instead of from its detailed dynamics
[76]. For example, R. Thomas related the conditions of multi-stability and cyclic
attractors to positive and negative feedback loops, respectively [3]. Boolean models,
which characterize each node with two states and describe regulation in a parameter-
free manner, are most strongly based on the network structure. Many methods exist for
finding attractors of Boolean networks [56, 77-79]. Although for some systems Boolean
modeling is appropriate, often at least a subset of the nodes needs to be characterized
by multiple levels, in order to accurately describe experimentally observed relative
outcomes in case of combinations of inputs [80, 81]. For example, multiple elements of
the signal transduction network that underlies light-induced opening of microscopic
pores on plant leaves were observed to have different activity levels under red light,
blue light, and white (combined) light [1]. Three levels also allow the separate
representation of upregulation or downregulation compared to a baseline/normal level
[82]. Current approaches to attractor (mainly fixed point) identification in multi-level
models use exhaustive search, model checking methods or polynomial algebra [18, 83,
84]. Yet, there is still an unmet need for a general method that can effectively find all
attractors (fixed points and complex attractors) of a multi-level model.
In this chapter, we propose a general method that can find both fixed points and
complex attractors of any finite multi-level model. Our method is an extension of a
Boolean attractor finding method proposed by Zañudo & Albert [30]. We test and
validate our method on synthetic networks and on a collection of biological models from
the literature.
Methods
In this section we give background information on discrete dynamic modeling and
attractors, and then an overview of our method. In sub-sections C to I we describe each
step of the method in detail.
A. Discrete dynamic modeling and attractors
Discrete dynamic models require minimal parameterization, yet they can capture
important biological emergent properties, and are widely used in describing biological
networks [71, 85]. These models use discrete time (implemented through update
schemes). There are deterministic update schemes such as synchronous update, where
all nodes are updated simultaneously at each time step according to their regulatory
function [42], or asynchronous schemes with fixed time delays [86]; there are also
stochastic update schemes [87], for example a general asynchronous update where in
each time step, one node from the network is randomly chosen to update [88]. By
considering multiple replicate simulations, general asynchronous update in effect
samples a wide range of relative rates. It is motivated by the fact that the temporal details of
biological processes are difficult to obtain and usually insufficiently known. By
allowing any relative rate of state transitions, this update method can
cover the multiple timescales involved in intracellular processes, making up for
incomplete knowledge of the reaction timescales in biological network modeling, whereas
synchronous update can lead to spurious behaviors [89]. Therefore it is applied
frequently in biological models, in both simulations [90] and theoretical analysis [26,
28].
An attractor can be described as a minimal self-contained set of states, i.e. a
set of states from which only states in the same set can be reached. Attractors include
steady state attractors (fixed points), and complex (oscillating) attractors where a subset
of the nodes do not take fixed values. In discrete models, attractors can also be defined
as the terminal strongly-connected-components (SCCs) of the state transition graph
(STG). An STG of a dynamic model is the graph wherein each node represents a state
of the system, and each edge represents a state transition. The nodes in a terminal SCC
of the STG are self-contained as they cannot reach nodes other than themselves, and are
therefore attractors of the system.
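The definition of attractors as terminal SCCs of the STG can be illustrated with a minimal sketch. The two-node networks, the function names (async_stg, reachable, attractors) and the brute-force reachability test are assumptions made for illustration only; they are not the method developed in this chapter, and they are only feasible for very small state spaces.

```python
from itertools import product

def async_stg(rules, levels):
    """State transition graph under general asynchronous update.

    rules:  node name -> function mapping a state dict to the node's next level
    levels: node name -> number of levels of that node
    Self-loops (updates that change nothing) are omitted; they do not affect
    which strongly connected components are terminal.
    """
    names = sorted(rules)
    stg = {}
    for values in product(*(range(levels[n]) for n in names)):
        state = dict(zip(names, values))
        successors = set()
        for i, name in enumerate(names):
            new_level = rules[name](state)
            if new_level != state[name]:
                # exactly one node changes per transition
                successors.add(values[:i] + (new_level,) + values[i + 1:])
        stg[values] = successors
    return names, stg

def reachable(stg, start):
    """All states reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in stg[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def attractors(stg):
    """A state lies in a terminal SCC iff every state it reaches can reach it back."""
    reach = {s: reachable(stg, s) for s in stg}
    return {frozenset(r) for s, r in reach.items()
            if all(s in reach[t] for t in r)}

# Hypothetical Boolean examples (states are ordered as (A, B)).
levels = {"A": 2, "B": 2}
positive_loop = {"A": lambda s: s["B"], "B": lambda s: s["A"]}
negative_loop = {"A": lambda s: 1 - s["B"], "B": lambda s: s["A"]}

_, stg_pos = async_stg(positive_loop, levels)
_, stg_neg = async_stg(negative_loop, levels)
fixed_points = attractors(stg_pos)   # two fixed points: (0, 0) and (1, 1)
cycle = attractors(stg_neg)          # one complex attractor visiting all four states
```

In this toy case the positive feedback loop yields two fixed points and the negative feedback loop yields a single complex attractor, in line with the association between feedback loop sign and attractor type discussed in the Introduction.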
In a discrete dynamical system, the fixed point attractors are independent of the
update scheme; on the other hand, complex attractors may depend on the update scheme
of the system. This is intuitive, as the edges of an STG can be different for different
update schemes. An example is provided in Appendix B4. Since the general
asynchronous update allows all kinds of rates and timing, complex attractors found under
general asynchronous update are invariant with respect to arbitrary fluctuations in the
rates of the processes involved [91]. In this chapter, we will focus on general
asynchronous update.
An accurate method to find all attractors of discrete models is to perform an
exhaustive search in the state space. However, this is not practical because the state space
of a network scales exponentially with its size. Even for the simplest, Boolean model,
the size of the state space of an N-node network is 2^N, which quickly becomes too large
for exhaustive search as N grows. There has been a lot of effort to develop methods to find
attractors in the Boolean framework [56, 77-79], but there are only a few methods that can
find attractors of multi-level models, and they have special constraints when finding complex attractors.
For example, Dubrova et al. proposed an SAT-based bounded model checking method
that can only find complex attractors in a synchronous update scheme [83]. Hinkelmann
et al. converted the attractor finding problem into solving polynomial equations; this
method can only find complex attractors of a limited size [84]. Our method does not
explicitly consider time and does not enumerate the system’s trajectories. Instead, it
combines graph topology and regulatory functions into an expanded graph
representation. Because this expanded network is much smaller than the size of the state
space, our method can work on networks of larger size. Our method is comprehensive
in the broad family of dynamical systems wherein one node changes state at any given
time instant.
B. Overview of our motif-based attractor identification method
The idea of our method is to translate the attractor identification problem into a graph
theoretical problem by creating an expanded representation of the network that
incorporates all the regulatory functions, then identifying certain motifs (subgraphs) of
this expanded network [92]. We will refer to our method as the motif-based attractor
identification method, or ‘motif-based method’ for short.
We first represent each state of each original node with a Boolean virtual node. The
‘ON’ state of the virtual node means that the original node is in the state embodied by
the virtual node. The regulatory function of a virtual node is a quasi-Boolean function,
whose inputs are virtual nodes, expressed in an appropriate disjunctive normal form.
This disjunctive normal form is obtained by summing up the input combinations that
yield the ‘ON’ state for the virtual node.
Then an expanded network containing all information expressed in the regulatory
functions can be established. The expanded network is obtained from the original one
by the following operations: 1. Include each virtual node in the expanded network, and
connect all the virtual node’s regulators to it; 2. for each ‘and’ rule in the regulatory
functions, create a composite node, and re-wire the edges from the input nodes of the
‘and’ rule to this composite node, then connect the composite node to the regulated
(target) node. The original edges from the input nodes of the ‘and’ rule to the target node
are removed. The expanded network allows one to distinguish co-pointing
interactions that are combinatorial in nature (i.e. they are combined by ‘and’ rules) from
co-pointing interactions that are individually sufficient (i.e. they are combined by ‘or’
rules).
We use the term “motif” for strongly connected components of the expanded network
that satisfy certain properties (which we will describe later). Depending on the virtual
nodes involved in the motif, we define stable motifs, which correspond to stabilized
states of the constituent nodes, and oscillating motifs, which are candidates for
oscillations of the constituent nodes.
After the motifs are found, plugging in the node states specified in the motifs into the
regulatory functions of their target nodes will specify the states of these nodes, therefore
reducing the network. Then more motifs can be found in the reduced network, and this
reduction process can be done iteratively. Ultimately, the motif sequence we find in the
iteration process will determine the attractor. In the following sub-sections we describe
the details of each step.
C. Quasi-Boolean formalism of multi-level models
We establish a formalism where multi-level regulatory functions become Boolean-
like. We treat each level (state) of a multi-level node as a separate node, called a virtual
node. For example, if a node A has 3 different levels, 0, 1, and 2, then 3 virtual nodes
for A, namely A0, A1, A2, are created in our formalism. Each virtual node is like a
Boolean variable, and the combination of all virtual nodes represents the state of the
original node. We will refer to these virtual nodes as ‘sibling nodes’ of each other. For
example, original state A=2 (where for simplicity the node state is represented by the
node name) will now be represented as the combination A0=0, A1=0, A2=1. Note that
one and only one of the virtual nodes takes value 1, while all other virtual nodes, i.e. its
sibling nodes, must all be 0. Then we write the regulatory function of each virtual node
in a Boolean disjunctive normal form, by treating each input combination as a
conjunctive clause and then connecting all conjunctive clauses that yield the same target
node level with the Boolean ‘or’ operator. Figure 3.1 demonstrates the example of
converting the regulatory function fA = B+C into a set of quasi-Boolean regulatory
functions of virtual nodes.
Figure 3.1 Demonstration of the construction of a quasi-Boolean regulatory function. A
3-level node A has regulatory function: fA =B+C, where B and C both have 2 levels.
From the truth table, one can identify the regulatory function for each virtual node of A,
by connecting all conjunctive clauses that yield the same state of A with the Boolean
‘or’ operator. In this way, each virtual node’s regulatory function will have a Boolean
disjunctive normal form.
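The conversion in Figure 3.1 can be sketched in a few lines. The helper names (virtual_node_functions, to_dnf) and the dictionary-based representation are assumptions made for illustration; the sketch only enumerates the truth table and groups the input combinations by the level of A that they yield.

```python
from itertools import product

def virtual_node_functions(f, input_levels):
    """Group all input combinations of f by the output level they yield.

    f:            multi-level regulatory function taking keyword arguments
    input_levels: regulator name -> number of levels
    Returns {output level: list of conjunctive clauses}, where each clause
    is a dict {regulator name: level}.
    """
    names = list(input_levels)
    clauses = {}
    for values in product(*(range(input_levels[n]) for n in names)):
        combo = dict(zip(names, values))
        clauses.setdefault(f(**combo), []).append(combo)
    return clauses

def to_dnf(clause_list):
    """Write a list of conjunctive clauses as a disjunction of virtual nodes."""
    return " or ".join(
        "(" + " and ".join(f"{name}{level}" for name, level in clause.items()) + ")"
        for clause in clause_list
    )

# fA = B + C, with B and C both having 2 levels, so A has 3 levels.
table = virtual_node_functions(lambda B, C: B + C, {"B": 2, "C": 2})
for level in sorted(table):
    print(f"fA({level}) =", to_dnf(table[level]))
```

Each printed line is the quasi-Boolean function of one virtual node of A. Note that this sketch produces the full list of minterms, not yet the Blake canonical form.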
Note that the Boolean ‘not’ rule is absent from this formalism, because we have
assigned virtual nodes to all states of nodes. Negation is now replaced with activation
by the sibling nodes. We will proceed through the rest of our analysis based on the
regulatory functions of the virtual nodes, instead of the functions of the original nodes.
We require the regulatory functions to be written in a disjunctive normal form with
all of their prime implicants present, or in other words, in the Blake canonical form [93].
A minterm is a combination of inputs that yields the value 1 for a Boolean expression.
An implicant is a ‘covering’ (sum or product) of minterms in a Boolean function; a
prime implicant of a function is an implicant that cannot be covered by a more general
(more reduced) implicant. For example, the Blake canonical form of the regulatory
function ‘fA = (B and C) or (D and not C)’ is ‘fA = (B and C) or (D and not C) or (B and D)’,
as the conjunctive clause ‘B and D’ is also a prime implicant of fA. This form is not
preferred in Boolean models because of its redundancy, but it is necessary for the
creation of the expanded network, because it explicitly contains all sufficient conditions
to activate a virtual node. The Quine-McCluskey (QM) algorithm finds the Blake
canonical form of a Boolean function. We extend this algorithm to multi-level models.
D. Multi-level Quine-McCluskey algorithm
To obtain the Blake canonical form of a multi-level function, we developed a multi-
level version of the QM algorithm. The original QM algorithm not only finds all prime
implicants but also minimizes the function [94-96]. We aim to find all prime implicants
and omit the latter step.
The idea of the QM algorithm is that, if multiple minterms cover all states of a node,
these minterms can be merged and the node can be eliminated from the function. For
example, in a Boolean case, A and B or A and not B = A. Similarly, if all states of a
node in a multi-level function are covered by certain minterms, these minterms can be
merged. For example, if B has 3 states, then (A1 and B0) or (A1 and B1) or (A1 and B2)
= A1. The key property here is B0 or B1 or B2 = 1; or in general, N(0) or N(1) or N(2)
or … or N(m-1) = 1, where m is the number of states of node N and N(i-1) represents
the i-th state of N. We call this the completeness condition. The main difference between
multi-level functions and Boolean functions is that the completeness condition
becomes implicit. There is also a uniqueness condition, which can be written
as N(i) and N(j) = 0 for all i ≠ j. The interpretation is that N can only take a single state.
Together the completeness and uniqueness conditions mean that at any given time node
N can take one and only one state from its possible states, which is a natural requirement.
These conditions are true in the Boolean formalism (A or not A = 1, A and not A =0).
However, in the multi-level formalism where we represent each node state separately,
we will need to separately impose these two conditions. Specifically, the multi-level
QM requires the completeness condition to merge minterms.
The systematic merging can be done in a way demonstrated in Figure 3.2. Suppose a
virtual node state D1 has its regulatory function expressed in truth table format. One
can then re-arrange the minterms into groups, based on how many zeros each minterm
has. Then one can start merging by checking minterms in neighboring groups that are
different by one node. If the minterms cover all states of that node, then they can be
merged. In the example demonstrated in Figure 3.2, m1 (002), m5 (012) and m6 (022)
differ in the state of node B, and these three minterms cover all possible states of B,
so we can merge them to get ‘0X2’ in the 1st row on the right, as a merged term. This
process is done repeatedly until all minterms are considered. Any leftover minterms that
did not get merged are prime implicants, e.g. (011) in Figure 3.2. The merged terms will
contain ‘X’s representing merged nodes. Next, one treats the 1st order merged table in
the same way, i.e. re-arrange according to the number of zeros, and try to merge into a
2nd order merged table. The difference is that ‘X’s are treated as a separate state of the
variable that cannot be merged. For example, (X01) and (X11) are different by 1 node
and may be considered as candidates for merging, while (X01) and (0X1) are different
by 2 nodes and cannot be merged. This process is done iteratively until no more merging
can be done. All ‘leftover’ terms are prime implicants. In Figure 3.2, nothing can be
merged after 1st order, so we get a final prime implicant form of D1 as fD(1) = A0 and
B1 and C1 or A0 and C2 or B0 and C2 or A1 and C0 or B2 and C2 or B2 and C0. We
discuss the performance of the algorithm in Appendix B1, and a description of the
implementation is provided in Appendix B2.
Figure 3.2 Example of the multi-level Quine-McCluskey algorithm. A Boolean node D
is regulated by a Boolean node A and two 3-state nodes B and C. The original function
of D is shown in a truth table on the top left, in a form summarizing all input
combinations that yield fD(1) =1. The top right table shows the minterms sorted
according to the number of zeros in them. From this table, one can merge the terms
between layers that are different by 1 digit, if all states of the difference node are present
within the two layers. The result of the merging is shown below. Merged terms are
represented by an ‘X’. There are 5 leftover terms after 1st order merging, and there is 1
leftover term after 0th order merging. The sum of all six terms is the final expression.
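The merging procedure can be sketched as follows. This is a simplified illustration rather than the implementation described in Appendix B2: it skips the grouping by number of zeros (which only speeds up the search for merge partners), and the function name multilevel_prime_implicants is an assumption. The list of minterms for D1 is not given explicitly in the text; the set used here is inferred as the union of the terms covered by the six prime implicants quoted above.

```python
def multilevel_prime_implicants(minterms, n_states):
    """Find all prime implicants of a multi-level function.

    minterms: iterable of tuples, one level per regulator
    n_states: number of levels of each regulator, in the same order
    A merged regulator is marked 'X'. Terms are merged only when ALL levels
    of the differing regulator are present (the completeness condition).
    """
    current = set(minterms)
    primes = set()
    while current:
        merged_away = set()
        next_order = set()
        for term in current:
            for pos, value in enumerate(term):
                if value == "X":
                    continue  # 'X' is treated as a state that cannot be merged
                siblings = [term[:pos] + (s,) + term[pos + 1:]
                            for s in range(n_states[pos])]
                if all(sib in current for sib in siblings):
                    next_order.add(term[:pos] + ("X",) + term[pos + 1:])
                    merged_away.update(siblings)
        primes |= current - merged_away   # leftover terms are prime implicants
        current = next_order
    return primes

# Regulators in the order (A, B, C): A is Boolean, B and C have 3 levels.
d1_minterms = {(0, 0, 2), (0, 1, 2), (0, 2, 2), (1, 0, 2), (1, 2, 2),
               (1, 0, 0), (1, 1, 0), (1, 2, 0), (0, 2, 0), (0, 1, 1)}
d1_primes = multilevel_prime_implicants(d1_minterms, [2, 3, 3])

# Boolean check with regulators (B, C, D): fA = (B and C) or (D and not C).
boolean_primes = multilevel_prime_implicants(
    {(1, 1, 0), (1, 1, 1), (0, 0, 1), (1, 0, 1)}, [2, 2, 2])
```

Under these assumptions d1_primes contains (0, 1, 1) together with the five merged terms (0, X, 2), (X, 0, 2), (1, X, 0), (X, 2, 2) and (X, 2, 0), matching the final expression for fD(1). In the Boolean check, the term (1, X, 1) corresponds to the additional prime implicant ‘B and D’ of the Blake canonical form example in sub-section C.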
E. The expanded network representation
After all functions are transformed into the proper form, we create an expanded
network, which is a representation of the network with regulatory functions embedded.
The expanded network is obtained from the original network by applying the following
operations: 1. Include each virtual node in the expanded network, and connect its
regulators to it; 2. for each ‘and’ rule in the regulatory functions, create a composite
node, and re-wire the edges from the input nodes of the ‘and’ rule to this composite node,
then connect the composite node to the regulated node. The original edges from input
nodes of the ‘and’ rule to the target node are removed. Figure 3.3 exemplifies the
construction of an expanded network from a regulatory function. To construct the entire
expanded network, all virtual nodes and all interactions must be created.
Figure 3.3 Construction of an expanded network from a regulatory function. Virtual
node A0 has function fA(0) = B0 or (C1 and B1), so in the expanded network, B0 is
connected directly to A0; C1 and B1 are connected indirectly to A0 via composite node
'C1 and B1'. A1 has function fA(1) = C0 and B1, so C0 and B1 are connected indirectly
to A1 via composite node 'C0 and B1'.
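The two construction rules can be sketched as a short function. The representation (each virtual node mapped to a list of conjunctive clauses, each clause a tuple of virtual node names) and the name expanded_network are assumptions for illustration. Composite nodes are named by joining their sorted inputs, so the composite written as 'C1 and B1' in Figure 3.3 appears here as 'B1 and C1'.

```python
def expanded_network(dnf):
    """Build the expanded network from quasi-Boolean functions.

    dnf: virtual node -> list of clauses (tuples of virtual node names)
    Returns (edges, composites): a set of directed edges (source, target)
    and the set of composite node names.
    """
    edges = set()
    composites = set()
    for target, clauses in dnf.items():
        for clause in clauses:
            if len(clause) == 1:
                # an individually sufficient regulator: direct edge
                edges.add((clause[0], target))
            else:
                # an 'and' rule: route its inputs through a composite node
                comp = " and ".join(sorted(clause))
                composites.add(comp)
                for source in clause:
                    edges.add((source, comp))
                edges.add((comp, target))
    return edges, composites

# The two functions used in Figure 3.3.
figure_dnf = {
    "A0": [("B0",), ("C1", "B1")],   # fA(0) = B0 or (C1 and B1)
    "A1": [("C0", "B1")],            # fA(1) = C0 and B1
}
edges, composites = expanded_network(figure_dnf)
```

No direct edge from C1 or B1 to A0 is created, which is what separates combinatorial interactions from individually sufficient ones.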
The expanded network contains not only the network structure, but also all
information about the regulatory functions. Furthermore, interactions of a combinatorial
nature are separated, as all ‘and’ rules have become explicit nodes. In this way, the
expanded network makes it easy to identify a sufficient condition to activate a node: a
virtual node will have state 1 if any of its regulator virtual nodes is 1, or if any of its
regulators that is a composite node has all its input virtual nodes being 1, regardless of
the states of the rest of its regulators. Following this intuition, a cycle in the expanded
network that satisfies the above criterion will be self-sufficient to stabilize. This leads
to the definition of stable motifs.
F. Stable motifs
A stable motif is a subgraph of the expanded network that can stabilize on its own.
We define it in the following way: a stable motif is a strongly-connected-component
(SCC) in the expanded network that satisfies: (1) the SCC contains no sibling node pairs;
(2) if the SCC contains a composite node, all of its input nodes must also be in the SCC.
The first condition is a natural requirement for a stabilized state of the original node;
the second condition is about the nature of the Boolean ‘and’ operator, as all inputs must
be present to activate the ‘and’ function. In our algorithm we identify stable motifs as
the smallest SCCs that satisfy the above conditions. Figure 3.4 shows the expanded
network and stable motifs of a three node network.
Figure 3.4 Illustration of stable motif identification in a three-node network. (A) The
original network and the regulatory functions of each node; (B) The expanded network
is constructed according to the steps in sub-section E of Methods, and then the stable
motifs are found by their definition in sub-section F. (C) Stable motifs found in this example. The
first stable motif, A0, B0, corresponds to a fixed point attractor of the system A=0, B=0,
C=0. The state C=0 is found by plugging A=B=0 into the regulatory function of C. The
2nd stable motif corresponds to another fixed point attractor A=2, B=2, C=0.
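The two defining conditions of a stable motif can be checked directly on a small expanded network. The sketch below is a brute-force illustration under stated assumptions: the Boolean example network (A* = B and C, B* = A, C* = A), the function names, and the subset enumeration are all hypothetical and are not the algorithm used in this work, which must scale to much larger networks. Composite nodes are recognized as the nodes absent from the origin map.

```python
from itertools import combinations

def strongly_connected(nodes, edges):
    """True if the subgraph induced by nodes is strongly connected."""
    nodes = set(nodes)
    if len(nodes) == 1:
        (n,) = nodes
        return (n, n) in edges          # a single node needs a self-loop
    succ = {n: set() for n in nodes}
    pred = {n: set() for n in nodes}
    for s, t in edges:
        if s in nodes and t in nodes:
            succ[s].add(t)
            pred[t].add(s)

    def reach(start, nbrs):
        seen, stack = {start}, [start]
        while stack:
            for m in nbrs[stack.pop()]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen

    start = next(iter(nodes))
    return reach(start, succ) == nodes and reach(start, pred) == nodes

def is_stable_motif(nodes, edges, origin):
    """origin maps each virtual node to its original node."""
    nodes = set(nodes)
    originals = [origin[v] for v in nodes if v in origin]
    if len(originals) != len(set(originals)):
        return False                    # condition (1): no sibling pairs
    for c in nodes - set(origin):       # composite nodes
        if any(s not in nodes for s, t in edges if t == c):
            return False                # condition (2): all inputs included
    return strongly_connected(nodes, edges)

def stable_motifs(all_nodes, edges, origin):
    """Smallest node sets satisfying the definition (exponential search)."""
    found = []
    for size in range(1, len(all_nodes) + 1):
        for subset in combinations(sorted(all_nodes), size):
            s = frozenset(subset)
            if not any(m <= s for m in found) and is_stable_motif(s, edges, origin):
                found.append(s)
    return set(found)

origin = {"A0": "A", "A1": "A", "B0": "B", "B1": "B", "C0": "C", "C1": "C"}
comp = "B1 and C1"
edges = {("B0", "A0"), ("C0", "A0"),              # fA(0) = B0 or C0
         ("B1", comp), ("C1", comp), (comp, "A1"),  # fA(1) = B1 and C1
         ("A0", "B0"), ("A1", "B1"), ("A0", "C0"), ("A1", "C1")}
motifs = stable_motifs(set(origin) | {comp}, edges, origin)
```

In this example the cycle A1, B1, 'B1 and C1' is strongly connected but is rejected by condition (2) because the input C1 is missing; only the set that also contains C1 qualifies.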
In order for stable motifs to be correctly recognized, the regulatory functions must
contain all prime implicants. If a prime implicant is missing, a sufficient condition for
a node to stabilize is missing, which would lead to incorrect identification of stable
motifs. This is why we require the Blake canonical form of regulatory functions.
There is a one-to-one correspondence between a stable motif and a partial fixed point
of the system (which is defined as a state in which a subset of nodes stabilize regardless
of the state of the rest of the system). The proof of this statement is provided in Appendix
B3. Consequently, by finding all stable motifs we find all fixed points or partial fixed
points of the system.
G. Oscillating motifs
An oscillating motif is defined as the largest SCC in the expanded network that
satisfies: (1) at least one virtual node in the SCC has at least one sibling node in the
SCC; (2) if the SCC contains a composite node, all its input nodes must also be in the
SCC. In contrast to nodes in stable motifs, an oscillating node must be able to enter at
least two states, so the first condition is necessary. The second condition is also
necessary due to the combinatorial nature of the composite node.
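A possible way to search for candidate oscillating motifs is sketched below. The pruning strategy (compute the SCCs, drop composite nodes whose inputs lie outside their SCC, and recompute) is an assumption of this sketch, as are the function names and the Boolean negative feedback example (A* = not B, B* = A).

```python
def sccs(nodes, edges):
    """Strongly connected components that contain at least one cycle."""
    succ = {n: set() for n in nodes}
    for s, t in edges:
        if s in succ and t in succ:
            succ[s].add(t)

    def reach(start):
        # nodes reachable from start in one or more steps
        seen, stack = set(), [start]
        while stack:
            for m in succ[stack.pop()]:
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen

    r = {n: reach(n) for n in nodes}
    return {frozenset(m for m in r[n] if n in r[m]) for n in nodes if n in r[n]}

def oscillating_motifs(nodes, edges, composites, origin):
    """Largest SCCs with a sibling pair and with all composite inputs included."""
    result = set()
    work = [frozenset(nodes)]
    while work:
        sub = work.pop()
        for comp in sccs(sub, edges):
            unsupported = {c for c in comp & composites
                           if any(s not in comp for s, t in edges if t == c)}
            if unsupported:
                work.append(comp - unsupported)   # prune and search again
                continue
            originals = [origin[v] for v in comp if v in origin]
            if len(originals) != len(set(originals)):
                result.add(comp)                  # condition (1): a sibling pair
    return result

# Hypothetical Boolean negative feedback loop: A* = not B, B* = A.
origin = {"A0": "A", "A1": "A", "B0": "B", "B1": "B"}
edges = {("B0", "A1"), ("B1", "A0"), ("A1", "B1"), ("A0", "B0")}
candidates = oscillating_motifs(set(origin), edges, set(), origin)
```

Here all four virtual nodes form a single cycle, so the whole expanded network is returned as one candidate oscillating motif.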
Figure 3.5 An example of an oscillating motif in a multi-level network. Panel (A) shows
the network and regulatory functions; panel (B) indicates the expanded network and
motifs. A0 and B0 form a stable motif, indicating a fixed point A=0, B=0; while A1, A2,
B1 and B2 form an oscillating motif, indicating a possible complex attractor involving
states A=1, A=2, B=1 and B=2. Panel (C) indicates the state transition graph of the
system when using general asynchronous update. The stable motif and oscillating motif
identified in panel (B) correspond to a fixed point and a complex attractor, respectively.
Unlike the relation between stable motifs and partial fixed points, there is no one-to-
one correspondence between oscillating motifs and complex attractors, because
complex attractors are dependent on the timing of individual events [90] (see Appendix
B4 for an example). Our motif-based method is based on network structure and
regulatory functions and is independent of timing, thus it cannot find timing-dependent
complex attractors. General asynchronous update prunes timing-dependent complex
attractors in the discrete framework, and all complex attractors under this update are proven
to be based on negative feedback loops [26, 28]. These complex attractors are also
reliable under perturbation, in contrast to timing-dependent complex attractors [91].
Therefore the complex attractors identified by our method should be consistent with the
complex attractors under general asynchronous update. We propose that for every
complex attractor of the discrete dynamic system under general asynchronous update,
there is a set of oscillating motifs and their downstream nodes that contain the virtual nodes
representing all the states visited by the oscillating nodes. We sketch the proof of this
proposition in Appendix B3. In our benchmarks presented in sub-section B of Results,
this proposition was never violated. Figure 3.5 shows an example of a complex attractor
in a multi-level network model. This example also illustrates the coexistence of a fixed
point attractor and a complex attractor for different states of the same nodes (See
Appendix B4 for more detail).
Figure 3.6 An example of an oscillating motif that contains a stabilized node. (A) The
network and regulatory functions. (B) The expanded network and motifs. The
oscillating motif contains only one virtual node of B, meaning that B will stabilize at 1
in the complex attractor. (C) The state transition graph using general asynchronous
update. There are two attractors: a fixed point attractor, and a complex attractor.
There is a difference between the criteria of oscillating motifs in the Boolean and
multi-level case: in the Boolean case, all nodes in an oscillating motif must oscillate
[30], while in the multi-level case, an oscillating motif can allow stabilized nodes. An
example of a complex attractor corresponding to an oscillating motif with a stabilized
node is shown in Figure 3.6. We illustrate several additional properties of oscillating
motifs in Appendix B4.
H. Iterative motif reduction yields the attractors of the system
The source (unregulated) nodes of a network that stabilize in a fixed state can be
reduced prior to any attractor identification process. The corresponding fixed states can
be substituted into the regulatory functions of the nodes they regulate. This can be done
iteratively until no source nodes are present in the network, without affecting the
attractor repertoire of the system [13, 14]. For some biological networks, this reduction
alone can reduce a large fraction of the network model, leading to a much simplified
model.
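The source-node reduction described above can be illustrated with a minimal sketch. It assumes that each regulatory function is given as a Python callable over a dictionary of regulator states; the function name `reduce_sources` and the toy network are hypothetical and are not taken from the published implementation. A node is treated as a new source as soon as its function returns the same value for every combination of its not-yet-fixed regulators.

```python
from itertools import product

def reduce_sources(regulators, functions, levels, fixed):
    """Propagate fixed source-node values through the network.

    regulators: dict node -> list of regulator names
    functions:  dict node -> callable(state_dict) -> int
    levels:     dict node -> number of states of the node
    fixed:      dict source node -> its fixed state
    Returns a dict with every node whose state is determined by the sources.
    """
    fixed = dict(fixed)
    changed = True
    while changed:  # repeat until no new source node appears
        changed = False
        for node, regs in regulators.items():
            if node in fixed:
                continue
            free = [r for r in regs if r not in fixed]
            outputs = set()
            # Substitute the fixed values and scan all values of the free regulators.
            for combo in product(*(range(levels[r]) for r in free)):
                state = {r: fixed[r] for r in regs if r in fixed}
                state.update(zip(free, combo))
                outputs.add(functions[node](state))
            if len(outputs) == 1:  # the function became constant
                fixed[node] = outputs.pop()
                changed = True
    return fixed

# Hypothetical toy network (illustration only): S is the single source node.
regulators = {"A": ["S", "B"], "B": ["A"], "C": ["B", "D"], "D": ["C"]}
functions = {
    "A": lambda s: int(s["S"] and s["B"]),
    "B": lambda s: 1 - s["A"],
    "C": lambda s: int(s["B"] or s["D"]),
    "D": lambda s: s["C"],
}
levels = {"S": 2, "A": 2, "B": 2, "C": 2, "D": 2}
print(reduce_sources(regulators, functions, levels, {"S": 0}))
print(reduce_sources(regulators, functions, levels, {"S": 1}))
```

In this toy case, fixing S=0 determines every node, whereas S=1 leaves the feedback loops undetermined, so motif analysis would still be needed for them.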
Once motifs are identified, we can plug in the states of the nodes specified in the
motifs into the expanded network, as if these nodes were source nodes, to further reduce
the network. For stabilized nodes, the stabilized virtual node takes value 1 and its sibling
nodes are set to 0; for oscillating nodes, their corresponding virtual nodes are marked
as oscillating, and their sibling nodes excluded from the oscillating motif are set to 0.
Certain nodes downstream of the motifs may stabilize as a result. In this way, a reduced
version of the network model is obtained. We then identify stable motifs and oscillating
motifs in the reduced network and substitute the corresponding virtual node values again,
until this cannot be done any more. By the end of this process all nodes will either
become a part of a motif, or be downstream of a motif and be determined by that motif,
and we will have obtained a set of motif sequences. If no oscillating motifs are found in
a motif sequence and at the end of the process the network has reduced completely, all
nodes will be stabilized, and we will have obtained a fixed point attractor. If oscillating
motifs are found in a motif sequence, at the end of the process we will find some
(possibly none) of the nodes stabilized, while other nodes oscillate. We call this
result a quasi-attractor. Specifically, a quasi-attractor will indicate the unique state of
each stabilized node and it will give a set of states among which a potentially oscillating
node oscillates. This quasi-attractor is likely (but not guaranteed) to correspond to a
complex attractor (see Figure B.3 in Appendix for an example). Under general
asynchronous update, since all partial fixed points correspond to stable motifs, and all
complex attractors correspond to oscillating motifs, all attractors will be covered with
our motif-based method. Note that there is no exact match between the actual number
of complex attractors and the number of quasi-attractors found by our method (see
Appendix B4).
Figure 3.7 Attractor identification for a four-node network by a motif succession
diagram. A. The network and the regulatory function of each node. B. Motif succession
diagram. Three motifs are found from the original network, including 2 stable motifs
(A0, B0), (C1, D1), and one oscillating motif (A1, A2, B1, B2). For each motif, the
values of the nodes in the motif are plugged into the regulatory functions, reducing the
network. Then new motifs are identified from the reduced networks. The sequences
corresponding to the three motifs are labeled (1), (2) and (3).
The reduction process can be represented as a motif succession diagram, which is the
diagram of the motifs obtained successively in the iterative network reduction process
[31]. Figure 3.7 illustrates a motif succession diagram, where iterative network
reduction based on identified motifs leads to the identification of attractors and quasi-
attractors. The original network has two stable motifs (A0, B0), (C1, D1), and one
oscillating motif (A1, A2, B1, B2). When the stable motif (A0, B0) is chosen, the
network is reduced down to two nodes, C and D, with new regulatory functions fC(0) =
D0, fC(1) = D1, fD(0) = C0, fD(1) = C1. Two new stable motifs, (C0, D0) and (C1, D1), are
found in the reduced network, leading to two attractors, Attractor 1: A=0, B=0, C=0,
D=0 and Attractor 2: A=0, B=0, C=1, D=1. When the oscillating motif is chosen, A0
and B0 become 0, and as a consequence C0 and D0 become 0, thus C=D=1. The system
is thus in a quasi-attractor in which A and B oscillate between 1 and 2 and C=D=1.
When the stable motif (C1, D1) is chosen, the regulatory functions of A and B stay the
same, thus either the (A0, B0) stable motif or the oscillating motif can come next. Both
yield already encountered (quasi)-attractors (see Figure 3.6). Thus Attractor 1 is reached
if stabilization of (A0, B0) is followed by (C0, D0); Attractor 2 is reached in case of
stabilization of (A0, B0) and (C1, D1) in either order; and quasi-attractor 3 is reached
due to the oscillating motif. In general, a (partially) ordered sequence of motifs
determines a fixed point attractor or quasi-attractor, similarly to the Boolean case [31].
I. Description of the motif-based algorithm
Here we summarize the steps of the implementation of the motif-based algorithm (see footnote 1).
The algorithm takes as input a set of regulatory functions and specific values for each
source node. For a source node A whose value is uncertain, one can define its regulatory
function as itself, i.e. 𝑓𝐴 = 𝐴. In this way each virtual node that corresponds to A will
have a self-loop, which is also a stable motif. Thus all possible values of A are
considered.
1. Reduce the source nodes of the network model by plugging their values into the
regulatory functions of the nodes they regulate. Repeat until no source node is
present.
2. Transform the regulatory functions to Blake canonical form using the multi-level
Quine-McCluskey algorithm.
3. Create the expanded network according to the definition in sub-section E of
Methods.
4. Search the expanded network for stable motifs and oscillating motifs.
5. For each stable motif and oscillating motif identified, create a copy of the network,
with the node states specified in the motif plugged into the regulatory functions of
their targets. In the case of oscillating motifs, the virtual nodes in the oscillating
motif are marked, and their sibling nodes that are not in the motif are set to 0. In
addition, for each oscillating motif, create a copy of the network with all virtual
nodes downstream of the oscillating motif marked.
6. Repeat 1, 2, 3, 4, and 5 until no more motifs can be identified. In step 1, the
reduction process, virtual nodes marked as potentially oscillatory are not reduced
when evaluating regulatory functions.
7. Discard duplicate attractors.
The final result of the algorithm will be a set of attractors or quasi-attractors. Each of
these (quasi) attractors will indicate a state (or multiple possible states) for each node.
1 The source code is available on GitHub: https://github.com/jackxiaogan/Multi-level_motif_algorithm.
For each stabilized node, its unique stabilized state is given; for a potentially oscillating
node, the multiple states among which it potentially oscillates are given.
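As an illustration of the output format and of step 7, the sketch below stores each (quasi-)attractor as a mapping from a node to the set of states it may take, and discards duplicates reached through different motif sequences. The representation and the function names are assumptions made for this example, not the data structures of the published code. The three results are those of the four-node example of Figure 3.7.

```python
def canonical(attractor):
    """Hashable key for a {node: set of states} description."""
    return frozenset((node, frozenset(states)) for node, states in attractor.items())

def discard_duplicates(attractors):
    """Keep the first occurrence of each distinct (quasi-)attractor."""
    seen, unique = set(), []
    for attractor in attractors:
        key = canonical(attractor)
        if key not in seen:
            seen.add(key)
            unique.append(attractor)
    return unique

def is_fixed_point(attractor):
    """A result is a fixed point when every node has exactly one state."""
    return all(len(states) == 1 for states in attractor.values())

# Results of the motif sequences in the Figure 3.7 example.
attractor_1 = {"A": {0}, "B": {0}, "C": {0}, "D": {0}}
attractor_2 = {"A": {0}, "B": {0}, "C": {1}, "D": {1}}
quasi_3 = {"A": {1, 2}, "B": {1, 2}, "C": {1}, "D": {1}}

# Attractor 2 and quasi-attractor 3 are each reached by two motif orders.
found = [attractor_1, attractor_2, dict(attractor_2), quasi_3, dict(quasi_3)]
unique = discard_duplicates(found)
for attractor in unique:
    kind = "fixed point" if is_fixed_point(attractor) else "quasi-attractor"
    print(kind, attractor)
```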
Results
To test the effectiveness of our motif-based attractor identification method, we apply
it to an ensemble of synthetic networks and biological networks from the literature.
A. Benchmark on synthetic networks
We test the motif-based algorithm on synthetic networks of different sizes, ranging
from 10 to 40 nodes. To approximate biological networks, we first generate networks where
the in-degree is k=2 for each node and the network is otherwise random [41, 97]. Next,
we generate the number of states for each node. For multi-level ensembles, each node
has 2 or 3 states with equal probability. For Boolean
ensembles all nodes have 2 states. Then we randomly generate a regulatory function
among those consistent with the number of regulators and number of states for each
node. The generation process of regulatory functions is described in Appendix B5.
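A minimal sketch of such an ensemble generator is given below. The topology and the number of levels follow the description above; the regulatory functions, however, are drawn here as uniformly random lookup tables, which is only a stand-in assumption, since the actual generation procedure is the one described in Appendix B5. The function name `random_network` is hypothetical.

```python
import random
from itertools import product

def random_network(n_nodes, k=2, multilevel=True, seed=None):
    """Generate one random network with in-degree k for every node.

    Returns (levels, regulators, tables):
      levels[v]     number of states of node v (2 or 3; always 2 if Boolean)
      regulators[v] list of k distinct regulators of node v
      tables[v]     dict mapping each regulator-state tuple to a state of v
    """
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    levels = {v: (rng.choice([2, 3]) if multilevel else 2) for v in nodes}
    regulators = {v: rng.sample(nodes, k) for v in nodes}
    tables = {}
    for v in nodes:
        domains = [range(levels[r]) for r in regulators[v]]
        # Assumption: uniformly random output for each input combination.
        tables[v] = {combo: rng.randrange(levels[v]) for combo in product(*domains)}
    return levels, regulators, tables

levels, regulators, tables = random_network(10, seed=1)
print(levels)
print(regulators[0], tables[0])
```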
To test whether the motif-based algorithm finds attractors correctly, we perform
simulations similar to Wang et al. [98] and Zañudo et al. [30]. We start from different
random initial conditions, and let the system evolve for Tstep effective time steps. We
used general asynchronous update, where at each time step, one node is randomly
chosen and its state is updated according to its regulatory function. If the new state of
the node is the same as before, another node will be selected within the same time step,
until the selected node changes state. If no node can reach a new state, a fixed point
attractor is reached. If no fixed point attractor is reached within Tstep effective time steps,
we evaluate whether the system is in a complex attractor by determining the
corresponding partial state transition graph (STG). Note that this sampling method is
heuristic, and is likely to miss attractors when the state space is large. For each fixed
point attractor found by simulation, we check whether it is predicted by our motif-based
algorithm. In addition, for each predicted fixed point or partial fixed point we check
whether there is a simulated attractor that contains the same stabilized nodes in the same
states. If a pair of predicted and simulated fixed points passes both checks, we categorize
them as identical. If a predicted partial fixed point passes the second check, we call it
consistent with the simulated attractor. Complex attractors depend on the update scheme
(i.e. on the timing), so there cannot be a definitive conclusion. The expectation (based
on our proposition presented in sub-section G of Methods) is that the set of nodes found
to oscillate in a simulation should be a subset of the nodes predicted to oscillate by our
motif-based algorithm. If this is indeed the case (in addition to the stabilized nodes, i.e.
the partial fixed points, being consistent), we say that the attractors are highly consistent.
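The sampling procedure can be sketched as follows. Picking random nodes until one changes state is equivalent to choosing uniformly among the nodes whose regulatory function disagrees with their current state, which is what the code does. The sketch assumes that an updated node jumps directly to the value given by its regulatory function; the function name `general_async_run` and the toy network are hypothetical.

```python
import random

def general_async_run(functions, initial, t_step, rng):
    """Simulate general asynchronous update for at most t_step effective steps.

    Returns (final_state, is_fixed_point, trajectory).
    """
    state = dict(initial)
    trajectory = [dict(state)]
    for _ in range(t_step):
        # Nodes that would change state if they were updated now.
        unstable = [n for n in functions if functions[n](state) != state[n]]
        if not unstable:
            break  # no node can reach a new state: fixed point
        node = rng.choice(unstable)
        state[node] = functions[node](state)
        trajectory.append(dict(state))
    is_fixed = all(functions[n](state) == state[n] for n in functions)
    return state, is_fixed, trajectory

# Hypothetical toy network: a positive feedback loop (X, Y), a negative
# feedback loop (P, Q) and a three-level node M downstream of the latter.
toy = {
    "X": lambda s: s["Y"],
    "Y": lambda s: s["X"],
    "P": lambda s: 1 - s["Q"],
    "Q": lambda s: s["P"],
    "M": lambda s: s["P"] + s["Q"],
}
rng = random.Random(0)
start = {"X": 1, "Y": 0, "P": 0, "Q": 0, "M": 0}
final, is_fixed, trajectory = general_async_run(toy, start, 200, rng)
# Heuristic: nodes that still change in the second half of the run.
tail = trajectory[len(trajectory) // 2:]
oscillating = {n for n in toy if len({s[n] for s in tail}) > 1}
print(is_fixed, final, oscillating)
```

The set of nodes that still change late in the run gives the simulated oscillating nodes, which can then be compared with the nodes predicted to oscillate.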
In all tests, we found identical fixed points and highly consistent complex attractors
with the sampling method. The runtime of the motif-based algorithm increases
exponentially with the number of nodes, and increases faster on the ensemble of multi-
level networks than on an ensemble of Boolean networks, as expected (Table 3.1). From
the table, the motif-based algorithm would not be practical for large networks with more
than 50 nodes or too many multi-level nodes. The important question is whether the
algorithm is practical for biological network models existing at present or constructed
in the near future. To estimate the answer to this question, we test our algorithm on
published multi-level biological network models.
Multi-level Networks
Size of network    10      15      20      25
Time (s)           0.07    1.1     48      251
Boolean Networks
Size of network    10      20      30      40
Time (s)           0.07    0.89    74      600
Table 3.1 Benchmark runtime of the motif-based algorithm on synthetic networks of
different sizes (number of nodes). For each size, 50-100 random networks with in-
degree k=2 are generated. For multi-level networks, each node has 50% chance of
having 2 levels and has 50% chance of having 3 levels. In all runs, the attractors found
by the algorithm are identical or highly consistent with the attractors found with the
sampling method.
B. Tests on biological networks from the literature
The tested models include a signal transduction network model describing stomatal
opening in plants [1] whose attractor repertoire we explored before [2]. We also selected
18 models from the model repository of the software tool GINsim, which simulates
discrete dynamic models of gene regulatory networks [18]. These 19 models have sizes
ranging from 4 to 72 nodes, with 6%-100% of these nodes being multi-level. We run
our motif-based algorithm on each model, and compare the results with the results found
by GINsim.
To apply the motif-based algorithm, we first convert the GINsim model into a ‘.txt’
file, with regulatory functions suitable for our algorithm (see footnote 2). In the few cases where the
GINsim framework and our framework are different, we adapt the model to our
framework. For example, GINsim allows an ‘empty function’: ‘fA(0) = B0, fA(2) = B1, fA(1)
is empty, i.e. A1 has no function’, which our method doesn’t allow. In this GINsim
example, ‘A1’ will be visited transiently when node A changes from A0 to A2. We
discard the state ‘A1’. We can do so because such transient states are never part of an
attractor. We also reduce some of the large models before applying our algorithm. The
reduction consists of three methods: removing output nodes (nodes with no outgoing
edges), removing simple mediator nodes (nodes with one incoming edge and one
outgoing edge), and replacing input trees (acyclic sub-networks that contain a source
node) with a single source node. These reductions are known to conserve the attractors
of the model [13, 14]. In cases where there are a lot of different signal (source node)
state combinations, it is not practical to compare all the fixed points found. Instead, we
select representative signal combinations corresponding to different biological
phenotypes (some of which are indicated as pre-made selections in GINsim), or signal
combinations that result in different attractors.
We compare the attractor analysis results by first checking whether the fixed points
are identical, and then checking whether the complex attractors are consistent. We find
that the fixed points found by the two algorithms are identical, as expected. For complex
attractors, it is difficult to get a definite conclusion. GINsim cannot predict complex
attractors; it can only simulate the state transition graph (STG) or hierarchical transition
graph (HTG) and find the strongly-connected-component from the STG/HTG [99]. The
complexity of this method goes up quickly with the increase of the model size. Our
method can only predict quasi-attractors, which may or may not be actual complex
attractors. Therefore it is impossible to know the complex attractors exactly unless an
exhaustive (partial) state space search is performed. If the model is simple enough for
GINsim to construct an STG, we check whether the complex attractors found from the
STGs are covered by the candidates predicted by our algorithm. We found consistent
complex attractor results from the two algorithms: all complex attractors found in
simulations are covered by predicted quasi-attractors. The detailed results can be found
in the Supplementary File S1 of [33].
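The coverage check can be sketched as below, assuming that a simulated complex attractor is available as a list of system states (for example the states of a terminal strongly connected component of the STG) and that a quasi-attractor is a mapping from each node to its set of predicted states. The function names and the two lists of simulated states are hypothetical; the quasi-attractor is the one from the Figure 3.7 example.

```python
def visited_states(states):
    """For each node, collect the set of values it takes across the states."""
    visited = {}
    for state in states:
        for node, value in state.items():
            visited.setdefault(node, set()).add(value)
    return visited

def is_covered(attractor_states, quasi_attractor):
    """True if every visited value of every node is among the predicted ones."""
    visited = visited_states(attractor_states)
    return all(values <= quasi_attractor.get(node, set())
               for node, values in visited.items())

# Quasi-attractor of the Figure 3.7 example: A and B oscillate between 1 and 2.
quasi_3 = {"A": {1, 2}, "B": {1, 2}, "C": {1}, "D": {1}}

# Hypothetical simulated attractors, for illustration only.
simulated_ok = [
    {"A": 1, "B": 1, "C": 1, "D": 1},
    {"A": 2, "B": 1, "C": 1, "D": 1},
    {"A": 2, "B": 2, "C": 1, "D": 1},
    {"A": 1, "B": 2, "C": 1, "D": 1},
]
simulated_bad = simulated_ok + [{"A": 0, "B": 2, "C": 1, "D": 1}]

print(is_covered(simulated_ok, quasi_3))
print(is_covered(simulated_bad, quasi_3))
```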
We also compared the runtime of the two algorithms. For the motif-based algorithm,
we record the runtime for each signal combination, then average them. GINsim does
not show the actual time spent in computation, so we only record whether the
computation completed, and give an estimated time. Note that both algorithms are
2 The converted models are uploaded to the ‘models’ folder in: https://github.com/jackxiaogan/Multi-level_motif_algorithm/.
guaranteed to find solutions given enough computational power, so cases of not
completed calculations are due to limited computational resources. All GINsim fixed
point computations are done in seconds. The only model wherein the motif-based
algorithm did not finish computing had a 72-node strongly connected network. A
summary of the results is shown in Table 3.2. The details of the runtime of each model
can be found in Supplementary File S1 of [33].
Network count    Network size    Computational time
                                 Motif algorithm    GINsim STG/HTG
9                4~15            0~8s               0~10s
9                17~36           0s~1h              DNC
1                72              DNC                DNC
Table 3.2 Summary of the runtime of the two algorithms. The networks fall into three
categories. The first column is the number of networks in each category. The second
column is the range of the network sizes in each category. The 3rd and 4th columns
indicate whether motif analysis and GINsim STG/HTG generation were successfully
completed or not. For completed analysis, the range of computational time is shown in
the table. Otherwise, we indicate DNC (meaning “did not complete”), which includes
cases that ran out of memory or did not finish in 6 hours. All tests were run on a personal
computer. There is no model where GINsim succeeds and the motif-based algorithm
fails. The motif algorithm is successful in 18 of 19 models, while GINsim STG/HTG
only works in the small networks of the first category.
Discussion
Our motif-based attractor identification method connects the structure, regulatory
logic and attractors of discrete dynamical systems. The expanded network
representation is conceptually similar to Petri nets (as the composite nodes share certain
properties with the Petri nets’ transition nodes) [100, 101] and also to logic hypergraphs
[102] (which represent the group of edges incident on a composite node with a hyper-
edge). The innovation of our analysis of the expanded network lies in interpreting the
patterns formed by multiple connected regulatory functions. The motifs identified in our
expanded network have a strong correspondence with the long-term dynamic behaviors
of the modeled system. The expanded network is therefore a good complementary
technique to the existing family of techniques to predict the attractor repertoire of
discrete dynamical systems.
Our method captures not only fixed points, but also complex attractors. The fixed
points of a dynamic system are independent of timing, and will be found accurately.
Complex attractors may be timing-dependent. Since our method is based on the
structure and regulatory logic of the system, it will capture timing-independent, negative
feedback-driven complex attractors. Our method can find all attractors of systems
updated by general asynchronous update; for systems updated using other update
schemes (i.e. when there exists at least some node synchrony), our method can
accurately find fixed points and timing-independent complex attractors, but there may
be timing-dependent attractors that our method cannot capture.
The complexity of the motif-based algorithm mainly comes from the identification
of cycles. Both stable and oscillating motifs are formed as unions of simple cycles in
the expanded network. Enumerating all simple cycles in a directed graph is computationally
expensive: Johnson’s algorithm [103] has time complexity O((N + E)(c + 1)),
where N is the number of nodes, E is the number of edges, and c is the number of
directed cycles. The last can grow faster than 2^N for dense networks. In addition, the
introduction of multi-level nodes dramatically increases the number of nodes, especially
the number of composite nodes in the expanded network. These facts limit the
effectiveness of the motif-based algorithm on networks with a large size, a high number
of levels, or with high connectivity. Typical biological network models have a low
average degree, around two, and a low number of states for each node (two or three). In
addition, only a relatively small fraction of the nodes are in SCCs; i.e. biological
networks are not feedback-dense. As we have demonstrated in sub-section B of Results,
our motif-based method can be successfully applied to these networks. For other types
of networks, although our method can theoretically work, the computational complexity
may be a challenge. Possible further work on this project includes optimizations of the
algorithm so it can work on more complex network models, and finding more necessary
conditions of multi-level complex attractors to reduce the number of quasi-attractors. A
possible way to optimize the algorithm is to add a step to divide the network into SCCs
before trying to analyze for motifs, as all motifs can only be found within an SCC. This
may dramatically reduce cycle-finding time in networks with SCC ‘communities’,
which is quite common in biological networks.
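The proposed SCC pre-processing step could look like the following sketch, which uses Kosaraju's two-pass algorithm in pure Python. It is an illustration of the suggested optimization, not part of the current implementation; the graph is a hypothetical directed graph standing in for an expanded network, and the function names are assumptions. Only components that contain a cycle (more than one node, or a self-loop) would need to be passed to the cycle search.

```python
def strongly_connected_components(graph):
    """graph: dict node -> list of successors. Returns a list of node sets."""
    nodes = set(graph) | {v for succ in graph.values() for v in succ}
    # First pass: record nodes in order of depth-first completion.
    order, seen = [], set()
    for root in nodes:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(graph.get(root, ())))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:  # all successors explored
                order.append(node)
                stack.pop()
    # Second pass: explore the reversed graph in reverse completion order.
    reverse = {v: [] for v in nodes}
    for u, succ in graph.items():
        for v in succ:
            reverse[v].append(u)
    components, assigned = [], set()
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, todo = set(), [root]
        while todo:
            node = todo.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(component)
    return components

def cyclic_components(graph):
    """Components that can host a motif: size > 1, or a node with a self-loop."""
    return [c for c in strongly_connected_components(graph)
            if len(c) > 1 or any(n in graph.get(n, ()) for n in c)]

# Hypothetical graph with two feedback "communities" and one output node.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["D"], "D": ["C", "E"], "E": []}
print(cyclic_components(graph))
```

Because the cycle count c enters the runtime bound directly, searching each cyclic component separately avoids work on the acyclic parts of the network.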
Although the idea is the same, there are significant differences between the Boolean
stable motifs method and our multi-level motif-based method. The most important
difference is in the criteria for oscillating motifs, as mentioned in sub-section G of
Methods: the Boolean oscillating motif requires the participation of two (i.e., both)
sibling virtual nodes for every node of the motif, while the multi-level oscillating motif
does not require that two or more sibling virtual nodes participate for every original
node (see the multi-level example in Figure 3.6). In addition, in the Boolean framework,
a fixed point and a complex attractor cannot co-exist for different states of the same
node; while in the multi-level case this is possible (see the example in Figure 3.5 and
Figure 3.6). These differences bring fundamental differences and complications to the
design of the algorithm, because in the iterative reduction process toward attractor
identification, the Boolean method needs only knowledge of the stable motifs, while the
multi-level case needs both stable motifs and oscillating motifs.
Chapter 4 Modeling ABA and CO2 crosstalk in inducing
stomatal closure
The research described in this chapter is done in collaboration with Prof. Sarah M.
Assmann’s team, which includes Dr. Palanivelu Sengottaiyan, Dr. David Chakravorty,
Dr. Yotam Zait, and Prof. Sarah M. Assmann. The chapter describes my contribution,
namely the construction, analysis, and predictions of the network model.
Introduction
Stomata are microscopic pores on the epidermis of leaves that allow gas exchange
for plants. A stoma is bordered by a pair of guard cells. Guard cells change their shapes
to control stomatal opening (increase in aperture) or closure (decrease in aperture), in
response to external environmental signals such as light or CO2, or endogenous signals
such as water pressure or phytohormones [104, 105]. The regulation of guard cells keeps
the balance of water loss and carbon dioxide (CO2) uptake, and is thus vital to the plant.
Understanding its mechanism can help us better understand how plants react to real-world
stresses such as drought or the global rise in CO2 concentration, and offer insight into how to
better manage crop productivity in the presence of such stresses [106, 107].
The guard cell responses to signals are mediated by a complex system of signal
cascades and involve dozens of signaling components. For example, abscisic acid
(ABA), a phytohormone that the plant produces in response to drought, can induce
stomatal closure to prevent further water loss [108]. ABA induces a wide range of
cellular regulation changes, including activation of serine-threonine kinase OPEN
STOMATA1 (OST1), actin reorganization, cytosolic Ca2+ ([Ca2+]c) increases, reactive
oxygen species (ROS) production, pH increase, and vacuolar acidification [109-111].
All these processes are known to promote stomatal closure. Eventually, ion channels at
the guard cell membrane will open, causing ion efflux, followed by water efflux. The
guard cells will then shrink, making stomata close. It is also known that high
concentration of carbon dioxide (CO2) can lead to stomatal closure [112-114].
Compared with ABA signaling, CO2 signaling is much less understood. Carbonic
anhydrase (CA) is an early component known to be necessary for CO2 signaling, by
converting CO2 into bicarbonate [HCO3]- [115]. A major CO2 signaling pathway is
described by Tian et al. [116], where RHC1 (RESISTANT TO HIGH CO2), a MATE-
type transporter, links elevated CO2 concentration to repression of HT1 (HIGH LEAF
TEMPERATURE1), a protein kinase that negatively regulates CO2-induced stomatal
closing by phosphorylating and inhibiting OST1. Mitogen-activated protein (MAP)
kinases are also known to respond to CO2 signaling and interact with HT1 [117, 118].
ABA and CO2 share signaling components in inducing stomatal closure, but each also
has its own independent signaling components.
Among the signaling components, heterotrimeric G-proteins (“G-proteins”), also
known as guanine nucleotide-binding proteins, are especially important. They are
located on the inner side of the cell membrane, and are responsible for signal
transduction across the cell membrane. Heterotrimeric G proteins are composed of Gα,
Gβ, and Gγ subunits. The β and γ subunits are closely bound to each other, and are
referred to as the beta-gamma complex. The G-protein alpha subunit is active when
GTP-bound, and can activate certain effectors (signaling proteins) [119]. For example,
the Arabidopsis phospholipase D, PLDα1, is a confirmed GPA1 (Gα subunit 1) effector
in the plant’s signal transduction in response to ABA [120]. In plant cells, there are
canonical alpha subunits (GPA), and noncanonical extra-large G-proteins (XLGs) [121-123].
In 2015 Chakravorty et al. defined a new paradigm in plant G-protein signaling in
Arabidopsis thaliana [124], with the noncanonical XLGs as components of the plant
G-protein heterotrimer. Compelling experimental evidence suggests that the canonical
GPA and the noncanonical XLGs have contradicting effects on many phenotypes, for
example primary root ABA hyposensitivity, salt tolerance, and stomatal density [122,
124-127]. Canonical and noncanonical G proteins are also known to mediate different
signaling processes: GPA is involved in ABA-induced closure but not in high CO2 or
external Ca2+ induced closure [128], whereas XLGs show the opposite pattern: they are
not involved in ABA-induced closure but are necessary for closure induced by high CO2
or external Ca2+ (see footnote 3). The
stomatal guard cell is the best understood cellular system of G-protein regulation [129-
133]. Thus a promising path toward understanding and explaining the effects of
different G-protein alpha subunits is through including them in a guard cell signaling
network. Multiple versions of the network, and of the dynamic model based on it, can
incorporate multiple possible hypotheses about the XLGs. Comparison of each dynamic
model’s results to existing experimental observations would allow the identification of
the most promising models. Moreover, the signaling network model can predict further
expected phenotypes of G protein mutants.
Since the stomatal closure process involves a complex array of signaling components
3 unpublished observation from Prof. Sarah Assmann’s group
and their interactions in guard cells, network-based modeling is an ideal approach to
study the guard cell signal transduction system. A network reflects signal transduction
by employing nodes representing the biological entities involved in the process, and
edges representing interactions and relationships. Then, a comprehensive dynamic
model built on the network, with each node associated with a state variable that changes
over time, can simulate how the system responds to a signal, and predict how
mutations/intervention would change the response. Together, network-based modeling
can explain how an ensemble of lower-level interactions, such as protein interactions or
phosphorylation, can lead to system-level behavior such as stomatal closure.
Prof. Réka Albert’s group has constructed several network models on guard cell
signaling and stomatal response to different signals. In 2006 Li et al. constructed the
first discrete model of the ABA signaling network that mediates stomatal closure [46].
The model successfully reproduced many observed knockout phenotypes, and predicted
new mutant responses to ABA. A new, updated version of the same ABA signaling
process, published in 2017 [134] expanded the knowledge by including the results of
studies published since 2006, identified key feedback loops that can sustain their activity
in presence or absence of ABA, and made new predictions. In another work (reviewed
in Chapters 1 and 2), Sun et al. constructed a multi-level discrete model reflecting the
stomatal opening process, in response to light of different wavelengths, CO2, and abscisic
acid [1]. The model predicted ABA inhibition of red-light induced stomatal opening,
and the prediction was verified experimentally. These previous works provide a solid
background for a crosstalk model involving multiple signals.
While there has been intensive past work on the ABA signaling network, little has been
done at the system level on CO2 signaling. In particular, no work has considered the
crosstalk between the two signals in inducing stomatal closure from a comprehensive
system-level perspective. With the new evidence that different G-protein subunits have
different effects on different signals, it would be natural to try to establish a
comprehensive crosstalk model to explain all the experimental observations. In this
work we constructed a new predictive model of the early stages of ABA/CO2 signaling
in guard cells, focusing on the crosstalk of the two signals and the effect of different G-
protein alpha subunits. The ABA part of the network is based on the ABA model by
Albert et al. [134]. We used a novel method to analyze the dynamic repertoire of the
model, to identify key feedback loops that govern the model’s dynamic behavior, and
to explain the mechanism of the closure response to different signals. We predict several
new regulations in the ABA-CO2 crosstalk, potential experiments to validate these
predictions, and closure response of mutants in presence of treatments.
Construction and simulation methods of the crosstalk network and dynamic model
We constructed the network model based on simplification of the ABA model
proposed in 2017 [134] (which will be referred to as the “ABA model”). We start with
the strongly-connected component (SCC) of the ABA model and add CO2 related nodes.
We apply dynamics-preserving reduction methods to the SCC to reduce its size without
changing its dynamic repertoire [13, 14], so the resulting reduced SCC has the same
long-time behavior (attractors) compared with the original. In the ABA model the in-
component (upstream) of the SCC contained many input nodes that represent e.g. cell
environment. These nodes do not change state during ABA signaling, so we can
reduce them by plugging their state directly into the nodes they regulate, without
changing the system’s attractors [13]. The out-component (downstream) of the SCC is
approximated with Aquaporin and three representative ion channel nodes. Since there
is no feedback from the downstream to the SCC, approximations like this do not change
long-time dynamic behavior of the model [3, 26], i.e. again we are simplifying the
network model without changing its attractors. As a result of these simplifications, the
network was reduced from 84 nodes and 156 edges to 28 nodes and 59 edges. This
greatly reduced the complexity of the network, allowing easier analysis. To construct
the CO2 signaling pathway, we examined known signaling components and their
regulations, and added them as additional nodes and edges such that the network model
simulations are consistent with experimental observations. Details of the construction
process are described in the next section. Figure 4.1 shows the ABA-CO2 crosstalk
network. Similar to the ABA model, the crosstalk model has a single large SCC (18
nodes) containing early guard cell signaling components. Red edges are predicted so the
model can reproduce known closure responses (see the next section for details).
Figure 4.1 The ABA-CO2 crosstalk network. The network has 28 nodes and 58 edges.
Nodes with red labels are CO2 related. Red edges are assumed regulations. Among them,
directed red edges are inferred regulations that are necessary for CO2 induced closure;
undirected red edges are based on observed protein-protein interactions (see the next
section for details). The sole strongly-connected component, marked with the “SCC” label,
contains 18 nodes. A table of node names and abbreviations can be found in Appendix
C1.
To construct the dynamic model, each node in the network is associated with a
Boolean variable and a regulatory function. The variable represents the state of the node,
for example the variable could represent whether a protein is being produced (ON state)
or not (OFF state), or represent the concentration level of a molecule (ON for high
concentration, OFF for low concentration). The regulatory function describes how the
node variable changes over time. Specifically, we employ a discrete time framework, so
the regulatory functions determine the node variable at the next time step. For the
dynamic system, we apply a random-order asynchronous update: at each time step, all
nodes in the network are updated in a random order. Introducing stochasticity here is a
good method to make up for the lack of timing details of the regulations. If one tracks a
node variable over time, one can get a single time-course simulation, representing how
the node behaves over time within one simulation. We perform a large ensemble of time-
course simulations (e.g. 1000 simulations) and average their data. The simulations are
performed from a partially fixed initial system state, where the states of signal
components are set to their experimentally observed values. For the few nodes whose initial
state we do not know, their initial states are randomized. In this model, the only
randomly-initialized nodes are SLAC1 Anion Channel, Membrane depolarization, K+
efflux, and Aquaporin PIP2:1.
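The random-order asynchronous update and the ensemble averaging described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce the results of this chapter: the three-node network (S, A, B), its regulatory functions, and all function names are hypothetical stand-ins for the 28-node crosstalk model.

```python
import random

# Hypothetical toy network (NOT the crosstalk model):
# signal S activates A, A activates B, and B sustains A.
RULES = {
    "S": lambda s: s["S"],            # input node keeps its state
    "A": lambda s: s["S"] or s["B"],
    "B": lambda s: s["A"],
}

def async_step(state, rng):
    """One time step: every node is updated once, in a random order."""
    order = list(RULES)
    rng.shuffle(order)
    for node in order:
        # Nodes updated later in the step already see the earlier updates.
        state[node] = bool(RULES[node](state))
    return state

def average_time_course(node, fixed, random_nodes, steps=20, runs=1000, seed=0):
    """Average the state of `node` at each time step over `runs` simulations."""
    rng = random.Random(seed)
    totals = [0] * (steps + 1)
    for _ in range(runs):
        state = dict(fixed)                    # partially fixed initial state
        for n in random_nodes:                 # unknown nodes are randomized
            state[n] = rng.random() < 0.5
        totals[0] += state[node]
        for t in range(1, steps + 1):
            async_step(state, rng)
            totals[t] += state[node]
    return [x / runs for x in totals]

curve_on = average_time_course("B", fixed={"S": True}, random_nodes=["A", "B"])
curve_off = average_time_course("B", fixed={"S": False}, random_nodes=["A", "B"])
```

In this toy example, with S fixed ON the averaged curve of B starts near 0.5 because of the randomized initial state and then reaches 1. With S fixed OFF the toy network is bistable, so the averaged value settles at an intermediate level.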
Figure 4.2 shows representative simulations of the wild type stomatal closure response
to three different signals: ABA, CO2, and external Calcium. Monte-Carlo simulation allows
the generation of time course data (i.e. the time a variable converges to its attractor state)
and enables quantitative analysis on a Boolean model, under the stochastic update
scheme. If the simulated value reaches and stays at 0 or 1 after sufficiently many time steps, then we
know the node converges to a stabilized value 0 or 1 in all the simulations, suggesting
a sole attractor with this node in a fixed value. If the final value is in between 0 and 1,
there may be multi-stability or a complex attractor. A fluctuation of the node value is
the signature of a complex attractor; while a stabilized node value (between 0 and 1)
implies multistability of steady states.
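The reading of an ensemble-averaged time course described above can be written as a simple heuristic. The sketch below is only an illustration: the function name, the length of the tail window, and the tolerance `tol` are assumptions chosen for the example, and a real analysis would have to account for sampling noise in the average.

```python
def classify_tail(avg_curve, tail=10, tol=0.02):
    """Heuristic reading of the last `tail` points of an averaged time course."""
    window = avg_curve[-tail:]
    spread = max(window) - min(window)
    mean = sum(window) / len(window)
    if spread > tol:
        # The average keeps moving: signature of a complex attractor.
        return "fluctuating: possible complex attractor"
    if mean <= tol or mean >= 1 - tol:
        # Every simulation ends with the node at the same value.
        return "settled at 0 or 1: single fixed value"
    # Constant value strictly between 0 and 1.
    return "settled between 0 and 1: possible multistability"
```

For instance, a curve that ends flat at 1 falls in the second category, while a curve that ends flat at 0.5 falls in the third.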
Figure 4.2 Time course simulation of closure in response to ABA, CO2 and external
Calcium signals. The horizontal axis is the simulation time step, and the vertical axis is
the closure value averaged over 1000 simulations. The tiny peak at time step ~1 is due
to randomized initial conditions.
Predicting XLG related regulations by reproducing known wild type and G-
protein mutants’ stomatal response to ABA, CO2, and external Calcium
In this section we describe how we determine the red edges in Figure 4.1. These
inferred edges are predictions of as-yet undiscovered regulations. The main challenge
in determining these edges is the limited knowledge of the regulatory role of XLGs: they
are known to mediate CO2 signaling, but their regulators or effectors are unknown. To
overcome this challenge, we investigate potential XLG regulations collaboratively, using
two approaches: (1) we deduce the regulations that XLGs must exert on the
network components for the model simulations to be consistent with experimental
observations; (2) our collaborating experimental group tested interactions between
XLGs and known signaling components, using yeast two-hybrid and bimolecular
fluorescence complementation (BiFC) assays. Our purpose is to determine the edges
and regulatory functions of each node within the CO2 signaling pathway, such that the
simulations are consistent with the experimentally observed stomatal closure
response to different signals (presented in Table 4.1, in the “Observed” column). The
edges that make simulations and observations consistent are our predictions of the
regulatory relations between XLGs and other components.
ABA | CO2 | Intervention            | Observed        | Simulation
0   | 0   | none                    | No closure      | 0
0   | 0   | External Ca2+           | Closure         | 1
0   | 0   | External Ca2+ + XLGs KO | Loss of closure | 0
0   | 0   | External Ca2+ + GPA KO  | Closure         | 1
0   | 1   | none                    | Closure         | 1
0   | 1   | XLGs KO                 | Loss of closure | 0
0   | 1   | GPA KO                  | Closure         | 1
1   | 0   | none                    | Closure         | 1
1   | 0   | XLGs KO                 | Closure         | 1
1   | 0   | GPA1 KO                 | Closure         | 1
1   | 0   | GPA1 KO + pH clamp      | Loss of closure | 0.4 (oscillates)
1   | 1   | none                    | Closure         | 1
1   | 1   | XLGs KO                 | Closure         | 1
1   | 1   | GPA KO                  | Closure         | 1
Table 4.1 Simulation of the closure pattern compared with experimental observation (see footnote 4).
The first two columns indicate the status of the ABA and CO2 signal. The third column
is the intervention applied to the system. External Calcium is a treatment; XLG and GPA
KO represent the xlg triple mutant or gpa1 mutant, respectively. The “Observed”
column indicates the qualitative outcome of the experiments. “Closure” indicates a
significantly decreased stomatal aperture compared to the control setting that lacks any
signal or intervention. “Loss of closure” indicates that the relevant intervention causes
a substantial decrease in the effect of the relevant signal, thus the combined outcome of
the signal and intervention is closer to the control (no closure) than to the effect of the
signal alone (closure). The “Simulation” column records the simulated closure value at
the end of the simulation (i.e. after 40 time steps) under each condition, averaged over
100 simulations. A value less than 1 in the simulation column is consistent with a loss
of closure. The table shows that the model reproduces experimental observations.
Notation “KO” means knockout.
4 These experiments were done by previous members of Prof. Sarah M. Assmann’s group.
We start by finding necessary conditions for closure, based on prior knowledge
compiled in the ABA induced closure model, i.e. based on the previously reconstructed
regulatory functions of the nodes in the ABA pathway. We identified three necessary
conditions for closure in the absence of ABA: inhibition of the PP2C protein
phosphatases (which otherwise would inhibit closure), activation of AtRbohD/F (in
order to produce reactive oxygen species (ROS), a type of secondary messenger
necessary for multiple processes), and CaIM (Ca2+ influx through the membrane, which
is the first process that can yield the Ca2+c increase). These conditions must be met in
CO2 signaling, because CO2 can induce stomatal closure in the absence of ABA. Each
necessary condition is translated into a regulatory edge, through which CO2 signaling
can meet the condition. These edges are the directed red edges in Figure 4.1. In the
following part, we list these edges, with an explanation why each edge is necessary, and
point out potential experiments that can help validate them.
1. XLGs -| PP2C inhibitory regulation
PP2Cs inhibit OST1 and are inhibited by ABA receptors in ABA signaling. Under
CO2 signaling the ABA receptors are inactive but PP2Cs still need to be inhibited to
obtain closure. We assume that XLGs are responsible for this inhibition, either directly
or indirectly. Our collaborators have found evidence that XLGs (XLG3 specifically)
interact with some of the PP2Cs (ABI2 and HAB1) in both yeast two-hybrid assays
and BiFC assays. Experiments showing XLG activity causing decreased PP2C
phosphatase activity, or experiments showing XLG activation associated with low PP2C
activity under high CO2 would be a good validation of this assumption.
2. XLGs → AtRbohD/F regulation
AtRbohD/F enzymes catalyze ROS production in guard cells, which is essential for
ABA-induced stomatal closure. It is also reported that ROS production is necessary in
CO2 signaling [135]. Under ABA, AtRbohD/F is activated by GPA1, which is not
necessary in CO2 signaling. Therefore, under CO2 signaling, some other signaling
component should be able to take GPA1’s place and activate AtRbohD/F. We assume
XLGs are responsible for the activation of AtRbohD/F under CO2 signaling. Our
collaborators have found evidence of XLG interaction with AtRbohD in yeast
two-hybrid and BiFC assays. Observation of (1) loss of ROS production in xlg triple
mutants under CO2 signal; or (2) ROS treatment being able to restore xlg mutants’ loss
of closure in response to CO2 would further validate this regulation.
3. CaIM → XLGs regulation
XLG is necessary for external Calcium induced closure. The simplest way to
implement this is to assume that XLG is activated by external Calcium and it mediates
its closure-inducing effects. Since external Calcium is represented in the model as CaIM
(Calcium influx through membrane) being constitutively ON, we decide to add a CaIM →
XLGs regulation edge.
4. OST1 → CaIM regulation
CaIM is necessary to induce an initial increase of the cytosolic Ca2+ (denoted Ca2+c).
The Ca2+c level cannot stay elevated for long due to its toxic effects to cells [136, 137];
instead, repeated peaks of Ca2+c are observed, and were proposed to be necessary for the
closure process [138]. CaIM can be activated by many nodes in the ABA model.
However most of its activators can only activate CaIM after an initial CaIM activation.
It works like an engine that needs a CaIM “ignition” process, after which it can sustain
itself. The “ignition” of CaIM under ABA signaling is provided by an independent
stretch-activated channel mechanism activated by actin reorganization [110], which is
not activated in CO2 signaling. Therefore CO2 has to induce initial CaIM activation
through a different regulation mechanism. We assume OST1 is responsible for the
ignition of CaIM under CO2, and add an edge OST1 → CaIM. We did not assume XLGs
as responsible for this, because after we assumed CaIM → XLG, adding XLG → CaIM
feedback could result in a partial fixed point where XLGs and CaIM remain inactivated
regardless of the other nodes.
The experimental results of our collaborators help construct the rest of the network.
Four interactions, namely CA1/4 – XLGs, XLGs - HT1, XLGs – AtRbohD/F, and XLGs
– PP2Cs interactions, were found experimentally. Among them, XLGs – AtRbohD/F
and XLGs – PP2Cs interactions are already assumed in the previous approach; this
experimental evidence helps validate those assumptions. The other two interactions,
CA1/4 – XLGs and XLGs - HT1 interactions, are tested in simulations to see if they can
make simulations consistent with experimental observations. Since the protein-protein
interaction assays only suggest physical interaction (binding) but not
regulation (i.e. an interaction does not guarantee a regulatory effect, and does not
indicate which interacting partner is the regulator and which the target), we have to test
all possible settings of the two edges, by testing their direction and whether they can be
removed without causing inconsistency. In principle we have to test the sign of the two
edges too, i.e. whether the regulation is positive (promoting) or negative (inhibitory).
To simplify, we assumed the sign of the edge according to the regulatory role of the
nodes. That is, we assume an edge between two up-regulators or two down-regulators
of closure (i.e. CA1/4 - XLGs) to be positive, and assume an edge between an up-
regulator and a down-regulator (i.e. XLGs – HT1) to be negative. With this assumption,
we only need to test regulation directions of those two edges. The purpose is again to
make simulations consistent with the experimental observations, like in Table 4.1. In
addition, we make the simulations consistent with observations of CO2 early signaling
component mutants, which are summarized in Table 4.2 [115, 116, 139].
CO2 signaling     | CA1/4 KO | RHC1 KO | HT1 KO  | HT1 KO + RHC1 KO
Observed closure  | Loss     | Loss    | Closure | Closure
Simulated closure | 0        | 0       | 1       | 1
Table 4.2 Closure response to interventions of early CO2 signaling components. The first
row is experimental observation of closure response, and the second row is the model
simulation. Additional edges (e.g. RHC1 → XLGs) are required to make the two rows
consistent.
After determining edge direction, we have to determine the Boolean function for each
node that yields the best consistency with the experimental observations. It is
computationally infeasible to examine all possible combinations of regulatory functions
exhaustively, so here we aim to find representative models instead. Take the function of
the node OST1 for example. OST1 is known to be inhibited by PP2Cs from the ABA
model [134], and is also known to be inhibited by HT1 in the CO2 pathway[116]. The
problem is to determine how these two regulatory effects combine, i.e. whether it is an
“or” operator, as in the regulatory function OST1 * = not HT1 or not PP2Cs (which
indicates that OST1 is active if either HT1 or PP2Cs are inactive), or an “and” operator
as OST1 * = not HT1 and not PP2Cs (which indicates that OST1 is active only if both
HT1 and PP2Cs are inactive). The “and” relationship would indicate that ABA, which
cannot inhibit HT1 according to current knowledge, would fail to activate OST1 and
thus fail to induce closure. As this conclusion is inconsistent with observed reality, we
set the regulatory function of OST1 to be OST1 * = not HT1 or not PP2Cs instead. A
list of all regulatory functions can be found in Appendix C1. Additional edges may be
necessary to keep the model simulations consistent with the experimental observations
in Table 4.2, for example RHC1 → XLGs or XLGs → OST1, as shown in Figure 4.3
below.
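The choice between the “or” and the “and” form of the OST1 function can be checked with a two-line evaluation. The sketch below is only illustrative: the function names are hypothetical, and the condition for ABA alone (PP2Cs inhibited, HT1 still active because CO2 is absent) is an assumption that follows the argument given above.

```python
def ost1_or(ht1, pp2c):
    """OST1* = not HT1 or not PP2Cs"""
    return (not ht1) or (not pp2c)

def ost1_and(ht1, pp2c):
    """OST1* = not HT1 and not PP2Cs"""
    return (not ht1) and (not pp2c)

# Assumed condition under ABA alone: ABA inhibits the PP2Cs (OFF),
# but nothing inhibits HT1, so HT1 stays ON.
aba_only = {"ht1": True, "pp2c": False}

or_result = ost1_or(**aba_only)    # OST1 can be activated by ABA
and_result = ost1_and(**aba_only)  # OST1 stays OFF, contradicting ABA induced closure

print("or rule:", or_result, "| and rule:", and_result)
```

Only the “or” form lets ABA activate OST1 while HT1 remains active, which is why it is the form adopted in the model.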
With these assumptions/predictions, we can now complete the network by making
the previous undirected edges directed. Note that the edge settings and regulatory
functions are not unique. It is computationally infeasible to exhaustively evaluate all
models to find possible ones. We present two representative models with different edge
settings in Figure 4.3. We claim that all possible models are similar: first of all, these
models have fixed regulatory functions in the known ABA signaling pathway; second,
these models must all satisfy the consistency between simulation and observations.
These are strong constraints on the models. To further evaluate the similarity between
models, we performed a systematic single node intervention test on the two
representative models, under CO2 and Calcium signal, and monitored their stomatal
closure responses (see Appendix C2 for intervention details). The closure responses
of the two models differ in only 1 out of 110 intervention cases. This
supports the similarity of the models. Because of the similarity, the analyses in the
following sections are performed on the model in Figure 4.3A instead of both models
for simplicity.
Figure 4.3 Two representative edge/regulation settings of the CO2 signaling sub-
network. Substituting this into Figure 4.1 will complete the network. Black edges are
known and red edges are assumptions/predictions. The main difference between these
two network settings is the opposite direction of the regulatory relationship between
XLGs and HT1.
Motifs analysis identifies key feedback loops, shows the attractor of the system,
and explains the effect of different G-protein alpha subunits
One of the most important features of a dynamic model is its repertoire of attractors,
which describe the long-time behavior of the system. Attractors usually represent biological
phenotypes, making them particularly interesting. For example, a cellular regulatory
network can have two attractors, one representing a healthy cell state, while the other
representing a cancer cell state. There are steady state attractors, where the entire system
stays in equilibrium; there are also complex/oscillatory attractors, where a subset of the
system is oscillating. Simulations of the crosstalk model showed attractors with stomatal
closure, attractors with no closure, and attractors with oscillating closure value. Note
that simulations indicate specific dynamic trajectories and cannot reveal the entire
repertoire of a dynamical system. In addition, oscillation of certain node states is
observed, indicating a complex attractor. All this suggests that it is worthwhile to analyze
the attractors of the crosstalk model. However, it is difficult to find all attractors of a
discrete dynamical system. The straightforward way to find all attractors is to search the
state space exhaustively, which quickly becomes infeasible because the state space of a
discrete system grows exponentially with the number of nodes.
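To make the exponential cost concrete, the sketch below enumerates all states of a small Boolean network and keeps the steady states, i.e. the states that every regulatory function leaves unchanged. It is an illustration under stated assumptions: the three-node network and the function name are hypothetical, and the search finds fixed points only. Complex attractors would additionally require building the state transition graph.

```python
from itertools import product

# Hypothetical toy network (NOT the crosstalk model):
# A and B sustain each other; C is driven by A but inhibits itself.
RULES = {
    "A": lambda s: s["B"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and not s["C"],
}

def fixed_points(rules):
    """Exhaustively test all 2**n states and return the steady states."""
    nodes = sorted(rules)
    found = []
    for values in product([False, True], repeat=len(nodes)):
        state = dict(zip(nodes, values))
        # A steady state is unchanged by every regulatory function.
        if all(rules[n](state) == state[n] for n in nodes):
            found.append(state)
    return found

steady = fixed_points(RULES)
n_states_28_nodes = 2 ** 28  # size of the state space of a 28-node Boolean model
```

The toy network has a single steady state (all nodes OFF); when A and B are ON, C keeps switching, so that attractor is oscillatory and is missed by a fixed-point search. The loop visits 2**n states, which for a 28-node Boolean model is already 268,435,456.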
In a previous work (described in Chapter 3), we developed a general method to find
attractors of a discrete dynamic system [33]. The method finds self-sufficient motifs
from an expanded network representation that are similar to points of no return in the
dynamics: if the system is in a state specified by such a motif, it cannot leave the motif.
One can then plug in the states specified by the motifs to reduce the network, and do
this iteratively until the whole system attractor is found. The motifs method is general
and can apply to a wide range of modeling frameworks, from Boolean to continuous
[30, 140].
We apply the motifs method to the crosstalk model to find its attractors. The results
are shown in Figure 4.4. We found a sole stable motif in the presence of a single signal,
i.e. either ABA or CO2, as shown in Figure 4.4A&B. A single stable motif means a sole
attractor, which implies that closure will occur regardless of the initial condition of the
system, indicating closure as a robust process. Figure 4.4C shows the motifs found
under no signal (i.e. both ABA and CO2 being OFF). We observe bi-stability, the co-
existence of two attractors, one of which is associated with the lack of closure (i.e.
Closure=0), while the other is associated with closure. The dotted line in the closure ON
stable motif means that either XLG or GPA1 is sufficient to complete the motif. This is
interesting because a mutation or external intervention can drive the system to one
attractor instead of another. For example, external Calcium, which is not an explicit
signal, is implemented as CaIM being constantly ON, the effect of which is the
activation of XLG and thus the activation of the closure ON stable motif. Therefore,
supplying external Calcium to the guard cells under default initial condition can drive
the cell to the closure attractor instead of the non-closure attractor. Figure 4.4D shows
the oscillating motif that is active in all closure ON attractors. It works as the “core” of
oscillation: other nodes in the attractor oscillate as a result of these two nodes’
oscillation. This is consistent with the knowledge that Calcium oscillation is necessary
in closure processes [138].
Figure 4.4 Result of motifs analysis of the crosstalk model. These motifs are shown in
the expanded network representation (described in Chapter 3) here. Node states are
represented by color: grey colored nodes represent nodes in their OFF states, white
colored nodes represent nodes in their ON states. Black nodes without labels represent
a composite node, as combinatorial regulation (i.e. “AND” logical operation). “Rboh”
is the short-hand notation for “AtRbohD/F”. A&B: stable motifs found in ABA and CO2
signal, respectively. C. Two stable motifs are found in the absence of any signals: one
associated with closure and the other associated with non-closure. The dotted line means
that either XLG or GPA1 is sufficient to complete the motif. D. two-node oscillating
motif found in all closure ON attractors. The left hand side is the original network, the
right hand side is the motif in expanded network representation.
The motifs found in the crosstalk model are similar to the motifs found in the ABA
model [134]. For example, in the ABA model, four motifs are found, but only one of
them is in the SCC of the network: the positive feedback loop of RBOH → ROS →
PLDδ → PA. This is identical to Figure 4.4A because PLDδ is a mediator node that was
reduced in the reduction process. In the case of no signal, the ABA model has two stable
motifs that resemble Figure 4.4C, but are slightly larger as Figure 4.4C has XLG in both
of its sub-figures. The high similarity of motifs is expected as the reduction method we
applied is known to be attractor conserving. Nevertheless, this agreement still validates
our simplification approach, especially our approximation of the SCC downstream, and
indicates no change of “dynamic cores” after reduction or simplification.
The stable motif under CO2 signaling is similar to the stable motif under ABA
signaling. The CO2 stable motif contains two more nodes, ABI1 and pHc. This is
consistent with the ABA stable motif, because ABI1 and pHc are stabilized in the
presence of ABA, which explains why they do not appear in the stable motif. The
similarity of the ABA and CO2 stable motifs in the Rboh – ROS – PA feedback loop
suggests that after a different up-stream receiving/signaling process, ABA and CO2
signaling converge at ROS production, before the signal reaches the ion channels
downstream to cause closure.
Additionally, motifs are dynamic cores not only because they help identify attractors,
but also because they offer a way to control the network attractor [31]. As a succession
of stable motifs specifies each attractor, keeping the nodes of the relevant motifs in the
associated states can guarantee that the system reaches the target attractor. Therefore,
the identification of the stable motifs actually shows a theoretical way to control the
network attractor. For example, CaIM or ROS constitutive activations can keep the
motifs ON and lead to closure ON attractor, which translates into the prediction that
external Calcium or ROS treatment is sufficient to cause closure, even in the absence of
signals, or of upstream mutations/interventions.
Multiple intervention scenarios predict potential G-protein regulation effectors,
and mutant response to signals
After the crosstalk model is constructed and verified, we can predict the effects of
intervention scenarios that haven’t been experimentally tested. For this purpose, we
performed systematic single node knockout (KO) or constitutive activation (CA) of the
crosstalk model and selected some of the results to present here. The full single node
intervention tables can be found in Appendix C2 (recall that we performed systematic
node interventions when we evaluate the similarities between the two representative
models in Figure 4.3). A representative example of interventions XLGs=0 and
AtRbohDF=0 is shown here in Table 4.3.
mutant      | CO2, no treatment | CO2, ROS treatment | External Calcium, no treatment | External Calcium, ROS treatment
Wildtype    | 1                 | 1                  | 1                              | 1
XLGs=0      | 0                 | 1                  | 0                              | 1
AtRbohDF=0  | 0.136             | 1                  | 0.184                          | 1
Table 4.3 Example of single node intervention. Each number is the closure value after 50
time steps, averaged over 500 simulations. This set of simulations predicts that ROS
treatment can restore loss of closure in xlg triple mutants or atrbohD/F mutants.
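A knockout, a constitutive activation, or a treatment can be simulated by clamping the corresponding node to a fixed value and excluding it from the update. The sketch below illustrates this on a hypothetical four-node chain (Signal, X, R, Closure); the network, the node names X and R, and the function name are assumptions made for the example and do not reproduce the values in Table 4.3.

```python
import random

# Hypothetical toy chain (NOT the crosstalk model): Signal -> X -> R -> Closure
RULES = {
    "Signal": lambda s: s["Signal"],
    "X": lambda s: s["Signal"],
    "R": lambda s: s["X"],
    "Closure": lambda s: s["R"],
}

def simulate(clamped, steps=50, runs=500, seed=1):
    """Average final Closure value with the nodes in `clamped` held fixed."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        state = {n: False for n in RULES}
        state["Signal"] = True
        state.update(clamped)              # apply knockouts / treatments
        free = [n for n in RULES if n not in clamped]
        for _ in range(steps):
            rng.shuffle(free)              # random-order asynchronous update
            for n in free:                 # clamped nodes are never updated
                state[n] = bool(RULES[n](state))
        total += state["Closure"]
    return total / runs

wild_type = simulate({})                       # no intervention
knockout = simulate({"X": False})              # knockout of X
rescued = simulate({"X": False, "R": True})    # knockout of X plus treatment R=1
```

In this toy chain the knockout of X abolishes closure and clamping the downstream node R restores it, the same qualitative pattern from which a downstream position is inferred for a rescuing treatment.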
Note that in Table 4.3, ROS treatment in the xlg triple mutant causes a different stomatal
response than that of the mutant without treatment. Indeed in many experiments, a
combination of two treatments or interventions yields a different result than a single
treatment or intervention. These experiments reveal the potential regulation relations of
the signaling components. For example, the ROS treatment causing recovery of closure
is interpreted as ROS functioning downstream of XLGs and AtRbohD/F. Predictions of
the combinatorial effects of multiple interventions can help discover key experiments
to elucidate regulatory roles of nodes. Due to the combinatorial complexity, it is not
practical to perform and analyze these interventions systematically like single
interventions. In addition, only a small proportion of interventions can be performed
experimentally. Therefore, we selectively performed simulations of closure response to
experimental treatments on wildtype and mutants that our collaborators are familiar with
and can test (Table 4.4).
A. External Ca2+ (rows: genotype; columns: treatment)
genotype    | No treatment | ROS=1 | PLC=0 | PLC=1 | pHc=0 | Conclusion on the observations in the row
Wildtype    | 1            | 1     | 0.118 | 1     | 0.368 | -
GPA1=0      | 1            | 1     | 0.084 | 1     | 0.388 | -
XLGs=0      | 0            | 1     | 0     | 0     | 0     | ROS reverts XLG KO
OST1=0      | 0            | 0     | 0     | 0     | 0     | OST1 is necessary for Aquaporin, AtRbohD/F, and SLAC1, each of which is necessary for closure
ABI1=0      | 1            | 1     | 0.384 | 1     | 0.378 | -
AtRbohDF=0  | 0.184        | 1     | 0.122 | 0.366 | 0.362 | ROS reverts AtRbohD/F KO
GHR1=0      | 0.372        | 0.358 | 0.118 | 0.366 | 0.396 | GHR1 KO causes a decisive reduction in closure, not reversible with ROS ON
CA14=0      | 1            | 1     | 0.116 | 1     | 0.36  | -
RHC1=0      | 1            | 1     | 0.122 | 1     | 0.348 | -
HT1=0       | 1            | 1     | 0.124 | 1     | 0.372 | -
Conclusion on the observations in the column | - | - | Major loss | - | Major loss | PLC or pHc KO each causes major/decisive loss of closure

B. CO2 signaling (rows: genotype; columns: treatment)
genotype    | No treatment | ROS=1 | PLC=0 | PLC=1 | pHc=0 | CaIM=0 | Conclusion on the observations in the row
wildtype    | 1            | 1     | 0.038 | 1     | 0.36  | 0      | -
GPA1=0      | 1            | 1     | 0.056 | 1     | 0.366 | 0      | -
XLG=0       | 0            | 1     | 0     | 0     | 0     | 0      | ROS reverts XLG KO
OST1=0      | 0            | 0     | 0     | 0     | 0     | 0      | Similar to external Ca2+
ABI1=0      | 1            | 1     | 0.338 | 1     | 0.396 | 0      | -
AtRbohDF=0  | 0.136        | 1     | 0.038 | 0.374 | 0.4   | 0      | Similar to external Ca2+
GHR1=0      | 0.412        | 0.364 | 0.048 | 0.346 | 0.364 | 0      | Similar to external Ca2+
CA14=0      | 0            | 1     | 0     | 0     | 0     | 0      | CA14 and RHC1 mutants only affect CO2 induced closure; ROS ON can counter these mutants
RHC1=0      | 0            | 1     | 0     | 0     | 0     | 0      | (see CA14=0 row)
HT1=0       | 1            | 1     | 0.062 | 1     | 0.39  | 0      | -
Conclusion on the observations in the column | - | - | Major loss | - | Major loss | Major loss | CO2 signaling is similar to external Calcium, except for the CA14 and RHC1 mutants

C. ABA signaling (rows: genotype; columns: treatment)
genotype    | No treatment | ROS=1 | PLC=0 | PLC=1 | pHc=0 | CaIM=0 | Conclusion on the observations in the row
wildtype    | 1            | 1     | 1     | 1     | 0.364 | 0      | -
GPA1=0      | 1            | 1     | 1     | 1     | 0.404 | 0      | -
XLG=0       | 1            | 1     | 1     | 1     | 0.406 | 0      | -
OST1=0      | 0            | 0     | 0     | 0     | 0     | 0      | Similar to external Ca2+ and CO2
ABI1=0      | 1            | 1     | 1     | 1     | 0.344 | 0      | -
AtRbohDF=0  | 0.312        | 1     | 0.354 | 0.4   | 0.434 | 0      | Similar to external Ca2+ and CO2
GHR1=0      | 0.362        | 0.376 | 0.38  | 0.374 | 0.41  | 0      | Similar to external Ca2+ and CO2
CA14=0      | 1            | 1     | 1     | 1     | 0.366 | 0      | CA14 and RHC1 mutants do not affect ABA signaling
RHC1=0      | 1            | 1     | 1     | 1     | 0.374 | 0      | (see CA14=0 row)
HT1=0       | 1            | 1     | 1     | 1     | 0.39  | 0      | -
Conclusion on the observations in the column | - | - | No loss | - | Major loss | Major loss | ABA signaling does not depend on PLC
Table 4.4 Selected double interventions under each signal: A. External Ca2+; B. CO2; C.
ABA. Each row is a genotype (wildtype or the indicated mutant), and each column is a
treatment (including no special treatment). All simulated closure values are reported
after 50 time steps, averaged over 500 simulations. Yellowed slots are those that display
a significantly different value compared with no treatment. Conclusions on the
observations in each row/column are given in the last column/row of each sub-table.
Table 4.4 reveals interesting results: (1) CA14 and RHC1 mutants do not affect
external Ca2+ or ABA signaling. This is expected as CA1/4 and RHC1 are CO2 early
components and are not part of the network’s SCC. (2) ROS treatment can revert closure
loss caused by XLG KO or by AtRbohD/F KO. (3) pHc KO causes loss of closure for
all signals, all mutants. (4) CaIM KO causes loss of closure for all signals, all mutants.
(5) PLC KO (PLC=0) has a different effect under ABA signaling (no loss) than under the other
two signals (loss of closure). (6) OST1 KO mutant causes loss of closure and cannot be
recovered by any treatment. (7) GHR1 KO mutant causes loss of closure and cannot be
recovered by any treatment.
To investigate further we performed triple interventions. The full simulation table can
be found in Appendix C3; here we summarize some of the interesting findings.
Examples of interesting findings include: (1) ROS=1 treatment can recover closure
from PLC=0 treatment, under all three signals. (2) ROS ON can revert closure loss due
to pHc KO treatment. (3) pHc KO can induce closure loss despite PLC ON treatment.
Note that some of the “double treatment” interventions may not be practical to perform.
The crosstalk model offers a potential explanation to the seemingly contradictory
stomatal response to CO2 in presence and absence of mesophyll cells
There has been controversy on how mesophyll cells may contribute to the regulation
of stomatal closure under CO2 signaling. Mesophyll cells are in the middle of plant
leaves, below the epidermis where the guard cells are. They contain many chloroplasts
and are thus the main sites of photosynthesis [141]. Stomatal closure
experiments are done in different settings: on whole leaves or whole plants, where
mesophyll cells are present, or on epidermal peels where mesophyll cells are absent.
Different groups have reported that the stomatal aperture responses to CO2 in whole
leaves and epidermal peels are different: the closure response is much weaker in epidermal
peels [142, 143]. Our collaborators have also found that the xlg triple mutant retains
closure in live plants, in contrast to its loss of closure response in epidermal peels (see footnote 5).
Despite these observations, whether mesophyll cells play a significant and independent role
in stomatal closure is still debated. Assumptions of a diffusible factor known as ‘the
mesophyll signal’ that regulates stomatal movement have been expressed in the
literature for a very long time [141, 142, 144]. However, it remains unclear what this
signaling component is.
With the help of our network model, we can evaluate the hypothesis that mesophyll
cells regulate stomatal closure by adding an additional node “mesophyll signal” into the
network. In our model, mesophyll signal is assumed to be regulated by CA1/4, to
represent the fact that mesophyll cells take bicarbonate, the product of CA1/4 mediated
CO2 fixation, for photosynthesis; and to be consistent with the fact that the ca1/4 mutant
loses all CO2 induced closure [115]. Because the identity of the mesophyll signal is unknown, we test
potential mesophyll regulation by assuming an additional edge from mesophyll to its
potential second messengers. The mesophyll signal can be mediated by products of
biosynthesis like ABA, sucrose, and apoplastic malate, or the signal can have other
effectors, like apoplastic pH or guard cell ion channels.
Our model has nodes ABA and pHc (cytosolic pH), so a straightforward approach is
to test the hypothesis that mesophyll cells produce ABA, or regulate pHc, by seeing if
the simulations can reproduce the different CO2 response of xlg mutants in whole leaf
and epidermal peel experiments. We start with the hypothesis that mesophyll cells
produce or release ABA. Evidence in the literature that ABA is produced
in mesophyll cells supports this hypothesis [145]. We implement this hypothesis in
the network model by adding a mesophyll → ABA activation edge. The resulting network
model simulations show that the xlg and rhc1 mutants display different closure
responses in the absence and presence of mesophyll (Table 4.5). The simulation also shows
consistency with the knowledge that the ca1/4 mutants are insensitive to CO2 in both
epidermal peels and live plants [115]. Notably, the closure response of the rhc1 mutant is
controversial: Tian et al. report that rhc1 mutant plants have reduced CO2 sensitivity
in live plant measurements and are insensitive to CO2 in epidermal peels [116], while a
recent report from Tõldsepp et al. finds normal (or close to normal) sensitivity of rhc1
mutant live plants and live leaves [146]. Our simulations suggest that the rhc1 mutant may
display different behavior with or without mesophyll cells, which may help explain the
contradicting observations.
5 These experiments (unpublished) were done by members of Prof. Sarah M. Assmann’s group.
On the other hand, assuming that the mesophyll signal regulates pHc does not yield simulations
consistent with observations, suggesting that pHc is not a good candidate
effector of the mesophyll signal. Note that cytosolic pH (pHc) differs from
apoplastic pH, so this simulation does not rule out the possibility of apoplastic
pH being regulated by mesophyll cells.
Closure response to CO2
intervention | w/o mesophyll (epidermal peels) | With mesophyll (live plant/leaf)
wildtype     | 1                               | 1
GPA1=0       | 1                               | 1
XLGs=0       | 0                               | 1
HT1=0        | 1                               | 1
CA14=0       | 0                               | 0
RHC1=0       | 0                               | 1
Table 4.5 Simulation on closure response to CO2 without or with the ‘mesophyll signal’
node, together with the assumption that mesophyll produces ABA.
Time-course simulations reveal a knowledge gap in CO2 early signaling
Figure 4.2 shows the time for each signal to cause closure. Curiously, this is not
entirely consistent with experimental observation: in experiments, external calcium
often takes longer to cause closure than CO2 or ABA [refs needed]. In the simulations,
however, CO2 signaling is the slowest, because in the model CO2 must pass through a
long series of activations before it causes closure. Figure 4.5 shows the sequence of
activation of signaling components in the stomatal closure process under each signal.
ABA leads to the fastest closure response because ABA activates GPA1, which in turn
activates the AtRboh stable motif (defined in the previous section). The other two
signals have to activate Calcium influx and then Calcium oscillation before they can
activate the AtRboh stable motif. Compared to external Calcium, CO2 also has to activate
its earlier components, e.g. CA1/4 and RHC1, before it can activate Calcium influx, and
is therefore the slowest.
Figure 4.5 Flow chart of the activation sequence of components in the network. “CaIM” is
short for Calcium influx through the membrane. “AtRboh stable motif” is defined in the
previous section and can be interpreted as ROS (reactive oxygen species) production.
The ABA response is fastest because ABA early signaling can activate the AtRboh stable
motif. External Calcium activates the downstream part of the CO2 signaling pathway
directly, and is therefore faster than CO2 signaling. The fact that the CO2 response is
fast in experiments may suggest a Calcium-independent pathway from CO2 signaling to the
downstream components, as indicated in the figure with the dotted edge(s).
This contradiction in response time may be resolved by adding an as yet unknown CO2
signaling mechanism that activates the downstream part of the flowchart directly and in
a Calcium-independent manner. The target of this regulation can be: (1) components in
the AtRboh stable motif such as AtRboh or ROS production; (2) ion channels, which would
mean a Calcium-independent way of ion channel activation; (3) an unknown pathway
involving undiscovered components. To our knowledge, no current work has suggested such
a pathway or regulation. It would be interesting to investigate what this signaling
pathway is. In addition, the assumption of mesophyll regulation can also help explain
the fast CO2 response. If mesophyll regulates either ABA or ion channels directly, the
closure response speeds up, because under these regulations the CO2 signal can reach
downstream components quickly, in a pattern similar to Figure 4.5.
Discussion
In this chapter we constructed a network-based model reflecting the ABA-CO2 crosstalk
in inducing stomatal closure, based on a previous ABA signaling network and knowledge of
CO2 signaling in the literature. The model offers an explanation of how ABA and high
CO2, respectively, induce stomatal closure. It also predicts the role of non-canonical
G-protein alpha subunits and their regulatory relationship with the canonical G-protein
alpha subunit. The crosstalk of ABA and CO2 is interpreted as combinatorial effects in
the single large SCC of the network. The SCC serves as the dynamic core and determines
the system’s long-term behavior, i.e. the stomatal closure movement. The upstream early
signaling components serve as input chains, and the downstream ion channels serve as
output chains. The combination of a hierarchical network backbone structure plus a large
core SCC is typical of signal transduction networks.
The CO2 signaling pathway is currently under study, and there is debate on the
regulatory role of certain signaling components. For example, RHC1 is reported as a
necessary component for CO2 signaling by Tian et al. [116], in both epidermal peels and
live leaves, but a recent paper presents contradicting evidence in live leaves [146]. It
is also worth pointing out that introducing mesophyll regulation (as we did) may explain
the different responses of rhc1 mutants in epidermal peels and live leaves. In another
paper, the authors found that, unlike ABA, elevated CO2 does not activate OST1/SnRK2
kinases in guard cells [118], while it is known that the ost1 mutant is insensitive to
both ABA and CO2 [147]. The mechanism behind this remains unclear. Further experimental
evidence will reveal more information, allowing a more accurate version of the model.
The regulatory role of the G-protein in plant signal transduction is another open
question. Despite the predictions made in the model, there are many more potential
effectors of the G-protein. Our collaborators are performing a comprehensive test of
protein-protein interactions between G-protein alpha subunits (GPA1 and XLGs) and known
signaling components of stomatal closure, with yeast two-/three-hybrid assays, BiFC
assays, and Co-IP assays. These interaction data can help identify more potential
regulators and effectors of the G-protein alpha subunits.
Our discrete network-based modeling framework offers a promising way to uncover,
understand and predict system-level biological behavior by integrating lower-level
knowledge. It offers novel, alternative approaches to explore and make predictions about
signal transduction in biological systems. As the interaction data between signaling
components continue to grow, computational methods like ours will become necessary to
complement traditional methods.
Chapter 5 Conclusions and outlook
In this dissertation I presented my contributions in both theoretical and computational
aspects of modeling and understanding biological systems. I analyzed a multi-level
model to show its dynamic properties; I developed a general method to analyze the
attractor landscape of any finite discrete model; and I constructed a network-based
dynamic model of the crosstalk of plant responses to different environmental stresses.
My work further reinforces the conclusion that network-based modeling is a promising
pathway to understanding system-level biology.
In the first chapter of this dissertation I reviewed how network analysis and network-
based dynamic modeling can be used to determine the repertoire of cellular behaviors
associated with a within-cell network, and to identify the sub-networks that play a key role
in the cell adopting a certain behavior. Overall, the expanded network representation,
an integration of the network topology with regulatory functions, reveals the indirect
and self-sustaining influences in the system, which ultimately determine the system’s
repertoire of behaviors. The emerging answers indicate that stable motifs are a key
information processing, decision-making connectivity pattern. Stable motifs receive
information from external signals and internal perturbations, and their stabilization
serves as a point of no return in the system’s dynamics. One can characterize attractors
by the stable motifs they are determined by, and one can control the system’s outcome
by controlling stable motifs.
In the second chapter I analyzed the attractors of the Sun et al. stomatal opening
model and reached a strong conclusion: under any combination of sustained signals, all
nodes in the model converge into steady states, with the potential exception of the
cytosolic Ca2+ ([Ca2+]c) and Ca2+ ATPase. Variations in the initial condition of
non-source nodes or in process timing (node update sequence) can drive at most two
nodes, PMV and Kout, into a different attractor. This high degree of attractor similarity
is somewhat unexpected, as the network has a large strongly connected component and
several feedback loops. Thus, despite the decidedly non-linear structure of the network,
most parts of the system behave in the consistent manner of a linear pathway. This is a
distinct feature of the stomatal opening model: many dynamic models of biological
systems have multiple, diverse attractors [24, 148]. The models of these systems will
evolve into drastically different attractors when starting from different initial conditions,
sometimes even when starting from the same initial condition, demonstrating different
biological trajectories. In the stomatal opening model, however, the uniqueness of the
steady state stomatal opening level suggests that the final extent of the stomatal opening
response is robust and resilient against changes in initial conditions or in timing. Note
that although a change in the initial condition will not change the steady-state opening
level, it may change the steady state of PMV and Kout, and may change how fast the
system converges to an attractor.
I also showed that the reduced stomatal opening model does not admit additional,
emergent oscillations or multi-stability under any biologically relevant node
perturbation (knockout or constitutive expression). I further demonstrated the
robustness of the system by examining the stomatal opening level under single node
knockouts: in most cases the signals are still likely to propagate and lead to a similar
degree of stomatal opening as in the absence of perturbation. This robustness is unlike
a single linear pathway, which would be very sensitive to node disruption. This suggests
that the role of the strongly connected components in the network could be to provide
multiple paths for the signal to propagate, while not allowing extensive multistability
or oscillations. The innovative combination of existing methods used in this work offers a
promising way to analyze multi-level models.
Following up the problem of finding attractors, I proposed and developed a general
motif-based reduction method to find both fixed points and complex attractors of any
finite discrete dynamic model, by extending an existing method from Boolean to any
discrete level. As described in Chapter 3, I established a multi-level formalism that can
identify motifs from an expanded representation of the multi-level network. Iterative
reduction of the network according to the motifs can identify the attractors. I
demonstrated the method’s correctness and effectiveness by implementing an algorithm,
then benchmarking it on synthetic networks, and applying it to biological networks in
the literature.
The integration of the network structure and regulatory logic in the expanded network
can reveal the connectivity patterns that underlie the system’s functional repertoire.
There can be multiple extensions to this work. For example, in the Boolean case, the
elementary signaling mode (ESM) has been defined from the expanded network as the
minimal set of nodes that can perform signal transduction independently [29, 32]. This
concept can be extended to the multi-level case as well, to help understand signal
transduction in a multi-level expanded network. Another direction is to extend the concepts of expanded
network and stable motifs to a continuous framework. If one can distill the causal
relationships wherein a certain value of a continuous variable is sufficient to maintain a
certain value of another continuous variable, one can construct an expanded network
from these relationships, and obtain insight into the system’s dynamic repertoire [140].
Furthermore, one can develop the control capability of multi-level motifs. Network
controllability has multiple definitions and frameworks to address it [58, 60, 149, 150].
Motifs can be used to control the system by driving it into one of its natural attractors.
Zañudo et al. proved that in the Boolean case a sequence of stable motifs uniquely
determines an attractor, which means that driving certain nodes into their states in a
stable motif can drive the network into the corresponding attractor; they also
implemented an algorithm to identify driver nodes from Boolean stable motifs [31]. The
same principle applies to multi-level stable motifs as well, and the algorithm to find the
driver nodes to be controlled can be adapted as well. This is particularly valuable in
biological networks, as the control of stable motifs can suggest possible practical
interventions to switch the system from an undesired attractor to a desired one [25].
Another possible aspect of control is target control, i.e., driving a single node or small
set of nodes into a desired state. This can be done by exploiting more of the sufficiency
conditions revealed in an expanded network [151].
As described in Chapter 4, I constructed a network-based model reflecting the ABA-
CO2 crosstalk in inducing stomatal closure, based on previous ABA signaling network
and knowledge of CO2 signaling in the literature. We validated the model by showing
consistency between the model simulations and experimental observations. This
consistency supports our assumed regulations as predictions, and we proposed several
ways to further validate these hypotheses with experiments. We identified the core
components that cause closure by applying our novel attractor analysis method, and thus
elucidated the closure mechanism for different signals and their crosstalk. We also
performed systematic single node interventions, and a selection of double/triple node
interventions, which predict the closure responses of
mutants under a list of treatments. These intervention simulations help explain the
regulation roles of each signaling component, and predict the combinatorial effects of
interventions. In addition, we propose that introducing mesophyll regulation into the
network model can resolve the different high CO2 induced closure responses observed
between peeled epidermis and live leaves/plants. Our network-based modeling
framework offers a promising way to uncover, understand and predict system-level
biological behavior by integration of lower-level knowledge.
There have been many updates in knowledge on CO2-induced closure, and more are
expected in the near future. One obvious future direction is to keep updating the
model according to new discoveries in the signaling pathways, as the Albert group did
with the ABA model. As the model becomes more comprehensive, the predictions,
e.g. simulations of intervention results, will further improve in accuracy. Moreover,
multi-level behavior of signaling components has been observed [118], suggesting the
potential value of developing a multi-level model, where my generalized multi-level
formalism and motifs method can apply.
This work is also a good demonstration of how an in silico approach can predict
important regulations and guide experiments. There are too many potential regulators
and effectors of XLGs to explore experimentally. With the network-based analysis,
however, we were able to quickly identify several necessary conditions for CO2-induced
closure, for example XLGs -| PP2Cs inhibition and XLG → RbohD/F activation. This
illustrates how existing knowledge can be used to make new predictions about
less-understood regulations. Simulations from a model can also help in formulating
hypotheses about new regulations, e.g. the effect of introducing mesophyll
regulation. Combining experimental approaches with theoretical/computational
approaches is a promising way to understand system-level biology.
To summarize, my dissertation focuses on how to understand system-level biological
signaling with theoretical and computational approaches. Network-based discrete
modeling of biological systems, plus analyses of such models, is a promising method
toward this goal. My analysis of a complex multi-level model shows how a combination
of theoretical and computational tools can reveal dynamic functions of a biological
system. To further extend the capability of theoretical and computational tools, I
developed a general modeling framework with any finite number of discrete levels that
allows a novel method to find the attractors of a model and to control it. I also
constructed the ABA-CO2 crosstalk model, which integrates knowledge in the literature to
explain how the two environmental signals induce closure, to predict new regulations and
results of intervention, and to generate new hypotheses. In short, my dissertation work
has offered a new and general way to analyze complex discrete models, and expanded the
understanding of the mechanisms by which plants respond to different environmental stresses.
Appendix A Analysis of a dynamic model of guard cell
signaling reveals the stability of signal propagation
A1 Regulatory Functions of the Reduced Stomatal Opening Model
In this section we provide the details of the reduced stomatal opening model,
including a table of abbreviations (Table A.1), followed by regulatory functions of the
model.
Abbreviation | Full Name
14-3-3 proteinH-ATPase | 14-3-3 protein that binds to the H+-ATPase
14-3-3 proteinphot1 | 14-3-3 protein that binds to phototropin 1
ABA | abscisic acid
ABI1 | 2C-type protein phosphatase
acid. of apoplast | the acidification of the apoplast
AnionCh | anion efflux channels at the plasma membrane
AtABCB14 | ABC transporter gene AtABCB14
Atnoa1 | protein nitric oxide-associated 1
AtrbohD/F | NADPH oxidase D/F
AtSTP1 | H-monosaccharide symporter gene AtSTP1
Ca2+-ATPase | Ca2+-ATPases and Ca2+/H+ antiporters responsible for Ca2+ efflux from the cytosol
CaIC | inward Ca2+ permeable channels
CaR | Ca2+ release from intracellular stores
carbon fixation | light-independent reactions of photosynthesis
CDPK | Ca2+-dependent protein kinases
CHL1 | dual-affinity nitrate transporter gene AtNRT1.1
Ci | intercellular CO2 concentration
FFA | free fatty acids
H+-ATPase | the phosphorylated H+-ATPase at the plasma membrane prior to the binding of the H+-ATPase 14-3-3 protein
H+-ATPasecomplex | 14-3-3 protein bound H+-ATPase
KEV | K+ efflux from the vacuole to the cytosol
Kin | K+ inward channels at the plasma membrane
Kout | K+ outward channels at plasma membrane
LPL | lysophospholipids
NADPH | reduced form of nicotinamide adenine dinucleotide phosphate
NIA1 | nitrate reductase
NO | nitric oxide
OST1 | protein kinase open stomata 1
PA | phosphatidic acid
PEPC | phosphoenolpyruvate carboxylase
phot1 | phototropin 1
phot1complex | 14-3-3 protein bound phototropin 1
phot2 | phototropin 2
Photophosphorylation | light-dependent reactions of photosynthesis
PIP2C | phosphatidylinositol 4,5-bisphosphate located in the cytosol
PIP2PM | phosphatidylinositol 4,5-bisphosphate located at the plasma membrane
PLA2β | phospholipase A2β
PLC | phospholipase C
PLD | phospholipase D
PMV | electric potential difference across the plasma membrane
PP1cn | the catalytic subunit of type 1 phosphatase located in the nucleus
PP1cc | the catalytic subunit of type 1 phosphatase located in the cytosol
protein kinase | a serine/threonine protein kinase that directly phosphorylates the plasma membrane H-ATPase
PRSL1 | type 1 protein phosphatase regulatory subunit 2-like protein1
RIC7 | ROP-interactive CRIB motif-containing protein 7
ROP2 | small GTPase ROP2
ROS | reactive oxygen species
[Ca2+]c | cytosolic Ca2+ concentration
[Cl-]c/v | cytosolic/vacuolar Cl- concentration
[K+]c/v | cytosolic/vacuolar K+ concentration
[malate2-]a/c/v | apoplastic/cytosolic/vacuolar malate2- concentration
[NO3-]a/c/v | apoplastic/cytosolic/vacuolar nitrate concentration
Table A.1 Full names of the network components denoted by abbreviated node names
in Figure 2.1. The same abbreviations are used in the original Sun et al. model and the
reduced model.
Next we provide the regulatory functions for each of the 32 nodes in the reduced
stomatal opening model. The following table shows the possible states of these nodes;
the node names are the same as in Figure 1.1 in the main text unless specified.
Possible node levels | List of nodes
{0, 1} | Blue light, phot1complex, PLC, PLA2β, CaIC, CaR, NO, Ca2+ATPase (see footnote 6), FFA, Kin, Kout, KEV, Red light, ABA, ABI1, ROS
{0, 1, 2} | [Ca2+]c, CO2, photophosphorylation, carbon fixation, PLD, sucrose, MCPS (mesophyll cell photosynthesis)
{0, 1, 1.6} | AnionCh
{0, 0.5, 1, 2} | Ci
{–2, –1, 0, 1, 2} | PMV
{0, 1, 1.5, 2, 3, 3.5, 4} | PP1cc
{0, 0.5, 0.9, 1, 1.5, 2, 3} | protein kinase
{0, 0.5, 1, 1.5, 2, 3, 4, 6, 9} | H+ ATPasecomplex, [K+]c, [K+]v
{0, 1, 2, 3, 5, 6} | stomatal opening
The regulatory function’s left-hand side refers to the node whose state is
evaluated, and the right-hand side refers to this node’s regulators. The variables of the
regulatory function are node states, which for simplicity are denoted by the node
name. The regulatory function specifies the next state of the target node (indicated by
an asterisk on the name of the target node) as a function of the current states
of its regulators. Four of the nodes are input signals that are assumed to have a
sustained expression. Thus their next state equals their current state, which can be
expressed by making them self-regulated. For example, the regulatory function for
Blue Light is “Blue Light* = Blue Light”.
6 To distinguish from the subtraction operator ‘–’, all dashes in the node names of this file are removed. Ca2+-ATPase
is written as Ca2+ATPase, and H+-ATPasecomplex is written as H+ ATPasecomplex.
The regulatory functions of most other nodes involve the Boolean logic operators
“And, Or, Not”; True is interpreted as 1 and False is interpreted as 0. The regulatory
function of multi-level nodes also involves algebraic operations like addition “+” or
multiplication “×”. In these functions the state of Boolean nodes is interpreted as the
integers 1 or 0. For example, the Boolean nodes A=True=1, B=False=0, and C=True=1
will yield the algebraic relationships A+B=1 and A+C=2. If a multi-level node, say D,
is used in a Boolean logic function, we use clauses like “(D>0)” or “(D=2)” to convert
its state to Boolean values. As in the Sun et al. model, the regulatory functions of several
nodes are indicated as truth tables that summarize the next state of the target node for
every combination of the states of its regulators.
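The conventions above can be illustrated with a minimal Python sketch. It assumes node states are stored in a dictionary keyed by node name; the function names (`next_PLD`, `next_NO`, `next_ABI1`) and the example state are hypothetical and only mirror three of the regulatory functions listed in this appendix.

```python
# Sketch: evaluating mixed Boolean/algebraic regulatory functions.
# Node states are plain numbers; Boolean results are mapped to 1/0 with int().

def next_PLD(s):
    # PLD* = ABA + NO  (algebraic sum of two Boolean nodes, giving 0, 1 or 2)
    return s["ABA"] + s["NO"]

def next_NO(s):
    # NO* = (photophosphorylation>0) And ROS
    # The multi-level node is converted to a Boolean by the clause (... > 0).
    return int(s["photophosphorylation"] > 0 and bool(s["ROS"]))

def next_ABI1(s):
    # ABI1* = Not ABA
    return int(not s["ABA"])

# Hypothetical current state of the regulators (illustration only).
state = {"ABA": 1, "NO": 1, "ROS": 0, "photophosphorylation": 2}
```

With this example state, `next_PLD(state)` adds two Boolean nodes and yields the multi-level value 2, while `next_NO(state)` stays Boolean because both of its operands are first reduced to truth values.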
Compared with the original model, 15 nodes in the reduced model kept the same
regulatory functions, namely CaIC, CaR, FFA, [Ca2+]c, Ca2+ATPase, KEV, PLD, PMV,
photophosphorylation, carbon fixation, sucrose, Kin, Kout, [K+]v, MCPS, Ci.
Blue Light* =Blue Light
Red Light* =Red Light
ABA* =ABA
CO2* =CO2
phot1complex* = Blue Light
PLC* = Blue Light Or ABA And [Ca2+]c
PLA2β* = (phot1complex Or Blue Light Or Red Light)
CaIC* = ROS And (PMV<0)
CaR* = NO Or PLC
NO* = (photophosphorylation>0) And ROS
[Ca2+]c* = ((CaIC or CaR) And Not Ca2+ ATPase) + ABA
Ca2+ ATPase* = ([Ca2+]c >0)
PP1cc truth table:
Blue Light | phot1complex | PLD | PP1cc*
0 | 0 | 0 | 2
0 | 0 | 1 | 1.5
0 | 0 | 2 | 1
0 | 1 | 0 | 4
0 | 1 | 1 | 3.5
0 | 1 | 2 | 3
1 | any | 0 | 4
1 | any | 1 | 3.5
1 | any | 2 | 3
Protein kinase truth table:
Ci | PP1cc | protein kinase*
any | 0 | 0
0 | 1, 1.5 | 0.5
0 | 2 | 1
0 | 3, 3.5 | 1.5
0 | 4 | 3
0.5 | 1, 1.5 | 0
0.5 | 2 | 0.5
0.5 | 3 | 0.5
0.5 | 3.5 | 1.5
0.5 | 4 | 2
1 | 1, 1.5, 2 | 0
1 | 3 | 0.5
1 | 3.5 | 0.9
1 | 4 | 1
2 | any | 0
H+ ATPasecomplex* = ((FFA Or PLA2β) And Not ([Ca2+]c = 2)) × PK × (1 + photophosphorylation)
FFA* = PLA2β
PMV* = PMV - (H+ ATPasecomplex>0) + (AnionCh And (PMV<0)) + (([Ca2+]c = 2) Or KEV)
Kin* = (FFA Or Not ([Ca2+]c = 2) Or ABA) And (Not (Ci = 2)) And (PMV<0)
Kout* = (ABA Or (Ci = 2) Or (Not ROS) Or Not NO Or Not FFA) And (PMV>0)
[K+]c* = [(Kin Or KEV And [K+]v) And Not Kout] × (H+ ATPasecomplex ≥ AnionCh) × H+ ATPasecomplex
KEV* = ([Ca2+]c = 2) And [K+]v
[K+]v* = [K+]c
sucrose* = carbon_fixation And Not ABA
Ci truth table:
consumption = max{carbon_fixation, MCPS}.
CO2 | consumption | Ci*
0 (CO2-free air) | Any | 0
1 (moderate atmospheric CO2) | 0 or 1 | 1
1 (moderate atmospheric CO2) | 2 | 0.5
2 (high atmospheric CO2) | Any | 2
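As an illustration, the Ci truth table can be written as a small lookup function. This is a sketch under the assumption that CO2, carbon_fixation and MCPS are passed as numbers at the levels listed earlier in this appendix; the function name `next_Ci` is hypothetical.

```python
def next_Ci(CO2, carbon_fixation, MCPS):
    """Return Ci* according to the Ci truth table (illustrative sketch)."""
    # Intermediate variable defined above the table.
    consumption = max(carbon_fixation, MCPS)
    if CO2 == 0:            # CO2-free air: Ci* = 0 for any consumption
        return 0
    if CO2 == 2:            # high atmospheric CO2: Ci* = 2 for any consumption
        return 2
    if CO2 == 1:            # moderate atmospheric CO2: depends on consumption
        return 0.5 if consumption == 2 else 1
    raise ValueError("CO2 must be 0, 1 or 2")
```

Only the moderate-CO2 row depends on consumption, which is why the other two cases return before `consumption` is inspected.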
photophosphorylation* = Blue Light + Red Light
Carbon_fixation* = (CO2 or Ci) × photophosphorylation
PLD* = ABA + NO
ABI1* = Not ABA
ROS* = (photophosphorylation>0) And PLD And Not ABI1
AnionCh truth table:
An intermediate variable is calculated first:
Anionhighactivation = (([Ca2+]c = 2) Or ABA) And Not ABI1 Or (Ci = 2)
Anionhighactivation | phot1complex | Blue Light | AnionCh*
0 | 0 | 0 | 1
0 | 0 | 1 | 0
0 | 1 | Any | 0
1 | Any | Any | 1.6
MCPS* = (Blue Light +Red Light) × (Ci>0)
SO truth table:
[K+]v | Sucrose | SO*
0 | 0 | 0
0 | Sucrose >0 | 1
0< [K+]v <=1 | Any | 1
1<[K+]v <2 | Any | 2
2<=[K+]v <6 | Any | 3
6<=[K+]v<9 | Any | 5
9<=[K+]v | Any | 6
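The threshold structure of the SO (stomatal opening) truth table can be sketched as a chain of comparisons. The function name `next_SO` and its argument names are hypothetical; the thresholds are taken directly from the rows of the table.

```python
def next_SO(K_v, sucrose):
    """Return SO* from the vacuolar K+ level and sucrose (illustrative sketch)."""
    if K_v == 0:
        # Without vacuolar K+, only sucrose can produce a minimal opening.
        return 1 if sucrose > 0 else 0
    if K_v <= 1:
        return 1
    if K_v < 2:
        return 2
    if K_v < 6:
        return 3
    if K_v < 9:
        return 5
    return 6
```

Because each branch returns, every comparison only needs the upper bound of its row; the lower bound is implied by the branches that were already skipped.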
A2 Table of stomatal opening levels for simulated single node knockouts in the
reduced model
The following table shows the simulated steady state stomatal opening level for each
single node knockout in the reduced multi-level stomatal opening model. A color scale
is used to reflect whether the perturbed condition yields a different level of stomatal
opening as compared to wild type. White: the opening level is the same as the wild type
opening level; green: there is a reduction compared with the wild type opening. Blue:
there is an increase compared with the wild type opening. Both green and blue marked
knockouts yield the same qualitative result in the Sun et al. model. Yellow: there is a
small reduction in the stomatal opening reported in the Sun et al. model, which is not
observed in the reduced model because it groups stomatal opening values. Notation
“mod” in CO2 concentration means "moderate"; node "MCPS" is short for the node
"mesophyll cell photosynthesis".
ABA absent
light condition       | dual beam      | blue light     | red light
CO2 concentration     | mod  low  high | mod  low  high | mod  low  high
node being knocked out
None (wild type)      | 5    6    1    | 3    5    1    | 1    3    1
phot1complex          | 5    6    1    | 3    5    1    | 1    3    1
PLC                   | 5    6    1    | 3    5    1    | 1    3    1
PLA2β                 | 1    0    1    | 1    0    1    | 1    0    1
CaIC                  | 5    6    1    | 3    5    1    | 1    3    1
CaR                   | 5    6    1    | 3    5    1    | 1    3    1
NO                    | 5    6    1    | 3    5    1    | 1    3    1
[Ca2+]c               | 5    6    1    | 3    5    1    | 1    3    1
Ca2+-ATPase           | 5    6    1    | 3    5    1    | 1    3    1
PP1cc                 | 1    0    1    | 1    0    1    | 1    0    1
protein kinase        | 1    0    1    | 1    0    1    | 1    0    1
H+-ATPasecomplex      | 1    0    1    | 1    0    1    | 1    0    1
FFA                   | 5    6    1    | 3    5    1    | 1    3    1
PMV                   | 1    0    1    | 1    0    1    | 1    0    1
Kin                   | 1    0    1    | 1    0    1    | 1    0    1
Kout                  | 5    6    1    | 3    5    1    | 1    3    1
[K+]c                 | 1    0    1    | 1    0    1    | 1    0    1
KEV                   | 5    6    1    | 3    5    1    | 1    3    1
[K+]v                 | 1    0    1    | 1    0    1    | 1    0    1
sucrose               | 5    6    0    | 3    5    0    | 0    3    0
photophosphorylation  | 3    3    1    | 1    3    1    | 1    0    1
carbon fixation       | 5    6    1    | 3    5    1    | 1    3    1
PLD                   | 5    6    1    | 3    5    1    | 1    3    1
ABI1                  | 5    6    1    | 3    5    1    | 1    3    1
ROS                   | 5    6    1    | 3    5    1    | 1    3    1
AnionCh               | 5    6    1    | 3    5    1    | 1    3    1
MCPS                  | 5    6    1    | 3    5    1    | 1    3    1
ABA present
light condition       | dual beam      | blue light     | red light
CO2 concentration     | mod  low  high | mod  low  high | mod  low  high
node being knocked out
None (wild type)      | 0    3    0    | 0    3    0    | 0    0    0
phot1complex          | 0    3    0    | 0    3    0    | 0    0    0
PLC                   | 0    3    0    | 0    3    0    | 0    0    0
PLA2β                 | 0    0    0    | 0    0    0    | 0    0    0
CaIC                  | 0    3    0    | 0    3    0    | 0    0    0
CaR                   | 0    3    0    | 0    3    0    | 0    0    0
NO                    | 3    3    0    | 2    3    0    | 0    0    0
[Ca2+]c               | 0    3    0    | 0    3    0    | 0    0    0
Ca2+-ATPase           | 0    0    0    | 0    0    0    | 0    0    0
PP1cc                 | 0    0    0    | 0    0    0    | 0    0    0
protein kinase        | 0    0    0    | 0    0    0    | 0    0    0
H+-ATPasecomplex      | 0    0    0    | 0    0    0    | 0    0    0
FFA                   | 0    0    0    | 0    0    0    | 0    0    0
PMV                   | 0    0    0    | 0    0    0    | 0    0    0
Kin                   | 0    0    0    | 0    0    0    | 0    0    0
Kout                  | 0    3    0    | 0    3    0    | 0    0    0
[K+]c                 | 0    0    0    | 0    0    0    | 0    0    0
KEV                   | 0    0    0    | 0    3    0    | 0    0    0
[K+]v                 | 0    0    0    | 0    0    0    | 0    0    0
sucrose               | 0    3    0    | 0    3    0    | 0    0    0
photophosphorylation  | 0    0    0    | 0    0    0    | 0    0    0
carbon fixation       | 0    3    0    | 0    3    0    | 0    0    0
PLD                   | 5    6    0    | 3    5    0    | 0    3    0
ABI1                  | 0    3    0    | 0    3    0    | 0    0    0
ROS                   | 3    3    0    | 2    3    0    | 0    0    0
AnionCh               | 2    3    0    | 1    3    0    | 0    0    0
MCPS                  | 0    3    0    | 0    3    0    | 0    0    0
Table A.2 Stomatal opening levels for simulated single node knockouts in the reduced
model
Appendix B A general method to find the
attractors of discrete dynamic models of
biological systems
B1 Runtime performance of the multi-level Quine-McCluskey algorithm
The computational complexity of the Boolean Quine–McCluskey algorithm grows
exponentially with the number of variables, because the problem it solves is NP-hard;
it has been shown that the upper bound on the number of prime implicants of a Boolean
function with n variables is $3^n \ln(n)$ [152]. Since a Boolean function is a special
case of a discrete function, finding all prime implicants of a multi-level function is
at least as complex as finding all prime implicants of a Boolean function.
To test whether the multi-level QM algorithm is capable of analyzing biological network
models, we benchmarked how long it takes for the algorithm to transform all node
functions of 100 randomly generated heterogeneous networks. The networks have 50
nodes and a power-law in-degree distribution with exponent -3 and maximum
degree 8. Each node has a 60% chance of having 2 states, a 25% chance of having 3 states,
a 10% chance of having 4 states, and a 5% chance of having 5 states. These parameters
exceed the complexity of current multi-level biological models. The result is shown in
Figure B.1: the multi-level QM algorithm can effectively transform the functions. In
addition, we found that within the algorithm, identifying stable or oscillating motifs
is much more costly than the QM transformation. We therefore conclude
that the complexity of the QM algorithm is acceptable for practical problems.
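The benchmark parameters described above could be sampled along the following lines. This is only a sketch, not the code used for the benchmark: it assumes a minimum in-degree of 1, draws each node independently, and the function name `sample_node_parameters` is hypothetical. Wiring the sampled in-degrees into a network and assigning random functions are not shown.

```python
import random

STATE_COUNTS = [2, 3, 4, 5]
STATE_PROBS = [0.60, 0.25, 0.10, 0.05]   # chance of 2, 3, 4 or 5 states
MAX_DEGREE = 8
EXPONENT = 3                              # P(k) proportional to k**(-3)

def sample_node_parameters(n_nodes, rng):
    """Draw (in_degree, number_of_states) for each node of one random network."""
    degrees = list(range(1, MAX_DEGREE + 1))        # assumption: minimum in-degree 1
    weights = [k ** -EXPONENT for k in degrees]     # truncated power law
    in_degrees = rng.choices(degrees, weights=weights, k=n_nodes)
    n_states = rng.choices(STATE_COUNTS, weights=STATE_PROBS, k=n_nodes)
    return list(zip(in_degrees, n_states))

# Example: parameters for one 50-node network with a seeded generator.
params = sample_node_parameters(50, random.Random(0))
```

Passing a seeded `random.Random` instance keeps each generated network reproducible, which is convenient when comparing runtimes across repeated runs.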
Figure B.1 Histogram of QM transformation runtime on 100 randomly generated
heterogeneous networks with 50 nodes. The result shows that the complexity of QM
transformation is much less than identifying motifs.
B2 Description of the multi-level Quine-McCluskey algorithm
Here we describe the implementation of the multi-level Quine-McCluskey algorithm:
1. Scan all functions to get all the states of each node.
2. For each function, enumerate all input combinations to get the minterms; call this
list1.
3. Group the implicants in list1 according to their number of zeroes.
4. Compare neighboring groups:
For each implicant1 in group i:
For each implicant2 in group i+1:
If implicant1 and implicant2 differ by exactly 1 digit:
Look up the implicants that take each of the other states of the differing
node; if all of them are present in list1, merge the implicants.
5. If an implicant does not get merged in any comparison, mark it. Go to step 4 with
i+=1.
6. If there is no merged implicant, proceed to step 7. Otherwise set list1 to be the
merged implicants, then go to step 3.
7. The marked implicants are the prime implicants.
8. Go to step 2 with the next function; repeat until all functions are transformed.
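The merging idea behind steps 2 to 7 can be sketched in Python for a single function and a single target output level. This is a simplified illustration rather than the implementation described above: it omits the grouping by number of zeroes (which only limits the number of comparisons), represents an implicant as a tuple with `None` meaning "any level", and the names `prime_implicants`, `levels` and `target` are hypothetical.

```python
from itertools import product

def prime_implicants(func, levels, target):
    """Prime implicants of the condition func(*inputs) == target.

    levels[j] is the number of levels of input j. Each implicant is a tuple
    whose entries are a specific level or None (any level of that input).
    """
    # Minterms: every input combination that produces the target level.
    current = {s for s in product(*(range(n) for n in levels))
               if func(*s) == target}
    primes = set()
    while current:
        merged_away, merged = set(), set()
        for imp in current:
            for pos, val in enumerate(imp):
                if val is None:
                    continue
                # Implicants equal to imp except at pos, one per level of input pos.
                siblings = [imp[:pos] + (l,) + imp[pos + 1:]
                            for l in range(levels[pos])]
                # Merge only if ALL levels of the differing input are covered.
                if all(s in current for s in siblings):
                    merged.add(imp[:pos] + (None,) + imp[pos + 1:])
                    merged_away.update(siblings)
        primes |= current - merged_away   # unmerged implicants are prime
        current = merged
    return primes

# Hypothetical example: input A has 3 levels, input B has 2 levels.
f = lambda A, B: 1 if (A == 2 or B == 1) else 0
```

In this example the output level 1 has the prime implicants "A = 2" and "B = 1", whereas for output level 0 the minterms (A=0, B=0) and (A=1, B=0) cannot be merged because A=2 is missing; this is the multi-level requirement that all states of the differing node be present.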
B3 Mathematical foundations of the motif-based attractor identification
algorithm
In this section we rigorously define the concepts we used in our motif-based method,
and present important conclusions on why stable motifs and oscillating motifs can be
used to find attractors. Our method does not depend on the update scheme, so the
complex attractors predicted by our method are consistent with the complex attractors
under an asynchronous update where one node is updated per time step. An efficient
way to implement the most general case of asynchronous update is to randomly choose
a node to update at each time step, which is the ‘general asynchronous update’ we
mentioned in the main text. It is a representative update scheme for the broad class of
update schemes where our method can accurately find all attractors.
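A minimal sketch of the general asynchronous update is given below. It assumes the model is stored as a dictionary of update functions keyed by node name; the function name `general_async_update` is hypothetical, and the small example network only reuses the regulatory functions of phot1complex, PLA2β and FFA from Appendix A for illustration.

```python
import random

def general_async_update(state, rules, steps, rng):
    """Update one randomly chosen node per time step; return the final state."""
    state = dict(state)          # work on a copy of the caller's state
    nodes = sorted(rules)        # fixed order so that a seeded run is reproducible
    for _ in range(steps):
        node = rng.choice(nodes)             # every node is equally likely
        state[node] = rules[node](state)     # only this node changes
    return state

# Illustrative sub-network (input nodes are self-regulated).
rules = {
    "Blue Light": lambda s: s["Blue Light"],
    "Red Light": lambda s: s["Red Light"],
    "phot1complex": lambda s: s["Blue Light"],
    "PLA2β": lambda s: int(s["phot1complex"] or s["Blue Light"] or s["Red Light"]),
    "FFA": lambda s: s["PLA2β"],
}
initial = {"Blue Light": 1, "Red Light": 0, "phot1complex": 0, "PLA2β": 0, "FFA": 0}
final = general_async_update(initial, rules, steps=500, rng=random.Random(0))
```

Since this small example has a single fixed point for the given inputs, a sufficiently long run ends in the same state regardless of the random update order; in models with several attractors, repeated runs with different seeds would be needed to sample them.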
Mathematical definitions of node states and regulatory functions
Let $v_i$, $i = (1, 2, \ldots, N)$ be the $N$ nodes of a multi-level dynamical system, and
let $m_i$, $i = (1, 2, \ldots, N)$ be the highest level of node $v_i$ (which means that it
has $m_i + 1$ levels, namely $0, 1, \ldots, m_i$). Let $\sigma_i$, $i = (1, 2, \ldots, N)$
be a state of the $i$th node $v_i$, and let $\Sigma = (\sigma_1, \sigma_2, \ldots, \sigma_N)$
be a state of the entire system. We use $\Sigma_P$ to represent a partial system state
where $P = (\sigma_{m_1} = l_1, \sigma_{m_2} = l_2, \ldots, \sigma_{m_M} = l_M)$, $M < N$,
is a subset of nodes that have their states specified, while the other states are unspecified.
Alternatively, we can represent the system with virtual nodes. We use $v_i^{(l)}$,
$l = (0, 1, \ldots, m_i)$ to represent the virtual node for the $l$th state of $v_i$. The
total number of virtual nodes is $N_v = \sum_{i=1}^{N} (m_i + 1)$. $v_i^{(l)}$ is
Boolean-like, meaning that it can only have state values 0 or 1. The state of each
virtual node is now represented by $\sigma_i^{(l)}$, $i = (1, 2, \ldots, N)$,
$l = (0, 1, \ldots, m_i)$. The state of the system is then represented as
$\Sigma = (\sigma_1^{(0)}, \sigma_1^{(1)}, \ldots, \sigma_1^{(m_1)}, \sigma_2^{(0)}, \sigma_2^{(1)}, \ldots, \sigma_N^{(m_N)})$.
Let $f_i: \mathbb{N}^N \to \{0, 1, \ldots, m_i\}$ be the regulatory function of node $v_i$,
where $\mathbb{N}^N$ is the potential state space of the system (as node levels are
described by natural numbers); the actual state space has levels $0, 1, \ldots, m_j$ for
each node $j$. The regulatory function of each virtual node is a function of virtual
nodes, e.g. the function of the $i$th node’s $l$th state is
$f_i^{(l)}(\sigma_{k_1}^{(l_1)}, \sigma_{k_2}^{(l_2)}, \ldots)$ (where $k_j$ is the $j$th
input of node $i$), thus it is Boolean-like, $f_i^{(l)}: \{0,1\}^{N_v} \to \{0,1\}$. Let
$F = (f_1^{(0)}, f_1^{(1)}, \ldots, f_1^{(m_1)}, f_2^{(0)}, f_2^{(1)}, \ldots, f_2^{(m_2)}, \ldots, f_N^{(m_N)})$
be the vector of all virtual node functions. We use $f_i^{(l)}(\Sigma)$ to represent a
function of a virtual node evaluated under state $\Sigma$ of the system, and
$f_i^{(l)}|_P$ to represent a function evaluated under a partial state $P$, where only
$P = (\sigma_{p_1}^{(0)}, \sigma_{p_1}^{(1)}, \ldots, \sigma_{p_2}^{(0)}, \sigma_{p_2}^{(1)}, \ldots, \sigma_{p_k}^{(0)}, \sigma_{p_k}^{(1)}, \ldots)$
are evaluated.
The virtual nodes that correspond to the same original node $v_i$ are called ‘sibling
nodes’ of each other, and these nodes form a sibling set of $v_i$, represented with
$S_i = \{v_i^{(l)}\}$, $l = 0, 1, \dots, m_i$. A sibling set satisfies the following property:
when the functions of these nodes are evaluated based on a state $\Sigma$, one and only one
of the functions in the set is 1 and the rest are 0, i.e.
$\sum_{l=0}^{m_j} f_j^{(l)}(\Sigma) = 1$ and $f_j^{(k)}(\Sigma) f_j^{(l)}(\Sigma) = 0$,
$\forall k \neq l$. When implemented in a simulation, all sibling virtual nodes corresponding
to the same original node should be evaluated simultaneously.
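The virtual-node representation can be sketched as a one-hot encoding of each node's level. The snippet below is only an illustration: the function names and the two-node example (A with levels 0 to 2, B with levels 0 to 1) are assumptions, not part of the model in the text. It checks the exactly-one-sibling condition on encoded states; the text states the analogous condition for the function values.

```python
from typing import Dict, Tuple

VNode = Tuple[str, int]  # virtual node: (original node name, level)


def to_virtual_state(state: Dict[str, int], max_level: Dict[str, int]) -> Dict[VNode, int]:
    """Encode each node level sigma_i as Boolean-like states sigma_i^(l), l = 0..m_i."""
    virtual: Dict[VNode, int] = {}
    for node, level in state.items():
        if not 0 <= level <= max_level[node]:
            raise ValueError(f"level {level} is out of range for node {node}")
        for l in range(max_level[node] + 1):
            virtual[(node, l)] = 1 if l == level else 0
    return virtual


def sibling_sets_valid(virtual: Dict[VNode, int], max_level: Dict[str, int]) -> bool:
    """Exactly one virtual node per sibling set may be 1."""
    return all(
        sum(virtual[(node, l)] for l in range(m + 1)) == 1
        for node, m in max_level.items()
    )


# Hypothetical example: A has levels 0..2, B has levels 0..1.
max_level = {"A": 2, "B": 1}
virtual = to_virtual_state({"A": 2, "B": 0}, max_level)
n_virtual = sum(m + 1 for m in max_level.values())  # N_v = sum over nodes of (m_i + 1)
```

Here `n_virtual` reproduces the count $N_v$ for the toy system.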
We assume that each of the virtual nodes’ regulatory functions has the following
properties:
1. Non-constant. $f_i^{(l)}$ is not a constant, i.e. $f_i^{(l)} \neq 0$ and
$f_i^{(l)} \neq 1$.
2. Each input node is effective. If $f_i$ depends on node $v_j$, then there must be at least
one pair of network states $\Sigma^{(1)}$ and $\Sigma^{(2)}$ with
$\sigma_j^{(1)} \neq \sigma_j^{(2)}$ and $\sigma_k^{(1)} = \sigma_k^{(2)}$ for all $k \neq j$
such that $f_i(\Sigma^{(1)}) \neq f_i(\Sigma^{(2)})$. Or, equivalently in terms of virtual
nodes, if a sibling node function set $F_i = \{f_i^{(l)}\}$, $l = 0, 1, \dots, m_i$, depends
on a set of sibling nodes $S_j$, then there must be at least one pair of network states
$\Sigma^{(1)}$ and $\Sigma^{(2)}$ with $\sigma_j^{(1)} \neq \sigma_j^{(2)}$ and
$\sigma_k^{(1)} = \sigma_k^{(2)}$ for all $k \neq j$, such that
$\exists\, l: f_i^{(l)}(\Sigma^{(1)}) \neq f_i^{(l)}(\Sigma^{(2)})$.
3. Each Boolean-like function $f_i^{(l)}$ is in a disjunctive normal form (specifically, in a
Blake canonical form), with the inputs being the virtual nodes:
$$f_i^{(l)} = \left(v_{j_1}^{(l_1)} \text{ and } v_{j_2}^{(l_2)} \text{ and } \dots \text{ and } v_{j_{c_1}}^{(l_{c_1})}\right) \text{ or } \left(v_{j_{c_1+1}}^{(l_{c_1+1})} \text{ and } v_{j_{c_1+2}}^{(l_{c_1+2})} \text{ and } \dots \text{ and } v_{j_{c_2}}^{(l_{c_2})}\right) \text{ or } \dots$$
In addition, if for a network state subset $P \subset \Sigma$, $f_i^{(l)}|_P = 1$ regardless of
the states of the other nodes, then the disjunctive normal form of $f_i^{(l)}$ must have at
least one conjunctive clause equal to 1 when evaluated under this partial state $P$.
Definition of the expanded network
The expanded network is a graph embodiment of the virtual nodes and their
regulatory functions. The nodes of the expanded network consist of virtual nodes
$v_i^{(l_j)}$, $i = 1, 2, \dots, N$, $j = 1, 2, \dots, m_i$, and composite nodes (which
represent ‘and’ rules) $v_k^{(comp)}$, $k = 1, 2, \dots, K$, where $K$ is the total number of
‘and’ rules used in the functions. The edges of the expanded network can be one of two
types: edges from virtual or composite nodes to virtual nodes (which are aggregated with
‘or’ rules); and edges from virtual nodes to composite nodes (which are aggregated with
‘and’ rules). One can think of virtual nodes as having a function that contains only the
Boolean operator ‘or’: $f_i^{(l)} = I_1 \text{ or } I_2 \text{ or } \dots$, where the $I$'s are
inputs of the virtual node in the expanded network, including both virtual nodes and
composite nodes. The composite nodes can be treated as having only the Boolean operator
‘and’: $f_i^{(comp)} = I_1 \text{ and } I_2 \text{ and } \dots$, where the $I$'s are the inputs
(virtual nodes) of the composite node. An example is provided in Sec. II E.
We define a sufficient regulator of a virtual node A as either a virtual node connected
directly to A, or a composite node together with all of its input virtual nodes. Thus a
sufficient regulator may be a group of virtual nodes.
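A minimal sketch of how an expanded network could be assembled from functions given in disjunctive normal form is shown below. The data layout (a dict from virtual node to a list of clauses) and the `("comp", ...)` naming of composite nodes are illustrative assumptions. The two functions used are those of node C from the three-node Boolean example that accompanies Proposition 3.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

VNode = Tuple[str, int]      # virtual node: (original node name, level)
Clause = FrozenSet[VNode]    # one conjunctive clause of a DNF function


def build_expanded_network(dnf: Dict[VNode, List[Clause]]) -> Dict[object, Set[object]]:
    """Return the expanded network as an adjacency dict: node -> set of targets."""
    edges: Dict[object, Set[object]] = {}

    def add_edge(src: object, dst: object) -> None:
        edges.setdefault(src, set()).add(dst)
        edges.setdefault(dst, set())

    for target, clauses in dnf.items():
        for clause in clauses:
            if len(clause) == 1:
                # A single-literal clause is a direct virtual-to-virtual edge.
                (literal,) = clause
                add_edge(literal, target)
            else:
                # An 'and' clause becomes a composite node fed by all its literals.
                comp = ("comp", tuple(sorted(clause)))
                for literal in clause:
                    add_edge(literal, comp)
                add_edge(comp, target)
    return edges


# f_C^(1) = (A1 and B1) or C1;  f_C^(0) = (B0 and C0) or (A0 and C0)
dnf = {
    ("C", 1): [frozenset({("A", 1), ("B", 1)}), frozenset({("C", 1)})],
    ("C", 0): [frozenset({("B", 0), ("C", 0)}), frozenset({("A", 0), ("C", 0)})],
}
expanded = build_expanded_network(dnf)
composite_nodes = [n for n in expanded if n[0] == "comp"]
```

In this representation a sufficient regulator of a virtual node is either a direct virtual-node predecessor or a composite predecessor together with the literals stored in its name.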
Definitions of motifs
A stable motif is defined as a strongly-connected-component (SCC) of the expanded
network that satisfies:
(1) If $v_i^{(l)}$ is in the SCC, then any $v_i^{(k)}$, $k \neq l$, is not in the SCC.
(2) If $v_k^{(comp)}$ is in the SCC, then all of its inputs are in the SCC.
An oscillating motif is defined as a strongly-connected-component (SCC) of the
expanded network that satisfies:
(1) There exists a $v_i^{(l)}$ in the SCC such that at least one of its sibling nodes, say
$v_i^{(k)}$, $k \neq l$, is also in the SCC.
(2) If $v_k^{(comp)}$ is in the SCC, then all of its inputs are in the SCC.
These motifs are described and illustrated in Sec. II F and G.
We also define a self-sufficient motif as an SCC in the expanded network that satisfies:
If $v_k^{(comp)}$ is in the SCC, then all of its inputs are in the SCC. The intuition of this
SCC is that it is a self-sustaining feedback loop. Stable motifs and oscillating motifs are
special self-sufficient motifs, with extra requirements in the states involved in the motif.
It is important to note that both stable motifs and oscillating motifs correspond to SCCs
in the original network. Stable motifs are SCCs in which all cycles are positive.
Oscillating motifs contain negative cycles. These negative cycles may only be apparent
when considering the specific regulatory functions.
In analogy to source nodes (i.e. nodes that do not have incoming edges), we call an
SCC a source SCC if there are no nodes other than the nodes of the SCC that can reach
the source SCC through directed paths.
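The motif definitions can be illustrated with a small classifier over the SCCs of an expanded network. This is a sketch under simplifying assumptions: it inspects maximal SCCs only (strongly connected sub-components are not enumerated), composite nodes are assumed to be tuples of the form `("comp", inputs)`, and the toy rules (A* = B, B* = A, C* = not C) are hypothetical.

```python
from typing import Dict, FrozenSet, Optional, Set


def reachable(edges: Dict[object, Set[object]], start: object) -> Set[object]:
    """Nodes reachable from start through at least one edge."""
    seen: Set[object] = set()
    stack = [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def strongly_connected_components(edges: Dict[object, Set[object]]) -> Set[FrozenSet[object]]:
    """SCCs that contain at least one cycle (self-loops included)."""
    reach = {n: reachable(edges, n) for n in edges}
    sccs = set()
    for n in edges:
        if n in reach[n]:
            sccs.add(frozenset(x for x in reach[n] if n in reach.get(x, set())))
    return sccs


def classify(scc: FrozenSet[object]) -> Optional[str]:
    """Return 'stable', 'oscillating', or None if a composite node lacks an input."""
    for node in scc:
        if node[0] == "comp" and not set(node[1]) <= scc:
            return None  # condition (2) fails: not self-sufficient
    originals = [node[0] for node in scc if node[0] != "comp"]
    # Condition (1): no sibling pair means stable; a sibling pair means oscillating.
    return "stable" if len(originals) == len(set(originals)) else "oscillating"


# Hypothetical expanded network for A* = B, B* = A, C* = not C.
edges = {
    ("A", 1): {("B", 1)}, ("B", 1): {("A", 1)},
    ("A", 0): {("B", 0)}, ("B", 0): {("A", 0)},
    ("C", 0): {("C", 1)}, ("C", 1): {("C", 0)},
}
motifs = {scc: classify(scc) for scc in strongly_connected_components(edges)}
```

The reachability-based SCC search is quadratic and is chosen for brevity; Tarjan's algorithm would be the usual choice for larger networks.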
There is a one-to-one correspondence between stable motifs and partial fixed points
We define a partial fixed point (or partial steady state), as a set of nodes and associated
states in which the nodes stabilize regardless of the rest of the network. Note that this
definition expresses a stricter condition than a set of nodes whose states stabilize in a
certain context (which depends on the rest of the network).
We show that each stable motif corresponds to a partial fixed point of the system, and
that each partial fixed point corresponds to a stable motif.
Proposition 1. A stable motif corresponds to a fixed point of the nodes that participate
in the motif, i.e. the states of the nodes of the stable motif remain the same regardless
of the state of the other nodes. Formally,
Let $M = (v_{j_1}^{(l_1)}, v_{j_2}^{(l_2)}, \dots, v_{j_k}^{(l_k)}, v_{m_1}^{(comp)}, v_{m_2}^{(comp)}, \dots, v_{m_L}^{(comp)})$
be a stable motif where $v_{j_1}^{(l_1)}, v_{j_2}^{(l_2)}, \dots, v_{j_k}^{(l_k)}$ are virtual
nodes and $v_{m_1}^{(comp)}, v_{m_2}^{(comp)}, \dots, v_{m_L}^{(comp)}$ are composite nodes. Let
$P = (\sigma_{j_1} = l_1, \sigma_{j_2} = l_2, \dots, \sigma_{j_k} = l_k)$ be a partial system
state. Then for any system state $\Sigma_P$ with $\sigma_{j_i} = l_i$, we have
$f_{j_i}^{(l_k)}(\Sigma_P) = \delta_{ik}$.
Sketch of proof: We first show that $f_{j_i}^{(l_i)}(\Sigma_P) = 1$. By definition of a stable
motif, each virtual node’s function must have a conjunctive clause (implicant) that consists
of either of the following: (1) a virtual node of the same stable motif; or (2) a composite
node whose inputs consist only of virtual nodes of the same stable motif. This implicant
will be 1 when $f_{j_i}^{(l_i)}(\Sigma_P)$ is evaluated, fixing the value
$f_{j_i}^{(l_i)}(\Sigma_P) = 1$. Then $f_{j_i}^{(l_k)}(\Sigma_P) = 0$ $\forall k \neq i$ is
trivially true because the functions of sibling nodes must satisfy
$f_j^{(k)}(\Sigma) f_j^{(l)}(\Sigma) = 0$ $\forall k \neq l$.
Proposition 2. (Reverse of proposition 1) For any partial fixed point of the system, i.e.
a set of node states where updating any involved node gives back the same state for the
node, there is a set of stable motifs that correspond to it. Formally,
Let $P = (\sigma_{j_1} = l_1, \sigma_{j_2} = l_2, \dots, \sigma_{j_k} = l_k)$ be a partial
system state such that $f_{j_i}^{(l_i)}(\Sigma_P) = 1$, $\forall j_i$. Then (1) there exists a
set of stable motifs $\{M_n\}$ where each stable motif contains only nodes from
$\{v_{j_i}^{(l_i)}\}$, $i = 1, \dots, k$, as virtual nodes; (2) the nodes specified in $P$ but
not among the nodes of $\{M_n\}$ are downstream of the nodes of $\{M_n\}$.
Sketch of proof: From the disjunctive normal form of the functions,
$f_{j_i}^{(l_i)}(\Sigma_P) = 1$ means that at least one of the conjunctive clauses of each
function is 1, and consists of virtual nodes specified in $P$. Then one can create a
sub-network of the expanded network, whose nodes are these virtual nodes as well as
composite nodes representing conjunctive clauses; edges are added if a virtual node or
composite node is an input in a virtual node’s function, or if a virtual node is an input of
a composite node. Since each virtual node in this sub-network has at least one input within
the sub-network, there exists at least one SCC. These SCCs are the stable motifs we are
looking for.
Stable and oscillating parts of complex attractors
A complex attractor of the whole system consists of a set of states that the system
keeps revisiting. When considering the states visited by each node in a complex attractor,
there may be a subset of nodes whose state remains the same. We call these nodes
stabilized nodes. The remaining nodes (potentially, all nodes) will oscillate, meaning
that they will keep revisiting all, or possibly a subset, of their states. We will call these
nodes oscillating nodes. In the following two propositions we establish the relationships
between these nodes.
Proposition 3. Stabilized nodes in an attractor can be downstream of stabilized nodes
or downstream of oscillating nodes.
Let $A$ be an attractor of a multi-level dynamical system under general asynchronous
update, and let $S$ and $O$ be the stabilized and oscillating nodes, respectively. If
$v_s \in S$ and $l_s$ is the node’s stabilized value, then one of the following holds: (1) one
of the conjunctive clauses of $f_s^{(l_s)}$ depends only on nodes of $S$ in $A$; if (1) is not
true, then (2) $f_s^{(l_s)}$ and the function of at least one sibling node $f_s^{(k_s)}$,
$k_s \neq l_s$, have at least one conjunctive clause dependent on the nodes in $O$.
The first case is self-evident. An example for the second case is a network with Boolean
nodes A, B and C:
$f_A^{(0)} = (A_1 \text{ or } B_1) \text{ and } C_0$,
$f_A^{(1)} = (A_0 \text{ and } B_0) \text{ or } C_1$,
$f_B^{(0)} = (A_1 \text{ or } B_1) \text{ and } C_0$,
$f_B^{(1)} = (A_0 \text{ and } B_0) \text{ or } C_1$,
$f_C^{(0)} = (B_0 \text{ and } C_0) \text{ or } (A_0 \text{ and } C_0)$,
$f_C^{(1)} = (A_1 \text{ and } B_1) \text{ or } C_1$,
where for simplicity the virtual nodes are denoted $X_i$, $X \in \{A, B, C\}$, instead of
$X^{(i)}$. This network has an oscillating attractor with A and B oscillating and C stabilized
at 0. C is stabilized despite being regulated by nodes that oscillate. It does not satisfy (1)
in the proposition; instead, $f_C^{(0)}$ and $f_C^{(1)}$ satisfy (2) in the proposition.
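The claim about this three-node example can be checked by brute force. The sketch below enumerates the state transition graph under general asynchronous update (one node updated per step) and extracts the attractors as terminal strongly connected sets. The function names are illustrative; the update rules are the $f^{(1)}$ functions of the example.

```python
from itertools import product


def update(state, node):
    """Update one node (0 = A, 1 = B, 2 = C) using its f^(1) function."""
    a, b, c = state
    if node in (0, 1):
        new = int((a == 0 and b == 0) or c == 1)   # f_A^(1) and f_B^(1)
    else:
        new = int((a == 1 and b == 1) or c == 1)   # f_C^(1)
    nxt = list(state)
    nxt[node] = new
    return tuple(nxt)


def successors(state):
    """General asynchronous update: any single node may be updated."""
    return {update(state, i) for i in range(3)}


def reachable(start):
    seen, stack = {start}, [start]
    while stack:
        for nxt in successors(stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def attractors():
    """A state lies in an attractor iff every state it can reach can reach it back."""
    states = list(product((0, 1), repeat=3))
    reach = {s: reachable(s) for s in states}
    return {frozenset(reach[s]) for s in states if all(s in reach[t] for t in reach[s])}


found = attractors()
```

Under these rules the search returns one fixed point, (1,1,1), and one complex attractor in which A and B oscillate while C remains 0, in line with the statement above.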
Proposition 4. Oscillating nodes in an attractor must be downstream of oscillating
nodes.
Let $A$ be an attractor of a multi-level dynamical system under general asynchronous
update, and let $S$ and $O$ be the stabilized and oscillating nodes, respectively. If
$v_O \in O$ and $l_{O_1}, l_{O_2}, \dots, l_{O_k}$ are the oscillating states, then the
following holds: none of the conjunctive clauses of $f_{O_i}^{(l_{O_i})}$,
$i = 1, 2, \dots, k$, depends only on nodes of $S$ in $A$; or alternatively, all functions
$f_{O_i}^{(l_{O_i})}$, $i = 1, 2, \dots, k$, have at least one conjunctive clause dependent on
the state of nodes in $O$.
The proof for proposition 4 is straightforward.
Iterative stable motif based network reduction conserves the attractors of the system
We proceed to the proof of conservation of attractors during iterative network reduction
by stating three lemmas.
Lemma 1. Construction of the stabilized set $S_{red}$ that corresponds to at least one stable motif
Let 𝐴 be an attractor of a multi-level dynamical system under general asynchronous
update, and let 𝑆 and 𝑂 be the stabilized and oscillating nodes, respectively. If there
is a partial fixed point in A, then: there exists a set of nodes 𝑆𝑟𝑒𝑑 ⊂ 𝑆 such that in the
expanded network representation there will be at least one stable motif composed only
of virtual nodes of 𝑆𝑟𝑒𝑑 in A, or composite nodes composed of such nodes.
Sketch of proof: Each stabilized node in $S$ corresponds to a function $f_s^{(l_s)}$. By
Proposition 3, we can divide $S$ into nodes whose functions have a conjunctive clause
that depends only on node states (virtual nodes) specified in $S$, denoted $S_0$, and nodes
that have at least one conjunctive clause in their rule dependent on the states of nodes
in $O$, denoted $S_{osc}$. Let $S_1 \subset S_0$ be the nodes that have at least one
conjunctive clause dependent only on nodes’ states specified in $S_0$. Let $S_2 \subset S_1$
be the nodes that have at least one conjunctive clause dependent only on node states
specified in $S_1$. One can do this iteratively until $S_{i_{max}} = S_{i_{max}+1}$, and
denote $S_{red} = S_{i_{max}}$. Since there exists a partial fixed point, $S_{red}$ will
contain nodes in the partial fixed point and will not be an empty set. The iterative
selection guarantees that $S_{red}$ does not depend on oscillating nodes or nodes influenced
by oscillating nodes. And since the function of each node in $S_{red}$ contains at least one
conjunctive clause dependent only on nodes in $S_{red}$ itself, there is at least one SCC in
$S_{red}$ and this SCC satisfies the definition of a stable motif.
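The iterative selection in this proof sketch is a fixed-point computation, which can be illustrated as follows. The representation is an assumption made for the example: each stabilized node maps to a list of conjunctive clauses of its stabilized-level function, and each clause is reduced to the set of nodes whose attractor states it requires. The four-node example with one oscillating node Q is hypothetical.

```python
from typing import Dict, List, Set


def construct_s_red(stabilized: Set[str], clauses: Dict[str, List[Set[str]]]) -> Set[str]:
    """Shrink S until every remaining node has a clause supported by the remaining set."""
    current = set(stabilized)
    while True:
        kept = {
            node for node in current
            if any(clause <= current for clause in clauses[node])
        }
        if kept == current:      # S_{i_max} = S_{i_max + 1}
            return current
        current = kept


# Hypothetical attractor: S = {X, Y, Z, W} stabilized, Q oscillating.
clauses = {
    "X": [{"Y"}],   # X is sustained by Y
    "Y": [{"X"}],   # Y is sustained by X
    "Z": [{"Q"}],   # Z depends on an oscillating node
    "W": [{"Z"}],   # W depends on a node influenced by an oscillating node
}
s_red = construct_s_red({"X", "Y", "Z", "W"}, clauses)
```

In this toy case the first pass removes Z, the second removes W, and the remaining pair X, Y forms the feedback loop that would appear as a stable motif.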
Lemma 2. Network reduction based on stable motifs stabilizes the nodes in 𝑆𝑟𝑒𝑑
Let 𝑆𝑟𝑒𝑑 ⊂ 𝑆 be the set of nodes constructed in Lemma 1. Then (1) Network reduction
based on stable motifs composed only of nodes from 𝑆𝑟𝑒𝑑 can only stabilize nodes
in $S_{red}$. Moreover, (2) if a node $i$ in $S_{red}$ stabilizes during the reduction, it has
to stabilize at its state specified in $A$; if a node $i$ does not stabilize during the
reduction, then after the reduction, its function $f_i^{(l_s)}$, where $l_s$ is the node’s
stabilized state in $A$, must have a conjunctive clause that depends only on nodes’ states
specified in $S_{red}$ in $A$ that did not stabilize during reduction.
Sketch of proof: We first prove (1) by showing that the other nodes, i.e. nodes in S0 −
𝑆𝑟𝑒𝑑 and 𝑆𝑜𝑠𝑐, cannot stabilize from stable motifs composed only of nodes from 𝑆𝑟𝑒𝑑.
This statement is straightforward from the definitions of S0 − 𝑆𝑟𝑒𝑑 and 𝑆𝑜𝑠𝑐. Nodes
in S0 − 𝑆𝑟𝑒𝑑 do not have any conjunctive clauses that depend only on nodes’ states
from 𝑆𝑟𝑒𝑑, otherwise the nodes would be in 𝑆𝑟𝑒𝑑. According to Proposition 3, nodes
in 𝑆𝑜𝑠𝑐 do not have any conjunctive clauses that depend only on nodes’ states from 𝑆𝑟𝑒𝑑.
Therefore reduction based on stable motifs composed only of nodes from 𝑆𝑟𝑒𝑑 is not
sufficient to stabilize these nodes. To show (2), consider the iterative process of
reduction by plugging in the stabilized nodes’ states. One starts with a chosen SCC
in $S_{red}$, and then nodes with at least one conjunctive clause depending only on nodes’
states from $S_{red}$ will stabilize at their value in $A$. When this reduction is applied
iteratively until it cannot be done anymore, the resulting $S_{red}$ contains only
non-stabilized nodes, whose functions do not have any dependence on the reduced nodes.
Then these functions must have a conjunctive clause that depends only on nodes’ states
specified in $S_{red}$ in $A$ that did not stabilize during reduction.
Lemma 3. In a system/reduced system with no stable motifs, all nodes are influenced
by oscillating nodes.
Let 𝐴 be an attractor of a multi-level dynamical system under general asynchronous
update, and let 𝑆 and 𝑂 be the stabilized and oscillating nodes, respectively.
Let 𝑆𝑟𝑒𝑑 ⊂ 𝑆 be the set of nodes constructed in Lemma 1. Assume 𝑆𝑟𝑒𝑑 is empty and
O is not empty. Then in the original system, all nodes in O and S must be a part of,
or downstream of, a set of source SCCs, each of which contains at least one oscillating
motif. Moreover, the oscillating motifs will contain the virtual nodes corresponding to
all the states visited by the oscillating nodes.
Sketch of proof: We can assume that there are no source nodes in the network
corresponding to the dynamical system, because if there are any, one can reduce them
and substitute the values of the source nodes into the regulatory functions of their
downstream nodes. The network contains one or more source SCCs. Then, any source
SCC in the network must contain at least one oscillating node, otherwise this source
SCC would contain only stabilized nodes, meaning a non-empty $S_{red}$.
We then show that any of these source SCCs corresponds to at least one oscillating
motif in the expanded network. Suppose that a pair of sibling virtual nodes $v_1^{(l_1)}$,
$v_1^{(l_2)}$ correspond to an oscillating node $v_1$ in the source SCC. Since it is a source
SCC, all regulators of $v_1$ are from this SCC, and $v_1$ regulates at least one other node
from this SCC. Consider the expanded network around $v_1^{(l_1)}$. We construct an
oscillating motif candidate starting with marking its regulators and selected targets.
First we mark all inputs of $v_1^{(l_1)}$, including inputs directly connected to
$v_1^{(l_1)}$ and inputs connected to $v_1^{(l_1)}$ via composite nodes. All marked virtual
nodes correspond to nodes in the source SCC. Then we mark the target virtual nodes of
$v_1^{(l_1)}$ that satisfy: (1) the target is regulated directly by $v_1^{(l_1)}$ or via one
composite node; (2) the target corresponds to a node in the source SCC. We iteratively
continue this marking process for all marked virtual nodes. Since in each step only virtual
nodes corresponding to nodes in the source SCC are marked, and each node marked must
have at least one regulator and one selected target, we will obtain an SCC in the expanded
network all of whose virtual nodes correspond to the source SCC in the original graph.
Because we started the process in a source SCC in the original network, if a composite
node is marked, all of its inputs will satisfy the marking condition, and will be marked as
well. We refer to this SCC in the expanded network as the expanded motif, and will show
that it can be used to construct an oscillating motif. Notice that for both $v_1^{(l_1)}$ and
$v_1^{(l_2)}$, one can construct the corresponding expanded motif, respectively. Because this
pair of virtual nodes represents oscillating states under a general asynchronous complex
attractor, they must be connected to each other, otherwise they cannot oscillate. Thus
their expanded motifs are strongly connected, and can be merged to obtain a larger
strongly connected motif that includes both $v_1^{(l_1)}$ and $v_1^{(l_2)}$. In cases where
more than two virtual nodes corresponding to the same node are involved in an oscillation,
the same merging can be applied, and it similarly results in a single expanded motif. This
merging can be done for each pair of oscillating sibling nodes. The resulting merged motif
is an oscillating motif, because the marking process guarantees that all inputs of composite
nodes are marked; and the merging guarantees that at least two states of oscillating nodes
are marked. In addition, all oscillating virtual nodes in the oscillation are marked, i.e. the
oscillating motif covers all the oscillating states of each oscillating node in the
oscillation.
Therefore, after the reduction of stable motifs, in a reduced network any source SCC
corresponds to at least one oscillating motif, and all nodes in the expanded network are
either part of an oscillating motif or downstream of an oscillating motif.
Remark: It is worth pointing out that complex attractors of a dynamic model depend
on the update scheme. Some complex attractors only exist if a specific update scheme
is imposed (see Appendix B4). Therefore, a timing-independent method like ours is not
able to find candidates of all complex attractors, but only candidates for timing-
independent complex attractors, i.e. complex attractors under asynchronous update. In
the proof of Lemma 3, this is reflected by the condition “Because this pair of virtual
nodes represents oscillating states under a general asynchronous complex attractor, they
must be connected to each other, otherwise they cannot oscillate.” Everything else in
the proof applies for arbitrary update schemes. In addition, the actual oscillation may be
different from the corresponding oscillating motifs, so no exact conclusions can be made
regarding nodes downstream of an oscillating motif.
The following theorem is the main result of this section, and it combines the results
of Lemmas 1, 2, and 3. It shows that for every attractor of the system, our motif-based
method will find a corresponding quasi-attractor in which:
(1) The state of the nodes in 𝑆𝑟𝑒𝑑 is the same as in the attractor
(2) There is at least one oscillating motif that corresponds to the oscillating part of each
complex attractor.
Theorem 1. Conservation of attractors in motif reduction
Let 𝐴 be an attractor of a multi-level dynamical system under general asynchronous
update, and let 𝑆 and 𝑂 be the stabilized and oscillating nodes, respectively.
Let 𝑆𝑟𝑒𝑑 ⊂ 𝑆 be the set of nodes constructed in Lemma 1. Then, there exists a set of
stable motifs such that, by applying network reduction, all the nodes in 𝑆𝑟𝑒𝑑 will
stabilize in their steady state in A, while the rest of the nodes will be part of the final
reduced network. This final reduced network will be such that all nodes in O and S must
be a part of, or downstream of, a set of source SCCs, each of which contains at least
one oscillating motif. Moreover, the oscillating motifs will contain the virtual nodes
corresponding to all the states visited by the oscillating nodes.
Sketch of proof: Using Lemma 2, the network obtained after reducing any stable motif
composed only of the corresponding states of 𝑆𝑟𝑒𝑑 in A will have a new 𝑆𝑟𝑒𝑑
containing only the nodes in the previous 𝑆𝑟𝑒𝑑 that did not stabilize. One can iteratively
plug in the stable motifs until 𝑆𝑟𝑒𝑑 is empty. Because of Lemma 1, there is always a
stable motif as long as 𝑆𝑟𝑒𝑑 is not empty. In the reduction process only nodes in 𝑆𝑟𝑒𝑑
can stabilize. By Lemma 3, the source SCCs in the resulting reduced network contain
oscillating motifs that cover all virtual nodes corresponding to oscillating states of
oscillating nodes.
Finally we list some straightforward corollaries of the theorem that help demonstrate
the properties of attractors.
Corollary 1. If a multi-level dynamic system does not have oscillating motifs in its
expanded network, the system does not have complex attractors.
Corollary 2. If a multi-level system does not have fixed point attractors, it must have at
least one oscillating motif.
Corollary 3. A quasi-attractor can correspond to multiple complex attractors. Examples
in Appendix B4 illustrate this corollary.
B4 Oscillating Motif Examples
Here we illustrate certain properties of oscillating motifs with examples. Because
certain regulatory relationships between nodes are non-monotonic (their sign depends
on the node state), for simplicity we use the same type of arrow for all edges. For better
visualization, we omitted the names of composite nodes in complicated expanded
networks.
1. Timing-dependent complex attractor
Figure B.2 shows an example of a dynamical system with different attractors under
different update schemes.
In synchronous update all nodes are updated simultaneously, thus state transitions are
deterministic. Each state has only one successor (i.e. each node of the state transition
graph has a single outgoing edge). In the state transition graph corresponding to general
asynchronous update, a given state has as many potential state transitions as there are
nodes in the system (because each node has a chance to be updated).
In this example a complex attractor exists for synchronous update, but not for general
asynchronous update. This complex attractor is induced by positive feedback, not
negative feedback, and requires that nodes A and B are updated at exactly the same time.
So it is timing-dependent and will not be preserved under fluctuations in timing. This
type of timing-dependent complex attractor will not be identified by our motif-based
method.
Figure B.2 An example of a timing-dependent complex attractor. (A) The network and
regulatory functions. (B) The state transition graph under synchronous update. Each
node of the state transition graph is a state, given in the order A, B, and each edge is a
state transition allowed by synchronous update. The system has two fixed points, (0,0)
and (1,1). It also has a complex attractor formed by the states (0,1) and (1,0). (C) The
state transition graph under general asynchronous update (i.e. when one node is updated
at a time). Only the two fixed point attractors exist. The synchronous complex attractor
is timing-dependent and does not exist in this update scheme.
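The difference between the two update schemes can be reproduced with a short enumeration. The regulatory functions of Figure B.2 are not restated in the text, so the rules below (A* = B, B* = A) are an assumption chosen to match the described behavior: a two-node positive feedback loop with fixed points (0,0) and (1,1).

```python
from itertools import product

# Assumed rules, consistent with the caption but not taken from the figure:
# A* = B and B* = A.
RULES = (lambda a, b: b, lambda a, b: a)


def sync_successors(state):
    """Synchronous update: all nodes change at once, so there is one successor."""
    return {tuple(rule(*state) for rule in RULES)}


def async_successors(state):
    """General asynchronous update: one successor per node that may be updated."""
    out = set()
    for i, rule in enumerate(RULES):
        nxt = list(state)
        nxt[i] = rule(*state)
        out.add(tuple(nxt))
    return out


def attractors(successors):
    """Terminal strongly connected sets of the state transition graph."""
    states = list(product((0, 1), repeat=2))

    def reachable(start):
        seen, stack = {start}, [start]
        while stack:
            for nxt in successors(stack.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    reach = {s: reachable(s) for s in states}
    return {frozenset(reach[s]) for s in states if all(s in reach[t] for t in reach[s])}


sync_attractors = attractors(sync_successors)
async_attractors = attractors(async_successors)
```

With these assumed rules the synchronous scheme yields the two fixed points plus the two-state cycle between (0,1) and (1,0), while the asynchronous scheme yields only the two fixed points.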
2. The existence of an oscillating motif does not guarantee the existence of a complex attractor
Figure B.3 demonstrates a simple example where the oscillating motif corresponds
to a transient oscillation, which will converge into a fixed point attractor.
Figure B.3 An example of an oscillating motif without a complex attractor. (A) The
network and regulatory functions. (B) The expanded network and motifs. There is a
stable motif formed by A0 and B0, and an oscillating motif made up by A1, A2, B1. (C)
The state transition graph using general asynchronous update. There is only one attractor,
which is a fixed point. The transient oscillation between states (2,1) and (1,1) will
eventually converge into the fixed point.
3. Oscillating nodes can have stabilized downstream nodes
Figure B.4 shows a Boolean example adapted from [30] in (A)(B) and a multi-level
example in (C). In the system of Figure B.4(A), nodes A and B do not visit the state
A=1, B=1 unless starting from there, which causes the stabilization of C=0. Such
situations are expected to be more common in multi-level systems than in Boolean
systems. In the system of Figure B.4(C) the regulator node A has more states than the
regulated node B, thus the oscillation in A does not affect B. This situation is expected
to be observed in biological systems.
Figure B.4 Examples of stabilized nodes downstream of oscillating node(s). (A) A
Boolean example where A and B oscillate but their downstream C is stable under that
oscillation. (B) The general asynchronous state transition graph of nodes A and B. The
state (A=1,B=1) is not visited in the long term, leading to the stabilization of C=0. (C)
A multi-level example where A is oscillating between 1 and 2, leading to B stabilizing
at 1. This example arises because of asymmetry in the nodes’ number of states: A has
three states but B only has two states.
4. Co-existence of a fixed point and a complex attractor
If a dynamical system has input variables (source nodes with sustained states), it can
have a different attractor for different values of the input variables. Here we consider a
dynamical system with a given choice of input variables, or equivalently, no input
variables. Co-existence of a fixed point attractor and a complex attractor for such a
system is possible but rare in Boolean systems. Zañudo et al. [30] referred to this
situation as unstable oscillation. We reproduce their example as Figure B.5.
Notice that the nodes involved in the two attractors share node states, i.e. A is fixed at 1
in the fixed point attractor, but also enters state 1 in the complex attractor. In multi-level
dynamical systems the fixed point and complex attractor do not need to share node states
(see Figure 3.5 and Figure 3.6 in Chapter 2). Thus we expect that coexistence of
(potentially multiple) fixed point(s) and complex attractor(s) is more frequently
observed.
Figure B.5 An example of an unstable oscillation. The system has a fixed point and a
complex attractor. (A) The network and regulatory functions. (B) The expanded network
and motifs. The entire expanded network forms an oscillating motif, containing the
stable motif formed by two nodes A1, B1, and one composite node. (C) The state transition
graph using general asynchronous update. There is a fixed point attractor A=1, B=1, and
a complex attractor. Note that in the complex attractor, although both A and B are
allowed to enter state 1, they cannot be in state 1 simultaneously.
5. One oscillating motif can correspond to multiple attractors
Figure B.5 also illustrates that the same oscillating motif can correspond to multiple
attractors, in this case a complex attractor and a fixed point. In multi-level cases,
multiple complex attractors can also be found within the same oscillating motif. Figure
B.6 shows such an example. Combined with the property that an oscillating motif does
not guarantee a complex attractor, the conclusion is that there is no exact match between
the actual number of complex attractors and the number of quasi-attractors found, i.e.
there may be more actual attractors than quasi-attractors found, and there may be fewer
actual attractors than quasi-attractors found.
Figure B.6 An example of an oscillating motif containing two complex attractors. (A)
The network and regulatory functions. (B) The expanded network and motifs. The entire
expanded network forms an oscillating motif. (C) The state transition graph. For
simplicity self-loops representing self-transitions are not shown in the graph. There are
two complex attractors, the first attractor is B=0, A=0 or 1, and the second attractor is
B=1, A =2 or 3.
B5 Generation of regulatory functions in synthetic networks
Here we describe how we randomly generated regulatory functions among those
consistent with the number of regulators and number of states for each node.
In the network generation part, each node’s regulators are generated. In the
benchmarks, we generated networks where each node has two input nodes. For each
target node, we assign to each combination of different states of the regulator nodes a
randomly selected state of the target node. For example, if Boolean target node A is
regulated by Boolean nodes B and C, each of the four state combinations of B and C
will be randomly assigned to either the function of A0 or A1. Different input
combinations assigned to the same target state will be separated by an ‘or’ operator. For
example, if combinations B0 C0 and B1 C0 are assigned to A0, then the function of A0 is
$f_A^{(0)}$ = (B0 and C0) or (B1 and C0). If at the end of the assignment a target state did
not get any assigned combination, the function is ineffective, and we discard all the
functions of this target node and start over to generate a new set of functions.
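The generation procedure described above can be sketched as a rejection loop. The function names and the data layout (a dict from target level to the list of regulator state combinations assigned to it) are assumptions for illustration. The sketch does not check that every regulator is effective and does not reduce the result to Blake canonical form.

```python
import random
from itertools import product


def random_functions(target_levels, regulator_levels, rng):
    """Assign every regulator state combination to a random target level.

    Retries from scratch whenever some target level receives no combination.
    """
    combos = list(product(*(range(n) for n in regulator_levels)))
    if target_levels > len(combos):
        raise ValueError("more target levels than input combinations")
    while True:
        functions = {level: [] for level in range(target_levels)}
        for combo in combos:
            functions[rng.randrange(target_levels)].append(combo)
        if all(functions.values()):   # every target level is used at least once
            return functions


def to_dnf(combos, names):
    """Write the combinations of one target level as an 'or' of 'and' clauses."""
    clauses = (
        "(" + " and ".join(f"{name}{value}" for name, value in zip(names, combo)) + ")"
        for combo in combos
    )
    return " or ".join(clauses)


rng = random.Random(0)   # fixed seed only to make the sketch repeatable
functions = random_functions(2, (2, 2), rng)   # Boolean A regulated by Boolean B, C
example = to_dnf([(0, 0), (1, 0)], ("B", "C"))
```

The string in `example` reproduces the function of A0 given in the text for the case where B0 C0 and B1 C0 are assigned to A0.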
Appendix C Modeling ABA and CO2 crosstalk in inducing stomatal closure
C1 Node name, abbreviation and regulatory rule for each node
Node name in the network      Full name
ABA                           Abscisic acid
ABI1                          ABA (abscisic acid)-insensitive 1
ABI2                          ABA (abscisic acid)-insensitive 2
CA1/4                         β-Carbonic anhydrase 1 and 4
[Ca2+]c                       Cytosolic calcium
Ca2+ ATPase                   Ca2+ ATPases and Ca2+/H+ antiporters responsible for Ca2+ efflux from the cytosol
CaIM                          Ca2+ influx across the plasma membrane
CIS                           Ca2+ influx to the cytosol from intracellular stores
Closure                       Stomatal closure
CO2                           Carbon dioxide
Depolarization                Plasma membrane depolarization
GHR1                          Guard cell hydrogen peroxide resistant 1
GPA1                          Heterotrimeric G protein α subunit 1
H2O Efflux                    Water efflux through the plasma membrane
HT1                           Protein kinase HIGH LEAF TEMPERATURE1
Microtubule depolymerization  Microtubule depolymerization
NO                            Nitric oxide
OST1                          Protein kinase OPEN STOMATA 1
PA                            Phosphatidic acid
pHc                           Increase of the cytosolic pH level
Aquaporin PIP2;1              Plasma membrane intrinsic protein 2;1
PLC                           Phospholipase C
PLDα                          Phospholipase D α1
PP2Cs                         Represent the collection of PP2Cs, including PP2CA (Protein Phosphatase 2CA), HAB1 (Hypersensitive to ABA 1), and ABI2 (ABA-insensitive 2)
AtrbohD/F                     NADPH oxidases AtRBOH D and F
RCARs                         Regulatory Components of ABA Receptor
RHC1                          MATE-type transporter RESISTANT TO HIGH CO2
ROS                           Reactive oxygen species
SLAC1, Anion efflux           Slow Anion Channel-associated 1 and Anion efflux merged into one node
Next we provide the regulatory functions of all nodes in the model. There are two input
nodes, ABA and CO2, which do not require regulatory functions. The regulatory
functions of the rest of the nodes are shown below. “*” is a notation to indicate that the
regulatory function takes effect on the next time step for the node. Two models, as shown
in Figure 4.3, are denoted as Model A and Model B.
Regulatory functions for Model A:
RCARs * = ABA
GPA1 * = ABA OR PA
PLDa * = GPA1 AND Cac
ABI1 * = not PA AND not RCARs AND not ROS AND pHc
PP2Cs * = not RCARs AND not ROS AND not XLG
PA * = NO OR ROS OR PLDa OR PLC
OST1 * = not PP2Cs OR not HT1
pHc * = (OST1 AND not ABI1 AND not PP2Cs) OR Cac
AtRbohDF * = not ABI1 AND OST1 AND pHc AND PA AND (GPA1 OR XLG)
ROS * = AtRbohDF
GHR1 * = not PP2Cs AND ROS
NO * = ROS
CIS * = ROS OR PLC
Cac * = (CIS OR CaIM) AND not CaATPase
CaATPase * = Cac
CaIM * = GHR1 OR (not ABI1 AND (ABA OR OST1))
PLC * = Cac
XLG * = CA14 AND RHC1 OR CaIM
CA14 * = high_CO2
RHC1 * = CA14
HT1 * = not RHC1 OR not XLG
SLAC1_AnionCh * = OST1 AND not ABI1 AND (PP2Cs AND GHR1 OR Cac)
membrane_depolarization * = SLAC1_AnionCh OR Cac
Aquaporin * = OST1 OR CA14
K_Efflux * = membrane_depolarization
closure * = SLAC1_AnionCh AND K_Efflux AND Aquaporin
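As an illustration of the “*” notation, the following minimal Python sketch parses two of the functions listed above (Cac and CaATPase) and applies them with a synchronous update, in which every node reads the state at time t and switches at time t+1. The synchronous scheme, the helper names parse_rules and synchronous_step, and holding CIS and CaIM fixed are assumptions made only for this example; they are not taken from the simulation protocol used for the model.

```python
import re

# Two rules copied from the Model A listing. CIS and CaIM are held fixed here
# purely for illustration (in the full model they have their own rules).
RULES_TEXT = """
Cac * = (CIS OR CaIM) AND not CaATPase
CaATPase * = Cac
"""


def parse_rules(text):
    """Turn 'X * = expr' lines into {node: Python expression string}."""
    rules = {}
    for line in text.strip().splitlines():
        target, expr = line.split("* =", 1)
        # Lower-case AND/OR so that Python can evaluate the expression.
        expr = re.sub(r"\b(AND|OR)\b", lambda m: m.group(0).lower(), expr)
        rules[target.strip()] = expr.strip()
    return rules


def synchronous_step(rules, state):
    """All regulated nodes read the current state and switch together."""
    new_state = dict(state)
    for node, expr in rules.items():
        new_state[node] = bool(eval(expr, {"__builtins__": {}}, state))
    return new_state


rules = parse_rules(RULES_TEXT)
state = {"CIS": True, "CaIM": False, "Cac": False, "CaATPase": False}

trajectory = []
for _ in range(5):
    trajectory.append((int(state["Cac"]), int(state["CaATPase"])))
    state = synchronous_step(rules, state)

print(trajectory)  # [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

Under these assumptions the pair (Cac, CaATPase) cycles with period 4, which illustrates how the negative feedback between cytosolic calcium and the Ca2+-ATPase can sustain an oscillation.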
Regulatory functions for Model B:
RCARs * = ABA
GPA1 * = ABA OR PA
PLDa * = GPA1 AND Cac
ABI1 * = not PA AND not RCARs AND not ROS AND pHc
PP2Cs * = not RCARs AND not ROS AND not XLG
PA * = NO OR ROS OR PLDa OR PLC
OST1 * = not PP2Cs OR (not HT1 AND XLG)
pHc * = (OST1 AND not ABI1 AND not PP2Cs) OR Cac
AtRbohDF * = not ABI1 AND OST1 AND pHc AND PA AND (GPA1 OR XLG)
ROS * = AtRbohDF
GHR1 * = not PP2Cs AND ROS
NO * = ROS
CIS * = ROS OR PLC
Cac * = (CIS OR CaIM) AND not CaATPase
CaATPase * = Cac
CaIM * = GHR1 OR (not ABI1 AND (ABA OR OST1))
PLC * = Cac
XLG * = not HT1 OR CaIM
CA14 * = high_CO2
RHC1 * = CA14
HT1 * = not RHC1
SLAC1_AnionCh * = OST1 AND not ABI1 AND (PP2Cs AND GHR1 OR Cac)
membrane_depolarization * = SLAC1_AnionCh OR Cac
Aquaporin * = OST1 OR CA14
K_Efflux * = membrane_depolarization
closure * = SLAC1_AnionCh AND K_Efflux AND Aquaporin
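Models A and B share all regulatory functions except those of OST1, XLG, and HT1. The Python sketch below enumerates the input combinations on which the OST1 and HT1 functions of the two models disagree. The XLG functions are not compared this way because they depend on different regulators (CA14 and RHC1 in Model A, HT1 in Model B). All function names are hypothetical and chosen for this illustration.

```python
from itertools import product


# OST1 and HT1 rules as listed for Model A and Model B.
def ost1_a(PP2Cs, HT1, XLG):
    return (not PP2Cs) or (not HT1)


def ost1_b(PP2Cs, HT1, XLG):
    return (not PP2Cs) or ((not HT1) and XLG)


def ht1_a(RHC1, XLG):
    return (not RHC1) or (not XLG)


def ht1_b(RHC1, XLG):
    return not RHC1


def disagreements(f, g, n_inputs):
    """Return every 0/1 input combination on which f and g differ."""
    return [
        bits
        for bits in product([0, 1], repeat=n_inputs)
        if bool(f(*bits)) != bool(g(*bits))
    ]


ost1_diff = disagreements(ost1_a, ost1_b, 3)  # inputs: (PP2Cs, HT1, XLG)
ht1_diff = disagreements(ht1_a, ht1_b, 2)     # inputs: (RHC1, XLG)

print(ost1_diff)  # [(1, 0, 0)]
print(ht1_diff)   # [(1, 0)]
```

The OST1 functions differ only when PP2Cs is ON and both HT1 and XLG are OFF (Model A activates OST1, Model B does not). The HT1 functions differ only when RHC1 is ON and XLG is OFF (HT1 is ON in Model A and OFF in Model B).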
C2 Systematic single node intervention of the crosstalk model
Here we provide the simulated closure values in response to CO2 and external calcium
signaling for the two models presented in Figure 4.3, under systematic single-node
interventions (knockouts are denoted “=0” and constitutive activations “=1”). The models
are referred to as Model A and Model B, respectively, as shown in the figure. A notation such as
“~0.05 (osc.)” indicates an oscillating closure value with an average of around 0.05. All
closure values are taken after 50 time steps and averaged over 20 simulations⁷. Before
comparison, the values should be grouped into three categories: “1”, “0”, and values in
between. For example, ‘0.2’ and ‘0.6’ are considered similar because both fall in the
category “between 0 and 1”. The reason for comparing after categorization is that during
an oscillation the closure value can fluctuate over a considerable range, so values
strictly between 0 and 1 cannot be reliably distinguished from one another.
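The protocol described above can be sketched in Python as follows for Model A under the CO2 signal. Several choices here are assumptions rather than the settings used to produce the tables: a random-order asynchronous update in which every non-fixed node is updated once per time step, an initial condition with all regulated nodes OFF, ABA held OFF while high_CO2 is held ON, and the helper names parse_rules, final_closure, mean_closure, and category. The external calcium signal is not sketched because its implementation is not specified in this appendix.

```python
import random
import re

MODEL_A = """
RCARs * = ABA
GPA1 * = ABA OR PA
PLDa * = GPA1 AND Cac
ABI1 * = not PA AND not RCARs AND not ROS AND pHc
PP2Cs * = not RCARs AND not ROS AND not XLG
PA * = NO OR ROS OR PLDa OR PLC
OST1 * = not PP2Cs OR not HT1
pHc * = (OST1 AND not ABI1 AND not PP2Cs) OR Cac
AtRbohDF * = not ABI1 AND OST1 AND pHc AND PA AND (GPA1 OR XLG)
ROS * = AtRbohDF
GHR1 * = not PP2Cs AND ROS
NO * = ROS
CIS * = ROS OR PLC
Cac * = (CIS OR CaIM) AND not CaATPase
CaATPase * = Cac
CaIM * = GHR1 OR (not ABI1 AND (ABA OR OST1))
PLC * = Cac
XLG * = CA14 AND RHC1 OR CaIM
CA14 * = high_CO2
RHC1 * = CA14
HT1 * = not RHC1 OR not XLG
SLAC1_AnionCh * = OST1 AND not ABI1 AND (PP2Cs AND GHR1 OR Cac)
membrane_depolarization * = SLAC1_AnionCh OR Cac
Aquaporin * = OST1 OR CA14
K_Efflux * = membrane_depolarization
closure * = SLAC1_AnionCh AND K_Efflux AND Aquaporin
"""


def parse_rules(text):
    """Compile each 'X * = expr' line into a Python expression."""
    rules = {}
    for line in text.strip().splitlines():
        target, expr = line.split("* =", 1)
        expr = re.sub(r"\b(AND|OR)\b", lambda m: m.group(0).lower(), expr)
        rules[target.strip()] = compile(expr.strip(), "<rule>", "eval")
    return rules


def final_closure(rules, inputs, fixed, steps, rng):
    """Run one simulation and return closure (0 or 1) after `steps` steps."""
    state = {node: False for node in rules}  # assumed initial condition
    state.update(inputs)
    state.update(fixed)                      # knockouts (=0) / activations (=1)
    free_nodes = [node for node in rules if node not in fixed]
    for _ in range(steps):
        rng.shuffle(free_nodes)              # assumed random update order
        for node in free_nodes:
            state[node] = bool(eval(rules[node], {"__builtins__": {}}, state))
    return int(state["closure"])


def mean_closure(rules, inputs, fixed, runs=20, steps=50, seed=0):
    """Average the final closure value over `runs` independent simulations."""
    rng = random.Random(seed)
    total = sum(final_closure(rules, inputs, fixed, steps, rng) for _ in range(runs))
    return total / runs


def category(value):
    """Group a closure value into the three categories used for comparison."""
    if value == 0:
        return "0"
    if value == 1:
        return "1"
    return "between 0 and 1"


rules = parse_rules(MODEL_A)
CO2_INPUTS = {"ABA": False, "high_CO2": True}  # assumed input setting

ost1_ko = mean_closure(rules, CO2_INPUTS, {"OST1": False})
print(ost1_ko, category(ost1_ko))
```

Under these assumptions the OST1 knockout yields a closure value of 0.0, because a knocked-out OST1 keeps SLAC1_AnionCh, and hence closure, OFF. Likewise, category(0.2) and category(0.6) both return "between 0 and 1", so the two values are treated as similar.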
Model A. XLG-|HT1, RHC1->XLG
Intervention CO2 response External Calcium response
wildtype 1 1
[ABA =0] 1 1
[ABA =1] 1 1
[RCARs =0] 1 1
[RCARs =1] 1 1
[GPA1 =0] 1 1
[GPA1 =1] 1 1
[PLDa =0] 1 1
[PLDa =1] 1 1
[ABI1 =0] 1 1
[ABI1 =1] 0 0
[PP2Cs =0] 1 1
[PP2Cs =1] 0.2 0.6
[PA =0] ~0.05 (osc.) 0.1
[PA =1] 1 1
[OST1 =0] 0 0
[OST1 =1] 1 1
[pHc =0] 0.25 0.45
[pHc =1] 0 1
[AtRbohDF =0] 0.15 0.4
[AtRbohDF =1] 1 1
[ROS =0] 0.25 0.15
[ROS =1] 1 1
7 These are preliminary data, so the number of simulations is small. Note that 20 simulations are nevertheless sufficient to capture the response categories.
[GHR1 =0] 0.2 0.3
[GHR1 =1] 1 1
[NO =0] 1 1
[NO =1] 1 1
[CIS =0] 1 1
[CIS =1] 1 1
[Cac =0] 0 0
[Cac =1] 1 1
[CaATPase =0] 1 1
[CaATPase =1] 0 0.05 (no osc.)
[CaIM =0] 0 0
[CaIM =1] 1 1
[high_CO2 =0] 0.05 1
[high_CO2 =1] 1 1
[XLG =0] 0 0
[XLG =1] 1 1
[CA14 =0] 0 1
[CA14 =1] 1 1
[RHC1 =0] 0 1
[RHC1 =1] 1 1
[HT1 =0] 1 1
[HT1 =1] 1 1
[SLAC1_AnionCh =0] 0 0
[SLAC1_AnionCh =1] 1 1
[membrane_depolarization =0] 0 0
[membrane_depolarization =1] 1 1
[Aquaporin =0] 0 0
[Aquaporin =1] 1 1
[K_Efflux =0] 0 0
[K_Efflux =1] 1 1
[closure =0] 0 0
[closure =1] 1 1
Model B. HT1-|XLG, XLG->OST1
Intervention CO2 response External Calcium response
wildtype 1 1
[ABA =0] 1 1
[ABA =1] 1 1
[RCARs =0] 1 1
[RCARs =1] 1 1
[GPA1 =0] 1 1
[GPA1 =1] 1 1
[PLDa =0] 1 1
[PLDa =1] 1 1
[ABI1 =0] 1 1
[ABI1 =1] 0 0
[PP2Cs =0] 1 1
[PP2Cs =1] 0.45 0.25
[PA =0] ~0.05 (osc.) 0.2
[PA =1] 1 1
[OST1 =0] 0 0
[OST1 =1] 1 1
[pHc =0] 0.1 0.5
[pHc =1] 0 1
[AtRbohDF =0] 0.15 0.2
[AtRbohDF =1] 1 1
[ROS =0] 0.1 0.3
[ROS =1] 1 1
[GHR1 =0] 0.6 0.4
[GHR1 =1] 1 1
[NO =0] 1 1
[NO =1] 1 1
[CIS =0] 1 1
[CIS =1] 1 1
[Cac =0] 0 0
[Cac =1] 1 1
[CaATPase =0] 1 1
[CaATPase =1] 0 0.05 (no osc.)
[CaIM =0] 0 0
[CaIM =1] 1 1
[high_CO2 =0] 0.25 1
[high_CO2 =1] 1 1
[XLG =0] 0 0
[XLG =1] 1 1
[CA14 =0] 0 1
[CA14 =1] 1 1
[RHC1 =0] 0 1
[RHC1 =1] 1 1
[HT1 =0] 1 1
[HT1 =1] 0 1
[SLAC1_AnionCh =0] 0 0
[SLAC1_AnionCh =1] 1 1
[membrane_depolarization =0] 0 0
[membrane_depolarization =1] 1 1
[Aquaporin =0] 0 0
[Aquaporin =1] 1 1
[K_Efflux =0] 0 0
[K_Efflux =1] 1 1
[closure =0] 0 0
[closure =1] 1 1
C3 Selected triple intervention of the crosstalk model
The following tables list selected triple interventions under external calcium and CO2
signaling in the crosstalk model. Due to limited space, abbreviated notations are
used; e.g., “ROS restore CaIM” means “ROS ON can restore the reduced closure caused by
CaIM KO”.
These tables should be compared with the single-treatment (double-intervention) results in
the main text. Coloring convention: a slot is marked orange if the double treatment differs
significantly from any of its single treatments.
External calcium signaling (columns are double treatments)
Mutant       ROS=1, PLC=0   ROS=1, PLC=1   ROS=1, pHc=0   PLC=0, pHc=0   PLC=1, pHc=0
[]           1              1              1              0.26           0.24
GPA1=0       1              1              1              0.34           0.38
XLG=0        1              1              1              0              0
OST1=0       0              0              0              0              0
ABI1=0       1              1              1              0.38           0.32
AtRbohDF=0   1              1              1              0.32           0.34
GHR1=0       0.48           0.52           0.38           0.36           0.46
CA14=0       1              1              1              0.22           0.12
RHC1=0       1              1              1              0.3            0.48
HT1=0        1              1              1              0.36           0.24
Column comments: ROS ON can revert PLC KO cases; ROS ON can revert pHc KO; pHc KO screens PLC ON
CO2 signaling (columns are double treatments)
Mutant       ROS=1, PLC=0   ROS=1, PLC=1   ROS=1, pHc=0   PLC=0, pHc=0   PLC=1, pHc=0   ROS=1, CaIM=0  CaIM=0, PLC=0  CaIM=0, PLC=1  CaIM=0, pHc=0
[]           1              1              1              0.28           0.44           1              0              1              0
GPA1=0       1              1              1              0.32           0.32           1              0              1              0
XLG=0        1              1              1              0              0              1              0              0              0
OST1=0       0              0              0              0              0              0              0              0              0
ABI1=0       1              1              1              0.26           0.28           1              0              1              0
AtRbohDF=0   1              1              1              0.44           0.32           1              0              0.4            0
GHR1=0       0.34           0.4            0.46           0.4            0.38           0.42           0              0.2            0
CA14=0       1              1              1              0              0              1              0              0              0
RHC1=0       1              1              1              0              0              1              0              0              0
HT1=0        1              1              1              0.4            0.38           1              0              1              0
Column comments: ROS restore CaIM; PLC restore CaIM
The only apparent difference between the external calcium case and the CO2 case lies in the
CA14 and RHC1 mutants.
128
References
1. Sun, Z., et al., Multi-level modeling of light-induced stomatal opening offers new insights into
its regulation by drought. PLoS Comput Biol, 2014. 10(11): p. e1003930.
2. Gan, X. and R. Albert, Analysis of a dynamic model of guard cell signaling reveals the stability
of signal propagation. BMC Systems Biology, 2016. 10(1): p. 78.
3. Thomas, R. and European Molecular Biology Organization., Kinetic logic : a Boolean approach
to the analysis of complex regulatory systems : proceedings of the EMBO course "Formal
analysis of genetic regulation," held in Brussels, September 6-16, 1977. Lecture notes in
biomathematics. 1979, Berlin ; New York: Springer-Verlag. xiii, 507 p.
4. Albert, R., et al., A new discrete dynamic model of ABA-induced stomatal closure predicts key
feedback loops. PLOS Biology, 2017. 15(9): p. e2003451.
5. Shannon, P., et al., Cytoscape: A Software Environment for Integrated Models of Biomolecular
Interaction Networks. Genome Research, 2003. 13(11): p. 2498-2504.
6. Hagberg, A.A., D.A. Schult, and P.J. Swart. Exploring network structure, dynamics, and function
using NetworkX. in Proceedings of the 7th Python in Science Conference (SciPy2008). 2008.
Pasadena, CA USA.
7. Morris, M.K., et al., Logic-based models for the analysis of cell signaling networks. Biochemistry,
2010. 49(15): p. 3216-24.
8. Abou-Jaoudé, W., et al., Logical Modeling and Dynamical Analysis of Cellular Networks.
Frontiers in Genetics, 2016. 7(94).
9. Wynn, M.L., et al., Logic-based models in systems biology: a predictive and parameter-free
network analysis method. Integr Biol (Camb), 2012. 4(11): p. 1323-37.
10. Laubenbacher, R., et al., Algebraic Models and Their Use in Systems Biology, in Discrete and
Topological Models in Molecular Biology, N. Jonoska and M. Saito, Editors. 2014, Springer Berlin
Heidelberg: Berlin, Heidelberg. p. 443-474.
11. Wang, R.S., A. Saadatpour, and R. Albert, Boolean modeling in systems biology: an overview of
methodology and applications. Phys Biol, 2012. 9.
12. Veliz-Cuba, A., A.S. Jarrah, and R. Laubenbacher, Polynomial algebra of discrete models in
systems biology. Bioinformatics, 2010. 26(13): p. 1637-43.
13. Saadatpour, A., R. Albert, and T.C. Reluga, A Reduction Method for Boolean Network Models
Proven to Conserve Attractors. SIAM J. Appl. Dyn. Syst., 2013. 12.
14. Naldi, A., et al., Dynamically consistent reduction of logical regulatory graphs. Theoretical
Computer Science, 2011. 412(21): p. 2207-2218.
15. Naldi, A., et al., Cooperative development of logical modelling standards and tools with
CoLoMoTo. Bioinformatics, 2015. 31(7): p. 1154-1159.
16. Chaouiya, C., et al., SBML qualitative models: a model representation format and infrastructure
to foster interactions between qualitative modelling formalisms and tools. BMC Syst Biol, 2013.
7: p. 135.
17. Helikar, T., et al., The Cell Collective: Toward an open and collaborative approach to systems
biology. BMC Systems Biology, 2012. 6(1): p. 96.
18. Chaouiya, C., A. Naldi, and D. Thieffry, Logical modelling of gene regulatory networks with
GINsim. Methods Mol Biol, 2012. 804: p. 463-79.
129
19. Albert, I., et al., Boolean network simulations for life scientists. Source Code for Biology and
Medicine, 2008. 3(1): p. 16.
20. Zheng, J., et al., SimBoolNet—a Cytoscape plugin for dynamic simulation of signaling networks.
Bioinformatics, 2010. 26(1): p. 141-142.
21. Müssel, C., M. Hopfensitz, and H.A. Kestler, BoolNet—an R package for generation,
reconstruction and analysis of Boolean networks. Bioinformatics, 2010. 26(10): p. 1378-1380.
22. Zhang, R., et al., Network model of survival signaling in large granular lymphocyte leukemia.
Proc Natl Acad Sci U S A, 2008. 105(42): p. 16308-13.
23. Saadatpour, A., et al., Dynamical and Structural Analysis of a T Cell Survival Network Identifies
Novel Candidate Therapeutic Targets for Large Granular Lymphocyte Leukemia. PLOS
Computational Biology, 2011. 7(11): p. e1002267.
24. Steinway, S.N., et al., Network modeling of TGFbeta signaling in hepatocellular carcinoma
epithelial-to-mesenchymal transition reveals joint sonic hedgehog and Wnt pathway activation.
Cancer Res, 2014. 74(21): p. 5963-77.
25. Steinway, S.N., et al., Combinatorial interventions inhibit TGFβ-driven epithelial-to-
mesenchymal transition and support hybrid cellular phenotypes. 2015. 1: p. 15014.
26. Remy, E., P. Ruet, and D. Thieffry, Graphic requirements for multistability and attractive cycles
in a Boolean dynamical framework. Advances in Applied Mathematics, 2008. 41(3): p. 335-350.
27. Richard, A. and J.-P. Comet, Necessary conditions for multistationarity in discrete dynamical
systems. Discrete Applied Mathematics, 2007. 155(18): p. 2403-2413.
28. Richard, A., Negative circuits and sustained oscillations in asynchronous automata networks.
Advances in Applied Mathematics, 2010. 44(4): p. 378-392.
29. Wang, R.-S. and R. Albert, Elementary signaling modes predict the essentiality of signal
transduction network components. BMC Systems Biology, 2011. 5(1): p. 44.
30. Zanudo, J.G. and R. Albert, An effective network reduction approach to find the dynamical
repertoire of discrete dynamic networks. Chaos, 2013. 23.
31. Zanudo, J.G. and R. Albert, Cell fate reprogramming by control of intracellular network dynamics.
PLoS Comput Biol, 2015. 11(4): p. e1004193.
32. Sun, Z. and R. Albert, Node-independent elementary signaling modes: A measure of redundancy
in Boolean signaling transduction networks. Network Science, 2016. 4(3): p. 273-292.
33. Gan, X. and R. Albert, General method to find the attractors of discrete dynamic models of
biological systems. Phys Rev E, 2018. 97(4-1): p. 042308.
34. Maheshwari, P. and R. Albert, A framework to find the logic backbone of a biological network.
BMC Systems Biology, 2017. 11(1): p. 122.
35. Stigler, B. and H.M. Chamberlin, A regulatory network modeled from wild-type gene expression
data guides functional predictions in Caenorhabditis elegans development. BMC Syst Biol, 2012.
6.
36. Chifman, J., et al., The core control system of intracellular iron homeostasis: a mathematical
model. J Theor Biol, 2012. 300: p. 91-9.
37. Massague, J., TGF-beta signal transduction. Annu Rev Biochem, 1998. 67.
38. Xu, H.L., et al., Construction and Validation of a Regulatory Network for Pluripotency and Self-
Renewal of Mouse Embryonic Stem Cells. Plos Computational Biology, 2014. 10(8).
39. Kestler, H.A., et al., Network modeling of signal transduction: establishing the global view.
Bioessays, 2008. 30(11-12): p. 1110-25.
130
40. Tyson, J.J., K. Chen, and B. Novak, Network dynamics and cell physiology. Nature Reviews
Molecular Cell Biology, 2001. 2(12): p. 908-916.
41. Kauffman, S.A., Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor
Biol, 1969. 22.
42. Glass, L. and S.A. Kauffman, Logical analysis of continuous, nonlinear biochemical control
networks. J Theor Biol, 1973. 39.
43. Miskov-Zivanov, N., et al., The duration of T cell stimulation is a critical determinant of cell fate
and plasticity. Sci Signal, 2013. 6(300): p. ra97.
44. Deritei, D., et al., Principles of dynamical modularity in biological regulatory networks. Sci Rep,
2016. 6: p. 21957.
45. Murrugarra, D. and R. Laubenbacher, Regulatory patterns in molecular interaction networks. J
Theor Biol, 2011. 288.
46. Li, S., S.M. Assmann, and R. Albert, Predicting essential components of signal transduction
networks: a dynamic model of guard cell abscisic acid signaling. PLoS Biol, 2006. 4(10): p. e312.
47. Schroeder, J.I., et al., Guard Cell Signal Transduction. Annu Rev Plant Physiol Plant Mol Biol,
2001. 52: p. 627-658.
48. Shimazaki, K., et al., Light regulation of stomatal movement. Annu Rev Plant Biol, 2007. 58: p.
219-47.
49. Assmann, S.M., Enhancement of the Stomatal Response to Blue Light by Red Light, Reduced
Intercellular Concentrations of CO(2), and Low Vapor Pressure Differences. Plant Physiol, 1988.
87.
50. Bergmann, D.C. and F.D. Sack, Stomatal development. Annu Rev Plant Biol, 2007. 58.
51. MacArthur, B.D., A. Ma'ayan, and I.R. Lemischka, Systems biology of stem cell fate and cellular
reprogramming. Nature Reviews Molecular Cell Biology, 2009. 10(10): p. 672-681.
52. Ansotegui, C. and F. Manya, Mapping problems with finite-domain variables to problems with
Boolean variables. Theory Appl of Satisfiability Test, 2005. 3542.
53. Van Ham, P., How to deal with more than two levels, in Kinetic logic : a Boolean approach to the
analysis of complex regulatory systems : proceedings of the EMBO course "Formal analysis of
genetic regulation," held in Brussels, September 6-16, 1977, R. Thomas, Editor. 1979, Springer-
Verlag: Berlin ; New York. p. 326-344.
54. Didier, G., E. Remy, and C. Chaouiya, Mapping multivalued onto Boolean dynamics. J Theor Biol,
2011. 270.
55. Karlsson, P.E., Blue light regulation of stomata in wheat seedlings. I. Influence of red background
illumination and initial conductance level. Physiologia Plantarum, 1986. 66: p. 5.
56. Veliz-Cuba, A., et al., Steady state analysis of Boolean molecular network models via model
reduction and computational algebra. BMC Bioinformatics, 2014. 15(1): p. 221.
57. Remy, E. and P. Ruet, On differentiation and homeostatic behaviours of Boolean dynamical
systems. Lecture Notes in Bioinformatics, 2007. 4780: p. 92-101.
58. Liu, Y.-Y., J.-J. Slotine, and A.-L. Barabasi, Controllability of complex networks. Nature, 2011.
473(7346): p. 167-173.
59. Lin, C.T., STRUCTURAL CONTROLLABILITY. Ieee Transactions on Automatic Control, 1974.
AC19(3): p. 201-208.
60. Mochizuki, A., et al., Dynamics and control at feedback vertex sets. II: A faithful monitor to
determine the diversity of molecular activities in regulatory networks. Journal of Theoretical
131
Biology, 2013. 335: p. 130-146.
61. Schwartz, A., et al., Anion-Channel Blockers Inhibit S-Type Anion Channels and Abscisic Acid
Responses in Guard Cells. Plant Physiol, 1995. 109(2): p. 651-658.
62. Kim, T.H., et al., Guard cell signal transduction network: advances in understanding abscisic acid,
CO2, and Ca2+ signaling. Annu Rev Plant Biol, 2010. 61: p. 561-91.
63. Arenas, A., et al., Synchronization in complex networks. Physics Reports, 2008. 469(3): p. 93-
153.
64. Tian, X.-J., et al., Achieving diverse and monoallelic olfactory receptor selection through dual-
objective optimization design. Proceedings of the National Academy of Sciences, 2016. 113(21):
p. E2889-E2898.
65. Barabasi, A.-L. and Z.N. Oltvai, Network biology: understanding the cell's functional
organization. Nat Rev Genet, 2004. 5(2): p. 101-113.
66. Albert, R. and R.S. Wang, DISCRETE DYNAMIC MODELING OF CELLULAR SIGNALING NETWORKS,
in Methods in Enzymology: Computer Methods, Part B, M.L. Johnson and L. Brand, Editors. 2009,
Elsevier Academic Press Inc: San Diego. p. 281-306.
67. Pennisi, M., et al., A methodological approach for using high-level Petri Nets to model the
immune system response. BMC Bioinformatics, 2016. 17(19): p. 498.
68. Butchy, A.A. and N. Miskov-Zivanov, Discrete modeling of macrophage differentiation. The
Journal of Immunology, 2017. 198(1 Supplement): p. 67.13-67.13.
69. Albert, R. and J. Thakar, Boolean modeling: a logic-based dynamic approach for understanding
signaling and regulatory networks and for making useful predictions. Wiley Interdiscip Rev Syst
Biol Med, 2014. 6(5): p. 353-69.
70. Li, F., et al., The yeast cell-cycle network is robustly designed. Proceedings of the National
Academy of Sciences of the United States of America, 2004. 101(14): p. 4781-4786.
71. Abou-Jaoude, W., et al., Logical Modeling and Dynamical Analysis of Cellular Networks. Front
Genet, 2016. 7: p. 94.
72. Havlin, S., et al., Challenges in network science: Applications to infrastructures, climate, social
systems and economics. The European Physical Journal Special Topics, 2012. 214(1): p. 273-293.
73. Onnela, J.-P., et al., Structure and tie strengths in mobile communication networks. Proceedings
of the National Academy of Sciences, 2007. 104(18): p. 7332-7336.
74. Federico, B., P. Matjaž, and L. Vito, Determinants of public cooperation in multiplex networks.
New Journal of Physics, 2017. 19(7): p. 073017.
75. Lancichinetti, A., S. Fortunato, and J. Kertész, Detecting the overlapping and hierarchical
community structure in complex networks. New Journal of Physics, 2009. 11(3): p. 033015.
76. Mori, F. and A. Mochizuki, Expected Number of Fixed Points in Boolean Networks with Arbitrary
Topology. Physical Review Letters, 2017. 119(2): p. 028301.
77. Klarner, H., A. Bockmayr, and H. Siebert, Computing maximal and minimal trap spaces of
Boolean networks. Natural Computing, 2015. 14(4): p. 535-544.
78. Garg, A., et al., Synchronous versus asynchronous modeling of gene regulatory networks.
Bioinformatics, 2008. 24(17): p. 1917-1925.
79. Naldi, A., D. Thieffry, and C. Chaouiya, Decision Diagrams for the Representation and Analysis
of Logical Models of Genetic Networks, in Computational Methods in Systems Biology:
International Conference CMSB 2007, Edinburgh, Scotland, September 20-21, 2007.
Proceedings, M. Calder and S. Gilmore, Editors. 2007, Springer Berlin Heidelberg: Berlin,
132
Heidelberg. p. 233-247.
80. Traynard, P., et al., Logical model specification aided by model-checking techniques: application
to the mammalian cell cycle regulation. Bioinformatics, 2016. 32(17): p. i772-i780.
81. Gómez Tejeda Zañudo, J., M. Scaltriti, and R. Albert, A network modeling approach to elucidate
drug resistance mechanisms and predict combinatorial drug treatments in breast cancer.
Cancer Convergence, 2017. 1(1): p. 5.
82. Chifman, J., et al., Activated Oncogenic Pathway Modifies Iron Network in Breast Epithelial Cells:
A Dynamic Modeling Perspective. PLOS Computational Biology, 2017. 13(2): p. e1005352.
83. Dubrova, E., M. Liu, and M. Teslenko, Finding Attractors in Synchronous Multiple-Valued
Networks Using SAT-based Bounded Model Checking. Journal of Multiple-Valued Logic and Soft
Computing, 2012. 19(1-3): p. 109-131.
84. Hinkelmann, F., et al., ADAM: Analysis of Discrete Models of Biological Systems Using Computer
Algebra. BMC Bioinformatics, 2011. 12(1): p. 295.
85. Puniya, B.L., et al., Systems Perturbation Analysis of a Large-Scale Signal Transduction Model
Reveals Potentially Influential Candidates for Cancer Therapeutics. Frontiers in Bioengineering
and Biotechnology, 2016. 4: p. 10.
86. Cheng, X., M. Sun, and J.E.S. Socolar, Autonomous Boolean modelling of developmental gene
regulatory networks. Journal of the Royal Society Interface, 2013. 10(78): p. 20120574.
87. Murrugarra, D., et al., Modeling stochasticity and variability in gene regulatory networks.
EURASIP Journal on Bioinformatics and Systems Biology, 2012. 2012(1): p. 5.
88. Chaves, M., R. Albert, and E.D. Sontag, Robustness and fragility of Boolean models for genetic
regulatory networks. J Theor Biol, 2005. 235(3): p. 431-49.
89. Thomas, R., Regulatory networks seen as asynchronous automata: A logical description. Journal
of Theoretical Biology, 1991. 153(1): p. 1-23.
90. Saadatpour, A., I. Albert, and R. Albert, Attractor analysis of asynchronous Boolean models of
signal transduction networks. J Theor Biol, 2010. 266(4): p. 641-56.
91. Klemm, K. and S. Bornholdt, Topology of biological networks and reliability of information
processing. Proceedings of the National Academy of Sciences of the United States of America,
2005. 102(51): p. 18414-18419.
92. Gan, X. and R. Albert. A general method to find the attractors of discrete dynamic models of
biological systems. in the 8th International Conference on Physics and Control (PhysCon 2017).
2017. Florence, Italy.
93. Brown, F.M., The Blake Canonical Form, in Boolean Reasoning: The Logic of Boolean Equations.
1990, Springer US: Boston, MA. p. 71-86.
94. Quine, W.V., The Problem of Simplifying Truth Functions. The American Mathematical Monthly,
1952. 59(8): p. 521-531.
95. Quine, W.V., A Way to Simplify Truth Functions. The American Mathematical Monthly, 1955.
62(9): p. 627-631.
96. McCluskey, E.J., Minimization of Boolean Functions*. Bell System Technical Journal, 1956. 35(6):
p. 1417-1444.
97. Aldana, M., S. Coppersmith, and L.P. Kadanoff, Boolean Dynamics with Random Couplings, in
Perspectives and Problems in Nolinear Science: A Celebratory Volume in Honor of Lawrence
Sirovich, E. Kaplan, J.E. Marsden, and K.R. Sreenivasan, Editors. 2003, Springer New York: New
York, NY. p. 23-89.
133
98. Wang, R.S. and R. Albert, Effects of community structure on the dynamics of random threshold
networks. Physical Review E, 2013. 87(1).
99. Berenguier, D., et al., Dynamical modeling and analysis of large cellular regulatory networks.
Chaos, 2013. 23(2): p. 025114.
100. Reisig, W., Petri Nets, in Modeling in Systems Biology: The Petri Net Approach, I. Koch, W. Reisig,
and F. Schreiber, Editors. 2011, Springer London: London. p. 37-56.
101. Chaouiya, C., et al., Petri net representation of multi-valued logical regulatory graphs. Natural
Computing, 2011. 10(2): p. 727-750.
102. Samaga, R. and S. Klamt, Modeling approaches for qualitative and semi-quantitative analysis
of cellular signaling networks. Cell Communication and Signaling, 2013. 11(1): p. 43.
103. Johnson, D.B., Finding All the Elementary Circuits of a Directed Graph. SIAM Journal on
Computing, 1975. 4(1): p. 77-84.
104. Assmann, S.M. and T. Jegla, Guard cell sensory systems: recent insights on stomatal responses
to light, abscisic acid, and CO2. Current Opinion in Plant Biology, 2016. 33: p. 157-167.
105. Munemasa, S., et al., Mechanisms of abscisic acid-mediated control of stomatal aperture.
Current Opinion in Plant Biology, 2015. 28: p. 154-162.
106. Davies, W.J. and M.J. Bennett, Achieving more crop per drop. Nature Plants, 2015. 1: p. 15118.
107. Roelfsema, M.R. and H. Kollist, Tiny pores with a global impact. New Phytol, 2013. 197(1): p.
11-5.
108. Acharya, B.R. and S.M. Assmann, Hormone interactions in stomatal function. Plant Mol Biol,
2009. 69(4): p. 451-62.
109. Li, J., et al., Regulation of Abscisic Acid-Induced Stomatal Closure and Anion Channels by Guard
Cell AAPK Kinase. Science, 2000. 287(5451): p. 300-303.
110. Zhang, W., L.-M. Fan, and W.-H. Wu, Osmo-sensitive and stretch-activated calcium-permeable
channels in Vicia faba guard cells are regulated by actin dynamics. Plant physiology, 2007.
143(3): p. 1140-1151.
111. Jiang, K., et al., The ARP2/3 complex mediates guard cell actin reorganization and stomatal
movement in Arabidopsis. The Plant cell, 2012. 24(5): p. 2031-2040.
112. Engineer, C.B., et al., CO2 Sensing and CO2 Regulation of Stomatal Conductance: Advances and
Open Questions. Trends in plant science, 2016. 21(1): p. 16-30.
113. Brearley, J., M.A. Venis, and M.R. Blatt, The effect of elevated CO2 concentrations on K+ and
anion channels of Vicia faba L. guard cells. Planta, 1997. 203: p. 10.
114. ASSMANN, S.M., The cellular basis of guard cell sensing of rising CO2. Plant, Cell & Environment,
1999. 22(6): p. 629-637.
115. Hu, H., et al., Carbonic anhydrases are upstream regulators of CO2-controlled stomatal
movements in guard cells. Nature cell biology, 2010. 12(1): p. 87-18.
116. Tian, W., et al., A molecular pathway for CO(2) response in Arabidopsis guard cells. Nat Commun,
2015. 6: p. 6057.
117. Hõrak, H., et al., A Dominant Mutation in the HT1 Kinase Uncovers Roles of MAP Kinases and
GHR1 in CO<sub>2</sub>-Induced Stomatal Closure. The Plant Cell, 2016. 28(10): p. 2493-2509.
118. Hsu, P.-K., et al., Abscisic acid-independent stomatal CO<sub>2</sub> signal transduction
pathway and convergence of CO<sub>2</sub> and ABA signaling downstream of OST1 kinase.
Proceedings of the National Academy of Sciences, 2018. 115(42): p. E9971-E9980.
119. Chakravorty, D. and S.M. Assmann, G protein subunit phosphorylation as a regulatory
134
mechanism in heterotrimeric G protein signaling in mammals, yeast, and plants. Biochemical
Journal, 2018. 475(21): p. 3331-3357.
120. Mishra, G., et al., A Bifurcating Pathway Directs Abscisic Acid Effects on Stomatal Closure and
Opening in <em>Arabidopsis</em>. Science, 2006. 312(5771): p. 264-266.
121. Lee, Y.-R.J. and S.M. Assmann, Arabidopsis thaliana ‘extra-large GTP-binding protein’ (AtXLG1):
a new class of G-protein. Plant Molecular Biology, 1999. 40(1): p. 55-64.
122. Ding, L., S. Pandey, and S.M. Assmann, Arabidopsis extra-large G proteins (XLGs) regulate root
morphogenesis. Plant J, 2008. 53(2): p. 248-63.
123. Pandey, S., et al., Regulation of root-wave response by extra large and conventional G proteins
in Arabidopsis thaliana. Plant J, 2008. 55(2): p. 311-22.
124. Chakravorty, D., et al., Extra-Large G Proteins Expand the Repertoire of Subunits in Arabidopsis
Heterotrimeric G Protein Signaling. Plant Physiol, 2015. 169(1): p. 512-29.
125. Pandey, S., et al., G-protein complex mutants are hypersensitive to abscisic acid regulation of
germination and postgermination development. Plant Physiol, 2006. 141(1): p. 243-56.
126. Urano, D., et al., Saltational evolution of the heterotrimeric G protein signaling mechanisms in
the plant kingdom. Sci Signal, 2016. 9(446): p. ra93.
127. Maruta, N., et al., Membrane-localized extra-large G proteins and Gbg of the heterotrimeric G
proteins form functional complexes engaged in plant immunity in Arabidopsis. Plant Physiol,
2015. 167(3): p. 1004-16.
128. Wang, X.Q., et al., G protein regulation of ion channels and abscisic acid signaling in Arabidopsis
guard cells. Science, 2001. 292(5524): p. 2070-2.
129. Ge, X.M., et al., Heterotrimeric G protein mediates ethylene-induced stomatal closure via
hydrogen peroxide synthesis in Arabidopsis. Plant J, 2015. 82(1): p. 138-50.
130. Jones, A.M. and S.M. Assmann, Plants: the latest model system for G-protein research. EMBO
Rep, 2004. 5(6): p. 572-8.
131. Wu, W.H. and S.M. Assmann, A membrane-delimited pathway of G-protein regulation of the
guard-cell inward K+ channel. Proc Natl Acad Sci U S A, 1994. 91(14): p. 6310-4.
132. Coursol, S., et al., Sphingolipid signalling in Arabidopsis guard cells involves heterotrimeric G
proteins. Nature, 2003. 423(6940): p. 651-4.
133. Li, J.H., et al., A signaling pathway linking nitric oxide production to heterotrimeric G protein
and hydrogen peroxide regulates extracellular calmodulin induction of stomatal closure in
Arabidopsis. Plant Physiol, 2009. 150(1): p. 114-24.
134. Albert, R., et al., A new discrete dynamic model of ABA-induced stomatal closure predicts key
feedback loops. PLoS Biol, 2017. 15(9): p. e2003451.
135. Chater, C., et al., Elevated CO2-Induced Responses in Stomata Require ABA and ABA Signaling.
Curr Biol, 2015. 25(20): p. 2709-16.
136. Clapham, D.E., Calcium signaling. Cell, 1995. 80(2): p. 259-268.
137. BROADLEY, M.R. and P.J. WHITE, Calcium in Plants. Annals of Botany, 2003. 92(4): p. 487-511.
138. Allen, G.J., et al., A defined range of guard cell calcium oscillation parameters encodes stomatal
movements. Nature, 2001. 411: p. 1053.
139. Hashimoto, M., et al., Arabidopsis HT1 kinase controls stomatal movements in response to CO2.
Nature Cell Biology, 2006. 8: p. 391.
140. Rozum, J.C. and R. Albert, Identifying (un)controllable dynamical behavior in complex networks.
PLOS Computational Biology, 2018. 14(12): p. e1006630.
135
141. Lawson, T., et al., Mesophyll photosynthesis and guard cell metabolism impacts on stomatal
behaviour. New Phytologist, 2014. 203(4): p. 1064-1081.
142. MOTT, K.A., Opinion: Stomatal responses to light and CO2 depend on the mesophyll. Plant, Cell
& Environment, 2009. 32(11): p. 1479-1486.
143. Fujita, T., K. Noguchi, and I. Terashima, Apoplastic mesophyll signals induce rapid stomatal
responses to CO2 in Commelina communis. New Phytologist, 2013. 199(2): p. 395-406.
144. Wong, S.C., I.R. Cowan, and G.D. Farquhar, Stomatal conductance correlates with
photosynthetic capacity. Nature, 1979. 282(5737): p. 424-426.
145. McAdam, S.A.M. and T.J. Brodribb, Mesophyll Cells Are the Main Site of Abscisic Acid
Biosynthesis in Water-Stressed Leaves. Plant Physiology, 2018. 177(3): p. 911-917.
146. Tõldsepp, K., et al., Mitogen-activated protein kinases MPK4 and MPK12 are key components
mediating CO2-induced stomatal movements. The Plant Journal, 2018. 96(5): p. 1018-1035.
147. Mustilli, A.C., et al., Arabidopsis OST1 protein kinase mediates the regulation of stomatal
aperture by abscisic acid and acts upstream of reactive oxygen species production. Plant Cell,
2002. 14(12): p. 3089-99.
148. Albert, R. and H.G. Othmer, The topology of the regulatory interactions predicts the expression
pattern of the segment polarity genes in Drosophila melanogaster. J Theor Biol, 2003. 223.
149. Yuan, Z., et al., Exact controllability of complex networks. Nature Communications, 2013. 4: p.
2447.
150. Zañudo, J.G.T., G. Yang, and R. Albert, Structure-based control of complex networks with
nonlinear dynamics. Proceedings of the National Academy of Sciences, 2017. 114(28): p. 7234-
7239.
151. Yang, G., J. Gomez Tejeda Zanudo, and R. Albert, Target Control in Logical Models Using the
Domain of Influence of Nodes. bioRxiv, 2018.
152. K. Chandra, A. and G. Markowsky, On the number of prime implicants. Vol. 24. 1978. 7-11.
Vita
Xiao Gan
Education and Research:
The Pennsylvania State University University Park, USA.
Ph.D. Major: Physics Aug. 2013 – Aug. 2019 (expected)
Advisor: Prof. Réka Albert
National Laboratory of Solid State Microstructures, Nanjing University,
Research assistant Jun. 2012 – Jun. 2013
Advisor: Prof. Xinglong Wu
Nanjing University, Kuang Yaming Honors School Nanjing, China
Bachelor of Science, Major: Physics Aug. 2008 - Jun. 2012
Publications
Gan, X and Albert, R. "Modeling biological information processing networks." In Physics of
Molecular and Cellular Systems, Krastan B. Blagoev and Herbert Levine (eds.) (Invited book
chapter, submitted for publication).
Gan, X and Albert, R. "General method to find the attractors of discrete dynamic models of
biological systems." Physical Review E 97(4): 042308 (2018).
Gan, X and Albert, R. "A general method to find the attractors of discrete dynamic models of
biological systems." Presented at the 8th International Conference on Physics and Control, Florence,
Italy, July 2017, Proceedings paper available at: http://lib.physcon.ru/doc?id=f916c9044267
Gan, X and Albert, R. "Analysis of a dynamic model of guard cell signaling reveals the stability of
signal propagation." BMC Systems Biology 10.1 (2016): 78.
Shan, Y., Wu, X., Gan, X., et al. “CdS: Mn–Polysulfido Complex Nanoclusters with H2O2-
Dependent and Site-Specific Color Changes” The Journal of Physical Chemistry C, 118(20),
pp.11085-11092. (2014)
Gan, Z., Xiong, S., Wu, X., Xu, T., Zhu, X., Gan, X., et al. “Mechanism of photoluminescence from
chemically derived graphene oxide: role of chemical reduction.” Advanced Optical Materials, 1(12),
926-932. (2013)
Jiang J., Zhu H., Zhou Y., Gan X., et al. "Research on diffraction of twisted nematic liquid crystal",
Physics Experimentation, 31, No.10, 44-46 (2011)
HONORS AND AWARDS
The Downsbrough Department Head's Chair in Physics (fellowship) in 2019
Financial support for invited talk in Shenzhen Institutes of Advanced Technology (SIAT), Chinese
Academy of Science (2018)
Financial support for NetSciX, Hangzhou, China (2018)
The Downsbrough Graduate Fellowship in Physics, Pennsylvania State University (2017)
NSF travel award for ICSB (International Conference on Systems Biology), Virginia Tech (2017)
The David C. Duncan Graduate Fellowship in Physics, Pennsylvania State University (2016)
The People’s Scholarship - Special Award, Nanjing University (2010)
The People’s Scholarship, Nanjing University (2009 & 2010)