Post on 06-Apr-2018
8/3/2019 Xingdong Bian- X-Machine Model of a Biological System
1/65
X-Machine Model of a Biological System
Third year undergraduate dissertation project
Final Dissertation
Department of Computer Science
University of Sheffield
Author: Xingdong Bian
Supervisor: Prof. Mike Holcombe
Module code: COM3021
Date: 29/03/2006
This report is submitted in partial fulfilment of the requirement for the degree of
Bachelor of Science with Honours in Computer Science by Xingdong Bian.
Signed declaration:
All sentences or passages quoted in this dissertation from other people's work have
been specifically acknowledged by clear cross-referencing to author, work and page(s).
Any illustrations which are not the work of the author of this dissertation have been
used with the explicit permission of the originator and are specifically acknowledged. I
understand that failure to do this amounts to plagiarism and will be considered grounds
for failure in this dissertation and the degree examination as a whole.
Name: XINGDONG BIAN
Signature:
Date: 02/05/2006
Abstract:
This project is in the field of computational biology. It uses computer simulation
models to display the spatial and temporal aspects of biological systems in detail.
The aim of this project is to develop a simulation of a vital part of the immune system
using the X-machine framework and tools such as the xparser and XML. The existing
models, written in Matlab, are converted into XML, and the xparser then parses the
XML into a runnable programme in C source code.
Three models are involved in this project: a chemical interaction model, an NF-κB
signalling pathway model, and a combined NF-κB and MAP kinase signalling model.
The first two have existing Matlab models to be converted, but the last requires
some research in order to add a new pathway to the NF-κB model.
Acknowledgments
Thanks to everyone who helped me with this project, especially my supervisor, Prof.
Mike Holcombe, who pointed me in the right direction and gave me many ideas and much
advice on this project. Thanks also to Mr. Simon Coakley, who helped me with the XML
specification, the xparser and visualisation, and to Mr. Mark Pogson, who helped me
with the Matlab example models. Lastly, thanks to Prof. Eva Qwarnstrom, who helped me
with biological knowledge and experimental data.
Contents
Title
Signed declaration
Abstract
Acknowledgments
Contents
Figure List
Chapter 1 Introduction
Section 1.1 Background
Section 1.2 About the Project
Section 1.2.1 Agent-Based Modelling
Section 1.2.2 X-machine
Section 1.2.3 HPCx
Section 1.3 About This Dissertation
Chapter 2 Literature Review
Section 2.1 Overview
Section 2.2 Agent-Based Intracellular Chemical Interactions Model
Section 2.3 Agent-Based NF-κB Signalling Pathway Model
Section 2.4 NF-κB Signalling Pathway and MAP Kinase Signal Pathway Combined Model
Section 2.5 Some Agent-Based Modelling Approaches
Section 2.5.1 Swarm Agent-Based Modelling
Section 2.5.2 MASON Multi-Agent Simulations
Section 2.5.3 X-machine Framework and XML
Chapter 3 Requirements and Analysis
Section 3.1 Objectives and Requirement for the Project
Section 3.2 Analysis for Intracellular Chemical Interaction Model
Section 3.2.1 Importance and User Requirements
Section 3.2.2 Conversion from Matlab
Section 3.2.3 Concentrations Rates
Section 3.3 Analysis for the NF-κB Signalling Pathway Model
Section 3.3.1 Importance and User Requirements
Section 3.3.2 Conversion from Matlab
Section 3.4 Analysis for the NF-κB & MAP Kinase Signalling Pathway Combined Model
Chapter 4 Design
Section 4.1 Associated Language with the Project
Section 4.1.1 XML
Section 4.1.2 Matlab
Section 4.1.3 C
Section 4.2 Overall Design
Section 4.2.1 X-machine Framework's Architecture
Section 4.2.2 Main XML File Structure
Section 4.2.3 Iteration XML File Structure
Section 4.3 Design of Chemical Interaction Model
Section 4.4 Design of NF-κB Signalling Pathway Model
Section 4.5 Design of NF-κB & MAP Kinase Signalling Pathway Combined Model
Chapter 5 Implementation and Testing
Section 5.1 Implementation of Three Models
Section 5.1.1 Implementation of Chemical Interaction Model
Section 5.1.2 Implementation of the NF-κB Signalling Pathway Model
Section 5.1.3 Implementation of NF-κB & MAP Kinase Signal Pathway Combined Model
Section 5.2 Testing Methods
Section 5.2.1 Unix Tool for Single Iteration Testing
Section 5.2.2 Getdata Programme for Whole Iteration Files Testing
Chapter 6 Results and Discussion
Section 6.1 Results and Discussion of Chemical Interaction Model
Section 6.2 Result and Discussion of NF-κB Pathway model
Section 6.3 Result and Discussion of NF-κB & MAP kinase pathways combined model
Chapter 7 Conclusions
Section 7.1 Summary of the Dissertation and Project
Section 7.2 Future Work of this Project
References
Appendices
Figure List
Figure 2.1 Transmembrane Signalling Biomechanical and Soluble Mediators
Figure 2.2 Chemical Interaction Model Visualisation (Matlab)
Figure 2.3 Chemical Interaction Model Results (Matlab)
Figure 2.4 NF-κB Pathway Model Visualisation (Matlab)
Figure 2.5 NF-κB Pathway Model Results (Matlab)
Figure 2.6 Summary of MAP kinase pathway
Figure 3.1 Process of combination
Figure 3.2 Chemical reactions
Figure 3.3 concentration of molecule A, B and C against time
Figure 3.4 Possible states and transition of an NF-κB
Figure 3.5 Simplify of the MAP Kinase pathway
Figure 4.1 Structure of the Main file (a)
Figure 4.2 Structure of the Main file (b)
Figure 4.3 NF-κB & MAP Kinase Signalling Pathway Relation
Figure 5.1 states and relations in X-machine
Figure 5.2 Visualisation of Chemical Interaction Model
Figure 5.3 Visualisation of NF-κB signalling pathway model
Figure 5.4 Visualisation of NF-κB MAP kinase combined model
Figure 5.5 Concentration against Iterations (time steps) graph
Figure 6.1 Chemical interaction agent model graph one
Figure 6.2 Chemical interaction agent model graph two
Figure 6.3 Visualisation for chemical interaction model
Figure 6.4 NF-κB pathway agent model result (a)
Figure 6.5 NF-κB pathway agent model result (b)
Figure 6.6 Result of the combined model
Chapter 1: Introduction
Section 1.1: Background
This project is in the field of computational biology, an interdisciplinary field
that joins computer technology and biology. Computational biology has emerged only
in recent years. The field is located at the
interface between the two scientific and technological disciplines that can be argued to
drive a significant if not the dominating part of contemporary scientific innovation
[1].
Following further discoveries in biology, such as the structure, organisation and
behaviour of cells, tissues, organisms and communities of biological systems, more
understanding, and possibly simulation, is needed. Computer technology can address
this need and provide predictions for important aspects of the behaviour of
biological systems.
Computer technology gives vitality to research in biology. A famous example is the
Human Genome Project, which has generated an extraordinary amount of
data. Biologists are now faced with the challenge of extracting meaning from linear
sequences composed of billions of base pairs. The work of computational biologists is
indispensable for this task and for many other biological problems that lend themselves
to computational solutions [2]. This is why the field of computational biology has
developed so dramatically: more and more people in both areas are starting to work
together to find the best solutions for their research.
There are currently 10 major research areas in computational biology: sequence analysis,
computational evolutionary biology, gene expression analysis, regulation analysis,
protein expression analysis, analysis of mutations in cancer, structure prediction,
measuring biodiversity, modelling biological systems and high-throughput image
analysis. My project is in the ninth area listed above, modelling biological systems.
This area involves the use of computer simulations of cellular subsystems to study both
the spatial and temporal aspects of the complex connections between these cellular
processes.
Biological computer modelling is defined as using a computer programme that tries to
simulate an abstract model of a particular biological system. Biological
computer simulation is a subset of computer simulation. Computer simulation is a
very useful part of modelling many natural systems, giving insight into the
operation of the natural systems being modelled. Before computer simulation,
people used mathematical models, but with computer simulation,
modelling entered a new stage.
Here is a history of computer simulation (quoted from the Wikipedia article "Computer
Simulation", which is licensed under the GNU Free Documentation License --
http://www.gnu.org/copyleft/fdl.html):
Computer simulation was developed hand-in-hand with the rapid growth of the
computer, following its first large-scale deployment during the Manhattan Project in
World War II to model the process of nuclear detonation. It was a simulation of 12 hard
spheres using a Monte Carlo algorithm. Computer simulation is often used as an
adjunct to, or substitution for, modelling systems for which simple closed form
analytic solutions are not possible. There are many different types of computer
simulation; the common feature they all share is the attempt to generate a sample of
representative scenarios for a model in which a complete enumeration of all possible
states of the model would be prohibitive or impossible. Computer models were
initially used as a supplement for other arguments, but their use later became rather
widespread. The physicist Richard Feynman, was not fond of such models and once
called them "a disease"[3].
Section 1.2: About the Project
The aim of my project is to develop a simulation of a vital part of the immune system
using an existing framework and tools. The framework, which was developed by the
Computational Biology Research Group in our department, can model different kinds of
biological systems. The systems are defined in terms of individual agents which play
the role of different biological entities such as molecules, receptors etc. The
simulations the group has built can also handle thousands of these agents operating
and communicating with one another. This is called Agent-Based Modelling.
Section 1.2.1: Agent-Based Modelling
Agent-Based Modelling was developed to deal with the complexities of the system and
to extend the capabilities of previous chemical modelling attempts [4][5]. It can
provide a better understanding of the operation of cellular reactions in both their
spatial and temporal aspects.
Agent-Based modelling (also known as individual-based modelling) treats each
individual component of a system as a single entity (or agent) obeying its own
pre-defined rules and reacting to its environment and neighbouring agents accordingly
[4][6]. Agents are well suited to representing the components of a system.
Agents can be represented by various computational models; the
approach chosen here is the X-machine, providing an intuitive and precise method to
model the functional behaviour of systems in a flexible and modular manner [5]. A
single stream X-machine is used to describe each individual agent, and communication
channels are identified between machines to deal with agent interactions [7]. When
modelling complex systems, the X-machine has an essential feature: a model is
straightforward to develop further by adding new agents to the system, which makes
the modelling process extensible.
Section 1.2.2: X-machine
We use the X-machine because of its special properties. X-machines are similar to
finite state machines, which are models of behaviour based on states and transitions,
but X-machines have an additional feature: memory. Transitions between states
can read the memory and modify it [9]. The memory gives the X-machine
an important and novel capability. Because the memory of an X-machine holds data such
as physical location, the number of states required to model the system is manageably
small.
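As a rough illustration of the idea above, the following Python sketch shows a finite state machine whose transitions read and modify a memory. It is not the X-machine framework's XML format or the C code produced by the Xparser; the class name, the states (`free`, `bound`) and the memory fields (`x`, `bound_steps`) are hypothetical choices made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class XMachineAgent:
    """Toy X-machine: a state plus a memory that transitions may change."""
    state: str = "free"
    # The memory holds data such as location, so position needs no extra states.
    memory: dict = field(default_factory=lambda: {"x": 0.0, "bound_steps": 0})

    def step(self, partner_nearby: bool) -> None:
        """Apply one transition chosen from the current state and the input."""
        if self.state == "free":
            if partner_nearby:
                self.state = "bound"            # the state changes
                self.memory["bound_steps"] = 0  # and the memory is modified
            else:
                self.memory["x"] += 1.0         # same state, memory updated
        elif self.state == "bound":
            self.memory["bound_steps"] += 1


agent = XMachineAgent()
for nearby in (False, False, True, False):
    agent.step(nearby)
print(agent.state, agent.memory)
```

Only two states are needed here, because the position is kept in the memory rather than being encoded as a separate state for every possible location.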
The framework is used as follows: the model is written in XML in the X-machine-specific
format, and then the Xparser (which was also built by the computational biology research
group in our department) produces a programme in C code from the X-machine
XML specification. Running the programme simulates the agents' behaviour,
and it is also possible to visualise the simulation with the special visualisation C
programme built for the model.
The framework is based on XML, instead of writing the model directly in C
code, because XML is a simple and flexible text format derived from SGML, which
shows all the states of each agent clearly and is much simpler to write than C.
Once the XML code has been created, the Xparser parses it into C code easily.
Section 1.2.3: HPCx
The computational biology research team has already built the model of the vital part
of the immune system in Matlab. What I will do is convert the model into the X-machine
framework, which produces C code that is built with a C compiler.
The reason is that the supercomputer HPCx cannot run Matlab but can run C.
In order to have this supercomputer calculate our simulation, we have to convert our
model into C.
The hardware specification of the supercomputer is as follows (quoted from
http://www.epcc.ed.ac.uk/msc/systems_HPCx.htm):
The HPCx system is located at the UK's CCLRC's Daresbury Laboratory and operated by the HPCx
Consortium.
The HPCx system uses IBM p690+ Regatta nodes for the compute and IBM p690 Regatta nodes for
login and disk I/O. Each Regatta node contains 32 processors. At present there are two p690 service
nodes. At the beginning of the user service on HPCx phase2 in April 2004, twenty p690+ nodes were
used for compute jobs, offering a total of 640 processors. From Monday, 10 May, there were 38 frames,
i.e. 1216 processors, available to users. Then the system had a throughput of at least 4.8 Tflops (4800
AU/hr). This was increased to 50 nodes offering 1600 processors end of May 2004. The peak
computational power of the HPCx system is 10.8 Tflops peak, or at least 6 Tflops sustained. The
complete new platform gave a value of 6,188 Gflops for the Rmax value of the Linpack benchmark. The
service can thus provide 6,188 AUs per hour, 148,512 AUs per day.
The HPCx service is provided by a consortium led by the University of Edinburgh, with the
Council for the Central Laboratory of the Research Councils and IBM. This supercomputer
will help us by running the simulation on thousands of processors, with
different agents on different processors, to get a much more accurate result. However, my
project does not involve HPCx directly.
Section 1.3: About This Dissertation
This dissertation consists of seven chapters. After this introductory chapter,
the second chapter is the literature review, which covers the related background
literature as well as the X-machine framework in detail and the three programming
languages associated with my project. The third chapter is requirements and analysis;
it discusses the objectives, the requirements and the analysis of the project in
more detail, and also describes how the project will be evaluated. The fourth
chapter is design, which covers the design techniques of this project. The fifth
chapter is implementation and testing, which covers the coding methods and
how the models are tested. The sixth chapter is results and discussion, an important
chapter that shows the main results of the models along with some discussion. The last
chapter is conclusions, which summarises the project and the dissertation.
Chapter 2: Literature Review
Section 2.1: Overview
Three models are involved in my project: the intracellular chemical interactions model,
the NF-κB signalling pathway model, and a combined model of the NF-κB signalling pathway
and the MAP kinase signalling pathway. There are also three programming languages
associated with my project: XML, Matlab and C. The picture below shows part of the
signalling pathways in a cell; some of the molecules shown will appear in the
models. The picture was produced by Prof. Eva Qwarnstrom:
Figure 2.1 [26]
Section 2.2: Agent-Based Intracellular Chemical Interactions Model
Firstly, I will introduce the intracellular chemical interactions model. Even the simplest
life forms require the interaction of more than 400 chemical processes that are encoded
by genes [9]. To track and understand intracellular chemical interactions, the
intracellular signalling pathways should be considered. Intracellular signalling
pathways are very important for the control and regulation of cell behaviour.
Agent-based modelling can show intracellular signalling pathways in both their spatial
and temporal aspects, and makes it possible to provide a
framework for calculating chemical interactions accurately.
Complex interactions of genes, proteins and other molecules within the cell must be
addressed in order to gain a better understanding of how these pathways operate
[6][10][11]. Using mathematical models together with information about the physical
components of the cell also makes it easier to understand the activities of signalling
pathways.
Intracellular signalling pathways have traditionally been modelled using reaction
kinetics, with ordinary differential equations describing the quantity of each chemical
over time. This is valid only when the chemicals in the cell are well mixed. However, due to
internal structure and low numbers and non-uniform distributions of certain key
molecules in the cell, this is certainly not true [12].
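The reaction kinetics approach can be sketched for the single reaction A + B → C. Assuming mass-action kinetics in a well-mixed volume, d[A]/dt = d[B]/dt = -k[A][B] and d[C]/dt = k[A][B]. The Python sketch below integrates these equations with a simple forward Euler step; the function name, rate constant, initial concentrations and step size are arbitrary illustrative values, not taken from the dissertation's models.

```python
def simulate_kinetics(a0, b0, k, dt, steps):
    """Forward Euler integration of A + B -> C under mass-action kinetics."""
    a, b, c = a0, b0, 0.0
    history = [(a, b, c)]
    for _ in range(steps):
        rate = k * a * b  # reaction rate, assuming a well-mixed volume
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
        history.append((a, b, c))
    return history


# Illustrative values only (assumptions, not data from the models).
trajectory = simulate_kinetics(a0=1.0, b0=0.5, k=2.0, dt=0.01, steps=500)
a_end, b_end, c_end = trajectory[-1]
print(f"A={a_end:.3f} B={b_end:.3f} C={c_end:.3f}")
```

Each species is described only by a single concentration, which is exactly why this kind of model cannot represent where individual molecules are inside the cell.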
Also, because the signalling pathways are complex, reaction kinetics models need a
very large number of ordinary differential equations. The description therefore
becomes huge and the solutions become difficult to express. Models of this
kind have other problems as well: they are limited in how well they function,
and large systems of ordinary differential equations are sensitive, so that
small changes to the equations can cause big changes in behaviour. Models of this
kind therefore give a narrow view of the real behaviour in cells, even though they can
sometimes provide useful results.
An important factor that needs to be accounted for in intracellular modelling is time
delays. Time delays in certain cellular processes such as transcription can have very
significant effects on pathway behaviour [6]. Ordinary differential equation models do
not consider this factor, because it is difficult to include within those
differential equations.
An even more important factor for intracellular modelling is spatial effects. Again,
it is hard for differential equation models to take spatial effects into account.
In summary, although differential equation models are important, they have
many disadvantages and limitations when modelling intracellular interactions. So, to
gain a higher level of understanding of the mechanical and structural effects on
intracellular pathways, more transparent and abstract models are needed.
A good modelling approach here is agent-based modelling. Agent-based
modelling models each individual component of a system as a single agent obeying its
own pre-defined rules and reacting to its environment and neighbouring agents
accordingly [6]. That means agent-based modelling contains new methods of
modelling spatial systems that deal with much finer spatial and temporal scales, where
activity is represented at the level of the individual or agent. Processes enter
these systems naturally as agent behaviour, and so join the spatial context naturally as
well. Agent-based modelling has recently been applied to a variety of biological
systems, including insect communities and epithelial tissue [13][14][15][16].
In a biological system for a biochemical pathway, anything
from a molecule to a signalling receptor to an entire chain of interactions can be
modelled as an agent, thus providing a modular and extensible modelling framework
which allows abstraction of details as necessary [5]. Agent-based modelling is
therefore well suited to spatial modelling, which is good for monitoring intracellular
interactions and the changes in cell structure caused by the interaction processes.
Compared with differential equation models, agent-based models have much more
freedom: they can model different quantities and positions of molecules without
limitation, provided the computer is powerful enough. The two important factors, time
delays and spatial effects, can also be included in the model easily. Note, however,
that the number of agents must be positive.
Unlike differential equation models, agent-based models do not need many
ordinary differential equations, but they do need other details, such as each
agent's position and properties, so a large amount of information needs to
be specified. It should also be noted that the agent-based model should agree
with the associated kinetics model.
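A minimal sketch of the agent-based view of the reaction A + B → C is given below. Each molecule is an agent with its own position in a unit box; at every step the agents take a small random move, and an A and a B that come within a reaction radius are replaced by a C. This is not Mark Pogson's Matlab model or the X-machine implementation: the function name, box size, step size, reaction radius and random seed are assumptions chosen only to show the kind of per-agent information that has to be specified.

```python
import math
import random


def run_agents(n_a, n_b, steps, radius=0.1, move=0.05, seed=1):
    """Toy agent-based simulation of A + B -> C in a unit cube."""
    rng = random.Random(seed)

    def new_pos():
        return [rng.random() for _ in range(3)]

    agents = {"A": [new_pos() for _ in range(n_a)],
              "B": [new_pos() for _ in range(n_b)],
              "C": []}
    counts = []
    for _ in range(steps):
        # Each agent takes a small random step and is kept inside the box.
        for positions in agents.values():
            for p in positions:
                for i in range(3):
                    p[i] = min(1.0, max(0.0, p[i] + rng.uniform(-move, move)))
        # An A reacts with the first remaining B found within the radius.
        remaining_a = []
        for a in agents["A"]:
            partner = next((b for b in agents["B"]
                            if math.dist(a, b) < radius), None)
            if partner is None:
                remaining_a.append(a)
            else:
                agents["B"].remove(partner)
                # The new C is placed midway between the two reactants.
                agents["C"].append([(x + y) / 2 for x, y in zip(a, partner)])
        agents["A"] = remaining_a
        counts.append((len(agents["A"]), len(agents["B"]), len(agents["C"])))
    return counts


counts = run_agents(n_a=50, n_b=30, steps=100)
print(counts[-1])
```

The counts recorded at each step are whole numbers of molecules, and the totals A + C and B + C stay constant, which is one simple check that such a model is consistent with the corresponding kinetics.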
The two images below show an agent-based model coded in Matlab by Mark Pogson in
our department. Figure 2.2 shows a step in the middle of the interaction; it clearly
displays the position and number of all three kinds of molecule in a three-dimensional box.
Figure 2.3 shows the number of each kind of molecule against time in seconds. We
can see that, as time passes, molecule A interacts with molecule B to produce
molecule C. The numbers of the three molecules are also linked. An agent-based
intracellular interaction model (A + B → C) in Matlab code:
Figure 2.2
Figure 2.3
Section 2.3: Agent-Based NF-κB Signalling Pathway Model
After the intracellular chemical interaction model, we now move on to the second
model involved in my project: the NF-κB signalling pathway model.
NF-κB, nuclear factor kappa B, is a heterodimeric protein composed of different
combinations of members of the Rel family of transcription factors. The Rel/NF-kB
family of transcription factors are involved mainly in stress-induced, immune, and
inflammatory responses. In addition, these molecules play important roles during the
development of certain hemopoietic cells, keratinocytes, and lymphoid organ
structures. More recently, NF-kB family members have been implicated in neoplastic
progression and the formation of neuronal synapses. NF-kB is also an important
regulator in cell fate decisions, such as programmed cell death and proliferation
control, and is critical in tumorigenesis [17]. So the intracellular NF-κB signalling
pathway is important to the immune system.
Because it controls cell death and proliferation, research into the NF-κB signalling
pathway is very important. Imagine that people could control it, letting cancer cells
kill themselves while normal cells stay alive: one of the biggest problems in the world
today, cancer, would be solved. However, it is not easy to control, so a good model of
the intracellular NF-κB signalling pathway is needed to show both the spatial and
temporal details of the pathway for research purposes.
NF-κB activation is tightly controlled by inhibitors of NF-κB (IκB) proteins [5][18].
IκB sequesters the majority of NF-κB in the cytoplasm as complexes by masking their
nuclear localisation signals [19]. During activation, IκB is phosphorylated by IκB
kinases (IKK), causing its ubiquitination and proteosome-mediated degradation. The
newly free NF-κB is consequently transported into the nucleus, inducing genes bearing
cognate binding motifs [5].
The information above shows how important the NF-κB signalling pathway is
and how NF-κB is activated. We now need a computational model to learn
how it controls the signalling pathways, alongside the results
provided by experiment.
As with the intracellular chemical interaction model, differential
equations have been used to model the behaviour of the inhibitors. However, as
mentioned above, differential equation models are limited in how well they can show
the actual pathway, so the best approach here is agent-based modelling.
Agent-based modelling is able to give the intracellular NF-κB signalling pathway a
better scope of analysis and a more complete view of the regulatory mechanisms. It
shows what is actually happening inside the cell. In this model a single agent is a
molecule inside the cell, and its behaviour is controlled by the rules of interaction
and by its environment. Sometimes it is not possible to model all the individual
molecules, due to biological or computational limitations, but by using other agents to
separate the system into useful components, the model can still provide a complete view
of the pathway.
Again, in this model agent-based modelling has a wider scope than reaction
kinetics modelling, but the agent-based model must agree with the corresponding
reaction kinetics model.
The two images below show a second agent-based model coded in Matlab by Mark
Pogson, from our department. Figure 2.4 shows a step in the middle of the NF-κB
signalling pathway simulation; it clearly displays a model of the cell and the position
of each kind of molecule. Figure 2.5 shows the concentration of each kind of
molecule against time in seconds.
Figure 2.4
Figure 2.5
As we can see, the model is made up of many different molecules in a spherical cell
with a spherical nuclear region at its centre. However, in the real world, some cells
have unique, non-spherical, free-form shapes. To model those cells, we will need some
special software to work out the boundary, but the model is still based on a spherical
model with all kinds of coordinates.
Section 2.4: NF-κB Signalling Pathway and MAP Kinase Signal Pathway Combined Model
MAP Kinase stands for Mitogen-activated protein kinase. In cell biology,
mitogen-activated protein kinases are serine/threonine-specific protein kinases that
respond to extracellular stimuli (mitogens) and regulate various cellular activities, such
as gene expression, mitosis, differentiation, and cell survival/apoptosis. Extracellular
stimuli lead to activation of a MAPK via a signalling cascade composed of MAPK,
MAPK kinase (MAPKK), and MAPKK kinase (MAPKKK). A MAPKKK that is
activated by extracellular stimuli phosphorylates a MAPKK on its serine and threonine
residues, and then this MAPKK activates a MAPK through phosphorylation on its
serine and tyrosine residues. This MAPK signalling cascade has been evolutionarily
well-conserved from yeast to mammals. [27]
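The three-tier cascade described above can be pictured as a chain in which each active kinase activates the next one. The Python sketch below is a deliberately simplified picture under stated assumptions: each tier is just a boolean flag, activation moves down one tier per step, and deactivation, phosphorylation sites and kinetics are ignored. The names `CASCADE` and `propagate` are hypothetical.

```python
CASCADE = ["MAPKKK", "MAPKK", "MAPK"]


def propagate(stimulus: bool, steps: int):
    """Return the activation state of each tier after every step."""
    active = {name: False for name in CASCADE}
    trace = []
    for _ in range(steps):
        previous = dict(active)
        # The first tier responds to the extracellular stimulus.
        active["MAPKKK"] = previous["MAPKKK"] or stimulus
        # Each lower tier is activated by the tier directly above it.
        for upstream, downstream in zip(CASCADE, CASCADE[1:]):
            active[downstream] = previous[downstream] or previous[upstream]
        trace.append(dict(active))
    return trace


for step, state in enumerate(propagate(stimulus=True, steps=3), start=1):
    print(step, state)
```

With a stimulus present, MAPKKK becomes active in the first step, MAPKK in the second and MAPK in the third; without a stimulus, nothing in the chain is activated.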
Figure 2.6 [25]
Figure 2.6 shows only a summary of the MAP kinase pathway, whereas Figure 2.1
shows more complex and complete signalling pathways. It also shows the cross talk
between the NF-κB and MAP kinase pathways.
This pathway can also be modelled with an agent-based model, by introducing each
molecule as an agent. As with the NF-κB signalling pathway, agent-based modelling is
able to provide a better scope of analysis and a more complete view of the regulatory
mechanisms. However, the combined model is more complex and more important for
research purposes, so it is necessary for the computer model to display what is
actually happening inside the cell.
The most important thing is to see whether these two pathways interfere with each other
when they are in the same model; the cross-interaction between their members is also
crucial.
If the two pathways behave normally in the same model, that means the X-machine framework
is capable of modelling more than one pathway. This is also the basis for future models
containing three or more pathways.
Section 2.5: Some Agent-Based Modelling Approaches
Section 2.5.1: Swarm Agent-Based Modelling
Swarm is a multi-agent software platform for the simulation of complex adaptive
systems. In the Swarm system the basic unit of simulation is the swarm, a collection of
agents executing a schedule of actions. Swarm supports hierarchical modelling
approaches whereby agents can be composed of swarms of other agents in nested
structures. Swarm provides object oriented libraries of reusable components for
building models and analyzing, displaying, and controlling experiments on those
models. Swarm is currently available as a beta version in full, free source code form. It
requires the GNU C Compiler, Unix, and X Windows. [33]
The modelling formalism that Swarm adopts is a collection of independent agents
interacting via discrete events. Within that framework, Swarm makes no assumptions
about the particular sort of model being implemented. There are no domain specific
requirements such as particular spatial environments, physical phenomena, agent
representations, or interaction patterns. Swarm simulations have been written for such
diverse areas as chemistry, economics, physics, anthropology, ecology, and political
science. [33]
Swarm uses each individual agent as a basic unit: each agent generates events that
affect itself and other agents, and a Swarm simulation consists of a number of agents
interacting with each other.
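The idea of agents interacting through scheduled discrete events can be sketched in a few lines of Python. This is not Swarm's API (Swarm is a library with its own classes); the `Agent` class, the `run` function and the event format are hypothetical and serve only to illustrate agents whose events affect other agents in time order.

```python
import heapq


class Agent:
    def __init__(self, name):
        self.name = name
        self.received = []  # times at which this agent handled an event

    def act(self, time, others):
        """Handle an event: notify every other agent one time unit later."""
        self.received.append(time)
        return [(time + 1, other) for other in others]


def run(agents, until):
    """Process events in time order until the given time limit."""
    queue = [(0, 0, agents[0])]  # (time, tie-breaker, target agent)
    counter = 1
    while queue:
        time, _, agent = heapq.heappop(queue)
        if time > until:
            break
        others = [a for a in agents if a is not agent]
        for new_time, target in agent.act(time, others):
            heapq.heappush(queue, (new_time, counter, target))
            counter += 1


agents = [Agent("a"), Agent("b")]
run(agents, until=3)
print({a.name: a.received for a in agents})
```

The priority queue plays the role of the schedule: events are always handled in time order, and the unique tie-breaker keeps the ordering well defined when two events share the same time.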
Swarm needs libraries to do the simulation. Swarm libraries serve two major
functions. The libraries are a set of classes that model builders can use by direct
instantiation. For many objects, especially highly technical ones such as schedule data
structures, it's likely that all a user will ever do is use the classes as provided. But in
addition, one can use Swarm libraries by subclassing them, specializing particular
classes for particular modelling needs. Both modes of using the Swarm libraries are
important; Swarm is designed to facilitate both as appropriate. [33] This is also the
limitation of the Swarm agent-based modelling.
Section 2.5.2: MASON Multi-Agent Simulations
MASON stands for Multi-Agent Simulator Of Neighbourhoods... or Networks... or
something... MASON is a fast discrete-event multiagent simulation library core in
Java, designed to be the foundation for large custom-purpose Java simulations, and
also to provide more than enough functionality for many lightweight simulation needs.
MASON contains both a model library and an optional suite of visualization tools in
2D and 3D. MASON is a joint effort between George Mason University's ECLab
Evolutionary Computation Laboratory and the GMU Center for Social Complexity,
and was designed by Sean Luke, Gabriel Catalin Balan, and Liviu Panait, with help
from Claudio Cioffi-Revilla, Sean Paus, Keith Sullivan, Daniel Kuebrich, Joey
Harrison, and Ankur Desai. [34]
MASON has some special features:
• Simulations can be serialized to checkpoints (freeze-dried and written to disk),
  which can be recovered from at any time, even to different Java platforms and new
  MASON visualization toolkits.
• MASON can be set up to be guaranteed duplicatable, meaning that the same
  simulation parameters will produce the same results regardless of platform.
• Libraries are provided for visualizing in 2D and in 3D (using Java3D), to
  manipulate the model graphically, to take screenshots, and to generate movies
  (using Java Media Framework).
• While the visualization toolkits are fairly large, the core simulation model is
  intentionally very small, fast, and easy to understand. [34]
However, as the description above shows, MASON relies on Java to run simulation
models. As noted in the last chapter, we need to run models on HPCx, but HPCx doesn't
support Java, so it is not possible to choose this simulation system for my project.
As the last two sections show, neither of these two systems suits my project as well as
the X-machine framework; the next section explains why the X-machine framework is the
most suitable one for my project.
Section 2.5.3: X-machine Framework and XML
Because agent-based modelling is so widely used for intracellular interactions, it is
necessary to develop a common architecture for systems with large numbers of agents.
The approach here is a framework based on the X-machine, which standardises the way
agents are expressed. The X-machine framework describes a model in XML, and the
Xparser, which is written in C, parses that XML into runnable C code.
There are quite a lot of tools for computational biology modelling research, but few
of them are agent-based, and those that are have inflexible structures which won't
suit our models. Some agent-based frameworks already exist, but they can't meet the
needs of intracellular modelling, because inside actual cells there are millions of
molecules and associated cellular signalling. With such a huge number of agents, a
common architecture is essential. Running the model on a supercomputer like HPCx, as I
mentioned in the introduction, makes the modelling result more accurate. The reason why
it can be run on supercomputers is the definition of agents: the agents are defined as
autonomous computing machines that communicate with messages, so the processing of
the agents can be spread across many processors and computers that are connected on
a network [8].
The messaging between agents is similar to message communication between computers,
so the messages between agents can be carried between computers. MPI (Message
Passing Interface) is a library that allows the creation of programs that can be spread
across computers and that communicate with messages, and it has become the de facto
standard for distributed memory parallel processing [8]. So we can use computers to
simulate the agents and the messages between those agents.
It is possible to define a cell as a system which processes parallel collections of
communication. So we need a good model to define the behaviour of agents that run in
parallel, send each other data and process it. The X-machine meets all these needs. An
X-machine is similar to other finite state machines: it has states and input and
output alphabets, but it also has something which other state machines don't have:
memory. This additional memory makes it really useful and suitable for agent-based
modelling. On a transition between states, the machine can carry its memory with it
and modify it. We can see this in the definition of a stream X-machine.
The definition of a stream X-machine is an 8-tuple [16]:
$X = (\Sigma, \Gamma, Q, M, \Phi, F, q_0, m_0)$
where:
• $\Sigma$ and $\Gamma$ are the input and output alphabets respectively.
• $Q$ is the finite set of states.
• $M$ is the (possibly) infinite set called memory.
• $\Phi$, the type of the machine $X$, is a set of partial functions $\varphi$ that map an
  input and a memory state to an output and a possibly different memory state,
  $\varphi: \Sigma \times M \to \Gamma \times M$.
• $F$ is the next state partial function, $F: Q \times \Phi \to Q$, which given a state and a
  function from the type $\Phi$ determines the next state. $F$ is often described as a
  state transition diagram.
• $q_0$ and $m_0$ are the initial state and initial memory respectively.
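To make the roles of the tuple concrete, the following is a minimal sketch in C of a toy stream X-machine. It is not taken from the X-machine framework: the two states, the memory field and the functions phi_bind and phi_idle are all hypothetical names chosen for illustration, and inputs and outputs are single characters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy stream X-machine; every name here is illustrative. */
typedef enum { Q_FREE, Q_BOUND } State;     /* Q: the finite set of states */
typedef struct { int inputs_seen; } Memory; /* M: the memory */

typedef struct {
    State q;   /* current state, initially q0 */
    Memory m;  /* current memory, initially m0 */
} XMachine;

/* phi_bind: defined only for input 'b'; outputs 'C' and updates memory. */
int phi_bind(char in, Memory *m, char *out) {
    if (in != 'b') return 0;   /* partial function: undefined for this input */
    m->inputs_seen += 1;
    *out = 'C';
    return 1;
}

/* phi_idle: defined for every input except 'b'; outputs '-'. */
int phi_idle(char in, Memory *m, char *out) {
    if (in == 'b') return 0;
    m->inputs_seen += 1;
    *out = '-';
    return 1;
}

/* Applies F for one input symbol: (Q_FREE, phi_bind) -> Q_BOUND and
   (q, phi_idle) -> q. Returns the output symbol, or '?' when no
   function is defined for the current state and input. */
char xm_step(XMachine *x, char in) {
    char out = '?';
    assert(x != NULL);
    if (x->q == Q_FREE && phi_bind(in, &x->m, &out)) {
        x->q = Q_BOUND;
        return out;
    }
    if (phi_idle(in, &x->m, &out)) return out;
    return '?';
}
```

Starting from q0 = Q_FREE and an empty memory, feeding the machine the input 'b' moves it to Q_BOUND and changes its memory in the same step, which is the feature that distinguishes an X-machine from a plain finite state machine.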
From now on the term X-machine refers to a stream X-machine [8]. Because
X-machines can communicate, we can use the communicating X-machine. A
communicating X-machine model uses X-machines which can exchange messages.
The communicating X-machine model can be defined as the tuple [8]:
$((C_i^x)_{i = 1..n}, R)$
where:
• $C_i^x$ is the i-th communicating X-machine in the system, and
• $R$ is a communication relation between the n X-machines.
Different ways of defining R give different definitions of the communicating
X-machine. One of the most accepted approaches uses the idea of a communication
matrix which acts as the means of communication between X-machines [8]. The
cells of the matrix in this approach contain the messages between X-machines. However,
this approach still has disadvantages when using X-machines as agents: when there are
a lot of agents, the communication matrices become too large to link every agent with
every other. Also, from the point of view of an agent, it is unclear which agent a
message should be sent to, because the communication partners keep changing.
In the communicating X-machine agent-based models, agents are restricted to
interacting with surrounding agents, so the distance over which messages are sent
between agents is restricted. In this approach, the communication relation R between
X-machines consists of two lists: a message list and a message type list. All the
X-machines can read and understand the messages in the message list. This is really
important for the concept of this kind of implementation: it means the actions of each
X-machine are based on input messages. If the source of the input message is too far
from this X-machine, then the message will be ignored; if the source is at a reasonable
distance, it will be processed. This method can also be extended by putting a tag with
some extra information on each message, e.g. the maximum distance for the sending
X-machine and the possible receiving X-machines.
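The distance rule above can be sketched in C as follows. This is only an illustration under assumptions: the Message layout, the max_range tag and the function names are hypothetical, and squared distances are compared so that no square root is needed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double x, y, z; } Vec3;

/* Hypothetical message layout: sender position plus an optional range tag. */
typedef struct {
    int sender_id;
    Vec3 pos;
    double max_range;  /* tag: 0 means "use the receiver's default range" */
} Message;

/* Squared Euclidean distance; only comparisons are needed, so no sqrt. */
double dist_sq(Vec3 a, Vec3 b) {
    double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

/* Returns 1 if an agent at `self` should process the message, 0 if the
   source is too far away and the message is to be ignored. */
int should_process(const Message *msg, Vec3 self, double default_range) {
    double range;
    assert(msg != NULL && default_range >= 0.0);
    range = msg->max_range > 0.0 ? msg->max_range : default_range;
    return dist_sq(msg->pos, self) <= range * range;
}

/* Scans the whole message list, as every agent can read every message,
   and counts the messages this agent would act on. */
int count_processed(const Message *list, int n, Vec3 self, double default_range) {
    int i, count = 0;
    assert(list != NULL || n == 0);
    for (i = 0; i < n; i++)
        if (should_process(&list[i], self, default_range)) count++;
    return count;
}
```

The tag is modelled here as a per-message maximum range that overrides the receiver's default; other kinds of tag, such as a list of permitted receivers, would need a different field.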
There are a lot of ways of communicating and handling messages. A useful case is the
communication between two agents that are processed on distinct computers in a
computer cluster or a grid system. The current practice is to have a local message
list for each CPU in the computer cluster. An agent only sends messages to and
receives messages from its local CPU, but a separate calculation checks whether any
agent on a different CPU needs the message. The calculation involves the distance
between agents: by giving each agent an influence boundary, it is easy to decide
whether an agent needs the message.
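One way to picture the separate calculation is the sketch below. It assumes, purely for illustration, that space is divided along the x-axis into one slab per CPU and that an agent's influence boundary is a radius around its position; the Partition type and both function names are hypothetical and are not the framework's own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical decomposition: each CPU owns one slab [x_min, x_max]. */
typedef struct { double x_min, x_max; } Partition;

/* Returns 1 if the agent's influence boundary reaches into partition p. */
int influence_overlaps(double agent_x, double radius, Partition p) {
    assert(radius >= 0.0 && p.x_min <= p.x_max);
    return agent_x + radius >= p.x_min && agent_x - radius <= p.x_max;
}

/* Writes into `targets` the indices of all partitions other than `home`
   whose local message list also needs this agent's message.
   Returns how many such partitions there are. */
int forward_targets(double agent_x, double radius, const Partition *parts,
                    int n, int home, int *targets) {
    int i, count = 0;
    assert(parts != NULL && targets != NULL && home >= 0 && home < n);
    for (i = 0; i < n; i++) {
        if (i == home) continue;  /* the local list already has the message */
        if (influence_overlaps(agent_x, radius, parts[i]))
            targets[count++] = i;
    }
    return count;
}
```

An agent deep inside its own slab produces no targets, so most messages stay local and only agents near a boundary cause traffic between CPUs.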
XML is used for the implementation architecture of the X-machine here. The X-machine
architecture can be defined by writing an XML text file. This is easy for most people,
because they can modify the XML code with any kind of text editor. It is also possible
to develop a graphical interface to modify the XML without seeing the implementation
directly.
It is necessary to build a parser which can parse the XML code into a runnable C
programme that runs the X-machine agents with the message list relation. The parser
itself is coded in C and it is universal for all XML coded X-machine agent models; we
call the parser Xparser. To run the iterations, another XML text file is needed to
define the starting state and details of each agent as an initial point. By using
different initial files, it is possible to have different runs of the model with
different results for research.
The X-machine model can be visualised by using a specially coded visualisation
programme, which is coded in C as well. The visualisation gives us a direct view of
the model's structure and interaction procedures. It is also possible to save each
frame of the visualisation as an image file; a set of these screenshots can then be
converted to a video file by using a free piece of software called VirtualDub (see
http://www.virtualdub.org/).
Chapter 3: Requirements and Analysis
Section 3.1: Objectives and Requirement for the Project
This chapter is mainly about the objectives and requirements of the project. Each of
the three models will be discussed in detail.
The aim of my project is to develop a simulation of a vital part of the immune system
by using a framework and tools. The simulation is based on an existing framework, the
X-machine framework, which was developed by the computational biology research group
in our department. It can easily model the biological systems involved in my project.
Each individual agent plays the role of a molecule or a receptor. With agent-based
modelling, it can handle thousands of these agents operating and communicating with
other agents.
The objective of my project is to convert the two existing Matlab models into
X-machine models. Mark Pogson has used Matlab to model both the intracellular chemical
interaction model and the NF-κB signalling pathway model, and I have already received
them. However, for the third model, there is no existing Matlab model of the two
combined pathways. So this is challenging and needs to be fully tested to see if it
works properly in the X-machine framework.
Clearly, the first requirement is to understand all the Matlab models in detail,
and then I need to sort out the architecture and the method of X-machine modelling.
I also need to understand how to use the Xparser developed by Simon Coakley.
Then I can make my start. After fully understanding a Matlab model, I need to
convert it into an X-machine model, which is represented by an XML file. Then I need
to create an initial state file called 0.xml (based on XML as well) to give the model
the details of its initial agents: Matlab can generate initial agents at the start of
every run, but in the X-machine framework I need to create them myself. Next, I use
the Xparser to parse the XML into C. If there is no problem with compiling, I get a
runnable .exe programme file. Running the programme with an iteration number and the
path of the 0.xml initial state file carries out the whole process, and I get an XML
file for each iteration. Simon Coakley has also developed a visualisation programme
specialised for the X-machine model, which gives us a direct view of the model in 3D
pictures.
After the conversion of the two Matlab models into X-machine models, it is possible
to start the third model. This model is made up by defining each molecule as an agent,
with a set of binding rules for each new kind of molecule and a set of moving rules
for them.
As in Figure 3.1, it is possible to start with two individual models for the NF-κB and
MAPK pathways, then put them together into a single model. However, there is one
important thing: the state numbers for each pathway's molecules should be unique,
so that they won't clash when the pathways are combined. Also, the cross-talk between
the NF-κB pathway and the MAPK pathway needs to be shown in the model, if detailed
data is available for it. I will discuss the combined model further in a following
section and chapter.
Figure 3.1 Process of combination
Section 3.2: Analysis for Intracellular Chemical Interaction Model
Section 3.2.1: Importance and User Requirements
This model is very basic and simple, but everything goes from the basic to the
complex. Many aspects of life involve the interaction of multiple components and
subunits and the corresponding emergence of both form and function. This is true
whether we are dealing with molecules within an individual cell, cells within tissue,
organs within an organism or organisms within a community or ecology. [28] By
sorting out how each kind of molecule interacts with another kind, it is possible to
build a large and complex model with a number of different kinds of molecules or
pathways.
The key feature of agent-based modelling is that it models each molecule as an agent.
Figure 3.2 shows the difference: (a) reaction kinetics differential equations treat
reacting chemicals as well mixed and uniform; (b) the agent-based approach models each
individual molecule [28].
[Figure 3.1 diagram labels: NF-κB, MAPK, Mix, Cross-Talk, Combined Model]
Figure 3.2 Chemical reactions [28]
Agent-based models have greater scope than reaction kinetics differential equation
models, but they need a lot more detail to be defined. For example, the movement of a
single molecule needs to be defined, as do the binding rules of A molecules to B
molecules. Incorrect data may cause a big difference in the result.
Agent-based models have to agree with reaction kinetics differential equation models,
because when an agent-based model has a large number of well-mixed molecules, the
reaction kinetics differential equations can be applied. However, there is not much
information about individual molecular interactions, so it is necessary to take some
data from reaction kinetics for the agent-based model.
Section 3.2.2: Conversion from Matlab
During conversion, one big change needs to be defined first: the state of each
molecule. The X-machine is a special kind of state machine, so when modelling
intracellular actions, each of the molecules is an X-machine, and each of them has a
state. So I need to sort out the possible states of each kind of molecule.
The intracellular chemical interaction model only has two kinds of molecule initially,
so the states are easy to define. There are two states for molecule A, free and bound
with molecule B, and one state for molecule B, free. From the perspective of A, it
receives a message from molecule B and decides whether to bind or not. After A binds
with B, they become a third kind of molecule. When marking the state at this point, we
can let molecule B disappear and change molecule A to the state "bound with B". It is
actually molecule C now, but this representation is easier to compute and display.
The requirement for a bind is important as well. Normally the interaction
boundary depends on the radius of the molecule. It is necessary to define the radius
and interaction boundary for each kind of molecule.
Section 3.2.3: Concentrations Rates
There is a good way to check whether the result is correct: count the number of
molecules. For each bond, molecule A and molecule B each decrease by one unit, and
molecule C increases by one unit, and this should happen in the same time step.
Looking back at Figure 2.3, you can see the concentration changes easily, and the model
will be built on these. The evaluation of this model will be easy as well: if the
concentration change of molecule A over a time step Δt is Δa and that of molecule B is
Δb, then, because the interaction between molecule A and molecule B produces molecule
C, Δa = Δb, as shown in Figure 3.3 [6]:
Figure 3.3: concentration of molecule A, B and C against time [6]
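The bookkeeping check described in this section can be sketched in C as below. The Counts type and the function names are hypothetical; the sketch only assumes what the text states, namely that each bond removes one A and one B and adds one C within the same time step.

```c
#include <assert.h>
#include <stddef.h>

/* Number of molecules of each kind at one iteration (hypothetical type). */
typedef struct { long a, b, c; } Counts;

/* Applies `bonds` binding events of A + B -> C within one time step. */
void apply_bonds(Counts *n, long bonds) {
    assert(n != NULL && bonds >= 0 && bonds <= n->a && bonds <= n->b);
    n->a -= bonds;
    n->b -= bonds;
    n->c += bonds;
}

/* Returns 1 if the change between two iterations is consistent:
   the change in A equals the change in B, and C changes by the
   opposite amount. */
int step_is_consistent(Counts before, Counts after) {
    long da = after.a - before.a;
    long db = after.b - before.b;
    long dc = after.c - before.c;
    return da == db && dc == -da;
}
```

Running step_is_consistent over every pair of consecutive iteration files would flag any time step in which a molecule was lost or created without a matching bond.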
Section 3.3: Analysis for the NF-κB Signalling Pathway Model
Section 3.3.1: Importance and User Requirements
As in the last chapter, we know that the NF-κB signalling pathway is vital to immune
response regulation. Alterations in pathway regulation underlie many diseases,
including atherosclerosis and arthritis. The modelling of individual molecules,
receptors and genes provides a more comprehensive outline of regulatory network
mechanisms than previously possible with equation-based approaches. [28] For this
model, all the data is from single cell experimental analysis by the Academic Unit of
Cell Biology, Division of Genomic Medicine in the University of Sheffield.
A user of this model will be able to change each kind of molecule's moving speed,
radius and initial quantity (concentration). The user should also be able to define
the colour for each kind of molecule. That means that even if the data from the Matlab
code is not correct, the user is able to correct the model as soon as the experiment
has finished. Each kind of molecule is independent of the others: changing one kind's
details won't affect the others, and the model will still give a correct result.
The interaction of NF-κB with IκB should follow the interaction requirement described
in the last section. NF-κB can be seen as molecule A in the last model and IκB can be
seen as molecule B, so when they are bound the NF-κB & IκB complex can be seen as
molecule C. So the concentration change should follow Figure 3.3, but because lots of
other kinds of molecule are involved, the situation will be a lot more complex.
Section 3.3.2: Conversion from Matlab
From the detail in the Matlab code, it is possible to know that: Activation of the
NF-κB pathway is controlled by inhibitors of NF-κB (IκB) proteins, which sequester
the majority of NF-κB in the cytoplasm as complexes by masking their nuclear
localisation signals. During activation, IκB is phosphorylated by IκB kinases (IKK),
causing its degradation. The newly freed NF-κB is consequently transported into the
nucleus, inducing inflammatory genes, including those encoding IκB, thus regulating
the pathway through negative feedback. [28][29][30][31]
Also from the Matlab code, NF-κB, IκBα, IκBβ, IκBε, nuclear importing receptors and
nuclear exporting receptors are modelled as agents. The conversion from Matlab is
complicated. Each kind of molecule has a set of states. However, the numbers of IκBβ
and IκBε in the real cell are tiny, so, at the suggestion of Mark Pogson, it is not
necessary to include these two molecules in the model. Then we can have a look at the
possible states for each kind of molecule:
The NF-κB molecule is the most complicated one in this model; see Figure 3.4 on the
next page for the possible states and transitions of an NF-κB. As you can see, a
single molecule can take all of these states: bound and unbound with different
molecules in the cytoplasm and nucleus, free in the cytoplasm and nucleus, and bound
and unbound with importing and exporting receptors. In more detail, NF-κB should have:
a state of bound with IκBα in the cytoplasm; a state of free in the cytoplasm; a state
of bound with nuclear importing receptors; a state of free in the nucleus; a state of
bound with IκBα in the nucleus; a state of bound with nuclear exporting receptors; a
state of bound with IκBα then bound with nuclear importing receptors; and a state of
bound with IκBα then bound with nuclear exporting receptors.
For IκBα the possible states are: free in the cytoplasm, bound with a nuclear importing
receptor, free in the nucleus and bound with an exporting receptor. IκBα is a lot
simpler than the NF-κB molecule.
For both kinds of nuclear receptors, there are two states: dormant and active. Active
means that something is bound to the receptor; dormant means that it is free and
ready to bind with other kinds of molecules.
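The states listed above can be written down as C enumerations, as in the sketch below. The identifier names and the numeric values the compiler assigns to them are hypothetical and are not the state numbers used in the actual model; only the list of states and the meaning of dormant and active come from the text.

```c
#include <assert.h>

/* The eight NF-kB states listed in the text (identifiers are illustrative). */
typedef enum {
    NFKB_CYTO_BOUND_IKB,       /* bound with IkB in the cytoplasm */
    NFKB_CYTO_FREE,            /* free in the cytoplasm */
    NFKB_BOUND_IMPORTER,       /* bound with a nuclear importing receptor */
    NFKB_NUC_FREE,             /* free in the nucleus */
    NFKB_NUC_BOUND_IKB,        /* bound with IkB in the nucleus */
    NFKB_BOUND_EXPORTER,       /* bound with a nuclear exporting receptor */
    NFKB_IKB_BOUND_IMPORTER,   /* bound with IkB, then with an importing receptor */
    NFKB_IKB_BOUND_EXPORTER,   /* bound with IkB, then with an exporting receptor */
    NFKB_STATE_COUNT           /* number of states, not a state itself */
} NfkbState;

/* Both kinds of nuclear receptor share the same two states. */
typedef enum { RECEPTOR_DORMANT, RECEPTOR_ACTIVE } ReceptorState;

/* Returns 1 if NF-kB is part of a complex with IkB in state s. */
int nfkb_carries_ikb(NfkbState s) {
    assert((int)s >= 0 && s < NFKB_STATE_COUNT);
    switch (s) {
    case NFKB_CYTO_BOUND_IKB:
    case NFKB_NUC_BOUND_IKB:
    case NFKB_IKB_BOUND_IMPORTER:
    case NFKB_IKB_BOUND_EXPORTER:
        return 1;
    default:
        return 0;
    }
}

/* A dormant receptor is free and ready to bind; an active one is occupied. */
int receptor_can_bind(ReceptorState r) {
    return r == RECEPTOR_DORMANT;
}
```

Naming the states in one place makes it harder to reuse a number by accident, which matters later when the state numbers of two pathways have to stay unique.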
Figure 3.4: Possible states and transition of an NF-κB [5]
Section 3.4: Analysis for the NF-κB & MAP Kinase Signalling Pathway Combined Model
This model involves two pathways: the NF-κB signalling pathway and the MAP kinase
signalling pathway. The NF-κB pathway has already been done in the second model, so
the tasks are to build the MAP kinase pathway separately and then combine the two.
As in Figure 3.5 (next page), it is possible to simplify the model from Figure 2.6.
The Ras, SOS and Grb2 molecules can be seen as a single kind, which can be treated
like NF-κB in the last model. Active-Ras can be treated like IκB. However, neither of
them can go inside the nucleus. After they bind, they produce a molecule called MAPK,
which stands in for Raf (MAP KKK), MEK1/2 (MAP KK) and ERK1/2 (MAP K). Raf, MEK1/2 and
ERK1/2 act one after another in a single cascade, so they can be treated as one kind,
MAPK. MAPK is the only one that goes inside the nucleus, where it switches on a gene.
As with NF-κB and IκB, the interaction of Ras_SOS_Grb2, Active-Ras and MAPK should
follow the concentration change in Figure 3.3. Also, the changes of NF-κB and IκB
should not be affected in this model; this is the basis for evaluation.
The cross-talk between these two pathways has not yet been fully discovered. The only
thing we know now is that a molecule called NIK is an important part of the cross-talk.
The next chapter is design, where I will talk about the design of each model in detail.
Figure 3.5: Simplification of the MAP kinase pathway
[Figure 3.5 diagram labels: Ras_SOS_Grb2 and Active-Ras (bound); Raf (MAP KKK), MEK1/2 (MAP KK) and ERK1/2 (MAP K), grouped as MAPK; nuclear membrane; Ap-1]
Chapter 4: Design
Section 4.1: Associated Language with the Project
There are three programming languages associated with my project -- XML, Matlab
and C. It is necessary to get familiar with these languages before the design of the
models.
Section 4.1.1: XML
Firstly, let's have a look at XML. XML, also known as Extensible Markup Language,
is similar to the familiar HTML (Hypertext Markup Language); both are derived from
SGML (Standard Generalized Markup Language). XML is a simple but very flexible
language. XML was actually designed for the challenge of large-scale electronic
publishing [24]. People now also use XML for exchanging data between the Web and
other devices. One example is the RSS (Really Simple Syndication) feed service, which
provides a text file in a common XML format to let users receive the most up-to-date
information such as news, weather and so on.
Compared with HTML, XML is very flexible, because the tags in HTML are predefined,
but in XML you can define the tags yourself. With an XML parser or reader that
understands your own tags, a goal can be achieved with great efficiency.
Section 4.1.2: Matlab
Secondly, we turn to Matlab. Matlab is an interactive mathematical environment and
high-level technical computing language, originally based on the FORTRAN packages
LINPACK and EISPACK, but now based on LAPACK and BLAS [20]. Matlab is a
really useful tool for mathematical modelling; it also has a lot of features [21]:
• High-level language for technical computing
• Development environment for managing code, files, and data
• Interactive tools for iterative exploration, design, and problem solving
• Mathematical functions for linear algebra, statistics, Fourier analysis, filtering,
  optimization, and numerical integration
• 2-D and 3-D graphics functions for visualizing data
• Tools for building custom graphical user interfaces
• Functions for integrating MATLAB based algorithms with external applications
  and languages, such as C, C++, Fortran, Java, COM, and Microsoft Excel
With these features, Matlab is really a powerful tool for computing and mathematical
studies. However, compared with the XML architecture, it is not so suitable for
agent-based modelling when handling the agents and the communication relation
messages. Another reason is that the HPCx supercomputer does not support Matlab. To
get the supercomputer running in parallel with different agents on different CPUs, it
is necessary to convert the existing Matlab coded models into X-machine models.
Section 4.1.3: C
Lastly, we focus on the C programming language. The book The C Programming
Language by Brian Kernighan and Dennis Ritchie gives us an informal specification of
C and some of its history.
The C programming language is a standardized imperative computer programming
language developed in the early 1970s by Ken Thompson and Dennis Ritchie for use
on the UNIX operating system. It has since spread to many other operating systems,
and is one of the most widely used programming languages. C is prized for its
efficiency, and is the most popular programming language for writing system software,
though it is also used for writing applications. It is also commonly used in computer
science education, despite not being designed for novices. [22][23]
C is a language which operates very close to the hardware, and it is closer to
assembly language than other high-level languages are. So C makes it easier for
programmers to control what the programme is doing, which results in more efficiency
than other languages.
C is also very widely supported by compilers and libraries. That is why the Xparser
uses C, as does the visualisation programme for X-machine models. Also, as mentioned
above, the HPCx supercomputer has no problem running C, so C is the best choice for
the post-parsing programming language of X-machine models.
Now that we know all three languages, we are well prepared for the design stage.
Section 4.2: Overall Design
Section 4.2.1: X-machine Framework's Architecture
The X-machine framework is a specialised framework for modelling biological and other
systems based on individual agents. The architecture currently uses an .xml text file
to define the data for each individual agent. As described in the last section, XML is
a transferable description language. It is easy to build an .xml file by writing it
directly in a text editor (low level programming) or by constructing it with a GUI
tool (high level implementation).
There is also another important .xml file which defines all the interaction rules,
the sending and receiving of messages, movements, variables etc. With a parser called
Xparser, this .xml file can be parsed into a C code file, which a compiler then turns
into an executable programme. The programme can run X-machine agents and
implements the global message list communication relation [8].
The model starts when this programme is supplied with an initial .xml text file which
holds the states and other information of every agent. Each iteration of the
programme generates an .xml text file that holds all the changes of the states and
other information such as location, speed etc.
After a number of iterations (which can be defined when the programme starts), there
will be a set of .xml text files. Please note that one iteration is 0.5 second, so 2
iterations are 1 second. These files can then be used in two ways: you could use a
specialised visualisation tool to display the model, or you could use a getdata tool
to extract the needed information and generate a graph with specified x-axis and
y-axis. This will be introduced in detail in the next chapter, implementation and
testing.
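Since every graph against time depends on the 0.5 second step, the conversion can be sketched in C as follows. The constant and function names are hypothetical; the only fact taken from the text is that one iteration represents 0.5 second.

```c
#include <assert.h>

#define SECONDS_PER_ITERATION 0.5  /* stated in the text: one iteration is 0.5 s */

/* Simulated time reached after the given number of iterations. */
double iteration_to_seconds(long iteration) {
    assert(iteration >= 0);
    return iteration * SECONDS_PER_ITERATION;
}

/* Smallest number of iterations that covers at least `seconds` of
   simulated time. */
long iterations_for(double seconds) {
    long n;
    assert(seconds >= 0.0);
    n = (long)(seconds / SECONDS_PER_ITERATION);       /* truncates toward zero */
    if (n * SECONDS_PER_ITERATION < seconds) n += 1;   /* round up if needed */
    return n;
}
```

For example, a run that should cover 1.2 seconds of simulated time needs 3 iterations, and the x-axis value for the second iteration file is 1.0 second.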
Section 4.2.2: Main XML File Structure
The main XML file is the soul of the model. Even though we won't need this main XML
model file when we visualise the model and get the data of the model, if this file is
missing or incorrect, the model won't work or won't work properly.
The main file structure is not simple (please see Figure 4.1). The highest level of
the model main file consists of three parts: defined states, X-machine and messages.
The defined states part is actually a set of comments listing all the states for each
molecule, which helps users understand the model. The messages part shows each kind of
message and its contents.
Figure 4.1: Structure of the main file (a)
[Figure 4.1 diagram labels: Model main file; Defined States, X-machine, Messages]
The most complicated part is X-machine. It also consists of three sub-parts: Memory,
States and Functions (Figure 4.2).
The Memory part is for variables: the user can define all the global variables in this
part with special tags, and it is quite simple to define them.
The States part normally contains three states, input, output and move, which are
linked with the functions in the Functions part. This part normally doesn't need to be
changed for most models.
The Functions part is the core of this file and its most complicated sub-part. It
controls each agent's behaviour. The Outputdata function outputs messages which
contain location, state and bond information. The Inputdata function gets messages
from other agents, processes them and reacts appropriately. The Movements function
controls the movement and locations of agents. It also draws the boundary of the model
structure. Depending on the model, there might be some other necessary functions in
this sub-part.
Figure 4.2: Structure of the main file (b)
[Figure 4.2 diagram labels: X-machine; Memory, States, Functions; Outputdata, Inputdata, Movements]
Section 4.2.3: Iteration XML File Structure
Iteration files are also important for this framework; they have to be simple and
uniform so that they are easier to read and process.
Most commonly, these files start with an iteration number tag to show which iteration
the file belongs to, and the rest of the file lists every agent in the model at this
iteration time with its detailed data. Different models have different types of data
for the agents.
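The layout described above, an iteration number followed by one entry per agent, can be sketched in C as below. The tag names (states, itno, xagent, id, state, x, y) and the Agent fields are assumptions made for this illustration and should not be read as the Xparser's actual schema.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-agent data; a real model would carry more fields. */
typedef struct { int id; int state; double x, y; } Agent;

/* Writes one iteration document into buf. Returns the number of characters
   written (excluding the terminating NUL), or -1 if buf is too small. */
int write_iteration(char *buf, size_t cap, int itno, const Agent *agents, int n) {
    static const char footer[] = "</states>\n";
    size_t used;
    int i, w;
    assert(buf != NULL && cap > 0 && (agents != NULL || n == 0));

    /* Header: the iteration number tag comes first. */
    w = snprintf(buf, cap, "<states>\n<itno>%d</itno>\n", itno);
    if (w < 0 || (size_t)w >= cap) return -1;
    used = (size_t)w;

    /* One element per agent present at this iteration. */
    for (i = 0; i < n; i++) {
        w = snprintf(buf + used, cap - used,
                     "<xagent><id>%d</id><state>%d</state>"
                     "<x>%.2f</x><y>%.2f</y></xagent>\n",
                     agents[i].id, agents[i].state, agents[i].x, agents[i].y);
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }

    if (strlen(footer) >= cap - used) return -1;
    memcpy(buf + used, footer, strlen(footer) + 1);  /* copies the NUL too */
    used += strlen(footer);
    return (int)used;
}
```

Keeping every agent on one line with the same tag order is one way to make the files uniform, so that a visualiser or a data extraction tool can read them with very simple parsing.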
Section 4.3: Design of Chemical Interaction Model
The chemical interaction model is the simplest model in my project. It follows the
reaction A + B → C. Only three kinds of molecules are involved, so the structure of
the model and xml files is simple as well. A Matlab model already exists, and I could
use its information about the molecules and interaction rules for the X-machine
model. This model is actually a conversion.
In Matlab, it is possible to generate the data of a number of molecules randomly when
the programme starts; Matlab is also a good tool for plotting the graph of
concentration against time for the model. But in the X-machine model, these functions
need extra tools. So the design of the main file and iteration file should be quite
simple.
As explained in chapter three, the first step of the conversion is to define the states
for each kind of molecule. Molecule A has two states: 0 - free in box, and 1 - bound
with B (this actually appears as molecule C). Molecule B has only one state: 100 - free
in box. When a molecule B binds to a molecule A it is treated as having disappeared, so
B needs no state for being bound to A.
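The state numbers above can be written down as a small C sketch. The state values come from the text; the identifier names and the two helper functions are only illustrative and are not taken from the project's source.

```c
#include <assert.h>

/* State numbers from the design; the names are illustrative. */
enum chem_state {
    A_FREE  = 0,    /* molecule A, free in box */
    A_BOUND = 1,    /* molecule A bound with B (appears as molecule C) */
    B_FREE  = 100   /* molecule B, free in box */
};

/* Returns 1 if the state number belongs to a molecule A, 0 otherwise. */
int is_molecule_a(int state)
{
    return state == A_FREE || state == A_BOUND;
}

/* Binding changes a free A into a bound A. There is no matching state for B,
   because the bound B is simply removed from the model. */
int bind_a(int state)
{
    assert(state == A_FREE);  /* only a free A can bind */
    return A_BOUND;
}
```

Keeping B's number far away from A's (100 rather than 2) leaves room for more A states without renumbering.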
This model is enclosed in a box but, according to the Matlab code, two coordinate
systems are needed: Cartesian and polar. Cartesian coordinates are mostly used for
locations, and polar coordinates for motion. Using the Cartesian coordinates it is
possible to draw the boundary of the box and to keep each molecule inside it, by
reversing its movement in polar coordinates when it hits the edge of the box.
The Memory part has to contain all the global variables that support the coordinate
systems described above, as well as other necessary variables such as the state number
and the molecule radius.
The states part is simple: only three states need to be set, output, input and move, as
described in section 4.2.1.
The most challenging part is the functions part. Four functions are essential:
outputdata, inputdata, checkbondtries and move. Binding is best handled from the
perspective of each A molecule: by processing the location messages from the B
molecules, an A molecule chooses the best candidate and bonds with it. Bond messages are
exchanged during this binding process. The move function makes sure that all the
molecules move freely inside the box.
The messages part contains two kinds of message. The first is the location message,
which contains a molecule's state, Cartesian coordinates and id number. The second is
the bond message, which carries the sender's id and state, the receiver's state, a
bondunbond tag and a distance value.
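As a rough illustration, the two messages could be represented by C structures like the ones below. Only the list of fields comes from the text; the field names, the struct names and the helper make_bond_message are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Location message: who the sender is and where it is. */
struct location_message {
    int id;          /* sender's id number */
    int state;       /* sender's state number */
    double x, y, z;  /* sender's Cartesian coordinates */
};

/* Bond message: the fields listed in the design. */
struct bond_message {
    int sender_id;
    int sender_state;
    int receiver_state;
    int bondunbond;   /* tag saying what kind of bond action is meant */
    double distance;  /* distance between the two molecules */
};

/* Hypothetical helper: fills a bond message from the sender's location data. */
struct bond_message make_bond_message(const struct location_message *from,
                                      int receiver_state, int tag,
                                      double distance)
{
    struct bond_message m;
    assert(from != NULL);
    m.sender_id = from->id;
    m.sender_state = from->state;
    m.receiver_state = receiver_state;
    m.bondunbond = tag;
    m.distance = distance;
    return m;
}
```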
The iteration files have the structure described in section 4.2.3. The detailed
implementation of this model is given in the next chapter, Implementation and Testing.
Section 4.4: Design of NF-κB Signalling Pathway Model
The NF-κB signalling pathway model is complicated compared with the last one: it
involves four kinds of molecules and sixteen different states. The molecules are NF-κB,
IκBα (IκBβ and IκBε are ignored because their concentrations are low), nuclear
importing receptors and nuclear exporting receptors.
NF-κB can bind with IκBα, nuclear importing receptors and nuclear exporting
receptors. After it has bound with IκBα, the NF-κB & IκBα complex can also bind with
nuclear importing and exporting receptors. So the state numbers designed for NF-κB
are: 0 - free in cytoplasm, 1 - bound to IκBα in cytoplasm, 2 - bound to nuclear
importing receptors, 3 - bound to IκBα and then bound to nuclear importing receptors,
4 - bound to nuclear exporting receptors, 5 - bound to IκBα and then nuclear exporting
receptors, 6 - free in nucleus and 7 - bound to IκBα in nucleus.
As in the chemical interaction model, IκBα behaves like molecule B in that model. When
IκBα binds to NF-κB it does not need to be displayed, so it is eliminated. The
situation is different when IκBα binds to nuclear importing or exporting receptors. The
state numbers have to be unique, so the state numbers designed for IκBα are: 10 - free
in cytoplasm, 11 - bound to nuclear importing receptors, 12 - bound to nuclear exporting
receptors and 13 - free in nucleus.
For nuclear importing and exporting receptors, the states are easy. They don't need to
know which kind of molecule is bound to them, because the state numbers of the above two
kinds of molecules already indicate the type of bond. The only thing they need to record
is whether they are busy or not. The state numbers designed for nuclear importing
receptors are: 20 - dormant and 21 - active. Nuclear exporting receptors are the same
but with different numbers: 30 - dormant and 31 - active.
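The full set of state numbers can be collected in one C enumeration, from which the kind of molecule can be recovered by its range. This is only a sketch: the numbers are those given above, while the identifier names and the helpers species_of and receptor_is_busy are assumptions.

```c
#include <assert.h>

/* State numbers from the design; names are illustrative (ASCII spellings). */
enum nfkb_model_state {
    NFKB_FREE_CYTO = 0, NFKB_IKBA_CYTO = 1,
    NFKB_ON_IMPORT = 2, NFKB_IKBA_ON_IMPORT = 3,
    NFKB_ON_EXPORT = 4, NFKB_IKBA_ON_EXPORT = 5,
    NFKB_FREE_NUC = 6,  NFKB_IKBA_NUC = 7,
    IKBA_FREE_CYTO = 10, IKBA_ON_IMPORT = 11,
    IKBA_ON_EXPORT = 12, IKBA_FREE_NUC = 13,
    IMPORT_DORMANT = 20, IMPORT_ACTIVE = 21,
    EXPORT_DORMANT = 30, EXPORT_ACTIVE = 31
};

enum species {
    SP_NFKB, SP_IKBA, SP_IMPORT_RECEPTOR, SP_EXPORT_RECEPTOR, SP_UNKNOWN
};

/* Because the state numbers are unique, the kind of molecule follows
   directly from the range the number falls in. */
enum species species_of(int state)
{
    if (state >= 0 && state <= 7)   return SP_NFKB;
    if (state >= 10 && state <= 13) return SP_IKBA;
    if (state == 20 || state == 21) return SP_IMPORT_RECEPTOR;
    if (state == 30 || state == 31) return SP_EXPORT_RECEPTOR;
    return SP_UNKNOWN;
}

/* A receptor only records whether it is busy (active) or not (dormant). */
int receptor_is_busy(int state)
{
    enum species sp = species_of(state);
    assert(sp == SP_IMPORT_RECEPTOR || sp == SP_EXPORT_RECEPTOR);
    return state == IMPORT_ACTIVE || state == EXPORT_ACTIVE;
}
```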
Now that the state numbers are designed, the next part is the memory part, which
contains only variable definitions. In the last model the shape was a box, but this
model has the shape of a cell, so the structure and boundaries are more complex.
However, the coordinate systems are the same as in the last model, Cartesian and polar,
and each serves the same purpose: Cartesian coordinates take care of locations and
polar coordinates control the movement of molecules. The boundaries are drawn by the
molecules: with the co-operation of both coordinate systems, nuclear receptors lie on
the nuclear membrane and other molecules move in the region they belong to, e.g. inside
the cytoplasm or the nucleus. If any molecule is about to cross a boundary, its movement
can be reversed to pull it back.
The states part is exactly the same as in the last model. However, the functions part
is not.
Because this model involves nuclear receptors, some new sets of rules are needed in this
part. Another point to note is that, for each bond with a nuclear receptor, there should
be a delay before the molecule is unbound and released into a new region. For example,
when NF-κB binds to a nuclear importing receptor, after a while its state should change
to NF-κB moving freely inside the nucleus. The last function, move, is responsible for
drawing the boundary: it should let nuclear receptors move only on the nuclear membrane
and keep the other molecules moving in the right regions.
The messages part is also the same as in the last model: it contains the location
message and the bond message. The location message contains a molecule's state,
Cartesian coordinates and id number; the bond message contains the sender's id and
state, the receiver's state, a bondunbond tag and a distance value.
The iteration files have the same structure, but they now contain more kinds of
molecules. In the next chapter, I will follow this design and discuss the implementation
of the NF-κB signalling pathway model in detail.
Section 4.5: Design of NF-κB & MAP Kinase Signalling Pathway Combined Model
This model's structure is almost the same as that of the NF-κB model, but a new
pathway, the MAP kinase pathway, is added. Because there is no existing Matlab model for
this one, the relationship and cross-talk between the two pathways is important.
However, according to the Academic Unit of Cell Biology, Division of Genomic Medicine at
the University of Sheffield, the only known relationship between the two pathways is
that shown in Figure 4.3; the exact cross-talk between the NF-κB and MAP kinase
pathways has not yet been worked out. So the design of this model actually adds a new
pathway to the last model. Figure 4.3 shows that molecules from outside the cell pass
through the toll receptor; some of them then enter the NF-κB pathway and the others
enter the MAP kinase pathway, with a given probability. The model itself only covers
intracellular behaviour, so the initial molecules are assigned in the first iteration
file.
Figure 4.3 NF-κB & MAP Kinase Signalling Pathway Relation [32]
Some more states need to be added. From Figure 3.4 in the last chapter, it is possible
to treat Ras, SOS and Grb2 as a single molecule that behaves like molecule A in the
first model, while Active-Ras behaves like molecule B in the first model.
Chapter 5: Implementation and Testing
This chapter describes in detail how the three models in my project were implemented.
Once the models are finished it is important to test and evaluate them, so the testing
method is also described in this chapter.
Section 5.1: Implementation of Three Models
This section covers the implementation of the models in my project. Following the design
from the last chapter, the implementation process of all three models is explained. The
first two models are actually converted from two Matlab models but, because of the
differences between Matlab and the X-machine framework, the best way to convert them is
to take the ideas and algorithms from the Matlab code and then write them directly in
XML, following the XML specification of the X-machine framework.
Section 5.1.1: Implementation of Chemical Interaction Model
In Matlab, the chemical interaction model works as follows: firstly, it defines the
constants needed, such as the time step, box length and speed range; then, it generates
the initial molecule positions and plots them immediately; thirdly, it creates the
initial direction vectors; fourthly, it loops over each molecule A to look for a
suitable molecule B to bind; fifthly, it controls the movement of each molecule and
keeps it inside the box; lastly, it draws a graph of concentration against time.
As noted in the last chapter, the X-machine framework doesn't have its own tools for
generating initial values or drawing graphs, so these need external tools, but they are
not difficult to achieve.
This model starts with the main .xml file. The states for each molecule were defined in
the last chapter, so next we need to define constants and variables. The box length can
be defined as a constant 3000 (in units of 10^-10 m) and used later. The integer
variables are the id number and state number; the doubles are x, y and z for Cartesian
coordinates, postheta, posphi and posr for polar coordinates, movetheta, movephi and
mover for movements in polar coordinates, and iradius for the radius of the molecule.
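For illustration, the memory described above could look like the following C structure. The variable names and the box length are those given in the text; the struct name, the helper inside_box and the assumption that the box spans 0 to BOX_LENGTH on each axis are mine.

```c
#include <assert.h>
#include <stddef.h>

#define BOX_LENGTH 3000.0  /* box length from the text, in units of 1e-10 m */

/* Memory of one molecule, using the variable names listed in the text. */
struct molecule_memory {
    int id;                            /* id number */
    int state;                         /* state number */
    double x, y, z;                    /* Cartesian position */
    double postheta, posphi, posr;     /* polar position */
    double movetheta, movephi, mover;  /* movement in polar form */
    double iradius;                    /* interaction radius */
};

/* Returns 1 if the Cartesian position lies inside the box.
   Assumption: the box spans [0, BOX_LENGTH] on every axis. */
int inside_box(const struct molecule_memory *m)
{
    assert(m != NULL);
    return m->x >= 0.0 && m->x <= BOX_LENGTH &&
           m->y >= 0.0 && m->y <= BOX_LENGTH &&
           m->z >= 0.0 && m->z <= BOX_LENGTH;
}
```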
Next is the states part, where three states are defined: output, input and move. The
output state is associated with the Outputdata function and points to the input state.
The input state is associated with the Inputdata function and has the move state as its
destination. The move state is linked with the Move function and has the output state as
its next state. So the three states are linked together in a closed ring (see Figure
5.1):
Figure 5.1 States and relations in the X-machine
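The closed ring of states can be sketched as a tiny transition function in C. The three states and their order come from the text; the enum and function names are assumptions, and in the real framework the transitions are declared in the XML file rather than coded by hand.

```c
#include <assert.h>

enum xm_state { STATE_OUTPUT, STATE_INPUT, STATE_MOVE };

/* Returns the state that follows `s` in the ring
   output -> input -> move -> output. */
enum xm_state next_state(enum xm_state s)
{
    switch (s) {
    case STATE_OUTPUT: return STATE_INPUT;   /* after Outputdata */
    case STATE_INPUT:  return STATE_MOVE;    /* after Inputdata */
    case STATE_MOVE:   return STATE_OUTPUT;  /* after Move */
    }
    assert(0 && "unknown state");
    return STATE_OUTPUT;
}
```

Starting from the initial output state, three transitions bring an agent back to where it started, which corresponds to one iteration of the model.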
The next part is the functions part. As noted in the design chapter, this is the most
complicated part; all the code written between the xml tags is actually C code, mostly
if else, while and other simple C constructs.
The first function is Outputdata. This function sends out a location message using a
method called add_location_message. The message carries the molecule's id, state, x, y
and z to the other molecules so that they can make their decisions. This function is
quite short.
The second function is Inputdata. It first defines some local variables for processing
the location messages. With a while loop, each molecule A then reads all the location
messages and processes them one by one. For each message, it first checks whether the
message comes from the molecule itself, and it also discards messages from other A
molecules. It then checks whether the distance between the two molecules, left squared
(no square root is taken), is less than the molecule's radius squared (radius2). If all
these requirements are met, a bond message is sent, containing the source molecule's id
and state, the destination molecule's state, the distance between them and the integer
3 as the bindunbind tag (3 means a try for a bond).
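The checks described above can be sketched in C as follows. This is not the project's code: the struct and function names are hypothetical, an array stands in for the framework's message list, and it is assumed that the bond message is sent in the name of the A molecule doing the search and that the stored distance is the squared one.

```c
#include <assert.h>
#include <stddef.h>

#define STATE_B_FREE 100  /* molecule B, free in box */
#define TAG_TRY      3    /* bindunbind value meaning "try for a bond" */

struct location_message { int id; int state; double x, y, z; };
struct bond_message {
    int sender_id, sender_state, receiver_state, bindunbind;
    double distance;
};

/* Run on behalf of one molecule A (`self`): scan the location messages and
   write one bond-try message per free B within reach. Returns how many
   messages were written to `out` (at most `out_cap`). */
size_t find_bond_tries(const struct location_message *self, double radius,
                       const struct location_message *msgs, size_t n,
                       struct bond_message *out, size_t out_cap)
{
    size_t count = 0;
    double radius2 = radius * radius;

    assert(self != NULL && msgs != NULL && out != NULL);
    for (size_t i = 0; i < n && count < out_cap; i++) {
        const struct location_message *m = &msgs[i];
        double dx = m->x - self->x;
        double dy = m->y - self->y;
        double dz = m->z - self->z;
        double dist2 = dx * dx + dy * dy + dz * dz;  /* squared, no sqrt */

        if (m->id == self->id) continue;         /* message from itself */
        if (m->state != STATE_B_FREE) continue;  /* skip other A molecules */
        if (dist2 >= radius2) continue;          /* too far away */

        out[count].sender_id = self->id;
        out[count].sender_state = self->state;
        out[count].receiver_state = m->state;
        out[count].bindunbind = TAG_TRY;
        out[count].distance = dist2;
        count++;
    }
    return count;
}
```

Comparing squared values avoids a square root for every pair of molecules, which matters because every A molecule reads every location message in each iteration.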
The third function is called checkbondtries. It processes the bond messages with 3 on
the bindunbind tag and decides whether the bond should be made. Once the ids of both
molecules involved have been checked, a new bond message is sent with the same
information but with 0 on the bindunbind tag (0 means bind) and 0.0 for the distance,
since the distance is no longer useful after they bind. The molecule B is also freed
from memory (it disappears) by a return 1 inside an if else statement.
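A minimal sketch of this decision step is given below. It assumes that the "best" candidate is the closest one, and it uses hypothetical names throughout; here the return value only reports whether a bond was confirmed, whereas in the model itself a return value of 1 is what removes the B molecule.

```c
#include <assert.h>
#include <stddef.h>

#define TAG_BIND 0  /* bindunbind value meaning "bind" */
#define TAG_TRY  3  /* bindunbind value meaning "try for a bond" */

struct bond_message {
    int sender_id, sender_state, receiver_state, bindunbind;
    double distance;
};

/* Looks at the bond tries among `msgs` and confirms the closest one.
   On success writes the confirmation to *confirmed (tag 0, distance 0.0)
   and returns 1; returns 0 if there was no bond try at all. */
int confirm_closest_try(const struct bond_message *msgs, size_t n,
                        struct bond_message *confirmed)
{
    const struct bond_message *best = NULL;

    assert(confirmed != NULL);
    for (size_t i = 0; i < n; i++) {
        if (msgs[i].bindunbind != TAG_TRY) continue;  /* not a bond try */
        if (best == NULL || msgs[i].distance < best->distance)
            best = &msgs[i];
    }
    if (best == NULL) return 0;

    *confirmed = *best;
    confirmed->bindunbind = TAG_BIND;
    confirmed->distance = 0.0;  /* no longer useful once bound */
    return 1;
}
```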
The fourth and last function in this model is move. This function processes the bond
messages with 0 on the bindunbind tag and makes the bond, which means that the
molecule's state is changed here as well (from 0 to 1 in this model). The rest of the
function controls the movement of the molecules: using Cartesian coordinates for
location and polar coordinates for movement, all the molecules are restricted to the
inside of the box. The movement follows Brownian motion: molecules move freely inside
the box within the defined speed range.
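The idea of reversing a movement at the edge of the box can be sketched for a single axis as follows. This is a simplification under stated assumptions: the step is taken to be already converted from polar to Cartesian form, the box is assumed to span 0 to BOX_LENGTH, and the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define BOX_LENGTH 3000.0  /* box length from the text, in units of 1e-10 m */

/* Advances one coordinate by *step. If the new value would leave
   [0, BOX_LENGTH], the step is reversed and applied in the opposite
   direction instead. Returns the new coordinate. */
double step_axis(double pos, double *step)
{
    double next;

    assert(step != NULL);
    assert(pos >= 0.0 && pos <= BOX_LENGTH);
    next = pos + *step;
    if (next < 0.0 || next > BOX_LENGTH) {
        *step = -*step;      /* reverse the movement */
        next = pos + *step;  /* move away from the wall */
    }
    return next;
}
```

Calling step_axis once for each of x, y and z keeps a molecule inside the box as long as a single step is shorter than the box length.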
The last part is the messages part. As in the design chapter, two kinds of message are
defined: the location message and the bond message.
That is all for the main .xml file. Now we need to make an initial iteration .xml file
and a visualisation programme. The programme that creates the initial iteration file and
the visualisation programme are part of the X-machine framework; Simon Coakley has
already written some examples, which only need to be changed to suit this model.
The programme that creates the initial iteration file is written in C. Firstly, some
variables are defined: the initial numbers of molecules and their moving speeds. The
main part consists of for loops, one for each type of molecule, each generating the
assigned number of molecules with random coordinates and speeds.
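The generation loop can be sketched as below. This is not the framework's generator: the struct, the function names and the choice of a uniform distribution are assumptions, and writing the result out as an iteration .xml file is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BOX_LENGTH 3000.0  /* box length from the text, in units of 1e-10 m */

struct molecule { int id; int state; double x, y, z; double speed; };

/* Uniform random number in [lo, hi], based on the standard rand(). */
static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* Fills `out` with `count` molecules of one type (one state number), with
   random positions in the box and random speeds in [min_speed, max_speed].
   Ids start at `first_id`; the next unused id is returned. */
int generate_type(struct molecule *out, int first_id, int count, int state,
                  double min_speed, double max_speed)
{
    assert(out != NULL && count >= 0);
    for (int i = 0; i < count; i++) {
        out[i].id = first_id + i;
        out[i].state = state;
        out[i].x = uniform(0.0, BOX_LENGTH);
        out[i].y = uniform(0.0, BOX_LENGTH);
        out[i].z = uniform(0.0, BOX_LENGTH);
        out[i].speed = uniform(min_speed, max_speed);
    }
    return first_id + count;
}
```

One call per molecule type plays the role of one for loop in the description above, for example a call with state 0 for the A molecules followed by a call with state 100 for the B molecules.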
The visualisation programme is also written in C and uses some OpenGL libraries. It
reads each of the iteration files and displays each type of molecule in a different
colour; as the iteration file changes, the molecules can be displayed as moving objects.
It can also rotate the view, save each iteration display as an image, etc. Some images
of the model visualisation are shown in Figure 5.2:
Figure 5.2 Visualisation of Chemical Interaction Model
Section 5.1.2: Implementation of the NF-κB Signalling Pathway Model
The Matlab model of the NF-κB signalling pathway is similar to the chemical interaction
model, but it is more complex and involves many more molecules and receptors. The model
works as follows: firstly, it also defines the constants needed; then, it assigns
initial positions in spherical polar coordinates and converts them into Cartesian
coordinates; thirdly, it uses a big while loop to perform the interactions according to
defined sets of rules; lastly, the results are drawn on a graph of concentration against
time.
As the state numbers have been defined, the next step is to define some constants and
variables. The only constant defined here is the receptor delay, with a value of 10. The
variables are exactly the same as in the chemical interaction model: id number, state
number, x, y, z, postheta, posphi, posr, movetheta, movephi and iradius.
The states part is the same as in the last model: three states are defined, output,
input and move. Their relationship follows Figure 5.1.
The functions part is where the difference between the NF-κB pathway model and the
chemical interaction model is easy to see. As in the last model, there are four
functions in this part: Outputdata, Inputdata, Checkbondtries and Move.
The first function is Outputdata. This time the function is not as simple as in the
last model; it has two sub-parts. The first sends out the location message, which
carries the molecule's id, state, x, y and z to the other molecules so that they can
make their decisions. The second sub-part is for nuclear receptors. It checks the state
and, if the molecule is an active nuclear receptor, decreases its receptor delay
counter. Once the counter reaches zero, it sends out a bond message to unbind the
molecule that was bound to it. The bond message contains both molecules' states and 1 on
the bindunbind tag (1 corresponds to unbind). The active nuclear receptor then releases
the molecule that was bound to it and the receptor's state changes to dormant.
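The receptor delay can be sketched as a small countdown in C. The delay value of 10 and the state numbers come from the text; the struct, the function names and the idea of signalling the unbind message through a return value are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define RECEPTOR_DELAY 10  /* receptor delay constant from the text */

enum {
    IMPORT_DORMANT = 20, IMPORT_ACTIVE = 21,
    EXPORT_DORMANT = 30, EXPORT_ACTIVE = 31
};

struct receptor { int state; int delay; };

/* A molecule binds: the dormant receptor becomes active and the
   delay counter is started. */
void receptor_bind(struct receptor *r)
{
    assert(r != NULL);
    assert(r->state == IMPORT_DORMANT || r->state == EXPORT_DORMANT);
    r->state += 1;  /* 20 -> 21 or 30 -> 31 */
    r->delay = RECEPTOR_DELAY;
}

/* One Outputdata step for a receptor. Returns 1 when the counter reaches
   zero, i.e. when the unbind message (bindunbind = 1) should be sent and
   the receptor goes dormant again; returns 0 otherwise. */
int receptor_tick(struct receptor *r)
{
    assert(r != NULL);
    if (r->state != IMPORT_ACTIVE && r->state != EXPORT_ACTIVE)
        return 0;   /* dormant receptors do nothing */
    r->delay--;
    if (r->delay > 0)
        return 0;   /* still holding the molecule */
    r->state -= 1;  /* 21 -> 20 or 31 -> 30 */
    return 1;
}
```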
The second function is Inputdata. It starts by setting some local variables for use
within the function and then checks the bond messages one by one with a while loop. If a
bond message has a bindunbind tag of 1, the state of the associated molecule is changed
to place it in the appropriate region. There are six situations:
1. If NF-κB is bound to an importing nuclear receptor, then make it free in the nucleus;
2. If NF-κB is bound to an exporting nuclear receptor, then make it free in the cytoplasm;
3. If NF-κB & IκBα is bound to an importing nuclear receptor, then make it free in the nucleus;
4. If NF-κB & IκBα is bound to an exporting nuclear receptor, then make it free in the cytoplasm;
5. If IκBα is bound to an importing nuclear receptor, then make it free in the nucleus;
6. If IκBα is bound to an exporting nuclear receptor, then make it free in the cytoplasm.
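Using the state numbers from the design chapter, the six situations above reduce to a lookup from a bound state to a released state. The sketch below assumes that the released NF-κB & IκBα complex is represented by NF-κB states 7 (in the nucleus) and 1 (in the cytoplasm); the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Maps the state of a molecule bound to a nuclear receptor to the state it
   takes once the receptor releases it. Other states are returned unchanged. */
int released_state(int bound_state)
{
    switch (bound_state) {
    case 2:  return 6;   /* NF-kB on importing receptor -> free in nucleus */
    case 4:  return 0;   /* NF-kB on exporting receptor -> free in cytoplasm */
    case 3:  return 7;   /* NF-kB & IkBa on importing -> complex in nucleus */
    case 5:  return 1;   /* NF-kB & IkBa on exporting -> complex in cytoplasm */
    case 11: return 13;  /* IkBa on importing receptor -> free in nucleus */
    case 12: return 10;  /* IkBa on exporting receptor -> free in cytoplasm */
    default: return bound_state;
    }
}

/* Applies an unbind message (bindunbind = 1) to a molecule's state.
   Returns 1 if the state changed, 0 otherwise. */
int apply_unbind(int *state)
{
    int before;

    assert(state != NULL);
    before = *state;
    *state = released_state(before);
    return *state != before;
}
```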
The Inputdata function then reads the location messages for each molecule. It first
checks whether the location message was sent by the molecule itself; if not, it checks
the distance between the molecule and the message sender. If the distance is less than
radius2, it checks whether their states match any of the following four situations (s:
sender, r: receiver of the location message):
1. r: NF-κB free in cytoplasm and s: IκBα free in cytoplasm;
2. r: NF-κB free in nucleus and s: IκBα free in nucleus;
3. r: dormant nuclear importing receptor and s: (NF-κB free in cytoplasm, IκBα free in cytoplasm or NF-κB & IκBα free in cytoplasm);
4. r: dormant nuclear exporting receptor and s: (NF-κB free in nucleus, IκBα free in nucleus or NF-κB & IκBα free in nucleus).
If any of the four situations is matched, a bond message with a bindunbind tag of 3 (a
try for a bond) is sent.
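The four situations can be written as a single predicate over the two state numbers. The sketch below uses the state numbers from the design chapter and assumes that the free NF-κB & IκBα complex is represented by states 1 (cytoplasm) and 7 (nucleus); the names, and the separate within_reach helper, are illustrative only.

```c
#include <assert.h>

enum {
    NFKB_FREE_CYTO = 0, NFKB_IKBA_CYTO = 1,
    NFKB_FREE_NUC = 6,  NFKB_IKBA_NUC = 7,
    IKBA_FREE_CYTO = 10, IKBA_FREE_NUC = 13,
    IMPORT_DORMANT = 20, EXPORT_DORMANT = 30
};

/* Returns 1 if the receiver state r and the sender state s match one of
   the four situations in which a bond try is sent, 0 otherwise. */
int can_try_bond(int r, int s)
{
    switch (r) {
    case NFKB_FREE_CYTO:
        return s == IKBA_FREE_CYTO;
    case NFKB_FREE_NUC:
        return s == IKBA_FREE_NUC;
    case IMPORT_DORMANT:
        return s == NFKB_FREE_CYTO || s == IKBA_FREE_CYTO ||
               s == NFKB_IKBA_CYTO;
    case EXPORT_DORMANT:
        return s == NFKB_FREE_NUC || s == IKBA_FREE_NUC ||
               s == NFKB_IKBA_NUC;
    default:
        return 0;
    }
}

/* Returns 1 if the squared distance given by (dx, dy, dz) is below the
   squared radius, so no square root is needed. */
int within_reach(double dx, double dy, double dz, double radius)
{
    assert(radius >= 0.0);
    return dx * dx + dy * dy + dz * dz < radius * radius;
}
```

A bond try would be sent only when both within_reach and can_try_bond hold for a pair of molecules.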
The third function is Checkbondtries. This function only processes bond messages with 3
on the bindunbind tag. When it receives such a message, it checks whether the sender is
the closest in distance; if it is, the two molecules are bound to each other. Firstly it
sends a bond message with 0 on the bindunbind tag, meaning bind. And then change