Xingdong Bian- X-Machine Model of a Biological System


8/3/2019 Xingdong Bian- X-Machine Model of a Biological System


X-Machine Model of a Biological System

Third year undergraduate dissertation project

Final Dissertation

Department of Computer Science

University of Sheffield

Author: Xingdong Bian

Supervisor: Prof. Mike Holcombe

Module code: COM3021

Date: 29/03/2006

This report is submitted in partial fulfilment of the requirement for the degree of

Bachelor of Science with Honours in Computer Science by Xingdong Bian.



Signed declaration:

All sentences or passages quoted in this dissertation from other people's work have been specifically acknowledged by clear cross-referencing to author, work and page(s).

Any illustrations which are not the work of the author of this dissertation have been

used with the explicit permission of the originator and are specifically acknowledged. I

understand that failure to do this amounts to plagiarism and will be considered grounds

for failure in this dissertation and the degree examination as a whole.

Name: XINGDONG BIAN

Signature:

Date: 02/05/2006



Abstract:

This project is in the field of computational biology. It uses computer simulation models to display the spatial and temporal aspects of biological systems in detail.

The aim of the project is to develop a simulation of a vital part of the immune system using the X-machine framework and tools such as the Xparser and XML. Existing models written in Matlab are converted into XML, and the Xparser then parses the XML into a runnable programme in C source code.

Three models are involved in this project: a chemical interaction model, an NF-kB signalling pathway model, and a combined NF-kB and MAP kinase signalling model. The first two have existing Matlab models to be converted; the third requires some research in order to add a new pathway to the NF-kB model.



Acknowledgments

I would like to thank everyone who helped me with this project, especially my supervisor, Prof. Mike Holcombe, for leading me in the right direction and for his many ideas and much advice. I also thank Mr. Simon Coakley for his help with the XML specification, the Xparser and visualisation, and Mr. Mark Pogson for his help with the Matlab example models. Lastly, I thank Prof. Eva Qwarnstrom for her help with biological knowledge and experimental data.



Contents

Title
Signed declaration
Abstract
Acknowledgments
Contents
Figure List
Chapter 1 Introduction
Section 1.1 Background
Section 1.2 About the Project
Section 1.2.1 Agent-Based Modelling
Section 1.2.2 X-machine
Section 1.2.3 HPCx
Section 1.3 About This Dissertation
Chapter 2 Literature Review
Section 2.1 Overview
Section 2.2 Agent-Based Intracellular Chemical Interactions Model
Section 2.3 Agent-Based NF-κB Signalling Pathway Model
Section 2.4 NF-κB Signalling Pathway and MAP Kinase Signal Pathway Combined Model
Section 2.5 Some Agent-Based Modelling Approaches
Section 2.5.1 Swarm Agent-Based Modelling
Section 2.5.2 MASON Multi-Agent Simulations
Section 2.5.3 X-machine Framework and XML
Chapter 3 Requirements and Analysis
Section 3.1 Objectives and Requirement for the Project
Section 3.2 Analysis for Intracellular Chemical Interaction Model
Section 3.2.1 Importance and User Requirements
Section 3.2.2 Conversion from Matlab
Section 3.2.3 Concentrations Rates
Section 3.3 Analysis for the NF-κB Signalling Pathway Model
Section 3.3.1 Importance and User Requirements
Section 3.3.2 Conversion from Matlab
Section 3.4 Analysis for the NF-κB & MAP Kinase Signalling Pathway Combined Model
Chapter 4 Design
Section 4.1 Associated Language with the Project
Section 4.1.1 XML
Section 4.1.2 Matlab
Section 4.1.3 C
Section 4.2 Overall Design
Section 4.2.1 X-machine Frameworks Architecture
Section 4.2.2 Main XML File Structure
Section 4.2.3 Iteration XML File Structure
Section 4.3 Design of Chemical Interaction Model
Section 4.4 Design of NF-κB Signalling Pathway Model
Section 4.5 Design of NF-κB & MAP Kinase Signalling Pathway Combined Model
Chapter 5 Implementation and Testing
Section 5.1 Implementation of Three Models
Section 5.1.1 Implementation of Chemical Interaction Model
Section 5.1.2 Implementation of the NF-κB Signalling Pathway Model
Section 5.1.3 Implementation of NF-κB & MAP Kinase Signal Pathway Combined Model
Section 5.2 Testing Methods
Section 5.2.1 Unix Tool for Single Iteration Testing
Section 5.2.2 Getdata Programme for Whole Iteration Files Testing
Chapter 6 Results and Discussion
Section 6.1 Results and Discussion of Chemical Interaction Model
Section 6.2 Result and Discussion of NF-κB Pathway model
Section 6.3 Result and Discussion of NF-κB & MAP kinase pathways combined model
Chapter 7 Conclusions
Section 7.1 Summary of the Dissertation and Project
Section 7.2 Future Work of this Project
References
Appendices



Figure List

Figure 2.1 Transmembrane Signalling Biomechanical and Soluble Mediators
Figure 2.2 Chemical Interaction Model Visualisation (Matlab)
Figure 2.3 Chemical Interaction Model Results (Matlab)
Figure 2.4 NF-κB Pathway Model Visualisation (Matlab)
Figure 2.5 NF-κB Pathway Model Results (Matlab)
Figure 2.6 Summary of MAP kinase pathway
Figure 3.1 Process of combination
Figure 3.2 Chemical reactions
Figure 3.3 Concentration of molecule A, B and C against time
Figure 3.4 Possible states and transition of an NF-κB
Figure 3.5 Simplify of the MAP Kinase pathway
Figure 4.1 Structure of the Main file (a)
Figure 4.2 Structure of the Main file (b)
Figure 4.3 NF-κB & MAP Kinase Signalling Pathway Relation
Figure 5.1 States and relations in X-machine
Figure 5.2 Visualisation of Chemical Interaction Model
Figure 5.3 Visualisation of NF-κB signalling pathway model
Figure 5.4 Visualisation of NF-κB MAP kinase combined model
Figure 5.5 Concentration against Iterations (time steps) graph
Figure 6.1 Chemical interaction agent model graph one
Figure 6.2 Chemical interaction agent model graph two
Figure 6.3 Visualisation for chemical interaction model
Figure 6.4 NF-κB pathway agent model result (a)
Figure 6.5 NF-κB pathway agent model result (b)
Figure 6.6 Result of the combined model


Chapter 1: Introduction


Section 1.1: Background

This project is in the field of computational biology, an interdisciplinary field that joins computer technology and biology. Computational biology has emerged only in recent years. The field is located at the interface between the two scientific and technological disciplines that can be argued to drive a significant, if not the dominating, part of contemporary scientific innovation [1].

As more is discovered in biology about the structure, organisation and behaviour of cells, tissues, organisms and communities of biological systems, more understanding, and possibly simulation, is needed. Computer technology can meet this need and can provide predictions for important aspects of a biological system's behaviour. Computer technology has given vitality to biological research. A famous example is the Human Genome Project, which has generated an extraordinary amount of data. Biologists are now faced with the challenge of extracting meaning from linear sequences composed of billions of base pairs. The work of computational biologists is indispensable for this task and for many other biological problems that lend themselves to computational solutions [2]. This is why the field of computational biology has developed so dramatically: more and more people in both areas are starting to work together to find the best solutions for their research.

There are currently ten major research areas in computational biology: sequence analysis, computational evolutionary biology, gene expression analysis, regulation analysis, protein expression analysis, analysis of mutations in cancer, structure prediction, measuring biodiversity, modelling biological systems, and high-throughput image analysis. My project is in the ninth area listed above, modelling biological systems. This area involves the use of computer simulations of cellular subsystems to capture both the spatial and temporal aspects of the complex connections between these cellular processes.

Biological computer modelling is the use of a computer programme that tries to simulate an abstract model of a particular biological system. Biological computer simulation is a subset of computer simulation. Computer simulation is a useful part of modelling many natural systems, as it gives insight into the operation of the natural systems being modelled. Before computer simulation, people used mathematical models alone; with computer simulation, modelling entered a new stage.

Here is a brief history of computer simulation (quoted from the Wikipedia article "Computer Simulation", which is licensed under the GNU Free Documentation License, http://www.gnu.org/copyleft/fdl.html):



Computer simulation was developed hand-in-hand with the rapid growth of the

computer, following its first large-scale deployment during the Manhattan Project in

World War II to model the process of nuclear detonation. It was a simulation of 12 hard

spheres using a Monte Carlo algorithm. Computer simulation is often used as an

adjunct to, or substitution for, modelling systems for which simple closed form analytic solutions are not possible. There are many different types of computer

simulation; the common feature they all share is the attempt to generate a sample of

representative scenarios for a model in which a complete enumeration of all possible

states of the model would be prohibitive or impossible. Computer models were

initially used as a supplement for other arguments, but their use later became rather

widespread. The physicist Richard Feynman, was not fond of such models and once

called them "a disease"[3].

Section 1.2: About the Project

The aim of my project is to develop a simulation of a vital part of the immune system using an existing framework and tools. The framework was developed by the Computational Biology Research Group in our department. It can model different kinds of biological systems, which are defined in terms of individual agents that play the roles of different biological entities such as molecules and receptors. The simulations the group has built can handle thousands of these agents operating and communicating with one another. This approach is called agent-based modelling.

Section 1.2.1: Agent-Based Modelling

Agent-based modelling was developed to deal with the complexities of such systems and to extend the capabilities of previous chemical modelling attempts [4][5]. It can provide a better understanding of how cellular reactions operate in both their spatial and temporal aspects.

Agent-based modelling (also known as individual-based modelling) treats each individual component of a system as a single entity (or agent) obeying its own pre-defined rules and reacting to its environment and neighbouring agents accordingly [4][6]. An agent is therefore a natural way to represent a component of a system.

Agents can be represented by various computational models; the approach chosen here is the X-machine, which provides an intuitive and precise method to model the functional behaviour of systems in a flexible and modular manner [5]. A single stream X-machine is used to describe each individual agent, and communication channels are identified between machines to deal with agent interactions [7]. For modelling complex systems the X-machine has an essential feature: a model can be developed directly by adding new agents to the system, which makes the modelling process extensible.

Section 1.2.2: X-machine

We use the X-machine because of its particular properties. X-machines are similar to finite state machines, which are models of behaviour based on states and transitions, but X-machines have an additional feature: memory. Transitions between states can read the memory and modify it [9]. Memory gives the X-machine an important and novel capability. The memory of an X-machine holds the agent's physical location, so the number of states required to model the system is manageably small.
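The idea of a state machine extended with memory can be illustrated with a minimal sketch. The Python code below is not the X-machine framework or the output of the Xparser; the class name `XMachineAgent`, the two states `free` and `bound`, and the `move` and `bind` functions are hypothetical names chosen only for illustration. It shows that the position lives in memory rather than in the set of states, so two states are enough however many positions an agent can occupy.

```python
from dataclasses import dataclass, field


@dataclass
class XMachineAgent:
    """Hypothetical agent: a small set of states plus a memory record."""
    state: str = "free"
    memory: dict = field(default_factory=lambda: {"position": (0.0, 0.0, 0.0)})


def move(agent, displacement):
    """Transition free -> free that only modifies memory (the position)."""
    if agent.state != "free":
        return False  # transition is not enabled in this state
    x, y, z = agent.memory["position"]
    dx, dy, dz = displacement
    agent.memory["position"] = (x + dx, y + dy, z + dz)
    return True


def bind(agent, partner_id):
    """Transition free -> bound that records the partner in memory."""
    if agent.state != "free":
        return False
    agent.memory["partner"] = partner_id
    agent.state = "bound"
    return True


agent = XMachineAgent()
move(agent, (1.0, 0.0, 2.0))
bind(agent, partner_id=7)
# A bound agent can no longer move in this sketch, so this call is rejected.
moved_after_binding = move(agent, (1.0, 1.0, 1.0))
print(agent.state, agent.memory["position"], moved_after_binding)
```

Keeping the location in memory is the design choice that matters here: a plain finite state machine would need a separate state for every combination of behaviour and position.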

The framework is used as follows: the model is written in XML following the X-machine specification, and the Xparser (also built by the Computational Biology Research Group in our department) then produces a programme in C from the X-machine XML specification. Running the programme simulates the agents' behaviour, and the simulation can also be visualised using the special visualisation C programme built for the model.

The framework is based on XML rather than on writing C code directly because XML is a simple and flexible text format derived from SGML. It shows every state of each agent clearly and is much simpler to write than C. Once the XML has been written, the Xparser parses it into C code easily.

Section 1.2.3: HPCx

The Computational Biology Research Group has already modelled the vital part of the immune system in Matlab. My task is to convert the model into the X-machine framework, whose output is compiled with a C compiler.

The reason is that the supercomputer HPCx cannot run Matlab but can run C. To have this supercomputer run our simulation, we have to convert our model into C.

The hardware specification of the supercomputer is as follows (quoted from http://www.epcc.ed.ac.uk/msc/systems_HPCx.htm):

The HPCx system is located at the UK's CCLRC's Daresbury Laboratory and operated by the HPCx

Consortium.

The HPCx system uses IBM p690+ Regatta nodes for the compute and IBM p690 Regatta nodes for

login and disk I/O. Each Regatta node contains 32 processors. At present there are two p690 service

nodes. At the beginning of the user service on HPCx phase2 in April 2004, twenty p690+ nodes were

used for compute jobs, offering a total of 640 processors. From Monday, 10 May, there were 38 frames,

i.e. 1216 processors, available to users. Then the system had a throughput of at least 4.8 Tflops (4800

AU/hr). This was increased to 50 nodes offering 1600 processors end of May 2004. The peak

computational power of the HPCx system is 10.8 Tflops peak, or at least 6 Tflops sustained. The


11/65


4

complete new platform gave a value of 6,188 Gflops for the Rmax value of the Linpack benchmark. The

service can thus provide 6,188 AUs per hour, 148,512 AUs per day.

The HPCx service is provided by a consortium led by the University of Edinburgh, with the Council for the Central Laboratory of the Research Councils and IBM. This supercomputer will help us by running the simulation across its many processors, with different agents on different processors, to obtain a much more accurate result. However, my project does not involve HPCx directly.

Section 1.3: About This Dissertation

This dissertation consists of seven chapters. After this introductory chapter, the second chapter is the literature review, which covers the related background literature, the X-machine framework in detail, and the three programming languages associated with my project. The third chapter, requirements and analysis, discusses the project's objectives and requirements and analyses them in more detail; it also describes how the project will be evaluated. The fourth chapter is design, which describes the design techniques used in this project. The fifth chapter, implementation and testing, covers the coding methods and how the models are tested. The sixth chapter, results and discussion, is an important chapter that presents the main results of the models together with some discussion. The last chapter is conclusions, which summarises the project and the dissertation.


Chapter 2: Literature Review


Section 2.1: Overview

Three models are involved in my project: an intracellular chemical interactions model, the NF-κB signalling pathway model, and a combined model of the NF-κB signalling pathway and the MAP kinase signalling pathway. Three languages are also associated with my project: XML, Matlab and C. The picture below, produced by Prof. Eva Qwarnstrom, shows part of the signalling pathways in a cell; some of the molecules shown appear in the models:

Figure 2.1 [26]



Section 2.2: Agent-Based Intracellular Chemical Interactions Model

First, I will introduce the intracellular chemical interactions model. Even the simplest life forms require the interaction of more than 400 chemical processes that are encoded by genes [9]. To track and understand intracellular chemical interactions, the intracellular signalling pathways should be considered. Intracellular signalling pathways are very important in the control and regulation of cell behaviour. Agent-based modelling can show the intracellular signalling pathways in both their spatial and temporal aspects, and makes it possible to provide a framework for calculating chemical interactions accurately.

Complex interactions of genes, proteins and other molecules within the cell must be addressed in order to gain a better understanding of how these pathways operate [6][10][11]. Mathematical models that incorporate information about the physical components of the cell also make it easier to understand the activity of signalling pathways.

Intracellular signalling pathways have traditionally been modelled with reaction kinetics, using ordinary differential equations to describe the quantity of each chemical over time. This is valid only when the chemicals in the cell are well mixed. However, due to internal structure and low numbers and non-uniform distributions of certain key molecules in the cell, this is certainly not true [12].
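For reference, the reaction kinetics description of a single reaction such as A + B → C can be sketched as follows. Under the well-mixed assumption and mass-action kinetics, d[A]/dt = d[B]/dt = -k[A][B] and d[C]/dt = +k[A][B]. The rate constant `k`, the initial concentrations and the use of a forward Euler step are assumptions made only for this illustration; none of them is taken from the Matlab models discussed in this dissertation.

```python
def simulate_kinetics(a0, b0, c0, k, dt, steps):
    """Forward Euler integration of d[A]/dt = d[B]/dt = -k[A][B], d[C]/dt = +k[A][B]."""
    a, b, c = a0, b0, c0
    history = [(0.0, a, b, c)]
    for i in range(1, steps + 1):
        rate = k * a * b  # mass-action reaction rate for A + B -> C
        a -= rate * dt
        b -= rate * dt
        c += rate * dt
        history.append((i * dt, a, b, c))
    return history


# Illustrative values only (assumed, not from the dissertation's models).
history = simulate_kinetics(a0=1.0, b0=0.5, c0=0.0, k=2.0, dt=0.01, steps=500)
t_end, a_end, b_end, c_end = history[-1]
print(f"t={t_end:.2f}  A={a_end:.4f}  B={b_end:.4f}  C={c_end:.4f}")
```

Note that the whole cell is described by three numbers: no molecule has a position, which is exactly the limitation when the contents of the cell are not well mixed.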

Also, because the signalling pathways are complex, reaction kinetics models need a very large number of ordinary differential equations. The description then becomes huge and the solutions difficult to express. Models of this kind have other problems as well: they are limited in how well they function, and large systems of ordinary differential equations are sensitive, so that small changes to the equations can cause big changes in behaviour. Such models therefore give only a narrow view of the real behaviour of cells, even though they sometimes provide useful results.

An important factor that needs to be accounted for in intracellular modelling is time delays. Time delays in certain cellular processes such as transcription can have very significant effects on pathway behaviour [6]. Ordinary differential equation models do not take this factor into account, because by their nature they cannot readily include such delays.

An even more important factor for intracellular modelling is spatial effects. Again, it is hard for differential equation models to take spatial effects into account.

In summary, although differential equation models are important, they still have many disadvantages and limitations for modelling intracellular interactions. To gain a higher-level understanding of the mechanical and structural effects on intracellular pathways, more transparent and abstract models are needed.



A good modelling approach here is agent-based modelling. Agent-based modelling models each individual component of a system as a single agent obeying its own pre-defined rules and reacting to its environment and neighbouring agents accordingly [6]. This means agent-based modelling offers new methods of modelling spatial systems that deal with much finer spatial and temporal scales, where activity is represented at the level of the individual or agent. Processes enter these systems naturally as agent behaviour, and the spatial context follows naturally as well. Agent-based modelling has recently been applied to a variety of biological systems, including insect communities and epithelial tissue [13][14][15][16].

In a biological system, anything from a molecule to a signalling receptor to an entire chain of interactions in a biochemical pathway can be modelled as an agent, thus providing a modular and extensible modelling framework which allows abstraction of details as necessary [5]. Agent-based modelling is therefore well suited to spatial modelling, which is useful for monitoring intracellular interactions and the changes in cell structure that these interactions cause.

Compared with differential equation models, agent-based models have much more freedom: they can model different quantities and positions of molecules, limited only by the available computing power. The two important factors, time delays and spatial effects, can also be included in the model easily. Note, however, that the number of agents must be positive.

Unlike differential equation models, agent-based models do not need many ordinary differential equations, but they do need details of each agent's position and properties, which is a large amount of information to specify. It should also be noted that the agent-based model should agree with the associated kinetics model.
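A minimal sketch of the agent-based view of a reaction A + B → C is given below. It is not the Matlab model described in this chapter: the unit box, the random walk, the step size, the reaction radius and all function names are assumptions chosen for illustration. Each molecule is an agent with a kind and a position; agents move at random, and an A that comes within the reaction radius of a B turns into a C while that B is removed.

```python
import random


def step(agents, rng, step_size=0.05, radius=0.1):
    """One time step: every agent takes a random step, then close A-B pairs react."""
    for agent in agents:
        # Random walk, clamped so that agents stay inside the unit box.
        agent["pos"] = tuple(
            min(1.0, max(0.0, p + rng.uniform(-step_size, step_size)))
            for p in agent["pos"]
        )
    a_list = [m for m in agents if m["kind"] == "A"]
    used = set()  # ids of B agents already consumed in this step
    for a in a_list:
        for b in agents:
            if b["kind"] != "B" or id(b) in used:
                continue
            dist2 = sum((p - q) ** 2 for p, q in zip(a["pos"], b["pos"]))
            if dist2 <= radius ** 2:
                a["kind"] = "C"  # A becomes the product C at its own position
                used.add(id(b))
                break
    # Remove the consumed B agents.
    agents[:] = [m for m in agents if id(m) not in used]


def count(agents, kind):
    """Number of agents of the given kind."""
    return sum(1 for m in agents if m["kind"] == kind)


rng = random.Random(1)  # fixed seed so that runs are repeatable
agents = [{"kind": k, "pos": (rng.random(), rng.random(), rng.random())}
          for k in ["A"] * 30 + ["B"] * 20]
for _ in range(200):
    step(agents, rng)
print(count(agents, "A"), count(agents, "B"), count(agents, "C"))
```

Because every reaction consumes one A and one B and produces one C, the totals A + C and B + C stay constant, which is the same conservation that the kinetics model obeys.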

The two images below are from an agent-based model coded in Matlab by Mark Pogson in our department. Figure 2.2 shows a step in the middle of the interaction; it clearly displays the position and number of all three kinds of molecule in a three-dimensional box. Figure 2.3 shows the number of each kind of molecule against time in seconds. We can see that, as time passes, molecule A interacts with molecule B to produce molecule C, and that the numbers of the three kinds of molecule are related. The agent-based intracellular interaction model (A + B → C) in Matlab:



Figure 2.2

Figure 2.3

Section 2.3: Agent-Based NF-κB Signalling Pathway Model

After the intracellular chemical interaction model, we move on to the second model involved in my project, the NF-κB signalling pathway. NF-κB, nuclear factor kappa B, is a heterodimeric protein composed of different combinations of members of the Rel family of transcription factors. The Rel/NF-kB family of transcription factors are involved mainly in stress-induced, immune, and inflammatory responses. In addition, these molecules play important roles during the development of certain hemopoietic cells, keratinocytes, and lymphoid organ structures. More recently, NF-kB family members have been implicated in neoplastic progression and the formation of neuronal synapses. NF-kB is also an important regulator in cell fate decisions, such as programmed cell death and proliferation control, and is critical in tumorigenesis [17]. The intracellular NF-κB signalling pathway is therefore important to the immune system.

Because it controls cell death and proliferation, research into the NF-κB signalling pathway is very important. Imagine if people could control it so that cancer cells kill themselves while normal cells stay alive: one of the biggest problems in the world today, cancer, could then be solved. However, it is not easy to control, so a good model of the intracellular NF-κB signalling pathway is needed to show both the spatial and temporal details of the pathway for research purposes.

NF-κB activation is tightly controlled by inhibitors of NF-κB (IκB) proteins [5][18]. IκB sequesters the majority of NF-κB in the cytoplasm as complexes by masking their nuclear localisation signals [19]. During activation, IκB is phosphorylated by IκB kinases (IKK), causing its ubiquitination and proteasome-mediated degradation. The newly free NF-κB is consequently transported into the nucleus, inducing genes bearing cognate binding motifs [5].
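The activation sequence described above can be viewed as a small set of states and events for a single NF-κB agent. The sketch below is a deliberate simplification: the state names, the event names and the transition table are hypothetical labels for the steps in the preceding paragraph, not the states used in the Matlab model or in the X-machine specification.

```python
# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("bound_to_IkB_in_cytoplasm", "IkB_phosphorylated_by_IKK"): "bound_to_phosphorylated_IkB",
    ("bound_to_phosphorylated_IkB", "IkB_degraded"): "free_in_cytoplasm",
    ("free_in_cytoplasm", "transported_into_nucleus"): "in_nucleus",
}


def next_state(state, event):
    """Return the next state, or the unchanged state if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run(events, state="bound_to_IkB_in_cytoplasm"):
    """Apply a sequence of events and return the list of visited states."""
    visited = [state]
    for event in events:
        state = next_state(state, event)
        visited.append(state)
    return visited


path = run(["IkB_phosphorylated_by_IKK", "IkB_degraded", "transported_into_nucleus"])
print(" -> ".join(path))

# An event that is not enabled in the current state leaves the agent unchanged:
# NF-kB still held by IkB cannot enter the nucleus in this sketch.
blocked = run(["transported_into_nucleus"])
print(blocked[-1])
```

Writing the pathway as a table of transitions makes the role of the inhibitor explicit: the nuclear transport step is reachable only after the IκB has been phosphorylated and degraded.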

The information above shows how important the NF-κB signalling pathway is and how NF-κB is activated. We now need a computational model that, together with the results provided by experiment, reveals how it controls the signalling pathways.

As with the intracellular chemical interaction model, differential equations have been used to model the behaviour of the inhibitors. However, as mentioned above, differential equation models are limited in how well they can show the actual pathway, so the approach chosen here is agent-based modelling.

Agent-based modelling gives a better scope of analysis and a more complete view of the regulatory mechanisms of the intracellular NF-κB signalling pathway. It shows what is actually happening inside the cell. In this model a single agent is a molecule inside the cell, and its behaviour is controlled by the rules of interaction and by its environment. Sometimes it is not possible to model every individual molecule because of biological or computational limitations, but using other agents to separate the system into useful components still provides a complete view of the pathway.

Again in this model, agent-based modelling has a wider scope than reaction kinetics modelling, but the agent-based model must agree with the corresponding reaction kinetics model.

The two images below are from a second agent-based model coded in Matlab by Mark Pogson in our department. Figure 2.4 shows a step in the middle of the NF-κB signalling pathway simulation; it clearly displays a model of the cell and the position of each kind of molecule. Figure 2.5 shows the concentration of each kind of molecule against time in seconds.

Figure 2.4

Figure 2.5

As we can see, the model is made up of many different molecules in a spherical cell with a spherical central nuclear region. In reality, however, some cells have unique, non-spherical free shapes. To model those cells we would need special software to work out the boundary, but the model would still be based on a spherical model with all kinds of coordinates.

Section 2.4: NF-κB Signalling Pathway and MAP Kinase Signal Pathway Combined Model

MAP Kinase stands for Mitogen-activated protein kinase. In cell biology,

mitogen-activated protein kinases are serine/threonine-specific protein kinases that

respond to extracellular stimuli (mitogens) and regulate various cellular activities, such

as gene expression, mitosis, differentiation, and cell survival/apoptosis. Extracellular

stimuli lead to activation of a MAPK via a signalling cascade composed of MAPK,

MAPK kinase (MAPKK), and MAPKK kinase (MAPKKK). A MAPKKK that is

activated by extracellular stimuli phosphorylates a MAPKK on its serine and threonine

residues, and then this MAPKK activates a MAPK through phosphorylation on its

serine and tyrosine residues. This MAPK signalling cascade has been evolutionarily well-conserved from yeast to mammals. [27]

Figure 2.6 [25]

Figure 2.6 shows only a summary of the MAP kinase pathway, whereas Figure 2.1 shows more complex and complete signalling pathways. It also shows the cross talk between the NF-κB and MAP kinase pathways.

This pathway can also be modelled with an agent-based model by introducing each molecule as an agent. As with the NF-κB signalling pathway, agent-based modelling can provide a better scope of analysis and a more complete view of the regulatory mechanisms. The combined model, however, is more complex and more important for research purposes, and a computer model is needed to display what is actually happening inside the cell.

The most important thing is to see whether these two pathways interfere with each other when they are in the same model; the cross interaction between their members is also crucial.

If the two pathways behave normally in the same model, the X-machine framework is capable of modelling more than one pathway. This is also the basis for future models containing three or more pathways.

Section 2.5: Some Agent-Based Modelling Approaches

Section 2.5.1: Swarm Agent-Based Modelling

Swarm is a multi-agent software platform for the simulation of complex adaptive

systems. In the Swarm system the basic unit of simulation is the swarm, a collection of

agents executing a schedule of actions. Swarm supports hierarchical modelling

approaches whereby agents can be composed of swarms of other agents in nested

structures. Swarm provides object oriented libraries of reusable components for

building models and analyzing, displaying, and controlling experiments on those

models. Swarm is currently available as a beta version in full, free source code form. It

requires the GNU C Compiler, Unix, and X Windows. [33]

The modelling formalism that Swarm adopts is a collection of independent agents

interacting via discrete events. Within that framework, Swarm makes no assumptions about the particular sort of model being implemented. There are no domain specific

requirements such as particular spatial environments, physical phenomena, agent

representations, or interaction patterns. Swarm simulations have been written for such

diverse areas as chemistry, economics, physics, anthropology, ecology, and political

science. [33]

Swarm uses the individual agent as its basic unit: each agent generates events that affect itself and other agents, and a Swarm simulation consists of a number of agents interacting with each other.



Swarm needs libraries to do the simulation. Swarm libraries serve two major

functions. The libraries are a set of classes that model builders can use by direct

instantiation. For many objects, especially highly technical ones such as schedule data

structures, it's likely that all a user will ever do is use the classes as provided. But in

addition, one can use Swarm libraries by subclassing them, specializing particular classes for particular modelling needs. Both modes of using the Swarm libraries are important; Swarm is designed to facilitate both as appropriate. [33] This dependence on its libraries is also a limitation of Swarm agent-based modelling.

Section 2.5.2: MASON Multi-Agent Simulations

MASON stands for Multi-Agent Simulator Of Neighbourhoods... or Networks... or something.... MASON is a fast discrete-event multiagent simulation library core in Java, designed to be the foundation for large custom-purpose Java simulations, and also to provide more than enough functionality for many lightweight simulation needs.

MASON contains both a model library and an optional suite of visualization tools in

2D and 3D. MASON is a joint effort between George Mason University's ECLab

Evolutionary Computation Laboratory and the GMU Center for Social Complexity,

and was designed by Sean Luke, Gabriel Catalin Balan, and Liviu Panait, with help

from Claudio Cioffi-Revilla, Sean Paus, Keith Sullivan, Daniel Kuebrich, Joey

Harrison, and Ankur Desai. [34]

MASON has some special features:

- Simulations can be serialized to checkpoints (freeze-dried and written to disk), which can be recovered from at any time, even to different Java platforms and new MASON visualization toolkits.
- MASON can be set up to be guaranteed duplicatable, meaning that the same simulation parameters will produce the same results regardless of platform.
- Libraries are provided for visualizing in 2D and in 3D (using Java3D), to manipulate the model graphically, to take screenshots, and to generate movies (using Java Media Framework).
- While the visualization toolkits are fairly large, the core simulation model is intentionally very small, fast, and easy to understand. [34]

However, as the description above shows, MASON simulates models using Java. As explained in the last chapter, the models need to run on HPCx, and HPCx does not support Java, so MASON cannot be used for my project.

As the last two sections show, neither of these two systems suits my project as well as the X-machine framework does. The next section explains why the X-machine framework is the most suitable choice.



Section 2.5.3: X-machine Framework and XML

Because agent-based modelling is so widely used for intracellular interactions, a common architecture is needed for systems with large numbers of agents. The approach taken here is a framework based on the X-machine, which standardises the way agents are expressed. A model in the X-machine framework is written in XML, and a parser written in C, the Xparser, translates it into runnable C code.

There are many tools for computational biology modelling research, but few of them are agent-based, and those that are tend to have rigid structures that will not suit our models. Some agent-based frameworks already exist, but they cannot meet the needs of intracellular modelling, because inside actual cells there are millions of molecules and associated cellular signalling. With such a huge number of agents, a common architecture is essential. Running the model on a supercomputer such as HPCx, as mentioned in the introduction, makes the modelling results more accurate. The reason it can run on supercomputers lies in the definition of agents: agents are defined as autonomous computing machines that communicate with messages, so the processing of the agents can be spread across many processors and computers that are connected on a network [8].

The messaging between agents is similar to message communication between computers, so messages between agents can be carried as messages between computers. MPI (Message Passing Interface) is a library that allows the creation of programs that can be spread across computers and that communicate with messages, and it has become the de facto standard for distributed memory parallel processing [8]. So we can use networked computers to simulate the agents and the messages between them.

A cell can be defined as a system that processes a collection of communications in parallel. We therefore need a good model to define the behaviour of agents that run in parallel, send each other data and process it. The X-machine meets all these needs. Like other finite state machines, an X-machine has states and input and output alphabets, but it also has something that other state machines do not have: memory. This additional memory makes it well suited to agent-based modelling, because a transition between states can read the memory and modify it. The definition of a stream X-machine makes this precise.

The definition of a stream X-machine is an 8-tuple [16]:

$$X = (\Sigma, \Gamma, Q, M, \Phi, F, q_0, m_0)$$

where:

- $\Sigma$ and $\Gamma$ are the input and output alphabets respectively.
- $Q$ is the finite set of states.
- $M$ is the (possibly) infinite set called memory.
- $\Phi$, the type of the machine $X$, is a set of partial functions $\varphi$ that map an input and a memory state to an output and a possibly different memory state, $\varphi: \Sigma \times M \to \Gamma \times M$.
- $F$ is the next state partial function, $F: Q \times \Phi \to Q$, which given a state and a function from the type $\Phi$ determines the next state. $F$ is often described as a state transition diagram.
- $q_0$ and $m_0$ are the initial state and initial memory respectively.
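The 8-tuple can be made concrete with a small sketch. The class below is an illustrative assumption and not the framework's code: the class name, the example functions `drift` and `bind`, and the use of an integer as memory are all hypothetical. It only shows how a function from the type reads an input and the memory, produces an output and a new memory, and how F then selects the next state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# A processing function phi: (input, memory) -> (output, new_memory).
# It returns None where phi is undefined, since phi is a partial function.
Phi = Callable[[str, int], Optional[Tuple[str, int]]]


@dataclass
class StreamXMachine:
    state: str                                # current state, starts at q0
    memory: int                               # current memory, starts at m0
    transitions: Dict[Tuple[str, str], str]   # F: (state, function name) -> next state
    functions: Dict[str, Phi]                 # the type Phi, indexed by name
    outputs: List[str] = field(default_factory=list)

    def step(self, symbol: str) -> bool:
        """Consume one input symbol; return False if no function applies."""
        for (src, name), dst in self.transitions.items():
            if src != self.state:
                continue
            result = self.functions[name](symbol, self.memory)
            if result is not None:
                out, self.memory = result
                self.outputs.append(out)
                self.state = dst
                return True
        return False


# Hypothetical example: a molecule that counts ticks in memory and binds on "hit".
def drift(symbol, memory):
    return ("moved", memory + 1) if symbol == "tick" else None


def bind(symbol, memory):
    return ("bound", memory) if symbol == "hit" else None


machine = StreamXMachine(
    state="free",
    memory=0,
    transitions={("free", "drift"): "free", ("free", "bind"): "bound"},
    functions={"drift": drift, "bind": bind},
)
for s in ["tick", "tick", "hit"]:
    machine.step(s)
```

After the three inputs the machine is in state `bound` with memory 2, and no function is defined from that state, so further inputs are rejected.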

From now on the term X-machine refers to a stream X-machine [8]. Because X-machines can communicate, we can use the Communicating X-machine model, in which X-machines exchange messages. A Communicating X-machine model can be defined as the tuple [8]:

$$((C_i^x)_{i = 1..n}, R)$$

where:

- $C_i^x$ is the $i$-th Communicating X-machine in the system, and
- $R$ is a communication relation between the $n$ X-machines.

Different ways of defining R give different definitions of the communicating X-machine. One of the most accepted approaches uses a communication matrix as the means of communication between X-machines [8]. The cells of this matrix hold the messages passed between X-machines. However, this approach has disadvantages when X-machines are used as agents. When there are a lot of agents, the communication matrices become too large to manage. Also, because the communication links keep changing, an agent cannot easily tell which target agent it should send a message to.

In the communicating X-machine agent-based models used here, agents are restricted to interacting with surrounding agents, so the distance over which messages are sent between agents is limited. In this approach, the communication relation R consists of two lists: a message list and a message type list. Every X-machine can read and understand the messages in the message list. This is central to this kind of implementation, because it means that the actions of each X-machine are based on its input messages. If the source of an input message is too far from the X-machine, the message is ignored; if the source is within a reasonable distance, the message is processed. The method can also be extended by tagging each message with extra information, e.g. the maximum distance for the sending X-machine and the possible receiving X-machines.
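The distance rule for the message list can be sketched as follows. This is a minimal illustration under assumptions: the `LocationMessage` fields, the function name and the use of a single cut-off distance are hypothetical and are not taken from the framework.

```python
import math
from typing import List, NamedTuple, Tuple


class LocationMessage(NamedTuple):
    sender_id: int
    x: float
    y: float
    z: float


def readable_messages(own_id: int,
                      own_pos: Tuple[float, float, float],
                      message_list: List[LocationMessage],
                      max_distance: float) -> List[LocationMessage]:
    """Keep only the messages whose sender is close enough to be processed."""
    kept = []
    for msg in message_list:
        if msg.sender_id == own_id:
            continue  # an agent ignores its own message
        if math.dist(own_pos, (msg.x, msg.y, msg.z)) <= max_distance:
            kept.append(msg)
    return kept


# Every agent sees the same global list but reacts only to nearby senders.
board = [
    LocationMessage(1, 0.0, 0.0, 0.0),
    LocationMessage(2, 1.0, 0.0, 0.0),
    LocationMessage(3, 10.0, 0.0, 0.0),
]
near = readable_messages(1, (0.0, 0.0, 0.0), board, max_distance=2.0)
```

Here agent 1 keeps the message from agent 2 and ignores the one from agent 3, which is too far away.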

There are many ways of communicating and handling messages. A useful case is communication between two agents that are processed on distinct computers in a computer cluster or a grid system. The current practice is to keep a local message list for each CPU in the cluster. An agent only sends messages to and receives messages from the list on its local CPU, but a separate calculation checks whether any agent on a different CPU needs the message. The calculation uses the distance between agents: giving each agent an influence boundary makes it easy to decide whether an agent needs a given message.
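One way to picture the influence boundary check is to assume that space is cut into equal slabs along one axis, with one slab per CPU. That decomposition, the function name and its parameters are assumptions made for illustration only; the text does not say how agents are assigned to CPUs.

```python
from typing import List


def nodes_needing_message(x: float, radius: float,
                          slab_width: float, n_nodes: int) -> List[int]:
    """Return the indices of the nodes whose slab overlaps [x - radius, x + radius].

    Node i is assumed to own the slab [i * slab_width, (i + 1) * slab_width).
    """
    first = max(0, int((x - radius) // slab_width))
    last = min(n_nodes - 1, int((x + radius) // slab_width))
    return list(range(first, last + 1))


# An agent deep inside slab 0 only needs its local message list.
local_only = nodes_needing_message(5.0, 1.0, slab_width=10.0, n_nodes=4)
# An agent near the edge of slab 0 must also be visible to node 1.
shared = nodes_needing_message(9.5, 1.0, slab_width=10.0, n_nodes=4)
```

Only agents whose influence boundary crosses into another slab cause a message to be copied to another CPU, which keeps most traffic local.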

XML is used here as the implementation architecture for the X-machine. The X-machine architecture is defined in an XML text file. This is easy for most people to use, since the XML can be modified with any text editor. It is also possible to develop a graphical interface that modifies the XML without the user seeing the implementation directly.

A parser is needed to translate the XML into a runnable C programme that runs the X-machine agents with the message list relation. The parser, which we call the Xparser, is itself written in C and works for any X-machine agent model written in XML. To run the programme, a second XML text file is needed that defines the starting state and details of each agent as the initial point. By varying these files, it is possible to perform different runs of the model and obtain different results for research.

The X-machine model can be visualised with a purpose-built visualisation programme, which is also written in C. The visualisation gives a direct view of the model's structure and interactions. Each frame of the visualisation can also be saved as an image file, and a set of such screenshots can be converted into a video file using free software called VirtualDub (see http://www.virtualdub.org/).



Chapter 3: Requirements and Analysis



Section 3.1: Objectives and Requirement for the Project

This chapter sets out the objectives and requirements of the project. Each of the three models is discussed in detail.

The aim of my project is to develop a simulation of a vital part of the immune system using an existing framework and tools. The framework is the X-machine framework, which was developed by the computational biology research group in our department. It can readily model the biological systems involved in my project. Each individual agent plays the role of a molecule or a receptor, and agent-based modelling can handle thousands of these agents operating and communicating with each other.

The objective of my project is to convert the two existing Matlab models into X-machine models. Mark Pogson has modelled both the intracellular chemical interaction model and the NF-κB signalling pathway model in Matlab, and I have already received them. For the third model, however, there is no existing Matlab model that combines the two pathways. This is the challenging part, and it needs to be fully tested to see whether it works properly in the X-machine framework.

The first requirement is to understand all the Matlab models in detail. I then need to work out the architecture and method of X-machine modelling, and to understand how to use the Xparser developed by Simon Coakley. After that I can begin. Once I fully understand a Matlab model, I convert it into an X-machine model, which is represented by an XML file. I then create an initial state file called 0.xml (also XML) that gives the details of the model's starting agents. Matlab can generate the initial agents at the start of every run, but in the X-machine framework I need to create them myself. Next, I use the Xparser to parse the XML into C. If the C code compiles without problems, the result is a runnable .exe programme file. Running the programme with a number of iterations and the path to the 0.xml initial state file carries out the whole simulation and produces an XML file for each iteration. Simon Coakley has also developed a visualisation programme specifically for X-machine models, which gives a direct view of the model as 3D pictures.

Once the two Matlab models have been converted into X-machine models, work on the third model can start. This model is built by defining each molecule as an agent, together with a set of binding rules and a set of movement rules for each new kind of molecule.



As shown in Figure 3.1, it is possible to start with two individual models for the NF-κB and MAPK pathways and then put them together into a single model. One important point is that the state numbers for each pathway's molecules must be unique, so that they do not clash when the models are combined. The cross-talk between the NF-κB pathway and the MAPK pathway also needs to be shown in the model, if detailed data for it is available. I will discuss the combined model further in a following section and chapter.

Figure 3.1 Process of combination

Section 3.2: Analysis for Intracellular Chemical Interaction Model

Section 3.2.1: Importance and User Requirements

This model is very basic and simple, but complex models are built up from basic ones. Many aspects of life involve the interaction of multiple components and subunits and the corresponding emergence of both form and function. This is true whether we are dealing with molecules within an individual cell, cells within tissue, organs within an organism or organisms within a community or ecology. [28] By working out how each kind of molecule interacts with another kind, it is possible to build a large and complex model with many different kinds of molecules or pathways.

The key feature of agent-based modelling is that each molecule is modelled as an agent. Figure 3.2 shows the contrast: (a) reaction kinetics differential equations treat reacting chemicals as well mixed and uniform; (b) the agent-based approach models each individual molecule [28].

[Figure 3.1 diagram labels: NF-κB, MAPK, Mix, Cross-Talk, Combined Model]



Figure 3.2 Chemical reactions [28]

Agent-based models have greater scope than reaction kinetics differential equation models, but they require many more details to be defined. For example, the movement of a single molecule has to be defined, as do the rules for binding A molecules to B molecules. Incorrect data may cause a big difference in the results.

Agent-based models have to agree with reaction kinetics differential equation models, because when an agent-based model has a large number of well-mixed molecules, the differential equation models apply. However, there is not much information available about individual molecular interactions, so some of the data for the agent-based model has to be derived from reaction kinetics.
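As a point of reference for this agreement, the well-mixed limit of a single irreversible binding reaction A + B → C is usually written with mass-action kinetics. The rate constant $k$ is an assumed symbol that does not appear in the text, and the Matlab model may also include unbinding, which this sketch leaves out:

```latex
\frac{d[A]}{dt} = \frac{d[B]}{dt} = -k\,[A][B], \qquad \frac{d[C]}{dt} = k\,[A][B]
```

With many well-mixed molecules, the agent counts per unit volume would be expected to follow curves of this form.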

Section 3.2.2: Conversion from Matlab

During conversion, one big change has to be defined first: the state of each molecule. The X-machine is a special kind of state machine, so when modelling intracellular interactions, each molecule is an X-machine and each has a state. I therefore need to work out the possible states for each kind of molecule.

The intracellular chemical interaction model initially has only two kinds of molecule, so the states are easy to define. Molecule A has two states, free and bound with molecule B, and molecule B has one state, free. From the perspective of A, it receives a message from molecule B and decides whether to bind or not. Once A has bound with B, the pair becomes a third kind of molecule. When marking the state, we can let molecule B disappear and change molecule A to the state "bound with B". It is actually molecule C now, but this representation is easier to compute and display.

The condition for a binding is also important. Normally the interaction boundary depends on the radius of the molecule, so a radius and an interaction boundary need to be defined for each kind of molecule.



Section 3.2.3: Concentrations Rates

A good way to check whether the result is correct is to count the molecules. For each bond, the numbers of molecule A and molecule B each decrease by one, and the number of molecule C increases by one, all in the same time step. Looking back at Figure 2.3, the concentration changes are easy to see, and the model will be built on this basis. The evaluation of this model is also straightforward. If the concentration change of molecule A over a time step Δt is Δa, and that of molecule B is Δb, then because the interaction between molecule A and molecule B produces molecule C, Δa = Δb, as shown in Figure 3.3 [6]:

Figure 3.3: concentration of molecule A, B and C against time [6]
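The counting check described above can be sketched as a small function. The input format, a list of (A, B, C) counts with one entry per iteration, is an assumption made for illustration and is not the output format of any tool named in the text.

```python
from typing import List, Tuple


def check_conservation(counts: List[Tuple[int, int, int]]) -> bool:
    """counts holds (n_a, n_b, n_c) for each iteration, in order.

    Between consecutive iterations, every new C must be matched by
    exactly one lost A and one lost B.
    """
    for (a0, b0, c0), (a1, b1, c1) in zip(counts, counts[1:]):
        bonds = c1 - c0
        if a0 - a1 != bonds or b0 - b1 != bonds:
            return False
    return True


good_run = [(10, 8, 0), (9, 7, 1), (7, 5, 3)]
bad_run = [(10, 8, 0), (9, 8, 1)]  # a C appeared but no B was used up
```

A run that fails this check points to a bug in the binding rules rather than to a property of the chemistry.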

Section 3.3: Analysis for the NF-κB Signalling Pathway Model

Section 3.3.1: Importance and User Requirements

As described in the last chapter, the NF-κB signalling pathway is vital to immune response regulation. Alterations in pathway regulation underlie many diseases, including atherosclerosis and arthritis. The modelling of individual molecules, receptors and genes provides a more comprehensive outline of regulatory network mechanisms than previously possible with equation-based approaches. [28] For this model, all the data come from single cell experimental analysis by the Academic Unit of Cell Biology, Division of Genomic Medicine in the University of Sheffield.

A user of this model will be able to change the moving speed, radius and initial quantity (concentration) of each kind of molecule. The user should also be able to define the colour of each kind of molecule. This means that even if the data in the Matlab code are not correct, the user can correct the model as soon as the experiments are finished. Each kind of molecule is independent of the others, so changing the details of one kind does not affect the other kinds, and the model will then give the correct result.

The interaction of NF-κB with IκB should follow the interaction requirement described in the last section. NF-κB can be seen as molecule A in the last model and IκB as molecule B, so when they bind, the resulting NF-κB & IκB complex can be seen as molecule C. The concentration change should therefore follow Figure 3.3, but because many other kinds of molecule are involved, the situation is much more complex.

Section 3.3.2: Conversion from Matlab

The details in the Matlab code reflect the following biology: Activation of the NF-κB pathway is controlled by inhibitors of NF-κB (IκB) proteins, which sequester the majority of NF-κB in the cytoplasm as complexes by masking their nuclear localisation signals. During activation, IκB is phosphorylated by IκB kinases (IKK), causing its degradation. The newly freed NF-κB is consequently transported into the nucleus, inducing inflammatory genes, including those encoding IκB, thus regulating the pathway through negative feedback. [28][29][30][31]

In the Matlab code, NF-κB, IκBα, IκBβ, IκBε, nuclear importing receptors and nuclear exporting receptors are modelled as agents. The conversion from Matlab is complicated, because each kind of molecule has a set of states. However, the numbers of IκBβ and IκBε in a real cell are tiny, so at the suggestion of Mark Pogson these two molecules are not included in the model. The possible states for each remaining kind of molecule are as follows.

The NF-κB molecule is the most complicated one in this model; see Figure 3.4 for the possible states and transitions of an NF-κB molecule. A single molecule can be bound or unbound with different molecules in the cytoplasm and in the nucleus, free in the cytoplasm or the nucleus, and bound or unbound with importing and exporting receptors. In more detail, NF-κB has the following states: bound with IκBα in the cytoplasm; free in the cytoplasm; bound with a nuclear importing receptor; free in the nucleus; bound with IκBα in the nucleus; bound with a nuclear exporting receptor; bound with IκBα and then with a nuclear importing receptor; and bound with IκBα and then with a nuclear exporting receptor.

For IκBα the possible states are: free in the cytoplasm, bound with a nuclear importing receptor, free in the nucleus and bound with a nuclear exporting receptor. IκBα is much simpler than the NF-κB molecule.

Both kinds of nuclear receptor have two states: dormant and active. Active means that something is bound to the receptor; dormant means that it is free and ready to bind with other kinds of molecules.



Figure 3.4: Possible states and transitions of an NF-κB [5]

Section 3.4: Analysis for the NF-κB & MAP Kinase Signalling Pathway Combined Model

This model involves two pathways: the NF-κB signalling pathway and the MAP kinase signalling pathway. The NF-κB pathway has already been built in the second model, so the tasks are to build the MAP kinase pathway separately and then combine the two.

As shown in Figure 3.5, the model can be simplified from Figure 2.6. The Ras, SOS and Grb2 molecules can be seen as a single kind, which can be treated like NF-κB in the last model. Active-Ras can be treated like IκB. However, neither of them can enter the nucleus. When they bind, they produce a molecule called MAPK, which stands in for Raf (MAP KKK), MEK1/2 (MAP KK) and ERK1/2 (MAP K). These three act one after another as a sequential kinase cascade, so they can be treated as one kind, MAPK. MAPK is the only one that enters the nucleus, where it switches on a gene.

As with NF-κB and IκB, the interaction of Ras_SOS_Grb2, Active-Ras and MAPK should follow the concentration change in Figure 3.3. The behaviour of NF-κB and IκB should not be affected in this model, and this is how the model will be evaluated.

The cross-talk between these two pathways has not yet been fully discovered. All that is known so far is that a molecule called NIK is an important part of the cross-talk.

The next chapter covers design, where I discuss the design of each model in detail.

Figure 3.5 Simplification of the MAP Kinase pathway

[Figure 3.5 diagram labels: Ras_SOS_Grb2 bound with Active-Ras; Raf (MAP KKK), MEK1/2 (MAP KK) and ERK1/2 (MAP K), represented together as MAPK; nuclear membrane; AP-1]



Chapter 4: Design


Section 4.1: Associated Language with the Project

Three languages are associated with my project: XML, Matlab and C. It is necessary to become familiar with them before designing the models.

Section 4.1.1: XML

Firstly, let us look at XML. XML (Extensible Markup Language) is similar to the more familiar HTML (Hypertext Markup Language); both are derived from SGML (Standard Generalized Markup Language). XML is a simple but very flexible language. It was originally designed to meet the challenges of large-scale electronic publishing [24]. XML is now also used to exchange data between the Web and other devices. One example is the RSS (Really Simple Syndication) feed service, which provides a text file in a common XML format so that users can receive up-to-date information such as news and weather.

Compared with HTML, XML is very flexible, because the tags in HTML are predefined, whereas in XML you can define the tags yourself. With an XML parser or reader that understands your own tags, a goal can be achieved very efficiently.

Section 4.1.2: Matlab

Secondly, we turn to Matlab. Matlab is an interactive mathematical environment and

high-level technical computing language, originally based on the FORTRAN packages

LINPACK and EISPACK, but now based on LAPACK and BLAS [20]. Matlab is a

really useful tool for mathematical modelling; it also has a lot of features [21]:

- High-level language for technical computing
- Development environment for managing code, files, and data
- Interactive tools for iterative exploration, design, and problem solving
- Mathematical functions for linear algebra, statistics, Fourier analysis, filtering, optimization, and numerical integration
- 2-D and 3-D graphics functions for visualizing data
- Tools for building custom graphical user interfaces
- Functions for integrating MATLAB based algorithms with external applications and languages, such as C, C++, Fortran, Java, COM, and Microsoft Excel

With these features, Matlab is a powerful tool for computing and mathematical studies. However, compared with the XML architecture, it is less suitable for agent-based modelling when it comes to handling the agents and the communication relation messages. Another reason is that the HPCx supercomputer does not support Matlab. To have the supercomputer run in parallel with different agents on different CPUs, it is necessary to convert the existing Matlab models into X-machine models.

Section 4.1.3: C

Lastly, we focus on the C programming language. The book The C Programming Language by Brian Kernighan and Dennis Ritchie gives an informal specification of C and some of its history.

The C programming language is a standardized imperative computer programming

language developed in the early 1970s by Ken Thompson and Dennis Ritchie for use

on the UNIX operating system. It has since spread to many other operating systems,

and is one of the most widely used programming languages. C is prized for its

efficiency, and is the most popular programming language for writing system software, though it is also used for writing applications. It is also commonly used in computer

science education, despite not being designed for novices. [22][23]

C operates very close to the hardware, and is closer to assembly language than other high-level languages are. This makes it easier for programmers to control what the programme is doing, which leads to greater efficiency than in other languages.

C is also very widely supported by compilers, libraries and other tools. That is why both the Xparser and the visualisation programme for X-machine models use C. Also, as mentioned above, the HPCx supercomputer has no problem running C, so C is the best choice as the language that X-machine models are parsed into.

Having covered all three languages, we are prepared for the design stage.

Section 4.2: Overall Design

Section 4.2.1: X-machine Framework's Architecture

The X-machine framework is a specialised framework for modelling biological and other systems based on individual agents. The architecture currently uses an .xml text file to define the data for each individual agent. As described in the last section, XML is a portable description language. An .xml file is easy to build, either by writing it directly in a text editor (low-level) or by constructing it with a GUI tool (high-level).



There is also another important .xml file, which defines all the interaction rules, the sending and receiving of messages, movements, variables and so on. A parser called the Xparser parses this .xml file into a C code file, which a compiler then turns into an executable programme. The programme can run X-machine agents and implements the global message list communication relation [8].

The model starts when this programme is run with an initial .xml text file that holds the states and other information of every agent. Each iteration of the programme generates an .xml text file that holds all the changes to the states and to other information such as location and speed.

After a number of iterations (which can be set when the programme starts), there will be a set of .xml text files. Note that one iteration represents 0.5 seconds, so 2 iterations represent 1 second. These files can then be used in several ways: a specialised visualisation tool can display the model, and a getdata tool can extract the information needed to generate a graph with specified x and y axes. This is described in detail in the next chapter, on implementation and testing.
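The kind of series a graphing tool needs can be sketched as follows. This is not the getdata tool itself: the function name and the input format (one list of agent state numbers per iteration, with iteration 0 taken as time 0) are assumptions for illustration. Only the 0.5 second step comes from the text.

```python
from typing import List, Tuple

SECONDS_PER_ITERATION = 0.5  # one iteration is 0.5 second, as stated in the text


def state_time_series(iterations: List[List[int]],
                      state: int) -> List[Tuple[float, int]]:
    """Return (time in seconds, number of agents in `state`) for each iteration.

    iterations[i] is assumed to list the state number of every agent
    present at iteration i.
    """
    return [
        (i * SECONDS_PER_ITERATION, states.count(state))
        for i, states in enumerate(iterations)
    ]


# Three iterations of a tiny hypothetical run; state 1 marks a bound molecule.
run = [[0, 0, 100], [1, 0], [1, 1]]
series = state_time_series(run, state=1)
```

The resulting pairs can be plotted directly, with time on the x-axis and the count on the y-axis.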

Section 4.2.2: Main XML File Structure

The main XML file is the soul of the model. Although this main XML model file is not needed when we visualise the model or extract its data, the model will not work, or will not work properly, if the file is missing or incorrect.

The main file structure is not simple (see Figure 4.1). At the highest level, the model main file consists of three parts: Defined States, X-machine and Messages. The Defined States part actually consists of comments listing all the states of each molecule, to help users understand the model. The Messages part defines each kind of message and its contents.

Figure 4.1 Structure of the Main file (a)

[Figure 4.1 diagram: the model main file branches into Defined States, X-machine and Messages]



The most complicated part is X-machine, which itself consists of three sub-parts: Memory, States and Functions (Figure 4.2).

The Memory part holds variables. The user can define all the global variables in this part using special tags, and they are quite simple to define.

The States part normally contains three states: input, output and move, which are linked to the functions in the Functions part. This part does not need to be changed for most models.

The Functions part is the core of this file and its most complicated sub-part. It controls each agent's behaviour. The outputdata function outputs messages containing location, state and bond information. The inputdata function gets messages from other agents and processes them with the appropriate reactions. The movements function controls the movement and location of agents, and also defines the boundary of the model structure. Depending on the model, other functions may also be needed in this sub-part.

Figure 4.2 Structure of the Main file (b)

[Figure 4.2 diagram: X-machine branches into Memory, States and Functions; Functions branches into Outputdata, Inputdata and Movements]



Section 4.2.3: Iteration XML File Structure

Iteration files are also important in this framework. They have to be simple and uniform, so that they are easy to read and process.

Typically, an iteration file starts with an iteration number tag that shows which iteration the file belongs to, followed by an entry for each agent in the model. The file includes all the agents present at this iteration and the detailed data for each of them. Different models have different types of agent data.
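To make this layout concrete, the sketch below reads a small iteration file. The tag names (`states`, `itno`, `xagent`, `id`, `state`, `x`, `y`, `z`) are hypothetical placeholders chosen for illustration; the text does not give the actual tags used by the framework.

```python
import xml.etree.ElementTree as ET

# Hypothetical iteration file: an iteration number tag, then one entry per agent.
SAMPLE = """<states>
  <itno>3</itno>
  <xagent><id>1</id><state>0</state><x>1.5</x><y>2.0</y><z>0.5</z></xagent>
  <xagent><id>2</id><state>100</state><x>4.0</x><y>1.0</y><z>3.0</z></xagent>
</states>"""


def read_iteration(xml_text):
    """Return (iteration number, list of agent dictionaries)."""
    root = ET.fromstring(xml_text)
    iteration = int(root.findtext("itno"))
    agents = []
    for node in root.findall("xagent"):
        agents.append({
            "id": int(node.findtext("id")),
            "state": int(node.findtext("state")),
            "position": tuple(float(node.findtext(axis)) for axis in ("x", "y", "z")),
        })
    return iteration, agents


iteration, agents = read_iteration(SAMPLE)
```

Because every iteration file shares the same shape, the same reader can be applied to the whole set of output files.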

Section 4.3: Design of Chemical Interaction Model

The chemical interaction model is the simplest model in my project. It follows the reaction A + B → C, and only three kinds of molecule are involved, so the structure of the model and of its xml files is simple as well. A Matlab model already exists, and I can reuse its molecule information and interaction rules for the X-machine model, so this model is really a conversion.

In Matlab, the data for any number of molecules can be generated randomly when the programme starts, and Matlab is also a good tool for plotting the graph of concentration against time for the model. In the X-machine framework, these functions need extra tools. The main file and the iteration file themselves can therefore be kept quite simple.

As described in Chapter 3, the first step in the conversion is to clarify the states of each kind of molecule. Molecule A has two states: 0, free in the box, and 1, bound with B (in which case it actually represents molecule C). Molecule B has only one state: 100, free in the box. When molecule B binds with molecule A, it is treated as having disappeared, so B needs no state for being bound with molecule A.

This model is contained inside a box, and according to the Matlab code it needs two coordinate systems: Cartesian and polar. Cartesian coordinates are mostly used for location, and polar coordinates for motion. Using the Cartesian coordinates, it is possible to define the boundary of the box and keep each molecule inside it by reversing its movement in polar coordinates when it hits the edge of the box.
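The combination of the two coordinate systems can be sketched in two dimensions. This is an illustrative assumption rather than the converted Matlab code: the function name, the 2D simplification and the choice to leave the molecule in place on the step where it would cross the wall are all hypothetical.

```python
import math


def move_in_box(x, y, speed, angle, box_size):
    """Apply one step of a polar velocity (speed, angle) to a Cartesian position.

    The box is assumed to span [0, box_size] on both axes. If the step would
    leave the box, the molecule stays where it is and its direction is reversed.
    """
    new_x = x + speed * math.cos(angle)
    new_y = y + speed * math.sin(angle)
    if not (0.0 <= new_x <= box_size and 0.0 <= new_y <= box_size):
        return x, y, (angle + math.pi) % (2 * math.pi)
    return new_x, new_y, angle


# A molecule near the right-hand wall, heading straight at it.
bounced = move_in_box(9.5, 5.0, speed=1.0, angle=0.0, box_size=10.0)
# A molecule with room to move.
moved = move_in_box(1.0, 1.0, speed=1.0, angle=0.0, box_size=10.0)
```

The position is only ever tested in Cartesian form, while the reversal is a single change to the polar angle.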

The Memory part has to contain all the global variables that support the coordinate systems described above. It also has to contain other necessary variables such as the state number and the molecule radius.

The States part is simple, with only three states to set: output, input and move, as described in Section 4.2.2.



The most challenging part is the Functions part. Four functions are essential: outputdata, inputdata, checkbondtries and move. Binding is best handled from the perspective of each A molecule, which looks for a partner to bind with. By processing the location messages from the B molecules, the A molecule chooses the best one and binds with it. Bond messages are also involved in the binding process. The move function makes sure that all the molecules move freely around inside the box.
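The choice of a binding partner can be sketched as follows. The state numbers 0, 1 and 100 come from the text; everything else is an assumption for illustration, in particular that "best" means nearest and that the interaction boundary is the sum of the two radii. The function names and data layout are hypothetical and do not reproduce the inputdata or checkbondtries functions.

```python
import math

FREE_A, BOUND_A, FREE_B = 0, 1, 100  # state numbers given in the text


def choose_partner(a_pos, a_radius, b_positions, b_radius):
    """Return the id of the closest free B within reach of A, or None."""
    reach = a_radius + b_radius  # assumed interaction boundary
    best_id, best_dist = None, None
    for b_id, b_pos in b_positions.items():
        d = math.dist(a_pos, b_pos)
        if d <= reach and (best_dist is None or d < best_dist):
            best_id, best_dist = b_id, d
    return best_id


def try_bind(a, free_bs, a_radius, b_radius):
    """Bind a free A to its best B; the chosen B is removed from free_bs."""
    if a["state"] != FREE_A:
        return None
    partner = choose_partner(a["pos"], a_radius, free_bs, b_radius)
    if partner is not None:
        del free_bs[partner]   # B is treated as having disappeared
        a["state"] = BOUND_A   # A now stands for molecule C
    return partner


a = {"pos": (0.0, 0.0, 0.0), "state": FREE_A}
free_bs = {7: (0.5, 0.0, 0.0), 8: (0.2, 0.0, 0.0), 9: (5.0, 0.0, 0.0)}
chosen = try_bind(a, free_bs, a_radius=0.3, b_radius=0.3)
```

Working from the perspective of A means each binding is decided in one place, so two A molecules cannot both claim the same B within a single call sequence.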

The Messages part contains two kinds of message. The first is the location message, which contains each molecule's state, Cartesian coordinates and id number. The second is the bond message, which holds the sender's id and state, the receiver's state, a bond/unbond tag and a distance value.

The iteration files have the same structure as described in section 4.2.2. The detailed implementation of this model is presented in the next chapter, Implementation and Testing.

Section 4.4: Design of NF-κB Signalling Pathway Model

The NF-κB signalling pathway model is complicated compared with the previous one: it involves four kinds of molecules and many more states. The molecules are NF-κB, IκBα (IκBβ and IκBε are ignored because their concentrations are low), nuclear importing receptors and nuclear exporting receptors.

NF-κB can bind with IκBα, with nuclear importing receptors and with nuclear exporting receptors. After it has bound with IκBα, the NF-κB & IκBα complex can also bind with nuclear importing and exporting receptors. The state numbers for NF-κB are therefore: 0 - free in cytoplasm, 1 - bound to IκBα in cytoplasm, 2 - bound to a nuclear importing receptor, 3 - bound to IκBα and then to a nuclear importing receptor, 4 - bound to a nuclear exporting receptor, 5 - bound to IκBα and then to a nuclear exporting receptor, 6 - free in nucleus and 7 - bound to IκBα in nucleus.

As in the chemical interaction model, IκBα acts like molecule B in that model. When IκBα binds with NF-κB it does not need to be displayed, so the solution is to eliminate it. The situation is different when IκBα binds with a nuclear importing or exporting receptor. The state numbers have to be unique, so the state numbers for IκBα are: 10 - free in cytoplasm, 11 - bound to a nuclear importing receptor, 12 - bound to a nuclear exporting receptor and 13 - free in nucleus.

For the nuclear importing and exporting receptors, the states are simple. They do not need to track which kind of molecule is bound to them, because the state numbers of the two kinds of molecules above already indicate the type of bond. The only thing they need to record is whether they are busy or not. The state numbers for nuclear importing receptors are: 20 - dormant and 21 - active. Nuclear exporting receptors use the same scheme with different numbers: 30 - dormant and 31 - active.
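
The state numbering above can be collected into a single C enumeration. The numeric values are those given in the text; the identifier names and the kind_of helper are hypothetical and only illustrate how a unique state number also identifies the kind of molecule.

```c
enum MoleculeState {
    NFKB_FREE_CYTOPLASM   = 0,
    NFKB_IKBA_CYTOPLASM   = 1,   /* NF-kB bound to IkBa in cytoplasm */
    NFKB_ON_IMPORTER      = 2,
    NFKB_IKBA_ON_IMPORTER = 3,
    NFKB_ON_EXPORTER      = 4,
    NFKB_IKBA_ON_EXPORTER = 5,
    NFKB_FREE_NUCLEUS     = 6,
    NFKB_IKBA_NUCLEUS     = 7,   /* NF-kB bound to IkBa in nucleus */
    IKBA_FREE_CYTOPLASM   = 10,
    IKBA_ON_IMPORTER      = 11,
    IKBA_ON_EXPORTER      = 12,
    IKBA_FREE_NUCLEUS     = 13,
    IMPORTER_DORMANT      = 20,
    IMPORTER_ACTIVE       = 21,
    EXPORTER_DORMANT      = 30,
    EXPORTER_ACTIVE       = 31
};

enum MoleculeKind { KIND_NFKB, KIND_IKBA, KIND_IMPORTER, KIND_EXPORTER, KIND_UNKNOWN };

/* Because state numbers are unique across molecule kinds, the kind can be
   recovered from the state number alone. */
enum MoleculeKind kind_of(int state) {
    if (state >= NFKB_FREE_CYTOPLASM && state <= NFKB_IKBA_NUCLEUS) return KIND_NFKB;
    if (state >= IKBA_FREE_CYTOPLASM && state <= IKBA_FREE_NUCLEUS) return KIND_IKBA;
    if (state == IMPORTER_DORMANT || state == IMPORTER_ACTIVE) return KIND_IMPORTER;
    if (state == EXPORTER_DORMANT || state == EXPORTER_ACTIVE) return KIND_EXPORTER;
    return KIND_UNKNOWN;
}
```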

With the state numbers designed, the next part is the memory part, which consists only of variable definitions. In the last model the shape was a box, but this model has the shape of a cell, so the structure and boundaries are more complex. However, the coordinate systems are the same as in the last model, Cartesian and polar, and each serves the same purpose: Cartesian coordinates take care of locations and polar coordinates control the movement of molecules. The boundaries are drawn by the molecules themselves. With both coordinate systems working together, nuclear receptors lie on the nuclear membrane and the other molecules move in the region where they belong, e.g. inside the cytoplasm or the nucleus. If a molecule is about to cross a boundary, its movement can be reversed to pull it back.

The states part is exactly the same as in the last model. The functions part, however, is not.

Because this model involves nuclear receptors, some new sets of rules are needed in this part. It should also be noted that each bond with a nuclear receptor needs a delay before the molecule is unbound and released into a new region. For example, when NF-κB binds with a nuclear importing receptor, after a while its state should change to NF-κB moving freely inside the nucleus. The last function, move, is responsible for drawing the boundary: it should let nuclear receptors move only on the nuclear membrane and keep other molecules moving in the right regions.

The messages part is also the same as in the last model, containing the location message and the bond message. The location message contains each molecule's state, Cartesian coordinates and id number; the bond message contains the sender's id and state, the receiver's state, a bind/unbind tag and a distance value.

The iteration files have the same structure, but they now contain more kinds of molecules.

In the next chapter, I follow this design and discuss the implementation of the NF-κB signalling pathway model in detail.

Section 4.5: Design of NF-κB & MAP Kinase Signalling Pathway Combined Model

The structure of this model is almost the same as that of the NF-κB model, but a new pathway, the MAP kinase pathway, is added. Because there is no existing Matlab model for this one, the relationship and cross-talk between the two pathways is important. However, according to the Academic Unit of Cell Biology, Division of Genomic Medicine at the University of Sheffield, the only known relationship between the two pathways is the one shown in Figure 4.3; they have not yet worked out the exact cross-talk between the NF-κB and MAP kinase pathways. The design for this model is therefore to add a new pathway to the last model. Figure 4.3 shows that molecules from outside the cell pass through the toll receptor; some of them go into the NF-κB pathway and the others go into the MAP kinase pathway, with a given probability. Since the model only covers intracellular behaviour, the initial molecules are assigned in the first iteration file.

Figure 4.3 NF-κB & MAP Kinase Signalling Pathway Relation [32]

Some more states need to be added. From Figure 3.4 in the last chapter, it is possible to treat Ras, SOS and Grb2 as a single molecule that acts like molecule A in the first model, while Active-Ras acts like molecule B in the first model.



Chapter 5: Implementation and Testing



This chapter describes in detail how the three models in my project were implemented. Once the models are finished it is important to test and evaluate them, so the testing method is also covered in this chapter.

Section 5.1: Implementation of Three Models

This section describes the implementation of the models in my project. Following the design from the last chapter, the implementation process of all three models is explained. The first two models are converted from two Matlab models, but because of the differences between Matlab and the X-machine framework, the best way to convert them is to take the ideas and algorithms from the Matlab code and then write them directly in XML, following the XML specification of the X-machine framework.

Section 5.1.1: Implementation of Chemical Interaction Model

In Matlab, the chemical interaction model works as follows. First, it defines the constants needed, such as the time step, box length and speed range. Second, it generates the initial molecule positions and plots them immediately. Third, it creates the initial direction vectors. Fourth, it loops over the molecules, looking from the perspective of each molecule A for a suitable molecule B to bind with. Fifth, it controls the movement of each molecule and keeps it inside the interaction box. Finally, it draws a graph of concentration against time.

As noted in the last chapter, the X-machine framework has no tools of its own for generating initial values or drawing graphs, so external tools are needed, but these are not difficult to achieve.

This model starts with the main .xml file. The states for each molecule were defined in the last chapter, so next we need to define the constants and variables. The box length can be defined as a constant 3000 (in units of 1e-10 metres) and used later. The integer variables are the id number and the state number. The double variables are x, y and z for the Cartesian coordinates; postheta, posphi and posr for the polar coordinates; movetheta, movephi and mover for movement in polar coordinates; and iradius for the radius of the molecule.

Next comes the states part, in which three states are defined: output, input and move. The output state is associated with the Outputdata function and points to the input state. The input state is associated with the Inputdata function and has the move state as its destination. The move state is linked with the Move function and has the output state as its next state. The three states are thus linked together in a closed ring (see Figure 5.1).



Figure 5.1 States and relations in the X-machine: Output (initial), Input and Move, linked in a ring
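
The ring of states can be written down as a small transition function. This is a minimal sketch with hypothetical names; in the framework itself the states and their links are declared in the XML file rather than in C.

```c
enum State { STATE_OUTPUT, STATE_INPUT, STATE_MOVE };

/* Each state runs its associated function and then hands over to the
   next state in the ring: output -> input -> move -> output. */
enum State next_state(enum State s) {
    switch (s) {
    case STATE_OUTPUT: return STATE_INPUT;   /* after Outputdata */
    case STATE_INPUT:  return STATE_MOVE;    /* after Inputdata */
    case STATE_MOVE:   return STATE_OUTPUT;  /* after Move */
    }
    return STATE_OUTPUT;  /* invalid input: fall back to the initial state */
}
```

Three transitions starting from the output state return to the output state, which corresponds to one full iteration of the model.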

The next part is the functions part. As noted in the design chapter, this is the most complicated part. All the code written between the XML tags is actually C code, mostly if/else statements, while loops and other simple C constructs.

The first function is Outputdata. It sends out a location message using a method called add_location_message. The message carries the molecule's id, state, x, y and z, so that other molecules can decide how to react. This function is quite short.

The second function is Inputdata. It first defines some local variables for processing the location messages. Using a while loop, it fetches all the location messages and processes them one by one for each molecule A. For each message, it first checks whether the message comes from the molecule itself, and it also discards messages from other A molecules. It then checks whether the squared distance (no square root is taken) is less than the molecule's radius squared (radius2). If all these requirements are met, a bond message can be sent, containing the source molecule's id and state, the destination molecule's state, the distance between them and the integer 3 as the bind/unbind tag (3 means a try for a bond).
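
The checks that Inputdata applies to each location message can be sketched as a single predicate. The names should_try_bond, squared_distance and LocationMessage are hypothetical, and the sketch assumes that radius2 is simply iradius multiplied by itself; the real function reads its messages through the framework's own message calls, which are not reproduced here.

```c
#define STATE_A_FREE 0     /* molecule A, free in the box */
#define STATE_B_FREE 100   /* molecule B, free in the box */

typedef struct {
    int id;
    int state;
    double x, y, z;
} LocationMessage;

/* Squared Euclidean distance; no square root is needed for the comparison. */
static double squared_distance(double ax, double ay, double az,
                               double bx, double by, double bz) {
    double dx = ax - bx, dy = ay - by, dz = az - bz;
    return dx * dx + dy * dy + dz * dz;
}

/* Returns 1 when a free molecule A at (x, y, z) should send a bond try
   (tag 3) in response to this location message, 0 otherwise. */
int should_try_bond(int self_id, double x, double y, double z,
                    double iradius, const LocationMessage *msg) {
    if (msg->id == self_id) return 0;          /* message from itself */
    if (msg->state != STATE_B_FREE) return 0;  /* e.g. another molecule A */
    double radius2 = iradius * iradius;
    return squared_distance(x, y, z, msg->x, msg->y, msg->z) < radius2;
}
```

Comparing squared values gives the same result as comparing distances, because both sides are non-negative, and it avoids a square root for every message.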

The third function is Checkbondtries. It processes the bond messages with 3 on the bind/unbind tag and decides whether the bond should be made. Once the ids of both associated molecules have been checked, a new bond message is sent with the same information, but with 0 on the bind/unbind tag (0 means bind) and 0.0 for the distance, since the distance is no longer useful once the molecules are bound. The molecule B is also freed from memory (it disappears) by a return 1 in an if/else statement.
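
One possible way to decide between several bond tries is to keep the one with the smallest distance and then turn it into a bind message. The sketch below assumes this closest-candidate rule and assumes the tries are available in an array; the struct and function names are hypothetical and the framework's message-reading calls are not shown.

```c
enum BondTag { TAG_BIND = 0, TAG_UNBIND = 1, TAG_TRY = 3 };

typedef struct {
    int sender_id;
    int sender_state;
    int receiver_state;
    int bindunbind;
    double distance;
} BondMessage;

/* Index of the bond try (tag 3) with the smallest distance, or -1 if the
   array contains no bond try at all. */
int closest_try(const BondMessage *msgs, int count) {
    int best = -1;
    for (int i = 0; i < count; i++) {
        if (msgs[i].bindunbind != TAG_TRY) continue;  /* only bond tries */
        if (best < 0 || msgs[i].distance < msgs[best].distance) best = i;
    }
    return best;
}

/* Turn an accepted try into the bind message described in the text:
   same information, tag 0 and distance 0.0. */
BondMessage confirm_bond(BondMessage try_msg) {
    try_msg.bindunbind = TAG_BIND;
    try_msg.distance = 0.0;
    return try_msg;
}
```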

The fourth and last function in this model is Move. It processes the bond messages with 0 on the bind/unbind tag and makes the bond, which means the molecule's state is also changed here (from 0 to 1 in this model). The rest of the function controls the movement of the molecules. Using Cartesian coordinates for location and polar coordinates for movement, all the molecules are restricted to the box. The movement follows Brownian motion: molecules move freely inside the box within the defined speed range.

The last part is the messages part. As in the design chapter, two kinds of messages are defined: the location message and the bond message.

That is all for the main .xml file. Next we need to make an initial iteration .xml file and a visualisation programme. The programme that creates the initial iteration file and the visualisation programme are part of the X-machine framework, and Simon Coakley has already written some examples, so they only need to be changed to suit this model.

The programme that creates the initial iteration file is written in C. First, some variables are defined: the initial number of molecules and their moving speeds. The main part consists of for loops, one per type of molecule, each of which generates the assigned number of molecules with random coordinates and speeds.
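
The per-type loop can be sketched as below. This is an illustration only: the function and field names are hypothetical, the standard rand() function stands in for whatever random source the real programme uses, and writing the molecules out to the iteration .xml file is left out because its format is not described here.

```c
#include <stdlib.h>

typedef struct {
    int id;
    int state;
    double x, y, z;  /* random position inside the box */
    double mover;    /* random speed (step length per iteration) */
} Molecule;

/* Uniform random double in [lo, hi). */
static double uniform(double lo, double hi) {
    return lo + (hi - lo) * ((double)rand() / ((double)RAND_MAX + 1.0));
}

/* Fill out[0..count-1] with molecules of one type and return the next
   unused id, so that successive calls produce unique ids. */
int create_molecules(Molecule *out, int count, int state, int first_id,
                     double box_length, double min_speed, double max_speed) {
    for (int i = 0; i < count; i++) {
        out[i].id = first_id + i;
        out[i].state = state;
        out[i].x = uniform(0.0, box_length);
        out[i].y = uniform(0.0, box_length);
        out[i].z = uniform(0.0, box_length);
        out[i].mover = uniform(min_speed, max_speed);
    }
    return first_id + count;
}
```

The function would be called once per molecule type, for example once with state 0 for the A molecules and once with state 100 for the B molecules, passing the returned id into the next call.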

The visualisation programme is also written in C and uses some OpenGL libraries. It reads each of the iteration files and displays each type of molecule in a different colour. As the iteration file changes, the molecules can be displayed as moving objects. It can also rotate the view, save the display of each iteration as an image, and so on. Figure 5.2 shows some images of the model visualisation:

Figure 5.2 Visualisation of Chemical Interaction Model



Section 5.1.2: Implementation of the NF-κB Signalling Pathway Model

The Matlab model of the NF-κB signalling pathway is similar to the chemical interaction model, but it is more complex and involves many more molecules and receptors. The model works as follows. First, it defines the constants needed. Second, it assigns initial positions in spherical polar coordinates and converts them into Cartesian coordinates. Third, it uses a large while loop to carry out the interactions according to defined sets of rules. Finally, the results are drawn on a graph of concentration against time.

With the state numbers defined, the constants and variables are defined next. The only constant defined here is the receptor delay, with a value of 10. The variables are exactly the same as in the chemical interaction model: id number, state number, x, y, z, postheta, posphi, posr, movetheta, movephi and iradius.

The states part is the same as in the last model: three states are defined, output, input and move, and their relationship follows Figure 5.1.

In the functions part it is easy to see the difference between the NF-κB pathway model and the chemical interaction model. As in the last model, there are four functions in this part: Outputdata, Inputdata, Checkbondtries and Move.

The first function is Outputdata. This time the function is not as simple as in the last model; it has two sub-parts. The first sends out the location message, which contains the molecule's id, state, x, y and z, so that other molecules can decide how to react. The second sub-part is for nuclear receptors. It checks the state, and if the molecule is an active nuclear receptor it decreases the receptor delay counter. Once the receptor delay counter reaches zero, the receptor sends out a bond message to unbind the molecule that was bound to it. The bond message contains both molecules' states and has 1 on the bind/unbind tag (1 corresponds to unbind). The active nuclear receptor then releases the molecule that was bound to it, and the receptor's state changes to dormant.
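
The receptor sub-part amounts to a countdown. The following sketch assumes that the counter is set to the receptor delay constant when the receptor becomes active and is decreased once per iteration; the Receptor struct and the tick_receptor name are hypothetical, and sending the actual unbind message is left to the caller.

```c
#define RECEPTOR_DELAY 10  /* receptor delay constant given in the text */

enum {
    IMPORTER_DORMANT = 20, IMPORTER_ACTIVE = 21,
    EXPORTER_DORMANT = 30, EXPORTER_ACTIVE = 31
};

typedef struct {
    int state;  /* one of the four receptor states above */
    int delay;  /* iterations left before the bound molecule is released */
} Receptor;

/* Advance an active receptor by one iteration. Returns 1 when the delay has
   run out: the caller should then send a bond message with tag 1 (unbind),
   and the receptor has already been switched back to dormant. */
int tick_receptor(Receptor *r) {
    if (r->state != IMPORTER_ACTIVE && r->state != EXPORTER_ACTIVE) return 0;
    if (r->delay > 0) r->delay--;
    if (r->delay > 0) return 0;  /* still holding the molecule */
    r->state = (r->state == IMPORTER_ACTIVE) ? IMPORTER_DORMANT
                                             : EXPORTER_DORMANT;
    return 1;
}
```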

The second function is Inputdata. It starts by setting some local variables for use in the function and then checks the bond messages one by one in a while loop. If a bond message has 1 on the bind/unbind tag, the state of the associated molecule is changed to place it in the appropriate region. There are six situations:

1. If NF-κB is bound to an importing nuclear receptor, then make it free in the nucleus;
2. If NF-κB is bound to an exporting nuclear receptor, then make it free in the cytoplasm;
3. If NF-κB & IκBα is bound to an importing nuclear receptor, then make it free in the nucleus;
4. If NF-κB & IκBα is bound to an exporting nuclear receptor, then make it free in the cytoplasm;
5. If IκBα is bound to an importing nuclear receptor, then make it free in the nucleus;
6. If IκBα is bound to an exporting nuclear receptor, then make it free in the cytoplasm.
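
The six situations map naturally onto a switch over the state numbers from the design chapter. The sketch below is an assumption about how that mapping looks: in particular it assumes that a released NF-κB & IκBα complex ends up in state 7 (bound to IκBα in nucleus) or state 1 (bound to IκBα in cytoplasm). The identifier names are hypothetical.

```c
enum {
    NFKB_FREE_CYTOPLASM = 0,  NFKB_IKBA_CYTOPLASM = 1,
    NFKB_ON_IMPORTER = 2,     NFKB_IKBA_ON_IMPORTER = 3,
    NFKB_ON_EXPORTER = 4,     NFKB_IKBA_ON_EXPORTER = 5,
    NFKB_FREE_NUCLEUS = 6,    NFKB_IKBA_NUCLEUS = 7,
    IKBA_FREE_CYTOPLASM = 10, IKBA_ON_IMPORTER = 11,
    IKBA_ON_EXPORTER = 12,    IKBA_FREE_NUCLEUS = 13
};

/* New state of a molecule after its nuclear receptor releases it.
   States that are not bound to a receptor are returned unchanged. */
int state_after_release(int state) {
    switch (state) {
    case NFKB_ON_IMPORTER:      return NFKB_FREE_NUCLEUS;    /* situation 1 */
    case NFKB_ON_EXPORTER:      return NFKB_FREE_CYTOPLASM;  /* situation 2 */
    case NFKB_IKBA_ON_IMPORTER: return NFKB_IKBA_NUCLEUS;    /* situation 3 */
    case NFKB_IKBA_ON_EXPORTER: return NFKB_IKBA_CYTOPLASM;  /* situation 4 */
    case IKBA_ON_IMPORTER:      return IKBA_FREE_NUCLEUS;    /* situation 5 */
    case IKBA_ON_EXPORTER:      return IKBA_FREE_CYTOPLASM;  /* situation 6 */
    default:                    return state;
    }
}
```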



The function then fetches the location messages for each molecule. It first checks whether the location message was sent by the molecule itself. If not, it checks the distance between the molecule and the message sender. If the squared distance is less than radius2, it checks whether their states match any of the four situations below (s: sender, r: receiver of the location message):

1. r: NF-κB free in cytoplasm and s: IκBα free in cytoplasm;
2. r: NF-κB free in nucleus and s: IκBα free in nucleus;
3. r: dormant nuclear importing receptor and s: (NF-κB free in cytoplasm, IκBα free in cytoplasm or NF-κB & IκBα free in cytoplasm);
4. r: dormant nuclear exporting receptor and s: (NF-κB free in nucleus, IκBα free in nucleus or NF-κB & IκBα free in nucleus).

If any of the four situations is matched, a bond message with 3 on the bind/unbind tag (a try for a bond) is sent.
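
The four situations can be expressed as one predicate over the receiver's and the sender's state numbers. This sketch assumes that "NF-κB & IκBα free in cytoplasm" corresponds to state 1 and "NF-κB & IκBα free in nucleus" to state 7; the names are hypothetical.

```c
enum {
    NFKB_FREE_CYTOPLASM = 0,  NFKB_IKBA_CYTOPLASM = 1,
    NFKB_FREE_NUCLEUS = 6,    NFKB_IKBA_NUCLEUS = 7,
    IKBA_FREE_CYTOPLASM = 10, IKBA_FREE_NUCLEUS = 13,
    IMPORTER_DORMANT = 20,    EXPORTER_DORMANT = 30
};

/* 1 if receiver state r and sender state s match one of the four
   situations in which a bond try (tag 3) is sent, 0 otherwise. */
int can_try_bond(int r, int s) {
    switch (r) {
    case NFKB_FREE_CYTOPLASM:  /* situation 1 */
        return s == IKBA_FREE_CYTOPLASM;
    case NFKB_FREE_NUCLEUS:    /* situation 2 */
        return s == IKBA_FREE_NUCLEUS;
    case IMPORTER_DORMANT:     /* situation 3 */
        return s == NFKB_FREE_CYTOPLASM || s == IKBA_FREE_CYTOPLASM ||
               s == NFKB_IKBA_CYTOPLASM;
    case EXPORTER_DORMANT:     /* situation 4 */
        return s == NFKB_FREE_NUCLEUS || s == IKBA_FREE_NUCLEUS ||
               s == NFKB_IKBA_NUCLEUS;
    default:
        return 0;
    }
}
```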

The third function is Checkbondtries. It only processes bond messages with 3 on the bind/unbind tag. When it receives such a message, it checks whether the sender is the closest one in distance; if so, the two molecules are bound to each other. It first sends a bond message with 0 on the bind/unbind tag, which means bind, and then changes
