Introduction to Bioinformatics
Systems biology: modeling biological networks
Systems biology
- Study of "whole biological systems"
- "Wholeness": organization of dynamic interactions
  - Different behaviour of the individual parts when isolated or when combined together
  - Systems cannot be fully understood by analysis of their components in isolation
  -- Ludwig von Bertalanffy, 1934 (according to Zvelebil & Baum)
Outline
- 1. Systems biology and biological networks
  - Transcriptional regulation
  - Metabolism
  - Signalling networks
  - Protein interactions
- 2. Modeling frameworks
  - Continuous and discrete models
  - Static and dynamic models
- 3. Identification of models from data
1. Systems biology
- Systems biology – biology of networks
  - Shift from component-centered biology to systems of interacting components
[Figure: prokaryotic cell and eukaryotic cell. Source: http://en.wikipedia.org/wiki/Cell_(biology), Mariana Ruiz, Magnus Manske]
Interactions within the cell
- Density of biomolecules in the cell is high: plenty of interactions!
- Figure shows a cross-section of an Escherichia coli cell
  - Green: cell wall
  - Blue, purple: cytoplasmic area
  - Yellow: nucleoid region
  - White: mRNA
[Figure source: http://mgl.scripps.edu/people/goodsell/illustration/public, David S. Goodsell]
Paradigm shift from study of individual components to systems
[Figure: plot with axis labels "System size" and "Number of different systems"; example systems (System 1, System 2) are drawn as components joined by interactions. A second version of the plot adds the label "Level of model detail".]
Biological systems of networks
- Transcriptional regulation
  [Figure labels: gene, regulatory region, transcription factor, co-operative regulation, microarray experiments, gene product (protein)]
- Metabolism
  [Figure labels: enzyme, metabolite]
- Signal transduction
  [Figure labels: signal molecule & receptor, activated relay molecule, inactive signaling protein, active signaling protein, end product of the signaling cascade (activated enzyme)]
Protein interaction networks
- Protein interaction is the unifying theme of all regulation at the cellular level
- Protein interaction occurs in every cellular system, including the systems introduced earlier
- Data on protein interaction reveals associations both within a system and between systems
[Figure: protein interaction]
2. Graphs as models of biological networks
- A graph is a natural model for biological systems of networks
- Nodes of a graph represent biomolecules, edges represent interactions between the molecules
- A graph can be undirected or directed
- To address questions beyond simple connectivity (node degree, paths), one can enrich the graph models with information relevant to the modeling task at hand
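The connectivity questions mentioned above (node degree, paths) can be sketched on a plain undirected graph. The node names and edges below are hypothetical and stand in for biomolecules and their interactions; the shortest-path search is ordinary breadth-first search, chosen here only for illustration.

```python
from collections import deque

# Hypothetical interaction list: each pair is an undirected edge between two biomolecules.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]


def build_undirected(edge_list):
    """Build an adjacency map: node -> set of neighbouring nodes."""
    graph = {}
    for u, v in edge_list:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def degree(graph, node):
    """Number of interaction partners of a node."""
    return len(graph[node])


def shortest_path(graph, source, target):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


graph = build_undirected(edges)
print(degree(graph, "A"))
print(shortest_path(graph, "E", "D"))
```

A directed graph would store each edge in one direction only; the rest of the sketch stays the same.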
Enriching examples: transcriptional regulation
- Regulatory effects can be (roughly) divided into
  - activation
  - inhibition
- We can encode this distinction by labeling the edges by '+' and '-', for example
- Graph models of transcriptional regulation are called gene(tic) regulatory networks
[Figure: genes 1, 2 and 3 connected by activation and inhibition edges; a repressor and an activator are marked]
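A signed edge labeling of this kind could be stored as a mapping from directed edges to '+' or '-'. The small network below is hypothetical (it is not read off the slide figure) and only shows how the labels can be queried.

```python
# Hypothetical gene regulatory network: (regulator, target) -> '+' (activation) or '-' (inhibition).
regulation = {
    ("gene1", "gene2"): "+",
    ("gene1", "gene3"): "-",
    ("gene3", "gene2"): "-",
}


def regulators_of(target, sign):
    """Return the regulators acting on `target` with the given sign, in sorted order."""
    return sorted(
        regulator
        for (regulator, regulated), label in regulation.items()
        if regulated == target and label == sign
    )


print(regulators_of("gene2", "+"))  # activators of gene2
print(regulators_of("gene2", "-"))  # repressors of gene2
```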
Enriching examples: more transcriptional regulation
A gene regulatory network might be enriched further: in this diagram, proteins working cooperatively as regulators are marked with a black circle.
This network is a simplified part of cell cycle regulation.
Frameworks for biological network modeling
- A variety of information can be encoded in graphs
- Modeling frameworks can be categorised based on what sort of information they include
  - Continuous and/or discrete variables?
  - Static or dynamic model? (take time into account?)
  - Spatial features? (consider the physical location of molecules in the cell?)
- Choice of framework depends on what we want to do with the model:
  - Data exploration
  - Explanation of observed behaviour
  - Prediction
Classification of modeling frameworks (static or dynamic models, discrete or continuous variables)
- Static models, discrete variables: plain graphs, Bayesian networks
- Dynamic models, discrete variables: (probabilistic) Boolean networks, stochastic simulation, dynamic Bayesian networks
- Static models, continuous variables: biochemical systems theory (in steady-state), metabolic control analysis, constraint-based models
- Dynamic models, continuous variables: differential equations, biochemical systems theory (general)
Dynamic models: differential equations
- In a differential equation model
  - variables $x_i$ correspond to the concentrations of biological molecules;
  - change of variables over time is governed by rate equations,
    $\frac{dx_i}{dt} = f_i(x), \quad 1 \le i \le n$
- In general, $f_i(x)$ is an arbitrary function (not necessarily linear)
- Note that the graph structure is encoded by parameters to functions $f_i(x)$
Properties of a differential equation model
- The crucial step in specifying the model is to choose functions $f_i(x)$ to balance
  - model complexity (number of parameters)
  - level of detail
- An overly complex model may need more data to specify than is available
Example of a differential equation model of transcriptional regulation
- Let $x$ be the concentration of the target gene product
- A simple kinetic (i.e., derived from reaction mechanics) model could take into account
  - multiple regulators of the target gene and
  - degradation of gene products
  and assume that regulation effects are independent of each other
- The rate equation for the change of $x$ could then be
  [equation not preserved in the source]
  where $k_1$ is the maximal rate of transcription of the gene, $k_2$ is the rate constant of degradation of the target gene product, $w_j$ is the regulatory weight of regulator $j$ and $y_j$ is the concentration of regulator $j$
- Number of parameters?
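As a rough illustration of a rate equation of this kind, the sketch below assumes a sigmoidal form, dx/dt = k1 / (1 + exp(-sum_j w_j y_j)) - k2 x. This form is an assumption made for the example and is not necessarily the equation the example refers to; the parameter values are arbitrary and the regulator concentrations y_j are held constant. Under this assumed form, a target gene with n regulators has n + 2 parameters (the n weights plus k1 and k2).

```python
import math


def rate(x, y, w, k1, k2):
    """Assumed rate law: sigmoidal production driven by regulators, minus linear degradation."""
    activation = sum(w_j * y_j for w_j, y_j in zip(w, y))
    production = k1 / (1.0 + math.exp(-activation))
    degradation = k2 * x
    return production - degradation


def simulate(x0, y, w, k1, k2, dt=0.01, steps=5000):
    """Forward Euler integration of dx/dt = rate(x) with constant regulator levels."""
    x = x0
    for _ in range(steps):
        x += dt * rate(x, y, w, k1, k2)
    return x


# Arbitrary illustrative values: one activator (positive weight), one repressor (negative weight).
y = [1.0, 0.5]
w = [2.0, -1.0]
k1, k2 = 1.0, 0.5

x_end = simulate(0.0, y, w, k1, k2)
# With constant regulators, production balances degradation at x* = production / k2.
x_steady = (k1 / (1.0 + math.exp(-1.5))) / k2
print(x_end, x_steady)
```

Forward Euler is used only because it is short; a stiff or larger system would normally call for a proper ODE solver.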
Differential equation model for metabolism
- Likewise, rate equations can be derived for differential equation models of metabolism
- For simple enzymes, two parameters might be enough
- Realistic modeling of some enzymes requires knowledge of 10-20 parameters
- Such data is usually not available in a high-throughput manner
Biochemical systems theory (BST)
- BST is a modeling framework where differential rate equations are restricted to the following power-law form,
  [equation not preserved in the source]
  where
  - $\alpha_i$ is the rate constant for molecule $i$ and
  - $g_{ij}$ is a kinetic constant for molecule $i$ and reaction $j$
- BST approximates the kinetic system and requires fewer parameters than the generic kinetic model
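To make the power-law restriction concrete, the sketch below assumes the simplest form compatible with the symbols named above, dx_i/dt = alpha_i * prod_j x_j^(g_ij). This is an assumption for illustration and may differ from the exact BST equation intended here (full BST formulations typically also carry a consumption term); the numbers are arbitrary.

```python
def power_law_rates(x, alpha, g):
    """Evaluate the assumed power-law rates alpha_i * prod_j x_j ** g[i][j] for every molecule i."""
    rates = []
    for alpha_i, g_i in zip(alpha, g):
        term = alpha_i
        for x_j, g_ij in zip(x, g_i):
            term *= x_j ** g_ij
        rates.append(term)
    return rates


# Arbitrary illustrative values for two molecules.
x = [2.0, 4.0]                      # concentrations
alpha = [1.0, 3.0]                  # rate constants
g = [[1.0, 0.5], [0.0, -1.0]]       # kinetic constants (exponents); 0 means "no influence"

rates = power_law_rates(x, alpha, g)
print(rates)
```

In this assumed form a system of n molecules has n rate constants and at most n * n exponents, and a zero exponent plays the role of a missing edge in the graph.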
Interestingly, if we assume that the concentrations are constant over time (steady-state), an analytical solution can be found to a BST model.
But then we throw away the dynamics of the system!
Steady-state modeling
- Is the study of steady-states meaningful?
- If we assume $dx_i/dt = 0$, we restrict ourselves to systems where the production of a molecule is balanced by its consumption
[Figure: two enzymes connected through a metabolite. In a metabolic steady-state, these two enzymes consume and produce the metabolite in the middle at the same rate]
Constraint-based modeling
- Constraint-based modeling is a linear framework, where the system is assumed to be in a steady-state
- Model is represented by a stoichiometric matrix $S$, where $S_{ij}$ gives the number of molecules of type $i$ produced in reaction $j$ in a time unit.
[Figure: example reaction network and its stoichiometric matrix; $S_{ij} = 0$ if value omitted]
Constraint-based modeling
- Since variables $x_i$ are constant, the questions asked now deal with reaction rates
- For instance, we could characterise solutions to the linear steady-state condition, which can be written in matrix notation as
  $Sv = 0$
- Solutions $v$ are reaction rate vectors, which for example reveal alternative pathways inside the network
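The steady-state condition Sv = 0 can be checked directly for candidate rate vectors. The three-metabolite, five-reaction network below is hypothetical (it is not the network from the slide figure); it has a direct route and a detour from A to B, so two different rate vectors satisfy the condition.

```python
# Hypothetical network. Rows: metabolites A, B, C. Columns: reactions
#   R1: -> A (uptake), R2: A -> B, R3: A -> C, R4: C -> B, R5: B -> (export)
S = [
    [1, -1, -1, 0, 0],   # A
    [0, 1, 0, 1, -1],    # B
    [0, 0, 1, -1, 0],    # C
]


def mat_vec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(s * v for s, v in zip(row, vector)) for row in matrix]


def is_steady_state(matrix, rates, tol=1e-9):
    """True if every metabolite is produced and consumed at the same total rate."""
    return all(abs(net) <= tol for net in mat_vec(matrix, rates))


v_direct = [1, 1, 0, 0, 1]   # A -> B directly
v_detour = [1, 0, 1, 1, 1]   # A -> C -> B
v_broken = [1, 1, 0, 0, 0]   # B accumulates: not a steady state

print(is_steady_state(S, v_direct), is_steady_state(S, v_detour), is_steady_state(S, v_broken))
```

Because the condition is linear, any sum of the two valid vectors is again a solution; characterising all solutions amounts to describing the null space of S.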
Discrete models: Boolean networks
- Boolean networks have been widely used in modeling gene regulation
  - Switch-like behaviour of gene regulation resembles logic circuit behaviour
  - Conceptually easy framework: models easy to interpret
  - Boolean networks extend naturally to dynamic modeling
Boolean networks
A Boolean network $G(V, F)$ contains
- Nodes $V = \{x_1, \ldots, x_n\}$, $x_i = 0$ or $x_i = 1$
- Boolean functions $F = \{f_1, \ldots, f_n\}$
- Boolean function $f_i$ is assigned to node $x_i$
[Figure: logic diagram for activity of Rb, built from NOT and AND gates]
Dynamics in Boolean networks
- Dynamic behaviour can be simulated
- State of a variable $x_i$ at time $t+1$ is calculated by function $f_i$ with input variables at time $t$
- Dynamics are deterministic: state of the network at any time depends only on the state at time 0.
Example of Boolean network dynamics
- Consider a Boolean network with 3 variables x1, x2 and x3 and functions given by
  - x1 := x2 and x3
  - x2 := not x3
  - x3 := x1 or x2

  t   x1  x2  x3
  0   0   0   0
  1   0   1   0
  2   0   1   1
  3   1   0   1
  4   0   0   1
  ...
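The example above can be simulated in a few lines. The sketch applies all three update functions synchronously, which is the update scheme the table implies; the function and variable names are chosen for this illustration.

```python
def step(state):
    """One synchronous update: every variable at time t+1 is computed from the state at time t."""
    x1, x2, x3 = state
    return (
        x2 & x3,   # x1 := x2 and x3
        1 - x3,    # x2 := not x3
        x1 | x2,   # x3 := x1 or x2
    )


def simulate(initial, steps):
    """Return the list of states at times 0, 1, ..., steps."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(step(trajectory[-1]))
    return trajectory


trajectory = simulate((0, 0, 0), 5)
for t, state in enumerate(trajectory):
    print(t, *state)
```

Running one step past the table shows that the state at time 5 equals the state at time 0, so from this initial state the network cycles with period 5.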
Problems with Boolean networks
- 0/1 modeling is unrealistic in many cases
- A deterministic Boolean network does not cope well with missing or noisy data
- Many Boolean networks to choose from: specifying the model requires a lot of data
  - A Boolean function has n parameters, or inputs
  - Each input is 0 or 1 => $2^n$ possible input states
  - The function is specified by input states for which f(x) = 1 => $2^{2^n}$ possible Boolean functions
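The counting argument can be checked by brute force for small n: a Boolean function is one choice of output (0 or 1) for each of the 2^n input states. The helper names below are made up for this sketch.

```python
from itertools import product


def count_input_states(n):
    """Number of distinct 0/1 assignments to n inputs."""
    return 2 ** n


def count_boolean_functions(n):
    """Number of distinct Boolean functions of n inputs."""
    return 2 ** (2 ** n)


def enumerate_truth_tables(n):
    """Explicitly list every function as a tuple of outputs, one per input state."""
    return list(product([0, 1], repeat=count_input_states(n)))


for n in range(1, 5):
    print(n, count_input_states(n), count_boolean_functions(n))
```

Already at n = 5 there are more than four billion candidate functions per node, which is why restricting the number of inputs matters when learning from limited data.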
3. Model identification from data
- We would like to learn a model from the data such that the learned model
  - Explains the observed data
  - Predicts the future data well
- Generalization property: model has a good tradeoff between a good fit to the data and model simplicity
Three steps in learning a model
- Representation: choice of modeling framework, how to encode the data into the model
  - Restricting models: number of inputs to a Boolean function, for example
- Optimization: choosing the "best" model from the framework
  - Structure, parameters
- Validation: how can one trust the inferred model?
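One way to picture the representation and optimization steps is a brute-force search for Boolean functions that are consistent with an observed time series, with the number of inputs restricted. The sketch below is an illustration under simplifying assumptions (noise-free, fully observed data); the function name and the consistency criterion are choices made for this example, not a method prescribed here. The time series is the Boolean network example with states at t = 0..4.

```python
from itertools import combinations

# Observed states (x1, x2, x3) at t = 0..4, taken from the Boolean network example.
series = [(0, 0, 0), (0, 1, 0), (0, 1, 1), (1, 0, 1), (0, 0, 1)]


def consistent_input_sets(series, target, max_inputs):
    """Find input sets of size <= max_inputs that determine the target's next value.

    An input set is consistent if no two observed transitions show the same
    input values at time t but different target values at time t+1.
    Returns a list of (input indices, partial truth table) pairs.
    """
    n = len(series[0])
    found = []
    for k in range(1, max_inputs + 1):
        for inputs in combinations(range(n), k):
            table = {}
            ok = True
            for now, nxt in zip(series, series[1:]):
                key = tuple(now[i] for i in inputs)
                out = nxt[target]
                if table.setdefault(key, out) != out:
                    ok = False
                    break
            if ok:
                found.append((inputs, table))
    return found


# Which single variable explains x2 (index 1)?
result = consistent_input_sets(series, target=1, max_inputs=1)
print(result)
```

With one input allowed, only x3 (index 2) is consistent for x2, and the recovered table maps 0 to 1 and 1 to 0, i.e. "not x3". Allowing more inputs admits further consistent but more complex candidates, which is where the fit versus simplicity tradeoff enters.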
Conclusions
- Graph models are important tools in systems biology
- Choice of modeling framework depends on the properties of the system under study
- Particular care should be taken in dealing with missing and incomplete data: choice of the framework should take the quality of data into account
References and further reading
- Florence d'Alché-Buc and Vincent Schachter: Modeling and identification of biological networks. In Proc. Intl. Symposium on Applied Stochastic Models and Data Analysis, 2005.
- Marketa Zvelebil and Jeremy O. Baum: Understanding Bioinformatics. Garland Science, 2008.
- Hiroaki Kitano: Systems Biology: A Brief Overview. Science 295, 2002.
- Marie E. Csete and John C. Doyle: Reverse engineering of biological complexity. Science 295, 2002.
- James M. Bower and Hamid Bolouri (eds): Computational Modeling of Genetic and Biochemical Networks. MIT Press, 2001.