
Optimization and Control for Metabolic Networks

Alexandre João Borralho Domingues (Licenciado)

Dissertation submitted to obtain the degree of Master in

Electrical and Computer Engineering

Jury

President: Professor Doutor Carlos Jorge Ferreira Silvestre

Supervisor: Professor Doutor João Manuel Lage de Miranda Lemos

Co-supervisor: Professora Doutora Susana de Almeida Mendes Vinga Martins

Member: Professor Doutor António Pedro Rodrigues de Aguiar

November 2009


Acknowledgments

This work would not have been possible without the help of Prof. João Miranda Lemos, who provided the technical basis, pointed me in the right direction and was always patient with the many problems encountered, and Prof. Susana Vinga, who always gave me all possible support and helpful comments. Thank you for all the help and for giving me this opportunity.

This dissertation was carried out within the framework of project DynaMo (PTDC/EEA-ACR/69530/2006), and I would like to thank the whole KDBIO group. This work also benefited greatly from the contribution of Dr. Ana Rute Neves, Prof. Helena Santos and Dr. Paula Gaspar, from ITQB, who provided the data and valuable information.

Thank you, Joana, for encouraging me to do this Master's and for supporting me through all the bad-mood days. Thank you for always being caring; it is a gift to have you in my life.

Finally, a big thank you to my parents and Inês for supporting me in every possible way.


Abstract

The rapidly evolving area of Systems Biology aims to provide a deeper understanding of biological systems at the system level. A common application is the systematization of metabolic networks using mathematical models. Valid models can avoid time-consuming and expensive experiments when testing and acquiring data from these networks. The increasing availability of these models and data poses new challenges for optimization. Due to the high level of complexity and uncertainty associated with these networks, the suggested models often lack the detail and reliability required to determine proper optimization strategies. A possible approach to overcome this limitation is the combination of kinetic and stoichiometric models. The work reported addresses the optimization and control of metabolic networks along these lines.

In the first part of this dissertation, three control optimization methods (Direct Optimization and Bi-level Optimization using two different inner-optimization procedures), with different levels of complexity and assuming various degrees of process information, are presented and their results compared using a prototype network. The results obtained show that the bi-level optimization provides a good approximation for networks with incomplete kinetic information.

The process of formulating metabolic network models and estimating their parameters is complex, and there is no defined framework to obtain valid solutions. In the second part of this dissertation, a procedure to estimate parameters using data sets from different experiments is presented. The procedure is illustrated by a case study on the effect of Nisin on Mannitol production by Lactococcus lactis. The results obtained are encouraging, providing a consistent estimate of the model parameters.

Keywords

Metabolic Networks, Optimization, Control, Parameter Identification, Modeling.


Resumo

The emerging field of Systems Biology seeks to deepen the knowledge of biological systems at the level of their structural components. One common application is the systematization of metabolic networks using mathematical models.

Formulating mathematical models for metabolic networks can avoid the expensive and time-consuming experiments needed to test these networks. The growing availability of these models and of the corresponding data poses new challenges for the optimization of these networks and their products. Because of the great complexity and uncertainty associated with these networks, the suggested models often lack the detail and reliability that are indispensable for defining control strategies. One possible approach to overcome this limitation is the combination of kinetic and stoichiometric models. The work presented addresses the optimization and control of metabolic networks along these lines.

In the first part of this dissertation, three control optimization methods, with different levels of complexity and assuming different levels of information about the network, are presented. Their results are compared using a prototype network.

The process of formulating these models for metabolic networks and estimating their parameters is complex, and there is no defined systematic approach to obtain valid solutions. In the second part of this dissertation, a procedure is presented to estimate parameters using data sets from different experiments while guaranteeing the consistency of the estimates. This procedure is illustrated by the study of the effect of Nisin induction on Mannitol production in Lactococcus lactis.

Keywords

Metabolic Networks, Optimization, Control, Parameter Identification, Modeling.


Contents

1 Introduction
    1.1 Motivation
    1.2 Problem formulation
    1.3 State of the art
    1.4 Contributions
    1.5 Document structure

2 Synthetic problem
    2.1 Problem description
        2.1.1 Metabolic network modeling tools
        2.1.2 Prototype network model
        2.1.3 The optimization problem
    2.2 Optimization Methods
        2.2.1 The control function
        2.2.2 Pontryagin's Maximum Principle
        2.2.3 Flux Balance Analysis
        2.2.4 Geometric Programming
    2.3 Optimization Strategies
        2.3.1 Direct optimization
        2.3.2 Bi-Level Optimization algorithm structure
        2.3.3 Inner-optimization using Geometric Programming
        2.3.4 Inner-optimization using Linear Programming
        2.3.5 Pontryagin's Maximum Principle: Computational implementation
    2.4 Results and Discussion
        2.4.1 Direct optimization
        2.4.2 Bi-Level Optimization
        2.4.3 PMP: Computational implementation results

3 Model for Mannitol production with Nisin induction
    3.1 Problem description
        3.1.1 Mannitol model
        3.1.2 Mannitol model with Nisin induction
        3.1.3 Data sets description
        3.1.4 Parameters estimation
    3.2 Parameter estimation methods
        3.2.1 Estimation using one data set
        3.2.2 Estimation using multiple data sets
        3.2.3 Estimation using the Nisin data sets
        3.2.4 Further notes on estimation strategies
    3.3 Results
        3.3.1 Identification of set δ
        3.3.2 Identification of set δ using Nisin data sets
        3.3.3 Identification of set σ

4 Optimizing Mannitol production using Optimal Control
    4.1 Control using a step function
    4.2 Results

5 Conclusions


List of Figures

2.1 Prototype network: The maximization of the final value of u5 depends on the profile of the function f(t).
2.2 Inner-Optimization algorithm
2.3 Bi-Level optimization formulation
2.4 Result of the simulation using Direct optimization.
2.5 Comparison of three f(t) profiles. The solid line is the optimal t_reg obtained in the Direct optimization.
2.6 Temporal variation of metabolites x2, x4, and outputs u3 and u5 for three values of t_reg
2.7 Comparison of the temporal variation of u1, u3 and u5 with a fixed f(n)
2.8 Result of the optimization using the Inner Optimization with Geometric Programming (left) and Linear Programming (right).
2.9 Control function, Hamiltonian derivative and u5 evolution on several iterations.
3.1 Detail of a metabolic pathway of Lactococcus lactis [1]
3.2 Mannitol Model without Nisin induction
3.3 Aspect of a Hill Function with n = 20 and θ = 5
3.4 Mannitol Model with Nisin induction
3.5 Data Sets for Mannitol production. Vertical dashed lines represent the time of induction of Nisin.
3.6 Parameter estimation structure
3.7 Estimation of δ using the data set without Nisin.
3.8 Estimation of set δ using all the data sets. Each Nisin data set is modeled with an independent σ set.
3.9 Estimation of σ using the Nisin data sets and a fixed δ.
3.10 Plot of the obtained Hill-type functions for each Nisin data set.

List of Tables

2.1 Parameters used in the prototype network.
2.2 Results for the Direct Optimization using the Discrete form of the control function
3.1 Estimation of the parameters of set δ using the data set without Nisin.
3.2 Fine tuning of set δ using all data sets.
3.3 The three independent σ sets.
3.4 Common subset of σ obtained in the estimation using the Nisin data sets.
3.5 Independent σ parameters obtained for each of the Nisin data sets.
4.1 Three independent step function parameters, obtained on the first estimation with all data sets.
4.2 Three independent step function parameters, obtained on the second estimation with the Nisin data sets.
4.3 The common step function parameters, obtained on the third estimation #1 with the Nisin data sets.
4.4 Three independent values for t_nisin, obtained on the third estimation #1 with the Nisin data sets.
4.5 The common step function parameters, obtained on the third estimation #2 with the Nisin data sets.
4.6 Three independent values for t_nisin, obtained on the third estimation #2 with the Nisin data sets.

List of Acronyms

PMP Pontryagin’s Maximum Principle

FBA Flux Balance Analysis

dFBA Dynamic Flux Balance Analysis

GP Geometric Programming

LP Linear Programming

GMA General Mass Action

BST Biochemical Systems Theory

F6P Fructose 6-phosphate

1 Introduction

1.1 Motivation

The emerging area of Systems Biology is gradually changing the paradigms associated with the study of biological systems. Systems Biology studies the various parts of a biological system not as individual components, but as parts of the same system that interact to achieve a global function or characteristic.

The general nature of Systems Biology leads to different points of view. While some describe Systems Biology as a field of study in its own right, others consider it a paradigm. From an engineering point of view, it is commonly described as the application of dynamical systems theory to molecular biology.

A frequent area of application of Systems Biology is the systematization of biological systems by proposing mathematical models that describe the interactions of their molecular components. These models, together with the available genetic engineering tools, open exciting new areas of research.

Advances in genetic engineering have made available a wide selection of tools to manipulate organisms. Methods such as gene knock-down/up [2], where the expression of a certain gene is decreased/increased, gene knock-outs [3], where the expression of a certain gene is silenced, and gene substitutions, among others, provide degrees of freedom when manipulating an organism.

These tools are now common in genetic engineering and have proven useful in many situations. They can, for instance, be used to improve desired characteristics of certain organisms. A common example is the manipulation of metabolic networks in order to maximize the normal product yield, or even to redirect production to a flux that was residual or non-significant in the original network. Such an example is provided in [1], where a genetically modified strain of Lactococcus lactis was able to produce Mannitol.

Even though genetic engineering tools are robust, the high complexity and uncertainty associated with living organisms, and with the corresponding metabolic networks, make it extremely difficult to determine the manipulations and conditions needed to attain a certain objective.

Since a heuristic approach to such problems does not allow the full potential of metabolic engineering to be explored, these tools are now combined with methods from classical engineering areas, such as Electrical Engineering, and Physics, among others. Tools that have been used for several years in technical contexts are now being applied to genetic engineering under new paradigms and conditions, which in turn raises new obstacles that need to be solved when they are applied to Metabolic Engineering.

The combination of efforts from areas as diverse as Electronics, Control and Biology, among others, is very exciting and has already produced a variety of interesting results, ranging from the modification of small organisms to the manipulation of actual ecosystems [4].


1.2 Problem formulation

The work described in this thesis addresses two different problems.

A common situation in metabolic network engineering arises when a certain organism is capable, naturally or after genetic manipulation, of producing a product of interest, and this product competes with the natural objective of the cell [1] [5] [6]. Since the primary objective of most living organisms is assumed to be ensuring the continuity of the species, the natural objective of the cell is normally assumed to be the production of biomass or the formation of a growth precursor [7, 8].

The first part of this dissertation addresses an optimization problem in which the trade-off between biomass formation and product formation can be explored with a control variable, e.g. pH, temperature or gene inductors. The objective is to maximize the final concentration of a metabolite whose formation competes with the natural objective of the cell (e.g. maximization of biomass).

Solving this optimization problem requires a proper model of the organism. Unfortunately, the identification of the kinetic parameters of a metabolic network is still very difficult and represents an area of research in itself. A possible solution for the optimization problem is to combine kinetic information with stoichiometric information, which depends only on the stoichiometry of the reactions.

In the first part of this work, a synthetic problem is formulated. A prototype network with the described behavior is taken as an example, and the corresponding optimization problem is solved assuming three different levels of information about the network kinetics.

The second problem concerns Mannitol production in a modified strain of L. lactis. In [1] the strain L. lactis FI10089mtlD+Pase+ was created. This strain is able to produce Mannitol, whose formation competes with the natural pathway of the organism to produce biomass. The strain also has a Nisin-inducible gene that makes it possible to control the overexpression of two enzymes responsible for the formation of Mannitol.

The maximization of Mannitol production, controlled by the inductor, can be pursued heuristically by repeating the experiment several times. Due to the high costs, high complexity and long time scales associated with each experiment, this method is far from ideal. Since the pathway that leads to the production of both Mannitol and biomass is partially known, a mathematical model is proposed in order to explore, in a systematic way, the trade-off between the production of Mannitol and the formation of biomass controlled by Nisin.

1.3 State of the art

Although Metabolic Engineering [9] has developed very powerful approaches to optimize biotechnological processes, the systematic use of mathematical models and optimal control methods is still limited and poses many open issues.

Interesting examples, some at the genome level, are provided in [6, 10–12]. In [6], a bi-level optimization method, with a linear programming problem in the inner level and a nonlinear optimization problem in the outer level, has the interesting feature of not requiring full model knowledge. This optimization method was used on an in silico model of E. coli and tested in vivo with promising results.

The works in [13] and [8] focus on techniques to determine dynamic flux distributions in metabolic network models where not all the kinetics are known.

The use of Nisin as an inductor to control the yield of a certain product has been tested several times. In [14], an optimization strategy relying on practical experiments is formulated to maximize yield by controlling variables such as the pH, type of neutralizing agent, fermentation temperature and point of induction, among others.

1.4 Contributions

The major contributions of this thesis consist in the development of two case studies on metabolic network modeling, control and optimization.

In the first part, three different methods are compared on a common basis. Two of these methods assume complete knowledge of the dynamic equations of a network model: one relies on an Optimal Control approach, while the other performs a steady-state optimization using Geometric Programming (GP) [15]. These methods provide a baseline performance against which the results obtained by other methods may be compared.

The third approach assumes only partial knowledge of the network kinetic model and relies on a bi-level optimization. Furthermore, using Pontryagin's Maximum Principle (PMP) [16], it is shown that, for the class of problems considered, the manipulated variable may only assume values at the extremes of the optimization interval.

In the second part, a procedure to consistently estimate parameters using data sets from different experiments is presented. This is illustrated by a case study on parameter estimation in metabolic networks using data taken under different conditions. An initial model for Mannitol production is suggested, and a sub-model is later added to account for the Nisin induction.

Although the initial model does not account for Nisin induction, the data taken with induction are also used to identify the model's parameters. The resulting model can later be used to optimize the product yield without the need for complex and expensive experiments.

1.5 Document structure

The thesis is organized as follows. After the Introduction (Chapter 1), in which the problem is introduced and motivated and the state of the art is reviewed, Chapter 2 formulates the synthetic problem and presents three possible optimization methods.

Chapter 3 presents the Mannitol production problem in L. lactis and suggests a model. In Chapter 4, the Mannitol production problem is further detailed with the study of a possible control strategy.

Finally, conclusions are drawn in Chapter 5.

2 Synthetic problem

Part of this chapter was published in:

Domingues A., J.M. Lemos and S. Vinga. (2009)

Optimization strategies for metabolic networks.

In Proc. of the European Control Conference (ECC’09).

August 23-26, Budapest, Hungary.


2.1 Problem description

In this chapter, a prototype synthetic network, in which the formation of two of its metabolites competes, is taken as an example. The ratio between the formation of the two metabolites is controlled by a single function. After ensuring that the network has the required characteristics, three different optimization strategies to maximize the yield of one of the products are explored. Each optimization strategy assumes a different level of information on the network.

2.1.1 Metabolic network modeling tools

Metabolic networks can be as diverse as life itself. While a heuristic approach to exploring them can give valuable information on their structure and molecular mechanisms, a structured approach based on a mathematical description is fundamental to gain deeper insight.

Thus, applying mathematical models to metabolic networks is a valuable approach. Due to the high level of uncertainty and complexity associated with these networks, there is no defined framework for creating these models.

A common procedure is to adopt a top-down approach, in which all available information about the structure of the biological system is gathered. This information is then translated into model equations, yielding a system of nonlinear differential equations.

A common methodology to establish these equations is to use the Biochemical Systems Theory (BST) framework, in which each flux is approximated by a power law, which corresponds to a first-order Taylor series expansion in logarithmic space [17]. The fluxes from BST can be expressed using S-systems [18–20] or General Mass Action (GMA) [21].
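As an illustration of the power-law form, the following minimal Python sketch evaluates a single flux v = γ · Π x_j^(g_j) and its logarithm, which is linear in log x_j. The function names and the numerical values are hypothetical and are not taken from the thesis.

```python
import math


def power_law_flux(rate_constant, concentrations, kinetic_orders):
    """Power-law flux: v = rate_constant * prod(x_j ** g_j)."""
    flux = rate_constant
    for x, g in zip(concentrations, kinetic_orders):
        flux *= x ** g
    return flux


def log_power_law_flux(rate_constant, concentrations, kinetic_orders):
    """Same flux in logarithmic space: log v = log(gamma) + sum(g_j * log x_j)."""
    return math.log(rate_constant) + sum(
        g * math.log(x) for x, g in zip(concentrations, kinetic_orders)
    )


# Hypothetical example: two metabolites with kinetic orders 0.5 and 1.0.
example_flux = power_law_flux(2.0, [4.0, 0.5], [0.5, 1.0])
example_log_flux = log_power_law_flux(2.0, [4.0, 0.5], [0.5, 1.0])
print(example_flux, math.exp(example_log_flux))
```

Because the logarithm of the flux is linear in the logarithms of the concentrations, the kinetic orders play the role of slopes in log-log coordinates.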

For the model to properly describe the system, its set of parameters has to be estimated. The parameter identification procedure consists in minimizing an objective function, usually the weighted sum of squares of the residuals between simulated (parameter-dependent) and experimental data points.
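The objective function described above can be sketched as follows. The toy exponential-decay model, the synthetic data and the crude grid search are assumptions made only for illustration; they are not the model or the estimation procedure used in this work.

```python
import math


def weighted_sse(simulated, measured, weights=None):
    """Weighted sum of squared residuals between simulated and measured points."""
    if len(simulated) != len(measured):
        raise ValueError("simulated and measured must have the same length")
    if weights is None:
        weights = [1.0] * len(measured)
    return sum(w * (s - m) ** 2 for s, m, w in zip(simulated, measured, weights))


def simulate_decay(rate, times):
    """Hypothetical toy model x(t) = exp(-rate * t), standing in for a network model."""
    return [math.exp(-rate * t) for t in times]


# Synthetic "experimental" data generated with rate = 0.5 (assumed, noise-free).
times = [0.0, 1.0, 2.0, 3.0]
measured = simulate_decay(0.5, times)

# Crude grid search over candidate rates; a real study would use a proper optimizer.
candidates = [i / 100 for i in range(1, 101)]
best_rate = min(
    candidates,
    key=lambda rate: weighted_sse(simulate_decay(rate, times), measured),
)
print("estimated rate:", best_rate)
```

The weights allow data points with different measurement scales or reliabilities to contribute unequally to the objective.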

Another possible approach to obtain valid models is to use the stoichiometry of the reactions involved in the metabolic network. Such models are simpler to obtain, since large and reliable databases [22] already list the reactions involved in several networks, but they do not account for the dynamic nature of the organisms. Thus, regulatory mechanisms are hard to predict using stoichiometric models.

2.1.2 Prototype network model

A graphical representation of the network used is shown in Fig. 2.1.

Figure 2.1: Prototype network. The maximization of the final value of u5 depends on the profile of the function f(t).

This network is an adaptation of a previously suggested one [18]. The stoichiometric model is described by the following set of ordinary differential equations:

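$$
\begin{aligned}
\frac{du_1}{dt} &= k - v_1 \\
\frac{dx_2}{dt} &= v_1 - v_2(1 - f) - v_3 f \\
\frac{du_3}{dt} &= v_2(1 - f) \\
\frac{dx_4}{dt} &= v_3 f - v_4 \\
\frac{du_5}{dt} &= v_4
\end{aligned}
\tag{2.1}
$$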

Here ui, i = 1, 3, 5 and xi, i = 2, 4 are metabolite concentrations at the network nodes, vi, i = 1, . . . , 4 are fluxes associated with the metabolic network branches and k is a constant parameter that represents the yield of u1. In the equations, f represents a time-dependent control function f(t) that redirects the flux between the branches x2 → u3 and x2 → x4. A detailed description of this function is given in Section 2.2.1.

Fig. 2.1 shows a positive feedback from u3 to the flux v3. Stoichiometric models do not predict feedbacks, as the stoichiometry of the reactions remains unchanged. The solution adopted to model this feedback is presented in Section 2.3.4, together with the implementation of Flux Balance Analysis (FBA).

In the framework of S-systems [18] the system is described by:

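$$
\begin{aligned}
\frac{du_1}{dt} &= k - \beta_1 u_1^{h_{11}} \\
\frac{dx_2}{dt} &= \alpha_2 u_1^{g_{21}} - \beta_2 u_3^{h_{23}} x_2^{h_{22}} \\
\frac{du_3}{dt} &= \alpha_3 (1 - f)\, x_2^{g_{32}} \\
\frac{dx_4}{dt} &= \alpha_4 f\, u_3^{g_{43}} x_2^{g_{42}} - \beta_4 x_4^{h_{44}} \\
\frac{du_5}{dt} &= \alpha_5 x_4^{g_{54}}
\end{aligned}
\tag{2.2}
$$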

In this framework the kinetic orders are denoted gij if they refer to fluxes that enter a node or metabolite ($V_i^+$), and hij if they refer to fluxes that leave the node or metabolite ($V_i^-$). Finally, αi and βi are constant parameters.

Their values were adapted from the initial model [18] and adjusted to obtain the desired response.


Table 2.1: Parameters used in the prototype network.

Param.   Value     Param.   Value
α2       8         h11      0.5
α3       4.0556    h22      1.4224
α4       1.8397    h23      0.6109
α5       4.0556    h44      0.5829
β1       1         g21      0.5
β2       5.1179    g32      0.4171
β4       4.0556    g42      2.8274
k        0.8       g43      1.4646
                   g54      0.5

Table 2.1 shows the list of parameters. Different letters were used to distinguish metabolite concentrations from inputs/outputs: x# represents the concentration of a metabolite and u# an input/output.

The degradation of metabolite x2 depends on fluxes v2 and v3. In (2.2) the two fluxes were lumped and are expressed as $\beta_2 u_3^{h_{23}} x_2^{h_{22}}$. If mass conservation were imposed, the two lumped fluxes from $\frac{dx_2}{dt}$ should equal the sum of v2 and v3 from the equations for $\frac{du_3}{dt}$ and $\frac{dx_4}{dt}$ respectively. Thus,

$$\beta_2 u_3^{h_{23}} x_2^{h_{22}} = \alpha_3 (1 - f)\, x_2^{g_{32}} + \alpha_4 f\, u_3^{g_{43}} x_2^{g_{42}}.$$

Although mass conservation is a fundamental principle of biochemical systems, it was not enforced in this model for the sake of simplicity.

Assuming that u3 represents a precursor of the cellular objective (such as growth) and u5 the

desired product, if f(t) is biased towards the branch of v2 this yields the formation of u3 but little or

no production of u5. If f(t) is biased towards the branch of v3 the production of u5 will be affected

by the low concentration of u3 (since there is a forward feedback).

Thus, there is an optimal profile for f(t) to maximize the concentration of u5 at the pre-defined

final time tfinal.

2.1.3 The optimization problem

The optimization problem consists in selecting f(t) for t ∈ [0, tfinal] such that the cost function:

J(f) = u5(tfinal) (2.3)

is maximized under the constraint that f ∈ [0, 1], ∀ t ≥ 0.

This translates into the maximization of the desired product yield at the end of the experiment.

2.2 Optimization Methods

The solution of the optimization problem formulated in Section 2.1.3 is now considered ac-

cording to three different approaches.


Before presenting the different optimization strategies, the control function is described in detail

and PMP is invoked to show that the optimal control function has a particular form. A short intro-

duction to FBA and GP is also made for a better understanding of the optimization algorithms.

All the software was implemented in MATLAB, using standard functions and functions from the freely available Systems Biology Toolbox [23]. Linear Programming (LP) problems were solved using the function linprog and non-linear problems using the function fmincon. For GP problems, functions from the GGPLAB [15] package were used. The simulations were run on a laptop with a 1.6 GHz processor and 512 MB of RAM.

2.2.1 The control function

The function f is the control function that is selected in order to maximize the product yield.

Two forms for this function are used.

The first form, discrete form, is represented as f(n) and divides the time interval in N seg-

ments. At each segment the function can assume any value inside the admissible upper and

lower bounds, defined to be 1 and 0 respectively.

The discrete form is described as:

$$f(n) = f_n \quad \text{for } n = \frac{t_{final}}{N}\, i, \quad i = 1, \dots, N, \quad f_n \in [0, 1] \tag{2.4}$$

The second form, step form, is represented as f(t). It starts at its minimum admissible value and then switches to its maximum value at a certain time instant, which will be called time of regulation (treg) throughout the rest of this thesis. Thus, the step form is described as:

$$f(t) = \begin{cases} 0 & \text{if } t \le t_{reg} \\ 1 & \text{if } t > t_{reg} \end{cases}, \quad t \in [0, t_{final}] \tag{2.5}$$
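The two forms of the control function can be sketched in a few lines of Python (the thesis itself used MATLAB). The function names are illustrative, and the convention that segment i covers the half-open interval [i·tfinal/N, (i+1)·tfinal/N) is an assumption, since (2.4) does not state how segment boundaries are assigned.

```python
def step_control(t, t_reg):
    """Step form (2.5): 0 up to and including t_reg, 1 afterwards."""
    return 0.0 if t <= t_reg else 1.0


def discrete_control(t, values, t_final):
    """Discrete form (2.4): piecewise-constant control over N equal segments.

    Assumption (not stated in the text): segment i (0-based) covers
    [i * width, (i + 1) * width), and t = t_final belongs to the last segment.
    """
    n_segments = len(values)
    if n_segments == 0:
        raise ValueError("at least one segment value is required")
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError("every segment value must lie in [0, 1]")
    width = t_final / n_segments
    # Clamp the index so that t = t_final (or slightly outside) stays in range.
    index = max(0, min(int(t / width), n_segments - 1))
    return values[index]
```

With N = 2 and tfinal = 30, for example, `discrete_control(20.0, [0.0, 1.0], 30.0)` looks up the second segment.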

2.2.2 Pontryagin’s Maximum Principle

The general tool to solve dynamic optimization problems such as the one considered here is Pontryagin’s Maximum Principle (PMP) [16].

Let x be the state of a dynamical system with control inputs u such that:

$$\dot{x} = F(x, u), \quad x(0) = x_0, \quad u(t) \in U, \quad t \in [0, T] \tag{2.6}$$

where U is the set of valid control inputs and T is the final time, assumed here to be constant.

The control function u must be chosen in order to maximize the functional J, defined by:

$$J(u) = \psi(x_i(T)) + \int_0^T L(x(t), u(t))\, dt \tag{2.7}$$

where ψ is the cost associated with the terminal condition of the system and L the Lagrangian.

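To that end, define the adjoint equations, with final conditions (where $f_x$ denotes the Jacobian of the dynamics F with respect to x),

$$\dot{\lambda} = -L_x^T - f_x^T \lambda, \quad \lambda(T) = \psi_x(x)\big|_{x = x(T)} \tag{2.8}$$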


and the Hamiltonian,

$$H(\lambda(t), x(t), u(t), t) = \lambda^T F(x(t), u(t)) + L(x(t), u(t)) \tag{2.9}$$

where the co-state λ satisfies the adjoint equation (2.8) with suitable final time conditions [16] and

x verifies (2.6) with u being the optimal control.

According to PMP, a necessary condition for the optimal control is that, along the optimal solution

for the state x, co-state λ and control u the Hamiltonian H is maximum with respect to u.

Comparing the cost (2.3) with the generalized case (2.7) and taking into consideration that,

in the case at hand, given by (2.1), the dynamics vector field depends linearly on the control, it

follows that

$$H(\lambda, x, u) = \lambda^T \phi(x)\, u \tag{2.10}$$

where φ(x) is a function that does not depend on u. Since, according to (2.10), the Hamiltonian

is linear in u, its maximum is obtained at the boundary of the admissible control set U .

Hence, this shows that, for the metabolic network (2.1), the control that optimizes (2.3) only

assumes the values f = 0 or f = 1.
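The step from a Hamiltonian that is linear in the control to a control that sits on the bounds can be written out explicitly. The following is a sketch in the notation of (2.10), with the scalar control u ∈ [0, 1] standing for f, and under the assumption that the switching function $\lambda^T \phi(x)$ does not vanish over a whole time interval (the singular case is not treated here):

```latex
u^*(t) =
\begin{cases}
1 & \text{if } \lambda(t)^T \phi(x(t)) > 0, \\
0 & \text{if } \lambda(t)^T \phi(x(t)) < 0.
\end{cases}
```

A control of this type is usually called bang-bang, and the step form (2.5) is the special case with a single switch from 0 to 1.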

2.2.3 Flux Balance Analysis

The difficulties associated with the creation of dynamic models, based on network kinetics,

promote the use of stoichiometric models and simpler methods for analysis of metabolic capabili-

ties of cellular systems. FBA has proven useful in the study of metabolic systems [7, 13, 24] and

is part of the optimization process of the current study.

Stoichiometric models describe organisms through a set of chemical reactions (metabolism), the rate of each reaction being called a flux. Assuming, as explained before, that the main objective of a given organism is to grow, the problem that flux balance analysis addresses is, given a set of reactions, to find the combination of fluxes that maximizes the growth rate.

The first step in FBA is the reconstruction of the metabolic network, such as in Fig. 2.1. Mass balance equations are written for every metabolite as in (2.1), and known constraints (such as lower and upper bounds for fluxes) are included (2.11).

$$\alpha \le v_i \le \beta \tag{2.11}$$

The system can be written in a generic form as (2.12).

$$\frac{dX}{dt} = S \cdot v \tag{2.12}$$

Here X is a vector with the concentration of each metabolite, S is a matrix describing the stoi-

chiometry of the catabolic reactions, v a vector of the n metabolic reactions rates (fluxes) and α


and β are the lower and upper constraints for each flux.

In the case at hand (2.1), considering only the boxed metabolites from Fig. 2.1, (2.12) becomes:

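$$
\begin{bmatrix} \dfrac{dx_2}{dt} \\[6pt] \dfrac{dx_4}{dt} \end{bmatrix}
=
\begin{bmatrix} 1 & -(1 - f) & -f & 0 \\ 0 & 0 & f & -1 \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{bmatrix}
\tag{2.13}
$$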

If it is assumed that the system has reached steady state, thus removing the ability to describe transient states or regulatory mechanisms, (2.12) becomes

$$S \cdot v = 0$$

which is typically an underdetermined system since there are more fluxes than metabolites.

While there are a multitude of solutions for this problem, since the fluxes can be organized in

several ways, only one or a small set of solutions will maximize the growth rate.

A valid solution is found by solving a LP problem with a proper objective function. In optimal

environmental conditions, with enough substrate, it is valid to assume that the cellular objective is

the maximization of biomass [24]. Thus the objective function of the LP can be a flux or a function

of fluxes known to be related to growth precursors.

In the case of (2.13) there are four undetermined fluxes and two equations. Thus, a valid flux distribution is obtained by maximizing flux v2, which in turn maximizes the biomass precursor u3.
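Collecting the pieces above, the FBA problem can be written as a standard linear program. The objective vector c is notation introduced here for illustration only; for the prototype network it simply selects flux v2:

```latex
\begin{aligned}
\max_{v} \quad & c^T v \\
\text{subject to} \quad & S\, v = 0, \\
& \alpha \le v_i \le \beta, \quad i = 1, \dots, n,
\end{aligned}
\qquad \text{with } c = (0, 1, 0, 0)^T \text{ for (2.13)}.
```

Since both the objective and the constraints are linear in v, this is the form accepted by LP solvers such as the linprog function mentioned in Section 2.2.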

The FBA framework has been extended [8, 13] to incorporate the dynamics of the network.

Dynamic Flux Balance Analysis (dFBA) can predict the reprogramming of a metabolic network

and model the dynamics of certain metabolites over time. This is done by solving the steady-state

problem at several time instants and integrating the known fluxes during each time interval. FBA

and the principle of dFBA are used as part of the optimization procedures described below.

2.2.4 Geometric Programming

Geometric Programming (GP) is a mathematical optimization tool that can be used in problems where the objective and constraint functions have a special form [15]. GP is of particular interest because it can solve large-scale problems efficiently and reliably [25]. It has been shown [26] that a problem formulated in S-Systems form can be solved with GP after a minimal adaptation.

Let x = (x1, . . . , xn) be a vector of n real positive variables x1, . . . , xn. A function f(x) with the form

$$f(x) = c\, x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$$

where c > 0 and ai ∈ R, is called a monomial function [26]. A sum of one or more monomials

is called a posynomial function [26] and any monomial is also a posynomial. The standard GP


problem is formulated as:

$$
\begin{aligned}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \le 1, \quad i = 1, \dots, m, \\
& g_i(x) = 1, \quad i = 1, \dots, p,
\end{aligned}
\tag{2.14}
$$

where fi and f0 are posynomial functions, gi are monomials, and xi are the variables to be optimized. Monomials are closed under multiplication and division: if f and g are both monomials then so are f × g and f ÷ g [26].

Transforming an S-Systems model (in steady state) into GP problem constraints is straightforward. To that end, start from the standard form of S-Systems:

$$\frac{dx_i}{dt} = \alpha_i \prod_j x_j^{g_{ij}} - \beta_i \prod_j x_j^{h_{ij}} \tag{2.15}$$

Assuming steady state:

$$0 = \alpha_i \prod_j x_j^{g_{ij}} - \beta_i \prod_j x_j^{h_{ij}} \tag{2.16}$$

This expression is rearranged to yield the form of a GP problem constraint (2.14):

$$\alpha_i \prod_j x_j^{g_{ij}} = \beta_i \prod_j x_j^{h_{ij}} \tag{2.17}$$

$$\frac{\alpha_i \prod_j x_j^{g_{ij}}}{\beta_i \prod_j x_j^{h_{ij}}} = 1 \tag{2.18}$$

GP is used below as part of one of the optimization procedures considered.
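One way to see why a constraint of the form (2.18) is easy to handle is the usual logarithmic change of variables used in GP. As a sketch, with $y_j = \ln x_j$ introduced here as auxiliary notation, taking the logarithm of (2.18) gives

```latex
\ln \alpha_i - \ln \beta_i + \sum_j \left( g_{ij} - h_{ij} \right) y_j = 0,
\qquad y_j = \ln x_j ,
```

which is linear in the new variables $y_j$. Each steady-state S-System equation therefore becomes a linear equality in logarithmic coordinates.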

2.3 Optimization Strategies

The control function, described in Section 2.2.1, is now optimized in order to obtain a maxi-

mum yield of u5, at the end of the run-time (tfinal), in the Prototype model (Section 2.1.2). Three

different methods, with different levels of information about the network, are presented to attain

this goal.

The first method, direct optimization, is used as a benchmark to compare the results of the next

methods.

The last two methods rely on a Bi-level optimization and illustrate a possible solution to the op-

timization problem when the information about the network is incomplete. The three methods

are tested with both forms of the control function, step and discrete, and their results compared.

Finally, a numerical analysis solution for PMP and the respective computational implementation

are presented.

2.3.1 Direct optimization

The first method, Direct Optimization, is used mainly as a benchmark, to compare the results

of the following methods. Since it is assumed that all the information about the network kinetics


is known, the system of differential equations described in (2.2) is used. The initial conditions for every integration were set to [u1 x2 u3 x4 u5] = [0.8 0 1 0 0]. The optimization was made for both forms of the control function.

On the first optimization the step form (2.5) of the control function f(t) was used. The step form

imposes that the branch v2 is active in the beginning (f(t) = 0), building up biomass, switching,

at treg, to branch v3 (f(t) = 1) and activating the production of u5.

Given a function that receives treg as input and outputs the final yield of u5, this optimization tests

all the possible values of treg and returns the function:

J(treg) = u5(tfinal)

The value of treg that results in a maximum product yield is thus determined.

This optimization can be done manually, by testing the several possible values of treg and plotting

the results or by passing the function as an argument to an optimization function in MATLAB,

such as fmincon. The run time for the optimization is dependent on the constraints applied to treg.

Assuming that treg is forced to be an integer and treg ∈ [0, 30] the optimization takes less than a

minute to finish.
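The scan over treg described above can be sketched in plain Python as follows. This is not the MATLAB code used in the thesis: the fixed-step Runge-Kutta integrator, the step size h, the clipping of states at zero and all function names are assumptions made for the sketch. The parameters come from Table 2.1 and the initial conditions from the text above.

```python
# Parameters of the S-system (2.2), taken from Table 2.1.
P = dict(a2=8.0, a3=4.0556, a4=1.8397, a5=4.0556,
         b1=1.0, b2=5.1179, b4=4.0556, k=0.8,
         h11=0.5, h22=1.4224, h23=0.6109, h44=0.5829,
         g21=0.5, g32=0.4171, g42=2.8274, g43=1.4646, g54=0.5)


def rhs(y, f):
    """Right-hand side of (2.2) for state y = [u1, x2, u3, x4, u5] and control f."""
    # Assumption: clip at zero so fractional powers stay real inside RK4 stages.
    u1, x2, u3, x4, _ = (max(v, 0.0) for v in y)
    return [
        P["k"] - P["b1"] * u1 ** P["h11"],
        P["a2"] * u1 ** P["g21"] - P["b2"] * u3 ** P["h23"] * x2 ** P["h22"],
        P["a3"] * (1.0 - f) * x2 ** P["g32"],
        P["a4"] * f * u3 ** P["g43"] * x2 ** P["g42"] - P["b4"] * x4 ** P["h44"],
        P["a5"] * x4 ** P["g54"],
    ]


def final_yield(t_reg, t_final=30.0, h=0.01):
    """Integrate (2.2) with the step control (2.5) and return u5(t_final)."""
    def f(t):
        return 0.0 if t <= t_reg else 1.0

    y = [0.8, 0.0, 1.0, 0.0, 0.0]  # [u1, x2, u3, x4, u5] at t = 0
    for i in range(int(round(t_final / h))):
        t = i * h
        k1 = rhs(y, f(t))
        k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)], f(t + 0.5 * h))
        k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)], f(t + 0.5 * h))
        k4 = rhs([a + h * b for a, b in zip(y, k3)], f(t + h))
        y = [a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y[4]


def best_switch_time(candidates):
    """Return the candidate t_reg with the largest final yield J(t_reg)."""
    return max(candidates, key=final_yield)
```

Calling `best_switch_time(range(0, 31))` mirrors the integer scan over treg ∈ [0, 30]. The numerical values depend on the integrator chosen here and have not been checked against the results reported later in this chapter.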

In order to show that the optimal transition on the step form of the control function is f(t) =

0 → f(t) = 1 a simulation was run with the inverse profile (2.19).

$$f(t) = \begin{cases} 1 & \text{if } t \le t_{reg} \\ 0 & \text{if } t > t_{reg} \end{cases}, \quad t \in [0, t_{final}] \tag{2.19}$$

On the second optimization the discrete form (2.4) of the control function was used. An opti-

mization algorithm was run to determine the optimal value for each interval of f(n).

Increasing the number of intervals results in an increased time resolution for f(n) but also in-

creases the computation time. In this optimization, f(n) can assume any real value between 0

and 1 for every time interval. These extra degrees of freedom highly increase the computational

time to obtain a valid result.

The algorithm was tested with several initial conditions for f(n). The initial conditions were shown to have a major influence both on the computational time and on the convergence of the algorithm to the optimal results.

A manual implementation of this optimization is not viable. The optimization was tested with two MATLAB functions: fmincon, from the standard optimization toolbox, which finds the minimum of a constrained nonlinear multivariable function, and simannealingSB, from the Systems Biology Toolbox [23], which performs simulated annealing optimization.

2.3.2 Bi-Level Optimization algorithm structure

The Bi-Level optimization [5, 6] was structured as a general algorithm to accommodate missing

information on the kinetics of networks. In order to apply the algorithm to the prototype network,


it is assumed that the two metabolites and the four fluxes inside the box in Fig. 2.1 are a part of

the network that might not be fully described in terms of its kinetics.

Given a certain control function f(t), in order to obtain the final yield of u5 it is necessary to

have an estimation of the temporal variation of the metabolite concentration or flux distribution.

In the Bi-level optimization algorithm, this problem is solved by an inner optimization process that

allows us to obtain the product yield, u5(tfinal), given a certain f(t), taking into account a valid

approximation of the network dynamics over the simulation time. The Bi-Level Optimization is

tested with two different levels of information on the network, which affect the inner-optimization

type.

In section 2.3.3 it is assumed that the kinetic parameters are known but the system is simulated

in steady-state. While this situation is not likely to happen in a real life problem, it is useful to

test the algorithm and to serve as a guideline in real problems. Assuming that the system is in

steady-state, the boxed metabolites concentrations are calculated at each time instant by solving

a GP problem.

Section 2.3.4 presents a real life valid situation, where no kinetics information is available for the

boxed metabolites/fluxes. The missing kinetic information is replaced with stoichiometric data and

FBA is used to obtain a valid flux distribution at each time instant.

The first step of the inner optimization process is to define the initial conditions of the input u1 and outputs u3, u5. Since there is a constant yield of substrate, u1(0) was set to zero. u3(0) was set to 1, so that there is an initial amount of biomass. Finally, u5(0) was also set to zero.

Given the initial conditions for the input and outputs, and depending on the method, a valid distri-

bution for the fluxes (v1, v2, v3, v4) or a valid concentration of the metabolites (x2, x4) is obtained

by solving an LP or a GP respectively, with a proper objective function.

After obtaining the flux distribution/metabolite concentrations, new values for the input/outputs can

be calculated by integrating their expressions in the considered time interval.

During this time interval the function f(t) and the values of the fluxes/metabolites are kept con-

stant. This process is repeated from t = 0 to t = tfinal. The time interval for the integration was

defined to be 1 second. The inner optimization process is shown in Fig. 2.2.

The inner-optimization is subject to a non-linear outer-optimization which will optimize the

control function f(t) in order to obtain a maximum yield of u5. Depending on the optimization

function used, the outer optimization runs the inner optimization every time it needs to evaluate

the value of the final yield of u5. In this way, the bi-level optimization maximizes the final yield of

u5 while guaranteeing a valid temporal flux/metabolite concentration distribution.

The bi-level optimization algorithm is schematically represented in Fig. 2.3.


Figure 2.2: Inner-Optimization algorithm

Figure 2.3: Bi-Level optimization formulation


2.3.3 Inner-optimization using Geometric Programming

On the first implementation of the Bi-Level optimization algorithm the dynamics of the boxed

metabolites from Fig. 2.1 are used but, following the algorithm structure, steady-state is assumed.

Thus, x2 and x4 from (2.2) become:

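$$\frac{dx_2}{dt} = \alpha_2 u_1^{g_{21}} - \beta_2 u_3^{h_{23}} x_2^{h_{22}} = 0 \tag{2.20}$$

$$\frac{dx_4}{dt} = \alpha_4 u_3^{g_{43}} x_2^{g_{42}}\,(u) - \beta_4 x_4^{h_{44}} = 0 \tag{2.21}$$

The equations are then manipulated to have a valid form for a GP problem constraint:

$$\frac{\alpha_2 u_1^{g_{21}}}{\beta_2 u_3^{h_{23}} x_2^{h_{22}}} = 1 \tag{2.22}$$

$$\frac{\alpha_4 u_3^{g_{43}} x_2^{g_{42}}\,(u)}{\beta_4 x_4^{h_{44}}} = 1 \tag{2.23}$$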

In this implementation of the algorithm, the inner optimization problem determines the profile of

the metabolites, instead of fluxes, due to the nature of the equations.

The metabolite concentrations are calculated at the beginning of each time interval by solving the GP problem with the objective function being flux v2, given by $\alpha_3 x_2^{g_{32}}$. The obtained concentrations are then used with (2.2) to integrate the values of u1, u3 and u5 at the beginning of each interval.
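A minimal Python sketch of this inner loop is given below, under several assumptions that are not taken from the thesis. The factor (u) in (2.21) and (2.23) is read as the control value f. Because (2.22) and (2.23) are two monomial equalities in the two unknowns x2 and x4, the sketch solves them in closed form instead of calling a GP solver such as GGPLAB. The inputs/outputs are advanced with a forward-Euler step over the 1-unit interval, since the integrator is not specified in the text. All function names are illustrative.

```python
# Parameters from Table 2.1.
A2, A3, A4, A5 = 8.0, 4.0556, 1.8397, 4.0556
B1, B2, B4, K = 1.0, 5.1179, 4.0556, 0.8
H11, H22, H23, H44 = 0.5, 1.4224, 0.6109, 0.5829
G21, G32, G42, G43, G54 = 0.5, 0.4171, 2.8274, 1.4646, 0.5


def steady_state(u1, u3, f):
    """Solve the monomial equalities (2.22)-(2.23) for x2 and x4.

    Assumption: the "(u)" factor in (2.23) is the control value f.
    """
    x2 = (A2 * u1 ** G21 / (B2 * u3 ** H23)) ** (1.0 / H22)
    x4 = (A4 * f * u3 ** G43 * x2 ** G42 / B4) ** (1.0 / H44)
    return x2, x4


def inner_yield(control, t_final=30, dt=1.0):
    """Return u5(t_final) for a control given as a function of time."""
    u1, u3, u5 = 0.0, 1.0, 0.0  # initial conditions from Section 2.3.2
    for step in range(int(round(t_final / dt))):
        f = control(step * dt)
        # Metabolites are recomputed at the start of the interval, then held.
        x2, x4 = steady_state(u1, u3, f)
        # Forward-Euler update of the input/outputs using (2.2) (assumption).
        u1 += dt * (K - B1 * u1 ** H11)
        u3 += dt * A3 * (1.0 - f) * x2 ** G32
        u5 += dt * A5 * x4 ** G54
    return u5
```

An outer optimizer would call `inner_yield` once per candidate control, for instance `inner_yield(lambda t: 0.0 if t <= 9 else 1.0)` for a step at treg = 9.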

2.3.4 Inner-optimization using Linear Programming

On the second implementation it is assumed that only stoichiometric information is available

for the reactions inside the box of Fig. 2.1. Assuming steady state, the equations of x2 and x4

become:

$$\frac{dx_2}{dt} = v_1 - v_2(1 - f) - v_3 f = 0$$

$$\frac{dx_4}{dt} = v_3 f - v_4 = 0$$

Fig. 2.1 shows a forward feedback from u3 (Biomass) to flux v3. Since stoichiometric models do not account for feedbacks, the effect of u3 cannot be integrated directly in the equations. Assuming that the forward feedback leads to an overexpression of flux v3, a valid solution is to model the forward feedback as a variation of the constraints applied to flux v3. Thus, the constraints applied to fluxes [v1 v2 v3 v4] to solve the FBA problem are:

Lower Bounds = [0 0 0 0]

Upper bounds = [100 1.85 (1.5u3) 35]

The initial guesses for the upper bounds were taken from the maximum fluxes obtained in the direct optimization, and adapted to yield the expected behavior.

Setting flux v2 (precursor of Biomass formation) as the objective function, the FBA problem is

solved with the previous equations and constraints to obtain a valid and unique flux distribution at


each time step. In the context of the inner-optimization, these fluxes are then used to calculate the

values of the input/outputs. Due to the simple nature of the prototype network, the concentrations

of u1, u3 and u5 can be calculated directly by replacing the obtained fluxes in (2.1), and therefore,

the equations in (2.2) are not used in this case.

In a more complex case, a relation between the dynamics of the input/outputs and the flux distributions would be needed. For instance, in E. coli a valid relation between the product concentration variation (metabolite) and the growth rate (flux) is

$$\frac{d\,\mathrm{Product}}{dt} = (\mathrm{GrowthRate}) \times \mathrm{Biomass}$$

[5]. The same relation applies to the Biomass variation.

2.3.5 Pontryagin’s Maximum Principle: Computational implementation

As seen in Section 2.2.2, the optimal control function for the type of optimization problem

considered will only assume values in the borders of the allowed range.

In this section, PMP is applied to the system considered and the computational implementation is

described.

In the case at hand, we are interested in maximizing the final value of the state u5. Since the Lagrangian (L) is zero, (2.7) becomes $J(u) = \psi(x_i(T))$. Thus, the functional J to be maximized is:

$$\psi(x(T)) = u_5(T_{final}) \tag{2.24}$$

as shown before in (2.3).

Taking into account that L = 0, the adjoint equations reduce to

$$\dot{\lambda} = -f_x^T \lambda \tag{2.25}$$

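The network is described by the system of ordinary differential equations in (2.2). If we consider the state model in the form f(x, u), where u is the control function and the state vector x = (x1, . . . , x5) stands for (u1, x2, u3, x4, u5), the Jacobian fx(x, u) becomes:

$$
f_x(x, u) =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \frac{\partial f_1}{\partial x_3} & \frac{\partial f_1}{\partial x_4} & \frac{\partial f_1}{\partial x_5} \\
\frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \frac{\partial f_2}{\partial x_3} & \frac{\partial f_2}{\partial x_4} & \frac{\partial f_2}{\partial x_5} \\
\frac{\partial f_3}{\partial x_1} & \frac{\partial f_3}{\partial x_2} & \frac{\partial f_3}{\partial x_3} & \frac{\partial f_3}{\partial x_4} & \frac{\partial f_3}{\partial x_5} \\
\frac{\partial f_4}{\partial x_1} & \frac{\partial f_4}{\partial x_2} & \frac{\partial f_4}{\partial x_3} & \frac{\partial f_4}{\partial x_4} & \frac{\partial f_4}{\partial x_5} \\
\frac{\partial f_5}{\partial x_1} & \frac{\partial f_5}{\partial x_2} & \frac{\partial f_5}{\partial x_3} & \frac{\partial f_5}{\partial x_4} & \frac{\partial f_5}{\partial x_5}
\end{bmatrix}
\tag{2.26}
$$

$$
=
\begin{bmatrix}
-\beta_1 h_{11} x_1^{h_{11}-1} & 0 & 0 & 0 & 0 \\
\alpha_2 g_{21} x_1^{g_{21}-1} & -\beta_2 x_3^{h_{23}} h_{22} x_2^{h_{22}-1} & -\beta_2 x_2^{h_{22}} h_{23} x_3^{h_{23}-1} & 0 & 0 \\
0 & \left(\alpha_3 g_{32} x_2^{g_{32}-1}\right)(1 - u) & 0 & 0 & 0 \\
0 & \alpha_4 x_3^{g_{43}} u\, g_{42} x_2^{g_{42}-1} & \alpha_4 u\, x_2^{g_{42}} g_{43} x_3^{g_{43}-1} & -\beta_4 h_{44} x_4^{h_{44}-1} & 0 \\
0 & 0 & 0 & \alpha_5 g_{54} x_4^{g_{54}-1} & 0
\end{bmatrix}
\tag{2.27}
$$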

and


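$$
f_x^T(x, u) =
\begin{bmatrix}
-\beta_1 h_{11} x_1^{h_{11}-1} & \alpha_2 g_{21} x_1^{g_{21}-1} & 0 & 0 & 0 \\
0 & -\beta_2 x_3^{h_{23}} h_{22} x_2^{h_{22}-1} & \left(\alpha_3 g_{32} x_2^{g_{32}-1}\right)(1 - u) & \alpha_4 x_3^{g_{43}} u\, g_{42} x_2^{g_{42}-1} & 0 \\
0 & -\beta_2 x_2^{h_{22}} h_{23} x_3^{h_{23}-1} & 0 & \alpha_4 u\, x_2^{g_{42}} g_{43} x_3^{g_{43}-1} & 0 \\
0 & 0 & 0 & -\beta_4 h_{44} x_4^{h_{44}-1} & \alpha_5 g_{54} x_4^{g_{54}-1} \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\tag{2.28}
$$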

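Thus

$$
\begin{aligned}
\dot{\lambda}_1 &= \beta_1 h_{11} x_1^{h_{11}-1} \lambda_1 - \alpha_2 g_{21} x_1^{g_{21}-1} \lambda_2 \\
\dot{\lambda}_2 &= \beta_2 x_3^{h_{23}} h_{22} x_2^{h_{22}-1} \lambda_2 - \left(\alpha_3 g_{32} x_2^{g_{32}-1}\right)(1 - u) \lambda_3 - \alpha_4 x_3^{g_{43}} u\, g_{42} x_2^{g_{42}-1} \lambda_4 \\
\dot{\lambda}_3 &= \beta_2 x_2^{h_{22}} h_{23} x_3^{h_{23}-1} \lambda_2 - \alpha_4 u\, x_2^{g_{42}} g_{43} x_3^{g_{43}-1} \lambda_4 \\
\dot{\lambda}_4 &= \beta_4 h_{44} x_4^{h_{44}-1} \lambda_4 - \alpha_5 g_{54} x_4^{g_{54}-1} \lambda_5 \\
\dot{\lambda}_5 &= 0
\end{aligned}
\tag{2.29}
$$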

The terminal conditions for the co-states λ are

$$\lambda_n(T) = \left.\frac{\partial \psi}{\partial x}\right|_{x = x(T)} = [0 \;\; 0 \;\; 0 \;\; 0 \;\; 1] \tag{2.30}$$

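Since L = 0, the Hamiltonian (2.9) is given by $\lambda^T f(x)$.

Substituting (2.2) into this expression and collecting the terms in u, it becomes:

$$
\begin{aligned}
H(\lambda(t), x(t), u(t), t) ={} & \lambda_1 \left(k - \beta_1 x_1^{h_{11}}\right) + \lambda_2 \left(\alpha_2 x_1^{g_{21}} - \beta_2 x_3^{h_{23}} x_2^{h_{22}}\right) + \lambda_5 \left(\alpha_5 x_4^{g_{54}}\right) \\
& + \lambda_3 \alpha_3 x_2^{g_{32}} - \lambda_4 \beta_4 x_4^{h_{44}} + \left(\lambda_4 \alpha_4 x_3^{g_{43}} x_2^{g_{42}} - \lambda_3 \alpha_3 x_2^{g_{32}}\right) u
\end{aligned}
\tag{2.31}
$$

which depends linearly on the control function u, as expected.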

that depends linearly on the control function u, as expected.

The derivative of the Hamiltonian with respect to the control function is:

$$H_u = \lambda^T f_u = -\lambda_3 \alpha_3 x_2^{g_{32}} + \lambda_4 \alpha_4 x_3^{g_{43}} x_2^{g_{42}} \tag{2.32}$$

The algorithm to compute the optimal control function is:

1. Given the initial conditions for x1...5 and an initial estimate for the function f(n) (2.5), integrate the state equations (2.2) from t = 0 to t = tfinal, where tfinal in this experimental procedure is typically 30.

2. Integrate the system of adjoint equations (2.29) backwards from t = tfinal to t = 0, with final conditions from (2.30).

3. Update the control function by calculating δu(t) = K Hu(t), where K is a small value (typically between 0.001 and 0.1), and adding this variation to the previous control function: u(t)new = u(t)old + δu(t).


4. If the stop condition is not reached (number of iterations or minimum δu(t)), go back to the first step.

Starting with a rough estimate of the control function, the algorithm estimates, at each iteration, a variation that brings the control function closer to the optimal control. This variation is added to the previous control function and the operation is repeated. The iterative process stops when a certain stop condition is reached.
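The four steps above can be illustrated on a much simpler system than (2.2). The sketch below applies the same forward/backward iteration to a hypothetical scalar toy problem, dx/dt = −x + u with u ∈ [0, 1] and J = x(T), chosen only because its adjoint equation is trivial. The Euler discretization, the projection of u onto [0, 1] after each update and all names are assumptions of the sketch, not part of the thesis implementation.

```python
def simulate(u, t_final):
    """Step 1: integrate dx/dt = -x + u forward with Euler; return x(T)."""
    dt = t_final / len(u)
    x = 0.0
    for u_i in u:
        x += dt * (-x + u_i)
    return x


def gradient_sweep(n=200, t_final=2.0, gain=0.1, max_iter=100, tol=1e-9):
    """Iterate u <- u + K * H_u on the toy problem and return (u, x(T))."""
    dt = t_final / n
    u = [0.5] * n  # rough initial estimate of the control
    for _ in range(max_iter):
        # Step 1 would integrate the state here; in this linear toy problem
        # the adjoint does not depend on x, so the state is only needed for J.
        # Step 2: adjoint dλ/dt = -f_x λ = λ, integrated backwards from λ(T) = 1.
        lam = [0.0] * (n + 1)
        lam[n] = 1.0
        for i in range(n, 0, -1):
            lam[i - 1] = lam[i] - dt * lam[i]
        # Step 3: H_u = λ * df/du = λ; update and project onto [0, 1] (assumption).
        new_u = [min(1.0, max(0.0, u[i] + gain * lam[i])) for i in range(n)]
        # Step 4: stop when the control no longer changes.
        change = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if change < tol:
            break
    return u, simulate(u, t_final)
```

For this toy problem the switching function is positive at every time instant, so the iteration drives the control to its upper bound everywhere, which is consistent with the bang-bang structure derived in Section 2.2.2.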

2.4 Results and Discussion

In this section, the results of the various optimizations are presented as well as some con-

siderations about the adjustments needed to obtain valid results when necessary. Since we are

dealing with a prototype network the scales do not have any physical meaning. Thus, the units in

the time scales were purposely omitted and the values obtained for product yields are absolute

values or normalized values.

2.4.1 Direct optimization

Direct optimization used model (2.2) with the set of parameters from Table 2.1 and was first

tested for the step form (2.5) of the control function.

On a first approach, all possible integer values of treg were tested in the interval treg ∈ [1, 30]. Thus, 30 possible values for treg were tested. The optimization took around 3 min to run. Fig. 2.4 plots the resulting function J(treg) = u5(tfinal).

It is clear from the figure that there is an optimal time of regulation to maximize the yield of u5.

[Plot: Direct Optimization. Horizontal axis: Time of Regulation (Treg), from 0 to 30. Vertical axis: Final Product Concentration (u5(final)), from 0 to 1.]

Figure 2.4: Result of the simulation using Direct optimization.

The optimal time of regulation is treg = 9 with a final yield of u5 = 293.

The existence of a maximum may be interpreted as follows:


If f(t) switches from 0 to 1 before the optimal treg is reached, the biomass formed will not be enough to maximize u5(tfinal). On the other hand, if f(t) switches from 0 to 1 after the optimal treg, there will be enough biomass but not enough time to produce the maximum possible amount of u5.

In order to increase the time resolution, the spacing between adjacent candidate values of treg was decreased. An optimization was run for spacings of 0.5, 0.25 and 0.125, where 60, 120 and 240 possible values for treg were tested, respectively.

The results were similar to Fig. 2.4, with an optimal time of regulation of treg = 9 and the same maximum yield of u5.

A second optimization was performed with the profile for f(t) shown in (2.19). This profile forces branch x2 → x4 to be active in the beginning, switching then to branch x2 → u3. As expected, the obtained u5 yield was always low and no optimal treg was observed.

To better illustrate the behavior of the prototype network, simulations were made for treg = 4,

treg = 9 and treg = 14. The obtained optimal treg = 9 is compared with lower and upper values in

order to show the different temporal evolution of the metabolites.

Fig. 2.5 plots J(treg) = u5(tfinal) for treg = 4, treg = 9 and treg = 14. As expected, the function f(t) with treg = 9 has the highest product yield.

Figure 2.5: Comparison of three f(t) profiles. The solid line is the optimal treg obtained in the Direct optimization.

Fig. 2.6 plots the temporal variation of metabolites x2, x4, and outputs u3 and u5 for the three values of treg. It can be seen that, for treg = 14 the final concentration of Biomass (u3) is high but there is not enough time for the production of x4 and, subsequently, u5. On the other hand, for treg = 4 the formation of x4 starts earlier but the lack of biomass does not allow a large production of u5.

[Plot: four panels showing X2, U3 (Biomass), X4 and U5 (Final Product Yield) against Time, for Treg = 4, Treg = 9 and Treg = 14.]

Figure 2.6: Temporal variation of metabolites x2, x4, and outputs u3 and u5 for three values of treg

Direct optimization was then tested with the discrete form of the control function (2.4). Using this form, the optimization is not as straightforward as with the step form. The run-time and result are highly dependent on the number N of intervals used and the initial prediction for f(n).

For a higher number of intervals the optimization function frequently froze, which may be related to the heavy computational load.

There were also several situations where the returned values were far from the optimum, which probably correspond to local optima. The option to output the temporary results of the function was set to ON, when available. Another preventive measure was the use of variables to store the temporary results of the functions, to be restored in case of an unexpected interruption of the function.

Table 2.2 summarizes the results obtained for several values of N, for three different initial conditions.

In general, the results were within the expected range. For t << treg all the simulations converged to 0, and for t >> treg all simulations converged to 1. This agrees with PMP and with the assumption that the optimal switching is 0 → 1.

The switching time in all simulations was centered around t = 9. The critical time points are the ones next to treg, where the result of the optimization is highly dependent on the initial conditions and on the number of intervals.

The dependency on the initial conditions is related to the algorithms used by the optimization


Table 2.2: Results for the Direct Optimization using the Discrete form of the control function

Segments   uinitial                     uoptimal                                  u5(tfinal)

2          [0 0]                        [0.0765 1.0]                              255
2          [0.5 0.5]                    [0.1601 1.0]                              259
2          [1 1]                        [0.1254 1.0]                              258
3          [0 0 0]                      [0 0.9569 1.0]                            286
3          [0.5 0.5 0.5]                [0 1 1]                                   287
3          [1 1 1]                      [0 1 1]                                   287
4          [0 0 0 0]                    [0 0.413 1.0 1.0]                         285
4          [0.5 0.5 0.5 0.5]            [0 0.4925 1.0 1.0]                        286
4          [1 1 1 1]                    [0 0.452 1.0 1.0]                         286
5          [0 0 0 0 0]                  [0 0.413 1.0 1.0]                         285
5          [0.5 0.5 0.5 0.5 0.5]        [0 0.1220 1.0 1.0 1.0]                    292.9
5          [1 1 1 1 1]                  [0 0.1742 1.0 1.0 1.0]                    293
6          [0 0 0 0 0 0]                [0 0 1 1 1 1]                             295
6          [0.5 0.5 0.5 0.5 0.5 0.5]    [0 0 1 1 1 1]                             295
6          [1 1 1 1 1 1]                [0 0 1 1 1 1]                             295
15         [0 (n=1...15)]               [0 (n=1...4), 0.13, 0, 1 (n=7...15)]      294.8
15         [0.5 (n=1...15)]             [0 (n=1...4), 0.5, 0.5, 1 (n=7...15)]     293
15         [1 (n=1...15)]               [0 (n=1...4), 0.8, 0.15, 1 (n=7...15)]    293

functions. Initial conditions close to the optimal result are less likely to lead the algorithm to local minima, and a valid result is obtained in less time.

The dependency on the number of intervals can also be related to the optimization algorithm, since the number of degrees of freedom increases, but it is also closely connected with the temporal resolution. Since the system is run from t = 0 to t = 30, each step of the control function corresponds to a time interval of 30/N. Thus, when integrating the equations, the control function is constant over each interval of length 30/N.

As we saw in the results for the step form f(t), the optimal treg is around 9, so if N is such that (30/N) · i ≈ 9, where i is an integer, then it is more likely that the transition will be f(n) = 0 → f(n + 1) = 1.
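The alignment argument can be checked with a few lines of arithmetic. The sketch below is only an illustration (the function name and its defaults are not from the thesis): for a given N it returns the interval boundary i · 30/N closest to treg = 9.

```python
def nearest_boundary(n_intervals, t_reg=9.0, t_final=30.0):
    """Return (i, t_i): the grid boundary t_i = i * t_final / N closest to t_reg."""
    step = t_final / n_intervals
    i = round(t_reg / step)
    return i, i * step


for n in (2, 3, 6, 10, 15):
    i, t_i = nearest_boundary(n)
    print(f"N = {n:2d}: boundary i = {i} at t = {t_i:5.2f}")
```

When the nearest boundary lies far from treg, the optimizer can only approximate the switch with an intermediate value of f(n) in the interval that contains treg.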

While there is some variation in the results, they all fall within one of three cases:

• The optimization resulted in an optimal function where the transition was f(n) = 0 → f(n + 1) = 1, especially when the number of intervals was low (< 15) and (30/N) · i ≈ 9, such as in f(n) with 3, 6 or 16 intervals. While this is the best possible scenario, with an immediate switch from 0 to 1, this result only appeared for a relatively small group of N values.

• In some cases, f(n) assumes a value different from 0 or 1 during one or more time samples near t = treg; this happens mostly for higher numbers of intervals. These cases are due to convergence problems in the optimization algorithm, so forcing those samples to be 1 or 0 results in a higher value for u5(tfinal). An example can be seen when optimizing f(n)


with N = 15 intervals. The output of the optimization is:

\[ f(n) = [\,0_{n=1,\dots,4},\ 0.5,\ 0.5,\ 1_{n=7,\dots,15}\,] \]

Forcing the function f(n) to

\[ f(n) = [\,0_{n=1,\dots,5},\ 1_{n=6,\dots,15}\,] \]

results in a slightly higher yield of u5.

• Finally, in some cases, f(n) assumes values different from 0 or 1 near treg, and forcing those values to 0 or 1 does not increase the final yield. It is important to note that, in these situations, the difference between u5(tfinal) with the f(n) calculated by the algorithm and with the f(n) with forced 0's and 1's is relatively small. For example, for N = 30,

\[ f(n) = [\,0_{n=1,\dots,7},\ 0.1,\ 0.5,\ 0.5,\ 0.5,\ 0.8,\ 1_{n=13,\dots,30}\,] \]

gives a final product yield of 294.8774, while forcing f(n) to

\[ f(n) = [\,0_{n=1,\dots,9},\ 1_{n=10,\dots,30}\,] \]

results in a yield of 293.7503, which is only approximately 0.4% smaller. This means that both solutions are in the optimal region of f(n), and numerical problems in the algorithm might be responsible for the non-binary values.

The computational time varied with both the initial conditions and the number of intervals. For instance, with N = 2 the computational time was around 90 seconds, while with N = 6 it was 498 seconds for f(n)initial = 0 and 1168 seconds for f(n)initial = 1.

2.4.2 Bi-Level Optimization

Before testing the Bi-Level optimization, it is important to guarantee that the Inner-Optimization, described in Section 2.3.2, is able to give a valid estimate of the temporal variation of the inputs and outputs of the network.

The inner-optimization was tested with a fixed discrete-form f(n) function for the two cases, GP and LP. The f(n) used was:

\[ f(n) = [\,0_{n=1,\dots,15},\ 1_{n=16,\dots,30}\,] \]

The results were compared with the integration of the complete model with the same control function. Figure 2.7 shows the results obtained; the values of u1 and u3 are absolute values, while u5 was normalized. It can be seen from the figure that the temporal variations in the three cases are very similar. The variation of the substrate u1 with LP is the only one that does not have the same profile. In the case of u5, the normalized variations for the two inner-optimizations overlap, hence only one line is seen in the plot.


[Figure 2.7 panels: u1 (Substrate), u3 (Biomass) and u5 (Product yield) versus Time; legend: Complete model, Inner-Opt w/ GP, Inner-Opt w/ LP]

Figure 2.7: Comparison of the temporal variation of u1, u3 and u5 with a fixed f(n)

From these preliminary results, the inner optimization seems to be a valid option to obtain an approximation of the temporal variation of the inputs/outputs.

The Bi-Level optimization was then tested with the two forms of the control function. Once

again, for the step form of f(t) the optimization consisted in testing all the possible values of

treg and plotting the function J(treg) = u5(tfinal). In comparison with the direct optimization,

described in the previous section, this optimization does not use the whole model but uses the

inner-optimizations instead.

Fig. 2.8 plots the normalized curves of J(treg) = u5(tfinal) for the two optimizations, GP and LP. Comparing Fig. 2.8 with Fig. 2.4, it can be seen that the profiles remain similar. The final product yield, u5(tfinal), increases with treg until the optimal value is reached, and then starts decreasing.

The optimal time of regulation obtained with both GP and LP on the inner optimization was

treg = 9. The profile of J(treg) = u5(tfinal) with LP on the inner-optimization is not as smooth as

using GP or the whole set of equations.

The Bi-Level optimization was finally tested with the discrete form of the control function. The results were very similar to those of the Direct Optimization. All the obtained f(n) functions converged to f(n) = 0 when n << treg and f(n) = 1 when n >> treg. Once again, the critical time point was at n = treg, and the same three cases described in the previous section were observed:


[Figure 2.8 panels: Final Product Concentration (u5(final)) versus Time of Regulation (Treg); panel titles: Inner-Optimization with LP, Inner-Optimization with GP]

Figure 2.8: Result of the optimization using the Inner Optimization with Geometric Programming (left) and Linear Programming (right).

• The optimal case, where the switching is f(n) = 0 → f(n + 1) = 1, was more frequent than with the Direct Optimization, especially when using LP in the Inner-Optimization. The fact that the inner-optimization results in less temporal data variation might be responsible for these optimal switches.

• The case where the optimization returns wrong values, different from 0 or 1 near t = treg, is very frequent in the Bi-Level optimization. In fact, it is quite frequent for the optimization function to return values far from optimal, even for initial conditions that would return optimal values in the Direct Optimization.

• The last case, where the optimal solution contains values different from 0 or 1 near t = treg, is less frequent and is probably explained by the same reason given for the first case.

In terms of coherency, the three methods (Direct Optimization and Bi-Level Optimization with the two different Inner-Optimizations) are highly consistent in their resulting optimal values. For a given N, the optimal solution for the Bi-Level Optimization with GP in the Inner Optimization was, for all the tested values, the same as for the Direct Optimization.

In the case of the Bi-Level Optimization with LP in the Inner Optimization there are some discrepancies. One example is N = 10. The optimal solution for the first two methods is

\[ f(n) = [\,0\ 0\ 0\ 1\ 1\ 1\ 1\ 1\ 1\ 1\,] \]

which, with a step of 30/10 = 3 time units, switches at t = 9 and therefore supports that the optimal solution is found for treg = 9.

In the Bi-Level Optimization with LP in the Inner Optimization the optimal solution is:

\[ f(n) = [\,0\ 0\ 0\ 0\ 1\ 1\ 1\ 1\ 1\ 1\,] \]


While this case is not frequent, it was found for some values of N. For N = 30 and N = 60 all three results are coherent, and for N = 120 the optimum obtained with the LP Inner-Optimization once again deviates by one time step.

Due to the iterative nature of the inner-optimization, the integration of the inputs and outputs must be done manually. As explained in Sections 2.3.3 and 2.3.4, the obtained flux distributions or metabolite concentrations are used, in each iteration, to calculate the variation ∆u of each input and output.

Assuming a control function f(n) with N segments, at each step n, with duration tn = 30/N, the new value of the input/output u is calculated by

\[ u_n = u_{n-1} + \Delta u \, t_n \]

When N is small, tn assumes large values. For example, if N = 3 then tn = 10 and the inputs/outputs are only calculated 3 times, which is insufficient and leads to erroneous results. If the manual integration is tied to the number of intervals of f(n), large values of N must be used and consistency among integrations is not guaranteed since, in Euler's method, the size of the integration step affects the results. Thus, the implementation of the manual integration must use a fixed time step and, at each step of the integration, f(n) is estimated by interpolation.
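A minimal sketch of this fixed-step scheme is given below. It is not the thesis implementation: the rate function is a hypothetical stand-in for the ∆u returned by the inner-optimization, and the interpolation of f(n) is assumed to be a zero-order hold (the value of the interval that contains t).

```python
def control_at(t, f, t_final=30.0):
    """Zero-order-hold lookup of the piecewise-constant control f at time t."""
    n_intervals = len(f)
    idx = min(int(t * n_intervals / t_final), n_intervals - 1)
    return f[idx]


def integrate_fixed_step(rate, u0, f, dt=0.01, t_final=30.0):
    """Euler update u <- u + rate(u, f(t)) * dt with a step independent of len(f)."""
    u = u0
    for k in range(int(round(t_final / dt))):
        t = k * dt
        u += rate(u, control_at(t, f, t_final)) * dt
    return u


# Hypothetical rate: the output grows at unit rate only while the control is on.
rate = lambda u, f_t: f_t

coarse = integrate_fixed_step(rate, 0.0, [0, 1, 1])             # N = 3, switch at t = 10
fine = integrate_fixed_step(rate, 0.0, [0] * 10 + [1] * 20)     # N = 30, same switch
print(coarse, fine)
```

Because the integration step dt no longer depends on N, two control functions that describe the same switch with different N produce the same integrated output.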

The results obtained for the Bi-Level Optimization are encouraging. In the case of the Inner-Optimization, the network behavior was correctly predicted with only a fraction of the original information on the network. In the current example, some stoichiometric values were adapted to approximate the desired behavior. One example is the rate of consumption of the substrate, which was tweaked to approximate the dynamic case. While tweaking the parameters to obtain the desired response might seem to go against the proposed objective, this is only necessary because we are dealing with a prototype network with no physical meaning. In a real network, real stoichiometric parameters would be used to approximate the dynamics of the system.

2.4.3 PMP: Computational implementation results

The implementation of the numerical method described in Section 2.3.5 proved to be more complex than expected. Each iteration of the algorithm includes the integration of two sets of equations, the second of which (the backward integration of the co-states) is particularly demanding in terms of computational time. The initial results were not as expected, and several adjustments had to be made to the various steps of the algorithm.

In the integration of the state variables, the initial conditions were set to [u1 x2 u3 x4 u5] = [0 0 1 0 0] and the first estimate of the control function f(n) was set to

\[ f(n) = [\,0_{n=1,\dots,N/3},\ 1_{n=N/3+1,\dots,N}\,] \]

where N is the number of intervals of the control function.

The algorithm was also tested with other initial conditions for f(n) but the effect in the final result


was negligible, since the algorithm always converged to the same result.

N was initially set to low values, such as 15, 30 or 60, but in the final implementation it was set to 30,000, as explained below.

In the previous sections some considerations were already made regarding the necessary interpolations. In this case the interpolations proved to be a bottleneck for the convergence of the algorithm. The function used to integrate both the state variables and the co-states was ode45, from the standard MATLAB package. This function solves non-stiff differential equations with a variable time step.

When integrating the equations for λ, the function needs an estimate of the value of the states x at each time step. Since ode45 does not use a fixed time step, the values of x are interpolated.

The same problem arises when calculating the Hamiltonian. For every time step, an estimate of λ and x is needed; since they are not sampled at the same instants, they have to be interpolated.

An initial approach calculated the value of the Hamiltonian function for N intervals. Since the initial values of N were low, the Hamiltonian was not evaluated at enough time points. Thus, the final solution uses a high N, and both the states and the co-states are integrated and evaluated at the same N time points. N was set to 30,000 and the integration is made over t = [0, 30], which means 1000 points per time unit.

As explained in Section 2.3.5, the integration of the co-state equations must be done backwards, over the interval t = [30, 0], since only the final value of the co-states is known.

MATLAB's functions support backward integration, and ode45 was used for it in a first approach. Although one of the options of the function forces the variables to be non-negative, it is not possible to force them to be strictly greater than zero. During the backward integration it was quite frequent for the co-states to reach zero, which led to undetermined values and, subsequently, bad integrations. A safeguard, where a small value (1e−10) was added to the co-states, was tried, but this led to even longer (when feasible) integration times. To solve this problem, a simple function based on Euler's method was implemented. This function performs the backward integration and evaluates the co-states at the defined N time steps.
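A backward-in-time Euler integrator of this kind can be sketched as follows. This is an assumption-laden illustration, not the thesis code: it handles a single scalar co-state, and the right-hand side used in the example is a hypothetical test equation rather than the co-state equations of the prototype network.

```python
import math


def integrate_costates_backward(costate_rhs, lam_final, t_final, n_steps):
    """Explicit Euler from t_final down to 0 for d(lambda)/dt = costate_rhs(t, lambda).

    Returns lambda on the fixed grid t_k = k * t_final / n_steps, k = 0..n_steps,
    so that it can be combined with states stored on the same grid.
    """
    dt = t_final / n_steps
    lam = [0.0] * (n_steps + 1)
    lam[n_steps] = lam_final                      # only the final value is known
    for k in range(n_steps, 0, -1):
        t_k = k * dt
        lam[k - 1] = lam[k] - dt * costate_rhs(t_k, lam[k])
    return lam


# Hypothetical test equation: d(lambda)/dt = -lambda with lambda(1) = 1,
# whose exact solution is lambda(t) = exp(1 - t).
lam = integrate_costates_backward(lambda t, l: -l, 1.0, 1.0, 1000)
print(lam[0], math.e)
```

Evaluating the co-states on the same fixed grid as the states avoids the interpolation step that is needed with a variable-step solver.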

Finally, the update of the control function at each iteration was done by adding

\[ \delta u(t) = K \, H_u(t) \]

to the previous estimate of the control function, where H_u is the derivative of the Hamiltonian with respect to the control. Several values for K were tried: large values make the algorithm oscillate around the optimal value, while small values require many iterations to converge to the optimal solution. The value chosen for this trade-off was K = 0.05. A possible refinement for calculating δu would be a dynamic value of K, starting with a large value, allowing fast convergence, and decreasing it as the control function approaches the optimal solution.
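The update rule and the suggested dynamic gain can be sketched as below. Two points are assumptions that the text does not state: the control is clipped to the range [0, 1] after each update, and the gain follows a geometric decay whose constants are purely illustrative.

```python
def update_control(f, h_u, gain=0.05, lower=0.0, upper=1.0):
    """One step f <- f + K * H_u, clipped to [lower, upper] (clipping is an assumption)."""
    return [min(upper, max(lower, f_n + gain * h_n)) for f_n, h_n in zip(f, h_u)]


def decaying_gain(iteration, k0=0.5, k_min=0.05, decay=0.8):
    """Hypothetical schedule: start with a large K and shrink it, never below k_min."""
    return max(k_min, k0 * decay ** iteration)


f = [0.5, 0.5, 1.0]
h_u = [-0.2, 0.2, 0.2]        # illustrative Hamiltonian derivative samples
f_next = update_control(f, h_u)
print(f_next, decaying_gain(0), decaying_gain(50))
```

Samples with a negative Hamiltonian derivative move towards 0 and samples with a positive derivative move towards 1, which matches the expected 0 → 1 switching.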

Another possibility to update the control function is to calculate the Hamiltonian (instead of its


derivative) at each iteration. As illustrated in Section 2.3.5, the value of the control function that maximizes the Hamiltonian is, at each time instant, either 0 or 1. Thus, the control function can be updated by calculating the value (0 or 1) that maximizes the Hamiltonian at each instant. This method was found to oscillate around the optimal solution, so the final implementation uses the derivative of the Hamiltonian.

A simulation was run with the maximum number of iterations set to 30; the run-time was 4281 seconds. Figure 2.9 shows the evolution of the control function (initially set to f(n) = 0.5, n = [0 . . . 30]), the derivative of the Hamiltonian and the u5 yield for 6 non-consecutive iterations of this simulation.

On the first iteration, with the control function set to 0.5 on all time steps, the u5 yield is low

[Figure 2.9 panels: Control function, Hamiltonian Derivative and u5 yield versus Time; legend: Iteration 0, Iteration 5, Iteration 10, Iteration 20]

Figure 2.9: Control function, Hamiltonian derivative and u5 evolution on several iterations.

and the Hamiltonian derivative is smaller than 0 for times before t ≈ treg and greater than 0 for times after t ≈ treg. This shows that the optimal control function will approach 0 for t < treg and 1 for t > treg. The u5 yield increases in the following iterations, and the variation in the control function is clearly noticeable until the 10th iteration.

The final result for f(n) after 30 iterations is still not optimal. For t << treg, f(n) always converges to 0, and for t >> treg, f(n) always converges to 1, but near treg the transition from 0 to 1 is slow.

The algorithm was run with more iterations and even with bigger values of K, but still this result


remained the same.

As shown mathematically in Section 2.3.5, the optimal control function is either 0 or 1, but all the simulations that use the discrete form of the control function, for the different optimization strategies, have shown the same behavior for values of t near treg.

Although its cause has not been identified, this suggests that there is a numerical problem.


3 Model for Mannitol production with Nisin induction

Contents
3.1 Problem description
3.2 Parameter estimation methods
3.3 Results


3.1 Problem description

The work developed in [1] towards the improvement of Mannitol production in Lactococcus lactis led to the creation of several different strains.

A particular strain, FI10089mtlD+Pase+, was created with the ability to simultaneously overexpress two genes known to code for two enzymes of the pathway that leads to Mannitol formation. The overexpression of these genes is controlled by an inducer, Nisin.

The overexpression of the genes led to increases in the activity of the enzymes of up to ten times in one case and up to 1400 times in the other.

In separate experiments, Nisin was added at distinct time points, resulting in different yields of Mannitol and suggesting that the time of induction can be used as a control variable for the product yield.

Maximization of Mannitol production can be seen as an optimization problem where several parameters must be fine-tuned, among them the pH, the temperature and the Nisin induction [12, 14].

In this chapter the available data sets are presented and two simple models for the production of Mannitol in the genetically manipulated strain FI10089mtlD+Pase+ are suggested. Finally, a consistent parameter identification process that uses the four different data sets simultaneously is presented.

3.1.1 Mannitol model

The complete metabolic pathway of Lactococcus lactis that leads to the production of Mannitol is yet to be fully understood. Fig. 3.1 shows the identified pathways and the corresponding metabolites and enzymes involved. The metabolic pathway depends on the availability of Glucose (substrate), which is transformed into Fructose 6-phosphate (F6P). From F6P there are two possible paths: one leads to the formation of Mannitol, and the other leads to the formation of Pyruvate and, subsequently, of Biomass.

The studies in [1] suggest that Mannitol production in L. lactis is highly dependent on the available substrate and on the total amount of biomass; thus, a simple model with these three variables (Mannitol, biomass and substrate) was formulated. These three variables were chosen not only because of their known close relation but also because of the limitations on the practical acquisition of data.

Once again the S-System formalism was used to describe the system:

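\[
\begin{aligned}
\frac{dx_1}{dt} &= -\beta_1 x_1^{h_{11}} x_2^{h_{12}} x_3^{h_{13}} \\
\frac{dx_2}{dt} &= \alpha_2 x_1^{g_{21}} x_3^{g_{23}} - \beta_2 x_2^{h_{22}} \\
\frac{dx_3}{dt} &= \alpha_3 x_1^{g_{31}} x_3^{g_{33}} - \beta_3 x_3^{h_{33}}
\end{aligned}
\tag{3.1}
\]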

Here x1 represents the amount of available Glucose, x2 the amount of Mannitol and x3 the


Figure 3.1: Detail of a metabolic pathway of Lactococcus lactis [1]

biomass, measured in terms of its dry weight. The formation of biomass depends on the amount

of available Glucose and on the amount of the biomass itself. The production of Mannitol depends

on the amount of biomass and on the available Glucose.

The set of parameters that are part of the S-System is referred to as δ throughout the rest of this thesis.

\[ \delta = \{\alpha_2, \alpha_3, g_{21}, g_{23}, g_{31}, g_{33}, \beta_1, \beta_2, \beta_3, h_{11}, h_{12}, h_{13}, h_{22}, h_{33}\} \]

The model is schematically represented in Fig 3.2.
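To make the structure of model (3.1) concrete, the sketch below integrates it with a simple explicit Euler scheme. Everything numerical here is a placeholder: the parameter values and initial conditions are invented for illustration and are not the estimates obtained in the thesis, the dictionary keys are hypothetical names, and a library ODE solver would normally replace the Euler loop.

```python
def s_system_rhs(x, p):
    """Right-hand side of model (3.1); x = (glucose, mannitol, biomass)."""
    x1, x2, x3 = x
    dx1 = -p["b1"] * x1 ** p["h11"] * x2 ** p["h12"] * x3 ** p["h13"]
    dx2 = p["a2"] * x1 ** p["g21"] * x3 ** p["g23"] - p["b2"] * x2 ** p["h22"]
    dx3 = p["a3"] * x1 ** p["g31"] * x3 ** p["g33"] - p["b3"] * x3 ** p["h33"]
    return dx1, dx2, dx3


def simulate(x0, p, t_final=25.0, dt=0.01):
    """Explicit Euler integration from t = 0 to t_final; returns the final state."""
    x = tuple(x0)
    for _ in range(int(round(t_final / dt))):
        dx = s_system_rhs(x, p)
        x = tuple(xi + dt * dxi for xi, dxi in zip(x, dx))
    return x


# Placeholder parameter set (the 14 entries of delta), chosen only so the run is stable.
params = {
    "a2": 0.05, "a3": 0.02, "g21": 1.0, "g23": 1.0, "g31": 1.0, "g33": 1.0,
    "b1": 0.1, "b2": 0.01, "b3": 0.01,
    "h11": 1.0, "h12": 0.0, "h13": 1.0, "h22": 1.0, "h33": 1.0,
}
x_end = simulate((50.0, 0.0, 0.1), params)
print(x_end)
```

With h12 set to 0 the glucose consumption does not depend on the Mannitol level, which avoids a zero rate when the initial Mannitol concentration is 0.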

3.1.2 Mannitol model with Nisin induction

The addition of Nisin in the pathway shown in Fig. 3.1 leads to the overexpression of genes mtlD and M1Pase (not shown in the figure). Both genes code for enzymes that contribute to the production of Mannitol. Thus, the addition of Nisin increases the fluxes that lead to the production of Mannitol and, consequently, decreases the fluxes that lead to the production of Biomass.

In order to incorporate the induction using Nisin, the previous model was modified. The time profile of the Nisin concentration is unknown, but it is assumed that the maximum concentration in the solution is reached shortly after the addition. It is also assumed that this concentration remains constant throughout the experiment.

Given these assumptions, a Hill-type function was used to approximate the Nisin concentration.


Figure 3.2: Mannitol Model without Nisin induction

A Hill function has the form (3.2), where n controls the steepness of the curve and θ is the time at which f reaches its half-maximum value, f(θ) = (fmax − fmin)/2, which is 0.5 for (3.2).

\[ f(t) = \frac{t^n}{\theta^n + t^n} \tag{3.2} \]

Fig. 3.3 plots a Hill function with n = 20 and θ = 5.


Figure 3.3: Aspect of a Hill Function with n = 20 and θ = 5
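As a quick check of (3.2), the following sketch evaluates the Hill function with the values used in Fig. 3.3 (n = 20 and θ = 5). The function name and default arguments are illustrative only.

```python
def hill(t, theta=5.0, n=20):
    """Hill function (3.2): close to 0 for t << theta, close to 1 for t >> theta."""
    return t ** n / (theta ** n + t ** n)


for t in (0.0, 2.0, 4.0, 5.0, 6.0, 10.0):
    print(f"t = {t:4.1f}  f(t) = {hill(t):.6f}")
```

For large n the curve approaches a step at t = θ, which is why it can stand in for a concentration that rises quickly after the addition and then stays constant.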

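The model described in (3.1) was adapted to the induction using Nisin by multiplying one term of each equation by a scaled Hill function. Thus, the model becomes:

\[
\begin{aligned}
\frac{dx_1}{dt} &= -\beta_1 x_1^{h_{11}} x_2^{h_{12}} x_3^{h_{13}} \, \alpha_{1n} \left( 1 + \left( \frac{t^n}{\theta^n + t^n} \right)^{h_{1n}} \right) \\
\frac{dx_2}{dt} &= \alpha_2 x_1^{g_{21}} x_3^{g_{23}} \, \alpha_{2n} \left( 1 + \left( \frac{t^n}{\theta^n + t^n} \right)^{h_{2n}} \right) - \beta_2 x_2^{h_{22}} \\
\frac{dx_3}{dt} &= \alpha_3 x_1^{g_{31}} x_3^{g_{33}} \, \alpha_{3n} \left( 1 - \left( \frac{t^n}{\theta^n + t^n} \right)^{h_{3n}} \right) - \beta_3 x_3^{h_{33}}
\end{aligned}
\tag{3.3}
\]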

For Glucose, the Hill-type function was added to the degradation term of the equation; this results in a faster glucose consumption after the addition of Nisin.


For Mannitol and Biomass, the Hill-type function was added to the formation part of the equations.

Since we expect that Nisin slows down Biomass production and speeds up Mannitol production,

the terms in the equations have opposite signs.

The set of parameters of the Hill function and scaling is referred to as σ throughout the rest of this thesis.

\[ \sigma = \{\alpha_{1n}, \alpha_{2n}, \alpha_{3n}, h_{1n}, h_{2n}, h_{3n}, \theta, n\} \]

It is important to point out that (3.3) does not obey the S-System formalism.

Fig. 3.4 schematically represents the model including Nisin induction.

Figure 3.4: Mannitol Model with Nisin induction
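The effect of the σ parameters in (3.3) can be seen by evaluating only the three Nisin-dependent multipliers. The sketch below is illustrative: the dictionary keys are hypothetical names for the entries of σ, and the values are placeholders rather than estimated parameters.

```python
def nisin_factors(t, s):
    """Multipliers applied in (3.3) to the glucose degradation term, the Mannitol
    formation term and the biomass formation term, in that order."""
    h = t ** s["n"] / (s["theta"] ** s["n"] + t ** s["n"])
    return (
        s["a1n"] * (1 + h ** s["h1n"]),
        s["a2n"] * (1 + h ** s["h2n"]),
        s["a3n"] * (1 - h ** s["h3n"]),
    )


# Placeholder sigma: unit scalings and exponents, induction centered at theta = 3.
sigma = {"a1n": 1.0, "a2n": 1.0, "a3n": 1.0,
         "h1n": 1.0, "h2n": 1.0, "h3n": 1.0,
         "theta": 3.0, "n": 20}

before = nisin_factors(0.0, sigma)
after = nisin_factors(30.0, sigma)
print(before, after)
```

With these placeholder values the model reduces to (3.1) long before the induction, while long after it the glucose degradation and Mannitol formation terms are doubled and the biomass formation term vanishes.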

3.1.3 Data sets description

The available data sets contain the temporal variation of Glucose, Mannitol, Lactate, Formate, Acetate, Acetoin, 2,3-Bd, Ethanol, Optical Density (OD600) and Dry Weight (mg/ml) [1].

The optical density is obtained from the light absorbance at 600 nm and is used to measure the cellular density of a colony, which is proportional to its size. In this context, it was used to control the time of addition of Nisin.

The comparison between the metabolites available in the data sets and Fig. 3.1 supports the choice of variables for the Mannitol model.

Four data sets were used for the parameter estimations.

The first data set describes the Glucose and Mannitol concentrations and the Biomass dry weight, over a 25-hour period, in FI10089mtlD+Pase+ with no Nisin added.

The remaining three data sets describe the Glucose and Mannitol concentrations and the Biomass dry weight in FI10089mtlD+Pase+ with Nisin added at OD600 = 0.1, OD600 = 0.3 and OD600 = 0.8 or, in terms of time, at t = 2, t = 3 and t = 5 hours.

The four data sets are plotted in Fig. 3.5.


[Figure 3.5 panels: Glucose, Mannitol Production and Dry Weight (mg/ml) versus Time (hours); legend: No Nisin, Nisin @ OD600 = 0.1, Nisin @ OD600 = 0.3, Nisin @ OD600 = 0.8]

Figure 3.5: Data Sets for Mannitol production. Vertical dashed lines represent the time of induction of Nisin.

The data sets were not acquired with a fixed sampling time; since the sampling is manual, the data was mainly acquired in periods where the dynamics of the system were relevant.

In terms of mathematical modeling and parameter estimation it would be useful to have more data points. For example, in the period after 15 hours of experiment, only the OD600 = 0.1 data set has measurements before the final time.

This leads to a biased weighting in the estimation, giving more importance to the time period before t = 15 hours. A possible approach to mitigate this issue would be the use of interpolation.

Analyzing the figure, it can be seen that the lowest Mannitol yield is obtained for the case with no Nisin added.

When Nisin is added at OD600 = 0.1 the growth rate decreases and the Mannitol yield is higher than in the case without Nisin. The consumption of Glucose is strongly affected and decreases.

When Nisin is added at OD600 = 0.8 the growth and Glucose consumption rates are similar to the no-Nisin case, and the Mannitol production is approximately the same as in the OD600 = 0.1 case.

Finally, the maximum product yield is obtained for OD600 = 0.3, with a low biomass formation and a low glucose consumption rate.

From a visual analysis the data sets appear to be coherent, with one exception. The data for OD600 = 0.1 exhibit a noticeably different curve for Glucose consumption, which is also reflected in the much lower biomass production. The effect of the addition of Nisin is only clearly visible


in the Mannitol production rate. It can be seen that Mannitol production starts shortly after the addition of Nisin in every data set.

These data sets are real data from a complex network, and there are underlying mechanisms behind them that are still not understood. Still, from a simplified point of view, it is possible to draw a parallel with the case explored in Section 2.1.2: adding Nisin too soon limits Mannitol production because of the lack of biomass, while adding Nisin too late limits Mannitol production because of the lack of substrate and run time, even if there is enough biomass. Thus, the trade-off that needs to be explored is similar to the one explored there.

3.1.4 Parameters estimation

The suggested Mannitol model has 14 parameters (δ) to be estimated; the second model, which includes the Nisin induction, adds 8 more parameters (σ). The estimation problem consists, in a first stage, of estimating the parameters of the Mannitol model using only the data acquired without the addition of Nisin.

In a second stage, the parameters of the Mannitol model are estimated using the data acquired both with and without Nisin.

Finally, the parameters of the Nisin part of the second model are estimated and fine-tuned using only the data acquired with Nisin added.

3.2 Parameter estimation methods

The estimations of the parameters of both models were made using MATLAB. On a first ap-

proach the estimations were made using the freely available toolbox SBTOOLBOX and SBPD [23]

but the necessity of easy customization of the cost functions and transparency on the estimation

process led to the use of scripts written for the effect.

The general parameters estimation algorithm is resumed as follows:

• Initial parameters and constraints are defined. The S-System parameters (δ) were constrained to the interval [−4, 4] to reproduce biochemically reasonable rates and kinetic parameters. In the cases where σ parameters were estimated, no constraints were applied to them.

• An optimizing function is called. Three functions were tested: fminunc, fmincon and simannealingSB. The first two functions belong to the MATLAB Optimization Toolbox and perform unconstrained and constrained optimization, respectively. The last function belongs to SBTOOLBOX and performs minimization by simulated annealing.


1. Inside the optimization function, the set of differential equations of the model is integrated using the current set of parameters (the initial set on the first iteration). Depending on the estimation, this set of parameters is δ, σ or both.

2. After the integration, a cost function calculates the cost, normally as the sum of squared residuals.

3. If the obtained cost satisfies the stop condition of the optimization function, the optimization is stopped. Otherwise, a new set of parameters is tested.
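The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MATLAB scripts used in this work: the one-variable model dx/dt = −k·x^h, the forward-Euler integrator, the random local search standing in for fmincon or simannealingSB, and all names (`simulate`, `cost`, `estimate`) are hypothetical. Only the [−4, 4] bounds and the integrate, evaluate, stop-or-retry structure come from the text.

```python
import random

def simulate(params, x0=1.0, t_end=2.0, dt=0.01, n_samples=5):
    """Integrate the toy model dx/dt = -k * x**h with forward Euler."""
    k, h = params
    steps = int(round(t_end / dt))
    every = steps // n_samples
    x, samples = x0, []
    for step in range(1, steps + 1):
        x += dt * (-k * x ** h)
        x = min(max(x, 1e-9), 1e6)      # keep the state positive and finite
        if step % every == 0:
            samples.append(x)
    return samples

def cost(params, observed, weight=1.0):
    """Sum of squared, weighted residuals between model and observations."""
    return sum(((m - o) / weight) ** 2
               for m, o in zip(simulate(params), observed))

def estimate(observed, start, lower=-4.0, upper=4.0,
             iters=2000, tol=1e-10, seed=0):
    """Bounded random local search (a stand-in for the real optimizers)."""
    rng = random.Random(seed)
    best, best_cost = list(start), cost(start, observed)
    for _ in range(iters):
        if best_cost < tol:             # stop condition
            break
        # Propose a new parameter set, clipped to the allowed interval.
        trial = [min(max(p + rng.gauss(0.0, 0.1), lower), upper) for p in best]
        trial_cost = cost(trial, observed)
        if trial_cost < best_cost:      # keep only improvements
            best, best_cost = trial, trial_cost
    return best, best_cost

observed = simulate([0.5, 0.8])         # synthetic "data" from known parameters
start = [1.0, 1.0]
fitted, final_cost = estimate(observed, start)
```

Because the search only accepts improvements, the final cost can never exceed the cost of the starting point; a real optimizer would replace the proposal step, not the surrounding structure.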

The parameter estimation process is illustrated in Fig. 3.6.

Figure 3.6: Parameter estimation structure

The estimations were made on a laptop with 4GB of RAM and a dual processor. Estimation times varied between less than a minute and several minutes.

3.2.1 Estimation using one data set

The first estimation uses the data set obtained without the addition of Nisin and model (3.1). A first, rough estimation used SBTOOLBOX, with the initial parameters set to 1. After this estimation, a purpose-written script was used. The cost function was defined as the sum of the squared residuals, where the residuals are the differences between the modeled data and the observed data.

\[
J = \sum_{i=1}^{3} \sum_{j=1}^{\mathrm{timepoints}} \left[ \frac{y_{ij}(\delta) - y_{ij}}{\omega_i} \right]^2 \tag{3.4}
\]

where i = 1, 2, 3 refers to the three metabolites (Glucose, Mannitol and Biomass), yij(δ) is sample j of metabolite i in the model data, integrated with the set of parameters δ, and yij is sample j of metabolite i in the experimental data. The weighting factor ωi makes it possible to give more or less weight to each metabolite during the estimation process.
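As a small illustration of how the weighting in (3.4) acts, the cost can be written as a plain function. The function name `weighted_sse` and the numbers below are hypothetical and chosen only so that the arithmetic is easy to follow; they are not the experimental data.

```python
def weighted_sse(model, data, weights):
    """Cost (3.4): sum over metabolites i and samples j of ((model - data) / w_i)**2."""
    total = 0.0
    for y_model, y_data, w in zip(model, data, weights):
        for m, d in zip(y_model, y_data):
            total += ((m - d) / w) ** 2
    return total

# Hypothetical values: rows are Glucose, Mannitol and Biomass, columns are samples.
model = [[1.0, 2.0], [0.5, 0.5], [0.1, 0.2]]
data = [[1.0, 1.0], [0.5, 1.5], [0.1, 0.2]]
weights = [1.0, 2.0, 1.0]    # a larger weight makes a metabolite count less

J = weighted_sse(model, data, weights)
```

Here the Glucose and Mannitol rows each contain one residual of magnitude 1, but the Mannitol residual is divided by its weight of 2 and therefore contributes only a quarter as much to J.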

This estimation identifies the 14 parameters δ of the S-System, but gives no guarantee that the identified parameters are valid for the data sets with Nisin induction.

3.2.2 Estimation using multiple data sets

The second estimation uses both models, (3.1) and (3.3), as well as all four data sets, with and without the addition of Nisin. It is important to note that the data sets obtained with Nisin added will not fit (3.1) with the parameters δ obtained in the previous section. However, since model (3.3) is an extension of (3.1), the common parameters δ have to be equal to keep the models consistent.

The cost function becomes:

J = J1 + J2 + J3 + J4 (3.5)

where J1 corresponds to the cost associated with the No Nisin data set, and J2, J3 and J4 correspond to the costs associated with the data sets in which Nisin was added at OD600 = 0.1, OD600 = 0.3 and OD600 = 0.8, respectively.

The cost functions are defined as:

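\[
J_1 = \sum_{i=1}^{3} \sum_{j=1}^{\mathrm{timepoints}} \left[ \frac{y_{ij}(\delta) - y_{ij}}{\omega_i} \right]^2
\]

\[
J_{2,3,4} = \sum_{i=1}^{3} \sum_{j=1}^{\mathrm{timepoints}} \left[ \frac{y_{ij}(\delta, \sigma) - y_{ij}}{\omega_i} \right]^2
\]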

and σ is the set of parameters of the Hill Function and scaling.

This estimation refines the δ parameters obtained previously, ensuring that this set is common to the data sets with and without the addition of Nisin.

Since model (3.3) is used, σ must also be estimated. For the sake of simplicity, only the δ parameters are forced to be common to the four data sets; the set σ is left free and takes different values for each Nisin data set. Thus, δ + 3σ = 14 + 3 × 8 = 38 parameters are identified in this section.
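One way to picture this arrangement is a single flat parameter vector that the optimizer manipulates, which is split into the shared δ and one σ per Nisin data set before each cost evaluation. The sketch below assumes such a layout; the names `unpack` and `total_cost` and the toy cost functions are hypothetical placeholders for the integration of models (3.1) and (3.3).

```python
N_DELTA, N_SIGMA, N_NISIN_SETS = 14, 8, 3

def unpack(theta):
    """Split the flat vector into the shared delta and one sigma per Nisin data set."""
    delta = theta[:N_DELTA]
    sigmas = [theta[N_DELTA + k * N_SIGMA: N_DELTA + (k + 1) * N_SIGMA]
              for k in range(N_NISIN_SETS)]
    return delta, sigmas

def total_cost(theta, cost_no_nisin, cost_nisin):
    """J = J1 + J2 + J3 + J4, with the same delta used in every term."""
    delta, sigmas = unpack(theta)
    j = cost_no_nisin(delta)                    # J1: model (3.1), no Nisin
    for k, sigma in enumerate(sigmas):          # J2..J4: model (3.3)
        j += cost_nisin(k, delta, sigma)
    return j

# Toy stand-ins for the real "integrate, then compare with the data" costs.
toy_j1 = lambda d: sum(x * x for x in d)
toy_jk = lambda k, d, s: sum(x * x for x in d) + sum(x * x for x in s)

theta = [1.0] * (N_DELTA + N_NISIN_SETS * N_SIGMA)   # 38 parameters
delta, sigmas = unpack(theta)
J = total_cost(theta, toy_j1, toy_jk)
```

Sharing δ by construction, rather than estimating it four times and comparing afterwards, is what makes the common parameters exactly equal across data sets.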

The initial values for δ were set to the estimates obtained in the previous section, and the σ parameters were all set to 1.


3.2.3 Estimation using the Nisin data sets

Having estimated the set of parameters of the S-System, δ, the parameters of the Hill function and the respective scaling parameters, σ, can be fine-tuned.

Even though the three sets of parameters σ obtained in the previous section are able to fit the experimental data, from a control point of view it is useful to reduce the number of control variables. Thus, σ is divided, and the subset {α1n, α2n, α3n, h1n, h2n, h3n} is forced to be common to the three Nisin data sets. The only parameters left free are {θ, n}.

This decision is based on the fact that {θ, n} directly manipulate the shape and position of the Hill-type function, as seen in (3.2). More specifically, varying θ changes the position of the function on the time axis, creating a time control variable. This time control variable then models the time of addition of Nisin.
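The role of θ and n can be illustrated with a generic Hill-type function. Equation (3.2) is not reproduced in this section, so the form used below, t^n / (θ^n + t^n), is an assumption (the textbook Hill form) and may differ from (3.2) in scaling; the function name `hill` is likewise hypothetical.

```python
def hill(t, theta, n):
    """Assumed Hill-type form: t**n / (theta**n + t**n), for t > 0 and theta > 0."""
    return t ** n / (theta ** n + t ** n)

times = [1.0, 2.0, 4.0, 8.0]
early = [hill(t, theta=2.0, n=4.0) for t in times]   # half-maximum reached at t = 2
late = [hill(t, theta=4.0, n=4.0) for t in times]    # same steepness, half-maximum at t = 4
```

In this form the function reaches half of its maximum exactly at t = θ, so a larger θ delays the transition, while n sets how abrupt the transition is.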

The estimation algorithm estimates {α1n, α2n, α3n, h1n, h2n, h3n} + 3 × {θ, n} = 6 + 3 × 2 = 12 parameters.

3.2.4 Further notes on estimation strategies

The possibility of defining a custom cost function allows other estimation strategies to be defined using the four data sets. Two possible examples are:

• Estimating simultaneously δ + 3 × {δ + σ} parameters (one whole set of parameters per data set) and defining a cost function that minimizes both the difference between the modeled and real data and the difference between the sets of parameters δ. This strategy is computationally heavy but gives an acceptable approximate estimation in the cases where the other strategies fail to find the common parameters.

• Another strategy is a combination of the second and third strategies. The cost function is as in (3.5), but the estimated parameters are δ + {α1n, α2n, α3n, h1n, h2n, h3n} + 3 × {θ, n} = 14 + 6 + 3 × 2 = 26. In this strategy both δ and σ are estimated at the same time, but three pairs of {θ, n} are obtained (one for each Nisin data set).
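The first strategy above amounts to adding a penalty on the disagreement between the δ sets to the data-fitting cost. The exact form of that penalty is not given in the text, so the sketch below is an assumption: it uses the sum of squared pairwise differences with a hypothetical weight `lam`, and the names `delta_spread` and `penalized_cost` are illustrative only.

```python
def delta_spread(deltas):
    """Sum of squared differences between every pair of delta vectors."""
    total = 0.0
    for a in range(len(deltas)):
        for b in range(a + 1, len(deltas)):
            total += sum((x - y) ** 2 for x, y in zip(deltas[a], deltas[b]))
    return total

def penalized_cost(fit_costs, deltas, lam=1.0):
    """Data-fitting cost of every data set plus a penalty that pulls the deltas together."""
    return sum(fit_costs) + lam * delta_spread(deltas)

# Hypothetical two-parameter deltas for three data sets, and their fit costs.
deltas = [[1.0, 2.0], [1.0, 2.0], [2.0, 2.0]]
fit_costs = [0.1, 0.2, 0.3]
J = penalized_cost(fit_costs, deltas, lam=0.5)
```

With a small `lam` each data set is fitted almost independently; increasing it trades fit quality for agreement between the δ sets, which is why this strategy only yields approximately common parameters.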

3.3 Results

3.3.1 Identification of set δ

The first estimation identifies the parameters of the set δ using only the data set without Nisin. Starting from the initial condition δ1...14 = 1, the parameters that best fit the real data were identified with a final cost (3.4) of J = 0.042.

Table 3.1 shows the identified parameters.

Table 3.1: Estimation of the parameters of set δ using the data set without Nisin.

Param.   Value      Param.   Value
α2       0.1267     α3       0.0204
β1       0.4861     β2       0.0001
β3       0.0367     g21      0.6417
g23      2.0663     g31      0.6498
g33      0.3644     h11      0.7931
h12      −0.1292    h13      1.6108
h22      −0.0018    h33      −0.1780

Fig. 3.7 plots the modeled and the real data, showing that the estimated parameters accurately fit the experimental data with one exception. For t ≈ 15 the modeled data does not fit the real data of the Biomass concentration. Since there is only one data point after t = 15, it is not possible to know whether the concentration of Biomass decreases from t = 14 to t = 15 and increases after that, or whether the lower value at t = 15 is a measurement error.

Figure 3.7: Estimation of δ using the data set without Nisin. (Panels: Glucose, Mannitol and Biomass versus time in hours; modeled data and real data.)

3.3.2 Identification of set δ using Nisin data sets

In the second estimation all data sets are used to estimate δ. The initial parameters were set to the previously obtained δ (Table 3.1) and σ1..8 = 1.

The final value of the cost function was J = 0.619, and the obtained δ set is shown in Table 3.2.

Table 3.2: Fine tuning of set δ using all data sets.

Param.   Value      Param.   Value
α2       0.1267     α3       0.0196
β1       0.3660     β2       −0.0079
β3       −0.0080    g21      0.6842
g23      1.6550     g31      0.6011
g33      0.8168     h11      0.7988
h12      0.1432     h13      1.3605
h22      0.5963     h33      −0.1577

Comparing the parameters of the two tables, the main differences are found in parameters β2, g33, h12 and h22.

While the variations in these parameters are hard to justify without a proper sensitivity analysis, based on the system equations one can observe that the change in the pair β2, h22 leads to a higher decay of the Mannitol concentration, although the value is still very small. The increase in parameter g33 results in a higher dependency of the system on the Biomass concentration, which also affects the forward feedback that increases Mannitol production. Finally, the increase in h12 gives more weight to the decay of Glucose due to Mannitol production.
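The comparison between the two tables can be reproduced with a short script. The text does not say which measure of difference was used; the sketch below assumes the relative change |new − old| / |old|, under which the four parameters named above are indeed the ones that change the most. The ASCII keys (`alpha2`, `beta1`, ...) are just transliterations of the table symbols.

```python
# Values copied from Table 3.1 (data set without Nisin) and Table 3.2 (all data sets).
table_3_1 = {"alpha2": 0.1267, "alpha3": 0.0204, "beta1": 0.4861, "beta2": 0.0001,
             "beta3": 0.0367, "g21": 0.6417, "g23": 2.0663, "g31": 0.6498,
             "g33": 0.3644, "h11": 0.7931, "h12": -0.1292, "h13": 1.6108,
             "h22": -0.0018, "h33": -0.1780}
table_3_2 = {"alpha2": 0.1267, "alpha3": 0.0196, "beta1": 0.3660, "beta2": -0.0079,
             "beta3": -0.0080, "g21": 0.6842, "g23": 1.6550, "g31": 0.6011,
             "g33": 0.8168, "h11": 0.7988, "h12": 0.1432, "h13": 1.3605,
             "h22": 0.5963, "h33": -0.1577}

# Relative change of each parameter between the two estimations (assumed criterion).
changes = {k: abs(table_3_2[k] - table_3_1[k]) / abs(table_3_1[k]) for k in table_3_1}
ranked = sorted(changes, key=changes.get, reverse=True)
top_four = ranked[:4]
```

Ranking by absolute change instead would place g23 ahead of h12, so the choice of measure matters when deciding which parameters deserve attention.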

The three obtained σ sets are shown in Table 3.3.

Table 3.3: The three independent σ sets.

Param.   OD600 = 0.1   OD600 = 0.3   OD600 = 0.8
α1n      1.0166        1.2592        0.9208
α2n      1.7323        2.6433        1.0987
α3n      1.1775        1.5914        1.6015
h1n      1.8747        2.3056        2.1681
h2n      −0.033        −0.411        0.9763
h3n      0.9467        1.6172        1.7065
n        −0.043        0.2937        0.0886
θ        1.6023        1.8125        1.8863

Given the new sets of parameters, the two models were integrated and the results plotted against the real data (Fig. 3.8). The algorithm was able to estimate a common set δ that fits all the data sets, with all the variations due to Nisin induction modeled in set σ.

3.3.3 Identification of set σ

With a set δ common to all four data sets, a final estimation tries to find a common subset of σ that makes it possible to fit all the data simply by varying the parameters n and θ. Thus, the final estimation uses only the Nisin data sets.


Figure 3.8: Estimation of set δ using all the data sets. Each Nisin data set is modeled with an independent σ set. (Panels: Glucose, Mannitol and Biomass versus time for the data sets with no Nisin added and with Nisin added at OD600 = 0.1, 0.3 and 0.8; modeled data and real data.)


In this estimation, the set δ is not estimated, but it is needed as input to integrate the systems. The set used was the one obtained in the previous section and shown in Table 3.2.

For the subset of σ that will be common to the three data sets, {α1n, α2n, α3n, h1n, h2n, h3n}, the initial values were set to those obtained for Nisin added at OD600 = 0.1, shown in the second column of Table 3.3. The three initial pairs of n and θ were set to 1.

This estimation resulted in a final cost of J = 1.28 and the results are plotted in Fig. 3.9.

The estimation successfully found a subset of σ, {α1n, α2n, α3n, h1n, h2n, h3n}, that makes it possible to model all the Nisin data sets only by varying the values of n and θ.

Figure 3.9: Estimation of σ using the Nisin data sets and a fixed δ. (Panels: Glucose, Mannitol and Biomass versus time for the data sets with no Nisin added and with Nisin added at OD600 = 0.1, 0.3 and 0.8.)

The obtained parameters are shown in Table 3.4 and Table 3.5.

Table 3.4: Common subset of σ obtained in the estimation using the Nisin data sets.

Param.   Value
α1n      0.9875
α2n      2.0556
α3n      1.5726
h1n      1.0793
h2n      0.3992
h3n      1.4508

Table 3.5: Independent σ parameters obtained for each of the Nisin data sets.

Param.   OD600 = 0.1   OD600 = 0.3   OD600 = 0.8
n        0.3040        0.8771        0.3375
θ        0.2893        3.8653        2.0929

The results shown in Table 3.5 are not the ones expected. The obtained n values characterize a Hill function with a very slow transition from the minimum to the maximum, which goes against the initial predictions. The θ parameters were also expected to be proportional to the different OD600 values used or to the time of induction.

To better understand the effect of the Hill function on the Mannitol model, the Hill-function curves were plotted, and are shown in Fig. 3.10.

The effect of Nisin induction was modeled using a Hill function because its characteristics are similar to the theoretical concentration curve of Nisin. However, the results in Fig. 3.10 suggest that, for this model, the effect of Nisin can be modeled by a simple straight line.

While this result is not encouraging from the control point of view, the results were satisfactory, since the primary objective, to model Mannitol production with and without Nisin induction, was fulfilled.

The use of S-Systems to model Mannitol production was based on the fact that they are a standard when modeling metabolic systems. Given the simplicity of the system (only three variables), the number of parameters is excessive. A future model for Mannitol production should be formulated with fewer parameters.

Chapter 4 briefly explores the possible control strategies given the described results.

Figure 3.10: Plot of the obtained Hill-type functions for each Nisin data set. (Panels: Glucose, Mannitol and Biomass versus time; one curve per data set, OD600 = 0.1, 0.3 and 0.8.)

4 Optimizing Mannitol production using Optimal Control

Contents
4.1 Control using a step function
4.2 Results

4.1 Control using a step function

The results obtained in Section 3.3.3 have shown that, for this network, a Hill function is not ideal for modeling the temporal control achieved with Nisin induction. A simplification of the Hill function for modeling Nisin induction is a step function, with the same form as the step control function f(t) used for the prototype network in Section 2.2.1.

Thus, the model for Mannitol production with Nisin induction becomes:

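\[
\begin{aligned}
&\text{if } t < t_{nisin}: \\
&\qquad \frac{dx_1}{dt} = -\beta_1 x_1^{h_{11}} x_2^{h_{12}} x_3^{h_{13}} (1 + u_{11}) \\
&\qquad \frac{dx_2}{dt} = \alpha_2 x_1^{g_{21}} x_3^{g_{23}} (1 + u_{21}) - \beta_2 x_2^{h_{22}} \\
&\qquad \frac{dx_3}{dt} = \alpha_3 x_1^{g_{31}} x_3^{g_{33}} (1 + u_{31}) - \beta_3 x_3^{h_{33}} \\
&\text{else:} \\
&\qquad \frac{dx_1}{dt} = -\beta_1 x_1^{h_{11}} x_2^{h_{12}} x_3^{h_{13}} (1 + u_{12}) \\
&\qquad \frac{dx_2}{dt} = \alpha_2 x_1^{g_{21}} x_3^{g_{23}} (1 + u_{22}) - \beta_2 x_2^{h_{22}} \\
&\qquad \frac{dx_3}{dt} = \alpha_3 x_1^{g_{31}} x_3^{g_{33}} (1 + u_{32}) - \beta_3 x_3^{h_{33}}
\end{aligned}
\tag{4.1}
\]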

Here, tnisin is the time of addition of Nisin or a term proportional to it. The seven parameters that model the step function are u11, u21, u31, u12, u22, u32 and tnisin.
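The right-hand side of (4.1) translates directly into code. The sketch below is an illustration rather than the script used in this work: the δ values are copied from Table 3.2 and the u values from Table 4.3, while the state vector `x` (Glucose, Mannitol, Biomass) and the chosen `t_nisin` are hypothetical numbers picked only to evaluate the function once before and once after the switch.

```python
def rhs(t, x, d, u, t_nisin):
    """Right-hand side of model (4.1); the gains switch from u_i1 to u_i2 at t_nisin."""
    x1, x2, x3 = x                       # Glucose, Mannitol, Biomass
    s = "1" if t < t_nisin else "2"      # gain index: before or after induction
    dx1 = (-d["beta1"] * x1 ** d["h11"] * x2 ** d["h12"] * x3 ** d["h13"]
           * (1 + u["u1" + s]))
    dx2 = (d["alpha2"] * x1 ** d["g21"] * x3 ** d["g23"] * (1 + u["u2" + s])
           - d["beta2"] * x2 ** d["h22"])
    dx3 = (d["alpha3"] * x1 ** d["g31"] * x3 ** d["g33"] * (1 + u["u3" + s])
           - d["beta3"] * x3 ** d["h33"])
    return dx1, dx2, dx3

# delta from Table 3.2, step gains from Table 4.3.
delta = {"alpha2": 0.1267, "alpha3": 0.0196, "beta1": 0.3660, "beta2": -0.0079,
         "beta3": -0.0080, "g21": 0.6842, "g23": 1.6550, "g31": 0.6011,
         "g33": 0.8168, "h11": 0.7988, "h12": 0.1432, "h13": 1.3605,
         "h22": 0.5963, "h33": -0.1577}
gains = {"u11": 2.3718, "u12": 0.8186, "u21": 1.4296, "u22": 2.6559,
         "u31": 0.3382, "u32": -0.1285}

x = (40.0, 1.0, 0.5)                     # hypothetical state, for illustration only
before = rhs(0.0, x, delta, gains, t_nisin=0.5)
after = rhs(1.0, x, delta, gains, t_nisin=0.5)
```

For a fixed state, the Glucose consumption rate after the switch equals the rate before it scaled by (1 + u12) / (1 + u11), which is how the pairs of u parameters are read as increases or decreases in the results.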

Since there are two distinct sets of parameters, the set δ from the S-System equations and the step function parameters, the estimation was done in three different ways.

• In a first approach, all the possible parameters were estimated, that is, δ + 3 × (step function parameters) = 14 + 3 × 7 = 35 parameters. All data sets, including the data set without Nisin, were used.

• In the second estimation, the set δ was fixed (the one obtained in Section 3.3 was used) and only the set of parameters of the step function was estimated. Here, the set of parameters for the step function was independent for each data set, thus 3 × (step function parameters) = 3 × 7 = 21 parameters were estimated.

• Finally, with the same fixed δ, the parameters of the step function, u##, were forced to be common, with the exception of the three tnisin parameters. This ensures that the control depends on only one time variable. Only the Nisin data sets were used and u## + 3 × tnisin = 6 + 3 = 9 parameters were estimated.


4.2 Results

The first estimation returned a low value for the functional, J = 0.65, and the model fitted the data. The initial estimate for δ was the one shown in Table 3.2. By the end of the estimation procedure only small changes in δ were observed.

The obtained parameters for the step function are shown in Table 4.1.

Table 4.1: Three independent step function parameters, obtained on the first estimation with all data sets.

Param.   OD600 = 0.1   OD600 = 0.3   OD600 = 0.8
u11      1.9006        1.1973        1.9677
u12      0.8843        1.0106        0.1798
u21      1.4158        1.7018        1.1581
u22      2.8653        3.8892        0.8279
u31      0.3373        0.0398        −0.0447
u32      −0.1123       −0.0678       0.0183
tnisin   0.2909        0.5796        1.2695

The step function parameters for Glucose, u11 and u12, show a reduction of consumption after tnisin. For Mannitol, u21 and u22 show an increase in production for OD600 = 0.1 and OD600 = 0.3 but a decrease for OD600 = 0.8. The same pattern is observed for Biomass (u31 and u32). The obtained values for tnisin are encouraging since they increase with the time of addition of Nisin.

The second estimation returned a functional of J = 0.7; the results are shown in Table 4.2. The parameters for Glucose, u11 and u12, show a decrease of consumption after tnisin for OD600 = 0.1 and OD600 = 0.8. Mannitol production increases after tnisin for OD600 = 0.1 and OD600 = 0.3, and Biomass production increases after tnisin in all cases. Once again, the values of tnisin increase with the time of addition of Nisin.

Table 4.2: Three independent step function parameters, obtained on the second estimation with the Nisin data sets.

Param.   OD600 = 0.1   OD600 = 0.3   OD600 = 0.8
u11      1.6251        0.7102        0.7925
u12      0.1514        0.7947        −0.0553
u21      0.8024        1.1865        0.6190
u22      1.1551        2.6113        0.3193
u31      −0.7084       −0.7184       −0.7690
u32      −0.4217       0.0656        0.2952
tnisin   1.5365        1.8139        2.4397

The final estimation forced the step function parameters to be common to all data sets. A first optimization finished with a functional of J = 2.4, which is far from ideal, with mismatches between the modeled and real data being easily visible. The obtained parameters are listed in Table 4.3 and Table 4.4.

Table 4.3: The common step function parameters, obtained on the third estimation #1 with the Nisin data sets.

Param.   Value
u11      2.3718
u12      0.8186
u21      1.4296
u22      2.6559
u31      0.3382
u32      −0.1285

Table 4.4: Three independent values for tnisin, obtained on the third estimation #1 with the Nisin data sets.

Param.   OD600 = 0.1   OD600 = 0.3   OD600 = 0.8
tnisin   0.2981        0.6994        1.1334

Although the values of tnisin still increase with the time of addition of Nisin, the modeled data no longer agree with the real data. In this case, the final yield of Mannitol for the modeled data with OD600 = 0.8 is greater than the yield with OD600 = 0.3, calling into question the validity of the model and the possibility of determining the optimal Nisin induction time.

To confirm the last result, a second optimization was run, finishing with a functional of J = 1.3. The obtained results are listed in Table 4.5 and Table 4.6.

Table 4.5: The common step function parameters, obtained on the third estimation #2 with the Nisin data sets.

Param.   Value
u11      2.1825
u12      0.5478
u21      2.1977
u22      2.1486
u31      0.3661
u32      −0.1574

Table 4.6: Three independent values for tnisin, obtained on the third estimation #2 with the Nisin data sets.

Param.   OD600 = 0.1   OD600 = 0.3   OD600 = 0.8
tnisin   0.3209        0.9237        0.3892

Table 4.5 shows that the Glucose consumption (u11 and u12) increases after tnisin, Mannitol production (u21 and u22) increases and Biomass production (u31 and u32) decreases. The values for tnisin do not increase with the time of addition of Nisin.

While the results for u## are in part in agreement with what was expected (increased Mannitol production and decreased Biomass production), the lack of coherence between estimations and the obtained values for tnisin do not allow a generalization of the results.

The results obtained confirm that the model used is not able to describe the variation of Mannitol production, controlled by the time of induction with Nisin, only by varying the parameter tnisin. The results presented in the last paragraphs were confirmed several times by running new optimizations with different initial conditions.

When the set of parameters u## is allowed to be independent for each Nisin data set, it is possible to properly fit the data and obtain increasing tnisin values. For a common u## set, the tnisin values needed to obtain a valid fit do not have the desired characteristic.
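The ordering of the estimated tnisin values across the four estimations can be checked mechanically. The sketch below copies the tnisin rows from Tables 4.1, 4.2, 4.4 and 4.6 (columns ordered as OD600 = 0.1, 0.3, 0.8) and tests whether each row is strictly increasing; the dictionary layout and names are illustrative only.

```python
# t_nisin rows from the tables, ordered as OD600 = 0.1, 0.3, 0.8.
t_nisin = {
    "Table 4.1": [0.2909, 0.5796, 1.2695],   # all parameters free
    "Table 4.2": [1.5365, 1.8139, 2.4397],   # fixed delta, independent u##
    "Table 4.4": [0.2981, 0.6994, 1.1334],   # common u##, optimization #1
    "Table 4.6": [0.3209, 0.9237, 0.3892],   # common u##, optimization #2
}

def strictly_increasing(values):
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

increasing = {name: strictly_increasing(v) for name, v in t_nisin.items()}
```

Only the last row fails the check, and it belongs to the common-u## optimization with the lower cost (J = 1.3 against J = 2.4), which is the conflict between fit quality and ordering described above.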

Applying further restrictions to the estimation algorithm, for example forcing tnisin to be proportional to the time of induction, and fine-tuning the initial conditions would probably make it possible to obtain common δ and u## sets. Still, the validity of the model and its ability to predict new results would be questionable.

As mentioned before, inspection of Fig. 3.5 shows that the effect of Nisin addition is only clear for the data of Mannitol production. From the figure, and from [1], it is not possible to infer a rule for the effect of Nisin on the consumption of Glucose and the formation of Biomass. In fact, the variation of the metabolites in Fig. 3.5 suggests that the difference between the data sets is not the time of addition but the amount of Nisin added. This empirical observation is also supported by the results obtained for the Hill-type functions in Section 3.3.3.

Since it is known that these data sets were obtained for the same strain of L. lactis, under the same laboratory conditions and with the same amount of Nisin added, and that the only difference was the time of addition, one can only conclude that the problem lies in the mathematical model. As explained before (Fig. 3.1), the metabolic pathway for the production of Mannitol in L. lactis is still surrounded by many uncertainties. In [1], many unpredicted results were obtained that actually contributed to the formulation of new tests and to the progressive unraveling of the metabolic pathway structure.

The results obtained in this chapter are not the ones expected, but they may indicate that Mannitol formation with and without Nisin induction is more complex than predicted and that other metabolites must be included in the model in order to build a proper predictive model.

5 Conclusions

The work presented in this thesis addresses, in a logical sequence, several questions that can arise when formulating a strategy to optimize and control the production of a certain metabolite in a metabolic network. Although metabolic engineering is not a new concept, due to its complexity many questions are still unsolved and are expected to remain so for many years to come. The wide variety of metabolic networks makes it hard to define a modeling, optimization or control strategy that is applicable to all of them. Thus, when dealing with a new problem, it is wise to gather all the available information about that specific problem and combine solutions from various problems.

In Chapter 2 a prototype metabolic network was presented. Although quite simple, it exhibits

a trade-off behavior between two metabolites that often occurs in real life. It is shown that, for a

class of networks in which the yield of the product that favors cell population growth (the “natural”

product) competes with the desired product yield, with the manipulated variable affecting linearly

the fluxes, the optimal control that explores this trade-off assumes only extreme values.

While the implementation of control poses no challenge in in silico metabolic networks, real metabolic networks require complex bioengineering skills. Gene knockout manipulations are not suited to this kind of control problem due to the long time scale associated with these techniques. The manipulation of specific enzyme levels, controlled by modulating the expression of the corresponding genes using promoter systems and inducers, is a possible solution to this kind of control problem [6].

Since the lack of detailed information on the kinetics of the networks is frequent, a bi-level opti-

mization was presented and tested for three levels of information on the network. It is shown that

the use of a bi-level optimization strategy, that maximizes the natural product in the inner level by

manipulating the fluxes, leads to a good approximation to the optimal solution, with the advantage

of not requiring the full knowledge of the network model.

The presented optimization strategies are not a valid solution to every optimization problem related to metabolic networks. While the algorithms and results are valid for the network in question, their contribution is mostly a guideline for future optimizations. The different strategies complement each other and, while some might never be used in practical terms, like the optimization using GP when the full kinetics of the network are known, they introduce techniques that can be used in the same context. Although the example network used is very simple, real networks are extremely complex and exhibit relations between metabolites that are not always expected or fully understood. This emphasizes the need for good in silico models. The prototype network has proved to be useful to test the optimization strategies, but a more complex network should be used to confirm that the strategy can be scaled to a bigger network.

Having studied a synthetic prototype network and possible optimization strategies, a real-life case was taken as an example in Chapter 3.

A model for the production of Mannitol in a specific strain of L. lactis was created. The model covers two situations: Mannitol production with and without the addition of Nisin, where Nisin acts as an inducer of two enzymes whose activation leads to the production of Mannitol. This network was used as a case study for the identification of the model's parameters. The challenge was to identify the parameters using simultaneously the data sets obtained for Mannitol production with and without Nisin induction. Since the two models have a common part, the common parameters should be the same.

The estimation of the parameters using multiple data sets can be done in several ways. The ability to freely manipulate the estimation algorithm and the cost function to be minimized is of great importance, since one can adjust the estimation to each particular case. This strategy for parameter identification provides consistency to the estimation and is, hopefully, a step forward in the creation of predictive models, instead of simple descriptive models.

Finally, the validity of using a Hill-type function to model Nisin induction was tested at the end of Chapter 3. The results were not encouraging, since the model was unable to identify distinct times of induction with Nisin.

In Chapter 4 the Hill-type function approach was relaxed and a simple step function was tested to model Nisin induction. Although the modeled data was able to properly fit the real data, once again it was not possible to identify distinct times of induction and, subsequently, formulate an optimization strategy based on a temporal control variable. This problem remains unsolved but, given the reliability of the data sets, the solution must rely on a new approach to modeling Mannitol production that takes into account other underlying mechanisms of Mannitol formation.

The work presented in this thesis on network optimization, parameter estimation and control strategies provides clues for future problems and networks with similar characteristics. The problems and new questions raised can be used as a starting point for many new research paths.



