
Post on 25-Dec-2015


Pathway Modeling and Problem Solving Environments

Cliff Shaffer
Department of Computer Science
Virginia Tech
Blacksburg, VA 24061

The Fundamental Goal of Molecular Cell Biology

Application: Cell Cycle Modeling

How do cells convert genes into behavior?
  Create proteins from genes
  Protein interactions
  Protein effects on the cell

Our study system is the cell cycle of the budding yeast Saccharomyces cerevisiae.

[Figure: the cell cycle. Labels: G1, S (DNA replication), G2, M (mitosis), cell division, growth.]

[Figure: wiring diagram of the budding yeast cell cycle control system. Labeled components include Cln2, Cln3, Bck2, SBF, MBF, Clb2, Clb5, Sic1, SCF, Swi5, Mcm1, Cdc20, Cdh1, APC, Cdc14, Net1, RENT, Cdc15/MEN, Tem1, Bub2, Lte1, Mad2, Esp1, Pds1, PPX, and the CDKs. Labeled events include budding, DNA synthesis, sister chromatid separation, and mitosis; unaligned chromosomes appear as an input signal.]

Modeling Techniques

One method: Use ODEs that describe the rate at which each protein concentration changes.
Protein A degrades protein B:

$$\frac{d[B]}{dt} = -c\,[A]$$

… with initial condition [A](0) = A0.

Parameter c determines the rate of degradation.
Sometimes modelers use “creative” rate laws to approximate subsystems.

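$$\frac{d[\mathrm{Cln2}]}{dt} = \underbrace{k_1' + k_1[\mathrm{SBF}]}_{\text{synthesis}} - \underbrace{k_2[\mathrm{Cln2}]}_{\text{degradation}}$$

$$\frac{d[\mathrm{Clb2}]}{dt} = \underbrace{k_3' + k_3[\mathrm{Mcm1}]}_{\text{synthesis}} - \underbrace{\left(k_4' + k_4[\mathrm{Cdh1}]\right)[\mathrm{Clb2}]}_{\text{degradation}} - \underbrace{k_5[\mathrm{Sic1}][\mathrm{Clb2}]}_{\text{binding}}$$

$$\frac{d[\mathrm{Cdh1}]}{dt} = \underbrace{\frac{\left(k_6' + k_6[\mathrm{Cdc20}]\right)\left([\mathrm{Cdh1}]_T - [\mathrm{Cdh1}]\right)}{J_6 + [\mathrm{Cdh1}]_T - [\mathrm{Cdh1}]}}_{\text{activation}} - \underbrace{\frac{\left(k_7' + k_7[\mathrm{Clb5}]\right)[\mathrm{Cdh1}]}{J_7 + [\mathrm{Cdh1}]}}_{\text{inactivation}}$$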

Mathematical Model

Simulation of the budding yeast cell cycle

[Figure: simulated time courses of mass, CKI, Clb2, Cln2, Cdh1, and Cdc20 over 0 to 150 min, with the G1 and S/M phases marked.]

Table 6. Properties of clb, sic1, and hct1 mutants

Masses are followed, where given, by the time of occurrence of the event in parentheses. CT = cycle time.

| # | Strain | Mass at birth | Mass at SBF 50% | Mass at DNA repl. | Mass at bud ini. | Mass at division | TG1 (min) | Changed parameter | Comments |
|---|---|---|---|---|---|---|---|---|---|
| 1 | wild type (daughter) | 0.71 | 1.07 (71’) | 1.15 (84’) | 1.15 (84’) | 1.64 (146’) | 84 | | CT 146 min |
| 2 | clb1 clb2 | 0.71 | 1.07 | 1.16 | 1.16 | No mit | | k's,b2 = 0; k"s,b2 = 0 | Surana 1991 Table 1, G2 arrest. |
| 3 | clb1 clb2 1X GAL-CLB2 | 0.65 | 1.10 | 1.19 | 1.19 | 1.50 | 105 | k's,b2 = 0.1; k"s,b2 = 0 | Surana 1993 Fig 4, 1X GAL-CLB2 is OK, 4X GAL-CLB2 (or 1X GAL-CLB2db) causes telophase arrest. |
| 4 | clb5 clb6 | 0.73 | 1.07 (65’) | 1.30 (99’) | 1.17 (80’) | 1.70 (146’) | 99 | k's,b5 = 0; k"s,b5 = 0 | Schwob 1993 Fig 4, DNA repl begins 30 min after SBF activation. |
| 5 | clb5 clb6 GAL-CLB5 | 0.61 | 0.93 | 0.92 | 0.96 | 1.41 | 73 | k's,b5 = 0.1; k"s,b5 = 0 | Schwob 1993 Fig 6, DNA repl concurrent with SBF activation in both GAL-CLB5 and GAL-CLB5db. |
| 6 | sic1 | 0.66 | 1.00 (73’) | 0.82 (37’) | 1.06 (83’) | 1.52 (146’) | 38 | k's,c1 = 0; k"s,c1 = 0 | Schneider 1996 Fig 4, sic1 uncouples S phase from budding. |
| 7 | sic1 GAL-SIC1 | 0.80 | 1.07 | 1.38 | 1.17 | 1.86 | 94 | k's,c1 = 0.1; k"s,c1 = 0 | Verma 1997 Fig 3B, Nugroho & Mendenhall 1994 Fig 2, most cells are viable. |
| 8 | hct1 | 0.73 | 1.08 | 1.17 | 1.18 | 1.69 | 82 | k"d,b2 = 0.01 | Schwab 1997 Fig 2, viable, size like WT, Clb2 level high throughout the cycle. |
| 9 | sic1 hct1 | 0.71 | No SBF | 0.72 | No bud | No mit | | k's,c1 = 0; k"d,b2 = 0.01 | Visintin 1997, telophase arrest. |
| 10 | sic1 GAL-CLB5 (first cycle) | 0.71 | 0.74 | 0.73 | 0.76 | 1.20 | | k's,b5 = 0.1; k"s,b5 = 0; k's,c1 = 0 | Schwob 1994 Fig 7C, inviable. First cycle OK, DNA repl advanced; but pre-repl complexes cannot form and cell dies after the first cycle. |
| 10 | sic1 GAL-CLB5 (second cycle) | 0.52 | | No repl | | | | | |


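Differential equations

$$\frac{d\,\mathrm{CDK}}{dt} = k_1 - \left(v_2' + v_2''\,\mathrm{Cdh1}\right)\mathrm{CDK}$$

$$\frac{d\,\mathrm{Cdh1}}{dt} = \frac{\left(k_3' + k_3''\,\mathrm{Cdc20_A}\right)\left(1 - \mathrm{Cdh1}\right)}{J_3 + 1 - \mathrm{Cdh1}} - \frac{\left(k_4' + k_4''\,\mathrm{CDK}\cdot M\right)\mathrm{Cdh1}}{J_4 + \mathrm{Cdh1}}$$

$$\frac{d\,\mathrm{IEP}}{dt} = k_9\,\mathrm{CDK}\cdot M\,\left(1 - \mathrm{IEP}\right) - k_{10}\,\mathrm{IEP}$$

$$\frac{d\,\mathrm{Cdc20_T}}{dt} = k_5' + k_5''\,\frac{\left(\mathrm{CDK}\cdot M\right)^4}{J_5^4 + \left(\mathrm{CDK}\cdot M\right)^4} - k_6\,\mathrm{Cdc20_T}$$

$$\frac{d\,\mathrm{Cdc20_A}}{dt} = \frac{k_7\,\mathrm{IEP}\left(\mathrm{Cdc20_T} - \mathrm{Cdc20_A}\right)}{J_7 + \mathrm{Cdc20_T} - \mathrm{Cdc20_A}} - \frac{k_8\,\mathrm{MAD}\cdot\mathrm{Cdc20_A}}{J_8 + \mathrm{Cdc20_A}} - k_6\,\mathrm{Cdc20_A}$$

Parameter values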

k1 = 0.0013, v2’ = 0.001, v2” = 0.17,

k3’ = 0.02, k3” = 0.85, k4’ = 0.01, k4” = 0.9,

J3 = 0.01, J4 = 0.01, k9 = 0.38, k10 = 0.2,

k5’ = 0.005, k5” = 2.4, J5 = 0.5, k6 = 0.33,

k7 = 2.2, J7 = 0.05, k8 = 0.2, J8 = 0.05,
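As a rough illustration of what "running a simulation" means for these five equations, here is a minimal sketch in plain Python. It is not the solver JigCell uses. Several things are assumptions: the slide gives no values for M (mass) or MAD, so both are held constant at 1.0; the loss term for active Cdc20 is taken proportional to Cdc20A; the names `k3p`/`k3pp` stand for k3’/k3”; and the fixed-step Runge-Kutta integrator and the initial state in the usage note are hypothetical choices.

```python
import math

# Parameter values as listed on the slide (k3p = k3', k3pp = k3'', and so on).
PARAMS = dict(k1=0.0013, v2p=0.001, v2pp=0.17,
              k3p=0.02, k3pp=0.85, k4p=0.01, k4pp=0.9,
              J3=0.01, J4=0.01, k9=0.38, k10=0.2,
              k5p=0.005, k5pp=2.4, J5=0.5, k6=0.33,
              k7=2.2, J7=0.05, k8=0.2, J8=0.05)


def rhs(state, p=PARAMS, M=1.0, MAD=1.0):
    """Right-hand sides of the five ODEs; M and MAD are assumed constants."""
    cdk, cdh1, iep, c20t, c20a = state
    x = cdk * M
    d_cdk = p["k1"] - (p["v2p"] + p["v2pp"] * cdh1) * cdk
    d_cdh1 = ((p["k3p"] + p["k3pp"] * c20a) * (1 - cdh1) / (p["J3"] + 1 - cdh1)
              - (p["k4p"] + p["k4pp"] * x) * cdh1 / (p["J4"] + cdh1))
    d_iep = p["k9"] * x * (1 - iep) - p["k10"] * iep
    d_c20t = (p["k5p"] + p["k5pp"] * x**4 / (p["J5"]**4 + x**4)
              - p["k6"] * c20t)
    # Assumption: active Cdc20 is lost at rate k6 * Cdc20A.
    d_c20a = (p["k7"] * iep * (c20t - c20a) / (p["J7"] + c20t - c20a)
              - p["k8"] * MAD * c20a / (p["J8"] + c20a)
              - p["k6"] * c20a)
    return [d_cdk, d_cdh1, d_iep, d_c20t, d_c20a]


def rk4_step(f, y, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(y)
    k2 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k1)])
    k3 = f([yi + 0.5 * dt * ki for yi, ki in zip(y, k2)])
    k4 = f([yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(y0, t_end, dt=0.005, M=1.0, MAD=1.0, p=PARAMS):
    """Integrate from t = 0 to t_end (minutes) and return the final state."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        y = rk4_step(lambda s: rhs(s, p, M, MAD), y, dt)
    return y
```

For example, `simulate([0.1, 0.9, 0.0, 0.0, 0.0], 10)` advances a hypothetical initial state (CDK, Cdh1, IEP, Cdc20T, Cdc20A) by ten minutes. Reproducing a full cell cycle would also need the mass dynamics and division rule, which the slide does not give.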

Experimental Data

Tyson’s Budding Yeast Model

Tyson’s model contains over 30 ODEs, some nonlinear.

Events can cause concentrations to be reset.

About 140 rate constant parameters
  Most are unavailable from experiment and must be set by the modeler

Fundamental Activities

Collect information
  Search literature (databases), lab notebooks

Define/modify models
  A user interface problem

Run simulations
  Equation solvers (ODEs, PDEs, deterministic, stochastic)

Compare simulation results to experimental data
  Analysis

Modeling Lifecycle

Our Mission: Build Software to Help the Modelers

Typical cycle time for changing the model used to be one month
  Collect data on paper lab notebooks
  Convert to differential equations by hand
  Calibrate the model by trial and error
  Inadequate analysis tools

Goal: Change the model once per day.
  Bottleneck should shift to the experimentalists

Another View

Current models of simple organisms contain a few tens of equations.

Modeling mammalian systems might require two additional orders of magnitude of complexity.

We hope our current vision for tools can supply one order of magnitude.

The other order of magnitude is an open problem.

JigCell

Current Primary Software Components:

JigCell Model Builder

JigCell Run Manager

JigCell Comparator

Automated Parameter Estimation (PET)

Bifurcation Analysis (Oscill8)

http://jigcell.biol.vt.edu

[Figure: workflow linking the Model Builder, Run Manager, Comparator, and Parameter Optimizer. Labeled data flows: Parameter Values, Optimum Parameter Values.]

From a wiring diagram…

JigCell Model Builder

N.B. Parameters are given names, not numerical values!

…to a reaction mechanism

… to ordinary differential equations (ode files, SBML)
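The step from a reaction mechanism to ODEs can be sketched in a few lines. This is only an illustration, not the Model Builder's implementation: the tuple layout and the function name `reactions_to_odes` are hypothetical, every stoichiometric coefficient is assumed to be 1, and the two example reactions assume Cln2 is synthesized at rate k1' + k1[SBF] and degraded at rate k2[Cln2]. Rate laws are kept as strings so that parameters stay names rather than numbers.

```python
from collections import defaultdict

# Each reaction: (reactants, products, rate law written with parameter names).
# Hypothetical encoding of Cln2 synthesis and degradation.
reactions = [
    ([], ["Cln2"], "k1p + k1*SBF"),
    (["Cln2"], [], "k2*Cln2"),
]


def reactions_to_odes(reactions):
    """Build one ODE string per species: +rate if produced, -rate if consumed."""
    terms = defaultdict(list)
    for reactants, products, rate in reactions:
        for species in reactants:
            terms[species].append(f"- ({rate})")
        for species in products:
            terms[species].append(f"+ ({rate})")
    return {species: f"d[{species}]/dt = " + " ".join(parts)
            for species, parts in terms.items()}


odes = reactions_to_odes(reactions)
```

Here `odes["Cln2"]` is the string `d[Cln2]/dt = + (k1p + k1*SBF) - (k2*Cln2)`. Numerical values for k1p, k1, and k2 would be bound later, once per simulation run.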

JigCell Model Builder

Mutations

Wild type cell

Mutations
  Typically caused by gene knockout
  Consider a mutant with no B to degrade A.
    Set c = 0
  We have about 130 mutations
    Each requires a separate simulation run

• Inheritance patterns

Basal Set (wild-type)
  Derived Set (mutant A)
  Derived Set (mutant B)
  Derived Set (mutant C)
  Derived Set (mutant A’)
  Derived Set (mutant AB)
  Derived Set (mutant A’C)
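One way to picture the basal/derived pattern is as parameter sets that store only their changes and look everything else up in their parents. The sketch below is an assumption about how such inheritance could work, not the Run Manager's actual data model; the class name `ParameterSet`, its methods, and all numerical values are hypothetical placeholders.

```python
class ParameterSet:
    """A named set of parameter changes layered on top of parent sets."""

    def __init__(self, name, changes, parents=()):
        self.name = name
        self.changes = dict(changes)
        self.parents = tuple(parents)

    def lineage(self):
        """Ancestors first (each listed once), then this set."""
        order = []
        for parent in self.parents:
            for ancestor in parent.lineage():
                if ancestor not in order:
                    order.append(ancestor)
        order.append(self)
        return order

    def resolve(self):
        """Apply changes from the basal set down to this set."""
        values = {}
        for param_set in self.lineage():
            values.update(param_set.changes)
        return values


# Placeholder values; only the structure matters here.
basal = ParameterSet("wild-type", {"c": 0.5, "k6": 0.33})
mutant_a = ParameterSet("mutant A", {"c": 0.0}, [basal])    # knockout: c = 0
mutant_b = ParameterSet("mutant B", {"k6": 0.0}, [basal])
mutant_ab = ParameterSet("mutant AB", {}, [mutant_a, mutant_b])
```

With this layout, `mutant_ab.resolve()` yields both knockouts without restating them, and a later change to a basal value reaches every derived set the next time it is resolved. That is the point of the pattern when about 130 mutants share one wild-type parameter set.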

Run Manager

JigCell Run Manager

Phenotypes

Each mutant has some observed outcome (“experimental” data).
  Generally qualitative.
    Cell lived
    Cell died in G1 phase

Model should match the experimental data.

Model should not be overly sensitive to the rate constants.
  Overly sensitive biological systems tend not to survive
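Sensitivity to a rate constant can be quantified as a normalized derivative, (p / y) · dy/dp, estimated by finite differences. The sketch below is an illustration only. It assumes a Cln2 rate law of the form d[Cln2]/dt = k1' + k1[SBF] − k2[Cln2], whose steady state is (k1' + k1[SBF]) / k2, and the parameter values are hypothetical.

```python
def cln2_steady_state(p):
    """Steady state of d[Cln2]/dt = k1p + k1*SBF - k2*Cln2 (assumed form)."""
    return (p["k1p"] + p["k1"] * p["SBF"]) / p["k2"]


def relative_sensitivity(output, params, name, rel_step=1e-4):
    """Central-difference estimate of (p / y) * dy/dp for one parameter."""
    base = output(params)
    up = dict(params)
    up[name] = params[name] * (1 + rel_step)
    down = dict(params)
    down[name] = params[name] * (1 - rel_step)
    dy_dp = (output(up) - output(down)) / (2 * rel_step * params[name])
    return dy_dp * params[name] / base


# Hypothetical values, chosen only to make the arithmetic easy to follow.
params = {"k1p": 0.1, "k1": 0.5, "k2": 1.0, "SBF": 1.0}
s_k2 = relative_sensitivity(cln2_steady_state, params, "k2")
s_k1 = relative_sensitivity(cln2_steady_state, params, "k1")
```

A value near −1 for `s_k2` means a 1% increase in k2 lowers the steady state by about 1%. A model whose outputs show sensitivities far larger in magnitude than this would be a candidate for the "overly sensitive" concern above.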

Visualize results

Kumagai1 Kumagai2

Comparator

Comparator

Optimization

How to decide on parameter values?

Key features of optimization
  Each candidate solution is a point in a multidimensional space
  Each point can be assigned a value by an objective function
  The goal is to find the best point in the space as defined by the objective function
  We usually settle for a “good” point

Parameter Optimization

Error Function

orthogonal distance regression

Levenberg-Marquardt algorithm
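To make the error function and the Levenberg-Marquardt idea concrete, here is a small self-contained sketch. It swaps in ordinary least squares (vertical residuals only) rather than orthogonal distance regression, and it fits a hypothetical two-parameter decay curve y = a0 · exp(−c t) to synthetic, noise-free data generated from known values. None of this is the PET implementation; the function names and the damping schedule are assumptions.

```python
import math


def model(t, a0, c):
    return a0 * math.exp(-c * t)


def sse(params, data):
    """Error function: sum of squared residuals between model and data."""
    a0, c = params
    return sum((model(t, a0, c) - y) ** 2 for t, y in data)


def levenberg_marquardt(data, start, iterations=200, lam=1e-3):
    a0, c = start
    err = sse((a0, c), data)
    for _ in range(iterations):
        if err < 1e-20:
            break
        # Accumulate J^T J (2x2) and J^T r for the current parameters.
        s11 = s12 = s22 = g1 = g2 = 0.0
        for t, y in data:
            e = math.exp(-c * t)
            r = a0 * e - y
            j1, j2 = e, -a0 * t * e          # d(model)/d(a0), d(model)/d(c)
            s11 += j1 * j1
            s12 += j1 * j2
            s22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        # Damped normal equations (J^T J + lam * diag(J^T J)) * delta = -J^T r,
        # solved by Cramer's rule.
        m11, m22 = s11 * (1 + lam), s22 * (1 + lam)
        det = m11 * m22 - s12 * s12
        if det == 0.0:
            break
        d1 = (-g1 * m22 + g2 * s12) / det
        d2 = (-g2 * m11 + g1 * s12) / det
        trial = (a0 + d1, c + d2)
        trial_err = sse(trial, data)
        if trial_err < err:                  # accept the step, trust it more
            (a0, c), err = trial, trial_err
            lam /= 10
        else:                                # reject the step, damp harder
            lam *= 10
    return (a0, c), err


# Synthetic data from known values a0 = 2.0, c = 0.3 (not experimental data).
data = [(float(t), model(float(t), 2.0, 0.3)) for t in range(11)]
(fit_a0, fit_c), fit_err = levenberg_marquardt(data, start=(1.0, 0.1))
```

Like any local method, this only finds a nearby minimum from the starting point, which is one reason a global search is used alongside it.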

Parameter Optimization

Only 1 experiment shown here. The model must be fitted simultaneously to many different experiments.

Parameter Optimization

Global DIRECT Search (DIViding RECTangles)

Global DIRECT Search (DIViding RECTangles)

Composition Motivation

Models are reaching the limits of manageability due to an increase in:
  Size
  Complexity

Making a model suitable for stochastic simulation increases the number of reactions by a factor of 3-5.

Models of the mammalian cell cycle will require 100-1000 reactions (even more for stochastic simulation).

Model Composition

Notice that the yeast cell diagram contains natural components

Composition Processes

Fusion
  Merging two or more existing models

Composition
  Build up model hierarchy from existing models by describing their interactions and connections

Aggregation
  Connects modular blocks using controlled interfaces (ports)

Flattening
  Convert hierarchy back into a single “flat” model for use with standard simulators
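Fusion, the first of the processes above, can be pictured as renaming species according to a mapping table and then taking the union of the reactions. The sketch below is a toy illustration under that assumption and does not reflect the JigCell composition tools. The reaction encoding, the function names, and the species and rate-constant names are all hypothetical.

```python
def rename(reaction, species_map):
    """Rewrite a reaction's species names using the mapping table."""
    reactants, products, rate_constant = reaction
    return (tuple(species_map.get(s, s) for s in reactants),
            tuple(species_map.get(s, s) for s in products),
            rate_constant)


def fuse(model_a, model_b, species_map):
    """Merge two reaction lists, dropping reactions that become identical."""
    merged = []
    for reaction in list(model_a) + [rename(r, species_map) for r in model_b]:
        if reaction not in merged:
            merged.append(reaction)
    return merged


def species_of(model):
    return {s for reactants, products, _ in model for s in reactants + products}


# Two hypothetical submodels; "Y2" in model_b is the same species as "Y".
model_a = [(("X",), ("Y",), "ka")]
model_b = [(("X",), ("Y2",), "ka"), (("Y2",), ("Z",), "kb")]
fused = fuse(model_a, model_b, {"Y2": "Y"})
```

After fusion the shared reaction X → Y appears once and the result is a single flat reaction list over X, Y, and Z. A real tool would also have to reconcile conflicting rate laws, units, and initial conditions, which this sketch ignores.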

Composition Processes

Sample Sub-models

Sample Composed Model

Composition Wizard: Final Species Mapping Table

Composition Wizard: Final Reaction Mapping Table

Aggregated Submodels

Final Aggregated Model

Aggregation Connector

Composition in SBML

Virginia Tech’s proposed language features to support composition/aggregation are being written into the forthcoming SBML Level 3 definition.

Stochastic Simulation

ODE-based (deterministic) models cannot explain behaviors introduced by the random nature of the system.
  Variations in mass at division
  Variations in time of events
  Differences in gross outcomes

Gillespie’s Stochastic Simulation Algorithm

There is a population for each chemical species

There is a “propensity” for each reaction, in part determined by population

Each reaction changes population for associated species

Loop:
  Pick next reaction (random, propensity)
  Update populations, propensities

Slow, but there are approximations to speed it up
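The loop described above can be sketched with the direct method of Gillespie's algorithm. This is a minimal illustration, not the simulator used in the project: the reaction encoding (a propensity function paired with a population change) is an assumption, and the single degradation reaction with rate constant 0.1 is a hypothetical example.

```python
import random


def gillespie(x0, reactions, t_end, rng):
    """Direct-method SSA.

    reactions: list of (propensity_function, population_change) pairs.
    Returns a list of (time, populations) snapshots.
    """
    t = 0.0
    x = dict(x0)
    trajectory = [(t, dict(x))]
    while t < t_end:
        propensities = [propensity(x) for propensity, _ in reactions]
        total = sum(propensities)
        if total <= 0:                      # nothing can fire any more
            break
        t += rng.expovariate(total)         # waiting time to the next event
        if t > t_end:
            break
        # Pick a reaction with probability proportional to its propensity.
        threshold = rng.random() * total
        chosen = len(reactions) - 1
        running = 0.0
        for index, a in enumerate(propensities):
            running += a
            if threshold < running:
                chosen = index
                break
        for species, delta in reactions[chosen][1].items():
            x[species] += delta
        trajectory.append((t, dict(x)))
    return trajectory


# Hypothetical example: X is degraded with propensity 0.1 * X.
degradation = [(lambda x: 0.1 * x["X"], {"X": -1})]
run = gillespie({"X": 50}, degradation, t_end=1e9, rng=random.Random(1))
```

Every event requires recomputing propensities and drawing two random numbers, which is why the exact method is slow for large populations and why approximate variants exist.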

Comments on Collaboration

Domain team routinely underestimates how difficult it is to create reliable and usable software.

CS team routinely underestimates how difficult it is to stay focused on the needs of the domain team.

Partial solution: truly integrate.

How to Succeed in CBB

Programming skills are necessary but not sufficient

Math is usually the biggest bottleneck
  Statistics for bioinformatics
  Numerical analysis, optimization, differential equations for computational biology

Chemistry/biochemistry are good choices for domain knowledge

You have to have an “interdisciplinary attitude”