Identification of Predictive Dynamic Models for Systems Biology

Jörg Stelling

joerg.stelling@bsse.ethz.ch

IMA Workshop Biological Systems and Networks

Minneapolis / MN, November 2015

Network Identification: Dual Challenges

Complexity:

Many components 

Dynamic interactions

Self-modifying system

Spatial organization

Uncertainty:

Incomplete inventory

Few quantitative data

Conflicting hypotheses

Molecular noise

Protein interactions in Drosophila: J. Giot et al. (2003) Science 302: 1727

Aim: Mechanistic Model Development

Figure from: Aldridge et al. (2006) Nature Cell Biology 8: 1195.

Figure panels: Pathway diagram; Pictogram; Reaction list; Differential equations; Approximations

Co-Design of Experiments and Models

Exploitation of prior knowledge → Few, targeted experiments.

Model ensembles representing uncertain prior knowledge.

Challenges: Model discrimination & experimental design.

Experiment

Model ensembles

Prior knowledge:

(Conflicting) hypotheses.

Few experimental data.

Qualitative data.

Remember that all models are wrong; the practical

question is how wrong do they have to be to not be useful.

G.E. Box

Science may be described as the art

of systematic over-simplification.

K. Popper

Coarse Identification: Network Structure

A.-P. Oliveira et al. (2015) Mol. Syst. Biol. 11: 802.

Example: Yeast Nutrient Signaling

TOR

PKA

De Virgilio & Loewith, Oncogene 25:6392 (2006).

Problem: Identifying fundamental signaling mechanisms,

such as specific input signals to nutrient sensing pathways.

Challenge: Data Integration & Causal Inference

Oliveira et al., Mol. Syst. Biol. 11:802 (2015).

Identification of causal relations (metabolic input signals):

Design of informative experiments and data integration.

Figure labels: Metabolites; Transcripts; TOR

Conceptual Idea: Network Motifs

Approach: Decompose a network (graph) into smaller subunits.

Network motifs: "Patterns of interconnections that recur ... at frequencies significantly higher than ... in randomized networks."

S. Shen-Orr et al., Nat. Genetics 31: 64, 2002.
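The motif definition quoted above can be illustrated with a minimal sketch that counts one motif type, the feed-forward loop, and compares the count with degree-preserving randomized networks. The toy edge list, the function names `count_ffl` and `randomize`, and the number of swaps are assumptions made for illustration; none of them come from the talk or the cited paper.

```python
import random

def count_ffl(edges):
    """Count feed-forward loops: a->b, b->c and a->c on three distinct nodes."""
    succ = {}
    for a, b in set(edges):
        succ.setdefault(a, set()).add(b)
    count = 0
    for a, targets in succ.items():
        for b in targets:
            if b == a:
                continue
            for c in succ.get(b, ()):
                if c != a and c != b and c in targets:
                    count += 1
    return count

def randomize(edges, n_swaps, rng):
    """Degree-preserving randomization: swap the targets of random edge pairs."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        new1, new2 = (a, d), (c, b)
        # Reject swaps that would create self-loops or duplicate edges.
        if i == j or a == d or c == b or new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges

# Hypothetical toy network (not from the talk); edges are assumed to be unique.
EDGES = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("X", "W"), ("W", "Z"), ("Z", "V")]

rng = random.Random(0)
observed = count_ffl(EDGES)
random_counts = [count_ffl(randomize(EDGES, 100, rng)) for _ in range(200)]
mean_random = sum(random_counts) / len(random_counts)
print("observed FFLs:", observed, "mean in randomized networks:", mean_random)
```

A network this small cannot support a significance statement; in practice the observed count would be compared with the distribution of `random_counts` over a much larger network.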

Network Motif-Based Causal Inference

Oliveira et al., Mol. Syst. Biol. 11:802 (2015).

Approach: 'Prototypic' interactions and qualitative causal

model → Experimental design enables model selection.

Network Motif-Based Causal Inference

Approach: 'Prototypic' interactions and qualitative causal

model → Bayesian approach & integration over methods.

TOR Input Signals: Experimental Design

Idea: Co-design of inference method and perturbation

experiments → Consistent, large-scale dynamic data set.

TOR Input Signals: Inference Results

Motif-based assignment of metabolite functions: Robust

(integration over methods) and 'plausible' inference results.

Generalization of the Approach?

Difficulties: Feature and experiment selection according to

(biological) hypothesis space; motif identifiability analysis.

'Downstream' motifs excluded by prior knowledge

'Downstream' motifs that cannot be discriminated

Detailed Identification: Toward Mechanisms

M. Sunnaker et al. (2013) Sci. Signaling ra41.

Prior Work: Ensemble Models for Yeast TOR

Set of ODE models: 24-29 species (states), 30-50 parameters.

Problem: Combinatorial explosion (models and parameters).

L. Kuepfer et al., Ensemble modeling for analysis of cell signaling dynamics. Nat. Biotechnol. 25: 1001 (2007).

Figure labels (columns: Computational Analysis; Experimental Analysis):

Core model (CM, 1)

Original ensemble (19)

Ensemble of kinetically decoupled models (13)

Ensemble with multiple phosphorylation (12)

Assembly of type 2A phosphatases

Tip41p complex formation with type 2A phosphatases

- Model extensions

- Model discrimination

- Cross-validation

- Critical experiments

- Key control mechanisms

- Model refinement

Topological Filtering: Concept

Formulation of all hypotheses in a 'supermodel', specification of models M_i via parameters.

Model reduction: Projection for elimination of single parameters.

Model evaluation by Bayes factors / posteriors given data Y:
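For reference, the standard definitions of the Bayes factor between two models M_i and M_j and of the model posterior given data Y read as follows. The symbols B_{ij} and the model priors P(M_i) are notation assumed here; the exact form used in the talk may differ.

```latex
B_{ij} = \frac{P(Y \mid M_i)}{P(Y \mid M_j)},
\qquad
P(M_i \mid Y) = \frac{P(Y \mid M_i)\, P(M_i)}{\sum_k P(Y \mid M_k)\, P(M_k)}
```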

M. Sunnaker et al., A method for automatic generation of predictive dynamic models …, Science Signaling, ra41, 2013.

Topological Filtering: Method

Dynamic (ODE) system, Gaussian measurement noise:

Likelihood function for model M, given parameter point Θ and

data Y, measurement covariance matrix S, residuals ε:
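A standard way to write this setup, given as a sketch under assumed notation: the state x, right-hand side f, model output y(Θ), and number of data points n are not taken from the slide, while Θ, Y, S, and ε are used as named above.

```latex
\begin{align}
\dot{x}(t) &= f\big(x(t), \Theta\big), \qquad x(0) = x_0, \\
\varepsilon &= Y - y(\Theta), \qquad \varepsilon \sim \mathcal{N}(0, S), \\
P(Y \mid M, \Theta) &= \frac{1}{\sqrt{(2\pi)^{n} \det S}}
  \exp\!\Big(-\tfrac{1}{2}\, \varepsilon^{\top} S^{-1} \varepsilon\Big).
\end{align}
```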

Bayes factors by integration over 'viable' volume (we do not

need to know the 'true' model for this computation):
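A minimal numerical sketch of such a Bayes factor computation follows, assuming a toy supermodel dx/dt = -k x + p with x(0) = 1, a submodel that fixes p = 0, uniform priors, and S = σ²I. The data, priors, grid size, and the threshold `chi2_max` are all hypothetical. The marginal likelihood ∫ P(Y | M, Θ) P(Θ | M) dΘ is approximated by a midpoint rule, and restricting the sum to 'viable' points (small weighted residuals) changes the result only negligibly.

```python
import math

TIMES = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
# Hypothetical measurements: exp(-0.5 t) plus small fixed deviations.
DEVIATIONS = [0.03, -0.05, 0.02, 0.04, -0.03, 0.01, -0.02, 0.00]
Y = [math.exp(-0.5 * t) + d for t, d in zip(TIMES, DEVIATIONS)]
SIGMA = 0.1  # assumed measurement standard deviation, S = SIGMA^2 * I
NORM = (2.0 * math.pi * SIGMA ** 2) ** (-len(TIMES) / 2.0)

def chi2(k, p):
    """Weighted squared residuals eps^T S^-1 eps for the toy supermodel."""
    total = 0.0
    for t, y in zip(TIMES, Y):
        # Closed-form solution of dx/dt = -k x + p with x(0) = 1.
        x = p / k + (1.0 - p / k) * math.exp(-k * t)
        total += ((y - x) / SIGMA) ** 2
    return total

def midpoints(lo, hi, n):
    h = (hi - lo) / n
    return [lo + (i + 0.5) * h for i in range(n)]

def evidence(k_grid, p_grid, chi2_max=math.inf):
    """Uniform-prior marginal likelihood: average likelihood over the grid.

    Points with chi2 > chi2_max are treated as non-viable and skipped.
    """
    total = 0.0
    for k in k_grid:
        for p in p_grid:
            c = chi2(k, p)
            if c <= chi2_max:
                total += NORM * math.exp(-0.5 * c)
    return total / (len(k_grid) * len(p_grid))

K = midpoints(0.0, 2.0, 150)   # prior k ~ U(0, 2)
P = midpoints(0.0, 1.0, 150)   # prior p ~ U(0, 1), supermodel only

ev_sub = evidence(K, [0.0])                      # submodel: p eliminated
ev_super = evidence(K, P)                        # supermodel: full prior box
ev_super_viable = evidence(K, P, chi2_max=50.0)  # viable region only
bayes_factor = ev_sub / ev_super
print("Bayes factor (submodel vs. supermodel):", bayes_factor)
```

Grid integration is used here only because the toy problem has one or two parameters; for models of realistic size the viable region would have to be characterized by sampling.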

Topological Filtering: Sampling Algorithm

Characterization of parameter spaces as a key ingredient.

Improved scaling: Novel (hybrid MCMC) sampling algorithm.

E. Zamora Sillero et al., Efficient characterization of high-dimensional parameter spaces ... BMC Syst. Biol. (2011).
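To make the idea of characterizing a parameter space by sampling concrete, here is a plain random-walk Metropolis sketch. It is a deliberately simpler stand-in, not the hybrid algorithm of the cited paper. The target density, step size, burn-in length, and viability threshold are hypothetical choices.

```python
import math
import random

def log_target(theta):
    """Hypothetical log-likelihood: independent Gaussians around (1.0, 2.0)."""
    return -0.5 * (((theta[0] - 1.0) / 0.5) ** 2 + ((theta[1] - 2.0) / 0.2) ** 2)

def metropolis(log_p, start, n_steps, step, rng):
    """Random-walk Metropolis sampler with isotropic Gaussian proposals."""
    theta, lp = list(start), log_p(start)
    samples, accepted = [], 0
    for _ in range(n_steps):
        proposal = [x + rng.gauss(0.0, step) for x in theta]
        lp_new = log_p(proposal)
        # Accept uphill moves always, downhill moves with probability exp(delta).
        if lp_new >= lp or rng.random() < math.exp(lp_new - lp):
            theta, lp = proposal, lp_new
            accepted += 1
        samples.append(list(theta))
    return samples, accepted / n_steps

rng = random.Random(1)
samples, acceptance = metropolis(log_target, [0.0, 0.0], 20000, 0.3, rng)
kept = samples[2000:]  # discard burn-in
mean = [sum(s[i] for s in kept) / len(kept) for i in range(2)]
# 'Viable' points: samples whose log-likelihood exceeds an assumed threshold.
viable = [s for s in kept if log_target(s) > -3.0]
print("acceptance rate:", acceptance, "sample mean:", mean)
```

A single isotropic step size scales poorly when parameters differ strongly in sensitivity or are correlated, which is the kind of limitation that motivates more elaborate samplers.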

Application: Yeast Stress Response

Short-term control of stress responsive genes via TF Msn2.

Glucose addition after starvation: Input cAMP peak, output

Msn2 phosphorylation and translocation to the cytoplasm.

Input Output

Application: Yeast Stress Response

Ensemble of 192 candidate topologies → 12 feasible models.

Iterations of (optimal) experimental design, data integration,

and posterior computations for mechanistic identification.

Iteration 1; Iteration 2; Iteration 3

Application: Yeast Stress Response

Example iteration: Confirmation of fast predicted Msn2

phosphorylation dynamics by targeted phosphoproteomics.

nucleus

cytoplasm

Application: Yeast Stress Response

Analysis of most plausible (>99% relative posterior) model:

Fast switching of phosphorylation state in nucleus.

High constitutive turnover of phosphorylation.

Switching through minor differences in net rates.

Msn2, nuc. = red; Msn2, cyt. = green; Msn2P, nuc. = grey; Msn2P, cyt. = blue

Nuclear rates = black; Cytoplasmic rates = green

Nuclear net rates = grey; Cytopl. net rates = blue

Application: Yeast Stress Response

Msn2 system can act as differential sensor of cAMP input.

faster normal slower

Application: X Activity Control

Steady-state (simple) model, constant kinase input u:

Relation between phosphorylation and localization.

Switch between two (constitutive) transport cycles.

Figure labels: Cytoplasm: X_C, PX_C; Nucleus: X_N, PX_N; Nuclear X; Phosphorylated X
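A minimal sketch of such a steady-state model, assuming a linear four-state scheme: X is phosphorylated at rate k_p·u and dephosphorylated at rate k_d in both compartments, and each form shuttles between cytoplasm and nucleus with its own import and export rates. All rate constants and the reaction scheme itself are hypothetical illustrations, not the model from the talk.

```python
def steady_state(u, k_p=1.0, k_d=1.0, k_in=1.0, k_out=0.2,
                 k_in_p=0.2, k_out_p=1.0, dt=0.01, n_steps=20000):
    """Relax the four-state model to steady state by explicit Euler steps.

    Assumed scheme: unphosphorylated X is preferentially imported
    (k_in > k_out), phosphorylated PX is preferentially exported
    (k_out_p > k_in_p). Total X is conserved and normalized to 1.
    """
    x_c, px_c, x_n, px_n = 1.0, 0.0, 0.0, 0.0
    for _ in range(n_steps):
        d_x_c = -(k_p * u + k_in) * x_c + k_d * px_c + k_out * x_n
        d_px_c = k_p * u * x_c - (k_d + k_in_p) * px_c + k_out_p * px_n
        d_x_n = k_in * x_c - (k_p * u + k_out) * x_n + k_d * px_n
        d_px_n = k_p * u * x_n + k_in_p * px_c - (k_d + k_out_p) * px_n
        x_c += dt * d_x_c
        px_c += dt * d_px_c
        x_n += dt * d_x_n
        px_n += dt * d_px_n
    return {"X_C": x_c, "PX_C": px_c, "X_N": x_n, "PX_N": px_n}

for u in (0.1, 1.0, 10.0):
    s = steady_state(u)
    nuclear = s["X_N"] + s["PX_N"]
    phosphorylated = s["PX_C"] + s["PX_N"]
    print(f"u={u}: nuclear X = {nuclear:.3f}, phosphorylated X = {phosphorylated:.3f}")
```

Under these assumptions a higher kinase input shifts X toward the phosphorylated form and out of the nucleus, so the system switches between an import-dominated and an export-dominated transport cycle.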

Scaling: Modularity & Numerical Methods

M. Lang et al. (2014) Biophys. J. 106: 321.

M. Lang & J. Stelling (2015) SIAM J. Sci. Comput., under review.

C. Schillings et al. (2015) PLOS Comp. Biol. e1004457.

Approach 1: Modularization for Design

M. Lang et al., Cutting the wires ... (2014) Biophys. J. 106: 321.

Modularization for experimental design (purpose-driven):

Given: Network with n hypotheses (existence of interactions).

Enumerate all modularizations that insulate hypotheses from each other → Combinations of n states to be measured.

Approach 2: Modularization for Optimization

Modularization for optimization (parameter estimation):

Idea: Replace module inputs by experimental measurements.

Use two-level optimization procedure: Multiple-shooting approach by partitioning networks (instead of time domain).
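The idea of replacing module inputs by measurements can be sketched on a hypothetical two-module cascade: module 1 produces x1(t), and module 2 follows dx2/dt = k2·x1(t) - d2·x2. Instead of simulating module 1, the sketch drives module 2 with interpolated measurements of x1 and estimates k2 alone. The cascade, the noise-free synthetic data, and the known d2 are assumptions; the two-level multiple-shooting procedure itself is not reproduced here.

```python
import math

def interpolate(times, values, t):
    """Piecewise-linear interpolation of measured values at time t."""
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        if t0 <= t <= t1:
            return values[i] + (values[i + 1] - values[i]) * (t - t0) / (t1 - t0)
    raise ValueError("t outside the measured time range")

def simulate_module2(input_fn, k2, d2, t_obs, dt=0.01):
    """Euler integration of dx2/dt = k2 * input(t) - d2 * x2 with x2(0) = 0."""
    x2, n, out = 0.0, 0, []
    for t_target in t_obs:
        while n * dt < t_target - 1e-9:
            x2 += dt * (k2 * input_fn(n * dt) - d2 * x2)
            n += 1
        out.append(x2)
    return out

T_OBS = [0.25 * i for i in range(21)]       # measurement times 0 .. 5
D2 = 0.5                                    # assumed known degradation rate
K2_TRUE = 2.0                               # value to be recovered

# Synthetic, noise-free 'measurements' (hypothetical): x1(t) = exp(-t).
x1_meas = [math.exp(-t) for t in T_OBS]
x2_meas = simulate_module2(lambda t: math.exp(-t), K2_TRUE, D2, T_OBS)

# Module 2 alone, driven by the interpolated x1 measurements with k2 = 1.
# Because x2 is linear in k2, the least-squares estimate is closed-form.
unit_response = simulate_module2(lambda t: interpolate(T_OBS, x1_meas, t),
                                 1.0, D2, T_OBS)
k2_est = (sum(y * z for y, z in zip(x2_meas, unit_response))
          / sum(z * z for z in unit_response))
print("estimated k2:", k2_est)
```

The estimate never requires the parameters of module 1, which is what decouples the estimation problems of the modules; the price is a small bias from interpolating sparse measurements.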

Approach 3: Sparse Polynomial Approximations

Characterization of system behavior in parameter space → Addressing the 'curse of dimensionality' problem:

Idea: Sparse, adaptive polynomial approximations of system behavior with guarantees on accuracy (mass-action kinetics).

Figure labels: First-order (sensitivities); Adaptive sparse grids

Small-scale example (10 parameters) and in general: Improved scaling of accuracy with the number of simulations.

EGFR signaling model (50 parameters): Numerical performance and questioning 'sloppy parameters' concepts.

Figure label: Conv. 0.75
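The contrast between a first-order (sensitivity-based) approximation and a polynomial approximation in parameter space can be shown on a deliberately reduced, one-parameter example. This sketch uses plain Chebyshev interpolation, not adaptive sparse grids, and carries none of the accuracy guarantees mentioned above. The decay model, parameter range, and polynomial degree are hypothetical.

```python
import math

def model_output(k, t_obs=1.0):
    """x(t_obs) for dx/dt = -k x, x(0) = 1 (first-order mass-action decay)."""
    return math.exp(-k * t_obs)

def chebyshev_nodes(lo, hi, n):
    """n Chebyshev nodes mapped from [-1, 1] to the interval [lo, hi]."""
    return [0.5 * (lo + hi)
            + 0.5 * (hi - lo) * math.cos((2 * i + 1) * math.pi / (2 * n))
            for i in range(n)]

def lagrange(nodes, values, k):
    """Evaluate the interpolating polynomial through (nodes, values) at k."""
    total = 0.0
    for i, (ki, vi) in enumerate(zip(nodes, values)):
        weight = 1.0
        for j, kj in enumerate(nodes):
            if j != i:
                weight *= (k - kj) / (ki - kj)
        total += vi * weight
    return total

LO, HI = 0.5, 2.0                              # assumed parameter range for k
nodes = chebyshev_nodes(LO, HI, 7)
values = [model_output(k) for k in nodes]      # 7 model evaluations in total

K0 = 0.5 * (LO + HI)

def first_order(k):
    """Linearization around K0 using the sensitivity dx/dk = -x at t_obs = 1."""
    return model_output(K0) - model_output(K0) * (k - K0)

test_points = [LO + (HI - LO) * i / 200 for i in range(201)]
err_poly = max(abs(lagrange(nodes, values, k) - model_output(k)) for k in test_points)
err_lin = max(abs(first_order(k) - model_output(k)) for k in test_points)
print("max error, degree-6 polynomial:", err_poly)
print("max error, first-order approximation:", err_lin)
```

In one dimension a full grid of nodes is cheap; the number of nodes in a full tensor grid grows exponentially with the number of parameters, which is the scaling problem that sparse grids are designed to reduce.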

Conclusions & Perspectives

For network inference, co-design of experiments with

(purpose-driven) mathematical models is critical.

Formal models of different granularity can be employed

for data / hypothesis integration and interpretation.

Many conceptual (generalization of network motifs?) and

computational (scaling to large dynamic models?)

challenges need to be addressed.

Acknowledgments

Sotiris Dimopoulos, Mikael

Sunnaker, Elias Zamora-Sillero,

Moritz Lang

Ana-Paula Oliveira, Reinhard

Dechant, Alberto Busetto,

Andreas Wagner, Ruedi Aebersold,

John Lygeros, Joachim Buhmann,

Uwe Sauer, and others ...

Sean Summers, Claudia Schillings,

Christoph Schwab