Synthesizing and simplifying biological networks from pathway level information Bhaskar DasGupta Department of Computer Science University of Illinois at Chicago Chicago, IL 60607-7053 [email protected] Joint works with Reka Albert, Piotr Berman, German Enciso, Sema Kachalo, Paola Vera-Licona, Eduardo Sontag, Kelly Westbrooks, Alexander Zelikovsky and Ranran Zhang Supported by NSF grants DBI-0543365 and IIS-0346973
Transcript
Page 1: Synthesizing and simplifying biological networks from ...€¦ · medium to large size cells with eccentric nuclei and abundant cytoplasm – comprise 10%~15% of the total peripheral

Synthesizing and simplifying biological networks from pathway level information

Bhaskar DasGupta

Department of Computer Science

University of Illinois at Chicago

Chicago, IL 60607-7053

[email protected]

Joint works with Reka Albert, Piotr Berman, German Enciso, Sema Kachalo, Paola Vera-Licona, Eduardo Sontag, Kelly Westbrooks, Alexander Zelikovsky and Ranran Zhang

Supported by NSF grants DBI-0543365 and IIS-0346973

Page 2

Cellular Networks

• A single cell by itself is too complex for its functions to be understood completely.

• Various technologies have facilitated the monitoring of expression of genes and activities of proteins.

• Difficult to find the causal relations and overall structure of the network.

http://www.nyas.org/ebriefreps/ebrief/000534/images/mendes2.gif

Page 3

Reverse engineering issues

Given

– partial knowledge about the process/network

– access to suitable biological experiments

how to gain more knowledge about the process/network?

– effective use of resources (time, cost)

Page 5

Reverse engineering

• Process of backward reasoning, requiring careful observation of inputs and outputs, to elucidate the structure of the system.

http://www.computerworld.com/computerworld/records/images/story/46Reverse-engineering.gif

Page 6

Ingredients for reverse engineering of biological networks

• Appropriate mathematical models
– Differential equation model

• Computational techniques (algorithms)
– Set multicover

• Biological experiments
– Perturbation experiments

Page 7

Iterative process in systems biology

Page 8

Difficulty with traditional perturbation experiments

• Perturbation given to any gene or part of network may quickly spread to whole network

• Measurement of only global changes is possible

http://www.cumc.columbia.edu/news/journal/journal-o/winter-2006/img/MAGNet-diagram.jpg

Page 9

Differential Equation Model of Biological Systems

state variables evolve by (unknown) partial differential equations

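\[
\begin{aligned}
\frac{\partial x_1}{\partial t} &= f_1(x_1, x_2, \ldots, x_n, p_1, p_2, \ldots, p_m) \\
&\;\;\vdots \\
\frac{\partial x_n}{\partial t} &= f_n(x_1, x_2, \ldots, x_n, p_1, p_2, \ldots, p_m)
\end{aligned}
\qquad \equiv \qquad
\frac{\partial x}{\partial t} = f(x, p)
\]

x = (x1(t),...,xn(t)): state variables over time t, measurable (e.g., activity levels of proteins)

p = (p1,...,pm): parameters that can be manipulated

f(x*,p*) = 0

p*: “wild-type” (i.e., normal) condition of p

x*: corresponding steady-state condition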

Page 10

Settings for modular response analysis method

– do not know f

– but, prior information of the following type is available
• parameter pj does or does not affect variable xi (i.e., ∂fi/∂pj ≡ 0 or not)

Kholodenko, Kiyatkin, Bruggeman, Sontag, Westerhoff and Hoek, PNAS, 2002

Page 11

Experimental protocols (perturbation experiments)

• perturb one parameter, say pk

• for perturbed p, measure steady state vector x = ξ(p)
– let the system relax to steady state
– measure xi (western blots, microarrays, etc.)

• estimate n “sensitivities”:

\[
b_{ij} = \frac{\partial \xi_i}{\partial p_j}(p^*) \approx \frac{1}{p_j - p_j^*}\left(\xi_i(p^* + p_j e_j) - \xi_i(p^*)\right), \quad \text{for } i = 1, 2, \ldots, n
\]

where ej is the jth canonical basis vector
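
The sensitivity estimate above can be sketched in a few lines of Python. This is only an illustrative finite-difference sketch, not the authors' code: the function `estimate_sensitivities`, the toy steady-state map `toy_xi`, and the explicit perturbation size `delta` (standing in for the change applied to pj) are all assumptions introduced here.

```python
from typing import Callable, List, Sequence


def estimate_sensitivities(
    xi: Callable[[Sequence[float]], Sequence[float]],
    p_star: Sequence[float],
    j: int,
    delta: float,
) -> List[float]:
    """Estimate the column (b_1j, ..., b_nj) from one perturbation of p_j."""
    x_star = xi(p_star)          # steady state at the wild-type parameters p*
    p_pert = list(p_star)
    p_pert[j] += delta           # perturb only the j-th parameter
    x_pert = xi(p_pert)          # steady state after the system has relaxed
    # One finite-difference quotient per measured variable x_i.
    return [(xp - xs) / delta for xp, xs in zip(x_pert, x_star)]


def toy_xi(p: Sequence[float]) -> List[float]:
    """Hypothetical steady-state map: x_1 = 2*p_1, x_2 = p_1 + 3*p_2."""
    return [2.0 * p[0], p[0] + 3.0 * p[1]]


# Perturb the second parameter (index 1) of the toy system.
column = estimate_sensitivities(toy_xi, [1.0, 1.0], j=1, delta=0.5)
```

In a real experiment `xi` would be replaced by a measurement of the relaxed system, so each call corresponds to one perturbation experiment and yields n sensitivities at once.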

Page 12

Modeling Goal

1. Topology of connections only

2. Direction of the relationship

3. Information about stimulatory or inhibitory effects

4. Strength of relationship

[Figure: an example network on vertices A, B, C and D drawn at each of the four levels, with +/- signs on edges at level 3 and numeric edge weights at level 4]

Modeling goal can be at different levels

Stark et al., Trends in Biotechnology 21, pp. 290-293, 2003

Page 13

Our very modest goal

Obtain information about the sign of ∂fi/∂xj(x∗,p∗)

e.g., if ∂fi/∂xj> 0, then xj has a positive (catalytic) effect on the formation of xi

Page 14

In a nutshell

after some combinatorics and linear algebra

one can quantify the additional prior knowledge necessary to reach the goal

Kholodenko, Kiyatkin, Bruggeman, Sontag, Westerhoff and Hoek, PNAS, 2002

Berman, DasGupta and Sontag, Discrete Applied Math, 2007

Berman, DasGupta and Sontag, Annals of NYAS, 2007

Page 15

But, assuming (near)-sufficient prior information

• how to determine a minimum or near-minimum number of perturbation experiments that will work?

This now becomes an algorithmic/complexity issue...

Page 16

After some effort, one can see that

designing minimal sets of experiments

leads to

the set multi-cover problem
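
To make the set multi-cover problem concrete, here is a minimal Python sketch of the classical greedy heuristic. It is a stand-in for illustration only: the talk refers to randomized combinatorial algorithms, which this sketch does not reproduce. The function name, the experiment names and the toy instance are all hypothetical.

```python
from typing import Dict, List, Set


def greedy_set_multicover(sets: Dict[str, Set[int]], demand: Dict[int, int]) -> List[str]:
    """Pick sets (each usable once) until every element e is covered demand[e] times."""
    residual = dict(demand)      # how many more times each element must be covered
    unused = dict(sets)          # sets that have not been selected yet
    chosen: List[str] = []

    def gain(name: str) -> int:
        # Number of still-unsatisfied elements this set would help with.
        return sum(1 for e in unused[name] if residual.get(e, 0) > 0)

    while any(r > 0 for r in residual.values()):
        # Ties are broken by name so that the result is deterministic.
        best = max(sorted(unused), key=gain) if unused else None
        if best is None or gain(best) == 0:
            raise ValueError("the coverage demands cannot be met")
        chosen.append(best)
        for e in unused.pop(best):
            if residual.get(e, 0) > 0:
                residual[e] -= 1
    return chosen


# Hypothetical instance: each experiment "covers" some unknowns, each needed twice.
experiments = {
    "perturb_p1": {1, 2},
    "perturb_p2": {2, 3},
    "perturb_p3": {1, 3},
    "perturb_p4": {1, 2, 3},
}
selected = greedy_set_multicover(experiments, {1: 2, 2: 2, 3: 2})
```

On this toy instance three experiments are selected, which is also the minimum possible, since any two of the sets cover at most five of the six required element-coverages. In general the greedy rule only gives an approximation.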

Page 17

Overall high-level picture

Modular Response Analysis for Differential Equations model → Linear Algebraic formulation → Combinatorial formulation → Combinatorial Algorithms (randomized) → Selection of appropriate perturbation experiments

Page 18

In our biological application context, it means....

we can provide a set of suggested experiments such that

# of experiments ≈ minimum possible

Page 19

Experimental validation of Modular Response Analysis (MRA) Method

Growth factor-induced MAPK network topology shapes Erk response determining PC-12 cell fate

by

Silvia D. M. Santos, Peter J. Verveer, Philippe I. H. Bastiaens

Nature Cell Biology 9, 324 - 330 (2007)

Page 20

Experimental validation (continued)

• The MAPK pathway involving the proteins Raf, Mek and Erk is activated through the receptor tyrosine kinases TrkA and epidermal growth factor receptor (EGFR) by two different stimuli, NGF (neuronal growth factor) or EGF (epidermal growth factor)

• MRA method was applied to determine the MAPK network architecture in the context of NGF and EGF stimulations

Page 21

Another ongoing work on reverse engineering (with Paola Vera-Licona (INRIA), Eduardo Sontag (Rutgers), Joe Dundas (UIC))

Comparison of reverse engineering methods to infer network topology from gene expression data

Page 22

[Diagram: two reverse-engineering pipelines that pass through the same combinatorial steps]

• steady state profiles of perturbations of the network; Boolean network
– hitting set (set cover) → introduce redundancy → set multicover

• expression data representing state transition measurement for wildtype and perturbation data; topology of interconnection network
– hitting set (set cover) → introduce redundancy → set multicover

http://sts.bioengr.uic.edu/causal/

Page 23

Synthesizing and Minimizing Signal Transduction Networks

Page 24

Overall Goal

Input:

• direct interaction: A → B, A ┤ B

• double-causal interaction: A → (B → C), A → (B ┤ C)

• additional information

Method (algorithms, software): FAST

Output: network of minimal complexity that is biologically relevant

Page 25

Nature of experimental evidence

• biochemical (e.g., enzymatic activity, protein-protein interaction)
– direct interaction

• pharmacological evidence
– double-causal interaction

• genetic evidence of differential responses to a stimulus
– can be direct, but most often double-causal

Page 26

We describe a method for synthesizing double-causal (path-level) information into a consistent network

Our method significantly expands the capability for incorporating indirect (pathway-level) information. Previous methods of synthesizing signal transduction networks only include direct biochemical interactions, and are therefore restricted by the incompleteness of the experimental knowledge on pairwise interactions.

Page 27

Direct interactions

A promotes B: A → B

A inhibits B: A ┤ B

Illustration of double-causal interaction: C promotes the process of A promoting B

[Figure: vertices A, B and C, with a pseudo-vertex introduced between A and B]

Page 28

“Critical” edge (known direct interaction, part of input)

Page 29

Main computational steps for network synthesis

• Pseudo-vertex collapse (PVC)
– not so hard

• Binary transitive reduction (BTR)
– hard
– need heuristics

Page 30

Pseudo-vertex collapse (PVC)

Intuitively, the PVC problem is useful for reducing the pseudo-vertex set to the minimal set that keeps the graph consistent with all indirect experimental observations.

[Figure: two pseudo-vertices u and v with in(u) = in(v) and out(u) = out(v) are merged into a single new pseudo-vertex uv]
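
The collapse rule pictured on this slide (merge pseudo-vertices u and v when in(u) = in(v) and out(u) = out(v)) can be sketched as below. This is a hedged, single-pass illustration: the slide does not spell out how in(·) and out(·) are defined, so the sketch assumes they are the sets of direct signed neighbours, and the `Edge` encoding with sign +1 (promotes) or -1 (inhibits) is likewise an assumption made here.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str, int]  # (source, target, sign); sign +1 promotes, -1 inhibits


def collapse_pseudo_vertices(edges: Set[Edge], pseudo: Set[str]) -> Tuple[Set[Edge], Set[str]]:
    """Merge pseudo-vertices that share identical in- and out-neighbourhoods (one pass)."""
    ins: Dict[str, Set[Tuple[str, int]]] = defaultdict(set)
    outs: Dict[str, Set[Tuple[str, int]]] = defaultdict(set)
    for u, v, s in edges:
        outs[u].add((v, s))
        ins[v].add((u, s))

    # Group pseudo-vertices by their (in, out) signature.
    groups: Dict[tuple, List[str]] = defaultdict(list)
    for v in sorted(pseudo):
        groups[(frozenset(ins[v]), frozenset(outs[v]))].append(v)

    # The first member of each group represents the merged pseudo-vertex.
    rep = {v: members[0] for members in groups.values() for v in members}
    merged_edges = {(rep.get(u, u), rep.get(v, v), s) for u, v, s in edges}
    return merged_edges, set(rep.values())


# Hypothetical example: P1 and P2 have the same neighbourhoods, P3 does not.
example_edges = {
    ("A", "P1", 1), ("P1", "B", 1),
    ("A", "P2", 1), ("P2", "B", 1),
    ("C", "P3", -1), ("P3", "B", 1),
}
new_edges, new_pseudo = collapse_pseudo_vertices(example_edges, {"P1", "P2", "P3"})
```

Real vertices are never merged, only those listed in `pseudo`, which matches the slide's statement that the pseudo-vertex set is what gets reduced.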

Page 31

Illustration of Binary Transitive Reduction (BTR)

[Figure: two candidate edges. Remove? Yes, if there is an alternate path. Remove? No, if it is a critical edge.]

Intuitively, the BTR problem is useful for determining the sparsest graph consistent with a set of experimental observations.
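
The remove-or-keep rule illustrated on this slide can be sketched as a simple greedy pass. This is an illustrative heuristic under stated assumptions, not the NET-SYNTHESIS implementation: edges carry a parity label (0 for promotes, 1 for inhibits), an "alternate path" is taken to mean any route from the same source to the same target whose labels sum to the same parity, and critical edges are never removed. All names are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str, int]  # (source, target, parity); 0 = promotes, 1 = inhibits


def has_route_with_parity(edges: Iterable[Edge], src: str, dst: str, parity: int) -> bool:
    """Breadth-first search over (vertex, accumulated parity) states."""
    adj: Dict[str, List[Tuple[str, int]]] = {}
    for u, v, s in edges:
        adj.setdefault(u, []).append((v, s))
    seen = {(src, 0)}
    queue = deque([(src, 0)])
    while queue:
        node, par = queue.popleft()
        for nxt, s in adj.get(node, []):
            state = (nxt, (par + s) % 2)
            if state == (dst, parity):
                return True
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return False


def greedy_btr(edges: Set[Edge], critical: Set[Edge]) -> Set[Edge]:
    """Drop each non-critical edge that still has an alternate route of equal parity."""
    kept = set(edges)
    for edge in sorted(edges - critical):
        u, v, s = edge
        if has_route_with_parity(kept - {edge}, u, v, s):
            kept.remove(edge)
    return kept


# Hypothetical example: A -> C is implied by A -> B -> C and can be removed.
example = {("A", "B", 0), ("B", "C", 0), ("A", "C", 0), ("A", "D", 1), ("D", "C", 1)}
reduced = greedy_btr(example, critical={("A", "D", 1)})
```

The result depends on the order in which edges are examined, so a greedy pass like this yields a reduced graph but not necessarily the sparsest one, which is consistent with the earlier remark that BTR is hard and needs heuristics.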

Page 32

High level description of the network synthesis process

Synthesize direct interactions → Optimize (BTR) → Synthesize double-causal interactions → Optimize (PVC, BTR) → Interaction with biologists

Page 33

Biological validation of the network synthesis approach

Page 34

Plant signal transduction network

consistent guard cell signal transduction network for ABA-induced stomatal closure

– manually curated

– described in S. Li, S. M. Assmann and R. Albert, Predicting Essential Components of Signal Transduction Networks: A Dynamic Model of Guard Cell Abscisic Acid Signaling, PLoS Biology, 4(10), October 2006

– list of experimentally observed causal relationships collected by Li et al. and published as Table S1. This table contains

• around 140 interactions and causal inferences, both of type “A promotes B” and “C promotes process (A promotes B)”

– We augment this list with critical edges drawn from biophysical/biochemical knowledge on enzymatic reactions and ion flows, and with simplifying hypotheses made by Li et al., both described in Text S1

Page 35

Arabidopsis thaliana is a small flowering plant that is widely used as a model organism in plant biology. Arabidopsis is a member of the mustard (Brassicaceae) family, which includes cultivated species such as cabbage and radish. Arabidopsis is not of major agronomic significance, but it offers important advantages for basic research in genetics and molecular biology

(source: http://www.arabidopsis.org/portals/education/aboutarabidopsis.jsp)

Page 36

Regulatory interactions between ABA signal transduction pathway components

Page 37

Regulatory interactions between ABA signal transduction pathway components (continued)

NO → GC not critical and not enzymatic

ERA1 ┤(ABA → CalM)

Page 38

Some nodes in the network

GCR1: putative G protein coupled receptor

OST1: protein

NO: Nitric Oxide

ABH1: RNA cap-binding protein

RAC1: small GTPase protein

Page 39

(left) Guard cell signal transduction network for ABA-induced stomatal closure, manually curated by Li, Assmann and Albert [source: PLoS Biology, 4(10), 2006].

(right) Our automated network synthesis procedure produced a reduced network (fewer edges) while preserving all observed pathways [source: DasGupta’s group, Journal of Computational Biology and Bioinformatics]

Page 40

Page 41

Summary of comparison of the two networks

• The network of Li et al. has 54 vertices and 92 edges; our network has 57 vertices but only 84 edges

• Both networks have an identical strongly connected component of vertices

• All the paths present in the reconstruction of Li et al. are present in our network as well

• The two networks have 71 common edges

• It took a few seconds to synthesize our network
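
As a quick arithmetic check on the counts in this summary (assuming "common edges" means edges that appear in both networks), the differences work out as follows:

```latex
\begin{align*}
\text{edges only in the Li et al. network} &= 92 - 71 = 21 \\
\text{edges only in our network} &= 84 - 71 = 13 \\
\text{net reduction in edges} &= 92 - 84 = 8 \quad (8/92 \approx 8.7\%)
\end{align*}
```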

Page 42

Summary of comparison of the two networks (continued)

Thus the two networks are highly similar but diverge on a few edges.

None of these discrepancies is due to algorithmic deficiencies; all of them stem from human decisions.

Page 43

Software is available at:

http://www.cs.uic.edu/~dasgupta/network-synthesis/

• runs on any machine with MS Windows (Win32)
– click, save the executable and run

• for Linux/Unix fans, source files for a non-graphic version of the program, which can be compiled and run from the console, can be obtained by sending an email to the authors

Page 44

Data sources

Signal transduction pathway repositories such as

• TRANSPATH (http://www.gene-regulation.com/pub/databases.html#transpath)

and

• protein interaction databases such as the Search Tool for the Retrieval of Interacting Proteins (STRING, http://string.embl.de)

contain up to thousands of interactions, a large number of which are not supported by direct physical evidence.

NET-SYNTHESIS can be used to filter redundant information while keeping all direct interactions

Page 45

Other applications of the software

Synthesizing a Network for T Cell Survival and Death in LGL Leukemia

Background

• Large Granular Lymphocytes (LGL)
– medium to large size cells with eccentric nuclei and abundant cytoplasm
– comprise 10%~15% of the total peripheral blood mononuclear cells
– two major lineages
• CD3- natural-killer (NK) cell lineage: ~85% of LGL cells
• CD3+ lineage: ~15% of LGL cells

Page 46

LGL leukemia

disordered clonal expansion of LGL and their invasions in the marrow, spleen and liver

Page 47

Background (continued)

Ras:

– small GTPase that controls multiple essential signaling pathways

– its deregulation is frequently seen in human cancers

Activation of H-Ras requires its farnesylation, which can be blocked by farnesyltransferase inhibitors (FTIs)

This positions FTIs as candidate drugs for future anti-cancer therapies, and several FTIs have entered early phase clinical trials

This observation, together with the finding that Ras is constitutively activated in leukemic LGL cells, leads to the hypothesis that Ras plays an important role in LGL leukemia, and may function by influencing the Fas/FasL pathway.

Page 48

To further understand the molecular mechanism(s) of the onset of LGL leukemia, we constructed the signaling network that regulates cell survival and cell death, with special interest in the effect of Ras on the apoptosis response through the Fas/FasL pathway

Goal: initiate an understanding of the interactions between the Ras pathway and the Fas/FasL pathway, two of the major pathways that regulate the cell survival/death decision.

Currently, there is no standard therapy for LGL leukemia. Understanding the mechanism of this disease is crucial for drug/therapy development

Proteins that modulate the Ras-apoptosis response can potentially serve as a future reference for drug design and the search for therapeutic target molecules, and this may not be restricted to LGL leukemia

Page 49

Synthesizing a Network for T Cell Survival and Death in Large Granular Lymphocyte Leukemia

• Synthesized a cell-survival/cell-death regulation-related signaling network from the TRANSPATH 6.0 database, with additional information manually curated from literature search

• 359 vertices of this network represent proteins/protein families and mRNAs participating in pro-survival and Fas-induced apoptosis pathways

• 1295 edges represent regulatory relationships between nodes, including protein interactions, catalytic reactions, transcriptional regulation (no double-causal interactions were known)

• Performing BTR with NET-SYNTHESIS reduced the total edge-number to 873

Page 50

To focus on pathways that involve the 33 known T-LGL deregulated proteins, we designated vertices that correspond to proteins with no evidence of being changed during T-LGL as pseudo-vertices, and deleted the label “Y” for those edges both of whose endpoints were pseudo-vertices

Recursively performing “Reduction (faster)” BTR and “Collapse degree-2 pseudonodes” of NET-SYNTHESIS until no edge/node could be further removed simplified the network to 267 nodes and 751 edges.
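
For orientation, the size reductions reported for this network can be tallied as follows. This is only arithmetic on the figures quoted in the talk (359 vertices and 1295 edges initially, 873 edges after BTR, 267 nodes and 751 edges at the end), and it assumes the final network was derived from the 873-edge network:

```latex
\begin{align*}
\text{edges removed by the first BTR} &= 1295 - 873 = 422 \quad (\approx 32.6\%) \\
\text{further edges removed} &= 873 - 751 = 122 \\
\text{total edges removed} &= 1295 - 751 = 544 \quad (\approx 42.0\%) \\
\text{nodes removed} &= 359 - 267 = 92 \quad (\approx 25.6\%)
\end{align*}
```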

Page 51

For further results, see

R. Zhang, M. V. Shah, J. Yang, S. B. Nyland, X. Liu, J. K. Yun, R. Albert, and T. P. Loughran, Network Model of Survival Signaling in LGL Leukemia, PNAS, 2008

Page 52

Thank you for your attention!

Questions?
