Speaker: Chun-Yuan Lin Assistant Professor, CSIE, Chang Gung University *Data from Chuan Yi Tang,...

Post on 31-Dec-2015


Speaker: Chun-Yuan Lin, Assistant Professor, CSIE, Chang Gung University

*Data from Chuan Yi Tang, Professor, CS, Tsing Hua University

Introduction to Computational Systems Biology

112/04/19

Outline

Reverse Engineering

Constructing Relationships of Networks from Expression and Perturbation Data

Modeling and Simulation

Research: Virus-Host interaction

Reverse Engineering


Reverse Engineering (Computer Aided Engineering)

VLSI CAD

Communication Protocol

Bio-X ?


Protocol Implementation

[Flowchart. Node labels: Standard; Read & Formal Specify; Formal Spec. (SDL); add assertions; Validation Model; Validate; pass? (Yes/No); Error Reports; Correct Spec. & Documentation; coding & debugging; code generation; Program code; test generation; test suite; test env.; Conformance Testing; pass? (Yes/No); Diagnostics; Error Reports; Interoperability Testing; Product]

Key Issues

Formal Specification

Relationship Exploring

Bolouri, H., and Davidson, E. H. (2002). Modeling DNA sequence-based cis-regulatory gene networks. Dev. Biol. 246, 2–13.

Brown, C. T., Rust, A. G., Clarke, P. J. C., Pan, Z., Schilstra, M. J., De Buysscher, T., Griffin, G., Wold, B. J., Cameron, R. A., Davidson, E. H., and Bolouri, H. (2002). New computational approaches for analysis of cis-regulatory networks. Dev. Biol. 246, 86–102.

Davidson, E. H., Rast, J. P., Oliveri, P., Ransick, A., Calestani, C., Yuh, C.-H., Minokawa, T., Amore, G., Hinman, V., Arenas-Mena, C., Otim, O., Brown, C. T., Livi, C. B., Lee, P. Y., Revilla, R., Schilstra, M. J., Clarke, P. J. C., Rust, A. G., Pan, Z., Arnone, M. I., Rowen, L., Cameron, R. A., McClay, D. R., Hood, L., and Bolouri, H. (2002). A provisional regulatory gene network for specification of endomesoderm in the sea urchin embryo. Dev. Biol. 246, 162–190.

Yuh, C.-H., Brown, C. T., Livi, C. B., Rowen, L., Clarke, P. J. C., and Davidson, E. H. (2002). Patchy interspecific sequence similarities efficiently identify positive cis-regulatory elements in the sea urchin. Dev. Biol. 246, 148–161.

Construction of cis-regulatory gene networks (updated continuously at http://www.its.caltech.edu/~mirsky/endomes.htm)

Purpose:

To understand how genes are expressed under the control of cis-regulatory elements during the development of sea urchin embryos

Purple sea urchin (Strongylocentrotus purpuratus) and white sea urchin (Lytechinus variegatus)

http://www.divebums.com/FishID/Pages/sea_urchin_purple.html  

http://digimorph.org/specimens/Strongylocentrotus_purpuratus/


cis-Regulation

trans-Regulation


A network exists among the cis-regulatory elements, governed by logical rules

Constructing Relationships of Networks from Expression and Perturbation Data

Developing a computational platform that can infer networks automatically from experimental data

What is a network?

Networks are common in organisms, e.g. gene regulatory networks, pathways, neural networks, etc.

Abstractly, a network can be defined as a set of nodes together with a set of edges connecting them.

Approach

Mine the relationships between nodes from the expression curves and the perturbation matrix.

Infer networks from the relationships found.

Expression curve

[Figure: expression curves of node A and node B → Alignment (Bioinformatics 2003 19: 905-912) → Scoring → Score → Relationship of A & B]
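
The align-then-score idea can be illustrated with a very simple stand-in: slide one curve against the other and keep the lag that gives the strongest Pearson correlation. This is only a sketch and not the method of the cited Bioinformatics paper; the function names, the lag range and the two sample curves are assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat curve carries no shape information
    return cov / sqrt(vx * vy)

def best_lagged_score(a, b, max_lag=2):
    """Align b against a at lags 0..max_lag (b delayed relative to a)
    and return (lag, score) with the largest absolute correlation."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = a[: len(a) - lag]  # earlier part of curve A
        y = b[lag:]            # later part of curve B
        if len(x) < 3:
            break
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best

# Hypothetical curves: B repeats A one time step later.
curve_a = [0.0, 1.0, 3.0, 2.0, 1.0, 0.5, 0.0]
curve_b = [0.0, 0.0, 1.0, 3.0, 2.0, 1.0, 0.5]
lag, score = best_lagged_score(curve_a, curve_b)
print(lag, round(score, 3))
```

Under this toy scoring, a strongly positive score at a positive lag would hint that A acts before B, and a strongly negative one would hint at repression; either reading is a hypothesis to be tested, not a conclusion.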

Perturbation matrix (rows: perturbed node, WT = wild type; columns: observed node)

      A  B  C  D
WT    1  1  1  1
A     0  0  0  0
B     1  0  1  0
C     1  1  0  1
D     1  1  1  0

[Figure: network diagram over nodes A, B, C and D]
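
As a rough sketch of how relationships might be mined from such a matrix: a node whose state differs from wild type when another node is perturbed is treated as downstream of it, and indirect effects are then pruned. The values below are read from the slide under the assumption that the columns are ordered A, B, C, D; the function names and the pruning rule (a transitive reduction, which presumes an acyclic network) are illustrative choices.

```python
NODES = ["A", "B", "C", "D"]

# Rows: perturbed node ("WT" = unperturbed); columns follow NODES.
# Values as read from the slide, assuming the column order A, B, C, D.
MATRIX = {
    "WT": [1, 1, 1, 1],
    "A":  [0, 0, 0, 0],
    "B":  [1, 0, 1, 0],
    "C":  [1, 1, 0, 1],
    "D":  [1, 1, 1, 0],
}

def affected_by(matrix, nodes):
    """For each perturbed node, the other nodes whose state differs from WT."""
    wt = matrix["WT"]
    return {
        p: {n for n, v, w in zip(nodes, matrix[p], wt) if v != w and n != p}
        for p in nodes
    }

def direct_edges(downstream):
    """Keep p -> n only if n is not already explained by an intermediate
    node m (p affects m and m affects n): a transitive reduction."""
    edges = set()
    for p, targets in downstream.items():
        for n in targets:
            if not any(n in downstream[m] for m in targets if m != n):
                edges.add((p, n))
    return edges

downstream = affected_by(MATRIX, NODES)
print(sorted(direct_edges(downstream)))
```

Note that this kind of data cannot tell whether A also acts on D directly in addition to acting through B; the sketch simply prefers the sparser explanation.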

Integrated Genomic and Proteomic Analyses of a Systematically Perturbed Metabolic Network

SCIENCE VOL 292:929-934, MAY 4, 2001

Trey Ideker, Vesteinn Thorsson, Jeffrey A. Ranish, Rowan Christmas, Jeremy Buhler, Jimmy K. Eng, Roger Bumgarner, David R. Goodlett, Ruedi Aebersold, Leroy Hood


Perturbation Matrix (mRNA Level)

[Figure: change in mRNA quantity of A, B and C under Galactose and Non-Galactose conditions, for WT and the perturbations A, B, C, AB, AC, BC and ABC]

Reverse Engineering Strategy

Hypothesis

Simulation Models

Candidate Set

Match

Actual microarray output

Believe it or not: is the match unique?

Re-hypothesize

Perform further distinguishing experiments

Possible Models

[Figure: candidate network models over nodes A, B and C, labeled a to g and p to v]

Core of the regulatory mechanism

A (GAL4)

B (GAL80)

C (GAL3)

D (Galactose)

[Workflow flowchart. Node labels: Raw Data; Mining information; Hypothesis; Modeling; Experimental Simulator; Simulation Result; Match; Verification by other biologists; New Biological study; New Biological Experiments; Experiment results; Revise hypothesis by biologists; Error Report; Significant Information; Confirm; Error Report; Biological Study. Decision branches are labeled Y / N.]

Modeling and Simulation

Overview of systems biology

Systems biology is biology

Systems biology is modeling
Model representation
Dynamic analysis
Systems Biology Workbench (SBW)

Systems biology is data integration

Some examples of biological networks

What is systems biology

Systems biology is the study of an organism, viewed as an integrated and interacting network of genes, proteins and biochemical reactions which give rise to life. Instead of analyzing individual components or aspects of the organism, such as sugar metabolism or a cell nucleus, systems biologists focus on all the components and the interactions among them, all as part of one system. (Institute for Systems Biology)

By discovering how function arises in dynamic interactions, systems biology addresses the missing links between molecules and physiology. Top-down systems biology identifies molecular interaction networks on the basis of correlated molecular behavior observed in genome-wide "omics" studies. Bottom-up systems biology examines the mechanisms through which functional properties arise in the interactions of known components. (Bruggeman and Westerhoff, 2007)

A systems biology view...

Components

Building Blocks

Functional Modules

System

Life's Complexity Pyramid (Oltvai-Barabasi, Science 10/25/02)

Systems biology is biology


Systems biology is modeling

Model-driven analysis: integrated application of experimental and computational tools

http://www.genomatica.com/scitech_modeldev.shtml

What is a model

A model is an abstract representation of objects or processes that explains features of these objects or processes

Computational model (in silico model)

In silico models are a compact framework to integrate massive and diverse data sets generated from genomics, transcriptomics, proteomics and metabolomics research.

These models become a platform to design crucial experiments based on testable hypotheses and to address questions that are otherwise too difficult to address experimentally (simulation).

From http://www.genomatica.com/scitech_model.shtml

Representation of computational modeling in bio-systems

Graph

Bayesian Networks

Boolean Networks

Rule Based Systems

Petri Net


Graph

A graph G = (V, E) consists of two sets, V and E.
V: finite, non-empty set of vertices
E: set of pairs of vertices, called edges, e.g. (1,2) or <1,2>
Note: V(G) is the set of vertices of graph G; E(G) is the set of edges of graph G.

Undirected graph: the pair of vertices representing any edge is unordered. Thus, the pairs (V1,V2) and (V2,V1) represent the same edge.

Directed graph: each edge is represented by a directed pair <V1,V2>. Note: <V1,V2> and <V2,V1> represent two different edges.

Example: V(G) = {1, 2, 3}, E(G) = {<1,2>, <2,3>, <3,1>}
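
The difference between unordered and ordered pairs is easy to see in code. The following is a minimal sketch using adjacency sets; the helper name `adjacency` is an assumption, and the vertex and edge sets are the three-vertex example from the slide.

```python
def adjacency(vertices, edges, directed):
    """Build an adjacency-set dictionary from a list of vertex pairs."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        if not directed:
            adj[v].add(u)  # an unordered pair is stored in both directions
    return adj

V = [1, 2, 3]
E = [(1, 2), (2, 3), (3, 1)]

directed = adjacency(V, E, directed=True)
undirected = adjacency(V, E, directed=False)
print(directed)
print(undirected)
```

In the directed version vertex 2 does not list vertex 1 as a neighbor, while in the undirected version it does.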

Directed Graph

The most straightforward way to model a genetic regulatory network is to view it as a directed graph.

[Figure: directed graph on vertices v1, v2 and v3, with edges labeled (2,1) and (3,2)]

Terminology

End vertices (or endpoints) of an edge: U and V are the endpoints of a
Edges incident on a vertex: a, d, and b are incident on V
Adjacent vertices: U and V are adjacent
Degree of a vertex: X has degree 5
Parallel edges: h and i are parallel edges
Self-loop: j is a self-loop

[Figure: example graph with vertices U, V, W, X, Y, Z and edges a to j]

Terminology (cont.)

Path
sequence of alternating vertices and edges
begins with a vertex
ends with a vertex
each edge is preceded and followed by its endpoints

Simple path
path such that all its vertices and edges are distinct

Examples
P1=(V,b,X,h,Z) is a simple path
P2=(U,c,W,e,X,g,Y,f,W,d,V) is a path that is not simple

[Figure: the example graph with paths P1 and P2 highlighted]
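
The terminology can be checked mechanically. In the sketch below the endpoints of edges a to i are inferred from the terminology list and the path examples; the vertex carrying the self-loop j (Z here) is an assumption, since the figure itself is not available. The function names are illustrative.

```python
# Edge label -> (endpoint, endpoint). Endpoints of a..i are inferred from
# the slide text; placing the self-loop j on Z is an assumption.
EDGES = {
    "a": ("U", "V"), "b": ("V", "X"), "c": ("U", "W"), "d": ("V", "W"),
    "e": ("W", "X"), "f": ("W", "Y"), "g": ("X", "Y"),
    "h": ("X", "Z"), "i": ("X", "Z"), "j": ("Z", "Z"),
}

def degree(vertex):
    """Number of edge ends at the vertex (a self-loop counts twice)."""
    return sum((u == vertex) + (v == vertex) for u, v in EDGES.values())

def is_path(seq):
    """seq alternates vertex, edge, vertex, ...; every edge must join
    the vertices written before and after it."""
    if len(seq) % 2 == 0:
        return False
    for k in range(1, len(seq), 2):
        if seq[k] not in EDGES or set(EDGES[seq[k]]) != {seq[k - 1], seq[k + 1]}:
            return False
    return True

def is_simple_path(seq):
    """A path in which no vertex and no edge is repeated."""
    return is_path(seq) and len(set(seq)) == len(seq)

P1 = ["V", "b", "X", "h", "Z"]
P2 = ["U", "c", "W", "e", "X", "g", "Y", "f", "W", "d", "V"]
print(degree("X"), is_simple_path(P1), is_path(P2), is_simple_path(P2))
```

P2 is a valid path but visits W twice, which is why it fails the simplicity test.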

Terminology (cont.)

Cycle
circular sequence of alternating vertices and edges
each edge is preceded and followed by its endpoints

Simple cycle
cycle such that all its vertices and edges are distinct

Examples
C1=(V,b,X,g,Y,f,W,c,U,a,V) is a simple cycle
C2=(U,c,W,e,X,g,Y,f,W,d,V,a,U) is a cycle that is not simple

[Figure: the example graph with cycles C1 and C2 highlighted]

Bayesian Networks

A Bayesian network is modeled by a directed acyclic graph G = (V, E).
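
The acyclicity requirement matters because it lets the joint distribution factor into one conditional term per node given its parents; a feedback loop breaks that ordering. As a small sketch, Kahn's algorithm can test whether a proposed structure is a valid DAG. The gene names and edges below are hypothetical.

```python
from collections import deque

def topological_order(vertices, edges):
    """Kahn's algorithm: return a topological order of the vertices,
    or None if the directed graph contains a cycle."""
    indegree = {v: 0 for v in vertices}
    children = {v: [] for v in vertices}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    queue = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for c in children[v]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return order if len(order) == len(vertices) else None

# Hypothetical structure: gene1 influences gene2 and gene3,
# which both influence gene4.
V = ["gene1", "gene2", "gene3", "gene4"]
E = [("gene1", "gene2"), ("gene1", "gene3"),
     ("gene2", "gene4"), ("gene3", "gene4")]
print(topological_order(V, E))
print(topological_order(V, E + [("gene4", "gene1")]))  # feedback edge added
```

The second call adds a feedback edge, so no ordering exists and the structure could not serve as a (static) Bayesian network.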

Boolean Networks

A Boolean network focuses on revealing the overall, global properties of large networks, especially gene regulatory networks.

The state of a gene can be described by a Boolean variable expressing that it is active (on, 1) or inactive (off, 0).

(Boolean operation)


Features of Boolean Networks

The state of a system at time t+1 is determined by Boolean rules based on its current state and input.

Systems follow deterministic state transitions and produce predictable behavior.

Small, local perturbations produce only small, local effects.
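
A minimal sketch of a synchronous Boolean network makes the deterministic state transitions concrete. The three genes and their rules are hypothetical, chosen only so that the trajectory ends in a small cycle; they are not taken from any real regulatory network.

```python
# Hypothetical rules: next state of each gene from the current state.
RULES = {
    "A": lambda s: s["A"],                 # A holds its value (acts as an input)
    "B": lambda s: s["A"] and not s["C"],  # B is activated by A, repressed by C
    "C": lambda s: s["B"],                 # C is activated by B
}

def step(state):
    """Synchronous update: every gene is updated from the same current state."""
    return {g: bool(rule(state)) for g, rule in RULES.items()}

def trajectory(state, max_steps=20):
    """Follow the deterministic transitions until a state repeats;
    return the states visited and the attractor (the repeating cycle)."""
    seen = []
    for _ in range(max_steps):
        if state in seen:
            return seen, seen[seen.index(state):]
        seen.append(state)
        state = step(state)
    return seen, []

start = {"A": True, "B": False, "C": False}
path, attractor = trajectory(start)
for s in path:
    print({g: int(v) for g, v in s.items()})
print("attractor length:", len(attractor))
```

Because each state has exactly one successor, every trajectory must eventually fall into a fixed point or a cycle; with A switched off, this toy network settles at the all-off state instead of cycling.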

Due to the combinatorial control of transcription and the existence of enhancers and silencers, gene expression is complex, timely and precise. Thus, instead of being simply ON or OFF, genes are often differentially expressed during development, forming protein gradients in tissues that guide cell differentiation.

Rule-based Systems

Rule-based systems have been well studied and widely applied in computer science.

A rule-based system consists of two components, a set of facts and a set of rules, that are stored in a knowledge base.

For example, the class DNA could be defined as consisting of objects with properties that include topology and strandedness.
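
The facts-plus-rules structure can be sketched with simple forward chaining: rules fire whenever all their premises are known, adding new facts until nothing changes. The facts and rules below are a deliberately simplified, hypothetical knowledge base, and the function name is an assumption.

```python
def forward_chain(facts, rules):
    """Apply rules until no new fact can be derived.
    Each rule is (set_of_premises, conclusion)."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical, highly simplified knowledge base.
facts = {"galactose present", "GAL4 bound to promoter"}
rules = [
    ({"galactose present"}, "GAL80 inhibited"),
    ({"GAL80 inhibited", "GAL4 bound to promoter"}, "GAL genes transcribed"),
]
derived = forward_chain(facts, rules)
print(sorted(derived - facts))
```

The second rule can only fire after the first has added its conclusion, which is why the loop repeats until a full pass derives nothing new.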


Petri Nets

A Petri net is represented by a directed, bipartite graph in which nodes are either places or transitions, where places represent conditions and transitions represent activities.
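
A small sketch of the place/transition idea: places hold tokens, and a transition may fire only when each of its input places holds enough tokens. The enzyme reaction, the place names and the arc weights are hypothetical.

```python
# Places hold tokens; a transition is enabled when every input place
# holds at least as many tokens as the arc weight.
marking = {"substrate": 2, "enzyme": 1, "product": 0}

# Hypothetical transition: enzyme + substrate -> enzyme + product
TRANSITIONS = {
    "catalyze": {
        "inputs": {"substrate": 1, "enzyme": 1},
        "outputs": {"product": 1, "enzyme": 1},
    }
}

def enabled(name, m):
    """True if every input place holds enough tokens."""
    return all(m[p] >= w for p, w in TRANSITIONS[name]["inputs"].items())

def fire(name, m):
    """Return the new marking after firing; the old marking is not modified."""
    if not enabled(name, m):
        raise ValueError(f"transition {name!r} is not enabled")
    new = dict(m)
    for p, w in TRANSITIONS[name]["inputs"].items():
        new[p] -= w
    for p, w in TRANSITIONS[name]["outputs"].items():
        new[p] += w
    return new

m = marking
while enabled("catalyze", m):
    m = fire("catalyze", m)
print(m)
```

The enzyme token is consumed and returned on every firing, so the transition stops only when the substrate place is empty.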

Extended Petri Nets

Timed Petri Nets
Stochastic Petri Nets
Hierarchical Petri Nets
Colored Petri Nets
Hybrid Functional Petri Nets

Can it represent bio-systems?

For sophisticated dynamic systems in which the control mechanisms of genes and chemical reactions with enzymes operate concurrently, it is more reasonable to use real numbers to represent the amounts of some objects, e.g. the concentrations of a protein, mRNA, a protein complex, metabolites, etc.

Dynamic analysis of the network

The most common method to model dynamic behavior is a set of ODEs (ordinary differential equations):

$\frac{dX}{dt} = f(X, \ldots)$

Example: A prey-predator system


Example: The Lotka-Volterra model of a prey-predator system
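
In its standard form the Lotka-Volterra model reads dx/dt = αx − βxy for the prey x and dy/dt = δxy − γy for the predator y. The sketch below integrates it with a classical fourth-order Runge-Kutta step written out by hand; the parameter values, initial populations and step size are arbitrary assumptions chosen only to produce visible oscillations.

```python
from math import log

def lotka_volterra(state, alpha=1.0, beta=0.1, delta=0.075, gamma=1.5):
    """Right-hand side f(X) for X = (prey, predator)."""
    x, y = state
    return (alpha * x - beta * x * y, delta * x * y - gamma * y)

def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for dX/dt = f(X)."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = f(state)
    k2 = f(shifted(k1, dt / 2))
    k3 = f(shifted(k2, dt / 2))
    k4 = f(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(x0, y0, dt=0.01, steps=2000):
    """Integrate from (x0, y0) and return the list of visited states."""
    states = [(x0, y0)]
    for _ in range(steps):
        states.append(rk4_step(lotka_volterra, states[-1], dt))
    return states

def invariant(state, alpha=1.0, beta=0.1, delta=0.075, gamma=1.5):
    """Quantity conserved by the exact solution; used as a sanity check."""
    x, y = state
    return delta * x - gamma * log(x) + beta * y - alpha * log(y)

traj = simulate(10.0, 5.0)
print(max(x for x, _ in traj), max(y for _, y in traj))
```

The exact solution moves on closed orbits along which the quantity δx − γ ln x + βy − α ln y stays constant, so watching that value drift is a cheap way to judge whether the step size is small enough.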


Computer tools I wish for

Building networks
Cross-referencing among online databases
Visualizing networks
Simulating networks
Protein-protein interaction analysis
Protein function prediction

The Need for Interoperability in Systems Biology Software

No single package answers all needs of modelers
Different packages have different niche strengths reflecting the expertise & preferences of the developing group
Strengths are often complementary to those of other packages

No single tool is likely to do so in the near future
Range of capabilities needed is large
New techniques (and hence new tools) are evolving too rapidly

Problems with using multiple tools:
Simulations & results often cannot be shared or re-used
Duplication of software development effort

Systems Biology Workbench (SBW)

http://www.sys-bio.org/sbwWiki/sysbio/sbw

The Systems Biology Workbench (SBW) is an extendable, open source software framework, connecting software applications written in a variety of programming languages.

Software components provided with SBW assist in analyzing, creating, optimizing, simulating and visualizing computational models.

SBW

Modules: the applications that a user would use, comprising a wide collection of model editing, model simulation and model analysis tools.

Framework: the software framework that allows developers to cross programming language boundaries and connect application modules to form new applications.

[Figure: SBW connecting a Visual Editor, a Stochastic Simulator, an ODE-based Simulator, a Script Interpreter and a Database Interface]

Simple framework for enabling application interaction

Systems Biology Approach: 4 M’s Paradigm

Summary


Systems Biology has three possible impacts

A system-level understanding of native biological systems (animals, plants, microorganisms) with not only system structures but also system dynamics. This includes metabolic analysis, sensitivity analysis and bifurcation analysis (Kernevez et al., 1983).

A system-level understanding of pathology and malfunction in order to control the state from the cell to the whole body and to provide potential therapeutic targets for treatment of diseases (Bailey, 1999; Friboulet et al., 2002).

The development of a system-level approach in Biotechnology to design biological systems having desired properties not existing in nature (Bailey, 1991).


Web and tools (1)

Cytoscape: An Open Source Platform for Complex Network Analysis and Visualization

http://www.cytoscape.org/