
BioSigNet: Reasoning and Hypothesizing about Signaling Networks

Nam Tran

Main points

● Biomedical databases: structured data and queries.
– http://cbio.mskcc.org/prl/

● Next step: knowledge bases and reasoning.
– Kinds of reasoning; incomplete knowledge.
– How can existing knowledge be revised and expanded?
– Hypothesis formation
– Experimental verification

Knowledge based reasoning

Various kinds of reasoning
– Prediction: side effects
– Planning: designing therapies
– Explanation: reasoning about unobserved aspects
– Consistency checking: correctness of ontologies

Additional facets/nuances
– Reasoning with incomplete knowledge
– Reasoning with defaults
– Ease of updating knowledge (elaboration tolerance)

Hypothesis formation

If:
– our observations cannot be explained by our existing knowledge, or
– the explanations given by our existing knowledge are invalidated by experiments,

Then: our knowledge needs to be augmented or revised.

How? Can we use a reasoning system to propose hypotheses that can be verified through experimentation?

Hypothesis space

[Figure: hypothesis space and knowledge base. Labels: High UV; UV leads_to cancer; p53; Cancer; No cancer; (K,I) |= O]

Motivation -- summary

Goal: To emulate the abstract reasoning done by biologists, medical researchers, and pharmacology researchers.

Types of reasoning: prediction, explanation and planning.

Current systems biology approaches: mostly prediction.

Incomplete knowledge constantly needs to be updated -> Hypothesis formation

Overview of our approach

Represent the signal network as a knowledge base that describes
– actions/events (biological interactions, processes),
– the effects of these actions/events,
– the triggering conditions of the actions/events.

Query the knowledge base: prediction, explanation, planning.

Hypothesize to discover new knowledge.

BioSigNet-RRH (Biological Signal Network: Representation, Reasoning and Hypothesizing)

Foundation behind our approach

Research on representing and reasoning about dynamic systems (space shuttles, mobile robots, software agents):
– causal relations between properties of the world
– effects of actions (and when they can be executed)
– goal specification
– action plans

Research on knowledge representation, reasoning and declarative problem solving: the AnsProlog language.

Representing signal networks as a Knowledge Base

Alphabet:
– Actions/Events: bind(ligand,receptor)
– Fluents: high(ligand), high(receptor)

Statements:
– Effect axioms:
  bind(ligand,receptor) causes bound(ligand,receptor) if cond.
  high(other_ligand) inhibits bind(ligand,receptor) if cond.
– Trigger conditions:
  high(ligand), high(receptor) triggers bind(ligand,receptor)

Initial observations, Queries, Entailment

Entailment: (K,I) |= Q

Given
– K: the knowledge base of binding
– I: initially high(ligand), high(receptor)
Conclude Q = eventually bound(ligand,receptor)

Given
– K: the knowledge base of binding
– I’: initially high(ligand), high(receptor), high(other_ligand)
Conclude: Q no longer follows, because high(other_ligand) inhibits bind(ligand,receptor).
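The entailment check above can be sketched as a small forward simulation. This is a minimal illustration, not the BioSigNet-RRH implementation: the class `KB` and the functions `step` and `eventually` are hypothetical names, the "if cond." qualifiers are treated as always true, and only positive effects are modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KB:
    causes: tuple    # pairs (action, fluent made true by the action)
    triggers: tuple  # pairs (frozenset of fluents, action they trigger)
    inhibits: tuple  # pairs (fluent, action it blocks)

def step(kb, state):
    """Fire every triggered, non-inhibited action once and add its effects."""
    fired = {a for cond, a in kb.triggers
             if cond <= state
             and not any(f in state for f, b in kb.inhibits if b == a)}
    return state | {e for a, e in kb.causes if a in fired}

def eventually(kb, initial, goal, max_steps=10):
    """True if `goal` holds in some state reachable from `initial`."""
    state = frozenset(initial)
    for _ in range(max_steps):
        if goal in state:
            return True
        nxt = frozenset(step(kb, state))
        if nxt == state:      # fixpoint: nothing new can happen
            break
        state = nxt
    return goal in state

# The binding knowledge base from the slide (conditions simplified away).
K = KB(
    causes=(("bind(ligand,receptor)", "bound(ligand,receptor)"),),
    triggers=((frozenset({"high(ligand)", "high(receptor)"}),
               "bind(ligand,receptor)"),),
    inhibits=(("high(other_ligand)", "bind(ligand,receptor)"),),
)
Q = "bound(ligand,receptor)"
I = {"high(ligand)", "high(receptor)"}
I_prime = I | {"high(other_ligand)"}

q_from_I = eventually(K, I, Q)
q_from_I_prime = eventually(K, I_prime, Q)
```

Under these simplifying assumptions, the query is derived from I but not from I’, because the inhibitor blocks the only action that could produce the bound complex.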

Importance of a formal semantics

Besides defining prediction, explanation and planning, it is also useful in identifying:
– under what restrictions the answer given by a given algorithm will be correct (soundness);
– under what restrictions a given algorithm will find a correct answer if one exists (completeness).

● bind(TNF-α,TNF-R1) causes trimerized(TNF-R1)

● trimerized(TNF-R1) triggers bind(TNF-R1,TRADD)

Prediction

Given some initial conditions and observations, to predict how the world would evolve or predict the outcome of (hypothetical) interventions.

● Initial Condition

– bind(TNF-α,TNF-R1) occurs at 0

● Query

– predict eventually apoptosis

● Answer: Unknown!

– Incomplete knowledge about TRADD’s bindings.

– Depends on whether bind(TRADD,RIP) occurred.

● Initial Condition

– bind(TNF-α,TNF-R1) occurs at t0

● Observation

– TRADD’s binding with TRAF2, FADD, RIP

● Query

– predict eventually apoptosis

● Answer: Yes!
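The difference between the "Unknown" and "Yes" answers can be sketched by enumerating every way of completing the missing knowledge. This is a minimal illustration only: the function names are hypothetical, and the rule that apoptosis follows exactly when TRADD has bound TRAF2, FADD and RIP is a toy assumption made for this sketch, not a statement from the knowledge base.

```python
from itertools import product

TRADD_PARTNERS = ("TRAF2", "FADD", "RIP")

def apoptosis(bound):
    # Toy assumption (not from the source): apoptosis is reached
    # exactly when TRADD has bound all three partners.
    return all(p in bound for p in TRADD_PARTNERS)

def predict(observed):
    """Answer 'eventually apoptosis' with 'yes', 'no' or 'unknown'.

    `observed` is the set of partners TRADD is known to have bound.
    Every unobserved binding may or may not have occurred, so all
    completions are enumerated and the outcomes compared.
    """
    unknown = [p for p in TRADD_PARTNERS if p not in observed]
    outcomes = set()
    for choice in product([False, True], repeat=len(unknown)):
        world = set(observed) | {p for p, on in zip(unknown, choice) if on}
        outcomes.add(apoptosis(world))
    if outcomes == {True}:
        return "yes"
    if outcomes == {False}:
        return "no"
    return "unknown"

no_observation = predict(set())
all_observed = predict({"TRAF2", "FADD", "RIP"})
```

With no observations the possible worlds disagree, so the answer is "unknown"; once all three bindings are observed, every remaining world agrees and the answer becomes "yes".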

Explanation

Given an initial condition and observations, to explain why the final outcome does not match expectation.

Relation to diagnosis.

● Initial condition:

– bound(TNF-α,TNF-R1) at t0

● Observation:

– bound(TRADD, TRAF2) at t1

● Query: Explain apoptosis

● One explanation:

– Binding of TRADD with RIP

– Binding of TRADD with FADD
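Explanation can be sketched as a search for the smallest sets of unobserved events that, added to the observations, make the queried outcome hold. As a hedged illustration, the code below reuses a toy rule that is an assumption of this sketch rather than part of the source: apoptosis holds exactly when TRADD has bound TRAF2, FADD and RIP. The names `explain` and `apoptosis` are hypothetical.

```python
from itertools import combinations

TRADD_PARTNERS = ("TRAF2", "FADD", "RIP")

def apoptosis(bound):
    # Toy assumption (not from the source): all three bindings are needed.
    return set(TRADD_PARTNERS) <= set(bound)

def explain(observed):
    """Return the smallest sets of unobserved bindings that yield apoptosis."""
    candidates = [p for p in TRADD_PARTNERS if p not in observed]
    for size in range(len(candidates) + 1):
        found = [set(c) for c in combinations(candidates, size)
                 if apoptosis(set(observed) | set(c))]
        if found:             # stop at the first (smallest) size that works
            return found
    return []

# Observation from the slide: bound(TRADD,TRAF2) at t1.
explanations = explain({"TRAF2"})
```

Under the toy rule, the single minimal explanation is that TRADD also bound RIP and FADD, which matches the explanation given on the slide.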

Planning

Given initial conditions, to plan interventions to achieve a goal.

Application in drug and therapy design.

Planning requirements

In addition to knowledge about the pathway, we need information about possible interventions, such as:
– what proteins can be introduced;
– what mutations can be forced.

Planning example

Defining possible interventions:
– intervention intro(DN-TRAF2)
– intro(DN-TRAF2) causes present(DN-TRAF2)
– present(DN-TRAF2) inhibits bind(TRAF2,TRADD)
– present(DN-TRAF2) inhibits interact(TRAF2,NIK)

Initial condition:
– bound(NFκB,IκB) at 0
– bind(TNF-α,TNF-R1) at 0

Goal: keep NFκB inactive.

Query: plan always bound(NFκB,IκB) from 0
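A planning query of this kind can be sketched as a brute-force search over sets of interventions, keeping the first set under which the goal holds in every simulated state. The intervention and inhibition statements below are taken from the slide; the pathway itself (`TRIGGERS` and `EFFECTS`) is a hypothetical, heavily simplified chain invented for this sketch, and interventions are assumed to take effect at time 0.

```python
from itertools import combinations

# From the slide: intervention -> fluent it causes, and what that fluent inhibits.
INTERVENTIONS = {"intro(DN-TRAF2)": "present(DN-TRAF2)"}
INHIBITS = {("present(DN-TRAF2)", "bind(TRAF2,TRADD)"),
            ("present(DN-TRAF2)", "interact(TRAF2,NIK)")}

# Hypothetical pathway (NOT from the source): fluent -> action it triggers,
# and action -> (fluent made true, fluent made false or None).
TRIGGERS = {"bound(TNF-α,TNF-R1)": "bind(TRAF2,TRADD)",
            "bound(TRAF2,TRADD)": "interact(TRAF2,NIK)"}
EFFECTS = {"bind(TRAF2,TRADD)": ("bound(TRAF2,TRADD)", None),
           "interact(TRAF2,NIK)": ("active(NIK)", "bound(NFκB,IκB)")}

def run(initial, max_steps=10):
    """Simulate forward and return the list of visited states."""
    state = set(initial)
    trajectory = [frozenset(state)]
    for _ in range(max_steps):
        fired = [a for f, a in TRIGGERS.items()
                 if f in state
                 and not any(i in state for i, b in INHIBITS if b == a)]
        nxt = set(state)
        for action in fired:
            added, removed = EFFECTS[action]
            nxt.add(added)
            nxt.discard(removed)
        if nxt == state:
            break
        state = nxt
        trajectory.append(frozenset(state))
    return trajectory

def plan(initial, goal):
    """Smallest set of interventions that keeps `goal` true in every state."""
    names = list(INTERVENTIONS)
    for size in range(len(names) + 1):
        for chosen in combinations(names, size):
            start = set(initial) | {INTERVENTIONS[n] for n in chosen}
            if all(goal in s for s in run(start)):
                return list(chosen)
    return None

initial = {"bound(NFκB,IκB)", "bound(TNF-α,TNF-R1)"}
result = plan(initial, "bound(NFκB,IκB)")
```

In this toy pathway, doing nothing lets the chain reach interact(TRAF2,NIK) and release NFκB, so the search settles on introducing DN-TRAF2.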

Future work

Further development of the language, to better approximate cellular systems:
– Delay triggers
– Granularities of representation
– Continuous processes, hybrid systems
– Concurrency, durative actions

Scaled-up implementation:
– Kohn’s map
– Networks in Reactome and other repositories

Ontologies:
– Integration with BioPAX

Hypothesis space

[Figure, repeated from earlier: hypothesis space and knowledge base. Labels: High UV; UV leads_to cancer; p53; Cancer; No cancer; (K,I) |= O]

Issues in this tiny example

Hypothesis formation:
– Theory: UV leads to cancer.
– Observation: wild-type p53 resists the UV effect.
– Hypothesis: p53 is a tumor suppressor.

Elaboration tolerance:
– How do we update/revise “UV leads to cancer”?

Defaults and non-monotonic reasoning:
– Normally UV leads to cancer.
– UV does not lead to cancer if p53 is present.
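The default and its exception can be sketched as a rule that fires unless the exception is known to hold. This is a minimal illustration with hypothetical fact names; a declarative language such as AnsProlog would express the same idea with negation as failure.

```python
def uv_leads_to_cancer(facts):
    """Default rule with one exception, evaluated over a set of known facts."""
    # Exception: the default is blocked when p53 is known to be present.
    abnormal = "present(p53)" in facts
    return "high(UV)" in facts and not abnormal

# Adding a fact retracts an earlier conclusion: this is non-monotonicity.
before = uv_leads_to_cancer({"high(UV)"})
after = uv_leads_to_cancer({"high(UV)", "present(p53)"})
```

The knowledge about p53 is added without rewriting the original "UV leads to cancer" statement, which is the point of elaboration tolerance.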

Construction of hypothesis space

Present: manual construction, using research literature.

Future: integration of multiple data sources
– Protein interactions
– Pathway databases
– Biological ontologies
– ...

These provide cues and hunches such as:
– A may interact with B: action interact(A,B)
– The A-B interaction may have effect C: interact(A,B) causes C

Generation of hypotheses

Enumeration of hypotheses

Search: computing with Smodels (an implementation of AnsProlog)

Heuristics:
– A trigger statement is selected only if it is the only cause of some action occurrence that is needed to explain the novel observations.
– An inhibition statement is selected only if it is the only blocker of some triggered action at some time.
– Maximizing preferences of selected statements

Generation … (cont.): heuristics

Knowledge base K:
– a causes g
– b causes g

Initial condition I = { initially f }
Observation O = { eventually g }
(K,I) does not entail O

Hypothesis space: expand K with rules among
– f triggers a
– f triggers b

Hypotheses: { f triggers a }, or { f triggers b }
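The example above can be reproduced with a small enumeration: try subsets of the hypothesis space in order of size and keep those that make the observation derivable. This is a sketch only. It uses plain subset-minimality as a stand-in for the heuristics described earlier, it does not use Smodels, and the names `entails` and `hypotheses` are hypothetical.

```python
from itertools import combinations

CAUSES = {"a": "g", "b": "g"}        # K: "a causes g", "b causes g"
SPACE = [("f", "a"), ("f", "b")]     # candidates: "f triggers a", "f triggers b"

def entails(triggers, initial, goal, max_steps=5):
    """True if `goal` becomes true when K is extended with `triggers`."""
    state = set(initial)
    for _ in range(max_steps):
        fired = {action for fluent, action in triggers if fluent in state}
        new_state = state | {CAUSES[a] for a in fired if a in CAUSES}
        if new_state == state:
            break
        state = new_state
    return goal in state

def hypotheses(initial, goal):
    """Subset-minimal sets of candidate rules that explain the observation."""
    accepted = []
    for size in range(len(SPACE) + 1):
        for combo in combinations(SPACE, size):
            candidate = set(combo)
            if any(prev <= candidate for prev in accepted):
                continue      # a smaller accepted hypothesis already suffices
            if entails(candidate, initial, goal):
                accepted.append(candidate)
    return accepted

result = hypotheses({"f"}, "g")
```

The empty extension fails, each single trigger rule succeeds, and the pair of both rules is discarded as non-minimal, leaving the two alternative hypotheses listed on the slide.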

Case study: p53 network

Tumor suppression by p53

p53 has 3 main functional domains:
– N-terminal transactivator domain
– Central DNA-binding domain
– C-terminal domain that recognizes DNA damage

Appropriate binding of the N-terminal domain activates pathways that protect the cell from cancer.

Inappropriate binding (say, to Mdm2) inhibits p53-induced tumor suppression.

p53 knowledge base

Stress:
– high(UV) triggers upregulate(mRNA(p53))

Upregulation of p53:
– upregulate(mRNA(p53)) causes high(mRNA(p53))
– high(mRNA(p53)) triggers translate(p53)
– translate(p53) causes high(p53)

p53 knowledge base (cont.)

Tumor suppression by p53:
– high(p53) inhibits growth(tumor)

p53 knowledge base (cont.)

Interaction between Mdm2 and p53:
– high(p53), high(mdm2) triggers bind(p53,mdm2)
– bind(p53,mdm2) causes bound(dom(p53,N))
– bind(p53,mdm2) causes high([p53 : mdm2])
– bind(p53,mdm2) causes ¬high(p53), ¬high(mdm2)

Hypothesis formation

Experimental observation:
– I = { initially high(UV), high(mdm2), high(ARF) }
– O = { eventually ¬tumorous }

(K,I) does not entail O.

Need to hypothesize the role of ARF.

Constructing hypothesis space

Levels of ARF and p53 correlate:
– high(ARF) triggers upregulate(mRNA(p53))
– high(p53) triggers upregulate(mRNA(ARF))

Interactions of ARF with the known proteins:
– bind(p53,ARF) causes bound(dom(p53,N))

Constructing … (cont.)

Influence of X (=ARF) on other interactions:
– high(ARF) triggers upreg(mRNA(p53))
– high(ARF) triggers translate(p53)
– high(ARF) triggers bind(p53,mdm2)

Constructing …(cont’)

Hypothesis

– high(UV) triggers upregulate(mRNA(ARF))
– high(ARF), high(mdm2) triggers bind(ARF,mdm2)

Future work

Automatic construction of hypothesis space:
– Extraction of facts such as protein interactions …
– Integration of knowledge from different sources
– Consistency-based integration (HyBrow)
– Ontologies

Heuristics for hypothesis search:
– Ranking of hypotheses
– Make use of numerical data, such as microarray data?