Page 1

Prof. Michael Schroeder
Biotec/Dept. of Computing, TU Dresden
ms@mpi-cbg.de
comas.soi.city.ac.uk

Reasoning on the Web: Theory, Challenges, and Applications in Bioinformatics

Page 2

By Michael Schroeder, Biotec, 2003

Contents

Motivation
Beyond the web: Rules, Reasoning, Semantics, Ontologies
Semantics of Deduction Rules
  Argumentation Semantics
  Fuzzy Reasoning
Reaction rules
  Vivid Agents
  Prova
Applications in Bioinformatics

Page 3

The Web

A great success story, but… it’s the web for humans, not machines

Many areas, such as biology, have fully embraced the web
  Human genome project is only the tip of the iceberg
  More than 500 tools and databases online

Page 4

Example: Pubmed

>12.000.000 literature abstracts
Great resource if one knows what one is looking for
  “Kox1” has 17 hits
  But “diabetes” will produce >200.000
Often need to automatically process abstracts

Page 5

Results of PubMed

Lorenz P. Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation. Biol Chem. 2001 Apr;382(4):637-44.

Fredericks WJ. An engineered PAX3-KRAB transcriptional repressor inhibits the malignant phenotype of alveolar rhabdomyosarcoma cells harboring the endogenous PAX3-FKHR oncogene. Mol Cell Biol. 2000 Jul;20(14):5019-31.

...

[Callouts on the slide label the parts of each citation: Author, Title, Journal, Year]

However, to a machine things look different!

Page 6

Results of PubMed

....

Solution: tag data (XML)

Page 7

Results of PubMed

<author> </author><title> . </title><journal> </journal><year> </year>
<author> </author><title> . </title><journal> </journal><year> </year>

...
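To illustrate why tagging helps a machine, here is a minimal Python sketch that reads one tagged record with the standard library. The wrapper element `citation` and the function name `parse_citation` are assumptions made for this illustration; the field values are taken from the first PubMed hit shown earlier.

```python
import xml.etree.ElementTree as ET

# One tagged citation. The <citation> wrapper is a hypothetical element
# added here so that the record has a single root.
record = """<citation>
  <author>Lorenz P</author>
  <title>Transcriptional repression mediated by the KRAB domain of the human C2H2 zinc finger protein Kox1/ZNF10 does not require histone deacetylation.</title>
  <journal>Biol Chem</journal>
  <year>2001</year>
</citation>"""


def parse_citation(xml_text):
    """Map each child tag of the record to its text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = parse_citation(record)
```

With tags in place, a program can pick out the year or the journal without guessing where one field ends and the next begins. What the tags mean is still left open, which is the point taken up next.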

However, to a machine things look different!

Page 8

Results of PubMed

...

Solution: use ontologies (Semantic Web)

Page 9

GeneOntology

Biologists have recognised the problem of semantic inter-operability between disparate information sources

GeneOntology (GO) is an effort to provide a common vocabulary for molecular biology

GO has >10.000 terms in three branches: “function”, “process”, “localisation”

Page 10

GeneOntology

Has 13 levels
Width broadens to level 6 (3885 terms wide), then shrinks
Number of leaves per level broadens to level 6 (1223 leaves), then shrinks
Average term has 4 words
Maximal term has 29 words: “Oxidoreductase activity, acting on paired donors, with incorporation or reduction of molecular oxygen, 2-oxoglutarate as one donor, and incorporation of one atom each of oxygen into both donors”

[Figure: Breadth of GO, number of terms (0 to 4500) per level (1 to 14)]

Page 11

Motivation Summary

Web in the old days
  HTML (for humans)

Web these days
  HTML
  XML, Ontologies (for machines)

Web of the future
  HTML
  XML, Ontologies
  rules, reasoning, semantics
  access to computational resources (à la grid computing)

Page 12

Open Problems

Part I: Theory of rules and reasoning on the web:
  Knowledge representation: Which level of expressiveness?
  Semantics: How to guarantee inter-operability?
  Reasoning: Fuzzy reasoning and unification
  Reactivity: Vivid agents

Part II: Applications of rules and reasoning on the web:
  Integration and querying of information sources
    Integration: transmembrane prediction tools
    Integration: protein structure DB and structure classification
  Consistency checking
    Ontology: If A is B and B is C, then the ontology should not explicitly mention A is C, as it is already implicit
    Annotation: Do different tools agree or disagree?

Page 13

The wider Picture: www.RuleML.org

Goal: develop a Web language for rules using XML markup, formal semantics, and efficient implementations.

Rules: derivation rules, transformation rules, and reaction rules.

RuleML can thus specify queries and inferences in Web ontologies, mappings between Web ontologies, and dynamic Web behaviors of workflows, services, and agents.

Currently, some 30 international members and close collaboration with W3C

Page 14

The wider Picture: REWERSE

Reasoning on the Web with Rules and Semantics
FP6 Network of Excellence with nearly 30 partners

Working groups on Infrastructure and Applications
  Infrastructure: Composition, Typing, Policies, Querying, Reactivity and evolution
  Applications: Personalised Web sites, Calendar systems, Bioinformatics

Page 15

Part I: Theory

Motivation: Expressive Knowledge Representation
Part I.a: Argumentation as LP semantics
  Notions of attack and justified arguments
  Hierarchy of semantics
  Proof procedure
Part I.b: Fuzzy unification and argumentation
  Fuzzy negation
  Fuzzy argumentation
  Fuzzy unification
Part I.c: Vivid Agents

Page 16

Part I.a: A Hierarchy of Semantics

RuleML caters for different degrees of knowledge representation

A hierarchy of semantics is required to guarantee inter-operation.

Analogy: In HTML, <b>Michael</b> will be interpreted differently in Netscape (Michael) and the text-based browser Lynx (Michael).

Problem: How can we guarantee inter-operability between different interpretations of rules?

Page 17

Knowledge representation

Pete earns 500.000$ p.a.
  earns(pete,500000).

Cross the street if there are no cars
  $cross \leftarrow \mathit{not}\ car$
  $cross \leftarrow \neg car$

The fridge is quite cheap
  cheap(fridge):70%

Does Mike live in Londn?
  address(mike,london) = address(mike,londn): 95%

Page 18

Knowledge System Cube

[Figure: knowledge system cube with the corners rFB, rDB, dFB, dDB, fFB, fDB, fdFB, fdDB and the axes deductive, negation, fuzzy]

r: relational, f: fuzzy, d: deductive
DB: database, FB: factbase

Page 19

Part I.a: Argumentation as semantics for Extended Logic Programs

[Figure: knowledge system cube (corners rFB, rDB, dFB, dDB, fFB, fDB, fdFB, fdDB; axes deductive, negation, fuzzy)]

Page 20

Extended Logic Programming

Logic Programming with 2 negations
  Default negation: $\mathit{not}\ p$ : true if all attempts to prove $p$ fail.
  Explicit negation: $\neg p$ : falsehood of a literal may be stated explicitly.
  Coherence principle: $\neg p \rightarrow \mathit{not}\ p$

Page 21

Argumentation

Interaction between agents in order to
  gain knowledge
  revise existing knowledge
  convince the opponent
  solve conflicts

Elegant way to define semantics for (extended) logic programming
  Dung
  Kowalski, Toni, Sadri
  Prakken & Sartor
  Etc.

Page 22

Arguments

An argument is a partial proof, with implicitly negated literals as assumptions.

Argument = sequence of rules

Page 23

Attacking arguments

Two fundamental kinds of attack:
  A undercuts B = A invalidates premise of B
    P: Let’s go to the lake as it is not snowing anymore
    O: Hang on, it is snowing
  A rebuts B = A contradicts B
    P: Let’s go to the lake as it is not snowing
    O: Let’s not, as I’ve got to prepare my talk

Derived notions of attack used in the literature (u = undercuts, r = rebuts, a = attacks):
  A attacks B = A u B or A r B
  A defeats B = A u B or (A r B and not B u A)
  A strongly attacks B = A a B and not B u A
  A strongly undercuts B = A u B and not B u A

Page 24

Proposition: Hierarchy of attacks

Undercuts = $u$

Strongly undercuts = $su = u - u^{-1}$

Strongly attacks = $sa = (u \cup r) - u^{-1}$

Defeats = $d = u \cup (r - u^{-1})$

Attacks = $a = u \cup r$
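The derived notions can be sketched as plain set operations on the two basic relations. This is a minimal Python illustration, assuming that undercut and rebut are given as sets of (attacker, attacked) pairs; the arguments A, B, C and the helper names are hypothetical.

```python
def inverse(relation):
    """Inverse relation: swap attacker and attacked."""
    return {(b, a) for (a, b) in relation}


def derived_attacks(u, r):
    """Build the derived notions of attack from undercut (u) and rebut (r)."""
    u_inv = inverse(u)
    return {
        "u": set(u),               # undercuts
        "su": u - u_inv,           # strongly undercuts
        "sa": (u | r) - u_inv,     # strongly attacks
        "d": u | (r - u_inv),      # defeats
        "a": u | r,                # attacks
    }


# Hypothetical example: A and B undercut each other, C undercuts A,
# and C rebuts (and is rebutted by) both A and B.
u = {("A", "B"), ("B", "A"), ("C", "A")}
r = {("A", "C"), ("C", "A"), ("B", "C"), ("C", "B")}
notions = derived_attacks(u, r)
```

In this example only C strongly undercuts A, because the undercuts between A and B are mutual. The inclusions su ⊆ u ⊆ d ⊆ a and su ⊆ sa ⊆ d follow directly from the set definitions.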

Page 25

Fixpoint Semantics

Argumentation: game between proponent and opponent
  An argument A is acceptable if the opponent’s x-attack is countered by the proponent’s y-attack, which the proponent already accepted earlier.

Acceptable: Let x, y be notions of attack. An argument A is x,y-acceptable w.r.t. a set of arguments S iff for every argument B such that $(B,A) \in x$, there is a $C \in S$ such that $(C,B) \in y$.

Fixpoint semantics: $F_{x/y}(S) = \{ A \mid A \text{ is } x,y\text{-acceptable w.r.t. } S \}$
  x/y-justified arguments = least fixpoint of $F_{x/y}$
  x/y-overruled arguments = x-attacked by a justified argument
  x/y-defensible iff neither justified nor overruled
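For a finite set of arguments, the least fixpoint can be reached by iterating the operator from the empty set. The following Python sketch assumes that arguments are opaque labels and that the attack notions x and y are given as sets of (attacker, attacked) pairs; the function names and the example are hypothetical.

```python
def acceptable(arg, accepted, x, y, arguments):
    """arg is x/y-acceptable w.r.t. accepted: every x-attacker of arg
    is y-attacked by some already accepted argument."""
    return all(
        any((c, b) in y for c in accepted)
        for b in arguments
        if (b, arg) in x
    )


def justified(arguments, x, y):
    """Least fixpoint of F_{x/y}, computed by iteration from the empty set."""
    accepted = set()
    while True:
        nxt = {a for a in arguments if acceptable(a, accepted, x, y, arguments)}
        if nxt == accepted:
            return accepted
        accepted = nxt


def overruled(arguments, x, just):
    """Arguments that are x-attacked by a justified argument."""
    return {a for a in arguments if any((j, a) in x for j in just)}


# Hypothetical example: C attacks B, B attacks A, D and E attack each other.
arguments = {"A", "B", "C", "D", "E"}
attack = {("C", "B"), ("B", "A"), ("D", "E"), ("E", "D")}
just = justified(arguments, attack, attack)
over = overruled(arguments, attack, just)
defensible = arguments - just - over
```

Here C is justified because nothing attacks it, A is justified because C defends it against B, B is overruled, and the mutually attacking D and E remain defensible.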

Page 26

Theorem: Relationship of semantics

Weakening the opponent or strengthening the proponent increases the justified arguments
Different notions of acceptability give rise to different argumentation semantics

[Figure: hierarchy of x/y semantics, grouped into the following classes of equal semantics]
  a/su = a/u = a/a = a/d = a/sa (Dung’s grounded argumentation semantics)
  d/su = d/u = d/a = d/d = d/sa (Prakken and Sartor’s semantics w/o priorities)
  sa/su = sa/sa
  sa/u = sa/d = sa/a
  u/su = u/u
  u/a = u/d = u/sa (WFSX)
  su/su
  su/u
  su/a = su/d
  su/sa

If the opponent is allowed to attack, the type of defense does not matter
If the opponent is allowed to defeat, the type of defense does not matter
If the opponent is allowed to undercut, defense with rebut (a, d, sa) or without rebut (su, u) makes a difference

Page 27

Proof procedure

Dialogues: an x/y-dialogue is a sequence of moves such that
  Proponent and Opponent alternate
  Players cannot repeat arguments
  Opponent x-attacks Proponent’s last argument
  Proponent y-attacks Opponent’s last argument
A player wins a dialogue if the other player cannot move
Argument A is provably justified if the proponent wins all branches of the dialogue tree with root A

Concrete implementation SLXA:
  Since u/a = u/d = u/sa = WFSX, compute justified arguments with the top-down proof procedure SLXA for WFSX [Alferes, Damasio, Pereira]
  SLXA can be adapted for other notions

Page 28

Part I.b: Fuzzy unification and argumentation

[Figure: knowledge system cube (corners rFB, rDB, dFB, dDB, fFB, fDB, fdFB, fdDB; axes deductive, negation, fuzzy)]

Page 29

Classical Fuzzy Logic

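Solution: Truth values in [0,1] instead of {0,1}.

Assertions: $p : V$ ($p$ a formula, $V$ a truth value).

Conjunction: $p : V,\ q : W \;\Rightarrow\; p \wedge q : \min(V,W)$

Disjunction: $p : V,\ q : W \;\Rightarrow\; p \vee q : \max(V,W)$

Inference: $p \leftarrow q_1, \ldots, q_n;\ q_1 : V_1, \ldots, q_n : V_n \;\Rightarrow\; p : \min(V_1, \ldots, V_n)$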
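The inference rule can be sketched as a small forward-chaining loop in Python. The representation of rules as (head, body) pairs, the predicate names, and the choice to keep the maximum when several rules derive the same head are assumptions of this sketch, not something stated on the slide.

```python
def fuzzy_and(*values):
    """Conjunction: minimum of the truth values."""
    return min(values)


def fuzzy_or(*values):
    """Disjunction: maximum of the truth values."""
    return max(values)


def infer(rules, facts):
    """Propagate fuzzy truth values through rules until nothing changes.

    rules: list of (head, [body literals]) with non-empty bodies
    facts: dict mapping literals to truth values in [0, 1]
    """
    values = dict(facts)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if all(literal in values for literal in body):
                value = fuzzy_and(*(values[literal] for literal in body))
                # Assumption: alternative derivations combine like a disjunction.
                if value > values.get(head, 0.0):
                    values[head] = value
                    changed = True
    return values


# Hypothetical example built around the cheap fridge (70%).
facts = {"cheap(fridge)": 0.7, "in_stock(fridge)": 0.9}
rules = [("buy(fridge)", ["cheap(fridge)", "in_stock(fridge)"])]
result = infer(rules, facts)
```

The derived conclusion is only as strong as its weakest premise, so `buy(fridge)` gets the value 0.7.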

Page 30

Fuzzy Negation

Classical fuzzy negation: $L : V \;\Rightarrow\; \neg L : 1 - V$ (Zadeh)

Our setting (fuzzy adaptation of WFSX):
  $L : V$ and $\neg L : V'$ with $V' \neq 1 - V$ possible
  $L$ and $\neg L$ not directly related.

Page 31

Fuzzy Coherence Principle

If $\neg L : V$ and $V > 0$, and $\mathit{not}\ L : V'$, then $V' \geq V$.
“If there is some explicit evidence that L is false, then there is at least the same evidence that L is false by default.”

If $\neg L : V$ and $V > 0$, then $\mathit{not}\ L : 1$.

Page 32

Law of excluded...

...contradiction
  $p \wedge \neg p : V$, $V > 0$ possible. Contradictory programs!
  $\mathit{not}\ p \wedge p : V$, $V > 0$ possible. By coherence principle!
  Contradiction removal

...middle
  $\mathit{not}\ p \vee p : V$, $V > 0$
  $p \vee \neg p : V$, $V = 0$ possible. p is unknown

Page 33

Strength of an argument

Strength of an argument:
  Fact: value is given
  Rule: minimum of body literals
  Argument: Conclusion

Least fuzzy value of the facts contributing to the argument.

Page 34

Theorems

Theorem (Soundness and Completeness)

There is a justified argument of strength V for L iff there is a successful T-tree of truth value V for L

Theorem (Conservative Extension)

Argumentation semantics is a conservative extension of WFSX.

Page 35

Application: Fuzzy unification

Open systems:
  knowledge and ontologies may not match
  interaction with humans: “Does Mike live in Londn?”

Approach:
  address(mike,london) = address(mike,londn): 95%
  adapt unification algorithm (normalised edit distance over trees, net)
  embed into argumentation framework

Page 36

Finding Mismatches: Edit distance

Edit distance between strings A and B: minimal number of delete, add, replace operations to convert A into B
  efficient implementation with dynamic programming

Example: e(address,adresse)=2, e(007,aa7)=2

Normalise: $ne(A,B) = e(A,B) / \max\{ |A|, |B| \}$

Trees: net = sum of all mismatches divided by sum of all max lengths
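The definitions above can be sketched in Python as follows. The dynamic-programming routine is the standard edit distance; `term_distance` is a simplification that only handles flat terms of equal arity, written as tuples of symbols, and is an assumption of this sketch rather than the full tree algorithm.

```python
def edit_distance(a, b):
    """Minimal number of delete, add, replace operations turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # add char_b
                previous[j - 1] + cost,  # replace or keep
            ))
        previous = current
    return previous[-1]


def normalised_edit_distance(a, b):
    """ne(A,B) = e(A,B) / max(|A|, |B|); defined as 0.0 for two empty strings."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def term_distance(term1, term2):
    """net for flat terms: sum of mismatches divided by sum of max lengths.

    Assumption: both terms are tuples (functor, arg1, ...) of equal length.
    """
    mismatches = sum(edit_distance(s, t) for s, t in zip(term1, term2))
    lengths = sum(max(len(s), len(t)) for s, t in zip(term1, term2))
    return mismatches / lengths if lengths else 0.0


distance = term_distance(("address", "mike", "london"),
                         ("address", "mike", "londn"))
```

For the two address terms there is a single mismatch over a total length of 17, so the distance is 1/17 and the terms match to roughly 94%.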

Page 37

Fuzzy unification and arguments

net is conservative extension of MGU (most general unifier)

net(t,t’) ne(t,t’)

Adapt definition of argument for fuzzy unification
  V-argument: for all L in a body, there is an L’ in a head such that $net(L,L') \leq 1 - V$
  A V-undercuts B if A contains not L and B’s head is L’ and $net(L,L') \leq 1 - V$
  A V-rebuts B if A’s head is L and B’s head is $\neg$L’ and $net(L,L') \leq 1 - V$

Adapt previous definitions accordingly

Page 38

Comparison: Argumentation

Our framework allows us to relate existing and new argumentation semantics:
  Dung = a/su = a/u = a/a = a/d = a/sa
  Prakken & Sartor = d/su = d/u = d/a = d/d = d/sa
  WFSX = u/a = u/d = u/sa
  Dung $\subseteq$ Prakken & Sartor $\subseteq$ WFSX

Proof Theory and Top-down Proof Procedure adapted from Alferes, Damasio, Pereira’s SLXA

Page 39

Comparison: Fuzzy Argumentation

Wagner:
  Scale: -1 to +1
  Unlike WFSX, he relates $F$ and $\neg F$: $\neg F : -V$ iff $F : V$

We adopted his interpretation for not: $\mathit{not}\ F : 1$ if $\neg F : V$, $V > 0$

Relates his work to stable models, but there is no top-down proof procedure for stable models [Alferes&Pereira]

Our approach conservatively extends WFSX, hence we can adapt proof procedure SLXA

Page 40

Comparison: Fuzzy unification

Arcelli, Formato, Gerla define an abstract fuzzy unification/resolution framework
  cannot deal with missing parameters (common problem [Fung et al.])
  no conservative extension of classical unification
  we use a concrete distance: edit distance

Evaluated idea on bioinfo DB

Page 41

Conclusion
  “A database needs two kinds of negation” (Wagner)
  Argumentation is an elegant way of defining semantics
  Our framework allows classification of various new and existing semantics
  Efficient top-down proof procedure for justified arguments
  Argumentation as basis for belief revision (REVISE)
  We cover the whole knowledge system cube including fuzzy argumentation
  Defined fuzzy unification, which is useful in open systems

Page 42

Part I.c: Vivid Agent

A vivid agent is a software-controlled system whose state is represented by a knowledge base and whose behaviour is represented by action and reaction rules
  Actions are planned and executed to achieve a goal
  Reactions are triggered by events

Reaction rules (RR):
  Epistemic RR: Effect <- Event, Cond
  Physical RR: Action, Effect <- Event, Cond
  Interaction RR: Msg, Effect <- Event, Cond

Page 43

Vivid Agent

[Figure: vivid agent architecture with the components interface (events), perception-reaction cycle (reaction rules; beliefs and updates on the KB) and planner (goals, action rules; beliefs, goals, intentions)]

Page 44

Agent State and Transition Semantics

Agent State: Event queue, Plan queue, Goal queue, Knowledge base

Transition semantics
  Perception: Add event to agent’s event queue
  Reaction: Pop event from event queue, execute reactions including update of knowledge base
  Plan execution: Execute action of plan in plan queue
  Replanning: If action fails, replan
  Planning: Pop goal from goal queue and generate plan
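The transitions can be pictured with a toy Python class. This is only a sketch under strong assumptions: the knowledge base is a set of strings, reaction rules are epistemic (event, condition, effect) triples, the planner is a function returning a list of callable actions, and replanning simply puts the goal back on the goal queue. None of these names come from the original implementation.

```python
from collections import deque


class VividAgent:
    """Toy agent state: event queue, plan queue, goal queue, knowledge base."""

    def __init__(self, reaction_rules, planner):
        self.events = deque()
        self.plans = deque()
        self.goals = deque()
        self.kb = set()
        self.reaction_rules = reaction_rules  # (event, condition, effect) triples
        self.planner = planner                # (goal, kb) -> list of actions

    def perceive(self, event):
        """Perception: add the event to the event queue."""
        self.events.append(event)

    def react(self):
        """Reaction: pop an event and apply every matching reaction rule."""
        event = self.events.popleft()
        for trigger, condition, effect in self.reaction_rules:
            if trigger == event and condition(self.kb):
                self.kb.add(effect)

    def plan(self):
        """Planning: pop a goal and queue a plan for it."""
        goal = self.goals.popleft()
        self.plans.append((goal, deque(self.planner(goal, self.kb))))

    def execute(self):
        """Plan execution: run one action; on failure, requeue the goal."""
        goal, actions = self.plans[0]
        action = actions.popleft()
        if not action(self.kb):
            self.plans.popleft()
            self.goals.appendleft(goal)  # replanning happens on the next plan()
        elif not actions:
            self.plans.popleft()


def open_umbrella(kb):
    """Hypothetical physical action; returns True when it succeeds."""
    kb.add("umbrella_open")
    return True


agent = VividAgent(
    reaction_rules=[("rain", lambda kb: True, "wet")],
    planner=lambda goal, kb: [open_umbrella],
)
agent.perceive("rain")
agent.react()
agent.goals.append("stay_dry")
agent.plan()
agent.execute()
```

After these steps the knowledge base holds the effect of the reaction and of the executed action, and all three queues are empty.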

Page 45

Implementation in Prova

Original implementation in PVM-Prolog
  Coarse-grain parallelism (PVM) for each agent and Prolog threads for an agent’s components

Currently: Prova is a Java-based rule engine
  easy integration of all kinds of data sources, e.g., databases, web services, etc.

Page 46

Part II: Application to Bioinformatics

NSF and EU’s strategic research workshop found that bioinformatics could play the role for the semantic web which physics played for the web.

Why?
  Masses of information
  Masses of publicly accessible online information (e.g. 8000 abstracts per month and over 500 tools)
  Data (more and more often) published in XML
  Data standards are accepted and actively developed
  Much valuable information scattered (as production is cheap and hence not centralised)
  Systems integration and interoperation are a prime concern (e.g. GeneOntology)

Page 47

Example: Information Agents for…

… Protein interactions: PDB, SCOP

… Protein annotation: TOPPred, HMMTOP, …

[Figure: information sources, each with a wrapper, connected to a mediator and a facilitator]

Page 48

Example 1: Protein Interaction

PDB: Protein structures
SCOP: Structure classification

Page 49

Example 1: PSIMAP: Structural Interactions

Page 50

Example 1: Protein Interaction: How it is currently done

PDB: 15 Gigabyte in flat files
SCOP: 3 flat files

How?
  Download PDB, SCOP files
  Think up DB schema and populate MySQL DB
  Run some Perl scripts on various machines that grind through the data and analyse it
  Run some Java to visualise results

Problem: “Business logic” not separated

Page 51

How our Prova system can execute it

Declarative and executable specifications
  Interaction(Superfamily1, Superfamily2) if
    PDB(Protein),
    Domain(Protein,Domain1),
    Domain(Protein,Domain2),
    SCOP Superfamily(Domain1, Superfamily1),
    SCOP Superfamily(Domain2, Superfamily2),
    InteractionDD(Domain1,Domain2, 5 Ang, 5 Residues)

Separation of information integration workflow
  Easier to maintain
Platform independence, because of Java
Flexible, optimized execution
  Query optimization and load-balancing of computations
  Local or remote computation: data might be held locally in a file, remotely in a DB, behind a web service, on the grid, etc.

Page 52

Actual Prova Code

% ACTUAL PROVA CODE
% Given the open database connection DB
% and a unique protein identifier in Protein
% Data Bank PDB_ID, test whether the provided
% domains with IDs PXA and PXB interact
% (have at least 5 atoms within 5 angstroms)
scop_dom2dom(DB,PDB_ID,PXA,PXB) :-
    access_data(pdb,PDB_ID,Protein),
    scop_dom_atoms(DB,Protein,PXA,DomainA),
    scop_dom_atoms(DB,Protein,PXB,DomainB),
    DomainA.interacts(DomainB).

Page 53

Caching

% Two alternative rules for either retrieving data
% from the cache or accessing the data from its
% original location and caching it.
access_data(Type,ID,Data,CacheData) :-
    % Attempt to retrieve the data
    Data=CacheData.get(ID),
    % Success, Data (whatever object it is) is returned
    !.

access_data(Type,ID,Data,CacheData) :-
    % Retrieve the data from its location and update the cache
    retrieve_data_general(Type,ID,Data),
    update_cache(Type,ID,Data,CacheData).
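The same cache-or-retrieve pattern can be sketched in Python for readers unfamiliar with rule syntax. This is not Prova code: the dictionary cache, the `retrieve` callback and the stand-in loader `fake_retrieve` are assumptions made for the illustration.

```python
def access_data(data_type, item_id, cache, retrieve):
    """Return the cached object if present, else retrieve and cache it."""
    key = (data_type, item_id)
    if key in cache:
        # First alternative: the cache already holds the data.
        return cache[key]
    # Second alternative: fetch from the original location, then update the cache.
    data = retrieve(data_type, item_id)
    cache[key] = data
    return data


calls = []


def fake_retrieve(data_type, item_id):
    """Hypothetical stand-in for a slow download; records each call."""
    calls.append((data_type, item_id))
    return f"{data_type}:{item_id}"


cache = {}
first = access_data("pdb", "1abc", cache, fake_retrieve)
second = access_data("pdb", "1abc", cache, fake_retrieve)
```

The second lookup is served from the cache, so the retrieval function runs only once.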

Page 54

Example 2: GoPubmed

Page 55

Consistency of GO

Simple example:
  Parsimony: If A is-a C is explicitly stated in the ontology, it should not be possible to derive it implicitly
  I.e. don’t state A is-a C if you already have A is-a B and B is-a C

Done with Prova
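A parsimony check of this kind can be sketched in a few lines of Python. This is an illustration only, not the Prova implementation: it assumes the ontology is given as a set of (child, parent) is-a pairs, and the term names A, B, C are placeholders.

```python
def redundant_is_a(edges):
    """Return the stated is-a edges that also follow from a longer is-a path."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)

    def reachable(start, skipped_edge):
        """All ancestors of start that can be reached without the skipped edge."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for parent in parents.get(node, ()):
                if (node, parent) == skipped_edge or parent in seen:
                    continue
                seen.add(parent)
                stack.append(parent)
        return seen

    return {(a, c) for (a, c) in edges if c in reachable(a, (a, c))}


edges = {("A", "B"), ("B", "C"), ("A", "C")}
redundant = redundant_is_a(edges)
```

Only the edge from A to C is reported, since it is already implied by A is-a B and B is-a C.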

Page 56

Towards functional annotation through GoPubmed

Columns: Protein name; enzyme activity (hydrolase, kinase, transferase, lyase, isomerase, one other)

Pyruvate kinase M1 isozyme: X X X X X; other: oxidoreductase
cAMP dpt protein kinase type II regulatory chain: X X X X; other: cyclase
Galactokinase: X X X X X
Tropomyosin beta chain: X X X X
HnRNP DO: X X X X; other: helicase

Page 57

Example 3: Consistent Integration of Protein Annotation

Page 58

Conflicts

Page 59

Example: Edit2TrEMBL

EditToTrEMBL (Steffen Möller, EBI): automates annotation of DNA sequences by combining the results of various online tools and databases

[Figure: a dispatcher and several analysers distributed over hosts, exchanging info objects]

Page 60

Challenge
  Uncertain, incomplete, vague, contradictory information
  Wrappers’ domains overlap: How can the mediator resolve conflicts?
  How can the mediator integrate information consistently?
  How can the mediator improve info quality using overlapping info and inconsistencies?

Mediator contains a conflict resolution component
Semantic conflict resolution requires domain knowledge to identify conflicts
We use extended logic programming

Common problem: Overlapping information can lead to inconsistencies
Solution: Semantic consistency checking

[Figure: sources, each with a wrapper, connected to a mediator and a facilitator]

Page 61

Modelling domain knowledge

Facts, Rules, Assumptions, Integrity Constraints

For example:

The length of transmembrane regions is limited:
  false if ft(AccNo,transmembrane,From,To), To-From >25
  false if ft(AccNo,transmembrane,From,To), To-From <15

Maximal difference in membrane borders:
  false if ft(Agent1,Acc,transmembrane,From1,To1),
    ft(Agent2,Acc,transmembrane,From2,To2),
    (From1>From2,From1<To2;To1>From2,To1<To2),
    (abs(From2-From1)>4;abs(To2-To1)>4).

Assessment of predictions:
  probability(ft(tmhmm,p12345,transmem,6,26), 0.5)
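To make the two integrity constraints concrete, here is a Python sketch that checks them directly. It is an illustration, not the logic program itself: predictions are assumed to be (agent, accession, start, end) tuples, and apart from the tmhmm prediction taken from the slide, the `toppred` and `hmmtop` values are invented.

```python
def length_violation(prediction):
    """First constraint: a transmembrane region spans 15 to 25 positions."""
    _agent, _acc, start, end = prediction
    length = end - start
    return length > 25 or length < 15


def border_conflict(p1, p2):
    """Second constraint: overlapping regions predicted for the same protein
    may differ by at most 4 positions at either border."""
    _, acc1, from1, to1 = p1
    _, acc2, from2, to2 = p2
    if acc1 != acc2:
        return False
    overlap = (from2 < from1 < to2) or (from2 < to1 < to2)
    too_far = abs(from2 - from1) > 4 or abs(to2 - to1) > 4
    return overlap and too_far


tmhmm = ("tmhmm", "p12345", 6, 26)     # prediction from the slide
toppred = ("toppred", "p12345", 12, 33)  # hypothetical prediction
hmmtop = ("hmmtop", "p12345", 7, 27)     # hypothetical prediction

conflicts = [
    (a[0], b[0])
    for a in (tmhmm, toppred, hmmtop)
    for b in (tmhmm, toppred, hmmtop)
    if a is not b and border_conflict(a, b)
]
```

The tmhmm and hmmtop predictions agree to within one position, while both of them overlap the toppred region with borders more than 4 positions apart, so those pairs are flagged.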

Page 62

REVISE

REVISE detects conflicting arguments and computes a minimal set of assumptions whose removal resolves the conflict
Dropping these assumptions yields a minimal consistent annotation of all predictions
Minimality is based on probabilities given as part of predictions
  alternative: cardinality, set-inclusion

Page 63

Vision: A semantic Grid for Bioinformatics
  Expression Space: Space Explorer
  Pathway Space: BioNetExplorer
  Interaction Space: PSIMAP
  Literature Space: Classification Server

Page 64

Conclusion
  Advanced applications on the web will require rules and reasoning
  Part I:
    Argumentation is an elegant way of defining semantics
    Classification of various new and existing semantics
    Fuzzy reasoning and unification
    Reactivity with vivid agents and Prova
  Part II:
    Bioinformatics requires a semantic web and the semantic web requires bioinformatics

Page 65

Acknowledgment

Ralf Schweimeier (Argumentation semantics)
Panos Dafas, Dan Bolser (PSIMAP)
Steffen Moeller (Edit2Trembl)
David Gilbert (Fuzzy Unification)
Ralph Delfs, Alexander Kozlenkov (Go, Prova)
Carlos Damasio (REVISE)

More information at comas.soi.city.ac.uk

Email: ms@mpi-cbg.de