
Knowledge Management in a Knowledge Based Discipline

Date post: 22-Nov-2014
Upload: robertstevens65
Description:
invited talk, Aston Business School, Aston University 2009
Page 1: Knowledge Management in a Knowledge Based Discipline

Knowledge Management in a Knowledge Based Discipline

Robert Stevens

BioHealth Informatics Group

University of Manchester

[email protected]

Page 2: Knowledge Management in a Knowledge Based Discipline

Introduction

• How do we do (molecular) biology?
• Managing stamp albums
• A knowledge based discipline
• Representing knowledge computationally
• Ontologies that define what entities are in the domain
• Describing biological knowledge ontologically
• Using ontologies and is it enough?

Page 3: Knowledge Management in a Knowledge Based Discipline

Ernest Rutherford

“All science is either physics or stamp collecting”

Image: http://en.wikipedia.org/wiki/File:Ernest_Rutherford2.jpg

Page 4: Knowledge Management in a Knowledge Based Discipline

Mathematical Sciences

Page 5: Knowledge Management in a Knowledge Based Discipline

Laws in Biology

Charles Darwin

Image: http://en.wikipedia.org/wiki/File:Charles_Darwin_01.jpg

On The Origin of Species - 1859

Page 6: Knowledge Management in a Knowledge Based Discipline

Classic and Modern Biology

Genotype Phenotype

Modern biology

Classic biology

Page 7: Knowledge Management in a Knowledge Based Discipline

Central Dogma

Image: http://cellbio.utmb.edu/CELLBIO/DNA-RNA.jpg

Page 8: Knowledge Management in a Knowledge Based Discipline

Speed of sequencing

• First human genome

– 10+ years to produce
– Cost $500 million
– Huge international effort

• Now done in 10 weeks

– (for $399)
– http://tinyurl.com/genomecost
– http://www.23andme.com

Page 9: Knowledge Management in a Knowledge Based Discipline

1000+ databases

• according to Nucleic Acids Research

Page 10: Knowledge Management in a Knowledge Based Discipline

PubMed: 2 papers per minute

• ~700,000 individual papers
• Grows at 2 papers per minute (see http://blogs.bbsrc.ac.uk for details)

Page 11: Knowledge Management in a Knowledge Based Discipline

UniProt: A protein database?

Page 12: Knowledge Management in a Knowledge Based Discipline

What is Knowledge?

• Knowledge – all information and an understanding to carry out tasks and to infer new information

• Information -- data equipped with meaning

• Data -- un-interpreted signals that reach our senses

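Figure: the same record at three levels

Data: Michael Ashburner; Professor; University of Cambridge; UK; ISMB

Information: Name = Michael Ashburner; Job = Professor; Institution = University of Cambridge; Country = UK; Conf = ISMB

Knowledge: man; academic, senior; ancient university, 5 rated; European; biology; important figure in biology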

Page 13: Knowledge Management in a Knowledge Based Discipline

A Knowledge Based Discipline

• Rather than laws captured in mathematics…
• We have lots of facts: the discipline’s knowledge
• Rather than “calculating” what a protein does, we investigate and write it down
• Equivalent to writing down the trajectories of all thrown objects and not doing ballistics!
• To do biology one needs “the knowledge”

Page 14: Knowledge Management in a Knowledge Based Discipline

Heterogeneity

• 28 ways to format the representations of a biological sequence

• Though one way to represent the bases or amino acids…

• Different words, same concept
• Different concepts, same words
• Different and implicit data schema

Page 15: Knowledge Management in a Knowledge Based Discipline

Categories and Category Labels

GO:0000368

U2-type nuclear mRNA 5' splice site recognition

spliceosomal E complex formation

spliceosomal E complex biosynthesis

spliceosomal CC complex formation

U2-type nuclear mRNA 5'-splice site recognition
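The slide above lists five labels that all point at one category. A minimal Python sketch of how a controlled vocabulary might resolve such labels to a single identifier follows. The identifier and labels are taken from the slide; the function names and the normalisation rule (lower-casing and treating hyphens as spaces) are assumptions made for illustration, not a description of any Gene Ontology tool.

```python
# Labels and identifier copied from the slide. Everything else here
# (function names, the normalisation rule) is a hypothetical illustration.
CATEGORY_ID = "GO:0000368"
LABELS = [
    "U2-type nuclear mRNA 5' splice site recognition",
    "spliceosomal E complex formation",
    "spliceosomal E complex biosynthesis",
    "spliceosomal CC complex formation",
    "U2-type nuclear mRNA 5'-splice site recognition",
]


def normalise(label):
    """Lower-case a label and treat hyphens as spaces, collapsing whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


def build_index(category_id, labels):
    """Map every normalised label to the one category identifier."""
    return {normalise(label): category_id for label in labels}


INDEX = build_index(CATEGORY_ID, LABELS)


def resolve(label):
    """Return the category identifier for a label, or None if it is unknown."""
    return INDEX.get(normalise(label))


if __name__ == "__main__":
    print(resolve("Spliceosomal E complex formation"))
    print(resolve("some unknown label"))
```

The point of the sketch is that the identifier, not the label, carries identity: many words can name one category, and two of the slide's labels differ only by a hyphen.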

Page 16: Knowledge Management in a Knowledge Based Discipline

An Identity Crisis

• Database entries have identifiers unique within their database

• The type of entity described in an entry doesn’t have an identifier

• Different entries about the same type talk about it differently

• How do we know when an entry in one DB talks about the same thing as another entry in another DB?

• That’s the skill of a bioinformatician

Page 17: Knowledge Management in a Knowledge Based Discipline

Why: Society of Biologists

• Doing particle physics necessarily requires central organisation
• One central place to generate data
• A communitarian attitude
• It is still possible to do biology in the “garden shed”
• Historically less need to organise
• Hence…

Page 18: Knowledge Management in a Knowledge Based Discipline

Navigating the Web of Knowledge in Bioinformatics

Page 19: Knowledge Management in a Knowledge Based Discipline

Biology is Special

• Large quantities of data: No it doesn’t
• Complex data: Yes it does
• Volatile data: Types of data and what is recorded change rapidly
• Nothing that special about biology
• …except that it has all the problems and often to a large degree

Page 20: Knowledge Management in a Knowledge Based Discipline

Lots of catalogues

Genome

Proteome

Transcriptome

Interactome

Metabolome

PHENOME

Page 21: Knowledge Management in a Knowledge Based Discipline

Biology now has lots of facts

Page 22: Knowledge Management in a Knowledge Based Discipline

Creating Woods, not Trees

Genes

Proteins

Pathways

Interactions

Literature

Complex Machines

Virtual Organism

… from biological facts, we make a system that is some model of a real organism

Page 23: Knowledge Management in a Knowledge Based Discipline

Networks of Chemicals

Image: http://genome-www.stanford.edu/rap_sir/images/Web_FigF_RAP1_glycolysis.gif

Page 24: Knowledge Management in a Knowledge Based Discipline

Systems within Systems

Image: http://www.ehponline.org/members/2007/10373/fig1.jpg

Page 25: Knowledge Management in a Knowledge Based Discipline

A Biologist’s Skills

• By the time a biologist has finished a Ph.D. he/she is about ready for action

• They have a comprehensive knowledge of the facts of a (narrow) domain

• He/she also knows how to do experimentation in that domain

• There are so many facts, it is difficult to move outside one’s sub-discipline

• Yet in a systems view such movement is mandatory

Page 26: Knowledge Management in a Knowledge Based Discipline

The Role of Knowledge

• A lot of facts
• Perhaps organised into a system
• No equivalent of “laws of mechanics” – we can’t do this biology with mathematics
• Or at least not without knowing what the numbers mean...
• This is why we’ve been using ontologies!

Page 27: Knowledge Management in a Knowledge Based Discipline

What is an Ontology?

• A description of that which exists (in our data)
• What it means to be a member of a category
• What categories of things exist and how do I recognise that a particular object is a member of a given category
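One way to picture "what it means to be a member of a category" is to write each category down with an explicit membership test. The Python sketch below does that for two deliberately simplified, hypothetical categories; the category names, property keys and conditions are assumptions made for illustration and are not taken from any real ontology.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Entity = Dict[str, object]


@dataclass(frozen=True)
class Category:
    name: str
    parent: Optional[str]                # name of the broader category, if any
    is_member: Callable[[Entity], bool]  # explicit membership condition


# Hypothetical, deliberately simplified categories for illustration only.
CATEGORIES: List[Category] = [
    Category("Protein", None, lambda e: e.get("polymer_of") == "amino acid"),
    Category("Enzyme", "Protein", lambda e: bool(e.get("catalyses"))),
]
BY_NAME = {c.name: c for c in CATEGORIES}


def classify(entity: Entity) -> List[str]:
    """Return every category whose own condition and all ancestors' conditions hold."""
    matched = []
    for category in CATEGORIES:
        current: Optional[Category] = category
        ok = True
        while current is not None:
            if not current.is_member(entity):
                ok = False
                break
            current = BY_NAME[current.parent] if current.parent else None
        if ok:
            matched.append(category.name)
    return matched


if __name__ == "__main__":
    print(classify({"polymer_of": "amino acid", "catalyses": "a reaction"}))
    print(classify({"polymer_of": "amino acid"}))
```

Because the conditions are written down rather than left in someone's head, a program can recognise that a particular object belongs to a given category, which is the question the slide asks.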

Page 28: Knowledge Management in a Knowledge Based Discipline

Uses of Ontology in Bioinformatics

Page 29: Knowledge Management in a Knowledge Based Discipline

Why develop an ontology?

• To make domain assumptions explicit
– Easier to change domain assumptions
– Easier to understand and update legacy data
• To separate domain knowledge from operational knowledge
– Re-use domain and operational knowledge separately
• A community reference for applications
• To share a consistent understanding of what information means.

Page 30: Knowledge Management in a Knowledge Based Discipline

History of Bio-ontologies

Timeline figure (1992 to 2006). Events shown: TAMBIS; 1st Bio-ontologies meeting; Gene Ontology starts; MGED. Years marked on the axis: 1992, 1996, 1998, 2002, 2005, 2006.

Page 31: Knowledge Management in a Knowledge Based Discipline

Controlled Vocabulary

• An Ontology isn’t a controlled vocabulary, but can be used to deliver one

• By agreeing upon the categories in a domain and agreeing upon their labels we are controlling vocabulary

• Addresses one major problem in biology
• Also forces examination of definitions
• Makes domain assumptions explicit

Page 32: Knowledge Management in a Knowledge Based Discipline

Transferring Characteristics

Uncharacterised protein

Tra1 La2 La3

High similarity transfer characteristics
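The idea on this slide, transferring characteristics from a characterised protein to a highly similar uncharacterised one, can be sketched in a few lines of Python. This is a toy under stated assumptions: the sequences, protein names, annotations and the 0.8 threshold are all hypothetical, and similarity is a naive fraction of identical positions over equal-length strings rather than a proper alignment of the kind a real tool would compute.

```python
def identity(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def transfer_annotation(query, characterised, threshold=0.8):
    """Return (name, score, annotations) of the best hit, or None if too dissimilar."""
    best_name, best_score = None, 0.0
    for name, (sequence, _annotations) in characterised.items():
        score = identity(query, sequence)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < threshold:
        return None
    return best_name, best_score, characterised[best_name][1]


# Hypothetical characterised proteins: name -> (sequence, annotations).
CHARACTERISED = {
    "protA": ("MKTAYIAKQR", {"hypothetical function X"}),
    "protB": ("MLSPADKTNV", {"hypothetical function Y"}),
}

if __name__ == "__main__":
    print(transfer_annotation("MKTAYIAKQK", CHARACTERISED))
    print(transfer_annotation("AAAAAAAAAA", CHARACTERISED))
```

The transferred annotation is only useful if the terms in it mean the same thing in both databases, which is where a controlled vocabulary comes in.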

Page 33: Knowledge Management in a Knowledge Based Discipline

Post-Genomic Biology

• Fly, mouse, yeast, worm all have their own terminologies

• I want to compare genomes
• How?
• The genomic sequence is easily dealt with computationally and comparisons are easy
• This is not true of the annotations or knowledge of those sequences
• Need a common understanding

Page 34: Knowledge Management in a Knowledge Based Discipline

Annotation of Data

• Big effort to create controlled vocabularies using ontologies

• A huge annotation effort – describe the entities in DB with terms from ontologies

• The Gene Ontology (http://www.geneontology.org)
• The Open Biomedical Ontologies Consortium

Page 35: Knowledge Management in a Knowledge Based Discipline

Genotype Phenotype

Figure: ontologies arranged along the genotype to phenotype axis

Sequence: Sequence types and features; Genetic context

Gene products / Transcript: Molecule role; Molecular function; Biological process; Cellular component

Proteins: Protein covalent bond; Protein domain; UniProt taxonomy

Pathways: Pathway ontology; Event (INOH pathway ontology); Systems Biology; Protein-protein interaction

Development: Arabidopsis development; Cereal plant development; Plant growth and developmental stage; C. elegans development; Drosophila development (FBdv); Human developmental anatomy, abstract version; Human developmental anatomy, timed version

Anatomy: Mosquito gross anatomy; Mouse adult gross anatomy; Mouse gross anatomy and development; C. elegans gross anatomy; Arabidopsis gross anatomy; Cereal plant gross anatomy; Drosophila gross anatomy; Dictyostelium discoideum anatomy; Fungal gross anatomy (FAO); Plant structure; Maize gross anatomy; Medaka fish anatomy and development; Zebrafish anatomy and development

Phenotype: NCI Thesaurus; Mouse pathology; Human disease; Cereal plant trait; PATO (attribute and value); Mammalian phenotype; Habronattus courtship; Loggerhead nesting; Animal natural history and life history

Also shown: Cell type; BRENDA tissue / enzyme source; Plasmodium life cycle; eVOC (Expressed Sequence Annotation for Humans)

Page 36: Knowledge Management in a Knowledge Based Discipline

The Sequence Ontology

(http://obo.sf.net)

Page 37: Knowledge Management in a Knowledge Based Discipline
Page 38: Knowledge Management in a Knowledge Based Discipline

GO in Analysis

• Microarray analysis was one of the original visions for GO
• Modulated genes cluster around the functional attributes of their proteins
• GO is also used in, for example, semantic similarity, text analysis, etc.
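As a rough illustration of the "semantic similarity" use mentioned above, the Python sketch below compares two terms by the overlap of their ancestor sets in an is_a hierarchy (a Jaccard index). The hierarchy and term names are hypothetical placeholders, not real GO terms, and this set-overlap measure is only one simple choice among many; it is not presented as the measure used by any particular GO tool.

```python
# Hypothetical is_a hierarchy: term -> list of direct parents.
IS_A = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "A1": ["A"],
    "A2": ["A"],
}


def ancestors(term):
    """Return the term together with everything reachable through is_a links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in IS_A[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def similarity(term_a, term_b):
    """Jaccard index of the two terms' ancestor sets (1.0 means identical)."""
    a, b = ancestors(term_a), ancestors(term_b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    print(similarity("A1", "A2"))  # siblings share "A" and "root"
    print(similarity("A1", "B"))   # share only "root"
```

Terms that sit close together in the hierarchy score higher than terms that only share the root, so genes annotated with them can be grouped by function rather than by name.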

Page 39: Knowledge Management in a Knowledge Based Discipline

Fact Management

• When “stamp collecting” we’re collecting facts
• Biology is a fact management activity
• Knowing what these facts mean is very important
• Science is performed on data and the semantics of data enable us to do science
• Semantic e-Science

Page 40: Knowledge Management in a Knowledge Based Discipline

Summary

• The nature of modern biology gives it interesting knowledge (fact) management issues
• It is a knowledge based discipline
• Not unique, but often extreme
• Ontologies seen as one component in management (but not a panacea)

Page 41: Knowledge Management in a Knowledge Based Discipline

Acknowledgements

• All these people provided slides and input:
• Duncan Hull
• Simon Jupp
• Phil Lord
• Carole Goble

Page 42: Knowledge Management in a Knowledge Based Discipline

Genotype to Pathway

Created by Paul Fisher

Page 43: Knowledge Management in a Knowledge Based Discipline

Pathway to Phenotype

Created by Paul Fisher

Page 44: Knowledge Management in a Knowledge Based Discipline

Ontology Space

Figure axes: (Axiomatic) Richness; Usage; Representation

Page 45: Knowledge Management in a Knowledge Based Discipline

Metadata toilet

• Everyone wants to use good metadata but few people want to spend time curating and cleaning metadata

– Like a clean toilet

Page 46: Knowledge Management in a Knowledge Based Discipline

Biologists Wake up to Standards

