How Philosophy of Science Can Help Biomedical Research, Barry Smith

Post on 15-Jan-2016

http://ontology.buffalo.edu/smith

How to Do Biology across the Genome?

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPISKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDV

sequence of X chromosome in baker’s yeast

MKVSDRRKFEKANFDEFESALNNKNDLVHCPSITLFESIPTEVRSFYEDEKSGLIKVVKFRTGAMDRKRSFEKVVISVMVGKNVKKFLTFVEDEPDFQGGPIPSKYLIPKKINLMVYTLFQVHTLKFNRKDYDTLSLFYLNRGYYNELSFRVLERCHEIASARPNDSSTMRTFTDFVSGAPIVRSLQKSTIRKYGYNLAPYMFLLLHVDELSIFSAYQASLPGEKKVDTERLKRDLCPRKPIEIKYFSQICNDMMNKKDRLGDILHIILRACALNFGAGPRGGAGDEEDRSITNEEPIIPSVDEHGLKVCKLRSPNTPRRLRKTLDAVKALLVSSCACTARDLDIFDDNNGVAMWKWIKILYHEVAQETTLKDSYRITLVPSSDGISLLAFAGPQRNVYVDDTTRRIQLYTDYNKNGSSEPRLKTLDGLTSDYVFYFVTVLRQMQICALGNSYDAFNHDPWMDVVGFEDPNQVTNRDISRIVLYSYMFLNTAKGCLVEYATFRQYMRELPKNAPQKLNFREMRQGLIALGRHCVGSRFETDLYESATSELMANHSVQTGRNIYGVDSFSLTSVSGTTATLLQERASERWIQWLGLESDYHCSFSSTRNAEDVVAGEAASSNHHQKISRVTRKRPREPKSTNDILVAGQKLFGSSFEFRDLHQLRLCYEIYMADTPSVAVQAPPGYGKTELFHLPLIALASKGDVEYVSFLFVPYTVLLANCMIRLGRRGCLNVAPVRNFIEEGYDGVTDLYVGIYDDLASTNFTDRIAAWENIVECTFRTNNVKLGYLIVDEFHNFETEVYRQSQFGGITNLDFDAFEKAIFLSGTAPEAVADAALQRIGLTGLAKKSMDINELKRSEDLSRGLSSYPTRMFNLIKEKSEVPLGHVHKIRKKVESQPEEALKLLLALFESEPESKAIVVASTTNEVEELACSWRKYFRVVWIHGKLGAAEKVSRTKEFVTDGSMQVLIGTKLVTEGIDIKQLMMVIMLDNRLNIIELIQGVGRLRDGGLCYLLSRKNSWAARNRKGELPPKEGCITEQVREFYGLESKKGKKGQHVGCCGSRTDLSADTVELIERMDRLAEKQATASMSIVALPSSFQESNSSDRYRKYCSSDEDSNTCIHGSANASTNASTNAITTASTNVRTNATTNASTNATTNASTNASTNATTNASTNATTNSSTNATTTASTNVRTSATTTASINVRTSATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSATTTASINVRTSATTTESTNSSTSATTTASINVRTSATTTKSINSSTNATTTESTNSNTNATTTESTNSSTNATTTESTNSSTNATTTESTNSNTSAATTESTNSNTSATTTESTNASAKEDANKDGNAEDNRFHPVTDINKESYKRKGSQMVLLERKKLKAQFPNTSENMNVLQFLGFRSDEIKHLFLYGIDIYFCPEGVFTQYGLCKGCQKMFELCVCWAGQKVSYRRIAWEALAVERMLRNDEEYKEYLEDIEPYHGDPVGYLKYFSVKRREIYSQIQRNYAWYLAITRRRETISVLDSTRGKQGSQVFRMSGRQIKELYFKVWSNLRESKTEVLQYFLNWDEKKCQEEWEAKDDTVVVEALEKGGVFQRLRSMTSAGLQGPQYVKLQFSRHHRQLRSRYELSLGMHLRDQIALGVTPSKVPHWTAFLSMLIGLFYNKTFRQKLEYLLEQISEVWLLPHWLDLANVEVLAADDTRVPLYMLMVAVHKELDSDDVPDGRFDILLCRDSSREVGE

Stelzl et al., Cell, 2005

network of gene interactions in E. coli

http://moebio.com/santiago/gnom/ingles.html

what cellular component?

what molecular function?

what biological process?

The Idea of Common Controlled Vocabularies

MouseEcotope

GlyProt

DiabetInGene

GluChem

sphingolipid transporter activity

Holliday junction helicase complex

male courtship behavior, orientation prior to leg tapping and wing vibration

Gene Ontology

Benefits of GO

1. based in biological science

2. links data to biological reality

3. links people to software

4. links data together

• across species (human, mouse, yeast, fly ...)

• across granularities (molecule, cell, organ, organism, population)
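The fourth benefit, linking data together across species, can be sketched with a toy annotation table. This is only an illustration under stated assumptions: the gene identifiers are hypothetical placeholders rather than real database entries, and only the GO term names are taken from the slides.

```python
from collections import defaultdict

# Hypothetical annotations: (species, gene identifier, GO term name).
# The gene identifiers are placeholders for illustration, not real entries.
ANNOTATIONS = [
    ("human", "gene_h1", "sphingolipid transporter activity"),
    ("mouse", "gene_m1", "sphingolipid transporter activity"),
    ("yeast", "gene_y1", "Holliday junction helicase complex"),
]

def genes_by_term(annotations):
    """Group annotated genes by their shared GO term, regardless of species."""
    index = defaultdict(list)
    for species, gene, term in annotations:
        index[term].append((species, gene))
    return dict(index)

index = genes_by_term(ANNOTATIONS)
print(index["sphingolipid transporter activity"])
```

Because both the human and the mouse record use the same controlled term, a single lookup retrieves them together, which is the sense in which a common vocabulary links separately curated data.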

The goal

all biological (biomedical) research data should cumulate to form a single, algorithmically processable whole

http://obofoundry.org

Ontologies already being applied to achieve this goal

Sjöblom T, et al. analyzed 13,023 genes in 11 breast and 11 colorectal cancers

GO tells you what is standard functional information for these genes

By tracking deviations from this standard, 189 genes could be identified as being mutated at significant frequency and thus as providing targets for diagnostic and therapeutic intervention.

Science. 2006 Oct 13;314(5797):268-74.

Towards Empirical Philosophy

• processualist vs. 3-dimensionalist

• reductionist vs. non-reductionist

• realist vs. nominalist

If ontologies based on different philosophical principles are tested for their utility in support of scientific research, which types of ontologies will prove most useful?

Some sample ontologies

Cell Ontology (CL)
Foundational Model of Anatomy (FMA)
Environment Ontology (EnvO)
Gene Ontology (GO)
Infectious Disease Ontology
Phenotypic Quality Ontology (PaTO)
Protein Ontology (PRO)
RNA Ontology (RnaO)
Sequence Ontology (SO)

The problem

High-throughput experimental data are meaningless unless the researcher is provided with detailed information concerning how they were obtained

To make experimental data computationally accessible we need ontologies to describe the data

(1) from the point of view of their relation to reality

(2) from the point of view of their relation to experiments
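The two points of view can be sketched as two separate annotation fields on each datum. This is a minimal illustration, not a schema from any of the ontologies discussed here; the class name, field names and sample values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AnnotatedMeasurement:
    """One datum with both kinds of description (hypothetical schema)."""
    value: float
    unit: str
    about: str        # (1) relation to reality: what the datum describes
    obtained_by: str  # (2) relation to experiment: how the datum was produced

def is_interpretable(m: AnnotatedMeasurement) -> bool:
    """Treat a datum as interpretable only if both annotations are present."""
    return bool(m.about.strip()) and bool(m.obtained_by.strip())

# Hypothetical sample values, for illustration only.
complete = AnnotatedMeasurement(2.4, "fold change",
                                "expression of hypothetical gene X",
                                "hypothetical microarray assay")
bare = AnnotatedMeasurement(2.4, "fold change", "", "")

print(is_interpretable(complete), is_interpretable(bare))
```

In practice the two string fields would hold terms drawn from a domain ontology and an experiment ontology respectively, rather than free text.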

Three solutions

The MGED Ontology

OBI: The Ontology for Biomedical Investigations

EXPO: The Experiment Ontology

MGED (Microarray Gene Expression Data) Ontology

MGED Ontology

Individual =def. name of the individual organism from which the biomaterial was derived

Experiment =def. The complete set of bioassays and their descriptions performed as an experiment for a common purpose. ... An experiment will be often equivalent to a publication.

MGED Ontology

Chromosome =def. An abstraction used for annotation

Chromosome =def. A biological sequence that can be placed on an array
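A term that carries two different definitions is the kind of inconsistency a simple automated check can surface. The sketch below uses the definitions quoted on these slides; the function name and the normalization step are assumptions made for illustration.

```python
from collections import defaultdict

# Definitions as quoted on the slides.
DEFINITIONS = [
    ("Individual", "name of the individual organism from which the biomaterial was derived"),
    ("Chromosome", "An abstraction used for annotation"),
    ("Chromosome", "A biological sequence that can be placed on an array"),
]

def conflicting_terms(definitions):
    """Return, in sorted order, terms that carry more than one distinct definition."""
    by_term = defaultdict(set)
    for term, text in definitions:
        # Normalize case and surrounding whitespace before comparing.
        by_term[term].add(text.strip().lower())
    return sorted(term for term, texts in by_term.items() if len(texts) > 1)

print(conflicting_terms(DEFINITIONS))
```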

OBI

The Ontology for Biomedical Investigations

with thanks to Trish Whetzel and Richard Scheuermann

Purpose of OBI

To provide a resource for the unambiguous description of the components of biomedical investigations, such as the design, protocols and instrumentation, material, data, and the types of analysis and statistical tools applied to the data

NOT designed to model biology

Hypothesis

That it is possible to create ontology resources of genuine utility by drawing on logical and philosophical principles, e.g. those pertaining to consistency of definitions and avoidance of use-mention confusions.
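A use-mention confusion arises when a definition describes a word or a representation instead of the entity itself, as in the MGED definition of Individual as a "name". A crude heuristic check might look like the following. The list of marker words is an assumption chosen for illustration, and a pattern match of this kind will miss many cases.

```python
import re

# Hypothetical marker words: a definition that opens with one of these
# describes a name or representation rather than the entity being defined.
MENTION_MARKERS = re.compile(
    r"^(an?\s+|the\s+)?(name|term|label|abstraction)\b", re.IGNORECASE
)

def looks_like_use_mention_confusion(definition: str) -> bool:
    """Flag definitions that start by talking about a name or representation."""
    return MENTION_MARKERS.match(definition.strip()) is not None

print(looks_like_use_mention_confusion(
    "name of the individual organism from which the biomaterial was derived"))
```

The check flags "An abstraction used for annotation" as well, but not "A biological sequence that can be placed on an array", so it is a screening aid at best and no substitute for review by a human curator.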

OBI Collaborating Communities

Crop sciences Generation Challenge Programme (GCP)
Environmental genomics MGED RSBI Group, www.mged.org/Workgroups/rsbi
Genomic Standards Consortium (GSC), www.genomics.ceh.ac.uk/genomecatalogue
HUPO Proteomics Standards Initiative (PSI), psidev.sourceforge.net
Immunology Database and Analysis Portal, www.immport.org
Immune Epitope Database and Analysis Resource (IEDB), http://www.immuneepitope.org/home.do
International Society for Analytical Cytology, http://www.isac-net.org/
Metabolomics Standards Initiative (MSI)
Neurogenetics, Biomedical Informatics Research Network (BIRN)
Nutrigenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi
Polymorphism
Toxicogenomics MGED RSBI Group, www.mged.org/Workgroups/rsbi
Transcriptomics MGED Ontology Group

OBI – Tools and Documentation

Open source, standards compliant and version managed
• Web Ontology Language (OWL) using the Protégé editor
• OBI.owl files are available from the OBI SVN Repository
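An OWL file serialized as RDF/XML can be inspected with nothing more than a standard XML parser. The sketch below reads class labels from a small inline fragment. The fragment is hypothetical: its IRIs and labels are placeholders and are not actual OBI content. The function handles only top-level `owl:Class` elements, which is a simplifying assumption.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical RDF/XML fragment; IRIs and labels are placeholders, not OBI content.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="http://example.org/onto#Investigation">
    <rdfs:label>investigation</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/onto#Protocol">
    <rdfs:label>protocol</rdfs:label>
  </owl:Class>
</rdf:RDF>"""

def class_labels(rdf_xml: str) -> dict:
    """Map each top-level owl:Class IRI to its rdfs:label."""
    root = ET.fromstring(rdf_xml)
    labels = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")
        label = cls.findtext(f"{{{RDFS}}}label")
        if iri and label:
            labels[iri] = label
    return labels

print(class_labels(SAMPLE))
```

For real work a dedicated OWL toolkit would be the better choice, since OWL files may use other serializations and nested class expressions that this sketch ignores.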

The Problem of Clinical Investigations

Regulatory bodies such as the FDA need to assess the evidentiary value of enormous volumes of data collected e.g. in trials on specific drug formulations

For this, they need to impose standardization of terminologies used to express these data, e.g. as developed by the Clinical Data Interchange Standards Consortium (CDISC)

Clinical Investigations terminologies

“Study Design”

Descriptive research
– Case study – description of one or more patients
– Developmental research – description of pattern of change over time
– Qualitative research – gathering data through interview or observation

Exploratory research
– Secondary analysis – exploring new relationships in old data
– Historical research – reconstructing the past through an assessment of archives or other records

Experimental research
– Randomized clinical trial
– Meta-analysis – statistically combining findings from several different studies to obtain a summary analysis

“Population”

Recruited population
– Randomized population
– Eligible population
– Screened population
– Premature termination population

Excluded population
– Excluded post-randomization population
– Not-eligible-population

Analyzed population
– Study arm population
– Crossover population
– Subgroup population
– Intent-to-treat population - based on randomization
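The "Population" terminology above can be held as a small parent-child table and queried. Reading the nesting of the list as an is_a (subtype) relation is an assumption; the source terminology does not state which relation the indentation is meant to express.

```python
# Parent links transcribed from the "Population" list.
# Treating the nesting as is_a is an assumption made for this sketch.
IS_A = {
    "Randomized population": "Recruited population",
    "Eligible population": "Recruited population",
    "Screened population": "Recruited population",
    "Premature termination population": "Recruited population",
    "Excluded post-randomization population": "Excluded population",
    "Not-eligible-population": "Excluded population",
    "Study arm population": "Analyzed population",
    "Crossover population": "Analyzed population",
    "Subgroup population": "Analyzed population",
    "Intent-to-treat population": "Analyzed population",
}

def subtypes(parent: str) -> list:
    """Return the direct children of a population type, sorted by name."""
    return sorted(child for child, p in IS_A.items() if p == parent)

print(subtypes("Excluded population"))
```

Making the relation explicit in this way also exposes questions the flat list hides, for instance whether a randomized population really is a kind of recruited population or rather a part of one.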

Overview of OCI

Meta-analysis (CDISC)
Quality assurance (CDISC)
Quality control (CDISC)
Baseline assessment (CDISC)
Validation (CDISC)
Coding (MUSC)
Permuted block randomization (MUSC)
Secondary-study-protocol (RCT)
Intervention-step (RCT)
Blinding-method (RCT)

Study design

Development plan (CDISC)
Standard operating procedures (CDISC)
Statistical analysis plan (CDISC)

Negative findings (MUSC)
Positive findings (MUSC)
Primary-outcome (RCT)
Secondary-outcome (RCT)

EXPO

The Ontology of Experiments

L. Soldatova, R. King

Department of Computer Science, The University of Wales, Aberystwyth

EXPO: Experiment Ontology

experimental actions part_of experimental design

subject of experiment part_of experimental design
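The two part_of assertions can be stored as subject-relation-object triples and queried. This is a minimal sketch; the function name is hypothetical and only the two assertions from the slide are encoded.

```python
# The two assertions from the slide, as (subject, relation, object) triples.
TRIPLES = [
    ("experimental actions", "part_of", "experimental design"),
    ("subject of experiment", "part_of", "experimental design"),
]

def parts_of(whole, triples):
    """List everything asserted to be part_of the given whole."""
    return [s for s, rel, o in triples if rel == "part_of" and o == whole]

print(parts_of("experimental design", TRIPLES))
```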

Role of Philosophy of Science

EXPO: Experiment Ontology

Towards Empirical Philosophy of Science

• rational statistical models of induction
• case-based / domain-based reasoning
• falsificationism
• Humeanism vs. laws
• logical, relative frequency, Bayesian, objective (chance) and epistemic theories of probability

These generate different ontologies of scientific evidence

– which one is correct?
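That different theories of probability yield different accounts of the same evidence can be seen in a small worked example. The sketch compares a relative-frequency estimate with a Bayesian posterior mean. The data (7 successes in 10 trials) are hypothetical, and the uniform Beta(1, 1) prior is an assumption; other priors would give other values.

```python
from fractions import Fraction

def relative_frequency(successes: int, trials: int) -> Fraction:
    """Frequentist estimate: the observed proportion of successes."""
    return Fraction(successes, trials)

def bayesian_posterior_mean(successes: int, trials: int) -> Fraction:
    """Posterior mean under a uniform Beta(1, 1) prior (Laplace's rule of succession)."""
    return Fraction(successes + 1, trials + 2)

# Hypothetical evidence: 7 successes in 10 trials.
print(relative_frequency(7, 10))       # 7/10
print(bayesian_posterior_mean(7, 10))  # 2/3
```

The same observations thus support two different numerical statements, and an ontology of scientific evidence has to say what each of those statements is about.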

Environment Ontology +

Phenotypic Quality Ontology +

Ontology for Personalized and Community Medicine

‘Racial’ Phenotypes: Social, Phylogenetic, Essentialistic ...

Ontology for Personalized and Community Medicine

to support studies of differential effects on health

1. of environmental qualities of different neighborhoods, and

2. of different community behavior phenotypes