
The CROP - wiki.plantontology.orgwiki.plantontology.org/images/8/89/Smith_CROP-9-13-12.pdf · The...

Date post: 07-Oct-2020
The CROP (Common Reference Ontologies for Plants) Initiative Barry Smith September 13, 2013 http://ontology.buffalo.edu/smith 1
Page 1: The CROP - wiki.plantontology.orgwiki.plantontology.org/images/8/89/Smith_CROP-9-13-12.pdf · The CROP (Common Reference Ontologies for Plants) Initiative ... standardized terminologies

The CROP (Common Reference Ontologies for Plants) Initiative

Barry Smith

September 13, 2013
http://ontology.buffalo.edu/smith


Agenda

• The OBO Foundry
• Principles
• Reference ontologies vs. application ontologies
• Other ontology consortia
• The CROP Initiative
• Examples of ontologies within CROP


On June 22, 1799, in Paris, everything changed


International System of Units


How to find data?

How to find other people’s data?

How to reason with data when you find it?

How to work out what data does not yet exist?


How to solve the problem of making the data we find queryable and re-usable by others?

Part of the solution must involve: standardized terminologies and coding schemes


But there are multiple kinds of standardization for biological data, and they do not work well together

Proposed solution: Ontology-based annotation of data


ontologies = standardized labels designed for use in annotations, to make the data cognitively accessible to human beings and algorithmically accessible to computers


ontologies = high-quality, controlled, structured vocabularies for the annotation (description) of data, images, journal articles …


Ramirez et al. Linking of Digital Images to Phylogenetic Data Matrices Using a Morphological Ontology. Syst. Biol. 56(2):283–294, 2007


what cellular component?

what molecular function?

what biological process?

ontologies used in curation of literature


Proposed framework: the Semantic Web

• HTML demonstrated the power of the Web to allow sharing of information

• can we use semantic technology to create a Web 2.0 which would allow algorithmic reasoning with online information based on a common Web Ontology Language (OWL)?

• can we use netcentricity, common URLs, to break down silos and create useful integration of online data and information?


Ontology success stories, and some reasons for failure

A fragment of the “Linked Open Data” in the biomedical domain


http://bioportal.bioontology.org/


The more ontology-building is successful, the more it fails

OWL breaks down data silos via controlled vocabularies for the description of data dictionaries

Unfortunately the very success of this approach led to the creation of multiple, new, semantic silos – because multiple ontologies are being created in ad hoc ways


http://bioportal.bioontology.org/

Many ontologies in BioPortal are created by importing content from existing ontologies and giving the imported terms new names and new IDs

The result is chaos, with bits and pieces of the same ontologies chopped in multiple different places.

Leads to massively redundant effort, forking and doom


• It is easier to write useful software if one works with a simplified model

• (“…we can’t know what reality is like in any case; we only have our concepts…”)

• This looks like a useful model to me

• (One week goes by:) This other thing looks like a useful model to him

• Data in Pittsburgh does not interoperate with data in Vancouver

• Science is siloed

A standard engineering methodology


A good solution to this silo problem must be:

• modular

• incremental

• independent of hardware and software

• bottom-up

• evidence-based

• revisable

• incorporate a strategy for motivating potential developers and users


Uses of ‘ontology’ in PubMed abstracts


main reason for GO’s success

Gene Ontology and associated databases

“make it possible to systematically dissect large gene lists in an attempt to assemble a summary of the most enriched and pertinent biology” (PMC2615629)


GO provides a controlled system of terms for use in annotating (describing, tagging) data

• multi-species, multi-disciplinary, open source

• contributing to the cumulativity of scientific results obtained by distinct research communities

• compare use of kilograms, meters, seconds in formulating experimental results


GO is 3 ontologies

biological process

cellular component

molecular function


Top-Level Architecture

Continuant
– Independent Continuant
– Dependent Continuant

Occurrent (Process, Event)

[Diagram levels: universals; instances]


Problem with the GO

• it covers only three types of entities

• no diseases

• no laboratory artifacts

• no anatomy (above the cell)

• only species-terms for development

• no phenotypes


The Open Biomedical Ontologies (OBO) Foundry

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)


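rationale of OBO Foundry coverage

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Organism-Level Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)
– occurrent: Cellular Process (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RNAO, PRO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)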


a shared portal for (so far) 58 ontologies (low regimentation)

http://obo.sourceforge.net NCBO BioPortal

First step (2001)


OBO builds on the principles successfully implemented by the GO

recognizing that ontologies need to be developed in tandem


The OBO Foundry
http://obofoundry.org/

Second step (2006)


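Building out from the original GO

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)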


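initial OBO Foundry coverage

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Organism-Level Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)
– occurrent: Cellular Process (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)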


OBO Foundry Principles

common formal architecture

clearly delineated content (redundant – overlaps with orthogonality)

the ontology is well-documented (– overlaps with rules for definitions; needs expanding, for developers, for users, minimal metadata)

plurality of independent users

single locus of authority, trackers, help desk


OBO Foundry Principles

textual definitions plus formal definitions

all definitions should be of the genus-species form

A =def. a B which Cs

where B is the parent term of A in the ontology hierarchy

• formal definitions use OBO format or OWL
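
The definition pattern on this slide can be illustrated with a small sketch. The `Term` class, the `textual_definition` helper, and the example wording are hypothetical and are not part of any OBO tool; they only show how a definition of the form "A =def. a B which Cs" might be assembled from a term's parent and its differentiating clause.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """A hypothetical ontology term with a single is_a parent."""
    label: str
    parent: Optional["Term"] = None
    differentia: str = ""  # the "which Cs" clause


def article(word: str) -> str:
    """Pick 'a' or 'an' using a simple first-letter heuristic."""
    return "an" if word[:1].lower() in "aeiou" else "a"


def textual_definition(term: Term) -> str:
    """Render 'A =def. a B which Cs', where B is the parent term of A."""
    if term.parent is None:
        raise ValueError(f"{term.label!r} is a root term and has no parent")
    genus = term.parent.label
    return f"{term.label} =def. {article(genus)} {genus} which {term.differentia}"


# Illustrative terms only; the wording is not taken from a released ontology.
cell = Term("cell")
plant_cell = Term("plant cell", parent=cell, differentia="is part of a plant")

print(textual_definition(plant_cell))
```

Because the parent term is read from the hierarchy rather than typed by hand, the textual definition and the is_a link cannot drift apart in this sketch.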


Orthogonality

• For each domain, there should be convergence upon a single ontology that is recommended for use by those who wish to become involved with the Foundry initiative

• Part of the goal here is to avoid the need for mappings – which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change

• Orthogonality means:
– everyone knows where to look to find out how to annotate each kind of data
– everyone knows where to look to find content for application ontologies


Orthogonality = non-redundancy for the reference ontologies inside the Foundry

• application ontologies can overlap, but then only in those areas where common coverage is supplied by a reference ontology
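
A rough way to picture a non-redundancy check is sketched below. This is an illustration under stated assumptions, not a Foundry procedure: the ontology names (`ONT_A`, `ONT_B`, `ONT_C`) and their labels are made up, and matching on normalized labels is only a crude proxy for real overlap between ontologies.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_overlaps(ontologies: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return each normalized label claimed by more than one ontology."""
    owners: Dict[str, List[str]] = defaultdict(list)
    for name, labels in ontologies.items():
        for label in labels:
            owners[label.strip().lower()].append(name)
    return {label: sorted(names) for label, names in owners.items() if len(names) > 1}


# Hypothetical reference ontologies, each given as a set of term labels.
REFERENCE_ONTOLOGIES = {
    "ONT_A": {"nucleus", "membrane"},
    "ONT_B": {"leaf", "root"},
    "ONT_C": {"Nucleus", "stem"},
}

overlaps = find_overlaps(REFERENCE_ONTOLOGIES)
print(overlaps)
```

An empty result would be the orthogonal case; any entry flags a label that two reference ontologies both try to cover.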


COMMON FORMAL ARCHITECTURE: The ontology uses relations which are unambiguously defined following the pattern of definitions laid down in the Basic Formal Ontology (BFO)

http://www.ifomis.uni-saarland.de/bfo/

‘formal’= domain neutral

PRINCIPLES


Basic Formal Ontology

Continuant
– Independent Continuant: cell component
– Dependent Continuant: molecular function

Occurrent: biological process


OBO Foundry

provides guidelines (traffic laws) to new groups of ontology developers in ways which can counteract current dispersion of effort


New principle: Employ the methodology of cross-products

compound terms in ontologies are to be defined as cross-products of simpler terms: e.g., elevated blood glucose is a cross-product of PATO: increased concentration with FMA: blood and ChEBI: glucose.

= factoring out of ontologies into discipline-specific modules (orthogonality)
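
The elevated blood glucose example can be written down as a small data structure. This is a minimal sketch: the `TermRef` and `CrossProduct` classes are hypothetical, and terms are referenced by ontology prefix and label only, since the slide gives no term IDs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TermRef:
    """Reference to a simple term in a source ontology (label only, no ID)."""
    ontology: str
    label: str


@dataclass(frozen=True)
class CrossProduct:
    """A compound term defined as a combination of simpler terms."""
    label: str
    parts: Tuple[TermRef, ...]

    def source_ontologies(self) -> List[str]:
        """List the ontologies whose terms this compound term reuses."""
        return sorted({part.ontology for part in self.parts})

    def describe(self) -> str:
        """Render the compound term as 'label = ONT: term x ONT: term ...'."""
        rendered = " x ".join(f"{p.ontology}: {p.label}" for p in self.parts)
        return f"{self.label} = {rendered}"


elevated_blood_glucose = CrossProduct(
    label="elevated blood glucose",
    parts=(
        TermRef("PATO", "increased concentration"),
        TermRef("FMA", "blood"),
        TermRef("ChEBI", "glucose"),
    ),
)

print(elevated_blood_glucose.describe())
print(elevated_blood_glucose.source_ontologies())
```

The compound term carries no content of its own beyond the combination, which is what lets each discipline-specific module be maintained separately.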


The methodology of cross-products

enforcing use of common relations in linking terms drawn from Foundry ontologies serves

• to ensure that the ontologies are maintained and revised in tandem

• logically defined relations serve to bind terms in different ontologies together to create a network


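Building out from the original GO

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)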


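Population-level ontologies

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

COMPLEX OF ORGANISMS
– independent continuant: Family, Community, Deme, Population
– dependent continuant: Population Phenotype
– occurrent: Population Process

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)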


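Environment Ontology

RELATION TO TIME (columns): CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT
GRANULARITY (rows):

ORGAN AND ORGANISM
– independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
– dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
– occurrent: Biological Process (GO)

CELL AND CELLULAR COMPONENT
– independent continuant: Cell (CL); Cellular Component (FMA, GO)
– dependent continuant: Cellular Function (GO)

MOLECULE
– independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
– dependent continuant: Molecular Function (GO)
– occurrent: Molecular Process (GO)

[Diagram label: environments]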


Extension Strategy + Modular Organization

top level
– Basic Formal Ontology (BFO)

mid-level
– Information Artifact Ontology (IAO)
– Ontology for Biomedical Investigations (OBI)
– Spatial Ontology (BSPO)

domain level
– Anatomy Ontology (FMA*, CARO)
– Environment Ontology (EnvO)
– Infectious Disease Ontology (IDO*)
– Biological Process Ontology (GO*)
– Cell Ontology (CL)
– Cellular Component Ontology (FMA*, GO*)
– Phenotypic Quality Ontology (PaTO)
– Subcellular Anatomy Ontology (SAO)
– Sequence Ontology (SO*)
– Molecular Function (GO*)
– Protein Ontology (PRO*)


Third step: Creation of new ontology consortia, modeled on the OBO Foundry

OBO Foundry: Open Biological and Biomedical Ontologies

NIF Standard: Neuroscience Information Framework

eagle-i: Ontologies used by VIVO and CTSAconnect

IDO Consortium: Infectious Disease Ontology


A good solution to the silo problem must be:

• modular

• incremental

• independent of software and hardware

• bottom-up

• evidence-based

• revisable

• incorporate a strategy for motivating potential developers and users


Because the ontologies in the Foundry are built as orthogonal modules which form an incrementally evolving network

• scientists are motivated to commit to developing ontologies because they will need in their own work ontologies that fit into this network

• users are motivated by the assurance that the ontologies they turn to are maintained by experts


More benefits of orthogonality

• helps those new to ontology to find what they need

• to find models of good practice

• ensures mutual consistency of ontologies (trivially)

• and thereby ensures additivity of annotations


More benefits of orthogonality

• it rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes

• thereby brings an obligation on the part of ontology developers to commit to scientific accuracy and domain-completeness


More benefits of orthogonality

• No need to reinvent the wheel for each new domain

• Can profit from storehouse of lessons learned

• Can more easily reuse what is made by others

• Can more easily reuse training

• Can more easily inspect and criticize results of others’ work

• Leads to innovations (e.g. MIREOT, OntoFox) in strategies for combining ontologies


Reference Ontologies vs. Application Ontologies

Reference ontology = an ontology that captures generic content and is designed for aggressive reuse in multiple different types of context. Our assumption is that most reference ontologies will be created manually on the basis of explicit assertion of the taxonomical and other relations between their terms.


Reference Ontologies vs. Application Ontologies

By ‘application ontology’ we mean an ontology that is tied to specific local applications. Each application ontology is created by using ontology merging software to combine new, local content with generic content taken over from relevant reference ontologies

Xiang, et al., “OntoFox: Web-Based Support for Ontology Reuse”, BMC Research Notes. 2010, 3:175.
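
The idea of combining local content with content taken over from reference ontologies can be sketched as follows. This is not OntoFox's algorithm or API; the `build_application_ontology` function, the `REFA`/`REFB`/`APP` prefixes, and all term IDs are hypothetical. The point illustrated is only that imported terms keep their original IDs instead of being given new ones.

```python
from typing import Dict, Iterable

Ontology = Dict[str, str]  # term ID -> label


def build_application_ontology(
    local_terms: Ontology,
    references: Dict[str, Ontology],
    imports: Iterable[str],
) -> Ontology:
    """Combine local terms with terms imported from reference ontologies.

    Imported terms keep their original IDs and labels, so data annotated
    with the application ontology stays comparable with data annotated
    directly against the reference ontologies.
    """
    merged = dict(local_terms)
    for term_id in imports:
        prefix = term_id.split(":", 1)[0]
        source = references.get(prefix, {})
        if term_id not in source:
            raise KeyError(f"{term_id} not found in reference ontology {prefix!r}")
        merged[term_id] = source[term_id]
    return merged


# Hypothetical reference ontologies and local content.
REFERENCES = {
    "REFA": {"REFA:0000001": "leaf", "REFA:0000002": "root"},
    "REFB": {"REFB:0000001": "increased size"},
}
LOCAL = {"APP:0000001": "field trial plot"}

app = build_application_ontology(LOCAL, REFERENCES, ["REFA:0000001", "REFB:0000001"])
print(app)
```

Only the terms that the application needs are pulled in, and a request for a term that the reference ontology does not contain fails loudly rather than minting a local substitute.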


Normalization of the ontology space – content from reference ontologies is maximally re-used, e.g. in formulation of compound terms and of cross-product definitions

(Compare normalization of a vector space)

(Compare, again, SI System of Units)


International System of Units


Infectious Disease Ontology (IDO)


We have data, e.g.:

• TBDB: Tuberculosis Database, including Microarray data

• VFDB: Virulence Factor DB

• TropNetEurop Dengue Case Data

• ISD: Influenza Sequence Database at LANL

• MPD/MRD/CPP: Protein Data of PIR Resource Center for Biodefense Proteomics Research

• PathPort: Pathogen Portal Project


Purpose of Infectious Disease Ontology (IDO)

• Retrieval and integration of infectious disease relevant data
– Sequence and protein data for pathogens
– Case report data for patients
– Clinical trial data for drugs, vaccines
– Epidemiological data for surveillance, prevention
– ...

• Goal: to make data deriving from different sources comparable and computable


IDO Strategy

• Reference ontology (IDO Core) with terms relevant to any infectious disease

• Disease- and organism-specific application ontologies
– for different types of host, types of vector, types of pathogen, types of disease


Infectious Disease Ontology (IDO)

• Member of the OBO Foundry

• A suite of ontologies
– IDO Core:
  • General terms in the ID domain.
  • A hub for all IDO extensions.
– IDO Extensions:
  • Disease specific.
  • Developed by subject matter experts.

• Provides:
– Clear, precise, and consistent natural language definitions
– Computable logical representations (OWL, OBO)


How IDO evolves

CORE and SPOKES: Domain ontologies

SEMI-LATTICE: By subject matter experts in different communities of interest.

[Diagram: IDOCore at the hub, with the extensions IDOSa, IDOHumanSa, IDORatSa, IDOStrep, IDORatStrep, IDOHumanStrep, IDOMRSa, IDOHumanBacterial, IDOAntibioticResistant, IDOMAL, IDOHIV, IDOFLU]


IDO Process Model


Sample Application: A lattice of infectious disease application ontologies from NARSA isolate data

• Expose value of Genotype-Phenotype Linked Data by converting a free-text database from NARSA (Network on Antimicrobial Resistance in Staphylococcus aureus) into a computational resource


Ways of differentiating Staphylococcus aureus infectious diseases

• Infectious Disease
– By host type
– By (sub-)species of pathogen
– By antibiotic resistance
– By anatomical site of infection

• Bacterial Infectious Disease
– By PFGE (Strain)
– By MLST (Sequence Type)
– By BURST (Clonal Complex)

• Sa Infectious Disease
– By SCCmec type
  • By ccr type
  • By mec class
– spa type

http://www.sccmec.org/Pages/SCC_ClassificationEN.html


ido.owl

narsa.owl

narsa-isolates.owl

ndf-rt

NRS701’s resistance to clindamycin


Further extensions of IDO

• Vaccine (Vaccine Ontology)

• Plant IDO

from ICBO 2012:


Founding CROP


The ontologies in CROP

General ontologies taken over from OBO Foundry
• ChEBI Chemistry ontology
• GO Gene Ontology
• PRO Protein Ontology
• ENVO Environment Ontology + GAZ Gazetteer built on ontological principles
• PATO Phenotype Ontology


Plant specific ontologies to be developed by CROP group

PO Plant Ontology
TO Trait Ontology
EO Plant Environment Ontology
Plant IDO
Plant Disease

Action items:
fix relation between EnvO and EO
fix relation between PATO and TO


Taxonomy resource (for diseases of host and causal organisms + vectors/secondary hosts)

NCBI Taxonomy has most of the hosts, but not the viruses


Next steps in CROP:

PRO-PO-GO Meeting
Buffalo, Spring 2013

PRO = protein ontology

PO = plant ontology

GO = gene ontology


The Environment Ontology

• OBO Foundry
• Genomic Standards Consortium
• Natural Environment Research Council (UK)
• USDA, Gramene, J. Craig Venter Institute ...


Applications of EnvO in biology


How EnvO currently works for information retrieval

Retrieve all experiments on organisms obtained from:
– deep-sea thermal vents
– arctic ice cores
– rainforest canopy
– alpine melt zone

Retrieve all data on organisms sampled from:
– hot and dry environments
– cold and wet environments
– a height above 5,000 meters

Retrieve all the omic data from soil organisms subject to:
– moderate heavy metal contamination
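
The retrieval patterns above depend on samples being annotated with ontology terms, so that a query for a general environment class also returns data tagged with its more specific subclasses. A minimal sketch of that idea follows. The term names, the `IS_A` hierarchy, and the experiment records are illustrative assumptions, not actual EnvO classes, identifiers, or data.

```python
# Toy is_a hierarchy (child -> parent). Term names are illustrative
# assumptions, not actual EnvO classes or identifiers.
IS_A = {
    "deep-sea thermal vent": "marine environment",
    "arctic ice core": "cold environment",
    "alpine melt zone": "cold environment",
    "rainforest canopy": "terrestrial environment",
    "marine environment": "environment",
    "cold environment": "environment",
    "terrestrial environment": "environment",
}

# Hypothetical experiment records, each annotated with the most
# specific environment term known for the sample.
EXPERIMENTS = [
    {"id": "exp-1", "sampled_from": "deep-sea thermal vent"},
    {"id": "exp-2", "sampled_from": "arctic ice core"},
    {"id": "exp-3", "sampled_from": "alpine melt zone"},
    {"id": "exp-4", "sampled_from": "rainforest canopy"},
]


def ancestors(term):
    """Return the term itself plus every class it is transitively a kind of."""
    chain = [term]
    while term in IS_A:
        term = IS_A[term]
        chain.append(term)
    return chain


def retrieve(query_term, records=EXPERIMENTS):
    """Return ids of records whose annotation is the query term or a subclass of it."""
    return [r["id"] for r in records if query_term in ancestors(r["sampled_from"])]


if __name__ == "__main__":
    print(retrieve("cold environment"))
    print(retrieve("environment"))
```

A query for the general term "cold environment" here returns both the ice-core and the melt-zone experiments, even though neither record mentions that term directly.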


extending EnvO to clinical and translational research

• we have public health, community and population data

• we need to make this data available for search and algorithmic processing

• we create a consensus-based ontology which can interoperate with ontologies for neighboring domains of medicine and basic biology


Environment = totality of circumstances external to a living organism or group of organisms
– pH
– evapotranspiration
– turbidity
– available light
– predominant vegetation
– predatory pressure
– nutrient limitation …


extend EnvO to the clinical domain
– dietary patterns (Food Ontology: FAO, USDA) ... allergies
– neighborhood patterns
• built environment, living conditions
• climate
• social networking
• crime, transport
• education, religion, work
• health, hygiene
– disease patterns
• bio-environment (bacteriological, ...)
• patterns of disease transmission (links to IDO)


Aligning EnvO to the Basic Formal Ontology


habitat

• Habitat =def. An ecosystem which can support the life of a given organism, population, or community

• Realized niche =def. An ecosystem which is that part of a habitat which supports the life of a given organism, population or community


Aligning EnvO to the Basic Formal Ontology


Hutchinsonian niche (niche as volume in a functionally defined hyperspace)

• =def. an n-dimensional hyper-volume whose dimensions correspond to resource gradients over which species are distributed
– degree of slope, exposure to sunlight, soil fertility, foliage density, salinity...
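
The hyper-volume definition above can be made concrete with a small sketch. It assumes, purely for illustration, that the niche is a hyper-rectangle: one tolerated interval per resource gradient. The general definition does not require a rectangular shape. The dimension names, units, and numeric ranges are hypothetical.

```python
from math import prod

# Hypothetical niche: one tolerated (low, high) interval per resource gradient.
# Modeling the niche as a hyper-rectangle is a simplifying assumption.
NICHE = {
    "slope_degrees": (0.0, 30.0),
    "sunlight_hours": (4.0, 10.0),
    "salinity_ppt": (0.0, 5.0),
}


def in_niche(conditions, niche=NICHE):
    """True if the site's conditions fall inside the interval on every dimension."""
    return all(
        dim in conditions and low <= conditions[dim] <= high
        for dim, (low, high) in niche.items()
    )


def hypervolume(niche=NICHE):
    """Volume of the hyper-rectangle: the product of the interval widths."""
    return prod(high - low for low, high in niche.values())


if __name__ == "__main__":
    site = {"slope_degrees": 12.0, "sunlight_hours": 6.0, "salinity_ppt": 1.0}
    print(in_niche(site))
    print(hypervolume())
```

Each added gradient adds one dimension to the dictionary, which is the sense in which the niche is n-dimensional.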


G.E. Hutchinson (1957, 1965)


Aligning EnvO to the Basic Formal Ontology

part_of
