Page 1: VDOS2013-Zhe-Slides

Auditing Redundant Import in Reuse of a Top Level Ontology for the Drug Discovery Investigations Ontology (DDI)

Zhe He¹, Christopher Ochs¹, Larisa Soldatova², Yehoshua Perl¹, Sivaram Arabandi³, James Geller¹

¹New Jersey Institute of Technology, ²Brunel University, ³Ontopro LLC

ICBO 2013 Workshop on Vaccine and Drug Ontology Studies

Page 2: VDOS2013-Zhe-Slides

Outline

• Introduction

– Environment

– Motivation

– Ontology for Drug Discovery Investigations (DDI)

– Abstraction Networks & Partial Area Taxonomy

• Algorithm Hide

– Hiding Redundant BFO (Basic Formal Ontology) classes from DDI

• Future work

• Conclusions

Page 3: VDOS2013-Zhe-Slides

Environment

• BioPortal: a large repository of over 340 biomedical ontologies covering a wide range of domains.

• Many ontologies in BioPortal are released in OWL or OBO format.

• OWL (Web Ontology Language): based on Description Logic, maintained by a working group of W3C.

• OBO (Open Biological and Biomedical Ontologies) Foundry: a collaborative experiment in which ontology developers are establishing a set of principles for ontology development.

Page 4: VDOS2013-Zhe-Slides

Motivation

• Using a top-level ontology as a template for a domain ontology is recommended.

• OBO Foundry recommends importing BFO (Basic Formal Ontology).

• The top-domain ontologies OGMS (Ontology for General Medical Science) and BioTop (Beisswanger et al. 2008) reuse BFO.

• Some domain ontologies reuse OGMS, thereby indirectly reusing BFO.

Page 5: VDOS2013-Zhe-Slides

Motivation (cont.)

• Ontologies need to go through Quality Assurance before being put to use.

– This includes discovering modeling errors and inconsistencies in the design.

– Unused imported top-level classes diminish the usability of the ontology.

– Currently, there is no mechanism to remove unused imported classes.

– Redundant imported top-level classes should be hidden.

Page 6: VDOS2013-Zhe-Slides

Ontology for Drug Discovery Investigations

• DDI was developed to support automatic drug discovery investigations run by a Robot Scientist “Eve” (Qi et al. 2010).

• DDI is used for reasoning with data about the biological activity of compounds with regard to various drug targets.

• DDI uses BFO (Basic Formal Ontology) and RO (Relations Ontology) as design templates and extends BFO and OBI (Ontology for Biomedical Investigations).

• Some imported BFO classes were left unused in DDI, for example:

– connected_temporal_region

– temporal_instant

– temporal_interval

Page 7: VDOS2013-Zhe-Slides

Abstraction Networks

• An abstraction network is a secondary network that provides a compact view of the structure and content of the primary ontology.

• Abstraction of an ontology is the process by which subsets of classes are each replaced by a higher-level conceptual entity (node).

[Figure: an ontology and its abstraction network; each node of the abstraction network models a subset of classes.]

Page 8: VDOS2013-Zhe-Slides

Partial Area Taxonomy

• A partial area taxonomy is an abstraction network, developed by our research group, that summarizes sets of structurally and semantically similar classes.

• Partial area taxonomies have been derived for

– SNOMED CT (Wang et al. 2007)

– Ontology of Clinical Research (OCRe) (Ochs et al. 2012)

– Sleep Domain Ontology (SDO) (Ochs et al. 2013)

– Cancer Chemoprevention Ontology (CanCo) (He et al. 2013)

– etc.

Page 9: VDOS2013-Zhe-Slides

Area Taxonomy

Area: Set of all classes that are explicitly defined or inferred as being in exactly the domain of a given set of object properties.

Page 10: VDOS2013-Zhe-Slides

Partial Area Taxonomy

Root: Class with no superclasses in area

Partial area: Root + all descendants in area
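
The area and partial area definitions can be illustrated with a minimal Python sketch. It assumes a tree-shaped hierarchy stored as a child-to-parent map and treats a class as being in the domain of a property either explicitly or by inheritance from an ancestor. The toy classes and properties (`assay`, `compound`, `has_target`, `has_activity`) and all function names are hypothetical and are not taken from DDI.

```python
from collections import defaultdict

# Toy hierarchy (hypothetical): child -> parent; the root maps to None.
PARENT = {
    "Entity": None,
    "continuant": "Entity",
    "occurrent": "Entity",
    "assay": "occurrent",
    "screening_assay": "assay",
    "compound": "continuant",
}

# Object properties explicitly declared with a class as domain (hypothetical).
EXPLICIT_DOMAIN = {
    "assay": {"has_target"},
    "compound": {"has_activity"},
}


def properties_of(cls):
    """Properties whose domain includes cls, explicitly or via an ancestor."""
    props = set()
    while cls is not None:
        props |= EXPLICIT_DOMAIN.get(cls, set())
        cls = PARENT[cls]
    return frozenset(props)


def areas():
    """Group classes by their exact set of object properties."""
    result = defaultdict(set)
    for cls in PARENT:
        result[properties_of(cls)].add(cls)
    return dict(result)


def partial_areas(area):
    """Map each root of an area to the root plus its descendants in the area."""
    def root_of(cls):
        # Climb while the parent is still inside the same area.
        while PARENT[cls] in area:
            cls = PARENT[cls]
        return cls

    result = defaultdict(set)
    for cls in area:
        result[root_of(cls)].add(cls)
    return dict(result)


all_areas = areas()
root_area = all_areas[frozenset()]   # classes with no object properties
print(partial_areas(root_area))
```

In this toy example the classes without object properties form one area with the single root `Entity`, which corresponds to the kind of root partial area that Hide operates on.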

Page 11: VDOS2013-Zhe-Slides

Algorithm Hide

• Hide is a post-order recursive algorithm that runs in linear time.

• Hide identifies imported classes that are not used in the domain ontology.

• Applicability:

– Ontologies in OWL or OBO format

– Both domain ontology and top-level ontology are trees.

– Top-level ontology does not have object properties.

• A class is redundant if it is:

– Imported from the top-level ontology AND

– In the root partial area of the taxonomy AND

– A leaf in the domain ontology (at some stage of the algorithm) AND

– Not used as the range of an object property
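
The applicability condition that both hierarchies are trees can be checked before running Hide. The following Python sketch assumes the hierarchy is available as a map from each class to the list of its direct superclasses, with every superclass also present as a key. The function name `is_tree` and the toy data are hypothetical.

```python
def is_tree(parents):
    """Return True if the hierarchy has exactly one root, every other class
    has exactly one direct superclass, and there are no cycles.

    parents: dict mapping each class to a list of its direct superclasses.
    """
    roots = [c for c, ps in parents.items() if not ps]
    if len(roots) != 1:
        return False
    if any(len(ps) > 1 for ps in parents.values()):
        return False  # multiple inheritance: a DAG, not a tree
    # Every class must reach the root by following its single parent link.
    for start in parents:
        seen = set()
        cls = start
        while parents[cls]:
            if cls in seen:
                return False  # cycle detected
            seen.add(cls)
            cls = parents[cls][0]
    return True


tree = {"Entity": [], "continuant": ["Entity"], "occurrent": ["Entity"]}
dag = dict(tree, hybrid=["continuant", "occurrent"])
print(is_tree(tree), is_tree(dag))
```

A hierarchy that fails this check has multiple inheritance or is otherwise not a tree, and falls outside the stated applicability of Hide.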

Page 12: VDOS2013-Zhe-Slides

Partial Area Taxonomy for DDI

[Figure: partial area taxonomy for DDI]

Page 13: VDOS2013-Zhe-Slides

Entity Node of DDI Taxonomy

• 81 classes in Entity root partial area of DDI taxonomy

• BFO has 38 classes.

• 32 out of 81 classes are imported from BFO.

• The other 6 BFO classes are used as domains of object properties, so they fall outside the Entity partial area.

• Hence, we reviewed 32 classes for redundancy.

Page 14: VDOS2013-Zhe-Slides

BFO Classes in Entity Node Before Hiding

Entity (2 children)
    continuant (3 children)
        dependent_continuant (2 children)
        independent_continuant (3 children)
            material_entity (10 children)
                fiat_object_part
                object
                object_aggregate
            object_boundary
            site (3 children)
        spatial_region (4 children)
            one_dimensional_region
            two_dimensional_region
            three_dimensional_region
            zero_dimensional_region
    occurrent (3 children)
        processual_entity (6 children)
            fiat_process_part
            process (2 children)
            process_aggregate
            process_boundary
            processual_context
        spatiotemporal_region (2 children)
            connected_spatiotemporal_region (2 children)
                spatiotemporal_instant
                spatiotemporal_interval
            scattered_spatiotemporal_region
        temporal_region (2 children)
            connected_temporal_region (2 children)
                temporal_instant
                temporal_interval
            scattered_temporal_region

Legend (color coding not preserved): leaf; parent of classes that are all leaves; grandparent of grandchildren that are all leaves

Page 15: VDOS2013-Zhe-Slides

BFO Classes in Entity Partial Area After Hiding

• 18 unused BFO classes are hidden.

• That is, 18/32 = 56% of the BFO classes in the Entity partial area are hidden.

Entity (2 children)
    continuant (3 children)
        dependent_continuant (2 children)
        independent_continuant (3 children)
            material_entity (10 children)
            site (3 children)
        spatial_region (4 children)
            one_dimensional_region
            two_dimensional_region
            three_dimensional_region
            zero_dimensional_region
    occurrent (3 children)
        processual_entity (6 children)
            process (2 children)
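
The reported counts can be cross-checked against the two class listings. The short Python sketch below takes the class names from the before and after listings, computes which classes were hidden, and derives the percentage. The variable names are illustrative only.

```python
# Names use BFO's spellings (e.g. occurrent, one_dimensional_region).
BEFORE = """
Entity continuant dependent_continuant independent_continuant material_entity
fiat_object_part object object_aggregate object_boundary site spatial_region
one_dimensional_region two_dimensional_region three_dimensional_region
zero_dimensional_region occurrent processual_entity fiat_process_part process
process_aggregate process_boundary processual_context spatiotemporal_region
connected_spatiotemporal_region spatiotemporal_instant spatiotemporal_interval
scattered_spatiotemporal_region temporal_region connected_temporal_region
temporal_instant temporal_interval scattered_temporal_region
""".split()

AFTER = """
Entity continuant dependent_continuant independent_continuant material_entity
site spatial_region one_dimensional_region two_dimensional_region
three_dimensional_region zero_dimensional_region occurrent processual_entity
process
""".split()

# Classes present before hiding but absent afterwards.
hidden = [c for c in BEFORE if c not in AFTER]
percent_hidden = 100 * len(hidden) / len(BEFORE)

print(len(BEFORE), len(AFTER), len(hidden), f"{percent_hidden:.2f}%")
# prints: 32 14 18 56.25%
```

The 18 hidden classes are the three BFO children of material_entity, object_boundary, four children of processual_entity, and the whole spatiotemporal_region and temporal_region subtrees.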

Page 16: VDOS2013-Zhe-Slides

Future Work

• As many as 35 out of 186 ontologies we investigated in BioPortal reuse BFO classes.

• Some ontologies have a Directed Acyclic Graph (DAG) hierarchy, e.g. SDO (Sleep Domain Ontology) (Arabandi 2010).

• Need to consider cases where both top-level and domain ontologies are DAG hierarchies.

• Some top-domain ontologies have object properties, e.g. BioTop.

• Need to design an algorithm that handles redundant import of relationships in the reuse of top-domain ontologies.

Page 17: VDOS2013-Zhe-Slides

Conclusions

• We described a recursive linear-time algorithm for hiding unused imported top-level ontology classes of an OWL-based ontology.

• The algorithm was demonstrated by hiding 18 (56%) of the imported BFO classes in the Entity partial area of DDI.

• Hiding of unused imported top-level classes should be part of the Quality Assurance process of OWL-based ontologies.

Page 18: VDOS2013-Zhe-Slides

References

• Qi, D., R. D. King, et al. (2010). "An ontology for description of drug discovery investigations." J Integr Bioinform 7(3).

• Arabandi, S. (2010). "Developing a Sleep Domain Ontology." AMIA TBI/CRI Summit. San Francisco, CA.

• Beisswanger, E., S. Schulz, et al. (2008). "BioTop: An Upper Domain Ontology for the Life Sciences." Appl Ontology 3(4): 205-212.

• Wang, Y., et al. (2007). "Structural methodologies for auditing SNOMED." J Biomed Inform 40(5): 561-581.

• Ochs, C., A. Agrawal, et al. (2012). "Deriving an Abstraction Network to Support Quality Assurance in OCRe." AMIA Annu Symp Proc: 681-689.

• Ochs, C., Z. He, et al. (2013). "Choosing the Granularity of Abstraction Networks for Orientation and Quality Assurance of the Sleep Domain Ontology." The 4th International Conference on Biomedical Ontology Proc.

• He, Z., C. Ochs, et al. (2013). "A Family-based Framework for Supporting Quality Assurance of Biomedical Ontologies in BioPortal." To appear in AMIA Annu Symp Proc.

Page 19: VDOS2013-Zhe-Slides

Thank you!

Any Questions?

Page 20: VDOS2013-Zhe-Slides

Algorithm of Hide

Algorithm Hide(R, O, T, v)
    IF isInternal(O, v) THEN
        FOR EACH Class w IN subclasses(R, v) {
            Hide(R, O, T, w)
        }
    END IF
    // Re-test v: it has become a leaf if all of its subclasses were hidden above.
    IF NOT(isInternal(O, v)) THEN
        IF isClassFrom(v, O, T) AND NOT(in_op_range(v, O)) THEN
            hide(v, O)
        END IF
    END IF
    RETURN

Main Program
    // Initially, call Hide on the root class r of the root partial area R.
    Hide(R, O, T, r)

Function Name          Function Description
isInternal(O, v)       Boolean function that returns true if class v has any subclasses in ontology O.
subclasses(R, v)       Returns an iterator over the set of subclasses of class v in root partial area R.
isClassFrom(v, O, T)   Boolean function that returns true if the class v in ontology O is imported from top-level ontology T.
in_op_range(v, O)      Boolean function that returns true if class v is in the range of an object property of ontology O.
hide(v, O)             Hides class v from ontology O and therefore also removes all subclass relationships from v.

Notation: O is the domain ontology; T is the top-level ontology; R is the root partial area of O; v is a class in O.
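
As an illustration only, the pseudocode can be turned into a small runnable Python sketch. It assumes a tree-shaped hierarchy given as a child-to-parent map. The `Ontology` container, the function names `hide_redundant` and `hide_class`, and the domain classes `compound` and `assay` are hypothetical stand-ins, not part of DDI or of any OWL library. The set of imported classes plays the role of isClassFrom, and `op_range` plays the role of in_op_range.

```python
class Ontology:
    """Toy tree-shaped ontology (hypothetical container, not an OWL API)."""

    def __init__(self, parent, imported, op_range):
        self.parent = dict(parent)                  # child -> parent (root -> None)
        self.children = {c: set() for c in parent}  # parent -> direct subclasses
        for child, par in parent.items():
            if par is not None:
                self.children[par].add(child)
        self.imported = set(imported)   # classes imported from the top-level ontology
        self.op_range = set(op_range)   # classes used as range of an object property
        self.hidden = []                # classes hidden so far, in hiding order


def is_internal(o, v):
    """True if v still has at least one visible subclass."""
    return bool(o.children[v])


def hide_class(o, v):
    """Hide v and remove the subclass link from its parent."""
    o.hidden.append(v)
    par = o.parent[v]
    if par is not None:
        o.children[par].discard(v)


def hide_redundant(area, o, v):
    """Post-order traversal: process the subclasses first, then re-test v."""
    if is_internal(o, v):
        # sorted() takes a snapshot, so hiding during the loop is safe;
        # the ordering is only for reproducible output.
        for w in sorted(o.children[v] & area):
            hide_redundant(area, o, w)
    if not is_internal(o, v) and v in o.imported and v not in o.op_range:
        hide_class(o, v)


PARENT = {
    "entity": None,
    "continuant": "entity",
    "compound": "continuant",        # hypothetical domain class
    "occurrent": "entity",
    "processual_entity": "occurrent",
    "process": "processual_entity",
    "assay": "process",              # hypothetical domain class
    "temporal_region": "occurrent",
    "connected_temporal_region": "temporal_region",
    "temporal_instant": "connected_temporal_region",
    "temporal_interval": "connected_temporal_region",
    "scattered_temporal_region": "temporal_region",
}
IMPORTED = set(PARENT) - {"compound", "assay"}

o = Ontology(PARENT, IMPORTED, op_range=set())
hide_redundant(set(PARENT), o, "entity")  # all toy classes lie in the root partial area
print(o.hidden)
```

In this toy run the whole temporal_region subtree is hidden bottom-up, because each parent becomes a leaf once its children are hidden, while classes that still have domain-specific descendants are kept.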
