Ontologies: Introduction and Some Uses

Post on 01-Jan-2016



Ontologies: Introduction and Some Uses

Boyan Brodaric

Bertram Ludäscher

2

Uses of Concept Spaces / Ontologies

• Concept Browsing and Searching
– find concept C
– find all related concepts D; display: C ==R==> D

• “Smart Data Discovery”
– find instances of data sets X that are related to C:
  X = {...your-tagged-data-here...} ==R==> C
– searching for instances of D ...
– ... and knowing that C ==IS-A==> D ...
– ... we can find X!
=> requires “Smart Source Registration”

• Integrated Views and Querying
– access, iterate over, aggregate, group-by, ... concepts
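The “Smart Data Discovery” idea on this slide can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the concept names, the data set names, and the helper functions `register`, `subconcepts`, and `discover` are all hypothetical and do not come from the slides. Data sets are registered at a concept C; a search for instances of D then follows IS-A edges downward and finds everything registered at C.

```python
from collections import defaultdict

# Hypothetical toy ontology: child concept -> set of parent concepts (IS-A edges).
IS_A = {
    "granite": {"igneous_rock"},
    "basalt": {"igneous_rock"},
    "igneous_rock": {"rock"},
}

# Hypothetical registrations: concept -> names of data sets tagged with it.
REGISTERED = defaultdict(set)


def register(dataset, concept):
    """'Smart Source Registration': tag data set X with concept C."""
    REGISTERED[concept].add(dataset)


def subconcepts(concept):
    """Return the concept plus every concept that reaches it via IS-A edges."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parents in IS_A.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def discover(concept):
    """'Smart Data Discovery': data sets registered at the concept or below it."""
    hits = set()
    for c in subconcepts(concept):
        hits |= REGISTERED.get(c, set())
    return hits


register("sierra_nevada_samples", "granite")
register("hawaii_flows", "basalt")
print(sorted(discover("igneous_rock")))
```

Searching for "igneous_rock" returns both data sets even though neither was tagged with that concept directly, which is the point of combining registration with IS-A knowledge.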

3

Glue Knowledge for Semantic Mediation: Unified Medical Language System (UMLS)

• Started by National Library of Medicine in 1986
– ... to aid the development of systems that help health professionals and researchers retrieve and integrate electronic biomedical information from a variety of sources and to make it easy for users to link disparate information systems, including computer-based patient records, bibliographic databases, factual databases, and expert systems.
– The UMLS project develops "Knowledge Sources" that can be used by a wide variety of applications programs to overcome retrieval problems caused by differences in terminology and the scattering of relevant information across many databases.

4

Medical Subject Headings (MeSH) Tree Structures

5

Finding out about ... Paleontology

1. Find *paleo*

2. Find related concepts

6

Combining Ontologies: UMLS and Gene Ontology

7

UMLS Concept Space as Relational Tables

• concept(CUI, LUI, SUI, STR)
– CUI = concept ID
– LUI = lexical ID
– SUI = string ID
– STR = string representation

• relationship(CUI1, REL, CUI2, RELA, SAB, SL)
– REL = {chd (child), par (parent), sib (sibling), ...}
– RELA = {isa, has_part, adjacent_to, contains, contained_in, ...}
– SAB, SL = origin of definition (MeSH2001)
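To make the two relations above concrete, here is a hedged sketch using Python's built-in `sqlite3` module. The table and column names follow the slide; everything else is an assumption made for illustration: the identifiers, the rows, the `DEMO` source label, and the helper `find_related` are invented and are not real UMLS content. The query mirrors the earlier "find *paleo*, then find related concepts" steps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE concept (CUI TEXT, LUI TEXT, SUI TEXT, STR TEXT);
    CREATE TABLE relationship (
        CUI1 TEXT, REL TEXT, CUI2 TEXT, RELA TEXT, SAB TEXT, SL TEXT
    );
""")

# Made-up identifiers and rows for illustration only (not real UMLS data).
conn.executemany("INSERT INTO concept VALUES (?, ?, ?, ?)", [
    ("X001", "L001", "S001", "Paleontology"),
    ("X002", "L002", "S002", "Fossils"),
    ("X003", "L003", "S003", "Geology"),
])
conn.executemany("INSERT INTO relationship VALUES (?, ?, ?, ?, ?, ?)", [
    ("X001", "par", "X003", "isa", "DEMO", "DEMO"),
    ("X001", "chd", "X002", None, "DEMO", "DEMO"),
])


def find_related(pattern):
    """Step 1: match concept strings; step 2: join to their related concepts."""
    return conn.execute("""
        SELECT c1.STR, r.REL, c2.STR
        FROM concept AS c1
        JOIN relationship AS r ON r.CUI1 = c1.CUI
        JOIN concept AS c2 ON c2.CUI = r.CUI2
        WHERE c1.STR LIKE ?
        ORDER BY c2.STR
    """, (pattern,)).fetchall()


for source, rel, target in find_related("%paleo%"):
    print(f"{source} =={rel}==> {target}")
```

SQLite's `LIKE` is case-insensitive for ASCII by default, so the `%paleo%` pattern matches "Paleontology". How REL should be read (from CUI1 to CUI2 or the reverse) is not stated on the slide, so the sketch only reports the stored value.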

8

Model-Based Mediator Architecture

[Figure: architecture diagram. Labels: sources S1, S2, S3, each with an (XML-Wrapper) and a CM-Wrapper; GCM; CM S1, CM S2, CM S3; Mediator Engine (FL rule proc., LP rule proc., Graph proc., XSB Engine); Domain Maps (DMs); Process Maps (PMs); “Glue” Maps (GMs); semantic context CON(S); Integrated View Definition (IVD); CM (Integrated View); USER/Client; CM Queries & Results (exchanged in XML).]

CM(S) = OM(S) + KB(S) + CON(S)

First results & Demos: KIND prototype, formal DM semantics, PMs [SSDBM00] [VLDB00] [ICDE01] [NIH-HB01] [BNCOD02] [ER02] [EDBT02] [BioInf02]

9

Source Contextualization & DM Refinement

In addition to registering (“hanging off”) data relative to existing concepts, a source may also refine the mediator’s domain map...

sources can register new concepts at the mediator ...
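The two operations described here, hanging data off an existing concept and refining the domain map with a new concept, might look roughly like the sketch below. It assumes a very simple representation (a dictionary from concept to labelled edges); the function names `register_data` and `refine_domain_map`, the concept names, and the data set names are hypothetical and are not taken from the mediator prototype.

```python
def register_data(domain_map, registrations, dataset, concept):
    """Hang a data set off a concept that already exists in the domain map."""
    if concept not in domain_map:
        raise KeyError(f"unknown concept: {concept}")
    registrations.setdefault(concept, set()).add(dataset)


def refine_domain_map(domain_map, new_concept, relation, existing_concept):
    """Let a source add a new concept, linked to an existing one."""
    if existing_concept not in domain_map:
        raise KeyError(f"unknown concept: {existing_concept}")
    domain_map.setdefault(new_concept, set()).add((relation, existing_concept))


# Hypothetical mediator state: concept -> set of (relation, target concept).
domain_map = {"neuron": set(), "purkinje_cell": {("isa", "neuron")}}
registrations = {}

# A source registers data relative to an existing concept ...
register_data(domain_map, registrations, "lab_images", "purkinje_cell")

# ... then refines the domain map and registers data at the new concept.
refine_domain_map(domain_map, "spiny_dendrite", "contained_in", "purkinje_cell")
register_data(domain_map, registrations, "spine_counts", "spiny_dendrite")

print(sorted(domain_map))
print(registrations)
```

Requiring the anchor concept to exist already keeps every refinement connected to the mediator's current map; that check is a design choice of this sketch, not something the slide specifies.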

10

Demonstration: Using Ontologies in Queries/Views

• find data sets that are “inside” X

• inside = logical_inside PLUS spatially_inside

• logical_inside uses
– UMLS, and
– NEURONAMES

• spatially_inside uses
– Oracle-Spatial

• visualize @ client
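The combined "inside" predicate can be illustrated with a small Python sketch. It rests on several assumptions that the slide does not state: "PLUS" is read as a logical union, the ontology lookup (UMLS and NEURONAMES in the demo) is replaced by a hypothetical part-of table, and the spatial test (Oracle-Spatial in the demo) is replaced by axis-aligned bounding boxes with invented coordinates.

```python
# Hypothetical part-of hierarchy standing in for the ontology lookup.
PART_OF = {
    "purkinje_cell_layer": "cerebellar_cortex",
    "cerebellar_cortex": "cerebellum",
}

# Hypothetical bounding boxes (x0, y0, x1, y1) standing in for a spatial database.
BOXES = {
    "cerebellum": (0, 0, 10, 10),
    "slice_42": (2, 2, 4, 4),
    "slice_99": (8, 8, 12, 12),
}


def logical_inside(a, x):
    """True if a reaches x by following part-of links upward."""
    seen = set()
    current = PART_OF.get(a)
    while current is not None and current not in seen:
        if current == x:
            return True
        seen.add(current)
        current = PART_OF.get(current)
    return False


def spatially_inside(a, x):
    """True if the box of a lies entirely within the box of x."""
    if a not in BOXES or x not in BOXES:
        return False
    ax0, ay0, ax1, ay1 = BOXES[a]
    bx0, by0, bx1, by1 = BOXES[x]
    return bx0 <= ax0 and by0 <= ay0 and ax1 <= bx1 and ay1 <= by1


def inside(a, x):
    """Assumed reading of 'PLUS': either notion of containment suffices."""
    return logical_inside(a, x) or spatially_inside(a, x)


candidates = ["purkinje_cell_layer", "slice_42", "slice_99"]
print([c for c in candidates if inside(c, "cerebellum")])
```

Here "purkinje_cell_layer" qualifies only through the ontology and "slice_42" only through geometry, while "slice_99" extends past the region and is rejected.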

Query Processing Demo

[Figure: demo screenshots. Labels: Query results in context; Contextualization CON(Result) wrt. ANATOM.]

Mediator View Definition

DERIVE
  protein_distribution(Protein, Organism, Brain_region, Feature_name, Anatom, Value)
WHERE
  I:protein_label_image[proteins ->> {Protein}; organism -> Organism;
      anatomical_structures ->> {AS:anatomical_structure[name->Anatom]}],   % from PROLAB
  NAE:neuro_anatomic_entity[name->Anatom;                                   % from ANATOM
      located_in->>{Brain_region}],
  AS..segments..features[name->Feature_name; value->Value].

• provided by the domain expert and mediation engineer
• deductive OO language (here: F-logic)

Inside Query Evaluation: Another Example

push selection @SENSELAB: X1 := select targets of “output from parallel fiber”;

determine source context @MEDIATOR: X2 := “find and situate” X1 in ANATOM Domain Map;

compute region of interest (here: downward closure) @MEDIATOR: X3 := subregion-closure(X2);

push selection @NCMIR: X4 := select PROT-data(X3, Ryanodine Receptors);

compute protein distribution @MEDIATOR: X5 := compute aggregate(X4);

display in context @MEDIATOR/GUI: display X5 in context (ANATOM)

“How does the parallel fiber output (Yale/SENSELAB) relate to the distribution of Ryanodine Receptors (UCSD/NCMIR)?”
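The six-step plan above can be traced with a toy pipeline in Python. This is a sketch under loud assumptions, not the mediator's implementation: the three sources are replaced by in-memory tables with invented contents, every function name is hypothetical, and "compute aggregate" is assumed to mean a per-region sum.

```python
# All tables below are invented stand-ins for the real sources.
SENSELAB = {"output from parallel fiber": ["purkinje_dendrite"]}
ANATOM_SITUATE = {"purkinje_dendrite": "molecular_layer"}
ANATOM_SUBREGIONS = {
    "molecular_layer": ["outer_molecular_layer", "inner_molecular_layer"],
}
NCMIR = [  # (region, protein, amount)
    ("outer_molecular_layer", "Ryanodine Receptors", 3.0),
    ("outer_molecular_layer", "Ryanodine Receptors", 0.5),
    ("inner_molecular_layer", "Ryanodine Receptors", 1.5),
    ("inner_molecular_layer", "Calbindin", 2.0),
    ("granule_layer", "Ryanodine Receptors", 9.0),
]


def subregion_closure(regions):
    """Downward closure: the given regions plus all of their subregions."""
    closure, frontier = set(regions), list(regions)
    while frontier:
        for sub in ANATOM_SUBREGIONS.get(frontier.pop(), []):
            if sub not in closure:
                closure.add(sub)
                frontier.append(sub)
    return closure


def run_plan(selection, protein):
    x1 = SENSELAB.get(selection, [])                      # push selection @SENSELAB
    x2 = {ANATOM_SITUATE[t] for t in x1 if t in ANATOM_SITUATE}  # situate in ANATOM
    x3 = subregion_closure(x2)                            # region of interest
    x4 = [row for row in NCMIR                            # push selection @NCMIR
          if row[0] in x3 and row[1] == protein]
    x5 = {}                                               # aggregate (assumed: sum)
    for region, _, amount in x4:
        x5[region] = x5.get(region, 0.0) + amount
    return x5


# "Display in context" is reduced to printing region by region.
result = run_plan("output from parallel fiber", "Ryanodine Receptors")
for region in sorted(result):
    print(region, result[region])
```

Pushing the two selections to the sources first keeps the data shipped to the mediator small; only the closure and the aggregation run centrally, which matches the @SENSELAB, @NCMIR, and @MEDIATOR annotations in the plan.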

13

Ecological Metadata Language (EML): Useful for Marking up GEON Data?