Post on 13-Jan-2016
LexRDF: A Semantic-Web Compatible Extension of LexGrid
Cui Tao
Jyotishman Pathak
Harold R. Solbrig
Wei-Qi Wei
Christopher G. Chute
Division of Biomedical Statistics and Informatics, Mayo Clinic, College of Medicine
Introduction
• LexGrid provides
• A common information model to represent multiple vocabulary/ontology sources
• A scalable and robust API for accessing such information
• The Semantic Web community provides:
• OWL: formal and logic-based, with sound and complete reasoning
• Tools:
• Editor: Protégé
• Reasoner: RACER, FaCT, FaCT++, Pellet
• Storage: triple stores
The Classic Web
• Single information space
• Built on URIs
• globally unique IDs
• retrieval mechanism
• Built on hyperlinks
• are the glue that holds everything together
[Figure: web browsers and search engines access three HTML documents (A, B, C) that are connected by hyperlinks.]
Source: Chris Bizer
Search by Search Engines
Find the protein and the amino-acid information for gene “cdk-4”
• One problem with the web of documents is that we can only search for documents. With a search engine such as Google, we can only guess the keywords that best represent our question and hope that they lead us to the documents that contain the answer. However, Google usually returns hundreds of thousands of documents that contain the keywords, and users have to go through the returned documents manually to obtain the information of interest.
The Answer is Here
• The Hidden Web:
• Hidden behind forms
• Hard to query
Find the protein and the amino-acid information for gene “cdk-4”
• Results from a classic search-engine query.
• Even if we find the page that contains the answer, we still have to read through the document and locate the information of interest. There is currently no way for Google to directly return the piece of data we are looking for.
Source: Chris Bizer
• Use Semantic Web technologies to
• publish structured data on the Web
• set links between data with the same source or across sources
• The data and the links could be annotated by ontologies
A Web of Data
• This is an RDF graph showing that each data item has a URI and that all items can be linked together.
RDF Graph
Resource Description Framework (RDF)
• A language for describing resources in a form that machines can process
• Commonly serialized as XML (RDF/XML)
• Used to identify things on the web (URI)
• The triple structure makes it efficient to implement and store
• A directed graph
• Fully parallelizable processing: everyone can contribute simultaneously
• Easy to merge data from different sources
An RDF Statement
• RDF Triple
• Subject: thing the statement is about
• Predicate: property or characteristic of subject
• Object: value of the property
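The triple structure and the "easy to merge" claim can be sketched in a few lines of plain Python. This is only an illustration, not an RDF library: the `Triple` type, the `http://example.org/` namespace, the predicate names, and the placeholder values are all assumptions made up for this example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # thing the statement is about
    predicate: str  # property or characteristic of the subject
    obj: str        # value of the property


EX = "http://example.org/"  # hypothetical namespace, for illustration only

# Two independent sources that happen to share one statement.
shared = Triple(EX + "gene/cdk-4", EX + "encodes", EX + "protein/P1")
source_a = {shared}
source_b = {
    shared,
    Triple(EX + "protein/P1", EX + "label", "placeholder protein name"),
}

# Merging two graphs is a set union: identical triples collapse into one.
merged = source_a | source_b


def objects(graph, subject, predicate):
    """Return every object value stated for (subject, predicate)."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


encoded = objects(merged, EX + "gene/cdk-4", EX + "encodes")
```

Because both sources use the same URIs, no schema alignment step is needed before the union; that is the property the slide refers to.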
An Example
Source: Hoot72
• This example comes from Hoot72, which aims to convert information in HL7 to RDF.
RDF Data
• This example shows how easily an RDF graph can link data together.
Linked Health Data
Source: Hoot72
• We can link the whole health care domain together to obtain linked health data. It can include patient data, scientific findings, doctor information, and insurance information, all centered on standard coding systems.
An Example
Source: Tim Berners-Lee
• Tim Berners-Lee gave this example during one of his talks earlier this year. The question is “what proteins are involved in signal transduction and are related to pyramidal neurons?”
Search Engine: 223,000 Hits, 0 Results
• A Google search returned more than 200,000 hits but no answer, because no one had asked this exact question before.
Source: Tim Berners-Lee
Linked Health Data: 32 Hits, 32 Results
Source: Tim Berners-Lee
• With the linked health data, he got 32 hits with 32 results. This is because all of the linked data are semantically annotated. The Semantic Web also supports semantic queries, so the question can be described more precisely and therefore answered better.
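A minimal sketch of why a semantic query can be more precise than a keyword search: the question becomes two triple patterns whose subjects must coincide. The data below are invented placeholders (`ex:proteinA` and so on), not the data set used in the talk, and the `subjects` helper is a hypothetical stand-in for a real query engine.

```python
# Toy graph of (subject, predicate, object) triples; all names are placeholders.
triples = {
    ("ex:proteinA", "ex:involvedIn", "ex:signalTransduction"),
    ("ex:proteinA", "ex:relatedTo", "ex:pyramidalNeuron"),
    ("ex:proteinB", "ex:involvedIn", "ex:signalTransduction"),
    ("ex:proteinC", "ex:relatedTo", "ex:pyramidalNeuron"),
}


def subjects(graph, predicate, obj):
    """Return all subjects that match the pattern (?s, predicate, obj)."""
    return {s for (s, p, o) in graph if p == predicate and o == obj}


# "Involved in signal transduction AND related to pyramidal neurons"
# is the intersection of the two pattern matches.
answer = (
    subjects(triples, "ex:involvedIn", "ex:signalTransduction")
    & subjects(triples, "ex:relatedTo", "ex:pyramidalNeuron")
)
```

Every hit is a result because the query matches typed relations rather than keywords that merely co-occur in a document.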
LexRDF
• Provide a unified RDF-based model for biomedical ontologies and terminologies
• Directly apply tools and technologies developed for the semantic web
• Provide a publicly accessible repository for the biomedical ontologies/terminologies
LexRDF System Overview
• This is the LexRDF/LexGrid system overview. The left side shows our current LexGrid model: relational databases serve as the backend storage, information from different source terminologies/ontologies is loaded into the LexGrid format, and this information can be queried through the LexEVS API. LexRDF instead uses triple stores as the backend storage, and loaded information is stored as triples based on the LexRDF model. From the user’s point of view, the LexEVS API stays the same: all current functionality is still provided, along with additional features that the Semantic Web can potentially offer.
[Figure: LexRDF/LexGrid system overview. Labels: LexEVS API; Service Layer (Grid Service, Web Service, Graphical Browser, Ajax API, REST Interface); LexGrid Model over a Relational Database (Load, Query); LexRDF Model over a Triple Store (Load, Query) with Query Engine and Reasoner; LexRDF Mapping Specification.]
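The idea that the API stays the same while the storage backend changes can be sketched as one interface with two implementations. Everything here is hypothetical: `TerminologyBackend`, `TerminologyService`, and their methods are invented for illustration and are not the LexEVS API, and in-memory Python containers stand in for the relational database and the triple store.

```python
from abc import ABC, abstractmethod


class TerminologyBackend(ABC):
    """Hypothetical storage interface shared by both backends."""

    @abstractmethod
    def load(self, code, label): ...

    @abstractmethod
    def lookup_label(self, code): ...


class RelationalBackend(TerminologyBackend):
    def __init__(self):
        self._rows = {}  # stands in for a relational table keyed by code

    def load(self, code, label):
        self._rows[code] = label

    def lookup_label(self, code):
        return self._rows.get(code)


class TripleStoreBackend(TerminologyBackend):
    def __init__(self):
        self._triples = set()  # stands in for a triple store

    def load(self, code, label):
        self._triples.add((code, "skos:prefLabel", label))

    def lookup_label(self, code):
        for s, p, o in self._triples:
            if s == code and p == "skos:prefLabel":
                return o
        return None


class TerminologyService:
    """Caller-facing layer; it never sees which backend is in use."""

    def __init__(self, backend):
        self._backend = backend

    def get_label(self, code):
        return self._backend.lookup_label(code)


services = []
for backend in (RelationalBackend(), TripleStoreBackend()):
    backend.load("C001", "placeholder label")
    services.append(TerminologyService(backend))
labels = [service.get_label("C001") for service in services]
```

The same call returns the same answer from either backend, which is the user-facing guarantee the overview describes.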
W3C Recommendations
• The Resource Description Framework (RDF)
• RDF Schema (RDFS)
• Web Ontology Language (OWL)
• Simple Knowledge Organization System (SKOS)
• SKOS eXtension for Labels (SKOS-XL)
• Dublin Core metadata element set (dc) (not W3C)
Mapping Specification
Ontology Information
• This slide shows the mapping specification for the LexGrid components related to ontology information. We successfully mapped all of those LexGrid components to W3C components.
Mapping Specification
• ENTITY
This graph shows the mapping for the LexGrid entity. A LexGrid entity has three sub-components: concept, instance, and association. We were able to find equivalent components for them in the W3C namespaces.
Mapping Specification
• This table shows the entity mapping.
Mapping Specification
PROPERTY
This graph shows the property mapping. A LexGrid property can be further specified as a presentation, a definition, or a comment. lg:presentation is mapped to skos:prefLabel and skos:altLabel. lg:definition is mapped to skos:definition. lg:comment is mapped to a subset of skos:note. When no type is specified, we use owl:AnnotationProperty to describe the general lg:property for now.
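The property mapping described above can be written as a small lookup function. This is a sketch under stated assumptions: the function name `map_property` and its parameters are hypothetical, the choice between skos:prefLabel and skos:altLabel follows the preferred/not-preferred distinction used in the examples later in the deck, and the "subset of skos:note" for comments is simplified to skos:note itself.

```python
def map_property(kind=None, is_preferred=False):
    """Return the target term for a LexGrid property (illustrative only)."""
    if kind == "presentation":
        # Preferred presentations become prefLabel, all others altLabel.
        return "skos:prefLabel" if is_preferred else "skos:altLabel"
    if kind == "definition":
        return "skos:definition"
    if kind == "comment":
        # Simplification: the mapping targets a subset of skos:note.
        return "skos:note"
    # No type specified: fall back to the general annotation property.
    return "owl:AnnotationProperty"


mapped = [
    map_property("presentation", is_preferred=True),
    map_property("presentation"),
    map_property("definition"),
    map_property("comment"),
    map_property(),
]
```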
Mapping Specification
• PROPERTY
This table gives the detailed information for the property mapping.
Mapping Specification
ASSOCIATION
This table gives the detailed information for the association mapping.
Example
Concept, Property, and Reification
FAO:000025 skos:definition “middle stages of reproductive phase”
reification
• This example shows the mapping for concepts, properties, and reification, using an OBO term. The term has a name, a definition, and a synonym. LexGrid uses two presentations and one definition to represent this information: the presentation for the name is marked as preferred, and the presentation for the synonym is marked as not preferred. LexRDF creates an OWL class for the term. It uses skos:prefLabel for the name and skos:altLabel for the synonym. For the definition, LexRDF uses skos:definition. Because the definition also has source information, we had to use RDF reification to describe the source.
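RDF reification turns one statement into a resource that further statements can describe. The sketch below applies the standard rdf:Statement vocabulary to the definition shown on the slide. The blank-node id `_:def1`, the `reify` helper, the use of dc:source as the attaching predicate, and the source value are assumptions for illustration; the slide text does not say which predicate or source LexRDF actually uses.

```python
term = "FAO:000025"
definition = "middle stages of reproductive phase"

triples = [
    (term, "rdf:type", "owl:Class"),
    (term, "skos:definition", definition),
]


def reify(statement_id, subject, predicate, obj):
    """Describe the statement (subject, predicate, obj) as an rdf:Statement."""
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


stmt = "_:def1"  # hypothetical blank-node id for the reified statement
triples += reify(stmt, term, "skos:definition", definition)

# The source is attached to the statement node, not to the term itself.
triples.append((stmt, "dc:source", "placeholder source"))
```

One definition with one source costs seven triples here, which is the overhead reification brings.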
Example: Property Link
translation
This is an example of a property link. In LexGrid, we can describe relations between any two properties. Here is an example from Agrovoc, a multilingual terminology. In this example, the term has 17 scope notes in 17 different languages; we show only two here. Suppose we want to describe the relation “translation” between the English note and the German note. We then need to add a LexRDF property link between the two statements.
Example: Association
• This is an example of association mapping. In LexGrid, we can not only describe the relation between two concepts but also add qualifiers to that relation. In this example, we want to show whether a given disease has certain clinical signs, and also how frequently each clinical sign appears in the disease. In this case, LexRDF creates an association for the relation HAS_CLINICAL_SIGN between the disease and the clinical sign. It also adds a qualifier for the frequency to the association.
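A qualified association can be sketched by giving the association its own node, so the qualifier has something to attach to. This is an assumption-laden illustration: the `Association` class, the `ex:` predicates, the node id, and the disease, sign, and frequency values are all hypothetical and do not reproduce the LexRDF model itself.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    source: str
    relation: str
    target: str
    qualifiers: dict = field(default_factory=dict)

    def to_triples(self, node_id):
        """Emit triples that hang the relation and its qualifiers on one node."""
        triples = [
            (node_id, "ex:source", self.source),
            (node_id, "ex:relation", self.relation),
            (node_id, "ex:target", self.target),
        ]
        triples += [(node_id, f"ex:{name}", value) for name, value in self.qualifiers.items()]
        return triples


assoc = Association(
    source="ex:DiseaseX",          # placeholder disease
    relation="HAS_CLINICAL_SIGN",
    target="ex:SignY",             # placeholder clinical sign
    qualifiers={"frequency": "placeholder frequency"},
)
assoc_triples = assoc.to_triples("_:assoc1")
```

A plain triple (disease, HAS_CLINICAL_SIGN, sign) has no place to store the frequency, which is why the association needs an identity of its own.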
Discussion
• Generic holder for properties and comments
During the mapping process, we encountered some challenges and problems, and we suggest some possible solutions. The following slides address these challenges and suggestions.
In LexGrid, we have a generic holder for properties and comments, but we could not find an appropriate equivalent component for them in the W3C namespaces.
Discussion
• Newly defined properties
LexGrid: can define name and value for a particular property
Concept C001 has Property
Name = “short_name”
Value = “A”
LexRDF:
New annotation properties have to be defined: interoperability problems?
Subject     Predicate   Object
short_name  rdf:type    owl:AnnotationProperty
C001        short_name  A
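The two rows in the table above can be generated mechanically from a LexGrid name/value pair. The helper name `property_to_triples` is hypothetical; the output follows the table on the slide.

```python
def property_to_triples(concept, name, value):
    """Declare a new annotation property, then use it on the concept."""
    return [
        (name, "rdf:type", "owl:AnnotationProperty"),
        (concept, name, value),
    ]


new_property_triples = property_to_triples("C001", "short_name", "A")
```

Every distinct property name produces its own predicate, so two sources that name the same idea differently end up with unrelated predicates. That is the interoperability concern raised on the slide.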
Discussion
• Preferred properties
In LexGrid, we can mark both presentations and definitions as preferred, but SKOS only has prefLabel and altLabel. It would be helpful if SKOS included prefDefinition and altDefinition in the future.
LexRDF:isPreferred
Discussion
• Association qualification
Discussion
• Relation among properties
skosxl:labelRelation
• Only between skosxl:Label instances
• symmetric property
PropertyLink is more general!
Discussion
• Property groups
CUI1  AUI1  CUI2  AUI2  Rel  Rela      Sab
C001  A001  C002  A002  PAR  sub_type  LNC
C001  A003  C002  A004  PAR  is_a      SNOMED
[Figure: a single PAR association between C001 and C002]
Qualifier Group 1: Rela = sub_type, Source = LNC, Source_AUI = A001, Target_AUI = A002
Qualifier Group 2: Rela = is_a, Source = SNOMED, Source_AUI = A003, Target_AUI = A004
How do we handle a group of properties?
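The grouping problem can be made concrete with the two rows from the table. The sketch below collapses them into one association with two qualifier groups, then shows what is lost if the qualifiers are flattened onto the association. It illustrates the open question rather than answering it; the variable names are hypothetical, and the second row uses the table's values (is_a, SNOMED).

```python
from collections import defaultdict

rows = [
    {"CUI1": "C001", "AUI1": "A001", "CUI2": "C002", "AUI2": "A002",
     "REL": "PAR", "RELA": "sub_type", "SAB": "LNC"},
    {"CUI1": "C001", "AUI1": "A003", "CUI2": "C002", "AUI2": "A004",
     "REL": "PAR", "RELA": "is_a", "SAB": "SNOMED"},
]

# Both rows describe the same association (C001, PAR, C002).
groups = defaultdict(list)
for row in rows:
    key = (row["CUI1"], row["REL"], row["CUI2"])
    groups[key].append({
        "Rela": row["RELA"],
        "Source": row["SAB"],
        "Source_AUI": row["AUI1"],
        "Target_AUI": row["AUI2"],
    })

association = ("C001", "PAR", "C002")

# Flattening: attach every qualifier directly to the association.
flat = {(name, value) for group in groups[association] for name, value in group.items()}
```

After flattening, `flat` still contains all eight name/value pairs, but nothing records that `sub_type` belongs with `LNC` rather than with `SNOMED`. Keeping that pairing is what a property group has to do.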
Initial Implementation Status
[Figure: initial implementation architecture. Labels: Web Browser; Ontologies/Terminologies; LBOWLManager; LexRDFManager; LexRDF Mapping; LexGrid with LexEVS API over a Database; LexRDF with Sesame API over a Sesame Server (Query Engine; storage options JDBC, Memory, Native; RDBMS).]
Conclusion and Future Work
• Next steps
• Implementation
• Evaluation result analysis
• Query functionality analysis
• Formalize the mapping specification by using standards such as the OMG Ontology Definition Metamodel
• LexRDF mapping specification:
• successfully mapped 32 out of 42 LexGrid elements
• high degree of reusability
• LexRDF documentation:
https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/LexGrid_to_RDF_Triple_Store