
Developing a Web-based SKOS Editor a... · 2014. 9. 25. · Developing a Web-based SKOS Editor Mike...

Developing a Web-based SKOS Editor

Mike Conway
Dept. Family and Preventive Medicine
University of California San Diego
La Jolla CA, UNITED STATES
[email protected]

Fariba Fana
San Diego Supercomputer Center
La Jolla CA, UNITED STATES
[email protected]

Melissa Tharp and William Scuba and Wendy Chapman
Dept. Biomedical Informatics
University of Utah
Salt Lake City UT, UNITED STATES
{melissa.tharp|william.scuba|wendy.chapman}@utah.edu

Simon Jupp
European Molecular Biology Laboratory
European Bioinformatics Institute (EMBL-EBI)
Wellcome Trust Genome Campus
Cambridge, UNITED KINGDOM
[email protected]

Abstract

The Simple Knowledge Organization System (SKOS) was introduced to the wider research community by a 2005 World Wide Web Consortium (W3C) working draft, and further developed and refined in a 2009 W3C recommendation. Since then, SKOS has become the de facto standard for representing and sharing thesauri, lexica, vocabularies, taxonomies, and classification schemes. In this paper, we describe the development of a web-based, free SKOS editor built for the development, curation, and management of small to medium sized lexicons for health-related Natural Language Processing.

1 Introduction & Background

The Simple Knowledge Organization System (SKOS) standard was introduced to the wider community by a 2005 World Wide Web Consortium (W3C) working draft¹ and further developed and refined in a 2009 W3C recommendation² (Lacasta et al., 2010). Since then, SKOS has become the de facto standard for representing and sharing thesauri, lexica, vocabularies, taxonomies, and classification schemes, both as a useful data format in its own right, and as a means for sharing resources on the semantic web. In this paper, we describe the development of a web-based, free SKOS editor suitable for the creation and curation of knowledge organization systems in general, and health-related, linguistically-oriented thesauri designed to support health-related Natural Language Processing (NLP) in particular.

¹ www.webcitation.org/6QmPKaUaP
² www.webcitation.org/6QmPUa0m8

SKOS is a flexible standard designed to represent and encode a wide range of different types of knowledge organization systems. The standard is widely used³ by governments (e.g. United Kingdom Public Sector Vocabularies), scientific bodies (e.g. NASA vocabularies), and non-governmental organizations (e.g. UNESCO Thesaurus). In contrast to its sibling World Wide Web Consortium semantic web standard, the Web Ontology Language (OWL), SKOS follows the principle of “minimal ontological commitment” (Baker et al., 2013). That is, SKOS concepts and relations are lightly specified, using thesaurus-style relations like broader rather than the logically formalized relations commonly used in OWL or the Basic Formal Ontology (e.g. IS-A).

³ www.webcitation.org/6QmoFFjdc

Figure 1: Screenshot of the web interface

SKOS models consist of concept schemes, which serve as containers for concepts. Concepts can be related together in various ways to create a hierarchical structure. The most important of these semantic relations are skos:broader (which can be read as “has broader concept”) and skos:narrower (which can be read as “has narrower concept”). Further, each concept can be associated with a number of lexical labels, including skos:prefLabel (a preferred label provides a mechanism to link a canonical label to a concept), skos:altLabel (an alternative label provides a mechanism to specify synonyms for the concept), and skos:hiddenLabel (a hidden label provides a mechanism to specify non-standard synonyms like colloquialisms or misspellings).
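The structure described above (a concept scheme containing concepts, each with lexical labels and broader/narrower links) can be illustrated with a minimal Python sketch. This is an illustration only, not the editor's implementation: the class names, field names, and example.org URIs are assumptions introduced here. The one SKOS fact it relies on is that skos:broader and skos:narrower are inverses of each other.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A SKOS concept with lexical labels and hierarchical links."""
    uri: str
    pref_label: str                                     # skos:prefLabel
    alt_labels: list = field(default_factory=list)      # skos:altLabel
    hidden_labels: list = field(default_factory=list)   # skos:hiddenLabel
    broader: set = field(default_factory=set)           # URIs of broader concepts
    narrower: set = field(default_factory=set)          # URIs of narrower concepts


@dataclass
class ConceptScheme:
    """A container for concepts, keyed by URI."""
    uri: str
    concepts: dict = field(default_factory=dict)

    def add(self, concept):
        self.concepts[concept.uri] = concept
        return concept

    def set_broader(self, child_uri, parent_uri):
        # skos:broader and skos:narrower are inverses, so record both directions.
        self.concepts[child_uri].broader.add(parent_uri)
        self.concepts[parent_uri].narrower.add(child_uri)


# Hypothetical example vocabulary.
BASE = "http://example.org/symptoms#"
scheme = ConceptScheme("http://example.org/symptoms")
scheme.add(Concept(BASE + "Symptom", "symptom"))
scheme.add(Concept(BASE + "Fever", "fever",
                   alt_labels=["pyrexia"], hidden_labels=["feaver"]))
scheme.set_broader(BASE + "Fever", BASE + "Symptom")
```

Here the misspelling "feaver" is stored as a hidden label, so it can be matched in text without ever being displayed as a synonym.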

Given its lightweight semantics, SKOS is particularly suitable as a basis for the development and sharing of vocabularies to support NLP tasks. A key part of the workflow in developing some NLP systems, in particular NLP systems designed to process health-related text, is the development of custom lexicons, including common abbreviations, synonyms (including slang terms), and truncations (Wu et al., 2012; Liu et al., 2013; Wilson et al., 2010; Myslín et al., 2013).
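To make the role of such a lexicon concrete, the following Python sketch shows one simple way an NLP component might map surface forms in text back to concepts using the three SKOS label types. The lexicon entries, function names, and sample sentence are all hypothetical, and real clinical NLP systems use considerably more sophisticated matching.

```python
import re

# Hypothetical lexicon entries: concept id -> labels grouped by SKOS label type.
LEXICON = {
    "Fever": {
        "prefLabel": ["fever"],
        "altLabel": ["pyrexia"],
        "hiddenLabel": ["feaver"],      # misspelling
    },
    "Dyspnea": {
        "prefLabel": ["dyspnea"],
        "altLabel": ["shortness of breath"],
        "hiddenLabel": ["sob"],         # abbreviation
    },
}


def build_index(lexicon):
    """Map every lowercased label to its concept id."""
    index = {}
    for concept_id, labels in lexicon.items():
        for label_list in labels.values():
            for label in label_list:
                index[label.lower()] = concept_id
    return index


def find_concepts(text, index):
    """Return (label, concept id) pairs for labels found as whole words or phrases."""
    hits = []
    lowered = text.lower()
    # Check longer labels first so multi-word labels are reported before shorter ones.
    for label in sorted(index, key=len, reverse=True):
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            hits.append((label, index[label]))
    return hits


INDEX = build_index(LEXICON)
hits = find_concepts("Pt c/o SOB and feaver since yesterday.", INDEX)
```

In this sketch both the abbreviation and the misspelling resolve to their concepts only because they were recorded as hidden labels, which is the kind of curation work the editor is meant to support.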

Since the inception of SKOS in 2005, significant effort has been expended on the development of software tools for the standard, in particular for editing and viewing SKOS vocabularies. Notable examples include a SKOS Application Programming Interface (API) and editing module (Jupp et al., 2009) for Protégé 4⁴ (the Protégé SKOS Editor), PoolParty, an online SKOS editing and manipulation tool (Schandl and Blumauer, 2010), and SKOS functionality built into the TopBraid Composer RDF editing platform⁵, all of which facilitate the creation, development, and utilization of SKOS vocabularies. However, to the best of our knowledge, until now no free web-based SKOS editor has been available to the research community (note that PoolParty, although web-based, is a commercial product). In this paper we present a web-based SKOS editing tool that is suitable for developing and modifying the health-related lexicons necessary for large-scale information extraction from clinical notes and other health-related text, yet is also general purpose enough for any small-to-medium sized SKOS vocabulary development or curation project.

2 Implementation

A key advantage of using a web-based editor is that it can be used anywhere, on any machine, without complex user installation. Given that our target users are clinicians and domain experts (i.e. those with little or no experience of semantic web languages) rather than informatics professionals, ease of use is an important requirement. We took the decision to simplify the editor’s user interface as much as possible, hiding much of the general OWL/RDF functionality available in tools like Protégé and TopBraid Composer.

⁴ www.webcitation.org/6QmsQg41G
⁵ www.webcitation.org/6QmsXXNCc

SKOS Editor Architecture
Fariba Fana, NLP Group @ UCSD Dept. of Biomedical Information Division

Client
-- Upload SKOS file from URL or local machine
-- Create SKOS file from scratch
-- Download own work in RDFXML format

Liferay Server (running on Tomcat)
-- User authorization & authentication and user & user groups management

View
-- Technologies: JSP/JavaScript/YUI/Dojo/CSS/HTML
-- HTML forms rendered by dialogs as user interacts with different parts of the application and send AJAX calls to controller

Controller
-- Technologies: Java JSR 286 Portlet Web App
-- Portlet processAction and serveResource methods to answer AJAX calls from the view JSP

Model
-- Technologies: Java SKOS and OWL 2 API
-- Methods to call SKOS and OWL2 API to do various operations

Data exchange (diagram labels)
-- XML
-- AJAX calls, JSON request/response

Functionalities / Operations

File / IO
-- Upload SKOS file from URL
-- Upload SKOS file from local machine
-- Create SKOS file from scratch and download
-- Download own work in RDFXML format

Schema & Concept
-- Create and rename schema
-- Display concepts in hierarchical view/tree
-- Add top concept and child concept
-- Remove concept
-- Rename concept
-- Move concept within the hierarchy

Object Properties
-- Display object properties organized in a table
-- Add new object properties
-- Remove object properties
-- Edit object properties
-- Object properties are grouped according to SKOS standard object property types

Data Properties
-- Display data properties organized in a table
-- Add new data properties
-- Remove data properties
-- Edit data properties
-- Add new type for properties
-- Data properties are grouped according to SKOS standard

Figure 2: System functionality

A screenshot of the system is shown in Figure 1. The screenshot shows a SKOS thesaurus designed to drive an NLP system for the automatic identification of biosurveillance-relevant symptoms from Electronic Health Records (Conway et al., 2011). After some experimentation, we adopted an interface that consists of three panels, from left to right:

• CONCEPT PANE: An editable taxonomic hierarchy of SKOS concepts representing skos:broader and skos:narrower relations, which the user can click on to expand and collapse the tree

• RELATIONS PANE: An editable list of relations between concepts, particularly the skos:related, skos:broader, and skos:narrower relations

• LINGUISTICS PANE: An editable list of lexical items related to each SKOS concept (e.g. skos:prefLabel, skos:altLabel, skos:hiddenLabel)
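As a rough illustration of what the concept pane has to compute, the Python sketch below turns a flat list of skos:broader assertions into an expandable tree. It is not the editor's code: the function names, the "+" and "-" markers, and the sample concepts are assumptions, and the sketch assumes the hierarchy contains no cycles.

```python
from collections import defaultdict


def build_children(broader_pairs):
    """Invert (child, parent) skos:broader pairs into parent -> sorted children."""
    children = defaultdict(list)
    for child, parent in broader_pairs:
        children[parent].append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


def render_tree(roots, children, collapsed=frozenset(), depth=0):
    """Return indented lines for the hierarchy, hiding children of collapsed nodes."""
    lines = []
    for node in roots:
        has_kids = node in children
        # "+" marks a collapsed node that can be expanded; "-" marks everything else.
        marker = "+" if has_kids and node in collapsed else "-"
        lines.append("  " * depth + marker + " " + node)
        if has_kids and node not in collapsed:
            lines.extend(render_tree(children[node], children, collapsed, depth + 1))
    return lines


# Hypothetical (child, parent) pairs, each read as "child skos:broader parent".
pairs = [("Fever", "Constitutional"), ("Chills", "Constitutional"),
         ("Cough", "Respiratory"), ("Dyspnea", "Respiratory")]
children = build_children(pairs)
expanded = render_tree(["Constitutional", "Respiratory"], children)
folded = render_tree(["Constitutional", "Respiratory"], children,
                     collapsed={"Respiratory"})
```

Storing only the broader direction and deriving the child lists on demand keeps the two inverse relations from drifting out of sync when a concept is moved.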

We identified six core functions necessary for the editor, partially based on the requirements identified by Jupp et al. (2009):

• Create, edit and delete SKOS entities

• Assert SKOS relationships between SKOS concepts (e.g. broader/narrower)

• Assert and edit skos:prefLabel, skos:altLabel, and skos:hiddenLabel data properties

• Visualize broader and narrower relationships in a browsable hierarchical tree

• Support for SKOS documentation properties and Dublin Core

• Provide alternative renderings (e.g. multilingual prefLabels) within the editor
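The last requirement, alternative renderings based on multilingual preferred labels, can be sketched as a small label-selection routine. The function name, fallback rule, and sample labels below are assumptions for illustration. Keying the labels by language tag reflects the SKOS integrity condition that a concept has at most one skos:prefLabel per language.

```python
def choose_label(pref_labels, preferred_langs, fallback="(no label)"):
    """Pick the display label for the first preferred language that has one.

    pref_labels maps a language tag to the skos:prefLabel text in that language.
    """
    for lang in preferred_langs:
        if lang in pref_labels:
            return pref_labels[lang]
    # No preferred language available: fall back to the alphabetically first tag.
    if pref_labels:
        return pref_labels[min(pref_labels)]
    return fallback


# Hypothetical concept with preferred labels in three languages.
fever = {"en": "fever", "es": "fiebre", "de": "Fieber"}

spanish_view = choose_label(fever, ["es", "en"])
french_view = choose_label(fever, ["fr"])   # no French label, so a fallback is used
empty_view = choose_label({}, ["en"])
```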

In building our web-based SKOS Editor, we relied heavily on existing OWL, SKOS and RDF tooling, in particular the SKOS API (Jupp et al., 2009) (developed by author SJ) and the OWL API (Horridge and Bechhofer, 2011). The system is a Liferay Portlet application (see Figure 2).
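Figure 2 describes a view that sends AJAX calls to a controller, which answers with JSON after invoking the model. The actual system is a Java JSR 286 portlet, so the Python sketch below is only a language-neutral illustration of that request/response pattern. The operation names, JSON fields, and in-memory model are hypothetical, and the function name serve_resource merely echoes the portlet's serveResource method.

```python
import json

# In-memory stand-in for the model layer: concept id -> {"label", "parent"}.
concepts = {}


def add_concept(params):
    concepts[params["id"]] = {"label": params["label"], "parent": params.get("parent")}
    return {"id": params["id"]}


def rename_concept(params):
    concepts[params["id"]]["label"] = params["label"]
    return {"id": params["id"]}


def move_concept(params):
    concepts[params["id"]]["parent"] = params["parent"]
    return {"id": params["id"]}


def remove_concept(params):
    del concepts[params["id"]]
    return {"id": params["id"]}


# Hypothetical operation names; the paper does not list the real ones.
HANDLERS = {
    "addConcept": add_concept,
    "renameConcept": rename_concept,
    "moveConcept": move_concept,
    "removeConcept": remove_concept,
}


def serve_resource(request_body):
    """Decode a JSON request, run the named operation, return a JSON response."""
    try:
        request = json.loads(request_body)
        handler = HANDLERS[request["op"]]
        result = handler(request.get("params", {}))
        return json.dumps({"ok": True, "result": result})
    except (KeyError, ValueError) as err:
        # Unknown operation, missing field, unknown concept id, or malformed JSON.
        return json.dumps({"ok": False, "error": repr(err)})


serve_resource('{"op": "addConcept", "params": {"id": "c1", "label": "fever"}}')
serve_resource('{"op": "renameConcept", "params": {"id": "c1", "label": "pyrexia"}}')
bad = json.loads(serve_resource('{"op": "explode"}'))
```

Routing every operation through one dispatch table keeps the view simple: each dialog only needs to know an operation name and its parameters, and errors come back in the same JSON envelope as results.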

3 Current Limitations & Future Directions

While the SKOS editor is suitable for building and curating special purpose SKOS vocabularies to run bespoke health-related NLP systems, it does have several limitations. First, the system is not suitable for editing large SKOS taxonomies (i.e. more than 1000 concepts). Second, although there have been recent efforts to develop best practice guidelines for building SKOS vocabularies (Mader et al., 2012; Manaf et al., 2012), there may still be situations in which particular language features do not work as expected.

Our long term goal is to integrate the SKOS editor as a lexicon development and management module within a comprehensive platform for developing clinical NLP algorithms. As part of this long term goal, and informed by the comments and suggestions of our early users, we are currently implementing several system enhancements, including adding version control tools, developing “wizards” to support the rapid creation of concept schemes, and most importantly, building multi-user functionality and collaborative editing.

4 Acknowledgements

We would like to thank Drs Tania Tudorache, Natasha Noy, and Matthew Horridge of Stanford University’s Department of Biomedical Informatics for their valuable guidance in using the OWL API. We would also like to thank Artem Khojoyan for his help with Liferay development. This work was funded by grants from the United States Veterans Administration (VA HIR 08-204), and the United States National Library of Medicine (NLM 1R01LM010964).

References

Thomas Baker, Sean Bechhofer, Antoine Isaac, Alistair Miles, Guus Schreiber, and Ed Summers. 2013. Key choices in the design of Simple Knowledge Organization System (SKOS). Journal of Web Semantics, 20:35–49.

Mike Conway, John Dowling, and Wendy Chapman. 2011. Developing an application ontology for mining free text clinical reports: The extended syndromic surveillance ontology. In Proceedings of the Third International Workshop on Health Document Text Mining and Information Analysis, Slovenia (LOUHI 2011), pages 75–82.

Matthew Horridge and Sean Bechhofer. 2011. The OWL API: A Java API for OWL ontologies. Semantic Web, 2(1):11–21, January.

Simon Jupp, Sean Bechhofer, and Robert Stevens. 2009. A flexible API and editor for SKOS. In Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications, ESWC 2009 Heraklion, pages 506–520. Springer-Verlag, Berlin.

Javier Lacasta, Javier Nogueras-Iso, and Francisco Javier Zarazaga-Soria. 2010. Terminological Ontologies: Design, Management and Practical Applications. Springer-Verlag, Berlin.

Vincent Liu, Mark P Clark, Mark Mendoza, Ramin Saket, Marla N Gardner, Benjamin J Turk, and Gabriel J Escobar. 2013. Automated identification of pneumonia in chest radiograph reports in critically ill patients. BMC Med Inform Decis Mak, 13:90.

Christian Mader, Bernhard Haslhofer, and Antoine Isaac. 2012. Finding quality issues in SKOS vocabularies. In Panayiotis Zaphiris, George Buchanan, Edie Rasmussen, and Fernando Loizides, editors, Theory and Practice of Digital Libraries, volume 7489, pages 222–233. Springer-Verlag, Berlin.

Nor Azlinayati Abdul Manaf, Sean Bechhofer, and Robert Stevens. 2012. The current state of SKOS vocabularies on the web. In Elena Simperl, Philipp Cimiano, Axel Polleres, Oscar Corcho, and Valentina Presutti, editors, The Semantic Web: Research and Applications, volume 7295, pages 270–284. Springer-Verlag, Berlin.

Mark Myslín, Shu-Hong Zhu, Wendy Chapman, and Mike Conway. 2013. Using Twitter to examine smoking behavior and perceptions of emerging tobacco products. J Med Internet Res, 15(8):e174.

Thomas Schandl and Andreas Blumauer. 2010. PoolParty: SKOS thesaurus management utilizing linked data. In The Semantic Web: Research and Applications, pages 421–425. Springer-Verlag, Berlin.

Richard A Wilson, Wendy W Chapman, Shawn J Defries, Michael J Becich, and Brian E Chapman. 2010. Automated ancillary cancer history classification for mesothelioma patients from free-text clinical reports. J Pathol Inform, 1:24.

Yonghui Wu, Joshua C Denny, S Trent Rosenbloom, Randolph A Miller, Dario A Giuse, and Hua Xu. 2012. A comparative study of current clinical natural language processing systems on handling abbreviations in discharge summaries. AMIA Annu Symp Proc, 2012:997–1003.
