SWAN/SIOC: Aligning Scientific Discourse Representation and Social Semantics

Post on 07-May-2015

Semantic Web Applications in Scientific Discourse Workshop at the International Semantic Web Conference / Washington, DC / 26th October 2009

School of Engineering and Informatics

Alexandre Passant (1), Paolo Ciccarese (2, 3), John G. Breslin (4), Tim Clark (2, 3)

(1) DERI, NUI Galway, Ireland

(2) Massachusetts General Hospital, Boston, USA

(3) Harvard Medical School, Boston, USA

(4) School of Engineering and Informatics, NUI Galway, Ireland

Motivation

• To provide a complete RDF-based model of online activities and scientific argumentation in neuromedicine:

– Combining Web 2.0 shared knowledge using SIOC and formal scientific data (hypotheses, claims, dialogue, evidence, publications, etc.) via SWAN

• To make (both formal and informal) discourse concepts and relationships more accessible to computation:

– So that they can be better navigated, compared and understood both across and within domains

How is this achieved?

• An alignment of ontologies was performed to provide a complete framework for modelling activities in scientific communities

• SWAN objects were integrated into SIOC Types module

• SWAN was reused to model argumentative discussions

• External models such as SCOT and MOAT were reused for tagging

• SCF is being updated so that it can create data according to this model

Collaborative websites are like data silos

* Source: Pidgin Technologies, www.pidgintech.com

Many isolated communities of users and their data

* Source: Pidgin Technologies, www.pidgintech.com

Need ways to connect these islands

* Source: Pidgin Technologies, www.pidgintech.com

Allowing users to easily move from one to another

* Source: Pidgin Technologies, www.pidgintech.com

Enabling users to easily bring their data with them

* Source: Pidgin Technologies, www.pidgintech.com

Types of data silos (scientific and social)

• Collaborative websites used by scientific researchers in various domains:

– SWAN/SCF is being used to connect these

• Social websites used by people collaborating or communicating through the Web 2.0 platform:

– SIOC is being used to connect these

• SWAN/SIOC connects both sets of data silos, linking not just their structures but also what is embedded within their content

SWAN (Semantic Web Applications in Neuromedicine)

• An ontology of scientific discourse (Ciccarese et al. 2008)

• A participatory knowledge base of hypotheses, claims, evidence and concepts in biomedicine, with the first instance in the domain of Alzheimer’s disease (AD)

• Currently being integrated with the SCF (Science Collaboration Framework) toolkit for biomedical web communities

• http://swan.mindinformatics.org/

What does SWAN consist of?

• A formal structure to record and present scientific discourse

• Tools for scientists to manage, access and share knowledge

• Tools for discovering conflicts, gaps and missing evidence

• An information bridge to promote collaboration

• A community process built upon the Alzforum site

Main concepts and relationships in the SWAN ontology

Modules in the SWAN ontology

A typical hypothesis

Contributions from leading researchers

Key research topics

Contribute content

Inventory of ideas

Mechanisms of disease

Scientist view: Toxic protein fragments believed responsible for AD

Key information, gaps and conflicts

Computer view: Knowledge organised for computer processing, integration and reasoning

Browsing evidence and inconsistencies

New experiment required?

A researcher-supported effort

• Dozens of etiopathological AD models annotated by SWAN curators in collaboration with leading researchers

• Content reviewed before release by over twenty senior AD researchers

• Software features reviewed before release by over thirty senior AD researchers

• Extensive feedback incorporated into SWAN, such that this is a community tool (in line with Web 2.0 principles)

Semantically-Interlinked Online Communities (SIOC)

• An effort from DERI, NUI Galway to discover how we can create / establish ontologies on the Semantic Web

• Goal of the SIOC ontology is to address interoperability issues on the (Social) Web

• http://sioc-project.org/

• SIOC has been adopted in a framework of 50 applications or modules deployed on over 400 sites

• Various domains: Web 2.0, enterprise information integration, HCLS, e-government

The steps taken

1. Develop an ontology of terms for representing rich data from the Social Web

2. Create a food chain for producing, collecting and consuming SIOC data

3. As well as disseminating SIOC through papers, provide documentation and examples at sioc-project.org

• SIOC aims to enrich the Web infrastructure:

– During the next upgrade cycle, gigabytes of semantically-enriched community data become available!

Some of the SIOC core ontology classes and properties

Some examples of where SIOC is already in use (about 50 applications / modules)

Creating a Social Semantic Web of previously-disconnected social “data silos”

Also integrating scientific “data silos” in a semantic scientific collaboration framework

• Enabling researchers to:

– Collect data

– Draw conclusions

– Gather information

– Create/modify hypotheses

– Perform experiments

• But with the benefit of cross-community and cross-domain experiences and results

Mappings between SWAN and SIOC at http://rdfs.org/sioc/swan in OWL-DL

Mappings between SWAN and SIOC classes

• Subclasses of sioc:Item:

– swanscidis:DiscourseElement

– swanscidis:ResearchStatement

– swanscidis:ResearchQuestion

– swanscidis:ResearchComment

– swancit:Citation

– swancit:JournalArticle

• Other mappings:

– sioc:Post > swancit:WebArticle, swancit:WebNews

– sioc:Comment > swancit:WebComment

• swanscidis is the Scientific Discourse module, which provides a set of classes and properties to represent discourse elements

• swancit is the Citations module, which aims to model the various citation elements that occur in scientific publishing

Mappings between SWAN and SIOC properties

• Subproperties of sioc:related_to:

– swandisrel:agreesWith / swandisrel:disagreesWith

– swandisrel:alternativeTo

– swandisrel:arisesFrom

– swandisrel:cites

– swandisrel:consistentWith / swandisrel:inconsistentWith

– swandisrel:discusses

– swandisrel:inResponseTo

– swandisrel:motivatedBy

– swandisrel:refersTo

• swandisrel is the Scientific Discourse Relationships module, which collects some of the relationships used for modelling discourse

• May also use sioc:Item dcterms:hasPart swanscidis:DiscourseElement, for example, to represent that a particular hypothesis is part of a blog post
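
The effect of these property mappings can be sketched in a few lines of plain Python. This is only an illustration, not the authors' implementation: triples are modelled as tuples of prefixed names, and the `ex:` resources and the `entail_related_to` helper are hypothetical. Any statement made with one of the listed swandisrel properties also yields a sioc:related_to statement, while dcterms:hasPart does not.

```python
# Properties listed on the slide as subproperties of sioc:related_to,
# written as prefixed names rather than full URIs.
RELATED_TO = "sioc:related_to"
SUBPROPERTIES_OF_RELATED_TO = {
    "swandisrel:agreesWith", "swandisrel:disagreesWith",
    "swandisrel:alternativeTo", "swandisrel:arisesFrom",
    "swandisrel:cites", "swandisrel:consistentWith",
    "swandisrel:inconsistentWith", "swandisrel:discusses",
    "swandisrel:inResponseTo", "swandisrel:motivatedBy",
    "swandisrel:refersTo",
}


def entail_related_to(triples):
    """Return the input triples plus the sioc:related_to triples they entail."""
    inferred = set(triples)
    for subject, predicate, obj in triples:
        if predicate in SUBPROPERTIES_OF_RELATED_TO:
            inferred.add((subject, RELATED_TO, obj))
    return inferred


# Hypothetical data: a blog post containing a claim that conflicts with another.
data = {
    ("ex:post1", "dcterms:hasPart", "ex:claim1"),
    ("ex:claim1", "swandisrel:inconsistentWith", "ex:claim2"),
}
result = entail_related_to(data)
```

A SIOC-only consumer can then follow sioc:related_to links without knowing anything about the swandisrel vocabulary.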

Mapping redundancy

• Redundant mappings:

– Can be entailed thanks to the transitivity of rdfs:subClassOf / rdfs:subPropertyOf

– e.g. “swancit:JournalArticle rdfs:subClassOf sioc:Item” can be inferred from “swancit:JournalArticle rdfs:subClassOf swancit:Citation” and “swancit:Citation rdfs:subClassOf sioc:Item”

• However:

– SIOC applications generally do not support such chained entailments

– Need to address lightweight inference

– Therefore we provide direct rdfs:subClassOf mappings
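
The trade-off can be illustrated with a small Python sketch. It is written under stated assumptions: subclass statements are plain (subclass, superclass) pairs of prefixed names, and both helper functions are hypothetical, not part of any SIOC tool. A full reasoner derives the redundant mapping by transitive closure, whereas a lightweight consumer that only looks one hop up finds sioc:Item only if the direct mapping has been published.

```python
def subclass_closure(pairs):
    """Transitive closure of a set of (subclass, superclass) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def direct_superclasses(cls, pairs):
    """One-hop lookup, as a consumer without chained entailment would do."""
    return {sup for sub, sup in pairs if sub == cls}


# The two chained statements from the slide.
chained = {
    ("swancit:JournalArticle", "swancit:Citation"),
    ("swancit:Citation", "sioc:Item"),
}
redundant = ("swancit:JournalArticle", "sioc:Item")

# A reasoner recovers the redundant mapping from the chain...
inferred_by_reasoner = redundant in subclass_closure(chained)

# ...but a one-hop consumer sees sioc:Item only once the mapping is explicit.
seen_without_direct = direct_superclasses("swancit:JournalArticle", chained)
seen_with_direct = direct_superclasses(
    "swancit:JournalArticle", chained | {redundant}
)
```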

Querying mappings

• Simple query to identify relatedness between items:

– Applying a SIOC query over SWAN data

– SPARQL / Pellet, files loaded into memory at runtime

– Experiment with both simple mappings (including transitive closure) and full mappings

PREFIX sioc: <http://rdfs.org/sioc/ns#>
SELECT DISTINCT ?s ?o
WHERE {
  ?s sioc:related_to ?o .
  ?s a sioc:Item .
  ?o a sioc:Item .
}
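
To show what this query returns over SWAN data, the sketch below evaluates the same pattern in plain Python instead of SPARQL / Pellet. It is a rough illustration only: the `ex:` resources are invented, only three of the direct mappings are included, and `materialize` and `related_items` are hypothetical helpers. Without the mappings the SIOC query finds nothing; once they are applied, the SWAN citation link appears as a sioc:related_to pair.

```python
RDF_TYPE = "rdf:type"

# A few of the direct mappings from the slides, as prefixed names.
SUBCLASS_OF = {
    "swanscidis:ResearchStatement": "sioc:Item",
    "swancit:JournalArticle": "sioc:Item",
}
SUBPROPERTY_OF = {"swandisrel:cites": "sioc:related_to"}


def materialize(triples):
    """Add the triples entailed by the direct class and property mappings."""
    out = set(triples)
    for s, p, o in triples:
        if p == RDF_TYPE and o in SUBCLASS_OF:
            out.add((s, RDF_TYPE, SUBCLASS_OF[o]))
        if p in SUBPROPERTY_OF:
            out.add((s, SUBPROPERTY_OF[p], o))
    return out


def related_items(triples):
    """Distinct (?s, ?o) pairs matching the three patterns of the query."""
    items = {s for s, p, o in triples if p == RDF_TYPE and o == "sioc:Item"}
    return sorted(
        {(s, o) for s, p, o in triples
         if p == "sioc:related_to" and s in items and o in items}
    )


# Hypothetical SWAN data: a research statement citing a journal article.
swan_data = {
    ("ex:statement1", RDF_TYPE, "swanscidis:ResearchStatement"),
    ("ex:article1", RDF_TYPE, "swancit:JournalArticle"),
    ("ex:statement1", "swandisrel:cites", "ex:article1"),
}

before = related_items(swan_data)              # mappings not applied
after = related_items(materialize(swan_data))  # mappings applied
```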

W3C HCLS Interest Group notes published

• http://www.w3.org/TR/hcls-sioc/

• http://www.w3.org/TR/hcls-swan/

• http://www.w3.org/TR/hcls-swansioc/

RDFa support in Drupal 7 for SSW data

Exposing scientific results to search

• Yahoo! Search Monkey and Google Rich Snippets

• Highlights the structured data embedded in web pages

• Google developers have indicated that scholarly publications marked up with Rich Snippets will also be picked up and appropriately indexed by Google Scholar

Acknowledgements

• We would like to thank Science Foundation Ireland for their support under grant SFI/08/CE/I1380 (Líon 2)

• We would also like to thank an anonymous foundation for a generous gift in support of this work

• Thanks to members of the W3C HCLSIG, in particular:

– Susie Stephens

– Scott Marshall

– Eric Prud’hommeaux
