
Post on 16-Jan-2016


Ontology languages, XML and RDF

Contents

• Definition of ontology
• A quick survey of famous ontology languages
  – Graphical
  – Logic based
• XML
• XML, namespaces, XSLT
• XSLT, XSL:FO
• RDF

Ontology

• An ontology is a hierarchically structured set of terms for describing a domain that can be used as a skeletal foundation for a knowledge base. (Swartout, Patil, Knight, Russ)
• An ontology provides the means for describing explicitly the conceptualization behind the knowledge represented in a knowledge base. (Bernaras, Laresgoiti, Corera)
• An ontology is a formal, explicit specification of a shared conceptualization. (Studer, Benjamins, Fensel)

(In)famous “Layer Cake”

[Figure: Semantic Web layer cake. Recoverable labels: Data Exchange, Semantics+reasoning, Relational Data?]

Ontology Languages
• Wide variety of languages for “Explicit Specification”
  – Graphical notations
    • Semantic networks
    • Topic Maps
    • UML
    • RDF
  – Logic based
    • Description Logics (e.g., OIL, DAML+OIL, OWL)
    • Rules (e.g., RuleML, LP/Prolog)
    • First Order Logic (e.g., KIF)
    • Conceptual graphs
    • (Syntactically) higher order logics (e.g., LBase)
    • Non-classical logics (e.g., F-logic, Non-Mon, modalities)
  – Bayesian/probabilistic/fuzzy
• Degree of formality varies widely
  – Increased formality makes languages more amenable to machine processing (e.g., automated reasoning)

Many languages use an “object oriented” model based on:
• Objects/Instances/Individuals
  – Elements of the domain of discourse
  – Equivalent to constants in FOL
• Types/Classes/Concepts
  – Sets of objects sharing certain characteristics
  – Equivalent to unary predicates in FOL
• Relations/Properties/Roles
  – Sets of pairs (tuples) of objects
  – Equivalent to binary predicates in FOL
• Such languages are/can be:
  – Well understood
  – Formally specified
  – (Relatively) easy to use
  – Amenable to machine processing
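As a rough illustration of this object/class/relation model, the sketch below represents individuals as constants, a class as a set of individuals (a unary predicate), and a property as a set of pairs (a binary predicate). The names `carole`, `knuth`, `Person`, `hasColleague` and the helper functions are chosen only for illustration.

```python
# Individuals: elements of the domain of discourse (constants in FOL).
domain = {"carole", "knuth", "stanford"}

# A class is a set of individuals (a unary predicate in FOL).
Person = {"carole", "knuth"}

# A property is a set of pairs of individuals (a binary predicate in FOL).
hasColleague = {("carole", "knuth"), ("knuth", "carole")}


def is_a(individual, cls):
    """Person(x) holds exactly when x is a member of the set Person."""
    return individual in cls


def related(prop, subject, obj):
    """hasColleague(x, y) holds exactly when (x, y) is in the relation."""
    return (subject, obj) in prop


def values_of(prop, subject):
    """All y such that prop(subject, y) holds."""
    return {o for (s, o) in prop if s == subject}
```

With these definitions, `is_a("carole", Person)` is true and `values_of(hasColleague, "carole")` yields the set containing `"knuth"`.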

Web “Schema” Languages
• Existing Web languages extended to facilitate content description
  – XML → XML Schema (XMLS)
  – RDF → RDF Schema (RDFS)
• XMLS is not an ontology language
  – Changes format of DTDs (document schemas) to be XML
  – Adds an extensible type hierarchy
    • Integers, Strings, etc.
    • Can define sub-types, e.g., positive integers
• RDFS is recognisable as an ontology language
  – Classes and properties
  – Sub/super-classes (and properties)
  – Range and domain (of properties)

XML
• Informal definition: XML is a markup language for representing documents that contain structured information.
• Usages:
  – Data exchange (e.g. RSS, SOAP envelope, ad hoc B2B data exchange)
  – Web services
  – Data integration
  – Content publishing (single source, multiple output)
  – Multimedia presentations (SMIL)

XML
• Processing
  – Parsing
    • SAX: uses callbacks to notify the application of each parsing event (e.g. startDocument(), startElement(), endDocument(), etc.)
    • DOM: builds an object model of the whole document in memory
  – Remote procedure call: XML-RPC
  – SOAP (web services)
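The difference between the two parsing styles can be sketched with Python's standard library, used here only as an illustration. The sample document and the `EventLogger` class are made up for this example; `xml.sax` delivers callbacks while reading, whereas `xml.dom.minidom` returns a tree that can be queried afterwards.

```python
import xml.sax
from xml.dom import minidom

# Small sample document (hypothetical content).
DOC = b"<book><title>chicken soup</title><section><para>text</para></section></book>"


class EventLogger(xml.sax.ContentHandler):
    """SAX style: the parser calls these methods while streaming the input."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)

    def endDocument(self):
        self.events.append("endDocument")


handler = EventLogger()
xml.sax.parseString(DOC, handler)

# DOM style: the whole document is loaded into an in-memory tree first.
dom = minidom.parseString(DOC)
title = dom.getElementsByTagName("title")[0].firstChild.data
```

SAX keeps memory use low because nothing is retained unless the handler stores it; DOM is more convenient when the application needs random access to the document.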

Namespaces in XML

Namespaces avoid name conflicts:
• same tag for different things
• same tag for different formats

Example:
Document 1: <person>Tim Berners Lee</person>
Document 2: <person>Tim Berners Lee</person>

Namespaces in XML (cont.)

• First try:
<Document1:person> Tim Berners Lee </Document1:person>

• Second try:
<www.w3c.org:person> Tim Berners Lee </www.w3c.org:person>

• Final solution:
<foo xmlns:doc="www.w3c.org">
  <doc:person> Tim Berners Lee </doc:person>
</foo>
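How a namespace-aware parser tells two identically named tags apart can be sketched as follows. The two namespace URIs and the `root` wrapper element are hypothetical placeholders; the point is that Python's `xml.etree.ElementTree` expands each `prefix:name` into `{namespace-uri}name`, so the prefix itself carries no meaning.

```python
import xml.etree.ElementTree as ET

# Two <person> elements from different vocabularies (URIs are placeholders).
DOC = """<root xmlns:a="http://example.org/doc1" xmlns:b="http://example.org/doc2">
  <a:person>Tim Berners Lee</a:person>
  <b:person>Tim Berners Lee</b:person>
</root>"""

root = ET.fromstring(DOC)

# Each tag is reported as {namespace-uri}local-name.
tags = [child.tag for child in root]

# Lookups use the URI; the prefix "d1" chosen here need not match the document.
ns = {"d1": "http://example.org/doc1"}
doc1_people = root.findall("d1:person", ns)
```

Here `tags` holds two distinct names even though both elements are written as `person`, and the lookup with prefix `d1` finds only the first one.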

Namespace example

<html:html xmlns:html="..." xmlns:math="...">
  <html:title> George Soros </html:title>
  <html:h2> Counting ... </html:h2>
  <math:reln>
    .....
  </math:reln>
</html:html>

XML Schema: What is a Schema?

• A schema defines the content of a number of XML documents
• It defines:
  – which elements and attributes can be included
  – the element content
  – the order of elements
• Schema replaces DTD
• Think of classes (schema) and instances (documents)

XML Schema
• A schema is saved with the suffix .xsd
• A document is validated against a schema
• A schema is itself an XML document

XSL

• XSL consists of
  – formatting objects
  – XSL Transformations (XSLT)
• Formatting objects: specify presentation
• XSLT: transformations to arbitrary formats
• The idea of XSLT: traverse the tree and apply a specific template at each node
• Usage
  – XML to XML (e.g. SVG)
  – XML to (X)HTML
  – XML to text
  – XML to other formats (e.g. CSV)

The XSLT language

• Transformations are written in an XML dialect
• Example:

<xsl:transform xmlns:xsl="http://www.w3.org.">
  <!-- template rules go here -->
</xsl:transform>

• A rule based language
• Output handled by the environment

Terminology

• A template rule consists of:
  – a pattern
  – a template
• The pattern specifies the nodes a template applies to, by
  – tag name
  – attributes
  – context
• The template defines the transformation

XML Example

<book>
  <title> chicken soup </title>
  <section>
    <title> Introduction </title>
    <para> I’ve always.. </para>
  </section>
</book>

[Figure: document tree. book has children title and section; section has children title and para.]

Template rule for book

<xsl:template match="book">
  <body>
    <h1> <xsl:value-of select="title"/> </h1>
    <xsl:apply-templates select="section"/>
  </body>
</xsl:template>

Template rule for section

<xsl:template match="section">
  <h2><xsl:value-of select="title"/></h2>
  <xsl:apply-templates select="para"/>
</xsl:template>
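The idea behind these two rules (traverse the tree and apply the matching template at each node) can be imitated in a few lines of Python. This is only a sketch of the rule-based processing model, not an XSLT processor: the `RULES` table, the function names, and the `para` rule are assumptions added for illustration, and the fallback for unmatched nodes is a simplification of XSLT's built-in rules.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """<book>
  <title>chicken soup</title>
  <section>
    <title>Introduction</title>
    <para>I've always..</para>
  </section>
</book>"""


def book_rule(node):
    # Mirrors the book template: <h1> from the title, then recurse into sections.
    sections = "".join(apply_templates(s) for s in node.findall("section"))
    return "<body><h1>" + node.findtext("title").strip() + "</h1>" + sections + "</body>"


def section_rule(node):
    # Mirrors the section template: <h2> from the title, then recurse into paras.
    paras = "".join(apply_templates(p) for p in node.findall("para"))
    return "<h2>" + node.findtext("title").strip() + "</h2>" + paras


def para_rule(node):
    # Hypothetical rule; the slides do not define a template for para.
    return "<p>" + (node.text or "").strip() + "</p>"


# Pattern (here just a tag name) -> template.
RULES = {"book": book_rule, "section": section_rule, "para": para_rule}


def apply_templates(node):
    rule = RULES.get(node.tag)
    if rule is None:
        # Simplified fallback: emit the text content of unmatched nodes.
        return "".join(node.itertext())
    return rule(node)


def transform(xml_text):
    return apply_templates(ET.fromstring(xml_text))
```

Calling `transform(BOOK_XML)` produces `<body><h1>chicken soup</h1><h2>Introduction</h2><p>I've always..</p></body>`.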

Formatting objects

• Formatting objects represent common document elements
• Examples
  – block
  – external-graphic
  – table
  – simple-link
• They are specified in XML
• Attributes specify their appearance

Template rule with objects

<xsl:template match="section">
  <fo:block font-size="12pt">
    <xsl:value-of select="title"/>
  </fo:block>
  <xsl:apply-templates select="para"/>
</xsl:template>

SMIL

• SMIL: Synchronized Multimedia Integration Language
• SMIL is an XML-based language for multimedia presentations that integrate streaming audio and video with images, text, etc.
• It lets the author specify what should be presented when


SMIL

<smil>
  <head> </head>
  <body>
    <par>
      <audio src="sound.rm"/>
      <seq>
        <textstream src="tobbe2.rt" region="videoregion"/>
        <par>
          <textstream src="tobbe.rt" region="textregion"/>
          <video src="tobias.rm" region="videoregion"/>
        </par>
      </seq>
    </par>
  </body>
</smil>

RDF and RDFS
• RDF stands for Resource Description Framework
• It is a W3C candidate recommendation (http://www.w3.org/RDF)
• RDF is a graphical formalism (+ XML syntax + semantics)
  – for representing metadata
  – for describing the semantics of information in a machine-accessible way
• RDFS extends RDF with “schema vocabulary”, e.g.:
  – Class, Property
  – type, subClassOf, subPropertyOf
  – range, domain

The RDF Data Model
• Statements are <subject, predicate, object> triples:
  <Knuth,hasColleague,StanfordU>
• Can be represented as a graph:
  Knuth --hasColleague--> StanfordU
• Statements describe properties of resources
• A resource is any object that can be pointed to by a URI:
  – a document, a picture, a paragraph on the Web;
  – http://www.cs.man.ac.uk/index.html
  – a book in the library, a real person (?)
  – isbn://5031-4444-3333
  – …
• Properties themselves are also resources (URIs)

URIs
• URI = Uniform Resource Identifier
• "The generic set of all names/addresses that are short strings that refer to resources"
• URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW (e.g., web pages)
• In RDF, URIs typically look like “normal” URLs, often with fragment identifiers to point at specific parts of a document:
  – http://www.somedomain.com/some/path/to/file#fragmentID
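The split between the document part and the fragment identifier of such a URI can be shown with Python's standard `urllib.parse` module, used here purely as an illustration.

```python
from urllib.parse import urldefrag, urlparse

uri = "http://www.somedomain.com/some/path/to/file#fragmentID"

# urldefrag separates the document URI from the fragment identifier.
document, fragment = urldefrag(uri)

# urlparse exposes the remaining components (scheme, host, path, ...).
parts = urlparse(uri)
```

After this, `document` is the URI without `#fragmentID`, and `fragment` is `fragmentID`.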

Linking Statements
• The subject of one statement can be the object of another
• Such collections of statements form a directed, labeled graph
• Note that the object of a triple can also be a “literal” (a string)

[Figure: graph with the edges
  Knuth --hasColleague--> STU
  Carole --hasColleague--> STU
  STU --hasHomePage--> http://www.stanford.edu]
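A set of linked statements can be held as plain triples and walked like a graph. The sketch below assumes the figure reads as Knuth and Carole each having STU as colleague, and STU having the home page http://www.stanford.edu; the helper names `objects` and `follow` are hypothetical.

```python
# Each statement is a (subject, predicate, object) triple.
triples = {
    ("Knuth", "hasColleague", "STU"),
    ("Carole", "hasColleague", "STU"),
    # The object here is a literal (a string), not another resource.
    ("STU", "hasHomePage", "http://www.stanford.edu"),
}


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def follow(graph, start, *path):
    """Follow a chain of predicates through the directed, labeled graph."""
    frontier = {start}
    for predicate in path:
        frontier = {o for node in frontier for o in objects(graph, node, predicate)}
    return frontier
```

Because the object of the first statement is the subject of the third, `follow(triples, "Knuth", "hasColleague", "hasHomePage")` reaches the home page literal.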

How can RDF be implemented
• Usually in RDF/XML syntax
• However, other notations are possible
  – e.g. Notation3

<http://xyz.org/#Sean> <http://xyz.org/#name> "Sean" .
<http://xyz.org/#a> <http://xyz.org/#b> <http://xyz.org/#c> .

RDF Syntax
• RDF has an XML syntax that has a specific meaning:
• Every Description element describes a resource
• Every attribute or nested element inside a Description is a property of that Resource
• We can refer to resources by using URIs

<Description about="some.uri/person/knuth">
  <hasColleague resource="some.uri/STU/Math"/>
</Description>
<Description about="some.uri/STU/Math">
  <hasHomePage>http://www.stanford.edu/~Math</hasHomePage>
</Description>
<Description about="some.uri/person/carole">
  <hasColleague resource="some.uri/STU/Math"/>
</Description>
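The reading rule above (each Description is a subject, each nested element is a property) can be sketched as a small converter from this simplified syntax to triples. Two assumptions are made: the three Description elements are wrapped in a made-up `<rdf>` root so that the text is a single XML document, and the un-namespaced `about`/`resource` attributes of the slide are kept, whereas real RDF/XML qualifies them with the rdf namespace (`rdf:about`, `rdf:resource`). The function name `to_triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified syntax from the slide, wrapped in a hypothetical <rdf> root.
DOC = """<rdf>
  <Description about="some.uri/person/knuth">
    <hasColleague resource="some.uri/STU/Math"/>
  </Description>
  <Description about="some.uri/STU/Math">
    <hasHomePage>http://www.stanford.edu/~Math</hasHomePage>
  </Description>
  <Description about="some.uri/person/carole">
    <hasColleague resource="some.uri/STU/Math"/>
  </Description>
</rdf>"""


def to_triples(xml_text):
    """Turn each property element inside a Description into one triple."""
    triples = []
    for description in ET.fromstring(xml_text).iter("Description"):
        subject = description.get("about")
        for prop in description:
            # A resource attribute points at another resource;
            # otherwise the element text is taken as a literal.
            obj = prop.get("resource")
            if obj is None:
                obj = (prop.text or "").strip()
            triples.append((subject, prop.tag, obj))
    return triples
```

Applied to `DOC`, this yields three triples, one per property element.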

rdf:type
• A predefined RDF property
• Its value: a resource that represents a category or class
• Its subject: an instance of that category or class

prefix ex:, URI: http://www.example.org/terms

RDF containers
• Bag: (a resource having type rdf:Bag)
  – Represents an unordered list of resources or literals
  – Duplicate values are permitted
• Sequence: (a resource having type rdf:Seq)
  – Represents an ordered list of resources or literals
  – Duplicate values are permitted
• Alternatives: (a resource having type rdf:Alt)
  – Represents a group of resources or literals that are alternatives

Bag example

RDF reification
• Association of a statement with a specific resource representing the statement
• Used to make statements about statements
• Vocabulary:
  – type rdf:Statement
  – properties
    • rdf:subject
    • rdf:predicate
    • rdf:object

Reification example
• A member of the project opposed the statement that http://www.daml.org/projects/#11 is an NSF-funded project
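The reification vocabulary can be sketched with plain triples. Only `rdf:type`, `rdf:Statement`, `rdf:subject`, `rdf:predicate` and `rdf:object` come from RDF; the identifiers `ex:fundedBy`, `ex:NSF`, `ex:claim1`, `ex:member` and `ex:opposes` are invented here to stand in for the example sentence.

```python
# The statement being talked about (predicate and object names are hypothetical).
statement = ("http://www.daml.org/projects/#11", "ex:fundedBy", "ex:NSF")


def reify(stmt, statement_id):
    """Describe a statement as a resource using the RDF reification vocabulary."""
    subject, predicate, obj = stmt
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


graph = reify(statement, "ex:claim1")

# A statement about the statement: the project member opposes the claim.
graph.append(("ex:member", "ex:opposes", "ex:claim1"))
```

Note that the graph describes the claim without asserting it: the original triple itself is not part of `graph`.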

RDF graphs

RDF Schema (RDFS)
• RDF gives a formalism for meta data annotation, and a way to write it down in XML, but it does not give any special meaning to vocabulary such as subClassOf or type
  – Interpretation is an arbitrary binary relation
• RDF Schema allows you to define vocabulary terms and the relations between those terms
  – it gives “extra meaning” to particular RDF predicates and resources
  – this “extra meaning”, or semantics, specifies how a term should be interpreted

RDFS Examples• RDF Schema terms (just a few examples):

– Class

– Property

– type

– subClassOf

– range

– domain

• These terms are the RDF Schema building blocks (constructors) used to create vocabularies:

<Person,type,Class>

<hasColleague,type,Property>

<Professor,subClassOf,Person>

<Carole,type,Professor>

<hasColleague,range,Person>

<hasColleague,domain,Person>
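The “extra meaning” of these terms shows up as inference. The sketch below applies four standard RDFS entailment patterns (subClassOf transitivity, type propagation along subClassOf, domain, and range) to the triples above until nothing new is derived. The function name `rdfs_closure` and the extra fact `<Carole,hasColleague,Knuth>` are assumptions added for the example, and the rule set is deliberately partial.

```python
def rdfs_closure(triples):
    """Repeatedly apply a few RDFS rules until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            if p == "subClassOf":
                # subClassOf is transitive.
                new |= {(s, "subClassOf", c) for (b, q, c) in inferred
                        if q == "subClassOf" and b == o}
                # Instances of a subclass are instances of the superclass.
                new |= {(x, "type", o) for (x, q, c) in inferred
                        if q == "type" and c == s}
            elif p == "domain":
                # The subject of any statement using property s has type o.
                new |= {(x, "type", o) for (x, q, _) in inferred if q == s}
            elif p == "range":
                # The object of any statement using property s has type o.
                new |= {(y, "type", o) for (_, q, y) in inferred if q == s}
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("Person", "type", "Class"),
    ("hasColleague", "type", "Property"),
    ("Professor", "subClassOf", "Person"),
    ("Carole", "type", "Professor"),
    ("hasColleague", "range", "Person"),
    ("hasColleague", "domain", "Person"),
    ("Carole", "hasColleague", "Knuth"),  # hypothetical extra fact
}

closure = rdfs_closure(facts)
```

From these facts the closure contains `<Carole,type,Person>` (via subClassOf) and `<Knuth,type,Person>` (via the range of hasColleague), neither of which was stated explicitly.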

RDFS is about creating taxonomies

Inferences:
• Yangtze is a river
• Yangtze has a length of 6300 kilometers

Inferences in RDF and RDFS

• Two classes, same concept:
  – Equivalent classes: Airplane and Plane
• Cardinality constraints:
  – Ocean has one maxDepth

Desire for more expressiveness

RDF/RDFS “Liberality”
• No distinction between classes and instances (individuals)
  <Species,type,Class>
  <Lion,type,Species>
  <Leo,type,Lion>
• Properties can themselves have properties
  <hasDaughter,subPropertyOf,hasChild>
  <hasDaughter,type,familyProperty>
• No distinction between language constructors and ontology vocabulary, so constructors can be applied to themselves/each other
  <type,range,Class>
  <Property,type,Class>
  <type,subPropertyOf,subClassOf>

RDFS semantics using logic

Problems with RDFS
• RDFS too weak to describe resources in sufficient detail
  – No localised range and domain constraints
    • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants
  – No existence/cardinality constraints
    • Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
  – No transitive, inverse or symmetrical properties
    • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
  – …
• Difficult to provide reasoning support
  – No “native” reasoners for non-standard semantics
  – May be possible to reason via FO axiomatisation

RDF(S) tools
• Read RDF data
  – Parsers: Jena, Redland, SWI-Prolog
  – Validators: W3C RDF validation service
  – Editors: IsaViz, RDF Author, RDFEd, InferEd
• Store RDF data (XML format, triples or relational/OO DB)
  – RSSDB, RDFLib
• Use RDF data (applications, RSS news, etc.)
• Manipulate RDF data (inference, query, etc.)
  – Jena RDQL, etc.
  – Example:

SELECT ?person, ?knows
WHERE (?x <http://xmlns.com/foaf/0.1/knows> ?z),
      (?x <http://xmlns.com/foaf/0.1/name> ?person),
      (?z <http://xmlns.com/foaf/0.1/name> ?knows)
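What such a query asks for can be sketched as pattern matching over a set of triples: each pattern either binds a variable or must agree with an earlier binding. This is a minimal illustration, not Jena's RDQL engine. The sample data, the function names `match` and `query`, and the use of the standard FOAF namespace `http://xmlns.com/foaf/0.1/` are assumptions.

```python
FOAF = "http://xmlns.com/foaf/0.1/"  # assumed FOAF namespace

# Hypothetical data: Alice knows Bob.
triples = [
    ("ex:a", FOAF + "name", "Alice"),
    ("ex:b", FOAF + "name", "Bob"),
    ("ex:a", FOAF + "knows", "ex:b"),
]


def match(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple, or return None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it against its earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, graph):
    """Join the patterns one after another, keeping consistent bindings."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [b for sol in solutions for t in graph
                     if (b := match(pattern, t, sol)) is not None]
    return solutions


patterns = [
    ("?x", FOAF + "knows", "?z"),
    ("?x", FOAF + "name", "?person"),
    ("?z", FOAF + "name", "?knows"),
]
rows = [(s["?person"], s["?knows"]) for s in query(patterns, triples)]
```

For the sample data, `rows` contains the single pair `("Alice", "Bob")`: the name of someone and the name of a person they know.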

Web Ontology Language Requirements

Desirable features identified for Web Ontology Language:

• Extends existing Web standards
  – Such as XML, RDF, RDFS
• Easy to understand and use
  – Should be based on familiar KR idioms

• Formally specified

• Of “adequate” expressive power

• Possible to provide automated reasoning support

From RDF to OWL
• Two languages developed to satisfy above requirements
  – OIL: developed by group of (largely) European researchers (several from EU OntoKnowledge project)
  – DAML-ONT: developed by group of (largely) US researchers (in DARPA DAML programme)
• Efforts merged to produce DAML+OIL
  – Development was carried out by “Joint EU/US Committee on Agent Markup Languages”
  – Extends (“DL subset” of) RDF
• DAML+OIL submitted to W3C as basis for standardisation
  – Web-Ontology (WebOnt) Working Group formed
  – WebOnt group developed OWL language based on DAML+OIL
  – OWL language now a W3C Candidate Recommendation
  – Will soon become Proposed Recommendation

OWL Language
• Three species of OWL
  – OWL Full is union of OWL syntax and RDF
  – OWL DL restricted to FOL fragment (≈ DAML+OIL)
  – OWL Lite is “easier to implement” subset of OWL DL
• Semantic layering
  – OWL DL ≈ OWL Full within DL fragment
  – DL semantics officially definitive
• OWL DL based on SHIQ Description Logic
  – In fact it is equivalent to SHOIN(Dn) DL
• OWL DL benefits from many years of DL research
  – Well defined semantics
  – Formal properties well understood (complexity, decidability)
  – Known reasoning algorithms
  – Implemented systems (highly optimised)

(In)famous “Layer Cake”

[Figure: Semantic Web layer cake. Recoverable labels: Data Exchange, Semantics+reasoning, Relational Data?]

• Relationship between layers is not clear

• OWL DL extends “DL subset” of RDF