UNIVERSITY OF JYVÄSKYLÄ
Semantic Web and Linked Data
IHME course, Spring 2015
University of Jyväskylä, Khriyenko Oleksiy
Evolution of the Web
[Figure: the evolution of the Web shown as three stages, with the labels CONTEXT, OWL, Web of Data, Web of Services, Web of People and Internet of Things.]

Static Environment
Communication: Human-to-Data, Human-to-Service, Human-to-Human, Service-to-Data, Service-to-Service.
Resources: Static Data, Static Services.

Flexible Interoperable Static Environment, Collaborative Environment
Communication: Human-to-Data, Human-to-Service, Human-to-Human, Service-to-Data, Service-to-Service.
Resources: Static Annotated Data, Ontology-driven Services/Software.

Context-aware Flexible Interoperable Dynamic Collaborative Environment
Resources: Proactive Goal-driven Resources: data, services/software, processes, organizations, real world objects (human, device, machine, etc.)
Communication: Resource-to-Resource (Thing-to-Thing).
10/04/2015 IHME course
Avalanche of information
• Education
• Events (art, cultural, sport), exhibitions, etc.
• Work
• Public transport scheduling system
• Shopping Centers: sales, offers, etc.
• Emergency service activities
• Social Networks
• Road and city services
• Weather forecast and disaster notification systems
But! Information is not yet data…
Demands and Challenges
Demands of Society and Businesses
• society requires new innovative services and applications that make life much easier, more comfortable and interactive;
• industry requires new intelligent systems to perform maintenance better and to better automate product development and product operation processes;
• markets are looking for new opportunities based on information and data co-creation and reuse.

Challenges: Data
• unavailability of data limits our ability to develop new useful services and erodes the context-awareness of applications and services;
• complex accessibility and heterogeneity of data sources limit the consumption of data by applications and services;
• human orientation of data formats slows down the creation and integration of intelligent autonomous services;
• passiveness of data sources and the lack of handy channels to provide and manage data minimize data reuse.

Challenges: Applications and Services
• being bound to a certain data source, an application has limited possibilities to access other data sources and to get fresher, updated information;
• being based on a limited (closed) data model, an application is not able to utilize data produced by another application and to be interoperable.
World Wide Web
• AAA principle = Anybody can write Anything about Any topic
• Basic building block is a web page
• Any web page can refer to any other web page freely
• No central point of control
• No central repository => Documents scattered across the whole Web
• Problem: A web page is a document for humans. For computers (machines), web pages are too difficult to understand.
Semantic Web vision
• Solution
  – Let's produce Web data in a form that is easy to "digest" by a machine without losing the good properties of the WWW
• How?
  – Switch: informal representation => formal model
  – Connect information, but stay consistent
  – Distribute information (no central repository)
• Semantics
  – Relation between signs, words, symbols and the things (documents, people, places, events, organizations, concepts, etc.) to which they refer.
  – Relation of the things to each other.

[Figure: syntax vs. semantics, using the sentences "I am enjoying about learning and using new technology…" and "I love technology".]
[Figure: the Web next to the Semantic Web, whose nodes are labelled RDF.]
From decentralized platform for distributed presentations towards decentralized platform for distributed knowledge
Disambiguation of data
• Disambiguation of references to things:
  o Example: mouse, windows, bank, cry, etc.
  o Every thing should have its unique name
  o Solution: URIs (Uniform Resource Identifiers)

[Figure: each human symbol (mouse, cry, bank) stands for several ideas, and each idea gets its own computer symbol (mouse1, mouse2; cry1, cry2; bank 1, bank 2).]

Usually URIs are represented in the form of URLs:
  • http://www.jyu.fi/people/students/john/assignments/assignment1
  • http://www.jyu.fi/people/students/john/assignments/assignment2
  • http://www.jyu.fi/people/students/john/assignments/assignment3
Namespace as a prefix of the short (qualified) name:
  Full name: http://www.jyu.fi/people/students/john/assignments/assignment1
  Qualified name = prefix (for example as:) + rest of the name
  Use qualified names (qnames): as:assignment1, as:assignment2, as:assignment3

• Disambiguation of concepts
  o Example:
    • John is attracted to Mary.
    • North pole is attracted to south pole.
  o There has to be some common understanding of the domain in question
  o Solution: Ontology – a precise explanation of terms and reasoning in a subject area.
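The prefix mechanism can be sketched in a few lines of Python. This is only an illustration: the prefix table and the function names `expand_qname` and `shorten_uri` are made up for this sketch, and the namespace bound to `as:` is assumed to be the common part of the three assignment URLs above.

```python
# Hypothetical prefix table: "as" is assumed to stand for the shared
# part of the assignment URLs shown on the slide.
PREFIXES = {
    "as": "http://www.jyu.fi/people/students/john/assignments/",
}

def expand_qname(qname, prefixes):
    """Turn 'prefix:rest' into a full URI using the prefix table."""
    prefix, sep, rest = qname.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in qname: {qname!r}")
    return prefixes[prefix] + rest

def shorten_uri(uri, prefixes):
    """Turn a full URI into a qname when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no matching namespace: keep the full URI

print(expand_qname("as:assignment1", PREFIXES))
```

Both directions are plain string operations: a qname is only a shorthand, and the full URI remains the actual name of the thing.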
RDF (Resource Description Framework)
• RDF is a general method to decompose knowledge into small pieces, with rules about the meaning of those pieces. It is a method to describe facts in a short form.
• RDF represents graphs
• Everything is a Resource
  – Anything that we can talk about and that has identity in the form of a URI.
  – Example: human, building, weather

ID   Firstname  Surname   City
145  Albert     Einstein  NY
626  Marie      Curie     LA

[Figure: the table rows drawn as graphs. Person-145 has ID 145, Firstname Albert, Surname Einstein and City NY; Person-626 has ID 626, Firstname Marie, Surname Curie and City LA. The figure also contains the nodes California (State) and USA (Country).]
RDF as graph, RDF as text
All the data in RDF is described in statements/triples: subject – predicate – object

[Figure: graph with the node p:per145 and the edges f:hasID → 145, f:hasFirstname → Albert, f:hasSurname → Einstein, f:livesIn → c:NY.]

p:per145 f:hasID "145" .
p:per145 f:hasFirstname "Albert" .
p:per145 f:hasSurname "Einstein" .
p:per145 f:livesIn c:NY .
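As a rough sketch of the triple model, the four statements about p:per145 can be held in Python as a set of (subject, predicate, object) tuples. The representation is an assumption made for illustration (literals are kept as plain strings, and the helper `match` is hypothetical), not how an RDF library stores data.

```python
# The statements from the slide as (subject, predicate, object) tuples.
# Simplification for this sketch: literals and resources are both strings.
graph = {
    ("p:per145", "f:hasID", "145"),
    ("p:per145", "f:hasFirstname", "Albert"),
    ("p:per145", "f:hasSurname", "Einstein"),
    ("p:per145", "f:livesIn", "c:NY"),
}

def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Everything that is said about p:per145:
for triple in match(graph, s="p:per145"):
    print(" ".join(triple), ".")
```

Because a graph is just a set of such statements, asking a question amounts to matching a pattern with some positions left open.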
Graph matching and merging
[Figure: two graphs that share the node c:NY are merged into one.
Graph 1: p:per145 with f:hasID → 145, f:hasFirstname → Albert, f:hasSurname → Einstein, f:livesIn → c:NY.
Graph 2: c:NY with c:citizens → 8 175 133, c:nickname → The Big Apple, c:state → s:NY.
Merged graph: both graphs joined at c:NY, so p:per145 is connected through f:livesIn to everything known about c:NY.]
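Merging can be sketched as a set union, assuming triples are kept as Python tuples (an illustrative representation only). Since both graphs use the same name c:NY, the union connects them without any extra work.

```python
# Graph 1: facts about the person (values as in the figure).
person_graph = {
    ("p:per145", "f:hasID", "145"),
    ("p:per145", "f:hasFirstname", "Albert"),
    ("p:per145", "f:hasSurname", "Einstein"),
    ("p:per145", "f:livesIn", "c:NY"),
}

# Graph 2: facts about the city, possibly from another source.
city_graph = {
    ("c:NY", "c:citizens", "8 175 133"),
    ("c:NY", "c:nickname", "The Big Apple"),
    ("c:NY", "c:state", "s:NY"),
}

# Merging is the union of the two triple sets.
merged = person_graph | city_graph

# Follow the shared node: where does per145 live, and what is its nickname?
city = next(o for s, p, o in merged if s == "p:per145" and p == "f:livesIn")
nickname = next(o for s, p, o in merged if s == city and p == "c:nickname")
print(nickname)
```

The question answered at the end needs facts from both sources, which is only possible because the two graphs agree on the URI of the city.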
Serialization
• The way of representing the graph in textual form
• Serializations (notations):
  – RDF/XML
  – TriX
  – N-triples
  – Turtle (Terse RDF Triple Language)
  – Notation 3
RDF/XML
• Suitable for machines
• Many XML parsers exist
• Difficult for humans to see subject-predicate-object triples

<rdf:RDF xmlns="http://data.gov/ontology/edu#"
         xmlns:log="http://www.w3.org/2000/10/swap/log#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://www.jyu.fi/courses/TIES452">
    <credits>5</credits>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.jyu.fi/people/Mary">
    <studies rdf:resource="http://www.jyu.fi/courses/TIES452"/>
    <livesIn xmlns="http://data.gov/ontology/urban#"
             rdf:resource="http://www.geo.com/city/Turku"/>
  </rdf:Description>
</rdf:RDF>
N-triples
• Simple textual serialization of RDF statements
• Each statement consists of subject, predicate and object separated by white space
• Statements are terminated by dots (.)
• Resources are referred to with full URIs in <> brackets
• Literals are wrapped in double quotes (" ")

<http://www.jyu.fi/people/Mary> <http://data.gov/ontology/urban#livesIn> <http://www.geo.com/city/Turku> .
<http://www.jyu.fi/people/Mary> <http://data.gov/ontology/edu#studies> <http://www.jyu.fi/courses/TIES452> .
<http://www.jyu.fi/courses/TIES452> <http://data.gov/ontology/edu#credits> "5" .
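The N-triples rules listed above are simple enough to sketch as a tiny serializer. This is a minimal illustration, not a complete implementation: the `Literal` marker class and the functions `term` and `to_ntriples` are invented for this sketch, and only backslash and double-quote escaping is handled.

```python
class Literal(str):
    """Marks a string as an RDF literal rather than a URI (sketch only)."""

def term(value):
    """Write one term: literals in double quotes, URIs in <> brackets."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"

def to_ntriples(triples):
    """One statement per line: subject, predicate, object, then a dot."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)

triples = [
    ("http://www.jyu.fi/people/Mary",
     "http://data.gov/ontology/urban#livesIn",
     "http://www.geo.com/city/Turku"),
    ("http://www.jyu.fi/people/Mary",
     "http://data.gov/ontology/edu#studies",
     "http://www.jyu.fi/courses/TIES452"),
    ("http://www.jyu.fi/courses/TIES452",
     "http://data.gov/ontology/edu#credits",
     Literal("5")),
]

print(to_ntriples(triples))
```

The output repeats every URI in full, which is what makes N-triples easy to process line by line and tedious to read.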
Turtle
• Same example as before:

<http://www.jyu.fi/people/Mary> <http://data.gov/ontology/urban#livesIn> <http://www.geo.com/city/Turku> .
<http://www.jyu.fi/people/Mary> <http://data.gov/ontology/edu#studies> <http://www.jyu.fi/courses/TIES452> .
<http://www.jyu.fi/courses/TIES452> <http://data.gov/ontology/edu#credits> "5" .

In Turtle, with prefixes:

@prefix p: <http://www.jyu.fi/people/> .
@prefix u: <http://data.gov/ontology/urban#> .
@prefix edu: <http://data.gov/ontology/edu#> .
@prefix co: <http://www.jyu.fi/courses/> .
@prefix ci: <http://www.geo.com/city/> .

p:Mary u:livesIn ci:Turku .
p:Mary edu:studies co:TIES452 .
co:TIES452 edu:credits "5" .

Or, with statements about the same subject grouped by ";":

p:Mary u:livesIn ci:Turku ;
       edu:studies co:TIES452 .
co:TIES452 edu:credits "5" .
Manipulation of RDF data
• Storing
  – RDF file on the web
  – Specialized RDF storage ("RDF database")
  – Other form (*.xls, DB, …) exposed as RDF
• Querying
  – Like in relational DBs, there is a query language (SPARQL)
  – Can query from several sources (web sources, local RDF storages, etc.)
• Reasoning
  – Can derive new facts from already existing facts
  – Can check consistency of the model
  – Does not exist in relational DBs!
Storing of RDF data
• Small datasets (few triples)
  – RDF file published on the web or stored locally
    Examples: *.rdf, *.nt, *.ttl, *.n3, etc.
• Large datasets (thousands to millions of triples)
  – A database-based solution is better, usually in the form of an RDF store
    Examples:
    • Native RDF Stores (4/5Store, AllegroGraph, Apache Jena TDB, Mulgara, GraphDB™, etc.)
    • DBMS-backed Stores (ARC2, Apache Jena SDB, Oracle Spatial and Graph, Semantics Platform, RDFLib, etc.)
    • Hybrid Stores ()
    • Non-RDF DB support (D2RQ Platform)
• Legacy data
  – Keep in original form
  – Provide mapping to RDF
  – Expose as RDF to the outer world

List of Triplestores: http://en.wikipedia.org/wiki/Triplestore
Querying of RDF data
• SPARQL is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in RDF format.

  SPARQL query general form:
  PREFIX (Namespace Prefixes)            e.g. PREFIX f: <http://example.org#>
  SELECT (Result Set)                    e.g. SELECT ?age
  FROM (Data Set)                        e.g. FROM <http://users.jyu.fi/~olkhriye/itks544/rdf/people.rdf>
  WHERE (Query Triple Pattern)           e.g. WHERE { f:mary f:age ?age }
  ORDER BY, DISTINCT, etc. (Modifiers)   e.g. ORDER BY ?age

• SPARQL 1.1 Update (SPARUL or SPARQL/Update) is a declarative data manipulation language that extends the SPARQL query language and provides the ability to insert, delete and update RDF data (as well as to manipulate graphs) held within a triple or quad store.

Useful links:
  http://www.w3.org/TR/rdf-sparql-query/
  http://www.w3.org/TR/sparql11-update/
Simple SPARQL query
• Show me all things that are loved. Also show me their age (f:age)

Data:
[Figure: graph with the nodes f:mary, f:john, f:jane and f:bill, connected by f:loves edges; each node has an f:age value (25, 30, 24, 26).]

Query:
PREFIX f: <http://example.org#>
SELECT ?person ?age
WHERE {
  ?x f:loves ?person .
  ?person f:age ?age
}

[Figure: the query pattern as a graph: ?x –f:loves→ ?person –f:age→ ?age.]

Result:
person  age
f:jane  26
f:mary  24
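How a query engine answers such a query can be sketched in plain Python as matching triple patterns against the data. This is a toy matcher, not SPARQL: the functions `unify` and `query` are hypothetical, and the data set is partly assumed, because the figure only fixes the result rows. The two f:loves edges and the ages of f:john and f:bill below are guesses.

```python
# Data, partly assumed (see the comments): only the ages of f:mary and
# f:jane are confirmed by the result table on the slide.
data = {
    ("f:john", "f:loves", "f:mary"),   # assumed edge
    ("f:bill", "f:loves", "f:jane"),   # assumed edge
    ("f:mary", "f:age", 24),
    ("f:jane", "f:age", 26),
    ("f:john", "f:age", 25),           # assumed value
    ("f:bill", "f:age", 30),           # assumed value
}

def unify(pattern, triple, binding):
    """Extend the binding so the pattern equals the triple, else None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if isinstance(term, str) and term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None            # variable already bound differently
        elif term != value:
            return None                # constant does not match
    return result

def query(patterns, graph):
    """Return every binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = unify(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings

where = [("?x", "f:loves", "?person"), ("?person", "f:age", "?age")]
rows = sorted((b["?person"], b["?age"]) for b in query(where, data))
for person, age in rows:
    print(person, age)
```

The shared variable ?person is what joins the two patterns: only persons that appear as the object of f:loves are looked up for their age.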
Ontologies
• Ontologies are formal models that describe a certain domain (e.g. medical ontology, IT ontology, milk ontology, etc.) and specify the definitions of terms by describing their relationships with other terms in the ontology. An ontology consists of:
  o TBox - describes abstract concepts and their relationships, taxonomy, classification;
  o ABox - describes concrete individuals and their relationships to other individuals and/or abstract concepts from the TBox.
• Class (type)
  o Represents a set of things that share the same properties (and/or behavior)
  o Example: Person, Fruit, Feeling, etc.
• Instance (individual)
  o Represents a concrete thing
  o Can belong to one or more classes
  o Example: johnDoe, appleGoldenDelicious, anger, etc.
• There cannot be a global ontology of everything
  o Ontologies are dynamic (they change in time)
  o Every person can have a different perspective on the domain

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ont: <http://www.john.com/myOntology.owl#> .

ont:benny rdf:type ont:Dog .
ont:superman rdf:type ont:ComicBookCharacter .
ont:mrBean rdf:type ont:ComicCharacter .
Ontology language
• Language that is used to formally define ontologies
• Examples:
  o RDFS (RDF Schema) - simple ontology language
  o OWL (Web Ontology Language) - has more expressive power than RDF Schema, providing additional vocabulary
  o OWL 2 is an extension of OWL
• Most are based on the RDF model as well
  o An ontology written in such a language is itself RDF
• Differences between ontology languages
  o Expressiveness
  o Computational complexity of reasoning
• Protégé is an ontology editor
RDF Schema (RDFS)
• Simple ontology language (W3C Recommendation in 2004)
• Prefix:
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
• Features:
  – Declaration of classes and subclass hierarchy:
      x:Human rdf:type rdfs:Class .
      x:Human rdfs:subClassOf x:LivingBeing .
  – Declaration of literals and their hierarchy:
      x:Henkilotunnus rdf:type rdfs:Literal .
      rdfs:Datatype rdfs:subClassOf rdfs:Literal .
  – Definition of properties and their hierarchy:
      x:hasAge rdf:type rdf:Property .
      x:hasAge rdfs:domain x:LivingBeing .
      x:hasAge rdfs:range xsd:int .
      rdfs:subPropertyOf rdf:type rdf:Property .
      x:hasMovablePart rdfs:subPropertyOf x:hasPart .
      x:hasStaticPart rdfs:subPropertyOf x:hasPart .
  – Other features (statement, container, collections, comments, etc.)
RDFS example
• Ontology

@prefix x: <http://mypage.com/myOntologies/humanOntology#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

x:LivingBeing rdf:type rdfs:Class .
x:Human rdf:type rdfs:Class ;
        rdfs:subClassOf x:LivingBeing .
x:hasAge rdf:type rdf:Property ;
         rdfs:domain x:Human ;
         rdfs:range xsd:int .

• Annotated resource

@prefix x: <http://mypage.com/myOntologies/humanOntology#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

x:bill rdf:type x:Human ;
       x:hasAge "40"^^xsd:int .

@prefix x: <http://mypage.com/myOntologies/humanOntology#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

x:bill rdf:type x:LivingBeing ;
       x:hasAge "40"^^xsd:int .
OWL example
• Value constraint: owl:hasValue

:CitizenOfJyvaskyla rdf:type owl:Class ;
    owl:equivalentClass [
        rdf:type owl:Restriction ;
        owl:onProperty :livesInCity ;
        owl:hasValue :cityJKL
    ] .

• Value constraint: owl:someValuesFrom

:FinnByOrigin rdf:type owl:Class ;
    owl:equivalentClass [
        rdf:type owl:Restriction ;
        owl:onProperty :hasParent ;
        owl:someValuesFrom :Finn
    ] .
OWL example
• Cardinality constraints:

:Mammal rdf:type owl:Class ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        owl:onProperty :hasParent ;
        owl:cardinality 2
    ] ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        owl:qualifiedCardinality 1 ;
        owl:onProperty :hasParent ;
        owl:onClass :Female
    ] ;
    rdfs:subClassOf [
        rdf:type owl:Restriction ;
        owl:qualifiedCardinality 1 ;
        owl:onProperty :hasParent ;
        owl:onClass :Male
    ] .
OWL example
• owl:FunctionalProperty
  Example: ex:marriedTo (in monogamous cultures).
• owl:inverseOf
  Example: ex:isOwnedBy & ex:owns, ex:hasChild & ex:hasParent are inverse.
• owl:SymmetricProperty
  Example: ex:hasSpouse.
• owl:TransitiveProperty
  Example: ex:bossOf, ex:hasAncestor.
• ...
Reasoning on top of RDF data
• Ontology-based reasoning
  o The inference rules for RDF-S or OWL are fixed. Therefore: no need for a rule engine, a procedural algorithm is sufficient.

  Family ontology + the statement
    :John :hasWife :Mary .
  also means
    :John rdf:type :Human .
    :John rdf:type :Man .
    :Mary rdf:type :Human .
    :Mary rdf:type :Woman .
    :Mary :hasHusband :John .    (inverse property)

• Rule-based reasoning usually requires:
  o A language for representing the rules
  o A rule engine

  Belief:           :John :hasWife :Mary
  Rule:             (?a :hasWife ?b) => (?b :hasHusband ?a)    (if premise(s), then conclusion(s))
  REASONER (RULE ENGINE)
  Inferred belief:  :Mary :hasHusband :John
Some rules of RDF Schema
• If a resource is an instance of a class, it is also an instance of any super-class of that class (any human is a mammal).

  :Mammal rdf:type owl:Class .
  :Human rdf:type owl:Class .
  :Human rdfs:subClassOf :Mammal .
  :John rdf:type :Human .

  also means

  :John rdf:type :Mammal .

• If a statement with a property is made, the statement with any super-property is also true (if you love something, you also like it).

  :like rdf:type owl:ObjectProperty .
  :love rdf:type owl:ObjectProperty .
  :love rdfs:subPropertyOf :like .
  :John :love :Mary .

  also means

  :John :like :Mary .
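Both rules can be sketched as one pass over a set of triples. The code below is an illustration under simplifying assumptions (triples as Python tuples, a hypothetical function `rdfs_step`, a single pass rather than a full fixed-point computation), not the algorithm of any particular reasoner.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"

# The two examples from the slide, merged into one small graph.
facts = {
    (":Human", SUBCLASS, ":Mammal"),
    (":John", TYPE, ":Human"),
    (":love", SUBPROP, ":like"),
    (":John", ":love", ":Mary"),
}

def rdfs_step(graph):
    """Apply the super-class and super-property rules once."""
    new = set()
    for s, p, o in graph:
        for s2, p2, o2 in graph:
            # x type C, C subClassOf D  =>  x type D
            if p == TYPE and p2 == SUBCLASS and o == s2:
                new.add((s, TYPE, o2))
            # x P y, P subPropertyOf Q  =>  x Q y
            if p2 == SUBPROP and p == s2:
                new.add((s, o2, o))
    return new - graph

inferred = rdfs_step(facts)
for triple in sorted(inferred):
    print(" ".join(triple), ".")
```

A real reasoner would repeat such a step until nothing new appears, so that longer class or property chains are also followed.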
Some rules of RDF Schema
• Transitive property:
  – If class A is a sub-class of B, while B is a sub-class of C, then A is a sub-class of C (mother is woman, woman is human, therefore mother is human).
  – Also applies to sub-properties
• Example: rdfs:subClassOf and rdfs:subPropertyOf are transitive properties

  :Human rdf:type owl:Class .
  :Woman rdf:type owl:Class .
  :Mother rdf:type owl:Class .
  :Woman rdfs:subClassOf :Human .
  :Mother rdfs:subClassOf :Woman .
  :prefer rdf:type owl:ObjectProperty .
  :like rdf:type owl:ObjectProperty .
  :love rdf:type owl:ObjectProperty .
  :like rdfs:subPropertyOf :prefer .
  :love rdfs:subPropertyOf :like .

  also means

  :Mother rdfs:subClassOf :Human .
  :love rdfs:subPropertyOf :prefer .
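Transitivity amounts to computing a transitive closure. A minimal sketch, assuming each hierarchy is given as a set of (sub, super) pairs; the function name `transitive_closure` is chosen for this illustration only.

```python
def transitive_closure(pairs):
    """Add (a, c) whenever (a, b) and (b, c) are present, until stable."""
    closure = set(pairs)
    while True:
        new = {(a, d)
               for a, b in closure
               for c, d in closure
               if b == c} - closure
        if not new:
            return closure
        closure |= new

# (sub, super) pairs taken from the slide.
sub_class_of = {(":Woman", ":Human"), (":Mother", ":Woman")}
sub_property_of = {(":like", ":prefer"), (":love", ":like")}

print(transitive_closure(sub_class_of) - sub_class_of)
print(transitive_closure(sub_property_of) - sub_property_of)
```

The loop repeats until no new pair appears, so chains of any length are covered, not only the two-step chains of the example.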
Some rules of RDF Schema
• If a property is defined to have class A as its domain, and a statement with that property is made, the subject of the statement must be an instance of A.
  – The same holds for the range of a property and the object of a statement.

  :Human rdf:type owl:Class .
  :love rdf:type owl:ObjectProperty ;
      rdfs:domain :Human ;
      rdfs:range :Human .
  :John :love :Mary .

  also means

  :John rdf:type :Human .
  :Mary rdf:type :Human .
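The domain and range rules can be sketched the same way. The tuple representation and the function `infer_types` are assumptions of this illustration. Note that the rule adds type information; it does not reject statements.

```python
facts = {
    (":love", "rdfs:domain", ":Human"),
    (":love", "rdfs:range", ":Human"),
    (":John", ":love", ":Mary"),
}

def infer_types(graph):
    """Derive rdf:type statements from rdfs:domain and rdfs:range."""
    new = set()
    for prop, kind, cls in graph:
        if kind not in ("rdfs:domain", "rdfs:range"):
            continue
        for s, p, o in graph:
            if p != prop:
                continue
            # domain types the subject, range types the object
            target = s if kind == "rdfs:domain" else o
            new.add((target, "rdf:type", cls))
    return new - graph

inferred = infer_types(facts)
for triple in sorted(inferred):
    print(" ".join(triple), ".")
```

This shows why domains and ranges should be stated carefully: using :love with a subject that is not meant to be a :Human would silently make it one.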
Some rules of OWL
• Some of the property characteristics allow reasoners to infer new knowledge about instances and their relations:
  – owl:inverseOf

  :Human rdf:type owl:Class .
  :hasChild rdf:type owl:ObjectProperty .
  :hasParent rdf:type owl:ObjectProperty .
  :hasChild owl:inverseOf :hasParent .
  :John rdf:type :Human .
  :Mary rdf:type :Human .
  :John :hasChild :Mary .

  also means

  :Mary :hasParent :John .
Rule-based reasoning
• The OWL language is not able to express all relations (e.g. it cannot express the relation "child of married parents").
• The expressivity of OWL can be extended by adding rules to an ontology.
• Need for a rule definition language:
  – SWRL (Semantic Web Rule Language)
  – Notation 3 (N3) logic
  – RIF (Rule Interchange Format)
Example

Data:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

:Human rdf:type owl:Class .
:dan rdf:type :Human .
:peter rdf:type :Human .
:mary rdf:type :Human .
:jon rdf:type :Human .
:betty rdf:type :Human .

:ancestorOf rdf:type owl:TransitiveProperty .
:hasSpouse rdf:type owl:SymmetricProperty .
:brotherOf rdf:type owl:ObjectProperty .
:sisterOf rdf:type owl:ObjectProperty .
owl:inverseOf rdf:type owl:SymmetricProperty .
:brotherOf owl:inverseOf :sisterOf .

:dan :ancestorOf :peter .
:peter :ancestorOf :jon .
:peter :hasSpouse :mary .
:betty :sisterOf :jon .

Rules:

{ ?P rdf:type owl:SymmetricProperty .
  ?S ?P ?O
} => { ?O ?P ?S } .

{ ?P owl:inverseOf ?Q .
  ?S ?P ?O
} => { ?O ?Q ?S } .

{ ?P rdf:type owl:TransitiveProperty .
  ?S ?P ?X .
  ?X ?P ?O
} => { ?S ?P ?O } .

Data + Rules give:

:dan :ancestorOf :jon .
:mary :hasSpouse :peter .
:sisterOf owl:inverseOf :brotherOf .
:jon :brotherOf :betty .
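What a rule engine does with this data and these rules can be sketched as naive forward chaining: apply every rule to the graph, add the conclusions, and repeat until nothing new appears. The Python below is a toy written for this illustration (the functions `unify`, `solutions`, `substitute` and `forward_chain` are hypothetical, and the class and object-property declarations are left out because no rule uses them); it is not how an N3 reasoner is implemented.

```python
def unify(pattern, triple, binding):
    """Extend the binding so the pattern equals the triple, else None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result

def solutions(premises, graph):
    """All variable bindings that satisfy every premise pattern."""
    bindings = [{}]
    for pattern in premises:
        extended = []
        for binding in bindings:
            for triple in graph:
                candidate = unify(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings

def substitute(pattern, binding):
    """Replace the variables of a pattern by their bound values."""
    return tuple(binding.get(term, term) for term in pattern)

def forward_chain(facts, rules):
    """Apply all rules repeatedly until no new triple is produced."""
    graph = set(facts)
    while True:
        new = {substitute(conclusion, b)
               for premises, conclusion in rules
               for b in solutions(premises, graph)} - graph
        if not new:
            return graph
        graph |= new

# The three rules from the slide as (premises, conclusion) pairs.
rules = [
    ([("?P", "rdf:type", "owl:SymmetricProperty"), ("?S", "?P", "?O")],
     ("?O", "?P", "?S")),
    ([("?P", "owl:inverseOf", "?Q"), ("?S", "?P", "?O")],
     ("?O", "?Q", "?S")),
    ([("?P", "rdf:type", "owl:TransitiveProperty"),
      ("?S", "?P", "?X"), ("?X", "?P", "?O")],
     ("?S", "?P", "?O")),
]

facts = {
    (":ancestorOf", "rdf:type", "owl:TransitiveProperty"),
    (":hasSpouse", "rdf:type", "owl:SymmetricProperty"),
    ("owl:inverseOf", "rdf:type", "owl:SymmetricProperty"),
    (":brotherOf", "owl:inverseOf", ":sisterOf"),
    (":dan", ":ancestorOf", ":peter"),
    (":peter", ":ancestorOf", ":jon"),
    (":peter", ":hasSpouse", ":mary"),
    (":betty", ":sisterOf", ":jon"),
}

inferred = forward_chain(facts, rules) - facts
for triple in sorted(inferred):
    print(" ".join(triple), ".")
```

The last inferred statement on the slide needs two rounds: the symmetry of owl:inverseOf first yields ":sisterOf owl:inverseOf :brotherOf", and only then can ":jon :brotherOf :betty" be derived. That is why the loop runs until a fixed point is reached.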
Machine readable data exchange
• RDFa - Resource Description Framework in attributes (W3C Recommendation). It is a domain-independent way to explicitly embed RDF data in attributes of a web page to:
  – transfer data from one application to another through the web;
  – write data only once for web users and web applications.
• JSON-LD - JavaScript Object Notation for Linked Data. An extension of JSON, a simple property-value type machine-readable data exchange format.
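To give an impression of JSON-LD, the Mary example from the serialization slides could be written as follows. This is a sketch built with Python's standard json module: the document structure uses the JSON-LD keywords @context and @id, while the prefix names "edu" and "urban" are chosen here for illustration.

```python
import json

# The Mary example as a JSON-LD document (sketch; prefix names assumed).
doc = {
    "@context": {
        "edu": "http://data.gov/ontology/edu#",
        "urban": "http://data.gov/ontology/urban#",
    },
    "@id": "http://www.jyu.fi/people/Mary",
    "urban:livesIn": {"@id": "http://www.geo.com/city/Turku"},
    "edu:studies": {
        "@id": "http://www.jyu.fi/courses/TIES452",
        "edu:credits": "5",
    },
}

text = json.dumps(doc, indent=2)
print(text)
```

To an ordinary JSON consumer this is a nested property-value object; a JSON-LD processor additionally uses @context to map the keys to full URIs, which turns the object into RDF statements.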
RDFa
<div vocab="http://schema.org/" prefix="ex: http://example.com/" resource="ex:alice/posts/trouble_with_bob" typeof="Article">
  <h2 property="title">The trouble with Bob</h2>
  ...
  The trouble with Bob is that he takes much better photos than I do:
  ...
  <div resource="ex:bob/photos/sunset.jpg" prefix="dc: http://purl.org/dc/terms/">
    <img src="http://example.com/bob/photos/sunset.jpg" />
    <span property="title">Beautiful Sunset</span>
    by <span property="dc:creator">Bob</span>.
  </div>
</div>

[Rendered page: the heading "The trouble with Bob", the text "… The trouble with Bob is that he takes much better photos than I do: …", the photo, and the caption "Beautiful Sunset by Bob".]
RDFa
The RDF data carried by the RDFa markup of the previous example:

@prefix sc: <http://schema.org/> .
@prefix dc: <http://purl.org/dc/terms/> .

<http://example.com/alice/posts/trouble_with_bob> a sc:Article ;
    sc:title "The trouble with Bob" .
<http://example.com/bob/photos/sunset.jpg> sc:title "Beautiful Sunset" ;
    dc:creator "Bob" .
Semantic Annotation Tools
• OpenCalais
• Zemanta
• DBpedia Spotlight
• OnTeA
• RDFaCE
• Structured Data Markup Helper
• FRED
• Semantator
• Etc.
Linked Data
• In 2006, Tim Berners-Lee set out four simple principles for publishing data on the web (http://linkeddata.org):
  – Use URIs to identify things.
  – Use HTTP URIs so that people can look up those names.
  – When someone looks up a URI, provide useful information, using the standards (RDF, RDFS, SPARQL).
  – Include links to other URIs, so that they can discover more things.

[Figure: Linked Data cloud diagrams for 2007, 2011 and 2014.]
The volume of data has grown from around 2 billion triples in 2007 to over 30 billion in 2011…
In 2014, altogether, the diagram contains 570 datasets and 2909 linkage relationships between the datasets...
Linked Open Data
• In 2010, Tim Berners-Lee suggested a 5-star deployment scheme for Open Data to encourage people (especially government data owners) to improve linked data:
  1 star: Available on the web (whatever format) but with an open license, to be Open Data.
  2 stars: Available as machine-readable structured data (e.g. Excel instead of an image scan of a table).
  3 stars: The data does not use a proprietary format (e.g. CSV instead of Excel).
  4 stars: All the previous, plus: the data uses only open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff.
  5 stars: All the previous, plus: link your data to other people's data to provide context.
• Linked Open Data (LOD) is Linked Data which is released under an open license, which does not impede its reuse for free. LOD2 - http://lod2.eu/Welcome.html
Linked Data: good practice

• Reusing existing well-known vocabularies. In order to make it possible for client applications to process your data, you should reuse terms from well-known vocabularies wherever possible. You should only define new terms yourself if you cannot find the required terms in existing vocabularies. It is common practice to mix terms from different vocabularies.

Linked Open Vocabularies: http://lov.okfn.org/dataset/lov/
Well-known vocabularies: http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/CommonVocabularies

• Friend-of-a-Friend (FOAF) provides terms for describing people and their social network.
• SIOC: Semantically-Interlinked Online Communities.
• DOAP: Description of a Project.
• Dublin Core defines general metadata attributes.
• SKOS: Simple Knowledge Organization System.
• SKOS DataZone: list of vocabularies available in the SKOS schema.
• Review Vocabulary provides terms for representing reviews.
• GoodRelations provides terms for describing products and business entities.
• Music Ontology provides terms for describing artists, albums, tracks, but also performances, arrangements, etc.
• Organization Ontology for describing the structure of organizations.
• Linking Open Description of Events (LODE) provides terms for describing the basic properties of an event and contains a list of axioms expressing mapping relationships with other ontologies such as DOLCE, CYC, CIDOC-CRM, Event Ontology, F, and SEM.
• Google, Yahoo and Microsoft have agreed on vocabularies for publishing structured data on the Web. Their shared 'ontology' is maintained on schema.org.
• MarineTLO (core) Ontology is a top-level ontology for the marine domain (also applicable to the terrestrial domain), and MarineTLO (imarine) Ontology is an extension and operational version of the MarineTLO core.
• etc.
Linked Data: good practice

• Reusing existing URIs. If you need URI references for geographic places, research areas, general topics, artists, books or CDs, you should consider using URIs from existing data sources (for instance Geonames, DBpedia, Musicbrainz, dbtune, RDF Book Mashup, etc.). The two main benefits of using URIs from such data sources are:
  o The URIs are dereferenceable, meaning that a description of the concept can be retrieved from the Web.
  o The URIs are already linked to URIs from other data sources.

Well-known Data Sets: http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets
Linked Data Sets available as RDF Dumps: http://www.w3.org/wiki/DataSetRDFDumps
SparqlEndpoints list: http://www.w3.org/wiki/SparqlEndpoints
Linked Data: good practice

Guidance for own term definition:

• Do not define new vocabularies from scratch, but complement existing vocabularies with additional terms (in your own namespace) to represent your data as required.
• Provide for both humans and machines. At this stage in the development of the Web of Data, more people will be coming across your code than machines, even though the Web of Data is meant for machines in the first instance. Don't forget to add prose, e.g. rdfs:comment for each term invented. Always provide a label for each term using the rdfs:label property.
• Make term URIs dereferenceable. It is essential that term URIs are dereferenceable so that clients can look up the definition of a term. Therefore you should make term URIs dereferenceable following the W3C Best Practice Recipes for Publishing RDF Vocabularies (http://www.w3.org/TR/swbp-vocab-pub/).
• Make use of other people's terms. Using other people's terms, or providing mappings to them, helps to promote the level of data interchange on the Web of Data, in the same way that hypertext links built the traditional document Web. Common properties for providing such mappings are rdfs:subClassOf or rdfs:subPropertyOf.
• State all important information explicitly. For example, state all ranges and domains explicitly. Remember: humans can often do guesswork, but machines can't. Don't leave important information out!
• Do not create over-constrained, brittle models; leave some flexibility for growth. For instance, if you use full-featured OWL to define your vocabulary, you might state things that lead to unintended consequences and inconsistencies when somebody else references your term in a different vocabulary definition. Therefore, unless you know exactly what you are doing, use RDF-Schema to define vocabularies.
Web of Data Tools

Linked Data Browsers, Mashups and other Client Applications (http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/SemWebClients):
• Tabulator
• OpenLink Data Explorer
• DBpedia Mobile
• Marbles
• Graphity Browser
• Quick & Dirty RDF Browser
• LODmilla
• Etc.

Semantic Web Search Engines (http://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/SemanticWebSearchEngines):
• <sameAs.org>
• VisiNav
• Falcons
• Sindice
• Watson
• Swoogle
• Etc.
Web Intelligence and Service Engineering
International Master's Program
https://www.jyu.fi/en/studywithus/programmes/wise

Relevant Courses:
• ITKS544 Semantic Web and Ontology Engineering (5 ECTS)
• TIES452 Practical Introduction to Semantic Web Technologies (5 ECTS)
• TIES437 Everything-to-Everything Interfaces (5 ECTS)
• TIES438 Big Data Engineering (5 ECTS)
• …