Hideaki Takeda / National Institute of Informatics
General Introduction for Semantic Web and Linked Open Data
Hideaki Takeda, National Institute of Informatics
takeda@nii.ac.jp
2012 INTERNATIONAL ASIAN SUMMER SCHOOL IN LINKED DATA IASLOD 2012, August 13-17, 2012, KAIST, Daejeon, Korea
Semantic Web and Linked Data
• Semantic Web
  – What is Semantic Web
  – How to realize Semantic Web
    • Metadata
    • RDF
    • RDFS
    • OWL
• Linked Data
  – What is Linked Data?
  – The State-of-the-Art of Linked Data
    • Linking Open Data (LOD)
  – How to use Linked Data
    • Linked Data Browser
    • Linked Data Search Engine
    • Linked Data Applications
  – How to use RDF
    • RDFa
    • SPARQL
The Aim of The Semantic Web
• "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation."
The Semantic Web, Scientific American, May 2001, Tim Berners-Lee, James Hendler and Ora Lassila
• The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.
http://www.w3.org/2001/sw/
Semantic Web
• Enabling various kinds of information exchange via the Web
  – Automation
  – Integration
  – Re-use of data
Next Generation Web?
• Evolution of Web
  – HTML: Web for Display
  – XML: Web with Syntax
  – ?? : Web with Semantics
• Why should we embed semantics into Web?
  – From: Web for humans
  – To: Web for humans and machines (cf. Web for machines)
A brief introduction of XML
• Limitation of HTML
  – Chaos caused by mixing display and text structure
    • e.g.,
      – <h3></h3> should be used for “the third-level heading”, but is often used just for a bigger font
      – <b></b> specifies “bold”, not “emphasis”.
  – Fixed structure
    • e.g.,
      – What if you need <h7></h7>?
      – I need a structure just for my data

<h1>A list of lectures</h1>
<h2>Knowledge Sharing Systems</h2>
<h3>Lecturer: Hideaki Takeda</h3>
<h3>Wednesday 3rd</h3>
XML
• XML (eXtensible Markup Language)
  – Can define original tags
  – Represents logical structures of data
    • DTD
  – Does not include style information
    • XSLT

<lecturelist>
  <lecture>
    <title id="1234">Knowledge Sharing Systems</title>
    <lecturer>Hideaki Takeda</lecturer>
    <schedule>
      <week>Wednesday</week>
      <time>3rd</time>
    </schedule>
  </lecture>
  ...
</lecturelist>
Why is XML not sufficient?
• What do “person” and “name” specify?
• Are “name” and “名前” the same?
• Is this description sufficient as a description of a “person”?
• …
• In short, syntax alone cannot solve these problems

<person>
  <name>Hideaki Takeda</name>
  <age>20</age>
</person>

<個人>
  <名前>Hideaki Takeda</名前>
  <年齢>20</年齢>
</個人>
(個人 = person, 名前 = name, 年齢 = age)
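The point about syntax can be checked mechanically. The following is a minimal sketch using only Python's standard library; the helper name tag_value_pairs is made up for this illustration. It parses the two documents from the slide and shows that an XML parser accepts both and sees identical values, yet has no basis for treating <name> and <名前> as the same thing.

```python
import xml.etree.ElementTree as ET

# The two documents from the slide, differing only in their tag vocabulary.
ENGLISH = "<person><name>Hideaki Takeda</name><age>20</age></person>"
JAPANESE = "<個人><名前>Hideaki Takeda</名前><年齢>20</年齢></個人>"


def tag_value_pairs(xml_text):
    """Return (tag, text) for each child element of the root (illustrative helper)."""
    root = ET.fromstring(xml_text)
    return [(child.tag, child.text) for child in root]


english_pairs = tag_value_pairs(ENGLISH)
japanese_pairs = tag_value_pairs(JAPANESE)

# Both documents are well-formed and carry the same values...
same_values = [v for _, v in english_pairs] == [v for _, v in japanese_pairs]
# ...but the tags are just different strings to the parser.
same_tags = [t for t, _ in english_pairs] == [t for t, _ in japanese_pairs]

print(same_values, same_tags)
```

Any mapping between the two vocabularies has to come from outside the XML syntax, which is the gap that metadata and ontologies are meant to fill.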
Architecture for the Semantic Web
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
How to describe “meaning”?
• Need to describe “information on information”
  – In computers, the “meaning of something” is a description (“meaning”) attached to a description (“something”)
  – Metadata
    • Data about data
• Need an architecture for common understanding
  – Syntax (language or scheme)
  – Vocabulary (ontology)
Metadata
• What is metadata?
  – Data about data
  – What one can say about any information object
• What is described as metadata?
  – Content relates to what the object contains or is about, and is intrinsic to an information object.
  – Context indicates the who, what, why, where, how aspects associated with the object's creation and is extrinsic to an information object.
  – Structure relates to the formal set of associations within or among individual information objects and can be intrinsic or extrinsic.
Setting the Stage, Anne J. Gilliland-Swetland, Introduction to Metadata: Pathways to Digital Information, Murtha Baca (ed.), Getty Information Institute.
Metadata
• Metadata to individual information objects
  – Bibliography, Dublin Core
• Metadata to part or structure of information objects
  – Drawings, RDF, RDFS, OWL
[Figure: a tractor annotated with metadata. Whole object: Type: tractor; Owner: Taro; Product year: 2002. Parts: Wheel; Body; Axis: connects body to wheel]
A Layer model for Semantic Web
• RDF (Resource Description Framework)
  – The most primitive model for metadata description
    • SVO model
    • Entity-Relation Model
    • Semantic net
• RDF Schema
  – Addition of “concept” to RDF
    • class-subclass, constraints
• OWL
  – More general concept description language
    • Logical consistency
    • Various class expressions
    • Various constraints
• DAML-S
  – Descriptions on processes
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
RDF (Resource Description Framework)
• A framework to describe metadata
• Separation of model and syntax
• W3C Recommendation (2004)
RDF Model
• Element
  – Resource:
    • URI (Uniform Resource Identifier)
    • Literal (string)
    – Need not be retrievable via the Web
  – Property:
    • Attribute used when describing resources
    • Named by a URI; its value is a resource (URI) or a literal
  – Statement: a triple of resource, property, and value
RDF model
• Statement
  – Creator of http://www-kasm.nii.ac.jp/~takeda is “Hideaki Takeda”
• Structure
  – Resource (subject): http://www-kasm.nii.ac.jp/~takeda
  – Property (predicate): Creator
  – Value (object): “Hideaki Takeda”
[Figure: http://www-kasm.nii.ac.jp/~takeda (Resource) --Creator (Property)--> “Hideaki Takeda” (Value)]
RDF model
• Creator of http://www-kasm.nii.ac.jp/~takeda is http://www.nii.ac.jp/staffid/123456, which has name “Hideaki Takeda” and email “[email protected]”.
[Figure: http://www-kasm.nii.ac.jp/~takeda --Creator--> http://www.nii.ac.jp/staffid/123456; that resource --name--> “Hideaki Takeda” and --email--> its email literal]
RDF model
• Creator of http://www-kasm.nii.ac.jp/~takeda has name “Hideaki Takeda” and email “[email protected]”.
[Figure: http://www-kasm.nii.ac.jp/~takeda --Creator--> an unnamed node; the unnamed node --name--> “Hideaki Takeda” and --email--> its email literal]
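The three RDF model slides can be compared side by side in code. This is a minimal sketch in plain Python, not an RDF library: the classes URI, Literal and BlankNode and the property URIs under http://example.org/terms/ are assumptions made for illustration (the slides only give the labels Creator and name), and the email value is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


page = URI("http://www-kasm.nii.ac.jp/~takeda")
staff = URI("http://www.nii.ac.jp/staffid/123456")
anon = BlankNode("b0")  # a node without a URI of its own

# Hypothetical property URIs standing in for the labels on the slides.
CREATOR = URI("http://example.org/terms/creator")
NAME = URI("http://example.org/terms/name")

# 1) The creator is given directly as a literal.
graph_literal = {(page, CREATOR, Literal("Hideaki Takeda"))}
# 2) The creator is a resource with its own URI and a name.
graph_uri = {(page, CREATOR, staff), (staff, NAME, Literal("Hideaki Takeda"))}
# 3) The creator is an unnamed node that only carries a name.
graph_blank = {(page, CREATOR, anon), (anon, NAME, Literal("Hideaki Takeda"))}


def creator_names(graph, resource):
    """Follow Creator (and then name, if needed) and collect the literal names."""
    names = set()
    for s, p, o in graph:
        if s == resource and p == CREATOR:
            if isinstance(o, Literal):
                names.add(o.value)
            else:
                names.update(
                    o2.value
                    for s2, p2, o2 in graph
                    if s2 == o and p2 == NAME and isinstance(o2, Literal)
                )
    return names


for graph in (graph_literal, graph_uri, graph_blank):
    print(creator_names(graph, page))
```

All three graphs answer the question "who created this page?" with the same name, but only the second gives the creator an identifier that other data can link to.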
RDF syntax
• Creator of http://www-kasm.nii.ac.jp/~takeda is “Hideaki Takeda”
[Figure: http://www-kasm.nii.ac.jp/~takeda (Resource) --Creator (Property)--> “Hideaki Takeda” (Value)]

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://dublincore.org/2001/08/14/dces#">
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator>Hideaki Takeda</dc:Creator>
  </rdf:Description>
</rdf:RDF>

<rdf:RDF>
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator rdf:resource="Hideaki Takeda" />
  </rdf:Description>
</rdf:RDF>
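To see the separation of model and syntax in practice, the RDF/XML serialization can be read back into a (resource, property, value) statement. The sketch below uses only Python's xml.etree.ElementTree and handles just the rdf:Description / rdf:about / rdf:resource pattern; it is an illustration under that assumption, not a full RDF/XML parser, and the function name simple_triples is made up here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://dublincore.org/2001/08/14/dces#">
  <rdf:Description rdf:about="http://www-kasm.nii.ac.jp/~takeda">
    <dc:Creator>Hideaki Takeda</dc:Creator>
  </rdf:Description>
</rdf:RDF>"""


def simple_triples(rdf_xml):
    """Extract (subject, predicate, object) from flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells a qualified name as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # Object is either a referenced resource or the element's text.
            obj = resource if resource is not None else (prop.text or "").strip()
            triples.append((subject, namespace + local, obj))
    return triples


for triple in simple_triples(RDF_XML):
    print(triple)
```

The XML nesting disappears in the result: what remains is the same three-part statement shown in the figure, with the property now identified by a full URI.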
RDFS (RDF Schema)
• Stronger knowledge representation model
  – RDF: ER model, semantic net
  – RDF Schema: Frame model, object-oriented paradigm
    • Minimal definition
    • Property-centered approach
• RDFS is defined as an extension of RDF
• RDFS gives definitions of RDF descriptions
RDFS
• Class Definition
  – rdfs:Resource
  – rdfs:Class
  – rdf:Property
  – rdfs:ConstraintProperty
  – rdfs:Literal
• Property Definition
  – rdf:type
  – rdfs:subClassOf
  – rdfs:subPropertyOf
  – rdfs:comment
  – rdfs:label
  – rdfs:seeAlso
  – rdfs:isDefinedBy
• ConstraintProperty Definition
  – rdfs:range
  – rdfs:domain
Resource Description Framework (RDF) Schema Specification 1.0 http://www.w3.org/TR/2000/CR-rdf-schema-20000327/
[Figure: RDFS Structure by RDF]
RDF Schema
• rdfs:Class
• rdfs:subClassOf
  – More specific class
  – Multiple
  – Transitivity
• rdf:type
  – Indicates an instance of a class
• rdf:Property
  – Attribute
  – Range: only one
  – Domain: multiple (or)
  – No cardinality
• rdfs:subPropertyOf
  – More specific property
  – Transitivity
RDF Schema
<rdf:RDF xml:lang="en"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="Person">
    <rdfs:comment>The class of people.</rdfs:comment>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/03/example/classes#Animal"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="maritalStatus">
    <rdfs:range rdf:resource="#MaritalStatus"/>
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdf:Property rdf:ID="ssn">
    <rdfs:comment>Social Security Number</rdfs:comment>
    <rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Integer"/>
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdf:Property rdf:ID="age">
    <rdfs:range rdf:resource="http://www.w3.org/2000/03/example/classes#Integer"/>
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdfs:Class rdf:ID="MaritalStatus"/>
  <MaritalStatus rdf:ID="Married"/>
  <MaritalStatus rdf:ID="Divorced"/>
  <MaritalStatus rdf:ID="Single"/>
  <MaritalStatus rdf:ID="Widowed"/>
</rdf:RDF>
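[Figure: graph of the schema. Person is a subclass (s) of Animal and has rdfs:comment “The class of people.”; ssn, age and maritalStatus have domain (d) Person; ssn and age have range (r) Integer; maritalStatus has range (r) MaritalStatus; ssn has rdfs:comment “Social Security Number”; Married, Divorced, Single and Widowed have type (t) MaritalStatus. Legend: s = rdfs:subClassOf, t = rdf:type, d = rdfs:domain, r = rdfs:range; separate node shapes mark classes, class instances and properties.]
Resource Description Framework (RDF) Schema Specification 1.0 http://www.w3.org/TR/2000/CR-rdf-schema-20000327/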
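One way to see what the Person / Animal / maritalStatus schema adds on top of plain RDF is to let it drive simple inferences. The sketch below is plain Python with the schema written as dictionaries; it follows the later (2004) RDFS reading, in which rdfs:domain and rdfs:range license type inferences, whereas the 2000 draft cited on the slide described them as constraints. The instance "taro" and the function names are assumptions for illustration.

```python
# Schema from the slide, written as plain Python data.
SUBCLASS_OF = {"Person": {"Animal"}}
DOMAIN = {"maritalStatus": "Person", "ssn": "Person", "age": "Person"}
RANGE = {"maritalStatus": "MaritalStatus", "ssn": "Integer", "age": "Integer"}

# Hypothetical instance data: "taro" is never typed explicitly.
FACTS = [("taro", "maritalStatus", "Married")]


def superclasses(cls):
    """All classes reachable from cls via rdfs:subClassOf (transitive)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(resource):
    """Types that follow from rdf:type, rdfs:domain, rdfs:range and subclassing."""
    types = set()
    for s, p, o in FACTS:
        if s == resource and p == "rdf:type":
            types.add(o)
        if s == resource and p in DOMAIN:
            types.add(DOMAIN[p])   # the subject of p belongs to p's domain
        if o == resource and p in RANGE:
            types.add(RANGE[p])    # the object of p belongs to p's range
    for cls in list(types):
        types |= superclasses(cls)
    return types


print(sorted(inferred_types("taro")))     # ['Animal', 'Person']
print(sorted(inferred_types("Married")))  # ['MaritalStatus']
```

A single fact about marital status is enough to conclude that "taro" is a Person and therefore an Animal, which is the kind of "concept" layer RDF alone does not provide.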
OWL (Web Ontology Language)
• More general knowledge representation
• Based on Description Logics
• Features
  – Class
    • Necessary condition / necessary and sufficient condition
    • Class expression:
      – Constraint by property
        » Like slot definition of a class
        » Type constraint (all/some), cardinality, typed cardinality
      – Logical operation of classes: union, intersection, negation
  – Property
    • Multiple ranges and domains
    • Specifying meta-property
  – Import of definitions
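The class expressions listed above can be made concrete with a small sketch. It checks "some", "all" and cardinality constraints over an explicit list of facts in plain Python. This is a closed-world check over the listed facts only, which differs from OWL's open-world semantics, so it illustrates the shape of the constraints rather than OWL reasoning; all names and data are hypothetical.

```python
# Hypothetical facts and class memberships for illustration.
FACTS = [("taro", "owns", "tractor1"), ("taro", "owns", "car1")]
VEHICLE = {"tractor1", "car1"}
TRACTOR = {"tractor1"}


def values(individual, prop):
    """All values of prop recorded for the individual."""
    return [o for s, p, o in FACTS if s == individual and p == prop]


def some_values_from(individual, prop, members):
    """'some' type constraint: at least one value of prop is in the class."""
    return any(v in members for v in values(individual, prop))


def all_values_from(individual, prop, members):
    """'all' type constraint: every value of prop is in the class."""
    return all(v in members for v in values(individual, prop))


def cardinality_between(individual, prop, minimum=0, maximum=None):
    """Cardinality constraint on the number of distinct values of prop."""
    count = len(set(values(individual, prop)))
    return count >= minimum and (maximum is None or count <= maximum)


print(some_values_from("taro", "owns", TRACTOR))        # True
print(all_values_from("taro", "owns", TRACTOR))         # False
print(all_values_from("taro", "owns", VEHICLE))         # True
print(cardinality_between("taro", "owns", maximum=1))   # False

# Logical operations on classes map onto set operations.
print(sorted(VEHICLE - TRACTOR))                        # ['car1']
```

In OWL these conditions are used to define classes (for example, "things all of whose owned items are vehicles"), and a reasoner decides membership and consistency from them.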
Linked Data
• What is Linked Data?
• The State-of-the-Art of Linked Data
  – Linking Open Data (LOD)
• How to use Linked Data
  – Linked Data Browser
  – Linked Data Search Engine
  – Linked Data Applications
• How to use RDF
  – RDFa
  – SPARQL
Architecture for the Semantic Web
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
The world of instances (Linked Data)
The world of classes (Ontologies)
Layers of Semantic Web
• Ontology
  – Descriptions on classes
  – RDFS, OWL
  – Challenges for ontology building
    • Ontology building is difficult by nature
      – Consistency, comprehensiveness, logicality
    • Alignment of ontologies is more difficult
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
[Figure: Ontology = descriptions on classes; Linked Data = descriptions on instances]
Layers of Semantic Web
• Linked Data
  – Descriptions on instances (individuals)
  – RDF + (RDFS, OWL)
  – Pros for Linked Data
    • Easy to write (mainly fact description)
    • Easy to link (fact to fact link)
  – Cons for Linked Data
    • Difficult to describe complex structures
    • Still need for class description (-> ontology)
Tim Berners-Lee http://www.w3.org/2002/Talks/09-lcs-sweb-tbl/
[Figure: Ontology = descriptions on classes; Linked Data = descriptions on instances]
Linked Data
• Linked Data is the “Web of Data”
  – Data published as RDF
  – Can be referred to from outside
• The four rules for Linked Data
Linked Data
• The four rules for Linked Data
  – Use URIs as names for things
    • Give a URI to every object in the world!
  – Use HTTP URIs so that people can look up those names.
    • Don’t use URNs
  – When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
    • Provide machine-readable data for the URI
  – Include links to other URIs, so that they can discover more things.
    • Make data linked together just like the Web
Linked Data, TBL, http://www.w3.org/DesignIssues/LinkedData.html
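Rules two and three can be sketched from the client side: a thing's name must be an HTTP URI, and looking it up should return machine-readable data. The following is a minimal sketch with Python's standard library that only builds the lookup request and does not send it. The function name build_lookup_request, the Accept header value and the use of http://dbpedia.org/resource/Tokyo as the example name are assumptions for illustration; whether a given server honours content negotiation depends on that server.

```python
from urllib.parse import urlparse
from urllib.request import Request

# Ask for RDF serializations rather than an HTML page (assumed preference order).
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_lookup_request(uri):
    """Prepare an HTTP lookup for a thing's URI; reject names that cannot be looked up."""
    scheme = urlparse(uri).scheme
    if scheme not in ("http", "https"):
        # Rule 2: a URN such as urn:ISBN:... names a thing but cannot be dereferenced.
        raise ValueError(f"not an HTTP URI: {uri}")
    return Request(uri, headers={"Accept": RDF_ACCEPT})


request = build_lookup_request("http://dbpedia.org/resource/Tokyo")
print(request.full_url, request.get_header("Accept"))

try:
    build_lookup_request("urn:ISBN:0752820907")
except ValueError as error:
    print(error)
```

Passing the prepared request to urllib.request.urlopen would perform the actual lookup; the RDF that comes back is where rule four applies, since its links to other URIs can be looked up in the same way.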
Linking Open Data (LOD)
• The project to collect published Linked Data
• Major Linked Data
  • (Translated from the original resources)
    – DBpedia (Wikipedia): 270 Million Triples
    – Geonames: Geo names and their latitudes and longitudes, 93 Million Triples
    – MusicBrainz: Music
    – WordNet: Dictionary
    – DBLP bibliography: Bibliography for technical papers, 28 Million Triples
    – US Census Data: 1 Billion Triples
  • (Crawling)
    – FOAF (Friend Of A Friend)
  • (Wrapper)
    – Flickr Wrapper
http://dbpedia.org/page/Tokyo
http://en.wikipedia.org/wiki/Tokyo
How to use Linked Data
[Figure: Linked Data (“Things”) consumed by a Linked Data Browser, a Linked Data Mashup and a Linked Data Search Engine]
Linked Data Browser
• Browse linked data just as browsing web pages
  – Show RDF data
  – Prompt links to follow
• System/Service
  – Marbles
    • Display data by following links
  – Tabulator
    • Firefox plugin/online
    • Adding information in a single page
  – Sig.ma
    • Showing RDF resources which can be operated
Linked Data Search Engine
• Search RDF data with crawled data set
  – Swoogle
  – Sindice
  – Watson
How to use Linked Data
• Semantic Data Mash-up Applications
  – SemaPlorer
    • http://btc.isweb.uni-koblenz.de/
  – DBpedia Mobile
    • http://wiki.dbpedia.org/DBpediaMobile
  – Bio2RDF
    • http://bio2rdf.org/
Bio2RDF
• Search LOD in bioscience
• Translate data into RDF if it is not already in RDF
RDFa
• Add extra structured content to (X)HTML pages
  – Adds new (X)HTML/XML attributes
• “RDF in attributes”
  – Programs can extract those and turn them into RDF
  – Flexibility for using Literals and URI resources
Principles of RDFa
• RDF contents are defined through XML attributes (no elements)
• XML/HTML tree structure is used
• Various attributes are defined by RDFa
  – Some attributes (@href, @rel) are also reused
• The text content can also be reused
Examples
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <h2 property="dc:title">The trouble with Bob</h2>
  <h3 property="dc:creator">Alice</h3>
  ...
</div>
Page: http://example.com/alice/posts/trouble_with_bob
In N3:
<http://example.com/alice/posts/trouble_with_bob>
    <http://purl.org/dc/elements/1.1/title> "The trouble with Bob" ;
    <http://purl.org/dc/elements/1.1/creator> "Alice" .
<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <div about="/alice/posts/trouble_with_bob">
    <h2 property="dc:title">The trouble with Bob</h2>
    <h3 property="dc:creator">Alice</h3>
    ...
  </div>
  <div about="/alice/posts/jos_barbecue">
    <h2 property="dc:title">Jo's Barbecue</h2>
    <h3 property="dc:creator">Eve</h3>
    ...
  </div>
  ...
</div>
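The idea that programs can extract the attributes and turn them into RDF can be sketched with Python's built-in html.parser. The class below is a deliberately tiny illustration that understands only @about, @property and xmlns: prefixes, treats prefixes as global, and assumes well-formed, properly nested markup; it is not a conforming RDFa processor, and the class name and base URI are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

HTML = """<div xmlns:dc="http://purl.org/dc/elements/1.1/">
  <div about="/alice/posts/trouble_with_bob">
    <h2 property="dc:title">The trouble with Bob</h2>
    <h3 property="dc:creator">Alice</h3>
  </div>
  <div about="/alice/posts/jos_barbecue">
    <h2 property="dc:title">Jo's Barbecue</h2>
    <h3 property="dc:creator">Eve</h3>
  </div>
</div>"""


class SimpleRDFaExtractor(HTMLParser):
    """Collect (subject, property URI, text) triples from @about/@property markup."""

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.prefixes = {}
        # One frame per open element: [subject, property URI or None, text chunks].
        self.frames = [[base_uri, None, []]]
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        # @about sets a new subject; otherwise inherit it from the parent element.
        subject = self.frames[-1][0]
        if "about" in attrs:
            subject = urljoin(self.base_uri, attrs["about"])
        prop = None
        if "property" in attrs:
            prefix, _, local = attrs["property"].partition(":")
            prop = self.prefixes.get(prefix, prefix + ":") + local
        self.frames.append([subject, prop, []])

    def handle_data(self, data):
        # Text content becomes the literal value of any enclosing @property.
        for frame in self.frames:
            if frame[1] is not None:
                frame[2].append(data)

    def handle_endtag(self, tag):
        if len(self.frames) > 1:
            subject, prop, chunks = self.frames.pop()
            if prop is not None:
                self.triples.append((subject, prop, "".join(chunks).strip()))


extractor = SimpleRDFaExtractor("http://example.com/")
extractor.feed(HTML)
extractor.close()
for triple in extractor.triples:
    print(triple)
```

The same page therefore serves both audiences: a browser renders the headings, while the extractor sees four statements about two distinct posts.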
<div about="/alice/posts/trouble_with_bob">
  <h2 property="dc:title">The trouble with Bob</h2>
  The trouble with Bob is that he takes much better photos than I do:
  <div about="http://example.com/bob/photos/sunset.jpg">
    <img src="http://example.com/bob/photos/sunset.jpg" />
    <span property="dc:title">Beautiful Sunset</span>
    by <span property="dc:creator">Bob</span>.
  </div>
</div>
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name">Alice Birpemswick</p>
  <p>Email: <a rel="foaf:mbox" href="mailto:[email protected]">[email protected]</a></p>
  <p>Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a></p>
</div>
<div xmlns:foaf="http://xmlns.com/foaf/0.1/" about="#me" rel="foaf:knows">
  <ul>
    <li typeof="foaf:Person">
      <a property="foaf:name" rel="foaf:homepage" href="http://example.com/bob">Bob</a>
    </li>
    <li typeof="foaf:Person">
      <a property="foaf:name" rel="foaf:homepage" href="http://example.com/eve">Eve</a>
    </li>
    <li typeof="foaf:Person">
      <a property="foaf:name" rel="foaf:homepage" href="http://example.com/manu">Manu</a>
    </li>
  </ul>
</div>
Using RDFa
• RDF Validator
  – http://validator.w3.org/
• RDFa Distiller
  – http://www.w3.org/2007/08/pyRdfa/
<http://example.org/john-d/> <http://xmlns.com/foaf/0.1/primaryTopic> <http://example.org/john-d/#me>.
<http://example.org/john-d/> <http://purl.org/dc/elements/1.1/creator> "Jonathan Doe"@en.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/nick> "John D"@en.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <http://www.neubauten.org/>.
<http://example.org/john-d/#me> <http://xmlns.com/foaf/0.1/interest> <urn:ISBN:0752820907>.
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/title> "Weaving the Web"@en.
<urn:ISBN:0752820907> <http://purl.org/dc/elements/1.1/creator> "Tim Berners-Lee"@en.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:foaf="http://xmlns.com/foaf/0.1/"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      version="XHTML+RDFa 1.0" xml:lang="en">
  <head>
    <title>John's Home Page</title>
    <base href="http://example.org/john-d/" />
    <meta property="dc:creator" content="Jonathan Doe" />
    <link rel="foaf:primaryTopic" href="http://example.org/john-d/#me" />
  </head>
  <body about="http://example.org/john-d/#me">
    <h1>John's Home Page</h1>
    <p>My name is <span property="foaf:nick">John D</span> and I like
      <a href="http://www.neubauten.org/" rel="foaf:interest" xml:lang="de">Einsturzende Neubauten</a>.
    </p>
    <p>
      My <span rel="foaf:interest" resource="urn:ISBN:0752820907">favorite book is the inspiring
        <span about="urn:ISBN:0752820907">
          <cite property="dc:title">Weaving the Web</cite> by
          <span property="dc:creator">Tim Berners-Lee</span>
        </span>
      </span>
    </p>
  </body>
</html>