Copyright 2005 Digital Enterprise Research Institute. All rights reserved.
www.deri.org
DM110 Emerging Web Media
Dr. John Breslin
[email protected]
http://sw.deri.org/~jbreslin/
Week 10: Semantic Web / Web 3.0
2
What is the Semantic Web?
• Sir Tim Berners-Lee et al., Scientific American, 2001:
  – “An extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
• “Entrepreneurs see a Web guided by common sense”, John Markoff, New York Times, 2006:
  – “Referred to as Web 3.0, the effort is in its infancy, and the very idea has given rise to skeptics who have called it an unobtainable vision. But the underlying technologies are rapidly gaining adherents, at big companies like IBM and Google as well as small ones.”
• Requires web pages to have metadata with underlying ontologies
3
Where are we in the Semantic Web layer cake?
[Figure: Semantic Web layer cake with a “You are here” marker]
4
What is metadata?
• Metadata has been with us since the first librarian made a list of the items on a shelf of handwritten scrolls
• The term “meta” comes from a Greek word that denotes “alongside, with, after, next”
• More recent Latin and English usage would employ “meta” to denote something transcendental, or beyond nature
• Metadata can be thought of as “data about data”
• It is the Internet-age term for information that librarians traditionally have put into catalogues, and it most commonly refers to descriptive information about Web resources
5
Why do we need metadata?
• To provide a structured description of characteristics such as the meaning (semantics), content, structure and purpose of a resource
• To facilitate information sharing
• To enable more sophisticated search engines on the Internet
• To support intelligent agents and the pushing of data (e.g. from blog feeds)
• To minimise data loss or repetition
• To improve resource discovery by enabling field-based searches
6
Why does the Web need metadata?
• No metadata:
  – Google
• Library analogy:
  – Index every word in every page in every book
➔ Bad search results
➔ Lagging the growth and change in the Web
• Metadata (basic):
  – Yahoo! Directory
• Library analogy:
  – Categories
  – Titles
  – Descriptions
  – Ratings
➔ Better results
➔ More work in classifying things and assigning properties!
7
What kind of resources, objects, things?
• HTML documents
• digital images
• databases
• books
• museum objects
• archival records
• metadata records
• collections
• services
• physical places
• people (using FOAF)
• abstract “works”
• concepts
• events
8
Who or what makes use of metadata?
• People:
  – an owner managing resources
  – a researcher seeking resources
  – third-party services
• Software agents:
  – aggregators (e.g. blog collections)
  – “portals” presenting “landscape” of data to users
  – “brokers” performing query tasks on behalf of users
9
What can they do with metadata?
• End user wants to:
  – find
  – identify
  – select
  – obtain/use
  – interpret
• Third-party service may want to:
  – disclose/promote
  – enable and control access/use
  – annotate
  – re-contextualise
10
Metadata and ontologies
• Metadata elements are used to provide structure to the description of a resource:
  – e.g. title, description, keywords, author, educational level, version, location, language, date created, etc.
• Further structure is provided by a metadata schema or ontology:
  – For example, if there is metadata about a soccer team, an underlying ontology will say that a soccer team always has a goalkeeper and always has a manager, so each metadata entry for a soccer team should have that information
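The soccer team example can be sketched in a few lines of Python. This is only an illustration of the idea that a schema tells us which metadata elements an entry should carry; the element names, the `REQUIRED_ELEMENTS` table and the sample entries are all hypothetical and do not come from any real ontology.

```python
# Hypothetical schema: which metadata elements each kind of thing must have.
REQUIRED_ELEMENTS = {"SoccerTeam": {"name", "goalkeeper", "manager"}}


def missing_elements(kind, entry):
    """Return the required elements that the entry lacks, sorted by name."""
    required = REQUIRED_ELEMENTS.get(kind, set())
    # An element counts as present only if it has a non-empty value.
    present = {key for key, value in entry.items() if value}
    return sorted(required - present)


# Illustrative entries (made-up names).
complete = {"name": "Example FC", "goalkeeper": "A. Keeper", "manager": "B. Boss"}
incomplete = {"name": "Example FC", "goalkeeper": "A. Keeper"}

print(missing_elements("SoccerTeam", complete))
print(missing_elements("SoccerTeam", incomplete))
```

A real ontology language expresses such rules declaratively rather than in application code; the point here is only that the schema, not the individual entry, decides what is required.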
11
How is metadata created?
• By software tools:
  – indexing robots, web crawlers
  – from resource content, from server info
• By people:
  – descriptions added by resource creator/owner
  – descriptions provided by third party services, specialist cataloguers or resource users
• Creating (and maintaining) good quality metadata is not always cheap:
  – may be rights issues for metadata as well as for resources
12
Where can you find metadata?
• Embedded within the coding for a resource itself:
  – depends on format of resource
  – can metadata be extracted from resource?
• Linked to resource
• In a database of descriptions/repository of resources:
  – may be remote database
• …
• Adopt approach which offers most flexibility:
  – may need to “present” different subsets of full metadata in different contexts
13
What about metadata standards?
• Metadata standards are agreed-on criteria for describing data to support interoperability
• Simple example:
  – January 31, 2006
  – 31 janvier 2006
  – 2006-01-31
  – 01-31-2006
  – 31012006
• Need some consistent forms for exchanging metadata
• Many standards for different domains (Dublin Core, Warwick Framework, SCORM, IMS, ARIADNE, IEEE LOM, AICC, ADL SCORM, Merlot, RDF), so may also need mappings between these standards
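To make the date example concrete, here is a minimal Python sketch that maps the five spellings above onto one agreed form (ISO 8601, YYYY-MM-DD). The list of accepted formats and the tiny French month table are assumptions chosen to cover only the examples on this slide; a real exchange standard would fix a single format instead of guessing.

```python
from datetime import datetime

# Assumed input formats, one per example on the slide.
# Note: %B relies on English month names (Python's default "C" locale).
FORMATS = ["%B %d, %Y", "%d %B %Y", "%Y-%m-%d", "%m-%d-%Y", "%d%m%Y"]

# Minimal French-to-English month mapping; only the month used above is listed.
FRENCH_MONTHS = {"janvier": "January"}


def to_iso_date(text):
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    words = [FRENCH_MONTHS.get(word.lower(), word) for word in text.split()]
    cleaned = " ".join(words)
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    raise ValueError(f"unrecognised date format: {text!r}")


EXAMPLES = ["January 31, 2006", "31 janvier 2006", "2006-01-31", "01-31-2006", "31012006"]
for example in EXAMPLES:
    print(example, "->", to_iso_date(example))
```

Guessing formats like this is fragile (a string such as 01-02-2006 is ambiguous between day-first and month-first), which is exactly why agreed standards are needed.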
14
What is RDF?
• On the Semantic Web, we use a standard called RDF to express metadata about resources, and RDF Schema to create metadata schemas or ontologies
• RDF stands for Resource Description Framework
• RDF is a framework for describing and interchanging Semantic Web metadata
• “RDF is an infrastructure that enables the encoding, exchange, and reuse of structured metadata” - Bearman et al., 1999
15
A typical full text search without RDF
• Web pages at the moment are mainly text, e.g.
  – “Stefan Decker works at DERI, funded by SFI.”
• NLP not evolved enough to solve human problems, e.g. how can one find out Stefan’s funding agency?
  1. Google: “stefan decker +deri”
     – Did I choose the right keywords?
  2. Look through results
     – How do I know Google’s rankings are correct?
  3. Click on most likely link
     – But is it really the best choice?
  4. Search through text for the answer
     – The answer in the text is ambiguous…
16
Same example but with RDF metadata
• If we use RDF metadata in a web page, e.g.
  <Person>
    <name>Stefan Decker</name>
    <workplaceHomepage>http://www.deri.ie/</workplaceHomepage>
    <fundedBy>http://www.sfi.ie/</fundedBy>
  </Person>
• Now a computer can return an answer to a question such as “who funds Stefan Decker?” rather than requiring a combination of person plus computer to figure it out!
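As a rough illustration of why structured metadata helps, the following Python sketch answers “who funds Stefan Decker?” directly from the simplified markup above, using only the standard library XML parser. The function name `funders_of` is hypothetical, and the markup is the slide's simplified form rather than full RDF/XML.

```python
import xml.etree.ElementTree as ET

# The simplified metadata from the slide, embedded as a string.
PAGE_METADATA = """
<Person>
  <name>Stefan Decker</name>
  <workplaceHomepage>http://www.deri.ie/</workplaceHomepage>
  <fundedBy>http://www.sfi.ie/</fundedBy>
</Person>
"""


def funders_of(xml_text, person_name):
    """Return the fundedBy values of every Person whose name matches."""
    root = ET.fromstring(xml_text.strip())
    funders = []
    # iter() also visits the root element itself, so a lone <Person> works.
    for person in root.iter("Person"):
        if person.findtext("name") == person_name:
            funders.extend(element.text for element in person.findall("fundedBy"))
    return funders


print(funders_of(PAGE_METADATA, "Stefan Decker"))
```

No keyword guessing, result ranking or reading of free text is involved: the question maps onto a named field.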
17
What does RDF consist of?
• Resources
  – A resource is a thing you talk about (can reference)
  – Resources have URIs (e.g. they may be web pages, a part of an XML document, etc.)
• Properties
  – Slots, define relationships to other resources or atomic values
• Statements
  – “Resource has Property with Value” (expressed as a Subject / Predicate / Object statement)
  – Values can be resources or atomic XML data (e.g. “literal” string)
• Frames
  – A straightforward way to express these abstract properties in XML
18
A simple RDF example
• Statement:
  – “Ora Lassila is the creator of the resource (web page) http://www.w3.org/Home/Lassila”
• Structure:
  Resource (subject):   http://www.w3.org/Home/Lassila
  Property (predicate): http://www.schema.org/#Creator
  Value (object):       "Ora Lassila"
• Directed graph:
  http://www.w3.org/Home/Lassila --s:Creator--> "Ora Lassila"
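The same statement can be held in a program as a subject / predicate / value triple. The Python sketch below is one possible in-memory representation, not a real RDF library: the `Statement` type and the `find` helper are hypothetical names introduced for illustration, and `None` is used as a wildcard when querying.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str    # the resource being described (a URI)
    predicate: str  # the property (a URI)
    value: str      # another resource or a literal string


# The single statement from the slide.
GRAPH = [
    Statement(
        subject="http://www.w3.org/Home/Lassila",
        predicate="http://www.schema.org/#Creator",
        value="Ora Lassila",
    ),
]


def find(graph, subject=None, predicate=None, value=None):
    """Return the statements matching every field that is not None."""
    pattern = (subject, predicate, value)
    return [
        statement
        for statement in graph
        if all(want is None or want == have for want, have in zip(pattern, statement))
    ]


for statement in find(GRAPH, predicate="http://www.schema.org/#Creator"):
    print(statement.subject, "was created by", statement.value)
```

Each statement is one arrow in the directed graph: the subject and value are the nodes, and the predicate labels the arrow between them.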
19
Simple RDF example shown in RDF/XML
• In the directed graphs, the arrows point from the subject to the object, and the text on the arrow is the predicate
• The ellipses are resources and the rectangles are literals or text strings
• We can also represent this graph model in RDF/XML:
<rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
  <Creator>Ora Lassila</Creator>
</rdf:Description>
20
Expanding on the previous example
[Graph:
  http://www.w3.org/Home/Lassila --s:Creator--> Person://fi/654645635
  Person://fi/654645635 --Name--> "Ora Lassila"
  Person://fi/654645635 --Email--> "[email protected]"]
• To add properties to the “Creator”, point through an intermediate resource (the ellipses are resources and the rectangles are literals or text strings)
21
Expanded RDF example shown in RDF/XML
<rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
  <Creator rdf:resource="Person://fi/654645635"/>
</rdf:Description>
<rdf:Description rdf:about="Person://fi/654645635">
  <Name>Ora Lassila</Name>
  <Email>[email protected]</Email>
</rdf:Description>
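A program can follow the intermediate resource from the page to the creator's name. The Python sketch below reads a namespaced version of the example with the standard library XML parser. Several things here are assumptions made so that the fragment parses: the enclosing rdf:RDF element, the `s` prefix bound to http://www.schema.org/# for both Creator and Name, and the omission of the Email property. It handles only this simple rdf:Description pattern, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
S_NS = "http://www.schema.org/#"  # assumed namespace for Creator and Name

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:s="{S_NS}">
  <rdf:Description rdf:about="http://www.w3.org/Home/Lassila">
    <s:Creator rdf:resource="Person://fi/654645635"/>
  </rdf:Description>
  <rdf:Description rdf:about="Person://fi/654645635">
    <s:Name>Ora Lassila</s:Name>
  </rdf:Description>
</rdf:RDF>
"""


def triples(xml_text):
    """Yield (subject, predicate, object) for each property of each rdf:Description."""
    root = ET.fromstring(xml_text.strip())
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A property either points at another resource or holds a literal.
            target = prop.get(f"{{{RDF_NS}}}resource")
            yield subject, prop.tag, target if target is not None else prop.text


def creator_name(xml_text, page):
    """Follow page -> Creator -> Name and return the name, or None if absent."""
    found = list(triples(xml_text))
    for subject, predicate, obj in found:
        if subject == page and predicate == f"{{{S_NS}}}Creator":
            for subject2, predicate2, obj2 in found:
                if subject2 == obj and predicate2 == f"{{{S_NS}}}Name":
                    return obj2
    return None


print(creator_name(DOC, "http://www.w3.org/Home/Lassila"))
```

In practice a dedicated RDF parser would be used instead of a generic XML parser, since the same graph can be serialised as RDF/XML in several different ways.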
22
What is an ontology?
• In a nutshell, ontologies are formal and consensual specifications of conceptualisations that provide a shared and common understanding of a domain
• Ontologies define the terms used to describe and represent an area of knowledge
• Ontologies are a key enabling technology for the Semantic Web
• They interweave human understanding of symbols with their machine processability
23
Ontologies on the Semantic Web
• Semantic Web ontologies have computer-usable definitions:
➔ Concepts (AKA classes) are general things in the domain:
  – Person, Document, Book, Web_Page
➔ Relationships exist among things:
  – Book, Web_Page are subclasses of Document
➔ Properties (attributes) that things may have:
  – Person has an age, Web_Page has a creation_date
24
Ontology structures
[Figure: ontology structures. From: http://aot.ce.unipr.it/team/poggi/teaching/ia/docs/Ontology.pdf]
25
Why use ontologies?
• Labeling:
  – If I say “car” and you say “automobile”, how do we know we mean the same thing?
• Semantics:
  – If I say “vehicle”, how do you know if this includes buses, powered motorcycles?
• Knowledge sharing and reuse:
  – Need to be able to create definitions of terms in a machine-understandable format
  – Systematic categorisation and computation requires systematic representation:
    • Systematic representation corresponds to an ontology
26
What is a concept?
• Concepts or “classes”:
  – Are in general language independent (the words ‘university’ and ‘ollscoil’ denote the same concept)
  – Are mental or logical representations of reality
  – Are related to other concepts
  – Do not need symbols but hold them for means of communication
• A concept has:
  – Intension, i.e. meaning
  – Extension, i.e. a set of objects that the concept refers to
• On the difference between intension and extension, consider the phrases "Evening Star" and "Morning Star", which have different meanings (intension) yet both refer to the planet Venus (extension)
• Ontology is mainly concerned with intension
27
Components of an ontology
• Concepts
  – Cat
  – Dog
• Properties
  – Length
  – Age
• Constraints
  – Cardinality is at least 1
  – Maximum value is 300
• Axioms
  – Cows are larger than dogs
  – Cats cannot eat only vegetation
• Relationships
  – Is a
  – Part of
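The two constraint examples can be sketched as a small check in Python. Attaching both constraints to an “age” property is an assumption made for illustration, as are the `CONSTRAINTS` table and the `violations` function; ontology languages state such constraints declaratively.

```python
# Hypothetical constraints on the "age" property, using the slide's two examples.
CONSTRAINTS = {"age": {"min_count": 1, "max_value": 300}}


def violations(entry):
    """Return a list of human-readable constraint violations for an entry.

    The entry maps each property name to a list of values.
    """
    problems = []
    for prop, rule in CONSTRAINTS.items():
        values = entry.get(prop, [])
        # Cardinality: the property must occur at least min_count times.
        if len(values) < rule["min_count"]:
            problems.append(f"{prop}: expected at least {rule['min_count']} value(s)")
        # Range: no value may exceed max_value.
        problems.extend(
            f"{prop}: {value} exceeds {rule['max_value']}"
            for value in values
            if value > rule["max_value"]
        )
    return problems


print(violations({"age": [7]}))
print(violations({}))
print(violations({"age": [301]}))
```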
28
An ontology example in RDF
<rdf:Description rdf:ID="Document">
  <rdf:type rdf:resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="http://www.w3.org/...#Resource"/>
</rdf:Description>
<rdf:Description rdf:ID="Book">
  <rdf:type rdf:resource="http://www.w3.org/...#Class"/>
  <rdfs:subClassOf rdf:resource="#Document"/>
</rdf:Description>
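What a program gains from these subclass statements is the ability to infer facts that were never written down, for example that every Book is also a Resource. A minimal Python sketch of that inference follows; it assumes each class has at most one direct superclass (RDF Schema allows several), and the `SUBCLASS_OF` table simply mirrors the two definitions above.

```python
# Direct subClassOf links from the example: Book -> Document -> Resource.
# Assumption: at most one direct superclass per class.
SUBCLASS_OF = {"Book": "Document", "Document": "Resource"}


def is_subclass_of(cls, ancestor):
    """Follow subClassOf links upward; a class counts as a subclass of itself."""
    seen = set()
    while cls is not None and cls not in seen:
        if cls == ancestor:
            return True
        seen.add(cls)  # guards against accidental cycles in the table
        cls = SUBCLASS_OF.get(cls)
    return False


print(is_subclass_of("Book", "Resource"))
print(is_subclass_of("Document", "Book"))
```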
29
Implementing or creating ontologies
• Implementation consists of defining all the ontology components through an ontology definition language
• Generally in two stages:
  – Informal stage:
    • Ontology is sketched out using either natural language descriptions or some diagram technique
  – Formal stage:
    • Ontology is encoded in a formal knowledge representation language that is machine computable
• Different tools (e.g., Protégé) may help in the implementation
30
Can already describe lots of things semantically
• Geographic coordinates:
  – GEO
• Library books:
  – Dublin Core (DC)
• Online discussions:
  – SIOC
• People, social networks:
  – Friend-of-a-Friend (FOAF)
• Maybe even hormones!
  – GeneOnt
31
The power of the Semantic Web
• Interoperability and increased connectivity are possible through a commonality of expression
• Vocabularies can be combined and used together:
  – e.g. a description of a book using Dublin Core metadata can be augmented with specifics about the book author using the Friend-of-a-Friend vocabulary
• Vocabularies can be easily extended (modules, etc.)
• Intelligent search with more granularity and relevance:
  – e.g. a search can be personalised to an individual by making use of their identity and relationship information
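The book example can be sketched with plain triples in Python. The Dublin Core and FOAF namespace URIs are the published ones; the book and author URIs, the title and the author's name are made up for illustration, and the helper names `objects` and `author_names` are hypothetical.

```python
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set
FOAF = "http://xmlns.com/foaf/0.1/"       # Friend-of-a-Friend vocabulary

BOOK = "http://example.org/books/1"       # hypothetical book URI
AUTHOR = "http://example.org/people/1"    # hypothetical author URI

# One graph mixing both vocabularies: DC describes the book, FOAF the person.
TRIPLES = [
    (BOOK, DC + "title", "An Example Book"),
    (BOOK, DC + "creator", AUTHOR),
    (AUTHOR, FOAF + "name", "Jane Author"),
    (AUTHOR, FOAF + "homepage", "http://example.org/~jane/"),
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def author_names(book):
    """Go from a book to its creators (DC), then to their names (FOAF)."""
    return [
        name
        for author in objects(book, DC + "creator")
        for name in objects(author, FOAF + "name")
    ]


print(author_names(BOOK))
```

Because both vocabularies describe resources identified by URIs, the two descriptions join on the author's URI without either vocabulary needing to know about the other.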
32
The challenge for the Semantic Web
• The Semantic Web can’t work all by itself:
  – If it did it would be called the “Magic Web”
  – It will need some help to become a reality
• For example, it is not very likely that you will be able to sell your car just by putting your RDF file on the Web
• Need society-scale applications:
  – Consumers and processors of Semantic Web data
  – Semantic Web agents or services
  – More advanced collaborative applications that make real use of shared data and annotations
33
The path to Web 3.0
• The Semantic Web effort is mainly towards producing standards and recommendations that will interlink applications
• The Web 2.0 meme (already discussed) is about providing user applications
• Not mutually exclusive:
  – http://www.oreillynet.com/xml/blog/2005/10/is_web_20_killing_the_semantic.html
  – With a little effort, many Web 2.0 applications can and do use Semantic Web technologies to great benefit
34
Semantic Web + Web 2.0 = Web 3.0
• Web 2.0 applications such as blogging and wikis have become very popular and at the same time have created an interconnected information space (through the “blogosphere” and inter-wiki links)
• However, these applications are running into limits in information dissemination, as they need more automated ways of distributing information
• The Semantic Web is increasingly aiming at these application areas:
  – Semantic Wikis, Semantic Desktops, etc.