RDFa, Etc. (Resource Description Framework–in–attributes) W3C: RDFa 1.1 Primer

Date post: 29-Dec-2015
Upload: jody-lewis
RDFa, Etc. (Resource Description Framework–in–attributes)

W3C: RDFa 1.1 Primer

http://www.w3.org/TR/xhtml-rdfa-primer/

RDFa

RDFa allows RDF statements to be included in ordinary HTML/XHTML files using formally defined attributes

A W3C recommendation, http://www.w3.org/TR/rdfa-core

In RDFa 1.0 the vocabularies are specified using XML namespaces, so use it with XHTML, not HTML, document types; RDFa 1.1 adds the prefix attribute, which also works in HTML5

Do not generate RDF/XML files separately

RDF/XML is complex

Requires separate creation and storage mechanisms

Add extra structured content to the (X)HTML pages

Let processors extract that content and turn it into RDF

RDFa provides attributes to carry metadata in an XML language

Note the ‘a’ (attributes) in RDFa

These attributes include:

about gives a URI specifying the resource the metadata is about

rel and rev specify a relationship and inverse relationship with another resource, resp.

src, href and resource specify the partner resource

property specifies a property whose value is the content of the element or the partner resource

content (optional) overrides the content of the element when using the property attribute

datatype (optional) specifies the datatype of text specified for use with the property attribute

typeof (optional) specifies the RDF type(s) of the subject or the partner resource

Five "principles of interoperable metadata" met by RDFa

Publisher Independence: Each site can use its own standards

Data Reuse: Data are not duplicated—separate XML and HTML sections aren’t required for the same content.

Self Containment: The HTML and the RDF are separated

Schema Modularity: The attributes are reusable

Evolvability: Additional fields can be added and XML transforms can extract the semantics of the data from an XHTML file

Attributes map to RDF components

Subject: about, src, e.g., about="rdfa-course"

Predicate: property, rel, rev, typeof, e.g., property="dc:title"

Object: content, href, resource, or simply the text content of the element (optionally typed with datatype), e.g., RDFa Course as the content of an HTML element

Example

<div about="rdfa-course">

<h3 property="dc:title">RDFa Course</h3>

</div>
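The mapping above can be sketched in a few lines of Python. This is a toy, not an RDFa processor: it assumes the subject comes only from an enclosing about attribute and the object only from element text, and it neither expands the dc: prefix nor resolves rdfa-course against a base URI. The class name TinyRDFaParser is made up for this sketch.

```python
from html.parser import HTMLParser


class TinyRDFaParser(HTMLParser):
    """Toy extractor: only `about` subjects and `property` with text content."""

    def __init__(self):
        super().__init__()
        self.stack = []      # (tag, about-or-None) for each open element
        self.pending = None  # (subject, predicate) waiting for element text
        self.triples = []

    def current_subject(self):
        # The nearest enclosing `about` supplies the subject.
        for _, about in reversed(self.stack):
            if about is not None:
                return about
        return None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        self.stack.append((tag, attrs.get("about")))
        subject = self.current_subject()
        if "property" in attrs and subject is not None:
            self.pending = (subject, attrs["property"])

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((*self.pending, data.strip()))
            self.pending = None

    def handle_endtag(self, tag):
        # Pop back to the matching open tag (tolerates unclosed elements).
        self.pending = None
        while self.stack:
            open_tag, _ = self.stack.pop()
            if open_tag == tag:
                break


parser = TinyRDFaParser()
parser.feed('<div about="rdfa-course">'
            '<h3 property="dc:title">RDFa Course</h3></div>')
print(parser.triples)  # [('rdfa-course', 'dc:title', 'RDFa Course')]
```

A real processor (such as the Distiller described later) also handles rel, rev, typeof, content, datatype and prefix expansion.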

RDFa Example

<div xmlns:v="http://rdf.data-vocabulary.org/#"

typeof="v:Person">

<span typeof="v:Address">

<span property="v:locality">Albuquerque</span>

<span property="v:region">NM</span>

</span>

</div>

The namespace used here identifies data-vocabulary.org, an earlier Google-backed vocabulary that Schema.org has since superseded (see below)

Publishing RDFa

RDFa provides an easy way of publishing RDF data on the Web

Often the same RDF data is available in different formats, including RDFa

The client chooses which one(s) to support

Consuming RDFa

Various search engines have begun to consume RDFa

Google, Yahoo, …

They may specify which vocabularies they “understand”

Facebook’s Open Graph protocol, which connects web pages to its “social graph”, is based on RDFa

RDFa Distiller

W3C service to identify and list RDF in a web page

http://www.w3.org/2012/pyRdfa/

Extract RDF from HTML + RDFa

Using a web address, local file or direct text inputs, it provides a clean view of the implied data hierarchy

Example

Select the tab Distill by Direct Text Input and copy the following into the window

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<title>Books by Marco Pierre White</title>

</head>

<body>

I think White's book

'<span about="urn:ISBN:0091808189"

typeof="http://purl.org/ontology/bibo/Book"

property="http://purl.org/dc/terms/title"

>Canteen Cuisine</span>'

is well worth getting since although it's quite advanced stuff, he

makes it pretty easy to follow. You might also like

<span about="urn:ISBN:1596913614"

typeof="http://purl.org/ontology/bibo/Book"

property="http://purl.org/dc/terms/description"

>White's autobiography</span>.

</body>

</html>

Choose the following selections in the dropdowns below the text window

Host Language: HTML5 + RDFa

Output Format: Turtle

Returned content: Only core triples

Expand vocabularies: No

Generate warnings for non RDFa 1.1 Lite usage: No

Click the Go button (below these dropdowns)

The output is presented in a downloaded file; open it in, e.g., Notepad++

For our example, the output is

@prefix dc: <http://purl.org/dc/terms/> .

<urn:ISBN:0091808189> a <http://purl.org/ontology/bibo/Book>;

dc:title "Canteen Cuisine" .

<urn:ISBN:1596913614> a <http://purl.org/ontology/bibo/Book>;

dc:description "White's autobiography" .
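To see what the Turtle abbreviations stand for, the same four triples can be written out as N-Triples, one full statement per line. A minimal sketch, assuming a hand-built list of triples (the tuple layout and the helper name to_ntriples are choices made for this illustration, not Distiller output):

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"  # Turtle's "a"
DC = "http://purl.org/dc/terms/"
BOOK = "http://purl.org/ontology/bibo/Book"

# (subject IRI, predicate IRI, (kind, value)); kind is "iri" or "literal".
triples = [
    ("urn:ISBN:0091808189", RDF_TYPE, ("iri", BOOK)),
    ("urn:ISBN:0091808189", DC + "title", ("literal", "Canteen Cuisine")),
    ("urn:ISBN:1596913614", RDF_TYPE, ("iri", BOOK)),
    ("urn:ISBN:1596913614", DC + "description",
     ("literal", "White's autobiography")),
]


def to_ntriples(triples):
    lines = []
    for subject, predicate, (kind, value) in triples:
        if kind == "iri":
            obj = f"<{value}>"
        else:
            # Escape the characters that would break a quoted literal.
            escaped = (value.replace("\\", "\\\\")
                            .replace('"', '\\"')
                            .replace("\n", "\\n"))
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

Each ";" in the Turtle output repeats the subject, and "a" is shorthand for the rdf:type predicate.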

RDFa Developer

https://addons.mozilla.org/en-US/firefox/addon/rdfa-developer/?src=ss

Firefox add-on that lets us visualize all the RDFa triples in a web page

Shows a list of errors and warnings found while parsing the document

Lets us execute SPARQL queries on the RDFa content

To install, follow the link above, click the Add to Firefox button, and restart Firefox (perhaps first look under Tools > Add-ons for a restart option in the Developer listing)

The Developer windows occupy the bottom part of the screen

To add an icon in the lower right corner of the browser (the icon bar), in the View menu at the top, under Toolbars, have Add-on bar checked

Click the icon to toggle the Developer display off and on

By default, the Developer windows appear when you start up Firefox

To prevent this, in the Tools tab, select Add-ons

In the resulting display, click the Disable button for RDFa Developer

To use the Developer again, go back and click the Enable button

If the Developer icon doesn't appear in the add-on bar, choose View > Toolbars > Customize and drag the Developer icon from the palette to the add-on bar

Example

Save the following code (the previous example, rewritten with the bibo: and dc: prefixes) in an HTML file

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<title>Books by Marco Pierre White</title>

</head>

<body xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:dc="http://purl.org/dc/terms/">

I think White's book

'<span about="urn:ISBN:0091808189"

typeof="bibo:Book"

property="dc:title"

>Canteen Cuisine</span>'

is well worth getting since although it's quite advanced stuff, he

makes it pretty easy to follow. You might also like

<span about="urn:ISBN:1596913614"

typeof="bibo:Book"

property="dc:description"

>White's autobiography</span>.

</body>

</html>

Open the saved HTML file in Firefox

The output should show 4 triples in the Data tab (expand by clicking the triangles) and 3 warnings in the Notices tab

If the tabs do not show any triples or warnings, try disabling and re-enabling the RDFa Developer add-on

Regarding the Notices tab (errors and warnings), suppose we remove the bibo namespace declaration from the body element

Change

<body xmlns:bibo="http://purl.org/ontology/bibo/"

xmlns:dc="http://purl.org/dc/terms/">

to

<body xmlns:dc="http://purl.org/dc/terms/">

Open the saved HTML file in Firefox

The output shows the errors and warnings in the Notices tab

The errors specify that the prefix used for the bibo namespace is not defined (and the attribute with this prefix is unused)
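The kind of check behind this error can be approximated in Python. This is a rough sketch, not how RDFa Developer works internally: it assumes prefixes are declared only with xmlns: attributes, ignores where in the tree a declaration appears, and ignores the RDFa 1.1 prefix attribute and predefined prefixes. The names PrefixChecker and undeclared_prefixes are made up here.

```python
from html.parser import HTMLParser

# Attributes whose values may hold prefixed names (CURIEs).
CURIE_ATTRS = ("typeof", "property", "rel", "rev", "datatype")


class PrefixChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.declared = set()
        self.used = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name.startswith("xmlns:"):
                self.declared.add(name[len("xmlns:"):])
            elif name in CURIE_ATTRS and value:
                for term in value.split():
                    prefix, sep, _ = term.partition(":")
                    # Skip full URIs such as http://... or urn:...
                    if sep and not term.startswith(("http:", "https:", "urn:")):
                        self.used.add(prefix)


def undeclared_prefixes(html):
    checker = PrefixChecker()
    checker.feed(html)
    return sorted(checker.used - checker.declared)


broken = ('<body xmlns:dc="http://purl.org/dc/terms/">'
          '<span about="urn:ISBN:0091808189" typeof="bibo:Book" '
          'property="dc:title">Canteen Cuisine</span></body>')
print(undeclared_prefixes(broken))  # ['bibo']
```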

I couldn't get the Query tab to submit queries

E.g., on the BBC News World website, http://www.bbc.co.uk/news/world/, one part of the HTML is as follows

To see the source HTML in Mozilla Firefox, right-click in the window and, in the resulting menu, click View Page Source

<meta property="og:title" content="BBC News: World">

<meta property="og:description" content="World news from the BBC">

<meta property="og:url" content="http://www.bbc.co.uk/news/world/">

<meta property="og:type" content="website">

<meta property="og:image" content="http://news.bbcimg.co.uk/media/images/56400000/jpg/_56400259_bbcnews.jpg">

<meta property="og:site_name" content="BBC News">

<meta property="fb:app_id" content="218019758281651">

The next slide shows part of the RDFa Developer Data tab

RDFa occurs in several places on the page, so the Data tab also lists triples from RDFa markup not shown here

The og: prefix is for the Open Graph protocol

The Open Graph protocol

http://ogp.me/

Enables any web page to become a rich object in a social graph

Used on Facebook to allow any web page (by adding metadata) to have the same functionality as any other object on Facebook

Since Open Graph is an open protocol of sorts, it's not Facebook specific

Google Plus gives schema.org markup the highest weight

If that doesn’t exist, it falls back on Open Graph tags

If those do not exist, it falls back on page content, like "title", etc.

Even without a good internal search engine, Facebook already drives more traffic for some searches (social searches) than Google

Many technologies and schemas exist, but no single technology provides enough info to richly represent any web page within the social graph

The Open Graph protocol builds on these existing technologies

Developer simplicity is a key goal that has informed many of the technical design decisions

See: The Open Graph Protocol Design Decisions (D. Recordon, presented at the W3C’s Linked Data CAMP at WWW 2010)

http://www.scribd.com/doc/30715288/The-Open-Graph-Protocol-Design-Decisions

Within 7 days of implementation, the following were available:

og:it, a simple metadata extractor to HTML

OpenGraph.in, a simple metadata extractor to HTML and JSON

Multiple RDF parsers that understand the Open Graph protocol

An Open Graph protocol to JSON converter for testing

Open source libraries for Java, Perl, PHP, and Ruby

A WordPress plugin for easy publishing

Initial version is based on RDFa

Place additional <meta> tags in the <head> of your web page

The 4 required properties

og:title: the title of your object as it’s to appear in the graph

og:type: the type of your object, e.g., "video.movie"; depending on the type you specify, other properties may also be required

og:image: an image URL to represent your object in the graph

og:url: the canonical URL of your object, used as its permanent ID in the graph

Example: the Open Graph protocol markup for The Rock on IMDB

<html prefix="og: http://ogp.me/ns#">

<head>

<title>The Rock (1996)</title>

<meta property="og:title" content="The Rock" />

<meta property="og:type" content="video.movie" />

<meta property="og:url"

content="http://www.imdb.com/title/tt0117500/" />

<meta property="og:image"

content="http://ia.media-imdb.com/images/rock.jpg" />

...

</head>

...

</html>

Also 7 optional properties
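A consumer can check a page for the four required properties with the standard library alone. A minimal sketch, assuming the properties appear as meta elements with property and content attributes as in the example above; OpenGraphParser and missing_required are names invented for this sketch, not part of any Open Graph library.

```python
from html.parser import HTMLParser

REQUIRED = ("og:title", "og:type", "og:image", "og:url")


class OpenGraphParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.properties = []  # (property, content) pairs in document order

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop, content = attrs.get("property"), attrs.get("content")
        if prop and prop.startswith("og:") and content is not None:
            self.properties.append((prop, content))


def missing_required(html):
    parser = OpenGraphParser()
    parser.feed(html)
    present = {prop for prop, _ in parser.properties}
    return [name for name in REQUIRED if name not in present]


page = """
<html prefix="og: http://ogp.me/ns#"><head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="video.movie" />
<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
<meta property="og:image" content="http://ia.media-imdb.com/images/rock.jpg" />
</head></html>
"""
print(missing_required(page))  # []
```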

Some properties can have extra metadata attached to them

E.g., the og:image property has some optional structured properties

og:image:url—identical to og:image

og:image:secure_url—an alternate url to use if the webpage requires HTTPS

og:image:type—a MIME type for this image

og:image:width—the number of pixels wide

og:image:height—the number of pixels high

The og:video tag has identical structured properties

The og:audio tag has only the first 3 of them

If a tag can have multiple values, put multiple versions of the same <meta> tag on your page

The 1st tag (from top to bottom) is given preference during conflicts

This is effectively an array of values
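The array behaviour can be sketched as follows. The sketch assumes, as I read the protocol description at ogp.me, that a structured property such as og:image:width belongs to the most recently declared og:image; the example.com URLs and the function name group_images are hypothetical.

```python
def group_images(properties):
    """Group (property, content) pairs, in document order, into image records."""
    images = []
    for prop, content in properties:
        if prop == "og:image":
            # Each repeated root tag starts a new array element.
            images.append({"url": content})
        elif prop.startswith("og:image:"):
            if not images:
                images.append({})
            # Structured properties attach to the latest root tag.
            images[-1][prop[len("og:image:"):]] = content
    return images


properties = [
    ("og:image", "http://example.com/rock.jpg"),
    ("og:image:width", "300"),
    ("og:image:height", "300"),
    ("og:image", "http://example.com/rock2.jpg"),
]
images = group_images(properties)
preferred = images[0]  # the first tag wins when there is a conflict
print(images)
```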

When the community agrees on the schema for a type, it’s added to the list of global types

All other objects in the type system are CURIEs (see below) of the form

<head prefix="my_namespace: http://example.com/ns#">

<meta property="og:type" content="my_namespace:my_type" />

The global types are grouped into verticals, each with its own namespace

The og:type values for a namespace are prefixed with the namespace and then a period

This reduces confusion with user-defined namespace types (which have colons)

Example (more a candidate vertical): profile, with namespace URI http://ogp.me/ns/profile#

profile:first_name—string—a given name

profile:last_name—string—a name inherited from a family or marriage

profile:username—string—a short unique string to identify them

profile:gender—enum(male, female)—their gender

The types used when defining attributes

Boolean: values true, false, 1, 0

DateTime: composed of a date (year, month, day) and an optional time component (hours, minutes) as per the ISO 8601 standard

Enum: a type consisting of a bounded set of constant string values

Float: a 64-bit signed floating point number

Integer: a 32-bit signed integer

String: a sequence of Unicode characters

URL: all valid URLs that utilize the http:// or https:// protocols
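These types translate into simple value checks. The following is a sketch under stated assumptions: the function names are invented here, the URL check only tests the scheme, and datetime.fromisoformat accepts only a subset of ISO 8601 on older Python versions.

```python
from datetime import datetime


def parse_boolean(text):
    if text in ("true", "1"):
        return True
    if text in ("false", "0"):
        return False
    raise ValueError(f"not a Boolean: {text!r}")


def parse_integer(text):
    value = int(text)
    if not -2**31 <= value <= 2**31 - 1:  # 32-bit signed range
        raise ValueError(f"out of 32-bit range: {text!r}")
    return value


def parse_float(text):
    return float(text)  # Python floats are 64-bit


def parse_enum(text, allowed):
    if text not in allowed:
        raise ValueError(f"{text!r} is not one of {sorted(allowed)}")
    return text


def parse_url(text):
    if not text.startswith(("http://", "https://")):
        raise ValueError(f"not an http(s) URL: {text!r}")
    return text


def parse_datetime(text):
    return datetime.fromisoformat(text)


def is_valid(parser, text, *args):
    """Return True if `parser` accepts `text`, False if it raises ValueError."""
    try:
        parser(text, *args)
        return True
    except ValueError:
        return False


print(is_valid(parse_enum, "male", {"male", "female"}))  # True
print(is_valid(parse_integer, "2147483648"))             # False
```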

Discuss the Open Graph protocol in the Facebook group (https://www.facebook.com/groups/opengraph/) or on the developer mailing list (http://groups.google.com/group/open-graph-protocol)

The open source community has developed several parsers and publishing tools

Facebook Object Debugger: Facebook's official parser and debugger

Google Rich Snippets Testing Tool: Open Graph protocol support in specific verticals and search engines

OpenGraph.in: a service that parses Open Graph protocol markup and outputs HTML and JSON

PHP Validator and Markup Generator: OGP 2011 input validator and markup generator in PHP5 objects

PHP Consumer: a small library for accessing Open Graph protocol data in PHP

OpenGraphNode in PHP: a simple parser for PHP

PyOpenGraph: a library written in Python for parsing Open Graph protocol information from web sites

OpenGraph Ruby: a Ruby Gem that parses web pages and extracts Open Graph protocol markup

OpenGraph for Java: a small Java class used to represent the Open Graph protocol

RDF::RDFa::Parser: a Perl RDFa parser that understands the Open Graph protocol

WordPress plugin: Facebook's official WordPress plugin

WordPress

http://wordpress.org/

A free and open source blogging tool and a content management system (CMS) based on PHP and MySQL

Runs on a web hosting service

Used by more than 18.9% of the top 10 million websites (August 2013)

The most popular blogging system (>60 M websites)

A CURIE (short for Compact URI) defines a generic, abbreviated syntax for expressing URIs, e.g., [isbn:0393315703]

May be considered a datatype

The square brackets may be used to prevent ambiguities between CURIEs and regular URIs, yielding so-called safe CURIEs

QNames may be considered a type of CURIE

CURIEs can be better defined and may include checking

Unlike QNames, the part of a CURIE after the colon needn’t conform to the rules for XML element names

The final W3C recommendation was released in 2009

Example (using a QName syntax within XHTML)

<html xmlns:wiki="http://en.wikipedia.org/wiki/">

<head>...</head>

<body>

<p>

Find out more about <a href="[wiki:Biome]">biomes</a>.

</p>

</body>

</html>

Here xmlns:wiki is the prefix definition and [wiki:Biome] is the CURIE
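Resolving a safe CURIE is a lookup plus a concatenation. A minimal sketch: the resolve function and the isbn mapping are hypothetical, added only to show that the part after the colon need not be a valid XML name.

```python
def resolve(value, prefixes):
    """Expand a safe CURIE such as "[wiki:Biome]"; pass plain URIs through."""
    if value.startswith("[") and value.endswith("]"):
        prefix, sep, reference = value[1:-1].partition(":")
        if not sep or prefix not in prefixes:
            raise ValueError(f"cannot resolve CURIE {value!r}")
        return prefixes[prefix] + reference
    # Without brackets the value is treated as an ordinary URI.
    return value


prefixes = {
    "wiki": "http://en.wikipedia.org/wiki/",
    "isbn": "urn:ISBN:",  # hypothetical mapping, for illustration only
}
print(resolve("[wiki:Biome]", prefixes))       # http://en.wikipedia.org/wiki/Biome
print(resolve("[isbn:0393315703]", prefixes))  # urn:ISBN:0393315703
print(resolve("http://example.org/", prefixes))
```

The square brackets are what let the function tell a CURIE apart from a URI whose scheme happens to look like a prefix.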

RDFa Play

http://rdfa.info/play/

Beta version (still has bugs), yet very useful

HTML fragment with RDFa in left panel, rendering in right

Choose to see (below the panels) either N3 serialization of contained RDF or its graphical visualization

Examples of type Person, Social Network, Event, Place, Product, SVG

Edit these or make your own HTML fragments from scratch

See the Tools tab at the RDFa.info web page (http://rdfa.info/tools/)

The W3C’s Nu Markup Validation Service

http://validator.w3.org/nu/

Handles RDFa in XML and (X)HTML (various versions) as well as SVG and MathML

Can automatically detect content type

java-rdfa

https://github.com/shellac/java-rdfa

An offshoot of the Stars Project, Univ. of Bristol, Institute for Learning and Research Technology (Web Futures team)

STARS (roughly Semantic Tools for Screen Arts Research) project (http://www.dshed.net/dshed/stars, http://stars.ilrt.bris.ac.uk/blog/) is now finished

Funded by JISC, a charity that champions the use of digital technologies in UK education and research

The Semantic Web technologies used in it broadly seek to capture and make machine readable data resources of video content

Lets people browsing the content discover thematic links and describe them in new ways

For HTML sources, add the format argument; this needs the validator.nu parser (see below)

$ java -cp '*' rdfa.simpleparse --format HTML http://www.slideshare.net/intdiabetesfed/world-diabetes-day-2009

<http://www.slideshare.net/intdiabetesfed/world-diabetes-day-2009>

<http://www.w3.org/1999/xhtml/vocab#stylesheet>

<http://public.slidesharecdn.com/v3/styles/combined.css?1265372095> .

...

The output of simpleparse is N-Triples (hard to read)

Add Jena to the classpath and use rdfa.parse instead

$ java -cp '*:/path/to/jena/lib/*' rdfa.parse --format HTML http://www.slideshare.net/intdiabetesfed/world-diabetes-day-2009

@prefix dc: <http://purl.org/dc/terms/> .

@prefix hx: <http://purl.org/NET/hinclude> .

... nice turtle output ...

java-rdfa can be used from Jena—invoke

Class.forName("net.rootdev.javardfa.RDFaReader");

This hooks the 2 readers into Jena, then we can do either of the following

model.read(url, "XHTML"); // xml parsing

model.read(other, "HTML"); // html parsing

The Validator.nu HTML Parser

http://about.validator.nu/htmlparser/

An implementation of the HTML5 parsing algorithm in Java

Works as a drop-in replacement for the XML parser in applications that

already support XHTML 1.x content with an XML parser and

use SAX, DOM or XOM to interface with the parser

The parser core compiles on Google Web Toolkit

The following are mentioned in RDFa.info, Developers link, http://rdfa.info/dev/

Green Turtle

http://code.google.com/p/green-turtle/

An implementation of RDFa 1.1 for browsers

Including a bit of JavaScript extends the DOM to include the RDFa API

An RDFa 1.1 processor to process any ancillary documents to harvest triples

EasyRdf

http://www.easyrdf.org/

A PHP library to make it easy to consume and produce RDF—e.g.,

$foaf = new EasyRdf_Graph("http://njh.me/foaf.rdf");

$foaf->load();

$me = $foaf->primaryTopic();

echo "My name is: ".$me->get('foaf:name')."\n";

There’s a class to map between RDF Types and PHP Classes

Support for visualization of graphs using GraphViz

EasyRdf 0.8 does support RDFa, but it's still in beta

Use the converter at easyrdf-converter.aelius.com to test it out

pyrdfa3

https://github.com/RDFLib/pyrdfa3

This is what provides the W3C’s RDFa Distiller and Parser

Part of Python RDFLib, https://github.com/RDFLib

The RDFa gem

http://rubygems.org/gems/rdf-rdfa

The Ruby RDF Project collects numerous gems supporting Linked Data and Semantic Web programming in Ruby

See http://ruby-rdf.github.io/

librdfa, “The Fastest RDFa Processor on the Internet”

https://github.com/rdfa/librdfa/

A SAX-based RDFa processor written in C for XML and HTML family languages

Supports XML+RDFa, XHTML+RDFa, SVG+RDFa, HTML4+RDFa and HTML5+RDFa for both RDFa 1.0 and RDFa 1.1

clj-rdfa

https://github.com/niklasl/clj-rdfa

An RDFa extractor implemented in Clojure running on a Java Virtual Machine.

Clojure (pronounced “closure”) is a dialect of the Lisp programming language

A functional general-purpose language

Runs on the Java Virtual Machine, Common Language Runtime, and JavaScript engines

Focus is on programming with immutable values and explicit progression-of-time constructs

Facilitates the development of more robust programs, particularly multithreaded ones

Semargl

http://semarglproject.org/

Download from https://github.com/levkhomich/semargl

A modular framework for crawling linked data from structured documents

Provides lightweight and performant tools without excess dependencies

High-performance streaming parsers for RDFa, JSON-LD (see below), RDF/XML, N-Triples

Streaming serializers for Turtle, N-Triples, N-Quads

Integration with Jena, Sesame (see below) and Clerezza (see below)

Small memory footprint and CPU requirements allow this framework to be used by any application

Runs seamlessly on Android and GAE (Google App Engine)

Sesame

http://www.openrdf.org/about.jsp

An open-source framework for querying and analyzing RDF data

Implements an in-memory triple store and an on-disk triple store

And 2 Servlet packages to manage and provide access to these triple stores on a permanent server

The Sesame Rio (RDF Input/Output) package contains a simple API for Java-based RDF parsers and writers

Supports 2 query languages: SPARQL and SeRQL (in the SWI-Prolog Semantic Web Library, http://www.swi-prolog.org/pldoc/package/semweb.html, see also http://www.swi-prolog.org/web/)

Its Alibaba component is an API that lets us

map Java classes onto ontologies and

generate Java source files from ontologies

We can thus use specific ontologies like RSS, FOAF and Dublin Core (DC) directly from Java

Clerezza

http://clerezza.apache.org/

A service platform based on OSGi (Open Services Gateway initiative, open specifications that enable the modular assembly of software built with Java technology, http://www.osgi.org/)

Functionality for managing semantically linked data accessible through RESTful Web Services and in a secured way

Tools to manipulate RDF data, create RESTful Web Services and Renderlets using Scala Server Pages

A renderlet is a special container that can receive every object in Pimcore

Pimcore is an open source web content management platform for creating and managing web applications and digital presences implemented in PHP and MySQL

Scala Server Pages are like JSPs but for Scala instead of Java

Scala is an object-functional programming and scripting language for general software applications

RDF triples are stored via Clerezza’s Smart Content Binding (SCB)

A Java implementation of the graph data model and functionalities to operate on it

A service interface to access multiple named graphs

Can use various providers to manage RDF graphs in a technology specific manner (using e.g., Jena or Sesame)

Provides adaptors that allow an application to use various APIs (including the Jena API) to process RDF graphs

A serialization and a parsing service to convert a graph into a certain representation and vice versa

JSON-LD (JSON for Linked Data, http://json-ld.org/) is a method of transporting Linked Data using JSON

Being standardized by the W3C RDF Working Group (http://www.w3.org/TR/2013/PR-json-ld-20131105/, Nov. 2013)

Linked Data is a way of publishing structured data so that it can be interlinked and more useful

Builds upon standard Web technologies (HTTP, RDF, URIs, …)

Extends them to share info in a computer-readable way so that data from different sources can be connected and queried

JSON-LD aims to require as little effort as possible from developers to transform their existing JSON to JSON-LD

Designed around the concept of a “context” to provide additional mappings from JSON to an RDF-like model

See the playground at http://json-ld.org/playground/
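The role of the context can be illustrated with plain JSON handling. This sketch is not the JSON-LD expansion algorithm: it assumes a flat document whose context maps each term directly to an IRI, and the example.com data and the helper expand_keys are made up for illustration.

```python
import json

doc = json.loads("""
{
  "@context": {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage"
  },
  "@id": "http://example.com/people/alice",
  "name": "Alice",
  "homepage": "http://example.com/alice/"
}
""")


def expand_keys(document):
    """Replace each short key with the IRI the context maps it to."""
    context = document.get("@context", {})
    return {context.get(key, key): value
            for key, value in document.items()
            if key != "@context"}


print(json.dumps(expand_keys(doc), indent=2))
```

Existing JSON stays as it is; only the @context (and optionally @id) is added, which is the low-effort path the slide describes.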

checkrdfa

http://check.rdfa.info/

Checks a web page for RDFa and displays the found data

Validates our data against the published recommendations from major consumers/users of RDFa data

I don’t think this works anymore

Microformats

See microformats.org at http://microformats.org/

Primer: http://www.digital-web.com/articles/microformats_primer/

A microformat (abbreviated μF) is a web-based approach to semantic markup

Re-uses existing HTML/XHTML tags to convey metadata and other attributes in web pages and other contexts that support (X)HTML (e.g., RSS)

Lets software process info intended for end-users (e.g., contact info, geographic coords, calendar events) automatically

Established microformats (e.g., hCard) are published on the web at least as often as alternatives (e.g., schema and RDFa)

hCard is a microformat version of vCard

Mozilla Operator add-on

https://addons.mozilla.org/en-US/firefox/addon/operator/

Leverages microformats and other semantic data available on many web pages to provide new ways to interact with web services

After adding it, choose View > Toolbars > Customize and drag the Operator icon from the palette to the add-on bar

Then, at the top of the Mozilla window, choose View > Sidebar and click Operator

[Screenshot: the Operator toolbar, and the Operator icon with its drop-down menu]

Add various items of info to various services

Here add an event to my Google Calendar

Get the same options in the toolbar just above the page

XOXO (eXtensible Open XHTML Outlines) is an XML microformat for outlines built on top of XHTML

http://microformats.org/wiki/xoxo

The spec defines an outline as a hierarchical, ordered list of arbitrary elements

It's fairly open, suitable for many types of list data

The XML elements in an XOXO document

<ol class="xoxo">

<ul class="xoxo">

These, with class attribute with value xoxo, are the root elements of XOXO, used as containers for outline items

May have attribute compact="compact" to indicate whether child items are visible 

<li> is an item in the outline

May contain an ol or ul element to contain child items, which themselves may do so as well

<a> is a hyperlink for an item in the outline (and may contain much info—see below)

<dl> may contain any number of arbitrary properties using dt (definition term) and dd (definition description) elements

Example

<ol class='xoxo'>

<li>item 1

<dl>

<dt>description</dt>

<dd>This item represents the main point we're trying to make.</dd>

</dl>

<ol>

<li>subpoint a</li>

<li>subpoint b</li>

</ol>

</li>

</ol>
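Because XOXO is just nested lists, producing it from a data structure is straightforward. A minimal sketch, assuming an outline held as (text, children) pairs; the to_xoxo helper is invented here and leaves out dl properties and links.

```python
from html import escape


def to_xoxo(items, root=True):
    """items: list of (text, children) pairs; children has the same shape."""
    attrs = ' class="xoxo"' if root else ""  # only the root carries the class
    parts = [f"<ol{attrs}>"]
    for text, children in items:
        parts.append(f"<li>{escape(text)}")
        if children:
            parts.append(to_xoxo(children, root=False))
        parts.append("</li>")
    parts.append("</ol>")
    return "".join(parts)


outline = [("item 1", [("subpoint a", []), ("subpoint b", [])])]
print(to_xoxo(outline))
# <ol class="xoxo"><li>item 1<ol><li>subpoint a</li><li>subpoint b</li></ol></li></ol>
```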

Special properties: text, url, title, type, and rel (short for relationship)

Example

<ol class='xoxo'>

<li><a href="http://example.com/more.xoxo"

title="title of item 1"

type="text/xml"

rel="help">item 1</a>

<!-- note how the "text" property is just the contents of the <a> -->

<dl>

<dt>description</dt>

<dd>This item represents the main point we're trying to make.</dd>

</dl>

</li>

</ol>

Some Open Microformat Formats

A microformat (singular) is one collection of properties, each with an intended kind of value

Only hCard and hCalendar have been ratified so far

hCard is for publishing people, companies, organizations on the web, using a 1:1 representation of vCard properties and values in HTML—e.g.,

<div class="vcard">

<a class="url fn org" href="http://microformats.org/">

microformats.org

</a>

</div>

hCalendar is a format for publishing events on the web, using a 1:1 representation of iCalendar (RFC2445) VEVENT properties and values in HTML—e.g.,

<span class="vevent">

<span class="summary">The microformats.org site was launched</span>

on <span class="dtstart">2005-06-20</span>

at the Supernova Conference

in <span class="location">San Francisco, CA, USA</span>.

</span>
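Since microformats rely on class names, a consumer only has to collect the text of elements carrying known classes. A rough sketch for the event above, assuming every element is properly closed and values sit in element text (real hCalendar parsers also read values from attributes such as title); VEventParser is a name made up for this example.

```python
from html.parser import HTMLParser

VEVENT_PROPERTIES = {"summary", "dtstart", "location"}


class VEventParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []  # property name (or None) for each open element
        self.event = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in VEVENT_PROPERTIES), None)
        self.stack.append(match)
        if match:
            self.event.setdefault(match, "")

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to the innermost enclosing property element, if any.
        for name in reversed(self.stack):
            if name:
                self.event[name] += data
                break


markup = """
<span class="vevent">
 <span class="summary">The microformats.org site was launched</span>
 on <span class="dtstart">2005-06-20</span>
 at the Supernova Conference
 in <span class="location">San Francisco, CA, USA</span>.
</span>
"""
parser = VEventParser()
parser.feed(markup)
print(parser.event)
```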

XFN is a lightweight method of annotating links to indicate a personal relationship with the person responsible for the linked resource

Strengthens existing links in a way that’s both machine-readable and human-comprehensible

It and FOAF serve different purposes

Relationship category: XFN values

friendship (at most one): friend acquaintance contact

physical: met

professional: co-worker colleague

geographical (at most one): co-resident neighbor

family (at most one): child parent sibling spouse kin

romantic: muse crush date sweetheart

identity: me
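The "at most one" rule in the table can be checked mechanically. A small sketch based only on the table above; check_xfn is a hypothetical helper, and values outside the table are reported as non-XFN rather than treated as errors, since rel may carry other link types.

```python
# Categories from which at most one value may be used on a link.
EXCLUSIVE_GROUPS = {
    "friendship": {"friend", "acquaintance", "contact"},
    "geographical": {"co-resident", "neighbor"},
    "family": {"child", "parent", "sibling", "spouse", "kin"},
}
OTHER_VALUES = {"met", "co-worker", "colleague",
                "muse", "crush", "date", "sweetheart", "me"}


def check_xfn(rel):
    """Return a list of notes about the rel value of a link."""
    values = rel.split()
    known = OTHER_VALUES.union(*EXCLUSIVE_GROUPS.values())
    notes = [f"not an XFN value: {v}" for v in values if v not in known]
    for category, group in EXCLUSIVE_GROUPS.items():
        chosen = [v for v in values if v in group]
        if len(chosen) > 1:
            notes.append(f"{category}: at most one of {' '.join(chosen)}")
    return notes


print(check_xfn("friend met colleague"))  # []
print(check_xfn("friend acquaintance"))
```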

hReview is a format suitable for embedding reviews (of products, services, businesses, events, etc.) in HTML, XHTML, Atom, RSS, and arbitrary XML

Example (See the rendering on the next slide)

<div class="hreview">

<span><span class="rating">5</span> out of 5 stars</span>

<h4 class="summary">Crepes on Cole is awesome</h4>

<span class="reviewer vcard">Reviewer:

<span class="fn">Tantek</span> -

<abbr class="dtreviewed" title="2005-04-18">April 18, 2005</abbr>

</span>

<div class="description item vcard"><p>

<span class="fn org">Crepes on Cole</span>

is one of the best little creperies

in <span class="adr">

<span class="locality">San Francisco</span>

</span>.

Excellent food and service. Plenty of tables in a variety of sizes

for parties large and small. Window seating makes for excellent

people watching to/from the N-Judah which stops right outside.

I've had many fun social gatherings here, as well as gotten

plenty of work done thanks to neighborhood WiFi.

</p></div>

<p>Visit date: <span>April 2005</span></p>

<p>Food eaten: <span>Florentine crepe</span></p>

</div>

Apache Any23 (Anything to Triples)

https://any23.apache.org/

A library, a web service, and a command line tool that extracts structured data in RDF format from a variety of Web documents

Supported input formats:

RDF/XML, Turtle, Notation 3

RDFa with RDFa 1.1 prefix mechanism

Microformats: Adr, Geo, hCalendar, hCard, hListing, hResume, hReview, License, XFN and Species

HTML5 Microdata: (such as Schema.org, see below)

CSV with separator autodetection

For a detailed description of available extractors, see https://any23.apache.org/extractors.html

Apache Any23 is written in Java and used in major Web of Data applications such as

sindice.com (Semantic Web index, http://sindice.com/, collects Web data in many ways and offers search and querying across this data), and

sig.ma (semantic info mashup, http://sig.ma/)

Used in various ways, including

As a library in Java applications that consume structured data from the Web

As a command-line tool for extracting and converting between the supported formats

Online service: http://any23.org/

Microdata

Microdata is a WHATWG HTML specification used to nest semantics within existing content on web pages.

Web Hypertext Application Technology Working Group (WHATWG): a community interested in evolving HTML and related technologies

Microdata aims to annotate HTML elements with machine-readable tags in a simpler way than similar approaches

E.g., those using RDFa and Microformats

A web developer can design a custom vocabulary or use vocabularies available on the web (see data-vocabulary.org below)

Microdata Global Attributes (used as HTML attributes)

itemscope creates the Item and indicates that descendants of this element contain info about it

itemtype is a valid URL of a vocabulary that describes the item and its properties

itemid indicates a unique identifier of the item

itemprop indicates that its containing tag holds the value of the specified item property

The property’s name and value context are described by the item’s vocabulary

Property values usually consist of string values but can also use URLs (e.g., using the a element and its href attribute)

itemref: Properties that aren’t descendants of the element with the itemscope attribute can be associated with the item using this attribute

Provides a space-separated list of the ids (ordinary id attribute values, not itemids) of elements elsewhere in the document that hold additional properties
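To make the roles of itemscope, itemtype, itemid, and itemprop concrete, the sketch below reads a single flat item from a small HTML fragment using only the Python standard library. It is a simplified illustration, not the extraction algorithm from the Microdata specification: nested items and itemref are not handled, and the names `MicrodataSketch`, `extract_item`, `VOID`, and the sample person are hypothetical.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they are never pushed on the element stack.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class MicrodataSketch(HTMLParser):
    """Hypothetical helper: reads the first itemscope element as one flat item."""

    def __init__(self):
        super().__init__()
        self.item = None
        self.stack = []  # per open element inside the item: itemprop name or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if not self.stack:
            # Outside an item: only the first itemscope element is considered.
            if self.item is None and "itemscope" in attrs and tag not in VOID:
                self.item = {
                    "type": attrs.get("itemtype"),
                    "id": attrs.get("itemid"),
                    "properties": {},
                }
                self.stack.append(None)
            return
        name = attrs.get("itemprop")
        if name:
            value = None
            if tag in ("a", "link") and "href" in attrs:
                value = attrs["href"]              # URL-valued property
            elif tag == "meta":
                value = attrs.get("content", "")   # value carried in an attribute
            if value is not None:
                self.item["properties"][name] = value
                name = None                        # element text is not the value
        if tag not in VOID:
            self.stack.append(name)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        props = self.item["properties"] if self.item else {}
        for name in self.stack:
            if name:
                props[name] = (props.get(name, "") + " " + text).strip()


def extract_item(html):
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.item


# Hypothetical markup using the Schema.org Person type.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <a itemprop="url" href="http://example.org/jane">home page</a>
  <meta itemprop="jobTitle" content="Professor">
</div>
"""
item = extract_item(SAMPLE)
print(item)
```

Note how the value of name comes from the element text, while url and jobTitle come from the href and content attributes respectively.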

Schema.org

See http://getschema.org/index.php/Main_Page

An initiative launched in June 2011 by Bing, Google and Yahoo! to provide a vocabulary that webmasters can use to mark up web content in ways recognized by major search providers

Data-Vocabulary.org: http://www.data-vocabulary.org/

Has links to the documentation on Schema.org’s vocabulary

The Schema.org vocabulary can be used with either Microdata or RDFa 1.1 Lite syntax

Has types for Event, Organization, Person, Product, Review, AggregateRating, Offer and hundreds of others

For the RDFS file (as an XML document) that defines this vocabulary, see http://rdf.data-vocabulary.org/rdf.xml

Other markup vocabularies are provided by Schema.org schemas

Typically, applications need to extract semantic annotations from the web pages and use them to perform reasoning

RDFa Extractor (RDFa2RDF Service)

http://getschema.org/rdfaliteextractor/about

A REST Web Service to extract RDF data from RDFa annotations

Provides the semantic information as N-Triples, Notation 3 (N3), or JSON

Powered by node.js and uses the jsdom library

node.js: http://nodejs.org/

jsdom: https://github.com/tmpvar/jsdom
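For readers unfamiliar with N-Triples, the sketch below shows roughly what such output looks like: one subject, predicate, object statement per line, each ending with a period. It does not reproduce the extractor service's own code; the function names `escape_literal` and `to_ntriples` and the sample subject and values are hypothetical, and only plain string literals and IRIs are covered.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(subject, rdf_type, properties):
    """Serialize one typed resource.

    properties is a list of (predicate_iri, value, is_iri) tuples; is_iri says
    whether the value is written as an IRI (<...>) or as a string literal.
    """
    lines = [f"<{subject}> <{RDF_TYPE}> <{rdf_type}> ."]
    for predicate, value, is_iri in properties:
        obj = f"<{value}>" if is_iri else f'"{escape_literal(value)}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


# Hypothetical data, as an extractor might produce it from annotated HTML.
output = to_ntriples(
    "http://example.org/jane",
    "http://schema.org/Person",
    [
        ("http://schema.org/name", 'Jane "JD" Doe', False),
        ("http://schema.org/url", "http://example.org/jane/home", True),
    ],
)
print(output)
```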

Microdata Extractor (Microdata2RDF Service)

http://getschema.org/microdataextractor/about

Like the RDFa Extractor but has Microdata, not RDFa, as input

Conforms with the Microdata2RDF specification at W3C

But may use a different generation algorithm
