Virtuoso - Linked Data2_6118738

Date post: 08-Apr-2016
Upload: fabianlhz

Deploying Linked Data - TOC

Section Contents:

Deploying Linked Data - Part 2
Deploying Linked Data using Virtuoso
The Virtuoso Rule-Based URL Rewriter
Conductor UI for the URL Rewriter
Virtual Domains (Hosts) & Directories
"Nice" URLs vs. "Long" URLs
Rule Processing Mechanics
Enabling URL Rewriting via the Virtuoso Conductor UI
Northwind Demonstration Database
Configuring Rewrite Rules using Conductor
Dissection of Northwind Rewrite Rules Configured using Conductor
Regex Rule for RDF Requests
Constructing the Destination Path Format
Data Flow in Conductor-Defined Northwind RDF Regex Rule
Regex Rule for HTML Requests
Enabling URL Rewriting via Virtuoso PL
Exporting Rewrite Rules from Conductor
Defining Virtual Hosts in Virtuoso PL
URL Rewriting Configuration API

Virtuoso - Linked Data Deployment Guide http://virtuoso.openlinksw.com/Whitepapers/html/vdld_html/VirtDeplo...

1 di 56 29/11/2009 20.43

Creating Rewrite Rules
URLREWRITE_CREATE_REGEX_RULE
Dissection of Northwind Rewrite Rules Configured using Virtuoso PL
Data Flow in Virtuoso/PL-Defined Northwind RDF Regex Rule
Northwind URL Rewriting Verification Using cURL
Browsing & Exploring the Northwind RDF View
Interacting with Linked Data via ODE
Interacting with Linked Data via iSPARQL
Transparent Content Negotiation
HTTP/1.1 Content Negotiation
Transparent Content Negotiation
Deficiencies of HTTP/1.1 Server-Driven Negotiation
Variant Selection By User Agent
Variant Selection By Server
Variant Selection By End-User
Transparent Content Negotiation in Virtuoso HTTP Server
Describing Resource Variants
HTTP_VARIANT_MAP Table
Configuration using Virtuoso/PL
Configuration using Conductor UI
Variant Selection Algorithm
Transparent Content Negotiation Examples
Simple TCN with Static Content
Northwind RDF View
DBpedia
Simplifying Deployment with RDFa
No Content Negotiation or 303 Redirects
Generating RDFa Dynamically Using Description.vsp
RDFa Output From Non-RDF Data Sources
Sample RDFa Output From Description.vsp

The preceding sections described a generic approach to

deploying Linked Data into the existing Web. We now turn

our attention to Virtuoso, to describe its solution for Linked

Data deployment. In fact, Virtuoso's solution is to implement

the generic approach outlined in the prior sections, using the

twin pillars of content negotiation and URL rewriting.

Virtuoso provides a URL rewriter that can be enabled for

URLs matching specified patterns. Coupled with

customizable HTTP response headers and response codes,

Linked Data Web server administrators can configure highly

flexible rules for driving content negotiation and URL

rewriting. The key elements of the URL rewriter are:

Rewrite rule

    Each rule describes how to parse a single source URL, and how to compose the URL of the page ultimately returned in the "Location:" response headers.

    Every rewrite rule is uniquely identified internally (using IRIs).

    Two types of rule are supported, based on the syntax used to describe the source URL pattern matching: sprintf-based and regex-based.

Rewrite rule list

    A named ordered-list of rewrite rules or rule lists, where rules of the list are processed from top to bottom or in line with processing pipeline precedence instructions.

Configuration API

    Defines functions for creating, dropping, and enumerating rules and rule lists.

Virtual hosts and virtual paths

    URL rewriting is enabled by associating a rewrite rules list with a virtual directory.

Each of these elements is described in more detail below,

although complete descriptions of the features or functions in

question are not given. The intention here is to provide an

overview of Virtuoso's URL rewriting capabilities and their

application to deploying Linked Data. Please refer to the

Virtuoso Reference Documentation for full details.

Virtuoso is a full-blown HTTP server in its own right. The

HTTP server functionality co-exists with the product core (i.e.

DBMS Engine, Web Services Platform, WebDAV filesystem,

and other components of the Universal Server). As a result, it

has the ability to multi-home Web domains within a single

instance across a variety of domain name and port

combinations. In addition, it also enables the creation of

multiple virtual directories per domain.

In addition to the basic functionality described above, Virtuoso

lets you associate URL rewrite rules with the virtual directories

associated with a particular hosted Web domain.

In all cases, Virtuoso enables you to configure virtual

domains, virtual directories and URL rewrite rules for one or

more virtual directories, via the (X)HTML-based Conductor

Admin User Interface or a collection of Virtuoso Stored

Procedure Language (PL)-based APIs.

A Virtuoso virtual directory maps a logical path to a physical

directory in your file system or WebDAV repository. This

mechanism allows physical locations to be hidden or simply

reorganised. Virtual directory definitions are held in the

system table DB.DBA.HTTP_PATH. Virtual directories can be

administered in three basic ways:

Using the Visual Administration Interface via a Web

browser;

Using the functions vhost_define() and

vhost_remove(); and

Using SQL statements to directly update the

HTTP_PATH system table.

Although we are approaching the URL Rewriter from the

perspective of deploying Linked Data, the rewriter was

developed with additional objectives in mind. These in turn

have influenced the naming of some of the formal argument

names in the Configuration API function prototypes. In the

following sections, "long" URLs are those containing a query

string with named parameters; "nice" (also known as

"source") URLs have data encoded in some other format. The

primary goal of the Rewriter was to accept a nice URL from

an application and convert this into a long URL, which then

identifies the page that should actually be retrieved.

When an HTTP request is accepted by the Virtuoso HTTP

server, the received nice URL is passed to an internal path

translation function. This function takes the nice URL and, if

the current virtual directory has a url_rewrite option set to

an existing rule list name, tries to match the corresponding

rule lists and rules; that is, the function performs a recursive

traversal of any rule list associated with the virtual directory.

For every rule in the rule list, the same logic is applied (only

the logic for regex-based rules is described; that for

sprintf-based rules is very similar):

1. The input for the rule is the resource URL as received from the HTTP header, i.e., the portion of the URL from the first '/' after the host:port fields to the end of the URL.

2. The input is normalized.

3. The input is matched against the rule's regex. If the match fails, the rule is not applied and the next rule is tried. If the match succeeds, the result is a vector of values.

4. If the URL contains a query string, the names and values of the parameters are decoded by split_and_decode().

5. The names and values of any parameters in the request body are also decoded.

6. The destination URL is composed. The value of each parameter in the destination URL is taken from (in order of priority):

    the value of a parameter in the match result;

    the value of a named parameter in the query string of the input nice URL;

    if the original request was submitted by the POST method, the value of a named parameter in the body of the POST request; or

    if a parameter value cannot be derived from one of these sources, the rule is not applied and the next rule is tried.
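The matching logic described above can be sketched as a small Python model. This is only an illustrative approximation, not Virtuoso's internal path translation function: the Rule class, the apply_rules function, the {name} placeholder syntax and the two sample rules are all hypothetical names invented for this sketch, and the normalization step is omitted.

```python
import re
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit


@dataclass
class Rule:
    """Hypothetical stand-in for a regex-based rewrite rule."""
    nice_match: str      # regex applied to the request path
    nice_params: list    # names for the regex groups, in order
    target_compose: str  # template with {name} placeholders (sketch-only syntax)
    target_params: list  # parameter names the template needs


def apply_rules(rules, request_url, post_params=None):
    """Return the destination composed by the first applicable rule, or None."""
    parts = urlsplit(request_url)
    query_params = dict(parse_qsl(parts.query))
    post_params = post_params or {}

    for rule in rules:
        match = re.match(rule.nice_match, parts.path)
        if match is None:
            continue  # regex did not match: try the next rule
        matched = dict(zip(rule.nice_params, match.groups()))

        values = {}
        for name in rule.target_params:
            # Priority order: match result, query string, POST body.
            for source in (matched, query_params, post_params):
                if source.get(name) is not None:
                    values[name] = source[name]
                    break
            else:
                break  # no source supplies this value: abandon the rule
        else:
            return rule.target_compose.format(**values)
    return None


# Hypothetical rules, for illustration only.
rules = [
    Rule(r"(/orders/[0-9]+)", ["path"], "/show?p={path}&lang={lang}", ["path", "lang"]),
    Rule(r"(/[^#]*)", ["path"], "/describe?p={path}", ["path"]),
]
print(apply_rules(rules, "http://example.com/orders/7?lang=en"))  # /show?p=/orders/7&lang=en
print(apply_rules(rules, "http://example.com/orders/7"))          # /describe?p=/orders/7
```

In the second call the first rule matches the path but cannot obtain a value for "lang" from any source, so it is skipped and the next rule in the list is tried.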

Note: The path translation function described above is internal

to the Web server, so its signature is not appropriate for

Virtuoso/PL calls and thus is not published. Virtuoso/PL

developers can harness the same functionality using the

DB.DBA.URLREWRITE_APPLY API call.

The URL rewriting examples which follow are taken from the

Virtuoso Northwind demonstration database, which is

included in the Demo VAD (Virtuoso Application Distribution)

archive.

To check which version of the Demo VAD is installed, or to

upgrade it, refer to the Conductor's 'VAD Packages' screen,

reachable through the 'System Admin' > 'Packages' menu items.

The latest VADs for the closed source releases of Virtuoso can be

downloaded from the downloads area of the OpenLink website.

Select either the 'DBMS (WebDAV) Hosted' or 'File System

Hosted' product format from the 'Distributed

Collaborative Applications' section, depending on whether you want

the Virtuoso application to be run from WebDAV or native

filesystem storage. VADs for Virtuoso Open Source edition (VOS)

are available for download from the VOS Wiki.

Northwind Demonstration Database

The Virtuoso Northwind database (contained in the "Demo"

catalog) is very similar to the Northwind example database

available for SQL Server. Its schema comprises commonly

understood SQL tables that include: Customers, Orders,

Employees, Products, Product Categories, Shippers,

Countries, Provinces, etc.

Northwind is installed with a preconfigured RDF view and a

set of preconfigured URL rewrite rules that collectively expose

RDF based entity graphs and URLs of (X)HTML web pages

that describe the back-end relational data.

An RDF View over relational data is a named collection

(graph) of RDF records (triples) derived from an RDBMS-

to-RDF source data map exposed via a Virtuoso Quad Store.

The process of declaring RDF Views over RDBMS data using

the Virtuoso Meta-schema Language is described in detail in

our RDF Views of SQL white paper.

To view the Northwind entity graph in RDF format, starting

with the entity "ALFKI", simply place the following document

URL into the OpenLink Data Explorer :

http://demo.openlinksw.com/Northwind/Customer/ALFKI

Alternatively, you can view an (X)HTML based description of

the entity "ALFKI" by pointing your Web browser to the same

URL. (The details of these URLs will be explained shortly; for

now they are presented purely as pointers to illustrate

example data available from Northwind.)

Configuring Rewrite Rules using Conductor

The steps for configuring URL Rewrite rules via the Virtuoso

Conductor are as follows:

1. Go to the "Web Application Server" > "Virtual Domains & Directories" tabs.

Figure: Conductor's Hosted Domains and Virtual Directories screen

2. Pick the domain that contains the virtual directories to which the rules are to be applied (in this case the default was taken).

Figure: Accessing the URL rewrite rules for the Northwind demo database

3. Click on the "URL-rewrite" link to create, delete, or edit a rule as shown below.

4. Create a rule for HTML based representations of resource description requests.

Figure: Northwind URL rewrite rule for HTML requests

5. Create a rule for N3 or RDF/XML based representations or resource descriptions.

Figure: Northwind URL rewrite rule for RDF requests

6. Save your rules, exit the Conductor, and test your rules with "cURL" or any other HTTP-based user agent.

Dissection of Northwind Rewrite Rules Configured using Conductor

The screenshots above show the default Northwind rewrite

rules. Let's analyze what they are doing.

Regex Rule for RDF Requests

The regex rule for handling RDF/XML or N3 representation requests specifies a 'Request Path Pattern' of (/[^#]*). Recall that the input path is the portion of the input URL from the first '/' after the host:port fields to the end of the normalized URL. So, given a request for http://demo.openlinksw.com/Northwind/Customer/ALFKI, the request path pattern would match /Northwind/Customer/ALFKI.

Parentheses in the pattern collect the results of the pattern matching into parameters. Each successive pair of parentheses denotes a parameter, referred to elsewhere in the rewrite rule as $U1, $U2, $U3, ..., or $s1, $s2, $s3, ..., etc. These parameters can then be used to substitute a part of the input path that was matched into the new URL being composed. The parameter markers $U1 and $s1 (likewise $U2 and $s2 etc.) identify the same pattern segment in the request path pattern. The only difference between them is how the matched text is encoded when it is inserted into the new URL. The 's' format specifier inserts the matched text as is, whereas the 'U' format specifier causes the inserted text to be URL encoded.
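The difference between the two marker types can be illustrated with a short Python sketch. The function name substitute_markers is hypothetical, and Python's urllib.parse.quote is used here as an assumed stand-in for the rewriter's URL encoding; the exact set of characters Virtuoso escapes may differ.

```python
import re
from urllib.parse import quote


def substitute_markers(template, groups):
    """Replace $sN with the raw Nth match and $UN with its URL-encoded form."""
    def repl(m):
        kind, index = m.group(1), int(m.group(2))
        value = groups[index - 1]  # markers are numbered from 1
        # Assumption: everything outside the unreserved set is percent-encoded.
        return quote(value, safe="") if kind == "U" else value
    return re.sub(r"\$([sU])(\d+)", repl, template)


# The request path pattern from the Northwind RDF rule.
match = re.match(r"(/[^#]*)", "/Northwind/Customer/ALFKI")
print(substitute_markers("raw=$s1 encoded=$U1", match.groups()))
# raw=/Northwind/Customer/ALFKI encoded=%2FNorthwind%2FCustomer%2FALFKI
```

Both markers refer to the same captured segment; only the encoding applied at insertion time differs.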

Content types specified in the request's Accept header and

matched by the 'Accept Header Request Pattern' are

available for substitution into the rewritten URL through the

$accept variable.

Rather than hardcoding host names and ports, the rules are

made more generic by using the convenience macro

URIQADefaultHost. Every occurrence of

^{URIQADefaultHost}^ will be substituted with the value

of the DefaultHost parameter defined in the URIQA section

of the Virtuoso configuration file, virtuoso.ini.
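As a rough illustration of the macro substitution, the following Python sketch reads DefaultHost from the URIQA section of some ini-style text and expands the macro in a template. The function name expand_default_host and the sample host value are hypothetical, and Python's configparser is only assumed to be adequate for this small fragment; a real virtuoso.ini may need different handling.

```python
import configparser


def expand_default_host(template, ini_text):
    """Substitute every ^{URIQADefaultHost}^ with [URIQA] DefaultHost."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    host = parser["URIQA"]["DefaultHost"]
    return template.replace("^{URIQADefaultHost}^", host)


# Hypothetical configuration fragment, for illustration only.
ini_text = "[URIQA]\nDefaultHost = www.example.com:8890\n"
print(expand_default_host("/about/html/http://^{URIQADefaultHost}^/Northwind", ini_text))
# /about/html/http://www.example.com:8890/Northwind
```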

"DefaultHost" is the "canonical" server name that is used to

identify the service. It should be either a server host name

including domain (i.e. an FQDN), or an IP address in

standard notation. If Virtuoso's default HTTP port is not equal

to 80 then the port should also be included, e.g.

"www.example.com:8890".

Constructing the Destination Path Format

The parameter markers, variables and macros just described

provide the building blocks for constructing the 'Destination

Path Format' which serves as a template for the rewritten

URL. It must be stressed that it is not necessary to

URL-encode the Destination Path Format by hand. You

need only write the underlying CONSTRUCT or DESCRIBE

SPARQL query. When defining a new Destination Path

Format, click on the SPARQL button to enable a text box

(shown below) into which you can enter the base SPARQL

query which will describe the entity being dereferenced. On

clicking the 'Format' button to return, the SPARQL query will

be expanded into a full query string, including a result-set

format-specifier, and URL-encoded automatically. For

example, the base query:

DESCRIBE <http://^{URIQADefaultHost}^$U1#this> <http

becomes:

/sparql?query=DESCRIBE+%3Chttp%3A//^{URIQADefaultHos
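The kind of expansion performed by the 'Format' button can be approximated in Python as follows. This is a sketch under stated assumptions, not Conductor's actual code: the function name to_destination_path is hypothetical, the base query is a simplified single-entity version (the listing above is truncated), and the use of $accept in the appended format specifier is assumed from the description of the $accept variable.

```python
from urllib.parse import quote_plus


def to_destination_path(base_query, format_marker="$accept"):
    """URL-encode a base SPARQL query into a /sparql query string.

    The characters '/', '^', '{', '}' and '$' are left untouched so that
    the ^{URIQADefaultHost}^ macro and the $U1-style markers survive.
    """
    encoded = quote_plus(base_query, safe="/^{}$")
    return "/sparql?query=" + encoded + "&format=" + format_marker


# Simplified base query (assumption: describes one entity only).
base = "DESCRIBE <http://^{URIQADefaultHost}^$U1#this>"
print(to_destination_path(base))
# /sparql?query=DESCRIBE+%3Chttp%3A//^{URIQADefaultHost}^$U1%23this%3E&format=$accept
```

Spaces become '+', and the reserved characters '<', ':', '#' and '>' are percent-encoded, which matches the shape of the expanded string shown above.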

The pre-configured DESCRIBE query for Northwind describes two entities:

http://^{URIQADefaultHost}^/Northwind/Customer/ALFKI#this

and

http://^{URIQADefaultHost}^/Northwind/Customer/ALFKI

http://^{URIQADefaultHost}^/Northwind/Customer/ALFKI identifies a document (an entity of type foaf:Document) that has the entity http://^{URIQADefaultHost}^/Northwind/Customer/ALFKI#this as its foaf:primaryTopic property value. This relationship is the key to using the description of the document (a report) about "ALFKI" to expose the deeper entity graph that describes the entity "ALFKI#this".

Defining the SPARQL query for the Northwind

RDF requests

Data Flow in Conductor-Defined Northwind RDF Regex Rule

The process of rewriting a request for an RDF representation

of Northwind customer ALFKI, through the corresponding

regex rule, is depicted below as a data flow diagram. The

arcs connecting similarly-colored items attempt to illustrate

how portions of the input request are matched and

substituted into the rewritten request.

Breakdown of the URL rewriting process for

Northwind RDF requests

Regex Rule for HTML Requests

The Northwind regex rule for HTML requests functions in a

similar way to the regex rule for RDF requests. That is, the

mechanisms for pattern matching and parameter substitution

are the same. The only differences are the content types

matched and the target URL.

In this case, the destination path format is:

/about/html/http://^{URIQADefaultHost}^$s1

Here, the path /about/html/ redirects the client to the

Virtuoso Sponger proxy interface. The Sponger itself is a

highly customizable RDFizer. Virtuoso reserves two paths for

the proxy service, '/about/rdf/' and '/about/html/'.

(Note: These proxy paths have since been augmented to

support a richer slash URI scheme for identifying format

variants. Please refer to Appendix B for more details.) The

web service takes the target URL following the proxy path

and either returns the content "as is" or tries to transform it to

RDF. The RDF graph derived from the sponging process is

then rendered in one of the RDF serialization formats

(RDF/XML or N3) or HTML depending on whether the request

specified /about/rdf/ or /about/html/. Thus, the proxy service

can be used as a middleware for enabling RDF based

exploration of non-RDF sources using dedicated RDF

browsers or standard (X)HTML browsers.

The mechanism through which Virtuoso composes an HTML

rendering of RDF data (whether this be a native RDF

description, or one extracted by the Sponger) is via the

"description.vsp" rendering template, a specialized

Virtuoso Server Page specifically aimed at RDF-model-based

resource description. The "description.vsp" template is described in more detail in Appendix A. A usage example covering the description of the entity <http://demo.openlinksw.com/Northwind/Customer/ALFKI#this> is shown below.

description.vsp HTML rendering of Customer

entity ALFKI

While the Conductor UI provides the easiest way to set up

URL rewriting, on occasion it may be preferable to configure

URL rewriting programmatically using Virtuoso PL.

Exporting Rewrite Rules from Conductor

The Conductor lets you export configured rules as Virtuoso

PL, making it easier to use them on another system, for

instance. The exported script recreates the rewrite rules using

Virtuoso's URL Rewriting Configuration API.

Conductor's 'Export' button for exporting URL rewrite rules

The code listing below shows the exported Northwind rules.

Describing the Configuration API and this exported rules file

forms the focus of this section.

DB.DBA.VHOST_REMOVE ( lhost=>'*ini*', vhost=>'*ini*'

DB.DBA.VHOST_DEFINE ( lhost=>'*ini*', vhost=>'*ini*'

ppath=>'/DAV/home/demo/', is_dav=>1, vsp_user=>'db

vector ('url_rewrite', 'demo_nw_rule_list1'), is_d

DB.DBA.URLREWRITE_CREATE_RULELIST ( 'demo_nw_rule_li

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'demo_nw_rule1

'/about/html/http://^{URIQADefaultHost}^%s', vecto

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'demo_nw_rule2

'/sparql?query=DESCRIBE+%%3Chttp%%3A//^{URIQADefau

vector ('path', 'path', '*accept*'), NULL, '(text/

Exporting Rewrite Rules from a Script

Use the function DB.DBA.URLREWRITE_DUMP_RULELIST_SQL to export rule lists programmatically. For example, from isql, you can generate the listing shown above by executing:

SELECT DB.DBA.URLREWRITE_DUMP_RULELIST_SQL('demo_nw_rule_list1');

Defining Virtual Hosts in Virtuoso PL

As can be seen above, the vhost_define() API call is used

to define virtual hosts and virtual paths hosted by the

Virtuoso HTTP server. URL rewriting is enabled through this

function's opts parameter. opts is of type ANY, e.g. a vector

of field-value pairs. Numerous fields are recognized for

controlling different options. The field name url_rewrite controls URL rewriting. The corresponding field value is the IRI of a rule list to apply.

URL Rewriting Configuration API

Virtuoso includes the following functions for managing URL

rewrite rules and rule lists. The names are self-explanatory.

DB.DBA.URLREWRITE_DROP_RULE - Deletes a rewrite

rule.

DB.DBA.URLREWRITE_CREATE_SPRINTF_RULE -

Creates a rewrite rule which uses sprintf-based pattern

matching.

DB.DBA.URLREWRITE_CREATE_REGEX_RULE - Creates

a rewrite rule which uses regular expression

(regex)-based pattern matching.

DB.DBA.URLREWRITE_DROP_RULELIST - Deletes a

rewrite rule list.

DB.DBA.URLREWRITE_CREATE_RULELIST - Creates a

rewrite rule list.

DB.DBA.URLREWRITE_ENUMERATE_RULES - Lists all

the rules whose IRIs match the specified 'SQL like'

pattern.

DB.DBA.URLREWRITE_ENUMERATE_RULELISTS - Lists

all the rule lists whose IRIs match the specified 'SQL

like' pattern.

Creating Rewrite Rules

Rewrite rules take two forms: sprintf-based or regex-based.

When used for nice URL to long URL conversion, the only

difference between them is the syntax of format strings. The

reverse long to nice conversion works only for sprintf-based

rules, whereas regex-based rules are unidirectional. For the

purpose of describing how to make dereferenceable URIs for

Linked Data, we will focus on regex-based rules.

Regex rules are created using the URLREWRITE_CREATE_REGEX_RULE() function.

URLREWRITE_CREATE_REGEX_RULE

Function Prototype:

URLREWRITE_CREATE_REGEX_RULE (

rule_iri, allow_update, nice_match, nice_params, n

target_compose, target_params, target_expn := null

do_not_continue := 0, http_redirect_code := null,

Parameters:

rule_iri : VARCHAR

The rule's name / identifier

allow_update : INTEGER

Indicates whether the rule can be updated. 1 indicates

yes; 0 indicates no. The update is subject to the

following rules:

If the given rule_iri is already in use as a rule

list identifier, an error is signalled.

If the given rule_iri is already in use as a rule

identifier and allow_update for the existing rule

is zero, an error is signalled.

If the given rule_iri is already in use as a rule

identifier and allow_update for the existing rule

is non-zero, the existing rule is updated.

nice_match : VARCHAR

A regex match expression to parse the URL into a vector

of occurrences.

nice_params : ANY

A vector of the names of the parsed parameters. The

length of the vector should be equal to the number of

'(...)' specifiers in the format string.

nice_min_params : INTEGER

Used to specify the minimum number of sprintf format

patterns to be matched in order to trigger the given rule.

It only affects sprintf rules and has no effect for regex

rules.

target_compose : VARCHAR

A regex compose expression for the URL of the

destination page.

target_params : ANY

A vector of names of parameters that should be passed

to the compose expression (target_compose) as $1,

$2 and so on.

target_expn : VARCHAR

Optional SQL text that should be executed instead of a

regex compose call.

accept_pattern : VARCHAR

A regex expression to match the HTTP Accept header

do_not_continue : INTEGER

If the given rule satisfies the match conditions, 1

signifies do not try the next rule from same rule list, and

0 signifies try the next rule.

http_redirect_code : INTEGER

NULL or the integer values 301, 302, 303, or 406, are

currently allowed. If a 3xx redirect code is given, an

HTTP redirect response will be sent back to client. If

NULL is specified, the server will process the redirect

internally.

http_headers : VARCHAR

HTTP headers to supply with the rewritten request.

Dissection of Northwind Rewrite Rules Configured using Virtuoso PL

Having briefly outlined the URL Rewriting API, we return now

to the Northwind rule configuration script listed earlier.

At the start of the script, we define a virtual directory in order

to turn on URL rewriting through vhost_define(). We first

remove any existing definition for logical path /Northwind

on the virtual host defined by vhost, before redefining the

logical path. vhost specifies the host name sent to a

user-agent in an HTTP response. This must be a valid fully-

qualified host name or alias and port separated by ':'. This

parameter accepts the special value '*ini*' which will be

replaced with the hostname and port configured in the

virtuoso.ini file.

The /Northwind virtual directory is mapped to a DAV folder

(indicated by is_dav being non-zero) whose physical path is

/DAV/home/demo. The machine hosting the virtual directory

listens on the IP address and port specified by lhost (i.e.,

listen host). Like vhost, this accepts the special value

'*ini*'. Any VSP pages contained in the virtual directory will

run as user 'dba'.

URL rewriting is enabled through the url_rewrite field in

the opts vector; the URL rewriter will use the rule list named

demo_nw_rule_list1. The latter is defined by the

URLREWRITE_CREATE_RULELIST function call which follows.

The rule list contains two regex-based rules,

demo_nw_rule1 and demo_nw_rule2, each defined by calls

to function URLREWRITE_CREATE_REGEX_RULE.

Consider first rule demo_nw_rule2. In this rule, the regular

expression '(/[^#]*)' specified for nice_match matches the

input IRI up to the fragment delimiter (#). The corresponding

occurrence is named 'path' in the nice_params vector.

The client must be requesting the return data as RDF

serialized as N3 or RDF/XML in order for the rule to apply.

Argument target_compose specifies a URL-encoded

template for the rewritten destination URL. Spaces are

encoded as '+' or '%20', the reserved character '#' is percent-encoded as '%23', and the '%' character itself is escaped as '%%'.

Removing the URL encoding and the final format specifier

('&format=%U'), the SPARQL DESCRIBE query being built

takes the form:

DESCRIBE <http://^{URIQADefaultHost}^%U#this>

<http://^{URIQADefaultHost}^%U> FROM

<http://^{URIQADefaultHost}^/Northwind>

Unsurprisingly this is almost identical to the SPARQL query

displayed by Conductor, when the same rewrite rules are

viewed through the Conductor UI. The only difference lies in

the slightly different syntax used for parameter markers (%U

or %s, as opposed to $U1, $U2, ... or $s1, $s2, ... in

Conductor). Here, the two sprintf-like format characters %U

are placeholders which receive the first two entries in the

target_params vector, i.e., the value of 'path' in both cases. In our example, the value of 'path' would be '/Northwind/Customer/ALFKI'.
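The way the sprintf-like placeholders consume the target_params vector can be modelled with a few lines of Python. This is a sketch only: compose_target is a hypothetical name, the template is a simplified single-entity version of the truncated listing shown earlier, and urllib.parse.quote is an assumed approximation of the '%U' encoding.

```python
import re
from urllib.parse import quote


def compose_target(target_compose, target_params, values):
    """Fill %s / %U placeholders left to right from target_params; %% gives %."""
    names = iter(target_params)

    def repl(m):
        spec = m.group(1)
        if spec == "%":
            return "%"  # escaped percent sign
        value = values[next(names)]  # next entry of the target_params vector
        return quote(value, safe="") if spec == "U" else value

    return re.sub(r"%([sU%])", repl, target_compose)


# Simplified template (assumption: one entity instead of two).
template = "/sparql?query=DESCRIBE+%%3Chttp%%3A//^{URIQADefaultHost}^%U%%23this%%3E&format=%U"
values = {"path": "/Northwind/Customer/ALFKI", "*accept*": "application/rdf+xml"}
print(compose_target(template, ["path", "*accept*"], values))
# /sparql?query=DESCRIBE+%3Chttp%3A//^{URIQADefaultHost}^%2FNorthwind%2FCustomer%2FALFKI%23this%3E&format=application%2Frdf%2Bxml
```

The doubled '%%' sequences collapse to a single '%', leaving ordinary percent-encoded characters in the rewritten URL, while each '%U' takes the next named value in order.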

The query response format is controlled by the format query

parameter. In the format specifier ('&format=%U') at the end

of the constructed query string, the third placeholder '%U'

receives the value of the third entry in the target_params

vector, '*accept*'. The '*accept*' parameter is used to pass the part of the Accept header matched against accept_pattern. For example, if the Accept header specified MIME types of 'application/rdf+xml, application/xml' and the accept_pattern is '(text/rdf.n3)|(application/rdf.xml)', then the '*accept*' parameter will have the value 'application/rdf+xml'.
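The Accept header example can be reproduced with Python's re module, used here as an assumed stand-in for Virtuoso's regex engine (unanchored search semantics are also an assumption). The helper name match_accept is hypothetical.

```python
import re


def match_accept(accept_pattern, accept_header):
    """Return the part of the Accept header matched by the pattern, or None."""
    m = re.search(accept_pattern, accept_header)
    return m.group(0) if m else None


pattern = r"(text/rdf.n3)|(application/rdf.xml)"
print(match_accept(pattern, "application/rdf+xml, application/xml"))  # application/rdf+xml
print(match_accept(pattern, "text/html"))                             # None
```

Note that the '.' in the pattern matches any single character, which is why 'rdf.xml' matches 'rdf+xml'. A request whose Accept header matches neither alternative does not trigger the rule.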

The other rule, demo_nw_rule1, is essentially similar, but

targeted at HTML browsers rather than RDF browsers. Rather

than the internal redirect used by demo_nw_rule2, this rule

returns HTTP redirect code 303 to the client when the rewrite

rule is applied.

Internal Rewrites vs External Redirects

External redirect: Tells the client to ask for the requested content

again using a new URL and HTTP request. An external redirect is

indicated by one of the HTTP response codes:

301 - Moved permanently (for permanent redirection)

302 - Found (the most common way of performing a redirection)

303 - See Other (the correct manner in which to redirect web applications to a new URI)

Internal rewrite/redirect: Gets the content for the requested URL

from a different server file path than implied by the requested URL.
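The distinction can be summarised in a small Python model. The function name respond and the dictionary layout are hypothetical, chosen only for this sketch; the 406 code, which the API also accepts, is deliberately not modelled because its behaviour is not described here.

```python
def respond(destination, http_redirect_code=None):
    """Model the two outcomes of an applied rewrite rule."""
    if http_redirect_code is None:
        # Internal rewrite: the server fetches the destination itself and
        # the client never sees the rewritten URL.
        return {"kind": "internal", "serve": destination}
    if http_redirect_code in (301, 302, 303):
        # External redirect: the client is told to issue a second request.
        return {
            "kind": "external",
            "status": http_redirect_code,
            "headers": {"Location": destination},
        }
    raise ValueError("redirect code not modelled in this sketch")


print(respond("/about/html/http://demo.openlinksw.com/Northwind/Customer/ALFKI", 303)["status"])  # 303
print(respond("/sparql?query=...")["kind"])  # internal
```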

As described earlier when examining the Conductor-

configured rules, HTML requests are redirected to

description.vsp via the Sponger proxy interface.

System Tables Supporting URL Rewriting

If you need to check your rewrite rule definitions, an alternative to

inspecting them using Conductor is to query Virtuoso's system

tables directly. The relevant system tables for URL rewriting are

DB.DBA.URL_REWRITE_RULE_LIST and DB.DBA.URL_REWRITE_RULE. For example, the configured rule lists can be seen by executing 'SELECT URRL_LIST FROM DB.DBA.URL_REWRITE_RULE_LIST'.

Data Flow in Virtuoso/PL-Defined Northwind RDF Regex Rule

Earlier we presented a data flow diagram showing the

process of rewriting a request for an RDF representation of

Northwind customer ALFKI, through a regex rule defined in

the Conductor. Below is a similar diagram, depicting the

same request rewrite, this time using the Virtuoso PL

definition of the same rule. As before, the arcs connecting

similarly coloured items illustrate how portions of the input

request are matched and substituted into the rewritten

request.

Figure: Breakdown of the URL rewriting process for Northwind RDF requests

As illustrated earlier, the curl utility provides a useful tool for

verifying HTTP server responses and rewrite rules. The first

two curl exchanges below show the default Northwind URL

rewrite rules being applied.

Example 1:

$ curl -I -H "Accept: text/html" http://demo.openlin

HTTP/1.1 303 See Other

Server: Virtuoso/05.09.3037 (Solaris) x86_64-sun-sol

Connection: close

Content-Type: text/html; charset=ISO-8859-1

Date: Fri

Accept-Ranges: bytes

Location: http://demo.openlinksw.com/about/html/http

Content-Length: 0

Example 2:

$ curl -I -H "Accept: application/rdf+xml" http://de

HTTP/1.1 200 OK

Server: Virtuoso/05.09.3037 (Solaris) x86_64-sun-sol

Connection: Keep-Alive

Date: Fri, 06 Feb 2009 11:14:49 GMT

Accept-Ranges: bytes

Content-Type: application/rdf+xml; charset=UTF-8

Content-Length: 9488

Example 3:

$ curl -I -H "Accept: application/rdf+xml" http://de

HTTP/1.1 303 See Other

Server: Virtuoso/05.09.3037 (Solaris) x86_64-sun-sol

Connection: close

Content-Type: text/html; charset=ISO-8859-1

Date: Thu, 12 Feb 2009 11:23:31 GMT

Accept-Ranges: bytes

Location: http://demo.openlinksw.com/sparql?query=DE

Content-Length: 0

The third example shows the response generated when the

default rule for RDF requests is changed to return an HTTP

response code of 303, rather than use an internal redirect.

Making this temporary change allows the generated SPARQL

query to be viewed and checked with curl.
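Once the rule returns a 303, the rewritten URL appears in the Location header and can be decoded for inspection. The sketch below uses Python's standard urllib.parse module; the Location value is a hypothetical, shortened stand-in, because the header in Example 3 is truncated in this transcript.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical Location header value (not the truncated one from Example 3).
location = ("http://demo.openlinksw.com/sparql"
            "?query=DESCRIBE+%3Chttp%3A//example.org/thing%23this%3E"
            "&format=application%2Frdf%2Bxml")

# parse_qs undoes the percent-encoding and turns '+' back into a space.
params = parse_qs(urlsplit(location).query)
sparql_query = params["query"][0]
response_format = params["format"][0]

print(sparql_query)     # the decoded SPARQL text
print(response_format)  # the requested serialization
```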

In this section, we are going to interact with Linked Data

deployed into the Linked Data Web from a live instance of

Virtuoso, which uses the URL Rewrite rules from the prior

section.

The components used in the example are as follows:

Virtuoso SPARQL Endpoint - http://demo.openlinksw.com/sparql

Named RDF Graph - http://demo.openlinksw.com/Northwind

Entity ID - http://demo.openlinksw.com/Northwind/Customer/ALFKI#this

Information Resource - http://demo.openlinksw.com/Northwind/Customer/ALFKI

Interactive SPARQL Query Builder (iSPARQL) - http://demo.openlinksw.com/DAV/JS/isparql/index.html

Steps:

1. Point your HTML browser to the URI http://demo.openlinksw.com/Northwind/Customer/ALFKI to display the HTML rendering of the RDF container document pointing to entity ALFKI. Click on the link http://demo.openlinksw.com/Northwind/Customer/ALFKI#this adjacent to the foaf:primaryTopic property to display the ALFKI entity itself.

2. Ctrl-click (Mac OS X) or right-click (Windows) on the 'About: Alfreds Futterkiste' link at the top of the page to display the ODE pop-up menu, then click on the 'View Linked Data Sources' command. This launches the ODE RDF browser, which will present an RDF Linked Data view of customer ALFKI.

3. Click on one of ALFKI's attribute values, for instance one of the customer's orders, to display a pop-up with options for 'expanding' the URI. Selecting 'Describe' dereferences the attribute value URI, to display details of the selected order.

The screenshot below shows the result of dereferencing the data link http://demo.openlinksw.com/Northwind/Order/11103#this. (Triples associated with the 'Alfreds Futterkiste' entity were removed beforehand for clarity, by clicking the 'Remove' link in the 'Cache' group box, to show only the Order details.)

Continuing in this way, one can navigate over the Northwind

RDF graph, drilling down to uncover more details of selected

entities.

We can interact with the same Information Resource and

associated RDF using the iSPARQL Query tool as follows:

1. Start the Query Builder by entering the following URL into your browser: http://demo.openlinksw.com/isparql. You will be presented with a default Query By Example (QBE) canvas that includes a default Graph Pattern and a default URI. Change the URI to the Northwind graph URI: http://demo.openlinksw.com/Northwind.

2. Then execute the default query (which simply gets a list of concepts defined by the RDF graph), by clicking on the ">" button.

3. Click on the Customer record, and you will be presented with a Linked-Data-Web-optimized hyperlink that presents you with three options: Get Data Items, Describe Data Source, and Open Web Page.

4. Click Get Data Items (since you are interested in "instance data" for the Customer concept, as opposed to the schema definitions of said concept). You will be presented with a list of northwind:Customer instances. Click on the 's' column header to sort the customer list alphabetically.

5. Click on the http://demo.openlinksw.com/Northwind/Customer/ALFKI record, and you will once again be presented with the enhanced hyperlink and its options. Again click on Get Data Items, to get all the records in the RDF database related to entity http://demo.openlinksw.com/Northwind/Customer/ALFKI.

So as not to overload our preceding description of Linked

Data deployment with excessive detail, the description of

content negotiation presented thus far was kept deliberately

brief. This section discusses content negotiation in more

detail.

Recall that a resource (conceptual entity) identified by a URI

may be associated with more than one representation (e.g.

multiple languages, data formats, sizes, resolutions). If

multiple representations are available, the resource is

referred to as negotiable and each of its representations is

termed a variant. For instance, a Web document resource named 'ALFKI' may have three variants, alfki.xml, alfki.html and alfki.txt, all representing the same data. Content negotiation provides a mechanism for selecting the best variant.

As outlined in the earlier brief discussion of content

negotiation, when a user agent requests a resource, it can

include with the request Accept headers (Accept, Accept-

Language, Accept-Charset, Accept-Encoding etc.) which

express the user preferences and user agent capabilities.

The server then chooses and returns the best variant based

on the Accept headers. Because the selection of the best

resource representation is made by the server, this scheme is

classed as server-driven negotiation.
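As a minimal sketch of server-driven negotiation, the following Python fragment parses an Accept header and picks the variant whose content type carries the highest q-value. It is deliberately simplified: partial wildcards such as text/* and the other Accept-* headers are ignored, and the content types assigned to the three 'ALFKI' variants are assumptions.

```python
def parse_accept(header):
    """Parse an Accept header into {media_type: q}, defaulting q to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs[parts[0]] = q
    return prefs


def choose_variant(accept_header, variants):
    """Pick the variant whose content type has the highest q, or None."""
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for name, media_type in variants.items():
        q = prefs.get(media_type, prefs.get("*/*", 0.0))
        if q > best_q:
            best, best_q = name, q
    return best


# The three variants named in the text; their content types are assumed.
variants = {
    "alfki.xml": "text/xml",
    "alfki.html": "text/html",
    "alfki.txt": "text/plain",
}
print(choose_variant("text/html;q=0.8, text/xml", variants))
```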

An alternative content negotiation mechanism is Transparent

Content Negotiation (TCN), a protocol defined by RFC2295.

TCN offers a number of benefits over standard HTTP/1.1

negotiation, for suitably enabled user agents.

RFC2295 introduces a number of new HTTP headers

including the Negotiate request header, and the TCN and

Alternates response headers. (Krishnamurthy et al. note that

although the HTTP/1.1 specification reserved the Alternates

header for use in agent-driven negotiation, it was not fully specified. Consequently, under a pure HTTP/1.1 implementation as defined by RFC2616, server-driven content negotiation is the only option. RFC2295 addresses this issue.)

Deficiencies of HTTP/1.1 Server-Driven Negotiation

Weaknesses of server-driven negotiation highlighted by RFCs

2295 and 2616 include:

Inefficiency - Sending details of a user agent's

capabilities and preferences with every request is very

inefficient, not least because very few Web resources

have multiple variants, and expensive in terms of the

number of Accept headers required to fully describe all

but the most simple browser's capabilities.

Server doesn't always know 'best' - Having the server

decide on the 'best' variant may not always result in the

most suitable resource representation being returned to

the client. The user agent might often be better placed

to decide what is best for its needs.

Variant Selection By User Agent

Rather than rely on server-driven negotiation and variant

selection by the server, a user agent can take full control over

deciding the best variant by explicitly requesting transparent

content negotiation through the Negotiate request header.

The negotiation is 'transparent' because it makes all the

variants on the server visible to the agent.

Under this scheme, the server sends the user agent a list,

represented in an Alternates header, containing the available

variants and their properties. The user agent can then choose

the best variant itself. Consequently, the agent no longer

needs to send large Accept headers describing in detail its

capabilities and preferences. (However, unless caching is used, user-agent-driven negotiation does suffer from the disadvantage of needing a second request to obtain the best representation. By sending its best guess as the first response, server-driven negotiation avoids this second request if the initial best guess is acceptable.)
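To make variant selection by the user agent concrete, here is a hedged Python sketch that parses an Alternates header and scores each variant locally. The header string follows the shape of the Alternates header Virtuoso returns in the list-response example later in this guide; the regular expression handles only that simple shape (a URI, a source quality and a single type attribute), and the function names and local preference values are illustrative assumptions.

```python
import re

# One variant description: {"URI" source-quality {type media/type}}
VARIANT_RE = re.compile(r'\{"([^"]+)"\s+([0-9.]+)\s+\{type\s+([^}\s]+)\}')


def parse_alternates(header):
    """Return a list of (uri, source_quality, media_type) tuples."""
    return [(uri, float(qs), media_type)
            for uri, qs, media_type in VARIANT_RE.findall(header)]


def agent_choice(header, preferences):
    """Choose the variant with the highest source_quality * local preference."""
    scored = [(qs * preferences.get(media_type, 0.0), uri)
              for uri, qs, media_type in parse_alternates(header)]
    if not scored:
        return None
    score, uri = max(scored)
    return uri if score > 0 else None


alternates = ('{"page.html" 0.900000 {type text/html}}, '
              '{"page.xml" 1.000000 {type text/xml}}')
print(parse_alternates(alternates))
print(agent_choice(alternates, {"text/html": 1.0, "text/xml": 0.5}))
```

Having chosen a variant, the agent would then issue the second request mentioned above, this time for the variant's own URI.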

Variant Selection By Server

As well as variant selection by the user agent, TCN allows the

server to choose on behalf of the user agent if the user agent

explicitly allows it through the Negotiate request header. This

option allows the user agent to send smaller Accept headers

containing enough information to allow the server to choose

the best variant and return it directly. The server's choice is

controlled by a 'remote variant selection algorithm' as defined

in RFC2296.

Variant Selection By End-User

A further option is to allow the end-user to select a variant, in

case the choice made by the negotiation process is not optimal.

For instance, the user agent could display an HTML-based

'pick list' of variants constructed from the variant list returned

by the server. Alternatively the server could generate this pick

list itself and include it in the response to a user agent's

request for a variant list. (Virtuoso currently responds this

way.)

The following section describes the Virtuoso HTTP server's

TCN implementation which is based on RFC2295, but without

"Feature" negotiation. OpenLink's RDF rich clients, iSPARQL

and the OpenLink RDF Browser, both support TCN. User

agents which do not support transparent content negotiation

continue to be handled using HTTP/1.1 style content

negotiation (whereby server-side selection is the only option -

the server selects the best variant and returns a list of

variants in an Alternates response header).

In order to negotiate a resource, the server needs to be given

information about each of the variants. Variant descriptions

are held in SQL table HTTP_VARIANT_MAP. The

descriptions themselves can be created, updated or deleted

using Virtuoso/PL or through the Conductor UI.

HTTP_VARIANT_MAP Table

The table definition is as follows:

create table DB.DBA.HTTP_VARIANT_MAP (

VM_ID integer identity, -- unique ID

VM_RULELIST varchar, -- HTTP rule list name

VM_URI varchar, -- name of requested r

VM_VARIANT_URI varchar, -- name of variant e.g

VM_QS float, -- Source quality, a n

VM_TYPE varchar, -- Content type of the

VM_LANG varchar, -- Content language e.

VM_ENC varchar, -- Content encoding e.

VM_DESCRIPTION long varchar, -- a human readable de

VM_ALGO int default 0, -- reserved for future

primary key (VM_RULELIST, VM_URI, VM_VARIANT_URI)

);

create unique index HTTP_VARIANT_MAP_ID on DB.DBA.HT

Configuration using Virtuoso/PL

Two functions are provided for adding or updating, and for

removing, variant descriptions using Virtuoso/PL:

Adding or Updating a Resource Variant

DB.DBA.HTTP_VARIANT_ADD (

in rulelist_uri varchar, -- HTTP rule list na

in uri varchar, -- Requested resourc

in variant_uri varchar, -- Variant name e.g.

in mime varchar, -- Content type of t

in qs float := 1.0, -- Source quality, a

in description varchar := null, -- a human readable

in lang varchar := null, -- Content language

in enc varchar := null -- Content encoding

)

Removing a Resource Variant

DB.DBA.HTTP_VARIANT_REMOVE (

in rulelist_uri varchar, -- HTTP rule list nam

in uri varchar, -- Name of requested

in variant_uri varchar := '%' -- Variant name filte

)

Configuration using Conductor UI

The Conductor 'Content negotiation' panel for describing

resource variants and configuring content negotiation is

depicted below. It can be reached by selecting the 'Virtual

Domains & Directories' tab under the 'Web Application Server'

menu item, then selecting the 'URL rewrite' option for a

logical path listed amongst those for the relevant HTTP host,

e.g. '{Default Web Site}'.

The screen snapshot shows the variant descriptions created

by issuing the HTTP_VARIANT_ADD and VHOST_DEFINE

Virtuoso/PL calls detailed in the examples at the end of this

section. Obviously these definitions could instead have been

created entirely 'from scratch' through the Conductor UI.

The input fields reflect the supported 'dimensions' of

negotiation which include content type, language and

encoding. Quality values corresponding to the options for

'Source Quality' are as follows:

Source Quality                                  Quality Value
perfect representation                          1.000
threshold of noticeable loss of quality         0.900
noticeable, but acceptable quality reduction    0.800
barely acceptable quality                       0.500
severely degraded quality                       0.300
completely degraded quality                     0.000

Figure: Content negotiation rules in Conductor

When a user agent instructs the server to select the best

variant, Virtuoso does so using the selection algorithm below:

If a virtual directory has URL rewriting enabled (has the

'url_rewrite' option set), the web server:

1. Looks in DB.DBA.HTTP_VARIANT_MAP for a VM_RULELIST matching the one specified in the 'url_rewrite' option.

2. If present, it loops over all variants for which VM_URI is equal to the resource requested.

3. For every variant, it calculates a quality score based on the value of VM_QS and the quality value (q) given by the user agent.

4. If the best variant is found, it adds TCN HTTP headers to the response and passes the VM_VARIANT_URI to the URL rewriter.

5. If the user agent has asked for a variant list, it composes such a list and returns an 'Alternates' HTTP header with response code 300.

6. If no URL rewriter rules exist for the target URL, the web server returns the content of the dereferenced VM_VARIANT_URI.

The server may return the best-choice resource

representation or a list of available resource variants. When a

user agent requests transparent negotiation, the web server

returns the TCN header "choice". When a user agent asks for

a variant list, the server returns the TCN header "list".
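The selection steps above can be sketched as follows. This is an illustrative Python model, not Virtuoso's implementation: it assumes the quality score is the product of the variant's source quality and the agent's q-value for its content type (the overall-quality rule of RFC 2296, restricted to content type), it takes the Accept preferences as an already-parsed dictionary, and it does not model the hand-off to the URL rewriter. The variant rows use the source qualities 0.9 and 1.0 that appear in the Alternates header of the list-response example later in this guide; the 406 returned when nothing is acceptable is an assumption, as the text does not cover that case.

```python
# (variant URI, source quality, content type), standing in for rows of
# HTTP_VARIANT_MAP that share one VM_RULELIST and VM_URI.
VARIANTS = [
    ("page.html", 0.9, "text/html"),
    ("page.xml", 1.0, "text/xml"),
]


def negotiate(negotiate_header, accept_q, variants=VARIANTS):
    """Return (status, headers) for a TCN request."""
    directives = {d.strip() for d in negotiate_header.split(",")}
    if "vlist" in directives:
        # The agent asked for the variant list: 300 plus an Alternates header.
        alternates = ", ".join('{"%s" %f {type %s}}' % v for v in variants)
        return 300, {"TCN": "list", "Alternates": alternates}
    # Otherwise score every variant and return the best one as a choice response.
    scored = [(round(qs * accept_q.get(media_type, 0.0), 5), uri)
              for uri, qs, media_type in variants]
    quality, uri = max(scored)
    if quality <= 0:
        return 406, {}  # assumption: behaviour not described in the text
    return 200, {"TCN": "choice", "Content-Location": uri}


print(negotiate("*", {"text/xml": 0.3, "text/html": 1.0}))
print(negotiate("*", {"text/xml": 1.0, "text/html": 0.7}))
print(negotiate("vlist", {}))
```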

In this example we assume the following files have been

uploaded to the Virtuoso WebDAV server, with each

containing the same information but in different formats:

/DAV/TCN/page.xml - an XML variant

/DAV/TCN/page.html - an HTML variant

/DAV/TCN/page.txt - a text variant

We add TCN rules and define a virtual directory:

DB.DBA.HTTP_VARIANT_ADD ('http_rule_list_1', 'page',

DB.DBA.HTTP_VARIANT_ADD ('http_rule_list_1', 'page',

DB.DBA.HTTP_VARIANT_ADD ('http_rule_list_1', 'page',

DB.DBA.VHOST_DEFINE (lpath=>'/DAV/TCN/', ppath=>'/DA

opts=>vector ('url_rewrite', 'h

Having done this we can now test the setup with a suitable

HTTP client, in this case the curl command line utility. In the

following examples, the curl client supplies Negotiate request

headers containing content negotiation directives which

include:

"trans" - The user agent supports transparent content

negotiation for the current request.

"vlist" - The user agent requests that any transparently

negotiated response for the current request includes an

Alternates header with the variant list bound to the

negotiable resource. Implies "trans".

"*" - The user agent allows servers and proxies to run

any remote variant selection algorithm.

The server returns a TCN response header signalling that the

resource is transparently negotiated and either a choice or a

list response as appropriate.

In the first curl exchange, the user agent indicates to the

server that, of the formats it recognizes, HTML is preferred

and it instructs the server to perform transparent content

negotiation. In the response, the Vary header field expresses

the parameters the server used to select a representation, i.e.

only the Negotiate and Accept header fields are considered.

$ curl -i -H "Accept: text/xml;q=0.3,text/html;q=1.0

-H "Negotiate: *" http://localhost:8890/DA

HTTP/1.1 200 OK

Server: Virtuoso/05.00.3021 (Linux) i686-pc-linux-gn

Connection: Keep-Alive

Date: Wed, 31 Oct 2007 15:43:18 GMT

Accept-Ranges: b

TCN: choice

Vary: negotiate,accept

Content-Location: page.html

Content-Type: text/html

ETag: "14056a25c066a6e0a6e65889754a0602"

Content-Length: 49

<html>

<body>

some html

</body>

</html>

Next, the quality values in the Accept header are adjusted so that the user

agent indicates that XML is its preferred format.

$ curl -i -H "Accept: text/xml,text/html;q=0.7,text/

-H "Negotiate: *" http://localhost:8890/DA

HTTP/1.1 200 OK

Server: Virtuoso/05.00.3021 (Linux)

Connection: Keep-Alive

Date: Wed, 31 Oct 2007 15:44:07 GMT

Accept-Ranges: bytes

TCN: choice

Vary: negotiate,accept

Content-Location: page.xml

Content-Type: text/xml

ETag: "8b09f4b8e358fcb7fd1f0f8fa918973a"

Content-Length: 39

<?xml version="1.0" ?>

<a>some xml</a>

In the final example, the user agent wants to decide itself

which is the most suitable representation, so it asks for a list

of variants. The server provides the list, in the form of an

Alternates response header, and, in addition, sends an HTML

representation of the list so that the end user can decide on

the preferred variant himself if the user agent is unable to.

$ curl -i -H "Accept: text/xml,text/html;q=0.7,text/

-H "Negotiate: vlist" http://localhost:889

HTTP/1.1 300 Multiple Choices

Server: Virtuoso/05.00.3021 (Linux) i686-pc-linux-gn

Connection: close

Content-Type: text/html; charset=I

Date: Wed, 31 Oct 2007 15:44:35 GMT

Accept-Ranges: bytes

TCN: list

Vary: negotiate,accept

Alternates: {"page.html" 0.900000 {type text/html}},

{"page.xml" 1.000000 {type text/xml}}

Content-Length: 368

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

<html>

<head>

<title>300 Multiple Choices</title>

</head>

<body>

<h1>Multiple Choices</h1>

Available variants:

<ul>

<li><a href="page.html">HTML variant</a>, type text/

<li><a href="page.txt">Text document</a>, type text/

<li><a href="page.xml">XML variant</a>, type text/xm

</ul>

</body>

</html>

Our next example illustrates the use of a slash URI scheme in

an RDF view, and shows how to combine URL rewriting and

transparent content negotiation. The example is taken from

the RDF View tutorial, one of many Virtuoso on-line tutorials.

The view generates an RDF rendering of Virtuoso's Northwind

'Demo' database. (Note: The 'tutorial' RDF view described

here is distinct from the hash-URI-based 'demo' RDF view

created by the Demonstration VAD.) If you intend to try the

example locally, both the Demonstration and Tutorial VADs

must be installed on the local machine.

To generate the RDF view and set up the URL rewriting rules, the tutorial runs the script rd_v_1.sql (see the 'View Source' tab of the RDF View tutorial, or WebDAV folder DAV/VAD/tutorial/rdfview/rd_v_1). The view creates two RDF graphs:

http://<URIQADefaultHost>/tutorial/Northwind - containing the base RDF data

http://<URIQADefaultHost>/tutorial/Northwind/ontology - containing the OWL class definitions

A slash URI scheme is adhered to throughout. Each entity exposed by the view is identified by a URI with the prefix http://<URIQADefaultHost>/tutorial/Northwind/resource/. For example:

http://demo.openlinksw.com/tutorial/Northwind/resource/Customer/ALFKI

http://demo.openlinksw.com/tutorial/Northwind/resource/Order/10692

RDF and HTML representation documents describing Northwind entities are identified by URIs with prefixes http://<URIQADefaultHost>/tutorial/Northwind/data/ and http://<URIQADefaultHost>/tutorial/Northwind/page/, e.g.

http://demo.openlinksw.com/tutorial/Northwind/data/Customer/ALFKI.xml

http://demo.openlinksw.com/tutorial/Northwind/data/Customer/ALFKI.n3

http://demo.openlinksw.com/tutorial/Northwind/data/Customer/ALFKI.ttl

http://demo.openlinksw.com/tutorial/Northwind/page/Customer/ALFKI.html

Transparent content negotiation is enabled to allow entity

representations to be rendered in several formats. The

available variants can be seen using curl, e.g.

curl -I -L -H "Negotiate: vlist" "http://demo.openli

returns

HTTP/1.1 303 See Other

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: close

Content-Type: text/html; charset=ISO-8859-1

Date: Fri, 15 May 2009 11:11:19 GMT

Accept-Ranges: bytes

TCN: list

Vary: negotiate,accept

Alternates: {"ALFKI.html" 0.600000 {type text/html}}

{"ALFKI.ttl" 0.700000 {type application/x-turtle}},

Location: http://demo.openlinksw.com/tutorial/Northw

Content-Length: 443

Requesting RDF/XML as the preferred representation of a

resource (and requesting only the HTTP headers be

displayed)

curl -I -L -H "Accept: application/rdf+xml;q=0.95,te

-H "Negotiate: *" "http://demo.openlinksw

returns

HTTP/1.1 303 See Other

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: close

Date: Fri, 15 May 2009 16:17:11 GMT

Accept-Ranges: bytes

TCN: choice

Vary: negotiate,accept

Content-Location: ALFKI.xml

Content-Type: application/rdf+xml; qs=0.9025

Location: http://demo.openlinksw.com/tutorial/Northw

Content-Length: 0

HTTP/1.1 303 See Other

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: close

Content-Type: text/html; charset=ISO-8859-1

Date: Fri, 15 May 2009 16:17:11 GMT

Accept-Ranges: bytes

Location: http://demo.openlinksw.com/sparql?default-

query=DESCRIBE+%3Chttp%3A//demo.openlinksw.com/tutor

Content-Length: 0

HTTP/1.1 200 OK

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: Keep-Alive

Date: Fri, 15 May 2009 16:17:11 GMT

Accept-Ranges: bytes

Content-Type: application/rdf+xml; charset=UTF-8

Content-Length: 6358

Likewise, specifying N3 as the preferred format

curl -I -L -H "Accept: text/rdf+n3;q=1.0,application

-H "Negotiate: *" "http://demo.openlinksw

generates

HTTP/1.1 303 See Other

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: close

Date: Fri, 15 May 2009 16:30:27 GMT

Accept-Ranges: bytes

TCN: choice

Vary: negotiate,accept

Content-Location: ALFKI.n3

Content-Type: text/rdf+n3; qs=0.8

Location: http://demo.openlinksw.com/tutorial/Northw

Content-Length: 0

HTTP/1.1 303 See Other

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: close

Content-Type: text/html; charset=ISO-8859-1

Date: Fri, 15 May 2009 16:30:28 GMT

Accept-Ranges: bytes

Location: http://demo.openlinksw.com/sparql?default-

query=DESCRIBE+%3Chttp%3A//demo.openlinksw.com/tutor

Content-Length: 0

HTTP/1.1 200 OK

Server: Virtuoso/05.10.3038 (Solaris) x86_64-sun-sol

Connection: Keep-Alive

Date: Fri, 15 May 2009 16:30:28 GMT

Accept-Ranges: bytes

Content-Type: text/rdf+n3; charset=UTF-8

Content-Length: 2018
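The qs values reported in the Content-Type headers of the two choice responses above (0.9025 and 0.8) are consistent with a product of the variant's source quality and the q-value sent in the Accept header. As a rough check under that assumption, the sketch below recovers the implied source qualities from the visible q-values (0.95 for application/rdf+xml, 1.0 for text/rdf+n3). The source qualities themselves are inferred, not quoted: the HTTP_VARIANT_ADD arguments that would confirm them are truncated in this transcript.

```python
def implied_source_quality(reported_qs, accept_q):
    """Recover the source quality, assuming reported_qs = source quality * accept_q."""
    return round(reported_qs / accept_q, 4)


rdf_xml = implied_source_quality(0.9025, 0.95)
n3 = implied_source_quality(0.8, 1.0)
print(rdf_xml, n3)
```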

To explain how this TCN configuration is set up, the salient

portions of the rd_v_1.sql setup script are described below.

A URL rewriting rule list, nwtut_rule_list_1, is associated with logical path /tutorial/Northwind/resource. Two rules, resource_rule_1 and resource_rule_2, are added to the rule list. Each rewrites request paths containing '/tutorial/Northwind/resource/'.

DB.DBA.VHOST_DEFINE (lpath=>'/tutorial/Northwind/res

ppath=>'/DAV/VAD/tutorial/rdfview/rd_v_1/', is_dav

vsp_user=>'dba',opts=>vector ('url_rewrite', 'nwtu

...

DB.DBA.URLREWRITE_CREATE_RULELIST ('nwtut_rule_list_

The first rule, resource_rule_1, acts as a 'catch all', handling

requests for content types not handled by the second rule.

The latter handles requests for different RDF serialization

formats: RDF/XML, N3, TTL, redirecting them to path

/tutorial/Northwind/data/... . resource_rule_1 forces

requests for any other content types to 'text/html', redirecting

the request to path /tutorial/Northwind/page/... .

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ('resource_rule_

vector ('par_1'), 1,'/tutorial/Northwind/page/%s',

vector ('par_1'), NULL, NULL, 2, 303, NULL);

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ('resource_rule_

vector ('par_1'), 1,'/tutorial/Northwind/data/%s',

vector ('par_1'), NULL, '(application/rdf.xml)|(te

So, requests for /tutorial/Northwind/resource/$1 are routed to:

/tutorial/Northwind/data/$1.xml - if content type application/rdf+xml was requested

/tutorial/Northwind/data/$1.n3 - if content type text/rdf+n3 was requested

/tutorial/Northwind/data/$1.ttl - if content type application/x-turtle was requested

/tutorial/Northwind/page/$1.html - if content type text/html, or any other content type, was requested

where $1 signifies the remainder portion of the input path.
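The routing table above can be expressed as a short Python sketch. It models only the outcome of the routing, not how the rule list and the variant map interact inside Virtuoso; the names ROUTES, PREFIX and route are illustrative.

```python
# Content type requested -> (directory, file suffix), as listed in the text.
ROUTES = {
    "application/rdf+xml": ("data", ".xml"),
    "text/rdf+n3": ("data", ".n3"),
    "application/x-turtle": ("data", ".ttl"),
}
PREFIX = "/tutorial/Northwind/resource/"


def route(path, content_type):
    """Map a /resource/ request path to the description document path."""
    if not path.startswith(PREFIX):
        return None
    remainder = path[len(PREFIX):]  # the '$1' portion of the input path
    # text/html and any other content type fall through to the HTML page.
    directory, suffix = ROUTES.get(content_type, ("page", ".html"))
    return f"/tutorial/Northwind/{directory}/{remainder}{suffix}"


print(route("/tutorial/Northwind/resource/Customer/ALFKI", "text/rdf+n3"))
print(route("/tutorial/Northwind/resource/Customer/ALFKI", "image/png"))
```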

The Customer entity ALFKI has four description document

variants, ALFKI.xml, ALFKI.n3, ALFKI.ttl and ALFKI.html.

Each variant is described using function

HTTP_VARIANT_ADD. (Here, the '$' character is coded using

its hex value, \x24.)

DB.DBA.HTTP_VARIANT_ADD ('nwtut_rule_list_1', '(.*)'

DB.DBA.HTTP_VARIANT_ADD ('nwtut_rule_list_1', '(.*)'

DB.DBA.HTTP_VARIANT_ADD ('nwtut_rule_list_1', '(.*)'

DB.DBA.HTTP_VARIANT_ADD ('nwtut_rule_list_1', '(.*)'

Finally, the paths /tutorial/Northwind/data and /tutorial/Northwind/page have their own rewrite rules, attached to rule lists nwtut_rule_list_2 and nwtut_rule_list_3 respectively.

DB.DBA.VHOST_DEFINE (lpath=>'/tutorial/Northwind/dat

ppath=>'/DAV/VAD/tutorial/rdfview/rd_v_1/',

is_dav=>1, is_brws=>1, vsp_user=>'dba',opts=>vecto

DB.DBA.VHOST_DEFINE (lpath=>'/tutorial/Northwind/pag

ppath=>'/DAV/VAD/tutorial/rdfview/rd_v_1/',

is_dav=>1, is_brws=>1, vsp_user=>'dba',

opts=>vector ('url_rewrite', 'nwtut_rule_list_3'))

nwtut_rule_list_2 contains three rewrite rules (data_rule_1/2/3), one for each RDF description document variant. Each rewrites the resource request as a SPARQL DESCRIBE query, the only difference between the queries being the response serialization format. nwtut_rule_list_3 contains one rule (page_rule_1) to re-route requests for text/html through the /about/html Sponger proxy, and so generate an HTML rendering. Each rule strips off any file suffix identifying the variant; e.g. only the 'Customer/ALFKI' portion of 'Customer/ALFKI.n3' or 'Customer/ALFKI.html' is inserted into the rewritten request.

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'data_rule_1',

'/sparql?default-graph-uri=http%%3A//^{URIQADefaultH

query=DESCRIBE+%%3Chttp%%3A//^{URIQADefaultHost}^/tu

vector ('par_1'), NULL, NULL, 2, 303, '');

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'data_rule_2',

'/sparql?default-graph-uri=http%%3A//^{URIQADefaultH

query=DESCRIBE+%%3Chttp%%3A//^{URIQADefaultHost}^/tu

vector ('par_1'), NULL, NULL, 2, 303, '');

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'data_rule_3',

'/sparql?default-graph-uri=http%%3A//^{URIQADefaultH

query=DESCRIBE+%%3Chttp%%3A//^{URIQADefaultHost}^/tu

vector ('par_1', 'f'), NULL, NULL, 2, 303, '');

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'page_rule_1',

'/about/html/http://^{URIQADefaultHost}^/tutorial/No

vector ('par_1'), NULL, '(text/html)', 2, 303);
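
The suffix stripping and DESCRIBE rewriting described above can be sketched in Python. This is only an illustration of the idea, not the Virtuoso implementation: the function names, the suffix pattern, and the example.org graph and resource URIs are assumptions made for the example, since the rule definitions in the listing are truncated.

```python
import re
from urllib.parse import urlencode

# Assumed suffix pattern: an optional variant suffix at the end of the path.
SUFFIX_RE = re.compile(r"^(.+?)(?:\.(xml|n3|html))?$")


def strip_variant_suffix(local_path):
    """Split 'Customer/ALFKI.n3' into ('Customer/ALFKI', 'n3').

    The suffix is None when the path carries no recognised suffix.
    """
    match = SUFFIX_RE.match(local_path)
    return match.group(1), match.group(2)


def describe_url(resource_uri, graph_uri, mime_type):
    """Build a /sparql request that DESCRIBEs one resource.

    Only the 'format' value differs between the RDF variants.
    """
    params = {
        "default-graph-uri": graph_uri,
        "query": f"DESCRIBE <{resource_uri}>",
        "format": mime_type,
    }
    return "/sparql?" + urlencode(params)


# Hypothetical host and graph, used purely for illustration.
name, suffix = strip_variant_suffix("Customer/ALFKI.n3")
print(describe_url(f"http://example.org/{name}",
                   "http://example.org/graph",
                   "text/rdf+n3"))
```

Whatever suffix the client requested, only the stripped portion reaches the query, so a single resource URI is described in every variant.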

Under the umbrella of the W3C Linking Open Data (LOD)

Community Project, DBpedia is a well known initiative to

extract structured information from Wikipedia and make this

information available on the Web. The DBpedia knowledge

base is accessible through a SPARQL endpoint or through a

Linked Data interface. As DBpedia defines Linked Data URIs

for millions of concepts, it forms one of the central interlinking

hubs in the LOD Cloud and the emerging Web of Data.

When serving the DBpedia dataset as Linked Data, DBpedia

supports transparent content negotiation in a similar manner

to that already described for the Northwind Tutorial RDF

View. Indeed, the Northwind RDF View's TCN configuration

was modelled as a simplified version of DBpedia's.

DBpedia uses a slash URI scheme when distinguishing

between resource and description document URIs.

Depending on the content type preferences of the consuming

client expressed in any 'Accept' request headers and the

'best' variant as selected by the server, a request for resource

http://dbpedia.org/resource/The_Beatles is redirected to one

of:

http://dbpedia.org/page/The_Beatles (default, or if

text/html requested)

http://dbpedia.org/data/The_Beatles.xml (if 'best' RDF

variant is application/rdf+xml)

http://dbpedia.org/data/The_Beatles.n3 (if 'best' RDF

variant is text/rdf+n3)

http://dbpedia.org/data/The_Beatles.ttl (if 'best' RDF

variant is application/x-turtle)

As with the Northwind RDF view, the URI prefixes

http://dbpedia.org/resource/...,

http://dbpedia.org/page/... and http://dbpedia.org

/data/... distinguish between a resource and its HTML or

RDF description documents.

The available RDF description document variants can be

listed using curl. The command:

curl -I -L -H "Negotiate: vlist" -H "Accept: applica

yields:

HTTP/1.1 303 See Other

Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-sol

Connection: close

Content-Type: text/html; charset=UTF-8

Date: Mon, 18 May 2009 14:47:31 GMT

Accept-Ranges: bytes

TCN: list

Vary: negotiate,accept

Alternates: {"The_Beatles.n3" 0.800000 {type text/rd

{"The_Beatles.xml" 0.950000 {type application/rdf+

Location: http://dbpedia.org/data/__The_Beatles

Content-Length: 418

Requesting resource "The_Beatles" with RDF/XML as the

preferred description format, using:

curl -I -L -H "Negotiate: *"

-H "Accept: application/rdf+xml;q=0.95,te

"http://dbpedia.org/resource/The_Beatles"

returns:

HTTP/1.1 303 See Other

Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-sol

Connection: close

Date: Mon, 18 May 2009 14:56:39 GMT

Accept-Ranges: bytes

TCN: choice

Vary: negotiate,accept

Content-Location: The_Beatles.xml

Content-Type: application/rdf+xml; qs=0.9025

Location: http://dbpedia.org/data/The_Beatles.xml

Content-Length: 0

HTTP/1.1 200 OK

Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-sol

Connection: Keep-Alive

Date: Mon, 18 May 2009 14:56:40 GMT

Accept-Ranges: bytes

Content-Type: application/rdf+xml; charset=UTF-8

Content-Length: 55844

Changing the preferred description format to N3:

curl -I -L -H "Negotiate: *"

-H "Accept: application/rdf+xml;q=0.70,te

"http://dbpedia.org/resource/The_Beatles"

results in the response:

HTTP/1.1 303 See Other

Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-sol

Connection: close

Date: Mon, 18 May 2009 15:00:16 GMT

Accept-Ranges: bytes

TCN: choice

Vary: negotiate,accept

Content-Location: The_Beatles.n3

Content-Type: text/rdf+n3; qs=0.76

Location: http://dbpedia.org/data/The_Beatles.n3

Content-Length: 0

HTTP/1.1 200 OK

Server: Virtuoso/05.11.3039 (Solaris) x86_64-sun-sol

Connection: Keep-Alive

Date: Mon, 18 May 2009 15:00:20 GMT

Accept-Ranges: bytes

Content-Type: text/rdf+n3; charset=UTF-8

Content-Length: 29259
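
The qs values reported in the two responses (0.9025 and 0.76) are consistent with the server multiplying each variant's source quality, as listed in the Alternates header, by the q value the client assigned to that media type, which is how transparent content negotiation ranks variants. The Python sketch below reproduces that arithmetic. It is an illustration only: the function names are hypothetical, wildcards such as */* are not handled, and the q values given to text/rdf+n3 are assumptions, because the Accept headers in the curl commands above are truncated.

```python
def parse_accept(header):
    """Return {media_type: q} from an Accept header; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs[parts[0]] = q
    return prefs


# Source qualities as shown in the Alternates header for The_Beatles.
VARIANTS = {
    "The_Beatles.xml": ("application/rdf+xml", 0.95),
    "The_Beatles.n3": ("text/rdf+n3", 0.80),
}


def choose_variant(accept_header, variants=VARIANTS):
    """Pick the variant with the highest (source quality x client q)."""
    prefs = parse_accept(accept_header)
    scored = [
        (qs * prefs.get(mime, 0.0), name)
        for name, (mime, qs) in variants.items()
    ]
    quality, name = max(scored)
    return name, round(quality, 4)


# The text/rdf+n3 q values below are assumed, not taken from the source.
print(choose_variant("application/rdf+xml;q=0.95,text/rdf+n3;q=0.5"))
print(choose_variant("application/rdf+xml;q=0.70,text/rdf+n3;q=0.95"))
```

Under these assumptions the first header selects The_Beatles.xml and the second selects The_Beatles.n3, matching the Content-Location values in the responses.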

DBpedia's URL rewriting rules and TCN support are

configured using script dbpedia_init.sql, portions of which are

listed below. For completeness, dbpedia_init.sql is available

here.

Using VHOST_DEFINE, the logical paths http://dbpedia.org

/resource, http://dbpedia.org/page and

http://dbpedia.org/data are each associated with URL

rewriting rule lists. Requests to /resource are redirected to

/page/%s or /data/__%s accordingly depending on

whether an HTML or RDF description is being requested, and

where %s is the portion of the request path after

/resource/. Resource descriptions provided by path

/data/__%s are available in three variants (RDF/XML, N3 and

TTL); each variant is declared using HTTP_VARIANT_ADD.

DB.DBA.VHOST_DEFINE ( lhost=>':80', vhost=>'dbpedia.

ppath=>'/', is_dav=>0, def_page=>'',

opts=>vector ('url_rewrite', 'dbp_rule_list_2'));

...

DB.DBA.VHOST_DEFINE ( lhost=>':80', vhost=>'dbpedia.

lpath=>'/page',

ppath=>registry_get('_dbpedia_path_'),

is_dav=>atoi (registry_get('_dbpedia_dav_')),

opts=>vector ('url_rewrite', 'dbp_rule_list_7'));

...

DB.DBA.VHOST_DEFINE ( lhost=>':80', vhost=>'dbpedia.

ppath=>registry_get('_dbpedia_path_'),

is_dav=>atoi (registry_get('_dbpedia_dav_')), vsp_us

opts=>vector ('url_rewrite', 'pvsp_rule_list2'));

DB.DBA.URLREWRITE_CREATE_RULELIST ( 'dbp_rule_list_2

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'dbp_rule_14',

'/page/%s', vector ('par_1'), NULL, NULL, 2, 303, NU

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'dbp_rule_12',

'/data/__%s', vector ('par_1'), NULL, '(application/

delete from DB.DBA.HTTP_VARIANT_MAP where VM_RULELIS

DB.DBA.HTTP_VARIANT_ADD ('dbp_rule_list_2', '__(.*)'

DB.DBA.HTTP_VARIANT_ADD ('dbp_rule_list_2', '__(.*)'

DB.DBA.HTTP_VARIANT_ADD ('dbp_rule_list_2', '__(.*)'

...

DB.DBA.URLREWRITE_CREATE_RULELIST ( 'dbp_rule_list_7

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'dbp_rule_13',

registry_get('_dbpedia_path_')||'description.vsp?res

NULL, NULL, 0, 0, '');

...

DB.DBA.URLREWRITE_CREATE_RULELIST ( 'pvsp_rule_list2

DB.DBA.URLREWRITE_CREATE_REGEX_RULE ( 'pvsp_data_rul

'/sparql?default-graph-uri=http%%3A%%2F%%2Fdbpedia.o

query=DESCRIBE+%%3Chttp%%3A%%2F%%2Fdbpedia.org%%2Fre

vector ('par_1'), NULL, NULL, 2, null, '');

Requests redirected to /data/__%s are redirected to

/data/%s.xml, /data/%s.ttl or /data/%s.(n3|rdf)

depending on the content type of the chosen variant. The

data for these RDF variants is furnished by similar SPARQL

DESCRIBE queries which differ only in the format= query

string parameter used to specify the result set representation

format.
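
The two-stage redirection for RDF requests can be sketched in Python as follows. This models only the path rewriting, not Virtuoso's rule engine. The function names are hypothetical, and the Accept pattern is an assumption because the corresponding pattern in the dbp_rule_12 listing is truncated. The media types and suffixes are those listed earlier for the DBpedia variants.

```python
import re

# Assumed Accept pattern; the pattern in the original rule is truncated.
RDF_ACCEPT = re.compile(
    r"application/rdf\+xml|text/rdf\+n3|application/x-turtle"
)

# Media type of the chosen variant -> file suffix of its document.
VARIANT_SUFFIX = {
    "application/rdf+xml": "xml",
    "text/rdf+n3": "n3",
    "application/x-turtle": "ttl",
}


def redirect_resource(path, accept=""):
    """Stage 1: map /resource/X to /page/X (HTML) or /data/__X (RDF)."""
    match = re.match(r"^/resource/(.*)$", path)
    if match is None:
        return None
    name = match.group(1)
    if RDF_ACCEPT.search(accept):
        return f"/data/__{name}"
    return f"/page/{name}"


def resolve_variant(path, chosen_type):
    """Stage 2: map /data/__X to /data/X.<suffix> for the chosen variant."""
    match = re.match(r"^/data/__(.*)$", path)
    if match is None or chosen_type not in VARIANT_SUFFIX:
        return None
    return f"/data/{match.group(1)}.{VARIANT_SUFFIX[chosen_type]}"


step1 = redirect_resource("/resource/The_Beatles", "application/rdf+xml;q=0.95")
print(step1)
print(resolve_variant(step1, "application/rdf+xml"))
```

The intermediate /data/__X path is the one that appears in the Location header of the variant-list response shown earlier; it exists only so that the variant map can match it and substitute the suffix.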

Requests redirected to /page/%s are in turn redirected to the

page description template description.vsp which provides the

HTML rendering. In effect, this is equivalent to the external

303 redirect to the /about/html proxy used by the

Northwind tutorial RDF view - the proxy uses description.vsp

internally.

With Yahoo and Google both having announced support for

RDFa, this format has arguably become the most important

of the RDF syntaxes. From the perspective of content

providers, RDFa brings other benefits beyond the obvious

attraction of increasing your content's page rank by providing

more accurate, semantically richer metadata to RDFa aware

crawlers. Key amongst these is that RDFa provides the

simplest route to deploying Linked Data.

In this guide we have emphasized the distinction between a

real world concept or entity and its, possibly many,

descriptions, where each description is associated with a

different media-type. Earlier examples have shown how to

serve multiple representation formats: HTML, RDF/XML, N3,

TTL etc. In essence these formats boil down to a choice

between either an HTML representation or some variant of

RDF. What RDFa gives you is both representations combined

in a single entity description document. Consequently the

need for content negotiation or 303 redirects to different

representation documents is removed. This fundamental

difference is depicted in the following three diagrams

contrasting the differences between serving content using

HTML+RDFa and serving separate HTML and RDF

description documents through a hash or slash URI scheme.

Figure: Content negotiation with a hash URI scheme

Figure: Content negotiation and 303 redirect with a slash URI scheme

Figure: HTML+RDFa potentially removes the need for content negotiation and HTTP redirects

While authors of small sites might opt to serve static content

and mark up their HTML with RDFa manually, for large

datasets this becomes unattractive. In cases where the HTML

representation itself is being generated from an RDF quad

store, it makes sense to generate any embedded RDFa

alongside the HTML. Virtuoso provides this option through

description.vsp, a Virtuoso Server Page which provides an

HTML description of RDF Linked Data. Appendix A provides a

brief overview.

When dereferencing an entity URI, the description returned is

determined by the media-type(s) specified in any Accept

headers expressing the client's preferred representation

formats. A client can request an XHTML+RDFa description by

supplying an Accept header with a media-type of

application/xhtml+xml or text/html. In the absence of Accept

headers, OpenLink's rewriting rules are normally configured

to return HTML+RDFa by default. (Rewriting rules configured

by the rdf_mappers VAD typically use this convention.)

As our earlier coverage of Virtuoso's proxy service URIs

explained, requests for an HTML rendering of an entity

description are normally redirected internally to the

/about/html proxy. This proxy in turn uses description.vsp to

generate an HTML rendering with embedded RDFa. So, by

exploiting the default URL rewriting rules, internal redirects

(as opposed to much slower external 303 redirects) and the

/about/html proxy service, it is possible to combine

description.vsp's HTML+RDFa generation capabilities with the

deployment benefits of RDFa.

If viewing Virtuoso purely as an RDF publishing service,

RDFa simply constitutes another supported syntax for

encoding RDF metadata, alongside RDF/XML, N3, Turtle,

NTriples and JSON. However, RDF metadata drawn from the

Virtuoso quad store and rendered in one of these formats can

itself have been extracted directly or synthesised from a

multitude of non-RDF data sources using Virtuoso's Sponger.

(Obviously raw RDF data can also be imported directly.)

When sponging an XHTML resource, the Sponger will, via the

xHTML cartridge, automatically ingest any RDFa found and

cache the extracted RDF in the quad store. But the Sponger

can also generate RDF metadata describing non-RDF data

sources. The net result is that the Sponger in combination

with description.vsp can generate RDFa for data sources

containing neither RDF nor RDFa.

As well as being invoked by the /about/html proxy,

description.vsp also underpins the OpenLink Data Explorer's

"View Page Metadata" option. ODE provides a simple means

to examine the RDFa generated by description.vsp.

The screenshot below shows ODE's "View Page Metadata"

output when http://www.crunchbase.com/company/twitter is

sponged by the public Sponger at

http://linkeddata.uriburner.com. The subsequent screenshot

highlights some of the RDFa markup in a heavily cut-down

extract from the description.vsp generated page source.

Figure: Sponged Twitter company profile

Figure: Page source extract highlighting snippets of generated RDFa

Essentially, in the description.vsp output page, values listed

under the "Has Attributes & Values" tab are described using

RDFa attributes @rel and @resource, if the object part of the

triple is a URI, or using @property if the object part is a literal.

Entities listed under the "Is Attribute Value Of" tab are

described using RDFa attributes @rev and @resource.
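
The attribute choice just described can be sketched in Python. This is not the description.vsp source: the function names, the choice of a and span elements, and the sample predicates are assumptions made for illustration. Only the mapping of triple shape to @rel, @property, @rev and @resource follows the text above.

```python
from html import escape


def outgoing(predicate, obj, obj_is_uri):
    """Markup for a triple whose subject is the described entity.

    URI objects use @rel + @resource; literal objects use @property.
    """
    p = escape(predicate, quote=True)
    o = escape(obj, quote=True)
    if obj_is_uri:
        return f'<a rel="{p}" resource="{o}" href="{o}">{o}</a>'
    return f'<span property="{p}">{o}</span>'


def incoming(predicate, subject):
    """Markup for a triple whose object is the described entity.

    The other entity is referenced with @rev + @resource.
    """
    p = escape(predicate, quote=True)
    s = escape(subject, quote=True)
    return f'<a rev="{p}" resource="{s}" href="{s}">{s}</a>'


# Hypothetical triples about a company entity.
print(outgoing("foaf:homepage", "http://twitter.com/", True))
print(outgoing("rdfs:label", "Twitter", False))
print(incoming("foaf:knows", "http://example.org/person/1"))
```

The prefixed predicate names assume that the enclosing page declares the corresponding namespace prefixes, which the sketch does not show.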
