Post on 18-Jan-2016
Querying on the Web: XQuery, RDQL, SparQL
Semantic Web - Spring 2007
Computer Engineering Department
Sharif University of Technology
2
Outline
• XQuery – Querying on XML Data
• RDQL – Querying on RDF Data
• SparQL – Another RDF query language (under development)
3
Requirements for an XML Query Language
David Maier, W3C XML Query Requirements:
• Closedness: output must be XML
• Composability: wherever a set of XML elements is required, a subquery is allowed as well
• Can benefit from a schema, but should also be applicable without one
• Retains the order of nodes
• Formal semantics
4
How Does One Design a Query Language?
• In most query languages, there are two aspects to
a query:
– Retrieving data (e.g., from … where … in SQL)
– Creating output (e.g., select … in SQL)
• Retrieval consists of
– Pattern matching (e.g., from … )
– Filtering (e.g., where … )
… although these cannot always be clearly distinguished
5
XQuery Principles
• A language for querying XML documents.
• Data model identical to the XPath data model
– documents are ordered, labeled trees
– nodes have identity
– nodes can have simple or complex types (defined in XML Schema)
• XQuery can be used without schemas, but can be checked against DTDs and XML schemas
• XQuery is a functional language
– no statements
– evaluation of expressions
6
Sample data
7
A Query over the Recipes Document
<titles>
{for $r in doc("recipes.xml")//recipe
return $r/title}
</titles>
returns
<titles>
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
<title>Ricotta Pie</title>
…
</titles>
8
Query Features
<titles>
{for $r in doc("recipes.xml")//recipe
return
$r/title}
</titles>
• doc(String) returns the input document
• XPath expressions (//recipe, $r/title) select nodes
• Text outside {...} is returned as given; everything in {...} is evaluated
• for iterates; $r is a variable
• return yields a sequence of results, one for each variable binding
9
Features: Summary
• The result is a new XML document
• A query consists of parts that are returned as is
• ... and others that are evaluated (everything in {...} )
• Calling the function doc(String) returns an input document
• XPath is used to retrieve node sets and values
• Iteration over node sets:
for binds a variable to each node in a node set in turn
• Variables can be used in XPath expressions
• return returns a sequence of results,
one for each binding of a variable
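
The features above can be mimicked in a few lines of Python to make the evaluation order concrete. This is only a rough sketch using the standard library's xml.etree.ElementTree, not an XQuery engine; the inline document RECIPES_XML is a hypothetical stand-in for recipes.xml, whose real contents are not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for recipes.xml (assumption: the real file is larger).
RECIPES_XML = """
<collection>
  <recipe><title>Beef Parmesan with Garlic Angel Hair Pasta</title></recipe>
  <recipe><title>Ricotta Pie</title></recipe>
</collection>
"""

def titles_query(xml_text):
    doc = ET.fromstring(xml_text)        # plays the role of doc("recipes.xml")
    result = ET.Element("titles")        # literal part, returned as is
    for r in doc.iter("recipe"):         # for $r in doc(...)//recipe
        for title in r.findall("title"): # $r/title
            result.append(title)         # one result per binding of $r
    return ET.tostring(result, encoding="unicode")

output = titles_query(RECIPES_XML)
print(output)
```

The loop body runs once per binding of the variable, and the results are collected in document order inside the literal <titles> element.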
10
XPath is a Fragment of XQuery
• doc("recipes.xml")//recipe[1]/title
returns an element:
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
• doc("recipes.xml")//recipe[position()<=3]/title
returns a list of elements:
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>,
<title>Ricotta Pie</title>,
<title>Linguine Pescadoro</title>
11
Beware: XPath Attributes
• doc("recipes.xml")//recipe[1]/ingredient[1]/@name
→ attribute name {"beef cube steak"}
(a constructor for an attribute node)
• string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)
→ "beef cube steak"
(a value of type string)
12
XPath Attributes (cntd.)
• <first-ingredient>{string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}</first-ingredient>
→ <first-ingredient>beef cube steak</first-ingredient>
(an element with string content)
13
XPath Attributes (cntd.)
• <first-ingredient>{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}
</first-ingredient>
→ <first-ingredient name="beef cube steak"/>
(an element with an attribute)
14
XPath Attributes (cntd.)
• <first-ingredient
oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">Beef</first-ingredient>
→ <first-ingredient oldName="beef cube steak">
Beef
</first-ingredient>
(the attribute is cast to a string)
15
Iteration with the For-Clause
Syntax: for $var in xpath-expr
Example: for $r in doc("recipes.xml")//recipe return string($r)
• The expression creates a list of bindings for a variable $var
If $var occurs in an expression exp,
then exp is evaluated for each binding
• For-clauses can be nested:
for $r in doc("recipes.xml")//recipe
for $v in doc("vegetables.xml")//vegetable
return ...
16
Nested For-clauses: Example
<my-recipes>
{for $r in doc("recipes.xml")//recipe
return
<my-recipe title="{$r/title}">
{for $i in $r//ingredient
return
<my-ingredient>
{string($i/@name)}
</my-ingredient>
}
</my-recipe>
}
</my-recipes>
Returns my-recipes with titles as attributes and my-ingredients with names as text content
17
The Let Clause
Syntax: let $var := xpath-expr
• binds variable $var to a list of nodes,
with the nodes in document order
• does not iterate over the list
• allows one to keep intermediate results for reuse
(not possible in SQL)
Example:
let $ooreps := doc("recipes.xml")//recipe
[.//ingredient/@name="olive oil"]
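
The difference between for and let can be illustrated with a small Python analogy. This is a sketch only: the list recipes is a hypothetical stand-in for a node sequence, and counting items stands in for any expression evaluated in the return clause.

```python
# Hypothetical node sequence (assumption: three recipe titles).
recipes = ["Beef Parmesan", "Ricotta Pie", "Linguine Pescadoro"]

# for $r in $recipes return count($r)
# The variable is bound to each item in turn, so there is one result per item.
for_results = [len([r]) for r in recipes]

# let $rs := $recipes return count($rs)
# The variable is bound once to the whole sequence, so there is one result.
rs = recipes
let_results = [len(rs)]

print(for_results)
print(let_results)
```

With for, the return expression is evaluated three times on single items; with let, it is evaluated once on the whole list.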
18
Let Clause: Example
<calory-content>
{let $ooreps := doc("recipes.xml")//recipe
[.//ingredient/@name="olive oil"]
for $r in $ooreps return
<calories>
{$r/title/text()}
{": "}
{string($r/nutrition/@calories)}
</calories>}
</calory-content>
Calories of recipes with olive oil
Note the implicit string concatenation
19
Let Clause: Example (cntd.)
The query returns:
<calory-content>
<calories>Beef Parmesan: 1167</calories>
<calories>Linguine Pescadoro: 532</calories>
</calory-content>
20
The Where Clause
Syntax: where <condition>
• occurs before the return clause
• similar to predicates in XPath
• comparisons on nodes:
– "=" compares node values; "is" tests node identity
– "<<" and ">>" for document order
• Example:
for $r in doc("recipes.xml")//recipe
where $r//ingredient/@name="olive oil"
return ...
21
Quantifiers
• Syntax: some/every $var in <node-set> satisfies <expr>
• $var is bound to all nodes in <node-set>
• Test succeeds if <expr> is true for some/every binding
• Note: if <node-set> is empty, then
“some” is false and “every” is true
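
The behaviour on an empty node set matches the usual convention for existential and universal quantification, which Python's built-in any and all follow as well. The helper names some and every below are illustrative only, chosen to mirror the XQuery keywords.

```python
def some(items, pred):
    """Existential quantifier: true if pred holds for at least one item."""
    return any(pred(i) for i in items)

def every(items, pred):
    """Universal quantifier: true if pred holds for all items."""
    return all(pred(i) for i in items)

def is_compound(ingredient):
    # Hypothetical representation: an ingredient is a dict with sub-ingredients.
    return len(ingredient["parts"]) > 0

no_ingredients = []
mixed = [{"parts": []}, {"parts": ["flour", "water"]}]

print(some(no_ingredients, is_compound), every(no_ingredients, is_compound))
print(some(mixed, is_compound), every(mixed, is_compound))
```

A recipe without any ingredient elements would therefore be returned by an "every" query but not by a "some" query.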
22
Quantifiers (Example)
• Recipes that have some compound ingredient:
for $r in doc("recipes.xml")//recipe
where some $i in $r/ingredient satisfies $i/ingredient
return $r/title
• Recipes where every ingredient is non-compound:
for $r in doc("recipes.xml")//recipe
where every $i in $r/ingredient satisfies not($i/ingredient)
return $r/title
23
Element Fusion
“To every recipe, add the attribute calories!”
<result>
{let $rs := doc("recipes.xml")//recipe
for $r in $rs return
<recipe>
{$r/nutrition/@calories}
{$r/title}
</recipe>}
</result>
{$r/nutrition/@calories} yields an attribute; {$r/title} yields an element
24
Element Fusion (cntd.)
The query result:
<result>
<recipe calories="1167">
<title>Beef Parmesan with Garlic Angel Hair Pasta</title>
</recipe>
<recipe calories="349">
<title>Ricotta Pie</title>
</recipe>
<recipe calories="532">
<title>Linguine Pescadoro</title>
</recipe>
</result>
25
Eliminating Duplicates
The function distinct-values(Node Set)
– extracts the values of a sequence of nodes
– creates a duplicate free sequence of values
Note the coercion: nodes are cast as values!
Example:
let $rs := doc("recipes.xml")//recipe
return distinct-values($rs//ingredient/@name)
yields
"beef cube steak",
"onion, sliced into thin rings",
...
26
The Order By Clause
Syntax: order by expr [ ascending | descending ]
for $iname in doc("recipes.xml")//@name
order by $iname descending
return string($iname)
yields
"whole peppercorns",
"whole baby clams",
"white sugar",
...
27
The Order By Clause (cntd.)
The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is the default)
for $r in $rs
order by number($r/nutrition/@calories)
return $r/title
Note:
– The query returns titles ...
– but the ordering is according to calories, which do not appear in the output
Not possible in SQL!
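
Why the number() conversion matters can be seen in a small Python sketch. The calorie values are the three shown in the Element Fusion result; holding them in a dictionary of strings is an assumption made purely for illustration.

```python
# Attribute values arrive as strings, as they would from @calories.
calories = {
    "Beef Parmesan with Garlic Angel Hair Pasta": "1167",
    "Ricotta Pie": "349",
    "Linguine Pescadoro": "532",
}

# Default alphanumerical ordering compares character by character,
# so "1167" sorts before "349".
by_string = sorted(calories, key=lambda title: calories[title])

# Ordering by number, analogous to order by number(...).
by_number = sorted(calories, key=lambda title: float(calories[title]))

print(by_string)
print(by_number)
```

In both cases only the titles are returned, while the sort key itself stays out of the output.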
28
Grouping and Aggregation
Aggregation functions count, sum, avg, min, max
Example: The number of simple ingredients
per recipe
for $r in doc("recipes.xml")//recipe
return
<number>
{attribute {"title"} {$r/title/text()}}
{count($r//ingredient[not(ingredient)])}
</number>
29
Grouping and Aggregation (cntd.)
The query result:
<number title="Beef Parmesan with Garlic Angel Hair Pasta">11</number>,
<number title="Ricotta Pie">12</number>,
<number title="Linguine Pescadoro">15</number>,
<number title="Zuppa Inglese">8</number>,
<number title="Cailles en Sarcophages">30</number>
30
Nested Aggregation
“The recipe with the maximal number of calories!”
let $rs := doc("recipes.xml")//recipe
let $maxCal := max($rs//@calories)
for $r in $rs
where $r//@calories = $maxCal
return string($r/title)
returns
"Cailles en Sarcophages"
31
Running Queries with Galax
• Galax is an open-source implementation of
XQuery (http://www.galaxquery.org/)
– The main developers have taken part in the definition of
XQuery
RDQL
Querying on RDF data
33
Introduction
• RDF Data Query Language
• JDBC/ODBC friendly
• Simple:
SELECT some information
FROM somewhere
WHERE this match
AND these constraints
USING these vocabularies
34
Example
35
Example
• q1 contains a query:
SELECT ?x
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, "John Smith")
• To execute q1 against a model m1.rdf:
java jena.rdfquery --data m1.rdf --query q1
• The outcome is:
x
=============================
<http://somewhere/JohnSmith/>
36
Example
• Return all the resources that have property FN and the associated values:
SELECT ?x, ?fname
WHERE (?x, <http://www.w3.org/2001/vcard-rdf/3.0#FN>, ?fname)
• The outcome is:
x                              | fname
================================================
<http://somewhere/JohnSmith/>  | "John Smith"
<http://somewhere/SarahJones/> | "Sarah Jones"
<http://somewhere/MattJones/>  | "Matt Jones"
37
Example
• Return the given names of everyone whose family name is Jones:
SELECT ?givenName
WHERE (?y, <http://www.w3.org/2001/vcard-rdf/3.0#Family>, "Jones"),
(?y, <http://www.w3.org/2001/vcard-rdf/3.0#Given>, ?givenName)
• The outcome is:
givenName
=========
"Matthew"
"Sarah"
38
URI Prefixes : USING
• RDQL has a syntactic convenience that allows prefix strings to be defined in the USING clause :
SELECT ?x
WHERE (?x, vCard:FN, "John Smith")
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>

SELECT ?givenName
WHERE (?y, vCard:Family, "Smith"),
(?y, vCard:Given, ?givenName)
USING vCard FOR <http://www.w3.org/2001/vcard-rdf/3.0#>
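
How a WHERE clause with two triple patterns sharing the variable ?y is evaluated can be sketched as a small pattern matcher. This is an illustration only and not how Jena implements RDQL; the triples and the blank-node labels (_:n1 and so on) are hypothetical, and the example uses the family name "Jones" from the earlier slide.

```python
VCARD = "http://www.w3.org/2001/vcard-rdf/3.0#"

# Hypothetical data, loosely modelled on the examples in these slides.
TRIPLES = [
    ("_:n1", VCARD + "Family", "Jones"),
    ("_:n1", VCARD + "Given", "Sarah"),
    ("_:n2", VCARD + "Family", "Jones"),
    ("_:n2", VCARD + "Given", "Matthew"),
    ("_:n3", VCARD + "Family", "Smith"),
    ("_:n3", VCARD + "Given", "John"),
]

def match(pattern, triple, binding):
    """Return binding extended so that pattern matches triple, else None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):            # variable: bind it or check it
            if new.setdefault(term, value) != value:
                return None
        elif term != value:                 # constant: must be equal
            return None
    return new

def query(patterns, triples):
    """Evaluate the patterns left to right, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for b in bindings:
            for t in triples:
                b2 = match(pattern, t, b)
                if b2 is not None:
                    extended.append(b2)
        bindings = extended
    return bindings

results = query(
    [("?y", VCARD + "Family", "Jones"), ("?y", VCARD + "Given", "?givenName")],
    TRIPLES,
)
given_names = sorted(b["?givenName"] for b in results)
print(given_names)
```

Because ?y occurs in both patterns, only resources that satisfy both triples contribute a ?givenName.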
39
Filters
• RDQL can restrict the values bound to variables with constraints in the AND clause:
SELECT ?resource WHERE (?resource, info:age, ?age) AND ?age >= 24 USING info FOR <http://somewhere/peopleInfo#>
40
Another Example
SELECT ?title ?description ?orbit ?satellite ?sensor ?date
FROM <http://earth.esa.int/showcase/ers/dublin.rdf>
WHERE (?item <dc:title> ?title)
(?item <dc:description> ?description)
(?item <isc:orbit> ?orbit)
(?item <isc:satellite> ?satellite)
(?item <isc:sensor> ?sensor)
(?item <dc:date> ?date)
USING isc FOR <http://earth.esa.int/standards/showcase/>
dc FOR <http://purl.org/dc/elements/1.1/>
rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
rdfs FOR <http://www.w3.org/2000/01/rdf-schema#>
41
Implementations
• Jena
– http://jena.sourceforge.net/
• Sesame
– http://sesame.aidministrator.nl/
• RDFStore
– http://rdfstore.sourceforge.net/
42
Limitation
• Does not take into account the semantics of RDFS
• For example:
ex:human rdfs:subClassOf ex:animal
ex:student rdfs:subClassOf ex:human
ex:john rdf:type ex:student
Query: “To which class does the resource John belong?”
Expected answer: ex:student, ex:human, ex:animal
However, the query:
SELECT ?x
WHERE (<http://example.org/#john>, rdf:type, ?x)
USING rdf FOR <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
yields only:
<http://example.org/#student>
• Solution: Inference Engines
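
What an inference engine contributes in this example can be sketched as a traversal of the rdfs:subClassOf links. The code below is a toy illustration of that single rule, not a full RDFS reasoner; the dictionaries SUBCLASS_OF and TYPES are an assumed in-memory encoding of the three triples above.

```python
# The three example triples, encoded as adjacency maps (an assumption).
SUBCLASS_OF = {
    "ex:student": {"ex:human"},
    "ex:human": {"ex:animal"},
}
TYPES = {"ex:john": {"ex:student"}}

def asserted_types(resource):
    """What a plain triple match on rdf:type returns."""
    return set(TYPES.get(resource, ()))

def inferred_types(resource):
    """Asserted types plus every superclass reachable via subClassOf."""
    result = set()
    todo = list(TYPES.get(resource, ()))
    while todo:
        cls = todo.pop()
        if cls not in result:
            result.add(cls)
            todo.extend(SUBCLASS_OF.get(cls, ()))
    return result

print(sorted(asserted_types("ex:john")))
print(sorted(inferred_types("ex:john")))
```

The plain match corresponds to the RDQL answer, while the closure gives the expected answer with all three classes.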
SparQL
44
Introduction
• An RDF query language currently under development by W3C
• Builds on previous RDF query languages such as rdfDB, RDQL, and SeRQL.
45
Example RDF
46
Example
• Simple Query:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?url
FROM <bloggers.rdf>
WHERE {
?contributor foaf:name "Jon Foobar" .
?contributor foaf:weblog ?url .
}
47
Example (cont.)
• Optional block:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?depiction
WHERE { ?person foaf:name ?name .
OPTIONAL { ?person foaf:depiction ?depiction . }
}
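
An OPTIONAL block behaves much like a left outer join: every match of the required pattern is kept, and the optional variable is simply left unbound when nothing matches. A rough Python sketch, with hypothetical people and a hypothetical image URL:

```python
# Hypothetical data: both people have a name, only one has a depiction.
names = {"p1": "Alice", "p2": "Bob"}
depictions = {"p1": "http://example.org/alice.png"}

# Required pattern: ?person foaf:name ?name
# Optional pattern: ?person foaf:depiction ?depiction
# None stands for an unbound ?depiction.
rows = [(names[person], depictions.get(person)) for person in sorted(names)]

for name, depiction in rows:
    print(name, depiction)
```

Without OPTIONAL, the second pattern would be required and the row for the person without a depiction would be dropped.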
48
Example (cont.)
• Alternative matches:
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?name ?mbox
WHERE { ?person foaf:name ?name .
{ { ?person foaf:mbox ?mbox } UNION { ?person foaf:mbox_sha1sum ?mbox } }
}
• SparQL has many other features that are out of scope for this class. See the references for more information.
49
References
• http://www.w3.org/TR/xquery/
• A Programmer's Introduction to RDQL: http://jena.sourceforge.net/tutorial/RDQL/
• http://rdfstore.sourceforge.net/
• http://jena.sourceforge.net
• http://sesame.aidministrator.nl/
• http://www.w3.org/TR/2004/WD-rdf-sparql-query-20041012/
• http://www-128.ibm.com/developerworks/java/library/j-sparql/