IS432 Semi-Structured Data
Lecture 6: XQuery
Dr. Gamal Al-Shorbagy
Introduction
• What is XQuery?
  - XQuery is to XML what SQL is to database tables.
  - XQuery was designed to query XML data.
  - XQuery is built on XPath expressions.
  - XQuery is supported by many major database systems.
  - XQuery is a W3C Recommendation.
Introduction
• XQuery can be used to:
  - Extract information to use in a Web Service
  - Generate summary reports
  - Transform XML data to XHTML
  - Search Web documents for relevant information
FLWOR (“Flower”) Expressions
    for $x in doc("bib.xml")/bib/book
    where $x/price > 30
    order by $x/title
    return $x/title
FLWOR is an acronym for "For, Let, Where, Order by, Return".
The for clause binds the variable $x to each book element under the bib element in turn.
The where clause keeps only the book elements whose price element has a value greater than 30.
The order by clause defines the sort order; here the results are sorted by the title element.
The return clause specifies what should be returned; here it returns the title elements.
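As a rough analogy only, the four clauses above can be mimicked in Python with the standard library's xml.etree.ElementTree. This is a sketch, not how an XQuery engine works: the inlined BIB_XML is an abridged copy of bib.xml, the function name titles_over is hypothetical, and the function returns title strings rather than title elements.

```python
import xml.etree.ElementTree as ET

# Abridged copy of bib.xml (titles and prices only), inlined so the sketch runs on its own.
BIB_XML = """
<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book year="2000"><title>Data on the Web</title><price>39.95</price></book>
  <book year="1999"><title>The Economics of Technology and Content for Digital TV</title><price>129.95</price></book>
</bib>
"""

def titles_over(bib_root, min_price):
    """Mirror the FLWOR clauses: for / where / order by / return."""
    books = bib_root.findall("book")                                     # for $x in /bib/book
    kept = [b for b in books if float(b.findtext("price")) > min_price]  # where $x/price > 30
    kept.sort(key=lambda b: b.findtext("title"))                         # order by $x/title
    return [b.findtext("title") for b in kept]                           # return $x/title (as text)

flwor_titles = titles_over(ET.fromstring(BIB_XML), 30)
print(flwor_titles)
```

All four sample books cost more than 30, so the where step keeps every book and only the ordering changes.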
FOR vs. LET
FOR
• Binds a variable to each node of a sequence in turn (iteration)
LET
• Binds a variable to the whole sequence at once (one value)
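The difference can be illustrated with a loose Python analogy (an assumption-laden sketch, not XQuery semantics in full): a for-style binding evaluates the body once per book, while a let-style binding evaluates it once for the whole list. The two-book sample and the names for_results and let_result are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Two-book sample, inlined so the sketch is self-contained.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title></book>
  <book><title>Data on the Web</title></book>
</bib>"""
root = ET.fromstring(BIB_XML)

# "for"-style: the variable is bound to one book at a time,
# so the body is evaluated once per book and yields one result each.
for_results = [f"<result>{b.findtext('title')}</result>" for b in root.findall("book")]

# "let"-style: the variable is bound once to the whole sequence of books,
# so the body is evaluated a single time and yields a single result.
all_books = root.findall("book")
let_result = "<result>" + ", ".join(b.findtext("title") for b in all_books) + "</result>"

print(for_results)
print(let_result)
```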
Bib.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE bib SYSTEM "D:\Chikh\Teaching\2009-2010\Sem1\IS432\Examples\People\book.dtd">
    <bib>
      <book year="1994">
        <title>TCP/IP Illustrated</title>
        <author><last>Stevens</last><first>W.</first></author>
        <publisher>Addison-Wesley</publisher>
        <price>65.95</price>
      </book>
      <book year="1992">
        <title>Advanced Programming in the Unix environment</title>
        <author><last>Stevens</last><first>W.</first></author>
        <publisher>Addison-Wesley</publisher>
        <price>65.95</price>
      </book>
      <book year="2000">
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <author><last>Suciu</last><first>Dan</first></author>
        <publisher>Morgan Kaufmann Publishers</publisher>
        <price>39.95</price>
      </book>
      <book year="1999">
        <title>The Economics of Technology and Content for Digital TV</title>
        <editor><last>Gerbarg</last><first>Darcy</first><affiliation>CITI</affiliation></editor>
        <publisher>Kluwer Academic Publishers</publisher>
        <price>129.95</price>
      </book>
    </bib>
Bib.dtd
    <!ELEMENT bib (book*)>
    <!ELEMENT book (title, (author+ | editor+), publisher, price)>
    <!ATTLIST book year CDATA #REQUIRED>
    <!ELEMENT author (last, first)>
    <!ELEMENT editor (last, first, affiliation)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT last (#PCDATA)>
    <!ELEMENT first (#PCDATA)>
    <!ELEMENT affiliation (#PCDATA)>
    <!ELEMENT publisher (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
Books.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE chapter SYSTEM "books.dtd">
    <chapter>
      <title>Data Model</title>
      <section>
        <title>Syntax For Data Model</title>
      </section>
      <section>
        <title>XML</title>
        <section>
          <title>Basic Syntax</title>
        </section>
        <section>
          <title>XML and Semistructured Data</title>
        </section>
      </section>
    </chapter>
Books.dtd
    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT chapter (title, section*)>
    <!ELEMENT section (title, section*)>
    <!ELEMENT title (#PCDATA)>
Reviews.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE reviews SYSTEM "reviews.dtd">
    <reviews>
      <entry>
        <title>Data on the Web</title>
        <price>34.95</price>
        <review>A very good discussion of semi-structured database systems and XML.</review>
      </entry>
      <entry>
        <title>Advanced Programming in the Unix environment</title>
        <price>65.95</price>
        <review>A clear and detailed discussion of UNIX programming.</review>
      </entry>
      <entry>
        <title>TCP/IP Illustrated</title>
        <price>65.95</price>
        <review>One of the best books on TCP/IP.</review>
      </entry>
    </reviews>
Reviews.dtd
    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT reviews (entry*)>
    <!ELEMENT entry (title, price, review)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
    <!ELEMENT review (#PCDATA)>
Prices.xml
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE prices SYSTEM "prices.dtd">
    <prices>
      <book>
        <title>Advanced Programming in the Unix environment</title>
        <source>bstore2.example.com</source>
        <price>65.95</price>
      </book>
      <book>
        <title>Advanced Programming in the Unix environment</title>
        <source>bstore1.example.com</source>
        <price>65.95</price>
      </book>
      <book>
        <title>TCP/IP Illustrated</title>
        <source>bstore2.example.com</source>
        <price>65.95</price>
      </book>
      <book>
        <title>TCP/IP Illustrated</title>
        <source>bstore1.example.com</source>
        <price>65.95</price>
      </book>
      <book>
        <title>Data on the Web</title>
        <source>bstore2.example.com</source>
        <price>34.95</price>
      </book>
      <book>
        <title>Data on the Web</title>
        <source>bstore1.example.com</source>
        <price>39.95</price>
      </book>
    </prices>
Prices.dtd
    <?xml version="1.0" encoding="UTF-8"?>
    <!ELEMENT prices (book*)>
    <!ELEMENT book (title, source, price)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT source (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
Query 1
List books published by Addison-Wesley after 1991, including their year and title.
    <bib>
    {
      for $b in doc("bib.xml")/bib/book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>
The doc() function is used to open the "bib.xml" document.
Result:
    <bib>
      <book year="1994"><title>TCP/IP Illustrated</title></book>
      <book year="1992"><title>Advanced Programming in the Unix environment</title></book>
    </bib>
Query 2
Create a flat list of all the title-author pairs, with each pair enclosed in a "result" element.
    <results>
    {
      for $b in doc("bib.xml")/bib/book,
          $t in $b/title,
          $a in $b/author
      return
        <result>
          { $t }
          { $a }
        </result>
    }
    </results>
Result:
    <results>
      <result>
        <title>TCP/IP Illustrated</title>
        <author><last>Stevens</last><first>W.</first></author>
      </result>
      <result>
        <title>Advanced Programming in the Unix environment</title>
        <author><last>Stevens</last><first>W.</first></author>
      </result>
      <result>
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
      </result>
      <result>
        <title>Data on the Web</title>
        <author><last>Buneman</last><first>Peter</first></author>
      </result>
      <result>
        <title>Data on the Web</title>
        <author><last>Suciu</last><first>Dan</first></author>
      </result>
    </results>
Query 3
For each book in the bibliography, list the title and authors, grouped inside a "result" element.
    <results>
    {
      for $b in doc("bib.xml")/bib/book
      return
        <result>
          { $b/title }
          { $b/author }
        </result>
    }
    </results>
Result:
    <results>
      <result>
        <title>TCP/IP Illustrated</title>
        <author><last>Stevens</last><first>W.</first></author>
      </result>
      <result>
        <title>Advanced Programming in the Unix environment</title>
        <author><last>Stevens</last><first>W.</first></author>
      </result>
      <result>
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <author><last>Suciu</last><first>Dan</first></author>
      </result>
      <result>
        <title>The Economics of Technology and Content for Digital TV</title>
      </result>
    </results>
Query 4
For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.
    <results>
    {
      let $a := doc("bib.xml")//author
      for $last in distinct-values($a/last),
          $first in distinct-values($a[last=$last]/first)
      order by $last, $first
      return
        <result>
          <author>
            <last>{ $last }</last>
            <first>{ $first }</first>
          </author>
          {
            for $b in doc("bib.xml")/bib/book
            where some $ba in $b/author
                  satisfies ($ba/last = $last and $ba/first=$first)
            return $b/title
          }
        </result>
    }
    </results>
Result:
    <results>
      <result>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <title>Data on the Web</title>
      </result>
      <result>
        <author><last>Buneman</last><first>Peter</first></author>
        <title>Data on the Web</title>
      </result>
      <result>
        <author><last>Stevens</last><first>W.</first></author>
        <title>TCP/IP Illustrated</title>
        <title>Advanced Programming in the Unix environment</title>
      </result>
      <result>
        <author><last>Suciu</last><first>Dan</first></author>
        <title>Data on the Web</title>
      </result>
    </results>
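The grouping in Query 4 can be approximated in Python as a dictionary keyed by (last, first). This is only a sketch under stated assumptions: the inlined BIB_XML is an abridged copy of bib.xml, titles_by_author is a hypothetical helper name, and sorting the keys stands in for the order by clause.

```python
import xml.etree.ElementTree as ET

# Abridged copy of bib.xml (titles, authors and one editor), inlined for the sketch.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author></book>
  <book><title>The Economics of Technology and Content for Digital TV</title>
    <editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>"""

def titles_by_author(bib_root):
    """Map each distinct (last, first) author name to the titles of that author's books."""
    by_author = {}
    for book in bib_root.findall("book"):
        for author in book.findall("author"):          # editors are deliberately ignored
            key = (author.findtext("last"), author.findtext("first"))
            by_author.setdefault(key, []).append(book.findtext("title"))
    # Sorting by key mimics "order by $last, $first".
    return dict(sorted(by_author.items()))

grouped = titles_by_author(ET.fromstring(BIB_XML))
for (last, first), titles in grouped.items():
    print(last, first, titles)
```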
Query 5
For each book found in bib.xml and reviews.xml, list the title of the book and its price from each source.
    <books-with-prices>
    {
      for $b in doc("bib.xml")//book,
          $a in doc("reviews.xml")//entry
      where $b/title = $a/title
      return
        <book-with-prices>
          { $b/title }
          <price-reviews-source>{ $a/price/text() }</price-reviews-source>
          <price-bib-source>{ $b/price/text() }</price-bib-source>
        </book-with-prices>
    }
    </books-with-prices>
Result:
    <books-with-prices>
      <book-with-prices>
        <title>TCP/IP Illustrated</title>
        <price-reviews-source>65.95</price-reviews-source>
        <price-bib-source>65.95</price-bib-source>
      </book-with-prices>
      <book-with-prices>
        <title>Advanced Programming in the Unix environment</title>
        <price-reviews-source>65.95</price-reviews-source>
        <price-bib-source>65.95</price-bib-source>
      </book-with-prices>
      <book-with-prices>
        <title>Data on the Web</title>
        <price-reviews-source>34.95</price-reviews-source>
        <price-bib-source>39.95</price-bib-source>
      </book-with-prices>
    </books-with-prices>
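Query 5 is a join on the title across two documents. A minimal Python sketch of the same nested-loop join follows; the inlined BIB and REVIEWS samples are abridged, and join_prices is a hypothetical name used only for this illustration.

```python
import xml.etree.ElementTree as ET

# Abridged samples of bib.xml and reviews.xml, inlined so the sketch runs on its own.
BIB = ET.fromstring("""<bib>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>Data on the Web</title><price>39.95</price></book>
  <book><title>The Economics of Technology and Content for Digital TV</title><price>129.95</price></book>
</bib>""")
REVIEWS = ET.fromstring("""<reviews>
  <entry><title>Data on the Web</title><price>34.95</price></entry>
  <entry><title>TCP/IP Illustrated</title><price>65.95</price></entry>
</reviews>""")

def join_prices(bib, reviews):
    """Return (title, review price, bib price) for every book present in both documents."""
    rows = []
    for b in bib.iter("book"):                # for $b in doc("bib.xml")//book,
        for a in reviews.iter("entry"):       #     $a in doc("reviews.xml")//entry
            if b.findtext("title") == a.findtext("title"):   # where $b/title = $a/title
                rows.append((b.findtext("title"), a.findtext("price"), b.findtext("price")))
    return rows

joined = join_prices(BIB, REVIEWS)
print(joined)
```

Books without a matching review entry, such as the Digital TV title in this sample, simply drop out of the result, as in the query.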
Query 6
For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors.
    <bib>
    {
      for $b in doc("bib.xml")//book
      where count($b/author) > 0
      return
        <book>
          { $b/title }
          {
            for $a in $b/author[position()<=2]
            return $a
          }
          {
            if (count($b/author) > 2)
            then <et-al/>
            else ()
          }
        </book>
    }
    </bib>
Result:
    <bib>
      <book>
        <title>TCP/IP Illustrated</title>
        <author><last>Stevens</last><first>W.</first></author>
      </book>
      <book>
        <title>Advanced Programming in the Unix environment</title>
        <author><last>Stevens</last><first>W.</first></author>
      </book>
      <book>
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <et-al/>
      </book>
    </bib>
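The conditional construction in Query 6 can be sketched in Python as follows. This is an illustrative analogy only: BOOK_XML is one book copied from bib.xml, and abbreviate_authors is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# One book from bib.xml, written without whitespace between elements.
BOOK_XML = (
    "<book year='2000'><title>Data on the Web</title>"
    "<author><last>Abiteboul</last><first>Serge</first></author>"
    "<author><last>Buneman</last><first>Peter</first></author>"
    "<author><last>Suciu</last><first>Dan</first></author>"
    "<publisher>Morgan Kaufmann Publishers</publisher><price>39.95</price></book>"
)

def abbreviate_authors(book):
    """Build <book> with the title, at most two authors, and <et-al/> if more exist."""
    authors = book.findall("author")
    if not authors:                       # where count($b/author) > 0
        return None
    out = ET.Element("book")
    out.append(book.find("title"))        # { $b/title }
    out.extend(authors[:2])               # $b/author[position() <= 2]
    if len(authors) > 2:                  # if (count($b/author) > 2) then <et-al/>
        ET.SubElement(out, "et-al")
    return out

short = abbreviate_authors(ET.fromstring(BOOK_XML))
print([child.tag for child in short])
```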
Query 7
List the titles and years of all books published by Addison-Wesley after 1991, in alphabetic order.
    <bib>
    {
      for $b in doc("bib.xml")//book
      where $b/publisher = "Addison-Wesley" and $b/@year > 1991
      order by $b/title
      return
        <book year="{ $b/@year }">
          { $b/title }
        </book>
    }
    </bib>
Result:
    <bib>
      <book year="1992"><title>Advanced Programming in the Unix environment</title></book>
      <book year="1994"><title>TCP/IP Illustrated</title></book>
    </bib>
Query 8
Find books in which the name of some element ends with the string "or" and the same element contains the string "Suciu" somewhere in its content. For each such book, return the title and the qualifying element.
    for $b in doc("bib.xml")//book
    let $e := $b/*[contains(string(.), "Suciu")
                   and ends-with(local-name(.), "or")]
    where exists($e)
    return
      <book>
        { $b/title }
        { $e }
      </book>
Result:
    <book>
      <title>Data on the Web</title>
      <author><last>Suciu</last><first>Dan</first></author>
    </book>
Query 9
In the document "books.xml", find all section or Section titles that contain the word "XML", regardless of the level of nesting.
    <results>
    {
      for $t in doc("books.xml")//(Section | section)/title
      where contains($t/text(), "XML")
      return $t
    }
    </results>
Result:
    <results>
      <title>XML</title>
      <title>XML and Semistructured Data</title>
    </results>
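The "regardless of the level of nesting" part of Query 9 corresponds to a walk over all descendants. A rough Python sketch, assuming the inlined CHAPTER copy of books.xml and the hypothetical helper name section_titles_containing:

```python
import xml.etree.ElementTree as ET

# Copy of books.xml without the DOCTYPE, inlined so the sketch is self-contained.
CHAPTER = ET.fromstring(
    "<chapter><title>Data Model</title>"
    "<section><title>Syntax For Data Model</title></section>"
    "<section><title>XML</title>"
    "<section><title>Basic Syntax</title></section>"
    "<section><title>XML and Semistructured Data</title></section>"
    "</section></chapter>"
)

def section_titles_containing(root, word):
    """Collect titles of section/Section elements at any depth whose text contains word."""
    hits = []
    for elem in root.iter():                       # visits every element in document order
        if elem.tag in ("section", "Section"):     # //(Section | section)
            title = elem.find("title")             # the section's own title child
            if title is not None and word in (title.text or ""):
                hits.append(title.text)
    return hits

xml_titles = section_titles_containing(CHAPTER, "XML")
print(xml_titles)
```

The chapter title is never considered because only section elements are inspected.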
Query 10
In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.
    <results>
    {
      let $doc := doc("prices.xml")
      for $t in distinct-values($doc//book/title)
      let $p := $doc//book[title = $t]/price
      return
        <minprice title="{ $t }">
          <price>{ min($p) }</price>
        </minprice>
    }
    </results>
Result:
    <results>
      <minprice title="Advanced Programming in the Unix environment">
        <price>65.95</price>
      </minprice>
      <minprice title="TCP/IP Illustrated">
        <price>65.95</price>
      </minprice>
      <minprice title="Data on the Web">
        <price>34.95</price>
      </minprice>
    </results>
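The combination of distinct-values and min in Query 10 amounts to a group-by with an aggregate. One possible Python sketch, using an abridged inlined sample of prices.xml and the hypothetical helper name min_prices:

```python
import xml.etree.ElementTree as ET

# Abridged sample of prices.xml, inlined so the sketch runs on its own.
PRICES = ET.fromstring("""<prices>
  <book><title>Data on the Web</title><source>bstore2.example.com</source><price>34.95</price></book>
  <book><title>Data on the Web</title><source>bstore1.example.com</source><price>39.95</price></book>
  <book><title>TCP/IP Illustrated</title><source>bstore1.example.com</source><price>65.95</price></book>
</prices>""")

def min_prices(prices_root):
    """Return the lowest price seen for each distinct title."""
    best = {}
    for book in prices_root.iter("book"):
        title = book.findtext("title")
        price = float(book.findtext("price"))
        # Keep the smaller of the new price and the best one seen so far for this title.
        best[title] = min(price, best.get(title, price))
    return best

lowest = min_prices(PRICES)
print(lowest)
```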
Query 11
For each book with an author, return the book with its title and authors. For each book with an editor, return a reference with the book title and the editor's affiliation.
    <bib>
    {
      for $b in doc("bib.xml")//book[author]
      return
        <book>
          { $b/title }
          { $b/author }
        </book>
    }
    {
      for $b in doc("bib.xml")//book[editor]
      return
        <reference>
          { $b/title }
          { $b/editor/affiliation }
        </reference>
    }
    </bib>
Result:
    <bib>
      <book>
        <title>TCP/IP Illustrated</title>
        <author><last>Stevens</last><first>W.</first></author>
      </book>
      <book>
        <title>Advanced Programming in the Unix environment</title>
        <author><last>Stevens</last><first>W.</first></author>
      </book>
      <book>
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <author><last>Suciu</last><first>Dan</first></author>
      </book>
      <reference>
        <title>The Economics of Technology and Content for Digital TV</title>
        <affiliation>CITI</affiliation>
      </reference>
    </bib>
Query 12
Find pairs of books that have different titles but the same set of authors (possibly in a different order).
    <bib>
    {
      for $book1 in doc("bib.xml")//book,
          $book2 in doc("bib.xml")//book
      let $aut1 := for $a in $book1/author
                   order by $a/last, $a/first
                   return $a
      let $aut2 := for $a in $book2/author
                   order by $a/last, $a/first
                   return $a
      where $book1 << $book2
        and not($book1/title = $book2/title)
        and deep-equal($aut1, $aut2)
      return
        <book-pair>
          { $book1/title }
          { $book2/title }
        </book-pair>
    }
    </bib>
Result:
    <bib>
      <book-pair>
        <title>TCP/IP Illustrated</title>
        <title>Advanced Programming in the Unix environment</title>
      </book-pair>
    </bib>
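The idea behind Query 12 (sort each author list, then compare) can be sketched in Python. This is a simplification under stated assumptions: authors are compared by (last, first) only rather than by a full deep-equal, itertools.combinations plays the role of the $book1 << $book2 test, and BIB_XML, author_key and same_author_pairs are names made up for this illustration.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Abridged copy of bib.xml (titles, authors and one editor), inlined for the sketch.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author></book>
  <book><title>The Economics of Technology and Content for Digital TV</title>
    <editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>"""

def author_key(book):
    """Sorted (last, first) pairs, so that author order within a book does not matter."""
    return sorted((a.findtext("last"), a.findtext("first")) for a in book.findall("author"))

def same_author_pairs(bib_root):
    """Return title pairs of distinct books whose author lists match after sorting."""
    pairs = []
    # combinations() yields each pair once, earlier book first (like $book1 << $book2).
    for b1, b2 in combinations(bib_root.findall("book"), 2):
        t1, t2 = b1.findtext("title"), b2.findtext("title")
        if t1 != t2 and author_key(b1) == author_key(b2):
            pairs.append((t1, t2))
    return pairs

book_pairs = same_author_pairs(ET.fromstring(BIB_XML))
print(book_pairs)
```

The editor-only book has an empty author list, and no other book in the sample shares that, so it appears in no pair.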
Thanks