IS432 Semi-Structured Data Lecture 6: XQuery Dr. Gamal Al-Shorbagy.

Date post: 29-Jan-2016
Page 1

IS432 Semi-Structured Data

Lecture 6: XQuery

Dr. Gamal Al-Shorbagy

Page 2

Introduction

• What is XQuery?
- XQuery is to XML what SQL is to database tables.
- XQuery was designed to query XML data.
- XQuery is built on XPath expressions.
- XQuery is supported by all major databases.
- XQuery is a W3C Recommendation.

Page 3

Introduction

• XQuery can be used to:
- Extract information to use in a Web Service
- Generate summary reports
- Transform XML data to XHTML
- Search Web documents for relevant information

Page 4

FLWOR ("Flower") Expressions

for $x in doc("bib.xml")/bib/book
where $x/price > 30
order by $x/title
return $x/title

FLWOR is an acronym for "For, Let, Where, Order by, Return".

The for clause binds the variable $x to each book element under the bib element in turn.

The where clause keeps only the book elements whose price element has a value greater than 30.

The order by clause defines the sort order. Here the results are sorted by the title element.

The return clause specifies what should be returned. Here it returns the title elements.
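The clause-by-clause reading above can be mirrored in ordinary code. The following is only a rough Python analogue, not XQuery itself: it assumes a trimmed inline copy of bib.xml (the name BIB_XML and the function flwor_titles are illustrative), returns title strings rather than title elements, and sorts by plain string comparison, which may differ from the collation an XQuery processor uses.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of bib.xml (titles and prices only); an assumption
# made so the sketch runs without a file on disk.
BIB_XML = """<bib>
  <book year="1994"><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book year="2000"><title>Data on the Web</title><price>39.95</price></book>
  <book year="1999"><title>The Economics of Technology and Content for Digital TV</title><price>129.95</price></book>
</bib>"""


def flwor_titles(xml_text, min_price=30.0):
    """Hypothetical helper: one Python step per FLWOR clause."""
    root = ET.fromstring(xml_text)                       # root element is <bib>
    books = root.findall("book")                         # for $x in /bib/book
    kept = [b for b in books
            if float(b.findtext("price")) > min_price]   # where $x/price > 30
    kept.sort(key=lambda b: b.findtext("title"))         # order by $x/title
    return [b.findtext("title") for b in kept]           # return $x/title (text only)


titles = flwor_titles(BIB_XML)
print(titles)
```

Every book in the sample costs more than 30, so the where step filters nothing here; raising min_price shows the filter taking effect.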

Page 5

FOR vs. LET

FOR
• Binds node variables: iteration, one binding per node

LET
• Binds collection variables: one value, the whole sequence
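The difference is easiest to see by counting what the variable holds each time the body runs. This is a small Python sketch under stated assumptions: the three-book document and the names for_counts and let_counts are made up for illustration, and the XQuery shown in the comments is what each line is meant to imitate.

```python
import xml.etree.ElementTree as ET

# Hypothetical three-book document, just enough to count bindings.
BIB_XML = ("<bib><book><title>A</title></book>"
           "<book><title>B</title></book>"
           "<book><title>C</title></book></bib>")

books = ET.fromstring(BIB_XML).findall("book")

# for $b in /bib/book return count($b)
# The body runs once per book, and each time $b holds a single node.
for_counts = [len([b]) for b in books]

# let $b := /bib/book return count($b)
# The body runs once, and $b holds the whole sequence of books.
let_counts = [len(books)]

print(for_counts, let_counts)
```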

Page 6

Bib.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE bib SYSTEM "D:\Chikh\Teaching\2009-2010\Sem1\IS432\Examples\People\book.dtd">
<bib>
  <book year="1994">
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="1992">
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
    <publisher>Addison-Wesley</publisher>
    <price>65.95</price>
  </book>
  <book year="2000">
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
    <publisher>Morgan Kaufmann Publishers</publisher>
    <price>39.95</price>
  </book>
  <book year="1999">
    <title>The Economics of Technology and Content for Digital TV</title>
    <editor><last>Gerbarg</last><first>Darcy</first><affiliation>CITI</affiliation></editor>
    <publisher>Kluwer Academic Publishers</publisher>
    <price>129.95</price>
  </book>
</bib>

Page 7

Bib.dtd

<!ELEMENT bib (book*)>
<!ELEMENT book (title, (author+ | editor+), publisher, price)>
<!ATTLIST book year CDATA #REQUIRED>
<!ELEMENT author (last, first)>
<!ELEMENT editor (last, first, affiliation)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT last (#PCDATA)>
<!ELEMENT first (#PCDATA)>
<!ELEMENT affiliation (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
<!ELEMENT price (#PCDATA)>

Page 8

Books.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter SYSTEM "books.dtd">
<chapter>
  <title>Data Model</title>
  <section>
    <title>Syntax For Data Model</title>
  </section>
  <section>
    <title>XML</title>
    <section>
      <title>Basic Syntax</title>
    </section>
    <section>
      <title>XML and Semistructured Data</title>
    </section>
  </section>
</chapter>

Page 9

Books.dtd

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT chapter (title, section*)>
<!ELEMENT section (title, section*)>
<!ELEMENT title (#PCDATA)>

Page 10

Reviews.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE reviews SYSTEM "reviews.dtd">
<reviews>
  <entry>
    <title>Data on the Web</title>
    <price>34.95</price>
    <review>A very good discussion of semi-structured database systems and XML.</review>
  </entry>
  <entry>
    <title>Advanced Programming in the Unix environment</title>
    <price>65.95</price>
    <review>A clear and detailed discussion of UNIX programming.</review>
  </entry>
  <entry>
    <title>TCP/IP Illustrated</title>
    <price>65.95</price>
    <review>One of the best books on TCP/IP.</review>
  </entry>
</reviews>

Page 11

Reviews.dtd

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT reviews (entry*)>
<!ELEMENT entry (title, price, review)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT review (#PCDATA)>

Page 12

Prices.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE prices SYSTEM "prices.dtd">
<prices>
  <book>
    <title>Advanced Programming in the Unix environment</title>
    <source>bstore2.example.com</source>
    <price>65.95</price>
  </book>
  <book>
    <title>Advanced Programming in the Unix environment</title>
    <source>bstore1.example.com</source>
    <price>65.95</price>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
    <source>bstore2.example.com</source>
    <price>65.95</price>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
    <source>bstore1.example.com</source>
    <price>65.95</price>
  </book>
  <book>
    <title>Data on the Web</title>
    <source>bstore2.example.com</source>
    <price>34.95</price>
  </book>
  <book>
    <title>Data on the Web</title>
    <source>bstore1.example.com</source>
    <price>39.95</price>
  </book>
</prices>

Page 13

Prices.dtd

<?xml version="1.0" encoding="UTF-8"?>
<!ELEMENT prices (book*)>
<!ELEMENT book (title, source, price)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT source (#PCDATA)>
<!ELEMENT price (#PCDATA)>

Page 14

Query 1

List books published by Addison-Wesley after 1991, including their year and title.

<bib>
{
  for $b in doc("bib.xml")/bib/book
  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
  return
    <book year="{ $b/@year }">
      { $b/title }
    </book>
}
</bib>

The doc() function is used to open the "bib.xml" file.

<bib>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
  <book year="1992"><title>Advanced Programming in the Unix environment</title></book>
</bib>

Page 15

Query 2

Create a flat list of all the title-author pairs, with each pair enclosed in a "result" element.

<results>
{
  for $b in doc("bib.xml")/bib/book,
      $t in $b/title,
      $a in $b/author
  return
    <result>
      { $t }
      { $a }
    </result>
}
</results>

<results>
  <result>
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
  </result>
  <result>
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
  </result>
  <result>
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
  </result>
  <result>
    <title>Data on the Web</title>
    <author><last>Buneman</last><first>Peter</first></author>
  </result>
  <result>
    <title>Data on the Web</title>
    <author><last>Suciu</last><first>Dan</first></author>
  </result>
</results>

Page 16

Query 3

For each book in the bibliography, list the title and authors, grouped inside a "result" element.

<results>
{
  for $b in doc("bib.xml")/bib/book
  return
    <result>
      { $b/title }
      { $b/author }
    </result>
}
</results>

<results>
  <result>
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
  </result>
  <result>
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
  </result>
  <result>
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </result>
  <result>
    <title>The Economics of Technology and Content for Digital TV</title>
  </result>
</results>

Page 17

Query 4

For each author in the bibliography, list the author's name and the titles of all books by that author, grouped inside a "result" element.

<results>
{
  let $a := doc("bib.xml")//author
  for $last in distinct-values($a/last),
      $first in distinct-values($a[last = $last]/first)
  order by $last, $first
  return
    <result>
      <author>
        <last>{ $last }</last>
        <first>{ $first }</first>
      </author>
      {
        for $b in doc("bib.xml")/bib/book
        where some $ba in $b/author
              satisfies ($ba/last = $last and $ba/first = $first)
        return $b/title
      }
    </result>
}
</results>

Page 18

Query 4

<results>
  <result>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <title>Data on the Web</title>
  </result>
  <result>
    <author><last>Buneman</last><first>Peter</first></author>
    <title>Data on the Web</title>
  </result>
  <result>
    <author><last>Stevens</last><first>W.</first></author>
    <title>TCP/IP Illustrated</title>
    <title>Advanced Programming in the Unix environment</title>
  </result>
  <result>
    <author><last>Suciu</last><first>Dan</first></author>
    <title>Data on the Web</title>
  </result>
</results>
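The grouping in Query 4 (distinct author names, then an existential test per book) can be sketched in Python as follows. This is an illustrative analogue only: BIB_XML is a trimmed inline copy of bib.xml, titles_by_author is a hypothetical helper, and a Python set plus sorted() stands in for distinct-values and order by.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of bib.xml (titles and authors only); an assumption.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author></book>
  <book><title>The Economics of Technology and Content for Digital TV</title>
    <editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>"""


def titles_by_author(xml_text):
    """Hypothetical helper: group book titles under each distinct author."""
    books = ET.fromstring(xml_text).findall("book")
    # Distinct (last, first) pairs, sorted like "order by $last, $first".
    authors = sorted({(a.findtext("last"), a.findtext("first"))
                      for b in books for a in b.findall("author")})
    result = []
    for last, first in authors:
        # "some $ba in $b/author satisfies (...)" becomes any(...).
        titles = [b.findtext("title") for b in books
                  if any(a.findtext("last") == last and a.findtext("first") == first
                         for a in b.findall("author"))]
        result.append(((last, first), titles))
    return result


grouped = titles_by_author(BIB_XML)
print(grouped)
```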

Page 19

Query 5

For each book found in both bib.xml and reviews.xml, list the title of the book and its price from each source.

<books-with-prices>
{
  for $b in doc("bib.xml")//book,
      $a in doc("reviews.xml")//entry
  where $b/title = $a/title
  return
    <book-with-prices>
      { $b/title }
      <price-reviews-source>{ $a/price/text() }</price-reviews-source>
      <price-bib-source>{ $b/price/text() }</price-bib-source>
    </book-with-prices>
}
</books-with-prices>

<books-with-prices>
  <book-with-prices>
    <title>TCP/IP Illustrated</title>
    <price-reviews-source>65.95</price-reviews-source>
    <price-bib-source>65.95</price-bib-source>
  </book-with-prices>
  <book-with-prices>
    <title>Advanced Programming in the Unix environment</title>
    <price-reviews-source>65.95</price-reviews-source>
    <price-bib-source>65.95</price-bib-source>
  </book-with-prices>
  <book-with-prices>
    <title>Data on the Web</title>
    <price-reviews-source>34.95</price-reviews-source>
    <price-bib-source>39.95</price-bib-source>
  </book-with-prices>
</books-with-prices>
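Query 5 is a join: two for variables range over two documents and the where clause keeps the combinations whose titles match. A minimal Python sketch of the same nested-loop join is given here, assuming trimmed inline copies of the two documents (BIB_XML, REVIEWS_XML) and a hypothetical helper join_prices; it returns tuples instead of constructing elements.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copies of bib.xml and reviews.xml; assumptions for the sketch.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book><title>Data on the Web</title><price>39.95</price></book>
  <book><title>The Economics of Technology and Content for Digital TV</title><price>129.95</price></book>
</bib>"""

REVIEWS_XML = """<reviews>
  <entry><title>Data on the Web</title><price>34.95</price></entry>
  <entry><title>Advanced Programming in the Unix environment</title><price>65.95</price></entry>
  <entry><title>TCP/IP Illustrated</title><price>65.95</price></entry>
</reviews>"""


def join_prices(bib_text, reviews_text):
    """Hypothetical helper: (title, review price, bib price) per matching title."""
    bib = ET.fromstring(bib_text)
    reviews = ET.fromstring(reviews_text)
    rows = []
    for b in bib.iter("book"):               # for $b in doc("bib.xml")//book,
        for a in reviews.iter("entry"):      #     $a in doc("reviews.xml")//entry
            if b.findtext("title") == a.findtext("title"):   # where $b/title = $a/title
                rows.append((b.findtext("title"),
                             a.findtext("price"),
                             b.findtext("price")))
    return rows


rows = join_prices(BIB_XML, REVIEWS_XML)
print(rows)
```

The rows come out in bib.xml order because the outer loop runs over the books, which matches the order of the result shown on the slide.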

Page 20

Query 6

For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors.

<bib>
{
  for $b in doc("bib.xml")//book
  where count($b/author) > 0
  return
    <book>
      { $b/title }
      {
        for $a in $b/author[position() <= 2]
        return $a
      }
      {
        if (count($b/author) > 2)
        then <et-al/>
        else ()
      }
    </book>
}
</bib>

Page 21

Query 6

<bib>
  <book>
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book>
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book>
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <et-al/>
  </book>
</bib>
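The positional predicate and the conditional in Query 6 map onto a slice and a boolean. The sketch below is a Python analogue under stated assumptions: BIB_XML is a trimmed inline copy of bib.xml, first_two_authors is a hypothetical helper, and it reports last names and an et-al flag instead of building elements.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of bib.xml (titles and authors only); an assumption.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author></book>
  <book><title>The Economics of Technology and Content for Digital TV</title>
    <editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>"""


def first_two_authors(xml_text):
    """Hypothetical helper: (title, up to two last names, has-more flag)."""
    result = []
    for b in ET.fromstring(xml_text).iter("book"):
        authors = b.findall("author")
        if len(authors) > 0:                                   # where count($b/author) > 0
            names = [a.findtext("last") for a in authors[:2]]  # author[position() <= 2]
            et_al = len(authors) > 2                           # if (count(...) > 2) then <et-al/>
            result.append((b.findtext("title"), names, et_al))
    return result


summary = first_two_authors(BIB_XML)
print(summary)
```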

Page 22

Query 7

List the titles and years of all books published by Addison-Wesley after 1991, in alphabetical order.

<bib>
{
  for $b in doc("bib.xml")//book
  where $b/publisher = "Addison-Wesley" and $b/@year > 1991
  order by $b/title
  return
    <book year="{ $b/@year }">
      { $b/title }
    </book>
}
</bib>

<bib>
  <book year="1992"><title>Advanced Programming in the Unix environment</title></book>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
</bib>

Page 23

Query 8

Find books in which the name of some element ends with the string "or" and the same element contains the string "Suciu" somewhere in its content. For each such book, return the title and the qualifying element.

for $b in doc("bib.xml")//book
let $e := $b/*[contains(string(.), "Suciu")
               and ends-with(local-name(.), "or")]
where exists($e)
return
  <book>
    { $b/title }
    { $e }
  </book>

<book>
  <title>Data on the Web</title>
  <author><last>Suciu</last><first>Dan</first></author>
</book>

Page 24

Query 9

In the document "books.xml", find all section or Section titles that contain the word "XML", regardless of the level of nesting.

<results>
{
  for $t in doc("books.xml")//(Section | section)/title
  where contains($t/text(), "XML")
  return $t
}
</results>

<results>
  <title>XML</title>
  <title>XML and Semistructured Data</title>
</results>

Page 25

Query 10

In the document "prices.xml", find the minimum price for each book, in the form of a "minprice" element with the book title as its title attribute.

<results>
{
  let $doc := doc("prices.xml")
  for $t in distinct-values($doc//book/title)
  let $p := $doc//book[title = $t]/price
  return
    <minprice title="{ $t }">
      <price>{ min($p) }</price>
    </minprice>
}
</results>

<results>
  <minprice title="Advanced Programming in the Unix environment"><price>65.95</price></minprice>
  <minprice title="TCP/IP Illustrated"><price>65.95</price></minprice>
  <minprice title="Data on the Web"><price>34.95</price></minprice>
</results>
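Query 10 groups by title with distinct-values and then aggregates each group with min. The same pattern in Python is sketched below, assuming a trimmed inline copy of prices.xml (PRICES_XML) and a hypothetical helper min_prices. Titles are kept in first-seen order here; an XQuery processor is free to return distinct values in a different order.

```python
import xml.etree.ElementTree as ET

# Trimmed inline copy of prices.xml (titles and prices only); an assumption.
PRICES_XML = """<prices>
  <book><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book><title>Advanced Programming in the Unix environment</title><price>65.95</price></book>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>TCP/IP Illustrated</title><price>65.95</price></book>
  <book><title>Data on the Web</title><price>34.95</price></book>
  <book><title>Data on the Web</title><price>39.95</price></book>
</prices>"""


def min_prices(xml_text):
    """Hypothetical helper: (title, minimum price) for each distinct title."""
    books = list(ET.fromstring(xml_text).iter("book"))
    # distinct-values($doc//book/title): unique titles, first-seen order.
    titles = list(dict.fromkeys(b.findtext("title") for b in books))
    result = []
    for t in titles:
        # let $p := $doc//book[title = $t]/price
        prices = [float(b.findtext("price")) for b in books
                  if b.findtext("title") == t]
        result.append((t, min(prices)))        # min($p)
    return result


minimums = min_prices(PRICES_XML)
print(minimums)
```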

Page 26

Query 11

For each book with an author, return the book with its title and authors. For each book with an editor, return a reference with the book title and the editor's affiliation.

<bib>
{
  for $b in doc("bib.xml")//book[author]
  return
    <book>
      { $b/title }
      { $b/author }
    </book>
}
{
  for $b in doc("bib.xml")//book[editor]
  return
    <reference>
      { $b/title }
      { $b/editor/affiliation }
    </reference>
}
</bib>

Page 27

Query 11

<bib>
  <book>
    <title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book>
    <title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author>
  </book>
  <book>
    <title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author>
  </book>
  <reference>
    <title>The Economics of Technology and Content for Digital TV</title>
    <affiliation>CITI</affiliation>
  </reference>
</bib>

Page 28

Query 12

Find pairs of books that have different titles but the same set of authors (possibly in a different order).

<bib>
{
  for $book1 in doc("bib.xml")//book,
      $book2 in doc("bib.xml")//book
  let $aut1 := for $a in $book1/author
               order by $a/last, $a/first
               return $a
  let $aut2 := for $a in $book2/author
               order by $a/last, $a/first
               return $a
  where $book1 << $book2
    and not($book1/title = $book2/title)
    and deep-equal($aut1, $aut2)
  return
    <book-pair>
      { $book1/title }
      { $book2/title }
    </book-pair>
}
</bib>

Page 29

Query 12

<bib>
  <book-pair>
    <title>TCP/IP Illustrated</title>
    <title>Advanced Programming in the Unix environment</title>
  </book-pair>
</bib>
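Query 12 relies on two ideas: sorting each book's authors so that order does not matter, and the node-order test $book1 << $book2 so that each pair is reported once. A Python sketch of both follows. It assumes a trimmed inline copy of bib.xml (BIB_XML) and two hypothetical helpers; it compares (last, first) pairs instead of calling deep-equal on whole author elements, which is equivalent for this data because the DTD limits author to last and first.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Trimmed inline copy of bib.xml (titles and authors only); an assumption.
BIB_XML = """<bib>
  <book><title>TCP/IP Illustrated</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Advanced Programming in the Unix environment</title>
    <author><last>Stevens</last><first>W.</first></author></book>
  <book><title>Data on the Web</title>
    <author><last>Abiteboul</last><first>Serge</first></author>
    <author><last>Buneman</last><first>Peter</first></author>
    <author><last>Suciu</last><first>Dan</first></author></book>
  <book><title>The Economics of Technology and Content for Digital TV</title>
    <editor><last>Gerbarg</last><first>Darcy</first></editor></book>
</bib>"""


def author_key(book):
    """Sorted (last, first) pairs, mirroring 'order by $a/last, $a/first'."""
    return sorted((a.findtext("last"), a.findtext("first"))
                  for a in book.findall("author"))


def same_author_pairs(xml_text):
    """Hypothetical helper: title pairs of distinct books with equal author lists."""
    books = list(ET.fromstring(xml_text).iter("book"))
    pairs = []
    # combinations() yields each pair once in document order,
    # which plays the role of "$book1 << $book2".
    for b1, b2 in combinations(books, 2):
        if (b1.findtext("title") != b2.findtext("title")
                and author_key(b1) == author_key(b2)):
            pairs.append((b1.findtext("title"), b2.findtext("title")))
    return pairs


pairs = same_author_pairs(BIB_XML)
print(pairs)
```

As in the XQuery version, two books with no authors at all would also count as a pair, since two empty author lists compare equal; the sample contains only one such book, so no extra pair appears.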

Page 30

Thanks

