IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

Date post: 19-Jan-2016
Page 1: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

IS432 Semi-Structured Data

Lecture 4:

XPath

Dr. Gamal Al-Shorbagy

Page 2: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

What is XPath?

• XPath: "A language for addressing parts of an XML document"

• Similar to a DOS or UNIX "file system path" but with powerful expressions

• XPath is to XML what the SQL "select" statement is to a relational database
  – But XPath is not a full programming language or a query language.

Page 3: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

What is XPath?

• XPath is used to navigate through elements and attributes in an XML document.

• XPath is a major element in W3C's XSLT standard
  – XQuery and XPointer are both built on XPath expressions.

[Diagram: XPointer and XQuery both build on XPath]

Page 4: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Related Standards

• XSLT – XPath is used to tell XSLT how to match tags

• XLink – similar to HTML links <a> but more powerful

• XPointer – a standard manner for identifying document fragments

• XQuery – a newer, more comprehensive standard that includes XPath 2.0 and allows more complex searches and data types, including relational database searches.

Page 5: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

Versions

• Version 1.0
  – W3C Recommendation, 16 November 1999
  – http://www.w3.org/TR/xpath

• Version 2.0
  – W3C Working Draft, 29 October 2004 (later published as a W3C Recommendation on 23 January 2007)
  – http://www.w3.org/TR/xpath20/

Page 6: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

Other Familiar Path Names

• DOS:
  – C:\Program Files\Altova\XMLSPY2004\Examples\Tutorial

• Web:
  – http://www.google.com/search?hl=en&lr=lang_en&&q=XPath

• Unix:
  – /usr/local/lib/mylib/myprogram.jar

• Similarities:
  – Absolute paths start with "/"
  – Relative paths do not start with "/"

Page 7: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

What is XPath?

• A syntax for defining parts of an XML document

• Uses path expressions to navigate in XML documents

• Contains a library of standard functions

• A major element in XSLT (W3C recommendation)

Page 8: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Terminology

• Seven kinds of nodes in XPath:
  – Element
  – Attribute
  – Text
  – Namespace
  – Processing-instruction
  – Comment
  – Document nodes

• Atomic Values

XML documents are treated as trees of nodes.

The topmost element of the tree is called the root element.

Page 9: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Terminology

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

[Callouts on the slide: Root Element, Attribute, Atomic Value, Element]

Page 10: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax

• XPath uses path expressions to select nodes or node-sets in an XML document.

• The node is selected by following a path or steps.

Page 11: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Axis: Relationships of XPath Nodes

Page 12: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Axis

• ancestor
• parent
• child
• descendant
• preceding-sibling
• following-sibling
• self
• attribute

Page 13: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>

Page 14: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Selecting Nodes

Path Expression   Description
nodename          Selects all nodes with the name "nodename"
/                 Selects the root node
//                Selects nodes in the document from the current node that match the selection, no matter where they are
.                 Selects the current node
..                Selects the parent of the current node
@                 Selects attributes

Page 15: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Selecting Nodes

Path Expression   Result
bookstore         Selects all nodes with the name "bookstore"
/bookstore        Selects the root element bookstore
                  Note: If the path starts with a slash ( / ) it always represents an absolute path to an element!
bookstore/book    Selects all book elements that are children of bookstore
//book            Selects all book elements no matter where they are in the document
bookstore//book   Selects all book elements that are descendants of the bookstore element, no matter where they are under the bookstore element
//@lang           Selects all attributes that are named lang

Page 16: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Selecting Nodes

Path Expression   Result
bookstore
/bookstore
bookstore/book
//book
bookstore//book
//@lang

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">XML 4 Dummies</title>
    <price>39.95</price>
  </book>
  <book>
    <title lang="kor">The Han River</title>
    <price>149.95</price>
  </book>
</bookstore>
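The path expressions on this slide can be tried out with a short script. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree module, which implements only a subset of XPath: paths are evaluated relative to the root element, "anywhere below" is written ".//", and attribute nodes cannot be returned directly, so //@lang is approximated by reading the attribute from each element that has it. The XML prolog is left out of the string for simplicity.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">XML 4 Dummies</title><price>39.95</price></book>
  <book><title lang="kor">The Han River</title><price>149.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)  # the <bookstore> root element

# /bookstore: the root element itself
root_name = root.tag

# bookstore/book: book children of bookstore (relative to root this is "book")
child_books = root.findall("book")

# //book and bookstore//book: book elements at any depth below the root
all_books = root.findall(".//book")

# //@lang: approximated by collecting the lang attribute of every element that has one
langs = [e.get("lang") for e in root.findall(".//*[@lang]")]

print(root_name, len(child_books), len(all_books), langs)
```

In this document every book is a direct child of bookstore, so bookstore/book, //book and bookstore//book all select the same three elements.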

Page 17: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Predicates

• Predicates are used to find a specific node or a node that contains a specific value.

• Predicates are always embedded in square brackets.

Page 18: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Predicates

Path Expression                        Result
/bookstore/book[1]                     Selects the first book element that is the child of the bookstore element
/bookstore/book[last()]                Selects the last book element that is the child of the bookstore element
/bookstore/book[last()-1]              Selects the second-to-last book element that is the child of the bookstore element
/bookstore/book[position()<3]          Selects the first two book elements that are children of the bookstore element
//title[@lang]                         Selects all the title elements that have an attribute named lang
//title[@lang='eng']                   Selects all the title elements that have an attribute named lang with a value of 'eng'
/bookstore/book[price>35.00]           Selects all the book elements of the bookstore element that have a price element with a value greater than 35.00
/bookstore/book[price>35.00]/title     Selects all the title elements of the book elements of the bookstore element that have a price element with a value greater than 35.00

Page 19: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Predicates

Path Expression                        Result
/bookstore/book[1]
/bookstore/book[last()]
/bookstore/book[last()-1]
/bookstore/book[position()<3]
//title[@lang]
//title[@lang='kor']
/bookstore/book[price>35.00]
/bookstore/book[price>35.00]/title

<?xml version="1.0" encoding="ISO-8859-1"?>
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">XML 4 Dummies</title>
    <price>39.95</price>
  </book>
  <book>
    <title lang="kor">The Han River</title>
    <price>149.95</price>
  </book>
</bookstore>
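The predicates on this slide can be checked against the three-book document with a small script. This is a sketch, assuming Python's standard xml.etree.ElementTree, which supports positional predicates ([1], [last()], [last()-1]) and attribute tests but not position()<3 or numeric comparisons such as price>35.00. Those two are emulated in plain Python here, which is an illustration choice rather than real XPath evaluation.

```python
import xml.etree.ElementTree as ET

XML = """<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">XML 4 Dummies</title><price>39.95</price></book>
  <book><title lang="kor">The Han River</title><price>149.95</price></book>
</bookstore>"""

root = ET.fromstring(XML)  # paths below are relative to <bookstore>

# Positional predicates supported directly by ElementTree
first = root.find("book[1]/title").text
last = root.find("book[last()]/title").text
second_last = root.find("book[last()-1]/title").text

# Attribute predicates
with_lang = [t.text for t in root.findall(".//title[@lang]")]
korean = [t.text for t in root.findall(".//title[@lang='kor']")]

# position()<3 emulated with a slice (not supported by ElementTree)
first_two = [b.findtext("title") for b in root.findall("book")[:2]]

# book[price>35.00]/title emulated with a Python filter (not supported by ElementTree)
expensive = [
    b.findtext("title")
    for b in root.findall("book")
    if float(b.findtext("price")) > 35.00
]

print(first, last, second_last, korean, first_two, expensive, sep="\n")
```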

Page 20: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Example

<?xml version="1.0" encoding="iso-8859-1"?>
<pets>
  <pet type="dog" color="brown">Max</pet>
  <pet type="cat" color="white">Toula</pet>
</pets>

• Select all pet elements
  – //pet or alternatively /pets/pet or /pets/child::*
• Select the first pet
  – /pets/pet[1]
• Select all pets of type dog
  – //pet[@type="dog"]
• Select all pets of white color
  – //pet[@color="white"]
• Select the color of all dogs
  – //pet[@type="dog"]/@color
• Get the types of pets with the name Max
  – /pets/pet[text()="Max"]/@type
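The pet queries above can be reproduced in a few lines. This is a sketch, assuming Python 3.7 or later and the standard xml.etree.ElementTree module. ElementTree cannot return attribute nodes, so the two /@color and /@type queries are approximated by reading the attribute from the selected elements, and text()="Max" is written with ElementTree's [.='Max'] predicate.

```python
import xml.etree.ElementTree as ET

pets = ET.fromstring("""<pets>
  <pet type="dog" color="brown">Max</pet>
  <pet type="cat" color="white">Toula</pet>
</pets>""")

# //pet (all pet elements)
all_pets = [p.text for p in pets.findall(".//pet")]

# /pets/pet[1] (the first pet)
first_pet = pets.find("pet[1]").text

# //pet[@type="dog"] and //pet[@color="white"]
dogs = [p.text for p in pets.findall(".//pet[@type='dog']")]
white_pets = [p.text for p in pets.findall(".//pet[@color='white']")]

# //pet[@type="dog"]/@color, approximated by reading the attribute value
dog_colors = [p.get("color") for p in pets.findall(".//pet[@type='dog']")]

# /pets/pet[text()="Max"]/@type, using ElementTree's [.='text'] form
max_types = [p.get("type") for p in pets.findall("pet[.='Max']")]

print(all_pets, first_pet, dogs, white_pets, dog_colors, max_types)
```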

Page 21: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Wildcards

Wildcard   Description
*          Matches any element node
@*         Matches any attribute node
node()     Matches any node of any kind

Path Expression   Result
/bookstore/*      Selects all the child nodes of the bookstore element
//*               Selects all elements in the document
//title[@*]       Selects all title elements which have any attribute

Page 22: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Syntax: Selecting Multiple Paths

Path Expression                    Result
//book/title | //book/price        All the title and price elements of all book elements
//title | //price                  All title and price elements
/bookstore/book/title | //price    All the title elements of the book elements in bookstore, and all price elements

Page 23: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Axis

• ancestor::author

• parent::author

• child::firstname , (child::*) , child::node()

• descendant::author

• preceding-sibling::author

• following-sibling::author

• attribute::title
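The relationships behind these axes can be made concrete with a small script. This is only a sketch, assuming Python's standard xml.etree.ElementTree, which does not implement XPath axes at all. The helper functions ancestors, preceding_siblings and following_siblings are hypothetical names that emulate the axes through a child-to-parent map, and the sample document is an assumption modelled on the bookstore example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<bookstore><book>"
    "<title>Harry Potter</title><author>J K. Rowling</author><price>29.99</price>"
    "</book></bookstore>"
)

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent = {child: p for p in doc.iter() for child in p}

def ancestors(node):
    """Emulates ancestor::*, nearest ancestor first."""
    found = []
    while node in parent:
        node = parent[node]
        found.append(node)
    return found

def preceding_siblings(node):
    """Emulates preceding-sibling::*, returned in document order."""
    sibs = list(parent[node])
    return sibs[: sibs.index(node)]

def following_siblings(node):
    """Emulates following-sibling::*, returned in document order."""
    sibs = list(parent[node])
    return sibs[sibs.index(node) + 1 :]

author = doc.find(".//author")

parent_tag = parent[author].tag                              # parent::*
ancestor_tags = [e.tag for e in ancestors(author)]           # ancestor::*
child_tags = [e.tag for e in parent[author]]                 # child::* of <book>
descendant_tags = [e.tag for e in doc.iter()][1:]            # descendant::* of the root
before = [e.tag for e in preceding_siblings(author)]         # preceding-sibling::*
after = [e.tag for e in following_siblings(author)]          # following-sibling::*

print(parent_tag, ancestor_tags, child_tags, descendant_tags, before, after)
```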

Page 24: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Functions

Node-Set   Takes a node-set argument, returns a node-set, or returns/provides information about a particular node within a node-set.

String     Performs evaluations, formatting, and manipulation on string arguments.

Boolean    Evaluates the argument expressions to obtain a Boolean result.

Number     Evaluates the argument expressions to obtain a numeric result.

Page 25: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Functions: Node Set

node-set    count(node-set)
//emp       3
//emp[1]    1

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <emp id="S001">
    <name>ABC</name>
    <salary>5000</salary>
  </emp>
  <emp id="S002">
    <name>PQR</name>
    <salary>7000</salary>
  </emp>
  <emp id="S003">
    <name>XYZ</name>
    <salary>9000</salary>
  </emp>
</root>

Page 26: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Functions: Node Set

Function: last()

//emp[last()]

Result:
<emp id="S003">
  <name>XYZ</name>
  <salary>9000</salary>
</emp>

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <emp id="S001">
    <name>ABC</name>
    <salary>5000</salary>
  </emp>
  <emp id="S002">
    <name>PQR</name>
    <salary>7000</salary>
  </emp>
  <emp id="S003">
    <name>XYZ</name>
    <salary>9000</salary>
  </emp>
</root>
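The count() and last() results on the two node-set slides can be verified with a short script. This is a sketch, assuming Python's standard xml.etree.ElementTree, which supports the [1] and [last()] predicates but has no count() function, so the size of the returned list stands in for count(node-set).

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""<root>
  <emp id="S001"><name>ABC</name><salary>5000</salary></emp>
  <emp id="S002"><name>PQR</name><salary>7000</salary></emp>
  <emp id="S003"><name>XYZ</name><salary>9000</salary></emp>
</root>""")

# count(//emp): the length of the selected node list stands in for count()
count_all = len(root.findall(".//emp"))

# count(//emp[1]): only the first emp child of each parent is selected
count_first = len(root.findall(".//emp[1]"))

# //emp[last()]: the last emp element among its siblings
last_emp = root.find(".//emp[last()]")
last_id = last_emp.get("id")
last_name = last_emp.findtext("name")

print(count_all, count_first, last_id, last_name)
```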

Page 27: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Functions: String

• string concat("abc", "d", "ef", "g") returns "abcdefg"
• boolean contains("Mobily", "bil") returns true
• string normalize-space(" abc def ") returns "abc def"
• boolean starts-with(string, string)
• number string-length("abcd") returns 4
• string substring("12345", 2, 3) returns "234"
• …
• …
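To make the behaviour of these string functions concrete, here is a rough Python emulation. It is a sketch, not an XPath engine: the function names (concat, contains, normalize_space, starts_with, string_length, substring) are hypothetical helpers, and substring is simplified to integer arguments. The point worth noticing is that XPath counts string positions from 1, whereas Python counts from 0.

```python
import re

def concat(*parts):
    """concat(s1, s2, ...): join all arguments."""
    return "".join(parts)

def contains(haystack, needle):
    """contains(s1, s2): true if s1 contains s2."""
    return needle in haystack

def normalize_space(s):
    """normalize-space(s): trim, then collapse runs of XML whitespace to one space."""
    return " ".join(part for part in re.split(r"[ \t\r\n]+", s) if part)

def starts_with(s, prefix):
    """starts-with(s1, s2): true if s1 begins with s2."""
    return s.startswith(prefix)

def string_length(s):
    """string-length(s): number of characters."""
    return len(s)

def substring(s, start, length):
    """substring(s, start, length) for integer arguments; positions are 1-based."""
    begin = max(start, 1)      # positions before 1 do not exist
    end = start + length       # first position that is excluded
    return s[begin - 1 : max(end - 1, 0)]

print(concat("abc", "d", "ef", "g"))
print(contains("Mobily", "bil"))
print(repr(normalize_space(" abc def ")))
print(string_length("abcd"))
print(substring("12345", 2, 3))
```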

Page 28: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath Functions: Number

• ceiling(2.5) = 3
• floor(3.5) = 3
• number(arg)
  – number('2048') = 2048
  – number('-2048') = -2048
  – number('text') = NaN
  – number('109.54') = 109.54
• round(2.6) = 3, round(2.4) = 2, round(2.5) = 3
• number sum(node-set)
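Two of these functions behave differently from their closest Python counterparts, which the following sketch illustrates. The helpers xpath_number and xpath_round are hypothetical names, written under the assumption of XPath 1.0 semantics: number() yields NaN for a string that is not a plain decimal number, and round() rounds halves upward, while Python's float() raises an error and Python's built-in round() rounds halves to the nearest even integer. The sketch handles finite inputs only.

```python
import math
import re

# XPath 1.0 number syntax: optional whitespace, optional minus, decimal digits.
_NUMBER = re.compile(r"^[ \t\r\n]*-?(\d+(\.\d*)?|\.\d+)[ \t\r\n]*$")

def xpath_number(text):
    """Emulates number(string): NaN instead of an error for non-numeric text."""
    return float(text) if _NUMBER.match(text) else math.nan

def xpath_round(x):
    """Emulates round(x) for finite x: halves round toward positive infinity."""
    return math.floor(x + 0.5)

print(math.ceil(2.5), math.floor(3.5))          # ceiling(2.5), floor(3.5)
print(xpath_number("2048"), xpath_number("-2048"), xpath_number("109.54"))
print(math.isnan(xpath_number("text")))         # number('text') is NaN
print(xpath_round(2.6), xpath_round(2.4), xpath_round(2.5))
print(round(2.5))                               # Python's own round() gives 2 here
```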

Page 29: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

Example for XPath Queries

<bib>
  <book>
    <publisher> Addison-Wesley </publisher>
    <author> Serge Abiteboul </author>
    <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author>
    <author> Victor Vianu </author>
    <title> Foundations of Databases </title>
    <year> 1995 </year>
  </book>
  <book price="55">
    <publisher> Freeman </publisher>
    <author> Jeffrey D. Ullman </author>
    <title> Principles of Database and Knowledge Base Systems </title>
    <year> 1998 </year>
  </book>
</bib>

Page 30: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

Data Model for XPath

[Tree diagram]
The root
  Processing instruction
  Comment
  bib (the root element)
    book
      publisher: Addison-Wesley
      author: Serge Abiteboul
      . . . .
    book

Much like the XQuery data model

Page 31: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

The Root and the Root

• <bib> <paper> 1 </paper> <paper> 2 </paper> </bib>

• bib is the “document element”

• The “root” is above bib

• /bib = returns the document element

• / = returns the root

• Why? Because we may have comments before and after <bib>; they become siblings of <bib>

• This is advanced xmlogy
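The difference between the root and the document element can be observed directly. This is a small sketch, assuming Python's standard xml.dom.minidom module; the comments placed before and after <bib> are added here purely for illustration. The Document object plays the role of the XPath root, and its children are the comments together with the document element.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<!-- before -->"
    "<bib><paper>1</paper><paper>2</paper></bib>"
    "<!-- after -->"
)

# Children of the root: the two comments are siblings of <bib>
root_children = [node.nodeName for node in doc.childNodes]

# The document element is the single element child of the root
document_element = doc.documentElement.tagName

print(root_children)
print(document_element)
```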

Page 32: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Simple Expressions

/bib/book/year

Result: <year> 1995 </year>

<year> 1998 </year>

/bib/paper/year

Result: empty (there were no papers)

Page 33: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Restricted Kleene Closure

//author

Result: <author> Serge Abiteboul </author>
        <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author>
        <author> Victor Vianu </author>
        <author> Jeffrey D. Ullman </author>

/bib//first-name

Result: <first-name> Rick </first-name>
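The simple path and descendant queries on the bib document can be run as follows. This is a sketch, assuming Python's standard xml.etree.ElementTree; since its paths are relative to the root element <bib>, /bib/book/year is written as book/year and // is written as ".//".

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher> Addison-Wesley </publisher>
    <author> Serge Abiteboul </author>
    <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author>
    <author> Victor Vianu </author>
    <title> Foundations of Databases </title>
    <year> 1995 </year>
  </book>
  <book price="55">
    <publisher> Freeman </publisher>
    <author> Jeffrey D. Ullman </author>
    <title> Principles of Database and Knowledge Base Systems </title>
    <year> 1998 </year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib/book/year
years = [y.text.strip() for y in bib.findall("book/year")]

# /bib/paper/year: no paper elements, so the result is empty
paper_years = bib.findall("paper/year")

# //author: author elements at any depth
author_count = len(bib.findall(".//author"))

# /bib//first-name
first_names = [f.text.strip() for f in bib.findall(".//first-name")]

print(years, paper_years, author_count, first_names)
```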

Page 34: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Functions

/bib/book/author/text()

Result: Serge Abiteboul
        Victor Vianu
        Jeffrey D. Ullman

Rick Hull doesn't appear because his author element contains first-name and last-name child elements rather than text.

Functions in XPath:
– text() = matches the text value
– node() = matches any node (= * or @* or text())
– name() = returns the name of the current tag

Page 35: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Wildcard

//author/*

Result: <first-name> Rick </first-name>

<last-name> Hull </last-name>

* Matches any element

Page 36: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Attribute Nodes

/bib/book/@price

Result: "55"

@price means that price has to be an attribute

Page 37: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Qualifiers

/bib/book/author[first-name]

Result: <author> <first-name> Rick </first-name>
                 <last-name> Hull </last-name>
        </author>

Page 38: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: More Qualifiers

/bib/book[@price < "60"]

/bib/book[author/@age < "25"]

/bib/book[author/text()]
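Qualifiers can be tried on the bib document with the sketch below, assuming Python's standard xml.etree.ElementTree. The child-existence qualifier author[first-name] is supported directly. The numeric comparison on @price is not, so it is emulated with a Python filter, which is an illustration choice rather than real XPath evaluation.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
  <book>
    <publisher> Addison-Wesley </publisher>
    <author> Serge Abiteboul </author>
    <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author>
    <author> Victor Vianu </author>
    <title> Foundations of Databases </title>
    <year> 1995 </year>
  </book>
  <book price="55">
    <publisher> Freeman </publisher>
    <author> Jeffrey D. Ullman </author>
    <title> Principles of Database and Knowledge Base Systems </title>
    <year> 1998 </year>
  </book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib/book/author[first-name]: authors that have a first-name child element
structured = bib.findall("book/author[first-name]")
structured_last_names = [a.findtext("last-name").strip() for a in structured]

# /bib/book[@price < "60"]: emulated, since ElementTree has no "<" in predicates.
# Books without a price attribute are skipped, as they would be in XPath.
cheap_titles = [
    b.findtext("title").strip()
    for b in bib.findall("book")
    if b.get("price") is not None and float(b.get("price")) < 60
]

print(structured_last_names)
print(cheap_titles)
```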

Page 39: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

XPath: Summary

bib matches a bib element

* matches any element

/ matches the root

/bib matches a bib element under root

bib/paper matches a paper in bib

bib//paper matches a paper in bib, at any depth

//paper matches a paper at any depth

paper|book matches a paper or a book

@price matches a price attribute

bib/book/@price matches price attribute in book, in bib

bib/book[@price<"55"]/author/last-name matches…

Page 40: IS432 Semi-Structured Data Lecture 4: XPath Dr. Gamal Al-Shorbagy.

References

• http://www.w3schools.com/xpath/default.asp

• http://msdn.microsoft.com/en-us/library/ms256086.aspx

• http://www.xpathtester.com/test

• http://oreilly.com/catalog/xmlnut/chapter/ch09.html
