XML Processing Moves Forward: XSLT 2.0 and XQuery 1.0
Michael Kay
Prague 2005
2
About me
• Database background
• Started using XML in 1998 for content management applications
• Author of XSLT Programmer’s Reference
• Developer of Saxon XSLT processor
• Member of W3C XSL and XQuery Working Groups
• Founded SAXONICA March 2004
3
Contents
• A tour of the new specs
• What’s significant about XSLT 2.0
• A quick demo
• Why XQuery?
4
The QT Specification Family
• XSLT 2.0, XQuery 1.0
• XPath 2.0
• Data Model
• XML Schema
• Functions and Operators
5
Standards maturity
• Chart axes: Maturity (levels marked CR and REC) against Time
• Specifications plotted: XML, XML Schema, XSLT 1.0, XPath 1.0, XQuery, XSLT 2.0, XPath 2.0
6
A family of standards
• XML Schema
• XPath 1.0
• XPath 2.0
• XQuery 1.0
• XSLT 1.0
• XSLT 2.0
7
XSLT and XQuery
• Diagram axis: from Documents to Data
• Plotted along it: XSLT, XQuery
8
What’s new in XSLT 2.0
• New Processing Model
• Major Features
  – grouping
  – regular expressions
  – functions
  – schema support
• Many “minor” features
9
Some “minor” features
XSLT 2.0
• Temporary trees
• Multiple Output Files
• Format date/time
• Tunnel parameters
• Declared variable types
• Multi-mode templates
• xsl:next-match
• conditional compilation
• XHTML serialization
• xsl:namespace
• separator=“,”
• character maps
XPath 2.0
• Sequences
• if..then..else
• for $x in X return f($x)
• some/every
• except/intersect
• $n is $m
Function library
• String functions
• Regex functions
• Date/time arithmetic
• URI handling
• min(), max(), avg()
10
Handling unstructured text
• unparsed-text() function
  – reads a text file into a string
• tokenize() function
  – splits a string into substrings
• xsl:analyze-string
  – parses a string and generates markup
11
Regular expression functions
• matches()
  – tests whether a string matches a regex
  – if (matches($in, '[A-Z]{3}[0-9]{3}')) ...
• tokenize()
  – splits a string into substrings
  – the regex matches the separator
  – for $s in tokenize($in, ',\s?') ...
• replace()
  – replaces every occurrence of a match
  – replace($in, '\s', '%20')
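The behaviour of the three functions can be tried out with a rough Python analogue. This is only a sketch: the helper names mirror the XPath functions but are defined here, Python's `re` dialect is not identical to the XPath 2.0 regex dialect, and the sample strings are invented for illustration.

```python
import re

def matches(text, pattern):
    # XPath matches() succeeds when any substring matches,
    # so the analogue is re.search, not re.fullmatch.
    return re.search(pattern, text) is not None

def tokenize(text, separator):
    # XPath tokenize() yields an empty sequence for a zero-length input,
    # whereas re.split would return [''].
    if text == "":
        return []
    return re.split(separator, text)

def replace(text, pattern, replacement):
    # Replaces every non-overlapping match, like XPath replace().
    return re.sub(pattern, replacement, text)

print(matches("ref ABC123", r"[A-Z]{3}[0-9]{3}"))   # True
print(tokenize("red, green,blue", r",\s?"))         # ['red', 'green', 'blue']
print(replace("a b c", r"\s", "%20"))               # a%20b%20c
```

One difference worth keeping in mind: XPath refers to captured groups in the replacement string as `$1`, while Python uses `\1`.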
12
Grouping
• Takes any sequence as input
• Divides the items into groups
• Applies processing to each group
group-by: items with a common value for a grouping key
group-adjacent: adjacent items with a common grouping key
group-starting-with: pattern to match first item in each group
group-ending-with: pattern to match last item in each group
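The difference between the grouping modes is easiest to see on a small sequence. The following Python sketch imitates three of them; it is an analogue, not XSLT, and the function names and sample records are hypothetical.

```python
from itertools import groupby

def group_by(items, key):
    # Like group-by: one group per distinct key, in order of first appearance.
    groups = {}
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return list(groups.items())

def group_adjacent(items, key):
    # Like group-adjacent: a new group starts whenever the key changes.
    return [(k, list(g)) for k, g in groupby(items, key)]

def group_starting_with(items, starts):
    # Like group-starting-with: a new group starts at each item satisfying
    # the test (the first item always opens a group).
    groups = []
    for item in items:
        if starts(item) or not groups:
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups

# Hypothetical sample data: (title, publisher) pairs.
books = [("A", "P1"), ("B", "P2"), ("C", "P1")]
publisher = lambda book: book[1]

print(group_by(books, publisher))        # P1 holds A and C, P2 holds B
print(group_adjacent(books, publisher))  # three groups: P1, P2, P1
print(group_starting_with(["h", "p", "p", "h", "p"], lambda x: x == "h"))
```

With group-by, books A and C end up together even though B sits between them; with group-adjacent they stay in separate groups because they are not neighbours.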
13
Grouping by Value
<xsl:for-each-group select="book" group-by="publisher">
  <xsl:sort select="current-grouping-key()"/>
  <h2>Publisher: <xsl:value-of select="current-grouping-key()"/></h2>
  <xsl:for-each select="current-group()">
    <xsl:sort select="title"/>
    <p>author: <xsl:value-of select="author"/></p>
    <p>title: <xsl:value-of select="title"/></p>
  </xsl:for-each>
</xsl:for-each-group>
14
User-defined Functions
• Written like named templates
• Called from XPath
• Return a result
<xsl:function name="ged:date-to-ISO" as="xs:date">
  <xsl:param name="in" as="ged:date"/>
  <xsl:sequence select="xs:date(concat(
      substring($in, 8, 4), '-',
      format-number(index-of(('JAN', 'FEB', ...), substring($in, 4, 3)), '00'), '-',
      substring($in, 1, 2)))"/>
</xsl:function>
<xsl:sort select="ged:date-to-ISO(@birth-date)"/>
15
XQuery 1.0
• Designed to query XML databases
• Also handles in-memory transformations
• Well supported by database vendors
16
XQuery Example: Join two tables
xquery version "1.0";
<results> {
  for $p in doc("auction.xml")/site/people/person
  let $a := for $t in doc("auction.xml")/site/closed_auctions/closed_auction
            where $t/buyer/@person = $p/@id
            return $t
  return <item person="{$p/name}"> {count($a)} </item>
} </results>
XMark Q8
17
XSLT Equivalent
<result xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:for-each select="/site/people/person">
    <xsl:variable name="a"
        select="/site/closed_auctions/closed_auction[buyer/@person = current()/@id]"/>
    <item person="{name}">
      <xsl:value-of select="count($a)"/>
    </item>
  </xsl:for-each>
</result>
XMark Q8
18
Optimization
• With multi-GB databases, using indexes is essential
• XQuery does not have template rules
• This makes it possible to do static analysis and join optimization
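To see why join optimization matters for a query like XMark Q8, compare a naive nested-loop join with one that builds an index first. The Python sketch below is illustrative only: it does not show how Saxon or any other processor implements the optimization, and the function names and sample ids are hypothetical.

```python
from collections import Counter

def count_purchases_nested(person_ids, buyer_ids):
    # Naive join: rescan every closed auction for every person.
    # Work grows as len(person_ids) * len(buyer_ids), i.e. O(n^2).
    return {p: sum(1 for b in buyer_ids if b == p) for p in person_ids}

def count_purchases_indexed(person_ids, buyer_ids):
    # Optimized join: one pass to build an index keyed on buyer id,
    # then one lookup per person. Work grows as O(n).
    index = Counter(buyer_ids)
    return {p: index[p] for p in person_ids}

# Hypothetical data standing in for person/@id and buyer/@person values.
person_ids = ["person0", "person1", "person2"]
buyer_ids = ["person1", "person0", "person1"]

print(count_purchases_nested(person_ids, buyer_ids))
print(count_purchases_indexed(person_ids, buyer_ids))
```

Both functions return the same counts; only the amount of work differs, which is the kind of rewrite a processor can apply once it can recognize the join statically.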
19
XMark Q8 results (msecs)
Language  Processor   1Mb    4Mb     10Mb
XSLT      Xalan       1503   11006   65855
XSLT      xt          160    2253    16414
XSLT      MSXML       33     519     4248
XSLT      Saxon 8.4   90     1340    11126
XQuery    Saxon 8.4   136    1575    11947
XQuery    Qizx        351    711     1813
XQuery    Galax       1870   6672    16625
Complexity labels shown on the slide: O(n²), O(n)
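The complexity labels can be checked roughly against the figures on this slide. The sketch below estimates an empirical scaling exponent from the 1Mb and 10Mb timings, assuming the two documents differ in size by a factor of ten; an exponent near 2 suggests quadratic behaviour and one near 1 suggests linear behaviour. The function name is hypothetical, and two data points give only a coarse estimate.

```python
import math

# (1Mb msecs, 10Mb msecs) as given on this slide.
timings = {
    "Xalan (XSLT)": (1503, 65855),
    "xt (XSLT)": (160, 16414),
    "MSXML (XSLT)": (33, 4248),
    "Saxon 8.4 (XSLT)": (90, 11126),
    "Saxon 8.4 (XQuery)": (136, 11947),
    "Qizx (XQuery)": (351, 1813),
    "Galax (XQuery)": (1870, 16625),
}

def scaling_exponent(t_small, t_large, size_ratio=10):
    # If time ~ size^k, then k = log(t_large / t_small) / log(size_ratio).
    return math.log(t_large / t_small) / math.log(size_ratio)

for name, (t_1mb, t_10mb) in timings.items():
    print(f"{name}: {scaling_exponent(t_1mb, t_10mb):.2f}")
```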
20
Two can play at that game!
Language  Processor   1Mb    4Mb     10Mb
XSLT      Xalan       1503   11006   65855
XSLT      xt          160    2253    16414
XSLT      MSXML       33     519     4248
XSLT      Saxon 8.5   27     26      45
XQuery    Saxon 8.5   16     16      31
XQuery    Qizx        351    711     1813
XQuery    Galax       1870   6672    16625
Complexity labels shown on the slide: O(n²), O(n)
caveat: this is one query only!
21
Conclusions
• XSLT 2.0 and XQuery 1.0 are nearly ready
• XSLT 2.0 has many powerful new features, making new applications possible
• XQuery 1.0 is designed for optimization against very large databases