Page 1: XQuery Reloaded 2009-09-25 - Brown University, cs.brown.edu/people/tkraska/pub/vldb09-xquery_reloaded...

XQuery Reloaded

Roger Bamford, Vinayak Borkar, Matthias Brantner, Peter M. Fischer, Daniela Florescu, David Graf, Donald Kossmann, Tim Kraska, Dan Muresan, Sorin Nasoi, Markos Zacharioudakis

Systems Group/ETH Zurich 27.09.2009

XQuery

Page 2

[Figure: Google searches for “XQuery” (normalized). Vertical axis: 0 to 100; horizontal axis: dates from 2004-01-04 to 2009-07-05.]

Page 3

[Figure: Google searches for “SQL” (normalized). Vertical axis: 0 to 100; horizontal axis: dates from 2004-01-04 to 2009-07-26.]

Page 4

XQuery folklore

- XML and XQuery are slow
- XQuery is complicated
- Legacy of XML, Namespaces, Schema, XPath
- Bad products
- No people
- ...

Page 5

XQuery folklore

- XML and XQuery are slow
  - partly true, but products are catching up
  - highly optimizable (like SQL; better than Java) [Boncz06]
- XQuery is complicated
  - is skiing more complicated than snowboarding?
  - try to process Web pages, RSS feeds with Java!
- Legacy of XML, Namespaces, Schema, XPath
  - yes, but there is no alternative (relevance!)
- Bad products
  - huge investments: research projects, big players
- No people
  - courses at all top places (ETH, Stanford, etc.)
- ...

[http://www.ibm.com/developerworks/xml/library/x-xqmyth.html]
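As a rough illustration of the "process RSS feeds" remark, the sketch below turns a small feed into an HTML list. The feed is embedded inline and entirely made up (titles and example.org links are assumptions, not from the slides); a real application would fetch the feed first, which is outside plain XQuery 1.0 and not shown here.

```xquery
(: Hypothetical inline RSS feed; titles and links are invented for illustration. :)
let $feed :=
  <rss version="2.0">
    <channel>
      <title>Example feed</title>
      <item><title>First post</title><link>http://example.org/1</link></item>
      <item><title>Second post</title><link>http://example.org/2</link></item>
    </channel>
  </rss>
return
  <ul>{
    (: one list entry per feed item, linking to the item's URL :)
    for $item in $feed/channel/item
    return <li><a href="{$item/link}">{string($item/title)}</a></li>
  }</ul>
```

The query and the markup it produces share one syntax, which is the point the slide hints at: there is no separate parsing, object mapping, or templating step.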

Page 6

This talk

Clear your mind and get rid of prejudices

- Explain XQuery in a nutshell
- XQuery today and tomorrow
- Introduce Zorba – the MySQL for XQuery
- Show some fancy usages of Zorba

Not in this talk: how to design an XQuery processor. If you are interested in that, talk to one of us after the talk.

Page 7

The roots of XQuery…

- Origins go back to the QL workshop 1998
- Standardized by the W3C (recommendation since 2007)
- Started out as a query language
- Closed data model, composition
  - results of expressions can be input for expressions
- Compliant with other standards
  - XML, XML Schema, XPath, Web Services, ...
- Example query:

    for $emp in //employees
    let $name := $emp/name
    where $emp/salary > 5000
    order by $name
    return $name
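A self-contained variant of the example query, as a minimal sketch. The inline sample document, the per-person `employee` element, and all names and salaries are assumptions made for illustration; they are not from the slides. Iterating over one `employee` at a time ensures that `where` and `order by` each see a single salary and name.

```xquery
(: Hypothetical sample data; element names and values are invented for illustration. :)
let $doc :=
  <company>
    <employees>
      <employee><name>Alice</name><salary>6000</salary></employee>
      <employee><name>Bob</name><salary>4000</salary></employee>
      <employee><name>Carol</name><salary>7500</salary></employee>
    </employees>
  </company>
for $emp in $doc//employee        (: bind $emp to each employee in turn :)
let $name := $emp/name
where $emp/salary > 5000          (: untyped salary is compared numerically :)
order by $name
return $name
```

On this sample data the query returns the `name` elements for Alice and Carol, in that order. Because the result is itself a sequence of nodes, it could be fed straight into another expression, which is the "closed data model" point made above.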


Page 8

Today, XQuery is much more…

XQuery = Query + Update + Fulltext + Scripting + Streaming + Libraries

- X
  - XQuery is the only language for XML, but that does not mean that XML is all it can do
  - CSV, JSON, HTML, …
  - spectrum: structured data to unstructured text
- Query
  - XQuery has joins, group-by, sorting, etc., but that does not mean that it is only good for the DB
  - by now, full-fledged programming language
  - modules for structured programming
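To make the "joins, sorting, programming language" points a little more concrete, here is a small sketch in plain XQuery 1.0. The data, the element and attribute names, and the function `local:label` are all hypothetical. The function is declared in the main module for brevity; a library module would package such functions under its own namespace.

```xquery
(: Hypothetical helper function; the name and signature are invented for illustration. :)
declare function local:label($name as xs:string, $city as xs:string) as xs:string {
  concat($name, " (", $city, ")")
};

(: Hypothetical data: two small sequences to be joined on the person id. :)
let $people :=
  (<person id="1"><name>Alice</name></person>,
   <person id="2"><name>Bob</name></person>)
let $addresses :=
  (<address person="2"><city>Zurich</city></address>,
   <address person="1"><city>Lyon</city></address>)
for $p in $people, $a in $addresses
where $p/@id = $a/@person          (: join predicate :)
order by $p/name                   (: sorting :)
return local:label(string($p/name), string($a/city))
```

On this data the query yields the two strings "Alice (Lyon)" and "Bob (Zurich)". The FLWOR expression does the database-style work (join, sort), while the function declaration is ordinary programming-language structure.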

The name “XQuery” is a misnomer!!!

Page 9

XQuery is alive

Most successful in the middle tier

- Data integration
- Configuration
- Reporting

but also in the database world

(Oracle has 8000+ customers reporting XQuery bugs)

…why else would so many people from different companies care to have their names on the paper?

Page 10

XQuery in the future: Gartner’s Top Ten disruptive technologies for 2008 to 2012

- Cloud computing and cloud/web platforms
- Multi-core and hybrid processors
- Virtualisation and fabric computing
- Social networks and social software
- Web mashups
- User interface
- Ubiquitous computing
- Contextual computing
- Augmented reality
- Semantics

Page 11

What is needed: A programming language for the Web

- Machine-to-Machine Communication between/inside distributed systems across company boundaries
- Machine-to-Human Communication over the browser, event-based interaction
- Variety of workloads: Updates, OLTP/OLAP/Streaming queries, structured and semi-structured data, varying and evolving over time

IMHO XQuery is the best starting point

XQuery is not perfect, but solves many of the problems

Page 12

XQuery in the future: Gartner’s Top Ten disruptive technologies for 2008 to 2012

- Cloud computing and cloud/web platforms
- Multi-core and hybrid processors
- Virtualisation and fabric computing
- Social networks and social software
- Web mashups
- User interface
- Ubiquitous computing
- Contextual computing
- Augmented reality
- Semantics

Page 13

XQuery mashups

“Gartner predicts that web mashups, which mix content from publicly available sources, will be the dominant model (80 percent) for the creation of new enterprise applications.”

- Requires integrating different sources
- XQuery is made for the web
  - Works natively with XML and JSON
  - Has native support for REST and Web Services
- XQuery is a programming language

Page 14

XQuery and cloud computing/Web platforms

[Diagram: a conventional multi-tier stack between an incoming XML message and an outgoing XML message]

- communication with the world (XML): XML Protocol (SOAP), XML Schema validation, XSLT/XQuery evaluation
- tier boundary: XML, Java/C#, XML
- application logic (Java/C#, JavaScript): application logic, data validation, error handling, caching, replication and distribution
- tier boundary: Java/C#, SQL, Java/C#
- data persistence and integrity; transactions: SQL queries and updates, integrity constraints, triggers, transactions, etc.

Page 15

XQuery and cloud computing/Web platforms

[Diagram: a single XQuery layer (XML Schema, XML Protocols, …) between the incoming XML message and the outgoing XML message, covering communication with the world, application logic, data persistence and integrity, and transactions]

- No impedance mismatch
- Reduce the number of “hops”
- XQuery is made for the web standards

Page 16

XQuery and new hardware / multi-cores

- Requires a highly parallelizable programming language
- Declarative (functional) programming model
- XQuery is highly optimizable and parallelizable (just as SQL is)
- Made for bulk processing

Page 17

If you do not buy all that: Still, XML is out there and it will not disappear

- XML is the best choice for communication data
  - general: web services (SOAP, WSDL), REST, RSS
  - specific: XBRL, HL7, ebXML, RosettaNet, ...
- XML is the best choice for meta-data and code
  - configuration files, XMI (Eclipse), XForms (apps), XMP (photography), XAML / InfoPath (UIs), ...
- XML is the best choice for documents
  - XHTML, SVG, OpenXML (Office), UBL (business)

XQuery is the cheapest way to process XML (and JSON)

Page 18

What is missing is a MySQL for XQuery


Page 19

Zorba

- Intended to be the MySQL for XQuery
- Developed by the FLWOR Foundation
- A team of 5 full-time programmers and several volunteers
- Contributing companies/organisations

http://www.zorba-xquery.com/

Page 20

What makes Zorba different?

- Not a “research” project!!!
- Open-source “Apache License”
- Zorba is a query processor like MySQL
- Exchangeable store
  - comes with an in-memory store
  - feel free to implement other stores (like 28msec)
- Zorba is designed for compliance first!
- Feature complete (except for full-text)
  - full-standard and not just the researchy features
- C++, interfaces to all main languages (Java, PHP, …)

Page 21

Zorba Architecture


Page 22

Zorba libraries/features

- XQuery 1.1, Update, Scripting
- Debugger
- Eval
- JSON
- REST/HTTP
- XQDoc
- PDF
- E-Mail
- Collections
- Tidy
- ...

Page 23

Cool stuff with Zorba: XQuery in the browser

[Diagram labels: Web browser, Plug In, Zorba, Events, DOM, Store, Data access and manipulation]

http://www.xqib.org

Page 24

Cool stuff with Zorba: 28msec WebServer

[Diagram: three Zorba instances]

http://www.28msec.com

Page 25

Cool stuff with Zorba: Xadoop

- Hadoop + Zorba
- Position XQuery as an alternative to Pig/Hive/etc.
- Well-suited for semi-structured data
- Use cases
  - pre-processing
  - financial data
  - analysis of blogs/Wikipedia etc.
  - …
- Pre-packed on EC2

So far, just a toy project with an open ending...

Page 26

XQuery Benchmarking Service

http://xqbench.org

- Supports several benchmarks
  - XMark
  - TPC-W like scenario
- Automatic test infrastructure
- Browse results
- Lets you post your own queries and documents

Main goal: put pressure on the vendors and give use cases for the open-source community

Page 27

Conclusion

- Zorba as the MySQL for XQuery
- XQuery combines a database and a programming language
- XQuery is ready for the next step

- FLWOR Foundation: http://www.flworfound.org
- Zorba XQuery Processor: http://www.zorba-xquery.com
- XQuery in the Browser: http://www.xqib.org
- XQuery Benchmarking Service: http://www.xqbench.org
- XQuery Eclipse Plug-In: http://www.xqdt.org
- Xadoop (not yet online): http://www.xadoop.org/

