
XQuery Updates in MonetDB/XQuery

Page 1: XQuery Updates in MonetDB/XQuery

ADT 2008
Lecture 5

XQuery Updates in MonetDB/XQuery

Stefan Manegold
[email protected]
http://www.cwi.nl/~manegold/

Page 2: XQuery Updates in MonetDB/XQuery

[email protected] Lecture 5: XQuery Updates ADT 2008

Staircase Join [VLDB03]

Generate a duplicate-free result in document order:
• pruning: reduce the context set a priori
• partitioning: single sequential pass over the document
• skipping: avoid touching node ranges that cannot contain results

[Figure: the document and the list of context nodes; access pattern: seek, scan, skip, seek, scan, skip, ...]
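The three rules can be illustrated with a minimal Python sketch for the descendant axis. It is not the MonetDB implementation: it assumes a pre/size encoding (size = number of descendants), the example document is the ten-node tree a..j used later in this lecture, and the function names are hypothetical. With this encoding the descendants of a node form one contiguous range of pre ranks, so skipping reduces to jumping from the end of one partition to the next context node.

```python
from typing import List

# size[pre] = number of descendants of the node with preorder rank `pre`
# (assumed example document a..j; not taken from a real database)
SIZE = [9, 3, 2, 0, 0, 4, 0, 2, 0, 0]


def prune_descendant(context: List[int], size: List[int]) -> List[int]:
    """Pruning: drop context nodes that lie inside the subtree of an earlier one."""
    pruned: List[int] = []
    covered_until = -1  # last pre rank covered by the previously kept node
    for c in sorted(set(context)):
        if c > covered_until:
            pruned.append(c)
            covered_until = c + size[c]
    return pruned


def staircase_descendant(context: List[int], size: List[int]) -> List[int]:
    """All descendants of the context nodes, duplicate-free, in document order."""
    result: List[int] = []
    for c in prune_descendant(context, size):
        # Partitioning: after pruning the subtrees are disjoint, so each
        # partition (c, c + size[c]] is scanned exactly once, left to right.
        # Skipping: nodes between two partitions are never touched.
        result.extend(range(c + 1, c + size[c] + 1))
    return result


if __name__ == "__main__":
    print(staircase_descendant([2, 1, 7, 8], SIZE))
```

Context nodes 2 and 8 are pruned because they lie inside the subtrees of 1 and 7, so no result node is produced twice and no sorting or duplicate elimination is needed afterwards.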

Page 3: XQuery Updates in MonetDB/XQuery

Loop-lifted XPath Steps

Many algorithms have been proposed and studied for XPath evaluation:
• Dataguide-based
• Structural Join
• Staircase Join
• Holistic Twig Join

IN: sequence of context nodes (in document order)
OUT: sequence of document nodes (unique, in document order)

Page 4: XQuery Updates in MonetDB/XQuery

Loop-lifted XPath Steps

In XQuery, expressions generally occur inside FLWOR blocks, i.e. inside a for-loop:

for $x in doc()//employee return $x/ancestor::department

Choice:
• call the XPath algorithm N times, accessing the document and index structures N times
• use a loop-lifted algorithm:
  IN: for each iteration, a sequence of context nodes
  OUT: for each iteration, a sequence of document nodes (unique per iteration, in document order)

Page 5: XQuery Updates in MonetDB/XQuery

Staircase join

[Figure: the document and the list of context nodes]

Page 6: XQuery Updates in MonetDB/XQuery

Loop-lifted staircase join

[Figure: staircase join over the document with one list of context nodes; loop-lifted staircase join over the document with multiple lists of context nodes and an active stack]

Adapt the pruning, partitioning and skipping rules to correctly deal with multiple context sets.
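A simplified Python sketch of a loop-lifted descendant step with an active stack follows. It is an illustration under assumptions, not the MonetDB algorithm: the document uses an assumed pre/size encoding (the ten-node tree a..j), the input is a list of (iteration, pre) pairs, and all names are hypothetical. The document is scanned once for all iterations; the stack records which iterations are currently inside the subtree of one of their context nodes.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# size[pre] = number of descendants (assumed example document a..j)
SIZE = [9, 3, 2, 0, 0, 4, 0, 2, 0, 0]


def loop_lifted_descendant(ctx: List[Tuple[int, int]],
                           size: List[int]) -> List[Tuple[int, int]]:
    """ctx holds (iteration, pre) pairs; returns (iteration, pre) result pairs,
    duplicate-free and in document order within each iteration."""
    iters_at: Dict[int, Set[int]] = defaultdict(set)
    for it, pre in ctx:
        iters_at[pre].add(it)
    starts = sorted(iters_at)              # context nodes in document order
    out: List[Tuple[int, int]] = []
    stack: List[Tuple[int, Set[int]]] = []  # (last pre of subtree, iterations opened)
    active: Set[int] = set()               # iterations with an open subtree
    k, pre = 0, 0
    while True:
        # Leave every subtree that ended before the current node.
        while stack and stack[-1][0] < pre:
            active -= stack.pop()[1]
        if not stack:
            if k == len(starts):
                break
            pre = starts[k]                # skipping: nothing active, jump ahead
        # The current node is a descendant for every active iteration.
        out.extend((it, pre) for it in active)
        if k < len(starts) and starts[k] == pre:
            new = iters_at[pre] - active   # pruning, applied per iteration
            if new:
                stack.append((pre + size[pre], set(new)))
                active |= new
            k += 1
        pre += 1
    return sorted(out)


if __name__ == "__main__":
    print(loop_lifted_descendant([(1, 1), (1, 2), (2, 7), (2, 5), (3, 3)], SIZE))
```

In the example, iteration 1 has the nested context nodes 1 and 2, iteration 2 has 5 and 7, and iteration 3 has the leaf 3, so iteration 3 contributes no result rows.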

Page 7: XQuery Updates in MonetDB/XQuery

Loop-lifted staircase join

Results on the 20 XMark queries: [figure]

Page 8: XQuery Updates in MonetDB/XQuery

Schedule

• 15.09.2008: RDBMS back-end support for XML/XQuery (1/2)
  • Document Representation (XPath Accelerator, Pre/Post plane)
  • XPath navigation (Staircase Join)
• 22.09.2008: XQuery to Relational Algebra Compiler
  • Item and Sequence Representation
  • Efficient FLWOR Evaluation (Loop-Lifting)
  • Optimization
• 29.09.2008: RDBMS back-end support for XML/XQuery (2/2)
  • Updateable Document Representation
  • Other (DB) approaches to XML/XQuery processing

Page 9: XQuery Updates in MonetDB/XQuery

What is MonetDB?

• Main-memory based DBMS backend/kernel
• Developed at CWI since 1992
• "Query-intensive" applications
  • Data mining
  • Data warehousing / decision support
  • Multi-media information retrieval (text, images, audio, video, XML, ...)
  • XML databases
  • GIS
• Part of Data Distilleries' products
  • CWI spin-off company
  • (>100GB) databases at ABN Amro, Postbank, Ohra, Spaarbeleg, FBTO, Centerparcs, Vodafone
  • Nowadays: part of SPSS

Page 11: XQuery Updates in MonetDB/XQuery

MonetDB: Motivation (1/2)

• Relational DBMS dominate the scene
  • Oracle, SQL Server, DB2
• Databases a solved problem? No!

Problems:
• performance
  • new "query-intensive" applications (data mining, et al.)
• extensibility
  • new applications (GIS, text, image, audio, video, XML)

Page 14: XQuery Updates in MonetDB/XQuery

MonetDB: Motivation (2/2)

• Are relational DBMS fit for the job?
  • developed in the late 1970s and early 1980s
• Hardware has changed
  • CPUs get faster but more vulnerable
  • capacity and bandwidth follow Moore's law
  • latency becomes a bottleneck (I/O and RAM)
• Applications have changed
  • RDBMS are tuned for transaction processing
    • not query-intensive
    • only business domain

Page 15: XQuery Updates in MonetDB/XQuery

Transactions (OLTP)

Page 16: XQuery Updates in MonetDB/XQuery

OLAP, Data Mining

Page 17: XQuery Updates in MonetDB/XQuery

How is MonetDB Different

• full vertical fragmentation: always!
  • everything in binary (2-column) tables
  • saves you from table scan hell in OLAP and Data Mining
• the RISC approach to databases
  • simple data model, simple query language
  • don't need (to pay for) a buffer manager => manage virtual memory
  • explicit transaction management => DIY approach to ACID
• CPU and memory cache optimized
  • programming team experienced in main memory DBMS techniques
  • use of scientific programming optimizations (loop unrolling)
  • cache-conscious data structures and algorithms
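Full vertical fragmentation can be pictured with a small Python sketch. The relation, its attribute names and the helper functions are made up for illustration and do not reflect MonetDB's actual interfaces: an n-ary relation is split into one binary (oid, value) table per attribute, so a query over one attribute scans only that column, and a tuple is rebuilt by looking up the same oid in every column.

```python
from typing import Dict, List, Tuple

BinaryTable = List[Tuple[int, object]]  # (oid, value) pairs, oid dense ascending


def decompose(rows: List[Dict[str, object]]) -> Dict[str, BinaryTable]:
    """Split an n-ary relation into one binary table per attribute."""
    tables: Dict[str, BinaryTable] = {}
    for oid, row in enumerate(rows):
        for attr, value in row.items():
            tables.setdefault(attr, []).append((oid, value))
    return tables


def column_sum(table: BinaryTable) -> float:
    """An aggregate touches a single column, not the whole relation."""
    return sum(value for _, value in table)


def reconstruct(tables: Dict[str, BinaryTable], oid: int) -> Dict[str, object]:
    """Rebuild one tuple; with dense oids this is a positional lookup per column."""
    return {attr: table[oid][1] for attr, table in tables.items()}


if __name__ == "__main__":
    # Hypothetical sample relation
    books = [
        {"title": "A", "price": 10},
        {"title": "B", "price": 20},
        {"title": "C", "price": 30},
    ]
    tables = decompose(books)
    print(column_sum(tables["price"]), reconstruct(tables, 1))
```

The sketch assumes every row has all attributes, so each column is exactly as long as the relation and oid doubles as the array position.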

Page 18: XQuery Updates in MonetDB/XQuery

MonetDB: Shopping List

• A quantum leap in performance requires a quantum leap in technology (and risk)
• Better support for non-administrative applications, using:
  • Multi-model database kernel support
  • Extensible data types, operators, accelerators
  • Database hot-set is memory resident (but scale to TB)
  • Use simple data structures
  • Index management should be automatic
  • Algebraic language as the computational model
  • Query optimization = strategic + tactic + operational optimization
  • Dynamic optimization, parallelism, JIT-compile-link-run
  • Cooperative (application) transaction management
  • Do not replicate the operating system

Page 19: XQuery Updates in MonetDB/XQuery

Storing Relations in MonetDB

Page 20: XQuery Updates in MonetDB/XQuery

Relational Mapping

Page 21: XQuery Updates in MonetDB/XQuery

Object-Oriented Mapping

Page 22: XQuery Updates in MonetDB/XQuery

BAT Data Structure

BAT: binary association table
BUN: binary unit
BUN heap:
- consecutive memory block (array)
- memory-mapped file

Hash tables, T-trees, R-trees, ...

Page 23: XQuery Updates in MonetDB/XQuery

BAT Storage Optimizations

Dense ascending sequence

Page 24: XQuery Updates in MonetDB/XQuery

BAT Property Management

type - (physical) type number
enum - enumerated type flag
dense - dense ascending range
sorted - ascending head sorting
constant - all equal values
align - unique sequence id
key - no duplicates on column
set - no duplicates in BAT
hash - accelerator flag
Ttree - accelerator flag
mirrored - head=tail value
count - cardinality

Page 25: XQuery Updates in MonetDB/XQuery

XML/XQuery Updates

XQuery Update Facility 1.0, W3C Candidate Recommendation
http://www.w3.org/TR/xquery-update-10/

• Categorize updates into
  • Value updates
  • Structural updates

(MonetDB/XQuery does not yet support the latest syntax changes made by W3C; for details see http://monetdb.cwi.nl/XQuery/Documentation/XQuery-Updates.html)

Page 26: XQuery Updates in MonetDB/XQuery

Value Updates

do replace value of fn:doc("bib.xml")/books/book[1]/price
with fn:doc("bib.xml")/books/book[1]/price * 1.1

do replace value of fn:doc("bib.xml")/books/book[2]/@isbn
with "90-6196-517-9"

do rename fn:doc("bib.xml")/books/book[3]/author[1]
into "primary-author"

do rename fn:doc("bib.xml")/journals/journal[9]/@isbn
into "issn"

=> map directly to simple value updates in relational storage

Page 27: XQuery Updates in MonetDB/XQuery

Structural Updates

do insert attribute isbn {"90-6196-517"}
into fn:doc("bib.xml")/books/book[17]

do delete fn:doc("bib.xml")/books/book[2]/@wrong

do insert <author>Stefan Manegold</author>
after fn:doc("bib.xml")/books/book[33]/author[last()]

do replace fn:doc("bib.xml")/books/book[44]/author[1]
with fn:doc("bib.xml")/books/book[33]/author[last()]

do delete fn:doc("bib.xml")/books/book[author = "Kermit"]

=> How to implement on pre-/post-encoding?

Page 29: XQuery Updates in MonetDB/XQuery

XML/XQuery Updates

do insert <k><l/><m/></k> as first into /a/f/g
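Why this insert is expensive on a plain pre/size/level table can be seen in a naive Python sketch. It assumes the ten-node example document a..j, takes the pre rank of /a/f/g (6) as given rather than evaluating the path, and uses hypothetical names throughout. Every ancestor of the target must grow its size, and every node after the insert point implicitly gets a new pre rank.

```python
from typing import List, Tuple

Node = Tuple[str, int, int]  # (name, size, level); the list index is the pre rank

# Assumed example document a..j
DOC: List[Node] = [
    ("a", 9, 0), ("b", 3, 1), ("c", 2, 2), ("d", 0, 3), ("e", 0, 3),
    ("f", 4, 1), ("g", 0, 2), ("h", 2, 2), ("i", 0, 3), ("j", 0, 3),
]


def insert_as_first(doc: List[Node], target_pre: int,
                    subtree: List[Node]) -> List[Node]:
    """Insert `subtree` (levels relative to its root) as first child of the target."""
    n = len(subtree)
    target_level = doc[target_pre][2]
    new_nodes = [(name, size, target_level + 1 + level)
                 for name, size, level in subtree]
    updated: List[Node] = []
    for pre, (name, size, level) in enumerate(doc):
        # The target and all its ancestors contain the insert point: grow them.
        if pre <= target_pre <= pre + size:
            size += n
        updated.append((name, size, level))
    # All nodes after the target shift by n positions (their pre rank changes).
    return updated[:target_pre + 1] + new_nodes + updated[target_pre + 1:]


if __name__ == "__main__":
    k_subtree = [("k", 2, 0), ("l", 0, 1), ("m", 0, 1)]
    for pre, node in enumerate(insert_as_first(DOC, 6, k_subtree)):
        print(pre, node)
```

Here h, i and j move from pre 7..9 to pre 10..12; in a large document an insert near the front would renumber almost every node, which is the cost an update-friendly storage scheme has to avoid.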

Page 30: XQuery Updates in MonetDB/XQuery

XML/XQuery Updates

Page 31: XQuery Updates in MonetDB/XQuery

XML/XQuery Updates

Page 32: XQuery Updates in MonetDB/XQuery

XML/XQuery Updates

Staircase Join

Page 33: XQuery Updates in MonetDB/XQuery

XML Storage Revisited

post = pre + size - level

node  pre  size  level  post
a     0    9     0      9
b     1    3     1      3
c     2    2     2      2
d     3    0     3      0
e     4    0     3      1
f     5    4     1      8
g     6    0     2      4
h     7    2     2      7
i     8    0     3      5
j     9    0     3      6

Allow holes; define logical pages:

pre  size  level
0    11    0
1    5     1
2    unused (level = null)
3    unused (level = null)
4    2     2
5    0     3
6    0     3
7    4     1
8    0     2
9    2     2
10   0     3
11   0     3

rid  size  level  nid
0    11    0      N0
1    5     1      N1
2    unused (level = null, nid = null)
3    unused (level = null, nid = null)
4    2     2      N2
5    0     3      N3
6    0     3      N4
7    4     1      N5
8    0     2      N6
9    2     2      N7
10   0     3      N8
11   0     3      N9
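The formula post = pre + size - level can be checked against the ten example nodes with a few lines of Python. The row values are the ones from the table on this slide; the helper names are hypothetical. The sketch also shows why size is convenient: the descendants of a node are exactly the pre ranks in the window (pre, pre + size].

```python
from typing import Dict, List, Tuple

# (pre, size, level, node) for the example document a..j
ROWS: List[Tuple[int, int, int, str]] = [
    (0, 9, 0, "a"), (1, 3, 1, "b"), (2, 2, 2, "c"), (3, 0, 3, "d"),
    (4, 0, 3, "e"), (5, 4, 1, "f"), (6, 0, 2, "g"), (7, 2, 2, "h"),
    (8, 0, 3, "i"), (9, 0, 3, "j"),
]


def post(pre: int, size: int, level: int) -> int:
    """Postorder rank derived from the pre/size/level encoding."""
    return pre + size - level


def is_descendant(v_pre: int, c_pre: int, c_size: int) -> bool:
    """True if node v lies in the subtree below context node c."""
    return c_pre < v_pre <= c_pre + c_size


POSTS: Dict[str, int] = {node: post(pre, size, level)
                         for pre, size, level, node in ROWS}

if __name__ == "__main__":
    print(POSTS)
    print(is_descendant(8, 5, 4))  # is i below f?
```

For example, f has pre 5, size 4 and level 1, which gives post 8, and i (pre 8) falls inside the window (5, 9] of f.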

Page 34: XQuery Updates in MonetDB/XQuery

XML Storage Revisited

post = pre + size - level

node  pre  size  level  post
a     0    9     0      9
b     1    3     1      3
c     2    2     2      2
d     3    0     3      0
e     4    0     3      1
f     5    4     1      8
g     6    0     2      4
h     7    2     2      7
i     8    0     3      5
j     9    0     3      6

Allow holes; define logical pages:

pre  size  level
0    11    0
1    5     1
2    unused (level = null)
3    unused (level = null)
4    2     2
5    0     3
6    0     3
7    4     1
8    0     2
9    2     2
10   0     3
11   0     3

rid = pre.swizzle( )

page  map
0     0
1     2
2     1

rid  size  level  nid
0    11    0      N0
1    5     1      N1
2    unused (level = null, nid = null)
3    unused (level = null, nid = null)
4    0     2      N6
5    2     2      N7
6    0     3      N8
7    0     3      N9
8    2     2      N2
9    0     3      N3
10   0     3      N4
11   4     1      N5
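One way to read rid = pre.swizzle( ) is as a page-wise remapping, sketched below in Python. This is an interpretation, not code from MonetDB: the page size of 4 is inferred from the example (12 tuples in 3 pages), the page map swaps pages 1 and 2 as in the table on this slide, and the function name simply mirrors the slide.

```python
from typing import List, Optional

PAGE_SIZE = 4            # assumed: 12 tuples spread over 3 logical pages
PAGE_MAP = [0, 2, 1]     # logical page number -> physical page number

# nid column of the rid table in physical order (None marks an unused tuple)
RID_NID: List[Optional[str]] = [
    "N0", "N1", None, None,      # physical page 0
    "N6", "N7", "N8", "N9",      # physical page 1
    "N2", "N3", "N4", "N5",      # physical page 2
]


def swizzle(pre: int) -> int:
    """Translate a logical position (pre) into a physical position (rid)."""
    logical_page, offset = divmod(pre, PAGE_SIZE)
    return PAGE_MAP[logical_page] * PAGE_SIZE + offset


def nids_in_document_order() -> List[Optional[str]]:
    """Read the physical table through the page map to recover document order."""
    return [RID_NID[swizzle(pre)] for pre in range(len(RID_NID))]


if __name__ == "__main__":
    print(swizzle(4), swizzle(8))
    print(nids_in_document_order())
```

With this layout a new page can be appended physically at the end of the rid table and linked into the right logical position by changing only the page map, so existing tuples keep their rid.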

Page 35: XQuery Updates in MonetDB/XQuery

XML Storage Revisited

Update-friendly:
• rid-table is append-only
• rid-tuples may be unused
• rid = autoincrement column

MonetDB:
• rid not stored but computed (virtual oid)
• allows positional lookup/join

Opportunity currently not exploited by other RDBMS.
Occurs widely in our XQuery translation.

rid  size  level  nid
0    11    0      N0
1    5     1      N1
2    unused (level = null, nid = null)
3    unused (level = null, nid = null)
4    0     2      N6
5    2     2      N7
6    0     3      N8
7    0     3      N9
8    2     2      N2
9    0     3      N3
10   0     3      N4
11   4     1      N5

Page 37: XQuery Updates in MonetDB/XQuery

MonetDB/XQuery

Our own XML DBMS with (almost) full XQuery support.
• Built purely on an RDBMS, namely MonetDB
• Future: also middleware support (P2P!!) in AmbientDB

Pathfinder compiler & "staircase join" (see later):
– Technical University Munich (Torsten Grust, et al.)
– Technical University Twente (Maurice van Keulen, et al.)

MonetDB High-Performance DBMS:
– CWI Amsterdam (Peter Boncz, Stefan Manegold, ...)

Useful for:
• Large XML databases!
• Querying XML annotations (multimedia, forensic NFI)

[Figure: XQuery -> Pathfinder Compiler -> Relational Algebra -> RDBMS (MonetDB)]

Page 38: XQuery Updates in MonetDB/XQuery

Current Projects

• Value indices & runtime optimization
  • Code freeze, release this week
• Algebraic Query Optimization
  • Some released, most still in the development version
• Distributed XQuery, P2P XQuery
  • SOAP group communication, XQuery RPC
  • VLDB'07 [Zhang, Boncz]
• Benchmarking beyond XMark
  • ExpDB'06 Workshop [Manegold]
• Support for XML Interval Annotations
  • XIME-P'06 Workshop [Alink et al.]
• ...

Page 39: XQuery Updates in MonetDB/XQuery

Conclusions

• Relational approach can be scalable & fast
  • MonetDB/XQuery compares favorably with all other available systems
• Techniques that made it work
  • Property-driven peephole optimization: order & other properties
  • Loop-lifted XPath steps: evaluate sets of context nodes in a single pass
  • Support for dense (autoincrement) keys: positional lookup
• Background Information & Literature
  • http://monetdb-xquery.org
  • http://pathfinder-xquery.org

Page 40: XQuery Updates in MonetDB/XQuery

Exam / Tentamen

Tuesday, October 21, 2008
9:45 – 11:45
REC-G S.14

