IVOX: Incremental View Maintenance for Ordered XML
DSRG Talk WPI February 20th 2003
Students: Katica Dimitrova & Maged El Sayed
Advisor: Prof. Elke Rundensteiner

Outline
Motivation
Problem Description
Background
  XML Algebra
  Order in XML Algebra
The IVOX Approach
  Order Encoding
  Overall strategy
System Architecture
Related Work
Future Work

Motivation
Views in general
  Data warehouses
  Information integration
  Access control, privacy, etc.
XML views (extra useful)
  Information inter-portability
  Crossing gaps between different data models
Materialized views
  Speed up data retrieval
  Query optimization
  Increased availability
[Figure: sources (RDB, XML, other) feed a view that is defined by a view definition query]

Maintaining Materialized Views
When sources are updated, a materialized view may become inconsistent.
Methods of view maintenance
  Recomputation: recompute the view from scratch from the base data
  Incremental view maintenance: compute the changes to the view in response to changes to the base sources
Heuristic: incremental view maintenance is usually cheaper than full recomputation.

The Problem
Previous work for:
  Relational [GMS93], bag semantics [GL95], [ZGHW95], [PSCP02]
  Object-Relational [LVM00]
  Object-Oriented [AFP02]
  Structured data models [AMRVW98], [ZM98]
  XML data model not handling order [LD00]
Can techniques for other data models be reused for XML?

Is Maintaining XML Views Different?
XML features
  Hierarchical
  Optional elements
  Self-typed
  References
  Ordered
Expressiveness of view definition language
  Complex operations: tagging, unnesting, aggregation, ...
Expected large auxiliary information

Example
Bib.xml:
<bib>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
  </book>
  <book>
    <price>39.95</price>
    <title>Data on the Web</title>
  </book>
</bib>
View Definition Query: list all books that cost less than $60, including their title and price.
<result>{
  for $b in document("bib.xml")/bib/book
  where $b/price/text() < 60
  return <book>{ $b/title, $b/price }</book>
}</result>
View Extent:
<result>
  <book>
    <title>Data on the Web</title>
    <price>39.95</price>
  </book>
</result>

Example
Update: insert element <price>55.48</price> into the second book.
Bib.xml (before the update):
<bib>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
  </book>
  <book>
    <price>39.95</price>
    <title>Data on the Web</title>
  </book>
</bib>
View Definition Query:
<result>{
  for $b in document("bib.xml")/bib/book
  where $b/price/text() < 60
  return <book>{ $b/title, $b/price }</book>
}</result>
View Extent (before the update):
<result>
  <book>
    <title>Data on the Web</title>
    <price>39.95</price>
  </book>
</result>
Book that the update adds to the view extent:
<book> <title>TCP/IP Illustrated</title> <price>55.48</price> </book>

Our Goal
Design an incremental view maintenance strategy for XQuery views that:
  Correctly updates the view
  Is order sensitive
    Returns the view in proper order
    Allows for updates that specify order
  Covers at least the "core" of XQuery views
  Minimizes auxiliary information requirements

Basics of IVOX Approach: Algebraic
Update propagation rules for each algebra operator and each update type
[Figure: at execution time, the XQuery definition is turned into an algebra tree over the XML sources, and each operator maps its input data D1 to output data D2 to produce the XML view; at view maintenance time, a source update is pushed through the same tree, each operator mapping a D1 update to a D2 update]

Why Algebraic?
Robust – Easily adaptable to operator semantic changes
Extensible – new operators can be added
Allows for reuse of techniques for known operators
Language independent – unaffected by syntax changes (of XQuery by W3C)
Formal – basis for provable correctness

Background on XML Algebra XAT
XAT Operators
  SQL operators: Select, Project, ...
  Special operators: Source, FOR, ...
  XML operators: Navigate, Tagger, ...
XAT Data Model (XAT Table)
  Order-sensitive table of tuples
  Columns denote user-specified or internally generated variable bindings
  A cell in a tuple holds an XML node or a sequence of XML nodes
[Figure: an operator labeled "$col1, price $col3" takes an XAT table with column $b holding the three book elements and produces a table with columns $b and $col3, pairing each book that has a price with that price (65.95 and 39.95)]

Order in XAT Context
Order among tuples
Order among XML nodes in a cell
[Figure: the same example as on the previous slide: an input table with column $b (three books) and an output table with columns $b and $col3 (prices 65.95 and 39.95)]

Order in the XAT Context
[Figure: Agg $col5 takes a table whose $col5 column holds two tuples, the books "TCP/IP ..." (price 55.48) and "Data ..." (price 39.95), and produces a single cell holding the sequence of both books. The input illustrates order among the tuples; the output illustrates order among XML nodes in a single cell.]

Order in XAT Context: View Maintenance
On update worry about:
  Order among tuples
  Order among XML nodes in a cell
[Figure: the example after the update. The second book in $b now contains <price>55.48</price>, and the output table ($b, $col3) gains a tuple for price 55.48 between the tuples for 65.95 and 39.95.]

Order in XAT Context & View Maintenance
On update worry about:
  Order among the tuples
  Order among XML nodes in a single cell
[Figure: Agg $col5 combining the books "TCP/IP ..." (price 55.48) and "Data ..." (price 39.95) from two tuples into one cell holding the sequence of both books]

Duplicate Information in XAT Context
Complex operations require auxiliary information
Auxiliary information can be too large in XAT context
It may be expensive to maintain
[Figure: in the example, the first book element with its price and title is stored in full in both the input table ($b) and the output table ($b, $col3): duplicated storage]

Possible Solutions to Order Preservation (I)
Sequential storage (XPROP approach by Maged, Ling & Luping)
  Assume intermediate results are stored sequentially
  Inserts and deletes are performed in physical order
  No order encoding
  Special support required for secondary storage
  May require iteration over many tuples to determine order
[Figure: the example tables before and after inserting <price>55.48</price>; the new output tuple (book, price 55.48) has to be placed physically between the tuples for 65.95 and 39.95]

Possible Solutions to Order Preservation (II)
Naïve order encoding for tuples and sequences of XML nodes
  Assign order numbers to tuples and to XML nodes in a sequence
  Requires frequent renumbering on inserts
[Figure: the example tables with an Ord column (1, 2, 3 in the input; 1, 2 in the output). Inserting <price>55.48</price> adds an output tuple with Ord 2 and renumbers the following tuple from 2 to 3.]

Using Node Identity
Idea: use node identity
Usage:
  For encoding order and structure
  As a reference to base data

What Encoding For Node Identity?
Existing techniques for encoding order for XML
  Global Order (UW) (shown on this slide)
  Local Order (UW)
  Dewey Order (UW)
  Lexicographical Order (MASS)
[Figure: the bib.xml tree labeled in global order: bib 1; first book 2 (price 3, title 4); second book 5 (title 6); third book 7 (price 8, title 9). Inserting a price into the second book gives it number 6 and renumbers the following nodes to 7, 8, 9, 10.]

What Encoding For Node Identity?
Existing techniques for encoding order for XML
  Global Order (UW)
  Local Order (UW) (shown on this slide)
  Dewey Order (UW)
  Lexicographical Order (MASS)
[Figure: the bib.xml tree labeled in local order: bib 1; books 1, 2, 3; the children of each book numbered from 1. Inserting a price into the second book gives it number 1 and renumbers the title of that book to 2.]

What Encoding For Node Identity?
Existing techniques for encoding order for XML
  Global Order (UW)
  Local Order (UW)
  Dewey Order (UW) (shown on this slide)
  Lexicographical Order (MASS)
[Figure: the bib.xml tree labeled in Dewey order: bib 1; books 1.1, 1.2, 1.3; first book price 1.1.1 and title 1.1.2; second book title 1.2.1; third book price 1.3.1 and title 1.3.2. Inserting a price into the second book gives it 1.2.1 and renumbers the title of that book to 1.2.2.]

What Encoding For Node Identity?
Existing techniques for encoding order for XML
  Global Order (UW)
  Local Order (UW)
  Dewey Order (UW)
  Lexicographical Order (MASS) (shown on this slide): the winner
[Figure: the bib.xml tree labeled with lexicographical keys: bib b; books b.b, b.d, b.f; first book price b.b.b and title b.b.cd; second book title b.d.f; third book price b.f.cm and title b.f.l. Inserting a price into the second book gives it the key b.d.b; no existing key changes.]

Lexicographical Keys: LexKeys
What are LexKeys?
  Multi-level lexicographical keys
  Example: c, ba.c.b
Examples of comparison
  b < b.c
  bab < bd.cc
  b.b < b.b.c
Advantages
  All LexKeys form a totally ordered set with respect to <
  It is always possible to generate a key between two keys
  The deletion of a LexKey in a sequence does not affect other LexKeys
Usage
  Reference to XML nodes
  Encoding order
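
The comparison examples and the "key between two keys" property can be illustrated with a minimal Python sketch. It assumes a LexKey is a dot-separated list of lowercase levels compared level by level; the helper names `parse_lexkey`, `key_between` and `sibling_key`, and the midpoint scheme used to pick a new level, are hypothetical and not taken from the talk (the slides assign b.d.b where this sketch would pick b.d.c).

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def parse_lexkey(text):
    """Split 'b.d.f' into ('b', 'd', 'f'); tuples compare level by level."""
    return tuple(text.split("."))


def key_between(lo, hi):
    """Return a level string s with lo < s < hi (plain string comparison).

    Assumption of this sketch: levels are lowercase a-z and never end in 'a',
    so there is always room to generate a smaller key later. lo may be ''
    to mean "no lower bound".
    """
    if not lo < hi or hi.endswith("a"):
        raise ValueError("need lo < hi and hi must not end in 'a'")
    prefix = []
    i = 0
    while True:
        if i < len(lo):
            a, b = ALPHABET.index(lo[i]), ALPHABET.index(hi[i])
            if a == b:                      # still inside the common prefix
                prefix.append(lo[i])
                i += 1
                continue
            if b - a > 1:                   # room for a letter in between
                return "".join(prefix) + ALPHABET[(a + b) // 2]
            return lo + "m"                 # adjacent letters: extend lo
        # lo is exhausted, so any extension of the prefix is greater than lo
        b = ALPHABET.index(hi[i])
        if b == 0:                          # hi continues with 'a': follow it
            prefix.append("a")
            i += 1
            continue
        if b == 1:                          # hi continues with 'b'
            return "".join(prefix) + "am"
        return "".join(prefix) + ALPHABET[b // 2]


def sibling_key(parent, lo_level, hi_level):
    """LexKey for a new child of `parent` between two sibling levels."""
    return parent + (key_between(lo_level, hi_level),)


# The comparison examples from the slide.
print(parse_lexkey("b") < parse_lexkey("b.c"))
print(parse_lexkey("bab") < parse_lexkey("bd.cc"))
print(parse_lexkey("b.b") < parse_lexkey("b.b.c"))

# A new first child of book b.d, placed before its title b.d.f.
new_key = sibling_key(parse_lexkey("b.d"), "", "f")
print(".".join(new_key))
```

Because a new key is always generated between its neighbours, existing keys never change on insert, and deleting a key leaves the others untouched.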

LexKeys in XAT Tables
[Figure: the Navigate example ($b, price, $col2) shown twice, once with XML nodes in the cells and once with each node replaced by its LexKey]
Input table:
  $b
  b.b
  b.d
  b.f
Output table:
  $b    $col2
  b.b   b.b.b
  b.f   b.f.cm

Order Among XAT Tuples
Notion: designate an order schema for XAT tables
Ordering by the LexKeys in the columns of the order schema yields the correct tuple order.
  $b    $c       $d
  b.f   b.b.b    c.m
  b.b   b.f.cm   d.c
  b.b   b.f.cm   d.c.b
[Figure: the order schema marks the first and second ordering columns of this table, and the three tuples are annotated with their resulting positions]
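
As a rough illustration of ordering by an order schema, the sketch below sorts the three tuples shown above. The choice of ($b, $d) as the order schema is an assumption made for the example, since the slide's column markers did not survive extraction; `parse_lexkey` and `sort_by_order_schema` are hypothetical helper names.

```python
def parse_lexkey(text):
    """Split 'b.f.cm' into ('b', 'f', 'cm') so keys compare level by level."""
    return tuple(text.split("."))


def sort_by_order_schema(rows, order_schema):
    """Sort tuples by the LexKeys found in the order-schema columns."""
    return sorted(
        rows,
        key=lambda row: tuple(parse_lexkey(row[col]) for col in order_schema),
    )


rows = [
    {"$b": "b.f", "$c": "b.b.b", "$d": "c.m"},
    {"$b": "b.b", "$c": "b.f.cm", "$d": "d.c"},
    {"$b": "b.b", "$c": "b.f.cm", "$d": "d.c.b"},
]

# Assumed order schema: $b first, then $d.
ordered = sort_by_order_schema(rows, ("$b", "$d"))
for row in ordered:
    print(row["$b"], row["$c"], row["$d"])
```

The tuples themselves carry no position number, so inserting a tuple never forces other tuples to be renumbered.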

Calculating Order Schema
Rules for each operator
Calculated in a postorder traversal of the tree
Sample rules (s is the operator's input, odc(out) the order schema of its output):
  Operator                                Order schema odc(out)
  Tagger T pattern $col' (s)              odc(s)
  Source S desc $col'                     none
  Navigate Unnest $col, path $col' (s)    if col is last in odc(s): Concat(odc(s) – col, col')
                                          else: Concat(odc(s), col')
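
The three sample rules can be written down as a small recursive function. This is only a sketch under stated assumptions: the `Op` class and its field names are hypothetical, each operator has a single input, and operators other than the three listed are left unimplemented because the slide gives no rule for them.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Op:
    """Hypothetical algebra-tree node with a single input operator."""
    kind: str                      # "source", "navigate_unnest" or "tagger"
    out_col: str                   # the output column $col'
    col: Optional[str] = None      # the input column $col (Navigate only)
    child: Optional["Op"] = None   # the input operator s


def order_schema(op: Op) -> Tuple[str, ...]:
    """Compute odc(out) in a postorder traversal: inputs first, then op."""
    if op.kind == "source":
        return ()                              # Source: none
    s = order_schema(op.child)
    if op.kind == "tagger":
        return s                               # Tagger: odc(s)
    if op.kind == "navigate_unnest":
        if s and s[-1] == op.col:
            return s[:-1] + (op.out_col,)      # Concat(odc(s) - col, col')
        return s + (op.out_col,)               # Concat(odc(s), col')
    raise NotImplementedError(op.kind)


# The navigation chain of the running example.
source = Op("source", "$S1")
bib = Op("navigate_unnest", "$col1", col="$S1", child=source)
book = Op("navigate_unnest", "$b", col="$col1", child=bib)
price = Op("navigate_unnest", "$col2", col="$b", child=book)
title = Op("navigate_unnest", "$col4", col="$b", child=price)
tagger = Op("tagger", "$col5", child=title)

print(order_schema(tagger))
```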

Order Among Tuples Example
[Figure: the Navigate example ($b, price, $col2) in XML form and in LexKey form, with the order schema column of each table marked and the tuples annotated with their positions]
Input table:
  $b
  b.b
  b.d
  b.f
Output table:
  $b    $col2
  b.b   b.b.b
  b.f   b.f.cm

Order in Collection within a cell?
[Figure: Agg $col5 collapses the two tagged book tuples ("TCP/IP ..." with price 55.48 and "Data ..." with price 39.95) into one cell holding the sequence of both books. In key form, the input table has columns $col2, $col4, $col5 with tuples (b.f.cm, b.f.l, tbb) and (b.d.f, b.d.b, tbc), ordered by the order schema, and the output cell holds { tbb, tbc }.]

Smart Keys
What is a SmartKey?
  SmartKey = Key (LexKey) + Overriding Order (LexKey)
  Key: the key part; by default it also represents order
  Overriding Order: optional; only represents order when present
Notation: key(order)
Examples
  b.c.b (h)
  b.c.b
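
A SmartKey can be modelled as a pair of a key and an optional overriding order. The following is a minimal sketch, assuming both parts are dot-separated LexKeys; the `SmartKey` class, `order_key` and `parse_smartkey` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


def parse_lexkey(text):
    """Split 'b.c.b' into ('b', 'c', 'b') so keys compare level by level."""
    return tuple(text.split("."))


@dataclass(frozen=True)
class SmartKey:
    key: str                       # identifies the node; default order
    order: Optional[str] = None    # overriding order, used only if present

    def order_key(self):
        """The LexKey that decides this SmartKey's position."""
        chosen = self.order if self.order is not None else self.key
        return parse_lexkey(chosen)

    def __str__(self):
        return self.key if self.order is None else f"{self.key}({self.order})"


def parse_smartkey(text):
    """Parse the key(order) notation, e.g. 'b.c.b (h)' or 'b.c.b'."""
    text = text.replace(" ", "")
    if text.endswith(")"):
        key, order = text[:-1].split("(", 1)
        return SmartKey(key, order)
    return SmartKey(text)


with_override = parse_smartkey("b.c.b (h)")
plain = parse_smartkey("b.c.b")

# Same key part, different positions: the override wins when present.
in_order = sorted([with_override, plain], key=SmartKey.order_key)
print([str(k) for k in in_order])
```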

SmartKeys in XAT Tables
[Figure: the Agg $col5 example with SmartKeys. The input table has columns $col2, $col4, $col5 with tuples (b.f.cm, b.f.l, tbb) and (b.d.f, b.d.b, tbc); the output cell holds { tbb(b.f.cm..b.f.l), tbc(b.d.f..b.d.b) }, where the overriding order of each key is built from the order schema columns of its tuple.]

The Impact of SmartKeys on View Maintenance

Order Among XAT Tuples during View Maintenance
Not touching other tuples in XAT table
No reordering ever needed
Gaining distributiveness in regard to bag union on tuple level
[Figure: the Navigate example in LexKey form after the update; the new tuples are appended physically, and their position (2 of 3) follows from their keys]
Input table:
  $b
  b.b
  b.f
  b.d
Output table:
  $b    $col3
  b.b   b.b.b
  b.f   b.f.cm
  b.d   b.d.b

Order in a Sequence during View Maintenance
Not touching other members of the sequence
No reordering ever needed
Gaining distributiveness in regard to bag union on cell level
[Figure: Agg $col5 over the tuples tb..b.f.l..b.f.cm and tb..b.d.f..b.d.b produces one cell { tb..b.f.l..b.f.cm, tb..b.d.f..b.d.b }; the members are annotated with their positions in the sequence]

Update Propagation Rules
Use distributiveness in regard to bag union
Reuse rules from relational for most SQL XAT operators
[Figure: at execution time an operator maps XAT table 1 to XAT table 2; at view maintenance time the same operator maps an update to XAT table 1 to an update to XAT table 2]
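
Update Propagation Rules Example (Navigate Unnest on Insert Tuple)
$T2_{old} = \phi_{\$col,\,path}^{\$col'}(T1_{old})$
$T1_{new} = T1_{old} + \Delta T1$
$T2_{new} = \phi_{\$col,\,path}^{\$col'}(T1_{old} + \Delta T1) = \phi_{\$col,\,path}^{\$col'}(T1_{old}) + \phi_{\$col,\,path}^{\$col'}(\Delta T1) = T2_{old} + \Delta T2$
Here $\phi$ denotes Navigate Unnest, $\Delta T1$ the inserted tuples, $\Delta T2$ the resulting change to the output, and + represents bag union.
[Figure: at execution time Navigate Unnest maps T1 to T2; at view maintenance time the same operator maps the inserted tuples of T1 to the inserted tuples of T2]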
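
The distributive step in this derivation can be checked on a toy example. The sketch below is an assumption-laden illustration, not the system's implementation: XML nodes are represented only by LexKeys in a hypothetical `NODES` map (taken from the running example after the price b.d.b has been inserted into the source), tuples are dictionaries, and bags are compared with `collections.Counter`.

```python
from collections import Counter

# Hypothetical node store: LexKey -> element name.
NODES = {
    "b.b": "book", "b.b.b": "price", "b.b.cd": "title",
    "b.d": "book", "b.d.b": "price", "b.d.f": "title",
    "b.f": "book", "b.f.cm": "price", "b.f.l": "title",
}


def children(key, tag):
    """LexKeys of the direct children of `key` that have element name `tag`."""
    depth = key.count(".") + 1
    found = [k for k, t in NODES.items()
             if t == tag and k.startswith(key + ".") and k.count(".") == depth]
    return sorted(found, key=lambda k: tuple(k.split(".")))


def navigate_unnest(table, col, tag, out_col):
    """One output tuple per (input tuple, matching child) pair."""
    return [{**row, out_col: child}
            for row in table
            for child in children(row[col], tag)]


def bag(table):
    """Multiset view of a table, so that + below is bag union."""
    return Counter(tuple(sorted(row.items())) for row in table)


t1_old = [{"$b": "b.b"}, {"$b": "b.f"}]
delta_t1 = [{"$b": "b.d"}]                  # the inserted tuple

t2_old = navigate_unnest(t1_old, "$b", "price", "$col2")
delta_t2 = navigate_unnest(delta_t1, "$b", "price", "$col2")
t2_new = navigate_unnest(t1_old + delta_t1, "$b", "price", "$col2")

# Navigating the delta alone yields exactly the change to the output.
print(bag(t2_new) == bag(t2_old) + bag(delta_t2))
print(delta_t2)
```

Only the inserted tuple is navigated; the old output is reused unchanged, which is the saving over recomputation.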
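
Update Propagation Strategy
[Figure: an update XQuery against an XML source yields an XML update (xmlup); the Translator, working with the Storage Manager, expresses it as a key update (keyup); the key update is propagated through the XAT tree as XAT updates (xatup) until it reaches the XML view]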

Update Primitives (The Format of Delta)
XML Update Primitives (xup): apply to the original XML document
  Insert (xmlFragment, path)
  Delete (path)
  InsertAtt (name, value, path)
  DeleteAtt (name, path)
  Replace (oldValue, newValue, path)
XML Key Update Primitives (keyup): express the update on the original XML data in terms of LexKeys
  Insert (el, path)
  Delete (path)
  Replace (el, pos)
XAT Update Primitives (xatup): apply to an XAT table
  InsertTuple (tuple)
  DeleteTuple (tupleId)
  ChangeTuple (Keyup, columnName, tupleId)

A Complete Example

Execution
XAT algebra tree for the view definition query, listed bottom-up:
  S "bib.xml" $S1
  $S1, bib $col1
  $col1, book $b
  $b, price $col2
  $b, title $col4
  $col2 < 60
  T <book>$col4 $col2</book> $col5
  Agg $col5
  T <result>$col5</result> $col6
LexKeys assigned to bib.xml by the Storage Manager:
  bib b
    book b.b: price b.b.b, title b.b.cd
    book b.d: title b.d.f
    book b.f: price b.f.cm, title b.f.l
Intermediate XAT tables (cells hold LexKeys):
  $col1: b
  $b: b.b, b.d, b.f
  ($b, $col2): (b.b, b.b.b), (b.f, b.f.cm)
  ($b, $col2, $col4): (b.b, b.b.b, b.b.cd), (b.f, b.f.cm, b.f.l)
  ($col2, $col4) after the selection: (b.f.cm, b.f.l)
  $col5 after the Tagger: tb..b.f.l..b.f.cm
  $col5 after Agg: { tb..b.f.l..b.f.cm(b.f.l..b.f.cm) }
  $col6: tr
Constructed XDOMs (by XDOMKey):
  tb..b.f.l..b.f.cm: book with children b.f.l, b.f.cm
  tr: result with child tb..b.f.l..b.f.cm

View Maintenance
The algebra tree, the LexKeys and the intermediate tables are those of the execution phase. The update primitives, as listed on the slide:
  Source update: Insert (price, bib[1].book[2])
  In terms of LexKeys: Insert (price[b.d.b], bib[b].book[b.d])
  ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col1, b)
  ChangeTuple(insert(price[b.d.b], book[b.d]), $b, b.d)
  ChangeTuple(insert(price[b.d.b], bib[b].book[b.d]), $col2, b.f, b.f.m)
  InsertTuple({b.d, b.d.b})
  InsertTuple({b.d.b, b.d.f})
  InsertTuple({b.d.b, b.d.f})
  InsertTuple({tb..b.d.f..b.d.b})
  ChangeTuple(insert(tb..b.d.f..b.d.b, result[tr]), $col6, tr)
  ChangeTuple(insert(tb..b.d.f..b.d.b, null), $col5, )
State after maintenance:
  The new price node has the LexKey b.d.b
  ($b, $col2) gains the tuple (b.d, b.d.b)
  ($col2, $col4) gains the tuple (b.d.b, b.d.f)
  $col5 gains tb..b.d.f..b.d.b, and the aggregated cell becomes { tb..b.f.l..b.f.cm(b.f.l..b.f.cm), tb..b.d.f..b.d.b(..b.d.f..b.d.b) }
  A new XDOM is constructed under XDOMKey tb..b.d.f..b.d.b: book with children b.d.f, b.d.b

System Architecture
[Figure: architecture of XTUP within Rainbow. Labels in the diagram: XML sources, XML Query Engine, XML Algebra Tree, Executer, Storage Manager, Persistent Data Storage, Materialized XML View, Materialized Auxiliary Views, VM Initializer, Update Primitive Generator, Update Propagation Rules Repository, XML View Maintainer. The user supplies the view definition XQuery (execution, a one-time occurrence) and update XQueries (view maintenance, an on-update occurrence). The legend distinguishes processes from data.]

Related Work
A. Gupta, I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. In Bulletin of the Technical Committee on Data Engineering, 1995.
T. Griffin, L. Libkin. Incremental Maintenance of Views with Duplicates. In SIGMOD 1995.
H. Liefke, S. Davidson. View Maintenance for Hierarchical Semistructured Data. In DaWaK 2000.
S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB 1998.

Future Work
Near future ...
  Launch the system
  Batch update
  Experiments and evaluation: compare the system's performance to recomputation
... and beyond
  Batching updates coming from different sources
  Integrity constraints
  Algebra tree rewrite rules