1
XQuery to SQL by XML Algebra Tree
Brad Pielech, Brian Murphy
Thanks: Xin
2
Outline
1. Overview of Rainbow System
2. Process of translating XQuery -> SQL
3. XML Operators
4. Partial translation walkthrough with running example
3
Rainbow System
Complete XML <-> SQL system
Uses some ideas from XPERANTO, Niagara, and other systems
Several main subsystems:
  Document Shredder
  View Generator
  Query Translation, Query Rewrite
  Result Generation
Work in progress
4
Steps in Translation
1. User inputs XQuery query
2. User Query is converted into an XML Algebra Tree (XAT)
3. Database Mapping Query’s XAT is generated
4. Queries are Decorrelated
5. Trees are merged, unnecessary branches cut
5
Steps Continued
6. Computation Pushdown (presentation concludes here)
7. SQL Generation
8. Query Execution
9. Tagging of Results
6
What is the difference between the two queries?
The user query is executed over a view of the XML document and specifies what to return and how to return it.
The mapping query specifies how the view the user is querying “maps” to the database.
Therefore, combining the two queries into one is necessary in order to correctly process the user’s request.
7
XAT Operators
Each XAT is composed of XAT Operators.
Similar in concept to Relational Algebra.
Operator set is a combination of those in the Niagara and XPERANTO papers.
8
Set of Operators
SQL like (9): Project, Select, Join (Theta, Outer, Semi), Groupby, Orderby, Union (Node, Outer), Cartesian Product.
XML like (4): Tagger, Navigate, is(Element, Text), Aggregate.
Special: SQL, Function, Source, NameColumn, FOR
9
SQL like Operators (9)
XAT          Niagara  XPERANTO
Project      Expose   Project
Select       Select   Select
Theta Join   Join     Theta Join
Outer Join   N/A      Outer Join
Semi Join    N/A      N/A
Groupby      Group    Groupby
Orderby      N/A      Orderby
Union        Union    Union
Outer Union  Union    Outer Union
10
XML like Operators
XAT                   Niagara  XPERANTO
Tagger*(pattern)      Vertex   Project: cr8(Elem, AttList, Att, XMLFragList)
Navigate(from, path)  Follow   Project: get(TagName, Attributes, Contents, AttName, AttValue), Unnest
Is                    N/A      Select: is(Element, Text)
Aggregate             Group    AggXMLFrags
11
Special Operators
XAT         Niagara  XPERANTO     Description
SQL         N/A      Input        Denote a SQL query.
Function    N/A      Function     Used to represent recursive query
Source      Source   Table, View  Identify a data source.
NameColumn  Rename   N/A          Naming of columns.
FOR         N/A      N/A          FOR iteration.
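To make the operator tables concrete, here is a minimal sketch of how an XAT could be held in memory as a tree of operator nodes. The class name `XATOp`, its fields, and the spelling of the operator names as identifiers are assumptions made for illustration; the slides do not show Rainbow's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Operator names from the slides, grouped the same way (identifier spelling is ours).
SQL_LIKE = {"Project", "Select", "ThetaJoin", "OuterJoin", "SemiJoin",
            "Groupby", "Orderby", "Union", "OuterUnion", "CartesianProduct"}
XML_LIKE = {"Tagger", "Navigate", "Is", "Aggregate"}
SPECIAL = {"SQL", "Function", "Source", "NameColumn", "FOR"}


@dataclass
class XATOp:
    """One node of an XML Algebra Tree (hypothetical layout)."""
    name: str                      # operator name, e.g. "Navigate"
    args: tuple = ()               # operator arguments, e.g. ("$p", "team/text()")
    out: Optional[str] = None      # column the operator produces, e.g. "$a"
    children: List["XATOp"] = field(default_factory=list)

    def kind(self) -> str:
        """Return which of the three operator groups this node belongs to."""
        if self.name in SQL_LIKE:
            return "SQL like"
        if self.name in XML_LIKE:
            return "XML like"
        if self.name in SPECIAL:
            return "Special"
        raise ValueError(f"unknown operator: {self.name}")


# Bottom of the example user query: Source feeding the first Navigate.
source = XATOp("Source", ("sports.xml",))
nav = XATOp("Navigate", ("/", "sports/organization"), out="$p", children=[source])
```

Each node points at its inputs through `children`, so a whole query is just the root operator.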
12
Sports XML Document
<sports>
  <organization>
    <team> Boston Red Sox </team>
    <stadium sname = "Fenway Park"/>
    <starPlayer>
      <pname> Nomar </pname>
      <position> Shortstop </position>
    </starPlayer>
    <starPlayer>
      <pname> Pedro </pname>
      <position> Pitcher </position>
    </starPlayer>
    <starPlayer>
      <pname> Manny </pname>
      <position> Outfield </position>
    </starPlayer>
  </organization>
  <organization> … </organization>
  <stadium>
    <sname> Fenway Park </sname>
    <capacity> 33,000 </capacity>
    <yearBuilt> 1912 </yearBuilt>
    <ticket_high rate = "55"/>
    <ticket_low rate = "18"/>
  </stadium>
  <stadium> … </stadium>
  <player name="Pedro" number="45" rookieYear = "1991" />
  <player name="Nomar" number="5" rookieYear = "1997" />
  <player name="Manny" number="24" rookieYear = "1993" />
</sports>
13
Example XQuery
<bestPlayers>{
  For $p in document("sports.xml")/sports/organization
  Let $a = $p/team/text()
  Where $a = "Boston Red Sox"
  Return
    <playerName>
      $p/starPlayer/pname/text()
    </playerName>
}</bestPlayers>
List all of the star players’ names on the Boston Red Sox
14
XAT Tree for Example Query
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate("/", sports/organization)
Source("sports.xml")
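As a rough illustration of what the operators in this tree compute, the sketch below runs the same steps directly over a small copy of the sports document using Python's standard `xml.etree.ElementTree`. The function name `best_players` and the second organization ("Some Other Team") are made up for the example; this is not how Rainbow evaluates an XAT, since the point of the translation is to run SQL instead.

```python
import xml.etree.ElementTree as ET

SPORTS_XML = """
<sports>
  <organization>
    <team>Boston Red Sox</team>
    <starPlayer><pname>Nomar</pname><position>Shortstop</position></starPlayer>
    <starPlayer><pname>Pedro</pname><position>Pitcher</position></starPlayer>
    <starPlayer><pname>Manny</pname><position>Outfield</position></starPlayer>
  </organization>
  <organization>
    <team>Some Other Team</team>
    <starPlayer><pname>Someone</pname><position>Catcher</position></starPlayer>
  </organization>
</sports>
"""


def best_players(xml_text: str) -> ET.Element:
    root = ET.fromstring(xml_text)                    # Source("sports.xml")
    result = ET.Element("bestPlayers")                # Tagger(<bestPlayers> V1 </bestPlayers>)
    for p in root.findall("organization"):            # $p := Navigate("/", sports/organization)
        a = (p.findtext("team") or "").strip()        # $a := Navigate($p, team/text())
        if a != "Boston Red Sox":                     # Select($a = "Boston Red Sox")
            continue
        for pname in p.findall("starPlayer/pname"):   # $pname := Navigate($p, starPlayer/pname/text())
            tag = ET.SubElement(result, "playerName") # Tagger(<playerName> $pname </playerName>)
            tag.text = (pname.text or "").strip()
    return result                                     # children of result play the role of V1 := Aggregate


names = [e.text for e in best_players(SPORTS_XML)]
```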
15
RDBMS Tables of Sports Info
Organization
organizationID  teamName        stadiumName
1               Boston Red Sox  Fenway Park

Stadium
stadiumID  sname        Capacity  yearBuilt  ticketHigh  ticketLow
1          Fenway Park  33,000    1912       50          22

StarPlayer
starPlayerName  starPlayerPosition  organizationID
Nomar           ShortStop           1
Pedro           Pitcher             1
Manny           Outfield            1

PlayerInfo
PlayerName  Number  rookieYear
Nomar       5       1997
Pedro       45      1992
Manny       24      1993
16
Partial Default XML View
<Organization>
  <row>
    <organizationID> 1 </organizationID>
    <teamName> Boston Red Sox </teamName>
    <stadiumName> Fenway Park </stadiumName>
  </row>
</Organization>
<Stadium>
  <row>
    <stadiumID> 1 </stadiumID>
    <sname> Fenway Park </sname>
    …
  </row>
</Stadium>
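The default view is a mechanical one-to-one rendering of the tables: one element per table, one `<row>` per tuple, one child element per column. A minimal sketch of that rendering with Python's `sqlite3` and `xml.etree.ElementTree` follows. The function name `default_view`, the wrapper element `<default>`, and the column types are assumptions for illustration, not Rainbow's View Generator.

```python
import sqlite3
import xml.etree.ElementTree as ET


def default_view(conn: sqlite3.Connection, tables) -> ET.Element:
    """Render each table as <Table><row><column>value</column>...</row></Table>."""
    root = ET.Element("default")                    # wrapper element name is an assumption
    for table in tables:
        table_el = ET.SubElement(root, table)
        cur = conn.execute(f'SELECT * FROM "{table}"')
        columns = [d[0] for d in cur.description]   # column names as declared in the table
        for row in cur:
            row_el = ET.SubElement(table_el, "row")
            for col, value in zip(columns, row):
                ET.SubElement(row_el, col).text = str(value)
    return root


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Organization (organizationID INTEGER, teamName TEXT, stadiumName TEXT)")
conn.execute("INSERT INTO Organization VALUES (1, 'Boston Red Sox', 'Fenway Park')")
view = default_view(conn, ["Organization"])
```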
17
Challenge Question I
What is the XQuery that converts the default XML view (first document below) to the user view (second document below)?

Default XML view:
<Organization>
  <row>
    <organizationID> 1 </organizationID>
    <teamName> Boston Red Sox </teamName>
    <stadiumName> Fenway Park </stadiumName>
  </row>
</Organization>
…
<StarPlayer>
  <row>
    <starPlayerName> Nomar </starPlayerName>
    <starPlayerPosition> shortstop </starPlayerPosition>
    <organizationID> 1 </organizationID>
  </row>
</StarPlayer>
…

User view:
<organization>
  <team> Boston Red Sox </team>
  <stadium sname = "Fenway Park"/>
  <starPlayer>
    <pname> Nomar </pname>
    <position> Shortstop </position>
  </starPlayer>
  <starPlayer>
    <pname> Pedro </pname>
    <position> Pitcher </position>
  </starPlayer>
  <starPlayer>
    <pname> Manny </pname>
    <position> Outfield </position>
  </starPlayer>
</organization>
18
Mapping Query Part I
Create view invoice as (
<sports>
  FOR $organization IN view("default")/Organization/row
  RETURN
    <organization>
      <team> $organization/teamName/text() </team>
      <stadium sname = $organization/stadiumName/text() />
      FOR $starPlayer IN view("default")/StarPlayer/row
      WHERE $starPlayer/organizationID = $organization/organizationID
      RETURN
        <starPlayer>
          <pname> $starPlayer/starPlayerName/text() </pname>
          <position> $starPlayer/starPlayerPosition/text() </position>
        </starPlayer>
    </organization>
B1: the FOR $organization block
B2: the nested FOR $starPlayer block
19
Mapping Query Part II
  FOR $stadium IN view("default")/Stadium/row
  RETURN
    <stadium>
      <sname> $stadium/sname/text() </sname>
      <capacity> $stadium/capacity/text() </capacity>
      <yearBuilt> $stadium/yearBuilt/text() </yearBuilt>
      <ticket_high rate = $stadium/ticket_high_rate/text() />
      <ticket_low rate = $stadium/ticket_low_rate/text() />
    </stadium>
  FOR $player IN view("default")/PlayerInfo/row
  RETURN
    <player name = $player/playerName/text() number = $player/playerNumber/text() rookieYear = $player/rookieYear/text() />
</sports>)
B3: the FOR $stadium block
B4: the FOR $player block
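For readers less familiar with XQuery, blocks B1 and B2 of the mapping query can be read as two nested loops over the default view. The sketch below mimics them in Python. The `<default>` wrapper element and the function name `map_to_user_view` are assumptions for illustration, and B3 and B4 are left out.

```python
import xml.etree.ElementTree as ET

DEFAULT_VIEW = """
<default>
  <Organization>
    <row><organizationID>1</organizationID><teamName>Boston Red Sox</teamName><stadiumName>Fenway Park</stadiumName></row>
  </Organization>
  <StarPlayer>
    <row><starPlayerName>Nomar</starPlayerName><starPlayerPosition>Shortstop</starPlayerPosition><organizationID>1</organizationID></row>
    <row><starPlayerName>Pedro</starPlayerName><starPlayerPosition>Pitcher</starPlayerPosition><organizationID>1</organizationID></row>
    <row><starPlayerName>Manny</starPlayerName><starPlayerPosition>Outfield</starPlayerPosition><organizationID>1</organizationID></row>
  </StarPlayer>
</default>
"""


def map_to_user_view(default_xml: str) -> ET.Element:
    view = ET.fromstring(default_xml)
    sports = ET.Element("sports")
    for org in view.findall("Organization/row"):                 # B1: FOR $organization
        org_el = ET.SubElement(sports, "organization")
        ET.SubElement(org_el, "team").text = org.findtext("teamName")
        ET.SubElement(org_el, "stadium", sname=org.findtext("stadiumName"))
        for sp in view.findall("StarPlayer/row"):                # B2: FOR $starPlayer
            # WHERE $starPlayer/organizationID = $organization/organizationID
            if sp.findtext("organizationID") != org.findtext("organizationID"):
                continue
            sp_el = ET.SubElement(org_el, "starPlayer")
            ET.SubElement(sp_el, "pname").text = sp.findtext("starPlayerName")
            ET.SubElement(sp_el, "position").text = sp.findtext("starPlayerPosition")
    return sports


user_view = map_to_user_view(DEFAULT_VIEW)
```

The inner loop refers to the outer loop's `org`, which is the correlation that the decorrelation step later removes.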
20
Cutting Mapping Query
The mapping query has data that is unused by the user query, so we can get rid of it.
  B3 and B4 are completely removed
  Remove stadium from B1
  Remove position from B2
21
Mapping Query XAT General Form
$organization := Navigate("/", Organization/row)
Source("default.xml")
FOR $organization
More Stuff
Some Stuff
Source("default.xml")
FOR $starPlayer
Some Stuff will be shown in Part I
More Stuff in Part II
B1
B2
$starPlayer := Navigate("/", StarPlayer/row)
22
Mapping Query XAT Part I
B1
O := Tagger(<sports> All </sports>)
All := Aggregate
Tagger(<organization> V0 </organization>)
V0 := Aggregate
Tagger(<team> $tname </team>)
$tname := Navigate($organization, teamName/text())
$starPlayer := Navigate("/", StarPlayer/row)
Source("default.xml")
FOR $starPlayer
To: Part II
Some Stuff
FOR $organization
23
Mapping Query XAT Part II
Aggregate
$ID := Navigate($organization, organizationID)
Select($starPlayerID = $ID)
$starPlayerID := Navigate($starPlayer, organizationID)
$sname := Navigate($starPlayer, starPlayerName)
To: Part I
B2
Tagger(<starPlayer> <pname> $sname </pname> </starPlayer>)
More Stuff
24
Decorrelated Mapping XAT Part I
<sports>
  <organization>
    <team> Boston Red Sox </team>
    <starPlayer> <pname> Nomar </pname> </starPlayer>
    <starPlayer> <pname> Pedro </pname> </starPlayer>
    <starPlayer> <pname> Manny </pname> </starPlayer>
  </organization>
</sports>
O := Tagger(<sports> All </sports>)
All := Aggregate
Tagger(<organization> V0 </organization>)
V0 := Aggregate
Tagger(<team> $tname </team>)
$tname := Navigate($organization, teamName/text())
From Part II
25
Decorrelated Mapping XAT Part II
Source("default.xml")
Source("default.xml")
$organization := Navigate("/", Organization/row)
$starPlayer := Navigate("/", StarPlayer/row)
Cartesian Product
$ID := Navigate($organization, organizationID)
$starPlayerID := Navigate($starPlayer, organizationID)
Select($starPlayerID = $ID)
$sname := Navigate($starPlayer, starPlayerName)
To Part I
Aggregate
Tagger(<starPlayer> <pname> $sname </pname> </starPlayer>)
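The effect of decorrelation can be sketched on plain tuples: the correlated form re-evaluates the inner FOR once per organization, while the decorrelated form takes one Cartesian Product, applies Select($starPlayerID = $ID), and then aggregates. The function names and the dictionary-based rows below are hypothetical; this only illustrates the idea and is not Rainbow's decorrelation algorithm.

```python
from itertools import product

organizations = [{"organizationID": 1, "teamName": "Boston Red Sox"}]
star_players = [
    {"starPlayerName": "Nomar", "organizationID": 1},
    {"starPlayerName": "Pedro", "organizationID": 1},
    {"starPlayerName": "Manny", "organizationID": 1},
]


def correlated(orgs, players):
    """Nested FORs: the inner loop is re-run for every outer binding."""
    out = {}
    for org in orgs:                                         # FOR $organization
        out[org["organizationID"]] = [
            sp["starPlayerName"] for sp in players           # FOR $starPlayer
            if sp["organizationID"] == org["organizationID"]
        ]
    return out


def decorrelated(orgs, players):
    """Cartesian Product, then Select($starPlayerID = $ID), then Aggregate."""
    out = {}
    for org, sp in product(orgs, players):                   # Cartesian Product
        if sp["organizationID"] == org["organizationID"]:    # Select($starPlayerID = $ID)
            out.setdefault(org["organizationID"], []).append(sp["starPlayerName"])  # Aggregate
    return out


same = correlated(organizations, star_players) == decorrelated(organizations, star_players)
```

On this data both forms return the same grouping, which is what makes the rewrite safe for the running example.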
26
Progress Report
1. User inputs XQuery query
2. User Query is converted into an XML Algebra Tree (XAT)
3. Database Mapping Query’s XAT is generated
4. Queries are Decorrelated
5. Trees are merged, unnecessary branches cut
27
XAT merging
Input: User Query XAT + Mapping Query XAT
Output: Simplified composite XAT
Approach:
  The Tagger from the top of the Mapping Query is linked to the bottom of the User Query.
  The Source Operator at the bottom of the User Query is deleted.
  Pushdown Navigation, by using the commutative rules
  Cancel out the navigation operators, by using the composition rules
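One way to picture the cancellation of navigation operators: each Navigate in the user query walks down elements that a Tagger in the mapping query built, so the path can be resolved against the Tagger pattern and replaced by the column that filled that position. The sketch below does this on a nested-tuple template. `MAPPING_TEMPLATE` and `resolve` are hypothetical names, and the template encodes the mapping query after cutting; it is an illustration of the idea, not the rule set Rainbow uses.

```python
# Tagger pattern of the cut mapping query: (tag, content) pairs, where content
# is either a column name (string) or a list of nested (tag, content) pairs.
MAPPING_TEMPLATE = [
    ("sports", [
        ("organization", [
            ("team", "$tname"),
            ("starPlayer", [("pname", "$sname")]),
        ]),
    ]),
]


def resolve(path, nodes):
    """Cancel a Navigate path against the Tagger template.

    Returns a column name when the Navigate is fully cancelled, or the
    remaining sub-template that later Navigates can continue from.
    """
    for step in path.split("/"):
        if step == "text()":
            continue
        if isinstance(nodes, str):
            raise KeyError(f"cannot navigate {step!r} below column {nodes}")
        for tag, content in nodes:
            if tag == step:
                nodes = content
                break
        else:
            raise KeyError(f"no Tagger builds <{step}> here")
    return nodes


p = resolve("sports/organization", MAPPING_TEMPLATE)  # $p := Navigate(O, sports/organization)
a = resolve("team/text()", p)                         # $a := Navigate($p, team/text())
pname = resolve("starPlayer/pname/text()", p)         # $pname := Navigate($p, starPlayer/pname/text())
```

Here `a` resolves to `$tname` and `pname` to `$sname`, which is why only column renaming is left once the Navigates and Taggers cancel.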
28
Combined XAT
User Query
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate(O, sports/organization)
Top of Mapping Query
O := Tagger(<sports> All </sports>)
All := Aggregate
Tagger(<organization> V0 </organization>)
V0 := Aggregate
Tagger(<team> $tname </team>)
$tname := Navigate($organization, teamName/text())
Rest of Mapping Query
29
Computation Pushdown Part I
What is PushDown?
After merging the 2 XATs, there may be redundancies in the larger tree.
  Ex: The user query and mapping query may navigate to the same thing
The decorrelated query tree may be unorganized and inefficient.
Pushdown aims to eliminate these problems.
30
Computation Pushdown Part II
XPERANTO mentions pushdown as a means of pushing computation to the relational engine.
Niagara defines equivalence rules and specifies several different heuristics for using the rules.
31
XAT Pushdown Example Part I
User Query
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate(O, sports/organization)
Top of Mapping Query
O := Tagger(<sports> All </sports>)
All := Aggregate
Tagger(<organization> V0 </organization>)
V0 := Aggregate
Tagger(<team> $tname </team>)
$tname := Navigate($organization, teamName/text())
Rest of Mapping Query
32
XAT Pushdown Example Part II
User Query
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate(O, sports/organization)
Top of Mapping Query
O := Tagger(<sports> All </sports>)
All := Aggregate
Tagger(<organization> V0 </organization>)
V0 := Aggregate
Tagger(<team> $tname </team>)
$tname := Navigate($organization, teamName/text())
Rest of Mapping Query
33
XAT Pushdown Example Part III
User Query
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate(O, sports/organization)

Tagger(<starPlayer> <pname> $sname </pname> </starPlayer>)
$sname := Navigate($starPlayer, starPlayerName)
Select($starPlayerID = $ID)
$starPlayerID := Navigate($starPlayer, organizationID)
$ID := Navigate($organization, organizationID)
Cartesian Product
  $organization := Navigate("/", Organization/row)
    Source("default.xml")
  $starPlayer := Navigate("/", StarPlayer/row)
    Source("default.xml")
34
XAT Pushdown Example Part IV
User Query
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
$pname := Navigate($p, starPlayer/pname/text())
Select($a = "Boston Red Sox")
$a := Navigate($p, team/text())
$p := Navigate(O, sports/organization)

Tagger(<starPlayer> <pname> $sname </pname> </starPlayer>)
$sname := Navigate($starPlayer, starPlayerName)
Select($starPlayerID = $ID)
$starPlayerID := Navigate($starPlayer, organizationID)
$ID := Navigate($organization, organizationID)
Cartesian Product
  $organization := Navigate("/", Organization/row)
    Source("default.xml")
  $starPlayer := Navigate("/", StarPlayer/row)
    Source("default.xml")
35
Challenge Questions II & III
What are some of the heuristics we could use during Pushdown?
  What can / should we try to accomplish?
  What should the tree look like afterwards?
How could we go about pushing things down?
  What would the algorithm be?
  How do we know if an operator can be pushed down?
  When do we stop pushing an operator down?
36
Computation Pushdown Part III
Goal: Tagger + SQL operators + XML operators
Use Equivalence rules repository to swap operators
Step 1: Navigation Pushdown.
  Cancel Mapping Query Taggers and corresponding Aggregates
  Delete redundant Navigates from User Query
  Rename columns in Mapping Query
Step 2: SQL Computation Pushdown.
  By commutative and composition rules.
37
Equivalence Rules
Pair-wise rules that determine if one operator (parent) may be pushed through another (child)
  Navigate / Navigate rule: If the parent depends on the child, they may not be swapped
  Navigate / Join: Navigate is pushed to the side of the join that its entry point comes from
  And many, many more
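The two rules named above can be sketched as small predicates. The `Navigate` record and the functions `can_swap` and `join_side` are hypothetical names for illustration; the slides do not show how the rules repository is actually encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Navigate:
    out: str    # column produced, e.g. "$a"
    entry: str  # column navigated from (entry point), e.g. "$p"
    path: str   # path followed from the entry point


def can_swap(parent: Navigate, child: Navigate) -> bool:
    """Navigate / Navigate rule: no swap if the parent reads what the child produces."""
    return parent.entry != child.out


def join_side(nav: Navigate, left_cols: set, right_cols: set) -> str:
    """Navigate / Join rule: push toward the join input that supplies the entry point."""
    if nav.entry in left_cols:
        return "left"
    if nav.entry in right_cols:
        return "right"
    raise ValueError(f"entry point {nav.entry} comes from neither join input")


p = Navigate("$p", "O", "sports/organization")
a = Navigate("$a", "$p", "team/text()")
pname = Navigate("$pname", "$p", "starPlayer/pname/text()")
tname = Navigate("$tname", "$organization", "teamName/text()")

a_over_p = can_swap(a, p)          # $a navigates from $p, so it depends on the child
pname_over_a = can_swap(pname, a)  # $pname does not read $a
side = join_side(tname, {"$organization"}, {"$starPlayer"})
```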
38
Pushdown Results
1. Push Navigates to the correct side of Cartesian Product
2. Create a NameColumn operator that renames $tname into $a
3. Create a 2nd NameColumn operator that renames $sname into $pname
4. Get rid of all Taggers and Aggregates from Mapping Query and Navigates that were crossed out from User Query
5. Merge Select($starPlayerID = $ID) and Cartesian into a Join
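Step 5 is a local tree rewrite. A minimal sketch on operators written as nested tuples is shown below. The tuple encoding, the placeholder leaf names, and the function `merge_select_into_join` are assumptions for illustration; the sketch also assumes the Select already sits directly on top of the Cartesian Product, which is the situation after the Navigates have been pushed to their sides.

```python
def merge_select_into_join(op):
    """Rewrite Select(pred, CartesianProduct(l, r)) into Join(pred, l, r).

    Operators are tuples whose first item is the operator name; anything that
    is not a tuple (names, predicates, leaf placeholders) is returned as is.
    """
    if not isinstance(op, tuple):
        return op
    if (op[0] == "Select" and len(op) == 3
            and isinstance(op[2], tuple) and op[2][0] == "CartesianProduct"):
        _, pred, (_, left, right) = op
        return ("Join", pred,
                merge_select_into_join(left),
                merge_select_into_join(right))
    return tuple(merge_select_into_join(part) for part in op)


before = ("Select", "$starPlayerID = $ID",
          ("CartesianProduct", "organization branch", "starPlayer branch"))
after = merge_select_into_join(before)
```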
39
XAT After Computation PushDown Part I
Tagger(<bestPlayers> V1 </bestPlayers>)
V1 := Aggregate
Tagger(<playerName> $pname </playerName>)
Select($a = "Boston Red Sox")
NameColumn($pname = $sname)
NameColumn($a = $tname)
From Part II
40
XAT After Computation PushDown Part II
$starPlayerID := Navigate($starPlayer, organizationID)
$sname := Navigate($starPlayer, starPlayerName)
$starPlayer := Navigate("/", StarPlayer/row)
Source("default.xml")
$ID := Navigate($organization, organizationID)
Source("default.xml")
$organization := Navigate("/", Organization/row)
$tname := Navigate($organization, teamName/text())
Join on ($ID = $starPlayerID)
To Part I
41
Rest of the Process
1. Take the Combined XAT from the previous slide and generate a single SQL query.
2. Execute query on local RDBMS
3. Format result tuples according to Tagger
4. Return XML document to user
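The presentation stops before SQL generation, so the following is only a guess at what these last steps could look like for the running example: a hand-written SQL query corresponding to the pushed-down tree (Join on $ID = $starPlayerID, Select on the team name), executed on an in-memory SQLite database and tagged with the two remaining Taggers. The SQL text, the column types, and the function name `run_and_tag` are assumptions, not output of Rainbow.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Organization (organizationID INTEGER, teamName TEXT, stadiumName TEXT);
    CREATE TABLE StarPlayer (starPlayerName TEXT, starPlayerPosition TEXT, organizationID INTEGER);
    INSERT INTO Organization VALUES (1, 'Boston Red Sox', 'Fenway Park');
    INSERT INTO StarPlayer VALUES ('Nomar', 'ShortStop', 1);
    INSERT INTO StarPlayer VALUES ('Pedro', 'Pitcher', 1);
    INSERT INTO StarPlayer VALUES ('Manny', 'Outfield', 1);
""")

# Hand-written stand-in for the generated query (an assumption):
# Join on ($ID = $starPlayerID), Select($a = "Boston Red Sox"), keep $pname.
SQL = """
    SELECT s.starPlayerName AS pname
    FROM Organization AS o
    JOIN StarPlayer AS s ON o.organizationID = s.organizationID
    WHERE o.teamName = ?
    ORDER BY s.rowid
"""


def run_and_tag(conn: sqlite3.Connection, team: str) -> str:
    best = ET.Element("bestPlayers")                      # Tagger(<bestPlayers> V1 </bestPlayers>)
    for (pname,) in conn.execute(SQL, (team,)):           # execute query on the RDBMS
        ET.SubElement(best, "playerName").text = pname    # Tagger(<playerName> $pname </playerName>)
    return ET.tostring(best, encoding="unicode")          # XML document returned to the user


xml_result = run_and_tag(conn, "Boston Red Sox")
```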
42
Summary
1. Created XAT of the user query
2. Created XAT for mapping query
   1. Cut information unused by user query
   2. Decorrelated Mapping query
3. Merged two queries into 1 larger XAT
4. Identified weaknesses in combined tree
5. Walked through pushdown steps
6. Displayed final, optimized tree
43
The End!!!