Query Rewrite
Starburst Model (IBM)
[Figure: DB2 Query Optimizer (Starburst). Stages: Parsing and Semantic Checking, Query Rewrite, Plan Optimization (compile time); Query Evaluation System (run time). Artifacts: Query Graph Model (QGM), Executable Plan. Arrow types: data flow, control flow.]
Goal of Query Rewrite
• Make queries as declarative as possible
  – Poorly expressed queries could force the optimizer into choosing suboptimal plans
• Perform natural heuristics
  – For example, “predicate pushdown”
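To make the "predicate pushdown" heuristic concrete, the following is a minimal sketch on in-memory lists. The `orders` and `customers` relations, their columns, and both function names are hypothetical and not from the slides. It compares the size of the intermediate join result when the filter runs after the join with its size when the filter is pushed below the join.

```python
# Hypothetical toy relations (not from the slides).
orders = [(i, i % 5) for i in range(100)]                      # (order_id, customer_id)
customers = [(c, "DE" if c == 0 else "US") for c in range(5)]  # (customer_id, country)

def join_then_filter():
    """Join first, then apply the predicate country = 'DE'."""
    joined = [(o, c) for o in orders for c in customers if o[1] == c[0]]
    result = [o[0] for o, c in joined if c[1] == "DE"]
    return len(joined), result

def filter_then_join():
    """Push the predicate below the join: filter customers first."""
    kept = [c for c in customers if c[1] == "DE"]
    joined = [(o, c) for o in orders for c in kept if o[1] == c[0]]
    result = [o[0] for o, c in joined]
    return len(joined), result

size_late, rows_late = join_then_filter()
size_early, rows_early = filter_then_join()
print(size_late, size_early)    # intermediate result sizes
print(rows_late == rows_early)  # both orders of evaluation return the same rows
```

Both variants return the same answer; the pushed-down variant simply builds a smaller intermediate result, which is why the heuristic is applied without a cost comparison.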
Components of Rewrite Engine
• Rewrite rules (more later)
• Rule engine
  – control strategies
    • sequential (rules are processed sequentially)
    • priority (higher priority rules are given a chance first)
    • statistical (next rule is chosen randomly based on a user-defined probability distribution)
  – budget
    • to avoid spending too much time on rewrites, the processing stops at a consistent state of QGM when the budget is exhausted
• Search facility
  – browses through QGM providing the context for the rules to work on
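The interplay of control strategy and budget can be sketched as below. This is an illustrative toy, not Starburst's rule engine: the `Rule` class, the `rewrite` function, the dictionary standing in for the QGM, and the two example rules are all assumptions made for the sketch, and the statistical strategy is left out. Each rule action returns a complete new QGM, so stopping after any firing leaves a consistent state.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

QGM = Dict[str, int]  # toy stand-in for a query graph (hypothetical)

@dataclass
class Rule:
    name: str
    priority: int
    condition: Callable[[QGM], bool]  # does the rule apply to this QGM?
    action: Callable[[QGM], QGM]      # returns a new, consistent QGM

def rewrite(qgm: QGM, rules: List[Rule], budget: int,
            strategy: str = "sequential") -> Tuple[QGM, List[str]]:
    """Fire rules until none applies or the budget is exhausted."""
    if strategy == "priority":
        rules = sorted(rules, key=lambda r: r.priority, reverse=True)
    applied: List[str] = []
    progress = True
    while progress and budget > 0:
        progress = False
        for rule in rules:
            if budget == 0:
                break
            if rule.condition(qgm):
                qgm = rule.action(qgm)
                applied.append(rule.name)
                budget -= 1  # one unit of budget per rule firing
                progress = True
                if strategy == "priority":
                    break    # restart from the highest-priority rule
    return qgm, applied

rules = [
    Rule("push_predicate", 1, lambda g: g["unpushed"] > 0,
         lambda g: {**g, "unpushed": g["unpushed"] - 1}),
    Rule("merge_view", 5, lambda g: g["views"] > 0,
         lambda g: {**g, "views": g["views"] - 1}),
]
start = {"views": 1, "unpushed": 2}
print(rewrite(start, rules, budget=10))
print(rewrite(start, rules, budget=10, strategy="priority"))
print(rewrite(start, rules, budget=1, strategy="priority"))
```

With a budget of 1 the priority run stops after the view merge, leaving the two predicates unpushed but the toy QGM still well-formed.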
Problem
• How do we choose between competing incompatible transformations?
• Optimal solution: apply cost analysis and pick the transformation leading to a cheaper plan
• Practical solution (why?): generate multiple alternatives and send them to the plan optimization phase (problems?)
Rewrite Rules: SELECT Merge
View definition:
  CREATE VIEW itpv AS
  (SELECT DISTINCT itp.itemn, pur.vendn
   FROM itp, pur
   WHERE itp.ponum = pur.ponum AND
         pur.odate > '85')
Original query:
  SELECT itm.itmn, itpv.vendn
  FROM itm, itpv
  WHERE itm.itemn = itpv.itemn AND
        itm.itemn > '01' AND
        itm.itemn < '20'
Rewritten query (view merged into the query):
  SELECT DISTINCT itm.itmn, pur.vendn
  FROM itm, itp, pur
  WHERE itp.ponum = pur.ponum AND
        itm.itemn = itp.itemn AND
        pur.odate > '85' AND
        itm.itemn > '01' AND
        itm.itemn < '20'
Speedup: 200 times
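The equivalence of the two forms can be checked on a small data set. The sketch below uses Python's sqlite3 module; the column types, the sample rows, and the assumption that itm.itemn is a key with a unique itmn per row are illustrative choices, not taken from the slides. Without that key assumption the DISTINCT in the merged query could remove rows the original query would keep.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and rows; itm.itemn is assumed to be a key.
CREATE TABLE itm (itemn TEXT PRIMARY KEY, itmn TEXT UNIQUE);
CREATE TABLE itp (itemn TEXT, ponum INTEGER);
CREATE TABLE pur (ponum INTEGER PRIMARY KEY, vendn TEXT, odate TEXT);
INSERT INTO itm VALUES ('05', 'A'), ('10', 'B'), ('25', 'C');
INSERT INTO itp VALUES ('05', 1), ('05', 2), ('10', 2), ('10', 3), ('25', 1);
INSERT INTO pur VALUES (1, 'V1', '86'), (2, 'V1', '87'), (3, 'V2', '84');
CREATE VIEW itpv AS
  SELECT DISTINCT itp.itemn AS itemn, pur.vendn AS vendn
  FROM itp, pur
  WHERE itp.ponum = pur.ponum AND pur.odate > '85';
""")

# Query written against the view.
original_rows = conn.execute("""
SELECT itm.itmn, itpv.vendn
FROM itm, itpv
WHERE itm.itemn = itpv.itemn AND itm.itemn > '01' AND itm.itemn < '20'
ORDER BY 1, 2
""").fetchall()

# Same query after the view body has been merged into it.
merged_rows = conn.execute("""
SELECT DISTINCT itm.itmn, pur.vendn
FROM itm, itp, pur
WHERE itp.ponum = pur.ponum AND itm.itemn = itp.itemn
  AND pur.odate > '85' AND itm.itemn > '01' AND itm.itemn < '20'
ORDER BY 1, 2
""").fetchall()

print(original_rows)
print(merged_rows)
```

The merged form gives the plan optimizer a single three-way join to order freely, instead of forcing it to materialize the view first.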
Rewrite Rules: Existential Subquery Merge
Original query:
  SELECT *
  FROM itp
  WHERE itp.itemn IN
        (SELECT itl.itemn
         FROM itl
         WHERE itl.wkcen = 'WK468' AND
               itl.locan = 'L')
Rewritten query (subquery merged into a join):
  SELECT DISTINCT itp.*
  FROM itp, itl
  WHERE itp.itemn = itl.itemn AND
        itl.wkcen = 'WK468' AND
        itl.locan = 'L'
Speedup: 15 times
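A small sqlite3 run shows why the merged query needs DISTINCT. The schema, the sample rows, and the assumption that itp contains no duplicate rows are hypothetical; the sketch also uses itemn as the join column name in both tables. The itl table deliberately holds two matching rows for the same item, so the plain join returns the itp row twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and rows; itp is assumed to have no duplicate rows.
CREATE TABLE itp (itemn TEXT, ponum INTEGER);
CREATE TABLE itl (itemn TEXT, wkcen TEXT, locan TEXT);
INSERT INTO itp VALUES ('05', 1), ('10', 2), ('25', 3);
INSERT INTO itl VALUES
  ('05', 'WK468', 'L'), ('05', 'WK468', 'L'),
  ('10', 'WK468', 'X'), ('25', 'WK100', 'L');
""")

subquery_rows = conn.execute("""
SELECT * FROM itp
WHERE itp.itemn IN (SELECT itl.itemn FROM itl
                    WHERE itl.wkcen = 'WK468' AND itl.locan = 'L')
ORDER BY 1, 2
""").fetchall()

join_sql = """
SELECT {distinct} itp.* FROM itp, itl
WHERE itp.itemn = itl.itemn
  AND itl.wkcen = 'WK468' AND itl.locan = 'L'
ORDER BY 1, 2
"""
merged_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
join_no_distinct = conn.execute(join_sql.format(distinct="")).fetchall()

print(subquery_rows)     # one row per qualifying itp tuple
print(merged_rows)       # same rows after the merge
print(join_no_distinct)  # duplicates appear without DISTINCT
```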
Rewrite Rules: Intersect to Exists
Original query:
  SELECT itemn
  FROM wor
  WHERE empno = 'EMPN1279'
  INTERSECT
  SELECT itmn
  FROM itl
  WHERE entry_time = '9773' AND
        wkctr = 'WK195'
Rewritten query:
  SELECT DISTINCT itemn
  FROM wor, itl
  WHERE empno = 'EMPN1279' AND
        entry_time = '9773' AND
        wkctr = 'WK195' AND
        itl.itmn = wor.itemn
Speedup: 8 times
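The two forms can be compared with sqlite3 on made-up data. The schema and rows are hypothetical, and the sketch assumes the compared columns contain no NULLs, since INTERSECT treats two NULLs as equal while a join predicate does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical schema and rows; itemn and itmn are assumed non-null.
CREATE TABLE wor (itemn TEXT NOT NULL, empno TEXT);
CREATE TABLE itl (itmn TEXT NOT NULL, entry_time TEXT, wkctr TEXT);
INSERT INTO wor VALUES
  ('05', 'EMPN1279'), ('05', 'EMPN1279'),
  ('10', 'EMPN1279'), ('25', 'EMPN0001');
INSERT INTO itl VALUES
  ('05', '9773', 'WK195'), ('10', '9773', 'WK100'), ('25', '9773', 'WK195');
""")

intersect_rows = conn.execute("""
SELECT itemn FROM wor WHERE empno = 'EMPN1279'
INTERSECT
SELECT itmn FROM itl WHERE entry_time = '9773' AND wkctr = 'WK195'
ORDER BY 1
""").fetchall()

join_rows = conn.execute("""
SELECT DISTINCT itemn
FROM wor, itl
WHERE empno = 'EMPN1279' AND entry_time = '9773' AND wkctr = 'WK195'
  AND itl.itmn = wor.itemn
ORDER BY 1
""").fetchall()

print(intersect_rows)
print(join_rows)
```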
The Count Bug
Schema:
  parts(PNUM, QOH)
  supply(PNUM, QUAN, SHIPDATE)
Query: Find the part numbers of those parts whose quantities on hand equal the number of shipments of those parts before 1-1-80.
  select PNUM
  from parts
  where QOH = (select count(SHIPDATE)
               from supply
               where supply.PNUM = parts.PNUM
                 and SHIPDATE < 1-1-80)
The Count Bug (cont.)
Nested query:
  select PNUM
  from parts
  where QOH = (select count(SHIPDATE)
               from supply
               where supply.PNUM = parts.PNUM
                 and SHIPDATE < 1-1-80)
Transformed (unnested) query:
  temp (SUPPNUM, CT) =
    (select PNUM, count(SHIPDATE)
     from supply
     where SHIPDATE < 1-1-80
     group by PNUM)
  select parts.PNUM
  from parts, temp
  where parts.QOH = temp.CT and
        temp.SUPPNUM = parts.PNUM
The Count Bug (cont.)
Parts
  PNUM  QOH
  3     6
  10    1
  8     0

Supply
  PNUM  QUAN  SHIPDATE
  3     4     7-3-79
  3     2     10-1-78
  10    1     6-8-78
  10    2     8-10-81
  8     5     5-7-83

Nested query:
  select PNUM
  from parts
  where QOH = (select count(SHIPDATE)
               from supply
               where supply.PNUM = parts.PNUM
                 and SHIPDATE < 1-1-80)

Result
  PNUM
  10
  8
The Count Bug (cont.)
Temp
  temp (SUPPNUM, CT) =
    (select PNUM, count(SHIPDATE)
     from supply
     where SHIPDATE < 1-1-80
     group by PNUM)

  SUPPNUM  CT
  3        2
  10       1
The Count Bug (cont.)
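Parts
  PNUM  QOH
  3     6
  10    1
  8     0

Temp
  SUPPNUM  CT
  3        2
  10       1

Transformed query:
  select parts.PNUM
  from parts, temp
  where parts.QOH = temp.CT and
        temp.SUPPNUM = parts.PNUM

Result
  PNUM
  10

Part 8 is lost: it has no shipments before 1-1-80, so it has no row in Temp, while the nested query counts 0 for it and matches QOH = 0.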
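The discrepancy can be reproduced with sqlite3 using the Parts and Supply rows from the slides. Two details are assumptions of this sketch: the dates are rewritten as ISO strings (reading 7-3-79 as 1979-07-03, and so on) so that text comparison orders them correctly, and the temporary relation is named `temp_ct` rather than temp to avoid clashing with SQLite's TEMP keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts  (PNUM INTEGER, QOH INTEGER);
CREATE TABLE supply (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT);
INSERT INTO parts  VALUES (3, 6), (10, 1), (8, 0);
-- Dates converted to ISO format (an assumption of this sketch).
INSERT INTO supply VALUES
  (3, 4, '1979-07-03'), (3, 2, '1978-10-01'), (10, 1, '1978-06-08'),
  (10, 2, '1981-08-10'), (8, 5, '1983-05-07');
""")

# Original nested query: the count is evaluated once per parts row.
nested = conn.execute("""
SELECT PNUM FROM parts
WHERE QOH = (SELECT COUNT(SHIPDATE) FROM supply
             WHERE supply.PNUM = parts.PNUM AND SHIPDATE < '1980-01-01')
ORDER BY PNUM
""").fetchall()

# Transformation: precompute the counts, then join.
conn.execute("""
CREATE TABLE temp_ct AS
SELECT PNUM AS SUPPNUM, COUNT(SHIPDATE) AS CT
FROM supply WHERE SHIPDATE < '1980-01-01' GROUP BY PNUM
""")
temp_rows = conn.execute("SELECT * FROM temp_ct ORDER BY SUPPNUM").fetchall()
unnested = conn.execute("""
SELECT parts.PNUM FROM parts, temp_ct
WHERE parts.QOH = temp_ct.CT AND temp_ct.SUPPNUM = parts.PNUM
ORDER BY parts.PNUM
""").fetchall()

print(nested)     # parts 8 and 10
print(temp_rows)  # no group for part 8
print(unnested)   # only part 10
```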
The Count Bug – solution with outer joins

R
  X
  A
  B

S
  Y
  B
  C
  E

R =+ S
  X     Y
  A     null
  B     B
  null  C
  null  E
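The result table above keeps unmatched values from both R and S, which the following sketch reads as a full outer join on X = Y. The function name `full_outer_join` and the use of Python's None for null are choices made for this illustration only.

```python
R = ["A", "B"]       # values of column X
S = ["B", "C", "E"]  # values of column Y

def full_outer_join(left, right):
    """Pair equal values; pad unmatched values from either side with None."""
    rows = []
    for x in left:
        matches = [y for y in right if x == y]
        if matches:
            rows.extend((x, y) for y in matches)
        else:
            rows.append((x, None))  # unmatched left value
    for y in right:
        if not any(x == y for x in left):
            rows.append((None, y))  # unmatched right value
    return rows

joined = full_outer_join(R, S)
for x, y in joined:
    print(x, y)
```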
The Count Bug – solution with outer joins (cont.)
  temp (SUPPNUM, CT) =
    (select parts.PNUM, count(SHIPDATE)
     from parts, supply
     where SHIPDATE < 1-1-80 and
           parts.PNUM =+ supply.PNUM
     group by parts.PNUM)

parts.PNUM =+ supply.PNUM (for SHIPDATE < 1-1-80)
  Parts.PNUM  Parts.QOH  Supply.PNUM  Supply.QUAN  Supply.SHIPDATE
  3           6          3            4            7-3-79
  3           6          3            2            10-1-78
  10          1          10           1            6-8-78
  8           0          null         null         null
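In standard SQL the =+ notation corresponds here to a left outer join that preserves every parts row. The sqlite3 sketch below is one way to express it, under the same assumptions as before (ISO date strings, a relation named `temp_ct` instead of temp). The date predicate is placed in the ON clause, which matches the "(for SHIPDATE < 1-1-80)" note: putting it in WHERE would discard the null-padded row for part 8 and bring the bug back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parts  (PNUM INTEGER, QOH INTEGER);
CREATE TABLE supply (PNUM INTEGER, QUAN INTEGER, SHIPDATE TEXT);
INSERT INTO parts  VALUES (3, 6), (10, 1), (8, 0);
-- Dates converted to ISO format (an assumption of this sketch).
INSERT INTO supply VALUES
  (3, 4, '1979-07-03'), (3, 2, '1978-10-01'), (10, 1, '1978-06-08'),
  (10, 2, '1981-08-10'), (8, 5, '1983-05-07');
""")

# Counts per part; COUNT(column) ignores the NULLs of padded rows,
# so a part without qualifying shipments gets a count of 0.
counts_sql = """
SELECT parts.PNUM AS SUPPNUM, COUNT(supply.SHIPDATE) AS CT
FROM parts LEFT OUTER JOIN supply
  ON parts.PNUM = supply.PNUM AND supply.SHIPDATE < '1980-01-01'
GROUP BY parts.PNUM
"""
counts = conn.execute(counts_sql + " ORDER BY parts.PNUM").fetchall()

fixed = conn.execute(f"""
WITH temp_ct AS ({counts_sql})
SELECT parts.PNUM FROM parts, temp_ct
WHERE parts.QOH = temp_ct.CT AND temp_ct.SUPPNUM = parts.PNUM
ORDER BY parts.PNUM
""").fetchall()

print(counts)  # part 8 now has a group with count 0
print(fixed)   # parts 8 and 10, as in the nested query
```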