Common Table Expressions in MariaDB 10.2
Sergei PetruniaGalina Shalygina
2
● A standard SQL feature
● Two kinds of CTEs
•Recursive
•Non-recursive
● Supported by Oracle, MS SQL Server, PostgreSQL, SQLite, …
● MySQL world:
•MariaDB: 10.2 Beta (Sept, 27th)
•MySQL: MySQL-8.0.0-labs-optimizer tree
Common Table Expressions
3
Plan
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
4
Plan
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
5
CTE SyntaxCTE name
CTE Body
CTE Usage
● Similar to DERIVED tables● “Query-local VIEWs”
with engineers as ( select * from employees where dept='Engineering')select * from engineers where ...
WITH
6
select *from( select * from employees where dept='Engineering') as engineerswhere...
with engineers as ( select * from employees where dept='Engineering')select * from engineers where ...
CTEs are like derived tables
● Similar to DERIVED tables● “Query-local VIEWs”
7
Use case #1: CTEs refer to CTEs
● More readable than nested FROM(SELECT …)
with engineers as ( select * from employees where dept in ('Development','Support')),eu_engineers as ( select * from engineers where country IN ('NL',...))select ...from eu_engineers;
8
with engineers as ( select * from employees where dept in ('Development','Support')),select * from engineers E1 where not exists (select 1 from engineers E2 where E2.country=E1.country and E2.name <> E1.name);
Use case #2: Multiple use of CTE● Anti-self-join
9
select * from sales_product_year CUR, sales_product_year PREV,where CUR.product=PREV.product and CUR.year=PREV.year + 1 and CUR.total_amt > PREV.total_amt
with sales_product_year as ( select product, year(ship_date) as year, sum(price) as total_amt from item_sales group by product, year)
Use case #2: example 2● Year-over-year comparisons
10
select * from sales_product_year S1where total_amt > (select 0.1*sum(total_amt) from sales_product_year S2 where S2.year=S1.year)
with sales_product_year as ( select product, year(ship_date) as year, sum(price) as total_amt from item_sales group by product, year)
Use case #2: example 3● Compare individuals against their group
11
● Non-recursive CTES are “Query-local VIEWs”
● One CTE can refer to another
•Better than nested FROM (SELECT …)
● Can refer to a CTE from multiple places
•Better than copy-pasting FROM(SELECT …)
● CTE adoption
•TPC-H (1999) - no CTEs
•TPC-DS (2011) - 38 of 100 queries use CTEs.
Conclusions so far
12
Plan
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
13
Base algorithm: materialize in a temp.table
● Always works● Often not optimal
with engineers as ( select * from employees where dept='Engineering' or dept='Support')select ...from engineers, other_table, ...
14
Optimization #1: CTE Mergingwith engineers as ( select * from employees where dept='Development')select ...from engineers E, support_cases SCwhere E.name=SC.assignee and SC.created='2016-09-30' and E.location='Amsterdam'
select ...from employees E, support_cases SCwhere E.dept='Development' and E.name=SC.assignee and SC.created='2016-09-30' and E.location=’Amsterdam’
● for each support case•lookup employee by name
15
Optimization #1: CTE Merging (2)
● Requirement
•CTE is a JOIN : no GROUP BY, DISTINCT, etc
● Output
•CTE is merged into parent’s join
•Optimizer can pick the best query plan
● This is the same as ALGORITHM=MERGE for VIEWs
16
with sales_per_year as ( select year(order.date) as year sum(order.amount) as sales from order group by year)select * from sales_per_year where year in ('2015','2016')
with sales_per_year as ( select year(order.date) as year sum(order.amount) as sales from order where year in ('2015','2016') group by year)select * from sales_per_year
Optimization #2: condition pushdown
17
Optimization #2: condition pushdown (2)Summary
● Used when merging is not possible
•CTE has a GROUP BY
● Makes temp. table smaller
● Allows to filter out whole groups
● Besides CTEs, works for derived tables and VIEWs
● MariaDB 10.2, Galina’s GSOC 2016 project:
“Pushing conditions into non-mergeable views and derived tables in
MariaDB”
18
with product_sales as ( select product_name, year(sale_date), count(*) as count from product_sales group by product, year)select *from product_sales P1, product_sales P2where P1.year = 2010 AND P2.year = 2011 AND ...
Optimization #3: CTE reuse
● Idea•Fill the CTE once•Then use multiple times
● Doesn’t work together with condition pushdown or merging
19
CTE Optimizations summary
● Merge and condition pushdown are most important
•MariaDB supports them, like MS SQL.
● PostgreSQL’s approach is *weird*
•“CTEs are optimization barriers”
● MySQL’s labs tree: “try merging, otherwise reuse”
CTE Merge Condition pushdown CTE reuse
MariaDB 10.2 ✔ ✔ ✘
MS SQL Server ✔ ✔ ✘
PostgreSQL ✘ ✘ ✔
MySQL 8.0.0-labs-optimizer ✔ ✘ ✔*
20
Recursive CTEs
21
Plan
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
22
Recursive CTEs
● First attempt: Oracle’s CONNECT BY syntax (80’s)● Superseded by Recursive CTEs
•SQL:1999, implementations in 2007-2009
● SQL is poor at “recursive” data structures/algorithms
wheel
boltcapnut
tire valve
rimtirespokes- Trees - Graphs
Chicago
Nashville Atlanta
Orlando
23
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
Plan
24
Recursive CTE syntax
Recursive part
Anchor partRecursive use of CTE
“recursive”
with recursive ancestors as ( select * from folks where name = 'Alex' union [all] select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
25
Sister AmyAlex
Mom Dad
Grandpa Bill
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
Recursive CTE computation
26
with recursive ancestors as ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 |+------+--------------+--------+--------+
Computation Result table
Step #1:execution of the anchor part
27
with recursive ancestors as ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 |+------+--------------+--------+--------+
Computation Result table
+------+--------------+--------+--------+| id | name | father | moher |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
Step #2:execution of the recursive part
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
28
with recursive ancestors as ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL |+------+--------------+--------+--------+
Computation Result table
Step #2:execution of the recursive part
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
29
with recursive ancestors as ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
Computation Result table
Step #3:execution of the recursive part
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL |+------+--------------+--------+--------+
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
30
with recursive ancestors as ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL |+------+--------------+--------+--------+
Computation Result table
Step #3:execution of the recursive part
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
31
with recursive ancestors as ( select * from folks where name = 'Alex' union select f.* from folks as f, ancestors AS a where f.id = a.father or f.id = a.mother)select * from ancestors;
Computation Result table
Step #4:execution of the recursive part
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL || 98 | Sister Amy | 20 | 30 |+------+--------------+--------+--------+
+------+--------------+--------+--------+| id | name | father | mother |+------+--------------+--------+--------+| 100 | Alex | 20 | 30 || 20 | Dad | 10 | NULL || 30 | Mom | NULL | NULL || 10 | Grandpa Bill | NULL | NULL |+------+--------------+--------+--------+
No results!
32
Summary so far
1. Compute anchor_data
2. Compute recursive_part to
get the new data
3. if (new data is non-empty)
goto 2;
with recursive R as ( select anchor_data union [all] select recursive_part from R, …)select …
33
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
Plan
34
Transitive closure
bus_routes+------------+------------+ | origin | dst | +------------+------------+| New York | Boston || Boston | New York || New York | Washington || Washington | Boston || Washington | Raleigh |+------------+------------+
New York
Boston Washington
Raleigh
35
with recursive bus_dst as ( select origin as dst from bus_routes where origin='New York' union select bus_routes.dst from bus_routes, bus_dst where bus_dst.dst= bus_routes.origin)select * from bus_dst
● Start from New York
New York
Boston Washington
Raleigh
Transitive closure
36
with recursive bus_dst as ( select origin as dst from bus_routes where origin='New York' union select bus_routes.dst from bus_routes, bus_dst where bus_dst.dst= bus_routes.origin)select * from bus_dst
● Step #1: add Boston and Washington
Transitive closure +------------+ | dst | +------------+| New York |+------------+
New York
Boston Washington
Raleigh
37
with recursive bus_dst as ( select origin as dst from bus_routes where origin='New York' union select bus_routes.dst from bus_routes, bus_dst where bus_dst.dst= bus_routes.origin)select * from bus_dst
● Step #1: add Boston and Washington
New York
Boston Washington
Raleigh
Transitive closure +------------+ | dst | +------------+| New York || Boston || Washington |+------------+
38
with recursive bus_dst as ( select origin as dst from bus_routes where origin='New York' union select bus_routes.dst from bus_routes, bus_dst where bus_dst.dst= bus_routes.origin)select * from bus_dst
● Step#2:•Add Raleigh•UNION excludes nodes that are already present.
New York
Boston Washington
Raleigh
Transitive closure +------------+ | dst | +------------+| New York || Boston || Washington || Raleigh |+------------+
39
UNION and MySQL-8.0-labs● UNION is useful for “transitive closure” queries
● Supported in all databases
● MySQL 8.0.0-labs
ERROR 42000: This version of MySQL doesn't yet support
'UNION DISTINCT in recursive Common Table Expression (use
UNION ALL)'
•No idea why they have this limitation.
40
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
Plan
41
Computing “Paths”
bus_routes+------------+------------+ | origin | dst | +------------+------------+| New York | Boston || Boston | New York || New York | Washington || Washington | Boston || Washington | Raleigh |+------------+------------+
New York
Boston Washington
Raleigh
● Want paths like “New York -> Washington -> Raleigh”
42
with recursive paths (cur_path, cur_dest) as ( select origin, origin from bus_routes where origin='New York' union select concat(paths.cur_path, ',', bus_routes.dest), bus_routes.dest from paths, bus_routes where paths.cur_dest= bus_routes.origin and locate(bus_routes.dest, paths.cur_path)=0)select * from paths
New York
Boston Washington
Raleigh
Computing “Paths”
43
New York
Boston Washington
Raleigh
select concat(paths.cur_path, ',', bus_routes.dest), bus_routes.dest from paths, bus_routes where paths.cur_dest= bus_routes.origin and locate(bus_routes.dest, paths.cur_path)=0
+-----------------------------+------------+| cur_path | cur_dest |+-----------------------------+------------+| New York | New York || New York,Boston | Boston || New York,Washington | Washington || New York,Washington,Boston | Boston || New York,Washington,Raleigh | Raleigh |+-----------------------------+------------+
Computing “Paths”
44
Recursion stopping summary● Tree or Directed Acyclic Graph walking
•Execution is guaranteed to stop
● Computing transitive closure
•Use UNION
● Computing “Paths” over graph with loops
•Put condition into WHERE to stop loops/growth
● Safety measure: @@max_recursive_iterations
•Like in SQL Server
•MySQL-8.0: @@max_execution_time ?
45
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
Plan
46
[Non-]linear recursion
with recursive R as ( select anchor_data
union [all]
select recursive_part from R, …)select …
The SQL standard requires that
recursion is linear:
● recursive_part must refer to R only
once
•No self-joins
•No subqueries
● Not from inner side of an outer join
● …
47
Linearity of SELECT statements● Recursive CTE is a “function”: R(x)=a
R
● Linear “function”: R(x+y)=a+b
•added rows in source -> added rows in the result
x a
+y+b
48
Linear recursion
with recursive R as ( select anchor_data
union [all]
select recursive_part from R, …)select …
A
49
with recursive R as ( select anchor_data
union [all]
select recursive_part from R, …)select …
Linear recursion
● New data is generated by “wave-front” elements
● Contents of R are always growing
50
Non-linear recursion● Next version of R is computed from the
entire R (not just “wavefront”)
● R may grow, shrink, or change
● It’s possible to screw things up
•R may flip-flop
•R may be non-deterministic
● Still, MariaDB supports it:
set @@standard_compliant_cte=0
•Be sure to know what you’re doing.
51
● Non-recursive CTEs•Use cases•Optimizations
● Recursive CTEs•Basics•Transitive closure•Paths•(Non-)linear recursion•Mutual recursion
Plan
52
● Multiple CTEs refer to each other
● Useful for “bipartite” graphs
•emp->dept->emp
● MariaDB supports it
● No other database does
with recursive C1 as ( select … from anchor_table union select … from C2), C2 as ( select … from C1)select ...
Mutual recursion
53
Conclusions● MariaDB 10.2 has Common Table Expressions
● Both Recursive and Non-recursive are supported
● Non-recursive
•“Query-local VIEWs”
•Competitive set of query optimizations
● Recursive
•Useful for tree/graph-walking queries
•Mutual and non-linear recursion is supported.
54
Rate My Session!
55
● Bill of materials
Tree-walking
Which part of bicycle is nut situated in?
56
+------+------------+-------+---------+| id | name | count | part_of |+------+------------+-------+---------+| 10 | nut | 1 | 7 |+------+------------+-------+---------+
Bill of materials
wheel
boltcapnut
tire valve
rimtirespokes
with recursive nut as ( select * from bicycle_parts where name = 'nut' union select b.* from bicycle_parts as b, nut AS n where b.id = n.part_of )select * from nutwhere part_of = NULL;
57
+------+------------+-------+---------+| id | name | count | part_of |+------+------------+-------+---------+| 10 | nut | 1 | 7 || 7 | tire valve | 2 | 4 |+------+------------+-------+---------+
Bill of materials
wheel
boltcapnut
tire valve
rimtirespokes
with recursive nut as ( select * from bicycle_parts where name = 'nut' union select b.* from bicycle_parts as b, nut AS n where b.id = n.part_of )select * from nutwhere part_of = NULL;
58
+------+------------+-------+---------+| id | name | count | part_of |+------+------------+-------+---------+| 10 | nut | 1 | 7 || 7 | tire valve | 2 | 4 || 4 | tire | 2 | 1 |+------+------------+-------+---------+
Bill of materials
wheel
boltcapnut
tire valve
rimtirespokes
with recursive nut as ( select * from bicycle_parts where name = 'nut' union select b.* from bicycle_parts as b, nut AS n where b.id = n.part_of )select * from nutwhere part_of = NULL;
59
+------+------------+-------+---------+| id | name | count | part_of |+------+------------+-------+---------+| 10 | nut | 1 | 7 || 7 | tire valve | 2 | 4 || 4 | tire | 2 | 1 || 1 | wheel | 2 | NULL |+------+------------+-------+---------+
wheel
boltcapnut
tire valve
rimtirespokes
Bill of materials
with recursive nut as ( select * from bicycle_parts where name = 'nut' union select b.* from bicycle_parts as b, nut AS n where b.id = n.part_of )select * from nutwhere part_of = NULL;
60
with recursive nut as ( select * from bicycle_parts where name = 'nut' union select b.* from bicycle_parts as b, nut AS n where b.id = n.part_of )select * from nutwhere part_of = NULL;
It is a part of wheel!
Bill of materials
+------+------------+-------+---------+| id | name | count | part_of |+------+------------+-------+---------+| 1 | wheel | 2 | NULL |+------+------------+-------+---------+