Date posted: 16-Feb-2017
Category: Data & Analytics
Uploaded by: bingjie-miao
Query Optimizer Enhancement in Informix 12.1
Bingjie Miao, IBM
Agenda
• sqexplain overview
• Set operations
• View folding enhancements
• Subquery flattening after view folding
• ANSI OUTER JOIN to Informix outer join transformation
• Hash join support for ANSI JOIN queries
• Optimizer costing enhancement for hash join
• Temp table optimization
• LATERAL derived table
• Predicate derivation for ANSI JOIN query
sqexplain Overview
• Print out query plan information
• Includes runtime statistics
• Ways to turn on explain
– set explain on;
– set explain file to 'file_name';
– set explain on avoid_execute;
– EXPLAIN directive on a query
Sections in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-07-2013 17:28:30)
------
select {+ FULL(tab1) AVOID_FULL(tab2)} * from tab1, tab2 where tab1.id = tab2.id

DIRECTIVES FOLLOWED:
FULL ( tab1 )
AVOID_FULL ( tab2 )
DIRECTIVES NOT FOLLOWED:

Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.tab1: SEQUENTIAL SCAN
2) informix.tab2: INDEX PATH
    (1) Index Name: informix.t2idx1
        Index Keys: id   (Serial, fragments: ALL)
        Lower Index Filter: informix.tab1.id = informix.tab2.id
NESTED LOOP JOIN

Slide callouts, top to bottom: query text, general query information, access paths and joins.
Sections in sqexplain – cont.
Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------
  t1                tab1
  t2                tab2

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t1     1000       1         1000       00:00.00   2

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t2     1000       1         1000       00:00.01   0

  type     rows_prod  est_rows  time       est_cost
  -------------------------------------------------
  nljoin   1000       1         00:00.01   4

Slide callout: runtime query statistics.
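One use of the runtime section is comparing est_rows with rows_prod: in the sample, both scans were estimated at 1 row and produced 1000. A minimal Python sketch that flags such gaps, assuming scan rows are laid out as in the sample. The function name `misestimates` and the `ratio` threshold are hypothetical helpers for illustration, not part of Informix.

```python
import re

# Matches scan rows such as: "scan t1 1000 1 1000 00:00.00 2"
SCAN_ROW = re.compile(
    r"^\s*scan\s+(?P<table>\S+)\s+(?P<rows_prod>\d+)\s+(?P<est_rows>\d+)\s+"
    r"(?P<rows_scan>\d+)\s+\S+\s+\d+\s*$"
)

def misestimates(stats_text, ratio=10):
    """Return (table, rows_prod, est_rows) for scans whose actual row count
    differs from the optimizer estimate by at least `ratio` times."""
    flagged = []
    for line in stats_text.splitlines():
        m = SCAN_ROW.match(line)
        if not m:
            continue  # header, separator, or join row
        actual = int(m["rows_prod"])
        est = int(m["est_rows"])
        # Clamp to 1 so a zero count cannot cause a division by zero.
        low, high = sorted((max(actual, 1), max(est, 1)))
        if high / low >= ratio:
            flagged.append((m["table"], actual, est))
    return flagged

sample = """\
type     table  rows_prod  est_rows  rows_scan  time       est_cost
scan     t1     1000       1         1000       00:00.00   2
scan     t2     1000       1         1000       00:00.01   0
"""
print(misestimates(sample))
```

With the sample rows, both t1 and t2 are flagged.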
sqexplain for ANSI JOIN query

QUERY:
------
Select * from (t1 left outer join (t2 left outer join t3 on t2.c1=t3.c1) on t1.c2=t2.c2 and t2.c1 < 5) left outer join t4 on t1.c1=t4.c1

Estimated Cost: 14
Estimated # of Rows Returned: 4

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: INDEX PATH
        Filters: informix.t2.c1 < 5
    (1) Index Keys: c2   (Serial, fragments: ALL)
        Lower Index Filter: informix.t1.c2 = informix.t2.c2

3) informix.t3: AUTOINDEX PATH
    (1) Index Keys: c1
        Lower Index Filter: informix.t2.c1 = informix.t3.c1

ON-Filters:informix.t2.c1 = informix.t3.c1
NESTED LOOP JOIN(LEFT OUTER JOIN)

ON-Filters:(informix.t1.c2 = informix.t2.c2 AND informix.t2.c1 < 5 )
NESTED LOOP JOIN(LEFT OUTER JOIN)

4) sqlqa.t4: AUTOINDEX PATH
    (1) Index Keys: c1
        Lower Index Filter: informix.t1.c1 = informix.t4.c1

ON-Filters:informix.t1.c1 = informix.t4.c1
NESTED LOOP JOIN(LEFT OUTER JOIN)
Set Operations
• Similar to UNION
• INTERSECT – rows common to both arms
– internally transformed into an EXISTS subquery with special NULL handling
• MINUS or EXCEPT – rows in the first arm that are not in the second arm
– internally transformed into a NOT EXISTS subquery with special NULL handling
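The rewrite can be illustrated outside Informix. A minimal sketch using Python's built-in sqlite3 module as a stand-in (an assumption: SQLite is not Informix, and its null-safe `IS` operator only plays the role of the "special NULL handling"). Table and column names follow the slides; the data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tab1 (intcol integer);
    create table tab2 (intcol2 integer);
    insert into tab1 values (1), (2), (2), (NULL), (3);
    insert into tab2 values (2), (NULL), (4);
""")

def rows(sql):
    """Run a query and return its result as a set of tuples."""
    return set(conn.execute(sql).fetchall())

intersect = rows("select intcol from tab1 intersect select intcol2 from tab2")
minus = rows("select intcol from tab1 except select intcol2 from tab2")

# Null-safe comparison: SQLite's IS treats NULL as equal to NULL.
exists_nullsafe = rows("""
    select distinct intcol from tab1
    where exists (select 1 from tab2 where intcol2 is intcol)""")
not_exists_nullsafe = rows("""
    select distinct intcol from tab1
    where not exists (select 1 from tab2 where intcol2 is intcol)""")

# Plain "=" never matches NULL, so the NULL row drops out of the result.
exists_plain = rows("""
    select distinct intcol from tab1
    where exists (select 1 from tab2 where intcol2 = intcol)""")

print(intersect == exists_nullsafe, minus == not_exists_nullsafe)
print(exists_plain)
```

The EXISTS form matches INTERSECT only when NULLs compare equal. With plain "=", the NULL row common to both arms is lost.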
Set Operations in explain
QUERY: (OPTIMIZATION TIMESTAMP: 03-08-2013 15:04:22)
------
select intcol from tab1
intersect
select intcol2 from tab2

Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.tab1: SEQUENTIAL SCAN
2) informix.tab2: SEQUENTIAL SCAN (First Row)
        Filters: informix.tab1.intcol == informix.tab2.intcol2
NESTED LOOP JOIN (Semi Join)
Set Operations in explain – cont.

QUERY: (OPTIMIZATION TIMESTAMP: 03-08-2013 15:13:28)
------
select intcol, charcol from tab1
intersect
select intcol2, charcol2 from tab2
minus
select intcol3, charcol3 from tab3

Estimated Cost: 6
Estimated # of Rows Returned: 1
1) informix.tab1: SEQUENTIAL SCAN
2) informix.tab2: SEQUENTIAL SCAN (First Row)
Filters: (informix.tab1.intcol == informix.tab2.intcol2 AND informix.tab1.charcol == informix.tab2.charcol2 )
NESTED LOOP JOIN (Semi Join)
3) informix.tab3: SEQUENTIAL SCAN (First Row)
Filters: (informix.tab1.charcol == informix.tab3.charcol3 AND informix.tab1.intcol == informix.tab3.intcol3 )
NESTED LOOP JOIN (Anti Semi Join)
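The "(First Row)" scans with Semi Join and Anti Semi Join can be read as follows: for each outer row, the inner scan stops at the first match. A textbook sketch in Python, not Informix's implementation. The function name `semi_join` and the sample rows are assumptions, and the duplicate elimination that set operations also perform is left out.

```python
def semi_join(outer, inner, anti=False):
    """Nested-loop (anti) semi join on whole rows.

    For each outer row the inner scan stops at the first match ("First Row").
    A semi join keeps outer rows that found a match; an anti semi join keeps
    the outer rows that did not.
    """
    result = []
    for o in outer:
        matched = False
        for i in inner:
            if o == i:        # None == None is True here: NULLs compare equal
                matched = True
                break         # the first matching row is enough
        if matched != anti:
            result.append(o)
    return result

tab1 = [(1, "a"), (2, "b"), (None, "c"), (4, "d")]
tab2 = [(2, "b"), (None, "c"), (4, "d")]
tab3 = [(4, "d")]

# tab1 INTERSECT tab2, then MINUS tab3, in the same order as the plan.
common = semi_join(tab1, tab2)
final = semi_join(common, tab3, anti=True)
print(final)   # [(2, 'b'), (None, 'c')]
```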
View folding enhancement
• Views containing ANSI JOIN or Informix outer join can now be folded into the main query, for better performance
– the view must be referenced as a dominant table in the main query
– if the view is used as a subservient table, it still needs to be materialized first
View folding example

create view v1(vc1, vc2) as select t1.c, t2.c from t1 left join t2 on t2.a = t1.a;
select * from v1 left join t3 on v1.vc1 = t3.a;
select * from v1 right join t3 on v1.vc1 = t3.a;
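In the first query v1 is the dominant table, so it qualifies for folding. In the second it is subservient. What folding means for the first query can be shown by writing the merged query by hand. A minimal sketch with Python's sqlite3 as a stand-in engine (an assumption: it shows only that the two forms return the same rows, not how Informix plans them). The sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (a integer, c integer);
    create table t2 (a integer, c integer);
    create table t3 (a integer);
    insert into t1 values (1, 10), (2, 20), (3, 30);
    insert into t2 values (1, 100), (3, 300);
    insert into t3 values (10), (30), (40);
    create view v1(vc1, vc2) as
        select t1.c, t2.c from t1 left join t2 on t2.a = t1.a;
""")

# Query written against the view, as on the slide.
via_view = conn.execute(
    "select * from v1 left join t3 on v1.vc1 = t3.a").fetchall()

# Hand-folded form: the view body is merged into the main query,
# so all three base tables are joined in a single query block.
folded = conn.execute("""
    select t1.c, t2.c, t3.a
    from t1 left join t2 on t2.a = t1.a
            left join t3 on t1.c = t3.a""").fetchall()

print(set(via_view) == set(folded))   # True
```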
View folding in sqexplain

QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:23:41)
------
select * from v1 left join t3 on v1.vc1 = t3.a

Estimated Cost: 6
Estimated # of Rows Returned: 3

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: INDEX PATH
    (1) Index Name: informix.ind2
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t2.a = informix.t1.a
NESTED LOOP JOIN

3) informix.t3: INDEX PATH
    (1) Index Name: informix.ind3
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t1.c = informix.t3.a
NESTED LOOP JOIN
View folding in sqexplain – cont.

QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:28:47)
------
create view "informix".v1 (vc1,vc2) as select x0.c ,x1.c from ("informix".t1 x0 left join "informix".t2 x1 on (x1.a = x0.a ) );

Estimated Cost: 4
Estimated # of Rows Returned: 3

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: INDEX PATH
    (1) Index Name: informix.ind2
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t2.a = informix.t1.a
NESTED LOOP JOIN

QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:28:47)
------
select * from v1 right join t3 on v1.vc1 = t3.a

Estimated Cost: 5
Estimated # of Rows Returned: 3

1) informix.t3: SEQUENTIAL SCAN

2) (Temp Table For View): SEQUENTIAL SCAN

DYNAMIC HASH JOIN
    Dynamic Hash Filters: (Temp Table For View).vc1 = informix.t3.a
Subquery flattening after view folding
• Subquery flattening improves query performance; previously, however, it was disabled if the query contained a view or derived table reference
• In 12.1, subquery flattening is attempted again after the view folding process, and can be done with the view either folded or materialized into a temp table
Subquery flattening after view folding in sqexplain
create view v4 (v4_c1, v4_c2) as select t1_c1 + 1, MAX(1) from t1 group by 1;
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:52:44)
------
select 1 from v4 where exists (select 1 from t2 where t2_c1 = v4_c1)

Estimated Cost: 6
Estimated # of Rows Returned: 1
Temporary Files Required For: Group By
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: SEQUENTIAL SCAN (First Row)
        Filters: informix.t2.t2_c1 = informix.t1.t1_c1 + 1
NESTED LOOP JOIN (Semi Join)
Subquery flattening after view folding in sqexplain – cont.
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 17:02:37)
------
select z.a from z
where z.b = some (select v.a from vag1 v, z z1 where v.a > z.a)

Estimated Cost: 9
Estimated # of Rows Returned: 11
1) informix.z: SEQUENTIAL SCAN
Filters: informix.z.b > informix.z.a
2) (Temp Table For View): AUTOINDEX PATH (First Row)
    (1) Index Name: (Auto Index)
        Index Keys: a   (Key-Only)
        Lower Index Filter: informix.z.b = (Temp Table For View).a
NESTED LOOP JOIN (Semi Join)

3) informix.z1: SEQUENTIAL SCAN (First Row)
NESTED LOOP JOIN (Semi Join)
ANSI OUTER JOIN to Informix Outer Join Transformation
• “Simple” ANSI OUTER JOIN can be converted to Informix outer join
– potentially more join choices for the optimizer
– ON clause filters must be of the type “col = col” involving the current join
– WHERE clause filters cannot reference subservient tables, flattened subquery tables, correlated subqueries or UDR references
• If one join is not transformed, then the entire query is not transformed
ANSI OUTER JOIN to Informix Outer Join in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 17:23:32)
------
select * from t1 left join t2 on t1.a = t2.a

Estimated Cost: 4
Estimated # of Rows Returned: 3

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: INDEX PATH
    (1) Index Name: informix.ind2
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t1.a = informix.t2.a
NESTED LOOP JOIN
ANSI OUTER JOIN to Informix Outer Join in sqexplain – cont.
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 17:24:01)
------
select * from t1 left join t2 on t1.a = t2.a and t1.a = 1

Estimated Cost: 4
Estimated # of Rows Returned: 3

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: INDEX PATH
    (1) Index Name: informix.ind2
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t1.a = informix.t2.a

ON-Filters:(informix.t1.a = informix.t2.a AND informix.t1.a = 1 )
NESTED LOOP JOIN(LEFT OUTER JOIN)
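The second plan keeps its ON-Filters and LEFT OUTER JOIN, which is consistent with the “col = col” rule: `t1.a = 1` is not of that type. One way to see why such a filter needs care is that it behaves differently in ON and in WHERE. A minimal sketch with Python's sqlite3 as a stand-in engine (an assumption; the data and the helper `run` are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (a integer);
    create table t2 (a integer);
    insert into t1 values (1), (2);
    insert into t2 values (1), (2);
""")

def run(sql):
    """Return the result rows in a stable order (repr handles None)."""
    return sorted(conn.execute(sql).fetchall(), key=repr)

# Filter in the ON clause: it only restricts which t2 rows match.
# Every t1 row is still returned.
on_filter = run(
    "select * from t1 left join t2 on t1.a = t2.a and t1.a = 1")

# Same filter in the WHERE clause: it removes t1 rows from the result.
where_filter = run(
    "select * from t1 left join t2 on t1.a = t2.a where t1.a = 1")

print(on_filter)      # [(1, 1), (2, None)]
print(where_filter)   # [(1, 1)]
```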
Hash Join Support in ANSI JOIN
• Hash join is supported in ANSI JOIN queries
• Optimizer can consider and choose the best join method for each join
– hash join can be faster for large joins
• Optimizer costing is adjusted for situations where the build/probe sides of a hash join can be composite
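For readers unfamiliar with the build and probe sides, a textbook hash join for a left outer join can be sketched in Python. This is a generic illustration, not Informix's implementation. The function name, the dict-based rows, and the choice of the right input as build side are all assumptions. Either input could itself be the result of another join (a composite).

```python
from collections import defaultdict

def hash_left_outer_join(left, right, left_key, right_key):
    """Textbook hash join: build a hash table on `right`, probe with `left`.

    Rows are dicts. Unmatched left rows are kept and paired with None,
    which is what makes this a LEFT OUTER join.
    """
    # Build phase: one pass over the build side.
    table = defaultdict(list)
    for r in right:
        key = r[right_key]
        if key is not None:          # NULL keys never match under "="
            table[key].append(r)

    # Probe phase: one pass over the probe side.
    out = []
    for l in left:
        matches = table.get(l[left_key], [])
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))
    return out

t1 = [{"a": 1}, {"a": 2}, {"a": None}]
t4 = [{"a": 1}, {"a": 1}]
joined = hash_left_outer_join(t1, t4, "a", "a")
print(len(joined))   # 4
```

Each input is read once, which is why the method tends to pay off on large joins where repeated index probes add up.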
Hash Join for ANSI JOIN in sqexplain

QUERY: (OPTIMIZATION TIMESTAMP: 03-14-2013 15:01:22)
------
select * from (t1 left join t2 on t1.a = t2.a ) left join (t3 inner join t4 on t3.a = t4.a) on t4.a = t1.a

Estimated Cost: 9
Estimated # of Rows Returned: 3

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: INDEX PATH
    (1) Index Name: informix.ind2
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t1.a = informix.t2.a

ON-Filters:informix.t1.a = informix.t2.a
NESTED LOOP JOIN(LEFT OUTER JOIN)

3) informix.t3: SEQUENTIAL SCAN

4) informix.t4: INDEX PATH
    (1) Index Name: informix.ind4
        Index Keys: a   (Serial, fragments: ALL)
        Lower Index Filter: informix.t3.a = informix.t4.a

ON-Filters:informix.t3.a = informix.t4.a
NESTED LOOP JOIN

ON-Filters:informix.t4.a = informix.t1.a
DYNAMIC HASH JOIN (LEFT OUTER JOIN)
    Dynamic Hash Filters: informix.t4.a = informix.t1.a
Optimizer costing improvements
• Current optimizer costing tends to favor index-based scans and joins, which can be problematic for large tables
• 12.1 introduces costing modifications that make hash join more favorable for large tables
• Under control of the undocumented ONCONFIG parameter SQL_DEF_CTRL (off by default)
– add the 0x200 and 0x800 bits
– set to 0xeb0 to include the “on-by-default” bits
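The arithmetic behind 0xeb0 can be checked in Python. The baseline 0x4b0 is an assumption: it is the SQL_DEF_CTRL value used in the deck's first costing example, and it is what remains of 0xeb0 once the two added bits are removed. The constant names are made up, and the slides do not say what the individual bits mean.

```python
ON_BY_DEFAULT = 0x4b0               # assumed baseline, from the example setting
HASH_COSTING_BITS = 0x200 | 0x800   # the two bits the slide says to add

value = ON_BY_DEFAULT | HASH_COSTING_BITS
print(hex(value))   # 0xeb0
```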
Optimizer costing example

SQL_DEF_CTRL=0x4b0

SELECT dm.dm_s_symb AS stock, MONTH(dm.dm_date) AS month, COUNT(*) AS count_num_days, MAX(dm.dm_vol) AS max_vol
FROM daily_market dm, security s
WHERE dm.dm_s_symb = s.s_symb AND YEAR(dm.dm_date) = "2001"
GROUP BY 1,2

Estimated Cost: 5018350
Estimated # of Rows Returned: 1779368
Temporary Files Required For: Group By

1) informix.s: INDEX PATH
    (1) Index Name: informix.pk_security
        Index Keys: s_symb   (Key-Only)  (Serial, fragments: ALL)

2) informix.dm: INDEX PATH
        Filters: YEAR (informix.dm.dm_date ) = 2001
    (1) Index Name: informix.fk_daily_market_security
        Index Keys: dm_s_symb   (Serial, fragments: ALL)
        Lower Index Filter: informix.dm.dm_s_symb = informix.s.s_symb
NESTED LOOP JOIN
SQL_DEF_CTRL=0xeb0

SELECT dm.dm_s_symb AS stock, MONTH(dm.dm_date) AS month, COUNT(*) AS count_num_days, MAX(dm.dm_vol) AS max_vol
FROM daily_market dm, security s
WHERE dm.dm_s_symb = s.s_symb AND YEAR(dm.dm_date) = "2001"
GROUP BY 1,2

Estimated Cost: 4183207
Estimated # of Rows Returned: 1779368
Temporary Files Required For: Group By

1) informix.dm: SEQUENTIAL SCAN
        Filters: YEAR (informix.dm.dm_date ) = 2001

2) informix.s: INDEX PATH
    (1) Index Name: informix.pk_security
        Index Keys: s_symb   (Key-Only)  (Serial, fragments: ALL)

DYNAMIC HASH JOIN
    Dynamic Hash Filters: informix.dm.dm_s_symb = informix.s.s_symb
Temp table optimization
• Temp tables are created when a view or derived table cannot be folded into the main query
• Previously, when a temp table was created, it included all columns from the underlying tables
• In 12.1, a temp table only includes columns that are required in the query
– smaller temp table
– more efficient query processing
Temp table optimization example

select rtrim(D12.C36), rtrim(D12.C48), D12.C103, D12.C104
from ( select stock_trans_type.stt_type as C0,
              stock_trans_type.stt_desc as C1,
              stock_movements.stk_trans_type as C2,
              ......
              stock_master.user_num1 as C103,
              system_table.systbl_code as C104
       from stock_trans_type, stock_master, system_table, outer stock_movements
       where ......
     ) D12 right outer join system_type on D12.C29 = system_type.type_id
where D12.C12 between 129 and 256 and D12.C16 is not null;

TEMP table for D12 contains only the following columns: C12, C16, C29, C36, C48, C103, C104
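The column list follows from what the outer query references through D12: the select list, the join condition and the WHERE clause. A rough Python sketch that recovers the same list by scanning the outer query text. The regex approach and the function name `referenced_columns` are assumptions for illustration only. An optimizer would work from the parsed query, and the derived table body is elided here as on the slide.

```python
import re

outer_query = """
select rtrim(D12.C36), rtrim(D12.C48), D12.C103, D12.C104
from ( ...... ) D12 right outer join system_type on D12.C29 = system_type.type_id
where D12.C12 between 129 and 256 and D12.C16 is not null
"""

def referenced_columns(sql, alias):
    """Collect the distinct `alias.Cnnn` columns referenced in `sql`,
    ordered by column number."""
    names = set(re.findall(rf"\b{re.escape(alias)}\.(C\d+)\b", sql))
    return sorted(names, key=lambda name: int(name[1:]))

print(referenced_columns(outer_query, "D12"))
# ['C12', 'C16', 'C29', 'C36', 'C48', 'C103', 'C104']
```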
LATERAL derived table
• Correlated column reference inside a derived table
select * from t1, LATERAL (select * from t2 where t1.c1 = t2.c2 ) as dtab(dc1);
• LATERAL derived table may or may not be folded into main query
• Logic in the optimizer to ensure LATERAL correlation reference is properly satisfied at run time
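Conceptually, a LATERAL derived table is re-evaluated for each row of the table it refers to. A small Python sketch of that row-at-a-time reading. The sample data is made up and this is only the semantics: the optimizer may instead fold the derived table into an ordinary join.

```python
t1 = [{"c1": 1}, {"c1": 2}, {"c1": 3}]
t2 = [{"c2": 1}, {"c2": 1}, {"c2": 3}]

def dtab(t1_row):
    """The derived table body, evaluated with the current t1 row bound
    to the correlated reference t1.c1."""
    return [{"dc1": r["c2"]} for r in t2 if r["c2"] == t1_row["c1"]]

# select t1.c1, dc1
# from t1, LATERAL (select t2.c2 from t2 where t1.c1 = t2.c2) as dtab(dc1)
result = [(o["c1"], d["dc1"]) for o in t1 for d in dtab(o)]
print(result)   # [(1, 1), (1, 1), (3, 3)]
```

The t1 row must be available before the derived table is evaluated, which is the ordering constraint the optimizer has to honor.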
LATERAL derived table in sqexplain
select t1.c1, dc1 from t1, LATERAL (select t2.c2 from t2 where t1.c1 = t2.c2) as dtab(dc1)
Estimated Cost: 4
Estimated # of Rows Returned: 1

1) informix.t1: SEQUENTIAL SCAN

2) informix.t2: SEQUENTIAL SCAN

DYNAMIC HASH JOIN
    Dynamic Hash Filters: informix.t1.c1 = informix.t2.c2
LATERAL derived table in sqexplain – cont.
select * from t1, t2, t3, t4, LATERAL ( select t5.c5 from t5 where t1.c1 = t5.c5 and t5.c5 < t2.c2 group by 1) as dtab(vc1) where t4.c4 = t3.c3
1) informix.t3: SEQUENTIAL SCAN

2) informix.t4: SEQUENTIAL SCAN

DYNAMIC HASH JOIN
    Dynamic Hash Filters: informix.t4.c4 = informix.t3.c3

3) informix.t2: SEQUENTIAL SCAN
NESTED LOOP JOIN

4) informix.t1: SEQUENTIAL SCAN
NESTED LOOP JOIN

5) (Temp Table For Collection Subquery): SEQUENTIAL SCAN
NESTED LOOP JOIN
Predicate derivation for ANSI JOIN Query
• Optimizer is able to derive predicates based on existing predicates
– t1.c1 = t2.c2 and t1.c1 = t3.c3 => t2.c2 = t3.c3
– t1.c1 = t2.c2 and t1.c1 >= 5 => t2.c2 >= 5
• Predicate derivation is now enabled for ANSI JOIN query as well (among dominant tables)
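Both derivations amount to grouping columns into equivalence classes over "=" and copying constant filters across each class. A minimal Python sketch using union-find. The function `derive` and its tuple representation of predicates are assumptions, and the sketch ignores the outer-join restriction to dominant tables.

```python
from itertools import combinations

def derive(equalities, filters):
    """equalities: list of (col, col) pairs joined by "=".
    filters: list of (col, op, constant).
    Returns (derived_equalities, derived_filters) not present in the input."""
    parent = {}   # union-find forest: columns in one class are all equal

    def find(c):
        parent.setdefault(c, c)
        while parent[c] != c:
            parent[c] = parent[parent[c]]   # path halving
            c = parent[c]
        return c

    for a, b in equalities:
        parent[find(a)] = find(b)

    classes = {}
    for c in list(parent):
        classes.setdefault(find(c), []).append(c)

    known = {frozenset(e) for e in equalities}
    new_eq = [
        (a, b)
        for members in classes.values()
        for a, b in combinations(sorted(members), 2)
        if frozenset((a, b)) not in known
    ]

    given = set(filters)
    new_filters = [
        (other, op, const)
        for col, op, const in filters
        if col in parent                      # column takes part in a join
        for other in sorted(classes[find(col)])
        if (other, op, const) not in given
    ]
    return new_eq, new_filters

eq1, _ = derive([("t1.c1", "t2.c2"), ("t1.c1", "t3.c3")], [])
_, f2 = derive([("t1.c1", "t2.c2")], [("t1.c1", ">=", 5)])
print(eq1)   # [('t2.c2', 't3.c3')]
print(f2)    # [('t2.c2', '>=', 5)]
```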
Predicate derivation for ANSI JOIN in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-15-2013 12:17:32)
------
select int1, value1, word1, int3, int4, value4
from aoj1 left join (aoj3 left join aoj4 on value3 = value4) on (value1 = value3 and int1 = int3)
where value3 > 15

Estimated Cost: 6
Estimated # of Rows Returned: 1

1) informix.aoj1: INDEX PATH
    (1) Index Name: informix.aoj1_value1
        Index Keys: value1   (Serial, fragments: ALL)
        Lower Index Filter: informix.aoj1.value1 > 15

2) informix.aoj3: AUTOINDEX PATH
    (1) Index Name: (Auto Index)
        Index Keys: value3 int3   (Key-Only)
        Lower Index Filter: (informix.aoj1.value1 = informix.aoj3.value3 AND informix.aoj1.int1 = informix.aoj3.int3 )
        Index Key Filters: (informix.aoj3.value3 > 15 )

3) informix.aoj4: SEQUENTIAL SCAN

ON-Filters:informix.aoj3.value3 = informix.aoj4.value4
DYNAMIC HASH JOIN (LEFT OUTER JOIN)
    Dynamic Hash Filters: informix.aoj3.value3 = informix.aoj4.value4

ON-Filters:(informix.aoj1.value1 = informix.aoj3.value3 AND informix.aoj1.int1 = informix.aoj3.int3 )
NESTED LOOP JOIN
Summary
• sqexplain overview
• Set operations
• View folding enhancements
• Subquery flattening after view folding
• ANSI OUTER JOIN to Informix outer join transformation
• Hash join support for ANSI JOIN queries
• Optimizer costing enhancement for hash join
• Temp table optimization
• LATERAL derived table
• Predicate derivation for ANSI JOIN query