© 2008 IBM Corporation

Reading DB2 LUW EXPLAIN plans (with special emphasis on XML)

Susanne Englert
3/31/2009

© 2009 IBM Corporation, Information Management

Agenda

Why should I be interested in EXPLAINs?
What IS an EXPLAIN?
How do I get them?
How do I read them?
   ► What do the operators mean?
   ► Which ones are XML-specific?
Examples (many)
Summary and references


Why should I be interested in EXPLAINs?

The single most powerful tool to debug performance problems!

Answer questions like:
► Which indexes are getting used?
► How many rows does DB2 think my query will read?
► Does my query require sorts?
► For joins, what join methods are being used? In what order are the tables joined?


What happens when I ask for an EXPLAIN?

The DB2 query optimizer populates special tables in the catalog that describe the execution strategy (the “plan”):
► EXPLAIN_ARGUMENT
► EXPLAIN_INSTANCE
► … more

Two tools are available to read the tables and provide a visual/graphical representation of the plan:
► db2exfmt (command line)
► Visual Explain

Other tools that do not use the “Explain tables”:
► dynexpln
► db2expln


What does an EXPLAIN look like?

Representation of the query optimizer’s execution plan as a tree:
► Leaf nodes are data
► Internal nodes are operators that filter, join, sort, group, etc.

For queries, data flows upwards from the leaves through the tree’s operators towards the root
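One way to picture the bottom-up data flow is a post-order walk of the operator tree: every input of an operator is visited before the operator itself. The sketch below is only an illustration. The `Node` class and `execution_order` function are hypothetical names, and the tree is a simplified encoding of the Q1 plan from this deck, with the index and the base table modeled as leaves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a plan tree: an operator, or a data object at a leaf."""
    name: str
    inputs: List["Node"] = field(default_factory=list)


def execution_order(node: Node) -> List[str]:
    """Post-order walk: visit all inputs (left to right) before the node itself."""
    order: List[str] = []
    for child in node.inputs:
        order.extend(execution_order(child))
    order.append(node.name)
    return order


# Hypothetical encoding of the Q1 plan (operator numbers in parentheses).
plan = Node("RETURN(1)", [
    Node("GRPBY(2)", [
        Node("FETCH(3)", [
            Node("RIDSCN(4)", [
                Node("SORT(5)", [
                    Node("IXSCAN(6)", [Node("INDEX PRODX2")]),
                ]),
            ]),
            Node("TABLE PRODUCT"),
        ]),
    ]),
])

order = execution_order(plan)
print(" -> ".join(order))
```

The walk starts at the lower-left leaf and ends at RETURN, which matches the usual advice to read a plan from the lower left corner upwards.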


Types of EXPLAIN outputs

db2exfmt command-line tool (sample output):

                  Rows
                 RETURN
                 (   1)
                  Cost
                   I/O
                   |
                   1
                 GRPBY
                 (   2)
                 15.2737
                   2
                   |
                0.438058
                 HSJOIN
                 (   3)
                 15.2736
                   2
              /-----+-----\
        92.6111            92
        IXSCAN           IXSCAN
        (   4)           (   5)
        15.1963         0.0655628
           2                0
           |                |
         19450              92
   INDEX: SENGLERT   INDEX: SENGLERT
        PRODX1            DSX4

Visual Explain tool – start from
► DB2 Control Center or
► IBM Data Studio Developer


How to use db2exfmt

Not as pretty as Visual Explain, but:
► Text format provides all information without clicking
► Easy to cut, paste, attach to email
► Preferred format when dealing with DB2 support and sending explains to IBM

Prerequisite: create the Explain tables in the database catalog (a one-time operation):

    db2 -tvf sqllib/misc/EXPLAIN.DDL

Steps (assume a connection to “mydb”):
► db2 set current explain mode explain (set flag to explain, don’t run query)
► db2 -tvf <file containing text of your query> (populate the explain tables)
► db2exfmt -d mydb -1 -o <output_file.exfmt> (format output)
► db2 set current explain mode no (reset the explain-only flag)

OR:
► db2 explain plan for <text of your query>
► db2exfmt -d mydb -1 -o <output_file.exfmt> (format output)
► When using this method, text following XQUERY must be enclosed in single quotes

Both options explain a single query and format the most recently explained query into a file called <output_file.exfmt>.
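The four-step "explain mode" workflow can be captured in a small helper that only assembles the command lines, in order. This is a sketch under assumptions: `explain_commands` is a hypothetical helper, `q1.sql` is a made-up file name, and nothing is executed here. The commands themselves and their flags are the ones listed in the steps.

```python
from typing import List


def explain_commands(db: str, query_file: str, out_file: str) -> List[str]:
    """Return the shell commands of the 'explain mode' workflow, in order."""
    return [
        "db2 set current explain mode explain",   # explain only, do not run the query
        f"db2 -tvf {query_file}",                  # populates the explain tables
        f"db2exfmt -d {db} -1 -o {out_file}",      # formats the most recent explain
        "db2 set current explain mode no",        # resets the explain-only flag
    ]


# Hypothetical file names, for illustration only.
cmds = explain_commands("mydb", "q1.sql", "q1.exfmt")
for cmd in cmds:
    print(cmd)
```

Keeping the reset step in the same list is a reminder that, if it is forgotten, later statements in the session are explained rather than run.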


db2exfmt examples:

Single-table queries involving:
► Relational columns only (Q1)
► XML extraction, relational predicate (Q2)
► XQUERY, using one XML index for one predicate (Q3)
► XQUERY, using two XML indexes and two predicates (Q4)
► One XML index, one relational index (Q5)

Join queries:
► Relational join with XML predicates and extraction (Q6)
► Q6 rewritten using XML joins, two ways
   – Incorrectly written, doesn’t use index on join element (Q7)
   – Corrected to use index on join element (Q8)


Single-table queries: The PRODUCT table. Relational columns and one XML column that replicates relational data

CREATE TABLE "SENGLERT"."PRODUCT" (
    "PRODKEY"           INTEGER NOT NULL,
    "UPC_NUMBER"        CHAR(11) NOT NULL,
    "PACKAGE_TYPE"      CHAR(20),
    "FLAVOR"            CHAR(20),
    "FORM"              CHAR(20),
    "CATEGORY"          INTEGER,
    "SUB_CATEGORY"      INTEGER,
    "CASE_PACK"         INTEGER,
    "PACKAGE_SIZE"      CHAR(6),
    "ITEM_DESC"         CHAR(30),
    "P_PRICE"           DECIMAL(11,2),
    "CATEGORY_DESC"     CHAR(30),
    "P_COST"            DECIMAL(11,2),
    "SUB_CATEGORY_DESC" CHAR(70),
    "PRDDOC"            XML );

Relational indexes:
• PRODKEY (primary key)
• (CATEGORY, PRODKEY)

XML indexes:
• /product/prodkey (type double)
• /product/category (type double)
• /product/sub_category (type varchar(30))


Query with relational columns only (Q1)

-- Uses the PRODUCT table in database “POPSSER”

db2 => set current explain mode explain;
DB20000I  The SQL command completed successfully.

db2 => select count(*) from product where
db2 (cont.) => category = 42 and sub_category = 3;

SQL0217W  The statement was not executed as only Explain information requests
are being processed.  SQLSTATE=01604

db2 => !db2exfmt -d popsser -1 -o q1.exfmt;

DB2 Universal Database Version 9.7, 5622-044 (c) Copyright IBM Corp. 1991, 200
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

Connecting to the Database.
Connect to Database Successful.
Output is in q1.exfmt.

db2 => set current explain mode no;
db2 => -- Let's see what's in q1.exfmt!


The top of q1.exfmt

DB2 Universal Database Version 9.5, 5622-044 (c) Copyright IBM Corp. 1991, 2007
Licensed Material - Program Property of IBM
IBM DATABASE 2 Explain Table Format Tool

******************** EXPLAIN INSTANCE ********************

DB2_VERSION:       09.07.0
SOURCE_NAME:       SQLC2G13
SOURCE_SCHEMA:     NULLID
SOURCE_VERSION:
EXPLAIN_TIME:      2009-03-28-21.14.50.051131
EXPLAIN_REQUESTER: SENGLERT

Database Context:
----------------
Parallelism:          None
CPU Speed:            4.000000e-005
Comm Speed:           0
Buffer Pool size:     880100
Sort Heap size:       1024
Database Heap size:   2476
Lock List size:       423444
Maximum Lock List:    98
Average Applications: 1
Locks Available:      13279204

Package Context:
---------------
SQL Type:           Dynamic
Optimization Level: 5
Blocking:           Block All Cursors
Isolation Level:    Cursor Stability

These are database parameters that can affect query plan selection!


The next part of q1.exfmt

---------------- STATEMENT 1  SECTION 201 ----------------
QUERYNO:        1
QUERYTAG:       CLP
Statement Type: Select
Updatable:      No
Deletable:      No
Query Degree:   1

Original Statement:
------------------
select count(*) from product where category = 42 and sub_category = 3

Optimized Statement:
-------------------
SELECT Q3.$C0
FROM
   (SELECT COUNT(*)
    FROM
       (SELECT $RID$
        FROM SENGLERT.PRODUCT AS Q1
        WHERE (Q1.SUB_CATEGORY = 3) AND (Q1.CATEGORY = 42)) AS Q2) AS Q3

Optimized Statement: a SQL-like representation of the query after rewriting:
• View merging
• Redirection to summary tables
• Pre-computation of constant expressions
• Subquery-to-join transformations


The interesting part of q1.exfmt

Total Cost: 45.259
Query Degree: 1

                Rows
               RETURN
               (   1)
                Cost
                 I/O
                 |
                 1
               GRPBY
               (   2)
               45.2234
                 17
                 |
              0.377555
               FETCH
               (   3)
               45.2096
                 17
             /---+---\
        92.6111       19450
        RIDSCN   TABLE: SENGLERT
        (   4)       PRODUCT
        18.7684
           1
           |
        92.6111
         SORT
        (   5)
        18.7169
           1
           |
        92.6111
        IXSCAN
        (   6)
        16.3696
           1
           |
         19450
   INDEX: SENGLERT
        PRODX2

What the operators do, reading from the bottom up:
► INDEX PRODX2: relational index on (category, prodkey)
► IXSCAN (6): relational index scan: CATEGORY = 42
► SORT (5): sort row IDs (RIDs) generated by the index scan
► FETCH (3): fetch rows from the base table and apply the SUB_CATEGORY = 3 predicate
► GRPBY (2): compute aggregate: count(*)

A tree of operators. Every operator has the following (sample operator: IXSCAN):
► Rowcount estimate: 92.6111
► Operator name: IXSCAN
► Operator number: (6)
► Cost: 16.3696
► I/O count: 1

If you forget what the numbers mean, look at the RETURN operator! It serves as a legend.
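The five values stacked in each operator cell (row count, name, number, cost, I/O) are regular enough to pull apart mechanically. The following is a minimal sketch, assuming one operator cell has already been cut out of the db2exfmt graph as five lines; `parse_operator` is a hypothetical helper, not part of any DB2 tool.

```python
from typing import Dict, Union


def parse_operator(block: str) -> Dict[str, Union[str, int, float]]:
    """Parse one five-line operator cell: rows, name, (number), cost, I/O."""
    lines = [ln.strip() for ln in block.strip().splitlines() if ln.strip()]
    if len(lines) != 5:
        raise ValueError("expected 5 lines: rows, name, (number), cost, I/O")
    rows, name, number, cost, io = lines
    return {
        "rows": float(rows),                 # estimated row count
        "name": name,                        # operator name, e.g. IXSCAN
        "number": int(number.strip("() ")),  # operator number, e.g. 6
        "cost": float(cost),                 # cumulative cost in timerons
        "io": float(io),                     # estimated I/O count
    }


# The sample operator from the Q1 plan.
sample = """
92.6111
IXSCAN
(   6)
16.3696
   1
"""
op = parse_operator(sample)
print(op)
```

The order of the five fields is exactly the order spelled out by the RETURN legend: Rows, operator name, operator number, Cost, I/O.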


How to see what each operator is doing

Example: IXSCAN in operator 6. Look at the detail section in the output:

6) IXSCAN: (Index Scan)
   …
   Predicates:
   ----------
   3) Start Key Predicate
      Comparison Operator:     Equal (=)
      Subquery Input Required: No
      Filter Factor:           0.0047615

      Predicate Text:
      --------------
      (Q1.CATEGORY = 42)

   3) Stop Key Predicate
      Comparison Operator:     Equal (=)
      Subquery Input Required: No
      Filter Factor:           0.0047615

      Predicate Text:
      --------------
      (Q1.CATEGORY = 42)

   Input Streams:
      1) From Object SENGLERT.PRODX2
         Estimated number of rows: 19450
         Number of columns:        2

   Output Streams:
      2) To Operator #5
         Estimated number of rows: 92.6111

► The Predicates section answers the question: what predicate is being evaluated?
► PRODX2 is the relational index on (category, prodkey)
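The numbers in this detail section fit together arithmetically. As a worked check, assuming the output estimate of this single-predicate scan is simply the input cardinality multiplied by the predicate's filter factor (the variable names are illustrative):

```python
input_rows = 19450          # estimated rows in index PRODX2 (input stream)
filter_factor = 0.0047615   # filter factor of (Q1.CATEGORY = 42)

# Estimated rows flowing out of IXSCAN (6) into operator #5.
estimated_output = input_rows * filter_factor
print(round(estimated_output, 4))
```

The product agrees with the 92.6111 rows reported in the output stream to within rounding of the printed filter factor.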


EXPLAIN plan operators – last three are XML-specific

Operator   Description
--------   -----------
DELETE     Deletes rows from a table.
FETCH      Fetches rows from a table.
FILTER     Filters data.
GENROW     Used by DB2 to generate rows of data.
GRPBY      Groups rows.
HSJOIN     Performs a hash join in which the qualified rows from tables are hashed.
INSERT     Inserts rows into a table.
IXAND      The AND’ing of the results of multiple index scans.
IXSCAN     Scans or probes an index on relational data.
MSJOIN     Performs a merge-sort join.
NLJOIN     Performs a nested loop join.
RETURN     Returns data from a query.
RIDSCN     Scans a list of row identifiers (RIDs).
RPD        Retrieves data from a non-relational remote data source.
SHIP       Retrieves data from a remote data source.
SORT       Sorts rows or row IDs from a table.
TBSCAN     Performs a table scan.
TEMP       Stores data in a temporary table.
TQ         A table queue, for parallelization of a query.
UNION      Concatenates streams of rows from multiple tables.
UNIQUE     Eliminates rows with duplicate values.
UPDATE     Updates data in the rows of a table.
XANDOR     Evaluates multiple predicates simultaneously with two or more XISCAN operators.
XISCAN     Scans or probes an index on XML data.
XSCAN      Navigates XML data to evaluate XPath expressions.

Table courtesy of Matthias Nicola


XML-specific EXPLAIN operators

XSCAN – XML document scan. Traverses XML document trees, extracts document sequences or values, evaluates predicates.

XISCAN – XML index scan
► Input: path-value pair such as $doc/product[p_price > 1.00]
► Output: row IDs of qualifying documents and node IDs within those documents

XANDOR – XML index AND-ing
► Input: two or more XISCANs
► Output: row IDs of documents that satisfy all XISCANs
► Can be used if:
   – Only equality predicates are used.
   – There are no wildcards in the index lookup path.
   – All predicates involve the same XML column.
► XANDOR does round-robin probing of indexes to efficiently find qualifying row IDs
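The three XANDOR conditions can be written down as a simple check. This is a sketch of the rule as stated in the list, not of the optimizer's actual logic. `XmlPredicate` and `xandor_eligible` are hypothetical names, and treating `*` and `//` as the "wildcards" in a lookup path is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class XmlPredicate:
    column: str    # XML column the predicate navigates, e.g. "PRDDOC"
    path: str      # index lookup path, e.g. "/product/category"
    operator: str  # comparison operator, e.g. "=" or "<"


def xandor_eligible(preds: List[XmlPredicate]) -> bool:
    """Apply the three listed conditions; XANDOR needs two or more index scans."""
    if len(preds) < 2:
        return False
    all_equality = all(p.operator == "=" for p in preds)
    # Assumption: '*' and '//' are what counts as a wildcard in a path.
    no_wildcards = all("*" not in p.path and "//" not in p.path for p in preds)
    same_column = len({p.column for p in preds}) == 1
    return all_equality and no_wildcards and same_column


# Two equality predicates on the same column (the shape of Q4).
q4_like = [
    XmlPredicate("PRDDOC", "/product/category", "="),
    XmlPredicate("PRDDOC", "/product/sub_category", "="),
]
# A range predicate breaks the first condition.
range_like = [
    XmlPredicate("PRDDOC", "/product/category", "<"),
    XmlPredicate("PRDDOC", "/product/sub_category", "="),
]
print(xandor_eligible(q4_like), xandor_eligible(range_like))
```

When a combination fails this check, the deck's Q5 example shows the alternative: IXAND, which also accepts range predicates and relational indexes.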


XML extraction, relational predicate (Q2)

explain plan for
select xmlquery('$PRDDOC/product/item_desc/text()') from product where prodkey = 1;
!db2exfmt -d popsser -1 -t|more;

               Rows
              RETURN
              (   1)
               Cost
                I/O
                |
                1
              NLJOIN
              (   2)
              26.2359
                3
             /-+--\
            1      1
         FETCH    XSCAN
         (   3)   (   5)
         18.1042  8.13169
            2       1
        /---+---\
       1         19450
    IXSCAN   TABLE: SENGLERT
    (   4)       PRODUCT
    9.9514         Q2
       1
       |
     19450
  INDEX: SYSIBM
SQL090217203311130
       Q2

► IXSCAN (4): relational index scan: PRODKEY = 1
► FETCH (3): fetches documents from the base table
► XSCAN (5): the navigation operator; extracts /product/item_desc/text()
► NLJOIN (2): this operator is not really joining anything; it delivers documents to the XSCAN operator

Details of XSCAN operator (5)

5) XSCAN : (XML Doc Navigation)
   Arguments:
   INPUTXID: (Context Node)
      PRDDOC
   JN INPUT: (Join input leg)
      INNER
   XPATH   : (Internal XPath Expression)
      ($INTERNAL_XMLTOXML_NIEO$(Q2.PRDDOC))/product/item_desc/(text())(:-->$C0:)


FAQ about Q2

Q. How come the plan shows a NLJOIN? There is no join happening.
A. True, there isn’t. This is a notation used to indicate that documents are being passed to XSCAN. The pictures show how to think of what is happening. Imagine that the FETCH feeds the XSCAN.

Actual plan:

              NLJOIN
              (   2)
              26.2359
                3
             /-+--\
            1      1
         FETCH    XSCAN
         (   3)   (   5)
         18.1042  8.13169
            2       1
        /---+---\
       1         19450
    IXSCAN   TABLE: SENGLERT
    (   4)       PRODUCT
    9.9514
       1
       |
     19450
  INDEX: SYSIBM
SQL090217203311130
       Q2

How to think about it – FETCH feeds XSCAN:

          XSCAN
          (   2)
          8.13169
            |
            1
          FETCH
          (   3)
          18.1042
            2
        /---+---\
       1         19450
    IXSCAN   TABLE: SENGLERT
    (   4)       PRODUCT
    9.9514
       1
       |
     19450
  INDEX: SYSIBM
SQL090217203311130
       Q2

Idea/pictures courtesy of M. Nicola


XQUERY that uses an XML index for one predicate (Q3)

xquery
for $i in db2-fn:xmlcolumn('PRODUCT.PRDDOC')
where $i/product/category = 54
return <result>{$i/product/item_desc}</result>;

Cost and I/O values are not shown for the operators in this excerpt.

                Rows
               RETURN
               (   1)
                Cost
                 I/O
                 |
              30.7726
               FILTER
               (   2)
                 |
              32.0548
               NLJOIN
               (   3)
              /--+--\
        34.1379    0.938979
         FETCH      XSCAN
        (   4)     (   8)
       /---+----\
  34.1379        19450
  RIDSCN    TABLE: SENGLERT
  (   5)        PRODUCT
    |
  34.1379
   SORT
  (   6)
    |
  34.1379
  XISCAN
  (   7)
    |
   19450
XMLIN: SENGLERT
DIM_PRODCATEGORYIDX
    Q2

► XISCAN (7): XML index scan on /product/category = 54
► SORT (6): sorts RIDs of rows with qualifying documents
► XSCAN (8): the navigation operator; extracts /product/item_desc and rechecks /product/category = 54
► NLJOIN (3): again, not a real nested loop join. It delivers documents for navigation.

Details of XSCAN operator 8:

XPATH : (Internal XPath Expression)
   Q2.PRDDOC/{(.[(product/category = 54)])(:-->$C0:),product/(item_desc)(:-->$C1:)}


FAQ about Q3

Q. Why is there a NLJOIN shown? This is not a join.
► A. See FAQ for Q2.

Q. The details for XSCAN operator (8) show that it does two things: extraction of /product/item_desc and re-evaluation of the /product/category predicate. Why do we need to re-evaluate the predicate? Hasn’t the index scan XISCAN operator (7) already returned only the rows with documents satisfying the predicate?
► A. Good question! It turns out that there are some (rare) cases in which the index can return documents that don’t satisfy the predicate (but it never misses any that do). So the plan is careful and includes a navigation step to make sure that the predicate is really satisfied. However, a run-time optimization is able to avoid this “extra” navigation in many cases: often, DB2 can detect that the recheck is not needed.
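The "index may over-report, so recheck" pattern can be sketched with toy data. Everything here is hypothetical: the documents, the RIDs, and the `index_then_recheck` helper only illustrate the idea that the index yields a superset of candidates and a navigation step filters out the false positives. It does not model DB2's run-time optimization that skips the recheck.

```python
from typing import Callable, Dict, Iterable, List


def index_then_recheck(
    candidate_rids: Iterable[int],
    documents: Dict[int, dict],
    predicate: Callable[[dict], bool],
) -> List[int]:
    """Keep only the index candidates whose document really satisfies the predicate."""
    return [rid for rid in candidate_rids if predicate(documents[rid])]


# Toy data: RID -> document, with made-up values.
documents = {
    1: {"category": 54, "item_desc": "A"},
    2: {"category": 54, "item_desc": "B"},
    3: {"category": 7, "item_desc": "C"},
}

# Pretend the index returned a superset: RID 3 is a false positive.
candidates = [1, 2, 3]
qualifying = index_then_recheck(candidates, documents, lambda d: d["category"] == 54)
print(qualifying)  # [1, 2]
```

Because the index never misses a qualifying document, rechecking only the candidates is enough; documents the index did not return never need to be navigated.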


XQUERY with two predicates and two XML indexes (Q4)

xquery
for $i in db2-fn:xmlcolumn('PRODUCT.PRDDOC')
where $i/product/category = 54 and
      $i/product/sub_category = 3
return <result>{$i/product/item_desc}</result>;

                  RETURN
                  (   1)
                    |
                 0.067333
                  FILTER
                  (   2)
                    |
                 0.0701385
                  NLJOIN
                  (   3)
                 /--+---\
         0.0747235     0.938641
           FETCH         XSCAN
          (   4)        (  10)
        /----+----\
  0.0747235        19450
    RIDSCN    TABLE: SENGLERT
    (   5)        PRODUCT
      |
  0.0747235
     SORT
    (   6)
      |
  0.0747235
    XANDOR
    (   7)
   /--------+---------\
34.1379              42.5735
XISCAN               XISCAN
(   8)               (   9)
  |                    |
19450                19450
XMLIN: SENGLERT      XMLIN: SENGLERT
DIM_PRODCATEGORYIDX  DIM_PRODSUBCATEGORYIDX
  Q2                   Q2

► XISCAN (8): XML index scan on /product/category = 54
► XISCAN (9): XML index scan on /product/sub_category = 3
► XANDOR (7): XML index AND-ing (see “XML-specific EXPLAIN operators”). Output: RIDs that satisfy both XISCANs.
► XSCAN (10): navigation to extract item_desc as well as to re-check the predicates on category and sub_category


One XML index, one relational index (Q5)

explain plan for
select count(*)
from product
where xmlexists('$PRDDOC/product/category[. < 10]')
and prodkey between 30 and 100;
!db2exfmt -d popsser -1 -t|more;

                 RETURN
                 (   1)
                   |
                   1
                 GRPBY
                 (   2)
                   |
                24.0385
                ^NLJOIN
                 (   3)
                 /-+--\
          24.0385      1
           FETCH     XSCAN
          (   4)    (  10)
        /---+----\
   24.0385        19450
   RIDSCN    TABLE: SENGLERT
   (   5)        PRODUCT
     |
   24.0385
    SORT
   (   6)
     |
   24.0385
    IXAND
   (   7)
  /-------+-------\
70.9319           6591.52
IXSCAN            XISCAN
(   8)            (   9)
  |                 |
19450             19450
INDEX: SYSIBM       XMLIN: SENGLERT
SQL090217203311130  DIM_PRODCATEGORYIDX

► XISCAN (9): XML index scan on /product/category < 10
► IXAND (7): index AND-ing. Can be used with a combination of XML and relational indexes. Allows range predicates and wildcards in XML expressions.
► XSCAN (10): navigation to re-check the predicate on category


For join queries: The DAILY_SALES table

► One row per sale
► Each row has a foreign key “prodkey” that refers to our product table
► Relational index on “prodkey” (others as well, but not used in our examples)
► One XML document per row (sample at right on the slide). Replicates keys of the relational columns, adds other data.
► XML indexes: /fact/keys/prodkey (type double)

CREATE TABLE "SENGLERT"."DAILY_SALES" (
    "PERKEY"   INTEGER NOT NULL,
    "STOREKEY" INTEGER NOT NULL,
    "CUSTKEY"  INTEGER NOT NULL,
    "PRODKEY"  INTEGER NOT NULL,
    "PROMOKEY" INTEGER NOT NULL,
    "SALDOC"   XML );


Relational join with XML predicate and extractions (Q6)

select px.sub_category, sx.shelf_number
from daily_sales s, product p,
     xmltable('$SALDOC/fact/measures/shelf_number'
              columns shelf_number integer path '.') as sx,
     xmltable('$PRDDOC/product[category < 150]'
              columns sub_category varchar(30) path 'sub_category') as px
where s.prodkey = p.prodkey;

► For certain product categories, find sales of those categories and list their shelf numbers
► There’s an XML index on /product/category
► p.prodkey and s.prodkey are (indexed) relational columns


Relational join with XML predicate and extractions (Q6), continued

Cost and I/O values are not shown for most operators in this excerpt.

                      Rows
                     RETURN
                     (   1)
                      Cost
                       I/O
                       |
                   2.20389e+07
                     NLJOIN
                     (   2)
                    /--+---\
           2.16407e+07      1.0184
             HSJOIN          XSCAN
             (   3)         (  11)
         /-----+------\
  2.22907e+07         18882.9
    TBSCAN            NLJOIN
    (   4)            (   5)
      |               /-+--\
  2.22907e+07      9725     1.94168
TABLE: SENGLERT    FETCH     XSCAN
  DAILY_SALES     (   6)    (  10)
                /---+----\
             9725         19450
            RIDSCN    TABLE: SENGLERT
            (   7)        PRODUCT
              |
             9725
             SORT
            (   8)
              |
             9725
            XISCAN
            (   9)
              48
              |
            19450
        XMLIN: SENGLERT
        DIM_PRODCATEGORYIDX

► XISCAN (9): evaluates /product[category < 150]
► XSCAN (10): navigation to re-evaluate /product[category < 150] and to extract /product/sub_category
► HSJOIN (3): hash join on s.prodkey = p.prodkey
► XSCAN (11): navigation to extract /fact/measures/shelf_number
► NLJOIN (2) and NLJOIN (5): not real nested loop joins


The same join as Q6, with XML join keys (Q7)

► Same query as before, except the relational join predicate on prodkey has been replaced by an XML join predicate inside XMLEXISTS
► Remember that $SALDOC/fact/keys/prodkey and $PRDDOC/product/prodkey have XML indexes

explain plan for
select px.sub_category, sx.shelf_number
from daily_sales s, product p,
     xmltable('$SALDOC/fact/measures/shelf_number'
              columns shelf_number integer path '.') as sx,
     xmltable('$PRDDOC/product[category < 150]'
              columns sub_category varchar(30) path 'sub_category') as px
where xmlexists('$SALDOC/fact/keys[prodkey = $PRDDOC/product/prodkey]');
!db2exfmt -d popsser -1 -t|more;


Plan for the XML join query Q7

                           RETURN
                           (   1)
                             |
                        1.72347e+07
                           NLJOIN
                           (   2)
                     /------+------\
               18694.5            921.915
               NLJOIN             TBSCAN
               (   3)             (   9)
               /-+--\               |
           9725    1.92232     2.05329e+07
          FETCH     XSCAN         TEMP
          (   4)   (   8)        (  10)
        /---+----\                  |
     9725        19450         2.05329e+07
    RIDSCN   TABLE: SENGLERT     NLJOIN
    (   5)       PRODUCT         (  11)
      |                         /---+---\
     9725              2.22907e+07     0.921143
     SORT                TBSCAN         XSCAN
    (   6)               (  12)        (  13)
      |                    |
     9725              2.22907e+07
    XISCAN            TABLE: SENGLERT
    (   7)              DAILY_SALES
      |
    19450
 XMLIN: SENGLERT
 DIM_PRODCATEGORYIDX

► NLJOIN (2): this IS a real nested loop join! NLJOIN is the only option for XML joins.
► XISCAN (7): evaluates /product[category < 150]
► XSCAN (8): retrieves product/prodkey
► TEMP (10): TEMP of the entire DAILY_SALES table. Why??
► TBSCAN (12): TBSCAN of the DAILY_SALES table. Hmmmm… Why is the index on /fact/keys/prodkey not used?
► XSCAN (13): extracts /fact/keys/prodkey and /fact/measures/shelf_number


Q8: Corrected join from Q7 with casts around XML join keys!

► For XML joins: need to cast both sides of the join predicate in order to enable use of the XML index(es)!
► Now it is possible to use index(es) on $SALDOC/fact/keys/prodkey and $PRDDOC/product/prodkey

explain plan for
select px.sub_category, sx.shelf_number
from daily_sales s, product p,
     xmltable('$SALDOC/fact/measures/shelf_number'
              columns shelf_number integer path '.') as sx,
     xmltable('$PRDDOC/product[category < 150]'
              columns sub_category varchar(30) path 'sub_category') as px
where xmlexists('$SALDOC/fact/keys[prodkey/xs:double(.) = $PRDDOC/product/prodkey/xs:double(.)]');
!db2exfmt -d popsser -1 -t|more;


Plan for the corrected XML join query (Q8)

Cost and I/O values are not shown for most operators in this excerpt.

                               RETURN
                               (   1)
                                Cost
                                 I/O
                                 |
                            3.83852e+08
                               NLJOIN
                               (   2)
                     /----------+----------\
               18694.5                    20532.9
               NLJOIN                     NLJOIN
               (   3)                     (   9)
               /-+--\                     /-+--\
           9725    1.92232         753.287      27.2577
          FETCH     XSCAN           FETCH        XSCAN
          (   4)   (   8)          (  10)       (  14)
        /---+----\                /---+----\
     9725        19450       753.287     2.22907e+07
    RIDSCN   TABLE: SENGLERT  RIDSCN    TABLE: SENGLERT
    (   5)       PRODUCT      (  11)      DAILY_SALES
      |                         |
     9725                    753.287
     SORT                     SORT
      48                        5
      |                         |
     9725                    753.287
    XISCAN                   XISCAN
    (   7)                   (  13)
      |                         |
    19450                  2.22907e+07
 XMLIN: SENGLERT          XMLIN: SENGLERT
 DIM_PRODCATEGORYIDX      PRODKEYIDX

► NLJOIN (2): this IS a real nested loop join! NLJOIN is the only option for XML joins.
► XISCAN (7): evaluates /product[category < 150]
► XSCAN (8): retrieves product/prodkey
► XISCAN (13): hooray, an XISCAN of the /fact/keys/prodkey index (join)
► XSCAN (14): extracts /fact/measures/shelf_number and rechecks the /fact/keys/prodkey predicate


Things to remember when reading EXPLAINs

Start reading from the lower left corner, since that is (generally) where execution begins.

Ignore the cost numbers:
► They are in units called timerons that correspond somewhat to estimated elapsed time
► They give clues about the optimizer’s expense estimates, but are of little value to an outside observer

DO watch the estimated row counts: they may help you understand the optimizer’s decisions.

Most value in EXPLAINs:
► Determining index use
► Join order
► Row count (“cardinality”) estimates


Further reading

►http://download.boulder.ibm.com/ibmdl/pub/software/dw/dm/db2/bestpractices/DB2BP_Query_Tuning_0508I.pdf

►http://download.boulder.ibm.com/ibmdl/pub/software/dw/dm/db2/bestpractices/DB2BP_XML_0508I.pdf

►http://www.ibm.com/developerworks/data/library/techarticle/dm-0611nicola/

►http://www.ibm.com/developerworks/data/library/techarticle/dm-0508kapoor/

►http://hoadb2ug.org/Docs/accessplans.pdf (old, but good)


Acknowledgements

Matthias Nicola – review and ideas
Wolfgang Krause – careful review
Anjali Norwood – patient answers to random questions
