Guidelines SQL Performance

8/12/2019

1/149

Informix SQL Performance Tuning

Mike Walker
UCI Consulting, Inc.

Phone: 1-888-UCI FOR U

1-888-824-3678

Fax: 1-609-654-0957

e-mail: [email protected]


2/149

Overview:

Discuss steps for optimizing
Discuss the output of the Set Explain command
Finding Slow Running SQL
Discuss Indexing Schemes
Data Access Methods
Optimizer Directives
Discuss optimization techniques and examples
XTREE command
Correlated Sub-Queries


3/149

What will not be covered:

Engine & Database Tuning:
Onconfig settings
Disk/Table Layouts
Fragmentation, etc.


4/149

Steps for Optimizing


5/149

Optimization Goal:

Increase Performance
Reduce I/O
reduce I/O performed by the engine
reduce I/O between the back-end and the front-end (reduce number of database operations)
Reduce processing time


6/149

Setting up a Test Environment

Identify Problem Queries
Easier to spot
Easier to trace
Simplify Queries
Test on a machine with minimal system activity
Use a database that reflects production data
Number of rows & similar distributions
Want same query plan
Want similar timings
Turn Set Explain on
Change configuration parameters
Turn PDQ on
Bounce engine to clear LRUs


7/149

Optimizing the Query:

Understand the Requirements
What is the object of the query?
What is the information required?
What are the ordering criteria?


8/149

Optimizing the Query:
Examine the Schema

Identify the data types and indexes on the columns being:
selected
used as filters
used in joins
used for sorting

Be aware of constraints on the data (e.g. primary, check, etc.)
Some constraints are enforced with indexes
Primary and Unique constraints may help identify when a single row is expected to be returned
Check constraints may hint at data distributions


9/149


Examine the Data

Consider the number of rows examined vs. the number of rows returned
Determine the distribution of filter columns
dbschema -hd -d (if have stats on that column)
Select count with group
Look at the relationship of joined tables:
one-to-one
one-to-many
many-to-many


10/149


Run, Examine and Modify

Run the Query:

query.sql:
UPDATE STATISTICS FOR TABLE query_table;
SET EXPLAIN ON;
SELECT . . .

$ timex dbaccess db query.sql > try1.out 2>&1

Examine the Set Explain output
Modify the query and/or schema (use directives to test various paths)
Run the query again


11/149


Explain Output

The query plan is written to the file: sqexplain.out
File is created in the current directory (UNIX)
If SQLEditor is used, the file will be in the home directory of the user that the SQL was executed as
File will be appended to each time more SQL is executed in the same session
For NT, look for a file called username.out in %INFORMIXDIR%\sqexpln on the server


12/149


Explain Output

Sometimes sqexplain.out will not be written to, even though the SET EXPLAIN ON statement has been executed
Turn off the EXPLAIN and turn it back on again:

SET EXPLAIN OFF;
SET EXPLAIN ON;
SELECT


13/149


Explain Output

In IDS 9.4
onmode -Y sid [0|1]
Set or unset dynamic explain
Creates file called: sqexplain.out.sid
May have issues


14/149

Set Explain Output


15/149

Set Explain: Example 1
QUERY:
select * from stock order by description

Estimated Cost: 6
Estimated # of Rows Returned: 15
Temporary Files Required For: Order By

1) informix.stock: SEQUENTIAL SCAN


16/149

Set Explain: Example 2
QUERY:
select * from stock where unit_price>20
order by stock_num

1) informix.stock: INDEX PATH
Filters: informix.stock.unit_price > 20
(1) Index Keys: stock_num manu_code


17/149

Set Explain: Example 3
QUERY:
select manu_code from stock

Estimated Cost: 2

Estimated # of Rows Returned: 15

1) informix.stock: INDEX PATH

(1) Index Keys: stock_num manu_code (Key-Only)


18/149

Set Explain: Example 4

QUERY:
select * from stock
where stock_num>10 and stock_num<14

Lower Index Filter: informix.stock.stock_num > 10
Upper Index Filter: informix.stock.stock_num < 14


19/149

Set Explain: Example 5
QUERY:
select * from stock, items
where stock.stock_num = items.stock_num
and items.quantity>1

1) informix.stock: SEQUENTIAL SCAN

2) informix.items: INDEX PATH
Filters: informix.items.quantity > 1
(1) Index Keys: stock_num manu_code
Lower Index Filter: informix.items.stock_num = informix.stock.stock_num


20/149

Set Explain: Example 6
QUERY:
------
select *
from items, stock
where items.total_price = stock.unit_price

1) informix.items: SEQUENTIAL SCAN

DYNAMIC HASH JOIN
Dynamic Hash Filters: informix.items.total_price = informix.stock.unit_price


21/149

Set Explain: Example 7
Table ps_ledger has the following index:

create index psaledger on ps_ledger
(account, fiscal_year, accounting_period, business_unit, ledger, currency_cd, statistics_code, deptid, product, posted_total_amt)
fragment by expression
( fiscal_year = 2003 ) in dbspace1,
( fiscal_year = 2004 ) in dbspace2,
remainder in dbspace3


22/149

Set Explain: Example 7 cont.
QUERY:
------
select fiscal_year, account, posted_total_amt
from ps_ledger
where fiscal_year = 2003
and accounting_period = 10
and account between '1234' and '9999'

1) sysadm.ps_ledger: INDEX PATH
Filters: (ps_ledger.fiscal_year = 2003 AND ps_ledger.accounting_period = 10 )
(1) Index Keys: account fiscal_year accounting_period business_unit ledger currency_cd statistics_code deptid product posted_total_amt (Key-Only) (Serial, fragments: 0)
Lower Index Filter: ps_ledger.account >= '1234'
Upper Index Filter: ps_ledger.account <= '9999'


23/149


24/149

Finding Slow SQL


25/149

Finding Slow SQL

onstat -u

address  flags   sessid user    tty wait     tout locks nreads nwrites
1b062064 Y--P--- 44948  cbsread -   1c5f3cc8 0    1     0      0
1b06662c ---PX-- 44961  cbsdba  -   0        0    0     2022   118008
1b067520 Y--P--- 39611  cbsuser -   1ecf6f00 0    1     5308   61240

address  flags   sessid user    tty wait     tout locks nreads nwrites
1b062064 Y--P--- 44948  cbsread -   1c5f3cc8 0    1     0      0
1b06662c ---P--- 44961  cbsdba  -   0        0    1     2372   135200
1b067520 Y--P--- 39611  cbsuser -   1ecf6f00 0    1     5308   61240

address  flags   sessid user    tty wait     tout locks nreads nwrites
1b062064 Y--P--- 44948  cbsread -   1c5f3cc8 0    1     0      0
1b06662c ---P--- 44961  cbsdba  -   0        0    1     31294  6803308
1b067520 Y--P--- 39611  cbsuser -   1ecf6f00 0    1     5308   61240


26/149

Finding Slow SQL

onstat -g ntt

Individual thread network information (times):

netscb   thread name sid   open     read     write    address
1d380f00 sqlexec     44961 16:46:29 16:46:29 16:46:29

> date
Wed Apr 7 16:49:49 MDT 2004

Query has been executing for 3 mins 20 secs
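
The elapsed time on this slide is the difference between the last network read time reported by onstat -g ntt and the current clock time. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and it assumes both times fall on the same day.

```python
from datetime import datetime


def elapsed_seconds(last_read: str, now: str) -> int:
    """Seconds between two HH:MM:SS clock times on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(now, fmt) - datetime.strptime(last_read, fmt)
    return int(delta.total_seconds())


# Times taken from the slide: last read at 16:46:29, `date` shows 16:49:49.
secs = elapsed_seconds("16:46:29", "16:49:49")
minutes, seconds = divmod(secs, 60)
print(f"Query has been executing for {minutes} mins {seconds} secs")
```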


27/149

Finding Slow SQL

onstat -g sql 44961 or onstat -g ses 44961

Sess  SQL       Current     Iso Lock     SQL ISAM F.E.
Id    Stmt type Database    Lvl Mode     ERR ERR  Vers
44961 SELECT    cbstraining CR  Not Wait 0   0    7.31

Current statement name : slctcur

Current SQL statement :
select * from tab1, tab2 where tab1.a = tab2.b order by tab2.c

Last parsed SQL statement :
select * from tab1, tab2 where tab1.a = tab2.b order by tab2.c


28/149

Finding Slow SQL

These Informix onstat commands are easily scriptable!!

Create a suite of performance monitoring scripts
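
One possible monitoring script compares two onstat -u samples and ranks sessions by how much their nreads and nwrites grew, which is how session 44961 stands out in the earlier samples. The Python sketch below is illustrative only: the function names are hypothetical, the sample text is copied from the slides, and it assumes each user row has exactly the ten columns shown there (real output carries extra header and summary lines).

```python
def parse_onstat_u(text):
    """Map session id -> (nreads, nwrites) from `onstat -u` style rows."""
    stats = {}
    for line in text.strip().splitlines():
        fields = line.split()
        # Skip the header and anything that is not a 10-column user row.
        if len(fields) != 10 or fields[0] == "address":
            continue
        stats[int(fields[2])] = (int(fields[8]), int(fields[9]))
    return stats


def busiest_sessions(before, after):
    """Sessions ordered by growth in nreads + nwrites between two samples."""
    growth = {}
    for sid, (reads, writes) in after.items():
        old_reads, old_writes = before.get(sid, (0, 0))
        growth[sid] = (reads - old_reads) + (writes - old_writes)
    return sorted(growth.items(), key=lambda item: item[1], reverse=True)


SAMPLE_1 = """
address  flags   sessid user    tty wait     tout locks nreads nwrites
1b062064 Y--P--- 44948  cbsread -   1c5f3cc8 0    1     0      0
1b06662c ---PX-- 44961  cbsdba  -   0        0    0     2022   118008
1b067520 Y--P--- 39611  cbsuser -   1ecf6f00 0    1     5308   61240
"""

SAMPLE_2 = """
address  flags   sessid user    tty wait     tout locks nreads nwrites
1b062064 Y--P--- 44948  cbsread -   1c5f3cc8 0    1     0      0
1b06662c ---P--- 44961  cbsdba  -   0        0    1     2372   135200
1b067520 Y--P--- 39611  cbsuser -   1ecf6f00 0    1     5308   61240
"""

ranking = busiest_sessions(parse_onstat_u(SAMPLE_1), parse_onstat_u(SAMPLE_2))
print("Busiest session and I/O growth:", ranking[0])
```

In practice the two samples would come from running onstat -u twice, a few seconds apart, and the top session id would then be passed to onstat -g sql.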


29/149

Indexing Schemes


30/149

Indexing Schemes: B+ Trees

[Figure: B+ tree. Level 2 is the root node, Level 1 holds the branch nodes, and Level 0 holds the leaf keys 15, 25, 99, 100, 132, 190, 400, 500, 501, 699, 850 and 999, which point to the DATA.]


31/149

Indexing Schemes:

Types of Indexes
Unique
Duplicate
Composite
Clustered
Attached
Detached
In 9.x, all indexes are detached - index pages and data pages are not interleaved


32/149

Indexing Schemes:
Leading Portion of an Index

Consider an index on columns a, b and c on table xyz.

Index is used for:

SELECT * FROM XYZ

WHERE a = 1 AND b = 2 AND c = 3

SELECT * FROM XYZ

WHERE a = 1 AND b = 2

SELECT * FROM XYZ WHERE a = 1

ORDER BY a, b, c

Index is not used for:

SELECT * FROM XYZ

WHERE b = 2 AND c = 3

SELECT * FROM XYZ WHERE b = 2

SELECT * FROM XYZ WHERE c = 3

ORDER BY b, c
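
The leading-portion rule can be expressed as a prefix check. The Python sketch below is a simplified model rather than the optimizer's actual logic: the function name is hypothetical, and it only considers equality filters.

```python
def usable_prefix(index_columns, equality_filters):
    """Return the leading index columns that equality filters can use."""
    prefix = []
    for column in index_columns:
        if column not in equality_filters:
            break  # a gap ends the usable leading portion
        prefix.append(column)
    return prefix


INDEX = ["a", "b", "c"]  # the composite index from the slide

for filters in (["a", "b", "c"], ["a", "b"], ["a"], ["b", "c"], ["b"], ["c"]):
    prefix = usable_prefix(INDEX, filters)
    print(filters, "->", prefix if prefix else "index not used")
```

The first three filter sets match the "index is used" cases on the slide, and the last three return an empty prefix.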


33/149

Indexing Schemes:
Guidelines

Evaluate Indexes on the following:
Columns used in joining tables
Columns used as filters
Columns used in ORDER BYs and GROUP BYs
Avoid highly duplicate columns
Keep key size small
Limit indexes on highly volatile tables
Use the FILLFACTOR option


34/149

Indexing Schemes:

Benefits vs. Cost

Benefits
Speed up Queries
Guarantee Uniqueness

Cost
Maintenance of indexes on Inserts, Updates & Deletes
Extra Disk Space


35/149

Any Questions?


36/149

How Data is Accessed


37/149

Data Access Methods

Sequential Scan
Index
Auto Index


38/149

Index Scans:
Upper and Lower Index Filters

QUERY:
select * from stock
where stock_num>=99 and stock_num<...

Lower Index Filter: informix.stock.stock_num >= 99
Upper Index Filter: informix.stock.stock_num <...


39/149

Index Scans:
Upper and Lower Index Filters

[Figure: the same B+ tree, with leaf keys 15, 25, 99, 100, 132, 190, 400, 500, 501, 699, 850 and 999.]
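
The effect of having both a lower and an upper index filter can be sketched with the leaf keys from the figure. In the Python sketch below, the sorted list stands in for the leaf level of the B+ tree, the function names are hypothetical, and the upper bound of 190 is an assumption chosen for illustration (the slide's own upper bound is not recoverable).

```python
from bisect import bisect_left, bisect_right

# Leaf-level keys from the B+ tree figure.
LEAF_KEYS = [15, 25, 99, 100, 132, 190, 400, 500, 501, 699, 850, 999]


def scan_with_both_filters(keys, lower, upper):
    """Position at the lower bound and stop at the upper bound."""
    return keys[bisect_left(keys, lower):bisect_right(keys, upper)]


def scan_with_lower_filter_only(keys, lower):
    """Position at the lower bound, then read every remaining key."""
    return keys[bisect_left(keys, lower):]


bounded = scan_with_both_filters(LEAF_KEYS, 99, 190)  # 190 is assumed
unbounded = scan_with_lower_filter_only(LEAF_KEYS, 99)
print(len(bounded), "keys examined with both filters:", bounded)
print(len(unbounded), "keys examined with only the lower filter")
```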


40/149

Index Scans:
Upper and Lower Index Filters

Create indexes on columns that are the most selective.
For example:

SELECT * FROM CUSTOMER
WHERE ACCOUNT BETWEEN 100 and 1000
AND STATUS = 'A'
AND STATE = 'MD'

Which column is the most selective? Account, status or state?


41/149

Index Scans:
Key-Only

QUERY:
select manu_code from stock
where stock_num = 190

1) informix.stock: INDEX PATH
(1) Index Keys: stock_num manu_code (Key-Only)
Lower Index Filter: informix.stock.stock_num = 190


42/149

Index Scans: Key-Only

Index Read (not Key Only)
Index Pages: stock_num
Data Pages: stock_num, manu_code, qty

Index Read (Key Only)
Index Pages: stock_num, manu_code

select manu_code from stock where stock_num = 190


43/149



44/149

Index Scans: Key-First
QUERY:
select count(e) from mytable
where a=1
and b=1
and d="Y"

1) informix.mytable: INDEX PATH
Filters: informix.mytable.d = 'Y'
(1) Index Keys: a b c d (Key-First) (Serial, fragments: ALL)
Lower Index Filter: (informix.mytable.a = 1 AND informix.mytable.b = 1 )


45/149

Index Scans: Key-First

May not see much advantage with Key-First indexes. They may help somewhat, especially for large, wide tables

Can gain some benefit from adding additional columns to the end of the index to reduce the jumps from the index pages to the data pages

Evaluate adding a new index or changing the index to include the key-first column earlier in the index


46/149

Any Questions?


47/149


48/149

Joining Tables: Join Methods

Nested Loop Join
Dynamic Hash Join
Sort Merge Join


49/149

Joining Tables

Consider the following query:

select * from stock, items
where stock.stock_num = items.stock_num
and items.quantity>10

What we're looking for is:
All of the items records with a quantity greater than 10 and their associated stock records.


50/149

Join Methods: Nested Loop Join
QUERY:
select * from stock, items
where stock.stock_num = items.stock_num
and items.quantity>10

2) informix.items: INDEX PATH
Filters: informix.items.quantity > 10
(1) Index Keys: stock_num manu_code
Lower Index Filter: items.stock_num = stock.stock_num
NESTED LOOP JOIN

Notice the index on the joined column


51/149

Joining Tables: Table Order

Consider the select:

select * from A, B
where A.join_col = B.join_col

How can the database satisfy this join?

Read from A then find matching rows in B
Read from B then find matching rows in A


52/149

Joining Tables: Table Order
Who Cares?

Table A - 1,000 rows    Table B - 50,000 rows

A then B
1,000 reads from A
For each A row do an index scan into B (4 reads)
Total reads: 5,000 (1,000 for A + 1,000 x 4 for B)

B then A
50,000 reads from B
For each B row do an index scan into A (3 reads)
Total reads: 200,000 (50,000 for B + 50,000 x 3 for A)

This is a difference of 195,000 reads!!!



53/149

Joining Tables: Table Order
What is the best order?

Table A - 1,000 rows    Table B - 50,000 rows

select * from A, B
where A.join_col = B.join_col
and B.filter_col = 1

Assume 10 rows meet this condition

A then B
1,000 reads from A
For each A row do an index scan into B (4 reads)
Total reads: 5,000 (1,000 for A + 1,000 x 4 for B)
Total Rows Returned: 10

B then A
Index scan of B (3 reads), then the data (10 reads) for a total of 13
For each B row do an index scan into A (3 reads)
Total reads: 43 (13 for B + 10 x 3 for A)
Total Rows Returned: 10

General Rule: The table which returns the fewest rows, either through a filter or the row count, should be first.
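
The read counts on the two Table Order slides follow one formula: reads of the outer table plus one index lookup into the inner table per outer row. The Python sketch below reproduces that arithmetic; the function name is hypothetical and the per-lookup costs (4 and 3 reads) are the slides' assumptions.

```python
def nested_loop_reads(outer_reads, outer_rows, reads_per_inner_lookup):
    """Total reads: scan the outer table, then probe the inner per row."""
    return outer_reads + outer_rows * reads_per_inner_lookup


# No filter: every row of the outer table drives a lookup.
a_then_b = nested_loop_reads(1_000, 1_000, 4)
b_then_a = nested_loop_reads(50_000, 50_000, 3)
print(a_then_b, b_then_a, b_then_a - a_then_b)

# With B.filter_col = 1 matching 10 rows: 3 index reads + 10 data reads
# on B, then one lookup into A for each of the 10 qualifying rows.
filtered_b_then_a = nested_loop_reads(3 + 10, 10, 3)
print(filtered_b_then_a)
```

With the filter in place the cheaper order flips from "A then B" to "B then A", which is the point of the general rule above.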



54/149

Joining Tables: Table Order
What affects the join order?

Number of rows in the tables
Indexes available for:
Filters
Join Columns
Data Distribution
UPDATE STATISTICS is very important


55/149

Any Questions?


56/149

Optimizer Directives


57/149


58/149



59/149

Optimizer Directives:
Syntax

SELECT --+ directive text        SELECT {+ directive text }
UPDATE --+ directive text        UPDATE {+ directive text }
DELETE --+ directive text        DELETE {+ directive text }

C-style comments are also valid as in:

SELECT /*+ directive */


60/149

Types of Directives

Access Methods
Join Methods
Join Order
Optimization Goal
Query Plan Only (IDS 9.3)
Correlated Subquery Flattening (IDS 9.3)


61/149

Types of Directives:

Access Methods

index        forces use of a subset of specified indexes
avoid_index  avoids use of specified indexes
full         forces sequential scan of specified table
avoid_full   avoids sequential scan of specified table



62/149

Types of Directives:
Join Order

ordered      forces table order to follow the FROM clause



63/149

Types of Directives:
Optimization Goal

first_rows (N)  tells the optimizer to choose a plan optimized to return the first N rows of the result set
all_rows        tells the optimizer to choose a plan optimized to return all of the results

Query level equivalent of:
OPT_GOAL configuration parameter (instance level)
0=First Rows, -1=All Rows (default)
OPT_GOAL environment variable (environment level)
SET OPTIMIZATION statement (session level)
FIRST_ROWS, ALL_ROWS


64/149


Join Methods

use_nl       forces nested loop join on specified tables
use_hash     forces hash join on specified tables
avoid_nl     avoids nested loop join on specified tables
avoid_hash   avoids hash join on specified tables


65/149

Directives Examples: ORDERED
QUERY:
select --+ ORDERED
customer.lname, orders.order_num, items.total_price
from customer, orders, items
where customer.customer_num = orders.customer_num
and orders.order_num = items.order_num
and items.stock_num = 6 and items.manu_code = "SMT"

DIRECTIVES FOLLOWED:
ORDERED
DIRECTIVES NOT FOLLOWED:

1) customer: SEQUENTIAL SCAN

2) orders: INDEX PATH
(1) Index Keys: customer_num
Lower Index Filter: orders.customer_num = customer.customer_num
NESTED LOOP JOIN

3) items: INDEX PATH
Filters: items.order_num = orders.order_num
(1) Index Keys: stock_num manu_code
Lower Index Filter: (items.stock_num = 6 AND items.manu_code = 'SMT' )
NESTED LOOP JOIN


66/149

Directives Examples : INDEX
QUERY:
------
select --+ ordered index(customer, zip_ix) avoid_index(orders," 101_4")
customer.lname, orders.order_num, items.total_price
from customer c, orders o, items i
where c.customer_num = o.customer_num
and o.order_num = i.order_num
and stock_num = 6 and manu_code = "SMT"


67/149

Directives Examples : INDEX (cont.)

DIRECTIVES FOLLOWED:
ORDERED
INDEX ( customer zip_ix )
AVOID_INDEX ( orders 101_4 )
DIRECTIVES NOT FOLLOWED:

1) customer: INDEX PATH
(1) Index Keys: zipcode

2) orders: SEQUENTIAL SCAN
DYNAMIC HASH JOIN (Build Outer)
Dynamic Hash Filters: c.customer_num = o.customer_num

3) items: INDEX PATH
Filters: i.order_num = o.order_num
(1) Index Keys: stock_num manu_code
Lower Index Filter: (i.stock_num = 6 AND i.manu_code = 'SMT' )
NESTED LOOP JOIN


68/149

Directives Examples : Errors

QUERY:
select --+ ordered index(customer, zip_ix) avoid_index(orders," 222_4")
customer.lname, orders.order_num, items.total_price
from customer, orders, items
where customer.customer_num = orders.customer_num
and orders.order_num = items.order_num
and stock_num = 6 and manu_code = "SMT"

DIRECTIVES FOLLOWED:
ORDERED
INDEX ( customer zip_ix )
DIRECTIVES NOT FOLLOWED:
AVOID_INDEX( orders 222_4 ) Invalid Index Name Specified.


69/149


Query Plan

EXPLAIN AVOID_EXECUTE
Generate the Query Plan (SQL Explain Output), but don't run the SQL
Introduced in IDS 9.3
Especially useful for getting the query plans for Inserts, Updates and Deletes: no longer have to rewrite them as Select statements, or surround them with BEGIN WORK / ROLLBACK WORK commands


70/149

Types of Directives:
Query Plan

Without AVOID_EXECUTE:

SET EXPLAIN ON;
BEGIN WORK;
DELETE FROM x WHERE y=10;
ROLLBACK WORK;

OR

SET EXPLAIN ON;
OUTPUT TO /dev/null
SELECT * FROM x WHERE y=10;

With AVOID_EXECUTE:

SET EXPLAIN ON;
DELETE /*+ EXPLAIN AVOID_EXECUTE */
FROM x WHERE y=10;

Delete will NOT be performed, but the execution plan will be written


71/149

Types of Directives:
Query Plan

SET EXPLAIN ON;
DELETE /*+ EXPLAIN AVOID_EXECUTE */
FROM x WHERE y=10;

Feature can also be implemented without using the directive as:

SET EXPLAIN ON AVOID_EXECUTE;
DELETE
FROM x WHERE y=10;

Delete will NOT be performed, but the execution plan will be written


72/149

Optimizer Directives: Pros & Cons

Pros:
Force the engine to execute the SQL the way that we want
Sometimes we know better!!
Great for testing different plans

Cons:
Force the engine to execute the SQL the way that we want
Sometimes the engine knows better!!
If new indexes are added, the number of rows changes drastically, or data distributions change, then a better execution plan may be available


73/149

Any Questions?


74/149

Optimization Techniques


75/149

Optimization Techniques

Use Composite Indexes
Use Index Filters
Create indexes for Key-Only scans
Perform indexed reads for sorting
Use temporary tables
Simplify queries by using UNIONs
Avoid sequential scans of large tables
Use Light Scans when possible
Use Hash Joins when joining all rows from multiple tables


76/149

Optimization Techniques (cont.)

Use the CASE/DECODE statements to combine multiple selects
Drop and recreate indexes for large modifications
Use Non Logging Tables
Use OUTER JOINS
Prepare and Execute statements


77/149

Optimization Techniques:
Use Composite Indexes

Composite indexes are ones built on more than one column
The optimizer uses the leading portions of a composite index for filters, join conditions and sorts
A composite index on columns a, b and c will be used for selects involving:
column a
columns a and b
columns a, b and c
It will not be used for selects involving only columns b and/or c since those columns are not at the beginning of the index (i.e. the leading portion)


78/149

Composite Key


79/149

Composite Key

Index (stock_num, manu_code)
Index Pages: stock_num, manu_code

select qty from stock
where stock_num = 190
and manu_code = 10

Table stock : 100,000 rows
stock_num = 190 : 10,000 rows
stock_num = 190 AND manu_code = 10 : 100 rows

Now just approx. 100 reads


80/149


Optimization Techniques:
Use Index Filters

Create indexes on columns that are the most selective.
For example:

SELECT * FROM CUSTOMER
WHERE ACCOUNT BETWEEN 100 and 1000
AND STATUS = 'A'
AND STATE = 'MD'

Which column is the most selective? account, status or state?



81/149

Optimization Techniques:
Use Index Filters

Assume table xyz has an index on begin_idx & end_idx

With the following select:

SELECT * FROM xyz WHERE begin_idx >= 99

AND end_idx


82/149

Optimization Techniques:
Use Index Filters

[Figure: B+ tree with leaf keys 15, 25, 99, 100, 132, 190, 400, 500, 501, 699, 850 and 999.]



83/149

Optimization Techniques:
Use Index Filters

If we can change the query to include an upper boundon begin_idx as follows:

SELECT * FROM xyz WHERE begin_idx >= 99

AND begin_idx


84/149



85/149

Optimization Techniques:
Key-Only Scans

Data for the select list is read from the index key -- No read of the data page is needed
Useful for inner tables of nested-loop joins
Useful for creating a "sub-table" for very wide tables



86/149

Optimization Techniques:
Key-Only Scans

tab1: 1000 rows
tab2: 1000 rows

create index tab1_idx on tab1(a);
create index tab2_idx on tab2(b);

Will NOT give Key Only Scan!

output to /dev/null      (Handy!)
select unique tab2.c
from tab1, tab2
where tab1.a = 1
and tab1.b = tab2.b

Every row in tab1 will join to every row in tab2


87/149

Key-Only Scans

select unique tab2.c
from tab1, tab2
where tab1.a = 1
and tab1.b = tab2.b

1) cbsdba.tab1: INDEX PATH
(1) Index Keys: a
Lower Index Filter: cbsdba.tab1.a = 1

(1) Index Keys: b
Lower Index Filter: cbsdba.tab2.b = cbsdba.tab1.b
NESTED LOOP JOIN


88/149

Key-Only Scans

Index Read (NOT Key Only)

select unique tab2.c from tab1, tab2
where tab1.a = 1 and tab1.b = tab2.b

tab1 Index Pages: a
tab1 Data Pages: a, b, y
tab2 Index Pages: b
tab2 Data Pages: b, c, z


89/149

Key-Only Scans

Index Read (NOT Key Only)

tab1 Index Pages: a
tab1 Data Pages: a, b, y
tab2 Index Pages: b
tab2 Data Pages: b, c, z

1,000 reads from tab1 Index Pages
1,000 "jumps" to tab1 Data Pages
1,000 reads from tab1 Data Pages
For each of these:
1,000 reads from tab2 Index Pages
1,000 "jumps" to tab2 Data Pages
1,000 reads from tab2 Data Pages

That's a lot of reads and a lot of jumps!!

Timing: 50 seconds


90/149

Key-Only Scans

create index tab1_idx on tab1(a);
create index tab2_idx on tab2(b);

Will NOT give Key Only Scan!

Change Indexes

create index tab1_idx on tab1(a,b);
create index tab2_idx on tab2(b,c);

WILL give Key Only Scan!


91/149

Key-Only Scans

select unique tab2.c
from tab1, tab2
where tab1.a = 1
and tab1.b = tab2.b

(1) Index Keys: a b (Key-Only)
Lower Index Filter: cbsdba.tab1.a = 1

(1) Index Keys: b c (Key-Only)
Lower Index Filter: cbsdba.tab2.b = cbsdba.tab1.b
NESTED LOOP JOIN


92/149

Key Only Scans

Index Read (Key Only)

tab1 Index Pages: a, b
tab2 Index Pages: b, c
tab1 Data Pages: a, b, y
tab2 Data Pages: b, c, z


93/149

Key Only Scans

Index Read (Key Only)

tab1 Index Pages: a, b
tab1 Data Pages: a, b, y
tab2 Index Pages: b, c
tab2 Data Pages: b, c, z

1,000 reads from tab1 Index Pages
For each of these:
1,000 reads from tab2 Index Pages

That's a lot fewer reads and no "jumps"!!

Timing: 35 seconds



94/149

Optimization Techniques:
Indexed Reads for Sorting

Indexed reads cause rows to be read in the order of the indexed columns
Higher priority is given to indexes on columns used as filters
Reasons why an index will not be used to perform a sort:
Columns in the sort criteria are not in the index
Columns in the sort criteria are in a different order than the index
Columns in the sort criteria are from different tables



95/149

Indexed Reads for Sorting

Assume the table some_table has a composite index on columns x, y and z

select * from some_table
where x = ?
and y = ?
order by z

select * from some_table
where x = ?
and y = ?
order by x, y, z

Note: As of Informix Dynamic Server v7.31 this is done automatically by the optimizer



96/149

Optimization Techniques:
Temporary Tables

Useful for batch reporting
Avoid selecting a subset of data repetitively from a larger table
Create summary information that can be joined to other tables

Disadvantage
The data in the temporary table is a copy of the real data and therefore is not changed if the original data is modified.



97/149

Temporary Tables

The ctn table contains 300,000 records and very few records have a status of "Q"

Repeated query:

select sum(b.sz_qty)
from ctn a, ctn_detail b
where a.carton_stat = "Q"
and a.ctn_id = b.ctn_id
and b.sku = ?

Using a temporary table:

select b.sku, sum(b.sz_qty) tot_qty
from ctn a, ctn_detail b
where a.carton_stat = "Q"
and a.ctn_id = b.ctn_id
group by b.sku
into temp tmp1 with no log;

create index i1 on tmp1( sku )

select tot_qty
from tmp1
where sku = ?



98/149

Optimization Techniques:
Using UNIONs

ORs can cause the optimizer to not use indexes
Complex where conditions can cause the optimizer to use the wrong index
Note: Informix Dynamic Server v7.3 allows UNIONs in views



99/149

Using UNIONs

The log table has an index on date_time and a composite index on trans_id, sku and date_time

select sum(qty)
from log
where sku = ?
and ( trans_id = 1 or
trans_id = 2 or
trans_id = 3 or
trans_id = 4)
and date_time > ?

Uses the index on date_time

select sum(qty)
from log
where trans_id = 1
and sku = ?
and date_time > ?
UNION
. . .
select sum(qty)
from log
where trans_id = 4
and sku = ?
and date_time > ?

Uses the composite index


100/149

Eliminate OR Conditions

select sum(qty)
from log
where sku = ?
and ( trans_id = 1 or
trans_id = 2 or
trans_id = 3 or
trans_id = 4)
and date_time > ?

Uses the index on date_time

select sum(qty)
from log
where sku = ?
and trans_id in ( 1, 2, 3, 4)
and date_time > ?

Uses the composite index

Alternative to using UNIONs
Note: Earlier versions of Informix still may not use the composite index


101/149



102/149

Use Light Scans
What are they?

Very efficient way to sequentially scan a table
Go straight to disk, avoid the buffer pool

Without Light Scans: Database Engine - Buffers (LRUs) - Disk
With Light Scans: Database Engine - Disk



103/149

Use Light Scans

How do you get them?
Only used when sequentially scanning a table
The table is bigger than the buffer pool
PDQ must be on (SET PDQPRIORITY)
Dirty read isolation (SET ISOLATION TO DIRTY READ) or no logging
Monitor using onstat -g lsc



104/149

Optimization Techniques:
Use Hash Joins

Good to use when joining a large number of rows from multiple tables
Typical join is NESTED LOOP, costly to do index scan over and over
Builds hash table in memory for one table, scans second and hashes into memory
PDQ must be turned on
DS_TOTAL_MEMORY should be set high



105/149

Optimization Techniques:
Use Hash Joins & Light Scans

SELECT H.JRNL_ID, L.ACCOUNT, L.DEPTID, SUM(AMT)
FROM JRNL_HDR H, JRNL_LN L
WHERE H.JRNL_ID = L.JRNL_ID
AND H.FISCAL_YEAR = 2001
AND H.JRNL_STATUS = 'P'
GROUP BY H.JRNL_ID, L.ACCOUNT, L.DEPTID

Two tables, 4 years of data evenly distributed:
JRNL_HDR 1,000,000 rows
JRNL_LN 10,000,000 rows

This will join 250,000 header records with 2,500,000 line records.
With a nested loop join, the database will do an index read into the line table 250,000 times.



106/149

Use Hash Joins & Light Scans

SET PDQPRIORITY 50;
SET ISOLATION TO DIRTY READ;
SELECT --+ FULL( H ) FULL( L )
H.JRNL_ID, L.ACCOUNT, L.DEPTID, SUM(AMT)
FROM JRNL_HDR H, JRNL_LN L
WHERE H.JRNL_ID = L.JRNL_ID
AND H.FISCAL_YEAR = 2001
AND H.JRNL_STATUS = 'P'
GROUP BY H.JRNL_ID, L.ACCOUNT, L.DEPTID

SET ISOLATION TO DIRTY READ: Allows Light Scan
FULL directives: Forces Sequential Scan

This will read the 10 million line records and put them in a hash table, then the header table will be read from and the hash table will be used to do the join.

A better option might be to put an ordered directive and change the order of the from clause so the 250,000 header records are put in the hash table. It depends on the memory available to PDQ.

This is more efficient than a NESTED LOOP join.
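
The build-and-probe idea behind a hash join can be sketched in a few lines of Python. This is a toy in-memory model and not how the engine implements it: the function name, the row dictionaries and the sample values are all made up for illustration. It builds the hash table on the smaller input (the headers), which matches the "better option" described above.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key):
    """Build a hash table on one input, then probe it with the other."""
    table = defaultdict(list)
    for row in build_rows:          # build phase: one pass over the small input
        table[row[key]].append(row)
    for row in probe_rows:          # probe phase: one pass over the large input
        for match in table.get(row[key], ()):
            yield {**match, **row}


# Hypothetical sample rows standing in for JRNL_HDR and JRNL_LN.
headers = [
    {"jrnl_id": 1, "fiscal_year": 2001},
    {"jrnl_id": 2, "fiscal_year": 2001},
]
lines = [
    {"jrnl_id": 1, "account": "1000", "amt": 10},
    {"jrnl_id": 1, "account": "2000", "amt": 5},
    {"jrnl_id": 3, "account": "1000", "amt": 7},
]

joined = list(hash_join(headers, lines, "jrnl_id"))
print(len(joined), "joined rows, total amt", sum(r["amt"] for r in joined))
```

Each input is read once, instead of one index lookup per outer row as in a nested loop join.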



107/149

Hash Joins

tab1: 1000 rows
tab2: 1000 rows

output to /dev/null
select unique tab2.c
from tab1, tab2
where tab1.a = 1
and tab1.b = tab2.b

Every row in tab1 joins to every row in tab2

Remember this example to demonstrate Key Only Scans?
Let's try the same thing with a Hash Join



108/149

Hash Joins

select unique tab2.c
from tab1, tab2
where tab1.a = 1
and tab1.b = tab2.b

Force Full Table Scans:

select /*+FULL(tab1) FULL(tab2)*/
unique tab2.c
from tab1, tab2
where tab1.a = 1
and tab1.b = tab2.b


109/149

Optimization Techniques:
CASE/DECODE


110/149

CASE/DECODE

CASE Syntax:
CASE
WHEN condition THEN expr
WHEN condition THEN expr
ELSE expr
END

DECODE Syntax:
DECODE( expr, when_expr, then_expr, ..., else_expr )


111/149

CASE/DECODE

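2 SQL Statements:

update customer
set preferred = 'Y'
where stat = 'A'

update customer
set preferred = 'N'
where stat <> 'A'

OR

1 SQL Statement:

update customer
set preferred = DECODE( stat, 'A', 'Y', 'N' )

update customer
set preferred = case when stat = 'A' then 'Y' else 'N' end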
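
The single-statement form can be tried out with Python's built-in sqlite3 module. This is a sketch under a loud assumption: SQLite stands in for Informix here, so only the CASE form is shown (SQLite has no DECODE), and the table contents are made up.

```python
import sqlite3

# SQLite is used only as a stand-in engine for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cnum INTEGER, stat TEXT, preferred TEXT)")
conn.executemany(
    "INSERT INTO customer (cnum, stat) VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "A")],
)

# One pass over the table instead of two separate UPDATE statements.
conn.execute(
    "UPDATE customer "
    "SET preferred = CASE WHEN stat = 'A' THEN 'Y' ELSE 'N' END"
)

rows = conn.execute(
    "SELECT cnum, preferred FROM customer ORDER BY cnum"
).fetchall()
print(rows)
```

One behavioural difference worth checking before swapping forms: the ELSE branch also assigns 'N' to rows where stat is NULL, which the two-statement version with stat <> 'A' would leave untouched.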


112/149

Optimization Techniques:
Indexes on Function


113/149

Indexes on Function

Dilemma:
LNAME in the customer table is mixed case
Users want to enter "smith" and find all occurrences of "Smith" regardless of case (e.g., "SMITH", "Smith" or "SmiTH")

You can write a query like:

SELECT *
FROM customer
WHERE UPPER( lname ) = "SMITH"

Unfortunately this performs a sequential scan of the table.


114/149

Indexes on Function

Solution:
Version 9 allows indexes to be built on functions
Functions must be what is called "NONVARIANT"
Informix built-in functions, such as UPPER, are variant
Create your own function and use it


115/149

Indexes on Function

First create the new function:

CREATE FUNCTION UPSHIFT( in_str VARCHAR )
RETURNING VARCHAR
WITH( NOT VARIANT );
DEFINE out_str VARCHAR;
LET out_str = UPPER(in_str);
RETURN( out_str );
END FUNCTION

Then create the index on the function:

CREATE INDEX I_CUST1 ON CUSTOMER( UPSHIFT( lname ))
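
The effect of an index on a function can be observed with Python's built-in sqlite3 module. This is only an analogy: SQLite stands in for Informix, and unlike the Informix behaviour described above, SQLite accepts its built-in upper() directly in an index, so no wrapper function is needed. The table contents are made up.

```python
import sqlite3

# SQLite is used only as a stand-in engine for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cnum INTEGER, lname TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "SMITH"), (2, "Smith"), (3, "SmiTH"), (4, "Jones")],
)

# Index on the expression, analogous to the index on UPSHIFT( lname ).
conn.execute("CREATE INDEX i_cust1 ON customer (upper(lname))")

query = "SELECT cnum FROM customer WHERE upper(lname) = 'SMITH'"
matches = sorted(row[0] for row in conn.execute(query))

# The last column of each EXPLAIN QUERY PLAN row describes the access path.
plan = " ".join(str(row[-1]) for row in conn.execute("EXPLAIN QUERY PLAN " + query))
print(matches)
print(plan)
```

The plan text names the index rather than a full table scan, which is the behaviour the function index is meant to achieve.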


116/149



117/149

Optimization Techniques:
Drop and Recreate Indexes

Useful for modifications to > 25% of the rows
Eliminates overhead of maintaining indexes during modification
Indexes are recreated more efficiently
Indexes can deteriorate over time
Use PDQPRIORITY for faster creation

Disadvantage
Must have exclusive access to the table before doing this!
Locking the table may not be sufficient!
3-tier architecture can make this an even bigger pain!


118/149

Non Logging Tables

In XPS, and introduced in IDS 7.31

Inserts, Updates and Deletes against rows in a table are logged
For large operations this could produce significant overhead
Create the table as RAW or change it to RAW for the duration of the operation, and the operations will not be logged


119/149

Optimization Techniques:
Use Outer Joins


120/149

Use Outer Joins

Main SELECT:

SELECT cnum
FROM customer
WHERE status = 'A'

FOREACH
SELECT onum
FROM ORDERS o
WHERE o.cnum = cnum
IF ( STATUS = NOTFOUND ) THEN
...
END IF
END FOREACH

SELECT repeated for each row found in Main Select
Ouch!


121/149

Use Outer Joins

Before:

SELECT cnum
FROM customer
WHERE status = 'A'

FOREACH
SELECT onum
FROM ORDERS o
WHERE o.cnum = cnum
IF ( STATUS = NOTFOUND ) THEN
...
END IF
END FOREACH

After:

SELECT cnum, onum
FROM customer c, OUTER orders o
WHERE status = 'A'
AND c.cnum = o.cnum

FOREACH
IF ( onum IS NULL ) THEN
...
END IF
END FOREACH

ONLY 1 SELECT
Brilliant!


122/149

Use Outer Joins

SELECT cnum, NVL( onum, 0 )
FROM customer c, OUTER orders o
WHERE status = 'A'
AND c.cnum = o.cnum

FOREACH
IF ( onum = 0 ) THEN
...
END IF
END FOREACH

Can now check for zero instead of NULL
Use NVL to replace NULLs with something else
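
The single-SELECT approach can be tried with Python's built-in sqlite3 module. Two substitutions are assumptions made for the stand-in engine: SQLite uses the ANSI LEFT OUTER JOIN syntax rather than Informix's OUTER keyword, and IFNULL plays the role of NVL. The sample rows are made up.

```python
import sqlite3

# SQLite is used only as a stand-in engine for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cnum INTEGER, status TEXT);
    CREATE TABLE orders (onum INTEGER, cnum INTEGER);
    INSERT INTO customer VALUES (1, 'A'), (2, 'A'), (3, 'X');
    INSERT INTO orders VALUES (100, 1);
    """
)

# One SELECT returns every active customer, with 0 where no order exists.
rows = conn.execute(
    """
    SELECT c.cnum, IFNULL(o.onum, 0)
    FROM customer c LEFT OUTER JOIN orders o ON c.cnum = o.cnum
    WHERE c.status = 'A'
    ORDER BY c.cnum
    """
).fetchall()

customers_without_orders = [cnum for cnum, onum in rows if onum == 0]
print(rows, customers_without_orders)
```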


123/149

Optimization Techniques:Prepare and Execute


124/149

Prepare and Execute

FOR x = 1 to 1000
INSERT INTO some_table VALUES ( x, 10 )
END FOR

Syntax Check
Permission Check
Optimize
Execute at last!!
Do it ALL again!!!

PREPARE p1 FROM "INSERT INTO some_table VALUES ( ?, 10 )"

FOR x = 1 to 1000
EXECUTE p1 USING x
END FOR

Syntax Check, Permission Check, Optimize: Once Only!
Execute: each time through the loop
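
The same prepare-once, execute-many pattern exists in most client APIs. The sketch below uses Python's built-in sqlite3 module as a stand-in for an Informix client; the parameterized statement text is reused for every row, so only the bound value changes between executions.

```python
import sqlite3

# SQLite is used only as a stand-in engine for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (x INTEGER, y INTEGER)")

# The statement with a ? placeholder plays the role of the prepared p1.
insert_sql = "INSERT INTO some_table VALUES (?, 10)"
conn.executemany(insert_sql, ((x,) for x in range(1, 1001)))

count, total = conn.execute(
    "SELECT COUNT(*), SUM(x) FROM some_table"
).fetchone()
print(count, total)
```

Building a new SQL string for each value of x would instead force the statement to be parsed again on every iteration.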



125/149

Optimization Techniques:
PDQ

Really should be using PDQ for batch processes and reporting
Enable for index builds (also Light Scans & Hash Joins)
Set DS_TOTAL_MEMORY as high as you can spare; set in config file or with onmode -M
Use MAX_PDQPRIORITY to set the maximum priority that any single session is permitted; set in config file or with onmode -D
Use SET PDQPRIORITY n to set the PDQ for a session, or set in the environment (e.g. export PDQPRIORITY=80)


126/149

PDQ

Monitor PDQ with onstat -g mgm
onstat -u : Will see multiple threads with the same session ID
onstat -g ses : Will see #RSAM Threads > 1
See Informix Manuals for more info


127/149

Xtree


128/149

Xtree

Xwindows interface
Only works with Xwindows terminal
Need the Xwindows libraries setup
Provides a window into an executing query
Useful for checking the speed & progress of a query without waiting until it completes; great for testing different query plans!


129/149

This part of the window is called the display window and shows the information about what is happening in the query. Each of these boxes (nodes) designates an operation of the query: sort, group, filter, scan.

This number represents the number of rows that have been passed to the node above.

This number represents the number of rows examined per second. The little speedometer is a graphical representation of this number. The number is occasionally negative, which could be because it is a 2-byte integer and when it gets too high (i.e., too fast) it displays as a negative.

This part of the window displays the entire query tree. If the tree is too big for the display window (to the right), a black box will appear which can be dragged to scroll to different parts of the tree which are displayed in the display window.


130/149

Example


131/149

SET EXPLAIN ON;

SELECT A.DSCNT_DUE_DT, A.SCHEDULED_PAY_DT,
       A.PYMNT_GROSS_AMT, B.GROSS_AMT_BSE, A.DSCNT_PAY_AMT
FROM PS_PYMNT_VCHR_XREF A,
     PS_VOUCHER B,
     PS_VENDOR C,
     PS_VENDOR_ADDR D,
     PS_VENDOR_PAY E
WHERE A.BUSINESS_UNIT = B.BUSINESS_UNIT
  AND A.VOUCHER_ID = B.VOUCHER_ID
  AND A.REMIT_SETID = C.SETID
  AND A.REMIT_VENDOR = C.VENDOR_ID
  AND A.REMIT_SETID = D.SETID
  AND A.REMIT_VENDOR = D.VENDOR_ID
  AND A.REMIT_ADDR_SEQ_NUM = D.ADDRESS_SEQ_NUM
  AND D.EFF_STATUS = 'A'
  AND . . .

Need the explain plan to interpret the xtree display


132/149


133/149

Correlated Sub-Queries

Correlated Sub-Queries: What are they?


134/149

select c.*
from customers c, orders o
where c.custid = o.custid
  and o.ord_date = TODAY

select c.*
from customers c
where c.custid in (
    select custid
    from orders
    where ord_date = TODAY )

These are examples of non-correlated sub-queries.
The performance of these two should be the same.

Correlated Sub-Queries: What are they?


135/149

Not Correlated:

select c.*
from customers c, orders o
where c.custid = o.custid
  and o.stat = 'OPEN'

select c.*
from customers c
where custid in (
    select custid
    from orders o
    where o.stat = 'OPEN' )

Correlated:

select c.*
from customers c
where exists (
    select 'X'
    from orders o
    where o.custid = c.custid
      and o.stat = 'OPEN' )

Outer query referenced in Inner query
Inner query must be repeated for each row returned by the Outer query

Correlated Sub-Queries: What's wrong with them?


136/149

Consider the statement:

update customers set stat = 'A'
where exists (
    select 'X'
    from orders o
    where o.custid = customers.custid
      and o.cmpny = customers.cmpny
      and o.stat = 'OPEN' )

The sub-query, on orders, is executed for every row retrieved from customers.

If the customers table had 100,000 rows, the sub-query would get executed 100,000 times.

If orders only had 20 rows with stat = 'OPEN', the database would be doing a lot of extra work.

Correlated Sub-queries


137/149

update customers set stat = 'A'
where exists (
    select 'X'
    from orders o
    where o.custid = customers.custid
      and o.cmpny = customers.cmpny
      and o.stat = 'OPEN' )

update customers set stat = 'A'
where exists (
    select 'X'
    from orders o
    where o.custid = customers.custid
      and o.cmpny = customers.cmpny
      and o.stat = 'OPEN' )
  and custid in (
    select custid
    from orders o
    where o.stat = 'OPEN' )

The original CSQ is left since it was joining on more than one column

Add this condition (the "custid in" clause) to reduce the number of times the subquery is executed

If orders has only 20 rows meeting the filter, the second version of the update runs much faster, assuming that customers has an index on the column custid.
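
A quick way to convince yourself that the extra non-correlated condition does not change which rows are updated is to run both versions against the same data. The sketch below does that with Python's built-in sqlite3 module as a stand-in for Informix; the column types and sample rows are invented for the test, and it checks only that the results match, not the speed-up.

```python
import sqlite3

ORIGINAL = """
    UPDATE customers SET stat = 'A'
    WHERE EXISTS (
        SELECT 'X' FROM orders o
        WHERE o.custid = customers.custid
          AND o.cmpny = customers.cmpny
          AND o.stat = 'OPEN')
"""

# Same statement plus the non-correlated IN filter from the slide.
REWRITTEN = ORIGINAL + """
      AND customers.custid IN (
        SELECT o.custid FROM orders o WHERE o.stat = 'OPEN')
"""

def run(update_sql):
    """Apply one UPDATE to a fresh copy of the sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (custid INTEGER, cmpny TEXT, stat TEXT);
        CREATE TABLE orders    (custid INTEGER, cmpny TEXT, stat TEXT);
        CREATE INDEX cust_ix ON customers (custid);
        INSERT INTO customers VALUES (1,'X','N'), (2,'X','N'), (3,'Y','N');
        INSERT INTO orders VALUES
            (1,'X','OPEN'), (2,'Z','OPEN'), (3,'Y','CLOSED');
    """)
    conn.execute(update_sql)
    return conn.execute(
        "SELECT custid, stat FROM customers ORDER BY custid"
    ).fetchall()

before = run(ORIGINAL)
after = run(REWRITTEN)
print(before == after, after)
```

Customer 2 has an open order but for a different company, so it passes the new IN filter and is then rejected by the original correlated subquery. That is why the correlated subquery has to stay in the statement.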

Correlated Sub-queries: Normal CSQ


138/149

QUERY:
update orders set ship_charge = 0
where exists (
    select "X"
    from customer c
    where c.customer_num = orders.customer_num
      and c.state = "MD" )

1) informix.orders: SEQUENTIAL SCAN

    Filters: EXISTS

    Subquery:
    ---------
    Estimated Cost: 1
    Estimated # of Rows Returned: 1

    1) informix.c: INDEX PATH

        Filters: informix.c.state = 'MD'

        (1) Index Keys: customer_num
            Lower Index Filter: c.customer_num = orders.customer_num

Here's the join between the inner and outer tables

Yuk!

Correlated Sub-queries: Rewritten CSQ


139/149

QUERY:
update orders set ship_charge = 0
where customer_num in (
    select customer_num
    from customer c
    where c.state = "MD" )

1) informix.orders: INDEX PATH

    (1) Index Keys: customer_num
        Lower Index Filter: orders.customer_num = ANY

    Subquery:
    ---------
    1) informix.c: SEQUENTIAL SCAN

        Filters: informix.c.state = 'MD'

EXISTS has been changed to an IN

Subquery is no longer CORRELATED

Look, No Join!!

Yippee!!

Correlated Sub-queries: CSQ Flattening


140/149

QUERY:
update orders set ship_charge = 0
where exists (
    select "X"
    from customer c
    where c.customer_num = orders.customer_num
      and c.state = "MD" )

1) informix.c: SEQUENTIAL SCAN

    Filters: informix.c.state = 'MD'

2) informix.orders: INDEX PATH

    (1) Index Keys: customer_num
        Lower Index Filter: orders.customer_num = c.customer_num

NESTED LOOP JOIN

Note: An index could be created on state to avoid the sequential scan.

Where did the subquery go?!

It was turned into a regular Nested Loop Join. AUTOMATICALLY!!

Correlated Sub-queries: CSQ Flattening


141/149

As of 9.3, optimizer directives can be used to indicate whether Subquery Flattening occurs

/*+ USE_SUBQF */

/*+ AVOID_SUBQF */

Does this indicate that Subquery Flattening is not necessarily a good thing ????

Correlated Sub-queries: Predicate Promotion in CSQs


142/149

Predicate Promotion in CSQs

Correlated Subquery

select * from ps_jrnl_ln
where business_unit = 'ABC'
  and process_instance = 5960
  and not exists (
    select "X"
    from PS_SP_BU_GL_NONVW P
    where P.business_unit = ps_jrnl_ln.business_unit )

But, we know that we are limiting rows in the outer query by the filter: business_unit = 'ABC'

Then why don't we just apply the same filter in the subquery?

What a great idea.
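
The "great idea" can be checked by hand: replace the correlated reference in the subquery with the constant that the outer filter already guarantees, then confirm that both forms return the same rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for Informix. The tables jrnl_ln and bu_nonvw are simplified, hypothetical substitutes for ps_jrnl_ln and PS_SP_BU_GL_NONVW, and the line column and sample rows are invented.

```python
import sqlite3

CORRELATED = """
    SELECT line FROM jrnl_ln
    WHERE business_unit = 'ABC' AND process_instance = 5960
      AND NOT EXISTS (SELECT 'X' FROM bu_nonvw p
                      WHERE p.business_unit = jrnl_ln.business_unit)
    ORDER BY line
"""

# Outer filter promoted into the subquery: it no longer references the
# outer table, so it becomes a constant (non-correlated) subquery.
PROMOTED = """
    SELECT line FROM jrnl_ln
    WHERE business_unit = 'ABC' AND process_instance = 5960
      AND NOT EXISTS (SELECT 'X' FROM bu_nonvw p
                      WHERE p.business_unit = 'ABC')
    ORDER BY line
"""

def lines(query, view_units):
    """Run a query against sample data; view_units fills bu_nonvw."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE jrnl_ln (business_unit TEXT,
                              process_instance INTEGER, line INTEGER);
        CREATE TABLE bu_nonvw (business_unit TEXT);
        INSERT INTO jrnl_ln VALUES ('ABC', 5960, 1), ('ABC', 5960, 2),
                                   ('XYZ', 5960, 3), ('ABC', 1, 4);
    """)
    conn.executemany("INSERT INTO bu_nonvw VALUES (?)",
                     [(u,) for u in view_units])
    return [row[0] for row in conn.execute(query)]

print(lines(CORRELATED, ["XYZ"]), lines(PROMOTED, ["XYZ"]))
print(lines(CORRELATED, ["ABC"]), lines(PROMOTED, ["ABC"]))
```

The two forms agree because every row that survives the outer filter has business_unit = 'ABC', so the correlated comparison can only ever be a comparison against 'ABC'.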


143/149



144/149

Predicate Promotion in CSQs

QUERY:
select * from ps_jrnl_ln
where business_unit = 'ABC'
  and process_instance = 5960
  and not exists (
    select "X"
    from PS_SP_BU_GL_NONVW P
    where P.business_unit = ps_jrnl_ln.business_unit )

Let's take a look at the query plan


Constant Subquery Optimization


145/149

1) ps_jrnl_ln: INDEX PATH

    Filters: NOT EXISTS

    (1) Index Keys: process_instance business_unit
        Lower Index Filter: (ps_jrnl_ln.business_unit = 'ABC' AND ps_jrnl_ln.process_instance = 5960 )

    Subquery:
    ---------
    1) ps_bus_unit_tbl_gl: INDEX PATH

        (1) Index Keys: business_unit (Key-Only)
            Lower Index Filter: ps_bus_unit_tbl_gl.business_unit = 'ABC'

    2) ps_bus_unit_tbl_fs: INDEX PATH

        (1) Index Keys: business_unit descr (Key-Only)
            Lower Index Filter: ps_bus_unit_tbl_fs.business_unit = ps_bus_unit_tbl_gl.business_unit

    NESTED LOOP JOIN

Filter condition of outer query has been applied here (the Lower Index Filter ps_bus_unit_tbl_gl.business_unit = 'ABC' in the subquery)

Constant Subquery Optimization
When this filter is checked for the first row, the query can stop immediately, if:
    it's a NOT EXISTS and a row is found
    it's an EXISTS and no rows are found

Correlated Sub-Queries: First Row/Semi-Join


146/149

QUERY:
UPDATE PS_JRNL_LN SET jrnl_line_status = 3
WHERE BUSINESS_UNIT = 'ABC'
  AND PROCESS_INSTANCE = 5960
  AND EXISTS (
    SELECT 'X'
    FROM PS_COMBO_SEL_06 A
    WHERE A.SETID = 'ABC'
      AND A.COMBINATION = 'OVERHEAD'
      AND A.CHARTFIELD = 'ACCOUNT'
      AND PS_JRNL_LN.ACCOUNT BETWEEN A.RANGE_FROM_06 AND A.RANGE_TO_06 )

Let's take a look at the query plan

Correlated Sub-Queries: First Row/Semi-Join


147/149

1) sysadm.ps_jrnl_ln: INDEX PATH

    (1) Index Keys: process_instance business_unit
        Lower Index Filter: (ps_jrnl_ln.business_unit = 'ABC' AND ps_jrnl_ln.process_instance = 5960 )

2) informix.a: INDEX PATH ( First Row )

    Filters: (informix.a.range_to_06 >= ps_jrnl_ln.account AND a.tree_effdt = )

    (1) Index Keys: setid chartfield combination range_from_06 range_to_06
        Lower Index Filter: (a.setid = 'ABC' AND (a.combination = 'OVERHEAD' AND a.chartfield = 'ACCOUNT') )
        Upper Index Filter: a.range_from_06


148/149

QUERY:
update orders set backlog = "Y"
where exists (
    select "X"
    from items
    where orders.order_num = items.order_num
      and stock_num = 6
      and manu_code = "SMT" )

1) informix.items: INDEX PATH ( Skip Duplicate )

    Filters: (items.stock_num = 6 AND items.manu_code = 'SMT' )

    (1) Index Keys: order_num

2) informix.orders: INDEX PATH

    (1) Index Keys: order_num
        Lower Index Filter: orders.order_num = items.order_num

NESTED LOOP JOIN

Will get unique values from the first table before joining to the second table, so preventing multiple updates with the same value


149/149

Any Questions?
