+ All Categories
Home > Documents > DB2 Programming Standards and Guidelines

DB2 Programming Standards and Guidelines

Date post: 24-May-2015
Category:
Upload: tess98
View: 194 times
Download: 1 times
Share this document with a friend
Popular Tags:
33
~~draft~~ Created: June 23, 1993 Last Modified: July19, 1993 DB2 Programming Standards and Guidelines SQL (Structured Query Language) is very flexible and can formulate a given database request in a number of different ways. DB2 in turn may access the data needed to satisfy any given request in any of several ways. Theoretically the DB2 optimizer should be able to choose the most efficient access path, no matter which equivalent SQL statement is used. DB2’s optimizer has not yet reached this ideal. In most cases, the optimizer makes the best choice no matter how the request is stated. But in a few situations, the programmers’ decisions in formulating statements can make a difference in performance. This situation will be eliminated over time as IBM software developers improve the optimizer. For now, programmers can improve the performance of their own programs by following a few guidelines for statement formulations. The following tables will be used to show examples of various SQL statements: SUPPLIER (S) TABLE SN SNAME STATUS CITY S1 S2 S3 S4 S5 SMITH JONES BLAKE CLARK ADAMS 20 10 30 20 30 LONDON PARIS PARIS LONDON ATHENS
Transcript
Page 1: DB2 Programming Standards and Guidelines

~~draft~~

Created: June 23, 1993Last Modified: July19, 1993

DB2 Programming Standards and Guidelines

SQL (Structured Query Language) is very flexible and can formulate a given database request in a number of different ways. DB2 in turn may access the data needed to satisfy any given request in any of several ways. Theoretically the DB2 optimizer should be able to choose the most efficient access path, no matter which equivalent SQL statement is used. DB2’s optimizer has not yet reached this ideal. In most cases, the optimizer makes the best choice no matter how the request is stated. But in a few situations, the programmers’ decisions in formulating statements can make a difference in performance. This situation will be eliminated over time as IBM software developers improve the optimizer. For now, programmers can improve the performance of their own programs by following a few guidelines for statement formulations.

The following tables will be used to show examples of various SQL statements:

SUPPLIER (S) TABLE SN SNAME STATUS CITYS1S2S3S4S5

SMITHJONESBLAKECLARKADAMS

2010302030

LONDONPARISPARISLONDONATHENS

JOB (J) TABLEJN JNAME CITYJ1J2J3J4J5J7

SORTERPUNCHREADERCONSOLECOLLATORTAPE

PARISROMEATHENSATHENSLONDONLONDON

PART (P) TABLEPN PNAME COLOR WEIGHT CITYP1P2P3

NUTBOLTSCREW

REDGREENBLUE

121717

LONDONPARISROME

Page 2: DB2 Programming Standards and Guidelines

P4P5P6

SCREWCAMCOG

REDBLUERED

141219

LONDONPARISLONDON

SUPPLIER/PART/JOB (SPJ) TABLESN PN JN QTYS1S1S2S2S2S2S2S2S2S2S3S3S4S4S5S5S5S5S5S5S5S5S5S5

P1P1P3P3P3P3P3P3P3P5P3P4P6P6P2P2P5P5P6P1P3P4P5P6

J1J4J1J2J3J4J5J6J7J2J1J2J3J7J2J4J5J7J2J4J4J4J4J4

200700400200200500600400800100200500300300200100500100200100200800400500

EFFICIENT SELECTS

The purpose of a SELECT statement is to retrieve information from the DB2 tables. The result of this operation will always be a TABLE (consisting of columns and rows selected aka RESULT SET).

- Maximum number of columns in a SELECT list is 750- Maximum number of tables in the FROM clause is 15- Maximum number of predicates is 300

Page 3: DB2 Programming Standards and Guidelines

- Multiple predicates (WHERE clauses) can be specified using the AND and OR operators

General form of the SELECT statement:

SELECT the data you want (column names)FROM some source (table name (s))WHERE conditions which are to be met (if any)GROUP BY group data (column name (s)) into one rowHAVING a condition each group of data must satisfyORDER BY rows will be in this order (column name (s))

WHERE

When selecting columns from tables, do not select columns that are referenced in the SQL statement’s WHERE clause. Select only those columns whose values are unknown to the program. Do not select columns whose values are already known to the host language program because this results in unnecessary I/O, as in the following query:

EXEC SQLSELECT SN, SNAMEFROM SWHERE SN = :HOST-VARIABLE-SNEND-EXEC

Write the above statement as follows:

EXEC SQLSELECT SNAMEFROM SWHERE SN = :HOST-VARIABLE-SNEND-EXEC

A SELECT * will return all columns from a table whether or not they are needed for program processing. This statement will result in excessive I/O overhead. Do not use a SELECT * as in the following example:

EXEC SQLSELECT * FROM SEND-EXEC

Code all desired columns explicitly. If a table layout ever changes, for whatever reason, the program using a SELECT * has to be re-compiled and re-bound.

Page 4: DB2 Programming Standards and Guidelines

When coding a SQL SELECT statement in a host language program (COBOL), always code the following statement: EXEC SQL before the SQL statement and code END-EXEC after the SQL statement:

EXEC SQLSELECT SNAMEFROM SWHERE SN = :HOST-VARIABLE-SNEND-EXEC

Host variables, as in the above statement, are working storage fields in host-language programs that are used to accept data from SQL statements, provide input to SQL update or insert statements, or are used in selecting data from tables. Always code host variables in the working-storage section of your program with the same data type and length as the columns of the table in the SQL statement(s) that they pertain to.

CURSORS

Host language programs like COBOL are not prepared to accept more than one row at a time when retrieving columns from a DB2 table. A cursor is used to retrieve multiple rows in succession from a SELECT statement. (Since it is a declarative rather than executable statement, it can be placed in the data or procedure division of a COBOL program, but must physically precede the OPEN.) The declaration takes the form DECLARE SCURSOR CURSOR FOR, in which the cursor’s name follows the word DECLARE. Cursor processing logic must be used in host language programs when you will be retrieving more than one row which satisfies the WHERE clause of an SQL statement. Here is a complete example declaration:

EXEC SQLDECLARE SCURSOR CURSOR FORSELECT SN, SNAME, STATUS, CITYFROM SWHERE STATUS > :OLD-STATUSAND SN > : OLD-SNEND-EXEC

The program calls on the cursor for processing through two statements: one for opening the cursor, and the other for fetching its rows. The OPEN statement readies the SELECT for execution by creating a cursor set. DB2 attempts to retrieve one qualifying row at a time to pass to the program by the use of the FETCH statement. The OPEN statement simply uses the key word OPEN followed by the cursor’s name:

Page 5: DB2 Programming Standards and Guidelines

EXEC SQLOPEN SCURSOREND-EXEC.

FETCH

The FETCH statement identifies the cursor to be used and employs the INTO clause to indicate the host variables that are to receive the values from each row:

EXEC SQLFETCH SCURSOR INTO :SN, :SNAME, :STATUS, :CITYEND-EXEC

As with a SELECT INTO statement that is used when a single row is returned, the variables used in a FETCH INTO statement must be of compatible data type and length with the columns that supply their values.

- Host languages are not prepared to accept more than one row at a time

- Declare cursor must come before the OPEN

- A SELECT statement outperforms the cursor select for retrieving a single row

- OPEN readies the SELECT for execution

- DB2 will try to fetch rows one at a time as requested

- If now rows qualify from a cursor- Open results in SQLCODE = 0- First Fetch results in SQLCODE = 100

- Multiple cursors can be opened on one or more tables

- Always check the SQLCODE after opening and closing a cursor. If the SQLCODE after the execution of an OPEN CURSOR or CLOSE CURSOR is less than zero, abend processing should occur

The following example illustrates a processing loop, which is how cursors are most often used:

EXEC SQL DECLARE SCURSOR

Page 6: DB2 Programming Standards and Guidelines

SELECT SN, SNAME, STATUS, CITY FROM S WHERE STATUS = :NEW-STATUS

END-EXEC

OPEN-SCURSOR. EXEC SQL

OPEN SCURSOR END-EXEC. PERFORM GET-ROW UNTIL SQLCODE = 100

GET-ROW. EXEC SQL

FETCH SCURSOR INTO :SN, :SNAME, :STATUS, :CITY END-EXEC.

IF SQLCODE = 100 MOVE SAVE-STATUS TO NEW-STATUS EXEC SQL

CLOSE SCURSOR END-EXEC.

When checking a DB2 table for straight existence checking (no data retrieval is necessary for program processing), it is more efficient to use a select literal SQL statement versus a select count (*), or a statement that actually retrieves data from the DB2 table. For example, if you wanted to verify the existence of a participant social security number, instead of using a standard select statement:

EXEC SQL SELECT SN INTO :SN FROM S WHERE SN = :HOST-VARIABLE-SN

END-EXEC

use the following:

EXEC SQLSELECT ‘SUPPLIER VERIFICATION’

INTO :SUPPLIER-LITERAL FROM S WHERE SN = :HOST-VARIABLE-SN

END-EXEC.

Page 7: DB2 Programming Standards and Guidelines

SUPPLIER-LITERAL

The field SUPPLIER-LITERAL is defined in the working-storage section of the host program in the following manner:

01 SUPPLIER-LITERAL PIC X(21)

If it is possible that more than one row could satisfy the where clause in an existence check, interrogate the SQLCODE returned by the SQL call for a value of –811. If the SQLCODE is equal to – 811, it means that there is more than one row that satisfies the where clause of the SQL call.

DISTINCT

The DISTINCT operator may be used to eliminate duplicates from the result set. Do not use DISTINCT unless absolutely necessary for program processing. DISTINCT requires a sort which may degrade performance.

EXEC SQLSELECT DISTINCT CITY FROM PARTEND-EXEC

Returns:

LondonParisRome

ORDER BY

The ORDER BY is used to order the result set of a select statement. For example, the following statement could be used to select supplier numbers and names from the S table and order the result by the city column:

EXEC SQLSELECT SN, CITYFROM SORDER BY CITYEND-EXEC

RETURNS:

S5 ATHENSS1 LONDON

Page 8: DB2 Programming Standards and Guidelines

S4 LONDONS2 PARISS3 PARIS

Each column in the ORDER BY clause must identify a column of the result table. ORDER BY columns may also be identified by the position of the column in the result table. The ORDER BY will sort the result set in ascend9ing order by default. You may also request that the result be returned in descending order. The following SELECT statement is an example of this and will return the same results as the above query except that the result will be in descending order:

EXEC SQLSELECT SN, CITYFROM SORDER BY CITY DESCEND-EXEC

RETURNS:

S2 PARIS S3 PARISS1 LONDONS4 LONDONS5 ATHENS

GROUP BY

Use of the GROUP BY conceptually rearranges the table represented by the FROM clause into partitions, such that within any one group, all rows have the same value for the GROUP BY field. GROUP BY does not imply ORDER BY. But GROUP BY will perform a sort to group the result set. The column(s) specified by the GROUP BY must be in the select clause of the SQL call. The following query presents an example of the use of the GROUP BY:

EXEC SQLSELECT SN, SUM (QTY) FROM SPJGROUP BY SNEND-EXEC

RETURNS:

Page 9: DB2 Programming Standards and Guidelines

S1 900S2 3200S3 700S4 600S5 3100

HAVING

The HAVING clause is used with the GROUP BY to eliminate unwanted groups. The HAVING operates on the result set created by the GROUP BY. The HAVING will not work without a GROUP BY clause:

EXEC SQLSELECT SN, SUM (QTY) FROM SPJGROUP BY SNHAVING COUNT (*) > 2END-EXEC

RETURNS:

S2 3200S3 700S4 600S5 3100

LIKE

The LIKE operator can be used as a “wildcard” SELECT to retrieve rows based on only a fraction of the actual column value. With a LIKE you can retrieve rows when you only know a portion of the column value. The following SQL select will retrieve rows where the first character in the CITY column is equal to “L”. The percent sign is used to denote a wildcard, meaning that any values that come after the “L” in the SELECT are valid and satisfy the WHERE clause.

EXEC SQLSELECT SN, CITY FROM S WHERE CITY LIKE ‘L%’END-EXEC

RETURNS:

S1 LONDONS4 LONDON

Page 10: DB2 Programming Standards and Guidelines

LIKE Operator

DB2 will not use a matching index scan to search for a character string if the leading character is not specified in the LIKE clause. If the leading character or characters are specified, it may use a matching scan to locate values with those beginnings in the index and a nonmatching scan to complete the search for strings with those beginnings and any other characters specified in the clause.

UNION

The UNION statement is used to bring together rows from multiple selects:

- Set of all elements belonging to either or both of the original result sets.

- Must have the same number of columns selected

- Selected column data types must be compatible

- Redundant duplicate rows are always eliminated from the results unless UNION ALL is specified.

- Any number of SELECT statements can be UNIONed together

- ORDER BY clause, if used must appear in the final SELECT statement only. References columns by number, not name. The number refers to the order in which the columns were selected.

Query involving UNION:

EXEC SQLSELECT PN FROM PARTWHERE COLOR = ‘RED’UNIONSELECT PN FROM SPJWHERE QTY > 100ORDER BY 1END-EXEC

RETURNS:

P1P2P3P4P5

Page 11: DB2 Programming Standards and Guidelines

P6

Do Not Use ORDER BY, GROUP BY, DISTINCT, or UNION unless absolutely necessary for program processing. These predicates cause DB2 sorting which may degrade performance.

AGGREGATES

DB2 supports the use of various aggregate functions in SELECT statements:

- AVG average of the values in a column- SUM sum of the values in a column- MIN smallest value in a column- MAX largest value in a column- COUNT number of values in a column

- For COUNT, DISTINCT must be specified for anything but COUNT (*)

- Here is an example of a select statement to count the number of unique cities that supply parts:

EXEC SQLSELECT COUNT (DISTINCT CITY) FROM PARTEND-EXEC

RETURNS:

3

- DISTINCT is not applicable with MAX, MIN, AVG, SUM

- AGGREGATE function in the SELECT clause:

EXEC SQLSELECT COUNT (*)FROM SEND-EXEC

RETURNS:

5

Page 12: DB2 Programming Standards and Guidelines

EXEC SQLSELECT MIN (QTY), AVG (QTY), MAX (QTY)FROM SPJWHERE PN = ‘P1’END-EXEC

RETURNS:

MIN (QTY) AVG (QTY) MAX (QTY)100 333 700

- AGGREGATE function in the SELECT clause with DISTINCT:

EXEC SQLSELECT COUNT (DISTINCT SN)FROM SPJEND-EXEC

RETURNS:

5

If the result of an aggregate is an empty set (no rows satisfy the WHERE clause), COUNT returns a value of zero with an SQLCODE of zero. The other aggregates will field(s). To avoid this, code a null indicator variable in the SQL select statement:

EXEC SQLSELECT MIN (QTY)INTO :RESULT-FIELD NULL-INDICATORFROM SPJWHERE PN = ‘P7’END-EXEC

This will cause a SQLCODE of 100 to be returned, and the NULL-INDICATOR value will be set to a value of less than zero (usually – 1), if no rows satisfy the WHERE clause. If SQLCODE is equal to zero that means that a row, or multiple rows were found that satisfy the where clause and the result field has a valid value in it. If a null has been returned, you should initialize your host variable that was to receive the result of the aggregate to remove the “garbage data” that may be in it. The format of the SQL statement and the SQL and NULL-INDICATOR checking that may follow should be coded like this:

EXEC SQLSELECT MIN (QTY)INTO :RESULT-FIELD, NULL-INDICATOR

Page 13: DB2 Programming Standards and Guidelines

FROM SPJWHERE PN = ‘P7’END-EXEC

IF NULL-INDICATOR < 0MOVE ZEROS TO RESULT-FIELD.

IF SQLCODE = 0 DO WHATEVERELSEIF SQLCODE = 100 DO WHATEVERELSEIF SQLCODE NOT = 0 AND 100 GO TO ABEND ROUTINE.

EFFICIENT JOINS

Most joins have equivalent correlated subselects. A correlated subselect is one in which the outer select identifies a table from which SQL draws values for use in the subselect. Joins are preferred over correlated subselects. A join’s where clause can reduce the number of rows participating in the join. When processing a correlated subselect, on the other hand, DB2 must evaluate all rows from one table against all rows in the other. In some cases DB2 cannot use an index in satisfying correlated subselects where it would be able to use the index in carrying out the equivalent join if the join predicates are indexed columns.

EXEC SQLSELECT JN, PNAME, WEIGHT, QTYFROM P, SPJWHERE P.PN = SPJ.PNAND P.PN = ‘P3’END-EXEC

Logically, it makes no difference whether the statement specifies that the P3 rows be selected from the SPJ or P table. The results will be the same. From a performance standpoint, however, it is usually best to narrow the search on the larger table. Consequently, if the user or programmer knows which table in the join is larger, he or she should use it in the statements to narrow the search. An alternative would be to specify both tables in the WHERE clause – for example:

EXEC SQL

Page 14: DB2 Programming Standards and Guidelines

SELECT JN, PN, PNAME, WEIGHT, QTYFROM P, SPJWHERE P.PN = SPJ.PNAND P.PN = ‘P3’END-EXEC

Guidelines for efficient joins:

- Narrow the search to only those rows required for the join.

- Narrow the search on both tables to be joined to five the optimizer more information on which to base its decisions.

- Do not use correlated subselects with IN or NOT IN operators. A correlated subselect with the EXISTS or NOT EXISTS operator should be used instead. The EXISTS and NOT EXISTS operators will usually perform as well as a JOIN.

- In most cases, DB2 will not use an index when searching a table specified in an IN or NOT IN subselect.

The following example explains why correlated subselects with IN or NOT IN operators should not be used in host programs:

Consider a search for the names of suppliers of part P5. One way to accomplish that search would be with a statement that causes DB2 to select P5 rows from the SPJtable and join that set on supplier number with the S table. The following statement will do that job:

EXEC SQLSELECT SNAMEFROM S, SPJWHERE S.SN = SPJ.SNAND S.SN = ‘P5’END-EXEC

Since only four P5 rows will be joined with the S table, the procedure requires at most, reads of only four rows in the S table. Another way to carry out this search would be to perform a correlated subselect. The equivalent statement would be:

EXEC SQLSELECT SNAMEFROM SWHERE ‘P5’ IN

(SELECT PNFROM SPJWHERE SPJ.SN = S.SN)

Page 15: DB2 Programming Standards and Guidelines

END-EXEC

In this method DB2 first satisfies the inner select (in parentheses) by evaluating each row of the S table in turn against all the rows in the SPJ table, searching for a match on supplier number. The outer SELECT returns the supplier name any time the returned set includes P5. The entire SPJ table must be read once for each row in the S table, although searching the SPJ table just once for p5 rows is clearly more efficient.

EXISTS and NOT EXISTS

The EXISTS and NOT EXISTS operators return a true or false value to the outer select depending on the evaluation of the subselect with which they are used. For EXISTS, if there is a value satisfying the subselect, the operator returns true. For NOT EXISTS, if there is no value satisfying the subselect, the operator returns true. The outer select includes a row whenever the subselect returns a true evaluation. Any query that can be expressed using IN, can be formed using EXISTS. However, the converse is not true.

A select to determine the cities that have both suppliers and part stocks represents the intersection of the two sets of city values from the S and P tables. This statement is used to determine that intersection:

EXEC SQLSELECT DISTINCT CITYFROM SWHERE EXISTS

(SELECT CITYFROM PWHERE P.CITY = S.CITY)

END-EXEC

RETURNS:

CITY:

LONDONPARIS

Here is an example of the above query written as a correlated subselect without exists:

EXEC SQLSELECT DISTINCT CITYFROM SWHERE CITY IN

Page 16: DB2 Programming Standards and Guidelines

(SELECT CITYFROM PWHERE P.CITY = S.CITY)

END-EXEC.

RETURNS:

CITY:

LONDONPARIS

DB2 performs the search by passing the first city from S to the subselect, which finds a match for LONDON and therefore returns a true evaluation. The outer select therefore includes the row, returning LONDON. Then it continues the process for each subsequent row.

Similarly a search to determine whether there are any cities with parts but no suppliers, found by subtracting the cities in S from those in P, would represent a defference. The statement to determine this difference is:

EXEC SQLSELECT DISTINCT CITYFROM PWHERE NOT EXISTS

(SELECT CITYFROM SWHERE CITY = P.CITY)

END-EXEC.

RETURNS:

CITY:

ROME

Let’s look at another statement that can be written as join, or with an EXISTS operator:

EXEC SQLSELECT SN, PN, JN, QTYFROM SPJ SPJXWHERE JN = ‘J2’AND EXISTS

(SELECT S.SN

Page 17: DB2 Programming Standards and Guidelines

FROM SWHERE S.SN = SPJX.SNAND CITY = ‘PARIS’)

END-EXEC

Here is an example of the previous query written as a join:

EXEC SQLSELECT SN, PN, JN, QTYFROM S, SPJWHERE S.SN = SPJ.SNAND CITY = ‘PARIS’)AND JN = ‘J2’)END-EXEC

UPDATES

The purpose of the UPDATE statement is to update the values of one or more columns in one or more rows of a table. The rows to be updated are chosen by the “WHERE CLAUSE”. The basic format of the UPDATE statement is:

UPDATE the table (table name) SET column name 1 = expression 1

column name 2 = expression 2 WHERE conditions to be met (if any)

If no WHERE CLAUSE is specified, all rows in the table will be updated

Because updates modify the data and associated indexes during the update process, use of indexes to find the rows to be changed presents several problems. DB2 will not use certain indexes for certain types of updates and in some cases will fail to take advantage of usable indexes if the UPDATE statement is improperly worded.

DB2 will not use an index to search a column being updated. For example, to change supplier S5’s supplier number to S6, we could use this statement:

EXEC SQLUPDATE SSET SN = ‘S6’ WHERE SN = ‘S5’ END-EXEC

Page 18: DB2 Programming Standards and Guidelines

This statement would not take advantage of an index on SN. Using SNAME to identify the rows for updating would require this equivalent statement:

EXEC SQLUPDATE SSET SN = ‘S6’WHERE SNAME = ‘ADAMS’END-EXEC

This statement would take advantage of the index on SNAME.

In some cases, however, updates or deletions may be done in conjunction with other processing that would require a cursor. If the rows returned with a cursor are to be updated or deleted, the cursor definition may indicate this. A FOR UPDATE OF clause alerts DB2 that the update is coming. This warning prevents DB2 from using an index on a column to be updated. If it were to use an index on a column being updated, some updates would not be applied after the initial update since the index itself would be changed by the update. For example, if an index on the STATUS column of the s table were used to retrieve rows, those rows would be returned in STATUS order. An update of status from 10 to 50 would cause the cursor position to skip to a STATUS of 50, missing 20, 30, 40. If equal predicates are used on the updated column and on all preceding columns, an index can be used to locate the row to be updated. If the program to reassign parts over a certain weight from one city to another employed a cursor, its declaration would look something like this:

EXEC SQLDECLARE PCURSOR FORSELECT PN, WEIGHTFROM PWHERE WEIGHT > :WEIGHT-LIMITFOR UPDATE OF CITY

END-EXEC.

Notice that the column being ukpdated need not appear in the SELECT clause, although it often does.

When a cursor is used for updates, the UPDATE statement must use a WHERE CURRENT OF CURSOR clause to indicate the rows to be processed for example:

EXEC SQLUPDATE PSET CITY = :NEW-CITYWHERE CURRENT OF PCURSOR

END-EXEC.

Page 19: DB2 Programming Standards and Guidelines

The WHERE CURRENT OF clause causes the row at which the cursor points to be updated. The update does not advance the cursor. A FETCH . . . INTO statement is necessary for that.

The FOR UPDATE OF cannot be specified if the SELECT has:

- ORDER BY, GROUP BY, UNION, DISTINCT- AVG, SUM, COUNT, MIN, MAX, SUM- JOIN- A correlated subselect

All of the above functions create temporary internal tables, that cannot be updated since the base tables may be changed independently, creating inconsistencies in the data.

DB2 automatically bars the use of an index on any column referenced in the FOR UPDATE OF clause of cursor declaration. In performing the statement’s search, regardless of whether the reference columns are actually updated. Reference only those columns that are likely to be updated. Updating an entire set of rows with an update statement is more efficient than using the SELECT and cursor operation to update a row at a time.

USING INDEXES WITH OR CLAUSES:

The optimizer may use indexes in searches that include ‘OR’ CLAUSES. There are exceptions, however. The optimizer will not use an index if the search criteria depends on a comparison other than equality, as with the following statement:

EXEC SQLSELECT SN, SNAMEFROM SWHERE SN < ‘S2’OR SN > ‘S4’END-EXEC

DB2 will not use an index on supplier number, but an equivalent statement using UNION ALL would:

EXEC SQLSELECT SN, SNAMEFROM SWHERE SN < ‘S2’UNION ALLSELECT SN, SNAMEFROM S

Page 20: DB2 Programming Standards and Guidelines

WHERE SN > ‘S4’END-EXEC

NEGATIVE LOGIC CLAUSES:

DB2 will usually not use indexes in performing searches with predicates that include a negation. The not-greater-than or not-less-than conditions are different. The optimizer can reverse the negative and use an index on the positive. The optimizer can evaluate the clause WHERE NOT STATUS > 20 as WHERE STATUS <= 20 AND USE AN INDEX ON status IN THE SEARCH. The optimizer will not use an index on a search based on the NOT BETWEEN operator.

BETWEEN

If the optimizer knows that a search covers a range it will likely choose a nonmatching index scan. Between is more efficient than using IN.

The following three queries will return the same results, but the query using BETWEEN is more efficient than using IN, or a combination of >=, ,<=

EXEC SQLSELECT SN, PN, JNFROM SPJWHERE QTY BETWEEN 100 and 300END-EXEC

EXEC SQLSELECT SN, PN, JNFROM SPJWHERE QTY IN(100,200,300)END-EXEC

EXEC SQLSELECT SN, PN, JNFROM SPJWHERE QTY >=100 and <= 300)END-EXEC RETURNS:

S1 P1 J1S2 P3 J2S2 P3 J3

Page 21: DB2 Programming Standards and Guidelines

S2 P5 J2S3 P3 J1S4 P6 J3S4 P6 J7S5 P2 J2S5 P2 J4S5 P5 J7S5 P6 J2S5 P1 J4S5 P3 J4

PROGRAMMING FOR PROGRAMS THAT ARE RE-STARTABLE

When coding a DB2 program with restart capabilities a DB2 table is often used as a “driver” to determine processing sequence and an ORDER BY is often used for re-start positioning. When using a DB2 table as a driver, order the retrieved rows by the clustering index. For example, if you are using participant social security numbers from TBL_EE_PART as a processing driver you would want to select the column PART_SEARCH_NAM from TBL_EE_PART. This column contains the participants name and social security number. For an example of how to use this column as a processing driver, how to extract the social security number from it, and how to use this column for re-starting purposes look at program TR99ET06 in dataset: ADM.TR16101.TESTLIB.

LOCKING

When coding a batch program there are ways to improve efficiency by explicitly specifying the way your program will lock tables that you are accessing. Locking refers to the way that DB2 secures tables is only SELECTing columns from a DB2 table, you should code a LOCK TABLE statement in the initialization routine of your program as follows:

EXEC SQLLOCK TABLE S IN SHARE MODE

END-EXEC.

Locking a table in SHARE mode will allow other plans to read rows from the table, but will not allow any updates to it.

If your program is going to be updating or inserting rows on a DB2 table, you should LOCK the table in exclusive mode so that no other plans may access it. You code this LOCK table statement in the following way:

EXEC SQL

Page 22: DB2 Programming Standards and Guidelines

LOCK TABLE SPJ IN EXCLUSIVE MODEEND-EXEC.

Locking a table in exclusive mode will prevent any other plan from accessing this table, even SELECT access.

SUMMARY OF SEARCHES THAT CAN USE AN INDEX:

DB2 may use indexes to satisfy searches called for by:- Where clauses using the comparators =, <, <=, >, >=, ^> and ^<.- For example, WHERE SN = 10.

- Subselects using those comparators.

- Predicates using AND to limit the search criteria.

- Predicates using OR to expand the search criteria if the two clauses refer to the same column and both search on an equality.

- Predicates using IN or BETWEEN search criteria.

- Predicates seeking null values, for example WHERE SN IS NULL.

- Predicates seeking LIKE character values. If the search value begins with a blank or wild card - WHERE LOCATION LIKE ‘%BURG,’ for example – DB2 can use a nonmatching index scan. If the search value begins with an alphanumeric value, it can use a matching index scan.

- Predicates that search on a value determined by a ‘NOT’ operator negation of a less-than or greater-than comparator – for example, NOT SN > 50.

- A WHERE clause limiting a search on the first column or columns of a composite index, in which case a matching scan can be used; if the columns used in the predicate do not include the first column of the composite, a nonmatching scan may be used.

- Predicates requiring ORDER BY or GROUP BY operations on clustered columns.

- EXISTS and NOT EXISTS operators can use a matching index scan.

- The aggregates AVG, SUM, MIN, MAX, and COUNT – can use a matching or nonmatching index scan.

Page 23: DB2 Programming Standards and Guidelines

SUMMARY OF SEARCHES THAT CANNOT USE AN INDEX:

- A predicate using the ‘OR’ operator to expand search criteria to more than one column or a predicate using the ‘OR’ operator to include in a search criteria determined by the < and > operators.

- A correlated subselect.

- A subselect comparison based on the ‘IN’, ‘ALL’, ‘ANY’ operators, though an index can be used to satisfy the subselect itself.

- An UPDATE statement that specifies a column specified in the WHERE condition.

- A join of a table to itself when aliases are not used.

- Predicates that determine search criteria by negating an equality WHERE NOT SN = 10, for example.

- A comparison using an arithmetic expression – WHERE QTY = :QTY/2, for example.

INDEX USAGE IN DB2

Indexed Column is . . . . DB2 ConsidersIndex Sort

Used with:IN YESLIKE 1BETWEEN YES>, >=, <, <=, ^<, ^> YES^= NONOT 2OR 3ORDER BY YES YESGROUP BY YES YESDISTINCT YES YESCompared to:LONGER LETERAL NOMATH. EXPRESSIONDIFF. DATA TYPEUSED FOR:

Page 24: DB2 Programming Standards and Guidelines

JOINS YES YES

1 – Yes, except when comparison string is host variable or begins with ‘%’ or ‘_’2 – Apply NOT, then evaluate.3 – No, except when convertible to an ‘IN’

This document has shown the format of basic SQL statements and how they may have an impact on DB2 performance, and pointed out requests that will and will not allow DB2 to use indexes. It also showed how to formulate requests so that DB2 will be most likely to determine the most efficient access paths. I have also included some guidelines for DB2 programs that are re-startable. If you have any questions or comments, please see me.

Page 25: DB2 Programming Standards and Guidelines

Recommended