DB-15: Developing Performance-oriented ODBC/JDBC OpenEdge™ Applications
John Goodson, Vice President, DataDirect, R&D
2 DB-15 ODBC/JDBC OpenEdge™ Applications © 2005 Progress Software Corporation
Agenda
Topic: Understanding how drivers work and what factors influence performance
Topic: Guidelines for improving performance
Topic: Review
At The End, You Should Understand:
How driver implementations trade off performance for functionality
How to design for performance
How to avoid pitfalls in some drivers
ODBC
Provides a common API to all data stores
Cross platform
Initially a Microsoft standard based on a draft pre-release of the X/Open SQL Access Group CLI
ODBC 3.0 aligns with the ISO and ANSI CLI standards
JDBC
JDBC is a Java™ interface for accessing SQL data sources
Based on ODBC, Open Group SQL CLI, and ANSI/ISO CLI
JDBC 1.0 designed for basic usage with emphasis on ease of use
JDBC 2.0 enhanced for advanced features and server-side use
JDBC 3.0 goal was to “round out” API by providing missing functionality
Performance Issues
Developing code that works is hard enough without having to make it work fast!
Tough to know all the specifics of a single backend
No exceptions are thrown to say your code is running too slow
Most performance “help” is DBMS server focused … not middleware focused
How Do Drivers Work?
Each database vendor defines a communication protocol for how the database server communicates with clients
This protocol is defined in terms of packets that are transferred over a network transport; that is, information is encoded in a database-specific format and sent over the network (TCP/IP) to and from clients
The database server only understands requests that are sent in this format
How Do Drivers Work – Part 2
Every database vendor defines their packet descriptions and their model differently
– Oracle and DB2 are cursor-based models
– SQL Server and Sybase are streaming models
– Progress OpenEdge is a cursor-based model
Every JDBC request will most likely result in many packets being sent to/from the database
– Why?
– (1) Packets are fixed size and data might not fit in one packet
– (2) The JDBC API does not map 1:1 to the database communication protocol
Why Do We Care How Drivers Work?
The two slowest things in a database system are
– Network I/O
– Disk I/O
The goal is to keep applications using the CPU and not waiting on movement of data across a network or to/from a storage device.
What Will Applications Do?
– Connect
– Look for data
– Read data
– Change data
– Delete data
– Make changes permanent (commit)
– Disconnect
What Happens When We Connect?
Login credentials verified by DBMS
Shared memory (allocated and) assigned to connection
– Depending on the database, between 24 KB and 64 KB
Various queues updated with connection presence
Connection recorded in audit files on disk
Database context established
Code page set
Transaction information initialized
Connection properties applied
Initialization statements executed
Connecting
Connecting is the single slowest thing you can do in a well tuned application
– Lots of network I/Os … disk I/Os … CPU cycles
Using connection pooling, we can make the slowest (network i/o and disk i/o intensive) operation in your application one of the fastest (using no network i/o or disk i/o)
With JDBC, connection pooling requires the use of JDBC DataSources … don’t connect via DriverManager
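The DataSource guideline can be sketched as follows. This is a minimal illustration, not a pooling implementation: it assumes the application has already obtained a pooling javax.sql.DataSource from its driver vendor or application server, and the names PooledWork, Work, and withConnection are hypothetical.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Hypothetical helper: borrows a connection from a DataSource for one unit of work. */
final class PooledWork {

    /** A unit of work that runs against a borrowed connection. */
    interface Work<T> {
        T run(Connection con) throws SQLException;
    }

    static <T> T withConnection(DataSource ds, Work<T> work) throws SQLException {
        // With a pooling DataSource, getConnection() typically hands back an
        // already-open physical connection, and close() returns it to the pool
        // rather than tearing it down.
        try (Connection con = ds.getConnection()) {
            return work.run(con);
        }
    }
}
```

Borrowing and returning the connection around each unit of work keeps the pool small; whether close() really avoids network and disk I/O depends on the DataSource implementation in use.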
What Will Applications Do?
– Connect
– Look for data
– Read data
– Change data
– Delete data
– Make changes permanent (commit)
– Disconnect
Looking for Data – Catalog Functions/DatabaseMetaData
Compared to other xDBC methods, catalog calls/DatabaseMetaData methods that return ResultSet objects are slow
– getTables, getColumns, getProcedures, …
xDBC drivers have to execute complex queries – and many times multiple queries – in order to return the results mandated by the xDBC specs.
Getting a List of Tables
What you want: a list of all tables that I have starting with the prefix NYC
ResultSet WSrs = WSc.getTables (null, null, "NYC%", null);
Getting a List of Tables (part 2)
What a driver thinks you want: A list of all tables, views, synonyms, system tables, aliases, or temporary tables owned by any schema in any catalog/database starting with the prefix NYC
ResultSet WSrs = WSc.getTables ("cat1", "johng", "NYC%", new String[] {"TABLE"});
Avoid Search Patterns for DatabaseMetaData Methods
Using null arguments or search patterns in catalog/DatabaseMetaData methods results in time-consuming queries and can increase network traffic because of unwanted results.
Common places to see null that should be “fixed”
– Schema name (almost all methods)
– Table type (getTables/SQLTables)
– Catalog name (almost all methods)
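A fully specified getTables call might look like the sketch below. The class and method names (TableLister, listTables) are hypothetical, and the catalog and schema values are whatever the application already knows about its own database. Note that the table-type argument of DatabaseMetaData.getTables is a String array.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists base tables with every catalog argument filled in. */
final class TableLister {

    static List<String> listTables(Connection con, String catalog, String schema, String prefix)
            throws SQLException {
        DatabaseMetaData md = con.getMetaData();
        List<String> names = new ArrayList<>();
        // Catalog, schema, and table type are all supplied, so the driver
        // does not have to search every schema and every object type.
        try (ResultSet rs = md.getTables(catalog, schema, prefix + "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString(3)); // column 3 of the getTables result is TABLE_NAME
            }
        }
        return names;
    }
}
```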
Oracle SQL … getProcedures … yuck!
select a.owner, a.object_name, INSTR(a.object_type, 'PROC'), '1'
  from sys.all_objects a
  where a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID'
union all
select c.owner, c.synonym_name, INSTR(a.object_type, 'PROC'), '1'
  from sys.all_objects a, sys.all_synonyms c
  where c.table_owner = a.owner and c.table_name = a.object_name
    and a.object_type IN ('FUNCTION','PROCEDURE') and a.status = 'VALID'
union all
select distinct b.owner, CONCAT(b.package_name, '.' || b.object_name), min(b.position), max(b.overload)
  from sys.all_arguments b
  where b.package_name IS NOT NULL
  group by b.owner, CONCAT(b.package_name, '.' || b.object_name)
union all
select distinct c.owner, CONCAT(c.synonym_name, '.' || b.object_name), min(b.position), max(b.overload)
  from sys.all_arguments b, sys.all_synonyms c
  where c.table_owner = b.owner and c.table_name = b.package_name and b.package_name IS NOT NULL
  group by c.owner, CONCAT(c.synonym_name, '.' || b.object_name)
union all
select distinct c.owner, c.synonym_name, min(b.position), max(b.overload)
  from sys.all_arguments b, sys.all_synonyms c
  where c.owner = b.owner and c.table_owner=b.package_name and c.table_name=b.object_name
  group by c.owner, c.synonym_name
getTypeInfo
Even getTypeInfo may result in a network round-trip
– Do this once and cache the information
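One way to "do this once and cache" is sketched below. TypeInfoCache is a hypothetical class, and it keeps only two columns of the getTypeInfo result (TYPE_NAME and DATA_TYPE); an application that needs more of the type information would cache more.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cache: reads getTypeInfo() once and reuses the result afterwards. */
final class TypeInfoCache {
    private Map<String, Integer> typesByName; // null until the first load

    synchronized Map<String, Integer> get(Connection con) throws SQLException {
        if (typesByName == null) {
            Map<String, Integer> loaded = new LinkedHashMap<>();
            // Only this first call reaches the driver (and possibly the server).
            try (ResultSet rs = con.getMetaData().getTypeInfo()) {
                while (rs.next()) {
                    // Column 1 is TYPE_NAME, column 2 is DATA_TYPE (a java.sql.Types code).
                    loaded.put(rs.getString(1), rs.getInt(2));
                }
            }
            typesByName = Collections.unmodifiableMap(loaded);
        }
        return typesByName;
    }
}
```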
Use Prepared Statement Metadata Instead of getColumns (when possible)
getColumns/SQLColumns
– Driver sends a potentially complex SQL query to the server
– SQL is preprocessed and executed on the server
– Driver must then obtain result set descriptions, bind results, and retrieve the rowset from the server
– Usually at least 3 network I/Os (probably more)
ResultSetMetaData/SQLDescribeCol
– Simple query, which evaluates to no rows, sent to server; NOT EXECUTED
– Only result set information obtained
– Usually 2 network I/Os (prepare + close)
Example
PreparedStatement pstmt = WScon.prepareStatement ("select * from StockPurchasePlan where 0 = 1");
ResultSetMetaData rsmd = pstmt.getMetaData();
int count = rsmd.getColumnCount();
// this code would replace
// getColumns ("ct1", "johng",
// "StockPurchasePlan", "%");
What Will Applications Do?
– Connect
– Look for data
– Read data
– Change data
– Delete data
– Make changes permanent (commit)
– Disconnect
Prepared Statements
Many times our applications do the same things over and over again by specifying SQL that is parameterized
SELECT firstname, lastname, address, phone FROM employees WHERE empid = ?
UPDATE balance SET balance = balance + ? WHERE acctid = ?
Prepared Statements (2)
For a statement that runs only once or twice, a prepared statement is often less efficient than executing the SQL directly (using the Statement object in JDBC/SQLExecDirect in ODBC), because the prepare step typically adds work of its own
Use prepared statements if the query is to be executed more than twice in this session OR if it will be executed frequently by the system (JDBC)
Execute directly if the query is to be executed once or twice
Prepared Statement Side Effect
Side effect: There is no way to control what gets put in the prepared statement pool. If you create a prepared statement, then it’s going into the pool and might remove some other statement you want to stay in the pool.
Even More on Prepared Statements
Prepare, Execute, Describe
– Always preferred
– Saves network roundtrip with most databases
Prepare, Describe, Execute
– Always slower
Stored Procedure Invocation
Use parameter markers for arguments to stored procedures … do not use literal arguments
{call getFunds (?,?,?)} // good
{call getFunds (1234, 'JPG', 53.23)} // bad
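In JDBC, the parameter-marker form of the call might be written as in the sketch below. The getFunds procedure comes from the slide; its parameter types (int, string, decimal) are assumptions inferred from the literal values shown, and FundsCall is a hypothetical class name.

```java
import java.math.BigDecimal;
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical wrapper around the getFunds procedure used in the slide. */
final class FundsCall {

    static void callGetFunds(Connection con, int account, String code, BigDecimal amount)
            throws SQLException {
        // Parameter markers let the driver send typed argument values
        // instead of literals embedded in SQL text that the server must parse.
        try (CallableStatement cs = con.prepareCall("{call getFunds (?,?,?)}")) {
            cs.setInt(1, account);         // assumed int parameter
            cs.setString(2, code);         // assumed character parameter
            cs.setBigDecimal(3, amount);   // assumed decimal parameter
            cs.execute();
        }
    }
}
```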
Why Not Use Literal Arguments?
The spec says not to use literals
More code for the programmer, but executions can be done via RPCs (remote procedure calls) instead of executed as language events
Use of RPCs improves performance by
– Eliminating parameter processing on the server
– Eliminating SQL parsing on the server
Example:
{call foo (12,'abc')}
Server gets a generic SQL packet
– Parse SQL
– Semantically validate
    Does procedure foo exist?
    Is foo defined to have 2 arguments?
– SQL is always a Unicode character string
    Look up data type for foo's parameter 1 (int)
    Convert '12' to an integer
    Look up data type for foo's parameter 2 (varchar)
    Convert 'abc' to varchar
– Execute
Example:
{call foo (?,?)}
Server gets a specific stored procedure packet where the packet structure contains the arguments in the defined server format for the parameters
– Execute only
Understand Cursor Types
Forward only cursors are always the fastest cursor type
Most of the time, static/insensitive scrollable cursors retrieve all rows for the query to the client
Most of the time, keyset/sensitive scrollable cursors retrieve all keys to the client; each fetch (next) then results in an execute
Forward-Only Cursors
Driver will pre-fetch rows from the server
For small result sets, can fit entire contents of the result set in one network roundtrip
Side effects:
– First fetch will be slower than subsequent fetches for the application
– Cursor position is not where you “think” it is
– Read Committed data might have already been read
Scroll Insensitive/Static Cursors
Drivers will usually cache the entire contents of the result set on the client
“Lazy fetch” model … only fetch the data when you have to.
If you request a scroll sensitive cursor and the driver can’t support a scroll sensitive cursor, then it will downgrade to scroll insensitive
Not scalable!
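Because a driver may silently hand back a different cursor type than the one requested, an application can check what it actually received. The sketch below uses the standard ResultSet.getType() call; the CursorCheck class and its method names are hypothetical.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

/** Hypothetical helper: reports which cursor type the driver actually delivered. */
final class CursorCheck {

    /** Returns true when the result set's type differs from the type that was requested. */
    static boolean wasDowngraded(ResultSet rs, int requestedType) throws SQLException {
        return rs.getType() != requestedType;
    }

    /** Maps a ResultSet type constant to a readable name for logging. */
    static String describe(int type) {
        switch (type) {
            case ResultSet.TYPE_FORWARD_ONLY:       return "forward-only";
            case ResultSet.TYPE_SCROLL_INSENSITIVE: return "scroll-insensitive";
            case ResultSet.TYPE_SCROLL_SENSITIVE:   return "scroll-sensitive";
            default:                                return "unknown";
        }
    }
}
```

A typical use would be to request ResultSet.TYPE_SCROLL_SENSITIVE when creating the statement and then call wasDowngraded on the result set it returns before relying on sensitive behavior.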
Keyset/Scroll Sensitive Cursors
Drivers usually cache a set of keys on the client
Example: 'select * from table' with Oracle
– Results in the driver executing 'select rowid from table'
– Retrieves all rowids
– Each call to SQLFetch/rs.next() will result in the driver executing 'select * from table where rowid = ?' and retrieving the result
Use Batches Instead of Prepared Statements
Bulk load of data has typically been done by preparing an INSERT statement and executing many times.
Preparing an INSERT statement and executing 100 times:
– 101 network round-trips
Using batch queries/addBatch() to compound 100 inserts:
– 1 network round-trip
Network round-trips are more expensive than DBMS CPU cycles
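The batch approach can be sketched with the standard addBatch()/executeBatch() calls. The employees table and its lastname column are borrowed from the earlier prepared-statement slide purely for illustration, and BatchInsert is a hypothetical class. How many round-trips the batch really costs depends on the driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Hypothetical bulk loader: queues every row, then submits them together. */
final class BatchInsert {

    static int[] insertAll(Connection con, List<String> lastNames) throws SQLException {
        try (PreparedStatement ps =
                 con.prepareStatement("INSERT INTO employees (lastname) VALUES (?)")) {
            for (String name : lastNames) {
                ps.setString(1, name);
                ps.addBatch();          // queued on the client, nothing sent yet
            }
            return ps.executeBatch();   // one submission for all queued rows
        }
    }
}
```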
Arrays
In ODBC, use SQLParamOptions (superseded in ODBC 3.x by SQLSetStmtAttr with SQL_ATTR_PARAMSET_SIZE) to execute multiple parameterized queries with one network roundtrip
Not as easy in JDBC; however, use batch queries and eventually all drivers will optimize to use parameter arrays
Retrieving JDBC Data
The hard work is not just positioning to the data; follow a few more guidelines when putting data into buffers
Use getter methods with result column ordinals instead of with column names.
String fullname = rs.getString (3); // good
String fullname = rs.getString ("lname"); // bad
JDBC Retrieving Data (2)
Avoid getObject() unless it’s absolutely necessary.
String children = (String) rs.getObject (2); // bad
int children = rs.getInt (2); // fast
Avoid Long Data (Including LOBs)
Select * from table
– Easy! Most people do this!
– Slow! Even if you don’t ask for column data, a performance penalty has been paid because the data has already been moved across the network to the client
Particularly bad when the table contains long varchar/long varbinary/LOB data
Compounded because the driver is trying to array fetch … which it might not be able to do
CLOB/BLOB
Use CLOB and BLOB objects sparingly
Although you might think that only parts of the LOB are being transported across the network, most database wire protocols retrofit LOBs into existing packet structures
JDBC drivers emulate getClob/getBlob by retrieving the entire object locally
Various Other Guidelines
Use getGeneratedKeys()
– Saves network I/Os
Don’t call rs.last() or fetch absolute last
Don’t write code that relies on the number of result rows from a query
– Drivers have no idea how many rows a query will return
– Drivers have to fetch all rows to know how many rows the query will return
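For the getGeneratedKeys() guideline, a minimal sketch follows. It assumes the target table has an auto-generated key column and that the driver supports Statement.RETURN_GENERATED_KEYS; the employees table and the KeyInsert class are illustrative only.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper: inserts a row and reads back the key the database generated. */
final class KeyInsert {

    static long insertAndGetKey(Connection con, String lastName) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                 "INSERT INTO employees (lastname) VALUES (?)",
                 Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, lastName);
            ps.executeUpdate();
            // The key comes back with the insert, so no follow-up SELECT is needed.
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("no generated key returned");
                }
                return keys.getLong(1);
            }
        }
    }
}
```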
More Guidelines
Understand:
– setMaxRows
– setMaxFieldSize
Both limit network I/O by requesting that the server not send data across the wire
Side effects:
– setMaxRows also applies to updates/deletes in some databases
– These options will persist in the connection
What Will Applications Do?
– Connect
– Look for data
– Read data
– Change data
– Delete data
– Make changes permanent (commit)
– Disconnect
Transactions
Committing transactions is expensive!
– Network I/Os
– Numerous non-sequential disk I/Os
Don’t run in auto-commit mode
Distributed transactions are at least 4 times as slow as local transactions
Use DTC/JTA only if required!
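Turning off auto-commit and committing once per unit of work might look like the sketch below. The UPDATE statement is the one from the earlier prepared-statement slide; the Transfer class, the method name, and the idea of moving an amount between two accounts are illustrative assumptions.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Hypothetical unit of work: two updates, one commit. */
final class Transfer {

    static void transfer(Connection con, int fromAcct, int toAcct, BigDecimal amount)
            throws SQLException {
        boolean previous = con.getAutoCommit();
        con.setAutoCommit(false); // otherwise each statement would commit on its own
        try (PreparedStatement ps = con.prepareStatement(
                 "UPDATE balance SET balance = balance + ? WHERE acctid = ?")) {
            ps.setBigDecimal(1, amount.negate());
            ps.setInt(2, fromAcct);
            ps.executeUpdate();

            ps.setBigDecimal(1, amount);
            ps.setInt(2, toAcct);
            ps.executeUpdate();

            con.commit(); // a single commit covers both updates
        } catch (SQLException e) {
            con.rollback(); // undo the partial work before reporting the failure
            throw e;
        } finally {
            con.setAutoCommit(previous); // leave the connection as it was found
        }
    }
}
```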
What Will Applications Do?
– Connect
– Look for data
– Read data
– Change data
– Delete data
– Make changes permanent (commit)
– Disconnect
Summary
Avoid:
– Network I/O
– Disk I/O
Think about what drivers are doing to process your requests
Stop lazy programming ... Fill in the null arguments
Summary
Avoid network I/O
Avoid disk I/O
Avoid sloppy programming
Questions?
Thank you for your time!