Agenda
Brief overview of tuning General tuning techniques Theory behind tuning Lots of silly looking dogs
Agenda
Brief overview of tuning General tuning techniques Theory behind tuning Lots of silly looking dogs
What is tuning?
Tuning is optimizing a database for a specific purpose.
Often overlooked. Can offer serious performance boosts.
Tuning Theory
Consider all aspects of the system. Remember that your database is just one part of
a larger whole piece of software, on a network of hardware.
A few of the many things you can (and should) check when tuning: System downtimes Batch job failures Resource utilization history Update history Partner machine needs and stability (apps
servers, etc)
Priority list for tuning
Hardware stability No computers are crashing often. The network
cables and routers are reliable. Hardware adequacy
Hardware on all machines is appropriate. No ancient Pentium I processors holding everyone up. Operating System and so forth is up to date.
Software design Database tables work well for application.
Software utilization Software doesn't have outrageous demands
being made of it (bad SQL).
General Tuning Techniques
Keep table size down. On particularly large databases undergoing large
queries, going from the hard drive to RAM and RAM to CPU can be very taxing.
A good rule of thumb: Think of single tables in terms of seconds of motherboard capacity.
This part's important don't look away now
Example table calculation: Data deliveries per second * size of deliveries =
throughput Motherboard: Intel 3010, 1066MHz FSB CPU. Front Side Bus (FSB): 1066 MHz, 32 bit. ((FSB Speed*1000) * Bit width) = throughput (in
bits) Throughput/(1,048,576 * 8) = throughput (in MB) (1,066,000 * 32)/(1,048,576 * 8) = 2.56 MB
So a 25 MB table is a 10-second table. A 5MB one is a 2-second table.
Seriously this part's useful
How long is too long? This entirely depends on what the database is
doing. To get a reply out of a 10-second table, you will need up to 10 seconds to get a result. Honestly ask yourself: Is that going to annoy people?
Some considerations for “annoyance” How often is this table used per day? Who uses it? What else is going on while it's being used?
Table size
How to cut size down directly? Archive old stuff Divide tables up more finely (one table per user
instead of a big users table) Eliminate redundant stuff
Kind of like grooming a dog.
Table size
Kind of like grooming a dog. (Cannot be the only thing you do. The real world is wet tar to your database.)
+
General Tuning Techniques
Add better indexes and reference tables Indexes allow searches to immediately
ignore data that is clearly not what it’s looking for.
Reference tables allow quick lookups.
Indexes
Indexes are no different in theory from the index in the back of a book.
When looking for a single String1 = String2 comparison, it checks the first letter, then the second, and so on.
Instead of looking at all records, you look at only records which sort of fit—looks like a tree structure.
Normal searches are “linear”, and require as many operations as the table has records. Indexed searches are much more efficient, and require log(records) time.
All modern databases HAVE indexes—they are not always properly used, however.
Reference table example
Example of reference table use: Imagine an e-commerce website with keywords
on items.
Reference Table Example
Imagine we must find products based on these keywords. Assume 100,000 individual entries to the Product table, with an average of 3 five letter keywords each. Search for ONE keyword.
First option: Search the “keywords” field. Obscenely slow. Requires as many operations
as the string is long; string length 5 requires 5 operations, length 10 requires 10, etc.
Called “linear(n) time with respect to string length.” 17 length keyword field * 100,000 records = 1.7
million operations for 1 search. …many customers looking at the same time, too.
Reference Table Example
Option 2: make reference tables by keyword Create Table “electronics”, and put anything with
“electronics” in the “keyword” field in there. Can be done while system is otherwise idle. Middleware can make new tables for new
keywords automatically. Requires only 1 operation to finish. Select * from
(keyword). Called “Constant time”.
This is how Google returns millions of webpages in a fraction of a second.
Reference Table Example
This search would have taken years on the best supercomputers in the world if simple text searches were done.
Reference Tables and Interfaces
What does this do for you? This makes the computer look at less useless
stuff when searching for something. Resists huge and sudden increases in size and
subsequent drops in speed. Doesn't address underlying problems.
General Tuning Techniques
Check the SQL being fed into the database. All current databases keep logs of what SQL
they run. Bad SQL can potentially hurt performance much
more than big tables or clunky interfaces. Call up an audit and see if anything stands out
as particularly slow or taxing. Rewrite it if possible.
SQL Monitoring
What does this do for you? Keeps people from doing too many dumb things,
which they will despite your best efforts. Again doesn't address any real problems with
the DB.
Monitoring tools
SQL Power Tools (http://www.sqlpower.com/) Monitors a database remotely, giving no overhead. Puts things into graph form. Can monitor entire server farms (up to 250). Web-enabled interface. $1200.
Freeware monitoring tools are few, far in between, and usually pretty primative. Mostly, they involve individual developers writing frontends for already-existing feedback programs like netstat and adding in alarms and so forth. Hard to find online. Most freeware programs you can find have serious flaws.
Monitoring Tools
SQL Monitor: (http://www.brothersoft.com/downloads/sql-
monitor.html) Freeware Only does SQL testing very well.
SQLXML 3.0 (http://msdn.microsoft.com/downloads/default.asp)
Comes free with Microsoft SQL Server. Very useful in getting database schema into XML for
presentation. Only works with MSSQL (very common problem!)
And many more
There are dozens and dozens of techniques to tune a database. Parallelism (setting up multiple computers to serve a single DB), indexing, and caching are just a few words that would work for all DBs...then there are dozens more specifically for Oracle, mySQL, MSSQL, and so forth. Reading and asking informed questions is key! There is no one procedure or good answer to a well tuned database.
Sources
O'Reilly Databases, “RDBMS Tuning”. Steven Hauser. http://www.onlamp.com/pub/a/onlamp/2003/11/13/rdbms_tuning.html
University of Arkansas, “Oracle Database Tuning”. Steve Rea. http://www.uaex.edu/srea/tuning.htm
Database Tuning: Principles, Experiments, and Troubleshooting Techniques. Dennis Shasha and Philippe Bonnet. (Paperback version: http://www.amazon.com/Database-Tuning-Principles-Experiments-Troubleshooting/dp/1558607536
Special Thanks to:Google, for endless pictures of silly dogs.Nikita Minter, for the Party Doog picture.