Issues in Database Performance
• Performance in Read / write are hardware issues => throw money at it
• Performance of DB = ability of engine to locate data
• Factors affecting speed of retrieval– Cache (sizing of objects)– Access method (table scan / indexes…)– Contention between processes– Indirect processes (roll back, archiving…)
Reading data in Oracle
1 – determine how to access block where data is located– read data dictionary stored in separate part of DB– DD may be loaded up in cache (aka row cache) to
limit I/O activity
2 - DD indicates preferred access method for block– B tree index, partitioning, hash clusters etc…
3 - Search begins either in full scan or with index until data found
Designing DB for reading
• Supply methods for high precision access to data
• But some queries will defeat the strategies• E.g. credit cards transactions – monthly
report of scattered items• No solution = take off-line
Changing data• Oracle makes hard work of changes
– Rollback data (immediate)– Log files (long term)
• Changed blocks read and updated in buffer• Released to disk as buffer is cleared• But rollback info generate most I/O operations• In sensitive environments, simultaneous
archiving makes it worse (ARCHIVELOG mode)
Indexing Problems
• Super-fast indexes need updating as data is changed => DB slows down.
• More complex index = more complex update mechanism = more rollback ….
• DB physical structure degrades, so does index (eg split blocks)
• Performance decreases over time• Rebuild needed (which interferes with
operations)INDEX_STATS tells you how big the index has grown
Side effects
• Role of DBMS enforce data consistency• A reading process may need an older
version of the data– Need to create a private version of the data=> processes that should be Read Only require
writesAttempt to describe all required operations
in executing a query requiring old data
Solution• To create an older version of data
– Must apply roll back– find old roll back block (I/O)– Roll back index (I/O)– Find data (I/O)– Roll back data– Read old data– Reverse all changes (multiple I/O)
• Significant I/O implications + buffer full of old stuff
Conclusions • Writers and readers DO interfere with each other• The mechanisms used by Oracle to bypass
locking have performance side effects• Performance come with minimising I/O
– i.e. with good access techniques– Precision of data location– Physical proximity of related data (cf: caching)
• However: Techniques to reduce I/O numbers tend to reduce the speed of access!
Indexing example (see figure 3.1):
• Alphabetic search using B tree index for name = Oscar Smith
• Split the table in sections (eg: half) and read until find start beyond letter “S” + go back one
• Do same in branch block• Smith, N is the one• Then look for page with Smith, N in header• Scan for actual entry in index• Read address• Move to table• Read
About Btree indexes• Root and Branch blocks = approx 2% of index
(small)• In frequent hit situations, both blocks loaded in
Data Buffer all the time• Then only 2 I/O may be required:
– One to read leaf block– One to read the table
• In practice, read in index and in table may require reading several blocks– ?
Creating and Using Indexes• Important for live access, • Even more for querying multiple tables• Value matching is costly process in RDB
– No pointers– Connection purely on comparison basis only– One value against all values in joint field
• If link between 2 huge tables, perf is low• All RDBs use some form of indexing• Some complexity involved as index can reduce
physical I/O at the cost of logical I/O (CPU time)
Btree indexing• Creating an index means creating a table with
X+1 columns– X = number of columns in index– Rowid (added field) [table block + row]
• Index is then copied into consecutive blocks• PCTFREE function leaves space for data growth
– high value will generate many leaf blocks but reduce occurrence of split blocks in time
• Pointer is added to previous and next leaf blocks in header of block
Btree indexing (2)• Then branch layer is built:• If index > one block
– Collect all first entries + block address of each leaf block– Write down into the first level branch block (packed)– If branch block is full, initiates second level of branch blocks
etc….• Room is saved in branch blocks:
– No forward and backward pointer in branch blocks– Entries are “trimmed” to the bare minimum– First entries are omitted
• See figure 6.1
Syntax
• CREATE INDEX name ON table name (field1, field2 …)PCTFREE 50;
• utility programmes to assess performance of indexes – eg INDEX_STATS View
Updating indexes
• Index entries are NEVER changed– Marked as deleted and re-inserted
• Space made available cannot be used until after index is re-built
• Inserts that don’t fit split the block (rarely 50/50!)• If a blocks becomes empty, it is marked as free,
but is never removed• Also, blocks never merge automatically
Some problems
• Some situations cannot be addressed with indexes
• e.g. In a FIFO processing situation (e.g. a queue), indexes will prove counterproductive
• Index may grow to stupid proportions even with small error rate (unsuccessful processing of data)
• Every time a transaction is added or processed (deleted) the index must change
Alternative: Bitmap indexing• Imagine following query in huge tableFind customers living in London, with 2 cars and 3
children occupying a 4 bed house• Index not useful – why?
– Too big– If query changes in any way =>new index needed– Maintaining a set of indexes for each query would just
be too costly• Use a bitmap (see table 6.1, 6.2 and 6.3)
Bitmap indexes (2)
• Special for data warehouse type DBs• Build one bitmap for each relevant
parameter• Combine bitmaps using the “and” SQL
keyword• Also possible to use “not”ing of bitmap
(see table 6.4, 6.5)
Key points to remember
• What is the key advantage of a bitmap index?
• What situation does it best suit ?• Bitmaps can also be packed by Oracle
compression features• But size is unpredictable – why?
Example:• Table with 1,000,000 rows• Bitmap on one column that can contain one of 8
different values (e.g. city names)• Data is such that all same city together 125,000
times– Write the bitmap– Imagine what compression can be achieved
• Data is such that cities are in random order, but same number of each– Same questions
solutionFirst scenario• Bitmap for first city: 125,000 ones and 875,000 zeros
[trimmed off]• Size ~ 125,000 bits or approx 18Kbytes• Full bitmap = 156 KbytesSecond scenario:• Bitmap for first city sequences of 1’s and 7 zeros,
repeated 125,000 times• Size ~ 1,000,000 bits or approx 140 Kbytes• Full bitmap = 1.12 Mbytes• But BTree index for such data would be around 12MB
Conclusion• Bitmap indexes work best when combined• They are very quick to build
– Up to a million rows for 10 seconds• Work best when limited number of values + when high
repetitions• Best way to deal with huge volumes => make drastic
selection of interesting rows before reading the table• Warning: one entry in a Bitmap = hundreds of records
= > locking can be crazy (OLTP systems)• => for datawarehouse type applications (no
contention)
What oracle says about it
• Use index for queries with low hit ratio or when queries access < 2 - 4% of data
• Index maintenance is costly so index “just in case” is silly
• Must analyse the type of data when deciding what kind of index
• Do NOT use columns with loads of changes in an index
• Use indexed fields in “where” statement • Can also write queries with NO_INDEX
Administering indexes
• Indexes degrade over time• Should stabilise around 75% efficiency,
but don’t• Run stats:Analyse index NAME validate structureAnalyse index NAME compute statisticsAnalyse index NAME estimate statistics
sample 1 percent
See table 6.6