Chapter15

Date post: 22-May-2015
Upload: gourab87
Description:
Navathe Database Management System
Transcript
Page 1: Chapter15

Elmasri/Navathe, Fundamentals of Database Systems, 4th Edition

Page 2: Chapter15

Algorithms for Query Processing And Optimization

Chapter 15

Page 3: Chapter15

Chapter Outline

• Translating SQL Queries Into Relational Algebra
• Algorithms For External Sorting
• Algorithms For SELECT and JOIN Operations
• Algorithms For PROJECT and SET Operations
• Implementing Aggregate Operations and OUTER JOINS
• Using Heuristics in Query Optimization
• Using Selectivity and Cost Estimates in Query Optimization
• Overview of Query Optimization in ORACLE
• Semantic Query Optimization

Page 4: Chapter15

Figure 15.1: Typical steps when processing a high-level query

Page 5: Chapter15

Translating SQL Queries Into Relational Algebra

A query block contains a single SELECT-FROM-WHERE expression, plus GROUP BY and HAVING clauses if these are part of the block.

Nested queries within a query are not part of the enclosing block; each one is identified as a separate query block.

SQL queries are first decomposed into query blocks, which are then translated into equivalent extended relational algebra expressions.

Page 6: Chapter15

Translating SQL Queries Into Relational Algebra

For example, the following compound query

    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE SALARY > (SELECT MAX(SALARY)
                    FROM EMPLOYEE
                    WHERE DNO=5);

can be decomposed into two blocks:

    SELECT FNAME, LNAME
    FROM EMPLOYEE
    WHERE SALARY > c

and

    SELECT MAX(SALARY)
    FROM EMPLOYEE
    WHERE DNO=5

where ‘c’ is the result returned from the inner query block.

Page 7: Chapter15

Translating SQL Queries Into Relational Algebra

The inner query block (which needs to be calculated first) could be translated into the expression

ℱ <MAX SALARY> (σ <DNO=5> (EMPLOYEE))

and the outer block into the expression

π <FNAME, LNAME> (σ <SALARY > c> (EMPLOYEE))

The query optimizer would then choose an execution plan for each block.
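
The two blocks above can be evaluated one after the other, as in the following minimal Python sketch. The sample rows and the function names `inner_block` and `outer_block` are illustrative assumptions, not part of the slides; only the attribute names come from the EMPLOYEE relation used in the example.

```python
# Hypothetical sample rows; attribute names follow the slide's EMPLOYEE relation.
EMPLOYEE = [
    {"FNAME": "John", "LNAME": "Smith", "SALARY": 30000, "DNO": 5},
    {"FNAME": "Franklin", "LNAME": "Wong", "SALARY": 40000, "DNO": 5},
    {"FNAME": "Jennifer", "LNAME": "Wallace", "SALARY": 43000, "DNO": 4},
    {"FNAME": "James", "LNAME": "Borg", "SALARY": 55000, "DNO": 1},
]


def inner_block(employee):
    """MAX SALARY over the rows selected by DNO=5; yields the constant c."""
    return max(row["SALARY"] for row in employee if row["DNO"] == 5)


def outer_block(employee, c):
    """Select rows with SALARY > c, then project FNAME and LNAME."""
    return [(row["FNAME"], row["LNAME"]) for row in employee if row["SALARY"] > c]


# The inner block is computed once; its result is then a plain constant
# in the selection condition of the outer block.
c = inner_block(EMPLOYEE)
result = outer_block(EMPLOYEE, c)
```

Because the inner block does not refer to any attribute of the outer block, it only has to be evaluated once.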

Page 8: Chapter15

Algorithms For External Sorting

The result of an SQL query containing an ORDER BY clause needs to be sorted.

Internal sorting algorithms are suitable if all of the data fits in main memory.

External sorting is suitable for large files of records that do not fit entirely in main memory.

External sorting consists of two main phases:
• Sorting phase – sort portions of the file
• Merging phase – merge the sorted portions
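
The two phases can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name `external_sort` and the `run_size` parameter (standing in for the amount of data that fits in memory) are hypothetical, records are plain integers stored one per line, and all runs are merged in a single pass, whereas a real implementation may need several merge passes when there are more runs than buffers.

```python
import heapq
import os
import tempfile


def external_sort(values, run_size):
    """Sort integers using sorted runs on disk followed by a k-way merge."""
    run_paths = []

    def flush(run):
        # Sorting phase: sort one memory-sized portion and write it as a run.
        run.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in run)
        run_paths.append(path)

    run = []
    for v in values:
        run.append(v)
        if len(run) == run_size:
            flush(run)
            run = []
    if run:
        flush(run)

    # Merging phase: stream the runs back and merge them; only one record
    # per run needs to be held in memory at a time.
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        return list(heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

For example, `external_sort([5, 3, 9, 1, 7, 2, 8], run_size=3)` writes three runs and merges them. The result is collected into a list only to keep the sketch short; a real merge phase would write the output back to disk.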

Page 9: Chapter15

Algorithms For SELECT and JOIN Operations

Search methods for simple selection (file scans):
• Linear search (brute force)
• Binary search
• Using a primary index (or hash key) to retrieve a single record – e.g., σ <SSN=‘123456789’> (EMPLOYEE)
• Using a primary index to retrieve multiple records – e.g., σ <DNUMBER > 5> (DEPARTMENT)
• Using a clustering index to retrieve multiple records – e.g., σ <DNO=5> (EMPLOYEE)
• Using a secondary (B+-tree) index on an equality comparison

Linear search is applicable to any file, but all the other searches above imply having an appropriate access path on the relevant attributes.
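
The difference between a search that works on any file and one that needs an access path can be sketched as follows. The sample rows and all function names are illustrative assumptions; a Python dict stands in for a hash key, and the list is assumed to be stored in SSN order so that binary search is valid.

```python
from bisect import bisect_left

# Hypothetical file of records, stored in order of the key attribute SSN.
EMPLOYEE = [
    {"SSN": "123456789", "LNAME": "Smith", "DNO": 5},
    {"SSN": "333445555", "LNAME": "Wong", "DNO": 5},
    {"SSN": "888665555", "LNAME": "Borg", "DNO": 1},
    {"SSN": "987654321", "LNAME": "Wallace", "DNO": 4},
]


def linear_search(records, attr, value):
    """Works on any file: test every record against the condition."""
    return [r for r in records if r[attr] == value]


def binary_search(records, key_attr, value):
    """Requires the file to be ordered on key_attr; returns a record or None."""
    # The key list is rebuilt here only to keep the sketch short.
    keys = [r[key_attr] for r in records]
    i = bisect_left(keys, value)
    if i < len(keys) and keys[i] == value:
        return records[i]
    return None


def build_hash_index(records, key_attr):
    """A dict standing in for a hash key access path on key_attr."""
    return {r[key_attr]: r for r in records}


ssn_index = build_hash_index(EMPLOYEE, "SSN")
```

With the index built, `ssn_index["123456789"]` reaches the record directly, while `linear_search(EMPLOYEE, "DNO", 5)` has to visit every record.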

Page 10: Chapter15

Algorithms For SELECT and JOIN Operations

Search methods for complex selection (when the SELECT condition is a conjunctive condition – e.g., σ <DNO=5 AND SEX=‘F’> (EMPLOYEE)):
• Conjunctive selection using an individual index
• Conjunctive selection using a composite index
• Conjunctive selection by intersection of record pointers

Selectivity is the ratio of the number of records (tuples) that satisfy the condition to the total number of records in the file (relation).

The optimizer applies the selection conditions with the smallest selectivity first, since they retrieve the fewest records.
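
The selectivity definition above can be illustrated with a short Python sketch. The sample rows and the names `selectivity`, `conditions`, and `conjunctive_select` are assumptions made for illustration. Note that the sketch measures selectivity by scanning the data, which is only for demonstration; an optimizer would normally estimate it from stored statistics.

```python
# Hypothetical sample rows for the EMPLOYEE relation.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5, "SEX": "M"},
    {"LNAME": "Wong", "DNO": 5, "SEX": "M"},
    {"LNAME": "English", "DNO": 5, "SEX": "F"},
    {"LNAME": "Wallace", "DNO": 4, "SEX": "F"},
    {"LNAME": "Zelaya", "DNO": 4, "SEX": "F"},
    {"LNAME": "Borg", "DNO": 1, "SEX": "M"},
    {"LNAME": "Jabbar", "DNO": 4, "SEX": "M"},
    {"LNAME": "Narayan", "DNO": 5, "SEX": "M"},
]


def selectivity(records, predicate):
    """Fraction of records that satisfy the predicate."""
    return sum(1 for r in records if predicate(r)) / len(records)


conditions = {
    "DNO=5": lambda r: r["DNO"] == 5,
    "SEX='F'": lambda r: r["SEX"] == "F",
}

# Order the simple conditions from most to least restrictive.
ranked = sorted(conditions, key=lambda name: selectivity(EMPLOYEE, conditions[name]))


def conjunctive_select(records, conditions, order):
    """Apply the conditions one at a time in the given order."""
    result = records
    for name in order:
        result = [r for r in result if conditions[name](r)]
    return result


matches = conjunctive_select(EMPLOYEE, conditions, ranked)
```

Here `SEX='F'` has selectivity 3/8 and `DNO=5` has selectivity 4/8, so the former is applied first and the second condition only has to be checked against three records.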

Page 11: Chapter15

Implementing the JOIN Operation

Methods for implementing joins:
• Nested-loop join (brute force) – for each record in the outer loop, retrieve every record from the inner loop and test whether the two records satisfy the join condition
• Single-loop join (using an access structure to retrieve the matching records) – retrieve each record from one relation, one at a time, and use the access structure (index or hash key) on the other relation to retrieve all of its matching records
• Sort-merge join – applicable if both relations are sorted by value of the join attribute
• Hash-join – the records of both relations are hashed to the same hash file, using the same hashing function on the join attribute
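
Three of the join methods above can be sketched in Python on in-memory lists. This is an illustration under stated assumptions rather than a disk-based implementation: the function names and sample rows are hypothetical, a dict plays the role of the hash file, and `sort_merge_join` sorts its inputs itself instead of assuming they are already sorted.

```python
from collections import defaultdict


def nested_loop_join(R, S, a, b):
    """For each record of R, test every record of S."""
    return [(r, s) for r in R for s in S if r[a] == s[b]]


def hash_join(R, S, a, b):
    """Hash R on the join attribute, then probe with each record of S."""
    buckets = defaultdict(list)
    for r in R:
        buckets[r[a]].append(r)
    return [(r, s) for s in S for r in buckets.get(s[b], [])]


def sort_merge_join(R, S, a, b):
    """Sort both inputs on the join attribute and merge them in one pass."""
    R = sorted(R, key=lambda r: r[a])
    S = sorted(S, key=lambda s: s[b])
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i][a] < S[j][b]:
            i += 1
        elif R[i][a] > S[j][b]:
            j += 1
        else:
            # Collect the group of equal values on each side and pair them up.
            v = R[i][a]
            i_end = i
            while i_end < len(R) and R[i_end][a] == v:
                i_end += 1
            j_end = j
            while j_end < len(S) and S[j_end][b] == v:
                j_end += 1
            out.extend((r, s) for r in R[i:i_end] for s in S[j:j_end])
            i, j = i_end, j_end
    return out


# Hypothetical sample rows.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5},
    {"LNAME": "Wong", "DNO": 5},
    {"LNAME": "Wallace", "DNO": 4},
    {"LNAME": "Borg", "DNO": 1},
]
DEPARTMENT = [
    {"DNAME": "Research", "DNUMBER": 5},
    {"DNAME": "Administration", "DNUMBER": 4},
    {"DNAME": "Headquarters", "DNUMBER": 1},
]


def names(pairs):
    """Reduce joined pairs to sorted (LNAME, DNAME) tuples for comparison."""
    return sorted((e["LNAME"], d["DNAME"]) for e, d in pairs)
```

All three functions return the same set of pairs for `EMPLOYEE` and `DEPARTMENT` joined on `DNO = DNUMBER`; they differ only in how many record comparisons they need to get there.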

Page 12: Chapter15

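Algorithms For PROJECT and SET Operations

Implementation of a PROJECT operation is straightforward if <attribute list> includes a key of the relation, because the result then cannot contain duplicate tuples.

If <attribute list> does not include a key, duplicate tuples must be eliminated, typically by sorting the result (hashing can also be used).

Implementation of {∪, ∩, -} can be done by using variations of the sort-merge or hashing techniques.

Implementation of the CARTESIAN PRODUCT is expensive, since its result contains one record for every combination of records from the two relations.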
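
A minimal Python sketch of sort-based duplicate elimination and of a sort-merge style intersection is shown below. The function names `project` and `intersect` and the sample rows are assumptions made for illustration, and `intersect` assumes that each input is itself free of duplicates.

```python
def project(records, attrs):
    """Project onto attrs and eliminate duplicates by sorting."""
    tuples = sorted(tuple(r[a] for a in attrs) for r in records)
    result = []
    for t in tuples:
        # After sorting, duplicate tuples are adjacent, so comparing with
        # the last tuple kept is enough to detect them.
        if not result or result[-1] != t:
            result.append(t)
    return result


def intersect(R, S):
    """Intersection of two duplicate-free lists of tuples by sort and merge."""
    R, S = sorted(R), sorted(S)
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i] < S[j]:
            i += 1
        elif R[i] > S[j]:
            j += 1
        else:
            out.append(R[i])
            i += 1
            j += 1
    return out


# Hypothetical sample rows; (DNO, SEX) is not a key, so duplicates appear.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5, "SEX": "M"},
    {"LNAME": "Wong", "DNO": 5, "SEX": "M"},
    {"LNAME": "English", "DNO": 5, "SEX": "F"},
    {"LNAME": "Wallace", "DNO": 4, "SEX": "F"},
    {"LNAME": "Zelaya", "DNO": 4, "SEX": "F"},
    {"LNAME": "Borg", "DNO": 1, "SEX": "M"},
]

distinct = project(EMPLOYEE, ["DNO", "SEX"])
```

Union and difference follow the same merge pattern as `intersect`, differing only in which tuples are kept when the two current tuples are unequal.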

Page 13: Chapter15

Aggregate Operations and OUTER JOINS

The aggregate operations MIN, MAX, COUNT, AVERAGE, and SUM can be computed by scanning all of the records (the worst case).

If an index exists on the attribute of a MAX or MIN operation, the operation can be carried out much more efficiently by using the index (if the index is dense, it can also be used for COUNT, AVERAGE, and SUM).

When a GROUP BY clause is used in a query, the aggregate function must be applied separately to each group of tuples.
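
The following Python sketch illustrates grouping followed by per-group aggregation, and how an ordered dense index makes MIN and MAX cheap. The names `group_aggregate` and `salary_index` and the sample rows are hypothetical; a sorted list of values stands in for a dense index with one entry per record.

```python
from collections import defaultdict

# Hypothetical sample rows.
EMPLOYEE = [
    {"LNAME": "Smith", "DNO": 5, "SALARY": 30000},
    {"LNAME": "Wong", "DNO": 5, "SALARY": 40000},
    {"LNAME": "Wallace", "DNO": 4, "SALARY": 43000},
    {"LNAME": "Borg", "DNO": 1, "SALARY": 55000},
]


def group_aggregate(records, group_attr, agg_attr):
    """Partition the records by group_attr, then aggregate each group."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_attr]].append(r[agg_attr])
    return {
        g: {
            "COUNT": len(v),
            "SUM": sum(v),
            "AVERAGE": sum(v) / len(v),
            "MIN": min(v),
            "MAX": max(v),
        }
        for g, v in groups.items()
    }


by_dept = group_aggregate(EMPLOYEE, "DNO", "SALARY")

# A sorted list of SALARY values stands in for a dense, ordered index:
# MIN and MAX are its first and last entries, with no scan of the data file.
salary_index = sorted(r["SALARY"] for r in EMPLOYEE)
min_salary, max_salary = salary_index[0], salary_index[-1]
# Because the index is dense (one entry per record), COUNT and SUM can
# also be computed from the index entries alone.
count, total = len(salary_index), sum(salary_index)
```

Grouping here uses hashing on the grouping attribute; sorting on that attribute would work equally well for forming the groups.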

Page 14: Chapter15

Using Heuristics in Query Optimization

The parser first generates an initial internal representation of the query, which is then optimized using heuristic rules.

One of the main heuristic rules is to apply the unary operations ‘σ’ and ‘π’ before ⋈ or other binary operations.

A query tree is a tree data structure that represents the input relations of the query as leaf nodes and the relational algebra operations as internal nodes.

Page 15: Chapter15

FIGURE 15.4 (b) Initial (canonical) query tree for the SQL query

    SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE
    FROM PROJECT, DEPARTMENT, EMPLOYEE
    WHERE DNUM=DNUMBER AND MGRSSN=SSN AND PLOCATION=‘Stafford’

Page 16: Chapter15

FIGURE 15.4 (a) Query tree corresponding to the relational algebra expression for the SQL query

    SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE
    FROM PROJECT, DEPARTMENT, EMPLOYEE
    WHERE DNUM=DNUMBER AND MGRSSN=SSN AND PLOCATION=‘Stafford’

Page 17: Chapter15

Using Heuristics in Query Optimization

Execution of the query tree:

1. Execute an internal node operation whenever its operands are available, and then replace the internal node by the relation that results from executing the operation.

2. Repeat step 1 until the root node is executed, at which point execution terminates and the root produces the result relation for the query.

A query tree represents a specific order of operations. A more neutral representation of a query is the query graph notation, which does not indicate an order in which to perform the operations.
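
The execution rule for a query tree can be sketched as a recursive, bottom-up evaluation in Python. The `Node` class, the operation names, and the sample rows are all assumptions made for illustration; the tree built at the end corresponds to the ‘Stafford’ query used in Figure 15.4.

```python
from dataclasses import dataclass


@dataclass
class Node:
    op: str               # "relation", "select", "project", or "join"
    children: tuple = ()  # operand subtrees
    arg: object = None    # rows, predicate, attribute list, or join predicate


def execute(node):
    """Evaluate operands first, then replace the node by its result relation."""
    if node.op == "relation":
        return node.arg
    inputs = [execute(child) for child in node.children]
    if node.op == "select":
        return [r for r in inputs[0] if node.arg(r)]
    if node.op == "project":
        return [{a: r[a] for a in node.arg} for r in inputs[0]]
    if node.op == "join":
        return [{**l, **r} for l in inputs[0] for r in inputs[1] if node.arg(l, r)]
    raise ValueError(f"unknown operation: {node.op}")


# Hypothetical sample rows.
PROJECT = [
    {"PNUMBER": 10, "PLOCATION": "Stafford", "DNUM": 4},
    {"PNUMBER": 1, "PLOCATION": "Bellaire", "DNUM": 5},
]
DEPARTMENT = [
    {"DNUMBER": 4, "MGRSSN": "987654321"},
    {"DNUMBER": 5, "MGRSSN": "333445555"},
]
EMPLOYEE = [
    {"SSN": "987654321", "LNAME": "Wallace", "ADDRESS": "Bellaire, TX", "BDATE": "1941-06-20"},
    {"SSN": "333445555", "LNAME": "Wong", "ADDRESS": "Houston, TX", "BDATE": "1955-12-08"},
]

stafford = Node("select", (Node("relation", arg=PROJECT),),
                lambda r: r["PLOCATION"] == "Stafford")
with_dept = Node("join", (stafford, Node("relation", arg=DEPARTMENT)),
                 lambda p, d: p["DNUM"] == d["DNUMBER"])
with_mgr = Node("join", (with_dept, Node("relation", arg=EMPLOYEE)),
                lambda pd, e: pd["MGRSSN"] == e["SSN"])
root = Node("project", (with_mgr,),
            ["PNUMBER", "DNUM", "LNAME", "ADDRESS", "BDATE"])

result = execute(root)
```

Recursion makes the "operands available" condition implicit: a node's operation runs only after `execute` has returned for each of its children, and the call on the root returns the result relation.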

Page 18: Chapter15

FIGURE 15.4 (c) Query graph corresponding to the SQL query

    SELECT PNUMBER, DNUM, LNAME, ADDRESS, BDATE
    FROM PROJECT, DEPARTMENT, EMPLOYEE
    WHERE DNUM=DNUMBER AND MGRSSN=SSN AND PLOCATION=‘Stafford’

Page 19: Chapter15

Using Heuristics in Query Optimization

Example of Transforming a Query:

Consider the query Q: “Find the last names of employees born after 1957 who work on a project named ‘Aquarius’.”

In SQL, this query can be specified as:

    SELECT LNAME
    FROM EMPLOYEE, WORKS_ON, PROJECT
    WHERE PNAME=‘Aquarius’ AND ESSN=SSN
          AND PNUMBER=PNO AND BDATE > ‘1957-12-31’;

Page 20: Chapter15

FIGURE 15.5 Steps in converting a query tree during heuristic optimization.

FIGURE 15.5 (a) Initial (canonical) query tree for SQL query Q.

Page 21: Chapter15

FIGURE 15.5 (b) Moving SELECT operations down the query tree.

Page 22: Chapter15

FIGURE 15.5 (c) Applying the more restrictive SELECT operation first.

Page 23: Chapter15

FIGURE 15.5 (d) Replacing CARTESIAN PRODUCT and SELECT with JOIN operations.

Page 24: Chapter15

FIGURE 15.5 (e) Moving PROJECT operations down the query tree.

Page 25: Chapter15

Outline of a Heuristic Algebraic Optimization Algorithm

1. Break up any conjunctive selection condition into a cascade of selections, that is, σ<c1 AND c2 AND … AND cn> (R) ≡ σ<c1> (σ<c2> (… (σ<cn> (R)) …))

2. Move each ‘σ’ operation as far down the query tree as is permitted by the attributes involved in the ‘σ’ condition.

3. Rearrange the leaf nodes of the tree using the following criteria:

• Position the leaf node relations with the most restrictive σ operations so that they are executed first.

• Make sure that the ordering of leaf nodes does not cause CARTESIAN PRODUCT operations.

4. Combine a CARTESIAN PRODUCT (×) with a subsequent σ in the tree into a ⋈ when the selection condition represents a join condition.

5. Break down and move lists of projection attributes down the tree as far as possible.

6. Identify subtrees that represent groups of operations that can be executed by a single algorithm.
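
The effect of pushing a selection below a CARTESIAN PRODUCT and combining the remaining condition into a join can be sketched in Python on a two-relation fragment of query Q. The sample rows and the function names `canonical_plan` and `optimized_plan` are assumptions made for illustration; each function also reports how many tuples it materializes before producing the final result.

```python
from itertools import product

# Hypothetical sample rows.
PROJECT = [
    {"PNUMBER": 1, "PNAME": "ProductX"},
    {"PNUMBER": 2, "PNAME": "Aquarius"},
    {"PNUMBER": 3, "PNAME": "ProductZ"},
]
WORKS_ON = [
    {"ESSN": "111", "PNO": 1},
    {"ESSN": "222", "PNO": 2},
    {"ESSN": "333", "PNO": 2},
    {"ESSN": "444", "PNO": 3},
]


def canonical_plan():
    """Select on PNAME and PNUMBER=PNO after a full CARTESIAN PRODUCT."""
    intermediate = [{**p, **w} for p, w in product(PROJECT, WORKS_ON)]
    result = [t for t in intermediate
              if t["PNAME"] == "Aquarius" and t["PNUMBER"] == t["PNO"]]
    return result, len(intermediate)


def optimized_plan():
    """Select on PNAME first, then join the survivors on PNUMBER=PNO."""
    selected = [p for p in PROJECT if p["PNAME"] == "Aquarius"]
    result = [{**p, **w} for p in selected for w in WORKS_ON
              if p["PNUMBER"] == w["PNO"]]
    return result, len(selected)


canonical_result, canonical_size = canonical_plan()
optimized_result, optimized_size = optimized_plan()
```

Both plans return the same two tuples, but the canonical plan materializes all 12 combinations first, while the optimized plan carries a single PROJECT tuple into the join.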

Page 26: Chapter15

Converting Query Trees into Query Execution Plans

π<FNAME, LNAME, ADDRESS> (σ<DNAME=‘Research’> (DEPARTMENT) ⋈<DNUMBER=DNO> (EMPLOYEE))

Page 27: Chapter15

Using Selectivity and Cost in Query Optimization

A query optimizer should also estimate and compare the costs of executing a query using different strategies.

Cost-based query optimization considers:
• Access cost to secondary storage
• Storage cost
• Computation cost
• Memory usage cost
• Communication cost

Page 28: Chapter15

Overview of Query Optimization in ORACLE

The ORACLE DBMS provides two different approaches to query optimization:

1. Rule-based optimization: the optimizer chooses execution plans based on heuristically ranked operations.

2. Cost-based optimization: the optimizer examines access paths and operator algorithms and chooses the execution plan with the lowest estimated cost.

An additional feature of the ORACLE query optimizer is the capability for an application developer to specify hints to the optimizer.

Page 29: Chapter15

Semantic Query Optimization

An alternative approach to query optimization is semantic query optimization, which uses constraints specified on the database schema.

Consider the query “Find the names of employees who earn more than their supervisors”. If a constraint on the database states that ‘no employee can earn more than their supervisor’, then the semantic query optimizer will not execute the query at all, because it knows the result will be empty.

The drawback of this technique is that searching through many constraints is very time-consuming.
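
The supervisor example can be sketched in Python as a check performed before execution. Everything here is a simplifying assumption: conditions and constraints are represented as (left, operator, right) triples, the attribute names `E.SALARY` and `S.SALARY` are hypothetical, and only direct contradictions between one query condition and one constraint are detected.

```python
# For each comparison in a query, the constraint operator that makes it
# unsatisfiable, e.g. "a > b" can never hold if "a <= b" is guaranteed.
CONTRADICTED_BY = {">": "<=", "<": ">=", ">=": "<", "<=": ">"}


def semantic_optimize(query_condition, constraints, execute):
    """Skip execution when a schema constraint contradicts the query condition."""
    left, op, right = query_condition
    if (left, CONTRADICTED_BY.get(op), right) in constraints:
        return []  # the result is known to be empty without touching the data
    return execute()


# Constraint: no employee earns more than their supervisor.
constraints = {("E.SALARY", "<=", "S.SALARY")}

executions = []


def run_query():
    """Stand-in for the real query executor; records that it was called."""
    executions.append("ran")
    return [("Smith",)]


# Query condition: employees who earn more than their supervisor.
skipped = semantic_optimize(("E.SALARY", ">", "S.SALARY"), constraints, run_query)
# A condition the constraint does not contradict is executed normally.
executed = semantic_optimize(("E.SALARY", "<", "S.SALARY"), constraints, run_query)
```

The set lookup is cheap here only because the constraints are stored in exactly the form being tested; with many constraints of arbitrary form, the search becomes the time-consuming step noted above.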

