CS 257 Chapter – 15.9 Summary of Query Execution Database Systems: The Complete Book Krishna...

Date post: 14-Jan-2016
Upload: godwin-hicks
Transcript
Page 1: CS 257 Chapter – 15.9 Summary of Query Execution Database Systems: The Complete Book Krishna Vellanki 124.

CS 257
Chapter – 15.9 Summary of Query Execution

Database Systems: The Complete Book

Krishna Vellanki
124

Introduction

What is a query processor?
◦ The group of DBMS components that converts user queries and data-modification commands into a sequence of database operations.
◦ It also executes those operations.
◦ It must supply the details of how the query is to be executed.

Building Blocks of Query Processing

Query execution: the algorithms that manipulate the data of the database.

Focus on the operations of the extended relational algebra.

Outline of Query Compilation

Query compilation consists of three steps:
◦ Parsing: a parse tree for the query is constructed.
◦ Query rewrite: the parse tree is converted to an initial query plan, which is then transformed into an equivalent logical query plan that is expected to require less time to execute.
◦ Physical plan generation: the logical query plan is converted into a physical query plan by selecting an algorithm for each operator and an order of execution for these operators.

Scanning Tables

One of the most basic things we can do in a physical query plan is to read the entire contents of a relation R. A variation of this operator involves a simple predicate: read only those tuples of R that satisfy the predicate.

Basic approaches to locating the tuples of a relation R:
◦ Table scan: relation R is stored in secondary memory with its tuples arranged in blocks, and it is possible to get the blocks one by one.
◦ Index scan: if there is an index on any attribute of relation R, we can use this index to get all the tuples of R.
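The two access methods can be sketched roughly as follows. This is a minimal in-memory illustration, not how a real DBMS stores data: the toy relation `BLOCKS`, the helper names, and the pointer format `(block number, slot)` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy relation R: a list of blocks, each block a list of tuples.
BLOCKS = [
    [(1, "a"), (2, "b")],
    [(3, "a"), (4, "c")],
    [(5, "b")],
]

def table_scan(blocks, predicate=lambda t: True):
    """Read the blocks one by one and yield the tuples that satisfy the predicate."""
    for block in blocks:
        for tup in block:
            if predicate(tup):
                yield tup

def build_index(blocks, attr):
    """Map each value at tuple position `attr` to (block number, slot) pointers."""
    index = defaultdict(list)
    for b, block in enumerate(blocks):
        for s, tup in enumerate(block):
            index[tup[attr]].append((b, s))
    return index

def index_scan(blocks, index, value=None):
    """Follow index pointers; with value=None, fetch every tuple of R via the index."""
    keys = sorted(index) if value is None else [value]
    for key in keys:
        for b, s in index.get(key, []):
            yield blocks[b][s]
```

With an index on the second attribute, `index_scan(BLOCKS, build_index(BLOCKS, 1), "b")` touches only the tuples whose value is `"b"`, whereas `table_scan` must visit every block.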

Sorting While Scanning Tables

There are a number of reasons to sort a relation while reading it:
◦ The query could include an ORDER BY clause, requiring that a relation be sorted.
◦ Some algorithms that implement relational algebra operations require one or both arguments to be sorted relations.

The physical-query-plan operator sort-scan takes a relation R and the attributes on which the sort is to be made, and produces R in that sorted order.
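As a rough sketch of sort-scan, assuming the whole relation fits in main memory (a relation that does not would need an external, multi-pass sort instead), the operator could look like this. The function name and the block representation are illustrative assumptions.

```python
from operator import itemgetter

def sort_scan(blocks, sort_attrs):
    """Return the tuples of R ordered by the attribute positions in sort_attrs.

    Assumes R fits in memory and sort_attrs is a non-empty list of positions.
    """
    tuples = [t for block in blocks for t in block]
    return sorted(tuples, key=itemgetter(*sort_attrs))
```

For example, `sort_scan(blocks, [1, 0])` orders by the second attribute and breaks ties on the first.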

Parameters for Measuring Costs

Parameters that affect the performance of a query:
◦ The buffer space available in main memory at the time the query is executed.
◦ The size of the input and the size of the output generated.
◦ The size of a block on disk and in main memory.

Notation:
◦ B: the number of blocks needed to hold all the tuples of relation R, also denoted B(R).
◦ T: the number of tuples in relation R, also denoted T(R).
◦ V: the number of distinct values that appear in a column of relation R; V(R, a) is the number of distinct values of column a in relation R.
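To make the notation concrete, the three statistics can be computed for a small made-up relation. The relation `R_BLOCKS` and the function `relation_stats` are hypothetical names used only for this illustration.

```python
# Hypothetical relation R stored in 3 blocks; each tuple is (id, a).
R_BLOCKS = [
    [(1, "x"), (2, "y")],
    [(3, "x"), (4, "z")],
    [(5, "y")],
]

def relation_stats(blocks, attr):
    """Return (B(R), T(R), V(R, a)) where a is the column at position attr."""
    B = len(blocks)                                   # number of blocks
    T = sum(len(block) for block in blocks)           # number of tuples
    V = len({t[attr] for block in blocks for t in block})  # distinct values of a
    return B, T, V
```

Here `relation_stats(R_BLOCKS, 1)` gives B(R) = 3, T(R) = 5, and V(R, a) = 3.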

One-Pass Algorithms for Database Operations

The choice of an algorithm for each operator is an essential part of the process of transforming a logical query plan into a physical query plan.

Main classes of algorithms:
◦ Sorting-based methods
◦ Hash-based methods
◦ Index-based methods

Division based on degree of difficulty and cost:
◦ 1-pass algorithms
◦ 2-pass algorithms
◦ 3 or more pass algorithms

One-Pass Algorithm Methods

1. One-pass algorithms for tuple-at-a-time operations: selection and projection

2. One-pass algorithms for unary, full-relation operations: duplicate elimination and grouping

3. One-pass algorithms for binary, full-relation operations: union, intersection, difference, product, and join
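As one example of a unary, full-relation operation, a one-pass duplicate elimination can be sketched as below. It is only a sketch under the assumption that the set of distinct tuples fits in main memory; the function name is illustrative.

```python
def one_pass_distinct(blocks):
    """Read R one block at a time, emitting each tuple the first time it is seen.

    Assumes the distinct tuples of R fit in the in-memory set `seen`.
    """
    seen = set()
    for block in blocks:
        for tup in block:
            if tup not in seen:
                seen.add(tup)
                yield tup
```

Each block is read exactly once, which is what makes this a one-pass algorithm; the memory cost is the size of the result rather than the size of R.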

Nested Loop Joins

Can be used for relations of any size; it is not necessary that either relation fit in main memory.

Uses a “one-and-a-half” pass method, in which for each variation:
◦ One argument is read just once.
◦ The other argument is read repeatedly.

Two kinds:
◦ Tuple-based nested-loop join
◦ Block-based nested-loop join
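The tuple-based variant is the simplest to write down. The following is a minimal sketch of an equijoin over in-memory lists; the function name and the use of attribute positions are assumptions for illustration.

```python
def tuple_nested_loop_join(S, R, s_attr, r_attr):
    """Join S and R on S[s_attr] == R[r_attr], one tuple pair at a time."""
    out = []
    for s in S:            # outer argument: each tuple is read just once
        for r in R:        # inner argument: rescanned for every tuple of S
            if s[s_attr] == r[r_attr]:
                out.append(s + r)
    return out
```

If each tuple access were a disk read, this would cost on the order of T(S) × T(R) accesses, which is the motivation for the block-based variant.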

Improvement & Modification

To decrease the cost:

Method 1: Use an algorithm for index-based joins
◦ We find the tuples of R that match a given tuple of S.
◦ We need not read the entire relation R.

Method 2: Use an algorithm for block-based joins
◦ The tuples of R and S are organized by blocks.
◦ Uses as much main memory as it can to hold blocks, in order to reduce the number of disk I/O’s.
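A block-based nested-loop join might be sketched as follows, counting simulated disk I/O's. The assumptions here are that S is the smaller (outer) relation, that M main-memory buffers are available with M − 1 of them holding a chunk of S and one holding the current block of R, and that the names are illustrative only.

```python
def block_nested_loop_join(S_blocks, R_blocks, s_attr, r_attr, M):
    """Return (join result, number of simulated block reads).

    Reads S in chunks of M - 1 blocks; for each chunk, scans R once.
    """
    if M < 2:
        raise ValueError("need at least 2 buffers")
    out, io = [], 0
    chunk = M - 1
    for i in range(0, len(S_blocks), chunk):
        group = S_blocks[i:i + chunk]
        io += len(group)                       # read a chunk of S
        table = {}                             # in-memory lookup on the join attribute
        for block in group:
            for s in block:
                table.setdefault(s[s_attr], []).append(s)
        for r_block in R_blocks:
            io += 1                            # read one block of R
            for r in r_block:
                for s in table.get(r[r_attr], []):
                    out.append(s + r)
    return out, io
```

Under these assumptions the read count is B(S) + ⌈B(S)/(M − 1)⌉ · B(R), so adding buffers reduces the number of times R is rescanned.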

Physically Unrealizable Behaviors

Read too late

Transaction T tries to read X too late: X has already been written by a transaction with a later timestamp than T.

Write too late

Transaction T tries to write X too late: X has already been read by a transaction with a later timestamp than T.
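The two "too late" conditions can be expressed as simple timestamp comparisons. This is a sketch only: `ts_t` stands for the timestamp of T, and `wt_x` and `rt_x` stand for the write time and read time recorded for element X; these names are assumptions for illustration, and commit bits and rollback handling are left out.

```python
def check_read(ts_t, wt_x):
    """T may read X only if no later-timestamped transaction has already written X."""
    return "ok" if ts_t >= wt_x else "read too late"

def check_write(ts_t, rt_x):
    """T may write X only if no later-timestamped transaction has already read X."""
    return "ok" if ts_t >= rt_x else "write too late"
```

For instance, a transaction with timestamp 5 that tries to read an element last written at time 8 gets `"read too late"`.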

Problem with dirty data

T could perform a dirty read if it reads X after X was written by a transaction that has not yet committed.

A write is cancelled because of a write with a later timestamp, but the writer then aborts

Timestamps vs. Locks

Timestamps:
• Superior if most transactions are read-only, or if it is rare that concurrent transactions will read or write the same element.
• In high-conflict situations, rollbacks will be frequent, introducing more delays than a locking system.

Locks:
• Superior in high-conflict situations.
• Frequently delay transactions as they wait for locks.

Two-Pass Algorithms Based on Hashing

Hashing is used when the data is too big to fit in main-memory buffers.

◦ Hash all the tuples of the argument(s) using an appropriate hash key.
◦ For all the common operations, there is a way to select the hash key so that all the tuples that need to be considered together when we perform the operation have the same hash value.
◦ This reduces the size of the operand(s) by a factor equal to the number of buckets.
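The partitioning pass can be sketched as below. In this illustration the buckets are in-memory lists, whereas in a real two-pass algorithm each bucket would be written out to disk; the function name and the use of Python's built-in `hash` are assumptions made for the example.

```python
def partition(tuples, key_attr, num_buckets):
    """Distribute tuples into buckets by hashing the attribute at position key_attr."""
    buckets = [[] for _ in range(num_buckets)]
    for tup in tuples:
        buckets[hash(tup[key_attr]) % num_buckets].append(tup)
    return buckets
```

Tuples with equal key values always land in the same bucket, so the second pass can process one bucket at a time, each roughly 1/num_buckets the size of the original operand if the hash spreads values evenly.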

Operations Handled by Two-Pass Algorithms Based on Hashing

• Duplicate elimination

• Grouping and aggregation

• Union, intersection, and difference

• Hash-join algorithm
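Taking the hash join as a representative of these operations, the two passes might be sketched as follows. Both relations are partitioned on the join attribute with the same hash function, then corresponding buckets are joined one pair at a time. Buckets are kept in memory here for brevity, and all names are illustrative assumptions.

```python
def partition(tuples, key_attr, num_buckets):
    """Pass 1: hash tuples into buckets on the attribute at position key_attr."""
    buckets = [[] for _ in range(num_buckets)]
    for tup in tuples:
        buckets[hash(tup[key_attr]) % num_buckets].append(tup)
    return buckets

def hash_join(R, S, r_attr, s_attr, num_buckets):
    """Pass 2: join each pair of corresponding buckets with a one-pass join."""
    out = []
    r_buckets = partition(R, r_attr, num_buckets)
    s_buckets = partition(S, s_attr, num_buckets)
    for r_bucket, s_bucket in zip(r_buckets, s_buckets):
        table = {}                         # build side: one bucket of S
        for s in s_bucket:
            table.setdefault(s[s_attr], []).append(s)
        for r in r_bucket:                 # probe side: matching bucket of R
            for s in table.get(r[r_attr], []):
                out.append(r + s)
    return out
```

Because matching tuples always hash to the same bucket number, no pair of tuples from different buckets ever needs to be compared.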

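Sort-Based vs. Hash-Based

◦ For binary operations, the memory requirement of hash-based algorithms depends only on the size of the smaller argument, not on the sum of the argument sizes as for sort-based algorithms.
◦ Sort-based algorithms can produce output in sorted order, which can be helpful to later operators.
◦ Hash-based algorithms depend on the buckets being of roughly equal size.
◦ Sort-based algorithms can experience reduced rotational latency or seek time, for example when sorted sublists are written to consecutive blocks of the disk.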

15.6 Index-Based Algorithms

◦ Clustered relation: the tuples are packed into roughly as few blocks as can possibly hold those tuples.
◦ Clustering index: an index on an attribute (or attributes) such that all the tuples with a fixed value for the search key of this index appear on roughly as few blocks as can hold them.
◦ A relation that isn’t clustered cannot have a clustering index.
◦ A clustered relation can have nonclustering indexes.
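The practical effect of the distinction shows up in the usual rough cost estimate for an index-based selection on a = v: about B(R)/V(R, a) disk I/O's with a clustering index versus about T(R)/V(R, a) with a nonclustering one. The sketch below applies those estimates; it ignores the cost of reading the index itself and assumes values are evenly distributed, and the function name and numbers are illustrative.

```python
from math import ceil

def index_selection_cost(B, T, V, clustering):
    """Estimated disk I/O's to retrieve the tuples of R with a = v via an index.

    B = B(R), T = T(R), V = V(R, a). Index-block reads are ignored.
    """
    return ceil(B / V) if clustering else ceil(T / V)
```

With B(R) = 1000, T(R) = 20000, and V(R, a) = 100, the estimate is 10 I/O's for a clustering index and 200 for a nonclustering index, since in the latter case each matching tuple may sit on a different block.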

Thank You..!!