
CS 4432query processing1 CS4432: Database Systems II.

Date post: 21-Dec-2015
CS 4432 query processing 1

CS4432: Database Systems II

Query Processing

Query in SQL → Query Plan in Algebra

Example

Data: relation R (A, B, C) relation S (C, D, E)

Query: SELECT B, D
       FROM R, S
       WHERE R.A = “c” and S.E = 2 and R.C = S.C

R                S
A  B  C          C   D  E
a  1  10         10  x  2
b  1  20         20  y  2
c  2  10         30  z  2
d  2  35         40  x  1
e  3  45         50  y  3

Answer
B  D
2  x

SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C

One idea

• How do we execute query?

- Form Cartesian product of all tables in FROM-clause
- Select tuples that match WHERE-clause
- Project columns that occur in SELECT-clause
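The naive strategy (Cartesian product, then filter, then project) can be sketched in a few lines of Python. This is only an illustration, not how a DBMS is implemented: the tuple layout and the name `naive_plan` are assumptions, and the data are the R and S instances from the example.

```python
from itertools import product

# Example instances: R(A, B, C) and S(C, D, E), stored as plain tuples.
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def naive_plan(r, s):
    """Cartesian product, then WHERE filter, then SELECT-list projection."""
    result = []
    for (a, b, rc), (sc, d, e) in product(r, s):   # every pair of tuples
        if a == "c" and e == 2 and rc == sc:       # WHERE clause
            result.append((b, d))                  # project B, D
    return result

answer = naive_plan(R, S)
pairs_examined = len(R) * len(S)   # size of the Cartesian product
```

With five tuples in each relation, the loop examines 25 pairs to produce a single answer tuple; the product grows multiplicatively with the relation sizes.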

R × S

R.A  R.B  R.C  S.C  S.D  S.E
a    1    10   10   x    2
a    1    10   20   y    2
. . .
c    2    10   10   x    2      ← Bingo! Got one...
. . .

But?

• Performance would be unacceptable!

• We need a better approach to reasoning about queries, their execution orders and their respective costs

Formal Relational Query Languages

• Two mathematical Query Languages form basis for “real” languages (e.g. SQL), and for implementation:

– Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.)

– Relational Algebra: More operational, very useful for representing execution plans.

Relational Algebra

• Tuple: ordered set of data values
• Relation: a set of tuples
• Algebra: formal mathematical system consisting of a set of objects and operations on those objects
• Relational algebra: Algebra whose objects are relations and whose operators transform relations into other relations

Relational Algebra ?

Query: SELECT B, D
       FROM R, S
       WHERE R.A = “c” and S.E = 2 and R.C = S.C

Relational Algebra - can be used to describe plans...

Ex: Plan I

π_{B,D}
└── σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C}
    └── ×
        ├── R
        └── S

OR: π_{B,D} [ σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C} (R × S) ]

Example Instances

“Sailors” and “Reserves” relations

R1
sid  bid  day
22   101  10/10/96
58   103  11/12/96

S1
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0

S2
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0

Relational Algebra

• Basic operations:
– Selection (σ) Selects a subset of rows from relation.
– Projection (π) Deletes unwanted columns from relation.
– Cross-product (×) Allows us to combine two relations.
– Set-difference (−) Tuples in reln. 1, but not in reln. 2.
– Union (∪) Tuples in reln. 1 or in reln. 2.

• Additional operations:
– Intersection, join, division, renaming: Not essential!

• Algebra is “closed”: Since each operation returns a relation, operations can be composed!
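To make the closure property concrete, here is a minimal Python sketch of the five basic operators. The representation (a relation as a pair of schema tuple and set of rows) and all function names are assumptions chosen for illustration; the data are the S2 instance from the example.

```python
# A relation is modelled as (schema, rows): attribute names plus a set of row tuples.
def select(rel, pred):
    schema, rows = rel
    return schema, {r for r in rows if pred(dict(zip(schema, r)))}

def project(rel, attrs):
    schema, rows = rel
    idx = [schema.index(a) for a in attrs]
    # Building a set removes duplicate rows, as the set-based algebra requires.
    return tuple(attrs), {tuple(r[i] for i in idx) for r in rows}

def cross(rel1, rel2):
    (s1, r1), (s2, r2) = rel1, rel2
    return s1 + s2, {a + b for a in r1 for b in r2}

def union(rel1, rel2):
    return rel1[0], rel1[1] | rel2[1]

def difference(rel1, rel2):
    return rel1[0], rel1[1] - rel2[1]

S2 = (("sid", "sname", "rating", "age"),
      {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
       (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)})

# Closure: select returns a relation, so its output can be fed straight to project.
high_rated = project(select(S2, lambda t: t["rating"] > 8), ["sname", "rating"])
ages = project(S2, ["age"])
```

Because every function takes relations and returns a relation in the same representation, arbitrary nesting works without any special cases.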

Projection

π_{sname,rating}(S2)

sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10

π_{age}(S2)

age
35.0
55.5

Selection

σ_{rating>8}(S2)

sid  sname  rating  age
28   yuppy  9       35.0
58   rusty  10      35.0

π_{sname,rating}(σ_{rating>8}(S2))

sname  rating
yuppy  9
rusty  10

Union, Intersection, Set Difference

• Operate on two union-compatible relations:
– Same number of fields.
– ‘Corresponding’ fields have same type.

S1 ∪ S2

sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0

S1 ∩ S2

sid  sname   rating  age
31   lubber  8       55.5
58   rusty   10      35.0

S1 − S2

sid  sname   rating  age
22   dustin  7       45.0

Cross-Product

• Each row of S1 is paired with each row of R1.

(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96

Conflict: Both S1 and R1 have a field called sid.

Renaming operator: ρ(C(1 → sid1, 5 → sid2), S1 × R1)

Joins

• Condition Join: R ⋈_c S = σ_c(R × S)

Example: S1 ⋈_{S1.sid < R1.sid} R1

(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  58     103  11/12/96

• Result schema same as that of cross-product.

Joins

• Equi-Join: condition contains only equalities.

Example: S1 ⋈_{sid} R1

sid  sname   rating  age   bid  day
22   dustin  7       45.0  101  10/10/96
58   rusty   10      35.0  103  11/12/96

• Result schema has only one copy of fields for which equality is specified.

• Natural Join: Equijoin on all common fields.
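The two join flavours can be sketched in Python on the S1 and R1 instances. The dict-per-row representation and the function names are assumptions made for this illustration. The condition used for the condition join, S1.sid < R1.sid, is the one that reproduces the two-row result shown for that join.

```python
# S1 (Sailors) and R1 (Reserves) instances, one dict per row.
S1 = [
    {"sid": 22, "sname": "dustin", "rating": 7, "age": 45.0},
    {"sid": 31, "sname": "lubber", "rating": 8, "age": 55.5},
    {"sid": 58, "sname": "rusty", "rating": 10, "age": 35.0},
]
R1 = [
    {"sid": 22, "bid": 101, "day": "10/10/96"},
    {"sid": 58, "bid": 103, "day": "11/12/96"},
]

def condition_join(left, right, cond):
    """Selection over the cross-product; each result keeps both rows as a pair."""
    return [(l, r) for l in left for r in right if cond(l, r)]

def natural_join(left, right):
    """Equijoin on all common fields, keeping one copy of each common field."""
    out = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[k] == r[k] for k in common):
                out.append({**l, **r})
    return out

theta = condition_join(S1, R1, lambda s, r: s["sid"] < r["sid"])
natural = natural_join(S1, R1)
```

The condition join keeps both sid columns (here as a pair of rows), while the natural join merges them into one, matching the two result schemas described above.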

Division

• Not primitive operator, but useful: Find sailors who have reserved all boats.

• A has 2 fields x and y; B has only field y:
– A/B = { ⟨x⟩ | ∀ ⟨y⟩ ∈ B : ⟨x,y⟩ ∈ A }
– i.e., A/B contains all x tuples (sailors) such that for every y tuple (boat) in B, there is an xy tuple in A.

Examples of Division A/B

A
sno  pno
s1   p1
s1   p2
s1   p3
s1   p4
s2   p1
s2   p2
s3   p2
s4   p2
s4   p4

B1         B2         B3
pno        pno        pno
p2         p2         p1
           p4         p2
                      p4

A/B1       A/B2       A/B3
sno        sno        sno
s1         s1         s1
s2         s4
s3
s4

Expressing A/B Using Basic Operators

• Division is useful shorthand.
• Idea: For A/B, compute all x values that are not ‘disqualified’ by some y value in B.

Disqualified x values: π_x((π_x(A) × B) − A)

A/B: π_x(A) − all disqualified tuples
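Both views of division, the direct "for every y" definition and the construction from projection, cross-product and set difference, can be checked against each other in a short Python sketch. The function names are hypothetical; the relations are the A, B1, B2 and B3 instances from the division examples, with B stored as a set of bare pno values.

```python
# A(sno, pno) and the divisors B1, B2, B3 from the division examples.
A = {("s1", "p1"), ("s1", "p2"), ("s1", "p3"), ("s1", "p4"),
     ("s2", "p1"), ("s2", "p2"), ("s3", "p2"), ("s4", "p2"), ("s4", "p4")}
B1 = {"p2"}
B2 = {"p2", "p4"}
B3 = {"p1", "p2", "p4"}

def divide_direct(a, b):
    """All x such that for every y in b, the pair (x, y) is in a."""
    xs = {x for x, _ in a}
    return {x for x in xs if all((x, y) in a for y in b)}

def divide_basic(a, b):
    """Same result using only projection, cross-product and set difference."""
    xs = {x for x, _ in a}                         # pi_x(A)
    all_pairs = {(x, y) for x in xs for y in b}    # pi_x(A) x B
    disqualified = {x for x, _ in all_pairs - a}   # pi_x((pi_x(A) x B) - A)
    return xs - disqualified                       # pi_x(A) - disqualified
```

A pair in `all_pairs - a` is a (sailor, boat) combination that is required but missing, so any x appearing there fails the "for every y" test.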

Find names of sailors who’ve reserved boat #103

Solution 1: π_sname((σ_{bid=103} Reserves) ⋈ Sailors)

Solution 2: ρ(Temp1, σ_{bid=103} Reserves)
            ρ(Temp2, Temp1 ⋈ Sailors)
            π_sname(Temp2)

Solution 3: π_sname(σ_{bid=103}(Reserves ⋈ Sailors))
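The solutions differ only in whether the selection on bid is applied before or after the join. A small Python sketch shows that both orders give the same answer but feed the join inputs of different sizes. The S1 and R1 instances stand in for Sailors and Reserves, and the helper `join_on_sid` is an assumption made for illustration.

```python
Sailors = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
Reserves = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

def join_on_sid(reserves, sailors):
    """Natural join on sid; result rows are (sid, bid, day, sname, rating, age)."""
    return [r + s[1:] for r in reserves for s in sailors if r[0] == s[0]]

# Select first, then join: the join only sees reservations for boat 103.
selected = [r for r in Reserves if r[1] == 103]
select_then_join = {t[3] for t in join_on_sid(selected, Sailors)}

# Join first, then select: the join sees every reservation.
joined = join_on_sid(Reserves, Sailors)
join_then_select = {t[3] for t in joined if t[1] == 103}

join_input_sizes = (len(selected), len(Reserves))
```

On this tiny instance the difference is one row versus two, but the gap grows with the size of Reserves.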

Find names of sailors who’ve reserved a red boat

• Information about boat color only available in Boats; so need an extra join:

π_sname((σ_{color=‘red’} Boats) ⋈ Reserves ⋈ Sailors)

A more efficient solution:

π_sname(π_sid((π_bid σ_{color=‘red’} Boats) ⋈ Reserves) ⋈ Sailors)

A query optimizer can find this, given the first solution!

Find sailors who’ve reserved a red or a green boat

• Can identify all red or green boats, then find sailors who’ve reserved one of these boats:

ρ(Tempboats, (σ_{color=‘red’ ∨ color=‘green’} Boats))

π_sname(Tempboats ⋈ Reserves ⋈ Sailors)

Can also define Tempboats using union! (How?)

What happens if ∨ is replaced by ∧ in this query?

Find sailors who’ve reserved a red and a green boat

• Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (sid is a key for Sailors):

ρ(Tempred, π_sid((σ_{color=‘red’} Boats) ⋈ Reserves))

ρ(Tempgreen, π_sid((σ_{color=‘green’} Boats) ⋈ Reserves))

π_sname((Tempred ∩ Tempgreen) ⋈ Sailors)

Find names of sailors who’ve reserved all boats

• Uses division; schemas of input relations to / must be carefully chosen:

ρ(Tempsids, (π_{sid,bid} Reserves) / (π_bid Boats))

π_sname(Tempsids ⋈ Sailors)

To find sailors who’ve reserved all ‘Interlake’ boats:

..... / π_bid(σ_{bname=‘Interlake’} Boats)

Relational Algebra

representation used to describe plans...

Example

Data: relation R (A, B, C) relation S (C, D, E)

Query: SELECT B, D
       FROM R, S
       WHERE R.A = “c” and S.E = 2 and R.C = S.C

Relational Algebra - to describe plan

Ex: Plan I

π_{B,D}
└── σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C}
    └── ×
        ├── R
        └── S

OR: π_{B,D} [ σ_{R.A=“c” ∧ S.E=2 ∧ R.C=S.C} (R × S) ]

Another idea:

Plan II

π_{B,D}
└── ⋈ (natural join)
    ├── σ_{R.A=“c”}
    │   └── R
    └── σ_{S.E=2}
        └── S

R                σ(R)             σ(S)             S
A  B  C          A  B  C          C   D  E         C   D  E
a  1  10         c  2  10         10  x  2         10  x  2
b  1  20                          20  y  2         20  y  2
c  2  10                          30  z  2         30  z  2
d  2  35                                           40  x  1
e  3  45                                           50  y  3

SELECT B,D FROM R,S WHERE R.A = “c” and S.E = 2 and R.C=S.C
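Reading Plan II as "select on each relation first, then natural-join the two reduced inputs" (which is what the intermediate σ(R) and σ(S) tables suggest), the plan can be sketched in Python as follows. The function name `plan_two` and the tuple layout are assumptions for illustration.

```python
# Example instances: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def plan_two(r, s):
    """Apply both selections first, then join on C and project B, D."""
    sel_r = [t for t in r if t[0] == "c"]      # select R tuples with A = "c"
    sel_s = [t for t in s if t[2] == 2]        # select S tuples with E = 2
    answer = [(b, d)
              for (_, b, rc) in sel_r
              for (sc, d, _) in sel_s
              if rc == sc]                     # natural join on C
    return sel_r, sel_s, answer

sel_r, sel_s, answer = plan_two(R, S)
pairs_examined = len(sel_r) * len(sel_s)
```

The join now compares 1 × 3 = 3 pairs instead of the 25 pairs of the full Cartesian product, while producing the same answer.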

Yet another idea:

Plan III

π_{B,D}
└── σ_{S.E=2}
    └── ⋈ (natural join)
        ├── σ_{R.A=“c”}
        │   └── R
        └── S

Plan III

Use R.A and S.C Indexes
(1) Use R.A index to select R tuples with R.A = “c”
(2) For each R.C value found, use S.C index to find matching tuples
(3) Eliminate S tuples with S.E ≠ 2
(4) Join matching R, S tuples, project B, D attributes and place in result
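The four steps above can be sketched in Python, with in-memory dictionaries standing in for the R.A and S.C indexes. Real indexes would be disk-based structures such as B-trees or hash files; `build_index` and `plan_three` are hypothetical names used only for this illustration.

```python
from collections import defaultdict

# Example instances: R(A, B, C) and S(C, D, E).
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]

def build_index(rows, pos):
    """Map each value of the attribute at position pos to the rows holding it."""
    index = defaultdict(list)
    for row in rows:
        index[row[pos]].append(row)
    return index

index_r_a = build_index(R, 0)   # stand-in for the index on R.A
index_s_c = build_index(S, 0)   # stand-in for the index on S.C

def plan_three(index_ra, index_sc):
    result = []
    for (_, b, c) in index_ra.get("c", []):      # (1) R tuples with A = "c"
        for (_, d, e) in index_sc.get(c, []):    # (2) S tuples with matching C
            if e != 2:                           # (3) eliminate S tuples with E != 2
                continue
            result.append((b, d))                # (4) join, project B and D
    return result

answer = plan_three(index_r_a, index_s_c)
```

Only the tuples reached through the two index lookups are touched; neither relation is scanned in full once the indexes exist.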

R                S
A  B  C          C   D  E
a  1  10         10  x  2
b  1  20         20  y  2
c  2  10         30  z  2
d  2  35         40  x  1
e  3  45         50  y  3

Index I1 on R.A, index I2 on S.C

I1: A = “c” → <c,2,10>
I2: C = 10 → <10,x,2>
check E = 2? → output: <2,x>
next tuple: <c,7,15>

Overview of Query Optimization

SQL query
   ↓ parse
parse tree
   ↓ convert
logical query plan
   ↓ apply laws
“improved” l.q.p.
   ↓ estimate result sizes   (← statistics)
l.q.p. + sizes
   ↓ consider physical plans
{P1, P2, …}
   ↓ estimate costs
{(P1,C1), (P2,C2), ...}
   ↓ pick best
Pi
   ↓ execute
answer

Example: SQL query

Query : Find the movies with stars born in 1960

SELECT title
FROM StarsIn
WHERE starName IN (
    SELECT name
    FROM MovieStar
    WHERE birthdate LIKE ‘%1960’
);
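To see the query run end to end, here is a small sketch using Python's built-in sqlite3 module. The two-column table definitions and the sample rows are invented for illustration (the slides do not give the full schemas or any data); only the query itself comes from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical minimal schemas and made-up rows, just enough to run the query.
conn.executescript("""
    CREATE TABLE StarsIn (title TEXT, starName TEXT);
    CREATE TABLE MovieStar (name TEXT, birthdate TEXT);
    INSERT INTO StarsIn VALUES ('Movie A', 'Star One');
    INSERT INTO StarsIn VALUES ('Movie B', 'Star Two');
    INSERT INTO MovieStar VALUES ('Star One', '14-Mar-1960');
    INSERT INTO MovieStar VALUES ('Star Two', '02-Jul-1971');
""")

query = """
    SELECT title
    FROM StarsIn
    WHERE starName IN (
        SELECT name
        FROM MovieStar
        WHERE birthdate LIKE '%1960'
    )
"""
titles = [row[0] for row in conn.execute(query)]
conn.close()
```

The pattern '%1960' matches any birthdate string ending in 1960, so the approach depends on birthdates being stored as text with the year last.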

Example: Parse Tree

<Query>
  <SFW>
    SELECT <SelList> FROM <FromList> WHERE <Condition>
      <SelList>   → <Attribute> → title
      <FromList>  → <RelName> → StarsIn
      <Condition> → <Tuple> IN <Query>
        <Tuple> → <Attribute> → starName
        <Query> → ( <Query> )
          <Query> → <SFW>
            SELECT <SelList> FROM <FromList> WHERE <Condition>
              <SelList>   → <Attribute> → name
              <FromList>  → <RelName> → MovieStar
              <Condition> → <Attribute> LIKE <Pattern>
                <Attribute> → birthDate
                <Pattern>   → ‘%1960’

Example: Generating Relational Algebra

π_title
└── σ
    ├── StarsIn
    └── <condition>
        ├── <tuple>
        │   └── <attribute>
        │       └── starName
        ├── IN
        └── π_name
            └── σ_{birthdate LIKE ‘%1960’}
                └── MovieStar

Fig. 16.14: An expression using a two-argument σ, midway between a parse tree and relational algebra

Example: Logical Query Plan

π_title
└── σ_{starName=name}
    └── ×
        ├── StarsIn
        └── π_name
            └── σ_{birthdate LIKE ‘%1960’}
                └── MovieStar

Fig. 16.16: Applying the rule for IN conditions

Example: Improved Logical Query Plan

π_title
└── ⋈_{starName=name}
    ├── StarsIn
    └── π_name
        └── σ_{birthdate LIKE ‘%1960’}
            └── MovieStar

Fig. 16.16: An improvement on prev fig

Question: Push project to StarsIn?

Example: Estimate Result Sizes

Need expected size

StarsIn

MovieStar

Example: One Physical Plan

Hash join   (Parameters: join order, memory size, project attributes, ...)
├── SEQ scan of StarsIn
└── index scan of MovieStar   (Parameters: select condition, ...)

Example: Estimate costs

L.Q.P

P1 P2 …. Pn

C1 C2 …. Cn Pick best!

Query Optimization

• Relational algebra level …
• Detailed query plan level …
– Estimate costs
– Generate and compare plans

