
Chapter 7 The Query Compiler Query Processor : Query Parser Tree Logical Query Plan Physical Query...

Date post: 30-Dec-2015
Upload: adelia-gray

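  • Chapter 7: The Query Compiler. [Figure: the query processor in three numbered steps: (1) the query is parsed into a parse tree (the query structure); (2) the parse tree becomes a logical query plan (a relational-algebra expression tree); (3) the logical plan becomes a physical query plan.]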

  • The Stages of Query Compilation: Query → Parser → Preprocessor → Logical query plan generator → Query rewriter → Preferred logical query plan. (Sections 7.1 and 7.3.)

  • Parsing: convert a SQL statement to a parse tree, which consists of two kinds of nodes: 1. Atoms: lexical elements such as keywords, names of attributes or relations, constants, parentheses, operators, and other schema elements. 2. Syntactic categories: names for families of query subparts, such as <SFW> and <Condition>.

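  • A Grammar for a Simple Subset of SQL

    1. Queries: <Query> ::= <SFW>; <Query> ::= ( <Query> )

    2. Select-From-Where: <SFW> ::= SELECT <SelList> FROM <FromList> WHERE <Condition>

  • 3. Select-Lists: <SelList> ::= <Attribute> , <SelList>; <SelList> ::= <Attribute>

    4. From-Lists: <FromList> ::= <Relation> , <FromList>; <FromList> ::= <Relation>

    5. Conditions: <Condition> ::= <Condition> AND <Condition>; <Condition> ::= <Tuple> IN <Query>; <Condition> ::= <Attribute> = <Attribute>; <Condition> ::= <Attribute> LIKE <Pattern>

    6. Tuples: <Tuple> ::= <Attribute>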
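
    The grammar is small enough to turn into a recursive-descent parser. The following is a minimal Python sketch under some assumptions that are not in the slides: the class and function names are hypothetical, the left-recursive AND rule is handled with a loop, names are not validated against a schema, and the parse tree is represented as nested tuples.

```python
import re

# Keywords, names, quoted patterns, and punctuation of the toy grammar.
TOKEN = re.compile(
    r"\s*(?:(SELECT|FROM|WHERE|AND|IN|LIKE)\b|([A-Za-z_]\w*)|('[^']*')|([(),=]))"
)

def tokenize(sql):
    """Split a query string into atoms; a trailing semicolon is ignored."""
    sql = sql.strip().rstrip(";").rstrip()
    tokens, pos = [], 0
    while pos < len(sql):
        match = TOKEN.match(sql, pos)
        if match is None:
            raise SyntaxError(f"unexpected input: {sql[pos:]!r}")
        tokens.append(match.group(match.lastindex))
        pos = match.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token=None):
        """Consume the next atom; if `token` is given, it must match."""
        tok = self.peek()
        if tok is None or (token is not None and tok != token):
            raise SyntaxError(f"expected {token!r}, got {tok!r}")
        self.pos += 1
        return tok

    def query(self):                 # <Query> ::= <SFW> | ( <Query> )
        if self.peek() == "(":
            self.expect("(")
            inner = self.query()
            self.expect(")")
            return ("Query", inner)
        return ("Query", self.sfw())

    def sfw(self):                   # SELECT ... FROM ... WHERE ...
        self.expect("SELECT")
        sel = self.name_list()
        self.expect("FROM")
        frm = self.name_list()
        self.expect("WHERE")
        return ("SFW", ("SelList", sel), ("FromList", frm), self.condition())

    def name_list(self):             # comma-separated attributes or relations
        names = [self.expect()]
        while self.peek() == ",":
            self.expect(",")
            names.append(self.expect())
        return names

    def condition(self):             # AND is folded to the left
        left = self.simple_condition()
        while self.peek() == "AND":
            self.expect("AND")
            left = ("AND", left, self.simple_condition())
        return left

    def simple_condition(self):      # IN, =, or LIKE
        attr, op = self.expect(), self.expect()
        if op == "IN":
            return ("IN", ("Tuple", attr), self.query())
        if op in ("=", "LIKE"):
            return (op, attr, self.expect())
        raise SyntaxError(f"unsupported operator: {op!r}")

QUERY = ("SELECT title FROM StarsIn WHERE starName IN "
         "(SELECT name FROM MovieStar WHERE birthdate LIKE '%1960');")
tree = Parser(tokenize(QUERY)).query()
print(tree)
```

    The nested tuples mirror the parse tree: atoms are strings, and each syntactic category is a tuple whose first element names the category.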

  • An Example. Relations: StarsIn(title, year, starName) and MovieStar(name, address, gender, birthdate).

    Find the movies with stars born in 1960: SELECT title FROM StarsIn WHERE starName IN (SELECT name FROM MovieStar WHERE birthdate LIKE '%1960');

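  • [Figure: parse tree for this query. The root is a select-from-where node with select-list title, from-list StarsIn, and a condition of the form tuple IN query: the tuple is starName, and the parenthesized subquery has select-list name, from-list MovieStar, and condition birthdate LIKE '%1960'.]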

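  • The query written with a join instead of a subquery: SELECT title FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960';

    [Figure: parse tree for this query. The select-list is title, the from-list is StarsIn, MovieStar, and the condition is the AND of starName = name and birthdate LIKE '%1960'.]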

  • Preprocessor: (1) view expansion; (2) semantic checking: check relation uses, check and resolve attribute uses, check types.

  • Algebraic Laws for Improving Query Plans: commutative and associative laws; laws involving selection; laws involving projection; laws about joins and products; laws involving duplicate elimination; laws involving grouping and aggregation.

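  • Commutative and Associative Laws: $R \times S = S \times R$; $R \bowtie S = S \bowtie R$; $R \cup S = S \cup R$; $R \cap S = S \cap R$

    $(R \times S) \times T = R \times (S \times T)$; $(R \bowtie S) \bowtie T = R \bowtie (S \bowtie T)$; $(R \cup S) \cup T = R \cup (S \cup T)$; $(R \cap S) \cap T = R \cap (S \cap T)$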

  • Theta Join: $R \bowtie_C S = S \bowtie_C R$

    Suppose R(a,b), S(b,c) and T(c,d). (R S) T R ( S T ) R.a>S.b aS.b a

  • Laws Involving Selection

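    $\sigma_{C_1 \text{ AND } C_2}(R) = \sigma_{C_1}(\sigma_{C_2}(R)) = \sigma_{C_2}(\sigma_{C_1}(R))$

    $\sigma_{C_1 \text{ OR } C_2}(R) = (\sigma_{C_1}(R)) \cup_S (\sigma_{C_2}(R))$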

  • Transformation Examples

    (a=1 OR a=3)AND b

  • Laws for Binary Operators: 1. Union: the selection must be pushed to both arguments. 2. Difference: the selection must be pushed to the first argument and optionally may be pushed to the second. 3. Other operators: it is only required that the selection be pushed to one argument.

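  • $\sigma_C(R \cup S) = \sigma_C(R) \cup \sigma_C(S)$; $\sigma_C(R - S) = \sigma_C(R) - S = \sigma_C(R) - \sigma_C(S)$; $\sigma_C(R \times S) = \sigma_C(R) \times S$; $\sigma_C(R \bowtie S) = \sigma_C(R) \bowtie S$

    $\sigma_C(R \bowtie_D S) = \sigma_C(R) \bowtie_D S$; $\sigma_C(R \cap S) = \sigma_C(R) \cap S$ (the laws that push the selection to R alone assume that C mentions only attributes of R). For example, R(a,b) and S(b,c): a=1 OR a=3(b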

  • Pushing Selections: sometimes it helps to first move a selection as far up the tree as it will go, and then push it down all possible branches. E.g., take StarsIn(title, year, starName) and Movie(title, year, length, studioName).

    View: CREATE VIEW MovieOf1996 AS SELECT * FROM Movie WHERE year = 1996;

    Query: which stars worked for which studios in 1996? SELECT starName, studioName FROM MovieOf1996 NATURAL JOIN StarsIn;

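  • [Figure: initial plan, $\pi_{starName,\,studioName}(\sigma_{year=1996}(Movie) \bowtie StarsIn)$.] Using $\sigma_C(R \bowtie S) = \sigma_C(R) \bowtie S$ from right to left: $\sigma_{year=1996}(Movie) \bowtie StarsIn = \sigma_{year=1996}(Movie \bowtie StarsIn)$. Then using $\sigma_C(R \bowtie S) = \sigma_C(R) \bowtie \sigma_C(S)$: $\sigma_{year=1996}(Movie \bowtie StarsIn) = \sigma_{year=1996}(Movie) \bowtie \sigma_{year=1996}(StarsIn)$. [Figure: improved plan, $\pi_{starName,\,studioName}(\sigma_{year=1996}(Movie) \bowtie \sigma_{year=1996}(StarsIn))$.]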

  • Laws Involving Projection A projection may be introduced anywhere in an expression tree, as long as it eliminates only attributes that are never used by any of the operators above, and are not in the result of the entire expression.

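  • Basic Laws: $\pi_L(R \bowtie S) = \pi_L(\pi_M(R) \bowtie \pi_N(S))$; $\pi_L(R \bowtie_C S) = \pi_L(\pi_M(R) \bowtie_C \pi_N(S))$

    $\pi_L(R \times S) = \pi_L(\pi_M(R) \times \pi_N(S))$

    where M and N are the attributes of R and S, respectively, that are join attributes or input attributes of L.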

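  • Suppose there are relations R(a,b,c) and S(c,d,e): $\pi_{a+e \to x,\ b \to y}(R \bowtie S)$

    $= \pi_{a+e \to x,\ b \to y}(\pi_{a,b,c}(R) \bowtie \pi_{c,e}(S))$

    $= \pi_{a+e \to x,\ b \to y}(R \bowtie \pi_{c,e}(S))$

    $\pi_L(R \cup_B S) = \pi_L(R) \cup_B \pi_L(S)$. Projections cannot be pushed below $\cup_S$, $\cap$, or $-$. For example, with R(a,b) = {(1,2)} and S(a,b) = {(1,0)}: $\pi_a(R \cap S) = \emptyset$, but $\pi_a(R) \cap \pi_a(S) = \{(1)\}$.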
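
    The counterexample with R(a,b) = {(1,2)} and S(a,b) = {(1,0)} can be checked directly. This is a small Python sketch using built-in sets; the helper name `project` is hypothetical.

```python
def project(relation, positions):
    """Set projection: keep the listed column positions of every tuple."""
    return {tuple(row[i] for i in positions) for row in relation}

R = {(1, 2)}   # R(a, b)
S = {(1, 0)}   # S(a, b)

# Project after intersecting versus intersect after projecting onto a.
projection_above = project(R & S, [0])
projection_below = project(R, [0]) & project(S, [0])
print(projection_above, projection_below)
```

    The two results differ, which is why the projection cannot be pushed below the intersection.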

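  • Projection Involving Some Computation: R(a,b,c), S(c,d,e)

    $\pi_{a+b \to x,\ d+e \to y}(R \bowtie S) = \pi_{x,y}(\pi_{a+b \to x,\ c}(R) \bowtie \pi_{d+e \to y,\ c}(S))$. If x or y is c, we need a temporary name: $\pi_{a+b \to c,\ d+e \to y}(R \bowtie S) = \pi_{z \to c,\ y}(\pi_{a+b \to z,\ c}(R) \bowtie \pi_{d+e \to y,\ c}(S))$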

  • Pushing a projection below a selection: $\pi_L(\sigma_C(R)) = \pi_L(\sigma_C(\pi_M(R)))$, where M is the list of attributes that are input attributes of L or mentioned in C. For example, to find from StarsIn(title, year, starName) the stars that worked in 1996: SELECT starName FROM StarsIn WHERE year = 1996; The plan $\pi_{starName}(\sigma_{year=1996}(StarsIn))$ becomes $\pi_{starName}(\sigma_{year=1996}(\pi_{starName,\,year}(StarsIn)))$. Notice: if there is an index on year, the extra projection may not improve the plan.

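  • Laws About Joins and Products: $R \bowtie_C S = \sigma_C(R \times S)$

    $R \bowtie S = \pi_L(\sigma_C(R \times S))$, where C equates each pair of attributes of R and S with the same name and L keeps one copy of each equated pair.

    These rules are usually applied from right to left.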

  • Laws Involving Duplicate Elimination

    $\delta(R) = R$ if R has no duplicates, for example when R is (1) a stored relation with a declared primary key, or (2) the result of a $\gamma$ operation. Also $\delta(R \cup_S S) = R \cup_S S$, and the same holds for $\cap_S$ and $-_S$.

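  • Several laws that push $\delta$:

    $\delta(R \times S) = \delta(R) \times \delta(S)$; $\delta(R \bowtie S) = \delta(R) \bowtie \delta(S)$

    $\delta(R \bowtie_C S) = \delta(R) \bowtie_C \delta(S)$; $\delta(\sigma_C(R)) = \sigma_C(\delta(R))$

    Notice: $\delta$ cannot be moved across $\cup_B$, $-_B$, or $\pi$.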

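  • For example, if R has two copies of tuple t and S has one copy of t, then $\delta(R \cup_B S)$ has one copy of t, but $\delta(R) \cup_B \delta(S)$ has two. With T(a,b) = {(1,2), (1,3)}: $\delta(\pi_a(T)) = \{(1)\}$, but $\pi_a(\delta(T)) = \{(1), (1)\}$.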
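
    Both counterexamples can be replayed with bags modeled as `collections.Counter` objects (tuple to number of copies). This is an illustrative Python sketch; the helper names `project` and `delta` are hypothetical.

```python
from collections import Counter

def project(bag, positions):
    """Bag projection: copies of tuples that collapse together add up."""
    out = Counter()
    for row, copies in bag.items():
        out[tuple(row[i] for i in positions)] += copies
    return out

def delta(bag):
    """Duplicate elimination: keep exactly one copy of each tuple."""
    return Counter({row: 1 for row in bag})

# T(a, b) = {(1, 2), (1, 3)}
T = Counter({(1, 2): 1, (1, 3): 1})
delta_of_projection = delta(project(T, [0]))   # one copy of (1)
projection_of_delta = project(delta(T), [0])   # two copies of (1)

# R has two copies of t, S has one copy of t.
t = ("t",)
R, S = Counter({t: 2}), Counter({t: 1})
delta_of_union = delta(R + S)                  # Counter addition is bag union
union_of_deltas = delta(R) + delta(S)

print(delta_of_projection, projection_of_delta)
print(delta_of_union, union_of_deltas)
```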

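  • Laws Involving Grouping and Aggregation. General rules: $\delta(\gamma_L(R)) = \gamma_L(R)$; $\gamma_L(R) = \gamma_L(\pi_M(R))$, where M is the list of attributes of R mentioned in L. Other rules: MIN and MAX are not affected by duplicates, so for them $\gamma_L(R) = \gamma_L(\delta(R))$; SUM, COUNT, and AVG are affected by duplicates.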

  • An Example

    Relations: MovieStar(name, addr, gender, birthdate), StarsIn(title, year, starName). Query: for each year, find the birthdate of the youngest star to appear in a movie that year.

    SELECT year, MAX(birthdate) FROM MovieStar, StarsIn WHERE name = starName GROUP BY year; Initial plan: $\gamma_{year,\ MAX(birthdate)}(\sigma_{name=starName}(MovieStar \times StarsIn))$

  • 1. Combine the selection and product into an equijoin. 2. Generate a $\delta$ below the $\gamma$. 3. Generate a $\pi$ between the $\gamma$ and the introduced $\delta$ to project onto year and birthdate. [Figure: the plan before and after. In the final plan, $\gamma_{year,\ MAX(birthdate)}$ sits above $\pi_{year,\ birthdate}$, the equijoin on name=starName, and the projections $\pi_{birthdate,\ name}(MovieStar)$ and $\pi_{year,\ starName}(StarsIn)$.]

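  • From Parse Trees to Logical Query Plans. Suppose a <Query> is a select-from-where construct whose <Condition> has no subqueries. Convert it into a relational-algebra expression, from bottom to top, as follows:

    1. Take the product of all relations in the from-list. 2. Apply $\sigma_C$, where C is the <Condition> expression. 3. Apply $\pi_L$, where L is the list of attributes in the select-list.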

  • Translation of A Parse Tree to an Algebraic Expression Tree

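    [Figure: parse tree for SELECT title FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960'.]

  • [Figure: the resulting expression tree, $\pi_{title}(\sigma_{starName=name \text{ AND } birthdate \text{ LIKE } '\%1960'}(StarsIn \times MovieStar))$.]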

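  • Removing Subqueries From Conditions. Two-argument selection node: the left child is the relation R; the right child is the condition C.

  • [Figure: $\pi_{title}$ above a two-argument selection whose left child is StarsIn and whose right child is the condition starName IN $\pi_{name}(\sigma_{birthdate \text{ LIKE } '\%1960'}(MovieStar))$.]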

  • Replacement of a Two-Argument Selection by a One-Argument Selection (uncorrelated subquery). Given a two-argument selection with a left child for R and a right child for the condition t IN S: 1. Replace the condition by the expression S. 2. Replace the two-argument selection by a one-argument selection $\sigma_C$. 3. Give $\sigma_C$ an argument that is the product of R and S.

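  • Uncorrelated Subquery. [Figure: resulting plan. $\pi_{title}$ above the condition starName=name, whose two inputs are StarsIn and $\pi_{name}(\sigma_{birthdate \text{ LIKE } '\%1960'}(MovieStar))$.]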

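  • Correlated Subquery: SELECT DISTINCT m1.title, m1.year FROM StarsIn m1 WHERE m1.year-40
  • [Figure: plan for the correlated subquery. $\pi_{m1.title,\ m1.year}$ above a selection that compares m1.year-40 with abd, above the condition m2.title=m1.title AND m2.year=m1.year. Its inputs are StarsIn m1 and $\gamma_{m2.title,\ m2.year,\ AVG(s.birthdate) \to abd}$ applied to the join of StarsIn m2 and MovieStar s on m2.starName=s.name.]

  • [Figure: simplified plan. $\pi_{m2.title,\ m2.year}$ above a selection that compares m2.year-40 with abd, above $\gamma_{m2.title,\ m2.year,\ AVG(s.birthdate) \to abd}$ applied to the join of StarsIn m2 and MovieStar s on m2.starName=s.name.]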

  • Improving the Logical Query Plan

    1. Push selections down. 2. Push projections down, or add new projections. 3. Remove duplicate eliminations, or move them to a more convenient position. 4. Turn a selection on a product into an equijoin.

  • [Figure: four successive plans for the movie-star query. The first is $\pi_{title}(\sigma_{starName=name \text{ AND } birthdate \text{ LIKE } '\%1960'}(StarsIn \times MovieStar))$. In the others the condition is split: birthdate LIKE '%1960' is applied directly to MovieStar, and starName=name is applied above StarsIn and the filtered MovieStar.]

  • Grouping Associative/Commutative Operators. Group the nodes that have the same associative and commutative operator into a single node with many children. In some situations a natural join can be combined with a theta-join: 1. replace the natural joins with theta-joins; 2. add a projection; 3. the theta-join conditions must be associative. [Figure: a tree of binary operators over R, S, T, U, V, W regrouped into nodes with many children.]

  • Estimating the Cost of Operations. When deriving physical plans from a logical plan, we need to select: an order and grouping for associative-and-commutative operations; an algorithm for each operator in the logical plan; additional operators (scanning, sorting, and so on); the way in which arguments are passed from one operator to the next.

  • Estimating Sizes of Intermediate Relations. Good estimation rules: give accurate estimates; are easy to compute; are logically consistent.

  • Estimating the Size of a Projection. Suppose R(a, b, c), where a and b are 4-byte integers and c is a 100-byte string. Each tuple header requires 12 bytes and each block header requires 24 bytes, so a tuple of R takes 120 bytes and each 1024-byte block can hold $\lfloor (1024-24)/120 \rfloor = 8$ tuples. If T(R) = 10,000, then B(R) = 10,000/8 = 1250.

    For $S = \pi_{a+b,\,c}(R)$, each tuple of S is 116 bytes and each block can still hold only $\lfloor (1024-24)/116 \rfloor = 8$ tuples, so B(S) = 1250.

    For $U = \pi_{a,b}(R)$, each tuple of U is 20 bytes. Each block can hold 1000/20 = 50 tuples, so B(U) = 10,000/50 = 200.
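
    The block arithmetic above can be reproduced in a few lines. This Python sketch uses the sizes from the slide; the function names are hypothetical, and the 1024-byte block size is read off the slide's own calculation.

```python
def tuples_per_block(block_size, block_header, tuple_size):
    """Whole tuples that fit in one block after the block header."""
    return (block_size - block_header) // tuple_size

def blocks_needed(num_tuples, per_block):
    """Blocks needed to hold num_tuples, rounding up."""
    return -(-num_tuples // per_block)

BLOCK, BLOCK_HEADER, TUPLE_HEADER = 1024, 24, 12
T_R = 10_000

tuple_sizes = {
    "R": TUPLE_HEADER + 4 + 4 + 100,  # R(a, b, c): two integers and a string
    "S": TUPLE_HEADER + 4 + 100,      # S = pi_{a+b, c}(R): one integer and the string
    "U": TUPLE_HEADER + 4 + 4,        # U = pi_{a, b}(R): two integers
}

stats = {}
for name, size in tuple_sizes.items():
    per_block = tuples_per_block(BLOCK, BLOCK_HEADER, size)
    stats[name] = (size, per_block, blocks_needed(T_R, per_block))
    print(name, stats[name])
```

    Each entry is (tuple size in bytes, tuples per block, blocks). Projecting away only one small integer does not change the block count, while dropping the long string does.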

  • Estimating the Size of a Selection. For $S = \sigma_{A=c}(R)$: $T(S) = T(R)/V(R,A)$

    For S=a

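  • AND of Conditions. Multiply T(R) by the selectivity factor of each condition: 1/3 for an inequality, 1 for $\ne$, and 1/V(R,A) for an equality A=c. For R(a,b,c) and $S = \sigma_{a=10 \text{ AND } b>20}(R)$ with T(R) = 10,000 and V(R,a) = 50: T(S) = T(R)/(50*3) = 67.

    If the condition is contradictory, as in $S = \sigma_{a=10 \text{ AND } a>10}(R)$, then T(S) = 0.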

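  • OR of Conditions. Suppose $S = \sigma_{C_1 \text{ OR } C_2}(R)$. A simple estimate is the sum of the number of tuples satisfying $C_1$ and the number satisfying $C_2$. A more careful estimate: if R has n tuples, $m_1$ of which satisfy $C_1$ and $m_2$ of which satisfy $C_2$, then $T(S) = n(1-(1-m_1/n)(1-m_2/n))$.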

    For exampleR(a,b), T(R)=10,000. S=a=10 OR b
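
    The selectivity rules for equality, inequality, AND, and OR can be sketched in Python as follows. The function names are hypothetical. The AND example uses the slide's T(R) = 10,000 and V(R,a) = 50; the OR inputs (an equality on a and one inequality) are assumptions chosen for illustration.

```python
def equality_selectivity(v):
    """Fraction of tuples kept by A = c when A has v distinct values."""
    return 1 / v

INEQUALITY_SELECTIVITY = 1 / 3   # rule of thumb for A < c, A > c, and so on

def and_estimate(t_r, *selectivities):
    """AND of conditions: multiply T(R) by each selectivity factor."""
    estimate = t_r
    for factor in selectivities:
        estimate *= factor
    return estimate

def or_estimate(n, m1, m2):
    """OR of conditions: n(1 - (1 - m1/n)(1 - m2/n))."""
    return n * (1 - (1 - m1 / n) * (1 - m2 / n))

T_R, V_R_A = 10_000, 50

# Equality on a combined with one inequality on another attribute.
and_size = and_estimate(T_R, equality_selectivity(V_R_A), INEQUALITY_SELECTIVITY)

# Assumed OR inputs: m1 tuples satisfy an equality on a, m2 an inequality.
m1 = T_R * equality_selectivity(V_R_A)
m2 = T_R * INEQUALITY_SELECTIVITY
or_size = or_estimate(T_R, m1, m2)

print(round(and_size), round(or_size))
```

    The OR formula treats the two conditions as independent, so it never exceeds n, unlike the plain sum of m1 and m2.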

  • Estimating the Size of a Join

    An equijoin can be handled like a natural join. A theta-join can be handled as a selection following a product.

  • For R(X,Y) and S(Y,Z), where Y is a single attribute and X and Z represent any sets of attributes, make two simplifying assumptions. Containment of value sets: if $V(R,Y) \le V(S,Y)$, then every Y-value of R is a Y-value of S. Preservation of value sets: if A is an attribute of R but not of S, then $V(R \bowtie S, A) = V(R,A)$.

    If $V(R,Y) \le V(S,Y)$, then $T(R \bowtie S) = T(R)T(S)/V(S,Y)$; if $V(S,Y) \le V(R,Y)$, then $T(R \bowtie S) = T(R)T(S)/V(R,Y)$.

    In general, $T(R \bowtie S) = T(R)T(S)/\max(V(R,Y), V(S,Y))$.

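  • R(a,b), S(b,c), U(c,d) with T(R)=1000, T(S)=2000, T(U)=5000, V(R,b)=20, V(S,b)=50, V(S,c)=100, V(U,c)=500. Compute the natural join $R \bowtie S \bowtie U$. As $(R \bowtie S) \bowtie U$: $T(R \bowtie S) = T(R)T(S)/\max(V(R,b),V(S,b)) = 1000 \cdot 2000/50 = 40{,}000$, and $T((R \bowtie S) \bowtie U) = T(R \bowtie S)T(U)/\max(V(R \bowtie S,c),V(U,c)) = 40{,}000 \cdot 5000/\max(100,500) = 400{,}000$. As $R \bowtie (S \bowtie U)$: $T(S \bowtie U) = T(S)T(U)/\max(V(S,c),V(U,c)) = 2000 \cdot 5000/500 = 20{,}000$, and $T(R \bowtie (S \bowtie U)) = T(S \bowtie U)T(R)/\max(V(S \bowtie U,b),V(R,b)) = 20{,}000 \cdot 1000/\max(50,20) = 400{,}000$.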
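
    The two join orders can be checked with a short Python sketch of the two-relation formula. The function and variable names are hypothetical; the value counts of intermediate results follow the preservation-of-value-sets assumption.

```python
def join_size(t_left, t_right, v_left, v_right):
    """T(L join R) = T(L) * T(R) / max(V(L, Y), V(R, Y)) for join attribute Y."""
    return t_left * t_right / max(v_left, v_right)

T = {"R": 1000, "S": 2000, "U": 5000}
V = {("R", "b"): 20, ("S", "b"): 50, ("S", "c"): 100, ("U", "c"): 500}

# (R join S) join U; V(R join S, c) = V(S, c) by preservation of value sets.
rs = join_size(T["R"], T["S"], V["R", "b"], V["S", "b"])
rs_u = join_size(rs, T["U"], V["S", "c"], V["U", "c"])

# R join (S join U); V(S join U, b) = V(S, b) by preservation of value sets.
su = join_size(T["S"], T["U"], V["S", "c"], V["U", "c"])
r_su = join_size(su, T["R"], V["S", "b"], V["R", "b"])

print(rs, rs_u, su, r_su)
```

    Both orders give the same final estimate, which is the logical consistency the estimation rules are meant to have.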

  • Natural Joins With Multiple Join Attributes: R(x,y1,y2), S(y1,y2,z). 1. The probability that r and s agree on attribute y1 is $1/\max(V(R,y_1),V(S,y_1))$. 2. The probability that r and s agree on attribute y2 is $1/\max(V(R,y_2),V(S,y_2))$. 3. The probability that r and s agree on both y1 and y2 is $1/(\max(V(R,y_1),V(S,y_1)) \cdot \max(V(R,y_2),V(S,y_2)))$. 4. $T(R(x,y_1,y_2) \bowtie S(y_1,y_2,z)) = T(R)T(S)/(\max(V(R,y_1),V(S,y_1)) \cdot \max(V(R,y_2),V(S,y_2)))$
  • $R(a,b,c) \bowtie_{R.b=S.d \text{ AND } R.c=S.e} S(d,e,f)$

    R(a,b,c), S(d,e,f) with T(R)=1000, T(S)=2000, V(R,b)=20, V(S,d)=50, V(R,c)=100, V(S,e)=50. Then max(V(R,b),V(S,d)) = 50 and max(V(R,c),V(S,e)) = 100, so $T(R \bowtie S) = 1000 \cdot 2000/(50 \cdot 100) = 400$.

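  • Compute $R \bowtie S \bowtie U$ for R(a,b), S(b,c), U(c,d) with T(R)=1000, T(S)=2000, T(U)=5000, V(R,b)=20, V(S,b)=50, V(S,c)=100, V(U,c)=500, using the order $(R \bowtie U) \bowtie S$. R and U share no attributes, so $T(R \bowtie U) = T(R)T(U) = 1000 \cdot 5000 = 5{,}000{,}000$. Then $\max(V(R \bowtie U,b),V(S,b)) = \max(20,50) = 50$ and $\max(V(R \bowtie U,c),V(S,c)) = \max(500,100) = 500$, so $T(R \bowtie S \bowtie U) = 5{,}000{,}000 \cdot 2000/(50 \cdot 500) = 400{,}000$.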

  • Join of Many Relations. Let $S = R_1 \bowtie R_2 \bowtie \cdots \bowtie R_n$, and suppose the attribute A appears in k of the $R_i$, with the values of $V(R_i,A)$ for i = 1, 2, ..., k ordered as $v_1 \le v_2 \le \cdots \le v_k$. Select a tuple $t_1$ from the relation with $v_1$. The tuple $t_i$ selected from the relation with $v_i$ has probability $1/v_i$ of agreeing with $t_1$ on A. Taking all i = 2, 3, ..., k, the probability that all k tuples agree on A is $1/(v_2 v_3 \cdots v_k)$. The rule for estimating the size of any join: start with the product of the number of tuples in each relation; then, for each attribute A appearing at least twice, divide by all but the least of the V(R,A).

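  • For example, $R(a,b,c) \bowtie S(b,c,d) \bowtie U(b,e)$ with T(R)=1000, T(S)=2000, T(U)=5000, V(R,a)=100, V(R,b)=20, V(S,b)=50, V(U,b)=200, V(R,c)=200, V(S,c)=100, V(S,d)=400, V(U,e)=500. For b we divide by 50 and 200 (all but the least of 20, 50, 200); for c we divide by 200 (all but the least of 100, 200). The resulting estimate is $1000 \cdot 2000 \cdot 5000/(50 \cdot 200 \cdot 200) = 5000$.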
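
    The general rule (multiply the tuple counts, then for each shared attribute divide by all but the least value count) can be written as one function. This is a Python sketch; the function name and the dictionary layout are assumptions made for illustration.

```python
from math import prod

def estimate_join_size(sizes, value_counts):
    """sizes: {relation: T(R)}; value_counts: {relation: {attribute: V(R, A)}}."""
    counts_by_attr = {}
    for attrs in value_counts.values():
        for attr, v in attrs.items():
            counts_by_attr.setdefault(attr, []).append(v)
    divisor = 1
    for counts in counts_by_attr.values():
        # For an attribute in one relation only, nothing is left after the slice.
        divisor *= prod(sorted(counts)[1:])
    return prod(sizes.values()) / divisor

sizes = {"R": 1000, "S": 2000, "U": 5000}

# R(a,b,c) join S(b,c,d) join U(b,e)
three_way = estimate_join_size(sizes, {
    "R": {"a": 100, "b": 20, "c": 200},
    "S": {"b": 50, "c": 100, "d": 400},
    "U": {"b": 200, "e": 500},
})

# R(a,b) join S(b,c) join U(c,d), the chain used in the earlier slides
chain = estimate_join_size(sizes, {
    "R": {"b": 20},
    "S": {"b": 50, "c": 100},
    "U": {"c": 500},
})

print(three_way, chain)
```

    Because the rule looks only at the set of relations, the estimate does not depend on the order in which the joins are performed.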

  • Estimating Sizes for Other Operations. Union: for bag union $\cup_B$, the sum of the sizes of the arguments; for set union $\cup_S$, as large as the sum of the sizes or as small as the larger of the two arguments. Intersection: 1. as few as 0 tuples or as many as the smaller of the two arguments; 2. can be recognized as an extreme case of the natural join.

  • Difference R-S: $[T(R) + (T(R)-T(S))]/2 = T(R) - T(S)/2$. Duplicate elimination $\delta(R)$: 1. $(1+T(R))/2$; 2. $V(R,a_1) \cdot V(R,a_2) \cdots V(R,a_n)$. Grouping and aggregation: 1. the product of the V(R,A)'s, where A ranges over the grouping attributes; 2. $[1+T(R)]/2$.

  • The cost is influenced by: the chosen logical query plan; the sizes of intermediate relations; the physical operators used to implement logical operators; the ordering of similar operations; the method of passing arguments from one physical operator to the next.

  • Obtaining Estimates for Size Parameters. T(R), V(R,a): scan R and count. B(R): count the actual number of blocks used.

  • The most common types of histogramsEqual-width: the number of tuples with value v in the range v0
  • For example, computing $R(a,b) \bowtie S(b,c)$. Histogram of R.b: 1: 200, 0: 150, 5: 100, others: 550. Histogram of S.b: 0: 100, 1: 80, 2: 70, others: 250. Suppose V(R,b) = 14 and V(S,b) = 13. In R, apart from the values 1, 0, and 5, each of the other eleven values appears on average 550/11 = 50 times. In S, apart from 0, 1, and 2, each of the other ten values appears on average 250/10 = 25 times. For b = 1 and b = 0, which are frequent in both: 200*80 + 150*100 = 31,000. For b = 2: 70*50 = 3,500. For b = 5: 100*25 = 2,500. For each of the nine other b-values: 50*25 = 1,250. Sum: 31,000 + 3,500 + 2,500 + 9*1,250 = 48,250. Estimated by the formula of Section 7.4 instead: T(R)T(S)/max(V(R,b),V(S,b)) = 1000*500/14 = 35,714.
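
    The histogram-based estimate can be sketched in Python as follows. The function name is hypothetical, and the sketch makes the same assumptions as the slide: a frequent value of one relation that is missing from the other histogram is one of the other relation's "other" values, and the remaining values of the relation with fewer of them all occur in the other relation.

```python
def histogram_join_estimate(hist_r, others_r, v_r, hist_s, others_s, v_s):
    """hist_*: {value: count} for frequent values; others_*: tuples with any
    other value; v_*: number of distinct values of the join attribute."""
    avg_r = others_r / (v_r - len(hist_r))
    avg_s = others_s / (v_s - len(hist_s))
    total = 0
    # Values that are frequent in at least one of the relations.
    for value in set(hist_r) | set(hist_s):
        total += hist_r.get(value, avg_r) * hist_s.get(value, avg_s)
    # Remaining values, after removing those already counted above.
    rest_r = v_r - len(hist_r) - len(set(hist_s) - set(hist_r))
    rest_s = v_s - len(hist_s) - len(set(hist_r) - set(hist_s))
    total += min(rest_r, rest_s) * avg_r * avg_s
    return total

R_B = {1: 200, 0: 150, 5: 100}
S_B = {0: 100, 1: 80, 2: 70}
with_histogram = histogram_join_estimate(R_B, 550, 14, S_B, 250, 13)

t_r = sum(R_B.values()) + 550   # T(R) = 1000
t_s = sum(S_B.values()) + 250   # T(S) = 500
without_histogram = t_r * t_s / max(14, 13)

print(with_histogram, round(without_histogram))
```

    The histogram estimate is larger because the most frequent values of R.b and S.b coincide, which the uniform formula cannot see.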

  • Given relations Jan(day, temp) and July(day, temp): SELECT Jan.day, July.day FROM Jan, July WHERE Jan.temp = July.temp;

    Range 40-49: 10*5/10 = 5. Range 50-59: 5*20/10 = 10. The size of the join: 10+5 = 15. Computed without the histogram: 245*245/100 ≈ 600.
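
    For equal-width histograms the per-band calculation is the same each time: multiply the two counts in a band and divide by the band width. This Python sketch covers only the two bands whose counts appear above; the names are hypothetical.

```python
def band_join_estimate(count_left, count_right, width):
    """Matches within one band, assuming values spread evenly over `width` values."""
    return count_left * count_right / width

WIDTH = 10
jan = {"40-49": 10, "50-59": 5}    # tuples of Jan per temperature band
july = {"40-49": 5, "50-59": 20}   # tuples of July per temperature band

per_band = {
    band: band_join_estimate(jan[band], july[band], WIDTH)
    for band in jan if band in july
}
join_size = sum(per_band.values())

# Without the histogram: 245 tuples on each side, 100 possible temperatures.
uniform_estimate = 245 * 245 / 100

print(per_band, join_size, uniform_estimate)
```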

  • Incremental Computation of Statistics. Maintain T(R) by adding one every time a tuple is inserted and subtracting one every time a tuple is deleted. Estimate T(R) by counting only the number of blocks in the B-tree. Maintain V(R,a) by using an index on attribute a of relation R. If a is a key for R, then V(R,a) = T(R).

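  • Heuristics for Reducing the Cost of Logical Query Plans. In order to choose a suitable transformation, we need to estimate the cost both before and after the transformation.

    For example: R(a,b), S(b,c) with T(R)=5000, T(S)=2000, V(R,a)=50, V(R,b)=100, V(S,b)=200, V(S,c)=100. [Figure: initial plan with the selection a=10 applied to R and S.]

  • [Figure: two plan trees annotated with estimated sizes. Leaves: T(R) = 5000 and T(S) = 2000. The selection a=10 on R gives T(R)/V(R,a) = 5000/50 = 100. Left tree: the 100 tuples are halved to 100/2 = 50, the S side is 1000, and the join is 50*1000/200 = 250. Right tree: the sizes above the leaves are 100, 1000, and 500.] For the left plan tree: 50+1000+100 = 1150. For the right plan tree: 1000+100 = 1100.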

  • Approaches to Enumerating Physical Plans: exhaustive; top-down; bottom-up.

  • Heuristic Selection. For $\sigma_{A=c}(R)$, if there is an index on attribute A of R, perform an index scan. If the selection includes other conditions as well, the index scan is followed by a further selection, called a filter. If there is an index on the join attributes, perform an index-join. If one argument of a join is sorted on the join attributes, perform a sort-join. When computing the union or intersection of three or more relations, group the smallest relations first.

  • Branch-and-Bound Plan Enumeration. Use heuristics to find a good physical plan with cost C, and then explore the space of physical query plans. Eliminate any plan that has a subplan with cost greater than C. Replace the current plan whenever a new plan with cost less than C is found.

  • Hill Climbing. Use heuristics to find a good physical plan. Make small changes to the plan to find nearby plans that have lower cost, by (1) replacing one method for an operator by another, or (2) reordering joins by using the associative and/or commutative laws.

  • Dynamic Programming. A variation of the general bottom-up strategy. Keep for each subexpression only the plan of least cost. Only the best plan for each subexpression is considered when constructing the plans for a larger subexpression.

  • Selinger-Style Optimization. Keep for each subexpression not only the plan of least cost, but also certain other plans that have higher cost yet produce a result sorted in an order that may be useful higher up in the expression tree. This can produce optimal overall plans from plans that are not optimal for certain subexpressions.

  • Choosing an Order for Joins. Selecting an order for the (natural) join of three or more relations. The same ideas can be applied to other binary operations, like union or intersection.

  • Significance of Left and Right Join Arguments. One-pass join: the left argument is held in main memory while the right argument is read a block at a time. Nested-loop join: the left argument is the relation of the outer loop. Index-join: the right argument has the index.

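  • Join Trees. SELECT title FROM StarsIn, MovieStar WHERE starName = name AND birthdate LIKE '%1960';

    [Figure: two join trees for this query. In the first, StarsIn is the left argument and $\sigma_{birthdate \text{ LIKE } '\%1960'}(MovieStar)$ is the right argument of the join on starName=name; in the second, the two arguments are swapped.]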

  • Ways to Join Four Relations. When the join involves more than two relations, the number of possible join trees grows rapidly. [Figure: three join-tree shapes (a), (b), (c) over R, S, T, U: a left-deep tree, a right-deep tree, and a bushy tree.] Each tree represents 4! = 24 different trees when the possible labelings of the leaves are considered.

  • Left-Deep Join Trees. Considering only left-deep join trees has the following advantages: it limits the search space; left-deep trees interact well with common join algorithms.

  • For n relations, there is only one left-deep tree shape, to which we may assign the relations in n! ways. The total number of tree shapes T(n): $T(1) = 1$, $T(n) = \sum_{i=1}^{n-1} T(i)\,T(n-i)$. The total number of trees: $T(n) \cdot n!$. Given 6 relations, $T(6) \cdot 6! = 42 \cdot 6! = 30{,}240$.
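
    The recurrence is easy to evaluate mechanically. This is a short Python sketch with hypothetical function names.

```python
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def tree_shapes(n):
    """T(1) = 1; T(n) = sum over i = 1 .. n-1 of T(i) * T(n - i)."""
    if n == 1:
        return 1
    return sum(tree_shapes(i) * tree_shapes(n - i) for i in range(1, n))

def all_join_trees(n):
    """Every tree shape combined with every assignment of relations to leaves."""
    return tree_shapes(n) * factorial(n)

def left_deep_join_trees(n):
    """One left-deep shape, n! ways to label its leaves."""
    return factorial(n)

shapes = [tree_shapes(n) for n in range(1, 7)]
print(shapes, all_join_trees(6), left_deep_join_trees(6))
```

    For six relations, restricting the search to left-deep trees cuts the space from 30,240 trees to 720.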

  • U TR SB(R)+B(RS)B(RS)+B((RS) T) R S T UB(R)+B(S)+B(T)It is possible that B(R)+B(S)+B(T)< B(R)+B(RS) or B(RS)+B((RS) T)If R is smallwe expect B(RS)
  • [Figure: a left-deep and a right-deep join tree over R, S, T, U.] For the right-deep tree, we need to construct $S \bowtie (T \bowtie U)$ and $T \bowtie U$ repeatedly. If we store them on disk instead, we use extra disk I/Os.

  • Dynamic Programming to Select a Join Order and Grouping. There are three choices for picking an order for the join of many relations: consider them all; consider a subset; use a heuristic to pick one.

  • The table constructed by the dynamic programming algorithm holds, for each set of relations: the estimated size of the join of these relations; the least cost of computing the join of these relations; the expression that yields the least cost.

  • Consider the join of four relations R, S, T, and U: R(a,b), S(b,c), T(c,d), U(d,a), with V(R,a)=100, V(U,a)=50, V(R,b)=200, V(S,b)=100, V(S,c)=500, V(T,c)=20, V(T,d)=50, V(U,d)=1000.

  • $T(R \bowtie S) = T(R)T(S)/\max(V(R,b),V(S,b)) = 1000 \cdot 1000/200 = 5000$; $T((S \bowtie T) \bowtie R) = T(S \bowtie T)\,T(R)/\max(V(S,b),V(R,b)) = 2000 \cdot 1000/200 = 10{,}000$

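  • Join groupings and their costs

    $((S \bowtie T) \bowtie R) \bowtie U$: 12,000
    $((R \bowtie S) \bowtie U) \bowtie T$: 55,000
    $((T \bowtie U) \bowtie R) \bowtie S$: 11,000
    $((T \bowtie U) \bowtie S) \bowtie R$: 3,000
    $(T \bowtie U) \bowtie (R \bowtie S)$: 6,000
    $(R \bowtie T) \bowtie (S \bowtie U)$: 2,000,000
    $(S \bowtie T) \bowtie (R \bowtie U)$: 12,000

    For example, $B((S \bowtie T) \bowtie R) + B(S \bowtie T) = 10{,}000 + 2000 = 12{,}000$ and $B(T \bowtie U) + B(R \bowtie S) = 1000 + 5000 = 6000$.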
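
    The table of groupings can be reproduced with a small dynamic program over subsets of relations. This is a Python sketch under stated assumptions: every relation is taken to have 1000 tuples (the value used in the worked formula 1000*1000/200), the cost of a plan is the sum of the estimated sizes of its intermediate results, and the function names are hypothetical.

```python
from itertools import combinations
from math import prod

SIZES = {"R": 1000, "S": 1000, "T": 1000, "U": 1000}   # assumed T(.) = 1000
V = {
    "R": {"a": 100, "b": 200},
    "S": {"b": 100, "c": 500},
    "T": {"c": 20, "d": 50},
    "U": {"d": 1000, "a": 50},
}

def est_size(rels):
    """Product of sizes divided by all but the least V per shared attribute."""
    counts = {}
    for r in rels:
        for attr, v in V[r].items():
            counts.setdefault(attr, []).append(v)
    divisor = prod(prod(sorted(c)[1:]) for c in counts.values())
    return prod(SIZES[r] for r in rels) / divisor

def best_plans(relations):
    """Map each subset to (least cost, expression), built bottom-up."""
    table = {frozenset([r]): (0, r) for r in relations}
    for k in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, k)):
            best = None
            for i in range(1, k):
                for left in map(frozenset, combinations(sorted(subset), i)):
                    right = subset - left
                    cost = table[left][0] + table[right][0]
                    # Each multi-relation argument is an intermediate result.
                    cost += sum(est_size(p) for p in (left, right) if len(p) > 1)
                    if best is None or cost < best[0]:
                        best = (cost, f"({table[left][1]} JOIN {table[right][1]})")
            table[subset] = best
    return table

table = best_plans(["R", "S", "T", "U"])
best_cost, best_expr = table[frozenset("RSTU")]
print(best_cost, best_expr)
```

    Under this cost measure the arguments of a join are interchangeable, so the printed expression denotes the same grouping as $((T \bowtie U) \bowtie S) \bowtie R$ in the table.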

  • Dynamic Programming With More Detailed Cost Functions. Use disk I/O as the cost measure. Compute the cost of $R_1 \bowtie R_2$ by summing the cost of $R_1$, the cost of $R_2$, and the least cost of joining these two relations. Dynamic programming can also be based on Selinger-style optimization.

  • A Greedy Algorithm for Selecting a Join Order. BASIS: Start with the pair of relations whose estimated join size is the smallest. The join of these relations becomes the current tree. INDUCTION: Find, among all those relations not yet included in the current tree, the relation that, when joined with the current tree, yields the relation of the smallest estimated size. The new current tree has the old current tree as its left argument and the selected relation as its right argument.

  • Example
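
    As an illustration, the greedy algorithm can be run on the four relations R(a,b), S(b,c), T(c,d), U(d,a) from the dynamic programming slides. This is a Python sketch: the function names are hypothetical, and each relation is assumed to have 1000 tuples.

```python
from itertools import combinations
from math import prod

SIZES = {"R": 1000, "S": 1000, "T": 1000, "U": 1000}   # assumed T(.) = 1000
V = {
    "R": {"a": 100, "b": 200},
    "S": {"b": 100, "c": 500},
    "T": {"c": 20, "d": 50},
    "U": {"d": 1000, "a": 50},
}

def est_size(rels):
    """Product of sizes divided by all but the least V per shared attribute."""
    counts = {}
    for r in rels:
        for attr, v in V[r].items():
            counts.setdefault(attr, []).append(v)
    divisor = prod(prod(sorted(c)[1:]) for c in counts.values())
    return prod(SIZES[r] for r in rels) / divisor

def greedy_join_order(relations):
    """Return relations in left-deep join order: ((r0 JOIN r1) JOIN r2) ..."""
    remaining = set(relations)
    # BASIS: the pair with the smallest estimated join size.
    order = list(min(combinations(sorted(remaining), 2), key=est_size))
    remaining -= set(order)
    # INDUCTION: add the relation that keeps the current tree smallest.
    while remaining:
        nxt = min(sorted(remaining), key=lambda r: est_size(order + [r]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

order = greedy_join_order(["R", "S", "T", "U"])
print(order)
```

    On this input the greedy choice coincides with the least-cost grouping found by dynamic programming, although in general the greedy algorithm carries no such guarantee.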

  • Completing the Physical-Query-Plan Selection. Selection of algorithms to implement the operations of the query plan. Decisions regarding when intermediate results will be materialized and when they will be pipelined. Notation for physical-query-plan operators.

  • Choosing a Selection Method. Look for attributes that (a) have an index and (b) are compared to a constant in one of the terms of the selection. Then: 1. Use one comparison of the form $A\,\theta\,c$. 2. Retrieve all tuples that satisfy the comparison from (1). 3. Consider each tuple selected in (2) to decide whether it satisfies the rest of the selection conditions.

  • Costs for the Various Algorithms. 1. The table-scan algorithm: (a) B(R) if R is clustered; (b) T(R) if R is not clustered. 2. The algorithm that picks an equality term: (a) B(R)/V(R,a) if the index is clustering; (b) T(R)/V(R,a) if the index is not clustering. 3. The algorithm that picks an inequality term: (a) B(R)/3 if the index is clustering; (b) T(R)/3 if the index is not clustering.
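
    These cost rules translate directly into code. In the following Python sketch the function names are hypothetical, and the statistics (B(R) = 200, T(R) = 5000, V(R,x) = 100, V(R,y) = 500, and which indexes are clustering) are assumed values chosen only to illustrate the comparison.

```python
def table_scan_cost(b, t, clustered):
    """B(R) if R is clustered, otherwise T(R)."""
    return b if clustered else t

def equality_index_cost(b, t, v, clustering):
    """B(R)/V(R,a) with a clustering index, otherwise T(R)/V(R,a)."""
    return (b if clustering else t) / v

def inequality_index_cost(b, t, clustering):
    """B(R)/3 with a clustering index, otherwise T(R)/3."""
    return (b if clustering else t) / 3

# Assumed statistics for a relation R(x, y, z).
B_R, T_R = 200, 5000
V_X, V_Y = 100, 500

costs = {
    "table scan": table_scan_cost(B_R, T_R, clustered=True),
    "index on x (equality, non-clustering)":
        equality_index_cost(B_R, T_R, V_X, clustering=False),
    "index on y (equality, non-clustering)":
        equality_index_cost(B_R, T_R, V_Y, clustering=False),
    "index on z (inequality, clustering)":
        inequality_index_cost(B_R, T_R, clustering=True),
}
best = min(costs, key=costs.get)
print(best, costs[best])
```

    With these assumed numbers the most selective equality index wins even though it is not clustering.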

  • Examplefor R(x,y,z)x=1 AND y=2 AND z
  • Choosing a Join Method. One-pass join, if there are enough buffers for the join. Sort-join, when either (1) one or both arguments are already sorted on their join attributes or (2) there are two or more joins on the same attributes. Index-join, if there is an index on the join attributes. Hash join, if none of the above conditions holds.

  • Pipelining Versus Materialization. Pipelining: the tuples produced by one operation are passed directly to the operation that uses them, without ever storing the intermediate tuples on disk. Materialization: the result of each operation is stored on disk until it is needed by another operation.

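  • Pipelining Unary Operations: implementation by iterators. Projection: each call to its GetNext() calls GetNext() on its argument once. Selection $\sigma_C$: calls GetNext() on its argument several times, until one tuple that satisfies condition C is found. [Figure: the consumer calls GetNext() on the selection, which repeatedly calls GetNext() on its source, tests for C, and returns a tuple that satisfies C.]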
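
    Python generators give a compact model of the iterator interface: each `next()` on a generator plays the role of GetNext(). This is a sketch with hypothetical operator names and made-up sample tuples, not the notation of the slides.

```python
def table_scan(rows):
    """Leaf iterator: hands out the stored tuples one at a time."""
    for row in rows:
        yield row

def select(condition, source):
    """Pulls from its source until a tuple satisfies the condition."""
    for row in source:
        if condition(row):
            yield row

def project(positions, source):
    """Pulls exactly one source tuple for every tuple it produces."""
    for row in source:
        yield tuple(row[i] for i in positions)

# Made-up StarsIn(title, year, starName) tuples.
stars_in = [
    ("Movie A", 1996, "Star X"),
    ("Movie B", 1995, "Star Y"),
    ("Movie C", 1996, "Star Z"),
]

# pi_starName(sigma_year=1996(StarsIn)) as a pipeline; nothing is materialized.
plan = project([2], select(lambda row: row[1] == 1996, table_scan(stars_in)))
result = list(plan)
print(result)
```

    No tuple is read from the scan until the consumer asks the top operator for its next tuple.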

  • Pipelining Binary Operations: use one buffer to pass the result to its consumer. Example: $(R(w,x) \bowtie S(x,y)) \bowtie U(y,z)$ with M = 101 buffers, B(R) = 5000, B(S) = 10,000, B(U) = 10,000. $R \bowtie S$ uses a two-pass hash join, which needs 3(B(R)+B(S)) = 45,000 disk I/Os. If k
  • If 49
  • If k > 5000, we cannot perform a two-pass join within the 50 buffers, so we use the following algorithm: 1. Use a two-pass hash join for $R \bowtie S$, which needs 45,000 disk I/Os. 2. Store the result on disk, which needs k disk I/Os. 3. Use a two-pass hash join for $(R \bowtie S) \bowtie U$ in the 100 buffers, which needs 30,000+3k disk I/Os. The total cost is 75,000+4k disk I/Os.

  • Notation for Physical Query Plans. Each operator of the logical plan becomes one or more operators of the physical plan. Leaves (stored relations) of the logical plan become one of the scan operators applied to that relation. Materialization is indicated by a Store operator applied to the intermediate result.

  • Operators for Leaves. Each relation R that is a leaf operand of the logical-query-plan tree is replaced by a scan operator. TableScan(R): all blocks holding tuples of R are read in arbitrary order. SortScan(R,L): tuples of R are read in order, sorted according to the attribute(s) on list L. IndexScan(R,C): C is a condition of the form $A\,\theta\,c$; tuples of R are accessed through an index on attribute A. IndexScan(R,A): A is an attribute of R; the entire relation R is retrieved via an index on R.A.

  • Physical Operators for Selection

    1. Replace $\sigma_C(R)$ with Filter(C). If R is an intermediate relation, no other operator besides Filter is needed. If R is a stored or materialized relation, TableScan(R) or SortScan(R,L) is used to access R. 2. If condition C can be expressed as $A\,\theta\,c$ AND D, and there is an index on R.A, then (a) use the operator IndexScan(R, $A\,\theta\,c$) to access R, and (b) use Filter(D) in place of the selection $\sigma_C(R)$.

  • Physical Sort Operators. Introduce SortScan(R,L), which reads a stored relation R and produces it sorted according to the list of attributes L. Other relational-algebra operations are each replaced by a suitable physical operator, which specifies: the operation being performed; necessary parameters; a general strategy for the algorithm (sort-based, hash-based, or, in some joins, index-based); a decision about the number of passes to be used; an anticipated number of buffers the operation will require.

  • Annotating a selection to use the most appropriate indexExample for R(x,y,z)x=1 AND y=2 AND z
  • Ordering of Physical Operations

    1. Break the tree into subtrees at each edge that represents materialization. 2. Order the execution of the subtrees in a bottom-up, left-to-right manner. 3. Execute all nodes of each subtree using a network of iterators.

  • Exercises: Ex 7.1.3; Ex 7.2.2 (b), (c), (d); Ex 7.3.1 (c); Ex 7.3.2; Ex 7.4.1 (c), (d), (e); Ex 7.5.1; Ex 7.6.1; Ex 7.7.1 (b), (c).

