Relaxing Join Selection & Queries Nick Koudas et al

Post on 17-Jan-2016

51 views 0 download

Tags:

description

Relaxing Join Selection & Queries Nick Koudas et al. Presented by Pradnya Chavarkar. Motivation. Complex queries in terms of predicates against large databases. Right parameter values not known. Parameter adjustment becomes a trial-and-error method. - PowerPoint PPT Presentation

transcript

Relaxing Join Selection & Queries Nick Koudas et al

Presented by Pradnya Chavarkar

Relaxing Join and Selection Queries

2

Motivation

Complex queries in terms of predicates against large databases.

Right parameter values not known. Parameter adjustment becomes a trial-

and-error method. More the predicates, difficult the

adjustment

Relaxing Join and Selection Queries

3

Example

Relaxing Join and Selection Queries

4

Example

How many conditions to adjust?? How much to adjust?? No. of choices exponential in number of

conditions

Relaxing Join and Selection Queries

5

Outline

Relaxation framework Relaxing select conditions Relaxing All conditions Lattice traversal Experimental results Conclusion

Relaxing Join and Selection Queries

6

Query Relaxation

Top–K queriesWeights given to each condition

Skyline Points which are not dominated by any

other points in the given set

Relaxing Join and Selection Queries

7

Relaxation Framework

Selection Condition

Join Condition

Relaxing Join and Selection Queries

8

Relaxation Skyline

Relaxing Join and Selection Queries

9

Simple Solution !

Compute Skyline on Jobs Compute Skyline on Candidates Join the skylines

Relaxing Join and Selection Queries

10

Simple Solution ! (Contd)

Incorrect results ! Computes skyline locally

Relaxing Join and Selection Queries

11

Relaxing Selection Conditions Many joins are on Identifier attributes Relaxation skyline should satisfy :

Relaxing Join and Selection Queries

12

Join First (JF)

Join without select conditions Compute skyline for select conditions

on the resulting tuples May not be efficient for joins which

return a large number of pairs

Relaxing Join and Selection Queries

13

Pruning Join (PJ)

Compute skyline during join step Assuming index on relation S For each tuple of R, find joining tuples

from S Call ‘update’ for each such pair and

current skyline Discard the pair if it is already

dominated else discard the pairs dominated by the current pair

Relaxing Join and Selection Queries

14

Pruning Join (PJ) (Contd.)

Advantage: Produces less pairs after join step

as compared to Join First Disadvantage

Need for modification of join methods

Relaxing Join and Selection Queries

15

Pruning Join+ (PJ+)

Modifies pruning Join Algorithm Computes “local skylines” for records

joining with a record of othe relation Does Dominance checking within local

skyline If a pair is locally dominated then it will

not be in global skyline

Relaxing Join and Selection Queries

16

Pruning Join+ (PJ+) (Contd)

R (A, B, C)S (C, D, E)

Relaxing Join and Selection Queries

17

Pruning Join+ (PJ+) (Contd)

Does local pruning hoping to eliminate some S tuples beforehand

May not be always beneficial, when many tuples are not eliminated

Experimental results show that the tradeoff depends upon factors like number of conditions

Relaxing Join and Selection Queries

18

Sorted Access Join (SAJ)

Constructs sorted list of tuple IDs for each select condition based on relaxation

Go through the lists in round robin fashion

For each record in Li find records which can join -- (p,q)

Stop if all the current records in list Lj (i≠j), have relaxation greater that (p,q)

Relaxing Join and Selection Queries

19

Sorted Access Join (SAJ) (Contd)

Relaxing Join and Selection Queries

20

Relaxing All Conditions

Relaxing join conditions along with selection conditions

Assumes multidimensional indexed structure on selection and join conditions

R-Tree on both relations

Relaxing Join and Selection Queries

21

MIDIR

Traverses both R-Trees top down to identify potential pairs which can be in relaxation skyline

Push such pairs in queue In each iteration, pop one pair and

perform dominance checking If object-object pair, call Update()

Relaxing Join and Selection Queries

22

Dominance checking in MIDIR

Compute lower and upper bound on relaxation Pair p’ dominates p if for each condition Ci

To compute the lower and upper boundfor Ci (selection condition) convert itinto interval

Relaxing Join and Selection Queries

23

Dominance checking in MIDIR (Contd) B: MRB or tuple A: attribute used in B I(B,A): interval of A in B

Selection Condition:

Join Condition

Relaxing Join and Selection Queries

24

Dominance checking in MIDIR (Contd) Minimum distance between two intervals

Then MINDIST = 0

else MINDIST =

Relaxing Join and Selection Queries

25

Variants of Query Relaxation

Top-k answers User can specify weights on the conditions K points that have smallest weighted

summation Store the k best skyline answers Modify Update() in PJ Priority queues in MIDIR

Relaxing Join and Selection Queries

26

Variants of Query Relaxation (Contd)

Queries with multiple joins JF – compute all joins beforehand SAJ – access multiple index structures to

find the joining tuples MIDIR – queue maintains possible vectors

Relaxation of nonnumeric attributes RELAX(r,c) an be modified All algorithms can refer to this computed

value

Relaxing Join and Selection Queries

27

Variants of Query Relaxation (Contd)

Lattice Structure

Relaxing Join and Selection Queries

28

Lattice Traversal At least one result should be returned Examine the reasons for the empty

result – Join condition Cj is strict

Relax Cj with the tuples provided by applying selection conditions

Either CR or Cs is strict Relax the strict selection condition If join still strict then relax join with either of the

selections

Relaxing Join and Selection Queries

29

Lattice Traversal (Contd)

Both CR or Cs are strict Relax both selection conditions If join condition still strict, relax join condition If still no result, relax all togather

Relaxing Join and Selection Queries

30

Experiments

Experimental Settings Two real and three synthetic dabases Three types of correlations

Independent Correlated Anti-correlated

Relaxing Join and Selection Queries

31

Experiments (different data size)

Relaxing Join and Selection Queries

32

Experiments (different data sizes)

Relaxing Join and Selection Queries

33

Experiments (Different join cardinality)

Relaxing Join and Selection Queries

34

Experiments (Different selection conditions)

Relaxing Join and Selection Queries

35

Summary

JF and PJ Almost same performance for relaxation of

selection conditions PJ and PJ+

PJ works faster, but performance gets affected by join cardinality

SAJ Efficient for correlated data and query

conditions

Relaxing Join and Selection Queries

36

Conclusion

Efficient relaxation algorithm for relaxing join and selection queries

Lattice traversal method for minimal relaxation

Relaxing Join and Selection Queries

37

Related Work (ML for query relaxation ) LOQR algorithm proposed which is for

queries in disjunctive normal form Three steps

Exacting domain knowledge Finding the most useful rule Relaxing the failing query

Relaxing Join and Selection Queries

38

Introduction Let the query by the user be

The query fails because Laptops with large screen weights more

than 3 pounds Fast laptops with large HDD cost more than

$2000

Relaxing Join and Selection Queries

39

Extracting domain knowledge Take subset D of the target database Find “typical values” of the other attributes for

each constraints Forms a new dataset Di, with values of the

constraint attribute indicated a Boolean

Relaxing Join and Selection Queries

40

Extracting domain knowledge (contd)

Relaxing Join and Selection Queries

41

Finding the most useful rule

For rules formed using all the constraints individually, find the most similar rule by nearest neighbour search

R1 is the most similar as they differ only on the price attribute

It is guaranteed to give a result as it gives a result on the subset of the target database

Relaxing Join and Selection Queries

42

Relaxing the failing query

Most similar query is combined with the original query to give the relaxed query

Relaxing Join and Selection Queries

43

References

Relaxing Join and selection QueriesNick Koudas, Chen Li, and Anthony K. H. Tung, Rares Vernica

Machine learning for online query relaxationIon Muslea