Iceberg

Date post: 07-Nov-2014
Upload: om-pawar
Page 1: Iceberg

Iceberg query evaluation using bitmap index

Iceberg Query Evaluation Using Bitmap Index

Name: Om Pawar

Roll No.: 3253

Guide: Prof. A. Phakatkar

Page 2: Iceberg

Index

1) Introduction to Iceberg Query
2) Bitmap Index
3) Bitmap Index Example
4) Dynamic Pruning
5) Vector Alignment
6) Experimental Evaluation
7) Conclusion
8) References

Page 3: Iceberg

Introduction to Iceberg Query

What is an iceberg query? An iceberg query is a special class of aggregation query that returns only those groups whose aggregate value meets or exceeds a given threshold.

The general form of an iceberg query on relation R(C1, C2, ..., Cn) is:

SELECT Ci, Cj, ..., Cm, AGG(*)
FROM R
GROUP BY Ci, Cj, ..., Cm
HAVING AGG(*) >= T

Page 4: Iceberg

Example:

SELECT Product, State, COUNT(*)
FROM Sales
GROUP BY Product, State
HAVING COUNT(*) >= 100000

In this example, rows are grouped by product and state and counted with the COUNT function. Only (product, state) groups whose counts are at least 100,000 are included in the result set.
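The same idea can be sketched in a few lines of Python. This is only an illustration of the query semantics, not the evaluation method discussed in these slides: the function name `iceberg_count`, the sample rows, and the small threshold are all assumptions made for the example.

```python
from collections import Counter

def iceberg_count(rows, threshold):
    """Group identical rows, count them, and keep groups with count >= threshold."""
    counts = Counter(rows)
    return {group: n for group, n in counts.items() if n >= threshold}

# Hypothetical (product, state) sales rows; a real Sales table would be far larger.
sales = [("Laptop", "CA")] * 3 + [("Laptop", "NY")] + [("Phone", "CA")] * 2

# Equivalent of: GROUP BY Product, State HAVING COUNT(*) >= 2
result = iceberg_count(sales, threshold=2)
print(result)
```

The naive approach counts every group before filtering, which is exactly the cost that the bitmap-based methods in the following slides try to avoid.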

Page 5: Iceberg

Bitmap Index

What is a bitmap index? A bitmap index is a special type of index structure that many database management systems use to speed up search and retrieval on low-cardinality columns, such as gender (M, F).

It consists of a collection of bitmap vectors, each created to represent a distinct value.

Each distinct value of a column is encoded using a number of bits, and each bit is stored in its own bitmap vector. In the simplest encoding there is one vector per distinct value, with bit i set to 1 when row i holds that value.

Page 6: Iceberg

Example of Bitmap Index

Fig: An example of a bitmap index

Gender: {Male, Female}

Bitmap vectors: {Bmale, Bfemale}
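Since the figure itself is not reproduced in this transcript, the following minimal sketch shows how the two vectors could be built for a small gender column. The function `build_bitmap_index` and the five sample rows are assumptions for illustration, and the vectors are held as plain Python lists rather than packed bits.

```python
def build_bitmap_index(column):
    """Return {value: bitmap vector}, one vector per distinct value.

    Bit i of a vector is 1 when row i of the column holds that value.
    """
    index = {}
    for row, value in enumerate(column):
        vector = index.setdefault(value, [0] * len(column))
        vector[row] = 1
    return index

# Hypothetical column with five rows.
gender = ["Male", "Female", "Female", "Male", "Female"]
index = build_bitmap_index(gender)

b_male = index["Male"]
b_female = index["Female"]

# Rows matching gender = 'Female' are read straight off the vector.
female_rows = [row for row, bit in enumerate(b_female) if bit]
print(b_male, b_female, female_rows)
```

Each row has exactly one value, so the vectors of one column never share a 1-bit.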

Page 7: Iceberg

Advantages of Bitmap Index

1) Saves computation time, because queries are answered with bitwise operations.

2) Works well for low-cardinality columns (few distinct values).

3) Can include null values.

4) Can be compressed to reduce storage size and improve performance.

Page 8: Iceberg

Bitmap Index and Its Compression

A bitmap index for an attribute (column) of a table can be viewed as a v × r matrix, where v is the number of distinct values of the column and r is the number of tuples (rows) in the table.

An uncompressed bitmap index can be much larger than the original data, so compression is typically used to reduce the storage size and improve performance.
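To make the size argument concrete, the sketch below computes the uncompressed size for assumed values of v and r, then run-length encodes a sparse vector. Plain run-length encoding is used here only as a simplified stand-in; it is not the word-aligned scheme described in the cited compression papers, and all names and numbers are illustrative assumptions.

```python
from itertools import groupby

def run_length_encode(bits):
    """Collapse a bit string into (bit, run_length) pairs."""
    return [(bit, sum(1 for _ in run)) for bit, run in groupby(bits)]

def run_length_decode(runs):
    """Rebuild the original bit string from (bit, run_length) pairs."""
    return "".join(bit * length for bit, length in runs)

# Assumed sizes: v distinct values over r rows need v * r bits uncompressed.
v, r = 50, 1_000_000
uncompressed_bits = v * r

# A sparse vector (a single 1-bit in 1000 rows) shrinks to three runs.
sparse_vector = "0" * 500 + "1" + "0" * 499
runs = run_length_encode(sparse_vector)
print(uncompressed_bits, runs)
```

With many distinct values most vectors are mostly zeros, which is why long runs compress so well.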

Page 9: Iceberg

Algorithm for iceberg processing

Page 10: Iceberg

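Dynamic Pruning

The basic way to process an iceberg query on two attributes A and B using bitmap indices is to conduct pairwise bitwise-AND operations between each vector of A and each vector of B.

Dynamic pruning extends the bitwise-AND operation so that a single operation between vectors X and Y carries out the following three actions:

1. Z = X AND Y
2. X = X XOR Z
3. Y = Y XOR Z

Steps 2 and 3 clear from X and Y the 1-bits already counted in Z, so a vector whose remaining number of 1-bits drops below the threshold can be pruned from later operations.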
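The three actions can be sketched with Python integers as bit vectors. The function name `bitwise_and_with_pruning` is an assumption; the input vectors are A2 and B2 from the worked example on the later slides.

```python
def bitwise_and_with_pruning(x, y):
    """Return (z, x_rest, y_rest) for bit vectors held as Python ints.

    z      = x AND y   (rows shared by both vectors)
    x_rest = x XOR z   (x with the shared rows cleared)
    y_rest = y XOR z   (y with the shared rows cleared)
    """
    z = x & y
    return z, x ^ z, y ^ z

# A2 and B2 from the worked example, written as 12-bit strings.
a2 = int("101101110100", 2)
b2 = int("100100100100", 2)

z, a2_rest, b2_rest = bitwise_and_with_pruning(a2, b2)

print(bin(z).count("1"))        # number of rows in group (A2, B2)
print(format(a2_rest, "012b"))  # what is left of A2
print(format(b2_rest, "012b"))  # what is left of B2
```

Here B2 is left with no 1-bits at all, so it can never contribute to another group and is dropped.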

Page 11: Iceberg

Disadvantages of Dynamic Pruning

Many bitwise-AND operations produce an empty result (a vector with no 1-bits).

These wasted operations make performance slower.

The query therefore takes more time to evaluate.

Page 12: Iceberg

Vector Alignment

First 1-bit position: the position of the first 1-bit in a bitmap vector.

Vector alignment: two bitmap vectors are aligned if their first 1-bit positions are the same.
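Both definitions are easy to check in code. In this sketch the vectors are bit strings with row 0 on the left, as they are written on the example slides; the function names are assumptions chosen to mirror the terms above.

```python
def first_1bit_position(vector, start=0):
    """Index of the first '1' at or after `start`, or None if there is none."""
    pos = vector.find("1", start)
    return None if pos == -1 else pos

def aligned(x, y):
    """Two vectors are aligned when their first 1-bit positions are equal."""
    px, py = first_1bit_position(x), first_1bit_position(y)
    return px is not None and px == py

# Vectors taken from the worked example on the later slides.
a2 = "101101110100"
b2 = "100100100100"
b3 = "010010001000"
b1 = "001001010011"

print(aligned(a2, b2))   # both start at row 0
print(aligned(a2, b3))   # row 0 versus row 1
print(first_1bit_position(b1))
```

Because each row has exactly one value per attribute, two aligned vectors share at least one row, so their bitwise-AND is never empty.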

Page 13: Iceberg

Algorithm: Iceberg Processing with Vector Alignment and Dynamic Pruning

icebergPQ(attribute A, attribute B, threshold T)
Output: iceberg results

PQA.clear, PQB.clear
for each vector a of attribute A do
    a.count = BIT1_COUNT(a)
    if a.count >= T then
        a.next1 = first1BitPosition(a, 0)
        PQA.push(a)
for each vector b of attribute B do
    b.count = BIT1_COUNT(b)
    if b.count >= T then
        b.next1 = first1BitPosition(b, 0)
        PQB.push(b)
R = ∅

Page 14: Iceberg

a, b = nextAlignedVectors(PQA, PQB, T)
while a ≠ null and b ≠ null do
    PQA.pop
    PQB.pop
    r = BITWISE_AND(a, b)
    if r.count >= T then
        Add iceberg result (a.value, b.value, r.count) into R
    a.count = a.count - r.count
    b.count = b.count - r.count
    if a.count >= T then
        a.next1 = first1BitPosition(a, a.next1 + 1)
        if a.next1 ≠ null then
            PQA.push(a)

Page 15: Iceberg

    if b.count >= T then
        b.next1 = first1BitPosition(b, b.next1 + 1)
        if b.next1 ≠ null then
            PQB.push(b)
    a, b = nextAlignedVectors(PQA, PQB, T)
return R
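The pseudocode above can be turned into a runnable Python sketch. Several choices here are assumptions rather than part of the slides: bit vectors are Python ints with bit i standing for row i, the priority queues are `heapq` heaps ordered by `next1`, and `next_aligned`, which the slides call but do not define, is one plausible implementation that moves the lagging queue forward until the two tops align. The input columns are A and B of Table R from the example slides, and the query COUNT(*) > 2 is expressed as T = 3.

```python
import heapq

def first_one(bits, start=0):
    """Lowest set-bit position >= start, or None (bit i of the int = row i)."""
    rest = bits >> start
    if rest == 0:
        return None
    return start + (rest & -rest).bit_length() - 1

class Vec:
    """One bitmap vector with its remaining 1-bit count and first 1-bit position."""
    def __init__(self, value, bits):
        self.value, self.bits = value, bits
        self.count = bin(bits).count("1")
        self.next1 = first_one(bits)

def build_index(column):
    """Map each distinct value to an int whose bit i is set when row i has it."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index

def build_queue(index, T):
    """Heap of vectors with at least T one-bits (T >= 1), ordered by next1."""
    vectors = [Vec(value, bits) for value, bits in index.items()]
    heap = [(v.next1, v.value, v) for v in vectors if v.count >= T]
    heapq.heapify(heap)
    return heap

def next_aligned(pqa, pqb):
    """Assumed helper: advance the lagging queue top until both tops align."""
    while pqa and pqb:
        a, b = pqa[0][2], pqb[0][2]
        if a.next1 == b.next1:
            return a, b
        if a.next1 < b.next1:
            queue, lagging, target = pqa, a, b.next1
        else:
            queue, lagging, target = pqb, b, a.next1
        heapq.heappop(queue)
        lagging.next1 = first_one(lagging.bits, target)
        if lagging.next1 is not None:
            heapq.heappush(queue, (lagging.next1, lagging.value, lagging))
    return None, None

def iceberg_pq(index_a, index_b, T):
    pqa, pqb = build_queue(index_a, T), build_queue(index_b, T)
    results = []
    a, b = next_aligned(pqa, pqb)
    while a is not None and b is not None:
        heapq.heappop(pqa)
        heapq.heappop(pqb)
        r = a.bits & b.bits                 # Z = X AND Y
        r_count = bin(r).count("1")
        if r_count >= T:
            results.append((a.value, b.value, r_count))
        for vec, queue in ((a, pqa), (b, pqb)):
            vec.bits ^= r                   # dynamic pruning: X = X XOR Z
            vec.count -= r_count
            if vec.count >= T:
                vec.next1 = first_one(vec.bits, vec.next1 + 1)
                if vec.next1 is not None:
                    heapq.heappush(queue, (vec.next1, vec.value, vec))
        a, b = next_aligned(pqa, pqb)
    return results

# Columns A and B of Table R from the example slides.
col_a = ["A2", "A1", "A2", "A2", "A1", "A2", "A2", "A2", "A1", "A2", "A3", "A3"]
col_b = ["B2", "B3", "B1", "B2", "B3", "B1", "B2", "B1", "B3", "B2", "B1", "B1"]

results = iceberg_pq(build_index(col_a), build_index(col_b), T=3)
print(results)
```

On this data only three bitwise-AND operations are performed, one per reported group, whereas the pairwise approach would try all nine combinations of the A and B vectors.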

Page 16: Iceberg

Fig. Table R

A    B    C
A2   B2   1.23
A1   B3   2.34
A2   B1   5.56
A2   B2   8.36
A1   B3   3.27
A2   B1   9.45
A2   B2   6.23
A2   B1   1.98
A1   B3   8.23
A2   B2   0.11
A3   B1   3.44
A3   B1   2.08

Fig. Bitmap indices for A and B

A1   A2   A3        B1   B2   B3
0    1    0         0    1    0
1    0    0         0    0    1
0    1    0         1    0    0
0    1    0         0    1    0
1    0    0         0    0    1
0    1    0         1    0    0
0    1    0         0    1    0
0    1    0         1    0    0
1    0    0         0    0    1
0    1    0         0    1    0
0    0    1         1    0    0
0    0    1         1    0    0

Page 17: Iceberg

Example of Vector Alignment

SELECT A, B, COUNT(*)
FROM R
GROUP BY A, B
HAVING COUNT(*) > 2

Initial bitmap vectors

Priority Queue 1          Priority Queue 2
A2  1011 0111 0100        B2  1001 0010 0100
A1  0100 1000 1000        B3  0100 1000 1000
A3  0000 0000 0011        B1  0010 0101 0011

The number of 1s in A3 is not larger than 2, so A3 is pruned.

Page 18: Iceberg

Bitmap vectors after first alignment

Priority Queue 1          Priority Queue 2
A1  0100 1000 1000        B3  0100 1000 1000
A2  0010 0101 0000        B1  0010 0101 0011

B2 is removed.

Bitmap vectors after second alignment

Priority Queue 1          Priority Queue 2
A2  0010 0101 0000        B1  0010 0101 0011

Page 19: Iceberg

Experimental Evaluation

The experimental evaluation is based on:

1) Data size (number of tuples)

2) Time

3) Number of distinct values

4) Number of bitwise-AND operations

Page 20: Iceberg

Performance of Dynamic Pruning and Vector Alignment

Fig 5: Performance of icebergDP and icebergPQ (x-axis: number of tuples in millions; y-axis: time in seconds)

Page 21: Iceberg

IcebergPQ

IcebergPQ is faster than icebergDP.

IcebergPQ needs only 0.404 seconds to finish processing 1 million tuples.

IcebergPQ also scales well when the data size increases.

IcebergDP

IcebergDP is slower than icebergPQ.

IcebergDP needs 10.688 seconds to finish processing 1 million tuples, about 26 times longer than icebergPQ.

The performance of icebergDP is unacceptable for practical data sizes.

Page 22: Iceberg

Applications of Iceberg Query

1. Data Warehousing

2. Information Retrieval

3. Market Analysis

4. Data Mining

Page 23: Iceberg

Future Scope

Investigate the processing of iceberg queries without the anti-monotone property.

Find the optimal order in which to process the attributes (when there are three or more aggregation attributes) to gain better efficiency.

Handle data of such enormous size that the bitmap of a single column does not fit in main memory.

Page 24: Iceberg

References Used

1. B. He, H.-I. Hsiao, Z. Liu, Y. Huang, and Y. Chen, “Iceberg Query Evaluation Using Bitmap Index,” 2012.

2. F. Deliège and T.B. Pedersen, “Position List Word Aligned Hybrid: Optimizing Space and Performance for Compressed Bitmaps,” Proc. Int’l Conf. Extending Database Technology (EDBT), pp. 228-239, 2010.

3. A. Ferro, R. Giugno, P.L. Puglisi, and A. Pulvirenti, “BitCube: A Bottom-Up Cubing Engineering,” Proc. Int’l Conf. Data Warehousing and Knowledge Discovery (DaWaK), pp. 189-203, 2009.

4. M. Fang, N. Shivakumar, H. Garcia-Molina, R. Motwani, and J.D. Ullman, “Computing Iceberg Queries Efficiently,” Proc. Int’l Conf. Very Large Data Bases (VLDB), pp. 299-310, 1998.

5. K. Wu, E.J. Otoo, and A. Shoshani, “Optimizing Bitmap Indices with Efficient Compression,” ACM Trans. Database Systems, vol. 31, no. 1, pp. 1-38, 2006.

Page 25: Iceberg

THANK YOU!!!

