
Using Tiling to Scale Parallel Datacube Implementation

Ruoming Jin, Karthik Vaidyanathan, Ge Yang, Gagan Agrawal

The Ohio State University


Introduction to Data Cube Construction

Data cube construction involves computing aggregates for all values across all possible subsets of dimensions.

If the original dataset is n-dimensional, data cube construction computes and stores nCm m-dimensional arrays for each m with 0 ≤ m < n.

Three-dimensional data cube construction involves computing arrays AB, AC, BC, A, B, C and a scalar value all.
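The count of arrays can be checked with a short sketch. It only enumerates the names of the arrays for the three-dimensional case above; the function name `cube_arrays` is a hypothetical helper, not something from the talk.

```python
from itertools import combinations
from math import comb

def cube_arrays(dims):
    """Group every proper subset of dims by its number of dimensions m."""
    n = len(dims)
    return {m: ["".join(c) or "all" for c in combinations(dims, m)]
            for m in range(n - 1, -1, -1)}

arrays = cube_arrays("ABC")
for m, names in arrays.items():
    # comb(3, m) is nCm, the number of m-dimensional arrays for n = 3
    assert len(names) == comb(3, m)
    print(m, names)
```

The empty subset is the scalar named "all" on the slide.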


Motivation

• Datasets for off-line processing are becoming larger.

– A system storing and allowing analysis on such datasets is a data warehouse.

• Frequent queries on data warehouses require aggregation along one or more dimensions.

– Data cube construction performs all aggregations in advance to facilitate fast responses to all queries.

• Data cube construction is a compute- and data-intensive problem.

– Memory requirements become the bottleneck for sequential algorithms.


Construct data cubes in parallel in cluster environments!


Our Earlier Work

• Parallel Algorithms for Small Dimensional Cases and Use of a Cluster Middleware (CCGRID 2002, FGCS 2003)

• Parallel algorithms and theoretical results (ICPP 2003, HiPC 2003)

• Evaluating parallel algorithms (IPDPS 2003)


Using Tiling

• One important issue: memory requirements for intermediate results

– From a sparse m-dimensional array, we compute m dense (m-1)-dimensional arrays

• Tiling can help scale sequential and parallel datacube algorithms

• Two important issues:

– Algorithms for using tiling

– How to tile so as to have minimum overhead
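To get a feel for the memory pressure, the sketch below counts the cells in the dense (m-1)-dimensional arrays computed from one m-dimensional input. The 128^4 shape is the dataset used in the experiments later in the talk. Counting cells rather than bytes is a simplifying assumption, and `child_sizes` is a hypothetical helper.

```python
from math import prod

def child_sizes(sizes):
    """Cell count of each child obtained by aggregating out one dimension."""
    return [prod(s for j, s in enumerate(sizes) if j != i)
            for i in range(len(sizes))]

sizes = (128, 128, 128, 128)
children = child_sizes(sizes)
print(children, sum(children))
```

Each child is dense, so its size does not shrink with the sparsity of the input.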


Outline

• Main Issues and Data Structures

• Parallel algorithms without tiling

• Tiling for Sequential Datacube construction

• Theoretical analysis

• Tiling for Parallel Datacube construction

• Experimental evaluation


Main Issues

• Cache and Memory Reuse

– Each portion of the parent array is read only once to compute its children. Corresponding portions of each child should be updated simultaneously.

• Using Minimal Parents

– If a child has more than one parent, it uses the minimal parent, which requires less computation to obtain the child.

• Memory Management

– Write back the output array to the disk if there is no child which is computed from this array.

– Manage available main memory effectively

• Communication Volume

– Appropriately partition along one or more dimensions to guarantee minimal communication volume.
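The minimal-parent rule can be sketched as follows, assuming that the cost of computing a child is proportional to the number of cells in the parent. The dimension sizes and the function name `minimal_parent` are hypothetical.

```python
from math import prod

def minimal_parent(child, sizes):
    """child: frozenset of dimension indices; sizes: dict dimension -> extent.
    Returns the parent (child plus one dimension) with the fewest cells."""
    candidates = [child | {d} for d in sizes if d not in child]
    return min(candidates, key=lambda p: prod(sizes[d] for d in p))

sizes = {1: 8, 2: 4, 3: 2}            # hypothetical extents with |D1| >= |D2| >= |D3|
parent = minimal_parent(frozenset({1}), sizes)
print(sorted(parent))
```

Under these sizes D1 is computed from D1D3 rather than D1D2, because D1D3 has fewer cells to scan.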


Aggregation Tree

Given a set X = {1, 2, …, n} and a prefix tree P(n), the corresponding aggregation tree A(n) is constructed by complementing every node in P(n) with respect to X.

[Figure: prefix lattice, prefix tree, and aggregation tree]
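The construction can be sketched in a few lines. The slide does not define the prefix tree, so the sketch assumes the usual rule: the root is the empty set, and the children of a node add one element larger than the node's maximum. Function names are hypothetical.

```python
def prefix_tree(n):
    """Map each node (a frozenset) to its list of children in P(n).
    Assumed rule: a child adds one element larger than the node's maximum."""
    tree = {}

    def grow(node):
        start = max(node, default=0) + 1
        kids = [node | {j} for j in range(start, n + 1)]
        tree[node] = kids
        for kid in kids:
            grow(kid)

    grow(frozenset())
    return tree

def aggregation_tree(n):
    """Complement every node of P(n) with respect to X = {1, ..., n}."""
    X = frozenset(range(1, n + 1))
    return {X - node: [X - kid for kid in kids]
            for node, kids in prefix_tree(n).items()}

def name(node):
    return "".join(f"D{i}" for i in sorted(node)) or "all"

A = aggregation_tree(3)
for node, kids in A.items():
    print(name(node), "->", [name(k) for k in kids])
```

For n = 3 this gives D1D2D3 with children D2D3, D1D3, D1D2; D2D3 with children D3 and D2; D1D3 with child D1; and D3 with child all, which agrees with the three-dimensional example used in the rest of the talk.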


Theoretical Results

• For data cube construction using the aggregation tree:

– The total memory requirement for holding the results is bounded.

– The total communication volume is bounded.

– It is guaranteed that all arrays are computed from their minimal parents.

– A procedure for partitioning input datasets exists that minimizes interprocessor communication.


Level One Parallel Algorithm

Main ideas

• Each processor computes a portion of each child at the first level.

• Lead processors have the final results after interprocessor communication.

• If the output is not used to compute other children, write it back; otherwise compute children on lead processors.


Example

• Assumption

– 8 processors

– Each of the three dimensions is partitioned in half

• Initially

– Each processor computes partial results for each of D1D2, D1D3 and D2D3

[Figure: aggregation tree for the three-dimensional array D1D2D3 with |D1| ≥ |D2| ≥ |D3|: D1D2D3; D2D3, D1D3, D1D2; D3, D2, D1; all]


Example (cont.)

• Lead processors for D1D2

Lead (l1, l2, 0)    (l1, l2, 1)
(0, 0, 0)           (0, 0, 1)
(0, 1, 0)           (0, 1, 1)
(1, 0, 0)           (1, 0, 1)
(1, 1, 0)           (1, 1, 1)

• Write back D1D2 on lead processors


Example (cont.)

• Lead processors for D1D3

Lead (l1, 0, l3)    (l1, 1, l3)
(0, 0, 0)           (0, 1, 0)
(0, 0, 1)           (0, 1, 1)
(1, 0, 0)           (1, 1, 0)
(1, 0, 1)           (1, 1, 1)

• Compute D1 from D1D3 on lead processors; write back D1D3 on lead processors

• Lead processors for D1

Lead (l1, 0, 0)     (l1, 0, 1)
(0, 0, 0)           (0, 0, 1)
(1, 0, 0)           (1, 0, 1)

• Write back D1 on lead processors
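The lead-processor sets in this example follow one pattern: a processor leads for an array if its label is 0 along every dimension that has been aggregated out. A minimal sketch of that reading, with 0-indexed dimensions and a hypothetical helper name:

```python
from itertools import product

def lead_processors(labels, aggregated_dims):
    """Processors whose label is 0 along every aggregated-out dimension."""
    return [p for p in labels if all(p[d] == 0 for d in aggregated_dims)]

# 8 processors: each of the three dimensions is partitioned in half.
labels = list(product((0, 1), repeat=3))

# Dimensions are 0-indexed here: index 1 is D2, index 2 is D3.
d1d2_leads = lead_processors(labels, [2])     # D1D2 aggregates out D3
d1d3_leads = lead_processors(labels, [1])     # D1D3 aggregates out D2
d1_leads = lead_processors(labels, [1, 2])    # D1 aggregates out D2 and D3
print(d1d2_leads)
print(d1d3_leads)
print(d1_leads)
```

The sketch only identifies the leads; the communication that delivers partial results to them is not modeled.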


Tiling-based Approach

• Motivation

– Parallel machines are not always available

– Memory of an individual computer is limited

• Tiling-based Approaches

– Sequential: Tile along dimensions on one processor

– Parallel: Partition among processors and on each processor tile along dimensions


Sequential Tiling-based Algorithm

• Main Idea

– A portion of a node in the aggregation tree is expandable (can be used to compute its children) once enough tiles of the portion of this node have been processed.

• Main Mechanism

– Each tile is given a label

[Figure: aggregation tree for the three-dimensional array D1D2D3 with |D1| ≥ |D2| ≥ |D3|: D1D2D3; D1D2, D1D3, D2D3; D1, D2, D3; all]

4 tiles, tile along D2, D3. Each tile is given a label (0, l2, l3):

Tile 0 – (0, 0, 0)
Tile 1 – (0, 0, 1)
Tile 2 – (0, 1, 0)
Tile 3 – (0, 1, 1)


Example

Tile       D2D3   D1D3                        D1D2
(0, 0, 0)  done   Portion 0                   Portion 0
(0, 0, 1)  done   Portion 1                   Portion 0, merge & expand
(0, 1, 0)  done   Portion 0, merge & expand   Portion 1
(0, 1, 1)  done   Portion 1, merge & expand   Portion 1, merge & expand
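The bookkeeping behind this example can be simulated. The sketch below assumes each tiled dimension is split into two halves, so a portion of a first-level child is complete after one tile if the aggregated-out dimension is untiled and after two tiles otherwise. Dimensions are 0-indexed and all names are hypothetical.

```python
from itertools import product

def simulate(tiled_dims, n_dims=3):
    """Process tiles in label order; record when each portion of each
    first-level child has received all of its tiles."""
    tiles = [t for t in product((0, 1), repeat=n_dims)
             if all(t[d] == 0 for d in range(n_dims) if d not in tiled_dims)]
    seen = {}     # (aggregated dim, portion label) -> tiles merged so far
    events = []
    for tile in tiles:
        for agg in range(n_dims):   # child obtained by aggregating out dimension agg
            portion = tuple(l for d, l in enumerate(tile) if d != agg)
            needed = 2 if agg in tiled_dims else 1
            seen[(agg, portion)] = seen.get((agg, portion), 0) + 1
            if seen[(agg, portion)] == needed:
                events.append((tile, agg, portion))
    return events

events = simulate(tiled_dims={1, 2})   # tile along D2 and D3
for tile, agg, portion in events:
    print(f"after tile {tile}: child without D{agg + 1}, portion {portion} complete")
```

A completed portion is the point at which the example marks "done" or "merge & expand".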


Tiling Overhead

• Tiling-based algorithm requires writing back and rereading portions of results

• Want to tile to minimize the overhead

• Tile the dimension Di 2^ki times

• We can compute the total tiling overhead as


Minimizing Tiling Overhead

• Tile the largest dimension first, change its effective size

• Keep choosing the largest dimension, till the memory requirements are below the available memory
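A minimal sketch of this greedy rule follows. The memory model is an assumption, not taken from the slides: it takes the requirement to be the total cell count of all first-level children at the current effective sizes. The tie-breaking rule and the budget of 2,000,000 cells are likewise arbitrary choices for illustration.

```python
from math import prod

def choose_tiling(sizes, available_cells):
    """Repeatedly halve the largest effective dimension until the assumed
    memory requirement fits. Returns (k_1, ..., k_n); Di is split 2**k_i ways."""
    eff = list(sizes)
    k = [0] * len(sizes)

    def required(e):
        # Assumed model: every first-level child is held in memory at once.
        return sum(prod(e) // s for s in e)

    while required(eff) > available_cells and max(eff) > 1:
        i = eff.index(max(eff))   # ties go to the lowest index (arbitrary)
        eff[i] //= 2
        k[i] += 1
    return tuple(k)

params = choose_tiling((128, 128, 128, 128), available_cells=2_000_000)
print(params)
```

Because halving a dimension changes which dimension is largest, the rule tends to spread the tiling over several dimensions when the sizes are close.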


Parallel Tiling-based Algorithm

• Assumptions

– Three-dimensional partition (0 1 1 1)

– Two-dimensional tiling (0 0 1 1)

[Figure: four-dimensional aggregation tree with |D1| ≥ |D2| ≥ |D3| ≥ |D4|: D1D2D3D4; D1D2D3, D1D2D4, D1D3D4, D2D3D4; D1D2, D1D3, D2D3, D1D4, D2D4, D3D4; D1, D2, D3, D4; all]

• Solutions

– Apply tiling-based approaches to first level nodes only

– Apply Level One Parallel Algorithm to other nodes
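The two parameter vectors can be expanded into processor and tile labels. The sketch assumes the partition vector is read the same way as the tiling vector, with entry k_i meaning dimension Di is split 2^k_i ways; the helper name `labels` is hypothetical.

```python
from itertools import product

def labels(params):
    """All labels for a parameter vector: position i ranges over 2**k_i values."""
    return list(product(*(range(2 ** k) for k in params)))

processors = labels((0, 1, 1, 1))   # three-dimensional partition
tiles = labels((0, 0, 1, 1))        # two-dimensional tiling on each processor
print(len(processors), "processors,", len(tiles), "tiles per processor")
```

Under this reading the partition (0 1 1 1) yields 8 processors, which matches the 8-processor, three-dimensional partition used in the parallel experiments.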


Choosing Tiling Parameters

[Figure: execution time in seconds versus sparsity level (25 and 5 percent) for the 128^4 dataset, 1 processor, 8 tiles. Series: sequential algorithm without tiling, three-dimensional tiling, two-dimensional tiling, one-dimensional tiling.]

Tiling overhead exists.

Tiling along multiple dimensions can reduce tiling overhead.


Parallel Tiling-based Algorithm Results

[Figure: execution time in seconds versus sparsity level (25 and 5 percent) for the 128^4 dataset, 8 processors, three-dimensional partition. Series: tiling parameter (1 0 0 1), tiling parameter (0 0 1 1), tiling parameter (0 0 0 2).]

The algorithm for choosing tiling parameters to reduce tiling overhead remains effective in parallel environments!


Conclusions

• Tiling can help scale parallel datacube construction

• Algorithms and analytical results in our work

