Parallel Adaptive Mesh Refinement Combined With Multigrid for a Poisson Equation

• CRTI-02-0093RD Project Review Meeting
• Canadian Meteorological Centre
• August 22-23, 2006


Outline

• Introduction

• Numerical methods
– Parallel load-balancing with space-filling curves (SFC)
– Data distribution
– Adaptive mesh refinement and derefinement
– Construction of the ghost boundary cells for each processor
– Discretization of the Poisson equation
– Parallel multigrid preconditioner with conjugate gradient method

• Numerical results

• Conclusions


Introduction

Structured adaptive mesh refinement (AMR)

Block-structured AMR

• Each node represents a block of cells.

• Advantage: the cells in each block can be organized as two- or three-dimensional arrays, so a structured-grid solver can be used for AMR without many modifications.

• Disadvantage: it is inflexible. A substantial number of cells can be wasted in regions where the flow is smooth.


Introduction

Cell-based AMR

• Each node represents a cell.

• In contrast to block-structured AMR, the mesh is refined only locally.

• It is more flexible and computationally more efficient than block-structured AMR.

• Cell-based AMR is chosen in the present work.


Introduction

An ordinary tree data structure

[Figure: a locally refined mesh and the corresponding tree, with cells numbered 1 to 17.]

• The cells can be organized as a quad-tree in 2D or an oct-tree in 3D.

• For an oct-tree, each cell needs 17 words of memory if the connectivity information is stored explicitly.

• If it is not stored explicitly, the tree may have to be traversed up to its root to find a required neighboring cell.

• This is difficult to parallelize because a search may extend from one processor to another.


Introduction

Fully Threaded Tree (FTT) structure

• All cells are grouped together as Octs.

• The memory overhead is significantly reduced.

• Maintaining an octal FTT requires about three words of memory per cell, instead of 17 words in the ordinary oct-tree.

[Figure: An oct-tree structure in an FTT. Each Oct holds neighbor cells 1 to 6, the parent cell ID, the Oct level, and the Oct position. Each of its cells 1 to 8 refers to its parent Oct and its child Oct.]


Introduction

Fully Threaded Tree (FTT) structure

• The west and south neighbors of cell 6 can be found directly through its explicitly stored parent Oct.

• The east and north neighbors of cell 6 can be found through the neighboring cells of its parent Oct.

• No more than one level of the tree needs to be traversed to access the neighbors of a cell.

[Figure: An example of accessing the neighbors of a cell without searching, using the FTT structure. The figure shows numbered cells at two consecutive levels (level and level+1) and the directions n, s, w, and e.]
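The neighbor rules above can be sketched in Python for the 2D (quadtree) analogue of the FTT. This is a minimal illustration, not the authors' implementation: the `Cell` and `Oct` classes, the `refine` helper, the direction codes, and the position encoding (bit 0 for x, bit 1 for y) are all assumptions made for the example.

```python
# Direction codes (hypothetical convention): west, east, south, north.
W, E, S, N = 0, 1, 2, 3


class Cell:
    def __init__(self, name, oct=None, pos=0):
        self.name = name
        self.oct = oct      # Oct that contains this cell (None for the root)
        self.pos = pos      # position in the Oct: bit 0 = x side, bit 1 = y side
        self.child = None   # child Oct if this cell is refined


class Oct:
    """In 2D an Oct groups the four children of one parent cell."""

    def __init__(self, parent, neighbors):
        self.parent = parent
        self.neighbors = neighbors  # neighbors of the parent cell, indexed by direction
        self.cells = [Cell(f"{parent.name}.{p}", self, p) for p in range(4)]
        parent.child = self


def neighbor(cell, d):
    """Return the neighbor of `cell` in direction d (same level or one level coarser)."""
    oct_ = cell.oct
    if oct_ is None:
        return None  # root cell: every side is a domain boundary
    axis_bit = 1 if d in (W, E) else 2
    toward_high = d in (E, N)
    on_high_side = bool(cell.pos & axis_bit)
    if on_high_side != toward_high:
        # The neighbor is a sibling stored in the same Oct.
        return oct_.cells[cell.pos ^ axis_bit]
    # Otherwise go through the neighbor of the parent cell stored in the Oct.
    coarse = oct_.neighbors[d]
    if coarse is None or coarse.child is None:
        return coarse  # domain boundary, or a leaf one level coarser
    return coarse.child.cells[cell.pos ^ axis_bit]


def refine(cell):
    """Split a cell, recording the neighbors of the cell in the new Oct."""
    return Oct(cell, [neighbor(cell, d) for d in (W, E, S, N)])
```

In this sketch a lookup touches at most the cell's own Oct and one neighboring Oct, which mirrors the statement that no more than one level of the tree is traversed.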


Introduction

• Objective: propose a new parallel approach to the AMR code based on the FTT data structure.


Numerical methods

Parallel load-balancing with space-filling curves (SFC)

• An SFC is chosen as the grid partitioner because of its mapping, compactness, and locality properties.

• The points in the higher-dimensional space can be mapped to corresponding points on a line.

• Only the coordinates of a point in the higher-dimensional domain are required to compute its location on the 1D line.

• In the Hilbert ordering, all adjacent neighbors on the 1D line are face-neighboring cells in the higher-dimensional domain (locality).

[Figure: Space-filling curves in two dimensions, with cells numbered 0 to 3, 0 to 15, and 0 to 63 along the curves: (a) Hilbert or U ordering, (b) Morton or N ordering.]
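As an illustration of how a curve index can be computed from integer cell coordinates alone, here is a small Python sketch of both orderings on a 2^bits by 2^bits grid. The function names and the `bits` parameter are assumptions for this example; the Hilbert routine is the widely used bit-manipulation formulation, not necessarily the one used in the presented code.

```python
def morton_index(x, y, bits):
    """Morton (N or Z) ordering: interleave the bits of x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def hilbert_index(x, y, bits):
    """Hilbert (U) ordering of cell (x, y) on a 2**bits by 2**bits grid."""
    n = 1 << bits
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or reflect the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d
```

Sorting the cells of a grid by `hilbert_index` gives a sequence in which consecutive cells always share a face, whereas consecutive Morton indices can jump between cells that do not touch by a face (for example, indices 1 and 2 on a 2 by 2 grid).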


Numerical methods

Parallel load-balancing with space-filling curves (SFC)

• The different colors correspond to the partitions on the different processors.

• Only leaf cells are shown in the left figure.

[Figure: Two-dimensional adaptive grids partitioned on four processors with the Hilbert SFC.]
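Once the leaf cells are ordered along the curve, partitioning reduces to cutting a 1D sequence into contiguous pieces. The sketch below assumes every leaf cell carries the same amount of work and that each cell is identified by its SFC key; `partition_by_curve` is a hypothetical helper, not a routine from the presented code.

```python
def partition_by_curve(keys, nprocs):
    """Map each SFC key to a rank by cutting the sorted keys into
    nprocs contiguous chunks whose sizes differ by at most one."""
    ordered = sorted(keys)
    n = len(ordered)
    return {key: i * nprocs // n for i, key in enumerate(ordered)}
```

Because each chunk is contiguous on the curve, the locality of the Hilbert ordering tends to keep each processor's cells spatially compact.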


Numerical methods

Data distribution

• A unique global ID, rather than a local ID on each processor, is used to identify each cell.

• The processor ID is not stored for each cell; it can be computed from the cell's spatial coordinates using the SFC.

• A hash table is used to store the cells and Oct structures on each processor.

• If the Hilbert SFC marks a cell for migration to another processor, both the data in the cell and the corresponding Oct structures have to be migrated.

[Figure: The global ID is used to identify each cell.]
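A minimal Python sketch of this bookkeeping follows. It assumes, for illustration only, that the global ID of a cell is its SFC key, that each rank owns a contiguous key range described by a sorted list `splits`, and that the local hash table is a plain `dict`; `owner_rank` and `migrate` are hypothetical names.

```python
from bisect import bisect_right


def owner_rank(key, splits):
    """splits[r] is the first SFC key owned by rank r + 1 (sorted ascending)."""
    return bisect_right(splits, key)


def migrate(local_cells, my_rank, splits):
    """Split a rank's hash table into the cells it keeps and the cells
    it must send, grouped by destination rank."""
    keep, outgoing = {}, {}
    for gid, data in local_cells.items():
        dest = owner_rank(gid, splits)
        if dest == my_rank:
            keep[gid] = data
        else:
            outgoing.setdefault(dest, {})[gid] = data
    return keep, outgoing
```

Since the owner is recomputed from the key, no per-cell processor ID has to be stored or updated after repartitioning; only the `splits` list changes.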


Numerical methods

Adaptive mesh refinement and derefinement

• Constraint: neighboring cells may not differ by more than one level.

• When cell A is marked to be refined:
– Check the neighboring cells of the parent cell of cell A (i.e., cells B and D). If these neighbors are leaves, they are marked to be refined.
– If cells B and C belong to two different processors, send the global ID of the neighbor of the parent cell of cell B to the processor where cell C resides.

[Figure: An example showing how to flag cells to be refined over 2 processors, in panels (a), (b), and (c) with cells A, B, C, and D. Configurations that violate the constraint are labeled "Not allowed cells".]
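The flagging rule can be sketched serially in Python. This is an illustration under stated assumptions, not the parallel algorithm of the slides: cells are identified by hypothetical `(level, i, j)` tuples on a quadtree, `leaves` is a set of leaf cells, and the exchange of global IDs between processors is left out.

```python
def parent(cell):
    level, i, j = cell
    return (level - 1, i // 2, j // 2)


def face_neighbors(cell):
    """Same-level face neighbors of a cell that lie inside the domain."""
    level, i, j = cell
    n = 1 << level
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < n and 0 <= nj < n:
            yield (level, ni, nj)


def flag_for_refinement(leaves, seeds):
    """Return all leaves that must be refined so that, after refining the
    seeds, no two face neighbors differ by more than one level."""
    flagged = set()
    stack = [c for c in seeds if c in leaves]
    while stack:
        cell = stack.pop()
        if cell in flagged:
            continue
        flagged.add(cell)
        if cell[0] == 0:
            continue  # the root has no parent and no neighbors
        for nb in face_neighbors(cell):
            # A coarser leaf touching this cell is a neighbor of its parent.
            coarse = parent(nb)
            if coarse in leaves and coarse not in flagged:
                stack.append(coarse)
    return flagged
```

The flags propagate: a coarse neighbor that is flagged may in turn force its own coarser neighbors to be refined, which is why the sketch uses a work stack.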


Numerical methods

Adaptive mesh refinement and derefinement

[Figure: The mesh on 4 processors (a) before and (b) after enforcing the refinement constraints.]


Numerical methods

Adaptive mesh refinement and derefinement

• When cell A is marked to be coarsened:
– All the children of cell A should be leaves.
– If any neighboring cell is not a leaf, check its nearest two children.

• If the nearest two children are not leaves and are not marked to be coarsened, cell A cannot be coarsened.

[Figure: An example showing how cell A is coarsened without violating the constraint.]
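The coarsening test can be written as a short serial check. As a sketch only: cells are hypothetical `(level, i, j)` tuples on a quadtree, `leaves` is the set of leaf cells, `marked` is the set of refined cells already marked to be coarsened, and the mesh is assumed to satisfy the one-level constraint beforehand.

```python
def children(cell):
    level, i, j = cell
    return [(level + 1, 2 * i + a, 2 * j + b) for b in (0, 1) for a in (0, 1)]


def can_coarsen(cell, leaves, marked=frozenset()):
    """True if the children of `cell` can be removed without creating
    face neighbors that differ by more than one level."""
    if any(ch not in leaves for ch in children(cell)):
        return False  # all children must be leaves
    level, i, j = cell
    n = 1 << level
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ni, nj = i + di, j + dj
        if not (0 <= ni < n and 0 <= nj < n):
            continue  # domain boundary
        if (level, ni, nj) in leaves:
            continue  # a leaf neighbor at the same level is always fine
        # The neighbor is refined: look at its two children that touch `cell`.
        if di != 0:
            a = 0 if di == 1 else 1
            near = [(level + 1, 2 * ni + a, 2 * nj + b) for b in (0, 1)]
        else:
            b = 0 if dj == 1 else 1
            near = [(level + 1, 2 * ni + a, 2 * nj + b) for a in (0, 1)]
        for ch in near:
            if ch not in leaves and ch not in marked:
                return False  # a refined child would end up two levels finer
    return True
```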


Numerical methods

Construction of the ghost boundary cells for each processor

• The corresponding Oct data structure has to be generated so that the boundary cells can find their neighboring cells.

• Seven cells in each neighbor direction need Oct A to find their neighboring cells.

• The Hilbert coordinates of all neighboring cells are computed to obtain their processor IDs.

• The data in Oct A is sent to the processors where the related neighboring cells reside.

[Figure: The neighboring cells (B, C, D) related to Oct A in the FTT data structure.]


Numerical methods

Construction of the ghost boundary cells for each processor

• The ghost boundary cells for each processor can be determined from the Oct data structures.

[Figure, panels (a), (b), (c): The local leaf cells together with their corresponding ghost leaf boundary cells on two processors.]


Numerical methods

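Discretization of the Poisson equation

• Poisson equation:

$$\nabla^2 \phi = f$$

• The discretization is second-order accurate.

• The gradient flux $F_e$ through face e is approximated from the values at the node E and the auxiliary node P':

$$F_e = S_e \left( \frac{\partial \phi}{\partial x} \right)_e \approx S_e \, \frac{\phi_E - \phi_{P'}}{x_E - x_{P'}}$$

• The cell-centered gradients are used to approximate the value at the auxiliary node:

$$\phi_{P'} \approx \phi_P + (\nabla \phi)_P \cdot (\mathbf{x}_{P'} - \mathbf{x}_P)$$

• The least-squares approach is used to evaluate the cell-centered gradients.

[Figure: Approximation of the gradient flux $F_e$ based on values at the node E and the auxiliary node P', showing cell P, neighbor E, face e, and face area $S_e$.]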
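To make the two ingredients concrete, the following Python sketch fits a least-squares gradient at a cell center and then evaluates a face flux through an auxiliary node. It is an illustration under assumptions, not the authors' code: the face normal is taken along x, P' is placed at the x position of P and the y position of E, and the function names and argument layout are hypothetical.

```python
def least_squares_gradient(center, neighbors):
    """Fit (dphi/dx, dphi/dy) at `center` from neighbor values.
    center is (x, y, phi); neighbors is a list of (x, y, phi)."""
    x0, y0, p0 = center
    sxx = sxy = syy = bx = by = 0.0
    for x, y, p in neighbors:
        dx, dy, dp = x - x0, y - y0, p - p0
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        bx += dx * dp
        by += dy * dp
    det = sxx * syy - sxy * sxy
    if det == 0.0:
        raise ValueError("neighbors do not span both directions")
    # Solve the 2 by 2 normal equations by Cramer's rule.
    return ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)


def face_flux(center, grad, aux_xy, east, area):
    """Flux S_e * (phi_E - phi_P') / (x_E - x_P') through a face whose
    normal is along x. phi_P' is extrapolated from P with its gradient."""
    x0, y0, p0 = center
    ax, ay = aux_xy
    phi_aux = p0 + grad[0] * (ax - x0) + grad[1] * (ay - y0)
    xe, _, pe = east
    return area * (pe - phi_aux) / (xe - ax)
```

For a linear field both steps are exact, which is a convenient sanity check when the cells P and E sit at different refinement levels and their centers are not aligned.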


Numerical methods

Parallel multigrid preconditioner with conjugate gradient method

• Additive multigrid method:
– The smoothing can be performed simultaneously (in parallel) at all grid levels.
– It gives better parallel performance than the classical multigrid method.
– It does not converge if used as a stand-alone solver.
– It is therefore used as a preconditioner combined with the conjugate gradient method.

[Figure: A sketch of the V-cycle additive multigrid method, from the finest level to the coarsest level. S: smoother, D: defect, R: restriction, P: prolongation, +: sum.]
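The structure of an additive multigrid preconditioner inside conjugate gradients can be shown on a much simpler problem than the one in the slides. The sketch below is a deliberately reduced stand-in: it solves the 1D equation -u'' = f with zero Dirichlet boundaries on a uniform grid of 2^L - 1 points, uses one Jacobi sweep as the smoother on every level, and takes the restriction as the transpose of linear interpolation. All function names are hypothetical.

```python
import math


def apply_A(u, h):
    """Matrix-free 1D operator -u'' with zero Dirichlet boundaries."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / (h * h))
    return out


def restrict(r):
    """Transpose of linear interpolation: 2m + 1 fine values to m coarse values."""
    m = (len(r) - 1) // 2
    return [0.5 * r[2 * i] + r[2 * i + 1] + 0.5 * r[2 * i + 2] for i in range(m)]


def prolong(e):
    """Linear interpolation: m coarse values to 2m + 1 fine values."""
    fine = [0.0] * (2 * len(e) + 1)
    for i, v in enumerate(e):
        fine[2 * i] += 0.5 * v
        fine[2 * i + 1] += v
        fine[2 * i + 2] += 0.5 * v
    return fine


def additive_mg(r, h):
    """Additive multigrid: restrict the defect to all levels, smooth every
    level independently, then prolong and sum the corrections."""
    defects = [r]
    while len(defects[-1]) > 1:
        defects.append(restrict(defects[-1]))
    diag = 2.0 / (h * h)
    smoothed = []
    for d in defects:  # these sweeps are independent and could run in parallel
        smoothed.append([v / diag for v in d])
        diag *= 0.5  # diagonal of the Galerkin coarse operator P^T A P
    z = smoothed[-1]
    for s in reversed(smoothed[:-1]):
        z = [a + b for a, b in zip(prolong(z), s)]
    return z


def pcg(f, h, precond, tol=1e-8, max_iter=200):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    x = [0.0] * len(f)
    r = list(f)
    z = precond(r, h)
    p = list(z)
    rz = sum(a * b for a, b in zip(r, z))
    norm0 = math.sqrt(sum(v * v for v in f))
    for it in range(1, max_iter + 1):
        Ap = apply_A(p, h)
        alpha = rz / sum(a * b for a, b in zip(p, Ap))
        x = [a + alpha * b for a, b in zip(x, p)]
        r = [a - alpha * b for a, b in zip(r, Ap)]
        if math.sqrt(sum(v * v for v in r)) <= tol * norm0:
            return x, it
        z = precond(r, h)
        rz_new = sum(a * b for a, b in zip(r, z))
        p = [a + (rz_new / rz) * b for a, b in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

Because every level's contribution has the form P D^{-1} P^T, the summed operator is symmetric positive definite, which is what conjugate gradients requires of a preconditioner. A call such as `pcg(f, h, additive_mg)` with `h = 1 / (len(f) + 1)` runs the solver.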


Numerical results

• Consider a 2D Poisson equation:

$$\nabla^2 \phi = -\pi^2 (k^2 + l^2) \sin(\pi k x) \sin(\pi l y), \quad \text{with } k = l = 3$$

• The computational domain is $[-0.5, 0.5]^2$.

• Neumann boundary conditions are used on the four boundaries.

• The parallel efficiency is tested on the computer clusters of SHARCNET.


Numerical results

• Uniform grids:

• Using more processors does not always reduce the time. For the cases with levels less than 8, the times increase from 16 to 32 processors because the communication time dominates.

• As the problem becomes bigger, the parallel efficiency improves because the computational time dominates.

• For the last case, a parallel efficiency of 98% is achieved with 64 processors.

The wall clock times (s) on regular grids from level 5 to 10 with up to 64 processors. Columns 1 to 64 give the number of processors; "-" marks runs for which no time is listed.

Level   Grids      1        2        4        8        16       32       64
5       1024       1.88     1.03     0.70     0.68     0.75     0.80     0.80
6       4096       7.52     3.75     2.23     1.75     1.38     1.53     1.23
7       16384      33.50    15.73    8.24     5.02     3.50     3.76     2.51
8       65536      158.11   73.32    34.21    18.23    10.02    7.01     5.23
9       262144     -        353.51   176.01   85.76    42.04    21.49    12.47
10      1048576    -        -        -        381.52   192.72   92.76    48.70


Numerical results

• AMR grids:

• The leaf cells are refined if the refinement criterion is larger than its mean value.

• For problems with large grid sizes, the times decrease monotonically as the number of processors increases.

• For the last case, a parallel efficiency of 106% (>100%) is achieved because the cache memory is used more efficiently when the grid size on each processor becomes smaller.

The wall clock times (s) on AMR grids with up to 64 processors. Columns 1 to 64 give the number of processors; "-" marks runs for which no time is listed.

Grids      1        2        4        8        16       32       64
604        1.26     0.71     0.49     0.55     0.58     0.51     0.57
2320       4.89     2.54     1.44     1.31     1.20     1.23     1.31
10120      23.10    10.88    5.52     3.56     2.67     2.72     2.61
40048      109.98   53.42    24.53    12.53    7.41     5.22     3.87
160288     -        342.25   165.71   73.94    36.23    19.66    11.43
639376     -        -        -        388.14   187.12   87.58    45.62
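The quoted efficiencies can be checked from the tabulated times. The slides do not state the baseline, so the sketch below assumes that efficiency is measured relative to the run with the fewest processors available for a given grid (8 processors for the largest grids); `parallel_efficiency` is a hypothetical helper.

```python
def parallel_efficiency(times):
    """times maps processor count to wall clock seconds.
    Efficiency is relative to the run with the fewest processors."""
    p0 = min(times)
    return {p: (times[p0] * p0) / (t * p) for p, t in times.items()}


# Largest uniform grid (level 10) and largest AMR grid (639376 grids).
uniform_level10 = {8: 381.52, 16: 192.72, 32: 92.76, 64: 48.70}
amr_largest = {8: 388.14, 16: 187.12, 32: 87.58, 64: 45.62}

eff_uniform = parallel_efficiency(uniform_level10)[64]
eff_amr = parallel_efficiency(amr_largest)[64]
```

Under that assumption the values round to 98% for the uniform grid and 106% for the AMR grid, matching the figures quoted for the last case of each table.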


Numerical results

• The grid partitioning and mapping times using the Hilbert SFC:

• The percentage increases slightly when more processors are used, because a large amount of data has to be migrated across a larger number of processors.

• The ratio of the load-balancing time to the total computational time is only 0.22% in the case of 64 processors.

• The proposed method is very efficient.

The wall clock times associated with the load-balancing procedure for an adaptive grid (grid size 40048) on different numbers of processors.

Processors       1    2      4      8      16     32      64
Time (s)         0    0.039  0.021  0.011  0.010  0.0092  0.0085
Percentage (%)   0    0.073  0.085  0.088  0.13   0.18    0.22
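The percentages can be related to the AMR timing table by dividing each load-balancing time by the total wall clock time of the 40048-grid case on the same number of processors. This pairing is an assumption made here for illustration; the variable names are hypothetical.

```python
# Load-balancing times (s) and total wall clock times (s) for the 40048-grid case.
balance_time = {2: 0.039, 4: 0.021, 8: 0.011, 16: 0.010, 32: 0.0092, 64: 0.0085}
total_time = {2: 53.42, 4: 24.53, 8: 12.53, 16: 7.41, 32: 5.22, 64: 3.87}

# Share of the total time spent on load balancing, in percent.
percentage = {p: 100.0 * balance_time[p] / total_time[p] for p in balance_time}
```

With these inputs the 64-processor share comes out at about 0.22%, and the other entries agree with the tabulated percentages to within rounding of the listed times.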


Conclusions

• The FTT data structure is used to organize the adaptive meshes because of its low memory overhead and because neighboring cells can be accessed without searching.

• The Hilbert SFC approach is used to dynamically partition the adaptive meshes.

• The numerical experiments show that the proposed parallel Poisson solver is highly efficient.

