
AIAA-2000-0999

Multigrid Preconditioning of Krylov Subspace Methods for CFD Applications

Jason Smith and Ismail Celik

West Virginia University Department of Mechanical and Aerospace Engineering, Morgantown, WV 26506-6106

38th Aerospace Sciences Meeting & Exhibit

10-13 January 2000 / Reno, NV

(c) 2000 American Institute of Aeronautics & Astronautics or published with permission of author(s) and/or author(s)' sponsoring organization.


For permission to copy or republish, contact the American Institute of Aeronautics and Astronautics 1801 Alexander Bell Drive, Suite 500, Reston, VA 20191


AIAA-2000-0999

MULTIGRID PRECONDITIONING OF KRYLOV SUBSPACE METHODS FOR CFD APPLICATIONS

Jason Smith* and Ismail Celik**

West Virginia University Department of Mechanical and Aerospace Engineering

Morgantown, WV 26506-6106

ABSTRACT

This paper presents a more general computational fluid dynamics (CFD) application of multigrid (MG) preconditioned Krylov subspace methods than has previously been performed. The problem solved is similar to the three-dimensional Poisson pressure problem, the most CPU-intensive part of any CFD code. In addition, application of MG preconditioning to a Krylov subspace method other than the conjugate gradient method is presented. It is shown that MG preconditioning of Krylov subspace methods yields a faster solution methodology than both standard Incomplete LU (ILU) preconditioning and a basic Full Multigrid (FMG) scheme. In addition, the potential for MG implementation in more sophisticated CFD solvers, such as finite element or fully coupled finite volume methods, is discussed.

INTRODUCTION

The coefficient matrices generated by CFD methods are typically sparse, i.e. the vast majority of the matrix coefficients are zero. Solutions of linear systems in CFD are usually performed with either multigrid (MG) methods or Krylov subspace methods. Both of these methodologies can be easily configured to take advantage of the sparseness of the matrices. Of the former group the Full Multigrid (FMG) method (Lonsdale, 1993) is popular. Examples of the latter group include the Incomplete Cholesky Conjugate Gradient (ICCG) method (Kershaw, 1978) and the Generalized Minimal Residual (GMRES) method (Saad and Schultz, 1986).

* Assistant Research Professor    ** Professor

Release A: Copyright © 1999 The American Institute of Aeronautics and Astronautics, Inc. All rights reserved.

The convergence of Krylov subspace methods is directly dependent on the quality of the preconditioning. Preconditioning involves calculating an approximate inverse. This approximate inverse is employed either directly or indirectly to make the system easier to solve. The effort required to solve a linear system can be measured by the condition number κ. Though there are different ways to calculate the condition number of a matrix, typically the ratio of the largest to the smallest eigenvalue is used. For systems where there exists a wide range of eigenvalues, the condition number can be very large. The larger the condition number, the more difficult the system is to solve. For problems frequently encountered in CFD, especially those dominated by Neumann-type boundary conditions, the condition number grows as the number of grid cells is increased. As the number of grid cells grows the matrix becomes increasingly stiff and more difficult to solve. It should be noted that this is the case with the linear system from which the pressure field is derived in CFD. As such it is important for efficiency considerations that the pressure solver be robust.
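As an illustration of the definition just given (an editorial aside, not from the paper), the short Python fragment below estimates κ for a small symmetric positive definite matrix as the ratio of its extreme eigenvalues and compares it with the 2-norm condition number reported by NumPy.

    import numpy as np

    # 1D Poisson-type matrix as a small symmetric positive definite example.
    n = 50
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

    eig = np.linalg.eigvalsh(A)            # eigenvalues, sorted ascending
    kappa = eig[-1] / eig[0]               # kappa = largest / smallest eigenvalue
    print(kappa, np.linalg.cond(A, 2))     # the two values agree for this matrix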

In this paper a review of both MG and Krylov subspace methods is given. In addition, a review of frequently employed preconditioning schemes is presented. Several different matrix solution algorithms are then applied to a standard benchmark three-dimensional Poisson-type problem with characteristics similar to the pressure/pressure correction equations encountered in CFD. It is shown that MG preconditioning of several representative Krylov methods yields a class of solution methods that are very efficient. These methods exhibit near grid-independent scaling. This behavior is observed even though the MG scheme that is used as the preconditioner does not exhibit grid-independent convergence when used as a stand-alone solver.

MULTIGRID METHODS

It is well known that iterative methods most rapidly reduce the components of the error whose frequencies correspond to the grid on which they are applied. MG techniques exploit this characteristic by solving the system on multiple grids. Though MG methods are fast, they tend to be problem specific. This requires some tuning by the user to get optimum performance. However, MG methods tend to be independent of problem size and the matrix need not be symmetric.

Often the coarse grids are chosen a priori by knowing the structure of the finest grid. The governing equations are then discretized on each of these grids. If this procedure is employed then the method is referred to as a geometric multigrid method. However, for problems where the grid is non-uniform, choosing the coarse grid matrices can be difficult if not impossible. This demands a procedure where the coarse grid matrices can be generated automatically. An approximation can be made to generate the coarse grid matrices by using the fine grid matrix itself.


The coarse grid matrices can be automatically generated with a procedure known as Galerkin coarse grid approximation. The procedure is as follows:

A_c = R A P'    (1)

where R is the restriction matrix and P' is the prolongation matrix. MG methods that employ this methodology are often called Algebraic Multigrid (AMG) methods. The matrix coarsening procedure can be further simplified by performing the above procedure by simple node or cell averaging. This is especially easy for cell centered quantities on structured meshes (e.g. pressure in staggered solvers). A coarse grid control volume can be visualized as being formed from eight neighboring fine grid control volumes. The coarse grid can then be approximated by appropriately averaging the fine grid coefficients to obtain the coarse grid coefficients. This is a far more computationally efficient procedure than actually performing the matrix operations given by Equation 1. The only drawback of this approach is that the restriction/prolongation operations tend to be of relatively low order. The low order of the restriction/prolongation may adversely affect convergence of the MG solver, as is discussed in more detail later in this paper.
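As a concrete illustration of Equation 1 (an editorial sketch rather than code from the paper), the Python fragment below forms the Galerkin coarse grid operator using a piece-wise constant restriction that averages pairs of fine grid cells and a prolongation that injects each coarse value back into them. The 1D case is used only so the sparsity pattern stays easy to inspect; the 3D analogue averaging eight cells is the one described in the text.

    import numpy as np
    import scipy.sparse as sp

    def galerkin_coarse_operator(A_fine, n_fine):
        """A_c = R A P' (Equation 1) for a 1D cell-centered grid, n_fine even.

        R averages pairs of fine cells (piece-wise constant restriction);
        P' injects each coarse value into both of its fine cells."""
        n_coarse = n_fine // 2
        coarse_of = np.repeat(np.arange(n_coarse), 2)   # coarse index of each fine cell
        fine = np.arange(n_fine)
        R = sp.csr_matrix((np.full(n_fine, 0.5), (coarse_of, fine)),
                          shape=(n_coarse, n_fine))
        P = sp.csr_matrix((np.ones(n_fine), (fine, coarse_of)),
                          shape=(n_fine, n_coarse))
        return (R @ A_fine @ P).tocsr()

    # Toy fine grid operator: 1D Poisson-type matrix.
    n = 8
    A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
    print(galerkin_coarse_operator(A, n).toarray())

The simple cell-averaging shortcut described in the text avoids forming R and P' explicitly, but produces the same kind of low-order coarse operator.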

Typically the smoothing operations in MG methods are performed by simple iterative methods. Point iteration methods are preferable since they parallelize easily. However, when grids become non-uniform, MG methods that use point iteration smoothing deteriorate (Ferziger and Peric, 1997). The solution to this problem is to move to more implicit methods, such as line-implicit methods or, further, to Stone's method. The drawback of this approach is that implicit smoothers are difficult to parallelize.

Two different MG strategies are utilized in this study. The first is the V cycle, which is the simplest of the MG strategies (Bramble, 1993). This method starts by relaxing on the finest grid. The down-stroke of the V cycle is performed next. The residual is calculated on the current level and then restricted from the fine grid to the coarse grid. Smoothing is then performed to find the coarse grid correction at each level. At the coarsest grid the coarse grid correction is calculated exactly. The up-stroke of the cycle is calculated next. The coarse grid correction is interpolated (a process known as projection or prolongation) up to the next level, where it is added to the coarse grid correction calculated at that level on the down-stroke. The correction is smoothed at this level and then projected to the next level. This procedure is repeated until the fine grid is reached. When the fine grid is reached the coarse grid correction is added to the approximate solution and smoothed. This completes one full V cycle. The V cycle is illustrated in Figure 1.
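The following minimal recursive sketch (an editorial illustration; the paper itself gives no code) shows the V cycle just described. The list of per-level operators A, the smooth(), restrict() and prolong() helpers, and the use of a dense direct solve on the coarsest level are all assumptions made here for brevity; in the study itself the coarsest level is solved with SSOR-preconditioned BCGS.

    import numpy as np

    def v_cycle(level, A, b, x, smooth, restrict, prolong, n_pre=1, n_post=1):
        """One V cycle for A[level] x = b on a hierarchy of operators A[0..L].

        level 0 is the finest grid; the coarsest level is solved 'exactly'
        (here with a dense direct solve).  smooth, restrict and prolong are
        supplied by the caller."""
        if level == len(A) - 1:
            return np.linalg.solve(A[level], b)      # coarsest grid: solve exactly

        for _ in range(n_pre):                        # pre-smoothing (down-stroke)
            x = smooth(A[level], b, x)

        r_c = restrict(b - A[level] @ x)              # restrict the residual
        e_c = v_cycle(level + 1, A, r_c,              # coarse grid correction,
                      np.zeros_like(r_c),             # found recursively
                      smooth, restrict, prolong, n_pre, n_post)
        x = x + prolong(e_c)                          # prolongate correction (up-stroke)

        for _ in range(n_post):                       # post-smoothing
            x = smooth(A[level], b, x)
        return x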

Figure 1: Basic V cycle multigrid. (Legend: solve exactly; apply smoother; transfer residuals; transfer corrections.)

Figure 2: Basic full multigrid (FMG) cycle. (Legend: solve exactly; do nothing; smooth with V cycle; transfer residuals; transfer corrections.)

The FMG cycle (Figure 2) is a faster converging procedure than the V cycle MG. FMG takes advantage of the fact that it is cheaper to work on the coarse grid matrices than it is to operate on the fine grid matrices (Bramble, 1993). The FMG cycle can be visualized as being similar to the V cycle. The down-stroke of the cycle typically consists of restriction of the error only. The up-stroke of the cycle is similar to the up-stroke of the V cycle MG except that the smoothing operation at each level consists of a V cycle. Thus the V cycle replaces the single-step smoothing (e.g. SOR/GS-type iteration) that is typically used for the V cycle MG. The resulting coarse grid corrections are more accurate and a faster converging method is obtained.
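Under the same assumptions, the FMG driver can be sketched as below (again an editorial illustration, reusing the hypothetical v_cycle and transfer helpers from the earlier sketch): the right-hand side is restricted down to every level, the coarsest level is solved, and the result is prolongated upward with one V cycle serving as the smoothing step on each finer level.

    import numpy as np

    def full_multigrid(A, b, smooth, restrict, prolong):
        """One FMG pass over the hierarchy A[0..L], using v_cycle as sketched earlier."""
        n_levels = len(A)
        rhs = [b]
        for _ in range(n_levels - 1):                 # down-stroke: restrict the RHS
            rhs.append(restrict(rhs[-1]))

        x = np.linalg.solve(A[-1], rhs[-1])           # exact solve on the coarsest grid
        for level in range(n_levels - 2, -1, -1):     # up-stroke
            x = prolong(x)                            # first guess on the next finer level
            x = v_cycle(level, A, rhs[level], x,      # a V cycle replaces single-step smoothing
                        smooth, restrict, prolong)
        return x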

KRYLOV SUBSPACE METHODS

A large group of linear system solvers falls into a class of methods known as Krylov subspace methods, and these can be classified as quasi-Newton methods. These methods actually find the exact solution of a linear system in a finite number of


steps provided that the coefficient matrix is symmetric positive definite. However, this only occurs in exact arithmetic. Round-off errors ruin this finite termination property. Also, the matrix will rarely be symmetric in CFD applications. Though some methods, like the conjugate gradient method, can only be applied to symmetric systems, there exist generalizations such that non-symmetric systems can be solved. Even when one of these generalizations is utilized, non-symmetry of the system will ruin the finite termination property. Fortunately, the methods in this category often find a very good approximation of the solution in fewer steps than would be predicted by the finite termination property. As a result, Krylov subspace methods are applied more often as iterative solvers. The convergence properties for Krylov methods are quadratic (for linear positive definite systems) and do not require prior knowledge of eigenvalues. These procedures implicitly take into account the distribution of eigenvalues, though the convergence properties are still condition number dependent. Relaxation is not required, and these methods are easy to program and are vectorizable (Kershaw, 1978; Khosla and Rubin, 1987) and parallelizable (Tatebe, 1993). The latter point is very important due to the large systems that arise in CFD applications.

The advantages of Krylov subspace methods include quick convergence when applied to difficult (ill-conditioned) problems. They are particularly useful in finding solutions to problems that have sparse coefficient matrices. The disadvantage of Krylov subspace methods is that convergence is dependent on the quality of the preconditioning, a topic that will be addressed in the next section. Also, Krylov subspace methods typically require many full size scratch arrays, which can increase the memory requirements significantly. This problem is made more severe by preconditioning, which could require additional memory to store the approximate inverse.

Preconditioning

The goal of any matrix solution procedure is to efficiently solve the system

A x = b    (2)

where A is a square matrix, x is the unknown (solution) vector, and b is the right-hand-side vector. Preconditioning is a procedure where an approximate inverse of matrix A is calculated, here denoted as P. The resulting system is as follows:

P A x = P b    (3)

yielding the modified system of equations

gx = b’ (4)

where I' = P A is a matrix that is as close as possible to the identity matrix, and b' = P b is the modified right-hand-side vector. Applying preconditioning in the manner shown in Equation 3 is referred to as direct preconditioning. Direct preconditioning is seldom used when Krylov subspace methods are employed since the matrix multiplication is computationally expensive. In practice, Krylov subspace methods employ preconditioning to generate better search directions in an effort to minimize the residual error. This type of implementation warrants further explanation. The preconditioning vector r' can be defined as follows:

r' = P r    (5)

where r is the residual error vector. Preconditioning in this manner can be referred to as indirect preconditioning since the original matrix and right-hand-side vector are not altered in the manner prescribed by Equation 3.

The traditional way of generating a precondition matrix P involves finding an approximate inverse in one of several ways. The simplest preconditioning method is referred to as diagonal preconditioning. Here the reciprocal of the center diagonal of the matrix A is used as P. This approach is memory efficient since only the diagonal matrix coefficients are involved. However, it is a primitive form of preconditioning. More advanced procedures involve calculating an Incomplete LU decomposition (ILU), or an Incomplete Cholesky (IC) decomposition of the original matrix. The latter is a special case of ILU decomposition for symmetric matrices. Only ILU will be considered in this work since it is a more general method. Matrix A is approximately factored such that only non-zero elements that exist in the original matrix are considered. Further fill-in is sometimes employed (Brussino and Sonnad, 1989) to yield a better approximation, but will not be considered in this work. The precondition vector r' can then be found by considering the following system:

LU r' = r    (6)


The goal is to obtain r', in other words:

r' = (LU)^{-1} r    (7)

Solution of Equation 6 is typically performed in two steps:

1) forward substitute to solve L q = r    (8a)
2) back substitute to solve U r' = q    (8b)

Here q is an intermediate solution. Though this procedure often yields very good preconditioning, the generation of an incomplete factorization can be CPU intensive if the matrix does not have a definite banded structure that can be exploited. This can be the case with unstructured meshes. Traditionally ILU decomposition does not parallelize well, though recent work seems to have made progress in this area (Dutto et al., 1999). Also, the factored matrix now needs to be stored. For the three-dimensional Poisson equation approximated with central differences on a structured grid an additional seven full size arrays are required. Depending on the application this may be an unacceptable memory increase. To avoid this problem a slightly different approach to preconditioning is employed that requires only the original matrix A. The approach given by Equation 5 can be equivalently stated by relaxing on the following equation:

A r' = r    (9)

The simplest relaxation that can be performed is to apply several sweeps of a simple iterative solution procedure such as Jacobi, Gauss-Seidel (GS), or Successive Over-Relaxation (SOR). Jacobi iteration is as follows:

φ_P^k = ( -Σ_nb a_nb φ_nb^{k-1} + b_P ) / a_P    (10)

This approach to finding r' is appropriately referred to as Jacobi preconditioning. The convergence rate can be accelerated by replacing the values of φ_nb with the newest calculated values, which gives the Gauss-Seidel (GS) method. Over-relaxing the solution can further accelerate the convergence. This procedure is known as Successive Over-Relaxation (SOR). Both of these methods can be illustrated as follows:

φ_P^k = ω ( -Σ_nb a_nb φ_nb^{k'} + b_P ) / a_P + (1 - ω) φ_P^{k-1}    (11)

In Equation 11 the a's represent nodal coefficients, nb denotes neighboring nodes, and P denotes the center node. Here the iteration count k' is either k or k-1 depending on whether the particular neighboring node has or has not yet been updated in sweep k. This method reduces to GS if ω is set equal to 1.0. The optimum over-relaxation factor is greater than 1.0 but never exceeds 2.0. From multigrid theory it is known that the simple iterative methods given above will yield the high frequency components of the solution in relatively few iterations. On ill-conditioned problems the low frequency portion of the solution is often unobtainable with one of these simple iterative approaches. As such, the best approximation to the precondition vector that these schemes can provide is obtained with a relatively low number of iterative sweeps.
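As a concrete illustration (an editorial sketch, not code from the paper), the fragment below performs a few sweeps of Equation 11 on a dense matrix to approximate the precondition vector of Equation 9. Setting omega to 1.0 reduces it to Gauss-Seidel, and the small fixed sweep count reflects the observation above that only a few sweeps are worthwhile.

    import numpy as np

    def sor_precondition(A, r, omega=1.4, sweeps=2):
        """Approximate r' from A r' = r (Equation 9) by a few SOR sweeps
        (Equation 11), starting from r' = 0; omega = 1.0 gives Gauss-Seidel."""
        n = len(r)
        rp = np.zeros(n)
        for _ in range(sweeps):
            for i in range(n):                              # forward sweep over the nodes
                sigma = A[i, :] @ rp - A[i, i] * rp[i]      # neighbor contributions
                rp[i] = omega * (r[i] - sigma) / A[i, i] + (1.0 - omega) * rp[i]
        return rp

A second sweep in the reverse node ordering after each forward sweep would give the symmetric (SSOR) variant discussed below.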

The advantage of the GS/SOR approach is that convergence is accelerated over the Jacobi method. Also, only one array is required for the latter two methods since new values simply overwrite the old values. Two arrays are required for the Jacobi approach since two iteration levels must be stored. For parallel implementation the Jacobi method is desirable since there is no dependence on new values at the neighboring nodes. However, GS/SOR can be used in a parallel application if a multicolor (red-black) formulation is employed.

A further extension of this approach is to relax Equation 9 with a multigrid method (Ashby and Falgout, 1996; Tatebe, 1993). The motivation for MG preconditioning is that weaknesses exhibited by Krylov subspace methods can be compensated for by the strengths of MG methods. Likewise, the weaknesses exhibited by MG methods can be compensated for by Krylov-type acceleration. One of the weaknesses of MG techniques is that they can hang on systems where there exist a few outlying eigenvalues (i.e. a few very large or very small eigenvalues). MG techniques will still converge, but will be inefficient. The strength of MG methods is that the rate of convergence tends to be independent of problem size. The strength of Krylov subspace methods is that they effectively find good approximations to extreme eigenvalues in relatively few iterations. Kershaw (1978) describes this process in detail. The weakness of Krylov subspace methods is that the number of iterations required to find the full solution grows with problem size. By combining the two it is hoped that the best of both methods can be obtained. This would combine the near-linear scaling associated with MG methods with the insensitivity of Krylov methods to a few extreme eigenvalues. Another advantage of this mixed procedure is that the low order of the restriction/prolongation, which may cause AMG methods to be less efficient, has little effect when the MG method is used only as a preconditioning scheme.

For some problems that arise in CFD applications the coefficient matrix will be symmetric. Typically, those Krylov methods that do not require symmetry converge more quickly when the matrix is symmetric. In fact, for many Krylov subspace methods convergence is guaranteed only if the matrix is symmetric. This seldom causes problems in application in that the procedure rarely fails to find a solution. Depending on how it is applied, preconditioning may


destroy the symmetry of a symmetric matrix. One-directional sweeps of the Jacobi method will retain symmetry. However, one-directional sweeps of both GS and SOR will destroy symmetry. For those cases where the matrix is symmetric it is desirable to apply the preconditioning in such a way as to preserve symmetry. This can be done by using symmetric GS (SGS) and symmetric SOR (SSOR). Employing symmetric versions of these schemes is a simple procedure that entails making two separate sweeps. The first sweep goes from the first node in the matrix to the last. The second sweep reverses the sweep direction, going from the last node to the first. The two opposite-direction sweeps represent one iteration of the symmetric scheme. MG preconditioning can retain symmetry by ensuring that the relaxation scheme applied at each level is symmetric.

Preconditioned Conjugate Gradient (PCG)

The conjugate gradient method was originally proposed by Hestenes and Stiefel (1952). However, like most Krylov subspace methods, its performance is not good unless it is used in combination with a preconditioning procedure. The basic PCG method is as follows:

α_k = (r'_k, r_k) / (p_k, A p_k)    (12)

x_{k+1} = x_k + α_k p_k    (13)

r_{k+1} = r_k - α_k A p_k    (14)

Find the precondition vector r'_{k+1} from

A r'_{k+1} = r_{k+1}    (15)

by applying ILU or by relaxing with a symmetric scheme.

β_k = (r'_{k+1}, r_{k+1}) / (r'_k, r_k)    (16)

p_{k+1} = r'_{k+1} + β_k p_k    (17)

where x is the solution vector, p is the search direction, r is the residual, r' is the precondition vector, and α and β orthogonalize the new iteration values of a vector with respect to the previous iteration values.
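A compact sketch of the iteration defined by Equations 12-17 is given below (an editorial illustration; precondition(A, r) stands for any of the schemes described above, for instance the sor_precondition sketch or one V cycle, and the stopping test on the L1 residual norm is an assumption made here).

    import numpy as np

    def pcg(A, b, precondition, tol=1e-5, max_iter=500):
        """Preconditioned conjugate gradient for a symmetric positive definite A,
        following Equations 12-17; precondition(A, r) returns the vector r'."""
        x = np.zeros_like(b, dtype=float)
        r = b - A @ x
        rp = precondition(A, r)                 # r'_0
        p = rp.copy()
        rho = rp @ r                            # (r'_k, r_k)
        for _ in range(max_iter):
            Ap = A @ p
            alpha = rho / (p @ Ap)              # Equation 12
            x = x + alpha * p                   # Equation 13
            r = r - alpha * Ap                  # Equation 14
            if np.sum(np.abs(r)) < tol:         # L1 norm of the residual
                break
            rp = precondition(A, r)             # Equation 15
            rho_new = rp @ r
            beta = rho_new / rho                # Equation 16
            p = rp + beta * p                   # Equation 17
            rho = rho_new
        return x

Passing the sor_precondition sketch from above, or a routine that applies one V cycle, would give solvers analogous to the SSORCG and MGCG variants tested later in the paper.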

Preconditioned Bi-Conjugate Gradient Stabilized (PBCGS)

The CG method is probably the most efficient method in the class of Krylov subspace methods. However, it is only applicable to symmetric systems. In many applications the


matrix will be symmetric, or with some pre-processing can be made symmetric. Unfortunately this is not always the case. Whenever a non-symmetric discretization scheme is employed (such as an upwinding scheme) symmetry is lost. Many schemes have been proposed that work for non-symmetric as well as symmetric systems. Two such methods are the Generalized Minimal Residual method (GMRES) (Saad and Schultz, 1986) and the Preconditioned Bi-Conjugate Gradient Stabilized (PBCGS) method (Van Der Vorst, 1992). GMRES is popular for finite element applications, while PBCGS finds frequent application in fully coupled finite volume methods (Chin et al., 1992; Deng et al., 1994). The PBCGS method was chosen as a representative non-symmetric scheme and will be presented here.

The first step of the procedure is to initialize several vectors and parameters. The initial residual is calculated as follows:

r_0 = b - A x_0    (18)

Another vector, r*_0, is initialized such that it is an arbitrary vector satisfying the condition (r*_0, r_0) ≠ 0. Several parameters need to be initialized as follows: set ρ_0 = α_0 = ω_0 = 1 and v_0 = p_0 = 0. The second part of the procedure involves iterative calculation and is as follows:

ρ_k = (r*_0, r_{k-1})    (19)

β = (ρ_k / ρ_{k-1}) (α_{k-1} / ω_{k-1})    (20)

p_k = r_{k-1} + β ( p_{k-1} - ω_{k-1} v_{k-1} )    (21)

Find the precondition vector y from

A y = p_k    (22)

by applying ILU or by relaxing with a symmetric scheme.

v_k = A y    (23)

α_k = ρ_k / (r*_0, v_k)    (24)

s = r_{k-1} - α_k v_k    (25)

Find the precondition vector z from

A z = s    (26)

by applying ILU or by relaxing with a symmetric scheme.


t = A z    (27)

ω_k = (t, s) / (t, t)    (28)

x_k = x_{k-1} + α_k y + ω_k z    (29)

r_k = s - ω_k t    (30)

In the above formulation r, r*, p, s, t, v, y, z are scratch work arrays. As can be seen, the PBCGS method is considerably more complicated than the PCG method. Two precondition vectors are calculated instead of one. This means that more effort is required for each iteration than with the PCG method. For more details of this method the reader is referred to Van Der Vorst (1992).
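Rather than transcribing a second full algorithm, the sketch below (editorial, on a toy tridiagonal system) shows how the indirect preconditioning of Equation 9 can be attached to an off-the-shelf BiCGSTAB routine as a linear operator. A few Jacobi sweeps are used here purely so the fragment is self-contained; in the MGBCGS scheme of the paper a single V cycle would take their place.

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    # Toy sparse system standing in for the pressure matrix.
    n = 64
    A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
    b = np.ones(n)
    diag = A.diagonal()

    def apply_precondition(r):
        """Indirect preconditioning: approximate r' from A r' = r with a few
        Jacobi sweeps from a zero initial guess (a V cycle could replace this)."""
        r = np.ravel(r)
        rp = np.zeros_like(r)
        for _ in range(3):
            rp = rp + (r - A @ rp) / diag
        return rp

    M = spla.LinearOperator(A.shape, matvec=apply_precondition, dtype=float)
    x, info = spla.bicgstab(A, b, M=M)           # info == 0 indicates convergence
    print(info, np.sum(np.abs(b - A @ x)))       # L1 norm of the final residual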

BENCHMARK PROBLEM

For the most part MG preconditioning has only been applied to the CG method. As has been mentioned previously, the CG method is only applicable to symmetric matrices. Despite this constraint previous researchers have successfully applied MGCG to three-dimensional groundwater flow problems (Tatebe, 1993; Ashby and Falgout, 1996). Though the equation that is solved in this kind of application is of Poisson type, the boundary conditions were mostly if not all of Dirichlet type. Though these problems are difficult, they converge much faster than problems whose boundary conditions are entirely of Neumann type. Unfortunately, the Poisson pressure and pressure correction equations solved in CFD are of this type. In fact, the vast majority of the CPU time (up to 80% for some methodologies) is spent solving some kind of pressure equation. Thus efficient solution of this kind of problem is extremely important to the efficiency of a CFD solver. Though there are very efficient solvers that can be applied to specific kinds of grids (e.g. fast Fourier transforms on uniform grids), an efficient methodology that can be applied to any grid distribution would be very useful.

Unfortunately, many CFD applications involve matrices that are not symmetric. There have been few attempts at MG preconditioning for Krylov methods applicable to non-symmetric matrices. One notable exception can be found in Oosterlee and Washio (1998), who applied MG preconditioning to both PBCGS and GMRES. The problems solved were relevant to CFD, but were only two-dimensional.

A benchmark problem was desired that resembled the pressure equation in CFD. The benchmark problem chosen to test the different matrix solvers is that used by Ferziger and Peric (1997). The equation is of Poisson type with Neumann boundary conditions on all boundaries and is given by Equation 31.

∂²Φ/∂X² + ∂²Φ/∂Y² + ∂²Φ/∂Z² = sin(πX) sin(πY) sin(πZ)    (31)

The equation is solved on the domain (-0.5 < X,Y,Z < 0.5) for which the exact boundary condition is

∂Φ/∂n = 0    (32)

This is the approximate boundary condition that is often applied at solid boundaries for the pressure equations when the Navier-Stokes equations are solved. The solution to the above equation is given by:

Φ = -(1/(3π²)) sin(πX) sin(πY) sin(πZ) + C    (33)

The constant C in the above equation recognizes the fact that the solution of Equation 31 with Neumann boundary conditions is indeterminate, and thus a family of solutions exists differing only by a constant. For incompressible flows only the pressure gradient is important. Consequently, the particular solution that is found is irrelevant. The constant is determined by the initial guess of the solution. The solution was initialized to zero, which sets the constant C equal to zero.
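As a quick numerical check on Equations 31-33 (an editorial aside, not part of the original study), the fragment below samples the exact solution on a cell-centered grid and verifies that a second-order central-difference Laplacian reproduces the right-hand side in the interior to within truncation error.

    import numpy as np

    # Cell-centered uniform grid on (-0.5, 0.5)^3, as used for Problem 1.
    n = 32
    h = 1.0 / n
    c = np.linspace(-0.5 + 0.5 * h, 0.5 - 0.5 * h, n)          # cell centers
    X, Y, Z = np.meshgrid(c, c, c, indexing="ij")

    rhs = np.sin(np.pi * X) * np.sin(np.pi * Y) * np.sin(np.pi * Z)   # Eq. 31
    phi = -rhs / (3.0 * np.pi ** 2)                                   # Eq. 33 with C = 0

    # Central-difference Laplacian of the exact solution (interior cells only):
    lap = ((phi[2:, 1:-1, 1:-1] - 2 * phi[1:-1, 1:-1, 1:-1] + phi[:-2, 1:-1, 1:-1]) +
           (phi[1:-1, 2:, 1:-1] - 2 * phi[1:-1, 1:-1, 1:-1] + phi[1:-1, :-2, 1:-1]) +
           (phi[1:-1, 1:-1, 2:] - 2 * phi[1:-1, 1:-1, 1:-1] + phi[1:-1, 1:-1, :-2])) / h ** 2

    err = np.max(np.abs(lap - rhs[1:-1, 1:-1, 1:-1]))          # O(h^2) truncation error
    print(err)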

Equation 31 was discretized on two different problems. Problem 1 had a uniform grid. A total of four different grids were considered: 16³, 32³, 64³, and 128³. The goal here was to determine how each method scales with respect to CPU time and number of required iterations as the grid size is increased. Problem 2 utilized the finest grid (128³). However, the grid was made non-uniform by smoothly contracting the grid from the center of the domain to the borders. This is as would be the case for problems that include walls, since grid cells are typically clustered near the walls to capture the boundary layer. Each grid was characterized by the maximum grid aspect ratio. A total of four tests were performed on grids where the maximum aspect ratio was 1, 5, 10, and 20, respectively. The goal of this test was to determine how sensitive different solution strategies are to cell aspect ratios in non-uniform grids. Problems 1 and 2 are illustrated in Figures 3 and 4, respectively.

Many CFD methodologies employ a staggered grid arrangement with the pressure (and all other scalars) located at the center of the grid cells and the velocity located at cell faces. This arrangement prevents decoupling of the pressure and velocity fields. Due to this fact Equation 31 was discretized assuming that Φ is located at the center of the grid cells. The restriction/prolongation procedure is affected by this choice. Simple averaging of the eight neighboring fine grid values to obtain coarse grid values is used as the restriction. Straight injection of the values obtained on the coarse grid into the eight


fine grid cells that make up the coarse grid cell is used as the prolongation procedure. This is commonly referred to as piece-wise constant prolongation/restriction and is only first-order accurate (Xu and Pollard, 1997). Higher order restriction/prolongation can be used, but boundary conditions would need to be applied (Wesseling, 1988). In addition, the coarse grid matrices produced by the Galerkin coarse grid approximation may not have the same banded structure as the original matrix (Wesseling, 1988), and the symmetry of the fine grid matrix may not be preserved (Wesseling, 1987). Piece-wise constant prolongation/restriction is a natural selection for a staggered mesh, though its low order will affect the convergence of the MG solver. For the convergence of MG solvers to be independent of grid size, the additive order of the prolongation and restriction must be greater than the order of the differential equation to be solved (Wesseling, 1988). In the case of piece-wise constant restriction/prolongation the order of both the restriction and the prolongation is one, and the order of the differential equation given by Equation 31 is two. Thus the criterion stated above is not satisfied and grid-independent convergence is not to be expected. This could no doubt be improved by using a more accurate restriction/prolongation procedure, but that is left for future work.
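A minimal sketch of this piece-wise constant transfer pair for cell-centered data is given below (an editorial illustration; even grid dimensions are assumed here for brevity).

    import numpy as np

    def restrict_cell_avg(fine):
        """Piece-wise constant restriction: each coarse cell value is the average
        of the eight fine-grid cells it contains (cell-centered data, even dims)."""
        return 0.125 * (fine[0::2, 0::2, 0::2] + fine[1::2, 0::2, 0::2] +
                        fine[0::2, 1::2, 0::2] + fine[1::2, 1::2, 0::2] +
                        fine[0::2, 0::2, 1::2] + fine[1::2, 0::2, 1::2] +
                        fine[0::2, 1::2, 1::2] + fine[1::2, 1::2, 1::2])

    def prolong_injection(coarse):
        """Piece-wise constant prolongation: straight injection of each coarse
        value into the eight fine-grid cells that make up the coarse cell."""
        return np.repeat(np.repeat(np.repeat(coarse, 2, axis=0), 2, axis=1), 2, axis=2)

These are the kind of transfer operators that would be passed as restrict and prolong to the V cycle sketched earlier.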

Figure 3: Problem 1 illustrated with a 32³ grid.

Figure 4: Problem 2 illustrated with a 32³ grid.

IMPLEMENTATION

The two different Krylov subspace methods were preconditioned with three different strategies: SSOR (SSORCG, SSORBCGS), ILU (ILUCG, ILUBCGS), and MG (MGCG, MGBCGS). In addition to these methods, an FMG scheme was implemented. Thus a total of seven different matrix solution strategies were tested. It was determined that for all schemes utilizing SSOR an over-relaxation factor of 1.4 gave optimum results. Though PBCGS does not require symmetric preconditioning, it was observed that convergence was quicker when a symmetric scheme was used. Whether this is by chance or due to the symmetric nature of the benchmark remains to be determined.

All schemes that utilized an MG methodology were similar. The coarsest grid was solved exactly with SSORBCGS, and all other smoothing operations on the finer grids used one sweep of SSOR. All schemes that utilized MG used only three grids. This takes advantage of the fact that the coarse grid solver employed is very efficient for grids up to 32³ and would be efficient enough for grids as large as 64³. It was observed that the more grids were utilized, the poorer the performance. This was due to the cumulative degradation caused by the low order of the restriction/prolongation procedure. To minimize this effect, only three grids were utilized for the MG procedures. When MG was used as a preconditioner, only one V cycle was applied per preconditioning step.

It should be noted that the MG procedures implemented for this study are not necessarily optimized. However, any improvement in the MG solver would be accompanied by a corresponding improvement in the MG preconditioned schemes. Thus the conclusions would be


similar even if the MG solvers were improved. In fact, it is suspected that further improvement would be observed if either a W cycle or the FMG scheme were used as the preconditioner.

RESULTS

The seven different schemes outlined in the previous section were applied to Problems 1 and 2. The results are shown in Tables 1 and 2, respectively. All CPU times are wall clock times. The solutions were converged such that the L1 norm (Equation 34) was reduced by 5 orders of magnitude. This is a severe convergence criterion in that the error is reduced by only 3-4 orders of magnitude in typical CFD applications. This severe criterion makes for a good test of the solution procedures.

L1 = Σ_i |r_i|    (34)

The results obtained with the uniform grids (Problem 1) will be discussed first. It can be seen in Table 1 that the procedures that involve using the MG methodology are faster than those that do not. Also, the methods that utilize some form of MG procedure tend to be more insensitive to grid resolution as far as the number of required iterations is concerned. The SSOR/ILU preconditioned Krylov methods are sensitive to grid resolution. An interesting result of this study is that the MG preconditioned Krylov methods are more insensitive than the FMG method with respect to the number of iterations required to reach convergence. This can be attributed to the restriction/prolongation process, which has little effect as a preconditioning scheme but slows convergence in the FMG method. Of more importance is how the grid resolution affects the required CPU time. Ideally this should be linear. Two of the above schemes were selected to demonstrate how the CPU time scales with grid resolution and are shown in Figure 5. The three finest grids were plotted against grid resolution on a log-log plot to check the scaling. The smallest grid (16³) was neglected due to the fact that the CPU times listed were rounded by the computer to the nearest 0.1. The MG preconditioned schemes scaled with an exponent only slightly greater than one, whereas the other schemes scaled with exponents of up to approximately N^1.30. As can be seen, none of the methods scale linearly, though the MG preconditioned Krylov methods come close. This is an interesting result since the MG preconditioned results appear to scale linearly (or nearly linearly) with respect to the required number of iterations. It was discovered that the cause of this discrepancy lies not with the solution methodology, but with the implementation. The schemes were implemented in a manner that was not particularly memory efficient. The larger grids did not completely fit in RAM, requiring the use of some swap space. This slowed down the computations and prevented linear scaling. This problem could easily be overcome with a more efficient memory implementation (or larger memory).

The convergence of the various methods is shown in Figures 6 and 7. The SSOR preconditioned schemes converge more smoothly than do the ILU preconditioned schemes. As would be expected, the FMG scheme exhibited linear convergence. The schemes that utilize the MG preconditioning exhibited fairly linear convergence after the first iteration.

The last series of tests was performed on Problem 2. It was desired to determine how the grid aspect ratio affects the convergence of different solution methodologies. It is well known that MG methods that utilize point iteration smoothers are drastically affected by non-uniform grid resolution, whereas Krylov subspace methods are less affected. Though all methods deteriorated when applied to non-uniform grids, the MG preconditioned Krylov subspace methods were by far the most efficient. As was noted earlier, the schemes that utilize the MG methodology could be improved by using a more implicit smoother.

An interesting result of this work is the relatively good performance exhibited by the SSOR preconditioned Krylov schemes. In general these schemes performed almost as well as the more sophisticated ILU preconditioned schemes. This was achieved with a significant reduction in memory overhead compared to the ILU schemes, since no extra memory is required for the precondition matrix. However, these results were obtained with an optimum relaxation factor. It is doubtful that similarly good performance would be achieved on a more general matrix.

The computational efficiency of the various methods is not the only efficiency that needs to be considered. Memory overhead can become a consideration when large systems are being solved. MG preconditioned Krylov methods are more memory efficient than their ILU preconditioned counterparts. This is because storing the coarse grid matrices requires considerably less memory than storing the ILU factorization. For a structured grid this memory overhead is seven full size arrays for ILU versus slightly more than one for the MG preconditioning.


Figure 5: Log-log plot showing scaling of CPU time with the number of grid nodes for Problem 1.


Despite the advantage over ILU preconditioning, the main drawback of MG preconditioned Krylov methods is the memory overhead. Krylov methods require extra scratch arrays. This can become a severe limitation when large systems are being solved. The scheme requiring the least memory overhead is the FMG method, which required only one full size array and some smaller arrays for coarse grid operations. The basic PCG method requires three full-size scratch arrays, and the PBCGS method requires eight. For the MG preconditioned versions additional memory is required for the coarse grid operations; this amounts to slightly over one full array. The memory overhead for MGCG is acceptable for most applications. However, MGBCGS involves significant memory overhead, which may preclude this method for many applications. It is suspected that GMRES may be worse for large systems that require many iterations to reach convergence. From work using other preconditioning methods (Brussino and Sonnad, 1989), it is suspected that MG preconditioned BCGS will converge quicker than a similarly preconditioned GMRES procedure.

Figure 6: Convergence of the ILU and SSOR preconditioned solvers on the 128³ uniform grid (residual norm versus iteration number; curves: ILUCG, ILUBCGS, SSORCG, SSORBCGS).

Figure 7: Convergence of the FMG and MG preconditioned solvers on the 128³ uniform grid (residual norm versus iteration number).

CONCLUSIONS

It can be concluded that MG preconditioned Krylov methods are efficient solvers for stiff matrices that arise from CFD applications. Of greater interest to the CFD community is accelerating convergence of more sophisticated CFD solvers, such as finite element or fully coupled finite volume methods that utilize unstructured grids. The FMG approach has been shown to be successful on CFD problems on unstructured meshes (Lonsdale, 1993). In addition, MG preconditioning of the conjugate gradient method has been shown to be successful on elliptic problems solved with unstructured finite element techniques (Globisch and Nepomnyaschikh, 1998; Xu, 1996). It should be noted that the MG techniques used by the above authors were different from that used in this work. There have been many recent successful applications of MG techniques to unstructured meshes. Given this fact and the success of the MG preconditioning demonstrated in this work, implementation of one of the above unstructured MG methodologies as a preconditioner for a generalized Krylov method (e.g. GMRES or PBCGS) in more sophisticated CFD solvers would likely provide significant acceleration in convergence of these methods.

ACKNOWLEDGMENTS

This investigation was conducted partially under the sponsorship of the US DOD Army Research Office through the EPSCoR program (Grant No. DAAH04-96-1-0196).

NOMENCLATURE

A  Coefficient matrix
a_P  Diagonal coefficient
a_nb  Neighboring coefficients
b  Right-hand-side vector
CFD  Computational Fluid Dynamics
FMG  Full Multigrid
GS  Gauss-Seidel iteration
I'  Approximate identity matrix
IC  Incomplete Cholesky factorization
ILU  Incomplete LU factorization
L1  L1 error norm
LU  Incomplete LU factorization
MG  Multigrid
P  Preconditioning matrix
PBCGS  Preconditioned Bi-Conjugate Gradient Stabilized
PCG  Preconditioned Conjugate Gradient method
P'  Prolongation matrix


r'  Preconditioning vector
r  Residual vector
R  Restriction matrix
SGS  Symmetric GS
SOR  Successive Over-Relaxation
SSOR  Symmetric SOR
x  Solution vector
X, Y, Z  Coordinate directions

Greek
α, β  Orthogonalization factors
κ  Condition number
Φ  Variable for Jacobi, GS, or SOR iteration
ω  Over-relaxation factor

REFERENCES

Ashby, S., and Falgout, R. (1996) “A Parallel Multigrid Preconditioned Conjugate Gradient Algorithm for Groundwater Flow Simulations,” Nuclear Science and Engineering, Vol. 124, pp. 145-159.

Bramble, J. (1993) Multigrid Methods. John Wiley & Sons, New York.

Brussino, G. and Sonnad, V. (1989) “A Comparison of Direct and Preconditioned Iterative Techniques for Sparse, Unsymmetric Systems of Linear Equations,” Int. J. Num. Methods in Eng., Vol. 28, pp. 801-815.

Chin, P., D’Azevedo, E., Forsyth, P., and Tang, W. (1992) “Preconditioned Conjugate Gradient method for the Incompressible Navier-Stokes Equations,” Int. J. Num. Meth. in Fluids, Vol. 15, pp. 273-295.

Deng, G., Piquet, J., Queutey, P., and Visonneau, M. (1994) “A New Fully Coupled Solution of the Navier-Stokes Equations,” Int. J. Num. Meth. in Fluids, Vol. 19, pp. 605-639.

Dutto, L., Lepage, C., and Habashi, W. (1999) “Impact of Storage of Sparse Linear Systems in CFD Computations,” Paper AIAA-99-0298.

Ferziger, J. and Peric, M. (1997) Computational Methods for Fluid Dynamics. Springer-Verlag, New York.

Globisch and Nepomnyaschikh (1998) "The Hierarchical Preconditioning on Unstructured Grids," Computing, Vol. 61, pp. 307-330.

Hestenes, M. and Stiefel, E. (1952) "Methods of Conjugate Gradients for Solving Linear Systems," Nat. Bur. Standards J. Res., Vol. 49, pp. 409-436.

Kershaw, D. (1978) "The Incomplete Cholesky-Conjugate Gradient Method for the Iterative Solution of Systems of Linear Equations," J. Comp. Physics, Vol. 26, pp. 43-65.

Khosla, P. and Rubin, S. (1987) "Consistent Strongly Implicit Iterative Procedures for Two-Dimensional Unsteady and Three-Dimensional Space-Marching Flow Calculations," Computers and Fluids, Vol. 15, No. 4, pp. 361-377.

Lonsdale, R. (1993) “An Algebraic Multigrid Solver for the Navier-Stokes Equations on Unstructured Meshes,” Int. J. Num. Meth. Heat and Fluid Flow, Vol. 3, pp. 3-14.

Oosterlee, C. and Washio, T. (1998) "An Evaluation of Parallel Multigrid as a Solver and a Preconditioner for Singularly Perturbed Problems," SIAM J. Sci. Stat. Comput., Vol. 19, No. 1, pp. 87-110.

Saad, Y. and Schultz, M. (1986) "GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems," SIAM J. Sci. Stat. Comput., Vol. 7, No. 3, pp. 857-869.

Tatebe, O. (1993) "The Multigrid Preconditioned Conjugate Gradient Method," Sixth Copper Mountain Conference on Multigrid Methods, Vol. CP3224, pp. 621-634.

Van Der Vorst, H. (1992) "Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems," SIAM J. Sci. Stat. Comput., Vol. 13, No. 2, pp. 631-644.

Wesseling, P. (1988) "Cell-Centered Multigrid for Interface Problems," J. Comp. Physics, Vol. 79, pp. 85-91.

Wesseling, P. (1987) "Linear Multigrid Methods," in Frontiers in Applied Mathematics, Vol. 5, edited by McCormick (SIAM, Philadelphia).

Xu, J. (1996) "The Auxiliary Space Method and Optimal Multigrid Preconditioning Techniques for Unstructured Grids," Computing, Vol. 56, pp. 215-235.

Xu, H. and Pollard, A. (1997) "Large Eddy Simulation of Annular Duct Flow Using Parallel Computation," Advances in LES/DNS.


Table 1: Performance of various solvers on Problem 1.


Table 2: Performance of various solvers on Problem 2 (128³ non-uniform grid).

                 Aspect ratio 1        Aspect ratio 5        Aspect ratio 10       Aspect ratio 20
             # IT   CPU Time (s)   # IT   CPU Time (s)   # IT   CPU Time (s)   # IT   CPU Time (s)
SSORCG         81       373.0        98       444.5       148       663.0       188       841.2
SSORBCGS       57       463.1        55       446.9        99       799.7       129      1048.2
ILUCG         145       409.0       166       465.7       226       631.9       251       711
ILUBCGS        72       390.3       105       569.5       127       684.7       172       939.2
MGCG           11       128.9        14       175.3        20       261.6        31       432.7
MGBCGS          5       122.8         8       199.8        11       288.4        17       486.4
FMG            16       264.9        26       458.4        55      1040.6       146      3098.0
