
SIAM J. NUMER. ANAL. Vol. 16, No. 5, October 1979

© 1979 Society for Industrial and Applied Mathematics

0036-1429/79/1605-0010/$01.00/0

DYNAMIC ADI METHODS FOR ELLIPTIC EQUATIONS*

SAID DOSS† AND KEITH MILLER†

Abstract. This paper develops "dynamic" ADI methods involving a computerized strategy for completely automatic change of the iteration parameter $\Delta t$ in alternating direction implicit methods for linear or nonlinear elliptic equations. Computational results are presented for many large scale trial runs, corresponding to the finite difference equations for the divergence equation $\nabla \cdot (a \nabla u) = \phi$ on various regions, with various linear ($a(x, y)$) or nonlinear ($a(x, y, u)$) coefficient choices. Our experience is that DADI has always worked, and worked extremely well.

1. Rationale of the automatic stepsize changer. Consider the "elliptic" equation

(1.1)  $L(u) \equiv (A + B)u = f$

where $f$ is given and $A$ and $B$ are linear operators which are easily invertible and in some loose sense "negative." In the nonlinear case it will be the Fréchet derivative of $L$ which should have this $L' = A + B$ decomposition. The ADI approach (see Peaceman-Rachford [5] and Douglas-Rachford [3]) is (heuristically) to first convert (1.1) to the parabolic equation

(1.2)  $u_t = (A + B)u - f$

whose steady state ($t \to \infty$) solution solves (1.1). Starting from an initial guess $u^0$, one then discretizes (1.2) in time with a step size $\Delta t$, solving on odd numbered steps ($n + 1$) implicitly in $A$ and explicitly in $B$,

(1.3)  $u^{n+1} - u^{n} = \Delta t\,\{A u^{n+1} + B u^{n} - f\}$,

then reversing the process on even numbered steps ($n + 2$), solving implicitly in $B$ and explicitly in $A$,

(1.4)  $u^{n+2} - u^{n+1} = \Delta t\,\{A u^{n+1} + B u^{n+2} - f\}$.

The combined operations (1.3), (1.4) (always with the same $\Delta t$ in (1.3) as in (1.4)) make one double sweep of the ADI (alternating direction implicit) iteration.
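To make the double sweep concrete, here is a minimal Python sketch of (1.3)-(1.4) for one illustrative case. The setting is an assumption of this example rather than the paper's general one: $A$ and $B$ are taken to be the second differences in $x$ and $y$ on an $N \times N$ interior grid with zero Dirichlet data. The function names (`thomas`, `second_diff`, `solve_implicit`, `adi_double_sweep`) are hypothetical, and each implicit half step is solved line by line with the standard tridiagonal elimination.

```python
def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are unused)."""
    n = len(rhs)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / m
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / m
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def second_diff(u, h, axis):
    """Second difference along axis 0 (x) or 1 (y), with zero boundary values."""
    n = len(u)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if axis == 0:
                lo = u[i - 1][j] if i > 0 else 0.0
                hi = u[i + 1][j] if i < n - 1 else 0.0
            else:
                lo = u[i][j - 1] if j > 0 else 0.0
                hi = u[i][j + 1] if j < n - 1 else 0.0
            out[i][j] = (hi - 2.0 * u[i][j] + lo) / h**2
    return out


def solve_implicit(rhs, dt, h, axis):
    """Solve (I - dt*D) x = rhs, where D is the second difference along `axis`."""
    n = len(rhs)
    r = dt / h**2
    lower, diag, upper = [-r] * n, [1.0 + 2.0 * r] * n, [-r] * n
    out = [[0.0] * n for _ in range(n)]
    for k in range(n):
        if axis == 0:
            line = [rhs[i][k] for i in range(n)]
        else:
            line = list(rhs[k])
        sol = thomas(lower, diag, upper, line)
        for i in range(n):
            if axis == 0:
                out[i][k] = sol[i]
            else:
                out[k][i] = sol[i]
    return out


def adi_double_sweep(u, f, dt, h):
    """One double sweep: (1.3) implicit in A (x), then (1.4) implicit in B (y)."""
    n = len(u)
    bu = second_diff(u, h, 1)
    rhs = [[u[i][j] + dt * (bu[i][j] - f[i][j]) for j in range(n)] for i in range(n)]
    u1 = solve_implicit(rhs, dt, h, 0)
    au1 = second_diff(u1, h, 0)
    rhs = [[u1[i][j] + dt * (au1[i][j] - f[i][j]) for j in range(n)] for i in range(n)]
    return solve_implicit(rhs, dt, h, 1)
```

With $f = 0$ and $u$ a product of discrete sine modes, one call to `adi_double_sweep` should simply rescale the mode, which gives a quick check of an implementation.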

The true power of the ADI method on elliptic problems comes to bear only when one uses variable $\Delta t$. One chooses a short sequence of values, $0 < \Delta t_1 < \Delta t_2 < \cdots < \Delta t_r$, then scans through $r$ double sweeps of the ADI process, first with $\Delta t = \Delta t_1$, then with $\Delta t = \Delta t_2, \ldots, \Delta t_r$, thus completing one full ADI iteration. In this way one is of course not really attempting to solve the parabolic equation (1.2) accurately for finite times, but to reach the $t = \infty$ solution as quickly as possible; one hopes to damp out the high order components of the error very strongly with the small time steps, then to proceed to larger and larger time steps to damp out the lower order components. It is in the choice of the iteration parameters $\Delta t_1, \ldots, \Delta t_r$ that difficulties occur in the use of ADI methods. In computational practice one usually finds fast convergence within a fairly wide latitude of choices; but insagacious choice in some cases can lead (often without apparent rhyme or reason) to divergence or to very slow convergence.

The situation here seems strongly reminiscent of that occurring in the numerical solution of stiff systems of ODE's by stiffly stable implicit methods, in which the second author has some experience [4]. There also one needs to use small time steps initially to follow the fast transients of the solution down to near equilibrium, but thereafter one can often repeatedly double and double the step size, sometimes to tens of thousands of times its initial value, as the solution settles down to its slower, smoother components. The change of step size is of course decided automatically by the computer, based upon an estimate of the local per step truncation error. Such step size changers work, and work beautifully, in highly nonlinear and difficult problems, even though a proof that they would work is out of the question.

* Received by the editors February 24, 1978. This work was partially supported by the National Science Foundation under Grants MPS 73-08593, MCS 76-06967, and MCS 77-07585.

† Department of Mathematics, University of California, Berkeley, California 94720.

It occurred to us that an automatic step size changer might be equally powerful for the ADI method because: (i) it might be able to keep $\Delta t$ within a region of fast convergence, (ii) it might be able to recognize instabilities as they start to occur and bypass them by decreasing $\Delta t$, and finally, (iii) it would avoid the necessity of judicious a priori choice of the iteration parameters $\Delta t_i$, a chancy business even in the most practiced hands.

Our approach is quite similar to that often employed by step size changers for Runge-Kutta methods; one first passes from $u^n$ to $u^{n+4}$ by two successive double sweeps of ADI with step size $\Delta t$; one then backs up and (strictly for bookkeeping purposes) passes from $u^n$ to the value $\tilde{u}^{n+4}$ by a single ADI double sweep with step size $2\Delta t$. One then computes the test parameter

(1.5)  $TP = \|u^{n+4} - \tilde{u}^{n+4}\| / \|u^{n+4} - u^{n}\|$,

using the $l_2$ norm, of course. If $\Delta t$ is quite small, and one is solving the true parabolic equation (1.2) quite accurately, then TP (an estimate of the relative local truncation error) will of course be small. We, however, wish to accelerate convergence and attempt to push TP into an interval (.1, .3] where (analysis suggests) convergence for ADI should be quite rapid. Accordingly, we accept the present step (two double sweeps plus one bookkeeping double sweep) and proceed to the next step with $\Delta t$ increased, unchanged, or decreased, depending upon whether TP is too small, okay, or too large. If TP is much too large we reject the present step and begin anew at $u^n$ with a much smaller $\Delta t$.
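The bookkeeping in one DADI step can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' program: `double_sweep(u, dt)` is a hypothetical callable that performs one ADI double sweep on a flat list of unknowns, and `dadi_step` and `l2_norm` are names chosen here.

```python
def l2_norm(v):
    """Euclidean norm of a flat list of numbers."""
    return sum(x * x for x in v) ** 0.5


def dadi_step(u, dt, double_sweep):
    """Two double sweeps with dt, one bookkeeping double sweep with 2*dt.

    Returns (u_new, TP) with TP as in (1.5). The caller must handle the
    case u_new == u, where the denominator of (1.5) vanishes.
    """
    u2 = double_sweep(u, dt)
    u4 = double_sweep(u2, dt)
    u4_book = double_sweep(u, 2.0 * dt)  # bookkeeping value, used only in TP
    num = l2_norm([p - q for p, q in zip(u4, u4_book)])
    den = l2_norm([p - q for p, q in zip(u4, u)])
    return u4, num / den
```

The three sweeps account for the cost of one DADI step being roughly that of three ADI double sweeps.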

The analysis, in § 2, is based upon the assumptions (i) that $A$ and $B$ are commuting self adjoint negative definite matrices, and (ii) that the error at each step is concentrated in a single eigencomponent, corresponding to eigenvalues $-a$ for $A$ and $-b$ for $B$. Under these assumptions the convergence factor CF and the test parameter TP become functions of $(a\,\Delta t, b\,\Delta t)$ and we are able to devise a unified strategy of changing $\Delta t$ in dependence upon the value of the test parameter. The strategy adopted is the following: when TP falls in the intervals $(-\infty, .05]$, $(.05, .1]$, $(.1, .3]$, $(.3, .4]$, or $(.4, .6]$ respectively, one accepts the present step and changes $\Delta t$ by the factor of 4, 2, 1, 1/2, or 1/4 for the next step; when TP falls in $(.6, \infty)$, however, one rejects the present computation and starts the step anew with $\Delta t$ changed by the factor 1/16. (In § 7, based upon later experience, we suggest one slight improvement to this strategy which should be of interest to prospective users.)
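The decision rule just described is simple enough to state as code. The following Python sketch encodes the intervals and factors exactly as listed above; the function name `dadi_decision` and the `(accept, factor)` return convention are choices made for this illustration.

```python
def dadi_decision(tp):
    """Map the test parameter TP to (accept_step, factor_for_next_dt)."""
    if tp <= 0.05:
        return True, 4.0
    if tp <= 0.1:
        return True, 2.0
    if tp <= 0.3:
        return True, 1.0
    if tp <= 0.4:
        return True, 0.5
    if tp <= 0.6:
        return True, 0.25
    # TP in (.6, infinity): reject the step and restart it with dt/16.
    return False, 1.0 / 16.0
```

For instance, a test parameter of .042 leads to acceptance and a quadrupled $\Delta t$, while .765 leads to rejection.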

In § 3 we consider the case of a nonlinear operator $L$. Our method is akin to a Newton's method, involving the replacement of $L(u)$ for $u$ near $u^n$ by its linearization $L(u) \approx L(u^n) + L'(u^n)(u - u^n)$; here it is the Fréchet derivative $L'$ (i.e., the Jacobian matrix) which should have the decomposition $L'(u^n) = A(u^n) + B(u^n)$ analogous to that of the linear case. We now have several possible options, depending upon how often we choose to update the linearization: Option I updates after every single sweep, Option II after every double sweep, and we discuss several other possibilities. In all cases the computation of the test parameter TP and the strategy for changing $\Delta t$ remain the same as in the linear case.


The good news is that this strategy (based upon an analysis with such rigid assumptions, and upon some experimentation with a limited number of trial runs, but then fixed once and for all) appears to give exceptionally fine results in situations (linear and nonlinear) which are drastically beyond the realm of the assumptions. This of course has been the computational experience with ADI itself for nearly two decades, but our dynamic ADI (DADI) methods seem to work well even on those extreme problems where standard ADI fails (with a variety of supposedly careful choices of iteration parameters).

Our computational examples all involve solution of the standard finite difference equations (with a five point pattern) corresponding to the Dirichlet problem for the elliptic equation in divergence form,

(1.6)
$$\begin{aligned}
\nabla \cdot (a \nabla u) \equiv (a u_x)_x + (a u_y)_y &= \phi \quad \text{in the region } \Omega,\\
u &= g \quad \text{on } \partial\Omega.
\end{aligned}$$

The operators $A$ and $B$ then correspond to certain second order differences in the $x$ and $y$ directions respectively.

In § 4 we discuss the computational results for the linear case, $a = a(x, y)$, and discover that the convergence is exceedingly fast, even for strongly nonconstant and nonsmooth coefficients and for mesh regions with very jagged boundaries. We begin with constant coefficients on a 32 × 32 square mesh, and proceed to examples with very nonsmooth coefficients (varying randomly between 1 and 100, or varying abruptly from a constant .0001 in one portion to a constant 1 in another portion) and on regions which are very jagged (the points of an overlaying 40 × 40 mesh which lie inside the $45^\circ < \theta < 315^\circ$ pie-shaped sector of the unit disc, for example). In the process the convergence rate deteriorates a bit, but only by a factor of 2 or 3.

The computational results for the nonlinear case, $a = a(x, y, u)$, are discussed in § 5. We display a large variety of examples, on a 32 × 32 square mesh, on a 64 × 64 square mesh, and on our 40 × 40 pie-shaped mesh. Some of these examples can rightfully be called strongly nonlinear, with max/min of $a$ over the region ranging as high as 191 and with most of the variation in $a(x, y, u)$ being due to the nonlinearity. Our method converged in all cases tried, even though the initial guess ($u^0 \equiv 0$) is greatly in error; the rates of convergence observed seem not to be appreciably different from those in the linear case.

In § 6 we compare our new DADI methods with standard ADI (by rerunning with ADI some of the linear and nonlinear examples of §§ 4 and 5).

In § 7 we give a quick preview of some further results by Doss [2] in his Ph.D. thesis. These include application to equations with gradient dependent coefficients $a = a(x, y, \nabla u)$ (such as the equations for minimal surfaces, for capillary surfaces, and for magnetostatics), some alterations in the step size strategy, and development of "three directional" DADI for use with hexagonal meshes or with three dimensional problems.

We are extremely enthusiastic about the promise of these DADI methods. Our experience in brief is that DADI, on many scores of linear and nonlinear examples, has always worked, and worked quite well. It beats standard SOR and SLOR routines by large factors in efficiency. Even in the highly specialized constant coefficient case, tested on the Laplace difference equation on a 64 × 64 mesh, DADI (with $.5 \times 10^{-5}$ accuracy) turns out to be only about four times slower than the fastest known direct Poisson solver; although we have not actually performed such comparisons, we would conjecture that DADI is fully competitive with the capacitance matrix direct Poisson solvers for problems with highly nonrectangular mesh boundaries such as that of Example 2. We should also call attention of course to the great suitability of ADI and DADI type methods for modern computers with vectorized or parallel arithmetic and their ease of application to problems with nonconstant or nonlinear coefficients and to mesh regions with highly irregular boundaries or drastically stretched meshes.

2. Analysis and choice of step size strategy. The only really quantitatively satisfactory analysis of ADI ever devised requires that $A$ and $B$ be symmetric, negative definite, and commuting (hence they have a basis of common eigenspaces). In that case one easily analyzes the convergence factor by which each double sweep of ADI damps each eigencomponent of the error. For the analysis of a step size strategy we now make the even more sweeping assumption that the error at each step is concentrated mainly (e.g., for the analysis, completely) in a single eigenspace. Thus we assume that the error $e^n = u^n - u$ at the $n$th step is an eigenfunction for $A$ and $B$, corresponding to eigenvalues $-a$ and $-b$ respectively.

Since the true solution $u$ of (1.1) is a steady state solution of (1.3) and (1.4), the error $e^n$ also satisfies (1.3), (1.4) but with $f$ set $= 0$. We thus easily compute the convergence factor CF and the test parameter TP as functions of $a\,\Delta t$ and $b\,\Delta t$. The convergence factor for a single double sweep is given by

(2.1)  $e^{n+2} = \left(\dfrac{1 - a\,\Delta t}{1 + a\,\Delta t}\right)\left(\dfrac{1 - b\,\Delta t}{1 + b\,\Delta t}\right) e^{n} \equiv \rho(a\,\Delta t, b\,\Delta t)\, e^{n}$.
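For readers who want the intermediate step, the factor in (2.1) can be checked directly from (1.3) and (1.4). The lines below are a short verification under the assumptions just stated ($f = 0$, $A e = -a e$ and $B e = -b e$ for the error component); the intermediate quantity $e^{n+1}$ appears only for the purpose of this derivation.

```latex
\begin{align*}
e^{n+1} - e^{n} &= \Delta t\,\{-a\,e^{n+1} - b\,e^{n}\}
  &&\Longrightarrow\quad e^{n+1} = \frac{1 - b\,\Delta t}{1 + a\,\Delta t}\,e^{n},\\
e^{n+2} - e^{n+1} &= \Delta t\,\{-a\,e^{n+1} - b\,e^{n+2}\}
  &&\Longrightarrow\quad e^{n+2} = \frac{1 - a\,\Delta t}{1 + b\,\Delta t}\,e^{n+1},\\
e^{n+2} &= \frac{(1 - a\,\Delta t)(1 - b\,\Delta t)}{(1 + a\,\Delta t)(1 + b\,\Delta t)}\,e^{n}.
\end{align*}
```

The factor vanishes when $\Delta t = 1/a$ or $\Delta t = 1/b$, which is where a double sweep annihilates this eigencomponent.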

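Hence the convergence factor CF for two double sweeps, and the test parameter TP from (1.5), become

(2.2)  $CF \equiv \|e^{n+4}\| / \|e^{n}\| = \rho^2(a\,\Delta t, b\,\Delta t)$,

(2.3)  $TP = \dfrac{|\rho^2(a\,\Delta t, b\,\Delta t) - \rho(2a\,\Delta t, 2b\,\Delta t)|}{|\rho^2(a\,\Delta t, b\,\Delta t) - 1|}$.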
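Formulas (2.1)-(2.3) are easy to evaluate numerically. The Python sketch below is only an illustration of those formulas under the single eigencomponent assumption; the function names `rho`, `cf_model`, and `tp_model` are chosen here and do not come from the paper.

```python
def rho(x, y):
    """Double sweep factor of (2.1) with x = a*dt and y = b*dt."""
    return (1.0 - x) * (1.0 - y) / ((1.0 + x) * (1.0 + y))


def cf_model(a, b, dt):
    """Convergence factor (2.2) for two double sweeps with step dt."""
    return rho(a * dt, b * dt) ** 2


def tp_model(a, b, dt):
    """Test parameter (2.3); undefined at dt = 0, where the denominator vanishes."""
    r2 = rho(a * dt, b * dt) ** 2
    return abs(r2 - rho(2.0 * a * dt, 2.0 * b * dt)) / abs(r2 - 1.0)
```

As a hand check, for $a = b = 1$ and $\Delta t = 1$ one has $\rho(1, 1) = 0$ and $\rho(2, 2) = 1/9$, so CF is 0 and TP is 1/9, which lies inside the target interval (.1, .3].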

We now wish to investigate graphically whether it is possible (for all values of $a$ and $b$) to always make CF decently small merely by changing $\Delta t$ to keep TP within a certain range. Of course we can normalize by assuming $b \le a = 1$. We then consider the computer generated graphs of $(CF)^{1/2}$ and TP as functions of $\Delta t$ for many different values of $b$. Four of these graphs, on a log-log scale, for $b = 1, .5, .1, .01$ are shown in Figs. 1-4; we of course considered many other values of $b$ which are not shown. Notice that in all cases $(CF)^{1/2}$ is quite small (less than .1, and often considerably less) provided that TP is in the range (.1, .3]. These graphs also allow us to easily investigate the effect upon CF and TP of halving, doubling, or quadrupling, etc., of $\Delta t$. In each figure the intervals $(-\infty, .05]$, $(.05, .1]$, $(.1, .3]$, $(.3, .4]$, $(.4, .6]$, $(.6, \infty)$ imposed upon the vertical TP axis lead to subsets of the TP graph which we have labeled [4], [2], [1], [1/2], [1/4], and reject [1/16], corresponding to the factor by which our strategy will shift $\Delta t$ for the next step of DADI. Notice that in most cases the shifted $\Delta t$ falls pretty much into a region of fast convergence where we would want it to be.
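The same investigation can be carried out numerically instead of graphically. The Python sketch below follows a single eigencomponent $(a, b)$ through successive DADI steps, using (2.2), (2.3) and the interval strategy; it is a model experiment under the assumptions of this section, and the names `rho`, `decide`, and `simulate` are hypothetical.

```python
def rho(x, y):
    """Double sweep factor of (2.1)."""
    return (1.0 - x) * (1.0 - y) / ((1.0 + x) * (1.0 + y))


def decide(tp):
    """Interval strategy: returns (accept, factor for the next dt)."""
    for upper, factor in ((0.05, 4.0), (0.1, 2.0), (0.3, 1.0), (0.4, 0.5), (0.6, 0.25)):
        if tp <= upper:
            return True, factor
    return False, 1.0 / 16.0


def simulate(a, b, dt, tol=0.5e-5, max_steps=200):
    """Return (history, err); history lists (dt, TP, accepted) per DADI step."""
    err = 1.0
    history = []
    for _ in range(max_steps):
        r2 = rho(a * dt, b * dt) ** 2
        tp = abs(r2 - rho(2.0 * a * dt, 2.0 * b * dt)) / abs(r2 - 1.0)
        accepted, factor = decide(tp)
        history.append((dt, tp, accepted))
        if accepted:
            err *= r2  # an accepted step damps the component by CF = rho^2
        dt *= factor
        if err < tol:
            break
    return history, err
```

Calling `simulate(1.0, 1.0, 1e-4)` models the case of Fig. 1, and `simulate(1.0, 0.01, 1e-4)` the case of Fig. 4; printing `history` shows how $\Delta t$ is driven toward the region of fast convergence in this idealized setting.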

In designing this strategy we have always maintained a healthy bias toward smaller $\Delta t$, in the hopes of greater stability. This is perhaps best illustrated in Fig. 4 with $b = .01$; notice that there are two $\Delta t$'s at which CF plunges to zero, at $\Delta t' = 1/a$ and at $\Delta t'' = 1/b = 100\,\Delta t'$. Our strategy may allow $\Delta t$ to hover about the larger value $\Delta t''$ for a while, but if $\Delta t$ once gets a bit below $\Delta t''$ the strategy should repeatedly reject and reduce $\Delta t$ by drastic 1/16 factors until it brings $\Delta t$ all the way down into the neighborhood of the smaller value $\Delta t'$. One sees this apparently happening in some of the numerical examples; see Examples 2 and 3 for instance.


FIG. 1. TP and $CF^{1/2}$ vs. $\Delta t$ for eigenvalues $a = 1$, $b = 1$.

It is important that one not overcorrect in shifting $\Delta t$; for example, factors of 40, 20, 1, 1/20, 1/40, 1/160 in the strategy, in lieu of our present 4, 2, 1, 1/2, 1/4, 1/16, would undoubtedly lead to wild oscillations, throwing $\Delta t$ first too high, then rejecting and too low, then too high, etc. Here again our bias towards smaller $\Delta t$ is helpful: we have allowed a rather drastic reduction factor (1/16) when needed; this is often followed in practice (and by design) by a more gradual doubling or quadrupling of $\Delta t$ in successive steps back up toward its previous value. Our aim is always to well damp out the higher order eigencomponents before increasing $\Delta t$ to damp out the lower order eigencomponents.

FIG. 2. TP and $CF^{1/2}$ vs. $\Delta t$ for eigenvalues $a = 1$, $b = .5$.

FIG. 3. TP and $CF^{1/2}$ vs. $\Delta t$ for eigenvalues $a = 1$, $b = .1$.

FIG. 4. TP and $CF^{1/2}$ vs. $\Delta t$ for eigenvalues $a = 1$, $b = .01$.

3. The nonlinear case. We wish to solve $L(u) = f$ where the Fréchet derivative $L'$ has the decomposition $L'(u^n) = A(u^n) + B(u^n)$ analogous to that of the linear case. We begin (as is usual in Newton type methods) by replacing $L$ by its linearization about $u^n$. We then attempt to partially solve the resulting linearized problem,

(3.1)  $L'(u^n)(u - u^n) + L(u^n) - f = 0$,

by ADI iterations. We emphasize that we will only want to solve the linearized equation "part way"; after every few ADI iterations we will want to update the linearization.


Let us denote

(3.2)  $R^n \equiv f - L(u^n)$, $\quad A^n \equiv A(u^n)$, $\quad B^n \equiv B(u^n)$.

Hence "updating the linearization" of $L(u) = f$ involves computing new values of $R$, $A$, and $B$.

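We now have several possible options of how frequently to update the linearization. Option I updates after every single sweep:

(3.3)
$$\begin{aligned}
u^{n+1} - u^{n} &= \Delta t\,\{A^{n}(u^{n+1} - u^{n}) - R^{n}\},\\
u^{n+2} - u^{n+1} &= \Delta t\,\{B^{n+1}(u^{n+2} - u^{n+1}) - R^{n+1}\},\\
u^{n+3} - u^{n+2} &= \Delta t\,\{A^{n+2}(u^{n+3} - u^{n+2}) - R^{n+2}\},\\
u^{n+4} - u^{n+3} &= \Delta t\,\{B^{n+3}(u^{n+4} - u^{n+3}) - R^{n+3}\},
\end{aligned}$$

(3.4)
$$\begin{aligned}
\tilde{u}^{n+2} - u^{n} &= 2\Delta t\,\{A^{n}(\tilde{u}^{n+2} - u^{n}) - R^{n}\},\\
\tilde{u}^{n+4} - \tilde{u}^{n+2} &= 2\Delta t\,\{\tilde{B}^{n+2}(\tilde{u}^{n+4} - \tilde{u}^{n+2}) - \tilde{R}^{n+2}\}.
\end{aligned}$$

Option II (our favorite) updates the linearization after every double sweep. We begin with

(3.5)
$$\begin{aligned}
(u^{n+1} - u^{n}) &= \Delta t\,\{A^{n}(u^{n+1} - u^{n})\} - \Delta t\,R^{n},\\
(u^{n+2} - u^{n+1}) &= \Delta t\,\{B^{n}(u^{n+2} - u^{n}) + A^{n}(u^{n+1} - u^{n})\} - \Delta t\,R^{n}, \quad \text{etc.},
\end{aligned}$$

which for computational purposes is better written as

(3.6)
$$\begin{aligned}
(u^{n+1} - u^{n}) &= \Delta t\,\{A^{n}(u^{n+1} - u^{n})\} - \Delta t\,R^{n},\\
(u^{n+2} - u^{n}) &= \Delta t\,\{B^{n}(u^{n+2} - u^{n})\} + 2(u^{n+1} - u^{n}),\\
(u^{n+3} - u^{n+2}) &= \Delta t\,\{A^{n+2}(u^{n+3} - u^{n+2})\} - \Delta t\,R^{n+2},\\
(u^{n+4} - u^{n+2}) &= \Delta t\,\{B^{n+2}(u^{n+4} - u^{n+2})\} + 2(u^{n+3} - u^{n+2}),
\end{aligned}$$

(3.7)
$$\begin{aligned}
(\tilde{u}^{n+2} - u^{n}) &= 2\Delta t\,A^{n}(\tilde{u}^{n+2} - u^{n}) - 2\Delta t\,R^{n},\\
(\tilde{u}^{n+4} - u^{n}) &= 2\Delta t\,B^{n}(\tilde{u}^{n+4} - u^{n}) + 2(\tilde{u}^{n+2} - u^{n}).
\end{aligned}$$

In either option the computation of the test parameter TP and the strategy for changing $\Delta t$ remain the same as in the linear case.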
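The passage from (3.5) to (3.6) is a one line elimination. It is spelled out here as a check, using only the two equations of (3.5):

```latex
\begin{align*}
\Delta t\,A^{n}(u^{n+1} - u^{n}) - \Delta t\,R^{n} &= u^{n+1} - u^{n}
  && \text{(first line of (3.5))},\\
u^{n+2} - u^{n+1} &= \Delta t\,B^{n}(u^{n+2} - u^{n}) + (u^{n+1} - u^{n})
  && \text{(substituting into the second line)},\\
u^{n+2} - u^{n} &= \Delta t\,B^{n}(u^{n+2} - u^{n}) + 2\,(u^{n+1} - u^{n})
  && \text{(adding } u^{n+1} - u^{n} \text{ to both sides)}.
\end{align*}
```

Written this way, the second half of the double sweep needs neither $A^n$ nor $R^n$ again, only the increment already computed in the first half.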

There is also the possibility (Option III) of continuing one full DADI step (i.e., two double sweeps, one bookkeeping double sweep, and the decision on change of $\Delta t$) before updating the linearization, or even (Option IV) of continuing several full DADI steps before updating (which would be closer to a standard Newton's method). We have felt both of these options to be inadvisable (at least in the early stages of iteration when the changes in $u^n$, $A^n$ and $B^n$ are large) since they in no way permit the nonlinearity of the problem to enter in to the choice of the smallness of $\Delta t$.

There is also the possibility (Option V) of "modified Newton" type methods which update the residuals $R^n, R^{n+1}, \ldots$ in (3.3)-(3.5) at every sweep but which update the derivatives $A^n$, $B^n$ only infrequently.

It is well known, on computational experience with iterative methods for nonlinear equations, that one can often replace the exact derivative $L'(u^n) = A(u^n) + B(u^n)$ in the linearized equation (3.1) by an "approximate derivative" operator $L^*(u^n) = A^*(u^n) + B^*(u^n)$ which is really quite a bit different from $L'(u^n)$, but which nevertheless "heads the iteration in the right direction." This is especially true when combined with "underrelaxation," or, as in the present situation, when we solve the linearized equation only "part way." In § 5 we encounter a concrete example of this in which the true Fréchet derivative $A(u^n) + B(u^n)$ involves derivatives of the coefficients at $u^n$ but in which the "approximate derivative" $A^*(u^n) + B^*(u^n)$ is of simpler structure and involves merely "freezing the coefficients" at their fixed values $a(x, y, u^n)$. Of course we will have all the Options I-V of how often to update the "approximate linearization" $R$, $A^*$, $B^*$. We choose to call these frozen coefficient linearization methods as opposed to the true derivative linearization methods of previous paragraphs.

4. Trial runs in the linear case. As stated earlier, all our computational examples involve solution of standard finite difference equations corresponding to the Dirichlet problem (1.6). Let $(x_i, y_j)$ be the $i, j$th point in a square mesh of grid size $\Delta x = \Delta y = h$, let $u_{i,j} = u(x_i, y_j)$, let $\bar{\Omega}_h$ be a mesh region approximating $\bar{\Omega}$, let $\Omega_h$ be its mesh interior and $\partial\Omega_h$ its mesh boundary. For the sake of brevity we now write the finite difference equations for both the linear and nonlinear cases; they are:

(4.1)
$$\begin{aligned}
&\{a_{i,j+1/2}(u_{i,j+1} - u_{i,j}) - a_{i,j-1/2}(u_{i,j} - u_{i,j-1})\}\,h^{-2}\\
&\quad + \{a_{i+1/2,j}(u_{i+1,j} - u_{i,j}) - a_{i-1/2,j}(u_{i,j} - u_{i-1,j})\}\,h^{-2} = \phi_{i,j} \quad \text{for } (x_i, y_j) \text{ in } \Omega_h,\\
&u_{i,j} = g_{i,j} \quad \text{for } (x_i, y_j) \text{ on } \partial\Omega_h,
\end{aligned}$$

where

(4.2)  $a_{i,j\pm 1/2} = a(x_i, y_{j\pm 1/2}, \tfrac{1}{2}(u_{i,j\pm 1} + u_{i,j}))$, etc.

This can be written as

(4.3)  $L(u) = f$,

where $L$ is an operator mapping the space of "vectors" or "mesh functions" on $\Omega_h$ onto itself. In the linear case $f_{ij} = \phi_{ij}$ at points well interior to $\Omega_h$; at points of $\Omega_h$ adjacent to $\partial\Omega_h$, $f_{ij}$ also involves terms from the boundary data $g$. In the nonlinear case, $f_{ij} = \phi_{ij}$ on all $\Omega_h$ and the terms involving the boundary data $g$ have to be included in the nonlinear operator $L$.
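As an illustration of (4.1)-(4.2), the following Python sketch evaluates the left-hand side of (4.1) at the interior points of a square mesh. It assumes, for simplicity, a square mesh region with the boundary values stored in the outer ring of the array; the function name `apply_divergence_operator` and the argument layout are hypothetical.

```python
def apply_divergence_operator(u, a, h, x0=0.0, y0=0.0):
    """Evaluate the left-hand side of (4.1) at all interior mesh points.

    u is an (n+2) x (n+2) nested list with u[i][j] at (x0 + i*h, y0 + j*h),
    boundary values included; a(x, y, w) is the coefficient function.
    Returns an n x n nested list of interior values.
    """
    n = len(u) - 2
    out = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            x, y = x0 + i * h, y0 + j * h
            # Internodal coefficients as in (4.2): midpoint position and
            # the average of the two neighbouring values of u.
            a_up = a(x, y + h / 2, 0.5 * (u[i][j + 1] + u[i][j]))
            a_dn = a(x, y - h / 2, 0.5 * (u[i][j] + u[i][j - 1]))
            a_rt = a(x + h / 2, y, 0.5 * (u[i + 1][j] + u[i][j]))
            a_lf = a(x - h / 2, y, 0.5 * (u[i][j] + u[i - 1][j]))
            out[i - 1][j - 1] = (
                a_up * (u[i][j + 1] - u[i][j]) - a_dn * (u[i][j] - u[i][j - 1])
                + a_rt * (u[i + 1][j] - u[i][j]) - a_lf * (u[i][j] - u[i - 1][j])
            ) / h**2
    return out
```

Applied to a chosen coefficient function and a chosen mesh function $u$, the same routine can be used to manufacture a right-hand side for which the exact discrete solution is known.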

We now discuss some of the computational results for the linear case $a = a(x, y)$. One of the myths that seems to have grown up about ADI methods is that they work well only in the case of (nearly) constant coefficients and (nearly) rectangular mesh regions, because it is essentially only in such separable cases that the usual convergence proof involving commutativity of the negative definite symmetric difference operators $A$ and $B$ holds true; see Varga [7, p. 235] and Wachspress [8] for example. It is for this reason, therefore, that we choose to begin our examples with constant coefficients on a square mesh and proceed to examples with very nonsmooth coefficients and on regions with very jagged boundary. In the process the convergence rate deteriorates somewhat, but not greatly.

In the linear case it is sufficient to test our methods with the inhomogeneous terms $\phi_{ij}$ and boundary terms $g_{ij}$ (and hence also the solution) identically zero in (4.1), since one could always subtract off the true solution in advance. Thus the solution $u^n$ which we compute, starting from an initial function $u^0 \neq 0$, is actually the error function $e^n$.

In column 1 of all our examples we show the step number (recall that one DADI step involves two ADI double sweeps with $\Delta t$ plus one bookkeeping ADI double sweep with $2\Delta t$). In column 2 we show the present $\Delta t$, in column 3 the test parameter $TP = \|u^{n+4} - \tilde{u}^{n+4}\| / \|u^{n+4} - u^{n}\|$ computed at this step, in column 4 the local convergence factor $CF = \|e^{n+4}\| / \|e^{n}\|$ actually observed for this step, in column 5 the total reduction factor $RF = \|e^{n+4}\| / \|e^{0}\|$ by which the error has been reduced since the beginning, and in column 6 the decision (depending only upon the value of TP, remember) on acceptance or rejection and on the factor by which to change $\Delta t$ for the next DADI step. In most examples we will exhibit the table of these results only for a few of the ten to twenty steps of the calculation.

The calculation will always be allowed to run until the error has been reduced by a factor less than $.5 \times 10^{-5}$. The norm involved is always the $l_2$ (i.e., Euclidean, root mean square) norm of course. The central processing unit computational times shown are always on the CDC 7600 with the new FTN 4, OPT 2, optimized FORTRAN compiler at Lawrence Berkeley Labs. Some earlier runs with the older RUN 76 compiler were consistently about 2.3 times slower.

In Example 1a we consider the Laplace difference equation, $a \equiv 1$, on a 32 × 32 mesh ($h = 1/32$) overlaying the unit square. (This mesh region will be referred to henceforth as "the 32 × 32 unit square.") The initial error $u^0 = e^0$ takes on random values uniformly distributed in (0, 1) and the initial $\Delta t$ is chosen equal to $10^{-4} \approx .1(\Delta x)^2$. Notice that both steps 8 and 9, which have test parameters of .355 and .114 falling about in the desired (.1, .3] interval, turn out to have exceedingly good convergence factors of .013 and .0014 respectively. The error is reduced by a factor of $3.5 \times 10^{-8}$ in only 9 DADI steps (each corresponding to the computational effort of about 2.5 to 3 ADI double sweeps) with only .098 seconds of CDC 7600 CPU time.

EXAMPLE 1a
Laplace difference equation, $a \equiv 1$, on the unit square with $\Delta x = \Delta y = 1/32$. Initial error randomly distributed in (0, 1).

Step   Δt         TP      local CF   total RF   Change in next Δt
1      .100E-3    .042    .855       .855E+0    4
2      .400E-3    .130    .901       .771E+0
3      .400E-3    .022    .941       .725E+0    4
4      .160E-2    .054    .839       .609E+0    2
5      .320E-2    .030    .760       .463E+0    4
6      .128E-1    .029    .361       .167E+0    4
7      .512E-1    .011    .011       .196E-2    4
8      .204E+0    .355    .013       .254E-4
9      .102E+0    .114    .0014      .354E-7

Convergence (by factor of .354E-7) with 9 DADI steps in .098 seconds of CDC 7600 CPU time.

Standard ADI here converges (by a factor of .133E-5) with 33 double sweeps in .172 CPU seconds. The $\Delta t$'s are uniformly logarithmically distributed between $\Delta t_1 = .743$E-3 and $\Delta t_8 = .266$E+0.
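"Uniformly logarithmically distributed" means that successive parameters have a constant ratio. A minimal Python sketch of such a parameter sequence follows; the function name `log_spaced` is hypothetical, and the endpoints used in the usage line are the ones quoted above for Example 1a.

```python
def log_spaced(dt_first, dt_last, r):
    """Return r values from dt_first to dt_last with a constant ratio."""
    ratio = (dt_last / dt_first) ** (1.0 / (r - 1))
    return [dt_first * ratio**k for k in range(r)]


# Eight iteration parameters between the endpoints quoted for Example 1a.
parameters = log_spaced(0.743e-3, 0.266, 8)
```

One full standard ADI iteration then consists of one double sweep with each of these values in turn.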

Example 1b differs from 1a only in that the mesh spacing is 1/64 rather than 1/32. The test parameters and convergence factors observed are nearly the same as those in 1a. Indeed, notice the superb convergence factors in steps 8, 9, and 10. The error is reduced by a factor of $.324 \times 10^{-7}$ in only .42 seconds of CDC 7600 CPU time.

The best direct Poisson solvers known to the authors [1], [6] consume about .1 seconds on the same computer and compiler to solve this 64 × 64 problem. Hence, the direct solver is only about 4 times faster than our DADI. However, this is not a bad factor if one takes into account the great flexibility of DADI type methods to handle a wide variety of linear and nonlinear equations, with stretched meshes, and on practically any irregularly shaped domain.


EXAMPLE 1b
Laplace difference equation, $a \equiv 1$, on the unit square with $\Delta x = \Delta y = 1/64$. Initial error randomly distributed in (0, 1).

Step   Δt         TP      local CF   total RF   Change in next Δt
1      .100E-3    .222    .827       .827E+0
2      .100E-3    .057    .970       .803E+0    2
3      .300E-3    .031    .950       .769E+0    4
4      .800E-3    .061    .892       .686E+0    2
5      .160E-2    .036    .850       .583E+0    4
6      .640E-2    .051    .591       .345E+0    2
7      .128E-1    .011    .362       .125E+0    4
8      .512E-1    .011    .011       .146E-2    4
9      .204E+0    .355    .013       .190E-4
10     .102E+0    .114    .001       .324E-7

Standard ADI here converges (by a factor of .150E-5) with 41 double sweeps in .848 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .203E-3 and .244E+0.

Example 2 also considers the Laplace difference equation, $a \equiv 1$, but this time on a pie-shaped region, those points of a 40 × 40 mesh ($h = 1/20$) overlaying the square $[-1, 1] \times [-1, 1]$ which also lie inside the $45^\circ < \theta < 315^\circ$ sector of the unit disc. (This mesh region will be referred to henceforth as "the 40 × 40 pie.") Notice that our mesh region has equally spaced nodes even at its boundary; there has been no shifting of these nodes over to the curved boundary of the sector region itself for the sake of "more accurate treatment of the boundary"; we are, after all, not attempting to accurately solve a PDE on the curved sector but to test the convergence rate of the DADI method to the solution of the algebraic finite difference equations on this mesh region with rather jagged boundary. We again let the initial error $u^0 = e^0$ take on random values uniformly distributed in (0, 1), and choose the initial $\Delta t$ equal to $.00179 \approx (\Delta x)^2$. Notice that step 8 is rejected because the test parameter is too large (> .6) even though in this case it turns out that the step would have given a quite acceptable convergence factor of .57. Notice that both steps 6 and 14, which have test parameters of .15 and .14 falling in the desired (.1, .3] interval, turn out to have exceedingly good convergence factors of .037 and .038 respectively. The error is reduced by a factor of $.443 \times 10^{-5}$ in only 17 DADI steps with .46 seconds of CPU time.

EXAMPLE 2
Laplace difference equation, $a \equiv 1$, on the 40 × 40 pie. Initial error randomly distributed in (0, 1).

Step   Δt         TP      local CF   total RF   Change in next Δt
1      .179E-2    .325    .710       .710E+0

6      .574E-1    .150    .037       .922E-2
7      .574E-1    .372    .354       .326E-2
8      .287E-1    .765    .577       reject     1/16
9      .179E-2    .244    .532       .174E-2

14     .574E-1    .140    .038       .222E-4
15     .574E-1    .364    .365       .812E-5
16     .287E-1    .686    .563       reject     1/16
17     .179E-2    .214    .545       .443E-5

Convergence (by factor of .443E-5) with 17 DADI steps in .46 seconds of CDC 7600 CPU time.

Standard ADI here converges (by a factor of .456E-5) with 25 double sweeps in .26 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .179E-2 and .282E+0.

Example 3 is on the same 40 × 40 pie-shaped mesh, but this time with quite nonsmooth coefficients. The coefficients $a_{i,j+1/2}$ (these should be thought of physically as the "internodal conductivities" linking the $(i, j)$ and $(i, j + 1)$ nodes which each have unit "heat capacity") now take on random values uniformly distributed in (1, 100). The initial error function $u^0$ is taken as a rather haphazard linear combination of sine and cosine functions and $x^2 - y^2$. The initial $\Delta t$ is taken equal to $.000183 \approx (\Delta x)^2 / 10$, where 10 represents an intermediate value for $a$. Notice that, by comparison (using the minimax theorem) with the Laplace difference operator (each of whose parts $A$ and $B$ has spectral radius about equal to $4/(\Delta x)^2$), we have chosen our initial $\Delta t$ somewhat on the order of a few times larger than $[\rho(A)]^{-1}$ or $[\rho(B)]^{-1}$. Great care in this choice is not at all crucial, as is evident in later examples: if $\Delta t$ starts out very much too large, or very much too small, then a few DADI steps are merely wasted as the method quickly decreases or increases $\Delta t$ to a suitable range. The error in this example is reduced by a factor of $.499 \times 10^{-5}$ in 23 DADI steps, with .63 seconds of CPU time.

EXAMPLE 3
Coefficients $a(x, y)$ randomly distributed in (1, 100), on the "40 × 40 pie." Initial error a haphazard combination of sines, cosines, and $x^2 + y^2$.

Step   Δt         TP      local CF   total RF   Change in next Δt
1      .183E-3    .280    .331       .331E+0
2      .183E-3    .324    .404       .133E+0
3      .915E-4    .518    .671       .898E-1

10     .732E-3    .133    .105       .598E-3
11     .732E-3    .320    1.097      .657E-3
12     .366E-3    .745    .791       reject     1/16
13     .2289E-5   .204    .826       .543E-3

18     .732E-3    .128    .102       .115E-4
19     .732E-3    .264    .613       .710E-5
20     .732E-3    1.078   .930       reject     1/16
21     .457E-4    .336    .790       .561E-5
22     .228E-4    .150    .942       .529E-5
23     .228E-4    .187    .943       .499E-5    4

Standard ADI here converges (by a factor of .288E-5) with 35 double sweeps in .37 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .237E-4 and .191E+0.

Example 4 is an extreme attempt to make DADI (or at least standard ADI) fail on a linear problem. We divide our "40 × 40 pie" mesh region into a left hand and a right hand portion by the line ABCDE joining the grid points whose indices are (-1, 10), (4, 14), (-10, 0), (4, -14), (-1, -19). (By way of reference, the center or vertex of the "pie" has indices (0, 0) and the four corners of the overlaying 2 × 2 square are at (±20, ±20).) For those internodal links falling to the left of ABCDE we set the conductivity $a = 1$, for those to the right we set $a = 10^{-4}$. The initial error $u^0 = e^0$ takes on random values in (0, 1). DADI converges, but about 5 times slower than for the constant coefficient case (Example 2) on the same region; it now takes 72 (rather than 15) DADI steps to reduce the error by a factor of $.28 \times 10^{-5}$. We have shown a few of the steps of greatest interest: notice that the initial $\Delta t$ is rejected as being too large; the error is reduced by only a factor of .189 in the first 20 steps; from step 30 to 40 we see a particularly bad period of stagnation, the error being reduced only by about 3% in these 10 steps even though the test parameter TP often falls in the desired (.1, .3] interval; $\Delta t$ ranges between a minimum of .0002 and a maximum of 52 in this calculation.

EXAMPLE 4
Coefficients $a(x, y) = 1$ on the "left portion" and $10^{-4}$ on the "right portion" of the 40 × 40 pie. Initial error randomly distributed in (0, 1).

Step   Δt         TP      local CF   total RF   Change in next Δt
1      .319E-2    .603    .816       reject     1/16
2      .199E-3    .029    .914       .914E+0    4
3      .797E-3    .127    .947       .866E+0

10     .510E-1    .243    .987       .559E+0

20     .130E+2    .070    .632       .189E+0    2
21     .261E+2    .080    .403       .765E-1    2
22     .522E+2    .135    .174       .133E-1

27     .204E+0    .515    1.009      .368E-2

30     .319E-2    .426    .988       .329E-2

37     .255E-1    .132    .997       .321E-2
38     .255E-1    .107    .997       .321E-2
39     .255E-1    .107    .997       .320E-2
40     .255E-1    .060    .997       .319E-2    2

50     .522E+2    .507    .749       .425E-4

60     .130E+2    .812    .584       reject     1/16
70     .653E+1    .074    .750       .145E-4    2
71     .130E+2    .090    .583       .848E-5    2
72     .261E+2    .109    .335       .284E-5

Convergence (by factor of .284E-5) with 72 DADI steps in 1.99 seconds of CDC 7600 CPU time.

Standard ADI here diverges slowly (increasing the error by a factor of .637E+4 in 45 ADI double sweeps). The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .319E-2 and .239E+4.

5. Trial runs in the nonlinear case. We generate the algebraic equations for our nonlinear trials by first specifying formulas for the coefficient and solution functions $a(x, y, u)$ and $u(x, y)$ and then computing the right-hand side $f$ in the finite difference equations (4.1)-(4.3). In this way we always have at hand the means to compute the exact error $e^n = u^n - u$ and its norm, and also the means to force a desired kind of behavior (smooth, highly oscillatory, large or small range of values of $a(x, y, u)$ over $\Omega$, etc.) on our exact solution $u$.

All of the Examples 5-9 shown were run with Option II (updating the linearization after every double sweep). We also ran several of these same trials (Examples 5a, 5b, plus a few others) with Option I (updating the linearization after every single sweep). The convergence obtained per DADI step, based on this limited experience, seems to be virtually unchanged, no better and no worse; however, because it doubles the effort of updating, the total CPU time consumed does increase.

Probably it would be worthwhile to make some trial runs on Option IV (the "nearly Newton" method), especially in cases where "updating the linearization" is a dominant fraction of the per step computation, but we did not make any such runs.

Once again we say that "convergence" has been achieved when the initial error has been reduced by a factor of less than $.5 \times 10^{-5}$.

We begin in Example 5a with a $32 \times 32$ mesh over the unit square, $0 \le x, y \le 1$, with coefficients $a(x, y, u) = .05 + (x + 2y)/(1 + y) + u^2/(1 + u^2)$ and the very smooth solution function $u(x, y) = x^2 + y^2$. This example has a rather large "ellipticity factor" of $(\max a \text{ over } \Omega)/(\min a \text{ over } \Omega) = 47$, but much of this ratio is due to the smooth variation of the linear part of the coefficient function from its minimum .05 in the corner (0, 0) to its maximum 1.55 in the corner (1, 1); hence we should probably term this example only moderately nonlinear. We use the true derivative linearization $L' = A + B$ (which involves computing derivatives of $a(x, y, u)$). The initial guess $u^0$ (in this and all later examples) is taken to be $\equiv 0$ (greatly in error). Convergence is achieved in only 13 DADI steps with .75 seconds of CPU time. The pattern and rate of convergence observed seem quite similar to those of the nonextreme linear cases.

EXAMPLE 5a

Coefficients $a(x, y, u) = .05 + (x + 2y)/(1 + y) + u^2/(1 + u^2)$ on the $32 \times 32$ unit square. Smooth exact solution $u(x, y) = x^2 + y^2$. True derivative linearization with Option II on updating. Moderately nonlinear case with $\max a/\min a = 47$. Initial guess $u^0 \equiv 0$.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .100 E-3    .091    .911      .911 E+0   2
   2  .200 E-3    .091    .911      .911 E+0   2
   3  .400 E-3    .053    .883      .731 E+0   2
   4  .800 E-3    .046    .846      .619 E+0   4
   5  .320 E-2    .089    .649      .408 E+0   2
   6  .640 E-2    .051    .553      .225 E+0   2
   7  .128 E-1    .035    .348      .787 E-1   4
   8  .512 E-1    .0867   .005      .462 E-3   2
   9  .102 E+0    .208    .084      .390 E-4
  10  .102 E+0    .598    .581      .227 E-4   1/4
  11  .256 E-1    .496    .546      .124 E-4   1/4
  12  .640 E-2    .634    .505                 reject, 1/16
  13  .400 E-3    .088    .085      .106 E-5   2

Standard ADI here converges (by a factor of .302 E-5) with 25 double sweeps in .67 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .402 E-3 and .418 E+1.

Example 5b is exactly the same as the previous Example 5a except that we switch from the true derivative to the frozen coefficients linearization. The convergence is even better than before, requiring only 11 steps (rather than 13) for convergence. Moreover, since the per step computation is less (not requiring computation of terms involving derivatives of $a(x, y, u)$), the total CPU time is reduced from .75 seconds to .44 seconds.
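To make the difference between the two linearizations concrete, here is a sketch in continuous form. It assumes the nonlinear operator is $L(u) = -\nabla \cdot (a(x, y, u) \nabla u)$ with no lower order terms; the discrete operators actually used are those defined earlier in the paper, and the subscripts "true" and "frozen" are labels introduced here only for illustration.

```latex
\begin{align*}
L(u) &= -\nabla\cdot\bigl(a(x,y,u)\,\nabla u\bigr),\\
L'_{\text{true}}(u)\,v &= -\nabla\cdot\bigl(a(x,y,u)\,\nabla v\bigr)
                         - \nabla\cdot\bigl(a_u(x,y,u)\,v\,\nabla u\bigr),\\
L'_{\text{frozen}}(u)\,v &= -\nabla\cdot\bigl(a(x,y,u)\,\nabla v\bigr),\\
a_u(x,y,u) &= \frac{2u}{(1+u^2)^2}
  \quad\text{for}\quad a = .05 + \frac{x+2y}{1+y} + \frac{u^2}{1+u^2}.
\end{align*}
```

The frozen coefficients version simply drops the term involving $a_u$, which is consistent with its lower per step cost reported above.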


EXAMPLE 5b

Exactly the same as Example 5a except we switch to the frozen coefficients linearization.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .100 E-3    .047    .913      .913 E+0   4
   2  .400 E-3    .127    .850      .776 E+0
   3  .400 E-3    .036    .901      .700 E+0   4
   4  .160 E-2    .069    .764      .535 E+0   2
   5  .320 E-2    .050    .699      .374 E+0   2
   6  .640 E-2    .042    .570      .213 E+0   4
   7  .256 E-1    .041    .134      .284 E-1   4
   8  .102 E+0    .163    .012      .350 E-3
   9  .102 E+0    .478    .285      .999 E-4   1/4
  10  .256 E-1    .455    .121      .131 E-4   1/4
  11  .640 E-2    .299    .093      .122 E-5

Standard ADI here converges (by a factor of .359 E-5) with 25 double sweeps in .46 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are as in 5a.

Example 6a has coefficients with a 10 times stronger nonlinearity, $a(x, y, u) = .05 + (x + 2y)/(1 + y) + 10u^2/(1 + u^2)$ on the $32 \times 32$ unit square, with the smooth exact solution $u(x, y) = x^2 + y^2$. This example, with $\max a/\min a = 191$ and most of the variation of $a$ being due to the nonlinearity, can rightfully be termed strongly nonlinear. We use the true derivative linearization and initial guess $u^0 \equiv 0$. Convergence is achieved with 24 DADI steps in 1.37 seconds of CPU time.
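The ellipticity factors quoted for Examples 5a and 6a can be reproduced by sampling the coefficient along the exact solution on the grid. The sketch below is illustrative only; the `strength` parameter (1 for Example 5a, 10 for Example 6a) and the function names are introduced here and are not the paper's notation.

```python
def coefficient(x, y, u, strength=1.0):
    """a = .05 + (x + 2y)/(1 + y) + strength * u^2/(1 + u^2)."""
    return 0.05 + (x + 2.0 * y) / (1.0 + y) + strength * u * u / (1.0 + u * u)


def ellipticity_factor(strength, n=32):
    """max a / min a over the grid of the unit square, with u = x^2 + y^2."""
    h = 1.0 / n
    values = []
    for i in range(n + 1):
        for j in range(n + 1):
            x, y = i * h, j * h
            values.append(coefficient(x, y, x * x + y * y, strength))
    return max(values) / min(values)


ratio_5a = ellipticity_factor(1.0)    # Example 5a
ratio_6a = ellipticity_factor(10.0)   # Example 6a
print(f"Example 5a: {ratio_5a:.1f}, Example 6a: {ratio_6a:.1f}")
```

Both extremes are attained at corners: the minimum .05 at (0, 0), where $u = 0$, and the maximum at (1, 1), where $u = 2$ and the nonlinear term contributes $.8$ or $8$ respectively.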

EXAMPLE 6a

Coefficients $a(x, y, u) = .05 + (x + 2y)/(1 + y) + 10u^2/(1 + u^2)$ on the $32 \times 32$ unit square. Smooth exact solution $u(x, y) = x^2 + y^2$. True derivative linearization with Option II on updating. Strongly nonlinear case with $\max a/\min a = 191$. Initial guess $u^0 \equiv 0$.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .100 E-3    1.002   .838      .838 E+0   reject, 1/16
   2  .625 E-5    .015    .978      .978 E+0   4
   3  .250 E-5    .117    .941      .921 E+0
 ...
  10  .400 E-3    .114    .891      .545 E+0
 ...
  20  .128 E-1    .084    .163      .172 E-2   2
  21  .256 E-1    .174    .015      .270 E-4
  22  .256 E-1    .445    .705      .190 E-4   1/4
  23  .640 E-2    .744    .829                 reject, 1/16
  24  .400 E-3    .467    .204      .389 E-5

Convergence (by factor of .389 E-5) with 24 DADI steps in 1.37 seconds of CDC 7600 CPU time.

Standard ADI here diverges dramatically (by factors of .327 E+4, ..., .399 E+6 after 8, ..., 88 double sweeps). The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .108 E-3 and .385 E+1. However, ADI did converge when it was started with $u^0$ much closer to the true solution $u$.

Example 6b is exactly the same as our previous Example 6a except that we switch to the frozen coefficients linearization for this strongly nonlinear problem. The rate of convergence remains the same, requiring 24 DADI steps once again for convergence. However, the CPU time consumed is reduced from 1.37 seconds to .94 seconds because of the decreased computations required per step.

EXAMPLE 6b

Exactly the same as Example 6a except we switch to the frozen coefficients linearization.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .100 E-3    .272    .859      .859 E+0
   2  .100 E-3    .172    .919      .791 E+0
   3  .100 E-3    .110    .938      .742 E+0
 ...
  10  .800 E-3    .126    .884      .413 E+0
 ...
  20  .128 E-1    .084    .213      .546 E-2   2
  21  .256 E-1    .117    .040      .218 E-3
  22  .256 E-1    .276    .063      .138 E-4
  23  .256 E-1    .530    .563      .781 E-5   1/4
  24  .640 E-2    .526    .552      .431 E-5

Standard ADI here (as opposed to 6a) converges nicely (by a factor of .180 E-5) with 41 double sweeps in .73 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are as in 6a.

Example 7 has the same coefficient function $a(x, y, u) = .05 + (x + 2y)/(1 + y) + u^2/(1 + u^2)$ on the $32 \times 32$ unit square as in Examples 5a and 5b, but with the highly oscillatory solution function $1 - \cos 6(x^2 + y^2)$, which therefore brings out more strongly the nonlinear nature of the variation of the coefficients. These vary by the factor $\max a/\min a = 45.6$ for the true solution over this region. We use the frozen coefficients linearization, and initial guess $u^0 \equiv 0$. Convergence is achieved in 19 DADI steps, in .78 seconds of CPU time.

EXAMPLE 7

Coefficients $a(x, y, u) = .05 + (x + 2y)/(1 + y) + u^2/(1 + u^2)$ on the $32 \times 32$ unit square. Highly oscillatory exact solution $u(x, y) = 1 - \cos 6(x^2 + y^2)$. Frozen coefficients linearization with Option II on updating. Moderately to strongly nonlinear case with $\max a/\min a = 45.6$. Initial guess $u^0 \equiv 0$.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .100 E-3    .044    .944      .944 E+0   4
   2  .400 E-3    .121    .885      .836 E+0
 ...
  10  .128 E-1    .178    .305      .128 E-2
 ...
  12  .640 E-2    .608    .480                 reject, 1/16
  13  .400 E-3    .104    .366      .217 E-3
 ...
  17  .128 E-1    .101    .248      .212 E-4
  18  .128 E-1    .108    .244      .520 E-5
  19  .128 E-1    .114    .248      .129 E-5

Standard ADI here converges (by a factor of .202 E-5) with 27 double sweeps in .50 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .413 E-3 and .419 E+1.


Example 8a is the same as Example 5a except that we switch from the "$32 \times 32$ square" to the "$40 \times 40$ pie" (centered now inside the $[0, 2] \times [0, 2]$ square rather than inside the $[-1, 1] \times [-1, 1]$ square as in Examples 2, 3, etc.). We again have coefficients $a(x, y, u) = .05 + (x + 2y)/(1 + y) + u^2/(1 + u^2)$ with the smooth exact solution $u(x, y) = x^2 + y^2$. This example should be termed only mildly nonlinear, since trimming off the corners of the square (where $a$ before took on its extreme values) results now in an "ellipticity factor" $\max a/\min a$ of only 4.51. We use the true derivative linearization, with Option II on updating, and $u^0 \equiv 0$. Convergence is achieved with 17 DADI steps in .67 seconds of CPU time.

EXAMPLE 8a

Exactly the same as Example 5a except on the $40 \times 40$ pie centered inside the $[0, 2] \times [0, 2]$ square. Coefficients $a = .05 + (x + 2y)/(1 + y) + u^2/(1 + u^2)$, smooth exact solution $u = x^2 + y^2$, true derivative linearization, Option II on updating, and $u^0 \equiv 0$. Mildly nonlinear case with $\max a/\min a = 4.51$.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .687 E-3    .222    .760      .760 E+0
   2  .687 E-3    .071    .864      .657 E+0   2
 ...
   9  .219 E-1    .807    .806                 reject, 1/16
  10  .137 E-2    .398    .366      .441 E-3
 ...
  16  .219 E-1    .102    .137      .104 E-4
  17  .219 E-3    .151    .154      .160 E-5

Convergence (by factor of .160 E-5) with 17 DADI steps in .67 seconds of CDC 7600 CPU time.

Standard ADI here converges (by a factor of .240 E-5) with 25 double sweeps in .44 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed from .687 E-3 to .404 E+0.

Example 8b is exactly the same as 8a except that we switch to the "frozen coefficients" linearization. The rate of convergence remains the same, requiring 17 DADI steps again for convergence. However, once again the total CPU time is reduced a bit (from .67 to .54 seconds) because of the decreased per step computation required.

EXAMPLE 8b

Exactly the same as Example 8a except we switch to the frozen coefficients linearization.

Step  $\Delta t$  TP      CF        RF         Change in next $\Delta t$
   1  .687 E-3    .168    .767      .767 E+0
   2  .687 E-3    .071    .864      .663 E+0   2
 ...
  10  .137 E-2    .384    .358      .453 E-3   1/2
  11  .687 E-3    .339    .865      .392 E-3
 ...
  16  .219 E-1    .115    .120      .669 E-5
  17  .219 E-1    .182    .201      .135 E-5   1

Standard ADI here converges (by a factor of .196 E-5) with 25 double sweeps in .37 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are as in 8a.

Example 9a has coefficients which are highly nonlinear and change abruptly with respect to $u$ for $u$ close to 1: $a(x, y, u) = .5 + (x + 2y)/(1 + y) + [1 + 16(u - 1)^4]^{-1}$, on the $40 \times 40$ pie centered in the square $[0, 2] \times [0, 2]$, with the highly oscillatory exact solution $u(x, y) = 1 - \cos 6(x^2 + y^2)$. The ellipticity factor here is $\max a/\min a = 2.87/.98 = 2.93$. We use the true derivative linearization and initial guess $u^0 \equiv 0$. Convergence is achieved with 19 DADI steps in .82 seconds of CPU time.

EXAMPLE 9a

Coefficients $a(x, y, u) = .05 + (x + 2y)/(1 + y) + [1 + 16(u - 1)^4]^{-1}$ on the $40 \times 40$ pie centered in $[0, 2] \times [0, 2]$. Highly oscillatory exact solution $u(x, y) = 1 - \cos 6(x^2 + y^2)$. True derivative linearization with Option II on updating. $\max a/\min a$ only 2.93, but coefficients are highly nonlinear with respect to $u$. Initial guess $u^0 \equiv 0$.

Step  $\Delta t$  TP      local CF  total RF   Change in next $\Delta t$
   1  .674 E-3    .236    .734      .734 E+0
   2  .674 E-3    .074    .896      .658 E+0   2
 ...
  10  .215 E-1    .433    .694      .180 E-3   1/4
  11  .549 E-2    .673    .515                 reject, 1/16
 ...
  18  .215 E-1    .132    .191      .620 E-5
  19  .215 E-1    .233    .374      .232 E-5

Convergence (by factor of .232 E-5) with 19 DADI steps in .82 seconds of CDC 7600 CPU time.

Standard ADI converges here (by a factor of .327 E-5) with 25 double sweeps in .49 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are uniformly logarithmically distributed between .674 E-3 and .320 E+0.

Example 9b is the same as 9a but with the frozen coefficients linearization. In this case the convergence rate worsens dramatically, requiring 62 DADI steps for convergence in 2.09 seconds of CPU time. The reason for this near stagnation is quite apparent, however: the method begins with an initial $\Delta t = .000674$ with which it is satisfied and, except for doublings of $\Delta t$ after the 2nd and 6th DADI steps, the method stays quite well satisfied and never changes $\Delta t$ again. In fact the method keeps $\Delta t$ fixed at .00269 after the 6th step and very quickly settles into a nearly fixed TP of .1175 (in the acceptable [.1, .3) range) with a nearly fixed CF of .81743.

EXAMPLE 9b

Exactly the same as 9a except with the frozen coefficients linearization.

Step  $\Delta t$  TP      CF        RF         Change in next $\Delta t$
   1  .674 E-3    .149    .738      .738 E+0
   2  .674 E-3    .094    .896      .661 E+0   2
   3  .134 E-2    .113    .863      .571 E+0
 ...
   6  .134 E-2    .096    .895      .403 E+0   2
   7  .269 E-2    .112    .808      .326 E+0
 ...
  10  .269 E-2    .121    .814      .175 E+0
 ...
  20  .269 E-2    .1198   .8173     .230 E-1
 ...
  30  .269 E-2    .1179   .8174     .307 E-2
 ...
  40  .269 E-2    .1176   .8174     .409 E-3
 ...
  50  .269 E-2    .1175   .8174     .545 E-4
 ...
  60  .269 E-2    .1175   .8174     .726 E-5
 ...
  62  .269 E-2    .1175   .8174     .485 E-5

Convergence (by factor of .485 E-5) with 62 DADI steps in 2.09 seconds of CPU time.

Standard ADI converges here (by a factor of .424 E-5) with 30 double sweeps in .44 CPU seconds. The $\Delta t_1, \ldots, \Delta t_8$ are as in Example 9a.
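The cost of the near stagnation in Example 9b follows directly from the fixed convergence factor. The short calculation below is a sketch using only values quoted for that run (CF of .8174, RF of .545 E-4 at step 50 and .726 E-5 at step 60, and the $.5 \times 10^{-5}$ convergence threshold); the variable and function names are illustrative.

```python
import math

cf = 0.8174          # nearly fixed local CF once the step size stops changing
threshold = 0.5e-5   # "convergence" threshold used in all trial runs

# Steps needed to gain one decimal digit of error reduction at this fixed rate.
steps_per_decade = math.log(10.0) / -math.log(cf)


def rf_after(rf_start, steps, factor):
    """Total RF after `steps` further steps at a constant convergence factor."""
    return rf_start * factor ** steps


rf60_predicted = rf_after(0.545e-4, 10, cf)   # from the tabulated step 50

# Further steps needed after step 60 (tabulated RF .726 E-5) to pass the threshold.
extra_steps = math.ceil(math.log(0.726e-5 / threshold) / -math.log(cf))

print(f"steps per decade: {steps_per_decade:.1f}")
print(f"predicted RF at step 60: {rf60_predicted:.3e}")
print(f"steps beyond 60 to converge: {extra_steps}")
```

At roughly 11 steps per decade, the run needs about 60 steps to gain the five to six digits required, in line with the 62 steps reported.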

Doss observed from this example of near stagnation that it would probably be better to revise our stepsize changer strategy such that $\Delta t$ is never allowed to remain at the same value for two successive DADI steps. In fact, Doss found that merely revising the strategy such that $\Delta t$ increases by the factor of 2 instead of 1 when TP falls in the desired [.1, .3) interval seems to avoid stagnation in the few (previously near stagnant) examples on which we tried it. Doss pursues this idea in his thesis [2] with more sophisticated revised strategies that cause $\Delta t$ to hop around quite a bit more without ever returning to exactly the same value.
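The revision described above amounts to changing one entry in the step size rule. The sketch below is hypothetical: only the [.1, .3) "desired" interval and the factors 1 and 2 come from this passage. The other thresholds (.05, .4, .6) and factors (4, 2, 1/2, 1/4, reject with 1/16) are assumptions read off from the tabulated runs in this section, and the treatment of values exactly on a boundary is a guess. The function name and the `hold_factor` parameter are introduced here for illustration.

```python
def step_decision(tp, hold_factor=1.0):
    """Return (accepted, factor applied to the step size for the next step).

    hold_factor is the factor used when TP falls in the desired [.1, .3)
    interval: 1.0 for the strategy used in these trial runs, 2.0 for the
    simple revision described in the text.
    """
    if tp > 0.6:
        return False, 1.0 / 16.0   # step rejected, retry with a much smaller step
    if tp > 0.4:
        return True, 0.25
    if tp >= 0.3:
        return True, 0.5
    if tp >= 0.1:
        return True, hold_factor   # the "satisfied" range
    if tp >= 0.05:
        return True, 2.0
    return True, 4.0


# With TP stuck at .1175 (as in Example 9b) the original rule never moves the
# step size, while the revised rule doubles it on the next step.
original = step_decision(0.1175)
revised = step_decision(0.1175, hold_factor=2.0)
print(original, revised)
```

Keeping the "satisfied" factor as a parameter makes it easy to try other values, which is the direction the more elaborate strategies mentioned above take.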

6. Comparison with standard ADI. We ran all of the DADI Examples 1-9 shown also with the standard ADI method. In each case the choice of true derivative or frozen coefficients linearization and the Option II on updating remains the same; the only difference is that standard ADI uses a prechosen sequence $\Delta t_1 < \cdots < \Delta t_r$ of $\Delta t$'s rather than the dynamic choice of DADI.

Based on conversations with Jim Douglas, Jr., one of the major proponents and developers of ADI methods, we adopted a rather long-established procedure for choosing the $\Delta t_i$. First, one makes rather gross estimates for an interval $[c, d]$ in which the eigenvalues of $A$, $B$ ought to lie (in the linear case). This interval is about $[\pi^2, 4h^{-2}]$ for the discrete Laplacian operator ($a \equiv 1$) on the unit square. With variable $a(x, y)$, use of the Courant minimax principle gives the gross estimate $[c, d] = [\pi^2 \min a, 4h^{-2} \max a]$. We then decided on about 8 iteration parameters (as suggested by Douglas for a $32 \times 32$ or $64 \times 64$ grid), divided $[1/d, 1/c]$ into 8 equal intervals (on a logarithmic scale), then picked $\log \Delta t_1 < \cdots < \log \Delta t_8$ to be the midpoints of these intervals. For the sake of probably greater stability we in a few cases tried deleting the largest value $\Delta t_8$; this is justified also because our $c$ is probably a rather gross underestimate of the spectra of $A$ and $B$. In the nonlinear case we just pretend that our nonlinear operator behaves mainly like the linear operator with coefficients frozen at their values $a(x, y, u(x, y))$ for the exact solution $u(x, y)$; we then estimate $[c, d]$ and pick $\Delta t_1, \ldots, \Delta t_8$ as before. The values $\max a$ and $\min a$ used here of course depend upon our having a decent idea of the true solution's values already, which might not easily be the case in actual practice; this is one of the disadvantages of standard ADI as opposed to the fully automatic DADI.
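The parameter selection just described can be written down in a few lines. This is a minimal sketch of that recipe, not the authors' program: the function names are hypothetical, and the mesh width $h = 1/32$ used in the usage lines is an assumption about how the $32 \times 32$ grid is indexed.

```python
import math


def spectral_interval(a_min, a_max, h):
    """Gross eigenvalue interval [c, d] = [pi^2 min a, 4 h^-2 max a]."""
    return math.pi ** 2 * a_min, 4.0 * a_max / (h * h)


def adi_parameters(c, d, r=8):
    """Midpoints (on a logarithmic scale) of r equal subintervals of [1/d, 1/c]."""
    lo, hi = math.log(1.0 / d), math.log(1.0 / c)
    width = (hi - lo) / r
    return [math.exp(lo + (k + 0.5) * width) for k in range(r)]


# Usage: discrete Laplacian (a = 1) on the unit square, assuming h = 1/32.
c, d = spectral_interval(1.0, 1.0, 1.0 / 32.0)
laplace_steps = adi_parameters(c, d)

# A round-number check: [1/d, 1/c] = [1e-8, 1] splits into 8 decades.
decade_steps = adi_parameters(1.0, 1.0e8)
print([f"{dt:.3e}" for dt in laplace_steps])
```

Consecutive parameters differ by the constant ratio $(d/c)^{1/8}$, so deleting the largest value, as tried in a few cases above, just truncates a geometric sequence.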

Our computational experience with these trial runs is that standard ADI is hard to beat. It usually converges, and when it does it usually converges just about as fast as DADI (or sometimes a bit faster in terms of CPU time since DADI expends one of its 3 double sweeps merely on the bookkeeping of deciding how to choose $\Delta t$ for the next step). However, in certain extreme cases (we had to push a bit to generate them) standard ADI does explode.

Example ADI 1a (that is, Example 1a with standard ADI rather than with DADI) converges with 33 double sweeps in .17 CPU seconds (as opposed to the nearly twice faster .09 seconds of DADI). This is the Laplace difference equation ($a \equiv 1$) on the $32 \times 32$ unit square.

Example ADI 1b, the Laplace difference equation on the $64 \times 64$ unit square, converges with 41 double sweeps in .84 CPU seconds (as opposed to the twice faster .42 seconds of DADI).

Example ADI 2, the Laplace difference equation on the $40 \times 40$ pie, converges with 25 double sweeps in .26 CPU seconds (nearly twice as fast as the slower .46 seconds of DADI).

Example ADI 3, the linear case with random (1, 100) coefficients on the $40 \times 40$ pie, converges with 35 double sweeps in .37 CPU seconds (as opposed to the over 1/2 slower .63 seconds of DADI).

Example ADI 4, the extreme linear case with coefficients $a \equiv 1$ or $10^{-4}$ on the left or right portion respectively of the $40 \times 40$ pie, diverges slowly, increasing the error by a factor of .637 E+4 in 45 ADI double sweeps. The pattern of divergence in each cycle of 8 double sweeps is interesting to note: the convergence factor is greater than 1 not for the larger time steps $\Delta t_7, \Delta t_8$ as we would have expected, but for the midrange time steps $\Delta t_2, \ldots, \Delta t_5$. For example, in the 5th and final full ADI cycle computed, from the 33rd through the 40th double sweeps, the observed local convergence factors are .80, 1.08, 2.20, 3.31, 2.49, .752, .312, and .87. This pattern is quite typical of all the other full cycles in the run.
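The net effect of one such cycle is the product of its eight local convergence factors. The sketch below uses the eight values quoted for the 5th cycle of Example ADI 4; the variable names are illustrative.

```python
import math

# Local convergence factors for double sweeps 33 through 40 (the 5th full cycle).
cycle_factors = [0.80, 1.08, 2.20, 3.31, 2.49, 0.752, 0.312, 0.87]

net_cycle_factor = math.prod(cycle_factors)

# 1-based positions within the cycle whose factor exceeds 1 (error grows).
amplifying = [k + 1 for k, cf in enumerate(cycle_factors) if cf > 1.0]

print(f"net factor over the cycle: {net_cycle_factor:.2f}")
print(f"amplifying sweeps in the cycle: {amplifying}")
```

Four of the eight sweeps contract the error, yet the cycle as a whole amplifies it by roughly a factor of 3, because the growth in the 2nd through 5th sweeps outweighs the damping in the others.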

Example ADI 5a, a moderately nonlinear case on the $32 \times 32$ unit square, converges with the true derivative linearization with only 25 double sweeps in .67 CPU seconds (as opposed to the slightly slower .75 seconds of DADI).

Example ADI 5b, the exact same case except with the frozen coefficients linearization, converges again with only 25 double sweeps in .46 CPU seconds (as opposed to the very slightly faster .44 seconds of DADI).

Example ADI 6a, the strongly nonlinear case on the $32 \times 32$ unit square, with the true derivative linearization, and the greatly in error initial guess $u^0 \equiv 0$, diverges dramatically (by factors of .327 E+4, .570 E+6, .751 E+5, ..., .339 E+6 at the end of the 8th, 16th, 24th, ..., 88th ADI double sweeps). Once again, as in the linear Example ADI 4, it seems to be the midrange values $\Delta t_2, \ldots, \Delta t_5$ with the CF's larger than 1.

Example ADI 6b is exactly the same as ADI 6a except for the switch to the frozen coefficients linearization. In this case, however, ADI converges quite nicely with only 41 double sweeps in .73 CPU seconds (as opposed to the slightly slower .94 seconds for DADI).

Example ADI 7, a moderately to strongly nonlinear case with highly oscillatory exact solution, converges with 27 double sweeps in .50 CPU seconds (as opposed to the approximately 1/2 slower .78 seconds for DADI).

Example ADI 8a, a mildly nonlinear case on the $40 \times 40$ pie with smooth exact solution and true derivative linearization, converges with only 25 double sweeps in .44 CPU seconds (about 1/2 faster than the .67 seconds of DADI).

Example ADI 8b, the exact same case except with the frozen coefficients linearization, converges again with only 25 double sweeps in .37 CPU seconds (as opposed to the nearly 1/2 slower .54 seconds of DADI).

Example ADI 9a, a highly nonlinear case on the $40 \times 40$ pie, with highly oscillatory exact solution and the true derivative linearization, converges with 25 double sweeps in .49 CPU seconds (as opposed to the slower .82 seconds of DADI).

Example ADI 9b, the same case but with the frozen coefficients linearization, converges with 30 double sweeps in .44 CPU seconds (thus avoiding the near stagnation of the much slower DADI run which required 2.09 seconds).
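The CPU times quoted in Sections 4 through 6 can be gathered into one comparison. The snippet below is only a bookkeeping sketch over those quoted numbers; `None` marks the two runs in which standard ADI diverged, and the dictionary layout is an illustrative choice.

```python
# example -> (standard ADI CPU seconds, DADI CPU seconds); None = ADI diverged.
cpu_seconds = {
    "1a": (0.17, 0.09), "1b": (0.84, 0.42), "2": (0.26, 0.46),
    "3": (0.37, 0.63), "4": (None, 1.99), "5a": (0.67, 0.75),
    "5b": (0.46, 0.44), "6a": (None, 1.37), "6b": (0.73, 0.94),
    "7": (0.50, 0.78), "8a": (0.44, 0.67), "8b": (0.37, 0.54),
    "9a": (0.49, 0.82), "9b": (0.44, 2.09),
}

adi_diverged = [name for name, (adi, _) in cpu_seconds.items() if adi is None]

# Ratio DADI time / ADI time for the runs where both methods converged.
time_ratio = {name: dadi / adi
              for name, (adi, dadi) in cpu_seconds.items() if adi is not None}

dadi_faster = [name for name, ratio in time_ratio.items() if ratio < 1.0]

for name, ratio in time_ratio.items():
    print(f"Example {name}: DADI/ADI time ratio {ratio:.2f}")
print("ADI diverged on:", adi_diverged)
```

Read this way, DADI is faster only on the Laplace problems on the square and on Example 5b, usually within a factor of two otherwise, and it is the only one of the two methods that converges on Examples 4 and 6a.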

7. Preview of the thesis results of Doss. The first author in his Ph.D. thesis [2] makes three principal extensions over the present work.

First are extensions of these methods to equations of the form $L(u) = -\nabla \cdot a \nabla u + \text{lower order terms} = f$, where the coefficients are gradient dependent, i.e., $a = a(|\nabla u|)$. His computational examples include the minimal surfaces equation, with $a = (1 + |\nabla u|^2)^{-1/2}$, the capillary surfaces equation with the same $a$, and the magnetostatic equation, with $a = (\epsilon + |\nabla u|^2)(1 + |\nabla u|^2)^{-1}$ in the portion of $\Omega$ corresponding to an iron core and $a \equiv 1$ in the portion corresponding to air, where $\epsilon = 10^{-4}$ for purposes of cyclotron design. Because the Fréchet derivative $L'$ of $L$ has a nine point pattern, instead of a five point pattern, Doss has to invent several ingenious "approximate derivatives" $L^* = A^* + B^* \approx L'$, with a five point pattern and with $A^*$ and $B^*$ negative definite. These methods show a 5 to 20 times improvement in CPU time over the successive over relaxation (SOR) magnetostatics programs now in use at Lawrence Berkeley Labs; moreover, they have always worked.
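The gradient dependent coefficients named above are simple to evaluate. The sketch below assumes the formulas as reconstructed from the text, $a = (1 + |\nabla u|^2)^{-1/2}$ for minimal and capillary surfaces and $a = (\epsilon + |\nabla u|^2)/(1 + |\nabla u|^2)$ in the iron region; the function names and the sample gradient magnitudes are illustrative.

```python
def a_minimal_surface(grad_norm):
    """a = (1 + |grad u|^2)^(-1/2), used for minimal and capillary surfaces."""
    return (1.0 + grad_norm ** 2) ** -0.5


def a_magnetostatic_iron(grad_norm, eps=1.0e-4):
    """a = (eps + |grad u|^2) / (1 + |grad u|^2) in the iron core region."""
    return (eps + grad_norm ** 2) / (1.0 + grad_norm ** 2)


A_AIR = 1.0  # coefficient in the air region

for g in (0.0, 0.01, 0.1, 1.0, 10.0):
    print(f"|grad u| = {g:5.2f}: minimal surface a = {a_minimal_surface(g):.4f}, "
          f"iron a = {a_magnetostatic_iron(g):.6f}")

# Contrast between air and iron where the gradient vanishes.
air_to_iron_contrast = A_AIR / a_magnetostatic_iron(0.0)
```

Where the gradient vanishes the iron coefficient drops to $\epsilon$, so the coefficient can jump by a factor of about $10^4$ across the iron-air interface, comparable to the extreme linear case of Example 4.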

Second are improvements in the step size strategy that speed convergence a bit and avoid near stagnation in some very extreme cases (such as Example 9b). As we mentioned previously, all of the trial examples of this paper were run with a single strategy decided upon and frozen near the beginning of our experience. However, based upon later experience there is at least one simple change in strategy which we would recommend to a prospective user: when TP falls in the desired [.1, .3) range, increase $\Delta t$ at the next step not by the present factor of 1, but by 2 (or probably better yet by some irrational number, such as a square root, in order to avoid even more surely the repetition of the same $\Delta t$'s). Doss designs and tests out some even more sophisticated strategy improvements.

Third are extensions to "three directional" DADI methods for solving $L(u) = A(u) + B(u) + C(u) = f$, corresponding to 3-dimensional problems or to 2-dimensional problems with a hexagonal mesh. These, and even "four-directional" DADI methods, have worked well on some simple trial cases.

REFERENCES

[1] RANDOLPH E. BANK AND H. B. KELLER, Marching algorithms for elliptic b.v. problems I: the constant coefficient case, this Journal, 14 (1977), pp. 792-829.

[2] SAID DOSS, Dynamic ADI methods for elliptic equations with gradient dependent coefficients, Ph.D. thesis in Mathematics, Univ. California, Berkeley, LBL-6142, 1977.

[3] J. DOUGLAS AND H. H. RACHFORD, On the numerical solution of heat conduction problems in two and three variables, Trans. Amer. Math. Soc., 82 (1956), pp. 421-439.

[4] K. MILLER, Math 228A Notes, Part II on Stiff Equations, Fall 1973, Univ. California, Berkeley, available from author.

[5] D. W. PEACEMAN AND H. H. RACHFORD, The numerical solution of parabolic and elliptic differential equations, J. Soc. Indust. Appl. Math., 3 (1955), pp. 28-41.

[6] PAUL N. SWARZTRAUBER, The methods of cyclic reduction, Fourier analysis and the FACR algorithm for the discrete solution of Poisson's equation on a rectangle, SIAM Rev., 19 (1977), pp. 490-501.

[7] RICHARD S. VARGA, Matrix Iterative Analysis, Prentice-Hall, Englewood Cliffs, NJ, 1962.

[8] E. L. WACHSPRESS, Iterative Solution of Elliptic Systems, Prentice-Hall, Englewood Cliffs, NJ, 1966.
