
MATHEMATICS OF COMPUTATION, VOLUME 33, NUMBER 146

APRIL 1979, PAGES 585-635

A Fast Cauchy-Riemann Solver

By Michael Ghil* and Ramesh Balgovind**

Abstract. We present a solution algorithm for a second-order accurate discrete form

of the inhomogeneous Cauchy-Riemann equations. The algorithm is comparable in

speed and storage requirements with fast Poisson solvers. Error estimates for the dis-

crete approximation of sufficiently smooth solutions of the problem are established;

numerical results indicate that second-order accuracy obtains even for solutions which

do not have the required smoothness. Different combinations of boundary conditions

are considered and suitable modifications of the solution algorithm are described and

implemented.

1. Introduction. Inhomogeneous Cauchy-Riemann equations appear naturally in

many fluid-dynamical problems, as the divergence and the vorticity equations of a

two-dimensional steady flow field (u, v) = (u(x, y), v(x, y)). The velocity components

u, v are usually called in this context primitive variables, in contradistinction to the

derived variables $\psi$, $\zeta$ in the stream function-vorticity formulation of the flow equations

(e.g., Roache [28]). In the latter formulation, the stream function $\psi$ satisfies a

Poisson equation; and computations with this formulation have greatly benefited from

the rapid development of fast direct methods for the solution of Poisson's equation, or

Poisson solvers (Buneman [4], Buzbee, Golub and Nielson [5], Dorr [9], Fischer,

Golub, Hald, Leiva and Widlund [12], Golub [18], Hockney [21], [22], Widlund [32] ).

Working in the primitive variables, however, permits the treatment of more general

flows. Indeed, either nondivergence or irrotationality of the flow is required in

order to introduce a stream function $\psi$ or a velocity potential $\phi$, and obtain a Poisson

equation for them. There are many situations of practical interest in which neither of

these assumptions holds. Furthermore, the formulation of boundary conditions is

often easier in terms of the primitive variables, by using physical considerations which

arise naturally from the problem. On the other hand, a boundary condition on the

vorticity $\zeta$, for instance, is at times hard to formulate (Langlois [23]); the construction

of appropriate discrete versions of such a boundary condition is often even more difficult

(Oliger and Sundström [27]). Hence, the desirability of simple, physically meaningful

boundary conditions and, thus, of the use of primitive variables.

Lomax and Martin [24] have developed a fast Cauchy-Riemann solver and

Received April 10, 1978.

AMS (MOS) subject classifications (1970). Primary 65F05, 65N15, 65N20; Secondary

65N04, 65N05, 76B05, 86A10.

Key words and phrases. Fast direct solvers, Cauchy-Riemann equations, elliptic first-order

systems, transonic flow.

*The work of this author was supported in part by NASA, Grant No. NSG-5130.

**The work of this author was supported in part by NASA, Grant No. NSG-5034.

© 1979 American Mathematical Society

0025-5718/79/0000-0058/$! 3.75


applied it to a quasilinear problem in aerodynamics ([24], [25] ). Additional versions of

their solver are given in [26]. Our interest in the problem stems from a different applica-

tion, dealing with a two-dimensional version of the equations of dynamic meteorology

(Ghil [14], [15] ). The solver we present also differs from those of [24], [26] in a num-

ber of ways: the boundary conditions and their numerical treatment, the decoupling of

u, v and the reduction to a discrete Poisson problem, and finally the method of solution

of the resulting discrete Poisson equation are all different. We substantiate by numerical

results that the present solver is second-order accurate, even for solutions which do not

have the formally required degree of smoothness. The solvers of [24], [26] seem to be

only first-order accurate according to our numerical tests.

The intended application of our solver is to a fully nonlinear first-order system,

rather than a quasilinear one. This system is a generalization of a Monge-Ampère equation

[15], [17]. Eventually, we hope to apply the solver to cases where the nonlinear equations

are of mixed elliptic-hyperbolic type [16], [17]. Preliminary results are encouraging,

and we expect to pursue the nonlinear problem in a future publication.

The organization of the article is the following: Section 2 contains the description

in continuous and then in discrete form of the model problem of which we seek a fast

solution. Section 3 contains the derivation and description of the solution algorithm.

Section 4 presents numerical results for test computations with the model problem. Sec-

tion 5 presents modifications of the model problem arising from changes in boundary

conditions. Section 6 contains a comparison of results with the solvers of [24], [26].

Section 7 gives conclusions and a discussion of possible extensions and generalizations.

Finally, Appendix A presents an error estimate for the method, and Appendix B con-

tains a listing of the basic program.

Acknowledgements. It is gratifying to acknowledge useful discussions with Profes-

sors Eugene Isaacson and Olof Widlund. Numerical calculations were performed in part

on the CDC 6600 of the Courant Mathematics and Computing Laboratory, New York

University under Contract EY-76-C-02-3077 with the U. S. Department of Energy.

2. The Model Problem.

The Differential Equations. We wish to study the fast numerical solution of the

elliptic system of two first-order linear equations in two independent variables,

(2.1a) $u_x + v_y = d(x, y)$,

(2.1b) $u_y - v_x = e(x, y)$.

The dependent variables u, v can be thought of as velocity components in the x, y directions,

respectively. In this interpretation d, e are the divergence and vorticity of the

flow, which are assumed to be known. If d = 0 = e, (2.1) are the Cauchy-Riemann equations,

and u, v are analytic. We are interested in the inhomogeneous case, $|d| + |e| \not\equiv 0$,

and we concentrate on real-valued d, e, u, v, although the method is applicable with

minor changes to complex-valued functions as well.

We consider a rectangular domain R, taken without loss of generality to be R =

$\{(x, y): 0 \le x \le 2\pi,\ 0 \le y \le \pi\}$. The boundary conditions are that v is given on the



lower and upper side of the rectangle,

(2.2a) $v = v^{(0)}(x)$, $y = 0$,

(2.2b) $v = v^{(1)}(x)$, $y = \pi$,

and that both u and v are periodic in the x-direction,

(2.3a) $u(x + 2\pi, y) = u(x, y)$,

(2.3b) $v(x + 2\pi, y) = v(x, y)$.

The Gauss divergence theorem implies that d(x, y), $v^{(0)}(x)$, and $v^{(1)}(x)$ have to satisfy

(2.3c) $\int_0^{2\pi}\!\int_0^{\pi} d(x, y)\,dx\,dy = \int_0^{2\pi} \{v^{(1)}(x) - v^{(0)}(x)\}\,dx$.

These conditions, together with (2.1), determine v completely and u up to an additive

constant (compare Ghil [15]). The latter indeterminacy in the solution can be elimi-

nated, for instance, by prescribing u at one arbitrary point (x0, y0) in the rectangle.

The boundary conditions we use are associated with a standard channel-flow

problem in geophysical fluid dynamics (e.g., Elvius and Sundström [10], Gustafsson

[20] ), from which the nonlinear problem mentioned in Section 1 is derived (Ghil [15],

[17]). Extensions to different boundary conditions and to irregular domains will be

discussed in Sections 5 and 7.

The Difference Equations. The discretization of the problem we chose is to

approximate the derivatives in (2.1) by finite differences. Let U, V, D, E stand for

the mesh functions which approximate the continuous functions u, v, d, e; and let

h, k stand for the mesh size in the x, y-directions, respectively. It is natural to use

centered differences to replace the corresponding derivatives in (2.1). We write

(A) $U(x + \delta, y) - U(x - \delta, y) \cong 2\delta\, u_x(x, y)$,

$U(x, y + \epsilon) - U(x, y - \epsilon) \cong 2\epsilon\, u_y(x, y)$,

for u and similar formulas for v; this yields a second-order accurate approximation of

the derivatives and allows us to expect that in some adequate norm $\|\cdot\|$,

$\|U - u\| + \|V - v\| = O(h^2) + O(k^2)$.
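The second-order claim for the centered difference quotient in (A) can be checked numerically. The following is a minimal sketch, not part of the original program; the test function sin, the point x0 and the step sizes are illustrative assumptions.

```python
import numpy as np

def centered_diff(f, x, delta):
    """Centered difference quotient (f(x + delta) - f(x - delta)) / (2 * delta)."""
    return (f(x + delta) - f(x - delta)) / (2.0 * delta)

x0 = 0.7                                   # arbitrary evaluation point (assumption)
steps = (0.1, 0.05, 0.025)                 # each step is half the previous one
errors = [abs(centered_diff(np.sin, x0, d) - np.cos(x0)) for d in steps]

# For a second-order formula, halving the step divides the error by about 4.
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(ratios)
```

Ratios close to 4 under step halving are the signature of an $O(\delta^2)$ error.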

The use of centered differences in a straightforward manner, on an unstaggered

mesh $x_i = ih$, $y_j = jk$, leads, however, to the existence of spurious null vectors (U, V),

i.e., to zero eigenvectors of the discrete matrix operator which approximates the dif-

ferential Cauchy-Riemann operator. To avoid dealing with these null vectors and to ob-

tain an invertible discrete matrix operator, we used a staggered mesh (Figure 1). Such

a mesh, suggested already by Lomax and Martin [24], can be formulated for Eqs. (2.1)

in a particularly efficient way.

Let u, v denote points at which U, V are defined, and let $\cdot$, $\times$ (for the vector

operators divergence and curl) denote the points at which the discrete versions (2.4a, b)

(see below) of equations (2.1a, b) are written and at which D, E are defined. Thus,


[Figure 1. The staggered mesh on the rectangle with corners (0, 0) and $(2\pi, \pi)$, showing the indexing of the U and V points; graphic not reproduced.]

u, v alternate on diagonals of the mesh and so do $\cdot$, $\times$ on diagonals parallel to and alternating

with those of u, v; in other words, v, $\times$, u, $\cdot$, in clockwise direction, occupy

the corners of an elementary mesh cell of area hk/4 (see Figure 1). No averaging

of U, V is necessary in writing (2.4a, b) on this staggered mesh. Furthermore, the

boundary conditions (2.2) and the periodicity condition (2.3a, b) can be easily handled.

Indeed, let U, V be indexed independently, with $y = 0, \pi$ being horizontal

V-lines corresponding to V-indices j = 0, N, and with $x = 0, 2\pi$ being vertical V-lines

with V-indices i = 1, M + 1. Then the computational domain includes the points

$((i - 1)h, jk)$, at which $V_{ij}$ is defined, and the points $((i - 1/2)h, (j - 1/2)k)$, at which

$U_{ij}$ is defined; in other words, $V_{ij}$ approximates $v((i - 1)h, jk)$, $1 \le i \le M + 1$,

$0 \le j \le N$, and $U_{ij}$ approximates $u((i - 1/2)h, (j - 1/2)k)$, $1 \le i \le M$, $1 \le j \le N$, with

$h = 2\pi/M$, $k = \pi/N$.

The discrete equations are centered:

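(2.4a) $(U_{i,j} - U_{i-1,j})/h + (V_{i,j} - V_{i,j-1})/k = D_{i,j}$, $1 \le i \le M$, $1 \le j \le N$,

(2.4b) $(U_{i,j+1} - U_{i,j})/k - (V_{i+1,j} - V_{i,j})/h = E_{i,j}$, $1 \le i \le M$, $1 \le j \le N - 1$;

here $D_{ij}$ approximates $d((i - 1)h, (j - 1/2)k)$, and $E_{ij}$ approximates $e((i - 1/2)h, jk)$.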



Notice that $\delta$, $\epsilon$ in (A) have been replaced after staggering by h/2, k/2, rather than by

h, k. The boundary conditions become

(2.5a) $V_{i,0} = v^{(0)}((i - 1)h)$, $1 \le i \le M$,

(2.5b) $V_{i,N} = v^{(1)}((i - 1)h)$, $1 \le i \le M$,

and the periodicity condition becomes

(2.6a) $U_{0,j} = U_{M,j}$, $1 \le j \le N$,

(2.6b) $V_{1,j} = V_{M+1,j}$, $0 \le j \le N$.

Conditions (2.5), (2.6) leave M(2N - 1) unknowns, to wit, $U_{ij}$, $1 \le i \le M$, $1 \le j \le N$,

and $V_{ij}$, $1 \le i \le M$, $1 \le j \le N - 1$, while (2.4) yields M(2N - 1) linear algebraic

equations for them. We should expect from the situation in the continuous problem

that the matrix of the linear system (2.4) has a one-dimensional null space. The

resulting indeterminacy can be eliminated either by prescribing the value of $U_{ij}$ at an

arbitrary index $(i_0, j_0)$, or by some alternative procedure which will arise naturally in

Section 3; the necessary discrete compatibility condition will also be discussed there.
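The consistency of the staggered equations (2.4a, b) can be illustrated by inserting a smooth exact solution and measuring the residual. This is a sketch under stated assumptions: the test pair u = sin x cos 2y, v = cos x sin y and the function name `truncation_error` are hypothetical choices for illustration, not taken from the paper's test cases.

```python
import numpy as np

def truncation_error(M):
    """Largest residual of (2.4a, b) when the exact u, v are inserted (N = M/2, h = k)."""
    N = M // 2
    h, k = 2 * np.pi / M, np.pi / N
    u = lambda x, y: np.sin(x) * np.cos(2 * y)
    v = lambda x, y: np.cos(x) * np.sin(y)
    d = lambda x, y: np.cos(x) * np.cos(2 * y) + np.cos(x) * np.cos(y)        # u_x + v_y
    e = lambda x, y: -2 * np.sin(x) * np.sin(2 * y) + np.sin(x) * np.sin(y)   # u_y - v_x

    i = np.arange(1, M + 1)[:, None]
    # (2.4a) is centered at ((i - 1)h, (j - 1/2)k), j = 1..N
    j = np.arange(1, N + 1)[None, :]
    xd, yd = (i - 1) * h, (j - 0.5) * k
    res_a = ((u(xd + h / 2, yd) - u(xd - h / 2, yd)) / h
             + (v(xd, yd + k / 2) - v(xd, yd - k / 2)) / k - d(xd, yd))
    # (2.4b) is centered at ((i - 1/2)h, jk), j = 1..N-1
    j = np.arange(1, N)[None, :]
    xe, ye = (i - 0.5) * h, j * k
    res_b = ((u(xe, ye + k / 2) - u(xe, ye - k / 2)) / k
             - (v(xe + h / 2, ye) - v(xe - h / 2, ye)) / h - e(xe, ye))
    return max(np.abs(res_a).max(), np.abs(res_b).max())

errs = [truncation_error(M) for M in (16, 32, 64)]
ratios = [errs[i] / errs[i + 1] for i in range(2)]
print(errs, ratios)
```

Each doubling of M should reduce the residual by a factor near 4, in line with the expected $O(h^2) + O(k^2)$ behavior.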

3. Solution Algorithm. Our plan shall be to rewrite (2.4)—(2.6) in a convenient

block matrix form, to eliminate V, then to bring the remaining block matrix operating

on U to the form of a discrete five-point Laplacian, and finally to formulate a fast

direct algorithm to solve for U. After this, V obtains from U by a straightforward,

fast computation.

Block Matrix Equation. We start by rewriting the discrete linear system (2.4)

in block matrix form. Let $U_j = (U_{1,j}, U_{2,j}, \ldots, U_{M,j})^*$, $V_j = (V_{1,j}, V_{2,j}, \ldots, V_{M,j})^*$,

where ( )* denotes (conjugate) transpose, so that $U_j$, $V_j$ are column vectors corresponding

to a horizontal mesh line. $D_j$ and $E_j$ are introduced in similar fashion. Also

let $\rho = k/h$. With this notation, (2.4) can be written as

(3.1a) $V_j = V_{j-1} - T U_j + k D_j$, $1 \le j \le N$,

(3.1b) $U_{j+1} = U_j - T^* V_j + k E_j$, $1 \le j \le N - 1$;

here $\rho^{-1}T$ is the familiar backward difference operator,

(3.2) $$T = \rho \begin{pmatrix} 1 & & & -1 \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \end{pmatrix}_{M \times M},$$

acting on the M-periodic vectors $U_j$, $V_j$, and $-\rho^{-1}T^*$ is the corresponding forward

difference operator.



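From (2.5) we have that $V_0$ and $V_N$ are known; accordingly, we redefine $D_1$ as

$D_1 + k^{-1}V_0$ in the first one, and $D_N$ as $D_N - k^{-1}V_N$ in the last one of the equations

(3.1a). After these changes of notation and a change of order, the vector equations

(3.1) can be put into the block matrix form

(3.3) $$\left(\begin{array}{cccc|ccc} -I & I & & & T^* & & \\ & \ddots & \ddots & & & \ddots & \\ & & -I & I & & & T^* \\ \hline T & & & & I & & \\ & \ddots & & & -I & \ddots & \\ & & \ddots & & & \ddots & I \\ & & & T & & & -I \end{array}\right) \begin{pmatrix} U_1 \\ \vdots \\ U_N \\ V_1 \\ \vdots \\ V_{N-1} \end{pmatrix} = k \begin{pmatrix} E_1 \\ \vdots \\ E_{N-1} \\ D_1 \\ \vdots \\ D_N \end{pmatrix},$$

where I is the M x M identity matrix, the block columns are partitioned as N and N - 1, and the block rows as N - 1 and N. This form is roughly analogous to writing

(2.1a, b) in reversed order,

(2.1') $$\begin{pmatrix} \partial/\partial y & -\partial/\partial x \\ \partial/\partial x & \partial/\partial y \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} e \\ d \end{pmatrix}.$$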

We notice that the matrix in (3.3) has only four nonzero diagonals of M x M

blocks and that the blocks are at most scalar tridiagonal. Furthermore, all the nonzero

entries are $\pm 1$ or $\pm\rho$. In the present form, however, we cannot take full advantage

of the extreme sparsity and simplicity of the matrix in order to obtain a solution

method comparable to fast Poisson solvers.

Before bringing (3.3) to a more advantageous form we remark that its matrix

indeed has a one-dimensional null space: the sum of the MN rows in the lower half of

the matrix is zero. The corresponding compatibility condition that $\sum_{i,j} D_{ij} = 0$ is the

discrete counterpart of (2.3c).

Decoupling of U and V. It is a well-known fact that each of the functions u, v

which satisfy (2.1) will also satisfy a Poisson equation obtained from (2.1) by eliminating

the other dependent variable by cross-differentiation. This suggests the attempt

to eliminate V in system (3.3) in order to obtain a linear algebraic system for U alone.

The matrix of this system will be similar to a discrete Laplacian, and to it we shall be

able to apply fast solution techniques.

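The elimination of V proceeds as follows. In system (3.3) add the Nth block

row to the (N + 1)st, then the (N + 1)st to the (N + 2)nd and so on. This discrete

summation procedure is analogous to integration with respect to y in (2.1a). The lower

part of the system thus becomes

(3.4) $$\left(\begin{array}{cccc|ccc} T & & & & I & & \\ T & T & & & & \ddots & \\ \vdots & & \ddots & & & & I \\ T & T & \cdots & T & 0 & \cdots & 0 \end{array}\right) \begin{pmatrix} U_1 \\ \vdots \\ U_N \\ V_1 \\ \vdots \\ V_{N-1} \end{pmatrix} = k \begin{pmatrix} D_1 \\ D_1 + D_2 \\ \vdots \\ \sum_{j=1}^{N} D_j \end{pmatrix}.$$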

After this transformation, (3.3) can be written as

(3.5) $$\begin{pmatrix} P & Q \\ R & I_{N-1} \\ T \cdots T & 0 \cdots 0 \end{pmatrix} \begin{pmatrix} U \\ V \end{pmatrix} = \begin{pmatrix} E' \\ D' \\ k \sum_{j=1}^{N} D_j \end{pmatrix};$$

here $U = (U_1^*, \ldots, U_N^*)^*$, $V = (V_1^*, \ldots, V_{N-1}^*)^*$, $I_{N-1}$ is the M(N - 1) x M(N - 1)

identity matrix, the definitions of P, Q and E' are easily read from Eqs. (3.3), while

those of R and D' follow from (3.4). In particular, note that D' and E' contain the

factor k.

From (3.5) it follows that

(3.6a) $PU + QV = E'$,

(3.6b) $RU + V = D'$,

where we omit for the moment the last vector equation. Substituting V from (3.6b)

into (3.6a), we obtain

(3.7) $(P - QR)U = E' - QD'$,

where

(3.8) $$QR = \begin{pmatrix} TT^* & & & & 0 \\ TT^* & TT^* & & & 0 \\ \vdots & & \ddots & & \vdots \\ TT^* & TT^* & \cdots & TT^* & 0 \end{pmatrix}$$

has N - 1 block rows and N block columns, since Q is block diagonal with constant equal blocks and T*T = TT*. Attaching now

the last equation of (3.5) to (3.7) yields

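(3.9) $$\begin{pmatrix} P - QR \\ T \cdots T \end{pmatrix} U = \begin{pmatrix} E' - QD' \\ k \sum_{j=1}^{N} D_j \end{pmatrix}$$

or, changing sign in all but the last equation,

(3.9') $$\left[ \begin{pmatrix} I & -I & & \\ & \ddots & \ddots & \\ & & I & -I \\ 0 & \cdots & \cdots & 0 \end{pmatrix} + \begin{pmatrix} TT^* & & & \\ \vdots & \ddots & & \\ TT^* & \cdots & TT^* & \\ T & \cdots & T & T \end{pmatrix} \right] U = k \begin{pmatrix} T^*D_1 - E_1 \\ T^*(D_1 + D_2) - E_2 \\ \vdots \\ \sum_{j=1}^{N} D_j \end{pmatrix}.$$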

Let S = TT*. Notice that

(3.10) $S = \rho(T + T^*)$

or, in the familiar notation for tridiagonal matrices,

(3.10') $T = \rho(-1, 1, 0)$, $T^* = \rho(0, 1, -1)$, $S = \rho^2(-1, 2, -1)$;

this is a slight extension of standard notation: because of periodicity these matrices are

actually circulants, rather than tridiagonal proper (cf. (3.2), (3.10)). Thus, S corre-

sponds to a central second difference, carrying further the analogy with the continu-

ous case. However, (3.9') is still not in a form suitable for fast solution.
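The identities around (3.10) are easy to confirm on a small example. The sketch below builds the circulant T of (3.2) with NumPy; the helper name `backward_difference` and the sizes M = 6, $\rho$ = 0.5 are illustrative assumptions.

```python
import numpy as np

def backward_difference(M, rho):
    """Circulant T = rho * (-1, 1, 0): (T x)_i = rho * (x_i - x_{i-1}), indices mod M."""
    I = np.eye(M)
    return rho * (I - np.roll(I, 1, axis=0))

M, rho = 6, 0.5                       # illustrative sizes, not from the paper
T = backward_difference(M, rho)
S = T @ T.T                           # for real T, T* is the transpose

same_product = np.allclose(S, T.T @ T)        # T T* = T* T
same_sum = np.allclose(S, rho * (T + T.T))    # identity (3.10)
stencil = S[2, 1:4] / rho**2                  # interior row of S divided by rho^2
corner = S[0, -1] / rho**2                    # periodic wrap-around entry
print(same_product, same_sum, stencil, corner)
```

The wrap-around entry is what makes S a circulant rather than a proper tridiagonal matrix.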

Discrete Poisson Equation for U. This form is now easily achieved. First, mul-

tiply the last block row by T*. Then subtract the second block row from the first,

the third from the second, and so on up to and including the last one; the new last

row is obtained by adding all the new rows, from 1 to (N - 1), to the last row. Fi-

nally, the new last row is taken as the first row of the system and all the signs are re-

versed, yielding

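(3.11) $$\begin{pmatrix} S + I & -I & & & \\ -I & S + 2I & -I & & \\ & \ddots & \ddots & \ddots & \\ & & -I & S + 2I & -I \\ & & & -I & S + I \end{pmatrix} \begin{pmatrix} U_1 \\ U_2 \\ \vdots \\ U_{N-1} \\ U_N \end{pmatrix} = k \begin{pmatrix} T^*D_1 - E_1 \\ T^*D_2 + E_1 - E_2 \\ \vdots \\ T^*D_{N-1} + E_{N-2} - E_{N-1} \\ T^*D_N + E_{N-1} \end{pmatrix} \equiv \begin{pmatrix} B_1 \\ B_2 \\ \vdots \\ B_{N-1} \\ B_N \end{pmatrix};$$

the vector B is introduced for future notational convenience.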

We notice that the matrix in (3.11) has almost, but not quite the form of a dis-

crete Laplacian (compare for instance Buzbee, Golub and Nielson [5], for Laplacian

with Dirichlet, Neumann and periodic boundary conditions). The matrix is symmetric

and nonnegative semidefinite; in fact, $U_{ij}$ = const satisfies the homogeneous system.

The system would be positive definite if any one diagonal element actually

exceeded the sum of the off-diagonal elements in the same row (e.g., Collatz [6,

pp. 43-47]). This immediately suggests that we make use of the possibility of prescribing

$U_{i_0 j_0} = u_0$, i.e., that we increase the element at the position $Mj_0 + i_0$ along

the diagonal by an arbitrary amount $\alpha > 0$, and add $\alpha u_0$ to the right-hand side of that

equation at the same time. We shall call the system thus modified (3.11a) for future

reference.
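The structural claims about the matrix of (3.11) can be verified on a small dense example. This is a sketch only: the Kronecker-product assembly, the helper name `poisson_like_matrix`, the sizes, and the choice of which diagonal entry to raise are assumptions made for illustration.

```python
import numpy as np

def poisson_like_matrix(M, N, rho):
    """Dense matrix of (3.11): diagonal blocks S+I, S+2I, ..., S+2I, S+I; off-diagonal blocks -I."""
    I = np.eye(M)
    P = np.roll(I, 1, axis=0)
    S = rho**2 * (2 * I - P - P.T)                 # circulant rho^2 * (-1, 2, -1)
    J = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    J[0, 0] = J[-1, -1] = 1.0                      # first and last diagonal blocks differ
    return np.kron(np.eye(N), S) + np.kron(J, I)

M, N, rho, alpha = 6, 4, 1.0, 1.0                  # illustrative values
A = poisson_like_matrix(M, N, rho)

is_symmetric = np.allclose(A, A.T)
eig = np.linalg.eigvalsh(A)                        # ascending eigenvalues
constant_in_null_space = np.allclose(A @ np.ones(M * N), 0.0)

A_mod = A.copy()
A_mod[0, 0] += alpha                               # raise one diagonal element, as for (3.11a)
eig_mod = np.linalg.eigvalsh(A_mod)
print(eig[:2], eig_mod[0])
```

The unmodified matrix has exactly one zero eigenvalue, with the constant vector as null vector; raising a single diagonal entry makes the smallest eigenvalue strictly positive.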

The system (3.11a) could now be solved by modifying any one of the different

methods or combination of methods available for Poisson's equation, as reviewed for

instance by Buzbee, Golub and Nielson [5], Dorr [9], or Widlund [32]. In this sense,

the elimination of V is analogous to a single step of odd/even reduction as proposed

by Hockney [21], [22], although the role of this step is more crucial in the present

context. We describe in the sequel the particular fast algorithm chosen to solve (3.11a).

Fast Direct Solution for U. Essentially, our algorithm is based on the one described

in greater generality by Buzbee, Golub and Nielson [5] as matrix decomposition.

It relies on the fact that the eigenvalues $\lambda_k$ and eigenvectors $q_k$ of S are known,

i.e., that

$S q_k = \lambda_k q_k$,

for

(3.12a) $\lambda_k = 2\rho^2(1 - \cos(2\pi(k - 1)/M))$, $1 \le k \le M$,

(3.12b) $q_k = M^{-1/2}(1, e^{-2\pi i(k-1)/M}, \ldots, e^{-2\pi i(M-1)(k-1)/M})^*$, $1 \le k \le M$,

where i is the imaginary unit.

Our algorithm deviates, however, from general matrix decomposition because the

first and last diagonal blocks in (3.11) differ from all the other blocks and because the

matrix (3.11), without suitable modification, is singular. For the sake of simplicity, we

shall describe the actual algorithm used directly in a self-contained manner.

Let $\mathcal{Q}$ be the matrix whose columns are $q_1, q_2, \ldots, q_M$, and let $\Lambda$ =

diag($\lambda_1, \lambda_2, \ldots, \lambda_M$). Then, with our normalization of $q_k$,

(3.13) $\mathcal{Q}^*\mathcal{Q} = I$, $\mathcal{Q}^* S \mathcal{Q} = \Lambda$.

Introduce $\hat U_j$, $\hat B_j$ by

(3.14a) $U_j = \mathcal{Q}\hat U_j$,

(3.14b) $B_j = \mathcal{Q}\hat B_j$, $1 \le j \le N$,

where $B_j$ are the subvectors on the right-hand side of (3.11).
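The diagonalization (3.13) can be confirmed numerically with an explicit discrete Fourier matrix. In this sketch the sign convention of the exponent is an assumption (either sign diagonalizes the symmetric circulant S), and the sizes are illustrative.

```python
import numpy as np

M, rho = 8, 1.5                                    # illustrative values
I = np.eye(M)
P = np.roll(I, 1, axis=0)
S = rho**2 * (2 * I - P - P.T)                     # S = rho^2 * (-1, 2, -1), circulant

idx = np.arange(M)
# Column k of Q holds a normalized Fourier vector (0-based k corresponds to k - 1 in (3.12)).
Q = np.exp(-2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)
lam = 2 * rho**2 * (1 - np.cos(2 * np.pi * idx / M))   # eigenvalues as in (3.12a)

unitary = np.allclose(Q.conj().T @ Q, I)
diagonalized = np.allclose(Q.conj().T @ S @ Q, np.diag(lam))
print(unitary, diagonalized)
```

Since multiplication by this matrix or its conjugate transpose is a discrete Fourier transform up to scaling, an FFT routine can stand in for the dense product in practice.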



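Clearly, (3.14) corresponds to a discrete Fourier transform in the x-direction.

Notice in particular that

$\hat U_{j,1} = \sum_{i=1}^{M} U_{i,j} / \sqrt{M}$,

where $\hat U_{j,1}$ denotes the first component of the vector $\hat U_j$.

Premultiplying (3.11) by a block-diagonal matrix with all the diagonal blocks

equal to $\mathcal{Q}^*$, and using (3.14), we obtain

(3.15) $$\begin{pmatrix} \Lambda + I & -I & & & \\ -I & \Lambda + 2I & -I & & \\ & \ddots & \ddots & \ddots & \\ & & -I & \Lambda + 2I & -I \\ & & & -I & \Lambda + I \end{pmatrix} \begin{pmatrix} \hat U_1 \\ \hat U_2 \\ \vdots \\ \hat U_{N-1} \\ \hat U_N \end{pmatrix} = \begin{pmatrix} \hat B_1 \\ \hat B_2 \\ \vdots \\ \hat B_{N-1} \\ \hat B_N \end{pmatrix}.$$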

In order to turn the matrix in (3.15) from block-tridiagonal with diagonal blocks

into a tridiagonal matrix, it suffices to reorder the components of $\hat U$ and of $\hat B$ by vertical

lines rather than by Fourier transformed vectors of horizontal lines, i.e., let

(3.16) $\hat U_{k,j} = \hat U_{j,k}$, $\hat B_{k,j} = \hat B_{j,k}$, $1 \le j \le N$, $1 \le k \le M$.

Here again the first subscript denotes the vector partition and the second denotes the

component of the subvector; this notational convention is different from the mesh

notation introduced in Section 2 and used in (3.1). Using the reordering (3.16),

Eq. (3.15) becomes

(3.17) $-A_k \hat U_k = \hat B_k$, $1 \le k \le M$,

with

(3.18a) $$A_k = \begin{pmatrix} c_k & 1 & & & \\ 1 & a_k & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & a_k & 1 \\ & & & 1 & c_k \end{pmatrix}_{N \times N},$$

(3.18b) $a_k = -2 - \lambda_k = 2\rho^2[\cos(2\pi(k - 1)/M) - 1] - 2$,

(3.18c) $c_k = -1 - \lambda_k = 2\rho^2[\cos(2\pi(k - 1)/M) - 1] - 1$, $1 \le k \le M$;

here the $c_k$ come from the first and last diagonal block in (3.15).

It is easy to check that each Ak is nonsingular, except the first, which has a

one-dimensional null space. This reflects the fact that we have not yet utilized a side

condition eliminating U = const as a null vector of (3.11), or of (3.15). As in the



discussion of (3.11a), we can deal with the singularity of $A_1$ by prescribing any component

of $\hat U_1$, $\hat U_{1,j}$ say; this means prescribing $\hat U_{1,j} = \hat U_{j,1} = \sum_{i=1}^{M} U_{i,j} / \sqrt{M}$,

rather than one given mesh value $U_{i_0 j_0}$. Again we call the suitably modified systems

(3.15a), (3.17a) and (3.18a).

The LU factorization of the tridiagonal matrices in (3.18a) is performed next,

say $A_k = L_k \mathcal{U}_k$, storing the nontrivial entries of $L_k$ and $\mathcal{U}_k$. The diagonal elements

of $L_k$ and the superdiagonal elements of $\mathcal{U}_k$ are identically 1, while the subdiagonal elements

of $L_k$ and the diagonal elements of $\mathcal{U}_k$ are reciprocals of each other. Hence only

the diagonal elements of $\mathcal{U}_k$ in fact need to be stored.

Having computed B and thus $\hat B$ from the original data, we can solve for $\hat U$ and

hence for U. The computation of $\hat B$ from B and of U from $\hat U$ involves matrix multiplication

by $\mathcal{Q}^*$ and by $\mathcal{Q}$, respectively. Since these correspond to a direct and an inverse

Fourier transform, they can be performed efficiently by a fast Fourier transform

(FFT) algorithm. The algorithm used was the one published by Cooley, Lewis and

Welch [7], modified to operate in the real.

Summary and Programming Considerations. The solution algorithm developed

in this section reduces in practice to the following:

(i) From the data of the problem D, E, compute

$B_1 = k(T^*D_1 - E_1)$,

$B_j = k(T^*D_j + E_{j-1} - E_j)$, $2 \le j \le N - 1$,

$B_N = k(T^*D_N + E_{N-1})$,

where $D_1$, $D_N$ include the boundary data $V_0$, $V_N$, respectively. The number of operations

is $n_1 = 4MN$.*** Notice that, after shifting of the blocks $E_j$, B can overwrite

E, since E is no longer needed.

(ii) From the B above compute $\hat B$ (cf. (3.14b)) by

$\hat B_j = \mathcal{Q}^* B_j$, $1 \le j \le N$,

using an FFT, with $n_2 = 2MN \log_2 M$ real operations. Although FFT algorithms exist

which use $MN \log_2 M$ operations for real transforms, we did not feel that in our application

actual execution time would be much improved by the utilization of such an

algorithm. In this step, $\hat B$ can be stored in the previous location of B. Note that $\hat B_j$

is complex with pairwise complex conjugate components.

(iii) Solve (3.17a) for the $\hat U_k$ by forward elimination and back substitution,

using the stored entries of the LU factors. One solution, for fixed k, requires 8N real

operations, since the complex conjugate components of $\hat B$ enter nonsymmetrically into

the LU factorization of (3.17a), and we have to operate separately on the real and the

imaginary parts of $\hat B$. However, the fact that the components of $\hat U_j$ are also complex

conjugate allows us to solve only M/2 + 1 systems, so that the operation count for this

step is $n_3 = 4MN$ real operations. Notice that no reindexing of $\hat B$ as indicated by

(3.16) is actually necessary, and that at this stage $\hat U$ can overwrite $\hat B$. The shifting of

$\hat B$ elements and $\hat U$ elements required to use the LU subroutine is relatively fast.

***All operation and storage counts are given to highest order. We do not distinguish between

addition, multiplication and division, and we do not assume that accumulation of products

is particularly efficient. Furthermore, we take h = k, or $\rho = 1$.

(iv) From the $\hat U_k$ computed in (iii) obtain U (again with no need for reindexing)

by carrying out (3.14a) with another application of the FFT, in $n_4 = 2MN \log_2 M$ real

operations, and with U stored instead of $\hat U$.

(v) Compute V from U obtained in (iv), using (3.1a). This requires $n_5 = 4MN$

operations. Notice that in this recursion V does not depend on $U_N$, $D_N$, since $V_N$ is

known. Using (3.1a) as a backward recursion would yield independence of V on $U_1$,

$D_1$, since $V_0$ is also known. This redundancy, however, is only apparent and computations

with (3.1a) used as either forward or backward recursion yielded results with

the same accuracy, which only depended on the accuracy of U.

The final operation count is

$n = \sum_{k=1}^{5} n_k = 4MN(3 + \log_2 M)$,

to obtain both U and V at all the grid points at which they are defined.

The storage requirement for the L, U factors is $s_1 = MN/2$, using again the fact

that we only solve for half the number of components of $\hat U$; the storage requirement

for the right-hand sides D, E is $s_2 = 2MN$; and these locations are successively overwritten

until U, V are stored in them. Hence, the total storage requirement is s =

$s_1 + s_2 = 5MN/2$.
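As a quick arithmetic check, the five step counts quoted in (i) to (v) do add up to the closed form, and the two storage terms to 5MN/2. The function names and the sample size M = 64, N = 32 below are illustrative assumptions.

```python
from math import log2

def step_counts(M, N):
    """Highest-order operation counts n_1, ..., n_5 of steps (i)-(v)."""
    fft = 2 * M * N * log2(M)
    return [4 * M * N, fft, 4 * M * N, fft, 4 * M * N]

def storage(M, N):
    """s_1 + s_2: LU diagonals for half the modes plus the two right-hand-side arrays."""
    return M * N / 2 + 2 * M * N

M, N = 64, 32                                  # sample size with h = k
total = sum(step_counts(M, N))
closed_form = 4 * M * N * (3 + log2(M))
print(total, closed_form, storage(M, N))
```

For this size the two FFT steps account for two thirds of the total count, which is why the logarithmic factor dominates as M grows.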

We conclude this section with a diagram of the algorithm:

$D_j, E_j, V_0, V_N \xrightarrow[4MN]{E} B_j \xrightarrow[2MN\log_2 M]{FFT} \hat B_j \xrightarrow{R} \hat B_k \xrightarrow[4MN]{LU} \hat U_k$

$\xrightarrow{R} \hat U_j \xrightarrow[2MN\log_2 M]{FFT} U_j \xrightarrow[4MN]{E} V_j$;

here E stands for evaluations by linear recursion, FFT for an application of the Fast

Fourier Transform, R for reindexing, and LU for factorization. The ranges of the indices

are $1 \le j \le N$, $1 \le k \le M$, and the operation counts are given under the corresponding

step of the algorithm.
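Steps (i) to (v) can be sketched compactly in NumPy. This is an illustrative reimplementation under stated assumptions, not the program of Appendix B: `np.fft.rfft` and `np.fft.irfft` stand in for the Cooley-Lewis-Welch FFT, a dense `np.linalg.solve` per Fourier mode replaces the stored tridiagonal LU factors, and the null vector is removed by prescribing the mean of U on the line j = 1 (parameter `u1_mean`). The function name, the argument layout and the random test data are hypothetical.

```python
import numpy as np

def cauchy_riemann_solve(D, E, V0, VN, h, k, u1_mean=0.0, alpha=1.0):
    """Solve (2.4)-(2.6). D[j-1, i-1] = D_ij with shape (N, M); E has shape (N-1, M);
    V0, VN hold V_i0 and V_iN. Returns U of shape (N, M) and V of shape (N+1, M)."""
    N, M = D.shape
    rho = k / h
    T = lambda X: rho * (X - np.roll(X, 1, axis=-1))        # rho * backward difference
    Tstar = lambda X: rho * (X - np.roll(X, -1, axis=-1))   # T* = -rho * forward difference

    Dm = D.astype(float).copy()
    Dm[0] += V0 / k                    # D_1 absorbs the boundary data V_0
    Dm[-1] -= VN / k                   # D_N absorbs the boundary data V_N

    B = k * Tstar(Dm)                  # step (i): right-hand side of (3.11)
    B[:-1] -= k * E
    B[1:] += k * E

    Bh = np.fft.rfft(B, axis=1)        # step (ii): transform in the x-direction
    modes = np.arange(Bh.shape[1])
    lam = 2.0 * rho**2 * (1.0 - np.cos(2.0 * np.pi * modes / M))

    J = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    J[0, 0] = J[-1, -1] = 1.0
    Uh = np.empty_like(Bh)
    for m in modes:                    # step (iii): one tridiagonal system per mode
        A = J + lam[m] * np.eye(N)
        rhs = Bh[:, m].copy()
        if m == 0:                     # singular mode: prescribe the sum of U on line j = 1
            A[0, 0] += alpha
            rhs[0] += alpha * M * u1_mean
        Uh[:, m] = np.linalg.solve(A, rhs)

    U = np.fft.irfft(Uh, n=M, axis=1)  # step (iv): inverse transform

    V = np.empty((N + 1, M))           # step (v): forward recursion (3.1a)
    V[0], V[N] = V0, VN
    for j in range(1, N):
        V[j] = V[j - 1] - T(U[j - 1]) + k * D[j - 1]
    return U, V

# Usage: manufacture D, E from random mesh functions, then recover them.
rng = np.random.default_rng(0)
M, N = 16, 8
h, k = 2 * np.pi / M, np.pi / N
U_true = rng.standard_normal((N, M))
V_true = rng.standard_normal((N + 1, M))
D = (U_true - np.roll(U_true, 1, axis=1)) / h + (V_true[1:] - V_true[:-1]) / k
E = (U_true[1:] - U_true[:-1]) / k - (np.roll(V_true[1:-1], -1, axis=1) - V_true[1:-1]) / h
U, V = cauchy_riemann_solve(D, E, V_true[0], V_true[-1], h, k, u1_mean=U_true[0].mean())
print(np.abs(U - U_true).max(), np.abs(V - V_true).max())
```

Because the data are generated from the discrete equations themselves, the recovered U and V should agree with the originals to rounding error, which tests the algebra of the elimination independently of the discretization error. The dense per-mode solve costs more than the linear-time tridiagonal sweep described in step (iii) and is used here only for brevity.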

4. Numerical Results. In this section we shall present results of test computa-

tions for model problem (2.1)—(2.3). The results will provide evidence for the short

computation time required by the method, for its second-order accuracy and for its

insensitivity to errors in the data.

The experiments consisted in evaluating analytically d, e in (2.1) for known

u, v, and then comparing the numerical solutions U, V with the correct analytical

u, v. The computations were carried out on an IBM 360/95 computer, with a

FORTRAN IV, level H compiler, the code optimization parameter OPT being set to

the value OPT = 2. Double precision arithmetic was used throughout our numerical

experiments. Test comparisons on an Amdahl 470V/6 computer and on a CDC 6600



gave entirely similar results; the program is currently being developed into a package

and we expect to test it on other computers as well.

Initially, three different versions of the program were run. The first version

solved the positive definite modification (3.11a) of the system (3.11) by Cholesky

factorization. It is denoted by C in the tables, and given for orientation purposes

only. This version required much longer computer time; moreover, results were less

accurate than with the other two versions, since the larger number of arithmetic oper-

ations introduced more round-off error.

The second version utilized diagonalization by the discrete Fourier transform as

given in (3.14) and the reindexing given in (3.16), but did not use the FFT in steps

(ii) and (iv) of the algorithm. It is thus somewhat comparable to Hockney's original

method [21] and is also included for comparison purposes. Its numerical results were

equal at least to within three significant digits to those of the third version and are

listed separately only in Table III. This version is denoted by P in the tables.

Table I

CPU execution times for the versions C, P and F on IBM 360/95 with compiler

optimization OPT = 2. The number of points is $MN = M^2/2$.

  M        C       P       F
 16     0.07    0.09    0.01
 32     0.88    0.65    0.04
 64    14.35    5.17    0.13
128        -       -    0.51
256        -       -    2.11

Table II

CPU execution times for version F with different optimization levels, OPT = 0,

1, 2, 3, on IBM 360/95, Amdahl 470V/6, and CDC 6600. The compiler used on the CDC computer was FTN 4.6; on the other two computers, a FORTRAN IV,

level H compiler was used.

         IBM 360/95, OPT =         Amdahl 470V/6, OPT =       CDC 6600, OPT =
  M      0     1     2     3       0     1     2     3        0      1      2
 16   0.02  0.02  0.01  0.00    0.02  0.02  0.01  0.01    0.059  0.031  0.025
 32   0.08  0.07  0.04  0.03    0.09  0.07  0.05  0.06    0.232  0.120  0.093
 64   0.35  0.20  0.13  0.13    0.38  0.30  0.23  0.24    0.969  0.494  0.382
128   1.40  0.88  0.51  0.52    1.53  1.25  0.96  0.96    4.065  2.034  1.615
256   5.91  3.70  2.11  2.13    6.52  5.46  4.07  4.14        -      -      -

The third version is the one actually described in the summary of Section 3 and

listed in Appendix B; it represents the proposed method. This version is denoted by F

in the tables.



Tables I and II contain timing results. These are essentially independent of the

solution, and depend only on the method, or program version, and on the number of

points. Table I compares CPU execution times for the three program versions C, P

and F on the IBM 360/95 computer with compiler optimization OPT = 2. As the

number of points increases, the advantages of version F become more and more obvious.

We can also see clearly that the execution time for version F is proportional to the

number of points MN.
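The proportionality claim can be checked directly against the version F timings. The sketch below assumes that the last column of Table I (0.01, 0.04, 0.13, 0.51, 2.11) belongs to version F, a reading that agrees with the IBM 360/95, OPT = 2 entries of Table II.

```python
M_values = [16, 32, 64, 128, 256]
t_F = [0.01, 0.04, 0.13, 0.51, 2.11]          # version F timings as read from Table I

points = [M * M // 2 for M in M_values]       # MN = M^2 / 2
# Doubling M multiplies the number of points by 4, so a time proportional to MN
# (up to the slowly growing log2 M factor) should grow by roughly 4 per row.
growth = [t_F[i + 1] / t_F[i] for i in range(len(t_F) - 1)]
print(points, growth)
```

The growth factors cluster around 4, with the smallest sizes limited by the two-digit resolution of the reported times.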

Table II gives CPU execution times for version F with different optimization

levels, OPT = 0, 1, 2, 3, on the IBM 360/95, Amdahl 470V/6, and CDC 6600 comput-

ers. This is meant to give an idea of the order of magnitude of running time improve-

ments which could still be achieved by code optimization. We notice that timing with

OPT = 3 is not superior to OPT = 2.

The other tables are organized as follows: the first column contains the known

test solution (u, v), and for Table V only, its $L_2$ and $L_\infty$ norms. The norms used for

u are the continuous norms

(4.1a) $\|u\|_2^2 = \frac{1}{2\pi^2} \int_0^{\pi}\!\int_0^{2\pi} u^2(x, y)\,dx\,dy$,

(4.1b) $\|u\|_\infty = \sup_{0 \le x \le 2\pi,\ 0 \le y \le \pi} |u(x, y)|$,

and similarly for v. The corresponding $L_2$ and $L_\infty$ norms for the vector (u, v) are defined

in the obvious way, with $u^2 + v^2$ as the integrand in (4.1a) and max{|u|, |v|} as

the maximand in (4.1b). These norms are given in order to compare the corresponding

norms of the computational error with them.

The second column contains the version of the program used, specifically version

C or version P/F. The third column contains the value of M, the number of points in

the x-direction. All reported computations were carried out with equal mesh spacing

in the x- and y-directions, i.e., $h = k = 2\pi/M$. Hence, N = M/2 always, and the total

number of points equals $M^2/2$.

The remaining columns contain the absolute errors in (u - U, v - V). The

norms used here are the discrete counterparts of the norms in the first column:

(4.2a) $l_2^2(U) = \sum_{i=1}^{M} \sum_{j=1}^{N} U_{ij}^2 / MN$,

(4.2b) $l_\infty(U) = \max_{1 \le i \le M,\ 1 \le j \le N} |U_{ij}|$,

and similarly for V and (U, V). For simplicity, in the column headings, as well as in

(4.2) above, we used $l_2(U)$, instead of $l_2(u - U)$, and so on. The grid values of (u, v)

entering (4.2) are those identified in Section 2 and, therefore, are taken at different

locations for u than for v.
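The discrete norms (4.2) translate directly into array operations. A minimal sketch follows; the function names and the small sample error array are illustrative assumptions.

```python
import numpy as np

def l2_norm(err):
    """Discrete l_2 norm of (4.2a): square root of the mean of the squared entries."""
    return np.sqrt(np.sum(np.abs(err) ** 2) / err.size)

def linf_norm(err):
    """Discrete l_infinity norm of (4.2b): largest absolute entry."""
    return np.max(np.abs(err))

err = np.array([[3.0, -4.0]])      # stands for u - U sampled at the U mesh points
print(l2_norm(err), linf_norm(err))
```

In an actual comparison, `err` would be the exact u (or v) sampled at the staggered locations of Section 2 minus the computed mesh function.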

Table III shows that for u, v which are combinations of linear and of trigonometric

functions, and thus eigenvectors of the discrete operator in (3.11), the results

are essentially exact to machine accuracy (in double precision arithmetic).



Table III

Numerical results with versions C, P and F of the algorithm for a number of simple

test cases. The labelling of rows and columns is explained in the text.

*_<« "«(V)

u = sin(y), v = y sin{x) 6351

4364

330S

1272

-14

-13

-12

-11

5457 -15

2929 -14

1543 -13

6296 -12

1.9989

1.7934

1.6889

1.5273

2.8144 -14

2.5881 -13

2.4685 -12

2.2408 -11

6.6613 -15

7.6605 -14

6.7857 -13

5.5140 -12

5286

2001

7386

4957

4851 -16

0922 -16

4532 -15

2021 -15

3.5100

5.1500

2.2120

1.1267

-16

-16

-15

-14

8.7430 -16

1.4017 -15

6.7862 -15

2.8847 -14

6.6613 -16

1.5543 -15

5.1070 -15

3.2488 -14

7088

3077

5606

4255

4397 -16

3889 -16

5149 -16

2937 -15

3.0436

5.1554

1.8946

1.0286

-16

-16

-15

-14

5.6287 -16

1.5821 -15

4.3438 -15

3.9314 -14

6.6613 -16

6.6613 -16

1.5682 -15

7.7934 -15

y sin (x) , v = sin (y) 3764

7252

4271

0832

5357 -15

2782 -14

9268 -13

5755 -12

2.5870

2.0024

1.6875

1.3481

-15

-14

-13

-12

4.2188 -15

3.5083 -14

2.9177 -13

2.2524 -12

6.4254 -15

5.8356 -14

5.0736 -13

4.2358 -12

8762

8400

1500

1996

-15

-15

-15

-14

5357 -15

9115 -15

9430 -14

7951 -13

2.1834

3.6143

2.7782

1.2690

4.6629 -15

6.0091 -15

2.7145 -14

7.9020 -14

9.7561 -15

2.5924 -14

3.2709 -13

2.0683 -12

0791

7749

6763

5572

-16

-16

-15

-15

9369 -16

0176 -14

3516 -13

0348 -12

2.0194

1.3785

9.4054

7.2592

6.6613 -16

8.8818 -16

7.1054 -15

9.7145 -15

5.6899 -16

3.1225 -14

2.0578 -13

1.4948 -12

y sin(x) + 0.001 ,

y4 + 0.001

2.2678

8.0236

6.6951

-14

-14

-13

1.4387 -14

8.2059 -14

7.8598 -13

1.9560 -14

8.1092 -14

7.2820 -13

5.3290 -14

1.9540 -13

1.6165 -12

3.5527 -14

2.4158 -13

2.3128 -12

1.1175

1.1179

4.1173

8.9470 -15

1.5689 -14

1.3525 -13

1.0279 -14

1.3750 -14

9.8620 -14

3.6193 -14

5.4622 -14

2.5047 -13

2.1316 -14

1.0303 -13

1.4211 -12

7.3201

1.6285

8.2406

1.1531 -14

2.5126 -13

7.6397 -13

9.3597 -15

1.7206 -15

5.3471 -13

1.4655 -14

3.1974 -14

1.8474 -13

3.1974 -14

5.2558 -13

1.4246 -12

Table IV presents results for slightly perturbed boundary conditions $v^{(0)}(x)$, $v^{(1)}(x)$, $u(x_0, y_0)$ and right-hand sides d(x, y), e(x, y). Notice that errors in the computed V are bounded by the errors in the prescribed boundary data $v^{(0)}$, $v^{(1)}$ in the absence of other errors. Also observe that the $l_2$ error in U is larger when d, e are perturbed than when $v^{(0)}$ and $v^{(1)}$ are perturbed by a comparable amount.

Table V contains results for a number of more severe test cases, in which the discretization error is nonnegligible. For comparison, an additional last column is included in this table, which gives a theoretical error bound e(u) on the $l_\infty$-discretization error in u. This bound is explained and proved in Appendix A; the values of the constants a, b, $\alpha$ and $\beta$ introduced there were chosen so as to optimize the bound, and the constants CR, Cr in (A.14), (A.15) were computed explicitly. We observe that the numerical errors are indeed lower than the bound in all cases in which u, v are sufficiently differentiable. In fact, the



Table IV

Numerical results with the proposed algorithm (version F) for problems with slightly perturbed boundary conditions $v^{(0)}(x)$, $v^{(1)}(x)$, $U(x_0, y_0)$ and right-hand sides d(x, y), e(x, y). The subscript $(\,)_c$ stands for the correct values.

t-(V) ».("I *<v>

u = sin(x) sin(y), v = sin(4x) 1270

1732

2952

2417

1071

7.0594 -2

1.6540 -2

4.0377 -3

9.9953 -4

2.4878 -4

5.0665

1.2091

2.9790

7.4064

1.8473

4.1757

1.6793

5.2345

1.4537

3.8233

1.0958 -1

2.6040 -2

6.4284 -3

1.6021 -3

4.0020 -4

d s l.ld , e = l.le„8127 -2

6760 -2

4525 -2

4003 -2

3874 -2

1.4141 -1

8.1393 -2

6.6998 -2

6.3252 -2

6.2204 -2

1.0866 -1

6.9773 -2

6.0982 -2

5.8773 -2

5.8172 -2

1.0417 -1

1.0102 -1

1.0028 -1

1.0009 -1

1.0003 -1

2.1951 -1

1.2814 -1

1.0667 -1

1.0138 -1

1.0007 -1

v<°> = l.lv<°>, v(1» = l.lv'1' 8156 -3

4438 -2

8546 -2

9594 -2

9858 -2

7.7503

2.6059

1.8548

1.8582

1.9146

5.3018

2.0885

1.E547

1.9099

1.9507

6.2641

4.7086

7.5598

8.8792

9.4717

1.1061

6.1123

6.9934

8.2510

9.0694

u = sin(x) sin(y), v = sin(4x)

„"» = l.M v<°>. JV= l-Ol „<»

2.4564

9.1866

6.1478

5.5561

5.4252

1.9396

3.2585

7.3104

1.6720

1.9138

7.7675

2.3026

1.0333

7.2249

6.4446

7.1245

1.7228

4.9289

2.3682

2.0042

5.6013

1.7323

8.4702

6.4382

5.9547

5.0689

1.2210

3.4965

2.0472

1.9594

-2

-2

-3

-3

-3

-2

-2

-3

-3

-3

4.7747

2.4293

1.3836

1.0738

1.0026

3.8097

1.0437

2.8843

7.5755

9.1283

1.2057

3.6250

1.6453

1.1580

1.0367

1.0968

2.6091

8.8601

8.5085

9.1034

u = sin(x)sin(y), v = sin(4x)

d ■ l.Old , e = 1.01 e

V10>. 1.01 v¿°>. ,<». 1.01 ,«>

u(x0,y0) = 1.01 uc(x0,y0)

2.2784

7.7732

5.3611

5.0612

5.0134

7.8326 -2

2.3701 -2

1.1088 -2

8.0548 -3

7.3146 -3

5.6034 -2

1.7407 -2

8.6658 -3

6.7151 -3

6.2660 -3

4.4085 -2

1.7937 -2

1.0386 -2

1.0096 -2

1.0024 -2

1.2068 -1

3.6301 -2

1.6493 -2

1.1618 -2

1.0404 -2

u ■ sin(x) sin{y)

v = sin(4x) sin (4y)

3.2273

8.0409

2.0085

5.0203

1.2550

5.9183 -2

1.3515 -2

3.2789 -3

8.1045 -4

2.0164 -4

4.0498 -2

9.4190 -3

2.3045 -3

5.7193 -4

1.4258 -4

6.2089 -3

1.5927 -3

4.0074 -4

1.0035 -4

2.5096 -5

1.1072 -1

2.6172 -2

6.4545 -3

1.6082 -3

4.0171 -4

d = 1.01 dc

(0)v

u(x.,y.) = 11 u (*.n,yn)

8.2597

5.8123

5.2030

5.0508

5.0128

6.5120 -2

1.8814 -2

8.3918 -3

5.8582 -3

5.2234 -3

4.4892 -2

1.3738 -2

6.9572 -3

5.4663 -3

5.1188 -3

1.5891 -2

1.1513 -2

1.0381 -2

1.0096 -2

1.0024 -2

1.2183 -1

3.6434 -2

1.6519 -2

1.1625 -2

1.0406 -2

1.1 e-ldc

1.1 vvc

u(xn,y.) - l.lu (xn,y„)*0"0'c'-O'-'O'

5.3550

5.0885

5.0221

5.0055

5.0014

1.1855 -1

6.6507 -2

5.4407 -2

5.1287 -2

5.0418 -2

8.9935 -2

5.8963 -2

5.2323 -2

5.0670 -2

5.0216 -2

1.0302 -1

1.0079 -1

1.0020 -1

1.0005 -1

1.0001 -1

2.2179 -1

1.2879 -1

1.0710 -1

1.0177 -1

1.0044 -1

errors become very nearly equal to the bound in some such cases (e.g., $u = y \sin(x)$, $v = y^2 \sin(x)$), which seems to indicate that the bound is rather close to being sharp. The theoretical bound provides also an indication of error magnitude even in some cases where the data are not sufficiently differentiable (e.g., $u = y^4 \sin(8x)$, $v = x(2\pi - x) \sin(8y)$; $u = v = x(\pi - x)(2\pi - x) \sin(y)$).



TABLE IV (Continued)

»,(0) l,(U,V) »_(V)

i -= sin(x) sin (y)

, = sin(4x) sin(4y)

,(0) _ „(0)

,(1) „<1>

+ 0.1 sin(32x)

-l- 0.1 sin(32x)

6.7163

6.9041

3.5965

4.7732

5.3989

5.9392

3.6605 -2

6.3567 -2

2.7103

4.7378

u = sin(x) sin(y)

v = sin(4x) sin(4y)

„(0) „ v<°> + o.lsin(31x)

v(1) - v'1'* 0.1sin(31x)

3.2180

3.2461

7.4420

6.2894

7.0221

7.0895

4.3097

4.5354

3.7044

4.9065

5.3831

3.7981

6.4849

5.5055

6.0615

7.2440

8.1892

4.1426

5.2501

6.9534

1.7657 -1

9.5994 -2

1.8729 -2

2.7606 -2

4.8358 -2

u = sin(x) sin(y)

v = sin(4x)

v<0) = vt0) + 0.1 sin(31x)c

v(1) =. vll) + 0.1 sin(31x)

3.8439

3.2860

7.5512

6.8369

7.0226

8.0665

4.4139

5.1109

3.7503

4.9087

6.1843

3.8780

6.4665

5.5256

6.0626

1.1296

9.8528

4.6641

5.3952

6.9553

1.5208 -1

9.2006 -2

1.8363 -2

2.7601 -2

4.8357 -2

y sin(3x)

2 -x)

+ 0.1 sin(31x)

1 = x(2 -x) sin(3y)

,(D 0.1 sin(31x)

1.1961

2.9682

7.3896

1.9594

8.3894

5.7240

1.4061

3.3020

8.9053

5.3044

9.5706

2.3461

5.7534

1.5258

7.0244

3.6459

1.0109

2.6792

1.0993

8.3911

1.5927

4.1008 -1

1.0273 -1

3.2782 -2

4.8979 -2

y sin(4x)

2 -x)

+ 0.1 sin(31x)

v = x{2 -x) sin{4y)

,d> + 0.1 sin(31x)

2.2889

5.5349

1.3716

3.4841

1.1054

9.2522

2.1416

5.1032

1.3094

5.8169

1.7871

4.2463

1.0410

2.6397

8.8422

5.1558

1.7679

4.9607

1.6413

9.8365

2.8077

6.7504 -1

1.6534 -1

4.1227 -2

4.9620 -2

The formal truncation error of the finite-difference scheme, in the interior as well as on the boundary, is $O(h^2 + k^2)$;† hence, we expect that the scheme will produce second-order accurate results. A good way of testing this numerically is by computing the ratio

$\|U^{(2h,2k)}\| / \|U^{(h,k)}\|$

and similar quantities for V and (U, V); here $\|\cdot\|$ is either the $l_2$ or the $l_\infty$ norm of (4.2). For sufficiently small h, k and twice continuously differentiable (u, v) these ratios should be very close to 4 if the method is indeed second-order accurate, and it should be close to 2 if the method is first-order accurate.
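The ratio test can be sketched in a few lines of Python. The error values below are one column of the first test case of Table V ($u = y \sin(x)$, $v = y^2 \sin(x)$) for M = 8, 16, 32, 64; the function name `observed_orders` is an illustrative assumption, and the observed order is taken as $\log_2$ of the ratio.

```python
import math

def observed_orders(errors):
    """Ratios err(2h)/err(h) and the corresponding orders log2(ratio)."""
    ratios = [coarse / fine for coarse, fine in zip(errors, errors[1:])]
    orders = [math.log2(r) for r in ratios]
    return ratios, orders

# Errors for M = 8, 16, 32, 64 (one column of Table V, first test case).
errors = [3.573e-2, 9.049e-3, 2.254e-3, 5.607e-4]
ratios, orders = observed_orders(errors)

for M, r, q in zip((8, 16, 32), ratios, orders):
    print(f"M = {M:3d} -> {2 * M:3d}: ratio = {r:.3f}, observed order = {q:.2f}")
```

Ratios near 4 correspond to an observed order near 2, and ratios near 2 to an observed order near 1.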

The indicated ratios are computed and entered in additional rows in Table V; the entry in column 3 indicates the values of M and 2M, rather than of h and h/2, to which the norms whose ratio was taken correspond. The interesting result is that for those cases tested the method seems to have second-order accuracy even when v is merely continuous and u once continuously differentiable, and that it is first-order accurate when both u and v are merely continuous. The continuity of u, v is given in standard notation as a column to the left of the usual first column; i.e., $u, v \in C^0, C^1, C^2, \ldots, C^\infty$ indicate the number of continuous derivatives of u, v. Continuity is understood for u, v and their derivatives as extended x-periodic functions.

† A more detailed discussion of accuracy appears in Appendix A.



Table V

Numerical results indicating the order of accuracy of the proposed algorithm (version F) and discretization procedure.

if") *.(v> e(u)

CCc°°

u = y sin(x)

v = y2sin(x)

lul^ 3.1416

Ivl = 9.8696

16

32

64

4.015

1.049

2.657

6.664

2.879

7.043

1.722

4.248

3.573 -2

9.049 -3

2.254 -3

5.607 -4

8.041

2.794

8.124

2.193

4.879

1.315

3.414

8.566

4.455 -1

1.114 -1

2.784 -2

6.961 -3

8/16

16/32

32/64

3.826

3.950

3.987

4.088

4.090

4.054

3.948

4.016

4.019

2.878

3.440

3.706

3.711

3.851

3.986

u«C

Uc"= y sin(4x)

= y sin(4x)

ul,= 1.2825

lul^» 3.1416

Ivl ■ 9.8696

16

32

64

128

1.7509

4.326

1.083

2.7087

2.5573

6.1881

1.5133

3.752

2.1684 -1

5.292 -2

1.3123 -2

3.268 -3

4.9751

1.573

4.6187

1.2895

5.5634

1.4699

3.6885

9.2309

1.426 +1

3.564

8.910 -1

2.227 -1

16/32

32/64

64/128

4.066

3.996

3.997

4.152

4.069

4.033

4.097

4.033

4.015

3.169

3.407

3.582

3.805

3.964

3.996

kiec

|vec°°lui 2=

Ivl 2=

Ivl =

y sin(8x)

y2sin(8x)

1.2825

3.1210

3.1416

9.8696

16

32

64

128

256

1.0235

1.6710

4.0218

9.9941

2.4956

1.0575

2.973

7.088

1.7449

4.3369

7.579 -1

2.3912 -1

2.739 -2

1.419 -3

3.835 -3

1.6347

5.6803

1.7565

5.0463

1.4018

1.1188

7.6216

1.856

4.6501

1.1612

2.103 +2

5.257 +1

1.314 +1

3.286

9.215 -1

16/32 6.125

4.155

4.024

4.005

3.5571

4.194

4.062

4.023

3.141

4.1663

4.045

4.015

2.878

3.234

3.481

3.600

1.467

4.106

3.992

4.004

use

vec"lui 2=

Ivl 2=

Ivl -

y sin(16x)

y2sin(16x)

1.2825

3.1210

3.1416

9.8696

16

32

64

128

256

5.8205 +1

1.0321

1.5734 -1

3.7506 -2

9.2864 -3

1886

4585

2060

6097

8749

4.7763 +1

7.4243 -1

2.5130 -1

5.9846 -2

1.4774 -2

16/32

32/64

64/128

128/256

5.6397 +1

6.5596

4.1950

4.0388

8415

7026

2131

0587

6.4334 +1

2.9544

4.1991

4.0499

3081

7140

0606

8544

2893

0451 +1

6038 -2

8298 -1

1426 -1

3174 -2

3.290 +3

.224 +2

2.056 +2

5.140 +1

1.285 +1

6319

8280

2682

5060

0788 +3

3465 -2

1210

0294

hr£C

lul.

Ivl

sin(x)sin(y)

sin(4x)sin(4y)

0.5

0.5

1.0

1.0

8

16

32

64

1.306 -2

3.227 -3

8.0409 -4

2.0085 -4

1429

198

3515

2789

9.8921 -3

4.0498 -2

9.419 -3

2.3045 -3

2339

2089

5927

0074

2730 -1

1072 -1

6170 -2

4545 -3

268 -1

3.171 -2

27 -3

1.982 -3

8/16

16/32

32/64

4.0548

4.0135

4.0034

628

3790

1219

152.443

4.2996

4.0872

5980

8982

9745

76 -15

2305

0548

y sin(x)

x(2ir-x) sin(y)

22.9595

5.0966

97.4091

9.8696

1

16

32

64

8/16

16/32

32/64

3.586 -1

9.332 -2

2.36 -2

5.92 -3

33

509

67

03

4.95

1.24

3.06

7.61

44

49

821

617

168

14 -1

89 -2

984 -2

7.898

1.975

937 -1

1.234 -1

3.8424

3.9541

3.9880

1959

1150

0603

4.0077

4.0346

4.0244

8922

9734

7221

9793

9761

Table VI contains a study of the number of grid points per wave length which the method necessitates for given numerical accuracy. The results are given here as relative errors, $l_2(u - U)/\|u\|_2$, rather than as absolute errors, $l_2(u - U)$, and similarly for v and for $l_\infty$. It seems that roughly 4 points per wave length will give $10^{-1}$ relative error, 8 points will give $5 \times 10^{-2}$, and 16 will give $10^{-2}$. We notice again that if u oscillates less than v, the error in u will be considerably smaller than that in v.
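As a rough planning aid, the rule of thumb above can be turned into a small helper. This is a sketch under stated assumptions: the x-interval has length $2\pi$ with M mesh intervals, a component $\sin(\kappa x)$ has wave length $2\pi/\kappa$, and M is restricted to powers of 2 as in the tables. The function names are hypothetical.

```python
def points_per_wavelength(M, kappa):
    """Mesh points per wave length of sin(kappa*x) on [0, 2*pi] with M intervals."""
    return M / kappa

def smallest_power_of_two_grid(kappa, points_needed):
    """Smallest M = 2**n giving at least `points_needed` points per wave length."""
    M = 1
    while points_per_wavelength(M, kappa) < points_needed:
        M *= 2
    return M

# About 16 points per wave length are quoted for a relative error near 1e-2.
M_for_kappa_8 = smallest_power_of_two_grid(kappa=8, points_needed=16)
```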



TABLE V (Continued)

l,(U) £,(U,V) i (U) *„<v> e(u)

lui 2=

Ivl 2=

,U,«T

Ivl =

y sin(8x)

x(2tt-x) sin(8y)

22.9595

5.0966

97.4091

9.8690

16

32

64

128

256

1.732 +1

2.459

5.853 -1

1.446 -1

3.604 -2

3.8587

7.1378 -1

1.636 -1

3.989 -2

9.892 -3

1.292 +1

1.8352

4.3267 -1

1.064 -1

2.647 -2

3.932 +1

6.2285

2.0035

5.3048 -1

1.3449 -1

7.2614

2.1761

5.130 -1

1.2653 -1

3.165 -2

3.856 +1

9.640

410

6.025 -1

1.506 -1

16/32

32/64

64/128

128/256

7.0428

4.201

4.048

4.012

5.406

4.363

4.101

4.032

7.040

4.242

4.0656

4.020

312

109

777

944

3.337

4.242

4.055

3.997

lui 2=

Ivl 2=

Ivl -

y sin(16x)

x{2tt-x) sin (16y)

22.9595

5.0966

97.4091

9.8690

16

32

64

128

256

4.66

1.816

2.5178

5.964

1.471

1.174

2.241

6.1677

1.4347

3.518

16/32

32/64

64/128

128/256

2.566

7.212

4.221

4.053

5.236

3.634

4.299

4.078

8.7105

1.3108

1.8458

4.353

1.071

6.630

7.118

4.241

4.061

4677 +1

6798 +1

884

1700

7139 -1

2.055 +2

4.5798

1.783

4.106 -1

1.006 -1

8.067

2.017

5.042

1.260

3.151

3~86~

781

172

798

4.487

2.568

4.343

4.080

Ivl „=

lut, =

y sin{32x)

x(2n-x)sin(32y)

5.0966

22.9595

9.8696

97.4091

16

32

64

128

256

9.32

9.176

1.843

2.535

5.997

7.511

2.901

1.2121

5.813

1.3647

5.176

2.123

1.316

1.8459

4.356

2935 +2

277 +2

092 +1

2431

2600

1.564 +3

7.936 +2

2.7678

1.4956

3.4708 -1

1.467 +4

3.669 +3

9.172 +2

2.293 -.3

732 +1

16/32

32/64

64/128

128/256

1.0157

4.9795

7.2681

4.228

2.5888

2.39

2.085

4.259

2.4378

1.613

7.1300

4.237

0128

5081

030

205

1.9709

2.867 +2

1.851

4.309

u = x(tt-x) (2ir-x)sin y

v = x{ïï-x)(2ïï-x)sin y

16

32

64

128

25616/32

32/64

64/128

128/256

1.4431

3.6692

9.2269

2.3104

5.77833.9330

3.9766

3.9937

3.9984

9.7688

2.4499

6.1387

1.5368

3.8452

1.2474

3.1389

7.8604

1.9651

91163.9874

3.9910

3.9943

3.9968

3.9739

3.9933

4.0000

4.0002

3.4316

9.5067

2.4802

6.3163

1.5928

2.4159 -1

6.3250 -2

1.6099 -2

4.0366 -3

1.0101 -33.6097

3.8330

3.9267

3.9656

3.8196

3.9287

3.9884

3.5960

3.784 -1

9.461 -2

2.365 -2

5.913 -3

1.478 -3

u ■ x(it-x) (2,r-x)sin y

v = x(2n-x)sin y

lul„= 11.9343

Ivl " 9.8696

16

32

128

25616/32

32/64

64/128

128/256

1.3883

3.4733

8.6870

2.1720

5.4303

1.2207

1.9011

4.8190

1.2.1.33

3.0443

1.3127

2.8241

7.0539

1.7629

4.4065

3.0675

8.3754

2.1836

5.5692

1.4059

2.5198 -1

6.0616 -;

1.5363

3.8529 -3

9.6442 -

3.784 -1

9.461 -2

2.365 -2

5.913 -3

1.478 -33.9970

3.9983

3.9995

3.9999

6.4210

3.9450

3.9717

3.9856

4.6484

4.0036

4.0014

4.0006

3.6625

3.8357

3.9207

3.9613

4.1570

3.9456

3.9873

3.9950

= x(2tt-x) sin y

= x(2n-x)sin y

lui

16

32

64

128

256

16/32

32/64

64/128

128/256

4.6542

2.3271

1.1639

5.8198

2.9100

5.0733

2.3676

1.2030

6.0629

3.04342.0000

1.9994

1.9999

2.0000

2.1428

1.9680

1.9842

1.9922

4.8543

2.3468

1.1833

5.9417

2.9772

1.0241

5.6744

2.9686

1.5144

7.64272.0685

1.9832

1.9916

1.9958

1.8647

1.9115

1.9603

1.9814

1.1306

5.7633 -1

2.9230 -1

1.4679 -1

7.3474 -21.961S

1.9717

1.9913

1.9978

1.565 -1

3.912 -2

9.780 -3

2.445 -3

6.112 -4

These conclusions are also supported by some of the results in Table V. It is interesting that experiments with solutions containing odd wave numbers give results which are only slightly worse than those for even wave numbers, if at all; in other words, using M, N which are powers of 2 is not detrimental to accuracy, even when odd wave numbers are present in the solution.



TABLE V (Continued)

i (U) 1_(V) e(u)

2 2u=x (2tt-x) sin (y)

4 4v - x (2n-x) sin(y)

lul_ ■ 97.4091

Ivl = 9488.531

8

16

32

64

1.6926 +1

3.5778

8.674 -1

2.155 -1

1.2066 +2

2.7330 +1

6.5176

1.614

8.0018 +1

1.8850 +1

4.6131

1.4278/ié

16/32

32/64

4.7319

4.1236

4.0258

4.4148

4.1593

4.0706

4.2446

4.0867

4.0370

3.3808 +1

8.311

2.009

4.9799 -14.0678

4.1368

4.0344

2.909 +2

7.162 +1

1.783 +1

4.45384.0618

4.0160

4.0040

2.310 +1

5.774

1.443

3.609 -1

u£C

vec°

uSC

uec"

u=x (rr-x) (2ïï-x) sin (y)

v - 0

16

32

64

128

256

2.1348

4.7264

1.0805

2.5536

6.1632

9.5464

2.3985

6.0150

1.5065

3.7699

1.5604

3.3996

7.7121

1.8158

4.3888

4.2730

1.1174

2.8579

7.2238

1.8161

2.4159 -1

6.3250 -2

1.6099 -2

4.0366 -3

1.0101 -3

3.784 -1

9.461 -2

2.365 -2

5.913 -3

1.478 -3

16/32

32/64

64/128

128/256

4.5167

4.3743

4.2313

4.1299

3.9800

3.9877

3.9927

3.9961

4.5899

4.4081

4.2471

4.1375

3.8241

3.9106

3.9554

3.9777

3.8196

3.9287

3.9884

3.9960

u = x(2tt-x) sin(y)

A

v = y sin(x)

lul^= 9.8696

Ivl = 97.4091

16

32

64

128

256

16/32

32/64

64/128

128/256

1.1787

4.9757

2.3681

1.169

5.826

2.9107

2.1011

2.0257

2.0065

2.0016

1.0090

4.889

2.433

1.2179"

6.0988

3.052

2.0093

1.9978

1.9972

1.9982

1.1091

4.936

2.3999

1.193

5.963

2.982

2.0566

2.0110

2.0013

1.9996

2.8276

1.3147

6.4005

3.1481

1.559

7.754

1.9799

1.1202

5.7897 -1

2.9311 -1

1.4697 -1

7.352 -2

2.0540

2.0332

2.0194

2.0105

1.9349

1.9752

1.9944

1.9990

1.265

3.162

7.904

1.976

4.940

1.235

In particular, these results also show that the present solver would perform very well on linearized versions of the original geophysical fluid dynamic problem we were interested in ([16], [17]). We shall return to this point in Section 7.

5. Changes in Boundary Conditions. In Section 2 we have formulated the model problem (2.1)-(2.3) which motivated this study. The algorithm of Section 3 has obvious applications to many other situations; it is of interest, therefore, to consider a number of different boundary conditions which could be associated with the Cauchy-Riemann equations (2.1) in a rectangle.

We shall assume throughout this section that the boundary conditions on $y = 0, \pi$ are still (2.2), i.e., v is prescribed there as $v^{(0)}(x)$ and $v^{(1)}(x)$, respectively. The two different combinations of boundary conditions we consider explicitly are: (1) that u is given on the left boundary of the rectangle and v is given on the right boundary, and (2) that u is given on both vertical sides.

It is clear that if v is given on all sides, the problem should be formulated as Poisson's equation for v with Dirichlet boundary conditions; similarly if u is given on all the sides, a Dirichlet problem for u is more suitable. A moment's thought will show that the two situations we shall discuss can easily be transformed into a considerable number of others, by reflections or by interchanging the roles of x and y, and of u and v. In fact, all situations in which u as well as v are prescribed on some of the sides of the rectangle can be handled by slight modifications of the algorithms we present, yielding second-order accurate numerical solutions.



Table VI

Numerical results indicating the resolution (mesh points per wave length) required

by the proposed discretization procedure and solution algorithm (versions C and

F) to obtain prescribed accuracy.

t,(U)/lul. i2(V)/lvl2 H^lUl/lul,

u = y sin(x) v = y sin(x) 6.6212 -4

5.1958 -4

1.3611

1.3611

8.6548

6.9789

u = y sin(4x),

u * y sin(8x),

u = y sin(64x),

v = y sin(4x)2

v = y sin(8x)2

v = y sin(64x)

8.444 -3

3.1358 -2

1.9450 +2

8.0699 -1

1.1422 -1

4.849 -3

2.271 -2

4.4180 +1

4.4600 -3

1.0867 -1

1.47

5.59

2.0070

5.6440

2.0238

u = sintx)sin(y) , v = sin(4x)sin(4y) 4.1000

4.0170

6.5580

6.5578

4.4250

4.0074

u » y sin(x),

u = y sin(8x),4

u » y sin(16x),4

u = y sin{32x),4

u - y sin(64x),

v = x(2n-x)sin(y)

v = x(2ir-x)sin(8y)

v ■= x(2ir-x)sin(16y)

v ■ x(2ir-x)sin(32y)

v « x(2Tf-x)sin(64y)

4u * y sin(128x), v - x(2tt-x) sin(128y)

u « x(2ir-x)sin(y), v = y sin(x)

3.0218 -4

2.5784 -4

2.5490 -2

1.097 -1

8.027 -1

7.962

8.0577 -1

1.1064 -1

1.592 +1

1.5906 +1

1.0707 -1

3.394 -2

2.2937 -2

1.7716 -3

1.7716 -3

3.2100 -2

1.210 -1

2.378 -1

1.2598 +2

1.2376 -1

1.1180 -1

6.956 +2

2.6429 -12

6.514 -2

5.270 -3

5.270 -3

1.3592

9.8728

2.0570

7.067

5.227

2.617

5.4615

7.6292

5.234

5.3360

5.5526

4.104

3.1896

u - x2(2ir-x)2sin(y) , y - x4 (2ir-x) 4sin(y)

u ■ x{tt-x) (2n-x)sin{y) , v ■ 0

1.121 -2

4.9081 -3

1.785 -2

1.2937

1.2937

9.4755

5.1124

5.267

u - y"sin(3x) v = x(2rr-x)sin(3y) 5.2079

1.2851

3.2022

7.9989

1.9993

1.1205

2.6395

6.4495

1.5969

3.9748

3.6800

9.6328

2.4339

6.0900

1.5225

u ■ y sin(5x) v = x(2rr-x)sin{5y) 1.6604

3.8882

9.5680

2.3826

5.9508

2.7604

5.9866

1.4350

3.5362

8.7915

1.2079

3.1351

7.8489

1.9858

4.9629

u = y sin(7x) v ■ x(2ir-x) sin(7y) 3.7455

8.0104

1.9318

4.7869

1.1941

5.7315

1.0879

1.9621

6.2087

1.5410

2.7813

6.5915

1.6605

4.1578

1.0382

u - y sin(9x) , v = x{2ir-x) sin(9y) 7.4843

1.3893

3.2567

8.0160

1.9963

1.1403

1.7668

3.9733

9.6458

2.3893

5.5451

1.1704

2.8390

7.1369

1.7829

We proceed now with the description of the algorithm for the two cases mentioned.

Case 1. u Given on the Left Side. The rectangular domain is now taken as $R_1 = \{(x, y): -h/2 \le x \le 2\pi,\ 0 \le y \le \pi\}$. This is merely done for notational convenience, so as to leave Figure 1 unchanged. The boundary conditions are first (2.2), which we repeat here as

(5.1a) $v = v^{(0)}(x)$, $y = 0$,

(5.1b) $v = v^{(1)}(x)$, $y = \pi$,

and also

(5.2a) $u = u_{(0)}(y)$, $x = -h/2$,

(5.2b) $v = v_{(1)}(y)$, $x = 2\pi$.

Thus Eqs. (5.2) replace the periodicity conditions (2.3a, b). Eqs. (2.1), together with (5.1), (5.2) completely determine u, v, and u need not and should not be prescribed any more at an interior point of $R_1$.

The difference equations are still (2.4), which we repeat for convenience as

(5.3a) $(U_{i,j} - U_{i-1,j})/h + (V_{i,j} - V_{i,j-1})/k = D_{i,j}$, $1 \le i \le M$, $1 \le j \le N$,

(5.3b) $(U_{i,j+1} - U_{i,j})/k - (V_{i+1,j} - V_{i,j})/h = E_{i,j}$, $1 \le i \le M$, $1 \le j \le N-1$.

The boundary conditions become

(5.4a) $V_{i,0} = v^{(0)}((i-1)h)$, $1 \le i \le M$,

(5.4b) $V_{i,N} = v^{(1)}((i-1)h)$, $1 \le i \le M$,

and

(5.5a) $U_{0,j} = u_{(0)}((j - 1/2)k)$, $1 \le j \le N$,

(5.5b) $V_{M+1,j} = v_{(1)}(jk)$, $0 \le j \le N$.

Hence, there are MN interior U-values to be determined, and M(N-1) interior V-values, or M(2N-1) unknowns altogether. Eqs. (5.3) yield M(2N-1) linear algebraic equations; we shall see that these equations are actually independent and determine U, V completely.

It turns out to be more convenient in this case to form column vectors $U_i$, $V_i$ from the values of U, V along vertical mesh lines; in Section 3 vectors $U_j$, $V_j$ were formed along horizontal mesh lines. Thus

$U_i = (U_{i,1}, U_{i,2}, \ldots, U_{i,N})^*$, $V_i = (V_{i,1}, V_{i,2}, \ldots, V_{i,N-1})^*$;

in particular $U_i$, $V_i$ have now different lengths, the $U_i$'s being N-vectors, while the $V_i$'s are (N-1)-vectors. In a similar fashion, $D_i$ is an N-vector, while $E_i$ is an (N-1)-vector of values along the corresponding vertical mesh lines.

With this notation, (5.3) becomes

(5.6a) $U_i - U_{i-1} + T V_i = h D_i$, $1 \le i \le M$,

(5.6b) $V_{i+1} - V_i + T^* U_i = h E_i$, $1 \le i \le M$;

here T is an $N \times (N-1)$ matrix of rank (N-1),

(5.7) $T = p \begin{pmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \\ & & & -1 \end{pmatrix}$,

and we define p = h/k. In (5.6a) $U_0$ is known from (5.5a), and in (5.6b) $V_{M+1}$ is known from (5.5b). Hence it is convenient to redefine $D_1$ as $D_1 + h^{-1} U_0$, and $E_M$ as $E_M - h^{-1} V_{M+1}$. Furthermore, we redefine $D_{i,1}$ as $D_{i,1} + k^{-1} V_{i,0}$ and $D_{i,N}$ as $D_{i,N} - k^{-1} V_{i,N}$. After these changes of notation (5.6) can be given the block matrix form

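(5.8)
$$
\begin{pmatrix}
T & & & I_N & & \\
 & \ddots & & -I_N & \ddots & \\
 & & T & & -I_N & I_N \\
-I_{N-1} & I_{N-1} & & T^* & & \\
 & \ddots & I_{N-1} & & \ddots & \\
 & & -I_{N-1} & & & T^*
\end{pmatrix}
\begin{pmatrix} V_1 \\ \vdots \\ V_M \\ U_1 \\ \vdots \\ U_M \end{pmatrix}
= h
\begin{pmatrix} D_1 \\ \vdots \\ D_M \\ E_1 \\ \vdots \\ E_M \end{pmatrix},
$$

where $I_L$ is the $L \times L$ identity matrix; each of the four blocks of the matrix has block dimension $M \times M$.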

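The decoupling of U and V in this case proceeds as follows. In the upper half of system (5.8) we add the first block row to the second, then the second to the third, and so on, until the new (M-1)st row is added to the Mth. We obtain a new system, which we write in condensed form as

(5.9) $\begin{pmatrix} P & I \\ Q & R \end{pmatrix} \begin{pmatrix} V \\ U \end{pmatrix} = h \begin{pmatrix} F \\ E \end{pmatrix}$,

where

(5.10a) $P = \begin{pmatrix} T & & & \\ T & T & & \\ \vdots & & \ddots & \\ T & \cdots & T & T \end{pmatrix}$,

(5.10b) $Q = \begin{pmatrix} -I_{N-1} & I_{N-1} & & \\ & \ddots & \ddots & \\ & & -I_{N-1} & I_{N-1} \\ & & & -I_{N-1} \end{pmatrix}$,

(5.10c) $R = \begin{pmatrix} T^* & & \\ & \ddots & \\ & & T^* \end{pmatrix}$,

(5.10d) $F = \begin{pmatrix} D_1 \\ D_1 + D_2 \\ \vdots \\ \sum_{i=1}^{M} D_i \end{pmatrix}$,

and I is the MN x MN identity matrix. From (5.9) we have

(5.11a) $U = hF - PV$,

(5.11b) $(Q - RP)V = h(E - RF)$,

with

(5.11c) $RP = \begin{pmatrix} S & & & \\ S & S & & \\ \vdots & & \ddots & \\ S & \cdots & S & S \end{pmatrix}$;

here $S = T^*T$ is an $(N-1) \times (N-1)$ matrix,

(5.12) $S = p^2 \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix}$.

Notice that S, in contradistinction to S of Section 3, is nonsingular. We shall return to it later.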
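The relation $S = T^*T$ and the nonsingularity of S are easy to confirm numerically. The following sketch (plain Python, not from the paper) assumes that T has p on its diagonal and -p on its subdiagonal, as implied by (5.3a) and (5.6a); the helper names are illustrative.

```python
def build_T(N, p):
    """N x (N-1) matrix with p on the diagonal and -p on the subdiagonal."""
    T = [[0.0] * (N - 1) for _ in range(N)]
    for j in range(N - 1):
        T[j][j] = p
        T[j + 1][j] = -p
    return T

def gram(T):
    """Return T^T T for a real matrix T given as a list of rows."""
    rows, cols = len(T), len(T[0])
    return [[sum(T[r][a] * T[r][b] for r in range(rows)) for b in range(cols)]
            for a in range(cols)]

def determinant(A):
    """Determinant by Gaussian elimination without pivoting (adequate for this S)."""
    A = [row[:] for row in A]
    det = 1.0
    for c in range(len(A)):
        det *= A[c][c]
        for r in range(c + 1, len(A)):
            m = A[r][c] / A[c][c]
            for cc in range(c, len(A)):
                A[r][cc] -= m * A[c][cc]
    return det

N, p = 6, 0.5
S = gram(build_T(N, p))

# Expected pattern: 2*p**2 on the diagonal, -p**2 next to it, zero elsewhere.
tridiagonal = all(
    abs(S[a][b] - (2 * p * p if a == b else -p * p if abs(a - b) == 1 else 0.0)) < 1e-14
    for a in range(N - 1) for b in range(N - 1))

det_S = determinant(S)  # equals N * p**(2*(N-1)) for this matrix, hence nonzero
```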

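Written out explicitly, (5.11b) becomes, after a change of sign,

(5.13) $\begin{pmatrix} S+I & -I & & \\ S & S+I & -I & \\ \vdots & & \ddots & -I \\ S & \cdots & S & S+I \end{pmatrix} V = h \begin{pmatrix} T^*D_1 - E_1 \\ T^*(D_1 + D_2) - E_2 \\ \vdots \\ T^*\sum_{i=1}^{M} D_i - E_M \end{pmatrix}$,

where I is now the (N-1) x (N-1) identity. This system can be brought into block-tridiagonal form simply by subtracting the ith row from the (i+1)st, starting from the top. This produces the system

(5.14) $\begin{pmatrix} S+I & -I & & \\ -I & S+2I & -I & \\ & \ddots & \ddots & -I \\ & & -I & S+2I \end{pmatrix} V = h \begin{pmatrix} T^*D_1 - E_1 \\ T^*D_2 + E_1 - E_2 \\ \vdots \\ T^*D_M + E_{M-1} - E_M \end{pmatrix}$.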

We see that the decoupling resulted in this case in the elimination of U, rather than of V. Equation (5.14) is rather similar to (3.11). Some of the differences have already been pointed out; as a result of these differences, the matrix of (5.14) is nonsingular. It can be brought to scalar tridiagonal form by diagonalizing S and then reindexing. We shall comment on the fast solution of (5.14) further at the end of this section, together with the fast solution of the matrix equation obtained in the second case we wish to discuss.
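The passage from (5.13) to (5.14) can be checked on a small scalar analogue in which every block is replaced by a number (S by s, I by 1). The sketch below is illustrative only and its names are hypothetical; each new row is formed as the difference of two unmodified rows of the original matrix.

```python
def lower_block_matrix(M, s):
    """Scalar analogue of the matrix in (5.13): s below, s+1 on, -1 above the diagonal."""
    A = [[0.0] * M for _ in range(M)]
    for i in range(M):
        for j in range(i):
            A[i][j] = s
        A[i][i] = s + 1.0
        if i + 1 < M:
            A[i][i + 1] = -1.0
    return A

def difference_rows(A):
    """Replace row i by (row i) - (row i-1) for i >= 1, using the original rows."""
    B = [row[:] for row in A]
    for i in range(1, len(A)):
        B[i] = [a - b for a, b in zip(A[i], A[i - 1])]
    return B

M, s = 5, 0.7
B = difference_rows(lower_block_matrix(M, s))

def expected(i, j):
    """Scalar analogue of the matrix in (5.14)."""
    if i == j:
        return s + 1.0 if i == 0 else s + 2.0
    return -1.0 if abs(i - j) == 1 else 0.0

is_tridiagonal = all(abs(B[i][j] - expected(i, j)) < 1e-14
                     for i in range(M) for j in range(M))
```

The same differencing applied to the right-hand side turns the partial sums $T^*(D_1 + \cdots + D_i)$ back into single terms $T^*D_i$.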

Case 2. u Given on Both Vertical Sides. We fit the grid to the rectangular domain so that $R_2 = \{(x, y): -h/2 \le x \le 2\pi - h/2,\ 0 \le y \le \pi\}$. The boundary conditions are (5.1) on the horizontal sides of the rectangle, and

(5.15a) $u = u_{(0)}(y)$, $x = -h/2$,

(5.15b) $u = u_{(1)}(y)$, $x = 2\pi - h/2$,

on the vertical sides. Equations (2.1), (5.1) and (5.15) determine u and v completely, subject to the requirement that the data d(x, y), $u_{(0)}(y)$, $u_{(1)}(y)$, $v^{(0)}(x)$, $v^{(1)}(x)$ satisfy the Gauss divergence theorem:

(5.16) $\iint d(x, y)\,dx\,dy = \int_{-h/2}^{2\pi - h/2} \{v^{(1)}(x) - v^{(0)}(x)\}\,dx + \int_0^{\pi} \{u_{(1)}(y) - u_{(0)}(y)\}\,dy$.

The difference equations are (5.3), with (5.3b) only being written for $1 \le i \le M-1$, $1 \le j \le N-1$. The boundary conditions for the mesh variables are (5.4) and

(5.17a) $U_{0,j} = u_{(0)}((j - 1/2)k)$, $1 \le j \le N$,

(5.17b) $U_{M,j} = u_{(1)}((j - 1/2)k)$, $1 \le j \le N$.



Hence, there remain (M-1)N U-values to be determined, and M(N-1) V-values, i.e., 2MN - M - N unknowns. In (5.3) we have MN D-equations and (M-1)(N-1) E-equations, i.e., 2MN - M - N + 1 equations in all.

It would seem that the number of equations exceeds the number of unknowns by one. We expect, however, from the continuous case that one compatibility condition has to be imposed on the data, analogous to (5.16), and that the matrix of system (5.3) has rank equal to the number of unknowns. This can be checked directly in the block form (5.18) to which we shall bring this matrix below; the compatibility condition turns out to be the one obtained when computing the integrals in (5.16) by the midpoint rule. A similar statement is true in the case in which u is prescribed on the horizontal sides of the rectangle, and v on the vertical sides; in that case the compatibility condition is the discrete analog of Stokes' curl theorem, involving e(x, y) rather than d(x, y).
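The discrete compatibility condition can be illustrated directly: summing the left-hand sides of (5.3a) over all cells telescopes into boundary terms only, which is the midpoint-rule form of (5.16). The sketch below uses arbitrary (random) grid values and hypothetical variable names; it is a check of the identity, not part of the solver.

```python
import random

random.seed(0)
M, N = 6, 4
h, k = 0.3, 0.2

# U[i][j] is used for i = 0..M, j = 1..N (U[0] and U[M] hold the side data);
# V[i][j] is used for i = 1..M, j = 0..N (V[i][0], V[i][N] hold bottom and top data).
U = [[random.uniform(-1, 1) for _ in range(N + 1)] for _ in range(M + 1)]
V = [[random.uniform(-1, 1) for _ in range(N + 1)] for _ in range(M + 1)]

# Left-hand side of (5.3a), summed over all cells and weighted by the cell area.
total_D = sum(
    ((U[i][j] - U[i - 1][j]) / h + (V[i][j] - V[i][j - 1]) / k) * h * k
    for i in range(1, M + 1) for j in range(1, N + 1))

# Midpoint-rule boundary integrals over the vertical and horizontal sides.
flux_sides = k * sum(U[M][j] - U[0][j] for j in range(1, N + 1))
flux_top_bottom = h * sum(V[i][N] - V[i][0] for i in range(1, M + 1))

mismatch = abs(total_D - (flux_sides + flux_top_bottom))
```

Since the interior values cancel for any choice of U and V, the prescribed data must satisfy this one relation for the system to be solvable.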

We introduce the N-vectors $U_i$, $D_i$ and the (N-1)-vectors $V_i$, $E_i$ as in the previous case. Again, the first and last components of $D_i$, $D_{i,1}$ and $D_{i,N}$, are modified by the addition of $ph^{-1}V_{i,0}$ and of $-ph^{-1}V_{i,N}$, respectively. Also, $D_1$ is modified by the addition of $h^{-1}U_0$ and $D_M$ by the addition of $-h^{-1}U_M$.

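After these changes of notation, system (5.3) becomes, with M and M-1 block columns (multiplying V and U) and M and M-1 block rows (upper and lower halves),

(5.18)
$$
\begin{pmatrix}
T & & & & I_N & & \\
 & T & & & -I_N & \ddots & \\
 & & \ddots & & & \ddots & I_N \\
 & & & T & & & -I_N \\
-I_{N-1} & I_{N-1} & & & T^* & & \\
 & \ddots & \ddots & & & \ddots & \\
 & & -I_{N-1} & I_{N-1} & & & T^*
\end{pmatrix}
\begin{pmatrix} V_1 \\ \vdots \\ V_M \\ U_1 \\ \vdots \\ U_{M-1} \end{pmatrix}
= h
\begin{pmatrix} D_1 \\ \vdots \\ D_M \\ -E_1 \\ \vdots \\ -E_{M-1} \end{pmatrix},
$$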



where T is the $N \times (N-1)$ matrix defined by (5.7), and $I_{N-1}$, $I_N$ are identity matrices of the appropriate dimensions. Clearly, the sum of the MN rows in the upper half of the matrix in (5.18) is zero. The corresponding compatibility condition that $\sum_{i,j} D_{i,j} = 0$ is exactly the one we expected; we only need to remember that the $D_{i,j}$ close to the boundary have been redefined to include the boundary data with appropriate coefficients.

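The elimination of U proceeds in a manner analogous to Case 1, by summing the blocks of the upper half of (5.18), in a discrete form of integration with respect to x. The result, again with M and M-1 block columns and M and M-1 block rows, is

(5.19)
$$
\begin{pmatrix}
T & & & & I_N & & \\
T & T & & & & \ddots & \\
\vdots & & \ddots & & & & I_N \\
T & \cdots & T & T & 0 & \cdots & 0 \\
-I_{N-1} & I_{N-1} & & & T^* & & \\
 & \ddots & \ddots & & & \ddots & \\
 & & -I_{N-1} & I_{N-1} & & & T^*
\end{pmatrix}
\begin{pmatrix} V_1 \\ \vdots \\ V_M \\ U_1 \\ \vdots \\ U_{M-1} \end{pmatrix}
= h
\begin{pmatrix} D_1 \\ D_1 + D_2 \\ \vdots \\ \sum_{i=1}^{M} D_i \\ -E_1 \\ \vdots \\ -E_{M-1} \end{pmatrix}.
$$

We rewrite this, with the obvious identifications, as

(5.20) $\begin{pmatrix} P & I \\ T \cdots T & 0 \cdots 0 \\ Q & R \end{pmatrix} \begin{pmatrix} V \\ U \end{pmatrix} = h \begin{pmatrix} F \\ F_M \\ -E \end{pmatrix}$.

Notice that P and Q have block dimension (M-1) x M, P has blocks of dimension N x (N-1), and Q has blocks of dimension (N-1) x (N-1).

From (5.20) we obtain

(5.21a) $U = hF - PV$,

(5.21b) $QV + RU = -hE$.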



This allows us to eliminate U and write

(5.22a) $(Q - RP)V = -h(E + RF)$,

(5.22b) $T^*(T\ T\ \cdots\ T)V = hT^*\sum_{i=1}^{M} D_i$;

here we introduced the block row missing in (5.21) as (5.22b). The elimination of the single redundant equation in (5.18) was done naturally by multiplication of (5.22b) with the (N-1) x N matrix $T^*$.

System (5.22) becomes, after carrying out the matrix multiplications,

(5.23) $\begin{pmatrix} S+I & -I & & & \\ S & S+I & -I & & \\ \vdots & & \ddots & \ddots & \\ S & \cdots & S & S+I & -I \\ S & \cdots & S & S & S \end{pmatrix} V = h \begin{pmatrix} T^*D_1 + E_1 \\ T^*(D_1 + D_2) + E_2 \\ \vdots \\ T^*\sum_{i=1}^{M-1} D_i + E_{M-1} \\ T^*\sum_{i=1}^{M} D_i \end{pmatrix}$,

with S being the (N-1) x (N-1) matrix defined by (5.12). The matrix of this system is the same as that of (5.13), except for the lower right corner block; also the right-hand sides differ only slightly. Applying the block-tridiagonalization procedure used in Case 1, which corresponds to differencing in x, one obtains

(5.24) $\begin{pmatrix} S+I & -I & & & \\ -I & S+2I & -I & & \\ & \ddots & \ddots & \ddots & \\ & & -I & S+2I & -I \\ & & & -I & S+I \end{pmatrix} V = h \begin{pmatrix} T^*D_1 + E_1 \\ T^*D_2 + E_2 - E_1 \\ \vdots \\ T^*D_{M-1} + E_{M-1} - E_{M-2} \\ T^*D_M - E_{M-1} \end{pmatrix}$.

We are now prepared to discuss the fast solution of (5.14) and of (5.24).

Fast Sine Transform. The fast solution of (5.14) and of (5.24) involves bringing the corresponding matrices to scalar tridiagonal form. This is done in two steps: the first and crucial step is to diagonalize S; the second is to bring the two diagonals which are identically -1 from the position of block subdiagonal and block superdiagonal to that of scalar sub- and superdiagonal, i.e., immediately adjacent to the main diagonal.

We shall write the procedure for a slightly more general system, which includes (5.14) and (5.24) as special cases, to wit:

(5.25) $\begin{pmatrix} S_1 + \alpha_1 I & \gamma_1 I & & \\ \beta_1 I & S_2 + \alpha_2 I & \gamma_2 I & \\ & \ddots & \ddots & \gamma_{M-1} I \\ & & \beta_{M-1} I & S_M + \alpha_M I \end{pmatrix} W = B$;

W and B are partitioned to conform with the blocks of the matrix, $W = (W_1^*, W_2^*, \ldots, W_M^*)^*$, $B = (B_1^*, B_2^*, \ldots, B_M^*)^*$, and $S_i = \delta_i S$.

The eigenvalues $\mu_k$ and eigenvectors $\eta_k$ of S,

(5.26a) $S\eta_k = \mu_k \eta_k$, $1 \le k \le N-1$,

are known:

(5.26b) $\mu_k = 2p^2\{1 - \cos(\pi k/N)\}$,

(5.26c) $\eta_{kl} = (2/N)^{1/2} \sin(\pi kl/N)$.

The matrix S differs from S of Section 3 inasmuch as its eigenvectors are generated by the sine function, rather than by the exponential, as in (3.12). Its diagonalization thus corresponds to the fast sine transform, in the same way in which S of Section 3 was connected to the FFT. The differences arise in the continuous problem because of the different boundary conditions.
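A quick numerical check of the eigenpairs is sketched below in plain Python (names are illustrative). For the sine vector (5.26c) with index k, applying $S = p^2\,\mathrm{tridiag}(-1, 2, -1)$ gives the factor $2p^2\{1 - \cos(\pi k/N)\}$; the set of these values for k = 1, ..., N-1 coincides with the set $2p^2\{1 + \cos(\pi k/N)\}$ taken in reverse order.

```python
import math

def apply_S(x, p):
    """Apply S = p**2 * tridiag(-1, 2, -1) to a vector of length N-1."""
    n = len(x)
    return [p * p * (2 * x[l]
                     - (x[l - 1] if l > 0 else 0.0)
                     - (x[l + 1] if l < n - 1 else 0.0)) for l in range(n)]

def eta(k, N):
    """Normalized sine vector of (5.26c), components l = 1..N-1."""
    c = math.sqrt(2.0 / N)
    return [c * math.sin(math.pi * k * l / N) for l in range(1, N)]

N, p = 8, 1.5
worst_residual = 0.0
for k in range(1, N):
    mu_k = 2 * p * p * (1 - math.cos(math.pi * k / N))
    v = eta(k, N)
    Sv = apply_S(v, p)
    worst_residual = max(worst_residual,
                         max(abs(a - mu_k * b) for a, b in zip(Sv, v)))

# Orthonormality of the sine vectors, checked on two of them.
norm_11 = sum(a * a for a in eta(1, N))
dot_12 = sum(a * b for a, b in zip(eta(1, N), eta(2, N)))
```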

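Let $\mathcal{P}$ be the matrix whose columns are $\eta_1, \eta_2, \ldots, \eta_{N-1}$, and let $\mathcal{M} = \mathrm{diag}(\mu_1, \ldots, \mu_{N-1})$. Then

(5.27) $\mathcal{P}^* S \mathcal{P} = \mathcal{M}$, $\mathcal{P}^* \mathcal{P} = I_{N-1}$.

We introduce $\hat{W}_i$, $\hat{B}_i$ by

(5.28a) $\hat{W}_i = \mathcal{P}^* W_i$,

(5.28b) $\hat{B}_i = \mathcal{P}^* B_i$, $1 \le i \le M$,

and $\mathcal{M}_i = \delta_i \mathcal{M}$. We premultiply (5.25) by a block-diagonal matrix with all the diagonal blocks equal to $\mathcal{P}^*$ and use (5.27), (5.28) to yield

(5.29) $\begin{pmatrix} \mathcal{M}_1 + \alpha_1 I & \gamma_1 I & & \\ \beta_1 I & \mathcal{M}_2 + \alpha_2 I & \gamma_2 I & \\ & \ddots & \ddots & \gamma_{M-1} I \\ & & \beta_{M-1} I & \mathcal{M}_M + \alpha_M I \end{pmatrix} \hat{W} = \hat{B}$.

Reindexing $\hat{W}$ and $\hat{B}$ into $\tilde{W}$ and $\tilde{B}$ by

(5.30) $\tilde{W}_{k,i} = \hat{W}_{i,k}$, $\tilde{B}_{k,i} = \hat{B}_{i,k}$, $1 \le i \le M$, $1 \le k \le N-1$,

the scalar tridiagonalization of (5.25) is completed in the form

(5.31) $\begin{pmatrix} C_1 & & \\ & \ddots & \\ & & C_{N-1} \end{pmatrix} \tilde{W} = \tilde{B}$;

here each $C_k$ is a nonsingular scalar tridiagonal M x M matrix,

(5.32a) $C_k = \begin{pmatrix} \delta_1\mu_k + \alpha_1 & \gamma_1 & & \\ \beta_1 & \delta_2\mu_k + \alpha_2 & \gamma_2 & \\ & \ddots & \ddots & \gamma_{M-1} \\ & & \beta_{M-1} & \delta_M\mu_k + \alpha_M \end{pmatrix}$, $1 \le k \le N-1$.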

After performing and storing the LU decomposition of the $C_k$'s, the solution of (5.25) is now carried out by a forward fast sine transform (5.28b) of the data B into $\hat{B}$, elimination and substitution in each subsystem

(5.32b) $C_k \tilde{W}_k = \tilde{B}_k$, $1 \le k \le N-1$,

and a backward fast sine transform (5.28a) of the solution $\hat{W}$ into W; the reindexing of $\hat{B}$ into $\tilde{B}$ and of $\tilde{W}$ into $\hat{W}$ does not necessitate actual computer operations. Furthermore, in our application,

$\beta_i = \gamma_i = -1$, $\delta_i = 1$, $1 \le i \le M$,

and all $\alpha_i$ are 2, except for $\alpha_1 = 1$ in both (5.14) and (5.24), and $\alpha_M = 1$ in (5.24).
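The three steps (sine transform, scalar tridiagonal solves, inverse sine transform) can be sketched as follows for a system with the block structure of (5.14). This is an illustration under stated assumptions, not the authors' implementation: the sine transform is evaluated by a direct $O(N^2)$ sum instead of a fast transform, each tridiagonal system is solved afresh rather than from a stored LU decomposition, and all function names are hypothetical.

```python
import math

def sine_matrix(N):
    """Rows are the orthonormal sine vectors of (5.26c); the matrix is symmetric."""
    c = math.sqrt(2.0 / N)
    return [[c * math.sin(math.pi * k * l / N) for l in range(1, N)]
            for k in range(1, N)]

def solve_tridiagonal(sub, diag, sup, rhs):
    """Elimination and back substitution for a scalar tridiagonal system."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        m = sub[i - 1] / d[i - 1]
        d[i] -= m * sup[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    return x

def solve_block_system(alpha, B, N, p):
    """Solve (S + alpha_i I) W_i - W_(i-1) - W_(i+1) = B_i for i = 1..M."""
    M, n = len(alpha), N - 1
    P = sine_matrix(N)
    mu = [2 * p * p * (1 - math.cos(math.pi * k / N)) for k in range(1, N)]
    # Forward sine transform of every block (direct sum, not a fast transform).
    B_hat = [[sum(P[k][l] * B[i][l] for l in range(n)) for k in range(n)]
             for i in range(M)]
    W_hat = [[0.0] * n for _ in range(M)]
    off = [-1.0] * (M - 1)
    for k in range(n):  # one scalar tridiagonal M x M system per wave number k
        diag = [mu[k] + alpha[i] for i in range(M)]
        x = solve_tridiagonal(off, diag, off, [B_hat[i][k] for i in range(M)])
        for i in range(M):
            W_hat[i][k] = x[i]
    # Backward transform: P is symmetric and orthogonal, so it is its own inverse.
    return [[sum(P[l][k] * W_hat[i][k] for k in range(n)) for l in range(n)]
            for i in range(M)]

def apply_block_operator(alpha, W, p):
    """Multiply W by the block-tridiagonal matrix directly, for verification."""
    M, n = len(W), len(W[0])
    out = [[0.0] * n for _ in range(M)]
    for i in range(M):
        for l in range(n):
            s = 2 * W[i][l]
            if l > 0:
                s -= W[i][l - 1]
            if l < n - 1:
                s -= W[i][l + 1]
            value = p * p * s + alpha[i] * W[i][l]
            if i > 0:
                value -= W[i - 1][l]
            if i < M - 1:
                value -= W[i + 1][l]
            out[i][l] = value
    return out

M, N, p = 6, 8, 1.0
alpha = [1.0] + [2.0] * (M - 1)  # diagonal blocks S+I, S+2I, ..., S+2I as in (5.14)
B = [[math.sin(0.7 * (i + 1) + 0.3 * (l + 1)) for l in range(N - 1)] for i in range(M)]
W = solve_block_system(alpha, B, N, p)
residual = max(abs(a - b)
               for ra, rb in zip(apply_block_operator(alpha, W, p), B)
               for a, b in zip(ra, rb))
```

Setting the last entry of `alpha` to 1 gives the matrix of (5.24) instead; the transform steps are unchanged.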

The operation counts and storage requirements are, therefore, the same as in

Section 3; in particular, the same remarks apply to the computation of the sine trans-

form as to the real Fourier transform, i.e., the practical computer time required is es-

sentially determined by twice the number of real data.

Numerical experiments similar to those in Tables I through VI of Section 4 were

carried out for Cases 1 and 2, with the algorithm described above. The results were

entirely analogous, confirming the fact that the algorithm is second-order accurate and

the computational time it requires is essentially proportional to the number of mesh

points used in the discretization.

It is remarkable that the numerical tests still indicate second-order accuracy for
$u \in C^p$, $v \in C^q$ with $p \ge 1$, $q \ge 0$, and lower-order accuracy for $u \in C^p$, $v \in C^q$ with
$p < 1$, $q < 0$. This is true in both Case 1 and Case 2 for jump discontinuities in $u$ and
$v$ or their derivatives introduced along either x = const or y = const. These numerical
observations seem to indicate that the method's second-order accuracy even for solutions
without the formally required differentiability is not restricted to the case of
periodic boundary conditions. Furthermore, the asymmetry with regard to the continuity
requirements on $u$ and $v$ does not appear to depend on whether $U$ or $V$ are
eliminated in the algorithm, or on the boundary conditions imposed.

6. Comparison with Existing Solvers. A fast direct solver for the inhomogeneous

Cauchy-Riemann equations was published by Lomax and Martin [24]. Some applica-

tions and extensions are given in [25], [26]. We carried out a comparison between

our solvers and those published previously. The comparison was made first for the

test case of [24], [26], then for some of the test cases in our Tables III through VI

and similar ones. These computations were carried out on a CDC 6600 computer with

a FTN 4.6 compiler; single precision arithmetic was used throughout.

Thin Biconvex Airfoil. The example problem used in [24], [26] to illustrate the

use of a Cauchy-Riemann solver in aerodynamics is that of steady, irrotational, sub-

sonic, inviscid flow over a thin symmetrical parabolic-arc biconvex airfoil in the small-

perturbation approximation. The linearized Prandtl-Glauert transformation (e.g., [25])

yields for this problem the formulation

(6.1a)  $u_x + v_y = 0,$

(6.1b)  $u_y - v_x = 0, \qquad y > 0, \; -\infty < x < \infty,$

with boundary conditions

(6.2a)  $v(x, 0) = -4x$ for $-0.5 < x < 0.5$, $\qquad v(x, 0) = 0$ for $0.5 < |x|$,

and

(6.2b)  $u, v \to 0$ as $x^2 + y^2 \to \infty$.

The analytical solution to (6.1), (6.2) is well known and contains a logarithmic
singularity at the leading and trailing edges of the airfoil, x = ±0.5, y = 0. It is most
easily written as

(6.3)  $w = \dfrac{4}{\pi}\left\{1 - z \log\dfrac{z + 0.5}{z - 0.5}\right\},$

with $w = u - iv$ and $z = x + iy$. The most important quantity one wishes to compute
is $u$ on the airfoil $\{(x, y): y = 0, -0.5 < x < 0.5\}$; from it one can obtain the lift.
The exact $u$ there, given by (6.3), is simply

(6.4)  $u(x, 0) = \dfrac{4}{\pi}\left\{1 - x \log\left|\dfrac{x + 0.5}{x - 0.5}\right|\right\}.$
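As a quick numerical check of (6.3) and (6.4), the sketch below evaluates the complex velocity just above the x-axis and compares it with the closed form on the axis. It assumes principal-branch logarithms, written as a difference of two logarithms, and approaches y = 0 from the upper half-plane; the function names and the offset `eps` are illustrative choices, not part of the original programs.

```python
import cmath
import math

def exact_w(z):
    """Complex velocity w = u - i v of (6.3), using principal-branch logarithms (assumption)."""
    return (4.0 / math.pi) * (1.0 - z * (cmath.log(z + 0.5) - cmath.log(z - 0.5)))

def exact_u_on_axis(x):
    """u(x, 0) from (6.4); singular at x = +0.5 and x = -0.5."""
    return (4.0 / math.pi) * (1.0 - x * math.log(abs((x + 0.5) / (x - 0.5))))

def uv_near_axis(x, eps=1e-9):
    """u and v at the point (x, eps), i.e. just above y = 0."""
    w = exact_w(complex(x, eps))
    return w.real, -w.imag

for x in (0.0, 0.25, 0.75):
    u, v = uv_near_axis(x)
    print(x, u, v, exact_u_on_axis(x))
```

On the airfoil the computed v reproduces the boundary value -4x, outside it v vanishes, and at the center u equals 4/π, approximately 1.273.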

In [24] the numerical computation is carried out in a rectangle $R_3 = \{(x, y):$

$0 \le y \le 2,\ -1 \le x \le 1\}$, which we also did. Computations were performed in this

rectangle prescribing the exact, analytic solution as boundary data on the three sides

{x = ± 1}, {y = 2}, as well as prescribing homogeneous boundary conditions there.

Clearly, it is a matter of choice which function, u or v, is prescribed on the sides of



the rectangle not containing the airfoil. Hence, we used both the solvers of Section 5,

Case 1, as well as Case 2.

Among the solvers of [26], we chose for comparison purposes version D, option

(d). Version D is the one suggested by the authors themselves for the application at

hand, and its option (d) seemed, of all the versions and options offered, to be the only

one which could be second-order accurate. This version corresponds to our Case 1 in

the choice of boundary conditions.

For the test case (6.1), (6.2) neither the norms of the solution over R3, nor the

order of accuracy of the solvers is particularly relevant, because of the singularities at

the edges of the airfoil, x = ±0.5, y = 0. As a means of comparison we chose, therefore,
the accuracy in computing u(x, 0). To compute U on y = 0, which is a V-line
and not a U-line in the staggered mesh, [24] suggests the second-order accurate
extrapolation formula

(6.5)  $U_{i,1/2} = \tfrac{1}{8}\{9U_{i,1} - U_{i,2} + 3(k/h)(V_{i,0} - V_{i+1,0}) - 3kE_{i,0}\};$

we used this formula in our program, as well as in theirs.
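The extrapolation (6.5) can be checked on a field for which it should be exact. The sketch below assumes that U at indices 1 and 2 sits at y = k/2 and y = 3k/2, that the two V values on y = 0 straddle the U column at x ∓ h/2, and that e = u_y - v_x as in (A.4c); the function name and the test field are hypothetical.

```python
def extrapolate_u_to_wall(u1, u2, v_left, v_right, e_wall, h, k):
    """Formula (6.5): u at y = 0 from u at y = k/2 (u1) and y = 3k/2 (u2).

    The slope u_y at y = 0 is supplied through u_y = e + v_x, with v_x
    replaced by the centered difference (v_right - v_left) / h.
    """
    return (9.0 * u1 - u2 + 3.0 * (k / h) * (v_left - v_right) - 3.0 * k * e_wall) / 8.0

# Test field (an assumption for illustration): u quadratic in y, v linear in x.
def u(x, y):
    return 1.0 + 2.0 * x + 3.0 * y - 4.0 * y * y

def v(x, y):
    return 0.5 * x - y

def e(x, y):
    return (3.0 - 8.0 * y) - 0.5               # u_y - v_x for the field above

h, k, x0 = 0.1, 0.05, 0.3
u_wall = extrapolate_u_to_wall(u(x0, 0.5 * k), u(x0, 1.5 * k),
                               v(x0 - 0.5 * h, 0.0), v(x0 + 0.5 * h, 0.0),
                               e(x0, 0.0), h, k)
print(u_wall, u(x0, 0.0))
```

Because the formula fits a parabola in y through two values and one slope, it reproduces this quadratic field exactly up to rounding.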

Because of the singularity at the edges, [24] did computations both with x =
±0.5 being V-lines, i.e., with computational mesh points coinciding with the edges, and
with x = ±0.5 being U-lines, i.e., with the computational mesh straddling the edges.
In Table VII we give results for the computation with both these meshes.

For each mesh, Table VII contains in successive columns the x-coordinates of

the points along y = 0 at which U was calculated, the exact u there, the solution by

the [26] solver, version D, option (d), and its error, the solution by our solver Case 1,

its error, and finally the solution by our solver Case 2, with its error. We were only

interested in comparing the different solvers, and did not wish to study the question of

computing in a finite domain the solution to a half-plane problem; hence, only the

comparison when using exact boundary data is given. The computations with homo-

geneous boundary data were carried out and gave slightly poorer results for all solvers.

For the mesh points coinciding with the edges, our Case 1 solver has a smaller

error than the [26] solver at all points except four. The error, in particular, is smaller

by a considerable factor over the airfoil |x| < 0.5, and it is smaller at the edges, where

most of the error occurs. Our Case 2 solver has mostly smaller error than the Case 1

solver, but they are quite comparable.

For the computational mesh straddling the edges, our Case 1 solver seems to

have mostly larger errors than [26] outside the airfoil, 0.5 < |x|, but smaller errors in-

side. The Case 2 solver is still slightly better than the Case 1 solver, except at a few

points.

As a conclusion, our solvers of Section 5, Case 1 and Case 2, are quite successful

on the example problem of [24]. If anything, they have slightly smaller error than the

[26] solver which appeared most promising. A better test would probably be to ex-

tract the singularities from the solution analytically (cf. the example in [13], for in-

stance) and compute only the regular part of the solution. We felt, however, that [24]

stressed the importance of computing the neighborhood of the singularities as accurate-

ly as possible, and made the comparison accordingly.



Table VII

Numerical results for the u component of velocity along y = 0 in the solution of

the example problem (6.1), (6.2). The heading LM indicates results with the sol-

ver of [26], version D, option (d); headings GBuv and GBuu indicate results with

the solvers of Section 5, Cases 1 and 2, respectively. The solutions were obtained

on a 64 x 64 grid; every second point is given for reasons of space economy, ex-

cept near the singularity at x = ±0.5, where every point is given; the "extra"

points are marked by stars. The terms "coinciding mesh" and "straddling mesh"

are explained in the text.

VIIA. Coinciding mesh

x    u exact    LM    LM error    GBuv    GBuv error    GBuu    GBuu error

-0.-0.-0-0-0-0-0-0-0-0-0-0-0-0-0-0.-0.-0-0.-0.

0.0.0.0.0.0.0.0.0.0.0.0.0,0.00000

953125890625828125765625703125640625609375578125546875515625484375453125421875390625328125265625203125140625078125015625046875109375171875234375296875359375390625421875453125484375515625546875578125609375671875734375796875859375921875

1.409-11.666-12.009-12.486-13.193-14.341-4

243-1588-1895-1467282646-1250-2302-1163-1729-1050170

1.2421.272

262211116698-1566-1

4.450-12.302-15.250-24.646-11.2821.4678.895-1

588-1243-1689-1802-1227-1825-1529-1

-1.'411-1-1.670-1-2.016-1-2.497-1-3.210-1-4.372-1-5.286-1-6.640-1-8.863-1-1.684-1.480-4.560-1-5.493-2

2.277-16.150-18.704-11.050

169241271261210115684-1548-1422-1266-1613-2573-1482

1.6858.880-1

659-1306-1736-1844-1270-1872-1584-1

1.962-44.049-46.647-41.044-31.703-33.083-34.290-35.230-3

-3.208-3

2.166-11.983-1

-8.664-3

2.434-32.490-31.323-38.401-4

.658-4

.182-4

.331-4

.869-4

.729-48.950-4

069-3338-3812-3796-3580-3635-3

7.346-31.998-12.182-11.490-37.102-36.330-34.675-34.205-34.292-34.720-35.454-3

1.411-11.670-12.015-12.496-13.208-14.370-15.283-16.637-18.860-11.6841.4804.556-15.456-22.281-16.154-18.725-1

050170242272262

1.2111.1179.695-17.560-14.435-12.280-15.464-24.557-11.4801.6848.862-16.639-15.285-13.711-12.816-12.237-11.833-11.536-1

1.612-43.346-45.586-49.012-41.522-32.864-34.051-34.970-3

-3.488-3

2.163-11.980-1

-9.009-3

2.066-32.099-38.841-4

.496-4

.209-4

.488-5

.309-5

.743-5

.587-5

.145-68.743-52.539-46.116-41.463-32.174-32.149-3

-8.918-3

1.981-12.164-1

-3.367-3

5.103-34.197-32.231-31.374-39.723-47.678-46.693-4

-1.411-1-1.670-1-2.015-1-2.496-1-3.208-1-4.370-1-5.283-1-6.637-1-8.860-1-1.684-1.480-4.556-1-5.454-2

2.281-16.155-18.726-11.0501.1701.2421.2721.2621.2111.1169.696-1

.560-1,436-1.281-1.454-2.556-1

-1.480-1.684-8.860-1-6.637-1

283-1709-1814-1234-1829-1531-01

1.595-43.311-45.533-48.942-4

513-3853-3039-3957-3503-3163-1980-1027-3047-3078-3606-4228-4045-5953-5

7.198-59.136-58.551-5

196-5391-5818-4297-4369-3074-3042-3033-3

1.980-12.163-13.511-34.948-34.029-3

031-3134-3777-4980-4902-4

General Purpose Comparison. Lomax and Martin developed their solvers [24],

[26] with certain aerodynamical problems in mind [25], and we developed ours bearing

in mind certain problems in geophysical fluid dynamics [15], [17]. On the other

hand, the first Poisson solvers were also formulated for specific applications [4], [21],

and only later developed into general purpose algorithms and into packages. Therefore,

it seemed reasonable to test our solvers on solutions which had an appropriately general

character (Tables III through VI, and discussion at the end of Section 5); these tests

showed that our solvers are second-order accurate, given rather minimal continuity prop-

erties of the solution.



VIIB. Straddling mesh

x    u exact    LM    LM error    GBuv    GBuv error    GBuu    GBuu error

-0.-0.

0.0.0,

-0.

0.

0.0.0.0.

0.0.0.0.0.0.0.

-0.

0.0.0.0.0.0.0.0.0.0.0.0.

0.0.0.0.0.0.0.0.

9375875081257500687562505937556255312550004687543754062537503125250018751250062500625125018752500312537504062543754687550005312-5562559375625068757500812587509375

1.467-1■1.743-1■2.114-1•2.637-1■3.425-1■4.753-1■5.840-1•7.559-1■1.092

•7.763-1•2.353-1

.975-2441-1898-1235-1085192253273253192085235-1898-1441-1353-1353-1763-1

■1.092-7.559-1-5.840-1-4.753-1-3.425-1-2.637-1-2.114-1-1.743-1-1.467-1

■1.467-1■1.742-1■2.112-1■2.633-1•3.417-1■4.730-1■5.793-1■7.417-1-1.020

027-1219-1036-1455-1900-1234-1085192253273

1.2531.1911.0849.229-16.894-13.448-11.027-12.228-17.037-1

-1.021-7.430-1-5.807-1■4.746-1-3.435-1-2.655-1■2.138-1-1.774-1-1.506

986-5054-4127-4073-4356-4236-3734-3422-2206-2

353-2338-2796-3406-3765-4697-4290-4268-4998-4610-4144-4578-4804-4476-4365-4466-4957-3246-2252-2

-7.086-2-1.291-2-3.304-3-6.781-4

1.013-31.796-32.432-33.105-33.911-3

467-1741-1111-1632-1416-1729-1791-1415-1020

-7.025-1-2.217-1

.038-1

.458-1

.903-1

.237-1

.085,192.253.273.253.192.085

9.237-16.903-13.459-11.039-1

-2.216-1-7.024-1

019414-1790-1727-1414-1630-1109-1739-1464-1

063-5.474-4765-4

,939-4,463-4,372-3884-3439-2224-2

7.374-21.361-24.037-3

665-3726-4669-4183-5463-6715-5928-5971-6301-5314-5095-4274-4

-1.733-3-4.112-3-1.369-2-7.383-2

-7.235-2-1.450-2-5.011-3-2.510-3-1.108-3-6.830-4-4.957-4-3.994-4-3.457-4

1.467-11.741-1

111-1632-1416-1729-1791-1415-1020

.025-1

.217-1

.038-1

.458-1

.903-1

.237-1

.085

.192

.253

.273

.2531.1921.0859.237-16.903-13.458-11.038-1

-2.217-1-7.025-1

.020

.415-1

.791-1

.729-1

.416-1

.632-1-2.111-1-1.741-1-1.467-1

•5.903-5•1.442-4-2.717-4■4.874-4■9.380-4■2.362-3■4.873-3•1.427-2■7.223-2

.373-2

.359-2

.019-3

.646-3

.505-4

.417-4

.327-5

.980-5

.376-5

.068-5

.376-5

.990-5

.327-5

.417-4

.505-4

.646-3

.019-3-1.359-2-7.373-2

7.223-21.437-24.873-3

362-3380-4874-4717-4442-4903-5

The intent of [24], [25], [26] was explicitly restricted to the solution of specific
aerodynamic problems. It appeared worthwhile, however, to consider the more
general applicability of their solvers.

We carried out a number of tests with version D, option (d) of [26] on solu-

tions with a generic character. The results are given in Table VIII. This table is or-

ganized in the same way as Tables III and V of Section 4, and we refer to the de-

scription there.

For very simple test cases such as u, v constant or quadratic, the [26] solver is

essentially exact to machine accuracy; the round-off error increases slightly with the

number of grid points used. The difference between these results and those in Table III

is due to the computer used: double-precision arithmetic on an IBM 360/95 is slightly

more accurate than single precision on a CDC 6600.

The results with more severe test cases seem to indicate that the [26] solver is

only first-order accurate in general. This is apparently due to the formulation of the

algorithm at the boundaries. An indication that boundary inaccuracies are the cause

of lower than second-order accuracy are those exceptional test cases which show high



Table VIII

Numerical results for general-purpose test cases using the solver of [26], version

D, option (d). The smaller regions, such as $R_4 = \{(x, y): 0 \le x \le \pi,\ 0 \le y \le \pi/2\}$,

were used in some of the tests because of the difficulty of fitting computations with

M = 256 into the core memory of the CDC 6600; M is the actual number of grid

points used in the x-direction in each test.

M        ℓ2(U)      ℓ2(V)      ℓ2(U,V)    ℓ∞(U)      ℓ∞(V)

u, v constant: u = 1000.0, v = -100.0
16       1.083-11   1.642-12   7.747-12   2.910-11   4.547-12
32       2.871-11   2.628-11   2.752-11   9.095-11   5.093-11
64       1.001-10   1.387-10   1.210-10   4.475-10   2.310-10
128      1.999-10   2.177-10   2.090-10   1.281-9    3.174-10

u = x + y, 100
16       3.076-12   1.537-12   2.431-12   8.640-12   4.547-12
32       1.360-11   2.533-11   2.033-11   5.571-11   4.957-11
64       7.589-11   1.376-10   1.111-10   3.583-10   2.287-10
128      1.706-10   2.164-10   1.949-10   1.122-9    3.156-10

y sin(x), sin(y)
16       4.183-3    4.979-3    4.593-3    9.507-3    1.013-2
32       1.020-3    1.227-3    1.128-3    2.379-3    2.544-3
64       2.519-4    3.046-4    2.795-4    5.921-4    6.366-4
128      6.262-5    7.589-5    6.957-5    1.477-4    1.592-4
16/32    4.101      4.058      4.076      3.996      3.983
32/64    4.048      4.023      4.036      4.019      3.996
64/128   4.023      4.014      4.018      4.009      3.999

u,v e C u = y sin x

v = sin(y)

16

32

64

128

1.155-2

2.885-3

7.197-4

1.797-4

8.376-3

2.055-3

5.090-4

1.267-4

1.009-2

2.504-3

6.233-4

1.555-4

2.622-2

6.613-3

1.656-3

4.141-4

16/3232/6464/12!

4.0054.0084.005

4.0764.0374.018

4.0294.0184.009

3.9663.9943.999

1.909-2

4.838-3

1.211-3

3.030-4

3.9473.9933.998

M *„(U> * <V>

u,v e c y sin(x)

sin(y)

163264

128

2.356-25.333-31.296-33.219-4

3.073-27.566-31.876-34.672-4

2.738-26.546-31.612-34.012-4

8.028-22.600-27.155-31.865-3

16/3232/6464/128

4.4184.1154.025

4.0624.0324.016

4.1834.0594.019

3.0883.6333.836

6.846-21.725-24.314-31.078-3

3.9693.9984.003

u = y sin(x)

v = sin(y)

16

3264

128

9.705-23.426-21.061-22.938-3

1.262-13.117-27.741-31.928-3

1.126-13.276-29.286-32.485-3

4.135-11.282-13.417-28.742-3

16/3232/6464/128

2.8323.2303.610

4.0474.0274.014

3.4363.5273.737

3.2243.7533.909

2.797-17.139-21.787-24.470-3

3.9193.9953.990

u = y sin(x)

v = sin(y)

2-H/3

163264

128

16/3232/6464/128

6.019-23.206-21.058-22.988-3

5.031-21.277-23.206-38.026-4

5.547-22.440-27.818-32.187-3

1.8773.0303.542

3.9393.9833.995

2.2733.1223.574

1.217-16.215-21.982-25.511-3

1.9593.1353.597

1.052-12.712-26.805-31.703-3

3.8783.9853.996

accuracy, apparently because their solutions close to the boundary behave in a special

way in the variable perpendicular to it. The series of tests with increasing powers of



TABLE VIII (Continued)

M *,(U) H2(U,V) * (u) * (v)

u,v E C u = y sin(x)+ 0.1

v = sin( y)+ 0.1

163264

128

1.7261.0023.696-11.087-1

1.9374.854-11.210-13.018-2

1.8357.876-12.750-17.981-2

6.2292.2096.000-11.750-1

4.6151.1822.969-17.427-2

16/3232/6464/128

1.7222.7123.399

3.9904.0124.010

2.3302.8643.446

2.8203.6823.429

3.9043.9803.998

u,v e c 10 . , ,u = y sin(x)

v = sin(y)

163264

128

1.099+31.740+21.065+23.603+1

3.209+28.664+12.186+15.470

8.115+21.374+27.685+12.577+1

1.944+34.049+21.687+25.635+1

16/3232/6464/128

6.3207.6342.950

3.7963.9633.997

5.9051.7882.982

4.8002.4002.994

8.179+22.312+25.875+11.474+1

3.5383.9353.985

u,v e C = y sin(x)

= sin(y )

163264

128

5.742-21.556-23.835-39.507-4

1.128-12.690-26.571-31.632-3

8.951-22.197-25.379-31.335-3

2.299-18.305-22.533-27.042-3

16/3232/6464/128

3.6904.0584.033

4.1954.0944.027

4.0744.0844.029

2.7683.2793.597

3.015-17.070-21.892-24.703-3

4.2643.7374.023

u,v € C u = y sin(x)• i 4,

v = sm(y )

163264

128

1.261+16.5174.9132.752

1.892+11.044+17.5423.623

1.608+18.7036.3653.217

4.736+13.188+12.726+11.833+1

16/3232/6464/128

1.9351.3261.785

1.8121.3842.0828

1.8471.3671.978

1.4861.1691.487

3.853+12.896+11.494+11.026+1

1.3311.9391.456

t_(0> *_<V)

u = x sin(y)

v = y sin(x)

16

32

64

128

1.938-1

1.087-1

5.845-2

3.049-2

6.645-2

3.787-2

2.017-2

1.041-2

1.448-1

8.140-2

4.372-2

2.278-2

8.011-1

5.367-1

3.371-1

2.028-1

16/32

32/64

64/128

1.782

1.860

1.917

1.754

1.878

1.939

1.779

1.862

1.919

1.493

1.592

1.662

2.812-1

1.872-1

1.108-1

6.175-2

1.502

1.689

1.795

U,V 6 C u = sin(y)4

v = y sin(x)

16

32

64

128

2.733

1.787

1.033

5.604-1

6.771-1

4.29-1

2.413-1

1.281-1

1.991

1.300

7.501-1

4.065-1

1.055+1

8.858

6.367

4.193

16/32

32/64

64/128

1.529

1.730

1.343

1.578

1.778

1.884

1.532

1.732

1.845

1.191

1.391

1.519

4.096

3.456

2.401

1.505

1.185

1.439

1.595

u = x sin(y)

v = y sin (x)

16

32

64

4.381-1

2.571-1

1.429-1

1.319-1

7.774-2

4.213-2

3.235-1

1.899-1

1.054-1

1.738

1.227

8.254-1

16/32

32/64

1.704

1.798

1.697

1.845

1.704

1.802

1.416

1.487

6.385-1

4.722-1

3.017-1

1.350

1.565

y in the solution, which lead to decreasing order of accuracy, illustrate this point. The

fractional order of accuracy evident in these and in some of the other test cases also

points in this direction. Needless to say, we also performed experiments with our sol-

vers of Sections 3 and 5 on the same test cases; they all yielded second-order accurate

results.
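The order of accuracy discussed here is read off from the ratios of errors on successively halved meshes: a ratio near 4 indicates second order, a ratio near 2 first order. The short sketch below recomputes these ratios, and the implied order log2(ratio), for two error columns copied from Table VIII (M = 16, 32, 64, 128); the function name is illustrative.

```python
import math

def observed_orders(errors):
    """Return successive error ratios under mesh halving and the orders log2(ratio)."""
    ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
    orders = [math.log2(r) for r in ratios]
    return ratios, orders

# First error column of two test cases in Table VIII.
near_second_order = [4.183e-3, 1.020e-3, 2.519e-4, 6.262e-5]
near_first_order = [1.938e-1, 1.087e-1, 5.845e-2, 3.049e-2]

ratios_2, orders_2 = observed_orders(near_second_order)
ratios_1, orders_1 = observed_orders(near_first_order)
print(ratios_2, orders_2)
print(ratios_1, orders_1)
```

The first column gives ratios close to 4, the second gives ratios between 1.7 and 2, that is, an observed order slightly below one.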



TABLE VIII (Continued)

M *2(u) I (u) i (v)

u,v

harmonic

u=cos(x)sinh{y

v=sin(x)cosh (y

16

32

64

128

7.981-2

4.464-2

2.379-2

1.234-2

2.924-2

1.647-2

8.727-3

4.491-3

6.013-2

3-365-2

1.792-2

9-285-3

3.669-1

2-559-1

1.050-1

1.012-1

1.105-1

7.177-2

4.246-2

2.383-2

16/32

32/64

64/12E

1.789

1.876

1.928

1.775

1.888

I.943

I.787

1.878

I.930

I.434

1.551

1.630

I.539

1.690

1.783

u,v

harmonic

u=cos(x)sinh(y

v=sin (x)cosh (y

n/8

16

32

64

128

9.154-3

4.861-3

2.515-3

1.283-3

3.840-3

2.110-3

1.105-3

5.653-4

7.019-3

3.747-3

I.943-3

9.914-4

4.605-2

3.087-2

1.938-2

1.166-2

1.31B-2

7.809-3

4.181-3

2.168-3

tf/416/32

32/64

64/128

1.883

I.932

I.96I

1.820

I.910

I.954

1.873

I.929

1-950

I.492

I.593

1.661

1.726

1.868

I.928

u=sin(y)cosh(xv=cos (y) sinh (x)|

163264

12816/3232/6464/12E

1.17.03.92.1

43+1655101

3.9832.5691.4637.809-1

8.5575.3152.9791.585

5.535+14.188+12.760+11.696+1

1.776+11.457+19.4925.490

1.61.71.8

18 1.5501.7561.873

1.6101.7841.879

1.3221.5171.627

1.2191.5351.745

harmonicu=cos(y)sinh{xi

V = - sin(y)

cosh(x)

163264

128

1.78.94.52.2

42+1110060

8.3944.6812.4761.274

1.367+17.1173.6321.835

7.865+14.542+12.443+11.264+1

3.313+12.071+11.165+16.183

16/3232/6464/12Í

1.91.91.9

1.7931.8911.944

1.9211.9601.970

1.7311.8601.928

1.6001.7791.883

Subject to further testing, we are forced to conclude at this point that our

algorithms, compared to previously published ones, are at least as good when applied

to specific problems of interest, and that they are more suitable for development into

general-purpose Cauchy-Riemann solvers.

7. Concluding Remarks. The inhomogeneous Cauchy-Riemann equations in a

rectangle have been discretized by a finite-difference approximation. A number of

different boundary conditions have been treated explicitly, leading to algorithms which

have overall second-order accuracy. All boundary conditions with either u or v pre-

scribed along a side of the rectangle can be treated by similar methods. A rigorous

proof of the second-order accuracy of the algorithm was given for one combination of

boundary conditions, and numerical experiments substantiate this result for all the

boundary conditions tested.

The algorithms presented here have nearly minimal time and storage require-

ments and seem suitable for development into a general-purpose direct Cauchy-Riemann

solver for arbitrary boundary conditions. This could be done for instance along the

lines of the capacitance matrix methods of discrete potential theory (Widlund [33]);
generalizations to nonrectangular domains can also be made by this approach and
related ones. More experience with different applications should help in formulating a
code which gives a reasonable compromise between efficiency and range of applicability.

It is well known [30], [31], [32] that fast solvers can be formulated for a single
separable second-order elliptic equation with variable coefficients. Clearly, the
same generalizations can be carried out for a first-order system (2.1) in which $\partial/\partial x$
and $\partial/\partial y$ are replaced by $a(x)\,\partial/\partial x$ and by $b(y)\,\partial/\partial y$, and in which lower-order terms

can also be introduced. A special case of such an extension appears in [26]. Effi-

cient algorithms for generalizations of this nature can be based on appropriate modifi-

cations and combinations of matrix decomposition, cyclic reduction and Toeplitz fac-

torization [30], [31], [32].

Linear problems with variable coefficients, as well as nonlinear problems, can

also be handled by semidirect methods, i.e., by splitting the given operator to be in-

verted into one whose inverse is easily computed using a fast direct method, and an-

other one which is small in some suitable sense [8], [19], [25], [32]. It is in this direc-

tion that we shall seek to extend the work presented here, in order to solve the non-

linear problem of [15], [17] and related ones. The straightforward iterative method

of [17] will then be compared to semidirect methods.

Appendix A. An Error Estimate. We saw in Sections 4 and 5 that the proposed

algorithm gives a second-order accurate solution to the inhomogeneous Cauchy-Riemann

equations under the various boundary conditions considered. The second-order ac-

curacy of the discrete solutions was obtained in our numerical experiments even for

cases in which the solution to the continuous problem was merely $C^1$, more precisely,
$u \in C^1$, $v \in C^0$. We shall give now a rigorous error estimate for the model problem of
Section 2, making the stronger and more customary assumption that $u, v \in C^4$, i.e.,
that the solution to (2.1)-(2.3) has continuous derivatives up to fourth order.

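The discrete operator $L_h$ we consider is that of (3.11a),

(A.1)  $L_h = \begin{pmatrix} S+I & -I & & & \\ -I & S+2I & -I & & \\ & \ddots & \ddots & \ddots & \\ & & -I & S+2I & -I \\ & & & -I & S+I \end{pmatrix} + \alpha ee^{*},$

where $\alpha > 0$ and $e$ is a vector of length $MN$ with all entries zero but one,

$e_l = 0$ for $l \ne Mj_0 + i_0$, $\qquad e_{Mj_0+i_0} = 1.$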

$L_h$ acts on the grid function $U$. We wish to show that

(A.2a)  $\|u - U\|_\infty = O(h^2 + k^2).$

From such an estimate and from (3.1a) it will follow immediately that

(A.2b)  $\|v - V\|_\infty = O(h^2 + k^2)$

as well; it suffices to observe that (3.1a) is a second-order accurate quadrature formula
for

(A.3)  $v(x, y) = v^{(0)}(x) + \int_0^y \{d(x, \eta) - u_x(x, \eta)\}\,d\eta,$

namely the midpoint rule. We only need to apply the result of Bramble and Hubbard
[2] that (A.2a) implies similar estimates for the difference quotients of $u$.
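The second-order accuracy of the midpoint rule invoked here is easy to observe numerically: halving the step should reduce the quadrature error by a factor close to 4. The sketch below uses sin on [0, 1] as a stand-in integrand; that choice and the function name are assumptions made only for illustration.

```python
import math

def midpoint_integral(f, a, b, n):
    """Composite midpoint rule with n equal subintervals on [a, b]."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))

exact = 1.0 - math.cos(1.0)                    # integral of sin over [0, 1]
errors = [abs(midpoint_integral(math.sin, 0.0, 1.0, n) - exact) for n in (8, 16, 32)]
ratios = [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]
print(errors, ratios)
```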



The matrix operator $L_h$ of (A.1) is of monotone type (Collatz, [6, p. 42 ff.])
or of positive type with diagonal dominance (Forsythe and Wasow [11, p. 181]). To
avoid confusion in the terminology, we shall simply state that $L_h$ satisfies the following

Lemma 1 (Maximum Principle). If $L_hW = H$, and $H \ge 0$ (in the sense that
all the components of $H$ are nonnegative), then $W \ge 0$.

From this, one easily shows that

Lemma 2 (Comparison Theorem). If $|L_hW| \le L_h\Phi$, then $|W| \le \Phi$.

We notice that these results would still hold if, instead of $\alpha ee^{*}$, we included in
$L_h$ a term $\alpha A$, with a single diagonal block equal to $I_M$, and all other blocks zero, $I_M$
being the $M \times M$ identity matrix.
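Lemma 1 is equivalent to saying that the inverse of the operator has no negative entries. The sketch below illustrates this on a one-dimensional analog only, not on $L_h$ itself: the matrix tridiag(-1, 2, -1) with corner entries 1 (the pattern of the block diagonal S + I, S + 2I, ..., S + I with S dropped) plus a positive entry α at one pinned index. The size, the value of α, the pinned index and the function names are all illustrative assumptions.

```python
def neumann_matrix(n, alpha, pin):
    """1D analog: tridiag(-1, 2, -1) with corner diagonal entries 1, plus alpha at index pin."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 2.0 if 0 < i < n - 1 else 1.0
        if i > 0:
            a[i][i - 1] = -1.0
        if i < n - 1:
            a[i][i + 1] = -1.0
    a[pin][pin] += alpha
    return a

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(a)
    aug = [list(a[i]) + [1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

A = neumann_matrix(6, 0.5, 2)
A_inv = inverse(A)
min_entry = min(min(row) for row in A_inv)
print(min_entry)
```

Since every entry of the inverse is positive, any nonnegative right-hand side H yields a nonnegative solution W, which is the statement of the maximum principle for this small model.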

These properties of $L_h$ suggest the familiar estimation procedure first used by
Gershgorin [13]. Let $u$ be the solution of

(A.4a)  $Lu = f,$

where $L$ and $f$ are defined by the two equations

(A.4b)  $\Delta u = u_{xx} + u_{yy} = d_x + e_y$ in $R$,

(A.4c)  $u_y = e + v_x$ on $\partial R = \{(x, y): 0 \le x \le 2\pi,\ y = 0, \pi\}$,

with the additional requirements that $u$ be $2\pi$-periodic in x, and given at a point
$(x_0, y_0) \in R$. For sufficiently smooth $d$, $e$ and $v^{(0)}$, $v^{(1)}$, (A.4) has a unique solution
$u \in C^4$, subject to the familiar compatibility condition for the Neumann problem that

(A.4d)  $\iint_R (d_x + e_y)\,dx\,dy = -\oint_{\partial R} (e + v_x)\,dx.$

We shall make the necessary smoothness and compatibility assumptions throughout
this Appendix.

Let $U$ be the solution of

(A.5a)  $L_hU = F,$

where $L_h$ is defined by (A.1) and $F$ is defined as the right-hand side of (3.11a),

(A.5b)  $F = k\begin{pmatrix} T^{*}D_1 - E_1 \\ T^{*}D_2 + E_1 - E_2 \\ \vdots \\ T^{*}D_{N-1} + E_{N-2} - E_{N-1} \\ T^{*}D_N + E_{N-1} \end{pmatrix} + \alpha ee^{*}U,$

with $F_j$, $1 \le j \le N$, corresponding in obvious fashion to the partition of $L_h$. Clearly,
for $2 \le j \le N-1$, (A.5) is an approximation to (A.4b) with second-order local truncation
error. This would further encourage us to seek an estimate (A.2a) by first
showing that the truncation error satisfies

(A.6)  $L_h(u - U) = (L_h - L)u + f - F = O(k^2(h^2 + k^2))$

and then constructing a Gershgorin comparison function $\Phi$ which would allow us to
conclude based on Lemma 2 that (A.2) follows from (A.6). We remark at this point
that, in order to make the notation of (A.6) transparent, we defined

(A.7a)  $L = -k^2\Delta$ in $R$, $\qquad L = -k\,\partial/\partial y$ on $\partial R$,

(A.7b)  $f = -k^2(d_x + e_y)$ in $R$, $\qquad f = -k(e + v_x)$ on $\partial R$;

furthermore, $F$ is not merely $f$ at the grid points.

A number of slight difficulties arise in carrying out the program above. First,
(A.4) is essentially a Neumann problem, rather than a Dirichlet problem as in [13] and
in most of the literature on elliptic systems. The work on boundary conditions of the
second and third kind most relevant to the estimation which follows is that of Batschelet
[1], who gives an $O(h)$ estimate, and that of Bramble and Hubbard [3], who give
an $O(h^2|\log h|)$ estimate, their assumptions being that $u \in C^4$. The latter article also
contains further references to estimates for the Neumann problem. The boundary condition
approximations in these works are different from ours.

Second, (A.6) actually fails close to the boundary, i.e., for $j = 1, N$, where the
local truncation error is of first order only. The work of Bramble and Hubbard ([3]
and references therein), combined with our numerical results, led us to expect that
(A.2) could still be proved, with some additional effort; this turned out to be the case.

Our method to obtain (A.2) is actually a modification of that in [1], which exploits
the fact that the boundary condition we use is second-order accurate, while that
in [1] is only first-order. We shall see below that the failure of (A.6) for $j = 1, N$
stems from $L_h$ there being a linear combination of the discrete analog to (A.4b) and
of that to (A.4c). Hence the truncation error, as defined by (A.6), is of first order
there, although both (A.4b) and (A.4c) are separately approximated to second order.
In spite of this formally first-order truncation error, the fact that both the equation
and the boundary condition are approximated by second-order discrete analogs yields
an overall $O(h^2 + k^2)$ error estimate (A.2).

After these observations, we proceed with the business at hand. To start, we
derive (A.6) for $2 \le j \le N-1$; let the discretization error $W$ be defined as

(A.8a)  $W = u - U,$

considered as a mesh function. Then

(A.8b)  $(L_hW)_j = -W_{j-1} + (S + 2I)W_j - W_{j+1}$
        $= -u_{j-1} + (S + 2I)u_j - u_{j+1} + k^2(\Delta u)_j - k^2(d_x + e_y)_j - k(T^{*}D_j + E_{j-1} - E_j)$
        $= O(k^2(h^2 + k^2)), \qquad 2 \le j \le N-1.$

In (A.8b), and in the sequel, $u_j$, $(\Delta u)_j$, and similar terms are interpreted as vectors of
grid values, in the same way as $U_j$. The factor $k^2$ is convenient in order to keep the
coefficients of $L_h$ as $O(1)$. The terms $O(h^2k^2)$ appear due to differentiation in the x
direction, those $O(k^4)$ due to differentiation in the y direction. The constants implied
in writing $O(h^pk^q)$ depend on the derivatives of order $p + q$ for the functions involved;
they will not be written down explicitly, since those derivatives are known to be
bounded from our assumptions.

Here, as in the derivation of (A.2b) from (A.3) and as in the sequel, it is important
to remember the staggering of Figure 1, which essentially guarantees that all
differences are centered. Notice also that, for $e_{j_0} \ne 0$, (A.8b) will only be modified
by the term $\alpha e_{j_0}e_{j_0}^{*}u_{j_0}$ in $L_hu$, and by the term $\alpha e_{j_0}e_{j_0}^{*}U_{j_0}$ in $F$; the condition we
chose to eliminate the indeterminacy in the Neumann problem is exactly that these
two terms cancel, and hence the estimate is not affected. If instead of $ee^{*}$ we have
$A$, the same observation holds as after Lemma 2; i.e., prescribing the average of $U_{j_0}$,
rather than one of its components, $U_{i_0,j_0}$, does not affect the estimate either.

To analyze the situation for $j = 1$, it is convenient to rewrite (A.5) for $j = 1$ as
a linear combination of two equations, involving an auxiliary vector $U_0$; an entirely
similar procedure can be carried out for $j = N$, introducing $U_{N+1}$, but we shall omit
writing out the latter analysis and merely draw the conclusions we need from it. The
two equations for $j = 1$ are

(A.9a)  $-U_0 + (S + 2I)U_1 - U_2 = k(T^{*}D_1 + E_0 - E_1),$

(A.9b)  $U_0 - U_1 = -kE_0 + T^{*}V_0;$

here we revert to the original definition of $D_1$, which had been replaced by $D_1 + k^{-1}V_0$
in writing Eq. (3.3). Equation (A.9a) is now simply the discrete analog of (A.4b)
for $j = 1$, or $y = k/2$, with $E_0$ defined on $y = 0$ in the usual manner. Equation (A.9b)
is the discrete analog of (A.4c) written on $y = 0$. If the auxiliary vector $U_0$ is thought of
as given on $y = -k/2$, then both (A.9a) and (A.9b) are formally second-order accurate.
With this motivation in mind, it is easy to obtain

(A.10a)  $(L_hW)_1 = (S + I)W_1 - W_2$
         $= (S + I)u_1 - u_2 + k^2\Delta u|_{y=k/2} + k(\partial/\partial y)u|_{y=0}$
         $\quad - k^2(d_x + e_y)|_{y=k/2} - k(T^{*}D_1 + E_0 - E_1) - k(e + v_x)|_{y=0} + kE_0 - T^{*}V_0$
         $= [(L_h - L)u]_1 + [f - F]_1.$

We have from (A.10a) that

$[(L_h - L)u]_1 = O(k^2(h^2 + k)), \qquad [f - F]_1 = O(k^2(h^2 + k^2)) + O(kh^2).$

We assume throughout that $k/h = O(1)$ and drop the fourth-order terms; thus finally

(A.10b)  $(L_hW)_1 = O(k(h^2 + k^2)).$

For $j = N$ we obtain in the same way

(A.10c)  $(L_hW)_N = O(k(h^2 + k^2)).$

At this point we have to exhibit a comparison function $\phi$ which shall lead us
from (A.8) and (A.10) to (A.2). Let $\psi$ be the solution of

(A.11a)  $L\psi = k^2a$ in $R$,

(A.11b)  $L\psi = -k\alpha b$ on $y = 0$, $\qquad L\psi = k\beta b$ on $y = \pi$.

$L$ is defined by (A.7), and $\psi$ is $2\pi$-periodic in x and fixed at one point; this yields a
unique and smooth $\psi$, which is bounded independently of $x$, $y$, $h$ and $k$. Notice that
$\psi$ satisfies inhomogeneous boundary conditions as in [1]. The constants $a$, $b$, $\alpha$ and $\beta$
are positive and chosen so as to satisfy the compatibility condition

(A.11c)  $\iint_R \Delta\psi\,dx\,dy = \int_0^{2\pi}\{\psi_y(x, \pi) - \psi_y(x, 0)\}\,dx,$

that is, $a\pi = (\alpha + \beta)b$.
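For completeness, the relation between the four constants can be worked out directly from the definitions. The short derivation below assumes the reading $L = -k^2\Delta$ in $R$ and $L = -k\,\partial/\partial y$ on the boundary from (A.7), the data of (A.11a), (A.11b), and the rectangle $0 \le x \le 2\pi$, $0 \le y \le \pi$.

```latex
\begin{align*}
-k^2 \Delta\psi = k^2 a
  &\;\Longrightarrow\; \Delta\psi = -a \quad \text{in } R, \\
-k\,\psi_y(x,0) = -k\alpha b
  &\;\Longrightarrow\; \psi_y(x,0) = \alpha b, \\
-k\,\psi_y(x,\pi) = k\beta b
  &\;\Longrightarrow\; \psi_y(x,\pi) = -\beta b, \\
\iint_R \Delta\psi \,dx\,dy = -a \cdot 2\pi \cdot \pi
  &= \int_0^{2\pi} \{-\beta b - \alpha b\}\,dx = -2\pi(\alpha+\beta)b, \\
  &\;\Longrightarrow\; a\pi = (\alpha+\beta)\,b .
\end{align*}
```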

We want to find $m$ independent of $x$, $y$, but not of $k$, $h$, so that

(A.12a)  $\phi = m\psi$

and

(A.12b)  $|L_hW| \le L_h\Phi,$

where $W$ is given by (A.8a) and $\Phi$ is the mesh function corresponding to $\phi$. Lemma 2
will then provide the desired estimate on $W$ in terms of $m$.

We consider first the "interior" mesh domain of U-points $R_h = \{(x_i, y_j): 1 \le i \le M,\ 2 \le j \le N-1\}$. There

(A.13a)  $L_h\Phi = L\phi + (L_h - L)\phi = mk^2a + m\,O(k^2(h^2 + k^2)),$

as in (A.8b). For $h$, $k$ sufficiently small, with $h/k = O(1)$,

$L_h\Phi \ge mk^2a/2.$

It suffices, therefore, given some sufficiently large constant $C_R$, to have set

(A.14a)  $m = C_Rh^2$

for (A.12b) to hold in $R_h$; $C_R$ will depend on certain bounds for the derivatives of $u$.

Consider now the "boundary" mesh domain of U-points $\Gamma_h = \{(x_i, y_j): 1 \le i \le M,\ j = 1, N\}$. Here, cf. (A.9), (A.10), we have

(A.13b)  $(L_h\Phi)_1 = (S + I)\Phi_1 - \Phi_2 = k^2\Delta\phi|_{y=k/2} + k\phi_y|_{y=0} + O(k^2(h^2 + k))$
         $= -mk^2a + mk\alpha b + O(k^3);$

a similar equation holds for $(L_h\Phi)_N$. Thus, for $h$, $k$ sufficiently small, we have in $\Gamma_h$

$L_h\Phi \ge mk\alpha b/2;$

remembering again (A.10), it suffices here too that

(A.14b)  $m = C_\Gamma h^2,$

for (A.12b) to hold. The constant $C_\Gamma$ depends only on the derivatives of $u$, and on the
domain $R$.

In fact, one can write down $\psi = \psi(x, y; a, b, \alpha, \beta)$ explicitly and optimize the parameters a, b, $\alpha$, $\beta$ subject to (A.11c). This yields

(A.15a) IB'I <|Ä2max||CÄ, Cr\,

where

(A.l5b) CR = — sup \2(uxxxx + u ) - (dxxx + e )\,^ 0<x<2n,

0<y<TT

and

(A.l 5c) Cr=TA SUP \uyyy+vxxx\-

y = 0,ir

The bound e(u) in Table V of Section 4 was computed using (A. 15).

Deriving the bound (A.15) completes the proof that, under our assumptions,

$$\|u - U\|_\infty + \|v - V\|_\infty = O(h^2).$$

It might be of interest to notice that the algorithm proposed and the error estimate just given apply not only to the inhomogeneous Cauchy-Riemann equations (2.1), (2.2); they apply directly to the Neumann problem (A.4), with $d_x + e_y$, $e + v_x$ arbitrarily prescribed as well. In other words, this is also a second-order algorithm for the Poisson equation with Neumann boundary conditions in a rectangle (compare Schumann and Sweet [29]).

Appendix B. The Basic Algorithm. At the end of the paper we present a

FORTRAN listing of the program used to compute the results in Tables I through VI.

The program implements the algorithm described in Section 3, tested in Section 4, and

analyzed in Appendix A. It solves the inhomogeneous Cauchy-Riemann equations for

u and v in a rectangle, with v prescribed on the upper and lower sides of the rectangle,

and with periodicity in the x-direction. The programs for other combinations of

boundary conditions, as discussed in Section 5, are very similar.
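
As a rough illustration of this problem setup, the following sketch builds the two right-hand sides on the staggered mesh for the test solution used in the driver listing (u = cos x + y, v = sin x). It is written in Python rather than FORTRAN; the function name `staggered_rhs` and the list layout are assumptions made for this illustration only, and the mesh formulas follow the driver's statement functions XJU, XJV, YKU and YKV as we read them from the listing.

```python
import math


def staggered_rhs(m_pow):
    """Right-hand sides e = u_y - v_x and d = u_x + v_y on the staggered mesh.

    Test solution (as in the driver): u = cos(x) + y, v = sin(x).
    Returns (m, n, e, d); e and d are row-major lists of length m * n.
    """
    m = 2 ** m_pow            # mesh points in x (a power of two)
    n = m // 2                # mesh points in y
    h2 = 2.0 * math.pi / m    # full mesh width in x; the period is 2*pi
    h = h2 / 2.0              # half mesh width in x

    e, d = [], []
    for _row in range(1, n + 1):
        # Both right-hand sides of this test problem are independent of y,
        # so the y-coordinates YKU and YKV are not needed here.
        for i in range(1, m + 1):
            x_u = h2 * i - h      # XJU(I): abscissa used for e
            x_v = h2 * (i - 1)    # XJV(I): abscissa used for d
            e.append(1.0 - math.cos(x_u))   # u_y - v_x
            d.append(-math.sin(x_v))        # u_x + v_y
    return m, n, e, d
```

With `m_pow = 3` this gives an 8 by 4 mesh, the coarsest case in the driver's loop over MF.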

The program was tested for compatibility with ANSI FORTRAN (U.S.A. Stan-

dard X3.9-1966) by a special diagnostic option of the CDC FTN 4.6 compiler; it was

further cleaned up to conform to common usage by a program called TIDY, imple-

mented at and available from the Courant Institute. It was run on a CDC 6600 with

an FTN 4.6 compiler, and on an IBM 360/95 and an Amdahl 470V/6 with a

FORTRAN IV, level H compiler. It is felt that these procedures and the numerical

tests reported on in the text are some reasonable steps in the direction of validation

and portability; further steps in this direction taken by potential users would be of the

greatest interest to the authors.



The main part of the program is subroutine FASTCAR. For user convenience,

the driver subroutine, as well as the FFT subroutine and the tridiagonal solver we used

are included. Clearly, these subroutines can be changed to others which the user for

one reason or another finds are better adapted to his needs or preferences; e.g., cyclic

reduction can be used instead of the FFT, or Toeplitz factorization instead of the LU

factorization we use for the scalar tridiagonal systems, etc. The use of a more general

FFT algorithm would remove the restriction of M being a power of two.
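
To make the role of the scalar tridiagonal solver concrete, here is a minimal Python sketch of an LU solve for a tridiagonal matrix whose off-diagonal entries all equal 1, which is the structure handled by subroutine CONSOL as we read the listing. The function names `factor_tridiagonal` and `solve_with_pivots` are hypothetical. Storing the pivots separately mirrors the idea behind the ICODE = 3 option, where a factorization is reused on later calls.

```python
def factor_tridiagonal(diag):
    """Pivots of the LU factorization of a tridiagonal matrix with main
    diagonal `diag` and all sub- and super-diagonal entries equal to 1."""
    z = [diag[0]]
    for k in range(1, len(diag)):
        z.append(diag[k] - 1.0 / z[k - 1])
    return z


def solve_with_pivots(z, rhs):
    """Solve the system for one right-hand side, reusing stored pivots."""
    n = len(z)
    y = list(rhs)
    for k in range(1, n):                 # forward elimination
        y[k] -= y[k - 1] / z[k - 1]
    y[n - 1] /= z[n - 1]                  # back substitution
    for k in range(n - 2, -1, -1):
        y[k] = (y[k] - y[k + 1]) / z[k]
    return y
```

A Toeplitz factorization or cyclic reduction could replace these two functions without changing the calling code, which is the kind of substitution the paragraph above has in mind.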

The driver program contains some diagnostics on error norms, which would have

to be modified for problems whose solution is not known in closed form. FORMAT

statements are grouped together and can thus easily be modified. Execution speed can

be increased if MN additional storage locations are available, by avoiding the shifting

required for the reindexing process within the same array. Other trade-offs between

storage and execution time are also possible.
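
The error diagnostics in the driver can be summarized by the following Python sketch, offered under the assumption that the exact solution is available at the mesh points. The helper names `error_norms`, `successive_ratios` and `observed_orders` are hypothetical. The driver prints the ratios of norms on successive meshes; taking the base-2 logarithm of such a ratio is an added convenience here, giving an observed order of accuracy when the mesh width is halved each time.

```python
import math


def error_norms(computed, exact):
    """Discrete L2 norm (root mean square) and max norm of the error."""
    diffs = [abs(c - e) for c, e in zip(computed, exact)]
    l2 = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return l2, max(diffs)


def successive_ratios(errors):
    """Ratio of each error to the error on the next finer mesh."""
    return [errors[i] / errors[i + 1] for i in range(len(errors) - 1)]


def observed_orders(errors):
    """Observed order of accuracy, assuming the mesh width is halved."""
    return [math.log2(r) for r in successive_ratios(errors)]
```

For a second-order scheme the ratios should approach 4 and the observed orders should approach 2.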

The implementation follows rather closely the algorithm description in the last subsection of Section 3. With the help of the COMMENTS provided in the program, we hope that it is fairly readable.

Questions and comments by users are warmly invited.

C     DRIVER FOR FASTCAR
C
      COMMON /KEEP/ Z(8256)
      COMMON /HAND/ E(8292),D(8292)
      COMMON /WET/ VO(128),VN(128)
      COMMON /DRY/ X(64),Y(64)
      DIMENSION W(5,6)
C
      XJU(J)=XL+H2*FLOAT(J)-H
      XJV(J)=XL+H2*FLOAT(J-1)
      YKU(K)=YL+G2*FLOAT(K)-G
      YKV(K)=YL+G2*FLOAT(K)
      VEX(X,Y)=SIN(X)
      UEX(X,Y)=COS(X)+Y
      DUDY(X,Y)=1.0
      DUDX(X,Y)=-SIN(X)
      DVDX(X,Y)=COS(X)
      DVDY(X,Y)=0.0
C
      ISAM=-1
      PIA=1.0
      PI=4.0*ATAN(PIA)
      PI2=PI/2.0
      DO 60 MF=3,6
      ICODE=0
      MPOW=MF
      M=2**MPOW
      N=M/2
      YL=0.0
      XL=0.0
      XU=2.0*PI
      H2=(XU-XL)/FLOAT(M)
      YU=N*H2
      H=H2/2.0
      G2=(YU-YL)/FLOAT(N)
      G=G2/2.0
      M1=M-1
      ALAM=G/H
      AM=1.0/M
      N1=N-1
      IL=M*N
      ILM=IL-M
      MR=M+1
      DO 10 J=1,N
      Y1=YKV(J)
      Y2=YKU(J)
      DO 10 I=1,M
      X1=XJU(I)
      X2=XJV(I)
      K=(J-1)*M+I
C
C     SETTING UP R.H.S. OF THE INHOMOGENEOUS CAUCHY-RIEMANN
C     EQUATIONS.
C     E IS THE R.H.S. OF THE VORTICITY EQUATION
C     D IS THE R.H.S. OF THE DIVERGENCE EQUATION
C
      E(K)=DUDY(X1,Y1)-DVDX(X1,Y1)
      D(K)=DUDX(X2,Y2)+DVDY(X2,Y2)
   10 CONTINUE
C
C     NEXT STORE THE BOUNDARY VALUES V(TOP) IN VN(I).
C     NEXT STORE THE BOUNDARY VALUES V(BOTTOM) IN VO(I).
C
      DO 20 I=1,M
      X2=XJV(I)
      VO(I)=VEX(XJV(I),YU)
      VN(I)=VEX(XJV(I),YL)
   20 CONTINUE
C
C     NEXT STORE THE FIXED U VALUE IN SOL.
C     IROW IS THE ROW WHERE FIXED VALUE U IS ASSIGNED.
C
      SOL=0.0
      IROW=N
      Y1=G*(2*(IROW-1)+1)+YL
      DO 30 J=1,M
      X1=XJU(J)
      SOL=SOL+UEX(X1,Y1)
   30 CONTINUE
      CALL FASTCR (SOL,H,G,PI,IROW,MPOW,ISAM,ICODE,N)
C
C     NEXT FINDING NORMS..
C     XN IS THE L2 NORM OF U.
C     YN IS THE L2 NORM OF V.
C     XYN IS THE L2 NORM OF (U+V).
C     XM IS THE MAX NORM OF U.
C     YM IS THE MAX NORM OF V.
C
      YN=0.0
      YM=0.0
      DO 40 J=1,M
      X2=XJV(J)
      DO 40 I=1,N1
      Y2=YKV(I)
      K=(I-1)*M+J
      DKKK=ABS(D(K)-VEX(X2,Y2))
      YN=YN+DKKK*DKKK
      IF (DKKK.GT.YM) YM=DKKK
   40 CONTINUE
      XN=0.0
      XM=0.0
      DO 50 J=1,M
      X1=XJU(J)
      DO 50 I=1,N
      Y1=YKU(I)
      K=(I-1)*M+J
      EKKK=ABS(E(K)-UEX(X1,Y1))
      XN=XN+EKKK*EKKK
      IF (EKKK.GT.XM) XM=EKKK
   50 CONTINUE
      XYN=(XN+YN)/(IL+ILM)
      XN=XN/IL
      YN=YN/ILM
      XN=SQRT(XN)
      YN=SQRT(YN)
      XYN=SQRT(XYN)
      WRITE (6,120)
      WRITE (6,110) XN,YN,XYN,XM,YM
      W(1,MF)=XN
      W(2,MF)=YN
      W(3,MF)=XYN
      W(4,MF)=XM
      W(5,MF)=YM




      WRITE (6,90)
      WRITE (6,90)
      WRITE (6,80) N,M,XL,XU,YL,YU
      WRITE (6,90)
      WRITE (6,90)
   60 CONTINUE
      WRITE (6,90)
      DO 70 J=1,5
      W(J,3)=W(J,3)/W(J,4)
      W(J,4)=W(J,4)/W(J,5)
   70 W(J,5)=W(J,5)/W(J,6)
      WRITE (6,100)
      WRITE (6,110) ((W(J,I),J=1,5),I=3,6)
      STOP
   80 FORMAT (5X,*N=*,I3,* M=*,I3,5X,*XL=*,F10.3,* XU=*,F10.3,* YL=*,
     1F10.3,* YU=*,F10.3)
   90 FORMAT (1X)
  100 FORMAT (7X,9HFOLLOWING,4X,2HIS,4X,5HRATIO,4X,2HOF,4X,5HABOVE,4X,5H
     1NORMS)
  110 FORMAT (1X,1P5E20.10)
  120 FORMAT (1X,9HNORM OF U,15X,9HNORM OF V,12X,11HNORM OF U+V,10X,11HM
     1AX NORM U ,10X,11HMAX NORM V )
      END


      SUBROUTINE FASTCR (SOL,DELTAX,DELTAY,PI,IROW,MPOW,ISAM,ICODE,N)
      COMMON /HAND/ E(8292),D(8292)
      COMMON /DRY/ X(64),Y(64)
      COMMON /WET/ A(128,2)
C
C     ***************************************************************
C     SOLVES INHOMOGENEOUS CAUCHY-RIEMANN EQUATIONS IN A RECTANGLE
C        U  + V  = D(X,Y)
C         X    Y           WITH B.C. V(X,0) , V(X,YMAX)
C        U  - V  = E(X,Y)
C         Y    X           AND A FIXED VALUE OF U ASSIGNED
C                          AT SOME (X,Y).
C
C     ALL FUNCTIONS ARE PERIODIC IN X WITH PERIOD 2*DELTAX*M .
C
C     THE R.H.S. E(X,Y) , D(X,Y) ARE STORED IN
C        E(M*J+I) = E( DELTAX*(2*I-1) , DELTAY*(2*J+2) )
C                   FOR J = 0...N-2 , I = 1...M
C        D(M*J+I) = D( DELTAX*2*(I-1) , DELTAY*(2*J+1) )
C                   FOR J = 0...N-1 , I = 1...M
C
C     THE BOUNDARY CONDITIONS ARE STORED IN
C        A(I,1) = V(DELTAX*I,0) , A(I,2) = V(DELTAX*I,DELTAY*N)
C                   FOR I = 1...M
C
C     SOLUTIONS U AND V ARE RETURNED IN
C        E(M*J+I) = U( DELTAX*(2*I-1) , DELTAY*(2*J+1) )
C                   FOR J = 0...N-1 , I = 1...M
C        D(M*J+I) = V( DELTAX*2*(I-1) , DELTAY*(2*J+2) )
C                   FOR J = 0...N-2 , I = 1...M
C
C     N       = NUMBER OF MESH POINTS IN Y
C     2**MPOW = NUMBER OF MESH POINTS IN X
C
C     ISAM = -1 SOLVES V FROM TOP TO BOTTOM (Y AXIS)
C     ISAM =  1 SOLVES V FROM BOTTOM TO TOP
C     ISAM =  0 DOESNT SOLVE V
C     IROW = THE ROW IN Y WHERE THE FIXED VALUE OF U IS ASSIGNED
C          = (J+1) OF U( X , DELTAY*(2*J+1) )
C     SOL  = SUM OVER THE (X) OF THIS ROW
C
C     ICODE MUST BE SET NOT EQUAL TO 3 WHEN ROUTINE IS
C     CALLED FOR THE FIRST TIME
C     ICODE = 3 FOR SUBSEQUENT CALL IF THE NUMBER OF
C     MESH POINTS IN X AND Y ARE SAME AS IN THE PREVIOUS CALL
C     ***************************************************************
C
      M=2**MPOW
      MBY2=M/2
      M1=M-1
      AM=1.0/FLOAT(M)
      N1=N-1
      NEXT=0




      MHALFP=MBY2+1
      MPLUS2=M+2
      IL=M*N
      ILM=IL-M
      ALAM=DELTAY/DELTAX
      ALSQ=ALAM*ALAM
C
C     SETTING UP THE R.H.S. OF THE LINEAR SYSTEM
C
      G2=2.0*DELTAY
      DO 10 K=1,ILM
      E(K)=E(K)*G2
   10 D(K)=D(K)*G2
      DO 20 I=1,M
      KP=I+ILM
      D(KP)=D(KP)*G2-A(I,2)
      D(I)=D(I)+A(I,1)
   20 A(I,1)=0.0
      IP=1
      IS=2
      DO 40 J=1,ILM,M
      KM=J-1
      DO 30 I=1,M1
      K1=KM+I
      A(I,IS)=E(K1)
   30 E(K1)=ALAM*(D(K1+1)-D(K1))+E(K1)-A(I,IP)
      KMMP=K1+1
      A(M,IS)=E(KMMP)
      E(KMMP)=ALAM*(D(KM+1)-D(KMMP))+E(KMMP)-A(M,IP)
      IEX=IP
      IP=IS
   40 IS=IEX
      DO 50 I=1,M1
      K1=ILM+I
   50 E(K1)=ALAM*(D(K1+1)-D(K1))-A(I,IP)
      E(IL)=ALAM*(D(ILM+1)-D(IL))-A(M,IP)
C
C     PERFORMING THE FAST FOURIER TRANSFORM BLOCK BY BLOCK
C
      SIG=-1.0
      DO 80 K1=1,N
      JF=(K1-1)*M
      DO 60 I=1,M
      J=JF+I
      A(I,1)=E(J)
   60 A(I,2)=0.0
      CALL FFT2 (SIG,MPOW,PI)
      DO 70 I=1,MBY2
      IQF=JF+I*2
      E(IQF-1)=A(I,1)
   70 E(IQF)=A(I,2)
      X(K1)=A(MHALFP,1)
   80 Y(K1)=A(MHALFP,2)
C
C     SOLVING EACH SUB-SYSTEM
C
      DO 120 J=2,MBY2
      IF (ICODE.EQ.3) GO TO 100
      X1=2.0*ALSQ*(COS(2.0*PI*(FLOAT(J)-1.0)*AM)-1.0)-2.0
      A(1,1)=X1+1.0
      A(N,1)=X1+1.0
      DO 90 K=2,N1
   90 A(K,1)=X1
  100 DO 110 K=1,N
      I=(K-1)*M+2*J
      IQF=K+N
      A(K,2)=E(I-1)
  110 A(IQF,2)=E(I)
      CALL CONSOL (N,NEXT,ICODE)
      DO 120 K=1,N
      I=(K-1)*M+2*J
      E(I-1)=A(K,2)
      IQF=K+N
  120 E(I)=A(IQF,2)
      IF (ICODE.EQ.3) GO TO 140
      X1=-2.0*(ALSQ*2.0+1.0)
      A(1,1)=X1+1.0
      A(N,1)=X1+1.0
      DO 130 K=2,N1
  130 A(K,1)=X1
  140 DO 150 J=1,N
      A(J,2)=X(J)
      IQF=J+N




  150 A(IQF,2)=Y(J)
      CALL CONSOL (N,NEXT,ICODE)
      DO 160 J=1,N
      IQF=N+J
      X(J)=A(J,2)
  160 Y(J)=A(IQF,2)
      IF (ICODE.EQ.3) GO TO 180
      A(1,1)=-1.0
      A(N,1)=-1.0
      DO 170 K=2,N1
  170 A(K,1)=-2.0
  180 DO 190 K=1,N
      I=(K-1)*M+1
      A(K,2)=E(I)
      IQF=N+K
  190 A(IQF,2)=E(I+1)
      A(IROW,1)=A(IROW,1)-1.0
      A(IROW,2)=A(IROW,2)-SOL
      CALL CONSOL (N,NEXT,ICODE)
      DO 200 K=1,N
      I=(K-1)*M+1
      E(I)=A(K,2)
      IQF=N+K
  200 E(I+1)=A(IQF,2)
C
C     FAST FOURIER TRANSFORM OF SOLUTION U
C
      SIG=1.0
      DO 230 K1=1,N
      JF=(K1-1)*M
      DO 210 I=1,MBY2
      IQF=JF+I*2
      A(I,1)=E(IQF-1)
  210 A(I,2)=E(IQF)
      DO 220 I=2,MBY2
      L=MPLUS2-I
      A(L,1)=A(I,1)
  220 A(L,2)=-A(I,2)
      A(MHALFP,1)=X(K1)
      A(MHALFP,2)=Y(K1)
      CALL FFT2 (SIG,MPOW,PI)
      DO 230 I=1,M
      J=JF+I
  230 E(J)=A(I,1)*AM
C
C     SOLUTION V WILL BE IN ARRAY D AND U WILL BE IN E
C
      IF (ISAM.EQ.0) GO TO 300
      IF (ISAM.EQ.-1) GO TO 260
      DO 250 K=1,N1
      KM=K*M
      KMMP=KM-M
      A(1,1)=ALAM*(E(KM)-E(KMMP+1))
      DO 240 J=2,M
      KJ=KMMP+J
  240 A(J,1)=ALAM*(E(KJ-1)-E(KJ))
      DO 250 I=1,M
      J=KMMP+I
      D(J)=D(J)+A(I,1)
      IF (K.LE.1) GO TO 250
      K1I=J-M
      D(J)=D(J)+D(K1I)
  250 CONTINUE
      GO TO 300
  260 DO 270 I=1,M
      J=I+ILM
  270 A(I,1)=D(J)
      IS=1
      IP=2
      DO 290 KD=1,N1
      IEX=IS
      IS=IP
      IP=IEX
      KM=M*(N-KD+1)
      KMMP=KM-M
      A(1,IP)=ALAM*(E(KM)-E(KMMP+1))+A(1,IP)
      DO 280 J=2,M
      KJ=KMMP+J
  280 A(J,IP)=ALAM*(E(KJ-1)-E(KJ))+A(J,IP)
      DO 290 I=1,M
      J=KMMP+I




      K1I=J-M
      A(I,IS)=D(K1I)
      D(K1I)=-A(I,IP)
      IF (KD.GT.1) D(K1I)=D(J)+D(K1I)
  290 CONTINUE
  300 CONTINUE
      RETURN
      END
      SUBROUTINE FFT2 (SIG,M,PI)
      COMMON /WET/ X(128),Y(128)
C
C     FAST FOURIER TRANSFORM
C
      N=2**M
      NV2=N/2
      NM1=N-1
      J=1
      DO 30 I=1,NM1
      IF (I.GE.J) GO TO 10
      TX=X(J)
      TY=Y(J)
      X(J)=X(I)
      Y(J)=Y(I)
      X(I)=TX
      Y(I)=TY
   10 K=NV2
   20 IF (K.GE.J) GO TO 30
      J=J-K
      K=K/2
      GO TO 20
   30 J=J+K
      DO 50 L=1,M
      LE=2**L
      LE1=LE/2
      UX=1.0
      UY=0.0
      ANG=PI/FLOAT(LE1)
      WX=COS(ANG)
      WY=SIG*SIN(ANG)
      DO 50 J=1,LE1
      DO 40 I=J,N,LE
      IP=I+LE1
      TX=X(IP)*UX-Y(IP)*UY
      TY=X(IP)*UY+Y(IP)*UX
      X(IP)=X(I)-TX
      Y(IP)=Y(I)-TY
      X(I)=X(I)+TX
   40 Y(I)=Y(I)+TY
      US=UX*WX-UY*WY
      UY=UX*WY+UY*WX
   50 UX=US
      RETURN
      END
      SUBROUTINE CONSOL (N,NEXT,ICODE)
      COMMON /KEEP/ Z(8256)
      COMMON /WET/ X(128),Y(128)
C
C     ***************************************************************
C     * SOLVES A SPECIAL TRI-DIAGONAL LINEAR SYSTEM BY LU
C     * FACTORIZATION .
C     * TWO R.H.S. ARE GIVEN AND STORED IN Y, STARTING AT Y(1)
C     * AND AT Y(N+1) RESPECTIVELY .
C     * THE TWO CORRESPONDING VECTOR UNKNOWNS ARE RETURNED IN X,
C     * STARTING AT X(1) AND AT X(N+1) RESPECTIVELY.


C     * IF ICODE = 3 THEN Z ARRAY MUST BE KEPT RESERVED FOR
C     * USE BY CONSOL.
C     ***************************************************************
C
      IF (ICODE.EQ.3) GO TO 20
      Z(NEXT+1)=X(1)
      DO 10 K=2,N
      NEXTPL=NEXT+K
      K1=K-1
      X(K1)=1.0/Z(NEXTPL-1)
   10 Z(NEXTPL)=X(K)-X(K1)
   20 DO 30 K=2,N
      KPL=K+N
      KNEXT=NEXT+K-1
      Y(K)=Y(K)-Y(K-1)/Z(KNEXT)




   30 Y(KPL)=Y(KPL)-Y(KPL-1)/Z(KNEXT)
      NEXTPL=NEXT+N
      Y(N)=Y(N)/Z(NEXTPL)
      Y(2*N)=Y(2*N)/Z(NEXTPL)
      DO 40 J=2,N
      K=N-J+1
      KPL=K+N
      KNEXT=K+NEXT
      Y(K)=(Y(K)-Y(K+1))/Z(KNEXT)
   40 Y(KPL)=(Y(KPL)-Y(KPL+1))/Z(KNEXT)
      NEXT=NEXTPL
      RETURN
      END

Courant Institute of Mathematical Sciences

New York University

New York, New York 10012

1. E. BATSCHELET, "Über die numerische Auflösung von Randwertproblemen bei elliptischen partiellen Differentialgleichungen," Z. Angew. Math. Phys., v. 3, 1952, pp. 165-193.

2. J. H. BRAMBLE & B. E. HUBBARD, "Approximation of derivatives by difference methods in elliptic boundary value problems," Contributions to Differential Equations, v. 3, 1964, pp. 399-410.

3. J. H. BRAMBLE & B. E. HUBBARD, "A finite difference analog of the Neumann problem for Poisson's equation," SIAM J. Numer. Anal., v. 2, 1964, pp. 1-14.

4. O. BUNEMAN, A Compact Non-Iterative Poisson Solver, SUIPR Report No. 294, Inst. Plasma Research, Stanford Univ., May 1969, 11 pp.

5. B. L. BUZBEE, G. H. GOLUB & C. W. NIELSON, "On direct methods for solving Poisson's equations," SIAM J. Numer. Anal., v. 8, 1970, pp. 627-656.

6. L. COLLATZ, The Numerical Treatment of Differential Equations, 3rd ed., Mathematische Wissenschaften, vol. 60, Springer-Verlag, Berlin, 1966, 568 pp.

7. J. W. COOLEY, P. A. W. LEWIS & P. D. WELCH, "The finite Fourier transform," IEEE Trans. Audio and Electroacoustics, v. 17, 1969, pp. 77-85.

8. E. G. DJAKONOV, "On certain iterative methods for solving nonlinear difference equations," in Conference on the Numerical Solution of Differential Equations (J. Ll. Morris, Ed.), Lecture Notes in Math., vol. 109, Springer, Berlin, 1969, pp. 7-22.

9. F. W. DORR, "The direct solution of the discrete Poisson equation on a rectangle," SIAM Rev., v. 12, 1970, pp. 248-263.

10. T. ELVIUS & A. SUNDSTRÖM, "Computationally efficient schemes and boundary conditions for a fine-mesh barotropic model based on the shallow-water equations," Tellus, v. 25, 1973, pp. 132-156.

11. G. E. FORSYTHE & W. R. WASOW, Finite-Difference Methods for Partial Differential Equations, Wiley, New York, 1960, 444 pp.

12. D. FISCHER, G. GOLUB, O. HALD, C. LEIVA & O. WIDLUND, "On Fourier-Toeplitz methods for separable elliptic problems," Math. Comp., v. 28, 1974, pp. 349-368.

13. S. GERSCHGORIN, "Fehlerabschätzung für das Differenzverfahren zur Lösung partieller Differentialgleichungen," Z. Angew. Math. Mech., v. 10, 1930, pp. 373-382.

14. M. GHIL, "The initialization problem in numerical weather prediction," in Improperly Posed Boundary Value Problems (A. Carasso and A. P. Stone, Eds.), Research Notes in Math., vol. 1, Pitman, London, 1975, pp. 105-123.

15. M. GHIL, Initialization by Compatible Balancing, Report 75-16, Inst. Comp. Appl. Sci. Engr., Hampton, Virginia, 1975, 38 pp.

16. M. GHIL & B. SHKOLLER, "Wind laws for shockless initialization," Ann. Meteor. (Neue Folge), v. 11, 1976, pp. 112-115.

17. M. GHIL, B. SHKOLLER & V. YANGARBER, "A balanced diagnostic system compatible with a barotropic prognostic model," Mon. Wea. Rev., v. 105, 1977, pp. 1223-1238.

18. G. GOLUB, "Direct methods for solving elliptic difference equations," in Symposium on the Theory of Numerical Analysis (J. Ll. Morris, Ed.), Lecture Notes in Math., vol. 193, Springer-Verlag, Berlin, 1971, pp. 1-19.

19. J. E. GUNN, "The solution of elliptic difference equations by semi-explicit iterative techniques," SIAM J. Numer. Anal., v. 2, 1965, pp. 24-45.




20. B. GUSTAFSSON, "An alternating direction implicit method for solving the shallow water equations," J. Computational Phys., v. 7, 1971, pp. 239-254.

21. R. W. HOCKNEY, "A fast direct solution of Poisson's equation using Fourier analysis," J. Assoc. Comput. Mach., v. 12, 1965, pp. 95-113.

22. R. W. HOCKNEY, "The potential calculation and some applications," in Methods in Computational Physics (B. Alder, S. Fernbach and M. Rotenberg, Eds.), vol. 9 (Plasma Physics), Academic Press, New York, 1969, pp. 135-211.

23. W. E. LANGLOIS, Vorticity-Stream Function Computation of Incompressible Fluid Flow with an Almost-Flat Free Surface, IBM Research Report RJ 1794 (#26092), 1976, 8 pp.

24. H. LOMAX & E. D. MARTIN, "Fast direct numerical solution of the nonhomogeneous Cauchy-Riemann equations," J. Computational Phys., v. 15, 1974, pp. 55-80.

25. E. D. MARTIN & H. LOMAX, Rapid Finite-Difference Computation of Subsonic and Transonic Aerodynamic Flows, AIAA Paper No. 74-11, 1974, 13 pp.

26. E. D. MARTIN & H. LOMAX, Variants and Extensions of a Fast Direct Numerical Cauchy-Riemann Solver, with Illustrative Applications, NASA Tech. Note TN D-7934, 1977, 94 pp.

27. J. OLIGER & A. SUNDSTRÖM, "Theoretical and practical aspects of some initial-boundary value problems in fluid dynamics," SIAM J. Appl. Math. A, v. 35, 1978, pp. 419-446.

28. P. J. ROACHE, Computational Fluid Dynamics, 2nd ed., Hermosa Publishers, Albuquerque, 1976, 446 pp.

29. U. SCHUMANN & R. A. SWEET, "A direct method for the solution of Poisson's equation with Neumann boundary conditions on a staggered grid of arbitrary size," J. Computational Phys., v. 20, 1976, pp. 171-182.

30. P. N. SWARZTRAUBER, "A direct method for the discrete solution of separable elliptic equations," SIAM J. Numer. Anal., v. 11, 1974, pp. 1136-1150.

31. R. A. SWEET, "A generalized cyclic reduction algorithm," SIAM J. Numer. Anal., v. 11, 1974, pp. 506-520.

32. O. WIDLUND, "On the use of fast methods for separable finite-difference equations for the solution of general elliptic problems," in Sparse Matrices and Their Applications (D. J. Rose and R. A. Willoughby, Eds.), Plenum Press, New York, 1972, pp. 121-131.

33. O. WIDLUND, "Capacitance matrix methods for Helmholtz' equation on general bounded regions," in Numerical Treatment of Differential Equations (R. Bulirsch, R. D. Grigorieff and J. Schröder, Eds.), Lecture Notes in Math., vol. 631, Springer, Berlin, 1978, pp. 209-219.