MATHEMATICS OF COMPUTATION, VOLUME 33, NUMBER 146
APRIL 1979, PAGES 585-635
A Fast Cauchy-Riemann Solver
By Michael Ghil* and Ramesh Balgovind**
Abstract. We present a solution algorithm for a second-order accurate discrete form
of the inhomogeneous Cauchy-Riemann equations. The algorithm is comparable in
speed and storage requirements with fast Poisson solvers. Error estimates for the dis-
crete approximation of sufficiently smooth solutions of the problem are established;
numerical results indicate that second-order accuracy obtains even for solutions which
do not have the required smoothness. Different combinations of boundary conditions
are considered and suitable modifications of the solution algorithm are described and
implemented.
1. Introduction. Inhomogeneous Cauchy-Riemann equations appear naturally in
many fluid-dynamical problems, as the divergence and the vorticity equations of a
two-dimensional steady flow field (u, v) = (u(x, y), v(x, y)). The velocity components
u, v are usually called in this context primitive variables, in contradistinction to the
derived variables ψ, ζ in the stream function-vorticity formulation of the flow equations
(e.g., Roache [28]). In the latter formulation, the stream function ψ satisfies a
Poisson equation; and computations with this formulation have greatly benefited from
the rapid development of fast direct methods for the solution of Poisson's equation, or
Poisson solvers (Buneman [4], Buzbee, Golub and Nielson [5], Dorr [9], Fischer,
Golub, Hald, Leiva and Widlund [12], Golub [18], Hockney [21], [22], Widlund [32] ).
Working in the primitive variables, however, permits the treatment of more general
flows. Indeed, either nondivergence or irrotationality of the flow is required in
order to introduce a stream function ψ or a velocity potential φ, and obtain a Poisson
equation for them. There are many situations of practical interest in which neither of
these assumptions holds. Furthermore, the formulation of boundary conditions is
often easier in terms of the primitive variables, by using physical considerations which
arise naturally from the problem. On the other hand, a boundary condition on the
vorticity ζ, for instance, is at times hard to formulate (Langlois [23]); the construction
of appropriate discrete versions of such a boundary condition is often even more difficult
(Oliger and Sundström [27]). Hence the desirability of simple, physically meaningful
boundary conditions and, thus, of the use of primitive variables.
Lomax and Martin [24] have developed a fast Cauchy-Riemann solver and
Received April 10, 1978.
AMS (MOS) subject classifications (1970). Primary 65F05, 65N15, 65N20; Secondary
65N04, 65N05, 76B05, 86A10.
Key words and phrases. Fast direct solvers, Cauchy-Riemann equations, elliptic first-order
systems, transonic flow.
*The work of this author was supported in part by NASA, Grant No. NSG-5130.
**The work of this author was supported in part by NASA, Grant No. NSG-5034.
© 1979 American Mathematical Society
0025-5718/79/0000-0058/$13.75
License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use
applied it to a quasilinear problem in aerodynamics ([24], [25] ). Additional versions of
their solver are given in [26]. Our interest in the problem stems from a different applica-
tion, dealing with a two-dimensional version of the equations of dynamic meteorology
(Ghil [14], [15] ). The solver we present also differs from those of [24], [26] in a num-
ber of ways: the boundary conditions and their numerical treatment, the decoupling of
u, v and the reduction to a discrete Poisson problem, and finally the method of solution
of the resulting discrete Poisson equation are all different. We substantiate by numerical
results that the present solver is second-order accurate, even for solutions which do not
have the formally required degree of smoothness. The solvers of [24], [26] seem to be
only first-order accurate according to our numerical tests.
The intended application of our solver is to a fully nonlinear first-order system,
rather than a quasilinear one. This system is a generalization of a Monge-Ampère equation
[15], [17]. Eventually, we hope to apply the solver to cases where the nonlinear equations
are of mixed elliptic-hyperbolic type [16], [17]. Preliminary results are encouraging,
and we expect to pursue the nonlinear problem in a future publication.
The organization of the article is the following: Section 2 contains the description
in continuous and then in discrete form of the model problem of which we seek a fast
solution. Section 3 contains the derivation and description of the solution algorithm.
Section 4 presents numerical results for test computations with the model problem. Sec-
tion 5 presents modifications of the model problem arising from changes in boundary
conditions. Section 6 contains a comparison of results with the solvers of [24], [26].
Section 7 gives conclusions and a discussion of possible extensions and generalizations.
Finally, Appendix A presents an error estimate for the method, and Appendix B con-
tains a listing of the basic program.
Acknowledgements. It is gratifying to acknowledge useful discussions with Profes-
sors Eugene Isaacson and Olof Widlund. Numerical calculations were performed in part
on the CDC 6600 of the Courant Mathematics and Computing Laboratory, New York
University under Contract EY-76-C-02-3077 with the U. S. Department of Energy.
2. The Model Problem.
The Differential Equations. We wish to study the fast numerical solution of the
elliptic system of two first-order linear equations in two independent variables,
(2.1a) $u_x + v_y = d(x, y),$
(2.1b) $u_y - v_x = e(x, y).$
The dependent variables u, v can be thought of as velocity components in the x, y directions,
respectively. In this interpretation d, e are the divergence and vorticity of the
flow, which are assumed to be known. If d = 0 = e, (2.1) are the Cauchy-Riemann equations,
and u, v are analytic. We are interested in the inhomogeneous case, $|d| + |e| \not\equiv 0$,
and we concentrate on real-valued d, e, u, v, although the method is applicable with
minor changes to complex-valued functions as well.
We consider a rectangular domain R, taken without loss of generality to be R =
{(x, y): 0 ≤ x ≤ 2π, 0 ≤ y ≤ π}. The boundary conditions are that v is given on the
lower and upper side of the rectangle,
(2.2a) $v = v^{(0)}(x), \quad y = 0,$
(2.2b) $v = v^{(1)}(x), \quad y = \pi,$
and that both u and v are periodic in the x-direction,
(2.3a) $u(x + 2\pi, y) = u(x, y),$
(2.3b) $v(x + 2\pi, y) = v(x, y).$
The Gauss divergence theorem implies that $d(x, y)$, $v^{(0)}(x)$, and $v^{(1)}(x)$ have to satisfy
(2.3c) $\int_0^{2\pi}\int_0^{\pi} d(x, y)\,dx\,dy = \int_0^{2\pi} \{v^{(1)}(x) - v^{(0)}(x)\}\,dx.$
These conditions, together with (2.1), determine v completely and u up to an additive
constant (compare Ghil [15]). The latter indeterminacy in the solution can be elimi-
nated, for instance, by prescribing u at one arbitrary point (x0, y0) in the rectangle.
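The compatibility condition (2.3c) follows in one step from (2.1a); the short derivation below is a sketch that uses only the boundary conditions (2.2) and the periodicity (2.3a).

```latex
\begin{aligned}
\iint_R d \,dx\,dy
  &= \iint_R (u_x + v_y)\,dx\,dy \\
  &= \int_0^{\pi} \bigl[u(2\pi, y) - u(0, y)\bigr]\,dy
   + \int_0^{2\pi} \bigl[v(x, \pi) - v(x, 0)\bigr]\,dx \\
  &= \int_0^{2\pi} \bigl\{v^{(1)}(x) - v^{(0)}(x)\bigr\}\,dx ,
\end{aligned}
```

The first boundary integral vanishes because u is periodic in x, and the second is evaluated with (2.2a, b).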
The boundary conditions we use are associated with a standard channel-flow
problem in geophysical fluid dynamics (e.g., Elvius and Sundström [10], Gustafsson
[20] ), from which the nonlinear problem mentioned in Section 1 is derived (Ghil [15],
[17]). Extensions to different boundary conditions and to irregular domains will be
discussed in Sections 5 and 7.
The Difference Equations. The discretization of the problem we chose is to
approximate the derivatives in (2.1) by finite differences. Let U, V, D, E stand for
the mesh functions which approximate the continuous functions u, v, d, e; and let
h, k stand for the mesh size in the x, y-directions, respectively. It is natural to use
centered differences to replace the corresponding derivatives in (2.1). We write
(A) $U(x + \delta, y) - U(x - \delta, y) \cong 2\delta\, u_x(x, y),$
    $U(x, y + \epsilon) - U(x, y - \epsilon) \cong 2\epsilon\, u_y(x, y),$
for u and similar formulas for v; this yields a second-order accurate approximation of
the derivatives and allows us to expect that in some adequate norm $\|\cdot\|$,
$\|U - u\| + \|V - v\| = O(h^2) + O(k^2).$
The use of centered differences in a straightforward manner, on an unstaggered
mesh $x_i = ih$, $y_j = jk$, leads, however, to the existence of spurious null vectors (U, V),
i.e., to zero eigenvectors of the discrete matrix operator which approximates the dif-
ferential Cauchy-Riemann operator. To avoid dealing with these null vectors and to ob-
tain an invertible discrete matrix operator, we used a staggered mesh (Figure 1). Such
a mesh, suggested already by Lomax and Martin [24], can be formulated for Eqs. (2.1)
in a particularly efficient way.
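The spurious null vectors of the unstaggered centered scheme can be seen in a few lines. The sketch below is an illustration only: it assumes a doubly periodic grid (the model problem is periodic in x only), and the helper name `centered_cr` is hypothetical. It applies centered differences to a checkerboard mode, which is nonzero yet has zero discrete divergence and vorticity.

```python
import numpy as np

def centered_cr(U, V, h, k):
    """Centered differences for (2.1) on an unstaggered, doubly periodic grid.

    Axis 0 is x, axis 1 is y. Returns the discrete divergence and vorticity.
    """
    ddx = lambda F: (np.roll(F, -1, axis=0) - np.roll(F, 1, axis=0)) / (2 * h)
    ddy = lambda F: (np.roll(F, -1, axis=1) - np.roll(F, 1, axis=1)) / (2 * k)
    return ddx(U) + ddy(V), ddy(U) - ddx(V)

M, N = 8, 8                       # even, so the checkerboard is periodic
h, k = 2 * np.pi / M, 2 * np.pi / N
i, j = np.meshgrid(np.arange(M), np.arange(N), indexing="ij")
checker = (-1.0) ** (i + j)       # highest-wavenumber mode on the grid

# The mode is far from zero, but the centered operator does not see it.
div, curl = centered_cr(checker, checker, h, k)
print(np.abs(checker).max(), np.abs(div).max(), np.abs(curl).max())
```

Because each centered difference skips the central point, values at i - 1 and i + 1 of the checkerboard coincide and cancel; staggering removes this decoupling.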
Let u, v denote points at which U, V are defined, and let ·, × (for the vector
operators divergence and curl) denote the points at which the discrete versions (2.4a, b)
(see below) of equations (2.1a, b) are written and at which D, E are defined. Thus,
[Figure 1. The staggered mesh on the rectangle with corners (0, 0) and (2π, π), with mesh indices running up to (M, N).]
u, v alternate on diagonals of the mesh and so do ·, × on diagonals parallel to and alternating
with those of u, v; in other words, v, ×, u, ·, in clockwise direction, occupy
the corners of an elementary mesh cell of area hk/4 (see Figure 1). No averaging
of U, V is necessary in writing (2.4a, b) on this staggered mesh. Furthermore, the
boundary conditions (2.2) and the periodicity condition (2.3a, b) can be easily handled.
Indeed, let U, V be indexed independently, with y = 0, π being horizontal
V-lines corresponding to V-indices j = 0, N, and with x = 0, 2π being vertical V-lines
with V-indices i = 1, M + 1. Then the computational domain includes the points
$((i - 1)h, jk)$, at which $V_{i,j}$ is defined, and the points $((i - 1/2)h, (j - 1/2)k)$, at which
$U_{i,j}$ is defined; in other words, $V_{i,j}$ approximates $v((i - 1)h, jk)$, 1 ≤ i ≤ M + 1,
0 ≤ j ≤ N, and $U_{i,j}$ approximates $u((i - 1/2)h, (j - 1/2)k)$, 1 ≤ i ≤ M, 1 ≤ j ≤ N, with
h = 2π/M, k = π/N.
The discrete equations are centered:
(2.4a) $(U_{i,j} - U_{i-1,j})/h + (V_{i,j} - V_{i,j-1})/k = D_{i,j}, \quad 1 \le i \le M, \ 1 \le j \le N,$
(2.4b) $(U_{i,j+1} - U_{i,j})/k - (V_{i+1,j} - V_{i,j})/h = E_{i,j}, \quad 1 \le i \le M, \ 1 \le j \le N - 1;$
here $D_{i,j}$ approximates $d((i - 1)h, (j - 1/2)k)$, and $E_{i,j}$ approximates $e((i - 1/2)h, jk)$.
Notice that δ, ε in (A) have been replaced after staggering by h/2, k/2, rather than by
h, k. The boundary conditions become
(2.5a) $V_{i,0} = v^{(0)}((i - 1)h), \quad 1 \le i \le M,$
(2.5b) $V_{i,N} = v^{(1)}((i - 1)h), \quad 1 \le i \le M,$
and the periodicity condition becomes
(2.6a) $U_{0,j} = U_{M,j}, \quad 1 \le j \le N,$
(2.6b) $V_{1,j} = V_{M+1,j}, \quad 0 \le j \le N.$
Conditions (2.5), (2.6) leave M(2N - 1) unknowns, to wit, $U_{i,j}$, 1 ≤ i ≤ M, 1 ≤ j ≤ N,
and $V_{i,j}$, 1 ≤ i ≤ M, 1 ≤ j ≤ N - 1, while (2.4) yields M(2N - 1) linear algebraic
equations for them. We should expect from the situation in the continuous problem
that the matrix of the linear system (2.4) has a one-dimensional null space. The
resulting indeterminacy can be eliminated either by prescribing the value of $U_{i,j}$ at an
arbitrary index $(i_0, j_0)$, or by some alternative procedure which will arise naturally in
Section 3; the necessary discrete compatibility condition will also be discussed there.
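The count of unknowns and the expected one-dimensional null space can be checked numerically on a small mesh. The following sketch assembles the system (2.4) with (2.5), (2.6) as a dense matrix; the function name `assemble_staggered` and the ordering of unknowns (all $U_{i,j}$ first, then the interior $V_{i,j}$) are choices made here for illustration, not taken from the paper's program.

```python
import numpy as np

def assemble_staggered(M, N, h, k):
    """Dense matrix of (2.4) after applying (2.5)-(2.6); unknowns are U, then interior V."""
    nU, nV = M * N, M * (N - 1)
    # Paper-style 1-based indices; the modulo implements the periodicity (2.6).
    u = lambda i, j: (i - 1) % M + M * (j - 1)          # 1 <= j <= N
    v = lambda i, j: nU + (i - 1) % M + M * (j - 1)     # 1 <= j <= N - 1
    A = np.zeros((nU + nV, nU + nV))
    row = 0
    for j in range(1, N + 1):             # divergence equations (2.4a)
        for i in range(1, M + 1):
            A[row, u(i, j)] += 1 / h
            A[row, u(i - 1, j)] -= 1 / h
            if j <= N - 1:                # V_{i,N} is boundary data, not an unknown
                A[row, v(i, j)] += 1 / k
            if j >= 2:                    # V_{i,0} is boundary data, not an unknown
                A[row, v(i, j - 1)] -= 1 / k
            row += 1
    for j in range(1, N):                 # vorticity equations (2.4b)
        for i in range(1, M + 1):
            A[row, u(i, j + 1)] += 1 / k
            A[row, u(i, j)] -= 1 / k
            A[row, v(i + 1, j)] -= 1 / h
            A[row, v(i, j)] += 1 / h
            row += 1
    return A

M, N = 6, 4
A = assemble_staggered(M, N, 2 * np.pi / M, np.pi / N)
n = M * (2 * N - 1)                       # number of unknowns and of equations
rank = np.linalg.matrix_rank(A)
null_vec = np.concatenate([np.ones(M * N), np.zeros(M * (N - 1))])  # U = const, V = 0
print(A.shape, rank, np.abs(A @ null_vec).max())
```

The matrix is square of order M(2N - 1), its rank is one less than its order, and the vector with U constant and V zero spans the null space.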
3. Solution Algorithm. Our plan shall be to rewrite (2.4)—(2.6) in a convenient
block matrix form, to eliminate V, then to bring the remaining block matrix operating
on U to the form of a discrete five-point Laplacian, and finally to formulate a fast
direct algorithm to solve for U. After this, V obtains from U by a straightforward,
fast computation.
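Block Matrix Equation. We start by rewriting the discrete linear system (2.4)
in block matrix form. Let $\mathbf{U}_j = (U_{1,j}, U_{2,j}, \ldots, U_{M,j})^*$, $\mathbf{V}_j = (V_{1,j}, V_{2,j}, \ldots, V_{M,j})^*$,
where ( )* denotes (conjugate) transpose, so that $\mathbf{U}_j$, $\mathbf{V}_j$ are column vectors corresponding
to a horizontal mesh line. $\mathbf{D}_j$ and $\mathbf{E}_j$ are introduced in similar fashion. Also
let p = k/h. With this notation, (2.4) can be written as
(3.1a) $\mathbf{V}_j = \mathbf{V}_{j-1} - T\mathbf{U}_j + k\mathbf{D}_j, \quad 1 \le j \le N,$
(3.1b) $\mathbf{U}_{j+1} = \mathbf{U}_j - T^*\mathbf{V}_j + k\mathbf{E}_j, \quad 1 \le j \le N - 1;$
here $p^{-1}T$ is the familiar backward difference operator,
(3.2) $$T = p \begin{pmatrix} 1 & & & & -1 \\ -1 & 1 & & & \\ & -1 & 1 & & \\ & & \ddots & \ddots & \\ & & & -1 & 1 \end{pmatrix}_{M \times M},$$
acting on the M-periodic vectors $\mathbf{U}_j$, $\mathbf{V}_j$, and $-p^{-1}T^*$ is the corresponding forward
difference operator.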
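From (2.5) we have that $\mathbf{V}_0$ and $\mathbf{V}_N$ are known; accordingly, we redefine $\mathbf{D}_1$ as
$\mathbf{D}_1 + k^{-1}\mathbf{V}_0$ in the first one, and $\mathbf{D}_N$ as $\mathbf{D}_N - k^{-1}\mathbf{V}_N$ in the last one of the equations
(3.1a). After these changes of notation and a change of order, the vector equations
(3.1) can be put into the block matrix form
(3.3) $$
\left(\begin{array}{cccc|ccc}
-I & I & & & T^* & & \\
 & \ddots & \ddots & & & \ddots & \\
 & & -I & I & & & T^* \\ \hline
T & & & & I & & \\
 & T & & & -I & \ddots & \\
 & & \ddots & & & \ddots & I \\
 & & & T & & & -I
\end{array}\right)
\begin{pmatrix} \mathbf{U}_1 \\ \vdots \\ \mathbf{U}_N \\ \mathbf{V}_1 \\ \vdots \\ \mathbf{V}_{N-1} \end{pmatrix}
= k \begin{pmatrix} \mathbf{E}_1 \\ \vdots \\ \mathbf{E}_{N-1} \\ \mathbf{D}_1 \\ \vdots \\ \mathbf{D}_N \end{pmatrix},
$$
where I is the M × M identity matrix; the matrix has N and N - 1 block columns, acting on
the $\mathbf{U}_j$ and the $\mathbf{V}_j$, and N - 1 and N block rows. This form is roughly analogous to writing
(2.1a, b) in reversed order,
(2.1') $$\begin{pmatrix} \partial/\partial y & -\partial/\partial x \\ \partial/\partial x & \partial/\partial y \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} e \\ d \end{pmatrix}.$$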
We notice that the matrix in (3.3) has only four nonzero diagonals of M x M
blocks and that the blocks are at most scalar tridiagonal. Furthermore, all the non-
zero entries are ± 1 or ±p. In the present form, however, we cannot take full advan-
tage of the extreme sparsity and simplicity of the matrix in order to obtain a solution
method comparable to fast Poisson solvers.
Before bringing (3.3) to a more advantageous form we remark that its matrix
indeed has a one-dimensional null space: the sum of the MN rows in the lower half of
the matrix is zero. The corresponding compatibility condition that $\sum_{i,j} D_{i,j} = 0$ is the
discrete counterpart of (2.3c).
Decoupling of U and V. It is a well-known fact that each of the functions u, v
which satisfy (2.1) will also satisfy a Poisson equation obtained from (2.1) by eliminating
the other dependent variable by cross-differentiation. This suggests the attempt
to eliminate V in system (3.3) in order to obtain a linear algebraic system for U alone.
The matrix of this system will be similar to a discrete Laplacian, and to it we shall be
able to apply fast solution techniques.
The elimination of V proceeds as follows. In system (3.3) add the Nth block
row to the (N + 1)st, then the (N + 1)st to the (N + 2)nd and so on. This discrete
summation procedure is analogous to integration with respect to y in (2.1a). The lower
part of the system thus becomes
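(3.4) $$
\left(\begin{array}{cccc|ccc}
T & & & & I & & \\
T & T & & & & \ddots & \\
\vdots & & \ddots & & & & I \\
T & T & \cdots & T & 0 & \cdots & 0
\end{array}\right)
\begin{pmatrix} \mathbf{U}_1 \\ \vdots \\ \mathbf{U}_N \\ \mathbf{V}_1 \\ \vdots \\ \mathbf{V}_{N-1} \end{pmatrix}
= k \begin{pmatrix} \mathbf{D}_1 \\ \mathbf{D}_1 + \mathbf{D}_2 \\ \vdots \\ \sum_{j=1}^{N} \mathbf{D}_j \end{pmatrix}.
$$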
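After this transformation, (3.3) can be written as
(3.5) $$
\begin{pmatrix} P & Q \\ R & I_{N-1} \\ T \cdots T & 0 \cdots 0 \end{pmatrix}
\begin{pmatrix} \mathbf{U} \\ \mathbf{V} \end{pmatrix}
= \begin{pmatrix} \mathbf{E}' \\ \mathbf{D}' \\ k \sum_{j=1}^{N} \mathbf{D}_j \end{pmatrix},
$$
where $\mathbf{U} = (\mathbf{U}_1^*, \ldots, \mathbf{U}_N^*)^*$, $\mathbf{V} = (\mathbf{V}_1^*, \ldots, \mathbf{V}_{N-1}^*)^*$, $I_{N-1}$ is the M(N - 1) × M(N - 1)
identity matrix, the definitions of P, Q and E' are easily read from Eqs. (3.3), while
those of R and D' follow from (3.4). In particular, note that D' and E' contain the
factor k.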
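From (3.5) it follows that
(3.6a) $P\mathbf{U} + Q\mathbf{V} = \mathbf{E}',$
(3.6b) $R\mathbf{U} + \mathbf{V} = \mathbf{D}',$
where we omit for the moment the last vector equation. Substituting V from (3.6b)
into (3.6a), we obtain
(3.7) $(P - QR)\mathbf{U} = \mathbf{E}' - Q\mathbf{D}',$
where
(3.8) $$QR = \begin{pmatrix} TT^* & & & & 0 \\ TT^* & TT^* & & & 0 \\ \vdots & & \ddots & & \vdots \\ TT^* & TT^* & \cdots & TT^* & 0 \end{pmatrix},$$
with N - 1 block rows and N block columns, since Q is block diagonal with constant equal
blocks and T*T = TT*. Attaching now the last equation of (3.5) to (3.7) yields
(3.9) $$\begin{pmatrix} P - QR \\ T \cdots T \end{pmatrix} \mathbf{U} = \begin{pmatrix} \mathbf{E}' - Q\mathbf{D}' \\ k \sum_{j=1}^{N} \mathbf{D}_j \end{pmatrix},$$
or, changing sign in all but the last equation,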
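(3.9') $$
\begin{pmatrix}
TT^* + I & -I & & & \\
TT^* & TT^* + I & -I & & \\
\vdots & & \ddots & \ddots & \\
TT^* & TT^* & \cdots & TT^* + I & -I \\
T & T & \cdots & T & T
\end{pmatrix} \mathbf{U}
= k \begin{pmatrix}
T^*\mathbf{D}_1 - \mathbf{E}_1 \\
T^*(\mathbf{D}_1 + \mathbf{D}_2) - \mathbf{E}_2 \\
\vdots \\
T^*\sum_{j=1}^{N-1} \mathbf{D}_j - \mathbf{E}_{N-1} \\
\sum_{j=1}^{N} \mathbf{D}_j
\end{pmatrix}.
$$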
Let S = TT*. Notice that
(3.10) $S = p(T + T^*)$
or, in the familiar notation for tridiagonal matrices,
(3.10') $T = p(-1, 1, 0), \quad T^* = p(0, 1, -1), \quad S = p^2(-1, 2, -1);$
this is a slight extension of standard notation: because of periodicity these matrices are
actually circulants, rather than tridiagonal proper (cf. (3.2), (3.10)). Thus, S corresponds
to a central second difference, carrying further the analogy with the continuous
case. However, (3.9') is still not in a form suitable for fast solution.
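The relations among T, T* and S are easy to confirm numerically. The sketch below builds the circulant T of (3.2) for an arbitrary small M and an arbitrary value of p (both chosen here for illustration; the helper name `difference_operator` is hypothetical) and checks the backward and forward difference interpretations together with (3.10).

```python
import numpy as np

def difference_operator(M, p):
    """Circulant T of (3.2): (T x)_i = p * (x_i - x_{i-1}), indices taken modulo M."""
    I = np.eye(M)
    shift = np.roll(I, 1, axis=0)       # (shift @ x)_i = x_{i-1}
    return p * (I - shift)

M, p = 7, 0.5
T = difference_operator(M, p)
S = T @ T.T                             # T is real, so T* is the plain transpose
x = np.random.default_rng(0).standard_normal(M)

backward_ok = np.allclose(T @ x, p * (x - np.roll(x, 1)))
forward_ok = np.allclose(-T.T @ x, p * (np.roll(x, -1) - x))
identity_ok = np.allclose(S, p * (T + T.T)) and np.allclose(S, T.T @ T)
second_diff_ok = np.allclose(S @ x, p**2 * (2 * x - np.roll(x, 1) - np.roll(x, -1)))
print(backward_ok, forward_ok, identity_ok, second_diff_ok)
```

All four checks hold for any M and p, since circulant matrices commute and T + T* = p(2I - Z - Z*), where Z is the cyclic shift.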
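Discrete Poisson Equation for U. This form is now easily achieved. First, multiply
the last block row by T*. Then subtract the second block row from the first,
the third from the second, and so on up to and including the last one; the new last
row is obtained by adding all the new rows, from 1 to (N - 1), to the last row. Finally,
the new last row is taken as the first row of the system and all the signs are reversed,
yielding
(3.11) $$
\begin{pmatrix}
S + I & -I & & & \\
-I & S + 2I & -I & & \\
 & \ddots & \ddots & \ddots & \\
 & & -I & S + 2I & -I \\
 & & & -I & S + I
\end{pmatrix} \mathbf{U}
= k \begin{pmatrix}
T^*\mathbf{D}_1 - \mathbf{E}_1 \\
T^*\mathbf{D}_2 + \mathbf{E}_1 - \mathbf{E}_2 \\
\vdots \\
T^*\mathbf{D}_{N-1} + \mathbf{E}_{N-2} - \mathbf{E}_{N-1} \\
T^*\mathbf{D}_N + \mathbf{E}_{N-1}
\end{pmatrix} \equiv \mathbf{B};
$$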
the vector B is introduced for future notational convenience.
We notice that the matrix in (3.11) has almost, but not quite the form of a discrete
Laplacian (compare for instance Buzbee, Golub and Nielson [5], for Laplacian
with Dirichlet, Neumann and periodic boundary conditions). The matrix is symmetric
and nonnegative semidefinite; in fact, $U_{i,j}$ = const satisfies the homogeneous system.
The system would be positive definite if any one diagonal element actually
exceeded the sum of the off-diagonal elements in the same row (e.g., Collatz [6,
pp. 43-47]). This immediately suggests that we make use of the possibility of prescribing
$U_{i_0,j_0} = u_0$, i.e., that we increase the element at the position $Mj_0 + i_0$ along
the diagonal by an arbitrary amount α > 0, and add $\alpha u_0$ to the right-hand side of that
equation at the same time. We shall call the system thus modified (3.11a) for future
reference.
The system (3.11a) could now be solved by modifying any one of the different
methods or combination of methods available for Poisson's equation, as reviewed for
instance by Buzbee, Golub and Nielson [5], Dorr [9], or Widlund [32]. In this sense,
the elimination of V is analogous to a single step of odd/even reduction as proposed
by Hockney [21], [22], although the role of this step is more crucial in the present
context. We describe in the sequel the particular fast algorithm chosen to solve (3.11a).
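The reduction to the block tridiagonal system (3.11) can be verified with a manufactured discrete solution. In the sketch below (function name `poisson_form_check` and the storage of each horizontal line as one array row are assumptions made for illustration), random mesh functions U, V define D and E through (3.1); the right-hand side B of (3.11) is then compared with the block tridiagonal operator applied to U.

```python
import numpy as np

def poisson_form_check(M=8, N=5, seed=0):
    """Return both sides of (3.11) for random mesh functions U, V."""
    rng = np.random.default_rng(seed)
    h, k = 2 * np.pi / M, np.pi / N
    p = k / h
    I = np.eye(M)
    T = p * (I - np.roll(I, 1, axis=0))     # (3.2); real, so T* is T.T
    S = T @ T.T

    U = rng.standard_normal((N, M))         # U[j-1] holds U_j, j = 1..N
    V = rng.standard_normal((N + 1, M))     # V[j] holds V_j, j = 0..N

    D = (U @ T.T + V[1:] - V[:-1]) / k      # (3.1a) solved for D_j
    E = (U[1:] - U[:-1] + V[1:N] @ T) / k   # (3.1b) solved for E_j
    D[0] += V[0] / k                        # redefinition of D_1 and D_N
    D[-1] -= V[N] / k

    B = k * (D @ T)                         # k T* D_j
    B[:-1] -= k * E                         # - k E_j,     j = 1..N-1
    B[1:] += k * E                          # + k E_{j-1}, j = 2..N

    lhs = U @ S + 2 * U                     # (S + 2I) U_j
    lhs[0] -= U[0]                          # first and last diagonal blocks are S + I
    lhs[-1] -= U[-1]
    lhs[:-1] -= U[1:]                       # - U_{j+1}
    lhs[1:] -= U[:-1]                       # - U_{j-1}
    return lhs, B

lhs, B = poisson_form_check()
print(np.abs(lhs - B).max())
```

The two sides agree to rounding error, and V, including its boundary rows, drops out of the comparison, as the elimination requires.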
Fast Direct Solution for U. Essentially, our algorithm is based on the one described
in greater generality by Buzbee, Golub and Nielson [5] as matrix decomposition.
It relies on the fact that the eigenvalues $\lambda_k$ and eigenvectors $\mathbf{q}_k$ of S are known,
i.e., that
$S\mathbf{q}_k = \lambda_k\mathbf{q}_k,$
for
(3.12a) $\lambda_k = 2p^2(1 - \cos(2\pi(k - 1)/M)), \quad 1 \le k \le M,$
(3.12b) $\mathbf{q}_k = M^{-1/2}(1, e^{-2\pi i(k-1)/M}, \ldots, e^{-2\pi i(M-1)(k-1)/M})^*, \quad 1 \le k \le M,$
where i is the imaginary unit.
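The eigenpairs (3.12) and their link to the discrete Fourier transform can be checked directly. This is a sketch under stated assumptions: the eigenvector matrix is built with the exponent sign as printed in (3.12b) (either sign gives eigenvectors, since S is a symmetric circulant), and NumPy's orthonormal FFT stands in for the transform.

```python
import numpy as np

M, p = 8, 0.75                                    # illustrative values
I = np.eye(M)
T = p * (I - np.roll(I, 1, axis=0))
S = T @ T.T

m = np.arange(M)                                  # m = k - 1 in the paper's numbering
lam = 2 * p**2 * (1 - np.cos(2 * np.pi * m / M))  # eigenvalues (3.12a)
Q = np.exp(-2j * np.pi * np.outer(m, m) / M) / np.sqrt(M)   # column k-1 holds one eigenvector

x = np.random.default_rng(0).standard_normal(M)
eig_ok = np.allclose(S @ Q, Q * lam)              # S q_k = lambda_k q_k, column by column
unitary_ok = np.allclose(Q.conj().T @ Q, np.eye(M))
diag_ok = np.allclose(Q.conj().T @ S @ Q, np.diag(lam))
fft_ok = (np.allclose(Q.conj().T @ x, np.fft.ifft(x, norm="ortho"))
          and np.allclose(Q @ x, np.fft.fft(x, norm="ortho")))
print(eig_ok, unitary_ok, diag_ok, fft_ok)
```

With this sign convention, multiplying by the conjugate transpose of the eigenvector matrix is `np.fft.ifft(..., norm="ortho")` and multiplying by the matrix itself is `np.fft.fft(..., norm="ortho")`, so neither matrix ever needs to be formed in practice.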
Our algorithm deviates, however, from general matrix decomposition because the
first and last diagonal blocks in (3.11) differ from all the other blocks and because the
matrix (3.11), without suitable modification, is singular. For the sake of simplicity, we
shall describe the actual algorithm used directly in a self-contained manner.
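Let $\mathcal{Q}$ be the matrix whose columns are $\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_M$, and let $\Lambda =
\mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_M)$. Then, with our normalization of $\mathbf{q}_k$,
(3.13) $\mathcal{Q}^*\mathcal{Q} = I, \quad \mathcal{Q}^* S \mathcal{Q} = \Lambda.$
Introduce $\tilde{\mathbf{U}}_j$, $\tilde{\mathbf{B}}_j$ by
(3.14a) $\mathbf{U}_j = \mathcal{Q}\tilde{\mathbf{U}}_j,$
(3.14b) $\mathbf{B}_j = \mathcal{Q}\tilde{\mathbf{B}}_j, \quad 1 \le j \le N,$
where $\mathbf{B}_j$ are the subvectors on the right-hand side of (3.11).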
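Clearly, (3.14) corresponds to a discrete Fourier transform in the x-direction.
Notice in particular that
$\tilde{U}_{j,1} = \sum_{i=1}^{M} U_{i,j}/\sqrt{M},$
where $\tilde{U}_{j,1}$ denotes the first component of the vector $\tilde{\mathbf{U}}_j$.
Premultiplying (3.11) by a block-diagonal matrix with all the diagonal blocks
equal to $\mathcal{Q}^*$, and using (3.14), we obtain
(3.15) $$
\begin{pmatrix}
\Lambda + I & -I & & & \\
-I & \Lambda + 2I & -I & & \\
 & \ddots & \ddots & \ddots & \\
 & & -I & \Lambda + 2I & -I \\
 & & & -I & \Lambda + I
\end{pmatrix}
\begin{pmatrix} \tilde{\mathbf{U}}_1 \\ \tilde{\mathbf{U}}_2 \\ \vdots \\ \tilde{\mathbf{U}}_{N-1} \\ \tilde{\mathbf{U}}_N \end{pmatrix}
= \begin{pmatrix} \tilde{\mathbf{B}}_1 \\ \tilde{\mathbf{B}}_2 \\ \vdots \\ \tilde{\mathbf{B}}_{N-1} \\ \tilde{\mathbf{B}}_N \end{pmatrix}.
$$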
In order to turn the matrix in (3.15) from block-tridiagonal with diagonal blocks
into a tridiagonal matrix, it suffices to reorder the components of $\tilde{\mathbf{U}}$ and of $\tilde{\mathbf{B}}$ by vertical
lines rather than by Fourier transformed vectors of horizontal lines, i.e., let
(3.16) $\hat{U}_{k,j} = \tilde{U}_{j,k}, \quad \hat{B}_{k,j} = \tilde{B}_{j,k}, \quad 1 \le j \le N, \ 1 \le k \le M.$
Here again the first subscript denotes the vector partition and the second denotes the
component of the subvector; this notational convention is different from the mesh
notation introduced in Section 2 and used in (3.1). Using the reordering (3.16),
Eq. (3.15) becomes
(3.17) $-A_k\hat{\mathbf{U}}_k = \hat{\mathbf{B}}_k, \quad 1 \le k \le M,$
with
(3.18a) $$A_k = \begin{pmatrix} c_k & 1 & & & \\ 1 & a_k & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & a_k & 1 \\ & & & 1 & c_k \end{pmatrix}_{N \times N},$$
(3.18b) $a_k = -2 - \lambda_k = 2p^2[\cos(2\pi(k - 1)/M) - 1] - 2,$
(3.18c) $c_k = -1 - \lambda_k = 2p^2[\cos(2\pi(k - 1)/M) - 1] - 1, \quad 1 \le k \le M;$
here the $c_k$ come from the first and last diagonal block in (3.15).
It is easy to check that each Ak is nonsingular, except the first, which has a
one-dimensional null space. This reflects the fact that we have not yet utilized a side
condition eliminating U = const as a null vector of (3.11), or of (3.15). As in the
discussion of (3.11a), we can deal with the singularity of $A_1$ by prescribing any component
of $\hat{\mathbf{U}}_1$, $\hat{U}_{1,j}$ say; this means prescribing $\hat{U}_{1,j} = \tilde{U}_{j,1} = \sum_{i=1}^{M} U_{i,j}/\sqrt{M}$,
rather than one given mesh value $U_{i_0,j_0}$. Again we call the suitably modified systems
(3.15a), (3.17a) and (3.18a).
The LU factorization of the tridiagonal matrices in (3.18a) is performed next,
say $A_k = L_k\mathcal{U}_k$, storing the nontrivial entries of $L_k$ and $\mathcal{U}_k$. The diagonal elements
of $L_k$ and the superdiagonal elements of $\mathcal{U}_k$ are identically 1, while the subdiagonal elements
of $L_k$ and the diagonal elements of $\mathcal{U}_k$ are reciprocals of each other. Hence only
the diagonal elements of $\mathcal{U}_k$ in fact need to be stored.
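The storage remark can be made concrete with a short sketch. It assumes a tridiagonal matrix with unit sub- and superdiagonals, as in (3.18a), and no pivoting; the function names `factor` and `solve` and the sample values of N, M, p and k are illustrative choices, not the paper's program.

```python
import numpy as np

def factor(diag):
    """Diagonal of the upper factor for a tridiagonal matrix with unit off-diagonals."""
    u = np.array(diag, dtype=float)
    for j in range(1, len(u)):
        u[j] -= 1.0 / u[j - 1]           # the subdiagonal entry of L is 1 / u[j-1]
    return u

def solve(u, b):
    """Solve A x = b from the stored diagonal u alone."""
    y = np.array(b, dtype=complex)
    for j in range(1, len(u)):           # forward elimination, L y = b
        y[j] -= y[j - 1] / u[j - 1]
    x = np.empty_like(y)
    x[-1] = y[-1] / u[-1]
    for j in range(len(u) - 2, -1, -1):  # back substitution; unit superdiagonal
        x[j] = (y[j] - x[j + 1]) / u[j]
    return x

N, M, p, kk = 6, 16, 1.0, 3              # kk plays the role of the paper's index k
lam = 2 * p**2 * (1 - np.cos(2 * np.pi * (kk - 1) / M))
diag = np.full(N, -2.0 - lam)            # a_k of (3.18b)
diag[[0, -1]] = -1.0 - lam               # c_k of (3.18c)
A = np.diag(diag) + np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
b = np.arange(1, N + 1) + 1j * np.arange(N, 0, -1)
x = solve(factor(diag), b)
print(np.abs(A @ x - b).max())
```

Only one array of length N per wavenumber is kept, which is where the storage count for the factors in the summary comes from.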
Having computed $\mathbf{B}$ and thus $\hat{\mathbf{B}}$ from the original data, we can solve for $\hat{\mathbf{U}}$ and
hence for $\mathbf{U}$. The computation of $\tilde{\mathbf{B}}$ from $\mathbf{B}$ and of $\mathbf{U}$ from $\tilde{\mathbf{U}}$ involves matrix multiplication
by $\mathcal{Q}^*$ and by $\mathcal{Q}$, respectively. Since these correspond to a direct and an inverse
Fourier transform, they can be performed efficiently by a fast Fourier transform
(FFT) algorithm. The algorithm used was the one published by Cooley, Lewis and
Welch [7], modified to operate in the real.
Summary and Programming Considerations. The solution algorithm developed
in this section reduces in practice to the following:
(i) From the data of the problem D, E, compute
$\mathbf{B}_1 = k(T^*\mathbf{D}_1 - \mathbf{E}_1),$
$\mathbf{B}_j = k(T^*\mathbf{D}_j + \mathbf{E}_{j-1} - \mathbf{E}_j), \quad 2 \le j \le N - 1,$
$\mathbf{B}_N = k(T^*\mathbf{D}_N + \mathbf{E}_{N-1}),$
where $\mathbf{D}_1$, $\mathbf{D}_N$ include the boundary data $\mathbf{V}_0$, $\mathbf{V}_N$, respectively. The number of operations
is $n_1 = 4MN$.*** Notice that, after shifting of the blocks $\mathbf{E}_j$, B can overwrite
E, since E is no longer needed.
(ii) From the $\mathbf{B}$ above compute $\tilde{\mathbf{B}}$ (cf. (3.14b)) by
$\tilde{\mathbf{B}}_j = \mathcal{Q}^*\mathbf{B}_j, \quad 1 \le j \le N,$
using an FFT, with $n_2 = 2MN\log_2 M$ real operations. Although FFT algorithms exist
which use $MN\log_2 M$ operations for real transforms, we did not feel that in our application
actual execution time would be much improved by the utilization of such an
algorithm. In this step, $\tilde{\mathbf{B}}$ can be stored in the previous location of $\mathbf{B}$. Note that $\tilde{\mathbf{B}}_j$
is complex with pairwise complex conjugate components.
(iii) Solve (3.17a) for the $\hat{\mathbf{U}}_k$ by forward elimination and back substitution,
using the stored entries of the LU factors. One solution, for fixed k, requires 8N real
operations, since the complex conjugate components of $\hat{\mathbf{B}}$ enter nonsymmetrically into
the LU factorization of (3.17a), and we have to operate separately on the real and the
imaginary parts of $\hat{\mathbf{B}}$. However, the fact that the components of $\tilde{\mathbf{U}}_j$ are also complex
***All operation and storage counts are given to highest order. We do not distinguish between
addition, multiplication and division, and we do not assume that accumulation of products
is particularly efficient. Furthermore, we take h = k, or p = 1.
conjugate allows us to solve only M/2 + 1 systems, so that the operation count for this
step is $n_3 = 4MN$ real operations. Notice that no reindexing of $\tilde{\mathbf{B}}$ as indicated by
(3.16) is actually necessary, and that at this stage $\hat{\mathbf{U}}$ can overwrite $\hat{\mathbf{B}}$. The shifting of
$\hat{\mathbf{B}}$ elements and $\hat{\mathbf{U}}$ elements required to use the LU subroutine is relatively fast.
(iv) From the $\hat{\mathbf{U}}_k$ computed in (iii) obtain $\mathbf{U}$ (again with no need for reindexing)
by carrying out (3.14a) with another application of the FFT, in $n_4 = 2MN\log_2 M$ real
operations, and with $\mathbf{U}$ stored instead of $\tilde{\mathbf{U}}$.
(v) Compute V from U obtained in (iv), using (3.1a). This requires $n_5 = 4MN$
operations. Notice that in this recursion V does not depend on $\mathbf{U}_N$, $\mathbf{D}_N$, since $\mathbf{V}_N$ is
known. Using (3.1a) as a backward recursion would yield independence of V on $\mathbf{U}_1$,
$\mathbf{D}_1$, since $\mathbf{V}_0$ is also known. This redundancy, however, is only apparent and computations
with (3.1a) used as either forward or backward recursion yielded results with
the same accuracy, which only depended on the accuracy of U.
The final operation count is
$n = \sum_{k=1}^{5} n_k = 4MN(3 + \log_2 M),$
to obtain both U and V at all the grid points at which they are defined.
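As a worked check of the total, the five step counts can be added explicitly; the mesh size M = 256, N = 128 used in the last line is chosen only because it is the largest one appearing in the timing tables.

```latex
\begin{aligned}
n &= n_1 + n_2 + n_3 + n_4 + n_5 \\
  &= 4MN + 2MN\log_2 M + 4MN + 2MN\log_2 M + 4MN
   = 4MN(3 + \log_2 M), \\
M = 256,\ N = 128:\quad
n &= 4 \cdot 32768 \cdot (3 + 8) = 1\,441\,792 .
\end{aligned}
```

The two transforms account for the logarithmic term; the three remaining steps are linear in the number of points MN.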
The storage requirement for the L, U factors is $s_1 = MN/2$, using again the fact
that we only solve for half the number of components of $\hat{\mathbf{U}}$; the storage requirement
for the right-hand sides D, E, is $s_2 = 2MN$; and these locations are successively overwritten
until U, V are stored in them. Hence, the total storage requirement is
$s = s_1 + s_2 = 5MN/2$.
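We conclude this section with a diagram of the algorithm:
$$\mathbf{D}_j, \mathbf{E}_j, \mathbf{V}_0, \mathbf{V}_N \xrightarrow[4MN]{E} \mathbf{B}_j \xrightarrow[2MN\log_2 M]{FFT} \tilde{\mathbf{B}}_j \xrightarrow{R} \hat{\mathbf{B}}_k \xrightarrow[4MN]{LU} \hat{\mathbf{U}}_k$$
$$\xrightarrow{R} \tilde{\mathbf{U}}_j \xrightarrow[2MN\log_2 M]{FFT} \mathbf{U}_j \xrightarrow[4MN]{E} \mathbf{V}_j;$$
here E stands for evaluations by linear recursion, FFT for an application of the Fast
Fourier Transform, R for reindexing, and LU for factorization. The ranges of the indices
are 1 ≤ j ≤ N, 1 ≤ k ≤ M, and the operation counts are given under the corresponding
step of the algorithm.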
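Steps (i) to (v) can be sketched compactly with NumPy. This is an illustration under stated assumptions, not the program of Appendix B: the function name `cauchy_riemann_solve`, the array layout (one horizontal mesh line per row) and the parameters `mean_first_row` and `alpha` are hypothetical. The singular zero-wavenumber system is pinned in the manner of (3.11a), by adding `alpha` to one diagonal entry, which fixes the mean of U along the first horizontal line. The sketch solves all M tridiagonal systems and so does not exploit the complex conjugate symmetry mentioned in step (iii).

```python
import numpy as np

def cauchy_riemann_solve(D, E, V0, VN, h, k, mean_first_row=0.0, alpha=1.0):
    """Sketch of steps (i)-(v). D[j-1] = D_j (N x M), E[j-1] = E_j ((N-1) x M).

    Returns U (N x M) and V ((N+1) x M, boundary rows included).
    """
    N, M = D.shape
    p = k / h
    T = lambda X: p * (X - np.roll(X, 1, axis=1))        # T applied to each row
    Tstar = lambda X: p * (X - np.roll(X, -1, axis=1))   # T* applied to each row

    D = D.copy()
    D[0] += V0 / k                     # fold the boundary data into D_1 and D_N
    D[-1] -= VN / k

    # (i) right-hand side B of (3.11)
    B = k * Tstar(D)
    B[:-1] -= k * E
    B[1:] += k * E

    # (ii) Fourier transform along x (multiplication by the conjugate transpose)
    Bt = np.fft.ifft(B, axis=1, norm="ortho")

    # (iii) one tridiagonal system per wavenumber, all advanced together along axis 1
    lam = 2 * p**2 * (1 - np.cos(2 * np.pi * np.arange(M) / M))
    diag = np.tile(lam + 2.0, (N, 1))
    diag[0] -= 1.0
    diag[-1] -= 1.0
    rhs = Bt.copy()
    diag[0, 0] += alpha                # pin the singular zero-wavenumber system
    rhs[0, 0] += alpha * np.sqrt(M) * mean_first_row
    for j in range(1, N):              # forward elimination (off-diagonals are -1)
        diag[j] -= 1.0 / diag[j - 1]
        rhs[j] += rhs[j - 1] / diag[j - 1]
    Ut = np.empty_like(rhs)
    Ut[-1] = rhs[-1] / diag[-1]
    for j in range(N - 2, -1, -1):     # back substitution
        Ut[j] = (rhs[j] + Ut[j + 1]) / diag[j]

    # (iv) inverse transform
    U = np.fft.fft(Ut, axis=1, norm="ortho").real

    # (v) forward recursion (3.1a) for the interior rows of V
    V = np.empty((N + 1, M))
    V[0], V[N] = V0, VN
    V[1:N] = np.cumsum(k * D - T(U), axis=0)[: N - 1]
    return U, V
```

A natural way to exercise the sketch is to generate D and E from known mesh functions through (2.4), pass the mean of the known U along its first row as `mean_first_row`, and compare the returned arrays with the known ones.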
4. Numerical Results. In this section we shall present results of test computa-
tions for model problem (2.1)—(2.3). The results will provide evidence for the short
computation time required by the method, for its second-order accuracy and for its
insensitivity to errors in the data.
The experiments consisted in evaluating analytically d, e in (2.1) for known
u, v, and then comparing the numerical solutions U, V with the correct analytical
u, v. The computations were carried out on an IBM 360/95 computer, with a
FORTRAN IV, level H compiler, the code optimization parameter OPT being set to
the value OPT = 2. Double precision arithmetic was used throughout our numerical
experiments. Test comparisons on an Amdahl 470V/6 computer and on a CDC 6600
gave entirely similar results; the program is currently being developed into a package
and we expect to test it on other computers as well.
Initially, three different versions of the program were run. The first version
solved the positive definite modification (3.11a) of the system (3.11) by Cholesky
factorization. It is denoted by C in the tables, and given for orientation purposes
only. This version required much longer computer time; moreover, results were less
accurate than with the other two versions, since the larger number of arithmetic oper-
ations introduced more round-off error.
The second version utilized diagonalization by the discrete Fourier transform as
given in (3.14) and the reindexing given in (3.16), but did not use the FFT in steps
(ii) and (iv) of the algorithm. It is thus somewhat comparable to Hockney's original
method [21] and is also included for comparison purposes. Its numerical results were
equal at least to within three significant digits to those of the third version and are
listed separately only in Table III. This version is denoted by P in the tables.
Table I
CPU execution times for the versions C, P and F on IBM 360/95 with compiler
optimization OPT = 2. The number of points is MN = M²/2.
  M        C       P       F
 16      .07     .09     .01
 32      .88     .65     .04
 64    14.35    5.17     .13
128        -       -     .51
256        -       -    2.11
Table II
CPU execution times for version F with different optimization levels, OPT = 0,
1, 2, 3, on IBM 360/95, Amdahl 470V/6, and CDC 6600. The compiler used on
the CDC computer was FTN 4.6; on the other two computers, a FORTRAN IV,
level H compiler was used.
            360/95                    470V/6                    6600
OPT      0     1     2     3      0     1     2     3       0      1      2
  M
 16   0.02  0.02  0.01  0.00   0.02  0.02  0.01  0.01   0.059  0.031  0.025
 32   0.08  0.07  0.04  0.03   0.09  0.07  0.05  0.06   0.232  0.120  0.093
 64   0.35  0.20  0.13  0.13   0.38  0.30  0.23  0.24   0.969  0.494  0.382
128   1.40  0.88  0.51  0.52   1.53  1.25  0.96  0.96   4.065  2.034  1.615
256   5.91  3.70  2.11  2.13   6.52  5.46  4.07  4.14       -      -      -
The third version is the one actually described in the summary of Section 3 and
listed in Appendix B; it represents the proposed method. This version is denoted by F
in the tables.
Tables I and II contain timing results. These are essentially independent of the
solution, and depend only on the method, or program version, and on the number of
points. Table I compares CPU execution times for the three program versions C, P
and F on the IBM 360/95 computer with compiler optimization OPT = 2. As the
number of points increases, the advantages of version F become more and more obvious.
We can also see clearly that the execution time for version F is proportional to the
number of points MN.
Table II gives CPU execution times for version F with different optimization
levels, OPT = 0, 1, 2, 3, on the IBM 360/95, Amdahl 470V/6, and CDC 6600 comput-
ers. This is meant to give an idea of the order of magnitude of running time improve-
ments which could still be achieved by code optimization. We notice that timing with
OPT = 3 is not superior to OPT = 2.
The other tables are organized as follows: the first column contains the known
test solution (u, v), and for Table V only, its $L_2$ and $L_\infty$ norms. The norms used for
u are the continuous norms
(4.1a) $\|u\|_2^2 = \frac{1}{2\pi^2}\int_0^{\pi}\int_0^{2\pi} u^2(x, y)\,dx\,dy,$
(4.1b) $\|u\|_\infty = \sup_{0 \le x \le 2\pi,\ 0 \le y \le \pi} |u(x, y)|,$
and similarly for v. The corresponding $L_2$ and $L_\infty$ norms for the vector (u, v) are defined
in the obvious way, with $u^2 + v^2$ as the integrand in (4.1a) and max{|u|, |v|} as
the maximand in (4.1b). These norms are given in order to compare the corresponding
norms of the computational error with them.
The second column contains the version of the program used, specifically version
C or version P/F. The third column contains the value of M, the number of points in
the x-direction. All reported computations were carried out with equal mesh spacing
in the x- and y-directions, i.e., h = k = 2π/M. Hence, N = M/2 always, and the total
number of points equals M²/2.
The remaining columns contain the absolute errors in (u - U, v - V). The
norms used here are the discrete counterparts of the norms in the first column:
(4.2a) $l_2^2(U) = \sum_{i=1}^{M}\sum_{j=1}^{N} U_{i,j}^2/MN,$
(4.2b) $l_\infty(U) = \max_{1 \le i \le M}\max_{1 \le j \le N} |U_{i,j}|,$
and similarly for V and (U, V). For simplicity, in the column headings, as well as in
(4.2) above, we used $l_2(U)$, instead of $l_2(u - U)$, and so on. The grid values of (u, v)
entering (4.2) are those identified in Section 2 and, therefore, are taken at different
locations for u than for v.
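The discrete norms (4.2) amount to a root mean square and a maximum over the mesh. The sketch below evaluates them for an error field on the staggered U-points of Section 2; the "computed" field is a hypothetical perturbation of an exact solution, introduced only to have something to measure.

```python
import numpy as np

def l2_norm(err):
    """Discrete l2 norm of (4.2a): root mean square over all mesh values."""
    return np.sqrt(np.sum(np.abs(err) ** 2) / err.size)

def linf_norm(err):
    """Discrete l-infinity norm of (4.2b)."""
    return np.max(np.abs(err))

M = 32
N = M // 2
h = k = 2 * np.pi / M
x_u = (np.arange(1, M + 1) - 0.5) * h      # U is defined at ((i - 1/2)h, (j - 1/2)k)
y_u = (np.arange(1, N + 1) - 0.5) * k
X, Y = np.meshgrid(x_u, y_u, indexing="ij")

u_exact = np.sin(Y)
U = u_exact + 1e-6 * np.cos(X)             # hypothetical computed field, for illustration
err = u_exact - U
print(l2_norm(err), linf_norm(err))
```

For V the same two functions apply, with the exact v sampled at the V-points ((i - 1)h, jk) instead.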
Table III shows that for u, v which are combinations of linear and of trigonometric
functions, and thus eigenvectors of the discrete operator in (3.11), the results
are essentially exact to machine accuracy (in double precision arithmetic).
Table III
Numerical results with versions C, P and F of the algorithm for a number of simple
test cases. The labelling of rows and columns is explained in the text.
*_<« "«(V)
u = sin(y), v = y sin(x) 6351
4364
330S
1272
-14
-13
-12
-11
5457 -15
2929 -14
1543 -13
6296 -12
1.9989
1.7934
1.6889
1.5273
2.8144 -14
2.5881 -13
2.4685 -12
2.2408 -11
6.6613 -15
7.6605 -14
6.7857 -13
5.5140 -12
5286
2001
7386
4957
4851 -16
0922 -16
4532 -15
2021 -15
3.5100
5.1500
2.2120
1.1267
-16
-16
-15
-14
8.7430 -16
1.4017 -15
6.7862 -15
2.8847 -14
6.6613 -16
1.5543 -15
5.1070 -15
3.2488 -14
7088
3077
5606
4255
4397 -16
3889 -16
5149 -16
2937 -15
3.0436
5.1554
1.8946
1.0286
-16
-16
-15
-14
5.6287 -16
1.5821 -15
4.3438 -15
3.9314 -14
6.6613 -16
6.6613 -16
1.5682 -15
7.7934 -15
y sin (x) , v = sin (y) 3764
7252
4271
0832
5357 -15
2782 -14
9268 -13
5755 -12
2.5870
2.0024
1.6875
1.3481
-15
-14
-13
-12
4.2188 -15
3.5083 -14
2.9177 -13
2.2524 -12
6.4254 -15
5.8356 -14
5.0736 -13
4.2358 -12
8762
8400
1500
1996
-15
-15
-15
-14
5357 -15
9115 -15
9430 -14
7951 -13
2.1834
3.6143
2.7782
1.2690
4.6629 -15
6.0091 -15
2.7145 -14
7.9020 -14
9.7561 -15
2.5924 -14
3.2709 -13
2.0683 -12
0791
7749
6763
5572
-16
-16
-15
-15
9369 -16
0176 -14
3516 -13
0348 -12
2.0194
1.3785
9.4054
7.2592
6.6613 -16
8.8818 -16
7.1054 -15
9.7145 -15
5.6899 -16
3.1225 -14
2.0578 -13
1.4948 -12
y sin(x) + 0.001 ,
y4 + 0.001
2.2678
8.0236
6.6951
-14
-14
-13
1.4387 -14
8.2059 -14
7.8598 -13
1.9560 -14
8.1092 -14
7.2820 -13
5.3290 -14
1.9540 -13
1.6165 -12
3.5527 -14
2.4158 -13
2.3128 -12
1.1175
1.1179
4.1173
8.9470 -15
1.5689 -14
1.3525 -13
1.0279 -14
1.3750 -14
9.8620 -14
3.6193 -14
5.4622 -14
2.5047 -13
2.1316 -14
1.0303 -13
1.4211 -12
7.3201
1.6285
8.2406
1.1531 -14
2.5126 -13
7.6397 -13
9.3597 -15
1.7206 -15
5.3471 -13
1.4655 -14
3.1974 -14
1.8474 -13
3.1974 -14
5.2558 -13
1.4246 -12
Table IV presents results for slightly perturbed boundary conditions $v^{(0)}(x)$,
$v^{(1)}(x)$, $u(x_0, y_0)$ and right-hand sides $d(x, y)$, $e(x, y)$. Notice that errors in the
computed $V$ are bounded by the errors in the prescribed boundary data $v^{(0)}$, $v^{(1)}$ in the
absence of other errors. Also observe that the $l_2$ error in $U$ is larger when $d$, $e$ are
perturbed than when $v^{(0)}$ and $v^{(1)}$ are perturbed by a comparable amount.
Table V contains results for a number of more severe test cases, in which the
discretization error is nonnegligible. For comparison, an additional last column is included in
this table, which gives a theoretical error bound $e(u)$ on the $l_\infty$-discretization error in $u$.
This bound is explained and proved in Appendix A; the values of the constants $a$, $b$, $\alpha$ and
$\beta$ introduced there were chosen so as to optimize the bound, and the constants CR, Cr in
(A.14), (A.15) were computed explicitly. We observe that the numerical errors are indeed
lower than the bound in all cases in which $u$, $v$ are sufficiently differentiable. In fact, the
Table IV
Numerical results with the proposed algorithm (version F) for problems with
slightly perturbed boundary conditions $v^{(0)}(x)$, $v^{(1)}(x)$, $u(x_0, y_0)$ and right-hand
sides $d(x, y)$, $e(x, y)$. The subscript $(\ )_c$ stands for the correct values.
t-(V) ».("I *<v>
u = sin(x) sin(y), v = sin(4x) 1270
1732
2952
2417
1071
7.0594 -2
1.6540 -2
4.0377 -3
9.9953 -4
2.4878 -4
5.0665
1.2091
2.9790
7.4064
1.8473
4.1757
1.6793
5.2345
1.4537
3.8233
1.0958 -1
2.6040 -2
6.4284 -3
1.6021 -3
4.0020 -4
d s l.ld , e = l.le„8127 -2
6760 -2
4525 -2
4003 -2
3874 -2
1.4141 -1
8.1393 -2
6.6998 -2
6.3252 -2
6.2204 -2
1.0866 -1
6.9773 -2
6.0982 -2
5.8773 -2
5.8172 -2
1.0417 -1
1.0102 -1
1.0028 -1
1.0009 -1
1.0003 -1
2.1951 -1
1.2814 -1
1.0667 -1
1.0138 -1
1.0007 -1
v<°> = l.lv<°>, v(1» = l.lv'1' 8156 -3
4438 -2
8546 -2
9594 -2
9858 -2
7.7503
2.6059
1.8548
1.8582
1.9146
5.3018
2.0885
1.E547
1.9099
1.9507
6.2641
4.7086
7.5598
8.8792
9.4717
1.1061
6.1123
6.9934
8.2510
9.0694
u = sin(x) sin(y), v = sin(4x)
„"» = l.M v<°>. JV= l-Ol „<»
2.4564
9.1866
6.1478
5.5561
5.4252
1.9396
3.2585
7.3104
1.6720
1.9138
7.7675
2.3026
1.0333
7.2249
6.4446
7.1245
1.7228
4.9289
2.3682
2.0042
5.6013
1.7323
8.4702
6.4382
5.9547
5.0689
1.2210
3.4965
2.0472
1.9594
-2
-2
-3
-3
-3
-2
-2
-3
-3
-3
4.7747
2.4293
1.3836
1.0738
1.0026
3.8097
1.0437
2.8843
7.5755
9.1283
1.2057
3.6250
1.6453
1.1580
1.0367
1.0968
2.6091
8.8601
8.5085
9.1034
u = sin(x)sin(y), v = sin(4x)
d ■ l.Old , e = 1.01 e
V10>. 1.01 v¿°>. ,<». 1.01 ,«>
u(x0,y0) = 1.01 uc(x0,y0)
2.2784
7.7732
5.3611
5.0612
5.0134
7.8326 -2
2.3701 -2
1.1088 -2
8.0548 -3
7.3146 -3
5.6034 -2
1.7407 -2
8.6658 -3
6.7151 -3
6.2660 -3
4.4085 -2
1.7937 -2
1.0386 -2
1.0096 -2
1.0024 -2
1.2068 -1
3.6301 -2
1.6493 -2
1.1618 -2
1.0404 -2
u ■ sin(x) sin{y)
v = sin(4x) sin (4y)
3.2273
8.0409
2.0085
5.0203
1.2550
5.9183 -2
1.3515 -2
3.2789 -3
8.1045 -4
2.0164 -4
4.0498 -2
9.4190 -3
2.3045 -3
5.7193 -4
1.4258 -4
6.2089 -3
1.5927 -3
4.0074 -4
1.0035 -4
2.5096 -5
1.1072 -1
2.6172 -2
6.4545 -3
1.6082 -3
4.0171 -4
d = 1.01 dc
(0)v
u(x.,y.) = 11 u (*.n,yn)
8.2597
5.8123
5.2030
5.0508
5.0128
6.5120 -2
1.8814 -2
8.3918 -3
5.8582 -3
5.2234 -3
4.4892 -2
1.3738 -2
6.9572 -3
5.4663 -3
5.1188 -3
1.5891 -2
1.1513 -2
1.0381 -2
1.0096 -2
1.0024 -2
1.2183 -1
3.6434 -2
1.6519 -2
1.1625 -2
1.0406 -2
1.1 e-ldc
1.1 vvc
u(xn,y.) - l.lu (xn,y„)*0"0'c'-O'-'O'
5.3550
5.0885
5.0221
5.0055
5.0014
1.1855 -1
6.6507 -2
5.4407 -2
5.1287 -2
5.0418 -2
8.9935 -2
5.8963 -2
5.2323 -2
5.0670 -2
5.0216 -2
1.0302 -1
1.0079 -1
1.0020 -1
1.0005 -1
1.0001 -1
2.2179 -1
1.2879 -1
1.0710 -1
1.0177 -1
1.0044 -1
errors become very nearly equal to the bound in some such cases (e.g., $u = y \sin(x)$,
$v = y^2 \sin(x)$), which seems to indicate that the bound is rather close to being sharp. The
theoretical bound also provides an indication of error magnitude even in some cases
where the data are not sufficiently differentiable (e.g., $u = y^4 \sin(8x)$,
$v = x(2\pi - x) \sin(8y)$; $u = v = x(\pi - x)(2\pi - x) \sin(y)$).
TABLE IV (Continued)
»,(0) l,(U,V) »_(V)
i -= sin(x) sin (y)
, = sin(4x) sin(4y)
,(0) _ „(0)
,(1) „<1>
+ 0.1 sin(32x)
-l- 0.1 sin(32x)
6.7163
6.9041
3.5965
4.7732
5.3989
5.9392
3.6605 -2
6.3567 -2
2.7103
4.7378
u = sin(x) sin(y)
v = sin(4x) sin(4y)
„(0) „ v<°> + o.lsin(31x)
v(1) - v'1'* 0.1sin(31x)
3.2180
3.2461
7.4420
6.2894
7.0221
7.0895
4.3097
4.5354
3.7044
4.9065
5.3831
3.7981
6.4849
5.5055
6.0615
7.2440
8.1892
4.1426
5.2501
6.9534
1.7657 -1
9.5994 -2
1.8729 -2
2.7606 -2
4.8358 -2
u = sin(x) sin(y)
v = sin(4x)
v<0) = vt0) + 0.1 sin(31x)c
v(1) =. vll) + 0.1 sin(31x)
3.8439
3.2860
7.5512
6.8369
7.0226
8.0665
4.4139
5.1109
3.7503
4.9087
6.1843
3.8780
6.4665
5.5256
6.0626
1.1296
9.8528
4.6641
5.3952
6.9553
1.5208 -1
9.2006 -2
1.8363 -2
2.7601 -2
4.8357 -2
y sin(3x)
2 -x)
+ 0.1 sin(31x)
1 = x(2 -x) sin(3y)
,(D 0.1 sin(31x)
1.1961
2.9682
7.3896
1.9594
8.3894
5.7240
1.4061
3.3020
8.9053
5.3044
9.5706
2.3461
5.7534
1.5258
7.0244
3.6459
1.0109
2.6792
1.0993
8.3911
1.5927
4.1008 -1
1.0273 -1
3.2782 -2
4.8979 -2
y sin(4x)
2 -x)
+ 0.1 sin(31x)
v = x{2 -x) sin{4y)
,d> + 0.1 sin(31x)
2.2889
5.5349
1.3716
3.4841
1.1054
9.2522
2.1416
5.1032
1.3094
5.8169
1.7871
4.2463
1.0410
2.6397
8.8422
5.1558
1.7679
4.9607
1.6413
9.8365
2.8077
6.7504 -1
1.6534 -1
4.1227 -2
4.9620 -2
The formal truncation error of the finite-difference scheme, in the interior as
well as on the boundary, is $O(h^2 + k^2)$;† hence, we expect that the scheme will
produce second-order accurate results. A good way of testing this numerically is by
computing the ratio
$\|U^{(2h,2k)}\| / \|U^{(h,k)}\|$
and similar quantities for $v$ and $(u, v)$; here $\|\ \|$ is either the $l_2$ or the $l_\infty$ norm of
(4.2). For sufficiently small $h$, $k$ and twice continuously differentiable $(u, v)$ these
ratios should be very close to 4 if the method is indeed second-order accurate, and they
should be close to 2 if the method is first-order accurate.
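The ratio test translates directly into an estimate of the observed order of accuracy, since halving the mesh size divides an error of order $p$ by $2^p$. The following sketch uses synthetic errors of the form $C h^2$ rather than values from Table V; the constant 0.3 and the function name `observed_order` are assumptions made for illustration.

```python
import math

def observed_order(err_coarse, err_fine):
    """Order p estimated from errors on meshes (2h, 2k) and (h, k): ratio = 2**p."""
    return math.log2(err_coarse / err_fine)

# Synthetic second-order errors C * h^2 for M = 16, 32, 64, with h = 2*pi/M.
errors = {M: 0.3 * (2.0 * math.pi / M) ** 2 for M in (16, 32, 64)}
ratios = [errors[16] / errors[32], errors[32] / errors[64]]
orders = [observed_order(errors[16], errors[32]),
          observed_order(errors[32], errors[64])]
```

For these synthetic data the ratios equal 4 and the estimated orders equal 2, up to rounding; a first-order method would give ratios near 2 and orders near 1.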
The indicated ratios are computed and entered in additional rows in Table V; the
entry in column 3 indicates the values of $M$ and $2M$, rather than of $h$ and $h/2$, to
which the norms whose ratio was taken correspond. The interesting result is that for
those cases tested the method seems to have second-order accuracy even when $v$ is
merely continuous and $u$ once continuously differentiable and that it is first-order
accurate when both $u$ and $v$ are merely continuous. The continuity of $u$, $v$ is given in
standard notation as a column to the left of the usual first column; i.e., $u, v \in C^0$,
$C^1$, $C^2$, ..., $C^\infty$ indicate the number of continuous derivatives of $u$, $v$. Continuity is
understood for $u$, $v$ and their derivatives as extended x-periodic functions.
† A more detailed discussion of accuracy appears in Appendix A.
Table V
Numerical results indicating the order of accuracy of the proposed algorithm (ver-
sion F) and discretization procedure.
if") *.(v> e(u)
CCc°°
u = y sin(x)
v = y2sin(x)
lul^ 3.1416
Ivl = 9.8696
16
32
64
4.015
1.049
2.657
6.664
2.879
7.043
1.722
4.248
3.573 -2
9.049 -3
2.254 -3
5.607 -4
8.041
2.794
8.124
2.193
4.879
1.315
3.414
8.566
4.455 -1
1.114 -1
2.784 -2
6.961 -3
8/16
16/32
32/64
3.826
3.950
3.987
4.088
4.090
4.054
3.948
4.016
4.019
2.878
3.440
3.706
3.711
3.851
3.986
u«C
Uc"= y sin(4x)
= y sin(4x)
ul,= 1.2825
lul^» 3.1416
Ivl ■ 9.8696
16
32
64
128
1.7509
4.326
1.083
2.7087
2.5573
6.1881
1.5133
3.752
2.1684 -1
5.292 -2
1.3123 -2
3.268 -3
4.9751
1.573
4.6187
1.2895
5.5634
1.4699
3.6885
9.2309
1.426 +1
3.564
8.910 -1
2.227 -1
16/32
32/64
64/128
4.066
3.996
3.997
4.152
4.069
4.033
4.097
4.033
4.015
3.169
3.407
3.582
3.805
3.964
3.996
kiec
|vec°°lui 2=
Ivl 2=
Ivl =
y sin(8x)
y2sin(8x)
1.2825
3.1210
3.1416
9.8696
16
32
64
128
256
1.0235
1.6710
4.0218
9.9941
2.4956
1.0575
2.973
7.088
1.7449
4.3369
7.579 -1
2.3912 -1
2.739 -2
1.419 -3
3.835 -3
1.6347
5.6803
1.7565
5.0463
1.4018
1.1188
7.6216
1.856
4.6501
1.1612
2.103 +2
5.257 +1
1.314 +1
3.286
9.215 -1
16/32 6.125
4.155
4.024
4.005
3.5571
4.194
4.062
4.023
3.141
4.1663
4.045
4.015
2.878
3.234
3.481
3.600
1.467
4.106
3.992
4.004
use
vec"lui 2=
Ivl 2=
Ivl -
y sin(16x)
y2sin(16x)
1.2825
3.1210
3.1416
9.8696
16
32
64
128
256
5.8205 +1
1.0321
1.5734 -1
3.7506 -2
9.2864 -3
1886
4585
2060
6097
8749
4.7763 +1
7.4243 -1
2.5130 -1
5.9846 -2
1.4774 -2
16/32
32/64
64/128
128/256
5.6397 +1
6.5596
4.1950
4.0388
8415
7026
2131
0587
6.4334 +1
2.9544
4.1991
4.0499
3081
7140
0606
8544
2893
0451 +1
6038 -2
8298 -1
1426 -1
3174 -2
3.290 +3
.224 +2
2.056 +2
5.140 +1
1.285 +1
6319
8280
2682
5060
0788 +3
3465 -2
1210
0294
hr£C
lul.
Ivl
sin(x)sin(y)
sin(4x)sin(4y)
0.5
0.5
1.0
1.0
8
16
32
64
1.306 -2
3.227 -3
8.0409 -4
2.0085 -4
1429
198
3515
2789
9.8921 -3
4.0498 -2
9.419 -3
2.3045 -3
2339
2089
5927
0074
2730 -1
1072 -1
6170 -2
4545 -3
268 -1
3.171 -2
27 -3
1.982 -3
8/16
16/32
32/64
4.0548
4.0135
4.0034
628
3790
1219
152.443
4.2996
4.0872
5980
8982
9745
76 -15
2305
0548
y sin(x)
x(2ir-x) sin(y)
22.9595
5.0966
97.4091
9.8696
1
16
32
64
8/16
16/32
32/64
3.586 -1
9.332 -2
2.36 -2
5.92 -3
33
509
67
03
4.95
1.24
3.06
7.61
44
49
821
617
168
14 -1
89 -2
984 -2
7.898
1.975
937 -1
1.234 -1
3.8424
3.9541
3.9880
1959
1150
0603
4.0077
4.0346
4.0244
8922
9734
7221
9793
9761
Table VI contains a study of the number of grid points per wave length which
the method necessitates for given numerical accuracy. The results are given here as
relative errors, $l_2(u - U)/\|u\|_2$, rather than as absolute errors, $l_2(u - U)$, and similarly
for $v$ and for $l_\infty$. It seems that roughly 4 points per wave length will give $10^{-1}$
relative error, 8 points will give $5 \times 10^{-2}$, and 16 will give $10^{-2}$. We notice again that
if $u$ oscillates less than $v$, the error in $u$ will be considerably smaller than that in $v$.
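The resolution rule of thumb can be turned into a small helper. The sketch below assumes a component of the form $\sin(mx)$ on $[0, 2\pi]$ sampled at $M$ equally spaced points, so that there are $M/m$ points per wave length, and it restricts $M$ to powers of 2; both function names are hypothetical.

```python
def points_per_wavelength(M, wave_number):
    """Mesh points per wave length of sin(wave_number * x) on [0, 2*pi] with M points."""
    return M / wave_number

def smallest_M(wave_number, points_needed):
    """Smallest power of 2 giving at least `points_needed` points per wave length."""
    M = 1
    while M < wave_number * points_needed:
        M *= 2
    return M

ppw = points_per_wavelength(128, 8)   # resolution of sin(8x) with M = 128
M_needed = smallest_M(8, 16)          # M for about 16 points per wave length of sin(8x)
```

With 16 points per wave length, the rule quoted above suggests a relative error of roughly $10^{-2}$.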
TABLE V (Continued)
l,(U) £,(U,V) i (U) *„<v> e(u)
lui 2=
Ivl 2=
,U,«T
Ivl =
y sin(8x)
x(2tt-x) sin(8y)
22.9595
5.0966
97.4091
9.8690
16
32
64
128
256
1.732 +1
2.459
5.853 -1
1.446 -1
3.604 -2
3.8587
7.1378 -1
1.636 -1
3.989 -2
9.892 -3
1.292 +1
1.8352
4.3267 -1
1.064 -1
2.647 -2
3.932 +1
6.2285
2.0035
5.3048 -1
1.3449 -1
7.2614
2.1761
5.130 -1
1.2653 -1
3.165 -2
3.856 +1
9.640
410
6.025 -1
1.506 -1
16/32
32/64
64/128
128/256
7.0428
4.201
4.048
4.012
5.406
4.363
4.101
4.032
7.040
4.242
4.0656
4.020
312
109
777
944
3.337
4.242
4.055
3.997
lui 2=
Ivl 2=
Ivl -
y sin(16x)
x{2tt-x) sin (16y)
22.9595
5.0966
97.4091
9.8690
16
32
64
128
256
4.66
1.816
2.5178
5.964
1.471
1.174
2.241
6.1677
1.4347
3.518
16/32
32/64
64/128
128/256
2.566
7.212
4.221
4.053
5.236
3.634
4.299
4.078
8.7105
1.3108
1.8458
4.353
1.071
6.630
7.118
4.241
4.061
4677 +1
6798 +1
884
1700
7139 -1
2.055 +2
4.5798
1.783
4.106 -1
1.006 -1
8.067
2.017
5.042
1.260
3.151
3~86~
781
172
798
4.487
2.568
4.343
4.080
Ivl „=
lut, =
y sin{32x)
x(2n-x)sin(32y)
5.0966
22.9595
9.8696
97.4091
16
32
64
128
256
9.32
9.176
1.843
2.535
5.997
7.511
2.901
1.2121
5.813
1.3647
5.176
2.123
1.316
1.8459
4.356
2935 +2
277 +2
092 +1
2431
2600
1.564 +3
7.936 +2
2.7678
1.4956
3.4708 -1
1.467 +4
3.669 +3
9.172 +2
2.293 -.3
732 +1
16/32
32/64
64/128
128/256
1.0157
4.9795
7.2681
4.228
2.5888
2.39
2.085
4.259
2.4378
1.613
7.1300
4.237
0128
5081
030
205
1.9709
2.867 +2
1.851
4.309
u = x(tt-x) (2ir-x)sin y
v = x{ïï-x)(2ïï-x)sin y
16
32
64
128
25616/32
32/64
64/128
128/256
1.4431
3.6692
9.2269
2.3104
5.77833.9330
3.9766
3.9937
3.9984
9.7688
2.4499
6.1387
1.5368
3.8452
1.2474
3.1389
7.8604
1.9651
91163.9874
3.9910
3.9943
3.9968
3.9739
3.9933
4.0000
4.0002
3.4316
9.5067
2.4802
6.3163
1.5928
2.4159 -1
6.3250 -2
1.6099 -2
4.0366 -3
1.0101 -33.6097
3.8330
3.9267
3.9656
3.8196
3.9287
3.9884
3.5960
3.784 -1
9.461 -2
2.365 -2
5.913 -3
1.478 -3
u ■ x(it-x) (2,r-x)sin y
v = x(2n-x)sin y
lul„= 11.9343
Ivl " 9.8696
16
32
128
25616/32
32/64
64/128
128/256
1.3883
3.4733
8.6870
2.1720
5.4303
1.2207
1.9011
4.8190
1.2.1.33
3.0443
1.3127
2.8241
7.0539
1.7629
4.4065
3.0675
8.3754
2.1836
5.5692
1.4059
2.5198 -1
6.0616 -;
1.5363
3.8529 -3
9.6442 -
3.784 -1
9.461 -2
2.365 -2
5.913 -3
1.478 -33.9970
3.9983
3.9995
3.9999
6.4210
3.9450
3.9717
3.9856
4.6484
4.0036
4.0014
4.0006
3.6625
3.8357
3.9207
3.9613
4.1570
3.9456
3.9873
3.9950
= x(2tt-x) sin y
= x(2n-x)sin y
lui
16
32
64
128
256
16/32
32/64
64/128
128/256
4.6542
2.3271
1.1639
5.8198
2.9100
5.0733
2.3676
1.2030
6.0629
3.04342.0000
1.9994
1.9999
2.0000
2.1428
1.9680
1.9842
1.9922
4.8543
2.3468
1.1833
5.9417
2.9772
1.0241
5.6744
2.9686
1.5144
7.64272.0685
1.9832
1.9916
1.9958
1.8647
1.9115
1.9603
1.9814
1.1306
5.7633 -1
2.9230 -1
1.4679 -1
7.3474 -21.961S
1.9717
1.9913
1.9978
1.565 -1
3.912 -2
9.780 -3
2.445 -3
6.112 -4
These conclusions are also supported by some of the results in Table V. It is
interesting that experiments with solutions containing odd wave numbers give results which
are only slightly worse than those for even wave numbers, if at all; in other words,
using $M$, $N$ which are powers of 2 is not detrimental to accuracy, even when odd
wave numbers are present in the solution.
TABLE V (Continued)
i (U) 1_(V) e(u)
2 2u=x (2tt-x) sin (y)
4 4v - x (2n-x) sin(y)
lul_ ■ 97.4091
Ivl = 9488.531
8
16
32
64
1.6926 +1
3.5778
8.674 -1
2.155 -1
1.2066 +2
2.7330 +1
6.5176
1.614
8.0018 +1
1.8850 +1
4.6131
1.4278/ié
16/32
32/64
4.7319
4.1236
4.0258
4.4148
4.1593
4.0706
4.2446
4.0867
4.0370
3.3808 +1
8.311
2.009
4.9799 -14.0678
4.1368
4.0344
2.909 +2
7.162 +1
1.783 +1
4.45384.0618
4.0160
4.0040
2.310 +1
5.774
1.443
3.609 -1
u£C
vec°
uSC
uec"
u=x (rr-x) (2ïï-x) sin (y)
v - 0
16
32
64
128
256
2.1348
4.7264
1.0805
2.5536
6.1632
9.5464
2.3985
6.0150
1.5065
3.7699
1.5604
3.3996
7.7121
1.8158
4.3888
4.2730
1.1174
2.8579
7.2238
1.8161
2.4159 -1
6.3250 -2
1.6099 -2
4.0366 -3
1.0101 -3
3.784 -1
9.461 -2
2.365 -2
5.913 -3
1.478 -3
16/32
32/64
64/128
128/256
4.5167
4.3743
4.2313
4.1299
3.9800
3.9877
3.9927
3.9961
4.5899
4.4081
4.2471
4.1375
3.8241
3.9106
3.9554
3.9777
3.8196
3.9287
3.9884
3.9960
u = x(2tt-x) sin(y)
A
v = y sin(x)
lul^= 9.8696
Ivl = 97.4091
16
32
64
128
256
16/32
32/64
64/128
128/256
1.1787
4.9757
2.3681
1.169
5.826
2.9107
2.1011
2.0257
2.0065
2.0016
1.0090
4.889
2.433
1.2179"
6.0988
3.052
2.0093
1.9978
1.9972
1.9982
1.1091
4.936
2.3999
1.193
5.963
2.982
2.0566
2.0110
2.0013
1.9996
2.8276
1.3147
6.4005
3.1481
1.559
7.754
1.9799
1.1202
5.7897 -1
2.9311 -1
1.4697 -1
7.352 -2
2.0540
2.0332
2.0194
2.0105
1.9349
1.9752
1.9944
1.9990
1.265
3.162
7.904
1.976
4.940
1.235
In particular, these results also show that the present solver would perform very
well on linearized versions of the original geophysical fluid dynamic problem we were
interested in ([16], [17]). We shall return to this point in Section 7.
5. Changes in Boundary Conditions. In Section 2 we have formulated the model
problem (2.1)-(2.3) which motivated this study. The algorithm of Section 3 has obvious
applications to many other situations; it is of interest, therefore, to consider a
number of different boundary conditions which could be associated with the Cauchy-Riemann
equations (2.1) in a rectangle.
We shall assume throughout this section that the boundary conditions on $y = 0$,
$\pi$ are still (2.2), i.e., $v$ is prescribed there as $v^{(0)}(x)$ and $v^{(1)}(x)$, respectively. The two
different combinations of boundary conditions we consider explicitly are: (1) that $u$ is
given on the left boundary of the rectangle and $v$ is given on the right boundary, and
(2) that $u$ is given on both vertical sides.
It is clear that if $v$ is given on all sides, the problem should be formulated as
Poisson's equation for $v$ with Dirichlet boundary conditions; similarly if $u$ is given on
all the sides, a Dirichlet problem for $u$ is more suitable. A moment's thought will
show that the two situations we shall discuss can easily be transformed into a
considerable number of others, by reflections or by interchanging the roles of $x$ and $y$,
and of $u$ and $v$. In fact, all situations in which $u$ as well as $v$ are prescribed on some
of the sides of the rectangle can be handled by slight modifications of the algorithms
we present, yielding second-order accurate numerical solutions.
Table VI
Numerical results indicating the resolution (mesh points per wave length) required
by the proposed discretization procedure and solution algorithm (versions C and
F) to obtain prescribed accuracy.
t,(U)/lul. i2(V)/lvl2 H^lUl/lul,
u = y sin(x) v = y sin(x) 6.6212 -4
5.1958 -4
1.3611
1.3611
8.6548
6.9789
u = y sin(4x),
u * y sin(8x),
u = y sin(64x),
v = y sin(4x)2
v = y sin(8x)2
v = y sin(64x)
8.444 -3
3.1358 -2
1.9450 +2
8.0699 -1
1.1422 -1
4.849 -3
2.271 -2
4.4180 +1
4.4600 -3
1.0867 -1
1.47
5.59
2.0070
5.6440
2.0238
u = sintx)sin(y) , v = sin(4x)sin(4y) 4.1000
4.0170
6.5580
6.5578
4.4250
4.0074
u » y sin(x),
u = y sin(8x),4
u » y sin(16x),4
u = y sin{32x),4
u - y sin(64x),
v = x(2n-x)sin(y)
v = x(2ir-x)sin(8y)
v ■= x(2ir-x)sin(16y)
v ■ x(2ir-x)sin(32y)
v « x(2Tf-x)sin(64y)
4u * y sin(128x), v - x(2tt-x) sin(128y)
u « x(2ir-x)sin(y), v = y sin(x)
3.0218 -4
2.5784 -4
2.5490 -2
1.097 -1
8.027 -1
7.962
8.0577 -1
1.1064 -1
1.592 +1
1.5906 +1
1.0707 -1
3.394 -2
2.2937 -2
1.7716 -3
1.7716 -3
3.2100 -2
1.210 -1
2.378 -1
1.2598 +2
1.2376 -1
1.1180 -1
6.956 +2
2.6429 -12
6.514 -2
5.270 -3
5.270 -3
1.3592
9.8728
2.0570
7.067
5.227
2.617
5.4615
7.6292
5.234
5.3360
5.5526
4.104
3.1896
u - x2(2ir-x)2sin(y) , y - x4 (2ir-x) 4sin(y)
u ■ x{tt-x) (2n-x)sin{y) , v ■ 0
1.121 -2
4.9081 -3
1.785 -2
1.2937
1.2937
9.4755
5.1124
5.267
u - y"sin(3x) v = x(2rr-x)sin(3y) 5.2079
1.2851
3.2022
7.9989
1.9993
1.1205
2.6395
6.4495
1.5969
3.9748
3.6800
9.6328
2.4339
6.0900
1.5225
u ■ y sin(5x) v = x(2rr-x)sin{5y) 1.6604
3.8882
9.5680
2.3826
5.9508
2.7604
5.9866
1.4350
3.5362
8.7915
1.2079
3.1351
7.8489
1.9858
4.9629
u = y sin(7x) v ■ x(2ir-x) sin(7y) 3.7455
8.0104
1.9318
4.7869
1.1941
5.7315
1.0879
1.9621
6.2087
1.5410
2.7813
6.5915
1.6605
4.1578
1.0382
u - y sin(9x) , v = x{2ir-x) sin(9y) 7.4843
1.3893
3.2567
8.0160
1.9963
1.1403
1.7668
3.9733
9.6458
2.3893
5.5451
1.1704
2.8390
7.1369
1.7829
We proceed now with the description of the algorithm for the two cases mentioned.
Case 1. u Given on the Left Side. The rectangular domain is now taken as
$R_1 = \{(x, y): -h/2 \le x \le 2\pi, \ 0 \le y \le \pi\}$. This is merely done for notational
convenience, so as to leave Figure 1 unchanged. The boundary conditions are first (2.2),
which we repeat here as
(5.1a) $v = v^{(0)}(x), \quad y = 0,$
(5.1b) $v = v^{(1)}(x), \quad y = \pi,$
and also
(5.2a) $u = u_{(0)}(y), \quad x = -h/2,$
(5.2b) $v = v_{(1)}(y), \quad x = 2\pi.$
Thus Eqs. (5.2) replace the periodicity conditions (2.3a, b). Eqs. (2.1), together with
(5.1), (5.2), completely determine $u$, $v$, and $u$ need not and should not be prescribed
any more at an interior point of $R_1$.
The difference equations are still (2.4), which we repeat for convenience as
(5.3a) $(U_{i,j} - U_{i-1,j})/h + (V_{i,j} - V_{i,j-1})/k = D_{i,j}, \quad 1 \le i \le M, \ 1 \le j \le N,$
(5.3b) $(U_{i,j+1} - U_{i,j})/k - (V_{i+1,j} - V_{i,j})/h = E_{i,j}, \quad 1 \le i \le M, \ 1 \le j \le N-1.$
The boundary conditions become
(5.4a) $V_{i,0} = v^{(0)}((i - 1)h), \quad 1 \le i \le M,$
(5.4b) $V_{i,N} = v^{(1)}((i - 1)h), \quad 1 \le i \le M,$
and
(5.5a) $U_{0,j} = u_{(0)}((j - 1/2)k), \quad 1 \le j \le N,$
(5.5b) $V_{M+1,j} = v_{(1)}(jk), \quad 0 \le j \le N.$
Hence, there are $MN$ interior $U$-values to be determined, and $M(N - 1)$ interior
$V$-values, or $M(2N - 1)$ unknowns altogether. Eqs. (5.3) yield $M(2N - 1)$ linear algebraic
equations; we shall see that these equations are actually independent and determine
$U$, $V$ completely.
It turns out to be more convenient in this case to form column vectors $U_i$, $V_i$
from the values of $U$, $V$ along vertical mesh lines; in Section 3 vectors $U_j$, $V_j$ were
formed along horizontal mesh lines. Thus
$U_i = (U_{i,1}, U_{i,2}, \ldots, U_{i,N})^*, \quad V_i = (V_{i,1}, V_{i,2}, \ldots, V_{i,N-1})^*;$
in particular $U_i$, $V_i$ have now different lengths, the $U_i$'s being $N$-vectors, while the $V_i$'s
are $(N - 1)$-vectors. In a similar fashion, $D_i$ is an $N$-vector, while $E_i$ is an
$(N - 1)$-vector of values along the corresponding vertical mesh lines.
With this notation, (5.3) becomes
(5.6a) $U_i - U_{i-1} + T V_i = h D_i, \quad 1 \le i \le M,$
(5.6b) $V_{i+1} - V_i + T^* U_i = h E_i, \quad 1 \le i \le M;$
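here $T$ is an $N \times (N - 1)$ matrix of rank $(N - 1)$,
(5.7) $T = p \begin{pmatrix} 1 & & & \\ -1 & 1 & & \\ & \ddots & \ddots & \\ & & -1 & 1 \\ & & & -1 \end{pmatrix},$
and we define $p = h/k$. In (5.6a) $U_0$ is known from (5.5a), and in (5.6b) $V_{M+1}$ is
known from (5.5b). Hence it is convenient to redefine $D_1$ as $D_1 + h^{-1} U_0$, and $E_M$
as $E_M - h^{-1} V_{M+1}$. Furthermore, we redefine $D_{i,0}$ as $D_{i,0} + k^{-1} V_{i,0}$ and $D_{i,N}$ as
$D_{i,N} - k^{-1} V_{i,N}$. After these changes of notation (5.6) can be given the block matrix
form
(5.8) $\begin{pmatrix}
T & & & & I_N & & & \\
& T & & & -I_N & I_N & & \\
& & \ddots & & & \ddots & \ddots & \\
& & & T & & & -I_N & I_N \\
-I_{N-1} & I_{N-1} & & & T^* & & & \\
& \ddots & \ddots & & & T^* & & \\
& & -I_{N-1} & I_{N-1} & & & \ddots & \\
& & & -I_{N-1} & & & & T^*
\end{pmatrix}
\begin{pmatrix} V_1 \\ \vdots \\ V_M \\ U_1 \\ \vdots \\ U_M \end{pmatrix}
= h \begin{pmatrix} D_1 \\ \vdots \\ D_M \\ E_1 \\ \vdots \\ E_M \end{pmatrix},$
where $I_L$ is the $L \times L$ identity matrix, and each of the four blocks of the matrix has
block dimension $M \times M$.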
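The decoupling of $U$ and $V$ in this case proceeds as follows. In the upper half of
system (5.8) we add the first block row to the second, then the second to the third,
and so on, until the new $(M - 1)$st row is added to the $M$th. We obtain a new system,
which we write in condensed form as
(5.9) $\begin{pmatrix} P & I \\ Q & R \end{pmatrix} \begin{pmatrix} V \\ U \end{pmatrix} = h \begin{pmatrix} F \\ E \end{pmatrix},$
where
(5.10a) $P = \begin{pmatrix} T & & & \\ T & T & & \\ \vdots & & \ddots & \\ T & T & \cdots & T \end{pmatrix},$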
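(5.10b) $Q = \begin{pmatrix} -I_{N-1} & I_{N-1} & & \\ & \ddots & \ddots & \\ & & -I_{N-1} & I_{N-1} \\ & & & -I_{N-1} \end{pmatrix},$
(5.10c) $R = \operatorname{diag}(T^*, T^*, \ldots, T^*),$
(5.10d) $F = \Bigl(D_1, \ D_1 + D_2, \ \ldots, \ \sum_{i=1}^{M} D_i\Bigr)^*,$
and $I$ is the $MN \times MN$ identity matrix. From (5.9) we have
(5.11a) $U = hF - PV,$
(5.11b) $(Q - RP)V = h(E - RF),$
with
(5.11c) $RP = \begin{pmatrix} S & & & \\ S & S & & \\ \vdots & & \ddots & \\ S & S & \cdots & S \end{pmatrix};$
here $S = T^*T$ is an $(N - 1) \times (N - 1)$ matrix,
(5.12) $S = p^2 \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix}.$
Notice that $S$, in contradistinction to $S$ of Section 3, is nonsingular. We shall return to
it later.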
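The relation $S = T^*T$ and the nonsingularity of $S$ are easy to confirm for a small case. The sketch below is illustrative only; it takes $T$ as in (5.7), with $p$ on the diagonal and $-p$ on the subdiagonal, and the helper names `build_T`, `gram` and `det_tridiagonal` are assumptions of this example.

```python
def build_T(N, p):
    """N x (N-1) matrix of (5.7): p on the diagonal, -p on the subdiagonal."""
    T = [[0.0] * (N - 1) for _ in range(N)]
    for j in range(N - 1):
        T[j][j] = p
        T[j + 1][j] = -p
    return T

def gram(T):
    """Return T* T for a real matrix T stored as a list of rows."""
    rows, cols = len(T), len(T[0])
    return [[sum(T[r][a] * T[r][b] for r in range(rows)) for b in range(cols)]
            for a in range(cols)]

def det_tridiagonal(S):
    """Determinant of a tridiagonal matrix by the three-term recurrence."""
    prev, cur = 1.0, S[0][0]
    for i in range(1, len(S)):
        prev, cur = cur, S[i][i] * cur - S[i][i - 1] * S[i - 1][i] * prev
    return cur

N, p = 5, 0.5
S = gram(build_T(N, p))          # expected: p^2 * tridiag(-1, 2, -1), size N - 1
det_S = det_tridiagonal(S)       # expected: p^(2(N-1)) * N, which is nonzero
```

The determinant of the $(N-1) \times (N-1)$ matrix tridiag$(-1, 2, -1)$ equals $N$, so that $\det S = p^{2(N-1)} N \ne 0$.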
Written out explicitly, (5.11b) becomes after a change of sign,
(5.13) $\begin{pmatrix} S+I & -I & & \\ S & S+I & -I & \\ \vdots & & \ddots & -I \\ S & S & \cdots & S+I \end{pmatrix} V
= h \begin{pmatrix} T^*D_1 - E_1 \\ T^*(D_1 + D_2) - E_2 \\ \vdots \\ T^* \sum_{i=1}^{M} D_i - E_M \end{pmatrix},$
where $I$ is now the $(N - 1) \times (N - 1)$ identity. This system can be brought into
block-tridiagonal form simply by subtracting the $i$th row from the $(i + 1)$st, starting from the
top. This produces the system
(5.14) $\begin{pmatrix} S+I & -I & & & \\ -I & S+2I & -I & & \\ & \ddots & \ddots & \ddots & \\ & & -I & S+2I & -I \\ & & & -I & S+2I \end{pmatrix} V
= h \begin{pmatrix} T^*D_1 - E_1 \\ T^*D_2 + E_1 - E_2 \\ \vdots \\ T^*D_M + E_{M-1} - E_M \end{pmatrix}.$
We see that the decoupling resulted in this case in the elimination of $U$, rather
than of $V$. Equation (5.14) is rather similar to (3.11). Some of the differences have
already been pointed out; as a result of these differences, the matrix of (5.14) is
nonsingular. It can be brought to scalar tridiagonal form by diagonalizing $S$ and then
reindexing. We shall comment on the fast solution of (5.14) further at the end of this
section, together with the fast solution of the matrix equation obtained in the second
case we wish to discuss.
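The passage from (5.13) to (5.14) can be illustrated with a scalar stand-in, in which a number $s$ replaces the block $S$ and 1 replaces $I$. In the sketch below, which is an illustration under that simplification, each difference is formed from the unmodified rows of (5.13); the function names are hypothetical.

```python
def case1_matrix(M, s):
    """Scalar stand-in for (5.13): s below the diagonal, s + 1 on it, -1 above it."""
    A = [[0.0] * M for _ in range(M)]
    for i in range(M):
        for j in range(i):
            A[i][j] = s
        A[i][i] = s + 1.0
        if i + 1 < M:
            A[i][i + 1] = -1.0
    return A

def difference_rows(A):
    """Replace row i+1 by (row i+1) - (row i), always using the unmodified rows."""
    out = [A[0][:]]
    for i in range(1, len(A)):
        out.append([a - b for a, b in zip(A[i], A[i - 1])])
    return out

B = difference_rows(case1_matrix(5, 0.75))
```

The result `B` is tridiagonal, with $s + 1$ in the first diagonal entry, $s + 2$ in the remaining ones, and $-1$ on both off-diagonals, which is the scalar pattern of (5.14). The same differences applied to the right-hand side of (5.13) give that of (5.14).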
Case 2. u Given on Both Vertical Sides. We fit the grid to the rectangular domain
so that $R_2 = \{(x, y): -h/2 \le x \le 2\pi - h/2, \ 0 \le y \le \pi\}$. The boundary
conditions are (5.1) on the horizontal sides of the rectangle, and
(5.15a) $u = u_{(0)}(y), \quad x = -h/2,$
(5.15b) $u = u_{(1)}(y), \quad x = 2\pi - h/2,$
on the vertical sides. Equations (2.1), (5.1) and (5.15) determine $u$ and $v$ completely,
subject to the requirement that the data $d(x, y)$, $u_{(0)}(y)$, $u_{(1)}(y)$, $v^{(0)}(x)$, $v^{(1)}(x)$
satisfy the Gauss divergence theorem:
(5.16) $\iint d(x, y)\, dx\, dy = \int_{-h/2}^{2\pi - h/2} \{v^{(1)}(x) - v^{(0)}(x)\}\, dx + \int_0^{\pi} \{u_{(1)}(y) - u_{(0)}(y)\}\, dy.$
The difference equations are (5.3), with (5.3b) only being written for
$1 \le i \le M - 1$, $1 \le j \le N - 1$. The boundary conditions for the mesh variables are (5.4) and
(5.17a) $U_{0,j} = u_{(0)}((j - 1/2)k), \quad 1 \le j \le N,$
(5.17b) $U_{M,j} = u_{(1)}((j - 1/2)k), \quad 1 \le j \le N.$
Hence, there remain $(M - 1)N$ $U$-values to be determined, and $M(N - 1)$ $V$-values, i.e.,
$2MN - M - N$ unknowns. In (5.3) we have $MN$ $D$-equations and $(M - 1)(N - 1)$
$E$-equations, i.e., $2MN - M - N + 1$ equations in all.
It would seem that the number of equations exceeds the number of unknowns
by one. We expect, however, from the continuous case that one compatibility condition
has to be imposed on the data, analogous to (5.16), and that the matrix of system
(5.3) has rank equal to the number of unknowns. This can be checked directly
in the block form (5.18) to which we shall bring this matrix below; the compatibility
condition turns out to be the one obtained when computing the integrals in (5.16) by the
midpoint rule. A similar statement is true in the case in which $u$ is prescribed on the
horizontal sides of the rectangle, and $v$ on the vertical sides; in that case the compatibility
condition is the discrete analog of Stokes' curl theorem, involving $e(x, y)$ rather than $d(x, y)$.
We introduce the $N$-vectors $U_i$, $D_i$ and the $(N - 1)$-vectors $V_i$, $E_i$ as in the
previous case. Again, the first and last components of $D_i$, $D_{i,0}$ and $D_{i,N}$, are modified by
the addition of $p h^{-1} V_{i,0}$ and of $-p h^{-1} V_{i,N}$, respectively. Also, $D_1$ is modified by
the addition of $h^{-1} U_0$ and $D_M$ by the addition of $-h^{-1} U_M$.
After these changes of notation, system (5.3) becomes
(5.18) $\begin{pmatrix}
T & & & & I_N & & \\
& T & & & -I_N & \ddots & \\
& & \ddots & & & \ddots & I_N \\
& & & T & & & -I_N \\
-I_{N-1} & I_{N-1} & & & T^* & & \\
& \ddots & \ddots & & & \ddots & \\
& & -I_{N-1} & I_{N-1} & & & T^*
\end{pmatrix}
\begin{pmatrix} V_1 \\ \vdots \\ V_M \\ U_1 \\ \vdots \\ U_{M-1} \end{pmatrix}
= h \begin{pmatrix} D_1 \\ \vdots \\ D_M \\ -E_1 \\ \vdots \\ -E_{M-1} \end{pmatrix},$
in which the block columns number $M$ and $M - 1$, and the block rows number $M$ and $M - 1$,
where $T$ is the $N \times (N - 1)$ matrix defined by (5.7), and $I_{N-1}$, $I_N$ are identity
matrices of the appropriate dimensions. Clearly, the sum of the $MN$ rows in the upper
half of the matrix in (5.18) is zero. The corresponding compatibility condition that
$\sum_{i,j} D_{i,j} = 0$ is exactly the one we expected; we only need to remember that the $D_{i,j}$
close to the boundary have been redefined to include the boundary data with appropriate
coefficients.
The elimination of $V$ proceeds in a manner analogous to Case 1, by summing the
blocks of the upper half of (5.18), in a discrete form of integration with respect to $x$.
The result is
(5.19) $\begin{pmatrix}
T & & & & I_N & & \\
T & T & & & & \ddots & \\
\vdots & & \ddots & & & & I_N \\
T & T & \cdots & T & 0 & \cdots & 0 \\
-I_{N-1} & I_{N-1} & & & T^* & & \\
& \ddots & \ddots & & & \ddots & \\
& & -I_{N-1} & I_{N-1} & & & T^*
\end{pmatrix}
\begin{pmatrix} V_1 \\ \vdots \\ V_M \\ U_1 \\ \vdots \\ U_{M-1} \end{pmatrix}
= h \begin{pmatrix} D_1 \\ D_1 + D_2 \\ \vdots \\ \sum_{i=1}^{M} D_i \\ -E_1 \\ \vdots \\ -E_{M-1} \end{pmatrix}.$
We rewrite this, with the obvious identifications, as
(5.20) $\begin{pmatrix} P & I \\ T \ \cdots \ T & 0 \ \cdots \ 0 \\ Q & R \end{pmatrix} \begin{pmatrix} V \\ U \end{pmatrix} = h \begin{pmatrix} F \\ F_M \\ -E \end{pmatrix}.$
Notice that $P$ and $Q$ have block dimension $(M - 1) \times M$, $P$ has blocks of dimension
$N \times (N - 1)$, and $Q$ has blocks of dimension $(N - 1) \times (N - 1)$.
From (5.20) we obtain
(5.21a) $U = hF - PV,$
(5.21b) $QV + RU = -hE.$
This allows us to eliminate $U$ and write
(5.22a) $(Q - RP)V = -h(E + RF),$
(5.22b) $T^*(T \ T \ \cdots \ T)V = h T^* \sum_{i=1}^{M} D_i;$
here we introduced the block row missing in (5.21) as (5.22b). The elimination of
the single redundant equation in (5.18) was done naturally by multiplication of (5.22b)
with the $(N - 1) \times N$ matrix $T^*$.
System (5.22) becomes, after carrying out the matrix multiplications,
(5.23) $\begin{pmatrix} S+I & -I & & & \\ S & S+I & -I & & \\ \vdots & & \ddots & \ddots & \\ S & S & \cdots & S+I & -I \\ S & S & \cdots & S & S \end{pmatrix} V
= h \begin{pmatrix} T^*D_1 + E_1 \\ T^*(D_1 + D_2) + E_2 \\ \vdots \\ T^* \sum_{i=1}^{M-1} D_i + E_{M-1} \\ T^* \sum_{i=1}^{M} D_i \end{pmatrix},$
with $S$ being the $(N - 1) \times (N - 1)$ matrix defined by (5.12). The matrix of this
system is the same as that of (5.13), except for the lower right corner block; also the
right-hand sides differ only slightly. Applying the block-tridiagonalization procedure
used in Case 1, which corresponds to differencing in $x$, one obtains
(5.24) $\begin{pmatrix} S+I & -I & & & \\ -I & S+2I & -I & & \\ & \ddots & \ddots & \ddots & \\ & & -I & S+2I & -I \\ & & & -I & S+I \end{pmatrix} V
= h \begin{pmatrix} T^*D_1 + E_1 \\ T^*D_2 + E_2 - E_1 \\ \vdots \\ T^*D_{M-1} + E_{M-1} - E_{M-2} \\ T^*D_M - E_{M-1} \end{pmatrix}.$
We are now prepared to discuss the fast solution of (5.14) and of (5.24).
Fast Sine Transform. The fast solution of (5.14) and of (5.24) involves bringing
the corresponding matrices to scalar tridiagonal form. This is done in two steps: the
first and crucial step is to diagonalize $S$; the second is to bring the two diagonals which
are identically $-1$ from the position of block subdiagonal and block superdiagonal to
that of scalar sub- and superdiagonal, i.e., immediately adjacent to the main diagonal.
We shall write the procedure for a slightly more general system, which includes
(5.14) and (5.24) as special cases, to wit:
(5.25) $\begin{pmatrix} S_1 + \alpha_1 I & \gamma_1 I & & \\ \beta_1 I & S_2 + \alpha_2 I & \gamma_2 I & \\ & \ddots & \ddots & \gamma_{M-1} I \\ & & \beta_{M-1} I & S_M + \alpha_M I \end{pmatrix} W = g;$
$W$ and $g$ are partitioned to conform with the blocks of the matrix,
$W = (W_1^*, W_2^*, \ldots, W_M^*)^*$, $g = (g_1^*, g_2^*, \ldots, g_M^*)^*$, and $S_i = \delta_i S$.
The eigenvalues $\mu_k$ and eigenvectors $\eta_k$ of $S$,
(5.26a) $S \eta_k = \mu_k \eta_k, \quad 1 \le k \le N - 1,$
are known:
(5.26b) $\mu_k = 2p^2 \{1 - \cos(\pi k/N)\},$
(5.26c) $\eta_{k,l} = (2/N)^{1/2} \sin(\pi k l/N).$
The matrix $S$ differs from $S$ of Section 3 inasmuch as its eigenvectors are generated by
the sine function, rather than by the exponential, as in (3.12). Its diagonalization thus
corresponds to the fast sine transform, in the same way in which $S$ of Section 3 was
connected to the FFT. The differences arise in the continuous problem because of the different
boundary conditions.
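The eigenpairs of $S$ can be verified numerically for a small $N$. The sketch below is an illustration, not part of the algorithm: it assumes $S = p^2\,\mathrm{tridiag}(-1, 2, -1)$ as in (5.12), pairs the sine vector $\eta_k$ of (5.26c) with the eigenvalue $2p^2\{1 - \cos(\pi k/N)\}$, and checks the residual of (5.26a) together with orthonormality. The function names are hypothetical.

```python
import math

def sine_eigenpair(N, p, k):
    """Eigenvalue and unit eigenvector of S = p^2 tridiag(-1, 2, -1) of size N - 1."""
    mu = 2.0 * p * p * (1.0 - math.cos(math.pi * k / N))
    eta = [math.sqrt(2.0 / N) * math.sin(math.pi * k * l / N) for l in range(1, N)]
    return mu, eta

def apply_S(p, x):
    """Multiply x by S without forming the matrix."""
    n = len(x)
    return [p * p * (2.0 * x[l] - (x[l - 1] if l > 0 else 0.0)
                     - (x[l + 1] if l + 1 < n else 0.0)) for l in range(n)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

N, p = 8, 1.5
pairs = [sine_eigenpair(N, p, k) for k in range(1, N)]
max_residual = max(abs(a - mu * b)
                   for mu, eta in pairs
                   for a, b in zip(apply_S(p, eta), eta))
norm_1 = dot(pairs[0][1], pairs[0][1])      # should be 1
cross_12 = dot(pairs[0][1], pairs[1][1])    # should be 0
```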
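Let $\mathcal{P}$ be the matrix whose columns are $\eta_1, \eta_2, \ldots, \eta_{N-1}$, and let
$\mathcal{M} = \operatorname{diag}(\mu_1, \ldots, \mu_{N-1})$. Then
(5.27) $\mathcal{P}^* S \mathcal{P} = \mathcal{M}, \quad \mathcal{P}^* \mathcal{P} = I_{N-1}.$
We introduce $\tilde{W}_i$, $\tilde{g}_i$ by
(5.28a) $\tilde{W}_i = \mathcal{P}^* W_i,$
(5.28b) $\tilde{g}_i = \mathcal{P}^* g_i, \quad 1 \le i \le M,$
and $\mathcal{M}_i = \delta_i \mathcal{M}$. We premultiply (5.25) by a block-diagonal matrix with all the diagonal
blocks equal to $\mathcal{P}^*$ and use (5.27), (5.28) to yield
(5.29) $\begin{pmatrix} \mathcal{M}_1 + \alpha_1 I & \gamma_1 I & & \\ \beta_1 I & \mathcal{M}_2 + \alpha_2 I & \gamma_2 I & \\ & \ddots & \ddots & \gamma_{M-1} I \\ & & \beta_{M-1} I & \mathcal{M}_M + \alpha_M I \end{pmatrix} \tilde{W} = \tilde{g}.$
Reindexing $\tilde{W}$ and $\tilde{g}$ into $\hat{w}$ and $\hat{g}$ by
(5.30) $\hat{w}_{k,i} = \tilde{W}_{i,k}, \quad \hat{g}_{k,i} = \tilde{g}_{i,k}, \quad 1 \le i \le M, \ 1 \le k \le N - 1,$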
the scalar tridiagonaUzation of (5.25) is completed in the form
(5.31)
here each Ck is a nohsingular scalar tridiagonal M x M matrix,
5,/ifc+a, 7,
0i 52Mfc + a2 72
"lM-1
&M-1 SM^fc + 0:M_
1 <k<N-l.
After performing and storing the LU decomposition of the $C_k$'s, the solution of
(5.25) is now carried out by a forward fast sine transform (5.28b) of the data B into
$\tilde B$, elimination and substitution in each subsystem
(5.32b) $C_k \hat W_k = \hat B_k$, $1 \le k \le N-1$,
and a backward fast sine transform (5.28a) of the solution $\tilde W$ into W; the reindexing of
$\tilde B$ into $\hat B$ and of $\hat W$ into $\tilde W$ does not necessitate actual computer operations.
Furthermore, in our application,
$\beta_i = \gamma_i = -1$, $\delta_i = 1$, $1 \le i \le M$,
and all $\alpha_i$ are 2, except for $\alpha_M = 1$ in (5.24).
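The three stages just described (forward sine transform, one scalar tridiagonal solve per eigenvalue, backward sine transform) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the sine transform is applied as a dense matrix product rather than as a fast transform, the tridiagonal solves use plain elimination without pivoting (adequate when the systems are diagonally dominant, as with $\beta_i = \gamma_i = -1$ and $\alpha_i \ge 1$), and the eigenvalues `mu` are supplied by the caller.

```python
import math

def sine_basis(N):
    """Matrix P of (5.26c): entry (l, k) is sqrt(2/N) * sin(pi*k*l/N), 1 <= k, l <= N-1.
    P is symmetric and orthogonal, so P* = P and P P = I."""
    c = math.sqrt(2.0 / N)
    return [[c * math.sin(math.pi * k * l / N) for k in range(1, N)]
            for l in range(1, N)]

def matvec(P, v):
    """Dense matrix-vector product (stands in for a fast sine transform)."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def tridiagonal_solve(sub, diag, sup, rhs):
    """Elimination and back substitution for a scalar tridiagonal system (no pivoting)."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        m = sub[i - 1] / d[i - 1]
        d[i] -= m * sup[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    return x

def solve_block_system(B, mu, alpha, beta, gamma, delta):
    """Solve the block tridiagonal system (5.25).
    B: list of M right-hand-side blocks, each of length N-1.
    mu: the N-1 eigenvalues of S belonging to the sine eigenvectors."""
    M, n = len(B), len(mu)
    P = sine_basis(n + 1)
    B_t = [matvec(P, b) for b in B]                 # forward transform, cf. (5.28b)
    W_t = [[0.0] * n for _ in range(M)]
    for k in range(n):                              # one system C_k per eigenvalue, cf. (5.32b)
        diag = [delta[i] * mu[k] + alpha[i] for i in range(M)]
        column = tridiagonal_solve(beta, diag, gamma, [B_t[i][k] for i in range(M)])
        for i in range(M):
            W_t[i][k] = column[i]
    return [matvec(P, w) for w in W_t]              # backward transform, cf. (5.28a)
```

The loop over k reads and writes the transformed data with the two indices exchanged, which is the reindexing (5.30); as noted above, it costs nothing beyond the choice of loop order.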
The operation counts and storage requirements are, therefore, the same as in
Section 3; in particular, the same remarks apply to the computation of the sine trans-
form as to the real Fourier transform, i.e., the practical computer time required is es-
sentially determined by twice the number of real data.
Numerical experiments similar to those in Tables I through VI of Section 4 were
carried out for Cases 1 and 2, with the algorithm described above. The results were
entirely analogous, confirming the fact that the algorithm is second-order accurate and
the computational time it requires is essentially proportional to the number of mesh
points used in the discretization.
It is remarkable that the numerical tests still indicate second-order accuracy for
$u \in C^p$, $v \in C^q$ with $p \ge 1$, $q \ge 0$, and lower-order accuracy for $u \in C^p$, $v \in C^q$ with
$p < 1$, $q < 0$. This is true in both Case 1 and Case 2 for jump discontinuities in u and
v or their derivatives introduced along either x = const or y = const. These numerical
observations seem to indicate that the method's second-order accuracy even for solutions
without the formally required differentiability is not restricted to the case of
periodic boundary conditions. Furthermore, the asymmetry with regard to the continuity
requirements on u and v does not appear to depend on whether U or V is
eliminated in the algorithm, or on the boundary conditions imposed.
6. Comparison with Existing Solvers. A fast direct solver for the inhomogeneous
Cauchy-Riemann equations was published by Lomax and Martin [24]. Some applica-
tions and extensions are given in [25], [26]. We carried out a comparison between
our solvers and those published previously. The comparison was made first for the
test case of [24], [26], then for some of the test cases in our Tables III through VI
and similar ones. These computations were carried out on a CDC 6600 computer with
an FTN 4.6 compiler; single precision arithmetic was used throughout.
Thin Biconvex Airfoil. The example problem used in [24], [26] to illustrate the
use of a Cauchy-Riemann solver in aerodynamics is that of steady, irrotational, sub-
sonic, inviscid flow over a thin symmetrical parabolic-arc biconvex airfoil in the small-
perturbation approximation. The linearized Prandtl-Glauert transformation (e.g., [25])
yields for this problem the formulation
(6.1a) $u_x + v_y = 0$,
(6.1b) $u_y - v_x = 0$, $y > 0$, $-\infty < x < \infty$,
with boundary conditions
(6.2a) $v(x, 0) = -4x$, $-0.5 < x < 0.5$,
       $v(x, 0) = 0$, $0.5 < |x|$,
and
(6.2b) $u, v \to 0$, $x^2 + y^2 \to \infty$.
The analytical solution to (6.1), (6.2) is well known and contains a logarithmic
singularity at the leading and trailing edges of the airfoil, x = ±0.5, y = 0. It is most
easily written as
(6.3) $w = \frac{4}{\pi} \left\{ 1 - z \log \frac{z + 0.5}{z - 0.5} \right\}$,
with w = u - iv and z = x + iy. The most important quantity one wishes to compute
is u on the airfoil {(x, y): y = 0, -0.5 < x < 0.5}; from it one can obtain the lift.
The exact u there, given by (6.3), is simply
(6.4) $u(x, 0) = \frac{4}{\pi} \left\{ 1 - x \log \left| \frac{x + 0.5}{x - 0.5} \right| \right\}$.
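For readers who wish to reproduce the "u exact" column of Table VII, the closed forms (6.3) and (6.4) can be evaluated directly. The sketch below is an illustration only: the function names are hypothetical, and the complex logarithm is taken on its principal branch, which is an assumption about how (6.3) is to be read just above the x-axis.

```python
import cmath
import math

def w_exact(z):
    """Complex velocity w = u - iv of (6.3), principal branch of the logarithm."""
    return (4.0 / math.pi) * (1.0 - z * cmath.log((z + 0.5) / (z - 0.5)))

def u_on_axis(x):
    """Exact u(x, 0) of (6.4); singular at the edges x = +-0.5."""
    return (4.0 / math.pi) * (1.0 - x * math.log(abs((x + 0.5) / (x - 0.5))))

# Values at two of the x-coordinates listed in Table VIIA.
for x in (-0.953125, 0.015625):
    print(x, u_on_axis(x))
```

Evaluating `w_exact` slightly above the axis, for instance at z = 0.25 + 1e-9j, gives an imaginary part close to 4x, i.e. v close to -4x, in agreement with the boundary condition (6.2a) on the airfoil.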
In [24] the numerical computation is carried out in a rectangle $R_3 = \{(x, y):
0 \le y \le 2,\ -1 \le x \le 1\}$, which we also did. Computations were performed in this
rectangle prescribing the exact, analytic solution as boundary data on the three sides
{x = ± 1}, {y = 2}, as well as prescribing homogeneous boundary conditions there.
Clearly, it is a matter of choice which function, u or v, is prescribed on the sides of
the rectangle not containing the airfoil. Hence, we used both the solvers of Section 5,
Case 1, as well as Case 2.
Among the solvers of [26], we chose for comparison purposes version D, option
(d). Version D is the one suggested by the authors themselves for the application at
hand, and its option (d) seemed, of all the versions and options offered, to be the only
one which could be second-order accurate. This version corresponds to our Case 1 in
the choice of boundary conditions.
For the test case (6.1), (6.2) neither the norms of the solution over R3, nor the
order of accuracy of the solvers is particularly relevant, because of the singularities at
the edges of the airfoil, x = ±0.5, y = 0. As a means of comparison we chose, therefore,
the accuracy in computing u(x, 0). To compute U on y = 0, which is a V-line
and not a U-line in the staggered mesh, [24] suggests the second-order accurate extrapolation
formula
(6.5) $U_{i,1/2} = (1/8)\{9U_{i,1} - U_{i,2} + 3(k/h)(V_{i,0} - V_{i+1,0}) - 3kE_{i,0}\}$;
we used this formula in our program, as well as in theirs.
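Formula (6.5) can be read as a quadratic extrapolation in y through the two U-values at y = k/2 and y = 3k/2, with the slope at y = 0 supplied by $u_y = e + v_x$ and $v_x$ replaced by a centered difference of the two neighboring V-values. The sketch below illustrates this reading; the function name is hypothetical, and it assumes that $V_{i,0}$ and $V_{i+1,0}$ sit at x - h/2 and x + h/2 relative to the U-point.

```python
def extrapolate_u_to_axis(U1, U2, V_left, V_right, E0, h, k):
    """Formula (6.5): value of U on y = 0 from U at y = k/2 (U1) and y = 3k/2 (U2),
    the V-values on y = 0 on either side (assumed staggering), and E on y = 0."""
    return (9.0 * U1 - U2 + 3.0 * (k / h) * (V_left - V_right) - 3.0 * k * E0) / 8.0

# The formula is exact for u quadratic in y and v linear in x.
h, k = 0.05, 0.1
u = lambda y: 1.0 + 2.0 * y + 3.0 * y * y       # u_y(0) = 2
v = lambda x: 0.5 * x                           # v_x = 0.5
e0 = 2.0 - 0.5                                  # e = u_y - v_x on y = 0
value = extrapolate_u_to_axis(u(k / 2), u(3 * k / 2), v(-h / 2), v(h / 2), e0, h, k)
print(value)  # agrees with u(0) = 1 up to rounding
```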
Because of the singularity at the edges, [24] did computations both with x =
±0.5 being V-lines, i.e., with computational mesh points coinciding with the edges, and
with x = ±0.5 being U-lines, i.e., with the computational mesh straddling the edges.
In Table VII we give results for the computation with both these meshes.
For each mesh, Table VII contains in successive columns the x-coordinates of
the points along y = 0 at which U was calculated, the exact u there, the solution by
the [26] solver, version D, option (d), and its error, the solution by our solver Case 1,
its error, and finally the solution by our solver Case 2, with its error. We were only
interested in comparing the different solvers, and did not wish to study the question of
computing in a finite domain the solution to a half-plane problem; hence, only the
comparison when using exact boundary data is given. The computations with homo-
geneous boundary data were carried out and gave slightly poorer results for all solvers.
For the mesh points coinciding with the edges, our Case 1 solver has a smaller
error than the [26] solver at all points except four. The error, in particular, is smaller
by a considerable factor over the airfoil |x| < 0.5, and it is smaller at the edges, where
most of the error occurs. Our Case 2 solver has mostly smaller error than the Case 1
solver, but they are quite comparable.
For the computational mesh straddling the edges, our Case 1 solver seems to
have mostly larger errors than [26] outside the airfoil, 0.5 < |x|, but smaller errors in-
side. The Case 2 solver is still slightly better than the Case 1 solver, except at a few
points.
As a conclusion, our solvers of Section 5, Case 1 and Case 2, are quite successful
on the example problem of [24]. If anything, they have slightly smaller error than the
[26] solver which appeared most promising. A better test would probably be to ex-
tract the singularities from the solution analytically (cf. the example in [13], for in-
stance) and compute only the regular part of the solution. We felt, however, that [24]
stressed the importance of computing the neighborhood of the singularities as accurate-
ly as possible, and made the comparison accordingly.
Table VII
Numerical results for the u component of velocity along y = 0 in the solution of
the example problem (6.1), (6.2). The heading LM indicates results with the sol-
ver of [26], version D, option (d); headings GBuv and GBuu indicate results with
the solvers of Section 5, Cases 1 and 2, respectively. The solutions were obtained
on a 64 x 64 grid; every second point is given for reasons of space economy, ex-
cept near the singularity at x = ±0.5, where every point is given; the "extra"
points are marked by stars. The terms "coinciding mesh" and "straddling mesh"
are explained in the text.
VIIA. Coinciding mesh
x   u exact   LM   LM error   GBuv   GBuv error   GBuu   GBuu error
-0.-0.-0-0-0-0-0-0-0-0-0-0-0-0-0-0.-0.-0-0.-0.
0.0.0.0.0.0.0.0.0.0.0.0.0,0.00000
953125890625828125765625703125640625609375578125546875515625484375453125421875390625328125265625203125140625078125015625046875109375171875234375296875359375390625421875453125484375515625546875578125609375671875734375796875859375921875
1.409-11.666-12.009-12.486-13.193-14.341-4
243-1588-1895-1467282646-1250-2302-1163-1729-1050170
1.2421.272
262211116698-1566-1
4.450-12.302-15.250-24.646-11.2821.4678.895-1
588-1243-1689-1802-1227-1825-1529-1
-1.'411-1-1.670-1-2.016-1-2.497-1-3.210-1-4.372-1-5.286-1-6.640-1-8.863-1-1.684-1.480-4.560-1-5.493-2
2.277-16.150-18.704-11.050
169241271261210115684-1548-1422-1266-1613-2573-1482
1.6858.880-1
659-1306-1736-1844-1270-1872-1584-1
1.962-44.049-46.647-41.044-31.703-33.083-34.290-35.230-3
-3.208-3
2.166-11.983-1
-8.664-3
2.434-32.490-31.323-38.401-4
.658-4
.182-4
.331-4
.869-4
.729-48.950-4
069-3338-3812-3796-3580-3635-3
7.346-31.998-12.182-11.490-37.102-36.330-34.675-34.205-34.292-34.720-35.454-3
1.411-11.670-12.015-12.496-13.208-14.370-15.283-16.637-18.860-11.6841.4804.556-15.456-22.281-16.154-18.725-1
050170242272262
1.2111.1179.695-17.560-14.435-12.280-15.464-24.557-11.4801.6848.862-16.639-15.285-13.711-12.816-12.237-11.833-11.536-1
1.612-43.346-45.586-49.012-41.522-32.864-34.051-34.970-3
-3.488-3
2.163-11.980-1
-9.009-3
2.066-32.099-38.841-4
.496-4
.209-4
.488-5
.309-5
.743-5
.587-5
.145-68.743-52.539-46.116-41.463-32.174-32.149-3
-8.918-3
1.981-12.164-1
-3.367-3
5.103-34.197-32.231-31.374-39.723-47.678-46.693-4
-1.411-1-1.670-1-2.015-1-2.496-1-3.208-1-4.370-1-5.283-1-6.637-1-8.860-1-1.684-1.480-4.556-1-5.454-2
2.281-16.155-18.726-11.0501.1701.2421.2721.2621.2111.1169.696-1
.560-1,436-1.281-1.454-2.556-1
-1.480-1.684-8.860-1-6.637-1
283-1709-1814-1234-1829-1531-01
1.595-43.311-45.533-48.942-4
513-3853-3039-3957-3503-3163-1980-1027-3047-3078-3606-4228-4045-5953-5
7.198-59.136-58.551-5
196-5391-5818-4297-4369-3074-3042-3033-3
1.980-12.163-13.511-34.948-34.029-3
031-3134-3777-4980-4902-4
General Purpose Comparison. Lomax and Martin developed their solvers [24],
[26] with certain aerodynamical problems in mind [25], and we developed ours bearing
in mind certain problems in geophysical fluid dynamics [15], [17]. On the other
hand, the first Poisson solvers were also formulated for specific applications [4], [21],
and only later developed into general purpose algorithms and into packages. Therefore,
it seemed reasonable to test our solvers on solutions which had an appropriately general
character (Tables III through VI, and discussion at the end of Section 5); these tests
showed that our solvers are second-order accurate, given rather minimal continuity prop-
erties of the solution.
VIIB. Straddling mesh
x   u exact   LM   LM error   GBuv   GBuv error   GBuu   GBuu error
-0.-0.
0.0.0,
-0.
0.
0.0.0.0.
0.0.0.0.0.0.0.
-0.
0.0.0.0.0.0.0.0.0.0.0.0.
0.0.0.0.0.0.0.0.
9375875081257500687562505937556255312550004687543754062537503125250018751250062500625125018752500312537504062543754687550005312-5562559375625068757500812587509375
1.467-1■1.743-1■2.114-1•2.637-1■3.425-1■4.753-1■5.840-1•7.559-1■1.092
•7.763-1•2.353-1
.975-2441-1898-1235-1085192253273253192085235-1898-1441-1353-1353-1763-1
■1.092-7.559-1-5.840-1-4.753-1-3.425-1-2.637-1-2.114-1-1.743-1-1.467-1
■1.467-1■1.742-1■2.112-1■2.633-1•3.417-1■4.730-1■5.793-1■7.417-1-1.020
027-1219-1036-1455-1900-1234-1085192253273
1.2531.1911.0849.229-16.894-13.448-11.027-12.228-17.037-1
-1.021-7.430-1-5.807-1■4.746-1-3.435-1-2.655-1■2.138-1-1.774-1-1.506
986-5054-4127-4073-4356-4236-3734-3422-2206-2
353-2338-2796-3406-3765-4697-4290-4268-4998-4610-4144-4578-4804-4476-4365-4466-4957-3246-2252-2
-7.086-2-1.291-2-3.304-3-6.781-4
1.013-31.796-32.432-33.105-33.911-3
467-1741-1111-1632-1416-1729-1791-1415-1020
-7.025-1-2.217-1
.038-1
.458-1
.903-1
.237-1
.085,192.253.273.253.192.085
9.237-16.903-13.459-11.039-1
-2.216-1-7.024-1
019414-1790-1727-1414-1630-1109-1739-1464-1
063-5.474-4765-4
,939-4,463-4,372-3884-3439-2224-2
7.374-21.361-24.037-3
665-3726-4669-4183-5463-6715-5928-5971-6301-5314-5095-4274-4
-1.733-3-4.112-3-1.369-2-7.383-2
-7.235-2-1.450-2-5.011-3-2.510-3-1.108-3-6.830-4-4.957-4-3.994-4-3.457-4
1.467-11.741-1
111-1632-1416-1729-1791-1415-1020
.025-1
.217-1
.038-1
.458-1
.903-1
.237-1
.085
.192
.253
.273
.2531.1921.0859.237-16.903-13.458-11.038-1
-2.217-1-7.025-1
.020
.415-1
.791-1
.729-1
.416-1
.632-1-2.111-1-1.741-1-1.467-1
•5.903-5•1.442-4-2.717-4■4.874-4■9.380-4■2.362-3■4.873-3•1.427-2■7.223-2
.373-2
.359-2
.019-3
.646-3
.505-4
.417-4
.327-5
.980-5
.376-5
.068-5
.376-5
.990-5
.327-5
.417-4
.505-4
.646-3
.019-3-1.359-2-7.373-2
7.223-21.437-24.873-3
362-3380-4874-4717-4442-4903-5
The intent of [24], [25], [26] was explicitly restricted to the solution of specific
aerodynamic problems. It appeared worthwhile, however, to consider the more
general applicability of their solvers.
We carried out a number of tests with version D, option (d) of [26] on solu-
tions with a generic character. The results are given in Table VIII. This table is or-
ganized in the same way as Tables III and V of Section 4, and we refer to the de-
scription there.
For very simple test cases such as u, v constant or quadratic, the [26] solver is
essentially exact to machine accuracy; the round-off error increases slightly with the
number of grid points used. The difference between these results and those in Table III
is due to the computer used: double-precision arithmetic on an IBM 360/95 is slightly
more accurate than single precision on a CDC 6600.
The results with more severe test cases seem to indicate that the [26] solver is
only first-order accurate in general. This is apparently due to the formulation of the
algorithm at the boundaries. An indication that boundary inaccuracies are the cause
of lower than second-order accuracy are those exceptional test cases which show high
Table VIII
Numerical results for general-purpose test cases using the solver of [26], version
D, option (d). The smaller regions, such as $R_4 = \{(x, y): 0 \le x \le \pi,\ 0 \le y \le \pi/2\}$
were used in some of the tests because of the difficulty of fitting computations with
M = 256 into the core memory of the CDC 6600; M is the actual number of grid
points used in the x-direction in each test.
MH_(U) t_(V)
U, V
constant
u = 1000.0
v = - 100.0
16
32
64
128
1.083-11
2.871-11
1.001-10
1.999-10
1.642-12
2.628-11
1.387-10
2.177-10
7.747-12
2.752-11
1.210-10
2.090-10
2.910-11
9.095-11
4.475-10
1.281-9
4.547-12
5.093-11
2.310-10
3.174-10
u = x + y
100
16
32
64
128
3.076-12
1.360-11
7.589-11
1.706-10
1.537-12
2.533-11
1.376-10
2.164-10
2.431-12
2.033-11
1.111-10
1.949-10
8.640-12
5.571-11
3.583-10
1.122-9
4.547-12
4.957-11
2.287-10
3.156-10
y sin(x)
sin (y)
16
32
64
128
4.183-3
1.020-3
2.519-4
6.262-5
4.979-3
1.227-3
3.046-4
7.589-5
4.593-3
1.128-3
2.795-4
6.957-5
9.507-3
2.379-3
5.921-4
1.477-4
16/32
32/64
64/12E
4.101
4.048
4.023
4.058
4.023
4.014
4.076
4.036
4.018
3.996
4.019
4.009
1.013-2
2.544-3
6.366-4
1.592-4
3.983
3.996
3.999
u,v e C u = y sin x
v = sin(y)
16
32
64
128
1.155-2
2.885-3
7.197-4
1.797-4
8.376-3
2.055-3
5.090-4
1.267-4
1.009-2
2.504-3
6.233-4
1.555-4
2.622-2
6.613-3
1.656-3
4.141-4
16/3232/6464/12!
4.0054.0084.005
4.0764.0374.018
4.0294.0184.009
3.9663.9943.999
1.909-2
4.838-3
1.211-3
3.030-4
3.9473.9933.998
M *„(U> * <V>
u,v e c y sin(x)
sin(y)
163264
128
2.356-25.333-31.296-33.219-4
3.073-27.566-31.876-34.672-4
2.738-26.546-31.612-34.012-4
8.028-22.600-27.155-31.865-3
16/3232/6464/128
4.4184.1154.025
4.0624.0324.016
4.1834.0594.019
3.0883.6333.836
6.846-21.725-24.314-31.078-3
3.9693.9984.003
u = y sin(x)
v = sin(y)
16
3264
128
9.705-23.426-21.061-22.938-3
1.262-13.117-27.741-31.928-3
1.126-13.276-29.286-32.485-3
4.135-11.282-13.417-28.742-3
16/3232/6464/128
2.8323.2303.610
4.0474.0274.014
3.4363.5273.737
3.2243.7533.909
2.797-17.139-21.787-24.470-3
3.9193.9953.990
u = y sin(x)
v = sin(y)
2-H/3
163264
128
16/3232/6464/128
6.019-23.206-21.058-22.988-3
5.031-21.277-23.206-38.026-4
5.547-22.440-27.818-32.187-3
1.8773.0303.542
3.9393.9833.995
2.2733.1223.574
1.217-16.215-21.982-25.511-3
1.9593.1353.597
1.052-12.712-26.805-31.703-3
3.8783.9853.996
accuracy, apparently because their solutions close to the boundary behave in a special
way in the variable perpendicular to it. The series of tests with increasing powers of
TABLE VIII (Continued)
M *,(U) H2(U,V) * (u) * (v)
u,v E C u = y sin(x)+ 0.1
v = sin( y)+ 0.1
163264
128
1.7261.0023.696-11.087-1
1.9374.854-11.210-13.018-2
1.8357.876-12.750-17.981-2
6.2292.2096.000-11.750-1
4.6151.1822.969-17.427-2
16/3232/6464/128
1.7222.7123.399
3.9904.0124.010
2.3302.8643.446
2.8203.6823.429
3.9043.9803.998
u,v e c 10 . , ,u = y sin(x)
v = sin(y)
163264
128
1.099+31.740+21.065+23.603+1
3.209+28.664+12.186+15.470
8.115+21.374+27.685+12.577+1
1.944+34.049+21.687+25.635+1
16/3232/6464/128
6.3207.6342.950
3.7963.9633.997
5.9051.7882.982
4.8002.4002.994
8.179+22.312+25.875+11.474+1
3.5383.9353.985
u,v e C = y sin(x)
= sin(y )
163264
128
5.742-21.556-23.835-39.507-4
1.128-12.690-26.571-31.632-3
8.951-22.197-25.379-31.335-3
2.299-18.305-22.533-27.042-3
16/3232/6464/128
3.6904.0584.033
4.1954.0944.027
4.0744.0844.029
2.7683.2793.597
3.015-17.070-21.892-24.703-3
4.2643.7374.023
u,v € C u = y sin(x)• i 4,
v = sm(y )
163264
128
1.261+16.5174.9132.752
1.892+11.044+17.5423.623
1.608+18.7036.3653.217
4.736+13.188+12.726+11.833+1
16/3232/6464/128
1.9351.3261.785
1.8121.3842.0828
1.8471.3671.978
1.4861.1691.487
3.853+12.896+11.494+11.026+1
1.3311.9391.456
t_(0> *_<V)
u = x sin(y)
v = y sin(x)
16
32
64
128
1.938-1
1.087-1
5.845-2
3.049-2
6.645-2
3.787-2
2.017-2
1.041-2
1.448-1
8.140-2
4.372-2
2.278-2
8.011-1
5.367-1
3.371-1
2.028-1
16/32
32/64
64/128
1.782
1.860
1.917
1.754
1.878
1.939
1.779
1.862
1.919
1.493
1.592
1.662
2.812-1
1.872-1
1.108-1
6.175-2
1.502
1.689
1.795
U,V 6 C u = sin(y)4
v = y sin(x)
16
32
64
128
2.733
1.787
1.033
5.604-1
6.771-1
4.29-1
2.413-1
1.281-1
1.991
1.300
7.501-1
4.065-1
1.055+1
8.858
6.367
4.193
16/32
32/64
64/128
1.529
1.730
1.343
1.578
1.778
1.884
1.532
1.732
1.845
1.191
1.391
1.519
4.096
3.456
2.401
1.505
1.185
1.439
1.595
u = x sin(y)
v = y sin (x)
16
32
64
4.381-1
2.571-1
1.429-1
1.319-1
7.774-2
4.213-2
3.235-1
1.899-1
1.054-1
1.738
1.227
8.254-1
16/32
32/64
1.704
1.798
1.697
1.845
1.704
1.802
1.416
1.487
6.385-1
4.722-1
3.017-1
1.350
1.565
y in the solution, which lead to decreasing order of accuracy, illustrate this point. The
fractional order of accuracy evident in these and in some of the other test cases also
points in this direction. Needless to say, we also performed experiments with our sol-
vers of Sections 3 and 5 on the same test cases; they all yielded second-order accurate
results.
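The rows headed 16/32, 32/64 and 64/128 in Table VIII are ratios of errors on successive grids, and the order of accuracy they imply follows from a base-2 logarithm. The short sketch below makes the conversion explicit; the function name is hypothetical, and the two error pairs are entries read from Table VIII at M = 16 and M = 32.

```python
import math

def observed_order(err_coarse, err_fine, refinement=2.0):
    """Order p such that err_coarse / err_fine = refinement**p."""
    return math.log(err_coarse / err_fine) / math.log(refinement)

# A ratio near 4 under mesh halving corresponds to second order,
# a ratio near 2 to first order.
print(observed_order(4.183e-3, 1.020e-3))   # ratio about 4.10
print(observed_order(1.938e-1, 1.087e-1))   # ratio about 1.78
```

Ratios strictly between 2 and 4 correspond to the fractional orders mentioned above.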
TABLE VIII (Continued)
M *2(u) I (u) i (v)
u,v
harmonic
u=cos(x)sinh{y
v=sin(x)cosh (y
16
32
64
128
7.981-2
4.464-2
2.379-2
1.234-2
2.924-2
1.647-2
8.727-3
4.491-3
6.013-2
3-365-2
1.792-2
9-285-3
3.669-1
2-559-1
1.050-1
1.012-1
1.105-1
7.177-2
4.246-2
2.383-2
16/32
32/64
64/12E
1.789
1.876
1.928
1.775
1.888
I.943
I.787
1.878
I.930
I.434
1.551
1.630
I.539
1.690
1.783
u,v
harmonic
u=cos(x)sinh(y
v=sin (x)cosh (y
n/8
16
32
64
128
9.154-3
4.861-3
2.515-3
1.283-3
3.840-3
2.110-3
1.105-3
5.653-4
7.019-3
3.747-3
I.943-3
9.914-4
4.605-2
3.087-2
1.938-2
1.166-2
1.31B-2
7.809-3
4.181-3
2.168-3
tf/416/32
32/64
64/128
1.883
I.932
I.96I
1.820
I.910
I.954
1.873
I.929
1-950
I.492
I.593
1.661
1.726
1.868
I.928
u=sin(y)cosh(xv=cos (y) sinh (x)|
163264
12816/3232/6464/12E
1.17.03.92.1
43+1655101
3.9832.5691.4637.809-1
8.5575.3152.9791.585
5.535+14.188+12.760+11.696+1
1.776+11.457+19.4925.490
1.61.71.8
18 1.5501.7561.873
1.6101.7841.879
1.3221.5171.627
1.2191.5351.745
harmonicu=cos(y)sinh{xi
V = - sin(y)
cosh(x)
163264
128
1.78.94.52.2
42+1110060
8.3944.6812.4761.274
1.367+17.1173.6321.835
7.865+14.542+12.443+11.264+1
3.313+12.071+11.165+16.183
16/3232/6464/12Í
1.91.91.9
1.7931.8911.944
1.9211.9601.970
1.7311.8601.928
1.6001.7791.883
Subject to further testing, we are forced to conclude at this point that our
algorithms, compared to previously published ones, are at least as good when applied
to specific problems of interest, and that they are more suitable for development into
general-purpose Cauchy-Riemann solvers.
7. Concluding Remarks. The inhomogeneous Cauchy-Riemann equations in a
rectangle have been discretized by a finite-difference approximation. A number of
different boundary conditions have been treated explicitly, leading to algorithms which
have overall second-order accuracy. All boundary conditions with either u or v pre-
scribed along a side of the rectangle can be treated by similar methods. A rigorous
proof of the second-order accuracy of the algorithm was given for one combination of
boundary conditions, and numerical experiments substantiate this result for all the
boundary conditions tested.
The algorithms presented here have nearly minimal time and storage require-
ments and seem suitable for development into a general-purpose direct Cauchy-Riemann
solver for arbitrary boundary conditions. This could be done for instance along the
lines of the capacitance matrix methods of discrete potential theory (Widlund [33]);
generalizations to nonrectangular domains can also be made by this approach and re-
lated ones. More experience with different applications should help in formulating a
code which gives a reasonable compromise between efficiency and range of applicability.
It is well known [30], [31], [32] that fast solvers can be formulated for a single
separable second-order elliptic equation with variable coefficients. Clearly, the
same generalizations can be carried out for a first-order system (2.1) in which ∂/∂x
and ∂/∂y are replaced by a(x)∂/∂x and by A(y)∂/∂y, and in which lower-order terms
can also be introduced. A special case of such an extension appears in [26]. Effi-
cient algorithms for generalizations of this nature can be based on appropriate modifi-
cations and combinations of matrix decomposition, cyclic reduction and Toeplitz fac-
torization [30], [31], [32].
Linear problems with variable coefficients, as well as nonlinear problems, can
also be handled by semidirect methods, i.e., by splitting the given operator to be in-
verted into one whose inverse is easily computed using a fast direct method, and an-
other one which is small in some suitable sense [8], [19], [25], [32]. It is in this direc-
tion that we shall seek to extend the work presented here, in order to solve the non-
linear problem of [15], [17] and related ones. The straightforward iterative method
of [17] will then be compared to semidirect methods.
Appendix A. An Error Estimate. We saw in Sections 4 and 5 that the proposed
algorithm gives a second-order accurate solution to the inhomogeneous Cauchy-Riemann
equations under the various boundary conditions considered. The second-order ac-
curacy of the discrete solutions was obtained in our numerical experiments even for
cases in which the solution to the continuous problem was merely $C^1$, more precisely,
$u \in C^1$, $v \in C^0$. We shall now give a rigorous error estimate for the model problem of
Section 2, making the stronger and more customary assumption that $u, v \in C^4$, i.e.,
that the solution to (2.1)-(2.3) has continuous derivatives up to fourth order.
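The discrete operator $L_h$ we consider is that of (3.11a),
(A.1)
$$
L_h = \begin{pmatrix}
S + I & -I & & & \\
-I & S + 2I & -I & & \\
 & \ddots & \ddots & \ddots & \\
 & & -I & S + 2I & -I \\
 & & & -I & S + I
\end{pmatrix} + \alpha e e^*,
$$
where $\alpha > 0$ and e is a vector of length MN with all entries zero but one,
$e_l = 0$ for $l \ne M j_0 + i_0$, $e_{M j_0 + i_0} = 1$.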
$L_h$ acts on the grid function U. We wish to show that
(A.2a) $\|u - U\|_\infty = O(h^2 + k^2)$.
From such an estimate and from (3.1a) it will follow immediately that
(A.2b) $\|v - V\|_\infty = O(h^2 + k^2)$
as well; it suffices to observe that (3.1a) is a second-order accurate quadrature formula
for
(A.3) $v(x, y) = v^{(0)}(x) + \int_0^y \{d(x, \eta) - u_x(x, \eta)\}\, d\eta$,
namely the midpoint rule. We only need to apply the result of Bramble and Hubbard
[2] that (A.2a) implies similar estimates for the difference quotients of u.
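The second-order accuracy of the midpoint rule invoked here is easy to observe numerically. In the sketch below, which is an illustration only, the function names are hypothetical and a generic smooth integrand g stands in for $d - u_x$ along a line x = const; halving the step k should divide the quadrature error by about 4.

```python
import math

def midpoint_integral(g, y, n):
    """Midpoint-rule approximation of the integral of g over [0, y] with n cells."""
    k = y / n
    return k * sum(g((j + 0.5) * k) for j in range(n))

def error_ratio(g, exact, y, n):
    """Ratio of the quadrature errors with n and with 2n cells."""
    e_coarse = abs(midpoint_integral(g, y, n) - exact)
    e_fine = abs(midpoint_integral(g, y, 2 * n) - exact)
    return e_coarse / e_fine

# Stand-in integrand: g = cos, whose integral over [0, 1] is sin(1).
print(error_ratio(math.cos, math.sin(1.0), 1.0, 16))
```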
The matrix operator $L_h$ of (A.1) is of monotone type (Collatz, [6, p. 42 ff.])
or of positive type with diagonal dominance (Forsythe and Wasow [11, p. 181]). To
avoid confusion in the terminology, we shall simply state that $L_h$ satisfies the following
Lemma 1 (Maximum Principle). If $L_h W = H$, and $H \ge 0$ (in the sense that
all the components of H are nonnegative), then $W \ge 0$.
From this, one easily shows that
Lemma 2 (Comparison Theorem). If $|L_h W| \le L_h \Phi$, then $|W| \le \Phi$.
We notice that these results would still hold if, instead of $\alpha e e^*$, we included in
$L_h$ a term $\alpha A$, with a single diagonal block equal to $I_M$, and all other blocks zero, $I_M$
being the M x M identity matrix.
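Lemma 1 is equivalent to the statement that every entry of the inverse of $L_h$ is nonnegative, and this can be checked directly on a small grid. The sketch below is an illustration under a loudly stated assumption: S is taken to be $\rho^2$ times the periodic second-difference matrix with stencil (-1, 2, -1), which is a guess at the form of S in Section 3 and is not taken from the text above. All function names are hypothetical, and the linear solves use plain Gaussian elimination with partial pivoting.

```python
def build_Lh(M, N, rho=1.0, alpha=1.0, i0=0, j0=0):
    """Dense matrix with the block structure of (A.1).
    ASSUMPTION: S = rho^2 * periodic second difference (stencil -1, 2, -1)."""
    n = M * N
    A = [[0.0] * n for _ in range(n)]
    for j in range(N):
        shift = 1.0 if j in (0, N - 1) else 2.0     # S + I at both ends, S + 2I inside
        for i in range(M):
            r = j * M + i
            A[r][r] += 2.0 * rho**2 + shift
            A[r][j * M + (i - 1) % M] -= rho**2
            A[r][j * M + (i + 1) % M] -= rho**2
            if j > 0:
                A[r][r - M] -= 1.0                  # block subdiagonal -I
            if j < N - 1:
                A[r][r + M] -= 1.0                  # block superdiagonal -I
    p = j0 * M + i0                                 # the single nonzero entry of e
    A[p][p] += alpha
    return A

def solve(A, b):
    """Gaussian elimination with partial pivoting on a copy of A."""
    n = len(b)
    T = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(T[r][c]))
        T[c], T[p] = T[p], T[c]
        for r in range(c + 1, n):
            f = T[r][c] / T[c][c]
            for q in range(c, n + 1):
                T[r][q] -= f * T[c][q]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (T[r][n] - sum(T[r][q] * x[q] for q in range(r + 1, n))) / T[r][r]
    return x

def min_inverse_entry(A):
    """Smallest entry of the inverse of A, computed column by column."""
    n = len(A)
    return min(min(solve(A, [1.0 if r == c else 0.0 for r in range(n)]))
               for c in range(n))

print(min_inverse_entry(build_Lh(4, 3, rho=0.5, alpha=1.0, i0=1, j0=1)))
```

Under the assumed S, the matrix without the term $\alpha e e^*$ has zero row sums and is singular, which mirrors the indeterminacy of the Neumann problem; adding $\alpha > 0$ at one diagonal position removes the singularity while keeping the off-diagonal entries nonpositive.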
These properties of $L_h$ suggest the familiar estimation procedure first used by
Gershgorin [13]. Let u be the solution of
(A.4a) $Lu = f$
where L and f are defined by the two equations
(A.4b) $\Delta u = u_{xx} + u_{yy} = d_x + e_y$ in R,
(A.4c) $u_y = e + v_x$ on $\partial R = \{(x, y): 0 \le x \le 2\pi,\ y = 0, \pi\}$,
with the additional requirements that u be $2\pi$-periodic in x, and given at a point
$(x_0, y_0) \in R$. For sufficiently smooth d, e and $v^{(0)}$, $v^{(1)}$, (A.4) has a unique solution
$u \in C^4$, subject to the familiar compatibility condition for the Neumann problem that
(A.4d) $\iint_R (d_x + e_y)\, dx\, dy = -\oint_{\partial R} (e + v_x)\, dx$.
We shall make the necessary smoothness and compatibility assumptions throughout
this Appendix.
Let U be the solution of
(A.5a) $L_h U = F$,
where $L_h$ is defined by (A.1) and F is defined as the right-hand side of (3.11a),
(A.5b)
$$
F = k \begin{pmatrix}
T^* D_1 - E_1 \\
T^* D_2 + E_1 - E_2 \\
\vdots \\
T^* D_{N-1} + E_{N-2} - E_{N-1} \\
T^* D_N + E_{N-1}
\end{pmatrix} + \alpha e e^* U,
$$
with $F_j$, $1 \le j \le N$, corresponding in obvious fashion to the partition of $L_h$. Clearly,
for $2 \le j \le N - 1$, (A.5) is an approximation to (A.4b) with second-order local truncation
error. This would further encourage us to seek an estimate (A.2a) by first
showing that the truncation error satisfies
(A.6) $L_h(u - U) = (L_h - L)u + f - F = O(k^2(h^2 + k^2))$
and then constructing a Gershgorin comparison function $\Phi$ which would allow us to
conclude based on Lemma 2 that (A.2) follows from (A.6). We remark at this point
that, in order to make the notation of (A.6) transparent, we defined
(A.7a) $L = -k^2 \Delta$ in R, $L = -k\, \partial/\partial y$ on $\partial R$,
(A.7b) $f = -k^2 (d_x + e_y)$ in R, $f = -k(e + v_x)$ on $\partial R$;
furthermore, F is not merely f at the grid points.
A number of slight difficulties arise in carrying out the program above. First,
(A.4) is essentially a Neumann problem, rather than a Dirichlet problem as in [13] and
in most of the literature on elliptic systems. The work on boundary conditions of the
second and third kind most relevant to the estimation which follows is that of Batschelet
[1], who gives an $O(h)$ estimate, and that of Bramble and Hubbard [3], who give
an $O(h^2 |\log h|)$ estimate, their assumptions being that $u \in C^4$. The latter article also
contains further references to estimates for the Neumann problem. The boundary condition
approximations in these works are different from ours.
Second, (A.6) actually fails close to the boundary, i.e., for j = 1, N, where the
local truncation error is of first order only. The work of Bramble and Hubbard ([3]
and references therein), combined with our numerical results, led us to expect that
(A.2) could still be proved, with some additional effort; this turned out to be the case.
Our method to obtain (A.2) is actually a modification of that in [1], which exploits
the fact that the boundary condition we use is second-order accurate, while that
in [1] is only first-order. We shall see below that the failure of (A.6) for j = 1, N
stems from $L_h$ there being a linear combination of the discrete analog to (A.4b) and
of that to (A.4c). Hence the truncation error, as defined by (A.6), is of first order
there, although both (A.4b) and (A.4c) are separately approximated to second order.
In spite of this formally first-order truncation error, the fact that both the equation
and the boundary condition are approximated by second-order discrete analogs yields
an overall $O(h^2 + k^2)$ error estimate (A.2).
After these observations, we proceed with the business at hand. To start, we
derive (A.6) for $2 \le j \le N - 1$; let the discretization error W be defined as
(A.8a) $W = u - U$,
considered as a mesh function. Then
(A.8b)
$$
\begin{aligned}
(L_h W)_j &= -W_{j-1} + (S + 2I)W_j - W_{j+1} \\
&= -u_{j-1} + (S + 2I)u_j - u_{j+1} + k^2 (\Delta u)_j \\
&\quad - k^2 (d_x + e_y)_j - k(T^* D_j + E_{j-1} - E_j) \\
&= O(k^2(h^2 + k^2)), \qquad 2 \le j \le N - 1.
\end{aligned}
$$
In (A.8b), and in the sequel, $u_j$, $(\Delta u)_j$, and similar terms are interpreted as vectors of
grid values, in the same way as $U_j$. The factor $k^2$ is convenient in order to keep the
coefficients of $L_h$ as $O(1)$. The terms $O(h^2 k^2)$ appear due to differentiation in the x
direction, those $O(k^4)$ due to differentiation in the y direction. The constants implied
in writing $O(h^p k^q)$ depend on the derivatives of order p + q for the functions involved;
they will not be written down explicitly, since those derivatives are known to be
bounded from our assumptions.
Here, as in the derivation of (A.2b) from (A.3) and as in the sequel, it is important
to remember the staggering of Figure 1, which essentially guarantees that all
differences are centered. Notice also that, for $e_{j_0} \ne 0$, (A.8b) will only be modified
by the term $\alpha e_{j_0} e_{j_0}^* u_{j_0}$ in $L_h u$, and by the term $\alpha e_{j_0} e_{j_0}^* U_{j_0}$ in F; the condition we
chose to eliminate the indeterminacy in the Neumann problem is exactly that these
two terms cancel, and hence the estimate is not affected. If instead of $e e^*$ we have
A, the same observation holds as after Lemma 2; i.e., prescribing the average of $U_{j_0}$,
rather than one of its components, $U_{i_0, j_0}$, does not affect the estimate either.
To analyze the situation for j = 1, it is convenient to rewrite (A.5) for j = 1 as
a linear combination of two equations, involving an auxiliary vector $U_0$; an entirely
similar procedure can be carried out for j = N, introducing $U_{N+1}$, but we shall omit
writing out the latter analysis and merely draw the conclusions we need from it. The
two equations for j = 1 are
(A.9a) $-U_0 + (S + 2I)U_1 - U_2 = k(T^* D_1 + E_0 - E_1)$,
(A.9b) $U_0 - U_1 = -kE_0 + T^* V_0$;
here we revert to the original definition of $D_1$, which had been replaced by $D_1 +
k^{-1} V_0$ in writing Eq. (3.3). Equation (A.9a) is now simply the discrete analog of (A.4b)
for j = 1, or y = k/2, with $E_0$ defined on y = 0 in the usual manner. Equation (A.9b)
is the discrete analog of (A.4c) written on y = 0. If the auxiliary vector $U_0$ is thought of
as given on y = -k/2, then both (A.9a) and (A.9b) are formally second-order accurate.
With this motivation in mind, it is easy to obtain
(A.10a)
$$
\begin{aligned}
(L_h W)_1 &= (S + I)W_1 - W_2 \\
&= (S + I)u_1 - u_2 + k^2 \Delta u|_{y=k/2} + k(\partial/\partial y)u|_{y=0} \\
&\quad - k^2 (d_x + e_y)|_{y=k/2} - k(T^* D_1 + E_0 - E_1) \\
&\quad - k(e + v_x)|_{y=0} + kE_0 - T^* V_0 \\
&= [(L_h - L)u]_1 + [f - F]_1.
\end{aligned}
$$
We have from (A.10a) that
$[(L_h - L)u]_1 = O(k^2(h^2 + k))$, $[f - F]_1 = O(k^2(h^2 + k^2)) + O(kh^2)$.
We assume throughout that k/h = O(1) and drop the fourth-order terms; thus finally
(A.10b) $(L_h W)_1 = O(k(h^2 + k^2))$.
For / = N we obtain in the same way
License or copyright restrictions may apply to redistribution; see https://www.ams.org/journal-terms-of-use
626 MICHAEL GHIL AND RAMESH BALGOVIND
(A.llb) L\¡/ =
(A-10c) (LhW)N = 0(k(h2 + k2)).
At this point we have to exhibit a comparison function <p which shall lead us
from (A.8) and (A. 10) to (A.2). Let \p be the solution of
(A.l la) L\p = k2a in R,
-kab on y = 0,
kßb on y = n.
L is defined by (A.7), and i// is 27T-periodic in x and fixed at one point; this yields a
unique and smooth i//, which is bounded independently of x, y, A and k. Notice that
\p satisfies inhomogeneous boundary conditions as in [1]. The constants a, b, a and ß
axe positive and chosen so as to satisfy the compatibility condition
(A.l lc) fJA^dxdy = f^Wy(x, it) - ^y(x, 0)} dx,R
that is, to = (a + ß)b.
We want to find $m$ independent of $x$, $y$, but not of $k$, $h$, so that
(A.12a) $\varphi = m\psi$
and
(A.12b) $|L_h W| \le L_h \Phi,$
where $W$ is given by (A.8a) and $\Phi$ is the mesh function corresponding to $\varphi$. Lemma 2
will then provide the desired estimate on $W$ in terms of $m$.
We consider first the "interior" mesh domain of $u$-points $R_h = \{(x_i, y_j): 1 \le i
\le M,\ 2 \le j \le N - 1\}$. There
(A.13a) $L_h \Phi = L\varphi + (L_h - L)\varphi = mk^2 a + m\,O(k^2(h^2 + k^2)),$
as in (A.8b). For $h$, $k$ sufficiently small, with $h/k = O(1)$,
$L_h \Phi \ge mk^2 a/2.$
It suffices, therefore, given some sufficiently large constant $C_R$, to have set
(A.14a) $m = C_R h^2$
for (A.12b) to hold in $R_h$; $C_R$ will depend on certain bounds for the derivatives of $u$.
Consider now the "boundary" mesh domain of $u$-points $\Gamma_h = \{(x_i, y_j):
1 \le i \le M,\ j = 1, N\}$. Here, cf. (A.9, 10), we have
(A.13b)
$$\begin{aligned}
(L_h \Phi)_1 &= (S + I)\Phi_1 - \Phi_2 = k^2 \Delta\varphi|_{y=k/2} + k\varphi_y|_{y=0} + O(k^2(h^2 + k)) \\
&= -mk^2 a + mk\alpha b + O(k^3);
\end{aligned}$$
a similar equation holds for $(L_h \Phi)_N$. Thus, for $h$, $k$ sufficiently small, we have in $\Gamma_h$
$L_h \Phi \ge mk\alpha b/2;$
remembering again (A.10), it suffices here too that
(A.14b) $m = C_\Gamma h^2,$
for (A.12b) to hold. The constant $C_\Gamma$ depends only on the derivatives of $u$, and on the
domain $R$.
In fact, one can write down $\psi = \psi(x, y; a, b, \alpha, \beta)$ explicitly and optimize the
parameters $a$, $b$, $\alpha$, $\beta$ subject to (A.11c). This yields
(A.15a) $|W| \le [\ldots]\, h^2 \max\{C_R, C_\Gamma\},$
where
(A.15b) $C_R = [\ldots] \sup_{0 \le x \le 2\pi,\ 0 \le y \le \pi} |2(u_{xxxx} + u_{yyyy}) - (d_{xxx} + e_{yyy})|,$
and
(A.15c) $C_\Gamma = [\ldots] \sup_{y = 0, \pi} |u_{yyy} + v_{xxx}|.$
The bound $e(u)$ in Table V of Section 4 was computed using (A.15).
Having derived bound (A.15) completes the proof that under our assumptions
$\|u - U\|_\infty + \|v - V\|_\infty = O(h^2).$
It might be of interest to notice that the algorithm proposed and the error estimate
just given apply not only to the inhomogeneous Cauchy-Riemann equations (2.1),
(2.2); they apply directly to the Neumann problem (A.4), with $d_x + e_y$, $e + v_x$ arbitrarily
prescribed as well. In other words, this is also a second-order algorithm for the
Poisson equation with Neumann boundary conditions in a rectangle (compare Schumann
and Sweet [29]).
Appendix B. The Basic Algorithm. At the end of the paper we present a
FORTRAN listing of the program used to compute the results in Tables I through VI.
The program implements the algorithm described in Section 3, tested in Section 4, and
analyzed in Appendix A. It solves the inhomogeneous Cauchy-Riemann equations for
u and v in a rectangle, with v prescribed on the upper and lower sides of the rectangle,
and with periodicity in the x-direction. The programs for other combinations of
boundary conditions, as discussed in Section 5, are very similar.
The program was tested for compatibility with ANSI FORTRAN (U.S.A. Stan-
dard X3.9-1966) by a special diagnostic option of the CDC FTN 4.6 compiler; it was
further cleaned up to conform to common usage by a program called TIDY, imple-
mented at and available from the Courant Institute. It was run on a CDC 6600 with
an FTN 4.6 compiler, and on an IBM 360/95 and an Amdahl 470V/6 with a
FORTRAN IV, level H compiler. It is felt that these procedures and the numerical
tests reported on in the text are some reasonable steps in the direction of validation
and portability; further steps in this direction taken by potential users would be of the
greatest interest to the authors.
The main part of the program is subroutine FASTCAR. For user convenience,
the driver subroutine, as well as the FFT subroutine and the tridiagonal solver we used
are included. Clearly, these subroutines can be changed to others which the user for
one reason or another finds are better adapted to his needs or preferences; e.g., cyclic
reduction can be used instead of the FFT, or Toeplitz factorization instead of the LU
factorization we use for the scalar tridiagonal systems, etc. The use of a more general
FFT algorithm would remove the restriction of M being a power of two.
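To make the role of the scalar tridiagonal solver concrete, here is a minimal Python sketch of an LU factorization for a tridiagonal system whose off-diagonal entries are all 1, which is what subroutine CONSOL in the listing appears to implement. The function names `solve_unit_tridiagonal` and `apply_unit_tridiagonal` and the sample data are our own illustrative choices, and no pivoting is attempted.

```python
def solve_unit_tridiagonal(diag, rhs):
    """Solve T x = rhs, where T has `diag` on the diagonal and 1 on both
    off-diagonals, by LU factorization without pivoting."""
    n = len(diag)
    # Pivots of the factorization: z[0] = d[0], z[k] = d[k] - 1/z[k-1].
    z = [0.0] * n
    z[0] = diag[0]
    for k in range(1, n):
        z[k] = diag[k] - 1.0 / z[k - 1]
    # Forward elimination with the unit lower-bidiagonal factor.
    y = list(rhs)
    for k in range(1, n):
        y[k] -= y[k - 1] / z[k - 1]
    # Back substitution with the upper-bidiagonal factor.
    x = [0.0] * n
    x[n - 1] = y[n - 1] / z[n - 1]
    for k in range(n - 2, -1, -1):
        x[k] = (y[k] - x[k + 1]) / z[k]
    return x


def apply_unit_tridiagonal(diag, x):
    """Multiply the same tridiagonal matrix by x (used only to check the solve)."""
    n = len(diag)
    out = []
    for k in range(n):
        s = diag[k] * x[k]
        if k > 0:
            s += x[k - 1]
        if k < n - 1:
            s += x[k + 1]
        out.append(s)
    return out


# Sample system: interior diagonal -4, end entries -3 (illustrative values).
diag = [-3.0, -4.0, -4.0, -4.0, -3.0]
rhs = [1.0, 0.0, 2.0, -1.0, 0.5]
x = solve_unit_tridiagonal(diag, rhs)
residual = max(abs(a - b) for a, b in zip(apply_unit_tridiagonal(diag, x), rhs))
print(x, residual)
```

Because the pivots depend only on the diagonal, they can be stored and reused for further right-hand sides, which seems to be the purpose of the ICODE = 3 option in the listing.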
The driver program contains some diagnostics on error norms, which would have
to be modified for problems whose solution is not known in closed form. FORMAT
statements are grouped together and can thus easily be modified. Execution speed can
be increased if MN additional storage locations are available, by avoiding the shifting
required for the reindexing process within the same array. Other trade-offs between
storage and execution time are also possible.
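The driver prints ratios of error norms on successive meshes. A minimal sketch of how such ratios translate into an observed order of accuracy is given below; the function name `observed_orders` is ours, and the error values are synthetic (constructed to behave exactly like $C h^2$), not results from the paper.

```python
import math


def observed_orders(errors):
    """Given error norms on successively halved meshes, return
    log2(e_coarse / e_fine) for each consecutive pair."""
    return [math.log(coarse / fine, 2) for coarse, fine in zip(errors, errors[1:])]


# Mesh widths 2*pi/M for M = 8, 16, 32, 64, as in the driver loop over MF = 3..6.
mesh_widths = [2.0 * math.pi / m for m in (8, 16, 32, 64)]
# Synthetic errors proportional to h**2 (an assumption made for illustration).
synthetic_errors = [0.5 * h * h for h in mesh_widths]
orders = observed_orders(synthetic_errors)
print(orders)
```

A ratio of about 4 between consecutive norms, i.e. an observed order of about 2, is what second-order accuracy predicts.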
The implementation follows rather closely the algorithm description in the last
subsection of Section 3. With the help of the COMMENTS provided in the program,
we hope that it is fairly readable.
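As an aid to reading the listing, the sketch below evaluates the interior diagonal entry of the tridiagonal system associated with each Fourier mode, following the expression assigned to X1 in subroutine FASTCR. The function name `mode_diagonal` and the sample values of `m` and `lam` are our own; `lam` stands for the mesh ratio DELTAY/DELTAX (ALAM in the listing).

```python
import math


def mode_diagonal(j, m, lam):
    """Interior diagonal entry for Fourier mode j (0-based) on an m-point
    periodic mesh: 2*lam**2*(cos(2*pi*j/m) - 1) - 2."""
    return 2.0 * lam * lam * (math.cos(2.0 * math.pi * j / m) - 1.0) - 2.0


m, lam = 8, 1.0
diagonals = [mode_diagonal(j, m, lam) for j in range(m // 2 + 1)]
zero_mode = diagonals[0]
nyquist_mode = diagonals[-1]
print(diagonals)
```

For mode 0 the entry is -2, which with unit off-diagonals gives no strict diagonal dominance; this is consistent with the listing modifying row IROW and using SOL for that mode. For the highest mode the entry reduces to -2(2*lam**2 + 1), the value the listing sets explicitly.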
Questions and comments by users are warmly invited.
C     DRIVER FOR FASTCAR
C
      COMMON /KEEP/ Z(8256)
      COMMON /HAND/ E(8292),D(8292)
      COMMON /WET/ V0(128),VN(128)
      COMMON /DRY/ X(64),Y(64)
      DIMENSION W(5,6)
C
      XJU(J)=XL+H2*FLOAT(J)-H
      XJV(J)=XL+H2*FLOAT(J-1)
      YKU(K)=YL+G2*FLOAT(K)-G
      YKV(K)=YL+G2*FLOAT(K)
      VEX(X,Y)=SIN(X)
      UEX(X,Y)=COS(X)+Y
      DUDY(X,Y)=1.0
      DUDX(X,Y)=-SIN(X)
      DVDX(X,Y)=COS(X)
      DVDY(X,Y)=0.0
C
      ISAM=-1
      PIA=1.0
      PI=4.0*ATAN(PIA)
      PI2=PI/2.0
      DO 60 MF=3,6
      ICODE=0
      MPOW=MF
      M=2**MPOW
      N=M/2
      YL=0.0
      XL=0.0
      XU=2.0*PI
      H2=(XU-XL)/FLOAT(M)
      YU=N*H2
      H=H2/2.0
      G2=(YU-YL)/FLOAT(N)
      G=G2/2.0
      M1=M-1
      ALAM=G/H
      AM=1.0/M
      N1=N-1
      IL=M*N
      ILM=IL-M
      MR=M+1
      DO 10 J=1,N
      Y1=YKV(J)
      Y2=YKU(J)
      DO 10 I=1,M
      X1=XJU(I)
      X2=XJV(I)
      K=(J-1)*M+I
C
C     SETTING UP R.H.S. OF THE INHOMOGENEOUS CAUCHY-RIEMANN
C     EQUATIONS.
C     E IS THE R.H.S. OF THE VORTICITY EQUATION
C     D IS THE R.H.S. OF THE DIVERGENCE EQUATION
C
      E(K)=DUDY(X1,Y1)-DVDX(X1,Y1)
      D(K)=DUDX(X2,Y2)+DVDY(X2,Y2)
   10 CONTINUE
C
C     NEXT STORE THE BOUNDARY VALUES V(TOP) IN VN(I).
C     NEXT STORE THE BOUNDARY VALUES V(BOTTOM) IN V0(I).
C
      DO 20 I=1,M
      X2=XJV(I)
      V0(I)=VEX(XJV(I),YU)
      VN(I)=VEX(XJV(I),YL)
   20 CONTINUE
C
C     NEXT STORE THE FIXED U VALUE IN SOL.
C     IROW IS THE ROW WHERE FIXED VALUE U IS ASSIGNED.
C
      SOL=0.0
      IROW=N
      Y1=G*(2*(IROW-1)+1)+YL
      DO 30 J=1,M
      X1=XJU(J)
      SOL=SOL+UEX(X1,Y1)
   30 CONTINUE
      CALL FASTCR (SOL,H,G,PI,IROW,MPOW,ISAM,ICODE,N)
C
C     NEXT FINDING NORMS..
C     XN IS THE L2 NORM OF U.
C     YN IS THE L2 NORM OF V.
C     XYN IS THE L2 NORM OF (U+V).
C     XM IS THE MAX NORM OF U.
C     YM IS THE MAX NORM OF V.
C
      YN=0.0
      YM=0.0
      DO 40 J=1,M
      X2=XJV(J)
      DO 40 I=1,N1
      Y2=YKV(I)
      K=(I-1)*M+J
      DKKK=ABS(D(K)-VEX(X2,Y2))
      YN=YN+DKKK*DKKK
      IF (DKKK.GT.YM) YM=DKKK
   40 CONTINUE
      XN=0.0
      XM=0.0
      DO 50 J=1,M
      X1=XJU(J)
      DO 50 I=1,N
      Y1=YKU(I)
      K=(I-1)*M+J
      EKKK=ABS(E(K)-UEX(X1,Y1))
      XN=XN+EKKK*EKKK
      IF (EKKK.GT.XM) XM=EKKK
   50 CONTINUE
      XYN=(XN+YN)/(IL+ILM)
      XN=XN/IL
      YN=YN/ILM
      XN=SQRT(XN)
      YN=SQRT(YN)
      XYN=SQRT(XYN)
      WRITE (6,120)
      WRITE (6,110) XN,YN,XYN,XM,YM
      W(1,MF)=XN
      W(2,MF)=YN
      W(3,MF)=XYN
      W(4,MF)=XM
      W(5,MF)=YM
      WRITE (6,90)
      WRITE (6,90)
      WRITE (6,80) N,M,XL,XU,YL,YU
      WRITE (6,90)
      WRITE (6,90)
   60 CONTINUE
      WRITE (6,90)
      DO 70 J=1,5
      W(J,3)=W(J,3)/W(J,4)
      W(J,4)=W(J,4)/W(J,5)
   70 W(J,5)=W(J,5)/W(J,6)
      WRITE (6,100)
      WRITE (6,110) ((W(J,I),J=1,5),I=3,6)
      STOP
   80 FORMAT (5X,*N=*,I3,* M=*,I3,5X,*XL=*,F10.3,* XU=*,F10.3,* YL=*,
     1 F10.3,* YU=*,F10.3)
   90 FORMAT (1X)
  100 FORMAT (7X,9HFOLLOWING,4X,2HIS,4X,5HRATIO,4X,2HOF,4X,5HABOVE,
     1 4X,5HNORMS)
  110 FORMAT (1X,1P5E20.10)
  120 FORMAT (1X,9HNORM OF U,15X,9HNORM OF V,12X,11HNORM OF U+V,10X,
     1 10HMAX NORM U,10X,10HMAX NORM V)
      END
      SUBROUTINE FASTCR (SOL,DELTAX,DELTAY,PI,IROW,MPOW,ISAM,ICODE,N)
      COMMON /HAND/ E(8292),D(8292)
      COMMON /DRY/ X(64),Y(64)
      COMMON /WET/ A(128,2)
C
C     ******************************************************************
C     SOLVES INHOMOGENEOUS CAUCHY-RIEMANN EQUATIONS IN A RECTANGLE
C         U  + V  = D(X,Y)
C          X    Y            WITH B.C. V(X,0) , V(X,YMAX)
C         U  - V  = E(X,Y)
C          Y    X            AND A FIXED VALUE OF U ASSIGNED
C                            AT SOME (X,Y).
C
C     ALL FUNCTIONS ARE PERIODIC IN X WITH PERIOD 2*DELTAX*M .
C
C     THE R.H.S. E(X,Y) , D(X,Y) ARE STORED IN
C         E(M*J+I) = E( DELTAX*(2*I-1) , DELTAY*(2*J+2) )
C                    FOR J = 0...N-2 , I = 1...M
C         D(M*J+I) = D( DELTAX*2*(I-1) , DELTAY*(2*J+1) )
C                    FOR J = 0...N-1 , I = 1...M
C
C     THE BOUNDARY CONDITIONS ARE STORED IN
C         A(I,1) = V(DELTAX*I,0) , A(I,2) = V(DELTAX*I,DELTAY*N)
C                    FOR I = 1...M
C
C     SOLUTIONS U AND V ARE RETURNED IN
C         E(M*J+I) = U( DELTAX*(2*I-1) , DELTAY*(2*J+1) )
C                    FOR J = 0...N-1 , I = 1...M
C         D(M*J+I) = V( DELTAX*2*(I-1) , DELTAY*(2*J+2) )
C                    FOR J = 0...N-2 , I = 1...M
C
C     N = NUMBER OF MESH POINTS IN Y
C     2**MPOW = NUMBER OF MESH POINTS IN X
C
C     ISAM = -1 SOLVES V FROM TOP TO BOTTOM (Y AXIS)
C     ISAM =  1 SOLVES V FROM BOTTOM TO TOP
C     ISAM =  0 DOESNT SOLVE V
C     IROW = THE ROW IN Y WHERE THE FIXED VALUE OF U IS
C            ASSIGNED
C          = (J+1) OF U( X , DELTAY*(2*J+1) )
C     SOL  = SUM OVER THE (X) OF THIS ROW
C
C     ICODE MUST BE SET NOT EQUAL TO 3 WHEN ROUTINE IS
C     CALLED FOR THE FIRST TIME
C     ICODE = 3 FOR SUBSEQUENT CALL IF THE NUMBER OF
C     MESH POINTS IN X AND Y ARE SAME AS IN THE PREVIOUS CALL
C     ******************************************************************
C
      M=2**MPOW
      MBY2=M/2
      M1=M-1
      AM=1.0/FLOAT(M)
      N1=N-1
      NEXT=0
      MHALFP=MBY2+1
      MPLUS2=M+2
      IL=M*N
      ILM=IL-M
      ALAM=DELTAY/DELTAX
      ALSQ=ALAM*ALAM
C
C     SETTING UP THE R.H.S. OF THE LINEAR SYSTEM
C
      G2=2.0*DELTAY
      DO 10 K=1,ILM
      E(K)=E(K)*G2
   10 D(K)=D(K)*G2
      DO 20 I=1,M
      KP=I+ILM
      D(KP)=D(KP)*G2-A(I,2)
      D(I)=D(I)+A(I,1)
   20 A(I,1)=0.0
      IP=1
      IS=2
      DO 40 J=1,ILM,M
      KM=J-1
      DO 30 I=1,M1
      K1=KM+I
      A(I,IS)=E(K1)
   30 E(K1)=ALAM*(D(K1+1)-D(K1))+E(K1)-A(I,IP)
      KMMP=K1+1
      A(M,IS)=E(KMMP)
      E(KMMP)=ALAM*(D(KM+1)-D(KMMP))+E(KMMP)-A(M,IP)
      IEX=IP
      IP=IS
   40 IS=IEX
      DO 50 I=1,M1
      K1=ILM+I
   50 E(K1)=ALAM*(D(K1+1)-D(K1))-A(I,IP)
      E(IL)=ALAM*(D(ILM+1)-D(IL))-A(M,IP)
C
C     PERFORMING THE FAST FOURIER TRANSFORM BLOCK BY BLOCK
C
      SIG=-1.0
      DO 80 K1=1,N
      JF=(K1-1)*M
      DO 60 I=1,M
      J=JF+I
      A(I,1)=E(J)
   60 A(I,2)=0.0
      CALL FFT2 (SIG,MPOW,PI)
      DO 70 I=1,MBY2
      IQF=JF+I*2
      E(IQF-1)=A(I,1)
   70 E(IQF)=A(I,2)
      X(K1)=A(MHALFP,1)
   80 Y(K1)=A(MHALFP,2)
C
C     SOLVING EACH SUB-SYSTEM
C
      DO 120 J=2,MBY2
      IF (ICODE.EQ.3) GO TO 100
      X1=2.0*ALSQ*(COS(2.0*PI*(FLOAT(J)-1.0)*AM)-1.0)-2.0
      A(1,1)=X1+1.0
      A(N,1)=X1+1.0
      DO 90 K=2,N1
   90 A(K,1)=X1
  100 DO 110 K=1,N
      I=(K-1)*M+2*J
      IQF=K+N
      A(K,2)=E(I-1)
  110 A(IQF,2)=E(I)
      CALL CONSOL (N,NEXT,ICODE)
      DO 120 K=1,N
      I=(K-1)*M+2*J
      E(I-1)=A(K,2)
      IQF=K+N
  120 E(I)=A(IQF,2)
      IF (ICODE.EQ.3) GO TO 140
      X1=-2.0*(ALSQ*2.0+1.0)
      A(1,1)=X1+1.0
      A(N,1)=X1+1.0
      DO 130 K=2,N1
  130 A(K,1)=X1
  140 DO 150 J=1,N
      A(J,2)=X(J)
      IQF=J+N
  150 A(IQF,2)=Y(J)
      CALL CONSOL (N,NEXT,ICODE)
      DO 160 J=1,N
      IQF=N+J
      X(J)=A(J,2)
  160 Y(J)=A(IQF,2)
      IF (ICODE.EQ.3) GO TO 180
      A(1,1)=-1.0
      A(N,1)=-1.0
      DO 170 K=2,N1
  170 A(K,1)=-2.0
  180 DO 190 K=1,N
      I=(K-1)*M+1
      A(K,2)=E(I)
      IQF=N+K
  190 A(IQF,2)=E(I+1)
      A(IROW,1)=A(IROW,1)-1.0
      A(IROW,2)=A(IROW,2)-SOL
      CALL CONSOL (N,NEXT,ICODE)
      DO 200 K=1,N
      I=(K-1)*M+1
      E(I)=A(K,2)
      IQF=N+K
  200 E(I+1)=A(IQF,2)
C
C     FAST FOURIER TRANSFORM OF SOLUTION U
C
      SIG=1.0
      DO 230 K1=1,N
      JF=(K1-1)*M
      DO 210 I=1,MBY2
      IQF=JF+I*2
      A(I,1)=E(IQF-1)
  210 A(I,2)=E(IQF)
      DO 220 I=2,MBY2
      L=MPLUS2-I
      A(L,1)=A(I,1)
  220 A(L,2)=-A(I,2)
      A(MHALFP,1)=X(K1)
      A(MHALFP,2)=Y(K1)
      CALL FFT2 (SIG,MPOW,PI)
      DO 230 I=1,M
      J=JF+I
  230 E(J)=A(I,1)*AM
C
C     SOLUTION V WILL BE IN ARRAY D AND U WILL BE IN E
C
      IF (ISAM.EQ.0) GO TO 300
      IF (ISAM.EQ.-1) GO TO 260
      DO 250 K=1,N1
      KM=K*M
      KMMP=KM-M
      A(1,1)=ALAM*(E(KM)-E(KMMP+1))
      DO 240 J=2,M
      KJ=KMMP+J
  240 A(J,1)=ALAM*(E(KJ-1)-E(KJ))
      DO 250 I=1,M
      J=KMMP+I
      D(J)=D(J)+A(I,1)
      IF (K.LE.1) GO TO 250
      K1I=J-M
      D(J)=D(J)+D(K1I)
  250 CONTINUE
      GO TO 300
  260 DO 270 I=1,M
      J=I+ILM
  270 A(I,1)=D(J)
      IS=1
      IP=2
      DO 290 KD=1,N1
      IEX=IS
      IS=IP
      IP=IEX
      KM=M*(N-KD+1)
      KMMP=KM-M
      A(1,IP)=ALAM*(E(KM)-E(KMMP+1))+A(1,IP)
      DO 280 J=2,M
      KJ=KMMP+J
  280 A(J,IP)=ALAM*(E(KJ-1)-E(KJ))+A(J,IP)
      DO 290 I=1,M
      J=KMMP+I
      K1I=J-M
      A(I,IS)=D(K1I)
      D(K1I)=-A(I,IP)
      IF (KD.GT.1) D(K1I)=D(J)+D(K1I)
  290 CONTINUE
  300 CONTINUE
      RETURN
      END
      SUBROUTINE FFT2 (SIG,M,PI)
      COMMON /WET/ X(128),Y(128)
C
C     FAST FOURIER TRANSFORM
C
      N=2**M
      NV2=N/2
      NM1=N-1
      J=1
      DO 30 I=1,NM1
      IF (I.GE.J) GO TO 10
      TX=X(J)
      TY=Y(J)
      X(J)=X(I)
      Y(J)=Y(I)
      X(I)=TX
      Y(I)=TY
   10 K=NV2
   20 IF (K.GE.J) GO TO 30
      J=J-K
      K=K/2
      GO TO 20
   30 J=J+K
      DO 50 L=1,M
      LE=2**L
      LE1=LE/2
      UX=1.0
      UY=0.0
      ANG=PI/FLOAT(LE1)
      WX=COS(ANG)
      WY=SIG*SIN(ANG)
      DO 50 J=1,LE1
      DO 40 I=J,N,LE
      IP=I+LE1
      TX=X(IP)*UX-Y(IP)*UY
      TY=X(IP)*UY+Y(IP)*UX
      X(IP)=X(I)-TX
      Y(IP)=Y(I)-TY
      X(I)=X(I)+TX
   40 Y(I)=Y(I)+TY
      US=UX*WX-UY*WY
      UY=UX*WY+UY*WX
   50 UX=US
      RETURN
      END
      SUBROUTINE CONSOL (N,NEXT,ICODE)
      COMMON /KEEP/ Z(8256)
      COMMON /WET/ X(128),Y(128)
C
C     ******************************************************************
C     SOLVES A SPECIAL TRI-DIAGONAL LINEAR SYSTEM BY LU
C     FACTORIZATION .
C     TWO R.H.S. ARE GIVEN AND STORED IN Y, STARTING AT Y(1)
C     AND AT Y(N+1) RESPECTIVELY .
C     THE TWO CORRESPONDING VECTOR UNKNOWNS ARE RETURNED IN Y,
C     STARTING AT Y(1) AND AT Y(N+1) RESPECTIVELY.
C     IF ICODE = 3 THEN Z ARRAY MUST BE KEPT RESERVED FOR
C     USE BY CONSOL.
C     ******************************************************************
C
      IF (ICODE.EQ.3) GO TO 20
      Z(NEXT+1)=X(1)
      DO 10 K=2,N
      NEXTPL=NEXT+K
      K1=K-1
      X(K1)=1.0/Z(NEXTPL-1)
   10 Z(NEXTPL)=X(K)-X(K1)
   20 DO 30 K=2,N
      KPL=K+N
      KNEXT=NEXT+K-1
      Y(K)=Y(K)-Y(K-1)/Z(KNEXT)
   30 Y(KPL)=Y(KPL)-Y(KPL-1)/Z(KNEXT)
      NEXTPL=NEXT+N
      Y(N)=Y(N)/Z(NEXTPL)
      Y(2*N)=Y(2*N)/Z(NEXTPL)
      DO 40 J=2,N
      K=N-J+1
      KPL=K+N
      KNEXT=K+NEXT
      Y(K)=(Y(K)-Y(K+1))/Z(KNEXT)
   40 Y(KPL)=(Y(KPL)-Y(KPL+1))/Z(KNEXT)
      NEXT=NEXTPL
      RETURN
      END
Courant Institute of Mathematical Sciences
New York University
New York, New York 10012
1. E. BATSCHELET, "Über die numerische Auflösung von Randwertproblemen bei elliptischen
partiellen Differentialgleichungen," Z. Angew. Math. Phys., v. 3, 1952, pp. 165-193.
2. J. H. BRAMBLE & B. E. HUBBARD, "Approximation of derivatives by difference
methods in elliptic boundary value problems," Contributions to Differential Equations, v. 3, 1964,
pp. 399-410.
3. J. H. BRAMBLE & B. E. HUBBARD, "A finite difference analog of the Neumann problem
for Poisson's equation," SIAM J. Numer. Anal., v. 2, 1964, pp. 1-14.
4. O. BUNEMAN, A Compact Non-Iterative Poisson Solver, SUIPR Report No. 294, Inst.
Plasma Research, Stanford Univ., May 1969, 11 pp.
5. B. L. BUZBEE, G. H. GOLUB & C. W. NIELSON, "On direct methods for solving
Poisson's equations," SIAM J. Numer. Anal., v. 8, 1970, pp. 627-656.
6. L. COLLATZ, The Numerical Treatment of Differential Equations, 3rd ed., Mathema-
tische Wissenschaften, vol. 60, Springer-Verlag, Berlin, 1966, 568 pp.
7. J. W. COOLEY, P. A. W. LEWIS & P. D. WELCH, "The finite Fourier transform,"
IEEE Trans. Audio and Electroacoustics, v. 17, 1969, pp. 77-85.
8. E. G. DJAKONOV, "On certain iterative methods for solving nonlinear difference equations,"
in Conference on the Numerical Solution of Differential Equations (J. Ll. Morris, Ed.),
Lecture Notes in Math., vol. 109, Springer, Berlin, 1969, pp. 7-22.
9. F. W. DORR, "The direct solution of the discrete Poisson equation on a rectangle,"
SIAM Rev., v. 12, 1970, pp. 248-263.
10. T. ELVIUS & A. SUNDSTRÖM, "Computationally efficient schemes and boundary
conditions for a fine-mesh barotropic model based on the shallow-water equations," Tellus, v. 25,
1973, pp. 132-156.
11. G. E. FORSYTHE & W. R. WASOW, Finite-Difference Methods for Partial Differential
Equations, Wiley, New York, 1960, 444 pp.
12. D. FISCHER, G. GOLUB, O. HALD, C. LEIVA & O. WIDLUND, "On Fourier-
Toeplitz methods for separable elliptic problems," Math. Comp., v. 28, 1974, pp. 349-368.
13. S. GERSCHGORIN, "Fehlerabschätzung für das Differenzverfahren zur Lösung partieller
Differentialgleichungen," Z. Angew. Math. Mech., v. 10, 1930, pp. 373-382.
14. M. GHIL, "The initialization problem in numerical weather prediction," in Improperly
Posed Boundary Value Problems (A. Carasso and A. P. Stone, Eds.), Research Notes in Math.,
vol. 1, Pitman, London, 1975, pp. 105-123.
15. M. GHIL, Initialization by Compatible Balancing, Report 75-16, Inst. Comp. Appl.
Sci. Engr., Hampton, Virginia, 1975, 38 pp.
16. M. GHIL & B. SHKOLLER, "Wind laws for shockless initialization," Ann. Meteor.
(Neue Folge), v. 11, 1976, pp. 112-115.
17. M. GHIL, B. SHKOLLER & V. YANGARBER, "A balanced diagnostic system compatible
with a barotropic prognostic model," Mon. Wea. Rev., v. 105, 1977, pp. 1223-1238.
18. G. GOLUB, "Direct methods for solving elliptic difference equations," in Symposium
on the Theory of Numerical Analysis (J. Ll. Morris, Ed.), Lecture Notes in Math., vol. 193,
Springer-Verlag, Berlin, 1971, pp. 1-19.
19. J. E. GUNN, "The solution of elliptic difference equations by semi-explicit iterative
techniques," SIAM J. Numer. Anal., v. 2, 1965, pp. 24-45.
20. B. GUSTAFSSON, "An alternating direction implicit method for solving the shallow
water equations," J. Computational Phys., v. 7, 1971, pp. 239-254.
21. R. W. HOCKNEY, "A fast direct solution of Poisson's equation using Fourier analysis,"
J. Assoc. Comput. Mach., v. 12, 1965, pp. 95-113.
22. R. W. HOCKNEY, "The potential calculation and some applications," in Methods in
Computational Physics (B. Adler, S. Fernbach and M. Rotenberg, Eds.), vol. 9 (Plasma Physics),
Academic Press, New York, 1969, pp. 135-211.
23. W. E. LANGLOIS, Vorticity-Stream Function Computation of Incompressible Fluid
Flow with an Almost-Flat Free Surface, IBM Research Report RJ 1794 (#26092), 1976, 8 pp.
24. H. LOMAX & E. D. MARTIN, "Fast direct numerical solution of the nonhomogeneous
Cauchy-Riemann equations," J. Computational Phys., v. 15, 1974, pp. 55-80.
25. E. D. MARTIN & H. LOMAX, Rapid Finite-Difference Computation of Subsonic and
Transonic Aerodynamic Flows, AIAA Paper No. 74-11, 1974, 13 pp.
26. E. D. MARTIN & H. LOMAX, Variants and Extensions of a Fast Direct Numerical
Cauchy-Riemann Solver, with Illustrative Applications, NASA Tech. Note TN D-7934, 1977,
94 pp.
27. J. OLIGER & A. SUNDSTRÖM, "Theoretical and practical aspects of some initial-boundary
value problems in fluid dynamics," SIAM J. Appl. Math. A, v. 35, 1978, pp. 419-446.
28. P. J. ROACHE, Computational Fluid Dynamics, 2nd ed., Hermosa Publishers, Albu-
querque, 1976, 446 pp.
29. U. SCHUMANN & R. A. SWEET, "A direct method for the solution of Poisson's
equation with Neumann boundary conditions on a staggered grid of arbitrary size,"
J. Computational Phys., v. 20, 1976, pp. 171-182.
30. P. N. SWARZTRAUBER, "A direct method for the discrete solution of separable elliptic
equations," SIAM J. Numer. Anal., v. 11, 1974, pp. 1136-1150.
31. R. A. SWEET, "A generalized cyclic reduction algorithm," SIAM J. Numer. Anal.,
v. 11, 1974, pp. 506-520.
32. O. WIDLUND, "On the use of fast methods for separable finite-difference equations for
the solution of general elliptic problems," in Sparse Matrices and Their Applications (D. J. Rose
and R. A. Willoughby, Eds.), Plenum Press, New York, 1972, pp. 121-131.
33. O. WIDLUND, "Capacitance matrix methods for Helmholtz' equation on general
bounded regions," in Numerical Treatment of Differential Equations (R. Bulirsch, R. D. Grigorieff
and J. Schröder, Eds.), Lecture Notes in Math., vol. 631, Springer, Berlin, 1978, pp. 209-219.