SIAM J. NUMER. ANAL.
Vol. 34, No. 6, pp. 2090-2118, December 1997

© 1997 Society for Industrial and Applied Mathematics
002

PARALLEL DOMAIN DECOMPOSITION SOLVER FOR ADAPTIVE HP FINITE ELEMENT METHODS*

J. T. ODEN†, ABANI PATRA‡, AND YUSHENG FENG§

Abstract. In this paper, the development and implementation of highly parallelizable domain decomposition solvers for adaptive hp finite element methods is discussed. Two-level orthogonalization is used to obtain a reduced system which is preconditioned by a coarse grid operator. The condition number of the preconditioned system, for Poisson problems in two space dimensions, is proved to be bounded by $C(1 + \log(Hp/h))^2(1 + \log p)^2$ and $Cp(1 + \log(Hp/h))^2(1 + \log p)^2$ for different choices of coarse grid operators, where H is the subdomain size, p is the maximum spectral order, h is the size of the smallest element in the subdomain, and C is a constant independent of the mesh parameters. The work here extends the work of Bramble et al. [Math. Comp., 47 (1986), pp. 103-134] on the h-version and Babuška et al. [SIAM J. Numer. Anal., 29 (1991), pp. 624-661] on the p-version of the finite element method. A preliminary version of this solver was first announced by Oden, Patra, and Feng in [Domain Decomposition Solver for Adaptive hp Finite Elements, VII Conference on Domain Decomposition, State College, PA, October 1993]. Numerical experiments show fast convergence of the solver and good control of the condition number on a variety of discretizations.

Key words. domain decomposition, preconditioners, adaptive mesh, finite elements

AMS subject classifications. 65N55, 65N30, 35J25, 65Y05

PII. S0036142994278887

1. Introduction. Adaptive hp finite elements, in which the spectral order and element size are independently varied over the whole domain, are capable of delivering solution accuracies far superior to classical h- or p-version finite element methods for a given discretization size. Several researchers [2, 7, 18] have, in fact, shown that the reduction in discretization error with respect to number of unknowns can be exponential for general classes of elliptic boundary value problems, as opposed to the asymptotic algebraic rates observed for h- or p-version finite element methods. Together with multiprocessor computing, these methods thus offer the possibility of orders-of-magnitude improvement in computing efficiency over existing finite element models.

A principal computational cost in any finite element solution is encountered in the solver. In a parallel computing environment, conventional direct solvers based on some variant of Gauss elimination are extremely inefficient for the irregular sparse linear systems generated by adaptive hp discretizations. This is due to their extremely high communication requirements during the elimination process for shared data. Further, as we discuss in the sequel, the linear systems are often very poorly conditioned, ruling out most standard iterative solvers. This feature also greatly degrades the solution obtained from direct solvers. Thus, efficient solvers meeting the twin criteria of being parallelizable and controlling the conditioning of the system need to be developed.

*Received by the editors September 12, 1994; accepted for publication March 27, 1996. This work was supported by Advanced Research Project Agency contract DABT63-92-C-0042.
http://www.siam.org/journals/sinum/34-6/27888.html

†Director and Cockrell Family Regents' Chair in Engineering #2, Texas Institute for Computational and Applied Mathematics, University of Texas at Austin, Austin, TX 78712 ([email protected]).

‡Department of Mechanical Engineering, SUNY-Buffalo, Buffalo, NY 14260 ([email protected]).

§Motorola Inc., Austin, TX 78721 ([email protected]).

In recent years, very efficient parallel iterative solvers have been developed using the domain decomposition or substructuring ideas. These solvers are highly parallelizable and at the same time provide provably good control of the conditioning of the system. Bramble, Pasciak, and Schatz [4] developed the first such solver for h-version finite element methods. They proved that the condition number of the linear system, generated by problems in two space dimensions, could be bounded by $C(1 + \log^2(H/h))$, where H is the subdomain size, h is the minimum element size within a subdomain, and C is a constant independent of the mesh parameters. Dryja, Widlund, and their coworkers obtained similar results using techniques based on the classical Schwarz alternating method [19, 8] and extended many of the results to three dimensions. Subsequently, Babuška et al. [3] extended the method to p-version finite elements. Here again the condition number of the systems was proved to be bounded by $C(1 + \log^2 p)$. Bramble et al. [5] obtained sharper bounds for the h-version finite elements with local refinements in both two and three space dimensions. Mandel [10, 11, 12] developed efficient preconditioners for p-version iterative solvers. Pavarino [17] obtained many theoretical results for the p-version with and without local refinements. More recently Ainsworth [1] has extended the theoretical results of Bramble et al. to the hp-version on quasi-uniform meshes. Other than the preliminary announcement in [14], we know of no other parallel implementations of such solvers for hp-version finite element schemes.

In this paper, we discuss a practical and efficient iterative solver for adaptive hp finite element discretizations. The solver is based on domain decomposition ideas and the preconditioned conjugate gradient method. The solver can be thought of as a combination of multiple direct solvers at the subdomain level with a preconditioned iterative solver to handle the interface problem efficiently in parallel.

We first introduce a decomposition of the finite element space by decomposing the underlying domain. Such decompositions may be automatically obtained by techniques discussed in Patra and Oden [16] or any of a variety of mesh partitioning software now available. The subdomain shapes are not required to be triangular or quadrilateral. We then prove that, for problems in two space dimensions, the condition number of the preconditioned system can be bounded by $C(1 + \log(Hp/h))^2(1 + \log p)^2$ and $Cp(1 + \log(Hp/h))^2(1 + \log p)^2$ for different choices of preconditioners, where H, p, and h are the mesh parameters defined before and C is independent of them. An implementation strategy and extensive numerical results complete the presentation.

2. Condition number growth with h and p. We begin our study of iterative solvers with a series of numerical experiments to demonstrate the growth of the condition number with the mesh parameters h and p for both uniform and nonuniform meshes. Using the Poisson equation on a rectangular domain as a test problem, we plot the growth in the condition number with changes in h and p. These calculations involve shape functions formed by standard tensor products of integrated Legendre polynomials. In Fig. 1, the growth with respect to uniform refinements and enrichments is plotted. Note that the scale of the condition number in this figure is somewhat arbitrary as the system can be scaled by an arbitrary positive constant. In the next figure, Fig. 2, the growth of condition number over a sequence of adaptive hp meshes is plotted (see [15] for more examples and proofs on the condition number growth of several adaptive hp refinement patterns without domain decomposition treatment). In both cases the computed growth is seen to be very rapid.
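
As a rough illustration of this kind of growth (and not a reproduction of the two-dimensional hp experiments reported here), the following Python sketch assumes a one-dimensional Poisson problem on (0, 1) with uniform linear elements and homogeneous Dirichlet conditions, and prints the condition number of the stiffness matrix under uniform h refinement. The function names are illustrative only.

```python
import numpy as np

def poisson_1d_stiffness(n_elems):
    """Stiffness matrix of -u'' = f on (0, 1), u(0) = u(1) = 0, linear elements.

    Assumption for illustration: a uniform mesh with n_elems elements.
    """
    h = 1.0 / n_elems
    n = n_elems - 1                      # number of interior nodes
    main = 2.0 * np.ones(n)
    off = -1.0 * np.ones(n - 1)
    return (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h

def condition_number(n_elems):
    """Spectral condition number of the 1D model stiffness matrix."""
    return np.linalg.cond(poisson_1d_stiffness(n_elems))

if __name__ == "__main__":
    for n_elems in (8, 16, 32, 64):
        print(n_elems, condition_number(n_elems))
```

For this simplified model the condition number grows roughly like $h^{-2}$, so each halving of h multiplies it by about four.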

3. Model problem and finite element spaces. The solver will be discussed with respect to the model problem defined below.

FIG. 1. Condition number growth for uniform h and p refinement for a model elliptic problem. [Plot: condition number (logarithmic scale) versus degrees of freedom, with one curve for uniform h and one for uniform p.]

FIG. 2. Typical computed results on condition number growth for a sequence of meshes produced by adaptive hp refinement. [Plot: condition number versus degrees of freedom, annotated with the spectral order p.]

Find $u \in V$ such that

\[ B(u, v) = \mathcal{L}(v) \quad \forall\, v \in V, \tag{1} \]

where $V = \{v : v \in H^1(\Omega),\ v = 0 \text{ on } \partial\Omega\} = H_0^1(\Omega)$, and $B(u, v)$ is the bilinear form on $V$ characterizing a weak formulation of a two-dimensional second-order elliptic PDE with Dirichlet conditions on the boundary $\partial\Omega$ and $\mathcal{L}$ is a continuous linear functional on $V$; we require $B(u, v)$ to be continuous on $V$, symmetric, and coercive. Thus (1) is well posed and possesses a unique solution. In the theoretical developments given in sections 4, 5, 6, we focus on the Poisson problem for which

\[ B(u, v) = \int_\Omega \nabla u \cdot \nabla v \, dx. \tag{2} \]

However, the general strategy has been successfully applied to more general boundary value problems.

We consider a family $\mathcal{Q}$ of partitions $\mathcal{P}_h$ of $\Omega$ over which a sequence $\{\mathcal{F}_{h,p}\}$ of finite-dimensional subspaces of $V$ is constructed. For definiteness, but without loss of generality, we are concerned with approximation spaces based on the following structures.

S1. $\Omega$ is a connected domain affine equivalent, topologically, to a union of rectangles and can be represented as the union of $N_D$ subdomains $\Omega_I$ such that

\[ \Omega = \bigcup_{I=1}^{N_D} \Omega_I, \qquad \Omega_I \cap \Omega_J = \emptyset, \quad I \neq J. \]

We denote $H_I = \operatorname{dia}(\Omega_I)$.

S2. The construction (in S1) defines a family of coarse mesh partitions $\mathcal{P}_H$ of $\Omega$.

S3. Each subdomain $\Omega_I$ is partitioned into a finer mesh of quadrilateral subdomains (the finite elements) $\omega_K^I$, $K = 1, 2, \ldots, N_I$,

\[ \Omega_I = \bigcup_{K=1}^{N_I} \omega_K^I, \qquad \omega_K^I \cap \omega_L^I = \emptyset, \quad K \neq L, \]

and $h_I = \min_K \operatorname{dia}(\omega_K^I)$. The subpartitioning is quasi uniform of size $O(h_I)$ for each $I$.

S4. Each quadrilateral element $\omega_K^I$ is the affine image of a master element $\hat\omega = [-1, 1]^2$. The master element has nine nodal points: four vertices $a_1, a_2, a_3, a_4$, four edge nodes $e_1, e_2, e_3, e_4$, and a centroid node $a_0$ (see Fig. 3). Corresponding polynomial shape functions are constructed that fall into the following categories:

Vertex functions. These are the standard bilinear functions

\[ \hat\psi_i(\xi, \eta) = \tfrac{1}{4}(1 \pm \xi)(1 \pm \eta), \quad i = 1, 2, 3, 4, \quad (\xi, \eta) \in [-1, 1]^2, \]

\[ \hat\psi_i(a_j) = \delta_{ij}, \quad 1 \le i, j \le 4. \]

Let

\[ \rho_k(\xi) = \sqrt{\frac{2k-1}{2}} \int_{-1}^{\xi} P_{k-1}(s)\, ds \]

with $P_{k-1}$ the Legendre polynomial of degree $(k - 1)$.

FIG. 3. The master element.

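Edge functions. A polynomial function of degree $p_s$, $s = 1, 2, 3, 4$, vanishing at the vertices, assigned to the edge nodes, e.g.,

\[ \hat\varphi_i^1(\xi, \eta) = \tfrac{1}{2}(1 - \eta)\rho_i(\xi), \quad i = 2, \ldots, p_1; \qquad \hat\varphi_i^2(\xi, \eta) = \tfrac{1}{2}(1 + \xi)\rho_i(\eta), \quad i = 2, \ldots, p_2; \]

\[ \hat\varphi_i^3(\xi, \eta) = \tfrac{1}{2}(1 + \eta)\rho_i(\xi), \quad i = 2, \ldots, p_3; \qquad \hat\varphi_i^4(\xi, \eta) = \tfrac{1}{2}(1 - \xi)\rho_i(\eta), \quad i = 2, \ldots, p_4. \]

Interior (bubble) functions. Corresponding to node $a_0$, we introduce the functions given by the tensor products $\rho_i(\xi)\rho_j(\eta)$, $i, j = 2, \ldots, p_0$.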

We will denote the resulting space of functions spanned by the above shape functions by $Q^p(\hat\omega)$, where $p = \max_{0 \le i \le 4}\{p_i\}$. Shape functions on each $\omega_K^I$ will be polynomials of different degree at different nodes. We denote

$p_K^I$ = max degree of the polynomial shape functions defined on $\omega_K^I$, $1 \le I \le N_D$,

and $p_I = \max_K p_K^I$. If $F_K^I : \hat\omega \to \omega_K^I$ is an affine invertible map from $\hat\omega$ to element $\omega_K^I$, then corresponding shape functions for $\omega_K^I$ are of the form

\[ \psi_{Ki}^I = \hat\psi_i \circ (F_K^I)^{-1}, \qquad \varphi_{Ki}^{Ij} = \hat\varphi_i^j \circ (F_K^I)^{-1}, \quad \text{etc.,} \]

and the restriction $u_K^I$ of $u_{hp} \in V_{hp}$ to $\omega_K^I$ is of the form

\[ u_K^I = \hat u_K \circ (F_K^I)^{-1}. \]

FIG. 4. (a) General hp mesh, (b) decomposition into four subdomains, and (c) a wire frame boundary of the subdomains.

S5. Interelement constraints are imposed so that, globally, $\mathcal{F}_{h,p} \subset C^0(\bar\Omega)$; this can be accomplished using, for example, the schemes described in [7].
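
The integrated Legendre functions $\rho_k$ used in S4 can be sketched in Python as follows. This is a minimal illustration assuming NumPy's `numpy.polynomial.legendre` module; the helper names `rho`, `edge_function`, and `stiffness_1d` are hypothetical and not taken from the paper, and the edge function shown is the one attached to the edge $\eta = +1$.

```python
import math
import numpy as np
from numpy.polynomial.legendre import Legendre, leggauss

def rho(k):
    """rho_k(xi) = sqrt((2k-1)/2) * int_{-1}^{xi} P_{k-1}(s) ds as a Legendre series."""
    if k < 2:
        raise ValueError("edge and bubble modes start at k = 2")
    # Antiderivative of P_{k-1} that vanishes at xi = -1.
    antiderivative = Legendre.basis(k - 1).integ(lbnd=-1)
    return antiderivative * math.sqrt((2 * k - 1) / 2.0)

def edge_function(k, xi, eta):
    """Mode k on the master-element edge eta = +1: 0.5 * (1 + eta) * rho_k(xi)."""
    return 0.5 * (1.0 + eta) * rho(k)(xi)

def stiffness_1d(p_max):
    """Matrix of int_{-1}^{1} rho_j'(xi) rho_k'(xi) d xi for j, k = 2..p_max."""
    nodes, weights = leggauss(p_max + 1)   # exact for the polynomial degrees involved
    derivs = np.array([rho(k).deriv()(nodes) for k in range(2, p_max + 1)])
    return (derivs * weights) @ derivs.T

if __name__ == "__main__":
    print(np.round(stiffness_1d(5), 12))
```

Each $\rho_k$ with $k \ge 2$ vanishes at $\xi = \pm 1$, which is why the edge functions vanish at the vertices, and the derivatives $\rho_k'$ are orthonormal in $L^2(-1, 1)$, so the one-dimensional matrix printed above is the identity.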

Throughout this investigation, we assume that conventions S1-S5 are in force, although the results are valid for much more general applications. The resulting structure admits the use of nonuniform mesh sizes and nonuniform distributions of spectral order p; thus, the methods under study here are true hp-version schemes. The particular mesh structure and p distribution is assumed to be given and is generally determined by an adaptive strategy of the type given in [13, 18].

The situation of interest here, consistent with the conventions listed above, is illustrated in Fig. 4. The domain $\Omega$ is covered by a nonuniform hp mesh, as indicated in Fig. 4a (different shadings suggesting different spectral orders p), supporting basis functions for the space $\mathcal{F}_{h,p}$. This is partitioned into the mesh division $\mathcal{P}_H$ consisting of the $N_D$ subdomains $\Omega_I$, as shown in Fig. 4b. The interdomain boundaries constitute a wire frame domain of functions as suggested in Fig. 4c. Note that these wire frame functions are not separately constructed finite element shape functions on this coarser division but natural combinations of the finite element shape functions on the fine mesh. This allows much more general partitionings of the domain without regard to specific geometric shapes of the partitions. This feature is necessary to conveniently support the irregular refinements and unstructured grids characteristic of good adaptive methods. The stiffness matrices corresponding to these functions are automatically constructed during the elimination of interior unknowns or by a process we shall refer to as partial orthogonalization of the finite element shape functions which are nonzero on the interface with respect to those with nonzero support only in the interior of the subdomains. We shall use the term interface mesh to refer to the functions on the wire frame. This differs significantly in implementation from the coarse mesh as used by others in the literature [4, 19].

We distinguish between basis functions corresponding to the vertices and edges of subdomain $\Omega_I$ (vertices and edges of $\Omega_I$ being identified as the element vertices and edges in $\partial\omega_K^I \cap \partial\Omega_I$) and those interior to $\Omega_I$: the vertices of $\Omega_I$ are hereafter called nodes; the edges and edge functions on $\partial\Omega_I$ are called sides; vertices and edges interior to $\Omega_I$ on $\partial\omega_K^I \setminus \partial\Omega_I$ retain the nomenclature vertex and edge functions. Thus, there is a two-level hierarchy of functions defining bases of the following spaces.

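Interior fine mesh.

Vertex functions

\[ X_I^V = \operatorname{span}\{(\text{vertex functions})(a_j),\ a_j \in \Omega_I\}, \]

Edge functions

\[ X_I^E = \operatorname{span}\{(\text{edge functions})(e_j),\ e_j \in \Omega_I\}, \]

Bubble functions

\[ X_I^B = \operatorname{span}\{(\text{bubble functions})(a_{0j}),\ a_{0j} \in \Omega_I\}. \]

Interface mesh.

Nodal functions

\[ X_I^N = \operatorname{span}\{(\text{vertex functions})(a_j),\ a_j \in \partial\Omega_I\}, \]

Side functions

\[ X_I^S = \operatorname{span}\{(\text{edge functions})(e_j),\ e_j \in \partial\Omega_I\}. \]

See Fig. 5.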

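For each element $\omega_K^I \subset \Omega_I$, the element shape functions belong to the space

\[ \big\{u : u = u(x_1, x_2),\ (x_1, x_2) \in \omega_K^I,\ u = \hat u \circ (F_K^I)^{-1},\ \hat u \in Q^p(\hat\omega),\ p = p_K = \max_{0 \le i \le 4}\{p_i^K\}\big\}. \tag{3} \]

The resulting space of functions defined on $\Omega_I$ by

\[ \mathcal{F}_{h,p}(\Omega_I) = X_I^N \oplus X_I^S \oplus X_I^V \oplus X_I^E \oplus X_I^B = \big\{v : v = v(x_1, x_2) = u|_{\Omega_I},\ u \in \mathcal{F}_{h,p},\ v|_{\omega_K^I} \in Q^{p_K}(\omega_K^I),\ 1 \le K \le N_I,\ h = (h_1, h_2, \ldots, h_{N_I}),\ p = (p_1, p_2, \ldots, p_{N_I})\big\} \tag{4} \]

and, for a single element, we write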

(5)

It immediately follows that we may write (with obvious notation)

\[ B(u, v) = \sum_{I=1}^{N_D} B_I(u, v) = \sum_{I=1}^{N_D} \sum_{K=1}^{N_I} B_K^I(u, v) \quad \forall\, u, v \in \mathcal{F}_{h,p}(\Omega), \tag{6} \]

where $u$ and $v$ in $B_I(u, v)$ and $B_K^I(u, v)$ imply the corresponding restriction to $\Omega_I$ and $\omega_K^I$.

FIG. 5. Decomposition of the finite element space: nodal (N) degrees of freedom are at nodes on subdomain $\Omega_I$ interfaces; side (S) are p-type degrees of freedom at sides between interface nodes; on the interior of the subdomains, vertices (V), edges (E), and bubbles (B) degrees of freedom are present.

With $B(u, v)$ defined by (2), we define norms and seminorms that we use in the sequel. The first are the $H^1$ seminorms:

\[ \begin{aligned} |u|^2_{1,\Omega} &= B(u, u) && \forall\, u \in \mathcal{F}_{h,p}(\Omega), \\ |u|^2_{1,\Omega_I} &= B_I(u, u) && \forall\, u \in \mathcal{F}_{h,p}(\Omega_I), \\ |u|^2_{1,\omega_K^I} &= B_K^I(u, u) && \forall\, u \in \mathcal{F}_{h,p}(\omega_K^I). \end{aligned} \tag{7} \]

The weighted $H^1(\Omega)$ norm for domains of size $H$, i.e., $\operatorname{dia}(\Omega) = H$ in two space dimensions, can be written as

\[ \|u\|^2_{1,\Omega} = \frac{1}{H^2}\|u\|^2_{L^2(\Omega)} + |u|^2_{1,\Omega}. \tag{8} \]

The seminorms on the fractional spaces $H^{1/2}(I) = (H^0(I), H^1(I))_{1/2}$ and $H_{00}^{1/2}(I) = (H^0(I), H_0^1(I))_{1/2}$, respectively, where $I = (a, b) \subset \mathbb{R}$, are given by

\[ |u|^2_{1/2,I} = \int_a^b \int_a^b \left(\frac{u(x) - u(y)}{x - y}\right)^2 dx\, dy, \tag{9} \]

\[ {}_0\|u\|^2_{1/2,I} \approx |u|^2_{1/2,I} + 2\int_a^b \frac{u(x)^2}{x - a}\, dx + 2\int_a^b \frac{u(y)^2}{b - y}\, dy, \tag{10} \]

(11)

where "$\approx$" means equivalent. These two norms are needed to characterize traces on element and subdomain boundaries.

Now we establish some basic results on $\mathcal{F}_{h,p}(\Omega)$, which shall be used to establish bounds on the performance of the solution algorithms.

The first of these is a bound on the maximum value of the function in terms of the $H^1(\Omega)$ norm. This is constructed using the famous Markov inequality and a lemma proved by Bramble and Xu [6]. For completeness both the lemmas are cited in section 7.

THEOREM 1. Let $u \in \mathcal{F}_{h,p}(\Omega_I)$; then

\[ \|u\|^2_{L^\infty(\Omega_I)} \le C\left(1 + \log\frac{Hp}{h}\right)\|u\|^2_{1,\Omega_I}, \]

where $H = \operatorname{dia}(\Omega_I)$, $h = \min_K \operatorname{dia}(\omega_K)$, $p$ is the maximum order of polynomial in $\omega_K$, and $C$ is independent of $H$, $p$, and $h$.

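Proof. Applying the Markov inequality to $\partial u/\partial x_1$ and $\partial u/\partial x_2$, we obtain

\[ \max_{x_1}\left|\frac{\partial u}{\partial x_1}\right| \le \frac{p^2}{h}\max_{x_1}|u(x_1, x_2)|, \qquad \max_{x_2}\left|\frac{\partial u}{\partial x_2}\right| \le \frac{p^2}{h}\max_{x_2}|u(x_1, x_2)|. \]

Hence

\[ \|\nabla u\|_{L^\infty(\Omega_I)} \le 2\,\frac{p^2}{h}\,\|u\|_{L^\infty(\Omega_I)}. \]

Lemma 1, section 7 holds $\forall\, \epsilon \in (0, 1)$; we choose $\epsilon$ as

\[ \epsilon = \frac{h}{4Cp^2H}. \]

Since $p \ge 1$ and $H \ge h$, it follows that $\epsilon \in (0, 1)$ for all $C = C_0 \ge 1/4$. Then

\[ |\log\epsilon|^{1/2} = \left|\log 4C + 2\log p + \log\frac{H}{h}\right|^{1/2}. \]

Lemma 1, section 7 thus leads to

\[ \|u\|_{L^\infty(\Omega_I)} \le C\left|\log 4C + 2\log p + \log\frac{H}{h}\right|^{1/2}\|u\|_{1,\Omega_I} + \frac{1 + 2\xi}{4\xi}\,\|u\|_{L^\infty(\Omega_I)}, \]

where $\xi = Hp^2/h$. Now $\xi > 1$ since $H/h \ge 1$ and $p \ge 1$. Thus

\[ \frac{1 + 2\xi}{4\xi} < \frac{3}{4} \]

and

\[ \|u\|_{L^\infty(\Omega_I)} \le C_1\left|\log 4C + 2\log p + \log\frac{H}{h}\right|^{1/2}\|u\|_{1,\Omega_I}, \]

where

\[ C_1 = \frac{C}{1 - \dfrac{1 + 2\xi}{4\xi}} \le 4C. \]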

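The result follows. □

We now present a simple theorem on extensions of functions defined on the boundaries of subdomains into their interior. This will allow us to bound energies of such extensions in terms of the values of these functions on the subdomain interfaces. The theorem is a generalization of a result introduced by Babuška and Suri [2] on a single master element.

THEOREM 2 (extension theorem). Let $S_{IK} \in P_{p_K}(\gamma_{IK})$, where $\gamma_{IK} = \partial\Omega_I \cap \partial\omega_K$, and let the side shape functions on $\partial\Omega_I$ vanish at all nodal points $N = A, B, \ldots \in \partial\Omega_I$: $S_{IK}(A) = S_{IK}(B) = \cdots = 0$. Then there exists a $U \in \mathcal{F}_{h,p}(\Omega_I)$, where $U|_{\gamma_{IK}} = S_{IK}$, such that

\[ \|U\|^2_{1,\Omega_I} \le C\sum_K {}_0\|S_I\|^2_{1/2,\gamma_{IK}} \tag{12} \]

\[ \le C\ {}_0\|S_I\|^2_{1/2,\partial\Omega_I}. \tag{13} \]

(See Fig. 6 for an illustration.)

Proof. Consider an extension $u_K \in Q^{p_K}(\omega_K)$ such that

\[ u_K|_{\gamma_{IK}} = S_{IK} \quad \text{on } \gamma_{IK}. \]

By Lemma 6,

\[ \|u_K\|^2_{1,\omega_K} \le C\ {}_0\|S_I\|^2_{1/2,\gamma_{IK}}. \]

Now construct $U$ on $\Omega_I$ by defining

\[ U = \begin{cases} u_K, & (x, y) \in \omega_K, \\ 0, & (x, y) \in \Omega_I \setminus \bigcup \omega_K, \end{cases} \]

where $\omega_K$ are elements such that $\partial\omega_K \cap \partial\Omega_I \neq \emptyset$. Then

\[ \|U\|^2_{1,\Omega_I} = \sum_K \|u_K\|^2_{1,\omega_K}. \]

Then

\[ \|U\|^2_{1,\Omega_I} \le C\sum_K {}_0\|S_I\|^2_{1/2,\gamma_{IK}} \le C\ {}_0\|S_I\|^2_{1/2,\partial\Omega_I}. \qquad \Box \]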

FIG. 6. Illustration of the extension function.

4. Parallel solver algorithm. In terms of the decomposition of the finite element space introduced earlier, the bilinear form can be written

\[ B(u_{hp}, u_{hp}) = \sum_{I=1}^{N_D} B_I\big(u_{hp}^N + u_{hp}^S + u_{hp}^V + u_{hp}^E + u_{hp}^B,\ u_{hp}^N + u_{hp}^S + u_{hp}^V + u_{hp}^E + u_{hp}^B\big), \]

where $u_{hp}$ is an arbitrary element of $\mathcal{F}_{h,p}$. This results in a subdomain stiffness matrix $K_I$ which contains submatrices associated with nodal (N), side (S), and internal (I) degrees of freedom. The internal degrees of freedom are further composed of vertex (V), edge (E), and bubble (B) degrees of freedom. Symbolically $K_I$ is of the form

\[ K_I = \begin{bmatrix} NN & NS & NI \\ SN & SS & SI \\ IN & IS & II \end{bmatrix}, \qquad II = \begin{bmatrix} VV & VE & VB \\ EV & EE & EB \\ BV & BE & BB \end{bmatrix}. \]

$II$ represents the matrix corresponding to all internal degrees of freedom.

Level 1 partial orthogonalization. Now if the local trial functions are chosen to satisfy the orthogonality condition

(14)

the element stiffness matrix $K_{elt}$ and the submatrix corresponding to the interior degrees of freedom reduce to

\[ K_{elt} = \begin{bmatrix} \overline{VV}_{elt} & \overline{VE}_{elt} & 0 \\ \overline{EV}_{elt} & \overline{EE}_{elt} & 0 \\ 0 & 0 & BB_{elt} \end{bmatrix}, \qquad II = \begin{bmatrix} \overline{VV} & \overline{VE} & 0 \\ \overline{EV} & \overline{EE} & 0 \\ 0 & 0 & BB \end{bmatrix}, \]

where $\overline{VV}$, $\overline{EE}$, $\overline{VE}$ represent modified blocks of the original matrix. We can visualize this as a transformation of the bases of the spaces $X_I^N$, $X_I^S$, $X_I^V$, $X_I^E$ into spaces $\tilde X_I^N$, $\tilde X_I^S$, $\tilde X_I^V$, $\tilde X_I^E$ such that every element in the transformed space is orthogonal to every element in $X_I^B$ in the inner product defined by the bilinear form. This, in more conventional domain decomposition language, makes the spaces $\tilde X_I^N$, $\tilde X_I^S$, $\tilde X_I^V$, $\tilde X_I^E$ discretely harmonic to $X_I^B$.
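
In matrix terms, the level 1 step is a static condensation of the bubble block. The following Python sketch illustrates this on a generic symmetric positive definite matrix whose trailing rows and columns play the role of the bubble unknowns. It is a small dense illustration under that assumption, not the element-level implementation of the paper, and the function name `condense_bubbles` is hypothetical.

```python
import numpy as np

def condense_bubbles(K, n_bubble):
    """Partially orthogonalize the leading unknowns against the trailing bubble block.

    K is split as [[A, B], [B.T, D]] with D the bubble block (assumed SPD).
    Returns the modified block A - B D^{-1} B.T and the change of basis T
    for which T.T @ K @ T is block diagonal.
    """
    n = K.shape[0] - n_bubble
    A, B, D = K[:n, :n], K[:n, n:], K[n:, n:]
    X = np.linalg.solve(D, B.T)      # D^{-1} B^T: one factorization, many right-hand sides
    T = np.eye(K.shape[0])
    T[n:, :n] = -X                   # each non-bubble function minus its bubble component
    return A - B @ X, T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.standard_normal((6, 6))
    K = G @ G.T + 6.0 * np.eye(6)    # synthetic SPD matrix standing in for an element matrix
    A_bar, T = condense_bubbles(K, 2)
    print(np.round(T.T @ K @ T, 10))
```

After the change of basis the coupling blocks between the bubbles and the remaining unknowns vanish, and the leading block is the Schur complement $A - B D^{-1} B^T$.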

Level 2 partial orthogonalization. If, in addition, the trial functions satisfy the orthogonality condition

(15)

the resulting subdomain stiffness matrices reduce to

\[ K_I = \begin{bmatrix} \overline{NN} & \overline{NS} & 0 \\ \overline{SN} & \overline{SS} & 0 \\ 0 & 0 & II \end{bmatrix}, \qquad II = \begin{bmatrix} \overline{VV} & \overline{VE} & 0 \\ \overline{EV} & \overline{EE} & 0 \\ 0 & 0 & BB \end{bmatrix}, \]

where $\overline{NN}$, $\overline{NS}$, $\overline{SN}$, $\overline{SS}$ are matrices corresponding to shared degrees of freedom among subdomains. Note that the first orthogonality condition causes the orthogonalization of the edges, vertices, nodes, and sides with respect to the bubbles, while the second causes the orthogonalization of the interfaces with the interiors. The second orthogonalization makes the spaces $X_I^N + X_I^S$ discretely harmonic with respect to $X_I^V + X_I^E$. The original finite element space is thus decomposed into the sum

(16)

and each $u \in \mathcal{F}_{h,p}(\Omega_I)$ can be written in the form

(17)

Note that we could have included the level 1 process in the level 2 by orthogonalizing with respect to VV, EE, and BB at the same time; however, this would result in much larger (by a factor of 2 to 3) local systems that would have to be factored. This would completely destroy the computational efficiency of this algorithm and fail to take advantage of the special structure of the matrices from higher-order elements.

Remarks.

1. Implementation of the first orthogonality condition can be done at the element level and is thus completely parallelizable.

2. Implementation of the second orthogonality condition involves modification of $NN$, etc., to $\overline{NN}$, etc., which can be accomplished by the matrix sum

\[ \overline{NN} = \sum_{I=1}^{N_D} \big(NN_I + \widehat{NN}_I\big), \]

where $NN_I$ is the NN-stiffness matrix for domain $\Omega_I$ and $\widehat{NN}_I$ is its modification produced by the partial-orthogonalization process. For each subdomain, $\widehat{NN}_I$ can be computed using only subdomain data. Hence these computations are also parallelizable.

3. If an iterative solver (e.g., PCG) is used to solve the interface problem, these modifications can then directly participate in the parallel matrix-vector products that characterize these methods, and there is no need for assembly of these components.

4. The subdomain interior problems are now independent of the interface problem.

The parallel domain decomposition algorithm is summarized as follows.

SOLUTION ALGORITHM

1. Partition the mesh into subdomains using any good decomposition algorithms valid for nonuniform hp meshes (e.g., see Patra and Oden [16]). This has the effect of partitioning the set of unknowns $u$ into $\{u^I\}_{I=1}^{N_D}$. Further, the decomposition of the finite element space corresponds to a decomposition of the unknowns in each domain into $u^I = \{u_N^I, u_S^I, u_V^I, u_E^I, u_B^I\}$.

2. Create subdomain approximations transforming the algebraic system at the element and subdomain levels to satisfy orthogonality conditions (14) and (15). This entails solving multiple right-hand sides on independent element and subdomain problems followed by local matrix-matrix multiplication (e.g., at the element level we need to compute $\overline{VV} = VV - VB\, BB^{-1}\, BV$, $\overline{EE} = EE - EB\, BB^{-1}\, BE$, ..., and at the subdomain level we need to compute $\overline{NN} = NN - NI\, II^{-1}\, IN$, $\overline{SS} = SS - SI\, II^{-1}\, IS$, ..., where $NN$, $SS$, $EB$, etc., are the matrix blocks defined earlier).

3. Solve the independent element and subdomain problems in parallel to compute $\tilde u_B = BB^{-1} f_B$ and $\{\tilde u_V, \tilde u_E\} = II^{-1}\{f_V, f_E\}$.

4. Solve the interface problem by an iterative method (e.g., PCG), using a preconditioner of the type described later by $\mathcal{C}$ in (20), (21), and (22), to obtain $\{u_N, u_S\}$.

5. Transform the solution of the independent local problems to the original system for condition (15) at subdomain level and condition (14) at element level (e.g., $u_B = \tilde u_B - BB^{-1}\, BV\{u_N, u_S\}$).
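
The five steps above can be sketched on a small dense example. The Python code below is a serial illustration only, under several assumptions that are not part of the paper: a single level of condensation (interior versus interface unknowns), dense NumPy solves in place of local factorizations, an unpreconditioned conjugate gradient iteration in place of PCG, and interior unknowns of different subdomains that are not coupled to each other. The names `conjugate_gradient` and `substructured_solve` are hypothetical.

```python
import numpy as np

def conjugate_gradient(matvec, b, tol=1e-12, max_iter=200):
    """Plain (unpreconditioned) CG for an SPD operator given as a function."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    d = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        if np.sqrt(rr) <= tol:
            break
        Ad = matvec(d)
        alpha = rr / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rr_new = r @ r
        d = r + (rr_new / rr) * d
        rr = rr_new
    return x

def substructured_solve(K, f, interface, interiors):
    """Solve K u = f by interior condensation plus an iterative interface solve.

    interface: indices of shared unknowns (step 1 is assumed done by the caller);
    interiors: list of index arrays, one per subdomain.
    """
    local = []
    g = f[interface].astype(float)
    for idx in interiors:                        # independent per subdomain
        K_ii = K[np.ix_(idx, idx)]
        K_ib = K[np.ix_(idx, interface)]
        E = np.linalg.solve(K_ii, K_ib)          # step 2: K_ii^{-1} K_ib
        u_tilde = np.linalg.solve(K_ii, f[idx])  # step 3: local interior solve
        g -= K_ib.T @ u_tilde                    # condensed interface load
        local.append((idx, K_ib, E, u_tilde))
    K_bb = K[np.ix_(interface, interface)]

    def schur_matvec(v):
        # Unassembled product: interface block plus subdomain modifications.
        out = K_bb @ v
        for _, K_ib, E, _ in local:
            out -= K_ib.T @ (E @ v)
        return out

    u = np.zeros(len(f))
    u[interface] = conjugate_gradient(schur_matvec, g)   # step 4
    for idx, _, E, u_tilde in local:                     # step 5: back-transform
        u[idx] = u_tilde - E @ u[interface]
    return u
```

The interface operator is never assembled: each subdomain contributes its own modification inside `schur_matvec`, which is the part that would be distributed across processors in a parallel setting.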

5. Condition number bounds for problems in two dimensions. There are several technical results of independent interest on polynomial spaces that we shall use in our proof. For the sake of readability these are included in a separate section at the end of this article.

For $u_{hp} \in \mathcal{F}_{h,p}$ note that

\[ B(u_{hp}, u_{hp}) = \sum_{I=1}^{N_D} B_I(u_{hp}, u_{hp}) = \sum_{I=1}^{N_D} u_I^T K_I u_I = u^T K u. \]

As demonstrated earlier, the matrices $K_I$ can be poorly conditioned. A matrix $C$ is a preconditioner of $K$ if $C$ is "spectrally close" to $K$, which is expressed symbolically as

\[ C^{-1}K \approx I, \]

$I$ being the identity matrix.

A bilinear form $\mathcal{C}$ on $\mathcal{F}_{h,p} \times \mathcal{F}_{h,p}$ is a preconditioning form corresponding to $B$ if constants $m_1$ and $m_2$ exist such that

\[ m_1 B(u, u) \le \mathcal{C}(u, u) \le m_2 B(u, u), \tag{18} \]

and similar definitions apply to $B_I$. The justification of this nomenclature is due to the following properties:

i) Inequality (18) is equivalent to

\[ m_1 u^T K u \le u^T C u \le m_2 u^T K u \quad \forall\, u \in \mathbb{R}^M, \quad M = \dim(\mathcal{F}_{h,p}). \]

ii) If $\Phi$ and $\lambda$ are a generalized eigenfunction-eigenvalue pair defined by

\[ K\Phi = \lambda C\Phi \quad \text{with} \quad \Phi^T C \Phi = 1, \]

then $\lambda$ is an eigenvalue of $C^{-1}K$ and

\[ m_1 \Phi^T K \Phi \le 1 \le m_2 \Phi^T K \Phi; \]

i.e., since $\Phi^T K \Phi = \lambda$,

\[ \frac{1}{m_2} \le \lambda \le \frac{1}{m_1} \]

and

\[ \kappa(C^{-1}K) \le \frac{m_2}{m_1}, \]

where $\kappa(C^{-1}K)$ is the condition number of $C^{-1}K$. Hopefully, $\kappa(C^{-1}K)$ is close to unity. Thus, for any form $\mathcal{C}$ satisfying (18), the corresponding matrix $C$ is a preconditioner of $K$, and the quality of the preconditioner is determined by the magnitude of the ratio of the constants $(m_2/m_1)$ and its closeness to 1. Similar remarks apply to $K_I$ and $C_I$.
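
Properties i) and ii) can be checked numerically. The Python sketch below assumes dense symmetric positive definite matrices $K$ and $C$ and uses a Cholesky factor of $C$ to obtain the spectrum of $C^{-1}K$; the function names are illustrative and the choice of $C$ in any experiment is arbitrary, not one of the preconditioners analyzed in this paper.

```python
import numpy as np

def preconditioned_spectrum(K, C):
    """Eigenvalues of C^{-1} K for symmetric positive definite K and C."""
    L = np.linalg.cholesky(C)                  # C = L L^T
    Y = np.linalg.solve(L, K)                  # L^{-1} K
    M = np.linalg.solve(L, Y.T).T              # L^{-1} K L^{-T}, similar to C^{-1} K
    return np.linalg.eigvalsh(0.5 * (M + M.T))

def equivalence_constants(K, C):
    """Sharpest m1, m2 with m1 u^T K u <= u^T C u <= m2 u^T K u, and kappa(C^{-1} K)."""
    lam = preconditioned_spectrum(K, C)
    m1, m2 = 1.0 / lam.max(), 1.0 / lam.min()
    return m1, m2, lam.max() / lam.min()

if __name__ == "__main__":
    n = 10
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    C = np.diag(np.diag(K))                    # diagonal scaling, chosen only as an example
    print(equivalence_constants(K, C))
```

With the sharpest constants the bound in ii) holds with equality, that is, the computed condition number equals $m_2/m_1$.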

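It is sufficient to confine our attention to preconditioners for the subdomain forms and matrices because of the following simple property: if $u \in V_{hp}$ and if constants $m_1$ and $m_2$ exist such that

\[ m_1 B_I(u, u) \le \mathcal{C}_I(u, u) \le m_2 B_I(u, u) \quad \forall\, I, \quad I = 1, 2, \ldots, N_D, \tag{19} \]

then

\[ m_1 B(u, u) \le \mathcal{C}(u, u) \le m_2 B(u, u), \]

which follows by simply summing (19) over $I$. Thus, with

\[ \mathcal{C}(u, u) = \sum_{I=1}^{N_D} \mathcal{C}_I(u, u) \]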

we seek preconditioning forms $\mathcal{C}_I$ with desirable properties for the hp-mesh constructions described earlier. Let $X_I^N$, $X_I^S$, $X_I^V$, $X_I^E$, and $X_I^B$ hereafter denote the partially orthogonalized spaces introduced in section 4 through matrix condensation (Schur complementation) steps described earlier. For $\mathcal{F}_{h,p}(\Omega_I)$ given by (16), denote

\[ \dim X_I^N = n_N, \quad \dim X_I^S = n_S, \quad \dim X_I^V = n_V, \quad \dim X_I^E = n_E, \quad \dim X_I^B = n_B, \]

so that $u_{hp}$ can be represented in the form

\[ u_{hp}|_{\Omega_I} = \sum_{j=1}^{n_N} u_{N,j} + \sum_{j=1}^{n_S} u_{S,j} + \sum_{j=1}^{n_V} u_{V,j} + \sum_{j=1}^{n_E} u_{E,j} + \sum_{j=1}^{n_B} u_{B,j}. \]

We also note that $X_I^S$ can be written as $X_I^S = \bigoplus_{k=1}^{n_K} X_I^{S,k}$, where $n_K$ is the number of edges on $\partial\Omega_I$ and each $X_I^{S,k}$ is composed of the orthogonalized side shape functions based on edge $k$. Clearly $\dim X_I^{S,k} \le p_I - 1$. Then we define preconditioning forms on restrictions of functions in $V_{hp}$ to subdomain $\Omega_I$ by

$\mathcal{C}_I^3$ is computationally much simpler to implement than $\mathcal{C}_I^1$ and $\mathcal{C}_I^2$. However, unlike $\mathcal{C}_I^1$, it is difficult to establish reasonable bounds on the performance of $\mathcal{C}_I^3$. Another choice of preconditioner for hp finite elements can be found in Ainsworth [1].

To establish properties of $\mathcal{C}_I^1$ and $\mathcal{C}_I^2$ we first lay down several basic lemmas. Hereafter, we denote $H_I$, $h_I$, and $p_I$ by simply $H$, $h$, and $p$. Note that in actual implementation only the first two terms are used on the reduced system as a preconditioner.

THEOREM 3. Let conventions S1-S5 hold and let $\mathcal{F}_{h,p}(\Omega_I)$ be the finite element space defined in (4). If

i) $B_I(u, v) = 0 \quad \forall\, u \in X^N \oplus X^S,\ v \in X^V \oplus X^E \oplus X^B$,

ii) $B_I(u, v) = 0 \quad \forall\, u \in X^B,\ v \in X^V \oplus X^E$,

then for any $u \in \mathcal{F}_{h,p}(\Omega_I)$,

A) $\mathcal{C}_I^k(u, u) \le \beta B_I(u, u)$, where $\beta = C(1 + \log p)\left(1 + \log\dfrac{Hp}{h}\right)$, and

B) $B_I(u, u) \le \alpha_k \mathcal{C}_I^k(u, u)$, $k = 1, 2$, where

\[ \alpha_k = \begin{cases} Cp(1 + \log p)\left(1 + \log\dfrac{Hp}{h}\right) & \text{for } k = 1, \\[1ex] C(1 + \log p)\left(1 + \log\dfrac{Hp}{h}\right) & \text{for } k = 2, \end{cases} \]

and $\mathcal{C}_I^k$ is defined in (21) and (22).

Proof. The proof essentially is composed of computing bounds on the energy ratios of the components of the preconditioner and $u \in \mathcal{F}_{h,p}(\Omega_I)$. The proof is largely inspired by similar results for the p version in Babuška et al. [3] and the h version in Bramble, Pasciak, and Schatz [4].

Let $x_j$ denote the coordinates of a node on the interface and let $\phi_j$ denote the first-order Lagrange shape function associated with $x_j$. Clearly $\phi_j \in F_{h,p}(\Omega_I)$.

Let $k$ be the set of nodes belonging to element $\omega_k$. Note that $|k| \le 4$. Thus for each element $\omega_k$ in subdomain $I$,

$B_I^{\omega_k}\Big(\sum_{j \in k} u_{N,j}, \sum_{j \in k} u_{N,j}\Big) \le B_I^{\omega_k}\Big(\sum_{j \in k} u_{N,j}(x_j)\phi_j, \sum_{j \in k} u_{N,j}(x_j)\phi_j\Big)$

by the discrete harmonic nature of $u_{N,j}$,

2

where Theorem 1 is applied to the second term. Summing over all elements leads to the result

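Let $\bar u$ be the average value for $u$. From the previous inequality and Poincaré's inequality,

(23)
$B_I\Big(\sum_{j=1}^{n_N} u_{N,j} - \bar u, \sum_{j=1}^{n_N} u_{N,j} - \bar u\Big) \le C\left(1+\log\frac{Hp}{h}\right)\|u - \bar u\|^2_{1,\Omega_I}$

$\le C\left(1+\log\frac{Hp}{h}\right)\left(|u|^2_{1,\Omega_I} + \frac{1}{H^2}\|u - \bar u\|^2_{L^2(\Omega_I)}\right)$

$\le C\left(1+\log\frac{Hp}{h}\right)\left(|u|^2_{1,\Omega_I} + C_1\frac{H^2 |u - \bar u|^2_{1,\Omega_I}}{H^2}\right)$

$\le C\left(1+\log\frac{Hp}{h}\right)|u|^2_{1,\Omega_I}.$

Now define

$u_1 = u - \sum_{i=1}^{n_N} u_{N,i}.$

The function $u_1$ is zero at all the vertices $x_i$ on $\partial\Omega_I$. By the triangle inequality,

$|u_1|^2_{1,\Omega_I} \le C\left(1+\log\frac{Hp}{h}\right)|u|^2_{1,\Omega_I}.$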

$\|u_1\|^2_{L^\infty(\partial\Omega_I)} \le \|u_1\|^2_{L^\infty(\Omega_I)} \le C\left(1+\log\frac{Hp}{h}\right)\|u_1\|^2_{1,\Omega_I}$

using Theorem 1. Let $\gamma_{IK} = \partial\Omega_I \cap \omega_K$ such that $\Omega_I = \bigcup_K \omega_K$. By Lemma 5 (section 7) and the trace theorem,

$\sum_K {}_0\|u_1\|^2_{1/2,\gamma_{IK}} \le |u_1|^2_{1/2,\partial\Omega_I} + C(1+\log p)\|u_1\|^2_{L^\infty(\partial\Omega_I)}$

$\le C_1\|u_1\|^2_{1,\Omega_I} + C(1+\log p)\|u_1\|^2_{L^\infty(\partial\Omega_I)}$

$\le C_1\|u_1\|^2_{1,\Omega_I} + C_2(1+\log p)\left(1+\log\frac{Hp}{h}\right)\|u_1\|^2_{1,\Omega_I}$

$\le C(1+\log p)\left(1+\log\frac{Hp}{h}\right)\|u\|^2_{1,\Omega_I}.$

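Following Theorem 2, one can construct $\tilde u_S = \sum_{j=1}^{n_S} \tilde u_{S,j}$, $\tilde u_{S,j} \in X^S \oplus (X^V + X^E + X^B)$, such that $\tilde u_S = u_1$ on $\partial\Omega_I$ and

(24) $\|\tilde u_S\|^2_{1,\Omega_I} \le C\sum_K {}_0\|u_1\|^2_{1/2,\gamma_{IK}} = C\sum_K {}_0\|\tilde u_S\|^2_{1/2,\gamma_{IK}} \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right)\|u\|^2_{1,\Omega_I}.$

Let $u_{S,j} \in X^S$ and $u_S = \sum_j u_{S,j}$; then $u_{S,j}|_{\partial\Omega_I} = \tilde u_{S,j}|_{\partial\Omega_I}$, $(u_{S,j} - \tilde u_{S,j}) \in X^V + X^E + X^B$, $u_S|_{\partial\Omega_I} = u_1|_{\partial\Omega_I} = \tilde u_S|_{\partial\Omega_I}$, and $(u_S - \tilde u_S) \in X^V + X^E + X^B$. From assumptions (i) and (ii), we have

(25) $B_I(u_{S,j}, u_{S,j}) \le B_I(\tilde u_{S,j}, \tilde u_{S,j}) \le C\,{}_0\|\tilde u_{S,j}\|^2_{1/2,\gamma_{IK}} \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right)\|u\|^2_{1,\omega_K}.$

Thus,

$\sum_{j=1}^{n_S} B_I(u_{S,j}, u_{S,j}) \le \sum_{j=1}^{n_S} B_I(\tilde u_{S,j}, \tilde u_{S,j}) \le C\sum_{j=1}^{n_S} {}_0\|\tilde u_{S,j}\|^2_{1/2,\gamma_{IK}} \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right)\|u\|^2_{1,\Omega_I}.$

Applying a procedure similar to (23), we find that

(26) $\sum_{j} |u_{S,j}|^2_{1,\Omega_I} \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right)|u|^2_{1,\Omega_I}.$

Thus,

$\beta \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right).$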

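Now to establish the bounds in part B of Theorem 3, we expand $B_I(u, u)$ as follows:

$B_I(u,u) = B_I\Big(\sum_{j=1}^{n_N} u_{N,j} + \sum_{j=1}^{n_S} u_{S,j} + \sum_{j=1}^{n_V} u_{V,j} + \sum_{j=1}^{n_E} u_{E,j} + \sum_{j=1}^{n_B} u_{B,j},\ \sum_{j=1}^{n_N} u_{N,j} + \sum_{j=1}^{n_S} u_{S,j} + \sum_{j=1}^{n_V} u_{V,j} + \sum_{j=1}^{n_E} u_{E,j} + \sum_{j=1}^{n_B} u_{B,j}\Big)$

$= B_I\Big(\sum_{j=1}^{n_N} u_{N,j}, \sum_{j=1}^{n_N} u_{N,j}\Big) + B_I\Big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\Big)$

$\quad + B_I\Big(\sum_{j=1}^{n_V} u_{V,j} + \sum_{j=1}^{n_E} u_{E,j} + \sum_{j=1}^{n_B} u_{B,j},\ \sum_{j=1}^{n_V} u_{V,j} + \sum_{j=1}^{n_E} u_{E,j} + \sum_{j=1}^{n_B} u_{B,j}\Big)$

$\quad + 2B_I\Big(\sum_{j=1}^{n_N} u_{N,j}, \sum_{j=1}^{n_S} u_{S,j}\Big).$

Applying the Cauchy-Schwarz inequality and the arithmetic-geometric mean inequality on the term $2B_I\big(\sum_{j=1}^{n_N} u_{N,j}, \sum_{j=1}^{n_S} u_{S,j}\big)$, we have

$2B_I\Big(\sum_{j=1}^{n_N} u_{N,j}, \sum_{j=1}^{n_S} u_{S,j}\Big) \le 2\Big(B_I\Big(\sum_{j=1}^{n_N} u_{N,j}, \sum_{j=1}^{n_N} u_{N,j}\Big)\Big)^{1/2}\Big(B_I\Big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\Big)\Big)^{1/2}$

$\le B_I\Big(\sum_{j=1}^{n_N} u_{N,j}, \sum_{j=1}^{n_N} u_{N,j}\Big) + B_I\Big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\Big).$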
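The two-step bound on the cross term (Cauchy-Schwarz followed by the arithmetic-geometric mean inequality) holds for any symmetric positive definite bilinear form. A small numerical illustration follows; the matrix `K` is a random stand-in for the form $B_I$, and `u_N`, `u_S` are arbitrary vectors standing in for the summed nodal and side components, so none of these names come from the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((8, 8))
K = G @ G.T + np.eye(8)          # SPD matrix standing in for the form B_I

def form(u, v):
    """Bilinear form u^T K v."""
    return u @ K @ v

u_N = rng.standard_normal(8)     # stand-in for the sum of nodal components
u_S = rng.standard_normal(8)     # stand-in for the sum of side components

cross = 2.0 * form(u_N, u_S)
cauchy_schwarz = 2.0 * np.sqrt(form(u_N, u_N) * form(u_S, u_S))
agm = form(u_N, u_N) + form(u_S, u_S)

# cross <= cauchy_schwarz <= agm, hence B(u_N + u_S, u_N + u_S) <= 2 * agm
print(cross, cauchy_schwarz, agm)
```

This is why the nodal and side contributions can be bounded separately at the cost of a factor of 2, which is absorbed into the generic constant $C$.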

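Now consider $B_I\big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\big)$:

$B_I\Big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\Big) \le B_I\Big(\sum_{j=1}^{n_S} \tilde u_{S,j}, \sum_{j=1}^{n_S} \tilde u_{S,j}\Big) \le C\sum_{K=1}^{n_K} {}_0\Big\|\sum_{j=1}^{p_K-1} \tilde u_{S,jK}\Big\|^2_{1/2,\gamma_{IK}},$

since each edge can have a maximum of $p_K - 1$ basis functions.

Now using Lemma 4 (section 7) we have

$\le C\sum_{K=1}^{n_K}\Big(\Big|\sum_{j=1}^{p_K-1} \tilde u_{S,jK}\Big|^2_{1/2,\gamma_{IK}} + C(1+\log p)\Big\|\sum_{j=1}^{p_K-1} \tilde u_{S,jK}\Big\|^2_{L^\infty(\gamma_{IK})}\Big)$

$\le C\sum_{K=1}^{n_K}\Big(\Big|\sum_{j=1}^{p_K-1} u_{S,jK}\Big|^2_{1/2,\gamma_{IK}} + C(1+\log p)\Big\|\sum_{j=1}^{p_K-1} u_{S,jK}\Big\|^2_{L^\infty(\gamma_{IK})}\Big),$

since $u_{S,jK}$ and $\tilde u_{S,jK}$ coincide on $\gamma_{IK}$. Again using the trace theorem on the first term,

$\le C\sum_{K=1}^{n_K}\Big(\Big\|\sum_{j=1}^{p_K-1} u_{S,jK}\Big\|^2_{1,\omega_K} + C(1+\log p)\Big\|\sum_{j=1}^{p_K-1} u_{S,jK}\Big\|^2_{L^\infty(\gamma_{IK})}\Big),$

and Theorem 1 on the second term,

$\le C\sum_{K=1}^{n_K}\Big(\Big\|\sum_{j=1}^{p_K-1} u_{S,jK}\Big\|^2_{1,\omega_K} + C(1+\log p)\left(1+\log\frac{Hp}{h}\right)\Big\|\sum_{j=1}^{p_K-1} u_{S,jK}\Big\|^2_{1,\omega_K}\Big).$

Thus

(27) $B_I\Big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\Big) \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right)\sum_{K=1}^{n_K} B_I\Big(\sum_{j=1}^{p_K-1} u_{S,jK}, \sum_{j=1}^{p_K-1} u_{S,jK}\Big).$

Combining the above two inequalities, we have

(28) $B_I(u,u) \le C(1+\log p)\left(1+\log\frac{Hp}{h}\right) C_I^2(u,u).$

Again applying the Cauchy-Schwarz and arithmetic-geometric mean inequalities to $B_I\big(\sum_{j=1}^{p_K-1} u_{S,jK}, \sum_{j=1}^{p_K-1} u_{S,jK}\big)$, we can expand (27) as

(29) $B_I\Big(\sum_{j=1}^{n_S} u_{S,j}, \sum_{j=1}^{n_S} u_{S,j}\Big) \le Cp(1+\log p)\left(1+\log\frac{Hp}{h}\right)\sum_{k=1}^{n_S} B_I(u_{S,k}, u_{S,k}).$

Thus,

(30) $B_I(u,u) \le Cp(1+\log p)\left(1+\log\frac{Hp}{h}\right) C_I^1(u,u).$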

The major result immediately follows as Corollary 1.

COROLLARY 1. If the conditions in Theorem 3 hold, then the condition number of the system ($\kappa = \kappa(C^{-1}B)$) satisfies

$\kappa \le Cp(1+\log p)^2\left(1+\log\frac{Hp}{h}\right)^2 \quad \text{for } C = \sum_{I=1}^{N_D} C_I^1,$

$\kappa \le C(1+\log p)^2\left(1+\log\frac{Hp}{h}\right)^2 \quad \text{for } C = \sum_{I=1}^{N_D} C_I^2,$

where $C$ is independent of $H$, $h$, and $p$. □

Remarks. As we shall see in the next section on numerical experiments, this bound can be pessimistic.

We note here that the Ainsworth preconditioner has a sharper bound on the condition number ($O((1+\log\frac{Hp}{h})^2)$). However, the extremely complicated nature of the sequence of inner products used to define the preconditioner makes it more expensive to implement.

6. Numerical results. The numerical results presented in this section are obtained by applying the strategies described in section 4 to the Poisson problem ($-\Delta u = f$) in two dimensions with $\Omega$ a unit square. Condition number estimates of the reduced system, iteration counts of the iteration scheme, and parallel efficiency of the domain decomposition algorithm are presented. The results show that the conditioning of the reduced system is dramatically improved, fast convergence is achieved in terms of low iteration counts for the iteration scheme, and reasonably good parallel efficiencies are obtained for highly nonuniform adaptive hp meshes.

We compare the performance of the preconditioners introduced in the previous sections, (20), (21), and (22), which we shall denote as HPP0, HPP1, and HPP2, respectively, with the original system without any preconditioning. The condition number estimates are calculated using an extended version of the Lanczos connection, which is derived in [15], to PCG methods. In each case the PCG iterations are carried out until the residual (Euclidean norm) is reduced to machine accuracy of $10^{-15}$.
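For readers unfamiliar with the link between conjugate gradients and the Lanczos process, the following sketch shows the standard textbook connection, not the extended version derived in [15]: the PCG step lengths and direction-update coefficients define a symmetric tridiagonal matrix whose extreme eigenvalues approximate those of the preconditioned operator. The function name, the Jacobi preconditioner, and the 1D Laplacian test matrix are all assumptions made for illustration.

```python
import numpy as np

def pcg_with_condition_estimate(A, b, apply_Minv, tol=1e-12, max_iter=500):
    """Preconditioned CG; returns (x, iterations, condition number estimate)."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    r0_norm = np.linalg.norm(r)
    alphas, betas = [], []
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        alphas.append(alpha)
        if np.linalg.norm(r) <= tol * r0_norm:   # relative Euclidean residual
            break
        z = apply_Minv(r)
        rz_new = r @ z
        beta = rz_new / rz
        betas.append(beta)
        p = z + beta * p
        rz = rz_new
    # Lanczos tridiagonal matrix built from the CG coefficients.
    m = len(alphas)
    T = np.zeros((m, m))
    for j in range(m):
        T[j, j] = 1.0 / alphas[j]
        if j > 0:
            T[j, j] += betas[j - 1] / alphas[j - 1]
        if j < m - 1:
            T[j, j + 1] = T[j + 1, j] = np.sqrt(betas[j]) / alphas[j]
    ritz = np.linalg.eigvalsh(T)                 # Ritz values, ascending
    return x, m, ritz[-1] / ritz[0]

# Usage on a small 1D Laplacian with a Jacobi (diagonal) preconditioner.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x, iters, kappa = pcg_with_condition_estimate(A, b, lambda r: r / np.diag(A))
print(iters, kappa)
```

The estimate is a lower bound on the true condition number because the Ritz values lie inside the spectrum of the preconditioned operator; it tightens as the iteration proceeds.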

In Fig. 7, the control of conditioning by the different preconditioning strategies for increasing spectral order is shown. All the preconditioners appear to control the conditioning quite well.

In Fig. 8, the condition number is plotted against spectral order p for the case of eight subdomains with H/h = 4. It is evident that the condition number of the reduced system is less than that provided by the estimate from Theorem 3. In Fig. 9, the condition number versus the ratio H/h is plotted for the case of four subdomains with p = 2. Notice that the condition number of the reduced system, after applying the HPP1 preconditioner, is well within the theoretical bounds, while the condition number of the reduced system, after applying the HPP0 preconditioner, seems to have a larger growth rate.
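The comparison with the theoretical growth rate can be reproduced in a few lines. The sketch below evaluates the reference curve $0.2\,p\,((1+\log p)(1+\log(Hp/h)))^2$ that appears in the legends of Figs. 8 and 9 and sets it against the HPP1 condition numbers listed in Table 2 (eight subdomains, H/h = 4). Two assumptions are made here: the logarithm is taken to be the natural logarithm, and the scale factor 0.2 is read from the figure legend rather than derived from the theory.

```python
import math

def reference_curve(p, H_over_h, scale=0.2):
    """scale * p * ((1 + log p) * (1 + log(H p / h)))**2, natural log assumed."""
    return scale * p * ((1.0 + math.log(p)) * (1.0 + math.log(H_over_h * p))) ** 2

# HPP1 condition number estimates from Table 2 (eight subdomains, H/h = 4).
hpp1_table2 = {1: 1.00, 2: 5.00, 3: 7.02, 4: 10.08, 5: 12.20, 6: 15.00}

for p, measured in hpp1_table2.items():
    print(p, measured, round(reference_curve(p, 4.0), 2))
```

Under these assumptions every measured value falls below the reference curve, and the gap widens with p, which is consistent with the remark that the bound can be pessimistic.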

In Table 1, the condition number estimates and iteration counts of a four-subdomain system with the HPP0 preconditioner for various spectral orders p and ratios of subdomain size over mesh size H/h are listed. The condition number estimates for the same system but with HPP1 and HPP2 preconditioning are shown in Tables 3 and 4. For large p (p ≥ 4) HPP2 appears to improve the conditioning.

For the system resulting from a four-subdomain decomposition, the HPP0 preconditioner appears to work as well as the HPP1 preconditioner for a constant ratio

FIG. 7. Control of condition number with increase in p. Problem was decomposed into four subdomains with H/h = 2. [Plot: condition number (logarithmic axis) versus spectral order p for HPP0, HPP1, HPP2, and the original system.]

FIG. 8. Plot of condition number: eight subdomains with H/h = 4. [Plot: condition number versus spectral order p for HPP0 and HPP1, with the reference curve 0.2 p ((1 + log p)(1 + log(Hp/h)))^2.]

of H/h. However, the HPP0 preconditioner cannot control the conditioning of the system as the ratio H/h increases. The HPP1 preconditioner, on the other hand, improved the conditioning of the reduced system as predicted by the theory.

For the reduced system resulting from an eight-subdomain decomposition, there is a large difference in condition number between using HPP0 and HPP1 preconditioners. The results in Tables 2 and 5 show that the condition number of the reduced system, obtained by applying the HPP1 preconditioner, is much smaller than that after applying the HPP0 preconditioner.

FIG. 9. Plot of condition number: four subdomains with p = 2. [Plot: condition number versus H/h for HPP0 and HPP1, with the reference curve 0.2 p ((1 + log p)(1 + log(Hp/h)))^2.]

TABLE 1
Condition number estimates and iteration counts for system decomposed into four subdomains and preconditioned with an HPP0-type preconditioner. Condition numbers for the original system with no preconditioning are listed in parentheses.

      H/h = 1          H/h = 2          H/h = 4          H/h = 8
p     Cond#    Itr#    Cond#    Itr#    Cond#    Itr#    Cond#    Itr#
1     1.00     1       2.45     3       5.50     7       12.28    15
      (5.07)           (11.98)          (.362E2)         (.123E3)
2     3.45     3       4.65     7       6.83     15      12.53    28
      (.235E2)         (.594E2)         (.838E2)
3     5.19     4       5.93     11      7.55     21
      (.187E3)         (.326E3)         (.399E3)
4     7.29     7       8.63     15      9.91     26
      (.762E3)         (.130E4)
5     9.13     9       10.09    19
      (.227E4)         (.341E4)
6     11.14    11      12.63    22
      (.560E4)         (.862E4)
7     12.96    13      17.52    23
      (.127E5)         (.170E5)
8     15.07    15      21.30    25
      (.242E5)

TABLE 2
Condition number estimates and iteration counts for the eight-subdomain system (H/h = 4).

              HPP0             HPP1             HPP2
p    dof      Cond#    Itr#    Cond#    Itr#    Cond#    Itr#
1    81       7.90     21      1.00     1       1.00     1
2    289      IVI·I    23      5.00     28      5.00     28
3    625      12.Hi    36      7.02     30      7.15     31
4    1089     14.69    43      10.08    37      9.77     35
5    1681     15.63    43      12.20    38      11.6     36
6    2401     18.85    48      15.00    44      13.65    39

TABLE 3
Condition number estimates and iteration counts for system decomposed into four subdomains and preconditioned with an HPP1-type preconditioner. Condition numbers for the original system with no preconditioning are listed in parentheses.

      H/h = 1          H/h = 2          H/h = 4          H/h = 8
p     Cond#    Itr#    Cond#    Itr#    Cond#    Itr#    Cond#    Itr#
1     1.00     1       1.00     1       1.00     1       1.00     1
      (5.07)           (11.98)          (.362E2)         (.123E3)
2     3.45     3       3.79     7       3.82     15      3.89     23
      (.235E2)         (.594E2)         (.838E2)
3     5.19     4       5.14     11      5.31     20
      (.187E3)         (.326E3)         (.399E3)
4     7.29     7       7.58     15      7.84     24
      (.762E3)         (.130E4)
5     9.13     9       9.09     17
      (.227E4)         (.341E4)
6     11.14    11      11.35    20
      (.127E5)         (.170E5)
7     12.96    13      12.93    22
      (.127E5)         (.170E5)
8     15.07    15      15.10    25
      (.242E5)

TABLE 4
Condition number estimates and iteration counts for system decomposed into four subdomains and preconditioned with an HPP2-type preconditioner. Condition numbers for the original system with no preconditioning are listed in parentheses.

      H/h = 1          H/h = 2          H/h = 4          H/h = 8
p     Cond#    Itr#    Cond#    Itr#    Cond#    Itr#    Cond#    Itr#
1     1.00     1       1.00     1       1.00     1       1.00     1
      (5.07)           (11.98)          (.362E2)         (.123E3)
2     3.45     3       3.79     7       3.82     15      3.89     23
      (.235E2)         (.594E2)         (.838E2)
3     5.19     4       5.14     11      5.31     20
      (.187E3)         (.326E3)         (.399E3)
4     6.72     5       7.31     13      7.84     24
      (.762E3)         (.130E4)
5     8.07     6       8.64     14
      (.227E4)         (.341E4)
6     9.29     7       10.1     15
      (.127E5)         (.170E5)
7     10.4     8       11.2     16
      (.127E5)         (.170E5)
8     11.44    8       12.4     16
      (.242E5)

TABLE 5
Condition number estimates and iteration counts for the eight-subdomain system (H/h = 8).

              HPP0             HPP1
p    dof      Cond#    Itr#    Cond#    Itr#
1    289      16.77    32      1.00     1
2    1089     20.74    41      4.81     28
3    2401     21.37    44      6.72     30

Table 6 displays the effectiveness of the HPP1 preconditioner for an hp adaptive mesh. The mesh used in this calculation is shown in Fig. 10. This mesh is obtained using an hp adaptive strategy described in [13] and corresponds to a nonhomogeneous

TABLE 6
Condition number estimates and iteration counts for an adaptive hp mesh (dof = 660, p = 6).

Horizontal decomposition
         HPP0             HPP1
H/h      Cond#    Itr#    Cond#    Itr#
32       9.50     30      5.20     21

H/IIII 8 II/I ,..... I. II

FIG. 10. hp adaptive mesh (dof = 660). [Mesh plot with elements shaded by spectral order p, from 2 to 8.]

Poisson problem with exact solution $u = \tan^{-1}(\alpha(x + y - x_0))\,x(1-x)\,y(1-y)$ for $\alpha = 50$. The condition number and iteration counts of the reduced system, after applying the HPP1 preconditioner, are compared with those obtained with the HPP0 preconditioner. Two types of decompositions are attempted: a strip-type decomposition in which the domain is decomposed into strips (called a "horizontal decomposition" in the table), and a second method in which the subdomains are more compact (called a "cross decomposition" in the table). The ratio H/h for the "cross decomposition" is about half that of the horizontal decomposition. The performance of the HPP1 preconditioner is quite dramatic. The number of iterations is five for HPP1 as opposed to 26 for HPP0. Further, use of an HPP0 preconditioner on the unreduced problem did not converge, and the condition number was estimated to be O(10'J). Comparing the two decompositions, it is also apparent that the smaller the ratio H/h, the better the conditioning of the reduced system.
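To make the test problem concrete, the manufactured solution can be coded directly. This is a minimal sketch: the source does not state the value of $x_0$, so it is left as a required argument, and the value used in the usage lines is an arbitrary choice for illustration only. The source term $f = -\Delta u$ is approximated here by central differences rather than derived in closed form.

```python
import numpy as np

def exact_solution(x, y, x0, alpha=50.0):
    """u = arctan(alpha * (x + y - x0)) * x(1 - x) * y(1 - y) on the unit square."""
    return np.arctan(alpha * (x + y - x0)) * x * (1.0 - x) * y * (1.0 - y)

def source_term(x, y, x0, alpha=50.0, h=1e-4):
    """Approximate f = -Laplace(u) with second-order central differences."""
    u = lambda a, b: exact_solution(a, b, x0, alpha)
    lap = (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
           - 4.0 * u(x, y)) / h**2
    return -lap

# Usage: x0 = 1.0 is a hypothetical value, not taken from the paper.
t = np.linspace(0.0, 1.0, 11)
print(exact_solution(t, 0.0 * t, x0=1.0))   # values on the edge y = 0
print(source_term(0.5, 0.5, x0=1.0))
```

The polynomial factor $x(1-x)y(1-y)$ makes the solution vanish on the whole boundary of the unit square, while the arctangent produces a steep internal layer along the line $x + y = x_0$, which is what drives the nonuniform hp refinement.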

We now explore the effect of using different spectral orders p on a uniform grid of 64 elements. In Fig. 11, both iteration counts and condition number estimates are

FIG. 11. Iteration counts and condition numbers for HPP1 preconditioner for different spectral orders (four subdomains). [Plot: iteration count and condition number estimate versus spectral order p.]

FIG. 12. Comparison of HPP1 solver with conventional Jacobi iterative solver. [Plot: curves labeled iteration count and condition number estimate, with Jacobi and with HPP1, versus number of iterations from 0 to 300, logarithmic vertical axis.]

plotted against p (ranging from 2 to 8). The condition number is controlled so as to remain under 16. Figure 12 shows the residual and condition number estimation against iteration count for HPP1 and HPP0.

Figure 13 demonstrates the parallel efficiency of the solver on an Intel iPSC/860 for 2, 4, and 8 processors. We observe that the speedup is almost 5 from 2 to 4 processors, and 2.5 from 4 to 8 processors. This is because of the reduction in block size. The computational effort in "partial orthogonalization," ensuring that conditions

FIG. 13. Parallel efficiency on Intel iPSC/860. [Plot: logarithmic vertical axis versus number of processors.]

(i) and (ii) in Theorem 3 are satisfied, reduces steeply with block size, leading to the very high speedup. However, reducing the block size also increases the size of the reduced system and communication overheads in a parallel environment. This raises the intriguing possibility of finding an optimal block size for such problems.
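The reported step speedups can be turned into cumulative speedups and efficiencies relative to the two-processor run. The short sketch below only does this arithmetic with the two figures quoted above (about 5 and 2.5); the helper name `cumulative_speedup` and the choice of the two-processor run as baseline are assumptions of this illustration.

```python
def cumulative_speedup(step_speedups):
    """Running product of successive speedup factors."""
    total = 1.0
    out = []
    for s in step_speedups:
        total *= s
        out.append(total)
    return out

steps = [5.0, 2.5]      # reported: about 5 from 2 to 4 processors, 2.5 from 4 to 8
procs = [4, 8]
base = 2                # baseline processor count

for P, s in zip(procs, cumulative_speedup(steps)):
    efficiency = s / (P / base)   # values above 1 indicate superlinear speedup
    print(P, s, efficiency)
```

An efficiency above 1 relative to the baseline reflects the fact that the work itself shrinks as the blocks get smaller, not only that it is shared among more processors.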

7. Auxiliary results. In this section we list some technical results which are used in the proofs in section 5. Most of these are extensions of standard results available in the literature for the h or p version finite element method. Alternate proofs of some of these results may be found in [1].

LEMMA 1 (Bramble and Xu [6]). If $\Omega$ is a bounded domain in $R^2$ and $\partial\Omega$ is Lipschitz continuous, then for any $\epsilon \in (0,1)$ and any $w \in W^{1,\infty}(\Omega)$,

(31) $\|w\|_{L^\infty(\Omega)} \le C\Big(|\log\epsilon|^{1/2}\Big(\frac{1}{H^2}\|w\|^2_{L^2(\Omega)} + |w|^2_{1,\Omega}\Big)^{1/2} + \epsilon\big(\|w\|_{L^\infty(\Omega)} + H\|\nabla w\|_{L^\infty(\Omega)}\big)\Big),$

where $H = \mathrm{diam}(\Omega)$, and $C$ is a constant independent of $H$.

Proof. This result follows immediately from a result of Bramble and Xu [6] and Xu [20] after a straightforward scaling with respect to $H$. □

The next result is the classical Markov inequality for polynomials.

LEMMA 2 (Markov). If $v$ is a polynomial of degree $p$ on $I = [-h, h]$, then

(32) $\max_I |v'(x)| \le \frac{p^2}{h}\max_I |v(x)|.$

Proof. See page 40 of Lorentz [9]. □
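Markov's inequality (32) is easy to probe numerically. In the sketch below the Chebyshev polynomial of degree p, mapped to $[-h, h]$, is used because it is the classical extremal case, and a random polynomial is included for contrast; the function name `markov_ratio` and the sampling-based maxima are assumptions of this illustration, not part of the paper.

```python
import numpy as np
from numpy.polynomial import Chebyshev, Polynomial

def markov_ratio(poly, h, samples=20001):
    """Sampled max|v'| / max|v| on [-h, h] (endpoints are included)."""
    x = np.linspace(-h, h, samples)
    return np.max(np.abs(poly.deriv()(x))) / np.max(np.abs(poly(x)))

h, p = 0.25, 5
bound = p**2 / h                                  # right-hand factor in (32)

cheb = Chebyshev.basis(p, domain=[-h, h])         # T_p mapped onto [-h, h]
rng = np.random.default_rng(2)
rand = Polynomial(rng.standard_normal(p + 1))     # random degree-p polynomial

print(markov_ratio(cheb, h), markov_ratio(rand, h), bound)
```

The Chebyshev polynomial attains the bound at the endpoints of the interval, which is why the factor $p^2/h$ in Lemma 2 cannot be improved in general.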

LEMMA 3. Let $f$ be a continuous function on $I = (a, b)$, and let $\bar I$ be partitioned into $n$ subintervals $I_k$: $\bar I = \bigcup_{k=1}^{n} \bar I_k$. Then

(33) $\sum_{k=1}^{n} |f_k|^2_{1/2,I_k} \le |f|^2_{1/2,I},$

where $f_k = f|_{I_k}$.

Proof. By definition,

$|f|^2_{1/2,I} = \int_I\int_I \Big(\frac{f(x) - f(y)}{x - y}\Big)^2 dx\,dy.$

Breaking up the integrals using $\bar I = \bigcup_k \bar I_k$ and noting that the integrands are positive, we have

$|f|^2_{1/2,I} = \int_{I_1}\int_{I_1} \Big(\frac{f(x) - f(y)}{x - y}\Big)^2 dx\,dy + \int_{I_1}\int_{I_2} \Big(\frac{f(x) - f(y)}{x - y}\Big)^2 dx\,dy + \cdots + \int_{I_n}\int_{I_n} \Big(\frac{f(x) - f(y)}{x - y}\Big)^2 dx\,dy \ge \sum_{k=1}^{n} |f_k|^2_{1/2,I_k}.$ □
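The argument of Lemma 3 (dropping the nonnegative cross terms between different subintervals) can be mirrored with a simple quadrature. The sketch below approximates the double integral with a midpoint rule, skipping the diagonal; the function name `half_seminorm_sq`, the test function, and the grid sizes are illustrative assumptions. The grids are chosen so that the subinterval points coincide with those of the full interval, so the discrete inequality holds for the same reason as the continuous one.

```python
import numpy as np

def half_seminorm_sq(f, a, b, n):
    """Midpoint-rule approximation of the squared H^{1/2} seminorm on (a, b)."""
    x = a + (np.arange(n) + 0.5) * (b - a) / n
    fx = f(x)
    dx = x[:, None] - x[None, :]
    df = fx[:, None] - fx[None, :]
    np.fill_diagonal(dx, 1.0)        # avoid 0/0; df is zero on the diagonal
    return np.sum((df / dx) ** 2) * ((b - a) / n) ** 2

f = lambda x: np.sin(3.0 * x)

whole = half_seminorm_sq(f, 0.0, 1.0, 400)
parts = half_seminorm_sq(f, 0.0, 0.5, 200) + half_seminorm_sq(f, 0.5, 1.0, 200)
print(parts, whole)
```

The difference `whole - parts` is exactly the contribution of point pairs lying in different subintervals, which is the quantity discarded in the proof.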

LEMMA 4. Let $Z_p = \{u \in P_p(I_h),\ I_h = [-h, h] \mid u(-h) = u(h) = 0,\ \forall p > 0\}$. Then

(34) ${}_0\|u\|^2_{1/2,I_h} \le |u|^2_{1/2,I_h} + C(1+\log p)\|u\|^2_{L^\infty(I_h)}.$

Proof. By definition,

${}_0\|u\|^2_{1/2,I_h} = |u|^2_{1/2,I_h} + 2\int_{-h}^{h} \frac{u(x)^2}{x + h}\,dx + 2\int_{-h}^{h} \frac{u(y)^2}{h - y}\,dy.$

We need to bound the last two integrals. Breaking up the second integral into two parts gives

$\int_{-h}^{h} \frac{u(x)^2}{h - x}\,dx = \int_{h - h/p^2}^{h} \frac{u(x)^2}{h - x}\,dx + \int_{-h}^{h - h/p^2} \frac{u(x)^2}{h - x}\,dx.$

Since $u \in Z_p(I_h)$ we have for any $x \in I_h$,

$u = (h - x)v,$

where $v \in P_{p-1}(I_h)$. Further $u' \in P_{p-1}(I_h)$. Hence we have from the Markov inequality

$|u(x)| \le (h - x)\max_x |v(x)| \le C(h - x)\max_x |u'(x)| \le C(h - x)\frac{p^2}{h}\|u\|_{L^\infty(I_h)}.$

Applying this to the first part gives

$\int_{-h}^{h} \frac{u(x)^2}{h - x}\,dx \le C\|u\|^2_{L^\infty(I_h)}\Big(\int_{h - h/p^2}^{h} (h - x)\frac{p^4}{h^2}\,dx + \int_{-h}^{h - h/p^2} \frac{1}{h - x}\,dx\Big).$

Computing the integrals gives

$\int_{-h}^{h} \frac{u(x)^2}{h - x}\,dx \le C\|u\|^2_{L^\infty(I_h)}\Big\{\frac{1}{2} + 2\log p + \log 2\Big\} \le C(1+\log p)\|u\|^2_{L^\infty(I_h)}.$

The first integral can be bounded similarly. □

LEMMA 5. Let $F_{h,p}(I) = \bigcup_k Z_{p_k}$, $I = (-H, H)$, $\bar I = \bigcup_k \bar I_k$. If $u \in F_{h,p}(I)$ such that $u_k = u|_{I_k}$, then

(35) $\sum_k {}_0\|u_k\|^2_{1/2,I_k} \le |u|^2_{1/2,I} + C(1+\log p)\|u\|^2_{L^\infty(I)}.$

Proof. Combining the previous two lemmas, we have

$\sum_k {}_0\|u_k\|^2_{1/2,I_k} \le |u|^2_{1/2,I} + C(1+\log p)\|u\|^2_{L^\infty(I)}.$

That completes the proof. □

We conclude this section with a simple lemma on extensions of functions on a single master element introduced by Babuška and Suri [2].

LEMMA 6 (Babuška and Suri [2]). Let the square $S = \{(x, y) : |x| \le 1 \text{ and } |y| \le 1\}$ and $f$ such that $f(A) = f(B) = f(C) = f(D) = 0$, where $A$, $B$, $C$, $D$ are the vertices of the square. Further, let $f_i = f|_{\gamma_i}$, $i = 1, 2, 3, 4$, where $\gamma_i$ denotes the sides of the square. Now let $f_1 \in P_p(\gamma_1)$ and $f_2 = f_3 = f_4 = 0$. Then there exists $V \in P_p(S)$ such that $V = f$ on $\partial S$ and

$\|V\|^2_{1,S} \le C\,{}_0\|f_1\|^2_{1/2,\gamma_1},$

where $C$ is independent of $p$ and $f$.

Proof. This is a special case of the more general result in Babuška and Suri [2], where $f_i \ne 0$, $i = 2, 3, 4$. □

8. Conclusions. In this paper, a parallel domain decomposition solver for adaptive hp finite element methods for two-dimensional elliptic problems is developed and analyzed. The condition number for the solver is proved to be bounded by $C(1+\log p)^2(1+\log\frac{Hp}{h})^2$ and $Cp(1+\log p)^2(1+\log\frac{Hp}{h})^2$ for different choices of preconditioners. In practical calculations, we generally take $p \le 9$ and $H/h \ge 4$. For our approach, results of estimated condition number and iteration counts show that the conditioning of the reduced system is controlled, and fast convergence is achieved as predicted in the theoretical analysis. Parallel results are obtained on an Intel iPSC/860 and networks of workstations. Good parallel efficiencies are demonstrated.

Acknowledgment. We wish to acknowledge Dr. Mark Ainsworth's suggestions for improving earlier drafts of this work.

REFERENCES

[1] M. AINSWORTH, A Preconditioner Based on Domain Decomposition of hp Finite Element Approximation on Quasi-Uniform Meshes, Mathematical and Computer Science Technical Reports, No. 16, University of Leicester, Leicester, U.K., 1993.

[2] I. BABUŠKA AND M. SURI, The h-p version of the finite element method with quasiuniform meshes, Math. Modelling and Numer. Anal., 21 (1987), pp. 199-238.

[3] I. BABUŠKA, A. CRAIG, J. MANDEL, AND J. PITKÄRANTA, Efficient preconditioning for the p version finite element method in two dimensions, SIAM J. Numer. Anal., 28 (1991), pp. 624-661.

[4] J. BRAMBLE, J. PASCIAK, AND A. SCHATZ, The construction of preconditioners for elliptic problems by substructuring I, Math. Comp., 47 (1986), pp. 103-134.

[5] J. H. BRAMBLE, R. H. EWING, R. R. PARASHKEVOV, AND J. PASCIAK, Domain decomposition for problems with partial refinement, SIAM J. Sci. Statist. Comput., 13 (1992), pp. 397-410.

[6] J. H. BRAMBLE AND J. XU, Some estimates for a weighted L2 projection, Math. Comp., 56 (1991), pp. 463-476.

[7] L. DEMKOWICZ, J. T. ODEN, W. RACHOWICZ, AND O. HARDY, Toward a universal hp adaptive finite element strategy, Part 1. Constrained approximation and data structure, Comput. Methods Appl. Mech. Engrg., 77 (1989), pp. 79-112.

[8] M. DRYJA, An additive Schwarz algorithm for two- and three-dimensional finite element elliptic problem, in Domain Decomposition Methods, T. Chan, R. Glowinski, J. Periaux, and O. Widlund, eds., SIAM, Philadelphia, 1989.

[9] G. G. LORENTZ, Approximation of Functions, Chelsea Publishing Co., New York, 1966.

[10] J. MANDEL, Efficient Domain Decomposition Preconditioning for the p-Version Finite Element Method in Three Dimensions, Tech. report, University of Colorado at Denver, 1989.

[11] J. MANDEL, Iterative solvers by substructuring for the p-version finite element method, Comput. Methods Appl. Mech. Engrg., 80 (1990), pp. 117-128.

[12] J. MANDEL, Two level domain decomposition preconditioning for the p-version finite element method in three dimensions, Internat. J. Numer. Methods Engrg., 29 (1990), pp. 1095-1108.

[13] J. T. ODEN, A. PATRA, AND Y. S. FENG, An hp adaptive strategy, in Adaptive, Multilevel, and Hierarchical Computational Strategies, A. K. Noor, ed., AMD, Vol. 157, 1992, pp. 23-46.

[14] J. T. ODEN, A. PATRA, AND Y. S. FENG, Domain decomposition solver for adaptive hp finite elements, in 7th International Conference on Domain Decomposition, State College, PA, 1993.

[15] Y. S. FENG, Optimal hp FEMs: Theory and Applications to Fluid Dynamics, Ph.D. dissertation, University of Texas at Austin, Austin, TX, 1995.

[16] A. PATRA AND J. T. ODEN, Problem decomposition for adaptive hp finite element methods, Computing Systems in Engineering, Vol. 6, 2 (1995), pp. 97-109.

[17] L. PAVARINO, Domain Decomposition for p Version Finite Element Methods, Ph.D. dissertation, Courant Institute, New York University, New York, 1993.

[18] W. RACHOWICZ, J. T. ODEN, AND L. DEMKOWICZ, Toward a universal hp adaptive finite element strategy: Part 3: Design of hp meshes, Comput. Methods Appl. Mech. Engrg., 77 (1989), pp. 181-212.

[19] O. B. WIDLUND, Iterative substructuring methods: Algorithms and theory for elliptic problems in the plane, in 1st International Symposium on Domain Decomposition Methods for Partial Differential Equations, R. Glowinski, G. H. Golub, G. A. Meurant, and J. Periaux, eds., SIAM, Philadelphia, 1988.

[20] J. XU, Theory of Multilevel Methods, Report AM 48, Department of Mathematics, Pennsylvania State University, State College, PA, 1989.

