
Iterative Methods for Nonlinear Operator Equations

A. T. Chronopoulos
Department of Computer Science
University of Minnesota at Minneapolis
Minneapolis, Minnesota 55455

and

Z. Zlatev
Air Pollution Laboratory
Danish Agency of Environmental Protection
Risoe National Laboratory
DK-4000 Roskilde, Denmark

APPLIED MATHEMATICS AND COMPUTATION 51:167-180 (1992)
© Elsevier Science Publishing Co., Inc., 655 Avenue of the Americas, New York, NY 10010

ABSTRACT

A nonlinear conjugate gradient method has been introduced and analyzed by J. W. Daniel. This method applies to nonlinear operators with symmetric Jacobians. The conjugate gradient method applied to the normal equations can be used to approximate the solution of general nonsymmetric linear systems of equations if the condition number of the coefficient matrix is small. In this article, we obtain nonlinear generalizations of this method which apply directly to nonlinear operator equations. Under conditions on the Hessian and the Jacobian of the operators, we prove that these methods converge to a unique solution. Error bounds and local convergence results are also obtained.

1. INTRODUCTION

Nonlinear systems of equations often arise when solving initial or boundary value problems in ordinary or partial differential equations. We consider the nonlinear system of equations

F(x) = 0,   (1.1)

where F(x) is a nonlinear operator from a real Euclidean space of dimension N or a Hilbert space into itself.


The Newton method coupled with Gaussian elimination is an efficient way to solve such nonlinear systems when the dimension of the Jacobian is small. When the Jacobian is large and sparse, some kind of iterative method may be used. This can be a nonlinear iteration (for example, functional iteration for contractive operators) or an inexact Newton method. In an inexact Newton method, the solution of the resulting linear systems is approximated by a linear iterative method (cf. [15], [6]).
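To make the inexact Newton idea concrete, here is a minimal sketch (our illustration, not taken from the paper) in which each Newton correction is computed only approximately by a few steepest-descent steps on the associated least-squares functional; the callables F and J, the tolerances, and the iteration counts are all hypothetical placeholders.

```python
import numpy as np

def inexact_newton(F, J, x0, tol=1e-8, max_outer=50, inner_steps=20):
    """Sketch of an inexact Newton method: each Newton system
    J(x) s = -F(x) is solved only approximately, here by a fixed number
    of steepest-descent steps on 0.5*||J(x) s + F(x)||^2."""
    x = x0.astype(float)
    for _ in range(max_outer):
        r = -F(x)                        # right-hand side of the Newton system
        if np.linalg.norm(r) < tol:
            break
        A = J(x)                         # Jacobian at the current iterate
        s = np.zeros_like(x)             # approximate Newton correction
        for _ in range(inner_steps):     # inner linear iteration (inexact solve)
            g = A.T @ (r - A @ s)        # negative gradient of the least-squares functional
            Ag = A @ g
            denom = Ag @ Ag
            if denom == 0.0:
                break
            s = s + (g @ g) / denom * g  # exact steplength along g
        x = x + s
    return x
```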

Nonlinear steepest descent methods for the minimal residual and normal equations have been studied by many authors (cf. [12] and [14]). R. Fletcher and C. M. Reeves [8], and J. W. Daniel [4] have obtained a nonlinear conjugate gradient method that converges if the Jacobian is symmetric and uniformly positive definite. These nonlinear methods reduce to the standard conjugate gradient methods for linear systems. They are based on exact line search at each iteration and thus must solve a scalar nonlinear minimization problem in order to determine the steplengths. Several authors have suggested inexact line search and have given conditions under which these methods would still converge [8]. This is done to avoid solving exactly the scalar minimization problem, whose derivative evaluation involves evaluation of the nonlinear operator.

The conjugate gradient method applied to the normal equations can be used to solve iteratively nonsymmetric linear systems when the condition number of the Jacobian is small. Some preconditioning applied to the original linear system can be used to achieve this goal. Two algorithms exist for the conjugate gradient method applied to the normal equations: CGNR [1, 11] and CGNE [1, 3] (or Craig's method).

In this article we obtain a nonlinear extension of the conjugate gradient methods applied to the normal equations. We assume that the Jacobian and the Hessian of the nonlinear operator are uniformly bounded. We prove global convergence and local convergence results for the nonlinear algorithms. We also give asymptotic steplength estimates and error bounds. These steplengths can be used in implementing these methods. In section 2, we review the CGNR and CGNE methods. In section 3, we derive a nonlinear CGNR method and prove global convergence. In section 4, we derive a nonlinear CGNE method and prove local convergence. In section 5, we obtain asymptotic steplength and error estimates.

2. THE CONJUGATE GRADIENT APPLIED TO THE NORMAL EQUATIONS

Let us consider the system of linear equations Ax = f, where A is a nonsingular, nonsymmetric matrix of order N. This system can be solved by either of the two normal equations systems:

A^T A x = A^T f,   (2.1)


A A^T y = f,  x = A^T y.   (2.2)

Since both A^T A and A A^T have the same spectrum, we can apply CG to either system to obtain an approximate solution of Ax = f.

The CGNR method [11] applies CG to (2.1). Then x_n minimizes the norm of the residual error E(x_n) = ||f - A x_n||_2 over the affine Krylov subspace

x_0 + span{A^T r_0, (A^T A) A^T r_0, ..., (A^T A)^{n-1} A^T r_0},

    and the resulting algorithm is the following.

ALGORITHM 2.1. The CGNR algorithm.
Initial vector x_0
r_0 = f - A x_0, p_0 = A^T r_0
For n = 0 Until Convergence Do
1. a_n = (A^T r_n, A^T r_n)/(A p_n, A p_n)
2. x_{n+1} = x_n + a_n p_n and r_{n+1} = r_n - a_n A p_n
3. p_{n+1} = A^T r_{n+1} + b_n p_n, where b_n = -(A A^T r_{n+1}, A p_n)/||A p_n||^2
End For.

The CGNE method [3] applies CG to (2.2). Then x_n minimizes the norm of the error E(x_n) = ||x* - x_n||_2 over the same affine Krylov subspace as CGNR, and the resulting algorithm is the following.

ALGORITHM 2.2. The CGNE algorithm.
Initial vector x_0
r_0 = f - A x_0, p_0 = A^T r_0
For n = 0 Until Convergence Do
1. a_n = (r_n, r_n)/(p_n, p_n)
2. x_{n+1} = x_n + a_n p_n and r_{n+1} = r_n - a_n A p_n
3. p_{n+1} = A^T r_{n+1} + b_n p_n, where b_n = -(A^T r_{n+1}, p_n)/||p_n||^2
End For.

Since the spectra of the matrices A A^T and A^T A are the same, we should expect that the performance of CGNR and CGNE is the same. However, CGNE minimizes the norm of the error and may yield better performance.
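For reference, the following NumPy sketch implements both algorithms as reconstructed above; the convergence test and the standard, mathematically equivalent formulas used for b_n are our own simplifications rather than the paper's exact statements.

```python
import numpy as np

def cgnr(A, f, x0, tol=1e-10, max_iter=1000):
    """CGNR: CG applied to A^T A x = A^T f; minimizes the residual norm ||f - A x||."""
    x = x0.astype(float)
    r = f - A @ x
    s = A.T @ r
    p = s.copy()
    for _ in range(max_iter):
        Ap = A @ p
        a = (s @ s) / (Ap @ Ap)        # a_n = (A^T r_n, A^T r_n)/(A p_n, A p_n)
        x = x + a * p
        r = r - a * Ap                 # r_{n+1} = r_n - a_n A p_n
        if np.linalg.norm(r) < tol:
            break
        s_new = A.T @ r
        b = (s_new @ s_new) / (s @ s)  # equivalent to the orthogonalization form of b_n
        p = s_new + b * p
        s = s_new
    return x

def cgne(A, f, x0, tol=1e-10, max_iter=1000):
    """CGNE (Craig's method): CG applied to A A^T y = f with x = A^T y;
    minimizes the error norm ||x* - x||."""
    x = x0.astype(float)
    r = f - A @ x
    p = A.T @ r
    for _ in range(max_iter):
        a = (r @ r) / (p @ p)          # a_n = (r_n, r_n)/(p_n, p_n)
        x = x + a * p
        r_new = r - a * (A @ p)
        if np.linalg.norm(r_new) < tol:
            break
        b = (r_new @ r_new) / (r @ r)  # standard CG coefficient, equivalent to Algorithm 2.2's b_n
        p = A.T @ r_new + b * p
        r = r_new
    return x
```

Either routine can be exercised on a modestly conditioned nonsymmetric test matrix (for example, a random matrix plus a multiple of the identity), which is exactly the regime in which the normal-equations approach is effective.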


The CGNE method is sometimes called Craig's method because it was first proposed by E. J. Craig.

The following error bound can be obtained [5] for the error functional E(x):

E(x_n) ≤ 2 [(1 - 1/ρ)/(1 + 1/ρ)]^n E(x_0),   (2.3)

where ρ = ||A||_2 ||A^{-1}||_2 is the condition number of the matrix.

3. THE NONLINEAR CGNR METHOD

In this section, we generalize the CGNR iteration to a nonlinear iteration which requires the solution of a scalar equation to determine the steplength. We then prove a global convergence result under the assumptions that the Hessian and the Jacobian are uniformly bounded.

Let F(x) be an operator mapping the Euclidean space R^N (or, even more generally, a real Hilbert space) into itself. The notation F'(x) and F''(x) will be used to denote the Fréchet and Gâteaux derivatives, respectively. Also, for simplicity, F_n and F'_n will denote F(x_n) and F'(x_n), respectively. We seek to solve iteratively the nonlinear system of equations F(x) = 0. In the linear case F(x) = Ax - b and F'(x) = A.

Assume that F'(x) and F''(x) exist at all x and that there exist scalars 0 < m < M, 0 < B, independent of x, so that the following conditions are satisfied for any vectors x and u:

m^2 ||u||^2 ≤ ((F'(x)^T F'(x))u, u) ≤ M^2 ||u||^2,   (3.1a)

||F''(x)|| ≤ B.   (3.1b)

REMARK 3.0. (i) The symmetric definite operators F'(x)^T F'(x) and F'(x) F'(x)^T have the same eigenvalues. Thus, the following inequality holds:

m^2 ||u||^2 ≤ (F'(x) F'(x)^T u, u) ≤ M^2 ||u||^2.

(ii) The left inequality in (3.1a) and the inverse function theorem for differential operators imply that the inverse operator F^{-1} exists and is differentiable.


From the left inequality in (3.1a) and the inverse function theorem, we conclude that F^{-1} exists and is differentiable at all x. We use the mean value theorem for the operator F^{-1} to obtain the following equation:

(y - x, y - x) = ((F^{-1})'(w)(F(y) - F(x)), y - x) for some intermediate point w.

Combining this with (3.1a) we obtain:

||y - x||^2 ≤ (1/m) ||F(y) - F(x)|| ||y - x||.

This inequality implies that

m||y - x|| ≤ ||F(y) - F(x)||.   (3.2)

By use of the mean value theorem for the operator F(x) and assumption (3.1a), we obtain the following inequality:

||F(y) - F(x)|| ≤ M||y - x||.   (3.3)

Under assumptions (3.1), we consider the following nonlinear generalization of CGNR.

ALGORITHM 3.1. The Nonlinear CGNR Algorithm.
Initial vector x_0
r_0 = -F(x_0), p_0 = F'_0^T r_0
For n = 0 Until Convergence Do
1. Select the smallest positive c_n to minimize ||F(x_n + c p_n)||, c > 0
2. x_{n+1} = x_n + c_n p_n and r_{n+1} = -F(x_{n+1})
3. b_n = -(F'_{n+1} F'_{n+1}^T r_{n+1}, F'_{n+1} p_n)/||F'_{n+1} p_n||^2, where p_{n+1} = F'_{n+1}^T r_{n+1} + b_n p_n
End For
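Before turning to the orthogonality relations that define c_n and b_n, here is a minimal runnable sketch of Algorithm 3.1 (our own rendering, under the stated assumptions); the exact line search of step 1 is approximated by SciPy's bounded scalar minimizer, and F, jac, and the bracket c_max are user-supplied assumptions, not part of the paper's statement.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nonlinear_cgnr(F, jac, x0, tol=1e-8, max_iter=200, c_max=10.0):
    """Sketch of the nonlinear CGNR method (Algorithm 3.1); step 1's exact
    line search is replaced by a bounded minimization of ||F(x_n + c p_n)||."""
    x = x0.astype(float)
    r = -F(x)
    p = jac(x).T @ r
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        # Step 1: approximate the smallest positive minimizer c_n on (0, c_max)
        c = minimize_scalar(lambda t: np.linalg.norm(F(x + t * p)),
                            bounds=(0.0, c_max), method='bounded').x
        # Step 2: advance the iterate and recompute the nonlinear residual
        x = x + c * p
        r = -F(x)
        # Step 3: b_n = -(F' F'^T r_{n+1}, F' p_n)/||F' p_n||^2, then update p
        Jn = jac(x)
        Jp = Jn @ p
        b = -((Jn @ (Jn.T @ r)) @ Jp) / (Jp @ Jp)
        p = Jn.T @ r + b * p
    return x
```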

The scalars c_n and b_n are defined to guarantee the following two orthogonality relations:

(r_n, F'_n p_{n-1}) = 0   (3.4)


and

(F'_n p_n, F'_n p_{n-1}) = 0.   (3.5)

Under the assumptions (3.1), the following lemma holds.

LEMMA 3.1. Let {r_n} be the nonlinear residuals and {p_n} be the direction vectors in Algorithm 3.1; then the following identities hold:

(i) (r_n, F'_n p_n) = ||F'_n^T r_n||^2
(ii) ||p_n||^2 = ||F'_n^T r_n||^2 + b_{n-1}^2 ||p_{n-1}||^2
(iii) ||F'_n F'_n^T r_n||^2 = ||F'_n p_n||^2 + b_{n-1}^2 ||F'_n p_{n-1}||^2
(iv) m||r_n|| ≤ ||F'_n^T r_n|| ≤ ||p_n||
(v) ||p_n|| ≤ (M^2/m)||r_n||
(vi) ||r_{n+1}|| ≤ ||r_n||.

PROOF. The orthogonality relations (3.4) and (3.5) combined with Step 3 of Algorithm 3.1 imply (i)-(iii). Equality (ii) and (3.1a) are used in proving inequality (iv) as follows:

m||r_n|| ≤ ||F'_n^T r_n|| ≤ ||p_n||.

Equality (iii) and (3.1a) are used in proving inequality (v) as follows:

m||p_n|| ≤ ||F'_n p_n|| ≤ ||F'_n F'_n^T r_n|| ≤ M^2 ||r_n||.

Inequality (vi) follows from the definition of c_n. ∎

REMARK 3.1. Let f_n(c) denote the scalar function (1/2)||F(x_n + c p_n)||^2. Its first and second derivatives are given by:

f'_n(c) = (F'(x_n + c p_n) p_n, F(x_n + c p_n)),   (3.6)

f''_n(c) = (F''(x_n + c p_n) p_n p_n, F(x_n + c p_n)) + ||F'(x_n + c p_n) p_n||^2.   (3.7)
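As a quick numerical sanity check of (3.6), the snippet below (our own toy construction; the operator F, its Jacobian, and the chosen point are arbitrary) compares the analytic derivative with a central finite-difference quotient of f_n(c) = (1/2)||F(x_n + c p_n)||^2.

```python
import numpy as np

# Arbitrary toy operator F: R^2 -> R^2 and its Jacobian.
F = lambda x: np.array([x[0] + 0.1 * x[1] ** 2 - 1.0, 0.2 * x[0] ** 2 + x[1]])
Jac = lambda x: np.array([[1.0, 0.2 * x[1]], [0.4 * x[0], 1.0]])

x = np.array([0.3, -0.7])   # current iterate x_n (arbitrary)
p = np.array([1.0, 0.5])    # search direction p_n (arbitrary)
c, h = 0.2, 1e-6

f = lambda t: 0.5 * np.linalg.norm(F(x + t * p)) ** 2
analytic = (Jac(x + c * p) @ p) @ F(x + c * p)   # formula (3.6)
numeric = (f(c + h) - f(c - h)) / (2 * h)        # central difference
print(analytic, numeric)                         # the two values should agree closely
```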


The following upper and lower bounds on f''_n(c) can be computed from (3.7), the assumptions (3.1), and Lemma 3.1 (vi):

||p_n||^2 (m^2 - B||r_n||) ≤ f''_n(c) ≤ ||p_n||^2 (M^2 + B||r_n||).

THEOREM 3.1. Assume that conditions (3.1) hold. Then the sequence x_n generated by Algorithm 3.1 is well-defined and converges to the unique solution x* of the nonlinear operator equation F(x) = 0, and the error bound m||x* - x_n|| ≤ ||F(x_n)|| holds.

PROOF. Firstly, we prove the existence of the nonlinear steplengths c_n; that is, there exists c_n > 0 such that ||F(x_n + c_n p_n)|| < ||r_n||. We must prove that there is a c > 0 such that f_n(0) < f_n(c). This would imply that there exists c_n > 0 where f_n(c) assumes a local minimum. From inequality (3.2), by inserting x = x_n and y = x_n + c p_n, we conclude that F(y) grows unbounded for c → ∞. This proves that there is a 0 < c such that f_n(0) < f_n(c).


Thirdly, we prove that the sequence of residual norms decreases to zero. For c̃ = tc, for some t in [0, 1], the Taylor expansion of f_n and the upper bound on f''_n give

f_n(c) ≤ f_n(0) - c||F'_n^T r_n||^2 + (c^2/2)||p_n||^2 (M^2 + B||r_n||).

Now by inserting c = ||F'_n^T r_n||^2 / (||p_n||^2 (M^2 + B||r_n||)) we obtain

||r_{n+1}||^2 ≤ ||r_n||^2 - ||F'_n^T r_n||^4 / (||p_n||^2 (M^2 + B||r_n||)).

Now using m^2 ||r_n||^2 ≤ ||F'_n^T r_n||^2 [from Lemma 3.1 (iv)] and ||p_n|| ≤ (M^2/m)||r_n|| [from Lemma 3.1 (v)] we obtain

||r_{n+1}||^2 ≤ ||r_n||^2 [1 - m^6 / (M^4 (M^2 + B||r_n||))].

Since ||r_n|| ≤ ||r_0||, the fraction term in the square brackets is bounded below by the constant m^6 / (M^4 (M^2 + B||r_0||)); hence the norm of the residual is reduced (at each iteration) by a constant factor that is less than one. This implies that ||r_n|| converges to zero.

Finally, we prove that the sequence of iterates converges to a unique solution of the nonlinear operator equation. By use of (3.2) with x = x_n and y = x_{n+k}, we obtain that the sequence x_n is a Cauchy sequence. Thus, it converges to x* and F(x*) = 0. The uniqueness and the error bound inequality in the theorem statement follow from (3.2) with x = x_n and y = x*. ∎

4. THE NONLINEAR CGNE METHOD

Let us assume that (3.1a) and (3.1b) hold in this section. From Theorem 3.1, it follows that a unique solution of F(x) = 0 exists. Next, we introduce a nonlinear version of CGNE, and we prove a local convergence theorem.

ALGORITHM 4.1. The Nonlinear CGNE Algorithm.
Initial vector x_0
r_0 = -F(x_0), p_0 = F'_0^T r_0
For n = 0 Until Convergence Do
1. Select the smallest positive c_n to minimize ||x* - (x_n + c p_n)||, c > 0
2. x_{n+1} = x_n + c_n p_n and r_{n+1} = -F(x_{n+1})
3. b_n = -(F'_{n+1}^T r_{n+1}, p_n)/||p_n||^2, where p_{n+1} = F'_{n+1}^T r_{n+1} + b_n p_n
End For

REMARK 4.0. The error function in step 1 of the algorithm is not computable because it uses the exact solution. However, it is possible to determine an approximation to c_n that guarantees local convergence of the algorithm to the solution.

Let us denote the true error x* - x_n by e_n. The scalars c_n and b_n by definition imply the following two orthogonality relations:

(e_{n+1}, p_n) = 0   (4.1)

and

(p_{n+1}, p_n) = 0.   (4.2)
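Because step 1 needs the unknown solution x*, any runnable version of Algorithm 4.1 must substitute a computable steplength. The sketch below (our own construction) uses c_n = ||r_n||^2/||p_n||^2, in the spirit of the asymptotic steplength estimates of section 5, which indicate that the exact steplength equals this quantity up to a factor 1 ± ε_n that vanishes as the iteration converges; F and jac are user-supplied callables.

```python
import numpy as np

def nonlinear_cgne(F, jac, x0, tol=1e-8, max_iter=200):
    """Sketch of a practical nonlinear CGNE iteration (Algorithm 4.1).
    The uncomputable exact line search of step 1 is replaced by the
    computable approximation c_n = ||r_n||^2 / ||p_n||^2."""
    x = x0.astype(float)
    r = -F(x)
    p = jac(x).T @ r
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        c = (r @ r) / (p @ p)      # approximate steplength (see section 5)
        x = x + c * p              # step 2
        r = -F(x)
        s = jac(x).T @ r           # F'^T_{n+1} r_{n+1}
        b = -(s @ p) / (p @ p)     # b_n = -(F'^T_{n+1} r_{n+1}, p_n)/||p_n||^2
        p = s + b * p              # step 3
    return x
```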

    Under the assumptions (3.1) the following lemma holds for Algorithm 4.1.

LEMMA 4.1. Let {r_n} be the nonlinear residuals and {p_n} be the direction vectors in Algorithm 4.1; then the following identities hold true:

(i) (p_n, e_n) = (F'_n^T r_n, e_n)
(ii) ||F'_n^T r_n||^2 = ||p_n||^2 + b_{n-1}^2 ||p_{n-1}||^2
(iii) M||r_n|| ≥ ||p_n||
(iv) M||e_n|| ≥ ||r_n|| ≥ m||e_n||
(v) ||e_{n+1}|| ≤ ||e_n||.

PROOF. (i) follows from relation (4.1) and step 3 of Algorithm 4.1. We prove (ii) from step 3 of Algorithm 4.1 and relation (4.2). Part (iii) follows from (ii) and (3.1). By using inequalities (3.2) and (3.3) (at the beginning of section 3) with y = x_n and x = x*, we prove (iv). Part (v) follows from the selection of c_n. ∎


REMARK 4.1. Let us denote by f̂_n(c) the scalar function

(1/2)||x* - (x_n + c p_n)||^2 = (1/2)||e_n - c p_n||^2.

The first and second derivatives are:

f̂'_n(c) = c||p_n||^2 - (p_n, e_n),   f̂''_n(c) = ||p_n||^2.   (4.3)

We expand F(x*) = 0 in a Taylor series around x_n to obtain:

r_n = F'_n e_n + (F''(x̃)e_n, e_n),   where x̃ = x_n + t e_n.   (4.4)

Using Lemma 4.1 (i) and (4.4), we obtain:

f̂'_n(c) = c||p_n||^2 - ||r_n||^2 + (r_n, (F''(x̃)e_n, e_n)).   (4.5)

We next prove, under the assumptions (3.1), that Algorithm 4.1 converges locally to the unique solution of the nonlinear operator equation.

THEOREM 4.1. Assume that conditions (3.1) hold. Also, assume that x_0 is selected such that ||F(x_0)|| < m^2/(2B). Then the sequence x_n generated by Algorithm 4.1 is well-defined and converges to the unique solution x* of the nonlinear operator equation F(x) = 0.

PROOF. Firstly, we prove the existence of the nonlinear steplengths c_n. It suffices to prove that the first derivative of f̂_n(c) is negative at c = 0 and its second derivative is positive on an interval [0, c_1). By using Lemma 4.1 (iv) and (v) and the assumption of the theorem, we prove the following inequality:

|(r_n, (F''(x̃)e_n, e_n))| ≤ B||r_n|| ||e_n||^2 ≤ (B/m)||r_n||^2 ||e_n|| < (1/2)||r_n||^2.   (4.6)

Combining (4.5) and (4.6), we conclude that f̂'_n(0) is negative. Also, (4.3) and (4.6) imply that

|f̂'_n(0)| = |(e_n, p_n)| > (1/2)||r_n||^2.   (4.7)


Now, using Lemma 4.1 (iv), we obtain:

||p_n|| ≥ (m/2)||r_n||.   (4.8)

This inequality shows that the second derivative of f̂_n is positive if convergence has not been reached (i.e., r_n ≠ 0).

Secondly, we obtain a lower bound on the nonlinear steplength c_n.


5. ASYMPTOTIC STEPLENGTH AND ERROR ESTIMATES

PROPOSITION 5.1. Under the assumptions of Theorem 4.1, the nonlinear steplength c_n in Algorithm 4.1 satisfies

(||r_n||^2/||p_n||^2)(1 - ε_n) ≤ c_n ≤ (||r_n||^2/||p_n||^2)(1 + ε_n) and, in particular, (1/M^2)(1 - ε_n) ≤ c_n ≤ (4/m^2)(1 + ε_n),

where ε_n is the fraction B||e_n||^2/||r_n||.

PROOF. We will prove only the rightmost inequality. The leftmost inequality is proved similarly. From equality (4.5) and f̂'_n(c_n) = 0, we obtain the following inequality:

c_n ≤ (||r_n||^2 + B||r_n|| ||e_n||^2)/||p_n||^2 = (||r_n||^2/||p_n||^2)(1 + ε_n).

Now, we use Lemma 4.1 (iv) and inequality (4.8) to obtain:

c_n ≤ (4/m^2)(1 + ε_n). ∎

    We next obtain an asymptotic error bound for iterates in Algorithm 4.1.

PROPOSITION 5.2. Under the assumptions of Theorem 4.1, we obtain the following inequality on the errors:

||e_{n+1}||^2 ≤ ||e_n||^2 d_n,

where

d_n = 1 - m^2/(2M^2) + η_n   and   η_n = (6B/m^2)||r_n||.

PROOF. We note that by using relation (4.1) and Lemma 4.1 (i) we obtain:

||e_{n+1}||^2 = (e_{n+1}, e_n - c_n p_n) = ||e_n||^2 - c_n(e_n, p_n) = ||e_n||^2 - c_n(r_n, F'_n e_n).


Now using equality (4.4) and Lemma 4.1 (iv) we obtain:

||e_{n+1}||^2 ≤ ||e_n||^2 - c_n(||r_n||^2 - B||r_n|| ||e_n||^2).   (5.1)

Using Proposition 5.1 and Lemma 4.1 (iii), we prove the following inequality:

c_n||r_n||^2 ≥ (||r_n||^4/||p_n||^2)(1 - ε_n) ≥ (m^2/M^2)||e_n||^2 (1 - ε_n).   (5.2)

Now using (5.2) in (5.1), together with the upper bound on c_n from Proposition 5.1, we obtain:

||e_{n+1}||^2 ≤ ||e_n||^2 - (m^2/M^2)(1 - ε_n)||e_n||^2 + (4B/m^2)(1 + ε_n)||r_n|| ||e_n||^2.

The right-hand side of this inequality is less than

||e_n||^2 [1 - m^2/(2M^2) + η_n],

where

η_n = (6B/m^2)||r_n||. ∎

6. CONCLUSIONS

We have presented and analyzed nonlinear generalizations of the CGNR and CGNE methods. These nonlinear methods apply to nonlinear operator equations with nonsymmetric Jacobians. We show that under certain uniform assumptions on the Jacobians and Hessians the nonlinear CGNR is guaranteed to converge globally to a unique solution. For the nonlinear CGNE, under the same assumptions as CGNR, we prove local convergence results and give asymptotic error bound estimates. These results extend the work of other authors [4, 8] to nonlinear methods for nonsymmetric Jacobians.

    The research was partially supported by NSF under grant CCR-8722260.

A conference proceedings version of this paper appeared in the proceedings of the S.E. Conference on Theory of Approximation, March 1991, Marcel-Dekker, Pure and Applied Mathematics, Vol. 138, edited by George A. Anastassiou.


    REFERENCES

1. S. F. Ashby, T. A. Manteuffel, and P. E. Saylor, A taxonomy for conjugate gradient methods, SIAM J. Numer. Anal. 27:1542-1568 (1990).
2. P. N. Brown, A local convergence theory for combined inexact-Newton/finite-difference projection methods, SIAM J. Numer. Anal. 24:610-638 (1987).
3. E. J. Craig, The N-step iteration procedures, J. Math. Phys. 34:64-73 (1955).
4. J. W. Daniel, The conjugate gradient method for linear and nonlinear operator equations, SIAM J. Numer. Anal. 4:10-26 (1967).
5. J. W. Daniel, The Approximate Minimization of Functionals, Prentice-Hall, Englewood Cliffs, New Jersey, 1971.
6. R. S. Dembo, S. C. Eisenstat, and T. Steihaug, Inexact Newton methods, SIAM J. Numer. Anal. 19:400-408 (1982).
7. S. C. Eisenstat, H. C. Elman, and M. H. Schultz, Variational iterative methods for nonsymmetric systems of linear equations, SIAM J. Numer. Anal. 20:345-357 (1983).
8. R. Fletcher and C. M. Reeves, Function minimization by conjugate gradients, Comput. J. 7:149-154 (1964).
9. R. Fletcher, Practical Methods of Optimization, Vol. 2, Unconstrained Optimization, Wiley, Chichester, 1980.
10. G. H. Golub and R. Kannan, Convergence of a two-stage Richardson process for nonlinear equations, BIT 209-219 (1986).
11. M. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Standards 49:409-436 (1952).
12. M. Z. Nashed, On general iterative methods for the solutions of a class of nonlinear operator equations, Math. Comp. 19:14-24 (1965).
13. M. Z. Nashed, The convergence of the method of steepest descent for nonlinear equations with variational or quasi-variational operators, J. Math. Mech. 13:765-794 (1964).
14. T. L. Saaty, Modern Nonlinear Equations, Dover Publications Inc., New York, 1981.
15. D. P. O'Leary, A discrete Newton algorithm for minimizing a function of many variables, Math. Programming 23:20-33 (1982).

