Nonmonotone Trust Region Methods for Nonlinear Equality Constrained Optimization without a Penalty Function. Michael Ulbrich and Stefan Ulbrich, Zentrum Mathematik, Technische Universität München, München, Germany. Technical Report, December 2000.

NON-MONOTONE TRUST REGION METHODS FOR NONLINEAR EQUALITY CONSTRAINED OPTIMIZATION WITHOUT A PENALTY FUNCTION

MICHAEL ULBRICH* AND STEFAN ULBRICH†

Abstract. We propose and analyze a class of penalty-function-free nonmonotone trust-region methods for nonlinear equality constrained optimization problems. The algorithmic framework yields global convergence without using a merit function and allows nonmonotonicity independently for both the constraint violation and the value of the Lagrangian function. Similar to the Byrd–Omojokun class of algorithms, each step is composed of a quasi-normal and a tangential step. Both steps are required to satisfy a decrease condition for their respective trust-region subproblems. The proposed mechanism for accepting steps combines nonmonotone decrease conditions on the constraint violation and/or the Lagrangian function, which leads to a flexibility and acceptance behavior comparable to filter-based methods. We establish the global convergence of the method. Furthermore, transition to quadratic local convergence is proved. Numerical tests are presented that confirm the robustness and efficiency of the approach.

Key words. nonmonotone trust-region methods, sequential quadratic programming, penalty function, global convergence, equality constraints, local convergence, large-scale optimization

AMS subject classifications. 65K05, 90C30

1. Introduction. We consider the nonlinear equality constrained optimization problem

$$\min f(x) \quad \text{subject to} \quad c(x) = 0 \tag{1.1}$$

with continuously differentiable functions $f : \mathbb{R}^n \to \mathbb{R}$ and $c : \mathbb{R}^n \to \mathbb{R}^m$. For the solution of (1.1) we propose a method that is inspired by the class of trust-region algorithms introduced by Byrd [2], Omojokun [23], and Dennis, El-Alem, and Maciel [9], but with the important difference that our algorithm does not use a penalty or augmented Lagrange function to test the acceptability of steps. Hereby, we are motivated by the impressive efficiency of sequential quadratic programming (SQP) filter methods, which were recently introduced by Fletcher and Leyffer [17]. The algorithm that we investigate here does not use the concept of a filter. Rather, it applies nonmonotone trust-region techniques independently to the quasi-normal subproblem and the tangential subproblem. This strategy admits a flexibility in accepting steps that is comparable with filter methods. Besides global convergence, our approach has two favorable properties that appear to be new for algorithms without a penalty function:

(a) The method does not require a restoration procedure.

(b) We prove that the algorithm converges locally q-quadratically, even without an additional second order correction that is needed by many algorithms to avoid the Maratos effect.

For SQP filter methods global convergence has been established in Fletcher, Gould, Leyffer, and Toint [16], whereas a local convergence theory is not yet available. Recently, a globally convergent primal-dual interior-point filter method was introduced by Ulbrich, Ulbrich, and Vicente [31]. Except for the method presented in this paper, filter methods and their predecessor, the tolerance-tube approach by Zoppke-Donaldson [34], are the only algorithms for NLP we are aware of that do not require a penalty function.

We will use [2], [23], and [9] as our main references on trust-region methods for equality constrained nonlinear programming. However, there are several related approaches and recent extensions that should be mentioned. Regarding related work, we refer to Byrd, Schnabel, and Shultz [5], Celis, Dennis, and Tapia [6], El Alem [13, 14], Powell and Yuan [26], and Vardi [32]. Recent contributions to the analysis of trust-region methods for equality constrained problems include Dennis and Vicente [12], El Alem [15], Lalee, Nocedal, and Plantenga [20]. Several extensions to problems involving inequality constraints have been proposed. Here we mention only those methods that extend the ideas of Byrd [2], Omojokun [23] and Dennis, El Alem, and Maciel [9]. Some of these algorithms are based on trust-region methods for box-constrained problems, see, e.g., Coleman and Li [7], Conn, Gould, and Toint [8], Dennis and Vicente [11], Lin and Moré [22], Ulbrich, Ulbrich and Heinkenschloss [30], and combine them with the above approaches to handle additional equality constraints. Algorithms of this type were investigated by Dennis, Heinkenschloss, Vicente [10], Plantenga [24], and Vicente [33]. Byrd, Gilbert, and Nocedal [3] and Byrd, Hribar, and Nocedal [4] take a different approach by solving a sequence of equality constrained barrier problems. The above references underline the important role of methods for nonlinear equality constrained optimization problems, both as standalone methods and as solvers for subproblems.

* Lehrstuhl für Angewandte Mathematik und Mathematische Statistik, Zentrum Mathematik, Technische Universität München, D-80290 München, Germany, E-mail: [email protected].

† Lehrstuhl für Angewandte Mathematik und Mathematische Statistik, Zentrum Mathematik, Technische Universität München, D-80290 München, Germany, E-mail: [email protected].

This paper is organized as follows. In section 2 the algorithm is developed. We introduce the quasi-normal and the tangential trust-region subproblem and describe the model decrease conditions for the respective trial steps. The nonmonotone decrease conditions for constraint violation and Lagrangian function, respectively, which are the key ingredients of the new algorithm, are developed in sections 2.1 and 2.2. The full algorithm is formulated in section 2.3. In section 3 the global convergence of the algorithm is established. We first state the main result in section 3.1. The global convergence analysis starts in section 3.2 with the proof of well definedness. Section 3.3 is devoted to the development of nonmonotone decrease results. Convergence to feasible points is proved in section 3.4, convergence to stationary points in 3.5. In section 4 we show that with a Newton-type step computation the algorithm converges locally quadratically. Numerical results for problems from the CUTE collection [1] are presented in section 5.

Notations. Throughout the paper, $\|\cdot\|$ denotes the Euclidean norm $\|\cdot\|_2$. The gradient of $f$ is denoted by $\nabla f$ and $\nabla c$ denotes the transposed Jacobian of $c$. We use the abbreviations $g = \nabla f$ and $A = \nabla c$.

2. Development of the algorithm. We denote the gradient of the objective function $f$ by $g$ and write $A$ for the transposed Jacobian of $c$:

$$g(x) = \nabla f(x) \in \mathbb{R}^n, \qquad A(x) = \nabla c(x) \in \mathbb{R}^{n \times m}.$$

Following Byrd [2], Omojokun [23] and Dennis, El Alem, and Maciel [9], we obtain the trial step $s_k = s_k^t + s_k^n$ at the current iterate $x_k$ by computing a quasi-normal step $s_k^n$ and a tangential step $s_k^t$. The purpose of the quasi-normal step $s_k^n$ is to improve feasibility. It is obtained as approximate solution of the trust-region subproblem

$$\min \|c(x_k) + A(x_k)^T s^n\|^2 \quad \text{subject to} \quad \|s^n\| \le \Delta_k, \tag{2.1}$$

where $\Delta_k > 0$ denotes the trust-region radius. Our requirements on the steps $s_k^n$ are that there exist constants $K_1, K_2 > 0$, independent of $k$, such that $s_k^n$ admits the upper bound

$$\|s_k^n\| \le \min\{K_1 \|c_k\|, \Delta_k\} \tag{2.2}$$

and satisfies the decrease condition

$$\|c_k\|^2 - \|c_k + A_k^T s_k^n\|^2 \ge K_2 \|c_k\| \min\{\|c_k\|, \Delta_k\}. \tag{2.3}$$

As, e.g., in [9], we will assume that the matrices $A(x_k)^T A(x_k)$ are nonsingular with uniformly bounded inverses for all $k$. Then it is well known that the Cauchy point, which is the solution of (2.1) along the direction of steepest descent at $s^n = 0$, satisfies the conditions (2.2) and (2.3) for appropriate constants $K_1$ and $K_2$. The assumptions stated below ensure the existence of constants $K_1$ and $K_2$ that are suitable for all iterations $k$. Therefore, the conditions (2.2) and (2.3) can be implemented by a fraction of Cauchy decrease condition.
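The Cauchy point of the quasi-normal subproblem can be illustrated with a small sketch. It is not taken from the report: the function names, the list-of-rows matrix layout, and the test data are assumptions made for illustration only. The code minimizes $\|c + A^T s\|^2$ along the steepest descent direction $-Ac$ inside the ball $\|s\| \le \Delta$.

```python
from math import sqrt


def mat_vec(M, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def norm(v):
    return sqrt(sum(x * x for x in v))


def cauchy_normal_step(A, c, delta):
    """Cauchy step for min ||c + A^T s||^2 subject to ||s|| <= delta.

    A is the n x m transposed Jacobian (list of n rows), c has length m.
    The gradient of the objective at s = 0 is 2*A*c, so we search along -A*c.
    """
    Ac = mat_vec(A, c)
    nAc = norm(Ac)
    if nAc == 0.0:
        return [0.0] * len(A)          # already stationary for the subproblem
    AtAc = mat_vec(transpose(A), Ac)   # A^T A c, curvature along the direction
    nAtAc = norm(AtAc)
    t_max = delta / nAc                # largest step length inside the trust region
    t = t_max if nAtAc == 0.0 else min(t_max, nAc ** 2 / nAtAc ** 2)
    return [-t * x for x in Ac]


def normal_decrease(A, c, s):
    """Left-hand side of (2.3): ||c||^2 - ||c + A^T s||^2."""
    r = [ci + ri for ci, ri in zip(c, mat_vec(transpose(A), s))]
    return norm(c) ** 2 - norm(r) ** 2
```

For a small radius the step ends on the trust-region boundary; for a large radius it stops at the one-dimensional minimizer. In both cases the decrease in (2.3) is nonnegative, which is the property a fraction of Cauchy decrease condition builds on.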

To improve optimality we seek $s_k^t$ in the tangent space of the linearized constraints in such a way that it provides sufficient decrease for a quadratic model of the Lagrange function

$$\ell(x, y) = f(x) + y^T c(x), \quad y \in \mathbb{R}^m,$$

under a trust-region constraint. To this end, we define a quadratic model

$$q_k(s) = (g(x_k) + A(x_k) y_k)^T s + \tfrac{1}{2} s^T H_k s$$

about the current point $(x_k, y_k)$ that approximates $\ell(x_k + s, y_k) - \ell(x_k, y_k)$. Here, $H_k$ is a symmetric approximation of $\nabla_x^2 \ell(x_k, y_k)$. Based on this model, the tangential step $s_k^t$ is computed as approximate solution of the trust-region subproblem

$$\min q_k(s_k^n + s^t) \quad \text{subject to} \quad A(x_k)^T s^t = 0, \quad \|s^t\| \le \Delta_k \tag{2.4}$$

satisfying the decrease condition

$$q_k(s_k^n) - q_k(s_k^n + s_k^t) \ge K_3 \|W_k^T \nabla q_k(s_k^n)\| \min\{\|W_k^T \nabla q_k(s_k^n)\|, \Delta_k\}, \tag{2.5}$$

with a constant $K_3 > 0$ independent of $k$, and the feasibility condition

$$A(x_k)^T s_k^t = 0, \quad \|s_k^t\| \le \Delta_k. \tag{2.6}$$

Hereby, $W_k = W(x_k)$, where $W(x)$ denotes a matrix whose columns form a basis of the null space of $A(x)^T$. Note that $W_k^T \nabla q_k(s^t)$ is the reduced gradient of $q_k$ in terms of the representation $s^t = W_k d$ of the tangential step:

$$\nabla_d\, q_k(W_k d) = W_k^T \nabla q_k(W_k d) = W_k^T \nabla q_k(s^t).$$

Therefore, (2.5) can be realized by a fraction of Cauchy decrease condition for the reduced function $d \mapsto q_k(s_k^n + W_k d)$ subject to the constraint $\|W_k d\| \le \Delta_k$.

To simplify notation we will use the abbreviations $f_k = f(x_k)$, $c_k = c(x_k)$, $\ell_k = \ell(x_k, y_k)$, etc. Moreover, it will be convenient to introduce the reduced gradient

$$\hat g(x) = W(x)^T g(x).$$

Then the first order necessary optimality conditions (Karush–Kuhn–Tucker or KKT conditions) at a local solution $\bar x \in \mathbb{R}^n$ of (1.1) can be written as

$$c(\bar x) = 0, \quad \hat g(\bar x) = 0.$$

The algorithm is based on a combination of nonmonotone decrease criteria for the quasi-normal and tangential steps. Non-monotone trust-region methods were investigated by Toint [28] and Ulbrich [29]. We follow [29] and compare the predicted decrease promised by the trust-region model with a relaxation of the actual decrease to decide whether a step is acceptable or not. Before we give a precise description of the algorithm, we introduce our assumptions for the global convergence analysis.

Assumptions: There exist an open convex set $\Omega \subset \mathbb{R}^n$ and a closed convex set $\hat\Omega \subset \Omega$ with $\operatorname{dist}(\hat\Omega, \mathbb{R}^n \setminus \Omega) > 0$ such that:


(A1) The functions $f : \Omega \to \mathbb{R}$ and $c : \Omega \to \mathbb{R}^m$ are continuously differentiable.

(A2) The matrix $A(x) = \nabla c(x)$ has full rank for all $x \in \Omega$.

(A3) The functions $f$, $g = \nabla f$, $c$, $A = \nabla c$, $(A^T A)^{-1}$, $W$, and $(W^T W)^{-1}$ are uniformly bounded on $\Omega$. Hereby, $W(x)$ denotes a matrix whose columns form a basis for the null space of $A(x)^T$.

(A4) For all $k$, $x_k$ is in $\hat\Omega$, and $x_k + s_k^n$ as well as $x_k + s_k$ are in $\Omega$.

(A5) The matrices $H_k$ and the multiplier estimates $y_k$ are uniformly bounded for all $k$.

(A6) The derivatives $g = \nabla f$ and $A = \nabla c$ are Lipschitz continuous on $\Omega$.

In section 4 we will moreover require the following assumption in order to show transition to locally quadratic convergence in a neighborhood of a stationary point $\bar x$ satisfying second order sufficient conditions.

(A7) The functions $f : \Omega \to \mathbb{R}$ and $c : \Omega \to \mathbb{R}^m$ are twice continuously differentiable. Furthermore, there exists a neighborhood $\Omega^- \subset \Omega$ of $\bar x$ on which $\nabla^2 f$ and $\nabla^2 c$ are Lipschitz continuous and $H_k = \nabla_x^2 \ell(x_k, y_k)$ for all $x_k \in \Omega^-$.

2.1. A nonmonotone decrease condition for the constraint violation. The decrease condition (2.3) for the quasi-normal step guarantees that the predicted reduction for the feasibility violation $\|c\|^2$,

$$\mathrm{pred}_k^c \overset{\text{def}}{=} \|c_k\|^2 - \|c_k + A_k^T s_k\|^2,$$

admits the estimate

$$\mathrm{pred}_k^c \ge K_2 \|c_k\| \min\{\|c_k\|, \Delta_k\}. \tag{2.7}$$

Note hereby that $A_k^T s_k = A_k^T s_k^n$. Clearly, the requirement that the actual reduction

$$\mathrm{ared}_k^c \overset{\text{def}}{=} \|c_k\|^2 - \|c(x_k + s_k)\|^2$$

should be a fraction of the predicted reduction is too restrictive, since it could impose severe restrictions on the tangential step if the feasibility is “too good” in comparison to the norm of the reduced gradient. In order to relax the feasibility requirement in this case and to allow nonmonotonicity we accept the step for the constraints if

$$\mathrm{rared}_k^c \ge \eta_1\, \mathrm{pred}_k^c, \quad \eta_1 \in (0, 1) \text{ fixed},$$

with the relaxed actual reduction

$$\mathrm{rared}_k^c \overset{\text{def}}{=} \max\Big\{R_k,\ \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c(x_k + s_k)\|^2.$$

Hereby, we require that with fixed parameters $\nu^c \in \mathbb{N}$ and $\lambda \in (0, 1/\nu^c)$ holds

$$\nu_k^c = \min\{k + 1, \nu^c\}, \quad \lambda_{kr}^c \ge \lambda > 0, \quad \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c = 1, \quad R_k \ge \|c_k\|^2, \ \text{usually } R_k = \|c_k\|^2.$$


Before we discuss the choice of $R_k$, we notice that a maximum of nonmonotonicity is achieved by selecting an index $r_l$, $0 \le r_l \le \nu_k^c - 1$, such that $\|c_{k-r_l}\| = \max_{0 \le r < \nu_k^c} \|c_{k-r}\|$ and setting $\lambda_{k r_l}^c = 1 - (\nu_k^c - 1)\lambda$, and $\lambda_{kr}^c = \lambda$ for $r \ne r_l$.
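As an illustration of the relaxed actual reduction and of the weight choice with maximal nonmonotonicity, the following sketch may help. The function and argument names are hypothetical and not part of the report; `past_sq_norms[r]` is assumed to hold $\|c_{k-r}\|^2$ for $r = 0, \dots, \nu_k^c - 1$.

```python
def max_nonmonotone_weights(past_sq_norms, lam):
    """Weights that put the largest admissible mass on the worst recent iterate.

    past_sq_norms[r] stands for ||c_{k-r}||^2; lam is the lower bound on each
    weight and must satisfy 0 < lam < 1/len(past_sq_norms).
    """
    nu_k = len(past_sq_norms)
    if not 0.0 < lam < 1.0 / nu_k:
        raise ValueError("lam must lie in (0, 1/nu_k)")
    r_l = max(range(nu_k), key=lambda r: past_sq_norms[r])
    weights = [lam] * nu_k
    weights[r_l] = 1.0 - (nu_k - 1) * lam
    return weights


def relaxed_actual_reduction(R_k, past_sq_norms, weights, new_sq_norm):
    """max{R_k, sum_r weights[r] * ||c_{k-r}||^2} - ||c(x_k + s_k)||^2."""
    if len(past_sq_norms) != len(weights):
        raise ValueError("one weight per stored iterate is required")
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must sum to one")
    weighted = sum(w * v for w, v in zip(weights, past_sq_norms))
    return max(R_k, weighted) - new_sq_norm
```

With stored values `[1.0, 4.0, 2.0]` and `lam = 0.1`, a trial point with squared constraint norm 3.0 has a negative actual reduction (1.0 - 3.0) but a positive relaxed reduction, which is how a step that temporarily worsens feasibility can still pass the test.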

The choice of $R_k$ is an important issue in the design of the method. It is done in such a way that the feasibility requirement is relaxed if the feasibility is much better than the stationarity, i.e., if $\|c_k\| \ll \|\hat g_k\|$. If this situation is detected then instead of $R_k = \|c_k\|^2$ a larger value is chosen. In order to keep a minimum of control over the constraint violation we choose $R_k$ not larger than some upper bound $a_{j_k}^2$. Hereby, $(a_j)$ is a slowly decreasing sequence tending to zero and $j$ is only increased, i.e., $j_{k+1} = j_k + 1$, if $R_k$ yields the maximum in the first term of $\mathrm{rared}_k^c$. Thus, let $(a_j)$ be a sequence with

$$a_j > 0, \quad 0 < \alpha_0 \le \frac{a_{j+1}}{a_j} \le 1, \quad \lim_{j \to \infty} a_j = 0, \quad \text{and} \quad \sum_{j=0}^{\infty} a_j^{p} = \infty, \tag{2.8}$$

where $p > 4/3$ is a fixed constant. The following algorithm describes how $R_k$ is updated:

Algorithm R: (Update of $R_k$)

Let $0 < \alpha, \beta < 1/2$.

If $\|c_k\| < \min\{\alpha\, a_{j_k}, \beta \|\hat g_k\|\}$ then

Set $R_k := \min\{a_{j_k}^2, \|\hat g_k\|^2\}$.

If $R_k \ge \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2$ then set $j_{k+1} := j_k + 1$,

else set $j_{k+1} := j_k$.

Otherwise, set $R_k := \|c_k\|^2$ and $j_{k+1} := j_k$.
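The update of $R_k$ can be sketched in a few lines. This is an illustrative reading of Algorithm R, not the authors' code: the names are hypothetical, the sequence $(a_j)$ is passed as a callable, and the comparison that triggers the increase of $j$ is taken as non-strict, which is an assumption consistent with its use in the proof of Lemma 3.5.

```python
def update_R(c_norm, ghat_norm, a, j_k, weighted_sum, alpha, beta):
    """One pass of Algorithm R.

    c_norm       -- ||c_k||
    ghat_norm    -- norm of the reduced gradient at x_k
    a            -- callable j -> a_j (slowly decreasing, positive)
    j_k          -- current index into the sequence (a_j)
    weighted_sum -- sum_r lambda_kr * ||c_{k-r}||^2 from the relaxed reduction
    alpha, beta  -- parameters in (0, 1/2)

    Returns the pair (R_k, j_{k+1}).
    """
    if not (0.0 < alpha < 0.5 and 0.0 < beta < 0.5):
        raise ValueError("alpha and beta must lie in (0, 1/2)")
    a_j = a(j_k)
    if c_norm < min(alpha * a_j, beta * ghat_norm):
        # Feasibility is much better than stationarity: relax the reference value.
        R_k = min(a_j ** 2, ghat_norm ** 2)
        # Assumption: non-strict comparison decides whether j is advanced.
        j_next = j_k + 1 if R_k >= weighted_sum else j_k
    else:
        R_k = c_norm ** 2
        j_next = j_k
    return R_k, j_next
```

Advancing `j` only when the relaxed value actually dominates the weighted history means the bound $a_{j_k}^2$ tightens exactly as often as the relaxation is used.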

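2.2. A nonmonotone decrease condition for the Lagrangian function. To evaluate the descent properties of the step for the objective function we use the predicted tangential reduction of the Lagrangian $\ell$

$$\mathrm{pred}_k^t \overset{\text{def}}{=} q_k(s_k^n) - q_k(s_k),$$

the predicted reduction of $\ell$ for the whole step

$$\mathrm{pred}_k^\ell \overset{\text{def}}{=} -q_k(s_k),$$

and the relaxed actual reduction of $\ell$

$$\mathrm{rared}_k^\ell \overset{\text{def}}{=} \max\Big\{\ell_k,\ \sum_{r=0}^{\nu_k^\ell - 1} \lambda_{kr}^\ell\, \ell_{k-r}\Big\} - \ell(x_k + s_k, y_k),$$

where as above with fixed $\nu^\ell \in \mathbb{N}$ and $\lambda \in (0, 1/\nu^\ell)$

$$\nu_k^\ell = \min\{k + 1, \nu^\ell\}, \quad \lambda_{kr}^\ell \ge \lambda > 0, \quad \sum_{r=0}^{\nu_k^\ell - 1} \lambda_{kr}^\ell = 1.$$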

REMARK 2.1. Another natural choice would be to use $\ell(x_k + s_k, y_{k+1})$ in the definition of $\mathrm{rared}_k^\ell$ with a new multiplier estimate $y_{k+1}$ and to add the term $(y_k - y_{k+1})^T (c_k + A_k^T s_k)$ in the definition of $\mathrm{pred}_k^t$ and $\mathrm{pred}_k^\ell$. Our convergence analysis can easily be adapted to handle this case as well and even simplifies. On the other hand, we prefer to work only with $y_k$ until an acceptable step is found, since the computation of new multipliers $y_{k+1}$ requires the usually costly evaluation of $A(x_k + s_k)$.

The computation of $s_k$ ensures that the tangential step $s_k^t$ provides decrease for the quadratic model $q_k(s_k^n + s^t)$, since $\mathrm{pred}_k^t$ satisfies (2.5). However, this descent can be destroyed by the normal step $s_k^n$, if $\|s_k^n\|$ is too large compared with $\|\hat g_k\|$. This motivates the following admissibility criterion: If $\mathrm{pred}_k^\ell$ promises sufficient decrease for the whole step, more precisely, if

$$\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\} \quad \text{and} \quad \mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t, \quad \mu \in (2/3, 1), \ \gamma \in (0, 1),$$

then we require

$$\mathrm{rared}_k^\ell \ge \eta_1\, \mathrm{pred}_k^\ell.$$

This leads to the following evaluation of trial steps.

Evaluation of steps:

Let $\mu \in (2/3, 1)$, $\gamma \in (0, 1)$, $\eta_1 \in (0, 1)$.

Accept the trial step $s_k = s_k^n + s_k^t$

if $s_k$ is acceptable for the constraints, i.e.,

$$\mathrm{rared}_k^c \ge \eta_1\, \mathrm{pred}_k^c,$$

and if $s_k$ is acceptable for the objective function:

If $\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\}$ and $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$ then $\mathrm{rared}_k^\ell \ge \eta_1\, \mathrm{pred}_k^\ell$ holds.

If the step is not acceptable then the trust-region radius $\Delta_k$ is reduced and the step $s_k$ is recomputed.
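The evaluation of a trial step can be written as a single predicate. The sketch below is only an illustration under stated assumptions: the argument names (including `gamma` for the factor relating the predicted reduction of the whole step to the tangential one) are hypothetical, and all reductions are assumed to be precomputed floats with a nonnegative predicted constraint reduction.

```python
def step_is_acceptable(pred_c, pred_t, pred_l, rared_c, rared_l,
                       eta1, gamma, mu):
    """Acceptance test for a trial step s_k = s_k^n + s_k^t.

    pred_c, pred_t, pred_l -- predicted reductions (constraints, tangential,
                              Lagrangian for the whole step)
    rared_c, rared_l       -- relaxed actual reductions
    eta1, gamma in (0, 1), mu in (2/3, 1)
    """
    if pred_c < 0.0:
        raise ValueError("pred_c is expected to be nonnegative")
    # The constraint test is always enforced.
    if rared_c < eta1 * pred_c:
        return False
    # The Lagrangian test is enforced only if the model promises enough descent.
    lagrangian_test_active = (pred_t >= max(pred_c, pred_c ** mu)
                              and pred_l >= gamma * pred_t)
    if lagrangian_test_active and rared_l < eta1 * pred_l:
        return False
    return True
```

Because the Lagrangian test is switched off whenever the tangential model decrease is dominated by the predicted feasibility gain, a step that mainly restores feasibility is judged on the constraints alone.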

2.3. The Algorithm. We now give a complete statement of the algorithm.

Algorithm A:

Let $0 < \eta_1 < \eta_2 < 1$, $0 < \gamma_1 < 1 < \gamma_2$, $0 < \alpha, \beta < 1/2$, $2/3 < \mu < 1$, and $0 < \gamma < 1$. Fix $\alpha_0 \in (0, 1)$ and choose a sequence $(a_j)$ satisfying the conditions in (2.8). Choose an initial point $x_0 \in \hat\Omega$, and an initial trust radius $\Delta_0 \ge \Delta_{\min} > 0$. Set ¾ := 1, $k := 0$, and $j_0 := 0$.

1. (Evaluate functions at $x_k$) Compute $c_k$, $A_k$, $W_k$, $f_k$, $g_k$, $\hat g_k := W_k^T g_k$, and a Lagrange multiplier estimate $y_k$.

2. (Check for termination) If $\|c_k\| + \|\hat g_k\| = 0$: STOP.

3. (Update $R_k$) Choose the weights $\lambda_{kr}^{c,\ell}$ for $\mathrm{rared}_k^{c,\ell}$. Update $R_k$ by calling Algorithm R.

4. (Compute trial steps) Compute a quasi-normal step $s_k^n$ satisfying (2.2), (2.3), and a tangential step $s_k^t$ satisfying (2.5), (2.6). Set $s_k := s_k^n + s_k^t$.


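5. (Test if $s_k$ is acceptable) If $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$ and $\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\}$ then goto Step 5.1, else goto Step 5.2.

5.1 If $\mathrm{rared}_k^c < \eta_1\, \mathrm{pred}_k^c$ or $\mathrm{rared}_k^\ell < \eta_1\, \mathrm{pred}_k^\ell$ then set $\Delta_k := \gamma_1 \Delta_k$ and goto Step 4. Else choose $\Delta_{k+1} \in [\max\{\Delta_{\min}, \Delta_k\}, \max\{\Delta_{\min}, \gamma_2 \Delta_k\}]$, set $x_{k+1} := x_k + s_k$, $k := k + 1$ and goto Step 1.

5.2 If $\mathrm{rared}_k^c < \eta_1\, \mathrm{pred}_k^c$ then set $\Delta_k := \gamma_1 \Delta_k$ and goto Step 4. Else choose $\Delta_{k+1} \in [\max\{\Delta_{\min}, \Delta_k\}, \max\{\Delta_{\min}, \gamma_2 \Delta_k\}]$, set $x_{k+1} := x_k + s_k$, $k := k + 1$ and goto Step 1.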
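The radius handling shared by Steps 5.1 and 5.2 can be sketched as a small backtracking loop. This is a hypothetical helper, not the authors' implementation: `try_step` is an assumed callback that recomputes the trial step for a given radius and returns whether it passed the acceptance tests, and the cap `max_trials` is added here purely as a safeguard.

```python
def accept_with_backtracking(try_step, delta, delta_min, gamma1, gamma2,
                             max_trials=60):
    """Shrink the radius by gamma1 until the trial step is accepted.

    Returns (accepted_radius, (lower, upper)), where [lower, upper] is the
    interval from which the next radius may be chosen.
    """
    if not (0.0 < gamma1 < 1.0 < gamma2):
        raise ValueError("need 0 < gamma1 < 1 < gamma2")
    for _ in range(max_trials):
        if try_step(delta):
            lower = max(delta_min, delta)
            upper = max(delta_min, gamma2 * delta)
            return delta, (lower, upper)
        delta *= gamma1  # rejected: reduce the radius and recompute the step
    raise RuntimeError("no acceptable step found within max_trials reductions")
```

Note that the interval for the next radius is bounded below by `delta_min`, so every new iteration starts with a radius that is not arbitrarily small even after many rejections.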

In our formulation of the algorithm we have avoided a further index that distinguishes between different instances of trial steps at iteration level $k$. To prevent possible ambiguities, we use the following

Notation: We say that the step $s_k$ is accepted (or successful) if it is used in Step 5.1 or 5.2 to compute the new iterate, i.e., $x_{k+1} = x_k + s_k$. If it is necessary to reference the “accepted”, i.e., final values of $s_k$, $s_k^n$, $s_k^t$, $\Delta_k$, $\mathrm{pred}_k^c$, $\mathrm{pred}_k^\ell$, $\mathrm{pred}_k^t$, $\mathrm{rared}_k^c$, and $\mathrm{rared}_k^\ell$ at iteration level $k$, we denote them by $s_{k,a}$, $s_{k,a}^n$, $s_{k,a}^t$, $\Delta_{k,a}$, $\mathrm{pred}_{k,a}^c$, $\mathrm{pred}_{k,a}^\ell$, $\mathrm{pred}_{k,a}^t$, $\mathrm{rared}_{k,a}^c$, and $\mathrm{rared}_{k,a}^\ell$, respectively.

3. Global convergence analysis.

3.1. Statement of the global convergence result. The following theorem states the global convergence properties of Algorithm A.

THEOREM 3.1. (i) Under assumptions (A1)–(A3), Algorithm A is well defined as long as $x_k$ stays in $\Omega$.

Moreover, if Algorithm A does not terminate finitely, then the following holds:

(ii) If assumptions (A1)–(A5) are satisfied, then

$$\lim_{k \to \infty} \|c_k\| = 0.$$

(iii) If assumptions (A1)–(A6) are satisfied, then in addition

$$\liminf_{k \to \infty} \|\hat g_k\| = 0.$$

The proof requires several steps and is carried out in the remainder of this section. In particular, part (i) is proved in section 3.2, Lemma 3.2, part (ii) in section 3.4, Lemma 3.7, and part (iii) in section 3.5, Lemma 3.9. A local convergence analysis showing the transition to fast local convergence under suitable conditions on the step computation will be given in section 4.

Throughout the remainder of this section, we will not consider the case where Algorithm A terminates successfully in Step 2, since in this situation the global convergence is trivial.

For the convergence analysis it will be convenient to introduce also the actual reduction of the Lagrangian

$$\mathrm{ared}_k^\ell \overset{\text{def}}{=} \ell_k - \ell(x_k + s_k, y_k).$$

The following estimates, obtained by the mean value theorem, will be used several times in this section. We recall that $s_k = s_k^n + s_k^t$, $A_k^T s_k = A_k^T s_k^n$ and $\max\{\|s_k^n\|, \|s_k^t\|\} \le \Delta_k$.

Assume that $x_k$ and $x_k + s_k$ are contained in $\Omega$. Denoting by $\tau \in [0, 1]$ an appropriate generic constant that is adjusted from case to case, and writing $x_k^\tau = x_k + \tau s_k$ we find that under the assumption (A1) holds

$$|\mathrm{ared}_k^c - \mathrm{pred}_k^c| \le \|A_k A_k^T\| \Delta_k^2 + 4 \|A(x_k^\tau) c(x_k^\tau) - A_k c_k\| \Delta_k, \tag{3.1}$$

$$|\mathrm{ared}_k^\ell - \mathrm{pred}_k^\ell| \le 2 \big( \|g(x_k^\tau) - g_k\| + \|(A(x_k^\tau) - A_k) y_k\| \big) \Delta_k + 2 \|H_k\| \Delta_k^2. \tag{3.2}$$
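For readers who want to check the first estimate, the following worked derivation spells out the steps. It is a sketch written for this purpose and uses only the definitions of $\mathrm{pred}_k^c$ and $\mathrm{ared}_k^c$, the mean value theorem applied to $s \mapsto \|c(x_k + s)\|^2$, and the bounds $\|s_k^n\| \le \Delta_k$, $\|s_k\| \le 2\Delta_k$.

```latex
\begin{aligned}
\mathrm{ared}_k^c - \mathrm{pred}_k^c
  &= \|c_k + A_k^T s_k\|^2 - \|c(x_k + s_k)\|^2, \\
\|c_k + A_k^T s_k\|^2
  &= \|c_k\|^2 + 2 (A_k c_k)^T s_k + \|A_k^T s_k^n\|^2, \\
\|c(x_k + s_k)\|^2
  &= \|c_k\|^2 + 2 \big(A(x_k^\tau) c(x_k^\tau)\big)^T s_k
     \quad \text{for some } \tau \in [0,1], \\
|\mathrm{ared}_k^c - \mathrm{pred}_k^c|
  &\le \|A_k^T s_k^n\|^2
     + 2 \|A(x_k^\tau) c(x_k^\tau) - A_k c_k\| \, \|s_k\| \\
  &\le \|A_k A_k^T\| \Delta_k^2
     + 4 \|A(x_k^\tau) c(x_k^\tau) - A_k c_k\| \Delta_k .
\end{aligned}
```

The last line uses $\|A_k^T\|^2 = \|A_k A_k^T\|$ for the spectral norm. The second estimate follows in the same way from the gradient $g(x) + A(x) y_k$ of $\ell(\cdot, y_k)$ and the quadratic term $\tfrac12 s_k^T H_k s_k$ of the model.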


3.2. Well definedness. We start by showing that the algorithm is well defined, and thus establish part (i) of Theorem 3.1. We first note that under assumptions (A1)–(A3) normal steps $s_k^n$ satisfying (2.2), (2.3), and tangential steps $s_k^t$ satisfying (2.5), (2.6) can be obtained by enforcing a fraction of Cauchy decrease condition. Hereby, (A3) ensures that the constants $K_1$, $K_2$, $K_3$ in (2.2), (2.3), and (2.5) can be chosen independently of $k$ as long as $x_k \in \Omega$. We refer to [9] and, for details on the practical computation of steps providing a fraction of Cauchy decrease, to [20].

LEMMA 3.2. Let the assumptions (A1)–(A3) hold. Then for $x_k \in \Omega$ with $\|c_k\| + \|\hat g_k\| > 2\varepsilon > 0$ there exists $\delta > 0$ such that the step $s_k$ is accepted in Step 5 of Algorithm A whenever $\Delta_k \le \delta$. Hence, the algorithm is well defined as long as $x_k \in \Omega$.

Further, if also (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous on $\Omega$ then $\delta$ can be chosen depending only on $\max\{\min\{\varepsilon, \|c_k\|\}, \min\{\varepsilon, a_{j_k}\}\}$, $\Omega$, $\hat\Omega$, and the bounds in assumption (A3) and (A5).

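Proof. We start with a $\delta \in (0, \varepsilon]$ such that the closed $\delta$-ball about $x_k$ lies in $\Omega$. If $x_k \in \hat\Omega$, we always achieve this by choosing $\delta = \min\{\varepsilon, \tfrac{1}{2} \operatorname{dist}(\hat\Omega, \mathbb{R}^n \setminus \Omega)\}$. Further adjustment of $\delta$ will be performed as the proof proceeds.

Case 1: $\|c_k\| \ge \varepsilon$.

Since $\delta \le \varepsilon$, (2.7) implies that for any $\Delta_k \le \delta$ we have $\mathrm{pred}_k^c \ge K_2 \varepsilon \Delta_k$. Now reduce $\delta$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds

$$\|A_k A_k^T\| \delta + 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \le (1 - \eta_1) K_2 \varepsilon, \tag{3.3}$$

which is possible by assumption (A1). This together with (3.1) implies

$$\mathrm{rared}_k^c \ge \mathrm{pred}_k^c - |\mathrm{ared}_k^c - \mathrm{pred}_k^c| \ge \eta_1\, \mathrm{pred}_k^c + (1 - \eta_1)(\mathrm{pred}_k^c - K_2 \varepsilon \Delta_k) \ge \eta_1\, \mathrm{pred}_k^c.$$

If $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$ and $\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\}$ then we have to satisfy the additional condition $\mathrm{rared}_k^\ell \ge \eta_1\, \mathrm{pred}_k^\ell$ (see Step 5.1). To achieve this, we note that in this case holds $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^c \ge \gamma K_2 \varepsilon \Delta_k$, where we have used (2.7). We now reduce $\delta > 0$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds

$$\|H_k\| \delta + \|g(x_k + s) - g_k\| + \|(A(x_k + s) - A_k) y_k\| \le \frac{\gamma}{2} (1 - \eta_1) K_2 \varepsilon, \tag{3.4}$$

which can be done by assumption (A1). Then (3.2) ensures that the test in Step 5.1 is passed for all $\Delta_k \le \delta$. Hence, the step is accepted if $\Delta_k \le \delta$.

If (A4) and (A5) hold and if $g$, $A$ are uniformly continuous on $\Omega$ then our mechanism of reducing $\delta$ can be done depending only on $\varepsilon = \min\{\varepsilon, \|c_k\|\}$, $\Omega$, $\hat\Omega$ and on the bounds in the assumptions.

Case 2: $\|\hat g_k\| > \varepsilon$.

By assumption (A3) there exists $K_7 > 0$ with

$$\|W_k^T \nabla q_k(s_k^n)\| \ge \|\hat g_k\| - K_7 \Delta_k.$$

Hence, if we reduce $\delta$ such that $\delta \le \frac{1}{2 K_7} \varepsilon$ then for all $\Delta_k \le \delta$ holds $\Delta_k \le \frac{1}{2 K_7} \|\hat g_k\|$ and therefore by (2.5)

$$\mathrm{pred}_k^t \ge \frac{K_3}{4} \varepsilon \min\{\varepsilon, \Delta_k\} = \frac{K_3}{4} \varepsilon \Delta_k. \tag{3.5}$$

Case 2.1: $\mathrm{pred}_k^t < \mathrm{pred}_k^c$.

Then $\mathrm{pred}_k^c \ge \frac{K_3}{4} \varepsilon \Delta_k$ for $\Delta_k \le \delta$. We reduce $\delta$ until for all $s \in \mathbb{R}^n$ with $\|s\| \le 2\delta$ holds

$$\|A_k A_k^T\| \delta + 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \le (1 - \eta_1) \frac{K_3}{4} \varepsilon.$$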


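This is possible by (A1). Invoking (3.1), we obtain $\mathrm{rared}_k^c \ge \mathrm{ared}_k^c \ge \eta_1\, \mathrm{pred}_k^c$ whenever $\Delta_k \le \delta$ and thus the trial step is accepted in Step 5.2. If (A4) and (A5) hold and $g$, $A$ are uniformly continuous, $\delta$ can be chosen depending only on $\varepsilon$ and on the bounds in the assumptions.

Case 2.2: $\mathrm{pred}_k^t \ge \mathrm{pred}_k^c$.

For the acceptance of the step we have to make sure that $\mathrm{rared}_k^c \ge \eta_1\, \mathrm{pred}_k^c$. The additional condition $\mathrm{rared}_k^\ell \ge \eta_1\, \mathrm{pred}_k^\ell$ is only required if also $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$. In this case, the latter requirement is met by noting that (A1) allows to reduce $\delta$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds

$$\|H_k\| \delta + \|g(x_k + s) - g_k\| + \|(A(x_k + s) - A_k) y_k\| \le \frac{\gamma}{2} (1 - \eta_1) \frac{K_3}{4} \varepsilon.$$

Then, by (3.2) and (3.5), $\mathrm{rared}_k^\ell \ge \mathrm{ared}_k^\ell \ge \eta_1\, \mathrm{pred}_k^\ell$ if $\Delta_k \le \delta$. If (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous then $\delta$ can be chosen depending only on $\varepsilon$.

The first requirement $\mathrm{rared}_k^c \ge \eta_1\, \mathrm{pred}_k^c$ can be achieved by reducing $\delta$ further according to the following cases:

Case 2.2.1: $\|c_k\| \ge \min\{\alpha\, a_{j_k}, \beta \varepsilon\} \overset{\text{def}}{=} \varepsilon_{j_k}$.

Reduce $\delta$ such that $\delta \le \|c_k\|$ and that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds

$$\|A_k A_k^T\| \delta + 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \le (1 - \eta_1) K_2 \|c_k\|.$$

This is again possible by (A1). By (2.7) we have $\mathrm{pred}_k^c \ge K_2 \|c_k\| \Delta_k$ if $\Delta_k \le \delta$. Hence, $\mathrm{rared}_k^c \ge \mathrm{ared}_k^c \ge \eta_1\, \mathrm{pred}_k^c$ by (3.1) whenever $\Delta_k \le \delta$ and therefore the step is accepted in 5.1. If (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous then $\delta$ can be chosen depending only on $\min\{\varepsilon, \|c_k\|\} \ge \min\{\alpha\, a_{j_k}, \beta \varepsilon\}$ and the bounds in the assumptions.

Case 2.2.2: $\|c_k\| < \varepsilon_{j_k}$.

Then $R_k \ge \min\{a_{j_k}^2, \varepsilon^2\} \ge \varepsilon_{j_k}^2 / \max\{\alpha^2, \beta^2\} \ge 4 \varepsilon_{j_k}^2$ and with suitable $\tau \in [0, 1]$ and $x_k^\tau = x_k + \tau s_k$ holds

$$\mathrm{rared}_k^c \ge R_k - \|c_k\|^2 - 2 \big(A(x_k^\tau) c(x_k^\tau)\big)^T s_k.$$

Since $(A_k c_k)^T s_k = (A_k c_k)^T s_k^n \le 0$, this gives

$$\mathrm{rared}_k^c \ge 3 \varepsilon_{j_k}^2 - 4 \|A(x_k^\tau) c(x_k^\tau) - A_k c_k\| \Delta_k.$$

Moreover, $\mathrm{pred}_k^c \le \|c_k\|^2 \le \varepsilon_{j_k}^2$. Now reduce $\delta$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds

$$3 \varepsilon_{j_k}^2 - 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \delta \ge \eta_1 \varepsilon_{j_k}^2.$$

This is possible by (A1). Then $\mathrm{rared}_k^c \ge \eta_1\, \mathrm{pred}_k^c$ and, hence, the trial step is accepted in Step 5.2 for all $\Delta_k \le \delta$. If (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous then $\delta$ can be chosen depending only on $\min\{\alpha\, a_{j_k}, \beta \varepsilon\} = \varepsilon_{j_k} \ge \min\{\varepsilon, \|c_k\|\}$ and the bounds in the assumptions.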

3.3. A nonmonotone decrease lemma. The following crucial decrease lemma is a slight modification of [29, Lem. 4.3].

LEMMA 3.3. Suppose that there exists $K \ge 0$ such that for all iteration levels $k \ge K$ holds

$$\mathrm{rared}_{k,a}^c = \max\Big\{\|c_k\|^2,\ \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c_{k+1}\|^2.$$

Then for all $k \ge K$

$$\|c_{k+1}\|^2 \le \max_{K - \nu_K^c < l \le K} R_l - \eta_1 \sum_{r=K}^{k} \lambda^{\min\{k - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c. \tag{3.6}$$

Proof. Set $M_K = \max_{K - \nu_K^c < l \le K} R_l$ and $\mathrm{ared}_{k,a}^c = \|c_k\|^2 - \|c_{k+1}\|^2$. The proof is by induction. Since $R_l \ge \|c_l\|^2$, $0 \le l \le K$, we have for $k = K$

$$M_K - \|c_{k+1}\|^2 \ge \mathrm{rared}_{k,a}^c \ge \eta_1\, \mathrm{pred}_{k,a}^c.$$

Now let $k \ge K$. If $\mathrm{rared}_{k+1,a}^c = \mathrm{ared}_{k+1,a}^c$ we get by $(3.6)_k$

$$\|c_{k+2}\|^2 = \|c_{k+1}\|^2 - \mathrm{rared}_{k+1,a}^c \le M_K - \eta_1 \sum_{r=K}^{k} \lambda^{\min\{k - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c - \eta_1\, \mathrm{pred}_{k+1,a}^c,$$

which implies $(3.6)_{k+1}$, since $0 < \lambda < 1$.

Now consider the case where $\mathrm{rared}_{k+1,a}^c \ne \mathrm{ared}_{k+1,a}^c$. Then we obtain with $q = \nu_{k+1}^c - 1$ by using (3.6) and the fact that $\|c_{k+1-p}\|^2 \le M_K$ for $K - \nu_K^c < k + 1 - p \le K$,

$$
\begin{aligned}
\|c_{k+2}\|^2 &= \sum_{p=0}^{q} \lambda_{k+1,p}^c \|c_{k+1-p}\|^2 - \mathrm{rared}_{k+1,a}^c \\
&\le \sum_{p=0}^{q} \lambda_{k+1,p}^c \Big( M_K - \eta_1 \sum_{r=K}^{k-p} \lambda^{\min\{k - p - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c \Big) - \eta_1\, \mathrm{pred}_{k+1,a}^c \\
&\le M_K - \eta_1 \sum_{r=K}^{k-q} \lambda^{\min\{k - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c - \eta_1 \lambda \sum_{r=\max\{K,\, k-q+1\}}^{k} \lambda^{\min\{k - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c - \eta_1\, \mathrm{pred}_{k+1,a}^c.
\end{aligned}
$$

Now $r \ge k - q + 1$ yields $k - r \le q - 1 \le \nu^c - 2$ and therefore $1 + \min\{k - r, \nu^c\} = \min\{k + 1 - r, \nu^c\}$. Thus, we see from the last chain of inequalities that

$$
\begin{aligned}
\|c_{k+2}\|^2 &\le M_K - \eta_1 \sum_{r=K}^{k} \lambda^{\min\{k + 1 - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c - \eta_1\, \mathrm{pred}_{k+1,a}^c \\
&= M_K - \eta_1 \sum_{r=K}^{k+1} \lambda^{\min\{k + 1 - r,\, \nu^c\}}\, \mathrm{pred}_{r,a}^c,
\end{aligned}
$$

which concludes the proof.

To show the convergence towards stationary points we will moreover use the following decrease lemma, which is very similar to the previous Lemma 3.3.

LEMMA 3.4. Suppose that there exists $K \ge 0$ such that for all iteration levels $k \ge K$ holds

$$\mathrm{rared}_{k,a}^\ell + \ell(x_{k+1}, y_k) - \ell_{k+1} \ge \frac{\eta_1}{2}\, \mathrm{pred}_{k,a}^\ell.$$

Then for all $k \ge K$

$$\ell_{k+1} \le \max_{K - \nu_K^\ell < l \le K} \ell_l - \frac{\eta_1}{2} \sum_{r=K}^{k} \lambda^{\min\{k - r,\, \nu^\ell\}}\, \mathrm{pred}_{r,a}^\ell. \tag{3.7}$$


Proof. We only note that by the definition of $\mathrm{rared}_{k,a}^\ell$ holds

$$\mathrm{rared}_{k,a}^\ell + \ell(x_{k+1}, y_k) - \ell_{k+1} = \max\Big\{\ell_k,\ \sum_{r=0}^{\nu_k^\ell - 1} \lambda_{kr}^\ell\, \ell_{k-r}\Big\} - \ell_{k+1}.$$

Now (3.7) follows exactly by the same arguments as in the proof of Lemma 3.3.

3.4. Convergence to feasible points. The following auxiliary result will be useful:

LEMMA 3.5. Let the assumptions (A1)–(A4) hold. If $j$ is increased by Algorithm R in iteration $k$, i.e., $j_{k+1} = j_k + 1$, then

$$\|c_{k'}\| \le \frac{1}{\sqrt{\lambda}}\, a_{j_k} \quad \text{for all } k' \ge k.$$

In particular, for all iterations $k$ with $j_k \ge 1$ holds

$$\|c_k\| \le \frac{a_{j_k}}{\sqrt{\lambda}\, \alpha_0}, \tag{3.8}$$

where $\alpha_0$ is the constant in (2.8).

Proof. If $j_{k+1} = j_k + 1$ then we must have

$$\|c_k\| < \min\{\alpha\, a_{j_k}, \beta \|\hat g_k\|\} \le \alpha\, a_{j_k} \quad \text{and} \quad a_{j_k}^2 \ge R_k \ge \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2.$$

Thus, using $\lambda_{kr}^c \ge \lambda$, we obtain

$$\|c_{k'}\|^2 \le \frac{1}{\lambda}\, a_{j_k}^2, \quad k' = k + 1 - \nu_k^c, \dots, k.$$

We now show by induction

$$\|c_{k'}\| \le \frac{1}{\sqrt{\lambda}}\, a_{j_k} \quad \text{for all } k' \ge k + 1 - \nu_k^c. \tag{3.9}$$

For $k' = k + 1 - \nu_k^c, \dots, k$ this is already shown. Now let the assertion hold for the iterations $k + 1 - \nu_k^c, \dots, k'$, $k' \ge k$. Since by Lemma 3.2 the $k'$-th iteration will eventually be successful, we obtain in particular that

$$0 \le \eta_1\, \mathrm{pred}_{k',a}^c \le \mathrm{rared}_{k',a}^c = \max\Big\{R_{k'},\ \sum_{r=0}^{\nu_{k'}^c - 1} \lambda_{k'r}^c \|c_{k'-r}\|^2\Big\} - \|c_{k'+1}\|^2.$$

Since $R_{k'} \le \max\{\|c_{k'}\|^2, a_{j_{k'}}^2\} \le \max\{\|c_{k'}\|^2, a_{j_k}^2\}$, we have by the induction hypothesis

$$\|c_{k'+1}\|^2 \le \max\{a_{j_k}^2, \|c_{k'}\|^2, \dots, \|c_{k'+1-\nu_{k'}^c}\|^2\} \le \frac{1}{\lambda}\, a_{j_k}^2.$$

This proves the first assertion.


Now $\|c_k\| \le a_{j_k} / \sqrt{\lambda}$ follows immediately if $j$ is increased by Algorithm R in iteration $k$, i.e., $j_{k+1} = j_k + 1$. For all subsequent iterations $k'$ satisfying $j_{k'} = j_{k+1}$ we have by our previous result and by (2.8)

$$\|c_{k'}\| \le \frac{a_{j_k}}{\sqrt{\lambda}} = \frac{a_{j_{k'} - 1}}{\sqrt{\lambda}} \le \frac{a_{j_{k'}}}{\sqrt{\lambda}\, \alpha_0}.$$

Therefore, (3.8) holds for all $k$ with $j_k \ge 1$.

The next lemma shows that $c_k$ must converge to zero if the assumption of the decrease Lemma 3.3 does not hold.

LEMMA 3.6. Let the assumptions (A1)–(A4) hold. If for infinitely many iterations holds

$$\mathrm{rared}_{k,a}^c \ne \max\Big\{\|c_k\|^2,\ \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c_{k+1}\|^2$$

then $j_k \to \infty$ and $\|c_k\| \to 0$.

Proof. Under the assumptions of the lemma, there exists an infinite subsequence of iterations $k'$ for which holds

$$R_{k'} = \min\{a_{j_{k'}}^2, \|\hat g_{k'}\|^2\} > \|c_{k'}\|^2 \quad \text{and} \quad R_{k'} > \sum_{r=0}^{\nu_{k'}^c - 1} \lambda_{k'r}^c \|c_{k'-r}\|^2.$$

Hence, $j_{k'+1} = j_{k'} + 1$ in each iteration $k'$ and thus we must have $j_k \to \infty$. But now Lemma 3.5 yields

$$\lim_{k \to \infty} \|c_k\| \le \lim_{k \to \infty} \frac{a_{j_k}}{\sqrt{\lambda}\, \alpha_0} = 0.$$

Combining Lemmas 3.3 and 3.6, we can establish convergence to feasible points, which proves part (ii) of Theorem 3.1.

LEMMA 3.7. If Algorithm A does not terminate finitely then

$$\lim_{k \to \infty} \|c_k\| = 0.$$

Proof. Assume that $c_k$ does not tend to zero. Then Lemma 3.6 yields possibly after increasing $K$

$$\mathrm{rared}_{k,a}^c = \max\Big\{\|c_k\|^2,\ \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c_{k+1}\|^2 \quad \text{for all } k \ge K. \tag{3.10}$$

Thus Lemma 3.3 is applicable and we obtain that for all $k \ge K$ holds

$$\|c_{k+1}\|^2 \le M_K - \eta_1 \lambda^{\nu^c} \sum_{r=K}^{k} \mathrm{pred}_{r,a}^c, \quad \text{where } M_K = \max_{K - \nu_K^c < l \le K} R_l. \tag{3.11}$$

We first show that

$$\liminf_{k \to \infty} \|c_k\| = 0. \tag{3.12}$$


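If this is wrong then possibly after increasing $K$ there exists $\varepsilon > 0$ with $\|c_k\| \ge \varepsilon$ for all $k \ge K$. Now by (2.3)

$$\mathrm{pred}_{k,a}^c \ge K_2 \varepsilon \min\{\varepsilon, \Delta_{k,a}\}, \tag{3.13}$$

which, together with (3.11), shows that

$$\sum_{k \ge K} \Delta_{k,a} < \infty. \tag{3.14}$$

Thus, $(x_k) \subset \hat\Omega$ is a Cauchy sequence and converges to some $\bar x \in \hat\Omega \subset \Omega$. The continuity of $c$, $A$, and $g$ and the boundedness of $(H_k)$ and $(y_k)$, see (A1) and (A5), implies the existence of $0 < \delta \le \varepsilon$ and $\delta' > 0$ such that for all $x_k$ with $\|x_k - \bar x\| \le \delta'$ and all $s$ with $\|s\| \le 2\delta$ the inequalities (3.3) and (3.4) are satisfied. In the proof of Lemma 3.2, Case 1, it was shown (note $\|c_k\| \ge \varepsilon$) that the step is accepted if (3.3), (3.4) hold for all $s$ with $\|s\| \le 2\delta$ and if in addition $\Delta_k \le \delta$. Since for all sufficiently large $k \ge K$ we have $\|x_k - \bar x\| \le \delta'$, the mechanism of updating $\Delta_k$ would thus ensure that the step is accepted with $\Delta_{k,a} \ge \min\{\Delta_{\min}, \gamma_1 \delta\}$. This contradicts (3.14) and (3.12) is proven.

Now assume that (3.12) holds, but $\|c_k\|$ does not converge to zero. Then there is $\varepsilon > 0$ with $\|c_{\hat k}\| \ge 2\varepsilon$ for a subsequence $(\hat k)$. By (3.12), we can associate with each $\hat k$ some $\bar k \ge \hat k$ with

$$\|c_{\bar k + 1}\| < \varepsilon, \quad \|c_k\| \ge \varepsilon, \quad k = \hat k, \dots, \bar k.$$

As a consequence we have by (2.3)

$$\mathrm{pred}_{k,a}^c \ge K_2 \varepsilon \min\{\varepsilon, \Delta_{k,a}\}, \quad k = \hat k, \dots, \bar k. \tag{3.15}$$

Since moreover (3.11) holds, we must have

$$\sum_{k=\hat k}^{\bar k} \mathrm{pred}_{k,a}^c \to 0 \quad \text{for } \hat k \to \infty$$

and thus by (3.15) and $\|s_{k,a}\| \le 2 \Delta_{k,a}$

$$\|x_{\bar k + 1} - x_{\hat k}\| \le 2 \sum_{k=\hat k}^{\bar k} \Delta_{k,a} \to 0 \quad \text{for } \hat k \to \infty.$$

Since $c$ is Lipschitz continuous on $\Omega$ by (A3), we conclude that

$$\varepsilon = 2\varepsilon - \varepsilon \le \|c_{\hat k}\| - \|c_{\bar k + 1}\| \le \|c_{\bar k + 1} - c_{\hat k}\| \to 0 \quad \text{for } \hat k \to \infty,$$

which is a contradiction. Hence, our assumption was wrong and the proof is complete.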

3.5. Convergence to stationary points. As a next step we consider the convergence behavior of the reduced gradient $\hat g_k = W_k^T g_k$. The following lemma gives an important lower bound for acceptable trust-region radii. Hereby, we establish two variants of the result. One holds in the general setting of assumptions (A1)–(A6). The second result is stronger and holds under assumption (A7). It is used in section 4 to achieve locally quadratic convergence.

LEMMA 3.8. Let the assumptions (A1)–(A6) be satisfied. Then the following holds.

(i) There exists a constant $\kappa_1 > 0$ independent of $k$ such that

$$\mathrm{rared}_k^c \ge \eta_1\, \mathrm{pred}_k^c$$

is satisfied whenever

$$\max\{\|s_k^n\|, \|s_k^t\|\} \le \delta_k \overset{\text{def}}{=} \kappa_1 \min\big\{1,\ \max\{\min\{a_{j_k}, \|\hat g_k\|\}, \|c_k\|\}^{2/3}\big\}. \tag{3.16}$$

(ii) There exists a constant $\kappa_2 > 0$ independent of $k$ such that the step $s_k$ is accepted whenever

$$\max\{\|s_k^n\|, \|s_k^t\|\} \le \min\{\delta_k,\ \kappa_2 \max\{\|\hat g_k\|, \|c_k\|\}\}, \tag{3.17}$$

with æ k asin (i).(iii) If in addition (A7) holdsthenthere exists ë Ý 0 and Ü 1 Ý 0 in (i) can be chosen

such that thestepsk is acceptedwhenever á xk ì6íx á¬îïë and

maxà�á snk á(âãá st

k á.ä/å min æ k â%Ü 1 maxà�á�ègk á(âãá ck á.äñð ê (3.18)

Proof. Set ò kç maxà�á sn

k á(âãá stk á.ä andnotethat ò k åôó k.

(i): Taylorexpansionyieldswith x õk ç xk öÎ÷ sk andappropriate÷Èø [0 â 1]

ùaredck ì predck

ù å 4 á A ú x õk û A ú x õk û T ì Ak ATk áüò 2

k ö 4 á%ý 2c ú x õk û�þ c ú x õk û áüò 2kê

Using(A3), (A6) weconcludethatthereexists K5 Ý 0 with

ùaredck ì predck

ù å K5ò 2k ú,ò k ö á ck á û ê (3.19)

Wenow considertwo cases.

Case1: á ck á Þ min à,ÿ a jk â��Õá�ègk á.äSinceraredck Þ aredck, thedecreasecondition(2.3) and(3.19)ensurethat raredck Þôß 1predckholdsif

K5ò 2k úñò k ö á ck á û åôú 1 ì ß 1û K2 á ck á min à�á ck á(â%ò k ä ê (3.20)

If á ck á}å�ò k, (3.20)holdsif 2K5ò 3k å�ú 1 ì ß 1û K2 á ck á 2, i.e., if

ò k å C1é 31 á ck á 2é 3 â C1

defç ú 1 ì ß 1û K2

2K5

êIn thecaseá ck á Ý ò k, (3.20)is satisfiedif

2K5ò 2k á ck á!åôú 1 ì ß 1û K2 á ck á ò k â i.e., if ò k å C1

êSince á ck á Þ min à ÿ a jk â�� á�ègk á.ä theassertion(i) thusholdswith

Ü 1ç C2

defç min à C1 â C1é 31 min à ÿ�â�� ä 2é 3 ä ê

Case2: á ck á î min à,ÿ a jk â�� á�ègk á.äThenRk

ç min à a2jkâãá�ègk á 2 ä accordingto Algorithm R. Weobtain

raredck Þ Rk ì á ck� 1 á 2 ç Rk ì á ck á 2 ö predck ö ú aredck ì predck û ê


Using(3.19)we getraredck � predck if

Rk ��� ck � 2 � K5� 2k � � k � ck ���� (3.21)

Case2.1: Rk � ���gk � 2ThenRk ��� ck � 2 � � 1 ��� 2�����gk � 2 since � ck ���������gk � , and(3.21)is satisfiedif

� 1 ��� 2 �����gk � 2 � K5� 2k � � k ������gk ���� (3.22)

If ���gk ����� k, (3.22)holdsif

� 1 � � 2�����gk � 2 � K5 � 1 ��!�����gk � 2� k " i.e., if � k � C3def� 1 � �

K5

If ���gk ���#� k, (3.22)holdsif

� 1 �$� 2�����gk � 2 � K5 � 1 %�&�'� 3k " i.e., if � k � C1( 3

3 ���gk � 2( 3 Therefore,since � ck �����)���gk � , (3.22)holdsif

� k � min C3 " C1( 33 max ���gk � " � ck � 2( 3

Case2.2: Rk � a2jk

ThenRk ��� ck � 2 � � 1 ��* 2� a2jk

and � ck ����* a jk . As in Case2.1,(3.21)holdsif

� k � min C4 " C1( 34 a2( 3

jk " C4def� 1 � *

K5"

whichyieldswith � ck ���+* a jk

� k � min C4 " C1( 34 max a jk " � ck � 2( 3

Thus,theassertion(i) is provenwith , 1 � min C2 " C3 " C1( 33 " C4 " C1( 3

4 .

(ii): If predtk - max predck " � predck �/. or pred0k -21 predtk thenno further acceptance

criteriaarerequiredandwe aredone.Otherwise,wehave

predtk � max predc

k " � predck � . and pred0k � 1 predtk " (3.23)

andgeta furtherrestrictionon � k by therequirementrared0k �43 1pred0k. Now (3.2)and(A6)yield a constantK6 � 0 with 5

ared0k � pred0k 5 � K6� 2k (3.24)

Weconsiderfirst thecasethat � ck � � ���gk � . We know thatin thepresentcaseholds

pred0k � 1 predtk � 1 predck � 1 K2 � ck � min � ck � " � k Sincerared0k � ared0k, weconcludefrom (3.24)thatrared0k �+3 1pred0k is ensuredif

K6� 2k � 1 � 1 � 3 1� K2 � ck � min � ck � " � k


This is satisfiedif

6k 7 C5 8 ck 8:9 C5 ; min

<>= 1 ?�@ 1A K2

K69 <>= 1 ?�@ 1A K2

K6

1B 2 C

Hence,in thecase8 ck 8�DE8�Fgk 8 thestepis acceptedif (3.17)holdswith G 2 ; C5.Wenow considerthecase8�Fgk 8�DE8 ck 8 . By (A3) and(A5) thereexistsaconstantK7 H 0

with

8 WTk I qk

= snk A 8JDE8�Fgk 8 ? K7 8 sn

k 8�DK8�Fgk 8 ? K76

k 9 (3.25)

8 WTk I qk

= snk A 8J7E8�Fgk 8&L K7 8 sn

k 8�7K8�Fgk 8&L K1K7 8 ck 8:9 (3.26)

wherewe have used(2.2) in the last inequality. Thus,we obtainfor 6 k 7 12K7 8�Fgk 8 by (2.5)

and(3.25)

predMk D < predtk D < K3

4 8�Fgk 8 min 8�Fgk 8:9 6 k

C(3.27)

Hence,by (3.24)we haveraredMk D @ 1predMk whenever 6 k 7 12K7 8�Fgk 8 and

K66 2

k 7 <>= 1 ?�@ 1A K3

48�Fgk 8 min 8�Fgk 8:9 6 k

CAll this is satisfiedwhenever

6k 7 C N5 8�Fgk 8:9 C N5 ; min

1

2K79 <>= 1 ? @ 1A K3

4K69 <>= 1 ? @ 1A K3

4K6

1B 2 C

Hence,thestepis acceptedif (3.17)holdswith G 2 ; min O C5 9 C N5 P , whichcompletestheproofof (ii).

(iii): Now let in addition(A7) hold. We chooseQ H 0 suchthatthe2Q -neighborhoodofRx is containedin S N . In therestof theproof we show thataftera possiblefurtherreductionof G 1 thestepis acceptedwhenever 8 xk ? Rx 8UT Q and(3.18)is satisfied.

Therefore,let xk satisfy 8 xk ? Rx 8VT Q . As alreadyin (ii), we have only to considerthecase(3.23)in which we have to ensurethatraredMk D @ 1predMk. To this end,let 0 T G N1 7 Q�W 2be suchthat (i) holds for G 1 ; G N1 (andthusfor all 0 T G 1 7 G N1) andconsiderstepsthatsatisfy(3.16)for G 1 ; G N1. In particular, we thenhave 6 k 7 Q�W 2 andthus[xk 9 xk L sk] X4S N .

Using(A7), Taylorexpansionyields

aredMk ? predMk ; 1

2sTk= I 2

x Y = x Zk 9 yk A ? I 2x Y = xk 9 yk A'A sk

for appropriatex Zk ; xk L\[ sk, [^] [0 9 1]. Thus,(A7) yieldsa constantK8 H 0 with

_aredMk ? predMk _ 7 K8

6 3k

C(3.28)

Now weconcludefrom (2.2)andthefirst inequalityin (3.25)that

8 WTk I qk

= snk A 8�D`8�Fgk 8 ? K1K7 8 ck 8 C (3.29)

Weconsiderfirst thecase

8 ck 8�7 1

2K1K7maxO 8�Fgk 8:9a8 WT

k I qk= sn

k A 8 P C (3.30)


Thenby (3.29)

bWT

k c qk d snk e bgf 1

2b�hgkb

andthus(3.27)holdsby (2.5). Hence,(3.27),(3.28)guaranteeraredik f+j 1predik whenever

K8k 3k lnm d 1 o j 1e K3

4b�hgkb

minb�hgkb:p k k q

This is satisfiedfor

k k l C6 min r b�hgkb 2s 3 pab�hgk

b 1s 2 t p C6 u min m d 1 o j 1e K3

4K8

1s 3 p m d 1 o j 1e K3

4K8

1s 2 qMoreover, we have in thecase(3.30)

bckb l 1

K1K7

b�hgkb q (3.31)

In fact,either(3.31)follows directly from (3.30)or we havebWT

k c qk d snk e bvf 2K1K7

bckb

and thus(3.26) yields (3.31). We now choosew 1 u min rxwzy1 p C6 min r 1 p d K1K7e/{ t|t . Then(3.18)implies k k l�} k l w 1 and(notethat ~�� 2� 3)

k k l w 1 min r 1p maxr b�hgkb:pab

ckb t { t l w 1 min r 1 pab�hgk

b { maxr 1 p d K1K7e�� { t'tl C6 min r 1 pab�hgk

b { t l C6 min r b�hgkb 2s 3 pab�hgk

b 1s 2 t qHence,theproof of (iii) is completein thecase(3.30).

It remainsto considerthecase

bckb � 1

2K1K7maxr b�hgk

b:pabWT

k c qk d snk e b t q (3.32)

Now, sincestk u Wkdk with dk u d WT

k Wk e � 1WTk st

k, (A3) and(A5) yield a constantK9 � 0suchthat

predtk l b WT

k c qk d snk e b�b d WT

k Wk e�� 1WTk st

kb&� 1

2bHkb k 2

k

l K9k k d b WTk c qk d sn

k e b�� k k e qSincepredtk

fmax predck

p d predck e { , wededucefrom (2.3)

K9k k d b WTk c qk d sn

k e b&� k k e f K {2 b ckb { min r b ck

b { p k {k twhichyields

K9k k d 2K1K7bckb&� k k e f K {2 b ck

b { min r b ckb { p k {k t q

IntroducingtheconstantC7 u K �2�1� 2K1K7 � K9

, this implies

k 2kf

C7bckb 2{ p if

bckb l k k

p(3.33)

k kbckb�f

C7bckb { k {k p if

bckb � k k. (3.34)


In thecase(3.33)weobtain� k � C1� 27 � ck ��� . Since � ck � is boundedby (A3) and2� 3 � �`�

1, we seethat in thecase(3.34)thereexistsa constantC8 � 0 with � k � C8. We concludethatin thesituation(3.23),(3.32)alwaysholds

� k � min � C1� 27 � ck � �!� C8 �|�

Since � ck � � 12K1K7 ���gk � , wefind thatwith C9 � min � C1� 2

7 min � 1 ��� 2K1K7��� � � � C8 � holds

� k � C9 min � 1 � max� � ck � � �gk � � �|�Choosing� 1 � min ���a 1 � C9 � , this concludestheproofof (iii) alsofor thecase(3.32).

The following lemma establishes part (iii) of Theorem 3.1.

LEMMA 3.9. Let (A1)–(A6) hold. If the algorithm does not terminate finitely then

\[ \liminf_{k \to \infty} \|\hat g_k\| = 0. \]

Proof. Assumethatthealgorithmrunsinfinitely andthatthereareK � 0 and ¤^¥ � 0 � 1]with ���gk � � 2¤ for all k � K . We first show thatafterapossibleincreaseof K holds

pred¦k§ a �©¨ predtk§ a � predtk§ a � max predck§ a ��� predck§ a � � � for all k � K �

As in theproof of Lemma3.8,see(3.29),thereexistsa constantK7 � 0 with

� WTk ª qk � sn

k§ a � � � ���gk �¬« K7 � snk§ a � � 2¤ « K1K7 � ck �

for all k � K , wherewehaveused(2.2). Since � ck �¬­ 0 by Lemma3.7,wecanincreaseKsuchthat

� WTk ª qk � sn

k§ a � � � ¤ for all k � K �andthusby (2.5)

predtk§ a � K3¤ min ¤ ��® k§ a for all k � K � (3.35)

On theotherhandholds

predck§ a ¯ 2 � Akck ��� snk§ a ��°� Ak AT

k ��� snk§ a � 2 �

and,hence,by (2.2)and(A3) find aconstantK10 � 0 with

predck§ a ¯ K10 � ck � min � ck � ��® k§ a � (3.36)

Since � ck ��­ 0 by Lemma3.7 and ���gk � � ¤ , we obtainfrom Lemma3.8, (i)–(ii), andthemechanismof updating® k thatafterapossibleincreaseof K holds

® k§ a �n¨ 1� 1 � ck � 2� 3 � � ck � for all k � K � (3.37)

andthuswehaveby (3.35)–(3.37)that,for sufficiently large K ,

predck§ a ¯ K10 � ck � 2 � predt

k§ a � K3¤ ¨ 1� 1 � ck � 2� 3 for all k � K � (3.38)

Hence,using � � 2� 3 and � ck �±­ 0, weseefrom (3.38)that,possiblyafterincreasingK ,

predtk§ a � max predck§ a ��� predc

k§ a � � for all k � K � (3.39)


Moreover, we notethatby (A3), (A5), and(2.2) thereis K11 ² 0 with³predkµ a ¶ predt

kµ a ³¸·¹³ qk º snkµ a » ³½¼ K11 ¾ ck ¾:¿

Hence,K canby (3.38)beenlargedsuchthat

predkµ a À©Á predtkµ a for all k À K ¿ (3.40)

Therefore,for all k À K theacceptedstepskµ a satisfies(3.39)and(3.40).Thus,for all k À K ,theacceptanceof thesteptakesplacein Step5.1. In particular, theacceptedstepssatisfy

raredkµ a À# 1predkµ a ÀnÁ� 1K3à min Ã�Ä�Å kµ a for all k À K Ä (3.41)

wherewehaveused(3.35).By our assumptionholds ¾�Ægk ¾ À 2à for all k À K . ThusLemma3.8, (i)–(ii) andthe

mechanismof updatingÅ k yieldsaconstantC1 ² 0 with

Å kµ a À C1 min 1Ä maxÇ min Ç a jk Ä 2ýÈ|Äa¾ ck ¾ÉÈ 2Ê 3 À C1 min Ã�Ä a2Ê 3jk ¿ (3.42)

Hence,weobtainfrom (3.41)someC2 ² 0 with

predkµ a À C2à min Ã�Ä a2Ê 3jk

for all k À K ¿ (3.43)

By (3.41)holdsraredkµ a À� 1predkµ a for all k À K . We want to applythedecreaseLemma

3.4 andsinceraredkµ a usesË º xkÌ 1 Ä yk » , we show that³ Ë kÌ 1 ¶ Ë º xkÌ 1 Ä yk » ³ becomessmall

comparedto predkµ a. In fact,³ Ë º xkÌ 1 Ä ykÌ 1 »>¶ Ë º xkÌ 1 Ä yk » ³½¼ ¾ ykÌ 1 ¶ yk ¾�¾ ckÌ 1 ¾:¿ (3.44)

Usingpredckµ a À 0, we havewith (3.19)

¾ ckÌ 1 ¾ 2 · ¾ ck ¾ 2 ¶ aredckµ a ¼ ¾ ck ¾ 2 Í ³ aredckµ a ¶ predckµ a ³½¼ ¾ ck ¾ 2 Í K5 Å 2kµ a º|Å kµ a Í ¾ ck ¾ » ¿

This togetherwith (3.37)yieldsaconstantK12 ² 0 suchthat

¾ ckÌ 1 ¾ 2 ¼ K12 Å 3kµ a for all k À K ¿ (3.45)

Using(A5), (3.41),(3.44),and(3.45)weobtainpossiblyafterincreasingK

³ Ë º xkÌ 1 Ä ykÌ 1»Î¶ Ë º xkÌ 1 Ä yk » ³a¼  1

2predkµ a for all k À K ¿

In fact, by (3.41), (3.44), (3.45) this is clear for Å kµ a ¼ C3, C3 ² 0 small enough.AfterincreasingK this holdsalsofor all Å kµ a À C3 ² 0, sinceby ckÌ 1 Ï 0 accordingto Lemma3.7 theleft termtendsby (3.44)to zerowhereastheright handsidehasby (3.41)a positivelowerbound.Hence,wegetwith (3.41)

raredkµ a Í Ë º xkÌ 1 Ä yk »>¶ Ë kÌ 1 À Â 1

2predkµ a for all k À K ¿

Thisyieldsby Lemma3.4with MK·

maxK Ð�ÑÓÒK Ô l Õ K Ë lË kÌ 1

¼MK ¶ Â 1

2

Ö Ñ Ò k

r × K

predkµ a


for all k Ø K . SinceÙ k is boundedfrom below by (A3) and(A5) this giveswith (3.43)ÚkÛ K

C2 min Ü 1Ý a2Þ 3jk ßáà

ÚkÛ K

predâkã a ä å�æBut theleft handsideis not summablebecauseaç j is not summableand è£é 2ê 3. Hence,wehavederivedacontradictionandtheproof is complete.

4. Transition to fast local convergence. Throughout this section we assume that assumptions (A1)–(A7) hold. We now show that the proposed Algorithm A converges with local quadratic rate towards a point satisfying the second order sufficient condition. Hereby, we work with an SQP-Newton-type step computation. These steps are shown to be accepted by our algorithm in a neighborhood of a stationary pair $(\bar x, \bar y) \in \hat\Omega \times \mathbb{R}^m$ satisfying the following standard sufficient second order condition:

(O2) $c(\bar x) = 0$, $\hat g(\bar x) = 0$, $W(\bar x)^T \nabla^2_{xx}\ell(\bar x, \bar y)\, W(\bar x)$ positive definite, $A(\bar x)$ has full column rank.

The last condition ensures that the Lagrange multiplier $\bar y$ is unique.

4.1. Requirement on the step computation. To achieve fast local convergence we have to ensure that close to the solution SQP-Newton steps are taken. This requires an appropriate splitting of these steps in their quasi-normal and tangential part and a careful choice of the Lagrange multiplier update rule. For the derivation of these concepts, we begin by collecting some facts about the local convergence behavior of SQP methods. Hereby, we choose an informal style of presentation since these results are by now well-known.

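The SQP-Lagrange-Newton system is given by

\[ \begin{pmatrix} \nabla^2_{xx}\ell(x_k, y_k) & A_k \\ A_k^T & 0 \end{pmatrix} \begin{pmatrix} s_{N,k} \\ z_{N,k} \end{pmatrix} = - \begin{pmatrix} \nabla_x \ell(x_k, y_k) \\ c_k \end{pmatrix}. \]

Under the assumptions (A7) and (O2) it is well known that for all $\theta \in (0,1)$ there exist neighborhoods $U_N$ of $\bar x$ and $V_N$ of $\bar y$ such that for $x_k \in U_N$, $y_k \in V_N$ the steps $s_{N,k}$ and $z_{N,k}$ are well defined with $x_k + s_{N,k} \in U_N$, $y_k + z_{N,k} \in V_N$,

\[ \|x_k + s_{N,k} - \bar x\| + \|y_k + z_{N,k} - \bar y\| \le \theta \bigl( \|x_k - \bar x\| + \|y_k - \bar y\| \bigr), \tag{4.1} \]

and that for $(x_k, y_k) \to (\bar x, \bar y)$ holds

\[ \|x_k + s_{N,k} - \bar x\| + \|y_k + z_{N,k} - \bar y\| = O\bigl( \|x_k - \bar x\|^2 + \|y_k - \bar y\|^2 \bigr), \tag{4.2} \]

\[ \|\nabla_x \ell(x_k + s_{N,k}, y_k + z_{N,k})\| + \|c(x_k + s_{N,k})\| = O\bigl( \|\nabla_x \ell_k\|^2 + \|c_k\|^2 \bigr). \tag{4.3} \]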

Furthermore, if $s^n_{N,k}$ and $s^t_{N,k}$ satisfy

\[ A_k^T s^n_{N,k} = -c_k, \qquad s^t_{N,k} = W_k d^t_{N,k}, \quad \text{where } \bigl( W_k^T \nabla^2_{xx}\ell_k W_k \bigr) d^t_{N,k} = -\bigl( \hat g_k + W_k^T \nabla^2_{xx}\ell_k\, s^n_{N,k} \bigr), \tag{4.4} \]

then we have $s_{N,k} = s^n_{N,k} + s^t_{N,k}$. Hereby, we have used the identity $\hat g_k = W_k^T \nabla_x \ell_k$. Note that $s^n_{N,k}$ solves the unconstrained quasi-normal problem

\[ \min \|c_k + A_k^T s^n\|^2, \]

and that $s^t_{N,k}$ is the corresponding solution of the unconstrained tangential problem

\[ \min q_k(s^n_{N,k} + s^t) \quad \text{subject to } A_k^T s^t = 0 \]

with $H_k = \nabla^2_{xx}\ell(x_k, y_k)$.

� . To thisend,similar asin [19], let Y : UN �� m bea consistentupdaterule for theLagrangemultiplier,which is Lipschitzcontinuousat �x, i.e.,

Y þ �x �� �y � Y þ x ��� Y þ �x � � L y x � �x for all x � UN . (4.5)

By apossiblereductionof UN andVN weachievethat(4.1)holdsfor � 1� þ 2 ÿ 2L y� , anda

furtherreductionof UN yieldsY þ UN��� VN . Thus,for all xk � UN holdsyk

� Y þ xk� � VN

and

xk ÿ sN ý k � �x � � xk� �x ÿ yk

� �y 1

2 xk� �x (4.6)

xk ÿ sN ý k � �x � O þ� xk� �x 2 ÿ yk

� �y 2��� O þ� xk� �x 2� � (4.7)

wherewe have used(4.1), (4.2),and(4.5). Therefore,the iterationxk � xk� 1� xk ÿ sN ý k

with yk� Y þ xk

� convergesq-quadraticallyto �x.In thesequelwe restrictourselvesto thefollowing classof updaterules:Let B : UN �

n � m becontinuouslydifferentiablesuchthat AT B is uniformly boundedinvertibleon UN .We introducethemultiplier update

Y þ x ����� þ B þ x � T A þ x ����� 1B þ x � T g þ x � � (4.8)

which is obviously consistentandcontinuouslydifferentiable.Therefore,afterreducingUN

if necessary, B is boundedonUN andY satisfiestheLipschitzcondition.In particular, if we chooseB � A, we obtainthe well-known least-squaresmultiplier

update. Furthermore,the adjoint update, which is widely usedin optimal control, alsofitsin this framework: Let x bepartitionedin theform x � þ zT � uT � T � m � n � m suchthat�

zc þ x � is invertibleonU with uniformly boundedinverse.In anoptimalcontrolcontext thestandardchoicefor z is thestate,andfor u thecontrol. Theadjointupdatefor this splittingnow correspondsto BT � þ BT

z � BTu�� þ I � 0� .

Amongthemany possiblesolutionssnN ý k of thequasi-normalproblemwe selecttheone

containedin spanþ Bk� , i.e.,

snN ý k ��� Bk þ AT

k Bk��� 1ck � (4.9)
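The update rule (4.8) and the quasi-normal step (4.9) translate directly into a few lines of linear algebra. The sketch below is illustrative only: the function names and the small test data are hypothetical, dense solves are used for brevity, and the Lagrangian is assumed to be $\ell(x,y) = f(x) + y^T c(x)$, so that $\nabla_x \ell = g + A y$. It evaluates both the least-squares choice $B = A$ and the adjoint choice $B^T = (I, 0)$ and checks feasibility of the linearized constraint together with the orthogonality $\nabla_x \ell(x_k, y_k)^T s^n_{N,k} = 0$ recorded as (4.10).

```python
import numpy as np

def multiplier_update(B, A, g):
    """Y(x) = -(B^T A)^{-1} B^T g, cf. (4.8)."""
    return -np.linalg.solve(B.T @ A, B.T @ g)

def quasi_normal_step(B, A, c):
    """s^n = -B (A^T B)^{-1} c, cf. (4.9)."""
    return -B @ np.linalg.solve(A.T @ B, c)

# Hypothetical data with n = 5 variables and m = 2 constraints.
A = np.array([[2.0, 0.0], [1.0, 3.0], [0.0, 1.0], [1.0, -1.0], [4.0, 2.0]])
g = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
c = np.array([0.3, -0.7])
n, m = A.shape

choices = {
    "least-squares (B = A)": A,
    "adjoint (B^T = (I, 0))": np.vstack([np.eye(m), np.zeros((n - m, m))]),
}
for name, B in choices.items():
    y = multiplier_update(B, A, g)
    s_n = quasi_normal_step(B, A, c)
    feas = np.linalg.norm(A.T @ s_n + c)   # linearized constraint residual
    orth = abs((g + A @ y) @ s_n)          # property (4.10)
    print(f"{name}: feasibility {feas:.1e}, orthogonality {orth:.1e}")
```

For the adjoint choice the solve with $B^T A$ only involves the leading $m \times m$ block of $A$, which corresponds to the state Jacobian in an optimal control setting.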

By construction holds for $y_k = Y(x_k)$

\[ \nabla_x \ell(x_k, y_k)^T s^n_{N,k} = 0. \tag{4.10} \]

Further, there exist constants $K_1, K_{13} > 0$ such that

\[ \|s^n_{N,k}\| \le K_1 \|c_k\|, \qquad \|s^t_{N,k}\| \le K_{13} \bigl( \|\hat g_k\| + \|c_k\| \bigr), \tag{4.11} \]

where the first inequality follows from (4.9) and for the derivation of the second inequality we use (4.4) to obtain

\[ s^t_{N,k} = -W_k \bigl( W_k^T \nabla^2_{xx}\ell_k W_k \bigr)^{-1} \bigl( \hat g_k + W_k^T \nabla^2_{xx}\ell_k\, s^n_{N,k} \bigr). \]

Furthermore, the uniformly bounded invertibility of $B^T A$ and the fact that $A^T W = 0$ yield the uniformly bounded invertibility of $(B, W)$ and thus

\[ \|\nabla_x \ell_k\| = O\bigl( \|(B_k, W_k)^T \nabla_x \ell_k\| \bigr) = O\left( \left\| \begin{pmatrix} 0 \\ \hat g_k \end{pmatrix} \right\| \right) = O\bigl( \|\hat g_k\| \bigr). \]

Hence, using that $\hat g(x) = W(x)^T \nabla_x \ell(x, y)$ for all $y \in \mathbb{R}^m$, we obtain

\[ \|\hat g(x_k + s_{N,k})\| + \|c(x_k + s_{N,k})\| = O\bigl( \|\nabla_x \ell(x_k + s_{N,k}, y_k + z_{N,k})\| \bigr) + \|c(x_k + s_{N,k})\| = O\bigl( \|\nabla_x \ell_k\|^2 + \|c_k\|^2 \bigr) = O\bigl( \|\hat g_k\|^2 + \|c_k\|^2 \bigr). \tag{4.12} \]

Collecting the results obtained so far, we have

PROPOSITION 4.1. Let (A7) hold and assume that $\bar x \in \hat\Omega$ satisfies the second order sufficient condition (O2). Let $B(x) \in \mathbb{R}^{n \times m}$ be continuously differentiable in a neighborhood $U_N$ of $\bar x$ such that $A^T B$ is uniformly bounded invertible on $U_N$. Then for $x_k \in U_N$ sufficiently close to $\bar x$, $y_k = Y(x_k)$, $s^n_{N,k}$, and $s^t_{N,k}$ as given in (4.8), (4.9), and (4.4) are well defined and satisfy (4.10), (4.11). Furthermore, (4.6), (4.7), and (4.12) hold.

The following assumption states our requirements on the step computation that we need to prove fast local convergence.

Assumption:

(A8) $\bar x \in \hat\Omega$ satisfies the second order sufficient condition (O2), and $(x_k)$ converges to $\bar x$. Moreover, there exists a neighborhood $U_N$ of $\bar x$ such that for all $x_k \in U_N$ holds:

(i) The Lagrange multiplier estimates are computed by $y_k = Y(x_k)$ with $Y$ given by (4.8), where $B(x) \in \mathbb{R}^{n \times m}$ is continuously differentiable on $U_N$ and $A^T B$ is uniformly bounded invertible on $U_N$.

(ii) The step $s^n_k = s^n_{N,k}$ with $s^n_{N,k}$ as in (4.9) is chosen whenever $\|s^n_{N,k}\| \le \Delta_k$.

(iii) If the reduced Hessian $W_k^T \nabla^2_{xx}\ell_k W_k$ is positive definite, then $s^t_{N,k}$ is computed according to (4.4) and $s^t_k = s^t_{N,k}$ is chosen whenever $\|s^t_{N,k}\| \le \Delta_k$.

REMARK 4.2. A possible implementation of (iii) is obtained by applying Steihaug's conjugate gradient method to (2.4) in the reduced variables $d$, where $s^t = W_k d$ (or in its projected form [18]). If the reduced Hessian is positive definite then the CG-path either leaves the trust-region (in this case holds $\|s^t_{N,k}\| > \Delta_k$), or it stays in the trust-region and converges to $d^t_{N,k}$. If the reduced Hessian is not positive definite, the Steihaug method either detects negative curvature or stops since the path leaves the trust-region.

As is well known, one can allow inexactness without destroying the rate of convergence. Due to space limitations this issue is not discussed here.
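The three outcomes described in Remark 4.2 (convergence inside the trust region, leaving the trust region, negative curvature) can be seen in a generic textbook version of Steihaug's method. The sketch below is an assumption-laden illustration, not the code used for the experiments: it works on a dense model $\min\, g^T d + \tfrac12 d^T H d$ with the Euclidean constraint $\|d\| \le \Delta$ in the variable $d$ itself, uses an absolute residual tolerance, and all names are hypothetical.

```python
import numpy as np

def _to_boundary(d, p, delta):
    """Positive root tau of ||d + tau * p|| = delta (requires ||d|| < delta)."""
    a, b, c = p @ p, 2.0 * (d @ p), d @ d - delta ** 2
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def steihaug_cg(H, g, delta, tol=1e-10, max_iter=None):
    """Truncated CG for min g^T d + 0.5 d^T H d subject to ||d|| <= delta."""
    d = np.zeros(g.size)
    r = g.astype(float)      # residual = model gradient at d
    if np.linalg.norm(r) <= tol:
        return d, "converged"
    p = -r
    for _ in range(max_iter or 2 * g.size):
        Hp = H @ p
        curv = p @ Hp
        if curv <= 0.0:
            # Non-positive curvature: follow p to the trust-region boundary.
            return d + _to_boundary(d, p, delta) * p, "negative curvature"
        alpha = (r @ r) / curv
        d_next = d + alpha * p
        if np.linalg.norm(d_next) >= delta:
            # The CG path leaves the trust region: stop on the boundary.
            return d + _to_boundary(d, p, delta) * p, "boundary"
        r_next = r + alpha * Hp
        if np.linalg.norm(r_next) <= tol:
            return d_next, "converged"
        beta = (r_next @ r_next) / (r @ r)
        p = -r_next + beta * p
        d, r = d_next, r_next
    return d, "max_iter"

H = np.diag([1.0, 2.0, 3.0, 4.0])
g = np.ones(4)
for delta in (10.0, 0.5):
    d, status = steihaug_cg(H, g, delta)
    print(delta, status, np.linalg.norm(d))
```

With a positive definite matrix and a large radius the iteration returns the Newton direction $-H^{-1} g$; with a small radius it stops on the boundary, which mirrors the case distinction in the remark.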

4.2. Quadratic local convergence. The next result shows that with the rule (A8) for the step computation Algorithm A eventually takes Newton steps.

THEOREM 4.3. Let (A1)–(A8) hold. Then the trial steps according to (4.4), (4.9) are eventually taken by Algorithm A and thus $(x_k)$ converges q-quadratically to $\bar x$.
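To illustrate the kind of local behavior the theorem refers to, the following sketch runs the pure Newton iteration $x_{k+1} = x_k + s_{N,k}$ with the least-squares multiplier $y_k = Y(x_k)$ on a toy problem. Everything here is a hypothetical example and not taken from the report: the problem is $\min x_1 + x_2$ subject to $x_1^2 + x_2^2 - 2 = 0$ with solution $\bar x = (-1, -1)$, the Lagrangian is assumed to be $f + y^T c$, and no trust-region globalization is included.

```python
import numpy as np

def f_grad(x):
    """Gradient of the objective f(x) = x1 + x2."""
    return np.array([1.0, 1.0])

def c_val(x):
    """Single constraint c(x) = x1^2 + x2^2 - 2."""
    return np.array([x[0] ** 2 + x[1] ** 2 - 2.0])

def c_jac(x):
    """A(x) = grad c(x), stored as an n-by-m matrix."""
    return np.array([[2.0 * x[0]], [2.0 * x[1]]])

def newton_iteration(x, iters=6):
    x_bar = np.array([-1.0, -1.0])
    errors = [np.linalg.norm(x - x_bar)]
    for _ in range(iters):
        g, A, c = f_grad(x), c_jac(x), c_val(x)
        y = -np.linalg.solve(A.T @ A, A.T @ g)   # least-squares multiplier
        H = 2.0 * y[0] * np.eye(2)               # Hessian of f + y * c
        K = np.block([[H, A], [A.T, np.zeros((1, 1))]])
        sol = np.linalg.solve(K, -np.concatenate([g + A @ y, c]))
        x = x + sol[:2]                          # take the full Newton step
        errors.append(np.linalg.norm(x - x_bar))
    return x, errors

x_final, errors = newton_iteration(np.array([-1.2, -0.9]))
for k, e in enumerate(errors):
    print(k, e)
```

Printing the error sequence shows the number of correct digits roughly doubling per iteration until rounding error dominates, which is the q-quadratic behavior of (4.7).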

Theproofof this resultrequiressomework. Westartwith thefollowing auxiliary result.LEMMA 4.4. Let (A1)–(A8) holdandlet @ satisfy

2

3A @ A min BC/ 2

3B /D2. (4.13)

Thenthere is K=

0 such that thefollowing is true: If for someiterationkEGF K holds

aH jkI =�)-gkI ) or

)ckI ) H =�) -gkI ) / (4.14)


thenfor all k J kK AlgorithmA takesNewtonsteps,i.e. snkL a M sn

N L k andstkL a M st

N L k.Proof. We first note that by the assumptionson N and O the condition (4.13) can be

satisfied.Sincexk P Qx by (A8) andc R Qx S M 0, Tg R Qx S M 0, we find K U 0 with V ck VXW 1,V#Tgk VYW 1, V xk Z[Qx V]\ min ^`_ N a+bGc andxk d UN for all k J K , where b is asin Lemma3.8, (iii). In particular, (4.11) holds for all k J K . Hence,we can increaseK suchthatV sn

N L k V a V stN L k VeWgf min for all k J K . Sincethe stepssn

N L k andstN L k satisfy the decrease

conditions(2.3) and(2.5), respectively, part (iii) of Lemma3.8 yieldsby themechanismofupdatingf k thatfor k J K theNewtonstepsk M sN L k is acceptedwhenever

V snN L k V a V st

N L k V�W<h 1_ Kk a where (4.15)

_ Kk defMji 1 min max min ^ a jk a V#Tgk V cka V ck V c 2l 3 a max!V#Tgk V a V ck V c`m n (4.16)

In fact,iterationlevel k is enteredwith f k Jof min andin eachsubiterationf k is reducedbyat mostthefactor h 1.

Fromck P 0and(4.11)weobtain V snN L k V�W K1 V ck V�Wph 1i 1 V ck V m W<h 1_ Kk for all k J K

aftera possibleincreaseof K . Thus,by (A8) thequasi-normalstepsatisfiessnkL a M sn

N L k forall k J K .

Now weconsiderthestepstN L k for k J K . If aq jk U�V#Tgk V then,using V#Tgk V a V ck V�W 1,

max min ^ a jk a V#Tgk V cra V ck V c J max!V#Tgk V 1l q a V ck V c J max!V#Tgk V a V ck V c 1l q nSimilarly, if V ck VsqtU�V#Tgk V , we obtain

max min ^ a jk a V#Tgk V cra V ck V c JuV ck V�J max!V#Tgk V 1l q a V ck V c J max!VvTgk V a V ck V c 1l q nIn bothsituationsweconcludethat,sinceNw\ 2

3q , for k J K , K sufficiently large,holds

h 1_ Kk Jph 1i 1 max!V#Tgk V a V ck V c 23x U K13 R�V#Tgk VGyzV ck V{S�J|V st

N L k V awherewehaveused 2

3q \ 1 and(4.11).Therefore,wehaveproved:

If K is sufficiently largeandfor kKGJ K holds(4.14),thensk}~L a M sN L k} . (4.17)

In thecasejk �P � , k P � , thesequencea jk is boundedaway from zeroandwe seethat(4.14)holdsfor all kK J K if K is chosensufficiently large. Therefore,(4.17)completestheproof in this case.

Now considerthecasejk P � ask P � . Thenfor K so large that jK J 1, Lemma3.5yields

V ck V�W 1� ���0a jk

defM K14a jk for all k J K n (4.18)

In thecaseaq jk} U�V#Tgk} V wegetwith (4.18),usinga j P 0,

VvTgk} V�y�V ck} V�\ aq jk} y K14a jk} W 2aq jk} W2� q0 aq jk}~� 1

nfor kK J K , K largeenough.In thecaseV ck} Vsq�U�V#Tgk} V wehave

V#Tgk} VGyzV ck} V�\�V ck} V q y�V ck} V�W 2 V ck} V q W 2K q14aq jk} W2K q14� q0 aq jk}�� 1

n


From(4.12)weseethat,possiblyafterincreasingK , holds

�#�gk�~� 1

�����#�g � xk�v� sN � k��� �t� �G�0

max� 2 � 2K �14 � ��#�gk� � � �

ck� � �7� a� jk��� 1 �Therefore,(4.14)holdsfor k  � 1 insteadof k  andthusinductionyieldssk� a � sN � k for allk ¡ k  by (4.17).

Wenow proveTheorem4.3.Proof of Theorem4.3. We chooseK from Lemma4.4. AssumethatAlgorithm A does

not eventuallychooseNewton steps.ThenLemma4.4 implies that (4.14)mustbeviolatedfor all k G¡ K . Thus,wehave

a� jk � �v�gk�

and�ck� � � �#�gk

�for all k ¡ K � (4.19)

Exactlyasatthebeginningof theproofof Lemma4.4weobtainthatpossiblyafterincreasingK for all k ¡ K holds

�ck� � 1,

�v�gk� � 1 andthatthusby Lemma3.8andthemechanism

of updating¢ k astepsk�

snk � st

k is acceptedif

�snk� � � st

k� �<£ 1¤ 1 min max� min � a jk � �#�gk

� � � � ck� � 2¥ 3 � max� �#�gk

� � � ck� �§¦ def� £ 1  k �

Hereby, we chooseK large enoughsuchthat the right handsideis � ¢ min (by (A8) holdsck © 0,

�gk © 0). By (4.19)wehavetherefore

¢ k� a ¡ £ 1  k ¡ £ 1¤ 1 min � max� a jk � � ck� � 2¥ 3 � max� � ck

� � ¦ � �v�gk� ¦ ��� ¡ £ 1¤ 1

�ck� 2¥ 3 � (4.20)

In thelastinequalitywehaveusedthat ª¬« � 2­ 3 by (4.13).In particular, weobtainpossiblyafterincreasingK by (4.11)

�sn

N � k � � K1�ck��� £ 1¤ 1

�ck� 2¥ 3 � ¢ k� a for all k ¡ K �

andthereforeby (A8)

snk� a � sn

N � k for all k ¡ K � (4.21)

Weshow next thatthenaftera possibleincreaseof K holds

pred®k� a ¡ £ predtk� a ¡ £ max predck� a �v� predck� a � ¦ for all k ¡ K � (4.22)

In fact,asusedbefore,cf. (3.25),wehaveby (A3), (A5), (2.2),and(4.19)�WT

k ¯ qk � xk � snk � � ¡ �#�gk

�±°K7�snk� ¡ �#�gk

�²°K1K7

�ck� ¡ �#�gk

�³°K1K7

�v�gk� 1¥ � �

Thus,for k ¡ K , K largeenough,it holds�WT

k ¯ qk � xk � snk � � ¡ �#�gk

� ­ 2 andconsequentlyby (2.5)and(4.20)

predtk� a ¡ K3

4�#�gk�

min�#�gk� �v¢ k� a ¡ K3

4�ck� � min

�ck� � � £ 1¤ 1

�ck� 2¥ 3 ¡ K3

4�ck� 2�

for all k ¡ K . On theotherhand,weobtainasbefore,cf. (3.36),

predck� a � K10

�ck� 2 �

Sinceby (4.13)holds ª � « andsinceck © 0 wededuce

predtk� a ¡ K3

4�ck� 2� ¡ K ¦10

�ck� 2¦ ¡ max predck� a �v� predck� a � ¦ for all k ¡ K (4.23)


possiblyafterincreasingK . Moreover, wehave

´predµk¶ a · predtk¶ a ´¬¸�´ qk ¹ sn

k¶ a º ´¼»½´¿¾ x À�¹ xk Á yk º Tsnk¶ a ´4 1

2ÃHkÃ�Ã

snk¶ a à 2»

K15Ãsnk¶ a à 2 » K 2

1 K15Ãckà 2 Ä

Hereby, we have usedthat (4.10)holdsfor all k Å K by (4.21). We use(4.23)to concludethatfor K largeenoughholds

predµk¶ a ÅoÆ predtk¶ a for all k Å K ÄTogetherwith (4.23)this impliesthat(4.22)holdsfor all k Å K . Thus,we have by Step5.1of Algorithm A thatfor all iterationlevelsk Å K holds

raredµk¶ a Å]Ç 1predµk¶ a Å<Æ#Ç 1predtk¶ a Å<Æ#Ç 1K3

4ÃvÈgkÃ

minÃ#Ègkà ÁvÉ k¶ a Ä

From (4.19)we seethatÃvÈgkà ŠaÊ jk andthus jk Ë Ì . Therefore,(4.20)yields É k¶ a Å

Æ 1Í 1 min Î a2Ï 3jk Á aÊ+Ðjk Ñ . Since Ò¬ÓÕÔ 2Ö 3 Ô×Ò , we obtain for jk large enough(i.e., k large

enough)

predµk¶ a Å<Æ K3

4Ã#ÈgkÃaÊ jk Å<Æ K3

4a2Êjk Ä (4.24)

As in theproof of Lemma4.4,theestimate(4.18)is satisfiedandwe deducewith (A8)

´ À�¹ xkØ 1 Á ykØ 1º�· ÀÙ¹ xkØ 1 Á yk º ´Ú»wà ykØ 1 · ykÃ�Ã

ckØ 1Ã�»

L yÃsk¶ a à K14a jk Á

whereL y is thelocalLipschitzconstantof Y. Since(4.21)holds,(A8) ensuresthatÃstk¶ a Ã�»Ã

stN ¶ k à for all k Å K , andtherefore(4.11)implies

Ãsk¶ a ÃÛ» K1

ÃckÃ3Â

K13 ¹ Ã#ÈgkÃ3Â?Ã

ckà º .

UsingthatÃckÃ�»|Ã#È

gkÃ

by (4.19)wearriveat

´ ÀÙ¹ xkØ 1 Á ykØ 1º�· ÀÙ¹ xkØ 1 Á yk º ´¼» L y ¹ K1Â

2K13º K14Ã#ÈgkÃa jkÄ

Comparingwith (4.24)andusingthat jk ËÜÌ , we getaftera possibleincreaseof K

´ À�¹ xkØ 1 Á ykØ 1º�· À�¹ xkØ 1 Á yk º ´Ú» Ç 1

2predµk¶ a Ä

Therefore,weobtain

raredµk¶ a  À�¹ xkØ 1 Á yk º· À kØ 1 Å Ç 1

2predµk¶ a for all k Å K Ä

Now Lemma3.4yieldswith M µK ¸ maxK ÝßÞáàK â l ã K À l

À k»

M µK · Ç 1

2

ä mk

r å K

predµk¶ a for all k Å K Ä

But since2 ÒæÔèç by (4.13),a2Êj is notsummable,andhencealsopredµk¶ a is notsummableby(4.24). We concludethat À k Ë · Ì which is a contradictionto theassumptions.Theproofis complete.


5. Numerical results. In this section we report numerical results for a set of problems from the CUTE collection [1]. The results are obtained by a preliminary MATLAB implementation of Algorithm A. Hereby, we use the least-squares multiplier update, i.e., (4.8) with $B(x) = A(x)$, and consequently compute quasi-normal steps contained in $\mathrm{span}(A_k)$. For the computation of $W_k$ we use direct elimination based on MATLAB's LU-factorization routine. More precisely, we compute the factorization

\[ \begin{pmatrix} L_k \\ N_k \end{pmatrix} R_k = P_k A_k \]

with $N_k \in \mathbb{R}^{(n-m) \times m}$, lower triangular $L_k \in \mathbb{R}^{m \times m}$, upper triangular $R_k \in \mathbb{R}^{m \times m}$, and a permutation matrix $P_k$ and set

\[ W_k = P_k^T \begin{pmatrix} -L_k^{-T} N_k^T \\ I \end{pmatrix}. \]

For large problems this matrix is not formed explicitly, only its application to vectors is computed. For the computation of the quasi-normal step we use the Dogleg method [25] with a sparse Cholesky factorization for the computation of the Newton step (4.9). The tangential step is obtained by a Steihaug-CG method [27]. For problems with moderate size we form the reduced Hessian explicitly and use the incomplete Cholesky factorization ICFS of Lin and Moré [21] as preconditioner. For large problems, we currently use a matrix-free Steihaug-CG method without preconditioning. If the CG-path does not leave the trust-region and does not find negative curvature, it is stopped if the current residual $r_k$ is reduced to $r_{k+1} \le \max\{10^{-20}, \min\{0.05, r_k\}\, r_k\}$. Therefore, our step computation is similar to that in [20], but the implementation described in [20], especially the computation of $W_k$, is more elaborate than our straightforward test code. We notice that our step computation satisfies an inexact version of our step requirement for the local convergence analysis in section 4, cf. Remark 4.2.
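The direct-elimination construction of $W_k$ from an LU factorization of $A_k$ can be reproduced on a small dense example. The sketch below is not the MATLAB code used for the experiments: it is a NumPy illustration with a hand-written partial-pivoting LU for tall matrices, hypothetical function names, and dense storage (the text notes that for large problems $W_k$ is only applied to vectors). It forms $W = P^T \begin{pmatrix} -L^{-T} N^T \\ I \end{pmatrix}$ and checks that $A^T W = 0$.

```python
import numpy as np

def lu_rect(A):
    """Partial-pivoting LU of a tall n-by-m matrix: A[perm] = Lfull @ R."""
    n, m = A.shape
    U = np.array(A, dtype=float)
    Lfull = np.zeros((n, m))
    perm = np.arange(n)
    for j in range(m):
        p = j + int(np.argmax(np.abs(U[j:, j])))
        if U[p, j] == 0.0:
            raise ValueError("A does not have full column rank")
        for M in (U, Lfull):           # swap rows j and p in both factors
            M[[j, p]] = M[[p, j]]
        perm[[j, p]] = perm[[p, j]]
        Lfull[j, j] = 1.0
        Lfull[j + 1:, j] = U[j + 1:, j] / U[j, j]
        U[j + 1:] -= np.outer(Lfull[j + 1:, j], U[j])
    return perm, Lfull, np.triu(U[:m])

def null_space_basis(A):
    """W = P^T [-L^{-T} N^T; I], whose columns span null(A^T)."""
    n, m = A.shape
    perm, Lfull, _ = lu_rect(A)
    L, N = Lfull[:m], Lfull[m:]        # L is m-by-m, N is (n-m)-by-m
    W_perm = np.vstack([-np.linalg.solve(L.T, N.T), np.eye(n - m)])
    W = np.empty_like(W_perm)
    W[perm] = W_perm                   # undo the row permutation (apply P^T)
    return W

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 2))
W = null_space_basis(A)
print(W.shape, np.abs(A.T @ W).max())
```

The identity block makes the columns of $W$ linearly independent by construction, and only triangular solves with $L^T$ are needed, which is why the matrix-free application mentioned in the text is inexpensive.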

For the numerical test we have used the following parameter settings in Algorithm A:

ö c ë öÙ÷ ë 5 ôeø ë 0 ð 001ôeù 1 ë 0 ð 01ôeù 2 ë 0 ð 9 ôûú 1 ë 0 ð 5 ôûú 2 ë 2 ôü ë½ýþë 0 ð 1 ô ÿ ë 3

�4 ôûú ë 0 ð 01ô�� min ë 10í 5 ô��

0 ë 10ðFor thesequenceé a j ê weused

a0 ë min 0 ð 1maxé 1 ô�� ck� ê ô����gk

��� ck� ô a j ë a0�

j 1ô j 1 ð

The weights ø c� ÷kr in raredc� ÷k arecomputedasmentionedin section2.1. Hence,we select

0 ò r ��� ö ck with � ck í r � � ë max0 � r ��� c

k� ck í r

� andset ø ckr � ë 1 ï é ö c

kï 1ê ø , and ø c

kr ë øfor r �ë r � . Thesamestrategy is usedto chooseø ÷kr . Westopif maxó � ck

���æô����gk� � õ3ò 10í 5.

Table 5.1 shows the number of $f, c$ evaluations and the number of $g, A$ evaluations for a set of problems from the CUTE collection using exact Hessians. The results show that the new penalty-function-free algorithm is robust and efficient. Although our MATLAB implementation is very straightforward without using advanced techniques for, e.g., the trust-radius update and the computation of $W_k$, the number of iterations for most problems is very satisfactory. As expected from our local convergence theory, we observed transition to fast local convergence for all problems. We believe that further improvements can be achieved by a more sophisticated implementation. The results indicate that our new class of nonmonotone trust-region methods without penalty function is an interesting and competitive alternative to algorithms with penalty function and deserves further consideration. Since only minor changes are required to incorporate our framework in existing penalty-function-based Byrd-Omojokun type algorithms, it might be interesting to include our new approach as an option in existing high-quality implementations.

TABLE 5.1
Results for CUTE problems using exact Hessians

Problem    f,c  g,A      n     m  |  Problem    f,c  g,A     n     m
ARTIF        9    9   1000  1000  |  HS8          8    6     2     2
AUG2D       10   10   3280  1600  |  HS9          3    2     2     1
AUG2DC       8    8   3280  1600  |  HS26        74   43     3     1
AUG3D        7    7   3873  1000  |  HS27        20   14     3     1
AUG3DC       8    8   3873  1000  |  HS39        32   21     4     2
BRATU3D      5    5   3375  3375  |  HS42         4    4     4     2
BYRDSPHR    10    7      3     2  |  HS46        22   17     5     2
DTOC1NA     10   10   2998  1996  |  HS47        42   26     5     3
DTOC1NB     10    8   2998  1996  |  HS48         2    2     5     2
DTOC1NC     16   11   2998  1996  |  HS49        14   14     5     2
DTOC1ND     16   10   2998  1996  |  HS50         8    8     5     3
DTOC2       10    7   2998  1996  |  HS51         2    2     5     3
DTOC3        8    8   2998  1996  |  HS52         2    2     5     3
DTOC4        5    5   2999  1998  |  HS56        17   10     7     4
DTOC5        6    6   9999  4999  |  HS61         8    6     3     2
DTOC6       13   13   2001  1000  |  HS77        12   12     5     2
DTOC6       13   13   2001  1000  |  HS78         5    5     5     3
EIGENA2      3    3    110    55  |  HS79         5    5     5     3
EIGENB2    101   61    110    55  |  HS100LNP    10    7     7     2
EIGENBCO    90   49    110    55  |  HS111LNP    16   12    10     3
EIGENC2     58   35    462   231  |  LCH         33   21   300     1
EIGENCCO    62   34    110    55  |  MARATOS      4    4     2     1
GENHS28      3    3    300   298  |  MWRIGHT     11    8     5     3
GRIDNETB     7    7     60    36  |  ORTHREGA    38   27   517   256
GRIDNETE     7    7     60    36  |  ORTHREGB     2    2    27     6
HAGER1       5    5  10001  5000  |  ORTHREGC    18   14  1005   500
HAGER2       5    5  10000  5000  |  ORTHREGD    20   16  5003  2500
HAGER3       5    5  10000  5000  |  OPTCTRL3     9    9   202    80
HS6         12    8      2     1  |  OPTCTRL3    20   20  2002   800
HS7         12    9      2     1  |  OPTCTRL6     9    9   202    80

6. Conclusions. The Byrd–Omojokun class of trust-region algorithms is known as one of the most efficient optimization methods for equality constrained NLP. On the other hand, the recent idea of filter methods and, more generally, merit-function-free algorithms holds great promise for the development of powerful new algorithms. In this work we combined both of these concepts in a novel way. The result is a globally convergent algorithm that converges locally quadratically to regular solutions. Similar to filter methods, the algorithm allows for nonmonotonicity of both constraint violation and (Lagrangian) function values. However, instead of using a filter, we impose nonmonotone decrease conditions that control feasibility and optimality, respectively, in a loosely coupled way. From a computational point of view, only minor modifications are necessary to incorporate the proposed concepts into an existing implementation of the Byrd–Omojokun algorithm. Our preliminary numerical results indicate that the presented approach is viable and yields an efficient and robust new class of trust-region algorithms.

REFERENCES

[1] I . BONGARTZ, A . R. CONN, N. I . M. GOULD, AND P. L. TOINT, CUTE: ConstrainedandUnconstrainedTestingEnvironment, ACM Trans.Math.Software,21 (1995),pp.123–160.

[2] R. H. BYRD, Robust trust region methodsfor constrainedoptimization, Third SIAM Conferenceon Opti-mization,Houston,TX, May 1987.

[3] R. H. BYRD, J. C. GILBERT, AND J. NOCEDAL, A trust region methodbasedon interior point techniquesfor nonlinearprogramming, Math.Programming,89 (2000),pp.149–185.

[4] R. H. BYRD, M. E. HRIBAR, AND J. NOCEDAL, An interior point algorithm for large scalenonlinearprogramming, SIAM J.Optim.,9 (2000),pp.877–900.

[5] R. H. BYRD, R. B. SCHNABEL , AND G. A. SHULTZ, A trust region algorithmfor nonlinearlyconstrainedoptimization, SIAM J.Numer. Anal., 24 (1987),pp.1152–1170.

[6] M. R. CELIS, J. E. DENNIS, AND R. A. TAPIA, A trustregionstrategyfor nonlinearequalityconstrainedop-timization, in NumericalOptimization1984,P. T. Boggs,R. H. Byrd, andR. B. Schnabel,eds.,Philadel-phia,USA, 1985,SIAM.

[7] T. F. COLEMAN AND Y. L I, An interior trust region approach for nonlinearminimizationsubjectto bounds,SIAM J.Optim.,6 (1996),pp.418–445.

[8] A. R. CONN, N. I. M. GOULD, AND P. L. TOINT, Global convergence of a class of trust region algorithms for optimization with simple bounds, SIAM J. Numer. Anal., 25 (1988), pp. 433–460.

[9] J. E. DENNIS, M. EL-ALEM, AND M. C. MACIEL, A global convergence theory for general trust-region-based algorithms for equality constrained optimization, SIAM J. Optim., 7 (1997), pp. 177–207.

[10] J. E. DENNIS, M. HEINKENSCHLOSS, AND L. N. VICENTE, Trust-region interior-point SQP algorithms for a class of nonlinear programming problems, SIAM J. Control Optim., 36 (1998), pp. 1750–1794.

[11] J. E. DENNIS AND L. N. VICENTE, Trust region interior-point algorithms for minimization problems with simple bounds, in Applied Mathematics and Parallel Computing, Festschrift for Klaus Ritter, H. Fischer, B. Riedmüller, and S. Schäffler, eds., Heidelberg, 1996, Physica-Verlag, pp. 97–107.

[12] J. E. DENNIS AND L. N. VICENTE, On the convergence theory of trust-region-based algorithms for equality-constrained optimization, SIAM J. Optim., 7 (1997), pp. 927–950.

[13] M. EL-ALEM, A global convergence theory for a class of trust region algorithms for constrained optimization, Tech. Rep. TR88-5, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA, 1988.

[14] M. EL-ALEM, A global convergence theory for the Dennis-Celis-Tapia trust-region algorithm for constrained optimization, SIAM J. Numer. Anal., 28 (1991), pp. 266–290.

[15] M. EL-ALEM, A global convergence theory for a general class of trust-region-based algorithms for constrained optimization without assuming regularity, SIAM J. Optim., 9 (1999), pp. 965–990.

[16] R. FLETCHER, N. I. M. GOULD, S. LEYFFER, AND P. L. TOINT, Global convergence of trust-region SQP-filter algorithms for nonlinear programming, Tech. Rep. 99/03, Department of Mathematics, University of Namur, Namur, Belgium, 1999.

[17] R. FLETCHER AND S. LEYFFER, Nonlinear programming without a penalty function, Numerical Analysis Report NA/171, Department of Mathematics, University of Dundee, Dundee, Scotland, 1997.

[18] N. I. M. GOULD, M. E. HRIBAR, AND J. NOCEDAL, On the solution of equality constrained quadratic programming problems arising in optimization, Tech. Rep. RAL-TR-1998-069, Rutherford Appleton Laboratory, Chilton, Oxfordshire, England, 1998.

[19] D. KLEIS AND E. W. SACHS, Convergence rate of the augmented Lagrangian SQP method, J. Optim. Theory Appl., 95 (1997), pp. 49–74.

[20] M. LALEE, J. NOCEDAL, AND T. D. PLANTENGA, On the implementation of an algorithm for large-scale equality constrained optimization, SIAM J. Optim., 8 (1998), pp. 682–706.

[21] C.-J. LIN AND J. J. MORÉ, Incomplete Cholesky factorizations with limited memory, SIAM J. Sci. Comput., 21 (1999), pp. 24–45.

[22] C.-J. LIN AND J. J. MORÉ, Newton's method for large bound-constrained optimization problems, SIAM J. Optim., 9 (1999), pp. 1100–1127. Dedicated to John E. Dennis, Jr., on his 60th birthday.

[23] E. O. OMOJOKUN, Trust region algorithms for optimization with nonlinear equality and inequality constraints, PhD thesis, University of Colorado, Boulder, Colorado, USA, 1989.

[24] T. D. PLANTENGA, A trust-region method for nonlinear programming based on primal interior point techniques, SIAM J. Sci. Comput., 20 (1999), pp. 282–305.

[25] M. J. D. POWELL, A hybrid method for nonlinear equations, in Numerical methods for nonlinear algebraic equations (Proc. Conf., Univ. Essex, Colchester, 1969), Gordon and Breach, London, 1970, pp. 87–114.

[26] M. J. D. POWELL AND Y. YUAN, A trust region algorithm for equality constrained optimization, Math. Programming, 49 (1990), pp. 189–213.

[27] T. STEIHAUG, The conjugate gradient method and trust regions in large scale optimization, SIAM J. Numer. Anal., 20 (1983), pp. 626–637.

[28] P. L. TOINT, A non-monotone trust-region algorithm for nonlinear optimization subject to convex constraints, Math. Programming, 77 (1997), pp. 69–94.

[29] M. ULBRICH, Non-monotone trust-region methods for bound-constrained semismooth equations with applications to nonlinear mixed complementarity problems, SIAM J. Control Optim., (2000, in press).

[30] M. ULBRICH, S. ULBRICH, AND M. HEINKENSCHLOSS, Global convergence of trust-region interior-point algorithms for infinite-dimensional nonconvex minimization subject to pointwise bounds, SIAM J. Control Optim., 37 (1999), pp. 731–764.

[31] M. ULBRICH, S. ULBRICH, AND L. N. VICENTE, A globally convergent primal-dual interior point filter method for nonconvex nonlinear programming, Tech. Rep. TR00-12, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA, 2000.

[32] A. VARDI, A trust region algorithm for equality constrained minimization: convergence properties and implementation, SIAM J. Numer. Anal., 22 (1985), pp. 575–591.

[33] L. N. VICENTE, Trust-region interior-point algorithms for a class of nonlinear programming problems, PhD thesis, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA, 1995. Report TR96-05.

[34] C. A. ZOPPKE-DONALDSON, A tolerance-tube approach to sequential quadratic programming with applications, PhD thesis, University of Dundee, Dundee, Scotland, UK, 1995.

