Nonmonotone Trust Region Methods for Nonlinear Equality Constrained Optimization without a Penalty Function
Michael Ulbrich and Stefan Ulbrich
Zentrum Mathematik, Technische Universität München
München, Germany
Technical Report, December 2000.
Abstract. We propose and analyze a class of penalty-function-free nonmonotone trust-region methods for nonlinear equality constrained optimization problems. The algorithmic framework yields global convergence without using a merit function and allows nonmonotonicity independently for both the constraint violation and the value of the Lagrangian function. Similar to the Byrd–Omojokun class of algorithms, each step is composed of a quasi-normal and a tangential step. Both steps are required to satisfy a decrease condition for their respective trust-region subproblems. The proposed mechanism for accepting steps combines nonmonotone decrease conditions on the constraint violation and/or the Lagrangian function, which leads to a flexibility and acceptance behavior comparable to filter-based methods. We establish the global convergence of the method. Furthermore, transition to quadratic local convergence is proved. Numerical tests are presented that confirm the robustness and efficiency of the approach.
Key words. nonmonotone trust-region methods, sequential quadratic programming, penalty function, global convergence, equality constraints, local convergence, large-scale optimization
AMS subject classifications. 65K05, 90C30
1. Introduction. We consider the nonlinear equality constrained optimization problem
$$\min f(x) \quad \text{subject to} \quad c(x) = 0 \tag{1.1}$$
with continuously differentiable functions $f : \mathbb{R}^n \to \mathbb{R}$ and $c : \mathbb{R}^n \to \mathbb{R}^m$. For the solution of (1.1) we propose a method that is inspired by the class of trust-region algorithms introduced by Byrd [2], Omojokun [23], and Dennis, El-Alem, and Maciel [9], but with the important difference that our algorithm does not use a penalty or augmented Lagrange function to test the acceptability of steps. Hereby, we are motivated by the impressive efficiency of sequential quadratic programming (SQP) filter methods, which were recently introduced by Fletcher and Leyffer [17]. The algorithm that we investigate here does not use the concept of a filter. Rather, it applies nonmonotone trust-region techniques independently to the quasi-normal subproblem and the tangential subproblem. This strategy admits a flexibility in accepting steps that is comparable with filter methods. Besides global convergence, our approach has two favorable properties that appear to be new for algorithms without a penalty function:
(a) The method does not require a restoration procedure.
(b) We prove that the algorithm converges locally q-quadratically, even without an additional second order correction that is needed by many algorithms to avoid the Maratos effect.
For SQP filter methods global convergence has been established in Fletcher, Gould, Leyffer, and Toint [16], whereas a local convergence theory is not yet available. Recently, a globally convergent primal-dual interior-point filter method was introduced by Ulbrich, Ulbrich, and Vicente [31]. Except for the method presented in this paper, filter methods and their predecessor, the tolerance-tube approach by Zoppke-Donaldson [34], are the only algorithms for NLP we are aware of that do not require a penalty function.
We will use [2], [23], and [9] as our main references on trust-region methods for equality constrained nonlinear programming. However, there are several related approaches and recent extensions that should be mentioned. Regarding related work, we refer to Byrd, Schnabel, and Shultz [5], Celis, Dennis, and Tapia [6], El Alem [13, 14], Powell and Yuan [26], and Vardi [32]. Recent contributions to the analysis of trust-region methods for equality constrained problems include Dennis and Vicente [12], El Alem [15], Lalee, Nocedal, and Plantenga [20]. Several extensions to problems involving inequality constraints have been proposed. Here we mention only those methods that extend the ideas of Byrd [2], Omojokun [23] and Dennis, El Alem, and Maciel [9]. Some of these algorithms are based on trust-region methods for box-constrained problems, see, e.g., Coleman and Li [7], Conn, Gould, and Toint [8], Dennis and Vicente [11], Lin and Moré [22], Ulbrich, Ulbrich and Heinkenschloss [30], and combine them with the above approaches to handle additional equality constraints. Algorithms of this type were investigated by Dennis, Heinkenschloss, Vicente [10], Plantenga [24], and Vicente [33]. Byrd, Gilbert, and Nocedal [3] and Byrd, Hribar, and Nocedal [4] take a different approach by solving a sequence of equality constrained barrier problems. The above references underline the important role of methods for nonlinear equality constrained optimization problems, both as standalone methods and as solvers for subproblems.
* Lehrstuhl für Angewandte Mathematik und Mathematische Statistik, Zentrum Mathematik, Technische Universität München, D-80290 München, Germany.
† Lehrstuhl für Angewandte Mathematik und Mathematische Statistik, Zentrum Mathematik, Technische Universität München, D-80290 München, Germany.
This paper is organized as follows. In section 2 the algorithm is developed. We introduce the quasi-normal and the tangential trust-region subproblem and describe the model decrease conditions for the respective trial steps. The nonmonotone decrease conditions for constraint violation and Lagrangian function, respectively, which are the key ingredients of the new algorithm, are developed in sections 2.1 and 2.2. The full algorithm is formulated in section 2.3. In section 3 the global convergence of the algorithm is established. We first state the main result in section 3.1. The global convergence analysis starts in section 3.2 with the proof of well definedness. Section 3.3 is devoted to the development of nonmonotone decrease results. Convergence to feasible points is proved in section 3.4, convergence to stationary points in section 3.5. In section 4 we show that with a Newton-type step computation the algorithm converges locally quadratically. Numerical results for problems from the CUTE collection [1] are presented in section 5.
Notations. Throughout the paper, $\|\cdot\|$ denotes the Euclidean norm $\|\cdot\|_2$. The gradient of $f$ is denoted by $\nabla f$, and $\nabla c$ denotes the transposed Jacobian of $c$. We use the abbreviations $g = \nabla f$ and $A = \nabla c$.
2. Development of the algorithm. We denote the gradient of the objective function $f$ by $g$ and write $A$ for the transposed Jacobian of $c$:
$$g(x) = \nabla f(x) \in \mathbb{R}^n, \qquad A(x) = \nabla c(x) \in \mathbb{R}^{n \times m}.$$
Following Byrd [2], Omojokun [23] and Dennis, El Alem, and Maciel [9], we obtain the trial step $s_k = s_k^t + s_k^n$ at the current iterate $x_k$ by computing a quasi-normal step $s_k^n$ and a tangential step $s_k^t$. The purpose of the quasi-normal step $s_k^n$ is to improve feasibility. It is obtained as approximate solution of the trust-region subproblem
$$\min \|c(x_k) + A(x_k)^T s^n\|^2 \quad \text{subject to} \quad \|s^n\| \le \Delta_k, \tag{2.1}$$
where $\Delta_k > 0$ denotes the trust-region radius. Our requirements on the steps $s_k^n$ are that there exist constants $K_1, K_2 > 0$, independent of $k$, such that $s_k^n$ admits the upper bound
$$\|s_k^n\| \le \min\{K_1 \|c_k\|, \Delta_k\} \tag{2.2}$$
and satisfies the decrease condition
$$\|c_k\|^2 - \|c_k + A_k^T s_k^n\|^2 \ge K_2 \|c_k\| \min\{\|c_k\|, \Delta_k\}. \tag{2.3}$$
As, e.g., in [9], we will assume that the matrices $A(x_k)^T A(x_k)$ are nonsingular with uniformly bounded inverses for all $k$. Then it is well known that the Cauchy point, which is the solution of (2.1) along the direction of steepest descent at $s^n = 0$, satisfies the conditions (2.2) and (2.3) for appropriate constants $K_1$ and $K_2$. The assumptions stated below ensure the existence of constants $K_1$ and $K_2$ that are suitable for all iterations $k$. Therefore, the conditions (2.2) and (2.3) can be implemented by a fraction of Cauchy decrease condition.
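The Cauchy point for the quasi-normal subproblem (2.1) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `cauchy_quasi_normal_step` and the use of dense NumPy arrays are choices made here, not part of the report, and a practical code would typically compute a better step than the Cauchy point.

```python
import numpy as np


def cauchy_quasi_normal_step(c, A, delta):
    """Cauchy point of min ||c + A^T s||^2 subject to ||s|| <= delta.

    c     : constraint values c(x_k), shape (m,)
    A     : transposed Jacobian A(x_k), shape (n, m)
    delta : trust-region radius, assumed positive
    """
    # The gradient of ||c + A^T s||^2 at s = 0 is 2 A c, so -A c is a
    # steepest descent direction (the factor 2 does not matter here).
    d = -(A @ c)
    d_norm = np.linalg.norm(d)
    if d_norm == 0.0:
        return np.zeros(A.shape[0])  # s = 0 is already stationary
    # Exact minimizer of t -> ||c + t A^T d||^2 along d.
    curvature = np.linalg.norm(A.T @ d) ** 2
    t_free = d_norm ** 2 / curvature if curvature > 0.0 else np.inf
    # Largest step length that keeps the step inside the trust region.
    t_boundary = delta / d_norm
    return min(t_free, t_boundary) * d
```

With a small radius the step is cut off at the trust-region boundary; with a large radius it is the exact minimizer along the steepest descent direction.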
To improve optimality we seek $s_k^t$ in the tangent space of the linearized constraints in such a way that it provides sufficient decrease for a quadratic model of the Lagrange function
$$\ell(x, y) = f(x) + y^T c(x), \qquad y \in \mathbb{R}^m,$$
under a trust-region constraint. To this end, we define a quadratic model
$$q_k(s) = (g(x_k) + A(x_k) y_k)^T s + \tfrac{1}{2} s^T H_k s$$
about the current point $(x_k, y_k)$ that approximates $\ell(x_k + s, y_k) - \ell(x_k, y_k)$. Here, $H_k$ is a symmetric approximation of $\nabla_x^2 \ell(x_k, y_k)$.
Based on this model, the tangential step $s_k^t$ is computed as approximate solution of the trust-region subproblem
$$\min q_k(s_k^n + s^t) \quad \text{subject to} \quad A(x_k)^T s^t = 0, \quad \|s^t\| \le \Delta_k \tag{2.4}$$
satisfying the decrease condition
$$q_k(s_k^n) - q_k(s_k^n + s_k^t) \ge K_3 \|W_k^T \nabla q_k(s_k^n)\| \min\{\|W_k^T \nabla q_k(s_k^n)\|, \Delta_k\}, \tag{2.5}$$
with a constant $K_3 > 0$ independent of $k$, and the feasibility condition
$$A(x_k)^T s_k^t = 0, \qquad \|s_k^t\| \le \Delta_k. \tag{2.6}$$
Hereby, $W_k = W(x_k)$, where $W(x)$ denotes a matrix whose columns form a basis of the null space of $A(x)^T$. Note that $W_k^T \nabla q_k(s^t)$ is the reduced gradient of $q_k$ in terms of the representation $s^t = W_k d$ of the tangential step:
$$\nabla_d\, q_k(W_k d) = W_k^T \nabla q_k(W_k d) = W_k^T \nabla q_k(s^t).$$
Therefore, (2.5) can be realized by a fraction of Cauchy decrease condition for the reduced function $d \mapsto q_k(s_k^n + W_k d)$ subject to the constraint $\|W_k d\| \le \Delta_k$.
To simplify notation we will use the abbreviations $f_k = f(x_k)$, $c_k = c(x_k)$, $\ell_k = \ell(x_k, y_k)$, etc. Moreover, it will be convenient to introduce the reduced gradient
$$\hat g(x) = W(x)^T g(x).$$
Then the first order necessary optimality conditions (Karush–Kuhn–Tucker or KKT conditions) at a local solution $\bar x \in \mathbb{R}^n$ of (1.1) can be written as
$$c(\bar x) = 0, \qquad \hat g(\bar x) = 0.$$
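As an illustration of the reduced gradient and the KKT conditions written in terms of $W$, the following sketch builds an orthonormal null-space basis from a full QR factorization. The helper names `null_space_basis`, `reduced_gradient`, and `kkt_residual` are hypothetical, and the QR-based construction is only one possible choice of $W$; the report does not prescribe how $W$ is computed. The sketch assumes that `A` has full column rank.

```python
import numpy as np


def null_space_basis(A):
    """Orthonormal basis W of the null space of A^T.

    A has shape (n, m) and is assumed to have full column rank m,
    so the last n - m columns of the full Q factor span null(A^T).
    """
    n, m = A.shape
    Q, _ = np.linalg.qr(A, mode="complete")
    return Q[:, m:]


def reduced_gradient(g, W):
    """Reduced gradient W^T g."""
    return W.T @ g


def kkt_residual(g, c, W):
    """||c|| + ||W^T g||, which vanishes exactly at KKT points."""
    return np.linalg.norm(c) + np.linalg.norm(reduced_gradient(g, W))
```

The quantity returned by `kkt_residual` is zero precisely when both $c = 0$ and $W^T g = 0$ hold, which is the form of the optimality conditions used above.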
The algorithm is based on a combination of nonmonotone decrease criteria for the quasi-normal and tangential steps. Nonmonotone trust-region methods were investigated by Toint [28] and Ulbrich [29]. We follow [29] and compare the predicted decrease promised by the trust-region model with a relaxation of the actual decrease to decide whether a step is acceptable or not. Before we give a precise description of the algorithm, we introduce our assumptions for the global convergence analysis.
Assumptions: There exist an open convex set $\Omega \subset \mathbb{R}^n$ and a closed convex set $\hat\Omega \subset \Omega$ with $\operatorname{dist}(\hat\Omega, \mathbb{R}^n \setminus \Omega) > 0$ such that:
(A1) The functions $f : \Omega \to \mathbb{R}$ and $c : \Omega \to \mathbb{R}^m$ are continuously differentiable.
(A2) The matrix $A(x) = \nabla c(x)$ has full rank for all $x \in \Omega$.
(A3) The functions $f$, $g = \nabla f$, $c$, $A = \nabla c$, $(A^T A)^{-1}$, $W$, and $(W^T W)^{-1}$ are uniformly bounded on $\Omega$. Hereby, $W(x)$ denotes a matrix whose columns form a basis for the null space of $A(x)^T$.
(A4) For all $k$, $x_k$ is in $\hat\Omega$, and $x_k + s_k^n$ as well as $x_k + s_k$ are in $\Omega$.
(A5) The matrices $H_k$ and the multiplier estimates $y_k$ are uniformly bounded for all $k$.
(A6) The derivatives $g = \nabla f$ and $A = \nabla c$ are Lipschitz continuous on $\Omega$.
In section 4 we will moreover require the following assumption in order to show transition to locally quadratic convergence in a neighborhood of a stationary point $\bar x$ satisfying second order sufficient conditions.
(A7) The functions $f : \Omega \to \mathbb{R}$ and $c : \Omega \to \mathbb{R}^m$ are twice continuously differentiable. Furthermore, there exists a neighborhood $\Omega_{\bar x} \subset \Omega$ of $\bar x$ on which $\nabla^2 f$ and $\nabla^2 c$ are Lipschitz continuous and $H_k = \nabla_x^2 \ell(x_k, y_k)$ for all $x_k \in \Omega_{\bar x}$.
2.1. A nonmonotone decrease condition for the constraint violation. The decrease condition (2.3) for the quasi-normal step guarantees that the predicted reduction for the feasibility violation $\|c\|^2$,
$$\mathrm{pred}_k^c := \|c_k\|^2 - \|c_k + A_k^T s_k\|^2,$$
admits the estimate
$$\mathrm{pred}_k^c \ge K_2 \|c_k\| \min\{\|c_k\|, \Delta_k\}. \tag{2.7}$$
Note hereby that $A_k^T s_k = A_k^T s_k^n$. Clearly, the requirement that the actual reduction
$$\mathrm{ared}_k^c := \|c_k\|^2 - \|c(x_k + s_k)\|^2$$
should be a fraction of the predicted reduction is too restrictive, since it could impose severe restrictions on the tangential step if the feasibility is "too good" in comparison to the norm of the reduced gradient. In order to relax the feasibility requirement in this case and to allow nonmonotonicity we accept the step for the constraints if
$$\mathrm{rared}_k^c \ge \rho_1\, \mathrm{pred}_k^c, \qquad \rho_1 \in (0, 1) \text{ fixed},$$
with the relaxed actual reduction
$$\mathrm{rared}_k^c := \max\Big\{R_k, \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c(x_k + s_k)\|^2.$$
Hereby, we require that with fixed parameters $\nu^c \in \mathbb{N}$ and $\lambda \in (0, 1/\nu^c)$ holds
$$\nu_k^c = \min\{k + 1, \nu^c\}, \qquad \lambda_{kr}^c \ge \lambda, \qquad \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c = 1, \qquad R_k \ge \|c_k\|^2, \quad \text{usually } R_k = \|c_k\|^2.$$
Before we discuss the choice of $R_k$, we notice that a maximum of nonmonotonicity is achieved by selecting an index $r'$, $0 \le r' \le \nu_k^c - 1$, such that $\|c_{k-r'}\| = \max_{0 \le r < \nu_k^c} \|c_{k-r}\|$ and setting $\lambda_{kr'}^c = 1 - (\nu_k^c - 1)\lambda$, and $\lambda_{kr}^c = \lambda$ for $r \ne r'$.
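The relaxed actual reduction and the weight choice with maximal nonmonotonicity can be sketched as follows. The function names and the convention that `past_sq_norms[r]` stores $\|c_{k-r}\|^2$ are assumptions made for this illustration only.

```python
import numpy as np


def max_nonmonotone_weights(past_sq_norms, lam):
    """Weights that put the largest admissible mass on the worst iterate.

    past_sq_norms[r] holds ||c_{k-r}||^2 for r = 0, ..., nu_k - 1.
    lam is the lower bound on every weight, assumed to lie in (0, 1/nu_k].
    """
    nu_k = len(past_sq_norms)
    weights = np.full(nu_k, lam, dtype=float)
    worst = int(np.argmax(past_sq_norms))
    weights[worst] = 1.0 - (nu_k - 1) * lam  # weights sum to one
    return weights


def relaxed_actual_reduction(R_k, past_sq_norms, weights, new_sq_norm):
    """rared = max{R_k, sum_r weights[r] * ||c_{k-r}||^2} - ||c(x_k + s_k)||^2."""
    past = np.asarray(past_sq_norms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must sum to one")
    reference = max(R_k, float(weights @ past))
    return reference - new_sq_norm
```

For example, with past values $\|c_k\|^2 = 1$, $\|c_{k-1}\|^2 = 4$ and a trial point with $\|c(x_k + s_k)\|^2 = 2$, the monotone actual reduction is negative, while the relaxed one with the weights above is positive, so a step that temporarily increases the constraint violation can still pass the test.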
The choiceof Rk is an importantissuein the designof the method. It is donein sucha way that the feasibility requirementis relaxed if the feasibility is much better than thestationarity, i.e., if q ck q@~ q��gk q . If this situationis detectedtheninsteadof Rk t�q ck q 2 alarger valueis chosen.In orderto keepa minimumof control over the constraintviolationwe chooseRk not larger thansomeupperbounda jk . Hereby, z a j | is a slowly decreasingsequencetendingto zeroand j is only increased,i.e., jk� 1 t jk � 1, if Rk yieldsthemaximumin thefirst termof raredck. Thus,let z a j | beasequencewith
a j � 0 � 0 ��� 0 m a j � 1
a j� 1 � lim
j �L� a j t 0� and�j � 0
a� j t���� (2.8)
where � � 4� 3 is afixedconstant.Thefollowing algorithmdescribeshow Rk is updated:
Algorithm R: (Update of $R_k$)
Let $0 < \alpha, \beta < 1/2$.
If $\|c_k\| < \min\{\alpha\, a_{j_k}, \beta \|\hat g_k\|\}$ then
    Set $R_k := \min\{a_{j_k}^2, \|\hat g_k\|^2\}$.
    If $R_k \ge \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2$ then set $j_{k+1} := j_k + 1$,
    else set $j_{k+1} := j_k$.
Otherwise, set $R_k := \|c_k\|^2$ and $j_{k+1} := j_k$.
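A direct transcription of Algorithm R might look as follows. This is a sketch under assumptions: the function name `update_R`, the representation of the sequence $(a_j)$ as an indexable Python sequence `a`, and the default values of `alpha` and `beta` are illustrative. The comparison that triggers the increase of $j$ is written as "greater than or equal"; whether ties are meant to increase $j$ is an assumption of this sketch.

```python
def update_R(c_norm, ghat_norm, a, j, weighted_sum, alpha=0.25, beta=0.25):
    """One call of Algorithm R.

    c_norm       : ||c_k||
    ghat_norm    : norm of the reduced gradient at x_k
    a            : indexable sequence with a[j] = a_j (assumed decreasing to zero)
    j            : current index j_k
    weighted_sum : sum_r lambda_{kr} * ||c_{k-r}||^2
    Returns the pair (R_k, j_{k+1}).
    """
    if c_norm < min(alpha * a[j], beta * ghat_norm):
        # Feasibility is much better than stationarity: relax the reference value.
        R = min(a[j] ** 2, ghat_norm ** 2)
        # Move on in the sequence only if R_k dominates the weighted sum.
        j_next = j + 1 if R >= weighted_sum else j
        return R, j_next
    # Default choice: no relaxation beyond the nonmonotone weighted sum.
    return c_norm ** 2, j
```

Because $j$ only advances when the relaxed value $R_k$ is actually the binding term, the bound $a_{j_k}$ shrinks no faster than the relaxation is used.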
2.2. A nonmonotone decrease condition for the Lagrangian function. To evaluate the descent properties of the step for the objective function we use the predicted tangential reduction of the Lagrangian $\ell$
$$\mathrm{pred}_k^t := q_k(s_k^n) - q_k(s_k),$$
the predicted reduction of $\ell$ for the whole step
$$\mathrm{pred}_k^\ell := -q_k(s_k),$$
and the relaxed actual reduction of $\ell$
$$\mathrm{rared}_k^\ell := \max\Big\{\ell_k, \sum_{r=0}^{\nu_k^\ell - 1} \lambda_{kr}^\ell\, \ell_{k-r}\Big\} - \ell(x_k + s_k, y_k),$$
where as above with fixed $\nu^\ell \in \mathbb{N}$ and $\lambda \in (0, 1/\nu^\ell)$
$$\nu_k^\ell = \min\{k + 1, \nu^\ell\}, \qquad \lambda_{kr}^\ell \ge \lambda, \qquad \sum_{r=0}^{\nu_k^\ell - 1} \lambda_{kr}^\ell = 1.$$
REMARK 2.1. Another natural choice would be to use $\ell(x_k + s_k, y_{k+1})$ in the definition of $\mathrm{rared}_k^\ell$ with a new multiplier estimate $y_{k+1}$ and to add the term $(y_k - y_{k+1})^T (c_k + A_k^T s_k)$ in the definition of $\mathrm{pred}_k^t$ and $\mathrm{pred}_k^\ell$. Our convergence analysis can easily be adapted to handle this case as well and even simplifies. On the other hand, we prefer to work only with $y_k$ until an acceptable step is found, since the computation of new multipliers $y_{k+1}$ requires the usually costly evaluation of $A(x_k + s_k)$.
The computation of $s_k$ ensures that the tangential step $s_k^t$ provides decrease for the quadratic model $q_k(s_k^n + s^t)$, since $\mathrm{pred}_k^t$ satisfies (2.5). However, this descent can be destroyed by the normal step $s_k^n$, if $\|s_k^n\|$ is too large compared with $\|\hat g_k\|$. This motivates the following admissibility criterion: If $\mathrm{pred}_k^\ell$ promises sufficient decrease for the whole step, more precisely, if
$$\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\} \quad \text{and} \quad \mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t, \qquad \mu \in (2/3, 1), \quad \gamma \in (0, 1),$$
then we require
$$\mathrm{rared}_k^\ell \ge \rho_1\, \mathrm{pred}_k^\ell.$$
This leads to the following evaluation of trial steps.
Evaluation of steps:
Let $\mu \in (2/3, 1)$, $\gamma \in (0, 1)$, $\rho_1 \in (0, 1)$.
Accept the trial step $s_k = s_k^n + s_k^t$
if $s_k$ is acceptable for the constraints, i.e.,
$$\mathrm{rared}_k^c \ge \rho_1\, \mathrm{pred}_k^c,$$
and if $s_k$ is acceptable for the objective function:
If $\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\}$ and $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$ then $\mathrm{rared}_k^\ell \ge \rho_1\, \mathrm{pred}_k^\ell$ holds.
If the step is not acceptable then the trust-region radius $\Delta_k$ is reduced and the step $s_k$ is recomputed.
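The evaluation of steps described above amounts to one unconditional test on the constraints and one conditional test on the Lagrangian. The sketch below encodes this logic; the function name `step_is_acceptable`, the parameter names `rho1`, `gamma`, `mu`, and their default values are illustrative assumptions, and `pred_c` is assumed to be nonnegative, as guaranteed by (2.7).

```python
def step_is_acceptable(pred_c, pred_t, pred_l, rared_c, rared_l,
                       rho1=0.1, gamma=0.5, mu=0.75):
    """Acceptance test for a trial step.

    pred_c, pred_t, pred_l : predicted reductions (constraints, tangential, Lagrangian)
    rared_c, rared_l       : relaxed actual reductions (constraints, Lagrangian)
    """
    # The constraint test is always required.
    if rared_c < rho1 * pred_c:
        return False
    # The Lagrangian test is only required if the model promises enough
    # decrease for the whole step relative to the tangential step.
    lagrangian_test_required = (
        pred_t >= max(pred_c, pred_c ** mu) and pred_l >= gamma * pred_t
    )
    if lagrangian_test_required and rared_l < rho1 * pred_l:
        return False
    return True
```

Note that a step with a poor Lagrangian reduction is still accepted whenever the admissibility criterion is not met, which is where the method gains flexibility compared with a merit function test.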
2.3. The Algorithm. We now give a complete statement of the algorithm.
Algorithm A:
Let $0 < \rho_1 < \rho_2 < 1$, $0 < \gamma_1 < 1 < \gamma_2$, $0 < \alpha, \beta < 1/2$, $2/3 < \mu < 1$, and $0 < \gamma < 1$. Fix $\alpha_0 \in (0, 1)$ and choose a sequence $(a_j)$ satisfying the conditions in (2.8). Choose an initial point $x_0 \in \Omega$, and an initial trust-region radius $\Delta_0 \ge \Delta_{\min} > 0$. Set ¾ := 1, $k := 0$, and $j_0 := 0$.
1. (Evaluate functions at $x_k$)
   Compute $c_k$, $A_k$, $W_k$, $f_k$, $g_k$, $\hat g_k := W_k^T g_k$, and a Lagrange multiplier estimate $y_k$.
2. (Check for termination)
   If $\|c_k\| + \|\hat g_k\| = 0$: STOP.
3. (Update $R_k$)
   Choose the weights $\lambda_{kr}^{c}$, $\lambda_{kr}^{\ell}$ for $\mathrm{rared}_k^{c}$, $\mathrm{rared}_k^{\ell}$.
   Update $R_k$ by calling Algorithm R.
4. (Compute trial steps)
   Compute a quasi-normal step $s_k^n$ satisfying (2.2), (2.3), and a tangential step $s_k^t$ satisfying (2.5), (2.6). Set $s_k := s_k^n + s_k^t$.
5. (Test if $s_k$ is acceptable)
   If $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$ and $\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\}$ then goto Step 5.1, else goto Step 5.2.
   5.1 If $\mathrm{rared}_k^c < \rho_1\, \mathrm{pred}_k^c$ or $\mathrm{rared}_k^\ell < \rho_1\, \mathrm{pred}_k^\ell$ then set $\Delta_k := \gamma_1 \Delta_k$ and goto Step 4.
       Else choose $\Delta_{k+1} \in [\max\{\Delta_{\min}, \Delta_k\}, \max\{\Delta_{\min}, \gamma_2 \Delta_k\}]$, set $x_{k+1} := x_k + s_k$, $k := k + 1$ and goto Step 1.
   5.2 If $\mathrm{rared}_k^c < \rho_1\, \mathrm{pred}_k^c$ then set $\Delta_k := \gamma_1 \Delta_k$ and goto Step 4.
       Else choose $\Delta_{k+1} \in [\max\{\Delta_{\min}, \Delta_k\}, \max\{\Delta_{\min}, \gamma_2 \Delta_k\}]$, set $x_{k+1} := x_k + s_k$, $k := k + 1$ and goto Step 1.
In our formulation of the algorithm we have avoided a further index that distinguishes between different instances of trial steps at iteration level $k$. To prevent possible ambiguities, we use the following
Notation: We say that the step $s_k$ is accepted (or successful) if it is used in Step 5.1 or 5.2 to compute the new iterate, i.e., $x_{k+1} = x_k + s_k$. If it is necessary to reference the "accepted", i.e., final values of $s_k$, $s_k^n$, $s_k^t$, $\Delta_k$, $\mathrm{pred}_k^c$, $\mathrm{pred}_k^\ell$, $\mathrm{pred}_k^t$, $\mathrm{rared}_k^c$, and $\mathrm{rared}_k^\ell$ at iteration level $k$, we denote them by $s_{k,a}$, $s_{k,a}^n$, $s_{k,a}^t$, $\Delta_{k,a}$, $\mathrm{pred}_{k,a}^c$, $\mathrm{pred}_{k,a}^\ell$, $\mathrm{pred}_{k,a}^t$, $\mathrm{rared}_{k,a}^c$, and $\mathrm{rared}_{k,a}^\ell$, respectively.
3. Global convergence analysis.
3.1. Statement of the global convergence result. The following theorem states the global convergence properties of Algorithm A.
THEOREM 3.1. (i) Under assumptions (A1)–(A3), Algorithm A is well defined as long as $x_k$ stays in $\Omega$.
Moreover, if Algorithm A does not terminate finitely, then the following holds:
(ii) If assumptions (A1)–(A5) are satisfied, then
$$\lim_{k \to \infty} \|c_k\| = 0.$$
(iii) If assumptions (A1)–(A6) are satisfied, then in addition
$$\liminf_{k \to \infty} \|\hat g_k\| = 0.$$
The proof requires several steps and is carried out in the remainder of this section. In particular, part (i) is proved in section 3.2, Lemma 3.2, part (ii) in section 3.4, Lemma 3.7, and part (iii) in section 3.5, Lemma 3.9. A local convergence analysis showing the transition to fast local convergence under suitable conditions on the step computation will be given in section 4.
Throughout the remainder of this section, we will not consider the case where Algorithm A terminates successfully in Step 2, since in this situation the global convergence is trivial.
For the convergence analysis it will be convenient to introduce also the actual reduction of the Lagrangian
$$\mathrm{ared}_k^\ell := \ell_k - \ell(x_k + s_k, y_k).$$
The following estimates, obtained by the mean value theorem, will be used several times in this section. We recall that $s_k = s_k^n + s_k^t$, $A_k^T s_k = A_k^T s_k^n$ and $\max\{\|s_k^n\|, \|s_k^t\|\} \le \Delta_k$.
Assume that $x_k$ and $x_k + s_k$ are contained in $\Omega$. Denoting by $\tau \in [0, 1]$ an appropriate generic constant that is adjusted from case to case, and writing $x_k^\tau = x_k + \tau s_k$ we find that under the assumption (A1) holds
$$|\mathrm{ared}_k^c - \mathrm{pred}_k^c| \le \|A_k A_k^T\| \Delta_k^2 + 4 \|A(x_k^\tau) c(x_k^\tau) - A_k c_k\| \Delta_k, \tag{3.1}$$
$$|\mathrm{ared}_k^\ell - \mathrm{pred}_k^\ell| \le 2 \big(\|g(x_k^\tau) - g_k\| + \|(A(x_k^\tau) - A_k) y_k\|\big) \Delta_k + 2 \|H_k\| \Delta_k^2. \tag{3.2}$$
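The estimate (3.1) can be traced back to the definitions in a few lines. The following worked derivation is a sketch that uses only the mean value theorem for $x \mapsto \|c(x)\|^2$, whose gradient is $2A(x)c(x)$, together with $A_k^T s_k = A_k^T s_k^n$, $\|s_k^n\| \le \Delta_k$, and $\|s_k\| \le 2\Delta_k$; it is not quoted from the report.

```latex
\begin{align*}
\|c_k + A_k^T s_k\|^2
  &= \|c_k\|^2 + 2 (A_k c_k)^T s_k + \|A_k^T s_k^n\|^2, \\
\|c(x_k + s_k)\|^2
  &= \|c_k\|^2 + 2 \big(A(x_k^\tau) c(x_k^\tau)\big)^T s_k, \\
\mathrm{ared}_k^c - \mathrm{pred}_k^c
  &= \|c_k + A_k^T s_k\|^2 - \|c(x_k + s_k)\|^2 \\
  &= (s_k^n)^T A_k A_k^T s_k^n
     - 2 \big(A(x_k^\tau) c(x_k^\tau) - A_k c_k\big)^T s_k, \\
|\mathrm{ared}_k^c - \mathrm{pred}_k^c|
  &\le \|A_k A_k^T\| \Delta_k^2
     + 4 \|A(x_k^\tau) c(x_k^\tau) - A_k c_k\| \Delta_k .
\end{align*}
```

The estimate (3.2) follows in the same way from the mean value theorem applied to $x \mapsto \ell(x, y_k)$ and the definition of $q_k$.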
3.2. Well definedness. We start by showing that the algorithm is well defined, and thus establish part (i) of Theorem 3.1. We first note that under assumptions (A1)–(A3) normal steps $s_k^n$ satisfying (2.2), (2.3), and tangential steps $s_k^t$ satisfying (2.5), (2.6) can be obtained by enforcing a fraction of Cauchy decrease condition. Hereby, (A3) ensures that the constants $K_1$, $K_2$, $K_3$ in (2.2), (2.3), and (2.5) can be chosen independently of $k$ as long as $x_k \in \Omega$. We refer to [9] and, for details on the practical computation of steps providing a fraction of Cauchy decrease, to [20].
LEMMA 3.2. Let the assumptions (A1)–(A3) hold. Then for $x_k \in \Omega$ with $\|c_k\| + \|\hat g_k\| > 2\varepsilon > 0$ there exists $\delta > 0$ such that the step $s_k$ is accepted in Step 5 of Algorithm A whenever $\Delta_k \le \delta$. Hence, the algorithm is well defined as long as $x_k \in \Omega$.
Further, if also (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous on $\Omega$ then $\delta$ can be chosen depending only on $\max\{\min\{\varepsilon, \|c_k\|\}, \min\{\varepsilon, a_{j_k}\}\}$, $\Omega$, $\hat\Omega$, and the bounds in assumption (A3) and (A5).
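Proof. We start with a $\delta \in (0, \varepsilon]$ such that the closed $\delta$-ball about $x_k$ lies in $\Omega$. If $x_k \in \hat\Omega$, we always achieve this by choosing $\delta = \min\{\varepsilon, \tfrac{1}{2} \operatorname{dist}(\hat\Omega, \mathbb{R}^n \setminus \Omega)\}$. Further adjustment of $\delta$ will be performed as the proof proceeds.
Case 1: $\|c_k\| \ge \varepsilon$.
Since $\delta \le \varepsilon$, (2.7) implies that for any $\Delta_k \le \delta$ we have $\mathrm{pred}_k^c \ge K_2 \varepsilon \Delta_k$. Now reduce $\delta$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds
$$\|A_k A_k^T\| \delta + 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \le (1 - \rho_1) K_2 \varepsilon, \tag{3.3}$$
which is possible by assumption (A1). This together with (3.1) implies
$$\mathrm{rared}_k^c \ge \mathrm{pred}_k^c - |\mathrm{ared}_k^c - \mathrm{pred}_k^c| \ge \rho_1\, \mathrm{pred}_k^c + (1 - \rho_1)(\mathrm{pred}_k^c - K_2 \varepsilon \Delta_k) \ge \rho_1\, \mathrm{pred}_k^c.$$
If $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$ and $\mathrm{pred}_k^t \ge \max\{\mathrm{pred}_k^c, (\mathrm{pred}_k^c)^\mu\}$ then we have to satisfy the additional condition $\mathrm{rared}_k^\ell \ge \rho_1\, \mathrm{pred}_k^\ell$ (see Step 5.1). To achieve this, we note that in this case holds $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^c \ge \gamma K_2 \varepsilon \Delta_k$, where we have used (2.7). We now reduce $\delta > 0$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds
$$\|H_k\| \delta + \|g(x_k + s) - g_k\| + \|(A(x_k + s) - A_k) y_k\| \le \frac{\gamma}{2} (1 - \rho_1) K_2 \varepsilon, \tag{3.4}$$
which can be done by assumption (A1). Then (3.2) ensures that the test in Step 5.1 is passed for all $\Delta_k \le \delta$. Hence, the step is accepted if $\Delta_k \le \delta$.
If (A4) and (A5) hold and if $g$, $A$ are uniformly continuous on $\Omega$ then our mechanism of reducing $\delta$ can be done depending only on $\varepsilon = \min\{\varepsilon, \|c_k\|\}$, $\Omega$, $\hat\Omega$ and on the bounds in the assumptions.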
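Case 2: $\|\hat g_k\| > \varepsilon$.
By assumption (A3) there exists $K_7 > 0$ with
$$\|W_k^T \nabla q_k(s_k^n)\| \ge \|\hat g_k\| - K_7 \Delta_k.$$
Hence, if we reduce $\delta$ such that $\delta \le \frac{1}{2K_7} \varepsilon$ then for all $\Delta_k \le \delta$ holds $\Delta_k \le \frac{1}{2K_7} \|\hat g_k\|$ and therefore by (2.5)
$$\mathrm{pred}_k^t \ge \frac{K_3}{4} \varepsilon \min\{\varepsilon, \Delta_k\} = \frac{K_3}{4} \varepsilon \Delta_k. \tag{3.5}$$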
Case 2.1: $\mathrm{pred}_k^t < \mathrm{pred}_k^c$.
Then $\mathrm{pred}_k^c \ge \frac{K_3}{4} \varepsilon \Delta_k$ for $\Delta_k \le \delta$. We reduce $\delta$ until for all $s \in \mathbb{R}^n$ with $\|s\| \le 2\delta$ holds
$$\|A_k A_k^T\| \delta + 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \le (1 - \rho_1) \frac{K_3}{4} \varepsilon.$$
This is possible by (A1). Invoking (3.1), we obtain $\mathrm{rared}_k^c \ge \mathrm{ared}_k^c \ge \rho_1\, \mathrm{pred}_k^c$ whenever $\Delta_k \le \delta$ and thus the trial step is accepted in Step 5.2. If (A4) and (A5) hold and $g$, $A$ are uniformly continuous, $\delta$ can be chosen depending only on $\varepsilon$ and on the bounds in the assumptions.
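Case 2.2: $\mathrm{pred}_k^t \ge \mathrm{pred}_k^c$.
For the acceptance of the step we have to make sure that $\mathrm{rared}_k^c \ge \rho_1\, \mathrm{pred}_k^c$. The additional condition $\mathrm{rared}_k^\ell \ge \rho_1\, \mathrm{pred}_k^\ell$ is only required if also $\mathrm{pred}_k^\ell \ge \gamma\, \mathrm{pred}_k^t$. In this case, the latter requirement is met by noting that (A1) allows to reduce $\delta$ such that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds
$$\|H_k\| \delta + \|g(x_k + s) - g_k\| + \|(A(x_k + s) - A_k) y_k\| \le \frac{\gamma}{2} (1 - \rho_1) \frac{K_3}{4} \varepsilon.$$
Then, by (3.2) and (3.5), $\mathrm{rared}_k^\ell \ge \mathrm{ared}_k^\ell \ge \rho_1\, \mathrm{pred}_k^\ell$ if $\Delta_k \le \delta$. If (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous then $\delta$ can be chosen depending only on $\varepsilon$.
The first requirement $\mathrm{rared}_k^c \ge \rho_1\, \mathrm{pred}_k^c$ can be achieved by reducing $\delta$ further according to the following cases: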
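Case 2.2.1: $\|c_k\| \ge \min\{\alpha\, a_{j_k}, \beta \varepsilon\} =: \varepsilon_{j_k}$.
Reduce $\delta$ such that $\delta \le \|c_k\|$ and that for all $s \in \mathbb{R}^n$, $\|s\| \le 2\delta$, holds
$$\|A_k A_k^T\| \delta + 4 \|A(x_k + s) c(x_k + s) - A_k c_k\| \le (1 - \rho_1) K_2 \|c_k\|.$$
This is again possible by (A1). By (2.7) we have $\mathrm{pred}_k^c \ge K_2 \|c_k\| \Delta_k$ if $\Delta_k \le \delta$. Hence, $\mathrm{rared}_k^c \ge \mathrm{ared}_k^c \ge \rho_1\, \mathrm{pred}_k^c$ by (3.1) whenever $\Delta_k \le \delta$ and therefore the step is accepted in 5.1. If (A4) and (A5) hold and if $g$ and $A$ are uniformly continuous then $\delta$ can be chosen depending only on $\min\{\varepsilon, \|c_k\|\} \ge \min\{\alpha\, a_{j_k}, \beta \varepsilon\}$ and the bounds in the assumptions.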
Case2.2.2: � ck � � � jk .Then Rk
�min + a2
jk # � 2 - � � 2jk 0 max+," 2 #%$ 2 - � 4� 2
jkandwith suitable 12� [0 # 1] andx 3k &
xk � 1 sk holds
raredck�
Rk �4� ck � 2 ��� A � x 3k � c � x 3k �5� T sk �Since � Akck � Tsk
& � Akck � Tsnk � 0, this gives
raredck�
3� 2jk �6� A � x 3k � c � x 3k ��� Akck � � k �
Moreover, predck � � ck � 2 � � 2jk
. Now reduce suchthatfor all s � n, � s � � 2 , holds
3� 2jk �4� A � xk � s� c � xk � s��� Akck � ��� 1� 2
jk �This is possibleby (A1). Thenraredck
�7�1predck and,hence,thetrial stepis acceptedin Step
5.2 for all�
k �4 . If (A4) and(A5) hold andif g and A areuniformly continuousthen canbechosendependingonly onmin +," a jk #%$ �8- & � jk
�min +,� # � ck �.- andtheboundsin the
assumptions.
3.3. A nonmonotone decrease Lemma. The following crucial decrease Lemma is a slight modification of [29, Lem. 4.3].
LEMMA 3.3. Suppose that there exists $K \ge 0$ such that for all iteration levels $k \ge K$ holds
$$\mathrm{rared}_{k,a}^c = \max\Big\{\|c_k\|^2, \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c_{k+1}\|^2.$$
Then for all $k \ge K$
$$\|c_{k+1}\|^2 \le \max_{K - \nu_K^c < l \le K} R_l - \rho_1 \sum_{r=K}^{k} \lambda^{\min\{k - r, \nu^c\}}\, \mathrm{pred}_{r,a}^c. \tag{3.6}$$
Proof. Set MK Q maxK C�D cK E l F K Rl andaredckM a Q @
ck@ 2 G @
ckA 1@ 2. Theproof is by
induction.SinceRl ? @cl@ 2, 0 B l B K , wehave for k Q K
MK G @ ckA 1@ 2 ? raredckM a ? H 1predc
kM a PNow let k ? K . If raredckA 1M a Q aredckA 1M a we getby R 3P 6S k
@ckA 2
@ 2 Q @ckA 1
@ 2 G raredckA 1M a B MK GTH 1
k
r J K
K minL k C r MND c Opredc
r M a GIH 1predckA 1M a Uwhich implies R 3 P 6S kA 1, since0 V K V 1.
Now considerthecasewhereraredckA 1M a WQ aredckA 1M a. Thenweobtainwith q QYX ckA 1 G 1
by using(3.6)andthefactthat@ckA 1 C p
@ 2 B MK for K G2X cK V k Z 1 G p B K ,
@ckA 2
@ 2 Qq
pJ 0
K ckA 1M p @ ckA 1 C p
@ 2 G raredckA 1M a
B q
pJ 0
K ckA 1M p MK GTH 1
k C p
r J K
K minL k C p C r MND c Opredcr M a GTH 1predc
kA 1M a
B MK GIH 1
k C q
r J K
K minL k C r MND c Opredcr M a GTH 1
K k
r J maxL K M k C q A 1OK minL k C r MND c O
predcr M aGIH 1predckA 1M a P
Now r ? k G q Z 1 yields k G r B q G 1 B X c G 2 andtherefore1 Z min [ k G r U X c \ Qmin [ k Z 1 G r U X c \ . Thus,weseefrom thelastchainof inequalitiesthat
@ckA 2
@ 2 B MK GTH 1
k
r J K
K minL kA 1 C r MND c Opredcr M a G2H 1predc
kA 1M a
Q MK GIH 1
kA 1
r J K
K minL kA 1 C r M]D c O predcr a P
whichconcludestheproof.To show theconvergencetowardsstationarypointswe will moreover usethefollowing
decreaseLemma,which is verysimilar to thepreviousLemma3.3.LEMMA 3.4. Supposethat there exists K ? 0 such that for all iteration levelsk ? K
holds
rared_kM a Z�`aR xkA 1 U yk S G ` kA 1 ? H 1
2pred_kM a
Thenfor all k ? K
@ ` kA 1@ B max
K C�D�bK E l F K` l G H 1
2
k
r J K
K min c k C r M]D b�d pred_r M a P (3.7)
Proof. Weonly notethatby thedefinitionof raredekf a holds
raredekf a g�hai xkj 1 k yk l�m h kj 1 n max h k koqp
k r 1
r s 0
t ekr h k r r m h kj 1 u
Now (3.7) followsexactlyby thesameargumentsasin theproof of Lemma(3.3).
3.4. Convergence to feasible points. The following auxiliary result will be useful:
LEMMA 3.5. Let the assumptions (A1)–(A4) hold. If $j$ is increased by Algorithm R in iteration $k$, i.e., $j_{k+1} = j_k + 1$, then
$$\|c_{k'}\| \le \frac{1}{\sqrt{\lambda}}\, a_{j_k} \quad \text{for all } k' \ge k.$$
In particular, for all iterations $k$ with $j_k \ge 1$ holds
$$\|c_k\| \le \frac{a_{j_k}}{\sqrt{\lambda}\, \alpha_0}, \tag{3.8}$$
where $\alpha_0$ is the constant in (2.8).
Proof. If $j_{k+1} = j_k + 1$ then we must have
$$\|c_k\| < \min\{\alpha\, a_{j_k}, \beta \|\hat g_k\|\} \le \alpha\, a_{j_k} \quad \text{and} \quad a_{j_k}^2 \ge R_k \ge \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2.$$
Thus, using $\lambda_{kr}^c \ge \lambda$, we obtain
$$\|c_{k'}\|^2 \le \frac{1}{\lambda}\, a_{j_k}^2, \qquad k' = k + 1 - \nu_k^c, \ldots, k.$$
We now show by induction
$$\|c_{k'}\| \le \frac{1}{\sqrt{\lambda}}\, a_{j_k} \quad \text{for all } k' \ge k + 1 - \nu_k^c. \tag{3.9}$$
For $k' = k + 1 - \nu_k^c, \ldots, k$ this is already shown. Now let the assertion hold for the iterations $k + 1 - \nu_k^c, \ldots, k'$ with $k' \ge k$. Since by Lemma 3.2 the $k'$-th iteration will eventually be successful, we obtain in particular that
$$0 \le \rho_1\, \mathrm{pred}_{k',a}^c \le \mathrm{rared}_{k',a}^c = \max\Big\{R_{k'}, \sum_{r=0}^{\nu_{k'}^c - 1} \lambda_{k'r}^c \|c_{k'-r}\|^2\Big\} - \|c_{k'+1}\|^2.$$
Since $R_{k'} \le \max\{\|c_{k'}\|^2, a_{j_{k'}}^2\} \le \max\{\|c_{k'}\|^2, a_{j_k}^2\}$, we have by the induction hypothesis
$$\|c_{k'+1}\|^2 \le \max\{a_{j_k}^2, \|c_{k'}\|^2, \ldots, \|c_{k'+1-\nu_{k'}^c}\|^2\} \le \frac{1}{\lambda}\, a_{j_k}^2.$$
This proves the first assertion.
Now $\|c_k\| \le a_{j_k} / \sqrt{\lambda}$ follows immediately if $j$ is increased by Algorithm R in iteration $k$, i.e., $j_{k+1} = j_k + 1$. For all subsequent iterations $k'$ satisfying $j_{k'} = j_{k+1}$ we have by our previous result and by (2.8)
$$\|c_{k'}\| \le \frac{a_{j_k}}{\sqrt{\lambda}} = \frac{a_{j_{k'} - 1}}{\sqrt{\lambda}} \le \frac{a_{j_{k'}}}{\sqrt{\lambda}\, \alpha_0}.$$
Therefore, (3.8) holds for all $k$ with $j_k \ge 1$.
The next Lemma shows that $c_k$ must converge to zero if the assumption of the decrease Lemma 3.3 does not hold.
LEMMA 3.6. Let the assumptions (A1)–(A4) hold. If for infinitely many iterations holds
$$\mathrm{rared}_{k,a}^c \ne \max\Big\{\|c_k\|^2, \sum_{r=0}^{\nu_k^c - 1} \lambda_{kr}^c \|c_{k-r}\|^2\Big\} - \|c_{k+1}\|^2$$
then $j_k \to \infty$ and $\|c_k\| \to 0$.
Proof. Under the assumptions of the Lemma, there exists an infinite subsequence of iterations $k'$ for which holds
$$R_{k'} = \min\{a_{j_{k'}}^2, \|\hat g_{k'}\|^2\} > \|c_{k'}\|^2 \quad \text{and} \quad R_{k'} > \sum_{r=0}^{\nu_{k'}^c - 1} \lambda_{k'r}^c \|c_{k'-r}\|^2.$$
Hence, $j_{k'+1} = j_{k'} + 1$ in each iteration $k'$ and thus we must have $j_k \to \infty$. But now Lemma 3.5 yields
$$\lim_{k \to \infty} \|c_k\| \le \lim_{k \to \infty} \frac{a_{j_k}}{\sqrt{\lambda}\, \alpha_0} = 0.$$
Combining Lemmas 3.3 and 3.6, we can establish convergence to feasible points, which proves part (ii) of Theorem 3.1.
LEMMA 3.7. If Algorithm A does not terminate finitely then
$$\lim_{k \to \infty} \|c_k\| = 0.$$
Proof. Assumethat ck doesnot tend to zero. Then Lemma3.6 yields possiblyafterincreasingK
raredck a � max � ck � 2 ¢£ c
k � 1
r ¤ 0� c
kr � ck � r � 2 ¥ � ck� 1 � 2 for all k � K � (3.10)
ThusLemma3.3 is applicableandweobtainthatfor all k � K holds
� ck� 1 � 2 � MK¥T®
1� £ ck
r ¤ K
predck a ¢ where MK � maxK � £ c
K ¯ l ° KRl � (3.11)
Wefirst show that
lim infkª'« � ck � � 0� (3.12)
If this is wrong thenpossiblyafter increasingK thereexists ±³² 0 with ´ ck ´Tµ)± for allk µ K . Now by (2.3)
predck¶ a µ K2± min ±�·�¸ k¶ a · (3.13)
which, togetherwith (3.11),showsthat
¹kº K
¸ k¶ a »2¼4½ (3.14)
Thus, ¾ xk ¿}ÀÂÁà is a Cauchysequenceandconvergesto some Äx Å Áà À à . Thecontinuityofc, A, andg andtheboundednessof ¾ Hk ¿ and ¾ yk ¿ , see(A1) and(A5), impliestheexistenceof 0 »IÆÈÇ ± and ÆÊÉ ² 0 suchthatfor all xk with ´ xk Ë Äx ´ ÇÌÆÊÉ andall s with ´ s ´ Ç 2Æ theinequalities(3.3)and(3.4)aresatisfied.In theproofof Lemma3.2,Case1, it wasshown(note´ ck ´!µÍ± ) thatthestepis acceptedif (3.3),(3.4)holdfor all s with ´ s ´ Ç 2Æ andif in addition¸ k ÇÎÆ . Sincefor all sufficiently large k µ K we have ´ xk Ë Äx ´ Ç*Æ.É , the mechanismofupdating k would thusensurethat thestepis acceptedwith ¸ k¶ a µ min Ïи min ·�Ñ 1ÆaÒ . Thiscontradicts(3.14)and(3.12)is proven.
Now assumethat (3.12)holds,but ´ ck ´ doesnot convergeto zero. Thenthereis ±Ó² 0with ´ c Ôk ´Õµ 2± for a subsequence¾ Ák¿ . By (3.12),we canassociatewith each Ák some Äk µ Ákwith
´ c Ök× 1 ´ » ±�· ´ ck ´!µÍ±�· k Ø Ák · ½�½�½ · Äk ½As a consequencewehaveby (2.3)
predck¶ a µ K2± min ±Ù·�¸ k¶ a · k Ø Ák · ½�½�½ · Äk ½ (3.15)
Sincemoreover (3.11)holds,wemusthave
Ökkº Ôk
predck¶ a Ú 0 for Ák Ú ¼
andthusby (3.15)and ´ sk¶ a ´ Ç 2 ¸ k¶ a
´ x Ök× 1 Ë x Ôk ´ Ç 2
Ökkº Ôk
¸ k¶ a Ú 0 for Ák Ú ¼6½
Sincec is LipschitzcontinuousonÃ
by (A3), we concludethat
±ÛØ 2± Ë ± Ç ´ c Ök× 1 ´ Ë ´ c Ôk ´ Ç ´ c Ök× 1 Ë c Ôk ´ Ú 0 for Ák Ú ¼which is acontradiction.Hence,our assumptionwaswrongandtheproof is complete.
3.5. Convergence to stationary points. As a next step we consider the convergence behavior of the reduced gradient $\hat g_k = W_k^T g_k$. The following Lemma gives an important lower bound for acceptable trust-region radii. Hereby, we establish two variants of the result. One holds in the general setting of assumptions (A1)–(A6). The second result is stronger and holds under assumption (A7). It is used in section 4 to achieve locally quadratic convergence.
LEMMA 3.8. Let the assumptions (A1)–(A6) be satisfied. Then the following holds.
(i) Thereexistsa constantÜ 1 Ý 0 independentof k such that
raredck ÞÍß 1predck
is satisfiedwhenever
maxà�á snk á(âãá st
k áÊä�åÍæ kdefç Ü 1 min 1 â maxà min à a jk âãá�ègk á.ä5âãá ck á.ä 2é 3 ê (3.16)
(ii) There existsa constantÜ 2 Ý 0 independentof k such that the stepsk is acceptedwhenever
maxà�á snk á(âãá st
k á.ä�å min æ k â%Ü 2 maxà�á�ègk á.âãá ck á.ä ê (3.17)
with æ k asin (i).(iii) If in addition (A7) holdsthenthere exists ë Ý 0 and Ü 1 Ý 0 in (i) can be chosen
such that thestepsk is acceptedwhenever á xk ì6íx á¬îïë and
maxà�á snk á(âãá st
k á.ä/å min æ k â%Ü 1 maxà�á�ègk á(âãá ck á.äñð ê (3.18)
Proof. Set ò kç maxà�á sn
k á(âãá stk á.ä andnotethat ò k åôó k.
(i): Taylorexpansionyieldswith x õk ç xk öÎ÷ sk andappropriate÷Èø [0 â 1]
ùaredck ì predck
ù å 4 á A ú x õk û A ú x õk û T ì Ak ATk áüò 2
k ö 4 á%ý 2c ú x õk û�þ c ú x õk û áüò 2kê
Using(A3), (A6) weconcludethatthereexists K5 Ý 0 with
ùaredck ì predck
ù å K5ò 2k ú,ò k ö á ck á û ê (3.19)
Wenow considertwo cases.
Case1: á ck á Þ min à,ÿ a jk â��Õá�ègk á.äSinceraredck Þ aredck, thedecreasecondition(2.3) and(3.19)ensurethat raredck Þôß 1predckholdsif
K5ò 2k úñò k ö á ck á û åôú 1 ì ß 1û K2 á ck á min à�á ck á(â%ò k ä ê (3.20)
If á ck á}å�ò k, (3.20)holdsif 2K5ò 3k å�ú 1 ì ß 1û K2 á ck á 2, i.e., if
ò k å C1é 31 á ck á 2é 3 â C1
defç ú 1 ì ß 1û K2
2K5
êIn thecaseá ck á Ý ò k, (3.20)is satisfiedif
2K5ò 2k á ck á!åôú 1 ì ß 1û K2 á ck á ò k â i.e., if ò k å C1
êSince á ck á Þ min à ÿ a jk â�� á�ègk á.ä theassertion(i) thusholdswith
Ü 1ç C2
defç min à C1 â C1é 31 min à ÿ�â�� ä 2é 3 ä ê
Case2: á ck á î min à,ÿ a jk â�� á�ègk á.äThenRk
ç min à a2jkâãá�ègk á 2 ä accordingto Algorithm R. Weobtain
raredck Þ Rk ì á ck� 1 á 2 ç Rk ì á ck á 2 ö predck ö ú aredck ì predck û ê
Using(3.19)we getraredck � predck if
Rk ��� ck � 2 � K5� 2k � � k � ck ���� (3.21)
Case2.1: Rk � ���gk � 2ThenRk ��� ck � 2 � � 1 ��� 2�����gk � 2 since � ck ���������gk � , and(3.21)is satisfiedif
� 1 ��� 2 �����gk � 2 � K5� 2k � � k ������gk ���� (3.22)
If ���gk ����� k, (3.22)holdsif
� 1 � � 2�����gk � 2 � K5 � 1 ��!�����gk � 2� k " i.e., if � k � C3def� 1 � �
K5
If ���gk ���#� k, (3.22)holdsif
� 1 �$� 2�����gk � 2 � K5 � 1 %�&�'� 3k " i.e., if � k � C1( 3
3 ���gk � 2( 3 Therefore,since � ck �����)���gk � , (3.22)holdsif
� k � min C3 " C1( 33 max ���gk � " � ck � 2( 3
Case2.2: Rk � a2jk
ThenRk ��� ck � 2 � � 1 ��* 2� a2jk
and � ck ����* a jk . As in Case2.1,(3.21)holdsif
� k � min C4 " C1( 34 a2( 3
jk " C4def� 1 � *
K5"
whichyieldswith � ck ���+* a jk
� k � min C4 " C1( 34 max a jk " � ck � 2( 3
Thus,theassertion(i) is provenwith , 1 � min C2 " C3 " C1( 33 " C4 " C1( 3
4 .
(ii): If predtk - max predck " � predck �/. or pred0k -21 predtk thenno further acceptance
criteriaarerequiredandwe aredone.Otherwise,wehave
predtk � max predc
k " � predck � . and pred0k � 1 predtk " (3.23)
andgeta furtherrestrictionon � k by therequirementrared0k �43 1pred0k. Now (3.2)and(A6)yield a constantK6 � 0 with 5
ared0k � pred0k 5 � K6� 2k (3.24)
Weconsiderfirst thecasethat � ck � � ���gk � . We know thatin thepresentcaseholds
pred0k � 1 predtk � 1 predck � 1 K2 � ck � min � ck � " � k Sincerared0k � ared0k, weconcludefrom (3.24)thatrared0k �+3 1pred0k is ensuredif
K6� 2k � 1 � 1 � 3 1� K2 � ck � min � ck � " � k
This is satisfiedif
6k 7 C5 8 ck 8:9 C5 ; min
<>= 1 ?�@ 1A K2
K69 <>= 1 ?�@ 1A K2
K6
1B 2 C
Hence,in thecase8 ck 8�DE8�Fgk 8 thestepis acceptedif (3.17)holdswith G 2 ; C5.Wenow considerthecase8�Fgk 8�DE8 ck 8 . By (A3) and(A5) thereexistsaconstantK7 H 0
with
8 WTk I qk
= snk A 8JDE8�Fgk 8 ? K7 8 sn
k 8�DK8�Fgk 8 ? K76
k 9 (3.25)
8 WTk I qk
= snk A 8J7E8�Fgk 8&L K7 8 sn
k 8�7K8�Fgk 8&L K1K7 8 ck 8:9 (3.26)
wherewe have used(2.2) in the last inequality. Thus,we obtainfor 6 k 7 12K7 8�Fgk 8 by (2.5)
and(3.25)
predMk D < predtk D < K3
4 8�Fgk 8 min 8�Fgk 8:9 6 k
C(3.27)
Hence,by (3.24)we haveraredMk D @ 1predMk whenever 6 k 7 12K7 8�Fgk 8 and
K66 2
k 7 <>= 1 ?�@ 1A K3
48�Fgk 8 min 8�Fgk 8:9 6 k
CAll this is satisfiedwhenever
6k 7 C N5 8�Fgk 8:9 C N5 ; min
1
2K79 <>= 1 ? @ 1A K3
4K69 <>= 1 ? @ 1A K3
4K6
1B 2 C
Hence,thestepis acceptedif (3.17)holdswith G 2 ; min O C5 9 C N5 P , whichcompletestheproofof (ii).
(iii): Now let in addition(A7) hold. We chooseQ H 0 suchthatthe2Q -neighborhoodofRx is containedin S N . In therestof theproof we show thataftera possiblefurtherreductionof G 1 thestepis acceptedwhenever 8 xk ? Rx 8UT Q and(3.18)is satisfied.
Therefore,let xk satisfy 8 xk ? Rx 8VT Q . As alreadyin (ii), we have only to considerthecase(3.23)in which we have to ensurethatraredMk D @ 1predMk. To this end,let 0 T G N1 7 Q�W 2be suchthat (i) holds for G 1 ; G N1 (andthusfor all 0 T G 1 7 G N1) andconsiderstepsthatsatisfy(3.16)for G 1 ; G N1. In particular, we thenhave 6 k 7 Q�W 2 andthus[xk 9 xk L sk] X4S N .
Using(A7), Taylorexpansionyields
aredMk ? predMk ; 1
2sTk= I 2
x Y = x Zk 9 yk A ? I 2x Y = xk 9 yk A'A sk
for appropriatex Zk ; xk L\[ sk, [^] [0 9 1]. Thus,(A7) yieldsa constantK8 H 0 with
_aredMk ? predMk _ 7 K8
6 3k
C(3.28)
Now weconcludefrom (2.2)andthefirst inequalityin (3.25)that
8 WTk I qk
= snk A 8�D`8�Fgk 8 ? K1K7 8 ck 8 C (3.29)
Weconsiderfirst thecase
8 ck 8�7 1
2K1K7maxO 8�Fgk 8:9a8 WT
k I qk= sn
k A 8 P C (3.30)
Thenby (3.29)
bWT
k c qk d snk e bgf 1
2b�hgkb
andthus(3.27)holdsby (2.5). Hence,(3.27),(3.28)guaranteeraredik f+j 1predik whenever
K8k 3k lnm d 1 o j 1e K3
4b�hgkb
minb�hgkb:p k k q
This is satisfiedfor
k k l C6 min r b�hgkb 2s 3 pab�hgk
b 1s 2 t p C6 u min m d 1 o j 1e K3
4K8
1s 3 p m d 1 o j 1e K3
4K8
1s 2 qMoreover, we have in thecase(3.30)
bckb l 1
K1K7
b�hgkb q (3.31)
In fact,either(3.31)follows directly from (3.30)or we havebWT
k c qk d snk e bvf 2K1K7
bckb
and thus(3.26) yields (3.31). We now choosew 1 u min rxwzy1 p C6 min r 1 p d K1K7e/{ t|t . Then(3.18)implies k k l�} k l w 1 and(notethat ~�� 2� 3)
k k l w 1 min r 1p maxr b�hgkb:pab
ckb t { t l w 1 min r 1 pab�hgk
b { maxr 1 p d K1K7e�� { t'tl C6 min r 1 pab�hgk
b { t l C6 min r b�hgkb 2s 3 pab�hgk
b 1s 2 t qHence,theproof of (iii) is completein thecase(3.30).
It remainsto considerthecase
bckb � 1
2K1K7maxr b�hgk
b:pabWT
k c qk d snk e b t q (3.32)
Now, sincestk u Wkdk with dk u d WT
k Wk e � 1WTk st
k, (A3) and(A5) yield a constantK9 � 0suchthat
predtk l b WT
k c qk d snk e b�b d WT
k Wk e�� 1WTk st
kb&� 1
2bHkb k 2
k
l K9k k d b WTk c qk d sn
k e b�� k k e qSincepredtk
fmax predck
p d predck e { , wededucefrom (2.3)
K9k k d b WTk c qk d sn
k e b&� k k e f K {2 b ckb { min r b ck
b { p k {k twhichyields
K9k k d 2K1K7bckb&� k k e f K {2 b ck
b { min r b ckb { p k {k t q
IntroducingtheconstantC7 u K �2�1� 2K1K7 � K9
, this implies
k 2kf
C7bckb 2{ p if
bckb l k k
p(3.33)
k kbckb�f
C7bckb { k {k p if
bckb � k k. (3.34)
In thecase(3.33)weobtain� k � C1� 27 � ck ��� . Since � ck � is boundedby (A3) and2� 3 � �`�
1, we seethat in thecase(3.34)thereexistsa constantC8 � 0 with � k � C8. We concludethatin thesituation(3.23),(3.32)alwaysholds
� k � min � C1� 27 � ck � �!� C8 �|�
Since � ck � � 12K1K7 ���gk � , wefind thatwith C9 � min � C1� 2
7 min � 1 ��� 2K1K7��� � � � C8 � holds
� k � C9 min � 1 � max� � ck � � �gk � � �|�Choosing� 1 � min ���a 1 � C9 � , this concludestheproofof (iii) alsofor thecase(3.32).
The following lemma establishes part (iii) of Theorem 3.1.
LEMMA 3.9. Let (A1)–(A6) hold. If the algorithm does not terminate finitely then
$$\liminf_{k\to\infty} \|\hat g_k\| = 0.$$
Proof. Assumethatthealgorithmrunsinfinitely andthatthereareK � 0 and ¤^¥ � 0 � 1]with ���gk � � 2¤ for all k � K . We first show thatafterapossibleincreaseof K holds
pred¦k§ a �©¨ predtk§ a � predtk§ a � max predck§ a ��� predck§ a � � � for all k � K �
As in theproof of Lemma3.8,see(3.29),thereexistsa constantK7 � 0 with
� WTk ª qk � sn
k§ a � � � ���gk �¬« K7 � snk§ a � � 2¤ « K1K7 � ck �
for all k � K , wherewehaveused(2.2). Since � ck �¬ 0 by Lemma3.7,wecanincreaseKsuchthat
� WTk ª qk � sn
k§ a � � � ¤ for all k � K �andthusby (2.5)
predtk§ a � K3¤ min ¤ ��® k§ a for all k � K � (3.35)
On theotherhandholds
predck§ a ¯ 2 � Akck ��� snk§ a ��°� Ak AT
k ��� snk§ a � 2 �
and,hence,by (2.2)and(A3) find aconstantK10 � 0 with
predck§ a ¯ K10 � ck � min � ck � ��® k§ a � (3.36)
Since � ck �� 0 by Lemma3.7 and ���gk � � ¤ , we obtainfrom Lemma3.8, (i)–(ii), andthemechanismof updating® k thatafterapossibleincreaseof K holds
® k§ a �n¨ 1� 1 � ck � 2� 3 � � ck � for all k � K � (3.37)
andthuswehaveby (3.35)–(3.37)that,for sufficiently large K ,
predck§ a ¯ K10 � ck � 2 � predt
k§ a � K3¤ ¨ 1� 1 � ck � 2� 3 for all k � K � (3.38)
Hence,using � � 2� 3 and � ck �± 0, weseefrom (3.38)that,possiblyafterincreasingK ,
predtk§ a � max predck§ a ��� predc
k§ a � � for all k � K � (3.39)
Moreover, we notethatby (A3), (A5), and(2.2) thereis K11 ² 0 with³predkµ a ¶ predt
kµ a ³¸·¹³ qk º snkµ a » ³½¼ K11 ¾ ck ¾:¿
Hence,K canby (3.38)beenlargedsuchthat
predkµ a À©Á predtkµ a for all k À K ¿ (3.40)
Therefore,for all k À K theacceptedstepskµ a satisfies(3.39)and(3.40).Thus,for all k À K ,theacceptanceof thesteptakesplacein Step5.1. In particular, theacceptedstepssatisfy
raredkµ a À# 1predkµ a ÀnÁ� 1K3à min Ã�Ä�Å kµ a for all k À K Ä (3.41)
wherewehaveused(3.35).By our assumptionholds ¾�Ægk ¾ À 2à for all k À K . ThusLemma3.8, (i)–(ii) andthe
mechanismof updatingÅ k yieldsaconstantC1 ² 0 with
Å kµ a À C1 min 1Ä maxÇ min Ç a jk Ä 2ýÈ|Äa¾ ck ¾ÉÈ 2Ê 3 À C1 min Ã�Ä a2Ê 3jk ¿ (3.42)
Hence,weobtainfrom (3.41)someC2 ² 0 with
predkµ a À C2à min Ã�Ä a2Ê 3jk
for all k À K ¿ (3.43)
By (3.41)holdsraredkµ a À� 1predkµ a for all k À K . We want to applythedecreaseLemma
3.4 andsinceraredkµ a usesË º xkÌ 1 Ä yk » , we show that³ Ë kÌ 1 ¶ Ë º xkÌ 1 Ä yk » ³ becomessmall
comparedto predkµ a. In fact,³ Ë º xkÌ 1 Ä ykÌ 1 »>¶ Ë º xkÌ 1 Ä yk » ³½¼ ¾ ykÌ 1 ¶ yk ¾�¾ ckÌ 1 ¾:¿ (3.44)
Usingpredckµ a À 0, we havewith (3.19)
¾ ckÌ 1 ¾ 2 · ¾ ck ¾ 2 ¶ aredckµ a ¼ ¾ ck ¾ 2 Í ³ aredckµ a ¶ predckµ a ³½¼ ¾ ck ¾ 2 Í K5 Å 2kµ a º|Å kµ a Í ¾ ck ¾ » ¿
This togetherwith (3.37)yieldsaconstantK12 ² 0 suchthat
¾ ckÌ 1 ¾ 2 ¼ K12 Å 3kµ a for all k À K ¿ (3.45)
Using(A5), (3.41),(3.44),and(3.45)weobtainpossiblyafterincreasingK
³ Ë º xkÌ 1 Ä ykÌ 1»Î¶ Ë º xkÌ 1 Ä yk » ³a¼  1
2predkµ a for all k À K ¿
In fact, by (3.41), (3.44), (3.45) this is clear for Å kµ a ¼ C3, C3 ² 0 small enough.AfterincreasingK this holdsalsofor all Å kµ a À C3 ² 0, sinceby ckÌ 1 Ï 0 accordingto Lemma3.7 theleft termtendsby (3.44)to zerowhereastheright handsidehasby (3.41)a positivelowerbound.Hence,wegetwith (3.41)
raredkµ a Í Ë º xkÌ 1 Ä yk »>¶ Ë kÌ 1 À Â 1
2predkµ a for all k À K ¿
Thisyieldsby Lemma3.4with MK·
maxK Ð�ÑÓÒK Ô l Õ K Ë lË kÌ 1
¼MK ¶ Â 1
2
Ö Ñ Ò k
r × K
predkµ a
for all k Ø K . SinceÙ k is boundedfrom below by (A3) and(A5) this giveswith (3.43)ÚkÛ K
C2 min Ü 1Ý a2Þ 3jk ßáà
ÚkÛ K
predâkã a ä å�æBut theleft handsideis not summablebecauseaç j is not summableand è£é 2ê 3. Hence,wehavederivedacontradictionandtheproof is complete.
4. Transition to fast local convergence. Throughoutthis sectionwe assumethat as-sumptions(A1)–(A7) hold. We now show that the proposedAlgorithm A convergeswithlocal quadraticratetowardsa point satisfyingthesecondordersufficient condition.Hereby,we work with anSQP-Newton-typestepcomputation.Thesestepsareshown to beacceptedby our algorithm in a neighborhoodof a stationarypair ë�ìx Ýaìyí4îðïñóò m satisfyingthefollowing standardsufficientsecondordercondition:
(O2)c ë�ìx íô ïg ë�ìx í õ 0 Ý W ë�ìx í T ô 2
x Ù�ë�ìx Ýaìyí W ë�ìx í positivedefiniteÝ A ë�ìx í hasfull columnrank.
ThelastconditionensuresthattheLagrangemultiplier ìy is unique.
4.1. Requirement on the step computation. To achieve fast local convergence we have to ensure that close to the solution SQP-Newton steps are taken. This requires an appropriate splitting of these steps into their quasi-normal and tangential parts and a careful choice of the Lagrange multiplier update rule. For the derivation of these concepts, we begin by collecting some facts about the local convergence behavior of SQP methods. We choose an informal style of presentation since these results are by now well known.
The SQP-Lagrange-Newton system is given by
$$\begin{pmatrix} \nabla_x^2 \ell(x_k, y_k) & A_k \\ A_k^T & 0 \end{pmatrix} \begin{pmatrix} s_{N,k} \\ z_{N,k} \end{pmatrix} = -\begin{pmatrix} \nabla_x \ell(x_k, y_k) \\ c_k \end{pmatrix}.$$
Under the assumptions (A7) and (O2) it is well known that for all $\theta \in (0,1)$ there exist neighborhoods $U_N$ of $\bar x$ and $V_N$ of $\bar y$ such that for $x_k \in U_N$, $y_k \in V_N$ the steps $s_{N,k}$ and $z_{N,k}$ are well defined with $x_k + s_{N,k} \in U_N$, $y_k + z_{N,k} \in V_N$,
$$\|x_k + s_{N,k} - \bar x\| + \|y_k + z_{N,k} - \bar y\| \le \theta\,(\|x_k - \bar x\| + \|y_k - \bar y\|), \tag{4.1}$$
and that for $(x_k, y_k) \to (\bar x, \bar y)$ holds
$$\|x_k + s_{N,k} - \bar x\| + \|y_k + z_{N,k} - \bar y\| = O(\|x_k - \bar x\|^2 + \|y_k - \bar y\|^2), \tag{4.2}$$
$$\|\nabla_x \ell(x_k + s_{N,k}, y_k + z_{N,k})\| + \|c(x_k + s_{N,k})\| = O(\|\nabla_x \ell_k\|^2 + \|c_k\|^2). \tag{4.3}$$
Furthermore, if $s^n_{N,k}$ and $s^t_{N,k}$ satisfy
$$A_k^T s^n_{N,k} = -c_k, \qquad s^t_{N,k} = W_k d^t_{N,k}, \quad \text{where } (W_k^T \nabla_x^2 \ell_k W_k)\, d^t_{N,k} = -(\hat g_k + W_k^T \nabla_x^2 \ell_k\, s^n_{N,k}), \tag{4.4}$$
then we have $s_{N,k} = s^n_{N,k} + s^t_{N,k}$. Hereby, we have used the identity $\hat g_k = W_k^T \nabla_x \ell_k$. Note that $s^n_{N,k}$ solves the unconstrained quasi-normal problem
$$\min \|c_k + A_k^T s^n\|^2,$$
and that $s^t_{N,k}$ is the corresponding solution of the unconstrained tangential problem
$$\min q_k(s^n_{N,k} + s^t) \quad \text{subject to } A_k^T s^t = 0$$
with $H_k = \nabla_x^2 \ell(x_k, y_k)$.
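The splitting of the SQP-Newton step into a quasi-normal and a tangential part can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it uses NumPy, takes the model to be $q_k(s) = \nabla_x\ell_k^T s + \frac12 s^T H_k s$, picks the minimum-norm solution of $A_k^T s^n = -c_k$ as the quasi-normal part, and builds an orthonormal null-space basis by a QR factorization. The function names are hypothetical and are not taken from the report.

```python
import numpy as np

def null_basis_qr(A):
    """Orthonormal basis W of the null space of A^T (A is n x m, full column rank)."""
    n, m = A.shape
    Q, _ = np.linalg.qr(A, mode="complete")
    return Q[:, m:]

def sqp_newton_step(H, A, grad_lag, c):
    """Solve the Lagrange-Newton (KKT) system for the primal step s and multiplier step z."""
    n, m = A.shape
    K = np.block([[H, A], [A.T, np.zeros((m, m))]])
    sol = np.linalg.solve(K, -np.concatenate([grad_lag, c]))
    return sol[:n], sol[n:]

def split_step(H, A, W, g_hat, c):
    """Quasi-normal part s_n with A^T s_n = -c and tangential part s_t = W d, cf. (4.4)."""
    s_n = -A @ np.linalg.solve(A.T @ A, c)       # minimum-norm choice (an assumption)
    d = np.linalg.solve(W.T @ H @ W, -(g_hat + W.T @ H @ s_n))
    return s_n, W @ d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 5, 2
    M = rng.standard_normal((n, n))
    H = M @ M.T + n * np.eye(n)                  # positive definite model Hessian
    A = rng.standard_normal((n, m))
    g, y, c = rng.standard_normal(n), rng.standard_normal(m), rng.standard_normal(m)
    W = null_basis_qr(A)
    s, _ = sqp_newton_step(H, A, g + A @ y, c)
    s_n, s_t = split_step(H, A, W, W.T @ g, c)
    print(np.linalg.norm(s - (s_n + s_t)))       # difference between the two computations
```

Because $W_k^T A_k = 0$, the reduced gradient $W_k^T g_k$ does not depend on the multiplier, which is why `split_step` only needs `g_hat`.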
We would like to achieve quadratic convergence of $(x_k)$ rather than $(x_k, y_k)$. To this end, similarly to [19], let $Y : U_N \to \mathbb{R}^m$ be a consistent update rule for the Lagrange multiplier, which is Lipschitz continuous at $\bar x$, i.e.,
$$Y(\bar x) = \bar y, \qquad \|Y(x) - Y(\bar x)\| \le L_y \|x - \bar x\| \quad \text{for all } x \in U_N. \tag{4.5}$$
By a possible reduction of $U_N$ and $V_N$ we achieve that (4.1) holds for $\theta = 1/(2 + 2L_y)$, and a further reduction of $U_N$ yields $Y(U_N) \subset V_N$. Thus, for all $x_k \in U_N$ holds $y_k = Y(x_k) \in V_N$ and
$$\|x_k + s_{N,k} - \bar x\| \le \theta\,(\|x_k - \bar x\| + \|y_k - \bar y\|) \le \tfrac12 \|x_k - \bar x\|, \tag{4.6}$$
$$\|x_k + s_{N,k} - \bar x\| = O(\|x_k - \bar x\|^2 + \|y_k - \bar y\|^2) = O(\|x_k - \bar x\|^2), \tag{4.7}$$
where we have used (4.1), (4.2), and (4.5). Therefore, the iteration $x_k \mapsto x_{k+1} = x_k + s_{N,k}$ with $y_k = Y(x_k)$ converges q-quadratically to $\bar x$.
In the sequel we restrict ourselves to the following class of update rules: Let $B : U_N \to \mathbb{R}^{n \times m}$ be continuously differentiable such that $A^T B$ is uniformly bounded invertible on $U_N$. We introduce the multiplier update
$$Y(x) = -(B(x)^T A(x))^{-1} B(x)^T g(x), \tag{4.8}$$
which is obviously consistent and continuously differentiable. Therefore, after reducing $U_N$ if necessary, $B$ is bounded on $U_N$ and $Y$ satisfies the Lipschitz condition.
In particular, if we choose $B = A$, we obtain the well-known least-squares multiplier update. Furthermore, the adjoint update, which is widely used in optimal control, also fits in this framework: Let $x$ be partitioned in the form $x = (z^T, u^T)^T \in \mathbb{R}^m \times \mathbb{R}^{n-m}$ such that $\nabla_z c(x)$ is invertible on $U$ with uniformly bounded inverse. In an optimal control context the standard choice for $z$ is the state, and for $u$ the control. The adjoint update for this splitting now corresponds to $B^T = (B_z^T, B_u^T) = (I, 0)$.
Among the many possible solutions $s^n_{N,k}$ of the quasi-normal problem we select the one contained in $\mathrm{span}(B_k)$, i.e.,
$$s^n_{N,k} = -B_k (A_k^T B_k)^{-1} c_k. \tag{4.9}$$
By construction holds for $y_k = Y(x_k)$
$$\nabla_x \ell(x_k, y_k)^T s^n_{N,k} = 0. \tag{4.10}$$
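The orthogonality relation (4.10) follows from $B_k^T \nabla_x\ell(x_k, y_k) = 0$ for the multiplier (4.8), and it is easy to verify numerically. The Python sketch below is an illustration under assumptions: it uses NumPy, assumes the Lagrangian gradient has the form $g + A y$, and works with random data. The function names are hypothetical.

```python
import numpy as np

def multiplier_update(B, A, g):
    """Multiplier estimate y = -(B^T A)^{-1} B^T g, cf. (4.8)."""
    return -np.linalg.solve(B.T @ A, B.T @ g)

def quasi_normal_step(B, A, c):
    """Quasi-normal step s_n = -B (A^T B)^{-1} c in span(B), cf. (4.9)."""
    return -B @ np.linalg.solve(A.T @ B, c)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 6, 2
    A = rng.standard_normal((n, m))
    g, c = rng.standard_normal(n), rng.standard_normal(m)
    # B = A gives the least-squares update; B = (I, 0)^T mimics the adjoint update.
    for B in (A, np.vstack([np.eye(m), np.zeros((n - m, m))])):
        y = multiplier_update(B, A, g)
        s_n = quasi_normal_step(B, A, c)
        grad_lag = g + A @ y                  # assumed form of the Lagrangian gradient
        print(abs(grad_lag @ s_n), np.linalg.norm(A.T @ s_n + c))
```

Both printed quantities should be at rounding level: the first is the left-hand side of (4.10), the second checks that the step satisfies the linearized constraint.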
Further, there exist constants $K_1, K_{13} > 0$ such that
$$\|s^n_{N,k}\| \le K_1 \|c_k\|, \qquad \|s^t_{N,k}\| \le K_{13} (\|\hat g_k\| + \|c_k\|), \tag{4.11}$$
where the first inequality follows from (4.9), and for the derivation of the second inequality we use (4.4) to obtain
$$s^t_{N,k} = -W_k (W_k^T \nabla_x^2 \ell_k W_k)^{-1} (\hat g_k + W_k^T \nabla_x^2 \ell_k\, s^n_{N,k}).$$
Furthermore, the uniformly bounded invertibility of $B^T A$ and the fact that $A^T W = 0$ yield the uniformly bounded invertibility of $(B \mid W)$ and thus
$$\|\nabla_x \ell_k\| = O\bigl(\|(B_k \mid W_k)^T \nabla_x \ell_k\|\bigr) = O\left(\left\|\begin{pmatrix} 0 \\ \hat g_k \end{pmatrix}\right\|\right) = O(\|\hat g_k\|).$$
Hence, using that $\hat g(x) = W(x)^T \nabla_x \ell(x, y)$ for all $y \in \mathbb{R}^m$, we obtain
$$\|\hat g(x_k + s_{N,k})\| + \|c(x_k + s_{N,k})\| = O\bigl(\|\nabla_x \ell(x_k + s_{N,k}, y_k + z_{N,k})\|\bigr) + \|c(x_k + s_{N,k})\| = O(\|\nabla_x \ell_k\|^2 + \|c_k\|^2) = O(\|\hat g_k\|^2 + \|c_k\|^2). \tag{4.12}$$
Collecting the results obtained so far, we have
PROPOSITION 4.1. Let (A7) hold and assumethat 5x 0 -6satisfiesthe secondorder
sufficientcondition(O2). Let B & x (70 n 8 m becontinuouslydifferentiablein a neighborhoodUN of 5x such that AT B is uniformlyboundedinvertibleonUN . Thenfor xk 0 UN sufficientlycloseto 5x, yk % Y & xk ( , sn
N 2 k, andstN 2 k asgivenin (4.8), (4.9), and(4.4)arewell definedand
satisfy(4.10), (4.11). Furthermore, (4.6), (4.7), and(4.12)hold.Thefollowing assumptionstatesour requirementson thestepcomputationthatwe need
to prove fastlocal convergence.Assumption:
(A8) 5x 0 -6 satisfiesthesecondordersufficientcondition(O2),and & xk ( convergesto 5x.Moreover, thereexistsa neighborhoodUN of 5x suchthatfor all xk 0 UN holds:
(i) The Lagrange multiplier estimates are computed by $y_k = Y(x_k)$ with $Y$ given by (4.8), where $B(x) \in \mathbb{R}^{n \times m}$ is continuously differentiable on $U_N$ and $A^T B$ is uniformly bounded invertible on $U_N$.
(ii) The step $s^n_k = s^n_{N,k}$ with $s^n_{N,k}$ as in (4.9) is chosen whenever $\|s^n_{N,k}\| \le \Delta_k$.
(iii) If the reduced Hessian $W_k^T \nabla_x^2 \ell_k W_k$ is positive definite, then $s^t_{N,k}$ is computed according to (4.4) and $s^t_k = s^t_{N,k}$ is chosen whenever $\|s^t_{N,k}\| \le \Delta_k$.
REMARK 4.2. A possible implementation of (iii) is obtained by applying Steihaug's conjugate gradient method to (2.4) in the reduced variables $d$, where $s^t = W_k d$ (or in its projected form [18]). If the reduced Hessian is positive definite then the CG-path either leaves the trust-region (in this case holds $\|s^t_{N,k}\| > \Delta_k$), or it stays in the trust-region and converges to $d^t_{N,k}$. If the reduced Hessian is not positive definite, the Steihaug method either detects negative curvature or stops since the path leaves the trust-region.
As is well known, one can allow inexactness without destroying the rate of convergence. Due to space limitations this issue is not discussed here.
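The three possible outcomes of the Steihaug method mentioned in Remark 4.2 can be seen in a small sketch. The Python code below is a textbook version of Steihaug's truncated CG for a generic model $\min\, g^T d + \frac12 d^T H d$ subject to $\|d\| \le \Delta$. It is not the authors' implementation: it assumes NumPy, measures the trust region in the Euclidean norm of $d$ (for $s^t = W_k d$ this matches $\|s^t\|$ only if $W_k$ has orthonormal columns), and all names are hypothetical.

```python
import numpy as np

def _step_to_boundary(d, p, delta):
    """Largest tau >= 0 with ||d + tau * p|| = delta (requires ||d|| <= delta)."""
    a, b, c = p @ p, 2.0 * (d @ p), d @ d - delta**2
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def steihaug_cg(H, g, delta, tol=1e-10, max_iter=None):
    """Truncated CG for min g^T d + 0.5 d^T H d subject to ||d|| <= delta."""
    d = np.zeros(g.size)
    r = g.astype(float)                  # model gradient at the current d
    if np.linalg.norm(r) <= tol:
        return d, "converged"
    p = -r
    for _ in range(max_iter or 2 * g.size):
        Hp = H @ p
        curv = p @ Hp
        if curv <= 0.0:                  # negative curvature: follow p to the boundary
            return d + _step_to_boundary(d, p, delta) * p, "negative_curvature"
        alpha = (r @ r) / curv
        if np.linalg.norm(d + alpha * p) >= delta:   # CG path leaves the trust region
            return d + _step_to_boundary(d, p, delta) * p, "boundary"
        d = d + alpha * p
        r_new = r + alpha * Hp
        if np.linalg.norm(r_new) <= tol:  # interior solution of the model problem
            return d, "converged"
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d, "max_iter"
```

With a positive definite `H` and a large radius the routine returns the unconstrained minimizer, which corresponds to the case in which the CG-path stays in the trust region.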
4.2. Quadratic local convergence. The next result shows that with the rule (A8) for the step computation Algorithm A eventually takes Newton steps.
THEOREM 4.3. Let (A1)–(A8) hold. Then the trial steps according to (4.4), (4.9) are eventually taken by Algorithm A and thus $(x_k)$ converges q-quadratically to $\bar x$.
Theproofof this resultrequiressomework. Westartwith thefollowing auxiliary result.LEMMA 4.4. Let (A1)–(A8) holdandlet @ satisfy
2
3A @ A min BC/ 2
3B /D2. (4.13)
Thenthere is K=
0 such that thefollowing is true: If for someiterationkEGF K holds
aH jkI =�)-gkI ) or
)ckI ) H =�) -gkI ) / (4.14)
thenfor all k J kK AlgorithmA takesNewtonsteps,i.e. snkL a M sn
N L k andstkL a M st
N L k.Proof. We first note that by the assumptionson N and O the condition (4.13) can be
satisfied.Sincexk P Qx by (A8) andc R Qx S M 0, Tg R Qx S M 0, we find K U 0 with V ck VXW 1,V#Tgk VYW 1, V xk Z[Qx V]\ min ^`_ N a+bGc andxk d UN for all k J K , where b is asin Lemma3.8, (iii). In particular, (4.11) holds for all k J K . Hence,we can increaseK suchthatV sn
N L k V a V stN L k VeWgf min for all k J K . Sincethe stepssn
N L k andstN L k satisfy the decrease
conditions(2.3) and(2.5), respectively, part (iii) of Lemma3.8 yieldsby themechanismofupdatingf k thatfor k J K theNewtonstepsk M sN L k is acceptedwhenever
V snN L k V a V st
N L k V�W<h 1_ Kk a where (4.15)
_ Kk defMji 1 min max min ^ a jk a V#Tgk V cka V ck V c 2l 3 a max!V#Tgk V a V ck V c`m n (4.16)
In fact,iterationlevel k is enteredwith f k Jof min andin eachsubiterationf k is reducedbyat mostthefactor h 1.
Fromck P 0and(4.11)weobtain V snN L k V�W K1 V ck V�Wph 1i 1 V ck V m W<h 1_ Kk for all k J K
aftera possibleincreaseof K . Thus,by (A8) thequasi-normalstepsatisfiessnkL a M sn
N L k forall k J K .
Now weconsiderthestepstN L k for k J K . If aq jk U�V#Tgk V then,using V#Tgk V a V ck V�W 1,
max min ^ a jk a V#Tgk V cra V ck V c J max!V#Tgk V 1l q a V ck V c J max!V#Tgk V a V ck V c 1l q nSimilarly, if V ck VsqtU�V#Tgk V , we obtain
max min ^ a jk a V#Tgk V cra V ck V c JuV ck V�J max!V#Tgk V 1l q a V ck V c J max!VvTgk V a V ck V c 1l q nIn bothsituationsweconcludethat,sinceNw\ 2
3q , for k J K , K sufficiently large,holds
h 1_ Kk Jph 1i 1 max!V#Tgk V a V ck V c 23x U K13 R�V#Tgk VGyzV ck V{S�J|V st
N L k V awherewehaveused 2
3q \ 1 and(4.11).Therefore,wehaveproved:
If K is sufficiently largeandfor kKGJ K holds(4.14),thensk}~L a M sN L k} . (4.17)
In thecasejk �P � , k P � , thesequencea jk is boundedaway from zeroandwe seethat(4.14)holdsfor all kK J K if K is chosensufficiently large. Therefore,(4.17)completestheproof in this case.
Now considerthecasejk P � ask P � . Thenfor K so large that jK J 1, Lemma3.5yields
V ck V�W 1� ���0a jk
defM K14a jk for all k J K n (4.18)
In thecaseaq jk} U�V#Tgk} V wegetwith (4.18),usinga j P 0,
VvTgk} V�y�V ck} V�\ aq jk} y K14a jk} W 2aq jk} W2� q0 aq jk}~� 1
nfor kK J K , K largeenough.In thecaseV ck} Vsq�U�V#Tgk} V wehave
V#Tgk} VGyzV ck} V�\�V ck} V q y�V ck} V�W 2 V ck} V q W 2K q14aq jk} W2K q14� q0 aq jk}�� 1
n
From(4.12)weseethat,possiblyafterincreasingK , holds
�#�gk�~� 1
�����#�g � xk�v� sN � k��� �t� �G�0
max� 2 � 2K �14 � ��#�gk� � � �
ck� � �7� a� jk��� 1 �Therefore,(4.14)holdsfor k � 1 insteadof k andthusinductionyieldssk� a � sN � k for allk ¡ k by (4.17).
Wenow proveTheorem4.3.Proof of Theorem4.3. We chooseK from Lemma4.4. AssumethatAlgorithm A does
not eventuallychooseNewton steps.ThenLemma4.4 implies that (4.14)mustbeviolatedfor all k G¡ K . Thus,wehave
a� jk � �v�gk�
and�ck� � � �#�gk
�for all k ¡ K � (4.19)
Exactlyasatthebeginningof theproofof Lemma4.4weobtainthatpossiblyafterincreasingK for all k ¡ K holds
�ck� � 1,
�v�gk� � 1 andthatthusby Lemma3.8andthemechanism
of updating¢ k astepsk�
snk � st
k is acceptedif
�snk� � � st
k� �<£ 1¤ 1 min max� min � a jk � �#�gk
� � � � ck� � 2¥ 3 � max� �#�gk
� � � ck� �§¦ def� £ 1 k �
Hereby, we chooseK large enoughsuchthat the right handsideis � ¢ min (by (A8) holdsck © 0,
�gk © 0). By (4.19)wehavetherefore
¢ k� a ¡ £ 1 k ¡ £ 1¤ 1 min � max� a jk � � ck� � 2¥ 3 � max� � ck
� � ¦ � �v�gk� ¦ ��� ¡ £ 1¤ 1
�ck� 2¥ 3 � (4.20)
In thelastinequalitywehaveusedthat ª¬« � 2 3 by (4.13).In particular, weobtainpossiblyafterincreasingK by (4.11)
�sn
N � k � � K1�ck��� £ 1¤ 1
�ck� 2¥ 3 � ¢ k� a for all k ¡ K �
andthereforeby (A8)
snk� a � sn
N � k for all k ¡ K � (4.21)
Weshow next thatthenaftera possibleincreaseof K holds
pred®k� a ¡ £ predtk� a ¡ £ max predck� a �v� predck� a � ¦ for all k ¡ K � (4.22)
In fact,asusedbefore,cf. (3.25),wehaveby (A3), (A5), (2.2),and(4.19)�WT
k ¯ qk � xk � snk � � ¡ �#�gk
�±°K7�snk� ¡ �#�gk
�²°K1K7
�ck� ¡ �#�gk
�³°K1K7
�v�gk� 1¥ � �
Thus,for k ¡ K , K largeenough,it holds�WT
k ¯ qk � xk � snk � � ¡ �#�gk
� 2 andconsequentlyby (2.5)and(4.20)
predtk� a ¡ K3
4�#�gk�
min�#�gk� �v¢ k� a ¡ K3
4�ck� � min
�ck� � � £ 1¤ 1
�ck� 2¥ 3 ¡ K3
4�ck� 2�
for all k ¡ K . On theotherhand,weobtainasbefore,cf. (3.36),
predck� a � K10
�ck� 2 �
Sinceby (4.13)holds ª � « andsinceck © 0 wededuce
predtk� a ¡ K3
4�ck� 2� ¡ K ¦10
�ck� 2¦ ¡ max predck� a �v� predck� a � ¦ for all k ¡ K (4.23)
possiblyafterincreasingK . Moreover, wehave
´predµk¶ a · predtk¶ a ´¬¸�´ qk ¹ sn
k¶ a º ´¼»½´¿¾ x À�¹ xk Á yk º Tsnk¶ a ´4 1
2ÃHkÃ�Ã
snk¶ a à 2»
K15Ãsnk¶ a à 2 » K 2
1 K15Ãckà 2 Ä
Hereby, we have usedthat (4.10)holdsfor all k Å K by (4.21). We use(4.23)to concludethatfor K largeenoughholds
predµk¶ a ÅoÆ predtk¶ a for all k Å K ÄTogetherwith (4.23)this impliesthat(4.22)holdsfor all k Å K . Thus,we have by Step5.1of Algorithm A thatfor all iterationlevelsk Å K holds
raredµk¶ a Å]Ç 1predµk¶ a Å<Æ#Ç 1predtk¶ a Å<Æ#Ç 1K3
4ÃvÈgkÃ
minÃ#Ègkà ÁvÉ k¶ a Ä
From (4.19)we seethatÃvÈgkà ŠaÊ jk andthus jk Ë Ì . Therefore,(4.20)yields É k¶ a Å
Æ 1Í 1 min Î a2Ï 3jk Á aÊ+Ðjk Ñ . Since Ò¬ÓÕÔ 2Ö 3 Ô×Ò , we obtain for jk large enough(i.e., k large
enough)
predµk¶ a Å<Æ K3
4Ã#ÈgkÃaÊ jk Å<Æ K3
4a2Êjk Ä (4.24)
As in theproof of Lemma4.4,theestimate(4.18)is satisfiedandwe deducewith (A8)
´ À�¹ xkØ 1 Á ykØ 1º�· ÀÙ¹ xkØ 1 Á yk º ´Ú»wà ykØ 1 · ykÃ�Ã
ckØ 1Ã�»
L yÃsk¶ a à K14a jk Á
whereL y is thelocalLipschitzconstantof Y. Since(4.21)holds,(A8) ensuresthatÃstk¶ a Ã�»Ã
stN ¶ k à for all k Å K , andtherefore(4.11)implies
Ãsk¶ a ÃÛ» K1
ÃckÃ3Â
K13 ¹ Ã#ÈgkÃ3Â?Ã
ckà º .
UsingthatÃckÃ�»|Ã#È
gkÃ
by (4.19)wearriveat
´ ÀÙ¹ xkØ 1 Á ykØ 1º�· ÀÙ¹ xkØ 1 Á yk º ´¼» L y ¹ K1Â
2K13º K14Ã#ÈgkÃa jkÄ
Comparingwith (4.24)andusingthat jk ËÜÌ , we getaftera possibleincreaseof K
´ À�¹ xkØ 1 Á ykØ 1º�· À�¹ xkØ 1 Á yk º ´Ú» Ç 1
2predµk¶ a Ä
Therefore,weobtain
raredµk¶ a  À�¹ xkØ 1 Á yk º· À kØ 1 Å Ç 1
2predµk¶ a for all k Å K Ä
Now Lemma3.4yieldswith M µK ¸ maxK ÝßÞáàK â l ã K À l
À k»
M µK · Ç 1
2
ä mk
r å K
predµk¶ a for all k Å K Ä
But since2 ÒæÔèç by (4.13),a2Êj is notsummable,andhencealsopredµk¶ a is notsummableby(4.24). We concludethat À k Ë · Ì which is a contradictionto theassumptions.Theproofis complete.
5. Numerical results. In this section we report numerical results for a set of problems from the CUTE collection [1]. The results are obtained by a preliminary MATLAB implementation of Algorithm A. We use the least-squares multiplier update, i.e., (4.8) with $B(x) = A(x)$, and consequently compute quasi-normal steps contained in $\mathrm{span}(A_k)$. For the computation of $W_k$ we use direct elimination based on MATLAB's LU-factorization routine. More precisely, we compute the factorization
$$\begin{pmatrix} L_k \\ N_k \end{pmatrix} R_k = P_k A_k$$
with $N_k \in \mathbb{R}^{(n-m) \times m}$, lower triangular $L_k \in \mathbb{R}^{m \times m}$, upper triangular $R_k \in \mathbb{R}^{m \times m}$, and a permutation matrix $P_k$, and set
$$W_k = P_k^T \begin{pmatrix} -L_k^{-T} N_k^T \\ I \end{pmatrix}.$$
For large problems this matrix is not formed explicitly; only its application to vectors is computed. For the computation of the quasi-normal step we use the dogleg method [25] with a sparse Cholesky factorization for the computation of the Newton step (4.9). The tangential step is obtained by a Steihaug-CG method [27]. For problems of moderate size we form the reduced Hessian explicitly and use the incomplete Cholesky factorization ICFS of Lin and Moré [21] as preconditioner. For large problems, we currently use a matrix-free Steihaug-CG method without preconditioning. If the CG-path does not leave the trust-region and does not find negative curvature, it is stopped if the current residual $r_k$ is reduced to $r_{k+1} \le \max\{10^{-20}, \min\{0.05, r_k\}\, r_k\}$. Therefore, our step computation is similar to the one in [20], but the implementation described in [20], especially the computation of $W_k$, is more elaborate than our straightforward test code. We note that our step computation satisfies an inexact version of our step requirement for the local convergence analysis in section 4, cf. Remark 4.2.
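The direct-elimination formula for $W_k$ can be checked on a small dense example. The Python sketch below is only an illustration and not the MATLAB code used for the experiments: it assumes NumPy, uses a short hand-written LU factorization with partial row pivoting, and forms $W_k$ explicitly, which the text says is avoided for large problems. The helper names are hypothetical.

```python
import numpy as np

def lu_partial_pivot(A):
    """Return perm, L, R with A[perm] = L @ R, where L is n x m unit lower
    trapezoidal and R is m x m upper triangular (A is n x m with n >= m)."""
    n, m = A.shape
    LR = A.astype(float).copy()
    perm = np.arange(n)
    for j in range(m):
        p = j + np.argmax(np.abs(LR[j:, j]))          # pivot row for column j
        if LR[p, j] == 0.0:
            raise ValueError("A does not have full column rank")
        LR[[j, p]] = LR[[p, j]]
        perm[[j, p]] = perm[[p, j]]
        LR[j + 1:, j] /= LR[j, j]                     # elimination multipliers
        LR[j + 1:, j + 1:] -= np.outer(LR[j + 1:, j], LR[j, j + 1:])
    L = np.tril(LR, -1)
    L[:m] += np.eye(m)
    return perm, L, np.triu(LR[:m])

def null_space_basis(A):
    """Null-space basis W of A^T via W = P^T [-L_k^{-T} N_k^T; I]."""
    n, m = A.shape
    perm, L, _ = lu_partial_pivot(A)
    Lk, Nk = L[:m], L[m:]                             # square and rectangular blocks of L
    Z = np.vstack([-np.linalg.solve(Lk.T, Nk.T), np.eye(n - m)])
    W = np.empty_like(Z)
    W[perm] = Z                                       # apply the transposed permutation
    return W

if __name__ == "__main__":
    A = np.random.default_rng(0).standard_normal((7, 3))
    W = null_space_basis(A)
    print(np.linalg.norm(A.T @ W))                    # should be at rounding level
```

The identity $A_k^T W_k = R_k^T (L_k^T, N_k^T)\,(-L_k^{-T} N_k^T;\ I) = 0$ is what the final print statement checks.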
For the numerical tests we have used the following parameter settings in Algorithm A:
ö c ë öÙ÷ ë 5 ôeø ë 0 ð 001ôeù 1 ë 0 ð 01ôeù 2 ë 0 ð 9 ôûú 1 ë 0 ð 5 ôûú 2 ë 2 ôü ë½ýþë 0 ð 1 ô ÿ ë 3
�4 ôûú ë 0 ð 01ô�� min ë 10í 5 ô��
0 ë 10ðFor thesequenceé a j ê weused
a0 ë min 0 ð 1maxé 1 ô�� ck� ê ô����gk
��� ck� ô a j ë a0�
j 1ô j 1 ð
The weights $\lambda^{c,\ell}_{kr}$ in $\mathrm{rared}^{c,\ell}_k$ are computed as mentioned in section 2.1. Hence, we select $0 \le \hat r < \nu^c_k$ with $\|c_{k-\hat r}\| = \max_{0 \le r < \nu^c_k} \|c_{k-r}\|$ and set $\lambda^c_{k\hat r} = 1 - (\nu^c_k - 1)\lambda$, and $\lambda^c_{kr} = \lambda$ for $r \ne \hat r$. The same strategy is used to choose $\lambda^\ell_{kr}$. We stop if $\max\{\|c_k\|_\infty, \|\hat g_k\|_\infty\} \le 10^{-5}$.
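The weight selection just described puts almost all weight on the worst of the remembered values and a small fixed weight on each of the others. The Python sketch below illustrates this rule with plain lists. The function names are hypothetical, and the weighted sum of squared constraint norms in `weighted_reference` is an assumption about how the weights enter the relaxed reduction of section 2.1, which is not reproduced here.

```python
def nonmonotone_weights(values, lam):
    """Weights for remembered values, most recent first: weight lam on every
    entry except the largest one, which receives 1 - (nu - 1) * lam."""
    nu = len(values)
    if nu == 0:
        raise ValueError("need at least one remembered value")
    if lam <= 0.0 or (nu - 1) * lam >= 1.0:
        raise ValueError("lam must keep all weights positive")  # sanity check (assumption)
    r_hat = max(range(nu), key=values.__getitem__)   # index of the largest value
    weights = [lam] * nu
    weights[r_hat] = 1.0 - (nu - 1) * lam
    return weights

def weighted_reference(constraint_norms, lam):
    """Convex combination of squared constraint norms (assumed use of the weights)."""
    weights = nonmonotone_weights(constraint_norms, lam)
    return sum(w * v * v for w, v in zip(weights, constraint_norms))

if __name__ == "__main__":
    # Five remembered values and small weight 0.001, matching the values 5 and
    # 0.001 that appear in the parameter list.
    print(nonmonotone_weights([1.0, 3.0, 2.0, 0.5, 0.1], lam=0.001))
```

The weights always sum to one, so the reference value lies between the smallest and the largest remembered value and is dominated by the largest.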
Table 5.1 shows the number of f, c evaluations and the number of g, A evaluations for a set of problems from the CUTE collection using exact Hessians. The results show that the new penalty-function-free algorithm is robust and efficient. Although our MATLAB implementation is very straightforward without using advanced techniques for, e.g., the trust-radius update and the computation of $W_k$, the number of iterations for most problems is very satisfactory. As expected from our local convergence theory, we observed transition to fast local convergence for all problems. We believe that further improvements can be achieved by a more sophisticated implementation. The results indicate that our new class of nonmonotone trust-region methods without penalty function is an interesting and competitive alternative to algorithms with penalty function and deserves further consideration. Since only minor changes are required to incorporate our framework in existing penalty-function-based Byrd–Omojokun type algorithms, it might be interesting to include our new approach as an option in existing high-quality implementations.

TABLE 5.1
Results for CUTE problems using exact Hessians

Problem    f,c  g,A      n     m    Problem    f,c  g,A      n     m
ARTIF        9    9   1000  1000    HS8          8    6      2     2
AUG2D       10   10   3280  1600    HS9          3    2      2     1
AUG2DC       8    8   3280  1600    HS26        74   43      3     1
AUG3D        7    7   3873  1000    HS27        20   14      3     1
AUG3DC       8    8   3873  1000    HS39        32   21      4     2
BRATU3D      5    5   3375  3375    HS42         4    4      4     2
BYRDSPHR    10    7      3     2    HS46        22   17      5     2
DTOC1NA     10   10   2998  1996    HS47        42   26      5     3
DTOC1NB     10    8   2998  1996    HS48         2    2      5     2
DTOC1NC     16   11   2998  1996    HS49        14   14      5     2
DTOC1ND     16   10   2998  1996    HS50         8    8      5     3
DTOC2       10    7   2998  1996    HS51         2    2      5     3
DTOC3        8    8   2998  1996    HS52         2    2      5     3
DTOC4        5    5   2999  1998    HS56        17   10      7     4
DTOC5        6    6   9999  4999    HS61         8    6      3     2
DTOC6       13   13   2001  1000    HS77        12   12      5     2
DTOC6       13   13   2001  1000    HS78         5    5      5     3
EIGENA2      3    3    110    55    HS79         5    5      5     3
EIGENB2    101   61    110    55    HS100LNP    10    7      7     2
EIGENBCO    90   49    110    55    HS111LNP    16   12     10     3
EIGENC2     58   35    462   231    LCH         33   21    300     1
EIGENCCO    62   34    110    55    MARATOS      4    4      2     1
GENHS28      3    3    300   298    MWRIGHT     11    8      5     3
GRIDNETB     7    7     60    36    ORTHREGA    38   27    517   256
GRIDNETE     7    7     60    36    ORTHREGB     2    2     27     6
HAGER1       5    5  10001  5000    ORTHREGC    18   14   1005   500
HAGER2       5    5  10000  5000    ORTHREGD    20   16   5003  2500
HAGER3       5    5  10000  5000    OPTCTRL3     9    9    202    80
HS6         12    8      2     1    OPTCTRL3    20   20   2002   800
HS7         12    9      2     1    OPTCTRL6     9    9    202    80
6. Conclusions. The Byrd–Omojokun class of trust-region algorithms is known as one of the most efficient optimization methods for equality constrained NLP. On the other hand, the recent idea of filter methods and, more generally, merit-function-free algorithms holds great promise for the development of powerful new algorithms. In this work we combined both of these concepts in a novel way. The result is a globally convergent algorithm that converges locally quadratically to regular solutions. Similar to filter methods, the algorithm allows for nonmonotonicity of both constraint violation and (Lagrangian) function values. However, instead of using a filter, we impose nonmonotone decrease conditions that control feasibility and optimality, respectively, in a loosely coupled way. From a computational point of view, only minor modifications are necessary to incorporate the proposed concepts into an existing implementation of the Byrd–Omojokun algorithm. Our preliminary numerical results indicate that the presented approach is viable and yields an efficient and robust new class of trust-region algorithms.
REFERENCES
[1] I . BONGARTZ, A . R. CONN, N. I . M. GOULD, AND P. L. TOINT, CUTE: ConstrainedandUnconstrainedTestingEnvironment, ACM Trans.Math.Software,21 (1995),pp.123–160.
[2] R. H. BYRD, Robust trust region methodsfor constrainedoptimization, Third SIAM Conferenceon Opti-mization,Houston,TX, May 1987.
[3] R. H. BYRD, J. C. GILBERT, AND J. NOCEDAL, A trust region methodbasedon interior point techniquesfor nonlinearprogramming, Math.Programming,89 (2000),pp.149–185.
[4] R. H. BYRD, M. E. HRIBAR, AND J. NOCEDAL, An interior point algorithm for large scalenonlinearprogramming, SIAM J.Optim.,9 (2000),pp.877–900.
[5] R. H. BYRD, R. B. SCHNABEL , AND G. A. SHULTZ, A trust region algorithmfor nonlinearlyconstrainedoptimization, SIAM J.Numer. Anal., 24 (1987),pp.1152–1170.
[6] M. R. CELIS, J. E. DENNIS, AND R. A. TAPIA, A trustregionstrategyfor nonlinearequalityconstrainedop-timization, in NumericalOptimization1984,P. T. Boggs,R. H. Byrd, andR. B. Schnabel,eds.,Philadel-phia,USA, 1985,SIAM.
[7] T. F. COLEMAN AND Y. L I, An interior trust region approach for nonlinearminimizationsubjectto bounds,SIAM J.Optim.,6 (1996),pp.418–445.
[8] A. R. CONN, N. I . M. GOULD, AND P. L. TOINT, Global convergenceof a classof trust region algorithmsfor optimizationwith simplebounds, SIAM J.Numer. Anal., 25 (1988),pp.433–460.
[9] J. E. DENNIS, M. EL -ALEM , AND M. C. MACIEL, A global convergencetheoryfor general trust-regionbasedalgorithmsfor equalityconstrainedoptimization, SIAM J.Optim.,7 (1997),pp.177–207.
[10] J. E. DENNIS, M. HEINKENSCHLOSS, AND L. N. V ICENTE, Trust-region interior-pointSQPalgorithmsfora classof nonlinearprogrammingproblems, SIAM J.ControlOptim.,36 (1998),pp.1750–1794.
[11] J. E. DENNIS AND L. N. V ICENTE, Trust region interior-point algorithmsfor minimizationproblemswithsimplebounds, in AppliedMathematicsandParallelComputing,Festschriftfor KlausRitter, H. Fischer,B. Riedmuller, andS.Schaffler, eds.,Heidelberg, 1996,Physica-Verlag,pp.97–107.
[12] , On the convergencetheoryof trust-region-basedalgorithmsfor equality-constrainedoptimization,SIAM J.Optim.,7 (1997),pp.927–950.
[13] M. EL -ALEM,A globalconvergencetheoryfor a classof trustregionalgorithmsfor constrainedoptimization,Tech.Rep.TR88-5,Departmentof ComputationalandApplied Mathematics,RiceUniversity, Houston,TX, USA, 1988.
[14] , A global convergencetheoryfor theDennis-Celis-Tapiatrust-region algorithmfor constrainedopti-mization, SIAM J.Numer. Anal., 28 (1991),pp.266–290.
[15] , A global convergencetheory for a general classof trust-region-basedalgorithmsfor constrainedoptimizationwithoutassumingregularity, SIAM J.Optim.,9 (1999),pp.965–990.
[16] R. FLETCHER, N. I . M. GOULD, S. LEYFFER, AND P. L. TOINT, Global convergenceof trust-region SQP-filter algorithmsfor nonlinearprogramming, Tech.Rep.99/03,Departmentof Mathematics,Universityof Namur, Namur, Belgium,1999.
[17] R. FLETCHER AND S. LEYFFER, Nonlinear programming without a penalty function, Numerical Analysis Report NA/171, Department of Mathematics, University of Dundee, Dundee, Scotland, 1997.
[18] N. I. M. GOULD, M. E. HRIBAR, AND J. NOCEDAL, On the solution of equality constrained quadratic programming problems arising in optimization, Tech. Rep. RAL-TR-1998-069, Rutherford Appleton Laboratory, Chilton, Oxfordshire, England, 1998.
[19] D. KLEIS AND E. W. SACHS, Convergence rate of the augmented Lagrangian SQP method, J. Optim. Theory Appl., 95 (1997), pp. 49–74.
[20] M. LALEE, J. NOCEDAL, AND T. D. PLANTENGA, On the implementation of an algorithm for large-scale equality constrained optimization, SIAM J. Optim., 8 (1998), pp. 682–706.
[21] C.-J. LIN AND J. J. MORÉ, Incomplete Cholesky factorizations with limited memory, SIAM J. Sci. Comput., 21 (1999), pp. 24–45.
[22] C.-J. LIN AND J. J. MORÉ, Newton’s method for large bound-constrained optimization problems, SIAM J. Optim., 9 (1999), pp. 1100–1127. Dedicated to John E. Dennis, Jr., on his 60th birthday.
[23] E. O. OMOJOKUN, Trust region algorithms for optimization with nonlinear equality and inequality constraints, PhD thesis, University of Colorado, Boulder, Colorado, USA, 1989.
[24] T. D. PLANTENGA, A trust-region method for nonlinear programming based on primal interior point techniques, SIAM J. Sci. Comput., 20 (1999), pp. 282–305.
[25] M. J. D. POWELL, A hybrid method for nonlinear equations, in Numerical methods for nonlinear algebraic equations (Proc. Conf., Univ. Essex, Colchester, 1969), Gordon and Breach, London, 1970, pp. 87–114.
[26] M. J. D. POWELL AND Y. YUAN, A trust region algorithm for equality constrained optimization, Math. Programming, 49 (1990), pp. 189–213.
[27] T. STEIHAUG, The conjugate gradient method and trust regions in large scale optimization, SIAM J. Numer. Anal., 20 (1983), pp. 626–637.
[28] P. L. TOINT, A non-monotone trust-region algorithm for nonlinear optimization subject to convex constraints, Math. Programming, 77 (1997), pp. 69–94.
[29] M. ULBRICH, Non-monotone trust-region methods for bound-constrained semismooth equations with applications to nonlinear mixed complementarity problems, SIAM J. Control Optim., (2000, in press).
[30] M. ULBRICH, S. ULBRICH, AND M. HEINKENSCHLOSS, Global convergence of trust-region interior-point algorithms for infinite-dimensional nonconvex minimization subject to pointwise bounds, SIAM J. Control Optim., 37 (1999), pp. 731–764.
[31] M. ULBRICH, S. ULBRICH, AND L. N. VICENTE, A globally convergent primal-dual interior point filter method for nonconvex nonlinear programming, Tech. Rep. TR00-12, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA, 2000.
[32] A. VARDI, A trust region algorithm for equality constrained minimization: convergence properties and implementation, SIAM J. Numer. Anal., 22 (1985), pp. 575–591.
[33] L. N. VICENTE, Trust-region interior-point algorithms for a class of nonlinear programming problems, PhD thesis, Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA, 1995. Report TR96-05.
[34] C. A. ZOPPKE-DONALDSON, A tolerance-tube approach to sequential quadratic programming with applications, PhD thesis, University of Dundee, Dundee, Scotland, UK, 1995.