Bulletin of the Iranian Mathematical Society Vol. XX No. X (201X), pp XX-XX.
A LIMITED MEMORY TRUST-REGION METHOD WITH ADAPTIVE
RADIUS FOR LARGE-SCALE UNCONSTRAINED OPTIMIZATION
MASOUD AHOOKHOSH, KEYVAN AMINI∗, MORTEZA KIMIAEI AND M. REZA PEYGHAMI
Communicated by
Abstract. This study concerns a trust-region-based method for solving unconstrained optimization problems. The approach takes advantage of the compact limited memory BFGS updating formula together with an appropriate adaptive radius strategy. In our approach, the adaptive technique decreases the number of subproblems solved, while exploiting the structure of limited memory quasi-Newton formulas helps to handle large-scale problems. Theoretical analysis indicates that the new approach preserves global convergence to first-order stationary points under classical assumptions. Moreover, superlinear and quadratic convergence rates are also established under suitable conditions. Preliminary numerical experiments on some standard test problems show the effectiveness of the proposed approach for solving large-scale unconstrained optimization problems.
1. Introduction
Over the past few decades, large-scale unconstrained optimization has received much attention, since it arises in many applications in the applied sciences such as biology, physics, geophysics, chemistry, engineering and industry. In general, an unconstrained optimization problem can be formulated as follows
(1.1)    minimize f(x)    subject to x ∈ R^n,
MSC(2010): Primary: 90C30; Secondary: 65K05, 65K10.
Keywords: Unconstrained optimization, Trust-region framework, Compact quasi-Newton representation, Limited memory technique, Adaptive strategy, Convergence theory.
Received: date, Accepted: date.
∗Corresponding author
© 2011 Iranian Mathematical Society.
where f : Rn → R is assumed to be continuously differentiable.
Motivation and history. There exist many iterative schemes, such as Newton, quasi-Newton, variable metric, gradient and conjugate gradient methods, that have been introduced and developed for solving the unconstrained problem (1.1). In most cases, they need to be combined with one of the general globalization techniques, i.e., line search or trust-region techniques, in order to guarantee global convergence (see [26]).
For a given xk, a line search technique refers to a procedure that computes a step-size αk along a specific direction dk and generates a new point as xk+1 = xk + αk dk. Many line search strategies for determining this step-size have been proposed, for instance the exact line search or the Armijo, Wolfe and Goldstein inexact line searches (see [26]). On the other hand, a quadratic-model trust-region technique computes a trial step dk by solving the quadratic subproblem
(1.2)    minimize m_k(x_k + d) = f_k + g_k^T d + (1/2) d^T B_k d    subject to d ∈ R^n, ‖d‖ ≤ ∆k,
where ‖·‖ denotes the Euclidean norm, fk = f(xk), gk = ∇f(xk), Bk is the exact Hessian Gk = ∇²f(xk) or a symmetric approximation of it, and ∆k is the trust-region radius. In order to evaluate the agreement between the model and the objective function and to accept or reject the trial step dk, a criterion based on the actual and the predicted reductions is required. Traditional monotone trust-region methods do this by defining the ratio
(1.3)    ρk = (f(xk) − f(xk + dk)) / (mk(xk) − mk(xk + dk)),
where the numerator and the denominator are called the actual and the predicted reductions, respectively. The trial step dk is accepted whenever ρk exceeds a positive constant. More precisely, if ρk ≥ µ2 for a constant µ2 > 0, then the new point xk+1 = xk + dk is accepted and the iterate is called very successful; if µ1 ≤ ρk < µ2 for 0 < µ1 < µ2, then the new point xk+1 = xk + dk is still accepted, the iterate is called successful, and the trust-region radius is updated appropriately based on the value of ρk. Otherwise, the trial step is rejected and the quadratic subproblem (1.2) is resolved at the current point with a reduced trust-region radius.
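The acceptance test based on the ratio (1.3) can be sketched as follows; this is a minimal illustration rather than the paper's implementation, and the thresholds mu1, mu2 are placeholder values.

```python
import numpy as np

def accept_step(f, model, x, d, mu1=0.05, mu2=0.9):
    """Classify a trial step d via rho = actual / predicted reduction (1.3)."""
    actual = f(x) - f(x + d)
    predicted = model(np.zeros_like(d)) - model(d)  # m_k(x_k) - m_k(x_k + d)
    rho = actual / predicted
    if rho >= mu2:
        return rho, "very successful"   # accept; radius may be enlarged
    if rho >= mu1:
        return rho, "successful"        # accept; radius updated from rho
    return rho, "rejected"              # shrink radius, re-solve (1.2)

# Example with a quadratic objective whose model is exact, so rho = 1:
f = lambda x: 0.5 * float(x @ x)
g = np.array([1.0, -2.0])
x = g.copy()                            # here grad f(x) = x = g
model = lambda d: f(x) + float(g @ d) + 0.5 * float(d @ d)
rho, status = accept_step(f, model, x, -0.5 * g)
```

Because the model matches the quadratic objective exactly here, the ratio evaluates to 1 and the step is classified as very successful.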
From the literature, one can see that traditional trust-region methods are very sensitive to the initial trust-region radius ∆0 and to its updating scheme. This fact has led researchers to look for appropriate procedures for choosing the initial trust-region radius as well as its updating rule. In 1997, Sartenaer [30] proposed an approach to determine the initial trust-region radius by monitoring the agreement between the model and the objective function along the steepest descent direction. A possible drawback of this approach is the dependency of its parameters on problem information. Recently, Gould et al. [18] examined the sensitivity of traditional trust-region methods to the parameters related to the step acceptance test and the trust-region radius update with extensive numerical experiments. Despite their comprehensive tests on a large number of test problems, they did not claim to have found the best parameters for the updating scheme. In 2002, motivated by a problem in the field of neural networks, the first adaptive trust-region radius was proposed by Zhang et al. [35], in which the information of the current iterate is used more effectively to introduce an adaptive scheme. They introduced the following adaptive trust-region radius
(1.4)    ∆k = c^{p_k} ‖gk‖ ‖B̂_k^{−1}‖,
where c ∈ (0, 1) is a constant, pk ∈ N ∪ {0}, and B̂k = Bk + Ek is a safely positive definite matrix based on the Schnabel and Eskow modified Cholesky factorization, see [31]. As numerical results show, their method works very well on small-scale unconstrained optimization problems, but the situation changes dramatically for large-scale and even medium-scale problems because of the computation of the inverse matrix B̂_k^{−1}. Subsequently, Shi and Guo [33] proposed another interesting adaptive radius by
(1.5)    ∆k = −c^{p_k} (g_k^T q_k)/(q_k^T B̂_k q_k) ‖qk‖,
where c ∈ (0, 1) is a constant, pk ∈ N ∪ {0}, and B̂k is generated by the procedure
B̂k = Bk + iI,
where i is the smallest nonnegative integer such that q_k^T B̂_k q_k > 0, and qk satisfies the well-known angle condition
(1.6)    −(g_k^T q_k)/(‖gk‖ ‖qk‖) ≥ τ,
in which τ ∈ (0, 1] is a constant. An important advantage of this method is the freedom to select an appropriate qk in order to make the method more robust. For instance, Shi and Guo proposed qk = −gk and qk = −B_k^{−1} gk. Preliminary numerical results and theoretical analysis showed that their method is promising for solving medium-scale problems without any need to search for an appropriate initial trust-region radius. For more references on adaptive trust-region radii, see also [1, 2, 3, 4, 6].
Although the adaptive radius of Shi and Guo has some advantages, such as decreasing the total computational cost by reducing the number of subproblems solved and determining a good initial radius, it suffers from some drawbacks as well. First, it can easily be seen that qk = −gk does not generate an appropriate radius (see [33]), while the computation of qk = −B_k^{−1} gk requires the inverse matrix B_k^{−1} or the solution of a linear system of equations, which is inappropriate for large-scale problems. Second, the process of generating B̂k guarantees that the denominator of (1.5) is bounded away from zero; however, if the numerator −g_k^T q_k is close to zero, the trust-region radius becomes tiny, which possibly increases the total number of iterates. Third, numerical experiments have shown that when the ratio ρk is very close to 1, i.e., for very successful iterates, the method does not necessarily enlarge the trust-region radius sufficiently. In addition, the procedure for constructing B̂k is somewhat unusual and sometimes costly. Finally, the necessity of storing Bk for computing the term q_k^T B_k q_k in (1.5) and the term d_k^T B_k d_k in the predicted reduction may make the method unsuitable for large-scale problems.
It is known that limited memory quasi-Newton methods are customized versions of quasi-Newton methods for large-scale optimization problems. Their implementations are almost identical to those of quasi-Newton methods; however, the Hessian and inverse Hessian approximations are not formed explicitly. Instead, they are defined from the information of a small number of previous iterates, which reduces the required memory. As another advantage, some limited memory quasi-Newton formulas, such as the compact limited memory BFGS updating formula (see [10]), preserve positive definiteness under some cheap conditions. Due to these remarkable advantages, limited memory quasi-Newton formulas are widely utilized for large-scale optimization problems, see [8, 10, 19, 20, 21, 24, 25].
Content. In this paper, we propose an improved version of the adaptive trust-region radius (1.5) that attains better performance when the number of variables of the underlying function is large. More specifically, we first replace (1.5) by a modified formula to overcome the above-mentioned disadvantages. We then take advantage of the compact limited memory BFGS formula in order to calculate the term d_k^T B_k d_k cheaply. The method also preserves positive definiteness of Bk under mild conditions, so it avoids computing B̂k. The analysis of the new approach shows that it inherits both the stability of adaptive trust-region approaches and the effectiveness of the limited memory BFGS formula. We also establish global convergence of the proposed method to first-order stationary points and provide superlinear and quadratic convergence rates. To show the efficiency of the proposed method, some numerical results are reported.
The remainder of this paper is organized as follows. In Section 2, we describe the motivation behind the proposed algorithm and outline the algorithm. Section 3 is devoted to investigating the global, superlinear and quadratic convergence properties of the algorithm. Numerical results are provided in Section 4 to show the promising behaviour of the proposed method for solving large-scale unconstrained optimization problems. Finally, some conclusions are given in Section 5.
2. Algorithmic framework
In this section, we first examine some alternatives to overcome the disadvantages of the trust-region radius (1.5) mentioned in the previous section. We then construct our trust-region-based algorithm and establish some of its properties.
It can easily be seen that if one exploits a positive definite quasi-Newton formula for Bk, then there is no need to define B̂k in (1.5), since q_k^T B_k q_k > 0 for every arbitrary non-zero vector qk. Therefore, the method is exempt from generating B̂k. As mentioned before, qk = −gk does not generate an appropriate trust-region radius, and the calculation of qk = −B_k^{−1} gk imposes a remarkable computational cost on the method. Thus, a new qk with less computational cost satisfying (1.6) is needed. For this purpose, the following qk is employed throughout the paper
(2.1) qk = −Hkgk,
which clearly has less computational cost than the calculation of qk = −B_k^{−1} gk, so it partially avoids the costly computation of the inverse matrix B_k^{−1}. In order to show that qk = −Hk gk satisfies (1.6), it is sufficient that the spectral condition number κk of Hk, the ratio λ1/λn of the largest to the smallest eigenvalue of Hk, is bounded above independently of k. Let θk denote the angle between qk = −Hk gk and −gk; then the facts that ‖Hk gk‖ ≤ λ1 ‖gk‖ and g_k^T H_k g_k ≥ λn ‖gk‖² imply
sin(π/2 − θk) = cos(θk) = (g_k^T H_k g_k)/(‖gk‖ ‖Hk gk‖) ≥ (λn ‖gk‖²)/(λ1 ‖gk‖²) = κ_k^{−1}.
From the inequality sin(x) ≤ x, it can be concluded that
θk ≤ π/2 − κ_k^{−1}.
Therefore, if the spectral condition number κk is bounded above for all k ∈ N, then θk is bounded away from π/2, i.e., the angle condition (1.6) is satisfied; see, for example, [16].
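The bound cos(θk) ≥ κ_k^{−1} is easy to check numerically. The following sketch, with an arbitrary symmetric positive definite matrix standing in for Hk and illustrative names of our own, verifies it.

```python
import numpy as np

# Numerical check of cos(theta_k) >= 1/kappa(H_k) for a random SPD matrix.
rng = np.random.default_rng(42)
n = 20
A = rng.standard_normal((n, n))
H = A @ A.T + n * np.eye(n)          # symmetric positive definite
g = rng.standard_normal(n)

Hg = H @ g
cos_theta = (g @ Hg) / (np.linalg.norm(g) * np.linalg.norm(Hg))
eigs = np.linalg.eigvalsh(H)         # ascending eigenvalues
kappa = eigs[-1] / eigs[0]           # lambda_1 / lambda_n

# q = -H g therefore satisfies the angle condition (1.6) with tau = 1/kappa.
assert cos_theta >= 1.0 / kappa
```

The assertion holds for any SPD H, since g^T H g ≥ λn ‖g‖² and ‖Hg‖ ≤ λ1 ‖g‖.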
It is known that the compact limited memory BFGS formula can be used to decrease the computational costs of the calculations q_k^T B_k q_k, d_k^T B_k d_k and Hk gk (see [10, 20]). Besides, it remains positive definite if the curvature condition y_k^T s_k > 0 holds for all previous iterates, where sk = xk+1 − xk and yk = gk+1 − gk. We also increase the trust-region radius more than (1.5) does when the procedure encounters very successful iterates. These changes are expected to lead to an approach needing fewer iterates and function evaluations.
It is clear that traditional trust-region methods and the radius defined in (1.5) require storing the n × n matrix Bk, which demands a large amount of memory, especially for large-scale problems. Hence it is worth exploring a way to avoid storing this matrix. The pioneering work on such methods was proposed by Nocedal [25] and is called the limited memory quasi-Newton method. On the basis of the interesting features of limited memory quasi-Newton methods, they have been widely used in various fields of optimization, see [8, 10, 19, 20, 21, 24, 25] and references therein.
Here, we briefly describe the compact limited memory BFGS formulae proposed by Byrd et al. in [10]. For a positive integer constant m1, let the matrices Sk and Yk be defined as follows:
(2.2) Sk = [sk−m, · · · , sk−1], Yk = [yk−m, · · · , yk−1],
where m = min{k, m1}. Let Dk be the m × m diagonal matrix
(2.3)    Dk = diag[s_{k−m}^T y_{k−m}, · · · , s_{k−1}^T y_{k−1}],
and let Lk be the m × m lower triangular matrix
(2.4)    (Lk)_{i,j} = s_{k−m+i−1}^T y_{k−m+j−1} if i > j, and 0 otherwise.
Then, using relations (2.2)–(2.4), the compact representation of the BFGS formula proposed by Byrd et al. in [10] can be expressed as follows:
(2.5)    Bk = B_k^{(0)} − [Yk, B_k^{(0)} Sk] [ −Dk, L_k^T ; Lk, S_k^T B_k^{(0)} Sk ]^{−1} [ Y_k^T ; S_k^T B_k^{(0)} ],
where the basic matrix B_k^{(0)} is defined as B_k^{(0)} = σk I, for some positive scalar σk. Defining Ak = [Yk, Sk] and following [20], we can easily write
Bk = σk I − Ak [ I, 0 ; 0, σk I ] [ −Dk, L_k^T ; Lk, σk S_k^T Sk ]^{−1} [ I, 0 ; 0, σk I ] A_k^T.
Consequently, using the Cholesky factorization, this formula can be stated as
(2.6)    Bk = σk I − Ak [ I, 0 ; 0, σk I ] [ −D_k^{1/2}, D_k^{−1/2} L_k^T ; 0, J_k^T ]^{−1} [ D_k^{1/2}, 0 ; −Lk D_k^{−1/2}, Jk ]^{−1} [ I, 0 ; 0, σk I ] A_k^T,
where Jk is a lower triangular matrix satisfying Jk J_k^T = σk S_k^T Sk + Lk D_k^{−1} L_k^T. Notice that the matrix Dk is a positive definite diagonal matrix, so D_k^{1/2} exists and is a positive definite diagonal matrix as well. Moreover, from the positive definiteness of Dk, one can conclude that σk S_k^T Sk + Lk D_k^{−1} L_k^T is also positive definite, and therefore the matrix Jk exists. Under the curvature assumption s_k^T yk > 0 for each k, it is not difficult to show that the matrix Bk generated by the formula (2.6) is positive definite. In the rest of the paper, we use (2.6) to approximate the exact Hessian Gk = ∇²f(xk); however, we never form it explicitly.
The direct computation of the term Hk gk in (2.1) is costly, especially when the number of variables is large. Therefore, this term is calculated according to the compact limited memory technique customized for the approximate inverse Hessian Hk. Following [10, 20], the compact limited memory representation of the approximate inverse Hessian Hk can be constructed by
(2.7)    Hk = σ_k^{−1} I + Ak [ 0, σ_k^{−1} I ; R_k^{−T}, 0 ] [ Dk + σ_k^{−1} Y_k^T Yk, −I ; −I, 0 ] [ 0, R_k^{−1} ; σ_k^{−1} I, 0 ] A_k^T,
where
(Rk)_{i,j} = s_{k−m+i−1}^T y_{k−m+j−1} if i ≤ j, and 0 otherwise.
Since the matrix Rk is an m × m upper triangular matrix, for any vector ηk it is easy to compute θk = R_k^{−1} ηk by solving the upper triangular system Rk θk = ηk. Thanks to the updating formula for Hk in (2.7), for an arbitrary vector vk ∈ R^n we can compute the vector Hk vk by the following scheme:
Scheme 1: Calculation of Hk vk based on the formula (2.7)
Step 1. Determine ξ = A_k^T vk and partition it as ξ = [ξ1; ξ2] with ξ1, ξ2 ∈ R^m. Then solve Rk w = ξ2.
Step 2. Set β = (Dk + σ_k^{−1} Y_k^T Yk) w − σ_k^{−1} ξ1 and solve R_k^T γ = β.
Step 3. Set Hk vk = σ_k^{−1} [vk − Yk w] + Sk γ.
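A NumPy sketch of Scheme 1 follows; names are ours, and the result is checked against the inverse Hessian approximation obtained by applying the inverse BFGS update m times to H^(0) = σ^{−1} I. Note that in our derivation from (2.7) the Yk w term enters Step 3 with a minus sign.

```python
import numpy as np

def scheme1_Hv(S, Y, sigma, v):
    """Compute H_k v via Scheme 1 / the compact inverse formula (2.7).

    S, Y hold the m stored pairs s_i, y_i as columns (oldest first);
    sigma defines B^(0) = sigma * I, i.e. H^(0) = I / sigma."""
    SY = S.T @ Y
    R = np.triu(SY)                          # (R)_{ij} = s_i^T y_j for i <= j
    D = np.diag(np.diag(SY))
    xi1, xi2 = Y.T @ v, S.T @ v              # xi = A^T v with A = [Y, S]
    w = np.linalg.solve(R, xi2)              # Step 1: R w = xi2
    beta = (D + (Y.T @ Y) / sigma) @ w - xi1 / sigma
    gamma = np.linalg.solve(R.T, beta)       # Step 2: R^T gamma = beta
    return (v - Y @ w) / sigma + S @ gamma   # Step 3 (note the minus sign)

# Check against m explicit inverse-BFGS updates of H0 = I / sigma.
rng = np.random.default_rng(2)
n, m, sigma = 8, 3, 2.0
S = rng.standard_normal((n, m))
Y = S + 0.1 * rng.standard_normal((n, m))
assert all(S[:, i] @ Y[:, i] > 0 for i in range(m))   # curvature condition

H = np.eye(n) / sigma
for i in range(m):
    s, y = S[:, i], Y[:, i]
    rho = 1.0 / (y @ s)
    V = np.eye(n) - rho * np.outer(y, s)
    H = V.T @ H @ V + rho * np.outer(s, s)

v = rng.standard_normal(n)
assert np.allclose(scheme1_Hv(S, Y, sigma, v), H @ v)
```

In practice the small m × m quantities S^T Y, Y^T Y and R would be updated incrementally rather than recomputed, which is what the operation counts below assume.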
Using the fact that Dk and Rk are diagonal and triangular matrices, respectively, Scheme 1 requires 2mn + m(m+1)/2 multiplications for Step 1, m² + 3m + m(m+1)/2 multiplications for Step 2, and 2mn + n multiplications for Step 3. Thus Scheme 1 requires 4mn + 2m² + 4m + n multiplications in total for computing the term Hk vk.
As mentioned in the implementation of Shi and Guo's algorithm in [33], we also need to calculate the term v_k^T B_k vk, both in the subproblem (1.2) and in the trust-region radius. Direct computation of this quantity is expensive for large-scale problems. As discussed in [10, 20], it can be calculated more efficiently by applying the compact version of the limited memory quasi-Newton updates. Based on the formula (2.6), for an arbitrary vector vk ∈ R^n, we establish an effective scheme to compute v_k^T B_k vk as follows:
Scheme 2: Calculation of v_k^T B_k vk based on the formula (2.6)
Step 1. Compute ξ = A_k^T vk and partition it as ξ = [ξ1; ξ2] with ξ1, ξ2 ∈ R^m.
Step 2. Let ζ = [ζ1; ζ2] = [ξ1; σk ξ2]. Solve the following two triangular systems:
D_k^{1/2} t1 = ζ1    and    Jk t2 = ζ2 + Lk (D_k^{−1/2} t1).
Step 3. Set v_k^T B_k vk = σk v_k^T vk + t_1^T t1 − t_2^T t2.
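Scheme 2 can be sketched as follows; the Cholesky call factors only the m × m matrix σk S_k^T Sk + Lk D_k^{−1} L_k^T, so the cost stays modest for small memory m. Names are ours, and the result is checked against a Bk built by direct BFGS updates.

```python
import numpy as np

def scheme2_vBv(S, Y, sigma, v):
    """Compute v^T B_k v via Scheme 2 / the factored compact formula (2.6)."""
    SY = S.T @ Y
    d = np.diag(SY)                          # diagonal of D_k (curvatures)
    L = np.tril(SY, -1)
    # J J^T = sigma * S^T S + L D^{-1} L^T (Cholesky of an m x m matrix)
    J = np.linalg.cholesky(sigma * (S.T @ S) + L @ (L.T / d[:, None]))
    xi1, xi2 = Y.T @ v, S.T @ v              # Step 1: xi = A^T v
    t1 = xi1 / np.sqrt(d)                    # Step 2: D^{1/2} t1 = zeta_1
    t2 = np.linalg.solve(J, sigma * xi2 + L @ (t1 / np.sqrt(d)))
    return sigma * (v @ v) + t1 @ t1 - t2 @ t2   # Step 3

# Check against B_k built by m direct BFGS updates of B0 = sigma * I.
rng = np.random.default_rng(3)
n, m, sigma = 8, 3, 2.0
S = rng.standard_normal((n, m))
Y = S + 0.1 * rng.standard_normal((n, m))
assert all(S[:, i] @ Y[:, i] > 0 for i in range(m))   # curvature condition

B = sigma * np.eye(n)
for i in range(m):
    s, y = S[:, i], Y[:, i]
    Bs = B @ s
    B = B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

v = rng.standard_normal(n)
assert np.isclose(scheme2_vBv(S, Y, sigma, v), v @ B @ v)
```

Only matrix-vector products with the n × m blocks S and Y touch the dimension n, which is the point of the scheme.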
In Scheme 2, the vector ξ is already available from Step 1 of Scheme 1, so it need not be recomputed. The triangular systems in Step 2 need m² + 3m multiplications. Finally, Step 3 requires only 2(n + m) multiplications. Hence Scheme 2 requires 2n + 2m² + 5m multiplications to compute the scalar v_k^T B_k vk for an arbitrary vector vk.
We here describe the k-th step of our novel scheme. Let us define
βk = −(g_k^T q_k)/(q_k^T B_k q_k) ‖qk‖,
where qk is an arbitrary vector satisfying (1.6), for example qk = −gk or qk = −Hk gk, and Bk is defined by the compact limited memory BFGS formula (2.6). Note that the formula (2.6) yields a positive definite matrix Bk, and consequently q_k^T B_k qk > 0 for any non-zero vector qk. We now define the scalar sk based on βk as follows:
(2.8)    sk := { ‖g0‖     if k = 0;
                 βk       if k ≥ 1 and µ1 ≤ ρ_{k−1} < µ2;
                 c1 βk    if k ≥ 1 and ρ_{k−1} ≥ µ2, }
where c1 > 1 and 0 < µ1 ≤ µ2 ≤ 1. We then compute the adaptive trust-region radius ∆k by the formula
(2.9)    ∆k = c^{p_k} sk,
in which pk is the smallest integer in N ∪ {0} guaranteeing ρk ≥ µ1. In view of this discussion, the new limited memory trust-region algorithm with adaptive radius is outlined in the following:
Algorithm 1: LMATR (limited memory trust-region algorithm with adaptive radius)

Input: x0 ∈ R^n, B0 ∈ R^{n×n}, kmax, 0 < µ1 ≤ µ2 ≤ 1, c ∈ (0, 1), c1 > 1, ε > 0;
Output: xb; fb;
1   begin
2     ∆0 ← ‖g0‖; k ← 0; p ← 0;
3     while ‖gk‖ ≥ ε && k ≤ kmax do
4       compute qk by (2.1) using Scheme 1;
5       compute q_k^T B_k qk using Scheme 2;
6       compute sk using (2.8);
7       solve the subproblem (1.2) to specify dk;
8       xk+1 ← xk + dk; compute f(xk+1);
9       compute d_k^T B_k dk using Scheme 2;
10      determine ρk using (1.3);
11      while ρk < µ1 do
12        p ← p + 1;
13        solve the subproblem (1.2) to specify dk;
14        xk+1 ← xk + dk; compute fk+1 = f(xk+1);
15        determine ρk using (1.3);
16      end
17      m ← min{k, m1};
18      update Sk, Yk, Dk, Lk, Y_k^T Yk and Rk;
19      k ← k + 1;
20    end
21    xb ← xk+1; fb ← fk+1;
22  end
The loop that starts at Line 3 and ends at Line 20 is called the outer loop, and the loop that starts at Line 11 and ends at Line 16 is called the inner loop.
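The radius computation of Lines 6 and 12, i.e., formulas (2.8)-(2.9), can be sketched as follows. The default parameter values are those used later in the experiments (c = 0.2, c1 = 1.55, µ1 = 0.05, µ2 = 0.9); in practice βk would be assembled from Schemes 1 and 2 rather than from explicit products.

```python
import numpy as np

def adaptive_radius(g, q, qBq, p, rho_prev, g0_norm, k,
                    c=0.2, c1=1.55, mu1=0.05, mu2=0.9):
    """Return Delta_k = c**p * s_k per (2.8)-(2.9).

    p is the inner-loop counter p_k; rho_prev is rho_{k-1}, which is
    at least mu1 once iteration k-1 has been accepted."""
    beta = -(g @ q) / qBq * np.linalg.norm(q)   # beta_k as defined above
    if k == 0:
        s = g0_norm
    elif rho_prev >= mu2:
        s = c1 * beta          # very successful: enlarge beyond (1.5)
    else:
        s = beta               # successful (mu1 <= rho_{k-1} < mu2)
    return (c ** p) * s
```

For example, with g = (3, 4), q = −g and q^T B q = 25 (so βk = 5), a successful previous iterate and p = 0 give ∆k = 5, while a very successful previous iterate and p = 1 give ∆k = 0.2 · 1.55 · 5 = 1.55.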
3. Convergence analysis
This section is devoted to analyzing the global convergence of the proposed algorithm. We first give some properties of the algorithm and then investigate its global convergence to first-order stationary points. The local superlinear and quadratic convergence rates of the proposed algorithm are also established.
To establish the global convergence property, we assume that the decrease in the model mk is at least a fraction of that obtained at the Cauchy point, i.e., there exists a constant β ∈ (0, 1) such that, for all k,
(3.1)    mk(xk) − mk(xk + dk) ≥ β ‖gk‖ min[∆k, ‖gk‖/‖Bk‖].
This inequality is called the sufficient reduction condition and has been investigated by many authors in the context of inexact methods for approximately solving the subproblem (1.2); see, for example, [12, 26, 27]. Formula (3.1) implies that dk ≠ 0 whenever gk ≠ 0. Furthermore, throughout the paper, we consider the following two assumptions in order to analyze the convergence properties of Algorithm 1:
(H1) The objective function f(x) is twice continuously differentiable and bounded below on the level set L(x0) = {x ∈ R^n | f(x) ≤ f(x0)}, x0 ∈ R^n.
(H2) The Hessian approximation Bk is uniformly bounded, i.e., there exists a constant M > 0 such that ‖Bk‖ ≤ M, for all k ∈ N ∪ {0}.
Remark 3.1. Suppose that the objective function f(x) is twice continuously differentiable and the level set L(x0) is bounded. Then ‖∇²f(x)‖ is continuous and bounded above on an open bounded convex set Ω containing L(x0). As a result, there exists a constant L > 0 such that ‖∇²f(x)‖ ≤ L, for all x ∈ Ω. Therefore, using the mean value theorem, it can be concluded that, for all x, y ∈ Ω,
‖g(x) − g(y)‖ ≤ L ‖x − y‖,
which means that the gradient g(x) is Lipschitz continuous on the open bounded convex set Ω.
In the next two lemmas, it is proved that the inner loop of Algorithm 1 stops after a finite number of steps.
Lemma 3.2. Suppose that (H2) holds and the sequence {xk} is generated by Algorithm 1. Then, we have
|f(xk + dk) − mk(xk + dk)| ≤ O(‖dk‖²).
Proof. Taylor's expansion together with Remark 3.1 and the definition of mk imply that
|f(xk + dk) − mk(xk + dk)| ≤ |−d_k^T Gk dk + d_k^T Bk dk| + O(‖dk‖²)
= |d_k^T (Bk − Gk) dk| + O(‖dk‖²)
≤ (L + M) ‖dk‖² + O(‖dk‖²)
= O(‖dk‖²),
giving the result. □
Lemma 3.3. Suppose that (H2) holds and the sequence {xk} is generated by Algorithm 1. Then, the inner loop of Algorithm 1 terminates after a finite number of steps.
Proof. By contradiction, assume that an infinite cycle happens in the inner loop of Algorithm 1. Therefore, setting p = pk implies
(3.2)    ∆_k^p → 0, as p → ∞.
Since the current iterate xk is not a stationary point of problem (1.1), there exists a positive constant ε such that ‖gk‖ ≥ ε. This fact and (3.1) imply that
(3.3)    mk(xk) − mk(xk + d_k^p) ≥ β ε min[∆_k^p, ε/M],
where d_k^p is the solution of the subproblem (1.2) corresponding to p ∈ N ∪ {0}. Now, by Lemma 3.2, (3.2) and (3.3), we obtain
| (f(xk) − f(xk + d_k^p)) / (mk(xk) − mk(xk + d_k^p)) − 1 | = | (f(xk + d_k^p) − mk(xk + d_k^p)) / (mk(xk) − mk(xk + d_k^p)) |
≤ O(‖d_k^p‖²) / (β ε min[∆_k^p, ε/M]) ≤ O((∆_k^p)²) / (β ε ∆_k^p) → 0, (p → ∞).
Therefore, there exists a sufficiently large constant p1 such that, for all p ≥ p1, the inequality
ρk = (f(xk) − f(xk + d_k^p)) / (mk(xk) − mk(xk + d_k^p)) ≥ µ1
holds. This indicates that there exists a finite nonnegative integer pk, which contradicts (3.2). Therefore, the inner loop of Algorithm 1 terminates after a finite number of steps. □
In order to establish global convergence to first-order critical points, we require that the following condition holds:
(3.4)    mk(xk) − mk(xk + dk) ≥ (c^{p_k}/(2M)) [ (g_k^T qk)/‖qk‖ ]², for all k ∈ N ∪ {0}.
Hence the subsequent lemma plays an important role in proving the global convergence of the sequence {xk} generated by Algorithm 1.
Lemma 3.4. Suppose that (H2) holds, the sequence {xk} is generated by Algorithm 1, and dk is a solution of the subproblem (1.2). Then (3.4) holds.
12 Ahookhosh et al.
Proof. Set
d̄k = −c^{p_k} (g_k^T qk)/(q_k^T Bk qk) qk.
It is clear that ‖d̄k‖ ≤ ∆k, i.e., d̄k is a feasible point of the subproblem (1.2). On the other hand, from (1.6), we have g_k^T qk < 0. Consequently, using these facts and the assumption (H2), we obtain
mk(xk) − mk(xk + dk) ≥ mk(xk) − mk(xk + d̄k) = −g_k^T d̄k − (1/2) d̄_k^T Bk d̄k
= c^{p_k} (g_k^T qk)²/(q_k^T Bk qk) [1 − (1/2) c^{p_k}]
≥ c^{p_k} (g_k^T qk)²/(q_k^T Bk qk) [1 − 1/2]
= (1/2) c^{p_k} (g_k^T qk)²/(q_k^T Bk qk)
≥ (c^{p_k}/(2M)) [ (g_k^T qk)/‖qk‖ ]²,
where the last inequality follows from q_k^T Bk qk ≤ M ‖qk‖², completing the proof. □
We now establish the global convergence of Algorithm 1 to first-order stationary points under the stated assumptions.
Theorem 3.5. Suppose that (H1) and (H2) hold and qk satisfies (1.6). Then Algorithm 1 either stops at a stationary point of (1.1) or generates an infinite sequence {xk} such that
(3.5)    lim_{k→∞} ‖gk‖ = 0.
Proof. If Algorithm 1 stops at a stationary point of the problem (1.1), there is nothing to prove. Otherwise, we first show that
(3.6)    lim_{k→∞} −(g_k^T qk)/‖qk‖ = 0.
By contradiction, suppose that there exist a constant ε0 > 0 and an infinite subset K ⊆ N ∪ {0} such that
(3.7)    −(g_k^T qk)/‖qk‖ ≥ ε0, for all k ∈ K.
Lemma 3.4 and (3.7) imply that
fk − f(xk + dk) ≥ µ1 (mk(xk) − mk(xk + dk)) ≥ (µ1/(2M)) c^{p_k} [ (g_k^T qk)/‖qk‖ ]² ≥ (µ1/(2M)) c^{p_k} ε0², for all k ∈ K.
Summing this inequality over k ∈ K and using assumption (H1), we get
(µ1 ε0²/(2M)) Σ_{k∈K} c^{p_k} ≤ Σ_{k=0}^{∞} (fk − fk+1) = f0 − lim_{k→∞} f(xk) < ∞.
This fact leads to
lim_{k→∞, k∈K} c^{p_k} = 0,
which clearly implies that pk → ∞ as k → ∞ in K, a contradiction with Lemma 3.3. Therefore, (3.6) holds.
Using (3.6) and the fact that the vector qk satisfies (1.6), we obtain
0 ≤ τ ‖gk‖ ≤ (−(g_k^T qk)/(‖gk‖ ‖qk‖)) ‖gk‖ = −(g_k^T qk)/‖qk‖ → 0, (k → ∞).
Therefore, (3.5) holds and the proof is completed.
Theorem 3.5 is the key step in the convergence analysis of Algorithm 1. It implies that, if the sequence {xk} generated by Algorithm 1 has limit points, then all of the limit points satisfy the first-order necessary condition.
In what follows, we verify the superlinear and quadratic convergence rates of Algorithm 1 under some classical conditions that have been widely used in the nonlinear optimization literature, see [26].
Theorem 3.6. Suppose that assumptions (H1) and (H2) hold, the sequence {xk} generated by Algorithm 1 converges to x∗, dk = −B_k^{−1} gk, the matrix G(x) = ∇²f(x) is continuous in a neighborhood N(x∗, ε) of x∗, and Bk satisfies the condition
lim_{k→∞} ‖[Bk − G(x∗)] dk‖ / ‖dk‖ = 0.
Then the sequence {xk} converges to x∗ superlinearly. Moreover, if Bk = G(xk) and G(x) is Lipschitz continuous in a neighborhood N(x∗, ε), then the sequence {xk} converges to x∗ quadratically.
Proof. The proof is similar to those of Theorems 4.3 and 4.4 in [33], and therefore the details are omitted. □
4. Preliminary numerical experiments
In this section, we report some numerical results of LMATR on a large set of standard test problems taken from [5, 23], in which the problem dimensions vary from 500 to 10000. To show the efficiency of LMATR, we compare its results with those obtained by the limited memory quasi-Newton algorithm (LMQNM) proposed on page 48 of [26], the limited memory version of the traditional trust-region algorithm (LMTTR), and the limited memory version of the adaptive trust-region algorithm of Shi and Guo [33] (LMTRS) with qk = −Hk gk.
All of these algorithms take advantage of the compact limited memory BFGS with m1 = 5, rewritten in MATLAB from the L-BFGS-B Fortran code of [37], which is based on [9, 22, 36]. Furthermore, all the algorithms are coded in double precision in MATLAB 7.4, and the trust-region subproblems are solved by a modified Steihaug-Toint procedure (see [12]) that employs the compact limited memory BFGS. Similar to Bastin et al. [7], the Steihaug-Toint algorithm terminates at xk + d when
‖∇f(xk) + Bk d‖ ≤ min[1/10, ‖gk‖^{1/2}] ‖gk‖    or    ‖d‖ = ∆k
holds. Moreover, all the algorithms are stopped whenever the total number of iteratesexceeds 20000 or the condition
‖gk‖ ≤ 10^{−5}
is satisfied. We set µ1 = 0.05 and µ2 = 0.9 for the algorithms LMATR, LMTTR and LMTRS. Besides, we use c = 0.2 and c1 = 1.55 for LMATR, and the parameters of LMTRS are chosen the same as those reported in [33]. Similar to [12], LMTTR employs ∆0 = ‖g0‖/10 and updates the trust-region radius by
∆k+1 = { α1 ‖dk‖          if ρk < µ1;
         ∆k               if µ1 ≤ ρk < µ2;
         max{α2 ‖dk‖, ∆k} if ρk ≥ µ2, }
where α1 = 0.25 and α2 = 3.5. During the runs of the algorithms, we verified that all of the codes converge to the same point; therefore, we only provide results for which all the algorithms converged to an identical point. The results are summarized in Table 1.
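For reference, a simplified sketch of a Steihaug-Toint truncated CG solver with the stopping test quoted above is given below. The paper uses a modified variant from [12] combined with the compact limited memory BFGS; this standard version only assumes access to Hessian-vector products Bv, and all names are ours.

```python
import numpy as np

def steihaug_toint(Bv, g, delta, max_iter=200):
    """Truncated CG for min g^T d + 0.5 d^T B d subject to ||d|| <= delta.

    Bv(p) returns B @ p, so B never needs to be formed explicitly."""
    d = np.zeros_like(g)
    r = g.copy()                       # gradient of the model at d
    p = -r
    tol = min(0.1, np.sqrt(np.linalg.norm(g))) * np.linalg.norm(g)
    if np.linalg.norm(r) <= tol:
        return d
    for _ in range(max_iter):
        Bp = Bv(p)
        pBp = p @ Bp
        if pBp <= 0:                   # negative curvature: go to the boundary
            return d + _tau_to_boundary(d, p, delta) * p
        alpha = (r @ r) / pBp
        if np.linalg.norm(d + alpha * p) >= delta:
            return d + _tau_to_boundary(d, p, delta) * p
        d = d + alpha * p
        r_new = r + alpha * Bp
        if np.linalg.norm(r_new) <= tol:
            return d
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d

def _tau_to_boundary(d, p, delta):
    """Positive root of ||d + tau * p||^2 = delta^2."""
    a, b, c = p @ p, 2.0 * (d @ p), d @ d - delta ** 2
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
```

For a well-scaled positive definite model and a large radius, this reduces to plain CG on B d = −g; when the radius is binding, the returned step lies exactly on the trust-region boundary.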
Thanks to the structure of Algorithm 1 (LMATR) and the other presented algorithms, the total number of iterates, Ni, is identical to the total number of gradient evaluations, Ng. Therefore, in Table 1, we only report the number of iterates Ni, the number of function evaluations Nf and the running time T as efficiency measures. From Table 1, it can be seen that, in most cases, both the number of iterates and the number of function evaluations of Algorithm 1 are remarkably smaller than those of the other algorithms. Although the proposed algorithm is not the best on some problems, it has better overall computational performance.
We also take advantage of the performance profiles of Dolan and Moré [13], a statistical tool for comparing the efficiency of algorithms; we illustrate the results of Table 1 in Figures 1-4 according to the total number of iterates, the total number of function evaluations, the measure Nf + 3Ni and the running time, respectively.
From Figure 1, it is observed that LMATR attains the most wins among the considered algorithms; more precisely, it solves about 51% of the test functions more efficiently and faster than the others. We also see that LMTRS performs better than LMQNM and LMTTR with regard to the total number of iterates. Moreover, considering the ability to complete the run successfully, LMATR is the best algorithm, as its profile grows faster than those of the other algorithms; this means that whenever the proposed algorithm is not the best, it performs close to the best. Figure 2 shows that LMATR, LMQNM and LMTTR are very competitive with respect to the total number of function evaluations, and they all perform better than LMTRS; furthermore, LMATR has the most wins, in about 40% of the test functions. We also use the efficiency measure Nf + 3Ni in Figure 3, which shows that LMATR outperforms the others. The last comparison concerns the running time, where LMQNM has the better performance. Our preliminary computational experiments show that LMATR is promising for solving large-scale unconstrained optimization problems.
[Figure 1: performance profiles of LMQNM, LMTTR, LMTRS and LMATR; panel (a) τ ≤ 7, panel (b) τ ≤ 2.]
Figure 1. Performance profile for the number of iterates
[Figure 2: performance profiles of LMQNM, LMTTR, LMTRS and LMATR; panel (a) τ ≤ 7, panel (b) τ ≤ 2.]
Figure 2. Performance profile for the number of function evaluations
[Figure 3: performance profiles of LMQNM, LMTTR, LMTRS and LMATR; panel (a) τ ≤ 7, panel (b) τ ≤ 2.]
Figure 3. Performance profile for Nf + 3Ni
[Figure 4: performance profiles of LMQNM, LMTTR, LMTRS and LMATR; panel (a) τ ≤ 10, panel (b) τ ≤ 5.]
Figure 4. Performance profile for the running time
Table 1. Numerical results (each cell reports Ni/Nf/T: iterations / function evaluations / running time)

Problem name        Dim     LMQNM               LMTTR               LMTRS               LMATR
POWER               500     4273/4397/6.28      6515/6860/23.89     6707/8061/31.31     6164/6543/26.52
Hager               500     35/40/0.61          36/41/0.12          38/49/0.22          38/49/0.16
Raydan 1            500     165/176/0.99        146/155/0.52        144/178/1.00        143/157/1.29
G.W.& Holst         500     6423/9262/20.60     6390/8445/26.18     5764/15836/49.57    5577/9099/48.69
BIGGSB1             500     1409/1449/4.02      1368/1447/5.10      1365/1668/8.39      1521/1598/9.74
G.Rosenbrock        1000    6892/9686/25.14     6248/8041/37.78     5983/13247/61.96    5735/8362/53.27
E.W.& Holst         1000    76/111/0.70         66/100/0.33         46/123/0.74         57/94/0.72
Partial p.quad.     1000    188/217/3.83        170/193/2.73        169/229/3.68        164/187/3.41
DIXON3DQ            1000    5327/5671/35.12     4692/4971/20.73     4849/6009/32.72     5156/5494/34.16
*FLETCHCR           1000    5333/6271/19.72     5814/6423/34.14     7173/9008/61.24     4687/5559/39.39
TRIDIA              1000    871/895/1.31        868/924/3.62        790/920/5.81        631/669/3.55
Quad.QF2            1000    1322/1681/4.91      289/308/1.84        278/342/3.25        273/289/2.61
VARDIM              1000    77/112/0.37         81/114/0.39         84/245/1.03         81/111/0.57
ENGVAL1             1000    25/32/0.08          22/27/0.25          23/42/0.40          20/26/0.21
Diagonal 2          1000    145/160/1.00        150/162/0.84        140/167/1.40        141/164/1.49
E.tridiag. 2        1000    27/31/0.45          29/32/0.15          28/43/0.41          27/30/0.42
E.Penalty           1000    69/71/0.55          61/76/0.27          62/140/0.61         56/71/0.50
ARWHEAD             1000    14/16/0.12          9/15/0.04           9/48/0.13           10/17/0.06
EG2                 1000    153/241/0.86        26/39/0.39          21/101/0.69         20/29/0.24
CUBE                1000    1263/2081/16.76     1903/2235/18.17     1646/6606/39.67     1280/2328/17.45
E.Maratos           1000    153/232/0.88        649/689/2.77        112/423/0.92        105/217/0.74
E.Powell            5000    69/77/2.09          71/94/4.04          49/198/6.13         64/112/4.74
E.Wood              5000    178/236/0.51        50/73/2.20          34/102/2.89         49/77/2.68
P.quad.             5000    597/623/17.46       608/637/38.57       575/702/41.55       592/631/38.45
P.tridiag.quad.     5000    557/583/14.63       635/676/33.47       591/693/39.91       555/595/34.29
Broyden tridiag.    5000    68/87/1.17          52/58/2.73          51/74/3.67          33/40/2.07
Almost p.quad.      5000    580/597/16.87       629/661/37.83       599/710/40.29       592/623/37.74
NONDQUAR            5000    1834/2036/16.31     2096/2542/82.08     2292/3949/143.37    2155/2746/112.47
EDENSCH             5000    26/30/0.16          20/25/0.69          21/39/1.17          21/24/0.71
Quad.QF1            5000    587/604/15.33       607/633/31.43       588/696/37.46       598/640/37.64
E.Fre.and Roth      5000    20/23/0.66          18/23/0.74          17/45/0.72          19/26/0.90
E.tridiag. 1        5000    27/31/0.79          28/31/1.12          24/44/1.53          32/38/1.52
BDEXP               5000    31/32/0.40          36/37/1.00          31/32/0.96          31/32/0.95
HIMMELBG            5000    2/3/0.02            37/38/1.37          2/3/0.07            2/3/0.06
QUARTC              5000    31/32/1.29          31/32/1.51          28/32/1.75          26/28/1.29
LIARWHD             5000    34/42/0.93          39/55/2.14          33/96/3.07          31/40/1.65
E.PSC1              5000    21/24/0.43          12/14/0.46          12/26/0.79          14/18/0.60
E.BD1               5000    14/17/0.40          15/19/0.76          14/17/0.94          14/17/0.71
E.quad.p.QP1        5000    34/36/1.62          28/35/2.29          27/61/3.43          24/31/2.19
E.DENSCHNB          5000    10/12/0.27          10/11/0.51          9/13/0.70           9/11/0.55
E.DENSCHNF          5000    16/18/0.46          13/18/0.87          14/38/1.90          13/22/1.04
SINCOS              5000    21/24/0.89          12/14/0.90          12/26/1.25          14/18/1.04
COSINE              5000    17/20/0.16          14/17/0.78          15/26/1.53          15/18/0.75
Raydan 2            5000    8/9/0.30            7/8/0.50            8/9/0.68            8/9/0.71
Diag. 4             5000    4/6/0.08            6/9/0.41            8/23/0.99           7/11/0.46
Diag. 7             5000    11/14/0.44          7/8/0.47            7/9/0.56            7/10/0.53
Diag. 8             5000    6/8/0.24            6/7/0.45            7/11/0.57           5/7/0.32
NONSCOMP            5000    52/60/1.94          41/50/2.77          636/2466/127.13     34/40/2.55
DIXMAANE            6000    332/349/23.57       363/386/46.97       288/339/44.67       336/360/47.96
DIXMAANF            6000    257/268/16.93       252/262/46.42       209/241/43.36       245/259/35.67
DIXMAANG            6000    240/247/22.79       325/349/47.25       225/267/28.19       289/307/35.89
DIXMAANI            6000    2417/2478/150.34    2568/2744/240.75    811/965/99.72       3157/3376/354.40
DIXMAANH            9000    498/586/82.93       249/258/71.60       233/278/89.83       268/283/82.25
DIXMAANA            9000    8/11/1.36           9/10/2.42           9/17/3.36           10/13/3.33
DIXMAANB            9000    7/10/2.19           7/8/3.13            8/16/6.74           9/12/4.71
DIXMAANC            9000    9/12/4.51           10/12/5.64          10/20/5.95          9/12/5.63
DIXMAAND            9000    10/14/6.53          9/11/6.09           11/24/9.81          13/17/7.76
DIXMAANJ            9000    284/298/86.18       227/236/103.86      447/513/205.29      388/406/167.55
DIXMAANK            9000    376/382/112.50      403/414/161.14      414/458/188.66      349/364/135.36
DIXMAANL            9000    1656/1714/461.84    353/364/178.31      330/384/156.68      316/326/142.45
DQDRTIC             10000   10/14/0.91          18/30/3.59          15/33/5.15          14/23/2.90
Diag. 5             10000   5/6/0.39            7/8/1.07            5/6/0.80            5/6/0.78
NONDIA              10000   11/13/0.95          9/19/2.02           11/61/5.37          9/19/2.18
E.TET               10000   10/12/0.46          10/11/1.17          10/18/1.86          9/12/1.32
E.Beale             10000   18/20/1.07          18/19/2.18          16/27/3.05          19/23/2.66
Full Hessian FH3    10000   4/6/0.33            4/11/1.06           4/35/3.36           4/11/1.09
G.PSC1              10000   44/46/0.28          38/41/3.29          39/56/5.23          33/38/3.33
E.Himmelblau        10000   13/16/0.60          11/14/2.36          12/24/2.50          10/13/1.48
E.quad.e.EP1        10000   6/8/0.26            4/8/0.67            4/25/1.96           4/9/0.79
5. Concluding remarks
We have presented an iterative scheme for solving large-scale unconstrained optimization problems, based on a trust-region framework equipped with an adaptive radius and the compact limited memory BFGS formula. As is well known, an appropriate adaptive radius can decrease the total number of trust-region subproblems that must be solved. We described some disadvantages of the adaptive trust-region radius of Shi and Guo [33], especially for large problems, and proposed some modifications of this radius to overcome those drawbacks. Limited memory quasi-Newton schemes, in turn, were developed precisely to cope with large-scale optimization problems. We therefore unified these two ideas into a single trust-region algorithm that decreases the computational cost compared with the traditional trust-region framework.
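As a companion to this summary, the compact limited memory BFGS representation of Byrd, Nocedal and Schnabel [10] that the algorithm builds on can be sketched in a few lines. This is an illustrative reimplementation under our own naming (not the authors' code), with γ taken as the common y^T y / s^T y scaling:

```python
import numpy as np

def compact_lbfgs_matrix(S, Y, gamma):
    """Compact L-BFGS representation (Byrd-Nocedal-Schnabel):
    B = gamma*I - W M^{-1} W^T, where the columns of S and Y are the m
    stored pairs (s_i, y_i), W = [gamma*S, Y], and
    M = [[gamma*S^T S, L], [L^T, -D]] with L strictly lower triangular."""
    n, m = S.shape
    SY = S.T @ Y
    D = np.diag(np.diag(SY))        # D_ii = s_i^T y_i
    L = np.tril(SY, k=-1)           # L_ij = s_i^T y_j for i > j
    W = np.hstack([gamma * S, Y])   # n x 2m
    M = np.block([[gamma * (S.T @ S), L],
                  [L.T,              -D]])
    return gamma * np.eye(n) - W @ np.linalg.solve(M, W.T)

def bfgs_update(B, s, y):
    """One standard BFGS update of B with the pair (s, y)."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# check: the compact form equals m successive BFGS updates of B0 = gamma*I
rng = np.random.default_rng(0)
n, m = 6, 3
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)         # SPD, so s_i^T y_i = s_i^T A s_i > 0
S = rng.standard_normal((n, m))
Y = A @ S
gamma = (Y[:, -1] @ Y[:, -1]) / (S[:, -1] @ Y[:, -1])
B = gamma * np.eye(n)
for i in range(m):
    B = bfgs_update(B, S[:, i], Y[:, i])
B_compact = compact_lbfgs_matrix(S, Y, gamma)
```

The small 2m × 2m matrix M is what makes the approach suitable for large n: with a modest memory m (typically 5–10), B never needs to be formed explicitly; forming it densely above is only for the correctness check.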
From a theoretical point of view, the proposed algorithm inherits the global convergence of traditional trust-region algorithms to first-order stationary points under classical assumptions; superlinear and quadratic convergence rates were also established under suitable conditions. Finally, our preliminary numerical experiments on a set of standard test problems indicate that the proposed algorithm is efficient for solving large-scale unconstrained optimization problems.
Acknowledgements. The authors are grateful to the anonymous referees for their valuable comments and suggestions.
References
[1] M. Ahookhosh, K. Amini, A nonmonotone trust region method with adaptive radius for unconstrained optimization problems, Computers and Mathematics with Applications. 60 (2010) 411–422.
[2] M. Ahookhosh, H. Esmaeili, M. Kimiaei, An effective trust-region-based approach for symmetric nonlinear systems, International Journal of Computer Mathematics. 90 (2013) 671–690.
[3] K. Amini, M. Ahookhosh, A hybrid of adjustable trust-region and nonmonotone algorithms for unconstrained optimization, Applied Mathematical Modelling. 38 (2014) 2601–2612.
[4] K. Amini, M. Ahookhosh, Combination adaptive trust region method by non-monotone strategy for unconstrained nonlinear programming, Asia-Pacific Journal of Operational Research. 28(5) (2011) 585–600.
[5] N. Andrei, An unconstrained optimization test functions collection, Advanced Modeling and Optimization. 10(1) (2008) 147–161.
[6] D. Ataee Tarzanagh, M.R. Peyghami, H. Mesgarani, A new nonmonotone trust region method for unconstrained optimization equipped by an efficient adaptive radius, Optimization Methods and Software. 29(4) (2014) 819–836.
[7] F. Bastin, V. Malmedy, M. Mouffe, Ph.L. Toint, D. Tomanos, A retrospective trust-region method for unconstrained optimization, Mathematical Programming. 123(2) (2008) 395–418.
[8] J.V. Burke, A. Wiegmann, L. Xu, Limited memory BFGS updating in a trust-region framework, SIAM Journal on Optimization. submitted, 2008.
[9] R.H. Byrd, P. Lu, J. Nocedal, A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific and Statistical Computing. 16(5) (1995) 1190–1208.
[10] R. Byrd, J. Nocedal, R. Schnabel, Representation of quasi-Newton matrices and their use in limited memory methods, Mathematical Programming. 63 (1994) 129–156.
[11] A.R. Conn, N.I.M. Gould, Ph.L. Toint, LANCELOT: a Fortran package for large-scale nonlinear optimization (Release A). (Springer Series in Computational Mathematics, No. 17) Springer, Berlin Heidelberg New York, 1992.
[12] A.R. Conn, N.I.M. Gould, Ph.L. Toint, Trust-Region Methods. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 2000.
[13] E.D. Dolan, J.J. Moré, Benchmarking optimization software with performance profiles, Mathematical Programming. 91 (2002) 201–213.
[14] H. Esmaeili, M. Kimiaei, An efficient adaptive trust-region method for systems of nonlinear equations, International Journal of Computer Mathematics, (2014), http://dx.doi.org/10.1080/00207160.2014.887701.
[15] H. Esmaeili, M. Kimiaei, A new adaptive trust-region method for system of nonlinear equations, Applied Mathematical Modelling. 38(11–12) (2014) 3003–3015.
[16] R. Fletcher, Practical Methods of Optimization, Wiley, New York, 2000.
[17] N. Gould, S. Lucidi, M. Roma, Ph.L. Toint, Solving the trust-region subproblem using the Lanczos method, SIAM Journal on Optimization. 9(2) (1999) 504–525.
[18] N.I.M. Gould, D. Orban, A. Sartenaer, Ph.L. Toint, Sensitivity of trust region algorithms to their parameters, 4OR. A Quarterly Journal of Operations Research. 3 (2005) 227–241.
[19] D.C. Liu, J. Nocedal, On the limited memory BFGS method for large scale optimization, Mathematical Programming. 45 (1989) 503–528.
[20] L. Kaufman, Reduced storage, quasi-Newton trust region approaches to function optimization, SIAM Journal on Optimization. 10(1) (1999) 56–69.
[21] J.L. Morales, J. Nocedal, Enriched methods for large-scale unconstrained optimization, Computational Optimization and Applications. 21 (2002) 143–154.
[22] J.L. Morales, J. Nocedal, L-BFGS-B: Remark on Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization, to appear in ACM Transactions on Mathematical Software, 2011.
[23] J.J. Moré, B.S. Garbow, K.E. Hillstrom, Testing unconstrained optimization software, ACM Transactions on Mathematical Software. 7 (1981) 17–41.
[24] S.G. Nash, J. Nocedal, A numerical study of the limited memory BFGS method and the truncated Newton method for large scale optimization, SIAM Journal on Optimization. 1 (1991) 358–372.
[25] J. Nocedal, Updating quasi-Newton matrices with limited storage, Mathematics of Computation. 35 (1980) 773–782.
[26] J. Nocedal, S.J. Wright, Numerical Optimization. Springer, New York, 2006.
[27] J. Nocedal, Y. Yuan, Combining trust region and line search techniques, in: Y. Yuan (Ed.), Advances in Nonlinear Programming. Kluwer Academic Publishers, Dordrecht, (1996) 153–175.
[28] M.J.D. Powell, A new algorithm for unconstrained optimization, in: J.B. Rosen, O.L. Mangasarian, K. Ritter (Eds.), Nonlinear Programming. Academic Press, London, (1970) 31–65.
[29] M.J.D. Powell, On the global convergence of trust region algorithms for unconstrained optimization, Mathematical Programming. 29 (1984) 297–303.
[30] A. Sartenaer, Automatic determination of an initial trust region in nonlinear programming, SIAM Journal on Scientific Computing. 18(6) (1997) 1788–1803.
[31] R.B. Schnabel, E. Eskow, A new modified Cholesky factorization, SIAM Journal on Scientific Computing. 11(6) (1990) 1136–1158.
[32] G.A. Schultz, R.B. Schnabel, R.H. Byrd, A family of trust-region-based algorithms for unconstrainedminimization with strong global convergence, SIAM Journal on Numerical Analysis. 22 (1985) 47–67.
[33] Z.J. Shi, J.H. Guo, A new trust region method with adaptive radius, Computational Optimization andApplications, 213 (2008) 509–520.
[34] T. Steihaug, The conjugate gradient method and trust regions in large scale optimization, SIAMJournal on Numerical Analysis. 20(3) (1983) 626–637.
[35] X.S. Zhang, J.L. Zhang, L.Z. Liao, An adaptive trust region method and its convergence, Science inChina. 45 (2002) 620–631.
[36] C. Zhu, R.H. Byrd, J. Nocedal, L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for largescale bound constrained optimization, ACM Transactions on Mathematical Software. 23(4) (1997)550–560.
[37] http://users.eecs.northwestern.edu/~nocedal/lbfgsb.html
Masoud Ahookhosh
Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090 Vienna, Austria
Email: [email protected]

Keyvan Amini
Department of Mathematics, Razi University, Kermanshah, Iran
Email: [email protected]

Morteza Kimiaei
Department of Mathematics, Asadabad Branch, Islamic Azad University, Asadabad, Iran
Email: [email protected]

M. Reza Peyghami
Department of Mathematics, K.N. Toosi University of Technology, P.O. Box 16315-1618, Tehran, Iran
Email: [email protected]