University of Calgary
PRISM: University of Calgary's Digital Repository
Graduate Studies Legacy Theses
1998
Analysis and modification of Newton's method for
algebraic riccati equations
Guo, Chun-Hua
Guo, C. (1998). Analysis and modification of Newton's method for algebraic riccati equations
(Unpublished doctoral thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/21874
http://hdl.handle.net/1880/26229
doctoral thesis
University of Calgary graduate students retain copyright ownership and moral rights for their
thesis. You may use this material in any way that is permitted by the Copyright Act or through
licensing that has been assigned to the document. For uses that are not allowable under
copyright legislation or licensing, you are required to seek permission.
Downloaded from PRISM: https://prism.ucalgary.ca
THE UNIVERSITY OF CALGARY
Analysis and Modification of Newton's Method
for Algebraic Riccati Equations
Chun-Hua Guo
A DISSERTATION
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF MATHEMATICS AND STATISTICS
CALGARY, ALBERTA
JANUARY, 1998
© Chun-Hua Guo 1998
National Library of Canada / Bibliothèque nationale du Canada
Acquisitions and Bibliographic Services / Acquisitions et services bibliographiques
395 Wellington Street, Ottawa ON K1A 0N4, Canada
The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.
The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.
Abstract
When Newton's method is applied to find the maximal Hermitian solution of an
algebraic Riccati equation, convergence can be guaranteed under moderate conditions.
In particular, the initial guess does not need to be close to the solution. The
convergence is quadratic if the Fréchet derivative of the Riccati function is invertible
at the solution. In this dissertation, our emphasis is on the behaviour of the Newton
iteration when the derivative is not invertible at the solution. For the continuous
algebraic Riccati equation, the derivative is not invertible at the solution if and only
if the closed-loop matrix has eigenvalues on the imaginary axis. The convergence
of Newton's method is shown to be either quadratic or linear with common ratio
1/2, provided that the eigenvalues on the imaginary axis are all semisimple. Information
on these eigenvalues can be obtained from the corresponding Hamiltonian
matrix. Linear convergence appears to be dominant, and the efficiency of the Newton
iteration can be improved dramatically by applying a double Newton step at the
right time. The convergence behaviour of Newton's method is conjectured when the
closed-loop matrix has non-semisimple eigenvalues on the imaginary axis. Similar
results are established for the discrete algebraic Riccati equation. A matrix equation
that has been studied extensively turns out to be a special discrete algebraic Riccati
equation. Newton's method and some other methods are studied for this equation
whenever it has a Hermitian positive definite solution.
Acknowledgements
I wish to thank my supervisor Professor Peter Lancaster for his guidance. Thanks
are also due to Professors P. A. Binding and L. P. Bos for several helpful discussions.
I wish to express my gratitude for the financial support I have received from The
University of Calgary and from research grants of Professors P. Lancaster, P. A.
Binding and R. Torrence. Finally, I thank my wife Min for her maximal support.
Table of Contents
Approval Page
Abstract
Table of Contents
1 Introduction
2 Newton's Method in Banach Spaces
2.1 Some basic concepts and facts in functional analysis
2.2 Newton's method in Banach spaces
3 Newton's Method for Continuous Algebraic Riccati Equations
3.1 Preliminaries
3.2 Interpretation of the direct sum condition for the CARE
3.3 Characterization of the direct sum condition by the Hamiltonian matrix
3.4 Convergence rate of the Newton method
3.5 A modified Newton method
3.6 Numerical results
4 Examples and Conjectures
4.1 Examples
4.2 Conjectures
5 Newton's Method for Discrete Algebraic Riccati Equations
5.1 Preliminaries
5.2 Interpretation of the direct sum condition for the DARE
5.3 Characterization of the direct sum condition via a matrix pencil
5.4 Convergence rate of the Newton method
5.5 Using the double Newton step
5.6 Numerical results
6 A Special Discrete Algebraic Riccati Equation
6.1 Introduction
6.2 Basic fixed point iteration
6.3 Inversion free variant of the basic fixed point iteration
6.4 Newton's method
6.5 Numerical results
7 Some Concluding Remarks
Bibliography
List of Tables
3.1 Performance of Algorithm 3.24 for Example 3.2
3.2 Performance of Algorithm 3.24 for Example 3.3
3.3 Performance of Algorithm 3.24 for Example 3.4
5.1 Performance of the modified Newton method for Example 5.3
5.2 Performance of the modified Newton method for Example 5.4
Chapter 1
Introduction
In this dissertation, we are concerned with the numerical solution of algebraic Riccati
equations. The continuous algebraic Riccati equation (CARE) we will consider is of
the form
\[ XDX - XA - A^*X - C = 0, \tag{1.1} \]
where $A, D, C \in \mathbb{C}^{n\times n}$, and $D^* = D$, $C^* = C$. The discrete algebraic Riccati
equation (DARE) we will consider is of the form
\[ -X + A^*XA + Q - (C + B^*XA)^*(R + B^*XB)^{-1}(C + B^*XA) = 0, \tag{1.2} \]
where $A, Q \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $C \in \mathbb{C}^{m\times n}$, $R \in \mathbb{C}^{m\times m}$, and $Q^* = Q$, $R^* = R$. These
equations appear in several problem areas including linear-quadratic control, filtering
theory and $H_\infty$-optimal control. See, for example, [35, 39, 41, 42].
Hermitian solutions are generally required for practical reasons. For the CARE
(1.1), the Hermitian solution $X$ for which the matrix $A - DX$ (known as the "closed
loop" matrix) has all its eigenvalues in the closed left half-plane is of particular
interest. Such a solution is unique under some mild conditions on the matrices
$A, D, C$ appearing in (1.1). For the DARE (1.2), the Hermitian solution $X$ for which
the closed loop matrix $A - B(R + B^*XB)^{-1}(C + B^*XA)$ has all its eigenvalues in
the closed unit disk is of particular interest. Again, such a solution is unique under
mild assumptions on the matrices $A, B, C, Q, R$ appearing in (1.2). These desired
solutions (for both CARE and DARE) are called almost stabilizing. They are called
stabilizing if the word "closed" is replaced by "open".
The order relation on the set of Hermitian matrices is the usual one: $X \ge Y$ ($X > Y$)
if $X - Y$ is positive semidefinite (definite). For $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, the pair
$(A, B)$ is said to be stabilizable (resp. d-stabilizable) if there is a $K \in \mathbb{C}^{m\times n}$ such
that $A - BK$ is stable (resp. d-stable), i.e., all its eigenvalues are in the open left
half-plane (resp. open unit disk).
We will briefly explain how equations of the type (1.1) or (1.2) appear in linear-
quadratic control problems. See [41] for more details.
We first consider the continuous time linear-quadratic control problem:
Minimize
subject to the dynamics
where $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $P \in \mathbb{C}^{p\times n}$, $Q \in \mathbb{C}^{p\times p}$, $R \in \mathbb{C}^{m\times m}$, $x(t) \in \mathbb{C}^n$, $y(t) \in \mathbb{C}^p$,
$u(t) \in \mathbb{C}^m$, $Q^* = Q$, $R^* = R$.
If, for example, $Q \ge 0$, $R > 0$, and the pairs $(A, B)$, $(A^*, P^*)$ are stabilizable,
then the solution of the above optimal control problem is given by the feedback law
where $X$ is the unique stabilizing, positive semidefinite solution of (1.1) with
The vector $x(t)$ in (1.3) can be found from the closed loop system
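For orientation, one standard way to write the continuous time problem that is consistent with the dimensions listed above is sketched here. The factor $\tfrac12$, the initial state $x_0$, the output relation $y = Px$, and the identification $D = BR^{-1}B^*$, $C = P^*QP$ are assumptions in the style of [41], not a quotation of the displays of this text.

```latex
% Assumed standard form; symbols x_0 and the factor 1/2 are illustrative choices.
\begin{align*}
&\text{Minimize}\quad \tfrac12 \int_0^\infty \bigl( y(t)^* Q\, y(t) + u(t)^* R\, u(t) \bigr)\, dt \\
&\text{subject to}\quad \dot x(t) = A x(t) + B u(t), \quad x(0) = x_0, \quad y(t) = P x(t); \\
&\text{feedback law:}\quad u(t) = -R^{-1} B^* X x(t); \\
&\text{coefficients in (1.1):}\quad D = B R^{-1} B^*, \quad C = P^* Q P; \\
&\text{closed loop system:}\quad \dot x(t) = (A - B R^{-1} B^* X)\, x(t), \quad x(0) = x_0 .
\end{align*}
```

With these choices the closed loop matrix $A - BR^{-1}B^*X$ is exactly $A - DX$ in the notation of (1.1).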
We now consider the discrete time linear-quadratic control problem:
Minimize
subject to the difference equation
where $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $P \in \mathbb{C}^{p\times n}$, $V \in \mathbb{C}^{p\times p}$, $R \in \mathbb{C}^{m\times m}$, $V^* = V$, $R^* = R$.
If, for example, $V \ge 0$, $R > 0$, and the pairs $(A, B)$ and $(A^*, P^*)$ are d-stabilizable,
then the solution of the above discrete time optimal control problem is given by the
feedback law
\[ u_k = -(R + B^*XB)^{-1}B^*XA\,x_k, \tag{1.4} \]
where $X$ is the unique stabilizing, positive semidefinite solution of (1.2) with $Q = P^*VP$,
$C = 0$. The vectors $x_k$ in (1.4) are then determined by the difference equation
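As with the continuous time case, a standard discrete time formulation consistent with the dimensions above can be sketched as follows. The factor $\tfrac12$, the initial state $x_0$, and the output relation $y_k = Px_k$ are assumptions made here for illustration; only the feedback law (1.4) and the choice $Q = P^*VP$, $C = 0$ are taken from the text.

```latex
% Assumed standard form; x_0 and the factor 1/2 are illustrative choices.
\begin{align*}
&\text{Minimize}\quad \tfrac12 \sum_{k=0}^{\infty} \bigl( y_k^* V y_k + u_k^* R\, u_k \bigr) \\
&\text{subject to}\quad x_{k+1} = A x_k + B u_k, \quad y_k = P x_k, \quad x_0 \text{ given}; \\
&\text{closed loop recursion:}\quad x_{k+1} = \bigl( A - B (R + B^* X B)^{-1} B^* X A \bigr) x_k .
\end{align*}
```

The matrix in the last line is the closed loop matrix of the DARE (1.2) in the special case $C = 0$.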
The equations (1.1) and (1.2) are clearly nonlinear equations. However, linear
algebra methods can play a big role in the solution of these equations. For example,
we have the following result (see, e.g., [35, Theorem 7.1.2]) for the solution of the
CARE (1.1).
Theorem 1.1 Equation (1.1) has a solution $X \in \mathbb{C}^{n\times n}$ if and only if there is a set
of vectors $v_1, \ldots, v_n$ in $\mathbb{C}^{2n}$ forming a set of Jordan chains for
\[ H = \begin{bmatrix} A & -D \\ -C & -A^* \end{bmatrix} \]
and if
\[ v_j = \begin{bmatrix} y_j \\ z_j \end{bmatrix}, \quad j = 1, \ldots, n, \]
where $y_j, z_j \in \mathbb{C}^n$, then $y_1, y_2, \ldots, y_n$ form a basis for $\mathbb{C}^n$.
Furthermore, if $Y = [y_1\ y_2\ \cdots\ y_n]$ and $Z = [z_1\ z_2\ \cdots\ z_n]$,
every solution of (1.1) has the form $X = ZY^{-1}$ for some set of Jordan chains
$v_1, v_2, \ldots, v_n$ for $H$ such that $Y$ is invertible.
From the above result, we see that the problem of finding all solutions of the
equation (1.1) can be reformulated as the problem of finding all Jordan chains of
the $2n \times 2n$ matrix $H$. The desired Hermitian solution (with all closed loop
eigenvalues in the closed left half-plane) can be found, in principle, by choosing an
appropriate set of Jordan chains or an appropriate $H$-invariant subspace. Numerical
methods based on this approach are called subspace methods. The most commonly used
method of this kind for solving the CARE has been the Schur vector method [37]. For
recent developments on subspace methods, see [7].
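As a rough illustration of the subspace approach (not the implementation analysed in [37]), the sketch below forms the matrix $H = \begin{bmatrix} A & -D \\ -C & -A^* \end{bmatrix}$, reorders a complex Schur form so that the eigenvalues in the open left half-plane come first, and reads off $X = ZY^{-1}$ from the leading Schur vectors. The function name `care_schur`, the test matrices, and the use of SciPy's `schur` with `sort="lhp"` are choices made here; the sketch assumes that $H$ has no eigenvalues on the imaginary axis.

```python
import numpy as np
from scipy.linalg import schur


def care_schur(A, D, C):
    """Sketch: solve X D X - X A - A^* X - C = 0 via an ordered Schur form of H."""
    n = A.shape[0]
    # Invariant subspaces Im[Y; Z] of H with Y invertible give solutions X = Z Y^{-1}.
    H = np.block([[A, -D], [-C, -A.conj().T]])
    # Complex Schur form with eigenvalues of negative real part ordered first.
    T, U, sdim = schur(H, output="complex", sort="lhp")
    if sdim != n:
        raise ValueError("H does not have exactly n eigenvalues in the open left half-plane")
    Y, Z = U[:n, :n], U[n:, :n]
    return Z @ np.linalg.inv(Y)


# Hypothetical test data: a double integrator with D = b b^T, b = (0, 1)^T, and C = I.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
D = np.array([[0.0, 0.0], [0.0, 1.0]])
C = np.eye(2)

X = care_schur(A, D, C)
residual = X @ D @ X - X @ A - A.conj().T @ X - C
closed_loop_eigs = np.linalg.eigvals(A - D @ X)
```

When the closed loop matrix has eigenvalues on or very near the imaginary axis, the count `sdim` is no longer reliably equal to `n`, which is one concrete way the difficulties discussed later in this chapter show up.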
In physical problems involving Riccati equations those of real type arise most
frequently. Consequently, the literature is dominated numerically by papers devoted
to the real case. In [37], the Schur method is discussed in detail for the real CARE
\[ XDX - XA - A^TX - C = 0, \tag{1.5} \]
where $A, D, C \in \mathbb{R}^{n\times n}$, $D^T = D$, $C^T = C$. Standard assumptions for (1.5) are that
$D, C \ge 0$ and the pairs $(A, D)$, $(A^T, C^T)$ are stabilizable. When these conditions are
fulfilled, the equation (1.5) has a unique positive semidefinite solution $X_+$, and the
closed loop matrix $A - DX_+$ has all eigenvalues in the open left half-plane. If the
eigenvalues of $A - DX_+$ are not very close to the imaginary axis, the Schur method
is very efficient, particularly when combined with one or two steps of correction by
Newton's method. Since Newton's method is quadratically convergent in this case, it
can improve the approximate solution produced by the Schur method provided that
the approximate solution is already reasonably close to the exact solution. In [32],
a computable estimate is given for the closeness required to make sure the Newton
method does provide improvement in the first step.
However, even under the above standard conditions for the CARE (1.5), the
eigenvalues of the closed loop matrix can still be very close to the imaginary axis.
For example, for the equation (1.5) with $n = 1$, $A = 0$, $D = 1$, $C = \varepsilon > 0$, all the
standard assumptions are satisfied. But the eigenvalue of $A - DX_+$ ($-\sqrt{\varepsilon}$ in this
case) goes to zero as $\varepsilon$ goes to zero.
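The scalar example can be checked by hand. The short derivation below assumes the real CARE in the form $XDX - XA - A^TX - C = 0$ and writes the small parameter as $\varepsilon$.

```latex
\begin{align*}
n = 1,\ A = 0,\ D = 1,\ C = \varepsilon > 0:\qquad
& X^2 - \varepsilon = 0, \\
& X_+ = \sqrt{\varepsilon} \quad (\text{the positive semidefinite solution}), \\
& A - D X_+ = -\sqrt{\varepsilon} \;\longrightarrow\; 0 \quad \text{as } \varepsilon \to 0 .
\end{align*}
```

The closed loop eigenvalue is in the open left half-plane for every $\varepsilon > 0$, yet it can be arbitrarily close to the imaginary axis.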
For the CARE (1.1) arising in $H_\infty$-optimal control problems, the matrices $D$
and $C$ are often not both positive semidefinite and the closed loop matrix can have
eigenvalues precisely on the imaginary axis. Also, according to [9], the DARE (1.2)
having closed loop eigenvalues on the unit circle has applications in the filtering and
control of systems with purely deterministic disturbances such as sinusoids and drift
components.
In this dissertation, we are mainly interested in the situations in which the
closed loop matrix has eigenvalues on the imaginary axis (unit circle) for the CARE
(DARE). There has been some recent work related to the study of algebraic Riccati
equations in these situations. See, for example, [10] and [40].
In these situations, subspace methods typically run into difficulties (see [41]). For
the CARE (1.1), for example, in order to obtain the desired Hermitian solution in
the way described in Theorem 1.1, we need to choose half of the vectors in the Jordan
chains associated with the eigenvalues of the Hamiltonian matrix $H$ on the imaginary
axis. Note that the lengths of these Jordan chains are necessarily even (see [35]). If the
closed loop matrix has only one simple eigenvalue $\lambda$ on the imaginary axis, then $\lambda$ is
the only eigenvalue of $H$ on the imaginary axis and the corresponding Jordan chain
consists of two vectors $\alpha_1, \alpha_2$, where $\alpha_1$ is the eigenvector (see [35]). In this case, we
can only pick up the vector $\alpha_1$ when forming an $H$-invariant subspace. If the closed
loop matrix has just two simple eigenvalues $\lambda$ and $\mu$ on the imaginary axis, then $\lambda$
and $\mu$ are the only eigenvalues of $H$ on the imaginary axis and the corresponding
Jordan chains are $\{\alpha_1, \alpha_2\}$ and $\{\beta_1, \beta_2\}$, where $\alpha_1$ and $\beta_1$ are eigenvectors. In this
case, when forming an $H$-invariant subspace, we can take $\{\alpha_1, \alpha_2\}$, or $\{\alpha_1, \beta_1\}$, or
$\{\beta_1, \beta_2\}$. It is not clear which pair should be used to obtain the desired Hermitian
solution. The situation is even more complicated if the closed loop matrix has more
eigenvalues on the imaginary axis.
We will consider the application of Newton's method for the solution of the
algebraic Riccati equations in these situations. It will be seen that the convergence
behaviour of Newton's method is dependent on the partial multiplicities of the closed
loop eigenvalues on the imaginary axis (unit circle) for the CARE (DARE). However,
we do not need any information on the eigenstructure of the closed loop matrix in
carrying out the Newton iteration. We are thus free of the above-mentioned concerns
associated with subspace methods.
It should be mentioned that some progress has been recently made (see [38])
in the study of subspace methods when the closed loop matrix has eigenvalues on
the imaginary axis (unit circle) for the CARE (DARE). However, some correction
method is still needed to improve the accuracy of the approximate solution obtained
by subspace methods. Newton's method cannot guarantee improvement no matter
how close the approximate solution is to the exact solution. It may even produce a
new approximate solution that is worse than a random guess. In these situations,
Newton's method may have to be used directly.
The plan of this dissertation is as follows. In Chapter 2, we will review some
basic concepts and results from functional analysis. Some local results on Newton's
method in Banach spaces are also reviewed with emphasis on Newton's method at
singular roots. In Chapter 3, we mainly study Newton's method for the CARE with
eigenvalues on the imaginary axis. We will prove that the convergence of Newton's
method is either quadratic or linear with common ratio 1/2 provided that all closed
loop eigenvalues on the imaginary axis are semisimple. We will also discuss in this
chapter how the efficiency of Newton's method can be improved. Some interest-
ing examples of the CARE are given in Chapter 4. The convergence behaviour of
Newton's method when the closed loop matrix has an arbitrary eigenstructure is con-
jectured in this chapter. Chapter 5 is devoted to the more involved DARE. Similar
results are established for DAREs having closed loop eigenvalues on the unit circle.
Matrix pencils are used to characterize the eigenvalues of the closed loop matrix.
The results are established under very mild conditions for the given matrices in the
DARE (1.2). In Chapter 6, we study a special DARE which also appears in many
applications. The special DARE has the form $X + A^*X^{-1}A = Q$, with $Q$ being
Hermitian positive definite. Some convergence results are established for some sim-
pler iterative methods. Newton's method is then studied and compared with these
methods. In the last chapter, Chapter 7, we make some concluding remarks.
Chapter 2
Newton's Method in Banach Spaces
2.1 Some basic concepts and facts in functional
analysis
In this section, we will review some basic concepts and facts from functional analysis.
Let $E$ be a linear space over the field $\mathbb{C}$ of complex numbers. A norm on $E$ is a
function $x \mapsto \|x\|$ from $E$ to the field $\mathbb{R}$ of reals having the properties
\[ \|x\| \ge 0, \text{ with } \|x\| = 0 \text{ if and only if } x = 0; \qquad \|cx\| = |c|\,\|x\|; \qquad \|x + y\| \le \|x\| + \|y\| \]
for all $x, y \in E$ and $c \in \mathbb{C}$.
With such a norm we can talk about convergence in $E$: a sequence $\{x_k\}$ in $E$
converges to $x$ in $E$, and $x$ is the limit of the sequence, if the sequence of real numbers
$\{\|x_k - x\|\}$ converges to zero; if such $x$ exists, then the sequence is convergent. A
sequence $\{x_k\}$ is a Cauchy sequence if, for all $\varepsilon > 0$, there exists an integer $N$ such
that $\|x_m - x_n\| < \varepsilon$ whenever $m, n > N$. If every Cauchy sequence in $E$ is convergent,
then $E$ is complete. A (complex) Banach space is a (complex) linear space which has
a norm and is complete.
A subset $S$ of a complex Banach space $E$ is a subspace if
\[ x + y \in S, \qquad cx \in S \]
for all $x, y \in S$, $c \in \mathbb{C}$.
Let $S_1$ and $S_2$ be subspaces of $E$. If for all $x \in E$ there are a unique $x_1 \in S_1$ and
a unique $x_2 \in S_2$ such that $x = x_1 + x_2$, then $E$ is a direct sum of $S_1$ and $S_2$, written
as $E = S_1 \oplus S_2$. In this case, $x_1$ is the projection of $x$ onto $S_1$ parallel to $S_2$ and $x_2$
is the projection of $x$ onto $S_2$ parallel to $S_1$.
An operator $A$ from a complex Banach space $E_1$ into a complex Banach space
$E_2$ is linear if
\[ A(c_1x_1 + c_2x_2) = c_1A(x_1) + c_2A(x_2) \]
for all $x_1, x_2 \in E_1$ and $c_1, c_2 \in \mathbb{C}$. A linear operator $A$ is bounded if there is a constant
$k$ such that
\[ \|Ax\| \le k\|x\| \]
for all $x \in E_1$. The linear space of all bounded linear operators from $E_1$ into $E_2$ is
denoted by $\mathcal{L}(E_1, E_2)$. Note that $\mathcal{L}(E_1, E_2)$ is a Banach space with the norm defined by
\[ \|A\| = \sup_{\|x\| = 1} \|Ax\|. \]
When $E_1 = E_2 = E$, we will write $\mathcal{L}(E)$ for $\mathcal{L}(E, E)$.
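The null space (or kernel) of $A \in \mathcal{L}(E_1, E_2)$ is defined by
\[ \operatorname{Ker}(A) = \{x \in E_1 \mid Ax = 0\}. \]
The range (or image) of $A \in \mathcal{L}(E_1, E_2)$ is defined by
\[ \operatorname{Im}(A) = \{y \in E_2 \mid y = Ax \text{ for some } x \in E_1\}. \]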
Let $D$ be an open set in $E_1$. A map $F : D \to E_2$ is Fréchet differentiable at
$x \in D$, if there is an $A \in \mathcal{L}(E_1, E_2)$ such that
\[ \lim_{h \to 0} \frac{\|F(x + h) - F(x) - Ah\|}{\|h\|} = 0. \]
The linear operator $A$ is unique and is called the Fréchet derivative of $F$ at $x$, denoted
by $F'(x)$ or $F'_x$.
The second Fréchet derivative of $F$ at $x$, denoted by $F''(x)$ or $F''_x$, is the first
Fréchet derivative of $F'$ at $x$. Thus $F''(x)$ is a linear map from $E_1$ into $\mathcal{L}(E_1, E_2)$,
or equivalently, a bilinear map from $E_1 \times E_1$ into $E_2$. Other higher derivatives can be
defined similarly.
We have the following mean value theorem.
Theorem 2.1 Assume that $F : D \subset E_1 \to E_2$ is Fréchet differentiable on a convex
set $D_0 \subset D$. Then, for all $x, y \in D_0$,
\[ \|Fy - Fx\| \le \sup_{0 \le t \le 1} \|F'(x + t(y - x))\|\,\|y - x\|. \]
For the proof of Theorem 2.1, see [29], for example.
A map $F : D \subset E_1 \to E_2$ is Lipschitz continuous on $D_1 \subset D$ if there is a constant
$\gamma$ such that
\[ \|Fx - Fy\| \le \gamma\|x - y\| \]
for all $x, y \in D_1$.
A map $A \in \mathcal{L}(E)$ has an inverse if $\operatorname{Ker}(A) = \{0\}$ and $\operatorname{Im}(A) = E$. The inverse is
defined by
\[ A^{-1}y = x, \quad \text{if } Ax = y. \]
A map $A \in \mathcal{L}(E)$ is invertible if $A^{-1} \in \mathcal{L}(E)$. If $E$ is finite dimensional, then
$A \in \mathcal{L}(E)$ is invertible if and only if $\operatorname{Ker}(A) = \{0\}$, or equivalently, $\operatorname{Im}(A) = E$.
Concerning the inverses of linear operators, we have the following classical result.
Theorem 2.2 (Banach Lemma) Let $E$ be a Banach space and $A, B \in \mathcal{L}(E)$. If $A$
is invertible and $\|I - AB\| < 1$, then $B$ is also invertible. Moreover,
2.2 Newton's method in Banach spaces
Let $F$ be a map from a Banach space $E$ into itself. Assume that $x^* \in E$ is a solution
to
\[ F(x) = 0. \]
The Newton iteration for the solution $x^*$ is given by
\[ x_{n+1} = x_n - F'(x_n)^{-1}F(x_n), \quad n = 0, 1, \ldots, \tag{2.1} \]
where $x_0$ is an initial guess.
Theorem 2.3 If $F'(x)$ is Lipschitz continuous in a neighborhood of $x^*$ and $F'(x^*)$
is invertible, then the sequence $\{x_n\}$ defined by (2.1) is well-defined and will converge
quadratically to $x^*$ if the initial guess $x_0$ is sufficiently close to $x^*$.
If $F'(x^*)$ is not invertible, the root $x^*$ is said to be singular. In this case, the
behaviour of Newton's method is much more complicated.
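The difference between a regular and a singular root is already visible in one variable. The sketch below is an illustration chosen here, not an example from this text: it runs the plain Newton iteration on $f(x) = x^2 - 1$, whose root $x^* = 1$ is regular, and on $g(x) = x^2$, whose root $x^* = 0$ is singular. The helper name `newton` and the starting value 2 are arbitrary.

```python
def newton(f, df, x0, steps):
    """Plain Newton iteration x_{n+1} = x_n - f(x_n) / f'(x_n); returns all iterates."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


# Regular root x* = 1 of f(x) = x^2 - 1: f'(1) = 2 is invertible.
regular = newton(lambda x: x * x - 1.0, lambda x: 2.0 * x, 2.0, 6)

# Singular root x* = 0 of g(x) = x^2: g'(0) = 0 is not invertible.
singular = newton(lambda x: x * x, lambda x: 2.0 * x, 2.0, 6)

# Errors for the regular root shrink quadratically.
regular_errors = [abs(x - 1.0) for x in regular]

# For g the iteration reduces to x_{n+1} = x_n / 2, so every ratio equals 1/2.
singular_ratios = [singular[k + 1] / singular[k] for k in range(6)]
```

For $g$ the error is exactly halved at each step, which is the scalar prototype of the linear convergence with common ratio 1/2 discussed in the rest of this section.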
Newton's method at singular roots has been studied by many authors, see for
example [8, 12, 13, 14, 23, 24, 30, 31, 44, 45, 47].
In [45], Rall considered Newton's method at singular roots for maps from $\mathbb{R}^n$
into itself. He analysed the one variable case in detail and initiated the study for
the higher dimensional case. The results he obtained for the higher dimensional
case are partly incorrect. The first correct result on the convergence of Newton's
method at singular roots is due to Cavanagh [8]. However, he also made the stringent
assumption (as in Rall's work) that $F'$ is invertible in a deleted neighborhood of the
singular root. Due to this restrictive assumption, his result is not generally applicable
to the study of Newton's method for algebraic Riccati equations.
Newton's method at singular roots was also studied in [44] in the Banach space
setting. The conditions in [44, Theorem 40.1] are also very stringent and are not
satisfied for the very simple CARE given in [35, Example 9.2.1].
The restriction that $F'$ is invertible in a deleted neighborhood of the singular root
is dropped in Reddien [47]. However, Reddien's result (established in the Banach
space setting) needs other conditions which are generally not satisfied for algebraic
Riccati equations. On the other hand, the results in [23, 24] are given only for maps
from $\mathbb{R}^n$ into itself.
The most general result for Newton's method at singular points is given by Kelley
in [30]. That result is a combination of many previous results in this area. The result
is general enough for application to algebraic Riccati equations. We will now give a
review of that result.
Assume that $F$ is sufficiently smooth and that $F'(x^*)$ has a finite dimensional
null space $N$ and a closed range $M$ with the direct sum condition $E = N \oplus M$. We
define $P_N$ to be the projection onto $N$ parallel to $M$ and let $P_M = I - P_N$. Assume
further that the following regularity condition holds: there is a $\phi_0 \in N$ such that the
map $B$ on $N$ given by $B = P_N F''(x^*)(\phi_0, \cdot)$ is invertible. Linear convergence with
common ratio 1/2 is then predicted for an appropriate initial guess. The following
result is a restatement of [30, Theorem 1.1].
Theorem 2.4 Let $E = N \oplus M$, let $\phi_0$ be chosen so that $B$ is invertible, and let
$N = \operatorname{span}\{\phi_0\} \oplus N_1$ for some subspace $N_1$. Write $\tilde{x} = x - x^*$ and let
where $P_0$ is the projection onto $\operatorname{span}\{\phi_0\}$ parallel to $M \oplus N_1$. If $x_0 \in W(\rho_0, \theta_0, \eta_0)$
for $\rho_0, \theta_0, \eta_0$ sufficiently small, then the Newton sequence $\{x_i\}$ is well-defined and
$\|F'(x_i)^{-1}\| \le c\|\tilde{x}_i\|^{-1}$ for all $i \ge 1$ and some constant $c > 0$. Moreover,
Notice that the region $W(\rho, \theta, \eta)$ in which $x_0$ is required to lie is close to $x^*$, $N$,
and in the sense determined by the $\rho$, $\theta$, $\eta$ inequalities, respectively.
In general situations, it would be hard to imagine that we can choose an initial
guess satisfying all these inequalities. For algebraic Riccati equations, however, we
already know (see Theorem 3.2) that Newton's method is convergent for an appropriate
initial guess not necessarily close to the solution. We will show in later chapters
that, as the iteration goes on, the Newton iterate will automatically get into a region
of the form (2.2) for any fixed $\rho, \theta, \eta > 0$ unless the Newton iteration is quadratically
convergent.
As we have seen from the above theorem, the convergence of the Newton iteration
to a singular root is usually linear rather than quadratic. Much work has been done
on modifications of the Newton iteration with a view to accelerating convergence
when the derivative is not invertible at the solution. See, for example, [12], [14] and
[31]. The modified methods as described in [12] and [14] are, however, not applicable
for the CARE. Motivated by consideration of quadratic problems, Kelley and Suresh
[31] proposed other modified methods which could be applied to the CARE as well
as the DARE. These methods are all local methods and very sophisticated. We will
not use these methods in our work, since the conditions needed for the convergence
of these methods cannot be verified easily.
The regularity condition is very important for Theorem 2.4. Without this condition,
the behaviour of Newton's method can be very erratic (see, e.g., [25]).
Chapter 3
Newton's Method for Continuous Algebraic
Riccati Equations
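In this chapter we consider continuous algebraic Riccati equations of the form
\[ \mathcal{R}(X) = XDX - XA - A^*X - C = 0, \tag{3.1} \]
where $A, D, C \in \mathbb{C}^{n\times n}$, and $D^* = D$, $C^* = C$. It is clear that $\mathcal{R}$ maps Hermitian
matrices to Hermitian matrices. However, we find it more convenient to regard $\mathcal{R}$
as a mapping from $\mathbb{C}^{n\times n}$ into itself. For any matrix norm, $\mathbb{C}^{n\times n}$ is a Banach space.
Note that
\[ \mathcal{R}(X + H) - \mathcal{R}(X) = -H(A - DX) - (A^* - XD)H + HDH. \]
Therefore, the first Fréchet derivative of $\mathcal{R}$ at a Hermitian matrix $X$ is the linear
map $\mathcal{R}'_X : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ given by
\[ \mathcal{R}'_X(H) = -\{H(A - DX) + (A - DX)^*H\}. \tag{3.2} \]
Also the second derivative at a Hermitian matrix $X$, $\mathcal{R}''_X : \mathbb{C}^{n\times n} \times \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$, is
given by
\[ \mathcal{R}''_X(H_1, H_2) = H_1DH_2 + H_2DH_1. \tag{3.3} \]
The Newton method for the solution of (3.1) is
\[ X_{i+1} = X_i - (\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i), \quad i = 0, 1, \ldots, \tag{3.4} \]
given that the maps $\mathcal{R}'_{X_i}$ are all invertible. If $\mathcal{R}'_{X_i}$ is invertible, the iteration (3.4) is
equivalent to
\[ \mathcal{R}'_{X_i}(X_{i+1} - X_i) = -\mathcal{R}(X_i). \]
In view of (3.2) when $X_i$ is Hermitian, the above equality is precisely
\[ (A - DX_i)^*X_{i+1} + X_{i+1}(A - DX_i) = -X_iDX_i - C. \tag{3.5} \]
It is easy to see that $X_{i+1}$ is also Hermitian. Therefore, all the matrices $X_i$ are
Hermitian if $X_0$ is so.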
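Each Newton step for the CARE therefore amounts to solving one Lyapunov equation, $(A - DX_i)^*X_{i+1} + X_{i+1}(A - DX_i) = -X_iDX_i - C$. A minimal sketch of the iteration follows. The function names, the fixed step count, the test matrices, and the symmetrization line are choices made here for illustration; SciPy's `solve_continuous_lyapunov(a, q)` solves $aX + Xa^H = q$, so it is called with $a = (A - DX_i)^*$.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov


def care_residual(X, A, D, C):
    """R(X) = X D X - X A - A^* X - C."""
    return X @ D @ X - X @ A - A.conj().T @ X - C


def newton_care(A, D, C, X0, steps=10):
    """Newton iteration for R(X) = 0 from a Hermitian X0 with A - D X0 stable."""
    X = X0
    history = [X0]
    for _ in range(steps):
        Ac = A - D @ X  # closed loop matrix at the current iterate
        # One Lyapunov solve: Ac^* X_new + X_new Ac = -(X D X + C).
        X = solve_continuous_lyapunov(Ac.conj().T, -(X @ D @ X + C))
        # Symmetrize against rounding; a convenience, not part of the iteration itself.
        X = (X + X.conj().T) / 2
        history.append(X)
    return history


# Hypothetical test data: double integrator, D = b b^T with b = (0, 1)^T, C = I.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
D = np.array([[0.0, 0.0], [0.0, 1.0]])
C = np.eye(2)
X0 = np.array([[2.0, 1.0], [1.0, 2.0]])  # A - D X0 has the double eigenvalue -1

history = newton_care(A, D, C, X0)
X_final = history[-1]
final_residual = care_residual(X_final, A, D, C)
```

In this example the closed loop matrix at the solution is stable, so the iterates settle after a handful of steps; the behaviour when the closed loop matrix has eigenvalues on the imaginary axis is the subject of the rest of the chapter.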
For $A, B \in \mathbb{C}^{n\times n}$, the pair $(A, B)$ is said to be stabilizable if there is a $K \in \mathbb{C}^{n\times n}$
such that $A - BK$ is stable, i.e., all its eigenvalues are in the open left half-plane.
The order relation on the set of Hermitian matrices is the usual one: $X \ge Y$ ($X > Y$)
if $X - Y$ is positive semidefinite (definite). A Hermitian solution $X_+$ of (3.1) is called
maximal if $X_+ \ge X$ for every Hermitian solution $X$. The following result is Theorem
9.1.1 of [35]. See also [11] and [20].
Theorem 3.1 Assume that $D \ge 0$, $C^* = C$, $(A, D)$ is stabilizable, and there exists
a Hermitian solution of the inequality $\mathcal{R}(X) \le 0$. Then there exists a maximal
Hermitian solution $X_+$ of $\mathcal{R}(X) = 0$. Moreover, all the eigenvalues of $A - DX_+$ are
in the closed left half-plane.
A Hermitian solution $X$ of (3.1) is called stabilizing (resp. almost stabilizing) if
all the eigenvalues of $A - DX$ are in the open (resp. closed) left half-plane. Such
solutions play important roles in applications. Theorem 3.1 tells us that, under the
given conditions, the maximal solution is at least almost stabilizing. In fact (see
[50] or [35, Theorem 7.9.3]), $X_+$ is the only Hermitian solution that can be almost
stabilizing. For this reason, the maximal solution is of particular interest.
Theorem 3.2 Under the same conditions as in Theorem 3.1, starting with any Hermitian
matrix $X_0$ for which $A - DX_0$ is stable, the recursion (3.5) determines a
sequence of Hermitian matrices $\{X_i\}_{i=0}^{\infty}$ for which $A - DX_i$ is stable for $i = 1, 2, \ldots$,
$X_1 \ge X_2 \ge \cdots$, and $\lim_{i\to\infty} X_i = X_+$.
The maximal solution can thus be found by the Newton iteration without previous
information about the solution. The proof of the above theorem can be found in [35,
p. 232]. See also [11], [20] and [33]. There is no doubt about the existence of the
matrix $X_0$, which is called a stabilizing matrix. In fact, we have the following result
(see Lemma 4.5.4 of [35]).
Lemma 3.3 If $D \ge 0$ and $(A, D)$ is stabilizable, then there is an $X \ge 0$ such that
$A - DX$ is stable.
Moreover, an initial stabilizing Hermitian matrix $X_0$ can be produced by automatic
stabilizing procedures such as the one in [48], although the matrix $X_0$ so
obtained may be far away from the solution $X_+$. We note that $X_0 \ge X_1$ is generally
not true. In fact, the first Newton iteration is capable of making a big adjustment to
the initial guess (see [32], for example). When $X_+ \ge 0$, we necessarily have $X_i \ge 0$
for $i \ge 1$. But $X_0$ can be indefinite.
If $X$ is an almost stabilizing solution of (3.1) (in the sense that $\sigma(A - DX)$ is in the
closed left half-plane), then $\mathcal{R}'_X$ is invertible if and only if $X$ is a stabilizing
solution. This can be seen from (3.2) and the following classical result (cf. [35, p.
100]).
Theorem 3.4 For any matrices $A \in \mathbb{C}^{m\times m}$, $B \in \mathbb{C}^{n\times n}$ and $\Gamma \in \mathbb{C}^{n\times m}$, the Sylvester
equation $SA - BS = \Gamma$ has a unique solution if and only if $A$ and $B$ have no
eigenvalues in common.
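A small numerical check of the Sylvester equation $SA - BS = \Gamma$ can be sketched as follows. The matrices are hypothetical test data chosen so that the spectra of $A$ and $B$ are disjoint. SciPy's `solve_sylvester(a, b, q)` solves $aX + Xb = q$, so the equation above corresponds to $a = -B$, $b = A$.

```python
import numpy as np
from scipy.linalg import solve_sylvester

# A is m x m with eigenvalues 1 and 3; B is n x n with eigenvalues -1 and -2.
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
B = np.array([[-1.0, 0.0, 0.0],
              [0.0, -2.0, 1.0],
              [0.0, 0.0, -2.0]])
Gamma = np.arange(6.0).reshape(3, 2)  # n x m right-hand side

# S A - B S = Gamma  is the same as  (-B) S + S A = Gamma.
S = solve_sylvester(-B, A, Gamma)
residual = S @ A - B @ S - Gamma

# The spectra are disjoint, which is the uniqueness condition of the theorem.
common = np.intersect1d(np.round(np.linalg.eigvals(A), 8),
                        np.round(np.linalg.eigvals(B), 8))
```

Applied to (3.2), the relevant pair is $A - DX$ and $-(A - DX)^*$: these share an eigenvalue exactly when $A - DX$ has eigenvalues $\lambda$, $\mu$ with $\lambda + \bar{\mu} = 0$, which for an almost stabilizing $X$ means an eigenvalue on the imaginary axis.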
It is readily seen that $\mathcal{R}'_X$, as a function of $X$, is Lipschitz continuous on $\mathbb{C}^{n\times n}$.
We note that the expression for $\mathcal{R}'_X$ when $X$ is not Hermitian is different from the one
given in (3.2). The locally quadratic convergence of Newton's method (see Theorem
2.3), in combination with Theorem 3.2, yields the following result.
Theorem 3.5 If $A - DX_+$ is stable in Theorem 3.2, then for the sequence $\{X_i\}_{i=0}^{\infty}$
there is a constant $c > 0$ such that, for $i = 0, 1, \ldots$, $\|X_{i+1} - X_+\| \le c\|X_i - X_+\|^2$,
where $\|\cdot\|$ is any given matrix norm.
We note that, because $\sigma(A - DX_+)$ is in the open left half-plane, $A - DX_0$ is
necessarily stable if $X_0$ is close enough to $X_+$. A direct algebraic proof of the above
theorem can be found in [35, p. 237].
In [4], an exact line search method is introduced which improves Newton's method
for the numerical solution of the Riccati equation in several aspects. However, the
theory established there does not cover the general situation described in Theorem
3.1, even when $A - DX_+$ has no eigenvalues on the imaginary axis.
When $A - DX_+$ has eigenvalues on the imaginary axis, $\mathcal{R}'_{X_+}$ is not invertible
and the convergence behaviour of the Newton iteration is more complicated. In this
chapter we examine the behaviour of the Newton iteration for this case. The results
we obtain suggest that a simple modification step can be introduced to improve the
performance of the Newton iteration dramatically in many cases. Numerical results
are also given to show the effectiveness of the modification.
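The simplest instance of the singular case can be worked out by hand. The scalar data below are an illustration chosen here; they satisfy the hypotheses of Theorem 3.1, since $D \ge 0$, the pair $(0, 1)$ is stabilizable, and $X = 0$ satisfies $\mathcal{R}(X) \le 0$.

```latex
% Assumed scalar example: n = 1, A = 0, D = 1, C = 0.
\begin{align*}
\mathcal{R}(X) &= X^2, \qquad X_+ = 0, \qquad A - DX_+ = 0, \\
\mathcal{R}'_{X}(H) &= 2XH, \qquad \mathcal{R}'_{X_+} = 0 \quad (\text{not invertible}), \\
\mathcal{R}'_{X_i}(X_{i+1} - X_i) = -\mathcal{R}(X_i)
  &\;\Longrightarrow\; 2X_i(X_{i+1} - X_i) = -X_i^2
  \;\Longrightarrow\; X_{i+1} = \tfrac12 X_i .
\end{align*}
```

Starting from any $X_0 > 0$ (so that $A - DX_0 = -X_0$ is stable), the iterates decrease monotonically to $X_+ = 0$ with the error exactly halved at each step, that is, linear convergence with common ratio 1/2.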
3.2 Interpretation of the direct sum condition for
the CARE
The direct sum condition in Theorem 2.4 is essential for the conclusions of that
result. We will now give an interpretation of the direct sum condition for the continuous
algebraic Riccati equation (3.1). We assume throughout this section that the
conditions of Theorem 3.1 are satisfied. Let $X_+$ be the maximal solution of (3.1)
with $\mathcal{R}'_{X_+}$ not invertible. Let $N = \operatorname{Ker}\mathcal{R}'_{X_+}$, $M = \operatorname{Im}\mathcal{R}'_{X_+}$. We have the following
interpretation of the direct sum condition.
Theorem 3.6 $\mathbb{C}^{n\times n} = N \oplus M$ if and only if all elementary divisors of $A - DX_+$
corresponding to the eigenvalues on the imaginary axis are linear.
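Proof. Let $J$ be the Jordan canonical form for $A - DX_+$ with $P^{-1}(A - DX_+)P = J$.
We find that $K \in \operatorname{Ker}\mathcal{R}'_{X_+}$ if and only if $K = P^{-*}QP^{-1}$ for some $Q \in \mathbb{C}^{n\times n}$
satisfying $QJ + J^*Q = 0$. Also $W \in \operatorname{Im}\mathcal{R}'_{X_+}$ if and only if $W = P^{-*}RP^{-1}$ with
$R = VJ + J^*V$ for some $V \in \mathbb{C}^{n\times n}$. Therefore,
\[ N = P^{-*}N_JP^{-1}, \tag{3.6} \]
\[ M = P^{-*}M_JP^{-1}, \tag{3.7} \]
where $N_J$ and $M_J$ are the kernel and range of the map $\mathcal{G} : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ given by
\[ \mathcal{G}(V) = VJ + J^*V. \]
If all elementary divisors of $A - DX_+$ corresponding to the eigenvalues on the
imaginary axis are linear, we can arrange the Jordan blocks so that
\[ J = \operatorname{diag}(J_1, \ldots, J_p). \]
Here $J_p \in \mathbb{C}^{r_p\times r_p}$ consists of Jordan blocks associated with eigenvalues in the open
left half-plane, and for $k = 1, \ldots, p - 1$,
\[ J_k = i a_k I_{r_k} \in \mathbb{C}^{r_k\times r_k}, \]
where the $a_k$'s are distinct real numbers and $i$ is the imaginary unit. Using block
matrix multiplications and applying Theorem 3.4 repeatedly, we can find easily
\[ N_J = \{N = \operatorname{diag}(N_1, \ldots, N_p) \mid N_i \in \mathbb{C}^{r_i\times r_i},\ 1 \le i \le p;\ N_p = 0\}, \tag{3.9} \]
\[ M_J = \{M = (M_{jk})_{j,k=1}^{p} \mid M_{jk} \in \mathbb{C}^{r_j\times r_k};\ M_{kk} = 0,\ 1 \le k \le p - 1\}. \]
Therefore, $\mathbb{C}^{n\times n} = N_J \oplus M_J$. We also have $\mathbb{C}^{n\times n} = N \oplus M$ by (3.6)-(3.7).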
If $A - DX_+$ has nonlinear elementary divisors corresponding to eigenvalues on
the imaginary axis, we can arrange the Jordan blocks so that the first Jordan block
$J_1$ has the following form:
Now,
where
Note that $T = \mathcal{G}(S)$ for
with
si =
Therefore, $\mathbb{C}^{n\times n} \neq N_J \oplus M_J$, and also $\mathbb{C}^{n\times n} \neq N \oplus M$.
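The two cases of the proof can be seen in the smallest nontrivial setting. The following is an illustrative sketch only: the $2\times 2$ matrices $J$ are chosen here for the purpose of the example and do not come from the thesis. With $V = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ we evaluate the map $V \mapsto VJ + J^*V$ directly.

```latex
% Case 1 (assumed data): J = diag(0, -1); the eigenvalue 0 has a linear elementary divisor.
\[
VJ + J^{*}V = \begin{pmatrix} 0 & -b \\ -c & -2d \end{pmatrix},
\qquad
N_J = \operatorname{span}\left\{\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\right\},
\qquad
M_J = \{ M : m_{11} = 0 \},
\]
% so N_J \cap M_J = \{0\} and \mathbb{C}^{2\times 2} = N_J \oplus M_J.

% Case 2 (assumed data): J = (0 1; 0 0); the eigenvalue 0 has a quadratic elementary divisor.
\[
VJ + J^{*}V = \begin{pmatrix} 0 & a \\ a & b + c \end{pmatrix},
\qquad
N_J = \left\{\begin{pmatrix} 0 & b \\ -b & d \end{pmatrix}\right\},
\qquad
M_J = \left\{\begin{pmatrix} 0 & a \\ a & e \end{pmatrix}\right\}.
\]
% The matrix T = diag(0, 1) lies in both sets: it is the image of S = (0 1; 0 0),
% and it is annihilated by the map. Hence N_J \cap M_J \neq \{0\} and the sum is not direct.
```

In the second case both subspaces have dimension two, so the dimensions still add up to four; the direct sum fails only because the intersection is nontrivial.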
3.3 Characterization of the direct sum condition
by the Hamiltonian matrix
We have just given an interpretation for the direct sum condition using information
about the eigenvalues of $A - DX_+$ on the imaginary axis. However, $X_+$ is the
solution to be found. It will be appropriate to give a characterization of the direct
sum condition that is independent of $X_+$.
We start with some definitions.
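Let $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$. The controllable subspace $\mathcal{C}_{A,B}$ of the pair $(A, B)$ is
defined by
$$\mathcal{C}_{A,B} = \operatorname{Im}(B \;\; AB \;\; \cdots \;\; A^{n-1}B).$$
The pair $(A, B)$ is called controllable if
$$\mathcal{C}_{A,B} = \mathbb{C}^n.$$
Note that if $(A, B)$ is controllable then it is necessarily stabilizable (see Proposition
3.7 and Lemma 3.18 to follow).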
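The definition of controllability can be checked mechanically for small matrices. The following is a minimal sketch, not part of the thesis: the helper names (`rank`, `controllability_matrix`, `is_controllable`) and the two test pairs are introduced here for illustration, and exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def rank(m):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in m]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def controllability_matrix(a, b):
    """Block row (B  AB  ...  A^{n-1}B) whose image is the controllable subspace."""
    n = len(a)
    blocks = [b]
    for _ in range(n - 1):
        blocks.append(mat_mul(a, blocks[-1]))
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]


def is_controllable(a, b):
    """True when the controllable subspace is all of C^n."""
    return rank(controllability_matrix(a, b)) == len(a)


# Illustrative pairs (assumed data): a double integrator, and a diagonal A
# whose second mode is not reached by B.
print(is_controllable([[0, 1], [0, 0]], [[0], [1]]))
print(is_controllable([[1, 0], [0, 2]], [[1], [0]]))
```

The second pair is not controllable, and since its unreachable mode has eigenvalue 2 in the right half-plane it is not stabilizable either.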
The next four propositions are taken from Chapter 4 of [35].
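Proposition 3.7 For any $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is controllable
if and only if
$$\operatorname{rank}(\lambda I - A \;\; B) = n$$
for all $\lambda \in \mathbb{C}$.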
Proposition 3.8 Let $A$, $B$, $K$ be matrices of sizes $n \times n$, $n \times m$, and $m \times n$ respectively
and write $\hat{A} = A + BK$. Then
$$\mathcal{C}_{\hat{A},B} = \mathcal{C}_{A,B}.$$
Proposition 3.9 For any matrix pair $(A, B)$ with $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$, there
is a nonsingular matrix $K$ such that
where $A_1$ is $r \times r$, $B_1$ is $r \times m$ and $r$ is the dimension of $\mathcal{C}_{A,B}$. Furthermore, the pair
$(A_1, B_1)$ is controllable.
The pair of matrices
of (3.11) is known as a control (or Kalman) normal form of $(A, B)$.
Proposition 3.10 Assume that the pair $(A, B)$ is not controllable. Then the pair
$(A, B)$ is stabilizable if and only if the matrix $A_2$ of a control normal form (3.11) is
stable.
For $A \in \mathbb{C}^{n\times n}$ and $\lambda \in \mathbb{C}$, we let
$$E_\lambda(A) = \{x \in \mathbb{C}^n \mid (A - \lambda I)^p x = 0 \text{ for some integer } p \ge 0\}.$$
The following result is a restatement of Theorem 7.3.1 of [35].
Theorem 3.11 Assume that $D \ge 0$ and $C^* = C$. Suppose the CARE (3.1) has a
Hermitian solution $X$, and
$$E_{\lambda_0}(A - DX) \subset \mathcal{C}_{A,D} \qquad (3.12)$$
for every eigenvalue $\lambda_0$ of $A - DX$ on the imaginary axis. Then $\lambda$ is an eigenvalue
of $A - DX$ on the imaginary axis if and only if $\lambda$ is an eigenvalue of the Hamiltonian
matrix
on the imaginary axis. Moreover, the partial multiplicities (i.e. the degrees of elementary
divisors) of $\lambda$ as an eigenvalue of $H$ are twice the partial multiplicities of $\lambda$
as an eigenvalue of $A - DX$.
The condition (3.12) is certainly satisfied if the pair $(A, D)$ is controllable. We will
now show that the condition (3.12) can be guaranteed when $(A, D)$ is stabilizable.
In fact, we have the following result.
Proposition 3.12 Let $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$. If $(A, B)$ is stabilizable, then for
any $\lambda$ with nonnegative real part
$$E_\lambda(A) \subset \mathcal{C}_{A,B}.$$
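Proof. Let $\tilde{A}$, $\tilde{B}$ be a control normal form of $(A, B)$ (cf. Proposition 3.9). We
Since $(A_1, B_1)$ is controllable, we have
Now, if $y \in E_\lambda(\tilde{A})$, then
$$(\tilde{A} - \lambda I)^p y = 0$$
for some integer $p \ge 0$. Therefore,
$$\begin{pmatrix} (A_1 - \lambda I)^p & * \\ 0 & (A_2 - \lambda I)^p \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
Since all the eigenvalues of $A_2$ are in the open left half-plane (cf. Proposition 3.10)
and $\operatorname{Re}\lambda \ge 0$, $A_2 - \lambda I$ is nonsingular and hence $y_2 = 0$. Therefore, $y \in \mathcal{C}_{\tilde{A},\tilde{B}}$. We
have thus proved $E_\lambda(\tilde{A}) \subset \mathcal{C}_{\tilde{A},\tilde{B}}$. Consequently, we have $E_\lambda(A) \subset \mathcal{C}_{A,B}$.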
Corollary 3.13 Under the conditions of Theorem 3.1, $\mathbb{C}^{n\times n} = N \oplus M$ if and only
if all the eigenvalues of the Hamiltonian matrix (3.13) on the imaginary axis have
partial multiplicity two.
Proof. Since $(A, D)$ is stabilizable, we can find $K \in \mathbb{C}^{n\times n}$ such that $A - DK$ is
stable. Since $(A - DX_+) - D(K - X_+) = A - DK$, $(A - DX_+, D)$ is also stabilizable.
By Proposition 3.12, we have
$$E_{\lambda_0}(A - DX_+) \subset \mathcal{C}_{A - DX_+,\,D}$$
for every eigenvalue $\lambda_0$ of $A - DX_+$ on the imaginary axis. Since $\mathcal{C}_{A - DX_+,\,D} = \mathcal{C}_{A,D}$
by Proposition 3.8, the result follows from Theorems 3.11 and 3.6. □
3.4 Convergence rate of the Newton method
When $\mathbb{C}^{n\times n} = N \oplus M$, we let $P_N$ denote the projection onto $N$ parallel to $M$ and
let $P_M = I - P_N$. For the algebraic Riccati equation, we start the Newton iteration
with a Hermitian matrix $X_0$ for which $A - DX_0$ is stable. Although the Newton
sequence is well-defined and converges to $X_+$, we do not know whether the iterates
$X_i$ will finally fall into a special region of the form (2.2). Therefore Theorem 2.4
cannot be applied directly. Instead, we have the following result.
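Theorem 3.14 For any fixed $\theta > 0$, let
$$Q = \{i \mid \|P_M(X_i - X_+)\| > \theta \|P_N(X_i - X_+)\|\}.$$
Then there exist an integer $i_0$ and a constant $c > 0$ such that
$$\|X_i - X_+\| \le c\,\|X_{i-1} - X_+\|^2$$
for all $i$ in $Q$ for which $i \ge i_0$.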
Proof. Let $\tilde{X}_i = X_i - X_+$. Using Taylor's Theorem with (3.3) and the fact that
$\mathcal{R}'_{X_+}(P_N \tilde{X}_i) = 0$, we have
On the other hand, we have by (3.5)
and obviously,
By subtraction, we obtain after some manipulations
Writing $X_+ = X_{i-1} - \tilde{X}_{i-1}$ in (3.14) and using the last equation it is found that
In view of (3.15) and the fact that $X_i \neq X_+$ for any $i$, we have
Corollary 3.15 Assume that, for given $\theta > 0$, $\|P_M(X_i - X_+)\| > \theta \|P_N(X_i - X_+)\|$
for all $i$ large enough. Then $X_i \to X_+$ quadratically.
The above result is somewhat surprising, since it is generally believed that linear
convergence is the best we can expect when the derivative at the solution is not
invertible (see [12], [14] and [31]). We cannot rejoice in the possibility of quadratic
convergence, however, since the condition in the corollary is not easily satisfied.
Nevertheless, we can conclude that, when the convergence is not quadratic, the error
will generally be dominated by its $N$-component. This will be the basis for a numerical
strategy proposed in the next section. Meanwhile, the following theorem shows what
happens in the generic case when convergence is not quadratic.
Theorem 3.16 Assume $\mathbb{C}^{n\times n} = N \oplus M$. If the convergence of the Newton sequence
$\{X_i\}$ is not quadratic, then $\|(\mathcal{R}'_{X_i})^{-1}\| \le c\,\|X_i - X_+\|^{-1}$ for all $i \ge 1$ and some
constant $c > 0$. Moreover,
The proof of this theorem is an application of Theorem 2.4 and follows readily
from the next lemma. The map B appearing in Theorem 2.4, when applied to the
Riccati equation (at a fixed Z E N instead of &), takes the form
Lemma 3.17 If $\mathbb{C}^{n\times n} = N \oplus M$ then
$$U = \{Z \in N \mid \mathcal{B}_Z : N \to N \text{ is not invertible}\}$$
has measure zero in $N$.
Some preliminary results will be needed to prove Lemma 3.17. The first result is
Theorem 4.5.6 (a) of [35].
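Lemma 3.18 For any $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is stabilizable if and
only if
$$\operatorname{rank}(\lambda I - A \;\; B) = n$$
for every $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda \ge 0$.
Lemma 3.19 Let $J$ and $P$ be as in the proof of Theorem 3.6. Then
$$\operatorname{rank}(\lambda I - J \;\; P^{-1}DP^{-*}) = n$$
for every $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda \ge 0$.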
Proof. Since $(A, D)$ is stabilizable, there is an $X$ such that $A - DX$ is stable.
Now
which is stable. Thus $(J, P^{-1}DP^{-*})$ is a stabilizable pair. The result now follows
from Lemma 3.18. □
Lemma 3.20 Let $W$ be a Hermitian positive semidefinite matrix. If the determinant
of a principal submatrix of $W$ is zero, then the rows of $W$ containing this submatrix
must be linearly dependent.
Proof. Let $W = (w_{ij})_{i,j=1}^{n}$. We may assume without loss of generality that
the principal submatrix is $W_1 = (w_{ij})_{i,j=1}^{r}$ with rows $\alpha_1, \ldots, \alpha_r$ ($r < n$) and that
$\alpha_1 = c_2\alpha_2 + \cdots + c_r\alpha_r$ for some constants $c_2, \ldots, c_r$. Let $E(i, j(k))$ be the elementary
matrix obtained from $I$ by adding $k$ times row $j$ to row $i$. Let
Then $UWU^*$ is Hermitian positive semidefinite and has zero in the $(1,1)$ position.
Hence the first row of $UWU^*$ is zero. This means that $\beta_1 = c_2\beta_2 + \cdots + c_r\beta_r$, where
$\beta_1, \ldots, \beta_r$ are the first $r$ rows of $W$. □
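Now consider the map $\mathcal{B}_Z : N \to N$ of (3.17). Using the notation and results
in the proof of Theorem 3.6, we can write $Y = P^{-*}Y_JP^{-1}$, $Z = P^{-*}Z_JP^{-1}$ with
$Y_J, Z_J \in N_J$. Therefore, we have by (3.3)
where $P_{N_J}$ is the projection onto $N_J$ parallel to $M_J$. Let $Z_J = \operatorname{diag}(Z_1, \ldots, Z_p)$,
$Y_J = \operatorname{diag}(Y_1, \ldots, Y_p)$ and $\operatorname{diag}(D_1, \ldots, D_p)$ be the block diagonal of $P^{-1}DP^{-*}$,
where $Z_i, Y_i, D_i \in \mathbb{C}^{r_i\times r_i}$ for $i = 1, \ldots, p$. We have further
where we define linear transformations $F_{Z_i} : \mathbb{C}^{r_i\times r_i} \to \mathbb{C}^{r_i\times r_i}$ by
$$F_{Z_i}(Y_i) = Z_iD_iY_i + Y_iD_iZ_i$$
for $1 \le i \le p-1$.
For $i = 1, 2, \ldots, p-1$, let
$$U_i = \{Z_i \in \mathbb{C}^{r_i\times r_i} \mid F_{Z_i} \text{ is not invertible}\}.$$
Lemma 3.21 The set $U_i$ has measure zero in $\mathbb{C}^{r_i\times r_i}$.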
Proof. We need prove the result only for $i = 1$. Note that $F_{Z_1}$ is invertible if and
only if the equation
$$Z_1D_1Y_1 + Y_1D_1Z_1 = W_1 \qquad (3.19)$$
has a unique solution $Y_1$ for any $W_1 \in \mathbb{C}^{r_1\times r_1}$. We can rewrite (3.19) as
(see [35, p. 99]), where $\otimes$ is the Kronecker product and vec is the stacking operation
for matrices, i.e.,
when $S = (s_1 \; s_2 \; \cdots \; s_q) \in \mathbb{C}^{p\times q}$. Therefore,
Since $Z_1 \in \mathbb{C}^{r_1\times r_1}$, the determinant in (3.20) is an algebraic polynomial in $r_1^2$
variables. By the fundamental theorem of algebra, the set $U_1$ has measure zero in
$\mathbb{C}^{r_1\times r_1}$ unless
$$\det(I \otimes (Z_1D_1) + (D_1Z_1)^T \otimes I) = 0 \qquad (3.21)$$
for every $Z_1$. If (3.21) is true, we have in particular (taking $Z_1 = D_1$) $\det(I \otimes D_1^2 + (D_1^2)^T \otimes I) = 0$. Thus 0 is
an eigenvalue of the matrix $I \otimes D_1^2 + (D_1^2)^T \otimes I$. We can then find eigenvalues $\lambda_j$, $\lambda_k$
of $D_1$ such that $\lambda_j^2 + \lambda_k^2 = 0$ (see [35, Theorem 5.1.1]). Since $D_1$ is Hermitian,
all its eigenvalues are real. We then conclude that 0 is an eigenvalue of $D_1$ and
$\det D_1 = 0$. By Lemma 3.20, the first $r_1$ rows of $P^{-1}DP^{-*}$ are linearly dependent.
Thus $\operatorname{rank}(a_1 i I - J \;\; P^{-1}DP^{-*}) < n$, which contradicts Lemma 3.19. □
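The passage from the matrix equation (3.19) to a linear system rests on the identity $\operatorname{vec}(AYB) = (B^T \otimes A)\operatorname{vec}(Y)$ for column stacking. The following is a small numerical check of that step, offered as a sketch rather than the thesis's own formula: the function names and the integer test matrices are assumptions made here, and the column-stacking convention for vec is assumed as well.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def kron(a, b):
    """Kronecker product: block (i, j) of the result is a[i][j] * b."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def vec(a):
    """Stack the columns of a into one long list."""
    return [a[i][j] for j in range(len(a[0])) for i in range(len(a))]


def lifted_matrix(z, d):
    """Matrix of Y -> Z D Y + Y D Z acting on vec(Y), column stacking assumed."""
    n = len(z)
    return mat_add(kron(identity(n), mat_mul(z, d)),
                   kron(transpose(mat_mul(d, z)), identity(n)))


# Illustrative integer data (assumed), so that the comparison is exact.
Z = [[1, 2], [3, 4]]
D = [[2, 1], [1, 3]]
Y = [[5, 6], [7, 8]]
lhs = vec(mat_add(mat_mul(mat_mul(Z, D), Y), mat_mul(mat_mul(Y, D), Z)))
K = lifted_matrix(Z, D)
rhs = [sum(K[i][j] * v for j, v in enumerate(vec(Y))) for i in range(len(K))]
print(lhs == rhs)
```

Invertibility of $F_{Z_1}$ is then equivalent to the lifted matrix having nonzero determinant, which is a polynomial in the entries of $Z_1$.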
Proof of Lemma 3.17. From (3.18) we see that $\mathcal{B}_Z$ is invertible if and only if $F_{Z_i}$
is invertible for each $i$. Thus
$$U = \bigcup_{i=1}^{p-1} V_i,$$
where
By Lemma 3.21, each $V_i$ has measure zero in $N$. Therefore $U$ has measure zero in
$N$. □
Proof of Theorem 3.16. We apply Theorem 2.4, with some natural changes of
notation. Let $\tilde{x}_i = X_i - X_+$ and $\tilde{x} = X - X_+$. We are to show that there is a
$\phi_0$ such that $\mathcal{B}_{\phi_0}$ is invertible and, if $N = \operatorname{span}\{\phi_0\} \oplus N_1$ and $P_0$ is the projection on
$\operatorname{span}\{\phi_0\}$ along $N_1 \oplus M$, then there is an $i$ such that $X_i \in W(\rho_0, \theta_0, \eta_0)$ where
First, Theorem 3.2 shows that by choosing $X_0$ so that $A - DX_0$ is stable there is an
$i_1$ such that $0 < \|\tilde{x}_i\| < \rho_0$ for all $i \ge i_1$. Then, since the convergence of the Newton
sequence is not quadratic it follows from Corollary 3.15 that $\|P_M\tilde{x}_{i_2}\| \le \theta_0 \|P_N\tilde{x}_{i_2}\|$
for some $i_2 \ge i_1$. Note that $P_N\tilde{x}_{i_2} \neq 0$, since otherwise we would have $\tilde{x}_{i_2} = 0$.
Finally, if we choose $\phi_0 = P_N\tilde{x}_{i_2}$ then the last inequality of (3.22) is trivially
satisfied for $X_{i_2}$, but $\mathcal{B}_{\phi_0}$ may not be invertible. However, when $\eta_0$ is given, it follows
from Lemma 3.17 that $\phi_0$ can be chosen arbitrarily close to $P_N\tilde{x}_{i_2}$ in such a way
that $\mathcal{B}_{\phi_0}$ is invertible and $X_{i_2} \in W(\rho_0, \theta_0, \eta_0)$. Now apply Theorem 2.4. □
3.5 A modified Newton method
The Newton iteration can be used to find the maximal solution of (3.1) when the
Hamiltonian matrix has eigenvalues on the imaginary axis, while most other algorithms
are not applicable in this case (see [41]). However, the convergence of the
Newton sequence in this case is usually linear although, as Theorem 3.14 suggests,
we have not excluded the possibility of quadratic convergence. Since the Newton
iteration is an expensive procedure, we cannot be satisfied with linear convergence
alone.
For Newton's method in general Banach spaces, the initial guess must be in a
special region of the form (2.2) in order to guarantee that the modified methods we
mentioned in Chapter 2 are well-defined and give fast convergence. When we apply
the Newton iteration to find the maximal Hermitian solution of (3.1), we start with
a Hermitian matrix $X_0$ for which $A - DX_0$ is stable. It is not clear whether and
when the iterate $X_k$ will fall into that special region. We therefore take a different
approach. We are not going to recover quadratic convergence. Instead we will add a
simple modification step to the Newton iteration so that the required accuracy can
be achieved at an early stage. The following simple result is very instructive. Note
statement 2 especially, and the possibility that it presents for stepping directly to
the solution $X_+$.
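Theorem 3.22 In the setting of Theorems 3.1 and 3.2, and under the condition that
$X_i - X_+ \in N$, we have
1. $X_{i+1} - X_+ = \frac{1}{2}(X_i - X_+)$;
2. $X_+ = X_i - 2(\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i)$.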
Proof. By Taylor's Theorem,
Since $\mathcal{R}(X_+) = 0$ and $\mathcal{R}'_{X_+}(X_i - X_+) = 0$, we may also write
The second part of the theorem follows immediately. The first part follows easily
from (3.4) and the second part. □
We remark that similar conclusions can be reached for any map $F$ from a Banach
space into itself, for which $F''$ is constant. Note also that the direct sum condition
$\mathbb{C}^{n\times n} = N \oplus M$ is not required in Theorem 3.22.
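The mechanism behind Theorem 3.22 can be spelled out in a few lines. The following is a sketch under the stated hypotheses only: it uses that $\mathcal{R}$ is quadratic, so that its second derivative $\mathcal{R}''$ is constant and the Taylor expansion is exact, and that $\mathcal{R}'_X$ is invertible at the iterate $X$. The symbol $E$ for the error is introduced here for convenience.

```latex
% Hypotheses: E := X - X_+ \in N = \operatorname{Ker}\mathcal{R}'_{X_+},
% \mathcal{R}(X_+) = 0, \mathcal{R}'' constant, \mathcal{R}'_X invertible.
\begin{align*}
\mathcal{R}(X) &= \mathcal{R}(X_+) + \mathcal{R}'_{X_+}(E) + \tfrac12\,\mathcal{R}''(E,E)
               = \tfrac12\,\mathcal{R}''(E,E), \\
\mathcal{R}'_{X}(E) &= \mathcal{R}'_{X_+}(E) + \mathcal{R}''(E,E)
               = \mathcal{R}''(E,E) = 2\,\mathcal{R}(X), \\
(\mathcal{R}'_{X})^{-1}\mathcal{R}(X) &= \tfrac12\,E, \\
X - (\mathcal{R}'_{X})^{-1}\mathcal{R}(X) - X_+ &= \tfrac12\,(X - X_+), \\
X - 2\,(\mathcal{R}'_{X})^{-1}\mathcal{R}(X) &= X_+ .
\end{align*}
```

The fourth line says that an ordinary Newton step halves the error, and the last line says that doubling the Newton correction lands exactly on $X_+$.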
Example 3.1 It is instructive to revisit Example 9.2.1 of [35] at this stage. Let
It is easily verified that there is a unique solution of $\mathcal{R}(X) = 0$, namely,
Thus,
and is not stable.
If Newton iterations are started with
(in this case $A - DX_0 = -I$ and is stable), then it can be proved by induction that,
for $n = 1, 2, \ldots$,
Consequently, for $n = 1, 2, \ldots$,
It can be seen that, in this case
$N = \operatorname{span}$
Note also that, for $n = 1, 2, \ldots$,
Fortuitously, $X_0$ is chosen in such a way that $X_n - X_+ \in N$ for $n = 1, 2, \ldots$. Thus,
Theorem 3.22 applies and (3.23) holds. Furthermore, it is clear that, by applying
the modified Newton step at any $n \ge 1$, the exact solution $X_+$ is obtained.
When the direct sum condition is satisfied and the convergence of the Newton
sequence $\{X_k\}$ is not quadratic, we have at some (hopefully early) stage
$\|P_M(X_k - X_+)\| \ll \|P_N(X_k - X_+)\|$ (cf. Theorem 3.16). A very good approximate solution
could then be obtained by applying the modification step 2 of Theorem 3.22. More
precisely, we have the following result.
Theorem 3.23 Assume $\mathbb{C}^{n\times n} = N \oplus M$ and
small, and
then $\|Y_{k+1} - X_+\| < c\epsilon$ for some constant $c$ independent of $\epsilon$ and $k$.
and
If q c < ?, we h o w from the Banach Lemma (Theorem 2.2) that Rik is invertible
Since xk - X+ E N , we have by Theorem 3.22
x+ = - ~(R~~)-'R(x~)-
Hence
Note that
Therefore,
we obtain easily $\|Y_{k+1} - X_+\| \le c\epsilon$. □
Note that the condition (3.24) is satisfied when the convergence of Newton's
method is not quadratic (cf. Theorem 3.16). Note also that the matrix produced
by the double Newton step in Theorem 3.23 is at least almost stabilizing (see the
discussions in [4]).
The following algorithm is suggested by the results of this section.
Algorithm 3.24 (Modified Newton method for the CARE)
1. Choose a Hermitian matrix $X_0$ for which $A - DX_0$ is stable.
2. For $k = 0, 1, \ldots$ do:
Solve $\mathcal{R}'_{X_k}(H) = \mathcal{R}(X_k)$;
Compute $X_{k+1} = X_k - 2H$;
If $\|\mathcal{R}(X_{k+1})\| < \epsilon$, stop;
otherwise, compute $X_{k+1} = X_k - H$; if $\|\mathcal{R}(X_{k+1})\| < \epsilon$, stop.
In the above algorithm, $\|\cdot\|$ is an easily computable matrix norm (e.g. 1-norm)
and $\epsilon$ is a prescribed accuracy. The equation $\mathcal{R}'_{X_k}(H) = \mathcal{R}(X_k)$ can be rewritten as
a Lyapunov equation
which can be solved efficiently by the algorithms described in [3] and [21]. In Algorithm
3.24, all iterates except the last one are identical to those produced by the
original Newton method. Thus all good properties of the Newton method are retained.
When $A - DX_+$ has eigenvalues on the imaginary axis, the last iterate is
usually produced by the modified step. Algorithm 3.24 needs roughly 10% more
computational work per iteration, since we systematically perform one additional
Riccati function evaluation with a view to achieving the required accuracy in the
modified step as early as possible.
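The control flow of Algorithm 3.24 can be sketched in a few lines for a toy case. The code below is an illustration under explicit assumptions, not the thesis's implementation: it assumes the CARE has the form $\mathcal{R}(X) = XDX - XA - A^*X - C$, and it restricts to real diagonal $A$, $D$, $C$ and $X$ so that the Lyapunov equation decouples entrywise and no matrix solver is needed. The function names and the test problem are hypothetical.

```python
def riccati_residual(x, a, d, c):
    """Entrywise residual of R(X) = XDX - XA - A*X - C for real diagonal data."""
    return [di * xi * xi - 2.0 * ai * xi - ci for xi, ai, di, ci in zip(x, a, d, c)]


def max_norm(v):
    """1-norm of a diagonal matrix, given its diagonal entries."""
    return max(abs(t) for t in v)


def modified_newton_diagonal(a, d, c, x0, eps=1e-10, max_iter=100, double_step=True):
    """Sketch of Algorithm 3.24 for diagonal data (lists of diagonal entries).

    Returns (x, iterations, kind), where kind is "double" or "newton"
    according to which step produced the accepted iterate.
    """
    x = list(x0)
    for k in range(max_iter):
        r = riccati_residual(x, a, d, c)
        # Solve R'_{X_k}(H) = R(X_k). For diagonal data the derivative acts
        # entrywise as h -> -2 (a_i - d_i x_i) h, nonzero while A - D X_k is stable.
        h = [ri / (-2.0 * (ai - di * xi)) for ri, xi, ai, di in zip(r, x, a, d)]
        if double_step:
            trial = [xi - 2.0 * hi for xi, hi in zip(x, h)]
            if max_norm(riccati_residual(trial, a, d, c)) < eps:
                return trial, k + 1, "double"
        x = [xi - hi for xi, hi in zip(x, h)]
        if max_norm(riccati_residual(x, a, d, c)) < eps:
            return x, k + 1, "newton"
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical test problem: A = diag(0, -1), D = I, C = 0, so that X+ = 0 and
# A - D X+ has the eigenvalue 0 on the imaginary axis. X0 = I is stabilizing.
A_DIAG, D_DIAG, C_DIAG, X0 = [0.0, -1.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]
x_mod, it_mod, kind_mod = modified_newton_diagonal(A_DIAG, D_DIAG, C_DIAG, X0)
x_plain, it_plain, kind_plain = modified_newton_diagonal(
    A_DIAG, D_DIAG, C_DIAG, X0, double_step=False)
print(it_mod, kind_mod, it_plain, kind_plain)
```

On this toy problem the error component in $N$ is halved at each Newton step while the other component converges quadratically, so the double step is accepted as soon as the second component is small, well before plain Newton meets the same residual tolerance.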
3.6 Numerical results
In this section we present some numerical examples to illustrate the effectiveness of
the modified Newton step in Algorithm 3.24.
Example 3.2 Consider the CARE (3.1) with $n = 2$ and
(cf. Example 10 of [5]). The maximal solution $X_+ = (x_{ij})$ is given by
For $\epsilon = 0$, the pair $(A, D)$ is stabilizable, and
Observe that $\sigma(A - DX_+) = \{0, -2\}$.
Starting with
we perform 8 steps of the ordinary Newton iteration and then perform a modification
step. The results are recorded in Table 3.1. As usual we let $\tilde{X}_k = X_k - X_+$ and write
$\tilde{X}_k = (\tilde{x}_{ij}^{(k)})$. For this problem the convergence of the Newton iteration is linear with
common ratio $\frac{1}{2}$ (cf. Theorem 3.16). After 8 Newton iterations, $X_8$ is still not very
close to $X_+$. However, $X_8 - X_+$ is very close to an element in $N$. A modification
step then produces a very accurate approximate solution (cf. Theorem 3.23).
Table 3.1: Performance of Algorithm 3.24 for Example 3.2
When $\epsilon$ is a small positive number, $X_+$ is a stabilizing solution. According to
Theorem 3.5, the Newton sequence $\{X_k\}$ converges to $X_+$ quadratically. However,
the constant $c$ in Theorem 3.5 will be very large for very small $\epsilon$. Thus the quadratic
convergence could be exhibited only after $X_k$ gets very close to the solution. On the
other hand, as $X_k$ gets close to the solution, the corresponding Lyapunov equation
will be ill-conditioned. As a result, quadratic convergence can hardly be realized.
For example, take $\epsilon = 10^{-8}$ and $X_0$ as before. If we perform 8 Newton iterations
and then perform a modification step, we get an error of norm 0.4142D-08. Without the
modification step, the error for the Newton iterate decreases monotonically
until the 26th iteration, where its norm is 0.3738D-07.
Example 3.3 Consider the CARE (3.1) with n = 2 and
(cf. Example 11 of [5 ] ) . The maximal solution is
For $\epsilon = 0$, the pair $(A, D)$ is stabilizable. And we have
and observe that $\sigma(A - DX_+) = \{-i, i\}$.
Starting with
$$X_0 = \begin{pmatrix} 20 & 15 \\ 15 & 25 \end{pmatrix},$$
we perform 8 steps of the ordinary Newton iteration and then perform a modification
step. The results are recorded in Table 3.2. The situation for this example is very
similar to that for Example 3.2.
For É = IO-'' and the same initial guess, we perfom 8 steps of Newton iteration
and then perform a modification step. We get an error of norm 0.1000D-09. For the
Newton iteration, the error decreases monotonically until the 31st iteration with
Example 3.4 We consider the CARE (3.1) with n = 8 and a block-diagonal matrix
A with 2 x 2 blocks:
Table 3.2: Performance of Algorithm 3.24 for Example 3.3
It is readily seen that $X_+ = 0$ so that $\sigma(A - DX_+) = \{-1, 0, \pm i, \pm 2i\}$ and the purely
imaginary eigenvalues have linear elementary divisors.
We apply Algorithm 3.24 with Xo = 1 and e = IO-'^. The results are recorded in
Table 3.3. The first 9 steps are ordinary Newton iterations. The convergence of the
Newton iteration is linear with common ratio $\frac{1}{2}$ (cf. Theorem 3.16). And by (3.16)
we have $\|\mathcal{R}(X_k)\| \le c\,\|\tilde{X}_k\|^2$ in this case, as verified by the numerical results. The
last step is a modification step, which improves the accuracy dramatically.
Table 3.3: Performance of Algorithm 3.24 for Example 3.4
We carried out many other numerical experiments. The results reported above
are typical. In these experiments the convergence of the Newton method is always
observed to be linear with common ratio $\frac{1}{2}$ whenever all elementary divisors of $A - DX_+$ are linear.
Chapter 4
Examples and Conjectures
4.1 Examples
In the previous chapter, we have shown that the convergence of Newton's method
is either quadratic or linear with common ratio 1/2, provided the conditions in
Theorems 3.1 and 3.2 are satisfied and the direct sum condition is fulfilled.
In practical applications, the direct sum condition $\mathbb{C}^{n\times n} = N \oplus M$ is usually
satisfied. On the other hand, we can find examples where the eigenvalues on the
imaginary axis have elementary divisors of arbitrary degree.
Example 4.1 Consider the CARE (3.1) with
For $X = (x_{ij})$ with entries $x_{ij}$ given by binomial coefficients, all the eigenvalues of $A - DX$
are $-1$. Thus $(A, D)$ is stabilizable. We also have $X_+ = 0$ (0 is an almost stabilizing
solution, and thus maximal), and $\lambda^n$ is the only elementary divisor of $A - DX_+$. Note
also that, by Theorem 3.11, $\lambda^{2n}$ is the only elementary divisor of the Hamiltonian
matrix (3.13).
We performed many numerical experiments with different $n$ and different initial
guesses. We observed that the Newton sequence converges to 0 linearly with common
ratio $2^{-1/n}$. Of course, since $\mathcal{R}'_{X_+}$ is not invertible, the Newton iteration cannot be
continued forever. The above rates are observed when the iterates are reasonably
close to the solution $X_+ = 0$.
An analytical example displaying the observed rate of convergence is thus highly
desirable.
Example 4.2 Consider the CARE (3.1) with $D$, $A$, $C$ given by (4.1) and $n = 2$. For
Newton's method, we will use a real symmetric matrix as an initial guess so that all
subsequent iterates are real symmetric. The real symmetric matrices $X_{k+1} = (x_{ij}^{(k+1)})$ and $X_k = (x_{ij}^{(k)})$ are now related by
We dmose 2:, = I/(\IS + l)~, z:, = c for any c > 0, and 4, can be arbitrary. Then
A - DXo is stable, and we find that for k = 1,2, . . .
Thus for any matrix norm
We note that this example also serves to show that $X_0 \ge X_1$ is not true in general.
It is also worthwhile to point out that, for Example 4.2, we no longer have
for all $k \ge 1$ and some constant $p > 0$. In fact, letting
we c m s e easily t hat A - D& has a zero eigenvalue and t hus is not invertible.
If (4.2) were true for all $k \ge 1$ and some constant $p > 0$, we would have
where $q$ is a constant independent of $k$. Therefore, for $k$ large enough, we have
1 - ( ) 11 < 1. It follows from the Banach Lemma that R;- is invertible
for $k$ large enough, a contradiction. Thus, for Example 4.2, it is impossible to have
(4.2) for all $k \ge 1$ and some constant $p > 0$.
When the conditions in Theorem 3.1 and the direct sum condition are satisfied,
the convergence of Newton's method has been shown to be either quadratic or linear
with common ratio 1/2. However, quadratic convergence has never been observed
in practice (when the derivative at the solution is not invertible). If we drop the
assumption that $D \ge 0$, then we can find examples for which there are Newton
sequences converging quadratically to a solution at which the derivative is not invertible.
We note that the existence of a Hermitian solution does not imply the
existence of a maximal solution when the condition $D \ge 0$ is dropped.
Example 4.3 Consider the CARE (3.1) with
For this example, all eigenvalues of the Hamiltonian matrix (3.13) are on the imaginary
axis. It can be easily verified that
is a solution of this equation and a maximal solution does not exist. Although R;.
is not invertible, we can still construct Newton sequences converging quadratically
to the solution X'.
For n 2 O, if
with a, # O, then is invertible and the next Newton iterate is
w here
We now choose uo and Q such that
1 I d < g<
Using (4.3) and noting that
we can prove by induction that for any n > O
We assume for the moment that $a_n \neq 0$ for all $n \ge 0$. Thus the Newton sequence
$\{X_n\}$ is well-defined and we have
Therefore, X, -+ X' quadratically.
To ensure a, # O for al1 n 2 O, we may choose to be a ration al number and
choose a0 to be a transcendentd number t . In view of (4.3), we can write for each
n>_O
a, = S. ( t ) Q3"-1 ( t ) '
where $P_m(t)$, $Q_m(t)$ are polynomials of degree $m$ with integer coefficients. Thus
$a_n \neq 0$, since otherwise $t$ would be an algebraic number. One particular choice is
We note that Theorem 2.4 is not applicable to Example 4.3. Let N = ~er7Z.k.
and M = 1m7Ci.. We have
Thus $\mathbb{C}^{2\times 2} = N \oplus M$. However, the regularity condition is not satisfied for this
example. In fact, we have $P_N\mathcal{R}''(Z, Y) = 0$ for any $Y, Z \in N$.
4.2 Conjectures
As we mentioned above, we have no examples (analytic or numerical) to display
quadratic convergence when the conditions of Theorem 3.1 are satisfied and $\mathcal{R}'_{X_+}$ is
not invertible. Although the possibility of quadratic convergence is desirable, it may
be possible to prove that the convergence is always linear.
Conjecture 4.1 Assume that $D \ge 0$, $C^* = C$, $(A, D)$ is stabilizable, and the CARE
(3.1) has a Hermitian solution. If the Hamiltonian matrix (3.13) has eigenvalues on
the imaginary axis then, starting with a Hermitian matrix $X_0$ such that $A - DX_0$
is stable, the convergence of the Newton sequence $\{X_k\}$ to the maximal solution $X_+$
cannot be quadratic.
Furthermore, the first two examples given in the previous section and many other
numerical examples not reported here have supported the following conjecture.
Conjecture 4.2 Under the same assumptions as in Conjecture 4.1, if the highest
partial multiplicity for the eigenvalues of the Hamiltonian matrix (3.13) on the imaginary
axis is $2p$, then, starting with a Hermitian matrix $X_0$ such that $A - DX_0$ is
stable, the Newton sequence $\{X_k\}$ converges to the maximal solution $X_+$ linearly with
$$\lim_{k\to\infty} \|X_k - X_+\|^{1/k} = 2^{-1/p},$$
or more strongly,
$$\lim_{k\to\infty} \frac{\|X_{k+1} - X_+\|}{\|X_k - X_+\|} = 2^{-1/p}$$
for any matrix norm.
Chapter 5
Newton's Method for Discrete Algebraic Riccati
Equations
In Chapter 3, we considered Newton's method for continuous algebraic Riccati
equations (CARE). In this chapter we consider discrete algebraic Riccati equations
(DARE) of the form
$$-X + A^*XA + Q - (C + B^*XA)^*(R + B^*XB)^{-1}(C + B^*XA) = 0, \qquad (5.1)$$
where $A, Q \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $C \in \mathbb{C}^{m\times n}$, $R \in \mathbb{C}^{m\times m}$, and $Q^* = Q$, $R^* = R$. We
denote by $\mathcal{R}(X)$ the left-hand side of (5.1). The function $\mathcal{R}(X)$ and its derivatives
are much more complicated than their CARE counterparts. Nevertheless, it will be
shown that most analytical properties established in Chapter 3 for the CARE can
be extended to the DARE. The analysis here is more involved, but the line of attack
is the same.
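Let $\mathcal{D} = \{X \in \mathbb{C}^{n\times n} \mid R + B^*XB \text{ is invertible}\}$. We have $\mathcal{R} : \mathcal{D} \to \mathbb{C}^{n\times n}$.
The first Fréchet derivative of $\mathcal{R}$ at a Hermitian matrix $X \in \mathcal{D}$ is a linear map
$\mathcal{R}'_X : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ given by
$$\mathcal{R}'_X(H) = -H + \hat{A}^*H\hat{A}, \qquad (5.2)$$
where $\hat{A} = A - B(R + B^*XB)^{-1}(C + B^*XA)$. Also the second derivative at a
Hermitian matrix $X \in \mathcal{D}$, $\mathcal{R}''_X : \mathbb{C}^{n\times n} \times \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$, is given by
where $H = B(R + B^*XB)^{-1}B^*$.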
For $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is said to be d-stabilizable if there
is a $K \in \mathbb{C}^{m\times n}$ such that $A - BK$ is d-stable, i.e., all its eigenvalues are in the open
unit disk. As before, a Hermitian solution $X_+$ of (5.1) is called maximal if $X_+ \ge X$
for every Hermitian solution $X$. The following result is a modification of Theorem
13.1.1 in [35]. See also [46].
Theorem 5.1 Let $(A, B)$ be a d-stabilizable pair and assume that there is a Hermitian
solution $\tilde{X}$ of the inequality $\mathcal{R}(X) \ge 0$ for which $R + B^*\tilde{X}B > 0$. Then there
exists a maximal Hermitian solution $X_+$ of $\mathcal{R}(X) = 0$. Moreover, $R + B^*X_+B > 0$
and all the eigenvalues of $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ lie in the closed unit
disk.
Remark 5.1 In Theorem 13.1.1 of [35], it is required that $R$ be invertible. This
condition is needed for some later developments in [35], but is not necessary for
the conclusions of Theorem 13.1.1. The proof of that theorem should be slightly
modified. We have only to replace expressions of the form $Q - C^*R^{-1}C + (L - R^{-1}C)^*R(L - R^{-1}C)$
by expressions of the form $Q + L^*RL - C^*L - L^*C$. As noted
in [6], the matrix $R$ may well be singular in applications.
A Hermitian solution $X$ of (5.1) is called stabilizing (resp. almost stabilizing) if
all the eigenvalues of $A - B(R + B^*XB)^{-1}(C + B^*XA)$ are in the open (resp. closed)
unit disk. Such solutions play important roles in applications. Theorem 5.1 tells us
that, under the given conditions, the maximal solution is at least almost stabilizing.
The Newton method for the solution of (5.1) is the same as (3.4):
$$X_{i+1} = X_i - (\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i), \quad i = 0, 1, \ldots, \qquad (5.4)$$
given that the maps $\mathcal{R}'_{X_i}$ ($i = 0, 1, \ldots$) are all invertible. But now the iteration (5.4)
is closely related to the solution of the Stein equation described in the following
classical result.
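Theorem 5.2 (cf. [35, p. 100]) For any given matrices $A \in \mathbb{C}^{m\times m}$, $B \in \mathbb{C}^{n\times n}$ and
$\Gamma \in \mathbb{C}^{n\times m}$ the Stein equation $S - BSA = \Gamma$ has a unique solution if and only if
$\lambda_r\mu_s \neq 1$ for any $\lambda_r \in \sigma(A)$, $\mu_s \in \sigma(B)$.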
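When the products $\lambda_r\mu_s$ all lie strictly inside the unit disk, the unique solution of the Stein equation is the sum of the series $\sum_{k\ge 0} B^k\Gamma A^k$, which can be accumulated by a fixed-point sweep. The following is a sketch for illustration only: the function name, the number of sweeps and the test matrices are assumptions made here, and the sweep is not the solver used in the thesis.

```python
def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def solve_stein_fixed_point(a, b, gamma, sweeps=200):
    """Approximate the solution of S - B S A = Gamma by iterating S <- Gamma + B S A.

    Converges when every product of an eigenvalue of A with an eigenvalue of B
    has modulus below one (a stronger condition than the uniqueness criterion).
    """
    s = [row[:] for row in gamma]
    for _ in range(sweeps):
        s = mat_add(gamma, mat_mul(mat_mul(b, s), a))
    return s


# Illustrative data (assumed): triangular A and B with eigenvalues 0.5, 0.3 and 0.2, 0.4.
A = [[0.5, 0.1], [0.0, 0.3]]
B = [[0.2, 0.0], [0.1, 0.4]]
GAMMA = [[1.0, 2.0], [3.0, 4.0]]
S = solve_stein_fixed_point(A, B, GAMMA)
BSA = mat_mul(mat_mul(B, S), A)
residual = max(abs(S[i][j] - BSA[i][j] - GAMMA[i][j]) for i in range(2) for j in range(2))
print(residual)
```

In the scalar case with $A = 2$ and $B = 1/2$ the product of eigenvalues equals one and the equation reduces to $0 = \Gamma$, which shows why the condition $\lambda_r\mu_s \neq 1$ cannot be dropped.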
It follows from Theorem 5.2 that, under the conditions of Theorem 5.1, $\mathcal{R}'_{X_+}$ is
invertible if and only if $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ is d-stable.
When we apply Newton's method to the DARE (5.1) with $(A, B)$ d-stabilizable,
the initial matrix $X_0$ is taken such that $A - B(R + B^*X_0B)^{-1}(C + B^*X_0A)$ is d-stable.
The usual way to generate such an $X_0$ is as follows. We choose $L_0 \in \mathbb{C}^{m\times n}$
such that $A_0 = A - BL_0$ is d-stable, and take $X_0$ to be the unique solution of the
Stein equation
$$X_0 - A_0^*X_0A_0 = Q + L_0^*RL_0 - C^*L_0 - L_0^*C. \qquad (5.5)$$
In view of (5.2), the Newton iteration (5.4) can be rewritten as
and
Theorem 5.3 Under the same conditions as in Theorem 5.1 and for any $L_0 \in \mathbb{C}^{m\times n}$
such that $A_0 = A - BL_0$ is d-stable, starting with the Hermitian matrix $X_0$ determined
by (5.5), the recursion (5.6) determines a sequence of Hermitian matrices $\{X_i\}_{i=0}^{\infty}$ for
which $A - B(R + B^*X_iB)^{-1}(C + B^*X_iA)$ is d-stable for $i = 0, 1, \ldots$, $X_0 \ge X_1 \ge \cdots$,
and $\lim_{i\to\infty} X_i = X_+$.
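The behaviour described in Theorem 5.3 can be observed on a scalar DARE. The sketch below is hedged in two ways: it assumes that the recursion has the usual Kleinman-Hewer form suggested by (5.5), namely $L_i = (R + B^*X_{i-1}B)^{-1}(C + B^*X_{i-1}A)$, $A_i = A - BL_i$ and $X_i - A_i^*X_iA_i = Q + L_i^*RL_i - C^*L_i - L_i^*C$, and the function name and the test data are introduced here for illustration. Exact rational arithmetic makes the pattern of the iterates visible.

```python
from fractions import Fraction


def dare_newton_scalar(a, b, q, r, c, l0, steps):
    """Kleinman-Hewer style iteration for a real scalar DARE (assumed form).

    Starts from a gain l0 with |a - b*l0| < 1 and returns [X_0, X_1, ...].
    """
    xs = []
    gain = Fraction(l0)
    for _ in range(steps):
        a_cl = a - b * gain                        # closed-loop A_i = A - B L_i
        rhs = q + gain * r * gain - 2 * c * gain   # Q + L*RL - C*L - L*C
        x = rhs / (1 - a_cl * a_cl)                # scalar Stein equation
        xs.append(x)
        gain = (c + b * x * a) / (r + b * x * b)   # next gain from the new iterate
    return xs


# Hypothetical data: A = B = R = 1, Q = C = 0. Then R(X) = -X^2 / (1 + X),
# the maximal solution is X+ = 0, and the closed-loop value at X+ is 1,
# which lies on the unit circle.
iterates = dare_newton_scalar(1, 1, 0, 1, 0, Fraction(1, 2), 5)
ratios = [iterates[k + 1] / iterates[k] for k in range(len(iterates) - 1)]
print(iterates)
print(ratios)
```

The iterates decrease monotonically to $X_+ = 0$, and the successive error ratios approach $1/2$, in line with the linear rate discussed for the case of eigenvalues on the unit circle.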
An important feature of Newton's method applied to the CARE and DARE is
that the convergence is not merely local. The application of Newton's method to the
DARE was initiated in [26] under some conditions which, with the wisdom of hindsight,
are seen to be restrictive. Similarly, Theorem 5.3 was established in the proof of
[46, Thm. 3.1] under the additional condition that $R > 0$. The positive definiteness
of $R$ was replaced by the invertibility of $R$ in the proof of [35, Thm. 13.1.1]. As we
have pointed out in Remark 5.1, the invertibility of $R$ is also unnecessary. Note that
an $L_0$ can be produced by automatic stabilizing procedures such as the one in [48].
It should also be noted that $X_0 \ge X_1$ is generally not true, if $X_0$ is not obtained
from (5.5).
It is readily seen that $\mathcal{R}'_X$, as a function of $X$, is Lipschitz continuous on a closed
ball centered at $X_+$ and contained in $\mathcal{D}$. We note that the expression for $\mathcal{R}'_X$ when
$X$ is not Hermitian is different from the one given in (5.2). The locally quadratic
convergence of Newton's method (see Theorem 2.3), in combination with Theorem
5.3, yields the following result.
Theorem 5.4 If $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ is d-stable in Theorem 5.3,
then for the sequence $\{X_i\}_{i=0}^{\infty}$, there is a constant $c > 0$ such that, for $i = 0, 1, \ldots$,
$\|X_{i+1} - X_+\| \le c\,\|X_i - X_+\|^2$, where $\|\cdot\|$ is any given matrix norm.
When the closed-loop matrix $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ has eigenvalues
on the unit circle, $\mathcal{R}'_{X_+}$ is not invertible. This situation happens in some important
applications (see [9], for example). We will show that the convergence of Newton's
method is either quadratic or linear with common ratio $\frac{1}{2}$, provided that the eigenvalues
on the unit circle are all semi-simple (i.e. all elementary divisors corresponding
to these eigenvalues are linear). The linear convergence appears to be dominant
and, when this is the case, the efficiency of the Newton iteration can be improved
significantly by applying a double Newton step at the right time. Numerical results
are also given to illustrate these phenomena.
5.2 Interpretation of the direct sum condition for
the DARE
We will now give an interpretation of the direct sum condition for the DARE (5.1).
We assume throughout this section that the conditions of Theorem 5.1 are satisfied.
Let $X_+$ be the maximal solution of (5.1) with $\mathcal{R}'_{X_+}$ not invertible. Let $N = \operatorname{Ker} \mathcal{R}'_{X_+}$,
$M = \operatorname{Im} \mathcal{R}'_{X_+}$. We have the following interpretation of the direct sum condition.
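Theorem 5.5 $\mathbb{C}^{n\times n} = N \oplus M$ if and only if all eigenvalues of
$$A_+ = A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$$
on the unit circle are semi-simple.
Proof. Let $J$ be the Jordan canonical form for $A_+$ with $P^{-1}A_+P = J$. We find
that $K \in N$ if and only if $K = P^{-*}LP^{-1}$ for some $L \in N_J = \{Y \in \mathbb{C}^{n\times n} \mid -Y + J^*YJ = 0\}$.
Also $W \in M$ if and only if $W = P^{-*}UP^{-1}$ for some $U \in M_J = \{Y \in \mathbb{C}^{n\times n} \mid Y = -V + J^*VJ \text{ for some } V \in \mathbb{C}^{n\times n}\}$.
Therefore, $\mathbb{C}^{n\times n} = N \oplus M$ if and
only if $\mathbb{C}^{n\times n} = N_J \oplus M_J$.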
If all eigenvalues of $A_+$ on the unit circle are semi-simple, we can arrange the
Jordan blocks of $A_+$ so that
$$J = \operatorname{diag}(G_1, \ldots, G_{p-1}, G_p). \qquad (5.9)$$
Here $G_p \in \mathbb{C}^{r_p\times r_p}$ consists of Jordan blocks associated with eigenvalues in the open
unit disk, and for $k = 1, \ldots, p-1$
$$G_k = (a_k + b_k i)I \in \mathbb{C}^{r_k\times r_k}, \qquad (5.10)$$
where $a_k + b_k i$ ($k = 1, \ldots, p-1$) are distinct complex numbers on the unit circle.
Using block matrix multiplications and applying Theorem 5.2 repeatedly, we can
find easily
$$N_J = \{N = \operatorname{diag}(N_1, \ldots, N_p) \mid N_i \in \mathbb{C}^{r_i\times r_i},\ 1 \le i \le p;\ N_p = 0\},$$
$$M_J = \{M = (M_{ij}) \mid M_{ij} \in \mathbb{C}^{r_i\times r_j},\ 1 \le i, j \le p;\ M_{ii} = 0,\ 1 \le i \le p-1\}.$$
Thus, $\mathbb{C}^{n\times n} = N_J \oplus M_J$.
If $A_+$ has nonlinear elementary divisors corresponding to eigenvalues on the unit
circle, we can arrange the Jordan blocks so that the first Jordan block $J_1$ has the
following form:
where $a^2 + b^2 = 1$. In this case,
$$T = \operatorname{diag}(T_1, 0, \ldots, 0) \in N_J \cap M_J,$$
where
Note that $T = -V + J^*VJ$ for
with
Therefore, $\mathbb{C}^{n\times n} \neq N_J \oplus M_J$. □
5.3 Characterization of the direct sum condition
via a matrix pencil
We have just given a characterization of the direct sum condition, in which the sought-after
solution $X_+$ appears. In order to give a characterization which is independent
of $X_+$, we consider the matrix pencil $\lambda F_e - G_e$ (known as an "extended symplectic
pencil") with
C O R
Matrix pencils of this type were first introduced in [15] and [49], but for a different
purpose. See also [28].
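Lemma 5.6 If (5.1) has a Hermitian solution $X$, then
where $Z = -(R + B^*XB)^{-1}(C + B^*XA)$ and
$$M_e = \begin{pmatrix} I & 0 & 0\\ 0 & (A+BZ)^* & 0\\ 0 & -B^* & 0 \end{pmatrix}, \qquad N_e = \begin{pmatrix} A+BZ & 0 & B\\ 0 & I & 0\\ 0 & 0 & R + B^*XB \end{pmatrix}.$$
Proof. It can be easily verified by direct computation. In particular,
is true because $X$ is a Hermitian solution of (5.1). □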
Note that, in contrast with Proposition 15.2.1 of [35], the equality (5.11) does not require the invertibility of $R$. Note also that $\infty$ is an eigenvalue of $\lambda F_e - G_e$ with multiplicity at least $m$.
A matrix pencil $\lambda F - G$ is called regular if $\det(\lambda F - G)$ is not identically zero.
Corollary 5.7 If (5.1) has a Hermitian solution $X$, then $\lambda F_e - G_e$ is a regular pencil. Moreover, $\alpha$ is an eigenvalue of $A + BZ$ if and only if $\alpha$ and $\bar{\alpha}^{-1}$ are eigenvalues of $\lambda F_e - G_e$. A unimodular $\alpha$ is an eigenvalue of $A + BZ$ with algebraic multiplicity $k$ if and only if it is an eigenvalue of $\lambda F_e - G_e$ with algebraic multiplicity $2k$.
Proof. We have by Lemma 5.6
$$\det(\lambda F_e - G_e) = \det(\lambda M_e - N_e) = c\,\det(\lambda I - (A + BZ))\,\det(\lambda(A + BZ)^* - I),$$
where $c = (-1)^m\det(R + B^*XB) \ne 0$. Thus, $\lambda F_e - G_e$ is a regular pencil. If
$$\det(\lambda I - (A + BZ)) = (\lambda - \lambda_1)\cdots(\lambda - \lambda_n),$$
we have
$$\det(\lambda(A + BZ)^* - I) = (\bar{\lambda}_1\lambda - 1)\cdots(\bar{\lambda}_n\lambda - 1).$$
The remaining conclusions in the corollary follow easily. □
If all unimodular eigenvalues of $\lambda F_e - G_e$ are of algebraic multiplicity two, then all unimodular eigenvalues of $A + BZ$ are simple and the direct sum condition is satisfied. To give a complete characterization, we need to consider the relationship between the elementary divisors of $A + BZ$ and $\lambda F_e - G_e$.
Theorem 5.8 Let $\alpha$ be a complex number with $|\alpha| = 1$ and $X$ be a Hermitian solution of (5.1) with $R + B^*XB > 0$. If
$$\operatorname{rank}(\alpha I - A \;\; B) = n,$$
then the elementary divisors of $A + BZ$ corresponding to $\alpha$ have degrees $k_1, \ldots, k_s$ ($1 \le k_1 \le \cdots \le k_s \le n$) if and only if the elementary divisors of $\lambda F_e - G_e$ corresponding to $\alpha$ have degrees $2k_1, \ldots, 2k_s$.
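Proof. Suppose the elementary divisors of $A + BZ$ corresponding to $\alpha$ have degrees $k_1, \ldots, k_s$. By the local Smith form (see [19], for example), we can find matrix polynomials $E_\alpha(\lambda)$ and $F_\alpha(\lambda)$ invertible at $\alpha$ such that
$$\lambda I - (A + BZ) = E_\alpha(\lambda)\begin{pmatrix} I & 0\\ 0 & D \end{pmatrix}F_\alpha(\lambda), \quad (5.12)$$
where $D = \operatorname{diag}((\lambda - \alpha)^{k_1}, \ldots, (\lambda - \alpha)^{k_s})$. Replacing $\lambda$ by $\bar{\lambda}^{-1}$ in (5.12), and then taking conjugate transpose, we get
$$(A + BZ)^* - \lambda^{-1}I = K_\alpha(\lambda)\begin{pmatrix} I & 0\\ 0 & D \end{pmatrix}L_\alpha(\lambda), \quad (5.13)$$
where $K_\alpha(\lambda)$ and $L_\alpha(\lambda) = (E_\alpha(\bar{\lambda}^{-1}))^*$ are rational matrix functions invertible at $\alpha$. For any rational matrix functions $F(\lambda)$ and $G(\lambda)$, we will write $F(\lambda) \sim G(\lambda)$ if there are rational matrix functions $K(\lambda)$ and $L(\lambda)$ invertible at $\alpha$ such that $F(\lambda) = K(\lambda)G(\lambda)L(\lambda)$.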
Now, in view of Lemma 5.6, we have
$$\lambda F_e - G_e \sim \begin{pmatrix} \lambda I - (A + BZ) & 0 & -B\\ 0 & (A + BZ)^* - \lambda^{-1}I & 0\\ 0 & -B^* & -(R + B^*XB) \end{pmatrix}.$$
By (5.12) and (5.13) we have further (for $\lambda$ in a neighborhood of $\alpha$)
where we have written
( ( ) ) = ( B i ) ( ( ( ) ) ) = ( C i -(R + B'XB) = (Si,).
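Note that $(S_{ij}) < 0$ since $R + B^*XB > 0$.
Since $\operatorname{rank}(\alpha I - A \;\; B) = n$, we have at $\lambda = \alpha$
$$\operatorname{rank}(\lambda I - (A + BZ) \;\; -B) = \operatorname{rank}(\lambda I - A \;\; -B) = n.$$
Therefore, at $\lambda = \alpha$,
$$\operatorname{rank}\begin{pmatrix} I & 0 & B_{11} & B_{12}\\ 0 & D & B_{21} & B_{22} \end{pmatrix} = n$$
and thus $\operatorname{rank}(B_{21} \;\; B_{22}) = s$. Since $E_\alpha(\bar{\lambda}^{-1}) = E_\alpha(\lambda)$ at $\lambda = \alpha$, we have $(C_{ij}) = (B_{ij})^*$ at $\lambda = \alpha$. We may then assume that $B_{21}$ and $C_{12}$ are invertible in a neighborhood of $\alpha$ since, if otherwise, we can exchange the rows of $(C_{ij})$ and the columns of $(B_{ij})$ simultaneously (this will not change the negative definiteness of $(S_{ij})$ and the property that $(C_{ij}) = (B_{ij})^*$ at $\lambda = \alpha$).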
Now we obtain by block elimination
I O 0 0 O O
O D O O I O
O O O D O O
O O O I hl VIP
o o o o ~ l ~ z where
is a rational matrix function with $-V(\alpha) > 0$. It is clear that no principal minors of $V(\lambda)$ are zero at $\alpha$.
All nonzero minors of order $i$ for $W(\lambda)$ have the form $(\lambda - \alpha)^l q(\lambda)$, where $l \ge 0$ and $\alpha$ is neither a zero nor a pole of the rational function $q(\lambda)$. For $2n + m - s + 1 \le i \le 2n + m$, the smallest $l$ turns out to be $l_i = \sum_{j=1}^{i-2n-m+s} 2k_j$. For $1 \le i \le 2n + m - s$, the smallest $l$ is $l_i = 0$. By the Binet-Cauchy formula (see [36], for example), we can see that $(\lambda - \alpha)^{l_i}$ is also the greatest common divisor (of the form $(\lambda - \alpha)^l$) of all minors of order $i$ for $\lambda F_e - G_e$. Thus the elementary divisors of $\lambda F_e - G_e$ corresponding to $\alpha$ are $(\lambda - \alpha)^{2k_1}, \ldots, (\lambda - \alpha)^{2k_s}$. This proves the "only if" part of the theorem. The "if" part follows readily from the "only if" part. □
Corollary 5.9 If the conditions of Theorem 5.1 are satisfied and $\mathcal{R}'_{X_+}$ is not invertible, then $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ if and only if all the elementary divisors of $\lambda F_e - G_e$ corresponding to the eigenvalues on the unit circle are of degree two.
A previous result of the same nature as Theorem 5.8 can be found in [51]. That result is applicable to the DARE (5.1) with $C = 0$, $R > 0$, and $Q \ge 0$.
5.4 Convergence rate of the Newton method
When $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$, we let $P_\mathcal{N}$ denote the projection onto $\mathcal{N}$ parallel to $\mathcal{M}$ and let $P_\mathcal{M} = I - P_\mathcal{N}$. For the DARE (5.1), we start the Newton iteration with the Hermitian matrix $X_0$ obtained from the Stein equation (5.5). By Theorem 5.3, the Newton sequence is well-defined and converges to $X_+$. The following result shows there is some possibility of quadratic convergence and is the analogue of Theorem 3.24 for the CARE.
Lemma 5.10 For any fixed $\theta > 0$, let $Q = \{i \mid \|P_\mathcal{M}(X_i - X_+)\| > \theta\|P_\mathcal{N}(X_i - X_+)\|\}$. Then there exist an integer $i_0$ and a constant $c > 0$ such that
$$\|X_i - X_+\| \le c\|X_{i-1} - X_+\|^2$$
for all $i$ in $Q$ for which $i \ge i_0$.
Proof. Let $\tilde{X}_i = X_i - X_+$, $i = 0, 1, \ldots$, and let $L_+ = (R + B^*X_+B)^{-1}(C + B^*X_+A)$ (thus $A_+ = A - BL_+$). We have (see [35, p. 314])
and $\|L_+ - L_i\| = O(\|\tilde{X}_{i-1}\|)$, where the matrices $L_i$ are defined by (5.7). We also have
where we have written $O(\|\tilde{X}_i\|^2)$ for a term $W(X_i)$ satisfying $\|W(X_i)\| = O(\|\tilde{X}_i\|^2)$.
Now, in view of (5.1) and (5.7),
a(&) = R(Xi) - R(X+)
= -Zi + +'%A (C + BoXiA)'Li+l + (C + BœX+A)'L+
= -2; + A ; ~ ; A + - A;X~A+ + A ' ~ ; A
-{(C + B'X; A)' - (C + BmX+ A)' )Li+*
In the last equality, we have used
Thus for $i$ large enough,
for some constants $c_1$ and $c_2$.
On the other hand, for $i$ in $Q$ and large enough, we have as in the proof of Theorem 3.14
$$\|\mathcal{R}(X_i)\| \ge (c_3(\theta^{-1} + 1)^{-1} - c_4\|\tilde{X}_i\|)\|\tilde{X}_i\| \quad (5.15)$$
for some constants $c_3$ and $c_4$. Since $X_i \ne X_+$ for any $i$, we have by (5.14) and (5.15)
Therefore, we can find an $i_0$ such that $\|\tilde{X}_i\| \le c\|\tilde{X}_{i-1}\|^2$ for all $i$ in $Q$ for which $i \ge i_0$. □
Corollary 5.11 Assume that, for given $\theta > 0$, $\|P_\mathcal{M}(X_i - X_+)\| > \theta\|P_\mathcal{N}(X_i - X_+)\|$ for all $i$ large enough. Then $X_i \to X_+$ quadratically.
The condition in Corollary 5.11 appears to be not easily satisfied. The next result describes what will happen if the convergence of the Newton iteration is not quadratic.
Theorem 5.12 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$. If the convergence of the Newton sequence $\{X_i\}$ is not quadratic, then $\|(\mathcal{R}'_{X_i})^{-1}\| \le c\|X_i - X_+\|^{-1}$ for all $i \ge 1$ and some constant $c > 0$. Moreover,
$$\lim_{i\to\infty}\frac{\|X_{i+1} - X_+\|}{\|X_i - X_+\|} = \frac{1}{2}, \qquad \lim_{i\to\infty}\frac{\|P_\mathcal{M}(X_i - X_+)\|}{\|P_\mathcal{N}(X_i - X_+)\|^2} = 0.$$
The proof of this theorem will again be an application of Theorem 2.4. Some preliminary results will be needed.
The first result is Lemma 4.5.3 of [35].
Lemma 5.13 Let $A$ be $n \times n$, and $B$ be $n \times m$. Then $(A, B)$ is d-stabilizable if and only if $(A + BK, BL)$ is d-stabilizable for any $m \times n$ matrix $K$ and any $m \times p$ matrix $L$ for which $\operatorname{Im}(BL) = \operatorname{Im}B$.
The next result is Theorem 4.5.6 (b) of [35].
Lemma 5.14 Let $A$ be $n \times n$, and $B$ be $n \times m$. Then $(A, B)$ is d-stabilizable if and only if
$$\operatorname{rank}(\lambda I - A \;\; B) = n$$
for every $\lambda \in \mathbb{C}$ with $|\lambda| \ge 1$.
We then have the following easy consequence of the above two results.
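Lemma 5.15 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ and let the matrices $P$ and $J$ be as in the proof of Theorem 5.5. Then
$$\operatorname{rank}(\lambda I - J \;\; P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*}) = n \quad (5.16)$$
for every complex number $\lambda$ with $|\lambda| \ge 1$.
Proof. It is well known that $\operatorname{Im}(CC^*) = \operatorname{Im}(C)$ for any matrix $C$. Therefore,
It follows from Lemma 5.13 that $(A - BL_+, B(R + B^*X_+B)^{-1}B^*)$ is d-stabilizable. Since $P^{-1}(A - BL_+)P = J$, the pair $(J, P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*})$ is also d-stabilizable. (5.16) now follows from Lemma 5.14. □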
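For fixed $Z \in \mathcal{N}$, we consider the map $B_Z : \mathcal{N} \to \mathcal{N}$ defined by
$$B_Z(Y) = P_\mathcal{N}\mathcal{R}''_{X_+}(Z, Y).$$
Using the notation and results in the proof of Theorem 5.5, we can write $Y = P^{-*}Y_JP^{-1}$, $Z = P^{-*}Z_JP^{-1}$ with $Y_J, Z_J \in \mathcal{N}_J$. Let $H_+ = B(R + B^*X_+B)^{-1}B^*$. We have by (5.3)
where $D_+ = P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*}$, and $P_{\mathcal{N}_J}$ is the projection onto $\mathcal{N}_J$ parallel to $\mathcal{M}_J$. Let $Z_J = \operatorname{diag}(Z_1, \ldots, Z_p)$, $Y_J = \operatorname{diag}(Y_1, \ldots, Y_p)$ and $\operatorname{diag}(D_1, \ldots, D_p)$ be the block diagonal of $D_+$. We have further
where we define linear transformations $T_k : \mathbb{C}^{r_k\times r_k} \to \mathbb{C}^{r_k\times r_k}$ by
For $k = 1, 2, \ldots, p-1$, let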
Lemma 5.16 For $k = 1, 2, \ldots, p-1$, the set $U_k$ has measure zero in $\mathbb{C}^{r_k\times r_k}$.
Proof. We need only to prove the result for $k = 1$. As in Chapter 3 we can show that $U_1$ has measure zero in $\mathbb{C}^{r_1\times r_1}$ unless $\det D_1 = 0$. Note that $D_+ = P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*}$ is Hermitian positive semidefinite. If $\det D_1 = 0$, the first $r_1$ rows of $D_+$ would be linearly dependent by Lemma 3.20. Thus $\operatorname{rank}((a_1 + b_1i)I - J \;\; D_+) < n$, which contradicts Lemma 5.15. □
Lemma 5.17 If $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ then
$$U = \{Z \in \mathcal{N} \mid B_Z : \mathcal{N} \to \mathcal{N} \text{ is not invertible}\}$$
has measure zero in $\mathcal{N}$.
Proof. The result follows from (5.17) and Lemma 5.16, as in Chapter 3. □
Proof of Theorem 5.12. Note that the map $\mathcal{R}$ can be extended to a smooth map on $\mathbb{C}^{n\times n}$ without changing its values on a closed ball centered at $X_+$ and contained in $\mathcal{D}$. Now, as in Chapter 3, the proof can be completed by applying Theorem 5.3, Theorem 2.4, Corollary 5.11 and Lemma 5.17. □
When all elementary divisors of the closed-loop matrix corresponding to the eigenvalues on the unit circle are linear, we know from Theorem 5.12 that the convergence of the Newton iteration is, if not quadratic, linear with rate 1/2. The following example shows that linear convergence with rate 1/2 is not to be expected when we have elementary divisors of higher degree.
Example 5.1 Consider the DARE (5.1) with $n = 2$, $m = 1$ and
Clearly $(A, B)$ is d-stabilizable and $X_+ = 0$ (0 is the unique almost stabilizing solution in this case; see Theorem 13.5.2 of [35], for example). Note that $(\lambda - 1)^2$ is the only elementary divisor of $A_+ = A$.
For any real matrix $L_0$ such that $A_0 = A - BL_0$ is d-stable, the Newton sequence $\{X_i\}$ is well defined by the recursion (5.5)-(5.6). It is clear that the iterates $X_i$ are all real symmetric. We write for $i = 0, 1, \ldots$,
Since $A - B(R + B^*X_iB)^{-1}(C + B^*X_iA)$ is d-stable, we can deduce that $c_i \ne 0$. Since $X_i \ge 0$, we also have $a_i, b_i > 0$.
By (5.6)-(5.8), we find for i = 0,1,. . .
Since Xi -t O, we get from (5.21)
c;+l 1 lim - - i-.. - 3
It follows from (5.19) that
It then follows from (5.20), (5.22) and (5.23) that l i ~ , , bilai = O. If the conver-
gence of the Newton iteration is linear with rate p, then lin,, = p. New
by (5.19) and (5.23),
If liq,, q/af = O, we would have liq,, ai+l/ai = 112 by (5.24). In view of (5.22),
We have limi,, ~ i / a : = os, a contradiction. Thus, lir~,, ei/a: # O. Therefore,
ci+l a: lim -- = 1, i-m a:+l
and we get from (5.22) that p = 1 1 4 . The constant 118 cornes as no surprise.
This same constant also appeared in a similar situation for the CARE case (see Example 4.2).
The above example can also serve to show that $X_0 \ge X_1$ is generally not true if $X_0$ is not determined by (5.5). Take
with a > 4 0 < c < 1, and 6 r d . It is easily checked that A - B(R+ BmXoB)-'(C + BmXoA) is d-stable. We see fiom (5.19) that al - 0.5~'-" as e -+ O. Thus Xo 2 Xl
cannot be true for small $\epsilon$. As $\epsilon$ and $\delta$ go to zero, we have $\|X_0 - X_+\| \to 0$, but $\|X_1 - X_+\| \to \infty$.
As for the CARE, quadratic convergence has never been observed when the conditions of Theorem 5.1 are satisfied and the closed-loop matrix $A_+$ has eigenvalues on the unit circle. We have the following conjecture.
Conjecture 5.18 Assume that $(A, B)$ is a d-stabilizable pair and the DARE (5.1) has a maximal solution $X_+$ with $R + B^*X_+B > 0$. If $\mathcal{R}'_{X_+}$ is not invertible and $p$ is the highest degree of the elementary divisors of $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ associated with its eigenvalues on the unit circle, then the Newton sequence defined by (5.5)-(5.6) converges to $X_+$ linearly with
or, more strongly,
for any matrix norm.
5.5 Using the double Newton step
The Newton iteration can be used to find the maximal solution of the DARE (5.1) when the closed-loop matrix has eigenvalues on the unit circle, while most other algorithms are not applicable in this case (see [41]). We have shown that the convergence of Newton's method is either quadratic or linear with rate 1/2, provided that the unimodular eigenvalues are all semi-simple. Quadratic convergence has never been observed in numerical experiments and we have conjectured above that the convergence is always linear. In this section we will show that the efficiency of the Newton iteration can be improved significantly if a double Newton step is used at the right time. However, the second derivative of the Riccati function is no longer constant (compare (5.3) with (3.3) and note that, in (5.3), $H$ depends on $X$). Consequently, the improvement will not be as dramatic as for the CARE. In contrast with Theorems 3.22 and 3.23, we have:
Lemma 5.19 In the setting of Theorems 5.1 and 5.3, assume that $X_k$ is close enough to $X_+$ with $X_k - X_+ \in \mathcal{N}$ and that $\|(\mathcal{R}'_{X_k})^{-1}\| \le c\|X_k - X_+\|^{-1}$ with $c$ independent of $k$. If $Y_{k+1} = X_k - 2(\mathcal{R}'_{X_k})^{-1}\mathcal{R}(X_k)$, then $\|Y_{k+1} - X_+\| \le c_1\|X_k - X_+\|^2$ for some constant $c_1$ independent of $k$.
Proof. By Taylor's Theorem,
and then
When the direct sum condition is satisfied and the convergence of the Newton sequence $\{X_k\}$ is not quadratic, we have $\|(\mathcal{R}'_{X_k})^{-1}\| \le c\|X_k - X_+\|^{-1}$ for all $k$ (cf. Theorem 5.12). Moreover, the error $X_k - X_+$ will be dominated by its $\mathcal{N}$-component for large $k$. A much better approximate solution can then be obtained by applying the double Newton step. More precisely, we have the following result.
Theorem 5.20 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ and the convergence of the Newton iteration is not quadratic. If for some $k$, $\|X_k - X_+\|$ is small enough and $\|P_\mathcal{M}(X_k - X_+)\| \le \epsilon\|P_\mathcal{N}(X_k - X_+)\|$ with $\epsilon$ sufficiently small, and $Y_{k+1} = X_k - 2(\mathcal{R}'_{X_k})^{-1}\mathcal{R}(X_k)$, then $\|Y_{k+1} - X_+\| \le c_1\epsilon + c_2\|X_k - X_+\|^2$ for some constants $c_1$ and $c_2$ independent of $\epsilon$ and $k$.
Proof. The result follows from Lemma 5.19 and the argument used in the proof of Theorem 3.23. □
In contrast with the CARE case, it can happen that the matrix $Y_{k+1}$ in Theorem 5.20 is neither stabilizing nor almost stabilizing.
Example 5.2 (cf. [35, Example 13.2.1]). Consider the DARE (5.1) with $Q = C = 0$ and $A = B = R = 1$. Clearly $(A, B)$ is d-stabilizable and $X_+ = 0$. All eigenvalues of the closed-loop matrix are on the unit circle and semi-simple. For $L_0 = 1$, the Newton iterates are found to be
$$X_k = \frac{1}{2^{k+1} - 1}, \quad k = 0, 1, \ldots.$$
Thus, the convergence is linear with rate 1/2. If we compute $Y_{k+1}$ as in Theorem 5.20, we get
$$Y_{k+1} = -\frac{1}{(2^{k+1} - 1)(2^{k+2} - 1)}.$$
Although $Y_{k+1}$ is much more accurate than $X_{k+1}$ for large $k$, it is neither stabilizing nor almost stabilizing.
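The closed forms in Example 5.2 can be checked with exact rational arithmetic. The sketch below is only an illustration: it assumes the scalar residual $\mathcal{R}(x) = -x^2/(1+x)$ obtained by substituting $Q = C = 0$, $A = B = R = 1$ into (5.1), and the function names are hypothetical, not taken from the thesis.

```python
from fractions import Fraction


def riccati(x):
    """Scalar DARE residual for Q = C = 0, A = B = R = 1: R(x) = -x^2 / (1 + x)."""
    return -x * x / (1 + x)


def riccati_derivative(x):
    """Derivative of the scalar residual: R'(x) = -(x^2 + 2x) / (1 + x)^2."""
    return -(x * x + 2 * x) / ((1 + x) ** 2)


def newton_step(x):
    """Ordinary Newton step x - R(x) / R'(x)."""
    return x - riccati(x) / riccati_derivative(x)


def double_newton_step(x):
    """Double Newton step x - 2 R(x) / R'(x), as in Theorem 5.20."""
    return x - 2 * riccati(x) / riccati_derivative(x)


def closed_loop(x):
    """Scalar closed-loop value A - B (R + B x B)^{-1} (C + B x A) = 1 / (1 + x)."""
    return 1 - x / (1 + x)


# X_0 = 1 is the value assumed here for the starting iterate obtained from L_0 = 1.
x = Fraction(1)
iterates = [x]
for _ in range(6):
    x = newton_step(x)
    iterates.append(x)

for k, xk in enumerate(iterates):
    print(k, xk, closed_loop(xk))

# Double step taken from X_3: more accurate than X_4, but slightly negative.
y = double_newton_step(iterates[3])
print("Y_4 =", y, "closed loop =", closed_loop(y))
```

In this scalar setting the double step lands on the negative side of $X_+ = 0$, so the corresponding closed-loop value exceeds 1 in modulus, which is the loss of the (almost) stabilizing property described in the example.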
The double Newton step is useful in that it can significantly improve the accuracy of the current Newton iterate and thus find more correct digits of the exact solution. The potential problem of getting a slightly non-stabilizing approximate solution is not our concern here. Even if an exact solution with an infinite number of decimals is known, we will probably get a slightly non-stabilizing approximate solution by keeping only a finite number of decimals.
Theorem 5.20 suggests the following modification of the Newton method.
Algorithm 5.21 (Modified Newton method for the DARE).
1. Choose a matrix $L_0$ for which $A - BL_0$ is d-stable.
2. Find $X_0$ from (5.5).
In the above algorithm, $\|\cdot\|$ is an easily computable matrix norm (e.g. 1-norm) and $\epsilon$ is a prescribed accuracy. The equation $\mathcal{R}'_{X_k}(H) = \mathcal{R}(X_k)$ can be rewritten as a Stein equation $H - A_{k+1}^*HA_{k+1} = -\mathcal{R}(X_k)$, which can be solved efficiently by a variation of the Bartels-Stewart algorithm [3]. See also [18] and [41]. According to Theorem 5.20, the double Newton step will be efficient only when the current iterate is already reasonably close to the solution. This is a major difference between the DARE case and the CARE case (see Theorem 3.23). We may try the double Newton step only when the norm of the residual is small enough (less than $\sqrt{\epsilon}$, say) and save a little more computational work. In the above algorithm, all iterates except the last one are identical to those produced by the original Newton method. Thus all good properties of the Newton method are retained.
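The idea of trying a double step and keeping it only when it meets the accuracy test can be sketched on the scalar problem of Example 5.2. The steps of Algorithm 5.21 after step 2 are not legible in this copy, so the loop below is an assumed reading of the surrounding discussion rather than the published algorithm: the acceptance test $|\mathcal{R}(Y)| < \epsilon$, the starting value $X_0 = 1$, and all function names are assumptions made for illustration.

```python
from fractions import Fraction


def residual(x):
    """Scalar DARE residual of Example 5.2: R(x) = -x^2 / (1 + x)."""
    return -x * x / (1 + x)


def newton_correction(x):
    """Scalar solution H of R'_x(H) = R(x), i.e. H = R(x) / R'(x)."""
    derivative = -(x * x + 2 * x) / ((1 + x) ** 2)
    return residual(x) / derivative


def plain_newton(x0, eps, max_iter=200):
    """Ordinary Newton iteration, stopped when |R(x)| < eps."""
    x, steps = x0, 0
    while abs(residual(x)) >= eps and steps < max_iter:
        x = x - newton_correction(x)
        steps += 1
    return x, steps


def modified_newton(x0, eps, max_iter=200):
    """Newton iteration with a trial double step at every iteration (assumed logic)."""
    x, steps = x0, 0
    while abs(residual(x)) >= eps and steps < max_iter:
        h = newton_correction(x)
        steps += 1
        y = x - 2 * h                # trial double Newton step
        if abs(residual(y)) < eps:   # accept the double step and stop
            return y, steps
        x = x - h                    # otherwise keep the ordinary Newton iterate
    return x, steps


eps = Fraction(1, 10 ** 10)
x_plain, n_plain = plain_newton(Fraction(1), eps)
x_mod, n_mod = modified_newton(Fraction(1), eps)
print("plain Newton steps:", n_plain)
print("modified Newton steps:", n_mod)
```

Every iterate except the last coincides with the ordinary Newton sequence, so the sketch keeps the property stated above; only the final accepted value differs. A matrix implementation would replace `newton_correction` by a Stein equation solve.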
5.6 Numerical results
In this section we give two simple examples to illustrate the performance of the
modified Newton method.
Example 5.3 We consider the DARE (5.1) with $n = m = 2$ and
Note that $A$ and $R$ are both singular. It can be easily verified that $X_+ = \operatorname{diag}(1, 0)$ is the only solution of the DARE and the closed-loop eigenvalues are 0 and 1. Note also that $R + B^*X_+B > 0$. We take $L_0 = \operatorname{diag}(0, 2)$ so that $A_0 = A - BL_0$ is d-stable, and apply the modified Newton method with $\epsilon = 10^{-10}$. The numerical results are recorded in Table 5.1. The last iterate is produced by the double Newton step.
Example 5.4 We consider the DARE (5.1) with $n = m = 8$ and
Table 5.1 : Performance of the modified Newton method for Example 5.3
For this example, $X_+ = 0$ and the closed-loop eigenvalues are those of $A$. The unimodular eigenvalues are all semi-simple. We take $L_0 = \operatorname{diag}(-1, 1, 1, 1, 1, 0.1, 0.1, 0.1)$ so that $A_0 = A - BL_0$ is d-stable, and apply the modified Newton method with
r = IO-''. The resdts are recorded in Table 5.2. Again, the last iterate is produced
by the double Newton step.
In both examples, the convergence of the Newton method is linear and the final double Newton step reduces the error significantly.
Table 5.2: Performance of the modified Newton method for Example 5.4
Chapter 6
A Special Discrete Algebraic Riccati Equation
6.1 Introduction
In this chapter, we are concerned with the iterative solution of the matrix equation
$$X + A^*X^{-1}A = Q. \quad (6.1)$$
The matrix $Q$ is $m \times m$ Hermitian positive definite and Hermitian positive definite solutions are required. This equation has been studied recently by several authors (see [2, 16, 17, 52, 53]). The equation has applications in control theory, ladder networks, dynamic programming, stochastic filtering, and statistics. See the references given in [2].
A maximal solution of a matrix equation was defined before. Similarly, a Hermitian solution $X_-$ of a matrix equation is called minimal if $X_- \le X$ for any Hermitian solution $X$ of the matrix equation.
It is proved in [17] that if (6.1) has a positive definite solution, then it has a maximal Hermitian solution $X_+$ and a minimal Hermitian solution $X_-$. Indeed, we have $0 < X_- \le X \le X_+$ for any Hermitian solution $X$ of (6.1). Moreover, we have $\rho(X_+^{-1}A) \le 1$ (see, e.g., [52]), where $\rho(\cdot)$ is the spectral radius.
When the matrix $A$ is nonsingular, the minimal positive definite solution of (6.1) can be found via the maximal solution of another equation of the same type. The following result is a slight generalization of [17, Thm. 3.3].
Lemma 6.1 If $A$ is nonsingular, then $X$ is a solution of
$$X + A^*X^{-1}A = Q$$
if and only if $Y = Q - X$ is a solution of
$$Y + AY^{-1}A^* = Q.$$
In [17], an algorithm was presented to find the minimal solution of the equation (6.1) for the case where $A$ is singular. The algorithm was based on a recursive reduction process. The reduction process is useful in showing that the minimal positive definite solution of (6.1) exists even if the matrix $A$ is singular. However, it is usually impossible to find the minimal solution using that algorithm. The reason is simply that the singularity of general matrices can only be checked approximately due to rounding errors. For a given matrix $A$, we can use the singular value decomposition to check if the matrix is nearly singular (see [22]). However, it is extremely difficult (if not impossible) to determine if a general matrix is exactly singular or not. On the other hand, for our equation, the minimal solution is generally not continuous at a singular matrix $A$, as the following example shows.
Example 6.1 Let $Q = I$ in equation (6.1). If
we find
But for
with $0 < \epsilon < 0.3$, we find
Therefore, there is virtually no hope of finding the minimal solution in the presence of rounding errors when $A$ is singular. We will therefore limit our discussion to the maximal solution.
As we have seen in Chapter 5, invertibility of $R$ is not necessary for the general theory for the DARE (5.1). The matrix equation (6.1) is then a special case of the DARE (5.1) with $R = 0$, $A = 0$, and $B = I$. For comparison purposes, and to improve existing results, we will study the properties of some simpler methods for (6.1) before we study the application of Newton's method to this equation.
In Section 2, we discuss the convergence behaviour of the basic fixed point iteration for the maximal solution of (6.1). In Section 3, we study the convergence behaviour of inversion free variants of the basic fixed point iteration. In general, these algorithms are linearly convergent and do not perform well when there are eigenvalues of $X_+^{-1}A$ on, or near, the unit circle. In Section 4, we study the properties of the Newton iteration. Some numerical examples are reported in Section 5.
Throughout this chapter, $\|\cdot\|$ will be the spectral norm for square matrices unless otherwise noted.
6.2 Basic fixed point iteration
The maximal solution $X_+$ of (6.1) can be found by the following basic fixed point iteration:
Algorithm 6.2 $X_0 = Q$, $X_{n+1} = Q - A^*X_n^{-1}A$, $n = 0, 1, \ldots$.
For Algorithm 6.2, we have $X_0 \ge X_1 \ge \cdots$, and $\lim_{n\to\infty}X_n = X_+$ (see, e.g., [17]). The following result is given in [52].
Theorem 6.3 For any $\epsilon > 0$,
for all $n$ sufficiently large.
We now show that the above result can be improved.
Theorem 6.4 For all $n \ge 0$,
and
we have
Thus,
Hence,
and
lim sup n-00
In the last equality, we have used the fact that $\lim_{k\to\infty}\|B^k\|^{1/k} = \rho(B)$ for any square matrix $B$ and any norm. □
We mentioned earlier that $\rho(X_+^{-1}A) \le 1$ is always true. From the second part of the above result, we know that the convergence of the fixed point iteration is R-linear whenever $\rho(X_+^{-1}A) < 1$. For detailed definitions of the rates of convergence, see [43]. Zhan asked in [52] whether $\rho(X_+^{-1}A) \le 1$ implies $\|X_+^{-1}A\| \le 1$. This is not the case and, in fact, it is possible to have $\|X_+^{-1}A\| > 1$ when $\rho(X_+^{-1}A) < 1$. If $\rho(X_+^{-1}A) = 1$, the convergence of the fixed point iteration is typically sublinear.
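Example 6.2 We consider the scalar case of (6.1) with $A = \frac{1}{2}$ and $Q = 1$, i.e.,
$$X + \frac{1}{4X} = 1.$$
Clearly, $X_+ = \frac{1}{2}$ and $\rho(X_+^{-1}A) = 1$. For the fixed point iteration
$$X_{n+1} = 1 - \frac{1}{4X_n}, \quad n = 0, 1, \ldots, \quad X_0 = 1,$$
we have $X_0 > X_1 > \cdots$, and $\lim_{n\to\infty}X_n = \frac{1}{2}$. Note that
$$X_{n+1} - \frac{1}{2} = \frac{1}{2X_n}\left(X_n - \frac{1}{2}\right), \qquad \lim_{n\to\infty}\frac{X_{n+1} - X_+}{X_n - X_+} = 1,$$
i.e., the convergence is sublinear.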
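The sublinear behaviour in Example 6.2 is easy to observe with exact rational arithmetic. The sketch below applies the scalar form of the basic fixed point iteration, $X_{n+1} = q - a^2/X_n$ with $X_0 = q$, to $a = 1/2$, $q = 1$; the function name and the number of iterations are illustrative choices, not part of the thesis.

```python
from fractions import Fraction


def fixed_point_iterates(a, q, count):
    """Scalar basic fixed point iteration X_{n+1} = q - a^2 / X_n with X_0 = q."""
    x = q
    out = [x]
    for _ in range(count):
        x = q - a * a / x
        out.append(x)
    return out


xs = fixed_point_iterates(Fraction(1, 2), Fraction(1), 1000)
x_plus = Fraction(1, 2)

# Errors X_n - X_+ and the ratios of successive errors.
errors = [x - x_plus for x in xs]
ratios = [errors[n + 1] / errors[n] for n in range(len(errors) - 1)]

for n in (0, 1, 10, 100, 999):
    print(n, float(errors[n]), float(ratios[n]))
```

The error ratios increase toward 1, so no fixed contraction factor is achieved, and the error after $n$ steps decays only like a multiple of $1/n$.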
6.3 Inversion free variant of the basic fixed point iteration
In [52], Zhan proposed an inversion free variant of the basic fixed point iteration for the maximal solution of (6.1) when $Q = I$. For general positive definite $Q$, Zhan's algorithm takes the following form:
Algorithm 6.5 Take $X_0 = Q$, $Y_0 = Q^{-1}$. For $n = 0, 1, \ldots$, compute
The convergence of Algorithm 6.5 was established in [52] for $Q = I$. Zhan's result can easily be transplanted and we have
Theorem 6.6 If (6.1) has a positive definite solution then, for Algorithm 6.5, $X_0 \ge X_1 \ge \cdots$, $Y_0 \le Y_1 \le \cdots$, and $\lim_{n\to\infty}X_n = X_+$, $\lim_{n\to\infty}Y_n = X_+^{-1}$.
The problem of convergence rate for Algorithm 6.5 was not solved in [52]. We now establish the following result:
Theorem 6.7 For any $\epsilon > 0$, we have
and
for all $n$ sufficiently large.
Proof. We have from Algorithm 6.5
The inequality (6.3) follows since $\|Y_n - X_+^{-1}\| \le \|Y_{n-1} - X_+^{-1}\|$ and $\lim Y_n = X_+^{-1}$. The inequality (6.4) is true since
If $A$ is nonsingular, we have by (6.6) and (6.7)
Therefore, since $\|X_n - X_+\| \le \|X_{n-1} - X_+\|$, (6.5) is true for $n$ large enough. □
The above proof shows that Algorithm 6.5 should be modified as follows to improve the convergence properties:
Algorithm 6.8 Take $X_0 = Q$, $0 < Y_0 \le Q^{-1}$. For $n = 0, 1, \ldots$, compute
$$Y_{n+1} = Y_n(2I - X_nY_n), \qquad X_{n+1} = Q - A^*Y_{n+1}A.$$
Note that one convenient choice of $Y_0$ is $Y_0 = I/\|Q\|_\infty$. We can also use this choice of $Y_0$ in Algorithm 6.5. Theorems 6.6 and 6.7 remain true for any $Y_0$ such that $0 < Y_0 \le Q^{-1}$.
Lemma 6.9 [52] If $C$ and $P$ are Hermitian matrices of the same order with $P > 0$, then $CPC + P^{-1} \ge 2C$.
Theorem 6.10 If (6.1) has a positive definite solution and $\{X_n\}$, $\{Y_n\}$ are determined by Algorithm 6.8, then $X_0 \ge X_1 \ge \cdots$, $\lim_{n\to\infty}X_n = X_+$; $Y_0 \le Y_1 \le \cdots$, $\lim_{n\to\infty}Y_n = X_+^{-1}$.
Proof. It is clear that
is true for $n = 1$. Assume (6.8) is true for $n = k$. We have by Lemma 6.9
Therefore,
$$X_{k+1} = Q - A^*Y_{k+1}A \ge Q - A^*X_+^{-1}A = X_+.$$
Since $Y_k \le X_{k-1}^{-1} \le X_k^{-1}$, we have $Y_k^{-1} \ge X_k$. Thus,
$$Y_{k+1} - Y_k = Y_k(Y_k^{-1} - X_k)Y_k \ge 0,$$
and
$$X_{k+1} - X_k = -A^*(Y_{k+1} - Y_k)A \le 0.$$
We have now proved (6.8) for $n = k + 1$. Therefore, (6.8) is true for all $n$, and the limits $\lim_{n\to\infty}X_n$ and $\lim_{n\to\infty}Y_n$ exist. As in [52], we have $\lim X_n = X_+$, and $\lim Y_n = X_+^{-1}$. □
Theorem 6.11 For Algorithm 6.8 and any $\epsilon > 0$, we have
$$\|Y_{n+1} - X_+^{-1}\| \le (\|AX_+^{-1}\| + \epsilon)^2\|Y_n - X_+^{-1}\|$$
and
$$\|X_n - X_+\| \le \|A\|^2\|Y_n - X_+^{-1}\|$$
for all $n$ large enough. If $A$ is nonsingular, we also have
for all $n$ large enough.
Proof. The proof is very similar to that of Theorem 6.7. □
We see from the estimates in Theorem 6.7 and Theorem 6.11 that Algorithm 6.8 can be faster than Algorithm 6.5 by a factor of 2. Compared with Algorithm 6.2, Algorithm 6.8 needs more computational work per iteration. However, Algorithm 6.8 has better numerical properties since matrix inversions have been avoided. Algorithm 6.8 is particularly useful on a parallel computing system, since matrix-matrix multiplication can be carried out in parallel very efficiently (see, e.g., [22]).
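A scalar sketch may help to see how the inversion free recursion behaves. It uses the updates $Y_{n+1} = Y_n(2 - X_nY_n)$, $X_{n+1} = q - a^2Y_{n+1}$, which is how Algorithm 6.8 reads in the scalar case according to the proof of Theorem 6.10. The data $a = 0.4$, $q = 1$ (so that $X_+ = 0.8$ and $X_+^{-1}a = 0.5$) and the function name are hypothetical choices made only for this illustration.

```python
def inversion_free(a, q, y0, steps):
    """Scalar sketch of the inversion free iteration:
    Y_{n+1} = Y_n (2 - X_n Y_n), X_{n+1} = q - a^2 Y_{n+1}, starting from X_0 = q."""
    x, y = q, y0
    history = [(x, y)]
    for _ in range(steps):
        y = y * (2 - x * y)   # update of the approximate inverse, no division needed
        x = q - a * a * y     # uses the freshly updated Y_{n+1}
        history.append((x, y))
    return history


# Hypothetical data: a = 0.4, q = 1, Y_0 = 1 / q.
hist = inversion_free(0.4, 1.0, 1.0, 60)
x_final, y_final = hist[-1]
print("X_60 =", x_final, " Y_60 =", y_final)

# Ratio of successive errors in Y at an intermediate step.
y_star = 1.25
ratio = (y_star - hist[11][1]) / (y_star - hist[10][1])
print("error ratio near step 10:", ratio)
```

In this example the error ratio settles near $(a/X_+)^2 = 0.25$, in line with the squared factor in the estimate of Theorem 6.11.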
For Algorithm 6.8, R-linear convergence can be guaranteed whenever $\rho(X_+^{-1}A) < 1$. This will be a consequence of the following general result.
Theorem 6.12 [34, p. 21] Let $T$ be a (nonlinear) operator from a Banach space $E$ into itself and $x^* \in E$ be a solution of $x = Tx$. If $T$ is Fréchet differentiable at $x^*$ with $\rho(T'_{x^*}) < 1$, then the iterates $x_{n+1} = Tx_n$ ($n = 0, 1, \ldots$) converge to $x^*$, provided that $x_0$ is sufficiently close to $x^*$. Moreover, for any $\epsilon > 0$,
$$\|x_n - x^*\| \le c(x_0; \epsilon)(\rho(T'_{x^*}) + \epsilon)^n,$$
where $\|\cdot\|$ is the norm in $E$ and $c(x_0; \epsilon)$ is a constant independent of $n$.
Corollary 6.13 For Algorithm 6.8, we have
Proof. For Algorithm 6.8, we have
where the operator $T$ is defined on $\mathbb{C}^{m\times m}$ ($m$ is the order of $Q$) by
$$T(Y) = 2Y - YQY + YA^*YAY.$$
It is found that the Fréchet derivative $T'_Y : \mathbb{C}^{m\times m} \to \mathbb{C}^{m\times m}$ is given by
$$T'_Y(Z) = 2Z - ZQY - YQZ + ZA^*YAY + YA^*YAZ + YA^*ZAY.$$
Therefore,
$$T'_{X_+^{-1}}(Z) = X_+^{-1}A^*ZAX_+^{-1}.$$
Let $\lambda$ be any eigenvalue of $T'_{X_+^{-1}}$. We have
for some $Z \ne 0$. However, (6.11) is equivalent to
$$((AX_+^{-1})^T \otimes (AX_+^{-1})^*)\operatorname{vec}Z = \lambda\operatorname{vec}Z.$$
Therefore, $\lambda$ is an eigenvalue of $T'_{X_+^{-1}}$ if and only if it is an eigenvalue of $(AX_+^{-1})^T \otimes (AX_+^{-1})^*$. Note that $\sigma((AX_+^{-1})^T \otimes (AX_+^{-1})^*) = \{\lambda\bar{\mu} : \lambda, \mu \in \sigma(AX_+^{-1})\}$ (see [35, Theorem 5.1.1]). Thus, $\rho(T'_{X_+^{-1}}) = (\rho(AX_+^{-1}))^2$. By Theorem 6.12, we have
$$\limsup_{n\to\infty}\sqrt[n]{\|Y_n - X_+^{-1}\|} \le (\rho(AX_+^{-1}))^2 = (\rho(X_+^{-1}A))^2.$$
□
6.4 Newton's method
For equation (6.1), the convergence of the algorithms in the above two sections may be very slow when $X_+^{-1}A$ has eigenvalues close to (or even on) the unit circle. In these situations, Newton's method can be recommended.
When $A$ is invertible, it was shown in [17] that $X$ is a solution of $X + A^*X^{-1}A = I$ if and only if $X$ is a solution of the DARE
However, the results in Chapter 5 cannot be applied to this equation. For the maximal positive definite solution $X_+$ of (6.12), we have $X_+ = I - A^*X_+^{-1}A < I$. Therefore, $-I + X_+ > 0$ cannot be true and Theorem 5.1 cannot be applied.
We now let $m = n$ in DARE (5.1), and take $A = 0$, $R = 0$, $B = I$. The equation becomes $X + C^*X^{-1}C = Q$, which has the same form as (6.1), and the hypotheses of Theorem 5.1 are trivially satisfied. We can then apply the results in Chapter 5 to the equation (6.1) (the matrix $A$ in (6.1) has taken the place of the matrix $C$ in (5.1)).
The next result is an immediate consequence of Theorem 5.1. The first conclusion has also been proved in [17]. The second conclusion has been noted in [52].
Theorem 6.14 If (6.1) has a positive definite solution, then it has a maximal positive definite solution $X_+$ and $\rho(X_+^{-1}A) \le 1$.
By taking $L_0 = 0$ in (5.5), we obtain $A_0 = 0$ (which is certainly d-stable) and the following algorithm for equation (6.1):
Algorithm 6.15 (Newton's method for (6.1)). Take $X_0 = Q$. For $i = 1, 2, \ldots$, compute $L_i = X_{i-1}^{-1}A$, and solve
$$X_i - L_i^*X_iL_i = Q - 2L_i^*A. \quad (6.13)$$
Note that the Stein equation (6.13) is uniquely solvable when $\rho(L_i) < 1$. From Theorems 5.1, 5.3, 5.4, 5.5 and 5.12 we have:
Theorem 6.16 If (6.1) has a positive definite solution, then Algorithm 6.15 determines a sequence of Hermitian matrices $\{X_i\}_{i=0}^{\infty}$ for which $\rho(L_i) < 1$ for $i = 0, 1, \ldots$, $X_0 \ge X_1 \ge \cdots$, and $\lim_{i\to\infty}X_i = X_+$. The convergence is quadratic if $\rho(X_+^{-1}A) < 1$. If $\rho(X_+^{-1}A) = 1$ and all eigenvalues of $X_+^{-1}A$ on the unit circle are semisimple, then the convergence is either quadratic or linear with rate 1/2.
The equation (6.13) can be solved by a complex version of the algorithm described in [18]. The computational work per iteration for Algorithm 6.15 is roughly 10 to 15 times that for Algorithm 6.2.
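The linear rate 1/2 in the semisimple unimodular case can be seen on the scalar problem $X + \frac{1}{4X} = 1$ of Example 6.2. The sketch below assumes the scalar form of the Newton step for (6.1), namely $L_i = a/X_{i-1}$ followed by the scalar Stein equation $X_i - L_i^2X_i = q - 2L_ia$, which is what linearizing $X + a^2/X - q$ at $X_{i-1}$ gives; the function name is hypothetical.

```python
from fractions import Fraction


def newton_iterates(a, q, count):
    """Scalar Newton iteration for X + a^2 / X = q, written as a Stein equation:
    L_i = a / X_{i-1}, then X_i (1 - L_i^2) = q - 2 L_i a."""
    x = q  # X_0 = Q
    out = [x]
    for _ in range(count):
        l = a / x
        x = (q - 2 * l * a) / (1 - l * l)  # requires |L_i| < 1
        out.append(x)
    return out


xs = newton_iterates(Fraction(1, 2), Fraction(1), 20)
x_plus = Fraction(1, 2)

errors = [x - x_plus for x in xs]
ratios = [errors[i + 1] / errors[i] for i in range(len(errors) - 1)]

for i in (0, 1, 2, 5, 10, 19):
    print(i, float(errors[i]), float(ratios[i]))
```

Here $\rho(X_+^{-1}A) = 1$, and the error ratios approach 1/2 instead of shrinking to zero, so the convergence is linear rather than quadratic. Compared with the basic fixed point iteration on the same problem, whose error decays like a multiple of $1/n$, this is still a substantial gain.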
As we have seen in the previous convergence results, the convergence rates of various algorithms for equation (6.1) are dependent on the eigenvalues of $X_+^{-1}A$, where $X_+$ is the sought-after solution of (6.1). We will now relate the eigenvalues of $X_+^{-1}A$ to the eigenvalues of a matrix pencil which is independent of $X_+$.
The following result is an immediate consequence of Corollary 5.7 and Theorem 5.8.
Corollary 6.17 For equation (6.1), the eigenvalues of $X_+^{-1}A$ are precisely the eigenvalues of the matrix pencil
X F - G E X
inside or on the unit circle, with half of the partial multiplicities for each eigenvalue on the unit circle.
According to [17, Thm. 2.1], the equation (6.1) has a positive definite solution if and only if the rational matrix-valued function $\psi(\lambda) = Q + \lambda A + \lambda^{-1}A^*$ is regular (i.e., $\det\psi(\lambda) \ne 0$ for some $\lambda$) and $\psi(\lambda) \ge 0$ for all $\lambda$ on the unit circle. In particular, (6.1) has a positive definite solution if $\psi(\lambda) > 0$ for all $\lambda$ on the unit circle.
Let $r(T)$ be the numerical radius of $T \in \mathbb{C}^{m\times m}$, defined by $r(T) = \max\{|x^*Tx| : x \in \mathbb{C}^m, x^*x = 1\}$. Note that $r(T) \le \|T\| \le 2r(T)$ (see [27], for example).
The following lemma has been proved in [17].
Lemma 6.18 $\psi(\lambda) > 0$ for all $\lambda$ on the unit circle if and only if
$$r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}.$$
As we have seen in Theorem 6.16, the convergence of Algorithm 6.15 is quadratic if $\rho(X_+^{-1}A) < 1$. Our last theorem clarifies this condition.
Theorem 6.19 For equation (6.1), $\rho(X_+^{-1}A) < 1$ if and only if
$$r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}.$$
Proof. By Corollary 6.17, it is enough to show $r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}$ if and only if the pencil $\lambda F - G$ has no eigenvalues on the unit circle. By appropriate block elimination we find that
-XI O I
det(AF-G) = det -Q I -A*
A A O
-Q-XA' I = det
A X I
= det
= (-1)"Amdet(Q + A-'A + XA').
Therefore, $\lambda F - G$ has no eigenvalues on the unit circle if and only if $\psi(\lambda) > 0$ for all $\lambda$ on the unit circle; the latter is equivalent to $r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}$ by Lemma 6.18. □
6.5 Numerical results
In this section, we give some examples to illustrate the convergence behaviour of various algorithms we have discussed for the solution of (6.1). Double precision is used in all computations.
Example 6.3 Consider equation (6.1) with
The maximal solution (with the first 9 digits) is found to be
We compare the number of iterations required for Algorithms 6.2, 6.5 and 6.8 to get the first 6 correct digits.
Algorithm 6.2 needs 16 iterations with
Algorithm 6.5 needs 34 iterations with
Algorithm 6.8 needs 19 iterations with
We have used $Y_0 = I/\|Q\|_\infty$ for Algorithms 6.5 and 6.8. If we use $Y_0 = Q^{-1}$, the numbers of iterations are 32 and 17, respectively.
The convergence is linear for all three algorithms. The convergence of Algorithm 6.2 is slightly faster than that of Algorithm 6.8, while the convergence of Algorithm 6.8 is faster than that of Algorithm 6.5 by roughly a factor of 2. These are consistent with the convergence results in Section 2 and Section 3. For this example, we have
The next two examples will show that, for equation (6.1), Algorithm 6.15 can be much more efficient than Algorithm 6.2. Of course, for easy problems the basic fixed point iteration needs no more than 30 iterations to get a good approximate solution. In these cases we cannot expect Newton's method to perform better, since two or three iterations are usually necessary for the Newton iteration. For these two examples, we use the practical stopping criterion
for both Algorithm 6.15 and Algorithm 6.2, where $\epsilon$ is a prescribed tolerance.
Example 6.4 We consider the equation (6.1) with $Q = I$ and
$$A = \begin{bmatrix} 0.20 & 0.20 & 0.10 \\ 0.20 & 0.15 & 0.15 \\ 0.10 & 0.15 & 0.25 \end{bmatrix}.$$
For this example, $A$ is Hermitian (and hence normal). The exact maximal solution
can be found according to the formula
$$X_+ = \tfrac{1}{2}\left[I + (I - 4A^*A)^{1/2}\right],$$
which is valid for any normal matrix $A$ with $\|A\| \le 1/2$ (see [53]). The exact solution
with the first 8 digits (without rounding) is
$$X_+ = \begin{bmatrix} 0.82654545 & -0.16837666 & -0.15816879 \\ & 0.83164938 & -0.16327272 \\ \text{symm.} & & 0.82144151 \end{bmatrix}.$$
Since $r(A) = \|A\| = 1/2$ for this example, we have $\rho(X_+^{-1}A) = 1$ (cf. Theorem
6.19). The convergence of Algorithm 6.2 turns out to be sublinear. It needs 7071
iterations to satisfy (6.14) for $\epsilon = 10^{-8}$, with
$$X_{7071} = \begin{bmatrix} 0.82656902 & -0.16835309 & -0.15814522 \\ & 0.83167296 & -0.16324916 \\ \text{symm.} & & 0.82146509 \end{bmatrix}.$$
On the other hand, the convergence of Algorithm 6.15 is linear with rate 1/2 (cf.
Theorem 6.16). The stopping criterion is satisfied after 12 iterations, with
$$X_{12} = \begin{bmatrix} 0.82656580 & -0.16835631 & -0.15814844 \\ & 0.83166974 & -0.16325238 \\ \text{symm.} & & 0.82146187 \end{bmatrix}.$$
We find that both the fixed point iterate $X_{7071}$ and the Newton iterate $X_{12}$ have four
correct digits, with the Newton iterate slightly better.
If we use a double Newton step following $X_{12}$, the first eight digits of the resulting
approximate solution are the same as in the exact solution. This example shows that
Newton's method can be much more efficient than the basic fixed point iteration
when $r(Q^{-1/2}AQ^{-1/2})$ is equal or very close to 1/2.
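The Newton run in this example can be reproduced with a short pure-Python sketch. Several things here are assumptions rather than statements from the text: that equation (6.1) is $X + A^*X^{-1}A = Q$; that Algorithm 6.15 is Newton's method started at $X_0 = Q$, where each step solves the Stein equation $X_n - L_n^*X_nL_n = Q - 2L_n^*A$ with $L_n = X_{n-1}^{-1}A$ (this is what Newton's method applied to $F(X) = X + A^*X^{-1}A - Q$ gives); and that the middle row of $A$ is (0.20, 0.15, 0.15), which is consistent with $A$ being Hermitian and $\|A\| = 1/2$. The Stein equation is solved through a dense Kronecker-form linear system, which is only sensible for very small matrices, and all function names are illustrative.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][k] * x[k] for k in range(i + 1, n))) / aug[i][i]
    return x


def inverse(X):
    """Inverse of X, built column by column from the unit vectors."""
    n = len(X)
    cols = [solve(X, [float(i == j) for i in range(n)]) for j in range(n)]
    return transpose(cols)


def newton(A, Q, eps, max_iter=100):
    """Newton's method for X + A^T X^{-1} A = Q (real case), started at X = Q.
    Stops when ||X + A^T X^{-1} A - Q||_inf < eps."""
    m = len(Q)
    X = [row[:] for row in Q]
    for it in range(max_iter + 1):
        L = matmul(inverse(X), A)            # L = X^{-1} A
        T = matmul(transpose(A), L)          # A^T X^{-1} A
        res = max(sum(abs(X[i][j] + T[i][j] - Q[i][j]) for j in range(m))
                  for i in range(m))
        if res < eps:
            return X, it
        # Newton step: solve the Stein equation  Z - L^T Z L = Q - 2 L^T A.
        LtA = matmul(transpose(L), A)
        rhs = [Q[i][j] - 2.0 * LtA[i][j] for i in range(m) for j in range(m)]
        # Entry for equation (i, j) and unknown Z[k][l], both in row-major order.
        M = [[float((i, j) == (k, l)) - L[k][i] * L[l][j]
              for k in range(m) for l in range(m)]
             for i in range(m) for j in range(m)]
        z = solve(M, rhs)
        X = [z[i * m:(i + 1) * m] for i in range(m)]
    raise RuntimeError("no convergence within max_iter iterations")


Q = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
A = [[0.20, 0.20, 0.10],
     [0.20, 0.15, 0.15],   # middle row: assumed symmetric completion
     [0.10, 0.15, 0.25]]
X12, steps = newton(A, Q, eps=1e-8)
print(steps)
for row in X12:
    print(["%.8f" % v for v in row])
```

Under these assumptions the sketch should stop after 12 Newton steps, with an iterate that agrees with the matrix reported above to about seven decimal places.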
Example 6.5 We consider the equation (6.1) with
For this example, $r(Q^{-1/2}AQ^{-1/2}) < 1/2$. Thus the Newton iteration converges
quadratically to the maximal solution. It needs 8 iterations to satisfy the stopping
criterion (6.14) for $\epsilon = 10^{-12}$. The computed maximal solution is
$$\begin{bmatrix} 1.86737567 & 0.32524233 \\ \text{symm.} & 0.41582003 \end{bmatrix}.$$
The basic fixed point iteration needs 332 iterations to satisfy the same criterion. The
convergence is linear since $\rho(X_+^{-1}A) < 1$. Note that $\|X_+^{-1}A\| > 1$ for this example.
The minimal solution can be obtained by Lemma 6.1:
In computing $Y_+$, the Newton iteration needs 8 iterations to satisfy
$\|Y_n + AY_n^{-1}A^* - Q\|_\infty < 10^{-12}$, while the fixed point iteration needs 330 iterations.
Chapter 7
Some Concluding Remarks
We have studied the convergence behaviour of Newton's method for the algebraic
Riccati equations under very mild conditions. Our emphasis has been on the situation
where the closed-loop matrix has eigenvalues on the imaginary axis (unit circle) for
the CARE (DARE). When these eigenvalues are semisimple, the convergence of
Newton's method has been shown to be either quadratic or linear with rate 1/2.
However, it is not known whether quadratic convergence is indeed possible. When
there are non-semisimple closed-loop eigenvalues on the imaginary axis (unit circle)
for the CARE (DARE), the convergence behaviour of Newton's method remains a
topic for future research.
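A scalar illustration, not taken from the text, may help fix ideas about the rate 1/2. It uses a toy instance of the equation treated in Chapter 6, assumed here to be $x + a^2/x = q$ with $q = 1$ and $a = 1/2$, for which the root $x_+ = 1/2$ is a double root. Writing $e_n = x_n - 1/2$ for the error of the Newton iterate $x_n$:

```latex
\begin{align*}
f(x) &= x + \frac{1}{4x} - 1 = \frac{(x - \tfrac12)^2}{x}, \qquad
f'(x) = 1 - \frac{1}{4x^2} = \frac{(x - \tfrac12)(x + \tfrac12)}{x^2}, \\
e_{n+1} &= e_n - \frac{f(x_n)}{f'(x_n)}
         = e_n - \frac{e_n\, x_n}{x_n + \tfrac12}
         = \frac{e_n}{2(1 + e_n)}, \\
\frac{1}{e_{n+1}} &= \frac{2}{e_n} + 2
\quad\Longrightarrow\quad
e_n = \frac{1}{2^{\,n+2} - 2} \quad \text{for } x_0 = 1 \ (e_0 = \tfrac12).
\end{align*}
```

In this toy case $e_{n+1}/e_n = 1/(2(1+e_n))$ is below 1/2 from the first step and tends to 1/2, so the error reduction is linear from the very beginning.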
While the Newton iteration can be used as a correction method, an important
feature of Newton's method for the Riccati equations is that the convergence is not
merely local. In some situations, we cannot provide Newton's method with a good
starting point using other methods. Newton's method has to be used with an initial
guess not necessarily close to the solution. Since the convergence rate results we
have established are essentially asymptotic, we don't know the error reduction for
Newton's method at earlier stages. Numerical results suggest that we have linear
reduction from the very beginning, yet a general theoretical result confirming this
behaviour is still to be obtained. For the CARE (3.1) with C > 0, linear reduction
has been confirmed in [1]. For this special case, however, the Hamiltonian matrix
has no eigenvalues on the imaginary axis (cf. [35, Theorem 9.1.2]) and it is usually
easy to find a good starting point for Newton's method using other methods.
As we have seen in Examples 4.2 and 5.1, the first Newton iteration can make a big
adjustment to the initial guess. This adjustment could be harmful on some occasions.
The method of step size control has been introduced to Newton's method for the
CARE in [4] to overcome this potential difficulty and to speed up the convergence
of Newton's method. The theory in [4] is established under the assumptions that
the pair (A, D) is controllable and the Hamiltonian matrix has no eigenvalues on the
imaginary axis. It would be worthwhile to explore the possibility of applying the
method of step size control to the DARE and to the CARE where the pair (A, D) is
stabilizable and/or the Hamiltonian matrix has eigenvalues on the imaginary axis.
Bibliography
[1] J. C. Allwright, A lower bound for the solution of the algebraic Riccati equation
of optimal control and a geometric convergence rate for the Kleinman algorithm,
IEEE Trans. Autom. Control, 25 (1980), pp. 826-829.
[2] W. N. Anderson, Jr., T. D. Morley, and G. E. Trapp, Positive solutions to
$X = A - BX^{-1}B^*$, Linear Algebra Appl., 134 (1990), pp. 53-62.
[3] R. H. Bartels and G. W. Stewart, Solution of the matrix equation AX + XB = C,
Comm. ACM, 15 (1972), pp. 820-826.
[4] P. Benner and R. Byers, An exact line search method for solving generalized
continuous-time algebraic Riccati equations, IEEE Trans. Autom. Control, 43
(1998), pp. 101-107.
[5] P. Benner, A. J. Laub, and V. Mehrmann, A collection of benchmark examples
for the numerical solution of algebraic Riccati equations I: continuous-time case,
Technical Report SPC 95-22, Fakultät für Mathematik, Technische Universität
Chemnitz-Zwickau, FRG, 1995.
[6] P. Benner, A. J. Laub, and V. Mehrmann, A collection of benchmark examples
for the numerical solution of algebraic Riccati equations II: discrete-time case,
Technical Report SPC 95-23, Fakultät für Mathematik, Technische Universität
Chemnitz-Zwickau, FRG, 1995.
[7] P. Benner, V. Mehrmann, and H. Xu, A new method for computing the stable
invariant subspace of a real Hamiltonian matrix, J. Comput. Appl. Math., 86
(1997), pp. 17-43.
[8] R. C. Cavanagh, Difference equations and iterative processes, Ph.D. Diss., Univ.
of Maryland, College Park, Maryland, 1970.
[9] S. W. Chan, G. C. Goodwin, and K. S. Sin, Convergence properties of the Riccati
difference equation in optimal filtering of nonstabilizable systems, IEEE Trans.
Autom. Control, 29 (1984), pp. 110-118.
[10] D. J. Clements, B. D. O. Anderson, A. J. Laub, and J. B. Matson, Spectral
factorization with imaginary-axis zeros, Linear Algebra Appl., 250 (1997), pp.
225-252.
[11] W. A. Coppel, Matrix quadratic equations, Bull. Austral. Math. Soc., 10 (1974),
pp. 377-401.
[12] D. W. Decker, H. B. Keller, and C. T. Kelley, Convergence rates for Newton's
method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 296-314.
[13] D. W. Decker and C. T. Kelley, Newton's method at singular points I, SIAM J.
Numer. Anal., 17 (1980), pp. 66-70.
[14] D. W. Decker and C. T. Kelley, Convergence acceleration for Newton's method
at singular points, SIAM J. Numer. Anal., 19 (1982), pp. 219-229.
[15] A. Emami-Naeini and G. F. Franklin, Comments on "On the numerical solution
of the discrete-time algebraic Riccati equation", IEEE Trans. Autom. Control,
25 (1980), pp. 1015-1016.
[16] J. C. Engwerda, On the existence of a positive definite solution of the matrix
equation $X + A^TX^{-1}A = I$, Linear Algebra Appl., 194 (1993), pp. 91-108.
[17] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, Necessary and sufficient
conditions for the existence of a positive definite solution of the matrix equation
$X + A^*X^{-1}A = Q$, Linear Algebra Appl., 186 (1993), pp. 255-275.
[18] J. D. Gardiner, A. J. Laub, J. J. Amato, and C. B. Moler, Solution of the
Sylvester matrix equation $AXB^T + CXD^T = E$, ACM Trans. Math. Software,
18 (1992), pp. 223-231.
[19] I. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials, Academic Press,
New York, 1982.
[20] I. Gohberg, P. Lancaster, and L. Rodman, On Hermitian solutions of the
symmetric algebraic Riccati equation, SIAM J. Control Optimization, 24 (1986), pp.
1323-1334.
[21] G. H. Golub, S. Nash, and C. Van Loan, A Hessenberg-Schur method for the
problem AX + XB = C, IEEE Trans. Autom. Control, 24 (1979), pp. 909-913.
[22] G. H. Golub and C. F. Van Loan, Matrix Computations, Third edition, Johns
Hopkins University Press, Baltimore, MD, 1996.
[23] A. O. Griewank, Starlike domains of convergence for Newton's method at
singularities, Numer. Math., 35 (1980), pp. 95-111.
[24] A. Griewank and M. R. Osborne, Newton's method for singular problems when
the dimension of the null space is > 1, SIAM J. Numer. Anal., 18 (1981).
[25] A. Griewank and M. R. Osborne, Analysis of Newton's method at irregular
singularities, SIAM J. Numer. Anal., 20 (1983), pp. 747-773.
[26] G. A. Hewer, An iterative technique for the computation of the steady-state gains
for the discrete optimal regulator, IEEE Trans. Autom. Control, 16 (1971), pp.
382-384.
[27] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis, Cambridge University
Press, Cambridge, 1991.
[28] V. Ionescu and M. Weiss, On computing the stabilizing solution of the
discrete-time Riccati equation, Linear Algebra Appl., 174 (1992), pp. 229-238.
[29] L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces,
Pergamon, New York, 1964.
[30] C. T. Kelley, A Shamanskii-like acceleration scheme for nonlinear equations at
singular roots, Math. Comp., 47 (1986), pp. 609-623.
[31] C. T. Kelley and R. Suresh, A new acceleration method for Newton's method at
singular points, SIAM J. Numer. Anal., 20 (1983), pp. 1001-1009.
[32] C. Kenney, A. J. Laub, and M. Wette, Error bounds for Newton refinement
of solutions to algebraic Riccati equations, Math. Control Signals Systems, 3
(1990), pp. 211-224.
[33] D. L. Kleinman, On an iterative technique for Riccati equation computations,
IEEE Trans. Autom. Control, 13 (1968), pp. 114-115.
[34] M. A. Krasnoselskii, G. M. Vainikko, P. P. Zabreiko, Ya. B. Rutitskii, and V.
Ya. Stetsenko, Approximate Solution of Operator Equations, Wolters-Noordhoff
Publishing, Groningen, 1972.
[35] P. Lancaster and L. Rodman, Algebraic Riccati Equations, Oxford University
Press, 1995.
[36] P. Lancaster and M. Tismenetsky, The Theory of Matrices, Second Edition,
Academic Press, Orlando, 1985.
[37] A. J. Laub, A Schur method for solving algebraic Riccati equations, IEEE Trans.
Autom. Control, 24 (1979), pp. 913-921.
[38] W.-W. Lin and C.-S. Wang, On computing stable Lagrangian subspaces of
Hamiltonian matrices and symplectic pencils, SIAM J. Matrix Anal. Appl., 18
(1997), pp. 590-614.
[39] A. Linnemann, Numerische Methoden für Lineare Regelungssysteme, BI
Wissenschafts Verlag, Mannheim, 1993.
[40] J. B. Matson, B. D. O. Anderson, A. J. Laub, and D. J. Clements, Riccati
difference equations for discrete time spectral factorization with unit circle zeros,
Recent trends in optimization theory and applications, World Sci. Publishing,
River Edge, NJ, 1995, pp. 311-326.
[41] V. L. Mehrmann, The autonomous linear quadratic control problem, Lecture
Notes in Control and Information Sciences, Vol. 163, Springer Verlag, Berlin,
1991.
[42] D. Mustafa and K. Glover, Minimum entropy $H_\infty$ control, Lecture Notes in
Control and Information Sciences, Vol. 146, Springer Verlag, Berlin, 1990.
[43] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations
in Several Variables, Academic Press, New York, 1970.
[44] A. M. Ostrowski, Solution of Equations in Euclidean and Banach Spaces,
Academic Press, New York, 1973.
[45] L. B. Rall, Convergence of the Newton process to multiple solutions, Numer.
Math., 9 (1966), pp. 23-37.
[46] A. C. M. Ran and R. Vreugdenhil, Existence and comparison theorems for
algebraic Riccati equations for continuous- and discrete-time systems, Linear
Algebra Appl., 99 (1988), pp. 63-83.
[47] G. W. Reddien, On Newton's method for singular problems, SIAM J. Numer.
Anal., 15 (1978), pp. 993-996.
[48] V. Sima, An efficient Schur method to solve the stabilizing problem, IEEE Trans.
Autom. Control, 26 (1981), pp. 724-725.
[49] P. Van Dooren, A generalized eigenvalue approach for solving Riccati equations,
SIAM J. Sci. Comput., 2 (1981), pp. 121-135.
[50] H. K. Wimmer, Monotonicity of maximal solutions of algebraic Riccati
equations, Syst. Control Lett., 5 (1985), pp. 317-319.
[51] H. K. Wimmer, Normal forms of symplectic pencils and the discrete algebraic
Riccati equation, Linear Algebra Appl., 147 (1991), pp. 411-440.
[52] X. Zhan, Computing the extremal positive definite solutions of a matrix equation,
SIAM J. Sci. Comput., 17 (1996), pp. 1167-1174.
[53] X. Zhan and J. Xie, On the matrix equation $X + A^TX^{-1}A = I$, Linear Algebra
Appl., 247 (1996), pp. 337-345.