University of Calgary

PRISM: University of Calgary's Digital Repository

Graduate Studies Legacy Theses

Analysis and modification of Newton's method for

algebraic riccati equations

Guo, Chun-Hua

Guo, C. (1998). Analysis and modification of Newton's method for algebraic riccati equations

(Unpublished doctoral thesis). University of Calgary, Calgary, AB. doi:10.11575/PRISM/21874

http://hdl.handle.net/1880/26229

doctoral thesis

University of Calgary graduate students retain copyright ownership and moral rights for their

thesis. You may use this material in any way that is permitted by the Copyright Act or through

licensing that has been assigned to the document. For uses that are not allowable under

copyright legislation or licensing, you are required to seek permission.

Downloaded from PRISM: https://prism.ucalgary.ca

THE UNIVERSITY OF CALGARY

Analysis and Modification of Newton's Method

for Algebraic Riccati Equations

Chun-Hua Guo

A DISSERTATION

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

DEPARTMENT OF MATHEMATICS AND STATISTICS

CALGARY, ALBERTA

JANUARY, 1998

© Chun-Hua Guo 1998

The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats.

The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.

Abstract

When Newton's method is applied to find the maximal Hermitian solution of an algebraic Riccati equation, convergence can be guaranteed under moderate conditions. In particular, the initial guess does not need to be close to the solution. The convergence is quadratic if the Fréchet derivative of the Riccati function is invertible at the solution. In this dissertation, our emphasis is on the behaviour of the Newton iteration when the derivative is not invertible at the solution. For the continuous algebraic Riccati equation, the derivative is not invertible at the solution if and only if the closed-loop matrix has eigenvalues on the imaginary axis. The convergence of Newton's method is shown to be either quadratic or linear with common ratio 1/2, provided that the eigenvalues on the imaginary axis are all semisimple. Information on these eigenvalues can be obtained from the corresponding Hamiltonian matrix. Linear convergence appears to be dominant, and the efficiency of the Newton iteration can be improved dramatically by applying a double Newton step at the right time. The convergence behaviour of Newton's method is conjectured when the closed-loop matrix has non-semisimple eigenvalues on the imaginary axis. Similar results are established for the discrete algebraic Riccati equation. A matrix equation that has been studied extensively turns out to be a special discrete algebraic Riccati equation. Newton's method and some other methods are studied for this equation whenever it has a Hermitian positive definite solution.

Acknowledgements

I wish to thank my supervisor Professor Peter Lancaster for his guidance. Thanks are also due to Professors P. A. Binding and L. P. Bos for several helpful discussions. I wish to express my gratitude for the financial support I have received from The University of Calgary and from research grants of Professors P. Lancaster, P. A. Binding and R. Torrence. Finally, I thank my wife Min for her maximal support.

Table of Contents

Approval Page

Abstract

Table of Contents

1 Introduction

2 Newton's Method in Banach Spaces
    2.1 Some basic concepts and facts in functional analysis
    2.2 Newton's method in Banach spaces

3 Newton's Method for Continuous Algebraic Riccati Equations
    3.1 Preliminaries
    3.2 Interpretation of the direct sum condition for the CARE
    3.3 Characterization of the direct sum condition by the Hamiltonian matrix
    3.4 Convergence rate of the Newton method
    3.5 A modified Newton method
    3.6 Numerical results

4 Examples and Conjectures
    4.1 Examples
    4.2 Conjectures

5 Newton's Method for Discrete Algebraic Riccati Equations
    5.1 Preliminaries
    5.2 Interpretation of the direct sum condition for the DARE
    5.3 Characterization of the direct sum condition via a matrix pencil
    5.4 Convergence rate of the Newton method
    5.5 Using the double Newton step
    5.6 Numerical results

6 A Special Discrete Algebraic Riccati Equation
    6.1 Introduction
    6.2 Basic fixed point iteration
    6.3 Inversion free variant of the basic fixed point iteration
    6.4 Newton's method
    6.5 Numerical results

7 Some Concluding Remarks

Bibliography

List of Tables

3.1 Performance of Algorithm 3.24 for Example 3.2
3.2 Performance of Algorithm 3.24 for Example 3.3
3.3 Performance of Algorithm 3.24 for Example 3.4
5.1 Performance of the modified Newton method for Example 5.3
5.2 Performance of the modified Newton method for Example 5.4

Chapter 1

Introduction

In this dissertation, we are concerned with the numerical solution of algebraic Riccati equations. The continuous algebraic Riccati equation (CARE) we will consider is of the form

$$XDX - XA - A^*X - C = 0, \tag{1.1}$$

where $A, D, C \in \mathbb{C}^{n\times n}$, and $D^* = D$, $C^* = C$. The discrete algebraic Riccati equation (DARE) we will consider is of the form

$$-X + A^*XA + Q - (C + B^*XA)^*(R + B^*XB)^{-1}(C + B^*XA) = 0, \tag{1.2}$$

where $A, Q \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $C \in \mathbb{C}^{m\times n}$, $R \in \mathbb{C}^{m\times m}$, and $Q^* = Q$, $R^* = R$. These equations appear in several problem areas including linear-quadratic control, filtering theory and $H_\infty$-optimal control. See, for example, [35, 39, 41, 42].

Hermitian solutions are generally required for practical reasons. For the CARE (1.1), the Hermitian solution $X$ for which the matrix $A - DX$ (known as the "closed loop" matrix) has all its eigenvalues in the closed left half-plane is of particular interest. Such a solution is unique under some mild conditions on the matrices $A, D, C$ appearing in (1.1). For the DARE (1.2), the Hermitian solution $X$ for which the closed loop matrix $A - B(R + B^*XB)^{-1}(C + B^*XA)$ has all its eigenvalues in the closed unit disk is of particular interest. Again, such a solution is unique under mild assumptions on the matrices $A, B, C, Q, R$ appearing in (1.2). These desired solutions (for both CARE and DARE) are called almost stabilizing. They are called stabilizing if the word "closed" is replaced by "open".

The order relation on the set of Hermitian matrices is the usual one: $X \ge Y$ ($X > Y$) if $X - Y$ is positive semidefinite (definite). For $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is said to be stabilizable (resp. d-stabilizable) if there is a $K \in \mathbb{C}^{m\times n}$ such that $A - BK$ is stable (resp. d-stable), i.e., all its eigenvalues are in the open left half-plane (resp. open unit disk).

We will briefly explain how equations of the type (1.1) or (1.2) appear in linear-

quadratic control problems. See [41] for more details.

We first consider the continuous time linear-quadratic control problem:

Minimize

subject to the dynamics

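where $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $P \in \mathbb{C}^{p\times n}$, $Q \in \mathbb{C}^{p\times p}$, $R \in \mathbb{C}^{m\times m}$, $x(t) \in \mathbb{C}^n$, $y(t) \in \mathbb{C}^p$, $u(t) \in \mathbb{C}^m$, $Q^* = Q$, $R^* = R$.

If, for example, $Q > 0$, $R > 0$, and the pairs $(A, B)$, $(A^*, P^*)$ are stabilizable, then the solution of the above optimal control problem is given by the feedback law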

where X is the unique stabilizing, positive semidefinite solution of (1.1) with

The vector $x(t)$ in (1.3) can be found from the closed loop system

We now consider the discrete time linear-quadratic control problem:

Minimize

subject to the difference equation

where $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $P \in \mathbb{C}^{p\times n}$, $V \in \mathbb{C}^{p\times p}$, $R \in \mathbb{C}^{m\times m}$, $V^* = V$, $R^* = R$.

If, for example, $V \ge 0$, $R > 0$, and the pairs $(A, B)$ and $(A^*, P^*)$ are d-stabilizable, then the solution of the above discrete time optimal control problem is given by the feedback law

$$u_k = -(R + B^*XB)^{-1}B^*XA\,x_k, \tag{1.4}$$

where $X$ is the unique stabilizing, positive semidefinite solution of (1.2) with $Q = P^*VP$, $C = 0$. The vectors $x_k$ in (1.4) are then determined by the difference equation

The equations (1.1) and (1.2) are clearly nonlinear equations. However, linear algebra methods can play a big role in the solution of these equations. For example, we have the following result (see, e.g., [35, Theorem 7.1.2]) for the solution of the CARE (1.1).

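Theorem 1.1 Equation (1.1) has a solution $X \in \mathbb{C}^{n\times n}$ if and only if there is a set of vectors $v_1, \ldots, v_n$ in $\mathbb{C}^{2n}$ forming a set of Jordan chains for

$$H = \begin{bmatrix} A & -D \\ -C & -A^* \end{bmatrix}$$

and if

$$v_j = \begin{bmatrix} y_j \\ z_j \end{bmatrix},$$

where $y_j, z_j \in \mathbb{C}^n$, then $y_1, y_2, \ldots, y_n$ form a basis for $\mathbb{C}^n$. Furthermore, if $Y = [y_1 \; y_2 \; \cdots \; y_n]$ and $Z = [z_1 \; z_2 \; \cdots \; z_n]$, every solution of (1.1) has the form $X = ZY^{-1}$ for some set of Jordan chains $v_1, v_2, \ldots, v_n$ for $H$ such that $Y$ is invertible.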

From the above result, we see that the problem of finding all solutions of the equation (1.1) can be reformulated as the problem of finding all Jordan chains of the $2n \times 2n$ matrix $H$. The desired Hermitian solution (with all closed loop eigenvalues in the closed left half-plane) can be found, in principle, by choosing an appropriate set of Jordan chains or an appropriate $H$-invariant subspace. Numerical methods based on this approach are called subspace methods. The most commonly used method of this kind for solving the CARE has been the Schur vector method [37]. For recent developments on subspace methods, see [7].

In physical problems involving Riccati equations those of real type arise most frequently. Consequently, the literature is dominated numerically by papers devoted to the real case. In [37], the Schur method is discussed in detail for the real CARE

$$XDX - XA - A^TX - C = 0, \tag{1.5}$$

where $A, D, C \in \mathbb{R}^{n\times n}$, $D^T = D$, $C^T = C$. Standard assumptions for (1.5) are that $D, C \ge 0$ and the pairs $(A, D)$, $(A^T, C^T)$ are stabilizable. When these conditions are fulfilled, the equation (1.5) has a unique positive semidefinite solution $X_+$, and the closed loop matrix $A - DX_+$ has all eigenvalues in the open left half-plane. If the eigenvalues of $A - DX_+$ are not very close to the imaginary axis, the Schur method is very efficient, particularly when combined with one or two steps of correction by Newton's method. Since Newton's method is quadratically convergent in this case, it can improve the approximate solution produced by the Schur method provided that the approximate solution is already reasonably close to the exact solution. In [32], a computable estimate is given for the closeness required to make sure the Newton method does provide improvement in the first step.

However, even under the above standard conditions for the CARE (1.5), the eigenvalues of the closed loop matrix can still be very close to the imaginary axis. For example, for the equation (1.5) with $n = 1$, $A = 0$, $D = 1$, $C = \epsilon > 0$, all the standard assumptions are satisfied. But the eigenvalue of $A - DX_+$ ($-\sqrt{\epsilon}$ in this case) goes to zero as $\epsilon$ goes to zero.
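The scalar example can be checked numerically. The following is a minimal sketch, assuming the real scalar form $dx^2 - 2ax - c = 0$ of the CARE; the function name `closed_loop_eigenvalue` and the sample values of $\epsilon$ are illustrative choices and do not come from the dissertation.

```python
import math


def closed_loop_eigenvalue(eps):
    """Closed loop value a - d*x_plus for the scalar CARE with a=0, d=1, c=eps.

    The scalar equation d*x*x - 2*a*x - c = 0 has the maximal root
    x_plus = (a + sqrt(a*a + d*c)) / d (assumes d > 0 and a*a + d*c >= 0).
    """
    a, d, c = 0.0, 1.0, eps
    x_plus = (a + math.sqrt(a * a + d * c)) / d
    return a - d * x_plus


# The closed loop eigenvalue equals -sqrt(eps) and approaches the
# imaginary axis (here: zero) as eps decreases.
for eps in (1.0, 1e-2, 1e-4, 1e-8):
    print(eps, closed_loop_eigenvalue(eps))
```

Each printed pair shows the closed loop eigenvalue moving toward zero while every standard assumption remains satisfied.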

For the CARE (1.1) arising in $H_\infty$-optimal control problems, the matrices $D$ and $C$ are often not both positive semidefinite and the closed loop matrix can have eigenvalues precisely on the imaginary axis. Also, according to [9], the DARE (1.2) having closed loop eigenvalues on the unit circle has applications in the filtering and control of systems with purely deterministic disturbances such as sinusoids and drift components.

In this dissertation, we are mainly interested in the situations in which the

closed loop matrix has eigenvalues on the imaginary axis (unit circle) for the CARE

(DARE). There has been some recent work related to the study of algebraic Riccati

equations in these situations. See, for example, [10] and [40].

In these situations, subspace methods typically run into difficulties (see [41]). For the CARE (1.1), for example, in order to obtain the desired Hermitian solution in the way described in Theorem 1.1, we need to choose half of the vectors in the Jordan chains associated with the eigenvalues of the Hamiltonian matrix $H$ on the imaginary axis. Note that the lengths of these Jordan chains are necessarily even (see [35]). If the closed loop matrix has only one simple eigenvalue $\lambda$ on the imaginary axis, then $\lambda$ is the only eigenvalue of $H$ on the imaginary axis and the corresponding Jordan chain consists of two vectors $\alpha_1, \alpha_2$, where $\alpha_1$ is the eigenvector (see [35]). In this case, we can only pick up the vector $\alpha_1$ when forming an $H$-invariant subspace. If the closed loop matrix has just two simple eigenvalues $\lambda$ and $\mu$ on the imaginary axis, then $\lambda$ and $\mu$ are the only eigenvalues of $H$ on the imaginary axis and the corresponding Jordan chains are $\{\alpha_1, \alpha_2\}$ and $\{\beta_1, \beta_2\}$, where $\alpha_1$ and $\beta_1$ are eigenvectors. In this case, when forming an $H$-invariant subspace, we can take $\{\alpha_1, \alpha_2\}$, or $\{\alpha_1, \beta_1\}$, or $\{\beta_1, \beta_2\}$. It is not clear which pair should be used to obtain the desired Hermitian solution. The situation is even more complicated if the closed loop matrix has more eigenvalues on the imaginary axis.

We will consider the application of Newton's method for the solution of the

algebraic Riccati equations in these situations. It will be seen that the convergence

behaviour of Newton's method is dependent on the partial multiplicities of the closed

loop eigenvalues on the imaginary axis (unit circle) for the CARE (DARE). However,

we do not need any information on the eigenstructure of the closed loop matrix in

carrying out the Newton iteration. We are thus free of the above-mentioned concerns

associated with subspace methods.

It should be mentioned that some progress has been recently made (see [38]) in the study of subspace methods when the closed loop matrix has eigenvalues on the imaginary axis (unit circle) for the CARE (DARE). However, some correction method is still needed to improve the accuracy of the approximate solution obtained by subspace methods. Newton's method cannot guarantee improvement no matter how close the approximate solution is to the exact solution. It may even produce a new approximate solution that is worse than a random guess. In these situations, Newton's method may have to be used directly.

The plan of this dissertation is as follows. In Chapter 2, we will review some basic concepts and results from functional analysis. Some local results on Newton's method in Banach spaces are also reviewed with emphasis on Newton's method at singular roots. In Chapter 3, we mainly study Newton's method for the CARE with eigenvalues on the imaginary axis. We will prove that the convergence of Newton's method is either quadratic or linear with common ratio 1/2 provided that all closed loop eigenvalues on the imaginary axis are semisimple. We will also discuss in this chapter how the efficiency of Newton's method can be improved. Some interesting examples of the CARE are given in Chapter 4. The convergence behaviour of Newton's method when the closed loop matrix has an arbitrary eigenstructure is conjectured in this chapter. Chapter 5 is devoted to the more involved DARE. Similar results are established for DAREs having closed loop eigenvalues on the unit circle. Matrix pencils are used to characterize the eigenvalues of the closed loop matrix. The results are established under very mild conditions for the given matrices in the DARE (1.2). In Chapter 6, we study a special DARE which also appears in many applications. The special DARE has the form $X + A^*X^{-1}A = Q$, with $Q$ being Hermitian positive definite. Some convergence results are established for some simpler iterative methods. Newton's method is then studied and compared with these methods. In the last chapter, Chapter 7, we make some concluding remarks.

Chapter 2

Newton's Method in Banach Spaces

2.1 Some basic concepts and facts in functional analysis

In this section, we will review some basic concepts and facts from functional analysis.

Let $E$ be a linear space over the field $\mathbb{C}$ of complex numbers. A norm on $E$ is a function $x \mapsto \|x\|$ from $E$ to the field $\mathbb{R}$ of reals having the properties

$$\|x\| \ge 0 \text{ with } \|x\| = 0 \text{ if and only if } x = 0, \qquad \|cx\| = |c|\,\|x\|, \qquad \|x + y\| \le \|x\| + \|y\|$$

for all $x, y \in E$ and $c \in \mathbb{C}$.

With such a norm we can talk about convergence in $E$: a sequence $\{x_k\}$ in $E$ converges to $x$ in $E$, and $x$ is the limit of the sequence, if the sequence of real numbers $\{\|x_k - x\|\}$ converges to zero; if such $x$ exists, then the sequence is convergent. A sequence $\{x_k\}$ is a Cauchy sequence if, for all $\epsilon > 0$, there exists an integer $N$ such that $\|x_m - x_n\| < \epsilon$ whenever $m, n > N$. If every Cauchy sequence in $E$ is convergent, then $E$ is complete. A (complex) Banach space is a (complex) linear space which has a norm and is complete.

A subset $S$ of a complex Banach space $E$ is a subspace if

$$x + y \in S, \qquad cx \in S$$

for all $x, y \in S$, $c \in \mathbb{C}$.

Let $S_1$ and $S_2$ be subspaces of $E$. If for all $x \in E$ there are a unique $x_1 \in S_1$ and a unique $x_2 \in S_2$ such that $x = x_1 + x_2$, then $E$ is a direct sum of $S_1$ and $S_2$, written as $E = S_1 \oplus S_2$. In this case, $x_1$ is the projection of $x$ onto $S_1$ parallel to $S_2$ and $x_2$ is the projection of $x$ onto $S_2$ parallel to $S_1$.

An operator $A$ from a complex Banach space $E_1$ into a complex Banach space $E_2$ is linear if

$$A(c_1x_1 + c_2x_2) = c_1A(x_1) + c_2A(x_2)$$

for all $x_1, x_2 \in E_1$ and $c_1, c_2 \in \mathbb{C}$. A linear operator $A$ is bounded if there is a constant $k$ such that

$$\|Ax\| \le k\|x\|$$

for all $x \in E_1$. The linear space of all bounded linear operators from $E_1$ into $E_2$ is denoted by $\mathcal{L}(E_1, E_2)$. Note that $\mathcal{L}(E_1, E_2)$ is a Banach space with the norm defined by

$$\|A\| = \sup_{x \ne 0} \frac{\|Ax\|}{\|x\|}.$$

When $E_1 = E_2 = E$, we will write $\mathcal{L}(E)$ for $\mathcal{L}(E, E)$.

The null space (or kernel) of $A \in \mathcal{L}(E_1, E_2)$ is defined by

$$\operatorname{Ker}(A) = \{x \in E_1 \mid Ax = 0\}.$$

The range (or image) of $A \in \mathcal{L}(E_1, E_2)$ is defined by

$$\operatorname{Im}(A) = \{y \in E_2 \mid y = Ax \text{ for some } x \in E_1\}.$$

Let $D$ be an open set in $E_1$. A map $F : D \to E_2$ is Fréchet differentiable at $x \in D$ if there is an $A \in \mathcal{L}(E_1, E_2)$ such that

$$\lim_{h \to 0} \frac{\|F(x + h) - F(x) - Ah\|}{\|h\|} = 0.$$

The linear operator $A$ is unique and is called the Fréchet derivative of $F$ at $x$, denoted by $F'(x)$ or $F'_x$.

The second Fréchet derivative of $F$ at $x$, denoted by $F''(x)$ or $F''_x$, is the first Fréchet derivative of $F'$ at $x$. Thus $F''(x)$ is a linear map from $E_1$ into $\mathcal{L}(E_1, E_2)$, or equivalently, a bilinear map from $E_1 \times E_1$ into $E_2$. Other higher derivatives can be defined similarly.

We have the following intermediate value theorem.

Theorem 2.1 Assume that $F : D \subset E_1 \to E_2$ is Fréchet differentiable on a convex set $D_0 \subset D$. Then, for all $x, y \in D_0$,

$$\|Fy - Fx\| \le \sup_{0 \le t \le 1} \|F'(x + t(y - x))\|\,\|y - x\|.$$

For the proof of Theorem 2.1, see [29], for example.

A map $F : D \subset E_1 \to E_2$ is Lipschitz continuous on $D_1 \subset D$ if there is a constant $\gamma$ such that

$$\|Fx - Fy\| \le \gamma\|x - y\|$$

for all $x, y \in D_1$.

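A map $A \in \mathcal{L}(E)$ has an inverse if $\operatorname{Ker}(A) = \{0\}$ and $\operatorname{Im}(A) = E$. The inverse is defined by

$$A^{-1}y = x, \quad \text{if } Ax = y.$$

A map $A \in \mathcal{L}(E)$ is invertible if $A^{-1} \in \mathcal{L}(E)$. If $E$ is finite dimensional, then $A \in \mathcal{L}(E)$ is invertible if and only if $\operatorname{Ker}(A) = \{0\}$, or equivalently, $\operatorname{Im}(A) = E$.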

Concerning the inverses of linear operators, we have the following classical result.

Theorem 2.2 (Banach Lemma) Let $E$ be a Banach space and $A, B \in \mathcal{L}(E)$. If $A$ is invertible and $\|I - AB\| < 1$ then $B$ is also invertible. Moreover,

2.2 Newton's method in Banach spaces

Let $F$ be a map from a Banach space $E$ into itself. Assume that $x^* \in E$ is a solution of

$$F(x) = 0.$$

The Newton iteration for the solution $x^*$ is given by

$$x_{n+1} = x_n - F'(x_n)^{-1}F(x_n), \quad n = 0, 1, \ldots, \tag{2.1}$$

where $x_0$ is an initial guess.

Theorem 2.3 If $F'(x)$ is Lipschitz continuous in a neighborhood of $x^*$ and $F'(x^*)$ is invertible, then the sequence $\{x_n\}$ defined by (2.1) is well-defined and will converge quadratically to $x^*$ if the initial guess $x_0$ is sufficiently close to $x^*$.

If $F'(x^*)$ is not invertible, the root $x^*$ is said to be singular. In this case, the behaviour of Newton's method is much more complicated.
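The contrast between a regular and a singular root is already visible for scalar maps. The sketch below is a one-dimensional illustration only, with names and starting values chosen for the example: it applies the Newton iteration to $f(x) = x^2 - 1$ (root 1, where $f'(1) \ne 0$) and to $f(x) = x^2$ (root 0, where $f'(0) = 0$).

```python
def newton(f, df, x0, steps):
    """Return the Newton iterates x0, x1, ..., x_steps for a scalar map f."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs


# Regular root: f(x) = x^2 - 1 has the root 1 with f'(1) = 2.
regular = newton(lambda x: x * x - 1.0, lambda x: 2.0 * x, 2.0, 6)
# Singular root: f(x) = x^2 has the root 0 with f'(0) = 0.
singular = newton(lambda x: x * x, lambda x: 2.0 * x, 2.0, 6)

errors_regular = [abs(x - 1.0) for x in regular]
errors_singular = [abs(x) for x in singular]

# Ratios of successive errors at the singular root.
ratios_singular = [e1 / e0 for e0, e1 in zip(errors_singular, errors_singular[1:])]

print(errors_regular)
print(ratios_singular)
```

At the regular root the error is roughly squared in each step, while at the singular root each step only halves it.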

Newton's method at singular roots has been studied by many authors, see for example [8, 12, 13, 14, 23, 24, 30, 31, 44, 45, 47].

In [45], Rall considered Newton's method at singular roots for maps from $\mathbb{R}^n$ into itself. He analysed the one variable case in detail and initiated the study for the higher dimensional case. The results he obtained for the higher dimensional case are partly incorrect. The first correct result on the convergence of Newton's method at singular roots is due to Cavanagh [8]. However, he also made the stringent assumption (as in Rall's work) that $F'$ is invertible in a deleted neighborhood of the singular root. Due to this restrictive assumption, his result is not generally applicable to the study of Newton's method for algebraic Riccati equations.

Newton's method at singular roots was also studied in [44] in the Banach space setting. The conditions in [44, Theorem 40.1] are also very stringent and are not satisfied for the very simple CARE given in [35, Example 9.2.1].

The restriction that $F'$ is invertible in a deleted neighborhood of the singular root is dropped in Reddien [47]. However, Reddien's result (established in the Banach space setting) needs other conditions which are generally not satisfied for algebraic Riccati equations. On the other hand, the results in [23, 24] are given only for maps from $\mathbb{R}^n$ into itself.

The most general result for Newton's method at singular points is given by Kelley in [30]. That result is a combination of many previous results in this area. The result is general enough for application to algebraic Riccati equations. We will now give a review of that result.

Assume that $F$ is sufficiently smooth and that $F'(x^*)$ has a finite dimensional null space $N$ and a closed range $M$ with the direct sum condition $E = N \oplus M$. We define $P_N$ to be the projection onto $N$ parallel to $M$ and let $P_M = I - P_N$. Assume further that the following regularity condition holds: there is a $\phi_0 \in N$ such that the map $B$ on $N$ given by $B = P_NF''(x^*)(\phi_0, \cdot)$ is invertible. Linear convergence with common ratio 1/2 is then predicted for an appropriate initial guess. The following result is a restatement of [30, Theorem 1.1].

Theorem 2.4 Let $E = N \oplus M$, let $\phi_0$ be chosen so that $B$ is invertible, and let $N = \operatorname{span}\{\phi_0\} \oplus N_1$ for some subspace $N_1$. Write $\tilde{x} = x - x^*$ and let

where $P_0$ is the projection onto $\operatorname{span}\{\phi_0\}$ parallel to $M \oplus N_1$. If $x_0 \in W(\rho_0, \theta_0, \eta_0)$ for $\rho_0, \theta_0, \eta_0$ sufficiently small, then the Newton sequence $\{x_i\}$ is well-defined and $\|F'(x_i)^{-1}\| \le c\|\tilde{x}_i\|^{-1}$ for all $i \ge 1$ and some constant $c > 0$. Moreover,

Notice that the region $W(\rho, \theta, \eta)$ in which $x_0$ is required to lie is close to $x^*$, $N$, and $\operatorname{span}\{\phi_0\}$ in the sense determined by the $\rho$, $\theta$, $\eta$ inequalities, respectively.

In general situations, it would be hard to imagine that we can choose an initial guess satisfying all these inequalities. For algebraic Riccati equations, however, we already know (see Theorem 3.2) that Newton's method is convergent for an appropriate initial guess not necessarily close to the solution. We will show in later chapters that, as the iteration goes on, the Newton iterate will automatically get into a region of the form (2.2) for any fixed $\rho, \theta, \eta > 0$ unless the Newton iteration is quadratically convergent.

As we have seen from the above theorem, the convergence of the Newton iteration to a singular root is usually linear rather than quadratic. Much work has been done on modifications of the Newton iteration with a view to accelerating convergence when the derivative is not invertible at the solution. See, for example, [12], [14] and [31]. The modified methods as described in [12] and [14] are, however, not applicable for the CARE. Motivated by consideration of quadratic problems, Kelley and Suresh [31] proposed other modified methods which could be applied to the CARE as well as the DARE. These methods are all local methods and very sophisticated. We will not use these methods in our work, since the conditions needed for the convergence of these methods cannot be verified easily.

The regularity condition is very important for Theorem 2.4. Without this condition, the behaviour of Newton's method can be very erratic (see, e.g., [25]).

Chapter 3

Newton's Method for Continuous Algebraic

Riccati Equations

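3.1 Preliminaries

In this chapter we consider continuous algebraic Riccati equations of the form

$$\mathcal{R}(X) = XDX - XA - A^*X - C = 0, \tag{3.1}$$

where $A, D, C \in \mathbb{C}^{n\times n}$, and $D^* = D$, $C^* = C$. It is clear that $\mathcal{R}$ maps Hermitian matrices to Hermitian matrices. However, we find it more convenient to regard $\mathcal{R}$ as a mapping from $\mathbb{C}^{n\times n}$ into itself. For any matrix norm $\mathbb{C}^{n\times n}$ is a Banach space. Note that

$$\mathcal{R}(X + H) - \mathcal{R}(X) = -H(A - DX) - (A^* - XD)H + HDH.$$

Therefore, the first Fréchet derivative of $\mathcal{R}$ at a Hermitian matrix $X$ is the linear map $\mathcal{R}'_X : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ given by

$$\mathcal{R}'_X(H) = -\{H(A - DX) + (A - DX)^*H\}. \tag{3.2}$$

Also the second derivative at a Hermitian matrix $X$, $\mathcal{R}''_X : \mathbb{C}^{n\times n} \times \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$, is

$$\mathcal{R}''_X(H, K) = HDK + KDH. \tag{3.3}$$

The Newton method for the solution of (3.1) is

$$X_{i+1} = X_i - (\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i), \quad i = 0, 1, \ldots, \tag{3.4}$$

given that the maps $\mathcal{R}'_{X_i}$ are all invertible. If $\mathcal{R}'_{X_i}$ is invertible, the iteration (3.4) is equivalent to

$$\mathcal{R}'_{X_i}(X_{i+1}) = \mathcal{R}'_{X_i}(X_i) - \mathcal{R}(X_i).$$

In view of (3.2) when $X_i$ is Hermitian, the above equality is precisely

$$(A - DX_i)^*X_{i+1} + X_{i+1}(A - DX_i) = -X_iDX_i - C. \tag{3.5}$$

It is easy to see that $X_{i+1}$ is also Hermitian. Therefore, all the matrices $X_i$ are Hermitian if $X_0$ is so.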
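For real scalars $a, d, c$ the Riccati function reduces to $dx^2 - 2ax - c$, and one Newton step amounts to solving the scalar Lyapunov-type equation $2(a - dx_i)x_{i+1} = -(dx_i^2 + c)$. The following sketch iterates this recursion; the function name, the coefficients $a = 1$, $d = 1$, $c = 3$ (maximal solution 3) and the starting value are assumptions made for illustration.

```python
def newton_care_scalar(a, d, c, x0, steps):
    """Newton iterates for the real scalar CARE d*x*x - 2*a*x - c = 0.

    Each step solves 2*(a - d*x_i)*x_next = -(d*x_i*x_i + c) for x_next,
    which requires the closed loop value a - d*x_i to be nonzero.
    """
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        closed_loop = a - d * x
        xs.append(-(d * x * x + c) / (2.0 * closed_loop))
    return xs


# x0 = 10 is "stabilizing" here because a - d*x0 = -9 < 0,
# although it is far from the maximal solution 3.
iterates = newton_care_scalar(a=1.0, d=1.0, c=3.0, x0=10.0, steps=8)
closed_loop_values = [1.0 - x for x in iterates]

print(iterates)
print(closed_loop_values)
```

In this run the closed loop values stay negative, and the iterates decrease monotonically from the first step onward toward the maximal solution.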

For $A, B \in \mathbb{C}^{n\times n}$, the pair $(A, B)$ is said to be stabilizable if there is a $K \in \mathbb{C}^{n\times n}$ such that $A - BK$ is stable, i.e., all its eigenvalues are in the open left half-plane. The order relation on the set of Hermitian matrices is the usual one: $X \ge Y$ ($X > Y$) if $X - Y$ is positive semidefinite (definite). A Hermitian solution $X_+$ of (3.1) is called maximal if $X_+ \ge X$ for every Hermitian solution $X$. The following result is Theorem 9.1.1 of [35]. See also [11] and [20].

Theorem 3.1 Assume that $D \ge 0$, $C^* = C$, $(A, D)$ is stabilizable, and there exists a Hermitian solution of the inequality $\mathcal{R}(X) \le 0$. Then there exists a maximal Hermitian solution $X_+$ of $\mathcal{R}(X) = 0$. Moreover, all the eigenvalues of $A - DX_+$ are in the closed left half-plane.

A Hermitian solution $X$ of (3.1) is called stabilizing (resp. almost stabilizing) if all the eigenvalues of $A - DX$ are in the open (resp. closed) left half-plane. Such solutions play important roles in applications. Theorem 3.1 tells us that, under the given conditions, the maximal solution is at least almost stabilizing. In fact (see [50] or [35, Theorem 7.9.3]), $X_+$ is the only Hermitian solution that can be almost stabilizing. For this reason, the maximal solution is of particular interest.

Theorem 3.2 Under the same conditions as in Theorem 3.1, starting with any Hermitian matrix $X_0$ for which $A - DX_0$ is stable, the recursion (3.5) determines a sequence of Hermitian matrices $\{X_i\}_{i=0}^{\infty}$ for which $A - DX_i$ is stable for $i = 1, 2, \ldots$, $X_1 \ge X_2 \ge \cdots$, and $\lim_{i\to\infty} X_i = X_+$.

The maximal solution can thus be found by the Newton iteration without previous information about the solution. The proof of the above theorem can be found in [35, p. 232]. See also [11], [20] and [33]. There is no doubt about the existence of the matrix $X_0$, which is called a stabilizing matrix. In fact, we have the following result (see Lemma 4.5.4 of [35]).

Lemma 3.3 If $D \ge 0$ and $(A, D)$ is stabilizable, then there is an $X \ge 0$ such that $A - DX$ is stable.

Moreover, an initial stabilizing Hermitian matrix $X_0$ can be produced by automatic stabilizing procedures such as the one in [48], although the matrix $X_0$ so obtained may be far away from the solution $X_+$. We note that $X_0 \ge X_1$ is generally not true. In fact, the first Newton iteration is capable of making a big adjustment to the initial guess (see [32], for example). When $X_+ \ge 0$, we necessarily have $X_i \ge 0$. But $X_0$ can be indefinite.

If $X$ is an almost stabilizing solution of (3.1) (in the sense that $\sigma(A - DX)$ is in the closed left half-plane), then $\mathcal{R}'_X$ is invertible if and only if $X$ is a stabilizing solution. This can be seen from (3.2) and the following classical result (cf. [35, p. 100]).

Theorem 3.4 For any matrices $A \in \mathbb{C}^{m\times m}$, $B \in \mathbb{C}^{n\times n}$ and $\Gamma \in \mathbb{C}^{n\times m}$, the Sylvester equation $SA - BS = \Gamma$ has a unique solution if and only if $A$ and $B$ have no eigenvalues in common.
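When $A$ and $B$ are diagonal, the Sylvester equation decouples entrywise into $s_{ij}(a_j - b_i) = \gamma_{ij}$, which makes the eigenvalue condition of Theorem 3.4 transparent. The sketch below covers only this diagonal special case; the function name and the sample data are illustrative assumptions. Taking $A = M$ and $B = -M^*$ for a diagonal closed loop matrix $M$ reproduces the operator $H \mapsto HM + M^*H$ appearing in (3.2).

```python
def solve_diagonal_sylvester(a_diag, b_diag, gamma):
    """Solve S*A - B*S = Gamma for diagonal A (m x m) and diagonal B (n x n).

    Entry (i, j) of the equation reads s_ij * (a_j - b_i) = gamma_ij, so a
    unique solution exists exactly when no a_j equals any b_i.
    """
    n, m = len(b_diag), len(a_diag)
    s = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            gap = a_diag[j] - b_diag[i]
            if gap == 0:
                raise ValueError("A and B share an eigenvalue; no unique solution")
            s[i][j] = gamma[i][j] / gap
    return s


# Lyapunov-type case: A = M, B = -M^* with M = diag(closed_loop).
# All eigenvalues of M lie in the open left half-plane, so the gaps
# lambda_j + conj(lambda_i) never vanish.
closed_loop = [-1.0 + 0j, -2.0 + 1j]
minus_adjoint = [-z.conjugate() for z in closed_loop]
s = solve_diagonal_sylvester(closed_loop, minus_adjoint, [[1.0, 0.0], [0.0, 1.0]])
print(s)
```

If `closed_loop` contained a purely imaginary entry such as `1j`, the corresponding gap would be zero and the function would raise `ValueError`, which mirrors the loss of invertibility of $\mathcal{R}'_X$ when a closed loop eigenvalue lies on the imaginary axis.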

It is readily seen that $\mathcal{R}'_X$, as a function of $X$, is Lipschitz continuous on $\mathbb{C}^{n\times n}$. We note that the expression for $\mathcal{R}'_X$ when $X$ is not Hermitian is different from the one given in (3.2). The locally quadratic convergence of Newton's method (see Theorem 2.3), in combination with Theorem 3.2, yields the following result.

Theorem 3.5 If $A - DX_+$ is stable in Theorem 3.2, then for the sequence $\{X_i\}_{i=0}^{\infty}$ there is a constant $c > 0$ such that, for $i = 0, 1, \ldots$, $\|X_{i+1} - X_+\| \le c\|X_i - X_+\|^2$, where $\|\cdot\|$ is any given matrix norm.

We note that, because $\sigma(A - DX_+)$ is in the open left half-plane, $A - DX_0$ is necessarily stable if $X_0$ is close enough to $X_+$. A direct algebraic proof of the above theorem can be found in [35, p. 237].

In [4], an exact line search method is introduced which improves Newton's method for the numerical solution of the Riccati equation in several aspects. However, the theory established there does not cover the general situation described in Theorem 3.1, even when $A - DX_+$ has no eigenvalues on the imaginary axis.

When $A - DX_+$ has eigenvalues on the imaginary axis, $\mathcal{R}'_{X_+}$ is not invertible and the convergence behaviour of the Newton iteration is more complicated. In this chapter we examine the behaviour of the Newton iteration for this case. The results we obtain suggest that a simple modification step can be introduced to improve the performance of the Newton iteration dramatically in many cases. Numerical results are also given to show the effectiveness of the modification.
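The effect of such a modification can be previewed on a scalar equation. The sketch below assumes that the modification is the double Newton step mentioned in the abstract, that is, the Newton correction applied twice from the same iterate. It is not the algorithm developed later in the dissertation, and the coefficients $a = 1$, $d = 1$, $c = -1$ (so that the Riccati function is $(x - 1)^2$ and the closed loop value at the solution is zero) are chosen purely for illustration.

```python
def newton_correction(x, a, d, c):
    """Newton correction for r(x) = d*x*x - 2*a*x - c with real scalars."""
    residual = d * x * x - 2.0 * a * x - c
    derivative = 2.0 * (d * x - a)
    return -residual / derivative


# r(x) = (x - 1)^2: the maximal solution is x_plus = 1 and a - d*x_plus = 0,
# so the derivative of r vanishes at the solution.
a, d, c = 1.0, 1.0, -1.0

plain = [3.0]
for _ in range(6):
    x = plain[-1]
    plain.append(x + newton_correction(x, a, d, c))

errors = [abs(x - 1.0) for x in plain]
ratios = [e1 / e0 for e0, e1 in zip(errors, errors[1:])]

# Double Newton step taken from the iterate plain[2] = 1.5.
doubled = plain[2] + 2.0 * newton_correction(plain[2], a, d, c)

print(ratios)
print(doubled)
```

The plain iteration halves the error in every step, whereas the single doubled correction lands on the solution in this scalar case. In the matrix setting the timing of such a step matters, which is why the abstract speaks of applying it at the right time.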

3.2 Interpretation of the direct sum condition for

the CARE

The direct sum condition in Theorem 2.4 is essential for the conclusions of that result. We will now give an interpretation of the direct sum condition for the continuous algebraic Riccati equation (3.1). We assume throughout this section that the conditions of Theorem 3.1 are satisfied. Let $X_+$ be the maximal solution of (3.1) with $\mathcal{R}'_{X_+}$ not invertible. Let $N = \operatorname{Ker}\mathcal{R}'_{X_+}$, $M = \operatorname{Im}\mathcal{R}'_{X_+}$. We have the following interpretation of the direct sum condition.

Theorem 3.6 $\mathbb{C}^{n\times n} = N \oplus M$ if and only if all elementary divisors of $A - DX_+$ corresponding to the eigenvalues on the imaginary axis are linear.

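Proof. Let $J$ be the Jordan canonical form for $A - DX_+$ with $P^{-1}(A - DX_+)P = J$. We find that $K \in \operatorname{Ker}\mathcal{R}'_{X_+}$ if and only if $K = P^{-*}QP^{-1}$ for some $Q \in \mathbb{C}^{n\times n}$ satisfying $QJ + J^*Q = 0$. Also $W \in \operatorname{Im}\mathcal{R}'_{X_+}$ if and only if $W = P^{-*}RP^{-1}$ with $R = VJ + J^*V$ for some $V \in \mathbb{C}^{n\times n}$. Therefore,

$$\mathcal{N} = P^{-*}\mathcal{N}_J P^{-1}, \qquad \mathcal{M} = P^{-*}\mathcal{M}_J P^{-1},$$

where $\mathcal{N}_J$ and $\mathcal{M}_J$ are the kernel and range of the map $\mathcal{G} : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ given by

$$\mathcal{G}(V) = VJ + J^*V.$$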

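If all elementary divisors of $A - DX_+$ corresponding to the eigenvalues on the imaginary axis are linear, we can arrange the Jordan blocks so that

$$J = \operatorname{diag}(G_1, \ldots, G_{p-1}, G_p).$$

Here $G_p \in \mathbb{C}^{r_p\times r_p}$ consists of Jordan blocks associated with eigenvalues in the open left half-plane, and for $k = 1, \ldots, p-1$,

$$G_k = a_k i\, I \in \mathbb{C}^{r_k\times r_k},$$

where the $a_k$'s are distinct real numbers and $i$ is the imaginary unit. Using block matrix multiplications and applying Theorem 3.4 repeatedly, we can find easily

$$\mathcal{N}_J = \{ N = \operatorname{diag}(N_1, \ldots, N_p) \mid N_i \in \mathbb{C}^{r_i\times r_i},\ 1 \le i \le p;\ N_p = 0 \}, \qquad (3.9)$$

$$\mathcal{M}_J = \{ M = (M_{ij}) \mid M_{ij} \in \mathbb{C}^{r_i\times r_j},\ 1 \le i, j \le p;\ M_{ii} = 0,\ 1 \le i \le p-1 \}.$$

Therefore, $\mathbb{C}^{n\times n} = \mathcal{N}_J \oplus \mathcal{M}_J$. We also have $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ by (3.6)-(3.7).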

If $A - DX_+$ has nonlinear elementary divisors corresponding to eigenvalues on the imaginary axis, we can arrange the Jordan blocks so that the first Jordan block $J_1$ has the following form:

Note that $T = \mathcal{G}(S)$ for

Therefore, $\mathbb{C}^{n\times n} \ne \mathcal{N}_J \oplus \mathcal{M}_J$, and also $\mathbb{C}^{n\times n} \ne \mathcal{N} \oplus \mathcal{M}$. $\Box$
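The dichotomy in Theorem 3.6 can be checked on small examples. The sketch below is illustrative only: it assumes a real matrix $J$ (so that $J^* = J^T$), represents the map $V \mapsto VJ + J^TV$ as a matrix acting on the row-major vectorization of $V$, and uses the standard linear-algebra fact that, for a linear operator $T$ on a finite-dimensional space, the space is the direct sum of $\operatorname{Ker} T$ and $\operatorname{Im} T$ exactly when $\operatorname{rank}(T^2) = \operatorname{rank}(T)$. The function names are hypothetical.

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def lyapunov_operator(J):
    """Matrix of V -> V J + J^T V on row-major vec(V), for real J."""
    n = len(J)
    G = [[0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                G[i * n + j][i * n + k] += J[k][j]  # (V J)_{ij}   = sum_k V_{ik} J_{kj}
                G[i * n + j][k * n + j] += J[k][i]  # (J^T V)_{ij} = sum_k J_{ki} V_{kj}
    return G

def is_direct_sum(J):
    """True when C^{n x n} = Ker(G) + Im(G) is a direct sum."""
    G = lyapunov_operator(J)
    return rank(matmul(G, G)) == rank(G)

semisimple = [[0, 0], [0, -1]]   # eigenvalue 0 with a linear elementary divisor
jordan = [[0, 1], [0, 0]]        # eigenvalue 0 with a 2 x 2 Jordan block
print(is_direct_sum(semisimple), is_direct_sum(jordan))
```

For the Jordan block the kernel and range share the matrix with a single nonzero entry in the (2,2) position, which is why the direct sum fails there.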

3.3 Characterization of the direct sum condition

by the Hamiltonian matrix

We have just given an interpretation for the direct sum condition using information about the eigenvalues of $A - DX_+$ on the imaginary axis. However, $X_+$ is the solution to be found. It will be appropriate to give a characterization of the direct sum condition that is independent of $X_+$.

We start with some definitions.

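Let $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$. The controllable subspace $\mathcal{C}_{A,B}$ of the pair $(A, B)$ is defined by

$$\mathcal{C}_{A,B} = \operatorname{Im}\,[\,B \;\; AB \;\; \cdots \;\; A^{n-1}B\,].$$

The pair $(A, B)$ is called controllable if

$$\mathcal{C}_{A,B} = \mathbb{C}^n.$$

Note that if $(A, B)$ is controllable then it is necessarily stabilizable (see Proposition 3.7 and Lemma 3.18 to follow).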
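The definition of the controllable subspace translates directly into a rank computation. The following is a minimal sketch, assuming matrices with rational entries stored as nested lists; the helper names are hypothetical and exact arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def controllability_matrix(A, B):
    """The block row [B, AB, ..., A^(n-1) B] as an n x (n*m) matrix."""
    n = len(A)
    blocks, blk = [], B
    for _ in range(n):
        blocks.append(blk)
        blk = matmul(A, blk)
    return [sum((b[i] for b in blocks), []) for i in range(n)]

def is_controllable(A, B):
    """(A, B) is controllable when the controllable subspace is all of C^n."""
    return rank(controllability_matrix(A, B)) == len(A)

# The input reaches only the first state, so the second mode is not controllable.
print(is_controllable([[1, 0], [0, -1]], [[1], [0]]))
# A chain of integrators driven from the last state is controllable.
print(is_controllable([[0, 1], [0, 0]], [[0], [1]]))
```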

The next four propositions are taken from Chapter 4 of [35].

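Proposition 3.7 For any $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is controllable if and only if

$$\operatorname{rank}\,(\lambda I - A \;\; B) = n$$

for all $\lambda \in \mathbb{C}$.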

Proposition 3.8 Let $A$, $B$, $K$ be matrices of sizes $n \times n$, $n \times m$, and $m \times n$ respectively and write $\hat{A} = A + BK$. Then

$$\mathcal{C}_{\hat{A},B} = \mathcal{C}_{A,B}.$$

Proposition 3.9 For any matrix pair $(A, B)$ with $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$, there is a nonsingular matrix $K$ such that

where $A_1$ is $r \times r$, $B_1$ is $r \times m$ and $r$ is the dimension of $\mathcal{C}_{A,B}$. Furthermore, the pair $(A_1, B_1)$ is controllable.

The pair of matrices

of (3.11) is known as a control (or Kalman) normal form of $(A, B)$.

Proposition 3.10 Assume that the pair $(A, B)$ is not controllable. Then the pair $(A, B)$ is stabilizable if and only if the matrix $A_2$ of a control normal form (3.11) is stable.

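For $A \in \mathbb{C}^{n\times n}$ and $\lambda \in \mathbb{C}$, we let

$$\mathcal{R}_\lambda(A) = \{ x \in \mathbb{C}^n \mid (A - \lambda I)^p x = 0 \text{ for some integer } p \ge 0 \}.$$

The following result is a restatement of Theorem 7.3.1 of [35].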

Theorem 3.11 Assume that $D \ge 0$ and $C^* = C$. Suppose the CARE (3.1) has a Hermitian solution $X$, and

$$\mathcal{R}_{\lambda_0}(A - DX) \subset \mathcal{C}_{A,D} \qquad (3.12)$$

for every eigenvalue $\lambda_0$ of $A - DX$ on the imaginary axis. Then $\lambda$ is an eigenvalue of $A - DX$ on the imaginary axis if and only if $\lambda$ is an eigenvalue of the Hamiltonian matrix

on the imaginary axis. Moreover, the partial multiplicities (i.e. the degrees of elementary divisors) of $\lambda$ as an eigenvalue of $H$ are twice the partial multiplicities of $\lambda$ as an eigenvalue of $A - DX$.

The condition (3.12) is certainly satisfied if the pair $(A, D)$ is controllable. We will now show that the condition (3.12) can be guaranteed when $(A, D)$ is stabilizable. In fact, we have the following result.

Proposition 3.12 Let $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$. If $(A, B)$ is stabilizable, then for any $\lambda$ with nonnegative real part

$$\mathcal{R}_\lambda(A) \subset \mathcal{C}_{A,B}.$$

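Proof. Let $\tilde{A}$, $\tilde{B}$ be a control normal form of $(A, B)$ (cf. Proposition 3.9). We

Since $(A_1, B_1)$ is controllable, we have

Now, if $y \in \mathcal{R}_\lambda(\tilde{A})$, then

$$(\tilde{A} - \lambda I)^p y = 0$$

for some integer $p \ge 0$. Therefore,

$$\begin{pmatrix} (A_1 - \lambda I)^p & * \\ 0 & (A_2 - \lambda I)^p \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$

Since all the eigenvalues of $A_2$ are in the open left half-plane (cf. Proposition 3.10) and $\operatorname{Re}\lambda \ge 0$, $A_2 - \lambda I$ is nonsingular and hence $y_2 = 0$. Therefore, $y \in \mathcal{C}_{\tilde{A},\tilde{B}}$. We have thus proved $\mathcal{R}_\lambda(\tilde{A}) \subset \mathcal{C}_{\tilde{A},\tilde{B}}$. Consequently, we have $\mathcal{R}_\lambda(A) \subset \mathcal{C}_{A,B}$. $\Box$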

Corollary 3.13 Under the conditions of Theorem 3.1, $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ if and only if all the eigenvalues of the Hamiltonian matrix (3.13) on the imaginary axis have partial multiplicity two.

Proof. Since $(A, D)$ is stabilizable, we can find $K \in \mathbb{C}^{n\times n}$ such that $A - DK$ is stable. Since $(A - DX_+) - D(K - X_+) = A - DK$, the pair $(A - DX_+, D)$ is also stabilizable. By Proposition 3.12, we have

$$\mathcal{R}_{\lambda_0}(A - DX_+) \subset \mathcal{C}_{A - DX_+,\,D}$$

for every eigenvalue $\lambda_0$ of $A - DX_+$ on the imaginary axis. Since $\mathcal{C}_{A - DX_+,\,D} = \mathcal{C}_{A,D}$ by Proposition 3.8, the result follows from Theorems 3.11 and 3.6. $\Box$

3.4 Convergence rate of the Newton method

When $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$, we let $P_{\mathcal{N}}$ denote the projection onto $\mathcal{N}$ parallel to $\mathcal{M}$ and let $P_{\mathcal{M}} = I - P_{\mathcal{N}}$. For the algebraic Riccati equation, we start the Newton iteration with a Hermitian matrix $X_0$ for which $A - DX_0$ is stable. Although the Newton sequence is well-defined and converges to $X_+$, we do not know whether the iterates $X_i$ will finally fall into a special region of the form (2.2). Therefore Theorem 2.4 cannot be applied directly. Instead, we have the following result.

Theorem 3.14 For any fixed $\theta > 0$, let

Then there exist an integer $i_0$ and a constant $c > 0$ such that

for all $i$ in $Q$ for which $i \ge i_0$.

Proof. Let $\tilde{X}_i = X_i - X_+$. Using Taylor's Theorem with (3.3) and the fact that $\mathcal{R}'_{X_+}(P_{\mathcal{N}}\tilde{X}_i) = 0$, we have

On the other hand, we have by (3.5)

and obviously,

By subtraction, we obtain after some manipulations

Writing X+ = Xie1 - Xi-1 in (3.14) and using the last equation it is found that -

In view of (3.15) and the fact that $X_i \ne X_+$ for any $i$, we have

Corollary 3.15 Assume that, for given $\theta > 0$, $\|P_{\mathcal{M}}(X_i - X_+)\| > \theta\,\|P_{\mathcal{N}}(X_i - X_+)\|$ for all $i$ large enough. Then $X_i \to X_+$ quadratically.

The above result is somewhat surprising, since it is generally believed that linear

convergence is the best we can expect when the derivative at the solution is not

invertible (see [12], [14] and [31]). We cannot rejoice in the possibility of quadratic

convergence, however, since the condition in the corollary is not easily satisfied.

Nevertheless, we can conclude that, when the convergence is not quadratic, the error will generally be dominated by its $\mathcal{N}$-component. This will be the basis for a numerical

strategy proposed in the next section. Meanwhile, the following theorem shows what

happens in the generic case when convergence is not quadratic.

Theorem 3.16 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$. If the convergence of the Newton sequence $\{X_i\}$ is not quadratic, then $\|(\mathcal{R}'_{X_i})^{-1}\| \le c\,\|X_i - X_+\|^{-1}$ for all $i \ge 1$ and some constant $c > 0$. Moreover,

The proof of this theorem is an application of Theorem 2.4 and follows readily

from the next lemma. The map B appearing in Theorem 2.4, when applied to the

Riccati equation (at a fixed Z E N instead of &), takes the form

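Lemma 3.17 If $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ then

$$\mathcal{U} = \{ Z \in \mathcal{N} \mid \mathcal{B}_Z : \mathcal{N} \to \mathcal{N} \text{ is not invertible} \}$$

has measure zero in $\mathcal{N}$.

Some preliminary results will be needed to prove Lemma 3.17. The first result is Theorem 4.5.6 (a) of [35].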

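Lemma 3.18 For any $A \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is stabilizable if and only if

$$\operatorname{rank}\,(\lambda I - A \;\; B) = n$$

for every $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda \ge 0$.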
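The rank condition of Lemma 3.18 only needs to be tested at eigenvalues of $A$, because $\lambda I - A$ is already nonsingular elsewhere. The sketch below relies on that observation and, to keep the eigenvalues explicit, assumes that $A$ is triangular with rational diagonal entries; this restriction and the function names are illustrative choices, not part of the lemma.

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def pbh_rank(A, B, lam):
    """Rank of the n x (n+m) matrix (lam*I - A, B)."""
    n = len(A)
    return rank([[(lam if i == j else 0) - A[i][j] for j in range(n)] + list(B[i])
                 for i in range(n)])

def is_stabilizable_triangular(A, B):
    """Rank test at the eigenvalues with nonnegative real part.

    Assumes A is triangular with real diagonal, so its eigenvalues are A[i][i].
    """
    n = len(A)
    return all(pbh_rank(A, B, A[i][i]) == n for i in range(n) if A[i][i] >= 0)

A = [[1, 0], [0, -1]]
print(is_stabilizable_triangular(A, [[1], [0]]))  # unstable mode is reachable
print(is_stabilizable_triangular(A, [[0], [1]]))  # unstable mode is not reachable
```

The first pair is stabilizable but not controllable: the rank drops at $\lambda = -1$, which Proposition 3.7 forbids but Lemma 3.18 allows.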

Lemma 3.19 Let $J$ and $P$ be as in the proof of Theorem 3.6. Then

$$\operatorname{rank}\,(\lambda I - J \;\; P^{-1}DP^{-*}) = n$$

for every $\lambda \in \mathbb{C}$ with $\operatorname{Re}\lambda \ge 0$.

Proof. Since $(A, D)$ is stabilizable, there is an $X$ such that $A - DX$ is stable.

which is stable. Thus $(J, P^{-1}DP^{-*})$ is a stabilizable pair. The result now follows from Lemma 3.18. $\Box$

Lemma 3.20 Let $W$ be a Hermitian positive semidefinite matrix. If the determinant of a principal submatrix of $W$ is zero, then the rows of $W$ containing this submatrix must be linearly dependent.

Proof. Let $W = (w_{ij})_{i,j=1}^n$. We may assume without loss of generality that the principal submatrix is $W_1 = (w_{ij})_{i,j=1}^r = (\alpha_1^T \cdots \alpha_r^T)^T$ $(r < n)$ and that $\alpha_1 = c_2\alpha_2 + \cdots + c_r\alpha_r$ for some constants $c_2, \ldots, c_r$. Let $E(i, j(k))$ be the elementary matrix obtained from $I$ by adding $k$ times row $j$ to row $i$. Let

Then $UWU^*$ is Hermitian positive semidefinite and has zero in the $(1,1)$ position. Hence the first row of $UWU^*$ is zero. This means that $\beta_1 = c_2\beta_2 + \cdots + c_r\beta_r$, where $\beta_1, \ldots, \beta_r$ are the first $r$ rows of $W$. $\Box$

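Now consider the map $\mathcal{B}_Z : \mathcal{N} \to \mathcal{N}$ of (3.17). Using the notation and results in the proof of Theorem 3.6, we can write $Y = P^{-*}Y_JP^{-1}$, $Z = P^{-*}Z_JP^{-1}$ with $Y_J, Z_J \in \mathcal{N}_J$. Therefore, we have by (3.3)

where $P_{\mathcal{N}_J}$ is the projection onto $\mathcal{N}_J$ parallel to $\mathcal{M}_J$. Let $Z_J = \operatorname{diag}(Z_1, \ldots, Z_p)$, $Y_J = \operatorname{diag}(Y_1, \ldots, Y_p)$ and $\operatorname{diag}(D_1, \ldots, D_p)$ be the block diagonal of $P^{-1}DP^{-*}$, where $Z_i, Y_i, D_i \in \mathbb{C}^{r_i\times r_i}$ for $i = 1, \ldots, p$. We have further

where we define linear transformations $\mathcal{F}_{Z_i} : \mathbb{C}^{r_i\times r_i} \to \mathbb{C}^{r_i\times r_i}$ by

$$\mathcal{F}_{Z_i}(Y_i) = Z_iD_iY_i + Y_iD_iZ_i$$

for $1 \le i \le p-1$.

For $i = 1, 2, \ldots, p-1$, let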

Lemma 3.21 The set $\mathcal{U}_i$ has measure zero in $\mathbb{C}^{r_i\times r_i}$.

Proof. We need prove the result only for $i = 1$. Note that $\mathcal{F}_{Z_1}$ is invertible if and only if the equation

$$Z_1D_1Y_1 + Y_1D_1Z_1 = W_1 \qquad (3.19)$$

has a unique solution $Y_1$ for any $W_1 \in \mathbb{C}^{r_1\times r_1}$. We can rewrite (3.19) as

(see [35, p. 99]), where $\otimes$ is the Kronecker product and vec is the stacking operation for matrices, i.e.,

when $S = (s_1 \; s_2 \; \cdots \; s_q) \in \mathbb{C}^{p\times q}$. Therefore,

Since $Z_1 \in \mathbb{C}^{r_1\times r_1}$, the determinant in (3.20) is an algebraic polynomial in $r_1^2$ variables. By the fundamental theorem of algebra, the set $\mathcal{U}_1$ has measure zero in $\mathbb{C}^{r_1\times r_1}$ unless

det (1 8 (&Dl) + (&Dl) @ I ) = 0. (3.21)

If (3.21) is true, we have in particular det (1 @ D: + D: @ 1) = O. Thus O is

an eigenvalue of the matrix I @ D: + D: 8 I . We can then find eigenvaluea Xj, .Ar

of $D_1$ such that $\lambda_j^2 + \lambda_r^2 = 0$ (see [35, Theorem 5.1.1]). Since $D_1$ is Hermitian, all its eigenvalues are real. We then conclude that $0$ is an eigenvalue of $D_1$ and $\det D_1 = 0$. By Lemma 3.20, the first $r_1$ rows of $P^{-1}DP^{-*}$ are linearly dependent. Thus $\operatorname{rank}\,(a_1 i I - J \;\; P^{-1}DP^{-*}) < n$, which contradicts Lemma 3.19. $\Box$
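The rewriting of a matrix equation as a linear system in the proof above rests on the identity $\operatorname{vec}(AXB) = (B^T \otimes A)\operatorname{vec}(X)$ for column stacking. The following sketch checks that identity on small integer matrices; the matrices and helper names are arbitrary illustrations and are not taken from the thesis.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def kron(P, Q):
    """Kronecker product: block (i, j) of the result is P[i][j] * Q."""
    return [[p * q for p in rp for q in rq] for rp in P for rq in Q]

def vec(S):
    """Stack the columns of S into one long vector."""
    return [S[i][j] for j in range(len(S[0])) for i in range(len(S))]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

A = [[1, 2], [3, 4]]          # 2 x 2
X = [[1, 0, 2], [0, 1, 3]]    # 2 x 3
B = [[1, 0], [2, 1], [0, 5]]  # 3 x 2

lhs = vec(matmul(matmul(A, X), B))
rhs = matvec(kron(transpose(B), A), vec(X))
print(lhs == rhs)
```

Applying the identity to each term of an equation such as (3.19) gives a square linear system for the stacked unknown, and unique solvability becomes the nonvanishing of a determinant that is polynomial in the entries of the coefficient matrices.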

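Proof of Lemma 3.17. From (3.18) we see that $\mathcal{B}_Z$ is invertible if and only if $\mathcal{F}_{Z_i}$ is invertible for each $i$. Thus

$$\mathcal{U} = \bigcup_{i=1}^{p-1} \mathcal{V}_i.$$

By Lemma 3.21, each $\mathcal{V}_i$ has measure zero in $\mathcal{N}$. Therefore $\mathcal{U}$ has measure zero in $\mathcal{N}$. $\Box$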

Proof of Theorem 3.16. We apply Theorem 2.4, with some natural changes of notation. Let $\tilde{X}_i = X_i - X_+$ and $\tilde{X} = X - X_+$. We are to show that there is a $\phi_0$ such that $\mathcal{B}_{\phi_0}$ is invertible and, if $\mathcal{N} = \operatorname{span}\{\phi_0\} \oplus \mathcal{N}_1$ and $P_0$ is the projection on $\operatorname{span}\{\phi_0\}$ along $\mathcal{N}_1 \oplus \mathcal{M}$, then there is an $i$ such that $X_i \in W(\rho_0, \theta_0, \eta_0)$ where

First, Theorem 3.2 shows that by choosing $X_0$ so that $A - DX_0$ is stable there is an $i_1$ such that $0 < \|\tilde{X}_i\| < \rho_0$ for all $i \ge i_1$. Then, since the convergence of the Newton sequence is not quadratic it follows from Corollary 3.15 that $\|P_{\mathcal{M}}\tilde{X}_{i_2}\| \le \theta_0\,\|P_{\mathcal{N}}\tilde{X}_{i_2}\|$ for some $i_2 \ge i_1$. Note that $P_{\mathcal{N}}\tilde{X}_{i_2} \ne 0$, since otherwise we would have $\tilde{X}_{i_2} = 0$. Finally, if we choose $\phi_0 = P_{\mathcal{N}}\tilde{X}_{i_2}$ then the last inequality of (3.22) is trivially satisfied for $X_{i_2}$, but $\mathcal{B}_{\phi_0}$ may not be invertible. However, when $\eta_0$ is given, it follows from Lemma 3.17 that $\phi_0$ can be chosen arbitrarily close to $P_{\mathcal{N}}\tilde{X}_{i_2}$ in such a way that $\mathcal{B}_{\phi_0}$ is invertible and $X_{i_2} \in W(\rho_0, \theta_0, \eta_0)$. Now apply Theorem 2.4. $\Box$

3.5 A modified Newton method

The Newton iteration can be used to find the maximal solution of (3.1) when the Hamiltonian matrix has eigenvalues on the imaginary axis, while most other algorithms are not applicable in this case (see [41]). However, the convergence of the

Newton sequence in this case is usually linear although, as Theorem 3.14 suggests,

we have not excluded the possibility of quadratic convergence. Since the Newton

iteration is an expensive procedure, we cannot be satisfied with linear convergence

alone.

For Newton's method in general Banach spaces, the initial guess must be in a special region of the form (2.2) in order to guarantee that the modified methods we mentioned in Chapter 2 are well-defined and give fast convergence. When we apply the Newton iteration to find the maximal Hermitian solution of (3.1), we start with a Hermitian matrix $X_0$ for which $A - DX_0$ is stable. It is not clear whether and when the iterate $X_k$ will fall into that special region. We therefore take a different approach. We are not going to recover quadratic convergence. Instead we will add a

simple modification step to the Newton iteration so that the required accuracy can

be achieved at an early stage. The following simple result is very instructive. Note

statement 2 especially, and the possibility that it presents for stepping directly to

the solution X+ .

Theorem 3.22 In the setting of Theorems 3.1 and 3.2, and under the condition that $X_i - X_+ \in \mathcal{N}$, we have

1. $X_{i+1} - X_+ = \tfrac{1}{2}(X_i - X_+)$;

2. $X_+ = X_i - 2(\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i)$.

Proof. By Taylor's Theorem,

Since $\mathcal{R}(X_+) = 0$ and $\mathcal{R}'_{X_+}(X_i - X_+) = 0$, we may also write

The second part of the theorem follows immediately. The first part follows easily from (3.4) and the second part. $\Box$

We remark that similar conclusions can be reached for any map $F$ from a Banach space into itself, for which $F''$ is constant. Note also that the direct sum condition $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ is not required in Theorem 3.22.
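The reasoning behind the double Newton step can be written out in a few lines. The following is a sketch rather than the thesis's own derivation: it assumes only that $\mathcal{R}$ is quadratic, so that its second derivative $\mathcal{R}''$ does not depend on the base point, and that $\mathcal{R}'_{X_i}$ is invertible; the shorthand $Z = X_i - X_+ \in \mathcal{N}$ is introduced here for illustration.

```latex
\begin{align*}
\mathcal{R}(X_i) &= \mathcal{R}(X_+) + \mathcal{R}'_{X_+}(Z) + \tfrac12\,\mathcal{R}''(Z,Z)
                  = \tfrac12\,\mathcal{R}''(Z,Z), \\
\mathcal{R}'_{X_i}(Z) &= \mathcal{R}'_{X_+}(Z) + \mathcal{R}''(Z,Z) = \mathcal{R}''(Z,Z), \\
\text{hence}\quad \mathcal{R}'_{X_i}(Z) &= 2\,\mathcal{R}(X_i),
  \qquad Z = 2\,(\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i), \\
X_+ &= X_i - Z = X_i - 2\,(\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i), \\
X_{i+1} - X_+ &= Z - (\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i) = \tfrac12\,Z .
\end{align*}
```

In the scalar case $F(x) = x^2$ with root $0$, the Newton step gives $x_{i+1} = x_i/2$ and the doubled step gives $x_i - 2\cdot x_i/2 = 0$, which is the same mechanism in its simplest form.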

Example 3.1 It is instructive to revisit Example 9.2.1 of [35] at this stage. Let

It is easily verified that there is a unique solution of R ( X ) = 0, namely,

and is not stable.

If Newton iterations are started with

(in this case A - DXo = - I and is stable), then it can be proved by induction that,

for n = 1,2,. ..,

Consequently, for n = 1,2,. . . ,

It can be seen that, in this case

N = span 1 Note also that, for n = 1,2, . . . ,

Fortuitously, $X_0$ is chosen in such a way that $X_n - X_+ \in \mathcal{N}$ for $n = 1, 2, \ldots$. Thus, Theorem 3.22 applies and (3.23) holds. Furthermore, it is clear that, by applying the modified Newton step at any $n \ge 1$, the exact solution $X_+$ is obtained.

When the direct sum condition is satisfied and the convergence of the Newton sequence $\{X_k\}$ is not quadratic, we have at some (hopefully early) stage $\|P_{\mathcal{M}}(X_k - X_+)\| \ll \|P_{\mathcal{N}}(X_k - X_+)\|$ (cf. Theorem 3.16). A very good approximate solution could then be obtained by applying the modification step 2 of Theorem 3.22. More precisely, we have the following result.

Theorem 3.23 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ and

small, and

then $\|Y_{k+1} - X_+\| < c\epsilon$ for some constant $c$ independent of $\epsilon$ and $k$.

If q c < ?, we h o w from the Banach Lemma (Theorem 2.2) that Rik is invertible

Since xk - X+ E N , we have by Theorem 3.22

x+ = - ~(R~~)-'R(x~)-

Note that

Therefore,

we obtain easily $\|Y_{k+1} - X_+\| \le c\epsilon$. $\Box$

Note that the condition (3.24) is satisfied when the convergence of Newton's method is not quadratic (cf. Theorem 3.16). Note also that the matrix produced by the double Newton step in Theorem 3.23 is at least almost stabilizing (see the discussions in [4]).

The following algorithm is suggested by the results of this section.

Algorithm 3.24 (Modified Newton method for the CARE)

1. Choose a Hermitian matrix $X_0$ for which $A - DX_0$ is stable.

2. For $k = 0, 1, \ldots$ do:

   Solve $\mathcal{R}'_{X_k}(H) = \mathcal{R}(X_k)$;

   Compute $X_{k+1} = X_k - 2H$;

   If $\|\mathcal{R}(X_{k+1})\| < \epsilon$, stop;

   otherwise, compute $X_{k+1} = X_k - H$; if $\|\mathcal{R}(X_{k+1})\| < \epsilon$, stop.

In the above algorithm, $\|\cdot\|$ is an easily computable matrix norm (e.g. the 1-norm) and $\epsilon$ is a prescribed accuracy. The equation $\mathcal{R}'_{X_k}(H) = \mathcal{R}(X_k)$ can be rewritten as a Lyapunov equation

which can be solved efficiently by the algorithms described in [3] and [21]. In Algorithm 3.24, all iterates except the last one are identical to those produced by the original Newton method. Thus all good properties of the Newton method are retained. When $A - DX_+$ has eigenvalues on the imaginary axis, the last iterate is usually produced by the modified step. Algorithm 3.24 needs roughly 10% more computational work per iteration, since we systematically perform one additional Riccati function evaluation with a view to achieving the required accuracy in the modified step as early as possible.
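The steps of Algorithm 3.24 can be sketched in a few dozen lines for small real symmetric problems. Several assumptions are made here and should be checked against (3.1) and (3.3): the Riccati function is taken as $\mathcal{R}(X) = XDX - XA - A^TX - C$, so that the Newton equation becomes the Lyapunov equation $A_k^TH + HA_k = -\mathcal{R}(X_k)$ with $A_k = A - DX_k$; the Lyapunov equation is solved by a dense linear system rather than by the methods of [3] and [21]; and the test problem ($A = \operatorname{diag}(0,-1)$, $D = I$, $C = 0$, with $X_+ = 0$) is an illustration that does not come from the thesis.

```python
def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def add(P, Q, s=1.0):
    """Entrywise P + s*Q."""
    return [[p + s * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def norm1(P):
    """Matrix 1-norm (maximum absolute column sum)."""
    return max(sum(abs(P[i][j]) for i in range(len(P))) for j in range(len(P[0])))

def solve(M, b):
    """Gaussian elimination with partial pivoting; M is assumed nonsingular."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for j in range(c, n + 1):
                aug[r][j] -= f * aug[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][j] * x[j] for j in range(r + 1, n))) / aug[r][r]
    return x

def riccati(X, A, D, C):
    """Assumed convention: R(X) = X D X - X A - A^T X - C."""
    XDX = matmul(matmul(X, D), X)
    return add(add(add(XDX, matmul(X, A), -1.0), matmul(transpose(A), X), -1.0), C, -1.0)

def newton_direction(X, A, D, C):
    """Solve Ak^T H + H Ak = -R(X) with Ak = A - D X, via an n^2 x n^2 system."""
    n = len(A)
    Ak = add(A, matmul(D, X), -1.0)
    R = riccati(X, A, D, C)
    M = [[0.0] * (n * n) for _ in range(n * n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                M[i * n + j][k * n + j] += Ak[k][i]  # (Ak^T H)_{ij}
                M[i * n + j][i * n + k] += Ak[k][j]  # (H Ak)_{ij}
    h = solve(M, [-R[i][j] for i in range(n) for j in range(n)])
    return [h[i * n:(i + 1) * n] for i in range(n)]

def modified_newton(A, D, C, X0, eps=1e-10, max_iter=50):
    """Try the double step first; fall back to the ordinary Newton step."""
    X = X0
    for k in range(max_iter):
        H = newton_direction(X, A, D, C)
        Y = add(X, H, -2.0)
        if norm1(riccati(Y, A, D, C)) < eps:
            return Y, k + 1, "double"
        X = add(X, H, -1.0)
        if norm1(riccati(X, A, D, C)) < eps:
            return X, k + 1, "newton"
    return X, max_iter, "not converged"

A = [[0.0, 0.0], [0.0, -1.0]]
D = [[1.0, 0.0], [0.0, 1.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
X0 = [[1.0, 0.0], [0.0, 1.0]]   # A - D*X0 = diag(-1, -2) is stable
X, iterations, last_step = modified_newton(A, D, C, X0)
print(iterations, last_step, norm1(X))
```

In this illustration the error component along the kernel direction halves at every ordinary step while the other component converges quadratically, so the double step succeeds as soon as the second component is negligible.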

3.6 Numerical results

In this section we present some numerical examples to illustrate the effectiveness of

the modified Newton step in Algorithm 3.24.

Example 3.2 Consider the CARE (3.1) with n = 2 and

(cf. Example 10 of [5]). The maximal solution $X_+ = (x_{ij})$ is given by

For $\epsilon = 0$, the pair $(A, D)$ is stabilizable, and

Observe that $\sigma(A - DX_+) = \{0, -2\}$.

Starting with

we perform 8 steps of the ordinary Newton iteration and then perform a modification step. The results are recorded in Table 3.1. As usual we let $\tilde{X}_k = X_k - X_+$ and write $\tilde{X}_k = (\tilde{x}_{ij}^{(k)})$. For this problem the convergence of the Newton iteration is linear with common ratio $\tfrac{1}{2}$ (cf. Theorem 3.16). After 8 Newton iterations, $X_8$ is still not very close to $X_+$. However, $X_8 - X_+$ is very close to an element in $\mathcal{N}$. A modification step then produces a very accurate approximate solution (cf. Theorem 3.23).

Table 3.1: Performance of Algorithm 3.24 for Example 3.2

When $\epsilon$ is a small positive number, $X_+$ is a stabilizing solution. According to Theorem 3.5, the Newton sequence $\{X_k\}$ converges to $X_+$ quadratically. However, the constant $c$ in Theorem 3.5 will be very large for very small $\epsilon$. Thus the quadratic convergence could be exhibited only after $X_k$ gets very close to the solution. On the other hand, as $X_k$ gets close to the solution, the corresponding Lyapunov equation will be ill-conditioned. As a result, quadratic convergence can hardly be realized. For example, take $\epsilon = 10^{-8}$ and $X_0$ as before. If we perform 8 Newton iterations and then perform a modification step, we get $\|\tilde{X}_9\| = 0.4142 \times 10^{-8}$. Without the modification step, the error $\|\tilde{X}_k\|$ for the Newton iterate decreases monotonically until the 26th iteration with $\|\tilde{X}_{26}\| = 0.3738 \times 10^{-7}$.

Example 3.3 Consider the CARE (3.1) with n = 2 and

(cf. Example 11 of [5]). The maximal solution is

For $\epsilon = 0$, the pair $(A, D)$ is stabilizable. And we have

and observe that $\sigma(A - DX_+) = \{-i, i\}$.

Starting with

we perform 8 steps of the ordinary Newton iteration and then perform a modification

step. The results are recorded in Table 3.2. The situation for this example is very

similar to that for Example 3.2.

For É = IO-'' and the same initial guess, we perfom 8 steps of Newton iteration

and then perform a modification step. We get $\|\tilde{X}_9\| = 0.1000 \times 10^{-9}$. For the Newton iteration, the error decreases monotonically until the 31st iteration with

Example 3.4 We consider the CARE (3.1) with n = 8 and a block-diagonal matrix

A with 2 x 2 blocks:

Table 3.2: Performance of Algorithm 3.24 for Example 3.3

It is readily seen that $X_+ = 0$ so that $\sigma(A - DX_+) = \{-1, 0, \pm i, \pm 2i\}$ and the purely imaginary eigenvalues have linear elementary divisors.

We apply Algorithm 3.24 with Xo = 1 and e = IO-'^. The results are recorded in

Table 3.3. The first 9 steps are ordinary Newton iterations. The convergence of the Newton iteration is linear with common ratio $\tfrac{1}{2}$ (cf. Theorem 3.16). And by (3.16) we have $\|\mathcal{R}(X_k)\| \le c\,\|\tilde{X}_k\|^2$ in this case, as verified by the numerical results. The last step is a modification step, which improves the accuracy dramatically.

Table 3.3: Performance of Algorithm 3.24 for Example 3.4

We carried out many other numerical experiments. The results reported above are typical. In these experiments the convergence of the Newton method is always observed to be linear with common ratio $\tfrac{1}{2}$ whenever all elementary divisors of $A - DX_+$ are linear.

Chapter 4

Examples and Conjectures

4.1 Examples

In the previous chapter, we have shown that the convergence of Newton's method is either quadratic or linear with common ratio $1/2$, provided the conditions in Theorems 3.1 and 3.2 are satisfied and the direct sum condition is fulfilled.

In practical applications, the direct sum condition $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ is usually satisfied. On the other hand, we can find examples where the eigenvalues on the imaginary axis have elementary divisors of arbitrary degree.

Example 4.1 Consider the CARE (3.1) with

For X = (z,) with z~ = C i (the binomial coefficients), al1 the eigendues of A- DX

are $-1$. Thus $(A, D)$ is stabilizable. We also have $X_+ = 0$ ($0$ is an almost stabilizing solution, and thus maximal), and $\lambda^n$ is the only elementary divisor of $A - DX_+$. Note also that, by Theorem 3.11, $\lambda^{2n}$ is the only elementary divisor of the Hamiltonian matrix (3.13).

We performed many numerical experiments with different $n$ and different initial guess. We observed that the Newton sequence converges to $0$ linearly with common ratio $2^{-1/n}$. Of course, since $\mathcal{R}'_{X_+}$ is not invertible, the Newton iteration cannot be continued forever. The above rates are observed when the iterates are reasonably close to the solution $X_+ = 0$.

An analytical example displaying the observed rate of convergence is thus highly

desirable.

Example 4.2 Consider the CARE (3.1) with $D$, $A$, $C$ given by (4.1) and $n = 2$. For Newton's method, we will use a real symmetric matrix as an initial guess so that all subsequent iterates are real symmetric. The real symmetric matrices $X_{k+1} = (x_{ij}^{(k+1)})$ and $X_k = (x_{ij}^{(k)})$ are now related by

We dmose 2:, = I/(\IS + l)~, z:, = c for any c > 0, and 4, can be arbitrary. Then

A - DXo is stable, and we find that for k = 1,2, . . .

Thus for any matrix norm

We note that this example also serves to show that $X_0 \ge X_1$ is not true in general.

It is also worthwhile to point out that, for Example 4.2, we no longer have

for all $k \ge 1$ and some constant $p > 0$. In fact, letting

we c m s e easily t hat A - D& has a zero eigenvalue and t hus is not invertible.

If (4.2) were true for all $k \ge 1$ and some constant $p > 0$, we would have

where $q$ is a constant independent of $k$. Therefore, for $k$ large enough, we have

1 - ( ) 11 < 1. It follows from the Banach Lemma that R;- is invertible

for $k$ large enough, a contradiction. Thus, for Example 4.2, it is impossible to have (4.2) for all $k \ge 1$ and some constant $p > 0$.

When the conditions in Theorem 3.1 and the direct sum condition are satisfied, the convergence of Newton's method has been shown to be either quadratic or linear with common ratio $1/2$. However, quadratic convergence has never been observed in practice (when the derivative at the solution is not invertible). If we drop the assumption that $D \ge 0$, then we can find examples for which there are Newton sequences converging quadratically to a solution at which the derivative is not invertible. We note that the existence of a Hermitian solution does not imply the existence of a maximal solution when the condition $D \ge 0$ is dropped.

Example 4.3 Consider the CARE (3.1) with

For this example, all eigenvalues of the Hamiltonian matrix (3.13) are on the imaginary axis. It can be easily verified that

is a solution of this equation and a maximal solution does not exist. Although $\mathcal{R}'_{X'}$ is not invertible, we can still construct Newton sequences converging quadratically to the solution $X'$.

For n 2 O, if

with $a_n \ne 0$, then $\mathcal{R}'_{X_n}$ is invertible and the next Newton iterate is

where

We now choose uo and Q such that

1 I d < g<

Using (4.3) and noting that

we can prove by induction that for any n > O

We assume for the moment that $a_n \ne 0$ for all $n \ge 0$. Thus the Newton sequence $\{X_n\}$ is well-defined and we have

Therefore, $X_n \to X'$ quadratically.

To ensure a, # O for al1 n 2 O, we may choose to be a ration al number and

choose a0 to be a transcendentd number t . In view of (4.3), we can write for each

a, = S. ( t ) Q3"-1 ( t ) '

where $P_m(t)$, $Q_m(t)$ are polynomials of degree $m$ with integer coefficients. Thus $a_n \ne 0$, since otherwise $t$ would be an algebraic number. One particular choice is

We note that Theorem 2.4 is not applicable to Example 4.3. Let $\mathcal{N} = \operatorname{Ker}\mathcal{R}'_{X'}$ and $\mathcal{M} = \operatorname{Im}\mathcal{R}'_{X'}$. We have

Thus $\mathbb{C}^{2\times 2} = \mathcal{N} \oplus \mathcal{M}$. However, the regularity condition is not satisfied for this example. In fact, we have $P_{\mathcal{N}}\mathcal{R}''_{X'}(Z, Y) = 0$ for any $Y, Z \in \mathcal{N}$.

4.2 Conjectures

As we mentioned above, we have no examples (analytic or numerical) to display quadratic convergence when the conditions of Theorem 3.1 are satisfied and $\mathcal{R}'_{X_+}$ is not invertible. Although the possibility of quadratic convergence is desirable, it may be possible to prove that the convergence is always linear.

Conjecture 4.1 Assume that $D \ge 0$, $C^* = C$, $(A, D)$ is stabilizable, and the CARE (3.1) has a Hermitian solution. If the Hamiltonian matrix (3.13) has eigenvalues on the imaginary axis then, starting with a Hermitian matrix $X_0$ such that $A - DX_0$ is stable, the convergence of the Newton sequence $\{X_k\}$ to the maximal solution $X_+$ cannot be quadratic.

Furthermore, the first two examples given in the previous section and many other numerical examples not reported here have supported the following conjecture.

Conjecture 4.2 Under the same assumptions as in Conjecture 4.1, if the highest partial multiplicity for the eigenvalues of the Hamiltonian matrix (3.13) on the imaginary axis is $2p$, then, starting with a Hermitian matrix $X_0$ such that $A - DX_0$ is stable, the Newton sequence $\{X_k\}$ converges to the maximal solution $X_+$ linearly with

$\lim_{k\to\infty}$

or more strongly,

for any matrix norm.

Chapter 5

Newton's Method for Discrete Algebraic Riccati

Equations

In Chapter 3, we considered Newton's method for continuous algebraic Riccati equations (CARE). In this chapter we consider discrete algebraic Riccati equations (DARE) of the form

$$-X + A^*XA + Q - (C + B^*XA)^*(R + B^*XB)^{-1}(C + B^*XA) = 0, \qquad (5.1)$$

where $A, Q \in \mathbb{C}^{n\times n}$, $B \in \mathbb{C}^{n\times m}$, $C \in \mathbb{C}^{m\times n}$, $R \in \mathbb{C}^{m\times m}$, and $Q^* = Q$, $R^* = R$. We denote by $\mathcal{R}(X)$ the left-hand side of (5.1). The function $\mathcal{R}(X)$ and its derivatives are much more complicated than their CARE counterparts. Nevertheless, it will be shown that most analytical properties established in Chapter 3 for the CARE can be extended to the DARE. The analysis here is more involved, but the line of attack is the same.

Let $\mathcal{D} = \{ X \in \mathbb{C}^{n\times n} \mid R + B^*XB \text{ is invertible} \}$. We have $\mathcal{R} : \mathcal{D} \to \mathbb{C}^{n\times n}$. The first Fréchet derivative of $\mathcal{R}$ at a Hermitian matrix $X \in \mathcal{D}$ is a linear map $\mathcal{R}'_X : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ given by

$$\mathcal{R}'_X(Y) = -Y + \hat{A}^*Y\hat{A}, \qquad (5.2)$$

where $\hat{A} = A - B(R + B^*XB)^{-1}(C + B^*XA)$. Also the second derivative at a Hermitian matrix $X \in \mathcal{D}$, $\mathcal{R}''_X : \mathbb{C}^{n\times n} \times \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$, is given by

where $H = B(R + B^*XB)^{-1}B^*$.

For $A \in \mathbb{C}^{n\times n}$ and $B \in \mathbb{C}^{n\times m}$, the pair $(A, B)$ is said to be d-stabilizable if there is a $K \in \mathbb{C}^{m\times n}$ such that $A - BK$ is d-stable, i.e., all its eigenvalues are in the open unit disk. As before, a Hermitian solution $X_+$ of (5.1) is called maximal if $X_+ \ge X$ for every Hermitian solution $X$. The following result is a modification of Theorem 13.1.1 in [35]. See also [46].

Theorem 5.1 Let $(A, B)$ be a d-stabilizable pair and assume that there is a Hermitian solution $X$ of the inequality $\mathcal{R}(X) \ge 0$ for which $R + B^*XB > 0$. Then there exists a maximal Hermitian solution $X_+$ of $\mathcal{R}(X) = 0$. Moreover, $R + B^*X_+B > 0$ and all the eigenvalues of $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ lie in the closed unit disk.

Remark 5.1 In Theorem 13.1.1 of [35], it is required that $R$ be invertible. This condition is needed for some later developments in [35], but is not necessary for the conclusions of Theorem 13.1.1. The proof of that theorem should be slightly modified. We have only to replace expressions of the form $Q - C^*R^{-1}C + (L - R^{-1}C)^*R(L - R^{-1}C)$ by expressions of the form $Q + L^*RL - C^*L - L^*C$. As noted in [6], the matrix $R$ may well be singular in applications.

A Hermitian solution $X$ of (5.1) is called stabilizing (resp. almost stabilizing) if all the eigenvalues of $A - B(R + B^*XB)^{-1}(C + B^*XA)$ are in the open (resp. closed) unit disk. Such solutions play important roles in applications. Theorem 5.1 tells us that, under the given conditions, the maximal solution is at least almost stabilizing.

The Newton method for the solution of (5.1) is the same as (3.4):

$$X_{i+1} = X_i - (\mathcal{R}'_{X_i})^{-1}\mathcal{R}(X_i), \qquad i = 0, 1, \ldots, \qquad (5.4)$$

given that the maps $\mathcal{R}'_{X_i}$ $(i = 0, 1, \ldots)$ are all invertible. But now the iteration (5.4) is closely related to the solution of the Stein equation described in the following classical result.

Theorem 5.2 (cf. [35, p. 100]) For any given matrices $A \in \mathbb{C}^{m\times m}$, $B \in \mathbb{C}^{n\times n}$ and $\Gamma \in \mathbb{C}^{n\times m}$ the Stein equation $S - BSA = \Gamma$ has a unique solution if and only if $\lambda_r\mu_s \ne 1$ for any $\lambda_r \in \sigma(A)$, $\mu_s \in \sigma(B)$.

It follows from Theorem 5.2 that, under the conditions of Theorem 5.1, $\mathcal{R}'_{X_+}$ is invertible if and only if $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ is d-stable.
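The solvability criterion of Theorem 5.2 can be observed directly on the matrix that represents $S \mapsto S - BSA$. The sketch below is an illustration with hypothetical helper names; it uses triangular matrices with rational entries so that the eigenvalues can be read off the diagonals, and exact arithmetic so that a rank deficiency is detected reliably.

```python
from fractions import Fraction

def rank(M):
    """Rank of a rational matrix by exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def stein_operator(A, B):
    """Matrix of S -> S - B S A on row-major vec(S); A is m x m, B is n x n."""
    m, n = len(A), len(B)
    T = [[0] * (n * m) for _ in range(n * m)]
    for i in range(n):
        for j in range(m):
            for k in range(n):
                for l in range(m):
                    # (B S A)_{ij} = sum over k, l of B_{ik} S_{kl} A_{lj}
                    T[i * m + j][k * m + l] = (1 if (i, j) == (k, l) else 0) - B[i][k] * A[l][j]
    return T

half, third, quarter, fifth = Fraction(1, 2), Fraction(1, 3), Fraction(1, 4), Fraction(1, 5)
B = [[2, 1], [0, fifth]]               # eigenvalues 2 and 1/5
A_bad = [[half, 1], [0, third]]        # eigenvalue 1/2 pairs with 2 to give product 1
A_good = [[third, 1], [0, quarter]]    # no product of eigenvalues equals 1

print(rank(stein_operator(A_bad, B)), rank(stein_operator(A_good, B)))
```

With $B = \hat{A}^*$ and $A = \hat{A}$ the operator is, up to sign, the derivative (5.2), so an eigenvalue of the closed-loop matrix on the unit circle produces exactly this kind of rank deficiency.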

When we apply Newton's method to the DARE (5.1) with $(A, B)$ d-stabilizable, the initial matrix $X_0$ is taken such that $A - B(R + B^*X_0B)^{-1}(C + B^*X_0A)$ is d-stable. The usual way to generate such an $X_0$ is as follows. We choose $L_0 \in \mathbb{C}^{m\times n}$ such that $A_0 = A - BL_0$ is d-stable, and take $X_0$ to be the unique solution of the Stein equation

$$X_0 - A_0^*X_0A_0 = Q + L_0^*RL_0 - C^*L_0 - L_0^*C. \qquad (5.5)$$

In view of (5.2), the Newton iteration (5.4) can be rewritten as

Theorem 5.3 Under the same conditions as in Theorem 5.1 and for any $L_0 \in \mathbb{C}^{m\times n}$ such that $A_0 = A - BL_0$ is d-stable, starting with the Hermitian matrix $X_0$ determined by (5.5), the recursion (5.6) determines a sequence of Hermitian matrices $\{X_i\}_{i=0}^\infty$ for which $A - B(R + B^*X_iB)^{-1}(C + B^*X_iA)$ is d-stable for $i = 0, 1, \ldots$, $X_0 \ge X_1 \ge \cdots$, and $\lim_{i\to\infty} X_i = X_+$.

An important feature of Newton's method applied to the CARE and DARE is that the convergence is not merely local. The application of Newton's method to the DARE was initiated in [26] under some conditions which, with the wisdom of hindsight, are seen to be restrictive. Similarly, Theorem 5.3 was established in the proof of [46, Thm. 3.1] under the additional condition that $R > 0$. The positive definiteness of $R$ was replaced by the invertibility of $R$ in the proof of [35, Thm. 13.1.1]. As we have pointed out in Remark 5.1, the invertibility of $R$ is also unnecessary. Note that an $L_0$ can be produced by automatic stabilizing procedures such as the one in [48]. It should also be noted that $X_0 \ge X_1$ is generally not true, if $X_0$ is not obtained from (5.5).

It is readily seen that $\mathcal{R}'_X$, as a function of $X$, is Lipschitz continuous on a closed ball centered at $X_+$ and contained in $\mathcal{D}$. We note that the expression for $\mathcal{R}'_X$ when $X$ is not Hermitian is different from the one given in (5.2). The locally quadratic convergence of Newton's method (see Theorem 2.3), in combination with Theorem 5.3, yields the following result.

Theorem 5.4 If $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ is d-stable in Theorem 5.3, then for the sequence $\{X_i\}_{i=0}^\infty$ there is a constant $c > 0$ such that, for $i = 0, 1, \ldots$, $\|X_{i+1} - X_+\| \le c\,\|X_i - X_+\|^2$, where $\|\cdot\|$ is any given matrix norm.

When the closed-loop matrix $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ has eigenvalues on the unit circle, $\mathcal{R}'_{X_+}$ is not invertible. This situation happens in some important applications (see [9], for example). We will show that the convergence of Newton's method is either quadratic or linear with common ratio $\tfrac{1}{2}$, provided that the eigenvalues on the unit circle are all semi-simple (i.e. all elementary divisors corresponding to these eigenvalues are linear). The linear convergence appears to be dominant and, when this is the case, the efficiency of the Newton iteration can be improved significantly by applying a double Newton step at the right time. Numerical results are also given to illustrate these phenomena.
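The behaviour described above already shows up for a scalar DARE. The sketch below is an illustration with data chosen here, not an example from the thesis: with $a = b = r = 1$ and $c = q = 0$, equation (5.1) reduces to $-x^2/(1+x) = 0$, the solution is $x_+ = 0$ and the closed-loop value $1/(1+x_+) = 1$ lies on the unit circle. The derivative is evaluated through the scalar form of (5.2), namely $-1 + \hat{a}^2$, and exact rational arithmetic makes the pattern visible.

```python
from fractions import Fraction

def dare_residual(x, a, b, c, q, r):
    """Left-hand side of (5.1) for scalar real data."""
    return -x + a * a * x + q - (c + b * x * a) ** 2 / (r + b * b * x)

def closed_loop(x, a, b, c, r):
    """Scalar closed-loop value a - b (r + b^2 x)^(-1) (c + b x a)."""
    return a - b * (c + b * x * a) / (r + b * b * x)

def newton_correction(x, a, b, c, q, r):
    """Solve R'_x(h) = R(x), where R'_x(h) = (-1 + a_hat^2) h in the scalar case."""
    derivative = closed_loop(x, a, b, c, r) ** 2 - 1
    return dare_residual(x, a, b, c, q, r) / derivative

data = dict(a=1, b=1, c=0, q=0, r=1)
x = Fraction(1)                      # closed-loop value 1/2, so the start is d-stable
iterates = [x]
for _ in range(3):
    x = x - newton_correction(x, **data)
    iterates.append(x)

ratios = [iterates[k + 1] / iterates[k] for k in range(3)]
double_step = x - 2 * newton_correction(x, **data)
print(iterates, ratios, double_step)
```

For this data the Newton map works out to $x \mapsto x/(x+2)$, so the error ratio tends to $1/2$, while the double step maps $x$ to $-x^2/(x+2)$ and therefore squares the error in one move.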

5.2 Interpretation of the direct sum condition for

the DARE

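We will now give an interpretation of the direct sum condition for the DARE (5.1). We assume throughout this section that the conditions of Theorem 5.1 are satisfied. Let $X_+$ be the maximal solution of (5.1) with $\mathcal{R}'_{X_+}$ not invertible. Let $\mathcal{N} = \operatorname{Ker}\mathcal{R}'_{X_+}$, $\mathcal{M} = \operatorname{Im}\mathcal{R}'_{X_+}$. We have the following interpretation of the direct sum condition.

Theorem 5.5 $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ if and only if all eigenvalues of

$$A_+ = A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$$

on the unit circle are semi-simple.

Proof. Let $J$ be the Jordan canonical form for $A_+$ with $P^{-1}A_+P = J$. We find that $K \in \mathcal{N}$ if and only if $K = P^{-*}LP^{-1}$ for some $L \in \mathcal{N}_J = \{ Y \in \mathbb{C}^{n\times n} \mid -Y + J^*YJ = 0 \}$. Also $W \in \mathcal{M}$ if and only if $W = P^{-*}UP^{-1}$ for some $U \in \mathcal{M}_J = \{ Y \in \mathbb{C}^{n\times n} \mid Y = -V + J^*VJ \text{ for some } V \in \mathbb{C}^{n\times n} \}$. Therefore, $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ if and only if $\mathbb{C}^{n\times n} = \mathcal{N}_J \oplus \mathcal{M}_J$.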

If all eigenvalues of $A_+$ on the unit circle are semi-simple, we can arrange the Jordan blocks of $A_+$ so that
$$J = \operatorname{diag}(G_1, \ldots, G_{p-1}, G_p). \qquad (5.9)$$
Here $G_p \in \mathbb{C}^{r_p\times r_p}$ consists of Jordan blocks associated with eigenvalues in the open unit disk, and for $k = 1, \ldots, p-1$
$$G_k = (a_k + b_k i)I \in \mathbb{C}^{r_k\times r_k}, \qquad (5.10)$$
where $a_k + b_k i$ ($k = 1, \ldots, p-1$) are distinct complex numbers on the unit circle.

Using block matrix multiplications and applying Theorem 5.2 repeatedly, we can find easily
$$\mathcal{N}_J = \{N = \operatorname{diag}(N_1, \ldots, N_p) \mid N_i \in \mathbb{C}^{r_i\times r_i},\ 1 \le i \le p;\ N_p = 0\},$$
$$\mathcal{M}_J = \{M = (M_{ij}) \mid M_{ij} \in \mathbb{C}^{r_i\times r_j},\ 1 \le i, j \le p;\ M_{ii} = 0,\ 1 \le i \le p-1\}.$$
Thus, $\mathbb{C}^{n\times n} = \mathcal{N}_J \oplus \mathcal{M}_J$.

If $A_+$ has nonlinear elementary divisors corresponding to eigenvalues on the unit circle, we can arrange the Jordan blocks so that the first Jordan block $J_1$ has the following form:

where $a^2 + b^2 = 1$. In this case,
$$T = \operatorname{diag}(T_1, 0, \ldots, 0) \in \mathcal{N}_J \cap \mathcal{M}_J,$$

Note that $T = -V + J^*VJ$ for

Therefore, $\mathbb{C}^{n\times n} \ne \mathcal{N}_J \oplus \mathcal{M}_J$. $\Box$
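The dichotomy in Theorem 5.5 can be checked on small examples. The sketch below is an illustration only, not part of the original argument: it builds the matrix of the linear map $Y \mapsto -Y + J^TYJ$ for a real $2\times 2$ matrix $J$ and uses the standard finite-dimensional fact that kernel and image of a linear map $L$ are complementary exactly when $\operatorname{rank} L = \operatorname{rank} L^2$. The helper names (`operator_matrix`, `kernel_image_direct_sum`) and the two test matrices are assumptions chosen for the illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def operator_matrix(J):
    """Matrix of Y -> -Y + J^T Y J in the basis of matrix units (real J)."""
    n = len(J)
    Jt = transpose(J)
    images = []
    for p in range(n):
        for q in range(n):
            # E is the matrix unit with a single 1 in position (p, q).
            E = [[Fraction(int(i == p and j == q)) for j in range(n)]
                 for i in range(n)]
            JEJ = matmul(matmul(Jt, E), J)
            images.append([JEJ[i][j] - E[i][j]
                           for i in range(n) for j in range(n)])
    return transpose(images)  # columns are the images of the basis

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def kernel_image_direct_sum(J):
    """True when Ker L and Im L are complementary, i.e. rank L == rank L^2."""
    L = operator_matrix(J)
    return rank(L) == rank(matmul(L, L))

semi_simple = [[1, 0], [0, Fraction(1, 2)]]   # eigenvalue 1 is semi-simple
jordan_block = [[1, 1], [0, 1]]               # eigenvalue 1 has a 2x2 Jordan block

print(kernel_image_direct_sum(semi_simple))   # True
print(kernel_image_direct_sum(jordan_block))  # False
```

For the Jordan block the map has rank 2 while its square has rank 1, so a nonzero matrix lies in both the kernel and the image, in line with the second half of the proof.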

5.3 Characterization of the direct sum condition

via a matrix pencil

We have just given a characterization of the direct sum condition, in which the sought after solution $X_+$ appears. In order to give a characterization which is independent of $X_+$, we consider the matrix pencil $\lambda F_e - G_e$ (known as an "extended symplectic pencil") with

Matrix pencils of this type were first introduced in [15] and [49], but for a different purpose. See also [28].

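Lemma 5.6 If (5.1) has a Hermitian solution $X$, then

where $Z = -(R + B^*XB)^{-1}(C + B^*XA)$ and
$$M_e = \begin{pmatrix} I & 0 & 0 \\ 0 & (A+BZ)^* & 0 \\ 0 & -B^* & 0 \end{pmatrix}, \qquad N_e = \begin{pmatrix} A+BZ & 0 & B \\ 0 & I & 0 \\ 0 & 0 & R+B^*XB \end{pmatrix}.$$

Proof. It can be easily verified by direct computation. In particular,

is true because $X$ is a Hermitian solution of (5.1). $\Box$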

Note that, in contrast with Proposition 15.2.1 of [35], the equality (5.11) does not require the invertibility of $R$. Note also that $\infty$ is an eigenvalue of $\lambda F_e - G_e$ with multiplicity at least $m$.

A matrix pencil $\lambda F - G$ is called regular if $\det(\lambda F - G)$ is not identically zero.

Corollary 5.7 If (5.1) has a Hermitian solution $X$, then $\lambda F_e - G_e$ is a regular pencil. Moreover, $\alpha$ is an eigenvalue of $A + BZ$ if and only if $\alpha$ and $\bar\alpha^{-1}$ are eigenvalues of $\lambda F_e - G_e$. A unimodular $\alpha$ is an eigenvalue of $A + BZ$ with algebraic multiplicity $k$ if and only if it is an eigenvalue of $\lambda F_e - G_e$ with algebraic multiplicity $2k$.

Proof. We have by Lemma 5.6
$$\det(\lambda F_e - G_e) = \det(\lambda M_e - N_e) = c\,\det(\lambda I - (A+BZ))\,\det(\lambda(A+BZ)^* - I),$$
where $c = (-1)^m\det(R + B^*XB) \ne 0$. Thus, $\lambda F_e - G_e$ is a regular pencil. If
$$\det(\lambda I - (A+BZ)) = (\lambda - \lambda_1)\cdots(\lambda - \lambda_n),$$
we have
$$\det(\lambda(A+BZ)^* - I) = (\bar\lambda_1\lambda - 1)\cdots(\bar\lambda_n\lambda - 1).$$
The remaining conclusions in the corollary follow easily. $\Box$
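For readers who want the omitted step spelled out, the following is one way to see it (a sketch, using a Schur decomposition $A + BZ = UTU^*$ with $U$ unitary and $T$ upper triangular with diagonal $\lambda_1, \ldots, \lambda_n$; the symbols $U$ and $T$ are introduced here only for this illustration):
$$\lambda(A+BZ)^* - I = U(\lambda T^* - I)U^*, \qquad \det(\lambda T^* - I) = \prod_{j=1}^{n}(\bar\lambda_j\lambda - 1),$$
since $\lambda T^* - I$ is lower triangular with diagonal entries $\bar\lambda_j\lambda - 1$. The zeros of the second factor are therefore the numbers $\bar\lambda_j^{-1}$ with $\lambda_j \ne 0$. When $|\alpha| = 1$ we have $\bar\alpha^{-1} = \alpha$, so each occurrence of $\alpha$ among the $\lambda_j$ contributes one zero at $\alpha$ to $\det(\lambda I - (A+BZ))$ and one more to $\det(\lambda(A+BZ)^* - I)$, which accounts for the doubling of the algebraic multiplicity.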

If all unimodular eigenvalues of $\lambda F_e - G_e$ are of algebraic multiplicity two, then all unimodular eigenvalues of $A + BZ$ are simple and the direct sum condition is satisfied. To give a complete characterization, we need to consider the relationship between the elementary divisors of $A + BZ$ and $\lambda F_e - G_e$.

Theorem 5.8 Let $\alpha$ be a complex number with $|\alpha| = 1$ and $X$ be a Hermitian solution of (5.1) with $R + B^*XB > 0$. If
$$\operatorname{rank}(\alpha I - A \quad B) = n,$$
then the elementary divisors of $A + BZ$ corresponding to $\alpha$ have degrees $k_1, \ldots, k_s$ ($1 \le k_1 \le \cdots \le k_s \le n$) if and only if the elementary divisors of $\lambda F_e - G_e$ corresponding to $\alpha$ have degrees $2k_1, \ldots, 2k_s$.

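Proof. Suppose the elementary divisors of $A + BZ$ corresponding to $\alpha$ have degrees $k_1, \ldots, k_s$. By the local Smith form (see [19], for example), we can find matrix polynomials $E_\alpha(\lambda)$ and $F_\alpha(\lambda)$ invertible at $\alpha$ such that
$$\lambda I - (A+BZ) = E_\alpha(\lambda)\begin{pmatrix} I & 0 \\ 0 & D \end{pmatrix}F_\alpha(\lambda), \qquad (5.12)$$
where $D = \operatorname{diag}((\lambda-\alpha)^{k_1}, \ldots, (\lambda-\alpha)^{k_s})$. Replacing $\lambda$ by $\bar\lambda^{-1}$ in (5.12), and then taking conjugate transpose, we get
$$(A+BZ)^* - \lambda^{-1}I = K_\alpha(\lambda)\begin{pmatrix} I & 0 \\ 0 & D \end{pmatrix}L_\alpha(\lambda), \qquad (5.13)$$
where $K_\alpha(\lambda)$ and $L_\alpha(\lambda) = (E_\alpha(\bar\lambda^{-1}))^*$ are rational matrix functions invertible at $\alpha$. For any rational matrix functions $F(\lambda)$ and $G(\lambda)$, we will write $F(\lambda) \sim G(\lambda)$ if there are rational matrix functions $K(\lambda)$ and $L(\lambda)$ invertible at $\alpha$ such that $F(\lambda) = K(\lambda)G(\lambda)L(\lambda)$.

Now, in view of Lemma 5.6, we have
$$\lambda F_e - G_e \sim \begin{pmatrix} \lambda I - (A+BZ) & 0 & -B \\ 0 & (A+BZ)^* - \lambda^{-1}I & 0 \\ 0 & -B^* & -(R+B^*XB) \end{pmatrix}.$$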

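By (5.12) and (5.13) we have further (for $\lambda$ in a neighborhood of $\alpha$)

where we have written the corresponding blocks as $(B_{ij})$ and $(C_{ij})$, and $-(R + B^*XB) = (S_{ij})$. Note that $(S_{ij}) < 0$ since $R + B^*XB > 0$.

Since $\operatorname{rank}(\alpha I - A \quad B) = n$, we have at $\lambda = \alpha$
$$\operatorname{rank}(\lambda I - (A+BZ) \quad -B) = \operatorname{rank}(\lambda I - A \quad -B) = n.$$
Therefore, at $\lambda = \alpha$,
$$\operatorname{rank}\begin{pmatrix} I & 0 & B_{11} & B_{12} \\ 0 & D & B_{21} & B_{22} \end{pmatrix} = n$$
and thus $\operatorname{rank}(B_{21} \quad B_{22}) = s$. Since $E_\alpha(\bar\lambda^{-1}) = E_\alpha(\lambda)$ at $\lambda = \alpha$, we have $(C_{ij}) = (B_{ij})^*$ at $\lambda = \alpha$. We may then assume that $B_{21}$ and $C_{12}$ are invertible in a neighborhood of $\alpha$ since, if otherwise, we can exchange the rows of $(C_{ij})$ and the columns of $(B_{ij})$ simultaneously (this will not change the negative definiteness of $(S_{ij})$ and the property that $(C_{ij}) = (B_{ij})^*$ at $\lambda = \alpha$).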

Now we obtain by block elimination

I O 0 0 O O

O D O O I O

O O O D O O

O O O I hl VIP

o o o o ~ l ~ z where

is a rational matrix function with $-V(\alpha) > 0$. It is clear that no principal minors of $V(\lambda)$ are zero at $\alpha$.

All nonzero minors of order $i$ for $W(\lambda)$ have the form $(\lambda - \alpha)^l q(\lambda)$, where $l \ge 0$ and $\alpha$ is neither a zero nor a pole of the rational function $q(\lambda)$. For $2n + m - s + 1 \le i \le 2n + m$, the smallest $l$ turns out to be $l_i = \sum_{j=1}^{i-2n-m+s} 2k_j$. For $1 \le i \le 2n + m - s$, the smallest $l$ is $l_i = 0$. By the Binet-Cauchy formula (see [36], for example), we can see that $(\lambda - \alpha)^{l_i}$ is also the greatest common divisor (of the form $(\lambda - \alpha)^l$) of all minors of order $i$ for $\lambda F_e - G_e$. Thus the elementary divisors of $\lambda F_e - G_e$ corresponding to $\alpha$ are $(\lambda - \alpha)^{2k_1}, \ldots, (\lambda - \alpha)^{2k_s}$. This proves the "only if" part of the theorem. The "if" part follows readily from the "only if" part. $\Box$

Corollary 5.9 If the conditions of Theorem 5.1 are satisfied and $\mathcal{R}'_{X_+}$ is not invertible, then $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ if and only if all the elementary divisors of $\lambda F_e - G_e$ corresponding to the eigenvalues on the unit circle are of degree two.

A previous result of the same nature as Theorem 5.8 can be found in [51]. That result is applicable to the DARE (5.1) with $C = 0$, $R > 0$, and $Q \ge 0$.

5.4 Convergence rate of the Newton method

When $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$, we let $P_{\mathcal{N}}$ denote the projection onto $\mathcal{N}$ parallel to $\mathcal{M}$ and let $P_{\mathcal{M}} = I - P_{\mathcal{N}}$. For the DARE (5.1), we start the Newton iteration with the Hermitian matrix $X_0$ obtained from the Stein equation (5.5). By Theorem 5.3, the Newton sequence is well-defined and converges to $X_+$. The following result shows there is some possibility of quadratic convergence and is the analogue of Theorem 3.24 for the CARE.

Lemma 5.10 For any fixed $\theta > 0$, let $Q = \{i \mid \|P_{\mathcal{M}}(X_i - X_+)\| > \theta\|P_{\mathcal{N}}(X_i - X_+)\|\}$. Then there exist an integer $i_0$ and a constant $c > 0$ such that
$$\|X_i - X_+\| \le c\,\|X_{i-1} - X_+\|^2$$
for all $i$ in $Q$ for which $i \ge i_0$.

Proof. Let $\tilde X_i = X_i - X_+$, $i = 0, 1, \ldots$, and let $L_+ = (R + B^*X_+B)^{-1}(C + B^*X_+A)$ (thus $A_+ = A - BL_+$). We have (see [35, p. 314])

and $\|L_+ - L_i\| = O(\|\tilde X_{i-1}\|)$, where the matrices $L_i$ are defined by (5.7). We also

where we have written $O(\|\tilde X_i\|^2)$ for a term $W(X_i)$ satisfying $\|W(X_i)\| = O(\|\tilde X_i\|^2)$.

Now, in view of (5.1) and (5.7),
$$\mathcal{R}(X_i) = \mathcal{R}(X_i) - \mathcal{R}(X_+)$$
$$= -\tilde X_i + A^*\tilde X_iA - (C + B^*X_iA)^*L_{i+1} + (C + B^*X_+A)^*L_+$$
$$= -\tilde X_i + A_+^*\tilde X_iA_+ - A_+^*\tilde X_iA_+ + A^*\tilde X_iA - \{(C + B^*X_iA)^* - (C + B^*X_+A)^*\}L_{i+1} - (C + B^*X_+A)^*(L_{i+1} - L_+)$$

In the last equality, we have used

Thus for $i$ large enough,

for some constants $c_1$ and $c_2$.

On the other hand, for $i$ in $Q$ and large enough, we have as in the proof of Theorem 3.14
$$\|\mathcal{R}(X_i)\| \ge \big(c_3(\theta^{-1} + 1)^{-1} - c_4\|\tilde X_i\|\big)\|\tilde X_i\| \qquad (5.15)$$
for some constants $c_3$ and $c_4$. Since $X_i \ne X_+$ for any $i$, we have by (5.14) and (5.15)

Therefore, we can find an $i_0$ such that $\|\tilde X_i\| \le c\|\tilde X_{i-1}\|^2$ for all $i$ in $Q$ for which $i \ge i_0$. $\Box$

Corollary 5.11 Assume that, for given $\theta > 0$, $\|P_{\mathcal{M}}(X_i - X_+)\| > \theta\|P_{\mathcal{N}}(X_i - X_+)\|$ for all $i$ large enough. Then $X_i \to X_+$ quadratically.

The condition in Corollary 5.11 appears to be not easily satisfied. The next result describes what will happen if the convergence of the Newton iteration is not quadratic.

Theorem 5.12 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$. If the convergence of the Newton sequence $\{X_i\}$ is not quadratic, then $\|(\mathcal{R}'_{X_i})^{-1}\| \le c\|X_i - X_+\|^{-1}$ for all $i \ge 1$ and some constant $c > 0$. Moreover,
$$\lim_{i\to\infty}\frac{\|X_{i+1} - X_+\|}{\|X_i - X_+\|} = \frac{1}{2}, \qquad \lim_{i\to\infty}\frac{\|P_{\mathcal{M}}(X_i - X_+)\|}{\|P_{\mathcal{N}}(X_i - X_+)\|^2} = 0.$$

The proof of this theorem will again be an application of Theorem 2.4. Some preliminary results will be needed.

The first result is Lemma 4.5.3 of [35].

Lemma 5.13 Let $A$ be $n \times n$, and $B$ be $n \times m$. Then $(A, B)$ is d-stabilizable if and only if $(A + BK, BL)$ is d-stabilizable for any $m \times n$ matrix $K$ and any $m \times p$ matrix $L$ for which $\operatorname{Im}(BL) = \operatorname{Im} B$.

The next result is Theorem 4.5.6 (b) of [35].

Lemma 5.14 Let $A$ be $n \times n$, and $B$ be $n \times m$. Then $(A, B)$ is d-stabilizable if and only if
$$\operatorname{rank}(\lambda I - A \quad B) = n$$
for every $\lambda \in \mathbb{C}$ with $|\lambda| \ge 1$.

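We then have the following easy consequence of the above two results.

Lemma 5.15 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ and let the matrices $P$ and $J$ be as in the proof of Theorem 5.5. Then
$$\operatorname{rank}\big(\lambda I - J \quad P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*}\big) = n \qquad (5.16)$$
for every complex number $\lambda$ with $|\lambda| \ge 1$.

Proof. It is well known that $\operatorname{Im}(CC^*) = \operatorname{Im}(C)$ for any matrix $C$. Therefore,
$$\operatorname{Im}\big(B(R + B^*X_+B)^{-1}B^*\big) = \operatorname{Im} B.$$
It follows from Lemma 5.13 that $(A - BL_+,\ B(R + B^*X_+B)^{-1}B^*)$ is d-stabilizable. Since $P^{-1}(A - BL_+)P = J$, the pair $(J,\ P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*})$ is also d-stabilizable. (5.16) now follows from Lemma 5.14. $\Box$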

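For fixed $Z \in \mathcal{N}$, we consider the map $\mathcal{B}_Z : \mathcal{N} \to \mathcal{N}$ defined by
$$\mathcal{B}_Z(Y) = P_{\mathcal{N}}\mathcal{R}''_{X_+}(Z, Y).$$
Using the notation and results in the proof of Theorem 5.5, we can write $Y = P^{-*}Y_JP^{-1}$, $Z = P^{-*}Z_JP^{-1}$ with $Y_J, Z_J \in \mathcal{N}_J$. Let $H_+ = B(R + B^*X_+B)^{-1}B^*$. We have by (5.3)

where $D_+ = P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*}$, and $P_{\mathcal{N}_J}$ is the projection onto $\mathcal{N}_J$ parallel to $\mathcal{M}_J$. Let $Z_J = \operatorname{diag}(Z_1, \ldots, Z_p)$, $Y_J = \operatorname{diag}(Y_1, \ldots, Y_p)$ and $\operatorname{diag}(D_1, \ldots, D_p)$ be the block diagonal of $D_+$. We have further

where we define linear transformations $\mathcal{T}_k : \mathbb{C}^{r_k\times r_k} \to \mathbb{C}^{r_k\times r_k}$ by

For $k = 1, 2, \ldots, p-1$, let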

Lemma 5.16 For $k = 1, 2, \ldots, p-1$, the set $U_k$ has measure zero in $\mathbb{C}^{r_k\times r_k}$.

Proof. We need only to prove the result for $k = 1$. As in Chapter 3 we can show that $U_1$ has measure zero in $\mathbb{C}^{r_1\times r_1}$ unless $\det D_1 = 0$. Note that $D_+ = P^{-1}B(R + B^*X_+B)^{-1}B^*P^{-*}$ is Hermitian positive semidefinite. If $\det D_1 = 0$, the first $r_1$ rows of $D_+$ would be linearly dependent by Lemma 3.20. Thus $\operatorname{rank}((a_1 + b_1 i)I - J \quad D_+) < n$, which contradicts Lemma 5.15. $\Box$

Lemma 5.17 If $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ then
$$U = \{Z \in \mathcal{N} \mid \mathcal{B}_Z : \mathcal{N} \to \mathcal{N} \text{ is not invertible}\}$$
has measure zero in $\mathcal{N}$.

Proof. The result follows from (5.17) and Lemma 5.16, as in Chapter 3. $\Box$

Proof of Theorem 5.12. Note that the map $\mathcal{R}$ can be extended to a smooth map on $\mathbb{C}^{n\times n}$ without changing its values on a closed ball centered at $X_+$ and contained in $\mathcal{D}$. Now, as in Chapter 3, the proof can be completed by applying Theorem 5.3, Theorem 2.4, Corollary 5.11 and Lemma 5.17. $\Box$

When all elementary divisors of the closed-loop matrix corresponding to the eigenvalues on the unit circle are linear, we know from Theorem 5.12 that the convergence of the Newton iteration is, if not quadratic, linear with rate $1/2$. The following example shows that linear convergence with rate $1/2$ is not to be expected when we have elementary divisors of higher degree.

Example 5.1 Consider the DARE (5.1) with $n = 2$, $m = 1$ and

Clearly $(A, B)$ is d-stabilizable and $X_+ = 0$ ($0$ is the unique almost stabilizing solution in this case. See Theorem 13.5.2 of [35], for example). Note that $(\lambda - 1)^2$ is the only elementary divisor of $A_+ = A$.

For any real matrix $L_0$ such that $A_0 = A - BL_0$ is d-stable, the Newton sequence $\{X_i\}$ is well defined by the recursion (5.5)-(5.6). It is clear that the iterates $X_i$ are all real symmetric. We write for $i = 0, 1, \ldots$,

Since $A - B(R + B^*X_iB)^{-1}(C + B^*X_iA)$ is d-stable, we can deduce that $c_i \ne 0$. Since $X_i \ge 0$, we also have $a_i, b_i > 0$.

By (5.6)-(5.8), we find for $i = 0, 1, \ldots$

Since Xi -t O, we get from (5.21)

c;+l 1 lim - - i-.. - 3

It follows from (5.19) that

It then follows from (5.20), (5.22) and (5.23) that l i ~ , , bilai = O. If the conver-

gence of the Newton iteration is linear with rate p, then lin,, = p. New

by (5.19) and (5.23),

If liq,, q/af = O, we would have liq,, ai+l/ai = 112 by (5.24). In view of (5.22),

We have limi,, ~ i / a : = os, a contradiction. Thus, lir~,, ei/a: # O. Therefore,

ci+l a: lim -- = 1, i-m a:+l

and we get from (5.22) that p = 1 1 4 . The constant 118 cornes as no surprise.

This same constant also appeared in a similar situation for the CARE case (see Example 4.2).

The above example can also serve to show that $X_0 \ge X_1$ is generally not true if $X_0$ is not determined by (5.5). Take

with a > 4 0 < c < 1, and 6 r d . It is easily checked that A - B(R+ BmXoB)-'(C + BmXoA) is d-stable. We see fiom (5.19) that al - 0.5~'-" as e -+ O. Thus Xo 2 Xl

cannot be true for smd e. As e and 6 go to zero, we have llXo - X+I/ -+ 0, but

11x1 - X+ll - 00-

As for the CARE, quadratic convergence has never been observed when the conditions of Theorem 5.1 are satisfied and $A_+$ has eigenvalues on the unit circle. We have the following conjecture.

Conjecture 5.18 Assume that $(A, B)$ is a d-stabilizable pair and the DARE (5.1) has a maximal solution $X_+$ with $R + B^*X_+B > 0$. If $\mathcal{R}'_{X_+}$ is not invertible and $p$ is the highest degree of the elementary divisors of $A - B(R + B^*X_+B)^{-1}(C + B^*X_+A)$ associated with its eigenvalues on the unit circle, then the Newton sequence defined by (5.5)-(5.6) converges to $X_+$ linearly with

or, more strongly,

for any matrix norm.

5.5 Using the double Newton step

The Newton iteration can be used to find the maximal solution of the DARE (5.1) when the closed-loop matrix has eigenvalues on the unit circle, while most other algorithms are not applicable in this case (see [41]). We have shown that the convergence of Newton's method is either quadratic or linear with rate $1/2$, provided that the unimodular eigenvalues are all semi-simple. Quadratic convergence has never been observed in numerical experiments and we have conjectured above that the convergence is always linear. In this section we will show that the efficiency of the Newton iteration can be improved significantly if a double Newton step is used at the right time. However, the second derivative of the Riccati function is no longer constant (compare (5.3) with (3.3) and note that, in (5.3), $H$ depends on $X$). Consequently, the improvement will not be as dramatic as for the CARE. In contrast with Theorems 3.22 and 3.23, we have:

Lemma 5.19 In the setting of Theorems 5.1 and 5.3, assume that $X_k$ is close enough to $X_+$ with $X_k - X_+ \in \mathcal{N}$ and that $\|(\mathcal{R}'_{X_k})^{-1}\| \le c\|X_k - X_+\|^{-1}$ with $c$ independent of $k$. If $Y_{k+1} = X_k - 2(\mathcal{R}'_{X_k})^{-1}\mathcal{R}(X_k)$, then $\|Y_{k+1} - X_+\| \le c_1\|X_k - X_+\|^2$ for some constant $c_1$ independent of $k$.

Proof. By Taylor's Theorem,

and then

When the direct sum condition is satisfied and the convergence of the Newton sequence $\{X_k\}$ is not quadratic, we have $\|(\mathcal{R}'_{X_k})^{-1}\| \le c\|X_k - X_+\|^{-1}$ for all $k$ (cf. Theorem 5.12). Moreover, the error $X_k - X_+$ will be dominated by its $\mathcal{N}$-component for large $k$. A much better approximate solution can then be obtained by applying the double Newton step. More precisely, we have the following result.

Theorem 5.20 Assume $\mathbb{C}^{n\times n} = \mathcal{N} \oplus \mathcal{M}$ and the convergence of the Newton iteration is not quadratic. If for some $k$, $\|X_k - X_+\|$ is small enough and $\|P_{\mathcal{M}}(X_k - X_+)\| \le \epsilon\|P_{\mathcal{N}}(X_k - X_+)\|$ with $\epsilon$ sufficiently small, and $Y_{k+1} = X_k - 2(\mathcal{R}'_{X_k})^{-1}\mathcal{R}(X_k)$, then $\|Y_{k+1} - X_+\| \le c_1\epsilon + c_2\|X_k - X_+\|^2$ for some constants $c_1$ and $c_2$ independent of $\epsilon$ and $k$.

Proof. The result follows from Lemma 5.19 and the argument used in the proof of Theorem 3.23. $\Box$

In contrast with the CARE case, it can happen that the matrix $Y_{k+1}$ in Theorem 5.20 is neither stabilizing nor almost stabilizing.

Example 5.2 (cf. [35, Example 13.2.1]). Consider the DARE (5.1) with $Q = C = 0$ and $A = B = R = I$. Clearly $(A, B)$ is d-stabilizable and $X_+ = 0$. All eigenvalues of the closed-loop matrix are on the unit circle and semi-simple. For $L_0 = I$, the Newton iterates are found to be
$$X_k = \frac{1}{2^{k+1} - 1}\,I.$$
Thus, the convergence is linear with rate $1/2$. If we compute $Y_{k+1}$ as in Theorem 5.20, we get
$$Y_{k+1} = -\frac{1}{(2^{k+1} - 1)(2^{k+2} - 1)}\,I.$$
Although $Y_{k+1}$ is much more accurate than $X_{k+1}$ for large $k$, it is neither stabilizing nor almost stabilizing.
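The scalar case of Example 5.2 is easy to reproduce. The sketch below is an illustration under stated assumptions, not the author's code: it takes $A = B = R = 1$ and $Q = C = 0$ as scalars, so that the Riccati function reduces to $\mathcal{R}(x) = -x^2/(1+x)$, starts from $x_0 = 1$ (the value consistent with the closed form for the iterates), and uses exact rational arithmetic. The function names are hypothetical.

```python
from fractions import Fraction

def riccati(x):
    """Scalar Riccati function -x + x - x^2/(1 + x) for a = b = r = 1, q = c = 0."""
    return -x * x / (1 + x)

def riccati_derivative(x):
    """Derivative of the scalar Riccati function above."""
    return -(x * x + 2 * x) / (1 + x) ** 2

def newton_step(x, factor=1):
    """One Newton step; factor=2 gives the double Newton step."""
    return x - factor * riccati(x) / riccati_derivative(x)

# Plain Newton iteration from x_0 = 1 (assumed starting value).
iterates = [Fraction(1)]
for _ in range(10):
    iterates.append(newton_step(iterates[-1]))

k = 10
y_next = newton_step(iterates[k], factor=2)  # double step from x_k

print(iterates[:4])
print(y_next)
```

In this scalar model each plain step maps $x$ to $x/(x+2)$, which roughly halves the error, while the double step maps $x$ to $-x^2/(x+2)$: the error is squared, but the result is negative and so lies on the wrong side of $X_+ = 0$.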

The double Newton step is useful in that it can significantly improve the accuracy of the current Newton iterate and thus find more correct digits of the exact solution. The potential problem of getting a slightly non-stabilizing approximate solution is not our concern here. Even if an exact solution with infinite number of decimals is known, we will probably get a slightly non-stabilizing approximate solution by keeping only a finite number of decimals.

Theorem 5.20 suggests the following modification of the Newton method.

Algorithm 5.21 (Modified Newton method for the DARE).

1. Choose a matrix $L_0$ for which $A - BL_0$ is d-stable.

2. Find $X_0$ from (5.5).

In the above algorithm, $\|\cdot\|$ is an easily computable matrix norm (e.g. 1-norm) and $\epsilon$ is a prescribed accuracy. The equation $\mathcal{R}'_{X_k}(H) = \mathcal{R}(X_k)$ can be rewritten as a Stein equation $H - A_{k+1}^*HA_{k+1} = -\mathcal{R}(X_k)$, which can be solved efficiently by a variation of the Bartels-Stewart algorithm [3]. See also [18] and [41]. According to Theorem 5.20, the double Newton step will be efficient only when the current iterate is already reasonably close to the solution. This is a major difference between the DARE case and the CARE case (see Theorem 3.23). We may try the double Newton step only when the norm of the residual is small enough (less than $\sqrt{\epsilon}$, say) and save a little more computational work. In the above algorithm, all iterates except the last one are identical to those produced by the original Newton method. Thus all good properties of the Newton method are retained.

In this section we give two simple examples to illustrate the performance of the

modified Newton method.

Example 5.3 We consider the DARE (5.1) with $n = m = 2$ and

Note that $A$ and $R$ are both singular. It can be easily verified that $X_+ = \operatorname{diag}(1, 0)$ is the only solution of the DARE and the closed-loop eigenvalues are $0$ and $1$. Note also that $R + B^*X_+B > 0$. We take $L_0 = \operatorname{diag}(0, 2)$ so that $A_0 = A - BL_0$ is d-stable, and apply the modified Newton method with $\epsilon = 10^{-10}$. The numerical results are recorded in Table 5.1. The last iterate is produced by the double Newton step.

Example 5.4 We consider the DARE (5.1) with $n = m = 8$ and

Table 5.1 : Performance of the modified Newton method for Example 5.3

For this example, $X_+ = 0$ and the closed-loop eigenvalues are those of $A$. The unimodular eigenvalues are all semi-simple. We take $L_0 = \operatorname{diag}(-1, 1, 1, 1, 1, 0.1, 0.1, 0.1)$ so that $A_0 = A - BL_0$ is d-stable, and apply the modified Newton method with

r = IO-''. The resdts are recorded in Table 5.2. Again, the last iterate is produced

by the double Newton step.

In both examples, the convergence of the Newton method is linear and the final double Newton step reduces the error significantly.

Table 5.2: Performance of the modified Newton method for Example 5.4

Chapter 6

A Special Discrete Algebraic Riccati Equation

6.1 Introduction

In this chapter, we are concerned with the iterative solution of the matrix equation
$$X + A^*X^{-1}A = Q. \qquad (6.1)$$
The matrix $Q$ is $m \times m$ Hermitian positive definite and Hermitian positive definite solutions are required. This equation has been studied recently by several authors (see [2, 16, 17, 52, 53]). The equation has applications in control theory, ladder networks, dynamic programming, stochastic filtering, and statistics. See the references given in [2].

A maximal solution of a matrix equation was defined before. Similarly, a Hermitian solution $X_-$ of a matrix equation is called minimal if $X_- \le X$ for any Hermitian solution $X$ of the matrix equation.

It is proved in [17] that if (6.1) has a positive definite solution, then it has a maximal Hermitian solution $X_+$ and a minimal Hermitian solution $X_-$. Indeed, we have $0 < X_- \le X \le X_+$ for any Hermitian solution $X$ of (6.1). Moreover, we have $\rho(X_+^{-1}A) \le 1$ (see, e.g., [52]), where $\rho(\cdot)$ is the spectral radius.

When the matrix $A$ is nonsingular, the minimal positive definite solution of (6.1) can be found via the maximal solution of another equation of the same type. The following result is a slight generalization of [17, Thm. 3.3].

Lemma 6.1 If $A$ is nonsingular, then $X$ is a solution of
$$X + A^*X^{-1}A = Q$$
if and only if $Y = Q - X$ is a solution of
$$Y + AY^{-1}A^* = Q.$$
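The equivalence in Lemma 6.1 can be verified in a few lines. The following is a sketch of one direction, assuming $X$ is an invertible solution and $A$ is nonsingular; the other direction is the same computation with the roles of $A$ and $A^*$ exchanged:
$$X + A^*X^{-1}A = Q \;\Longrightarrow\; Y = Q - X = A^*X^{-1}A \;\Longrightarrow\; Y^{-1} = A^{-1}XA^{-*} \;\Longrightarrow\; AY^{-1}A^* = X = Q - Y,$$
so that $Y + AY^{-1}A^* = Q$. The nonsingularity of $A$ is used exactly once, to invert $Y = A^*X^{-1}A$.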

In [17], an algorithm was presented to find the minimal solution of the equation (6.1) for the case where $A$ is singular. The algorithm was based on a recursive reduction process. The reduction process is useful in showing that the minimal positive definite solution of (6.1) exists even if the matrix $A$ is singular. However, it is usually impossible to find the minimal solution using that algorithm. The reason is simply that the singularity of general matrices can only be checked approximately due to rounding errors. For a given matrix $A$, we can use singular value decomposition to check if the matrix is nearly singular (see [22]). However, it is extremely difficult (if not impossible) to determine if a general matrix is exactly singular or not. On the other hand, for our equation, the minimal solution is generally not continuous at a singular matrix $A$, as the following example shows.

Example 6.1 Let $Q = I$ in equation (6.1). If

we find

But for

with $0 < \epsilon < 0.3$, we find

Therefore, there is virtually no hope in finding the minimal solution in the presence of rounding errors when $A$ is singular. We will therefore limit our discussion to the maximal solution.

As we have seen in Chapter 5, invertibility of $R$ is not necessary for the general theory for the DARE (5.1). The matrix equation (6.1) is then a special case of the DARE (5.1) with $R = 0$, $A = 0$, and $B = I$. For comparison purposes, and to improve existing results, we will study the properties of some simpler methods for (6.1) before we study the application of Newton's method to this equation.

In Section 2, we discuss the convergence behaviour of the basic fixed point iteration for the maximal solution of (6.1). In Section 3, we study the convergence behaviour of inversion free variants of the basic fixed point iteration. In general, these algorithms are linearly convergent and do not perform well when there are eigenvalues of $X_+^{-1}A$ on, or near, the unit circle. In Section 4, we study the properties of the Newton iteration. Some numerical examples are reported in Section

Throughout this chapter, $\|\cdot\|$ will be the spectral norm for square matrices unless otherwise noted.

6.2 Basic fixed point iteration

The maximal solution $X_+$ of (6.1) can be found by the following basic fixed point iteration:

Algorithm 6.2
$$X_0 = Q, \qquad X_{n+1} = Q - A^*X_n^{-1}A, \quad n = 0, 1, \ldots.$$

For Algorithm 6.2, we have $X_0 \ge X_1 \ge \cdots$, and $\lim_{n\to\infty} X_n = X_+$ (see, e.g., [17]). The following result is given in [52].

Theorem 6.3 For any $\epsilon > 0$,

for all $n$ sufficiently large.

We now show that the above result can be improved.

Theorem 6.4 For all $n \ge 0$,

we have

Hence,

In the last equality, we have used the fact that $\lim_{k\to\infty}\|B^k\|^{1/k} = \rho(B)$ for any square matrix $B$ and any norm. $\Box$

We mentioned earlier that $\rho(X_+^{-1}A) \le 1$ is always true. From the second part of the above result, we know that the convergence of the fixed point iteration is R-linear whenever $\rho(X_+^{-1}A) < 1$. For detailed definitions of the rates of convergence, see [43]. Zhan asked in [52] whether $\rho(X_+^{-1}A) \le 1$ implies $\|X_+^{-1}A\| \le 1$. This is not the case and, in fact, it is possible to have $\|X_+^{-1}A\| > 1$ when $\rho(X_+^{-1}A) < 1$. If $\rho(X_+^{-1}A) = 1$, the convergence of the fixed point iteration is typically sublinear.

Example 6.2 We consider the scalar case of (6.1) with $A = \frac{1}{2}$ and $Q = 1$, i.e.,
$$X + \frac{1}{4}X^{-1} = 1.$$
Clearly, $X_+ = \frac{1}{2}$ and $\rho(X_+^{-1}A) = 1$. For the fixed point iteration
$$X_0 = 1, \qquad X_{n+1} = 1 - \frac{1}{4X_n}, \quad n = 0, 1, \ldots,$$
we have
$$X_0 > X_1 > \cdots, \quad \text{and} \quad \lim_{n\to\infty} X_n = \frac{1}{2}.$$
Note that
$$\lim_{n\to\infty}\frac{X_{n+1} - X_+}{X_n - X_+} = 1,$$
i.e., the convergence is sublinear.
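The sublinear behaviour in Example 6.2 can be observed directly. The sketch below is an illustration, assuming the basic fixed point iteration has the scalar form $x_{n+1} = q - a^2/x_n$ with $x_0 = q$; the function name is hypothetical and exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction

def fixed_point_iterates(a, q, steps):
    """Scalar fixed point iteration x_{n+1} = q - a*a/x_n started at x_0 = q."""
    x = q
    out = [x]
    for _ in range(steps):
        x = q - a * a / x
        out.append(x)
    return out

xs = fixed_point_iterates(Fraction(1, 2), Fraction(1), 1000)
errors = [x - Fraction(1, 2) for x in xs]   # distance to X_+ = 1/2

print(xs[:4])                    # 1, 3/4, 2/3, 5/8
print(errors[-1] / errors[-2])   # ratio of successive errors, close to 1
```

In this scalar model the error after $n$ steps is $1/(2(n+1))$, so about $5 \times 10^{5}$ iterations would be needed to reach an error of $10^{-6}$.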

6.3 Inversion free variant of the basic fixed point iteration

In [52], Zhan proposed an inversion free variant of the basic fixed point iteration for the maximal solution of (6.1) when $Q = I$. For general positive definite $Q$, Zhan's algorithm takes the following form:

Algorithm 6.5 Take $X_0 = Q$, $Y_0 = Q^{-1}$. For $n = 0, 1, \ldots$, compute

The convergence of Algorithm 6.5 was established in [52] for $Q = I$. Zhan's result can easily be transplanted and we have

Theorem 6.6 If (6.1) has a positive definite solution then, for Algorithm 6.5, $X_0 \ge X_1 \ge \cdots$, $Y_0 \le Y_1 \le \cdots$, and $\lim_{n\to\infty} X_n = X_+$, $\lim_{n\to\infty} Y_n = X_+^{-1}$.

The problem of convergence rate for Algorithm 6.5 was not solved in [52]. We now establish the following result:

Theorem 6.7 For any $\epsilon > 0$, we have

for all $n$ sufficiently large.

Proof. We have from Algorithm 6.5

The inequality (6.3) follows since $\|Y_n - X_+^{-1}\| \le \|Y_{n-1} - X_+^{-1}\|$ and $\lim Y_n = X_+^{-1}$. The inequality (6.4) is true since

If $A$ is nonsingular, we have by (6.6) and (6.7)

Therefore, since $\|X_n - X_+\| \le \|X_{n-1} - X_+\|$, (6.5) is true for $n$ large enough.

The above proof shows that Algorithm 6.5 should be modified as follows to improve the convergence properties:

Algorithm 6.8 Take $X_0 = Q$, $0 < Y_0 \le Q^{-1}$. For $n = 0, 1, \ldots$, compute
$$Y_{n+1} = Y_n(2I - X_nY_n), \qquad X_{n+1} = Q - A^*Y_{n+1}A.$$

Note that one convenient choice of $Y_0$ is $Y_0 = I/\|Q\|_\infty$. We can also use this choice of $Y_0$ in Algorithm 6.5. Theorems 6.6 and 6.7 remain true for any $Y_0$ such that $0 < Y_0 \le Q^{-1}$.
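A scalar sketch makes the structure of the inversion free iteration concrete. It assumes the update order $Y_{n+1} = Y_n(2 - X_nY_n)$ followed by $X_{n+1} = Q - A^*Y_{n+1}A$, which is the form used in the proof of Theorem 6.10; the function name and the test values $a = 0.3$, $q = 1$ are illustrative choices, not data from the thesis.

```python
import math

def inversion_free(a, q, y0, steps):
    """Scalar inversion free iteration: only multiplications, no division by x."""
    x, y = q, y0
    history = [(x, y)]
    for _ in range(steps):
        y = y * (2 - x * y)      # Newton-type update of the approximate inverse
        x = q - a * y * a        # uses the freshly updated y
        history.append((x, y))
    return history

a, q = 0.3, 1.0
x_plus = (q + math.sqrt(q * q - 4 * a * a)) / 2   # larger root of x + a^2/x = q
hist = inversion_free(a, q, 1.0 / q, 60)

print(hist[-1])    # approximately (x_plus, 1 / x_plus)
```

For these values $\rho(X_+^{-1}A) = a/x_+ = 1/3$, so the asymptotic error reduction per step for $Y_n$ is about $1/9$, in line with Corollary 6.13.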

Lemma 6.9 [52] If $C$ and $P$ are Hermitian matrices of the same order with $P > 0$, then $CPC + P^{-1} \ge 2C$.

Theorem 6.10 If (6.1) has a positive definite solution and $\{X_n\}$, $\{Y_n\}$ are determined by Algorithm 6.8, then $X_0 \ge X_1 \ge \cdots$, $\lim_{n\to\infty} X_n = X_+$; $Y_0 \le Y_1 \le \cdots$, $\lim_{n\to\infty} Y_n = X_+^{-1}$.

Proof. It is clear that

is true for $n = 1$. Assume (6.8) is true for $n = k$. We have by Lemma 6.9

Therefore,
$$X_{k+1} = Q - A^*Y_{k+1}A \ge Q - A^*X_+^{-1}A = X_+.$$
Since $Y_k \le X_k^{-1} \le X_+^{-1}$, we have $Y_k^{-1} \ge X_k$. Thus,
$$Y_{k+1} - Y_k = Y_k(Y_k^{-1} - X_k)Y_k \ge 0,$$
$$X_{k+1} - X_k = -A^*(Y_{k+1} - Y_k)A \le 0.$$
We have now proved (6.8) for $n = k + 1$. Therefore, (6.8) is true for all $n$, and the limits $\lim_{n\to\infty} X_n$ and $\lim_{n\to\infty} Y_n$ exist. As in [52], we have $\lim X_n = X_+$, and $\lim Y_n = X_+^{-1}$.

Theorem 6.11 For Algorithm 6.8 and any $\epsilon > 0$, we have
$$\|Y_{n+1} - X_+^{-1}\| \le (\|AX_+^{-1}\| + \epsilon)^2\,\|Y_n - X_+^{-1}\|,$$
$$\|X_n - X_+\| \le \|A\|^2\,\|Y_n - X_+^{-1}\|$$
for all $n$ large enough. If $A$ is nonsingular, we also have

for all $n$ large enough.

Proof. The proof is very similar to that of Theorem 6.7. $\Box$

We see from the estimates in Theorem 6.7 and Theorem 6.11 that Algorithm 6.8 can be faster than Algorithm 6.5 by a factor of 2. Compared with Algorithm 6.2, Algorithm 6.8 needs more computational work per iteration. However, Algorithm 6.8 has better numerical properties since matrix inversions have been avoided. Algorithm 6.8 is particularly useful on a parallel computing system, since matrix-matrix multiplication can be carried out in parallel very efficiently (see, e.g., [22]).

For Algorithm 6.8, R-linear convergence can be guaranteed whenever $\rho(X_+^{-1}A) < 1$. This will be a consequence of the following general result.

Theorem 6.12 [34, p. 21] Let $T$ be a (nonlinear) operator from a Banach space $E$ into itself and $x^* \in E$ be a solution of $x = Tx$. If $T$ is Fréchet differentiable at $x^*$ with $\rho(T'_{x^*}) < 1$, then the iterates $x_{n+1} = Tx_n$ ($n = 0, 1, \ldots$) converge to $x^*$, provided that $x_0$ is sufficiently close to $x^*$. Moreover, for any $\epsilon > 0$,

where $\|\cdot\|$ is the norm in $E$ and $c(x_0; \epsilon)$ is a constant independent of $n$.

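Corollary 6.13 For Algorithm 6.8, we have

Proof. For Algorithm 6.8, we have

where the operator $T$ is defined on $\mathbb{C}^{m\times m}$ ($m$ is the order of $Q$) by
$$T(Y) = 2Y - YQY + YA^*YAY.$$
It is found that the Fréchet derivative $T'_Y : \mathbb{C}^{m\times m} \to \mathbb{C}^{m\times m}$ is given by
$$T'_Y(Z) = 2Z - ZQY - YQZ + ZA^*YAY + YA^*YAZ + YA^*ZAY.$$
Therefore,
$$T'_{X_+^{-1}}(Z) = X_+^{-1}A^*ZAX_+^{-1}.$$
Let $\lambda$ be any eigenvalue of $T'_{X_+^{-1}}$. We have
$$X_+^{-1}A^*ZAX_+^{-1} = \lambda Z \qquad (6.11)$$
for some $Z \ne 0$. However, (6.11) is equivalent to
$$\big((AX_+^{-1})^T \otimes (AX_+^{-1})^*\big)\operatorname{vec} Z = \lambda \operatorname{vec} Z.$$
Therefore, $\lambda$ is an eigenvalue of $T'_{X_+^{-1}}$ if and only if it is an eigenvalue of $(AX_+^{-1})^T \otimes (AX_+^{-1})^*$. Note that $\sigma\big((AX_+^{-1})^T \otimes (AX_+^{-1})^*\big) = \{\lambda\bar\mu : \lambda, \mu \in \sigma(AX_+^{-1})\}$ (see [35, Theorem 5.1.1]). Thus, $\rho(T'_{X_+^{-1}}) = (\rho(AX_+^{-1}))^2$. By Theorem 6.12, we have
$$\limsup_{n\to\infty}\sqrt[n]{\|Y_n - X_+^{-1}\|} \le (\rho(AX_+^{-1}))^2 = (\rho(X_+^{-1}A))^2.$$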
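The evaluation of the Fréchet derivative at $Y = X_+^{-1}$ rests on a cancellation that is worth writing out. As a sketch, using only $Q - A^*X_+^{-1}A = X_+$ (that is, equation (6.1) at $X = X_+$), the six terms of $T'_Y(Z)$ group as
$$T'_Y(Z) = 2Z - Z(Q - A^*YA)Y - Y(Q - A^*YA)Z + YA^*ZAY,$$
and at $Y = X_+^{-1}$ the two middle terms become $ZX_+X_+^{-1} = Z$ and $X_+^{-1}X_+Z = Z$. Hence
$$T'_{X_+^{-1}}(Z) = 2Z - Z - Z + X_+^{-1}A^*ZAX_+^{-1} = X_+^{-1}A^*ZAX_+^{-1}.$$
The Kronecker form then follows from the identity $\operatorname{vec}(BZC) = (C^T \otimes B)\operatorname{vec} Z$ with $B = X_+^{-1}A^* = (AX_+^{-1})^*$ and $C = AX_+^{-1}$.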

6.4 Newton's method

For equation (6.1), the convergence of the algorithms in the above two sections may be very slow when $X_+^{-1}A$ has eigenvalues close to (or even on) the unit circle. In these situations, Newton's method can be recommended.

When $A$ is invertible, it was shown in [17] that $X$ is a solution of $X + A^*X^{-1}A = I$ if and only if $X$ is a solution of the DARE

However, the results in Chapter 5 cannot be applied to this equation. For the maximal positive definite solution $X_+$ of (6.12), we have $X_+ = I - A^*X_+^{-1}A < I$. Therefore, $-I + X_+ > 0$ cannot be true and Theorem 5.1 cannot be applied.

We now let $m = n$ in DARE (5.1), and take $A = 0$, $R = 0$, $B = I$. The equation becomes $X + C^*X^{-1}C = Q$, which has the same form as (6.1), and the hypotheses of Theorem 5.1 are trivially satisfied. We can then apply the results in Chapter 5 to the equation (6.1) (the matrix $A$ in (6.1) has taken the place of the matrix $C$ in (5.1)).

The next result is an immediate consequence of Theorem 5.1. The first conclusion has also been proved in [17]. The second conclusion has been noted in [52].

Theorem 6.14 If (6.1) has a positive definite solution, then it has a maximal positive definite solution $X_+$ and $\rho(X_+^{-1}A) \le 1$.

By taking $L_0 = 0$ in (5.5), we obtain $A_0 = 0$ (which is certainly d-stable) and the following algorithm for equation (6.1):

Algorithm 6.15 (Newton's method for (6.1)). Take $X_0 = Q$. For $i = 1, 2, \ldots$, compute $L_i = X_{i-1}^{-1}A$, and solve

Note that the Stein equation (6.13) is uniquely solvable when $\rho(L_i) < 1$. From Theorems 5.1, 5.3, 5.4, 5.5 and 5.12 we have:

Theorem 6.16 If (6.1) has a positive definite solution, then Algorithm 6.15 determines a sequence of Hermitian matrices $\{X_i\}_{i=0}^{\infty}$ for which $\rho(L_i) < 1$ for $i = 0, 1, \ldots$, $X_0 \ge X_1 \ge \cdots$, and $\lim_{i\to\infty} X_i = X_+$. The convergence is quadratic if $\rho(X_+^{-1}A) < 1$. If $\rho(X_+^{-1}A) = 1$ and all eigenvalues of $X_+^{-1}A$ on the unit circle are semi-simple, then the convergence is either quadratic or linear with rate $1/2$.

The equation (6.13) can be solved by a complex version of the algorithm described

in [Ml. The computational work per iteration for algorithm 6.15 is roughly 10 - 15

times that for algorithm 6.2.
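The linear rate $1/2$ in Theorem 6.16 shows up already in the scalar problem of Example 6.2. The sketch below is an illustration under a stated assumption: applying Newton's method to $F(X) = X + A^*X^{-1}A - Q$ gives, for the new iterate, a Stein equation of the form $X_i - L_i^*X_iL_i = Q - 2L_i^*A$ with $L_i = X_{i-1}^{-1}A$. That form is derived here from Newton's method and is not copied from the (missing) display (6.13). In the scalar case the Stein equation is solved by a single division; the function name is hypothetical.

```python
from fractions import Fraction

def newton_scalar(a, q, steps):
    """Scalar Newton iteration for x + a*a/x = q, started at x_0 = q."""
    x = q
    out = [x]
    for _ in range(steps):
        l = a / x                              # l_i = a / x_{i-1}
        x = (q - 2 * l * a) / (1 - l * l)      # scalar Stein equation for x_i
        out.append(x)
    return out

xs = newton_scalar(Fraction(1, 2), Fraction(1), 20)
errors = [x - Fraction(1, 2) for x in xs]      # distance to X_+ = 1/2

print(xs[:4])                    # 1, 2/3, 4/7, 8/15
print(errors[-1] / errors[-2])   # ratio of successive errors, close to 1/2
```

After 20 Newton steps the error is below $2.5 \times 10^{-7}$ in this scalar model, whereas the basic fixed point iteration, whose error after $n$ steps is $1/(2(n+1))$ here, would need roughly two million steps for the same accuracy.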

As we have seen in the previous convergence results, the convergence rates of various algorithms for equation (6.1) are dependent on the eigenvalues of $X_+^{-1}A$, where $X_+$ is the sought after solution of (6.1). We will now relate the eigenvalues of $X_+^{-1}A$ to the eigenvalues of a matrix pencil which is independent of $X_+$.

The following tesult is an immediate consequence of Corollary 5.7 and Theorem

Corollary 6.17 For equation (6.1), the eigenvalues of $X_+^{-1}A$ are precisely the eigenvalues of the matrix pencil $\lambda F - G$

inside or on the unit circle, with half of the partial multiplicities for each eigenvalue on the unit circle.

According to [17, Thm. 2.1], the equation (6.1) has a positive definite solution if and only if the rational matrix-valued function $\psi(\lambda) = Q + \lambda A + \lambda^{-1}A^*$ is regular (i.e., $\det\psi(\lambda) \ne 0$ for some $\lambda$) and $\psi(\lambda) \ge 0$ for all $\lambda$ on the unit circle. In particular, (6.1) has a positive definite solution if $\psi(\lambda) > 0$ for all $\lambda$ on the unit circle.

Let $r(T)$ be the numerical radius of $T \in \mathbb{C}^{m\times m}$, defined by $r(T) = \max\{|x^*Tx| : x \in \mathbb{C}^m, x^*x = 1\}$. Note that $r(T) \le \|T\| \le 2r(T)$ (see [27], for example).

The following lemma has been proved in [17].

Lemma 6.18 $\psi(\lambda) > 0$ for all $\lambda$ on the unit circle if and only if
$$r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}.$$

As we have seen in Theorem 6.16, the convergence of Algorithm 6.15 is quadratic if $\rho(X_+^{-1}A) < 1$. Our last theorem clarifies this condition.

Theorem 6.19 For equation (6.1), $\rho(X_+^{-1}A) < 1$ if and only if
$$r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}.$$

Proof. By Corollary 6.17, it is enough to show $r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}$ if and only if the pencil $\lambda F - G$ has no eigenvalues on the unit circle. By appropriate block elimination we find that
$$\det(\lambda F - G) = (-1)^m\lambda^m\det(Q + \lambda^{-1}A + \lambda A^*).$$
Therefore, $\lambda F - G$ has no eigenvalues on the unit circle if and only if $\psi(\lambda) > 0$ for all $\lambda$ on the unit circle; the latter is equivalent to $r(Q^{-1/2}AQ^{-1/2}) < \frac{1}{2}$ by Lemma 6.18.

In this section, we give some examples to illustrate the convergence behaviour of

various algorithms we have discussed for the solution of (6.1). Double precision is

used in al1 computations.

Example 6.3 Consider equation (6.1) with

The macimal solution (with the first 9 digits) is found to be

We compare the number of iterations required for Algorithms 6.2, 6.5 and 6.8 to

get the first 6 correct digits.

Algorithm 6.2 needs 16 iterations with

We have used $Y_0 = I/\|Q\|_\infty$ for Algorithms 6.5 and 6.8. If we use $Y_0 = Q^{-1}$, the numbers of iterations are 32 and 17, respectively.

The convergence is linear for all three algorithms. The convergence of Algorithm 6.2 is slightly faster than that of Algorithm 6.8, while the convergence of Algorithm 6.8 is faster than that of Algorithm 6.5 by roughly a factor of 2. These are consistent with the convergence results in Section 2 and Section 3. For this example, we have

The next two examples will show that, for equation (6.1), Algorithm 6.15 can be much more efficient than Algorithm 6.2. Of course, for easy problems the basic fixed point iteration needs no more than 30 iterations to get a good approximate solution. In these cases we cannot expect Newton's method to perform better, since two or three iterations are usually necessary for the Newton iteration. For these two examples, we use the practical stopping criterion (6.14) for both Algorithm 6.15 and Algorithm 6.2, where $\epsilon$ is a prescribed tolerance.
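A minimal sketch of the basic fixed point iteration with a residual-based stopping test follows. It rests on two assumptions, since neither formula is reproduced in this section: that Algorithm 6.2 is the iteration $X_0 = Q$, $X_{n+1} = Q - A^*X_n^{-1}A$, and that the stopping criterion has the form $\|X_n + A^*X_n^{-1}A - Q\|_\infty \leq \epsilon$. The function name and the small test problem are illustrative only.

```python
import numpy as np

def fixed_point_iteration(A, Q, tol=1e-10, max_iter=100000):
    """Iterate X_{n+1} = Q - A^* X_n^{-1} A from X_0 = Q (assumed basic iteration).

    Returns (X_n, n, residual), where residual = ||X_n + A^* X_n^{-1} A - Q||_inf,
    for the first n with residual <= tol, or for n = max_iter if never reached.
    """
    A = np.asarray(A)
    Q = np.asarray(Q)
    AH = A.conj().T
    X = np.array(Q, dtype=np.result_type(A, Q, np.float64))
    n = 0
    while True:
        X_next = Q - AH @ np.linalg.solve(X, A)
        # For this iteration the residual of X_n equals X_n - X_{n+1}.
        residual = np.linalg.norm(X - X_next, np.inf)
        if residual <= tol or n >= max_iter:
            return X, n, residual
        X, n = X_next, n + 1

# An "easy" problem (illustrative data, not from the thesis): ||A|| is well below 1/2.
A_easy = np.array([[0.2, 0.1], [0.0, 0.1]])
Q_easy = np.eye(2)
X_fp, n_fp, res_fp = fixed_point_iteration(A_easy, Q_easy, tol=1e-12)
```

Because the residual of $X_n$ coincides with $X_n - X_{n+1}$, the stopping test costs no extra linear solve per step.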

Example 6.4 We consider the equation (6.1) with $Q = I$ and

$$A = \begin{pmatrix} 0.20 & 0.20 & 0.10 \\ 0.20 & 0.15 & 0.15 \\ 0.10 & 0.15 & 0.25 \end{pmatrix}.$$

For this example, $A$ is Hermitian (and hence normal). The exact maximal solution can be found according to the formula

$$X_+ = \tfrac{1}{2}\left[I + (I - 4A^*A)^{1/2}\right],$$

which is valid for any normal matrix $A$ with $\|A\| \leq 1/2$ (see [53]). The exact solution with the first 8 digits (without rounding) is

$$X_+ = \begin{pmatrix} 0.82654545 & -0.16837666 & -0.15816879 \\ & 0.83164938 & -0.16327272 \\ \text{symm.} & & 0.82144151 \end{pmatrix}.$$

Since $r(A) = \|A\| = 1/2$ for this example, we have $\rho(X_+^{-1}A) = 1$ (cf. Theorem 6.19). The convergence of Algorithm 6.2 turns out to be sublinear. It needs 7071 iterations to satisfy (6.14) for $\epsilon = 10^{-8}$, with

$$X_{7071} = \begin{pmatrix} 0.82656902 & -0.16835309 & -0.15814522 \\ & 0.83167296 & -0.16324916 \\ \text{symm.} & & 0.82146509 \end{pmatrix}.$$

On the other hand, the convergence of Algorithm 6.15 is linear with rate 1/2 (cf. Theorem 6.16). The stopping criterion is satisfied after 12 iterations, with

$$X_{12} = \begin{pmatrix} 0.82656580 & -0.16835631 & -0.15814844 \\ & 0.83166974 & -0.16325238 \\ \text{symm.} & & 0.82146187 \end{pmatrix}.$$

We find that both $X_{7071}$ and $X_{12}$ have four correct digits, with $X_{12}$ slightly better. If we use a double Newton step following $X_{12}$, the first eight digits of the resulting approximate solution are the same as in the exact solution. This example shows that Newton's method can be much more efficient than the basic fixed point iteration when $r(Q^{-1/2}AQ^{-1/2})$ is equal or very close to 1/2.
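The closed-form solution used in Example 6.4 can be checked with a few lines. This is a sketch under stated assumptions: it takes the formula in the form $X_+ = \tfrac{1}{2}[I + (I - 4A^*A)^{1/2}]$ for $Q = I$ and normal $A$ with $\|A\| \leq 1/2$, and it uses the symmetric matrix $A$ with rows (0.20, 0.20, 0.10), (0.20, 0.15, 0.15), (0.10, 0.15, 0.25). The middle row is reconstructed from symmetry and from the printed solution, not read directly from the text. The helper names are illustrative.

```python
import numpy as np

def psd_sqrt(H):
    """Principal square root of a Hermitian positive semidefinite matrix."""
    w, U = np.linalg.eigh(H)
    w = np.clip(w, 0.0, None)          # guard against tiny negative rounding errors
    return (U * np.sqrt(w)) @ U.conj().T

def maximal_solution_normal(A):
    """X_+ = (I + (I - 4 A^* A)^{1/2}) / 2, assumed valid for Q = I, normal A, ||A|| <= 1/2."""
    I = np.eye(A.shape[0])
    return 0.5 * (I + psd_sqrt(I - 4.0 * A.conj().T @ A))

A = np.array([[0.20, 0.20, 0.10],
              [0.20, 0.15, 0.15],      # assumption: reconstructed middle row
              [0.10, 0.15, 0.25]])

X_plus = maximal_solution_normal(A)

# Residual of X + A^* X^{-1} A = I and spectral radius of X_+^{-1} A.
residual = np.linalg.norm(
    X_plus + A.conj().T @ np.linalg.solve(X_plus, A) - np.eye(3), np.inf)
spectral_radius = max(abs(np.linalg.eigvals(np.linalg.solve(X_plus, A))))
```

Each row of this $A$ sums to 1/2, so the vector of ones is an eigenvector with eigenvalue $\|A\| = 1/2$, which is the borderline case $\rho(X_+^{-1}A) = 1$ discussed in the example.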

Example 6.5 We consider the equation (6.1) with

For this example, $r(Q^{-1/2}AQ^{-1/2}) < 1/2$. Thus the Newton iteration converges quadratically to the maximal solution. It needs 8 iterations to satisfy the stopping criterion (6.14) for $\epsilon = 10^{-12}$. The computed maximal solution is

$$\begin{pmatrix} 1.86737567 & 0.32524233 \\ \text{symm.} & 0.41582003 \end{pmatrix}.$$

The basic fixed point iteration needs 332 iterations to satisfy the same criterion. The convergence is linear since $\rho(X_+^{-1}A) < 1$. Note that $\|X_+^{-1}A\| > 1$ for this example.

The minimal solution can be obtained by Lemma 6.1:

In computing $Y_+$, the Newton iteration needs 8 iterations to satisfy $\|Y_n + AY_n^{-1}A^* - Q\|_\infty < 10^{-12}$, while the fixed point iteration needs 330 iterations.
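For comparison, the Newton iteration for $F(X) = X + A^*X^{-1}A - Q$ can be sketched as follows. Linearizing $F$ at $X_{n-1}$ gives the Stein equation $X_n - L_n^*X_nL_n = Q - 2L_n^*A$ with $L_n = X_{n-1}^{-1}A$. The sketch assumes that Algorithm 6.15 is this iteration started from $X_0 = Q$. It solves each Stein equation through a dense Kronecker-product system, which is adequate only for small $m$, and it uses an $\infty$-norm residual test as a stand-in for (6.14). Function names and the test matrix are illustrative assumptions.

```python
import numpy as np

def solve_stein(L, C):
    """Solve X - L^* X L = C via a Kronecker system (row-major vectorization)."""
    m = L.shape[0]
    # For row-major vec: vec(L^* X L) = kron(L^*, L^T) vec(X).
    K = np.eye(m * m) - np.kron(L.conj().T, L.T)
    return np.linalg.solve(K, C.reshape(-1)).reshape(m, m)

def newton_iteration(A, Q, tol=1e-12, max_iter=50):
    """Newton's method for X + A^* X^{-1} A = Q from X_0 = Q (assumed form).

    Returns (X_n, n, residual) with residual = ||X_n + A^* X_n^{-1} A - Q||_inf.
    """
    A = np.asarray(A)
    Q = np.asarray(Q)
    AH = A.conj().T
    X = np.array(Q, dtype=np.result_type(A, Q, np.float64))
    n = 0
    while True:
        L = np.linalg.solve(X, A)                      # L = X_n^{-1} A
        residual = np.linalg.norm(X + AH @ L - Q, np.inf)
        if residual <= tol or n >= max_iter:
            return X, n, residual
        X = solve_stein(L, Q - 2.0 * L.conj().T @ A)   # Newton step
        X = 0.5 * (X + X.conj().T)                     # remove rounding asymmetry
        n += 1

# Illustrative data (not from the thesis): ||A|| < 1/2, so r(A) < 1/2 with Q = I.
A_test = np.array([[0.2, 0.1], [0.0, 0.1]])
Q_test = np.eye(2)
X_newton, n_newton, res_newton = newton_iteration(A_test, Q_test)
rho = max(abs(np.linalg.eigvals(np.linalg.solve(X_newton, A_test))))
```

The Kronecker system has $m^2$ unknowns, so for larger problems a dedicated Stein (discrete Lyapunov) solver would be the natural replacement for `solve_stein`.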

Chapter 7

Some Concluding Remarks

We have studied the convergence behaviour of Newton's method for the algebraic Riccati equations under very mild conditions. Our emphasis has been on the situation where the closed-loop matrix has eigenvalues on the imaginary axis (unit circle) for the CARE (DARE). When these eigenvalues are semisimple, the convergence of Newton's method has been shown to be either quadratic or linear with rate 1/2. However, it is not known whether quadratic convergence is indeed possible. When there are non-semisimple closed-loop eigenvalues on the imaginary axis (unit circle) for the CARE (DARE), the convergence behaviour of Newton's method remains a topic for future research.

While the Newton iteration can be used as a correction method, an important feature of Newton's method for the Riccati equations is that the convergence is not merely local. In some situations, we cannot provide Newton's method with a good starting point using other methods. Newton's method has to be used with an initial guess not necessarily close to the solution. Since the convergence rate results we have established are essentially asymptotic, we don't know the error reduction for Newton's method at earlier stages. Numerical results suggest that we have linear reduction from the very beginning, yet a general theoretical result confirming this behaviour is still to be obtained. For the CARE (3.1) with $C > 0$, linear reduction has been confirmed in [1]. For this special case, however, the Hamiltonian matrix has no eigenvalues on the imaginary axis (cf. [35, Theorem 9.1.2]) and it is usually easy to find a good starting point for Newton's method using other methods.

As we have seen in Examples 4.2 and 5.1, the first Newton iteration can make a big adjustment to the initial guess. This adjustment could be harmful on some occasions. The method of step size control has been introduced to Newton's method for the CARE in [4] to overcome this potential difficulty and to speed up the convergence of Newton's method. The theory in [4] is established under the assumptions that the pair $(A, D)$ is controllable and the Hamiltonian matrix has no eigenvalues on the imaginary axis. It would be worthwhile to explore the possibility of applying the method of step size control to the DARE and to the CARE where the pair $(A, D)$ is stabilizable and/or the Hamiltonian matrix has eigenvalues on the imaginary axis.

Bibliography

[1] J. C. Allwright, A lower bound for the solution of the algebraic Riccati equation of optimal control and a geometric convergence rate for the Kleinman algorithm, IEEE Trans. Autom. Control, 25 (1980), pp. 826-829.

[2] W. N. Anderson, Jr., T. D. Morley, and G. E. Trapp, Positive solutions to $X = A - BX^{-1}B^*$, Linear Algebra Appl., 134 (1990), pp. 53-62.

[3] R. H. Bartels and G. W. Stewart, Solution of the matrix equation $AX + XB = C$, Comm. ACM, 15 (1972), pp. 820-826.

[4] P. Benner and R. Byers, An exact line search method for solving generalized continuous-time algebraic Riccati equations, IEEE Trans. Autom. Control, 43 (1998), pp. 101-107.

[5] P. Benner, A. J. Laub, and V. Mehrmann, A collection of benchmark examples for the numerical solution of algebraic Riccati equations I: continuous-time case, Technical Report SPC 95-22, Fakultät für Mathematik, Technische Universität Chemnitz-Zwickau, FRG, 1995.

[6] P. Benner, A. J. Laub, and V. Mehrmann, A collection of benchmark examples for the numerical solution of algebraic Riccati equations II: discrete-time case, Technical Report SPC 95-23, Fakultät für Mathematik, Technische Universität Chemnitz-Zwickau, FRG, 1995.

[7] P. Benner, V. Mehrmann, and H. Xu, A new method for computing the stable invariant subspace of a real Hamiltonian matrix, J. Comput. Appl. Math., 86 (1997), pp. 17-43.

[8] R. C. Cavanagh, Difference equations and iterative processes, Ph.D. Diss., Univ. of Maryland, College Park, Maryland, 1970.

[9] S. W. Chan, G. C. Goodwin, and K. S. Sin, Convergence properties of the Riccati difference equation in optimal filtering of nonstabilizable systems, IEEE Trans. Autom. Control, 29 (1984), pp. 110-118.

[10] D. J. Clements, B. D. O. Anderson, A. J. Laub, and J. B. Matson, Spectral factorization with imaginary-axis zeros, Linear Algebra Appl., 250 (1997), pp. 225-252.

[11] W. A. Coppel, Matrix quadratic equations, Bull. Austral. Math. Soc., 10 (1974), pp. 377-401.

[12] D. W. Decker, H. B. Keller, and C. T. Kelley, Convergence rates for Newton's method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 296-314.

[13] D. W. Decker and C. T. Kelley, Newton's method at singular points I, SIAM J. Numer. Anal., 17 (1980), pp. 66-70.

[14] D. W. Decker and C. T. Kelley, Convergence acceleration for Newton's method at singular points, SIAM J. Numer. Anal., 19 (1982), pp. 219-229.

[15] A. Emami-Naeini and G. F. Franklin, Comments on "On the numerical solution of the discrete-time algebraic Riccati equation", IEEE Trans. Autom. Control, 25 (1980), pp. 1015-1016.

[16] J. C. Engwerda, On the existence of a positive definite solution of the matrix equation $X + A^TX^{-1}A = I$, Linear Algebra Appl., 194 (1993), pp. 91-108.

[17] J. C. Engwerda, A. C. M. Ran, and A. L. Rijkeboer, Necessary and sufficient conditions for the existence of a positive definite solution of the matrix equation $X + A^*X^{-1}A = Q$, Linear Algebra Appl., 186 (1993), pp. 255-275.

[18] J. D. Gardiner, A. J. Laub, J. J. Amato, and C. B. Moler, Solution of the Sylvester matrix equation $AXB^T + CXD^T = E$, ACM Trans. Math. Software, 18 (1992), pp. 223-231.

[19] I. Gohberg, P. Lancaster, and L. Rodman, Matrix Polynomials, Academic Press, New York, 1982.

[20] I. Gohberg, P. Lancaster, and L. Rodman, On Hermitian solutions of the symmetric algebraic Riccati equation, SIAM J. Control Optimization, 24 (1986), pp. 1323-1334.

[21] G. H. Golub, S. Nash, and C. Van Loan, A Hessenberg-Schur method for the problem $AX + XB = C$, IEEE Trans. Autom. Control, 24 (1979), pp. 909-913.

[22] G. H. Golub and C. F. Van Loan, Matrix Computations, Third edition, Johns Hopkins University Press, Baltimore, MD, 1996.

[23] A. O. Griewank, Starlike domains of convergence for Newton's method at singularities, Numer. Math., 35 (1980), pp. 95-111.

[24] A. Griewank and M. R. Osborne, Newton's method for singular problems when the dimension of the null space is > 1, SIAM J. Numer. Anal., 18 (1981), pp.

[25] A. Griewank and M. R. Osborne, Analysis of Newton's method at irregular singularities, SIAM J. Numer. Anal., 20 (1983), pp. 747-773.

[26] G. A. Hewer, An iterative technique for the computation of the steady-state gains for the discrete optimal regulator, IEEE Trans. Autom. Control, 16 (1971), pp. 382-384.

[27] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis, Cambridge University Press, Cambridge, 1991.

[28] V. Ionescu and M. Weiss, On computing the stabilizing solution of the discrete-time Riccati equation, Linear Algebra Appl., 174 (1992), pp. 229-238.

[29] L. V. Kantorovich and G. P. Akilov, Functional Analysis in Normed Spaces, Pergamon, New York, 1964.

[30] C. T. Kelley, A Shamanskii-like acceleration scheme for nonlinear equations at singular roots, Math. Comp., 47 (1986), pp. 609-623.

[31] C. T. Kelley and R. Suresh, A new acceleration method for Newton's method at singular points, SIAM J. Numer. Anal., 20 (1983), pp. 1001-1009.

[32] C. Kenney, A. J. Laub, and M. Wette, Error bounds for Newton refinement of solutions to algebraic Riccati equations, Math. Control Signals Systems, 3 (1990), pp. 211-224.

[33] D. L. Kleinman, On an iterative technique for Riccati equation computations, IEEE Trans. Autom. Control, 13 (1968), pp. 114-115.

[34] M. A. Krasnoselskii, G. M. Vainikko, P. P. Zabreiko, Ya. B. Rutitskii, and V. Ya. Stetsenko, Approximate Solution of Operator Equations, Wolters-Noordhoff Publishing, Groningen, 1972.

[35] P. Lancaster and L. Rodman, Algebraic Riccati Equations, Oxford University Press, 1995.

[36] P. Lancaster and M. Tismenetsky, The Theory of Matrices, Second Edition, Academic Press, Orlando, 1985.

[37] A. J. Laub, A Schur method for solving algebraic Riccati equations, IEEE Trans. Autom. Control, 24 (1979), pp. 913-921.

[38] W.-W. Lin and C.-S. Wang, On computing stable Lagrangian subspaces of Hamiltonian matrices and symplectic pencils, SIAM J. Matrix Anal. Appl., 18 (1997), pp. 590-614.

[39] A. Linnemann, Numerische Methoden für Lineare Regelungssysteme, BI Wissenschafts Verlag, Mannheim, 1993.

[40] J. B. Matson, B. D. O. Anderson, A. J. Laub, and D. J. Clements, Riccati difference equations for discrete time spectral factorization with unit circle zeros, Recent trends in optimization theory and applications, World Sci. Publishing, River Edge, NJ, 1995, pp. 311-326.

[41] V. L. Mehrmann, The autonomous linear quadratic control problem, Lecture Notes in Control and Information Sciences, Vol. 163, Springer Verlag, Berlin,

[42] D. Mustafa and K. Glover, Minimum entropy $H_\infty$ control, Lecture Notes in Control and Information Sciences, Vol. 146, Springer Verlag, Berlin, 1990.

[43] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970.

[44] A. M. Ostrowski, Solution of Equations in Euclidean and Banach Spaces, Academic Press, New York, 1973.

[45] L. B. Rall, Convergence of the Newton process to multiple solutions, Numer. Math., 9 (1966), pp. 23-37.

[46] A. C. M. Ran and R. Vreugdenhil, Existence and comparison theorems for algebraic Riccati equations for continuous- and discrete-time systems, Linear Algebra Appl., 99 (1988), pp. 63-83.

[47] G. W. Reddien, On Newton's method for singular problems, SIAM J. Numer. Anal., 15 (1978), pp. 993-996.

[48] V. Sima, An efficient Schur method to solve the stabilizing problem, IEEE Trans. Autom. Control, 26 (1981), pp. 724-725.

[49] P. Van Dooren, A generalized eigenvalue approach for solving Riccati equations, SIAM J. Sci. Comput., 2 (1981), pp. 121-135.

[50] H. K. Wimmer, Monotonicity of maximal solutions of algebraic Riccati equations, Syst. Control Lett., 5 (1985), pp. 317-319.

[51] H. K. Wimmer, Normal forms of symplectic pencils and the discrete algebraic Riccati equation, Linear Algebra Appl., 147 (1991), pp. 411-440.

[52] X. Zhan, Computing the extremal positive definite solutions of a matrix equation, SIAM J. Sci. Comput., 17 (1996), pp. 1167-1174.

[53] X. Zhan and J. Xie, On the matrix equation $X + A^TX^{-1}A = I$, Linear Algebra Appl., 247 (1996), pp. 337-345.
