THE THEORY OF MATRICES - 上海交通大学数学系 math.sjtu.edu.cn/faculty/tyaglov/courses/linear...

THE THEORY OF

MATRICESF. R. GANTMACHER

VOLUME TWO

AMS CHELSEA PUBLISHINGAmerican Mathematical Society Providence, Rhode Island


THE PRESENT WORK, PUBLISHED IN TWO VOLUMES, IS AN ENGLISH TRANSLATION, BY K. A. HIRSCH, OF THE RUSSIAN-LANGUAGE BOOK TEORIYA MATRITS BY F. R. GANTMACHER (Гантмахер)

2000 Mathematics Subject Classification. Primary 15-02.

Library of Congress Catalog Card Number 59-11779
International Standard Book Number 0-8218-2664-6 (Vol. II)

International Standard Book Number 0-8218-1393-5 (Set)

Copyright © 1959, 1987, 1989 by Chelsea Publishing Company
Printed in the United States of America.

Reprinted by the American Mathematical Society, 2000
The American Mathematical Society retains all rights except those granted to the United States Government.
The paper used in this book is acid-free and falls within the guidelines established to ensure permanence and durability.
Visit the AMS home page at URL: http://www.ams.org/

10 9 8 7 6 5 4 3 2 1    04 03 02 01 00


PREFACE

THE MATRIX CALCULUS is widely applied nowadays in various branches of mathematics, mechanics, theoretical physics, theoretical electrical engineering, etc. However, neither in the Soviet nor the foreign literature is there a book that gives a sufficiently complete account of the problems of matrix theory and of its diverse applications. The present book is an attempt to fill this gap in the mathematical literature.

The book is based on lecture courses on the theory of matrices and its applications that the author has given several times in the course of the last seventeen years at the Universities of Moscow and Tiflis and at the Moscow Institute of Physical Technology.

The book is meant not only for mathematicians (undergraduates and research students) but also for specialists in allied fields (physics, engineering) who are interested in mathematics and its applications. Therefore the author has endeavoured to make his account of the material as accessible as possible, assuming only that the reader is acquainted with the theory of determinants and with the usual course of higher mathematics within the programme of higher technical education. Only a few isolated sections in the last chapters of the book require additional mathematical knowledge on the part of the reader. Moreover, the author has tried to keep the individual chapters as far as possible independent of each other. For example, Chapter V, Functions of Matrices, does not depend on the material contained in Chapters II and III. At those places of Chapter V where fundamental concepts introduced in Chapter IV are being used for the first time, the corresponding references are given. Thus, a reader who is acquainted with the rudiments of the theory of matrices can immediately begin with reading the chapters that interest him.

The book consists of two parts, containing fifteen chapters.

In Chapters I and III, information about matrices and linear operators is developed ab initio and the connection between operators and matrices is introduced.

Chapter II expounds the theoretical basis of Gauss's elimination method and certain associated effective methods of solving a system of n linear equations, for large n. In this chapter the reader also becomes acquainted with the technique of operating with matrices that are divided into rectangular 'blocks.'




In Chapter IV we introduce the extremely important 'characteristic' and 'minimal' polynomials of a square matrix, and the 'adjoint' and 'reduced adjoint' matrices.

In Chapter V, which is devoted to functions of matrices, we give the general definition of f(A) as well as concrete methods of computing it, where f(λ) is a function of a scalar argument λ and A is a square matrix. The concept of a function of a matrix is used in §§ 5 and 6 of this chapter for a complete investigation of the solutions of a system of linear differential equations of the first order with constant coefficients. Both the concept of a function of a matrix and this latter investigation of differential equations are based entirely on the concept of the minimal polynomial of a matrix and, in contrast to the usual exposition, do not use the so-called theory of elementary divisors, which is treated in Chapters VI and VII.

These five chapters constitute a first course on matrices and their applications. Very important problems in the theory of matrices arise in connection with the reduction of matrices to a normal form. This reduction is carried out on the basis of Weierstrass' theory of elementary divisors. In view of the importance of this theory we give two expositions in this book: an analytic one in Chapter VI and a geometric one in Chapter VII. We draw the reader's attention to §§ 7 and 8 of Chapter VI, where we study effective methods of finding a matrix that transforms a given matrix to normal form. In § 9 of Chapter VII we investigate in detail the method of A. N. Krylov for the practical computation of the coefficients of the characteristic polynomial.

In Chapter VIII certain types of matrix equations are solved. We also consider here the problem of determining all the matrices that are permutable with a given matrix and we study in detail the many-valued functions of matrices √A and ln A.

Chapters IX and X deal with the theory of linear operators in a unitary space and the theory of quadratic and hermitian forms. These chapters do not depend on Weierstrass' theory of elementary divisors and use, of the preceding material, only the basic information on matrices and linear operators contained in the first three chapters of the book. In § 9 of Chapter X we apply the theory of forms to the study of the principal oscillations of a system with n degrees of freedom. In § 11 of this chapter we give an account of Frobenius' deep results on the theory of Hankel forms. These results are used later, in Chapter XV, to study special cases of the Routh-Hurwitz problem.

The last five chapters form the second part of the book [the second volume, in the present English translation]. In Chapter XI we determine normal forms for complex symmetric, skew-symmetric, and orthogonal matrices and establish interesting connections of these matrices with real matrices of the same classes and with unitary matrices.

In Chapter XII we expound the general theory of pencils of matrices of the form A + λB, where A and B are arbitrary rectangular matrices of the same dimensions. Just as the study of regular pencils of matrices A + λB is based on Weierstrass' theory of elementary divisors, so the study of singular pencils is built upon Kronecker's theory of minimal indices, which is, as it were, a further development of Weierstrass's theory. By means of Kronecker's theory (the author believes that he has succeeded in simplifying the exposition of this theory) we establish in Chapter XII canonical forms of the pencil of matrices A + λB in the most general case. The results obtained there are applied to the study of systems of linear differential equations with constant coefficients.

In Chapter XIII we explain the remarkable spectral properties of matrices with non-negative elements and consider two important applications of matrices of this class: 1) homogeneous Markov chains in the theory of probability and 2) oscillatory properties of elastic vibrations in mechanics. The matrix method of studying homogeneous Markov chains was developed in the book [46] by V. I. Romanovskii and is based on the fact that the matrix of transition probabilities in a homogeneous Markov chain with a finite number of states is a matrix with non-negative elements of a special type (a 'stochastic' matrix).

The oscillatory properties of elastic vibrations are connected with another important class of non-negative matrices, the 'oscillation matrices.' These matrices and their applications were studied by M. G. Krein jointly with the author of this book. In Chapter XIII, only certain basic results in this domain are presented. The reader can find a detailed account of the whole material in the monograph [17].

In Chapter XIV we compile the applications of the theory of matrices to systems of differential equations with variable coefficients. The central place (§§ 5-9) in this chapter belongs to the theory of the multiplicative integral (Produktintegral) and its connection with Volterra's infinitesimal calculus. These problems are almost entirely unknown in Soviet mathematical literature. In the first sections and in § 11, we study reducible systems (in the sense of Lyapunov) in connection with the problem of stability of motion; we also give certain results of N. P. Erugin. Sections 9-11 refer to the analytic theory of systems of differential equations. Here we clarify an inaccuracy in Birkhoff's fundamental theorem, which is usually applied to the investigation of the solution of a system of differential equations in the neighborhood of a singular point, and we establish a canonical form of the solution in the case of a regular singular point.


In § 12 of Chapter XIV we give a brief survey of some results of the fundamental investigations of I. A. Lappo-Danilevskii on analytic functions of several matrices and their applications to differential systems.

The last chapter, Chapter XV, deals with the applications of the theory of quadratic forms (in particular, of Hankel forms) to the Routh-Hurwitz problem of determining the number of roots of a polynomial in the right half-plane (Re z > 0). The first sections of the chapter contain the classical treatment of the problem. In § 5 we give the theorem of A. M. Lyapunov in which a stability criterion is set up which is equivalent to the Routh-Hurwitz criterion. Together with the stability criterion of Routh-Hurwitz we give, in § 11 of this chapter, the comparatively little known criterion of Liénard and Chipart in which the number of determinant inequalities is only about half of that in the Routh-Hurwitz criterion.

At the end of Chapter XV we exhibit the close connection between stability problems and two remarkable theorems of A. A. Markov and P. L. Chebyshev, which were obtained by these celebrated authors on the basis of the expansion of certain continued fractions of special types in series of decreasing powers of the argument. Here we give a matrix proof of these theorems.

This, then, is a brief summary of the contents of this book.

F. R. Gantmacher

PUBLISHERS' PREFACE

THE PUBLISHERS WISH TO thank Professor Gantmacher for his kindness in communicating to the translator new versions of several paragraphs of the original Russian-language book.

The Publishers also take pleasure in thanking the VEB Deutscher Verlag der Wissenschaften, whose many published translations of Russian scientific books into the German language include a counterpart of the present work, for their kind spirit of cooperation in agreeing to the use of their formulas in the preparation of the present work.

No material changes have been made in the text in translating the present work from the Russian except for the replacement of several paragraphs by the new versions supplied by Professor Gantmacher. Some changes in the references and in the Bibliography have been made for the benefit of the English-language reader.


CONTENTS

PREFACE iii
PUBLISHERS' PREFACE vi

XI. COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND ORTHOGONAL MATRICES 1
§ 1. Some formulas for complex orthogonal and unitary matrices 1
§ 2. Polar decomposition of a complex matrix 6
§ 3. The normal form of a complex symmetric matrix 9
§ 4. The normal form of a complex skew-symmetric matrix 12
§ 5. The normal form of a complex orthogonal matrix 18

XII. SINGULAR PENCILS OF MATRICES 24
§ 1. Introduction 24
§ 2. Regular pencils of matrices 25
§ 3. Singular pencils. The reduction theorem 29
§ 4. The canonical form of a singular pencil of matrices 35
§ 5. The minimal indices of a pencil. Criterion for strong equivalence of pencils 37
§ 6. Singular pencils of quadratic forms 40
§ 7. Application to differential equations 45

XIII. MATRICES WITH NON-NEGATIVE ELEMENTS 50
§ 1. General properties 50
§ 2. Spectral properties of irreducible non-negative matrices 53
§ 3. Reducible matrices 66
§ 4. The normal form of a reducible matrix 74
§ 5. Primitive and imprimitive matrices 80
§ 6. Stochastic matrices 82
§ 7. Limiting probabilities for a homogeneous Markov chain with a finite number of states 87
§ 8. Totally non-negative matrices 98
§ 9. Oscillatory matrices 103

XIV. APPLICATIONS OF THE THEORY OF MATRICES TO THE INVESTIGATION OF SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS 113
§ 1. Systems of linear differential equations with variable coefficients. General concepts 113
§ 2. Lyapunov transformations 116
§ 3. Reducible systems 118
§ 4. The canonical form of a reducible system. Erugin's theorem 121
§ 5. The matricant 125
§ 6. The multiplicative integral. The infinitesimal calculus of Volterra 131
§ 7. Differential systems in a complex domain. General properties 135
§ 8. The multiplicative integral in a complex domain 138
§ 9. Isolated singular points 142
§ 10. Regular singularities 148
§ 11. Reducible analytic systems 164
§ 12. Analytic functions of several matrices and their application to the investigation of differential systems. The papers of Lappo-Danilevskii 168

XV. THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS 172
§ 1. Introduction 172
§ 2. Cauchy indices 173
§ 3. Routh's algorithm 177
§ 4. The singular case. Examples 181
§ 5. Lyapunov's theorem 185
§ 6. The theorem of Routh-Hurwitz 190
§ 7. Orlando's formula 196
§ 8. Singular cases in the Routh-Hurwitz theorem 198
§ 9. The method of quadratic forms. Determination of the number of distinct real roots of a polynomial 201
§ 10. Infinite Hankel matrices of finite rank 204
§ 11. Determination of the index of an arbitrary rational fraction by the coefficients of numerator and denominator 208
§ 12. Another proof of the Routh-Hurwitz theorem 216
§ 13. Some supplements to the Routh-Hurwitz theorem. Stability criterion of Liénard and Chipart 220
§ 14. Some properties of Hurwitz polynomials. Stieltjes' theorem. Representation of Hurwitz polynomials by continued fractions 225
§ 15. Domain of stability. Markov parameters 232
§ 16. Connection with the problem of moments 236
§ 17. Theorems of Markov and Chebyshev 240
§ 18. The generalized Routh-Hurwitz problem 248

BIBLIOGRAPHY 251
INDEX 268


CHAPTER XI

COMPLEX SYMMETRIC, SKEW-SYMMETRIC, AND ORTHOGONAL MATRICES

In Volume I, Chapter IX, in connection with the study of linear operators in a euclidean space, we investigated real symmetric, skew-symmetric, and orthogonal matrices, i.e., real square matrices characterized by the relations†

S^T = S,  K^T = -K,  and  Q^T = Q^{-1},

respectively (here Q^T denotes the transpose of the matrix Q). We have shown that in the field of complex numbers all these matrices have linear elementary divisors and we have set up normal forms for them, i.e., 'simplest' real symmetric, skew-symmetric, and orthogonal matrices to which arbitrary matrices of the types under consideration are real-similar and orthogonally similar.

The present chapter deals with the investigation of complex symmetric, skew-symmetric, and orthogonal matrices. We shall clarify the question of what elementary divisors these matrices can have and shall set up normal forms for them. These forms have a considerably more complicated structure than the corresponding normal forms in the real case. As a preliminary, we shall establish in the first section interesting connections between complex orthogonal and unitary matrices on the one hand, and real symmetric, skew-symmetric, and orthogonal matrices on the other hand.

§ 1. Some Formulas for Complex Orthogonal and Unitary Matrices

1. We begin with a lemma:

LEMMA 1:¹ 1. If a matrix G is both hermitian and orthogonal (G^T = Ḡ = G^{-1}), then it can be represented in the form

G = I e^{iK},  (1)

where I is a real symmetric involutory matrix and K a real skew-symmetric matrix permutable with it:

¹ See [169], pp. 223-225.
† In this and in the following chapters, a matrix denoted by the letter Q is not necessarily orthogonal.


I = Ī = I^T,  I² = E,  K = K̄ = -K^T.  (2)

2. If, in addition, G is a positive-definite hermitian matrix,² then in (1) I = E and

G = e^{iK}.  (3)

Proof. Let

G = S + iT,  (4)

where S and T are real matrices. Then

Ḡ = S - iT  and  G^T = S^T + iT^T.  (5)

Therefore the equation Ḡ = G^T implies that S = S^T and T = -T^T, i.e., S is symmetric and T skew-symmetric.

Moreover, when the expressions for G and Ḡ from (4) and (5) are substituted in the complex equation GḠ = E, it breaks up into two real equations:

S² + T² = E  and  ST = TS.  (6)

The second of these equations shows that S and T commute. By Theorem 12' of Chapter IX (Vol. I, p. 292), the commuting normal matrices S and T can be carried simultaneously into quasi-diagonal form by a real orthogonal transformation. Therefore³

S = Q { s₁, s₁, s₂, s₂, ..., s_q, s_q, s_{2q+1}, ..., s_n } Q^{-1},
T = Q { [0 t₁; -t₁ 0], [0 t₂; -t₂ 0], ..., [0 t_q; -t_q 0], 0, ..., 0 } Q^{-1}  (7)

(the numbers s_i and t_i are real; Q = Q̄ = (Q^T)^{-1}). Hence

G = S + iT = Q { [s₁ it₁; -it₁ s₁], [s₂ it₂; -it₂ s₂], ..., [s_q it_q; -it_q s_q], s_{2q+1}, ..., s_n } Q^{-1}.  (8)

On the other hand, when we compare the expressions (7) for S and T with the first of the equations (6), we find:

s₁² - t₁² = 1,  s₂² - t₂² = 1,  ...,  s_q² - t_q² = 1,  s_{2q+1} = ±1, ..., s_n = ±1.  (9)

² I.e., G is the coefficient matrix of a hermitian form (see Vol. I, Chapter X, § 9).
³ See also the Note following Theorem 12' of Vol. I, Chapter IX (p. 293).

Page 14: THE THEORY OF MATRICES - 上海交通大学数学系math.sjtu.edu.cn/faculty/tyaglov/courses/linear algebra/The_book_ad… · as possible, assuming only that the reader is acquainted

§ 1. COMPLEX ORTHOGONAL AND UNITARY MATRICES 3

Now it is easy to verify that a matrix of the type [s it; -it s] with s² - t² = 1 can always be represented in the form

[s it; -it s] = ± e^{i [0 φ; -φ 0]},

where

|s| = cosh φ,  εt = sinh φ,  ε = sign s.

Therefore we have from (8) and (9):

G = Q { ± e^{i [0 φ₁; -φ₁ 0]}, ..., ± e^{i [0 φ_q; -φ_q 0]}, ±1, ..., ±1 } Q^{-1},  (10)

i.e.,

G = I e^{iK},

where

I = Q { ±1, ±1, ..., ±1 } Q^{-1},
K = Q { [0 φ₁; -φ₁ 0], ..., [0 φ_q; -φ_q 0], 0, ..., 0 } Q^{-1}  (11)

and

IK = KI.

From (11) there follows the equation (2).

2. If, in addition, it is known that G is a positive-definite hermitian matrix, then we can state that all the characteristic values of G are positive (see Volume I, Chapter IX, p. 270). But by (10) these characteristic values are

±e^{φ₁}, ±e^{-φ₁}, ±e^{φ₂}, ±e^{-φ₂}, ..., ±e^{φ_q}, ±e^{-φ_q}, ±1, ..., ±1

(here the signs correspond to the signs in (10)). Therefore in the formula (10) and the first formula of (11), wherever the sign ± occurs, the + sign must hold. Hence

I = Q { 1, 1, ..., 1 } Q^{-1} = E  and  G = e^{iK},

and this is what we had to prove.

This completes the proof of the lemma.
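The forward direction of Lemma 1 is easy to check numerically. The sketch below is not from the book; it assumes NumPy and SciPy (`scipy.linalg.expm` for the matrix exponential), builds a real skew-symmetric K together with a commuting real symmetric involution I, and verifies that G = I e^{iK} is both hermitian and orthogonal.

```python
import numpy as np
from scipy.linalg import expm

# A real skew-symmetric K and a commuting real symmetric involution I
K = np.array([[0.0, 0.7, 0.0],
              [-0.7, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
I = np.diag([1.0, 1.0, -1.0])
assert np.allclose(K, -K.T)
assert np.allclose(I @ I, np.eye(3)) and np.allclose(I @ K, K @ I)

G = I @ expm(1j * K)

assert np.allclose(G, G.conj().T)       # G is hermitian
assert np.allclose(G.T @ G, np.eye(3))  # G is orthogonal
```

Conversely, for a given hermitian orthogonal G, the proof extracts I and K from the simultaneous quasi-diagonal form (10).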


By means of the lemma we shall now prove the following theorem :

THEOREM 1: Every complex orthogonal matrix Q can be represented in the form

Q = R e^{iK},  (12)

where R is a real orthogonal matrix and K a real skew-symmetric matrix:

R = R̄ = (R^T)^{-1},  K = K̄ = -K^T.  (13)

Proof. Suppose that (12) holds. Then

Q* = Q̄^T = e^{iK} R^T

and

Q*Q = e^{iK} R^T R e^{iK} = e^{2iK}.

By the preceding lemma the required real skew-symmetric matrix K canbe determined from the equation

Q*Q = e^{2iK}  (14)

because the matrix Q*Q is positive-definite hermitian and orthogonal. AfterK has been determined from (14) we can find R from (12) :

R = Q e^{-iK}.  (15)

Then

R*R = e^{-iK} Q*Q e^{-iK} = E,

i.e., R is unitary. On the other hand, it follows from (15) that R, as the product of two orthogonal matrices, is itself orthogonal: R^T R = E. Thus R is at the same time unitary and orthogonal, and hence real. The formula (15) can be written in the form (12).

This proves the theorem.⁴

Now we establish the following lemma:

LEMMA 2: If a matrix D is both symmetric and unitary (D = D^T = D̄^{-1}), then it can be represented in the form

D = e^{iS},  (16)

where S is a real symmetric matrix (S = S̄ = S^T).

⁴ The formula (13), like the polar decomposition of a complex matrix (in connection with the formulas (87), (88) on p. 278 of Vol. I), has a close connection with the important theorem of Cartan which establishes a certain representation for the automorphisms of the complex Lie groups; see [169], pp. 232-233.


Proof. We set

D = U + iV  (U = Ū, V = V̄).  (17)

Then

D̄ = U - iV,  D^T = U^T + iV^T.

The complex equation D = D^T splits into the two real equations

U = U^T,  V = V^T.

Thus, U and V are real symmetric matrices. The equation DD̄ = E implies:

U² + V² = E,  UV = VU.  (18)

By the second of these equations, U and V commute. When we apply Theorem 12' (together with the Note) of Chapter IX (Vol. I, pp. 292-3) to them, we obtain:

U = Q (s₁, s₂, ..., s_n) Q^{-1},  V = Q (t₁, t₂, ..., t_n) Q^{-1}.  (19)

Here s_k and t_k (k = 1, 2, ..., n) are real numbers. Now the first of the equations (18) yields:

s_k² + t_k² = 1  (k = 1, 2, ..., n).

Therefore there exist real numbers φ_k (k = 1, 2, ..., n) such that

s_k = cos φ_k,  t_k = sin φ_k  (k = 1, 2, ..., n).

Substituting these expressions for s_k and t_k in (19) and using (17), we find:

D = Q (e^{iφ₁}, e^{iφ₂}, ..., e^{iφ_n}) Q^{-1} = e^{iS},

where

S = Q (φ₁, φ₂, ..., φ_n) Q^{-1}.  (20)

From (20) it follows that S = S̄ = S^T.

This proves the lemma.

Using the lemma we shall now prove the following theorem:
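Lemma 2 can be illustrated numerically. The sketch below is not part of the book; it assumes NumPy and SciPy, forms D = e^{iS} from a real symmetric S, and recovers a real symmetric logarithm, with `scipy.linalg.logm` playing the role of the diagonalization in (19)-(20). S is kept small so that the principal matrix logarithm stays on the branch where log e^{iS} = iS.

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
S = 0.3 * (A + A.T)                 # real symmetric, small spectral radius

D = expm(1j * S)
assert np.allclose(D, D.T)                     # D is symmetric
assert np.allclose(D @ D.conj().T, np.eye(4))  # D is unitary

S2 = logm(D) / 1j                   # recover a logarithm of D
assert np.allclose(S2.imag, 0, atol=1e-6)      # real ...
assert np.allclose(S2, S2.T)                   # ... and symmetric
assert np.allclose(S2.real, S, atol=1e-6)      # on the principal branch, S itself
```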

THEOREM 2: Every unitary matrix U can be represented in the form

U = R e^{iS},  (21)

where R is a real orthogonal matrix and S a real symmetric matrix:

R = R̄ = (R^T)^{-1},  S = S̄ = S^T.  (22)


Proof. From (21) it follows that

U^T = e^{iS} R^T.  (23)

Multiplying (21) and (23), we obtain from (22):

U^T U = e^{iS} R^T R e^{iS} = e^{2iS}.

By Lemma 2, the real symmetric matrix S can be determined from the equation

U^T U = e^{2iS}  (24)

because U^T U is symmetric and unitary. After S has been determined, we determine R by the equation

R = U e^{-iS}.  (25)

Then

R^T = e^{-iS} U^T  (26)

and so from (24), (25), and (26) it follows that

R^T R = e^{-iS} U^T U e^{-iS} = E,

i.e., R is orthogonal.On the other hand, by (25) R is the product of two unitary matrices

and is therefore itself unitary. Since R is both orthogonal and unitary,it is real. Formula (25) can be written in the form (21).

This proves the theorem.
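The proof of Theorem 2 is constructive and translates directly into code. The sketch below (an illustration assuming NumPy/SciPy, not from the book) builds a unitary U with known factors, recovers S from U^T U = e^{2iS} via the principal matrix logarithm as in (24), and then R from (25); the known symmetric factor is kept small so the principal branch applies.

```python
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(1)
n = 4
R0, _ = np.linalg.qr(rng.standard_normal((n, n)))   # real orthogonal
A = rng.standard_normal((n, n))
S0 = 0.2 * (A + A.T)                                # real symmetric, small

U = R0 @ expm(1j * S0)                              # a unitary matrix
assert np.allclose(U @ U.conj().T, np.eye(n))

S = (logm(U.T @ U) / 2j).real        # equation (24): U^T U = e^{2iS}
assert np.allclose(S, S.T)

R = U @ expm(-1j * S)                # equation (25)
assert np.allclose(R.imag, 0, atol=1e-8)   # R is real ...
assert np.allclose(R.T @ R, np.eye(n))     # ... and orthogonal
assert np.allclose(R @ expm(1j * S), U)    # U = R e^{iS}
```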

§ 2. Polar Decomposition of a Complex Matrix

We shall prove the following theorem:

THEOREM 3: If A = ‖a_{ik}‖₁ⁿ is a non-singular matrix with complex elements, then

A = SQ  (27)

and

A = Q₁S₁,  (28)

where S and S₁ are complex symmetric matrices, Q and Q₁ complex orthogonal matrices. Moreover,

S = √(AA^T) = f(AA^T),  S₁ = √(A^T A) = f₁(A^T A),

where f(λ), f₁(λ) are polynomials in λ.

The factors S and Q in (27) (Q₁ and S₁ in (28)) are permutable if and only if A and A^T are permutable.


Proof. It is sufficient to establish (27), for when we apply this decomposition to the matrix A^T and determine A from the formula thus obtained, we arrive at (28).

If (27) holds, then

A = SQ,  A^T = Q^{-1}S

and therefore

AA^T = S².  (29)

Conversely, since AA^T is non-singular (|AA^T| = |A|² ≠ 0), the function √λ is defined on the spectrum of this matrix⁵ and therefore an interpolation polynomial f(λ) exists such that

√(AA^T) = f(AA^T).  (30)

We denote the symmetric matrix (30) by

S = √(AA^T).

Then (29) holds, and so |S| ≠ 0. Determining Q from (27):

Q = S^{-1}A,

we verify easily that it is an orthogonal matrix. Thus (27) is established.

If the factors S and Q in (27) are permutable, then the matrices

A = SQ  and  A^T = Q^{-1}S

are permutable, since

AA^T = S²,  A^T A = Q^{-1}S²Q.

Conversely, if AA^T = A^T A, then

S² = Q^{-1}S²Q,

i.e., Q is permutable with S² = AA^T. But then Q is also permutable with the matrix S = f(AA^T).

Thus the theorem is proved completely.

2. Using the polar decomposition we shall now prove the following theorem :

⁵ Vol. I, Chapter V, § 1. We choose a single-valued branch of the function √λ in a simply connected domain containing all the characteristic values of AA^T, but not the number 0.


THEOREM 4: If two complex symmetric or skew-symmetric or orthogonal matrices are similar:

B = T^{-1}AT,  (31)

then they are orthogonally similar; i.e., there exists an orthogonal matrix Q such that

B = Q^{-1}AQ.  (32)

Proof. From the conditions of the theorem there follows the existence of a polynomial q(λ) such that

A^T = q(A),  B^T = q(B).  (33)

In the case of symmetric matrices this polynomial q(λ) is identically equal to λ and, in the case of skew-symmetric matrices, to -λ. If A and B are orthogonal matrices, then q(λ) is the interpolation polynomial for 1/λ on the common spectrum of A and B.

Using (33), we conduct the proof of our theorem exactly as we did the proof of the corresponding Theorem 10 of Chapter IX in the real case (Vol. I, p. 289). From (31) we deduce

q(B) = T^{-1} q(A) T

or, by (33),

B^T = T^{-1} A^T T.

Hence

B = T^T A (T^T)^{-1}.

Comparing this equation with (31), we easily find:

T T^T A = A T T^T.  (34)

Let us apply the polar decomposition to the non-singular matrix T:

T = SQ  (S = S^T = f(TT^T),  Q^T = Q^{-1}).

Since by (34) the matrix TT^T is permutable with A, the matrix S = f(TT^T) is also permutable with A. Therefore, when we substitute the product SQ for T in (31), we have

B = Q^{-1} S^{-1} A S Q = Q^{-1} A Q.

This completes the proof of the theorem.


§ 3. The Normal Form of a Complex Symmetric Matrix

1. We shall prove the following theorem:

THEOREM 5: There exists a complex symmetric matrix with arbitrary preassigned elementary divisors.⁶

Proof. We consider the matrix H of order n in which the elements of the first superdiagonal are 1 and all the remaining elements are zero. We shall show that there exists a symmetric matrix S similar to H:

S = THT^{-1}.  (35)

We shall look for the transforming matrix T starting from the conditions:

S = THT^{-1} = S^T = (T^T)^{-1} H^T T^T.

This equation can be rewritten as

VH = H^T V,  (36)

where V is the symmetric matrix connected with T by the equation⁷

T^T T = -2iV.  (37)

Recalling properties of the matrices H and F = H^T (Vol. I, pp. 13-14), we find that every solution V of the matrix equation (36) has the form

V = [0 ... 0 a₀; 0 ... a₀ a₁; ... ; a₀ a₁ ... a_{n-1}],  (38)

where a₀, a₁, ..., a_{n-1} are arbitrary complex numbers (V is constant along each anti-diagonal and zero above the main anti-diagonal). Since it is sufficient for us to find a single transforming matrix T, we set a₀ = 1, a₁ = ... = a_{n-1} = 0 in this formula and define V by the equation⁸

V = [0 ... 0 1; 0 ... 1 0; ... ; 1 ... 0 0].  (39)

⁶ In connection with the contents of the present section as well as the two sections that follow, §§ 4 and 5, see [378].
⁷ To simplify the following formulas it is convenient to introduce the factor -2i.
⁸ The matrix V is both symmetric and orthogonal.


Furthermore, we shall require the transforming matrix T to be symmetric:

T = T^T.  (40)

Then the equation (37) for T can be written as:

T² = -2iV.  (41)

We shall now look for the required matrix T in the form of a polynomial in V. Since V² = E, this can be taken as a polynomial of the first degree:

T = αE + βV.

From (41), taking into account that V² = E, we find:

α² + β² = 0,  2αβ = -2i.

We can satisfy these relations by setting α = 1, β = -i. Then

T = E - iV.  (42)

T is a non-singular symmetric matrix.⁹ At the same time, from (41):

T^{-1} = (-2i)^{-1} V^{-1} T = (i/2) VT,

i.e.,

T^{-1} = ½ (E + iV).  (43)

Thus, a symmetric form S of H is determined by

S = THT^{-1} = ½ (E - iV) H (E + iV),  V = [0 ... 0 1; 0 ... 1 0; ... ; 1 ... 0 0].  (44)

Since S satisfies the equation (36) and V² = E, the equation (44) can be rewritten as follows:

2S = (H + H^T) + i (HV - VH).  (45)

Here H + H^T has 1's on the two diagonals adjacent to the main diagonal, and HV - VH has +1's on the anti-diagonal just above the main anti-diagonal and -1's on the one just below.

⁹ The fact that T is non-singular follows, in particular, from (41), because V is non-singular.


The formula (45) determines a symmetric form S of the matrix H. In what follows, if n is the order of H, H = H^{(n)}, then we shall denote the corresponding matrices T, V, and S by T^{(n)}, V^{(n)}, and S^{(n)}.

Suppose that arbitrary elementary divisors are given:

(λ - λ₁)^{p₁}, (λ - λ₂)^{p₂}, ..., (λ - λ_u)^{p_u}.  (46)

We form the corresponding Jordan matrix

J = { λ₁E^{(p₁)} + H^{(p₁)}, λ₂E^{(p₂)} + H^{(p₂)}, ..., λ_u E^{(p_u)} + H^{(p_u)} }.

For every matrix H^(pj) we introduce the corresponding symmetric form S^(pj). From

    S^(pj) = T^(pj) H^(pj) [T^(pj)]^-1    (j = 1, 2, ..., u)

it follows that

    λj E^(pj) + S^(pj) = T^(pj) [λj E^(pj) + H^(pj)] [T^(pj)]^-1.

Therefore, setting

    S̃ = { λ1 E^(p1) + S^(p1), λ2 E^(p2) + S^(p2), ..., λu E^(pu) + S^(pu) },   (47)
    T = { T^(p1), T^(p2), ..., T^(pu) },                                        (48)

we have:

    S̃ = T J T^-1.

S̃ is a symmetric form of J. S̃ is similar to J and has the same elementary divisors (46) as J. This proves the theorem.

COROLLARY 1: Every square complex matrix A = || a_ik ||_1^n is similar to a symmetric matrix.

Applying Theorem 4, we obtain :

COROLLARY 2: Every complex symmetric matrix S = || a_ik ||_1^n is orthogonally similar to a symmetric matrix with the normal form S̃; i.e., there exists an orthogonal matrix Q such that

    S̃ = Q S Q^-1.                                             (49)

The normal form of a complex symmetric matrix has the quasi-diagonal form

    S̃ = { λ1 E^(p1) + S^(p1), λ2 E^(p2) + S^(p2), ..., λu E^(pu) + S^(pu) },   (50)

where the blocks S^(p) are defined as follows (see (44), (45)):



    S^(p) = (1/2) [E^(p) - iV^(p)] H^(p) [E^(p) + iV^(p)]
          = (1/2) [ H^(p) + [H^(p)]^T + i (H^(p) V^(p) - V^(p) H^(p)) ],        (51)

where, as in (45), H^(p) + [H^(p)]^T has units in the first super- and subdiagonals, and H^(p) V^(p) - V^(p) H^(p) has the elements +1 immediately above and -1 immediately below the secondary diagonal.

§ 4. The Normal Form of a Complex Skew-symmetric Matrix

1. We shall examine what restrictions the skew-symmetry of a matrix imposes on its elementary divisors. In this task we shall make use of the following theorem:

THEOREM 6: A skew-symmetric matrix always has even rank.

Proof. Let r be the rank of the skew-symmetric matrix K. Then K has r linearly independent rows, say those numbered i1, i2, ..., ir; all the remaining rows are linear combinations of these r rows. Since the columns of K are obtained from the corresponding rows by multiplying the elements by -1, every column of K is a linear combination of the columns numbered i1, i2, ..., ir. Therefore every minor of order r of K can be represented in the form

    a K ( i1 i2 ... ir )
        ( i1 i2 ... ir ),

where a is a constant. Hence it follows that

    K ( i1 i2 ... ir ) ≠ 0.
      ( i1 i2 ... ir )

But a skew-symmetric determinant of odd order is always zero. Therefore r is even, and the theorem is proved.
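Theorem 6 is easy to observe numerically; a minimal sketch (numpy; an illustration added to the text):

```python
import numpy as np

rng = np.random.default_rng(0)
for n in range(2, 8):
    M = rng.standard_normal((n, n))
    K = M - M.T                      # a generic skew-symmetric matrix of order n
    r = np.linalg.matrix_rank(K)
    assert r % 2 == 0                # Theorem 6: the rank is always even
```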

THEOREM 7: If λ0 is a characteristic value of the skew-symmetric matrix K with the corresponding elementary divisors

    (λ - λ0)^f1, (λ - λ0)^f2, ..., (λ - λ0)^fs,

then -λ0 is also a characteristic value of K with the same number and the same powers of the corresponding elementary divisors of K:



    (λ + λ0)^f1, (λ + λ0)^f2, ..., (λ + λ0)^fs.

2. If zero is a characteristic value of the skew-symmetric matrix K,¹⁰ then in the system of elementary divisors of K all those of even degree corresponding to the characteristic value zero are repeated an even number of times.

Proof. 1. The transposed matrix K^T has the same elementary divisors as K. But K^T = -K, and the elementary divisors of -K are obtained from those of K by replacing the characteristic values λ1, λ2, ... by -λ1, -λ2, .... Hence the first part of our theorem follows.

2. Suppose that to the characteristic value zero of K there correspond δ1 elementary divisors of the form λ, δ2 of the form λ^2, etc. In general, we denote by δp the number of elementary divisors of the form λ^p (p = 1, 2, ...). We shall show that δ2, δ4, ... are even numbers.

The defect d of K is equal to the number of linearly independent characteristic vectors corresponding to the characteristic value zero or, what is the same, to the number of elementary divisors of the form λ, λ^2, λ^3, .... Therefore

    d = δ1 + δ2 + δ3 + ....                                   (52)

Since, by Theorem 6, the rank r of K is even and d = n - r, d has the same parity as n. The same statement can be made about the defects d3, d5, ... of the matrices K^3, K^5, ..., because odd powers of a skew-symmetric matrix are themselves skew-symmetric. Therefore all the numbers d1 = d, d3, d5, ... have the same parity.

On the other hand, when K is raised to the m-th power, every elementary divisor λ^p with p ≤ m splits into p elementary divisors of the first degree, and for p ≥ m it splits into m elementary divisors.¹¹ Therefore the numbers of elementary divisors of the matrices K, K^3, K^5, ... that are powers of λ are determined by the formulas¹²

    d3 = δ1 + 2δ2 + 3 (δ3 + δ4 + ...),
    d5 = δ1 + 2δ2 + 3δ3 + 4δ4 + 5 (δ5 + δ6 + ...),            (53)
    ..............................................

Comparing (52) with (53) and bearing in mind that all the numbers d1 = d, d3, d5, ... are of the same parity, we conclude easily that δ2, δ4, ... are even numbers.

This completes the proof of the theorem.

10 I.e., if |K| = 0. For odd n we always have |K| = 0.
11 See Vol. I, Chapter VI, Theorem 9, p. 158.
12 These formulas were introduced (without reference to Theorem 9) in Vol. I, Chapter VI (see formulas (49) on p. 155).



2. THEOREM 8: There exists a skew-symmetric matrix with arbitrary preassigned elementary divisors subject to the restrictions 1., 2. of the preceding theorem.

Proof. To begin with, we shall find a skew-symmetric form for the quasi-diagonal matrix of order 2p

    J_λ0^(pp) = { λ0 E + H, -λ0 E - H },                      (54)

having two elementary divisors (λ - λ0)^p and (λ + λ0)^p; here E = E^(p), H = H^(p).

We shall look for a transforming matrix T such that

    T J_λ0^(pp) T^-1

is skew-symmetric, i.e., such that the following equation holds:

    T J_λ0^(pp) T^-1 + [T^-1]^T [J_λ0^(pp)]^T T^T = 0

or

    W J_λ0^(pp) + [J_λ0^(pp)]^T W = 0,                        (55)

where W is the symmetric matrix connected with T by the equation¹³

    T^T T = -2iW.                                             (56)

We dissect W into four square blocks each of order p:

    W = ( W11  W12 )
        ( W21  W22 ).

Then (55) can be written as follows:

    ( W11  W12 ) ( λ0E + H     0       )   ( λ0E + H^T      0        ) ( W11  W12 )
    ( W21  W22 ) (    0     -λ0E - H   ) + (     0      -λ0E - H^T   ) ( W21  W22 ) = 0.   (57)

When we perform the indicated operations on the partitioned matrices on the left-hand side of (57), we replace this equation by four matrix equations:

    1.  H^T W11 + W11 (2λ0 E + H) = 0,
    2.  H^T W12 - W12 H = 0,
    3.  H^T W21 - W21 H = 0,                                  (58)
    4.  H^T W22 + W22 (2λ0 E + H) = 0.

13 See footnote 7 on p. 9.



The equation AX - XB = 0, where A and B are square matrices without common characteristic values, has only the trivial solution X = 0.¹⁴ Therefore the first and fourth of the equations (58) yield: W11 = W22 = 0.¹⁵ As regards the second of these equations, it can be satisfied, as we have seen in the proof of Theorem 5, by setting

    W12 = V = ( 0 ... 0 1 )
              ( 0 ... 1 0 )                                   (59)
              ( 1 ... 0 0 ),

since (cf. (36))

    VH - H^T V = 0.

From the symmetry of W and V it follows that

    W21 = W12^T = V.

The third equation is then automatically satisfied. Thus,

    W = ( 0  V )
        ( V  0 ).                                             (60)

But then, as has become apparent on page 10, the equation (56) will be satisfied if we set

    T = E^(2p) - i V^(2p)                                     (61)

(note that the matrix W in (60) is precisely V^(2p)). Then

    T^-1 = (1/2) (E^(2p) + i V^(2p)),                         (62)

and the required skew-symmetric matrix can be found by the formula¹⁶

    K_λ0^(pp) = (1/2) [E^(2p) - iV^(2p)] J_λ0^(pp) [E^(2p) + iV^(2p)]
              = (1/2) [ J_λ0^(pp) - [J_λ0^(pp)]^T + i (J_λ0^(pp) V^(2p) - V^(2p) J_λ0^(pp)) ].   (63)

When we substitute for J_λ0^(pp) and V^(2p) the corresponding partitioned matrices from (54) and (60), we find:

14 See Vol. I, Chapter VIII, § 1.
15 For λ0 ≠ 0 the equations 1. and 4. have no solutions other than zero. For λ0 = 0 there exist other solutions, but we choose the zero solution.
16 Here we use equations (55) and (60). From these it follows that

    V^(2p) J_λ0^(pp) V^(2p) = -[J_λ0^(pp)]^T.



    K_λ0^(pp) = (1/2) ( H - H^T                   i (2λ0 V + HV + VH) )
                      ( -i (2λ0 V + HV + VH)      H^T - H             ).        (64)

Written out element by element, this gives the matrix (65): the elements ±1/2 stand in the first super- and subdiagonals of the two diagonal blocks, while the off-diagonal blocks carry the elements ±iλ0 in the secondary diagonal and ±i/2 in the two diagonals adjacent to it.   (65)
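A numerical sketch of the construction (54)-(64) for p = 3, λ0 = 2 (numpy; an illustration added to the text):

```python
import numpy as np

p, lam0 = 3, 2.0
Ip = np.eye(p)
V = np.fliplr(Ip)                 # p x p secondary-diagonal unit matrix
H = np.diag(np.ones(p - 1), k=1)
Z = np.zeros((p, p))

J = np.block([[lam0 * Ip + H, Z], [Z, -lam0 * Ip - H]])   # eq. (54)
V2p = np.fliplr(np.eye(2 * p))                            # V^(2p); equals the matrix W of (60)

K = 0.5 * (J - J.T + 1j * (J @ V2p - V2p @ J))            # eq. (63)
T = np.eye(2 * p) - 1j * V2p                              # eq. (61)

assert np.allclose(K, -K.T)        # K is skew-symmetric
assert np.allclose(K @ T, T @ J)   # K = T J T^{-1}: same elementary divisors as (54)
assert np.allclose(K[:p, p:], 0.5j * (2 * lam0 * V + H @ V + V @ H))  # upper-right block of (64)
```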

We shall now construct a skew-symmetric matrix K^(q) of order q having one elementary divisor λ^q, where q is odd. Obviously, the required skew-symmetric matrix will be similar to the matrix

    J^(q) = ( 0  1  .  .  .  .  0 )
            ( .  0  1  .  .  .  . )
            ( .  .  .  .  .  .  . )
            ( .  .  .  0 -1  .  . )                           (66)
            ( .  .  .  .  .  . -1 )
            ( 0  .  .  .  .  .  0 )



In this matrix all the elements outside the first superdiagonal are equal to zero, and along the first superdiagonal there are at first (q - 1)/2 elements 1 and then (q - 1)/2 elements -1. Setting

    K^(q) = T J^(q) T^-1,                                     (67)

we find from the condition of skew-symmetry:

    W1 J^(q) + [J^(q)]^T W1 = 0,                              (68)

where

    T^T T = -2i W1.                                           (69)

By direct verification we can convince ourselves that the matrix

    W1 = V^(q) = ( 0 ... 0 1 )
                 ( 0 ... 1 0 )
                 ( 1 ... 0 0 )

satisfies the condition (68). Taking this value for W1, we find from (69), as before:

    T = E^(q) - iV^(q),   T^-1 = (1/2) [E^(q) + iV^(q)],      (70)

    K^(q) = (1/2) [E^(q) - iV^(q)] J^(q) [E^(q) + iV^(q)]
          = (1/2) [ J^(q) - [J^(q)]^T + i (J^(q) V^(q) - V^(q) J^(q)) ].        (71)

When we perform the corresponding computation, we find:

    2 K^(q) = [ J^(q) - [J^(q)]^T ] + i [ J^(q) V^(q) - V^(q) J^(q) ];          (72)

the first bracket repeats the elements ±1 of (66) above the main diagonal and their negatives below it, while the second bracket has elements ±1 only in the two diagonals adjacent to the secondary diagonal.
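A numerical sketch of K^(q) for q = 5 (numpy; an illustration added to the text):

```python
import numpy as np

q = 5
V = np.fliplr(np.eye(q))
d = np.ones(q - 1)
d[(q - 1) // 2:] = -1            # (q-1)/2 units, then (q-1)/2 entries -1: eq. (66)
J = np.diag(d, k=1)

K = 0.5 * (J - J.T + 1j * (J @ V - V @ J))   # eq. (71)

assert np.allclose(K, -K.T)                          # skew-symmetric
assert np.allclose(np.linalg.matrix_power(K, q), 0)  # nilpotent of index q ...
assert np.linalg.matrix_rank(np.linalg.matrix_power(K, q - 1)) == 1  # ... with the single divisor lambda^q
```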

Suppose that arbitrary elementary divisors are given, subject to the conditions of Theorem 7:

    (λ - λj)^pj, (λ + λj)^pj   (j = 1, 2, ..., u);
    λ^q1, λ^q2, ..., λ^qv      (q1, q2, ..., qv are odd numbers).¹⁷              (73)

Then the quasi-diagonal skew-symmetric matrix

    K̃ = { K_λ1^(p1 p1), K_λ2^(p2 p2), ..., K_λu^(pu pu), K^(q1), ..., K^(qv) }   (74)

has the elementary divisors (73). This concludes the proof of the theorem.

COROLLARY: Every complex skew-symmetric matrix K is orthogonally similar to a skew-symmetric matrix having the normal form K̃ determined by (74), (65), and (72); i.e., there exists a (complex) orthogonal matrix Q such that

    K̃ = Q K Q^-1.                                             (75)

Note. If K is a real skew-symmetric matrix, then it has linear elementary divisors (see Vol. I, Chapter IX, § 13)

    λ - iφ1, λ + iφ1, ..., λ - iφt, λ + iφt, λ, ..., λ   (φ1, ..., φt are real numbers).

In this case, setting all the pj = 1 and all the qk = 1 in (74), we obtain as the normal form of a real skew-symmetric matrix

    K̃ = { (  0   φ1 ), ..., (  0   φt ), 0, ..., 0 }.
         ( -φ1   0  )       ( -φt   0  )
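A quick numerical illustration of this spectral structure of a real skew-symmetric matrix (numpy; an illustration added to the text):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
K = M - M.T                       # a real skew-symmetric matrix of odd order 5

w = np.linalg.eigvals(K)
# characteristic values come as pure imaginaries ±i*phi_j plus zeros
assert np.allclose(w.real, 0)
assert np.isclose(np.linalg.det(K), 0)   # odd order forces the characteristic value 0
```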

§ 5. The Normal Form of a Complex Orthogonal Matrix

1. Let us begin by examining what restrictions the orthogonality of a matrix imposes on its elementary divisors.

THEOREM 9: 1. If λ0 (λ0^2 ≠ 1) is a characteristic value of an orthogonal matrix Q and if the elementary divisors

    (λ - λ0)^f1, (λ - λ0)^f2, ..., (λ - λ0)^fs

17 Some of the numbers λj may be zero. Moreover, one of the numbers u and v may be zero; i.e., in some cases there may be elementary divisors of only one type.



correspond to this characteristic value, then 1/λ0 is also a characteristic value of Q and it has the same corresponding elementary divisors:

    (λ - λ0^-1)^f1, (λ - λ0^-1)^f2, ..., (λ - λ0^-1)^fs.

2. If λ0 = ±1 is a characteristic value of the orthogonal matrix Q, then the elementary divisors of even degree corresponding to λ0 are repeated an even number of times.

Proof. 1. For every non-singular matrix Q, on passing from Q to Q^-1 each elementary divisor (λ - λ0)^f is replaced by the elementary divisor (λ - λ0^-1)^f.¹⁸ On the other hand, the matrices Q and Q^T always have the same elementary divisors. Therefore the first part of our theorem follows at once from the orthogonality condition Q^T = Q^-1.

2. Let us assume that the number 1 is a characteristic value of Q, while -1 is not (|E - Q| = 0, |E + Q| ≠ 0). Then we apply Cayley's formulas (see Vol. I, Chapter IX, § 14), which remain valid for complex matrices. We define a matrix K by the equation

    K = (E - Q)(E + Q)^-1.                                    (76)

Direct verification shows that K^T = -K, so that K is skew-symmetric. When we solve the equation (76) for Q, we find:¹⁹

    Q = (E - K)(E + K)^-1.

Setting

    f(λ) = (1 - λ)/(1 + λ),

we have f'(λ) = -2/(1 + λ)^2 ≠ 0. Therefore in the transition from K to Q = f(K) the elementary divisors do not split.²⁰ Hence in the system of elementary divisors of Q those of the form (λ - 1)^2p are repeated an even number of times, because this holds for the elementary divisors of the form λ^2p of K (see Theorem 7).
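A numerical sketch of the Cayley connection (76) between orthogonal and skew-symmetric matrices (numpy; the particular Q below, a pair of plane rotations, is an assumed example chosen so that -1 is not a characteristic value):

```python
import numpy as np

def rot(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s], [s, c]])

# an orthogonal Q without the characteristic value -1 (assumed test matrix)
Q = np.zeros((4, 4))
Q[:2, :2], Q[2:, 2:] = rot(0.7), rot(1.9)
E = np.eye(4)

K = (E - Q) @ np.linalg.inv(E + Q)     # eq. (76)
assert np.allclose(K, -K.T)            # K is skew-symmetric
Q_back = (E - K) @ np.linalg.inv(E + K)
assert np.allclose(Q_back, Q)          # solving (76) for Q recovers it
```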

The case where Q has the characteristic value -1, but not +1, is reduced to the preceding case by considering the orthogonal matrix -Q.

We now proceed to the most complicated case, where Q has both the characteristic values +1 and -1. We denote by ψ(λ) the minimal polynomial of Q. Using the first part of the theorem, which has already been proved, we can write ψ(λ) in the form

18 See Vol. I, Chapter VI, § 7. Setting f(λ) = 1/λ, we have f'(λ0) = -λ0^-2 ≠ 0. Hence it follows that in the transition from Q to Q^-1 the elementary divisors do not split (see Vol. I, p. 158).
19 Note that (76) implies that E + K = 2 (E + Q)^-1 and therefore |E + K| ≠ 0.
20 See Vol. I, p. 158.



    ψ(λ) = (λ - 1)^m1 (λ + 1)^m2 Π_j (λ - λj)^lj (λ - λj^-1)^lj    (λj^2 ≠ 1; j = 1, 2, ..., x).

We consider the polynomial g(λ) of degree less than m (m is the degree of ψ(λ)) for which g(1) = 1 and all the remaining m - 1 values on the spectrum of Q are zero; and we set:²¹

    P = g(Q).                                                 (77)

Note that the functions [g(λ)]^2 and g(1/λ) assume on the spectrum of Q the same values as g(λ). Therefore

    P^2 = P,   P^T = g(Q^T) = g(Q^-1) = P,                    (78)

i.e., P is a symmetric projective matrix.²² We define a polynomial h(λ) and a matrix N by the equations

    h(λ) = (λ - 1) g(λ),                                      (79)
    N = h(Q) = (Q - E) P.                                     (80)

Since [h(λ)]^m1 vanishes on the spectrum of Q, it is divisible by ψ(λ) without remainder. Hence:

    N^m1 = 0,

i.e., N is a nilpotent matrix with m1 as index of nilpotency. From (80) we find:²³

    N^T = (Q^T - E) P.                                        (81)

21 From the fundamental formula (see Vol. I, p. 104)

    g(Q) = Σ_k [ g(λk) Z_k1 + g'(λk) Z_k2 + ... ]

it follows that P = Z_11.
22 A hermitian operator P is called projective if P^2 = P. In accordance with this, a hermitian matrix P for which P^2 = P is called projective. An example of a projective operator P in a unitary space R is the operator of the orthogonal projection of a vector x ∈ R onto a subspace S = PR, i.e., Px = x_S, where x_S ∈ S and (x - x_S) ⊥ S (see Vol. I, p. 248).
23 All the matrices that occur here, P, N, N^T, Q^T, are permutable among each other and with Q, since they are all functions of Q.



Let us consider the matrix

    R = N (N^T + 2E).                                         (82)

From (78), (80), and (81) it follows that

    R = N N^T + 2N = (Q - Q^T) P.

From this representation of R it is clear that R is skew-symmetric. On the other hand, from (82),

    R^k = N^k (N^T + 2E)^k    (k = 1, 2, ...).                (83)

But N^T, like N, is nilpotent, and therefore

    |N^T + 2E| ≠ 0.

Hence it follows from (83) that the matrices R^k and N^k have the same rank for every k.

Now for odd k the matrix R^k is skew-symmetric and therefore (see p. 12) has even rank. Therefore each of the matrices

    N, N^3, N^5, ...

has even rank.

By repeating verbatim for N the arguments that were used on p. 13 for K, we may therefore state that among the elementary divisors of N those of the form λ^2p are repeated an even number of times. But to each elementary divisor λ^2p of N there corresponds an elementary divisor (λ - 1)^2p of Q, and vice versa.²⁴ Hence it follows that among the elementary divisors of Q those of the form (λ - 1)^2p are repeated an even number of times.

We obtain a similar statement for the elementary divisors of the form (λ + 1)^2p by applying what has just been proved to the matrix -Q.

Thus, the proof of the theorem is complete.

2. We shall now prove the converse theorem.

24 Since h(1) = 0, h'(1) ≠ 0, in passing from Q to N = h(Q) the elementary divisors of the form (λ - 1)^2p of Q do not split and are therefore replaced by elementary divisors λ^2p (see Vol. I, Chapter VI, § 7).



THEOREM 10: Every system of powers of the form

    (λ - λj)^pj, (λ - λj^-1)^pj    (λj ≠ 0; j = 1, 2, ..., u);
    (λ - 1)^q1, (λ - 1)^q2, ..., (λ - 1)^qv;
    (λ + 1)^t1, (λ + 1)^t2, ..., (λ + 1)^tw
    (q1, ..., qv, t1, ..., tw are odd numbers)                (84)

is the system of elementary divisors of some complex orthogonal matrix Q.²⁵

Proof. We denote by μj the numbers connected with the numbers λj (j = 1, 2, ..., u) by the equations

    λj = e^μj    (j = 1, 2, ..., u).

We now introduce the 'canonical' skew-symmetric matrices (see the preceding section)

    K_μj^(pj pj)  (j = 1, 2, ..., u);   K^(q1), ..., K^(qv);   K^(t1), ..., K^(tw)

with the elementary divisors

    (λ - μj)^pj, (λ + μj)^pj   (j = 1, 2, ..., u);   λ^q1, ..., λ^qv;   λ^t1, ..., λ^tw.

If K is a skew-symmetric matrix, then

    Q = e^K

is orthogonal (Q^T = e^(K^T) = e^(-K) = Q^-1). Moreover, to each elementary divisor (λ - μ)^p of K there corresponds an elementary divisor (λ - e^μ)^p of Q.²⁶
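The orthogonality of Q = e^K for skew-symmetric K is easy to confirm numerically (a numpy sketch added to the text; the power series is used here in place of a library matrix exponential):

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
K = M - M.T                        # skew-symmetric

# matrix exponential via its power series (adequate for this small example)
Q = np.eye(4)
term = np.eye(4)
for k in range(1, 30):
    term = term @ K / k
    Q = Q + term

assert np.allclose(Q.T @ Q, np.eye(4))   # Q = e^K is orthogonal
```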

Therefore the quasi-diagonal matrix

    Q̃ = { e^K_μ1^(p1 p1), ..., e^K_μu^(pu pu);  e^K^(q1), ..., e^K^(qv);  -e^K^(t1), ..., -e^K^(tw) }   (85)

is orthogonal and has the elementary divisors (84). This proves the theorem.

From Theorems 4, 9, and 10 we obtain:

25 Some (or even all) of the numbers λj may be ±1. One or two of the numbers u, v, w may be zero. Then the elementary divisors of the corresponding type are absent in Q.

26 This follows from the fact that for f(λ) = e^λ we have f'(λ) = e^λ ≠ 0 for every λ.



COROLLARY: Every (complex) orthogonal matrix Q is orthogonally similar to an orthogonal matrix having the normal form Q̃; i.e., there exists an orthogonal matrix Q1 such that

    Q̃ = Q1 Q Q1^-1.                                           (86)

Note. Just as we have given a concrete form to the diagonal blocks in the skew-symmetric matrix K̃, so we could for the normal form Q̃.²⁷

27 See [378].


CHAPTER XII

SINGULAR PENCILS OF MATRICES

§ 1. Introduction

1. The present chapter deals with the following problem:

Given four matrices A, B, A1, B1, all of dimension m × n, with elements from a number field F, it is required to find under what conditions there exist two square non-singular matrices P and Q of orders m and n, respectively, such that¹

    P A Q = A1,   P B Q = B1.                                 (1)

By introduction of the pencils of matrices A + λB and A1 + λB1, the two matrix equations (1) can be replaced by the single equation

    P (A + λB) Q = A1 + λB1.                                  (2)

DEFINITION 1: Two pencils of rectangular matrices A + λB and A1 + λB1 of the same dimensions m × n connected by the equation (2), in which P and Q are constant square non-singular matrices (i.e., matrices independent of λ) of orders m and n, respectively, will be called strictly equivalent.²

According to the general definition of equivalence of λ-matrices (see Vol. I, Chapter VI, p. 132), the pencils A + λB and A1 + λB1 are equivalent if an equation of the form (2) holds in which P and Q are two square λ-matrices with constant non-vanishing determinants. For strict equivalence it is required in addition that P and Q do not depend on λ.³

A criterion for equivalence of the pencils A + λB and A1 + λB1 follows from the general criterion for equivalence of λ-matrices and consists in the equality of the invariant polynomials or, what is the same, of the elementary divisors of the pencils A + λB and A1 + λB1 (see Vol. I, Chapter VI, p. 141).

1 If such matrices P and Q exist, then their elements can be taken from the field F. This follows from the fact that the equations (1) can be written in the form PA = A1 Q^-1, PB = B1 Q^-1 and are therefore equivalent to a certain system of linear homogeneous equations for the elements of P and Q^-1 with coefficients in F.
2 See Vol. I, Chapter VI, p. 145.
3 We have replaced the term 'equivalent pencils' that occurs in the literature by 'strictly equivalent pencils,' in order to draw a sharp distinction between Definition 1 and the definition of equivalence in Vol. I, Chapter VI.




In this chapter we shall establish a criterion for strict equivalence of two pencils of matrices, and we shall determine for each pencil a strictly equivalent canonical form.

2. The task we have set ourselves has a natural geometrical interpretation. We consider a pencil of linear operators A + λB mapping R_n into R_m. For a definite choice of bases in these spaces the pencil of operators A + λB corresponds to a pencil of rectangular matrices A + λB (of dimension m × n); under a change of bases in R_n and R_m the pencil A + λB is replaced by a strictly equivalent pencil P(A + λB)Q, where P and Q are square non-singular matrices of orders m and n (see Vol. I, Chapter III, §§ 2 and 4). Thus, a criterion for strict equivalence gives a characterization of the class of matrix pencils A + λB (of dimension m × n) that describe one and the same pencil of operators A + λB mapping R_n into R_m for various choices of bases in these spaces.

In order to obtain a canonical form for a pencil it is necessary to find bases for R_n and R_m in which the pencil of operators A + λB is described by matrices of the simplest possible form.

Since a pencil of operators is given by two operators A and B, we can also say: the present chapter deals with the simultaneous investigation of two operators A and B mapping R_n into R_m.

3. All the pencils of matrices A + λB of dimension m × n fall into two basic types: regular and singular pencils.

DEFINITION 2: A pencil of matrices A + λB is called regular if

1) A and B are square matrices of the same order n; and
2) the determinant |A + λB| does not vanish identically.

In all other cases (m ≠ n; or m = n but |A + λB| ≡ 0), the pencil is called singular.

A criterion for strict equivalence of regular pencils of matrices and also a canonical form for such pencils were established by Weierstrass in 1867 [377] on the basis of his theory of elementary divisors, which we have expounded in Chapters VI and VII. The analogous problems for singular pencils were solved later, in 1890, by the investigations of Kronecker [249].⁴ Kronecker's results form the primary content of this chapter.

4 Of more recent papers dealing with singular pencils of matrices we mention [234], [369], and [255].

§ 2. Regular Pencils of Matrices

1. We consider the special case where the pencils A + λB and A1 + λB1 consist of square matrices (m = n) with |B| ≠ 0, |B1| ≠ 0. In this case, as we have shown in Chapter VI (Vol. I, pp. 145-146), the two concepts of 'equivalence' and 'strict equivalence' of pencils coincide. Therefore, by applying to the pencils the general criterion for equivalence of λ-matrices (Vol. I, p. 141) we are led to the following theorem:

THEOREM 1: Two pencils of square matrices of the same order, A + λB and A1 + λB1, for which |B| ≠ 0 and |B1| ≠ 0, are strictly equivalent if and only if the pencils have the same elementary divisors in F.

A pencil of square matrices A + λB with |B| ≠ 0 was called regular in Chapter VI, because it represents a special case of a regular matrix polynomial in λ (see Vol. I, Chapter IV, p. 76). In the preceding section of this chapter we have given a wider definition of regularity. According to this definition it is quite possible in a regular pencil to have |B| = 0 (and even |A| = |B| = 0).

In order to find out whether Theorem 1 remains valid for regular pencils (with the extended Definition 2), we consider the following example:

    A + λB = ( 2 1 3 )     ( 1 1 1 )        A1 + λB1 = ( 2 1 1 )     ( 1 1 1 )
             ( 3 2 5 ) + λ ( 1 1 2 ),                  ( 1 2 1 ) + λ ( 1 1 1 ).   (3)
             ( 3 2 6 )     ( 1 1 3 )                   ( 1 1 1 )     ( 1 1 1 )

It is easy to see that here each of the pencils A + λB and A1 + λB1 has only one elementary divisor, λ + 1. However, the pencils are not strictly equivalent, since the matrices B and B1 are of ranks 2 and 1, respectively; whereas if an equation (2) were to hold, it would follow from it that the ranks of B and B1 are equal. Nevertheless, the pencils (3) are regular according to Definition 2, since

    |A + λB| = |A1 + λB1| = λ + 1.

This example shows that Theorem 1 is not true with the extended definition of regularity of a pencil.
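A numerical check of example (3) (numpy; the specific matrix entries below are those of (3) as reconstructed here and should be treated as an assumption):

```python
import numpy as np

A  = np.array([[2., 1, 3], [3, 2, 5], [3, 2, 6]])
B  = np.array([[1., 1, 1], [1, 1, 2], [1, 1, 3]])
A1 = np.array([[2., 1, 1], [1, 2, 1], [1, 1, 1]])
B1 = np.ones((3, 3))

# both pencils are regular, with the same determinant lambda + 1 ...
for lam in (0.0, 1.0, 2.0, -3.0):
    assert np.isclose(np.linalg.det(A + lam * B), lam + 1)
    assert np.isclose(np.linalg.det(A1 + lam * B1), lam + 1)

# ... but rank B = 2 while rank B1 = 1, so strict equivalence is impossible
assert np.linalg.matrix_rank(B) == 2
assert np.linalg.matrix_rank(B1) == 1
```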

2. In order to preserve Theorem 1, we have to introduce the concept of 'infinite' elementary divisors of a pencil. We give the pencil A + λB in terms of 'homogeneous' parameters λ, μ: μA + λB. Then the determinant Δ(λ, μ) = |μA + λB| is a homogeneous function of λ, μ. By determining the greatest common divisor D_k(λ, μ) of all the minors of order k of the matrix μA + λB (k = 1, 2, ..., n), we obtain the invariant polynomials by the well-known formulas

    i1(λ, μ) = D_n(λ, μ) / D_(n-1)(λ, μ),   i2(λ, μ) = D_(n-1)(λ, μ) / D_(n-2)(λ, μ),  ...;

here all the D_k(λ, μ) and i_k(λ, μ) are homogeneous polynomials in λ and μ.



Splitting the invariant polynomials into powers of homogeneous polynomials irreducible over F, we obtain the elementary divisors e_α(λ, μ) (α = 1, 2, ...) of the pencil μA + λB in F.

It is quite obvious that if we set μ = 1 in e_α(λ, μ), we are back at the elementary divisors e_α(λ) of the pencil A + λB. Conversely, from each elementary divisor e_α(λ) of degree q we obtain the corresponding elementary divisor e_α(λ, μ) by the formula e_α(λ, μ) = μ^q e_α(λ/μ). We can obtain in this way all the elementary divisors of the pencil μA + λB apart from those of the form μ^q.

Elementary divisors of the form μ^q exist if and only if |B| = 0; they are called 'infinite' elementary divisors of the pencil A + λB.

Since strict equivalence of the pencils A + λB and A1 + λB1 implies strict equivalence of the pencils μA + λB and μA1 + λB1, we see that for strictly equivalent pencils A + λB and A1 + λB1 not only their 'finite' but also their 'infinite' elementary divisors must coincide.

Suppose now that A + λB and A1 + λB1 are two regular pencils for which all the elementary divisors coincide (including the infinite ones). We introduce homogeneous parameters: μA + λB, μA1 + λB1. Let us now transform the parameters:

    λ = α1 λ̃ + α2 μ̃,   μ = β1 λ̃ + β2 μ̃    (α1 β2 - α2 β1 ≠ 0).

In the new parameters the pencils are written as follows:

    μ̃ Ã + λ̃ B̃,   μ̃ Ã1 + λ̃ B̃1,   where B̃ = β1 A + α1 B,  B̃1 = β1 A1 + α1 B1.

From the regularity of the pencils μA + λB and μA1 + λB1 it follows that we can choose the numbers α1 and β1 such that |B̃| ≠ 0 and |B̃1| ≠ 0.

Therefore, by Theorem 1, the pencils μ̃Ã + λ̃B̃ and μ̃Ã1 + λ̃B̃1, and consequently the original pencils μA + λB and μA1 + λB1 (or, what is the same, A + λB and A1 + λB1), are strictly equivalent. Thus, we have arrived at the following generalization of Theorem 1:

THEOREM 2: Two regular pencils A + λB and A1 + λB1 are strictly equivalent if and only if they have the same ('finite' and 'infinite') elementary divisors.

In our example above the pencils (3) had the same 'finite' elementary divisor λ + 1, but different 'infinite' elementary divisors (the first pencil has one 'infinite' elementary divisor μ^2; the second has two: μ, μ). Therefore these pencils turn out not to be strictly equivalent.
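In homogeneous parameters the limitation of the determinant alone is visible directly: for both pencils of (3) the determinant is μ^2 (λ + μ), so the determinant cannot distinguish the infinite divisor μ^2 from the pair μ, μ; the higher invariant polynomials are needed for that. A numpy sketch (entries of (3) as reconstructed above, an assumption):

```python
import numpy as np

A  = np.array([[2., 1, 3], [3, 2, 5], [3, 2, 6]])
B  = np.array([[1., 1, 1], [1, 1, 2], [1, 1, 3]])
A1 = np.array([[2., 1, 1], [1, 2, 1], [1, 1, 1]])
B1 = np.ones((3, 3))

# |mu*A + lam*B| = mu^2 (lam + mu) for both pencils, even though
# their 'infinite' elementary divisors differ
rng = np.random.default_rng(3)
for lam, mu in rng.standard_normal((5, 2)):
    d = mu ** 2 * (lam + mu)
    assert np.isclose(np.linalg.det(mu * A + lam * B), d)
    assert np.isclose(np.linalg.det(mu * A1 + lam * B1), d)
```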



3. Suppose now that A + λB is an arbitrary regular pencil. Then there exists a number c such that |A + cB| ≠ 0. We represent the given pencil in the form A1 + (λ - c)B, where A1 = A + cB, so that |A1| ≠ 0. We multiply the pencil on the left by A1^-1: E + (λ - c) A1^-1 B. By a similarity transformation we put the pencil into the form⁵

    E + (λ - c) {J0, J1} = { E - cJ0 + λJ0,  E - cJ1 + λJ1 },                   (4)

where {J0, J1} is the quasi-diagonal normal form of A1^-1 B, J0 is a nilpotent Jordan matrix,⁶ and |J1| ≠ 0.

We multiply the first diagonal block on the right-hand side of (4) by (E - cJ0)^-1 and obtain: E + λ (E - cJ0)^-1 J0. Here the coefficient of λ is a nilpotent matrix.⁷ Therefore by a similarity transformation we can put this pencil into the form⁸

    { N^(u1), N^(u2), ..., N^(us) }    ( N^(u) = E^(u) + λ H^(u) ).             (5)

We multiply the second diagonal block on the right-hand side of (4) by J1^-1; it can then be put into the form J + λE by a similarity transformation, where J is a matrix of normal form⁹ and E the unit matrix. We have thus arrived at the following theorem:

THEOREM 3: Every regular pencil A + λB can be reduced to a (strictly equivalent) canonical quasi-diagonal form

    { N^(u1), N^(u2), ..., N^(us), J + λE }    ( N^(u) = E^(u) + λ H^(u) ),     (6)

where the first s diagonal blocks correspond to the infinite elementary divisors μ^u1, μ^u2, ..., μ^us of the pencil A + λB and where the normal form of the last diagonal block J + λE is uniquely determined by the finite elementary divisors of the given pencil.

5 The unit matrices E in the diagonal blocks on the right-hand side of (4) have the same orders as J0 and J1.
6 I.e., J0^l = 0 for some integer l > 0.
7 From J0^l = 0 it follows that [(E - cJ0)^-1 J0]^l = 0.
8 Here E^(u) is a unit matrix of order u and H^(u) is a matrix of order u whose elements in the first superdiagonal are 1, while the remaining elements are zero.
9 Since the matrix J can be replaced here by an arbitrary similar matrix, we may assume that J has one of the normal forms (for example, the natural form of the first or second kind, or the Jordan form; see Vol. I, Chapter VI, § 6).



§ 3. Singular Pencils. The Reduction Theorem

1. We now proceed to consider a singular pencil of matrices A + λB of dimension m × n. We denote by r the rank of the pencil, i.e., the largest of the orders of minors that do not vanish identically. From the singularity of the pencil it follows that at least one of the inequalities r < n and r < m holds; let us say r < n. Then the columns of the λ-matrix A + λB are linearly dependent, i.e., the equation

    (A + λB) x = 0,                                           (7)

where x is an unknown column matrix, has a non-zero solution. Every non-zero solution of this equation determines some dependence among the columns of A + λB. We restrict ourselves to only such solutions x(λ) of (7) as are polynomials in λ,¹⁰ and among these solutions we choose one of least possible degree ε:

    x(λ) = x0 - λ x1 + λ^2 x2 - ... + (-1)^ε λ^ε x_ε    (x_ε ≠ 0).              (8)

Substituting this solution in (7) and equating to zero the coefficients ofthe powers of 2, we obtain :

Ax₀ = o,  Bx₀ − Ax₁ = o,  Bx₁ − Ax₂ = o,  …,  Bx_{ε−1} − Ax_ε = o,  Bx_ε = o.   (9)

Considering this as a system of linear homogeneous equations for the elements of the columns x₀, −x₁, +x₂, …, (−1)^ε x_ε, we deduce that the coefficient matrix of the system

M_ε = M_ε[A + λB] =
| A  O  .  .  .  O |
| B  A  .  .  .  O |
| O  B  .  .  .  . |
| .  .  .  .  .  A |        (10)
| O  O  .  .  .  B |

(the matrix M_ε has ε + 1 block columns)

is of rank ρ_ε < (ε + 1)n. At the same time, by the minimal property of ε, the ranks ρ₀, ρ₁, …, ρ_{ε−1} of the matrices

I" For the actual determination of the elements of the column .r satisfying (7) it isconvenient to solve a system of linear homogeneous equations in which the coefficients ofthe unknown depend Iiucarly on ).. The fundamental linearly iudependcut solutions r canalways be dhoseu such that their elements are polynuwials in )..


30 XII. SINGULAR PENCILS OF MATRICES

M₀ =
| A |
| B | ,    M₁ =
| A  O |
| B  A |
| O  B | ,    …,    M_{ε−1} =
| A  O  .  .  .  O |
| B  A  .  .  .  O |
| .  .  .  .  .  . |        (10')
| O  O  .  .  .  B |

(the matrix M_k has k + 1 block columns)

satisfy the equations ρ₀ = n, ρ₁ = 2n, …, ρ_{ε−1} = εn. Thus: The number ε is the least value of the index k for which the sign < holds in the relation ρ_k ≤ (k + 1)n.
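As a numerical illustration (not from the book), the criterion just stated can be checked mechanically: build the block matrix M_k of (10) for k = 0, 1, 2, … and stop at the first k for which the rank drops below (k + 1)n. The function name `minimal_degree` and the use of numpy are my own choices; floating-point rank computation is only reliable for well-scaled examples.

```python
import numpy as np

def minimal_degree(A, B):
    """Least k with rank M_k < (k+1)*n, i.e. the minimal degree eps of a
    polynomial solution of (A + lam*B)x = 0; M_k is built as in (10)."""
    m, n = A.shape
    k = 0
    while True:
        # M_k has k+2 block rows and k+1 block columns: A on the diagonal
        # blocks, B directly below each A.
        M = np.zeros(((k + 2) * m, (k + 1) * n))
        for j in range(k + 1):
            M[j * m:(j + 1) * m, j * n:(j + 1) * n] = A
            M[(j + 1) * m:(j + 2) * m, j * n:(j + 1) * n] = B
        if np.linalg.matrix_rank(M) < (k + 1) * n:
            return k
        k += 1

# The 1 x 2 pencil (lam, 1): x(lam) = (1, -lam) solves it, so eps = 1.
A = np.array([[0.0, 1.0]])
B = np.array([[1.0, 0.0]])
print(minimal_degree(A, B))  # -> 1
```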

Now we can formulate and prove the following fundamental theorem:

2. THEOREM 4: If the equation (7) has a solution of minimal degree ε and ε > 0, then the given pencil A + λB is strictly equivalent to a pencil of the form

| L_ε  0        |
| 0    Â + λB̂ | ,   (11)

where

L_ε =
| λ  1  0  .  .  .  0  0 |
| 0  λ  1  .  .  .  0  0 |
| .  .  .  .  .  .  .  . |        (12)
| 0  0  0  .  .  .  λ  1 |

(ε rows, ε + 1 columns),

and Â + λB̂ is a pencil of matrices for which the equation analogous to (7) has no solution of degree less than ε.

Proof. We shall conduct the proof of the theorem in three stages. First, we shall show that the given pencil A + λB is strictly equivalent to a pencil of the form

| L_ε  D + λF   |
| 0    Â + λB̂ | ,   (13)

where D, F, Â, B̂ are constant rectangular matrices of the appropriate dimensions. Then we shall establish that the equation (Â + λB̂)x̂ = o has no solution x̂(λ) of degree less than ε. Finally, we shall prove that by further transformations the pencil (13) can be brought into the quasi-diagonal form (11).


1. The first part of the proof will be couched in geometrical terms. Instead of the pencil of matrices A + λB we consider a pencil of operators A + λB mapping R_n into R_m and show that with a suitable choice of bases in these spaces the matrix corresponding to the operator A + λB assumes the form (13).

Instead of (7) we take the vector equation

(A + λB)x = o   (14)

with the vector solution

x(λ) = x₀ − λx₁ + λ²x₂ − ⋯ + (−1)^ε λ^ε x_ε;   (15)

the equations (9) are replaced by the vector equations

Ax₀ = o,  Ax₁ = Bx₀,  Ax₂ = Bx₁,  …,  Ax_ε = Bx_{ε−1},  Bx_ε = o.   (16)

Below we shall show that the vectors

Ax₁, Ax₂, …, Ax_ε   (17)

are linearly independent. Hence it will be easy to deduce the linear inde-pendence of the vectors

x₀, x₁, …, x_ε.   (18)

For since Ax₀ = o, we have from α₀x₀ + α₁x₁ + ⋯ + α_ε x_ε = o that α₁Ax₁ + ⋯ + α_ε Ax_ε = o, so that by the linear independence of the vectors (17) α₁ = α₂ = ⋯ = α_ε = 0. But x₀ ≠ o, since otherwise (1/λ)x(λ) would be a solution of (14) of degree ε − 1, which is impossible. Therefore α₀ = 0 also.

Now if we take the vectors (17) as the first ε vectors of a new basis in R_m and the vectors (18) as the first ε + 1 vectors of a new basis in R_n, then in these new bases the operators A and B, by (16), will correspond to the matrices

(in each matrix the first ε + 1 columns are written out)

A ↔ | 0  1  0  .  .  .  0   *  .  .  .  * |
    | 0  0  1  .  .  .  0   *  .  .  .  * |
    | .  .  .  .  .  .  .   .  .  .  .  . |
    | 0  0  0  .  .  .  1   *  .  .  .  * |
    | 0  0  0  .  .  .  0   *  .  .  .  * |
    | .  .  .  .  .  .  .   .  .  .  .  . |
    | 0  0  0  .  .  .  0   *  .  .  .  * |

B ↔ | 1  0  .  .  .  0  0   *  .  .  .  * |
    | 0  1  .  .  .  0  0   *  .  .  .  * |
    | .  .  .  .  .  .  .   .  .  .  .  . |
    | 0  0  .  .  .  1  0   *  .  .  .  * |
    | 0  0  .  .  .  0  0   *  .  .  .  * |
    | .  .  .  .  .  .  .   .  .  .  .  . |
    | 0  0  .  .  .  0  0   *  .  .  .  * | ;


hence the λ-matrix A + λB is of the form (13). All the preceding arguments will be justified if we can show that the vectors (17) are linearly independent. Assume the contrary and let Ax_h (h ≥ 1) be the first vector in (17) that is linearly dependent on the preceding ones:

Ax_h = α₁Ax_{h−1} + α₂Ax_{h−2} + ⋯ + α_{h−1}Ax₁.

By (16) this equation can be rewritten as follows :

Bx_{h−1} = α₁Bx_{h−2} + α₂Bx_{h−3} + ⋯ + α_{h−1}Bx₀,

i.e.,

Bx*_{h−1} = o,

where

x*_{h−1} = x_{h−1} − α₁x_{h−2} − α₂x_{h−3} − ⋯ − α_{h−1}x₀.

Furthermore, again by (16),

Ax*_{h−1} = B(x_{h−2} − α₁x_{h−3} − ⋯ − α_{h−2}x₀) = Bx*_{h−2},

where

x*_{h−2} = x_{h−2} − α₁x_{h−3} − ⋯ − α_{h−2}x₀.

Continuing the process and introducing the vectors

x*_{h−3} = x_{h−3} − α₁x_{h−4} − ⋯ − α_{h−3}x₀,  …,  x*₁ = x₁ − α₁x₀,  x*₀ = x₀,

we obtain a chain of equations

Bx*_{h−1} = o,  Ax*_{h−1} = Bx*_{h−2},  …,  Ax*₁ = Bx*₀,  Ax*₀ = o.   (19)

From (19) it follows that

x*(λ) = x*₀ − λx*₁ + ⋯ + (−1)^{h−1} λ^{h−1} x*_{h−1}   (x*₀ = x₀ ≠ o)

is a non-zero solution of (14) of degree ≤ h − 1 < ε, which is impossible. Thus, the vectors (17) are linearly independent.

2. We shall now show that the equation (Â + λB̂)x̂ = o has no solutions of degree less than ε. To begin with, we observe that the equation L_ε y = o, like (7), has a non-zero solution of least degree ε. We can see this immediately if we replace the matrix equation L_ε y = o by the system of ordinary equations

λy₁ + y₂ = 0,  λy₂ + y₃ = 0,  …,  λy_ε + y_{ε+1} = 0   (y = (y₁, y₂, …, y_{ε+1})),

whence

y_k = (−1)^{k−1} y₁ λ^{k−1}   (k = 1, 2, …, ε + 1).
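A hedged sketch of the verification just carried out: representing each component y_k = (−1)^{k−1}λ^{k−1} by its coefficient vector, one can check that every row λy_i + y_{i+1} of L_ε y vanishes identically. The helper name `check_Leps_solution` is my own.

```python
import numpy as np

def check_Leps_solution(eps):
    """Verify that y_k(lam) = (-1)**(k-1) * lam**(k-1) (k = 1, ..., eps+1)
    solves L_eps(lam) y = 0.  Each y_k is stored as a coefficient vector
    (lowest degree first); row i of L_eps reads lam*y_i + y_{i+1} = 0."""
    y = np.zeros((eps + 1, eps + 1))
    for k in range(1, eps + 2):
        y[k - 1, k - 1] = (-1.0) ** (k - 1)
    for i in range(1, eps + 1):
        lam_yi = np.roll(y[i - 1], 1)   # multiplying by lam shifts coefficients up
        lam_yi[0] = 0.0
        if not np.allclose(lam_yi + y[i], 0.0):
            return False
    return True

print(check_Leps_solution(3))  # -> True
```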


On the other hand, if the pencil has the 'triangular' form (13), then the corresponding matrix M_k (k = 0, 1, …, ε) (see (10) and (10') on pp. 29 and 30) can also be brought into triangular form, after a suitable permutation of rows and columns:

M_k = | M_k[L_ε]   M_k[D + λF]   |
      | O          M_k[Â + λB̂] | .   (20)

For k = ε − 1 all the columns of this matrix, like those of M_{ε−1}[L_ε], are linearly independent.¹¹ But M_{ε−1}[L_ε] is a square matrix of order ε(ε + 1). Therefore in M_{ε−1}[Â + λB̂] also all the columns are linearly independent and, as we have explained at the beginning of the section, this means that the equation (Â + λB̂)x̂ = o has no solution of degree less than or equal to ε − 1, which is what we had to prove.

3. Let us replace the pencil (13) by the strictly equivalent pencil

| E₁  Y  | | L_ε  D + λF   | | E₃  −X |     | L_ε  D + λF + Y(Â + λB̂) − L_εX |
| 0   E₂ | | 0    Â + λB̂ | | 0    E₄ |  =  | 0    Â + λB̂                    | ,   (21)

where E₁, E₂, E₃, and E₄ are square unit matrices of orders ε, m − ε, ε + 1, and n − ε − 1, respectively, and X, Y are arbitrary constant rectangular matrices of the appropriate dimensions. Our theorem will be completely proved if we can show that the matrices X and Y can be chosen such that the matrix equation

L_εX = D + λF + Y(Â + λB̂)   (22)

holds.

We introduce a notation for the elements of D, F, X and also for the rows of Y and the columns of Â and B̂:

D = ‖d_{ik}‖,  F = ‖f_{ik}‖,  X = ‖x_{ik}‖;  y₁, y₂, …, y_ε are the rows of Y;

Â = (â₁, â₂, …, â_{n−ε−1}),  B̂ = (b̂₁, b̂₂, …, b̂_{n−ε−1}).

Then the matrix equation (22) can be replaced by a system of scalar equations that expresses the equality of the elements of the k-th column on the right-hand and left-hand sides of (22) (k = 1, 2, …, n − ε − 1):

¹¹ This follows from the fact that the rank of the matrix (20) for k = ε − 1 is equal to εn; a similar equation holds for the rank of the matrix M_{ε−1}[L_ε].


x_{2k} + λx_{1k} = d_{1k} + λf_{1k} + y₁â_k + λy₁b̂_k,
x_{3k} + λx_{2k} = d_{2k} + λf_{2k} + y₂â_k + λy₂b̂_k,
. . . . . . . . . . . . . . . . . . . . . . . . . .        (23)
x_{ε+1,k} + λx_{εk} = d_{εk} + λf_{εk} + y_ε â_k + λy_ε b̂_k

(k = 1, 2, …, n − ε − 1).

The left-hand sides of these equations are linear binomials in λ. The free term of each of the first ε − 1 of these binomials is equal to the coefficient of λ in the next binomial. But then the right-hand sides must also satisfy this condition. Therefore

y₁â_k − y₂b̂_k = f_{2k} − d_{1k},
y₂â_k − y₃b̂_k = f_{3k} − d_{2k},
. . . . . . . . . . . . . . .        (24)
y_{ε−1}â_k − y_ε b̂_k = f_{εk} − d_{ε−1,k}

(k = 1, 2, …, n − ε − 1).

If (24) holds, then the required elements of X can obviously be determinedfrom (23).

It now remains to show that the system of equations (24) for the elements of Y always has a solution for arbitrary d_{ik} and f_{ik} (i = 1, 2, …, ε; k = 1, 2, …, n − ε − 1). Indeed, the matrix formed from the coefficients of the unknown elements of the rows y₁, −y₂, y₃, −y₄, … can be written, after transposition, in the form

| Â  O  .  .  .  O |
| B̂  Â  .  .  .  O |
| .  .  .  .  .  . |
| O  O  .  .  .  B̂ | .

But this is the matrix M_{ε−2} for the pencil of rectangular matrices Â + λB̂ (see (10') on p. 30). The rank of this matrix is (ε − 1)(n − ε − 1), because the equation (Â + λB̂)x̂ = o, by what we have shown, has no solutions of degree less than ε. Thus, the rank of the system of equations (24) is equal to the number of equations, and such a system is consistent (non-contradictory) for arbitrary free terms.

This completes the proof of the theorem.


§ 4. CANONICAL FORM OF SINGULAR PENCIL 35

§ 4. The Canonical Form of a Singular Pencil of Matrices

1. Let A + λB be an arbitrary singular pencil of matrices of dimension m × n. To begin with, we shall assume that neither among the columns nor among the rows of the pencil is there a linear dependence with constant coefficients.

Let r < n, where r is the rank of the pencil, so that the columns of A + λB are linearly dependent. In this case the equation (A + λB)x = o has a non-zero solution of minimal degree ε₁. From the restriction made at the beginning of this section it follows that ε₁ > 0. Therefore by Theorem 4 the given pencil can be transformed into the form

| L_{ε₁}  0         |
| 0       A₁ + λB₁ | ,

where the equation (A₁ + λB₁)x⁽¹⁾ = o has no solution x⁽¹⁾(λ) of degree less than ε₁.

If this equation has a non-zero solution of minimal degree ε₂ (where, necessarily, ε₂ ≥ ε₁), then by applying Theorem 4 to the pencil A₁ + λB₁ we can transform the given pencil into the form

| L_{ε₁}  0       0         |
| 0       L_{ε₂}  0         |
| 0       0       A₂ + λB₂ | .

Continuing this process, we can put the given pencil into the quasi-diagonal form

{L_{ε₁}, L_{ε₂}, …, L_{ε_p}, A_p + λB_p},   (25)

where 0 < ε₁ ≤ ε₂ ≤ ⋯ ≤ ε_p and the equation (A_p + λB_p)x⁽ᵖ⁾ = o has no non-zero solution, so that the columns of A_p + λB_p are linearly independent.¹²

If the rows of A_p + λB_p are linearly dependent, then the transposed pencil A_pᵀ + λB_pᵀ can be put into the form (25), where instead of ε₁, ε₂, …, ε_p there occur the numbers (0 <) η₁ ≤ η₂ ≤ ⋯ ≤ η_q.¹³ But then the given pencil A + λB turns out to be transformable into the quasi-diagonal form

¹² In the special case where ε₁ + ε₂ + ⋯ + ε_p = m the block A_p + λB_p is absent.

¹³ Since no linear dependence with constant coefficients exists among the rows of the pencil A + λB, and consequently of A_p + λB_p, we have η₁ > 0.


{L_{ε₁}, …, L_{ε_p}, Lᵀ_{η₁}, …, Lᵀ_{η_q}, A₀ + λB₀}   (26)

(0 < ε₁ ≤ ε₂ ≤ ⋯ ≤ ε_p;  0 < η₁ ≤ η₂ ≤ ⋯ ≤ η_q),

where both the columns and the rows of A₀ + λB₀ are linearly independent, i.e., A₀ + λB₀ is a regular pencil.¹⁴

2. We now consider the general case, where the rows and the columns of the given pencil may be connected by linear relations with constant coefficients. We denote the maximal number of constant independent solutions of the equations

(A + λB)x = o  and  (Aᵀ + λBᵀ)y = o

by g and h, respectively. Instead of the first of these equations we consider, just as in the proof of Theorem 4, the corresponding vector equation (A + λB)x = o (A and B are operators mapping R_n into R_m). We denote linearly independent constant solutions of this equation by e₁, e₂, …, e_g and take them as the first g basis vectors in R_n. Then the first g columns of the corresponding matrix A + λB consist of zeros:

A + λB = (O, A₁ + λB₁).   (27)

Similarly, the first h rows of the pencil A₁ + λB₁ can be made into zeros. The given pencil then assumes the form

| O  O          |
| O  A⁰ + λB⁰ | ,   (28)

¹⁴ If in the given pencil r = n, i.e., if the columns of the pencil are linearly independent, then the first p diagonal blocks in (26) of the form L_ε are absent (p = 0). In the same way, if r = m, i.e., if the rows of A + λB are linearly independent, then in (26) the diagonal blocks of the form Lᵀ_η are absent (q = 0).


§ 5. MINIMAL INDICES. CRITERION FOR STRONG EQUIVALENCE 37

where there is no longer any linear dependence with constant coefficients among the rows or the columns of the pencil A⁰ + λB⁰. The pencil A⁰ + λB⁰ can now be represented in the form (26). Thus, in the general case, the pencil A + λB can always be put into the canonical quasi-diagonal form

{0 (an h × g zero block), L_{ε_{g+1}}, …, L_{ε_p}, Lᵀ_{η_{h+1}}, …, Lᵀ_{η_q}, A₀ + λB₀}.   (29)

The choice of indices for ε and η is due to the fact that it is convenient here to take ε₁ = ε₂ = ⋯ = ε_g = 0 and η₁ = η₂ = ⋯ = η_h = 0.

When we replace the regular pencil A₀ + λB₀ in (29) by its canonical form (6) (see § 2, p. 28), we finally obtain the following quasi-diagonal matrix

{0 (an h × g zero block); L_{ε_{g+1}}, …, L_{ε_p}; Lᵀ_{η_{h+1}}, …, Lᵀ_{η_q}; N^{(u₁)}, …, N^{(u_s)}; J + λE},   (30)

where the matrix J is of Jordan normal form or of natural normal form and N^{(u)} = E^{(u)} + λH^{(u)}.

The matrix (30) is the canonical form of the pencil A + λB in the most general case.

In order to determine the canonical form (30) of a given pencil immediately, without carrying out the successive reduction processes, we shall, following Kronecker, introduce in the next section the concept of the minimal indices of a pencil.

§ 5. The Minimal Indices of a Pencil. Criterion for Strong Equivalence of Pencils

1. Let A + λB be an arbitrary singular pencil of rectangular matrices. Then the k polynomial columns x₁(λ), x₂(λ), …, x_k(λ) that are solutions of the equation

(A + λB)x = o   (31)

are linearly dependent if the rank of the polynomial matrix X = [x₁(λ), x₂(λ), …, x_k(λ)] formed from these columns is less than k. In that case there exist k polynomials p₁(λ), p₂(λ), …, p_k(λ), not all identically zero, such that

p₁(λ)x₁(λ) + p₂(λ)x₂(λ) + ⋯ + p_k(λ)x_k(λ) ≡ 0.

But if the rank of X is k, then such a dependence does not exist and the solutions x₁(λ), x₂(λ), …, x_k(λ) are linearly independent.
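This rank can be computed without symbolic algebra: the rank of a polynomial matrix equals its rank at all but finitely many values of λ, so evaluating at a few random points and taking the maximum recovers it with high probability. The sketch below (my own illustration; `poly_matrix_rank` is an assumed helper name) represents each column entry by its coefficient vector.

```python
import numpy as np

def poly_matrix_rank(cols, trials=5, seed=0):
    """Rank of a polynomial matrix X(lam) whose columns are given as lists of
    coefficient vectors (lowest degree first).  The rank equals the rank at
    all but finitely many lam, so take the maximum over random points."""
    rng = np.random.default_rng(seed)
    best = 0
    for _ in range(trials):
        lam = rng.standard_normal()
        M = np.array([[np.polynomial.polynomial.polyval(lam, c) for c in col]
                      for col in cols]).T
        best = max(best, np.linalg.matrix_rank(M))
    return best

# x1(lam) = (1, -lam) and x2(lam) = lam * x1(lam): dependent, so rank 1.
x1 = [np.array([1.0]), np.array([0.0, -1.0])]
x2 = [np.array([0.0, 1.0]), np.array([0.0, 0.0, -1.0])]
print(poly_matrix_rank([x1, x2]))  # -> 1
```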


Among all the solutions of (31) we choose a non-zero solution x₁(λ) of least degree ε₁. Among all the solutions of the same equation that are linearly independent of x₁(λ) we take a solution x₂(λ) of least degree ε₂. Obviously, ε₁ ≤ ε₂. We continue the process, choosing among the solutions that are linearly independent of x₁(λ) and x₂(λ) a solution x₃(λ) of minimal degree ε₃, etc. Since the number of linearly independent solutions of (31) is always at most n, the process must come to an end. We obtain a fundamental series of solutions of (31)

x₁(λ), x₂(λ), …, x_p(λ)   (32)

having the degrees

ε₁ ≤ ε₂ ≤ ⋯ ≤ ε_p.   (33)

In general, a fundamental series of solutions is not uniquely determined (to within scalar factors) by the pencil A + λB. However, two distinct fundamental series of solutions always have one and the same series of degrees ε₁, ε₂, …, ε_p. For let us consider, in addition to (32), another fundamental series of solutions x̃₁(λ), x̃₂(λ), … with the degrees ε̃₁, ε̃₂, …. Suppose that in (33)

ε₁ = ⋯ = ε_{n₁} < ε_{n₁+1} = ⋯ = ε_{n₂} < ε_{n₂+1} ≤ ⋯,

and similarly, in the series ε̃₁, ε̃₂, …,

ε̃₁ = ⋯ = ε̃_{ñ₁} < ε̃_{ñ₁+1} = ⋯ = ε̃_{ñ₂} < ε̃_{ñ₂+1} ≤ ⋯.

Obviously, ε₁ = ε̃₁. Every column x̃ᵢ(λ) (i = 1, 2, …, ñ₁) is a linear combination of the columns x₁(λ), x₂(λ), …, x_{n₁}(λ), since otherwise the solution x_{n₁+1}(λ) in (32) could be replaced by x̃ᵢ(λ), which is of smaller degree. It is obvious that, conversely, every column xᵢ(λ) (i = 1, 2, …, n₁) is a linear combination of the columns x̃₁(λ), x̃₂(λ), …, x̃_{ñ₁}(λ). Therefore n₁ = ñ₁ and ε_{n₁+1} = ε̃_{n₁+1}. Now by a similar argument we obtain that n₂ = ñ₂ and ε_{n₂+1} = ε̃_{n₂+1}, etc.

2. Every solution x_k(λ) of the fundamental series (32) yields a linear dependence of degree ε_k among the columns of A + λB (k = 1, 2, …, p). Therefore the numbers ε₁, ε₂, …, ε_p are called the minimal indices for the columns of the pencil A + λB.

The minimal indices η₁, η₂, …, η_q for the rows of the pencil A + λB are introduced similarly. Here the equation (A + λB)x = o is replaced by (Aᵀ + λBᵀ)y = o, and η₁, η₂, …, η_q are defined as minimal indices for the columns of the transposed pencil Aᵀ + λBᵀ.


Strictly equivalent pencils have the same minimal indices. For let A + λB and P(A + λB)Q be two such pencils (P and Q are non-singular square matrices). Then the equation (31) for the first pencil can be written, after multiplication on the left by P, as follows:

P(A + λB)Q · Q⁻¹x = o.

Hence it is clear that all the solutions of (31), after multiplication on the left by Q⁻¹, give rise to a complete system of solutions of the equation

P(A + λB)Q z = o.

Therefore the pencils A + λB and P(A + λB)Q have the same minimal indices for the columns. That the minimal indices for the rows also coincide can be established by going over to the transposed pencils.

Let us compute the minimal indices for the canonical quasi-diagonal matrix

{0 (an h × g zero block), L_{ε_{g+1}}, …, L_{ε_p}, Lᵀ_{η_{h+1}}, …, Lᵀ_{η_q}, A₀ + λB₀}   (34)

(A₀ + λB₀ is a regular pencil having the normal form (6)).

We note first of all that: The complete system of minimal indices for the columns (rows) of a quasi-diagonal matrix is obtained as the union of the corresponding systems of minimal indices of the individual diagonal blocks. The matrix L_ε has only one index ε for the columns, and its rows are linearly independent. Similarly, the matrix Lᵀ_η has only one index η for the rows, and its columns are linearly independent. Therefore the matrix (34) has as its minimal indices for the columns

ε₁ = ⋯ = ε_g = 0, ε_{g+1}, …, ε_p

and for the rows

η₁ = ⋯ = η_h = 0, η_{h+1}, …, η_q.

We note further that L_ε has no elementary divisors, since among its minors of maximal order ε there is one equal to 1 and one equal to λ^ε. The same statement is, of course, true for the transposed matrix Lᵀ_η. Since the elementary divisors of a quasi-diagonal matrix are obtained by combining those of the individual diagonal blocks (see Vol. I, Chapter VI, p. 141), the elementary divisors of the λ-matrix (34) coincide with those of its regular 'kernel' A₀ + λB₀.

The canonical form (34) is completely determined by the minimal indices ε₁, …, ε_p, η₁, …, η_q and the elementary divisors of the pencil (34) or, what is the same, of the strictly equivalent pencil A + λB. Since


two pencils having one and the same canonical form are strictly equivalent, we have proved the following theorem:

THEOREM 5 (Kronecker): Two arbitrary pencils A + λB and A₁ + λB₁ of rectangular matrices of the same dimension m × n are strictly equivalent if and only if they have the same minimal indices and the same (finite and infinite) elementary divisors.

In conclusion, we write down, for purposes of illustration, the canonical form of a pencil A + λB with the minimal indices ε₁ = 0, ε₂ = 1, ε₃ = 2, η₁ = 0, η₂ = 0, η₃ = 2 and the elementary divisors λ², (λ + 2)², μ³:¹⁵

the canonical matrix is quasi-diagonal with the diagonal blocks

0 (the 2 × 1 zero block);    L₁ = | λ  1 | ;

L₂ = | λ  1  0 |
     | 0  λ  1 | ;

Lᵀ₂ = | λ  0 |
      | 1  λ |
      | 0  1 | ;

N⁽³⁾ = | 1  λ  0 |
       | 0  1  λ |
       | 0  0  1 | ;

| λ  1 |
| 0  λ | ;

| λ + 2  1     |
| 0      λ + 2 | .        (35)

§ 6. Singular Pencils of Quadratic Forms

1. Suppose given two complex quadratic forms:

A(x, x) = Σ_{i,k=1}^{n} a_{ik} x_i x_k,   B(x, x) = Σ_{i,k=1}^{n} b_{ik} x_i x_k;   (36)

they generate a pencil of quadratic forms A(x, x) + λB(x, x). This pencil of forms corresponds to a pencil of symmetric matrices A + λB (Aᵀ = A, Bᵀ = B). If we subject the variables in the pencil of forms A(x, x) + λB(x, x) to a non-singular linear transformation x = Tz (|T| ≠ 0), then the transformed pencil of forms Ã(z, z) + λB̃(z, z) corresponds to the pencil of matrices


15 All the elements of the matrix that are not mentioned expressly are zero.


§ 6. SINGULAR PENCILS OF QUADRATIC FORMS 41

Ã + λB̃ = Tᵀ(A + λB)T;   (37)

here T is a constant (i.e., independent of λ) non-singular square matrix of order n.

Two pencils of matrices A + λB and Ã + λB̃ that are connected by a relation (37) are called congruent (see Definition 1 of Chapter X; Vol. I, p. 296).

Obviously, congruence is a special case of equivalence of pencils of matrices. However, when congruence of two pencils of symmetric (or skew-symmetric) matrices is under consideration, the concept of congruence coincides with that of strict equivalence. This is the content of the following theorem.

THEOREM 6: Two strictly equivalent pencils of complex symmetric (orskew-symmetric) matrices are always congruent.

Proof. Let 𝔄 = A + λB and 𝔄̃ = Ã + λB̃ be two strictly equivalent pencils of symmetric (skew-symmetric) matrices:

𝔄̃ = P𝔄Q   (𝔄ᵀ = ±𝔄, 𝔄̃ᵀ = ±𝔄̃; |P| ≠ 0, |Q| ≠ 0).   (38)

By going over to the transposed matrices we obtain:

𝔄̃ = Qᵀ𝔄Pᵀ.   (39)

From (38) and (39) we have

𝔄QPᵀ⁻¹ = P⁻¹Qᵀ𝔄.   (40)

Setting

U = QPᵀ⁻¹,   (41)

we rewrite (40) as follows:

𝔄U = Uᵀ𝔄.   (42)

From (42) it follows easily that

𝔄Uᵏ = (Uᵀ)ᵏ𝔄   (k = 0, 1, 2, …)

and, in general,

𝔄S = Sᵀ𝔄,   (43)

where

S = f(U)   (44)

and f(λ) is an arbitrary polynomial in λ. Let us assume that this polynomial is chosen such that |S| ≠ 0. Then we have from (43):

𝔄 = Sᵀ𝔄S⁻¹.   (45)


Substituting this expression for 𝔄 in (38), we have:

𝔄̃ = PSᵀ𝔄S⁻¹Q.   (46)

If this relation is to be a congruence transformation, the following equation must be satisfied:

(PSᵀ)ᵀ = S⁻¹Q,

which can be rewritten as

S² = QPᵀ⁻¹ = U.

Now the matrix S = f(U) satisfies this equation if we take as f(λ) the interpolation polynomial for √λ on the spectrum of U. This can be done, because the many-valued function √λ has a single-valued branch determined on the spectrum of U, since |U| ≠ 0.

The equation (46) now becomes the condition for congruence

𝔄̃ = Tᵀ𝔄T   (T = S⁻¹Q, S = √(QPᵀ⁻¹)).   (47)
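The key step of the proof — that S = f(U), with f equal to a square root on the spectrum of U, again satisfies (43) and S² = U — can be checked numerically in the diagonalizable case. The construction below is my own illustration, not the book's: it builds a U satisfying (42) from two symmetric positive definite matrices (so that U has positive eigenvalues) and takes the square root through an eigendecomposition.

```python
import numpy as np

def spd(rng, n):
    """A well-conditioned symmetric positive definite matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
n = 4
A = spd(rng, n)                       # symmetric, plays the role of the pencil
W = spd(rng, n)
U = np.linalg.solve(A, W)             # A @ U = W is symmetric, i.e. A U = U^T A

# S = f(U) with f = square root on the spectrum of U; here U is
# diagonalizable with positive eigenvalues, so eigendecomposition suffices.
w, V = np.linalg.eig(U)
S = ((V * np.sqrt(w.astype(complex))) @ np.linalg.inv(V)).real

print(np.allclose(S @ S, U))          # S^2 = U
print(np.allclose(A @ S, (A @ S).T))  # A S = S^T A, the relation (43)
```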

From this theorem and Theorem 5 we deduce :

COROLLARY: Two pencils of quadratic forms

A(x, x) + λB(x, x)  and  Ã(z, z) + λB̃(z, z)

can be carried into one another by a transformation x = Tz (|T| ≠ 0) if and only if the pencils of symmetric matrices A + λB and Ã + λB̃ have the same elementary divisors (finite and infinite) and the same minimal indices.

Note. For pencils of symmetric matrices the rows and columns have the same minimal indices:

p = q;  ε₁ = η₁, …, ε_p = η_p.   (48)

2. Let us raise the following question: Given two arbitrary complex quadratic forms

A(x, x) = Σ_{i,k=1}^{n} a_{ik} x_i x_k,   B(x, x) = Σ_{i,k=1}^{n} b_{ik} x_i x_k.

Under what conditions can the two forms be reduced simultaneously to sums of squares

Σ_{i=1}^{n} a_i z_i²  and  Σ_{i=1}^{n} b_i z_i²   (49)

by a non-singular transformation of the variables x = Tz (|T| ≠ 0)?


Let us assume that the quadratic forms A(x, x) and B(x, x) have this property. Then the pencil of matrices A + λB is congruent to the pencil of diagonal matrices

{a₁ + λb₁, a₂ + λb₂, …, a_n + λb_n}.   (50)

Suppose that among the diagonal binomials a_i + λb_i there are precisely r (r ≤ n) that are not identically zero. Without loss of generality we can assume that

a_i + λb_i ≢ 0   (i = n − r + 1, …, n).   (51)

Setting

A₀ + λB₀ = {a_{n−r+1} + λb_{n−r+1}, …, a_n + λb_n},   (52)

we represent the matrix (50) in the form

{0, A₀ + λB₀}.   (53)

Comparing (53) with (34) (p. 39), we see that in this case all the minimal indices are zero. Moreover, all the elementary divisors are linear. Thus we have obtained the following theorem:

THEOREM 7: Two quadratic forms A(x, x) and B(x, x) can be reduced simultaneously to sums of squares (49) by a transformation of the variables if and only if in the pencil of matrices A + λB all the elementary divisors (finite and infinite) are linear and all the minimal indices are zero.
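One familiar special case in which the hypotheses of Theorem 7 hold is that of B positive definite: the pencil is then regular, all minimal indices vanish, and all elementary divisors are linear, so a simultaneous reduction to sums of squares must exist. A standard construction of the reducing T via a Cholesky factor of B is sketched below (the concrete matrices are illustrative, not taken from the book).

```python
import numpy as np

# Illustrative data: A symmetric, B symmetric positive definite.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0, 0.0], [0.0, 1.0]])

L = np.linalg.cholesky(B)             # B = L L^T
M = np.linalg.inv(L) @ A @ np.linalg.inv(L).T
w, Q = np.linalg.eigh(M)              # M symmetric: M = Q diag(w) Q^T
T = np.linalg.inv(L).T @ Q            # the change of variables x = T z

# T^T B T = E and T^T A T = diag(w): both forms become sums of squares.
print(np.round(T.T @ B @ T, 8))
print(np.round(T.T @ A @ T, 8))
```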

In order to reduce two quadratic forms A(x, x) and B(x, x) simultaneously to some canonical form in the general case, we have to replace the pencil of matrices A + λB by a strictly equivalent 'canonical' pencil of symmetric matrices.

Suppose the pencil of symmetric matrices A + λB has the minimal indices ε₁ = ⋯ = ε_g = 0, ε_{g+1} ≠ 0, …, ε_p ≠ 0, the infinite elementary divisors μ^{u₁}, μ^{u₂}, …, μ^{u_s}, and the finite ones (λ + λ₁)^{c₁}, (λ + λ₂)^{c₂}, …, (λ + λ_t)^{c_t}. Then, in the canonical form (30), g = h, p = q, and ε_{g+1} = η_{h+1}, …, ε_p = η_q. We replace in (30) every two diagonal blocks of the form L_ε and Lᵀ_ε by a single diagonal block

| O     L_ε |
| Lᵀ_ε  O   |

and each block of the form N^{(u)} = E^{(u)} + λH^{(u)} by the


strictly equivalent symmetric block

Ñ^{(u)} = V^{(u)}N^{(u)} =
| 0  0  .  .  .  0  1 |
| 0  0  .  .  .  1  λ |
| .  .  .  .  .  .  . |
| 1  λ  .  .  .  0  0 | ,

where

V^{(u)} =
| 0  0  .  .  .  0  1 |
| 0  0  .  .  .  1  0 |        (54)
| .  .  .  .  .  .  . |
| 1  0  .  .  .  0  0 | .

Moreover, instead of the regular diagonal block J + λE in (30) (J is a Jordan matrix)

J + λE = {(λ + λ₁)E^{(c₁)} + H^{(c₁)}, …, (λ + λ_t)E^{(c_t)} + H^{(c_t)}},

we take the strictly equivalent block

{Z^{(c₁)}_{λ₁}, …, Z^{(c_t)}_{λ_t}},   (55)

where

Z^{(cᵢ)}_{λᵢ} = V^{(cᵢ)}[(λ + λᵢ)E^{(cᵢ)} + H^{(cᵢ)}] =
| 0       .  .  .  0       λ + λᵢ |
| 0       .  .  .  λ + λᵢ  1      |
| .       .  .  .  .       .      |        (i = 1, 2, …, t).   (56)
| λ + λᵢ  1  .  .  0       0      |

The pencil A + λB is strictly equivalent to the symmetric pencil

Ã + λB̃ = {0;  | O             L_{ε_{g+1}} |         | O          L_{ε_p} |
              | Lᵀ_{ε_{g+1}}  O           | ,  …,  | Lᵀ_{ε_p}   O       | ;
Ñ^{(u₁)}, …, Ñ^{(u_s)};  Z^{(c₁)}_{λ₁}, …, Z^{(c_t)}_{λ_t}}.   (57)

Two quadratic forms with complex coefficients A(x, x) and B(x, x) can be simultaneously reduced to the canonical forms Ã(z, z) and B̃(z, z) defined by (57) by a transformation of the variables x = Tz (|T| ≠ 0).¹⁶

¹⁶ In the Russian edition the author stated that propositions analogous to Theorems 6 and 7 hold for hermitian forms. A. I. Mal'cev has pointed out to the author that this is not the case. As regards singular pencils of hermitian forms, see [197].


§ 7. APPLICATION TO DIFFERENTIAL EQUATIONS 45

§ 7. Application to Differential Equations

1. The results obtained will now be applied to a system of m linear differential equations of the first order in n unknown functions with constant coefficients:¹⁸

Σ_{k=1}^{n} a_{ik} x_k + Σ_{k=1}^{n} b_{ik} dx_k/dt = f_i(t)   (i = 1, 2, …, m),   (58)

or in matrix notation:

Ax + B dx/dt = f(t);   (59)

here¹⁹

A = ‖a_{ik}‖,  B = ‖b_{ik}‖;  x = (x₁, x₂, …, x_n),  f = (f₁, f₂, …, f_m).

We introduce new unknown functions z₁, z₂, …, z_n that are connected with the old x₁, x₂, …, x_n by a linear non-singular transformation with constant coefficients:

x = Qz   (|Q| ≠ 0).   (60)

Moreover, instead of the equations (58) we can take m arbitrary independent combinations of these equations, which is equivalent to multiplying the matrices A, B, f on the left by a square non-singular matrix P of order m. Substituting Qz for x in (59) and multiplying (59) on the left by P, we obtain:

Ãz + B̃ dz/dt = f̃(t),   (61)

where

à = PAQ,  B̃ = PBQ,  f̃ = Pf = (f̃₁, f̃₂, …, f̃_m).   (62)

The matrix pencils A + λB and à + λB̃ are strictly equivalent:

à + λB̃ = P(A + λB)Q.   (63)

We choose the matrices P and Q such that the pencil Ã + λB̃ has the canonical quasi-diagonal form

¹⁸ The particular case where m = n and the system (58) is solved with respect to the derivatives has been treated in detail in Vol. I, Chapter V, § 5. It is well known that a system of linear differential equations with constant coefficients of arbitrary order s can be reduced to the form (58) if all the derivatives of the unknown functions up to and including the order s − 1 are included as additional unknown functions.

¹⁹ We recall that parentheses denote column matrices. Thus, x = (x₁, x₂, …, x_n) is the column with the elements x₁, x₂, …, x_n.


Ã + λB̃ = {0, L_{ε_{g+1}}, …, L_{ε_p}, Lᵀ_{η_{h+1}}, …, Lᵀ_{η_q}, N^{(u₁)}, …, N^{(u_s)}, J + λE}.   (64)

In accordance with the diagonal blocks in (64) the system of differential equations splits into ν = (p − g) + (q − h) + s + 2 separate systems of the form

0 · z⁽⁰⁾ = f̃⁽⁰⁾,   (65)

L_{ε_{g+i}}(d/dt) z⁽ⁱ⁾ = f̃⁽ⁱ⁾   (i = 1, 2, …, p − g),   (66)

Lᵀ_{η_{h+j}}(d/dt) z^{(p−g+j)} = f̃^{(p−g+j)}   (j = 1, 2, …, q − h),   (67)

N^{(u_k)}(d/dt) z^{(p−g+q−h+k)} = f̃^{(p−g+q−h+k)}   (k = 1, 2, …, s),   (68)

(J + d/dt) z^{(ν−1)} = f̃^{(ν−1)},   (69)

where

z⁽⁰⁾ = (z₁, …, z_g),  f̃⁽⁰⁾ = (f̃₁, …, f̃_h),  z⁽¹⁾ = (z_{g+1}, …),  f̃⁽¹⁾ = (f̃_{h+1}, …),  etc.,   (71)

and, for any pencil,

A(d/dt) = A + B d/dt   if   A(λ) = A + λB.   (72)

Thus, the integration of the system (59) in the most general case is reduced to the integration of the special systems (65)-(69) of the same type. In these systems the matrix pencil A + λB has the form 0, L_ε, Lᵀ_η, N^{(u)}, and J + λE, respectively.

1) The system (65) is not inconsistent if and only if

f̃₁ = 0, …, f̃_h = 0.   (73)


In that case we can take arbitrary functions of t as the unknown functions z₁, z₂, …, z_g that form the column z⁽⁰⁾.

2) The system (66) is of the form

L_ε(d/dt) z = f̃   (74)

or, more explicitly,²⁰

dz₁/dt + z₂ = f̃₁(t),  dz₂/dt + z₃ = f̃₂(t),  …,  dz_ε/dt + z_{ε+1} = f̃_ε(t).   (75)

Such a system is always consistent. If we take for z_{ε+1}(t) an arbitrary function of t, then all the remaining unknown functions z_ε, z_{ε−1}, …, z₁ can be determined from (75) by successive quadratures.
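For polynomial right-hand sides the successive quadratures of (75) can be sketched explicitly; `solve_L_block` is my own helper name. It takes the arbitrarily chosen z_{ε+1} and integrates upward, setting all constants of integration to zero.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def solve_L_block(f, z_free):
    """Solve L_eps(d/dt) z = f, eq. (75), for polynomial data.  f is a list
    of eps coefficient vectors (lowest degree first); z_free is the freely
    chosen z_{eps+1}.  Each remaining z_k costs one quadrature, with the
    constant of integration taken as zero."""
    eps = len(f)
    z = [None] * (eps + 1)
    z[eps] = z_free
    for k in range(eps - 1, -1, -1):
        # dz_k/dt + z_{k+1} = f_k  =>  z_k = integral of (f_k - z_{k+1}) dt
        z[k] = P.polyint(P.polysub(f[k], z[k + 1]))
    return z

# eps = 2 with f1 = t, f2 = 1 and the free choice z3 = 0.
f = [np.array([0.0, 1.0]), np.array([1.0])]
z = solve_L_block(f, np.array([0.0]))
print([list(c) for c in z])
```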

3) The system (67) is of the form

Lᵀ_η(d/dt) z = f̃   (76)

or, more explicitly,²¹

dz₁/dt = f̃₁(t),  dz₂/dt + z₁ = f̃₂(t),  …,  dz_η/dt + z_{η−1} = f̃_η(t),  z_η = f̃_{η+1}(t).   (77)

From all the equations (77) except the first we determine z_η, z_{η−1}, …, z₁ uniquely:

z_η = f̃_{η+1},
z_{η−1} = f̃_η − df̃_{η+1}/dt,
. . . . . . . . . . . . . . .        (78)
z₁ = f̃₂ − df̃₃/dt + ⋯ + (−1)^{η−1} d^{η−1}f̃_{η+1}/dt^{η−1}.

Substituting this expression for z₁ into the first equation, we obtain the condition for consistency:

f̃₁ − df̃₂/dt + d²f̃₃/dt² − ⋯ + (−1)^η d^η f̃_{η+1}/dt^η = 0.   (79)

²⁰ We have changed the indices of z and f̃ to simplify the notation. In order to return from (75) to (66) we have to replace ε by ε_{g+i} and add to each index of z the number g + ε_{g+1} + ⋯ + ε_{g+i−1} + i − 1, and to each index of f̃ the number h + ε_{g+1} + ⋯ + ε_{g+i−1}.

²¹ Here, as in the preceding case, we have changed the notation. See the preceding footnote.


4) The system (68) is of the form

N^{(u)}(d/dt) z = f̃   (80)

or, more explicitly,

z₁ + dz₂/dt = f̃₁,  z₂ + dz₃/dt = f̃₂,  …,  z_{u−1} + dz_u/dt = f̃_{u−1},  z_u = f̃_u.   (81)

Hence we determine successively the unique solutions

z_u = f̃_u,
z_{u−1} = f̃_{u−1} − df̃_u/dt,
. . . . . . . . . . . . . .        (82)
z₁ = f̃₁ − df̃₂/dt + ⋯ + (−1)^{u−1} d^{u−1}f̃_u/dt^{u−1}.
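Again for polynomial right-hand sides, the successive differentiations of (82) can be carried out directly on coefficient vectors; `solve_nilpotent_block` is an assumed helper name of my own.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def solve_nilpotent_block(f):
    """Solve N^(u)(d/dt) z = f, eqs. (81)-(82), for polynomial data:
    z_u = f_u, then z_k = f_k - d z_{k+1}/dt going downward."""
    u = len(f)
    z = [None] * u
    z[u - 1] = f[u - 1]
    for k in range(u - 2, -1, -1):
        z[k] = P.polysub(f[k], P.polyder(z[k + 1]))
    return z

# u = 3 with f1 = t**2, f2 = t, f3 = 1.
f = [np.array([0.0, 0.0, 1.0]), np.array([0.0, 1.0]), np.array([1.0])]
z = solve_nilpotent_block(f)
print([list(c) for c in z])  # z1 = t**2 - 1, z2 = t, z3 = 1
```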

5) The system (69) is of the form

dz/dt + Jz = f̃(t).   (83)

As we have proved in Vol. I, Chapter V, § 5, the general solution of such a system has the form

z = e^{−Jt} z₀ + ∫₀ᵗ e^{−J(t−τ)} f̃(τ) dτ;   (84)

here z₀ is a column matrix with arbitrary elements (the initial values of the unknown functions for t = 0).

The inverse transition from the system (61) to (59) is effected by the formulas (60) and (62), according to which each of the functions x₁, …, x_n is a linear combination of the functions z₁, …, z_n, and each of the functions f̃₁(t), …, f̃_m(t) is expressed linearly (with constant coefficients) in terms of the functions f₁(t), …, f_m(t).

2. The preceding analysis shows that: In general, for the consistency of the system (58) certain well-defined linear dependence relations (with constant coefficients) must hold among the right-hand sides of the equations and the derivatives of these right-hand sides.

If these relations are satisfied, then the general solution of the system contains both arbitrary constants and arbitrary functions linearly.

§ 7. APPLICATION TO DIFFERENTIAL EQUATIONS

The character of the consistency conditions and the character of the solutions (in particular, the number of arbitrary constants and arbitrary functions) are determined by the minimal indices and the elementary divisors of the pencil A + λB, because the canonical form (65)-(69) of the system of differential equations depends on these minimal indices and elementary divisors.


CHAPTER XIII

MATRICES WITH NON-NEGATIVE ELEMENTS

In this chapter we shall study properties of real matrices with non-negative elements. Such matrices have important applications in the theory of probability, where they are used for the investigation of Markov chains ('stochastic matrices,' see [46]), and in the theory of small oscillations of elastic systems ('oscillation matrices,' see [17]).

§ 1. General Properties

1. We begin with some definitions.

DEFINITION 1: A rectangular matrix A with real elements

    A = ‖a_ik‖    (i = 1, 2, ..., m;  k = 1, 2, ..., n)

is called non-negative (notation: A ≥ 0) or positive (notation: A > 0) if all the elements of A are non-negative (a_ik ≥ 0) or positive (a_ik > 0).

DEFINITION 2: A square matrix A = ‖a_ik‖_1^n is called reducible if the index set 1, 2, ..., n can be split into two complementary sets (without common indices) i_1, i_2, ..., i_μ; k_1, k_2, ..., k_ν (μ + ν = n) such that

    a_{i_α k_β} = 0    (α = 1, 2, ..., μ;  β = 1, 2, ..., ν).

Otherwise the matrix is called irreducible.

By a permutation of a square matrix A = ‖a_ik‖_1^n we mean a permutation of the rows of A combined with the same permutation of the columns.

The definition of a reducible matrix and an irreducible matrix can also be formulated as follows:

DEFINITION 2': A matrix A = ‖a_ik‖_1^n is called reducible if there is a permutation that puts it into the form

    A = ‖ B  0 ‖
        ‖ C  D ‖ ,

where B and D are square matrices. Otherwise A is called irreducible.


Suppose that A = ‖a_ik‖_1^n corresponds to a linear operator A in an n-dimensional vector space R with the basis e_1, e_2, ..., e_n. To a permutation of A there corresponds a renumbering of the basis vectors, i.e., a transition from the basis e_1, e_2, ..., e_n to a new basis e'_1 = e_{j_1}, e'_2 = e_{j_2}, ..., e'_n = e_{j_n}, where (j_1, j_2, ..., j_n) is a permutation of the indices 1, 2, ..., n. The matrix A then goes over into a similar matrix Ã = T^{−1}AT. (Each row and each column of the transforming matrix T contains a single element 1, and the remaining elements are zero.)

2. By a ν-dimensional coordinate subspace of R we mean a subspace of R with a basis e_{k_1}, e_{k_2}, ..., e_{k_ν} (1 ≤ k_1 < k_2 < ... < k_ν ≤ n). There are (n choose ν) ν-dimensional coordinate subspaces of R connected with a given basis e_1, e_2, ..., e_n. The definition of a reducible matrix can also be given in the following form:

DEFINITION 2": A matrix A = ‖a_ik‖_1^n is called reducible if and only if the corresponding operator A has a ν-dimensional invariant coordinate subspace with ν < n.
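Definition 2 has a standard graph-theoretic equivalent (not stated in this section, but convenient for computation): A is irreducible exactly when the directed graph with an edge i → k for every a_ik ≠ 0 is strongly connected. A minimal sketch:

```python
# Reducibility test via reachability: A is irreducible iff the directed graph
# with an edge i -> k whenever a_ik != 0 is strongly connected.  (A standard
# equivalent of Definition 2; the matrices below are illustrative.)
import numpy as np

def is_irreducible(A):
    n = A.shape[0]
    reach = (A != 0) | np.eye(n, dtype=bool)
    # Boolean "matrix squaring": after ceil(log2 n) squarings, reach[i, k]
    # says whether k is reachable from i along directed edges.
    for _ in range(max(1, int(np.ceil(np.log2(n))))):
        reach = (reach.astype(int) @ reach.astype(int)) > 0
    return bool(reach.all())

# A cyclic permutation matrix is irreducible ...
P = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]])
# ... while a matrix already in the form of Definition 2' is reducible.
T = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 1, 1]])
assert is_irreducible(P) and not is_irreducible(T)
```

The block-triangular matrix T exhibits the invariant coordinate subspace of Definition 2": the first basis vector spans a 1-dimensional invariant coordinate subspace.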

We shall now prove the following lemma :

LEMMA 1: If A ≥ 0 is an irreducible matrix of order n, then

    (E + A)^{n−1} > 0.    (1)

Proof. For the proof of the lemma it is sufficient to show that for every vector¹ (i.e., column) y ≥ o (y ≠ o) the inequality

    (E + A)^{n−1} y > o

holds. This inequality will be established if we can only show that under the conditions y ≥ o and y ≠ o the vector z = (E + A)y always has fewer zero coordinates than y does. Let us assume the contrary. Then y and z have the same zero coordinates.² Without loss of generality we may assume that the columns y and z have the form³

1 Here and throughout this chapter we mean by a vector a column of n numbers. In this way we identify, as it were, a vector with the column of its coordinates in that basis in which the given matrix A = ‖a_ik‖_1^n determines a certain linear operator.

2 Here we start from the fact that z = y + Ay and a_ik ≥ 0; therefore to positive coordinates of y there correspond positive coordinates of z.

3 The columns y and z can be brought into this form by means of a suitable renumbering of the coordinates (the same for y and z).


    y = (u; o),  z = (v; o)    (u > o, v > o),

where the columns u and v are of the same dimension. Setting

    A = ‖ A_11  A_12 ‖
        ‖ A_21  A_22 ‖ ,

we have z = y + Ay, i.e.,

    (v; o) = (u; o) + (A_11 u; A_21 u),

and hence

    A_21 u = o.

Since u > o, it follows that

    A_21 = 0.

This equation contradicts the irreducibility of A. Thus the lemma is proved.

We introduce the powers of A:

    A^q = ‖ a_ik^(q) ‖_1^n    (q = 1, 2, ...).

Then the lemma has the following corollary:

COROLLARY: If A ≥ 0 is an irreducible matrix, then for every index pair i, k (1 ≤ i, k ≤ n) there exists a positive integer q such that

    a_ik^(q) > 0.    (2)

Moreover, q can always be chosen within the bounds

    q ≤ m − 1  if  i ≠ k,    q ≤ m  if  i = k,    (3)

where m is the degree of the minimal polynomial ψ(λ) of A.

For let r(λ) denote the remainder on dividing (λ + 1)^{n−1} by ψ(λ). Then by (1) we have r(A) > 0. Since the degree of r(λ) is less than m, it follows from this inequality that for arbitrary i, k (1 ≤ i, k ≤ n) at least one of the non-negative numbers

    δ_ik, a_ik, a_ik^(2), ..., a_ik^(m−1)

is not zero. Since δ_ik = 0 for i ≠ k, the first of the relations (3) follows.


The other relation (for i = k) is obtained similarly if the inequality r(A) > 0 is replaced by Ar(A) > 0.⁴

Note. This corollary of the lemma shows that in (1) the number n − 1 can be replaced by m − 1.
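Both Lemma 1 and its corollary can be observed on a small example. The sketch below uses an illustrative 4-cycle permutation matrix (irreducible) and checks inequality (1) together with the existence of the power q in (2).

```python
# Check of Lemma 1 and its corollary on an irreducible matrix (illustrative
# example): for the cyclic matrix A below, (E + A)^{n-1} > 0, and for every
# pair (i, k) some power A^q has a positive (i, k) entry.
import numpy as np

A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0]], dtype=float)   # irreducible: a 4-cycle
n = A.shape[0]

M = np.linalg.matrix_power(np.eye(n) + A, n - 1)
assert (M > 0).all()                         # inequality (1)

# Corollary: for each (i, k) there is q >= 1 with (A^q)_{ik} > 0.
powers = [np.linalg.matrix_power(A, q) for q in range(1, n + 1)]
for i in range(n):
    for k in range(n):
        assert any(P[i, k] > 0 for P in powers)
```

For this A the diagonal pairs (i = k) need q = 4 (the cycle length), illustrating why the bound in (3) is weaker for i = k than for i ≠ k.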

§ 2. Spectral Properties of Irreducible Non-negative Matrices

1. In 1907 Perron found a remarkable property of the spectra (i.e., thecharacteristic values and characteristic vectors) of positive matrices.5

THEOREM 1 (Perron): A positive matrix A = ‖a_ik‖_1^n always has a real and positive characteristic value r which is a simple root of the characteristic equation and exceeds the moduli of all the other characteristic values. To this 'maximal' characteristic value r there corresponds a characteristic vector z = (z_1, z_2, ..., z_n) of A with positive coordinates z_i > 0 (i = 1, 2, ..., n).⁶
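Perron's theorem is easy to observe numerically: power iteration on a positive matrix converges to r and to a positive characteristic vector. The matrix below is an arbitrary illustrative choice.

```python
# Numerical illustration of Perron's theorem: for a positive matrix the
# dominant characteristic value r is real, positive, and exceeds the moduli
# of all other characteristic values; its eigenvector is positive.
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])

z = np.ones(A.shape[0])
for _ in range(200):                 # power iteration
    z = A @ z
    z /= np.linalg.norm(z)
r = z @ A @ z                        # Rayleigh quotient estimate of r

eigvals = np.linalg.eigvals(A)
assert abs(r - max(eigvals.real)) < 1e-8
assert all(abs(ev) <= r + 1e-8 for ev in eigvals)   # r dominates all moduli
assert (z > 0).all()                                # positive eigenvector
```

Starting from any positive vector, every iterate stays positive because A > 0, which is the computational shadow of the theorem's positivity claim.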

A positive matrix is a special case of an irreducible non-negative matrix. Frobenius⁷ has generalized Perron's theorem by investigating the spectral properties of irreducible non-negative matrices.

THEOREM 2 (Frobenius): An irreducible non-negative matrix A = ‖a_ik‖_1^n always has a positive characteristic value r that is a simple root of the characteristic equation. The moduli of all the other characteristic values do not exceed r. To the 'maximal' characteristic value r there corresponds a characteristic vector with positive coordinates.

Moreover, if A has h characteristic values λ_0 = r, λ_1, ..., λ_{h−1} of modulus r, then these numbers are all distinct and are roots of the equation

    λ^h − r^h = 0.    (4)

More generally: The whole spectrum λ_0, λ_1, ..., λ_{n−1} of A, regarded as a system of points in the complex λ-plane, goes over into itself under a rotation

4 The product of an irreducible non-negative matrix and a positive matrix is itselfpositive.

5 See [316], [317], and [17], p. 100.

6 Since r is a simple characteristic value, the characteristic vector z belonging to it is determined to within a scalar factor. By Perron's theorem all the coordinates of z are real, different from zero, and of like sign. By multiplying z by −1, if necessary, we can make all its coordinates positive. In the latter case the vector (column) z = (z_1, z_2, ..., z_n) is called positive (as in Definition 1).

7 See [165] and [166].


of the plane by the angle 2n/h. If h > 1, then A can be put by means of apermutation into the following `cyclic' form:

0 A120 ... O00 A28...0

A= . . . . . . . . , (5)

1000...

Ak10 O ... 0

where there are square blocks along the main diagonal.

Since Perron's theorem follows as a special case from Frobenius' theorem, we shall only prove the latter.⁸ To begin with, we shall agree on some notation.

We write

    C ≤ D  or  D ≥ C,

where C and D are real rectangular matrices of the same dimensions m × n,

    C = ‖c_ik‖,  D = ‖d_ik‖    (i = 1, 2, ..., m;  k = 1, 2, ..., n),

if and only if

    c_ik ≤ d_ik    (i = 1, 2, ..., m;  k = 1, 2, ..., n).    (6)

If the equality sign can be omitted in all the inequalities (6), then we shall write

    C < D  or  D > C.

In particular, C ≥ 0 (C > 0) means that all the elements of C are non-negative (positive).

Furthermore, we denote by C⁺ the matrix mod C which arises from C when all the elements are replaced by their moduli.

2. Proof of Frobenius' Theorem.⁹ Let x = (x_1, x_2, ..., x_n) ≥ o (x ≠ o) be a fixed real vector. We set:

    r_x = min_{1≤i≤n} (Ax)_i / x_i    ((Ax)_i = Σ_{k=1}^n a_ik x_k;  i = 1, 2, ..., n).

In the definition of the minimum we exclude here the values of i for which x_i = 0. Obviously r_x ≥ 0, and r_x is the largest real number ρ for which

    ρx ≤ Ax.

8 For a direct proof of Perron's theorem see [17], pp. 100 ff.

9 This proof is due to Wielandt [384].


We shall show that the function r_x assumes a maximum value r for some vector z ≥ o:

    r = r_z = max_{x≥o} r_x = max_{x≥o} min_{1≤i≤n} (Ax)_i / x_i.    (7)

From the definition of r_x it follows that on multiplication of a vector x ≥ o (x ≠ o) by a number λ > 0 the value of r_x does not change. Therefore, in the computation of the maximum of r_x we can restrict ourselves to the closed set M of vectors x for which

    x ≥ o  and  (x, x) = Σ_{i=1}^n x_i² = 1.

If the function r_x were continuous on M, then the existence of a maximum would be guaranteed. However, though continuous at every 'point' x > o, r_x may have discontinuities at the boundary points of M at which one of the coordinates vanishes. Therefore, we introduce in place of M the set N of all the vectors y of the form

    y = (E + A)^{n−1} x    (x ∈ M).

The set N, like M, is bounded and closed and by Lemma 1 consists of positive vectors only.

Moreover, when we multiply both sides of the inequality

    r_x x ≤ Ax

by (E + A)^{n−1} > 0, we obtain:

    r_x y ≤ Ay    (y = (E + A)^{n−1} x).

Hence, from the definition of r_y, we have

    r_x ≤ r_y.

Therefore in the computation of the maximum of r_x we can replace M by the set N, which consists of positive vectors only. On the bounded and closed set N the function r_y is continuous and therefore assumes a largest value for some vector z > o.

Every vector z ≥ o for which

    r_z = r    (8)

will be called extremal.


We shall now show that: 1) The number r defined by (7) is positive and is a characteristic value of A; 2) Every extremal vector z is positive and is a characteristic vector of A for the characteristic value r, i.e.,

    r > 0,  z > o,  Az = rz.    (9)

For if u = (1, 1, ..., 1), then r_u = min_{1≤i≤n} Σ_{k=1}^n a_ik. But then r_u > 0, because no row of an irreducible matrix can consist of zeros only. Therefore r > 0, since r ≥ r_u. Now let

    x = (E + A)^{n−1} z.    (10)

Then, by Lemma 1, x > o. Suppose that Az − rz ≠ o. Then by (1), (8), and (10) we obtain successively:

    Az − rz ≥ o,  (E + A)^{n−1}(Az − rz) > o,  Ax − rx > o.

The last inequality contradicts the definition of r, because it would imply that Ax − (r + ε)x > o for sufficiently small ε > 0, i.e., r_x ≥ r + ε > r. Therefore Az = rz. But then

    o < x = (E + A)^{n−1} z = (1 + r)^{n−1} z,

so that z > o.

We shall now show that the moduli of all the characteristic values do not exceed r. Let

    Ay = αy    (y ≠ o).    (11)

Taking the moduli of both sides in (11), we obtain:¹⁰

    |α| y⁺ ≤ Ay⁺;    (12)

hence

    |α| ≤ r_{y⁺} ≤ r.

Let y be some characteristic vector corresponding to r:

    Ay = ry    (y ≠ o).

Then, setting α = r in (11) and (12), we conclude that y⁺ is an extremal vector, so that y⁺ > o, i.e., y = (y_1, y_2, ..., y_n), where y_i ≠ 0 (i = 1, 2, ..., n). Hence it follows that only one characteristic direction corresponds to the characteristic value r; for if there were two linearly independent characteristic vectors z and z_1, we could choose numbers c and d such that the characteristic vector y = cz + dz_1 has a zero coordinate, and by what we have shown this is impossible.

10 Regarding the notation y⁺, see p. 54.


We now consider the adjoint matrix of the characteristic matrix λE − A:

    B(λ) = ‖B_ik(λ)‖_1^n = Δ(λ)(λE − A)^{−1},

where Δ(λ) is the characteristic polynomial of A and B_ik(λ) the algebraic complement of the element λδ_ki − a_ki in the determinant Δ(λ). From the fact that only one characteristic vector z = (z_1, z_2, ..., z_n) with z_1 > 0, z_2 > 0, ..., z_n > 0 corresponds to the characteristic value r (apart from a factor) it follows that B(r) ≠ 0 and that in every non-zero column of B(r) all the elements are different from zero and are of the same sign. The same is true for the rows of B(r), since in the preceding argument A can be replaced by the transposed matrix Aᵀ. From these properties of the rows and columns of B(r) it follows that all the B_ik(r) (i, k = 1, 2, ..., n) are different from zero and are of the same sign σ. Therefore

    σ Δ'(r) = σ Σ_{i=1}^n B_ii(r) > 0,

i.e., Δ'(r) ≠ 0 and r is a simple root of the characteristic equation Δ(λ) = 0. Since r is the maximal real root of Δ(λ) = λⁿ + ..., Δ(λ) increases for λ ≥ r. Therefore Δ'(r) > 0 and σ = 1, i.e.,

    B_ik(r) > 0    (i, k = 1, 2, ..., n).    (13)

3. Proceeding now to the proof of the second part of Frobenius' theorem, we shall make use of the following interesting lemma:¹¹

LEMMA 2: If A = ‖a_ik‖_1^n and C = ‖c_ik‖_1^n are two square matrices of the same order n, where A is irreducible and¹²

    C⁺ ≤ A,    (14)

then for every characteristic value γ of C and the maximal characteristic value r of A we have the inequality

    |γ| ≤ r.    (15)

In the relation (15) the equality sign holds if and only if

    C = e^{iφ} D A D^{−1},    (16)

where e^{iφ} = γ/r and D is a diagonal matrix whose diagonal elements are of unit modulus (D⁺ = E).

11 See [384].

12 C is a complex matrix and A ≥ 0.
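Both halves of Lemma 2 can be checked numerically. In the sketch below all matrices are illustrative: one C is strictly dominated entrywise, and another is built exactly in the form (16), so that it attains equality in (15).

```python
# Numerical check of Lemma 2: if C+ <= A with A irreducible, every
# characteristic value gamma of C satisfies |gamma| <= r; matrices of the
# form (16) attain |gamma| = r.  (All matrices are illustrative examples.)
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 1.0]])                     # irreducible, non-negative
r = max(abs(np.linalg.eigvals(A)))             # maximal characteristic value

# Any C with |c_ik| <= a_ik: flip signs and damp one entry.
C = np.array([[-1.0, 2.0],
              [2.5, -1.0]])
assert (np.abs(C) <= A).all()                  # condition (14): C+ <= A
assert all(abs(g) <= r + 1e-12 for g in np.linalg.eigvals(C))   # (15)

# Equality case: C = e^{i phi} D A D^{-1} as in (16).
phi = 0.7
D = np.diag(np.exp(1j * np.array([0.3, 1.1])))   # unimodular diagonal, D+ = E
C_eq = np.exp(1j * phi) * D @ A @ np.linalg.inv(D)
assert any(abs(abs(g) - r) < 1e-9 for g in np.linalg.eigvals(C_eq))
```

The similarity by D and the scalar rotation e^{iφ} leave the moduli of the spectrum unchanged, which is why (16) is exactly the equality case.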


Proof. We denote by y a characteristic vector of C corresponding to the characteristic value γ:

    Cy = γy    (y ≠ o).    (17)

From (14) and (17) we find

    |γ| y⁺ ≤ C⁺ y⁺ ≤ Ay⁺.    (18)

Therefore

    |γ| ≤ r_{y⁺} ≤ r.

Let us now examine the case |γ| = r in detail. Here it follows from (18) that y⁺ is an extremal vector for A, so that y⁺ > o and that y⁺ is a characteristic vector of A for the characteristic value r. Therefore the relation (18) assumes the form

    Ay⁺ = C⁺ y⁺ = ry⁺,  y⁺ > o.    (19)

Hence by (14)

    C⁺ = A.

Let y = (y_1, y_2, ..., y_n), where

    y_j = |y_j| e^{iφ_j}    (j = 1, 2, ..., n).

We define a diagonal matrix D by the equation

    D = (e^{iφ_1}, e^{iφ_2}, ..., e^{iφ_n}).

Then

    y = Dy⁺.    (20)

Substituting this expression for y in (17) and then setting γ = re^{iφ}, we find easily:

    Fy⁺ = ry⁺,    (21)

where

    F = e^{−iφ} D^{−1} C D.    (22)

Comparing (19) with (21), we obtain

    Fy⁺ = C⁺ y⁺ = Ay⁺.    (23)

But by (22) and (20)

    F⁺ = C⁺ = A.


Therefore we find from (23)

    Fy⁺ = F⁺ y⁺.

Since y⁺ > o, this equation can hold only if

    F = F⁺,

i.e.,

    e^{−iφ} D^{−1} C D = A.

Hence

    C = e^{iφ} D A D^{−1},

and the Lemma is proved.

4. We return to Frobenius' theorem and apply the lemma to an irreducible matrix A ≥ 0 that has precisely h characteristic values of maximal modulus r:

    λ_0 = re^{iφ_0},  λ_1 = re^{iφ_1},  ...,  λ_{h−1} = re^{iφ_{h−1}}
    (0 = φ_0 < φ_1 < φ_2 < ... < φ_{h−1} < 2π).

Then, setting C = A and γ = λ_k in the lemma, we have, for every k = 0, 1, ..., h−1,

    A = e^{iφ_k} D_k A D_k^{−1},    (24)

where D_k is a diagonal matrix with D_k⁺ = E.

Again, let z be a positive characteristic vector of A corresponding to the maximal characteristic value r:

    Az = rz    (z > o).    (25)

Then, setting

    y_k = D_k z    (y_k⁺ = z > o),    (26)

we find from (24), (25), and (26):

    Ay_k = λ_k y_k    (λ_k = re^{iφ_k};  k = 0, 1, ..., h−1).    (27)

The last equation shows that the vectors y_0, y_1, ..., y_{h−1} defined in (26) are characteristic vectors of A for the characteristic values λ_0, λ_1, ..., λ_{h−1}.

From (24) it follows not only that λ_0 = r, but also that each characteristic value λ_1, ..., λ_{h−1} of A is simple. Therefore the characteristic vectors y_k, and hence the matrices D_k (k = 0, 1, ..., h−1), are determined to within scalar factors. To define the matrices D_0, D_1, ..., D_{h−1} uniquely we shall choose their first diagonal element to be 1. Then D_0 = E and y_0 = z > o.


Furthermore, from (24) it follows that

    A = e^{i(φ_j ± φ_k)} D_j D_k^{±1} A D_k^{∓1} D_j^{−1}    (j, k = 0, 1, ..., h−1).

Hence we deduce similarly that the vector

    D_j D_k^{±1} z

is a characteristic vector of A corresponding to the characteristic value re^{i(φ_j ± φ_k)}. Therefore e^{i(φ_j ± φ_k)} coincides with one of the numbers e^{iφ_l} and the matrix D_j D_k^{±1} with the corresponding matrix D_l; that is, we have, for some l_1, l_2 (0 ≤ l_1, l_2 ≤ h−1):

    e^{i(φ_j + φ_k)} = e^{iφ_{l_1}},  D_j D_k = D_{l_1};
    e^{i(φ_j − φ_k)} = e^{iφ_{l_2}},  D_j D_k^{−1} = D_{l_2}.

Thus: The numbers e^{iφ_0}, e^{iφ_1}, ..., e^{iφ_{h−1}} and the corresponding diagonal matrices D_0, D_1, ..., D_{h−1} form two isomorphic multiplicative abelian groups.

In every finite group consisting of h distinct elements the h-th power of every element is equal to the unit element of the group. Therefore e^{iφ_0}, e^{iφ_1}, ..., e^{iφ_{h−1}} are h-th roots of unity. Since there are h such roots of unity and 0 = φ_0 < φ_1 < ... < φ_{h−1} < 2π,

    φ_k = 2πk/h    (k = 0, 1, 2, ..., h−1)

and

    e^{iφ_k} = ε^k    (ε = e^{2πi/h};  k = 0, 1, ..., h−1),    (28)

    λ_k = rε^k    (k = 0, 1, ..., h−1).    (29)

The numbers λ_0, λ_1, ..., λ_{h−1} form a complete system of roots of (4). In accordance with (28), we have:¹⁴

    D_k = D^k    (D = D_1;  k = 0, 1, ..., h−1).    (30)

The equation (24) now gives us (for k = 1):

    A = e^{2πi/h} D A D^{−1}.    (31)

14 Here we use the isomorphism of the multiplicative groups e^{iφ_0}, e^{iφ_1}, ..., e^{iφ_{h−1}} and D_0, D_1, ..., D_{h−1}.


Hence it follows that the matrix A on multiplication by e^{2πi/h} goes over into a similar matrix and, therefore, that the whole system of n characteristic values of A on multiplication by e^{2πi/h} goes over into itself.¹⁵

Further,

    D^h = E,

so that all the diagonal elements of D are h-th roots of unity. By a permutation of A (and similarly of D) we can arrange that D be of the following quasi-diagonal form:

    D = {η_0 E_0, η_1 E_1, ..., η_{s−1} E_{s−1}},    (32)

where E_0, E_1, ..., E_{s−1} are unit matrices and

    η_p = e^{iω_p},  ω_p = 2πn_p/h
    (n_p an integer;  p = 0, 1, ..., s−1;  0 = n_0 < n_1 < ... < n_{s−1} < h).

Obviously s ≤ h.

Writing A in block form (in accordance with (32))

        ‖ A_11  A_12  ...  A_1s ‖
    A = ‖ A_21  A_22  ...  A_2s ‖ ,    (33)
        ‖ . . . . . . . . . . . ‖
        ‖ A_s1  A_s2  ...  A_ss ‖

we replace (31) by the system of equations

    A_pq = ε η_{p−1} η_{q−1}^{−1} A_pq    (p, q = 1, 2, ..., s;  ε = e^{2πi/h}).    (34)

Hence for every p and q either η_{q−1} = ε η_{p−1} or A_pq = 0.

Let us take p = 1. Since the matrices A_11, A_12, ..., A_1s cannot vanish simultaneously, one of the numbers η_0, η_1, ..., η_{s−1} (η_0 = 1) must be equal to ε. This is only possible for n_1 = 1. Then η_1 = ε and A_11 = A_13 = ... = A_1s = 0. Setting p = 2 in (34), we find similarly that n_2 = 2 and that A_21 = A_22 = A_24 = ... = A_2s = 0, etc. Finally, we obtain

15 The number h is the largest integer having these properties, because A has precisely h characteristic values of maximal modulus r. Moreover, it follows from (31) that all the characteristic values of the matrix fall into systems (with h numbers in each) of the form ρ, ρε, ..., ρε^{h−1} and that within each such system to any two characteristic values there correspond elementary divisors of equal degree. One such system is formed by the roots λ_0, λ_1, ..., λ_{h−1} of the equation (4).


        ‖ 0     A_12  0     ...  0         ‖
        ‖ 0     0     A_23  ...  0         ‖
    A = ‖ . . . . . . . . . . . . . . . .  ‖ .
        ‖ 0     0     0     ...  A_{s−1,s} ‖
        ‖ A_s1  A_s2  A_s3  ...  A_ss      ‖

Here n_1 = 1, n_2 = 2, ..., n_{s−1} = s − 1. But then for p = s we have on the right-hand sides of (34) the factors

    ε η_{s−1} η_{q−1}^{−1} = ε^{s−q+1}    (q = 1, 2, ..., s).

For a non-zero block A_sq one of these factors must be equal to 1, i.e., η_{q−1} = ε η_{s−1} = ε^s. This is only possible when s = h and q = 1; consequently, A_s2 = ... = A_ss = 0.

Thus,

    D = {E_0, εE_1, ε²E_2, ..., ε^{h−1}E_{h−1}},

and the matrix A has the form (5).

Frobenius' theorem is now completely proved.
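The cyclic structure just established can be seen on the smallest possible example. The sketch below builds a matrix in the form (5) with h = 3 and 1×1 blocks (the block entries are arbitrary positive numbers) and checks that its spectrum is invariant under rotation by 2π/h and that the peripheral values solve equation (4).

```python
# Illustration of the second part of Frobenius' theorem: a matrix in the
# cyclic form (5) with h = 3 has a spectrum invariant under rotation by
# 2*pi/3, and its peripheral characteristic values are the roots of
# lambda^3 = r^3 (equation (4)).  Block entries are arbitrary positives.
import numpy as np

Z = np.zeros((1, 1))
A12, A23, A31 = np.array([[2.0]]), np.array([[3.0]]), np.array([[4.0]])
A = np.block([[Z, A12, Z],
              [Z, Z, A23],
              [A31, Z, Z]])

eigs = np.linalg.eigvals(A)
r = max(abs(eigs))

# All three characteristic values have modulus r and solve lambda^3 = r^3.
assert np.allclose(np.abs(eigs), r)
assert np.allclose(np.sort(eigs**3), np.full(3, r**3))

# Rotation invariance: e^{2 pi i/3} * spectrum is the same point set.
rotated = np.exp(2j * np.pi / 3) * eigs
assert np.allclose(np.sort_complex(rotated), np.sort_complex(eigs))
```

Here the characteristic polynomial is λ³ − (2·3·4), so r = 24^(1/3) and the spectrum is exactly r, rε, rε² with ε = e^(2πi/3).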

5. We now make a few general comments on Frobenius' theorem.

Remark 1. In the proof of Frobenius' theorem we have established incidentally that for an irreducible matrix A ≥ 0 with the maximal characteristic value r the adjoint matrix B(λ) is positive for λ = r:

    B(r) > 0,    (35)

    B_ik(r) > 0    (i, k = 1, 2, ..., n),    (35')

where B_ik(r) is the algebraic complement of the element rδ_ki − a_ki in the determinant |rE − A|.

Let us now consider the reduced adjoint matrix (see Vol. I, Chapter IV, § 6)

    C(λ) = B(λ) / D_{n−1}(λ),

where D_{n−1}(λ) is the greatest common divisor of all the polynomials B_ik(λ) (i, k = 1, 2, ..., n). It follows from (35') that D_{n−1}(r) ≠ 0. All the roots of D_{n−1}(λ) are characteristic values¹⁶ distinct from r. Therefore all the

16 D_{n−1}(λ) is a divisor of the characteristic polynomial Δ(λ) = |λE − A|.


roots of D_{n−1}(λ) either are complex or are real and less than r. Hence D_{n−1}(r) > 0, and this, in conjunction with (35), yields:¹⁷

    C(r) = B(r) / D_{n−1}(r) > 0.    (36)

Remark 2. The inequality (35') enables us to determine bounds for the maximal characteristic value r.

We introduce the notation

    s_i = Σ_{k=1}^n a_ik  (i = 1, 2, ..., n),    s = min_{1≤i≤n} s_i,    S = max_{1≤i≤n} s_i.

Then: For an irreducible matrix A ≥ 0

    s ≤ r ≤ S,    (37)

and the equality sign on the left or the right of r holds for s = S only; i.e., it holds only when all the 'row-sums' s_1, s_2, ..., s_n are equal.¹⁸

For if we add to the last column of the characteristic determinant

           | r − a_11   − a_12    ...   − a_1n  |
    Δ(r) = | − a_21    r − a_22   ...   − a_2n  |
           | . . . . . . . . . . . . . . . . .  |
           | − a_n1    − a_n2     ...  r − a_nn |

all the preceding columns and then expand the determinant with respect to the elements of the last column, we obtain:

    0 = Δ(r) = Σ_{i=1}^n (r − s_i) B_ni(r).

Hence (37) follows by (35').
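The bounds (37) are cheap to check numerically. In the sketch below (matrices are illustrative), one matrix has unequal row sums, so both inequalities in (37) are strict; the other has equal row sums, so r = s = S.

```python
# Row-sum bounds (37) for irreducible non-negative matrices: s <= r <= S,
# with equality only when all row sums coincide.  (Example matrices only.)
import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.5, 1.0, 1.0],
              [1.0, 1.0, 1.0]])       # irreducible; row sums 3, 2.5, 3
row_sums = A.sum(axis=1)
s, S = row_sums.min(), row_sums.max()
r = max(abs(np.linalg.eigvals(A)))
assert s < r < S                      # strict, since the row sums differ

B = np.array([[1.0, 2.0],
              [2.0, 1.0]])            # equal row sums: r = s = S = 3
rB = max(abs(np.linalg.eigvals(B)))
assert np.isclose(rB, 3.0)
```

For stochastic matrices (all row sums equal to 1) this remark gives r = 1 immediately, which is the fact used in the Markov-chain applications mentioned at the start of the chapter.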

Remark 3. An irreducible matrix A ≥ 0 cannot have two linearly independent non-negative characteristic vectors. For suppose that, apart from the positive characteristic vector z > o corresponding to the maximal characteristic value r, the matrix A has another characteristic vector y ≥ o (linearly independent of z) for the characteristic value α:

17 In the following section it will be shown that for an irreducible matrix B(λ) > 0 and C(λ) > 0 for every real λ ≥ r.

18 Narrower bounds for r than (s, S) are established in the papers [256], [295], and [119, IV].


    Ay = αy    (y ≠ o;  y ≥ o).

Since r is a simple root of the characteristic equation |λE − A| = 0,

    α ≠ r.

We denote by u the positive characteristic vector of the transposed matrix Aᵀ for λ = r:

    Aᵀu = ru    (u > o).

Then¹⁹

    r(y, u) = (y, Aᵀu) = (Ay, u) = α(y, u);

hence, as α ≠ r,

    (y, u) = 0,

and this is impossible for u > o, y ≥ o, y ≠ o.

Remark 4. In the proof of Frobenius' theorem we have established the following characterization of the maximal characteristic value r of an irreducible matrix A ≥ 0:

    r = max_{x≥o} r_x,

where r_x is the largest number ρ for which ρx ≤ Ax. In other words, since

    r_x = min_{1≤i≤n} (Ax)_i / x_i,

we have

    r = max_{x≥o} min_{1≤i≤n} (Ax)_i / x_i.

Similarly, we can define for every vector x ≥ o (x ≠ o) a number r̄_x as the least number σ for which

    σx ≥ Ax;

i.e., we set

    r̄_x = max_{1≤i≤n} (Ax)_i / x_i.

If for some i we have here x_i = 0, (Ax)_i ≠ 0, then we shall take r̄_x = +∞. As in the case of the function r_x, it turns out here that the function r̄_x assumes a least value r̄ for some vector v ≥ o. Let us show that the number r̄ defined by

    r̄ = min_{x≥o} r̄_x = min_{x≥o} max_{1≤i≤n} (Ax)_i / x_i    (38)

19 If y = (y_1, y_2, ..., y_n) and u = (u_1, u_2, ..., u_n), then we mean by (y, u) the 'scalar product' yᵀu = Σ_{i=1}^n y_i u_i. Then (y, Aᵀu) = yᵀAᵀu and (Ay, u) = (Ay)ᵀu = yᵀAᵀu.


coincides with r and that the vector v ≥ o (v ≠ o) for which this minimum is assumed is a characteristic vector of A for λ = r.

For,

    r̄v − Av ≥ o    (v ≥ o, v ≠ o).

Suppose now that the sign ≥ cannot be replaced by the equality sign. Then by Lemma 1

    (E + A)^{n−1}(r̄v − Av) > o,    (E + A)^{n−1} v > o.    (39)

Setting

    u = (E + A)^{n−1} v > o,

we have

    r̄u > Au,

and so for sufficiently small ε > 0

    (r̄ − ε)u > Au    (u > o),

which contradicts the definition of r̄. Thus

    Av = r̄v.

But then

    u = (E + A)^{n−1} v = (1 + r̄)^{n−1} v.

Therefore u > o implies that v > o. Hence, by Remark 3,

    r̄ = r.

Thus we have for r the double characterization:

    r = max_{x≥o} min_{1≤i≤n} (Ax)_i / x_i = min_{x≥o} max_{1≤i≤n} (Ax)_i / x_i.    (40)

Moreover, we have shown that the maximum and the minimum in (40) are assumed only for a positive characteristic vector for λ = r.

From this characterization of r we obtain the inequality²⁰

    min_{1≤i≤n} (Ax)_i / x_i ≤ r ≤ max_{1≤i≤n} (Ax)_i / x_i    (x ≥ o, x ≠ o).    (41)
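The two-sided bounds (41) (often quoted as the Collatz-Wielandt inequalities) are directly testable: every positive test vector sandwiches r, and the sandwich collapses at the positive characteristic vector. The matrix and the random trials below are illustrative.

```python
# Inequality (41): for every x > 0, min_i (Ax)_i/x_i <= r <= max_i (Ax)_i/x_i,
# with both bounds equal to r at a positive characteristic vector.
import numpy as np

A = np.array([[0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])        # irreducible, non-negative
r = max(abs(np.linalg.eigvals(A)))

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.uniform(0.1, 1.0, size=3)  # random positive test vector
    q = (A @ x) / x
    assert q.min() <= r + 1e-10 and r <= q.max() + 1e-10   # (41)

# At the positive characteristic vector both bounds equal r.
w, V = np.linalg.eig(A)
z = np.abs(V[:, np.argmax(abs(w))])    # Perron vector, normalized positive
q = (A @ z) / z
assert np.allclose(q, r)
```

This is also a practical stopping test for power iteration: the gap max_i q_i − min_i q_i bounds the error in the current estimate of r.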

Remark 5. Since in (40) the maximum and the minimum are assumed only for a positive characteristic vector of the irreducible matrix A ≥ 0, the inequalities

20 See [128] and also [17], p. 325 ff.


    rz ≤ Az,  z ≥ o,  z ≠ o

or

    rz ≥ Az,  z ≥ o,  z ≠ o

always imply that

    Az = rz,  z > o.

§ 3. Reducible Matrices

1. The spectral properties of irreducible non-negative matrices that were established in the preceding section are not preserved when we go over to reducible matrices. However, since every non-negative matrix A ≥ 0 can be represented as the limit of a sequence of irreducible positive matrices A_m,

    A = lim_{m→∞} A_m    (A_m > 0;  m = 1, 2, ...),    (42)

some of the spectral properties of irreducible matrices hold in a weaker form for reducible matrices.

For an arbitrary non-negative matrix A = ‖a_ik‖_1^n we can prove the following theorem:

THEOREM 3: A non-negative matrix A = ‖a_ik‖_1^n always has a non-negative characteristic value r such that the moduli of all the characteristic values of A do not exceed r. To this 'maximal' characteristic value r there corresponds a non-negative characteristic vector:

    Ay = ry    (y ≥ o, y ≠ o).

The adjoint matrix B(λ) = ‖B_ik(λ)‖_1^n = (λE − A)^{−1}Δ(λ) satisfies the inequalities

    B(λ) ≥ 0,  (d/dλ)B(λ) ≥ 0  for  λ ≥ r.    (43)

Proof. Let A be represented as in (42). We denote by r^(m) and y^(m) the maximal characteristic value of the positive matrix A_m and the corresponding normalized²¹ positive characteristic vector:

    A_m y^(m) = r^(m) y^(m)    ((y^(m), y^(m)) = 1,  y^(m) > o;  m = 1, 2, ...).    (44)

Then it follows from (42) that the limit

    lim_{m→∞} r^(m) = r

21 By a normalized vector we mean a column y = (y_1, y_2, ..., y_n) for which (y, y) = Σ_{i=1}^n y_i² = 1.


exists, where r is a characteristic value of A. From the fact that r^(m) > 0 and r^(m) ≥ |λ_0^(m)|, where λ_0^(m) is an arbitrary characteristic value of A_m (m = 1, 2, ...), we obtain by proceeding to the limit:

    r ≥ 0,  r ≥ |λ_0|,

where λ_0 is an arbitrary characteristic value of A. This passage to the limit gives us in place of (35)

    B(r) ≥ 0.    (45)

Furthermore, from the sequence of normalized characteristic vectors y^(m) (m = 1, 2, ...) we can select a subsequence y^(m_p) (p = 1, 2, ...) that converges to some normalized (and therefore non-zero) vector y. When we go to the limit on both sides of (44) by giving to m the values m_p (p = 1, 2, ...) successively, we obtain:

    Ay = ry    (y ≥ o, y ≠ o).
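The limiting construction (42) can be imitated numerically. In the sketch below, the perturbation A_m = A + (1/m)·(matrix of ones) is one convenient illustrative choice of positive approximants: its Perron roots r^(m) decrease to the maximal characteristic value r of the reducible matrix A.

```python
# Sketch of the limit (42) behind Theorem 3: a reducible non-negative matrix
# is approximated by positive matrices A_m = A + (1/m) * ones, and the
# maximal characteristic values r^(m) converge to r from above.
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 2.0]])        # reducible (triangular); here r = 2
r = 2.0
ones = np.ones_like(A)

r_m = [max(abs(np.linalg.eigvals(A + ones / m))) for m in (10, 100, 1000)]
assert r_m[0] > r_m[1] > r_m[2] > r      # monotone from above
assert abs(r_m[-1] - r) < 1e-2           # converging to r
```

Monotonicity here reflects the entrywise monotonicity of the maximal characteristic value: enlarging the entries of a non-negative matrix cannot decrease r.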

The inequalities (43) will be established by induction on the order n. For n = 1 they are obvious.²² Let us establish them for a matrix A = ‖a_ik‖_1^n of order n on the assumption that they are true for matrices of order less than n.

Expanding the characteristic determinant Δ(λ) = |λE − A| with respect to the elements of the last row and the last column, we obtain:

    Δ(λ) = (λ − a_nn) B_nn(λ) − Σ_{i,k=1}^{n−1} a_nk a_in B_ik^(n)(λ).    (46)

Here B_nn(λ) = |λδ_ik − a_ik|_1^{n−1} is the characteristic determinant of a 'truncated' non-negative matrix of order n − 1, and B_ik^(n)(λ) is the algebraic complement of the element λδ_ik − a_ik in B_nn(λ) (i, k = 1, 2, ..., n−1). The maximal non-negative root of B_nn(λ) will be denoted by r_n. Then, setting λ = r_n in (46) and observing that by the induction hypothesis

    B_ik^(n)(r_n) ≥ 0    (i, k = 1, 2, ..., n−1),

we obtain from (46):

    Δ(r_n) ≤ 0.

On the other hand, Δ(λ) = λⁿ + ..., so that Δ(+∞) = +∞. Therefore r_n either is a root of Δ(λ) or is less than some real root of Δ(λ). In both cases

    r_n ≤ r.

22 For since B(λ) = (λE − A)^{−1}Δ(λ), we have B(λ) ≡ E, (d/dλ)B(λ) ≡ 0 for n = 1.


Since every principal minor B_jj(λ) of order n − 1 can be brought into the position of B_nn(λ) by a permutation of A, we have

    r_j ≤ r    (j = 1, 2, ..., n),    (47)

where r_j denotes the maximal root of the polynomial B_jj(λ) (j = 1, 2, ..., n).

Furthermore, B_ik(λ) may be represented as a minor of order n − 1 of the characteristic matrix λE − A, multiplied by (−1)^{i+k}. When we differentiate this determinant with respect to λ, we obtain:

    (d/dλ) B_ik(λ) = Σ_{j≠i, j≠k} B_ik^(j)(λ)    (i, k = 1, 2, ..., n),    (48)

where B^(j)(λ) = ‖B_ik^(j)(λ)‖ (i ≠ j, k ≠ j; j = 1, 2, ..., n) is the adjoint matrix of the characteristic matrix of ‖a_ik‖ (i, k = 1, ..., j−1, j+1, ..., n) of order n − 1. But, by the induction hypothesis,

    B^(j)(λ) ≥ 0  for  λ ≥ r_j    (j = 1, 2, ..., n);

and so, by (47) and (48),

    (d/dλ) B(λ) ≥ 0  for  λ ≥ r.    (49)

From (45) and (49) it follows that

    B(λ) ≥ 0  for  λ ≥ r.

The proof of the theorem is now complete.

Note. In the passage to the limit (42) the inequalities (37) are preserved. They hold, therefore, for an arbitrary non-negative matrix. However, the conditions under which the equality sign holds in (37) are not valid for a reducible matrix.
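The content of Theorem 3, that the adjoint matrix B(λ) = adj(λE − A) is entrywise non-negative for every λ ≥ r, lends itself to a quick numerical check. The following Python sketch (the matrix A is a hypothetical example chosen only for illustration, not taken from the text) computes the adjoint directly from cofactors:

```python
import numpy as np

# Hypothetical non-negative matrix used only for illustration.
A = np.array([[1.0, 2.0, 0.0],
              [0.5, 0.0, 1.0],
              [1.0, 1.0, 1.0]])

# Maximal characteristic value r (the Perron root).
r = max(abs(np.linalg.eigvals(A)))

def adjugate(M):
    """Classical adjoint B: transpose of the cofactor matrix."""
    n = M.shape[0]
    C = np.empty_like(M)
    for i in range(n):
        for k in range(n):
            minor = np.delete(np.delete(M, i, axis=0), k, axis=1)
            C[i, k] = (-1) ** (i + k) * np.linalg.det(minor)
    return C.T

# B(lam) = adj(lam*E - A) is entrywise non-negative for lam >= r.
for lam in (r, r + 0.5, r + 2.0):
    B = adjugate(lam * np.eye(3) - A)
    assert np.all(B >= -1e-9)
```

The identity adj(M)·M = det(M)·E can be used to validate the helper; the non-negativity asserted in the loop is exactly what the theorem guarantees.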

2. A number of important propositions follow from Theorem 3:

1. If A = ‖ a_ik ‖₁ⁿ is a non-negative matrix with maximal characteristic value r and C(λ) is its reduced adjoint matrix, then

C(λ) ≥ 0 for λ ≥ r.   (50)


§ 3. REDUCIBLE MATRICES 69

For

C(λ) = B(λ) / D_{n−1}(λ),   (51)

where D_{n−1}(λ) is the greatest common divisor of the elements of B(λ). Since D_{n−1}(λ) divides the characteristic polynomial Δ(λ) and Δ(λ) = λⁿ + ⋯, the highest coefficient of D_{n−1}(λ) is 1, and therefore

D_{n−1}(λ) > 0 for λ > r.   (52)

Now (43), (51), and (52) imply (50).

2. If A ≥ 0 is an irreducible matrix with maximal characteristic value r, then

B(λ) > 0, C(λ) > 0 for λ ≥ r.   (53)

Indeed, by (35), B(r) > 0. But also (see (43)) d/dλ B(λ) ≥ 0 for λ ≥ r. Therefore

B(λ) > 0 for λ ≥ r.   (54)

The other of the inequalities (53) follows from (51), (52), and (54).

3. If A ≥ 0 is an irreducible matrix with maximal characteristic value r, then

(λE − A)⁻¹ > 0 for λ > r.   (55)

This inequality follows from the formula

(λE − A)⁻¹ = B(λ) / Δ(λ),

since B(λ) > 0 and Δ(λ) > 0 for λ > r.

4. The maximal characteristic value r′ of every principal minor²³ (of order less than n) of a non-negative matrix A = ‖ a_ik ‖₁ⁿ does not exceed the maximal characteristic value r of A:

r′ ≤ r.   (56)

If A is irreducible, then the equality sign in (56) cannot occur. If A is reducible, then the equality sign in (56) holds for at least one principal minor.

²³ We mean here by a principal minor the matrix formed from the elements of a principal minor.


For the inequality (56) is true for every principal minor of order n − 1 (see (47)). If A is irreducible, then by (35′) B_jj(r) > 0 (j = 1, 2, ..., n) and therefore r′ ≠ r.

By descent from n − 1 to n − 2, from n − 2 to n − 3, etc., we show the truth of (56) for the principal minors of every order.

If A is a reducible matrix, then by means of a permutation it can be put into the form

( B  0 )
( C  D ).

Then r must be a characteristic value of one of the two principal minors B and D. This proves Proposition 4.
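Proposition 4 is easy to test numerically. In the sketch below the 3×3 irreducible matrix is a hypothetical example whose maximal characteristic value happens to be exactly 2; every proper principal minor must then have a strictly smaller maximal characteristic value:

```python
import numpy as np
from itertools import combinations

# Hypothetical irreducible non-negative matrix (a cycle with extra arcs).
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0],
              [1.0, 1.0, 0.5]])

r = max(abs(np.linalg.eigvals(A)))   # equals 2 for this example

# Check r' < r for every principal submatrix of order 1 and 2.
for p in (1, 2):
    for rows in combinations(range(3), p):
        sub = A[np.ix_(rows, rows)]
        r_sub = max(abs(np.linalg.eigvals(sub)))
        assert r_sub < r
```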

From 4. we deduce:

5. If A ≥ 0 and if in the characteristic determinant

Δ(r) =
| r − a₁₁   −a₁₂    ...   −a₁ₙ |
| −a₂₁    r − a₂₂   ...   −a₂ₙ |
| ............................. |
| −aₙ₁    −aₙ₂    ...   r − aₙₙ |

any principal minor vanishes (A is reducible!), then every 'augmented' principal minor also vanishes; in particular, so does one of the principal minors of order n − 1:

B₁₁(r), B₂₂(r), ..., Bₙₙ(r).

From 4. and 5. we deduce:

6. A matrix A ≥ 0 is reducible if and only if in one of the relations

B_ii(r) ≥ 0   (i = 1, 2, ..., n)

the equality sign holds.

From 4. we also deduce:

7. If r is the maximal characteristic value of a matrix A ≥ 0, then for every λ > r all the principal minors of the characteristic matrix A_λ = λE − A are positive:

A_λ(i₁ i₂ ... i_p) > 0   (λ > r; 1 ≤ i₁ < i₂ < ... < i_p ≤ n; p = 1, 2, ..., n).   (57)

It is easy to see that, conversely, (57) implies that λ > r. For


Δ(λ + μ) = | (λ + μ)E − A | = | A_λ + μE | = Σ_{k=0}^{n} S_{n−k} μ^k,

where S_k is the sum of all the principal minors of order k of the characteristic matrix A_λ = λE − A (k = 1, 2, ..., n) and S₀ = 1.²⁴ Therefore, if for some real λ all the principal minors of A_λ are positive, then for μ ≥ 0

Δ(λ + μ) > 0,

i.e., no real number ≥ λ is a characteristic value of A. Therefore

r < λ.

Thus, (57) is a necessary and sufficient condition for λ to be an upper bound for the moduli of the characteristic values of A.²⁵ However, the inequalities (57) are not all independent.

The matrix λE − A is a matrix with non-positive elements outside the main diagonal.²⁶ D. M. Kotelyanskii has proved that for such matrices, just as for symmetric matrices, all the principal minors are positive, provided the successive principal minors are positive.²⁷

LEMMA 3 (Kotelyanskii): If in a real matrix G = ‖ g_ik ‖₁ⁿ all the non-diagonal elements are negative or zero,

g_ik ≤ 0   (i ≠ k; i, k = 1, 2, ..., n),   (58)

and the successive principal minors are positive,

g₁₁ = G(1) > 0,   G(1 2) > 0,   ...,   G(1 2 ... n) > 0,   (59)

then all the principal minors are positive:

G(i₁ i₂ ... i_p) > 0   (1 ≤ i₁ < i₂ < ... < i_p ≤ n; p = 1, 2, ..., n).

²⁴ See Vol. I, p. 70.

²⁵ See [344].

²⁶ It is easy to see that, conversely, every matrix with negative or zero non-diagonal elements can be represented in the form λE − A, where A is a non-negative matrix and λ is a real number.

²⁷ See [215]. This paper contains a number of results about matrices in which all the non-diagonal elements are of like sign.


Proof. We shall prove the lemma by induction on the order n of the matrix. For n = 2 the lemma holds, since it follows from

g₁₂ ≤ 0,  g₂₁ ≤ 0,  g₁₁ > 0,  g₁₁g₂₂ − g₁₂g₂₁ > 0

that g₂₂ > 0. Let us assume now that the lemma is true for matrices of order less than n; we shall then prove it for G = ‖ g_ik ‖₁ⁿ. We consider the bordered determinants

t_ik = G(1 i; 1 k) = g₁₁ g_ik − g₁k g_i1   (i, k = 2, ..., n).

From (58) and (59) it follows that

t_ik ≤ 0   (i ≠ k; i, k = 2, ..., n).

On the other hand, by applying Sylvester's identity (Vol. I, Chapter II, (30), p. 33) to the matrix T = ‖ t_ik ‖₂ⁿ, we obtain:

T(i₁ i₂ ... i_p) = g₁₁^{p−1} G(1 i₁ i₂ ... i_p)   (2 ≤ i₁ < i₂ < ... < i_p ≤ n; p = 1, 2, ..., n − 1).   (60)

Hence it follows by (59) that the successive principal minors of the matrix T = ‖ t_ik ‖₂ⁿ are positive:

t₂₂ = T(2) > 0,   T(2 3) > 0,   ...,   T(2 3 ... n) > 0.

Thus, the matrix T = ‖ t_ik ‖₂ⁿ of order n − 1 satisfies the conditions of the lemma. Therefore, by the induction hypothesis, all its principal minors are positive:

T(i₁ i₂ ... i_p) > 0   (2 ≤ i₁ < i₂ < ... < i_p ≤ n; p = 1, 2, ..., n − 1).

But then it follows from (60) that all the principal minors of G containing the first row are positive:

G(1 i₁ i₂ ... i_p) > 0   (2 ≤ i₁ < i₂ < ... < i_p ≤ n; p = 1, 2, ..., n − 1).   (61)


Let us choose fixed indices i₁, i₂, ..., i_{n−2} (where 1 < i₁ < i₂ < ... < i_{n−2} ≤ n) and form the matrix of order n − 1:

‖ g_{αβ} ‖   (α, β = 1, i₁, i₂, ..., i_{n−2}).   (62)

The successive principal minors of this matrix are positive, by (61):

g₁₁ > 0,   G(1 i₁) > 0,   ...,   G(1 i₁ i₂ ... i_{n−2}) > 0,

and the non-diagonal elements are non-positive:

g_{αβ} ≤ 0   (α ≠ β; α, β = 1, i₁, i₂, ..., i_{n−2}).

But the order of (62) is n − 1. Therefore, by the induction hypothesis, all the principal minors of this matrix are positive; in particular,

G(i₁ i₂ ... i_p) > 0   (2 ≤ i₁ < i₂ < ... < i_p ≤ n; p = 1, 2, ..., n − 2).   (63)

Thus, all the principal minors of G of order not exceeding n − 2 are positive.

Since by (63) g₂₂ > 0, we may now consider the determinants of order two bordering the element g₂₂ (and not g₁₁ as before):

t′_ik = G(2 i; 2 k)   (i, k = 1, 3, ..., n).

By operating with the matrix T′ = ‖ t′_ik ‖ as we have done above with T, we obtain inequalities analogous to (61):

G(2 i₁ ... i_p) > 0   (64)
(i₁ < i₂ < ... < i_p;  i₁, ..., i_p = 1, 3, ..., n;  p = 1, 2, ..., n − 1).

Since every principal minor of G = ‖ g_ik ‖₁ⁿ contains either the first or the second row or is of order not exceeding n − 2, it follows from (61), (63), and (64) that all the principal minors of G are positive. This completes the proof of the lemma.
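Lemma 3 can be exercised directly; in the following sketch the matrix G is a hypothetical example with non-positive off-diagonal elements and positive successive principal minors:

```python
import numpy as np
from itertools import combinations

# Hypothetical matrix satisfying (58): non-positive off-diagonal entries.
G = np.array([[2.0, -1.0, -0.5],
              [-1.0, 2.0, -1.0],
              [-0.5, -1.0, 2.0]])

# The successive (leading) principal minors (59) are positive ...
leading = [np.linalg.det(G[:k, :k]) for k in (1, 2, 3)]
assert all(m > 0 for m in leading)

# ... and the lemma then guarantees that every principal minor is positive.
for p in (1, 2, 3):
    for idx in combinations(range(3), p):
        assert np.linalg.det(G[np.ix_(idx, idx)]) > 0
```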

This lemma allows us to retain only the successive principal minors in the condition (57) and to formulate the following theorem:

²⁸ See [344] and [215]. Since C = A − λE and A ≥ 0, λₙ is real (this follows from λₙ + λ = r), and the corresponding characteristic vector of C is non-negative: Cy = λₙ y (y ≥ o, y ≠ o).


THEOREM 4: A real number λ is greater than the maximal characteristic value r of the matrix A = ‖ a_ik ‖₁ⁿ ≥ 0,

r < λ,

if and only if for this value λ all the successive principal minors of the characteristic matrix A_λ = λE − A are positive:

λ − a₁₁ > 0,
| λ − a₁₁   −a₁₂  |
| −a₂₁    λ − a₂₂ | > 0,
...,
| λE − A | > 0.   (65)

Let us consider one application of Theorem 4. Suppose that in the matrix

C = ‖ c_ik ‖₁ⁿ all the non-diagonal elements are non-negative. Then for some λ > 0 we have A = C + λE ≥ 0. We arrange the characteristic values λ_i (i = 1, 2, ..., n) of C with their real parts in ascending order:

Re λ₁ ≤ Re λ₂ ≤ ... ≤ Re λₙ.

We denote by r the maximal characteristic value of A. Since the characteristic values of A are the sums λ_i + λ (i = 1, 2, ..., n), we have r = λₙ + λ. In this case the inequality r < λ holds for λₙ < 0 only, and signifies that all the characteristic values of C have negative real parts. When we write down the inequality (65) for the matrix −C = λE − A, we obtain the following theorem:

THEOREM 5: The real parts of all the characteristic values of a real matrix C = ‖ c_ik ‖₁ⁿ with non-negative non-diagonal elements,

c_ik ≥ 0   (i ≠ k; i, k = 1, 2, ..., n),

are negative if and only if

c₁₁ < 0,
| c₁₁   c₁₂ |
| c₂₁   c₂₂ | > 0,
...,
(−1)ⁿ | c₁₁   c₁₂   ...   c₁ₙ |
       | c₂₁   c₂₂   ...   c₂ₙ |
       | ........................ |
       | cₙ₁   cₙ₂   ...   cₙₙ | > 0.   (66)
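Theorem 5 gives a finite test for the stability of matrices with non-negative off-diagonal elements. A small numerical sketch (both matrices are hypothetical illustrations, one satisfying (66) and one violating it):

```python
import numpy as np

def condition_66(C):
    """(-1)^k times the k-th successive principal minor is positive."""
    n = C.shape[0]
    return all((-1) ** k * np.linalg.det(C[:k, :k]) > 0
               for k in range(1, n + 1))

def is_stable(C):
    """All characteristic values have negative real parts."""
    return bool(np.all(np.linalg.eigvals(C).real < 0))

# Non-negative off-diagonal entries in both examples.
C1 = np.array([[-3.0, 1.0, 0.5],
               [1.0, -2.0, 0.5],
               [0.5, 1.0, -1.0]])
C2 = np.array([[-0.5, 1.0],
               [1.0, -0.5]])

assert is_stable(C1) and condition_66(C1)
assert not is_stable(C2) and not condition_66(C2)
```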

§ 4. The Normal Form of a Reducible Matrix

1. We consider an arbitrary reducible matrix A = ‖ a_ik ‖₁ⁿ. By means of a permutation we can put it into the form


A = ( B  0 )
    ( C  D ),   (67)

where B and D are square matrices.

If one of the matrices B or D is reducible, then it can also be represented in a form similar to (67), so that A then assumes the form

A = ( K  0  0 )
    ( H  L  0 )
    ( F  G  M ).

If one of the matrices K, L, M is reducible, then the process can be continued. Finally, by a suitable permutation we can reduce A to triangular block form

A = ( A₁₁   0    ...   0  )
    ( A₂₁   A₂₂  ...   0  )
    ( ....................... )
    ( A_{s1}  A_{s2}  ...  A_{ss} ),   (68)

where the diagonal blocks A₁₁, A₂₂, ..., A_{ss} are square irreducible matrices.

A diagonal block A_ii (1 ≤ i ≤ s) is called isolated if

A_ik = 0   (k = 1, 2, ..., i − 1, i + 1, ..., s).

By a permutation of the blocks (see p. 50) in (68) we can put all the isolated blocks in the first places along the main diagonal, so that A then assumes the form

A = ( A₁    0    ...   0      0         ...   0   )
    ( 0     A₂   ...   0      0         ...   0   )
    ( ............................................. )
    ( 0     0    ...   A_g    0         ...   0   )
    ( A_{g+1,1}  A_{g+1,2}  ...  A_{g+1,g}  A_{g+1}  ...  0 )
    ( ............................................. )
    ( A_{s1}  A_{s2}  ...  A_{sg}  A_{s,g+1}  ...  A_s );   (69)

here A₁, A₂, ..., A_s are irreducible matrices, and in each row

A_{f1}, A_{f2}, ..., A_{f,f−1}   (f = g + 1, ..., s)

at least one matrix is different from zero.

We shall call the matrix (69) the normal form of the reducible matrix A.
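Bringing a reducible matrix to the form (67) is a simultaneous permutation of rows and columns; the following minimal sketch (with a hypothetical 3×3 reducible matrix) exhibits such a permutation explicitly:

```python
import numpy as np

# Hypothetical reducible matrix: states 2 and 3 never lead to state 1.
A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [0.0, 5.0, 1.0]])

# Reordering the indices as (2, 3, 1) produces the form (67):
# an irreducible block B, a zero block in the upper right corner,
# the coupling block C below it, and D in the lower right corner.
perm = [1, 2, 0]
A_perm = A[np.ix_(perm, perm)]

B = A_perm[:2, :2]                    # [[1, 4], [5, 1]], irreducible
assert np.all(A_perm[:2, 2:] == 0)    # the zero block of (67)
```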


Let us show that the normal form of a matrix A is uniquely determined to within a permutation of the blocks and permutations within the diagonal blocks (the same for rows and columns).²⁹ For this purpose we consider the operator A corresponding to A in an n-dimensional vector space R. To the representation of A in the form (69) there corresponds a decomposition of R into coordinate subspaces

R = R₁ + R₂ + ... + R_g + R_{g+1} + ... + R_s;   (70)

here R_s, R_{s−1} + R_s, R_{s−2} + R_{s−1} + R_s, ... are invariant coordinate subspaces for A, and there is no intermediate invariant coordinate subspace between any two adjacent ones in this sequence.

Suppose then that apart from the normal form (69) of the given matrix there is another normal form corresponding to another decomposition of R into coordinate subspaces:

R = R̄₁ + R̄₂ + ... + R̄_ḡ + R̄_{ḡ+1} + ... + R̄_t.   (71)

The uniqueness of the normal form will be proved if we can show that the decompositions (70) and (71) coincide apart from the order of the terms.

Suppose that the invariant subspace R̄_t has coordinate vectors in common with R_k, but not with R_{k+1}, ..., R_s. Then R̄_t must be entirely contained in R_k, since otherwise R̄_t would contain a 'smaller' invariant subspace, the intersection of R̄_t with R_k + R_{k+1} + ... + R_s. Moreover, R̄_t must coincide with R_k, since otherwise the invariant subspace R̄_t + R_{k+1} + ... + R_s would be intermediate between R_k + R_{k+1} + ... + R_s and R_{k+1} + ... + R_s.

Since R_k coincides with R̄_t, R_k is an invariant subspace. Therefore, without infringing the normal form of the matrix, R_k can be put in the place of R_s.

Thus, we may assume that in (70) and (71) R̄_t = R_s.

Let us now consider the coordinate subspace R̄_{t−1}. Suppose that it has coordinate vectors in common with R_l (l < s), but not with R_{l+1}, ..., R_{s−1}. Then the invariant subspace R̄_{t−1} + R_s must be entirely contained in R_l + R_{l+1} + ... + R_s, since otherwise there would be an invariant coordinate subspace intermediate between R_s and R̄_{t−1} + R_s. Therefore R̄_{t−1} ⊆ R_l. Moreover, R̄_{t−1} = R_l, since otherwise R̄_{t−1} + R_{l+1} + ... + R_s would be an invariant subspace intermediate between R_l + R_{l+1} + ... + R_s and R_{l+1} + ... + R_s. From R̄_{t−1} = R_l it follows that R_l + R_s is an invariant subspace. Therefore R_l may be put in the place of R_{s−1}, and then we have

R̄_{t−1} = R_{s−1},   R̄_t = R_s.

Continuing this process, we finally reach the conclusion that s = t and that the decompositions (70) and (71) coincide apart from the order of the terms. The corresponding normal forms then coincide to within a permutation of the blocks.

²⁹ Without violating the normal form we can permute the first g blocks arbitrarily among each other. Moreover, sometimes certain permutations among the last s − g blocks are possible with preservation of the normal form.

From the uniqueness of the normal form it follows that the numbers g and s are invariants of the non-negative matrix A.³⁰

2. Making use of the normal form, we shall now prove the following theorem:

THEOREM 6: To the maximal characteristic value r of the matrix A ≥ 0 there belongs a positive characteristic vector if and only if in the normal form (69) of A: 1) each of the matrices A₁, A₂, ..., A_g has r as a characteristic value; and (in case g < s) 2) none of the matrices A_{g+1}, ..., A_s has this property.

Proof. 1. Let z > o be a positive characteristic vector belonging to the maximal characteristic value r. In accordance with the dissection into blocks in (69) we dissect the column z into parts z^k (k = 1, 2, ..., s). Then the equation

Az = rz   (z > o)   (72)

is replaced by two systems of equations:

A_i z^i = r z^i   (i = 1, 2, ..., g),   (72′)

Σ_{h=1}^{j−1} A_{jh} z^h + A_j z^j = r z^j   (j = g + 1, ..., s).   (72″)

From (72′) it follows that r is a characteristic value of each of the matrices A₁, A₂, ..., A_g. From (72″) we find:

A_j z^j ≤ r z^j,   A_j z^j ≠ r z^j   (j = g + 1, ..., s).   (73)

We denote by r_j the maximal characteristic value of A_j (j = g + 1, ..., s). Then (see (41) on p. 65) we find from (73):

r_j ≤ r.

³⁰ For an irreducible matrix, g = s = 1.


On the other hand, the equation r_j = r would contradict the second of the relations (73) (see Note 5 on p. 65). Therefore

r_j < r   (j = g + 1, ..., s).   (74)

2. Suppose now, conversely, that the maximal characteristic values of the matrices A_i (i = 1, 2, ..., g) are equal to r and that (74) holds for the matrices A_j (j = g + 1, ..., s). Then, replacing the required equation (72) by the systems (72′) and (72″), we can define positive characteristic columns z^i of the matrices A_i (i = 1, 2, ..., g) by means of (72′). Next we find the columns z^j (j = g + 1, ..., s) from (72″):

z^j = (r E_j − A_j)⁻¹ Σ_{h=1}^{j−1} A_{jh} z^h   (j = g + 1, ..., s),   (75)

where E_j is the unit matrix of the same order as A_j (j = g + 1, ..., s). Since r_j < r (j = g + 1, ..., s), we have (see (55) on p. 69)

(r E_j − A_j)⁻¹ > 0   (j = g + 1, ..., s).   (76)

Let us prove by induction that the columns z^{g+1}, ..., z^s defined by (75) are positive. We shall show that for every j (g + 1 ≤ j ≤ s) the fact that z^1, z^2, ..., z^{j−1} are positive implies that z^j > o. Indeed, in this case,

Σ_{h=1}^{j−1} A_{jh} z^h ≥ o,   Σ_{h=1}^{j−1} A_{jh} z^h ≠ o,

which in conjunction with (76) yields, by (75),

z^j > o.

Thus, the positive column z = (z^1, ..., z^s) is a characteristic vector of A for the characteristic value r. This completes the proof of the theorem.
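Theorem 6 can be illustrated numerically. In the hypothetical example below the matrix is already in the normal form (69) with g = 1 and s = 2: the isolated irreducible block has maximal characteristic value 2, while the coupled block has maximal characteristic value 1 < 2, so a positive characteristic vector for r = 2 must exist:

```python
import numpy as np

# Normal form (69) with A1 = [[0,2],[2,0]] (Perron root 2, isolated)
# and A2 = [1] (Perron root 1), coupled to A1 through A21 = [1, 0].
A = np.array([[0.0, 2.0, 0.0],
              [2.0, 0.0, 0.0],
              [1.0, 0.0, 1.0]])

vals, vecs = np.linalg.eig(A)
k = int(np.argmax(vals.real))
r = vals[k].real
v = vecs[:, k].real
if v[0] < 0:
    v = -v                      # fix the overall sign

assert abs(r - 2.0) < 1e-9
assert np.all(v > 0)            # the characteristic vector is positive
```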

3. The following theorem gives a characterization of a matrix A ≥ 0 which together with its transpose Aᵀ has the property that a positive characteristic vector belongs to the maximal characteristic value.

THEOREM 7:³¹ To the maximal characteristic value r of a matrix A ≥ 0 there belongs a positive characteristic vector both of A and of Aᵀ if and only if A can be represented by a permutation in quasi-diagonal form

A = (A₁, A₂, ..., A_s),   (77)

where A₁, A₂, ..., A_s are irreducible matrices each of which has r as its maximal characteristic value.

³¹ See [166].


Proof. Suppose that A and Aᵀ have positive characteristic vectors for λ = r. Then, by Theorem 6, A is representable in the normal form (69), where A₁, A₂, ..., A_g have r as maximal characteristic value and (for g < s) the maximal characteristic values of A_{g+1}, ..., A_s are less than r. Then

Aᵀ = ( A₁ᵀ  ...  0     A_{g+1,1}ᵀ  ...  A_{s1}ᵀ    )
     ( ........................................... )
     ( 0    ...  A_gᵀ  A_{g+1,g}ᵀ  ...  A_{sg}ᵀ    )
     ( 0    ...  0     A_{g+1}ᵀ    ...  A_{s,g+1}ᵀ )
     ( ........................................... )
     ( 0    ...  0     0           ...  A_sᵀ       ).

Let us reverse here the order of the blocks in this matrix:

( A_sᵀ        0           ...  0   )
( A_{s,s−1}ᵀ  A_{s−1}ᵀ    ...  0   )
( ................................. )
( A_{s1}ᵀ     A_{s−1,1}ᵀ  ...  A₁ᵀ ).   (78)

Since A_sᵀ, A_{s−1}ᵀ, ..., A₁ᵀ are irreducible, we obtain a normal form for (78) by a permutation of the blocks, placing the isolated blocks first along the main diagonal. One of these isolated blocks is A_sᵀ. Since the normal form of Aᵀ must satisfy the conditions of the preceding theorem, the maximal characteristic value of A_sᵀ must be equal to r. This is only possible when g = s. But then the normal form (69) goes over into (77).

If, conversely, a representation (77) of A is given, then

Aᵀ = (A₁ᵀ, A₂ᵀ, ..., A_sᵀ).   (79)

We then deduce from (77) and (79), by the preceding theorem, that A and Aᵀ have positive characteristic vectors for the maximal characteristic value r.

This proves the theorem.

COROLLARY: If the maximal characteristic value r of a matrix A ≥ 0 is simple and if positive characteristic vectors belong to r both in A and in Aᵀ, then A is irreducible.


Since, conversely, every irreducible matrix has the properties of this corollary, these properties provide a spectral characterization of an irreducible non-negative matrix.

§ 5. Primitive and Imprimitive Matrices

1. We begin with a classification of irreducible matrices.

DEFINITION 3: If an irreducible matrix A ≥ 0 has h characteristic values λ₁, λ₂, ..., λ_h of maximal modulus r (|λ₁| = |λ₂| = ... = |λ_h| = r), then A is called primitive if h = 1 and imprimitive if h > 1. h is called the index of imprimitivity of A.

The index of imprimitivity h is easily determined if the coefficients of the characteristic equation of the matrix are known:

Δ(λ) = λⁿ + a₁ λ^{n₁} + a₂ λ^{n₂} + ... + a_t λ^{n_t} = 0
(n > n₁ > ... > n_t;  a₁ ≠ 0, a₂ ≠ 0, ..., a_t ≠ 0);

namely: h is the greatest common divisor of the differences

n − n₁, n₁ − n₂, ..., n_{t−1} − n_t.   (80)

For by Frobenius' theorem the spectrum of A in the complex λ-plane goes over into itself under a rotation through 2π/h around the point λ = 0. Therefore the polynomial Δ(λ) must be obtained from some polynomial g(μ) by the formula

Δ(λ) = g(λ^h) λ^{n_t}.

Hence it follows that h is a common divisor of the differences (80). But then h is the greatest common divisor d of these differences, since the spectrum does not change under a rotation by 2π/d, which is impossible for h < d.

The following theorem establishes an important property of a primitive matrix:

THEOREM 8: A matrix A ≥ 0 is primitive if and only if some power of A is positive:

A^p > 0   (p ≥ 1).   (81)

Proof. If A^p > 0, then A is irreducible, since the reducibility of A would imply that of A^p. Moreover, for A we have h = 1, since otherwise the positive matrix A^p would have h (> 1) characteristic values

λ₁^p, λ₂^p, ..., λ_h^p

of maximal modulus r^p, and this contradicts Perron's theorem.


Suppose now, conversely, that A is primitive. We apply the formula (23) of Chapter V (Vol. I, p. 107) to A^p:

A^p = Σ_{k=1}^{s} (1 / (m_k − 1)!) { [ C(λ) λ^p / ψ_k(λ) ]^{(m_k − 1)} }_{λ=λ_k},   (82)

where

ψ(λ) = (λ − λ₁)^{m₁} (λ − λ₂)^{m₂} ... (λ − λ_s)^{m_s}   (λ_j ≠ λ_k for j ≠ k)

is the minimal polynomial of A, ψ_k(λ) = ψ(λ) / (λ − λ_k)^{m_k} (k = 1, 2, ..., s), and C(λ) = (λE − A)⁻¹ ψ(λ) is the reduced adjoint matrix.

In this case we can set

λ₁ = r > |λ₂| ≥ ... ≥ |λ_s|   and   m₁ = 1.   (83)

Then (82) assumes the form

A^p = (C(r) / ψ′(r)) r^p + Σ_{k=2}^{s} (1 / (m_k − 1)!) { [ C(λ) λ^p / ψ_k(λ) ]^{(m_k − 1)} }_{λ=λ_k}.

Hence it is easy to deduce by (83) that

lim_{p→∞} A^p / r^p = C(r) / ψ′(r).   (84)

On the other hand, C(r) > 0 (see (53)) and ψ′(r) > 0 by (83). Therefore

lim_{p→∞} A^p / r^p > 0,

and so (81) must hold from some p onwards.³² This completes the proof.

We shall now prove the following theorem:

THEOREM 9: If A ≥ 0 is an irreducible matrix and some power A^q of A is reducible, then A^q is completely reducible, i.e., A^q can be represented by means of a permutation in the form

A^q = (A₁, A₂, ..., A_d),   (85)

where A₁, A₂, ..., A_d are irreducible matrices having one and the same maximal characteristic value. Here d is the greatest common divisor of q and h, where h is the index of imprimitivity of A.

³² As regards a lower bound for the exponent p in (81), see [384].


Proof. Since A is irreducible, we know by Frobenius' theorem that positive characteristic vectors belong to the maximal characteristic value r, both in A and in Aᵀ. But then these positive vectors are also characteristic vectors of the non-negative matrices A^q and (A^q)ᵀ for the characteristic value λ = r^q. Therefore, by applying Theorem 7 to A^q, we represent this matrix (after a suitable permutation) in the form (85), where A₁, A₂, ..., A_d are irreducible matrices with the same maximal characteristic value r^q. But A has h characteristic values of maximal modulus r:

r, rε, ..., rε^{h−1}   (ε = e^{2πi/h}).

Therefore A^q also has h characteristic values of maximal modulus:

r^q, r^q ε^q, ..., r^q ε^{q(h−1)},

among which d are equal to r^q. This is only possible when d is the greatest common divisor of q and h. This proves the theorem.

For h = 1, we obtain:

COROLLARY 1: A power of a primitive matrix is irreducible and primitive.

If we set q = h in the theorem, then we obtain:

COROLLARY 2: If A is an imprimitive matrix with index of imprimitivity h, then A^h splits into h primitive matrices with the same maximal characteristic value.

§ 6. Stochastic Matrices

1. We consider n possible states of a certain system,

S₁, S₂, ..., S_n,   (86)

and a sequence of instants

t₀, t₁, t₂, ... .

Suppose that at each of these instants the system is in one and only one of the states (86) and that p_ij denotes the probability of finding the system in the state S_j at the instant t_k if it is known that at the preceding instant t_{k−1} the system was in the state S_i (i, j = 1, 2, ..., n; k = 1, 2, ...). We shall assume that the transition probability p_ij (i, j = 1, 2, ..., n) does not depend on the index k (of the instant t_k).

If the matrix of transition probabilities is given,

P = ‖ p_ij ‖₁ⁿ,


then we say that we have a homogeneous Markov chain with a finite number of states.³³ It is obvious that

p_ij ≥ 0 (i, j = 1, 2, ..., n),   Σ_{j=1}^{n} p_ij = 1 (i = 1, 2, ..., n).   (87)

DEFINITION 4: A square matrix P = ‖ p_ij ‖₁ⁿ is called stochastic if P is non-negative and if the sum of the elements of each row of P is 1, i.e., if the relations (87) hold.³⁴

Thus, for every homogeneous Markov chain the matrix of transition probabilities is stochastic and, conversely, every stochastic matrix can be regarded as the matrix of transition probabilities of some homogeneous Markov chain. This is the basis of the matrix method of investigating homogeneous Markov chains.³⁵

A stochastic matrix is a special form of a non-negative matrix. Therefore all the concepts and propositions of the preceding sections are applicable to it.

We mention some specific properties of a stochastic matrix. From the definition of a stochastic matrix it follows that it has the characteristic value 1 with the positive characteristic vector z = (1, 1, ..., 1). It is easy to see that, conversely, every matrix P ≥ 0 having the characteristic vector (1, 1, ..., 1) for the characteristic value 1 is stochastic. Moreover, 1 is the maximal characteristic value of a stochastic matrix, since the maximal characteristic value is always included between the largest and the smallest of the row sums³⁶ and in a stochastic matrix all the row sums are 1. Thus, we have proved the proposition:

1. A non-negative matrix P ≥ 0 is stochastic if and only if it has the characteristic vector (1, 1, ..., 1) for the characteristic value 1. For a stochastic matrix the maximal characteristic value is 1.

Now let A = ‖ a_ik ‖₁ⁿ be a non-negative matrix with a positive maximal characteristic value r > 0 and a corresponding positive characteristic vector z = (z₁, z₂, ..., z_n) > o:

³³ See [212] and [46], pp. 9-12.

³⁴ Sometimes the additional condition Σ_{i=1}^{n} p_ij ≠ 0 (j = 1, 2, ..., n) is included in the definition of a stochastic matrix. See [46], p. 13.

³⁵ The theory of homogeneous Markov chains with a finite (and a countable) number of states was introduced by Kolmogorov (see [212]). The reader can find an account of the later introduction and development of the matrix method with applications to homogeneous Markov chains in the memoir [329] and in the monograph [46] by V. I. Romanovskii (see also [4], Appendix 5).

³⁶ See (37) and the note on p. 68.


Σ_{j=1}^{n} a_ij z_j = r z_i   (i = 1, 2, ..., n).   (88)

We introduce the diagonal matrix Z = (z₁, z₂, ..., z_n) and the matrix P = ‖ p_ij ‖₁ⁿ:

P = (1/r) Z⁻¹ A Z.

Then

p_ij = (1/r) z_i⁻¹ a_ij z_j ≥ 0   (i, j = 1, 2, ..., n),

and by (88)

Σ_{j=1}^{n} p_ij = 1   (i = 1, 2, ..., n).

Thus:

2. A non-negative matrix A ≥ 0 with a positive maximal characteristic value r > 0 and a corresponding positive characteristic vector z = (z₁, z₂, ..., z_n) > o is similar to the product of r and a stochastic matrix:³⁷

A = Z rP Z⁻¹   (Z = (z₁, z₂, ..., z_n) > 0).   (89)

In a preceding section we have given (see Theorem 6, § 4) a characterization of the class of non-negative matrices having a positive characteristic vector for λ = r. The formula (89) establishes a close connection between this class and the class of stochastic matrices.
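The passage (89) from a non-negative matrix with positive Perron vector to a stochastic matrix is directly computable. In the sketch below the 2×2 matrix is a hypothetical example with r = 4 and Perron vector (2, 3):

```python
import numpy as np

# Hypothetical non-negative matrix; its Perron root is r = 4
# with positive characteristic vector z = (2, 3).
A = np.array([[1.0, 2.0],
              [3.0, 2.0]])

vals, vecs = np.linalg.eig(A)
k = int(np.argmax(vals.real))
r = vals[k].real
z = vecs[:, k].real
if z[0] < 0:
    z = -z                      # make the Perron vector positive

Z = np.diag(z)
P = np.linalg.inv(Z) @ A @ Z / r    # P = (1/r) Z^{-1} A Z, as in (89)

assert np.all(P >= -1e-12)               # P is non-negative ...
assert np.allclose(P.sum(axis=1), 1.0)   # ... and stochastic
```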

2. We shall now prove the following theorem:

THEOREM 10: To the characteristic value 1 of a stochastic matrix there always correspond only elementary divisors of the first degree.

Proof. We apply the decomposition (69) of § 4 to the stochastic matrix P = ‖ p_ij ‖₁ⁿ:

P = ( A₁    0    ...   0      0         ...   0   )
    ( 0     A₂   ...   0      0         ...   0   )
    ( ............................................. )
    ( 0     0    ...   A_g    0         ...   0   )
    ( A_{g+1,1}  ...  A_{g+1,g}  A_{g+1}  ...   0   )
    ( ............................................. )
    ( A_{s1}     ...  A_{sg}     A_{s,g+1}  ...  A_s ),

where A₁, A₂, ..., A_s are irreducible and

³⁷ Proposition 2. also holds for r = 0, since A ≥ 0, z > o implies that A = 0.


A_{j1} + A_{j2} + ... + A_{j,j−1} ≠ 0   (j = g + 1, ..., s).

Here A₁, A₂, ..., A_g are stochastic matrices, so that each has the simple characteristic value 1. As regards the remaining irreducible matrices A_{g+1}, ..., A_s, by the Remark 2 on p. 63 their maximal characteristic values are less than 1, since in each of these matrices at least one row sum is less than 1.³⁸

Thus, the matrix P is representable in the form

P = ( Q₁  0  )
    ( S   Q₂ ),

where to the characteristic value 1 of Q₁ there correspond only elementary divisors of the first degree, and where 1 is not a characteristic value of Q₂. The theorem now follows immediately from the following lemma:

LEMMA 4: If a matrix A has the form

A = ( Q₁  0  )
    ( S   Q₂ ),   (90)

where Q₁ and Q₂ are square matrices, and if the characteristic value λ₀ of A is also a characteristic value of Q₁, but not of Q₂,

| Q₁ − λ₀E | = 0,   | Q₂ − λ₀E | ≠ 0,

then the elementary divisors of A and Q₁ corresponding to the characteristic value λ₀ are the same.

Proof. 1. To begin with, we consider the case where Q₁ and Q₂ do not have characteristic values in common. Let us show that in this case the elementary divisors of Q₁ and Q₂ together form the system of elementary divisors of A, i.e., for some matrix T (| T | ≠ 0)

T A T⁻¹ = ( Q₁  0  )
          ( 0   Q₂ ).   (91)

We shall look for the matrix T in the form

T = ( E₁  0  )
    ( U   E₂ )

³⁸ These properties of the matrices A₁, ..., A_s also follow from Theorem 6.


(the dissection of T into blocks corresponds to that of A; E₁ and E₂ are unit matrices). Then

T A T⁻¹ = ( E₁  0  ) ( Q₁  0  ) ( E₁   0  )  =  ( Q₁               0  )
          ( U   E₂ ) ( S   Q₂ ) ( −U   E₂ )     ( UQ₁ − Q₂U + S    Q₂ ).   (91′)

The equation (91′) reduces to (91) if we choose the rectangular matrix U so that it satisfies the matrix equation

Q₂U − UQ₁ = S.

If Q₁ and Q₂ have no characteristic values in common, then this equation always has a unique solution U for every right-hand side S (see Vol. I, Chapter VIII, § 3).

2. In the case where Q₁ and Q₂ have characteristic values in common, we replace Q₁ in (90) by its Jordan form J (as a result, A is replaced by a similar matrix). Let J = (J₁, J₂), where all the Jordan blocks with the characteristic value λ₀ are combined in J₁. Then A is replaced by the similar matrix

( J₁  0   0  )
( 0   J₂  0  )
( S₁  S₂  Q₂ ).

This matrix falls under the preceding case, with J₁ in the role of Q₁ and

( J₂  0  )
( S₂  Q₂ )

in the role of Q₂, since J₁ and this latter matrix have no characteristic values in common. Hence it follows that the elementary divisors of the form (λ − λ₀)^p are the same for A and J₁, and therefore also for A and Q₁. This proves the lemma.

If an irreducible stochastic matrix P has a complex characteristic value λ₀ with | λ₀ | = 1, then λ₀P is similar to P (see (16)), and so it follows from Theorem 10 that to λ₀ there correspond only elementary divisors of the first degree. With the help of the normal form and of Lemma 4 it is easy to extend this statement to reducible stochastic matrices. Thus we obtain:

COROLLARY 1: If λ₀ is a characteristic value of a stochastic matrix P and | λ₀ | = 1, then the elementary divisors corresponding to λ₀ are of the first degree.

From Theorem 10 we also deduce, by 2. (p. 84):

COROLLARY 2: If a positive characteristic vector belongs to the maximal characteristic value r of a non-negative matrix A, then all the elementary divisors of A that belong to a characteristic value λ₀ with | λ₀ | = r are of the first degree.


We shall now mention some papers that deal with the distribution of the characteristic values of stochastic matrices.

A characteristic value of a stochastic matrix P always lies in the disc |λ| ≤ 1 of the λ-plane. The set of all points of this disc that are characteristic values of stochastic matrices of order n will be denoted by M_n. In 1938, in connection with investigations on Markov chains, A. N. Kolmogorov raised the problem of determining the structure of the domain M_n. This problem was partially solved in 1945 by N. A. Dmitriev and E. B. Dynkin [133], [133a] and completely in 1951 in a paper by F. I. Karpelevich [209]. It turned out that the boundary of M_n consists of a finite number of points on the circle |λ| = 1 and certain curvilinear arcs joining these points in cyclic order.

We note that by Proposition 2. (p. 84) the characteristic values of the matrices A = ||a_ik||_1^n ≥ 0 having a positive characteristic vector for λ = r with a fixed r form the set rM_n.^39 Since every matrix A = ||a_ik||_1^n ≥ 0 can be regarded as the limit of a sequence of non-negative matrices of that type and the set rM_n is closed, the characteristic values of arbitrary matrices A = ||a_ik||_1^n ≥ 0 with a given maximal characteristic value r fill out the set rM_n.^40

A paper by H. R. Suleimanova [359] is relevant in this context; it contains sufficiency criteria for n given real numbers λ_1, λ_2, ..., λ_n to be the characteristic values of a stochastic matrix P = ||p_ij||_1^n.^41

§ 7. Limiting Probabilities for a Homogeneous Markov Chain with a Finite Number of States

1. Let

    S_1, S_2, ..., S_n

be all the possible states of a system in a homogeneous Markov chain and let P = ||p_ij||_1^n be the stochastic matrix determined by this chain that is formed from the transition probabilities p_ij (i, j = 1, 2, ..., n) (see p. 82).

We denote by p_ij^{(q)} the probability of finding the system in the state S_j at the instant t_k if it is known that at the instant t_{k-q} it was in the state S_i.

Clearly, p_ij^{(q)} ≥ 0 (i, j = 1, 2, ..., n; q = 1, 2, ...) and p_ij^{(1)} = p_ij (i, j = 1, 2, ..., n).

39 rM_n is the set of points in the λ-plane of the form rμ, where μ ∈ M_n.

40 Kolmogorov has shown (see [133a (1946)], Appendix) that this problem for an arbitrary matrix A ≥ 0 can be reduced to the analogous problem for a stochastic matrix.

41 See also [312].


Making use of the theorems on the addition and multiplication of probabilities, we find easily:

    p_ij^{(q+1)} = \sum_{h=1}^{n} p_ih^{(q)} p_hj    (i, j = 1, 2, ..., n)

or, in matrix notation,

    ||p_ij^{(q+1)}||_1^n = ||p_ij^{(q)}||_1^n · P.

Hence, by giving to q in succession the values 1, 2, ..., we obtain the important formula^42

    ||p_ij^{(q)}||_1^n = P^q    (q = 1, 2, ...).

If the limits

    p_ij^∞ = \lim_{q→∞} p_ij^{(q)}    (i, j = 1, 2, ..., n)

or, in matrix notation,

    \lim_{q→∞} P^q = P^∞ = ||p_ij^∞||_1^n

exist, then the values p_ij^∞ (i, j = 1, 2, ..., n) are called the limiting or final transition probabilities.^43
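The formula ||p_ij^{(q)}|| = P^q says that the q-step transition probabilities are obtained simply by raising P to the q-th power. A minimal numerical illustration (the matrix P below is an arbitrary example of ours, not from the text):

```python
import numpy as np

# An illustrative regular stochastic matrix (rows sum to 1).
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])

def q_step(P, q):
    """q-step transition probabilities: the formula ||p_ij^(q)|| = P^q."""
    return np.linalg.matrix_power(P, q)

P2 = q_step(P, 2)       # two-step transition probabilities
P200 = q_step(P, 200)   # already very close to the limit P^infinity
```

For this P the rows of P^q converge to a common row, i.e., the limiting transition probabilities exist and do not depend on the initial state.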

In order to investigate under what conditions limiting transition probabilities exist and to derive the corresponding formulas, we introduce the following terminology.

We shall call a stochastic matrix P and the corresponding homogeneous Markov chain regular if P has no characteristic values of modulus 1 other than 1 itself, and fully regular if, in addition, 1 is a simple root of the characteristic equation of P.

A regular matrix P is characterized by the fact that in its normal form (69) (p. 75) the matrices A_1, A_2, ..., A_g are primitive. For a fully regular matrix we have, in addition, g = 1.

Furthermore, a homogeneous Markov chain is irreducible, reducible, acyclic, or cyclic if the stochastic matrix P of the chain is irreducible, reducible, primitive, or imprimitive, respectively. Just as a primitive stochastic matrix is a special form of a regular matrix, so an acyclic Markov chain is a special form of a regular chain.

We shall prove that: Limiting transition probabilities exist for regularhomogeneous Markov chains only.

42 It follows from this formula that the probabilities p_ij^{(q)} as well as p_ij (i, j = 1, 2, ..., n; q = 1, 2, ...) do not depend on the index k of the original instant t_k.

43 The matrix P^∞, as the limit of stochastic matrices, is itself stochastic.


For let ψ(λ) be the minimal polynomial of the regular matrix P = ||p_ij||_1^n. Then

    ψ(λ) = (λ - λ_1)^{m_1} (λ - λ_2)^{m_2} \cdots (λ - λ_u)^{m_u}    (λ_j ≠ λ_k for j ≠ k; j, k = 1, 2, ..., u).    (92)

By Theorem 10 we may assume that

    λ_1 = 1,    m_1 = 1.

By the formula (23) of Chapter V (Vol. I, p. 107),

    P^q = \sum_{k=1}^{u} \frac{1}{(m_k - 1)!} \left[ \frac{λ^q C(λ)}{ψ_k(λ)} \right]^{(m_k - 1)}_{λ = λ_k},

where C(λ) = (λE - P)^{-1} ψ(λ) is the reduced adjoint matrix and

    ψ_k(λ) = \frac{ψ(λ)}{(λ - λ_k)^{m_k}}    (k = 1, 2, ..., u);    (93)

moreover,

    ψ_1(λ) = \frac{ψ(λ)}{λ - 1}    and    ψ_1(1) = ψ'(1).

Since λ_1 = 1 and m_1 = 1, the term with k = 1 equals C(1)/ψ'(1), so that

    P^q = \frac{C(1)}{ψ'(1)} + \sum_{k=2}^{u} \frac{1}{(m_k - 1)!} \left[ \frac{λ^q C(λ)}{ψ_k(λ)} \right]^{(m_k - 1)}_{λ = λ_k}.    (94)

If P is a regular matrix, then

    |λ_k| < 1    (k = 2, 3, ..., u),

and therefore all the terms on the right-hand side of (94), except the first, tend to zero for q → ∞. Therefore, for a regular matrix P the matrix P^∞ formed from the limiting transition probabilities exists, and

    P^∞ = \frac{C(1)}{ψ'(1)}.    (95)

The converse proposition is obvious. If the limit

    P^∞ = \lim_{q→∞} P^q    (96)

exists, then the matrix P cannot have any characteristic value λ_k for which λ_k ≠ 1 and |λ_k| = 1, since then the limit \lim_{q→∞} λ_k^q would not exist. (This limit must exist, since the limit (96) exists.)

We have proved that the matrix P^∞ exists for a regular homogeneous Markov chain (and for such a regular chain only). This matrix is determined by (95).


We shall now show that P^∞ can be expressed by the characteristic polynomial

    Δ(λ) = (λ - λ_1)^{n_1} (λ - λ_2)^{n_2} \cdots (λ - λ_u)^{n_u}    (97)

and the adjoint matrix B(λ) = (λE - P)^{-1} Δ(λ).

From the identity

    \frac{B(λ)}{Δ(λ)} = \frac{C(λ)}{ψ(λ)}

it follows by (92), (93), and (97) that

    \frac{n_1 B^{(n_1 - 1)}(1)}{Δ^{(n_1)}(1)} = \frac{C(1)}{ψ'(1)}.

Therefore (95) may be replaced by the formula

    P^∞ = \frac{n_1 B^{(n_1 - 1)}(1)}{Δ^{(n_1)}(1)}.    (98)

For a fully regular Markov chain, inasmuch as it is a special form of a regular chain, the matrix P^∞ exists and is determined by (95) or (98). In this case n_1 = 1, and (98) assumes the form

    P^∞ = \frac{B(1)}{Δ'(1)}.    (99)
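Formula (99) can be checked numerically. B(1) is the classical adjoint (adjugate) of E − P, and Δ'(1) equals the trace of that adjugate, since the derivative of det(λE − P) with respect to λ is the trace of adj(λE − P). The sketch below relies on these two facts; the helper names are ours:

```python
import numpy as np
from itertools import product

def adjugate(M):
    """Classical adjoint B of M: B[i, j] = (-1)**(i+j) times the minor
    of M with row j and column i deleted, so that M @ B = det(M) * I."""
    n = M.shape[0]
    B = np.empty((n, n))
    for i, j in product(range(n), range(n)):
        minor = np.delete(np.delete(M, j, axis=0), i, axis=1)
        B[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return B

def limit_matrix(P):
    """P^infinity = B(1) / Delta'(1) for a fully regular stochastic P,
    using B(1) = adj(E - P) and Delta'(1) = trace(B(1))."""
    B1 = adjugate(np.eye(P.shape[0]) - P)
    return B1 / np.trace(B1)
```

For a fully regular P this agrees with a large power of P, which is the statement of (96) and (99) together.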

2. Let us consider a regular chain of general type (not fully regular). We write the corresponding matrix P in the normal form

    P = \begin{pmatrix} Q_1 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & Q_g & 0 & \cdots & 0 \\ U_{g+1,1} & \cdots & U_{g+1,g} & Q_{g+1} & \cdots & 0 \\ \vdots & & \vdots & \vdots & & \vdots \\ U_{s1} & \cdots & U_{sg} & U_{s,g+1} & \cdots & Q_s \end{pmatrix}    (100)


where Q_1, ..., Q_g are primitive stochastic matrices and the maximal characteristic values of the irreducible matrices Q_{g+1}, ..., Q_s are less than 1. Setting

    Q = \begin{pmatrix} Q_1 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & Q_g \end{pmatrix},    U = \begin{pmatrix} U_{g+1,1} & \cdots & U_{g+1,g} \\ \vdots & & \vdots \\ U_{s1} & \cdots & U_{sg} \end{pmatrix},    W = \begin{pmatrix} Q_{g+1} & \cdots & 0 \\ \vdots & & \vdots \\ U_{s,g+1} & \cdots & Q_s \end{pmatrix},

we write P in the form

    P = \begin{pmatrix} Q & 0 \\ U & W \end{pmatrix}.    (101)

Then

    P^q = \begin{pmatrix} Q^q & 0 \\ U_q & W^q \end{pmatrix}    (q = 1, 2, ...),

where U_q is a certain rectangular matrix. But W^∞ = \lim_{q→∞} W^q = 0, because all the characteristic values of W are of modulus less than 1. Therefore

    P^∞ = \begin{pmatrix} Q_1^∞ & \cdots & 0 & 0 \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & Q_g^∞ & 0 \\ & U_∞ & & 0 \end{pmatrix}.    (102)

Since Q_1, ..., Q_g are primitive stochastic matrices, the matrices Q_1^∞, ..., Q_g^∞ are positive, by (99) and (35) (p. 62):


    Q_1^∞ > 0, ..., Q_g^∞ > 0,

and in each of these matrices all the elements belonging to any one column are equal:

    Q_h^∞ = \| q_k^{(h)} \|    (the element q_{ik}^{(h)} = q_k^{(h)} does not depend on the row index i; h = 1, 2, ..., g).

We note that the states S_1, S_2, ..., S_n of the system fall into groups corresponding to the normal form (100) of P:

    Σ_1, Σ_2, ..., Σ_g, Σ_{g+1}, ..., Σ_s.    (103)

To each group Σ in (103) there corresponds a group of rows in (100). In the terminology of Kolmogorov, the states of the system that occur in Σ_1, Σ_2, ..., Σ_g are called essential and the states that occur in the remaining groups Σ_{g+1}, ..., Σ_s non-essential.

From the form (101) of P^q it follows that in any finite number q of steps (from the instant t_{k-q} to t_k) only the following transitions of the system are possible: a) from an essential state to an essential state of the same group; b) from a non-essential state to an essential state; and c) from a non-essential state to a non-essential state of the same or a preceding group.

From the form (102) of P^∞ it follows that: A limiting transition can only lead from an arbitrary state to an essential state, i.e., the probability of transition to any non-essential state tends to zero when the number of steps q tends to infinity. The essential states are therefore sometimes also called limiting states.^44
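Which states are essential can be decided from the zero pattern of P alone: S_i is essential when every state reachable from S_i can in turn reach S_i. A brute-force sketch using a Warshall-style transitive closure (our own illustration; helper name is ours):

```python
import numpy as np

def essential_states(P):
    """Return the indices of the essential states of the chain with
    transition matrix P: state i is essential iff every state j
    reachable from i can in turn reach i (Kolmogorov's terminology)."""
    n = P.shape[0]
    reach = (P > 0) | np.eye(n, dtype=bool)
    for k in range(n):  # Warshall's transitive-closure algorithm
        reach |= reach[:, k:k + 1] & reach[k:k + 1, :]
    return [i for i in range(n)
            if all(reach[j, i] for j in range(n) if reach[i, j])]
```

In the example in the test, states S_1 and S_2 form a closed group while S_3 leaks into it, so only the first two states are essential.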

3. From (95) it follows that^45

    (E - P) P^∞ = 0.

Hence it is clear that: Every column of P^∞ is a characteristic vector of the stochastic matrix P for the characteristic value λ = 1.

For a fully regular matrix P, 1 is a simple root of the characteristic equation and (apart from scalar factors) only one characteristic vector (1, 1, ..., 1) of P belongs to it. Therefore all the elements of the j-th column of P^∞ are equal to one and the same non-negative number p_j^∞:

    p_ij^∞ = p_j^∞    (i, j = 1, 2, ..., n;  \sum_{j=1}^{n} p_j^∞ = 1).    (104)

44 See [212] and [46], pp. 37-39.

45 This formula holds for an arbitrary regular chain and can be obtained from the obvious equation P^q - P · P^{q-1} = 0 by passing to the limit q → ∞.


Thus, in a fully regular chain the limiting transition probabilities do not depend on the initial state.

Conversely, if in a regular homogeneous Markov chain the limiting transition probabilities do not depend on the initial state, i.e., if (104) holds, then obviously in the scheme (102) for P^∞ we have g = 1. But then n_1 = 1 and the chain is fully regular.

For an acyclic chain, which is a special case of a fully regular chain, P is a primitive matrix. Therefore P^q > 0 (see Theorem 8 on p. 80) for some q > 0. But then also P^∞ = P^∞ P^q > 0.^46

Conversely, it follows from P^∞ > 0 that P^q > 0 for some q > 0, and this means by Theorem 8 that P is primitive and hence that the given homogeneous Markov chain is acyclic.

We formulate these results in the following theorem:

THEOREM 11: 1. In a homogeneous Markov chain all the limiting transition probabilities exist if and only if the chain is regular. In that case the matrix P^∞ formed from the limiting transition probabilities is determined by (95) or (98).

2. In a regular homogeneous Markov chain the limiting transition probabilities are independent of the initial state if and only if the chain is fully regular. In that case the matrix P^∞ is determined by (99).

3. In a regular homogeneous Markov chain all the limiting transition probabilities are different from zero if and only if the chain is acyclic.^47

4. We now consider the columns of absolute probabilities

    p^{(k)} = (p_1^{(k)}, p_2^{(k)}, ..., p_n^{(k)})    (k = 0, 1, 2, ...),    (105)

where p_i^{(k)} is the probability of finding the system in the state S_i (i = 1, 2, ..., n; k = 0, 1, 2, ...) at the instant t_k. Making use of the theorems on the addition and multiplication of probabilities, we find:

    p_i^{(k)} = \sum_{h=1}^{n} p_h^{(k-1)} p_hi    (i = 1, 2, ..., n; k = 1, 2, ...)

or, in matrix notation,

46 This matrix equation is obtained by passing to the limit m → ∞ from the equation P^m = P^{m-q} · P^q (m > q). P^∞ is a stochastic matrix; therefore P^∞ ≥ 0 and there are non-zero elements in every row of P^∞. Hence P^∞ P^q > 0. Instead of Theorem 8 we can use here the formula (99) and the inequality (35) (p. 62).

47 Note that P^∞ > 0 implies that the chain is acyclic and therefore regular. Hence it follows automatically from P^∞ > 0 that the limiting transition probabilities do not depend on the initial state, i.e., that the formulas (104) hold.


    p^{(k)} = (P^T)^k p^{(0)}    (k = 1, 2, ...),    (106)

where P^T is the transpose of P.

All the absolute probabilities (105) can be determined from (106) if the initial probabilities p_1^{(0)}, p_2^{(0)}, ..., p_n^{(0)} and the matrix of transition probabilities P = ||p_ij||_1^n are known.

We introduce the limiting absolute probabilities

    p_i^∞ = \lim_{k→∞} p_i^{(k)}    (i = 1, 2, ..., n)

or

    p^∞ = (p_1^∞, p_2^∞, ..., p_n^∞) = \lim_{k→∞} p^{(k)}.

When we take the limit k → ∞ on both sides of (106), we obtain:

    p^∞ = (P^∞)^T p^{(0)}.    (107)

Note that the existence of the matrix of limiting transition probabilities P^∞ implies the existence of the limiting absolute probabilities p^∞ for arbitrary initial probabilities p^{(0)} = (p_1^{(0)}, p_2^{(0)}, ..., p_n^{(0)}), and vice versa.

From the formula (107) and the form (102) of P^∞ it follows that: The limiting absolute probabilities corresponding to non-essential states are zero.

Multiplying both sides of the matrix equation

    P^T (P^∞)^T = (P^∞)^T

by p^{(0)} on the right, we obtain by (107):

    P^T p^∞ = p^∞,    (108)

i.e.: The column of limiting absolute probabilities p^∞ is a characteristic vector of P^T for the characteristic value λ = 1.

If a fully regular Markov chain is given, then λ = 1 is a simple root of the characteristic equation of P^T. In this case, the column of limiting absolute probabilities p^∞ is uniquely determined by (108) (because p_j^∞ ≥ 0 (j = 1, 2, ..., n) and \sum_{j=1}^{n} p_j^∞ = 1).
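For a fully regular chain, (108) together with the normalization Σ p_j^∞ = 1 determines p^∞ from a single linear system. A sketch (our own illustration; helper name is ours):

```python
import numpy as np

def limiting_absolute_probabilities(P):
    """Solve P.T @ p = p with sum(p) = 1 for a fully regular chain.

    Because lambda = 1 is a simple root, the homogeneous system
    (P.T - E) p = 0 has a one-dimensional solution space; replacing
    one of its equations by the normalization makes p unique.
    """
    n = P.shape[0]
    A = P.T - np.eye(n)
    A[-1, :] = 1.0          # replace the last equation by sum(p) = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)
```

The result is the column of limiting absolute probabilities, independent of the initial distribution.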


Suppose that a fully regular Markov chain is given. Then it follows from (104) and (107) that:

    p_j^∞ = \sum_{h=1}^{n} p_hj^∞ p_h^{(0)} = p_j^∞ \sum_{h=1}^{n} p_h^{(0)} = p_j^∞    (j = 1, 2, ..., n),    (109)

where on the left p_j^∞ denotes the limiting absolute probability and p_hj^∞ = p_j^∞ by (104).

In this case the limiting absolute probabilities p_1^∞, p_2^∞, ..., p_n^∞ do not depend on the initial probabilities p_1^{(0)}, p_2^{(0)}, ..., p_n^{(0)}.

Conversely, p^∞ is independent of p^{(0)} on account of (107) if and only if all the rows of P^∞ are equal, i.e.,

    p_hj^∞ = p_j^∞    (h, j = 1, 2, ..., n),

so that (by Theorem 11) P is a fully regular matrix.

If P is primitive, then P^∞ > 0 and hence, by (109),

    p_j^∞ > 0    (j = 1, 2, ..., n).

Conversely, if all the p_j^∞ (j = 1, 2, ..., n) are positive and do not depend on the initial probabilities, then all the elements in every column of P^∞ are equal and by (109) P^∞ > 0, and this means by Theorem 11 that P is primitive, i.e., that the given chain is acyclic.

From these remarks it follows that Theorem 11 can also be formulated as follows:

THEOREM 11': 1. In a homogeneous Markov chain all the limiting absolute probabilities exist for arbitrary initial probabilities if and only if the chain is regular.

2. In a homogeneous Markov chain the limiting absolute probabilities exist for arbitrary initial probabilities and are independent of them if and only if the chain is fully regular.

3. In a homogeneous Markov chain positive limiting absolute probabilities exist for arbitrary initial probabilities and are independent of them if and only if the chain is acyclic.^48

5. We now consider a homogeneous Markov chain of general type with amatrix P of transition probabilities.

48 The second part of Theorem 11' is sometimes called the ergodic theorem and the first part the general quasi-ergodic theorem for homogeneous Markov chains (see [4], pp. 473 and 476).


We choose the normal form (69) for P and denote by h_1, h_2, ..., h_g the indices of imprimitivity of the matrices A_1, A_2, ..., A_g in (69). Let h be the least common multiple of the integers h_1, h_2, ..., h_g. Then the matrix P^h has no characteristic values, other than 1, of modulus 1, i.e., P^h is regular; here h is the least exponent for which P^h is regular. We shall call h the period of the given homogeneous Markov chain.

Since P^h is regular, the limit

    \lim_{q→∞} P^{hq} = (P^h)^∞

exists and hence the limits

    \lim_{q→∞} P^{hq+r} = (P^h)^∞ P^r    (r = 0, 1, ..., h-1)

also exist.

Thus, in general, the sequence of matrices

    P, P^2, P^3, ...

splits into h subsequences with the limits (P^h)^∞ P^r (r = 0, 1, ..., h-1).

When we go from the transition probabilities to the absolute probabilities by means of (106), we find that the sequence

    p^{(1)}, p^{(2)}, p^{(3)}, ...

splits into h subsequences with the limits

    \lim_{q→∞} p^{(hq+r)} = [(P^h)^∞ P^r]^T p^{(0)}    (r = 0, 1, 2, ..., h-1).

For an arbitrary homogeneous Markov chain with a finite number of states the limits of the arithmetic means always exist:

    \bar{P} = \lim_{N→∞} \frac{1}{N} \sum_{k=1}^{N} P^k    (110)

and

    \bar{p} = \lim_{N→∞} \frac{1}{N} \sum_{k=1}^{N} p^{(k)} = \bar{P}^T p^{(0)}.    (110')

Here \bar{P} = ||\bar{p}_ij||_1^n and \bar{p} = (\bar{p}_1, \bar{p}_2, ..., \bar{p}_n). The values \bar{p}_ij (i, j = 1, 2, ..., n) and \bar{p}_j (j = 1, 2, ..., n) are called the mean limiting transition probabilities and mean limiting absolute probabilities, respectively.
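Formulas (110) and (110') are easy to illustrate numerically: for a cyclic chain the powers P^k oscillate and have no limit, yet their arithmetic means converge. A small sketch (the example matrix and helper name are ours):

```python
import numpy as np

def mean_limiting_matrix(P, N=2000):
    """Approximate P-bar = (1/N) * sum_{k=1..N} P^k from formula (110)."""
    acc = np.zeros_like(P, dtype=float)
    Pk = np.eye(P.shape[0])
    for _ in range(N):
        Pk = Pk @ P
        acc += Pk
    return acc / N

# A cyclic chain with period h = 2: the powers P^k alternate between
# P and the unit matrix, yet the arithmetic means converge.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
P_bar = mean_limiting_matrix(P)
```

Here every entry of P-bar equals 1/2, and the mean limiting absolute probabilities satisfy the eigen-equation P^T p-bar = p-bar derived below.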


Since

    p^{(k)} = P^T p^{(k-1)}    (k = 1, 2, ...),

we have

    \frac{1}{N} \sum_{k=1}^{N} p^{(k)} = P^T \cdot \frac{1}{N} \sum_{k=1}^{N} p^{(k-1)},

and therefore, by (110'),

    P^T \bar{p} = \bar{p},    (111)

i.e., \bar{p} is a characteristic vector of P^T for λ = 1.

Note that by (69) and (110) we may represent \bar{P} in the form

    \bar{P} = \begin{pmatrix} \bar{A}_1 & 0 & \cdots & 0 & 0 \\ 0 & \bar{A}_2 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \bar{A}_g & 0 \\ * & * & \cdots & * & \bar{W} \end{pmatrix},

where

    \bar{A}_h = \lim_{N→∞} \frac{1}{N} \sum_{k=1}^{N} A_h^k  (h = 1, 2, ..., g),    \bar{W} = \lim_{N→∞} \frac{1}{N} \sum_{k=1}^{N} W^k,    W = \begin{pmatrix} A_{g+1} & 0 & \cdots & 0 \\ * & A_{g+2} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ * & * & \cdots & A_s \end{pmatrix}.

Since all the characteristic values of W are of modulus less than 1, we have

    \lim_{k→∞} W^k = 0,

and therefore \bar{W} = 0. Hence

    \bar{P} = \begin{pmatrix} \bar{A}_1 & 0 & \cdots & 0 & 0 \\ 0 & \bar{A}_2 & \cdots & 0 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & \bar{A}_g & 0 \\ * & * & \cdots & * & 0 \end{pmatrix}.    (112)

Since P is a stochastic matrix, the matrices \bar{A}_1, \bar{A}_2, ..., \bar{A}_g are also stochastic.


From this representation of \bar{P} and from (110') it follows that: The mean limiting absolute probabilities corresponding to non-essential states are always zero.

If g = 1 in the normal form of P, then λ = 1 is a simple characteristic value of P^T.

In this case \bar{p} is uniquely determined by (111), and the mean limiting probabilities \bar{p}_1, \bar{p}_2, ..., \bar{p}_n do not depend on the initial probabilities p_1^{(0)}, p_2^{(0)}, ..., p_n^{(0)}. Conversely, if \bar{p} does not depend on p^{(0)}, then \bar{P} is of rank 1 by (110'). But the rank of (112) can be 1 only if g = 1.

We formulate these results in the following theorem:^49

THEOREM 12: For an arbitrary homogeneous Markov chain with period h the transition matrices P^k and the columns of absolute probabilities p^{(k)} tend to a periodic repetition with period h for k → ∞; moreover, the mean limiting transition probabilities and the mean limiting absolute probabilities \bar{P} = ||\bar{p}_ij||_1^n and \bar{p} = (\bar{p}_1, \bar{p}_2, ..., \bar{p}_n) defined by (110) and (110') always exist.

The mean limiting absolute probabilities corresponding to non-essential states are always zero.

If g = 1 in the normal form of P (and only in this case), the mean limiting absolute probabilities \bar{p}_1, \bar{p}_2, ..., \bar{p}_n are independent of the initial probabilities p_1^{(0)}, p_2^{(0)}, ..., p_n^{(0)} and are uniquely determined by (111).

§ 8. Totally Non-negative Matrices

In this and the following sections we consider real matrices in which not only the elements, but also all the minors of every order are non-negative. Such matrices have important applications in the theory of small oscillations of elastic systems. The reader will find a detailed study of these matrices and their applications in the book [17]. Here we shall only deal with some of their basic properties.

1. We begin with a definition:

DEFINITION 5: A rectangular matrix

    A = ||a_ik||    (i = 1, 2, ..., m; k = 1, 2, ..., n)

is called totally non-negative (totally positive) if all its minors of any order are non-negative (positive):

    A(i_1 i_2 ... i_p / k_1 k_2 ... k_p) ≥ 0    (> 0)
    (1 ≤ i_1 < i_2 < ... < i_p ≤ m; 1 ≤ k_1 < k_2 < ... < k_p ≤ n; p = 1, 2, ..., min(m, n)).

49 This theorem is sometimes called the asymptotic theorem for homogeneous Markov chains. See [4], pp. 479-82.

In what follows we shall only consider square totally non-negative and totally positive matrices.

Example 1. The generalized Vandermonde matrix

    V = ||a_i^{α_k}||_1^n    (0 < a_1 < a_2 < ... < a_n; α_1 < α_2 < ... < α_n)

is totally positive. Let us show first that |V| ≠ 0. Indeed, from |V| = 0 it would follow that we could determine real numbers c_1, c_2, ..., c_n, not all equal to zero, such that the function

    f(x) = \sum_{k=1}^{n} c_k x^{α_k}    (α_i ≠ α_j for i ≠ j)

has the n positive zeros x_i = a_i (i = 1, 2, ..., n), where n is the number of terms in the above sum. For n = 1 this is impossible. Let us make the induction hypothesis that it is impossible for a sum of n_1 terms, where n_1 < n, and show that it is then also impossible for the given function f(x). Assume the contrary. Then by Rolle's theorem the function f_1(x) = [x^{-α_1} f(x)]' consisting of n - 1 terms would have n - 1 positive zeros, and this contradicts the induction hypothesis.

Thus, |V| ≠ 0. But for α_1 = 0, α_2 = 1, ..., α_n = n - 1 the determinant |V| goes over into the ordinary Vandermonde determinant |a_i^{k-1}|_1^n, which is positive. Since the transition from this to the generalized Vandermonde determinant can be carried out by means of a continuous change of the exponents α_1, α_2, ..., α_n with preservation of the inequalities α_1 < α_2 < ... < α_n, and since, by what we have shown, the determinant does not vanish in this process, we have |V| > 0 for arbitrary 0 < a_1 < a_2 < ... < a_n and α_1 < α_2 < ... < α_n.

Since every minor of V can be regarded as the determinant of some generalized Vandermonde matrix, all the minors of V are positive.
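Example 1 can be checked by brute force for a small matrix: every minor of every order of a generalized Vandermonde matrix should come out positive. A sketch (the nodes, exponents and helper name below are an arbitrary choice of ours):

```python
import numpy as np
from itertools import combinations

def is_totally_positive(A, tol=1e-12):
    """Brute-force check of total positivity: every minor of every
    order must be positive."""
    n = A.shape[0]
    return all(np.linalg.det(A[np.ix_(rows, cols)]) > tol
               for p in range(1, n + 1)
               for rows in combinations(range(n), p)
               for cols in combinations(range(n), p))

# Generalized Vandermonde matrix V = (a_i ** alpha_k) with
# 0 < a_1 < ... < a_n and alpha_1 < ... < alpha_n.
a = np.array([0.5, 1.0, 2.0, 3.0])
alpha = np.array([-1.0, 0.5, 1.0, 2.5])
V = a[:, None] ** alpha[None, :]
```

The check is exponential in n and is meant only as an illustration of the definition.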

Example 2. We consider a Jacobi matrix

    J = \begin{pmatrix} a_1 & b_1 & 0 & \cdots & 0 & 0 \\ c_1 & a_2 & b_2 & \cdots & 0 & 0 \\ 0 & c_2 & a_3 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & c_{n-1} & a_n \end{pmatrix}    (113)


in which all the elements are zero outside the main diagonal and the first super-diagonal and sub-diagonal. Let us set up a formula that expresses an arbitrary minor of the matrix in terms of principal minors and the elements b, c. Suppose that

    1 ≤ i_1 < i_2 < ... < i_p ≤ n,    1 ≤ k_1 < k_2 < ... < k_p ≤ n

and

    i_1 = k_1, i_2 = k_2, ..., i_ν = k_ν;    i_{ν+1} ≠ k_{ν+1}, ..., i_p ≠ k_p;

then

    J(i_1 i_2 ... i_p / k_1 k_2 ... k_p) = J(i_1 ... i_ν / k_1 ... k_ν) J(i_{ν+1} / k_{ν+1}) ... J(i_p / k_p).    (114)

This formula is a consequence of the easily verifiable equation

    J(i_1 ... i_p / k_1 ... k_p) = J(i_1 ... i_r / k_1 ... k_r) J(i_{r+1} ... i_p / k_{r+1} ... k_p)    (for i_{r+1} ≠ k_{r+1}).    (115)

From (114) it follows that every minor is the product of certain principal minors and certain elements of J. Thus: For J to be totally non-negative it is necessary and sufficient that all the principal minors and the elements b, c should be non-negative.
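The criterion just derived from (114) can be tested numerically against the definition. The sketch below (our own illustration; helper name is ours) checks b, c ≥ 0 together with the non-negativity of all principal minors, which by (114) is equivalent to total non-negativity of J:

```python
import numpy as np
from itertools import combinations

def jacobi_totally_nonnegative(a, b, c):
    """Criterion from formula (114): the Jacobi matrix J with main
    diagonal a, super-diagonal b and sub-diagonal c is totally
    non-negative iff the b's, the c's and all principal minors of J
    are non-negative."""
    if min(b) < 0 or min(c) < 0:
        return False
    J = np.diag(a) + np.diag(b, 1) + np.diag(c, -1)
    n = len(a)
    return all(np.linalg.det(J[np.ix_(idx, idx)]) >= -1e-12
               for p in range(1, n + 1)
               for idx in combinations(range(n), p))
```

The second test case below has b_1 c_1 > a_1 a_2, so a principal minor is negative and the matrix fails the criterion.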

2. A totally non-negative matrix A = ||a_ik||_1^n always satisfies the following important determinantal inequality:^50

    A(1 2 ... n) ≤ A(1 2 ... p) · A(p+1 ... n)    (p < n).    (116)
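Inequality (116) can be verified numerically on any totally non-negative matrix; here we use an ordinary Vandermonde matrix, a totally positive special case of Example 1 (the nodes are an arbitrary choice of ours):

```python
import numpy as np

def principal_minor(A, idx):
    """Principal minor of A on the (0-based) index set idx."""
    idx = list(idx)
    return np.linalg.det(A[np.ix_(idx, idx)])

# An illustrative totally positive matrix: the ordinary Vandermonde
# matrix on the nodes 1 < 2 < 3 < 4 (Example 1 with alpha_k = k - 1).
A = np.array([1.0, 2.0, 3.0, 4.0])[:, None] ** np.arange(4)[None, :]
n = A.shape[0]

# Inequality (116): A(1..n) <= A(1..p) * A(p+1..n) for every p < n.
checks = [principal_minor(A, range(n)) <=
          principal_minor(A, range(p)) * principal_minor(A, range(p, n))
          for p in range(1, n)]
```

For this matrix the full determinant equals the product of the node differences, 12, and every split dominates it.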

Before deriving this inequality, we prove the following lemma:

LEMMA 5: If in a totally non-negative matrix A = ||a_ik||_1^n any principal minor vanishes, then every principal minor 'bordering' it also vanishes.

Proof. The lemma will be proved if we can show that for a totally non-negative matrix A = ||a_ik||_1^n it follows from

50 See [172] and [17], pp. 111 ff., where it is also shown that the equality sign in (116) can hold only in the following obvious cases:

1) One of the factors on the right-hand side of (116) is zero;

2) All the elements a_ik (i = 1, 2, ..., p; k = p+1, ..., n) or all the a_ik (i = p+1, ..., n; k = 1, 2, ..., p) are zero.

The inequality (116) has the same outward form as the generalized Hadamard inequality (see (33), Vol. I, p. 255) for a positive-definite hermitian or quadratic form.


that

    A(1 2 ... q) = 0    (q < n)    (117)

implies

    A(1 2 ... n) = 0.    (118)

For this purpose we consider two cases:

1) a_11 = 0. Since a_11 a_ik - a_1k a_i1 ≥ 0 and a_1k a_i1 ≥ 0 (i, k = 2, ..., n), we have a_1k a_i1 = 0 (i, k = 2, ..., n); hence either all the a_i1 = 0 (i = 2, ..., n) or all the a_1k = 0 (k = 2, ..., n). These equations and a_11 = 0 imply (118).

2) a_11 ≠ 0. Then for some p (1 < p ≤ q)

    A(1 2 ... p-1) ≠ 0,    A(1 2 ... p-1 p) = 0.    (119)

We introduce the bordered determinants

    d_ik = A(1 2 ... p-1 i / 1 2 ... p-1 k)    (i, k = p, p+1, ..., n)    (120)

and form from them a matrix D = ||d_ik||_p^n.

By Sylvester's identity (Vol. I, Chapter II, § 3),

    D(i_1 i_2 ... i_t / k_1 k_2 ... k_t) = [A(1 2 ... p-1)]^{t-1} A(1 2 ... p-1 i_1 i_2 ... i_t / 1 2 ... p-1 k_1 k_2 ... k_t) ≥ 0    (121)

    (p ≤ i_1 < i_2 < ... < i_t ≤ n; p ≤ k_1 < k_2 < ... < k_t ≤ n; t = 1, 2, ..., n-p+1),

so that D is a totally non-negative matrix.

Since by (119)

    d_pp = A(1 2 ... p) = 0,

the matrix D falls under the case 1) and

    D(p p+1 ... n) = [A(1 2 ... p-1)]^{n-p} A(1 2 ... n) = 0.


Since A(1 2 ... p-1) ≠ 0, (118) follows, and the lemma is proved.

3. We may now assume in the derivation of the inequality (116) that all the principal minors of A are different from zero, since by Lemma 5 one of the principal minors can only be zero when |A| = 0, and in this case the inequality (116) is obvious.

For n = 2, (116) can be verified immediately:

    A(1 2) = a_11 a_22 - a_12 a_21 ≤ a_11 a_22 = A(1) A(2),

since a_12 ≥ 0, a_21 ≥ 0. We shall establish (116) for n > 2 under the assumption that it is true for matrices of order less than n. Moreover, without loss of generality, we may assume that p > 1, since otherwise by reversing the numbering of the rows and columns we could interchange the roles of p and n - p.

We now consider again the matrix D = ||d_ik||_p^n, where the d_ik (i, k = p, p+1, ..., n) are defined by (120); we use Sylvester's identity twice as well as the basic inequality (116) for matrices of order less than n and obtain

    [A(1 2 ... p-1)]^{n-p} A(1 2 ... n) = D(p p+1 ... n)
        ≤ d_pp D(p+1 ... n)
        = A(1 2 ... p) [A(1 2 ... p-1)]^{n-p-1} A(1 2 ... p-1 p+1 ... n)
        ≤ A(1 2 ... p) [A(1 2 ... p-1)]^{n-p-1} A(1 2 ... p-1) A(p+1 ... n)
        = [A(1 2 ... p-1)]^{n-p} A(1 2 ... p) A(p+1 ... n).    (122)

Dividing by [A(1 2 ... p-1)]^{n-p} ≠ 0, we obtain (116).

Thus, the inequality (116) has been established.

Let us make the following definition:

DEFINITION 6: A minor

    A(i_1 i_2 ... i_p / k_1 k_2 ... k_p)    (1 ≤ i_1 < i_2 < ... < i_p ≤ n; 1 ≤ k_1 < k_2 < ... < k_p ≤ n)    (123)

of the matrix A = ||a_ik||_1^n will be called almost principal if of the differences i_1 - k_1, i_2 - k_2, ..., i_p - k_p only one is not zero.


We can then point out that the whole derivation of (116) (and the proof of the auxiliary lemma) remains valid if the condition 'A is totally non-negative' is replaced by the weaker condition 'all the principal and almost principal minors of A are non-negative.'^51

§ 9. Oscillatory Matrices

1. The characteristic values and characteristic vectors of totally positive matrices have a number of remarkable properties. However, the class of totally positive matrices is not wide enough from the point of view of applications to small oscillations of elastic systems. In this respect, the class of totally non-negative matrices is sufficiently extensive. But the spectral properties we need do not hold for all totally non-negative matrices. Now there exists an intermediate class (between that of totally positive and that of totally non-negative matrices) in which the spectral properties of totally positive matrices are preserved and which is of sufficiently wide scope for the applications. The matrices of this intermediate class have been called 'oscillatory.' The name is due to the fact that oscillatory matrices form the mathematical apparatus for the study of oscillatory properties of small vibrations of elastic systems.^52

DEFINITION 7: A matrix A = ||a_ik||_1^n is called oscillatory if A is totally non-negative and if there exists an integer q > 0 such that A^q is totally positive.

Example. A Jacobi matrix J (see (113)) is oscillatory if and only if: 1. all the numbers b, c are positive, and 2. the successive principal minors are positive:

51 See [214]. We take this opportunity of mentioning that into the second edition of the book [17] by F. R. Gantmacher and M. G. Krein a mistake crept in, which was first pointed out to the authors by D. M. Kotelyanskii: on p. 111 of that book an almost principal minor (123) was defined by a different equation. With that definition, the inequality (116) does not follow from the fact that the principal and the almost principal minors are non-negative. However, all the statements and proofs of § 6, Chapter II in [17] that refer to the fundamental inequality remain valid if an almost principal minor is defined as above and as we have done in the paper [214].

52 See [17], Introduction, Chapter III, and Chapter IV.


    a_1 > 0,  \begin{vmatrix} a_1 & b_1 \\ c_1 & a_2 \end{vmatrix} > 0,  \begin{vmatrix} a_1 & b_1 & 0 \\ c_1 & a_2 & b_2 \\ 0 & c_2 & a_3 \end{vmatrix} > 0,  ...,  \begin{vmatrix} a_1 & b_1 & 0 & \cdots & 0 & 0 \\ c_1 & a_2 & b_2 & \cdots & 0 & 0 \\ 0 & c_2 & a_3 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & c_{n-1} & a_n \end{vmatrix} > 0.    (124)

Necessity of 1., 2. The numbers b, c are non-negative, because J ≥ 0. But none of the numbers b, c may be zero, since otherwise the matrix would be reducible and then the inequality J^q > 0 could not hold for any q > 0. Hence, all the numbers b, c are positive. All the principal minors in (124) are positive, by Lemma 5, since it follows from J ≥ 0 and |J^q| > 0 that |J| > 0.

Sufficiency of 1., 2. When we expand |J| we easily see that the numbers b, c occur in |J| only in the products b_1 c_1, b_2 c_2, ..., b_{n-1} c_{n-1}. The same applies to every principal minor of 'zero density,' i.e., a minor formed from successive rows and columns (without gaps). But every principal minor of J is a product of principal minors of zero density. Therefore: In every principal minor of J the numbers b and c occur only in the products b_1 c_1, b_2 c_2, ..., b_{n-1} c_{n-1}.

We now form the symmetric Jacobi matrix

    \tilde{J} = \begin{pmatrix} a_1 & \tilde{b}_1 & 0 & \cdots \\ \tilde{b}_1 & a_2 & \tilde{b}_2 & \cdots \\ 0 & \tilde{b}_2 & a_3 & \cdots \\ \vdots & & & \ddots \end{pmatrix},    \tilde{b}_i = \sqrt{b_i c_i} > 0    (i = 1, 2, ..., n-1).    (125)

From the above properties of the principal minors of a Jacobi matrix it follows that the corresponding principal minors of J and \tilde{J} are equal. But then (124) means that the quadratic form

    \tilde{J}(x, x)

is positive definite (see Vol. I, Chapter X, Theorem 3, p. 306). But in a positive-definite quadratic form all the principal minors are positive. Therefore in J, too, all the principal minors are positive. Since by 1. all the numbers b, c are positive, by (114) all the minors of J are non-negative; i.e., J is totally non-negative.

That a totally non-negative matrix J for which 1. and 2. are satisfied is oscillatory follows immediately from the following criterion for an oscillatory matrix.


A totally non-negative matrix A = ||a_{ik}||_1^n is oscillatory if and only if:

1) A is non-singular (|A| > 0);

2) all the elements of A in the principal diagonal and in the first super-diagonal and sub-diagonal are different from zero (a_{ik} > 0 for |i - k| <= 1).

The reader can find a proof of this proposition in [17], Chapter II, § 7.

2. In order to formulate properties of the characteristic values and characteristic vectors of oscillatory matrices, we introduce some preliminary concepts and notations.

We consider a vector (column)

    u = (u1, u2, ..., un).

Let us count the number of variations of sign in the sequence of coordinates u1, u2, ..., un of u, attributing arbitrary signs to the zero coordinates (if any such exist). Depending on what signs we give to the zero coordinates, the number of variations of sign will vary within certain limits. The maximal and minimal numbers of variations of sign so obtained will be denoted by S_u^+ and S_u^-, respectively. If S_u^- = S_u^+, we shall speak of the exact number of sign changes and denote it by S_u. Obviously S_u^- = S_u^+ if and only if 1. the extreme coordinates u1 and un of u are different from zero, and 2. u_i = 0 (1 < i < n) always implies that u_{i-1} u_{i+1} < 0.
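The quantities S_u^- and S_u^+ can be computed mechanically by enumerating all sign assignments for the zero coordinates. The following short Python function (a modern illustration, not part of the original text) does exactly that:

```python
from itertools import product

def sign_variations(u):
    """Return (S_minus, S_plus): the minimal and maximal numbers of sign
    variations in the sequence u, taken over all ways of assigning a sign
    (+1 or -1) to each zero coordinate."""
    zeros = [i for i, x in enumerate(u) if x == 0]
    counts = []
    for signs in product([1.0, -1.0], repeat=len(zeros)):
        v = list(u)
        for i, s in zip(zeros, signs):
            v[i] = s
        # count the variations of sign in the fully signed sequence
        counts.append(sum(1 for a, b in zip(v, v[1:]) if a * b < 0))
    return min(counts), max(counts)
```

For example, sign_variations([1, 0, 1]) gives (0, 2), so the count is not exact, while sign_variations([1, 0, -1]) gives (1, 1): here u2 = 0 and u1 u3 < 0, in agreement with condition 2 above.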

We shall now prove the following fundamental theorem:

THEOREM 13: 1. An oscillatory matrix A = ||a_{ik}||_1^n always has n distinct positive characteristic values

    λ1 > λ2 > ... > λn > 0.      (126)

2. The characteristic vector u^1 = (u_{11}, u_{21}, ..., u_{n1}) of A that belongs to the largest characteristic value λ1 has only non-zero coordinates of like sign; the characteristic vector u^2 = (u_{12}, u_{22}, ..., u_{n2}) that belongs to the second largest characteristic value λ2 has exactly one variation of sign in its coordinates; more generally, the characteristic vector u^k = (u_{1k}, u_{2k}, ..., u_{nk}) that belongs to the characteristic value λk has exactly k - 1 variations of sign (k = 1, 2, ..., n).

3. For arbitrary real numbers c_g, c_{g+1}, ..., c_h (1 <= g <= h <= n; Σ_{k=g}^{h} c_k^2 > 0) the number of variations of sign in the coordinates of the vector

    u = Σ_{k=g}^{h} c_k u^k      (127)

lies between g - 1 and h - 1:


    g - 1 <= S_u^- <= S_u^+ <= h - 1.      (128)

Proof. 1. We number the characteristic values λ1, λ2, ..., λn of A so that

    |λ1| >= |λ2| >= ... >= |λn|

and consider the p-th compound matrix A_p (p = 1, 2, ..., n) (see Vol. I, Chapter I, § 4). The characteristic values of A_p are all the possible products of p characteristic values of A (see Vol. I, p. 75), i.e., the products

    λ1 λ2 ... λp,   λ1 λ2 ... λ_{p-1} λ_{p+1},   ....

From the conditions of the theorem it follows that for some integer q the matrix A^q is totally positive. But then A_p >= 0, A_p^q > 0;53 i.e., A_p is non-negative, irreducible, and primitive. Applying Frobenius' theorem (see § 2, p. 40) to the primitive matrix A_p (p = 1, 2, ..., n), we obtain

    λ1 λ2 ... λp > 0                              (p = 1, 2, ..., n),
    λ1 λ2 ... λp > λ1 λ2 ... λ_{p-1} λ_{p+1}      (p = 1, 2, ..., n-1).

Hence (126) follows.

2. From the inequalities (126) it follows that A = ||a_{ik}||_1^n is a matrix of simple structure. Then all the compound matrices A_p (p = 1, 2, ..., n) are also of simple structure (see Vol. I, p. 74).

We consider the fundamental matrix U = ||u_{ik}||_1^n of A (the k-th column of U contains the coordinates of the k-th characteristic vector u^k of A; k = 1, 2, ..., n). Then (see Vol. I, Chapter III, p. 74) the characteristic vector of A_p belonging to the characteristic value λ1 λ2 ... λp has the coordinates

    U ( i1 i2 ... ip; 1 2 ... p )      (1 <= i1 < i2 < ... < ip <= n).      (129)

By Frobenius' theorem all the numbers (129) are different from zero and are of like sign. Multiplying the vectors u^1, u^2, ..., u^n, where necessary, by -1, we can make all the minors (129) positive:

    U ( i1 i2 ... ip; 1 2 ... p ) > 0      ( 1 <= i1 < i2 < ... < ip <= n;  p = 1, 2, ..., n ).      (130)

53 The matrix A_p^q is the p-th compound matrix of A^q (see Vol. I, Chapter I, p. 20).


The fundamental matrix U = ||u_{ik}||_1^n is connected with A by the equation

    A = U {λ1, λ2, ..., λn} U^{-1}.      (131)

But then

    A^T = (U^T)^{-1} {λ1, λ2, ..., λn} U^T.      (132)

Comparing (131) with (132), we see that

    V = (U^T)^{-1}      (133)

is the fundamental matrix of A^T with the same characteristic values λ1, λ2, ..., λn. But since A is oscillatory, so is A^T. Therefore in V as well, for every p = 1, 2, ..., n, all the minors

    V ( i1 i2 ... ip; 1 2 ... p )      (1 <= i1 < i2 < ... < ip <= n)      (134)

are different from zero and are of the same sign.

On the other hand, by (133), U and V are connected by the equation

    U^T V = E.

Going over to the p-th compound matrices (see Vol. I, Chapter I, § 4), we have:

    U_p^T V_p = E_p.

Hence, in particular, noting that the diagonal elements of E_p are 1, we obtain

    Σ_{1 <= i1 < i2 < ... < ip <= n} U ( i1 i2 ... ip; 1 2 ... p ) V ( i1 i2 ... ip; 1 2 ... p ) = 1.

On the left-hand side of this equation the first factor in each of the summands is positive, and the second factors are different from zero and are of like sign. It is then obvious that the second factors as well are positive; i.e.,

    V ( i1 i2 ... ip; 1 2 ... p ) > 0      ( 1 <= i1 < i2 < ... < ip <= n;  p = 1, 2, ..., n ).      (136)

Thus, the inequalities (130) and (136) hold for U = ||u_{ik}||_1^n and V = (U^T)^{-1} simultaneously.


When we express the minors of V in terms of those of the inverse matrix V^{-1} = U^T by the well-known formulas (see Vol. I, pp. 21-22), we obtain

    V ( j1 j2 ... j_{n-p}; 1 2 ... n-p ) = (-1)^{i1 + i2 + ... + ip + pn} (1/|U|) U ( i1 i2 ... ip; n n-1 ... n-p+1 ),      (137)

where i1 < i2 < ... < ip and j1 < j2 < ... < j_{n-p} together give the complete system of indices 1, 2, ..., n (the minor on the right is taken with its columns in the order written). Since, by (130), |U| > 0, it follows from (136) and (137) that

    (-1)^{i1 + i2 + ... + ip + pn} U ( i1 i2 ... ip; n n-1 ... n-p+1 ) > 0      ( 1 <= i1 < i2 < ... < ip <= n;  p = 1, 2, ..., n ).      (138)

Now let u = Σ_{k=g}^{h} c_k u^k (Σ_{k=g}^{h} c_k^2 > 0). We shall show that the inequalities (130) imply the second part of (128):

    S_u^+ <= h - 1,      (139)

and the inequalities (138) the first part:

    S_u^- >= g - 1.      (140)

Suppose that S_u^+ > h - 1. Then we can find h + 1 coordinates of u

    u_{i_1}, u_{i_2}, ..., u_{i_{h+1}}      (1 <= i1 < i2 < ... < i_{h+1} <= n)      (141)

such that

    u_{i_a} u_{i_{a+1}} <= 0      (a = 1, 2, ..., h).

Furthermore, the coordinates (141) cannot all be zero; for then we could equate the corresponding coordinates of the vector u = Σ_{k=1}^{h} c_k u^k (c_1 = ... = c_{g-1} = 0; Σ_{k=1}^{h} c_k^2 > 0) to zero and thus obtain a system of homogeneous equations

    Σ_{k=1}^{h} c_k u_{i_a k} = 0      (a = 1, 2, ..., h)

with the non-zero solution c_1, c_2, ..., c_h, whereas the determinant of the system

    U ( i1 i2 ... ih; 1 2 ... h )

is different from zero, by (130).


We now consider the vanishing determinant

    | u_{i_1 1}      . . .  u_{i_1 h}      u_{i_1}     |
    | u_{i_2 1}      . . .  u_{i_2 h}      u_{i_2}     |
    |  .  .  .  .  .  .  .  .  .  .  .  .  .  .        |  =  0
    | u_{i_{h+1} 1}  . . .  u_{i_{h+1} h}  u_{i_{h+1}} |

(it vanishes because its last column is the linear combination of the first h columns with the coefficients c_1, ..., c_h). Expanding it with respect to the last column, we obtain

    Σ_{a=1}^{h+1} (-1)^{h+1+a} u_{i_a} U ( i_1 ... i_{a-1} i_{a+1} ... i_{h+1}; 1 2 ... h ) = 0.

But such an equation cannot hold, since on the left-hand side all the terms are of like sign and at least one term is different from zero. Hence the assumption that S_u^+ > h - 1 has led to a contradiction, and (139) can be regarded as proved.

We now consider the vectors

    ũ^k = (ũ_{1k}, ũ_{2k}, ..., ũ_{nk})      (k = 1, 2, ..., n),

where

    ũ_{ik} = (-1)^{n+i} u_{ik}      (i, k = 1, 2, ..., n);

then for the matrix Ũ = ||ũ_{ik}||_1^n we have, by (138):

    Ũ ( i1 i2 ... ip; n n-1 ... n-p+1 ) > 0      ( 1 <= i1 < i2 < ... < ip <= n;  p = 1, 2, ..., n ).      (142)

But the inequalities (142) are analogous to (130).54 Therefore, setting

    ũ = Σ_{k=g}^{h} c_k ũ^k,      (143)

we have the inequality analogous to (139):

    S_ũ^+ <= n - g.      (144)

Let u = (u_1, u_2, ..., u_n) and ũ = (ũ_1, ũ_2, ..., ũ_n). It is easy to see that

    ũ_i = (-1)^{n+i} u_i      (i = 1, 2, ..., n),

so that, apart from the constant factor (-1)^n, the coordinates of ũ are those of u with alternated signs. Therefore

54 In the inequalities (142), the vectors ũ^k (k = 1, 2, ..., n) occur in the inverse order ũ^n, ũ^{n-1}, .... The vector ũ^k is preceded by n - k vectors of this kind.


    S_u^- + S_ũ^+ = n - 1,

and so the relation (140) holds, by (144).

This establishes the inequality (128). Since the second statement of the theorem is obtained from (128) by setting g = h = k, the theorem is now completely proved.
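Theorem 13 can be illustrated numerically (a modern aside, not in the original; NumPy is assumed). The Jacobi matrix below has all off-diagonal entries equal to 1 and leading principal minors 2, 3, 4, 5, so it is oscillatory by the criteria of this section:

```python
import numpy as np

# A Jacobi matrix with positive super-/sub-diagonals and positive leading
# principal minors, hence oscillatory by the criterion of this section.
n = 4
J = np.diag([2.0] * n) + np.diag([1.0] * (n - 1), 1) + np.diag([1.0] * (n - 1), -1)

lam, U = np.linalg.eigh(J)          # eigh returns eigenvalues in ascending order
lam, U = lam[::-1], U[:, ::-1]      # reorder: lam[0] > lam[1] > ... > lam[n-1]

assert np.all(lam > 0)              # all characteristic values are positive
assert np.all(np.diff(lam) < 0)    # and pairwise distinct

def variations(v):
    """Exact number of sign variations (the eigenvectors here have no zeros)."""
    v = v[np.abs(v) > 1e-12]
    return int(np.sum(v[:-1] * v[1:] < 0))

# The eigenvector for the k-th largest eigenvalue has exactly k - 1 sign changes.
for k in range(n):
    assert variations(U[:, k]) == k
```

For this matrix the characteristic values are 2 + 2 cos(k*pi/5) (k = 1, ..., 4), and the sign-change pattern of the eigenvectors is exactly the one asserted in statement 2 of the theorem.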

3. As an application of this theorem, let us study the small oscillations of n masses m_1, m_2, ..., m_n concentrated at n movable points x_1 < x_2 < ... < x_n of a segmentary elastic continuum (a string or a rod of finite length), stretched (in a state of equilibrium) along the segment 0 <= x <= l of the x-axis.

We denote by K(x, s) (0 <= x, s <= l) the function of influence of this continuum (K(x, s) is the displacement at the point x under the action of a unit force applied at the point s) and by k_{ij} the coefficients of influence for the given n masses:

    k_{ij} = K(x_i, x_j)      (i, j = 1, 2, ..., n).

If at the points x_1, x_2, ..., x_n forces F_1, F_2, ..., F_n are applied, then the corresponding static displacement y(x) (0 <= x <= l) is given, by virtue of the linear superposition of displacements, by the formula

    y(x) = Σ_{j=1}^{n} K(x, x_j) F_j.

When we here replace the forces F_j by the inertial forces -m_j ∂²y(x_j, t)/∂t² (j = 1, 2, ..., n), we obtain the equation of free oscillations

    y(x, t) = - Σ_{j=1}^{n} m_j K(x, x_j) ∂²y(x_j, t)/∂t².      (145)

We shall seek harmonic oscillations of the continuum in the form

    y = u(x) sin(ωt + α)      (0 <= x <= l).      (146)

Here u(x) is the amplitude function, ω the frequency, and α the initial phase. Substituting this expression for y in (145) and cancelling sin(ωt + α), we obtain

    u(x) = ω² Σ_{j=1}^{n} m_j K(x, x_j) u(x_j).      (147)


Let us introduce a notation for the variable displacements and the amplitude displacements at the points of distribution of mass:

    y_i = y(x_i, t),   u_i = u(x_i)      (i = 1, 2, ..., n).

Then

    y_i = u_i sin(ωt + α)      (i = 1, 2, ..., n).

We also introduce the reduced amplitude displacements and the reduced coefficients of influence

    ũ_i = sqrt(m_i) u_i,   a_{ij} = sqrt(m_i m_j) k_{ij}      (i, j = 1, 2, ..., n).      (148)

Replacing x in (147) by x_i (i = 1, 2, ..., n) successively, we obtain a system of equations for the reduced amplitude displacements:

    Σ_{j=1}^{n} a_{ij} ũ_j = λ ũ_i      (λ = 1/ω²;  i = 1, 2, ..., n).      (149)

Hence it is clear that the amplitude vector ũ = (ũ_1, ũ_2, ..., ũ_n) is a characteristic vector of A = ||a_{ij}||_1^n = ||sqrt(m_i m_j) k_{ij}||_1^n for λ = 1/ω² (see Vol. I, Chapter X, § 8).

It can be established, as the result of a detailed analysis,55 that the matrix of the coefficients of influence ||k_{ij}||_1^n of a segmentary continuum is always oscillatory. But then the matrix A = ||a_{ij}||_1^n = ||sqrt(m_i m_j) k_{ij}||_1^n is also oscillatory. Therefore (by Theorem 13) A has n distinct positive characteristic values

    λ_1 > λ_2 > ... > λ_n > 0;

i.e., there exist n harmonic oscillations of the continuum with distinct frequencies

    ω_1 < ω_2 < ... < ω_n.

By the same theorem, to the fundamental frequency ω_1 there correspond amplitude displacements different from zero and of like sign. Among the amplitude displacements corresponding to the first overtone with the frequency ω_2 there is exactly one variation of sign and, in general, among the amplitude displacements for the overtone with the frequency ω_j there are exactly j - 1 variations of sign (j = 1, 2, ..., n).

55 See [239], [240], and [17], Chapter III.


From the fact that the matrix of the coefficients of influence ||k_{ij}||_1^n is oscillatory there follow other oscillatory properties of the continuum: 1) For ω = ω_1 the amplitude function u(x), which is connected with the amplitude displacements by (147), has no nodes and, in general, for ω = ω_j the function u(x) has j - 1 nodes (j = 1, 2, ..., n); 2) the nodes of two adjacent harmonics alternate; etc.

We cannot dwell here on the justification of these properties.56

56 See [17], Chapters III and IV.


CHAPTER XIV

APPLICATIONS OF THE THEORY OF MATRICES TO THE INVESTIGATION OF SYSTEMS OF LINEAR DIFFERENTIAL EQUATIONS

§ 1. Systems of Linear Differential Equations with Variable Coefficients. General Concepts

1. Suppose given a system of linear homogeneous differential equations of the first order:

    dx_i/dt = Σ_{k=1}^{n} p_{ik}(t) x_k      (i = 1, 2, ..., n),      (1)

where the p_{ik}(t) (i, k = 1, 2, ..., n) are complex functions of a real argument t, continuous in some interval, finite or infinite, of the variable t.1

Setting P(t) = ||p_{ik}(t)||_1^n and x = (x_1, x_2, ..., x_n), we write (1) as

    dx/dt = P(t) x.      (2)

An integral matrix of the system (1) shall be defined as a square matrix X(t) = ||x_{ik}(t)||_1^n whose columns are n linearly independent solutions of the system.

Since every column of X satisfies (2), the integral matrix X satisfies the equation

    dX/dt = P(t) X.      (3)

In what follows, we shall consider the matrix equation (3) instead of the system (1).

From the theorem on the existence and uniqueness of the solution of a system of differential equations2 it follows that the integral matrix X(t) is uniquely determined when the value of the matrix for some ('initial')

1 In this section, all the relations that involve functions of t refer to the given interval.

2 A proof of this theorem will be given in § 5. See also I. G. Petrovskii, Vorlesungen über die Theorie der gewöhnlichen Differentialgleichungen, Leipzig, 1954 (translated from the Russian; Moscow, 1952).



value t = t0 is known:3 X(t0) = X0. For X0 we can take an arbitrary non-singular square matrix of order n. In the particular case where X(t0) = E, the integral matrix X(t) will be called normalized.

Let us differentiate the determinant of X by differentiating its rows in succession and let us then use the differential relations

    dx_{ij}/dt = Σ_{k=1}^{n} p_{ik} x_{kj}      (i, j = 1, 2, ..., n).

We obtain:

    d|X|/dt = (p_{11} + p_{22} + ... + p_{nn}) |X|.

Hence there follows the well-known Jacobi identity

    |X| = c e^{∫ tr P dt},      (4)

where c is a constant and

    tr P = p_{11} + p_{22} + ... + p_{nn}

is the trace of P(t).

Since the determinant |X| cannot vanish identically, we have c ≠ 0. But then it follows from the Jacobi identity that |X| is different from zero for every value of the argument:

    |X| ≠ 0;

i.e., an integral matrix is non-singular for every value of the argument.

If X̃(t) is a non-singular (|X̃(t)| ≠ 0) particular solution of (3), then the general solution is determined by the formula

    X = X̃ C,      (5)

where C is an arbitrary constant matrix.

For, by multiplying both sides of the equation

    dX̃/dt = P X̃      (6)

by C on the right, we see that the matrix X̃C also satisfies (3). On the other hand, if X is an arbitrary solution of (3), then (6) implies:

:, It is assumed that t. belongs to the given interval of t.


    dX/dt = d/dt [X̃ (X̃^{-1} X)] = (dX̃/dt)(X̃^{-1} X) + X̃ d/dt (X̃^{-1} X) = P X + X̃ d/dt (X̃^{-1} X),

and hence by (3)

    d/dt (X̃^{-1} X) = 0

and

    X̃^{-1} X = const. = C;

i.e., (5) holds.

All the integral matrices X of the system (1) are obtained from the formula (5) with |C| ≠ 0.
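For constant P(t) = A the Jacobi identity (4) reduces to det e^{At} = e^{t tr A} (here c = 1, since the normalized solution has X(0) = E). This can be checked numerically; the following sketch (a modern illustration, not in the original; NumPy and SciPy assumed) uses an arbitrary test matrix:

```python
import numpy as np
from scipy.linalg import expm

# For constant P(t) = A the normalized integral matrix is X(t) = e^{At},
# and the Jacobi identity gives |X(t)| = e^{t * tr A}.
A = np.array([[1.0, 2.0, 0.0],
              [0.5, -1.0, 1.0],
              [0.0, 3.0, 0.5]])
t = 0.7
X = expm(A * t)                     # X(t) = e^{At}

# determinant of the integral matrix vs. the exponential of the trace integral
assert abs(np.linalg.det(X) - np.exp(t * np.trace(A))) < 1e-9
```

Since tr A = 0.5 here, the determinant never vanishes, illustrating that an integral matrix is non-singular for every value of the argument.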

2. Let us consider the special case

    dX/dt = A X,      (7)

where A is a constant matrix. Here X̃ = e^{At} is a particular non-singular solution of (7),4 so that the general solution is of the form

    X = e^{At} C,      (8)

where C is an arbitrary constant matrix.

Setting t = t0 in (8), we find: X0 = e^{At0} C. Hence C = e^{-At0} X0, and therefore (8) can be represented in the form

    X = e^{A(t-t0)} X0.      (9)

This formula is equivalent to our earlier formula (46) of Chapter V (Vol. I, p. 118).
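Formula (9) can be checked on a concrete example (a modern illustration, assuming SciPy's expm): for A = [[0, 1], [-1, 0]] the matrix exponential e^{At} is a plane rotation, so the solution through X0 is known in closed form:

```python
import numpy as np
from scipy.linalg import expm

# For A = [[0, 1], [-1, 0]] we have A^2 = -E, so
# e^{At} = E cos t + A sin t = [[cos t, sin t], [-sin t, cos t]].
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
t0, t = 0.3, 1.1
X0 = np.array([[2.0, 1.0], [0.0, 1.0]])   # arbitrary non-singular initial value

X = expm(A * (t - t0)) @ X0               # formula (9)

s = t - t0
R = np.array([[np.cos(s), np.sin(s)], [-np.sin(s), np.cos(s)]])
assert np.allclose(X, R @ X0)             # matches the closed-form rotation
assert np.allclose(expm(A * 0.0) @ X0, X0)   # at t = t0 the solution equals X0
```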

Let us now consider the so-called Cauchy system:

    dX/dt = A/(t-a) X      (A is a constant matrix).      (10)

This case reduces to the preceding one by the change of argument

    u = ln(t - a).

Therefore the general solution of (10) looks as follows:

    X = e^{A ln(t-a)} C = (t-a)^A C.      (11)

The functions e^{At} and (t-a)^A that occur in (8) and (11) may be represented in the form (Vol. I, p. 117)

4 By term-by-term differentiation of the series e^{At} = Σ_{k=0}^{∞} (A^k t^k)/k! we find d(e^{At})/dt = A e^{At}.


    e^{At} = Σ_{k=1}^{s} (Z_{k1} + Z_{k2} t + ... + Z_{k m_k} t^{m_k - 1}) e^{λ_k t},      (12)

    (t-a)^A = Σ_{k=1}^{s} (Z_{k1} + Z_{k2} ln(t-a) + ... + Z_{k m_k} [ln(t-a)]^{m_k - 1}) (t-a)^{λ_k}.      (13)

Here

    ψ(λ) = Π_{k=1}^{s} (λ - λ_k)^{m_k}      (λ_i ≠ λ_k for i ≠ k;  i, k = 1, 2, ..., s)

is the minimal polynomial of A, and the Z_{kj} (j = 1, 2, ..., m_k; k = 1, 2, ..., s) are linearly independent constant matrices that are polynomials in A.5

Note. Sometimes an integral matrix of the system of differential equations (1) is taken to be a matrix W in which the rows are linearly independent solutions of the system. It is obvious that W is the transpose of X:

    W = X^T.

When we go over to the transposed matrices on both sides of (3), we obtain instead of (3) the following equation for W:

    dW/dt = W P^T(t).      (3')

Here W is the first factor on the right-hand side, not the second, as X was in (3).

§ 2. Lyapunov Transformations

1. Let us now assume that in the system (1) (and in the equation (3)) the coefficient matrix P(t) = ||p_{ik}(t)||_1^n is a continuous bounded function of t in the interval [t0, ∞).6

In place of the unknown functions x_1, x_2, ..., x_n we introduce the new unknown functions y_1, y_2, ..., y_n by means of the transformation

    x_i = Σ_{k=1}^{n} l_{ik}(t) y_k      (i = 1, 2, ..., n).      (14)

5 Every term X_k = (Z_{k1} + Z_{k2} t + ... + Z_{k m_k} t^{m_k - 1}) e^{λ_k t} (k = 1, 2, ..., s) on the right-hand side of (12) is a solution of (7). For the product g(A) e^{At}, with an arbitrary function g(λ), satisfies this equation. But X_k = f(A) = g(A) e^{At} if f(λ) = g(λ) e^{λt} and g(λ_k) = 1, while all the remaining m - 1 values of g(λ) on the spectrum of A are zero (see Vol. I, Chapter V, formula (17) on p. 104).

6 This means that each function p_{ik}(t) (i, k = 1, 2, ..., n) is continuous and bounded in the interval [t0, ∞), i.e., for t >= t0.


We impose the following restrictions on the matrix L(t) = ||l_{ik}(t)||_1^n of the transformation:

1. L(t) has a continuous derivative dL/dt in the interval [t0, ∞);

2. L(t) and dL/dt are bounded in the interval [t0, ∞);

3. There exists a constant m such that

    0 < m < mod |L(t)|      (t >= t0),

i.e., the determinant |L(t)| is bounded in modulus from below by the positive constant m.

A transformation (14) in which the coefficient matrix L(t) = ||l_{ik}(t)||_1^n satisfies 1.-3. will be called a Lyapunov transformation and the corresponding matrix L(t) a Lyapunov matrix.

Such transformations were investigated by A. M. Lyapunov in his famous memoir 'The General Problem of Stability of Motion' [32].

Examples. 1. If L = const. and |L| ≠ 0, then L satisfies the conditions 1.-3. Therefore a non-singular transformation with constant coefficients is always a Lyapunov transformation.

2. If D = ||d_{ik}||_1^n is a matrix of simple structure with pure imaginary characteristic values, then the matrix

    L(t) = e^{Dt}

satisfies the conditions 1.-3. and is therefore a Lyapunov matrix.7

2. It is easy to verify that the conditions 1.-3. on a matrix L(t) imply the existence of the inverse matrix L^{-1}(t), which also satisfies the conditions 1.-3.; i.e., the inverse of a Lyapunov transformation is itself a Lyapunov transformation. In the same way it can be verified that two Lyapunov transformations in succession yield a Lyapunov transformation. Thus, the Lyapunov transformations form a group. They have the following important property:

If under the transformation (14) the system (1) goes over into

    dy_i/dt = Σ_{k=1}^{n} q_{ik}(t) y_k      (i = 1, 2, ..., n),      (15)

and if the zero solution of this system is stable, asymptotically stable, or unstable in the sense of Lyapunov (see Vol. I, Chapter V, § 6), then the zero solution of the original system (1) has the same property.

7 Here all the m_k = 1 in (12) and λ_k = iφ_k (φ_k real; k = 1, 2, ..., s).


In other words, Lyapunov transformations do not alter the character of the zero solution (as regards stability). This is the reason why these transformations can be used in the investigation of stability in order to simplify the original system of equations.

A Lyapunov transformation establishes a one-to-one correspondence between the solutions of the systems (1) and (15); moreover, linearly independent solutions remain so after the transformation. Therefore a Lyapunov transformation carries an integral matrix X of (1) into some integral matrix Y of (15) such that

    X = L(t) Y.      (16)

In matrix notation, the system (15) has the form

    dY/dt = Q(t) Y,      (17)

where Q(t) = ||q_{ik}(t)||_1^n is the coefficient matrix of (15).

Substituting LY for X in (3) and comparing the equation so obtained with (17), we easily find the following formula, which expresses Q in terms of P and L:

    Q = L^{-1} P L - L^{-1} dL/dt.      (18)
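Formula (18) can be tested numerically (an illustrative sketch, not in the original; NumPy/SciPy assumed): for any solution X of (3), the transformed matrix Y = L^{-1}X must satisfy dY/dt = QY. Here P is an arbitrary constant test matrix and L(t) = e^{Dt} is the Lyapunov matrix of Example 2:

```python
import numpy as np
from scipy.linalg import expm

# Constant P, and the Lyapunov matrix L(t) = e^{Dt} from Example 2
# (D of simple structure with pure imaginary characteristic values +-2i).
P = np.array([[0.0, 1.0], [-2.0, -3.0]])
D = np.array([[0.0, 2.0], [-2.0, 0.0]])

L    = lambda t: expm(D * t)
dL   = lambda t: D @ expm(D * t)          # d/dt e^{Dt} = D e^{Dt}
Linv = lambda t: expm(-D * t)

Q = lambda t: Linv(t) @ P @ L(t) - Linv(t) @ dL(t)    # formula (18)

# X(t) = e^{Pt} solves (3); check that Y = L^{-1} X satisfies dY/dt = Q Y.
t, h = 0.9, 1e-6
Y      = lambda t: Linv(t) @ expm(P * t)
dY_num = (Y(t + h) - Y(t - h)) / (2 * h)              # central difference
assert np.allclose(dY_num, Q(t) @ Y(t), atol=1e-5)
```

At t = 0 the formula reduces to Q(0) = P - D, which is a convenient sanity check on the sign of the second term.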

Two systems (1) and (15) or, what is the same, (3) and (17) will be called equivalent (in the sense of Lyapunov) if they can be carried into one another by a Lyapunov transformation. The coefficient matrices P and Q of equivalent systems are always connected by the formula (18), in which L satisfies the conditions 1.-3.

§ 3. Reducible Systems

1. Among the systems of linear differential equations of the first order the simplest and best known are those with constant coefficients. It is, therefore, of interest to study systems that can be carried by a Lyapunov transformation into systems with constant coefficients. Lyapunov has called such systems reducible.

Suppose given a reducible system

    dX/dt = P X.      (19)

Then some Lyapunov transformation

    X = L(t) Y      (20)

carries it into a system


    dY/dt = A Y,      (21)

where A is a constant matrix. Therefore (19) has the particular solution

    X = L(t) e^{At}.      (22)

It is easy to see that, conversely, every system (19) with a particular solution of the form (22), where L(t) is a Lyapunov matrix and A a constant matrix, is reducible and is reduced to the form (21) by means of the Lyapunov transformation (20).

Following Lyapunov, we shall show that: Every system (19) with periodic coefficients is reducible.8

Let P(t) in (19) be a continuous function in (-∞, +∞) with period τ:

    P(t + τ) = P(t).      (23)

Replacing t in (19) by t + τ and using (23), we obtain:

    dX(t + τ)/dt = P(t) X(t + τ).

Thus, X(t + τ) is an integral matrix of (19) if X(t) is. Therefore

    X(t + τ) = X(t) V,

where V is a constant non-singular matrix. Since |V| ≠ 0, we can determine9

    ln V   and set   V^{t/τ} = e^{(t/τ) ln V}.

This matrix function of t, just like X(t), is multiplied on the right by V when the argument is increased by τ. Therefore the 'quotient'

    L(t) = X(t) V^{-t/τ} = X(t) e^{-(t/τ) ln V}

is continuous and periodic with period τ:

    L(t + τ) = L(t),

and |L(t)| ≠ 0. The matrix L(t) satisfies the conditions 1.-3. of the preceding section and is therefore a Lyapunov matrix.

8 See [32], § 47.

9 Here ln V = f(V), where f(λ) is any single-valued branch of ln λ in a simply-connected domain G containing all the characteristic values of V, but not containing 0. See Vol. I, Chapter V.


On the other hand, since the solution X of (19) can be represented in the form

    X = L(t) e^{(t/τ) ln V},

the system (19) is reducible.

In this case the Lyapunov transformation

    X = L(t) Y,

which carries (19) into the form

    dY/dt = (1/τ) (ln V) Y,

has periodic coefficients with period τ.

Lyapunov has established10 a very important criterion for stability and instability in terms of the first linear approximation to a non-linear system of differential equations

    dx_i/dt = Σ_{k=1}^{n} a_{ik} x_k + (**)      (i = 1, 2, ..., n),      (24)

where on the right-hand side we have convergent power series in x_1, x_2, ..., x_n and where (**) denotes the sum of the terms of the second and higher orders in x_1, x_2, ..., x_n; the coefficients a_{ik} (i, k = 1, 2, ..., n) of the linear terms are constant.11

LYAPUNOV'S CRITERION: The zero solution of (24) is stable (and even asymptotically stable) if all the characteristic values of the coefficient matrix A = ||a_{ik}||_1^n of the first linear approximation have negative real parts, and unstable if at least one characteristic value has a positive real part.

2. The arguments used above enable us to apply this criterion to a system whose linear terms have periodic coefficients:

    dx_i/dt = Σ_{k=1}^{n} p_{ik}(t) x_k + (**).      (25)

For on the basis of the preceding arguments we reduce the system (25) to the form (24) by means of a Lyapunov transformation, where

10 See [32], § 24.

11 The coefficients in the non-linear terms may depend on t. These functional coefficients are subject to certain restrictions (see [32], § 11).


    A = ||a_{ik}||_1^n = (1/τ) ln V

and where V is the constant matrix by which an integral matrix of the corresponding linear system (19) is multiplied when the argument is increased by τ. Without loss of generality we may assume that τ > 0. By the properties of Lyapunov transformations the zero solutions of the original and of the transformed systems are simultaneously stable, asymptotically stable, or unstable. But the characteristic values λ_i and v_i (i = 1, 2, ..., n) of A and V are connected by the formula

    λ_i = (1/τ) ln v_i      (i = 1, 2, ..., n).

Therefore, by applying Lyapunov's criterion to the reduced system, we find:12

The zero solution of (25) is asymptotically stable if all the characteristic values v_1, v_2, ..., v_n of V are of modulus less than 1, and unstable if at least one characteristic value is of modulus greater than 1.
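This criterion can be illustrated numerically (a modern sketch with an assumed example system, not in the original; NumPy only). The matrix V is obtained by integrating dX/dt = P(t)X over one period with X(0) = E, and stability is read off from the moduli of its characteristic values. The triangular P(t) below is chosen so that the exact multipliers are known:

```python
import numpy as np

# An assumed illustrative linear part with period tau = 2*pi. Since P(t) is
# triangular, the multipliers are known exactly:
#   v1 = exp(int_0^tau (-1 + cos t) dt) = e^{-2*pi},   v2 = e^{-4*pi}.
tau = 2 * np.pi
def P(t):
    return np.array([[-1.0 + np.cos(t), 0.0],
                     [1.0, -2.0]])

# Integrate dX/dt = P(t) X over one period with the classical Runge-Kutta
# scheme to obtain V = X(tau), starting from X(0) = E.
X, steps = np.eye(2), 4000
h = tau / steps
for i in range(steps):
    t = i * h
    k1 = P(t) @ X
    k2 = P(t + h / 2) @ (X + h / 2 * k1)
    k3 = P(t + h / 2) @ (X + h / 2 * k2)
    k4 = P(t + h) @ (X + h * k3)
    X = X + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

v = np.linalg.eigvals(X)                 # characteristic values of V
assert np.max(np.abs(v)) < 1.0           # all multipliers inside the unit circle
```

Here both multipliers are of modulus less than 1, so the zero solution of the corresponding system (25) is asymptotically stable.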

Lyapunov has established his criterion for stability from a linear approximation for a considerably wider class of systems, namely those of the form (24) in which the linear approximation is not necessarily a system with constant coefficients, but belongs to a class of systems that he has called regular.13

The class of regular linear systems contains all the reducible systems.

A criterion for instability in the case where the first linear approximation is a regular system was set up by N. G. Chetaev.14

§ 4. The Canonical Form of a Reducible System. Erugin's Theorem

1. Suppose that a reducible system (19) and an equivalent system (in the sense of Lyapunov)

    dY/dt = A Y

are given, where A is a constant matrix.

We shall be interested in the question: To what extent is the matrix A determined by the given system (19)? This question can also be formulated as follows:

12 Loc. cit., § 55.

13 Loc. cit., § 9.

14 See [9], p. 181.


When are two systems

    dY/dt = A Y   and   dZ/dt = B Z,

where A and B are constant matrices, equivalent in the sense of Lyapunov; i.e., when can they be carried into one another by a Lyapunov transformation?

In order to answer this question we introduce the notion of matrices with one and the same real part of the spectrum.

We shall say that two matrices A and B of order n have one and the same real part of the spectrum if and only if the elementary divisors of A and B are of the form

    (λ - λ_1)^{m_1}, (λ - λ_2)^{m_2}, ..., (λ - λ_s)^{m_s};   (λ - μ_1)^{m_1}, (λ - μ_2)^{m_2}, ..., (λ - μ_s)^{m_s},

where

    Re λ_k = Re μ_k      (k = 1, 2, ..., s).

Then the following theorem, due to N. P. Erugin, holds:15

THEOREM 1 (Erugin): Two systems

    dY/dt = A Y   and   dZ/dt = B Z      (26)

(A and B are constant matrices of order n) are equivalent in the sense of Lyapunov if and only if the matrices A and B have one and the same real part of the spectrum.

Proof. Suppose that the systems (26) are given. We reduce A to the normal Jordan form16 (see Vol. I, Chapter VI, § 7)

    A = T {λ_1 E_1 + H_1, λ_2 E_2 + H_2, ..., λ_s E_s + H_s} T^{-1},      (27)

where

    λ_k = α_k + iβ_k      (α_k, β_k are real numbers; k = 1, 2, ..., s).      (28)

In accordance with (27) and (28) we set

    A_1 = T {α_1 E_1 + H_1, α_2 E_2 + H_2, ..., α_s E_s + H_s} T^{-1},
                                                                              (29)
    A_2 = T {iβ_1 E_1, iβ_2 E_2, ..., iβ_s E_s} T^{-1}.

15 Our proof of the theorem differs from that of Erugin.

16 E_k is the unit matrix; in H_k the elements of the first superdiagonal are 1, and the remaining elements are zero; the orders of E_k and H_k are equal to the degrees of the elementary divisors of A, i.e., m_k (k = 1, 2, ..., s).


Then

    A = A_1 + A_2,   A_1 A_2 = A_2 A_1.

We define a matrix L(t) by the equation

    L(t) = e^{A_2 t}.      (30)

L(t) is a Lyapunov matrix (see Example 2 on p. 117).

But by (30) a particular solution of the first of the systems (26) is of the form

    e^{At} = e^{A_2 t} e^{A_1 t} = L(t) e^{A_1 t}.

Hence it follows that the first of the systems (26) is equivalent to

    dU/dt = A_1 U,      (31)

where, by (29), the matrix A_1 has real characteristic values and its spectrum coincides with the real part of the spectrum of A.

Similarly, we replace the second of the systems (26) by the equivalent system

    dV/dt = B_1 V,      (32)

where the matrix B_1 has real characteristic values and its spectrum coincides with the real part of the spectrum of B.

Our theorem will be proved if we can show that two systems (31) and (32) in which A_1 and B_1 are constant matrices with real characteristic values are equivalent if and only if A_1 and B_1 are similar.17

Suppose that the Lyapunov transformation

    U = L_1 V

carries (31) into (32). Then the matrix L_1 satisfies the equation

    dL_1/dt = A_1 L_1 - L_1 B_1.      (33)

This matrix equation for L_1 is equivalent to a system of n² differential equations in the n² elements of L_1. The right-hand side of (33) is a linear operation on the 'vector' L_1 in an n²-dimensional space:

17 This proposition implies Theorem 1, since the equivalence of the systems (31) and (32) means that the systems (26) are equivalent, and the similarity of A_1 and B_1 means that these matrices have the same elementary divisors, so that the matrices A and B have one and the same real part of the spectrum.


    dL_1/dt = F(L_1)      [F(L_1) = A_1 L_1 - L_1 B_1].      (33')

Every characteristic value of the linear operator F (and of the corresponding matrix of order n²) can be represented in the form of a difference γ - δ, where γ is a characteristic value of A_1 and δ a characteristic value of B_1.18 Hence it follows that the operator F has only real characteristic values.

We denote by

ψ(λ) = (λ − λ₁)^{m₁} (λ − λ₂)^{m₂} ··· (λ − λ_s)^{m_s}

(the λᵢ are real; λᵢ ≠ λⱼ for i ≠ j; i, j = 1, 2, …, s) the minimal polynomial of F. Then the solution L₁(t) of (33') can, by formula (12) (p. 116), be written as follows:

L₁(t) = Σ_{k=1}^{s} Σ_{j=0}^{m_k−1} L_{kj} t^j e^{λ_k t},    (34)

where the L_{kj} are constant matrices of order n. Since the matrix L₁(t) is bounded in the interval (t₀, ∞), the corresponding matrices L_{kj} vanish both for every λ_k > 0 and for λ_k = 0 with j > 0. We denote by L₋(t) the sum of all the terms in (34) for which λ_k < 0. Then

L₁(t) = L₋(t) + L₀,    (35)

where

lim_{t→+∞} L₋(t) = 0,  lim_{t→+∞} dL₋(t)/dt = 0,  L₀ = const.    (35')

Then, by (35) and (35'),

lim_{t→+∞} L₁(t) = L₀,

¹⁸ For let λ₀ be any characteristic value of the operator F. Then there exists a matrix L ≠ 0 such that F(L) = λ₀L, or

(A₁ − λ₀E)L = LB₁.    (*)

The matrices A₁ − λ₀E and B₁ have at least one characteristic value in common, since otherwise there would exist a polynomial g(λ) such that

g(A₁ − λ₀E) = 0,  g(B₁) = E,

and this is impossible, because it follows from (*) that g(A₁ − λ₀E)L = L g(B₁) and L ≠ 0. But if A₁ − λ₀E and B₁ have a common characteristic value, then λ₀ = γ − δ, where γ and δ are characteristic values of A₁ and B₁, respectively. A detailed study of the operator F can be found in the paper [179] by F. Golubchikov.


from which it follows that

|L₀| ≠ 0,

because the determinant |L₁(t)| is bounded in modulus from below. When we substitute for L₁(t) in (33) the sum L₋(t) + L₀, we obtain:

dL₋(t)/dt − A₁L₋(t) + L₋(t)B₁ = A₁L₀ − L₀B₁;

hence by (35')

A₁L₀ − L₀B₁ = 0,

and therefore

B₁ = L₀⁻¹A₁L₀.    (36)

Conversely, if (36) holds, then the Lyapunov transformation

U = L₀V

carries (31) into (32). This completes the proof of the theorem.

2. From this theorem it follows that: Every reducible system (19) can be carried by a Lyapunov transformation X = LY into the form

dY/dt = JY,

where J is a Jordan matrix with real characteristic values. This canonical form of the system is uniquely determined by the given matrix P(t) to within the order of the diagonal blocks of J.
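In the special case of a constant coefficient matrix P(t) ≡ A, the reducing transformation can be taken constant, and the canonical form is simply the Jordan form of A. The sketch below (an illustration assuming SymPy is available; the matrix A is an arbitrary choice, not one from the text) computes the Jordan matrix J and a constant transformation L with A = LJL⁻¹.

```python
import sympy as sp

# For dX/dt = A X with constant A, the substitution X = L Y carries the system
# into dY/dt = J Y, where A = L J L^{-1} is the Jordan decomposition.
# A is an illustrative matrix with one 2x2 Jordan block for eigenvalue 2.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])
L, J = A.jordan_form()        # SymPy returns (L, J) with A = L*J*L**-1
sp.pprint(J)
```

Since the eigenvalues here are real, no preliminary Lyapunov transformation is needed; for a general (time-dependent) reducible system the theorem above guarantees such a J exists but gives no simple algorithm.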

§ 5. The Matricant

1. We consider a system of differential equations

dX/dt = P(t)X,    (37)

where P(t) = ‖p_{ik}(t)‖₁ⁿ is a continuous matrix function of the argument t in some interval (a, b).¹⁹

¹⁹ (a, b) is an arbitrary interval (finite or infinite). All the elements p_{ik}(t) (i, k = 1, 2, …, n) of P(t) are complex functions of the real argument t, continuous in (a, b). Everything that follows remains valid if, instead of continuity, we require (in every finite subinterval of (a, b)) only boundedness and Riemann integrability of all the functions p_{ik}(t).


We use the method of successive approximations to determine a normalized solution of (37), i.e., a solution that for t = t₀ becomes the unit matrix (t₀ is a fixed number of the interval (a, b)). The successive approximations X_k (k = 0, 1, 2, …) are found from the recurrence relations

dX_k/dt = P(t)X_{k−1}  (k = 1, 2, …),

where X₀ is taken to be the unit matrix E. Setting X_k(t₀) = E (k = 0, 1, 2, …), we may represent X_k in the form

X_k = E + ∫_{t₀}^t P(τ)X_{k−1} dτ.

Thus

X₀ = E,  X₁ = E + ∫_{t₀}^t P(τ)dτ,  X₂ = E + ∫_{t₀}^t P(τ)dτ + ∫_{t₀}^t P(τ)∫_{t₀}^τ P(σ)dσ dτ,  …;

i.e., X_k (k = 0, 1, 2, …) is the sum of the first k + 1 terms of the matrix series

E + ∫_{t₀}^t P(τ)dτ + ∫_{t₀}^t P(τ)∫_{t₀}^τ P(σ)dσ dτ + ···.    (38)

In order to prove that this series converges absolutely and uniformly in every closed subinterval of the interval (a, b) and determines the required solution of (37), we construct a majorant.

We define non-negative functions g(t) and h(t) in (a, b) by the equations²⁰

g(t) = max [ |p₁₁(t)|, |p₁₂(t)|, …, |p_{nn}(t)| ],  h(t) = ∫_{t₀}^t g(τ)dτ.

It is easy to verify that g(t), and consequently h(t) as well, is continuous in (a, b).²¹

Each of the n² scalar series into which the matrix series (38) splits is majorized by the series

1 + h(t) + n h²(t)/2! + n² h³(t)/3! + ···.    (39)

²⁰ By definition, the value of g(t) for any value of t is the largest of the n² moduli of the values of p_{ik}(t) (i, k = 1, 2, …, n) for that value of t.

²¹ The continuity of g(t) at any point t₁ of the interval (a, b) follows from the fact that the difference g(t) − g(t₁) for t sufficiently near t₁ always coincides with one of the n² differences |p_{ik}(t)| − |p_{ik}(t₁)| (i, k = 1, 2, …, n).


For

|(∫_{t₀}^t P(τ)dτ)_{ik}| = |∫_{t₀}^t p_{ik}(τ)dτ| ≤ |∫_{t₀}^t g(τ)dτ| = h(t),

|(∫_{t₀}^t P(τ)∫_{t₀}^τ P(σ)dσ dτ)_{ik}| = |Σ_{j=1}^n ∫_{t₀}^t p_{ij}(τ)∫_{t₀}^τ p_{jk}(σ)dσ dτ| ≤ n |∫_{t₀}^t g(τ)∫_{t₀}^τ g(σ)dσ dτ| = (n/2) h²(t),

etc.

The series (39) converges in (a, b) and converges uniformly in every closed part of this interval. Hence it follows that the matrix series (38) also converges in (a, b), and does so absolutely and uniformly in every closed interval contained in (a, b).

By term-by-term differentiation we verify that the sum of (38) is a solution of (37); this solution becomes E for t = t₀. The term-by-term differentiation of (38) is permissible, because the series obtained after differentiation differs from (38) by the factor P and therefore, like (38), is uniformly convergent in every closed interval contained in (a, b).

Thus we have proved the theorem on the existence of a normalized solution of (37). This solution will be denoted by Ω_{t₀}^t(P) or simply Ω_{t₀}^t. Every other solution, as we have shown in § 1, is of the form

X = Ω_{t₀}^t C,

where C is an arbitrary constant matrix. From this formula it follows that every solution, in particular the normalized one, is uniquely determined by its value for t = t₀.

The normalized solution Ω_{t₀}^t of (37) is often called the matricant. We have seen that the matricant can be represented in the form of the series²²

Ω_{t₀}^t = E + ∫_{t₀}^t P(τ)dτ + ∫_{t₀}^t P(τ)∫_{t₀}^τ P(σ)dσ dτ + ···,    (40)

which converges absolutely and uniformly in every closed interval in which P(t) is continuous.
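The successive approximations above can be carried out numerically. The following sketch (assuming NumPy and SciPy are available) performs the Picard iteration X_k = E + ∫_{t₀}^t P(τ)X_{k−1} dτ with trapezoidal quadrature on a grid; the constant P chosen here is purely illustrative, and for constant P the matricant is known to be e^{P(t−t₀)}, which gives an independent check.

```python
import numpy as np
from scipy.linalg import expm

def matricant_picard(P, ts, iters=25):
    """Successive approximations X_k = E + cumulative ∫ P(τ) X_{k-1}(τ) dτ on the grid ts."""
    n = P(ts[0]).shape[0]
    Pt = np.array([P(t) for t in ts])
    X = np.array([np.eye(n)] * len(ts))            # X_0 ≡ E
    for _ in range(iters):
        F = np.einsum('tij,tjk->tik', Pt, X)       # integrand P(τ) X_{k-1}(τ)
        Xnew = np.array([np.eye(n)] * len(ts))
        for m in range(1, len(ts)):                # cumulative trapezoidal integral
            Xnew[m] = Xnew[m-1] + 0.5*(ts[m] - ts[m-1])*(F[m-1] + F[m])
        X = Xnew
    return X

P = lambda t: np.array([[0.0, 1.0], [-1.0, 0.0]])  # illustrative constant coefficient matrix
ts = np.linspace(0.0, 1.0, 201)
X = matricant_picard(P, ts)
print(np.max(np.abs(X[-1] - expm(P(0.0)))))        # only the small quadrature error remains
```

The iteration converges for any continuous P on a finite interval, exactly as the majorant argument above shows; the residual discrepancy here comes from the quadrature rule, not from the series.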

2. We mention a few formulas involving the matricant.

²² The representation of the matricant in the form of such a series was first obtained by Peano [308].

1. Ω_{t₀}^t = Ω_{t₁}^t Ω_{t₀}^{t₁}  (t₀, t₁, t ∈ (a, b)).

For since Ω_{t₀}^t and Ω_{t₁}^t are two solutions of (37), we have


Ω_{t₀}^t = Ω_{t₁}^t C  (C is a constant matrix). Setting t = t₁ in this equation, we obtain C = Ω_{t₀}^{t₁}.

2. Ω_{t₀}^t(P + Q) = Ω_{t₀}^t(P) Ω_{t₀}^t(B), with B = [Ω_{t₀}^t(P)]⁻¹ Q Ω_{t₀}^t(P).

To derive this formula we set

X = Ω_{t₀}^t(P),  Y = Ω_{t₀}^t(P + Q),

and

Y = XZ.    (41)

Differentiating (41) term by term, we find

(P + Q)XZ = PXZ + X (dZ/dt);

hence

dZ/dt = X⁻¹QXZ,

and since it follows from (41) that Z(t₀) = E,

Z = Ω_{t₀}^t(X⁻¹QX).

When we substitute their respective matricants for X, Y, Z in (41), we obtain the formula 2.

3. ln |Ω_{t₀}^t(P)| = ∫_{t₀}^t tr P dτ.

This formula follows from the Jacobi identity (4) (p. 114) when we substitute Ω_{t₀}^t(P) for Y(t) in that identity.

4. If A = ‖a_{ik}‖₁ⁿ = const., then

Ω_{t₀}^t(A) = e^{A(t−t₀)}.
We introduce the following notation. If P = ‖p_{ik}‖₁ⁿ, then we shall mean by mod P the matrix

mod P = ‖ |p_{ik}| ‖₁ⁿ.

Furthermore, if A = ‖a_{ik}‖₁ⁿ and B = ‖b_{ik}‖₁ⁿ are two real matrices and

a_{ik} ≤ b_{ik}  (i, k = 1, 2, …, n),

then we shall write

A ≤ B.


Then it follows from the representation (40) that:

5. If mod P(t) ≤ mod Q(t) (t ≥ t₀), then the series (40) for Ω_{t₀}^t(P) is majorized, beginning with the first term, by the same series for Ω_{t₀}^t(Q), so that for all t ≥ t₀

mod Ω_{t₀}^t(P) ≤ Ω_{t₀}^t(Q),  mod [Ω_{t₀}^t(P) − E] ≤ Ω_{t₀}^t(Q) − E,

mod [Ω_{t₀}^t(P) − E − ∫_{t₀}^t P dτ] ≤ Ω_{t₀}^t(Q) − E − ∫_{t₀}^t Q dτ,  etc.

In what follows we shall denote by I the matrix of order n in which all the elements are 1:

I = ‖1‖₁ⁿ.

We consider the function g(t) defined on p. 126. Then we have

mod P(t) ≤ g(t) I.

But Ω_{t₀}^t(g(t)I) is the normalized solution of the equation

dX/dt = g(t) I X.

Therefore, by 4.,²³

Ω_{t₀}^t(g(t)I) = e^{I h(t)} = E + (h(t) + n h²(t)/2! + n² h³(t)/3! + ···) I,    (42)

where

h(t) = ∫_{t₀}^t g(τ)dτ,  g(t) = max_{1≤i,k≤n} |p_{ik}(t)|.

Therefore it follows from 5. and (42) that:

6. mod Ω_{t₀}^t(P) ≤ E + (1/n)(e^{nh(t)} − 1) I,

mod [Ω_{t₀}^t(P) − E] ≤ (1/n)(e^{nh(t)} − 1) I,

mod [Ω_{t₀}^t(P) − E − ∫_{t₀}^t P dτ] ≤ (1/n)(e^{nh(t)} − 1 − nh(t)) I,  etc.

²³ By replacing the independent variable t by h = ∫_{t₀}^t g(τ)dτ.

We shall now derive an important formula giving an estimate for the modulus of the difference between two matricants:


7. mod [Ω_{t₀}^t(P) − Ω_{t₀}^t(Q)] ≤ (1/n) e^{nq(t−t₀)} (e^{nd(t−t₀)} − 1) I  (t ≥ t₀),

if

mod Q ≤ qI,  mod (P − Q) ≤ dI,  I = ‖1‖₁ⁿ

(q, d are non-negative numbers; n is the order of P and Q). We denote the difference P − Q by D. Then

P = Q + D,  mod D ≤ dI.

Using the expansion (40) of the matricant in a series, we find:

Ω_{t₀}^t(Q + D) − Ω_{t₀}^t(Q) = ∫_{t₀}^t D(τ)dτ + ∫_{t₀}^t D(τ)∫_{t₀}^τ Q(σ)dσ dτ + ∫_{t₀}^t Q(τ)∫_{t₀}^τ D(σ)dσ dτ + ∫_{t₀}^t D(τ)∫_{t₀}^τ D(σ)dσ dτ + ···.

From this expression it is clear that, for t ≥ t₀,

mod [Ω_{t₀}^t(Q + D) − Ω_{t₀}^t(Q)] ≤ Ω_{t₀}^t(mod Q + mod D) − Ω_{t₀}^t(mod Q)

≤ Ω_{t₀}^t((q + d)I) − Ω_{t₀}^t(qI)

= (1/n)(e^{n(q+d)(t−t₀)} − e^{nq(t−t₀)}) I

= (1/n) e^{nq(t−t₀)} (e^{nd(t−t₀)} − 1) I.

We shall now show how to express by means of the matricant the general solution of a system of linear differential equations with right-hand sides:

dx_i/dt = Σ_{k=1}^n p_{ik}(t)x_k + f_i(t)  (i = 1, 2, …, n);    (43)

p_{ik}(t) and f_i(t) (i, k = 1, 2, …, n) are continuous functions of t in some interval.

By introducing the column matrices ('vectors') x = (x₁, x₂, …, x_n) and f = (f₁, f₂, …, f_n) and the square matrix P = ‖p_{ik}‖₁ⁿ, we write the system as follows:

dx/dt = P(t)x + f(t).    (43')


We shall look for a solution of this equation in the form

x = Ω_{t₀}^t(P) z,    (44)

where z is an unknown column depending on t. We substitute this expression for x in (43') and obtain

P Ω_{t₀}^t(P) z + Ω_{t₀}^t(P) (dz/dt) = P Ω_{t₀}^t(P) z + f(t);

hence

dz/dt = [Ω_{t₀}^t(P)]⁻¹ f(t).

Integrating this, we find

z = ∫_{t₀}^t [Ω_{t₀}^τ(P)]⁻¹ f(τ) dτ + c,

where c is an arbitrary constant vector. Substituting this expression in (44), we obtain:

x = Ω_{t₀}^t(P) ∫_{t₀}^t [Ω_{t₀}^τ(P)]⁻¹ f(τ)dτ + Ω_{t₀}^t(P) c.    (45)

When we give to t the value t₀, we find x(t₀) = c. Therefore (45) assumes the form

x = Ω_{t₀}^t(P) x(t₀) + ∫_{t₀}^t K(t, τ) f(τ) dτ,    (45')

where

K(t, τ) = Ω_{t₀}^t(P) [Ω_{t₀}^τ(P)]⁻¹

is the so-called Cauchy matrix.
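Formula (45') can be tried out numerically in the constant-coefficient case, where the matricant is e^{A(t−t₀)} and the Cauchy matrix reduces to K(t, τ) = e^{A(t−τ)}. The sketch below (assuming NumPy/SciPy; the matrix A, the forcing f, and the initial vector are arbitrary illustrative choices) evaluates (45') by quadrature and cross-checks it against a direct numerical solve of (43').

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Constant-coefficient instance of (45'): x(t) = Ω x(t0) + ∫ K(t,τ) f(τ) dτ,
# with Ω = e^{A(t-t0)} and Cauchy matrix K(t,τ) = e^{A(t-τ)}.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
f = lambda t: np.array([np.sin(t), 1.0])
x0, t0, t1 = np.array([1.0, 0.0]), 0.0, 2.0

taus = np.linspace(t0, t1, 2001)
vals = np.array([expm(A*(t1 - tau)) @ f(tau) for tau in taus])
integral = np.zeros(2)
for m in range(1, len(taus)):                      # trapezoidal ∫ K(t1,τ) f(τ) dτ
    integral += 0.5*(taus[m] - taus[m-1])*(vals[m-1] + vals[m])
x_formula = expm(A*(t1 - t0)) @ x0 + integral

# direct numerical solution of dx/dt = A x + f(t) for comparison
sol = solve_ivp(lambda t, x: A @ x + f(t), (t0, t1), x0, rtol=1e-10, atol=1e-12)
print(np.max(np.abs(x_formula - sol.y[:, -1])))    # agreement up to quadrature error
```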

§ 6. The Multiplicative Integral. The Infinitesimal Calculus of Volterra

1. Let us consider the matricant Ω_{t₀}^t(P). We divide the basic interval (t₀, t) into n parts by introducing intermediate points t₁, t₂, …, t_{n−1}, and set Δt_k = t_k − t_{k−1} (k = 1, 2, …, n; t_n = t). Then by property 1. of the matricant (see the preceding section),

Ω_{t₀}^t = Ω_{t_{n−1}}^{t_n} ··· Ω_{t₁}^{t₂} Ω_{t₀}^{t₁}.    (46)


In the interval (t_{k−1}, t_k) we choose an intermediate point τ_k (k = 1, 2, …, n). By regarding the Δt_k as small quantities of the first order we can take, for the computation of Ω_{t_{k−1}}^{t_k} to within small quantities of the second order, P(t) ≈ const. = P(τ_k). Then

Ω_{t_{k−1}}^{t_k} = e^{P(τ_k)Δt_k} + ⋯;    (47)

here the dots denote a sum of terms beginning with terms of the second order.

From (46) and (47) we find:

Ω_{t₀}^t = e^{P(τ_n)Δt_n} ··· e^{P(τ₂)Δt₂} e^{P(τ₁)Δt₁} + ⋯    (48)

and

Ω_{t₀}^t = [E + P(τ_n)Δt_n] ··· [E + P(τ₂)Δt₂][E + P(τ₁)Δt₁] + ⋯.    (49)

When we pass to the limit by increasing the number of intervals indefinitely and letting the length of these intervals tend to zero (the small terms disappear in the limit),²⁴ we obtain the exact limit formulas

Ω_{t₀}^t(P) = lim_{Δt_k→0} [e^{P(τ_n)Δt_n} ··· e^{P(τ₂)Δt₂} e^{P(τ₁)Δt₁}]    (48')

and

Ω_{t₀}^t(P) = lim_{Δt_k→0} [E + P(τ_n)Δt_n] ··· [E + P(τ₂)Δt₂][E + P(τ₁)Δt₁].    (49')

The expression under the limit sign on the right-hand side of the latter equation is the product integral.²⁵ We shall call its limit the multiplicative integral and denote it by the symbol

∫̂_{t₀}^t [E + P(t) dt] = lim_{Δt_k→0} [E + P(τ_n)Δt_n] ··· [E + P(τ₁)Δt₁].    (50)

The formula (49') gives a representation of the matricant in the form of a multiplicative integral

Ω_{t₀}^t(P) = ∫̂_{t₀}^t (E + P dt),    (51)

and the formulas (48) and (49) may be used for the approximate computation of the matricant.

²⁴ These arguments can be made more precise by an estimate of the terms we have denoted by dots. For a rigorous deduction of (48') we have to use formula 7. of § 5, in which the matrix Q(t) must be replaced by the piece-wise constant matrix

Q(t) = P(τ_k)  (t_{k−1} ≤ t ≤ t_k; k = 1, 2, …, n).

²⁵ An analogue to the sum integral for the ordinary integral.
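The approximate computation suggested by (48) is easy to carry out: take a fine partition, evaluate P at the midpoint of each subinterval, and multiply the resulting matrix exponentials, later factors on the left. The sketch below (assuming NumPy/SciPy; the time-dependent P is an arbitrary illustrative choice) compares the finite product with a high-accuracy direct solve of dX/dt = P(t)X.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Finite product from (48): Ω ≈ e^{P(τ_n)Δt_n} ··· e^{P(τ_2)Δt_2} e^{P(τ_1)Δt_1},
# with τ_k taken as the midpoint of the k-th subinterval.
P = lambda t: np.array([[0.0, 1.0], [-(1.0 + t), 0.0]])
t0, t1, N = 0.0, 1.0, 2000
ts = np.linspace(t0, t1, N + 1)
Omega = np.eye(2)
for k in range(N):
    tau = 0.5*(ts[k] + ts[k+1])                       # midpoint τ_k of (t_{k-1}, t_k)
    Omega = expm(P(tau)*(ts[k+1] - ts[k])) @ Omega    # later factors multiply on the left

# cross-check against a direct numerical solve of dX/dt = P(t) X, X(t0) = E
rhs = lambda t, y: (P(t) @ y.reshape(2, 2)).ravel()
sol = solve_ivp(rhs, (t0, t1), np.eye(2).ravel(), rtol=1e-11, atol=1e-13)
print(np.max(np.abs(Omega - sol.y[:, -1].reshape(2, 2))))
```

The ordering of the factors matters precisely because the values P(τ_k) in different subintervals need not commute, which is the point developed in the next paragraphs.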


The multiplicative integral was first introduced by Volterra in 1887. On the basis of this concept Volterra developed an original infinitesimal calculus for matrix functions (see [63]).²⁶

The whole peculiarity of the multiplicative integral is tied up with the fact that the various values of the matrix function P(t) in subintervals are not permutable. In the very special case when all these values are permutable,

P(t′)P(t″) = P(t″)P(t′)  (t′, t″ ∈ (t₀, t)),

the multiplicative integral, as is clear from (48') and (51), reduces to the matrix

e^{∫_{t₀}^t P(t)dt}.
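This collapse in the commuting case can be observed directly. In the sketch below (assuming NumPy/SciPy) we take P(t) = g(t)A with a fixed matrix A and scalar g, so that all values of P commute, and check that the matricant equals e^{A ∫ g dt}; both A and g are arbitrary illustrative choices.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Commuting case: P(t) = g(t) A, hence P(t')P(t'') = P(t'')P(t') for all t', t'',
# and the multiplicative integral reduces to e^{∫ P dt} = e^{A ∫ g dt}.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
g = lambda t: 1.0 + t
t0, t1 = 0.0, 1.0
G = (t1 - t0) + 0.5*(t1**2 - t0**2)      # ∫_{t0}^{t1} g(t) dt = t + t²/2 evaluated at the ends
closed = expm(A*G)

# compare with a direct numerical solve of dX/dt = g(t) A X, X(t0) = E
rhs = lambda t, y: (g(t)*A @ y.reshape(2, 2)).ravel()
sol = solve_ivp(rhs, (t0, t1), np.eye(2).ravel(), rtol=1e-11, atol=1e-13)
print(np.max(np.abs(closed - sol.y[:, -1].reshape(2, 2))))
```

For a P(t) whose values do not commute, e^{∫ P dt} generally differs from the matricant, which is exactly why the ordered product in (50) is needed.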

2. We now introduce the multiplicative derivative

D_t X = (dX/dt) X⁻¹.

The operations D_t and ∫̂ are mutually inverse: if

D_t X = P,

then²⁷

X = ∫̂_{t₀}^t (E + P dt) · C  (C = X(t₀)),    (52)

and vice versa. The last formula can also be written as follows:²⁸

∫̂_{t₀}^t (E + P dt) = X(t) X(t₀)⁻¹.    (53)

We leave it to the reader to verify the following differential and integral formulas:²⁹

²⁶ The multiplicative integral (in German, Produkt-Integral) was used by Schlesinger in investigating systems of linear differential equations with analytic coefficients [49] and [50]; see also [321]. The multiplicative integral (50) exists not only for a function P(t) that is continuous in the interval of integration, but also under considerably more general conditions (see [116]).

²⁷ Here the arbitrary constant matrix C is an analogue to the arbitrary additive constant in the ordinary indefinite integral.

²⁸ An analogue to the formula ∫_{t₀}^t P dt = X(t) − X(t₀), where dX/dt = P.

²⁹ These formulas can be deduced immediately from the definitions of the multiplicative derivative and multiplicative integral (see [63]). However, the integral formulas are obtained more quickly and simply if the multiplicative integral is regarded as a matricant and the properties of the matricant that were expounded in the preceding section are used (see [49]).


DIFFERENTIAL FORMULAS

I. D_t(XY) = D_t(X) + X D_t(Y) X⁻¹;
   D_t(XC) = D_t(X)  (C is a constant matrix);
   D_t(CY) = C D_t(Y) C⁻¹.

II. D_t(Xᵀ) = Xᵀ (D_t X)ᵀ (Xᵀ)⁻¹.

III. D_t(X⁻¹) = −X⁻¹ D_t(X) X = −(D_t(Xᵀ))ᵀ;
     D_t((Xᵀ)⁻¹) = −(D_t(X))ᵀ.

INTEGRAL FORMULAS

IV. ∫̂_{t₀}^t (E + P dτ) = ∫̂_{t₁}^t (E + P dτ) · ∫̂_{t₀}^{t₁} (E + P dτ).

V. ∫̂_t^{t₀} (E + P dτ) = [∫̂_{t₀}^t (E + P dτ)]⁻¹.

VI. ∫̂_{t₀}^t (E + CPC⁻¹ dτ) = C ∫̂_{t₀}^t (E + P dτ) C⁻¹  (C is a constant matrix).

VII. ∫̂_{t₀}^t [E + (Q + D_t X) dτ] = X(t) ∫̂_{t₀}^t (E + X⁻¹QX dτ) X(t₀)⁻¹.³⁰

VIII. mod [∫̂_{t₀}^t (E + P dτ) − ∫̂_{t₀}^t (E + Q dτ)] ≤ (1/n) e^{nq(t−t₀)} (e^{nd(t−t₀)} − 1) I  (t ≥ t₀),

if

mod Q ≤ qI,  mod (P − Q) ≤ dI,  I = ‖1‖₁ⁿ

(q and d are non-negative numbers; n is the order of P and Q).

Suppose now that the matrices P and Q depend on the same parameter α:

P = P(τ, α),  Q = Q(τ, α),

and that

lim_{α→α₀} P(τ, α) = lim_{α→α₀} Q(τ, α) = P₀(τ),

where the limit is approached uniformly with respect to τ in the interval (t₀, t) in question. Furthermore, let us assume that for α → α₀ the matrix Q(τ, α) is bounded in modulus by qI, where q is a positive constant. Then, setting

d(α) = max_{1≤i,k≤n; t₀≤τ≤t} |p_{ik}(τ, α) − q_{ik}(τ, α)|,

we have

lim_{α→α₀} d(α) = 0.

³⁰ The formula VII can be regarded in a certain sense as an analogue to the formula for integration by parts in ordinary integrals. VII follows from 2. of § 5.


Therefore it follows from formula VIII that

lim_{α→α₀} [∫̂_{t₀}^t (E + P dτ) − ∫̂_{t₀}^t (E + Q dτ)] = 0.

In particular, if Q does not depend on α (Q(τ, α) ≡ P₀(τ)), we obtain:

lim_{α→α₀} ∫̂_{t₀}^t [E + P(τ, α) dτ] = ∫̂_{t₀}^t [E + P₀(τ) dτ],

where

P₀(τ) = lim_{α→α₀} P(τ, α).

§ 7. Differential Systems in a Complex Domain. General Properties

1. We consider a system of differential equations

dx_i/dz = Σ_{k=1}^n p_{ik}(z) x_k  (i = 1, 2, …, n).    (54)

Here the given functions p_{ik}(z) and the unknown functions x_i(z) (i, k = 1, 2, …, n) are supposed to be single-valued analytic functions of the complex argument z, regular in a domain G of the complex z-plane.

Introducing the square matrix P(z) = ‖p_{ik}(z)‖₁ⁿ and the column matrix x = (x₁, x₂, …, x_n), we can write the system (54), as in the case of a real argument (§ 1), in the form

dx/dz = P(z) x.    (54')

Denoting an integral matrix, i.e., a matrix whose columns are n linearly independent solutions of (54), by X, we can write instead of (54'):

dX/dz = P(z) X.    (55)

Jacobi's formula holds also for a complex argument z:

|X| = c e^{∫_{z₀}^z tr P dz}.    (56)

Here it is assumed that z₀ and all the points of the path along which the integral is taken are regular points for the single-valued analytic function tr P(z) = p₁₁(z) + p₂₂(z) + ··· + p_{nn}(z).³²

³² Here, and in what follows, the path of integration is taken as a sectionally smooth curve.


2. A peculiar feature of the case of a complex argument is the fact that for a single-valued function P(z) the integral matrix X(z) may well be a many-valued function of z.

As an example, we consider the Cauchy system

dX/dz = (U/(z − a)) X  (U is a constant matrix).    (57)

One of the solutions of this system, as in the case of a real argument (see p. 115), is the integral matrix

X = e^{U ln(z−a)} = (z − a)^U.    (58)

For the domain G we take the whole z-plane except the point z = a. All the points of this domain are regular points of the coefficient matrix

P(z) = U/(z − a).

If U ≠ 0, then z = a is a singular point (a pole of the first order) of the matrix function P(z) = U/(z − a).

An element of the integral matrix (58), after going around the point z = a once in the positive direction, returns with a new value which is obtained from the old one by multiplication on the right by the constant matrix

V = e^{2πiU}.
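This monodromy relation can be reproduced numerically: continue the normalized solution of (57) once around the singular point along the unit circle z = a + e^{iθ} and compare the result with e^{2πiU}. The sketch below assumes NumPy/SciPy, and the matrix U is an arbitrary illustrative choice.

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

# Integrate dX/dz = U/(z-a) X along z = a + e^{iθ}, θ: 0 → 2π, starting from X = E.
# After one positive circuit the solution should be multiplied by V = e^{2πiU}.
U = np.array([[0.25, 1.0], [0.0, -0.5]], dtype=complex)
a = 0.0

def rhs(theta, y):
    z = a + np.exp(1j*theta)                  # point on the circuit
    dz = 1j*np.exp(1j*theta)                  # dz/dθ
    X = y.reshape(2, 2)
    return ((U/(z - a)) @ X * dz).ravel()

sol = solve_ivp(rhs, (0.0, 2*np.pi), np.eye(2, dtype=complex).ravel(),
                rtol=1e-10, atol=1e-12)
V = sol.y[:, -1].reshape(2, 2)
print(np.max(np.abs(V - expm(2j*np.pi*U))))   # the circuit reproduces e^{2πiU}
```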

In the general case of a system (55) we see, by the same reasoning as in the case of a real argument, that two single-valued solutions X and X̃ are always connected in some part of the domain G by the formula

X̃ = XC,

where C is a constant matrix. This formula remains valid under any analytic continuation of the functions X(z) and X̃(z) in G.

The proof of the theorem on the existence and (for given initial values) uniqueness of the solution of (54) is similar to that of the real case.

Let us consider a simply-connected star domain G₁ (relative to z₀)³³ forming part of G, and let the matrix function P(z) be regular³⁴ in G₁. We form the series

E + ∫_{z₀}^z P(ζ)dζ + ∫_{z₀}^z P(ζ)∫_{z₀}^ζ P(ζ′)dζ′ dζ + ···.    (59)

³³ A domain is called a star domain relative to a point z₀ if every segment joining z₀ to an arbitrary point z of the domain lies entirely in the given domain.

³⁴ I.e., all the elements p_{ik}(z) (i, k = 1, 2, …, n) of P(z) are regular functions in G₁.


Since G₁ is simply-connected, it follows that every integral that occurs in (59) is independent of the path of integration and is a regular function in G₁. Since G₁ is a star domain relative to z₀, we may assume for the purpose of an estimate of the moduli of these integrals that they are all taken along the straight-line segment joining z₀ and z.

That the series (59) converges absolutely and uniformly in every closed part of G₁ containing z₀ follows from the convergence of the majorant

1 + lM + l²M²/2! + l³M³/3! + ···.

Here M is an upper bound for the modulus of P(z) and l an upper bound for the distance of z from z₀, and both bounds refer to the closed part of G₁ in question.

By differentiating term by term we verify that the sum of the series (59) is a solution of (55). This solution is normalized, because for z = z₀ it reduces to the unit matrix E. The single-valued normalized solution of (55) will be called, as in the real case, a matricant and will be denoted by Ω_{z₀}^z(P). Thus we have obtained a representation of the matricant in G₁ in the form of a series³⁵

Ω_{z₀}^z(P) = E + ∫_{z₀}^z P(ζ)dζ + ∫_{z₀}^z P(ζ)∫_{z₀}^ζ P(ζ′)dζ′dζ + ···.    (60)

The properties 1.–4. of the matricant that were set up in § 5 automatically carry over to the case of a complex argument.

Any solution of (55) that is regular in G₁ and reduces to the matrix X₀ for z = z₀ can be represented in the form

X = Ω_{z₀}^z(P) · C  (C = X₀).    (61)

The formula (61) comprises all single-valued solutions that are regular in a neighborhood of z₀ (z₀ is a regular point of the coefficient matrix P(z)). These solutions, when continued analytically in G, give all the solutions of (55); i.e., the equation (55) cannot have any solutions for which z₀ would be a singular point.

For the analytic continuation of the matricant in G it is convenient to use the multiplicative integral.

³⁵ Our proof for the existence of a normalized solution and its representation in G₁ by the series (60) remains valid if, instead of the assumption that the domain is a star domain, we make a wider assumption, namely, that for every closed part of G₁ there exists a positive number l such that every point z of this closed part can be joined to z₀ by a path of length not exceeding l.


§ 8. The Multiplicative Integral in a Complex Domain

1. The multiplicative integral along a curve in the complex plane is defined in the following way.

Suppose that L is some path and P(z) a matrix function continuous on L. We divide the path L into n parts (z₀, z₁), (z₁, z₂), …, (z_{n−1}, z_n); here z₀ is the beginning and z_n = z the end of the path, and z₁, z₂, …, z_{n−1} are intermediate points of division. On the segment (z_{k−1}, z_k) we take an arbitrary point ζ_k, and we use the notation Δz_k = z_k − z_{k−1} (k = 1, 2, …, n). We then define

∫̂_L [E + P(z) dz] = lim_{Δz_k→0} [E + P(ζ_n)Δz_n] ··· [E + P(ζ₁)Δz₁].

When we compare this definition with that on p. 132, we see that they coincide in the special case where L is a segment of the real axis. However, even in the general case, where L is located anywhere in the complex plane, the new definition may be reduced to the old one by a change of the variable of integration.

If

z = z(t)

is a parametric equation of the path, where z(t) is a continuous function in the interval (t₀, t) with a piece-wise continuous derivative dz/dt, then it is easy to see that

∫̂_L [E + P(z) dz] = ∫̂_{t₀}^t {E + P[z(t)] (dz/dt) dt}.

This formula shows that the multiplicative integral along an arbitrary path exists if the matrix P(z) under the integral sign is continuous along this path.³⁶

2. The multiplicative derivative is defined by the previous formula

D_z X = (dX/dz) X⁻¹.

Here it is assumed that X(z) is an analytic function. All the differential formulas (I–III) of the preceding section carry over without change to the case of a complex argument. As regards the integral formulas IV–VI, their outward form has to be modified somewhat:

³⁶ See footnote 26. Even when P(z) is continuous along L, the function P[z(t)] dz/dt may only be sectionally continuous. In this case we can split the interval (t₀, t) into partial intervals in each of which the derivative dz/dt is continuous, and can interpret the integral from t₀ to t as the sum of the integrals along these partial intervals.


IV′. ∫̂_{L′+L″} (E + P dz) = ∫̂_{L″} (E + P dz) · ∫̂_{L′} (E + P dz).

V′. ∫̂_{−L} (E + P dz) = [∫̂_L (E + P dz)]⁻¹.

VI′. ∫̂_L (E + CPC⁻¹ dz) = C ∫̂_L (E + P dz) C⁻¹  (C is a constant matrix).

In IV′ we have denoted by L′ + L″ the composite path that is obtained by traversing first L′ and then L″. In V′, −L denotes the path that differs from L only in direction.

The formula VII now assumes the form

VII′. ∫̂_L [E + (Q + D_z X) dz] = X(z) ∫̂_L (E + X⁻¹QX dz) X(z₀)⁻¹.

Here X(z₀) and X(z) on the right-hand side denote the values of X(z) at the beginning and at the end of L, respectively.

Formula VIII is now replaced by the formula

VIII′. mod [∫̂_L (E + P dz) − ∫̂_L (E + Q dz)] ≤ (1/n) e^{nql} (e^{ndl} − 1) I,

where mod Q ≤ qI, mod (P − Q) ≤ dI, I = ‖1‖₁ⁿ, and l is the length of L.

VIII′ is easily obtained from VIII if we make a change of variable in the latter and take as the new variable of integration the arc length s along L (with |dz/ds| = 1).

3. As in the case of a real argument, there exists a close connection between the multiplicative integral and the matricant.

Suppose that P(z) is a single-valued analytic matrix function, regular in G, and that G₁ is a simply-connected domain containing z₀ and forming part of G. Then the matricant Ω_{z₀}^z(P) is a regular function of z in G₁.

We join the points z₀ and z by an arbitrary path L lying entirely in G₁, and we choose on L intermediate points z₁, z₂, …, z_{n−1}. Then, using the equation

Ω_{z₀}^z = Ω_{z_{n−1}}^z ··· Ω_{z₁}^{z₂} Ω_{z₀}^{z₁}

and proceeding to the limit exactly as in § 6 (p. 132), we obtain:


Ω_{z₀}^z(P) = ∫̂_L (E + P dz) = ∫̂_{z₀}^z (E + P dz).    (62)

From this formula it is clear that the multiplicative integral depends not on the form of the path, but only on the initial point and the end point, if the whole path of integration lies in the simply-connected domain G₁ within which the integrand P(z) is regular. In particular, for a closed contour L in G₁ we have:

∮̂ (E + P dz) = E.    (63)

This formula is an analogue to Cauchy's well-known theorem, according to which the ordinary (non-multiplicative) integral along a closed contour is zero if the contour lies in a simply-connected domain within which the integrand is regular.

4. The representation of the matricant in the form of the multiplicative integral (62) can be used for the analytic continuation of the matricant along an arbitrary path L in G. In this case the formula

X = ∫̂_{z₀}^z (E + P dz) X₀    (64)

gives all those branches of the many-valued integral matrix X of the differential equation dX/dz = PX that for z = z₀ reduce to X₀ on one of the branches. The various branches are obtained by taking account of the various paths joining z₀ and z.

By Jacobi's formula (56),

|∫̂_{z₀}^z (E + P dz) X₀| = |X₀| e^{∫_{z₀}^z tr P dz},

and, in particular, for X₀ = E,

|∫̂_{z₀}^z (E + P dz)| = e^{∫_{z₀}^z tr P dz}.    (65)

From this formula it follows that the multiplicative integral is always a non-singular matrix, provided only that the path of integration lies entirely in a domain in which P(z) is regular.

If L is an arbitrary closed path in G and G is not a simply-connected domain, then (63) need not hold. Moreover, the value of the integral

∮̂ (E + P dz)

is not determined by specification of the integrand and the closed path of integration L alone, but also depends on the choice of the initial point of integration z₀ on L. For let us take on the closed curve L two points z₀ and z₁, and let us denote the portions of the path from z₀ to z₁ and from z₁ to z₀ (in the direction of integration) by L₁ and L₂, respectively. Then, by formula IV′,³⁷

∮̂_{z₀} = ∫̂_{L₂} · ∫̂_{L₁},  ∮̂_{z₁} = ∫̂_{L₁} · ∫̂_{L₂},

and therefore

∮̂_{z₁} = ∫̂_{L₁} · ∮̂_{z₀} · [∫̂_{L₁}]⁻¹.    (66)

The formula (66) shows that the symbol ∮̂ (E + P dz) determines a certain matrix to within a similarity transformation, i.e., determines only the elementary divisors of that matrix.

We consider an element X(z) of the solution (64) in a neighborhood of z₀. Let L be an arbitrary closed path in G beginning and ending at z₀. After analytic continuation along L the element X(z) goes over into an element X̃(z). But the new element X̃(z) satisfies the same differential equation (55), since P(z) is a single-valued function in G. Therefore

X̃ = XV,

where V is a non-singular constant matrix. From (64) it follows that

X̃(z₀) = ∮̂_{z₀} (E + P dz) X₀.

Comparing this equation with the preceding one, we find:

V = X₀⁻¹ ∮̂_{z₀} (E + P dz) X₀.    (67)

In particular, for the matricant X we have X₀ = E, and then

V = ∮̂_{z₀} (E + P dz).    (68)

³⁷ To simplify the notation we have omitted the expression to be integrated, E + P dz, which is the same for all the integrals.


§ 9. Isolated Singular Points

1. We shall now deal with the behavior of a solution (an integral matrix) in a neighborhood of an isolated singular point a.

Let the matrix function P(z) be regular for the values of z satisfying the inequality

0 < |z − a| < R.

The set of these values forms a doubly-connected domain G. The matrix function P(z) has in G an expansion in a Laurent series

P(z) = Σ_{m=−∞}^{+∞} P_m (z − a)^m.    (69)

An element X(z) of the integral matrix, after going once around a in the positive direction along a path L, goes over into an element

X̃(z) = X(z) V,

where V is a constant non-singular matrix. Let U be the constant matrix that is connected with V by the relation

V = e^{2πiU}.    (70)

Then the matrix function (z − a)^U, after going around a along L, goes over into (z − a)^U V. Therefore the matrix function

F(z) = X(z) (z − a)^{−U},    (71)

which is analytic in G, goes over into itself (remains unchanged) by analytic continuation along L.³⁸ Therefore the matrix function F(z) is regular in G and can be expanded in G in a Laurent series

F(z) = Σ_{m=−∞}^{+∞} F_m (z − a)^m.    (72)

From (71) it follows that:

X(z) = F(z) (z − a)^U.    (73)

³⁸ Hence it follows that when z traverses any other closed path in G, the function F(z) returns to its original value.

Thus every integral matrix X(z) can be represented in the form (73), where the single-valued function F(z) and the constant matrix U depend on

Page 154: THE THEORY OF MATRICES - 上海交通大学数学系math.sjtu.edu.cn/faculty/tyaglov/courses/linear algebra/The_book_ad… · as possible, assuming only that the reader is acquainted

§ 9. ISOLATED SINGULAR POINTS 143

the coefficient matrix P(z). However, the algorithmic determination of Uand of the coefficients F. in (72) from the coefficients P,, in (69) is, ingeneral, a complicated task.
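The passage from an integral substitution V to an exponent matrix U with V = e^{2πiU}, as in (70), can be illustrated numerically. The sketch below is our own illustration, not part of the text; the helper names matrix_exp and log_of_monodromy are ours, and the matrix V is hypothetical data. For a diagonalizable V one branch of U = (1/2πi) ln V is obtained by taking logarithms of the characteristic values:

```python
import numpy as np

def matrix_exp(A, terms=60):
    """Exponential of a small matrix via its power series (adequate here)."""
    result = np.eye(A.shape[0], dtype=complex)
    term = np.eye(A.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ A / k
        result += term
    return result

def log_of_monodromy(V):
    """One solution U of V = e^{2 pi i U}, assuming V is diagonalizable
    (a branch of the logarithm is chosen for each characteristic value)."""
    w, S = np.linalg.eig(V)
    U = S @ np.diag(np.log(w.astype(complex)) / (2j * np.pi)) @ np.linalg.inv(S)
    return U

V = np.array([[2.0, 1.0], [0.0, 3.0]])          # hypothetical monodromy matrix
U = log_of_monodromy(V)
print(np.allclose(matrix_exp(2j * np.pi * U), V))   # True
```

Since the logarithm branches are free, U is determined by V only up to such choices; any admissible U reproduces V under e^{2πiU}.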

A special case of the problem, where

P(z) = Σ_{n=−1}^{∞} Pₙ (z − a)ⁿ,

will be analyzed completely in § 10. In this case, the point a is called a regular singularity of the system (55).

If the expansion (69) has the form

P(z) = Σ_{n=−q}^{∞} Pₙ (z − a)ⁿ     (q > 1; P₋q ≠ 0),

then a is called an irregular singularity of the type of a pole. Finally, if there is an infinity of non-zero matrix coefficients Pₙ with negative powers of z − a in (69), then a is called an essential singularity of the given differential system.

From (73) it follows that under an arbitrary single circuit in the positive direction (along some closed path L) an integral matrix X(z) is multiplied on the right by one and the same matrix

V = e^{2πiU}.

If this circuit begins (and ends) at z₀, then by (67)

V = X(z₀)⁻¹ ∮_{z₀} (E + P dz) X(z₀).     (74)

If instead of X(z) we consider any other integral matrix X̃(z) = X(z)C (C is a constant matrix; |C| ≠ 0), then, as is clear from (74), V is replaced by the similar matrix

Ṽ = C⁻¹ V C.

Thus, the 'integral substitutions' V of the given system form a class of similar matrices. From (74) it also follows that the integral

∮_{z₀} (E + P dz)     (75)

is determined by the initial point z₀ and does not depend on the form of the closed path.³⁹ If we change the point z₀, then the various values of the integral that are so obtained are similar.⁴⁰

These properties of the integral (75) can also be confirmed directly. For let L and L′ be two closed paths in G around z = a with the initial points z₀ and z₀′ (see Fig. 6).

The doubly-connected domain between L and L′ can be made simply-connected by introducing the cut from z₀ to z₀′. The integral along the cut will be denoted by

T = ∫ (E + P dz).

Since the multiplicative integral along a closed contour of a simply-connected domain is E, we have

∮_{L′} · T · (∮_L)⁻¹ · T⁻¹ = E;

hence

∮_{L′} = T · ∮_L · T⁻¹.

Thus, the integral ∮(E + P dz), like V, is determined to within similarity, and we shall occasionally write (74) in the form

V ∼ ∮ (E + P dz),

meaning that the elementary divisors of the matrices on the left-hand and right-hand sides of the equation coincide.
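The multiplicative integral can be approximated numerically as a finite product of factors E + P(zₖ)Δzₖ taken along the circuit. The following sketch is our own illustration (not from the text), applied to a Cauchy system P(z) = C/z with a hypothetical C, for which the integral around z = 0 equals e^{2πiC} exactly:

```python
import numpy as np

def matrix_exp(A, terms=60):
    """Power-series exponential, adequate for the small matrices used here."""
    E = np.eye(A.shape[0], dtype=complex)
    result, term = E.copy(), E.copy()
    for k in range(1, terms):
        term = term @ A / k
        result += term
    return result

# Cauchy system dX/dz = (C/z) X; its multiplicative integral once around
# z = 0 in the positive direction is e^{2 pi i C}.
C = np.array([[0.3, 0.1], [0.0, -0.2]])
N = 40000                                        # number of product factors
zs = np.exp(2j * np.pi * np.arange(N + 1) / N)   # unit circle, starting at z0 = 1
prod = np.eye(2, dtype=complex)
for j in range(N):
    dz = zs[j + 1] - zs[j]
    prod = (np.eye(2) + (C / zs[j]) * dz) @ prod  # accumulate (E + P dz) factors
print(np.allclose(prod, matrix_exp(2j * np.pi * C), atol=1e-2))   # True
```

The first-order product converges to the integral as N grows; the left-multiplication order matches the matricant relation X(z + dz) ≈ (E + P dz) X(z).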

2. As an example, we consider a system with a regular singularity

dX/dz = P(z) X,

where

P(z) = Σ_{n=−1}^{∞} Pₙ (z − a)ⁿ.

Let

Q(z) = P₋₁ / (z − a).

³⁹ Under the condition, of course, that the path of integration goes around a once in the positive direction.

⁴⁰ This follows from (74), or from (66).


Using the formula VIII′ of the preceding section, we estimate the modulus of the difference

D = ∮(E + P dz) − ∮(E + Q dz),     (76)

taking as path of integration a circle of radius r (r < R) in the positive direction. Then with

mod P₋₁ ≤ p₋₁ I,   mod Σ_{n=0}^{∞} Pₙ (z − a)ⁿ ≤ d(r) I   (|z − a| = r),

where I denotes the matrix all of whose elements are 1, we set in VIII′:

d = d(r),   l = 2πr,

and then obtain

mod D ≤ e^{2πp₋₁} (e^{2πr d(r)} − 1) I.

Hence it is clear that⁴¹

lim_{r→0} D = 0.     (77)

On the other hand, the system

dY/dz = Q(z) Y

is a Cauchy system, and in that case we have for an arbitrary choice of the initial point z₀ and for every r < R

∮_{z₀} (E + Q dz) = e^{2πiP₋₁}.

Therefore it follows from (76) and (77) that:

lim_{r→0} ∮_{z₀} (E + P dz) = e^{2πiP₋₁}.     (78)

But the elementary divisors of the integral ∮(E + P dz) do not depend on z₀ and r and coincide with those of the integral substitution V.

From this Volterra in his well-known memoir (see [374]) and his book [63] (pp. 117-120) deduces that the matrices V and e^{2πiP₋₁} are similar, so that the integral substitution V is determined to within similarity by the 'residue' matrix P₋₁.

But this assertion of Volterra is incorrect.

⁴¹ Here we have used the fact that for a suitable choice of d(r)

lim_{r→0} d(r) = d₀,

where d₀ is the greatest of the moduli of the elements of P₀.


From (74) and (78) we can only deduce that the characteristic values of the integral substitution V coincide with those of the matrix e^{2πiP₋₁}. However, the elementary divisors of these matrices may be distinct. For example, for every r ≠ 0 the matrix

‖ a  r; 0  a ‖

has one elementary divisor (λ − a)², but the limit of the matrix for r → 0, i.e., the matrix ‖ a 0; 0 a ‖, has two elementary divisors λ − a, λ − a.

Thus, Volterra's assertion does not follow from (74) and (78). It is not even true in general, as the following example shows.

Let

P(z) = (1/z) ‖ 0  0; 0  −1 ‖ + ‖ 0  1; 0  0 ‖.

The corresponding system of differential equations has the form:

dx₁/dz = x₂,   dx₂/dz = − x₂/z.

Integrating the system we find:

x₁ = c ln z + d,   x₂ = c/z.

The integral matrix

X(z) = ‖ ln z  1; z⁻¹  0 ‖,

when the singular point z = 0 is encircled once in the positive direction, is multiplied on the right by the matrix

V = ‖ 1  0; 2πi  1 ‖.

This matrix has one elementary divisor (λ − 1)². At the same time the matrix

e^{2πiP₋₁} = exp(2πi ‖ 0  0; 0  −1 ‖) = E

has two elementary divisors λ − 1, λ − 1.

3. We now consider the case where the matrix P(z) has a finite number of negative powers of z − a (a is a regular or irregular singularity of the type of a pole):

P(z) = P₋q/(z − a)^q + ··· + P₋₁/(z − a) + Σ_{n=0}^{∞} Pₙ (z − a)ⁿ     (q ≥ 1; P₋q ≠ 0).

We transform the given system

dX/dz = P X     (79)

by setting

X = A(z) Y,     (80)

where A(z) is a matrix function that is regular at z = a and assumes there the value E:

A(z) = E + Σ_{n=1}^{∞} Aₙ (z − a)ⁿ;

the power series on the right-hand side converges for |z − a| < r₁.

The well-known American mathematician G. D. Birkhoff published a theorem in 1913 (see [117]) according to which the transformation (80) can always be chosen such that the coefficient matrix of the transformed system

dY/dz = P*(z) Y     (79′)

contains only negative powers of z − a:

P*(z) = P*₋q/(z − a)^q + ··· + P*₋₁/(z − a).

Birkhoff's theorem with its complete proof is reproduced in the book Ordinary Differential Equations by E. L. Ince.⁴² Moreover, on the basis of these 'canonical' systems (79′) he investigates the behavior of the solution of an arbitrary system in the neighborhood of a singular point.

Nevertheless, Birkhoff's proof contains an error, and the theorem is not true. As a counter-example we can take the same example by which we have above refuted Volterra's claim.⁴³

In this example q = 1, a = 0, and

P₋₁ = ‖ 0  0; 0  −1 ‖,   P₀ = ‖ 0  1; 0  0 ‖,   Pₙ = 0   for n = 1, 2, ....

⁴² See [20], pp. 632-641. Birkhoff and Ince formulate the theorem for the singular point z = ∞. This is no restriction, because every singular point z = a can be carried by the transformation z′ = 1/(z − a) into z′ = ∞.

⁴³ In the case q = 1 the erroneous statement of Birkhoff coincides in essence with Volterra's mistake (see p. 145).


Applying Birkhoff's theorem and substituting in (79) the product AY for X, we obtain, after replacing dY/dz by P*Y and cancelling Y:

dA/dz + A P* = P A.

Equating the coefficients of 1/z and of the free terms we find:

P*₋₁ = P₋₁,   A₁ P₋₁ − P₋₁ A₁ + A₁ = P₀.

Setting

A₁ = ‖ a  b; c  d ‖,

we obtain:

‖ a  0; 2c  d ‖ = ‖ 0  1; 0  0 ‖.

This is a contradictory equation: the element in the first row and second column gives 0 = 1.

In the following section we shall examine, for the case of a regular singularity, what canonical form the system (79) can be transformed into by means of a transformation (80).

§ 10. Regular Singularities

In studying the behavior of a solution in a neighborhood of a singular point we can assume without loss of generality that the singular point is z = 0.⁴⁴

1. Let the given system be

dX/dz = P(z) X,     (81)

where

P(z) = P₋₁/z + Σ_{m=0}^{∞} Pₘ zᵐ     (82)

and the series Σ_{m=0}^{∞} Pₘ zᵐ converges in the circle |z| < r.

We set

X = A(z) Y,     (83)

where

A(z) = E + A₁z + A₂z² + ···.     (84)

⁴⁴ By the transformation z′ = z − a or z′ = 1/z every finite point z = a or the point z = ∞ can be carried into z′ = 0.


Leaving aside for the time being the problem of convergence of the series (84), let us try to determine the matrix coefficients Aₘ such that the transformed system

dY/dz = P*(z) Y,     (85)

where

P*(z) = P*₋₁/z + Σ_{m=0}^{∞} P*ₘ zᵐ,     (86)

is of the simplest possible ('canonical') form.⁴⁵

When we substitute the product AY for X in (81) and use (85), we obtain:

A(z) P*(z) Y + (dA/dz) Y = P(z) A(z) Y.

Multiplying both sides of the equation by Y⁻¹ on the right we find:

P(z) A(z) − A(z) P*(z) = dA/dz.

When we replace here P(z), A(z), and P*(z) by the series (82), (84), and (86) and equate the coefficients of equal powers of z on the two sides, we obtain an infinite system of matrix equations for the unknown coefficients A₁, A₂, ...:⁴⁶

1. P₋₁ = P*₋₁,
2. P₋₁A₁ − A₁(P₋₁ + E) + P₀ = P*₀,
3. P₋₁A₂ − A₂(P₋₁ + 2E) + P₀A₁ − A₁P*₀ + P₁ = P*₁,
   ...................................................
(m+2). P₋₁Aₘ₊₁ − Aₘ₊₁[P₋₁ + (m+1)E] + P₀Aₘ − AₘP*₀ + ··· + Pₘ₋₁A₁ − A₁P*ₘ₋₁ + Pₘ = P*ₘ,
   ...................................................     (87)

2. We consider several cases separately:

1. The matrix P₋₁ does not have distinct characteristic values that differ from each other by an integer.

⁴⁵ We shall aim at having only a finite number (and indeed the smallest possible number) of non-zero coefficients P*ₘ in (86).

⁴⁶ In all the equations beginning with the second we replace P*₋₁ by P₋₁, in accordance with the first equation.


In this case the matrices P₋₁ and P₋₁ + kE do not have characteristic values in common for any k = 1, 2, 3, ..., and therefore (see Vol. I, Chapter VIII, § 3)⁴⁷ the matrix equation

P₋₁U − U(P₋₁ + kE) = T

has one and only one solution U for an arbitrary right-hand side T. We shall denote this solution by

U = Φₖ(P₋₁, T).

We can therefore set all the matrices P*ₘ (m = 0, 1, 2, ...) in (87) equal to zero and determine A₁, A₂, ... successively by means of the equations

A₁ = Φ₁(P₋₁, −P₀),   A₂ = Φ₂(P₋₁, −P₁ − P₀A₁), ....

The transformed system is then a Cauchy system

dY/dz = (P₋₁/z) Y,

and so the solution X of the original system (81) is of the form⁴⁸

X = A(z) z^{P₋₁}.     (88)
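The construction of case 1 can be sketched in code; this is our own illustration with hypothetical data, and the name phi mirrors the solution operator Φₖ(P₋₁, T) of the text. The matrix equation P₋₁U − U(P₋₁ + kE) = T becomes an ordinary linear system after column-major vectorization, and the resulting A₁ then satisfies the second equation of (87) with P*₀ = 0:

```python
import numpy as np

def phi(P, k, T):
    """Solve P U - U (P + k E) = T via the vectorized (Kronecker) form:
    (I ⊗ P - (P + kE)^T ⊗ I) vec U = vec T, with column-major vec."""
    n = P.shape[0]
    M = np.kron(np.eye(n), P) - np.kron((P + k * np.eye(n)).T, np.eye(n))
    return np.linalg.solve(M, T.flatten(order='F')).reshape((n, n), order='F')

# Hypothetical data: no two characteristic values of P_{-1} differ by an integer.
P_m1 = np.array([[0.25, 1.0], [0.0, -0.4]])
P0 = np.array([[1.0, 2.0], [3.0, 4.0]])

# Equation 2 of (87) with P*_0 = 0:  P_{-1} A1 - A1 (P_{-1} + E) + P0 = 0.
A1 = phi(P_m1, 1, -P0)
print(np.allclose(P_m1 @ A1 - A1 @ (P_m1 + np.eye(2)) + P0, 0))   # True
```

Solvability is guaranteed here because P₋₁ and P₋₁ + E share no characteristic value, so the Kronecker matrix is non-singular.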

2. Among the distinct characteristic values of P₋₁ there are some whose difference is an integer; furthermore, the matrix P₋₁ is of simple structure.

We denote the characteristic values of P₋₁ by λ₁, λ₂, ..., λₙ and order them in such a way that the inequalities

Re(λ₁) ≥ Re(λ₂) ≥ ··· ≥ Re(λₙ)     (89)

hold.

⁴⁷ However, we can also prove this without referring to Chapter VIII. The proposition in which we are interested is equivalent to the statement that the matrix equation

P₋₁U = U(P₋₁ + kE)     (*)

has only the solution U = 0. Since the matrices P₋₁ and P₋₁ + kE have no characteristic values in common, there exists a polynomial f(λ) for which

f(P₋₁) = 0,   f(P₋₁ + kE) = E.

But from (*) it follows that

f(P₋₁) U = U f(P₋₁ + kE).

Hence U = 0.

⁴⁸ The formula (88) defines one integral matrix of the system (81). Every integral matrix is obtained from (88) by multiplication on the right by an arbitrary constant non-singular matrix C.


Without loss of generality we can replace P₋₁ by a similar matrix. This follows from the fact that when we multiply both sides of (81) on the left by a non-singular matrix T and on the right by T⁻¹, we in fact replace all the Pₘ by TPₘT⁻¹ (m = −1, 0, 1, 2, ...); moreover, X is replaced by TXT⁻¹. Therefore we may assume in this case that P₋₁ is a diagonal matrix:

P₋₁ = ‖ λᵢ δᵢₖ ‖₁ⁿ.     (90)

We introduce a notation for the elements of Pₘ, P*ₘ, and Aₘ:

Pₘ = ‖ pᵢₖ⁽ᵐ⁾ ‖₁ⁿ,   P*ₘ = ‖ pᵢₖ⁽ᵐ⁾* ‖₁ⁿ,   Aₘ = ‖ xᵢₖ⁽ᵐ⁾ ‖₁ⁿ.     (91)

In order to determine A₁, we use the second equation in (87). This matrix equation can be replaced by the scalar equations

(λᵢ − λₖ − 1) xᵢₖ⁽¹⁾ + pᵢₖ⁽⁰⁾ = pᵢₖ⁽⁰⁾*     (i, k = 1, 2, ..., n).     (92)

If none of the differences λᵢ − λₖ is 1, we can set P*₀ = 0. We then have from (87₂) that A₁ = Φ₁(P₋₁, −P₀).⁴⁹

In that case the elements of A₁ are uniquely determined from (92):

xᵢₖ⁽¹⁾ = − pᵢₖ⁽⁰⁾ / (λᵢ − λₖ − 1)     (i, k = 1, 2, ..., n).     (93)

But if for some⁵⁰ i, k

λᵢ − λₖ = 1,

then the corresponding pᵢₖ⁽⁰⁾* is determined from (92):

pᵢₖ⁽⁰⁾* = pᵢₖ⁽⁰⁾,

and the corresponding xᵢₖ⁽¹⁾ can be chosen quite arbitrarily. For those i and k for which λᵢ − λₖ ≠ 1 we set:

pᵢₖ⁽⁰⁾* = 0,

and find the corresponding xᵢₖ⁽¹⁾ from (93).

Having determined A₁, we next determine A₂ from the third equation of (87). We replace this matrix equation by a system of n² scalar equations:

(λᵢ − λₖ − 2) xᵢₖ⁽²⁾ = pᵢₖ⁽¹⁾* − pᵢₖ⁽¹⁾ − (P₀A₁ − A₁P*₀)ᵢₖ     (i, k = 1, 2, ..., n).     (94)

Here we proceed exactly as in the determination of A₁.

⁴⁹ We use the notation introduced in dealing with case 1.

⁵⁰ By (89) this is only possible for i < k.


If λᵢ − λₖ ≠ 2, then we set:

pᵢₖ⁽¹⁾* = 0,

and find from (94):

xᵢₖ⁽²⁾ = − [pᵢₖ⁽¹⁾ + (P₀A₁ − A₁P*₀)ᵢₖ] / (λᵢ − λₖ − 2).

But if λᵢ − λₖ = 2, then it follows from (94) for these i and k that:

pᵢₖ⁽¹⁾* = pᵢₖ⁽¹⁾ + (P₀A₁ − A₁P*₀)ᵢₖ.

In this case xᵢₖ⁽²⁾ is chosen arbitrarily.

Continuing this process we determine all the matrices P*₀, P*₁, ... and A₁, A₂, ... in succession.

Furthermore, only a finite number of the matrices P*ₘ is different from zero and, as is easy to see, P*(z) is of the form⁵¹

P*(z) = ‖ λ₁/z   a₁₂ z^{λ₁−λ₂−1}   ...   a₁ₙ z^{λ₁−λₙ−1} ‖
        ‖ 0      λ₂/z              ...   a₂ₙ z^{λ₂−λₙ−1} ‖
        ‖ ............................................. ‖
        ‖ 0      0                 ...   λₙ/z            ‖,     (95)

where aᵢₖ = 0 when λᵢ − λₖ is not a positive integer, and aᵢₖ = pᵢₖ^{(λᵢ−λₖ−1)*} when λᵢ − λₖ is a positive integer.

We denote by mᵢ the integral part of the number Re λᵢ:⁵²

mᵢ = [Re(λᵢ)]     (i = 1, 2, ..., n).     (96)

Then, by (89),

m₁ ≥ m₂ ≥ ··· ≥ mₙ.

If λᵢ − λₖ is an integer, then

λᵢ − λₖ = mᵢ − mₖ.

⁵¹ P*ₘ (m ≥ 0) can be different from zero only when there exist characteristic values λᵢ and λₖ of P₋₁ such that λᵢ − λₖ − 1 = m (and then, by (89), i < k). For a given m there corresponds to each such equation an element pᵢₖ⁽ᵐ⁾* of the matrix P*ₘ; this element may be different from zero. All the remaining elements of P*ₘ are zero.

⁵² I.e., mᵢ is the largest integer not exceeding Re λᵢ (i = 1, 2, ..., n).


Therefore in the expression (95) for the canonical matrix P*(z) we can replace all the differences λᵢ − λₖ by mᵢ − mₖ. Furthermore, we set:

λ̃ᵢ = λᵢ − mᵢ     (i = 1, 2, ..., n),     (96′)

M = ‖ mᵢ δᵢₖ ‖₁ⁿ,   U = ‖ λ̃₁   a₁₂   ...   a₁ₙ ‖
                        ‖ 0     λ̃₂   ...   a₂ₙ ‖
                        ‖ .................... ‖
                        ‖ 0     0    ...   λ̃ₙ  ‖.     (97)

Then it follows from (95) (see formula I on p. 134):

P*(z) = z^M (U/z) z^{−M} + M/z = D_z(z^M z^U).

Hence Y = z^M z^U is a solution of (85) and

X = A(z) z^M z^U     (98)

is a solution of (81).⁵³

3. The general case. As we have explained above, we may replace P₋₁ without loss of generality by an arbitrary similar matrix. We shall assume that P₋₁ has the Jordan normal form⁵⁴

P₋₁ = {λ₁E₁ + H₁, λ₂E₂ + H₂, ..., λᵤEᵤ + Hᵤ},     (99)

with

Re(λ₁) ≥ Re(λ₂) ≥ ··· ≥ Re(λᵤ).     (100)

Here E denotes the unit matrix and H the matrix in which the elements of the first superdiagonal are 1 and all the remaining elements zero. The orders of the matrices Eᵢ and Hᵢ in distinct diagonal blocks are, in general, different; their orders coincide with the degrees of the corresponding elementary divisors of P₋₁.⁵⁵

In accordance with the representation (99) of P₋₁ we split all the matrices Pₘ, P*ₘ, Aₘ into blocks:

Pₘ = (P⁽ᵐ⁾ᵢₖ),   P*ₘ = (P⁽ᵐ⁾*ᵢₖ),   Aₘ = (X⁽ᵐ⁾ᵢₖ)     (i, k = 1, 2, ..., u).

⁵³ The special form of the matrices (97) corresponds to the canonical form of P₋₁. If P₋₁ does not have the canonical form, then the matrices M and U in (98) are similar to the matrices (97).

⁵⁴ See Vol. I, Chapter VI, § 6.

⁵⁵ To simplify the notation, the index that indicates the order of the matrices is omitted from Eᵢ and Hᵢ.

Then the second of the equations (87) may be replaced by a system of equations

(λᵢEᵢ + Hᵢ) X⁽¹⁾ᵢₖ − X⁽¹⁾ᵢₖ [(λₖ + 1)Eₖ + Hₖ] + P⁽⁰⁾ᵢₖ = P⁽⁰⁾*ᵢₖ,     (101)

which can also be written as follows:

(λᵢ − λₖ − 1) X⁽¹⁾ᵢₖ + Hᵢ X⁽¹⁾ᵢₖ − X⁽¹⁾ᵢₖ Hₖ + P⁽⁰⁾ᵢₖ = P⁽⁰⁾*ᵢₖ     (i, k = 1, 2, ..., u).     (102)

Suppose that⁵⁶

P⁽⁰⁾ᵢₖ = ‖ pₛₜ⁽⁰⁾ ‖,   P⁽⁰⁾*ᵢₖ = ‖ pₛₜ⁽⁰⁾* ‖,   X⁽¹⁾ᵢₖ = ‖ xₛₜ ‖.

Then the matrix equation (102) (for fixed i and k) can be replaced by a system of scalar equations of the form⁵⁷

(λᵢ − λₖ − 1) xₛₜ + x_{s+1,t} − x_{s,t−1} + pₛₜ⁽⁰⁾ = pₛₜ⁽⁰⁾*     (103)

(s = 1, 2, ..., v; t = 1, 2, ..., w; x_{v+1,t} = x_{s,0} = 0),

where v and w are the orders of the matrices λᵢEᵢ + Hᵢ and λₖEₖ + Hₖ in (99).

If λᵢ − λₖ ≠ 1, then in (103) we can set all the pₛₜ⁽⁰⁾* equal to zero and determine all the xₛₜ uniquely from the recurrence relations (103). This means that in the matrix equations (102) we set

P⁽⁰⁾*ᵢₖ = 0

and determine X⁽¹⁾ᵢₖ uniquely.

If λᵢ − λₖ = 1, then the relations (103) assume the form

x_{s+1,t} − x_{s,t−1} + pₛₜ⁽⁰⁾ = pₛₜ⁽⁰⁾*     (104)

(x_{v+1,t} = x_{s,0} = 0; s = 1, 2, ..., v; t = 1, 2, ..., w).

⁵⁶ To simplify the notation, we omit the indices i, k in the elements of the matrices P⁽⁰⁾ᵢₖ, P⁽⁰⁾*ᵢₖ, X⁽¹⁾ᵢₖ.

⁵⁷ The reader should bear in mind the properties of the matrix H that were developed on pp. 13-15 of Vol. I.


It is not difficult to show that the elements xₛₜ of X⁽¹⁾ᵢₖ can be determined from (104) so that the matrix P⁽⁰⁾*ᵢₖ has, depending on its dimensions (v × w), one of the forms

‖ a₀        0    ...  0  ‖
‖ a₁        a₀   ...  0  ‖
‖ ...................... ‖
‖ a_{v−1}   ...  a₁   a₀ ‖     (v = w),

‖ a₀        0    ...  0    0 ... 0 ‖
‖ a₁        a₀   ...  0    0 ... 0 ‖
‖ ................................ ‖
‖ a_{v−1}   ...  a₁   a₀   0 ... 0 ‖     (v < w),

‖ 0         0    ...  0  ‖
‖ ...................... ‖
‖ 0         0    ...  0  ‖
‖ a₀        0    ...  0  ‖
‖ a₁        a₀   ...  0  ‖
‖ ...................... ‖
‖ a_{w−1}   ...  a₁   a₀ ‖     (v > w).     (105)

We shall say of the matrices (105) that they have the regular lower triangular form.⁵⁸

From the third of the equations (87) we can determine A₂. This equation can be replaced by the system

(λᵢ − λₖ − 2) X⁽²⁾ᵢₖ + Hᵢ X⁽²⁾ᵢₖ − X⁽²⁾ᵢₖ Hₖ + (P₀A₁ − A₁P*₀)ᵢₖ + P⁽¹⁾ᵢₖ = P⁽¹⁾*ᵢₖ     (i, k = 1, 2, ..., u).     (106)

In the same way that we determined A₁, we determine X⁽²⁾ᵢₖ uniquely with P⁽¹⁾*ᵢₖ = 0 from (106) provided λᵢ − λₖ ≠ 2. But if λᵢ − λₖ = 2, then X⁽²⁾ᵢₖ can be determined so that P⁽¹⁾*ᵢₖ is of regular lower triangular form.

⁵⁸ Regular upper triangular matrices are defined similarly. The elements of X⁽¹⁾ᵢₖ are not all uniquely determined from (104); there is a certain degree of arbitrariness in the choice of the elements xₛₜ. This is immediately clear from (102): for λᵢ − λₖ = 1 we may add to X⁽¹⁾ᵢₖ an arbitrary matrix permutable with H, i.e., an arbitrary regular upper triangular matrix.


Continuing this process, we determine all the coefficient matrices A₁, A₂, ... and P*₀, P*₁, P*₂, ... in succession. Only a finite number of the coefficients P*ₘ is different from zero, and the matrix P*(z) has the following block form:⁵⁹

P*(z) = ‖ (λ₁E₁ + H₁)/z   B₁₂ z^{λ₁−λ₂−1}   ...   B₁ᵤ z^{λ₁−λᵤ−1} ‖
        ‖ 0               (λ₂E₂ + H₂)/z     ...   B₂ᵤ z^{λ₂−λᵤ−1} ‖
        ‖ .................................................... ‖
        ‖ 0               0                 ...   (λᵤEᵤ + Hᵤ)/z  ‖,     (107)

where

Bᵢₖ = 0 if λᵢ − λₖ is not a positive integer,   Bᵢₖ = P⁽λᵢ−λₖ−1⁾*ᵢₖ if λᵢ − λₖ is a positive integer.

All the matrices Bᵢₖ (i, k = 1, 2, ..., u; i < k) are of regular lower triangular form.

As in the preceding case, we denote by mᵢ the integral part of Re λᵢ and we set

mᵢ = [Re(λᵢ)]     (i = 1, 2, ..., u),     (108)

λᵢ = mᵢ + λ̃ᵢ     (i = 1, 2, ..., u).     (108′)

Then in the expression (107) for P*(z) we may again replace the differences λᵢ − λₖ everywhere by mᵢ − mₖ. If we introduce the diagonal matrix M with integer elements and the upper triangular matrix U by means of the equations⁶⁰

M = {m₁E₁, m₂E₂, ..., mᵤEᵤ},

U = ‖ λ̃₁E₁ + H₁   B₁₂          ...   B₁ᵤ        ‖
    ‖ 0            λ̃₂E₂ + H₂   ...   B₂ᵤ        ‖
    ‖ ......................................... ‖
    ‖ 0            0            ...   λ̃ᵤEᵤ + Hᵤ ‖,     (109)

then we easily obtain, starting from (107), the following representation of P*(z):

P*(z) = z^M (U/z) z^{−M} + M/z = D_z(z^M z^U).

⁵⁹ The dimensions of the square matrices Eᵢ, Hᵢ and of the rectangular matrices Bᵢₖ are determined by the dimensions of the diagonal blocks in the Jordan matrix P₋₁, i.e., by the degrees of the elementary divisors of P₋₁.

⁶⁰ Here the splitting into blocks corresponds to that of P₋₁ and P*(z).


Hence it follows that the solution of (85) can be given in the form

Y = z^M z^U,

and the solution of (81) can be represented as follows:

X = A(z) z^M z^U.     (110)

Here A(z) is the matrix series (84), M is a constant diagonal matrix whose elements are integers, and U is a constant triangular matrix. The matrices M and U are defined by (108), (108′), and (109).⁶¹

3. We now proceed to prove the convergence of the series (84). We shall use a lemma which is of independent interest.

LEMMA: If the series⁶²

x = a₀ + a₁z + a₂z² + ···     (111)

formally satisfies the system

dx/dz = P(z) x     (112)

for which z = 0 is a regular singularity, then (111) converges in every neighborhood of z = 0 in which the expansion of the coefficient matrix P(z) in the series (82) converges.

Proof. Let us suppose that

P(z) = P₋₁/z + Σ_{m=0}^{∞} Pₘ zᵐ,

where the series Σ_{m=0}^{∞} Pₘ zᵐ converges for |z| < r. Then there exist positive constants p₋₁ and p such that⁶³

mod P₋₁ ≤ p₋₁ I,   mod Pₘ ≤ (p/rᵐ) I     (m = 0, 1, 2, ...).     (113)

Substituting the series (111) for x in (112) and comparing the coefficients of like powers on both sides of (112), we obtain an infinite system of (column) vector equations:

⁶¹ See footnote 53.

⁶² Here x = (x₁, x₂, ..., xₙ) is a column of unknown functions; a₀, a₁, a₂, ... are constant columns; P(z) is a square coefficient matrix.

⁶³ For the definition of the modulus of a matrix, see p. 128.


P₋₁ a₀ = 0,
(E − P₋₁) a₁ = P₀ a₀,
(2E − P₋₁) a₂ = P₀ a₁ + P₁ a₀,
............................
(mE − P₋₁) aₘ = P₀ aₘ₋₁ + P₁ aₘ₋₂ + ··· + Pₘ₋₁ a₀,
............................     (114)

It is sufficient to prove that every remainder of the series (111)

x⁽ᵏ⁾ = aₖ zᵏ + aₖ₊₁ z^{k+1} + ···     (115)

converges in a neighborhood of z = 0. The number k is subject to the inequality

k > n p₋₁.

Then k exceeds the moduli of all the characteristic values of P₋₁,⁶⁴ so that for m ≥ k we have |mE − P₋₁| ≠ 0 and

(mE − P₋₁)⁻¹ = (1/m) (E + (1/m) P₋₁ + (1/m²) P₋₁² + ···)     (116)
(m = k, k+1, ...).

In the last part of this equation there is a convergent matrix series. With the help of this series and by using (114), we can express all the coefficients of (115) in terms of a₀, a₁, ..., aₖ₋₁ by means of the recurrence relations

aₘ = (1/m) (E + (1/m) P₋₁ + (1/m²) P₋₁² + ···) (fₘ₋₁ + P₀ aₘ₋₁ + ··· + Pₘ₋ₖ₋₁ aₖ)     (117)
(m = k, k+1, ...),

where

fₘ₋₁ = Pₘ₋ₖ aₖ₋₁ + ··· + Pₘ₋₁ a₀     (m = k, k+1, ...).     (118)
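The elementary bound |λ₀| ≤ n · max |aᵢₖ|, which justifies the choice k > np₋₁ (it is proved in footnote 64), can be spot-checked numerically; the check below is our own illustration with a random matrix:

```python
import numpy as np

# Spot-check of the inequality |λ| <= n · max |a_ik| for a random complex matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
bound = A.shape[0] * np.abs(A).max()
print(bool(np.all(np.abs(np.linalg.eigvals(A)) <= bound)))   # True
```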

Note that the series (115) formally satisfies the differential equation

⁶⁴ If λ₀ is a characteristic value of A = ‖aᵢₖ‖₁ⁿ, then |λ₀| ≤ n max |aᵢₖ|. For let Ax = λ₀x, where x = (x₁, x₂, ..., xₙ) ≠ 0. Then

λ₀ xᵢ = Σ_{k=1}^{n} aᵢₖ xₖ     (i = 1, 2, ..., n).

Let |xⱼ| = max {|x₁|, |x₂|, ..., |xₙ|}. Then

|λ₀| |xⱼ| ≤ Σ_{k=1}^{n} |aⱼₖ| |xₖ| ≤ |xⱼ| · n max_{i,k} |aᵢₖ|.

Dividing through by |xⱼ|, we obtain the required inequality.


dx⁽ᵏ⁾/dz = P(z) x⁽ᵏ⁾ + f(z),     (119)

where

f(z) = Σ_{m=k−1}^{∞} fₘ zᵐ = P(z)(a₀ + a₁z + ··· + aₖ₋₁z^{k−1}) − a₁ − 2a₂z − ··· − (k−1)aₖ₋₁z^{k−2}.     (120)

From (120) it follows that the series

Σ_{m=k−1}^{∞} fₘ zᵐ

converges for |z| < r; hence there exists an integer N > 0 such that⁶⁵

mod fₘ ≤ ‖N/rᵐ‖     (m = k−1, k, ...).     (121)

From the form of the recurrence relations (117) it follows that when the matrices P₋₁, P_q, fₘ₋₁ in these relations are replaced by the majorant matrices p₋₁I, (p/r^q)I, ‖N/r^{m−1}‖ and the columns aₘ by ‖αₘ‖ (m = k, k+1, ...; q = 0, 1, 2, ...),⁶⁶ then we obtain relations that determine upper bounds ‖αₘ‖ for mod aₘ:

mod aₘ ≤ ‖αₘ‖.     (122)

Therefore the series

ξ⁽ᵏ⁾ = αₖ zᵏ + αₖ₊₁ z^{k+1} + ···     (123)

after term-by-term multiplication with the column ‖1‖ becomes a majorant series for (115).

By replacing in (119) the matrix coefficients P₋₁, P_q, fₘ of the series

P(z) = P₋₁/z + Σ_{q=0}^{∞} P_q z^q,   f(z) = Σ_{m=k−1}^{∞} fₘ zᵐ

by the corresponding majorant matrices p₋₁I, (p/r^q)I, ‖N/rᵐ‖, and x⁽ᵏ⁾ by ξ⁽ᵏ⁾, we obtain a differential equation for ξ⁽ᵏ⁾:

dξ⁽ᵏ⁾/dz = (n p₋₁/z + n p/(1 − z r⁻¹)) ξ⁽ᵏ⁾ + N (z r⁻¹)^{k−1}/(1 − z r⁻¹).     (124)

⁶⁵ Here ‖N/rᵐ‖ denotes the column in which all the elements are equal to one and the same number N/rᵐ.

⁶⁶ Here ‖αₘ‖ denotes the column (αₘ, αₘ, ..., αₘ) (αₘ is a constant, m = k, k+1, ...).


This linear differential equation has the particular solution

ξ⁽ᵏ⁾ = (N/r^{k−1}) z^{np₋₁} (1 − z r⁻¹)^{−npr} ∫₀ᶻ s^{k−np₋₁−1} (1 − s r⁻¹)^{npr−1} ds,     (125)

which is regular for z = 0 and can be expanded in a neighborhood of this point in the power series (123), which is convergent for |z| < r.

From the convergence of the majorant series (123) it follows that the series (115) is convergent for |z| < r, and the lemma is proved.

Note 1. This proof enables us to determine all the solutions of the differential system (112) that are regular at the singular point, provided such solutions exist.

For the existence of regular solutions (not identically zero) it is necessary and sufficient that the residue matrix P₋₁ have a non-negative integral characteristic value. If s is the greatest integral characteristic value, then columns a₀, a₁, ..., aₛ that do not all vanish can be determined from the first s + 1 of the equations (114); for the determinant of the corresponding linear homogeneous system is zero:

Δ = |P₋₁| · |E − P₋₁| ··· |sE − P₋₁| = 0.

From the remaining equations of (114) the columns aₛ₊₁, aₛ₊₂, ... can be expressed uniquely in terms of a₀, a₁, ..., aₛ. The series (111) so obtained converges, by the lemma. Thus, the linearly independent solutions of the first s + 1 equations (114) determine all the linearly independent solutions of the system (112) that are regular at the singular point z = 0.

If z = 0 is a singular point, then a regular solution (111) at that point (if such a solution exists) is not uniquely determined when the initial value a₀ is given. However, a solution that is regular at a regular singularity is uniquely determined when a₀, a₁, ..., aₛ are given, i.e., when the initial value at z = 0 of this solution and the initial values of its first s derivatives are given (s is the largest non-negative integral characteristic value of the residue matrix P₋₁).
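A minimal 1×1 instance of Note 1 (our own illustration, not from the text): for dx/dz = (1/z + b)x the residue 'matrix' is P₋₁ = (1), so s = 1; the first two equations of (114) force a₀ = 0 and leave a₁ free, and the remaining coefficients follow from the recurrence (m − 1)aₘ = b aₘ₋₁. The series sums to the regular solution x = z e^{bz}:

```python
import numpy as np

# dx/dz = (1/z + b) x, i.e. P_{-1} = (1), P_0 = (b), P_m = 0 for m >= 1.
# Equations (114): a0 = 0; (1 - 1) a1 = b a0 leaves a1 free (take a1 = 1);
# then (m - 1) a_m = b a_{m-1} for m >= 2.
b = 0.7
coeffs = [0.0, 1.0]
for m in range(2, 40):
    coeffs.append(b * coeffs[-1] / (m - 1))

z = 0.5
series_value = sum(a * z**m for m, a in enumerate(coeffs))
print(np.isclose(series_value, z * np.exp(b * z)))   # True: the solution is z e^{bz}
```

Indeed d(z e^{bz})/dz = e^{bz} + bz e^{bz} = (1/z + b) · z e^{bz}, confirming the recurrence reproduces the regular solution determined by a₀ = 0, a₁ = 1.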

Note 2. The proof of the lemma remains valid for P₋₁ = 0. In this case an arbitrary positive number can be chosen for p₋₁ in the proof of the lemma. For P₋₁ = 0 the lemma states the well-known proposition on the existence of a regular solution in a neighborhood of a regular point of the system. In this case the solution is uniquely determined when the initial value a₀ is given.

4. Suppose given the system

dX/dz = P(z) X,     (126)


where

P(z) = P₋₁/z + Σ_{m=0}^{∞} Pₘ zᵐ

and the series on the right-hand side converges for |z| < r. Suppose, further, that by setting

X = A(z) Y     (127)

and substituting for A(z) the series

A(z) = A₀ + A₁z + A₂z² + ···,     (128)

we obtain after formal transformations:

dY/dz = P*(z) Y,     (129)

where

P*(z) = P*₋₁/z + Σ_{m=0}^{∞} P*ₘ zᵐ,

and that here, as in the expression for P(z), the series on the right-hand side converges for |z| < r.

We shall show that the series (128) also converges in the neighborhood |z| < r of z = 0.

Indeed, it follows from (126), (127), and (129) that the series (128) formally satisfies the following matrix differential equation:

dA/dz = P(z) A − A P*(z).     (130)

We shall regard A as a vector (column) in the space of all matrices of order n, i.e., a space of dimension n². If in this space a linear operator P̂(z) on A, depending analytically on a parameter z, is defined by the equation

P̂(z)[A] = P(z) A − A P*(z),     (131)

then the differential equation (130) can be written in the form

dA/dz = P̂(z)[A].     (132)

The right-hand side of this equation can be considered as the product of the matrix P̂(z) of order n² and the column A of n² elements. From (131) it is clear that z = 0 is a regular singularity of the system (132). The series (128) formally satisfies this system. Therefore, by applying the lemma, we conclude that (128) converges in the neighborhood |z| < r of z = 0.


In particular, the series for A(z) in (110) also converges. Thus, we have proved the following theorem:

THEOREM 2: Every system

dX/dz = P(z) X     (133)

with a regular singularity at z = 0,

P(z) = P₋₁/z + Σ_{m=0}^{∞} Pₘ zᵐ,

has a solution of the form

X = A(z) z^M z^U,     (134)

where A(z) is a matrix function that is regular for z = 0 and becomes the unit matrix E at that point, and where M and U are constant matrices, M being of simple structure and having integral characteristic values, whereas the difference between any two distinct characteristic values of U is not an integer.

If the matrix P₋₁ is reduced to the Jordan normal form by means of a non-singular matrix T,

P₋₁ = T {λ₁E₁ + H₁, λ₂E₂ + H₂, ..., λₛEₛ + Hₛ} T⁻¹     (135)
(Re(λ₁) ≥ Re(λ₂) ≥ ··· ≥ Re(λₛ)),

then M and U can be chosen in the form

M = T {m₁E₁, m₂E₂, ..., mₛEₛ} T⁻¹,     (136)

U = T ‖ λ̃₁E₁ + H₁   B₁₂          ...   B₁ₛ        ‖
      ‖ 0            λ̃₂E₂ + H₂   ...   B₂ₛ        ‖
      ‖ ......................................... ‖
      ‖ 0            0            ...   λ̃ₛEₛ + Hₛ ‖ T⁻¹,     (137)

where

mᵢ = [Re(λᵢ)],   λ̃ᵢ = λᵢ − mᵢ     (i = 1, 2, ..., s).     (138)

The Bᵢₖ are regular lower triangular matrices (i, k = 1, 2, ..., s) and Bᵢₖ = 0 if λᵢ − λₖ is not a positive integer (i, k = 1, 2, ..., s).

In the particular case where none of the differences λᵢ − λₖ (i, k = 1, 2, ..., s) is a positive integer, we can set in (134) M = 0 and U = P₋₁; i.e., in this case the solution can be represented in the form

X = A(z) z^{P₋₁}.     (139)
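The special case (139) can be verified numerically end to end; the sketch below is our own illustration with hypothetical data (the helper phi again denotes the solution operator of the matrix equation, implemented by vectorization). We build A₁, A₂, ... from the equations (87) with all P*ₘ = 0, form X = A(z) z^{P₋₁}, and check dX/dz = P(z)X at a sample point:

```python
import numpy as np

def phi(P, k, T):
    """Solve P U - U (P + k E) = T via the vectorized (Kronecker) form."""
    n = P.shape[0]
    M = np.kron(np.eye(n), P) - np.kron((P + k * np.eye(n)).T, np.eye(n))
    return np.linalg.solve(M, T.flatten(order='F')).reshape((n, n), order='F')

# Hypothetical system: regular singularity at z = 0, no integer differences
# among the characteristic values 0.3 and -0.2 of P_{-1}.
P_m1 = np.diag([0.3, -0.2])
P0 = np.array([[0.5, 1.0], [-1.0, 0.25]])
P = lambda z: P_m1 / z + P0

# Equations (87) with P*_m = 0 and P_m = 0 for m >= 1:
# P_{-1} A_m - A_m (P_{-1} + m E) = -P_0 A_{m-1}.
A = [np.eye(2)]
for m in range(1, 25):
    A.append(phi(P_m1, m, -P0 @ A[-1]))

def X(z):
    Az = sum(Am * z**m for m, Am in enumerate(A))
    return Az @ np.diag([z**0.3, z**(-0.2)])       # A(z) z^{P_{-1}}, P_{-1} diagonal

# Check dX/dz = P(z) X at a sample point by a central difference.
z, h = 0.4, 1e-6
dX = (X(z + h) - X(z - h)) / (2 * h)
print(np.allclose(dX, P(z) @ X(z), atol=1e-4))     # True
```

The truncation at 25 terms is harmless here because the coefficients Aₘ decay factorially, as the recurrence divides by m at each step.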


Note 1. We wish to point out that in this section we have developed an algorithm to determine the coefficients of the series A(z) = Σ_{m=0}^{∞} Aₘ zᵐ (A₀ = E) in terms of the coefficients Pₘ of the series for P(z). Moreover, the theorem also determines the integral substitution V by which the solution (134) is multiplied when a circuit is made once in the positive direction around the singular point z = 0:

V = e^{2πiU}.

Note 2. From the enunciation of the theorem it follows that

\[ B_{ik} = 0 \quad \text{for} \quad \tilde\lambda_i \ne \tilde\lambda_k \qquad (i, k = 1, 2, \ldots, s). \]

Therefore the matrices

\[ \tilde\Lambda = T \{\tilde\lambda_1 E_1,\; \tilde\lambda_2 E_2,\; \ldots,\; \tilde\lambda_s E_s\}\, T^{-1}
\quad\text{and}\quad
\tilde U = U - \tilde\Lambda = T \begin{pmatrix}
H_1 & B_{12} & \cdots & B_{1s} \\
0 & H_2 & \cdots & B_{2s} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & H_s
\end{pmatrix} T^{-1} \tag{140} \]

are permutable:

\[ \tilde\Lambda \tilde U = \tilde U \tilde\Lambda. \]

Hence

\[ z^{M} z^{U} = z^{M} z^{\tilde\Lambda + \tilde U} = z^{M} z^{\tilde\Lambda} z^{\tilde U} = z^{\Lambda} z^{\tilde U}, \tag{141} \]

where

\[ \Lambda = M + \tilde\Lambda = T \{\lambda_1 E_1,\; \lambda_2 E_2,\; \ldots,\; \lambda_s E_s\}\, T^{-1} \tag{142} \]

and where λ_1, λ_2, …, λ_s are all the characteristic values of P_{-1} arranged in the order Re λ_1 ≥ Re λ_2 ≥ ⋯ ≥ Re λ_s.

On the other hand,

\[ z^{\tilde U} = h(\tilde U), \]

where h(λ) is the Lagrange-Sylvester interpolation polynomial for f(λ) = z^λ.

Since all the characteristic values of Ũ are zero, h(λ) depends linearly on f(0), f′(0), …, f^{(g−1)}(0), i.e., on 1, ln z, …, (ln z)^{g−1} (g is the least exponent for which Ũ^g = 0). Therefore

\[ h(\lambda) = \sum_{j=0}^{g-1} h_j(\lambda)\, (\ln z)^j \]

and

\[ z^{\tilde U} = \sum_{j=0}^{g-1} h_j(\tilde U)\, (\ln z)^j
= T \begin{pmatrix}
1 & q_{12} & \cdots & q_{1n} \\
0 & 1 & \cdots & q_{2n} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} T^{-1}, \tag{143} \]


where q_{ij} (i, j = 1, 2, …, n; i < j) are polynomials in ln z of degree less than g.
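As an illustration (ours, not from the text) of how such entries arise: for a single nilpotent Jordan block Ũ = H of order n (ones in the first superdiagonal, zeros elsewhere), H^g = 0 with g = n, and the interpolation polynomial reduces to the truncated exponential series:

```latex
z^{H} = e^{H\ln z}
      = E + H\,\ln z + \frac{H^{2}}{2!}\,(\ln z)^{2} + \cdots
        + \frac{H^{n-1}}{(n-1)!}\,(\ln z)^{n-1};
\qquad\text{e.g. for } n = 3:\qquad
z^{H} =
\begin{pmatrix}
1 & \ln z & \tfrac{1}{2}(\ln z)^{2}\\
0 & 1     & \ln z\\
0 & 0     & 1
\end{pmatrix}.
```

The off-diagonal entries are indeed polynomials in ln z of degree less than g.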

By (134), (141), (142), and (143) a particular solution of (126) can be chosen in the form

\[ X = A(z)
\begin{pmatrix}
z^{\lambda_1} & 0 & \cdots & 0 \\
0 & z^{\lambda_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & z^{\lambda_n}
\end{pmatrix}
\begin{pmatrix}
1 & q_{12} & \cdots & q_{1n} \\
0 & 1 & \cdots & q_{2n} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}. \tag{144} \]

Here λ_1, λ_2, …, λ_n are the characteristic values of P_{-1} arranged in the order Re λ_1 ≥ Re λ_2 ≥ ⋯ ≥ Re λ_n, and q_{ij} (i, j = 1, 2, …, n; i < j) are polynomials in ln z of degree not higher than g − 1, where g is the maximal number of characteristic values λ_i that differ from each other by an integer; A(z) is a matrix function, regular at z = 0, and A(0) = T (|T| ≠ 0). If P_{-1} has the Jordan form, then T = E.

§ 11. Reducible Analytic Systems

1. As an application of the theorem of the preceding section we shall investigate in what cases the system

\[ \frac{dX}{dt} = Q(t)\,X, \tag{145} \]

where

\[ Q(t) = \sum_{q=1}^{\infty} Q_q t^{-q} \tag{146} \]

is a convergent series for t > t_0, is reducible (in the sense of Lyapunov), i.e., in what cases the system has a solution of the form

\[ X = L(t)\, e^{Bt}, \tag{147} \]

where L(t) is a Lyapunov matrix (i.e., L(t) satisfies the conditions 1.-3. on p. 117) and B is a constant matrix.⁶⁷ Here X and Q are matrices with complex elements and t is a real variable.

We make the transformation z = 1/t.

⁶⁷ If the equation (147) holds, then the Lyapunov transformation X = L(t)Y carries the system (145) into the system dY/dt = BY.


Then the system (145) assumes the form

\[ \frac{dX}{dz} = P(z)\,X, \tag{148} \]

where

\[ P(z) = -z^{-2}\, Q\!\left(\frac{1}{z}\right) = -\frac{Q_1}{z} - \sum_{\nu=0}^{\infty} Q_{\nu+2} z^{\nu}. \tag{149} \]

The series on the right-hand side of the expression for P(z) converges for |z| < 1/t_0. Two cases can arise:

1) Q_1 = 0. In that case z = 0 is not a singular point of the system (148). The system has a solution that is regular and normalized at z = 0. This solution is given by a convergent power series

\[ X(z) = E + X_1 z + X_2 z^2 + \cdots \qquad (|z| < 1/t_0). \]

Setting

\[ L(t) = X\!\left(\frac{1}{t}\right), \qquad B = 0, \]

we obtain the required representation (147). The system is reducible.

2) Q_1 ≠ 0. In that case the system (148) has a regular singularity at z = 0.

Without loss of generality we may assume that the residue matrix P_{-1} = −Q_1 is reduced to the Jordan form, in which the diagonal elements λ_1, λ_2, …, λ_n are arranged in the order Re λ_1 ≥ Re λ_2 ≥ ⋯ ≥ Re λ_n.

Then in (144) T = E, and therefore the system (148) has the solution

\[ X = A(z)
\begin{pmatrix}
z^{\lambda_1} & 0 & \cdots & 0 \\
0 & z^{\lambda_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & z^{\lambda_n}
\end{pmatrix}
\begin{pmatrix}
1 & q_{12} & \cdots & q_{1n} \\
0 & 1 & \cdots & q_{2n} \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}, \]

where the function A(z) is regular for z = 0 and assumes at this point the value E, and where q_{ik} (i, k = 1, 2, …, n; i < k) are polynomials in ln z. When we replace z by 1/t, we have:

\[ X = A\!\left(\frac{1}{t}\right)
\begin{pmatrix}
t^{-\lambda_1} & 0 & \cdots & 0 \\
0 & t^{-\lambda_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & t^{-\lambda_n}
\end{pmatrix}
\begin{pmatrix}
1 & q_{12}(\ln\frac{1}{t}) & \cdots & q_{1n}(\ln\frac{1}{t}) \\
0 & 1 & \cdots & q_{2n}(\ln\frac{1}{t}) \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}. \tag{150} \]


Since X = A(1/t)Y is a Lyapunov transformation, the system (145) is reducible to a system with constant coefficients if and only if the product

\[ L_1(t) =
\begin{pmatrix}
t^{-\lambda_1} & 0 & \cdots & 0 \\
0 & t^{-\lambda_2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & t^{-\lambda_n}
\end{pmatrix}
\begin{pmatrix}
1 & q_{12}(\ln\frac{1}{t}) & \cdots & q_{1n}(\ln\frac{1}{t}) \\
0 & 1 & \cdots & q_{2n}(\ln\frac{1}{t}) \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} e^{-Bt}, \tag{151} \]

where B is a constant matrix, is a Lyapunov matrix, i.e., when the matrices L_1(t), dL_1(t)/dt, and L_1^{-1}(t) are bounded. It follows from the theorem of Erugin (§ 4) that the matrix B can be assumed here to have real characteristic values.

Since L_1(t) and L_1^{-1}(t) are bounded for t > t_0, all the characteristic values of B must be zero. This follows from the expressions for e^{Bt} and e^{-Bt} obtained from (151). Moreover, all the numbers λ_1, λ_2, …, λ_n must be pure imaginary, because by (151) the fact that the elements of the last row of L_1(t) and of the first column of L_1^{-1}(t) are bounded implies that Re λ_n ≥ 0 and Re λ_1 ≤ 0.

But if all the characteristic values of P_{-1} are pure imaginary, then the difference between any two distinct characteristic values of P_{-1} cannot be an integer. Therefore the formula (139) holds:

\[ X = A(z)\, z^{P_{-1}} = A\!\left(\frac{1}{t}\right) t^{Q_1}, \]

and for the reducibility of the system it is necessary and sufficient that the matrix

\[ L_2(t) = t^{Q_1} e^{-Bt} \tag{152} \]

together with its inverse be bounded for t > t_0.

together with its inverse be bounded for t > t,,.Since all the characteristic values of B must be zero, the minimal poly-

nomial of B is of the form Ad. We denote by

+V(A)= (A(A-4uz)`' ... ()CuA-IA.(1f',uk for i k)

the minimal polynomial of Q1. As Q1 =- P_1, the numbers µ1, µ2, ... , µdiffer only in sign from the corresponding numbers A; and are therefore allpure imaginary. Then (see the formulas (12), (13) on p. 116)


\[ t^{Q_1} = \sum_{k=1}^{u} \left[ U_{k0} + U_{k1} \ln t + \cdots + U_{k, c_k - 1} (\ln t)^{c_k - 1} \right] t^{\mu_k}, \qquad
e^{Bt} = V_0 + V_1 t + \cdots + V_{d-1} t^{d-1}. \tag{153} \]

Substituting these expressions in the equation

\[ L_2(t)\, e^{Bt} = t^{Q_1}, \tag{154} \]

we obtain

\[ \left[ L_2(t) V_{d-1} + (\ast) \right] t^{d-1} = Z_0(t)\, (\ln t)^{c-1}, \tag{155} \]

where c is the greatest of the numbers c_1, c_2, …, c_u, (∗) denotes a matrix that tends to zero for t → ∞, and Z_0(t) is a bounded matrix for t > t_0.

Since the matrices on both sides of (155) must be of equal order of magnitude for t → ∞, we have

\[ d = c = 1, \qquad B = 0, \]

and the matrix Q_1 has simple elementary divisors.

Conversely, if Q_1 has simple elementary divisors and pure imaginary characteristic values μ_1, μ_2, …, μ_n, then

\[ X = A(z)\, z^{-Q_1} = A(z)\, \| z^{-\mu_i} \delta_{ik} \|_1^n \]

is a solution of (148). Setting z = 1/t, we find:

\[ X = A\!\left(\frac{1}{t}\right) \| t^{\mu_i} \delta_{ik} \|_1^n. \]

The function X(t) as well as dX(t)/dt and the inverse matrix X^{-1}(t) are bounded for t > t_0. Therefore the system is reducible (B = 0). Thus we have proved the following theorem:⁶⁸

THEOREM 3: The system

\[ \frac{dX}{dt} = Q(t)\,X, \]

where the matrix Q(t) can be represented in a series convergent for t > t_0,

\[ Q(t) = \sum_{q=1}^{\infty} Q_q t^{-q}, \]

is reducible if and only if all the elementary divisors of the residue matrix Q_1 are simple and all its characteristic values are pure imaginary.

⁶⁸ See Erugin [13]. The theorem is proved there for the case where Q_1 does not have distinct characteristic values that differ from each other by an integer.


§ 12. Analytic Functions of Several Matrices and their Application to the Investigation of Differential Systems. The Papers of I. A. Lappo-Danilevskii

1. An analytic function of m matrices X_1, X_2, …, X_m of order n can be given by a series

\[ F(X_1, X_2, \ldots, X_m) = a_0 + \sum_{\nu=1}^{\infty} \sum_{j_1, j_2, \ldots, j_\nu = 1}^{m} a_{j_1 j_2 \ldots j_\nu} X_{j_1} X_{j_2} \cdots X_{j_\nu} \tag{156} \]

convergent for all matrices X_j of order n that satisfy the inequality

\[ \operatorname{mod} X_j < R_j \qquad (j = 1, 2, \ldots, m). \tag{157} \]

Here the coefficients a_0, a_{j_1 j_2 … j_ν} (j_1, j_2, …, j_ν = 1, 2, …, m; ν = 1, 2, 3, …) are complex numbers, R_j (j = 1, 2, …, m) are constant matrices of order n with positive elements, and X_j (j = 1, 2, …, m) are permutable matrices of the same order with complex elements.

The theory of analytic functions of several matrices was developed by I. A. Lappo-Danilevskii. He used this theory as a basis for fundamental investigations on systems of linear differential equations with rational coefficients.

A system with rational coefficients can always be reduced to the form

\[ \frac{dX}{dz} = \sum_{j=1}^{m} \left[ \frac{U_{j0}}{(z - a_j)^{s_j}} + \frac{U_{j1}}{(z - a_j)^{s_j - 1}} + \cdots + \frac{U_{j, s_j - 1}}{z - a_j} \right] X \tag{158} \]

after a suitable transformation of the independent variable, where the U_{jk} are constant matrices of order n, the a_j are complex numbers, and the s_j are positive integers (k = 0, 1, …, s_j − 1; j = 1, 2, …, m).⁶⁹

We shall illustrate some of Lappo-Danilevskii's results in the special case of the so-called regular systems. The latter are characterized by the condition s_1 = s_2 = ⋯ = s_m = 1 and can be written in the form

\[ \frac{dX}{dz} = \sum_{j=1}^{m} \frac{U_j}{z - a_j}\, X. \tag{159} \]

⁶⁹ In the system (158) all the coefficients are regular rational fractions in z. Arbitrary rational coefficients can be reduced to this form by carrying a finite point z = c that is regular (for all coefficients) into z = ∞ by means of a fractional linear transformation on z.


Following Lappo-Danilevskii, we introduce special analytic functions, namely hyperlogarithms, which are defined by the following recurrence relations:

\[ l_b(z; a_{j_1}) = \int_b^z \frac{dz}{z - a_{j_1}}, \qquad
l_b(z; a_{j_1}, a_{j_2}, \ldots, a_{j_\nu}) = \int_b^z l_b(z; a_{j_2}, \ldots, a_{j_\nu}) \frac{dz}{z - a_{j_1}}. \]

Regarding a_1, a_2, …, a_m, ∞ as branch points of logarithmic type, we construct the corresponding Riemann surface S(a_1, a_2, …, a_m; ∞). Every hyperlogarithm is a single-valued function on this surface. On the other hand, the matricant Ω_b^z of the system (159) (i.e., the solution normalized at z = b) after analytic continuation can also be regarded as a single-valued function on S(a_1, a_2, …, a_m; ∞); here b can be chosen as an arbitrary finite point on S other than a_1, a_2, …, a_m.

For the normalized solution Ω_b^z, Lappo-Danilevskii gives an explicit expression in terms of the defining matrices U_1, U_2, …, U_m of (159) in the form of a series

\[ \Omega_b^z = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \ldots, j_\nu}^{(1, \ldots, m)} l_b(z; a_{j_1}, a_{j_2}, \ldots, a_{j_\nu})\, U_{j_1} U_{j_2} \cdots U_{j_\nu}. \tag{160} \]

This expansion converges uniformly in z for arbitrary U_1, U_2, …, U_m and represents Ω_b^z in any finite domain on S(a_1, a_2, …, a_m; ∞) provided only that the domain does not contain a_1, a_2, …, a_m in the interior or on the boundary.

If the series (156) converges for arbitrary matrices X_1, X_2, …, X_m, then the corresponding function F(X_1, X_2, …, X_m) is called entire. Ω_b^z is an entire function of the matrices U_1, U_2, …, U_m.

If in (160) we let the argument z go around the point a_j once in the positive direction along a contour that does not enclose other points a_i (for i ≠ j), then we obtain the expression for the integral substitution V_j corresponding to the point z = a_j:

\[ V_j = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \ldots, j_\nu}^{(1, \ldots, m)} p_j(b; a_{j_1}, a_{j_2}, \ldots, a_{j_\nu})\, U_{j_1} U_{j_2} \cdots U_{j_\nu} \qquad (j = 1, 2, \ldots, m), \tag{161} \]

where in a readily understandable notation

\[ p_j(b; a_{j_1}, a_{j_2}, \ldots, a_{j_\nu}) = \oint_{(a_j)} l_b(z; a_{j_2}, \ldots, a_{j_\nu}) \frac{dz}{z - a_{j_1}} \qquad (j = 1, 2, \ldots, m;\ \nu = 1, 2, 3, \ldots), \]

the integral being taken along a closed path that begins and ends at b and goes once in the positive direction around the point a_j.

The series (161), like (160), is an entire function of U_1, U_2, …, U_m.

2. Generalizing the theory of analytic functions to the case⁷⁰ of a countably infinite set of matrix arguments X_1, X_2, X_3, …, Lappo-Danilevskii has used it to study the behavior of a solution of a system in a neighborhood of an irregular singularity.⁷¹ We quote the basic result.

The normalized solution Ω_b^z of the system

\[ \frac{dX}{dz} = \Bigl( \sum_{j=1}^{\infty} P_j z^{-j} \Bigr) X, \]

where the series on the right-hand side converges for 0 < |z| < r (r > 1),⁷² can be represented by a series

\[ \Omega_b^z = E + \sum_{\nu=1}^{\infty} \sum_{j_1, \ldots, j_\nu = 1}^{\infty} P_{j_1} \cdots P_{j_\nu}\, w_{j_1 \ldots j_\nu}(z, b). \tag{162} \]

Here each scalar function w_{j_1 … j_ν}(z, b) is a finite linear combination of products of powers of z and b with powers of ln z and ln b, the scalar coefficients a^{(λ)}_{j_1, …, j_ν} and a^{(μ)}_{j_1, …, j_ν} of these combinations being defined by special formulas. The series (162) converges for arbitrary matrices P_1, P_2, … in an annulus

\[ \varepsilon < |z| < r \]

(ε is any positive number less than r). The point b must also lie in this annulus (ε < |b| < r).

⁷⁰ See [29], Vol. I, Memoir 1.

⁷¹ See [29], Vol. I, Memoir 3. See also [252], [253], [254], [146], and [147].

⁷² The restriction r > 1 is not essential, since this condition can always be obtained by replacing z by az, where a is a suitably chosen positive number.

Page 182: THE THEORY OF MATRICES - 上海交通大学数学系math.sjtu.edu.cn/faculty/tyaglov/courses/linear algebra/The_book_ad… · as possible, assuming only that the reader is acquainted

§ 12. ANALYTIC FUNCTIONS OF SEVERAL MATRICES AND APPLICATIONS 171

Since in this book we cannot possibly describe the contents of the papers of Lappo-Danilevskii in sufficient detail, we have had to restrict ourselves to the above statements of a few basic results, and we must refer the reader to the appropriate literature.

All the papers of Lappo-Danilevskii that deal with differential equations have been published posthumously in three volumes [29]: Mémoires sur la théorie des systèmes des équations différentielles linéaires (1934-36). Moreover, his fundamental results are expounded in the papers [252], [253], [254] and the small book [28]. A concise exposition of some of the results can also be found in the book by V. I. Smirnov [56], Vol. III.


CHAPTER XV

THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS

§ 1. Introduction

In Chapter XIV, § 3 we explained that according to Lyapunov's theorem the zero solution of the system of differential equations

\[ \frac{dx_i}{dt} = \sum_{k=1}^{n} a_{ik} x_k + (**) \qquad (i = 1, 2, \ldots, n) \tag{1} \]

(a_{ik} (i, k = 1, 2, …, n) are constant coefficients) with arbitrary terms (∗∗) of the second and higher orders in x_1, x_2, …, x_n is stable if all the characteristic values of the matrix A = ‖a_{ik}‖_1^n, i.e., all the roots of the secular equation Δ(λ) = |λE − A| = 0, have negative real parts.

Therefore the task of establishing necessary and sufficient conditions under which all the roots of a given algebraic equation lie in the left half-plane is of great significance in a number of applied fields in which the stability of mechanical and electrical systems is investigated.

The importance of this algebraic task was clear to the founders of the theory of governors, the British physicist J. C. Maxwell and the Russian scientific research engineer I. A. Vyshnegradskii, who, in their papers on governors,¹ established and extensively applied the above-mentioned algebraic conditions for equations of a degree not exceeding three.

In 1868 Maxwell proposed the mathematical problem of discovering corresponding conditions for algebraic equations of arbitrary degree. Actually this problem had already been solved in essence by the French mathematician Hermite in a paper published in 1856. In this paper he had established a close connection between the number of roots of a complex polynomial f(z) in an arbitrary half-plane (and even inside an arbitrary triangle) and the signature of a certain quadratic form. But Hermite's

¹ J. C. Maxwell, 'On governors,' Proc. Roy. Soc. London, vol. 10 (1868); I. A. Vyshnegradskii, 'On governors with direct action' (1876). These papers were reprinted in the survey 'Theory of automatic governors' (Izd. Akad. Nauk SSSR, 1949). See also the paper by A. A. Andronov and I. N. Voznesenskii, 'On the work of J. C. Maxwell, I. A. Vyshnegradskii, and A. Stodola in the theory of governors of machines.'


results had not been carried to a stage at which they could be used by specialists working in applied fields, and therefore his paper did not receive due recognition.

In 1875 the British applied mathematician Routh [47], [48], using Sturm's theorem and the theory of Cauchy indices, set up an algorithm to determine the number k of roots of a real polynomial in the right half-plane (Re z > 0). In the particular case k = 0 this algorithm then gives a criterion for stability.

At the end of the 19th century, the Austrian research engineer A. Stodola, the founder of the theory of steam and gas turbines, unaware of Routh's paper, again proposed the problem of finding conditions under which all the roots of an algebraic equation have negative real parts, and in 1895 A. Hurwitz [204], on the basis of Hermite's paper, gave another solution (independent of Routh's). The determinantal inequalities obtained by Hurwitz are known nowadays as the inequalities of Routh-Hurwitz.

However, even before Hurwitz' paper appeared, the founder of the modern theory of stability, A. M. Lyapunov, had proved in his celebrated dissertation ('The general problem of stability of motion,' Kharkov, 1892)² a theorem which yields necessary and sufficient conditions for all the roots of the characteristic equation of a real matrix A = ‖a_{ik}‖_1^n to have negative real parts. These conditions are made use of in a number of papers on the theory of governors.³

A new criterion of stability was set up in 1914 by the French mathematicians Liénard and Chipart [259]. Using special quadratic forms, these authors obtained a criterion of stability which has a definite advantage over the Routh-Hurwitz criterion (the number of determinantal inequalities in the Liénard-Chipart criterion is roughly half of that in the Routh-Hurwitz criterion).

The famous Russian mathematicians P. L. Chebyshev and A. A. Markov have proved two remarkable theorems on continued-fraction expansions of a special type. These theorems, as will be shown in § 16, have an immediate bearing on the Routh-Hurwitz problem.

The reader will see that in the sphere of problems we have outlined, the theory of quadratic forms (Vol. I, Chapter X) and, in particular, the theory of Hankel forms (Vol. I, Chapter X, § 10) forms an essential tool.

§ 2. Cauchy Indices

1. We begin with a discussion of the so-called Cauchy indices.

2 See [32], § 30.

³ See, for example, [10], [11].


DEFINITION 1: The Cauchy index of a real rational function R(x) between the limits a and b (notation: I_a^b R(x); a and b are real numbers or ±∞) is the difference between the number of jumps of R(x) from −∞ to +∞ and the number of jumps from +∞ to −∞ as the argument changes from a to b.⁴

According to this definition, if

\[ R(x) = \sum_{i=1}^{p} \frac{A_i}{x - \alpha_i} + R_1(x), \]

where A_i, α_i (i = 1, 2, …, p) are real numbers and R_1(x) is a rational function⁵ without real poles, then⁶

\[ I_{-\infty}^{+\infty} R(x) = \sum_{i=1}^{p} \operatorname{sgn} A_i \tag{2} \]

and, in general,

\[ I_a^b R(x) = \sum_{a < \alpha_i < b} \operatorname{sgn} A_i \qquad (a < b). \tag{2'} \]

In particular, if f(x) = a_0 (x − α_1) ⋯ (x − α_m) is a real polynomial (α_i ≠ α_k for i ≠ k; i, k = 1, 2, …, m) and if among its roots α_1, α_2, …, α_m only the first p are real, then

\[ \frac{f'(x)}{f(x)} = \sum_{i=1}^{m} \frac{1}{x - \alpha_i} = \sum_{i=1}^{p} \frac{1}{x - \alpha_i} + R_1(x), \tag{2''} \]

where R_1(x) is a real rational function without real poles.

Therefore, by (2′): The index

\[ I_a^b \frac{f'(x)}{f(x)} \qquad (a < b) \]

is equal to the number of distinct real roots of f(x) in the interval (a, b).

An arbitrary real rational function R(x) can always be represented in the form

\[ R(x) = \sum_{i=1}^{p} \left[ \frac{A_1^{(i)}}{x - \alpha_i} + \frac{A_2^{(i)}}{(x - \alpha_i)^2} + \cdots + \frac{A_{n_i}^{(i)}}{(x - \alpha_i)^{n_i}} \right] + R_1(x), \]

where all the α and A are real numbers (A_{n_i}^{(i)} ≠ 0; i = 1, 2, …, p) and R_1(x) has no real poles.

Then

⁴ In counting the number of jumps, the extreme values of x (the limits a and b) are not included.

⁵ The poles of a rational function are those values of the argument for which the function becomes infinite.

⁶ By sgn a (a a real number) we mean +1, −1, or 0 according as a > 0, a < 0, or a = 0.


\[ I_{-\infty}^{+\infty} R(x) = \sum_{n_i \ \text{odd}} \operatorname{sgn} A_{n_i}^{(i)} \tag{3} \]

and, in general,⁸

\[ I_a^b R(x) = \sum_{a < \alpha_i < b,\ n_i \ \text{odd}} \operatorname{sgn} A_{n_i}^{(i)} \qquad (a < b). \tag{3'} \]
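As a small illustration (ours, not the book's), formula (3′) turns the computation of the index into bookkeeping over the real poles: only poles of odd order n_i contribute, each with the sign of its leading coefficient A^{(i)}_{n_i}. A minimal Python sketch, in which the partial-fraction data of R(x) are assumed to be given:

```python
# Cauchy index from partial-fraction data via formula (3'): real poles of
# odd order contribute sgn of their leading coefficient; even-order poles
# drop out entirely.
def sgn(a):
    """+1, -1 or 0 according as a > 0, a < 0 or a = 0 (footnote 6)."""
    return (a > 0) - (a < 0)

def cauchy_index(poles, a=float("-inf"), b=float("inf")):
    """poles: iterable of (alpha_i, A_i, n_i), where A_i is the coefficient
    of 1/(x - alpha_i)^{n_i}, the highest-order term at the real pole alpha_i."""
    return sum(sgn(A) for alpha, A, n in poles if n % 2 == 1 and a < alpha < b)

# R(x) = 1/(x-1) - 1/(x+2) + 3/(x-5)^2:
# over (-oo, +oo) the two simple poles cancel and the double pole contributes 0;
# over (0, 2) only the simple pole at x = 1 counts.
print(cauchy_index([(1, 1, 1), (-2, -1, 1), (5, 3, 2)]))        # 0
print(cauchy_index([(1, 1, 1), (-2, -1, 1), (5, 3, 2)], 0, 2))  # 1
```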

2. One of the methods of computing the index I_a^b R(x) is based on the classical theorem of Sturm.

We consider a sequence of real polynomials

\[ f_1(x),\; f_2(x),\; \ldots,\; f_m(x) \tag{4} \]

that has the two following properties with respect to the interval (a, b):⁹

1. For every value x (a < x < b), if any f_k(x) vanishes, the two adjacent functions f_{k−1}(x) and f_{k+1}(x) have values different from zero and of opposite signs; i.e., for a < x < b it follows from f_k(x) = 0 that

\[ f_{k-1}(x)\, f_{k+1}(x) < 0. \]

2. The last function f_m(x) in (4) does not vanish in the interval (a, b); i.e., f_m(x) ≠ 0 for a < x < b.

Such a sequence (4) of polynomials is called a Sturm chain in the interval (a, b).

We denote by V(x) the number of variations of sign in (4) for a fixed value x.¹⁰ Then the value of V(x), as x varies from a to b, can only change when one of the functions in (4) passes through zero. But by 1., when one of the functions f_k(x) (k = 2, …, m − 1) passes through zero, the value of V(x) does not change. When f_1(x) passes through zero, then one variation of sign in (4) is lost or gained according as the ratio f_2(x)/f_1(x) goes from −∞ to +∞ or vice versa. Hence we have:

THEOREM 1 (Sturm): If f_1(x), f_2(x), …, f_m(x) is a Sturm chain in (a, b) and V(x) is the number of variations of sign in the chain, then

\[ I_a^b \frac{f_2(x)}{f_1(x)} = V(a) - V(b). \tag{5} \]

⁸ In (3) the sum is extended over all the values i for which the corresponding n_i is odd. In (3′) the sum is extended over all the i for which n_i is odd and a < α_i < b.

⁹ Here a may be −∞ and b may be +∞.

¹⁰ If a < x < b and f_1(x) ≠ 0, then by 1. in the determination of V(x) a zero value in (4) may be omitted or an arbitrary sign may be attributed to this value. If a is finite, then V(a) must be interpreted as V(a + ε), where ε is a positive number sufficiently small that in the half-closed interval (a, a + ε] none of the functions f_k(x) vanishes. In exactly the same way, if b is finite, V(b) is to be interpreted as V(b − ε), where the number ε is defined similarly.


Note. Let us multiply all the terms of a Sturm chain by one and the same arbitrary polynomial d(x). The chain of polynomials so obtained is called a generalized Sturm chain. Since the multiplication of all the terms of (4) by one and the same polynomial alters neither the left-hand side nor the right-hand side of (5), Sturm's theorem remains valid for generalized Sturm chains.

Note that if f(x) and g(x) are any two polynomials (where the degree of f(x) is not less than that of g(x)), then we can always construct a generalized Sturm chain (4) beginning with f_1(x) = f(x), f_2(x) = g(x) by means of the Euclidean algorithm.

For if we denote by −f_3(x) the remainder on dividing f_1(x) by f_2(x), by −f_4(x) the remainder on dividing f_2(x) by f_3(x), etc., then we have the chain of identities

\[ f_1(x) = q_1(x) f_2(x) - f_3(x), \]
\[ \cdots \cdots \cdots \]
\[ f_{k-1}(x) = q_{k-1}(x) f_k(x) - f_{k+1}(x), \tag{6} \]
\[ \cdots \cdots \cdots \]
\[ f_{m-1}(x) = q_{m-1}(x) f_m(x), \]

where the last remainder f_m(x) that is not identically zero is the greatest common divisor of f(x) and g(x) and also of all the functions of the sequence (4) so constructed. If f_m(x) ≠ 0 (a ≤ x ≤ b), then this sequence (4) satisfies the conditions 1., 2. by (6) and is a Sturm chain. If the polynomial f_m(x) has roots in the interval (a, b), then (4) is a generalized Sturm chain, because it becomes a Sturm chain when all the terms are divided by f_m(x).

From what we have shown it follows that the index of every rational function R(x) can be determined by Sturm's theorem. For this purpose it is sufficient to represent R(x) in the form Q(x) + g(x)/f(x), where Q(x), f(x), g(x) are polynomials and the degree of g(x) does not exceed that of f(x). If we then construct the generalized Sturm chain for f(x), g(x), we have

\[ I_a^b R(x) = I_a^b \frac{g(x)}{f(x)} = V(a) - V(b). \]

By means of Sturm's theorem we can determine the number of distinct real roots of a polynomial f(x) in the interval (a, b), since this number, as we have seen, is I_a^b f′(x)/f(x).
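The construction (6) and Theorem 1 translate directly into an algorithm. The following Python sketch (function names are ours) builds the chain f_1 = f, f_2 = g by repeated division, each new term being minus the remainder of the two preceding ones, and with g = f′ counts the distinct real roots of f in (a, b) as V(a) − V(b):

```python
# Polynomials are coefficient lists, highest power first.
def polyrem(num, den):
    """Remainder of the polynomial division num/den."""
    num = num[:]
    while len(num) >= len(den):
        c = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= c * den[i]
        num.pop(0)              # leading coefficient is now zero
    return num

def sturm_chain(f, g, eps=1e-12):
    """Generalized Sturm chain per (6): each term is minus the last remainder."""
    chain = [f, g]
    while True:
        r = polyrem(chain[-2], chain[-1])
        while r and abs(r[0]) < eps:    # strip numerically-zero leading terms
            r.pop(0)
        if not r:
            return chain
        chain.append([-c for c in r])

def variations(chain, x):
    """Number of sign variations V(x) in the chain at the point x."""
    vals = []
    for p in chain:
        v = 0.0
        for c in p:
            v = v * x + c               # Horner evaluation
        vals.append(v)
    vals = [v for v in vals if abs(v) > 1e-9]
    return sum(1 for u, w in zip(vals, vals[1:]) if u * w < 0)

def real_roots_between(f, a, b):
    """Distinct real roots of f in (a, b), by Theorem 1 with g = f'."""
    deriv = [c * (len(f) - 1 - i) for i, c in enumerate(f[:-1])]
    ch = sturm_chain(f, deriv)
    return variations(ch, a) - variations(ch, b)

# f(x) = x^3 - 3x has the real roots -sqrt(3), 0, sqrt(3):
print(real_roots_between([1.0, 0.0, -3.0, 0.0], -2, 2))   # 3
print(real_roots_between([1.0, 0.0, -3.0, 0.0], 0.5, 2))  # 1
```

This is a numerical toy: for ill-conditioned polynomials the zero tolerances would need care, which is why exact (rational) arithmetic is usually preferred for Sturm sequences.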


§ 3. Routh's Algorithm

1. Routh's problem consists in determining the number k of roots of a real polynomial f(z) in the right half-plane (Re z > 0).

To begin with, we treat the case where f(z) has no roots on the imaginary axis. In the right half-plane we construct the semicircle of radius R with its center at the origin, and we consider the domain bounded by this semicircle and the segment of the imaginary axis (Fig. 7). For sufficiently large R all the zeros of f(z) with positive real parts lie inside this domain. Therefore arg f(z) increases by 2kπ on going in the positive direction along the contour of the domain.¹¹ On the other hand, the increase of arg f(z) along the semicircle of radius R for R → ∞ is determined by the increase of the argument of the highest term a_0 z^n and is therefore nπ. Hence the increase of arg f(z) along the imaginary axis (R → ∞) is given by the expression

\[ \Delta_{-\infty}^{+\infty} \arg f(i\omega) = (n - 2k)\pi. \tag{7} \]

[Fig. 7: semicircular contour of radius R in the right half-plane]

We introduce a somewhat unusual notation for the coefficients of f(z); namely, we set

\[ f(z) = a_0 z^n + b_0 z^{n-1} + a_1 z^{n-2} + b_1 z^{n-3} + \cdots \qquad (a_0 \ne 0). \]

Then

\[ f(i\omega) = U(\omega) + iV(\omega), \tag{8} \]

where for even n

\[ U(\omega) = (-1)^{n/2} (a_0\omega^n - a_1\omega^{n-2} + a_2\omega^{n-4} - \cdots), \qquad
V(\omega) = (-1)^{n/2 - 1} (b_0\omega^{n-1} - b_1\omega^{n-3} + b_2\omega^{n-5} - \cdots), \tag{8'} \]

and for odd n

\[ U(\omega) = (-1)^{(n-1)/2} (b_0\omega^{n-1} - b_1\omega^{n-3} + b_2\omega^{n-5} - \cdots), \qquad
V(\omega) = (-1)^{(n-1)/2} (a_0\omega^n - a_1\omega^{n-2} + a_2\omega^{n-4} - \cdots). \tag{8''} \]

¹¹ For if f(z) = a_0 \prod_{\nu=1}^{n} (z - z_\nu), then Δ arg f(z) = \sum_{\nu=1}^{n} Δ arg (z - z_\nu). If the point z_\nu lies inside the domain in question, then Δ arg (z − z_\nu) = 2π; if z_\nu lies outside the domain, then Δ arg (z − z_\nu) = 0.


Following Routh, we make use of the Cauchy index. Then¹²

\[ \frac{1}{\pi}\, \Delta_{-\infty}^{+\infty} \arg f(i\omega) =
\begin{cases}
I_{-\infty}^{+\infty} \dfrac{U(\omega)}{V(\omega)} & \text{for } \lim\limits_{\omega\to\infty} \dfrac{U(\omega)}{V(\omega)} = 0, \\[2ex]
-\,I_{-\infty}^{+\infty} \dfrac{V(\omega)}{U(\omega)} & \text{for } \lim\limits_{\omega\to\infty} \dfrac{V(\omega)}{U(\omega)} = 0.
\end{cases} \tag{9} \]

The equations (8′) and (8″) show that for even n the lower formula in (9) must be taken and for odd n, the upper. Then we easily obtain from (7), (8′), (8″), and (9) that for every n (even or odd)¹³

\[ I_{-\infty}^{+\infty} \frac{b_0\omega^{n-1} - b_1\omega^{n-3} + \cdots}{a_0\omega^n - a_1\omega^{n-2} + \cdots} = n - 2k. \tag{10} \]

2. In order to determine the index on the left-hand side of (10) we use Sturm's theorem (see the preceding section). We set

\[ f_1(\omega) = a_0\omega^n - a_1\omega^{n-2} + \cdots, \qquad f_2(\omega) = b_0\omega^{n-1} - b_1\omega^{n-3} + \cdots \tag{11} \]

and, following Routh, construct a generalized Sturm chain (see p. 176)

\[ f_1(\omega),\; f_2(\omega),\; f_3(\omega),\; \ldots,\; f_m(\omega) \tag{12} \]

by the Euclidean algorithm.

First we consider the regular case: m = n + 1. In this case the degree of each function in (12) is one less than that of the preceding, and the last function f_{n+1}(ω) is of degree zero.¹⁴

From Euclid's algorithm (see (6)) it follows that

\[ f_3(\omega) = \frac{a_0}{b_0}\,\omega f_2(\omega) - f_1(\omega) = c_0\omega^{n-2} - c_1\omega^{n-4} + c_2\omega^{n-6} - \cdots, \]

where

\[ c_0 = a_1 - \frac{a_0}{b_0} b_1 = \frac{b_0 a_1 - a_0 b_1}{b_0}, \qquad
c_1 = a_2 - \frac{a_0}{b_0} b_2 = \frac{b_0 a_2 - a_0 b_2}{b_0}, \; \ldots \tag{13} \]

Similarly

\[ f_4(\omega) = \frac{b_0}{c_0}\,\omega f_3(\omega) - f_2(\omega) = d_0\omega^{n-3} - d_1\omega^{n-5} + \cdots, \]

where

\[ d_0 = b_1 - \frac{b_0}{c_0} c_1 = \frac{c_0 b_1 - b_0 c_1}{c_0}, \qquad
d_1 = b_2 - \frac{b_0}{c_0} c_2 = \frac{c_0 b_2 - b_0 c_2}{c_0}, \; \ldots \tag{13'} \]

The coefficients of the remaining polynomials f_5(ω), …, f_{n+1}(ω) are similarly determined.

¹² Since arg f(iω) = arccot [U(ω)/V(ω)] = arctan [V(ω)/U(ω)].

¹³ We recall that the formula (10) was derived under the assumption that f(z) has no roots on the imaginary axis.

¹⁴ In the regular case (12) is the ordinary (not generalized) Sturm chain.


Each polynomial

\[ f_1(\omega),\; f_2(\omega),\; \ldots,\; f_{n+1}(\omega) \tag{14} \]

is an even or an odd function, and two adjacent polynomials always have opposite parity.

We form the Routh scheme

\[ \begin{matrix}
a_0, & a_1, & a_2, & \ldots, \\
b_0, & b_1, & b_2, & \ldots, \\
c_0, & c_1, & c_2, & \ldots, \\
d_0, & d_1, & d_2, & \ldots, \\
\cdots & \cdots & \cdots & \cdots
\end{matrix} \tag{15} \]

The formulas (13), (13′) show that every row in this scheme is determined by the two preceding rows according to the following rule:

From the numbers of the upper row we subtract the corresponding numbers of the lower row multiplied by the number that makes the first difference zero. Omitting this zero difference, we obtain the required row.
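The row rule can be sketched in a few lines of Python (a toy of ours for the regular case only; the singular cases treated in § 4 are merely flagged):

```python
# Each new row of scheme (15) is  upper - (upper[0]/lower[0]) * lower  with
# the vanishing first entry dropped, exactly as in formulas (13) and (13').
def routh_scheme(coeffs):
    """coeffs: the coefficients of f(z) = a0 z^n + b0 z^(n-1) + a1 z^(n-2) + ...
    in descending order of powers.  Returns the rows of Routh's scheme."""
    rows = [list(map(float, coeffs[0::2])),   # a0, a1, a2, ...
            list(map(float, coeffs[1::2]))]   # b0, b1, b2, ...
    while len(rows[-2]) > 1:
        upper, lower = rows[-2], rows[-1]
        if lower[0] == 0.0:
            raise ValueError("singular case: Routh's algorithm stops (see § 4)")
        t = upper[0] / lower[0]
        padded = lower[1:] + [0.0] * len(upper)
        rows.append([u - t * l for u, l in zip(upper[1:], padded)])
    return rows

def rhp_roots(coeffs):
    """Number k of roots with Re z > 0: variations of sign in the first
    column of Routh's scheme (Theorem 2)."""
    col = [row[0] for row in routh_scheme(coeffs)]
    return sum(1 for u, w in zip(col, col[1:]) if u * w < 0)

# f(z) = (z - 1)(z + 2)(z + 3) = z^3 + 4z^2 + z - 6: one root in Re z > 0.
print(rhp_roots([1, 4, 1, -6]))   # 1
# f(z) = (z + 1)(z + 2)(z + 3) = z^3 + 6z^2 + 11z + 6: stable, k = 0.
print(rhp_roots([1, 6, 11, 6]))   # 0
```

The second example also illustrates Routh's criterion below: the first column 1, 6, 10, 6 has no variation of sign.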

The regular case is obviously characterized by the fact that the repeated application of this rule never yields a zero in the sequence

b_0, c_0, d_0, … .

Figs. 8 and 9 show the skeleton of Routh's scheme for an even n (n = 6) and an odd n (n = 7). Here the elements of the scheme are indicated by dots.

In the regular case, the polynomials f_1(ω) and f_2(ω) have the greatest common divisor f_{n+1}(ω) = const. ≠ 0. Therefore these polynomials, and hence U(ω) and V(ω) (see (8′), (8″), and (11)), do not vanish simultaneously; i.e., f(iω) = U(ω) + iV(ω) ≠ 0 for real ω. Therefore: In the regular case the formula (10) holds.

When we apply Sturm's theorem in the interval (−∞, +∞) to the left-hand side of this formula and make use of (14), we obtain by (10)

\[ V(-\infty) - V(+\infty) = n - 2k. \tag{16} \]

In our case¹⁵

\[ V(+\infty) = V(a_0, b_0, c_0, d_0, \ldots) \]

and

¹⁵ The sign of f_k(ω) for ω = +∞ coincides with the sign of the highest coefficient, and for ω = −∞ differs from it by the factor (−1)^{n−k+1} (k = 1, …, n + 1).


\[ V(-\infty) = V(a_0, -b_0, c_0, -d_0, \ldots). \]

Hence

\[ V(-\infty) = n - V(+\infty). \tag{17} \]

From (16) and (17) we find:

\[ k = V(a_0, b_0, c_0, d_0, \ldots). \tag{18} \]

Thus we have proved the following theorem:

THEOREM 2 (Routh): The number of roots of the real polynomial f(z) in the right half-plane Re z > 0 is equal to the number of variations of sign in the first column of Routh's scheme.

3. We consider the important special case where all the roots of f(z) have negative real parts ('case of stability'). If in this case we construct for the polynomials (11) the generalized Sturm chain (12), then, since k = 0, the formula (16) can be written as follows:

\[ V(-\infty) - V(+\infty) = n. \tag{19} \]

But 0 ≤ V(−∞) ≤ m − 1 ≤ n and 0 ≤ V(+∞) ≤ m − 1 ≤ n. Therefore (19) is possible only when m = n + 1 (the regular case!) and V(+∞) = 0, V(−∞) = m − 1 = n. The formula (18) then implies:

ROUTH'S CRITERION: All the roots of the real polynomial f(z) have negative real parts if and only if in the carrying out of Routh's algorithm all the elements of the first column of Routh's scheme are different from zero and of like sign.

4. In deriving Routh's theorem we have made use of the formula (10). In what follows we shall have to generalize this formula. The formula (10) was deduced under the assumption that f(z) has no roots on the imaginary axis. We shall now show that in the general case, where the polynomial f(z) = a_0z^n + b_0z^{n−1} + a_1z^{n−2} + ⋯ (a_0 ≠ 0) has k roots in the right half-plane and s roots on the imaginary axis, the formula (10) is replaced by

\[ I_{-\infty}^{+\infty} \frac{b_0\omega^{n-1} - b_1\omega^{n-3} + b_2\omega^{n-5} - \cdots}{a_0\omega^n - a_1\omega^{n-2} + a_2\omega^{n-4} - \cdots} = n - 2k - s. \tag{20} \]

For

\[ f(z) = d(z)\, f^*(z), \]

where the real polynomial d(z) = z^s + ⋯ has s roots on the imaginary axis and the polynomial f^*(z) of degree n^* = n − s has no such roots.


For the sake of definiteness, we consider the case where s is even (the case where s is odd is analyzed similarly).

Let

f(iω) = U(ω) + iV(ω) = d(iω)[U*(ω) + iV*(ω)].

Since in our case d(iω) is a real polynomial in ω, we have

U(ω)/V(ω) = U*(ω)/V*(ω).

Since n and n* have equal parity, we find by using (8′), (8″), and the notation (11):

f₂(ω)/f₁(ω) = f₂*(ω)/f₁*(ω).

We apply formula (10) to f*(z). Therefore

I_{−∞}^{+∞} f₂(ω)/f₁(ω) = I_{−∞}^{+∞} f₂*(ω)/f₁*(ω) = n* − 2k = n − 2k − s,

and this is what we had to prove.

§ 4. The Singular Case. Examples

1. In the preceding section we have examined the regular case, where in Routh's scheme none of the numbers b₀, c₀, d₀, ... vanish.

We now proceed to deal with the singular cases, where among the numbers b₀, c₀, ... there occurs a zero, say h₀ = 0. Routh's algorithm stops with the row in which h₀ occurs, because to obtain the numbers of the following row we would have to divide by h₀.

The singular cases can be of two types:

1) In the row in which h₀ occurs there are numbers different from zero. This means that at some place of (12) the degree drops by more than one.

2) All the numbers of the row in which h₀ occurs vanish simultaneously. Then this row is the (m + 1)-th, where m is the number of terms in the generalized Sturm chain (12). In this case the degrees of the functions in (12) decrease by unity from one function to the next, but the degree of the last function fₘ(ω) is greater than zero. In both cases the number of functions in (12) is m < n + 1.

Since the ordinary Routh algorithm comes to an end in both cases, Routh gives a special rule for continuing the scheme in the cases 1), 2).


2. In case 1), according to Routh, we have to substitute for h₀ = 0 a 'small' value ε of definite (but arbitrary) sign and continue to fill in the scheme. Then the subsequent elements of the first column of the scheme are rational functions of ε. The signs of these elements are determined by the 'smallness' and the sign of ε. If any one of these elements vanishes identically in ε, then we replace this element by another small value η and continue the algorithm.

Example:

f(z) = z⁴ + z³ + 2z² + 2z + 1.

Routh's scheme (with a small parameter ε):

1,  2,  1
1,  2
ε,  1
2 − 1/ε
1

k = V(1, 1, ε, 2 − 1/ε, 1) = 2.
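Routh's first rule can also be carried out numerically by substituting a concrete small value for ε. The following sketch is our own illustration and is valid, as the text stresses, only when f(z) has no roots on the imaginary axis; for simplicity it reuses the same ε for every vanishing pivot, whereas the text prescribes a fresh parameter η when an element vanishes identically in ε.

```python
def routh_count_with_eps(coeffs, eps):
    """Sign variations in the first column of Routh's scheme for
    f(z) = coeffs[0]*z^n + ... + coeffs[n], with a vanishing pivot
    replaced by the small parameter eps (Routh's first rule)."""
    n = len(coeffs) - 1
    table = [list(map(float, coeffs[0::2])), list(map(float, coeffs[1::2]))]
    for k in range(2, n + 1):
        prev2, prev1 = table[k - 2], table[k - 1]
        if abs(prev1[0]) < 1e-12:      # singular case of the first type
            prev1[0] = eps
        row = []
        for i in range(len(prev2) - 1):
            b = prev1[i + 1] if i + 1 < len(prev1) else 0.0
            row.append(prev2[i + 1] - prev2[0] * b / prev1[0])
        table.append(row)
    col = [r[0] for r in table]
    return sum(1 for a, b in zip(col, col[1:]) if a * b < 0)

# f(z) = z^4 + z^3 + 2z^2 + 2z + 1 from the example above:
f = [1, 1, 2, 2, 1]
```

For this f the first column comes out as 1, 1, ε, 2 − 1/ε, 1, and the count is 2 for either sign of ε, in agreement with the scheme above.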

This special method of varying the elements of the scheme is based on the following observation:

Since we assume that there is no singularity of the second type, the functions f₁(ω) and f₂(ω) are relatively prime. Hence it follows that the polynomial f(z) has no roots on the imaginary axis.

In Routh's scheme all the elements are expressed rationally in terms of the elements of the first two rows, i.e., the coefficients of the given polynomial. But it is not difficult to observe in the formulas (13), (13′) and the analogous formulas for the subsequent rows that, once we have given arbitrary values to the elements of any two adjacent rows of Routh's scheme and to the first element of the preceding row, we can express all the elements of the first two rows, i.e., the coefficients of the original polynomial, in integral rational form in terms of these elements. Thus, for example, all the numbers aᵢ, bᵢ can be represented as integral rational functions of

f₀, g₀, g₁, g₂, ..., h₀, h₁, h₂, ....

Therefore, in replacing g₀ = 0 by ε we in fact modify our original polynomial. Instead of the scheme for f(z) we have the Routh scheme for a polynomial F(z, ε), where F(z, ε) is an integral rational function of z and ε which reduces to f(z) for ε = 0. Since the roots of F(z, ε) change continuously with a change of the parameter ε, and since there are no roots on the imaginary axis for ε = 0, the number k of roots in the right half-plane is the same for F(z, ε) and F(z, 0) = f(z) for values of ε of small modulus.


3. Let us now proceed to a singularity of the second type. Suppose that in Routh's scheme

a₀ ≠ 0, b₀ ≠ 0, ..., e₀ ≠ 0,  g₀ = 0, g₁ = 0, g₂ = 0, ....

In this case, the last polynomial in the generalized Sturm chain (12) is of the form

fₘ(ω) = e₀ωⁿ⁻ᵐ⁺¹ + e₁ωⁿ⁻ᵐ⁻¹ + ⋯.

Routh proposes to replace fₘ₊₁(ω), which is zero, by fₘ′(ω); i.e., he proposes to write instead of g₀, g₁, ... the corresponding coefficients

(n − m + 1)e₀,  (n − m − 1)e₁, ...

and to continue the algorithm. The logical basis for this rule is as follows:

By formula (20),

I_{−∞}^{+∞} f₂(ω)/f₁(ω) = n − 2k − s

(the s roots of f(z) on the imaginary axis coincide with the real roots of fₘ(ω)). Therefore, if these real roots are simple, then (see p. 174)

I_{−∞}^{+∞} fₘ′(ω)/fₘ(ω) = s,

and therefore

I_{−∞}^{+∞} f₂(ω)/f₁(ω) + I_{−∞}^{+∞} fₘ′(ω)/fₘ(ω) = n − 2k.

This formula shows that the missing part of Routh's scheme must be filled by the Routh scheme for the polynomials fₘ(ω) and fₘ′(ω). The coefficients of fₘ′(ω) are used to replace the elements of the zero row in Routh's scheme.

But if the roots of fₘ(ω) are not simple, then we denote by d(ω) the greatest common divisor of fₘ(ω) and fₘ′(ω), by e(ω) the greatest common divisor of d(ω) and d′(ω), etc., and we have:

I_{−∞}^{+∞} fₘ′(ω)/fₘ(ω) + I_{−∞}^{+∞} d′(ω)/d(ω) + I_{−∞}^{+∞} e′(ω)/e(ω) + ⋯ = s.

Thus the required number k can be found if the missing part of Routh's scheme is filled by the Routh scheme for fₘ(ω) and fₘ′(ω), then the scheme for d(ω) and d′(ω), then that for e(ω) and e′(ω), etc.; i.e., Routh's rule has to be applied several times to dispose of a singularity of the second type.
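Routh's second rule can be mechanized as well. The sketch below is our own simplified illustration: when a row vanishes identically, it is replaced by the coefficients of fₘ′(ω) formed from the preceding row; it assumes the real roots of each auxiliary polynomial are simple, so that a single differentiation suffices at each zero row (the repeated-gcd refinement described above is not implemented).

```python
def routh_count(coeffs):
    """Sign variations in the first column of Routh's scheme for
    f(z) = coeffs[0]*z^n + ... + coeffs[n], applying Routh's second rule:
    a row that vanishes identically is replaced by the derivative
    coefficients of the auxiliary polynomial f_m(w) formed from the
    preceding row.  Assumes the auxiliary real roots are simple."""
    n = len(coeffs) - 1
    table = [list(map(float, coeffs[0::2])), list(map(float, coeffs[1::2]))]
    for k in range(2, n + 1):
        prev2, prev1 = table[k - 2], table[k - 1]
        if all(abs(x) < 1e-9 for x in prev1):
            # prev2 holds f_m(w) = c0*w^d + c1*w^(d-2) + ...; use f_m'(w)
            d = n - (k - 2)
            prev1[:] = [c * (d - 2 * i) for i, c in enumerate(prev2)
                        if d - 2 * i > 0]
        row = []
        for i in range(len(prev2) - 1):
            b = prev1[i + 1] if i + 1 < len(prev1) else 0.0
            row.append(prev2[i + 1] - prev2[0] * b / prev1[0])
        table.append(row)
    col = [r[0] for r in table]
    return sum(1 for a, b in zip(col, col[1:]) if a * b < 0)
```

On the degree-10 example that follows in the text, the zero rows appear at ω⁷ and ω³, the derivative rows 8, −12, 12, −4 and 4, −2 are inserted automatically, and the function returns k = 4.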


Example.

f(z) = z¹⁰ + z⁹ − z⁸ − 2z⁷ + z⁶ + 3z⁵ + z⁴ − 2z³ − z² + z + 1.

Scheme:

ω¹⁰ |  1  −1   1   1  −1   1
ω⁹  |  1  −2   3  −2   1
ω⁸  |  1  −2   3  −2   1
ω⁷  |  8 −12  12  −4,  divided by 4:  2  −3   3  −1
ω⁶  | −1   3  −3   2
ω⁵  |  3  −3   3,  divided by 3:  1  −1   1
ω⁴  |  2  −2   2,  divided by 2:  1  −1   1
ω³  |  4  −2,  divided by 2:  2  −1
ω²  | −1   2
ω   |  3,  divided by 3:  1
ω⁰  |  2,  divided by 2:  1

k = V(1, 1, 1, 2, −1, 1, 1, 2, −1, 1, 1) = 4.

Note. All the elements of any one row may be multiplied by one and the same positive number without changing the signs of the elements of the first column. This remark has been used in constructing the scheme.

4. However, the application of both rules of Routh does not enable us to determine the number k in all cases. The application of the first rule (introduction of small parameters ε, ...) is justified only when f(z) has no roots on the imaginary axis.

If f(z) has roots on the imaginary axis, then by varying the parameter ε some of these roots may pass over into the right half-plane and change k.

Example.

f(z) = z⁶ + z⁵ + 3z⁴ + 3z³ + 3z² + 2z + 1.

Scheme:

ω⁶ | 1   3   3   1
ω⁵ | 1   3   2
ω⁴ | ε   1   1
ω³ | 3 − 1/ε   2 − 1/ε
ω² | (−2ε² + 4ε − 1)/(3ε − 1)   1
ω  | ε(1 − 4ε)/(−2ε² + 4ε − 1)
ω⁰ | 1

For small ε the elements of the first column have the signs of 1, 1, ε, 3 − 1/ε, 1, −ε, 1, so that

V(1, 1, ε, 3 − 1/ε, 1, −ε, 1) = 4 for ε > 0,  = 2 for ε < 0.


The question of the value of k remains open. In the general case, where f(z) has roots on the imaginary axis, we have to proceed as follows:

Setting f(z) = F₁(z) + F₂(z), where

F₁(z) = a₀zⁿ + a₁zⁿ⁻² + ⋯,   F₂(z) = b₀zⁿ⁻¹ + b₁zⁿ⁻³ + ⋯,

we must find the greatest common divisor d(z) of F₁(z) and F₂(z). Then

f(z) = d(z) f*(z).

If f(z) has a root z for which −z is also a root (all the roots on the imaginary axis have this property), then it follows from f(z) = 0 and f(−z) = 0 that F₁(z) = 0 and F₂(z) = 0, i.e., z is a root of d(z). Therefore f*(z) has no roots z for which −z is also a root of f*(z).

Then

k = k₁ + k₂,

where k₁ and k₂ are the respective numbers of roots of f*(z) and d(z) in the right half-plane; k₁ is determined by Routh's algorithm, and k₂ = (q − s)/2, where q is the degree of d(z) and s is the number of real roots of d(iω).¹⁶

In the last example,

d(z) = z² + 1,  f*(z) = z⁴ + z³ + 2z² + 2z + 1.

Therefore (see the example on p. 182) we have k₂ = 0, k₁ = 2, and hence

k = 2.
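The decomposition f(z) = d(z)f*(z) can be found mechanically by applying Euclid's algorithm to F₁(z) and F₂(z). A small sketch (our own illustration, using exact rational arithmetic; polynomials are coefficient lists, highest power first):

```python
from fractions import Fraction

def poly_mod(a, b):
    """Remainder of the polynomial division a(z) / b(z)."""
    r = [Fraction(x) for x in a]
    b = [Fraction(x) for x in b]
    while len(r) >= len(b):
        if r[0] == 0:
            r.pop(0)
            continue
        c = r[0] / b[0]
        for i in range(len(b)):
            r[i] -= c * b[i]
        r.pop(0)                      # leading term has cancelled
    while r and r[0] == 0:
        r.pop(0)
    return r

def poly_gcd(a, b):
    """Monic greatest common divisor of a(z) and b(z) (Euclid)."""
    a, b = [Fraction(x) for x in a], [Fraction(x) for x in b]
    while b:
        a, b = b, poly_mod(a, b)
    return [x / a[0] for x in a]

# f(z) = z^6 + z^5 + 3z^4 + 3z^3 + 3z^2 + 2z + 1 from the last example:
F1 = [1, 0, 3, 0, 3, 0, 1]    # a0 z^6 + a1 z^4 + a2 z^2 + a3
F2 = [1, 0, 3, 0, 2, 0]       # b0 z^5 + b1 z^3 + b2 z
d = poly_gcd(F1, F2)          # -> coefficients of z^2 + 1
```

Here d(z) = z² + 1 (degree q = 2; d(iω) = 1 − ω² has s = 2 real roots, so k₂ = (q − s)/2 = 0), and dividing out gives f*(z) = z⁴ + z³ + 2z² + 2z + 1 with k₁ = 2, in agreement with k = 2 above.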

§ 5. Lyapunov's Theorem

1. From the investigations of A. M. Lyapunov, published in 1892 in his monograph 'The General Problem of Stability of Motion,' there follows a theorem¹⁷ that gives necessary and sufficient conditions for all the roots of the characteristic equation |λE − A| = 0 of a real matrix A = ‖aᵢₖ‖₁ⁿ to have negative real parts. Since every polynomial

f(λ) = a₀λⁿ + a₁λⁿ⁻¹ + ⋯ + aₙ  (a₀ ≠ 0)

16 d(iω) is a real polynomial or becomes one after cancelling i. The number of its real roots can be determined by Sturm's theorem.

17 See [32], § 20.


can be represented as a characteristic determinant |λE − A|,¹⁸ Lyapunov's theorem is of a general character and is applicable to an arbitrary algebraic equation f(λ) = 0.

Suppose given a real matrix A = ‖aᵢₖ‖₁ⁿ and a homogeneous polynomial of dimension m in the variables x₁, x₂, ..., xₙ:

V(x, x, ..., x)   (x = (x₁, x₂, ..., xₙ)).

Let us find the total derivative with respect to t of V(x, x, ..., x) under the assumption that x is a solution of the differential system

dx/dt = Ax.

Then

(d/dt) V(x, x, ..., x) = V(Ax, x, ..., x) + V(x, Ax, ..., x) + ⋯ + V(x, x, ..., Ax)  (21)

is again a homogeneous polynomial of dimension m in x₁, x₂, ..., xₙ. The equation (21) defines a linear operator Â which associates with every homogeneous polynomial V(x, x, ..., x) of dimension m a certain homogeneous polynomial W(x, x, ..., x) of the same dimension m:

W = Â(V).

We restrict ourselves to the case m = 2.¹⁹ Then V(x, x) and W(x, x) are quadratic forms in the variables x₁, x₂, ..., xₙ connected by the equation

(d/dt) V(x, x) = V(Ax, x) + V(x, Ax) = W(x, x);  (22)

hence²⁰

W = Â(V) = AᵀV + VA.  (23)

18 For this purpose it is sufficient to set, for example:

A =
| 0  0  ⋯  0  −aₙ/a₀   |
| 1  0  ⋯  0  −aₙ₋₁/a₀ |
| ⋯⋯⋯⋯⋯⋯⋯⋯ |
| 0  0  ⋯  1  −a₁/a₀   |

19 A. M. Lyapunov proved his theorem for every positive integer m.

20 Because V(x, y) = xᵀVy.


Here V = ‖vᵢₖ‖₁ⁿ and W = ‖wᵢₖ‖₁ⁿ are symmetric matrices formed, respectively, from the coefficients of the forms V(x, x) and W(x, x). The linear operator Â in the space of matrices of order n is completely determined by the specification of the matrix A = ‖aᵢₖ‖₁ⁿ.

If λ₁, λ₂, ..., λₙ are the characteristic values of the matrix A, then every characteristic value of the operator Â can be represented in the form λᵢ + λₖ (1 ≤ i, k ≤ n).²¹

Therefore, if the matrix A = ‖aᵢₖ‖₁ⁿ has no zero characteristic value and no two that are opposites, then the operator Â is non-singular. In this case the matrix W in (23) determines the matrix V uniquely.

If V is symmetric, then the matrix W defined by (23) is also symmetric. If Â is a non-singular operator, then the converse statement also holds: every symmetric matrix W corresponds by (23) to a symmetric matrix V. For in this case we find, by going over to the transposed matrices on both sides of (23), that the matrix Vᵀ, as well as V, satisfies (23). By the uniqueness of the solution, Vᵀ = V.

Thus: If the matrix A = ‖aᵢₖ‖₁ⁿ has no zero and no two opposite characteristic values, then every quadratic form W(x, x) corresponds to one and only one quadratic form V(x, x) connected with W(x, x) by (22).

Now we can formulate Lyapunov's theorem.

THEOREM 3 (Lyapunov): If all the characteristic values of the real matrix A = ‖aᵢₖ‖₁ⁿ have negative real parts, then to every negative-definite quadratic form W(x, x) there corresponds a positive-definite quadratic form V(x, x) connected with W(x, x), taking

dx/dt = Ax  (24)

into account, by the equation

(d/dt) V(x, x) = W(x, x).  (25)

Conversely, if for every negative-definite form W(x, x) there exists a positive-definite form V(x, x) connected with W(x, x) by the equation (25), taking (24) into account, then all the characteristic values of the matrix A = ‖aᵢₖ‖₁ⁿ have negative real parts.

Proof. 1. Suppose that all the characteristic values of A have negative real parts. Then for every solution x = e^{At}x₀ of (24) we have lim_{t→∞} x = 0.²² Suppose that the forms V(x, x) and W(x, x) are connected by (25) and that

21 See footnote 18.

22 See Vol. I, Chapter V, § 6.


W(x, x) < 0  (x ≠ 0).²³

Let us assume that for some x₀ ≠ 0

V₀ = V(x₀, x₀) ≤ 0.

But (d/dt) V(x, x) = W(x, x) < 0 (x = e^{At}x₀). Therefore for t > 0 the value of V(x, x) is negative and decreases as t → ∞, which contradicts the equation lim_{t→∞} V(x, x) = lim_{x→0} V(x, x) = 0. Therefore V(x, x) > 0 for x ≠ 0, i.e., V(x, x) is a positive-definite quadratic form.

2. Suppose, conversely, that in (25)

W(x, x) < 0,  V(x, x) > 0  (x ≠ 0).

From (25) it follows that

V(x, x) = V(x₀, x₀) + ∫₀ᵗ W(x, x) dt  (x = e^{At}x₀).  (25′)

We shall show that for every x₀ ≠ 0 the column x = e^{At}x₀ comes arbitrarily near to zero for arbitrarily large values of t > 0. Assume the contrary. Then there exists a number ν > 0 such that

W(x, x) < −ν < 0  (x = e^{At}x₀, x₀ ≠ 0, t > 0).

But then from (25′)

V(x, x) < V(x₀, x₀) − νt,

and so for sufficiently large values of t we have V(x, x) < 0, which contradicts our assumption.

From what we have shown it follows that for certain sufficiently large values of t the value of V(x, x) (x = e^{At}x₀, x₀ ≠ 0) will be arbitrarily near to zero. But V(x, x) decreases monotonically for t > 0, since (d/dt) V(x, x) = W(x, x) < 0. Therefore lim_{t→∞} V(x, x) = 0.

Hence it follows that for every x₀ ≠ 0 we have lim_{t→∞} e^{At}x₀ = 0, i.e., lim_{t→∞} e^{At} = 0. This is only possible if all the characteristic values of A have negative real parts (see Vol. I, Chapter V, § 6).

The theorem is now completely proved.

For the form W(x, x) in Lyapunov's theorem we can take an arbitrary negative-definite form, in particular the form −Σᵢ₌₁ⁿ xᵢ². In this case the theorem admits of the following matrix formulation:

23 The form W(x, x) is given arbitrarily. The form V(x, x) is uniquely determined by (25), because A has in this case neither the characteristic value zero nor pairs of opposite characteristic values.


THEOREM 3′: All the characteristic values of the real matrix A = ‖aᵢₖ‖₁ⁿ have negative real parts if and only if the matrix equation

AᵀV + VA = −E  (26)

has as its solution V the coefficient matrix of some positive-definite quadratic form V(x, x) > 0.

2. From this theorem we derive a criterion for determining the stability of a non-linear system from its linear approximation.²⁴

Suppose that it is required to prove the asymptotic stability of the zero solution of the non-linear system of differential equations (1) (p. 172) in the case where the coefficients aᵢₖ (i, k = 1, 2, ..., n) in the linear terms on the right-hand side form a matrix A = ‖aᵢₖ‖₁ⁿ having only characteristic values with negative real parts. Then, if we determine a positive-definite form V(x, x) by the matrix equation (26) and calculate its total derivative with respect to time under the assumption that x = (x₁, x₂, ..., xₙ) is a solution of the given system (1), we have:

(d/dt) V(x, x) = −Σᵢ₌₁ⁿ xᵢ² + R(x₁, x₂, ..., xₙ),

where R(x₁, x₂, ..., xₙ) is a series containing terms of the third and higher total degree in x₁, x₂, ..., xₙ. Therefore, in some sufficiently small neighborhood of (0, 0, ..., 0) we have simultaneously, for every x ≠ 0,

V(x, x) > 0,  (d/dt) V(x, x) < 0.

By Lyapunov's general criterion of stability²⁵ this also indicates the asymptotic stability of the zero solution of the system of differential equations.

If we express the elements of V from the matrix equation (26) in terms of the elements of A and substitute these expressions in the inequalities

v₁₁ > 0,   | v₁₁  v₁₂ |
           | v₂₁  v₂₂ | > 0,   ...,

| v₁₁  v₁₂  ⋯  v₁ₙ |
| v₂₁  v₂₂  ⋯  v₂ₙ |
| ⋯⋯⋯⋯⋯⋯ |
| vₙ₁  vₙ₂  ⋯  vₙₙ | > 0,

then we obtain the inequalities that the elements of a matrix A = ‖aᵢₖ‖₁ⁿ must satisfy in order that all the characteristic values of the matrix should

24 See [32], § 26; [9], pp. 113 ff.; [36], pp. 66 ff.

25 See [32], § 16; [9], pp. 19-21 and 31-33; [36], pp. 32-34.


have negative real parts. However, these inequalities can be obtained in a considerably simpler form from the criterion of Routh-Hurwitz, which will be discussed in the following section.

Note. Lyapunov's theorem (3) or (3′) can be generalized immediately to the case of an arbitrary complex matrix A = ‖aᵢₖ‖₁ⁿ. The quadratic forms V(x, x) and W(x, x) are then replaced by Hermitian forms

V(x, x) = Σ_{i,k=1}^n vᵢₖ x̄ᵢxₖ,   W(x, x) = Σ_{i,k=1}^n wᵢₖ x̄ᵢxₖ.

Correspondingly, the matrix equation (26) is replaced by the equation

A*V + VA = −E  (A* = Āᵀ).

§ 6. The Theorem of Routh-Hurwitz

1. In the preceding sections we have explained the method of Routh, unsurpassed in its simplicity, of determining the number k of roots in the right half-plane of a real polynomial whose coefficients are given as explicit numbers. If the coefficients of the polynomial depend on parameters and it is required to determine for what values of the parameters the number k has one value or another, in particular the value 0 ('domain of stability'),²⁶ then it is desirable to have explicit expressions for the values of c₀, d₀, ... in terms of the coefficients of the given polynomial. In solving this problem, we obtain a method of determining k and, in particular, a stability criterion in a form in which it was established by Hurwitz [204].

We again consider the polynomial

f(z) = a₀zⁿ + b₀zⁿ⁻¹ + a₁zⁿ⁻² + b₁zⁿ⁻³ + ⋯  (a₀ ≠ 0).

By the Hurwitz matrix we mean the square matrix of order n

H =
| b₀  b₁  b₂  ⋯  bₙ₋₁ |
| a₀  a₁  a₂  ⋯  aₙ₋₁ |
| 0   b₀  b₁  ⋯  bₙ₋₂ |
| 0   a₀  a₁  ⋯  aₙ₋₂ |
| 0   0   b₀  ⋯  bₙ₋₃ |
| ⋯⋯⋯⋯⋯⋯⋯ |

(aₖ = 0 for k > [n/2],  bₖ = 0 for k > [(n − 1)/2]).  (27)

26 For this is precisely the situation in planning new mechanical or electrical systems of governors.


We transform the matrix by subtracting from the second, fourth, ... rows the first, third, ... rows, multiplied by a₀/b₀.²⁷ We obtain the matrix

| b₀  b₁  b₂  ⋯ |
| 0   c₀  c₁  ⋯ |
| 0   b₀  b₁  ⋯ |
| 0   0   c₀  ⋯ |
| 0   0   b₀  ⋯ |
| ⋯⋯⋯⋯ |

In this matrix c₀, c₁, ... is the third row of Routh's scheme, supplemented by zeros (cₖ = 0 for k > [n/2] − 1).

We transform this matrix again by subtracting from the third, fifth, ... rows the second, fourth, ... rows, multiplied by b₀/c₀:

| b₀  b₁  b₂  b₃  ⋯ |
| 0   c₀  c₁  c₂  ⋯ |
| 0   0   d₀  d₁  ⋯ |
| 0   0   c₀  c₁  ⋯ |
| 0   0   0   d₀  ⋯ |
| 0   0   0   c₀  ⋯ |
| ⋯⋯⋯⋯⋯ |

Continuing this process, we ultimately arrive at a triangular matrix of order n

R =
| b₀  b₁  b₂  ⋯ |
| 0   c₀  c₁  ⋯ |
| 0   0   d₀  ⋯ |
| ⋯⋯⋯⋯ |        (28)

which we call the Routh matrix. It is obtained from Routh's scheme (see (15)) by: 1) deleting the first row; 2) shifting the rows to the right so that their first elements come to lie on the main diagonal; and 3) completing it by zeros to a square matrix of order n.

27 We begin by dealing with the regular case, where b₀ ≠ 0, c₀ ≠ 0, d₀ ≠ 0, ....


DEFINITION 2: Two matrices A = ‖aᵢₖ‖₁ⁿ and B = ‖bᵢₖ‖₁ⁿ will be called equivalent if and only if for every p ≤ n the corresponding minors of order p in the first p rows are equal:

A(1 2 ⋯ p; i₁ i₂ ⋯ iₚ) = B(1 2 ⋯ p; i₁ i₂ ⋯ iₚ)   (i₁, i₂, ..., iₚ = 1, 2, ..., n;  p = 1, 2, ..., n).

Since we do not change the values of the minors of order p in the first p rows when we subtract from any row of the matrix an arbitrary multiple of any preceding row, the Hurwitz and Routh matrices H and R are equivalent in the sense of Definition 2:

H(1 2 ⋯ p; i₁ i₂ ⋯ iₚ) = R(1 2 ⋯ p; i₁ i₂ ⋯ iₚ)   (i₁, i₂, ..., iₚ = 1, 2, ..., n;  p = 1, 2, ..., n).  (29)

The equivalence of the matrices H and R enables us to express all the elements of R, i.e. of the Routh scheme, in terms of the minors of the Hurwitz matrix H and, therefore, in terms of the coefficients of the given polynomial. For when we give to p in (29) the values 1, 2, 3, ... in succession, we obtain

H(1; 1) = b₀,  H(1; 2) = b₁,  H(1; 3) = b₂, ...,

H(1 2; 1 2) = b₀c₀,  H(1 2; 1 3) = b₀c₁,  H(1 2; 1 4) = b₀c₂, ...,   (30)

H(1 2 3; 1 2 3) = b₀c₀d₀,  H(1 2 3; 1 2 4) = b₀c₀d₁,  H(1 2 3; 1 2 5) = b₀c₀d₂, ...,

etc.

Hence we find the following expressions for the elements of Routh's scheme:

b₀ = H(1; 1),  b₁ = H(1; 2),  b₂ = H(1; 3), ...,

c₀ = H(1 2; 1 2)/H(1; 1),  c₁ = H(1 2; 1 3)/H(1; 1),  c₂ = H(1 2; 1 4)/H(1; 1), ...,   (31)

d₀ = H(1 2 3; 1 2 3)/H(1 2; 1 2),  d₁ = H(1 2 3; 1 2 4)/H(1 2; 1 2),  d₂ = H(1 2 3; 1 2 5)/H(1 2; 1 2), ...,

............................


The successive principal minors of H are usually called the Hurwitz determinants. We shall denote them by

Δ₁ = H(1; 1) = b₀,  Δ₂ = H(1 2; 1 2),  ...,  Δₙ = H(1 2 ⋯ n; 1 2 ⋯ n) =
| b₀  b₁  ⋯  bₙ₋₁ |
| a₀  a₁  ⋯  aₙ₋₁ |
| 0   b₀  ⋯  bₙ₋₂ |
| 0   a₀  ⋯  aₙ₋₂ |
| ⋯⋯⋯⋯⋯ |        (32)

Note 1. By the formulas (30),²⁸

Δ₁ = b₀,  Δ₂ = b₀c₀,  Δ₃ = b₀c₀d₀, ....  (33)

From Δ₁ ≠ 0, ..., Δₚ ≠ 0 it follows that the first p of the numbers b₀, c₀, ... are different from zero, and vice versa; in this case the p successive rows of Routh's scheme beginning with the third are completely determined, and the formulas (31) hold for them.

Note 2. The regular case (all the b₀, c₀, ... have a meaning and are different from zero) is characterized by the inequalities

Δ₁ ≠ 0,  Δ₂ ≠ 0,  ...,  Δₙ ≠ 0.

Note 3. The definition of the elements of Routh's scheme by means of the formulas (31) is more general than that by means of Routh's algorithm. Thus, for example, if b₀ = H(1; 1) = 0, then Routh's algorithm does not give us anything except the first two rows formed from the coefficients of the given polynomial. However, if for Δ₁ = 0 the remaining determinants Δ₂, Δ₃, ... are different from zero, then by omitting the row of c's we can determine by means of the formulas (31) all the remaining rows of Routh's scheme.

By the formulas (33),

b₀ = Δ₁,  c₀ = Δ₂/Δ₁,  d₀ = Δ₃/Δ₂, ...,

and therefore

28 If the coefficients of f(z) are given numerically, then the formulas (33), which reduce this computation to the formation of the Routh scheme, give by far the simplest method of computing the Hurwitz determinants.


V(a₀, b₀, c₀, ...) = V(a₀, Δ₁, Δ₂/Δ₁, Δ₃/Δ₂, ..., Δₙ/Δₙ₋₁) = V(a₀, Δ₁, Δ₃, ...) + V(1, Δ₂, Δ₄, ...).

Hence Routh's theorem can be restated as follows :

THEOREM 4 (Routh-Hurwitz): The number of roots of the real polynomial f(z) = a₀zⁿ + ⋯ in the right half-plane is determined by the formula

k = V(a₀, Δ₁, Δ₂/Δ₁, Δ₃/Δ₂, ..., Δₙ/Δₙ₋₁)  (34)

or (what is the same) by

k = V(a₀, Δ₁, Δ₃, ...) + V(1, Δ₂, Δ₄, ...).  (34′)

Note. This statement of the Routh-Hurwitz theorem assumes that we have the regular case

Δ₁ ≠ 0,  Δ₂ ≠ 0,  ...,  Δₙ ≠ 0.

In the following section we shall show how this formula can be used in the singular cases where some of the Hurwitz determinants Δᵢ are zero.

2. We now consider the special case where all the roots of f(z) are in the left half-plane Re z < 0. By Routh's criterion, all the a₀, b₀, c₀, d₀, ... must then be different from zero and of like sign. Since we are concerned here with the regular case, we obtain from (34) for k = 0 the following criterion:

CRITERION OF ROUTH-HURWITZ: All the roots of the real polynomial f(z) = a₀zⁿ + ⋯ (a₀ ≠ 0) have negative real parts if and only if the inequalities

a₀Δ₁ > 0,  Δ₂ > 0,  a₀Δ₃ > 0,  Δ₄ > 0,  ...,  a₀Δₙ > 0 (for odd n),  Δₙ > 0 (for even n)   (35)

hold.

Note. If a₀ > 0, these conditions can be written as follows:

Δ₁ > 0,  Δ₂ > 0,  ...,  Δₙ > 0.  (36)

If we use the usual notation for the coefficients of the polynomial

f(z) = a₀zⁿ + a₁zⁿ⁻¹ + a₂zⁿ⁻² + ⋯ + aₙ₋₁z + aₙ,

then for a₀ > 0 the Routh-Hurwitz conditions (36) can be written in the form of the following determinantal inequalities:


a₁ > 0,   | a₁  a₃ |
          | a₀  a₂ | > 0,

| a₁  a₃  a₅ |
| a₀  a₂  a₄ | > 0,   ...,
| 0   a₁  a₃ |

| a₁  a₃  a₅  ⋯  0 |
| a₀  a₂  a₄  ⋯  0 |
| 0   a₁  a₃  ⋯  0 |
| 0   a₀  a₂  ⋯  0 |
| ⋯⋯⋯⋯⋯⋯ |
| 0   0   0   ⋯  aₙ | > 0.

A real polynomial f(z) = a₀zⁿ + ⋯ whose coefficients satisfy (35), i.e., whose roots have negative real parts, is often called a Hurwitz polynomial.

3. In conclusion, we mention a remarkable property of Routh's scheme.

Let f₀, f₁, ... and g₀, g₁, ... be the (m + 1)-th and (m + 2)-th rows of the scheme (f₀ = Δₘ/Δₘ₋₁, g₀ = Δₘ₊₁/Δₘ). Since these two rows together with the subsequent rows form a Routh scheme of their own, the elements of the (m + p + 1)-th row (of the original scheme) can be expressed in terms of the elements of the (m + 1)-th and (m + 2)-th rows f₀, f₁, ... and g₀, g₁, ... by the same formulas as those expressing the (p + 1)-th row in terms of the elements of the first two rows a₀, a₁, ... and b₀, b₁, ...; that is, if we set

H′ =
| g₀  g₁  g₂  ⋯ |
| f₀  f₁  f₂  ⋯ |
| 0   g₀  g₁  ⋯ |
| 0   f₀  f₁  ⋯ |
| ⋯⋯⋯⋯ |

then we have

H(1 ⋯ m+p−1 m+p; 1 ⋯ m+p−1 m+p+k−1) / H(1 ⋯ m+p−1; 1 ⋯ m+p−1) = H′(1 ⋯ p−1 p; 1 ⋯ p−1 p+k−1) / H′(1 ⋯ p−1; 1 ⋯ p−1).  (37)

The Hurwitz determinant Δₘ₊ₚ is equal to the product of the first m + p numbers in the sequence b₀, c₀, ...:

Δₘ₊ₚ = b₀c₀ ⋯ f₀g₀ ⋯ l₀.

But

Δₘ = b₀c₀ ⋯ f₀,   Δ′ₚ = g₀ ⋯ l₀.

Therefore the following important relation²⁹ holds:

Δₘ₊ₚ = Δₘ Δ′ₚ.  (38)

29 Here Δ′ₚ is the minor of order p in the top left-hand corner of H′.


The formula (38) holds whenever the numbers f₀, f₁, ... and g₀, g₁, ... are well defined, i.e., under the conditions Δₘ₋₁ ≠ 0, Δₘ ≠ 0.

The formula (37) has a meaning if in addition to the conditions Δₘ₋₁ ≠ 0, Δₘ ≠ 0 we also have Δₘ₊ₚ₋₁ ≠ 0. From this condition it follows that the denominator of the fraction on the right-hand side of (37) is also different from zero: Δ′ₚ₋₁ ≠ 0.

§ 7. Orlando's Formula

1. In the discussion of the cases where some of the Hurwitz determinants are zero we shall have to use the following formula of Orlando [294], which expresses the determinant Δₙ₋₁ in terms of the highest coefficient a₀ and the roots z₁, z₂, ..., zₙ of f(z):³⁰

Δₙ₋₁ = (−1)^(n(n−1)/2) a₀ⁿ⁻¹ ∏_{1≤i<k≤n} (zᵢ + zₖ).  (39)

For n = 2 this reduces to the well-known formula for the coefficient b₀ in the quadratic equation a₀z² + b₀z + a₁ = 0:

Δ₁ = b₀ = −a₀(z₁ + z₂).

Let us assume that the formula (39) is true for polynomials of degree n, f(z) = a₀zⁿ + b₀zⁿ⁻¹ + ⋯, and show that it is then true for polynomials of degree n + 1:

F(z) = (z + h) f(z) = a₀zⁿ⁺¹ + (b₀ + ha₀)zⁿ + (a₁ + hb₀)zⁿ⁻¹ + ⋯.

For this purpose we form the auxiliary determinant of order n + 1

D =
| b₀  b₁  ⋯  bₙ₋₁  hⁿ    |
| a₀  a₁  ⋯  aₙ₋₁  −hⁿ⁻¹ |
| 0   b₀  ⋯  bₙ₋₂  hⁿ⁻²  |
| 0   a₀  ⋯  aₙ₋₂  −hⁿ⁻³ |
| ⋯⋯⋯⋯⋯⋯ |
| 0   0   ⋯        (−1)ⁿ |

(aₖ = 0 for k > [n/2],  bₖ = 0 for k > [(n − 1)/2]).

30 The coefficients of f(z) may be arbitrary complex numbers.


We multiply the first row of D by a₀ and add to it the second row multiplied by −b₀, the third multiplied by a₁, the fourth by −b₁, etc. Then in the first row all the elements except the last are zero, and the last element is f(h). Hence we deduce that

D = (−1)ⁿ Δₙ₋₁ f(h).

On the other hand, when we add to each row of D (except the last) the next row multiplied by h, we obtain, apart from a factor (−1)ⁿ, the Hurwitz determinant Δ̄ₙ of order n for the polynomial F(z):

D = (−1)ⁿ
| b₀+ha₀  b₁+ha₁  ⋯ |
| a₀      a₁+hb₀  ⋯ |
| 0       b₀+ha₀  ⋯ |
| 0       a₀      ⋯ |
| ⋯⋯⋯⋯ |
= (−1)ⁿ Δ̄ₙ.

Thus

Δ̄ₙ = Δₙ₋₁ f(h) = a₀Δₙ₋₁ ∏ᵢ₌₁ⁿ (h − zᵢ).

When we replace Δₙ₋₁ by its expression (39) and set h = −zₙ₊₁, we obtain

Δ̄ₙ = (−1)^((n+1)n/2) a₀ⁿ ∏_{1≤i<k≤n+1} (zᵢ + zₖ).

Thus, by mathematical induction, Orlando's formula is established for polynomials of every degree.

From Orlando's formula it follows that: Δₙ₋₁ = 0 if and only if the sum of two roots of f(z) is zero.³¹

Since Δₙ = cΔₙ₋₁, where c is the constant term of the polynomial f(z) (c = (−1)ⁿ a₀z₁z₂ ⋯ zₙ), it follows from (39) that:

Δₙ = (−1)^(n(n+1)/2) a₀ⁿ z₁z₂ ⋯ zₙ ∏_{1≤i<k≤n} (zᵢ + zₖ).  (40)

The last formula shows that: Δₙ vanishes if and only if f(z) has a pair of opposite roots z and −z.

31 In particular, Δₙ₋₁ = 0 when f(z) has at least one pair of conjugate pure imaginary roots or multiple zero roots.
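Orlando's formula is easy to confirm on a concrete cubic. For n = 3, Δ₂ = a₁a₂ − a₀a₃ in the usual coefficient notation, and (39) predicts Δ₂ = −a₀²(z₁+z₂)(z₁+z₃)(z₂+z₃). A sketch (our own illustration):

```python
from itertools import combinations

def poly_from_roots(a0, roots):
    """Coefficients of a0 * prod (z - zi), highest power first."""
    coeffs = [a0]
    for r in roots:
        coeffs = coeffs + [0.0]
        for i in range(len(coeffs) - 1, 0, -1):
            coeffs[i] -= r * coeffs[i - 1]
    return coeffs

n = 3
a0 = 1.0
roots = [-1.0, -2.0, -3.0]
c = poly_from_roots(a0, roots)            # z^3 + 6z^2 + 11z + 6

# Delta_{n-1} directly: for n = 3, Delta_2 = a1*a2 - a0*a3
delta2 = c[1] * c[2] - c[0] * c[3]

# Orlando's formula (39): (-1)^(n(n-1)/2) * a0^(n-1) * prod (zi + zk)
sign = (-1) ** (n * (n - 1) // 2)
prod = 1.0
for zi, zk in combinations(roots, 2):
    prod *= zi + zk
orlando = sign * a0 ** (n - 1) * prod
```

Both computations give Δ₂ = 60: the direct minor 6·11 − 1·6 and the signed product −1·(−3)(−4)(−5). Moving a root to +1 would make one factor zᵢ + zₖ vanish only if some pair of roots became opposite, exactly as the formula asserts.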


§ 8. Singular Cases in the Routh-Hurwitz Theorem

In discussing the singular cases where some of the Hurwitz determinants are zero, we may assume that Δₙ ≠ 0 (and consequently Δₙ₋₁ ≠ 0).

For if Δₙ = 0, then, as we have seen at the end of the preceding section, the real polynomial f(z) has a root z′ for which −z′ is also a root. If we set f(z) = F₁(z) + F₂(z), where

F₁(z) = a₀zⁿ + a₁zⁿ⁻² + ⋯,   F₂(z) = b₀zⁿ⁻¹ + b₁zⁿ⁻³ + ⋯,

then we can deduce from f(z′) = f(−z′) = 0 that F₁(z′) = F₂(z′) = 0. Therefore z′ is a root of the greatest common divisor d(z) of the polynomials F₁(z) and F₂(z). Setting f(z) = d(z)f*(z), we reduce the Routh-Hurwitz problem for f(z) to that for the polynomial f*(z), for which the last Hurwitz determinant is different from zero.

1. To begin with, we examine the case where

Δ₁ = ⋯ = Δₚ = 0,  Δₚ₊₁ ≠ 0,  ...,  Δₙ ≠ 0.  (41)

From Δ₁ = 0 it follows that b₀ = 0; from

Δ₂ = | b₀  b₁ |
     | a₀  a₁ | = −a₀b₁

it follows that b₁ = 0. But then we have automatically

Δ₃ = | 0   b₁  b₂ |
     | a₀  a₁  a₂ | = −a₀b₁² = 0.
     | 0   0   b₁ |

From

Δ₄ = | 0   0   b₂  b₃ |
     | a₀  a₁  a₂  a₃ | = −a₀²b₂² = 0
     | 0   0   0   b₂ |
     | 0   a₀  a₁  a₂ |

it follows that b₂ = 0, and then Δ₅ = −a₀²b₂³ = 0, etc.

it follows that b2 = 0 and then As = - a2,b; = 0, etc.This argument shows that in (41) p is always an odd number p = 2k -1.

Then bo=b1=b2=...=ba_1=0, bh#0, and"h(h 1 I) h(h+I)

(42)do+I=d2h=(-1) 2aobh, dr-4 2=d2h+I=(-1) 2 aabh+I=dp+lbh

Let us vary the coefficients b₀, b₁, ..., bₕ₋₁ in such a way that for the new, slightly altered values b₀*, b₁*, ..., bₕ₋₁* all the Hurwitz determinants Δ₁*, Δ₂*, ..., Δₙ* become different from zero and Δₚ₊₂*, ..., Δₙ* keep their previous signs. We shall take b₀*, b₁*, ..., bₕ₋₁* as 'small' values of different orders of 'smallness'; indeed, we shall assume that every bⱼ₋₁* is in absolute value 'considerably' smaller than bⱼ* (j = 1, 2, ..., h; bₕ* = bₕ).

32 From (42) it follows that for odd h, sign Δₚ₊₂ = (−1)^(h(h+1)/2) sign a₀, and for even h, sign Δₚ₊₁ = (−1)^(h(h+1)/2).

The latter means that in computing the sign of an integral algebraic expression in the bᵢ* we can neglect terms in which some bᵢ* have an index less than j in comparison with terms where all the bᵢ* have an index at least j. We can then easily find the 'sign-determining' terms of Δ₁*, Δ₂*, ..., Δₚ* (p = 2h − 1):³³

$$\Delta_1^* = b_0^*, \quad \Delta_2^* = -a_0 b_1^* + \cdots, \quad \Delta_3^* = -a_0 b_1^{*2} + \cdots, \quad \Delta_4^* = -a_0^2 b_2^{*2} + \cdots,$$

etc.; in general,

$$\Delta_{2i}^* = (-1)^{i(i+1)/2} a_0^i b_i^{*\,i} + \cdots, \qquad \Delta_{2i+1}^* = (-1)^{i(i+1)/2} a_0^i b_i^{*\,i+1} + \cdots. \tag{43}$$

We choose $b_0^*, b_1^*, \ldots, b_{h-1}^*$ as positive; then the sign of $\Delta_i^*$ is determined by the formula

$$\operatorname{sign} \Delta_i^* = (-1)^{j(j+1)/2} (\operatorname{sign} a_0)^j \qquad \left( j = \left[ \tfrac{i}{2} \right],\; i = 1, 2, \ldots, p \right). \tag{44}$$

In any small variation of the coefficients of the polynomial the number $k$ remains unchanged, because $f(z)$ has no roots on the imaginary axis. Therefore, starting from (44), we determine the number of roots in the right half-plane by the formula

$$k = V\!\left(a_0,\, \Delta_1^*,\, \frac{\Delta_2^*}{\Delta_1^*},\, \ldots,\, \frac{\Delta_{p+2}^*}{\Delta_{p+1}^*}\right) + V\!\left(\frac{\Delta_{p+2}}{\Delta_{p+1}},\, \ldots,\, \frac{\Delta_n}{\Delta_{n-1}}\right). \tag{45}$$

An elementary calculation based on (42) and (44) shows that

$$V\!\left(a_0,\, \Delta_1^*,\, \frac{\Delta_2^*}{\Delta_1^*},\, \ldots,\, \frac{\Delta_{p+2}^*}{\Delta_{p+1}^*}\right) = h + \frac{1 - (-1)^h \varepsilon}{2} \qquad \left( p = 2h-1,\; \varepsilon = \operatorname{sign}\!\left( a_0 \frac{\Delta_{p+2}}{\Delta_{p+1}} \right) \right). \tag{46}$$

Note that the value on the left-hand side of (46) does not depend on the method of varying the coefficients and retains one and the same value for arbitrary small variations. This follows from (45), because $k$ does not change its value under small variations of the coefficients.

33 Essentially the same terms have already been computed above for $\Delta_1, \Delta_2, \ldots, \Delta_p$.


2. Suppose now that for $s > 0$

$$\Delta_{s+1} = \cdots = \Delta_{s+p} = 0 \tag{47}$$

and that all the remaining Hurwitz determinants are different from zero.

We denote by $\tilde a_0, \tilde a_1, \ldots$ and $\tilde b_0, \tilde b_1, \ldots$ the elements of the $(s+1)$-th and $(s+2)$-th rows in Routh's scheme $(\tilde a_0 = \Delta_s / \Delta_{s-1},\; \tilde b_0 = \Delta_{s+1} / \Delta_s)$. We denote the corresponding determinants by $\tilde\Delta_1, \tilde\Delta_2, \ldots, \tilde\Delta_{n-s}$. By formula (38) (p. 195),

$$\Delta_{s+1} = \Delta_s \tilde\Delta_1, \;\ldots,\; \Delta_{s+p} = \Delta_s \tilde\Delta_p, \quad \Delta_{s+p+1} = \Delta_s \tilde\Delta_{p+1}, \quad \Delta_{s+p+2} = \Delta_s \tilde\Delta_{p+2}. \tag{48}$$

Then by 1. it follows that $p$ is odd, say $p = 2h - 1$.³⁴

Let us vary the coefficients of $f(z)$ in such a way that all the Hurwitz determinants become different from zero and that those that were different from zero before the variation retain their sign. Since the formula (46) is applicable to the determinants $\tilde\Delta_j$, we then obtain, starting from (48):

$$V\!\left(\frac{\Delta_s}{\Delta_{s-1}},\, \frac{\Delta_{s+1}}{\Delta_s},\, \ldots,\, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) = h + \frac{1 - (-1)^h \varepsilon}{2} \qquad \left( p = 2h-1,\; \varepsilon = \operatorname{sign}\!\left( \frac{\Delta_s}{\Delta_{s-1}} \cdot \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}} \right) \right), \tag{49}$$

$$k = V\!\left(a_0,\, \Delta_1,\, \frac{\Delta_2}{\Delta_1},\, \ldots,\, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) + V\!\left(\frac{\Delta_{s+p+2}}{\Delta_{s+p+1}},\, \ldots,\, \frac{\Delta_n}{\Delta_{n-1}}\right).$$

The value on the left-hand side of (49) again does not depend on the method of variation.

3. Finally, let us assume that among the Hurwitz determinants there are $\nu$ groups of zero determinants. We shall show that for every such group (47) the value of the left-hand side of (49) does not depend on the method of variation and is determined by that formula.³⁵ We have proved this statement for $\nu = 1$. Let us assume that it is true for $\nu - 1$ groups and then show that it is also true for $\nu$ groups. Suppose that (47) is the second of the $\nu$ groups; we determine $\tilde\Delta_1, \tilde\Delta_2, \ldots$ in the same way as was done under 2.; then for this variation

34 In accordance with footnote 32, for $p = 2h-1$ and odd $h$, $\operatorname{sign} \Delta_{s+p+2} = (-1)^{(h+1)/2} \operatorname{sign} \Delta_{s-1}$; and for even $h$, $\operatorname{sign} \Delta_{s+p+1} = (-1)^{h/2} \operatorname{sign} \Delta_s$.

35 From (47) and $\Delta_s \neq 0$, $\Delta_{s+p+1} \neq 0$ it follows by (48) and (42) that $\Delta_{s-1} \neq 0$, $\Delta_{s+p+2} \neq 0$.


$$V\!\left(\frac{\Delta_s}{\Delta_{s-1}},\, \frac{\Delta_{s+1}}{\Delta_s},\, \ldots,\, \frac{\Delta_n}{\Delta_{n-1}}\right) = V\!\left(\tilde a_0,\, \tilde\Delta_1,\, \frac{\tilde\Delta_2}{\tilde\Delta_1},\, \ldots,\, \frac{\tilde\Delta_{n-s}}{\tilde\Delta_{n-s-1}}\right).$$

Since we have only $\nu - 1$ groups of zero determinants on the right-hand side of this equation, our statement holds for the right-hand side and hence for the left-hand side of the equation. In other words, the formula (49) holds for the second, ..., $\nu$-th group of zero Hurwitz determinants. But then it follows from the formula

$$k = V\!\left(a_0,\, \Delta_1,\, \frac{\Delta_2}{\Delta_1},\, \ldots,\, \frac{\Delta_n}{\Delta_{n-1}}\right)$$

that the value of $V\!\left(\dfrac{\Delta_s}{\Delta_{s-1}},\, \dfrac{\Delta_{s+1}}{\Delta_s},\, \ldots,\, \dfrac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right)$ does not depend on the method of variation for the first group of zero determinants, and therefore that (49) holds for this group as well.

Thus we have proved the following theorem :

THEOREM 5: If some of the Hurwitz determinants are zero, but $\Delta_n \neq 0$, then the number of roots of the real polynomial $f(z)$ in the right half-plane is determined by the formula

$$k = V\!\left(a_0,\, \Delta_1,\, \frac{\Delta_2}{\Delta_1},\, \ldots,\, \frac{\Delta_n}{\Delta_{n-1}}\right),$$

in which for the calculation of the value of $V$ for every group of $p$ successive zero determinants ($p$ is always odd!)

$$(\Delta_s \neq 0) \quad \Delta_{s+1} = \cdots = \Delta_{s+p} = 0 \quad (\Delta_{s+p+1} \neq 0)$$

we have to set

$$V\!\left(\frac{\Delta_s}{\Delta_{s-1}},\, \frac{\Delta_{s+1}}{\Delta_s},\, \ldots,\, \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}}\right) = h + \frac{1 - (-1)^h \varepsilon}{2}, \tag{50}$$

where³⁶

$$p = 2h - 1 \quad \text{and} \quad \varepsilon = \operatorname{sign}\!\left( \frac{\Delta_s}{\Delta_{s-1}} \cdot \frac{\Delta_{s+p+2}}{\Delta_{s+p+1}} \right).$$
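Theorem 5 can be checked on a concrete singular case. In the sketch below (the quartic is our own example, not from the text) the single vanishing determinant $\Delta_2$ forms a group with $s = 1$, $p = 1$, $h = 1$, and the rule (50) recovers $k = 2$, in agreement with the factorization $f = (z^2+2z+2)(z^2-z+2)$:

```python
from fractions import Fraction

def det(rows):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def hurwitz_minors(c):
    """Delta_1, ..., Delta_n for f(z) = c[0] z^n + ... + c[n]."""
    n = len(c) - 1
    get = lambda k: c[k] if 0 <= k <= n else 0
    H = [[get(2 * j - i) for j in range(1, n + 1)] for i in range(1, n + 1)]
    return [det([row[:p] for row in H[:p]]) for p in range(1, n + 1)]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

# f(z) = z^4 + z^3 + 2z^2 + 2z + 4 = (z^2 + 2z + 2)(z^2 - z + 2); hence k = 2.
f = [1, 1, 2, 2, 4]
d = hurwitz_minors(f)                       # [1, 0, -4, -16]; Delta_2 = 0: s = 1, p = 1, h = 1
eps = 1 if d[0] * d[3] / d[2] > 0 else -1   # sign(Delta_1/Delta_0 * Delta_4/Delta_3), Delta_0 = 1
k = 1 + (1 - (-1) ** 1 * eps) // 2          # the group's V by (50); V(a0, Delta_1) adds nothing here
```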

§ 9. The Method of Quadratic Forms. Determination of the Number of Distinct Real Roots of a Polynomial

Routh obtained his algorithm by applying Sturm's theorem to the computation of the Cauchy index of a regular rational fraction of special type (see formula (10) on p. 178). Of the two polynomials in this fraction (numerator and denominator), one contains only even, the other only odd, powers of the argument $z$.

36 For $s = 1$ the ratio $\Delta_s / \Delta_{s-1}$ is to be replaced by $\Delta_1$; and for $s = 0$, by $a_0$.

In this and in the following sections we shall explain the deeper and more comprehensive method of quadratic forms, due to Hermite, in its application to the Routh-Hurwitz problem. By means of this method we shall obtain an expression for the index of an arbitrary rational fraction in terms of the coefficients of the numerator and denominator. The method of quadratic forms enables us to apply the results of Frobenius' subtle investigations in the theory of Hankel forms (Vol. I, Chapter X, § 10) to the Routh-Hurwitz problem and to establish a close connection of certain remarkable theorems of Chebyshev and Markov with the problem of stability.

1. We shall acquaint the reader with the method of quadratic forms first in the comparatively simple problem of determining the number of distinct real roots of a polynomial.

In the solution of this problem we may restrict ourselves to the case where $f(z)$ is a real polynomial. For suppose that $f(z) = u(z) + iv(z)$ is a complex polynomial ($u(z)$ and $v(z)$ being real polynomials). Each real root of $f(z)$ makes $u(z)$ and $v(z)$ vanish simultaneously. Therefore the complex polynomial $f(z)$ has the same real roots as the real polynomial $d(z)$, the greatest common divisor of $u(z)$ and $v(z)$.

Thus, let $f(z)$ be a real polynomial with the distinct roots $\alpha_1, \alpha_2, \ldots, \alpha_q$ of the respective multiplicities $n_1, n_2, \ldots, n_q$:

$$f(z) = a_0 (z - \alpha_1)^{n_1} (z - \alpha_2)^{n_2} \cdots (z - \alpha_q)^{n_q}$$
$$(a_0 \neq 0;\; \alpha_i \neq \alpha_k \text{ for } i \neq k;\; i, k = 1, 2, \ldots, q).$$

We introduce Newton's sums

$$s_p = n_1 \alpha_1^p + n_2 \alpha_2^p + \cdots + n_q \alpha_q^p \qquad (p = 0, 1, 2, \ldots).$$

With these sums we form the Hankel forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k,$$

where $n$ is an arbitrary integer, $n \geq q$. Then the following theorem holds:

THEOREM 6: The number of all the distinct roots of $f(z)$ is equal to the rank, and the number of all the distinct real roots to the signature, of the form $S_n(x, x)$.


Proof. From the definition of the form $S_n(x, x)$ we immediately obtain the following representation:

$$S_n(x, x) = \sum_{j=1}^{q} n_j \left( x_0 + \alpha_j x_1 + \alpha_j^2 x_2 + \cdots + \alpha_j^{n-1} x_{n-1} \right)^2. \tag{51}$$

Here to each root $\alpha_j$ of $f(z)$ there corresponds the square of a linear form $Z_j = x_0 + \alpha_j x_1 + \cdots + \alpha_j^{n-1} x_{n-1}$ $(j = 1, 2, \ldots, q)$. The forms $Z_1, Z_2, \ldots, Z_q$ are linearly independent, since their coefficients form the Vandermonde matrix $\| \alpha_j^i \|$ whose rank is equal to the number of distinct $\alpha_j$, i.e., to $q$. Therefore (see Vol. I, p. 297) the rank of the form $S_n(x, x)$ is $q$.

In the representation (51) to each real root $\alpha_j$ there corresponds a positive square. To each pair of conjugate complex roots $\alpha_j$ and $\bar\alpha_j$ there correspond two complex conjugate forms:

$$Z_j = P_j + iQ_j, \qquad \bar Z_j = P_j - iQ_j;$$

the corresponding terms in (51) together give one positive and one negative square:

$$n_j Z_j^2 + n_j \bar Z_j^2 = 2 n_j P_j^2 - 2 n_j Q_j^2.$$

Hence it is easy to see³⁷ that the signature of $S_n(x, x)$, i.e., the difference between the number of positive and negative squares, is equal to the number of distinct real $\alpha_j$.

This proves the theorem.

2. Using the rule for determining the signature of a quadratic form that we established in Chapter X (Vol. I, p. 303), we obtain from the theorem the following corollary:

COROLLARY: The number of distinct real roots of the real polynomial $f(z)$ is equal to the excess of permanences of sign over variations of sign in the sequence

$$1,\quad s_0,\quad \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix},\quad \ldots,\quad \begin{vmatrix} s_0 & s_1 & \cdots & s_{n-1} \\ s_1 & s_2 & \cdots & s_n \\ \cdot & \cdot & \cdots & \cdot \\ s_{n-1} & s_n & \cdots & s_{2n-2} \end{vmatrix}, \tag{52}$$

where the $s_p$ $(p = 0, 1, \ldots)$ are Newton's sums for $f(z)$ and $n$ is any integer not less than the number $q$ of distinct roots of $f(z)$ (in particular, $n$ can be chosen as the degree of $f(z)$).

37 The quadratic form $S_n(x, x)$ is representable as an (algebraic) sum of $q$ squares of the real forms $Z_j$ (for real $\alpha_j$) and $P_j$, $Q_j$ (for complex $\alpha_j$). These forms are linearly independent, since the rank of $S_n(x, x)$ is $q$.


This rule for determining the number of distinct real roots is directly applicable only when all the numbers in (52) are different from zero. However, since we deal here with the computation of the signature of a Hankel form, by the results of Vol. I, Chapter X, § 10, the rule with proper refinements remains valid in the general case (for further details see § 11 of that chapter).
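As a numerical illustration of Theorem 6 and its corollary (the quartic is our own example): the Newton sums are generated from the coefficients by Newton's identities, and the sign rule applied to the minors (52) counts the distinct real roots.

```python
from fractions import Fraction

def det(rows):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def newton_sums(c, count):
    """s0, s1, ... for the monic polynomial z^n + c[1] z^(n-1) + ... + c[n],
    computed by Newton's identities (each root counted with its multiplicity)."""
    n = len(c) - 1
    s = [Fraction(n)]
    for k in range(1, count):
        acc = Fraction(0)
        for i in range(1, min(k - 1, n) + 1):
            acc += c[i] * s[k - i]
        if k <= n:
            acc += k * c[k]
        s.append(-acc)
    return s

# f(z) = z(z - 2)(z^2 + 1) = z^4 - 2z^3 + z^2 - 2z: four distinct roots, two of them real.
s = newton_sums([1, -2, 1, -2, 0], 7)
D = [det([[s[i + k] for k in range(p)] for i in range(p)]) for p in range(1, 5)]
seq = [Fraction(1)] + D
V = sum(1 for x, y in zip(seq, seq[1:]) if x * y < 0)
real_roots = len(D) - 2 * V      # signature of S4(x, x): permanences minus variations
```

The minors come out as $4, 4, -128, -400$: rank $4$ (four distinct roots) and signature $2$ (two distinct real roots).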

From our theorem it follows that: All the forms

$$S_n(x, x) \qquad (n = q, q+1, \ldots)$$

have the same rank and the same signature.

In applying Theorem 6 (or its corollary) to determine the number of distinct real roots, we may take $n$ to be the degree of $f(z)$.

The number of distinct real roots of the real polynomial $f(z)$ is equal to the index $I_{-\infty}^{+\infty} \dfrac{f'(z)}{f(z)}$ (see p. 175). Therefore the corollary to Theorem 6 gives the formula

$$I_{-\infty}^{+\infty} \frac{f'(z)}{f(z)} = n - 2V\!\left( 1,\; s_0,\; \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix},\; \ldots,\; \begin{vmatrix} s_0 & s_1 & \cdots & s_{n-1} \\ s_1 & s_2 & \cdots & s_n \\ \cdot & \cdot & \cdots & \cdot \\ s_{n-1} & s_n & \cdots & s_{2n-2} \end{vmatrix} \right),$$

where the $s_p$ $(p = 0, 1, \ldots)$ are Newton's sums and $n$ is the degree of $f(z)$.

In § 11 we shall establish a similar formula for the index of an arbitrary rational fraction. The information on infinite Hankel matrices that will be required for this purpose will be given in the next section.

§ 10. Infinite Hankel Matrices of Finite Rank

1. Let

$$s_0, s_1, s_2, \ldots$$

be a sequence of complex numbers. It determines an infinite symmetric matrix

$$S = \begin{Vmatrix} s_0 & s_1 & s_2 & \cdots \\ s_1 & s_2 & s_3 & \cdots \\ s_2 & s_3 & s_4 & \cdots \\ \cdot & \cdot & \cdot & \cdots \end{Vmatrix},$$

which is usually called a Hankel matrix. Together with the infinite Hankel matrices we shall consider³⁸ the finite Hankel matrices $S_n = \| s_{i+k} \|_0^{n-1}$ and their associated Hankel forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k.$$

The successive principal minors of $S$ will be denoted by $D_1, D_2, D_3, \ldots$:

$$D_p = | s_{i+k} |_0^{p-1} \qquad (p = 1, 2, \ldots).$$

Infinite matrices may be of finite or of infinite rank. In the latter case the matrix has non-zero minors of arbitrarily large order. The following theorem gives a necessary and sufficient condition for a sequence of numbers $s_0, s_1, s_2, \ldots$ to generate an infinite Hankel matrix $S = \| s_{i+k} \|_0^\infty$ of finite rank.

THEOREM 7: The infinite matrix $S = \| s_{i+k} \|_0^\infty$ is of finite rank $r$ if and only if there exist $r$ numbers $\alpha_1, \alpha_2, \ldots, \alpha_r$ such that

$$s_q = \sum_{j=1}^{r} \alpha_j s_{q-j} \qquad (q = r, r+1, \ldots) \tag{53}$$

and $r$ is the least number having this property.

Proof. If the matrix $S = \| s_{i+k} \|_0^\infty$ has finite rank $r$, then its first $r+1$ rows $R_1, R_2, \ldots, R_{r+1}$ are linearly dependent. Therefore there exists a number $h \leq r$ such that $R_1, R_2, \ldots, R_h$ are linearly independent and $R_{h+1}$ is a linear combination of them:

$$R_{h+1} = \sum_{q=1}^{h} \alpha_q R_{h-q+1}.$$

We consider the rows $R_{q+1}, R_{q+2}, \ldots, R_{q+h+1}$, where $q$ is any non-negative integer. From the structure of $S$ it is immediately clear that the rows $R_{q+1}, R_{q+2}, \ldots, R_{q+h+1}$ are obtained from $R_1, R_2, \ldots, R_{h+1}$ by a 'shortening' process in which the elements in the first $q$ columns are omitted. Therefore

$$R_{q+h+1} = \sum_{p=1}^{h} \alpha_p R_{q+h-p+1} \qquad (q = 0, 1, 2, \ldots).$$

Thus, every row of $S$ beginning with the $(h+1)$-th can be expressed linearly in terms of the $h$ preceding rows and therefore in terms of the linearly

38 See Vol. I, Chapter X, § 10.


independent first $h$ rows. Hence it follows that the rank of $S$ is $r = h$.³⁹ The linear dependence

$$R_{q+h+1} = \sum_{p=1}^{h} \alpha_p R_{q+h-p+1},$$

after replacement of $h$ by $r$ and written in more convenient notation, yields (53).

Conversely, if (53) holds, then every row (column) of $S$ is a linear combination of the first $r$ rows (columns). Therefore all the minors of $S$ whose orders exceed $r$ are zero, and $S$ is of rank at most $r$. But the rank cannot be less than $r$, since then, as we have already shown, there would be relations of the form (53) with a smaller value than $r$, and this contradicts the second condition of the theorem. The proof of the theorem is now complete.

COROLLARY: If the infinite Hankel matrix $S = \| s_{i+k} \|_0^\infty$ is of finite rank $r$, then

$$D_r = | s_{i+k} |_0^{r-1} \neq 0.$$

For it follows from the relations (53) that every row (column) of $S$ is a linear combination of the first $r$ rows (columns). Therefore every minor of $S$ of order $r$ can be represented in the form $a D_r$, where $a$ is a constant. Hence it follows that $D_r \neq 0$.

Note. For finite Hankel matrices of rank $r$ the inequality $D_r \neq 0$ need not hold. For example, $S_2 = \begin{Vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{Vmatrix}$ with $s_0 = s_1 = 0$, $s_2 \neq 0$ is of rank 1, whereas $D_1 = s_0 = 0$.

2. We shall now explain certain remarkable connections between infinite Hankel matrices and rational functions.

Let

$$R(z) = \frac{g(z)}{h(z)}$$

be a proper rational function, where

$$h(z) = a_0 z^m + \cdots + a_m \quad (a_0 \neq 0), \qquad g(z) = b_1 z^{m-1} + b_2 z^{m-2} + \cdots + b_m.$$

We write the expansion of $R(z)$ in a power series of negative powers of $z$:

$$R(z) = \frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots.$$

39 The statement 'The number of linearly independent rows in a rectangular matrix is equal to its rank' is true not only for finite but also for infinite matrices.


If all the poles of $R(z)$, i.e., all the values of $z$ for which $R(z)$ becomes infinite, lie in the circle $|z| < a$, then the series on the right-hand side of the expansion converges for $|z| > a$. We multiply both sides by the denominator $h(z)$:

$$(a_0 z^m + a_1 z^{m-1} + \cdots + a_m)\left( \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots \right) = b_1 z^{m-1} + b_2 z^{m-2} + \cdots + b_m.$$

Equating coefficients of equal powers of $z$ on both sides of this identity, we obtain the following system of relations:

$$a_0 s_0 = b_1, \quad a_0 s_1 + a_1 s_0 = b_2, \quad \ldots, \quad a_0 s_{m-1} + a_1 s_{m-2} + \cdots + a_{m-1} s_0 = b_m, \tag{54}$$

$$a_0 s_q + a_1 s_{q-1} + \cdots + a_m s_{q-m} = 0 \qquad (q = m, m+1, \ldots). \tag{54'}$$

Setting

$$\alpha_j = -\frac{a_j}{a_0} \qquad (j = 1, 2, \ldots, m),$$

we can write the relations (54′) in the form (53) (for $r = m$). Therefore, by Theorem 7, the infinite Hankel matrix

$$S = \| s_{i+k} \|_0^\infty$$

formed from the coefficients $s_0, s_1, s_2, \ldots$ is of finite rank ($\leq m$).

Conversely, if the matrix $S = \| s_{i+k} \|_0^\infty$ is of finite rank $r$, then the relations (53) hold, which can be written in the form (54′) (for $m = r$). Then, when we define the numbers $b_1, b_2, \ldots, b_m$ by the equations (54), we have the expansion

$$\frac{b_1 z^{m-1} + \cdots + b_m}{a_0 z^m + a_1 z^{m-1} + \cdots + a_m} = \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots.$$

The least degree $m$ of the denominator for which this expansion holds is the same as the least integer $m$ for which the relations (53) hold. By Theorem 7, this least value of $m$ is the rank of $S = \| s_{i+k} \|_0^\infty$.

Thus we have proved the following theorem:

THEOREM 8: The matrix $S = \| s_{i+k} \|_0^\infty$ is of finite rank if and only if the sum of the series

$$R(z) = \frac{s_0}{z} + \frac{s_1}{z^2} + \frac{s_2}{z^3} + \cdots$$

is a rational function of $z$. In this case the rank of $S$ is the same as the number of poles of $R(z)$, counting each pole with its proper multiplicity.
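Theorem 8 can be illustrated by expanding a concrete fraction through the recurrence (54), (54′) and watching the Hankel minors vanish from order $r+1$ on (a sketch with our own example; `series` is our helper name):

```python
from fractions import Fraction

def det(rows):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def series(a, b, count):
    """s0, s1, ... of g(z)/h(z) from the relations (54), (54');
    h = a[0] z^m + ... + a[m], g = b[1] z^(m-1) + ... + b[m] (b[0] is unused)."""
    m = len(a) - 1
    s = []
    for q in range(count):
        acc = Fraction(b[q + 1]) if q + 1 <= m else Fraction(0)
        for i in range(1, min(q, m) + 1):
            acc -= a[i] * s[q - i]
        s.append(acc / a[0])
    return s

# R(z) = (2z^2 + 1)/(z^3 - z): three simple poles 0, 1, -1, so the rank must be 3.
s = series([1, 0, -1, 0], [0, 2, 0, 1], 8)     # 2, 0, 3, 0, 3, 0, 3, 0
D = [det([[s[i + k] for k in range(p)] for i in range(p)]) for p in range(1, 5)]
rank = max(p for p in range(1, 5) if D[p - 1] != 0)
```

The minors are $2, 6, -9, 0$: every minor of order $\geq 4$ vanishes, and the rank equals the number of poles.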


§ 11. Determination of the Index of an Arbitrary Rational Fraction by the Coefficients of Numerator and Denominator

1. Suppose given a rational function $R(z)$. We write its expansion in a series of descending powers of $z$:⁴⁰

$$R(z) = s_{-h-1} z^h + \cdots + s_{-2} z + s_{-1} + \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots. \tag{55}$$

The sequence of coefficients of the negative powers of $z$,

$$s_0, s_1, s_2, \ldots,$$

determines an infinite Hankel matrix $S = \| s_{i+k} \|_0^\infty$. We have thus established a correspondence

$$R(z) \to S.$$

Obviously two rational functions whose difference is an integral function correspond to one and the same matrix $S$. However, not every matrix $S = \| s_{i+k} \|_0^\infty$ corresponds to some rational function. In the preceding section we have seen that an infinite matrix $S$ corresponds to a rational function if and only if it is of finite rank. This rank is equal to the number of poles of $R(z)$ (multiplicities taken into account), i.e., to the degree of the denominator $f(z)$ in the reduced fraction $g(z)/f(z) = R(z)$. By means of the expansion (55) we have a one-to-one correspondence between proper rational functions $R(z)$ and Hankel matrices $S = \| s_{i+k} \|_0^\infty$ of finite rank.

We mention some properties of the correspondence:

1. If $R_1(z) \to S_1$ and $R_2(z) \to S_2$, then for arbitrary numbers $c_1, c_2$

$$c_1 R_1(z) + c_2 R_2(z) \to c_1 S_1 + c_2 S_2.$$

In what follows we shall have to deal with the case where the coefficients of the numerator and the denominator of $R(z)$ are integral rational functions of a parameter $\alpha$; $R$ is then a rational function of $z$ and $\alpha$. From (54) it follows that in this case the numbers $s_0, s_1, s_2, \ldots$, i.e., the elements of $S$, depend rationally on $\alpha$. Differentiating (55) term by term with respect to $\alpha$, we obtain:

2. If $R(z, \alpha) \to S(\alpha)$, then $\dfrac{\partial R}{\partial \alpha} \to \dfrac{\partial S}{\partial \alpha}$.⁴¹

40 The series (55) converges outside every circle (with center at $z = 0$) containing all the poles of $R(z)$.

41 If $S = \| s_{i+k}(\alpha) \|_0^\infty$, then $\dfrac{\partial S}{\partial \alpha} = \left\| \dfrac{\partial s_{i+k}}{\partial \alpha} \right\|_0^\infty$.


2. Let us write down the expansion of $R(z)$ in partial fractions:

$$R(z) = Q(z) + \sum_{j=1}^{q} \left[ \frac{A_1^{(j)}}{z - \alpha_j} + \frac{A_2^{(j)}}{(z - \alpha_j)^2} + \cdots + \frac{A_{\nu_j}^{(j)}}{(z - \alpha_j)^{\nu_j}} \right], \tag{56}$$

where $Q(z)$ is a polynomial; we shall show how to construct the matrix $S$ corresponding to $R(z)$ from the numbers $\alpha_j$ and $A_i^{(j)}$.

For this purpose we consider first the simple rational function

$$\frac{1}{z - \alpha} = \sum_{p=0}^{\infty} \frac{\alpha^p}{z^{p+1}}.$$

It corresponds to the matrix

$$S_\alpha = \| \alpha^{i+k} \|_0^\infty.$$

The form $S_{\alpha,n}(x, x)$ associated with this matrix is

$$S_{\alpha,n}(x, x) = \sum_{i,k=0}^{n-1} \alpha^{i+k} x_i x_k = \left( x_0 + \alpha x_1 + \cdots + \alpha^{n-1} x_{n-1} \right)^2.$$

If

$$R(z) = Q(z) + \sum_{j=1}^{q} \frac{A^{(j)}}{z - \alpha_j},$$

then by 1. the corresponding matrix $S$ is determined by the formula

$$S = \sum_{j=1}^{q} A^{(j)} S_{\alpha_j} = \left\| \sum_{j=1}^{q} A^{(j)} \alpha_j^{i+k} \right\|_0^\infty,$$

and the corresponding quadratic form is

$$S_n(x, x) = \sum_{j=1}^{q} A^{(j)} \left( x_0 + \alpha_j x_1 + \cdots + \alpha_j^{n-1} x_{n-1} \right)^2.$$

In order to proceed to the general case (56), we first differentiate the relation

$$\frac{1}{z - \alpha} \to S_\alpha = \| \alpha^{i+k} \|_0^\infty$$

$h - 1$ times term by term. By 1. and 2., we obtain:

$$\frac{1}{(z - \alpha)^h} \to \frac{1}{(h-1)!} \frac{\partial^{h-1} S_\alpha}{\partial \alpha^{h-1}} = \left\| \binom{i+k}{h-1} \alpha^{i+k-h+1} \right\|_0^\infty \qquad \left( \binom{i+k}{h-1} = 0 \text{ for } i+k < h-1 \right).$$

Therefore, by using rule 1. again, we find in the general case, where $R(z)$ has the expansion (56):

$$R(z) \to S = \sum_{j=1}^{q} \left( A_1^{(j)} S_{\alpha_j} + A_2^{(j)} \frac{\partial S_{\alpha_j}}{\partial \alpha_j} + \cdots + \frac{1}{(\nu_j - 1)!} A_{\nu_j}^{(j)} \frac{\partial^{\nu_j - 1} S_{\alpha_j}}{\partial \alpha_j^{\nu_j - 1}} \right). \tag{57}$$

By carrying out the differentiation, we obtain:

$$S = \left\| \sum_{j=1}^{q} \left[ A_1^{(j)} \alpha_j^{i+k} + A_2^{(j)} \binom{i+k}{1} \alpha_j^{i+k-1} + \cdots + A_{\nu_j}^{(j)} \binom{i+k}{\nu_j - 1} \alpha_j^{i+k-\nu_j+1} \right] \right\|_0^\infty.$$

The corresponding Hankel form $S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k$ is

$$S_n(x, x) = \sum_{j=1}^{q} \left[ A_1^{(j)} + A_2^{(j)} \frac{\partial}{\partial \alpha_j} + \cdots + \frac{1}{(\nu_j - 1)!} A_{\nu_j}^{(j)} \frac{\partial^{\nu_j - 1}}{\partial \alpha_j^{\nu_j - 1}} \right] \left( x_0 + \alpha_j x_1 + \cdots + \alpha_j^{n-1} x_{n-1} \right)^2. \tag{57'}$$

3. Now we are in a position to enunciate and prove the fundamental theorem:⁴²

THEOREM 9: If

$$R(z) \to S$$

and $m$ is the rank of $S$,⁴³ then the Cauchy index $I_{-\infty}^{+\infty} R(z)$ is equal to the signature⁴⁴ of the form $S_n(x, x)$ for any $n \geq m$:

$$I_{-\infty}^{+\infty} R(z) = \sigma[S_n(x, x)].$$

Proof. Suppose that the expansion (56) holds. Then, by (57),

$$S = \sum_{j=1}^{q} T_{\alpha_j},$$

where each term is of the form

$$T_\alpha = \left( A_1 + A_2 \frac{\partial}{\partial \alpha} + \cdots + \frac{1}{(\nu - 1)!} A_\nu \frac{\partial^{\nu - 1}}{\partial \alpha^{\nu - 1}} \right) S_\alpha, \qquad S_\alpha = \| \alpha^{i+k} \|_0^\infty, \tag{58}$$

and

$$S_n(x, x) = \sum_{j=1}^{q} T_{\alpha_j}(x, x) = \sum_{\alpha_j \text{ real}} T_{\alpha_j}(x, x) + \sum_{\alpha_j \text{ complex}} \left[ T_{\alpha_j}(x, x) + T_{\bar\alpha_j}(x, x) \right].$$

42 This theorem was proved by Hermite in 1856 for the simplest case, where $R(z)$ has no multiple poles [187]. In the general case it was proved by Hurwitz [204] (see also [25], pp. 17-19). The proof in the text differs from Hurwitz's proof.

43 As we have already mentioned, $m$ is the degree of the denominator in the reduced representation of the rational fraction $R(z)$ (see Theorem 8 on p. 207).

44 We denote the signature of $S_n(x, x)$ by $\sigma[S_n(x, x)]$.


By Theorem 8, the rank of the matrix $T_{\alpha_j}$, and hence of the form $T_{\alpha_j}(x, x)$,⁴⁵ is $\nu_j$ $(j = 1, 2, \ldots, q)$, and the rank of $S_n(x, x)$ is $m = \sum_{j=1}^{q} \nu_j$. But if the rank of the sum of certain real quadratic forms is equal to the sum of the ranks of the constituent forms, then the same relation holds for the signatures:

$$\sigma[S_n(x, x)] = \sum_{\alpha_j \text{ real}} \sigma[T_{\alpha_j}(x, x)] + \sum_{\alpha_j \text{ complex}} \sigma[T_{\alpha_j}(x, x) + T_{\bar\alpha_j}(x, x)]. \tag{59}$$

We consider two cases separately:

1) $\alpha$ is real. Under any variation of the parameters $A_1, A_2, \ldots, A_{\nu-1}$ and $\alpha$ in

$$\frac{A_1}{z - \alpha} + \frac{A_2}{(z - \alpha)^2} + \cdots + \frac{A_\nu}{(z - \alpha)^\nu} \qquad (A_\nu \neq 0), \tag{60}$$

the rank of the corresponding matrix $T_\alpha$ remains unchanged ($= \nu$); therefore the signature of $T_\alpha(x, x)$ also remains unchanged (see Vol. I, p. 309). Therefore $\sigma[T_\alpha(x, x)]$ does not change if we set in (59) and (60): $A_1 = \cdots = A_{\nu-1} = 0$ and $\alpha = 0$, i.e., if for $T_\alpha$ we take the matrix

$$\frac{A_\nu}{(\nu - 1)!} \left. \frac{\partial^{\nu - 1} S_\alpha}{\partial \alpha^{\nu - 1}} \right|_{\alpha = 0},$$

whose only non-zero elements are the entries $A_\nu$ in the positions $i + k = \nu - 1$. The corresponding quadratic form is equal to

$$2A_\nu (x_0 x_{\nu-1} + x_1 x_{\nu-2} + \cdots + x_{s-1} x_s) \quad \text{for } \nu = 2s,$$
$$A_\nu \left[ 2 (x_0 x_{\nu-1} + \cdots + x_{s-2} x_s) + x_{s-1}^2 \right] \quad \text{for } \nu = 2s - 1$$
$$(s = 1, 2, 3, \ldots).$$


But the signature of the upper form is always zero and that of the lower form is $\operatorname{sign} A_\nu$. Thus, if $\alpha$ is real, then

$$\sigma[T_\alpha(x, x)] = \begin{cases} 0 & \text{for even } \nu, \\ \operatorname{sign} A_\nu & \text{for odd } \nu. \end{cases} \tag{61}$$

2) $\alpha$ is complex. Then

$$T_\alpha(x, x) = \sum_{k=1}^{\nu} (P_k + iQ_k)^2, \qquad T_{\bar\alpha}(x, x) = \sum_{k=1}^{\nu} (P_k - iQ_k)^2,$$

where $P_k, Q_k$ $(k = 1, 2, \ldots, \nu)$ are real linear forms in the variables $x_0, x_1, \ldots, x_{n-1}$. Then

$$T_\alpha(x, x) + T_{\bar\alpha}(x, x) = 2 \sum_{k=1}^{\nu} P_k^2 - 2 \sum_{k=1}^{\nu} Q_k^2. \tag{62}$$

Since the rank of this quadratic form is $2\nu$, the $P_k, Q_k$ $(k = 1, 2, \ldots, \nu)$ are linearly independent, so that by (62), for a complex $\alpha$,

$$\sigma[T_\alpha(x, x) + T_{\bar\alpha}(x, x)] = 0. \tag{63}$$

From (59), (61), and (63) it follows that

$$\sigma[S_n(x, x)] = \sum_{\substack{\alpha_j \text{ real} \\ \nu_j \text{ odd}}} \operatorname{sign} A_{\nu_j}^{(j)}.$$

But on p. 175 we saw that the sum on the right-hand side of this equation is $I_{-\infty}^{+\infty} R(z)$. This completes the proof.

From this theorem we deduce:

COROLLARY 1: If $R(z) \to S = \| s_{i+k} \|_0^\infty$ and $m$ is the rank of $S$, then all the quadratic forms

$$S_n(x, x) = \sum_{i,k=0}^{n-1} s_{i+k} x_i x_k \qquad (n = m, m+1, \ldots)$$

have one and the same signature.

In Chapter X, § 10 (Vol. I, pp. 343-44) we established a rule for computing the signature of a Hankel form; moreover, Frobenius' investigations enabled us to formulate a rule that embraces all singular cases. By the

45 Each of the products $x_0 x_{\nu-1}, x_1 x_{\nu-2}, \ldots$ can be replaced by a difference of squares

$$\left( \frac{x_0 + x_{\nu-1}}{2} \right)^2 - \left( \frac{x_0 - x_{\nu-1}}{2} \right)^2, \ldots$$

All the squares so obtained are linearly independent.


theorem above we can apply this rule to compute the Cauchy index. Thus we obtain:

COROLLARY 2: The index of an arbitrary rational function $R(z)$, whose corresponding matrix $S = \| s_{i+k} \|_0^\infty$ is of rank $m$, is determined by the formula

$$I_{-\infty}^{+\infty} R(z) = m - 2V(1, D_1, D_2, \ldots, D_m), \tag{64}$$

where

$$D_j = | s_{i+k} |_0^{j-1} = \begin{vmatrix} s_0 & s_1 & \cdots & s_{j-1} \\ s_1 & s_2 & \cdots & s_j \\ \cdot & \cdot & \cdots & \cdot \\ s_{j-1} & s_j & \cdots & s_{2j-2} \end{vmatrix} \qquad (j = 1, 2, \ldots, m); \tag{65}$$

if among $D_1, D_2, \ldots, D_m$ there is a group of vanishing determinants⁴⁶

$$(D_h \neq 0) \quad D_{h+1} = \cdots = D_{h+p} = 0 \quad (D_{h+p+1} \neq 0),$$

then in the computation of $V(D_h, D_{h+1}, \ldots, D_{h+p+1})$ we can take

$$\operatorname{sign} D_{h+j} = (-1)^{j(j-1)/2} \operatorname{sign} D_h \qquad (j = 1, 2, \ldots, p),$$

and this gives

$$V(D_h, D_{h+1}, \ldots, D_{h+p+1}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[4pt] \dfrac{p+1+\varepsilon}{2} & \text{for even } p, \quad \varepsilon = (-1)^{p/2} \operatorname{sign} \dfrac{D_{h+p+1}}{D_h}. \end{cases} \tag{66}$$
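A short computation illustrating (64) and (65) (the fraction is our own example): for $R(z) = \frac{1}{z-1} + \frac{1}{z+2} - \frac{1}{z} = \frac{z^2+2}{z^3+z^2-2z}$ the index is $+1+1-1 = 1$, one signed jump per simple real pole, and the formula reproduces it from the coefficients $s_p$.

```python
from fractions import Fraction

def det(rows):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

# s_p = 1^p + (-2)^p - 0^p for the three simple poles 1, -2, 0 with residues 1, 1, -1.
s = [Fraction(1 ** p + (-2) ** p - (1 if p == 0 else 0)) for p in range(6)]
D = [det([[s[i + k] for k in range(p)] for i in range(p)]) for p in range(1, 4)]
seq = [Fraction(1)] + D              # 1, D1, D2, D3 = 1, 1, 4, -36 (no vanishing group)
V = sum(1 for x, y in zip(seq, seq[1:]) if x * y < 0)
index = 3 - 2 * V                    # m - 2V(1, D1, D2, D3) = 1
```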

In order to express the index of a rational function in terms of the coefficients of the numerator and denominator we shall require some additional relations.

First of all, we can always represent $R(z)$ in the form⁴⁷

$$R(z) = Q(z) + \frac{g(z)}{h(z)},$$

where $Q(z)$, $g(z)$, $h(z)$ are polynomials and

$$h(z) = a_0 z^m + a_1 z^{m-1} + \cdots + a_m \quad (a_0 \neq 0), \qquad g(z) = b_0 z^m + b_1 z^{m-1} + \cdots + b_m.$$

Obviously,

$$I_{-\infty}^{+\infty} R(z) = I_{-\infty}^{+\infty} \frac{g(z)}{h(z)}.$$

46 Here we always have $D_m \neq 0$ (p. 206).

47 It is not necessary to replace $R(z)$ by a proper fraction. For what follows it is sufficient that the degree of $g(z)$ does not exceed that of $h(z)$.


Let

$$\frac{g(z)}{h(z)} = s_{-1} + \frac{s_0}{z} + \frac{s_1}{z^2} + \cdots.$$

If we now get rid of the denominator and then equate coefficients of equal powers of $z$ on the two sides of the equation, we obtain:

$$a_0 s_{-1} = b_0, \quad a_0 s_0 + a_1 s_{-1} = b_1, \quad \ldots, \quad a_0 s_{m-1} + a_1 s_{m-2} + \cdots + a_m s_{-1} = b_m,$$

$$a_0 s_t + a_1 s_{t-1} + \cdots + a_m s_{t-m} = 0 \qquad (t = m, m+1, \ldots). \tag{67}$$

Using (67), we find an expression for the following determinant of order $2p$, in which we put $a_j = 0$, $b_j = 0$ for $j > m$. By means of the relations (67) the rows of $b$'s can be expressed through the rows of $a$'s and the coefficients $s_{-1}, s_0, s_1, \ldots$; eliminating them and extracting the factors $a_0$, one finds

$$\begin{vmatrix} a_0 & a_1 & a_2 & \cdots & a_{2p-1} \\ b_0 & b_1 & b_2 & \cdots & b_{2p-1} \\ 0 & a_0 & a_1 & \cdots & a_{2p-2} \\ 0 & b_0 & b_1 & \cdots & b_{2p-2} \\ \cdot & \cdot & \cdot & \cdots & \cdot \end{vmatrix} = a_0^{2p} \begin{vmatrix} s_0 & s_1 & \cdots & s_{p-1} \\ s_1 & s_2 & \cdots & s_p \\ \cdot & \cdot & \cdots & \cdot \\ s_{p-1} & s_p & \cdots & s_{2p-2} \end{vmatrix} = a_0^{2p} D_p. \tag{68}$$

We introduce the abbreviation

$$\nabla_{2p} = \begin{vmatrix} a_0 & a_1 & \cdots & a_{2p-1} \\ b_0 & b_1 & \cdots & b_{2p-1} \\ 0 & a_0 & \cdots & a_{2p-2} \\ 0 & b_0 & \cdots & b_{2p-2} \\ \cdot & \cdot & \cdots & \cdot \end{vmatrix} \qquad (p = 1, 2, \ldots;\; a_j = b_j = 0 \text{ for } j > m). \tag{69}$$

Then (68) can be written as follows:

$$\nabla_{2p} = a_0^{2p} D_p \qquad (p = 1, 2, \ldots). \tag{68'}$$
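The identity (68′) can be verified directly on a small example (ours; `nabla` is our helper name for the determinant (69)):

```python
from fractions import Fraction

def det(rows):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def nabla(a, b, p):
    """The determinant (69) of order 2p; a_j = b_j = 0 for j > m."""
    m = len(a) - 1
    get = lambda c, j: c[j] if 0 <= j <= m else 0
    rows = []
    for shift in range(p):
        rows.append([get(a, j - shift) for j in range(2 * p)])
        rows.append([get(b, j - shift) for j in range(2 * p)])
    return det(rows)

# g/h = (z^2 + z + 1)/(z^2 - 1): by (67), s_{-1} = 1, s0 = 1, s1 = 2, s_q = s_{q-2}.
a, b = [1, 0, -1], [1, 1, 1]
s = [1, 2, 1, 2]                       # s0, s1, s2, s3
D1 = Fraction(s[0])
D2 = det([[s[0], s[1]], [s[1], s[2]]])
checks = (nabla(a, b, 1) == 1 ** 2 * D1, nabla(a, b, 2) == 1 ** 4 * D2)
```

Both checks hold: $\nabla_2 = 1 = a_0^2 D_1$ and $\nabla_4 = -3 = a_0^4 D_2$.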

By this formula, Corollary 2 above leads to the following theorem :


THEOREM 10: If $\nabla_{2m} \neq 0$,⁴⁸ then

$$I_{-\infty}^{+\infty} \frac{b_0 z^m + b_1 z^{m-1} + \cdots + b_m}{a_0 z^m + a_1 z^{m-1} + \cdots + a_m} = m - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2m}), \tag{70}$$

where $\nabla_{2p}$ $(p = 1, 2, \ldots, m)$ is determined by (69); if there is a group of zero determinants

$$(\nabla_{2h} \neq 0) \quad \nabla_{2h+2} = \cdots = \nabla_{2h+2p} = 0 \quad (\nabla_{2h+2p+2} \neq 0),$$

then in computing $V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p+2})$ we have to set

$$\operatorname{sign} \nabla_{2h+2j} = (-1)^{j(j-1)/2} \operatorname{sign} \nabla_{2h} \qquad (j = 1, 2, \ldots, p),$$

or, what is the same,

$$V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p+2}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[4pt] \dfrac{p+1+\varepsilon}{2} & \text{for even } p, \quad \varepsilon = (-1)^{p/2} \operatorname{sign} \dfrac{\nabla_{2h+2p+2}}{\nabla_{2h}}. \end{cases}$$

Note. If $\nabla_{2m} = 0$, i.e., if the fraction under the index sign in (70) is reducible, then (70) must be replaced by another formula:

$$I_{-\infty}^{+\infty} \frac{b_0 z^m + b_1 z^{m-1} + \cdots + b_m}{a_0 z^m + a_1 z^{m-1} + \cdots + a_m} = r - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}), \tag{70'}$$

where $r$ is the number of poles (including multiplicities) of the rational fraction under the index sign (i.e., $r$ is the degree of the denominator in the reduced fraction).

For in this case the index we are interested in is

$$r - 2V(1, D_1, D_2, \ldots, D_r),$$

since $r$ is the rank of the corresponding matrix $S = \| s_{i+k} \|_0^\infty$. But the equation (68′) is of a formal character and also holds for reducible fractions. Therefore

$$V(1, D_1, D_2, \ldots, D_r) = V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}),$$

and we have reached (70′).

Formula (70′) enables us to express the index of every rational fraction in which the degree of the numerator does not exceed that of the denominator in terms of the coefficients of numerator and denominator.

48 The condition $\nabla_{2m} \neq 0$ means that $D_m \neq 0$, so that the fraction under the index sign in (70) is reduced.
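Theorem 10 computes the index with no reference to the $s_p$ at all. A sketch on the fraction used above (our example; $\nabla_{2p}$ is built directly from the coefficient rows (69)):

```python
from fractions import Fraction

def det(rows):
    """Exact determinant via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    n, d = len(m), Fraction(1)
    for i in range(n):
        piv = next((r for r in range(i, n) if m[r][i] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != i:
            m[i], m[piv] = m[piv], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def nabla(a, b, p):
    """The determinant (69) of order 2p; a_j = b_j = 0 for j > m."""
    m = len(a) - 1
    get = lambda c, j: c[j] if 0 <= j <= m else 0
    rows = []
    for shift in range(p):
        rows.append([get(a, j - shift) for j in range(2 * p)])
        rows.append([get(b, j - shift) for j in range(2 * p)])
    return det(rows)

# Index of (z^2 + 2)/(z^3 + z^2 - 2z), now computed from the coefficients alone.
a = [1, 1, -2, 0]                         # denominator, m = 3
b = [0, 1, 0, 2]                          # numerator z^2 + 2, padded to degree 3
N = [nabla(a, b, p) for p in (1, 2, 3)]   # nabla_2, nabla_4, nabla_6
seq = [Fraction(1)] + N
V = sum(1 for x, y in zip(seq, seq[1:]) if x * y < 0)
index = 3 - 2 * V
```

The $\nabla$'s come out as $1, 4, -36$, the same signs as $D_1, D_2, D_3$ before, and the index is again $1$.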


§ 12. Another Proof of the Routh-Hurwitz Theorem

1. In § 6 we proved the Routh-Hurwitz theorem with the help of Sturm's theorem and the Routh algorithm. In this section we shall give an alternative proof based on Theorem 10 of § 11 and on properties of the Cauchy indices.

We mention a few properties of the Cauchy indices that will be required in what follows.

1. $I_a^b R(x) = -I_b^a R(x)$.⁴⁹

2. $I_a^b R_1(x) R(x) = \operatorname{sign} R_1 \cdot I_a^b R(x)$ if $R_1(x) \neq 0, \infty$ within the interval $(a, b)$.

3. If $a < c < b$, then $I_a^b R(x) = I_a^c R(x) + I_c^b R(x) + \eta_c$, where $\eta_c = 0$ if $R(c)$ is finite and $\eta_c = \pm 1$ if $R(x)$ becomes infinite at $c$; here $\eta_c = +1$ corresponds to a jump from $-\infty$ to $+\infty$ at $c$ (for increasing $x$), and $\eta_c = -1$ to a jump from $+\infty$ to $-\infty$.

4. If $R(-x) = -R(x)$, then $I_{-a}^0 R(x) = I_0^a R(x)$. If $R(-x) = R(x)$, then $I_{-a}^0 R(x) = -I_0^a R(x)$.

5. $I_a^b R(x) + I_a^b \dfrac{1}{R(x)} = \dfrac{\varepsilon_b - \varepsilon_a}{2}$, where $\varepsilon_a$ is the sign of $R(x)$ within $(a, b)$ near $a$ and $\varepsilon_b$ is the sign of $R(x)$ within $(a, b)$ near $b$.

The first four properties follow immediately from the definition of the Cauchy index (see § 2). Property 5. follows from the fact that the sum of the indices $I_a^b R(x)$ and $I_a^b \dfrac{1}{R(x)}$ is equal to the difference $n_1 - n_2$, where $n_1$ is the number of times $R(x)$ changes from negative to positive, and $n_2$ the number of times $R(x)$ changes from positive to negative, when $x$ changes from $a$ to $b$.

We consider a real polynomial⁵⁰

$$f(z) = a_0 z^n + a_1 z^{n-1} + a_2 z^{n-2} + \cdots + a_{n-1} z + a_n \qquad (a_0 > 0).$$

We can represent it in the form

$$f(z) = h(z^2) + z g(z^2),$$

where

$$h(u) = a_n + a_{n-2} u + \cdots, \qquad g(u) = a_{n-1} + a_{n-3} u + \cdots.$$

49 Here and in what follows the lower limit of the index may be $-\infty$ and the upper limit may be $+\infty$.

50 We have here reverted to the usual notation for the coefficients of a polynomial.


We shall use the notationa,zn-1 - a.,zn-3 +...

- I (71)Q - aozn - asz"-z + .. .

In § 3 we proved (see (20) on p. 180) that

Q=n-2k-8, (72)

where k is the number of roots of f (z) with positive real parts and s thenumber of roots of f (z) on the imaginary axis.

We shall transform the expression (71) for $\varrho$.

To begin with, we deal with the case where $n$ is even. Let $n = 2m$. Then

$$h(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_n, \qquad g(u) = a_1 u^{m-1} + a_3 u^{m-2} + \cdots + a_{n-1}.$$

Using the properties 1.-4. and setting $\eta = \pm 1$ if $\lim_{u \to -0} g(u)/h(u) = \pm\infty$, respectively, and $\eta = 0$ otherwise, we have:

$$\varrho = -I_{-\infty}^{+\infty} \frac{z\, g(-z^2)}{h(-z^2)} = -2 I_0^{+\infty} \frac{z\, g(-z^2)}{h(-z^2)} - \eta = 2 I_{-\infty}^{0} \frac{g(u)}{h(u)} - \eta = I_{-\infty}^{+\infty} \frac{g(u)}{h(u)} - I_{-\infty}^{+\infty} \frac{u\, g(u)}{h(u)}.$$

Similarly we have for odd $n$, $n = 2m + 1$:

$$h(u) = a_1 u^m + a_3 u^{m-1} + \cdots + a_n, \qquad g(u) = a_0 u^m + a_2 u^{m-1} + \cdots + a_{n-1}.$$

Setting⁵¹ $\zeta = \operatorname{sign} \left[ \dfrac{g(u)}{h(u)} \right]_{u \to -0}$ if $\lim_{u \to -0} h(u)/g(u) \neq 0$, and $\zeta = 0$ otherwise, we find:

$$\varrho = I_{-\infty}^{+\infty} \frac{h(-z^2)}{z\, g(-z^2)} = 2 I_0^{+\infty} \frac{h(-z^2)}{z\, g(-z^2)} + \zeta = 2 I_{-\infty}^{0} \frac{h(u)}{u\, g(u)} + \zeta = I_{-\infty}^{+\infty} \frac{h(u)}{u\, g(u)} - I_{-\infty}^{+\infty} \frac{h(u)}{g(u)}.$$

Thus⁵²

⁵¹ Here we mean by $\operatorname{sign}[g(u)/h(u)]_{u=-0}$ the sign of g(u)/h(u) for negative values of u of sufficiently small modulus.

⁵² If aₙ ≠ 0, then the two formulas (73') and (73'') may be combined into the single formula

$$\rho = I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} + I_{-\infty}^{+\infty}\,\frac{h(u)}{u\,g(u)}. \tag{73'''}$$


XV. THE PROBLEM OF ROUTH-HURWITZ AND RELATED QUESTIONS

$$\rho = I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} - I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} \quad (n = 2m), \tag{73'}$$

$$\rho = I_{-\infty}^{+\infty}\,\frac{h(u)}{u\,g(u)} - I_{-\infty}^{+\infty}\,\frac{h(u)}{g(u)} \quad (n = 2m + 1). \tag{73''}$$

As before, we denote by Δ₁, Δ₂, …, Δₙ the Hurwitz determinants of f(z). We assume that Δₙ ≠ 0.⁵³

1) n = 2m. By (70),⁵⁴

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} = m - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}), \tag{74}$$

$$I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = -m + 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n). \tag{75}$$

But then, by (73'),

$$\rho = n - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) - 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n),$$

which in conjunction with ρ = n − 2k gives

$$k = V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) + V(1, \Delta_2, \Delta_4, \ldots, \Delta_n). \tag{76}$$

2) n = 2m + 1. By (70),⁵⁵

$$I_{-\infty}^{+\infty}\,\frac{h(u)}{u\,g(u)} = m + 1 - 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n), \tag{77}$$

$$I_{-\infty}^{+\infty}\,\frac{h(u)}{g(u)} = -m + 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}). \tag{78}$$

The equation ρ = 2m + 1 − 2k together with (73''), (77), and (78) again gives (76).

This proves the Routh-Hurwitz theorem (see p. 194).
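Formula (76) is easy to exercise numerically. The sketch below is ours (function names are assumptions): it builds the Hurwitz determinants with floating-point arithmetic, presupposes Δₙ ≠ 0 with no vanishing intermediate determinants, and compares the count from (76) with a direct root count.

```python
import numpy as np

def hurwitz_dets(coeffs):
    """Hurwitz determinants Delta_1..Delta_n of f(z) = a0 z^n + ... + an;
    the Hurwitz matrix has entry a_{2j-i} in row i, column j (1-indexed)."""
    a, n = list(coeffs), len(coeffs) - 1
    H = np.zeros((n, n))
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if 0 <= 2 * j - i <= n:
                H[i - 1, j - 1] = a[2 * j - i]
    return [np.linalg.det(H[:p, :p]) for p in range(1, n + 1)]

def sign_changes(seq):
    """V(...): number of sign changes in a sequence (zeros are not expected)."""
    seq = [x for x in seq if x != 0]
    return sum(1 for x, y in zip(seq, seq[1:]) if x * y < 0)

def k_right_half_plane(coeffs):
    """Formula (76): k = V(1, D1, D3, ...) + V(1, D2, D4, ...)."""
    d = hurwitz_dets(coeffs)
    return sign_changes([1] + d[0::2]) + sign_changes([1] + d[1::2])

# f(z) = (z - 1)(z - 2)(z + 4) = z^3 + z^2 - 10z + 8 has k = 2 roots in Re z > 0:
coeffs = [1, 1, -10, 8]
print(k_right_half_plane(coeffs))                 # 2
print(sum(r.real > 0 for r in np.roots(coeffs)))  # 2
```

For the stable cubic (z+1)(z+2)(z+3) the same function returns 0.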

⁵³ In this case s = 0, so that ρ = n − 2k. Moreover, Δₙ ≠ 0 means that the fractions under the index signs in (73') and (73'') are in their lowest terms.

⁵⁴ In computing ∇₂, ∇₄, …, the values a₀, a₁, …, a_m and b₀, b₁, …, b_m must be replaced by a₀, a₂, …, a₂ₘ and 0, a₁, a₃, …, a₂ₘ₋₁ respectively in computing the first index, and by a₀, a₂, …, a₂ₘ and a₁, a₃, …, a₂ₘ₋₁, 0 respectively in computing the second index.

⁵⁵ In computing the first index in (70) we take a₀, a₂, …, a₂ₘ, 0 and 0, a₁, a₃, …, a₂ₘ₊₁, respectively, instead of a₀, a₁, …, a_{m+1} and b₀, b₁, …, b_{m+1}; and in computing the second index we take a₁, a₃, …, a₂ₘ₊₁ and a₀, a₂, …, a₂ₘ, respectively, instead of a₀, a₁, …, a_m and b₀, b₁, …, b_m.


2. Note 1. If in the formula

$$k = V(1, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots)$$

some intermediate Hurwitz determinants are zero, then the formula remains valid, only in each group of successive zero determinants

$$(\Delta_l \ne 0)\quad \Delta_{l+2} = \Delta_{l+4} = \cdots = \Delta_{l+2p} = 0 \quad (\Delta_{l+2p+2} \ne 0)$$

the following signs must be attributed to these determinants (in accordance with Theorem 7):

$$\operatorname{sign} \Delta_{l+2j} = (-1)^{\frac{j(j-1)}{2}}\operatorname{sign}\Delta_l \quad (j = 1, 2, \ldots, p),$$

which yields:

$$V(\Delta_l, \Delta_{l+2}, \ldots, \Delta_{l+2p}, \Delta_{l+2p+2}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[6pt] \dfrac{p+1-\varepsilon}{2}, \quad \varepsilon = (-1)^{\frac{p}{2}}\operatorname{sign}\dfrac{\Delta_{l+2p+2}}{\Delta_l} & \text{for even } p. \end{cases} \tag{79}$$

A careful comparison of this rule for computing k in the presence of vanishing Hurwitz determinants with the rule given in Theorem 5 (p. 201) shows that the two rules coincide.⁵⁶

Note 2. If Δₙ = 0, then the polynomials ug(u) and h(u) are not co-prime. We denote by d(u) the greatest common divisor of g(u) and h(u) and by u^γ d(u) that of ug(u) and h(u) (γ = 0 or 1). We denote the degree of d(u) by δ and we set h(u) = d(u)h₁(u) and g(u) = d(u)g₁(u).

The irreducible rational fraction g₁(u)/h₁(u) always corresponds to an infinite Hankel matrix S = ‖s_{i+k}‖₀^∞ of finite rank r, where r is the degree of h₁(u). The corresponding determinant D_r ≠ 0 and D_{r+1} = D_{r+2} = ⋯ = 0. By (68'), ∇_{2r} ≠ 0, ∇_{2r+2} = ∇_{2r+4} = ⋯ = 0. Moreover,

$$I_{-\infty}^{+\infty}\,\frac{g_1(u)}{h_1(u)} = r - 2V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2r}).$$

When we apply all this to the fractions under the index sign in (74), (75), (77), and (78), we easily find that for every n (even or odd) and κ = 2δ + γ

$$\Delta_{n-\kappa} \ne 0, \qquad \Delta_{n-\kappa+1} = \cdots = \Delta_n = 0,$$

and that the formulas (74), (75), (77), and (78) all remain valid in this case, provided we omit all the Δ_i with i > n − κ on the right-hand sides and replace the number m (in (77), m + 1) by the degree of the corresponding

⁵⁶ We have to take account here of the remark made in footnote 36 (p. 201).


denominator of the fraction under the index, after reduction. We then obtain, by taking (73') and (73'') into account:

$$\rho = n - \kappa - 2V(1, \Delta_1, \Delta_3, \ldots) - 2V(1, \Delta_2, \Delta_4, \ldots).$$

Together with the formula ρ = n − 2k − s this gives:

$$k_1 = V(1, \Delta_1, \Delta_3, \ldots) + V(1, \Delta_2, \Delta_4, \ldots), \tag{80}$$

where k₁ = k + s/2 − κ/2 is the number of all the roots of f(z) in the right half-plane, excluding those that are also roots of f(−z).⁵⁷

§ 13. Some Supplements to the Routh-Hurwitz Theorem. Stability Criterion of Liénard and Chipart

1. Suppose given a polynomial with real coefficients

$$f(z) = a_0z^n + a_1z^{n-1} + \cdots + a_n \quad (a_0 > 0).$$

Then the Routh-Hurwitz conditions that are necessary and sufficient for all the roots of f(z) to have negative real parts can be written in the form of the inequalities

$$\Delta_1 > 0,\quad \Delta_2 > 0,\quad \ldots,\quad \Delta_n > 0, \tag{81}$$

where

$$\Delta_i = \begin{vmatrix} a_1 & a_3 & a_5 & \cdots \\ a_0 & a_2 & a_4 & \cdots \\ 0 & a_1 & a_3 & \cdots \\ 0 & a_0 & a_2 & \cdots \\ \cdot & \cdot & \cdot & \cdots \end{vmatrix} \quad (a_k = 0 \text{ for } k > n)$$

is the Hurwitz determinant of order i (i = 1, 2, …, n).

If (81) is satisfied, then f(z) can be represented as a product of a₀ with factors of the form z + u, z² + vz + w (u > 0, v > 0, w > 0), so that all the coefficients of f(z) are positive:⁵⁸

⁵⁷ This follows from the fact that κ is the degree of the greatest common divisor of h(u) and ug(u); κ is the number of 'special' roots of f(z), i.e., of those roots z_ν for which −z_ν is also a root of f(z). The number of these special roots is equal to the number of determinants in the last uninterrupted sequence of vanishing Hurwitz determinants (including Δₙ): Δ_{n−κ+1} = ⋯ = Δₙ = 0.

⁵⁸ a₀ > 0, by assumption.


$$a_1 > 0,\quad a_2 > 0,\quad \ldots,\quad a_n > 0. \tag{82}$$

Unlike (81), the conditions (82) are necessary but by no means sufficient for all the roots of f(z) to lie in the left half-plane Re z < 0.

However, when the conditions (82) hold, the inequalities (81) are not independent. For example: for n = 4 the Routh-Hurwitz conditions reduce to the single inequality Δ₃ > 0; for n = 5, to the two Δ₂ > 0, Δ₄ > 0; for n = 6, to the two Δ₃ > 0, Δ₅ > 0.⁵⁹

This circumstance was investigated by the French mathematicians Liénard and Chipart⁶⁰ in 1914 and enabled them to set up a stability criterion different from the Routh-Hurwitz criterion.

THEOREM 11 (Stability Criterion of Liénard and Chipart): Necessary and sufficient conditions for all the roots of the real polynomial f(z) = a₀zⁿ + a₁zⁿ⁻¹ + ⋯ + aₙ (a₀ > 0) to have negative real parts can be given in any one of the following four forms:⁶¹

1) aₙ > 0, a_{n−2} > 0, …; Δ₁ > 0, Δ₃ > 0, …;
2) aₙ > 0, a_{n−2} > 0, …; Δ₂ > 0, Δ₄ > 0, …;
3) aₙ > 0; a_{n−1} > 0, a_{n−3} > 0, …; Δ₁ > 0, Δ₃ > 0, …;
4) aₙ > 0; a_{n−1} > 0, a_{n−3} > 0, …; Δ₂ > 0, Δ₄ > 0, ….

From Theorem 11 it follows that Hurwitz's determinantal inequalities (81) are not independent for a real polynomial f(z) = a₀zⁿ + a₁zⁿ⁻¹ + ⋯ + aₙ (a₀ > 0) in which all the coefficients (or even only part of them: aₙ, a_{n−2}, …, or aₙ, a_{n−1}, a_{n−3}, …) are positive. In fact: if the Hurwitz determinants of odd order are positive, then those of even order are also positive, and vice versa.
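Form 1) of Theorem 11 can be sketched directly in code. The following is our illustration (names are assumptions, floating-point determinants, a₀ > 0 presumed), checking roughly half the determinantal inequalities that the full Hurwitz test would require:

```python
import numpy as np

def hurwitz_dets(coeffs):
    # Hurwitz determinants Delta_1..Delta_n; entry (i, j) is a_{2j-i} (1-indexed)
    a, n = list(coeffs), len(coeffs) - 1
    H = np.zeros((n, n))
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if 0 <= 2 * j - i <= n:
                H[i - 1, j - 1] = a[2 * j - i]
    return [np.linalg.det(H[:p, :p]) for p in range(1, n + 1)]

def lienard_chipart_stable(coeffs):
    """Form 1) of Theorem 11: a_n > 0, a_{n-2} > 0, ... together with
    Delta_1 > 0, Delta_3 > 0, ... (a0 > 0 assumed)."""
    a, n = list(coeffs), len(coeffs) - 1
    evens = all(a[k] > 0 for k in range(n, -1, -2))
    d = hurwitz_dets(coeffs)
    odds = all(d[i] > 0 for i in range(0, n, 2))
    return evens and odds

# (z+1)(z+2)(z+3) = z^3 + 6z^2 + 11z + 6 is stable; z^3 + z^2 + z + 2 is not:
print(lienard_chipart_stable([1, 6, 11, 6]))            # True
print(lienard_chipart_stable([1, 1, 1, 2]))             # False
print(all(r.real < 0 for r in np.roots([1, 1, 1, 2])))  # False
```

The second example has all coefficients positive, yet Δ₃ < 0; this is exactly the situation in which (82) holds but (81) fails.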

Liénard and Chipart obtained condition 1) in the paper [259] by means of special quadratic forms. We shall give a simpler derivation of condition 1) (and also of 2), 3), 4)) based on Theorem 10 of § 11 and the theory of Cauchy indices, and we shall obtain these conditions as a special case of a much more general theorem which we are now about to expound.

We again consider the polynomials h(u) and g(u) that are connected with f(z) by the identity

⁵⁹ This fact has been established for the first few values of n in a number of papers on the theory of governors, independently of the general criterion of Liénard and Chipart, with which the authors of these papers were obviously not acquainted.

⁶⁰ See [259]. An account of some of the basic results of Liénard and Chipart can be found in the fundamental survey by M. G. Krein and M. A. Naimark [25].

⁶¹ Conditions 1), 2), 3), and 4) have a decided advantage over Hurwitz's conditions, because they involve only about half the number of determinantal inequalities.


$$f(z) = h(z^2) + zg(z^2).$$

If n is even, n = 2m, then

$$h(u) = a_0u^m + a_2u^{m-1} + \cdots + a_n, \qquad g(u) = a_1u^{m-1} + a_3u^{m-2} + \cdots + a_{n-1};$$

if n is odd, n = 2m + 1, then

$$h(u) = a_1u^m + a_3u^{m-1} + \cdots + a_n, \qquad g(u) = a_0u^m + a_2u^{m-1} + \cdots + a_{n-1}.$$

The conditions aₙ > 0, a_{n−2} > 0, … (or a_{n−1} > 0, a_{n−3} > 0, …) can therefore be replaced by the more general condition: h(u) (or g(u)) does not change sign for u > 0.⁶²

Under these conditions we can deduce a formula for the number of roots of f(z) in the right half-plane which uses only the Hurwitz determinants of odd order, or only those of even order.

THEOREM 12: If for the real polynomial

$$f(z) = a_0z^n + a_1z^{n-1} + \cdots + a_n = h(z^2) + zg(z^2) \quad (a_0 > 0)$$

h(u) (or g(u)) does not change sign for u > 0 and the last Hurwitz determinant Δₙ ≠ 0, then the number k of roots of f(z) in the right half-plane is determined by the following formulas.

For n = 2m:

if h(u) does not change sign for u > 0,

$$k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n);$$

if g(u) does not change sign for u > 0,

$$k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) + \frac{\varepsilon_\infty - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n) - \frac{\varepsilon_\infty - \varepsilon_0}{2}.$$

For n = 2m + 1:

if h(u) does not change sign for u > 0,

$$k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n) - \frac{1 - \varepsilon_\infty}{2} = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}) + \frac{1 - \varepsilon_\infty}{2};$$

if g(u) does not change sign for u > 0,

$$k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_n) - \frac{1 - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_{n-1}) + \frac{1 - \varepsilon_0}{2}. \tag{83}$$

Here⁶³

$$\varepsilon_\infty = \operatorname{sign}\left[\frac{g(u)}{h(u)}\right]_{u \to +\infty}, \qquad \varepsilon_0 = \operatorname{sign}\left[\frac{g(u)}{h(u)}\right]_{u \to +0}. \tag{84}$$

⁶² I.e., h(u) ≥ 0 or h(u) ≤ 0 for u > 0 (g(u) ≥ 0 or g(u) ≤ 0 for u > 0).

⁶³ If a₁ ≠ 0, then ε_∞ = sign a₁; and, more generally, if a₁ = a₃ = ⋯ = a_{2ν−1} = 0, a_{2ν+1} ≠ 0, then ε_∞ = sign a_{2ν+1}. If a_{n−1} ≠ 0, then ε₀ = sign (a_{n−1}/aₙ); and, more generally, if a_{n−1} = a_{n−3} = ⋯ = a_{n−2p+1} = 0 and a_{n−2p−1} ≠ 0, then ε₀ = sign (a_{n−2p−1}/aₙ).


Proof. Again we use the notation

$$\rho = I_{-\infty}^{+\infty}\,\frac{a_1z^{n-1} - a_3z^{n-3} + \cdots}{a_0z^n - a_2z^{n-2} + \cdots}.$$

Corresponding to the table (83) we consider four cases:

1) n = 2m; h(u) does not change sign for u > 0. Then⁶⁴

$$I_{0}^{+\infty}\,\frac{g(u)}{h(u)} = I_{0}^{+\infty}\,\frac{u\,g(u)}{h(u)} = 0,$$

and so the obvious equation

$$I_{-\infty}^{0}\,\frac{g(u)}{h(u)} = -I_{-\infty}^{0}\,\frac{u\,g(u)}{h(u)}$$

implies that:⁶⁵

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} = -I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)}.$$

But then we have from (74) and (75):

$$V(1, \Delta_1, \Delta_3, \ldots) = V(1, \Delta_2, \Delta_4, \ldots),$$

and therefore the Routh-Hurwitz formula (76) gives:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots, \Delta_{n-1}) = 2V(1, \Delta_2, \Delta_4, \ldots, \Delta_n).$$

2) n = 2m; g(u) does not change sign for u > 0. In this case,

$$I_{0}^{+\infty}\,\frac{h(u)}{g(u)} = I_{0}^{+\infty}\,\frac{h(u)}{u\,g(u)} = 0,$$

so that with the notation (84) we have:

$$I_{-\infty}^{+\infty}\,\frac{h(u)}{g(u)} + I_{-\infty}^{+\infty}\,\frac{h(u)}{u\,g(u)} = \varepsilon_0. \tag{85}$$

When we replace the functions under the index sign by their reciprocals, we obtain by 5. (see p. 216):

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} + I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = \varepsilon_\infty - \varepsilon_0.$$

⁶⁴ If h(u₁) = 0 (u₁ > 0), then g(u₁) ≠ 0, because Δₙ ≠ 0. Therefore h(u) ≥ 0 (u > 0) implies that g(u)/h(u) does not change sign in passing through u = u₁.

⁶⁵ From Δₙ ≠ 0 it follows that h(0) = aₙ ≠ 0.


But this by (74) and (75) gives:

$$V(1, \Delta_2, \Delta_4, \ldots) - V(1, \Delta_1, \Delta_3, \ldots) = \frac{\varepsilon_\infty - \varepsilon_0}{2}.$$

Hence, in conjunction with the Routh-Hurwitz formula (76), we obtain:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) + \frac{\varepsilon_\infty - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) - \frac{\varepsilon_\infty - \varepsilon_0}{2}.$$

3) n = 2m + 1; g(u) does not change sign for u > 0. In this case, as in the preceding one, (85) holds. When we substitute the expressions for the indices from (77) and (78) into (85), we obtain:

$$V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{1 - \varepsilon_0}{2}.$$

In conjunction with the Routh-Hurwitz formula this gives:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) - \frac{1 - \varepsilon_0}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) + \frac{1 - \varepsilon_0}{2}.$$

4) n = 2m + 1; h(u) does not change sign for u > 0. From the equations

$$I_{0}^{+\infty}\,\frac{g(u)}{h(u)} = I_{0}^{+\infty}\,\frac{u\,g(u)}{h(u)} = 0 \quad\text{and}\quad I_{-\infty}^{0}\,\frac{g(u)}{h(u)} + I_{-\infty}^{0}\,\frac{u\,g(u)}{h(u)} = 0$$

we deduce:

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} + I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = 0.$$

Taking the reciprocals of the functions under the index sign, we obtain:

$$I_{-\infty}^{+\infty}\,\frac{h(u)}{g(u)} + I_{-\infty}^{+\infty}\,\frac{h(u)}{u\,g(u)} = \varepsilon_\infty.$$

Again, when we substitute the expressions for the indices from (77) and (78), we have:

$$V(1, \Delta_1, \Delta_3, \ldots) - V(1, \Delta_2, \Delta_4, \ldots) = \frac{1 - \varepsilon_\infty}{2}.$$

From this and the Routh-Hurwitz formula it follows that:

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) - \frac{1 - \varepsilon_\infty}{2} = 2V(1, \Delta_2, \Delta_4, \ldots) + \frac{1 - \varepsilon_\infty}{2}.$$

This completes the proof of Theorem 12.

From Theorem 12 we obtain Theorem 11 as a special case.


2. COROLLARY TO THEOREM 12: If the real polynomial

$$f(z) = a_0z^n + a_1z^{n-1} + \cdots + a_n \quad (a_0 > 0)$$

has positive coefficients

$$a_0 > 0,\quad a_1 > 0,\quad a_2 > 0,\quad \ldots,\quad a_n > 0,$$

and Δₙ ≠ 0, then the number k of its roots in the right half-plane Re z > 0 is determined by the formula

$$k = 2V(1, \Delta_1, \Delta_3, \ldots) = 2V(1, \Delta_2, \Delta_4, \ldots).$$

Note. If in the last formula, or in (83), some of the intermediate Hurwitz determinants are zero, then in the computation of V(1, Δ₁, Δ₃, …) and V(1, Δ₂, Δ₄, …) the rule given in Note 1 on p. 219 must be followed.

But if Δₙ = Δ_{n−1} = ⋯ = Δ_{n−κ+1} = 0, Δ_{n−κ} ≠ 0, then we disregard the determinants Δ_{n−κ+1}, …, Δₙ in (83)⁶⁶ and determine from these formulas the number k₁ of the 'non-singular' roots of f(z) in the right half-plane, provided only that h(u) ≠ 0 for u > 0 or g(u) ≠ 0 for u > 0.⁶⁷

§ 14. Some Properties of Hurwitz Polynomials. Stieltjes' Theorem. Representation of Hurwitz Polynomials by Continued Fractions

1. Let

$$f(z) = a_0z^n + a_1z^{n-1} + \cdots + a_n \quad (a_0 \ne 0)$$

be a real polynomial. We represent it in the form

$$f(z) = h(z^2) + zg(z^2).$$

We shall investigate what conditions have to be imposed on h(u) and g(u) in order that f(z) be a Hurwitz polynomial.

Setting k = s = 0 in (20) (p. 180), we obtain a necessary and sufficient condition for f(z) to be a Hurwitz polynomial in the form

$$\rho = n,$$

where, as in the preceding sections,

$$\rho = I_{-\infty}^{+\infty}\,\frac{a_1z^{n-1} - a_3z^{n-3} + \cdots}{a_0z^n - a_2z^{n-2} + \cdots}.$$

⁶⁶ See p. 220.

⁶⁷ In this case the polynomials h₁(u) and g₁(u) obtained from h(u) and g(u) by dividing them by their greatest common divisor d(u) satisfy the conditions of Theorem 12.


Let n = 2m. By (73') (p. 218), this condition can be written as follows:

$$n = 2m = I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} - I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)}. \tag{86}$$

Since the absolute value of the index of a rational fraction cannot exceed the degree of the denominator (in this case, m), the equation (86) can hold if and only if

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} = m \quad\text{and}\quad I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = -m \tag{87}$$

hold simultaneously.

For n = 2m + 1 the equation (73'') gives (on account of ρ = n)

$$n = I_{-\infty}^{+\infty}\,\frac{h(u)}{u\,g(u)} - I_{-\infty}^{+\infty}\,\frac{h(u)}{g(u)}.$$

When we replace the fractions under the index signs by their reciprocals (see 5. on p. 216) and observe that h(u) and g(u) are of the same degree m, we obtain:⁶⁸

$$n = 2m + 1 = I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} - I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} + \varepsilon_\infty. \tag{88}$$

Starting again from the fact that the absolute value of the index of a fraction cannot exceed the degree of the denominator, we conclude that (88) holds if and only if

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} = m, \quad I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = -m, \quad\text{and}\quad \varepsilon_\infty = 1 \tag{89}$$

hold simultaneously.

hold simultaneously.If n = 2m, the first of equations (87) indicates that h(u) has in distinct

real roots u1 < u2 < ... < it,,, and that the proper fractions g(u)/h(n)can be represented in the form

q(u) =m Rt

h (a) E iTwhere

R, = g (ut) > 0 (i =1, 2, ..., m).A' NO

(90)

From this representation of g(u)/h(u) it follows that between any tworoots u;, ut+1 of h(u) there is a real root its of g(u) (i=1, 2, ..., m-1)and that the highest coefficients of h(u) and g(u) are of like sign, i.e.,

68 As in the preceding section, e.. = signh (U)(")l

[ u. + w 'h


$$h(u) = a_0(u - u_1)\cdots(u - u_m), \qquad g(u) = a_1(u - u'_1)\cdots(u - u'_{m-1}),$$
$$u_1 < u'_1 < u_2 < u'_2 < \cdots < u_{m-1} < u'_{m-1} < u_m; \qquad a_0a_1 > 0.$$

The second of equations (87) adds only one condition,

$$u_m < 0.$$

By this condition all the roots of h(u) and g(u) must be negative.

If n = 2m + 1, then it follows from the first of equations (89) that h(u) has m distinct real roots u₁ < u₂ < ⋯ < u_m and that

$$\frac{g(u)}{h(u)} = s_{-1} + \sum_{i=1}^{m} \frac{R_i}{u - u_i} \quad (s_{-1} \ne 0), \tag{91}$$

where

$$R_i = \frac{g(u_i)}{h'(u_i)} > 0 \quad (i = 1, 2, \ldots, m). \tag{91'}$$

The third of equations (89) implies that

$$s_{-1} > 0, \tag{92}$$

i.e., that the highest coefficients a₀ and a₁ are of like sign. Moreover, it follows from (91), (91'), and (92) that g(u) has m real roots u'₁ < u'₂ < ⋯ < u'_m in the intervals (−∞, u₁), (u₁, u₂), …, (u_{m−1}, u_m). In other words,

$$h(u) = a_1(u - u_1)\cdots(u - u_m), \qquad g(u) = a_0(u - u'_1)\cdots(u - u'_m),$$
$$u'_1 < u_1 < u'_2 < u_2 < \cdots < u'_m < u_m; \qquad a_0a_1 > 0.$$

The second of equations (89), as in the case n = 2m, only adds one further inequality,

$$u_m < 0.$$

DEFINITION 3: We shall say that two polynomials h(u) and g(u), of degree m (or the first of degree m and the second of degree m − 1), form a positive pair⁶⁹ if their roots u₁, u₂, …, u_m and u'₁, u'₂, …, u'_m (or u'₁, u'₂, …, u'_{m−1}) are all distinct, real, and negative, and they alternate as follows:

$$u'_1 < u_1 < u'_2 < u_2 < \cdots < u'_m < u_m \quad (\text{or } u_1 < u'_1 < u_2 < u'_2 < \cdots < u'_{m-1} < u_m),$$

and their highest coefficients are of like sign.⁷⁰

⁶⁹ See [17], p. 333. The definition of a positive pair of polynomials given here differs slightly from that given in the book [17].

⁷⁰ If we omit the condition that the roots be negative, we obtain a real pair of polynomials. For the application of this concept to the Routh-Hurwitz problem, see [36].


When we introduce the positive numbers v_i = −u_i and v'_i = −u'_i and multiply h(u) and g(u) by +1 or −1 so that their highest coefficients become positive, then we can write the polynomials of a positive pair in the form

$$h(u) = a_1\prod_{i=1}^{m}(u + v_i), \qquad g(u) = a_0\prod_{i=1}^{m}(u + v'_i), \tag{93}$$

where

$$a_1 > 0, \quad a_0 > 0, \quad 0 < v_m < v'_m < v_{m-1} < v'_{m-1} < \cdots < v_1 < v'_1,$$

in case both h(u) and g(u) are of degree m, and in the form

$$h(u) = a_0\prod_{i=1}^{m}(u + v_i), \qquad g(u) = a_1\prod_{i=1}^{m-1}(u + v'_i), \tag{93'}$$

where

$$a_0 > 0, \quad a_1 > 0, \quad 0 < v_m < v'_{m-1} < v_{m-1} < \cdots < v'_1 < v_1,$$

in case h(u) is of degree m and g(u) of degree m − 1.

By our earlier arguments we have proved the following two theorems :

THEOREM 13: The polynomial f(z) = h(z²) + zg(z²) is a Hurwitz polynomial if and only if h(u) and g(u) form a positive pair.⁷¹
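Theorem 13 can be tried out numerically. The sketch below is ours (helper names and tolerances are assumptions): it splits f into h and g and tests Definition 3 directly — all roots real, negative, distinct, alternating in the required order, highest coefficients of like sign.

```python
import numpy as np

def h_g_from_f(coeffs):
    """Split f(z) = h(z^2) + z*g(z^2) for coeffs = [a0, ..., an] (highest first)."""
    rev = coeffs[::-1]          # rev[k] = a_{n-k}
    h = rev[0::2][::-1]         # coefficients of h(u), highest power first
    g = rev[1::2][::-1]         # coefficients of g(u)
    return h, g

def is_positive_pair(h, g):
    """Definition 3: roots of h and g all real, negative, distinct, alternating
    (the leftmost root belongs to g when deg g = deg h, to h otherwise), and
    the highest coefficients of like sign."""
    hr, gr = np.roots(h), np.roots(g)
    for r in (hr, gr):
        if r.size and (np.abs(r.imag).max() > 1e-9 or r.real.max() >= 0):
            return False
    merged = sorted([(x, 'h') for x in hr.real] + [(x, 'g') for x in gr.real])
    if any(a[1] == b[1] for a, b in zip(merged, merged[1:])):
        return False            # two roots of the same polynomial are adjacent
    first = 'g' if len(g) == len(h) else 'h'
    if merged and merged[0][1] != first:
        return False
    return h[0] * g[0] > 0

h, g = h_g_from_f([1, 6, 11, 6])   # (z+1)(z+2)(z+3): h(u) = 6u+6, g(u) = u+11
print(is_positive_pair(h, g))                          # True
print(is_positive_pair(*h_g_from_f([1, 1, 1, 2])))     # False (unstable cubic)
```

The unstable cubic z³ + z² + z + 2 fails precisely on the alternation order: its h-root lies to the left of its g-root, the opposite of what Definition 3 demands for equal degrees.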

THEOREM 14: Two polynomials h(u) and g(u), the first of which is of degree m and the second of degree m or m − 1, form a positive pair if and only if the equations

$$I_{-\infty}^{+\infty}\,\frac{g(u)}{h(u)} = m, \qquad I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = -m \tag{94}$$

hold and, when h(u) and g(u) are of equal degree, the additional condition

$$\varepsilon_\infty = \operatorname{sign}\left[\frac{g(u)}{h(u)}\right]_{u \to +\infty} = 1 \tag{95}$$

holds.

2. Using the properties of the Cauchy indices, we can easily deduce from the last theorem a theorem of Stieltjes on the representation of a fraction g(u)/h(u) as a continued fraction of a special type, provided h(u) and g(u) form a positive pair of polynomials.

The proof of Stieltjes' theorem will be based on the following lemma.

⁷¹ This theorem is a special case of the so-called Hermite-Biehler theorem (see [7], p. 21).


LEMMA: If the polynomials h(u) and g(u) (h(u) of degree m) form a positive pair and

$$\frac{g(u)}{h(u)} = c + \cfrac{1}{du + \cfrac{h_1(u)}{g_1(u)}}, \tag{96}$$

where c, d are constants and h₁(u), g₁(u) are polynomials of degree not exceeding m − 1, then

1. c ≥ 0, d > 0;
2. h₁(u), g₁(u) are of degree m − 1;
3. h₁(u) and g₁(u) form a positive pair.

Given h(u) and g(u), the polynomials h₁(u) and g₁(u) are uniquely determined (to within a common constant factor), and so are c and d. Conversely, from (96) and 1., 2., 3. it follows that h(u) and g(u) form a positive pair, that h(u) is of degree m, and that g(u) is of degree m or m − 1 according as c > 0 or c = 0.

Proof. Let h(u), g(u) be a positive pair. Then it follows from (94) and (96) that:

$$I_{-\infty}^{+\infty}\,\cfrac{1}{du + \cfrac{h_1(u)}{g_1(u)}} = m. \tag{97}$$

This equation implies that g₁(u) is of degree m − 1 and that d ≠ 0. Further, from (97) we find:

$$m = -I_{-\infty}^{+\infty}\left[du + \frac{h_1(u)}{g_1(u)}\right] + \operatorname{sign} d = -I_{-\infty}^{+\infty}\,\frac{h_1(u)}{g_1(u)} + \operatorname{sign} d.$$

Hence it follows that d > 0 and that

$$I_{-\infty}^{+\infty}\,\frac{h_1(u)}{g_1(u)} = -(m-1). \tag{98}$$

The second of equations (94) now gives:

$$-m = I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = I_{-\infty}^{+\infty}\,\cfrac{1}{d + \cfrac{h_1(u)}{u\,g_1(u)}} = -I_{-\infty}^{+\infty}\,\frac{h_1(u)}{u\,g_1(u)},$$

so that

$$I_{-\infty}^{+\infty}\,\frac{h_1(u)}{u\,g_1(u)} = m. \tag{99}$$

Hence it follows that h₁(u) is of degree m − 1.

Condition (95) yields, by (96): c > 0. But if g(u) is of smaller degree than h(u), then it follows from (96) that c = 0.


(98) and (99) imply:

$$I_{-\infty}^{+\infty}\,\frac{g_1(u)}{h_1(u)} = m - 1, \qquad I_{-\infty}^{+\infty}\,\frac{u\,g_1(u)}{h_1(u)} = -m + \varepsilon_1, \quad \varepsilon_1 = \operatorname{sign}\left[\frac{g_1(u)}{h_1(u)}\right]_{u \to +\infty}. \tag{100}$$

Since the second of the indices (100) cannot exceed m − 1 in absolute value, we have

$$\varepsilon_1 = 1, \tag{101}$$

and then we conclude from (100) and (101), by Theorem 14, that the polynomials h₁(u) and g₁(u) form a positive pair.

From (96) it follows that

$$c = \lim_{u\to\infty}\frac{g(u)}{h(u)}, \qquad d = \lim_{u\to\infty}\frac{h(u)}{u\,[\,g(u) - c\,h(u)\,]}.$$

After c and d have been found, the ratio h₁(u)/g₁(u) is determined by (96).

The relations (97), (98), (99), (100), and (101), applied in the reverse order, establish the second part of the lemma. Thus the proof of the lemma is complete.

Suppose given a positive pair of polynomials h(u), g(u), with h(u) of degree m. Then, when we divide g(u) by h(u) and denote the quotient by c₀ and the remainder by g₁(u), we obtain:

$$\frac{g(u)}{h(u)} = c_0 + \frac{g_1(u)}{h(u)}.$$

The fraction h(u)/g₁(u) can be represented in the form $d_0u + \frac{h_1(u)}{g_1(u)}$, where h₁(u), like g₁(u), is of degree less than m. Hence

$$\frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0u + \cfrac{h_1(u)}{g_1(u)}}. \tag{102}$$


Thus, the representation (96) always holds for a positive pair h(u) and g(u). By the lemma,

$$c_0 \ge 0, \quad d_0 > 0,$$

and the polynomials h₁(u) and g₁(u) are of degree m − 1 and form a positive pair.

When we apply the same arguments to the positive pair h₁(u), g₁(u), we obtain

$$\frac{g_1(u)}{h_1(u)} = c_1 + \cfrac{1}{d_1u + \cfrac{h_2(u)}{g_2(u)}}, \tag{102'}$$

where

$$c_1 > 0, \quad d_1 > 0,$$

and the polynomials h₂(u) and g₂(u) are of degree m − 2 and form a positive pair. Continuing the process, we finally end up with a positive pair h_m, g_m, where h_m and g_m are constants of like sign. We set:

$$c_m = \frac{g_m}{h_m} > 0.$$

Thus

$$\frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0u + \cfrac{1}{c_1 + \cfrac{1}{d_1u + \cfrac{1}{c_2 + \cdots + \cfrac{1}{d_{m-1}u + \cfrac{1}{c_m}}}}}}. \tag{102^{(m)}}$$

Using the second part of the lemma, we show similarly that for arbitrary c₀ ≥ 0, c₁ > 0, …, c_m > 0, d₀ > 0, d₁ > 0, …, d_{m−1} > 0 the above continued fraction determines uniquely (to within a common constant factor) a positive pair of polynomials h(u) and g(u), where h(u) is of degree m and g(u) is of degree m when c₀ > 0 and of degree m − 1 when c₀ = 0.

Thus we have proved the following theorem.⁷²

⁷² A proof of Stieltjes' theorem that is not based on the theory of Cauchy indices can be found in the book [17], pp. 333–37.


THEOREM 15 (Stieltjes): If h(u), g(u) is a positive pair of polynomials and h(u) is of degree m, then

$$\frac{g(u)}{h(u)} = c_0 + \cfrac{1}{d_0u + \cfrac{1}{c_1 + \cfrac{1}{d_1u + \cdots + \cfrac{1}{d_{m-1}u + \cfrac{1}{c_m}}}}}, \tag{103}$$

where

$$c_0 \ge 0, \quad c_1 > 0, \quad \ldots, \quad c_m > 0, \quad d_0 > 0, \quad \ldots, \quad d_{m-1} > 0.$$

Here c₀ = 0 if g(u) is of degree m − 1 and c₀ > 0 if g(u) is of degree m. The constants c_k, d_k are uniquely determined by h(u), g(u).

Conversely, for arbitrary c₀ ≥ 0 and arbitrary positive c₁, …, c_m, d₀, …, d_{m−1}, the continued fraction (103) determines a positive pair of polynomials h(u), g(u), where h(u) is of degree m.

From Theorem 13 and Stieltjes' theorem we deduce:

THEOREM 16: A real polynomial f(z) = h(z²) + zg(z²) of degree n is a Hurwitz polynomial if and only if the formula (103) holds with non-negative c₀ and positive c₁, …, c_m, d₀, …, d_{m−1}. Here c₀ > 0 when n is odd and c₀ = 0 when n is even.
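The expansion (103) is constructive: at each step one constant c and one coefficient d are split off, exactly as in the proof above. The following sketch is ours (the function name and the use of `numpy`'s polynomial helpers are assumptions, and the floating-point division is illustrative, not numerically hardened):

```python
import numpy as np

def stieltjes_fraction(h, g):
    """Constants c0..cm and d0..d_{m-1} of the continued fraction (103) for
    g(u)/h(u); h, g are coefficient lists, highest power first."""
    h, g = np.array(h, float), np.array(g, float)
    cs, ds = [], []
    while h.size > 1:                        # while deg h >= 1
        c = g[0] / h[0] if g.size == h.size else 0.0
        cs.append(c)
        g1 = np.polysub(g, c * h)[1:]        # g - c*h, leading term cancelled
        q, r = np.polydiv(h, g1)             # h = (d0*u + q0)*g1 + r
        ds.append(q[0])
        h, g = np.polyadd(q[1] * g1, r), g1  # next pair: h1 = q0*g1 + r, g1
    cs.append(g[0] / h[0])                   # the final constant pair: c_m
    return cs, ds

# Odd-degree Hurwitz example (z+1)(z+2)(z+3): h(u) = 6u + 6, g(u) = u + 11
cs, ds = stieltjes_fraction([6, 6], [1, 11])
print(cs[0] > 0 and all(c > 0 for c in cs[1:]) and all(d > 0 for d in ds))  # True
# Even-degree Hurwitz example z^2 + 3z + 2: h(u) = u + 2, g(u) = 3
cs, ds = stieltjes_fraction([1, 2], [3])
print(cs, ds)   # c0 == 0.0 here (n even), the remaining constants positive
```

For the cubic, c₀ = 1/6 > 0 (n odd); for the quadratic, c₀ = 0 (n even) — the dichotomy stated in Theorem 16.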

§ 15. Domain of Stability. Markov Parameters

1. With every real polynomial of degree n we can associate a point of an n-dimensional space whose coordinates are the quotients of the coefficients divided by the highest coefficient. In this 'coefficient space' all the Hurwitz polynomials form a certain n-dimensional domain which is determined⁷³ by the Hurwitz inequalities Δ₁ > 0, Δ₂ > 0, …, Δₙ > 0 or, for example, by the Liénard-Chipart inequalities aₙ > 0, a_{n−2} > 0, …; Δ₁ > 0, Δ₃ > 0, …. We shall call it the domain of stability. If the coefficients are given as functions of p parameters, then the domain of stability is constructed in the space of these parameters.

⁷³ For a₀ = 1.


The study of the domain of stability is of great practical interest; forexample, it is essential in the design of new systems of governors."

In § 17 we shall show that two remarkable theorems which were foundby Markov and Chebyshev in connection with the expansion of continuedfractions in power series with negative powers of the argument are closelyconnected with the investigation of the domain of stability. In formulatingand proving these theorems it is convenient to give the polynomial not byits coefficients, but by special parameters, which we shall call Markovparameters.

Suppose that

$$f(z) = a_0z^n + a_1z^{n-1} + \cdots + a_n \quad (a_0 \ne 0)$$

is a real polynomial. We represent it in the form

$$f(z) = h(z^2) + zg(z^2).$$

We may assume that h(u) and g(u) are co-prime (Δₙ ≠ 0). We expand the irreducible rational fraction g(u)/h(u) in a series of decreasing powers of u:⁷⁵

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \frac{s_3}{u^4} + \cdots. \tag{104}$$

The sequence s₀, s₁, s₂, … determines an infinite Hankel matrix S = ‖s_{i+k}‖₀^∞. We define a rational function R(v) by

$$R(v) = -\frac{g(-v)}{h(-v)}. \tag{105}$$

Then

$$R(v) = -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \frac{s_2}{v^3} + \cdots, \tag{106}$$

so that we have the relation (see p. 208)

$$R(v) \sim S. \tag{107}$$

Hence it follows that the matrix S is of rank m = [n/2], since m, being the degree of h(u), is equal to the number of poles of R(v).⁷⁶

For n = 2m (in this case s₋₁ = 0), the matrix S determines the irreducible fraction g(u)/h(u) uniquely and therefore determines f(z) to within a constant factor. For n = 2m + 1, in order to give f(z) by means of S it is necessary also to know the coefficient s₋₁.

⁷⁴ A number of papers by Y. I. Naimark deal with the investigation of the domain of stability and also of the domains corresponding to various values of k (k is the number of roots in the right half-plane). (See the monograph [41].)

⁷⁵ In what follows it is convenient to denote the coefficients of the even negative powers of u by −s₁, −s₃, etc.

⁷⁶ See Theorem 8 (p. 207).


constant factor. For n = 2m + 1, in order to give f (z) by means of S it isnecessary also to know the coefficient s_1.

On the other hand, in order to give the infinite Hankel matrix S of rankm it is sufficient to know the first 2m numbers so, 81, ... , s2m-1. Thesenumbers may be chosen arbitrarily subject to only one restriction

D._Isi+ki 0; (108)

all the subsequent coefficients 82,, 82,w.F1i ... of (104) are uniquely (andrationally) expressible in terms of the first 2m: so, sl, . . . , s2m_1. For in theinfinite Hankel matrix S of rank n1 the elements are connected by a recur-rence relation (see Theorem 7 on p. 205)

sy=Y a9-9 (4=m, m+ 1, ...).v-1

(109)

If the numbers so, sl, . . . , s,,,_1 satisfy (108), then the coefficients al, a2.... ,a,,, in (109) are uniquely determined by the first m relations; the subsequentrelations then determine s2o s2m+1, ... .

Thus, a real polynomial f (z) of degree n = 2m with A. # 0 can be givenuniquely" by 2m numbers so, sl, ... , 82.-1 satisfying (108). Whenn = 2m + 1, we have to add s_1 to these numbers.

We shall call then values so, s1i ... , (for n= 2m) or s_1, so, ... ,s,,,_1 (for n = 2m + 1) the Markov parameters of the polynomial f (z).These parameters may be regarded as the coordinates in an n-dimensionalspace of a point that represents the given polynomial f (z).
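The Markov parameters can be read off from (104) by ordinary power-series division of g by h in powers of 1/u, keeping track of the alternating sign convention of footnote 75. The sketch below is ours (pure Python, names are assumptions):

```python
def markov_parameters(coeffs):
    """Markov parameters of f(z) = a0 z^n + ... + an (coeffs highest first):
    returns (s_{-1}, [s0, ..., s_{2m-1}]) via the expansion (104) of g(u)/h(u),
    where g/h = sum_p e_p u^{-p} and s_{-1} = e_0, s_p = (-1)^p e_{p+1}."""
    rev = coeffs[::-1]
    h = [float(x) for x in rev[0::2][::-1]]     # h(u), highest power first
    g = [float(x) for x in rev[1::2][::-1]]     # g(u)
    m = len(h) - 1
    # G[t] = coefficient of u^(m-t) in g(u); pad in front when deg g = m - 1
    G = [0.0] * (len(h) - len(g)) + g
    G += [0.0] * (2 * m + 2 - len(G))
    e = []                                      # series coefficients e_0, e_1, ...
    for t in range(2 * m + 1):
        acc = G[t] - sum(h[j] * e[t - j] for j in range(1, min(t, m) + 1))
        e.append(acc / h[0])
    return e[0], [(-1) ** p * e[p + 1] for p in range(2 * m)]

# f(z) = z^2 + 3z + 2: h(u) = u + 2, g(u) = 3, and g/h = 3/u - 6/u^2 + ...
print(markov_parameters([1, 3, 2]))   # (0.0, [3.0, 6.0])
```

For the even-degree example s₋₁ = 0, as stated above; for the Hurwitz cubic z³ + 6z² + 11z + 6 the same routine gives s₋₁ = 1/6 > 0 with positive s₀, s₁.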

We shall now find out what conditions must be imposed on the Markov parameters in order that the corresponding polynomial be a Hurwitz polynomial. In this way we shall determine the domain of stability in the space of Markov parameters.

A Hurwitz polynomial is characterized by the conditions (94) and the additional condition (95) for n = 2m + 1. Introducing the function R(v) (see (105)), we write (94) as follows:

$$I_{-\infty}^{+\infty} R(v) = m, \qquad I_{-\infty}^{+\infty} vR(v) = m. \tag{110}$$

The additional condition (95) for n = 2m + 1 gives:

$$s_{-1} > 0.$$

Apart from the matrix S = ‖s_{i+k}‖₀^∞ we introduce the infinite Hankel matrix S⁽¹⁾ = ‖s_{i+k+1}‖₀^∞. Then, since by (106)

⁷⁷ To within a constant factor.


$$vR(v) = -s_{-1}v + s_0 + \frac{s_1}{v} + \frac{s_2}{v^2} + \cdots,$$

the following relation holds:

$$vR(v) \sim S^{(1)}. \tag{111}$$

The matrix S⁽¹⁾, like S, is of finite rank m, since the function vR(v), like R(v), has m poles. Therefore the forms

$$S_m(x, x) = \sum_{i,k=0}^{m-1} s_{i+k}x_ix_k, \qquad S_m^{(1)}(x, x) = \sum_{i,k=0}^{m-1} s_{i+k+1}x_ix_k$$

are of rank m. But by Theorem 9 (p. 190) the signatures of these forms, in virtue of (107) and (111), are equal to the indices (110) and hence also to m. Thus, the conditions (110) mean that the quadratic forms S_m(x, x) and S_m⁽¹⁾(x, x) are positive definite. Hence:

THEOREM 17: A real polynomial f(z) = h(z²) + zg(z²) of degree n = 2m or n = 2m + 1 is a Hurwitz polynomial if and only if⁷⁸

1. The quadratic forms

$$S_m(x, x) = \sum_{i,k=0}^{m-1} s_{i+k}x_ix_k, \qquad S_m^{(1)}(x, x) = \sum_{i,k=0}^{m-1} s_{i+k+1}x_ix_k \tag{112}$$

are positive definite; and

2. (For n = 2m + 1)

$$s_{-1} > 0. \tag{113}$$

Here s₋₁, s₀, s₁, …, s_{2m−1} are the coefficients of the expansion

$$\frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots.$$

⁷⁸ We do not mention the inequality Δₙ ≠ 0 expressly, because it follows automatically from the conditions of the theorem. For if f(z) is a Hurwitz polynomial, then it is known that Δₙ ≠ 0. But if the conditions 1., 2. are given, then the fact that the form S_m⁽¹⁾(x, x) is positive definite implies that

$$-I_{-\infty}^{+\infty}\,\frac{u\,g(u)}{h(u)} = I_{-\infty}^{+\infty} vR(v) = m,$$

and from this it follows that the fraction ug(u)/h(u) is in its lowest terms, which can be expressed by the inequality Δₙ ≠ 0.

In exactly the same way, it follows automatically from the conditions of the theorem that D_m = |s_{i+k}|₀^{m−1} ≠ 0, i.e., that the numbers s₀, s₁, …, s_{2m−1} and (for n = 2m + 1) s₋₁ are the Markov parameters of f(z).


We introduce a notation for the determinants

\[ D_p = |s_{i+k}|_0^{p-1}, \qquad D_p^{(1)} = |s_{i+k+1}|_0^{p-1} \qquad (p = 1, 2, \ldots, m). \tag{114} \]

Then condition 1. is equivalent to the system of determinantal inequalities
\[ D_1 = s_0 > 0, \quad D_2 = \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix} > 0, \quad \ldots, \quad D_m = \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} \\ s_1 & s_2 & \cdots & s_m \\ \vdots & \vdots & & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} \end{vmatrix} > 0, \]
\[ D_1^{(1)} = s_1 > 0, \quad D_2^{(1)} = \begin{vmatrix} s_1 & s_2 \\ s_2 & s_3 \end{vmatrix} > 0, \quad \ldots, \quad D_m^{(1)} = \begin{vmatrix} s_1 & s_2 & \cdots & s_m \\ s_2 & s_3 & \cdots & s_{m+1} \\ \vdots & \vdots & & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} \end{vmatrix} > 0. \tag{115} \]

If n = 2m, the inequalities (115) determine the domain of stability in the space of Markov parameters. If n = 2m + 1, we have to add the further inequality
\[ s_{-1} > 0. \tag{116} \]
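For n = 2m the criterion is a finite computation. The sketch below is not from the book (the function names are mine): it recovers the Markov parameters from the expansion g(u)/h(u) = s_0/u - s_1/u² + s_2/u³ - ⋯ by matching coefficients, and then checks the inequalities (115) in exact rational arithmetic.

```python
from fractions import Fraction

def det(M):
    """Exact determinant by cofactor expansion (fine for the small sizes used here)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def series_coeffs(h, g, count):
    """c_1, c_2, ... in g(u)/h(u) = c_1/u + c_2/u**2 + ...; h, g highest-first, deg g < deg h."""
    m, degg = len(h) - 1, len(g) - 1
    c = {}
    for k in range(1, count + 1):
        j = m - k  # match the coefficient of u**j in g(u) = (series) * h(u)
        lhs = Fraction(g[degg - j]) if 0 <= degg - j <= degg else Fraction(0)
        acc = sum(c[i] * h[m - j - i] for i in range(1, k) if 0 <= m - j - i <= m)
        c[k] = (lhs - acc) / h[0]
    return [c[k] for k in range(1, count + 1)]

def is_hurwitz_markov(h, g, m):
    """Inequalities (115) for n = 2m: all D_p and D_p^(1) positive."""
    cs = series_coeffs(h, g, 2 * m)
    s = [(-1) ** k * cs[k] for k in range(2 * m)]  # Markov parameters s_0 ... s_{2m-1}
    hank = lambda sh: [det([[s[i + k + sh] for k in range(p)] for i in range(p)])
                       for p in range(1, m + 1)]
    return all(d > 0 for d in hank(0)) and all(d > 0 for d in hank(1))

# f(z) = (z+1)(z+2)(z+3)(z+4): h(u) = u^2 + 35u + 24, g(u) = 10u + 50 -> Hurwitz
assert is_hurwitz_markov([1, 35, 24], [10, 50], 2)
# f(z) = (z-1)(z+2)(z+3)(z+4): h(u) = u^2 + 17u - 24, g(u) = 8u - 2 -> not Hurwitz
assert not is_hurwitz_markov([1, 17, -24], [8, -2], 2)
```

Here the Markov parameters of the first example come out as s_0 = 10, s_1 = 300, s_2 = 10260, s_3 = 351900, and all four determinants in (115) are positive.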

In the next section we shall find out what properties of S follow from the inequalities (115) and, in so doing, shall single out the special class of infinite Hankel matrices S that correspond to Hurwitz polynomials.

§ 16. Connection with the Problem of Moments

1. We begin by stating the following problem:

PROBLEM OF MOMENTS FOR THE POSITIVE AXIS 0 < v < ∞:⁷⁹ Given a sequence s_0, s_1, s_2, \ldots of real numbers, it is required to determine positive numbers
\[ \mu_1 > 0,\ \mu_2 > 0,\ \ldots,\ \mu_m > 0, \qquad 0 < v_1 < v_2 < \cdots < v_m \tag{117} \]
such that the following equations hold:
\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \qquad (p = 0, 1, 2, \ldots). \tag{118} \]

It is not difficult to see that the system (118) of equations is equivalent to the following expansion in a series of negative powers of u:

⁷⁹ This problem of moments ought to be called discrete in contrast to the general exponential problem of moments, in which the sums \(\sum_{j=1}^{m} \mu_j v_j^p\) are replaced by Stieltjes integrals \(\int_0^\infty v^p \, d\mu(v)\) (see [55]).


\[ \sum_{j=1}^{m} \frac{\mu_j}{u + v_j} = \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots. \tag{119} \]

In this case the infinite Hankel matrix S = \|s_{i+k}\|_0^\infty is of finite rank m, and by (117) in the irreducible proper fraction
\[ \frac{g(u)}{h(u)} = \sum_{j=1}^{m} \frac{\mu_j}{u + v_j} \tag{120} \]
(we choose the highest coefficients of h(u) and g(u) to be positive) the polynomials h(u) and g(u) form a positive pair (see (91) and (91')).

Therefore (see Theorem 14), our problem of moments has a solution if and only if the sequence s_0, s_1, s_2, \ldots determines by means of (119) and (120) a Hurwitz polynomial f(z) = h(z²) + zg(z²) of degree 2m.

The solution of the problem of moments is unique, because the positive numbers v_j and \mu_j (j = 1, 2, \ldots, m) are uniquely determined from the expansion (119).

Apart from the 'infinite' problem of moments (118) we also consider the 'finite' problem of moments given by the first 2m equations of (118):
\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \qquad (p = 0, 1, \ldots, 2m - 1). \tag{121} \]

These relations already determine the following expressions for the Hankel quadratic forms:
\[ \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k = \sum_{j=1}^{m} \mu_j \left( x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \right)^2, \]
\[ \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k = \sum_{j=1}^{m} \mu_j v_j \left( x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \right)^2. \tag{122} \]

Since the linear forms in the variables x_0, x_1, \ldots, x_{m-1}
\[ x_0 + x_1 v_j + \cdots + x_{m-1} v_j^{m-1} \qquad (j = 1, 2, \ldots, m) \]
are independent (their coefficients form a non-vanishing Vandermonde determinant), the quadratic forms (122) are positive definite. But then by Theorem 17 the numbers s_0, s_1, \ldots, s_{2m-1} are the Markov parameters of a certain Hurwitz polynomial f(z). They are the first 2m coefficients of the expansion (119). Together with the remaining coefficients s_{2m}, s_{2m+1}, \ldots they determine the infinite solvable problem of moments (118), which has the same solution as the finite problem (121).

Thus we have proved the following theorem :


THEOREM 18: 1) The finite problem of moments
\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \tag{123} \]
\[ (p = 0, 1, \ldots, 2m - 1;\ \mu_1 > 0, \ldots, \mu_m > 0;\ 0 < v_1 < v_2 < \cdots < v_m), \]
where the s_p are given real numbers and the v_j and \mu_j are unknown real numbers (p = 0, 1, \ldots, 2m - 1; j = 1, 2, \ldots, m), has a solution if and only if the quadratic forms
\[ \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \tag{124} \]
are positive definite, i.e., if the numbers s_0, s_1, \ldots, s_{2m-1} are the Markov parameters of some Hurwitz polynomial of degree 2m.

2) The infinite problem of moments
\[ s_p = \sum_{j=1}^{m} \mu_j v_j^p \tag{125} \]
\[ (p = 0, 1, 2, \ldots;\ \mu_1 > 0, \ldots, \mu_m > 0;\ 0 < v_1 < v_2 < \cdots < v_m), \]
where the s_p are given real numbers and the v_j and \mu_j are unknown real numbers (p = 0, 1, \ldots; j = 1, 2, \ldots, m), has a solution if and only if 1. the quadratic forms (124) are positive definite and 2. the infinite Hankel matrix S = \|s_{i+k}\|_0^\infty is of rank m, i.e., if the series
\[ \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots = \frac{g(u)}{h(u)} \tag{126} \]
determines a Hurwitz polynomial f(z) = h(z²) + zg(z²) of degree 2m.

3) The solution of the problem of moments, both the finite (123) and the infinite (125) problem, is always unique.
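The solvability criterion of part 1) is easy to exercise numerically. The sketch below (helper names are mine, not the book's) tests positive definiteness of the two Hankel forms (124) through their leading principal minors, in exact rational arithmetic, on a genuine moment sequence and on a sequence that is not one.

```python
from fractions import Fraction

def det(M):
    # exact determinant by cofactor expansion (small matrices only)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def finite_moment_problem_solvable(s, m):
    """Theorem 18, part 1): both Hankel forms (124) positive definite,
    checked via their leading principal minors (Sylvester's criterion)."""
    minors = lambda sh: [det([[Fraction(s[i + k + sh]) for k in range(p)]
                              for i in range(p)]) for p in range(1, m + 1)]
    return all(d > 0 for d in minors(0)) and all(d > 0 for d in minors(1))

# moments generated by mu = (1, 2), v = (1, 3):  s_p = 1*1**p + 2*3**p
s = [1 * 1 ** p + 2 * 3 ** p for p in range(4)]   # [3, 7, 19, 55]
assert finite_moment_problem_solvable(s, 2)
# a sequence of the same length that is not such a moment sequence:
assert not finite_moment_problem_solvable([1, 2, 5, 2], 2)
```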

2. We shall use this theorem in investigating the minors of an infinite Hankel matrix S = \|s_{i+k}\|_0^\infty of rank m corresponding to some Hurwitz polynomial, i.e., one for which the quadratic forms (124) are positive definite. In this case the generating numbers s_0, s_1, s_2, \ldots of S can be represented in the form (123), so that for an arbitrary minor of S of order h \le m we have:
\[ \left\| s_{i_p + k_q} \right\|_1^h = \left\| \sum_{j=1}^{m} \mu_j v_j^{i_p} \cdot v_j^{k_q} \right\|_1^h = \left\| \begin{matrix} \mu_1 v_1^{i_1} & \mu_2 v_2^{i_1} & \cdots & \mu_m v_m^{i_1} \\ \vdots & \vdots & & \vdots \\ \mu_1 v_1^{i_h} & \mu_2 v_2^{i_h} & \cdots & \mu_m v_m^{i_h} \end{matrix} \right\| \cdot \left\| \begin{matrix} v_1^{k_1} & \cdots & v_1^{k_h} \\ \vdots & & \vdots \\ v_m^{k_1} & \cdots & v_m^{k_h} \end{matrix} \right\| \]

and therefore, by the Binet-Cauchy formula,
\[ S\begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ k_1 & k_2 & \cdots & k_h \end{pmatrix} = \sum_{1 \le \alpha_1 < \alpha_2 < \cdots < \alpha_h \le m} \mu_{\alpha_1} \mu_{\alpha_2} \cdots \mu_{\alpha_h} \begin{vmatrix} v_{\alpha_1}^{i_1} & \cdots & v_{\alpha_h}^{i_1} \\ \vdots & & \vdots \\ v_{\alpha_1}^{i_h} & \cdots & v_{\alpha_h}^{i_h} \end{vmatrix} \begin{vmatrix} v_{\alpha_1}^{k_1} & \cdots & v_{\alpha_h}^{k_1} \\ \vdots & & \vdots \\ v_{\alpha_1}^{k_h} & \cdots & v_{\alpha_h}^{k_h} \end{vmatrix}. \tag{127} \]

But from the inequalities
\[ 0 < v_1 < v_2 < \cdots < v_m, \qquad i_1 < i_2 < \cdots < i_h, \qquad k_1 < k_2 < \cdots < k_h \]
it follows that the generalized Vandermonde determinants⁸⁰
\[ \begin{vmatrix} v_{\alpha_1}^{i_1} & \cdots & v_{\alpha_h}^{i_1} \\ \vdots & & \vdots \\ v_{\alpha_1}^{i_h} & \cdots & v_{\alpha_h}^{i_h} \end{vmatrix} > 0, \qquad \begin{vmatrix} v_{\alpha_1}^{k_1} & \cdots & v_{\alpha_h}^{k_1} \\ \vdots & & \vdots \\ v_{\alpha_1}^{k_h} & \cdots & v_{\alpha_h}^{k_h} \end{vmatrix} > 0 \]
are positive. Since the numbers \mu_j (j = 1, 2, \ldots, m) are positive, it therefore follows from (127) that
\[ S\begin{pmatrix} i_1 & i_2 & \cdots & i_h \\ k_1 & k_2 & \cdots & k_h \end{pmatrix} > 0 \qquad \left( 0 \le \begin{matrix} i_1 < i_2 < \cdots < i_h \\ k_1 < k_2 < \cdots < k_h \end{matrix};\ \ h = 1, 2, \ldots, m \right). \tag{128} \]

Conversely, if in an infinite Hankel matrix S = \|s_{i+k}\|_0^\infty of rank m all the minors of every order h \le m are positive, then the quadratic forms (124) are positive definite.

DEFINITION 4: An infinite matrix A = \|a_{ik}\|_0^\infty will be called totally positive of rank m if and only if all the minors of A of order h \le m are positive and all the minors of order h > m are zero.

The property of S that we have found can now be expressed in the following theorem:⁸¹

THEOREM 19: An infinite Hankel matrix S = \|s_{i+k}\|_0^\infty is totally positive of rank m if and only if 1) S is of rank m and 2) the quadratic forms
\[ \sum_{i,k=0}^{m-1} s_{i+k} x_i x_k, \qquad \sum_{i,k=0}^{m-1} s_{i+k+1} x_i x_k \]
are positive definite.

⁸⁰ See p. 99, Example 1.

⁸¹ See [173].
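Theorem 19 can be spot-checked on a finite section of S. The following sketch is my own construction (not from the book): it builds s_p = Σ μ_j v_j^p for m = 2 with positive μ_j and 0 < v_1 < v_2, and verifies that in a 4×4 Hankel section every minor of order ≤ 2 is positive while every minor of order 3 or 4 vanishes, as total positivity of rank 2 requires.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    # exact determinant by cofactor expansion
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

# s_p = sum of mu_j * v_j**p with mu = (1, 2), v = (1, 3): a rank-2 Hankel matrix
mu, v = (Fraction(1), Fraction(2)), (Fraction(1), Fraction(3))
s = [sum(w * x ** p for w, x in zip(mu, v)) for p in range(7)]
H = [[s[i + k] for k in range(4)] for i in range(4)]   # a finite section of S

def minor(rows, cols):
    return det([[H[i][j] for j in cols] for i in rows])

idx = range(4)
sel = lambda h: [(r, c) for r in combinations(idx, h) for c in combinations(idx, h)]
assert all(minor(r, c) > 0 for h in (1, 2) for r, c in sel(h))   # orders <= m: positive
assert all(minor(r, c) == 0 for h in (3, 4) for r, c in sel(h))  # orders  > m: zero
```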

Page 251: THE THEORY OF MATRICES - 上海交通大学数学系math.sjtu.edu.cn/faculty/tyaglov/courses/linear algebra/The_book_ad… · as possible, assuming only that the reader is acquainted

240 XV. THE PROBLEM OF ROUTII-lluRWITZ AND RELATED QUESTIONS

From this theorem and Theorem 17 we obtain :

THEOREM 20: A real polynomial f(z) of degree n is a Hurwitz polynomial if and only if the corresponding infinite Hankel matrix S = \|s_{i+k}\|_0^\infty is totally positive of rank m = [n/2] and if, in addition, s_{-1} > 0 when n is odd.

Here the elements s_0, s_1, s_2, \ldots of S and s_{-1} are determined by the expansion
\[ \frac{g(u)}{h(u)} = s_{-1} + \frac{s_0}{u} - \frac{s_1}{u^2} + \frac{s_2}{u^3} - \cdots, \tag{129} \]
where
\[ f(z) = h(z^2) + z g(z^2). \]

§ 17. Theorems of Markov and Chebyshev

1. In a notable memoir 'On functions obtained by converting series into continued fractions'⁸² Markov proved two theorems, the second of which had been established in 1892 by Chebyshev by other methods, and not in the same generality.⁸³

In this section we shall show that these theorems have an immediate bearing on the study of the domain of stability in the Markov parameters and shall give a comparatively simple proof (without reference to continued fractions) which is based on Theorem 19 of the preceding section.

In proceeding to state the first theorem, we quote the corresponding passage from the above-mentioned memoir of Markov:⁸⁴

On the basis of what has preceded it is not difficult to prove two remarkable theorems with which we conclude our paper.

One is concerned with the determinants⁸⁵
\[ \Delta_1, \Delta_2, \ldots, \Delta_m,\ \Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(m)} \]
and the other with the roots of the equation⁸⁶
\[ \psi_m(x) = 0. \]

⁸² Zap. Petersburg Akad. Nauk, Petersburg, 1894 [in Russian]; see also [38], pp. 78-105.

⁸³ This theorem was first published in Chebyshev's paper 'On the expansion in continued fractions of series in descending powers of the variable' [in Russian]. See [8], pp. 307-62.

⁸⁴ [38], p. 95, beginning with line 3 from below.

⁸⁵ In our notation, D_1, D_2, \ldots, D_m, D_1^{(1)}, D_2^{(1)}, \ldots, D_m^{(1)}. (See p. 236.)

⁸⁶ In our notation, h(-x) = 0.

Page 252: THE THEORY OF MATRICES - 上海交通大学数学系math.sjtu.edu.cn/faculty/tyaglov/courses/linear algebra/The_book_ad… · as possible, assuming only that the reader is acquainted

§ 17. THEOREMS OF MARKOV AND CIIEBYSIIEV 241

THEOREM ON DETERMINANTS: If we have for the numbers
\[ s_0, s_1, s_2, \ldots, s_{2m-2}, s_{2m-1} \]
two sets of values
\[ 1.\ \ s_0 = a_0,\ s_1 = a_1,\ s_2 = a_2,\ \ldots,\ s_{2m-2} = a_{2m-2},\ s_{2m-1} = a_{2m-1}; \]
\[ 2.\ \ s_0 = b_0,\ s_1 = b_1,\ s_2 = b_2,\ \ldots,\ s_{2m-2} = b_{2m-2},\ s_{2m-1} = b_{2m-1} \]
for which all the determinants
\[ \Delta_1 = s_0,\quad \Delta_2 = \begin{vmatrix} s_0 & s_1 \\ s_1 & s_2 \end{vmatrix},\quad \ldots,\quad \Delta_m = \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} \\ s_1 & s_2 & \cdots & s_m \\ \vdots & \vdots & & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} \end{vmatrix}, \]
\[ \Delta^{(1)} = s_1,\quad \Delta^{(2)} = \begin{vmatrix} s_1 & s_2 \\ s_2 & s_3 \end{vmatrix},\quad \ldots,\quad \Delta^{(m)} = \begin{vmatrix} s_1 & s_2 & \cdots & s_m \\ s_2 & s_3 & \cdots & s_{m+1} \\ \vdots & \vdots & & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} \end{vmatrix} \]
turn out to be positive numbers satisfying the inequalities
\[ a_0 \ge b_0,\ b_1 \ge a_1,\ a_2 \ge b_2,\ b_3 \ge a_3,\ \ldots,\ a_{2m-2} \ge b_{2m-2},\ b_{2m-1} \ge a_{2m-1}, \]
then our determinants
\[ \Delta_1, \Delta_2, \ldots, \Delta_m;\ \Delta^{(1)}, \Delta^{(2)}, \ldots, \Delta^{(m)} \]
must be positive for all values
\[ s_0, s_1, s_2, \ldots, s_{2m-1} \]
satisfying the inequalities
\[ a_0 \ge s_0 \ge b_0,\quad b_1 \ge s_1 \ge a_1,\quad a_2 \ge s_2 \ge b_2,\quad \ldots,\quad a_{2m-2} \ge s_{2m-2} \ge b_{2m-2},\quad b_{2m-1} \ge s_{2m-1} \ge a_{2m-1}. \]

Under these conditions we have
\[ \begin{vmatrix} a_0 & a_1 & \cdots & a_{k-1} \\ a_1 & a_2 & \cdots & a_k \\ \vdots & \vdots & & \vdots \\ a_{k-1} & a_k & \cdots & a_{2k-2} \end{vmatrix} \ge \begin{vmatrix} s_0 & s_1 & \cdots & s_{k-1} \\ s_1 & s_2 & \cdots & s_k \\ \vdots & \vdots & & \vdots \\ s_{k-1} & s_k & \cdots & s_{2k-2} \end{vmatrix} \ge \begin{vmatrix} b_0 & b_1 & \cdots & b_{k-1} \\ b_1 & b_2 & \cdots & b_k \\ \vdots & \vdots & & \vdots \\ b_{k-1} & b_k & \cdots & b_{2k-2} \end{vmatrix} \]
and
\[ \begin{vmatrix} b_1 & b_2 & \cdots & b_k \\ b_2 & b_3 & \cdots & b_{k+1} \\ \vdots & \vdots & & \vdots \\ b_k & b_{k+1} & \cdots & b_{2k-1} \end{vmatrix} \ge \begin{vmatrix} s_1 & s_2 & \cdots & s_k \\ s_2 & s_3 & \cdots & s_{k+1} \\ \vdots & \vdots & & \vdots \\ s_k & s_{k+1} & \cdots & s_{2k-1} \end{vmatrix} \ge \begin{vmatrix} a_1 & a_2 & \cdots & a_k \\ a_2 & a_3 & \cdots & a_{k+1} \\ \vdots & \vdots & & \vdots \\ a_k & a_{k+1} & \cdots & a_{2k-1} \end{vmatrix} \]
for k = 1, 2, \ldots, m.

In order to give another statement of this theorem in connection with the problem of stability, we introduce some concepts and notations.

The Markov parameters s_0, s_1, \ldots, s_{2m-1} (for n = 2m) or s_{-1}, s_0, s_1, \ldots, s_{2m-1} (for n = 2m + 1) will be regarded as the coordinates of some point P in an n-dimensional space. The domain of stability in this space will be denoted by G. The domain G is characterized by the inequalities (115) and (116) (p. 236).

We shall say that a point P = \{s_i\} 'precedes' a point \tilde P = \{\tilde s_i\} and shall write P \prec \tilde P if
\[ s_0 \le \tilde s_0,\quad \tilde s_1 \le s_1,\quad s_2 \le \tilde s_2,\quad \tilde s_3 \le s_3,\quad \ldots,\quad \tilde s_{2m-1} \le s_{2m-1} \tag{130} \]
and (for n = 2m + 1)
\[ s_{-1} \le \tilde s_{-1}, \]
and the sign < holds in at least one of these relations.

If only the relations (130) hold, without the last clause, then we shall write P \preceq \tilde P.

We shall say that a point Q lies 'between' P and R if P \preceq Q \preceq R.

To every point P there corresponds an infinite Hankel matrix of rank m: S = \|s_{i+k}\|_0^\infty. We shall denote this matrix by S_P.

Now we can state Markov's theorem in the following way:

THEOREM 21 (Markov): If two points P and R belong to the domain of stability G and if P precedes R, then every point Q between P and R also belongs to G, i.e.,

from P, R \in G, P \preceq Q \preceq R it follows that Q \in G.

Proof. From P \preceq Q \preceq R it follows that P and R can be connected by an arc of a curve
\[ s_i = (-1)^i \varphi_i(t) \qquad [\alpha \le t \le \gamma;\ i = 0, 1, \ldots, 2m-1 \text{ and (for } n = 2m+1)\ i = -1] \tag{131} \]
passing through Q such that: 1) the functions \varphi_i(t) are continuous, monotonic increasing, and differentiable when t varies from t = \alpha to t = \gamma; and 2) the values \alpha, \beta, \gamma\ (\alpha < \beta < \gamma) of t correspond to the points P, Q, R on the curve.

From the values (131) we form the infinite Hankel matrix S = S(t) = \|s_{i+k}(t)\|_0^\infty of rank m. We consider part of this matrix, namely the rectangular matrix
\[ \left\| \begin{matrix} s_0 & s_1 & \cdots & s_{m-1} & s_m \\ s_1 & s_2 & \cdots & s_m & s_{m+1} \\ \vdots & \vdots & & \vdots & \vdots \\ s_{m-1} & s_m & \cdots & s_{2m-2} & s_{2m-1} \end{matrix} \right\|. \tag{132} \]

By the conditions of the theorem, the matrix S(t) is totally positive of rank m for t = \alpha and t = \gamma, so that all the minors of (132) of order p = 1, 2, 3, \ldots, m are positive.

We shall now show that this property also holds for every intermediate value of t (\alpha < t < \gamma).

For p = 1, this is obvious. Let us prove the statement for the minors of order p, on the assumption that it is true for those of order p - 1. We consider an arbitrary minor of order p formed from successive rows and columns of (132):
\[ D_p^{(q)} = \begin{vmatrix} s_q & s_{q+1} & \cdots & s_{q+p-1} \\ s_{q+1} & s_{q+2} & \cdots & s_{q+p} \\ \vdots & \vdots & & \vdots \\ s_{q+p-1} & s_{q+p} & \cdots & s_{q+2p-2} \end{vmatrix} \qquad [q = 0, 1, \ldots, 2(m-p)+1]. \tag{133} \]

We compute the derivative of this minor:
\[ \frac{d}{dt} D_p^{(q)} = \sum_{i,k=0}^{p-1} \frac{\partial D_p^{(q)}}{\partial s_{q+i+k}} \frac{ds_{q+i+k}}{dt}. \tag{134} \]

Here the \partial D_p^{(q)}/\partial s_{q+i+k} (i, k = 0, 1, \ldots, p-1) are the algebraic complements (adjoints) of the elements of (133). Since by assumption all the minors of this determinant of order p - 1 are positive, we have
\[ (-1)^{i+k} \frac{\partial D_p^{(q)}}{\partial s_{q+i+k}} > 0 \qquad (i, k = 0, 1, \ldots, p-1). \tag{135} \]

On the other hand, we find from (131):
\[ (-1)^{q+i+k} \frac{ds_{q+i+k}}{dt} = \frac{d\varphi_{q+i+k}}{dt} \ge 0 \qquad (i, k = 0, 1, \ldots, p-1). \tag{136} \]

From (134), (135), and (136) it follows that
\[ (-1)^q \frac{dD_p^{(q)}}{dt} \ge 0 \qquad \begin{pmatrix} q = 0, 1, \ldots, 2(m-p)+1 \\ p = 1, 2, \ldots, m \\ \alpha \le t \le \gamma \end{pmatrix}. \tag{137} \]

Thus, when the argument increases from t = \alpha to t = \gamma, then every minor (133) with even q is a monotone non-decreasing function and with odd q is a monotone non-increasing function; but since the minor is positive for t = \alpha and t = \gamma, it is also positive for every intermediate value of t (\alpha < t < \gamma).

From the fact that the minors of (132) of order p - 1 and those of order p that are formed from successive rows and columns are positive, it now follows that all the minors of (132) of order p are positive.⁸⁷

What we have proved implies that for every t (\alpha \le t \le \gamma) the values s_0, s_1, \ldots, s_{2m-1} and (for n = 2m + 1) s_{-1} satisfy the inequalities (115) and (116), i.e., that for every t these values are the Markov parameters of a certain Hurwitz polynomial. In other words, the whole arc (131) and, in particular, the point Q lies in the domain of stability G.

This completes the proof of Markov's Theorem.
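Markov's theorem is easy to illustrate numerically for m = 2. In the sketch below (the helper names and the perturbation size are my own choices, not the book's), P is the Markov-parameter point of the Hurwitz polynomial (z+1)(z+2)(z+3)(z+4); R is a nearby point with P ≺ R, obtained by slightly increasing the even-indexed and decreasing the odd-indexed parameters; and the midpoint Q is confirmed to satisfy the inequalities (115).

```python
from fractions import Fraction

def in_G(s):
    """The inequalities (115) for m = 2: D_1, D_2, D_1^(1), D_2^(1) all positive."""
    return (s[0] > 0 and s[0] * s[2] - s[1] ** 2 > 0 and
            s[1] > 0 and s[1] * s[3] - s[2] ** 2 > 0)

# Markov parameters of the Hurwitz polynomial (z+1)(z+2)(z+3)(z+4):
P = [Fraction(x) for x in (10, 300, 10260, 351900)]
eps = Fraction(1, 2000)
# P 'precedes' R in the sense of (130): even-indexed entries grow, odd-indexed shrink.
R = [si * (1 + eps) if i % 2 == 0 else si * (1 - eps) for i, si in enumerate(P)]
Q = [(p + r) / 2 for p, r in zip(P, R)]   # a point 'between' P and R

assert in_G(P) and in_G(R)
assert in_G(Q)                             # as Theorem 21 predicts
```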

Note. Since we have proved that every point of the arc (131) belongs to G, the values (131) for every t (\alpha \le t \le \gamma) determine a totally positive matrix S(t) = \|s_{i+k}(t)\|_0^\infty of rank m. Therefore the inequalities (135), and consequently (137) as well, hold for every t (\alpha \le t \le \gamma), i.e., with increasing t every D_p^{(q)} increases for even q and decreases for odd q (q = 0, 1, 2, \ldots, 2(m-p)+1; p = 1, \ldots, m). In other words, from P \preceq Q \preceq R it follows that
\[ (-1)^q D_p^{(q)}(P) \le (-1)^q D_p^{(q)}(Q) \le (-1)^q D_p^{(q)}(R) \qquad (q = 0, 1, \ldots, 2(m-p)+1;\ p = 1, \ldots, m). \]

These inequalities for q = 0, 1 give Markov's inequalities (p. 241). We now come to the Chebyshev-Markov theorem mentioned at the beginning of this section. Again we quote from Markov's memoir:⁸⁸

⁸⁷ This follows from Fekete's determinant identity (see [17], pp. 306-7).

⁸⁸ See [38], p. 103, beginning with line 5.


THEOREM ON ROOTS: If the numbers
\[ a_0, a_1, a_2, \ldots, a_{2m-2}, a_{2m-1}, \]
\[ s_0, s_1, s_2, \ldots, s_{2m-2}, s_{2m-1}, \]
\[ b_0, b_1, b_2, \ldots, b_{2m-2}, b_{2m-1} \]
satisfy all the conditions of the preceding theorem,⁸⁹ then the equations
\[ \begin{vmatrix} a_0 & a_1 & \cdots & a_{m-1} & 1 \\ a_1 & a_2 & \cdots & a_m & x \\ a_2 & a_3 & \cdots & a_{m+1} & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ a_m & a_{m+1} & \cdots & a_{2m-1} & x^m \end{vmatrix} = 0, \qquad \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} & 1 \\ s_1 & s_2 & \cdots & s_m & x \\ s_2 & s_3 & \cdots & s_{m+1} & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} & x^m \end{vmatrix} = 0, \qquad \begin{vmatrix} b_0 & b_1 & \cdots & b_{m-1} & 1 \\ b_1 & b_2 & \cdots & b_m & x \\ b_2 & b_3 & \cdots & b_{m+1} & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ b_m & b_{m+1} & \cdots & b_{2m-1} & x^m \end{vmatrix} = 0 \]
of degree m in the unknown x do not have multiple or imaginary or negative roots.

And the roots of the second equation are larger than the corresponding roots of the first equation and smaller than the corresponding roots of the last equation.

Let us find out the connection of this theorem with the domain of stability in the space of the Markov parameters. Setting f(z) = h(z²) + zg(z²) and
\[ h(-v) = c_0 v^m + c_1 v^{m-1} + \cdots + c_m \qquad (c_0 \ne 0), \]
we obtain from the expansion (105)
\[ R(v) = -\frac{g(-v)}{h(-v)} = -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \cdots \]
the identity

⁸⁹ He refers to the preceding theorem, Markov's theorem on determinants (p. 241).


\[ -g(-v) = \left( -s_{-1} + \frac{s_0}{v} + \frac{s_1}{v^2} + \cdots \right) \left( c_0 v^m + c_1 v^{m-1} + \cdots + c_m \right). \]

Equating to zero the coefficients of the powers v^{-1}, v^{-2}, \ldots, v^{-m}, we find:
\[ \begin{aligned} s_0 c_m + s_1 c_{m-1} + \cdots + s_m c_0 &= 0, \\ s_1 c_m + s_2 c_{m-1} + \cdots + s_{m+1} c_0 &= 0, \\ \cdots\qquad\qquad & \\ s_{m-1} c_m + s_m c_{m-1} + \cdots + s_{2m-1} c_0 &= 0; \end{aligned} \tag{138} \]
to these relations we add the equation
\[ h(-v) = 0, \tag{139} \]
written as
\[ c_m + v c_{m-1} + \cdots + v^m c_0 = 0. \tag{139'} \]

Eliminating from (138) and (139') the coefficients c_0, c_1, \ldots, c_m, we represent the equation (139) in the form
\[ \begin{vmatrix} s_0 & s_1 & \cdots & s_{m-1} & 1 \\ s_1 & s_2 & \cdots & s_m & v \\ s_2 & s_3 & \cdots & s_{m+1} & v^2 \\ \vdots & \vdots & & \vdots & \vdots \\ s_m & s_{m+1} & \cdots & s_{2m-1} & v^m \end{vmatrix} = 0. \]

Thus, the algebraic equation in the Chebyshev-Markov theorem coincides with (139) and the inequalities imposed on s_0, s_1, \ldots, s_{2m-1} coincide with the inequalities (115) that determine the domain of stability in the space of the Markov parameters.

The Chebyshev-Markov theorem shows how the roots u_1 = -v_1, u_2 = -v_2, \ldots, u_m = -v_m of h(u) change when the corresponding Markov parameters s_0, s_1, \ldots, s_{2m-1} vary in the domain of stability.

The first part of the theorem states something we already know: When the inequalities (115) are satisfied, then all the roots u_1, u_2, \ldots, u_m of h(u) are simple, real, and negative.⁹⁰ We denote them as follows:
\[ u_1(P), u_2(P), \ldots, u_m(P), \]
where P is the corresponding point of G.

The second (fundamental) part of the Chebyshev-Markov theorem can be stated as follows:

⁹⁰ See Theorem 13 on p. 228.


THEOREM 22 (Chebyshev-Markov): If P and Q are two points of G and P 'precedes' Q,
\[ P \prec Q, \tag{140} \]
then⁹¹
\[ u_1(P) < u_1(Q),\quad u_2(P) < u_2(Q),\quad \ldots,\quad u_m(P) < u_m(Q). \tag{141} \]

Proof. The coefficients of h(u) can be expressed rationally in terms of the parameters s_0, s_1, \ldots, s_{2m-1}.⁹² Then
\[ h(u_i) = 0 \qquad (i = 1, 2, \ldots, m) \]
implies that:⁹³
\[ h'(u_i) \frac{\partial u_i}{\partial s_l} + \frac{\partial h(u_i)}{\partial s_l} = 0 \qquad (i = 1, 2, \ldots, m;\ l = 0, 1, \ldots, 2m-1). \tag{142} \]

On the other hand, when we differentiate the expansion
\[ \frac{g(u)}{h(u)} = \frac{s_0}{u} - \frac{s_1}{u^2} + \cdots \]
term by term with respect to s_l, we find:
\[ \frac{h(u) \dfrac{\partial g(u)}{\partial s_l} - g(u) \dfrac{\partial h(u)}{\partial s_l}}{h^2(u)} = \frac{(-1)^l}{u^{l+1}} + \cdots. \tag{143} \]

Multiplying both sides of this equation by \(\dfrac{h^2(u)}{u - u_i}\) and denoting the coefficient of u^l in this polynomial by C_{il}, we obtain:
\[ \frac{h(u)}{u - u_i} \frac{\partial g(u)}{\partial s_l} - \frac{g(u)}{u - u_i} \frac{\partial h(u)}{\partial s_l} = \left( \frac{(-1)^l}{u^{l+1}} + \cdots \right) \frac{h^2(u)}{u - u_i}. \tag{144} \]

Comparing the coefficients of 1/u (the residues) on the two sides of (144), we find:
\[ -g(u_i) \frac{\partial h(u_i)}{\partial s_l} = (-1)^l C_{il}, \]
which gives in conjunction with (142):
\[ \frac{\partial u_i}{\partial s_l} = \frac{(-1)^l C_{il}}{g(u_i)\, h'(u_i)}. \tag{145} \]

⁹¹ In other words, the roots u_1, u_2, \ldots, u_m increase with increasing s_0, s_2, \ldots, s_{2m-2} and with decreasing s_1, s_3, \ldots, s_{2m-1}.

⁹² For example, by the equations (138) if, for simplicity, we set c_0 = 1 in these equations.

⁹³ Here \(\dfrac{\partial h(u_i)}{\partial s_l} = \left[ \dfrac{\partial h(u)}{\partial s_l} \right]_{u=u_i}\).


Introducing the values
\[ \beta_i = \frac{g(u_i)}{h'(u_i)} \qquad (i = 1, 2, \ldots, m), \tag{146} \]
we obtain the formula of Chebyshev-Markov:
\[ \frac{\partial u_i}{\partial s_l} = \frac{(-1)^l C_{il}}{\beta_i \left[ h'(u_i) \right]^2} \qquad (i = 1, 2, \ldots, m;\ l = 0, 1, \ldots, 2m-1). \tag{147} \]

But in the domain of stability the values \beta_i (i = 1, 2, \ldots, m) are positive (see (90') on p. 226). The same can be said of the coefficients C_{il}. For
\[ \frac{h^2(u)}{u - u_i} = c_0^2 (u + v_1)^2 \cdots (u + v_{i-1})^2 (u + v_i) (u + v_{i+1})^2 \cdots (u + v_m)^2, \tag{148} \]
where
\[ v_j = -u_j > 0 \qquad (j = 1, 2, \ldots, m). \]
From (148) it is clear that all the coefficients C_{il} in the expansion of \(\dfrac{h^2(u)}{u - u_i}\) in powers of u are positive. Thus, we obtain from the Chebyshev-Markov formula:
\[ (-1)^l \frac{\partial u_i}{\partial s_l} > 0. \tag{149} \]

In the proof of Markov's theorem we have shown that any two points P \prec Q of G can be joined by an arc s_l = (-1)^l \varphi_l(t) (l = 0, 1, \ldots, 2m-1), where \varphi_l(t) is a monotonic increasing differentiable function of t (t varies within the limits \alpha and \beta (\alpha < \beta), and t = \alpha corresponds to P, t = \beta to Q). Then along this arc we have, by (149):⁹⁴
\[ \frac{du_i}{dt} = \sum_{l=0}^{2m-1} \frac{\partial u_i}{\partial s_l} \frac{ds_l}{dt} \ge 0, \qquad \frac{du_i}{dt} \not\equiv 0 \qquad (\alpha \le t \le \beta). \tag{150} \]

Hence by integrating we obtain:
\[ u_i(P) = u_i(t)\big|_{t=\alpha} < u_i(t)\big|_{t=\beta} = u_i(Q) \qquad (i = 1, 2, \ldots, m). \]

This completes the proof of the Chebyshev-Markov theorem.
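The monotone dependence (141) can be observed numerically for m = 2. The sketch below (my own helper names; floating-point arithmetic) expands the bordered determinant (139) along its last column to get a quadratic in v, takes P to be the Markov-parameter point of (z+1)(z+2)(z+3)(z+4), increases only s_0 to get a point Q with P ≺ Q, and checks that every v_i decreases, i.e., every root u_i = -v_i of h(u) increases.

```python
import math

def h_roots_v(s):
    """For m = 2, expanding (139) along its last column gives the equation
    M13 - M23*v + M33*v**2 = 0 in v, the M's being 2x2 minors of the s-array."""
    M13 = s[1] * s[3] - s[2] ** 2
    M23 = s[0] * s[3] - s[1] * s[2]
    M33 = s[0] * s[2] - s[1] ** 2
    d = math.sqrt(M23 ** 2 - 4 * M33 * M13)
    return sorted(((M23 - d) / (2 * M33), (M23 + d) / (2 * M33)))

P = (10.0, 300.0, 10260.0, 351900.0)   # parameters of (z+1)(z+2)(z+3)(z+4)
Q = (10.5, 300.0, 10260.0, 351900.0)   # P 'precedes' Q: only s_0 was increased
vP, vQ = h_roots_v(P), h_roots_v(Q)
# Since u_i = -v_i, the assertion u_i(P) < u_i(Q) of (141) reads v_i(Q) < v_i(P):
assert all(vq < vp for vp, vq in zip(vP, vQ))
```

For P the quadratic reduces to v² - 35v + 24 = 0, i.e., to h(-v) = 0, as the elimination above requires.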

§ 18. The Generalized Routh-Hurwitz Problem

1. In this section we shall give a rule to determine the number of roots in theright half-plane of a polynomial f (z) with complex coefficients.

⁹⁴ Since \((-1)^l \dfrac{ds_l}{dt} = \dfrac{d\varphi_l}{dt} \ge 0\) (\alpha \le t \le \beta), and for at least one l there exist values of t for which \((-1)^l \dfrac{ds_l}{dt} > 0\).


Suppose that
\[ f(iz) = b_0 z^n + b_1 z^{n-1} + \cdots + b_n + i \left( a_0 z^n + a_1 z^{n-1} + \cdots + a_n \right), \tag{151} \]
where a_0, a_1, \ldots, a_n, b_0, b_1, \ldots, b_n are real numbers. If the degree of f(z) is n, then b_0 + i a_0 \ne 0. Without loss of generality we may assume that a_0 \ne 0 (otherwise we could replace f(z) by if(z)).

We shall assume that the real polynomials
\[ a_0 z^n + a_1 z^{n-1} + \cdots + a_n \quad \text{and} \quad b_0 z^n + b_1 z^{n-1} + \cdots + b_n \tag{152} \]
are co-prime, i.e., that their resultant does not vanish:⁹⁵
\[ \nabla_{2n} = \begin{vmatrix} a_0 & a_1 & \cdots & a_n & 0 & \cdots & 0 \\ b_0 & b_1 & \cdots & b_n & 0 & \cdots & 0 \\ 0 & a_0 & \cdots & a_{n-1} & a_n & \cdots & 0 \\ 0 & b_0 & \cdots & b_{n-1} & b_n & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \end{vmatrix} \ne 0. \tag{153} \]

Hence it follows, in particular, that the polynomials (152) have no roots in common and that, therefore, f(z) has no roots on the imaginary axis.

We denote by k the number of roots of f(z) with positive real parts. By considering the domain in the right half-plane bounded by the imaginary axis and the semi-circle of radius R (R \to \infty) and by repeating verbatim the arguments used on p. 177 for the real polynomial f(z), we obtain the formula for the increment of arg f(z) along the imaginary axis
\[ \Delta_{-\infty}^{+\infty} \arg f(z) = (n - 2k)\pi. \tag{154} \]

Hence we obtain, by (151), in view of a_0 \ne 0:
\[ I_{-\infty}^{+\infty} \frac{b_0 z^n + b_1 z^{n-1} + \cdots + b_n}{a_0 z^n + a_1 z^{n-1} + \cdots + a_n} = n - 2k. \tag{155} \]

Using Theorem 10 of § 11 (p. 215), we now obtain:
\[ k = V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2n}), \tag{156} \]
where

⁹⁵ \nabla_{2n} is a determinant of order 2n.


\[ \nabla_{2p} = \begin{vmatrix} a_0 & a_1 & \cdots & a_{2p-1} \\ b_0 & b_1 & \cdots & b_{2p-1} \\ 0 & a_0 & \cdots & a_{2p-2} \\ 0 & b_0 & \cdots & b_{2p-2} \\ \vdots & \vdots & & \vdots \end{vmatrix} \qquad (p = 1, 2, \ldots, n;\ a_k = b_k = 0 \text{ for } k > n). \tag{157} \]

We have thus arrived at the following theorem.

THEOREM 23: If a complex polynomial f(z) is given for which
\[ f(iz) = b_0 z^n + b_1 z^{n-1} + \cdots + b_n + i \left( a_0 z^n + a_1 z^{n-1} + \cdots + a_n \right) \qquad (a_0 \ne 0) \]
and if the polynomials a_0 z^n + \cdots + a_n and b_0 z^n + \cdots + b_n are co-prime (\nabla_{2n} \ne 0), then the number of roots of f(z) in the right half-plane is determined by the formulas (156) and (157).

Moreover, if some of the determinants (157) vanish, then for each group of successive zeros
\[ (\nabla_{2h} \ne 0) \qquad \nabla_{2h+2} = \cdots = \nabla_{2h+2p} = 0 \qquad (\nabla_{2h+2p+2} \ne 0) \tag{158} \]
in the calculation of V(1, \nabla_2, \nabla_4, \ldots, \nabla_{2n}) we must set:
\[ \operatorname{sign} \nabla_{2h+2j} = (-1)^{\frac{j(j-1)}{2}} \operatorname{sign} \nabla_{2h} \qquad (j = 1, 2, \ldots, p) \tag{159} \]
or, what is the same,
\[ V(\nabla_{2h}, \nabla_{2h+2}, \ldots, \nabla_{2h+2p}, \nabla_{2h+2p+2}) = \begin{cases} \dfrac{p+1}{2} & \text{for odd } p, \\[1mm] \dfrac{p+1+\varepsilon}{2} & \text{for even } p, \end{cases} \qquad \text{where } \varepsilon = (-1)^{\frac{p}{2}} \operatorname{sign} \frac{\nabla_{2h+2p+2}}{\nabla_{2h}}. \]

We leave it to the reader to verify that in the special case where f(z) is a real polynomial we can obtain the Routh-Hurwitz theorem (see § 6) from Theorem 23.⁹⁶

In conclusion, we mention that in this chapter we have dealt with the application of quadratic forms (in particular, Hankel forms) to one problem of the disposition of the roots of a polynomial in the complex plane. Quadratic and hermitian forms also have interesting applications to other problems of the disposition of roots. We refer the reader who is interested in these questions to the survey, already quoted, of M. G. Krein and M. A. Naimark, 'The method of symmetric and hermitian forms in the theory of separation of roots of algebraic equations' (Kharkov, 1936).

⁹⁶ Suitable algorithms for the solution of the generalized Routh-Hurwitz problem can be found in the monograph [41] and in the paper [39]. See also [7] and [37].
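As a sketch of Theorem 23 in the generic case ∇_{2p} ≠ 0 (all helper names are mine, not the book's), take f(z) = (z - 1 - i)(z + 2 - i), which has exactly one root in the right half-plane. Its leading coefficient is real, so a_0 = 0, and following the text we first pass to if(z); then (if)(iz) = (1 - z) + i(-z² + 2z - 3), giving a = (-1, 2, -3) and b = (0, -1, 1).

```python
from fractions import Fraction

def det(M):
    # exact determinant by cofactor expansion (small matrices only)
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([r[:j] + r[j + 1:] for r in M[1:]])
               for j in range(len(M)))

def nabla(a, b, p, n):
    """The determinant (157) of order 2p built from a_0..a_n, b_0..b_n."""
    coef = lambda c, j: Fraction(c[j]) if 0 <= j <= n else Fraction(0)
    rows = []
    for shift in range(p):
        rows.append([coef(a, j - shift) for j in range(2 * p)])
        rows.append([coef(b, j - shift) for j in range(2 * p)])
    return det(rows)

def sign_changes(seq):
    signs = [x for x in seq if x != 0]
    return sum(1 for u, w in zip(signs, signs[1:]) if u * w < 0)

# (i*f)(iz) = b(z) + i*a(z) for f(z) = (z - 1 - i)(z + 2 - i):
a = [-1, 2, -3]      # a_0 z^2 + a_1 z + a_2
b = [0, -1, 1]
n = 2
nablas = [nabla(a, b, p, n) for p in range(1, n + 1)]
k = sign_changes([Fraction(1)] + nablas)   # formula (156)
assert nablas == [1, -2] and k == 1        # one root in the right half-plane
```

Here ∇_2 = 1 and ∇_4 = -2, so the sequence 1, ∇_2, ∇_4 shows one sign change, matching the single right half-plane root 1 + i.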


BIBLIOGRAPHY

Items in the Russian language are indicated by *

PART A. Textbooks, Monographs, and Surveys

[1] ACHIESER (Akhiezer), N. I., Theory of Approximation. New York: Ungar, 1956. [Translated from the Russian.]
[2] AITKEN, A. C., Determinants and matrices. 9th ed., Edinburgh: Oliver and Boyd, 1956.
[3] BELLMAN, R., Stability Theory of Differential Equations. New York: McGraw-Hill, 1953.
[4] BERNSTEIN, S. N., Theory of Probability. 4th ed., Moscow: Gostekhizdat, 1946.
[5] BODEWIG, E., Matrix Calculus. 2nd ed., Amsterdam: North-Holland, 1959.
[6] CAHEN, O., Éléments du calcul matriciel. Paris: Dunod, 1955.
[7] CHEBOTAREV, N. G., and MEIMAN, N. N., The problem of Routh-Hurwitz for polynomials and integral functions. Trudy Mat. Inst. Steklov., vol. 26 (1949).
[8] CHEBYSHEV, P. L., Complete collected works, vol. III. Moscow: Izd. AN SSSR, 1948.
*[9] CHETAEV, N. G., Stability of motion. Moscow: Gostekhizdat, 1946.
[10] COLLATZ, L., Eigenwertaufgaben mit technischen Anwendungen. Leipzig: Akad. Verlags., 1949.
[11] COLLATZ, L., Eigenwertprobleme und ihre numerische Behandlung. New York: Chelsea, 1948.
[12] COURANT, R. and HILBERT, D., Methods of Mathematical Physics, vol. I. Trans. and revised from the German original. New York: Interscience, 1953.
[13] ERUGIN, N. P., The method of Lappo-Danilevskii in the theory of linear differential equations. Leningrad: Leningrad University, 1956.
[14] FADDEEV, D. K. and SOMINSKII, I. S., Problems in higher algebra. 2nd ed., Moscow, 1949; 5th ed., Moscow: Gostekhizdat, 1954.
[15] FADDEEVA, V. N., Computational methods of linear algebra. New York: Dover Publications, 1959. [Translated from the Russian.]
[16] FRAZER, R. A., DUNCAN, W. J., and COLLAR, A., Elementary Matrices and Some Applications to Dynamics and Differential Equations. Cambridge: Cambridge University Press, 1938.
[17] GANTMACHER (Gantmakher), F. R. and KREIN, M. G., Oscillation matrices and kernels and small vibrations of dynamical systems. 2nd ed., Moscow: Gostekhizdat, 1950. [A German translation is in preparation.]
[18] GRÖBNER, W., Matrizenrechnung. Munich: Oldenbourg, 1956.
[19] HAHN, W., Theorie und Anwendung der direkten Methode von Lyapunov (Ergebnisse der Mathematik, Neue Folge, Heft 22). Berlin: Springer, 1959. [Contains an extensive bibliography.]


[20] INCE, E. L., Ordinary Differential Equations. New York: Dover, 1948.
[21] JUNG, H., Matrizen und Determinanten. Eine Einführung. Leipzig, 1953.
[22] KLEIN, F., Vorlesungen über höhere Geometrie. 3rd ed., New York: Chelsea, 1949.
[23] KOWALEWSKI, G., Einführung in die Determinantentheorie. 3rd ed., New York: Chelsea, 1949.
*[24] KREIN, M. G., Fundamental propositions in the theory of λ-zone stability of a canonical system of linear differential equations with periodic coefficients. Moscow: Moscow Academy, 1955.
*[25] KREIN, M. G. and NAIMARK, M. A., The method of symmetric and hermitian forms in the theory of separation of roots of algebraic equations. Kharkov: ONTI, 1936.
*[26] KREIN, M. G. and RUTMAN, M. A., Linear operators leaving a cone in a Banach space invariant. Uspehi Mat. Nauk, vol. 3, no. 1 (1948).
*[27] KUDRYAVCHEV, L. D., On some mathematical problems in the theory of electrical networks. Uspehi Mat. Nauk, vol. 3, no. 4 (1948).
*[28] LAPPO-DANILEVSKII, I. A., Theory of functions of matrices and systems of linear differential equations. Moscow, 1934.
[29] LAPPO-DANILEVSKII, I. A., Mémoires sur la théorie des systèmes des équations différentielles linéaires. 3 vols., Trudy Mat. Inst. Steklov., vols. 6-8 (1934-1936). New York: Chelsea, 1953.
[30] LEFSCHETZ, S., Differential Equations: Geometric Theory. New York: Interscience, 1957.
[31] LICHNEROWICZ, A., Algèbre et analyse linéaires. 2nd ed., Paris: Masson, 1956.
[32] LYAPUNOV (Liapounoff), A. M., Le Problème général de la stabilité du mouvement (Annals of Mathematics Studies, No. 17). Princeton: Princeton Univ. Press, 1949.
[33] MACDUFFEE, C. C., The Theory of Matrices. New York: Chelsea, 1946.
[34] MACDUFFEE, C. C., Vectors and matrices. La Salle: Open Court, 1943.
*[35] MALKIN, I. G., The method of Lyapunov and Poincaré in the theory of non-linear oscillations. Moscow: Gostekhizdat, 1949.
[36] MALKIN, I. G., Theory of stability of motion. Moscow: Gostekhizdat, 1952. [A German translation is in preparation.]
[37] MARDEN, M., The geometry of the zeros of a polynomial in a complex variable (Mathematical Surveys, No. 3). New York: Amer. Math. Society, 1949.
*[38] MARKOV, A. A., Collected works. Moscow, 1948.
*[39] MEIMAN, N. N., Some problems in the disposition of roots of polynomials. Uspehi Mat. Nauk, vol. 4 (1949).
[40] MIRSKY, L., An Introduction to Linear Algebra. Oxford: Oxford University Press, 1955.
*[41] NAIMARK, Y. I., Stability of linearized systems. Leningrad: Leningrad Aeronautical Engineering Academy, 1949.
[42] PARODI, M., Sur quelques propriétés des valeurs caractéristiques des matrices carrées (Mémorial des Sciences Mathématiques, vol. 118). Paris: Gauthier-Villars, 1952.
[43] PERLIS, S., Theory of Matrices. Cambridge (Mass.): Addison-Wesley, 1952.
[44] PICKERT, G., Normalformen von Matrizen (Enz. Math. Wiss., Band I, Teil B, Heft 3, Teil I). Leipzig: Teubner, 1953.
*[45] POTAPOV, V. P., The multiplicative structure of J-inextensible matrix functions. Trudy Moscow Mat. Soc., vol. 4 (1955).


[46] ROMANOVSKII, V. I., Discrete Markov chains. Moscow: Gostekhizdat, 1948.
[47] ROUTH, E. J., A treatise on the stability of a given state of motion. London: Macmillan, 1877.
[48] ROUTH, E. J., The advanced part of a Treatise on the Dynamics of a Rigid Body. 6th ed., London: Macmillan, 1905; repr., New York: Dover, 1959.
[49] SCHLESINGER, L., Vorlesungen über lineare Differentialgleichungen. Berlin, 1908.
[50] SCHLESINGER, L., Einführung in die Theorie der gewöhnlichen Differentialgleichungen auf funktionentheoretischer Grundlage. Berlin, 1922.
[51] SCHMEIDLER, W., Vorträge über Determinanten und Matrizen mit Anwendungen in Physik und Technik. Berlin: Akademie-Verlag, 1949.
[52] SCHREIER, O. and SPERNER, E., Vorlesungen über Matrizen. Leipzig: Teubner, 1932. [A slightly revised version of this book appears as Chapter V of [53].]
[53] SCHREIER, O. and SPERNER, E., Introduction to Modern Algebra and Matrix Theory. New York: Chelsea, 1958.
[54] SCHWERDTFEGER, H., Introduction to Linear Algebra and the Theory of Matrices. Groningen: Noordhoff, 1950.
[55] SHOHAT, J. A. and TAMARKIN, J. D., The problem of moments (Mathematical Surveys, No. 1). New York: Amer. Math. Society, 1943.
[56] SMIRNOW, W. I. (Smirnov, V. I.), Lehrgang der höheren Mathematik, Vol. III. Berlin, 1956. [This is a translation of the 13th Russian edition.]
[57] SPECHT, W., Algebraische Gleichungen mit reellen oder komplexen Koeffizienten (Enz. Math. Wiss., Band I, Teil B, Heft 3, Teil II). Stuttgart: Teubner, 1958.
[58] STIELTJES, T. J., Oeuvres Complètes. 2 vols., Groningen: Noordhoff.
[59] STOLL, R. R., Linear Algebra and Matrix Theory. New York: McGraw-Hill, 1952.
[60] THRALL, R. M. and TORNHEIM, L., Vector spaces and matrices. New York: Wiley, 1957.
[61] TURNBULL, H. W., The Theory of Determinants, Matrices and Invariants. London: Blackie, 1950.
[62] TURNBULL, H. W. and AITKEN, A. C., An Introduction to the Theory of Canonical Matrices. London: Blackie, 1932.
[63] VOLTERRA, V. et HOSTINSKY, B., Opérations infinitésimales linéaires. Paris: Gauthier-Villars, 1938.
[64] WEDDERBURN, J. H. M., Lectures on matrices. New York: Amer. Math. Society, 1934.
[65] WEYL, H., Mathematische Analyse des Raumproblems. Berlin, 1923. [A reprint is in preparation: Chelsea, 1960.]
[66] WINTNER, A., Spektraltheorie der unendlichen Matrizen. Leipzig, 1929.
[67] ZURMÜHL, R., Matrizen. Berlin, 1950.

PART B. Papers

[101] AFRIAT, S., Composite matrices, Quart. J. Math., vol. 5, pp. 81-89 (1954).
[102] AIZERMAN (Aisermann), M. A., On the computation of non-linear functions of several variables in the investigation of the stability of an automatic regulating system, Avtomat. i Telemeh., vol. 8 (1947).
[103] AISERMANN, M. A. and F. R. GANTMACHER, Determination of stability by linear approximation of a periodic solution of a system of differential equations with discontinuous right-hand sides, Quart. J. Mech. Appl. Math., vol. 11, pp. 385-98 (1958).

[104] AITKEN, A. C., Studies in practical mathematics. The evaluation, with applications, of a certain triple product matrix, Proc. Roy. Soc. Edinburgh, vol. 57 (1936-37).
[105] AMIR-MOEZ, A. R., Extreme properties of eigenvalues of a hermitian transformation and singular values of the sum and product of linear transformations, Duke Math. J., vol. 23, pp. 463-76 (1956).
[106] ARTASHENKOV, P. V., Determination of the arbitrariness in the choice of a matrix reducing a system of linear differential equations to a system with constant coefficients, Vestnik Leningrad. Univ., Ser. Mat., Phys. i Chim., vol. 2, pp. 17-29 (1953).
[107] ARZHANYKH, I. S., Extension of Krylov's method to polynomial matrices, Dokl. Akad. Nauk SSSR, vol. 81, pp. 749-52 (1951).
[108] AZBELEV, N. and R. VINOGRAD, The process of successive approximations for the computation of eigenvalues and eigenvectors, Dokl. Akad. Nauk, vol. 83, pp. 173-74 (1952).
[109] BAKER, H. F., On the integration of linear differential equations, Proc. London Math. Soc., vol. 35, pp. 333-78 (1903).
[110] BARANKIN, E. W., Bounds for characteristic roots of a matrix, Bull. Amer. Math. Soc., vol. 51, pp. 767-70 (1945).
[111] BARTSCH, H., Abschätzungen für die kleinste charakteristische Zahl einer positiv-definiten hermitischen Matrix, Z. Angew. Math. Mech., vol. 34, pp. 72-74 (1954).
[112] BELLMAN, R., Notes on matrix theory, Amer. Math. Monthly, vol. 60, pp. 173-75 (1953); vol. 62, pp. 172-73, 571-72, 647-48 (1955); vol. 64, pp. 189-91 (1957).
[113] BELLMAN, R. and A. HOFFMAN, On a theorem of Ostrowski, Arch. Math., vol. 5, pp. 123-27 (1954).
[114] BENDAT, J. and S. SILVERMAN, Monotone and convex operator functions, Trans. Amer. Math. Soc., vol. 79, pp. 58-71 (1955).
[115] BERGE, C., Sur une propriété des matrices doublement stochastiques, C. R. Acad. Sci. Paris, vol. 241, pp. 269-71 (1955).
[116] BIRKHOFF, G., On product integration, J. Math. Phys., vol. 16, pp. 104-32 (1937).
[117] BIRKHOFF, G. D., Equivalent singular points of ordinary linear differential equations, Math. Ann., vol. 74, pp. 134-39 (1913).
[118] BOTT, R. and R. DUFFIN, On the algebra of networks, Trans. Amer. Math. Soc., vol. 74, pp. 99-109 (1953).
[119] BRAUER, A., Limits for the characteristic roots of a matrix, Duke Math. J., vol. 13, pp. 387-95 (1946); vol. 14, pp. 21-26 (1947); vol. 15, pp. 871-77 (1948); vol. 19, pp. 73-91, 553-62 (1952); vol. 22, pp. 387-95 (1955).
[120] Über die Lage der charakteristischen Wurzeln einer Matrix, J. Reine Angew. Math., vol. 192, pp. 113-16 (1953).
[121] Bounds for the ratios of the coordinates of the characteristic vectors of a matrix, Proc. Nat. Acad. Sci. U.S.A., vol. 41, pp. 162-64 (1955).
[122] The theorems of Ledermann and Ostrowski on positive matrices, Duke Math. J., vol. 24, pp. 265-74 (1957).
[123] BRENNER, J., Bounds for determinants, Proc. Nat. Acad. Sci. U.S.A., vol. 41, pp. 452-54 (1954); Proc. Amer. Math. Soc., vol. 5, pp. 631-34 (1954); vol. 8, pp. 532-34 (1957); C. R. Acad. Sci. Paris, vol. 238, pp. 555-56 (1954).
[124] BRUIJN, N., Inequalities concerning minors and eigenvalues, Nieuw Arch. Wisk., vol. 4, pp. 18-35 (1956).
[125] BRUIJN, N. and G. SZEKERES, On some exponential and polar representations of matrices, Nieuw Arch. Wisk., vol. 3, pp. 20-32 (1955).

*[126] BULGAKOV, B. V., The splitting of rectangular matrices, Dokl. Akad. Nauk SSSR, vol. 85, pp. 21-24 (1952).
[127] CAYLEY, A., A memoir on the theory of matrices, Phil. Trans. London, vol. 148, pp. 17-37 (1858); Coll. Works, vol. 2, pp. 475-96.
[128] COLLATZ, L., Einschliessungssatz für die charakteristischen Zahlen von Matrizen, Math. Z., vol. 48, pp. 221-26 (1942).
[129] Über monotone Systeme linearer Ungleichungen, J. Reine Angew. Math., vol. 194, pp. 193-94 (1955).
[130] CREMER, L., Die Verringerung der Zahl der Stabilitätskriterien bei Voraussetzung positiver Koeffizienten der charakteristischen Gleichung, Z. Angew. Math. Mech., vol. 33, pp. 222-27 (1953).
*[131] DANILEVSKII, A. M., On the numerical solution of the secular equation, Mat. Sb., vol. 2, pp. 169-72 (1937).
[132] DILIBERTO, S., On systems of ordinary differential equations. In: Contributions to the Theory of Non-linear Oscillations, vol. 1, edited by S. Lefschetz (Annals of Mathematics Studies, No. 20). Princeton: Princeton Univ. Press (1950), pp. 1-38.
[133] DMITRIEV, N. A. and E. B. DYNKIN, On the characteristic roots of stochastic matrices, Dokl. Akad. Nauk SSSR, vol. 49, pp. 159-62 (1945).
*[133a] Characteristic roots of stochastic matrices, Izv. Akad. Nauk, Ser. Fiz.-Mat., vol. 10, pp. 167-94 (1946).
[134] DOBSCH, O., Matrixfunktionen beschränkter Schwankung, Math. Z., vol. 43, pp. 353-88 (1937).
*[135] DONSKAYA, L. I., Construction of the solution of a linear system in the neighborhood of a regular singularity in special cases, Vestnik Leningrad. Univ., vol. 6 (1952).
*[136] On the structure of the solution of a system of linear differential equations in the neighbourhood of a regular singularity, Vestnik Leningrad. Univ., vol. 8, pp. 55-64 (1954).
*[137] DUBNOV, Y. S., On simultaneous invariants of a system of affinors, Trans. Math. Congress in Moscow 1927, pp. 236-37.
*[138] On doubly symmetric orthogonal matrices, Bull. Ass. Inst. Univ. Moscow, pp. 33-35 (1927).
*[139] On Dirac's matrices, Uch. Zap. Univ. Moscow, vol. 2:2, pp. 43-48 (1934).
*[140] DUBNOV, Y. S. and V. K. IVANOV, On the reduction of the degree of affinor polynomials, Dokl. Akad. Nauk SSSR, vol. 41, pp. 99-102 (1943).
[141] DUNCAN, W., Reciprocation of triply-partitioned matrices, J. Roy. Aero. Soc., vol. 60, pp. 131-32 (1956).
[142] EGERVARY, E., On a lemma of Stieltjes on matrices, Acta Sci. Math., vol. 15, pp. 99-103 (1954).
[143] On hypermatrices whose blocks are commutable in pairs and their application in lattice-dynamics, Acta Sci. Math., vol. 15, pp. 211-22 (1954).
[144] EPSTEIN, M. and H. FLANDERS, On the reduction of a matrix to diagonal form, Amer. Math. Monthly, vol. 62, pp. 168-71 (1955).
*[145] ERSHOV, A. P., On a method of inverting matrices, Dokl. Akad. Nauk SSSR, vol. 100, pp. 209-11 (1955).
[146] ERUGIN, N. P., Sur la substitution exposante pour quelques systèmes irréguliers, Mat. Sb., vol. 42, pp. 745-53 (1935).
*[147] Exponential substitutions of an irregular system of linear differential equations, Dokl. Akad. Nauk SSSR, vol. 17, pp. 235-36 (1935).

[148] On Riemann's problem for a Gaussian system, Uch. Zap. Ped. Inst., vol. 28, pp. 293-304 (1939).
[149] FADDEEV, D. K., On the transformation of the secular equation of a matrix, Trans. Inst. Eng. Constr., vol. 4, pp. 78-86 (1937).
[150] FAEDO, S., Un nuovo problema di stabilità per le equazioni algebriche a coefficienti reali, Ann. Scuola Norm. Sup. Pisa, vol. 7, pp. 53-63 (1953).
*[151] FAGE, M. K., Generalization of Hadamard's determinant inequality, Dokl. Akad. Nauk SSSR, vol. 54, pp. 765-68 (1946).
[152] On symmetrizable matrices, Uspehi Mat. Nauk, vol. 6, no. 3, pp. 153-56 (1951).
[153] FAN, K., On a theorem of Weyl concerning eigenvalues of linear transformations, Proc. Nat. Acad. Sci. U.S.A., vol. 35, pp. 652-55 (1949); vol. 36, pp. 31-35 (1950).
[154] Maximum properties and inequalities for the eigenvalues of completely continuous operators, Proc. Nat. Acad. Sci. U.S.A., vol. 37, pp. 760-66 (1951).
[155] A comparison theorem for eigenvalues of normal matrices, Pacific J. Math., vol. 5, pp. 911-13 (1955).
[156] Some inequalities concerning positive-definite Hermitian matrices, Proc. Cambridge Philos. Soc., vol. 51, pp. 414-21 (1955).
[157] Topological proofs for certain theorems on matrices with non-negative elements, Monatsh. Math., vol. 62, pp. 219-37 (1958).
[158] FAN, K. and A. HOFFMAN, Some metric inequalities in the space of matrices, Proc. Amer. Math. Soc., vol. 6, pp. 111-16 (1955).
[159] FAN, K. and G. PALL, Imbedding conditions for Hermitian and normal matrices, Canad. J. Math., vol. 9, pp. 298-304 (1957).
[160] FAN, K. and J. TODD, A determinantal inequality, J. London Math. Soc., vol. 30, pp. 58-64 (1955).
[161] FROBENIUS, G., Über lineare Substitutionen und bilineare Formen, J. Reine Angew. Math., vol. 84, pp. 1-63 (1877).
[162] Über das Trägheitsgesetz der quadratischen Formen, S.-B. Deutsch. Akad. Wiss. Berlin, Math.-Nat. Kl., 1894, pp. 241-56, 407-31.
[163] Über die cogredienten Transformationen der bilinearen Formen, S.-B. Deutsch. Akad. Wiss. Berlin, Math.-Nat. Kl., 1896, pp. 7-16.
[164] Über die vertauschbaren Matrizen, S.-B. Deutsch. Akad. Wiss. Berlin, Math.-Nat. Kl., 1896, pp. 601-614.
[165] Über Matrizen aus positiven Elementen, S.-B. Deutsch. Akad. Wiss. Berlin, Math.-Nat. Kl., 1908, pp. 471-76; 1909, pp. 514-18.
[166] Über Matrizen aus nicht negativen Elementen, S.-B. Deutsch. Akad. Wiss. Berlin, Math.-Nat. Kl., 1912, pp. 456-77.
[167] GANTMACHER, F. R., Geometric theory of elementary divisors after Krull, Trudy Odessa Gos. Univ. Mat., vol. 1, pp. 89-108 (1935).
[168] On the algebraic analysis of Krylov's method of transforming the secular equation, Trans. Second Math. Congress, vol. II, pp. 45-48 (1934).
[169] On the classification of real simple Lie groups, Mat. Sb., vol. 5, pp. 217-50 (1939).
*[170] GANTMACHER, F. R. and M. G. KREIN, On the structure of an orthogonal matrix, Trans. Ukrain. Acad. Sci. Phys.-Mat. Kiev (Trudy fiz.-mat. otdela VUAN, Kiev), 1929, pp. 1-8.
[171] Normal operators in a hermitian space, Bull. Phys.-Mat. Soc. Univ. Kasan (Izvestiya fiz.-mat. ob-va pri Kazanskom universitete), IV, vol. 1, ser. 3, pp. 71-84 (1929-30).

[172] On a special class of determinants connected with Kellogg's integral kernels, Mat. Sb., vol. 42, pp. 501-8 (1935).
[173] Sur les matrices oscillatoires et complètement non-négatives, Compositio Math., vol. 4, pp. 445-76 (1937).
[174] GAUTSCHI, W., Bounds of matrices with regard to an hermitian metric, Compositio Math., vol. 12, pp. 1-16 (1954).
[175] GELFAND, I. M. and V. B. LIDSKII, On the structure of the domains of stability of linear canonical systems of differential equations with periodic coefficients, Uspehi Mat. Nauk, vol. 10, no. 1, pp. 3-40 (1955).
[176] GERSHGORIN, S. A., Über die Abgrenzung der Eigenwerte einer Matrix, Izv. Akad. Nauk SSSR, Ser. Fiz.-Mat., vol. 6, pp. 749-54 (1931).
[177] GODDARD, L., An extension of a matrix theorem of A. Brauer, Proc. Int. Cong. Math. Amsterdam, 1954, vol. 2, pp. 22-23.
[178] GOHEEN, H. E., On a lemma of Stieltjes on matrices, Amer. Math. Monthly, vol. 56, pp. 328-29 (1949).
*[179] GOLVACHIKOV, A. F., On the structure of the automorphisms of the complex simple Lie groups, Dokl. Akad. Nauk SSSR, vol. 27, pp. 7-9 (1951).
[180] GRAVE, D. A., Small oscillations and some propositions in algebra, Izv. Akad. Nauk SSSR, Ser. Fiz.-Mat., vol. 2, pp. 563-70 (1929).
[181] GROSSMAN, D. P., On the problem of a numerical solution of systems of simultaneous linear algebraic equations, Uspehi Mat. Nauk, vol. 5, no. 3, pp. 87-103 (1950).
[182] HAHN, W., Eine Bemerkung zur zweiten Methode von Lyapunov, Math. Nachr., vol. 14, pp. 349-54 (1956).
[183] Über die Anwendung der Methode von Lyapunov auf Differenzengleichungen, Math. Ann., vol. 136, pp. 430-41 (1958).
[184] HAYNSWORTH, E., Bounds for determinants with dominant main diagonal, Duke Math. J., vol. 20, pp. 199-209 (1953).
[185] Note on bounds for certain determinants, Duke Math. J., vol. 24, pp. 313-19 (1957).
[186] HELLMANN, O., Die Anwendung der Matrizanten bei Eigenwertaufgaben, Z. Angew. Math. Mech., vol. 35, pp. 300-15 (1955).
[187] HERMITE, C., Sur le nombre des racines d'une équation algébrique comprise entre des limites données, J. Reine Angew. Math., vol. 52, pp. 39-51 (1856).
[188] HJELMSLEV, J., Introduction à la théorie des suites monotones, Kgl. Danske Vid. Selsk. Forh. 1914, pp. 1-74.
[189] HOFFMAN, A. and O. TAUSSKY, A characterization of normal matrices, J. Res. Nat. Bur. Standards, vol. 52, pp. 17-19 (1954).
[190] HOFFMAN, A. and H. WIELANDT, The variation of the spectrum of a normal matrix, Duke Math. J., vol. 20, pp. 37-39 (1953).
[191] HORN, A., On the eigenvalues of a matrix with prescribed singular values, Proc. Amer. Math. Soc., vol. 5, pp. 4-7 (1954).
[192] HOTELLING, H., Some new methods in matrix calculation, Ann. Math. Statist., vol. 14, pp. 1-34 (1943).
[193] HOUSEHOLDER, A. S., On matrices with non-negative elements, Monatsh. Math., vol. 62, pp. 238-49 (1958).
[194] HOUSEHOLDER, A. S. and F. L. BAUER, On certain methods for expanding the characteristic polynomial, Numer. Math., vol. 1, pp. 29-35 (1959).
[195] HSU, P. L., On symmetric, orthogonal, and skew-symmetric matrices, Proc. Edinburgh Math. Soc., vol. 10, pp. 37-44 (1953).

[196] On a kind of transformation of matrices, Acta Math. Sinica, vol. 5, pp. 333-47 (1955).
[197] HUA, L.-K., On the theory of automorphic functions of a matrix variable, Amer. J. Math., vol. 66, pp. 470-88; 531-63 (1944).
[198] Geometries of matrices, Trans. Amer. Math. Soc., vol. 57, pp. 441-90 (1945).
[199] Orthogonal classification of Hermitian matrices, Trans. Amer. Math. Soc., vol. 59, pp. 508-23 (1946).
[200] Geometries of symmetric matrices over the real field, Dokl. Akad. Nauk SSSR, vol. 53, pp. 95-98; 195-96 (1946).
[201] Automorphisms of the real symplectic group, Dokl. Akad. Nauk SSSR, vol. 53, pp. 303-306 (1946).
[202] Inequalities involving determinants, Acta Math. Sinica, vol. 5, pp. 463-70 (1955).
[203] HUA, L.-K. and B. A. ROSENFELD, The geometry of rectangular matrices and their application to the real projective and non-euclidean geometries, Izv. Higher Ed. SSSR, Matematika, vol. 1, pp. 233-46 (1957).
[204] HURWITZ, A., Über die Bedingungen, unter welchen eine Gleichung nur Wurzeln mit negativen reellen Teilen besitzt, Math. Ann., vol. 46, pp. 273-84 (1895).
[205] INGRAHAM, M. H., On the reduction of a matrix to its rational canonical form, Bull. Amer. Math. Soc., vol. 39, pp. 379-82 (1933).
[206] IONESCU, D., O identitate importanta si descompunerea unei forme bilineare intr-o suma de produse, Gaz. Mat. Ser. Fiz. A. 7, vol. 7, pp. 303-312 (1955).
[207] ISHAK, M., Sur les spectres des matrices, Sém. P. Dubreil et Ch. Pisot, Fac. Sci. Paris, vol. 9, pp. 1-14 (1955/56).
[208] KAGAN, V. F., On some number systems arising from Lorentz transformations, Izv. Ass. Inst. Moscow Univ. 1927, pp. 3-31.
[209] KARPELEVICH, F. I., On the eigenvalues of a matrix with non-negative elements, Izv. Akad. Nauk SSSR, Ser. Mat., vol. 15, pp. 361-83 (1951).
[210] KHAN, N. A., The characteristic roots of a product of matrices, Quart. J. Math., vol. 7, pp. 138-43 (1956).
[211] KHLODOVSKII, I. N., On the theory of the general case of Krylov's transformation of the secular equation, Izv. Akad. Nauk, Ser. Fiz.-Mat., vol. 7, pp. 1076-1102 (1933).
[212] KOLMOGOROV, A. N., Markov chains with countably many possible states, Bull. Univ. Moscow (A), vol. 1:3 (1937).
[213] KOTELYANSKII, D. M., On monotonic matrix functions of order n, Trans. Univ. Odessa, vol. 3, pp. 103-114 (1941).
[214] On the theory of non-negative and oscillatory matrices, Ukrain. Mat. Z., vol. 2, pp. 94-101 (1950).
*[215] On some properties of matrices with positive elements, Mat. Sb., vol. 31, pp. 497-506 (1952).
[216] On a property of matrices of symmetric signs, Uspehi Mat. Nauk, vol. 8, no. 4, pp. 163-67 (1953).
[217] On some sufficient conditions for the spectrum of a matrix to be real and simple, Mat. Sb., vol. 36, pp. 163-68 (1955).
[218] On the influence of Gauss' transformation on the spectra of matrices, Uspehi Mat. Nauk, vol. 9, no. 3, pp. 117-21 (1954).
*[219] On the distribution of points on a matrix spectrum, Ukrain. Mat. Z., vol. 7, pp. 131-33 (1955).

*[220] Estimates for determinants of matrices with dominant main diagonal, Izv. Akad. Nauk SSSR, Ser. Mat., vol. 20, pp. 137-44 (1956).
*[221] KOVALENKO, K. R. and M. G. KREIN, On some investigations of Lyapunov concerning differential equations with periodic coefficients, Dokl. Akad. Nauk SSSR, vol. 75, pp. 495-99 (1950).
[222] KOWALEWSKI, G., Natürliche Normalformen linearer Transformationen, Leipz. Ber., vol. 69, pp. 325-35 (1917).
*[223] KRASOVSKII, N. N., On the stability after the first approximation, Prikl. Mat. Meh., vol. 19, pp. 516-30 (1955).
*[224] KRASNOSEL'SKII, M. A. and M. G. KREIN, An iteration process with minimal deviations, Mat. Sb., vol. 31, pp. 315-34 (1952).
[225] KRAUS, F., Über konvexe Matrixfunktionen, Math. Z., vol. 41, pp. 18-42 (1936).
[226] KRAVCHUK, M. F., On the general theory of bilinear forms, Izv. Polyt. Inst. Kiev, vol. 19, pp. 17-18 (1924).
[227] On the theory of permutable matrices, Zap. Akad. Nauk Kiev, Ser. Fiz.-Mat., vol. 1:2, pp. 28-33 (1924).
*[228] On a transformation of quadratic forms, Zap. Akad. Nauk Kiev, Ser. Fiz.-Mat., vol. 1:2, pp. 87-90 (1924).
*[229] On quadratic forms and linear transformations, Zap. Akad. Nauk Kiev, Ser. Fiz.-Mat., vol. 1:3, pp. 1-89 (1924).
[230] Permutable sets of linear transformations, Zap. Agr. Inst. Kiev, vol. 1, pp. 25-58 (1926).
[231] Über vertauschbare Matrizen, Rend. Circ. Mat. Palermo, vol. 51, pp. 126-30 (1927).
[232] On the structure of permutable groups of matrices, Trans. Second Mat. Congress 1934, vol. 2, pp. 11-12.
[233] KRAVCHUK, M. F. and Y. S. GOL'DBAUM, On groups of commuting matrices, Trans. Av. Inst. Kiev, 1929, pp. 73-98; 1936, pp. 12-23.
[234] On the equivalence of singular pencils of matrices, Trans. Av. Inst. Kiev, 1936, pp. 5-27.
[235] KREIN, M. G., Addendum to the paper 'On the structure of an orthogonal matrix,' Trans. Fiz.-Mat. Class Akad. Nauk Kiev, 1931, pp. 103-7.
[236] On the spectrum of a Jacobian form in connection with the theory of torsion oscillations of drums, Mat. Sb., vol. 40, pp. 455-66 (1933).
[237] On a new class of hermitian forms, Izv. Akad. Nauk SSSR, Ser. Fiz.-Mat., vol. 9, pp. 1259-75 (1933).
[238] On the nodes of harmonic oscillations of mechanical systems of a special type, Mat. Sb., vol. 41, pp. 339-48 (1934).
[239] Sur quelques applications des noyaux de Kellogg aux problèmes d'oscillations, Proc. Charkov Mat. Soc. (4), vol. 11, pp. 3-19 (1935).
[240] Sur les vibrations propres des tiges dont l'une des extrémités est encastrée et l'autre libre, Proc. Charkov Mat. Soc. (4), vol. 12, pp. 3-11 (1935).
*[241] Generalization of some results of Lyapunov on linear differential equations with periodic coefficients, Dokl. Akad. Nauk SSSR, vol. 73, pp. 445-48 (1950).
[242] On an application of the fixed-point principle in the theory of linear transformations of spaces with indefinite metric, Uspehi Mat. Nauk, vol. 5, no. 2, pp. 180-90 (1950).
[243] On an application of an algebraic proposition in the theory of monodromy matrices, Uspehi Mat. Nauk, vol. 6, no. 1, pp. 171-77 (1951).

[244] On some problems concerning Lyapunov's ideas in the theory of stability, Uspehi Mat. Nauk, vol. 3, no. 3, pp. 166-69 (1948).
[245] On the theory of integral matrix functions of exponential type, Ukrain. Mat. Z., vol. 3, pp. 164-73 (1951).
[246] On some problems in the theory of oscillations of Sturm systems, Prikl. Mat. Meh., vol. 16, pp. 555-68 (1952).
[247] KREIN, M. G. and M. A. NAYMARK (Neumark), On a transformation of the Bezoutian leading to Sturm's theorem, Proc. Charkov Mat. Soc. (4), vol. 10, pp. 33-40 (1933).
[248] On the application of the Bezoutian to problems of the separation of roots of algebraic equations, Trudy Odessa Gos. Univ. Mat., vol. 1, pp. 51-69 (1935).
[249] KRONECKER, L., Algebraische Reduction der Schaaren bilinearer Formen, S.-B. Akad. Berlin 1890, pp. 763-76.
[250] KRULL, W., Theorie und Anwendung der verallgemeinerten Abelschen Gruppen, S.-B. Akad. Heidelberg 1926, p. 1.
[251] KRYLOV, A. N., On the numerical solution of the equation by which the frequency of small oscillations is determined in technical problems, Izv. Akad. Nauk SSSR, Ser. Fiz.-Mat., vol. 4, pp. 491-539 (1931).
[252] LAPPO-DANILEVSKII, I. A., Résolution algorithmique des problèmes réguliers de Poincaré et de Riemann, J. Phys. Mat. Soc. Leningrad, vol. 2:1, pp. 94-120; 121-54 (1928).
[253] Théorie des matrices satisfaisantes à des systèmes des équations différentielles linéaires à coefficients rationnels arbitraires, J. Phys. Mat. Soc. Leningrad, vol. 2:2, pp. 41-80 (1928).
[254] Fundamental problems in the theory of systems of linear differential equations with arbitrary rational coefficients, Trans. First Math. Congr., ONTI, 1936, pp. 254-62.
[255] LEDERMANN, W., Reduction of singular pencils of matrices, Proc. Edinburgh Math. Soc., vol. 4, pp. 92-105 (1935).
[256] Bounds for the greatest latent root of a positive matrix, J. London Math. Soc., vol. 25, pp. 265-68 (1950).
[257] LIDSKII, V. B., On the characteristic roots of a sum and a product of symmetric matrices, Dokl. Akad. Nauk SSSR, vol. 75, pp. 769-72 (1950).
[258] Oscillation theorems for canonical systems of differential equations, Dokl. Akad. Nauk SSSR, vol. 102, pp. 111-17 (1955).
[259] LIÉNARD and CHIPART, Sur le signe de la partie réelle des racines d'une équation algébrique, J. Math. Pures Appl. (6), vol. 10, pp. 291-346 (1914).
*[260] LIPIN, N. V., On regular matrices, Trans. Inst. Eng. Transport, vol. 9, p. 105 (1934).
*[261] LIVSHITZ, M. S. and V. P. POTAPOV, The multiplication theorem for characteristic matrix functions, Dokl. Akad. Nauk SSSR, vol. 72, pp. 164-73 (1950).
*[262] LOPSHITZ, A. M., Vector solution of a problem on doubly symmetric matrices, Trans. Math. Congress Moscow, 1927, pp. 186-87.
[263] The characteristic equation of lowest degree for affinors and its application to the integration of differential equations, Trans. Sem. Vectors and Tensors, vols. 2/3 (1935).
[264] A numerical method of determining the characteristic roots and characteristic planes of a linear operator, Trans. Sem. Vectors and Tensors, vol. 7, pp. 233-59 (1947).

[265] An extremal theorem for a hyper-ellipsoid and its application to the solution of a system of linear algebraic equations, Trans. Sem. Vectors and Tensors, vol. 9, pp. 183-97 (1952).
[266] LÖWNER, K., Über monotone Matrixfunktionen, Math. Z., vol. 38, pp. 177-216 (1933); vol. 41, pp. 18-42 (1936).
[267] Some classes of functions defined by difference or differential inequalities, Bull. Amer. Math. Soc., vol. 56, pp. 308-19 (1950).
[268] LUSIN, N. N., On Krylov's method of forming the secular equation, Izv. Akad. Nauk SSSR, Ser. Fiz.-Mat., vol. 7, pp. 903-958 (1931).
[269] On some properties of the displacement factor in Krylov's method, Izv. Akad. Nauk SSSR, Ser. Fiz.-Mat., vol. 8, pp. 596-638; 735-62; 1065-1102 (1932).
[270] On the matrix theory of differential equations, Avtomat. i Telemeh., vol. 5, pp. 3-66 (1940).
[271] LYUSTERNIK, L. A., The determination of eigenvalues of functions by an electric scheme, Elektricestvo, vol. 11, pp. 67-8 (1946).
[272] On electric models of symmetric matrices, Uspehi Mat. Nauk, vol. 4, no. 2, pp. 198-200 (1949).
*[273] LYUSTERNIK, L. A. and A. M. PROKHOROV, Determination of eigenvalues and functions of certain operators by means of an electrical network, Dokl. Akad. Nauk SSSR, vol. 55, pp. 579-82; Izv. Akad. Nauk SSSR, Ser. Mat., vol. 11, pp. 141-45 (1947).
[274] MARCUS, M., A remark on a norm inequality for square matrices, Proc. Amer. Math. Soc., vol. 6, pp. 117-19 (1955).
[275] An eigenvalue inequality for the product of normal matrices, Amer. Math. Monthly, vol. 63, pp. 173-74 (1956).
[276] A determinantal inequality of H. P. Robertson, II, J. Washington Acad. Sci., vol. 47, pp. 264-66 (1957).
[277] Convex functions of quadratic forms, Duke Math. J., vol. 24, pp. 321-26 (1957).
[278] MARCUS, M. and J. L. MCGREGOR, Extremal properties of Hermitian matrices, Canad. J. Math., vol. 8, pp. 524-31 (1956).
[279] MARCUS, M. and B. N. MOYLS, On the maximum principle of Ky Fan, Canad. J. Math., vol. 9, pp. 313-20 (1957).
[280] Maximum and minimum values for the elementary symmetric functions of Hermitian forms, J. London Math. Soc., vol. 32, pp. 374-77 (1957).
[281] MAYANTS, L. S., A method for the exact determination of the roots of secular equations of high degree and a numerical analysis of their dependence on the parameters of the corresponding matrices, Dokl. Akad. Nauk SSSR, vol. 50, pp. 121-24 (1945).
[282] MIRSKY, L., An inequality for positive-definite matrices, Amer. Math. Monthly, vol. 62, pp. 428-30 (1955).
[283] The norm of adjugate and inverse matrices, Arch. Math., vol. 7, pp. 276-77 (1956).
[284] The spread of a matrix, Mathematika, vol. 3, pp. 127-30 (1956).
[285] Inequalities for normal and Hermitian matrices, Duke Math. J., vol. 24, pp. 591-99 (1957).
[286] MITROVIC, D., Conditions graphiques pour que toutes les racines d'une équation algébrique soient à parties réelles négatives, C. R. Acad. Sci. Paris, vol. 240, pp. 1177-79 (1955).
[287] MORGENSTERN, D., Eine Verschärfung der Ostrowskischen Determinantenabschätzung, Math. Z., vol. 66, pp. 143-46 (1956).

[288] MOTZKIN, T. and O. TAUSSKY, Pairs of matrices with property L, Trans. Amer. Math. Soc., vol. 73, pp. 108-14 (1952); vol. 80, pp. 387-401 (1954).
[289] NEIGAUS (Neuhaus), M. G. and V. B. LIDSKII, On the boundedness of the solutions of linear systems of differential equations with periodic coefficients, Dokl. Akad. Nauk SSSR, vol. 77, pp. 183-93 (1951).
[290] NEUMANN, J. VON, Approximative properties of matrices of high order, Portugal. Math., vol. 3, pp. 1-62 (1942).
[291] NUDEL'MAN, A. A. and P. A. SHVARTSMAN, On the spectrum of the product of unitary matrices, Uspehi Mat. Nauk, vol. 13, no. 6, pp. 111-17 (1958).
[292] OKAMOTO, M., On a certain type of matrices with an application to experimental design, Osaka Math. J., vol. 6, pp. 73-82 (1954).
[293] OPPENHEIM, A., Inequalities connected with definite Hermitian forms, Amer. Math. Monthly, vol. 61, pp. 463-66 (1954).
[294] ORLANDO, L., Sul problema di Hurwitz relativo alle parti reali delle radici di un'equazione algebrica, Math. Ann., vol. 71, pp. 233-45 (1911).
[295] OSTROWSKI, A., Bounds for the greatest latent root of a positive matrix, J. London Math. Soc., vol. 27, pp. 253-58 (1952).
[296] Sur quelques applications des fonctions convexes et concaves au sens de I. Schur, J. Math. Pures Appl., vol. 31, pp. 253-92 (1952).
[297] On nearly triangular matrices, J. Res. Nat. Bur. Standards, vol. 52, pp. 344-45 (1954).
[298] On the spectrum of a one-parametric family of matrices, J. Reine Angew. Math., vol. 193, pp. 143-60 (1954).
[299] Sur les déterminants à diagonale dominante, Bul. Soc. Math. Belg., vol. 7, pp. 46-51 (1955).
[300] Note on bounds for some determinants, Duke Math. J., vol. 22, pp. 95-102 (1955).
[301] Über Normen von Matrizen, Math. Z., vol. 63, pp. 2-18 (1955).
[302] Über die Stetigkeit von charakteristischen Wurzeln in Abhängigkeit von den Matrizenelementen, Jber. Deutsch. Math. Verein., vol. 60, pp. 40-42 (1957).
[303] PAPKOVICH, P. F., On a method of computing the roots of a characteristic determinant, Prikl. Mat. Meh., vol. 1, pp. 314-18 (1933).
[304] PAPOULIS, A., Limits on the zeros of a network determinant, Quart. Appl. Math., vol. 15, pp. 193-94 (1957).
[305] PARODI, M., Remarques sur la stabilité, C. R. Acad. Sci. Paris, vol. 228, pp. 51-2; 807-8; 1198-1200 (1949).
[306] Sur une propriété des racines d'une équation qui intervient en mécanique, C. R. Acad. Sci. Paris, vol. 241, pp. 1019-21 (1955).
[307] Sur la localisation des valeurs caractéristiques des matrices dans le plan complexe, C. R. Acad. Sci. Paris, vol. 242, pp. 2617-18 (1956).
[308] PEANO, G., Intégration par séries des équations différentielles linéaires, Math. Ann., vol. 32, pp. 450-56 (1888).
[309] PENROSE, R., A generalized inverse for matrices, Proc. Cambridge Philos. Soc., vol. 51, pp. 406-13 (1955).
[310] On best approximate solutions of linear matrix equations, Proc. Cambridge Philos. Soc., vol. 52, pp. 17-19 (1956).
[311] PERFECT, H., On matrices with positive elements, Quart. J. Math., vol. 2, pp. 286-90 (1951).
[312] On positive stochastic matrices with real characteristic roots, Proc. Cambridge Philos. Soc., vol. 48, pp. 271-76 (1952).

[313] Methods of constructing certain stochastic matrices, Duke Math. J., vol. 20, pp. 395-404 (1953); vol. 22, pp. 305-11 (1955).
[314] A lower bound for the diagonal elements of a non-negative matrix, J. London Math. Soc., vol. 31, pp. 491-93 (1956).
[315] PERRON, O., Jacobischer Kettenbruchalgorithmus, Math. Ann., vol. 64, pp. 1-76 (1907).
[316] Über Matrizen, Math. Ann., vol. 64, pp. 248-63 (1907).
[317] Über Stabilität und asymptotisches Verhalten der Lösungen eines Systems endlicher Differenzengleichungen, J. Reine Angew. Math., vol. 161, pp. 41-64 (1929).
[318] PHILLIPS, H. B., Functions of matrices, Amer. J. Math., vol. 41, pp. 266-78 (1919).
[319] PONTRYAGIN, L. S., Hermitian operators in a space with indefinite metric, Izv. Akad. Nauk SSSR, Ser. Mat., vol. 8, pp. 243-80 (1944).
*[320] POTAPOV, V. P., On holomorphic matrix functions bounded in the unit circle, Dokl. Akad. Nauk SSSR, vol. 72, pp. 849-53 (1950).
[321] RASCH, G., Zur Theorie und Anwendung des Produktintegrals, J. Reine Angew. Math., vol. 171, pp. 65-119 (1934).
[322] REICHARDT, H., Einfache Herleitung der Jordanschen Normalform, Wiss. Z. Humboldt-Univ. Berlin, Math.-Nat. Reihe, vol. 6, pp. 445-47 (1953/54).
*[323] RECHTMAN-OL'SHANSKAYA, P. G., On a theorem of Markov, Uspehi Mat. Nauk, vol. 12, no. 3, pp. 181-87 (1957).
[324] RHAM, G. DE, Sur un théorème de Stieltjes relatif à certaines matrices, Acad. Serbe Sci. Publ. Inst. Math., vol. 4, pp. 133-54 (1952).
[325] RICHTER, H., Über Matrixfunktionen, Math. Ann., vol. 122, pp. 16-34 (1950).
[326] Bemerkung zur Norm der Inversen einer Matrix, Arch. Math., vol. 5, pp. 447-48 (1954).
[327] Zur Abschätzung von Matrizennormen, Math. Nachr., vol. 18, pp. 178-87 (1958).
[328] ROMANOVSKII, V. I., Un théorème sur les zéros des matrices non-négatives, Bull. Soc. Math. France, vol. 61, pp. 213-19 (1933).
[329] Recherches sur les chaînes de Markoff, Acta Math., vol. 66, pp. 147-251 (1935).
[330] ROTH, W., On the characteristic polynomial of the product of two matrices, Proc. Amer. Math. Soc., vol. 5, pp. 1-3 (1954).
[331] On the characteristic polynomial of the product of several matrices, Proc. Amer. Math. Soc., vol. 7, pp. 578-82 (1956).
[332] ROY, S., A useful theorem in matrix theory, Proc. Amer. Math. Soc., vol. 5, pp. 635-38 (1954).
[333] SAKHNOVICH, L. A., On the limits of multiplicative integrals, Uspehi Mat. Nauk, vol. 12, no. 3, pp. 205-11 (1957).
*[334] SARYMSAKOV, T. A., On sequences of stochastic matrices, Dokl. Akad. Nauk, vol. 47, pp. 331-33 (1945).
[335] SCHNEIDER, H., An inequality for latent roots applied to determinants with dominant principal diagonal, J. London Math. Soc., vol. 28, pp. 8-20 (1953).
[336] A pair of matrices with property P, Amer. Math. Monthly, vol. 62, pp. 247-49 (1955).
[337] A matrix problem concerning projections, Proc. Edinburgh Math. Soc., vol. 10, pp. 129-30 (1956).
[338] The elementary divisors, associated with 0, of a singular M-matrix, Proc. Edinburgh Math. Soc., vol. 10, pp. 108-22 (1956).


[339] SCHOENBERG, I. J., Über variationsvermindernde lineare Transformationen, Math. Z., vol. 32, pp. 321-28 (1930).
[340] Zur Abzählung der reellen Wurzeln algebraischer Gleichungen, Math. Z., vol. 38, p. 546 (1933).
[341] SCHOENBERG, I. J., and A. WHITNEY, A theorem on polygons in n dimensions with application to variation diminishing linear transformations, Compositio Math., vol. 9, pp. 141-60 (1951).
[342] SCHUR, I., Über die charakteristischen Wurzeln einer linearen Substitution mit einer Anwendung auf die Theorie der Integralgleichungen, Math. Ann., vol. 66, pp. 488-510 (1909).
[343] SEMENDYAEV, K. A., On the determination of the eigenvalues and invariant manifolds of matrices by means of iteration, Prikl. Mat. Meh., vol. 3, pp. 193-221 (1943).
*[344] SEVAST'YANOV, B. A., The theory of branching random processes, Uspehi Mat. Nauk, vol. 6, no. 6, pp. 46-99 (1951).
*[345] SHIFNER, L. M., The development of the integral of a system of differential equations with regular singularities in series of powers of the elements of the differential substitution, Trudy Mat. Inst. Steklov., vol. 9, pp. 235-66 (1935).
*[346] On the powers of matrices, Mat. Sb., vol. 42, pp. 385-94 (1935).
[347] SHODA, K., Über mit einer Matrix vertauschbare Matrizen, Math. Z., vol. 29, pp. 696-712 (1929).
[348] SHOSTAK, P. Y., On a criterion for the conditional definiteness of quadratic forms in n linearly independent variables and on a sufficient condition for a conditional extremum of a function of n variables, Uspehi Mat. Nauk, vol. 8, no. 4, pp. 199-206 (1954).
[349] SHREIDER, Y. A., A solution of systems of linear algebraic equations, Dokl. Akad. Nauk, vol. 76, pp. 651-55 (1950).
[350] SHTAERMAN (Steiermann), I. Y., A new method for the solution of certain algebraic equations which have application to mathematical physics, Z. Mat., Kiev, vol. 1, pp. 83-89 (1934); vol. 4, pp. 9-20 (1934).
[351] SHTAERMAN (Steiermann), I. Y., and N. I. AKHIESER (Achieser), On the theory of quadratic forms, Izv. Polyteh., Kiev, vol. 19, pp. 116-23 (1934).
[352] SHURA-BURA, M. R., An estimate of error in the numerical computation of matrices of high order, Uspehi Mat. Nauk, vol. 6, no. 4, pp. 121-50 (1951).
*[353] SHVARTSMAN (Schwarzmann), A. P., On Green's matrices of self-adjoint differential operators, Proc. Odessa Univ., Matematika, vol. 3, pp. 35-77 (1941).
[354] SIEGEL, C. L., Symplectic Geometry, Amer. J. Math., vol. 65, pp. 1-86 (1943).
[355] SKAL'KINA, M. A., On the preservation of asymptotic stability on transition from differential equations to the corresponding difference equations, Dokl. Akad. Nauk SSSR, vol. 104, pp. 505-8 (1955).
*[356] SMOGORZHEVSKII, A. S., Sur les matrices unitaires du type de circulants, J. Mat. Circle Akad. Nauk Kiev, vol. 1, pp. 89-91 (1932).
[356a] SMOGORZHEVSKII, A. S., and M. F. KRAVCHUK, On orthogonal transformations, Zap. Inst. Ed. Kiev, vol. 2, pp. 151-56 (1927).
[357] STENZEL, H., Über die Darstellbarkeit einer Matrix als Produkt von zwei symmetrischen Matrizen, Math. Z., vol. 15, pp. 1-25 (1922).
[358] STÖHR, A., Oszillationstheoreme für die Eigenvektoren spezieller Matrizen, J. Reine Angew. Math., vol. 185, pp. 129-43 (1943).
[359] SULEIMANOVA, K. R., Stochastic matrices with real characteristic values, Dokl. Akad. Nauk SSSR, vol. 66, pp. 343-45 (1949).


[360] On the characteristic values of stochastic matrices, Uch. Zap. Moscow Ped. Inst., Ser. 71, Math., vol. 1, pp. 167-97 (1953).
[361] SULTANOV, R. M., Some properties of matrices with elements in a non-commutative ring, Trudy Mat. Sectora Akad. Nauk Baku, vol. 2, pp. 11-17 (1946).
[362] SUSHKEVICH, A. K., On matrices of a special type, Uch. Zap. Univ. Kharkov, vol. 10, pp. 1-16 (1937).
[363] SZ.-NAGY, B., Remark on S. N. Roy's paper 'A useful theorem in matrix theory,' Proc. Amer. Math. Soc., vol. 7, p. 1 (1956).
[364] TA LI, Die Stabilitätsfrage bei Differenzengleichungen, Acta Math., vol. 63, pp. 99-141 (1934).
[365] TAUSSKY, O., Bounds for characteristic roots of matrices, Duke Math. J., vol. 15, pp. 1043-44 (1948).
[366] A determinantal inequality of H. P. Robertson, I, J. Washington Acad. Sci., vol. 47, pp. 263-64 (1957).
[367] Commutativity in finite matrices, Amer. Math. Monthly, vol. 64, pp. 229-35 (1957).
[368] TOEPLITZ, O., Das algebraische Analogon zu einem Satz von Fejér, Math. Z., vol. 2, pp. 187-97 (1918).
[369] TURNBULL, H. W., On the reduction of singular matrix pencils, Proc. Edinburgh Math. Soc., vol. 4, pp. 67-76 (1935).
*[370] TURCHANINOV, A. S., On some applications of matrix calculus to linear differential equations, Uch. Zap. Univ. Odessa, vol. 1, pp. 41-48 (1921).
[371] VERZHBITSKII, B. D., Some problems in the theory of series compounded from several matrices, Mat. Sb., vol. 5, pp. 505-12 (1939).
[372] VILENKIN, N. Y., On an estimate for the maximal eigenvalue of a matrix, Uch. Zap. Moscow Ped. Inst., vol. 108, pp. 55-57 (1957).
[373] VmEa, M., Note sur les structures unitaires et paraunitaires, C. R. Acad. Sci. Paris, vol. 240, pp. 1039-41 (1955).
[374] VOLTERRA, V., Sui fondamenti della teoria delle equazioni differenziali lineari, Mem. Soc. Ital. Sci. (3), vol. 6, pp. 1-104 (1887); vol. 12, pp. 3-68 (1902).
[375] WALKER, A., and J. WESTON, Inclusion theorems for the eigenvalues of a normal matrix, J. London Math. Soc., vol. 24, pp. 28-31 (1949).
[376] WAYLAND, H., Expansions of determinantal equations into polynomial form, Quart. Appl. Math., vol. 2, pp. 277-306 (1945).
[377] WEIERSTRASS, K., Zur Theorie der bilinearen und quadratischen Formen, Monatsh. Akad. Wiss. Berlin, 1867, pp. 310-38.
[378] WELLSTEIN, J., Über symmetrische, alternierende und orthogonale Normalformen von Matrizen, J. Reine Angew. Math., vol. 163, pp. 166-82 (1930).
[379] WEYL, H., Inequalities between the two kinds of eigenvalues of a linear transformation, Proc. Nat. Acad. Sci., vol. 35, pp. 408-11 (1949).
[380] WEYR, E., Zur Theorie der bilinearen Formen, Monatsh. f. Math. und Physik, vol. 1, pp. 163-236 (1890).
[381] WHITNEY, A., A reduction theorem for totally positive matrices, J. Analyse Math., vol. 2, pp. 88-92 (1952).
[382] WIELANDT, H., Ein Einschliessungssatz für charakteristische Wurzeln normaler Matrizen, Arch. Math., vol. 1, pp. 348-52 (1948/49).
[383] Die Einschliessung von Eigenwerten normaler Matrizen, Math. Ann., vol. 121, pp. 234-41 (1949).
[384] Unzerlegbare nicht-negative Matrizen, Math. Z., vol. 52, pp. 642-48 (1950).


[385] Lineare Scharen von Matrizen mit reellen Eigenwerten, Math. Z., vol. 53, pp. 219-25 (1950).
[386] Pairs of normal matrices with property L, J. Res. Nat. Bur. Standards, vol. 51, pp. 89-90 (1953).
[387] Inclusion theorems for eigenvalues, Nat. Bur. Standards, Appl. Math. Ser., vol. 29, pp. 75-78 (1953).
[388] An extremum property of sums of eigenvalues, Proc. Amer. Math. Soc., vol. 6, pp. 106-110 (1955).
[389] On eigenvalues of sums of normal matrices, Pacific J. Math., vol. 5, pp. 633-38 (1955).
[390] WINTNER, A., On criteria for linear stability, J. Math. Mech., vol. 6, pp. 301-9 (1957).
[391] WONG, Y., An inequality for Minkowski matrices, Proc. Amer. Math. Soc., vol. 4, pp. 137-41 (1953).
[392] On non-negative valued matrices, Proc. Nat. Acad. Sci. U.S.A., vol. 40, pp. 121-24 (1954).
*[393] YAGLOM, I. M., Quadratic and skew-symmetric bilinear forms in a real symplectic space, Trudy Sem. Vect. Tens. Anal. Moscow, vol. 8, pp. 364-81 (1950).
[394] YAKUBOVICH, V. A., Some criteria for the reducibility of a system of differential equations, Dokl. Akad. Nauk SSSR, vol. 66, pp. 577-80 (1949).
[395] ZEITLIN (Tseitlin), M. L., Application of the matrix calculus to the synthesis of relay-contact schemes, Dokl. Akad. Nauk SSSR, vol. 88, pp. 525-28 (1952).
[396] ZIMMERMANN (Tsimmerman), O. K., Decomposition of the norm of a matrix into products of norms of its rows, Nauch. Zap. Ped. Inst. Nikolaevsk, vol. 4, pp. 130-35 (1953).


INDEX


INDEX
[Numbers in italics refer to Volume Two]

ABSOLUTE CONCEPTS, 184
Addition of congruences, 182
Addition of operators, 51
Adjoint matrix, 82
Adjoint operator, 265
Algebra, 11
Algorithm of Gauss, 23ff.
  generalized, 45
Angle between vectors, 242
Axes, principal, 309
  reduction to, 309
BASIS(ES), 51
  characteristic, 73
  coordinates of vector in, 53
  Jordan, 201
  lower, 202
  orthonormal, 242, 245
Bessel, inequality of, 253
Bézout, generalized theorem of, 81
Binet-Cauchy formula, 2
Birkhoff, G. D., 142
Block, of matrix, 41
  diagonal, isolated, Z.
  Jordan, 151
Block multiplication of matrices, 42
Bundle of vectors, 183
Bunyakovskii's inequality, 255

CARTAN, theorem of, 4
Cauchy, formula of Binet-, 2
  system of, 115
Cauchy identity, 10
Cauchy index, 174, UB
Cayley, formulas of, 228
Cayley-Hamilton theorem, 83, 192
Cell, of matrix, 41
Chain, see Jordan, Markov, Sturm
Characteristic basis, 73
Characteristic direction, 71
Characteristic equation, 70, 310, 338
Characteristic matrix, 82
Characteristic polynomial, 71, 82
Characterization of root, minimal, 319
  maximal-minimal, 321, 322
Chebyshev, 175, 240
  polynomials of, 259
Chebyshev-Markov, formula of, 248
  theorem of, 247
Chetaev, 181
Chipart, 175, 1
Coefficients of Fourier, 261
Coefficients of influence, reduced, 111
Column, principal, 338
Column matrix, 2
Columns, Jordan chains of, 165
Components, of matrix, 195
  of operator, hermitian, 268
  skew-symmetric, 281
  symmetric, 281
Compound matrix, 19ff., 20
Computation of powers of matrix, 109
Congruence, 181, 182
Constraint, 320
Convergence, 110, 112
Coordinates, transformation of, 59
  of vector, 53
Coordinate transformation, matrix of, 60

D'ALEMBERT-EULER, theorem of, 286
Danilevskii, 214
Decomposition, of matrix into triangular factors, 33ff.
  polar, of operator, 276, 286
  of space, 248
Defect of vector space, 64
Derivative, multiplicative, LY8
Determinant identity of Sylvester, 32, 33
Determinant of square matrix, 1
Diagonal matrix, 3
Dilatation of space, 287
Dimension, of matrix, 1
  of vector space, 51
Direction, characteristic, 71
Discriminant of form, 333

Page 283: THE THEORY OF MATRICES - 上海交通大学数学系math.sjtu.edu.cn/faculty/tyaglov/courses/linear algebra/The_book_ad… · as possible, assuming only that the reader is acquainted


Divisors, elementary, 142, 144, 194
  admissible, 23B
  geometrical theory of, 115
  infinite, &2
Dmitriev, 82
Domain of stability, 218
Dynkin, 82
EIGENVALUE, 69
Elements of matrix, 1
Elimination method of Gauss, 23ff.
Equivalence, of matrices, 61, 132, 133
  of pencils, strict, 24
Ergodic theorem for Markov chains, 9f
Erugin, theorem of, 111

Gaussian form of matrix, 38
Golubchikov, 124
Governors, 172, 233
Gram, criterion of, 247
Gramian, 247, 251
Group, 18
  unitary, 268
Gundelfinger, 304
HADAMARD INEQUALITY, 252
  generalized, 254
Hamilton-Cayley theorem, 83, 182
Hankel form, 338; 2115
Hankel matrix, 338; 2114
Hermite, 172, 202, 2111
Euler-D'Alembert, theorem of, 286
Hermite-Biehler theorem, 228
Hurwitz, 173, 190, 2111
FACTOR SPACE, 183
Faddeev, method of, 82
Field, 1
Forces, linear superposition of, 28
Form, bilinear, 294
  Hankel, 338; 2115
Hurwitz matrix, 190
Hyperlogarithm, 182

IDENTITY OPERATOR, 66
Imprimitivity, index of, 80
Ince, 142
  hermitian, 244, 331
    bilinear, 332
    canonical form of, 337
    negative definite, 337
    negative semidefinite, 336
    pencil of, see Pencil
    positive definite, 337
    positive semidefinite, 336
    rank of, 333
    signature of, 334
    singular, 333
  quadratic, 246, 294
    definite, 305
    discriminant of, 294
Inertia, law of, 297, 334
Integral, multiplicative, 112, L1&
  product, 118
Invariant plane, of operator, 283
JACOBI, formula of, 302, 336
  identity of, 114
  method of, 300
  theorem of, 303
Jacobi matrix, 99
Jordan basis, 201
Jordan block, 151
Jordan chains of columns, 165
Jordan form of matrix, 152, 201
    rank of, 296
    real, 294
    reduction of, 299ff.
    reduction to principal axes, 309
    restricted, 306
    semidefinite, 304
    signature of, 296, 298
    singular, 294
Jordan matrix, 152, 201
KARPELEVICH, 82
Kernel of λ-matrix, 19
Kolmogorov, 81, 87, 92
Kotelyanskii, 1121
  lemma of, 71
Krein, 221, 2511
Fourier series, 261
Frobenius, 304, 339, 343; 53
Kronecker, 75; Q5, 17 411
Krylov, 203
  theorem of, 343; 51
Function, entire, 168
  left value of, 81
  transformation of, 206

LAGRANGE, method of, 299
Lagrange interpolation polynomial, 191
Lagrange-Sylvester interpolation polynomial, 91
GANTMACHER, 101
Gauss, algorithm of, 23ff.
  generalized, 45
  elimination method of, 23ff.
λ-matrix, 130
  kernel of, 12


Lappo-Danilevskii, 168, 170, 171
Left value, 81
Legendre polynomials, 253
Liénard, 173, &81
Liénard-Chipart stability criterion, W
Limit of sequence of matrices, 33
Linear (in)dependence of vectors, 51
Linear transformation, 3
Logarithm of matrix, 238
Lyapunov, 171, 188
  criterion of, 188
  equivalence in the sense of, 118
  theorem of, 187
Lyapunov matrix, 117
Lyapunov transformation, 117
MACMILLAN, 115
Mapping, affine, 245
Markov, 173, 240
  theorem of, 242
Markov chain, acyclic, 88
  cyclic, 88
  fully regular, 88
  homogeneous, 81
  period of, 98
  (ir)reducible, 88
  regular, 88
Markov parameters, 233, 234
Matricant, 112
Matrices, addition of, 4
  group property, 11
  annihilating polynomial of, 88
  applications to differential equations, 116ff.
  congruence of, 296
  difference of, 5
  equivalence of, 132, 133
  equivalent, 61ff.
  left-equivalence of, 132, 133
  limit of sequence of, 33
  multiplication on left by H, 14
  product of, 6
  quotient of, 12
  rank of product, 12
  similarity of, 67
  unitary similarity of, 242
  with same real part of spectrum, 122
  adjoint, 82, 230
    reduced, 92
  blocks of, 41
  canonical form of, 63, 135, 136, 139, 141, 152, 192, 201, 202, 284, 265
  cells of, 41
  characteristic, 82
  characteristic polynomial of, 82

Matrix, column, 2
  commuting, 7
  companion, 148
  completely reducible, 81
  complex, 1ff.
    orthogonal, normal form of, 81
    representation of as product, 6
    skew-symmetric, normal form of, 18
    symmetric, normal form of, 11
  components of, 195
  compound, 19ff., 20
  computation of powers of, 109
  constituent, 195
  of coordinate transformation, 60
  cyclic form of, 54
  decomposition into triangular factors, 33ff.
  derivative of, 112
  determinant of, 1, 5
  diagonal, 3
    multiplication by, 8
  diagonal form of, 152
  dimension of, 1
  elementary, 132
  elementary divisors of, 142, 144, 194
  elements of, 1
  function of, 95ff.
    defined on spectrum, 96
  fundamental, 13
  Gaussian form of, 39
  Hankel, 338; 20
    projective, gQ
  Hurwitz, 190
  idempotent, 220
  infinite, rank of, 8,12
  integral, 126; UA
    normalized, 114
  invariant polynomials of, 139, 144, 194
  inverse of, 15
    minors of, 19ff.
  irreducible, 50
    (im)primitive, 80
  Jacobi, 99
  Jordan form of, 152, 201
  and linear operator, 51
  logarithm of, 238
  Lyapunov, 117
  minimal polynomial of, 89
    uniqueness of, 90
  minor of, 2
    principal, 2
  multiplication of, by number, 5
    by matrix, 11


Matrix, nilpotent, 226
  non-negative, 50
    totally, 98
  non-singular, 15
  normal, 269
  normal form of, 150, 192, 201, 292
  notation for, 1
  order of, 1
  orthogonal, 262
  oscillatory, 1121
  partitioned, 41, 42
  permutable, 7
  permutation of, 50
  polynomial, see Polynomial matrix
  polynomials in, permutability of, 13
  positive, 50
    spectra of, 53
    totally, 98
  power of, 12
    computation of, 108
  power series in, 113
  principal minor of, 2
  quasi-triangular, 43
  rank of, 2
  reducible, 50, 51
    normal form of, 15
  representation as product, 2114
  root of non-singular, 233
  root of singular, 234ff., 238
  Routh, 191
  row, 2
  of simple structure, 73
  singular, 15
  skew-symmetric, 18
  square, 1
  square root of, 239
  stochastic, 88
    fully regular, 88
    regular, 88
  spur of, 87
  subdiagonal of, 13
  superdiagonal of, 13
  symmetric, 13
  trace of, 87
  transformation of coordinate, 60
  transforming, 35, 60
  transpose of, 19
  triangular, 18, 218; 18
  unit, 12
  unitary, 263, 266
    representation of as product, 5
  upper quasi-triangular, 43
  upper triangular, 18
Matrix addition, properties of, 4


Matrix equations, 215ff.
  uniqueness of solution, 216
Matrix multiplication, 8, 7
Matrix polynomials, 76
  left quotient of, 78
  multiplication of, 72
Maxwell, 179
Mean, convergence in, of series, 264
Metric, 242
  euclidean, 246
  hermitian, 243, 244
    positive definite, 243
    positive semidefinite, 248
Minimal indices for columns, 38
Minor, 2
  almost principal, 118
  of zero density, 1114
Modulus, left, 215
Moments, problem of, 236, 237
Motion, of mechanical system, 125
  of point, 121
  stability of, 125
    asymptotic, 125
NAIMARK, 181, 233, 25
Nilpotency, index of, 226
Norm, left, 225
  of vector, 243
Null vector, 52
Nullity of vector space, 64
Number space, n-dimensional, 52
OPERATIONS, elementary, 134
Operator (linear), 55, 66
  adjoint, 265
  decomposition of, 281
  hermitian, 268
    positive definite, 274
    positive semidefinite, 274
    projective, &2
    spectrum of, 272
  identity, 66
  invariant plane of, 283
  matrix corresponding to, 56
  normal, 268
  positive definite, 284
  positive semidefinite, 284
  normal, 284
  orthogonal, of first kind, 281
    (im)proper, 281
    of second kind, 281
  polar decomposition of, 276, 286
  real, 282
  semidefinite, 274, 284


Operator (linear), of simple structure, 72
  skew-symmetric, 280
  square root of, 225
  symmetric, 280
  transposed, 280
  unitary, 268
    spectrum of, 273
Operators, addition of, 51
  multiplication of, 58
Order of matrix, 1
Orlando, formula of, 196
Orthogonal complement, 268
Orthogonalization, 256
Oscillations, small, of system, 326

PARAMETERS, homogeneous, 26
  Markov, 233, 234
Parseval, equality of, 261
Peano, 182
Pencil of hermitian forms, 338
  characteristic equation of, 338
  characteristic values of, 338
  principal vector of, 338
Pencil(s) of matrices, canonical form of, 37.12
  congruent, 41
  elementary divisors of, infinite, 22
  rank of, 88
  regular, 85
  singular, 8B
  strict equivalence of, 24
Pencil of quadratic forms, 310
  characteristic equation of, 310
  characteristic value of, 310
  principal column of, 310
  principal matrix of, 312
  principal vector of, 310
Period, of Markov chain, 98
Permutation of matrix, 50
Perron, 53
  formula of, 116
Petrovskii, 118
Polynomial(s), annihilating, 176, 177
  minimal, 1111
  of square matrix, 82
  of Chebyshev, 259
  characteristic, 71
  interpolation, 97, 101, 1113
  invariant, 139, 144, 194
  of Legendre, 258
  matrix, see Matrix polynomials
  minimal, 89, 176, 177
  monic, 116
  scalar, 76
  positive pair of,

Polynomial matrix, 76, 120
  elementary operations on, 130, 131
  regular, 76
  order of, 76
Power of matrix, 12
Probability, absolute, 2Y
  limiting, 84
  mean limiting, 96
  transition, 88
    final, 88
    limiting, 88
    mean limiting, 96
Product, inner, of vectors, 243
  scalar, of vectors, 242, 243
  of operators, 58
  of sequences, 6
Pythagoras, theorem of, 244
QUASI-ERGODIC THEOREM, 28
Quasi-triangular matrix, 43
Quotients of matrices, 11

RANK, of infinite matrix, 889
  of matrix, 2
  of pencil, 82
  of vector space, 64
Relative concepts, 184
Right value, 81
Ring, 11
Romanovskii, 83
Root of matrix, 233, 234ff., 239
Rotation of space, 287
Routh, 173, 180
  criterion of, 180
Routh-Hurwitz, criterion of, 194
Routh matrix, 191
Routh scheme, 173
Row matrix, 2
SCHLESINGER, 112
Schur, formulas of, 46
Schwarz, inequality of, 255
Sequence of vectors, 256, 284
Series, convergence of, 261
  fundamental, of solutions, 18
Signature of quadratic form, 296, 298
Similarity of matrices, 67
Singularity, 143
Smirnov, 171
Space, coefficient, 812
  decomposition of, 177, 248
  dilatation of, 287
  euclidean, 242, 245
  extension of, to unitary space, 282
  factor, 183


Space, rotation of, 287
  unitary, 242, 243
    as extension of euclidean, 282
Spectrum, 96, 272, 273; 53
Spur, 87
Square(s), independent, 297
  positive, 334
Stability, criterion of, 321
  domain of, 233
  of motion, 125
  of solution of linear system, 129
States, essential, 93
  limiting, 98
  non-essential, 98
Stieltjes, theorem of, 282
Stodola, 173
Sturm, theorem of, 175
Sturm chain, 175
  generalized, 176
Subdiagonal, 13
Subspace, characteristic, 71
  coordinate, 51
  cyclic, 185
  generated by vector, 185
  invariant, 178
  vector, 63
Substitution, integral, 143, 169
Suleimanova, 87
Superdiagonal, 13
Sylvester, identity of, 32, 33
  inequality of, 66
Systems of differential equations, application of matrices to, 116ff.
  equivalent, 118
  reducible, 118
  regular, 191, 168
  singularity of, 143
  stability of solution, 129
Systems of vectors, bi-orthogonal, 267
  orthonormal, 245

TRACE, 87
Transformation, linear, 3
  of coordinates, 59
  orthogonal, 242, 263
  unitary, 242, 263
  written as matrix equation, 7
  Lyapunov, 117
Transforming matrix, 35, 60
Transpose, 19, 280
Transposition, 18
UNIT SUM OF SQUARES, 314
Unit sphere, 315
Unit vector, 244
VALUE(S), characteristic, maximal, 53
  extremal properties of, 317
  latent, 69
  left and right, of function, 81
  proper, 69
Vector(s), 51
  angle between, 242
  bundle of, 183
  Jordan chain of, 202
  complex, 282
  congruence of, 181
  extremal, 55
  inner product of, 243
  Jordan chain of, 201
  latent, 69
  length of, 242, 243
  linear dependence of, 51
    test for, 251
    modulo 1, 183
  linear independence of, 51
  norm of, 243
  normalized, 244; 66
  null, 52
  orthogonal, 244, 248
  orthogonalization of sequence, 256
  principal, 318, 338
  proper, 69
  projecting, 248
  projection of, orthogonal, 248
  real, 282
  scalar product of, 242, 243
  systems of, bi-orthogonal, 267
    orthonormal, 245
  unit, 244
Vector space, 50ff., 51
  basis of, 51
  defect of, 64
  dimension of, 51
  finite-dimensional, 51
  infinite-dimensional, 51
  nullity of, 64
  rank of, 64
Vector subspace, 63
Volterra, 133, 145, 147
Vyshnegradskii, 179
WEIERSTRASS, 25
