
Inverse Optimality of Cooperative Control for Networked Systems

Zhihua Qu and Marwan Simaan

Abstract— In this paper, inverse optimal control design is applied to quantify performance of linear cooperative systems. It is known that a cooperative control consists of an individual negative feedback and a network positive feedback and that, if the sensing/communication matrix sequence is sequentially complete (or the graph has a globally reachable node or a spanning tree), the corresponding network system is cooperatively asymptotically stable (or convergent to a consensus). It is shown that, for any given topology, the individual negative feedback control part by itself is inversely optimal and so are the pair of individual negative feedback and network positive feedback parts. It is also established that a cooperative system of varying topologies is inversely optimal with respect to a quadratic performance index. Since the optimal performance index depends upon future changes of network topology, an online algorithm is also proposed to estimate the cost incurred up to any instant of time. These results are useful to quantify performance of a cooperative networked system, and they could also be used to improve the choices of weights in the nonnegative row-stochastic network feedback matrix.

I. INTRODUCTION

Cooperative control of dynamical systems has drawn much attention in recent years. One of the primary reasons is that cooperative systems by nature are complex networked systems which arise in such contemporary applications as team operations of autonomous vehicles, search/rescue missions, mobile sensor networks, and distributed energy systems. The other important reason is that cooperative behaviors emerge naturally from group behaviors of animals. For instance, it is shown [31] that animal flocking behaviors can be simulated and reproduced by using simple rules of cohesion, separation and alignment. Consequently, cooperative control has promising applications in such complex systems as cyber-physical systems, biological systems, bio-mimic systems, and human-machine systems. Cooperative control theory [25] aims to provide methodologies and tools for rigorously analyzing and systematically designing cooperative control systems.

Typically, a cooperative system consists of a collection of individual dynamical systems which either actively transmit information or passively observe information through a shared sensing/communication network. Usually, the information flow within the network is local, intermittent and unpredictable, and the dynamical systems attempt to use all the information available to them individually. In addition, the network may have latency, noises, and other

Z. Qu and M. Simaan are with the School of EECS, University of Central Florida, Orlando, FL 32826, USA. This work is supported in part by a recent grant from the CISE CCF Division of the U.S. National Science Foundation.

characteristics. While uncertain dynamical systems have been studied extensively in robust control theory [13], [6], [23] and asymptotic stability analyses based on connective topology have been reported in [33], [34], a cooperative networked system has the distinct feature that its topology of information sharing and feedback is time-varying and unspecified, and its cooperative stability and cooperative controllability should be established according to the cumulative properties of information exchanges.

One commonly used approach to study cooperative systems is the graph theoretical approach; for example, convergence of the nearest neighbor rule [40] is analyzed for a connected and undirected graph [10], for a directed graph with a spanning tree [30], and for a directed graph with a globally reachable node [17]. These results are obtained for agents whose dynamic equations are linear and of relative degree one, and their extensions to second-order dynamics are possible but involved [38], [39]. On the other hand, the matrix theoretical method has a whole set of connectivity concepts equivalent to all those in the graph theoretical approach, and it has the advantages that heterogeneous systems of high relative degree can be handled [28] and that a control Lyapunov function can be found and robustness can be analyzed [25]. For nonlinear discrete systems with convex dynamics and time-varying topological patterns, stability analysis is done using the combination of graph theory and discrete set-valued Lyapunov functions [21], and this result is also extended to continuous-time coupled nonlinear systems [18]. The latter again requires non-smooth analysis and convex dynamics. Recently, a topology-based comparative Lyapunov argument was proposed [24]; it involves a simple check of differential inequalities, it does not require that system dynamics be convex or that any Lyapunov function be known, and it can be used to systematically design cooperative control for heterogeneous nonlinear systems [25].

The aforementioned results enable us to synthesize cooperative control and to draw conclusions in terms of cumulative information flow and on cooperative stability and its convergence rate (should the convergence be uniform). Nonetheless, little is available about quantitative measures of performance of cooperative systems. The objective of this paper is to integrate cooperative control theory and optimal control theory in order to analyze performance of cooperative systems. Since future topology of a cooperative system is not predictable, an optimal cooperative control design is never attainable. Accordingly, it is appropriate to evaluate the performance of a convergent cooperative system by finding its inversely optimal performance index. The basic idea of the inverse optimal design has successfully been explored

Joint 48th IEEE Conference on Decision and Control and 28th Chinese Control Conference, Shanghai, P.R. China, December 16-18, 2009

WeB17.2

978-1-4244-3872-3/09/$25.00 ©2009 IEEE 1651

[11], [8], [37], [5], [14] to avoid solving the Hamilton-Jacobi-Bellman equation in control design for linear and nonlinear uncertain systems and to analyze performance of a stabilizing control by computing its cost and demonstrating optimality with respect to some well-defined and meaningful performance index. In this paper, the inverse optimal design is extended to cooperative systems and, since topology changes of a cooperative networked system are not known or predictable, finding a meaningful performance index and calculating (or estimating) its value are nontrivial even for a linear cooperative system.

The rest of the paper is organized into three sections. In section II, the basic results on cooperative systems and their convergence conditions are briefly reviewed. In section III, inverse optimality of cooperative systems is revealed by establishing the following four results. First, individual feedback control design of a cooperative system is found to be inversely optimal with respect to a quadratic performance index provided that both the topology of information flow in the cooperative system and its corresponding feedback gain matrix are given. Second, it is shown that, given the topology of information flow, both individual feedback and network feedback designs are optimal with respect to a quadratic performance index containing a cross-product term. Third, it is illustrated using the performance index of a zero-sum game that the positive feedback of network information is not adversarial. Finally, a cooperative system of varying topology is shown to be also optimal with respect to a quadratic performance index. These results together demonstrate inverse optimality of convergent cooperative systems and, because future topology changes are not known, an algorithm is also proposed to estimate online the cost incurred up to any instant of time. Possible applications of the proposed results and future research are described in section IV.

II. COOPERATIVE SYSTEMS

Consider a networked dynamical system consisting of n_s individual and heterogeneous systems whose dynamics are given by

ẋ_μ = f_μ(x_μ) + g_μ(x_μ)u_μ,  y_μ = h_μ(x_μ),   (1)

where μ = 1, ..., n_s, x_μ ∈ ℝ^{n_μ} is the state, y_μ ∈ ℝ^m is the output, and u_μ ∈ ℝ^m is the cooperative control to be designed according to the exchanged information among the systems within their network. In the simplest case that the individual systems are all identical agents of first-order dynamics, equation (1) reduces to

ẋ_μ = u_μ,  μ = 1, ..., n_s.   (2)

To characterize information exchanges among the systems over time, the following sensing/communication matrix can be defined without loss of any generality:

S(t) = [ s_{11}        s_{12}(t)   ···  s_{1n_s}(t)
         s_{21}(t)     s_{22}      ···  s_{2n_s}(t)
         ⋮             ⋮           ⋱    ⋮
         s_{n_s1}(t)   s_{n_s2}(t) ···  s_{n_sn_s} ]   (3)

where s_{ii} ≡ 1, s_{ij}(t) = 1 for i ≠ j if the output information from the jth dynamical system is known to the ith system at time t, and s_{ij}(t) = 0 otherwise. By its nature, matrix S(t) is binary and piecewise constant as

S(t) = S(k) for all t ∈ [t_k, t_{k+1}),

where {t_k : k ∈ ℵ} denotes the (infinite) sequence of time instants at which the network topology changes, and ℵ ≜ {0, 1, ..., ∞}.

The objective of cooperative control design is to achieve the so-called (output) asymptotic cooperative stability defined as follows.

Definition 1: Systems in the form of (1) are said to be cooperatively stable if, for every given ε > 0, there exists a constant δ > 0 such that, for all the initial conditions satisfying ‖x_μ(t_0) − x_l(t_0)‖ ≤ δ, ‖y_μ(t) − y_l(t)‖ ≤ ε for all t ≥ t_0. The systems are said to be asymptotically cooperatively stable if they are cooperatively stable and if, for all μ and for every set of initial conditions {x_μ(t_0), μ = 1, ..., n_s}, there exists a constant c ∈ ℝ such that lim_{t→∞} y_μ(t) = c1, where the value of c depends upon the initial conditions x_μ(t_0) and 1 ∈ ℝ^m is the vector of 1s. ♦

It is distinctive that a cooperative control must be of the general form

u_μ = u_{μ,1}(s_{μ1}(t)(y_1 − y_μ), ..., s_{μn_s}(t)(y_{n_s} − y_μ)) + u_{μ,2}(x_μ),   (4)

where u_{μ,2}(x_μ) is the individual self-feedback control component aiming at stabilizing the μth system such that x^e_μ = c1 are among the equilibrium points when u_{μ,1} = 0, and u_{μ,1}(·) is the network feedback control component. In order for control u_μ(·) to be feasible, component u_{μ,1}(·) must change according to the μth row of sensing/communication matrix S. Hence, in analysis and design of cooperative control, we have to consider not only system dynamics and their stability property but also intermittent and unpredictable changes of the network.

As an example, consider the collection of agents in (2). Dynamics of the agents are Lyapunov stable at the equilibria x^e = c1. Hence, self-feedback components u_{μ,2}(x_μ) are no longer needed here, and their cooperative control can be chosen to be linear as

u_μ(t) = Σ_{i=1}^{n_s} [a_{μi} s_{μi}(t) / Σ_{ν=1}^{n_s} a_{μν} s_{μν}(t)] [y_i(t) − y_μ(t)]   (5)
       ≜ −y_μ(t) + Σ_{i=1}^{n_s} D_{μi}(t) y_i(t),

where a_{ij} > 0 are weighting coefficients. Then, the closed-loop model of the overall networked system can be expressed as

ẋ = [−I + D(t)]x,   (6)

where

D(t) = [ D_{11}(t)    ···  D_{1n_s}(t)
         ⋮            ⋱    ⋮
         D_{n_s1}(t)  ···  D_{n_sn_s}(t) ],


and D(t) is a nonnegative, piecewise-constant and row-stochastic matrix whose changes are not known a priori.
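As an illustration, the control law (5) and the resulting closed loop (6) can be simulated directly. The sketch below is not from the paper: the agent count, topology, weights and step size are arbitrary choices. It builds a row-stochastic D from a binary sensing matrix S with uniform weights a_ij = 1 (the nearest-neighbor rule) and integrates (6) with Euler steps; the states converge to a common constant.

```python
# Sketch: Euler simulation of the closed loop xdot = [-I + D(t)]x in (6),
# with D built row-stochastically from a binary sensing matrix S as in (5).
def build_D(S, a):
    """Row-stochastic D: D[mu][i] = a[mu][i]*S[mu][i] / sum_nu a[mu][nu]*S[mu][nu]."""
    n = len(S)
    D = []
    for mu in range(n):
        denom = sum(a[mu][nu] * S[mu][nu] for nu in range(n))
        D.append([a[mu][i] * S[mu][i] / denom for i in range(n)])
    return D

def step(x, D, dt):
    """One Euler step of xdot = (-I + D) x."""
    n = len(x)
    return [x[i] + dt * (sum(D[i][j] * x[j] for j in range(n)) - x[i])
            for i in range(n)]

# 3 agents, fixed strongly connected topology, uniform weights a_ij = 1.
S = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
a = [[1.0] * 3 for _ in range(3)]
D = build_D(S, a)
x = [1.0, -2.0, 0.5]
for _ in range(20000):
    x = step(x, D, 0.001)
print(x)  # all three entries nearly equal (a consensus is reached)
```

For this symmetric choice of weights the consensus value is the average of the initial states, consistent with Definition 1's dependence of c on the initial conditions.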

Cooperative control (5) is a truly distributed control law; it is also referred to in [40] as the nearest-neighbor control rule if a_{ij} = 1, and its cooperative stability is often called the consensus problem. Using graph theory, the consensus problem is rigorously analyzed in [10] for the case that the system topology is an undirected and connected graph. In [30], [16], the convergence condition is relaxed to a directed graph with strong connectivity or with a spanning tree. Cooperative algorithms have also been studied using the combination of graph theory and the Nyquist stability criterion [4], [32], using proximity graphs [2], and using the combination of graph theory and discrete set-valued Lyapunov functions [21].

Most of the aforementioned results deal with identical linear networked systems of simple dynamics (of either first order or second order). For heterogeneous systems of high order, a linear canonical form [28] is proposed using feedback linearization and, for cooperative control design, the design always renders a closed-loop system in the form of (6); a matrix-theoretical approach [27], [28] is developed to establish a necessary and sufficient condition on cumulative information flow for asymptotic cooperative stability. For nonlinear systems, cooperative stability under the convexity property is analyzed in [21], and cooperative control design using a topology-based Lyapunov argument is presented in [25]. Due to space limitation, only two necessary concepts and the key result are summarized in the rest of this section so that the subsequent analysis becomes self-contained.

Let I′ ≜ {k′_v : v ∈ ℵ} be a subsequence of ℵ and consider any two consecutive integers k′_v and k′_{v+1} therein. Then, the cumulative exchange of information within the network over time interval [t_{k′_v}, t_{k′_{v+1}}] can be described by S_Λ(v), where

S_Λ(v) ≜ S(t_{k′_{v+1}}) ∧ S(t_{k′_{v+1}−1}) ∧ ··· ∧ S(t_{k′_v}),   (7)

is the binary product of all sensing/communication matrices within the time interval, and ∧ denotes the operation of generating a binary product of two binary matrices. Furthermore, there exists a permutation matrix T(v) to map S_Λ(v) into the lower-triangular canonical form [20] as: for some integer p(v) > 0,

T^T(v) S_Λ(v) T(v) = [ S′_{Λ,11}(v)   0              ···  0
                       S′_{Λ,21}(v)   S′_{Λ,22}(v)   ···  0
                       ⋮              ⋮              ⋱    ⋮
                       S′_{Λ,p1}(v)   S′_{Λ,p2}(v)   ···  S′_{Λ,pp}(v) ],   (8)

where S′_{Λ,ii}(v) are square and irreducible submatrices.
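The binary product in (7) can be made concrete in a few lines of code. In this sketch (function names and the example topologies are illustrative, not from the paper), an information link 2 → 1 present at the earlier instant composed with a link 1 → 0 at the later instant yields a cumulative link 2 → 0 that neither snapshot contains:

```python
# (A ∧ B)[i][j] = 1 iff sum_k A[i][k]*B[k][j] > 0, i.e. the binary product
# keeps only whether an information path exists (A is the later snapshot).
def binary_product(A, B):
    n = len(A)
    return [[1 if any(A[i][k] and B[k][j] for k in range(n)) else 0
             for j in range(n)] for i in range(n)]

def cumulative(S_seq):
    """S_Lambda over a window, as in (7): S(latest) ∧ ... ∧ S(earliest).
    S_seq is ordered earliest to latest."""
    out = S_seq[-1]
    for S in reversed(S_seq[:-1]):
        out = binary_product(out, S)
    return out

# Earlier snapshot: agent 1 hears agent 2.  Later snapshot: agent 0 hears 1.
S_early = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
S_late  = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
S_cum = cumulative([S_early, S_late])
print(S_cum)  # entry [0][2] is 1: agent 2's information has reached agent 0
```

The ordering in (7), latest matrix leftmost, is what lets such two-hop information relays accumulate.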

Definition 2: [28] Sensor/communication matrix sequence {S(k), k ∈ ℵ} defined in (3) is said to be sequentially complete if there exists an infinitely-long subsequence I′ ≜ {k′_v : v ∈ ℵ} ⊂ ℵ such that the corresponding lower-triangular canonical form (8) has the property that, in every row of i ≥ 2, there is at least one j < i such that S′_{Λ,ij}(v) ≠ 0 for all v ∈ ℵ. The sequence is said to be uniformly sequentially complete if non-zero elements of S′_{Λ,ij}(v) do not vanish as v → ∞ and if time intervals [t_{k′_v}, t_{k′_{v+1}}] are all uniformly bounded. ♦

Definition 3: [26] V_c(x(t), t) is said to be a cooperative control Lyapunov function for a networked system ẋ = F(x, t) with initial condition x(t_0) if F(c1, t) = 0 for all t ≥ t_0 and for all c ∈ ℝ, if V_c(c1, t) = 0 is the global minimum (i.e., V_c(x, t) > 0 whenever x ≠ c1), if V_c(x, t) is uniformly bounded with respect to t, and if V_c(x(t′), t′) > V_c(x(t), t) along the solution of the system and for all t > t′ ≥ t_0 unless x(t′) = c1. ♦

Theorem 1: [28], [25] System (6) is uniformly asymptotically cooperatively stable if and only if the sensor/communication matrix sequence {S(k), k ∈ ℵ} is uniformly sequentially complete. Under the same condition, a cooperative control Lyapunov function exists and, within every interval [t_k, t_{k+1}], its analytical expression always consists of complete square terms of (x_j − x_i).
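For a fixed topology S, uniform sequential completeness reduces to the graph conditions quoted in the introduction (a globally reachable node, equivalently a spanning tree). A minimal reachability check can be sketched as follows; the convention that s_ij = 1 means information flows from j to i follows (3), and the example topologies are illustrative:

```python
from collections import deque

def info_reaches_all(S, j):
    """True if agent j's information propagates to every agent: follow edges
    j -> i wherever S[i][j] = 1 (agent i hears agent j)."""
    n = len(S)
    seen = {j}
    queue = deque([j])
    while queue:
        k = queue.popleft()
        for i in range(n):
            if S[i][k] and i not in seen:
                seen.add(i)
                queue.append(i)
    return len(seen) == n

def has_root(S):
    return any(info_reaches_all(S, j) for j in range(len(S)))

# A directed chain 2 -> 1 -> 0 (each agent hears its right neighbor):
# agent 2's information reaches everyone.
S_chain = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
# Two isolated pairs: no single agent's information reaches all others.
S_split = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]]
print(has_root(S_chain), has_root(S_split))
```

For time-varying topologies the same check would be applied to the cumulative matrices S_Λ(v) of (7) rather than to any single snapshot.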

III. INVERSE OPTIMALITY OF COOPERATIVE SYSTEMS

Optimal control theory can be used to synthesize an open-loop or feedback control law for a given system and to analyze its performance. For instance, consider the following class of affine systems:

ẋ = f(x, t) + g(x, t)u,   (9)

where x ∈ ℝ^n, u ∈ ℝ^{n_u}, and x(t_0) = x_0 is the initial condition. The performance criterion to be optimized is often chosen to be the following cost functional:

J(t_0, t_f, x(t_0), u) = φ(x(t_f), t_f) + ∫_{t_0}^{t_f} L(x(τ), u(τ)) dτ,   (10)

where functions φ(·) and L(·) are positive semi-definite with respect to their arguments. The principle of optimality states that, if u* : [t_0, t_f] → ℝ^{n_u} is optimal with respect to J(t_0, t_f, x(t_0), u), it must also be optimal with respect to J(t, t_f, x(t), u); that is, J(t, t_f, x(t), u*) ≤ J(t, t_f, x(t), u) for all u and all t ∈ (t_0, t_f]. Based on the principle of optimality, it is shown [1] by perturbation analysis that optimal control u* can be found by minimizing the corresponding Hamiltonian; that is,

(∂/∂u*) H(x, u*, ∂J*(t, t_f, x)/∂x, t) = 0,   (11)

where H(x, u, λ, t) = L(x, u) + λ^T[f(x, t) + g(x, t)u] is the Hamiltonian, and J*(t, t_f, x) is the optimal value of J(t, t_f, x(t), u) and also the solution to the following Hamilton-Jacobi-Bellman (HJB) equation:

∂J*(t, t_f, x)/∂t = −H(x, u*, λ, t) |_{λ = ∂J*(t, t_f, x)/∂x}.   (12)

In the case that t_f = ∞, L(x, u) = q(x) + u^T R(x)u, f(x, t) = f(x) and g(x, t) = g(x) in system (9) and performance index (10), the optimal control becomes

u = −(1/2) R^{−1}(x) L_g J*,

where J* is the solution to the steady-state HJB equation:

q(x) + L_f J* − (1/4)(L_g J*)^T R^{−1}(x) L_g J* = 0.   (13)

Alternatively, the Euler-Lagrange method and the Pontryagin minimum principle [15] can also be applied to derive the optimal control.
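For a scalar special case, the steady-state HJB equation (13) can be solved in closed form and the cost bookkeeping verified numerically. In the sketch below, f(x) = a·x, g(x) = b, q(x) = q·x², R(x) = r are assumed illustrative values (not from the paper); the quadratic guess J* = p·x² turns (13) into an algebraic Riccati equation:

```python
import math

# Scalar case of (9)/(13): with J* = p*x^2, equation (13) becomes
#   q + 2*a*p - (b*p)**2 / r = 0,
# and u* = -(1/2) r^{-1} L_g J* = -(b*p/r) x. Values below are illustrative.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
# positive root of -(b^2/r) p^2 + 2 a p + q = 0
p = (2 * a + math.sqrt(4 * a * a + 4 * q * b * b / r)) / (2 * b * b / r)
assert abs(q + 2 * a * p - (b * p) ** 2 / r) < 1e-9  # (13) holds

# Closed loop xdot = (a - b^2 p / r) x: integrate the running cost
# q x^2 + r u^2 and compare with the predicted optimal value J* = p x0^2.
x0, dt, T = 2.0, 1e-4, 20.0
x, J = x0, 0.0
k = a - b * b * p / r  # stable closed-loop gain (negative here)
for _ in range(int(T / dt)):
    cost = q * x * x + r * (b * p / r * x) ** 2
    x_new = x * math.exp(k * dt)  # exact scalar propagation over one step
    cost_new = q * x_new * x_new + r * (b * p / r * x_new) ** 2
    J += 0.5 * dt * (cost + cost_new)  # trapezoid rule
    x = x_new
print(J, p * x0 * x0)  # the accumulated cost matches p*x0^2
```

With a = b = q = r = 1 the Riccati root is p = 1 + √2, and the integral of the running cost reproduces J* = p·x0², which is the bookkeeping that the inverse optimal arguments below perform symbolically.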

Because HJB equation (12) and its steady-state version (13) are partial differential equations, the optimal control design is generally too difficult to solve analytically except for a few special cases. Inverse optimal control design offers an alternative for which solving the HJB equation is not required. The basic idea of the inverse optimal design is to analyze performance of a stabilizing control by computing its cost and demonstrating optimality with respect to some well-defined and meaningful performance index. Inverse optimal control has successfully been explored [11], [8], [37], [5], [14] for linear and nonlinear systems.

In this paper, inverse optimality of networked cooperative systems is investigated. Since the same process can be followed to establish inverse optimality for systems (1) and (6), linear systems are considered subsequently while discussions on nonlinear systems are omitted due to space limitation.

Compared with standard control systems, cooperative systems have two distinctive features. First, the objective of cooperative control is to ensure output consensus of y_μ → c1 for some c ∈ ℝ; this implies x → c1, where x is the new state under the input-output linearization transformation [28], and hence function L(x, u) in performance index (10) should be a positive semi-definite function of (not x and u but) (x_j − x_i) and (u_l − u_i). This can be accomplished by replacing x in (10) with z, where

z = G_i W_i x,   (14)

W_i ∈ ℝ^{(n−1)×n} is the resulting matrix after inserting −1 as the ith column into identity matrix I_{(n−1)×(n−1)}, and G_i ∈ ℝ^{n×(n−1)} denotes the resulting matrix after eliminating the ith column from I_{n×n}. Second, topology of the networked system is never fixed and its change is not known or predictable. This implies that, even in the case of linear cooperative system (6) under a quadratic performance index, finding a meaningful performance index and calculating its value are nontrivial problems.
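The matrices in (14) are easy to construct explicitly. The following sketch (helper names are illustrative) builds W_i and G_i and checks the identity G_i W_i x = x − x_i·1, which is used repeatedly in the sequel:

```python
# W_i: identity I_{n-1} with a column of -1's inserted at position i.
# G_i: identity I_n with column i removed.
# Then G_i W_i x = x - x_i * 1, whose entries are the disagreements x_j - x_i.
def Wi(n, i):
    rows = []
    for r in range(n - 1):
        row = [0.0] * n
        row[r if r < i else r + 1] = 1.0  # identity part, skipping column i
        row[i] = -1.0                     # the inserted -1 column
        rows.append(row)
    return rows

def Gi(n, i):
    return [[1.0 if (c if c < i else c + 1) == r else 0.0
             for c in range(n - 1)] for r in range(n)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

n, i = 4, 1
x = [3.0, -1.0, 2.0, 0.5]
z = matvec(Gi(n, i), matvec(Wi(n, i), x))
print(z)  # equals [x_j - x_i for each j], i.e. x - x[i]*1
```

Note also that W_i·1 = 0, the fact invoked in section C to conclude D*1 = 0.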

In what follows, inverse optimality of linear cooperative system (6) is established step by step by utilizing its cooperative control Lyapunov function [26]. The results can then be extended to nonlinear cooperative systems by employing the topology-dependent Lyapunov argument [24], [25].

A. Cooperative System of Fixed Topology: Individual Feedback Control

To motivate our analysis, let us first consider the following cooperative system of fixed topology:

ẋ = [−I + D]x,   (15)

where D ∈ ℝ^{n×n} is a nonnegative, row-stochastic, diagonally positive, and lower triangularly complete matrix. The simplified auxiliary control system corresponding to (15) is

ẋ = Dx + v,   (16)

in which Dx is the “command” signal (combining all the available information from the network), and v is the (individual) feedback control to be designed. The following theorem shows that an optimal design of control v renders cooperative system (15). That is, cooperative system (15) is inversely optimal with respect to the following performance index:

J = Σ_{i=1}^n p_i J_i = Σ_{i=1}^n p_i ∫_0^∞ L_i(x, v) dt,   (17)

where p_i > 0, Q_i and R_i are symmetric matrices,

L_i(x, v) = x^T W_i^T Q_i W_i x + v^T W_i^T R_i W_i v,

and W_i ∈ ℝ^{(n−1)×n} is the matrix in (14).

Theorem 2: Consider system (16) and performance index (17). Choose Q_i = 2G_i^T(1.5P − PD − D^T P)G_i and R_i = G_i^T P G_i, where P = diag{p_1, ..., p_n} is a positive definite and diagonal matrix, and G_i ∈ ℝ^{n×(n−1)} is that defined in (14). Then, control v* = −R^{−1}Px = −x is the optimal control with respect to J. Furthermore, if matrix D is irreducible, matrix P can be chosen with p_i = λ_i, where λ = [λ_1 ··· λ_n]^T is the unity left eigenvector of D associated with eigenvalue ρ(D) = 1. Then, L_i(x, v*) ≥ 0, system (16) is asymptotically cooperatively stable, and

J* = Σ_{i,j=1}^n λ_i λ_j [x_j(0) − x_i(0)]^2.

Proof: It follows from (16) that

W_i ẋ = W_i[−I + D]x + W_i(v − v*),

and that, by lemma 5.2 in [25],

W_i ẋ = −W_i x + W_i D G_i W_i x + W_i(v − v*).

It follows from lemma 5.18 in [25] that, for all x ∈ ℝ^n and for z_i = G_i W_i x,

Σ_{i=1}^n p_i z_i^T[−2P + P G_i W_i D + D^T W_i^T G_i^T P]z_i = 2 Σ_{i=1}^n p_i z_i^T[−2P + PD + D^T P]z_i.

Now, choosing

V = Σ_{i=1}^n p_i V_i = Σ_{i=1}^n p_i x^T(t) W_i^T G_i^T P G_i W_i x(t) = Σ_{i=1}^n p_i Σ_{j=1}^n p_j [x_j(t) − x_i(t)]^2,   (18)

differentiating it and then completing squares yield

V̇ = Σ_{i=1}^n 2 p_i x^T W_i^T G_i^T P G_i W_i ẋ
  = Σ_{i=1}^n p_i {2 x^T W_i^T G_i^T[−1.5P + PD + D^T P]G_i W_i x − (v*)^T W_i^T R_i W_i v* − 2(v*)^T W_i^T R_i W_i(v − v*)}
  = Σ_{i=1}^n p_i {−L_i(x, v) + (v − v*)^T W_i^T R_i W_i(v − v*)}.   (19)

Thus, by integrating both sides of the above equation, we have

J = V(0) − lim_{t→∞} V(t) + Σ_{i=1}^n p_i ∫_0^∞ (v − v*)^T W_i^T R_i W_i(v − v*) dt,   (20)

which is minimized under the optimal choice of v = v*.

It follows from the Perron-Frobenius theorem [22], [7] that, for nonnegative irreducible matrix D, λ_i > 0 for all i. Thus, Lyapunov function V_i in (18) is positive definite with respect to [x_j(t) − x_i(t)]. It follows from (19) that, under the choice of v = v*,

V̇ = −Σ_{i=1}^n p_i L_i(x, v*) = −Σ_{i=1}^n 2 p_i x^T W_i^T H_i W_i x,

where

H_i = G_i^T(2P − PD − D^T P)G_i.

To show that H_i is positive definite and hence V̇ is negative definite, recall from D being irreducible and of row sum 1 that the matrix sum 2P − PD − D^T P is a singular and irreducible M-matrix. It follows from (f) and (a) of theorem 4.31 in [25] that (2P − PD − D^T P) is positive semi-definite and of rank (n − 1) under the choice of P, and

(2P − PD − D^T P)1 = 2λ − P1 − D^T λ = 0.

Since x = x_i 1 + G_i W_i x and (G_i W_i x)^T(2P − PD − D^T P)(G_i W_i x) ≠ 0 unless G_i W_i x = 0, matrix H_i is positive definite. Thus, V̇ is negative definite with respect to [x_j(t) − x_i(t)] and, by the Lyapunov stability theorem [12], [23] and by (20), [x_j(t) − x_i(t)] asymptotically converges to zero for all i, j and J = V(0). Q.E.D.

In theorem 2, expression (19) holds for all choices of p_i > 0, and V is a Lyapunov function. Once P is chosen according to eigenvector λ, V̇ is negative definite. In fact, matrices H_i are all positive definite. In case matrix D is reducible but lower triangularly complete, each H_i by itself may no longer be positive definite. Nonetheless, it is shown by theorem 5.20 in [25] that p_i can be found such that V̇ is negative definite. Therefore, theorem 2 can be generalized so that matrix D needs only to be lower triangularly complete; the details are omitted here for brevity.
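Theorem 2's cost formula can be checked numerically. In the sketch below, D and x(0) are arbitrary test data, not from the paper. With v = v* = −x, L_i(x, v*) collapses to the disagreement form 2 z_i^T(2P − PD − D^T P) z_i with z_i = G_i W_i x = x − x_i·1 (combine Q_i and R_i and use z^T P D z = z^T D^T P z); the code integrates Σ_i λ_i L_i along (15) and compares the total against J*:

```python
# Numerical illustration of theorem 2 (test data are illustrative).
n = 3
D = [[0.5, 0.3, 0.2], [0.2, 0.6, 0.2], [0.3, 0.3, 0.4]]  # irreducible, row-stochastic
x0 = [1.0, -2.0, 0.5]

# Left Perron eigenvector of D (eigenvalue 1), normalized to sum 1,
# obtained by power iteration on D^T.
lam = [1.0 / n] * n
for _ in range(500):
    lam = [sum(D[i][j] * lam[i] for i in range(n)) for j in range(n)]
    s = sum(lam)
    lam = [v / s for v in lam]

J_star = sum(lam[i] * lam[j] * (x0[j] - x0[i]) ** 2
             for i in range(n) for j in range(n))

def running_cost(x):
    """sum_i lam_i L_i(x, v*) with L_i = 2 z_i'(2P - PD - D'P) z_i."""
    tot = 0.0
    for i in range(n):
        z = [xj - x[i] for xj in x]
        Dz = [sum(D[r][c] * z[c] for c in range(n)) for r in range(n)]
        quad = 2 * sum(lam[j] * z[j] * (z[j] - Dz[j]) for j in range(n))
        tot += lam[i] * 2 * quad
    return tot

def xdot(x):  # closed loop (15): xdot = (-I + D) x
    return [sum(D[i][j] * x[j] for j in range(n)) - x[i] for i in range(n)]

x, J, dt = x0[:], 0.0, 1e-3
for _ in range(int(25.0 / dt)):  # RK4 for the state, trapezoid for the cost
    c0 = running_cost(x)
    k1 = xdot(x)
    k2 = xdot([x[i] + 0.5 * dt * k1[i] for i in range(n)])
    k3 = xdot([x[i] + 0.5 * dt * k2[i] for i in range(n)])
    k4 = xdot([x[i] + dt * k3[i] for i in range(n)])
    x = [x[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(n)]
    J += 0.5 * dt * (c0 + running_cost(x))
print(J, J_star)  # the accumulated cost matches J*
```

The agreement J = J* = V(0) is exactly the conclusion of the proof above; changing D (while keeping it irreducible and row-stochastic) changes λ and J* but not the agreement.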

B. Cooperative System of Fixed Topology: Individual and Network Feedback Controls

To account for both individual feedback and network “command” control, the generalized auxiliary control system corresponding to (15) is

ẋ = Dw + v,   (21)

in which connectivity matrix D is dictated by the network, w is the networked control to be designed, v is the (individual) feedback control to be designed, and the designs of D and w are subject to the constraints that

0 ≤ d_{ij} ≤ s_{ij},  d_{ii} > 0,  Σ_{j=1}^n d_{ij} = 1,  w = [w_1(x_1); ...; w_n(x_n)].

The following theorem shows that, with respect to performance index (22), optimal designs of controls v and w render cooperative system (15). That is, cooperative system (15) is inversely optimal with respect to performance index (22).

Theorem 3: Consider system (21) and performance index

J′_i = ∫_0^∞ L′_i(x, D, w, v) dt,   (22)

where

L′_i(x, D, w, v) = x^T W_i^T G_i^T P G_i W_i x − 2 x^T W_i^T G_i^T P G_i W_i D G_i W_i w + v^T W_i^T G_i^T P G_i W_i v,

and G_i, W_i and P are those defined in theorem 2. Then, the “optimal” controls

v* = −x,  w* = x

satisfy all the necessary conditions of optimality.

Proof: It follows from lemma 5.2 in [25] that system (21) can be expressed as

W_i ẋ = W_i D G_i W_i w + W_i v.

Define the Hamiltonian as

H(x, D, w, v, λ) = L′_i(x, D, w, v) + λ^T[W_i D G_i W_i w + W_i v].

It follows from [15], [1] that the necessary conditions of optimality are

∂H/∂x = −W_i^T λ̇,  ∂H/∂d_{kl} = 0,  ∂H/∂w = 0,  ∂H/∂v = 0.

It is straightforward to verify algebraically that, with λ = 2G_i^T P G_i W_i x, all the above conditions hold. Q.E.D.

The results in theorems 2 and 3 are closely related in that L′_i(x, D, w*, v) = L_i(x, v). It is worth noting that theorem 3 does not provide any explicit “optimal” solution to matrix D. Indeed, the pattern of non-zero entries in matrix D is uniquely determined by network connectivity (i.e., according to sensing/communication matrix S). With this constraint and given the facts that the value of performance index (22) is linear in D and that D has all its row sums equal to one (i.e., it is “row stochastic”), the optimal value of matrix D can be found using a linear programming algorithm.
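The linear program alluded to above decomposes row by row, because the constraints 0 ≤ d_ij ≤ s_ij, d_ii > 0 and Σ_j d_ij = 1 couple only entries within one row and the objective is linear in D. A minimal sketch follows; the cost coefficients c[i][j] and the diagonal floor eps are hypothetical placeholders (in the paper the coefficients would come from the D-linear value of performance index (22)):

```python
# Per-row linear program: minimize sum_j c[j]*d[j] subject to
#   sum_j d[j] = 1,  0 <= d[j] <= s[j],  d[i] >= eps  (diagonal kept positive).
# A linear objective over this set is attained by a fractional-knapsack greedy.
def optimize_row(c_row, s_row, i, eps=0.05):
    n = len(c_row)
    d = [0.0] * n
    d[i] = eps                      # keep the diagonal entry strictly positive
    budget = 1.0 - eps
    # pour the remaining mass onto admissible entries in order of increasing cost
    for j in sorted(range(n), key=lambda j: c_row[j]):
        if s_row[j] and budget > 0.0:
            take = min(budget, 1.0 - d[j])
            d[j] += take
            budget -= take
    return d

S = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]                     # sensing matrix
c = [[0.3, 0.1, 0.9], [0.4, 0.2, 0.7], [0.5, 0.8, 0.2]]   # hypothetical costs
D = [optimize_row(c[i], S[i], i) for i in range(len(S))]
print(D)  # each row is row-stochastic and respects the sparsity of S
```

Each row is a linear objective over a simplex intersected with box constraints, so the greedy attains the row optimum; a general-purpose LP solver would return the same vertex.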


C. Cooperative System of Fixed Topology: A Game Perspective

Consider the zero-sum linear quadratic differential game with performance index

J = Σ_{i=1}^n p_{e_i} J_i,

J_i = ∫_0^∞ [x^T W_i^T G_i^T P_e^{−1} Q_{e_i} P_e^{−1} G_i W_i x + v^T W_i^T G_i^T P_e^{−1} G_i W_i v − η^T P_e^{−1} η] dt,   (23)

subject to the constraint of the fictitious system

ẋ = Eη + v,   (24)

where E is a given matrix, P_e = diag{p_{e_1}, ..., p_{e_n}} with p_{e_i} > 0, and

Q_{e_i} = P_e − G_i W_i E P_e E^T W_i^T G_i^T.

By mimicking the steps in the proof of theorem 2, one can show that

J_i = \int_0^\infty (v - v^*)^T W_i^T G_i^T P_e^{-1} G_i W_i (v - v^*) \, dt - \int_0^\infty (\eta - \eta_i^*)^T P_e^{-1} (\eta - \eta_i^*) \, dt + x^T(0) W_i^T G_i^T P_e^{-1} G_i W_i x(0) - \lim_{t \to \infty} x^T(t) W_i^T G_i^T P_e^{-1} G_i W_i x(t),

where

v^* = -x, \quad \eta_i^* = P_e E^T W_i^T G_i^T P_e^{-1} G_i W_i x,   (25)

which is the optimal pair of game strategies. It follows that, under (25), the Lyapunov function

V_i = x^T W_i^T G_i^T P_e^{-1} G_i W_i x

has time derivative

\dot{V}_i = -2 x^T W_i^T G_i^T P_e^{-1} Q_{e_i} P_e^{-1} G_i W_i x.

It is not difficult to check that \dot{V}_i may assume positive values (for a nonnegative, row-sum-1 and irreducible matrix E). Thus, asymptotic cooperative stability cannot be concluded in general for fictitious system (24) under optimal strategies (25). This result is not surprising since optimal game strategies do not necessarily form an equilibrium pair [9]. Note that fictitious system (24) under optimal strategies (25) becomes cooperative system (16) provided that

D = E P_e E^T W_i^T G_i^T P_e^{-1} G_i W_i \triangleq D^*.

Given sensing/communication matrix S, the above choice D^* (if it were feasible) would be the worst in the sense of the value functional J. Fortunately, this never happens since D^* \mathbf{1} = 0 while D \mathbf{1} = \mathbf{1}.

Should the performance index be modified to

J = \sum_{i=1}^n p_{e_i} \int_0^\infty [x^T W_i^T G_i^T P_e^{-1} Q_{e_i} P_e^{-1} G_i W_i x + v^T W_i^T G_i^T P_e^{-1} G_i W_i v - \eta^T W_i^T G_i^T P_e^{-1} G_i W_i \eta] \, dt,

the optimal pair of game strategies becomes

v^* = -x, \quad G_i W_i \eta_i^* = P_e E^T W_i^T G_i^T P_e^{-1} G_i W_i x.

While v^* remains the same, the above expression for \eta_i^* generally yields no solution by itself, and hence any feasible choice of matrix D in cooperative system (16) does not correspond to any game solution.

In summary, network connectivity and its corresponding choice of D are not adversarial and thus will not adversely affect cooperative stability.

D. Cooperative System of Varying Topologies

In the presence of network topology changes, the auxiliary control system corresponding to (6) is

\dot{x} = D(t) x + v,   (26)

where D(t) is piecewise constant and nonnegative and has its row sums equal to one. The inverse optimality of cooperative system (26) is then shown by the following theorem, and theorem 3 can also be extended to the case of changing topologies.
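As a quick illustration of system (26) under the control v = -x, i.e., \dot{x} = [-I + D(t)]x, the sketch below Euler-integrates a piecewise-constant, row-stochastic D(t); with a (uniformly sequentially complete) switching sequence whose union topology has a globally reachable node, the state spread contracts toward consensus. The specific matrices, schedule, and step size are illustrative assumptions, not from the paper:

```python
import numpy as np

# Two illustrative row-stochastic topologies; together, node 0 is
# globally reachable (0 -> 1 under D0, 1 -> 2 under D1).
D0 = np.array([[1.0, 0.0, 0.0],
               [0.5, 0.5, 0.0],
               [0.0, 0.0, 1.0]])
D1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.5, 0.5]])

def simulate(x0, schedule, dt=1e-3):
    """Euler-integrate xdot = (-I + D(t)) x, i.e., (26) with v = -x,
    where `schedule` is a list of (D, duration) pairs."""
    x = np.array(x0, dtype=float)
    for D, T in schedule:
        A = D - np.eye(len(x))
        for _ in range(int(round(T / dt))):
            x = x + dt * (A @ x)
    return x

x0 = [0.0, 4.0, 8.0]
xT = simulate(x0, [(D0, 2.0), (D1, 2.0), (D0, 2.0), (D1, 2.0)])
spread0 = max(x0) - min(x0)
spreadT = xT.max() - xT.min()
assert spreadT < spread0   # states contract toward consensus
```

Node 0's row is the unit vector in both topologies, so x_0 stays fixed and the other states are pulled toward it as the schedule repeats.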

Theorem 4: Consider system (26) and performance index

J_o = \int_0^\infty L_o(x, v, P_o(t)) \, dt,   (27)

where

L_o(x, v, P_o(t)) = x^T W_i^T G_i^T [Q_o - P_o(t)] G_i W_i x + v^T W_i^T G_i^T P_o(t) G_i W_i v,

W_i and G_i are the matrices defined before, Q_o is a given symmetric and positive definite matrix, and P_o(t) is a symmetric and positive definite matrix to be determined. If the changes of D(t) are uniformly sequentially complete, then there exists a positive definite matrix function P_o(t) such that control v^* = -x is the optimal control with respect to J_o, L_o(x, v^*, P_o(t)) \ge 0, and the optimal cost is

J_o^* = x^T(0) W_i^T G_i^T P_o(0) G_i W_i x(0).

Proof: It follows from (26) and lemma 5.18 in [25] that

\frac{d}{dt}(G_i W_i x) = G_i W_i [-I + D(t)] G_i W_i x + G_i W_i (v - v^*).

It has been shown in [28] that, since the piecewise matrix sequence of D(t) is uniformly sequentially complete, the above system with v = v^* is uniformly asymptotically stable. It follows from the Lyapunov converse theorem [12] that, for every positive definite matrix Q_o, the solution P_o(t) to the Lyapunov equation

\dot{P}_o + P_o G_i W_i [-I + D] + [-I + D]^T W_i^T G_i^T P_o = -Q_o

is positive definite.

Consider the Lyapunov function

V_o = x^T W_i^T G_i^T P_o(t) G_i W_i x,


and its time derivative is given by

\dot{V}_o = x^T W_i^T G_i^T \dot{P}_o G_i W_i x + 2 x^T W_i^T G_i^T P_o \frac{d}{dt}(G_i W_i x)
= -x^T W_i^T G_i^T Q_o G_i W_i x + 2 x^T W_i^T G_i^T P_o G_i W_i (v - v^*)
= -L_o(x, v, P_o(t)) - (v^*)^T W_i^T G_i^T P_o G_i W_i v^* + v^T W_i^T G_i^T P_o(t) G_i W_i v - 2 (v^*)^T W_i^T G_i^T P_o G_i W_i (v - v^*)
= -L_o(x, v, P_o(t)) + (v - v^*)^T W_i^T G_i^T P_o G_i W_i (v - v^*).

The proof is completed by integrating both sides of the above equation and by recalling both the convergence of V_o and the expression of L_o(x, v, P_o(t)). Q.E.D.
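The completed-square step in the proof is a purely algebraic identity and can be checked numerically. The sketch below takes W_i = G_i = I as an assumed simplification (those matrices are defined elsewhere in the paper) and uses random symmetric P_o and Q_o with v^* = -x:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Random symmetric P_o, Q_o and arbitrary x, v; W_i = G_i = I is an
# illustrative assumption so the identity can be checked directly.
A = rng.standard_normal((n, n)); P = A + A.T
B = rng.standard_normal((n, n)); Q = B + B.T
x = rng.standard_normal(n)
v = rng.standard_normal(n)
v_star = -x

L = x @ (Q - P) @ x + v @ P @ v               # L_o(x, v, P_o) with W = G = I
lhs = -x @ Q @ x + 2 * x @ P @ (v - v_star)   # second line of the derivation
rhs = -L + (v - v_star) @ P @ (v - v_star)    # completed-square form
assert np.isclose(lhs, rhs)
```

The identity holds for any symmetric P, which is why the proof only needs positive definiteness of P_o to conclude L_o(x, v^*, P_o(t)) \ge 0.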

By their nature, network topology changes are unpredictable. Thus, it is unattainable to find matrix P_o(t) or the corresponding optimal cost value J_o^* since P_o(t) has to be found backward in time. In addition, unlike the diagonal matrix P in theorem 2, it is not known whether matrix P_o(t) in theorem 4 has the same diagonal structure. Thus, to evaluate the real-time performance of a cooperative system, it is necessary to develop a procedure by which an estimate, J_o(t, 0), can be calculated over time to approximate

J_o^*(t, 0) = x^T(0) W_i^T G_i^T P_o(0) G_i W_i x(0) - x^T(t) W_i^T G_i^T P_o(t) G_i W_i x(t),

which is the actual cost of the optimal performance index incurred over the time interval [0, t]. It is obvious that, since v^* = -x is used, the estimate is an upper bound on the optimal value, that is, J_o^*(t, 0) \le J_o(t, 0).

To estimate the cost incurred, consider two consecutive intervals over which D(t) is piecewise constant:

D(t) = \begin{cases} D_i & t \in [t_i, t_{i+1}) \\ D_{i+1} & t \in [t_{i+1}, t_{i+2}) \end{cases},

where t_0 = 0. For t \in [t_i, t_{i+1}), future topology changes are not known a priori, and hence it is reasonable to compute forward the cost incurred within the interval based only on the available information. That is, it follows from (18), (19) and (17) and from v^* = -x that, for t \in [0, t_1),

J_o(t, 0) \triangleq \sum_{i=1}^n 2 p_i \int_0^t x^T W_i^T G_i^T (2P - P D_0 - D_0^T P) G_i W_i x \, dt = \sum_{i,j=1}^n p_i p_j \{[x_j(0) - x_i(0)]^2 - [x_j(t) - x_i(t)]^2\},

where P = diag\{p_1, \cdots, p_n\} is a positive definite and diagonal matrix. For t \in [t_1, t_2), system (26) under optimal control v^* = -x has the solution

x(t) = e^{-t} e^{D_1 (t - t_1)} e^{D_0 t_1} x(0) \triangleq \Phi(D_1, t - t_1, D_0, t_1) x(0).

It is straightforward to show that, for all t \ge t_1, matrix \Phi(D_1, t - t_1, D_0, t_1) is nonnegative, row-sum-1, uniformly bounded, and invertible. It follows from the Baker-Campbell-Hausdorff formula [29] that, for any fixed t,

\Phi(D_1, t - t_1, D_0, t_1) = \Phi(\bar{D}_1, t, 0, 0),

where \bar{D}_1(t) is the "equivalent" matrix defined by

\bar{D}_1 t = D_0 t_1 + D_1 (t - t_1) + \frac{1}{2} \mathrm{ad}_{D_0 t_1}(D_1 (t - t_1)) + \frac{1}{12} \{\mathrm{ad}_{D_0 t_1}(\mathrm{ad}_{D_0 t_1} D_1 (t - t_1)) + \mathrm{ad}_{D_1 (t - t_1)}(\mathrm{ad}_{D_0 t_1} D_1 (t - t_1))\} + \cdots,

in which \mathrm{ad}_A B = AB - BA is the Lie bracket. Since both D_0 and D_1 are row stochastic, [\mathrm{ad}_{D_0 t_1}(D_1 (t - t_1))] \mathbf{1} = 0, and the same holds for all other products of Lie brackets applied to \mathbf{1} in the above formula. Hence, matrix \bar{D}_1 is row stochastic. Mild conditions under which the equivalent matrix \bar{D}_1 is also nonnegative can be found in lemma 5.14 in [25]. Recall that, for any constant diagonal matrix P and any row-sum-1 matrix D, expression (19) can be derived in terms of (18) and (17). Thus, it follows from (19) and from v^* = -x that the estimated cost incurred up to time t with t \in [t_1, t_2) is

J_o(t, 0) \triangleq \sum_{i=1}^n 2 p_i \int_0^t x^T W_i^T G_i^T (2P - P \bar{D}_1 - \bar{D}_1^T P) G_i W_i x \, dt = \sum_{i,j=1}^n p_i p_j \{[x_j(0) - x_i(0)]^2 - [x_j(t) - x_i(t)]^2\}.

By induction, the estimated cost incurred up to time t with t \in [t_k, t_{k+1}) is

J_o(t, 0) \triangleq \sum_{i=1}^n 2 p_i \int_0^t x^T W_i^T G_i^T (2P - P \bar{D}_k - \bar{D}_k^T P) G_i W_i x \, dt = \sum_{i,j=1}^n p_i p_j \{[x_j(0) - x_i(0)]^2 - [x_j(t) - x_i(t)]^2\},   (28)

where the equivalent matrix \bar{D}_k(t) is defined recursively by

\bar{D}_k t = \bar{D}_{k-1} t_k + D_k (t - t_k) + \frac{1}{2} \mathrm{ad}_{\bar{D}_{k-1} t_k}(D_k (t - t_k)) + \frac{1}{12} \{\mathrm{ad}_{\bar{D}_{k-1} t_k}(\mathrm{ad}_{\bar{D}_{k-1} t_k} D_k (t - t_k)) + \mathrm{ad}_{D_k (t - t_k)}(\mathrm{ad}_{\bar{D}_{k-1} t_k} D_k (t - t_k))\} + \cdots.

It is worth noting that, through the corresponding solution P_o(t), optimal performance index (27) requires knowledge of all topology changes. In comparison, its estimate (28) employs a constant matrix P but adjusts the performance function as each new topology change becomes known.
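The two computations above are easy to sketch numerically: the truncated BCH series for the equivalent matrix stays row-stochastic because every Lie-bracket term annihilates 1, and the pairwise-difference form of (28) is a direct double sum. The matrices, weights, times, and state snapshots below are illustrative assumptions (and the BCH series is truncated at third order):

```python
import numpy as np

def ad(A, B):
    """Lie bracket ad_A(B) = AB - BA."""
    return A @ B - B @ A

def equivalent_matrix(D0, t1, D1, t):
    """Third-order truncation of the BCH series defining Dbar_1:
    Dbar_1 t = D0 t1 + X + (1/2) ad_{D0 t1}(X)
             + (1/12){ad_{D0 t1}(ad_{D0 t1} X) + ad_X(ad_{D0 t1} X)} + ...
    with X = D1 (t - t1); higher-order terms are dropped here."""
    A, X = D0 * t1, D1 * (t - t1)
    Z = A + X + 0.5 * ad(A, X) + (ad(A, ad(A, X)) + ad(X, ad(A, X))) / 12.0
    return Z / t

def cost_estimate(p, x0, xt):
    """Pairwise-difference form of estimate (28):
    sum_{i,j} p_i p_j {[x_j(0)-x_i(0)]^2 - [x_j(t)-x_i(t)]^2}."""
    p = np.asarray(p, float)
    d0 = np.subtract.outer(x0, x0) ** 2
    dT = np.subtract.outer(xt, xt) ** 2
    return float(np.sum(np.outer(p, p) * (d0 - dT)))

D0 = np.array([[0.5, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.5, 0.5]])
D1 = np.array([[1.0, 0.0, 0.0], [0.5, 0.0, 0.5], [0.0, 0.0, 1.0]])
Dbar = equivalent_matrix(D0, t1=1.0, D1=D1, t=1.5)
# Every bracket term maps 1 to 0, so the truncated Dbar is still row-sum-1.
assert np.allclose(Dbar.sum(axis=1), 1.0)

# The estimate is positive here since all pairwise spreads have contracted.
J = cost_estimate([1.0, 0.5, 0.25],
                  x0=np.array([0.0, 4.0, 8.0]),
                  xt=np.array([2.0, 3.0, 4.0]))
assert J > 0.0
```

Note the row-sum check holds exactly at any truncation order, mirroring the argument in the text that the brackets annihilate 1; nonnegativity of the truncated matrix is not guaranteed and is exactly what lemma 5.14 in [25] addresses.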

IV. CONCLUSION

In this paper, the performance of cooperative and networked control systems is analyzed. In its canonical form, a cooperative control consists of network feedback and individual feedback terms. It is shown that the individual control term corresponds to the optimal solution of a linear quadratic


control problem and that, once network connectivity matrix S is given, both control terms are inversely optimal as well with respect to a meaningful performance index, and the optimal value of network feedback matrix D can be found using a linear programming algorithm.

In a cooperative system, the network control term is a positive feedback control as it represents information sharing. Using the game-theoretic approach, it is shown that the positive-feedback network control is not adversarial. This study of inverse optimality can be further exploited to establish an in-depth relationship between cooperative control and dynamic games, which could make it possible to develop distributed algorithms for zero-sum games [9] as well as for cooperative teaming strategies [35], [36], [3], [19] in the presence of adversaries.

For a cooperative system of varying topologies, inverse optimality is also established in terms of a quadratic performance index. In this general case, the optimal cost incurred up to any given instant of time cannot be computed in real time since future changes of network topology are not predictable and since its value is the solution of a two-point boundary value problem. To evaluate the real-time performance of the cooperative system, a procedure is presented to estimate an upper bound on the optimal cost.

REFERENCES

[1] A. E. Bryson and Y. C. Ho, Applied Optimal Control. Bristol, PA: Hemisphere Publishing Corp., 1975.

[2] J. Cortes, S. Martinez, and F. Bullo, "Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions," IEEE Transactions on Automatic Control, vol. 51, pp. 1289-1298, 2006.

[3] J. B. Cruz, M. Simaan, A. Gacic, and Y. Liu, "Moving horizon game theoretic approaches for control strategies in a military operation," in Proceedings of the 40th IEEE Conference on Decision and Control, Orlando, Florida, December 2001, pp. 628-633.

[4] J. A. Fax and R. M. Murray, "Information flow and cooperative control of vehicle formations," IEEE Transactions on Automatic Control, vol. 49, pp. 1465-1476, 2004.

[5] R. A. Freeman and P. V. Kokotovic, "Inverse optimality in robust stabilization," SIAM Journal on Control and Optimization, vol. 34, pp. 1365-1391, 1996.

[6] ——, Robust Nonlinear Control Design: State-Space and Lyapunov Techniques. Boston, MA: Birkhauser, 1996.

[7] F. R. Gantmacher, The Theory of Matrices, vol. II. New York, NY: Chelsea, 1959.

[8] D. H. Jacobson, Extensions of Linear Quadratic Control, Optimization and Matrix Theory. New York, NY: Academic Press, 1977.

[9] ——, "On the values and strategies for infinite-time linear quadratic games," IEEE Transactions on Automatic Control, pp. 490-491, 1977.

[10] A. Jadbabaie, J. Lin, and A. Morse, "Coordination of groups of mobile autonomous agents using nearest neighbor rules," IEEE Transactions on Automatic Control, vol. 48, pp. 988-1001, 2003.

[11] R. E. Kalman, "When is a linear control system optimal?" Transactions of the ASME, Journal of Basic Engineering, vol. 86, pp. 1-10, 1964.

[12] H. K. Khalil, Nonlinear Systems, 3rd ed. Upper Saddle River, NJ: Prentice Hall, 2003.

[13] M. Krstic, I. Kanellakopoulos, and P. V. Kokotovic, Nonlinear and Adaptive Control Design. New York: Wiley, 1995.

[14] M. Krstic and Z. Li, "Inverse optimal design of input-to-state stabilizing nonlinear systems," IEEE Transactions on Automatic Control, vol. 43, pp. 336-350, 1998.

[15] F. Lewis and V. L. Syrmos, Optimal Control. John Wiley and Sons, 1995.

[16] Z. Lin, M. Brouchke, and B. Francis, "Local control strategies for groups of mobile autonomous agents," IEEE Transactions on Automatic Control, vol. 49, pp. 622-629, 2004.

[17] Z. Lin, B. Francis, and M. Maggiore, "Necessary and sufficient graphical conditions for formation control of unicycles," IEEE Transactions on Automatic Control, vol. 50, pp. 121-127, 2005.

[18] ——, "State agreement for continuous-time coupled nonlinear systems," SIAM Journal on Control and Optimization, vol. 46, pp. 288-307, 2007.

[19] Y. Liu, M. Simaan, and J. B. Cruz, "Game theoretic approach to cooperative teaming and tasking in the presence of an adversary," in Proceedings of the American Control Conference, Denver, Colorado, June 2003, pp. 5375-5380.

[20] H. Minc, Nonnegative Matrices. New York: John Wiley & Sons, 1988.

[21] L. Moreau, "Stability of multiagent systems with time-dependent communication links," IEEE Transactions on Automatic Control, vol. 50, pp. 169-182, 2005.

[22] O. Perron, "Zur Theorie der Matrizen," Mathematische Annalen, vol. 64, pp. 248-263, 1907.

[23] Z. Qu, Robust Control of Nonlinear Uncertain Systems. New York: Wiley Interscience, 1998.

[24] ——, "A comparison theorem for cooperative control of nonlinear systems," in Proceedings of the American Control Conference, Seattle, Washington, June 2008.

[25] ——, Cooperative Control of Dynamical Systems. London: Springer-Verlag, 2009.

[26] Z. Qu, J. Wang, and J. Chunyu, "Lyapunov design of cooperative control and its application to the consensus problem," in Proceedings of the IEEE Multi-conference on Systems and Control, Singapore, October 2007.

[27] Z. Qu, J. Wang, and R. A. Hull, "Products of row stochastic matrices and their applications to cooperative control for autonomous mobile robots," in Proceedings of the American Control Conference, Portland, Oregon, June 2005.

[28] ——, "Cooperative control of dynamical systems with application to autonomous vehicles," IEEE Transactions on Automatic Control, vol. 53, pp. 894-911, 2008.

[29] M. W. Reinsch, "A simple expression for the terms in the Baker-Campbell-Hausdorff series," Journal of Mathematical Physics, vol. 41, pp. 2434-2442, 2000.

[30] W. Ren and R. W. Beard, "Consensus seeking in multiagent systems under dynamically changing interaction topologies," IEEE Transactions on Automatic Control, vol. 50, pp. 655-661, 2005.

[31] C. W. Reynolds, "Flocks, herds, and schools: a distributed behavioral model," Computer Graphics (ACM SIGGRAPH '87 Conference Proceedings), vol. 21, pp. 25-34, 1987.

[32] R. O. Saber and R. M. Murray, "Consensus problems in networks of agents with switching topology and time-delays," IEEE Transactions on Automatic Control, vol. 49, pp. 1520-1533, 2004.

[33] D. D. Siljak, "Connective stability of competitive equilibrium," Automatica, vol. 11, pp. 389-400, 1975.

[34] ——, Decentralized Control of Complex Systems. Academic Press, 1991.

[35] M. Simaan and J. B. Cruz, "On the Stackelberg strategy in nonzero-sum games," Journal of Optimization Theory and Applications, vol. 11, pp. 533-555, 1973.

[36] ——, "A Stackelberg solution for games with many players," IEEE Transactions on Automatic Control, vol. 18, pp. 322-324, 1973.

[37] E. D. Sontag, "A universal construction of Artstein's theorem on nonlinear stabilization," Systems & Control Letters, vol. 13, pp. 117-123, 1989.

[38] H. G. Tanner, A. Jadbabaie, and G. J. Pappas, "Stable flocking of mobile agents, part I: fixed topology," in Proceedings of the IEEE Conference on Decision and Control, Maui, Hawaii, 2003, pp. 2010-2015.

[39] ——, "Stable flocking of mobile agents, part II: dynamic topology," in Proceedings of the IEEE Conference on Decision and Control, Maui, Hawaii, 2003, pp. 2016-2021.

[40] T. Vicsek, A. Czirok, E. B. Jacob, I. Cohen, and O. Shochet, "Novel type of phase transition in a system of self-driven particles," Physical Review Letters, vol. 75, pp. 1226-1229, 1995.


