Date post: 01-Oct-2020
Optimal Synchronization and Games on Graphs
Page 1: Optimal Synchronization and Games on Graphs talks/2013 optimal coop ctrl- local and glob… · Directed Tree-Chain of command Directed Ring-Gossip network OSCILLATIONS. Example: Unbounded

Optimal Synchronization and Games on Graphs

Page 2

UTA Research Institute (UTARI), The University of Texas at Arlington

F.L. Lewis, Moncrief-O’Donnell Endowed Chair

Head, Controls & Sensors Group

Optimal Design for Synchronization & Games on Communication Graphs

Supported by AFOSR, NSF, ARO. Thanks to Jie Huang

Page 3

UTA Research Institute (UTARI), The University of Texas at Arlington

F.L. Lewis

http://ARRI.uta.edu/acs

Cooperative Control Synchronization: Optimal Design and Games on Communication Graphs

Supported by: NSF (Paul Werbos), ARO, AFOSR, NNSF of China, China Project 111 at NEU

Page 4

Invited by Derong Liu

Page 5

Thanks to Cesare Alippi and Zhang Huaguang

Page 6
Page 7

He who exerts his mind to the utmost knows nature’s pattern.

The way of learning is none other than finding the lost mind.

Meng Tz, 500 BC

Man’s task is to understand patterns in nature and society.

Mencius

Page 8

Kung Tz, 500 BC

Confucius (孔子)

Archery, Chariot driving

Music, Rites and Rituals

Poetry, Mathematics

Man’s relations to: Family, Friends, Society, Nation, Emperor, Ancestors

Page 9

Outline

Optimal Design for Synchronization of Cooperative Systems

Distributed Observer and Dynamic Regulator

Discrete-time Optimal Design for Synchronization

Graphical Games

Control Design Methods for Multi‐Agent Systems

Acks. to:
Guanrong Chen – Pinning control
Lihua Xie – Local nbhd. tracking error
Zhihua Qu – Lyapunov eq. for di-graphs

Page 10

Books Coming

F.L. Lewis, H. Zhang, A. Das, K. Hengster-Movric, Cooperative Control of Multi-Agent Systems: Optimal Design and Adaptive Control, Springer-Verlag, 2013, to appear.

Key Point

Lyapunov Functions and Performance Indices Must depend on graph topology

Hongwei Zhang, F.L. Lewis, and Abhijit Das, “Optimal design for synchronization of cooperative systems: state feedback, observer and output feedback,” IEEE Trans. Automatic Control, vol. 56, no. 8, pp. 1948-1952, August 2011.

Page 11

Outline
Cooperative Control
Locally Optimal Design and Synchronization
Globally Optimal Design for Collective Motion
Multi‐player Games on Communication Graphs
Reinforcement Learning for Game Solutions


Page 13

Structure of Natural and Manmade Systems

Local nature of Physical Laws; Peer-to-Peer Relationships in networked systems

[Figures: the Internet; an ecosystem; a professional collaboration network; the Barcelona rail network; clusters of galaxies. Source: J.J. Finnigan, Complex science for a complex world]

Page 14

Synchronized Motion of Biological Groups

Fish school

Birds flock

Locusts swarm

Fireflies synchronize

Page 15

The Power of Synchronization: Coupled Oscillators, Diurnal Rhythm

Page 16

Outline

A. Stable Design for Synchronization of Cooperative Systems

B. Global Optimal Design for Collective Group Motion

Stability vs. Optimality of Cooperative Control

Issues: For cooperative control on graphs -

Local stability of each agent is NOT the same as stable synchronization of the team

Local optimality of each agent is NOT the same as global optimality of the team

Page 17
Page 18

Communication Graph

[Figure: directed graph on nodes 1 to 6, showing a spanning tree with its root (leader) node and the followers]

Diameter = length of the longest shortest path between any two nodes

Volume = sum of in-degrees: $\mathrm{Vol} = \sum_{i=1}^{N} d_i$

Strongly connected if for all nodes i and j there is a path from i to j.

Tree: every node except the root (leader) node has in-degree = 1; the other nodes are followers.

Page 19

Communication Graph $G = (V, E)$ with $N$ nodes

[Figure: directed graph on nodes 1 to 6; the edge from node 2 to node 4 is labeled $a_{42}$]

Adjacency matrix $A = [a_{ij}]$, with $a_{ij} > 0$ if $(v_j, v_i) \in E$, i.e. if $j \in N_i$

$$A = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 \end{bmatrix}$$

$N_i$ = in-neighbors of node $i$. Row sum = in-degree: $d_i = \sum_{j=1}^{N} a_{ij}$

$N_i^o$ = out-neighbors of node $i$. Col sum = out-degree: $d_i^o = \sum_{j=1}^{N} a_{ji}$
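As a quick check of the row-sum and column-sum conventions, the following minimal Python sketch computes the in-degrees, out-degrees and graph volume for the 6-node adjacency matrix on this slide. The helper names `in_degrees` and `out_degrees` are illustrative and do not come from the talk.

```python
# Adjacency matrix from the slide: A[i][j] = a_ij is nonzero when there is an
# edge from node j+1 to node i+1 (0-based indices in code, 1-based on the slide).
A = [
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
]


def in_degrees(adj):
    """Row sums: d_i = sum_j a_ij."""
    return [sum(row) for row in adj]


def out_degrees(adj):
    """Column sums: d_i^o = sum_j a_ji."""
    n = len(adj)
    return [sum(adj[j][i] for j in range(n)) for i in range(n)]


d_in = in_degrees(A)
d_out = out_degrees(A)
volume = sum(d_in)  # Vol = sum of in-degrees

print("in-degrees :", d_in)
print("out-degrees:", d_out)
print("volume     :", volume)
```

The entry `A[3][1]` is the weight $a_{42}$ highlighted in the figure, that is, the edge from node 2 into node 4.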

Page 20

Dynamic Graph: the Distributed Structure of Control

Each node has an associated state: $\dot{x}_i = u_i$

Standard local voting protocol: $u_i = \sum_{j \in N_i} a_{ij}(x_j - x_i)$

$$u_i = -x_i \sum_{j \in N_i} a_{ij} + \sum_{j \in N_i} a_{ij} x_j = -d_i x_i + \begin{bmatrix} a_{i1} & \cdots & a_{iN} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}$$

With $x = [x_1 \ \cdots \ x_N]^T$, $u = [u_1 \ \cdots \ u_N]^T$, $D = \mathrm{diag}\{d_1, \ldots, d_N\}$ and $A = [a_{ij}]$:

$u = -Dx + Ax = -(D - A)x = -Lx$, where $L = D - A$ = graph Laplacian matrix

Closed-loop dynamics: $\dot{x} = -Lx$

If each $x_i$ is an n-vector then $\dot{x} = -(L \otimes I_n)x$
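To see the closed-loop dynamics $\dot{x} = -Lx$ in action, here is a minimal sketch that builds $L = D - A$ for the 6-node graph of the previous slide and integrates the local voting protocol with a forward-Euler step. The initial condition, step size and horizon are arbitrary choices made for illustration, not values from the talk.

```python
import numpy as np

# Adjacency matrix of the 6-node example graph (row i lists the in-neighbors of node i+1).
A_adj = np.array([
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
], dtype=float)

D = np.diag(A_adj.sum(axis=1))  # in-degree matrix
L = D - A_adj                   # graph Laplacian


def simulate_consensus(L, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of x' = -L x (scalar state per node)."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * (-L @ x)
    return x


x0 = np.array([1.0, -2.0, 3.0, 0.5, 4.0, -1.0])  # arbitrary initial states
x_final = simulate_consensus(L, x0)
spread = x_final.max() - x_final.min()

print("final states:", np.round(x_final, 6))
print("spread between nodes:", spread)
```

Because this graph is strongly connected, all six states settle on a common value, and the spread between nodes shrinks to numerical zero.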

Page 21

Communication Graph $G = (V, E)$ with $N$ nodes

State at node $i$ is $x_i(t)$

Synchronization problem: $x_i(t) - x_j(t) \to 0$

Page 22

[Figure: directed graph on nodes 1 to 6]

Laplacian matrix $L = D - A$

Theorem. The graph contains a spanning tree iff the e-val of $L$ at $\lambda_1 = 0$ is simple.

Then $\mathrm{Re}\{\lambda_2\} > 0$

Then $-L$ has one e-val at zero and all the rest stable

Then, all states synchronize using the local voting protocol

Graph strongly connected implies there exists a spanning tree

Page 23

Consensus Value and Convergence Rate

Closed-loop system with local voting protocol: $\dot{x} = -Lx$

Modal decomposition, with $v_i$ and $w_i$ the right and left e-vectors of $L$ for e-val $\lambda_i$:

$$x(t) = e^{-Lt}x(0) = \sum_{i=1}^{N} v_i e^{-\lambda_i t} w_i^T x(0) = \sum_{i=1}^{N} \big(w_i^T x(0)\big) e^{-\lambda_i t} v_i$$

Let $\lambda_1 = 0$ be simple. Then for large $t$

$$x(t) \to v_2 e^{-\lambda_2 t} w_2^T x(0) + v_1 e^{-\lambda_1 t} w_1^T x(0) = v_2 e^{-\lambda_2 t} w_2^T x(0) + \underline{1}\, w_1^T x(0)$$

$\lambda_2$ determines the rate of convergence and is called the FIEDLER e-value

There is a big push to find expressions for the left e-vector $w_1$ for $\lambda_1 = 0$ and the Fiedler e-val $\lambda_2$

Let graph have a spanning tree. Then all nodes reach consensus.
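The consensus value and the convergence rate can both be read off the spectrum of $L$. The sketch below, written for the 6-node example graph, computes the Fiedler e-value and the left e-vector $w_1$ for $\lambda_1 = 0$, normalised so that $w_1^T \underline{1} = 1$. The initial condition `x0` is an arbitrary illustrative choice.

```python
import numpy as np

A_adj = np.array([
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
L = np.diag(A_adj.sum(axis=1)) - A_adj

# Eigenvalues of L sorted by real part: lam[0] is the zero e-val, lam[1] the Fiedler e-val.
eigvals = np.linalg.eigvals(L)
lam = eigvals[np.argsort(eigvals.real)]
fiedler = lam[1]

# Left e-vector for the zero e-val: w1^T L = 0, i.e. L^T w1 = 0.
vals_T, vecs_T = np.linalg.eig(L.T)
k = np.argmin(np.abs(vals_T))
w1 = np.real(vecs_T[:, k])
w1 = w1 / w1.sum()  # normalise so that w1^T 1 = 1

x0 = np.array([1.0, -2.0, 3.0, 0.5, 4.0, -1.0])  # arbitrary initial states
consensus_value = w1 @ x0

print("Fiedler e-value:", fiedler)
print("left e-vector w1:", np.round(w1, 4))
print("predicted consensus value:", consensus_value)
```

For this graph the normalised left e-vector works out to $[5, 2, 3, 1, 1, 1]/13$, so node 1 carries the most weight in the consensus value.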

Page 24

Convergence Value and Rate

Closed‐loop system with local voting protocol: $\dot{x} = -Lx$

$L$ has an e‐val at zero; $\lambda_1 = 0$ is simple if the graph is strongly connected

Modal decomposition:

$$x(t) = e^{-Lt}x(0) = \sum_{i=1}^{N} v_i e^{-\lambda_i t} w_i^T x(0) = \sum_{i=1}^{N} \big(w_i^T x(0)\big) e^{-\lambda_i t} v_i$$

Let $\lambda_1 = 0$ be simple. Then for large $t$

$$x(t) \to v_2 e^{-\lambda_2 t} w_2^T x(0) + v_1 e^{-\lambda_1 t} w_1^T x(0) = v_2 e^{-\lambda_2 t} w_2^T x(0) + \underline{1}\, w_1^T x(0)$$

$\lambda_2$ determines the rate of convergence (Fiedler e‐value)

$w_1$ (the left e-vector for $\lambda_1 = 0$) determines the consensus value in terms of the initial conditions

Depends on Communication Graph Topology. No freedom to determine the consensus value

We call this the Cooperative Regulator Problem

Page 25

Graph Eigenvalues for Different Communication Topologies

[Figure: two 6-node graphs with their eigenvalue plots. Directed tree: chain of command. Directed ring: gossip network, OSCILLATIONS]

Page 26

Graph Eigenvalues for Different Communication Topologies

[Figure: two 6-node graphs with their eigenvalue plots. Directed graph: better conditioned. Undirected graph: more ill-conditioned]

Page 27

Synchronization on Good Graphs

Chris Elliott fast video

[Figure: 6-node example graphs; mesh graph with 4 neighbors]

Page 28

Synchronization on Gossip Rings

Chris Elliott weird video

[Figure: 6-node ring; ring graph or cycle with 10 nodes]

Page 29

Pinning Control of Graphs

These beautiful pictures are from a lecture by Ron Chen, City U. Hong Kong

Natural and biological structures

Page 30

Locally Optimal Design and Synchronization

Page 31

Controlled Consensus: Cooperative Tracker

Node state: $\dot{x}_i = u_i$

Distributed local voting protocol with control node $v$:

$$u_i = \sum_{j \in N_i} a_{ij}(x_j - x_i) + b_i(v - x_i)$$

$$u_i = -\sum_{j \in N_i} a_{ij}\,x_i + \sum_{j \in N_i} a_{ij}\,x_j + b_i(v - x_i)$$

$b_i > 0$ if control $v$ is in the neighborhood of node $i$; $B = \mathrm{diag}\{b_i\}$

Closed-loop dynamics: $\dot{x} = -(L + B)x + B\underline{1}v$

Theorem. Let graph have a spanning tree and $b_i > 0$ for at least one root node. Then $L+B$ is nonsingular with all e-vals positive and $-(L+B)$ is asymptotically stable

Ron Chen – pinning control

Local Neighborhood Tracking Error
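The pinning theorem above can be checked numerically on the 6-node example graph. In this minimal sketch the control node is pinned to node 1 only, with gain $b_1 = 1$ and target $v = 2$; both values, like the initial condition and step size, are assumptions made for illustration.

```python
import numpy as np

A_adj = np.array([
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
L = np.diag(A_adj.sum(axis=1)) - A_adj

# Pinning gains: only node 1 sees the control node (hypothetical choice).
b = np.zeros(6)
b[0] = 1.0
B = np.diag(b)

# All e-vals of L + B should have positive real part.
eig_LB = np.linalg.eigvals(L + B)
min_real = eig_LB.real.min()

# Forward-Euler simulation of x' = -(L + B) x + B 1 v.
v = 2.0
x = np.array([1.0, -2.0, 3.0, 0.5, 4.0, -1.0])
dt = 0.01
for _ in range(10000):
    x = x + dt * (-(L + B) @ x + b * v)

print("smallest real part of eig(L+B):", min_real)
print("final states:", np.round(x, 6))
```

Unlike the cooperative regulator, the final value here is set by the control node $v$ rather than by the initial conditions.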

Page 32

Agent Dynamics and Local Feedback Design

$\dot{x}_i = Ax_i + Bu_i$, $u_i = Kx_i$

$$A = \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad K = \begin{bmatrix} 0.5 & 0.5 \end{bmatrix}$$

Couple 6 agents with communication graph

[Figure: directed graph on nodes 1 to 6]

Local neighborhood tracking error:

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)$$

$u_i = K\varepsilon_i$, with $x_0$ the c.g. leader

Nodes synchronize to consensus heading

[Figure: agent trajectories in the x-y plane converging to a common heading]

Page 33

Agent Dynamics and Local Feedback Design

$\dot{x}_i = Ax_i + Bu_i$, $u_i = Kx_i$

$$A = \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad K = \begin{bmatrix} 0.5 & 0.5 \end{bmatrix}$$

ADD another comm. link: more information flow

[Figure: the 6-node communication graph with one added edge]

Local neighborhood tracking error:

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)$$

$u_i = K\varepsilon_i$

Causes Unstable Formation! WHY?

[Figure: diverging agent trajectories in the x-y plane]

Page 34

We want Design Freedom that overcomes graph topology constraints

Decouple Control Design from Graph Topology constraints

Guaranteed synchronization for general Directed graphs

Hongwei Zhang, F.L. Lewis, and Abhijit Das, “Optimal design for synchronization of cooperative systems: state feedback, observer and output feedback,” IEEE Trans. Automatic Control, vol. 56, no. 8, pp. 1948-1952, August 2011.

Guaranteed stability for continuous-time multi-agent systems on graphs -

A. STABLE DESIGN FOR COOPERATIVE CONTROL ON GRAPHS

Page 35

A. State Feedback Design for Cooperative Systems on Graphs

Cooperative Regulator vs. Cooperative Tracker problem

N nodes with dynamics: $\dot{x}_i = Ax_i + Bu_i$, $x_i \in \mathbb{R}^n$, $u_i \in \mathbb{R}^m$

Control node or Command generator (Exosystem): $\dot{x}_0 = Ax_0$

Synchronization Tracker design problem: $x_i(t) \to x_0(t), \ \forall i$

Local neighborhood tracking error (Ron Chen: pinning control; Lihua Xie: error):

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)$$

Overall error vector, with $L = D - A$:

$$e = -\big((L+G) \otimes I_n\big)(x - \bar{x}_0) = -\big((L+G) \otimes I_n\big)\delta$$

where $e = [\varepsilon_1^T \ \varepsilon_2^T \ \cdots \ \varepsilon_N^T]^T \in \mathbb{R}^{nN}$, $\bar{x}_0 = \underline{I}x_0 \in \mathbb{R}^{nN}$, $\underline{I} = \underline{1} \otimes I_n \in \mathbb{R}^{nN \times n}$

Consensus or synchronization error: $\delta = x - \bar{x}_0 \in \mathbb{R}^{nN}$

$e$ = Local quantity

$\delta$ = Global quantity

[Figure: nodes $i$ and $j$ of the graph, with the leader state $x_0(t)$]

Page 36

Local Neighborhood Tracking Error

$$e = -\big((L+G) \otimes I_n\big)(x - \bar{x}_0) = -\big((L+G) \otimes I_n\big)\delta$$

$e$ is a local quantity; $\delta$ is a global quantity

Local control objectives imply global performance

Page 37

Coop. nbhd SVFB:

$$u_i = cK\varepsilon_i = cK\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big]$$

Closed loop system:

$$\dot{x}_i = Ax_i + Bu_i = Ax_i + cBK\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big]$$

Overall c.l. dynamics, with overall state $x = [x_1^T \ x_2^T \ \cdots \ x_N^T]^T \in \mathbb{R}^{nN}$ and $\bar{x}_0 = \underline{I}x_0$:

$$\dot{x} = \big(I_N \otimes A - c(L+G) \otimes BK\big)x + \big(c(L+G) \otimes BK\big)\bar{x}_0$$

Global synch. error dynamics (Fax and Murray 2004):

$$\dot{\delta} = \big(I_N \otimes A - c(L+G) \otimes BK\big)\delta$$

In the term $c(L+G) \otimes BK$, the factor $(L+G)$ is the graph structure and $BK$ is the control structure

MIXES UP CONTROL DESIGN AND GRAPH STRUCTURE

Distributed form of control: $u = -c\big((L+G) \otimes K\big)\delta$
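The way graph structure and control structure mix in the global error dynamics can be made concrete with a Kronecker-product sketch. It assembles $I_N \otimes A - c(L+G) \otimes BK$ for the 6-node example graph and compares its e-vals with those of the small matrices $A - c\lambda_i BK$, one per e-val $\lambda_i$ of $L+G$. The assumptions here are mine: the minus signs in $A$ (the transcript dropped them), pinning on node 1 only, $c = 3$, and the gain $K = [1.544 \ \ 1.8901]$ quoted later in the talk.

```python
import numpy as np

A_adj = np.array([
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
L = np.diag(A_adj.sum(axis=1)) - A_adj
G = np.diag([1.0, 0, 0, 0, 0, 0])     # leader pinned to node 1 (assumed)

A = np.array([[-2.0, -1.0], [2.0, 1.0]])   # agent dynamics (signs assumed)
B = np.array([[1.0], [0.0]])
K = np.array([[1.544, 1.8901]])            # gain quoted on the example slide
c = 3.0                                    # coupling gain (assumed)
N = 6

# Global synchronization error dynamics: delta' = Ac delta.
Ac = np.kron(np.eye(N), A) - c * np.kron(L + G, B @ K)
eig_global = np.linalg.eigvals(Ac)

# The same spectrum, obtained block by block from the graph e-vals.
graph_eigs = np.linalg.eigvals(L + G)
eig_blocks = np.concatenate(
    [np.linalg.eigvals(A - c * lam * (B @ K)) for lam in graph_eigs]
)

# Largest distance from a global e-val to the nearest block e-val.
max_mismatch = max(np.min(np.abs(eig_blocks - mu)) for mu in eig_global)

print("max mismatch between the two spectra:", max_mismatch)
print("largest real part of eig(Ac):", eig_global.real.max())
```

The block-by-block view shows why stability of the team depends on where the scaled graph e-vals $c\lambda_i$ fall, not only on the local gain $K$.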

Page 38

The key to global stability and synchronization of the collective

is

Locally optimal design for each agent

Page 39

LOCAL OPTIMAL DESIGN Guarantees Global Synchronization

OPTIMAL Design at Each node: the local optimal gain $K$ minimizes

$$J_i = \tfrac{1}{2}\int_0^\infty \big(x_i^T Q x_i + u_i^T R u_i\big)\,dt$$

and is used in the cooperative control

$$u_i = cK\varepsilon_i = cK\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big]$$

DECOUPLES CONTROL DESIGN FROM COMMUNICATION GRAPH STRUCTURE

Lewis and Syrmos, 1995

Optimal Control, 3rd ed., Lewis, Vrabie, Syrmos, 2012

S. Tuna, “LQR-based coupling gain for synchronization of linear systems,” Arxiv preprint arXiv:0801.3390, 2008.

Hongwei Zhang, F.L. Lewis, and Abhijit Das, “Optimal design for synchronization of cooperative systems: state feedback, observer and output feedback,” IEEE Trans. Automatic Control, vol. 56, no. 8, pp. 1948-1952, August 2011.

Page 40

OPTIMAL Design at Each node

OPTIMAL Design at each node gives global guaranteed performance on any strongly connected communication graph

Optimal Control, 3rd ed., Lewis, Vrabie, Syrmos, 2012

Emre Tuna 2008 paper online

Li, Duan, Chen: Finsler’s Lemma

Page 41

Graph Eigenvalues for Different Communication Topologies

[Figure: two 6-node graphs with their eigenvalue plots. Directed tree: chain of command. Directed ring: gossip network, OSCILLATIONS]

Page 42

Example: Unbounded Region of Consensus for Optimal Feedback Gains.

Example from [Li, Duan, Chen 2009]

$$A = \begin{bmatrix} -2 & -1 \\ 2 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

Stability of $A - c\lambda BK$, with $\lambda$ = e-vals of $(L+G)$

a. Bounded Consensus Region for Arbitrarily Chosen Stabilizing SVFB Gain: $K = [0.5 \ \ 0.5]$

b. Unbounded Consensus Region for Optimal SVFB Gain: $Q = I$, $R = 1$, $K = [1.544 \ \ 1.8901]$

[Figure: the two consensus regions plotted in the complex plane, real part against imaginary part]
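The optimal gain quoted in this example can be reproduced with a few lines of NumPy. The sketch below solves the algebraic Riccati equation through the stable invariant subspace of the Hamiltonian matrix (a standard textbook method, used here so that only NumPy is needed; the talk does not say which solver was used). The minus signs in $A$ are my reading, since the transcript lost them; with that reading the slide's gain is recovered.

```python
import numpy as np


def solve_care(A, B, Q, R):
    """Stabilising solution P of A'P + PA + Q - P B R^-1 B' P = 0,
    computed from the stable eigenvectors of the Hamiltonian matrix."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T], [-Q, -A.T]])
    vals, vecs = np.linalg.eig(H)
    stable = vecs[:, vals.real < 0]          # n eigenvectors with Re < 0
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))
    return 0.5 * (P + P.T)                   # symmetrise against round-off


A = np.array([[-2.0, -1.0], [2.0, 1.0]])     # signs assumed (lost in the transcript)
B = np.array([[1.0], [0.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_care(A, B, Q, R)
K = np.linalg.inv(R) @ B.T @ P               # optimal SVFB gain K = R^-1 B' P

# Sample points s = c*lambda with Re(s) >= 1/2: A - s B K should stay stable.
samples = [0.5 + 0j, 0.5 + 10j, 5.0 - 3j, 100.0 + 100j]
worst_real_part = max(
    np.linalg.eigvals(A - s * (B @ K)).real.max() for s in samples
)

print("K =", np.round(K, 4))
print("largest real part over the sampled points:", worst_real_part)
```

The sampled points illustrate the unbounded region: with the Riccati gain, $A - sBK$ remains stable for scaled graph e-vals $s$ far out in the right half-plane, as long as $\mathrm{Re}\{s\} \ge 1/2$.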

Page 43

Results:  

Local Riccati Design yields guaranteed stable synchronization 

Decouples Controls Design from Graph Properties

Page 44
Page 45

Globally Optimal Design for Collective Group Motions

Page 46
Page 47

Outline

A. Stable Design for Synchronization of Cooperative Systems

B. Optimal Design for Collective Group Motion

Stability vs. Optimality of Cooperative Control

Issues: For cooperative control on graphs -

Local stability of each agent is NOT the same as stable synchronization of the team

Local optimality of each agent is NOT the same as global optimality of the team

Have seen that LOCAL OPTIMAL DESIGN Guarantees Global Synchronization

Page 48

B. GLOBAL OPTIMAL DESIGN FOR COLLECTIVE MOTION ON GRAPHS

The method just shown guarantees synchronization on arbitrary graphs. It is a LOCAL OPTIMAL DESIGN at each agent

What about Global Optimality of cooperative control on graphs?

Problem: the global optimal control is not distributed

The global optimal control is generally distributed only on a complete graph – Wei Ren

Agent dynamics: $\dot{x}_i = Ax_i + Bu_i$, $x_i \in \mathbb{R}^n$

Global dynamics, with $\bar{x}_0 = \underline{I}x_0$ and $\delta = x - \bar{x}_0$:

$$\dot{x} = (I_N \otimes A)x + (I_N \otimes B)u = \bar{A}x + \bar{B}u, \qquad \dot{\delta} = (I_N \otimes A)\delta + (I_N \otimes B)u = \bar{A}\delta + \bar{B}u$$

LQR: $J = \tfrac{1}{2}\int_0^\infty \big(\delta^T Q \delta + u^T R u\big)\,dt$

ARE: $\bar{A}^T P + P\bar{A} + Q - P\bar{B}R^{-1}\bar{B}^T P = 0$

Control $u = -R^{-1}\bar{B}^T P\delta$ is distributed only on a complete graph - Wei Ren

BUT a distributed control must have the form $u = -c\big((L+G) \otimes K\big)\delta$

So Q and R must depend on the graph topology

Page 49

LQR case, ARE (with $\bar{A} = I_N \otimes A$, $\bar{B} = I_N \otimes B$): $\bar{A}^T P + P\bar{A} + Q - P\bar{B}R^{-1}\bar{B}^T P = 0$

Given A, B, and the distributed control form $u = -c\big((L+G) \otimes K\big)\delta$, find Q and R

Inverse Optimality

Kristian Hengster-Movric

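Page 50

System: $\dot{x}_i = Ax_i + Bu_i$, $x_i \in \mathbb{R}^n$; in global form $\dot{x} = (I_N \otimes A)x + (I_N \otimes B)u$

Leader: $\dot{x}_0 = Ax_0$, $\bar{x}_0 = \underline{I}x_0$

Local nbhd tracking error:

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i), \qquad e = -\big((L+G) \otimes I_n\big)\delta$$

Global disagreement error: $\delta = x - \bar{x}_0$, $\dot{\delta} = (I_N \otimes A)\delta + (I_N \otimes B)u$

Distributed Control:

$$u_i = cK_2\varepsilon_i = cK_2\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big], \qquad u = -c\big((L+G) \otimes K_2\big)\delta$$

Closed‐loop system:

$$\dot{x}_i = Ax_i + Bu_i = Ax_i + cBK_2\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big]$$

$$\dot{x} = \big(I_N \otimes A - c(L+G) \otimes BK_2\big)x + \big(c(L+G) \otimes BK_2\big)\bar{x}_0$$

Global synch. error dynamics: $\dot{\delta} = \big(I_N \otimes A - c(L+G) \otimes BK_2\big)\delta$, where $(L+G)$ is the graph structure and $BK_2$ the control structure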

Page 51

B.1 Optimal Cooperative Tracker for Single-Integrator Dynamics (Kristian Hengster-Movric)

System: $\dot{x}_i = u_i$, $x_i \in \mathbb{R}$; in global form $\dot{x} = u$ with $x = [x_1 \ \cdots \ x_N]^T$, $u = [u_1 \ \cdots \ u_N]^T$

Leader node: $\dot{x}_0 = 0$, $\bar{x}_0 = \underline{I}x_0$

Local nbhd tracking error:

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)$$

Global disagreement error: $\delta = x - \bar{x}_0$, with $e = [\varepsilon_1 \ \cdots \ \varepsilon_N]^T = -(L+G)\delta$ and $G = \mathrm{diag}\{g_i\}$

Control: $u_i = \varepsilon_i$, $u = -(L+G)\delta$

Closed‐loop System: $\dot{\delta} = -(L+G)\delta$

No control structure here. Focus on graph structure

Page 52

ARE: $A^T P + PA + Q - PBR^{-1}B^T P = 0$

x u

Q

Use local nbhd tracking error in the cost function!

Condition on graph topology

Kristian Hengster-Movric

Page 53

Cooperative Tracker for Identical LTI Dynamics

Cooperative Regulator vs. Cooperative Tracker problem

N nodes with dynamics: $\dot{x}_i = Ax_i + Bu_i$, $x_i \in \mathbb{R}^n$, $u_i \in \mathbb{R}^m$

Control node or Command generator (Exosystem): $\dot{x}_0 = Ax_0$

Synchronization Tracker design problem: $x_i(t) \to x_0(t), \ \forall i$

Local neighborhood tracking error (Ron Chen: pinning control; Lihua Xie: error):

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)$$

Overall error vector, with $L = D - A$:

$$e = -\big((L+G) \otimes I_n\big)(x - \bar{x}_0) = -\big((L+G) \otimes I_n\big)\delta$$

where $e = [\varepsilon_1^T \ \varepsilon_2^T \ \cdots \ \varepsilon_N^T]^T \in \mathbb{R}^{nN}$, $\bar{x}_0 = \underline{I}x_0 \in \mathbb{R}^{nN}$, $\underline{I} = \underline{1} \otimes I_n \in \mathbb{R}^{nN \times n}$

Consensus or synchronization error: $\delta = x - \bar{x}_0 \in \mathbb{R}^{nN}$

$e$ = Local quantity

$\delta$ = Global quantity

[Figure: nodes $i$ and $j$ of the graph, with the leader state $x_0(t)$]

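Page 54

Cooperative Tracker for Identical LTI Dynamics

System: $\dot{x}_i = Ax_i + Bu_i$, $x_i \in \mathbb{R}^n$; in global form $\dot{x} = (I_N \otimes A)x + (I_N \otimes B)u$

Leader: $\dot{x}_0 = Ax_0$, $\bar{x}_0 = \underline{I}x_0$

Local nbhd tracking error:

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i), \qquad e = [\varepsilon_1^T \ \cdots \ \varepsilon_N^T]^T = -\big((L+G) \otimes I_n\big)\delta$$

Global disagreement error dynamics: $\dot{\delta} = (I_N \otimes A)\delta + (I_N \otimes B)u$

Control:

$$u_i = cK_2\varepsilon_i = cK_2\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big], \qquad u = -c\big((L+G) \otimes K_2\big)\delta$$

Closed‐loop system:

$$\dot{x}_i = Ax_i + Bu_i = Ax_i + cBK_2\Big[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\Big]$$

$$\dot{x} = \big(I_N \otimes A - c(L+G) \otimes BK_2\big)x + \big(c(L+G) \otimes BK_2\big)\bar{x}_0$$

$$\dot{\delta} = \big(I_N \otimes A - c(L+G) \otimes BK_2\big)\delta$$

where $(L+G)$ is the graph structure and $BK_2$ the control structure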

Page 55

TWO CONDITIONS (Kristian Hengster-Movric)

The local optimal design from before

A new condition on the graph

$$c > \frac{\sigma_{\max}\big(R_1(L+G) \otimes (Q_2 - K_2^T R_2 K_2)\big)}{\sigma_{\min}\big((L+G)^T R_1 (L+G) \otimes K_2^T R_2 K_2\big)}$$

Q depends on Graph topology

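Page 56

Proof (Kristian Hengster-Movric):

System: $\dot{\delta} = (I_N \otimes A)\delta + (I_N \otimes B)u = \bar{A}\delta + \bar{B}u$

ARE: $\bar{A}^T P + P\bar{A} + Q - P\bar{B}R^{-1}\bar{B}^T P = 0$, that is

$$(I_N \otimes A)^T P + P(I_N \otimes A) + Q - P(I_N \otimes B)R^{-1}(I_N \otimes B)^T P = 0$$

Select $P = P_1 \otimes P_2$, $R = R_1 \otimes R_2$. The ARE becomes

$$P_1 \otimes A^T P_2 + P_1 \otimes P_2 A + Q - (P_1 \otimes P_2 B)(R_1^{-1} \otimes R_2^{-1})(P_1 \otimes B^T P_2) = 0$$

$$P_1 \otimes (A^T P_2 + P_2 A) + Q - P_1 R_1^{-1} P_1 \otimes P_2 B R_2^{-1} B^T P_2 = 0$$

2 conditions:

$$P_1 = cR_1(L+G)$$

$$A^T P_2 + P_2 A + Q_2 - P_2 B R_2^{-1} B^T P_2 = 0$$

Adding and subtracting $P_1 \otimes (Q_2 - P_2 B R_2^{-1} B^T P_2)$, the ARE reads

$$P_1 \otimes (A^T P_2 + P_2 A + Q_2 - P_2 B R_2^{-1} B^T P_2) + Q - P_1 R_1^{-1} P_1 \otimes P_2 B R_2^{-1} B^T P_2 - P_1 \otimes (Q_2 - P_2 B R_2^{-1} B^T P_2) = 0$$

With $Q_1 = c^2 (L+G)^T R_1 (L+G) = P_1 R_1^{-1} P_1$ and $K_2 = R_2^{-1} B^T P_2$, choose Q:

$$\begin{aligned}
Q &= c^2 \big((L+G) \otimes K_2\big)^T (R_1 \otimes R_2) \big((L+G) \otimes K_2\big) - cR_1(L+G) \otimes (A^T P_2 + P_2 A) \\
  &= c^2 (L+G)^T R_1 (L+G) \otimes K_2^T R_2 K_2 + cR_1(L+G) \otimes (Q_2 - P_2 B R_2^{-1} B^T P_2) \\
  &= P_1 R_1^{-1} P_1 \otimes P_2 B R_2^{-1} B^T P_2 + P_1 \otimes (Q_2 - P_2 B R_2^{-1} B^T P_2) \\
  &= Q_1 \otimes P_2 B R_2^{-1} B^T P_2 + P_1 \otimes (Q_2 - P_2 B R_2^{-1} B^T P_2)
\end{aligned}$$

Control:

$$\begin{aligned}
u &= -R^{-1}\bar{B}^T P\delta = -R^{-1}(I_N \otimes B^T)P\delta = -(R_1^{-1} \otimes R_2^{-1})(I_N \otimes B^T)(P_1 \otimes P_2)\delta \\
  &= -(R_1^{-1} P_1 \otimes R_2^{-1} B^T P_2)\delta = -c\big((L+G) \otimes K_2\big)\delta
\end{aligned}$$

Distributed !!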

Page 57

Two Conditions for global optimal design on the graph

1. Condition on graph topology:

$P_1 = cR_1(L+G)$ for some $P_1 = P_1^T > 0$, $R_1 = R_1^T > 0$

2. Local agent control design condition (same as before: local optimal control):

$A^T P_2 + P_2 A + Q_2 - P_2 B R_2^{-1} B^T P_2 = 0$ for some $P_2 = P_2^T > 0$, $R_2 = R_2^T > 0$, $Q_2 = Q_2^T > 0$

Always holds if (A,B) reachable

Locally optimal design is also globally optimal on the graph if condition 1 holds
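The two conditions can be checked numerically on a small case. The sketch below uses a hypothetical undirected path graph on 3 nodes with the leader pinned to node 1, takes $R_1 = I$ so that $P_1 = c(L+G)$ is symmetric, solves the local ARE for $P_2$, builds $Q$ from the formula in the proof, and confirms that the global ARE residual vanishes and that the optimal gain equals the distributed gain $c\,(L+G) \otimes K_2$. The graph, the coupling gain, the weights $Q_2 = I$, $R_2 = 1$ and the signs in $A$ are all assumptions of this example.

```python
import numpy as np


def solve_care(A, B, Q, R):
    """Stabilising solution of A'P + PA + Q - P B R^-1 B' P = 0 (Hamiltonian method)."""
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T], [-Q, -A.T]])
    vals, vecs = np.linalg.eig(H)
    stable = vecs[:, vals.real < 0]
    P = np.real(stable[n:, :] @ np.linalg.inv(stable[:n, :]))
    return 0.5 * (P + P.T)


# Condition 2: local optimal design for each agent.
A = np.array([[-2.0, -1.0], [2.0, 1.0]])   # signs assumed
B = np.array([[1.0], [0.0]])
Q2, R2 = np.eye(2), np.array([[1.0]])
P2 = solve_care(A, B, Q2, R2)
K2 = np.linalg.inv(R2) @ B.T @ P2

# Condition 1: graph condition on a hypothetical undirected path 1 - 2 - 3.
Adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
LG = np.diag(Adj.sum(axis=1)) - Adj + np.diag([1.0, 0.0, 0.0])  # L + G
R1 = np.eye(3)
c = 1.5 / np.linalg.eigvalsh(LG).min()     # coupling gain, chosen so that Q > 0 here
P1 = c * R1 @ LG                           # symmetric because LG is symmetric

# Global quantities.
N = 3
P, R = np.kron(P1, P2), np.kron(R1, R2)
Abar, Bbar = np.kron(np.eye(N), A), np.kron(np.eye(N), B)
Q = (c**2 * np.kron(LG.T @ R1 @ LG, K2.T @ R2 @ K2)
     - c * np.kron(R1 @ LG, A.T @ P2 + P2 @ A))

residual = Abar.T @ P + P @ Abar + Q - P @ Bbar @ np.linalg.inv(R) @ Bbar.T @ P
gain_optimal = np.linalg.inv(R) @ Bbar.T @ P   # u = -gain_optimal @ delta
gain_distributed = c * np.kron(LG, K2)         # u = -c ((L+G) kron K2) delta
q_min = np.linalg.eigvalsh(0.5 * (Q + Q.T)).min()

print("ARE residual norm:", np.linalg.norm(residual))
print("gains agree:", np.allclose(gain_optimal, gain_distributed))
print("smallest e-val of Q:", q_min)
```

The gain pattern inherits the sparsity of $L+G$, which is what makes the globally optimal control implementable with neighbor information only.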

Page 58

Condition on Graph Topology (Kristian Hengster-Movric)

$P_1 = cR_1(L+G)$ with $P_1 = P_1^T > 0$, $R_1 = R_1^T > 0$

Equivalent to $R_1(L+G) = (L+G)^T R_1$

1. Undirected Graphs: $L + G = (L+G)^T$

The condition becomes a Commutativity Requirement: $R_1(L+G) = (L+G)R_1$

Iff $R_1$, $(L+G)$ have the same eigenvectors

Case 1. $R_1 = I$. For single-integrator dynamics

$$J(\delta_0, u) = \int_0^\infty \big(\delta^T (L+G)^T (L+G)\delta + u^T u\big)\,dt = \int_0^\infty \big(e^T e + u^T u\big)\,dt$$

Case 2. Let $L + G = T\Lambda T^T$ (Jordan form). Select $R_1 = T\Theta T^T > 0$ for any $\Theta > 0$ diagonal

R depends on graph topology: ALL e-vectors

Page 59

2. Detail Balanced Graphs

$\lambda_i e_{ij} = \lambda_j e_{ji}$ for some $\lambda_1, \ldots, \lambda_N > 0$

Then $\lambda = [\lambda_1 \ \cdots \ \lambda_N]^T$ is a left eigenvector for $L$ for e-val = 0

$L = DP$, $D = \mathrm{diag}\{1/\lambda_i\} > 0$, with $P$ a symmetric graph Laplacian matrix

$$L + G = DP + G = D(P + D^{-1}G) = D\bar{P}, \qquad \bar{P} = D^{-1}(L+G) = R(L+G)$$

$P_1 = cR_1(L+G)$ with $P_1 = P_1^T > 0$, $R_1 = R_1^T > 0$

Equivalent to $R_1(L+G) = (L+G)^T R_1$

R depends on graph topology: principal left e-vector

Detail balanced implies balanced

Detail balanced implies reversibility of an associated Markov Process

Page 60

3. Directed Graphs with Simple Graph Laplacian L+G

$T^{-1}(L+G)T = \Lambda$: diagonal Jordan form (Dennis Bernstein, Matrix book)

$$T^{-1}(L+G)T = T^T (L+G)^T T^{-T}$$

$$T^{-T}T^{-1}(L+G) = (L+G)^T T^{-T}T^{-1}$$

Select $R = T^{-T}T^{-1} = R^T > 0$

$P_1 = cR_1(L+G)$ with $P_1 = P_1^T > 0$, $R_1 = R_1^T > 0$

Equivalent to $R_1(L+G) = (L+G)^T R_1$

R depends on graph topology: ALL e-vectors

A new class of digraphs: $(L+G)^T = R(L+G)R^{-1}$, $R = R^T > 0$
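The construction $R = T^{-T}T^{-1}$ can be tried on a small digraph. This sketch uses a hypothetical directed chain 1 → 2 → 3 with pinning gains chosen so that $L+G$ has distinct real e-vals (1, 2 and 3), builds $R$ from the e-vector matrix $T$, and checks that $R(L+G) = (L+G)^T R$ and that $P_1 = cR(L+G)$ is symmetric positive definite. Graph, pinning gains and $c$ are illustrative assumptions.

```python
import numpy as np

# Hypothetical directed chain 1 -> 2 -> 3 (row i lists the in-neighbors of node i+1).
Adj = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
], dtype=float)
L = np.diag(Adj.sum(axis=1)) - Adj
G = np.diag([1.0, 1.0, 2.0])      # pinning gains, chosen to give distinct e-vals
M = L + G                         # lower triangular with diagonal 1, 2, 3

# Diagonalise: T^-1 (L+G) T = Lambda, with real distinct e-vals.
vals, T = np.linalg.eig(M)
T = np.real(T)
T_inv = np.linalg.inv(T)

R = T_inv.T @ T_inv               # R = T^-T T^-1, symmetric positive definite
c = 1.0
P1 = c * R @ M                    # should be symmetric if the graph condition holds

commutator_gap = np.linalg.norm(R @ M - M.T @ R)

print("e-vals of L+G:", np.sort(vals.real))
print("|| R (L+G) - (L+G)^T R || =", commutator_gap)
print("P1 symmetric:", np.allclose(P1, P1.T))
```

Note that $R$ here is built from all e-vectors of $L+G$, in line with the remark that R depends on the graph topology.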

Page 61

Distributed Systems

Page 62

A.2 Discrete‐Time Optimal Design for Synchronization

Distributed systems: $x_i(k+1) = Ax_i(k) + Bu_i(k)$

Command generator: $x_0(k+1) = Ax_0(k)$

Local Nbhd Tracking Error:

$$\varepsilon_i = \sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)$$

Local cooperative SVFB (weighted): $u_i = c(1 + d_i + g_i)^{-1}K\varepsilon_i$

Local closed‐loop dynamics: $x_i(k+1) = Ax_i(k) + c(1 + d_i + g_i)^{-1}BK\varepsilon_i(k)$

Global disagreement error dynamics, with $\delta(k) = x(k) - \bar{x}_0(k)$:

$$\delta(k+1) = A_c\delta(k) = \big[I_N \otimes A - c(I + D + G)^{-1}(L+G) \otimes BK\big]\delta(k)$$

Weighted Graph Matrix: $\Gamma = (I + D + G)^{-1}(L+G)$

Weighted graph eigenvalues: $\Lambda_k$, $k = 1, \ldots, N$

Decouple controls design from graph topology

K. Hengster-Movric, Keyou You, F.L. Lewis, and Lihua Xie, “Synchronization of Discrete-Time Multi-agent Systems on Graphs Using Riccati Design,” Automatica, to appear.
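The weighting by $(1 + d_i + g_i)^{-1}$ keeps the weighted graph e-vals in a bounded region. The sketch below forms $\Gamma = (I + D + G)^{-1}(L+G)$ for the 6-node example graph with the leader pinned to node 1 (an assumed choice) and checks that every e-val lies strictly inside the circle of radius 1 centered at 1, which follows from Gershgorin's theorem when $L+G$ is nonsingular.

```python
import numpy as np

A_adj = np.array([
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
N = A_adj.shape[0]
D = np.diag(A_adj.sum(axis=1))
L = D - A_adj
G = np.diag([1.0, 0, 0, 0, 0, 0])   # leader pinned to node 1 (assumed)

# Weighted graph matrix and its e-vals.
Gamma = np.linalg.inv(np.eye(N) + D + G) @ (L + G)
Lam = np.linalg.eigvals(Gamma)

# Distance of each weighted graph e-val from the point 1 in the complex plane.
dist_from_one = np.abs(Lam - 1.0)

print("weighted graph e-vals:", np.round(Lam, 4))
print("largest distance from 1:", dist_from_one.max())
```

Having the e-vals confined to a known disc is what allows a single covering circle to stand in for the graph when the local gain is designed.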

Page 63

Synchronization error dynamics:

$$\delta(k+1) = A_c\delta(k) = \big[I_N \otimes A - c(I + D + G)^{-1}(L+G) \otimes BK\big]\delta(k)$$

Weighted Graph Matrix: $\Gamma = (I + D + G)^{-1}(L+G)$

Weighted graph eigenvalues: $\Lambda_k$, $k = 1, \ldots, N$

MIXES UP CONTROL DESIGN AND GRAPH STRUCTURE

Page 64

Decouple Controls Design From Graph Topology (Kristian Movric)

[Figure: complex plane with labels 1, $r$, $c_0$ and $r_0$. Covering circle of graph eigenvalues (graph props.) and synchronization region (ctrl design)]

Synchronization region contains this circle

Page 65

Single‐Input case with Real Graph Eigenvalues

Synchronization condition:

$$\frac{r_0}{c_0} < r = \Big[\sigma_{\max}\big(Q^{-1/2} A^T P B (B^T P B)^{-1} B^T P A Q^{-1/2}\big)\Big]^{-1/2}$$

If graph eigenvalues are real:

$$\frac{r_0}{c_0} = \frac{\Lambda_{\max} - \Lambda_{\min}}{\Lambda_{\max} + \Lambda_{\min}}$$

For SI systems, for proper choice of Q: $r = 1 / \prod_u |\lambda_u(A)|$, the product running over the unstable e-vals of $A$ (Mahler measure)

$\sum_i \log_2 |\lambda_i^u(A)|$ = intrinsic entropy rate = minimum data rate in a networked control system that enables stabilization of an unstable system – Baillieul and others.

$\Lambda_{\min} / \Lambda_{\max}$: Eigen‐ratio = ‘condition number’ of the communication graph

Work on log quantization: Elia & Mitter, Lihua Xie

Page 66

Single-Input case with Real Graph Eigenvalues

Mahler Measure: $M(A) = \prod_{i:\ \text{unstable}} |\lambda_i(A)|$

Guoxiang Gu, L. Marinovici, and F.L. Lewis, “Consensusability of discrete-time dynamic multi-agent systems,” IEEE Trans. Automatic Control, to appear, 2012.

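Graph Condition Number: $\kappa(L+G) = \dfrac{\lambda_{\max}}{\lambda_{\min}}$

Synchronization guaranteed if: $M(A) < \dfrac{1 + \kappa^{-1}}{1 - \kappa^{-1}}$

Topological Entropy: $h(A) = \log(M(A))$

New definition, Graph Channel Capacity: $C(L+G) = \log \dfrac{1 + \kappa^{-1}}{1 - \kappa^{-1}}$

Like to have $\kappa(G) \approx 1$

$\lambda_{\min}$ large means fast convergence

Varshney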
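The condition on this slide can be checked numerically. The following sketch computes the Mahler measure of $A$, the graph condition number, and then compares topological entropy against graph channel capacity. The function names, the use of base-2 logarithms, and the example values are assumptions made for illustration; the graph eigenvalues are taken to be real and positive, as in the single-input case above.

```python
import numpy as np

def mahler_measure(A):
    """Product of the magnitudes of the eigenvalues of A outside the unit circle."""
    mags = np.abs(np.linalg.eigvals(A))
    return float(np.prod(mags[mags > 1.0]))      # empty product is 1.0

def condition_number(lam):
    """kappa = lambda_max / lambda_min for real, positive graph eigenvalues."""
    lam = np.asarray(lam, dtype=float)
    return float(lam.max() / lam.min())

def topological_entropy(A):
    """h(A) = log M(A); base 2 is an assumption here."""
    return float(np.log2(mahler_measure(A)))

def graph_capacity(lam):
    """C = log((1 + 1/kappa) / (1 - 1/kappa)); infinite when kappa = 1."""
    kappa = condition_number(lam)
    if kappa == 1.0:
        return float("inf")
    return float(np.log2((1.0 + 1.0 / kappa) / (1.0 - 1.0 / kappa)))

def sync_guaranteed(A, lam):
    """Sufficient condition M(A) < (1 + 1/kappa)/(1 - 1/kappa), in log form."""
    return topological_entropy(A) < graph_capacity(lam)

A_example = np.diag([1.2, 0.5])                   # one unstable mode, M(A) = 1.2
well_conditioned = [1.0, 2.0, 3.0]                # kappa = 3
ill_conditioned = [0.1, 3.0]                      # kappa = 30
```

With these assumed numbers the well-conditioned graph satisfies the condition and the ill-conditioned one does not, which matches the remark that a condition number near 1 is desirable.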

Page 67

Graph Eigenvalues for Different Communication Topologies

[Figure: two six-node communication graphs]

Directed Tree: Chain of command

Directed Ring: Gossip network. OSCILLATIONS
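To see why the ring is associated with oscillations, one can compare Laplacian eigenvalues for the two topologies. In the sketch below the exact wiring of the six nodes is an assumption (the slide figure is not recoverable), and `laplacian` is a hypothetical helper.

```python
import numpy as np

def laplacian(edges, n):
    """Laplacian L = D - E of a directed graph; (j, i) means node i listens to node j."""
    E = np.zeros((n, n))
    for j, i in edges:
        E[i - 1, j - 1] = 1.0
    return np.diag(E.sum(axis=1)) - E

tree = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6)]            # assumed wiring
ring = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]    # assumed wiring

eig_tree = np.linalg.eigvals(laplacian(tree, 6))
eig_ring = np.linalg.eigvals(laplacian(ring, 6))
```

For the tree the Laplacian is triangular, so its eigenvalues are the real diagonal entries. For the ring they are $1 - e^{2\pi \mathrm{i} k/6}$, most of which are complex and so produce oscillatory modes.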

Page 68

Graph Eigenvalues for Different Communication Topologies

[Figure: two six-node communication graphs]

Directed graph: Better conditioned

Undirected graph: More ill-conditioned

Page 69

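Single-Input case with Real Graph Eigenvalues

$\dfrac{\lambda_{\max} - \lambda_{\min}}{\lambda_{\max} + \lambda_{\min}} < \dfrac{1}{\prod_u |\lambda_u(A)|}$

Is equivalent to $\dfrac{\lambda_{\max}}{\lambda_{\min}} < \dfrac{M(A) + 1}{M(A) - 1}$

$x_i(k+1) = A x_i(k) + B u_i(k)$

$u_i = cK\left[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\right]$

Add stable filter: $u_i = cF(z)K\left[\sum_{j \in N_i} e_{ij}(x_j - x_i) + g_i(x_0 - x_i)\right]$

Filtered protocol gives synch. if $\dfrac{\lambda_{\max}}{\lambda_{\min}} < \left(\dfrac{M(A) + 1}{M(A) - 1}\right)^2$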

[Filter design equations, garbled in extraction: a modified Riccati equation with stabilizing solution $P$, the resulting gain $K_{\min}$, the complementary sensitivity $T_{\min}(z)$ formed from $K_{\min}$, $A$ and $B$, and the filter $F(z)$ expressed through $T_{\min}(z)$]

Complementary sensitivity

Guoxiang Gu & Lewis, IEEE TAC

Improvement

Page 70

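Graph Condition Number: $\kappa(G) = \dfrac{\lambda_{\max}}{\lambda_{\min}}$

Like to have $\kappa(G) \approx 1$

$\lambda_{\min}$ large means fast convergence

eigenratio $= \dfrac{\lambda_{\min}}{\lambda_{\max}}$

L.R. Varshney, “Distributed inference with costly wires”

$\dfrac{\lambda_{\max}}{\lambda_{\min}} < \dfrac{M(A) + 1}{M(A) - 1}$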

Page 71

Games on Communication Graphs

Kyriakos Vamvoudakis, Mohammed Abouheaf

Page 72

Sun Tzu, Bing Fa (孙子兵法, The Art of War)

Graphical Coalitional Games

500 BC

Page 73

UTA Research Institute (UTARI)
The University of Texas at Arlington

F.L. Lewis, K. Vamvoudakis, M. Abouheaf

http://ARRI.uta.edu/acs

Games on Communication Graphs

Supported by: NSF (Paul Werbos), ARO, AFOSR

Page 74

Manufacturing as the Interactions of Multiple Agents

Each machine has its own dynamics and cost function.
Neighboring machines influence each other most strongly.
There are local optimization requirements as well as global necessities.

Page 75

Graphical Games
Synchronization: Cooperative Tracker Problem

Node dynamics: $\dot{x}_i = A x_i + B_i u_i$, with $x_i(t) \in \mathbb{R}^n$, $u_i(t) \in \mathbb{R}^{m_i}$

Target generator dynamics: $\dot{x}_0 = A x_0$

Synchronization problem: $x_i(t) \to x_0(t),\ \forall i$

Local neighborhood tracking error (Lihua Xie): $\delta_i = \sum_{j \in N_i} e_{ij}(x_i - x_j) + g_i(x_i - x_0)$

Pinning gains (Ron Chen): $g_i \ge 0$

[Figure label: $x_0(t)$]

K.G. Vamvoudakis, F.L. Lewis, and G.R. Hudas, “Multi-Agent Differential Graphical Games: online adaptive learning solution for synchronization with optimality,” Automatica, vol. 48, no. 8, pp. 1598-1611, Aug. 2012.

Page 76

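Graphical Games
Synchronization: Cooperative Tracker Problem

Node dynamics: $\dot{x}_i = A x_i + B_i u_i$, with $x_i(t) \in \mathbb{R}^n$, $u_i(t) \in \mathbb{R}^{m_i}$

Target generator dynamics: $\dot{x}_0 = A x_0$

Synchronization problem: $x_i(t) \to x_0(t),\ \forall i$

Local neighborhood tracking error (Lihua Xie): $\delta_i = \sum_{j \in N_i} e_{ij}(x_i - x_j) + g_i(x_i - x_0)$

Pinning gains (Ron Chen): $g_i \ge 0$

Global neighborhood tracking error: $\delta = [\delta_1^T\ \delta_2^T\ \cdots\ \delta_N^T]^T \in \mathbb{R}^{nN}$,
$\delta = ((L + G) \otimes I_n)(x - \underline{x}_0) = ((L + G) \otimes I_n)\,\zeta$,
where $x = [x_1^T\ x_2^T\ \cdots\ x_N^T]^T \in \mathbb{R}^{nN}$ and $\underline{x}_0 = \underline{I} x_0 \in \mathbb{R}^{nN}$ with $\underline{I} = \mathbf{1}_N \otimes I_n$

Standard way: $\zeta = x - \underline{x}_0 \in \mathbb{R}^{nN}$

Lemma.  Let graph be strongly connected and at least one pinning gain nonzero.  Then
$\|\zeta\| \le \|\delta\| / \underline{\sigma}(L + G)$
and agents synchronize iff $\delta(t) \to 0$

[Figure label: $x_0(t)$]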
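The global error expression and the bound in the lemma can be checked on random data. This is a minimal sketch assuming a four-agent directed ring with one pinned node; the graph, the dimensions, and the random states are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 2, 4

# Assumed strongly connected digraph (directed ring) with one pinned node.
E = np.roll(np.eye(N), 1, axis=1)        # E[i, (i + 1) % N] = 1
g = np.array([1.0, 0.0, 0.0, 0.0])
L = np.diag(E.sum(axis=1)) - E
G = np.diag(g)

x = rng.normal(size=(N, n))              # row i holds x_i
x0 = rng.normal(size=n)                  # leader state

# Local neighborhood tracking errors, computed agent by agent.
delta_local = np.array([
    sum(E[i, j] * (x[i] - x[j]) for j in range(N)) + g[i] * (x[i] - x0)
    for i in range(N)])

# Global form: delta = ((L + G) kron I_n) zeta, with zeta the stacked x_i - x_0.
zeta = (x - x0).reshape(-1)
delta = np.kron(L + G, np.eye(n)) @ zeta

# Minimum singular value of L + G, used in the bound ||zeta|| <= ||delta|| / sigma_min.
sigma_min = np.linalg.svd(L + G, compute_uv=False).min()
```

Because $L + G$ is nonsingular for this graph, driving the local errors $\delta_i$ to zero forces every $x_i$ to $x_0$.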

Page 77

Graphical Game:  Games on Graphs (Kyriakos Vamvoudakis)

Local nbhd. tracking error dynamics:
$\dot{\delta}_i = A\delta_i + (d_i + g_i) B_i u_i - \sum_{j \in N_i} e_{ij} B_j u_j$, where $\delta_i = \sum_{j \in N_i} e_{ij}(x_i - x_j) + g_i(x_i - x_0)$
Local agent dynamics driven by neighbors’ controls

Define Local nbhd. performance index:
$J_i(\delta_i(0), u_i, u_{-i}) = \frac{1}{2}\int_0^\infty \left(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j\right) dt \equiv \frac{1}{2}\int_0^\infty L_i(\delta_i(t), u_i(t), u_{-i}(t))\,dt$,
where $u_{-i}(t) = \{u_j : j \in N_i\}$
Values driven by neighbors’ controls

Local value functions for fixed policies $u_i$:
$V_i(\delta_i(t)) = \frac{1}{2}\int_t^\infty \left(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j\right) dt$
Value depends only on neighbors

Static Graphical Game:
$(G, U, v)$, $G = (V, E)$, $v = [v_1\ \cdots\ v_N]^T$, $v_i(U_i, \{U_j : j \in N_i\}) \in \mathbb{R}$

Standard N-player differential game:
$\dot{z} = Az + \sum_{i=1}^N B_i u_i$
$J_i(z(0), u_i, u_{-i}) = \frac{1}{2}\int_0^\infty \left(z^T Q z + \sum_{j=1}^N u_j^T R_{ij} u_j\right) dt$
Dynamics depend on all other agents
Values depend on all other agents


Page 79

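New Differential Graphical Game

[Figure: players on a graph with control actions $u_1$, $u_2$, $u_i$]

Control action of player i: $u_i$

State dynamics of agent i (Local Dynamics):
$\dot{\delta}_i = A\delta_i + (d_i + g_i) B_i u_i - \sum_{j \in N_i} e_{ij} B_j u_j$

Value function of player i (Local Value Function):
$J_i(\delta_i(0), u_i, u_{-i}) = \frac{1}{2}\int_0^\infty \left(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j\right) dt$

Only depends on graph neighbors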

Page 80

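Standard Multi-Agent Differential Game

[Figure: players with control actions $u_1$, $u_2$, $u_i$ acting on one central system]

Control action of player i: $u_i$

Central Dynamics: $\dot{z} = Az + \sum_{i=1}^N B_i u_i$

Value function of player i:
$J_i(z(0), u_i, u_{-i}) = \frac{1}{2}\int_0^\infty \left(z^T Q z + \sum_{j=1}^N u_j^T R_{ij} u_j\right) dt$

Central Dynamics
Local Value Function depends on ALL other control actions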

Page 81

Team Interest vs. Self Interest
Cooperation vs. Collaboration

The objective functions of each player can be written as a team average term plus a conflict of interest term:

$J_1 = \frac{1}{3}(J_1 + J_2 + J_3) + \frac{1}{3}(J_1 - J_2) + \frac{1}{3}(J_1 - J_3) \equiv J_{team} + J_1^{coi}$

$J_2 = \frac{1}{3}(J_1 + J_2 + J_3) + \frac{1}{3}(J_2 - J_1) + \frac{1}{3}(J_2 - J_3) \equiv J_{team} + J_2^{coi}$

$J_3 = \frac{1}{3}(J_1 + J_2 + J_3) + \frac{1}{3}(J_3 - J_1) + \frac{1}{3}(J_3 - J_2) \equiv J_{team} + J_3^{coi}$

For N-players:

$J_i = \frac{1}{N}\sum_{j=1}^N J_j + \frac{1}{N}\sum_{j=1}^N (J_i - J_j) \equiv J_{team} + J_i^{coi},\quad i = 1, \dots, N$

For N-player zero-sum games, the first term is zero, i.e. the players have no goals in common.
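The decomposition is easy to verify numerically. The helper below is a hypothetical illustration with made-up cost values; it splits each $J_i$ into the team average and the conflict-of-interest remainder.

```python
import numpy as np

def team_and_conflict(J):
    """Split each player's cost J_i into a team average and a conflict-of-interest term."""
    J = np.asarray(J, dtype=float)
    N = J.size
    J_team = J.sum() / N                                   # (1/N) * sum_j J_j
    J_coi = np.array([np.sum(J[i] - J) / N for i in range(N)])
    return J_team, J_coi

costs = [3.0, 5.0, 10.0]                                   # made-up values of J_1, J_2, J_3
J_team, J_coi = team_and_conflict(costs)

zero_sum_team, _ = team_and_conflict([2.0, -5.0, 3.0])     # costs that sum to zero
```

The conflict-of-interest terms always sum to zero, and the team term vanishes exactly when the costs sum to zero.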

Page 82

Problems with Nash Equilibrium Definition on Graphical Games

Game objective: $V_i^*(\delta_i(t)) = \min_{u_i} \frac{1}{2}\int_t^\infty \left(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j\right) dt$

Define
Neighbors of node i: $u_{-i}(t) = \{u_j : j \in N_i\}$
All other nodes in graph: $u_{G-i} = \{u_j : j \in N,\ j \ne i\}$

Def:  Nash equilibrium
$\{u_1^*, u_2^*, \dots, u_N^*\}$ are in Nash equilibrium if
$J_i^* \triangleq J_i(u_i^*, u_{G-i}^*) \le J_i(u_i, u_{G-i}^*),\quad \forall i \in N$

Counterexample.  Disconnected graph

Let each node play his optimal control: $J_i^* = J_i(u_i^*)$

Then, each agent’s cost does not depend on any other agent: $J_i(u_i) = J_i(u_i, u_{G-i}) = J_i(u_i, u'_{G-i}),\quad \forall i$

Then all agents are in Nash equilibrium.
Note: this Nash is also coalition-proof.

Another example

Page 83

New Definition of Nash Equilibrium for Graphical Games

Def. Local Best response.
$u_i^*$ is said to be agent i’s local best response to fixed policies $u_{-i}$ of its neighbors if
$J_i(u_i^*, u_{-i}) \le J_i(u_i, u_{-i}),\quad \forall u_i$

Def:  Interactive Nash equilibrium
$\{u_1^*, u_2^*, \dots, u_N^*\}$ are in Interactive Nash equilibrium if

1.  They are in Nash equilibrium:
$J_i^* \triangleq J_i(u_i^*, u_{G-i}^*) \le J_i(u_i, u_{G-i}^*),\quad \forall i \in N$

2.  There exists a policy $u_j$ such that (Interaction Condition)
$J_i(u_j^*, u_{G-j}^*) \ne J_i(u_j, u_{G-j}^*),\quad \forall i, j \in N$

That is, every player can find a policy that changes the value of every other player.

A restriction on what sorts of performance indices can be selected in multi-player graph games.

A condition on the reaction curves (Basar and Olsder) of the agents.

This rules out the disconnected counterexample.

Page 84

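Theorem 3.  Let $(A, B_i)$ be reachable for all $i$.  Let agent $i$ be in local best response
$J_i(u_i^*, u_{-i}) \le J_i(u_i, u_{-i}),\quad \forall i$
Then $\{u_1^*, u_2^*, \dots, u_N^*\}$ are in global Interactive Nash iff the graph is strongly connected.

$u_i = u_i(V_i) = -(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i} \equiv K_i p_i$

$u_k = K_k p_k + v_k$

Hamiltonian System:
$\begin{bmatrix} \dot{\delta} \\ \dot{p} \end{bmatrix} = \begin{bmatrix} I_N \otimes A & ((L + G) \otimes I_n)\,\mathrm{diag}(B_i K_i) \\ -\mathrm{diag}(Q_{ii}) & -(I_N \otimes A^T) \end{bmatrix} \begin{bmatrix} \delta \\ p \end{bmatrix} + \begin{bmatrix} ((L + G) \otimes I_n)\,\bar{B}_k \\ 0 \end{bmatrix} v_k \equiv \bar{A} \begin{bmatrix} \delta \\ p \end{bmatrix} + \bar{B} v_k$

$[\,\bar{B}\ \ \bar{A}\bar{B}\ \ \bar{A}^2\bar{B}\ \cdots\,]$ involves $(L + G)$, $(L + G)^2$, $(L + G)^3$, ...

Picks out the shortest path from node k to node i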

Page 85

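Graphical Game Solution Equations

Value function:
$V_i(\delta_i(t)) = \frac{1}{2}\int_t^\infty \left(\delta_i^T Q_{ii}\delta_i + u_i^T R_{ii} u_i + \sum_{j \in N_i} u_j^T R_{ij} u_j\right) dt$

Differential equivalent (Leibniz formula) is Bellman’s Equation:
$H_i\!\left(\delta_i, \dfrac{\partial V_i}{\partial \delta_i}, u_i, u_{-i}\right) \equiv \dfrac{\partial V_i^T}{\partial \delta_i}\left(A\delta_i + (d_i + g_i) B_i u_i - \sum_{j \in N_i} e_{ij} B_j u_j\right) + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2} u_i^T R_{ii} u_i + \frac{1}{2}\sum_{j \in N_i} u_j^T R_{ij} u_j = 0$

Stationarity Condition:
$0 = \dfrac{\partial H_i}{\partial u_i} \ \Rightarrow\ u_i = -(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i}$

1.  Coupled HJ equations: $H_i\!\left(\delta_i, \dfrac{\partial V_i}{\partial \delta_i}, u_i^*, u_{-i}^*\right) = 0$, that is,
$\dfrac{\partial V_i^T}{\partial \delta_i} A_i^c + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2}(d_i + g_i)^2 \dfrac{\partial V_i^T}{\partial \delta_i} B_i R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i} + \frac{1}{2}\sum_{j \in N_i} (d_j + g_j)^2 \dfrac{\partial V_j^T}{\partial \delta_j} B_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} B_j^T \dfrac{\partial V_j}{\partial \delta_j} = 0,\quad i \in N$
where
$A_i^c = A\delta_i - (d_i + g_i)^2 B_i R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i} + \sum_{j \in N_i} e_{ij}(d_j + g_j) B_j R_{jj}^{-1} B_j^T \dfrac{\partial V_j}{\partial \delta_j},\quad i \in N$

2.  Best Response HJ Equations: other players have fixed policies $u_j$
$0 = H_i\!\left(\delta_i, \dfrac{\partial V_i}{\partial \delta_i}, u_i^*, u_{-i}\right) \equiv \dfrac{\partial V_i^T}{\partial \delta_i} A_i^c + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2}(d_i + g_i)^2 \dfrac{\partial V_i^T}{\partial \delta_i} B_i R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i} + \frac{1}{2}\sum_{j \in N_i} u_j^T R_{ij} u_j$
where
$A_i^c = A\delta_i - (d_i + g_i)^2 B_i R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i} - \sum_{j \in N_i} e_{ij} B_j u_j$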
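As a sanity check on the stationarity condition, the sketch below evaluates the local Hamiltonian for one agent with a single neighbor and confirms that $u_i = -(d_i + g_i) R_{ii}^{-1} B_i^T\, \partial V_i / \partial \delta_i$ minimizes it over $u_i$. All matrices and vectors are random or assumed values, and `p_i` merely stands in for the gradient $\partial V_i / \partial \delta_i$; nothing here solves the HJ equations.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 2
A = rng.normal(size=(n, n))
B_i = rng.normal(size=(n, m))
B_j = rng.normal(size=(n, m))
Q_ii = np.eye(n)
R_ii = np.diag([1.0, 2.0])            # positive definite control weight
R_ij = np.eye(m)
d_i, g_i, e_ij = 1.0, 1.0, 1.0        # one neighbor j, agent i pinned (assumed)

delta_i = rng.normal(size=n)          # local neighborhood tracking error
p_i = rng.normal(size=n)              # stands in for dV_i / d(delta_i)
u_j = rng.normal(size=m)              # neighbor's control, held fixed

def hamiltonian(u_i):
    """Local Hamiltonian H_i for the fixed delta_i, p_i and u_j above."""
    drift = A @ delta_i + (d_i + g_i) * B_i @ u_i - e_ij * B_j @ u_j
    return (p_i @ drift + 0.5 * delta_i @ Q_ii @ delta_i
            + 0.5 * u_i @ R_ii @ u_i + 0.5 * u_j @ R_ij @ u_j)

# Stationarity condition dH_i/du_i = 0.
u_star = -(d_i + g_i) * np.linalg.solve(R_ii, B_i.T @ p_i)

dv = np.array([0.3, -0.7])            # arbitrary perturbation of the control
gap = hamiltonian(u_star + dv) - hamiltonian(u_star)
```

Since $H_i$ is quadratic in $u_i$ with Hessian $R_{ii} > 0$, the gap equals $\frac{1}{2}\,dv^T R_{ii}\, dv$ for any perturbation.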

Page 86

Kyriakos Vamvoudakis

Page 87

Reinforcement Learning to Solve Graphical Games

Page 88

Books

D. Vrabie, K. Vamvoudakis, and F.L. Lewis, Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles, IET Press, 2012.

F.L. Lewis, D. Vrabie, and V. Syrmos, Optimal Control, third edition, John Wiley and Sons, New York, 2012.

New Chapters on: Reinforcement Learning, Differential Games

Page 89

F.L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits & Systems Magazine, Invited Feature Article, pp. 32-50, Third Quarter 2009.

IEEE Control Systems Magazine, “Reinforcement learning and feedback control,” Dec. 2012

Page 90

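Different methods of learning

Reinforcement learning: Ivan Pavlov, 1890s

Actor-Critic Learning: Paul Werbos

[Figure: actor-critic learning loop. The actor sends control inputs to the system (environment) and the outputs are observed; the critic compares them with the desired performance and issues a reinforcement signal that tunes the actor. Actor and critic together form the adaptive learning system.]

We want OPTIMAL performance: ADP (Approximate Dynamic Programming)

Sutton & Barto book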

Page 91

Doya, Kimura, Kawato 2001

Limbic system

Page 92

Summary of Motor Control in the Human Nervous System

[Figure (picture by E. Stingu, D. Vrabie): hierarchy of multiple parallel loops. Blocks: cerebral cortex (motor areas); thalamus; basal ganglia; cerebellum; inf. olive (eye movement); brainstem; spinal cord; hippocampus; limbic system; interoceptive receptors; exteroceptive receptors; muscle contraction and movement. Annotations: reflex; supervised learning; reinforcement learning (dopamine); unsupervised learning; memory functions (long term, short term); motor control 200 Hz; theta rhythms 4-10 Hz; gamma rhythms 30-100 Hz.]

Page 93

Online Solution of Graphical Games

Use Reinforcement Learning Convergence Results

POLICY ITERATION

Kyriakos Vamvoudakis

Page 94

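Online Solution of Graphical Games

Policy Iteration gives structure needed for online graph games

Solve simultaneously online:

$H_i\!\left(\delta_i, \dfrac{\partial V_i}{\partial \delta_i}, u_i, u_{-i}\right) \equiv \dfrac{\partial V_i^T}{\partial \delta_i}\left(A\delta_i + (d_i + g_i) B_i u_i - \sum_{j \in N_i} e_{ij} B_j u_j\right) + \frac{1}{2}\delta_i^T Q_{ii}\delta_i + \frac{1}{2} u_i^T R_{ii} u_i + \frac{1}{2}\sum_{j \in N_i} u_j^T R_{ij} u_j = 0$

$u_i = -(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial V_i}{\partial \delta_i}$

Weierstrass Approximator structures: 2 at each node

critic: $\hat{V}_i = \hat{W}_i^T \phi_i$

actor: $\hat{u}_i = -\frac{1}{2}(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial \phi_i^T}{\partial \delta_i} \hat{W}_{i+N}$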

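Bellman equation:

$\hat{W}_i^T \dfrac{\partial \phi_i}{\partial \delta_i}\left(A\delta_i + (d_i + g_i) B_i \hat{u}_i - \sum_{j \in N_i} e_{ij} B_j \hat{u}_j\right) + \delta_i^T Q_{ii}\delta_i + \frac{1}{4}\hat{W}_{i+N}^T \bar{D}_i \hat{W}_{i+N} + \frac{1}{4}\sum_{j \in N_i} (d_j + g_j)^2\, \hat{W}_{j+N}^T \dfrac{\partial \phi_j}{\partial \delta_j} B_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} B_j^T \dfrac{\partial \phi_j^T}{\partial \delta_j} \hat{W}_{j+N} = 0$

Bellman equation becomes an algebraic equation in the parameters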

Page 95

Approximate values by a Critic Network at each node i: $\hat{V}_i = \hat{W}_i^T \phi_i$

Approximate control policies by an Actor Network at each node i: $\hat{u}_{i+N} = -\frac{1}{2}(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial \phi_i^T}{\partial \delta_i} \hat{W}_{i+N}$

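Online Solution of Graphical Games Using Value Function Approximation

Tuning law for Critic parameters:

$\dot{\hat{W}}_i = -a_i \dfrac{\partial E_i}{\partial \hat{W}_i} = -a_i \dfrac{\sigma_i}{(1 + \sigma_i^T \sigma_i)^2}\left[\hat{W}_i^T \sigma_i + \delta_i^T Q_{ii}\delta_i + \frac{1}{4}\hat{W}_{i+N}^T \bar{D}_i \hat{W}_{i+N} + \frac{1}{4}\sum_{j \in N_i} (d_j + g_j)^2\, \hat{W}_{j+N}^T \dfrac{\partial \phi_j}{\partial \delta_j} B_j R_{jj}^{-1} R_{ij} R_{jj}^{-1} B_j^T \dfrac{\partial \phi_j^T}{\partial \delta_j} \hat{W}_{j+N}\right]$

Tuning law for Actor parameters:

[Actor tuning law for $\hat{W}_{i+N}$, garbled in extraction: a gain $a_{i+N}$ multiplying terms in $F$, $\hat{W}_{i+N}$, $\hat{W}_i$, $\bar{D}_i$ and a sum over neighbors $j \in N_i$, $j \ne i$]

Converges to solution of coupled HJ equations, and Nash equilibrium, and keeps states stable while learning.

Need PE of $\sigma_i = \dfrac{\partial \phi_i}{\partial \delta_i}\left(A\delta_i + (d_i + g_i) B_i \hat{u}_i - \sum_{j \in N_i} e_{ij} B_j \hat{u}_j\right)$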

Page 96

Integral Reinforcement Form of Bellman Equation (Draguna Vrabie)

Bellman Equation:
$0 = \left(\dfrac{\partial V}{\partial x}\right)^T f(x, u) + r(x, u) \equiv H\!\left(x, \dfrac{\partial V}{\partial x}, u\right),\quad V(0) = 0$

Lemma 1 (Draguna Vrabie).  Another form for the CT Bellman eq.:
$V(x(t)) = \int_t^{t+T} r(x, u)\,d\tau + V(x(t+T)),\quad V(0) = 0$
Is equivalent to the Bellman equation.

Solves Lyapunov equation without knowing f(x,u)

Can avoid knowledge of drift term f(x) by using Integral Reinforcement Learning (IRL)

Then HJ equations are solved online without knowing f(x).
Coupled AREs are solved online without knowing A.


Page 98

Integral Reinforcement Form of Bellman Equation (Draguna Vrabie)

Lemma 1 (Draguna Vrabie).  Another form for the CT Bellman eq.:
$V(x(t)) = \int_t^{t+T} r(x, u)\,d\tau + V(x(t+T)),\quad V(0) = 0$
Is equivalent to
$0 = \left(\dfrac{\partial V}{\partial x}\right)^T f(x, u) + r(x, u) \equiv H\!\left(x, \dfrac{\partial V}{\partial x}, u\right),\quad V(0) = 0$

Proof:
$\dfrac{d\,V(x)}{dt} = \left(\dfrac{\partial V}{\partial x}\right)^T f(x, u) = -r(x, u)$
$\int_t^{t+T} r(x, u)\,d\tau = -\int_t^{t+T} d\,(V(x)) = V(x(t)) - V(x(t+T))$

Solves Lyapunov equation without knowing f(x,u)

Allows definition of temporal difference error for CT systems:
$e(t) = -V(x(t)) + \int_t^{t+T} r(x, u)\,d\tau + V(x(t+T))$
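The integral reinforcement form can be illustrated on a scalar linear example. The sketch below evaluates a fixed policy $u = -kx$ by fitting $V(x) = p x^2$ to measured trajectory data using only the integral form, then compares the result with the value from the Lyapunov equation. The plant parameters, the policy gain, and the interval $T$ are assumed values; the closed-form trajectory inside `rollout` only stands in for measurements, and the fit itself never uses `a`.

```python
import numpy as np

# Scalar plant x' = a x + b u under the fixed policy u = -k x (all values assumed).
a, b, k = 1.0, 1.0, 3.0
q, rho = 1.0, 1.0                     # utility r(x, u) = q x^2 + rho u^2
T, steps = 0.1, 2000                  # reinforcement interval and quadrature grid

def rollout(x_start):
    """Return x(t+T) and the integral of r over [t, t+T] along the closed loop."""
    tau = np.linspace(0.0, T, steps + 1)
    x = x_start * np.exp((a - b * k) * tau)       # stands in for measured data
    r = q * x**2 + rho * (k * x)**2
    integral = np.sum(0.5 * (r[1:] + r[:-1]) * np.diff(tau))   # trapezoid rule
    return x[-1], integral

# Fit V(x) = p x^2 from  p (x(t)^2 - x(t+T)^2) = integral of r  by least squares.
x_t, rows, targets = 2.0, [], []
for _ in range(10):
    x_next, reward = rollout(x_t)
    rows.append(x_t**2 - x_next**2)
    targets.append(reward)
    x_t = x_next
p_irl = float(np.dot(rows, targets) / np.dot(rows, rows))

# Reference value from the Lyapunov equation 2 (a - b k) p + q + rho k^2 = 0.
p_lyap = (q + rho * k**2) / (2.0 * (b * k - a))

# Temporal difference error e(t) for the fitted value on one interval.
x_end, reward = rollout(2.0)
td_error = -p_irl * 2.0**2 + reward + p_irl * x_end**2
```

In this example both routes give $p = 2.5$ up to quadrature error, and the temporal difference error of the fitted value is close to zero.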

Page 99

Online Solution of Graphical Games

New Structure of Adaptive Controller: Reinforcement Learning Adaptive Critic

[Figure: adaptive critic loop with blocks System, Action network, and Policy Evaluation (Critic network); signals: cost, value update, control policy update]

System: $\dot{x}_i = f(x_i) + g(x_i) u_i$

Critic: $\hat{V}_i = \hat{W}_i^T \phi_i$

Actor: $\hat{u}_i = -\frac{1}{2}(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial \phi_i^T}{\partial \delta_i} \hat{W}_{i+N}$

Critic and Actor tuned simultaneously.
Leads to ONLINE FORWARD-IN-TIME implementation of optimal control.

Do not need to know system drift dynamics

F.L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits & Systems Magazine, Invited Feature Article, pp. 32-50, Third Quarter 2009.

Best Paper Award, Int. Joint Conf. Neural Networks, Barcelona, 2010. Draguna Vrabie and F.L. Lewis, “Adaptive Dynamic Programming Algorithm for Finding Online the Equilibrium Solution of the Two-Player Zero-Sum Differential Game.”

Page 100

Graphical Games for Multi-Process Optimal Control

[Figure: process i and its neighbor processes j1 and j2, each with a cost function optimization block and a control update block]

Optimal Performance of Each Process Depends on the Control of its Neighbor Processes

Control Policy of Each Process Depends on the Performance of its Neighbor Processes

Page 103

Motions of Biological Groups

Fish school

Birds flock

Locusts swarm

Fireflies synchronize

Local / Peer-to-Peer Relationships in socio-biological systems

Page 104

The cloud-capped towers, the gorgeous palaces,
The solemn temples, the great globe itself,
Yea, all which it inherit, shall dissolve,
And, like this insubstantial pageant faded,
Leave not a rack behind.

We are such stuff as dreams are made on, and our little life is rounded with a sleep.

Our revels now are ended. These our actors, As I foretold you, were all spirits, and Are melted into air, into thin air.

Prospero, in The Tempest, act 4, sc. 1, l. 152-6, Shakespeare

Page 105

The Adaptive Critic Architecture

Adaptive Critics

[Figure: adaptive critic loop with blocks System, Action network, and Policy Evaluation (Critic network); signals: cost, value update, control policy update]

Critic and Actor tuned simultaneously.
Leads to ONLINE FORWARD-IN-TIME implementation of optimal control.

Policy Iteration is Reinforcement Learning

F.L. Lewis and D. Vrabie, “Reinforcement learning and adaptive dynamic programming for feedback control,” IEEE Circuits & Systems Magazine, Invited Feature Article, pp. 32-50, Third Quarter 2009.

Page 106

The Adaptive Critic Architecture

Adaptive Critics

[Figure: adaptive critic loop with blocks System, Action network, and Policy Evaluation (Critic network); signals: cost, value update, control policy update]

Critic and Actor tuned simultaneously.
Leads to ONLINE FORWARD-IN-TIME implementation of optimal control.

A new adaptive control architecture

Optimal Adaptive Control

Critic: $\hat{V}_i = \hat{W}_i^T \phi_i$

Actor: $\hat{u}_i = -\frac{1}{2}(d_i + g_i) R_{ii}^{-1} B_i^T \dfrac{\partial \phi_i^T}{\partial \delta_i} \hat{W}_{i+N}$
