Task Planning and Multi-Agent Systems
Robert Stengel
Robotics and Intelligent Systems,
MAE 345, Princeton University, 2017
Copyright 2017 by Robert Stengel. All rights reserved. For educational use only.
http://www.princeton.edu/~stengel/MAE345.html
• Decision making
• Task decomposition, communities, and connectivity
• Cooperation, collaboration, competition, and conflict
• Single-agent path planning (see Lecture 6)
• Multi-agent architectures
• Swarm dynamics and control
1
Task Planning Goals
• Accomplish an objective
  – Make a decision
  – Gather information
  – Build something
  – Analyze something
  – Destroy something
• Determine and follow a path
  – Minimize time or cost
  – Take the shortest path
  – Avoid obstacles or hazards
• Work toward a common goal
  – Integrate behavior with higher objectives
  – Do not impede other agents
Central Pacific and Union Pacific Railroads meet in Promontory, Utah, 1869
Driving The Golden Spike
2
More Task Planning Goals
• Provide leadership for other agents
  – Issue commands
  – Receive and decode information
• Provide assistance to other agents
  – Coordinate actions
  – Respond to requests
• Defeat opposing agents
  – Compete and win
• Path planning
  – See Lecture 5
3
Common Threads in Task Accomplishment
• Optimize a cost function
• Satisfy or avoid constraints
• Exhibit desirable behavior
• Trade off individual and team goals
• Use resources effectively and efficiently
• Negotiate
• Cooperate with team members
• Overcome adversity and ambiguity
4
Task Planning
• Situation awareness
• Decomposition and identification of communities
• Development of strategy and tactics
Phase / Objective        Process                Outcome
Tactical (short-term)    Situation Assessment   Situation Awareness
Strategic (long-term)    Comprehension          Understanding
5
Boyd's OODA Loop for Combat Operations
• Derived from air-combat maneuvering strategy
• General application to learning processes other than military
6
Endsley, 1995
Elements of Situation Awareness
7
Important Dichotomies in Planning
Strength, Weakness, Opportunity, and Threat (SWOT) Analysis
Knok-Knoks and Unk-Unks
8
Strategy/Tactics Development and Deployment
• Development of long- and short-term actions/activities for implementation and operation
• Sequence of procedures to be executed
  – Fixed or adaptive
• Exposition of approach
  – Rules of engagement
  – Concept of Operations (CONOPS)
• Spectrum of flexibility
  – Rigid sequence <---> Learning systems
• Think "Expert System"
9
Planning Tools
10
Program Management: Gantt Chart
• Project schedule
• Task breakdown and dependency
• Start, interim, and finish elements
• Time elapsed, time to go
11
Program Evaluation and Review Technique (PERT) Chart
• Milestones
• Path descriptors
• Activities, precursors, and successors
• Timing and coordination
• Identification of critical path
• Optimization and constraint
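The critical path of a PERT chart is the longest chain of dependent activities, which sets the minimum project duration. The following is a minimal sketch, assuming an acyclic activity-on-node network; the function name `critical_path` and the four-activity project are hypothetical illustrations, not taken from the lecture.

```python
def critical_path(durations, predecessors):
    """Return (project_length, path) for an acyclic activity-on-node network."""
    order, seen = [], set()

    def visit(task):
        # Depth-first search: list every precursor before the task itself.
        if task in seen:
            return
        seen.add(task)
        for pred in predecessors.get(task, []):
            visit(pred)
        order.append(task)

    for task in durations:
        visit(task)

    earliest_finish, latest_pred = {}, {}
    for task in order:
        # A task can start only when its slowest precursor has finished.
        start, latest_pred[task] = 0, None
        for pred in predecessors.get(task, []):
            if earliest_finish[pred] > start:
                start, latest_pred[task] = earliest_finish[pred], pred
        earliest_finish[task] = start + durations[task]

    # Walk back from the last-finishing task to recover the critical path.
    task = max(earliest_finish, key=earliest_finish.get)
    length, path = earliest_finish[task], []
    while task is not None:
        path.append(task)
        task = latest_pred[task]
    return length, path[::-1]


# Hypothetical project: durations in days, with precursor lists.
durations = {"design": 3, "build": 5, "software": 4, "test": 2}
predecessors = {"build": ["design"], "software": ["design"],
                "test": ["build", "software"]}
length, path = critical_path(durations, predecessors)
print(length, path)  # 10 ['design', 'build', 'test']
```

Activities off the critical path ("software" here) have slack: they can slip by a day without delaying the finish.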
12
Task Decomposition: Community Identification
• Connectivity of individuals
• Individuals assemble in communities or clusters
• Complex networks
  – Random networks
  – Small-world networks
  – Scale-free networks
• Degrees of separation
[Figure panels: Fully connected; Random; Clustered small world; Small-world ring lattice]
Community <-> Communication
Strogatz, 2001
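"Degrees of separation" can be read as the mean shortest-path length between pairs of individuals. The sketch below, under the assumption of a simple ring lattice with nearest-neighbor links only, shows how a couple of long-range links shorten that mean, which is the small-world effect. The function names and the 20-node example are hypothetical.

```python
from collections import deque


def ring_lattice(n):
    """Each of n nodes is linked to its two nearest neighbors on a ring."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_link(graph, a, b):
    """Add an undirected link between nodes a and b."""
    graph[a].add(b)
    graph[b].add(a)


def mean_separation(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
        pairs += len(distance) - 1
    return total / pairs


ring = ring_lattice(20)
ring_mean = mean_separation(ring)

# Two hypothetical long-range "shortcut" links across the ring.
shortcut = ring_lattice(20)
add_link(shortcut, 0, 10)
add_link(shortcut, 5, 15)
shortcut_mean = mean_separation(shortcut)

print(round(ring_mean, 3), round(shortcut_mean, 3))
```

Adding links can never lengthen a shortest path, so the second value is smaller than the first.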
13
Communities and Networks
14
Scale-Free Networks
Frequency and cumulative distributions of cluster sizes, k, follow a power law: proportional to k^x, x ~ –2 or –3 (that is, inversely proportional to k^2 or k^3)
No "knee" that implies a scale in the distribution
Strogatz, 2001
Scale-Free
15
https://en.wikipedia.org/wiki/Scale-free_network
Community Examples
• Associations
• Governments
  – Agencies
    • Laboratories
      – Managers
      – Scientists
• Military organizations
  – Army
    • Corps
      – Division
        » Brigade
          – Regiment
            • Battalion
              – Company
                » Platoon
                  – Squad
                    • Soldier
  – Special Operations
• Terrorist organizations
• Families
• Classmates
• Neighbors
• Social Networks
  – Facebook
  – LinkedIn
• Media Networks
• Corporations
• Employees
• Customers
• Sports Leagues
  – Teams
    • Managers
      – Players
      – Trainers
• Airlines
• Cities
16
Multi-Agent Systems
• Specialized vs. general-purpose agents
• Organizational models
• Cooperators
  – Leader/follower (hierarchical)
  – Equal members
• Collaborators
  – Air, ground, and sea traffic
  – Customers
• Competitors
  – Individual game players
  – Sports teams
  – Political/military organizations
• Negotiators
  – Politicians
  – Employer/employee representatives
17
Multi-Agent Systems
• Cooperation and collaboration should lead to win-win (non-zero-sum) solutions
• Competition should lead to win-lose (zero-sum) solutions
• Negotiation should lead to win-win but may lead to win-lose solutions
18
Typical Characteristics of Multi-Agent Architectures
• Federated (centralized) problem solving
  – Doctrinaire
  – Coupled
  – Synchronous
  – Fragile
  – Complex
  – Strategic
  – Information-rich
  – Unified
  – Integrated
  – Top-down
  – Globally optimal
• Distributed problem solving
  – Autonomous
  – Independent
  – Asynchronous
  – Robust
  – Simple
  – Tactical
  – Parsimonious
  – Idiosyncratic
  – Modular
  – Bottom-up
  – Locally optimal
19
Hierarchical Tree or Hub-and-Spoke Network?
20
What is the Nature, Quality, and Significance of Connections?
• Communication
• Collaboration
• Coordination
• Negotiation
• Competition
• Conflict
21
Connections May Connote Different Relationships
• Communication
• Collaboration
• Coordination
• Negotiation
• Competition
• Conflict
22
Competition
23
Conventional Conflict
24
Unconventional ("Asymmetric") Conflict
25
System Analysis of the 9/11 Terrorist Network
• Hijackers
  – AA11
  – AA77
  – UA93
  – UA175
• Accomplices
26
http://pear.accc.uic.edu/ojs/index.php/fm/article/view/941/863
27
Air Traffic Management: A Collaborative Multi-Agent System
28
https://www.flightradar24.com
Elements of Principled Negotiation
[Fisher, Ury (1981); Ury (1991)]
• Example of decision-making
• Separate agents* from the problem
• Focus on interests, not positions
• Invent options for mutual gain
• Insist on using objective criteria
* people, organizations, entities, …
29
Intelligent Agents in Air Traffic Management
30
Principled Negotiation Flow Chart
31
Expert System Diagram for Principled Negotiation (Wangermann and Stengel)
• Separate agents* from the problem
• Focus on interests, not positions
• Invent options for mutual gain
• Insist on using objective criteria
32
Graphical Representation of Knowledge: Principled Negotiation in Air Traffic Management
33
Principled Negotiation: Getting Past No (Ury, 1991)
• Prepare by identifying barriers to cooperation, options, standards, and your Best Alternative to a Negotiated Agreement (BATNA)
• Understand your goals, limits, and acceptable outcomes
• Buy time to think
• Know your "hot buttons"; deflect attacks
• Acknowledge opposing arguments
• Agree when you can without conceding
• Express your views without provoking
• "I" statements, not "you" statements
• Negotiate the rules of the game
• Reframe the negotiation
• Build a "golden bridge" that allows opponent to retreat gracefully
• Engage third-party mediation or arbitration
• Aim for mutual satisfaction, not victory
• Forge a lasting agreement
34
Multi-Agent Scenarios Modeled as Optimal Control Problems
35
Multi-Agent Control Example Based on Linear-Quadratic-Gaussian (LQG) Optimal Control
• Quadratic cost function
$$
\begin{aligned}
E(J) &= E\left\{ \phi\left[x(t_f)\right] + \int_{t_o}^{t_f} L\left[x(t), u(t)\right] dt \right\} \\
&= E\left\{ \frac{1}{2} x^T(t_f) S_f x(t_f) + \frac{1}{2} \int_{t_o}^{t_f} \left[ x^T(t) Q x(t) + u^T(t) R u(t) \right] dt \right\}
\end{aligned}
$$
• Linear dynamic model
$$
\dot{x}(t) = F x(t) + G u(t) + L w(t)
$$
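For a scalar, time-invariant version of the model and cost above, the steady-state Riccati equation has a closed-form solution, which makes the structure of the optimal gain easy to see. This is a minimal sketch of the deterministic linear-quadratic part only (no disturbance w and no state estimator); the function name `scalar_lqr` and the numbers are hypothetical.

```python
import math


def scalar_lqr(f, g, q, r):
    """Steady-state Riccati solution s and feedback gain c for
    dx/dt = f x + g u with cost (1/2) * integral of (q x^2 + r u^2) dt.

    s is the positive root of 0 = 2 f s + q - s^2 g^2 / r, and the
    control law is u = -c x with c = g s / r.
    """
    if g == 0 or q <= 0 or r <= 0:
        raise ValueError("sketch assumes g != 0, q > 0, r > 0")
    s = r * (f + math.sqrt(f * f + g * g * q / r)) / (g * g)
    c = g * s / r
    return s, c


# Hypothetical open-loop-unstable plant (f > 0).
f, g, q, r = 1.0, 1.0, 3.0, 1.0
s, c = scalar_lqr(f, g, q, r)
closed_loop_pole = f - g * c
print(s, c, closed_loop_pole)  # 3.0 3.0 -2.0
```

In the LQG setting, the same gain would be applied to a state estimate rather than to the state itself, u = -c x̂ (certainty equivalence).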
36
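A Federated Optimization Problem
Dynamic models for two agents, A and B, are coupled to each other and expressed as a single system
$$
\dot{x}(t) = F x(t) + G u(t) =
\begin{bmatrix} F_A & F_B^A \\ F_A^B & F_B \end{bmatrix}
\begin{bmatrix} x_A \\ x_B \end{bmatrix} +
\begin{bmatrix} G_A & G_B^A \\ G_A^B & G_B \end{bmatrix}
\begin{bmatrix} u_A \\ u_B \end{bmatrix}
$$
Cost function minimizes performance-control tradeoff
$$
\begin{aligned}
E(J) &= E\left\{ \frac{1}{2} \int_{t_o}^{t_f} \left[ x^T(t) Q x(t) + u^T(t) R u(t) \right] dt \right\} \\
&= E\left\{ \frac{1}{2} \int_{t_o}^{t_f} \left(
\begin{bmatrix} x_A^T & x_B^T \end{bmatrix}
\begin{bmatrix} Q_A & Q_B^A \\ Q_A^B & Q_B \end{bmatrix}
\begin{bmatrix} x_A \\ x_B \end{bmatrix} +
\begin{bmatrix} u_A^T & u_B^T \end{bmatrix}
\begin{bmatrix} R_A & R_B^A \\ R_A^B & R_B \end{bmatrix}
\begin{bmatrix} u_A \\ u_B \end{bmatrix}
\right) dt \right\}
\end{aligned}
$$
Optimal feedback control laws are coupled to each other
$$
u(t) = -C \hat{x}(t) =
\begin{bmatrix} u_A \\ u_B \end{bmatrix} = -
\begin{bmatrix} C_A & C_B^A \\ C_A^B & C_B \end{bmatrix}
\begin{bmatrix} \hat{x}_A \\ \hat{x}_B \end{bmatrix}
$$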
37
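A Distributed Optimization Problem
Coupling between actions of two agents, A and B, is negligible
$$
\dot{x}(t) = F x(t) + G u(t) =
\begin{bmatrix} F_A & 0 \\ 0 & F_B \end{bmatrix}
\begin{bmatrix} x_A \\ x_B \end{bmatrix} +
\begin{bmatrix} G_A & 0 \\ 0 & G_B \end{bmatrix}
\begin{bmatrix} u_A \\ u_B \end{bmatrix}
$$
Each sub-system can be optimized separately
$$
\begin{aligned}
E(J) &= E\left\{ \frac{1}{2} \int_{t_o}^{t_f} \left[ x^T(t) Q x(t) + u^T(t) R u(t) \right] dt \right\} \\
&= E\left\{ \frac{1}{2} \int_{t_o}^{t_f} \left(
\begin{bmatrix} x_A^T & x_B^T \end{bmatrix}
\begin{bmatrix} Q_A & 0 \\ 0 & Q_B \end{bmatrix}
\begin{bmatrix} x_A \\ x_B \end{bmatrix} +
\begin{bmatrix} u_A^T & u_B^T \end{bmatrix}
\begin{bmatrix} R_A & 0 \\ 0 & R_B \end{bmatrix}
\begin{bmatrix} u_A \\ u_B \end{bmatrix}
\right) dt \right\}
\end{aligned}
$$
Each control depends only on separate sub-state
$$
u(t) = -\begin{bmatrix} R_A & 0 \\ 0 & R_B \end{bmatrix}^{-1} G^T S \hat{x}(t) = -C \hat{x}(t) =
\begin{bmatrix} u_A \\ u_B \end{bmatrix} = -
\begin{bmatrix} C_A & 0 \\ 0 & C_B \end{bmatrix}
\begin{bmatrix} \hat{x}_A \\ \hat{x}_B \end{bmatrix}
$$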
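The claim that each sub-system can be optimized separately can be checked numerically: if each agent's scalar Riccati equation is solved on its own, the block-diagonal matrix built from the two solutions should satisfy the joint steady-state Riccati equation. The sketch below assumes scalar agents and a steady-state (infinite-horizon) problem; the helper names and numerical values are hypothetical.

```python
import math


def scalar_riccati(f, g, q, r):
    """Positive root s of 0 = 2 f s + q - s^2 g^2 / r (g != 0, q > 0, r > 0)."""
    return r * (f + math.sqrt(f * f + g * g * q / r)) / (g * g)


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def diag(a, b):
    return [[a, 0.0], [0.0, b]]


def riccati_residual(F, G, Q, R_inv, S):
    """F^T S + S F - S G R^-1 G^T S + Q, which is zero at the solution."""
    left = matmul(transpose(F), S)
    right = matmul(S, F)
    quad = matmul(matmul(matmul(matmul(S, G), R_inv), transpose(G)), S)
    n = len(F)
    return [[left[i][j] + right[i][j] - quad[i][j] + Q[i][j]
             for j in range(n)] for i in range(n)]


# Hypothetical scalar agents A and B: (f, g, q, r).
agent_a = (1.0, 1.0, 3.0, 1.0)
agent_b = (-2.0, 2.0, 5.0, 4.0)

# Solve each agent's problem separately.
s_a = scalar_riccati(*agent_a)
s_b = scalar_riccati(*agent_b)

# Assemble the decoupled joint problem and check the joint equation.
F = diag(agent_a[0], agent_b[0])
G = diag(agent_a[1], agent_b[1])
Q = diag(agent_a[2], agent_b[2])
R_inv = diag(1.0 / agent_a[3], 1.0 / agent_b[3])
S = diag(s_a, s_b)

residual = riccati_residual(F, G, Q, R_inv, S)
C = matmul(matmul(R_inv, transpose(G)), S)  # joint gain, block diagonal
print(residual, C)
```

The off-diagonal entries of the gain matrix C are zero, so each agent's control uses only its own sub-state.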
38
Pursuit-Evasion: A Competitive Optimization Problem
Linear model with two competitors, P and E
$$
\dot{x}(t) = F x(t) + G u(t) =
\begin{bmatrix} \dot{x}_P \\ \dot{x}_E \end{bmatrix} =
\begin{bmatrix} F_P & 0 \\ 0 & F_E \end{bmatrix}
\begin{bmatrix} x_P \\ x_E \end{bmatrix} +
\begin{bmatrix} G_P & 0 \\ 0 & G_E \end{bmatrix}
\begin{bmatrix} u_P \\ u_E \end{bmatrix}
$$
Pursuer's goal: minimize final miss distance
Evader's goal: maximize final miss distance
• Example of a differential game, Isaacs (1965), Bryson & Ho (1969)
39
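Pursuit-Evasion: A Competitive Optimization Problem
Quadratic minimax (saddle-point) cost function
$$
\begin{aligned}
E(J) &= E\left\{ \frac{1}{2} \left[ x^T(t_f) S(t_f) x(t_f) \right] + \frac{1}{2} \int_{t_o}^{t_f} \left[ x^T(t) Q x(t) + u^T(t) R u(t) \right] dt \right\} \\
&= E\left\{ \frac{1}{2}
\begin{bmatrix} x_P^T(t_f) & x_E^T(t_f) \end{bmatrix}
\begin{bmatrix} S_P & S_{PE} \\ S_{EP} & S_E \end{bmatrix}_f
\begin{bmatrix} x_P(t_f) \\ x_E(t_f) \end{bmatrix} \right\} \\
&\quad + E\left\{ \frac{1}{2} \int_{t_o}^{t_f} \left(
\begin{bmatrix} x_P^T(t) & x_E^T(t) \end{bmatrix}
\begin{bmatrix} Q_P & Q_{PE} \\ Q_{EP} & Q_E \end{bmatrix}
\begin{bmatrix} x_P(t) \\ x_E(t) \end{bmatrix} +
\begin{bmatrix} u_P^T(t) & u_E^T(t) \end{bmatrix}
\begin{bmatrix} R_P & 0 \\ 0 & -R_E \end{bmatrix}
\begin{bmatrix} u_P(t) \\ u_E(t) \end{bmatrix}
\right) dt \right\}
\end{aligned}
$$
Optimal control laws for pursuer and evader
$$
u(t) = \begin{bmatrix} u_P(t) \\ u_E(t) \end{bmatrix} = -
\begin{bmatrix} C_P(t) & C_{PE}(t) \\ C_{EP}(t) & C_E(t) \end{bmatrix}
\begin{bmatrix} \hat{x}_P(t) \\ \hat{x}_E(t) \end{bmatrix}
$$
40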
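The effect of the negative evader weight, -R_E, can be seen in a scalar steady-state sketch. It assumes a single relative state x (for example, a miss-distance coordinate) driven by both players, dx/dt = f x + g_P u_P + g_E u_E, with cost (1/2) ∫ (q x² + r_P u_P² − r_E u_E²) dt; this reduced model, the function name, and the numbers are hypothetical simplifications, not the lecture's two-state formulation.

```python
import math


def scalar_pursuit_evasion(f, g_p, g_e, q, r_p, r_e):
    """Steady-state saddle-point solution of the scalar game.

    The Riccati equation is 0 = 2 f s + q - k s^2 with
    k = g_p^2 / r_p - g_e^2 / r_e. A finite saddle point needs k > 0,
    i.e. the pursuer must have more effective control authority.
    Returns (s, c_p, c_e) for the laws u_P = -c_p x and u_E = -c_e x.
    """
    k = g_p ** 2 / r_p - g_e ** 2 / r_e
    if k <= 0 or q < 0:
        raise ValueError("sketch assumes k > 0 and q >= 0")
    s = (f + math.sqrt(f * f + k * q)) / k
    c_p = g_p * s / r_p       # pursuer drives x toward zero
    c_e = -g_e * s / r_e      # sign flipped by the -r_e weight
    return s, c_p, c_e


# Hypothetical values: the pursuer has twice the evader's control effectiveness.
f, g_p, g_e, q, r_p, r_e = 0.0, 2.0, 1.0, 3.0, 1.0, 1.0
s, c_p, c_e = scalar_pursuit_evasion(f, g_p, g_e, q, r_p, r_e)
closed_loop_pole = f - g_p * c_p - g_e * c_e
print(s, c_p, c_e, closed_loop_pole)  # 1.0 2.0 -1.0 -3.0
```

The evader's gain has the opposite sign to the pursuer's: it pushes the state away from zero, but the pursuer's larger authority keeps the closed loop stable.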
Requirements for Guaranteeing Stability of the LQ Regulator
$$
\Delta\dot{x}(t) = F \Delta x(t) + G \Delta u(t) = \left[ F - GC \right] \Delta x(t)
$$
Closed-loop system is stable whether or not open-loop system is stable if ...
$$
Q > 0, \quad R > 0
$$
... and (F,G) is a controllable pair
$$
\mathrm{Rank} \begin{bmatrix} G & FG & \cdots & F^{n-1}G \end{bmatrix} = n
$$
Coordination
42
Collaboration
43
Conclusion
• Robots and Robotics
  – Mechanical devices
  – Design of mechanical devices
  – Use of mechanical devices
  – Control processes, sensors, and algorithms used in humans, animals, and machines
• Intelligent Systems
  – Systems to perform useful functions driven by goals and current knowledge
  – Systems that emulate biological and cognitive processes
  – Systems that process information to achieve objectives
  – Systems that learn by example
  – Systems that adapt to a changing environment
  – Optimization
• Robots + Intelligent Systems = Intelligent Robotics
44
Supplementary Material
45
MAE 345 Course Learning Objectives
• Dynamics and control of robotic devices.
• Cognitive and biological paradigms for system design.
• Estimation of the behavior of dynamic systems.
• Application of decision-making concepts, including neural networks, expert systems, and genetic algorithms.
• Components of systems for decision-making and control, such as sensors, actuators, and computers.
• Systems-engineering approach to the analysis, design, and testing of robotic devices.
• Computational problem-solving, through thorough knowledge, application, and development of analytical software.
• Historical context within which robotics and intelligent systems have evolved.
• Global and ethical impact of robotics and intelligent systems in the context of contemporary society.
• Oral and written presentation.
46
Intelligent Aircraft/Airspace System: Flow Control
47
Intelligent Aircraft/Airspace System: Departure Control
48
A Cooperative Multi-Agent System
49
Decomposition into Fast and Slow Models
50
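Reduction of Dynamic Model Order
Separation of high-order models into loosely coupled or decoupled lower order approximations
$$
\begin{aligned}
\begin{bmatrix} \Delta\dot{x}_{fast} \\ \Delta\dot{x}_{slow} \end{bmatrix}
&= \begin{bmatrix} F_{fast} & F_{slow}^{fast} \\ F_{fast}^{slow} & F_{slow} \end{bmatrix}
\begin{bmatrix} \Delta x_{fast} \\ \Delta x_{slow} \end{bmatrix} +
\begin{bmatrix} G_{fast} & G_{slow}^{fast} \\ G_{fast}^{slow} & G_{slow} \end{bmatrix}
\begin{bmatrix} \Delta u_{fast} \\ \Delta u_{slow} \end{bmatrix} \\
&= \begin{bmatrix} F_f & \text{small} \\ \text{small} & F_s \end{bmatrix}
\begin{bmatrix} \Delta x_f \\ \Delta x_s \end{bmatrix} +
\begin{bmatrix} G_f & \text{small} \\ \text{small} & G_s \end{bmatrix}
\begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix}
\end{aligned}
$$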
51
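Truncation of a Dynamic Model
• Dynamic model order reduction when
  – Two modes are only slightly coupled
  – Time scales of motions are far apart
  – Forcing terms are largely independent
$$
\begin{aligned}
\begin{bmatrix} \Delta\dot{x}_f \\ \Delta\dot{x}_s \end{bmatrix}
&= \begin{bmatrix} F_f & F_s^f \\ F_f^s & F_s \end{bmatrix}
\begin{bmatrix} \Delta x_f \\ \Delta x_s \end{bmatrix} +
\begin{bmatrix} G_f & G_s^f \\ G_f^s & G_s \end{bmatrix}
\begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix} \\
&= \begin{bmatrix} F_f & \text{small} \\ \text{small} & F_s \end{bmatrix}
\begin{bmatrix} \Delta x_f \\ \Delta x_s \end{bmatrix} +
\begin{bmatrix} G_f & \text{small} \\ \text{small} & G_s \end{bmatrix}
\begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix} \\
&\approx \begin{bmatrix} F_f & 0 \\ 0 & F_s \end{bmatrix}
\begin{bmatrix} \Delta x_f \\ \Delta x_s \end{bmatrix} +
\begin{bmatrix} G_f & 0 \\ 0 & G_s \end{bmatrix}
\begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix}
\end{aligned}
$$
$$
\begin{aligned}
\Delta\dot{x}_f &= F_f \Delta x_f + G_f \Delta u_f \\
\Delta\dot{x}_s &= F_s \Delta x_s + G_s \Delta u_s
\end{aligned}
$$
• Approximation: Modes can be analyzed and control systems can be designed separately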
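How good the truncation is depends on how small the coupling terms are. The sketch below, assuming scalar fast and slow modes with equal coupling in both directions, compares the eigenvalues of the full 2×2 model with the truncated values F_f and F_s; the function names and numbers are hypothetical.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues (ascending) of [[a, b], [c, d]]."""
    mean = (a + d) / 2.0
    disc = ((a - d) / 2.0) ** 2 + b * c
    if disc < 0:
        raise ValueError("complex eigenvalues; this sketch handles real ones only")
    root = math.sqrt(disc)
    return mean - root, mean + root


def truncation_error(f_f, f_s, coupling):
    """Largest eigenvalue error made by dropping the coupling terms.

    Assumes f_f < f_s so the ascending eigenvalues pair up as (fast, slow).
    """
    full = eigenvalues_2x2(f_f, coupling, coupling, f_s)
    truncated = (f_f, f_s)
    return max(abs(x - y) for x, y in zip(full, truncated))


# Hypothetical modes: fast pole at -10, slow pole at -1.
weak = truncation_error(-10.0, -1.0, 0.1)    # slight coupling
strong = truncation_error(-10.0, -1.0, 3.0)  # substantial coupling
print(weak, strong)
```

With slight coupling the error is on the order of 0.001, so separate analysis is reasonable; with strong coupling it approaches 1, comparable to the slow pole itself.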
52
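Residualization of a Dynamic Model
• Dynamic model order reduction when
  – Two modes are coupled
  – Time scales of motions are separated
  – Fast mode is stable
• Approximation: Motions can be analyzed separately using different clocks
  – Fast mode reaches steady state instantaneously on slow-mode time scale
  – Slow mode produces slowly changing bias disturbances on fast-mode time scale
$$
\begin{aligned}
\begin{bmatrix} \Delta\dot{x}_f \\ \Delta\dot{x}_s \end{bmatrix}
&= \begin{bmatrix} F_f & F_s^f \\ F_f^s & F_s \end{bmatrix}
\begin{bmatrix} \Delta x_f \\ \Delta x_s \end{bmatrix} +
\begin{bmatrix} G_f & G_s^f \\ G_f^s & G_s \end{bmatrix}
\begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix} \\
&= \begin{bmatrix} F_f & \text{small} \\ \text{small} & F_s \end{bmatrix}
\begin{bmatrix} \Delta x_f \\ \Delta x_s \end{bmatrix} +
\begin{bmatrix} G_f & \text{small} \\ \text{small} & G_s \end{bmatrix}
\begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix}
\end{aligned}
$$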
53
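Residualized Fast Mode
Fast mode dynamics
$$
\Delta\dot{x}_f = F_f \Delta x_f + G_f \Delta u_f + \underbrace{\left( F_s^f \Delta x_s + G_s^f \Delta u_s \right)}_{\text{Bias}}
$$
If fast mode is not stable, it could be stabilized by inner loop control
Fast Mode Inner Loop Control Law
$$
\begin{aligned}
\Delta\dot{x}_f &= F_f \Delta x_f + G_f \left( \Delta u_{f_c} - C_f \Delta x_f \right) + \underbrace{\left( F_s^f \Delta x_s + G_s^f \Delta u_s \right)}_{\text{Bias}} \\
&= \left( F_f - G_f C_f \right) \Delta x_f + G_f \Delta u_{f_c} + \underbrace{\left( F_s^f \Delta x_s + G_s^f \Delta u_s \right)}_{\text{Bias}}
\end{aligned}
$$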
54
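Fast Mode in Quasi-Steady State
Assume that fast mode reaches steady state on a time scale that is short compared to the slow mode
$$
\begin{aligned}
0 &\approx F_f \Delta x_f + F_s^f \Delta x_s + G_f \Delta u_f + G_s^f \Delta u_s \\
\Delta\dot{x}_s &= F_f^s \Delta x_f + F_s \Delta x_s + G_s \Delta u_s + G_f^s \Delta u_f
\end{aligned}
$$
Algebraic solution for fast variable
$$
\begin{aligned}
0 &\approx F_f \Delta x_f + F_s^f \Delta x_s + G_f \Delta u_f + G_s^f \Delta u_s \\
F_f \Delta x_f &= -F_s^f \Delta x_s - G_f \Delta u_f - G_s^f \Delta u_s \\
\Delta x_f &= -F_f^{-1} \left( F_s^f \Delta x_s + G_f \Delta u_f + G_s^f \Delta u_s \right)
\end{aligned}
$$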
55
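Residualized Slow Mode
Substitute quasi-steady fast variable in differential equation for slow variable
$$
\begin{aligned}
\Delta\dot{x}_s &= -F_f^s \left[ F_f^{-1} \left( F_s^f \Delta x_s + G_f \Delta u_f + G_s^f \Delta u_s \right) \right] + F_s \Delta x_s + G_s \Delta u_s + G_f^s \Delta u_f \\
&= \left[ F_s - F_f^s F_f^{-1} F_s^f \right] \Delta x_s + \left[ G_s - F_f^s F_f^{-1} G_s^f \right] \Delta u_s + \left[ G_f^s - F_f^s F_f^{-1} G_f \right] \Delta u_f
\end{aligned}
$$
Residualized equation for slow variable
$$
\Delta\dot{x}_s = F_{s_{NEW}} \Delta x_s + G_{s_{NEW}} \begin{bmatrix} \Delta u_f \\ \Delta u_s \end{bmatrix}
$$
Control law can be designed for reduced-order slow model, assuming inner loop has been stabilized separately
Slow Mode Outer Loop Control Law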
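The residualized slow-mode coefficients can be compared with plain truncation in a scalar example. The sketch below assumes scalar fast and slow states and a stable fast mode; the argument names (`f_sf` for F_s^f, `f_fs` for F_f^s, and likewise for G) and the numbers are hypothetical.

```python
import math


def residualize(f_f, f_sf, f_fs, f_s, g_f, g_sf, g_fs, g_s):
    """Scalar residualization of the slow mode.

    f_sf = F_s^f (slow state into fast equation)
    f_fs = F_f^s (fast state into slow equation)
    g_sf = G_s^f (slow control into fast equation)
    g_fs = G_f^s (fast control into slow equation)
    Returns (f_s_new, g_new_uf, g_new_us), the coefficients of
    dx_s/dt = f_s_new x_s + g_new_uf u_f + g_new_us u_s.
    """
    if f_f >= 0:
        raise ValueError("sketch assumes a stable fast mode (f_f < 0)")
    f_s_new = f_s - f_fs * f_sf / f_f
    g_new_uf = g_fs - f_fs * g_f / f_f
    g_new_us = g_s - f_fs * g_sf / f_f
    return f_s_new, g_new_uf, g_new_us


def slow_eigenvalue(f_f, f_sf, f_fs, f_s):
    """Larger (slower) real eigenvalue of [[f_f, f_sf], [f_fs, f_s]]."""
    mean = (f_f + f_s) / 2.0
    disc = ((f_f - f_s) / 2.0) ** 2 + f_sf * f_fs
    if disc < 0:
        raise ValueError("complex eigenvalues; this sketch handles real ones only")
    return mean + math.sqrt(disc)


# Hypothetical coupled system: fast pole near -10, slow pole near -1.
f_f, f_sf, f_fs, f_s = -10.0, 2.0, 1.0, -1.0
f_s_new, g_new_uf, g_new_us = residualize(f_f, f_sf, f_fs, f_s,
                                          g_f=1.0, g_sf=0.0, g_fs=0.0, g_s=1.0)
exact = slow_eigenvalue(f_f, f_sf, f_fs, f_s)
print(f_s_new, f_s, exact)
```

Here the residualized pole (about -0.8) is much closer to the exact slow eigenvalue (about -0.783) than the truncated value F_s = -1, because the quasi-steady fast response is retained.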
56