[American Institute of Aeronautics and Astronautics AIAA Guidance, Navigation and Control Conference...

Real-time Decentralized Pursuer-Evader Assignment for Cooperating UCAVs Using the DTC Algorithm∗

Dany Dionne† and C. A. Rabbath‡

A real-time application of the DTC algorithm for decentralized task allocation is reported. The application involves a team of almost-lighter-than-air vehicles capable of local sensing and of communications. The objective of this team is to intercept a group of moving targets. The task allocation problem is to distribute the targets across the members of the team in a decentralized manner, so as to preserve cooperation while minimizing both communications and traveled distances. Results demonstrate the applicability of the DTC algorithm to hard real-time problems involving small teams of UAVs.

I. Introduction

The deployment of a team of uninhabited aerial vehicles (UAVs) is a demanding task for operators in a base station since: (i) simultaneous coordination/cooperation with the other team members must be ensured, (ii) the communications link between the UAVs and the base station must be maintained, and (iii) the environment is uncertain and time-varying. To ease the deployment of a team of networked UAVs, autonomous vehicles with the capability to cooperatively self-assign tasks are of interest.1 Examples of tasks to be allocated are waypoints to be reached by the UAVs, or targets to be intercepted. Such a set of tasks is in general time-varying due to the uncertainties in the environment, e.g., the unknown future trajectory of the targets, or the online detection of an obstacle that requires modifying the flight plan to the waypoints.

∗This work was financially supported by the Natural Sciences and Engineering Research Council of Canada.

†Research and Development, Lockheed Martin Canada.

‡Defence Research and Development Canada - Valcartier, and Mechanical Eng. Dept., McGill University, Adjunct Professor.

AIAA Guidance, Navigation and Control Conference and Exhibit, 20 - 23 August 2007, Hilton Head, South Carolina

AIAA 2007-6454

Copyright © 2007 by Dany Dionne and Camille A. Rabbath. Published by the American Institute of Aeronautics and Astronautics, Inc., with permission.

The general problem of dynamically coordinating a group of multiple robots satisfying multiple goals is as yet unsolved.2 Nonetheless, several task allocation strategies for simplified problems have been proposed. These strategies are either centralized, i.e., a single entity allocates the tasks for the whole team, or decentralized, i.e., each UAV allocates its own tasks. Decentralized task allocation strategies involve UAVs with different input information (each UAV has different local measurements, independent noises, and so on). Each UAV then feeds its decentralized task allocation rule with a different information input,3,4 and the output of this rule (the allocated task) may fail to maintain cooperation across the team.

Decentralized task allocation algorithms for robots and UAVs were proposed in Refs. 5–7. In Ref. 5, a decentralized task allocation algorithm was proposed that fosters cooperation through asynchronous intermittent communications; each UAV communicates when the error between its local and the shared information exceeds a given threshold. In Ref. 6, efficient information exchange was investigated by transmitting only the data with the largest impact on the performance of the closed-loop system. In Ref. 7, a decentralized task allocation algorithm was proposed where the decision to communicate is triggered by deviations in the outputs of the decentralized task allocation rule. The resulting asynchronous and intermittent communications were shown to be efficient in the sense that the number of communication events was limited, while both the cooperation across the team and the minimization of the global cost function were preserved.

This paper presents a real-time application of the decentralized task consensus (DTC) algorithm introduced in Ref. 7. The studied scenario is a pursuit-evasion engagement, where the pursuers and the evaders are the uninhabited almost-lighter-than-air vehicles (ALTAVs) described in Ref. 8. The vehicle model comprises 6-DOF nonlinear equations of motion with noise in sensor measurements and actuator limits. The pursuers are equipped with local sensors and a communications system. The tasks to be allocated across the team are the evaders to be reached. Real-time simulations are carried out on a multiprocessor testbed. The testbed comprises a cluster of four PCs running the RedHawk Linux operating system, with fast communications and hardware synchronization. This enables fast, real-time simulations of teams of pursuer vehicles. The simulations show the effectiveness of the DTC algorithm as well as the computing times needed for the real-time execution of the guidance and control laws in various engagement scenarios.

II. Scenario

A three-dimensional pursuit-evasion scenario with N pursuers and M evaders is adopted. The objective of the pursuers is to intercept the evaders while minimizing the communications and the distance to be traveled. The adopted solution is the DTC algorithm,7 which repeatedly updates the target allocation as new measurements and information arrive.

Figure 1. Control loop of a pursuer. (Blocks: local measurements, received communication, state estimates, task allocation, guidance, ALTAV dynamics and autopilot, decision to communicate, transmit communication.)

Each pursuer obtains information through communications and by gathering measure-

ments. The measurements gathered by a pursuer are about its own position and the position

of the targets; these measurements are gathered at constant rates. The communications

received by a pursuer describe the state of the other pursuers in the team; these communi-

cations are intermittent and subject to a communication delay. Whenever a pursuer decides

to transmit data, this information is broadcast to all the other pursuers.

The control loop of a pursuer is illustrated in Fig. 1. The components of the control loop

are described below.

A. ALTAV dynamics and Autopilot

The nonlinear ALTAV dynamics were derived from experiments by Quanser Inc.8 Each ALTAV is equipped with four motors. The nonlinear dynamics of each ALTAV are given by

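\begin{align}
M\ddot{x} &= -C_x\dot{x} + \sum_{i=1}^{4} F_i \sin(\gamma) \tag{1a}\\
M\ddot{y} &= -C_y\dot{y} + \sum_{i=1}^{4} F_i \sin(\phi) \tag{1b}\\
M\ddot{z} &= F_g - F_B - C_z\dot{z} - \sum_{i=1}^{4} F_i \cos(\gamma)\cos(\phi) \tag{1c}\\
J_\theta\ddot{\theta} &= \left(F_1 l_1 - F_2 l_2 + F_3 l_3 - F_4 l_4\right)\sin(\rho) - C_\theta\dot{\theta} \tag{1d}\\
J_\gamma\ddot{\gamma} &= F_1 l_1 - F_3 l_3 - F_B l_B \sin(\gamma) - C_\gamma\dot{\gamma} \tag{1e}\\
J_\phi\ddot{\phi} &= -F_2 l_2 + F_4 l_4 - F_B l_B \sin(\phi) - C_\phi\dot{\phi} \tag{1f}
\end{align}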

where M = 1.618 [kg] is the mass, Jθ = 0.995, Jγ = 1.005, and Jφ = 1.005 [kg m2/rad]

are the moments of inertia about the x, y, and z axes, respectively, Fg = 9.8M [N] is the

force due to gravity, FB = 13 [N] is the buoyant force, l1 = l2 = l3 = l4 = 0.941 [m] are

the perpendicular distances between each of the motors and the vehicle center of gravity,

Cx = Cy = Cz = 0.95 [kg/s] and Cθ = Cγ = Cφ = 0.5 [kg m2/rad] are the drag coefficients

in the directions x, y, z, θ, γ, and φ, respectively, ρ = 6π/180 [rad] is the angular offset from

vertical of the motors’ thrust vector, and Fi, i ∈ {1, 2, 3, 4}, are the force magnitudes of the

motors. The four control variables are the values of Fi, i ∈ {1, 2, 3, 4}.
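As a quick numerical check on the vertical dynamics, the sketch below evaluates the hover condition implied by Eq. (1c) using the parameter values quoted above. It assumes that Eq. (1c) has the vertical acceleration on its left-hand side and that the drag term acts on the vertical velocity; the function names are illustrative and do not come from Ref. 8.

```python
import math

# Parameters quoted in the text for Eq. (1).
M = 1.618        # vehicle mass [kg]
F_G = 9.8 * M    # force due to gravity [N]
F_B = 13.0       # buoyant force [N]
C_Z = 0.95       # drag coefficient along z [kg/s]


def vertical_acceleration(thrusts, gamma, phi, z_dot):
    """Right-hand side of Eq. (1c) divided by M (assumed to be the z acceleration)."""
    total_thrust = sum(thrusts)
    force = F_G - F_B - C_Z * z_dot - total_thrust * math.cos(gamma) * math.cos(phi)
    return force / M


def hover_thrust_per_motor():
    """Equal per-motor thrust that cancels Eq. (1c) at level attitude and zero velocity."""
    return (F_G - F_B) / 4.0


if __name__ == "__main__":
    f = hover_thrust_per_motor()
    print(f"hover thrust per motor: {f:.4f} N")
    residual = vertical_acceleration([f] * 4, 0.0, 0.0, 0.0)
    print(f"residual z acceleration: {residual:.2e} m/s^2")
```

With these values the buoyant force carries most of the weight, so the four motors only need to supply the small remaining difference between the gravity and buoyancy forces.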

The objective of the autopilot is to steer the ALTAV toward a desired position while maintaining θ(t) = 0. The adopted controllers are PIDs and are described in detail in Ref. 9.

B. Guidance

Let $V = \{1, \cdots, N\}$ be the set of pursuers, and $T = \{1, \cdots, M\}$ be the set of targets. The guidance law delivers the desired position of pursuer $i \in V$ along the three axes, $[x^d_i \;\; y^d_i \;\; z^d_i]^T$, so as to achieve its guidance objective. This guidance objective is an interception of the target located at $[x^e_j \;\; y^e_j \;\; z^e_j]^T$, where $j = l^\star_{-i} \in T$. The value $l^\star_{-i}$ is the index of the target allocated to the pursuer $i \in V$. A pure pursuit guidance law10 is adopted, i.e., the pursuer flies toward the current position of its allocated target:

\begin{equation}
\begin{bmatrix} x^d_i(t_k) & y^d_i(t_k) & z^d_i(t_k) \end{bmatrix}^T
= \begin{bmatrix} x^e_j(t_k) & y^e_j(t_k) & z^e_j(t_k) \end{bmatrix}^T \tag{2}
\end{equation}

C. Measurements and State estimator

Each UAV gathers measurements on its x, y, z positions, and its θ, γ, and φ angles. All measurements are subject to an additive zero-mean Gaussian noise. The position measurements along the x and y axes are obtained from a GPS at intervals $\Delta_{GPS} = 1$ [s] with a noise covariance $\sigma^2_{GPS} = 1$ [m$^2$]. The position measurement along the z axis is obtained from a sonic range finder (SRF) at intervals $\Delta_{SRF} = 0.02$ [s] with a noise covariance $\sigma^2_{SRF} = 0.02$ [m$^2$]. The θ measurements are obtained at intervals $\Delta_\theta = 0.02$ [s] with a noise covariance $\sigma^2_\theta = 2$ [degree$^2$]. The φ and γ measurements are obtained at intervals $\Delta_{tilt} = 0.01$ [s] with a noise covariance $\sigma^2_{tilt} = 1$ [degree$^2$].

The state estimator has two main purposes: to filter the noise in the position measure-

ments, and to time align the information. Time alignment of the information is achieved by

prediction of the position of the other pursuers from the instant of their last received report

to the current time instant.

The estimator selected by each pursuer $i \in V$ is a bank of $(N+1)$ Kalman estimators, i.e., N Kalman predictors that process the information shared by the N pursuers, and one Kalman filter that processes all the shared and unshared information about the local pursuer $i \in V$. Each Kalman estimator in the bank delivers an estimated state vector, $x \in \mathbb{R}^9$, describing the pursuer associated with it. This state vector is given by

\begin{equation}
x = \begin{bmatrix} q_x & \dot{q}_x & \ddot{q}_x & q_y & \dot{q}_y & \ddot{q}_y & q_z & \dot{q}_z & \ddot{q}_z \end{bmatrix}^T \tag{3}
\end{equation}

where $q_x$, $q_y$, and $q_z$ are the estimated positions along the x, y, and z axes, respectively. The estimation model is in the form of

\begin{align}
\dot{x}(t) &= A(t)x(t) + B(t)u(t) + w(t) \tag{4a}\\
y(t_k) &= H(t_k)x(t_k) + \nu(t_k) \tag{4b}
\end{align}

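with time-invariant block-diagonal matrices given by

\begin{align}
A &= \begin{bmatrix} A_q & 0_{3\times3} & 0_{3\times3} \\ 0_{3\times3} & A_q & 0_{3\times3} \\ 0_{3\times3} & 0_{3\times3} & A_q \end{bmatrix}, \quad
B = \begin{bmatrix} 0_{9\times1} \end{bmatrix} \tag{5a}\\
H &= \begin{bmatrix} H_q & 0_{1\times3} & 0_{1\times3} \\ 0_{1\times3} & H_q & 0_{1\times3} \\ 0_{1\times3} & 0_{1\times3} & H_q \end{bmatrix} \tag{5b}
\end{align}

where

\begin{equation}
A_q = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & -\alpha \\ 0 & 0 & 0 \end{bmatrix}, \quad
H_q = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \tag{6}
\end{equation}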

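and with noises $w(t) \sim \mathcal{N}(0, Q^w)$ and $\nu \sim \mathcal{N}(0, Q^\nu)$ given by

\begin{align}
Q^w &= \begin{bmatrix} Q^w_q & 0 & 0 \\ 0 & Q^w_q & 0 \\ 0 & 0 & Q^w_q \end{bmatrix}, \quad
Q^w_q = \begin{bmatrix} q^w_1 & 0 & 0 \\ 0 & q^w_2 & 0 \\ 0 & 0 & q^w_3 \end{bmatrix} \tag{7a}\\
Q^\nu &= \begin{bmatrix} \sigma^2_{GPS} & 0 & 0 \\ 0 & \sigma^2_{GPS} & 0 \\ 0 & 0 & \sigma^2_{SRF} \end{bmatrix} \tag{7b}
\end{align}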

In Eq. (5), the dynamics along the x, y, and z axes are uncoupled; this approximation reduces the computational requirements. The uncoupled dynamics along each axis are in the form of Eq. (6), where the last two rows of $A_q$, together with the last row of $Q^w_q$, form a Singer shaping filter.11 The Singer shaping filter provides the estimator with the ability to cope with unknown correlated exogenous inputs (i.e., it compensates for the unrealistic null B matrix and the unknown u in Eq. (5)). The main exogenous input is the control command in § II.A; the value of that input cannot be employed by the estimator because the control variables in Eq. (1) are significantly different from those in Eq. (5).

By trial and error, the process noise is set to a power spectral density, $Q^w_q$, given by $q^w_1 = 0.1$ [m/s], $q^w_2 = 0.1$ [m/s$^2$], $q^w_3 = 0.1$ [m/s$^3$], and Singer's correlation coefficient is selected to have the value $\alpha = 0.1$.
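The time-alignment step described in this section can be illustrated with a short sketch. It propagates the mean of a reported state from the report time to the current time, taking A_q exactly as printed in Eq. (6) and setting u = 0; the process noise and the covariance propagation of the full Kalman predictor are deliberately left out. Because A_q as printed satisfies A_q^3 = 0, the transition matrix exp(A_q dt) reduces to I + A_q dt + A_q^2 dt^2 / 2. The function names are hypothetical and are not taken from Ref. 7.

```python
ALPHA = 0.1  # Singer correlation coefficient quoted in the text


def predict_axis(state, dt, alpha=ALPHA):
    """Propagate the 3-element state of one axis over dt with exp(A_q * dt).

    With A_q as printed in Eq. (6), A_q cubed is zero, so the matrix
    exponential is exactly I + A_q*dt + A_q^2*dt^2/2.
    """
    s1, s2, s3 = state
    return (
        s1 + dt * s2 - 0.5 * alpha * dt * dt * s3,
        s2 - alpha * dt * s3,
        s3,
    )


def time_align(state9, report_time, now, alpha=ALPHA):
    """Predict a reported 9-element state from report_time to now, axis by axis."""
    dt = now - report_time
    if dt < 0:
        raise ValueError("report is time-stamped in the future")
    aligned = []
    for k in range(0, 9, 3):  # the x, y, and z blocks are uncoupled, as in Eq. (5)
        aligned.extend(predict_axis(state9[k:k + 3], dt, alpha))
    return aligned


if __name__ == "__main__":
    # A report received 2 s ago: moving along x, at rest along y, descending along z.
    report = [0.0, 1.0, 0.0, 5.0, 0.0, 0.0, 2.0, -1.0, 0.0]
    print(time_align(report, report_time=3.0, now=5.0))
```

Handling each axis separately mirrors the block-diagonal structure of Eq. (5) and keeps the per-report cost to a handful of multiplications.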

D. Task allocation and Communication

The DTC algorithm is employed for the task allocation and for the decision to communicate,

see Ref. 7. This algorithm solves the task allocation problem twice: (i) by employing only the

shared information, and (ii) by employing all the information available to the local pursuer.

Whenever there is a discrepancy between (i) and (ii), the pursuer adopts the allocation

(i) to preserve cooperation across the team. The decision to communicate is based on the

discrepancies between the allocations from (i) and (ii).

The objective of the task allocation problem is to minimize a global cost, $J$. This global cost is calculated as follows. Let $c_{ij}$ be the cost for pursuer $i \in V$ to intercept target $j \in T$. The global cost is the accumulation of the costs $c_{ij}$:

\begin{equation}
J(t_k, L) = \sum_{i=1}^{N} c_{ij}(t_k), \quad ij \in L \tag{8}
\end{equation}

where $L$ is one of the admissible task allocations for the team. Without loss of generality, the cost $c_{ij}$ is selected to be the square of the separation between the vehicles.

The minimization of the global cost in Eq. (8) is a combinatorial optimization problem. An exact solution is obtained by calculating the global cost for all the admissible combinations and retaining the allocation with the smallest cost.
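The exhaustive search over admissible allocations can be sketched as follows. The sketch assumes a one-to-one assignment with as many targets as pursuers, which matches the scenarios of Section III but is an assumption about what counts as admissible; the function names are illustrative.

```python
from itertools import permutations


def squared_distance(p, q):
    """Cost c_ij: squared separation between a pursuer and a target position."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def allocate(pursuers, targets):
    """Exhaustive minimization of the global cost in Eq. (8).

    Returns (allocation, cost), where allocation[i] is the index of the
    target assigned to pursuer i. Assumes one target per pursuer.
    """
    if len(pursuers) != len(targets):
        raise ValueError("this sketch assumes as many targets as pursuers")
    best_allocation, best_cost = None, float("inf")
    for candidate in permutations(range(len(targets))):
        cost = sum(
            squared_distance(pursuers[i], targets[j])
            for i, j in enumerate(candidate)
        )
        if cost < best_cost:
            best_allocation, best_cost = candidate, cost
    return best_allocation, best_cost


if __name__ == "__main__":
    pursuers = [(0.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
    targets = [(100.0, 10.0, 0.0), (100.0, 0.0, 0.0)]
    print(allocate(pursuers, targets))
```

In this two-vehicle example the search pairs each pursuer with the target at its own y coordinate, since crossing the assignments adds lateral distance to both legs.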

A pursuer communicates when its unshared information is sufficient to modify the solution of the task allocation problem. The information communicated by a pursuer is its current state vector. The decision to communicate is obtained as follows. Let $l^\star_{-i}$ be the target allocated to pursuer $i \in V$ based on the shared information, and let $l^\star_i$ be the allocated target based on all the local information. The decision function, $g_i$, is given by

\begin{equation}
g_i(t_k) = l^\star_i - l^\star_{-i} \tag{9}
\end{equation}

and the decision to communicate is

\begin{equation}
g_i(t_k) = \begin{cases} 0 & \Longrightarrow \text{no communication} \\ \text{otherwise} & \Longrightarrow \text{communicate} \end{cases} \tag{10}
\end{equation}
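A minimal sketch of the decision rule in Eqs. (9) and (10) follows. It assumes that the local view differs from the shared view only in the pursuer's own position, that both views use the same target positions, and that allocations are one-to-one; these are simplifications for illustration, and the names `allocate` and `dtc_step` are hypothetical rather than taken from Ref. 7.

```python
from itertools import permutations


def allocate(pursuers, targets):
    """Exhaustive one-to-one allocation minimizing the summed squared distances."""
    def cost(candidate):
        return sum(
            sum((a - b) ** 2 for a, b in zip(pursuers[i], targets[j]))
            for i, j in enumerate(candidate)
        )
    return min(permutations(range(len(targets))), key=cost)


def dtc_step(i, shared_positions, local_position, targets):
    """Return (adopted_target, communicate) for pursuer i.

    shared_positions: pursuer positions as known to the whole team.
    local_position:   pursuer i's own latest position, not yet shared.
    """
    l_shared = allocate(shared_positions, targets)[i]  # allocation (i): shared information only
    local_view = list(shared_positions)
    local_view[i] = local_position
    l_local = allocate(local_view, targets)[i]         # allocation (ii): all local information
    g = l_local - l_shared                             # decision function, Eq. (9)
    return l_shared, g != 0                            # adopt (i); communicate per Eq. (10)


if __name__ == "__main__":
    shared = [(0.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
    targets = [(100.0, 0.0, 0.0), (100.0, 10.0, 0.0)]
    print(dtc_step(0, shared, (0.0, 1.0, 0.0), targets))   # small drift
    print(dtc_step(0, shared, (0.0, 30.0, 0.0), targets))  # large drift
```

The pursuer always flies the allocation computed from shared information, so the team stays consistent; a broadcast is triggered only when the unshared information would change that allocation.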

III. DTC Simulations: Testbed and Results

This section describes the testbed used to simulate, in real-time (RT), the DTC algorithm

and presents the simulation results obtained. The simulations rely upon nonlinear 6-DOF

models of uninhabited combat air vehicles (UCAVs), and hardware synchronization of the

various computing tasks. The multirate tasks include communications, decision, control,

filtering and dynamics. The scenarios are as follows. The mission of a small team of UCAVs is to strike an equal number of evader aerial vehicles. The pursuer-evader allocation problem revolves around a decentralized coordination of the team of UCAVs using locally available information and evolving scenes while constraining the number of communication events. Once a UCAV is within a prescribed distance of its assigned target, the ground operator can order the firing of UCAV munitions on the target. Thus, the goal of the DTC algorithm is to guide the UCAVs to within a neighborhood of the detected targets. The objective of the simulations is to show that this can be done in RT for a set of realistic scenarios.

A. Testbed

The multiprocessor testbed is shown in Figure 2. The use of multiple RT processing targets allows a small step size despite large computing tasks, enables reconfigurable hardware-in-the-loop simulation, and immerses the user in RT operations. In the figure, the host PC, which runs the Windows operating system (OS), serves for offline DTC design and analysis, and for sending user commands to the RT targets through a TCP/IP link. The RT processing environment comprises four central processing units, or targets, running the RedHawk Linux OS,12 sharing information through shared memory and FireWire (IEEE 1394 OHCI). Designer-specified data are communicated online to the Windows-based viewer PC, which is equipped with rendering software to display the engagement. In the simulations considered in this project, the X-Plane13 renderer was used. The computing tasks running on the four RT targets are obtained through a rapid control prototyping process enabled by the use of Matlab/Simulink,14 Real-Time Workshop15 and RT-Lab.16 In short, starting from a Simulink model, the user separates the various components of the model into subsystems. Then, the designer decides on which target each subsystem will run. After making sure the subsystems are RT-compliant, the user initiates the process of automatic code generation, compilation on the RT targets, and uploading to the assigned RT targets. After this sequence is done, the user can start the simulations and acquire data, as needed.

Figure 2. Multiprocessor testbed. (Blocks: host PC, viewer PC, router, and RT targets 1 to 4 forming the real-time processing environment. Links: TCP/IP, SM, FW. SM: shared memory; FW: FireWire.)

To reduce computing times via distributed processing, the user must make sure the

pursuer-engagement model is relatively well balanced among the nodes and that the depen-

dency among subsystems, within a single time step, is reduced.

B. Simulation Results

The results obtained with a 5-pursuer, 5-evader open-air engagement scenario are shown

in Figures 3 to 5. In these simulations, artificial delays are introduced in the exchange of

information among the RT targets to mimic the effect of communications delays found with

actual aerial systems.

Figure 3. Vehicle trajectories projected in the horizontal plane during the engagement. Pursuers (solid lines). Evaders (dashed lines).

The trajectories obtained in the horizontal plane are shown in Figure 3. The engagement is face-to-face: the pursuers, denoted P1, · · · , P5, start on the left-hand side, and the evaders start on the right-hand side. Initially, the pursuers P1 and P2 are in the same neighborhood, while the other pursuers are more separated from each other.

Snapshots of the engagement taken at four different time instants are presented in Fig-

ure 4. The pursuers (P) are shown to approach the evaders (E), as taken from a ground

observation point at t0 + 10, t0 + 20 and t0 + 30 seconds. Forty seconds later, each P lies in

proximity of its assigned E.

The communication events between the pursuers during the first 60 [s] of the engagement

are shown in Figure 5. The communications are intermittent and asynchronous. The decision

to communicate (see § II.D.) is triggered by the uncertainties in the available information and

by the geometry of the engagement. Consider the pursuers P1 and P2, which are initially in proximity of each other (see Fig. 3). This proximity makes their decentralized task allocation more difficult; these pursuers therefore communicate at about t = t0 + 10 [s] to ensure efficient cooperation.

The average computing times obtained by the RT implementations of the proposed DTC algorithm are displayed in Table 1 for a selected simulation time frame. The target computer is a dual-CPU Pentium 4 with a 2 GHz clock speed. The shortest execution period for the multi-rate model is 0.01 second. Shared memory communications are used in the exchange of data between the two CPUs. The results show that the computing times are significantly smaller than the idle time, thus demonstrating that a real-time implementation of the DTC algorithm is feasible despite the nonlinear UAV dynamics and the sophistication of the involved mathematics.

Figure 4. Engagement snapshots. The pursuers (P) arrive from the left-hand side, while the evaders (E) arrive from the right-hand side. (Panel labels: t0 + 30 sec, t0 + 70 sec, view from ground, chase view.)

Figure 5. Communications events during the first 60 [s] of the engagement. From upper to lower panels, the communication history of each pursuer is displayed.
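To put the reported numbers in proportion, the sketch below compares the computing times of Table 1 with the 0.01-second execution period. It assumes that each reported time is the computing load of one 0.01-second step, which the text does not state explicitly; the dictionary keys simply reuse the column labels of Table 1.

```python
STEP_US = 10_000.0  # shortest execution period, 0.01 s, in microseconds

# Computing times from Table 1, in microseconds.
COMPUTING_TIMES_US = {
    "5vs5/1CPU": 160.0,
    "5vs5/2CPUs": 100.0,
    "2vs2/1CPU": 70.0,
}


def utilization(computing_time_us, step_us=STEP_US):
    """Fraction of one execution period spent computing (assumed per-step load)."""
    return computing_time_us / step_us


def speedup(single_cpu_us, dual_cpu_us):
    """Ratio of single-CPU to dual-CPU computing time for the same scenario."""
    return single_cpu_us / dual_cpu_us


if __name__ == "__main__":
    for label, t_us in COMPUTING_TIMES_US.items():
        print(f"{label}: {100.0 * utilization(t_us):.1f}% of the step")
    ratio = speedup(COMPUTING_TIMES_US["5vs5/1CPU"], COMPUTING_TIMES_US["5vs5/2CPUs"])
    print(f"5vs5 speedup from the second CPU: {ratio:.1f}x")
```

Under that assumption, even the heaviest configuration uses under two percent of the step, which is the margin behind the feasibility claim.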

IV. Conclusion

The DTC algorithm was demonstrated in a detailed real-time simulation of UAVs. Real-time applicability was demonstrated for small teams of up to five UAVs. Based on the results provided in this paper, it is envisaged that application of the DTC algorithm to swarms of hundreds of vehicles could be readily simulated with the testbed shown in Figure 2. Applicability of the DTC algorithm to such large teams will be enabled by combining the novelties introduced in the DTC algorithm with numerically efficient techniques to solve the combinatorial problem encountered in solving the task allocation problem.

Table 1. Computing times of the DTC algorithm in a time frame.

Scenario                        5vs5/1CPU   5vs5/2CPUs   2vs2/1CPU
Computing times (microseconds)  160         100          70
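The need for numerically efficient techniques can be seen by counting the allocations that the exhaustive search of § II.D must evaluate. The sketch below assumes a one-to-one assignment of N pursuers to N targets, in which case the count is N!; this admissibility rule is an assumption made for illustration.

```python
import math


def admissible_allocations(n):
    """Number of one-to-one allocations of n pursuers to n targets (n factorial)."""
    if n < 0:
        raise ValueError("team size must be non-negative")
    return math.factorial(n)


if __name__ == "__main__":
    for n in (2, 5, 10, 20):
        print(f"N = {n:2d}: {admissible_allocations(n)} allocations")
```

The count is small for the teams of two and five vehicles simulated here, but it grows factorially with the team size, which is why exhaustive enumeration does not carry over to large teams.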

References

1Chandler, P. R. and S., R., “UAV Cooperative Path Planning,” Proceedings of the AIAA Guidance,

Navigation, and Control Conference, August 2000, Paper AIAA-2000-4370.

2Mataric, M. J., Sukhatme, G. S., and Østergaard, E. H., “Multi-Robot Task Allocation in Uncertain

Environments,” Autonomous Robots, Vol. 14, 2003, pp. 255–263.

3Ren, W., Beard, R. W., and Kingston, D. B., “Multi-Agent Kalman Consensus with Relative Uncer-

tainty,” Proceedings of the American Control Conference, Portland, Oregon, June 2005, pp. 1865–1870.

4Mitchell, J. W. and Sparks, A. G., “Communication Issues in the Cooperative Control of Unmanned

Aerial Vehicles,” Proceedings of the 41st Annual Allerton Conference on Communication, Control, and

Computing , 2003.

5Shima, T., Rasmussen, S. J., and Chandler, P., “UAV Team Decision and Control Using Efficient

Collaborative Estimation,” Proceedings of the American Control Conference, Portland, Oregon, June 2005,

pp. 4107–4112.

6Alighanbari, M. and How, J. P., “Decentralized Task Assignment for Unmanned Aerial Vehicles,”

Proceedings of the IEEE Conference on Decision and Control , Seville, Spain, December 2005, pp. 5668–

5673.

7Dionne, D. and Rabbath, C. A., “Multi-UAV Decentralized Task Allocation With Intermittent Communications: the DTC Algorithm,” Proceedings of the IEEE American Control Conference, New York, July 2007, to appear.

8Earon, E., “Almost-Lighter-Than-Air Vehicle Fleet Simulation,” Tech. Rep. V. 0.9, Quanser inc.,

Toronto, Canada, 2005.

9Lechevin, N., Rabbath, C. A., and Earon, E., “Towards Decentralized Fault Detection in UAV For-

mations,” Proceedings of the American Control Conference, New York, July 2007, paper FrC07.2.

10Shneydor, N. A., Missile Guidance and Pursuit - Kinematics, Dynamics, and Control , Engineering

Science, Horwood Publishing, England, 1998.

11Singer, R. A., “Estimating Optimal Tracking Filter Performance for Manned Maneuvering Targets,”

IEEE Transactions on Aerospace and Electronic Systems, Vol. 5, 1970, pp. 473–483.

12http://www.ccur.com/isd solutions redhawklinux.asp.

13http://www.x-plane.com.

14http://www.mathworks.com/.

15http://www.mathworks.com/access/helpdesk/help/toolbox/rtw/.

16www.opal-rt.com.
