1 Graduate Student, Department of Aerospace Engineering, Student Member AIAA
2 Associate Professor, Department of Aerospace Engineering, Associate Fellow, AIAA
SMART Heuristic for Pickup and Delivery Problem (PDP)
with Cooperative UAVs
Chelsea Sabo1
University of Cincinnati, Cincinnati, Ohio, 45221
Kelly Cohen2
University of Cincinnati, Cincinnati, Ohio, 45221
Abstract
Pickup and Delivery Problems (PDPs) are a subset of Vehicle Routing Problems (VRPs)
which require a vehicle to service targets by picking them up at an origin and delivering them to
their unique destination. With respect to surveillance functions, this becomes a realistic problem
as UAVs are restricted by operating range, data rate, Anti-Jam margins, and cost. Therefore,
UAVs must be allocated to “pickup” targets and then “deliver” them, from within a prescribed
communication space, back to a command and control HQ. To maximize the speed/amount of
information transmitted from this communication region, the objective of allocating the UAVs is
such that the total service time (pickup and delivery) of all the targets is minimized. Previous
work on PDPs has shown that as the problem gets more complicated (i.e. more targets and more
vehicles) the solution space increases exponentially, and the execution time needed to find an optimal
solution becomes impractical. Additionally, related previous work applying a heuristic
solution to this problem has shown that good results can be maintained (within ~15-20% of the
optimal solution for small cases). The focus of this research is to develop an alternate
heuristic algorithm, deemed SMART from here on, that can perform near optimally (within ~5%)
and scales as the problem gets more complicated. Also, the algorithm is developed such that it is
easily extendable to a dynamic scenario as this research progresses. This algorithm is described
in detail and is shown to reach this performance metric while requiring significantly
less computational time than is needed to obtain the optimal solution.
I. Introduction
Surveillance functions are of paramount importance to U.S. defense systems, and these systems are
comprised of various means for acquiring and processing information needed by military commanders/national
security decision makers. The future of these systems focuses on including human intelligence, measurement and
signature intelligence, signals intelligence, imagery intelligence, and open source intelligence through algorithms,
software, and automation. Additionally in the not too distant future, cooperative UAV teams are anticipated to
provide this much needed support more effectively [1]. A very important aspect in the design of UAV cooperative
control systems is the ability to collect and transmit the data collected to a decision making authority such as
command and control headquarters. Ideally speaking, a group of collaborating UAVs should be able to
communicate “whenever and as much as they need to [1].” While this should be the standard, it is far from a reality.
Typically the focus is on minimizing mission costs while communication restrictions are often ignored. However,
there is a distinct need for collecting high-resolution snapshots of targets anywhere in the environment (no
communication constraints) in addition to the need to reliably get this information back in a timely way. In reality,
operating range, data rate, Anti-Jam margins and cost are limiting factors that need to be considered in order to
operate effectively [2]. While ‘command and control’ signals can be delivered with low bandwidth data, there is a
distinct need for a high bandwidth delivery mechanism. Moreover, communication constraints and limits define all
UAV activity and there is a growing interest in the research community to expand current capabilities.
Infotech@Aerospace 2011
29 - 31 March 2011, St. Louis, Missouri
AIAA 2011-1464
Copyright © 2011 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.
Downloaded by UNIVERSITY OF CINCINNATI on November 24, 2014 | http://arc.aiaa.org | DOI: 10.2514/6.2011-1464
American Institute of Aeronautics and Astronautics
II. Literature Survey
The resource allocation problem associated with sending cooperative UAVs to collect information about
targets falls within the general framework of Vehicle Routing Problems (VRPs). Here, the context of the VRP is to
deliver goods to customers that have placed orders (or in this case, for UAVs to service target requests) from a
single, central depot. Several variations of these problems have been addressed and studied extensively: vehicle
routing with pickup and delivery (VRPPD), vehicle routing with Last In, First Out (LIFO), vehicle routing with time
windows (VRPTW), and the capacitated vehicle routing problem (CVRP). The VRPPD is a subset of VRPs where
loads need to be transferred from a pickup location (their origin) to their delivery location (their destination).
Vehicle routing with LIFO is a type of VRPPD where the first item to be delivered must be the last item that was
picked up. The VRPTW represents problems where the locations to be visited have time windows associated with them.
Capacitated vehicle routing problems are problems where the vehicles have a limited load capacity. Furthermore,
pickup and delivery problems (PDPs) have been studied with each of these various additional constraints (see Fig. 1
below).
Figure 1: Framework for Pickup and Delivery Problems
In the problem formulation with UAVs, the VRP is set up as a PDP in that there are targets that need to be
visited (to collect information from), and there is a single communication range where the UAVs are able to transmit
data through a high bandwidth connection back to a command and control HQ (Fig. 2). Because this is a range (not
a single point or depot), each target typically has a unique delivery site based on its location (e.g. the closest point in
the range from the target to the range). However, each target can also be delivered at another target’s delivery
location (each target has a “best” delivery location, but is satisfied by being delivered at any point within the range).
Therefore, the UAVs must be allocated to “pickup” targets and then “deliver” them to their corresponding
destination within the communication range. This assumption of limited communication constraints is not only
feasible, but realistic in almost every operating environment.
Figure 2: PDP environment where UAVs need to "pickup" targets and "deliver" them to a communication range.
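As an illustration of the "closest point in the range" rule for choosing a delivery site, the following is a minimal sketch. It assumes, hypothetically, that the communication range is a disk of fixed radius centered on the tower; the function name `delivery_point` and its parameters are illustrative and do not come from the original work.

```python
import math


def delivery_point(target, tower, radius):
    """Return the point of the communication disk closest to `target`.

    Assumption: the range is a disk of `radius` centered at `tower`.
    Points are (x, y) tuples in the same planar coordinates.
    """
    dx = target[0] - tower[0]
    dy = target[1] - tower[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        # The target already lies inside the range: deliver on the spot.
        return (float(target[0]), float(target[1]))
    # Otherwise project the target onto the boundary of the disk.
    scale = radius / dist
    return (tower[0] + dx * scale, tower[1] + dy * scale)


if __name__ == "__main__":
    print(delivery_point((10.0, 0.0), (0.0, 0.0), 4.0))
```

Under this assumption each target has one "best" delivery location, while any other point of the disk remains a valid, if slower, alternative.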
While allocating UAVs is an important growing concern as UAVs are more frequently used, this
scheduling issue where pickup and delivery is necessary arises in many fields and has numerous applications. This
is often an issue when scheduling school buses, garbage trucks, delivery vans and trucks; distributing mail and
packages; dispatching trucks; etc. In addition to this, applications include ambulatory services, logistics, and
robotics. An example of a similar, commonly studied problem is the Travelling Repairperson Problem (TRP). Here
the agent must service locations to “repair” the targets. These types of problems are studied in conjunction with
PDPs by assuming that the delivery location is the same as the pickup location [3]-[4].
Because pickup and delivery problems with time windows (PDPTW) and vehicle capacity have many
practical applications, these have been studied extensively [5]-[7]. An example of a problem with time
windows is the dial-a-ride problem (DARP), in which a customer places a call for a taxi [8]. Often there
is a maximum time window for each customer or a limit on the capacity of the cab [9]. An example of a
limited capacity problem is the Stacker-Crane problem (SCP) [10]. In this problem, each item that is picked-up
needs to be immediately delivered to its destination. This is similar to saying that the load capacity of the vehicle is
one. This problem has practical applications of scheduling crane operations.
Arguably the hardest, and often most realistic, version of these problems is the unconstrained dynamic PDP
with large numbers of service requests. Generally when discussing UAVs in literature, the restrictions of time
windows and vehicle capacity do not exist. It is assumed that each vehicle can take as many pictures or videos of a
target (or targets) as needed without exceeding its capacity. This is analogous to saying that each vehicle has an
infinite capacity or that its capacity is so large it will hardly ever be exceeded. Also, the only “time windows”
present are due to the need to return information to command and control as quickly as possible, e.g. for tactical
reasons or due to cost. Savelsbergh and Sol (1995) provide other examples of realistic problems where there are not
restrictive time window constraints [11]. Here in the problem formulation with UAVs, it is assumed that these ‘time
windows’ are not tight, and that the overall time in which information is returned is more important (it becomes the
objective).
Pickup and delivery problems have three main classifications: many-to-many, one-to-many-to-one, and
one-to-one. One-to-one problems have been the main focus of research as they have the most real-life applicability
(e.g. postal services, transportation services, etc.). In addition to this, both static and dynamic versions of
this problem have been studied, though the latter much less so. Because dynamic problems are generally solved
by re-optimizing a static scenario or by using smart insertion techniques when a new request arrives, the static
version of this problem has been studied the most and is the focus of this research. However, while many studies of
the static problem address the objective of minimizing route cost [10]-[12], essentially none address the
objective of minimizing the time. Yet there are many cases where the need to service all requests quickly is
much greater than the need to minimize the length of the route.
Previous work on task allocation of multiple UAVs has shown that obtaining the optimal policy for
relevant problems can be computationally overwhelming as the number of states and controls becomes very large.
Therefore, the scalability of the solution to any type of this problem is of great interest. Exact methods for the
minimum distance PDP have only found optimal solutions for problems with up to 15 requests. However, heuristic algorithms have found
solutions with up to 250 requests and 1 vehicle [13]. Previous work as a result of this research effort has addressed
this problem of minimizing wait time [14] using a heuristic algorithm and has shown effective results (within 5-15%)
for small cases in a fraction of the computational time.
Small UAVs have limited power available onboard and consequently, must make careful decisions about
how to best utilize power for communication. This motivated our investigation of optimal approaches to resolve
some of the “communication bottle-necks” inherent in today’s co-operative UAV networks. Furthermore, a simple
heuristic solution was implemented to solve small static cases of this problem [14]. The objective of this research is
to find a near-optimal solution using a heuristic algorithm for the static case that is scalable and can be altered for
the dynamic, unconstrained PDP. Further development of this work motivated the need for an alternative heuristic,
SMART, to be able to accomplish this (see Section IV). As a result, this paper presents the motivation, the
SMART heuristic strategy, and the corresponding results and analysis.
III. Problem Formulation
In this scenario, it is required that multiple UAVs travel beyond their range of communication abilities to
collect information and return to the communication range to be able to transmit it back to command and control. In
these situations, it is necessary to gather this information as quickly as possible; e.g. for tactical reasons or due to
cost. The problem then becomes how to allocate these UAVs such that they collect data from a number of targets
and return home. Again, this problem can be formulated as a Pickup and Delivery Problem (PDP) with additional
assumptions. For a given scenario, the number of targets, the number of UAVs available, and their respective
locations can vary. Therefore, to allow for any circumstance, these parameters are not restricted (it is not assumed
that the number of targets is greater than the number of UAVs or vice versa). At this point, there is one
communication tower with a given range that allows for transmission of data.
A. General PDP
This problem is formulated such that M vehicles are allocated to service N requests [11]. Each of these
requests, i, is characterized by:
Load size of the request i
Origin of request i
Destination of request i
V: Set of all points containing origins and destinations of transportation requests
while each vehicle, k, is characterized by:
Qk: Load capacity of vehicle k
k+: Starting point of vehicle k
k-: Ending point of vehicle k
Set of all points containing starting and ending points of vehicles
For all pairs of points i and j, the associated travel from i to j can be characterized by:
Travel distance from i to j
Travel time from i to j
Travel cost from i to j
A pickup and delivery route Rk for vehicle k is a route through Vk, a subset of V, such that: the route starts and
ends at k+ and k-, respectively; the origin of each request in Vk is visited before its destination; each location in Vk is
visited exactly once; and the vehicle load never exceeds Qk. Additionally, a PDP plan is a set of routes,
one route Rk for each vehicle k, such that the sets Vk form a partition of V. Given a
route R, the cost associated with the route is given by an objective function JT(R). Then, the pickup and delivery
problem becomes that of finding the plan that minimizes JT(R).
B. Variables
Four different variables are associated with the general pickup and delivery problem:
Equal to 1 if request i is assigned to vehicle k and 0 otherwise
Equal to 1 if vehicle k travels from i to j and 0 otherwise
Departure time at vertex i
Load of vehicle arriving at request i
The pickup and delivery problem becomes (where R is a feasible pickup & delivery plan and JT is the
objective function as defined in part E):
$$\min_{R} J_T(R)$$
C. Constraints
The minimization of JT(x) is subject to the following constraints:
for all (1)
for all (2)
for all (3)
for all (4)
for all (5)
for all (6)
for all (7)
for all (8)
for all (9)
for all (10)
for all (11)
for all (12)
for all (13)
for all (14)
Equation (1) ensures that each request is only assigned to one vehicle. Equation (2) ensures that each
vehicle only travels to a request location if that request is assigned to that vehicle. Equations (3) and (4) ensure that
each vehicle starts and ends at the correct locations. Equations (5)-(8) form the precedence constraints and
equations (9)-(12) form the capacity constraints.
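For illustration, constraints of the kinds described above are commonly written as follows in the general PDP literature. This is a sketch under assumed notation, not a reproduction of Eqs. (1)-(14), and the ordering and numbering may differ. The symbols $z_i^k$, $x_{ij}^k$, $D_i$, and $y_i$ are assumed names for the four variables of Section III.B; $i^+$ and $i^-$ are assumed names for the origin and destination of request $i$; $t_{ij}$ is the travel time; and $q_l$ is the signed load change at location $l$ (the load size at an origin, its negative at a destination).

```latex
\begin{align*}
&\sum_{k=1}^{M} z_i^k = 1
  && \text{for all requests } i
  && \text{(assignment)} \\
&\sum_{j} x_{lj}^k = \sum_{j} x_{jl}^k = z_i^k
  && \text{for all } i,\ l \in \{i^+, i^-\},\ k
  && \text{(visit only if assigned)} \\
&\sum_{j} x_{k^+ j}^k = 1, \qquad \sum_{i} x_{i k^-}^k = 1
  && \text{for all vehicles } k
  && \text{(start and end points)} \\
&D_{i^+} \le D_{i^-}
  && \text{for all requests } i
  && \text{(precedence)} \\
&x_{ij}^k = 1 \;\Rightarrow\; D_i + t_{ij} \le D_j
  && \text{for all } i,\ j,\ k
  && \text{(precedence)} \\
&x_{ij}^k = 1 \;\Rightarrow\; y_i + q_i = y_j
  && \text{for all } i,\ j,\ k
  && \text{(capacity)} \\
&y_l \le \sum_{k=1}^{M} Q_k\, z_i^k
  && \text{for all } i,\ l \in \{i^+, i^-\}
  && \text{(capacity)}
\end{align*}
```

Under the unlimited-capacity assumption of Section III.D, the capacity constraints never bind, so only the assignment, routing, and precedence constraints restrict the UAV problem.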
D. Assumptions
To allow for a good comparison across solution methodologies, all UAVs and targets are considered
homogeneous. That is, all UAVs have similar speeds, sensors, and communication abilities, and all targets only
need to be serviced (visited); there is no need for loitering, approaching from specific directions, or sweeping the
area for targets. This is done to validate the heuristic solution and show the feasibility of this approach. These
assumptions will be relaxed in the future as we move towards realization. Additionally, collision
avoidance is considered a non-issue as the UAVs could work at different altitudes.
In this effort, it is also assumed that all targets are known at the initial time t = 0 and that no new targets are
initiated throughout the simulation. At this point, communication within the tower range is homogeneous and
instantaneous: there is no need for the UAV to linger within the area to transmit data, and data transmission is
equivalent at all points in the range. The capacity of the vehicles is unlimited. In addition to this, no flight dynamics
are considered, and therefore, the UAVs simply need to fly straight paths to and from the targets (or communication
range). This allows the target locations to be the corresponding states of the UAVs and the cost to go from state
to state to be the distance between locations.
E. Objective Function
An additional variable (not associated with the general PDP) is assigned for this problem that allows each
UAV to decide whether to visit the communication range before visiting the next request. The go/no go decision to
visit the communication area is represented by u and can be either 0 or 1. Therefore, given a decision v at a certain
stage, the UAV would also need to make the decision u of whether to go directly to the next target (u = 0) or
whether to go through the communication range to transmit data (u = 1). This is such that data can be transmitted
back to the home base periodically throughout the mission.
uk: Decision to visit communication range; uk ∈ {0, 1} for all k = 1…M
The cost function for this problem is formulated so that the total time that it takes for all the targets to be
visited and then delivered to the communication range is minimized. Therefore, for multiple UAVs, the total service
time cost is the sum of the cost of each individual UAV (Jk):
$$J_T = \sum_{k=1}^{M} J_k \tag{15}$$
Here, the cost for each UAV (Jk) becomes a function of the targets assigned to that UAV (nk) and is a
sum of the cost for each target (gi):
$$J_k = \sum_{i=1}^{n_k} g_i \tag{16}$$
Because the total time for a target to be delivered to the communication range depends on when it is picked
up and then delivered, the cost for each target is a function of the initial time to get to the first state, the sum of the
time until the UAV gets to that state, and the final time to get from there to the communication range. Because of the added
decision at each state of whether to visit the communication range or not (uk), the final time could vary. That is, if uk
= 1, the final time cost would be the cost to go from the request directly to the tower, and if uk = 0, the final time
would accumulate until the next time the decision was to visit the range (uk = 1). Therefore, the cost of each target
has the following form (Eq. 17):
$$g_i = u_i\left[C_{Ik} + \sum_{l=1}^{i-1}\bigl(u_l\,C_{Tk} + (1-u_l)\,C_{Dk}\bigr) + C_{Fk}\right] + (1-u_i)\min_{l:\; i+1 \le l \le n_k} g_l \tag{17}$$
Where the costs to go from one state to the next in the previous equation are denoted by:
CIk Cost for UAV k to go from its initial position to a request
CFk Cost for UAV k to go from a request to a delivery location
CDk Cost for UAV k to go from one request to the next directly
CTk Cost for UAV k to go from one request to the next through the delivery area
As seen in Eq. 17, not only does the cost at each target rely on past decisions, but it also relies on future
decisions. Therefore, if the decision is to go to the communication range at that state, the cost is equal to the total
time elapsed until the UAV has reached that state plus the cost to reach the tower. Otherwise, the cost is equal to the cost at the
next state at which the decision is to go to the communication range (uk = 1). Due to this dependency on future
decisions, this problem has to be solved backwards (from state N to 1) if solved optimally.
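The per-target cost described above can be sketched for a single UAV as follows. This is a minimal illustration, not the authors' implementation: the function name `service_times` and the list-based arguments are hypothetical, the leg costs are assumed to be precomputed travel times, and the last decision is assumed to be u = 1 so that every target is eventually delivered. The backward pass reflects the dependence of each cost on future decisions.

```python
def service_times(c_init, c_direct, c_through, c_final, u):
    """Delivery time of every target on one UAV's route (hypothetical helper).

    c_init       time from the UAV's start to the first target
    c_direct[l]  time from target l to target l+1, flying directly
    c_through[l] time from target l to target l+1 via the comm range
    c_final[l]   time from target l to its delivery location
    u[l]         1 to visit the comm range right after target l, else 0
    """
    n = len(u)
    if n == 0:
        return []
    if u[-1] != 1:
        raise ValueError("the last target must be delivered (u[-1] == 1)")

    # Forward pass: time at which the UAV reaches each target.
    arrival = [c_init]
    for l in range(n - 1):
        leg = c_through[l] if u[l] else c_direct[l]
        arrival.append(arrival[-1] + leg)

    # Backward pass: a target that is not delivered immediately inherits
    # the delivery time of the next state where u == 1.
    g = [0.0] * n
    for i in range(n - 1, -1, -1):
        g[i] = arrival[i] + c_final[i] if u[i] else g[i + 1]
    return g


if __name__ == "__main__":
    g = service_times(2, [1, 1], [3, 3], [1, 2, 1], [0, 1, 1])
    print(g, sum(g))  # sum(g) plays the role of J_k for this UAV
```

Summing the returned values gives the cost of one UAV, and summing over all UAVs gives the total service time that is minimized.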
IV. Motivation
A. Cost Function
This problem formulation not only has distinct uses for several applications and has hardly been addressed in
past literature, but using a minimum-time cost function also has clear advantages over other commonly used cost
functions (minimum-distance and min-max-distance). It allows for information to be delivered periodically while
maintaining a good balance between the distance travelled and the wait time for the requests (Fig. 3). The
charts in Fig. 3 show a comparison between the extra total time that the targets wait until delivery when traversing the
min-distance and min-max-distance paths (left), and the extra distance travelled when traversing the min-time and
min-max-distance paths (right). The results show a clear benefit to using a minimum-time cost function in this formulation.
Figure 3: Comparison of extra distance travelled and time waited for different cost functions
B. Humans versus Robots
Typically, a human operator would be tasked with allocating targets to UAVs. For scenarios with UAVs,
this operator is normally a military individual. This individual would be extensively trained on the equipment and
briefly trained on the mission objective. As UAVs are being utilized more for missions that are “dull, dirty, or
dangerous,” the means to operate them efficiently becomes critical. It is important that effective algorithms be
developed for vehicles to operate autonomously (or semi-autonomously) so that operators can focus on mission
demands that require a quick response.
Because the computation time to find the optimal solution is exponential with the number of targets and
UAVs, obtaining the optimal solution is impractical for realistic scenarios. Many effective heuristics have been
implemented to solve task allocation problems, and some have been shown to produce near-optimal results for very
complicated scenarios. This has been a main motivation for developing good heuristics for the static PDP. In
addition, with a good target insertion method in the dynamic environment, these heuristics could be altered slightly
such that they perform effectively in the infinite-horizon case.
While the previously developed heuristic [14] can produce good results when compared to the optimal
solution (within 5-15% in this case), it is important for the heuristic algorithm to perform adequately when
compared to human operators. That is, we do not want to replace a human with a computer that cannot produce
comparable results. Therefore, a study was done so that the heuristic algorithm could be tested against the optimal
and human performance. In this study, fourteen people were asked to allocate UAVs given a mission objective.
All of the people who completed the survey are currently doing research in the area of Control Sciences and have
at least a Bachelor of Science in one of the following fields: Mechanical Engineering, Aerospace
Engineering, Electrical Engineering, or Mathematics. They were given directions, a mission objective, and an
example to follow (below).
Mission Objective:
To minimize the total service time (pickup and delivery) of all the targets using your UAVs (green
triangles). That is, the sum of the times each target (red squares) waits to be picked up and delivered.
Example Problem & Optimal Solution:
Example Problem (Left) and its Optimal Solution (Right)
UAV 1: Targets (1 – N): __2__ __4__ __5__ __3__ _____ _____ _____ _____
Return (0, 1): __0__ __0__ __0__ __1__ _____ _____ _____ _____
UAV 2: Targets (1 – N): __8__ _____ _____ _____ _____ _____ _____ _____
Return (0, 1): __1__ _____ _____ _____ _____ _____ _____ _____
UAV 3: Targets (1 – N): __6__ __7__ __1__ _____ _____ _____ _____ _____
Return (0, 1): __0__ __1__ __1__ _____ _____ _____ _____ _____
Each person was given a set of 10 problems and asked to spend between 45 and 60 seconds on each
problem. There were three difficulties of problems given: “easy,” “medium,” and “hard.” Each problem was
generated at random and was categorized solely on the number of targets and UAVs (the difficulty was not based on
how obvious the answer might be). Three easy problems were given that had 2 UAVs and 5 targets, four medium
problems had 3 UAVs and 8 targets, and three hard problems had 3 UAVs and 15 targets. While the optimal
solution was obtained for the easy and medium problems, the “optimal solution” for the hard problems was based on
the best solution found by anybody. That is, the computational time to find the optimal solution was prohibitive,
and the best solution found was used in these cases for comparison.
Figure 4: Heuristic, Optimal, & Human Survey Results for Static PDP
The above plots (Fig. 4) show the compiled survey results for all fourteen people, the average
person, the optimal solution, and the previous heuristic solution [14]. The score for each is based on the average
cost per target in each problem. Therefore, the score for each problem is the total cost of the solution divided by the
number of targets, and the average score is given over all ten problems. Because the objective is to minimize the
total cost, the lowest cost per target indicates the best solution. Obviously, the optimal solution performs the best
overall. While the heuristic solution does not do as well as the optimal solution, or better than the average person, it is
comparable. It does, however, do better than most of the people surveyed. Additionally, all of the solutions are on about the same
scale.
It is important to note that in these particular examples, the heuristic is operating at worse than its average
performance. That is, for the examples given, the heuristic performs worse than it would do on average given a very
large sample. Additionally, the people being used for comparison are expected to perform better than the average
military trained operator. The people polled have been working on similar task allocation problems for anywhere
between 5 and 30 years. Therefore, it is expected that the heuristic would still do better than the typical military
operator. However, it is clear that the original algorithm had room for improvement, and this study gave a good
metric of what to aim at for future heuristics (i.e. the SMART algorithm).
C. Other Considerations
As stated previously, thus far both an optimal solution and a heuristic solution have been developed for
cooperative UAVs to collect N targets and return the information within the range in a static environment. The first
heuristic approach mimicked the behaviour of the optimal solution by a two-stage process: request clustering and
vehicle routing. This algorithm defined clusters based on the Euclidean distance between targets and used a greedy
approach for vehicle routing (that is, each vehicle was assigned to targets within a given cluster based on the
minimum distance to the next target). The solutions obtained showed that the heuristic solution often duplicates the
optimal solution and is typically within about 10% error. In addition to this, the time to compute the solution was
reduced significantly and was shown to increase linearly with the number of targets as opposed to exponentially.
This allows for scaling as the problem gets more complicated. As stated in the previous section, it was believed that
the heuristic had room for improvement, and therefore, several strategies for improving it were explored. The results of
this work not only showed some interesting behavior, but also motivated the approach for the SMART heuristic.
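The greedy routing stage of the first heuristic can be sketched as a nearest-neighbor rule. The snippet below is a minimal illustration under stated assumptions, not the original implementation: it covers routing within a single, already-formed cluster only (the clustering stage and the communication-range decisions are omitted), and the function name `greedy_route` is hypothetical.

```python
import math


def greedy_route(start, cluster):
    """Order the targets of one cluster by repeatedly flying to the
    nearest unvisited target (Euclidean distance).

    start   (x, y) position of the UAV
    cluster list of (x, y) target positions assigned to this UAV
    """
    remaining = list(cluster)
    route = []
    position = start
    while remaining:
        # Greedy step: pick the closest target from the current position.
        nearest = min(remaining, key=lambda p: math.dist(position, p))
        remaining.remove(nearest)
        route.append(nearest)
        position = nearest
    return route


if __name__ == "__main__":
    print(greedy_route((0, 0), [(5, 0), (1, 0), (2, 0)]))
```

Each step scans only the remaining targets of the cluster, which keeps the routing stage inexpensive compared with enumerating every ordering.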
The first study conducted addressed the initial assignment of UAVs to clusters. Initially, the assignments
were done in a greedy fashion. That is, each UAV maximized the ratio of cluster information to distance from the cluster,
and the UAVs were then assigned consecutively, starting with the UAV with the greatest ratio. In an attempt to improve on the
algorithm, all initial assignments were done simultaneously and the best initial assignments over all the UAVs were
selected. However, the results showed no improvement: assigning vehicles this way was at best no different and
more often worse (Fig. 5). More generally, this study confirmed that better solutions are
obtained by assigning UAVs in a greedy fashion rather than by minimizing distance over all UAVs.
Figure 5: Average Percent Greater Time Waited to Optimize All Vehicles' First Assignments over a Greedy Approach
The second study that produced interesting results was based on an algorithm that formed loose sets of
clusters and then evaluated whether it was beneficial to add other targets to the current cluster. Essentially, the most
effective form of this algorithm assigned the first UAV to a loose cluster by maximizing the cluster information to
distance from the cluster ratio (similar to before) and then evaluated whether it was better to add any other targets,
including other clusters, to its route. Several strategies were used to evaluate whether it was more beneficial to pick
up an extra target (or cluster), but the “best” one compared adding any cluster to the current cluster versus not
adding it and picking it up immediately after. Alternatively, the second “best” option was to compare the
information return to distance ratio. These were considered good options, because they were able to evaluate a
“good” decision without taking into account the state of other UAVs, thereby reducing complexity of decisions and
consequently, computational cost. In both of these situations, it was found that while it may be better in the short
term, often the overall solution was worse (Fig. 6). This is due to the fact that though some targets would be
delivered quicker in the short term, it would mean all the targets still waiting to be picked up would wait longer.
Other important consequences of this strategy showed that it is better to add targets to clusters when the ratio of
UAVs to targets is small and vice versa. Therefore, it would be better to form larger clusters when there are not many
UAVs available and smaller clusters when there are.
Figure 6: Average Percent Greater Time Waited to Form Loose Clusters and Add Targets over the Original Algorithm
Several important conclusions could be drawn from these studies and have affected the formulation of the
SMART algorithm. By inserting targets into clusters that had yet to be assigned, it was found that typical methods
used to evaluate these decisions are ineffective. While the decision may seem better in the immediate sense, often it
was worse for the overall solution. Additionally, the study showed it is better to have a good clustering algorithm
from the start than to form loose clusters and add targets afterward, and these clusters vary depending on the ratio
of targets to vehicles. That is, the clustering algorithm needs to be adaptable to the given scenario specifications.
Finally, the study comparing initial assignments showed that in a minimum time sense it is better to do this vehicle
assignment in a greedy fashion. Not only does this make vehicle assignment simpler and reduce computational
time, the vehicle routing lends itself to be easily extended to a dynamic scenario (the future goal of this research).
V. Solution Methodology
The SMART heuristic algorithm is developed here with the motivation to minimize the time that each
target has to wait before delivery. Additionally, an optimal solution is described that is used for comparing the
performance of the heuristic solution. Here, the optimal solution is found using a brute force method that searches
through all possible permutations, a trivial but effective method. It was shown in previous work [14] that as the
number of UAVs increases, the time it takes to compute the optimal result also increases, regardless of the
particular solving method. This is due to the exponentially increasing number of possibilities. Therefore, the
alternate solution is offered that is based on heuristics that mimic the optimal solution. This heuristic solution is
broken down into two stages: request clustering and vehicle routing. Both the optimal and the heuristic solutions
have the objective of collecting information from the targets as quickly as possible and transmitting it back to base
in a timely manner.
A. Optimal Solution
The optimal solution is found here by direct enumeration, checking which candidate minimizes the objective
function [Eq. 17]. This approach was chosen for its directness and its guarantee of finding the best combination. It
was also chosen because the cost at one state in the cost function relies not only on past decisions, but also on
future decisions. This makes it difficult to eliminate poor solutions early, because the best solution is not always
obvious.
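The direct enumeration described above can be sketched for a single UAV. This is a minimal illustration under stated assumptions, not the authors' implementation: the objective of Eq. 17 is not reproduced here, so the cost is assumed to be the sum of delivery times over all targets, the communication range is modeled as a single point `base`, and the UAV is assumed to return to it after the final target. The names `total_wait` and `brute_force` are hypothetical.

```python
from itertools import permutations, product
from math import dist


def total_wait(start, base, order, returns, speed=1.0):
    """Sum of delivery times for one visiting order and one set of return decisions.

    Assumed cost model (not Eq. 17 itself): a target is "delivered" when the UAV
    next reaches `base` after picking it up, and its wait is that arrival time.
    """
    t, pos, carried, cost = 0.0, start, 0, 0.0
    for target, go_back in zip(order, returns):
        t += dist(pos, target) / speed
        pos = target
        carried += 1
        if go_back:
            t += dist(pos, base) / speed
            pos = base
            cost += carried * t  # every carried target is delivered at time t
            carried = 0
    return cost


def brute_force(start, base, targets):
    """Enumerate every order and every return decision; keep the cheapest."""
    n = len(targets)
    best = (float("inf"), None, None)
    count = 0
    for order in permutations(targets):
        # Assumption: the UAV must return after the last target, so only
        # n - 1 of the return decisions are free binary choices.
        for free in product((False, True), repeat=n - 1):
            returns = free + (True,)
            count += 1
            cost = total_wait(start, base, order, returns)
            if cost < best[0]:
                best = (cost, order, returns)
    return best, count
```

Under these assumptions the number of candidates examined is N! multiplied by 2^(N-1), which is why the approach is only practical for small N.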
A brute force methodology, the one used here to find the optimal policy, involves accounting for all
possibilities. For a single UAV, the number of possibilities grows factorially with the number of targets N:
$\# = N!$. With the added communication constraint, where the UAV can decide at each target whether or not to visit
the communication range, the number of possibilities becomes $\# = N! \cdot 2^{N-1}$. For multiple UAVs, each UAV can
be assigned from 0 to N targets. For each number of targets assigned, the specific target assignment can vary. For a
random scenario, the number of targets assigned, which targets are assigned, and the order in which they are visited
can all vary based on the locations of the UAVs and the targets. The total number of possibilities for this scenario
is explored through an analogy:
How many ways can you put N balls in M bins?
[Equation illegible in the source text] (18)
How many ways can we arrange the Balls 1-N when we know M1- M5?
[Equation illegible in the source text] (19)
Searching by brute force through all possible ways to “put N balls in M bins” and all possible ways to
“arrange the Balls 1-N when we know M1-M5” implies that the execution time to find the optimal solution will
increase exponentially as the environment gets more complex. Therefore, scalability is the main motivation for
finding an alternate solution based on heuristics.
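The growth implied by the ball-and-bin analogy can be checked numerically. The sketch below uses standard combinatorial identities (stars and bars for the bin sizes, factorials for the ordering within each bin); it is not claimed to be the exact form of Eqs. 18 and 19, and it ignores the additional communication decisions. The function names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import comb, factorial, prod


def count_by_enumeration(n_targets, n_uavs):
    """Count ordered routes by trying every target-to-UAV assignment.

    For each assignment, a UAV that holds k targets can visit them in k! orders.
    """
    total = 0
    for owners in product(range(n_uavs), repeat=n_targets):
        sizes = Counter(owners)
        total += prod(factorial(k) for k in sizes.values())
    return total


def count_closed_form(n_targets, n_uavs):
    """Same count in closed form: C(N + M - 1, M - 1) * N!.

    C(N + M - 1, M - 1) is the number of ways to choose how many targets each
    of the M UAVs receives; N! orders the targets across those slots.
    """
    return comb(n_targets + n_uavs - 1, n_uavs - 1) * factorial(n_targets)
```

For example, 8 targets and 5 UAVs already give `count_closed_form(8, 5) == 19958400` candidate route sets before any communication decisions are considered, which illustrates why the optimal comparison is limited to small cases.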
B. SMART Heuristic
An alternate solution using heuristics is proposed that is constructed by analyzing the optimal solution.
These heuristics are based on how targets are grouped between visits to the communication tower (clustered) and
how these clusters are assigned to UAVs. They are similar to other 2-phase heuristic solutions to Vehicle
Routing Problems (VRPs), where the solution is decomposed into two natural phases: 1) clustering of target vertices
into feasible routes and 2) actual construction of routes. Here, the heuristic solution is a cluster-first,
route-second algorithm, but it differs from conventional VRP solutions in that the number of clusters is not
necessarily equal to the number of vehicles. As a result, clusters tend to be much smaller and are not intuitive.
A unique approach to clustering is therefore adopted to form clusters, and then routes are constructed for each
vehicle until all clusters are visited.
Clustering
As stated in the previous section, the key to an effective solution using a cluster-first, route-second heuristic
for this problem is correctly defining targets that belong to similar clusters. Therefore, the SMART heuristic uses a
routing technique similar to that of previous related research [14], but adopts a new, smarter clustering method.
Here, a Hierarchical Agglomerative Clustering (HAC) algorithm is suggested to build clusters of targets. Unlike
flat clustering (where a flat set of clusters is created without any explicit structure that would relate clusters to
each other), HAC algorithms are typically deterministic and do not require the user to define the number of clusters
at the start. HAC algorithms build clusters based on a measure of similarity (e.g. Euclidean distance) in a tree
structure (Fig. 7) until some cut-off measure is reached. Once clusters were assigned based on their similarity, a
similarity cutoff measure was found empirically. In this case, multiple values were tested and refined until the best
overall performance was found (see Results & Analysis section for more details).
Figure 7: Original Structure & Corresponding HAC Tree
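The HAC procedure with a similarity cutoff can be sketched as follows. This is an illustration under assumptions: the paper's similarity comes from a Fuzzy Inference System, whereas the placeholder `similarity` below is a simple decreasing function of Euclidean distance, and average linkage is assumed because the linkage rule is not stated. The default cutoff of 0.34 is the empirically tuned value reported in the Results section.

```python
from itertools import combinations
from math import dist


def similarity(a, b):
    """Placeholder similarity in (0, 1]; an assumption standing in for the FIS."""
    return 1.0 / (1.0 + dist(a, b))


def cluster_similarity(c1, c2, sim):
    """Average linkage (assumed): mean pairwise similarity between two clusters."""
    return sum(sim(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def hac(points, cutoff=0.34, sim=similarity):
    """Merge the two most similar clusters until no pair reaches the cutoff."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        (i, j), best = max(
            (
                ((i, j), cluster_similarity(clusters[i], clusters[j], sim))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < cutoff:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters
```

Because the stopping rule is a similarity threshold rather than a target cluster count, the number of clusters is an output of the procedure, which matches the requirement that it need not equal the number of vehicles.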
Because the similarity of requests in this case can depend on a variety of factors (density of targets, number
of vehicles, proximity to the communication range, etc.), a Fuzzy Inference System (FIS) is used to create a measure
of similarity between requests. Fuzzy logic [15], based on multi-valued logic, provides a unique method for
encoding knowledge about continuous variables by mapping inputs to outputs with if-then rules that capture
heuristic knowledge and human experience (Fig. 8). Experience from the past decade, with the successful marketing
of a wide variety of products based on fuzzy logic, has shown that for certain applications this approach can
lead to lower development costs, superior features, and better end product performance. One of the inherent
properties of fuzzy logic systems is that they can act as universal approximators. Additionally, such systems
can utilize expert heuristic knowledge of the operation of controlled systems, including physical intuition; they
can successfully handle uncertainties and nonlinearities; and a variety of tools exist that assist in studying and
building efficient fuzzy systems in relatively short times. In recent times, these advantages have made fuzzy logic
systems attractive candidates for use in expert systems.
Figure 8: Fuzzy Inference System
Besides the distance from the range and the angle between the targets, the biggest influence on the clustering
was found to be the ratio of UAVs to targets. The largest errors in the performance of the heuristic, when
compared to the optimal solutions, occurred when this ratio was either very high or very low. Therefore, the
FIS used here takes as inputs the ratio of the distances of the two requests from the range, the angle
separating the two requests, and the ratio of the number of vehicles to the number of requests, and produces a crisp
output for similarity. The ratio of distances and the angle of separation are each described as small, medium, high,
or very high. The UAV to target ratio is described as small, medium, or high. Finally, the output measure of
similarity is described as small, medium, high, or very high. The membership functions for the inputs and outputs are
shown below (Figs. 9-12).
Rules relating the inputs and outputs for the FIS are set up in the form of if-then statements and are based
on heuristics and human experience. The rules for the fuzzy inference system can be summed up in some simple
decision-making logic. There are a total of 48 rules for this setup, broken up into three situations: when the
ratio of UAVs to targets is small, medium, and high. Generally, the rules follow the logic that if the ratio is small,
there are many more targets than UAVs and the clusters should be larger. The opposite is also true: if there are
many UAVs compared to the number of targets, the clusters tend to be much smaller. The rules implemented with the FIS
are summarized in Tables 1, 2, and 3 below.
Table 1: Similarity Measure given a Small Ratio of UAVs to Targets
(rows: Angle of Separation; columns: Ratio of Distances from Communication Range)
Angle of Separation | Small     | Medium    | High   | Very High
Small               | Very High | Very High | High   | Medium
Medium              | Very High | High      | High   | Medium
High                | High      | High      | Medium | Low
Very High           | Medium    | Medium    | Low    | Low

Table 2: Similarity Measure given a Medium Ratio of UAVs to Targets
(rows: Angle of Separation; columns: Ratio of Distances from Communication Range)
Angle of Separation | Small     | Medium    | High   | Very High
Small               | Very High | High      | Medium | Low
Medium              | High      | Medium    | Medium | Low
High                | Medium    | Medium    | Low    | Low
Very High           | Low       | Low       | Low    | Low
Figure 9: Input: Ratio of Distances to the Communication Range of the Two Requests
Figure 10: Input: Angle of Separation between Two Requests
Figure 11: Input: Ratio of UAVs to Requests
Figure 12: Output: Similarity Measure
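The 48 rules of Tables 1-3 can be encoded as a lookup and evaluated with a very small inference routine. The sketch below is illustrative only. The rule outputs are taken from the tables, but everything else is an assumption: inputs are taken to be pre-normalized to [0, 1], the membership functions are evenly spaced triangles rather than those of Figs. 9-12, rule strength uses the minimum operator, and the crisp output is a weighted average of assumed output centers (a zero-order Sugeno-style shortcut, not necessarily the defuzzification used in the paper).

```python
IN4 = ("small", "medium", "high", "very high")
IN3 = ("small", "medium", "high")
OUT = ("low", "medium", "high", "very high")

# RULES[uav_ratio][angle_index][distance_ratio_index] -> similarity label (Tables 1-3)
RULES = {
    "small": [
        ("very high", "very high", "high", "medium"),
        ("very high", "high", "high", "medium"),
        ("high", "high", "medium", "low"),
        ("medium", "medium", "low", "low"),
    ],
    "medium": [
        ("very high", "high", "medium", "low"),
        ("high", "medium", "medium", "low"),
        ("medium", "medium", "low", "low"),
        ("low", "low", "low", "low"),
    ],
    "high": [
        ("very high", "high", "low", "low"),
        ("high", "medium", "low", "low"),
        ("low", "low", "low", "low"),
        ("low", "low", "low", "low"),
    ],
}


def tri(x, a, b, c):
    """Triangular membership peaking at b; a == b or b == c gives a shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def memberships(x, labels):
    """Degrees of membership of x in evenly spaced fuzzy sets on [0, 1] (assumed)."""
    step = 1.0 / (len(labels) - 1)
    degrees = {}
    for k, name in enumerate(labels):
        peak = k / (len(labels) - 1)
        degrees[name] = tri(x, max(0.0, peak - step), peak, min(1.0, peak + step))
    return degrees


def fis_similarity(dist_ratio, angle, uav_ratio):
    """Crisp similarity in [0, 1] from three inputs normalized to [0, 1]."""
    mu_d = memberships(dist_ratio, IN4)
    mu_a = memberships(angle, IN4)
    mu_u = memberships(uav_ratio, IN3)
    centers = {name: k / (len(OUT) - 1) for k, name in enumerate(OUT)}
    num = den = 0.0
    for u_label, table in RULES.items():
        for a_label, row in zip(IN4, table):
            for d_label, out in zip(IN4, row):
                w = min(mu_u[u_label], mu_a[a_label], mu_d[d_label])  # rule strength
                num += w * centers[out]
                den += w
    return num / den if den else 0.0
```

A function of this kind could serve as the `sim` argument of a clustering routine once the raw geometry of two requests has been mapped to the three normalized inputs.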
Vehicle Routing
In the static PDP, the assignment of clusters to vehicles is done as a finite horizon scenario. Instead of
assigning vehicles to clusters simultaneously, each assignment is done individually and consecutively until all
targets have been assigned. By assigning vehicles this way, the policy can easily be extended to an infinite horizon
scenario.
At each “stage,” each UAV bids on the cluster that it would best service first. Each UAV selects its bid by
maximizing the ratio of information (within that cluster) to distance. Then, only one UAV is assigned: the cluster
is awarded to the UAV that would arrive there first. These bids are made at the initial stage, and each UAV is
subsequently assigned to clusters as it arrives back at the communication range. The bidding is done at the first
stage to allow for circumstances in which a UAV could drop off one cluster and still pick up the next before a
different UAV could reach it. This is repeated until all clusters are visited.
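The one-at-a-time bidding scheme can be sketched as a simple greedy loop. This is an illustration under assumptions rather than the authors' routine: the information in a cluster is taken to be its number of targets, a cluster is represented by its centroid, the communication range is a single point `base`, and servicing a cluster is modeled as flying to the centroid and then back to `base`. At least one UAV is assumed, and the names are hypothetical.

```python
from math import dist


def centroid(cluster):
    """Mean position of the targets in a cluster."""
    xs, ys = zip(*cluster)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def assign_clusters(uav_positions, clusters, base, speed=1.0):
    """Greedy sequential assignment; returns a list of (uav_index, cluster_index)."""
    centers = [centroid(c) for c in clusters]
    free_at = [0.0] * len(uav_positions)  # time at which each UAV is next available
    position = list(uav_positions)
    remaining = set(range(len(clusters)))
    schedule = []
    while remaining:
        winner = None
        for u in range(len(position)):
            # Bid: the cluster with the best information-to-distance ratio for UAV u.
            def ratio(k, u=u):
                return len(clusters[k]) / max(dist(position[u], centers[k]), 1e-9)

            bid = max(sorted(remaining), key=ratio)
            arrival = free_at[u] + dist(position[u], centers[bid]) / speed
            # Only one UAV is assigned per stage: the one that would arrive first.
            if winner is None or arrival < winner[0]:
                winner = (arrival, u, bid)
        arrival, u, bid = winner
        schedule.append((u, bid))
        remaining.remove(bid)
        free_at[u] = arrival + dist(centers[bid], base) / speed  # crude service model
        position[u] = base
    return schedule
```

Because each stage commits a single UAV and then re-bids, a newly arriving cluster could in principle be added to `remaining` between stages, which is the property that makes this style of assignment attractive for a dynamic extension.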
Once the targets are assigned to each UAV based on the previous algorithm, each UAV must decide in
what order to visit the targets within the cluster. Because a cluster is defined as a set of targets that a UAV
visits together before visiting the communication range at the end, the decision variable $u_k$ drops out of the
problem. More precisely, $u_k$ is known to be 0 for all targets in the cluster except the last. Therefore, to
determine the shortest path for a UAV within a cluster, an approximate dynamic programming approach is used: a
limited look-ahead policy. This method reduces the computation required by dynamic programming by choosing which
target to visit next after looking ahead only a small number of stages. Here, the next two stages are taken into
account.
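A two-stage limited look-ahead policy for ordering the targets within one cluster can be sketched as below. This is a minimal illustration assuming that path length is the quantity being minimized, that targets are distinct coordinate tuples, and that the return leg to a single point `base` is counted only when the look-ahead horizon reaches the end of the cluster. The function name is hypothetical.

```python
from math import dist


def lookahead_route(start, targets, base, depth=2):
    """Order targets by repeatedly picking the stop with the best depth-stage outlook."""

    def best_tail(pos, remaining, stages_left):
        # Shortest continuation from pos, looking at most stages_left stops ahead.
        if not remaining:
            return dist(pos, base)  # cluster finished: fly to the communication range
        if stages_left == 0:
            return 0.0  # horizon reached: ignore everything further out
        return min(
            dist(pos, t) + best_tail(t, remaining - {t}, stages_left - 1)
            for t in remaining
        )

    route, pos, remaining = [], start, frozenset(targets)
    while remaining:
        nxt = min(
            sorted(remaining),  # sorted only to make tie-breaking deterministic
            key=lambda t: dist(pos, t) + best_tail(t, remaining - {t}, depth - 1),
        )
        route.append(nxt)
        pos = nxt
        remaining = remaining - {nxt}
    return route
```

With `depth=2`, each decision evaluates on the order of k squared two-stop continuations for a cluster of k remaining targets, instead of the k! orderings a full dynamic programming solution would have to cover.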
VI. Results & Analysis
A. Similarity Measure
A similarity cutoff measure for this clustering technique was found empirically. In this case, multiple
values were tested and refined until the best overall performance was found. Initially, broad values (0.1, 0.2, 0.3,
0.4, 0.5, 0.6, and 0.7) were tested to get a rough estimate of the best similarity cutoff (Fig. 13). For this
comparison, 3 UAVs were used and 1000 random cases were tested. It was found that a similarity value between 0.3 and
0.4 minimized the cost. Following this, values in this range (0.32, 0.33, 0.34, 0.35, 0.36, 0.37, and 0.38) were
tested to get a more exact value for 1, 3, and 5 UAVs (Figs. 14-16). The figures show enlarged details of a point in
common on each of the graphs. The common trend of the curves showed that the minimum cost was at a value
between 0.34 and 0.36. The final value chosen for the clustering method was 0.34, because it was most consistently
among the values that minimized the cost over the set tested.
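The coarse-then-fine tuning of the similarity cutoff can be sketched as two passes of the same grid sweep. The sketch below is illustrative: `run_case` is a hypothetical callable standing in for one randomized scenario evaluation (clustering, routing, and cost), which is not reproduced here, and `toy_case` is a made-up cost surface used only to show the calling pattern.

```python
from statistics import mean


def average_cost(cutoff, run_case, n_cases):
    """Mean cost of run_case(cutoff, seed) over seeds 0 .. n_cases - 1."""
    return mean(run_case(cutoff, seed) for seed in range(n_cases))


def sweep(cutoffs, run_case, n_cases=1000):
    """Return (best cutoff, {cutoff: average cost}) for one grid of candidates."""
    scores = {c: average_cost(c, run_case, n_cases) for c in cutoffs}
    return min(scores, key=scores.get), scores


def toy_case(cutoff, seed):
    """Hypothetical stand-in for a simulated scenario; not data from the paper."""
    return (cutoff - 0.34) ** 2 + 0.01 * seed


# The two grids reported in the text: a broad pass, then a refined pass.
coarse = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
fine = [0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38]
coarse_best, _ = sweep(coarse, toy_case, n_cases=10)
fine_best, _ = sweep(fine, toy_case, n_cases=10)
```

Using the same seeds for every candidate cutoff means each value is scored on the same random scenarios, so differences in average cost reflect the cutoff rather than the draw of cases.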
Table 3: Similarity Measure given a High Ratio of UAVs to Targets
(rows: Angle of Separation; columns: Ratio of Distances from Communication Range)
Angle of Separation | Small     | Medium | High | Very High
Small               | Very High | High   | Low  | Low
Medium              | High      | Medium | Low  | Low
High                | Low       | Low    | Low  | Low
Very High           | Low       | Low    | Low  | Low
B. Cost Function Comparison
The main motivation for proposing a solution based on heuristics was to improve upon the execution time
while still maintaining a good approximation of the optimal solution. In addition, the SMART heuristic is
expected to outperform the original algorithm, and it would be ideal for the SMART solution to be scalable and
to operate nominally under uncertainty. The questions are therefore how well this solution holds as the environments
get more and more complex and whether it succeeds or fails under uncertainty. While uncertainty has not yet been
explored in this research, it is noted that this is an important asset for a multiple-UAV, multiple-target scenario
and will be researched in the future.
In the figure below (Fig. 17), the average percent greater time waited for the SMART algorithm over
optimal is shown for 2 to 8 targets and 1 to 5 UAVs. As seen, the algorithm gives near-optimal results. The
examples also show the performance of the algorithm when the ratio of UAVs to targets is both large and small.
Similar to previous results, the cost for the heuristic solution follows the same trend as the optimal solution: the
average cost for a scenario generally decreases as UAVs are added and increases with the number of targets. The cost
of the heuristic solutions is also on the same scale as that of the optimal solutions, within about 5% of the optimal
solution cost. In addition, the algorithm performs consistently; that is, there is little erratic behavior in the
cost performance. Additionally, the cost per target (Fig. 18) is relatively consistent as the number of
targets increases. While it is increasing, the cost appears to be increasing at a decreasing rate. This implies that
as the problem gets more complicated, the algorithm maintains its performance. Overall, the heuristic-based
solution provides excellent results in significantly reduced execution time compared with the optimal solution.
Figure 13: Rough Similarity Comparison (3 UAVs)
Figure 14: Similarity Comparison with 1 UAV
Figure 15: Similarity Comparison with 3 UAVs
Figure 16: Similarity Comparison with 5 UAVs
Figure 17: SMART Average Percent Error
Figure 18: SMART Average Cost Per Target
Because optimal solutions for problems with more than 8 targets require such significant computational
time, the performance against the optimal was only compared up to this point. However, it is important that the SMART
algorithm outperform the original algorithm as the scenario gets more complex. The comparison between the two
algorithms is shown below for scenarios with up to 5 UAVs and 25 targets (Fig. 19). As seen, the SMART algorithm
does better, with an average improvement over the original heuristic of about 2-4%.
Figure 19: Average Percent Less Time Waited for the SMART Algorithm over the Original Algorithm
C. Execution Time
One of the main objectives for any heuristic algorithm is not just to maintain near-optimal performance, but
to do so in a fraction of the computational time. Previous work [14] has shown that the execution times
were significantly lower for the heuristic algorithms and appear to increase in a linear fashion. Because the
execution time for this algorithm was so small, it was clear that there was some leeway. However, because the
new clustering technique is on the order of $N^2$, the execution time for the SMART algorithm (Fig. 20) increases
accordingly. Nevertheless, the execution time is still significantly reduced and appears to have little relationship
to the number of UAVs (as expected, the order of the algorithm is independent of M).
Figure 20: Execution Time for a Heuristic Solution
VII. Conclusions & Future Work
In this work, a solution strategy based on heuristics, SMART, was proposed that minimizes the total time
that all the targets have to wait to be picked up and delivered to a communication range. The wait time for each
target is based on the decision of which target to visit next and the decision at each stage of whether to visit the
communication range or not. The optimal results were used to generate a solution methodology based on heuristics
by mimicking the optimal solutions. This algorithm was motivated by a study of human performance against the
original heuristic. The original algorithm (from previous work) has shown good results, but more
importantly, it shows that an effective clustering algorithm is crucial to obtaining near-optimal results. Therefore,
a new clustering algorithm based on HAC and fuzzy logic was implemented.
The SMART algorithm was compared to both the optimal solutions and the previous heuristic for small cases,
and to the previous heuristic alone for larger cases. It was shown that the SMART algorithm performed near optimally,
within 5% of the optimal for all the cases tested. Additionally, the SMART algorithm improved on the previous
heuristic, generally performing about 2-4% better on average. Furthermore, it was shown that the cost per target for
the SMART algorithm, while increasing with the number of targets, increases at a decreasing rate, implying that
the algorithm is somewhat scalable. This is expected, as the clustering algorithm takes into account the size of the
problem (M UAVs and N targets). Finally, the execution time was calculated, and it was seen
that, as expected, the execution time increased on the order of $N^2$.
Because most Pickup and Delivery Problems are dynamic (including this specific example involving
UAVs), there is increasing interest in them. However, relatively little work has been done on them due to
their complexity. Using the SMART algorithm as a solution to the initial scenario for the dynamic problem, the
solution to the dynamic problem can be developed, and this will be the future direction of this research. Because each
target is added to a cluster, and each vehicle is assigned to a cluster, individually and consecutively until all have
been assigned, the policy can easily be extended to an infinite horizon scenario. However, as shown in this work,
typical target insertion techniques can fail under a minimum-time objective. Once a good target insertion technique
is developed, targets can be assigned to a cluster and vehicle routing will remain unchanged.
Acknowledgments
C. Sabo thanks Derek Kingston at the Wright-Patterson Air Force Research Lab for his contribution, kind
support, and guidance in this research endeavor.
C. Sabo acknowledges that this work was partially funded by the Air Force Summer Faculty Fellowship
Program (SFFP).
References
1. Shima, T., and Rasmussen, S., Editors, “UAV Cooperative Decision and Control: Challenges and Practical Approaches,” Advances in Design and Control, SIAM, 2009.
2. Fahlstrom, P. G., and Gleason, T. J., “Introduction to UAV Systems,” 3rd Edition, UAV Systems Inc., 2009.
3. Smith, S. L., Pavone, M., Bullo, F., Frazzoli, E., “Dynamic Vehicle Routing with Heterogeneous Demands,” 47th IEEE Conference on Decision and Control, pp. 1206-1211, Dec, 2008.
4. Waisanen, H., Shah, D., Dahleh, M., “A Dynamic Pickup and Delivery Problem in Mobile Networks Under Information Constraints,” IEEE Transactions on Automatic Control, Vol. 53, No. 6, pp. 1419-1433, Jul, 2008.
5. Nanry, W., Barnes, J. W., “Solving the Pickup and Delivery Problem with Time Windows using Reactive Tabu Search,” Transportation Research Part B, Vol. 34, pp. 107-121, Jul, 2000.
6. Ropke, S., Pisinger, D., “An Adaptive Large Neighborhood Search Heuristic for the Pickup and Delivery Problem with Time Windows,” Transportation Science, Vol. 40, No. 4, pp. 455-472, Nov, 2006.
7. Li, H., Lim, A., “A Metaheuristic for the Pickup and Delivery Problem with Time Windows,” IEEE 13th International Conference on Tools of Artificial Intelligence, pp. 160-167, Nov, 2001.
8. Cordeau, J.-F., Laporte, G., “The Dial-a-Ride Problem (DARP): Variants, Modeling Issues, and Algorithms,” 4OR: Quarterly Journal of the Belgian, French, and Italian Operations Research Societies, Vol. 1, pp. 89-101, 2002.
9. Feuerstein, E., Stougie, L., “On-Line Single-Server Dial-a-Ride Problems,” Theoretical Computer Science, Vol. 268, pp. 91-105, 2001.
10. Berbeglia, G., Cordeau, J.-F., Laporte, G., “Dynamic Pickup and Delivery Problems,” European Journal of Operational Research, Vol. 202, pp. 8-15, Nov, 2009.
11. Savelsbergh, M. W. P., Sol, M., “The General Pickup and Delivery Problem,” Transportation Science, Vol. 29, pp. 17-29, 1995.
12. Berbeglia, G., Cordeau, J.-F., Gribkovskaia, I., Laporte, G., “Static Pickup and Delivery Problems: A Classification Scheme and Survey,” Top, Vol. 15, pp. 1-31, Apr, 2007.
13. Parragh, S. N., Doerner, K. F., Hartl, R. F., “A Survey on Pickup and Delivery Problems. Part II: Transportation Between Pickup and Delivery Locations,” Journal für Betriebswirtschaft, Vol. 58, No. 2, pp. 81-117, 2008.
14. Sabo, C., Kingston, D., and Cohen, K., “Minimum Service Time for UAV Cooperative Control Subject to Communication Constraints,” Air Force Research Laboratory, Accepted for presentation at the 2010 AIAA Infotech@Aerospace, Atlanta, Georgia, 20-22 April 2010.
15. Zadeh, L. A., “Fuzzy Sets,” Information and Control, Vol. 8, pp. 338-353, 1965.