SMART Heuristic for Pickup and Delivery Problem (PDP) with Cooperative UAVs

1 Graduate Student, Department of Aerospace Engineering, Student Member AIAA
2 Associate Professor, Department of Aerospace Engineering, Associate Fellow AIAA

Chelsea Sabo1

University of Cincinnati, Cincinnati, Ohio, 45221

Kelly Cohen2

University of Cincinnati, Cincinnati, Ohio, 45221

Abstract

Pickup and Delivery Problems (PDPs) are a subset of Vehicle Routing Problems (VRPs) which require a vehicle to service targets by picking them up at an origin and delivering them to their unique destination. With respect to surveillance functions, this becomes a realistic problem because UAVs are restricted by operating range, data rate, anti-jam margins, and cost. Therefore, UAVs must be allocated to “pickup” targets and then “deliver” them, from within a prescribed communication space, back to a command and control HQ. To maximize the speed and amount of information transmitted from this communication region, the UAVs are allocated such that the total service time (pickup and delivery) of all the targets is minimized. Previous work on PDPs has shown that as the problem gets more complicated (i.e., more targets and more vehicles) the solution space increases exponentially, and the execution time needed to find an optimal solution becomes impractical. Additionally, related previous work applied a heuristic solution to this problem and showed that good results can be maintained (within ~15-20% of the optimal solution for small cases). The focus of this research is to develop an alternate heuristic algorithm, deemed SMART from here on, that can perform near optimally (within ~5%) and scales as the problem gets more complicated. The algorithm is also developed such that it is easily extendable to a dynamic scenario as this research progresses. The algorithm is described in detail and is shown to reach this performance metric while requiring significantly less computational time than is needed to obtain the optimal solution.

I. Introduction

Surveillance functions are of paramount importance to U.S. defense systems, and these systems comprise various means for acquiring and processing information needed by military commanders and national security decision makers. The future of these systems focuses on including human intelligence, measurement and signature intelligence, signals intelligence, imagery intelligence, and open source intelligence through algorithms, software, and automation. Additionally, in the not too distant future, cooperative UAV teams are anticipated to provide this much needed support more effectively [1]. A very important aspect in the design of UAV cooperative control systems is the ability to collect data and transmit it to a decision making authority such as command and control headquarters. Ideally speaking, a group of collaborating UAVs should be able to communicate “whenever and as much as they need to [1].” While this should be the standard, it is far from a reality. Typically the focus is on minimizing mission costs while communication restrictions are often ignored. However, there is a distinct need for collecting high-resolution snapshots of targets anywhere in the environment (no communication constraints) in addition to the need to reliably get this information back in a timely way. In reality, operating range, data rate, anti-jam margins, and cost are limiting factors that need to be considered in order to operate effectively [2]. While “command and control” signals can be delivered with low bandwidth data, there is a distinct need for a high bandwidth delivery mechanism. Moreover, communication constraints and limits define all UAV activity, and there is a growing interest in the research community in expanding current capabilities.

Infotech@Aerospace 2011, 29 - 31 March 2011, St. Louis, Missouri

AIAA 2011-1464

Copyright © 2011 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.

Downloaded by UNIVERSITY OF CINCINNATI on November 24, 2014 | http://arc.aiaa.org | DOI: 10.2514/6.2011-1464

American Institute of Aeronautics and Astronautics

II. Literature Survey

The resource allocation problem associated with sending cooperative UAVs to collect information about

targets falls within the general framework of Vehicle Routing Problems (VRPs). Here, the context of the VRP is to

deliver goods to customers that have placed orders (or in this case, for UAVs to service target requests) from a

single, central depot. Several variations of these problems have been addressed and studied extensively: vehicle

routing with pickup and delivery (VRPPD), vehicle routing with Last In, First Out (LIFO), vehicle routing with time

windows (VRPTW), and the capacitated vehicle routing problem (CVRP). The VRPPD is a subset of VRPs where

loads need to be transferred from a pickup location (their origin) to their delivery location (their destination).

Vehicle routing with LIFO is a type of VRPPD where the first item to be delivered must be the last item that was

picked up. VRP with TWs represent problems where the items to be visited have timeframes associated with them.

Capacitated vehicle routing problems are problems where the vehicles have a limited load capacity. Furthermore,

pickup and delivery problems (PDPs) have been studied with each of these various additional constraints (see Fig. 1

below).

Figure 1: Framework for Pickup and Delivery Problems

In the problem formulation with UAVs, the VRP is set up as a PDP in that there are targets that need to be visited (to collect information from), and there is a single communication range where the UAVs are able to transmit data through a high bandwidth connection back to a command and control HQ (Fig. 2). Because this is a range (not a single point or depot), each target typically has a unique delivery site based on its location (e.g., the point in the range closest to the target). However, each target can also be delivered at another target's delivery location (each target has a “best” delivery location, but is satisfied by being delivered at any point within the range). Therefore, the UAVs must be allocated to “pickup” targets and then “deliver” them to their corresponding destination within the communication range. This assumption of limited communication is not only feasible, but realistic in almost every operating environment.
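As a minimal sketch of the “best” delivery location described above, assume (this is not stated in the paper) that the communication range is a disc of a given radius centred on the tower. The hypothetical helper `delivery_point` then returns the point of the range closest to a target; the function name and its parameters are illustrative only.

```python
import math


def delivery_point(target, tower, radius):
    """Closest point of a circular communication range to `target`.

    Assumption: the range is a disc of `radius` centred on `tower`.
    """
    dx, dy = target[0] - tower[0], target[1] - tower[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        # The target already lies inside the range, so it is delivered in place.
        return (float(target[0]), float(target[1]))
    scale = radius / dist  # shrink the tower-to-target vector onto the boundary
    return (tower[0] + dx * scale, tower[1] + dy * scale)


# Example: a target 10 units east of a tower whose range has radius 2.
point = delivery_point((10.0, 0.0), (0.0, 0.0), 2.0)
```

Any other point inside the disc would also satisfy the delivery requirement; the closest point is simply the one reached soonest when flying straight from the target.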

Figure 2: PDP environment where UAVs need to "pickup" targets and "deliver" them to a communication range.

While allocating UAVs is an important growing concern as UAVs are more frequently used, this

scheduling issue where pickup and delivery is necessary arises in many fields and has numerous applications. This

is often an issue when scheduling school buses, garbage trucks, delivery vans and trucks; distributing mail and

packages; dispatching trucks; etc. In addition to this, applications include ambulatory services, logistics, and

robotics. An example of a similar, commonly studied problem is the Travelling Repairperson Problem (TRP). Here

the agent must service locations to “repair” the targets. These types of problems are studied in conjunction with

PDPs by assuming that the delivery location is the same as the pickup location [3]-[4].

Because pickup and delivery problems with time windows (PDPTW) and vehicle capacity have many practical applications, these have been studied extensively [5]-[7]. An example of a problem with time windows is the dial-a-ride problem (DARP), in which a customer places a call for a taxi [8]. Often there is a maximum time window for each customer or a limit on the capacity of the cab [9]. An example of a limited capacity problem is the Stacker-Crane problem (SCP) [10]. In this problem, each item that is picked up needs to be immediately delivered to its destination. This is equivalent to saying that the load capacity of the vehicle is one. This problem has practical applications in scheduling crane operations.

Arguably the hardest, and often most realistic, version of these problems is the unconstrained dynamic PDP with large numbers of service requests. Generally when discussing UAVs in the literature, the restrictions of time windows and vehicle capacity do not exist. It is assumed that each vehicle can take as many pictures or videos of a target (or targets) as needed without exceeding its capacity. This is analogous to saying that each vehicle has an infinite capacity or that its capacity is so large it will hardly ever be exceeded. Also, the only “time windows” present are due to the need to return information to command and control as quickly as possible, e.g., for tactical reasons or due to cost. Savelsbergh and Sol (1995) provide other examples of realistic problems where there are no restrictive time window constraints [11]. Here in the problem formulation with UAVs, it is assumed that these “time windows” are not tight, and that the overall time in which information is returned is more important (it becomes the objective).

Pickup and delivery problems have three main classifications: many-to-many, one-to-many-to-one, and one-to-one. One-to-one problems have been the main focus of research as they have the most real-life applicability (e.g., postal services, transportation services, etc.). In addition to this, both static and dynamic versions of this problem have been studied, though the latter much less so. Because dynamic problems are generally solved by re-optimizing a static scenario or by using smart insertion techniques when a new request arrives, the static version of this problem has been studied the most and is the focus of this research. While many studies of the static problem address the objective of minimizing route cost [10]-[12], essentially none address the objective of minimizing time. Yet there are many cases where the need to service all requests quickly is much greater than the need to minimize the length of the route.

Previous work on task allocation of multiple UAVs has shown that obtaining the optimal policy for relevant problems can be computationally overwhelming as the number of states and controls becomes very large. Therefore, the scalability of the solution to any type of this problem is of great interest. Optimal methods for the minimum distance PDP have only solved problems with up to 15 requests. However, heuristic algorithms have found solutions with up to 250 requests and 1 vehicle [13]. Previous work as a result of this research effort has addressed this problem of minimizing wait time [14] using a heuristic algorithm and has shown effective results (within 5-15% of optimal) for small cases in a fraction of the computational time.

Small UAVs have limited power available onboard and, consequently, must make careful decisions about how to best utilize power for communication. This motivated our investigation of optimal approaches to resolve some of the “communication bottle-necks” inherent in today's cooperative UAV networks. Furthermore, a simple heuristic solution was implemented to solve small static cases of this problem [14]. The objective of this research is to find a near-optimal solution using a heuristic algorithm for the static case that is scalable and can be altered for the dynamic, unconstrained PDP. Further development of this work motivated the need for an alternative heuristic, SMART, to accomplish this (see Section IV). As a result, this paper presents the motivation, the SMART heuristic strategy, and the corresponding results and analysis.

III. Problem Formulation

In this scenario, it is required that multiple UAVs travel beyond their range of communication abilities to

collect information and return to the communication range to be able to transmit it back to command and control. In

these situations, it is necessary to gather this information as quickly as possible; e.g. for tactical reasons or due to

cost. The problem then becomes how to allocate these UAVs such that they collect data from a number of targets

and return home. Again, this problem can be formulated as a Pickup and Delivery Problem (PDP) with additional

assumptions. For a given scenario, the number of targets, the number of UAVs available, and their respective

locations can vary. Therefore, to allow for any circumstance, these parameters are not restricted (it is not assumed

that the number of targets is greater than the number of UAVs or vice versa). At this point, there is one

communication tower with a given range that allows for transmission of data.

A. General PDP

This problem is formulated such that M vehicles are allocated to service N requests [11]. Each of these requests, $i = 1, \dots, N$, is characterized by:

$q_i$    Load size of request i
$i^+$    Origin of request i
$i^-$    Destination of request i
$V$    Set of all points containing origins and destinations of transportation requests

while each vehicle, $k = 1, \dots, M$, is characterized by:

$Q_k$    Load capacity of vehicle k
$k^+$    Starting point of vehicle k
$k^-$    Ending point of vehicle k
$W$    Set of all points containing starting and ending points of vehicles

For all $i, j \in V \cup W$, the associated travel from i to j can be characterized by:

$d_{ij}$    Travel distance from i to j
$t_{ij}$    Travel time from i to j
$c_{ij}$    Travel cost from i to j

A pickup and delivery route $R_k$ for vehicle k is a route through $V_k$, a subset of $V$, such that: the route starts and ends at $k^+$ and $k^-$ respectively, $i^+$ needs to be visited before $i^-$ for all requests i with $i^+, i^- \in V_k$, each location in $V_k$ gets visited exactly once, and the vehicle load never exceeds $Q_k$. Additionally, a PDP plan is a set of routes $R = \{R_k \mid k = 1, \dots, M\}$ such that: $R_k$ is a route for vehicle k for each $k = 1, \dots, M$ and $\{V_k \mid k = 1, \dots, M\}$ is a partition of $V$. Given a plan R, the associated cost is given by an objective function $J_T(R)$. Then, the pickup and delivery problem becomes:

$$\min \{ J_T(R) \mid R \text{ is a PDP plan} \}$$

B. Variables

Four different variables are associated with the general pickup and delivery problem:

$z_i^k$    Equal to 1 if request i is assigned to vehicle k and 0 otherwise
$x_{ij}^k$    Equal to 1 if vehicle k travels from i to j and 0 otherwise
$D_i$    Departure time at vertex i
$y_i$    Load of vehicle arriving at request i

The pickup and delivery problem becomes (where R is a feasible pickup and delivery plan and $J_T$ is the objective function as defined in part E):

$$\min J_T(R)$$

C. Constraints

The minimization of JT(x) is subject to the following constraints:

for all (1)
for all (2)
for all (3)
for all (4)
for all (5)
for all (6)
for all (7)
for all (8)
for all (9)
for all (10)
for all (11)
for all (12)
for all (13)
for all (14)

Equation (1) ensures that each request is assigned to exactly one vehicle. Equation (2) ensures that each vehicle only travels to a request location if that request is assigned to that vehicle. Equations (3) and (4) ensure that each vehicle starts and ends at the correct locations. Equations (5)-(8) form the precedence constraints and equations (9)-(12) form the capacity constraints.

D. Assumptions

To allow for a good comparison across solution methodologies, all UAVs and targets are considered homogeneous. That is, all UAVs have similar speeds, sensors, and communication abilities, and all targets only need to be serviced (visited); there is no need for loitering, approaching from specific directions, or sweeping the area for targets. This is done to validate the heuristic solution and show the feasibility of this approach. These assumptions will be relaxed in the future as we move towards realization. Additionally, collision avoidance is considered a non-issue as the UAVs could operate at different altitudes.

In this effort, it is also assumed that all targets are known at the initial time t = 0 and that no new targets are initiated throughout the simulation. At this point, communication within the tower range is homogeneous and instantaneous: there is no need for the UAV to linger within the area to transmit data, and data transmission is equivalent at all points in the range. The capacity of the vehicles is unlimited. In addition to this, no flight dynamics are considered, and therefore, the UAVs simply need to fly straight paths to and from the targets (or communication range). This allows the target locations to be the corresponding states of the UAVs, and the cost to go from state to state to be the distance between locations.

E. Objective Function

An additional variable (not associated with the general PDP) is assigned for this problem that allows each

UAV to decide whether to visit the communication range before visiting the next request. The go/no go decision to

visit the communication area is represented by u and can be either 0 or 1. Therefore, given a decision v at a certain

stage, the UAV would also need to make the decision u of whether to go directly to the next target (u = 0) or

whether to go through the communication range to transmit data (u = 1). This is such that data can be transmitted

back to the home base periodically throughout the mission.

uk    Decision to visit communication range; uk ∈ {0, 1} for all k = 1…M

The cost function for this problem is formulated so that the total time that it takes for all the targets to be visited and then delivered to the communication range is minimized. Therefore, for multiple UAVs, the total service time cost is the sum of the cost of each individual UAV (Jk):

$$J_T = \sum_{k=1}^{M} J_k \qquad (15)$$

Here, the cost for each UAV (Jk) becomes a function of the targets assigned to that UAV (nk) and is a sum of the cost for each target (gi):

$$J_k = \sum_{i=1}^{n_k} g_i \qquad (16)$$

Because the total time for a target to be delivered to the communication range depends on when it is picked up and then delivered, the cost for each target is a function of the initial time to get to the first state, the sum of the time until the UAV gets to that state, and the final time to get from there to the communication range. Because of the added decision at each state of whether to visit the communication range or not (uk), the final time could vary. That is, if uk = 1, the final time cost would be the cost to go from the request directly to the tower, and if uk = 0, the final time would accumulate until the next time the decision was to visit the range (uk = 1). Therefore, the cost of each target has the following form (Eq. 17):

$$\min_{u_k} : \; g_i = u_k^{i}\Big[ C_{Ik} + \sum_{l=1}^{i-1} \big( (1-u_k^{l})\,C_{Dk}^{l} + u_k^{l}\,C_{Tk}^{l} \big) + C_{Fk} \Big] + (1-u_k^{i})\, g_{i+1}, \quad i, l = 1, \dots, N \qquad (17)$$

Where the costs to go from one state to the next in the previous equation are denoted by:

CIk Cost for UAV k to go from its initial position to a request

CFk Cost for UAV k to go from a request to a delivery location

CDk Cost for UAV k to go from one request to the next directly

CTk Cost for UAV k to go from one request to the next through the delivery area

As seen in Eq. 17, the cost at each target relies not only on past decisions but also on future decisions. If the decision is to go to the communication range at that state, the cost is equal to the total time elapsed until the UAV has reached that state plus the cost to reach the tower. Otherwise, the cost is equal to the cost at the next state at which the decision is to go to the communication range (uk = 1). Due to this dependency on future decisions, the problem has to be solved backwards (from state N to 1) if solved optimally.
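The dependence on future decisions can be made concrete with a short sketch that evaluates the cost of a fixed route and a fixed set of return decisions by simulating the flight forward in time. It is an illustration under stated assumptions, not the authors' implementation: the communication range is taken to be a disc around the tower, UAVs fly straight lines at unit speed (so time equals distance), delivery happens at the closest point of the range, and the names `closest_in_range`, `uav_cost`, and `total_cost` are hypothetical.

```python
import math


def closest_in_range(p, tower, radius):
    """Nearest point to p inside a disc-shaped range (an assumed shape)."""
    d = math.dist(p, tower)
    if d <= radius:
        return p
    s = radius / d
    return (tower[0] + (p[0] - tower[0]) * s, tower[1] + (p[1] - tower[1]) * s)


def uav_cost(start, targets, returns, tower, radius, speed=1.0):
    """Cost of one UAV: sum over its targets of the time each is delivered.

    targets: ordered target positions; returns: matching 0/1 decisions,
    where 1 means "fly to the communication range right after this target".
    """
    if len(targets) != len(returns):
        raise ValueError("one return decision is needed per target")
    if targets and returns[-1] != 1:
        raise ValueError("the last target must be followed by a delivery")
    pos, clock, carried, total = start, 0.0, 0, 0.0
    for tgt, u in zip(targets, returns):
        clock += math.dist(pos, tgt) / speed  # fly to the target ("pickup")
        pos, carried = tgt, carried + 1
        if u == 1:
            drop = closest_in_range(pos, tower, radius)
            clock += math.dist(pos, drop) / speed  # fly into the range
            pos = drop
            total += carried * clock  # every carried target is delivered now
            carried = 0
    return total


def total_cost(plans, tower, radius):
    """Total cost: sum of per-UAV costs; plans = [(start, targets, returns), ...]."""
    return sum(uav_cost(s, t, r, tower, radius) for s, t, r in plans)


# One UAV at the tower, two targets on a line, range of radius 1.
deferred = uav_cost((0, 0), [(3, 0), (4, 0)], [0, 1], (0, 0), 1.0)
eager = uav_cost((0, 0), [(3, 0), (4, 0)], [1, 1], (0, 0), 1.0)
```

In this small example, carrying the first target while picking up the nearby second one (`deferred`) gives a lower total than returning after each pickup (`eager`), because the second target is not made to wait for a round trip to the range.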

IV. Motivation

A. Cost Function

Not only does this problem formulation have distinct uses in several applications while having hardly been addressed in past literature, but using a minimum-time cost function also has clear advantages over other commonly used cost functions (minimum-distance and min-max-distance). It allows information to be delivered periodically while maintaining a good balance between the distance travelled and the wait time for the requests (Fig. 3). The following charts compare the extra total time that the targets wait until delivery when traversing the min-distance and min-max-distance paths (left), and the extra distance travelled when traversing the min-time and min-max-distance paths (right). The results show a clear benefit to using a minimum-time cost function in this formulation.

Figure 3: Comparison of extra distance travelled and time waited for different cost functions

B. Humans versus Robots

Typically, a human operator would be tasked with allocating targets to UAVs. For scenarios with UAVs,

this operator is normally a military individual. This individual would be extensively trained on the equipment and

briefly trained on the mission objective. As UAVs are being utilized more for missions that are “dull, dirty, or

dangerous,” the means to operate them efficiently becomes critical. It is important that effective algorithms be

developed for vehicles to operate autonomously (or semi-autonomously) so that operators can focus on mission

demands that require a quick response.

Because the computation time to find the optimal solution grows exponentially with the number of targets and UAVs, obtaining the optimal solution is impractical for realistic scenarios. Many effective heuristics have been implemented to solve task allocation problems, and some have been shown to produce near-optimal results for very complicated scenarios. This has been a main motivation for developing good heuristics for the static PDP. In addition, with a good target insertion method in the dynamic environment, these heuristics could be altered slightly so that they perform effectively under an infinite-horizon policy.

While the previously developed heuristic [14] can produce good results when compared to the optimal solution (within 5-15% in this case), it is important for the heuristic algorithm to perform adequately when compared to human operators. That is, we don't want to replace a human with a computer that cannot produce comparable results. Therefore, a study was done so that the heuristic algorithm could be tested against the optimal and human performance. In this study, fourteen people were asked to allocate UAVs given a mission objective. Each of the people who completed the survey is currently doing research in the area of Control Sciences and has at least a Bachelor of Science from one of the following backgrounds: Mechanical Engineering, Aerospace Engineering, Electrical Engineering, and Mathematics. They were given directions, a mission objective, and an example to follow (below).

Mission Objective:

To minimize the total service time (pickup and delivery) of all the targets using your UAVs (green

triangles). That is, the sum of the times each target (red squares) waits to be picked up and delivered.

Example Problem & Optimal Solution:

Example Problem (Left) and its Optimal Solution (Right)

UAV 1: Targets (1 – N): __2__ __4__ __5__ __3__ _____ _____ _____ _____

Return (0, 1): __0__ __0__ __0__ __1__ _____ _____ _____ _____

UAV 2: Targets (1 – N): __8__ _____ _____ _____ _____ _____ _____ _____

Return (0, 1): __1__ _____ _____ _____ _____ _____ _____ _____

UAV 3: Targets (1 – N): __6__ __7__ __1__ _____ _____ _____ _____ _____

Return (0, 1): __0__ __1__ __1__ _____ _____ _____ _____ _____

Each person was given a set of 10 problems and asked to spend between 45 and 60 seconds on each problem. Problems were given at three difficulties: “easy,” “medium,” and “hard.” Each problem was generated at random and was categorized solely on the number of targets and UAVs (the difficulty was not based on how obvious the answer might be). The three easy problems had 2 UAVs and 5 targets, the four medium problems had 3 UAVs and 8 targets, and the three hard problems had 3 UAVs and 15 targets. While the optimal solution was obtained for the easy and medium problems, the “optimal solution” for the hard problems was based on the best solution found by anybody. That is, the computational time to find the optimal solution was prohibitive, and the best solution found was used in these cases for comparison.

Figure 4: Heuristic, Optimal, & Human Survey Results for Static PDP

The above plots (Fig 4) show the compiled results for the survey for all fourteen people, the average

person, the optimal solution, and the previous heuristic solution [14]. The score for each is based on the average

cost per target in each problem. Therefore, the score for each problem is the total cost of the solution divided by the

number of targets, and the average score is given over all ten problems. Because the objective is to minimize the total cost, the lowest cost per target indicates the best solution. As expected, the optimal solution performs the best overall. While the heuristic solution does not do as well as the optimal solution or better than the average person, it is comparable, and it does better than most people. Additionally, all solutions perform on about the same scale.

It is important to note that in these particular examples, the heuristic is operating at worse than its average

performance. That is, for the examples given, the heuristic performs worse than it would do on average given a very

large sample. Additionally, the people being used for comparison are expected to perform better than the average

military trained operator. The people polled have been working on similar task allocation problems for anywhere

between 5 and 30 years. Therefore, it is expected that the heuristic would still do better than the typical military

operator. However, it is clear that the original algorithm had room for improvement, and this study gave a good

metric of what to aim at for future heuristics (i.e. the SMART algorithm).

C. Other Considerations

As stated previously, thus far both an optimal solution and a heuristic solution have been developed for cooperative UAVs to collect N targets and return the information within the range in a static environment. The first heuristic approach mimicked the behavior of the optimal solution through a two-stage process: request clustering and vehicle routing. This algorithm defined clusters based on the Euclidean distance between targets and used a greedy approach for vehicle routing (that is, each vehicle was assigned to targets within a given cluster based on the minimum distance to the next target). The solutions obtained showed that the heuristic solution often duplicates the optimal solution and is typically within about 10% error. In addition to this, the time to compute the solution was reduced significantly and was shown to increase linearly with the number of targets as opposed to exponentially. This allows for scaling as the problem gets more complicated. As stated in the previous section, it was believed that the heuristic had room for improvement, and therefore, several strategies were explored to do so. The results of this work not only showed some interesting behavior, but also motivated the approach for the SMART heuristic.
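The two-stage idea of the first heuristic can be sketched as follows. The clustering rule used in [14] is not detailed here, so the sketch substitutes a simple distance-threshold grouping as a stand-in (an assumption), followed by the greedy nearest-target routing described above. The names `cluster_targets`, `greedy_route`, and the parameter `max_gap` are hypothetical.

```python
import math


def cluster_targets(targets, max_gap):
    """Group targets so each one is within max_gap of another in its group.

    Assumption: a distance-threshold rule stands in for the clustering step.
    """
    clusters = []
    for t in targets:
        def is_near(cluster):
            return any(math.dist(t, p) <= max_gap for p in cluster)

        near = [c for c in clusters if is_near(c)]
        far = [c for c in clusters if not is_near(c)]
        # Merge every cluster that the new target touches into one cluster.
        clusters = far + [[t] + [p for c in near for p in c]]
    return clusters


def greedy_route(start, cluster):
    """Visit the targets of one cluster, always flying to the nearest one next."""
    remaining, route, pos = list(cluster), [], start
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(pos, p))
        remaining.remove(nxt)
        route.append(nxt)
        pos = nxt
    return route


# Four targets forming two well-separated groups, routed from a UAV at the origin.
groups = cluster_targets([(0, 0), (1, 0), (10, 0), (11, 0)], max_gap=2)
routes = [greedy_route((0, 0), g) for g in groups]
```

The nearest-target rule needs only one pass over the remaining targets per step, which is why this kind of routing is cheap compared with enumerating every visiting order.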

The first study conducted addressed the initial assignment of UAVs to clusters. Initially, the assignments were done in a greedy fashion: each UAV maximized the ratio of cluster information to distance from the cluster, and the UAVs were then assigned consecutively, starting with the UAV with the greatest ratio. In an attempt to improve on the algorithm, all initial assignments were done simultaneously, optimizing the initial assignments jointly over all the UAVs. However, the results not only showed no improvement, they showed that assigning vehicles this way was either no different or, more often, worse (Fig. 5). More generally, this study confirmed that better solutions are obtained by assigning UAVs in a greedy fashion rather than by minimizing distance over all UAVs.
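A minimal sketch of the greedy initial assignment is given below. How “cluster information” and “distance from the cluster” are measured is not specified in this section, so the sketch assumes that information is the number of targets in a cluster and that distance is measured from the UAV to the cluster centroid. The function names `centroid` and `greedy_assign` are hypothetical, and clusters are assumed to be non-empty.

```python
import math


def centroid(cluster):
    """Mean position of a non-empty cluster of (x, y) targets."""
    xs, ys = zip(*cluster)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def greedy_assign(uavs, clusters):
    """Map UAV index -> cluster index, best information/distance ratio first.

    Assumptions: information = number of targets in the cluster,
    distance = UAV-to-centroid distance.
    """
    free_uavs = set(range(len(uavs)))
    free_clusters = set(range(len(clusters)))
    assignment = {}
    while free_uavs and free_clusters:
        best = None
        for k in sorted(free_uavs):
            for c in sorted(free_clusters):
                dist = math.dist(uavs[k], centroid(clusters[c]))
                ratio = len(clusters[c]) / max(dist, 1e-9)  # guard against zero distance
                if best is None or ratio > best[0]:
                    best = (ratio, k, c)
        _, k, c = best
        assignment[k] = c  # commit the single best pair, then repeat
        free_uavs.remove(k)
        free_clusters.remove(c)
    return assignment


# Two UAVs and two clusters; the UAV with the greatest ratio is assigned first.
pairs = greedy_assign([(0, 0), (1, 0)], [[(2, 0)], [(3, 0), (3, 1), (3, -1)]])
```

Each pass commits only the single best UAV-cluster pair, so the assignment of one UAV never waits on a joint optimization over all of them, which keeps the step simple to repeat when new targets appear.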

Figure 5: Average Percent Greater Time Waited to Optimize all Vehicle's First Assignments over a Greedy Approach

The second study that produced interesting results was based on an algorithm that formed loose sets of

clusters and then evaluated whether it was beneficial to add other targets to the current cluster. Essentially, the most

effective form of this algorithm assigned the first UAV to a loose cluster by maximizing the cluster information to

distance from the cluster ratio (similar to before) and then evaluated whether it was better to add any other targets,

including other clusters, to its route. Several strategies were used to evaluate whether it was more beneficial to pick

up an extra target (or cluster), but the “best” one compared adding any cluster to the current cluster versus not

adding it and picking it up immediately after. Alternatively, the second “best” option was to compare the

information return to distance ratio. These were considered good options, because they were able to evaluate a

“good” decision without taking into account the state of other UAVs, thereby reducing complexity of decisions and

consequently, computational cost. In both of these situations, it was found that while it may be better in the short

term, often the overall solution was worse (Fig. 6). This is due to the fact that though some targets would be

delivered quicker in the short term, it would mean all the targets still waiting to be picked up would wait longer.

Other important consequences of this strategy showed that it is better to add targets to clusters when the ratio of UAVs to targets is small and vice versa. Therefore, it would be better to form larger clusters when there are not many UAVs available and smaller clusters when there are.

Figure 6: Average Percent Greater Time Waited to Form Loose Clusters and Add Targets over the Original Algorithm

Several important conclusions could be drawn from these studies and have affected the formulation of the

SMART algorithm. By inserting targets into clusters that had yet to be assigned, it was found that typical methods

used to evaluate these decisions are ineffective. While the decision may seem better in the immediate sense, often it

was worse for the overall solution. Additionally, the study showed it is better to have a good clustering algorithm from the start than to form loose clusters and add targets afterward, and that these clusters vary depending on the ratio of targets to vehicles. That is, the clustering algorithm needs to be adaptable to the given scenario specifications.

Finally, the study comparing initial assignments showed that in a minimum time sense it is better to do this vehicle

assignment in a greedy fashion. Not only does this make vehicle assignment simpler and reduce computational time, but the vehicle routing also lends itself to easy extension to a dynamic scenario (the future goal of this research).

V. Solution Methodology

The SMART heuristic algorithm is developed here with the motivation to minimize the time that each

target has to wait before delivery. Additionally, an optimal solution is described that is used for comparing the

performance of the heuristic solution. Here, the optimal solution is found using a brute force method that searches

through all possible permutations, a trivial but effective method. It was shown in previous work [14] that as the number of UAVs increases, the time it takes to compute the optimal result also increases, regardless of the particular solving method. This is due to the exponentially increasing number of possibilities. Therefore, the

alternate solution is offered that is based on heuristics that mimic the optimal solution. This heuristic solution is

broken down into two stages: request clustering and vehicle routing. Both of these solutions have an objective to

collect information from the targets as quickly as possible and transmit it back to base in a timely manner.

A. Optimal Solution

The optimal solution is found here by direct enumeration: every candidate solution is evaluated, and the one that minimizes the objective function (Eq. 17) is selected. This approach is used for its directness and its guarantee of finding the best combination. It was also chosen because the cost at one state in the cost function relies not only on past decisions but also on future decisions. This dependence makes it difficult to eliminate poor solutions early, because the best solution isn't always obvious.

A brute force methodology, the one used here to find the optimal policy, involves accounting for all possibilities. For a single UAV, the number of possibilities increases with the number of targets: $\# = N!$. With the added communication constraint, the number of possibilities becomes $\# = N! \cdot 2^{N-1}$ when the UAV can decide to visit the communication range or not at each target. For multiple UAVs, each UAV can be assigned from 0 to N targets. For each number of targets assigned, the specific target assignment can vary. For a random scenario, the number of targets assigned, which targets are assigned, and the order in which they're visited can all vary based on the locations of the UAVs and the targets. The total number of possibilities for this scenario is explored through an analogy:

How many ways can you put N balls in M bins?

$$\# = \binom{N+M-1}{M-1} = \prod_{k=1}^{M-1} \frac{N+k}{k} \qquad (18)$$

How many ways can we arrange the Balls 1-N when we know M1- M5?

$$\# = \prod_{j=1}^{M} \binom{m_j}{n_j}, \qquad m_j = N - \sum_{i=1}^{j-1} n_i, \qquad N = \sum_{i=1}^{M} n_i \qquad (19)$$

Using a brute force method and searching through all possible ways to “put N balls in M bins” and the

number of ways to “arrange the Balls 1-N when we know M1- M5” implies that the execution time to find the

optimal solution will increase exponentially as the environment gets more complex. Therefore, scalability is the

main motivation for finding an alternate solution based on heuristics.
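To give a feel for how quickly these counts grow, the following is a minimal Python sketch of the single-UAV counts quoted above and of the balls-in-bins count. The function names are illustrative only, and the balls-in-bins count assumes the standard stars-and-bars result for indistinguishable balls, C(N+M-1, M-1). The sketch counts possibilities; it does not enumerate or evaluate the routes themselves.

```python
from math import comb, factorial


def single_uav_orders(n_targets: int) -> int:
    """Visit orders for one UAV without the communication decision: N!."""
    if n_targets < 1:
        raise ValueError("need at least one target")
    return factorial(n_targets)


def single_uav_orders_with_comm(n_targets: int) -> int:
    """Count quoted in the text for the communication constraint: N! * 2^(N-1)."""
    if n_targets < 1:
        raise ValueError("need at least one target")
    return factorial(n_targets) * 2 ** (n_targets - 1)


def target_count_splits(n_targets: int, n_uavs: int) -> int:
    """Assumed stars-and-bars count of ways to split N balls among M bins."""
    if n_uavs < 1:
        raise ValueError("need at least one UAV")
    return comb(n_targets + n_uavs - 1, n_uavs - 1)


if __name__ == "__main__":
    # Print the growth for a few problem sizes with 3 UAVs.
    for n in (2, 4, 6, 8):
        print(n, single_uav_orders(n), single_uav_orders_with_comm(n),
              target_count_splits(n, 3))
```

Even at 8 targets the single-UAV count with the communication decision already exceeds five million, which is consistent with the scalability concern raised above.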

B. SMART Heuristic

An alternate solution using heuristics is proposed that is constructed by analyzing the optimal solution.

These heuristics are based on how targets are grouped between visits to the communication tower (clustered) and

how these clusters are assigned to UAVs. These heuristics are similar to other 2-phase heuristic solutions to Vehicle

Routing Problems (VRPs) where the solution is decomposed into two natural phases: 1) clustering of target vertices

into feasible routes and 2) actual construction of routes. Here, the heuristic solution is a cluster-first, route-second algorithm, but it differs from conventional VRP solutions in that the number of clusters is not necessarily equal to the number of vehicles. As a result, clusters tend to be much smaller and less intuitive. A unique approach to clustering is therefore adopted to form clusters, and routes are then constructed for each vehicle until all clusters are visited.

Clustering

As stated in the previous section, the key to an effective solution using a cluster-first, route-second heuristic

for this problem is correctly defining targets that belong to similar clusters. Therefore, the SMART heuristic uses a routing technique similar to that used in previous related research [14], but adopts a new, smarter clustering method.

Here, a Hierarchical Agglomerative Clustering (HAC) algorithm is suggested to build clusters of targets. Different

from flat clustering (where a flat set of clusters is created without any explicit structure that would relate clusters to

each other), HAC algorithms are typically deterministic and do not require the user to define the number of clusters

at the start. HAC algorithms build clusters based on a measure of similarity (e.g. Euclidean distance) in a tree

structure (Fig. 7) until some cut-off measure is reached. Once clusters were assigned based on their similarity, a

similarity cutoff measure was found empirically. In this case, multiple values were tested and refined until the best

overall performance was found (see Results & Analysis section for more details).
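The merge-until-cutoff idea can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in this work: the linkage rule (single linkage), the function names, and the toy one-dimensional similarity function are all assumptions, since the text specifies only that clusters are built from a similarity measure and stopped at a cutoff. In the SMART heuristic the similarity would come from the fuzzy inference system rather than from a distance formula.

```python
from itertools import combinations


def hac_clusters(items, similarity, cutoff):
    """Agglomerative clustering: repeatedly merge the two most similar
    clusters until the best available similarity drops below `cutoff`."""
    clusters = [[item] for item in items]

    def linkage(a, b):
        # Assumed single linkage: similarity of the most similar cross pair.
        return max(similarity(x, y) for x in a for y in b)

    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[i], clusters[j]) < cutoff:
            break  # no remaining pair is similar enough to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters


def toy_similarity(x, y):
    """Hypothetical similarity for points on a line, in (0, 1]."""
    return 1.0 / (1.0 + abs(x - y))


groups = hac_clusters([0.0, 1.0, 1.5, 10.0, 10.5], toy_similarity, cutoff=0.34)
print(groups)
```

Because the number of clusters falls out of the cutoff rather than being fixed in advance, the same routine can return many small clusters or a few large ones depending on how the similarity measure responds to the scenario.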

Figure 7: Original Structure & Corresponding HAC Tree

Because the similarity of requests in this case can depend on a variety of factors (density of targets, number of vehicles, proximity to the communication range, etc.), a Fuzzy Inference System (FIS) is used to create a measure of similarity between requests. Fuzzy logic [15], based on multi-valued logic, provides a method for encoding knowledge about continuous variables: inputs are mapped to outputs through if-then rules that capture heuristic knowledge and human experience (Fig. 8). Experience from the past decade, with the successful marketing of a wide variety of products based on fuzzy logic, has shown that for certain applications this approach can lead to lower development costs, superior features, and better end-product performance. One inherent property of fuzzy logic systems is that they can act as universal approximators. Additionally, such systems can utilize expert heuristic knowledge of the operation of controlled systems, including physical intuition; they can handle uncertainties and nonlinearities; and a variety of tools exist that assist in studying and building efficient fuzzy systems in relatively short times. These advantages have made fuzzy logic systems attractive candidates for use in expert systems.

Figure 8: Fuzzy Inference System

Besides the distance from the communication range and the angle between the targets, the biggest influence on the clustering was found to be the ratio of UAVs to targets. The heuristic showed much larger errors relative to the optimal solutions when this ratio was either very high or very low. Therefore, the FIS used here takes three inputs (the ratio of the distances of the two requests from the communication range, the angle separating the two requests, and the ratio of the number of vehicles to the number of requests) and produces a crisp output for similarity. The ratio of distances and the angle of separation are each described as small, medium, high, or very high. The UAV-to-target ratio is described as small, medium, or high. Finally, the output measure of similarity is described as small (listed as Low in Tables 1-3), medium, high, or very high. The membership functions for the inputs and outputs are shown below (Figs. 9-12).

Rules relating the inputs and outputs of the FIS are set up as if-then statements and are based on heuristics and human experience. The rules can be summed up in some simple decision-making logic. There are 48 rules in total (4 distance-ratio levels × 4 angle levels × 3 UAV-to-target ratio levels), broken up into three situations: when the ratio of UAVs to targets is small, medium, or high. Generally, the rules follow the logic that if the ratio is small, there are many more targets than UAVs and the clusters should be larger. The opposite is also true: if there are many UAVs compared to the number of targets, the clusters tend to be much smaller. The rules implemented in the FIS are summarized in Tables 1, 2, and 3.

Table 1: Similarity Measure given a Small Ratio of UAVs to Targets

(rows: Angle of Separation; columns: Ratio of Distances from Communication Range)

Angle of Separation    Small        Medium       High       Very High
Small                  Very High    Very High    High       Medium
Medium                 Very High    High         High       Medium
High                   High         High         Medium     Low
Very High              Medium       Medium       Low        Low

Table 2: Similarity Measure given a Medium Ratio of UAVs to Targets

(rows: Angle of Separation; columns: Ratio of Distances from Communication Range)

Angle of Separation    Small        Medium       High       Very High
Small                  Very High    High         Medium     Low
Medium                 High         Medium       Medium     Low
High                   Medium       Medium       Low        Low
Very High              Low          Low          Low        Low
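As an illustration of how the rule base is organized, the sketch below encodes the entries of Tables 1, 2, and 3 as a Python lookup keyed by linguistic labels. It covers only the rule consequents: a full FIS would fuzzify the crisp inputs with the membership functions of Figs. 9-12, blend the fired rules, and defuzzify, none of which is reproduced here. The names `RULES`, `LEVELS`, and `similarity_label` are hypothetical.

```python
# Labels shared by the angle-of-separation and distance-ratio inputs.
LEVELS = ("Small", "Medium", "High", "Very High")

# RULES[uav_ratio][angle_index][distance_ratio_index] -> similarity label.
RULES = {
    "Small": [   # Table 1
        ["Very High", "Very High", "High", "Medium"],
        ["Very High", "High", "High", "Medium"],
        ["High", "High", "Medium", "Low"],
        ["Medium", "Medium", "Low", "Low"],
    ],
    "Medium": [  # Table 2
        ["Very High", "High", "Medium", "Low"],
        ["High", "Medium", "Medium", "Low"],
        ["Medium", "Medium", "Low", "Low"],
        ["Low", "Low", "Low", "Low"],
    ],
    "High": [    # Table 3
        ["Very High", "High", "Low", "Low"],
        ["High", "Medium", "Low", "Low"],
        ["Low", "Low", "Low", "Low"],
        ["Low", "Low", "Low", "Low"],
    ],
}


def similarity_label(uav_ratio: str, angle: str, distance_ratio: str) -> str:
    """Return the consequent of the single rule matching three crisp labels."""
    row = LEVELS.index(angle)
    col = LEVELS.index(distance_ratio)
    return RULES[uav_ratio][row][col]


# Total number of rules across the three tables.
n_rules = sum(len(row) for table in RULES.values() for row in table)
print(n_rules, similarity_label("Small", "Small", "Medium"))
```

Reading across the three tables for the same angle and distance labels shows the intended trend: the similarity never increases as the UAV-to-target ratio goes from small to high, so clusters shrink when more UAVs are available.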

Figure 9: Input: Ratio of Distances to the Communication Range of the Two Requests

Figure 10: Input: Angle of Separation between two Requests

Figure 11: Input: Ratio of UAVs to Requests

Figure 12: Output: Similarity Measure

Vehicle Routing

In the static PDP, the assignment of clusters to UAVs is treated as a finite-horizon scenario. Instead of assigning vehicles to clusters simultaneously, each assignment is made individually and consecutively until all targets have been assigned. By assigning vehicles this way, the policy can easily be extended to an infinite-horizon scenario.

At each “stage,” each UAV places a bid on the cluster that it would best service first. Each UAV forms its bid by maximizing the ratio of information (within that cluster) to distance. Then, only one UAV is assigned: the cluster is awarded to the UAV that would arrive there first. These bids are made at the initial stage, and each UAV is subsequently assigned to a new cluster as it arrives back at the communication range. The bidding is done at the first stage to allow for circumstances in which a UAV could drop off one cluster and still pick up the next before a different UAV could reach it. This continues until all clusters are visited.
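One round of this bidding can be sketched as follows. This is a rough Python illustration under several assumptions that are not taken from the text: each cluster is summarized by a single representative point and a scalar information value, all UAVs fly at the same constant speed, and the winning bid is the one with the earliest arrival time across all bids. The name `bidding_round` and the data layout are hypothetical.

```python
from math import dist


def bidding_round(uav_positions, clusters, speed=1.0):
    """Run one assignment stage.

    uav_positions: {uav_id: (x, y)}
    clusters:      {cluster_id: ((x, y), information)}
    Returns (uav_id, cluster_id, arrival_time), or None if nothing to assign.
    """
    if not uav_positions or not clusters:
        return None

    bids = []
    for uav, pos in uav_positions.items():
        # Each UAV bids on the cluster with the best information/distance ratio.
        best = max(
            clusters,
            key=lambda c: clusters[c][1] / max(dist(pos, clusters[c][0]), 1e-9),
        )
        arrival = dist(pos, clusters[best][0]) / speed
        bids.append((arrival, uav, best))

    # Only one UAV is assigned per stage: the one that would arrive first.
    arrival, uav, cluster = min(bids)
    return uav, cluster, arrival


uavs = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
cluster_info = {"c1": ((2.0, 0.0), 1.0), "c2": ((9.0, 0.0), 3.0)}
print(bidding_round(uavs, cluster_info))
```

In a full run, the awarded cluster would be removed, the winning UAV's position would be advanced to its return point in the communication range, and the round would repeat until no clusters remain.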

Once the targets are assigned to each UAV by the previous algorithm, each UAV must decide in what order to visit the targets within its cluster. Because a cluster is defined as a set of targets that a UAV visits together before visiting the communication range at the end, the decision variable $u_k$ is taken out of the problem. More precisely, $u_k$ is known to be 0 for all targets in the cluster except the last. Therefore, to determine the shortest path for a UAV within a cluster, an approximate dynamic programming approach is used: a limited look-ahead policy. This method reduces the computation required by dynamic programming by basing the decision of which target to visit next on only a small number of future stages. Here, the next two stages are taken into account.
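A two-stage look-ahead of this kind might look like the following Python sketch. It is an assumption-laden illustration rather than the policy used in this work: travel distance stands in for the wait-time cost, the return to the communication range is simplified to a single point (`comm_point`), and the second stage is scored by the cheapest possible follow-on leg. The function name is hypothetical.

```python
from math import dist


def two_step_lookahead_order(start, targets, comm_point):
    """Order the targets of one cluster by looking two legs ahead.

    At each step the next target is the one minimizing
    (leg to that target) + (cheapest following leg), where the following
    leg goes to another remaining target or, for the last target, to
    the communication point.
    """
    remaining = list(range(len(targets)))
    current = start
    order = []
    while remaining:
        def two_leg_cost(i):
            a = targets[i]
            second = min(
                (dist(a, targets[j]) for j in remaining if j != i),
                default=dist(a, comm_point),  # last target: return leg
            )
            return dist(current, a) + second

        nxt = min(remaining, key=two_leg_cost)
        order.append(targets[nxt])
        remaining.remove(nxt)
        current = targets[nxt]
    return order


route = two_step_lookahead_order((0.0, 0.0), [(5.0, 0.0), (1.0, 0.0), (2.0, 0.0)], (0.0, 0.0))
print(route)
```

Each decision costs on the order of the square of the number of remaining targets, which is far cheaper than enumerating every ordering of the cluster, at the price of occasionally missing the true shortest path.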

VI. Results & Analysis

A. Similarity Measure

A similarity cutoff measure for this clustering technique was found empirically. In this case, multiple

values were tested and refined until the best overall performance was found. Initially, broad values (0.1, 0.2, 0.3,

0.4, 0.5, 0.6, and 0.7) were tested to get a rough estimate of the best similarity cutoff (Fig. 13). For this comparison,

3 UAVs were used and 1000 random cases were tested. It was found that a similarity value between 0.3 and 0.4

(Fig. 13) minimized the cost. Following this, values in this range (0.32, 0.33, 0.34, 0.35, 0.36, 0.37, and 0.38) were

tested to get a more exact value for 1, 3, and 5 UAVs (Figs. 14-16). The figures show magnified detail around a point common to each graph. The common trend of the curves showed that the minimum cost was at a value between 0.34 and 0.36. The final value used for the clustering method was 0.34, because it was most consistently among the values that minimized the cost over the set tested.

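Table 3: Similarity Measure given a High Ratio of UAVs to Targets

(rows: Angle of Separation; columns: Ratio of Distances from Communication Range)

Angle of Separation    Small        Medium       High       Very High
Small                  Very High    High         Low        Low
Medium                 High         Medium       Low        Low
High                   Low          Low          Low        Low
Very High              Low          Low          Low        Low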

B. Cost Function Comparison

The main motivation for proposing a solution based on heuristics was to improve upon the execution time

while still maintaining a good approximation of the optimal solution. In addition, the SMART heuristic is expected to outperform the original algorithm, and ideally it should be scalable and operate nominally under uncertainty; that is, the solution should hold as environments become more complex and should not fail under uncertainty. While uncertainty has not yet been explored in this research, robustness to it is an important asset for a multiple-UAV, multiple-target scenario and will be researched in the future.

In the figure below (Fig. 17), the average percent greater time waited for the SMART algorithm over the optimal solution is shown for 2 to 8 targets and 1 to 5 UAVs. As seen, the algorithm gives near-optimal results. The examples also show the performance of the algorithm when the ratio of UAVs to targets is both large and small. Similar to previous results, the cost for the heuristic solution follows the same trend as the optimal solution: the average cost for a scenario generally decreases as UAVs are added and increases with the number of targets. Also, the cost of the heuristic solutions is on the same scale as the optimal solutions and within about 5% of the optimal solution cost. In addition, the algorithm performs consistently; that is, there is little erratic behavior in the cost performance. Additionally, the cost per target (Fig. 18) is relatively consistent as the number of

Figure 13: Rough Similarity Comparison (3 UAVs)

Figure 14: Similarity Comparison with 1 UAV

Figure 15: Similarity Comparison with 3 UAVs

Figure 16: Similarity Comparison with 5 UAVs

targets increases. While it is increasing, the cost appears to increase at a decreasing rate. This implies that the algorithm maintains its performance as the problem gets more complicated. Overall, the heuristic-based solution provides excellent results in significantly less execution time than the optimal solution.

Figure 17: SMART Average Percent Error

Figure 18: SMART Average Cost Per Target

Because optimal solutions for problems with more than 8 targets require significant computational time, performance against the optimal solution was only compared up to this point. However, it is important that the SMART algorithm outperform the original algorithm as the scenario gets more complex. The comparison between the two algorithms is shown below for scenarios up to 5 UAVs and 25 targets (Fig. 19). As seen, the SMART algorithm does much better, showing an average improvement over the original heuristic of about 2-4%.

Figure 19: Average Percent Less Time Waited for the SMART Algorithm over the Original Algorithm

C. Execution Time

One of the main objectives for any heuristic algorithm is not just to maintain near-optimal performance, but to do so in a fraction of the computational time. Previous work [14] has shown that execution times were significantly lower for the heuristic algorithms and appeared to increase linearly. Because the execution time for this algorithm was so small, it was clear that there was some leeway. However, because the

[Plot: SMART Average Percent Greater Time Waited; x-axis: # of Targets (2 to 8); y-axis: Percent Greater (0 to 50); series: 1 UAV through 5 UAVs]

[Plot: SMART Average Cost Per Target; x-axis: # of Targets (5 to 25); y-axis: Cost in thousands (0 to 3); series: 1 UAV through 5 UAVs]

[Plot: Average Cost Improvement on Original Heuristic; x-axis: # of Targets (2 to 10); y-axis: Percent Improvement (-2 to 10); series: 1 UAV through 5 UAVs]

new clustering technique is on the order of $N^2$, the execution time for the SMART algorithm (Fig. 20) increases accordingly. Nevertheless, the execution time is still significantly reduced and appears to have little relationship to the number of UAVs (the order of the algorithm is independent of M, as expected).

Figure 20: Execution Time for a Heuristic Solution

VII. Conclusions & Future Work

In this work, a solution strategy based on heuristics, SMART, was proposed that minimized the total time

that all the targets have to wait to be picked up and delivered to a communication range. The wait-time for each

target is based on the decision of which target to visit next and the decision at each stage of whether to visit the

communication range or not. The optimal results were used to generate a solution methodology based on heuristics

by mimicking the optimal solutions. This algorithm was motivated by a study of human performance against the

original heuristic. The original algorithm (from previous work) produced good results but, more importantly, showed that an effective clustering algorithm is crucial to obtaining near-optimal results. Therefore, a new clustering algorithm based on HAC and fuzzy logic was implemented.

The SMART algorithm was compared to both the optimal solutions and the previous heuristic for small cases, and to the previous heuristic alone for larger cases. It was shown that the SMART algorithm performed near optimally, within 5% of the optimal for all the cases tested. Additionally, the SMART algorithm improved on the previous heuristic, generally performing about 2-4% better on average. Furthermore, the cost per target for the SMART algorithm, while increasing with the number of targets, increases at a decreasing rate, implying that the algorithm is somewhat scalable. This is expected, as the clustering algorithm takes into account the size of the problem (M UAVs and N targets). Finally, the execution time was measured and, as expected, increased on the order of $N^2$.

Because most Pickup and Delivery Problems are dynamic (including this specific example involving UAVs), there is increasing interest in them. However, relatively little work has been done on them due to their complexity. Using the SMART algorithm to solve the initial scenario, a solution to the dynamic problem can be developed; this will be the future direction of this research. Because each

target is added to a cluster, and each vehicle is assigned to a cluster, individually and consecutively until they have

been assigned, the policy can easily be extended to an infinite horizon scenario. However, as shown in this work,

typical target insertion techniques can fail under a minimum-time objective. Once a good target insertion technique

is developed, targets can be assigned to a cluster and vehicle routing will remain unchanged.

[Plot: Average Execution Time of SMART Heuristic; x-axis: # of Targets (1 to 11); y-axis: Execution Time in secs (0 to 0.2); series: 1 UAV through 5 UAVs]

Acknowledgments

C. Sabo thanks Derek Kingston at the Wright-Patterson Air Force Research Lab for his contribution, kind

support, and guidance in this research endeavor.

C. Sabo acknowledges that this work was partially funded by the Air Force Summer Faculty Fellowship

Program (SFFP).

References

1. Shima, T., and Rasmussen, S., Editors, “UAV Cooperative Decision and Control: Challenges and Practical Approaches,” Advances in Design and Control, SIAM, 2009.

2. Fahlstrom, P. G., and Gleason, T. J., “Introduction to UAV Systems,” 3rd Edition, UAV Systems Inc., 2009.

3. Smith, S. L., Pavone, M., Bullo, F., Frazzoli, E., “Dynamic Vehicle Routing with Heterogeneous Demands,” 47th IEEE Conference on Decision and Control, pp. 1206-1211, Dec, 2008.

4. Waisanen, H., Shah, D., Dahleh, M., “A Dynamic Pickup and Delivery Problem in Mobile Networks Under Information Constraints,” IEEE Transactions on Automatic Control, Vol. 53, No. 6, pp. 1419-1433, Jul, 2008.

5. Nanry, W., Barnes, J. W., “Solving the Pickup and Delivery Problem with Time Windows using Reactive Tabu Search,” Transportation Research Part B, Vol. 34, pp. 107-121, Jul, 2000.

6. Ropke, S., Pisinger, D., “An Adaptive Large Neighborhood Search Heuristic for the Pickup and Delivery Problem with Time Windows,” Transportation Science, Vol. 40, No. 4, pp. 455-472, Nov, 2006.

7. Li, H., Lim, A., “A Metaheuristic for the Pickup and Delivery Problem with Time Windows,” IEEE 13th International Conference on Tools of Artificial Intelligence, pp. 160-167, Nov, 2001.

8. Cordeau, J.-F., Laporte, G., “The Dial-a-Ride Problem (DARP): Variants, Modeling Issues, and Algorithms,” 4OR: Quarterly Journal of the Belgian, French, and Italian Operations Research Societies, Vol. 1, pp. 89-101, 2002.

9. Feuerstein, E., Stougie, L., “On-Line Single-Server Dial-a-Ride Problems,” Theoretical Computer Science, Vol. 268, pp. 91-105, 2001.

10. Berbeglia, G., Cordeau, J.-F., Laporte, G., “Dynamic Pickup and Delivery Problems,” European Journal of Operational Research, Vol. 202, pp. 8-15, Nov, 2009.

11. Savelsbergh, M. W. P., Sol, M., “The General Pickup and Delivery Problem,” Transportation Science, Vol. 29, pp. 17-29, 1995.

12. Berbeglia, G., Cordeau, J.-F., Gribkovskaia, I., Laporte, G., “Static Pickup and Delivery Problems: A Classification Scheme and Survey,” Top, Vol. 15, pp. 1-31, Apr, 2007.

13. Parragh, S. N., Doerner, K. F., Hartl, R. F., “A Survey on Pickup and Delivery Problems. Part II: Transportation Between Pickup and Delivery Locations,” Journal für Betriebswirtschaft, Vol. 58, No. 2, pp. 81-117, 2008.

14. Sabo, C., Kingston, D., and Cohen, K., “Minimum Service Time for UAV Cooperative Control Subject to Communication Constraints,” Air Force Research Laboratory, Accepted for presentation at the 2010 AIAA Infotech@Aerospace, Atlanta, Georgia, 20-22 April 2010.

15. Zadeh, L. A., “Fuzzy Sets,” Information and Control, Vol. 8, pp. 338-353, 1965.
