Cooperative UAV Search and Intercept
by
Andrew Ke-Ping Sun
A thesis submitted in conformity with the requirements for the degree of Master of Applied Science
Graduate Department of Aerospace Science and Engineering
University of Toronto
Copyright © 2009 by Andrew Ke-Ping Sun
Abstract
Cooperative UAV Search and Intercept
Andrew Ke-Ping Sun
Master of Applied Science
Graduate Department of Aerospace Science and Engineering
University of Toronto
2009
In this thesis, a solution to the multi-Unmanned Aerial Vehicle (UAV) search and intercept problem for a moving target is presented. For the search phase, an adapted
diffusion-based algorithm is used to manage the target uncertainty while individual
UAVs are controlled with a hybrid receding horizon / potential method. The coordi-
nated search is made possible by an uncertainty weighting process. The team intercept
phase algorithm is a behavioural approach based on the analytical solution of Isaacs'
Single-Pursuer/Single-Evader (SPSE) homicidal chauffeur problem. In this formulation,
the intercepting control is taken to be a linear combination of the individual SPSE con-
trols that would exist for each of the evader/pursuer pairs. A particle swarm optimizer
is applied to find approximate optimal weighting coefficients for discretized intervals of
the game time. Simulations for the team search, team intercept and combined search
and intercept problem are presented.
Acknowledgements
First and foremost, I would like to thank my research supervisor Professor Hugh Liu. This
thesis would not have been possible without his patience, guidance, and his willingness
to always make time for his students regardless of how crowded his schedule becomes. I
would also like to thank the two other professors on my research assessment committee:
Professors Peter Grant and Chris Damaren for their helpful suggestions. A big thank
you goes out to all my fellow lab mates in the Flight Systems and Control Group for
making my time at UTIAS so enjoyable with special thanks going out to Ruben, Eric,
Yoshi, Sohrab and Keith.
On a more personal note, I would like to recognize those outside of my academic life
who have supported me throughout the past two years. To my girlfriend Ada, my two
brothers Mark and Christopher, my sister Stephanie, my grandparents, and my father, I
am forever grateful for your guidance and words of encouragement. Last but not least, I
owe my deepest gratitude to my mother. Her perseverance, unconditional support, and
mental toughness have been truly inspirational, and for that I dedicate this thesis to her.
Contents
1 Introduction
1.1 Purpose of Study
1.1.1 Problem Overview
1.2 Thesis Layout
2 Literature Survey
2.1 Cooperative UAV Search
2.2 Cooperative UAV Pursuit and Evasion
2.3 Thesis Contribution
3 Background
3.1 Optimization Methods
3.1.1 Nelder-Mead Simplex Algorithm
3.1.2 Particle Swarm Optimization
3.2 Game Theory
3.2.1 The Classic Pursuit and Evasion
3.2.2 Isaacs' Pursuit and Evasion Problem and Solution
3.3 Summary
4 Cooperative UAV Search
4.1 Problem Formulation
4.1.1 Grid World Representation
4.1.2 Target Uncertainty
4.2 Target Behaviour
4.3 Search Behaviour
4.3.1 Potential Method
4.3.2 Receding Horizon Method
4.3.3 Hybrid Method
4.3.4 Multi-UAV Coordination Algorithm
4.4 Benchmark Comparison
4.5 Summary
5 Cooperative UAV Intercept
5.1 Problem Formulation
5.2 Method
5.2.1 Evasion Behaviour
5.2.2 Pursuit Team Behaviour
5.2.3 Optimizer
5.2.4 Non-cooperative and Cooperative Simulations
5.3 Simulation Cases and Results
5.3.1 Non-Cooperative Chase Simulations
5.3.2 Cooperative Chase Simulations
5.4 Summary
6 Search and Intercept Simulation Results
6.1 Initial Conditions
6.2 Search Phase
6.3 Intercept Simulation
6.4 Summary
7 Conclusions
7.1 Future Work
Bibliography
List of Tables
4.1 Comparison of cooperative and non-cooperative search
4.2 Comparison of cooperative hybrid and Zamboni exhaustive search
5.1 Comparison of evasion capture times for simulations 1-A, 1-B and 1-C
5.2 Comparison of evasion capture times for simulations 3-A and 3-B
List of Figures
1.1 Overview of UAV cooperation task: Searching.
1.2 Overview of UAV cooperation task: Interception.
3.1 Simplex in 2D search space.
3.2 Nelder-Mead Simplex Operations.
3.3 Isaacs' SPSE regions.
3.4 Region 1: The primary path.
3.5 Region 2: The universal path.
3.6 Region 3.
4.1 Grid approximation.
4.2 Uncertainty represented by gridded map.
4.3 Moving-target implication of time-varying uncertainty.
4.4 Finite element diffusion process.
4.5 Diffusion modification 1: Maintaining two maps.
4.6 Diffusion modification 3: Uncertainty source for regular boundary growth.
4.7 Contour plots of the modified diffusion model for uncertainty management. The green circle represents the uncertainty boundary.
4.8 Example of virtual potential field generated from an uncertainty distribution.
4.9 UAV located in potential field. In this particular case, the UAV control law would dictate a left turn to align the orientation vector with the gradient.
4.10 Example receding horizon method. Three time steps are shown with a horizon length of 5 time steps. At each time step, the UAV reoptimizes to find the best combination of controls for the next 5 steps. The control that corresponds to the current time step is executed.
4.11 Hybrid search method.
4.12 Comparison of hybrid method and potential method in simulated searches of varying initial separation between target and search UAVs. 2 UAVs used with a maximum time-out of 1500 time steps.
4.13 Under the coordination algorithm, uncertainty closer to a teammate will be reduced in value to be searched. In the above scenario, since R(1, 2) > R(2, 2), U2 is reduced in value from UAV 1's perspective. Likewise, since R(2, 1) > R(1, 1), U1 is reduced in value from UAV 2's perspective. As a result, since UAVs are designed to reduce the greatest total uncertainty value, a type of uncertainty assignment is achieved with UAV 1 covering U1 and UAV 2 covering U2.
4.14 Sample Zamboni search pattern.
4.15 Simulation results after 100 trials comparing time to target found when using cooperative hybrid versus Zamboni search algorithms.
5.1 Agent dynamics.
5.2 Blocking and chasing behaviours.
5.3 Pursuer 1 Initial Condition=[0 0 0]; Pursuer 2 Initial Condition=[5 5 -π/2]; Evader Initial Condition=[0 5].
5.4 Simulation 2 (MPSE Evader Control): Pursuer 1 Initial Condition=[0 0 0]; Pursuer 2 Initial Condition=[5 0]; Evader Initial Condition=[0 5 -π/2]; Time to capture=1.82 s.
5.5 Pursuer 1 Initial Condition=[-5 0 0]; Pursuer 2 Initial Condition=[5 0 π]; Evader Initial Condition=[0 0].
5.6 Pursuit chase trajectories: Pursuer 1 Initial Condition=[−15/√2 −15/√2 π/4]; Pursuer 2 Initial Condition=[1/√2 −1/√2 3π/4]; Evader Initial Condition=[0 0].
6.1 Initial conditions for search simulation.
6.2 Searching for target.
6.3 Time step=110; Target found.
6.4 Target intercept mode.
Chapter 1
Introduction
The Unmanned Aerial Vehicle (UAV) has come a long way in terms of sophistication
when compared to its early predecessors. One of the first historically documented applications of unmanned flying machines took place against the backdrop of the U.S. Civil War [19]. Charles Perley, a New York-based inventor, filed a patent for the design of a
lighter-than-air balloon, laden with explosives, which was to be launched over enemy lines and travel through the air out of reach of enemy obstruction. The hope
was that when the UAV eventually detonated, it would have journeyed close enough to
a significant enemy target, thereby gaining a small victory without risking a single sol-
dier. Unfortunate for the user of such a device, the balloons were not equipped with any
form of control and as such, were at the mercy of the atmosphere including winds that
would often shift from one direction to another without warning. To both Union and
Confederate military commanders, uncontrollable flying explosives must have seemed to
be more of a liability than a useful means of attack since the projects on both sides were
quickly abandoned for more traditional types of warfare. Clearly much has changed, as we are now witnessing an accelerating interest in UAV technology, which is currently in use and planned for future use by many in the global community.
Unlike the orphaned maverick balloons of the past, today’s UAVs are not only widely
perceived as advanced viable solutions, but to many, they also represent the future of
flight for both military and civilian uses. Take for example the contrast of UAV use in
the first and second Gulf Wars. In the first Gulf War a total of 1641 hours of flight time
were logged with UAVs. According to a May 2001 Department of the Navy report, this translated into “at least one UAV was airborne at all times during Desert Storm” [6]. Comparing this with the second Gulf War, where a recent Associated Press article estimates the total number of UAV hours flown to be in excess of 500,000 [2], one can see
a dramatic increase in the reliance on UAV technology. The role of UAVs is becoming
more accepted and entrenched in the arsenal of tools at the modern military’s disposal.
Although military applications have dominated UAV use to date, there is also growing
interest in applying UAV technology to civilian applications [26]. UAV prototypes have
been built and are under development as solutions to monitoring forest fires, monitoring
wildlife migration, and delivering medical supplies. Also remarkable is the breadth of
countries employing and developing UAVs. A field that was once dominated by only a few players can now boast strong international participation. In the latest AIAA UAV
roundup survey conducted in 2007[27], 36 countries were found to collectively be working
on over 200 UAV projects in active use, under development and under production.
Based on the increasing reliance on UAVs in military and civilian scenarios, the
potential for application to a wide array of problems including scientific and civilian
uses, and the amount of global participation, one can safely state that research into UAV
systems and applications will expand for at least the foreseeable future. New problems will arise, and the question of how these machines can be designed to meet those demands will continue to be explored.
1.1 Purpose of Study
The vast majority of current UAVs are still more or less remotely controlled by a human
operator at a ground station. The Predator and Global Hawk are both normally piloted
by personnel located in a ground station [19]. The hand-launched Raven is flown by a
specialist soldier located directly in the field[19]. Admittedly, some UAVs do have the
capability to perform some tasks independent of external operators. Examples include
heading holds, altitude holds, navigation point flying, and other various maneuvers in-
cluding loiters, climbs, descents, and the flying of circuit patterns or approaches. These
tasks can more or less be handled by existing control techniques which are well devel-
oped. Yet for more complicated tasks and maneuvers, a higher level of decision making
is necessary, and humans are often still relied upon to make the decisions or, in most cases, to be in direct control of the aircraft itself.
This gap in system autonomy presents an excellent opportunity for improvement to many researchers of UAV systems. Firstly, humans perform very well when confronted with
new tasks. Their flexibility and intuition are not to be discounted. Yet, for many aircraft
tasks, flexibility and intuition are only seldom called upon. More than likely, an aircraft
will be required to perform the same mission many times over with little difference
between the multiple sorties. In the vast majority of cases, automation has a significant
advantage since consistency, accuracy and precision are all weak points of the human
operator. Secondly, if human operators are constantly required to be in continuous
contact with the aircraft, then the aircraft missions are limited by the range and quality
of the communication method employed. An aircraft capable of making higher level
decisions independently would potentially be able to fly a much larger class of extended
missions, while at the same time maintaining robustness to severed and intermittent
communication links.
1.1.1 Problem Overview
The study conducted and detailed in this thesis deals with the high level decision algo-
rithms necessary to handle problems of coordination between members of a UAV team
assigned to conduct a specific task collectively. Multiple UAVs commissioned to complete
a task have several advantages over the lone UAV. More UAVs assigned to complete a given task provide more flexibility to mission planners, since multiple units are capable
of undertaking many different types of missions that a single UAV could not. Tasks can
also often be completed faster in time-sensitive missions with multiple agents. Finally,
multiple UAV teams bring a greater degree of robustness, since the impact of aircraft loss is diminished when the loss is one out of many rather than one out of one.
The specific cooperation task dealt with in this study is of multiple UAVs assigned the
team task to search for and intercept a moving target. This scenario has real analogues
in both the military and civilian realms. On the military side, search and intercept UAV
drones would be useful in missions where commanders wish to pursue or find evading
enemy units. After finding the target, the UAVs could either track and relay information
back to the command station or engage the target itself cooperatively in a team fashion.
On the civilian side, search and rescue missions could be facilitated by a similar system.
An example scenario is that of a plane crash where the accident is known to have taken
place, but the exact location of the survivors is uncertain. A group of UAVs could be
dispatched to first search for survivors and secondly, if found, relay information of their
status to search and rescue crews.
The idealized scenario begins with a team of UAVs assigned the collective mission
to search for and intercept a target in a minimum time fashion (refer to Figure 1.1 and
Figure 1.2). The exact location of the single target is not known, but the target is
known to be within a region of known position and dimensions, herein referred to as the
Uncertainty Region. The mission can be divided into two distinct operations: searching
and intercepting. In the search phase, the UAVs collectively reduce the uncertainty by
performing sensor sweeps of the uncertain region. Sensors are idealized to be perfect and
therefore only a single sensor sweep of a given area is sufficient to ascertain whether the
target is in the given area at that time instant. The searching process continues until
the target location is ascertained.
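Under the perfect-sensor assumption, the search bookkeeping reduces to clearing visited cells of a gridded uncertainty map. The following sketch is illustrative only; the grid size and the uniform prior are assumptions for the example, not values from this thesis:

```python
def sweep(uncertainty, cell):
    """Perfect-sensor sweep: a single visit fully resolves a cell,
    so its uncertainty value is cleared."""
    row, col = cell
    uncertainty[row][col] = 0.0
    return uncertainty

# 2x2 grid with a uniform prior over the Uncertainty Region (illustrative)
grid = [[0.25, 0.25],
        [0.25, 0.25]]
grid = sweep(grid, (0, 1))
remaining = sum(sum(row) for row in grid)  # uncertainty left to search
```

The search phase ends as soon as the cell containing the target is swept, at which point any remaining uncertainty becomes irrelevant.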
The end of the search phase initiates the intercept mode for the UAVs. The goal in this
phase is to intercept the target in minimum time. Which UAV ultimately intercepts the target is irrelevant; what matters is the time at which the interception occurs. Minimum interception time is desirable, and as such the UAVs collectively develop strategies to accommodate this goal. Planning of the interception yields trajectories assigned to each of the
UAVs which are then executed to capture the target.
In this simulation study, an algorithm for the cooperative search and cooperative
intercept is developed and analyzed for a team of multiple UAVs. General assumptions
include:
1. there is only a single target.
2. the target is capable of moving.
3. the maximum velocity of the target is known a priori.
More assumptions will follow and will be detailed as they become relevant to the
discussion.
1.2 Thesis Layout
The thesis is divided into seven chapters. In this current Chapter 1, motivation and
a general overview of the problem are provided. In Chapter 2, a current survey of the
research state is provided detailing relevant works and how this study complements exist-
ing knowledge. Chapter 3 is a brief overview of algorithms, methods and concepts used
in the development of the search and intercept solution. Chapters 4 and 5 detail the
development and implementation of the cooperative search and intercept algorithms re-
spectively. Simulation results of the combined search/intercept algorithms are presented
in Chapter 6. Finally, future work and concluding remarks are presented in Chapter 7.
Team of UAV drones.
Uncertainty Region: the moving target is known to be within this region.
UAV team is assigned the task to search for and intercept a target.
(a) Team of UAVs is assigned the collective task of searching for and intercepting a moving target as fast as possible. The exact whereabouts of the target are unknown.
UAV team searches for the target by sensor sweeps over the uncertainty region.
(b) UAVs perform the search by performing sensor sweeps on uncertainty regions.
UAV team eventually finds the target.
Target found.
(c) When the target is found, the UAVs discontinue search operations and initiate intercept planning.
Figure 1.1: Overview of UAV cooperation task: Searching.
Problem is now how to intercept the target in minimum time.
(a) End conditions of the search problem are now initial conditions of the intercept problem. The objective is now to intercept the target cooperatively in minimum time.
Trajectories are planned for each UAV to cooperatively capture the target in minimum time.
(b) UAVs collectively plan intercept trajectories assigned to each UAV.
Planned trajectories are executed to capture the target.
(c) Target is intercepted.
Figure 1.2: Overview of UAV cooperation task: Interception.
Chapter 2
Literature Survey
There are two distinct UAV tasks studied in this thesis: cooperative UAV search and
cooperative UAV intercept. Much of the existing work has focused on one or the other, and the literature survey is accordingly divided into these two distinct areas of
study.
2.1 Cooperative UAV Search
Studies on how a coordinated search can be implemented in autonomous vehicle systems
have been conducted and continue to attract much scientific interest at least in part due
to the wide demand for such a system for military, scientific and civilian uses. Some
notable examples include planetary exploration for the scientific community[10], target
localization for military missions[22], and search and rescue for civilian use[17].
The classic archetypal collaborative search problem is characterized by finding a target
or targets while keeping some parameter to a minimum. Targets can be stationary or
moving and the minimizing parameter is often based on the limitations of the mission or
UAV (examples include time to target found or minimum fuel). Several different methods
exist and are currently being studied. The following is a sampling of the most common
methods currently under investigation.
Exhaustive searches are often used as benchmarks and, for the most part, use in some form an open-loop, pre-defined search pattern such as the Zamboni pattern [1] and the progressive spiral-in maneuver [25]. The Zamboni method, aptly named after the ice-conditioning machine due to the similarity in generated paths, involves making successive lateral sweeps, back and forth, across the search area. Each lateral sweep covers
new ground and the boundary between searched and unsearched space gradually moves
forward until the entire region is covered. The progressive spiral-in, as the name implies, involves maneuvering the agents to cover the perimeter of the search space. The agents move in the same circular direction, either clockwise or counterclockwise, and gradually reduce their turning radius. This method in particular has an advantage when dealing with a moving-target scenario, since it can be guaranteed that a moving target does not escape and will eventually be captured, provided that the right conditions apply.
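As a concrete illustration, the Zamboni-style lateral sweep can be generated as an open-loop waypoint list. The area dimensions and sweep spacing below are assumed for the example; in practice the spacing would be set by the sensor footprint:

```python
def zamboni_waypoints(width, height, spacing):
    """Boustrophedon (Zamboni-style) sweep: successive lateral passes,
    alternating direction, advancing the searched boundary each pass."""
    waypoints = []
    y, left_to_right = 0.0, True
    while y <= height:
        if left_to_right:
            waypoints += [(0.0, y), (width, y)]
        else:
            waypoints += [(width, y), (0.0, y)]
        left_to_right = not left_to_right
        y += spacing  # move the searched/unsearched boundary forward
    return waypoints

path = zamboni_waypoints(width=10.0, height=4.0, spacing=2.0)
```

Because the list is fixed in advance, the path is entirely open loop, which is precisely the source of the robustness problems discussed next.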
Exhaustive searches are intuitive and practically easy to implement. However, since
they are open loop, they lack robustness to unforeseen events such as the loss of an agent or
any other unexpected agent behaviour. Furthermore, certain search space characteristics
may make an exhaustive search impractical such as a search space populated with sparsely
distributed potential search regions. There is no point searching the entire space when a
visit to a select few way points would be sufficient. In such a case, an exhaustive search
would be a waste of time and resources since it would not be necessary to search the
entire area. To overcome this, closed-loop search algorithms are preferred.
One closed-loop approach is to change the search problem into a task assignment problem. In these cases, the search space is subdivided into distinct regions and search
agents are assigned to the individual regions. In the work of Enns et al.[7], the search
space is divided into flying lanes and a search UAV is assigned to each lane. A market-oriented programming optimization method is used to do the actual assignment, which involves search agents bidding on the individual lanes to search. Another example of
changing the search task to that of a task assignment problem is the work of Zhang et
al. [30], in which UAVs are assigned to search and prosecute targets in a game space. Coordination
is achieved through the assignment of navigation points to individual UAVs where the
points to be assigned are those that are known to be of the highest likelihood to coincide
with a target’s position.
By far the most common approach to cooperative search is the receding horizon method, which is typically applied to continuous-time problems or problems where the end time is not explicitly defined. The basic receding horizon method involves looking
ahead in time by a defined duration and then determining the best strategy for that
particular look-ahead interval. The control input corresponding to the instantaneous time step is executed and, at the next time step, the process is repeated. The exact
implementation of coordination between search agents is widely varied. In [21], a global
optimization performance index is used by the search agents, and individual search agents base their receding horizon controls on optimizing this index. The index used
is a multi-objective weighted sum which balances tasks that include searching, collision avoidance and minimum search overlap, among others. In [29] a similar receding horizon
method is used as a long-term planner with a short-term planner acting in parallel. In this study, UAVs therefore choose their trajectories based on the benefits of adopting the long-term control policy, the short-term control policy, or a combination of the two.
Coordination in this case is manifested by virtual potential fields to minimize overlap
and UAV collisions. Jin et al. [13] also use receding horizon control, with the distinction that replanning is done only at the end of the look-ahead interval and not at every step
for search phases.
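The receding horizon loop common to these works can be sketched as follows. The exhaustive enumeration of control sequences and the toy one-dimensional dynamics are illustrative simplifications, not the formulations used in [21], [29] or [13]:

```python
import itertools

def receding_horizon_step(state, horizon, controls, step, score):
    """Evaluate every control sequence over the look-ahead horizon and
    return the first control of the best one; only that control is executed."""
    best_first, best_val = None, float("-inf")
    for seq in itertools.product(controls, repeat=horizon):
        s, val = state, 0.0
        for u in seq:
            s = step(s, u)
            val += score(s)
        if val > best_val:
            best_first, best_val = seq[0], val
    return best_first

# Toy 1-D example: drive the state toward position 3 (assumed dynamics/reward).
step = lambda s, u: s + u
score = lambda s: -abs(s - 3)
s = 0
for _ in range(5):  # replan at every time step
    s = step(s, receding_horizon_step(s, 3, (-1, 0, 1), step, score))
```

Enumerating all sequences grows exponentially with the horizon, which is why practical implementations replace the inner loop with an optimizer.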
Other methods commonly found are potential methods where a virtual potential field
is used to guide the UAVs to regions that are yet to be searched. In a study done in
cooperation with Northrop Grumman [23], the unsearched grid cells act as virtual potential field sources, much as a mass is a gravitational potential field source. The gradient of
the potential field, calculated at any UAV location, is used as a guiding direction vector
for the search agent. In this particular study, coordination was not explicitly addressed.
Under simulation, it is observed that overlap and redundancy in search trajectories are commonplace under potential field control.
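The gradient-following idea can be sketched as below. The inverse-square attraction toward each unsearched cell is an assumed field law chosen for illustration; the study in [23] defines its own potential, and the positions here are arbitrary:

```python
import math

def potential_gradient(uav_pos, unsearched_cells):
    """Gradient of a virtual potential at the UAV position, with every
    unsearched cell attracting with inverse-square strength."""
    gx = gy = 0.0
    for cx, cy in unsearched_cells:
        dx, dy = cx - uav_pos[0], cy - uav_pos[1]
        r = math.hypot(dx, dy)
        if r > 1e-9:              # ignore the cell the UAV is directly over
            gx += dx / r**3       # unit direction (dx/r) scaled by 1/r^2
            gy += dy / r**3
    return gx, gy

# Two unsearched cells to the right of the UAV: the gradient points right.
gx, gy = potential_gradient((0.0, 0.0), [(1.0, 0.0), (2.0, 0.0)])
```

The resulting vector serves only as a guiding direction; the UAV's turn-rate limits determine how quickly its heading can align with it.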
2.2 Cooperative UAV Pursuit and Evasion
Pursuit and evasion problems have been studied since Isaacs first published his work
on differential game theory [11]. His initial work included laying the foundation for the
single pursuit and evasion (PE) game and presenting an analytical technique for finding
optimal path trajectories based on the principle of a constant valued game referred to
as the main equation. Other analytical techniques were subsequently developed to solve
varying types of PE problems such as the point mass interception problem[3] and the
isotropic rocket[5].
Lately, numerical techniques have largely dominated PE research activities, primarily due to their flexibility when applied to the general breadth of PE problems. This has
allowed researchers to explore more complex PE problems such as multi-agent problems
which are currently an active and growing area of study. An early study of the Multiple-Pursuer/Single-Evader (MPSE) problem is the work of Benda et al. [4], where agent dynamics are approximated with grid-world players.
Alternate approaches to the MPSE problem include the genetically inspired methods
of Haynes and Sen [8] where agent trajectories are generated from a distributed control
algorithm based on a genetic programming approach called the strongly typed genetic
programming method developed initially by Montana [16]. Another distributed approach
by Yamaguchi [28] uses a hybrid behavioural reactive framework algorithm to simulate
robot hunting cooperation. More recently, Jang and Tomlin [12] looked into a similar
multi-agent problem, solving it using level set functions and a novel method for reflecting
forward reachable sets. Li and Cruz[14] also studied the same problem with a look-ahead
optimization approach. Yet another work by Li, Cruz and colleagues [15] translated the multi-pursuer problem into a target assignment problem. In this work, a hierarchical approach is used where an upper-level optimization determines which pursuer targets
which evader. Once pursuit pairs are determined, the pursuers chase the targets using
the results of Isaacs' analytical solution to the one-on-one pursuit problem.
2.3 Thesis Contribution
This thesis presents a novel solution for the combined collaborative search and intercept
problem for a moving target. Many previous works have dealt with the individual prob-
lems of search, intercept and a moving target, yet few have considered all three in the
same problem. The search problem is solved using a diffusion-based uncertainty map management system and a combined potential field/receding horizon method (referred to as the hybrid method) to direct individual search agents. The presented work on the intercept
problem can be viewed as an extension of the behavioural approach of Haynes, Sen[8][9],
and Yamaguchi[28] applied to the MPSE problem studied by Jang, Tomlin[12], Li and
Cruz [14]. The primary advantage of the behavioural approach over existing techniques is its ability to reduce the search-space of possible control strategies for optimal evasion or pursuit. As a result, the optimization routines applied are made more tractable. The disadvantage
is that the limitation to the heuristic behaviours may be overly conservative and the true
optimum may not be included in the corresponding admissible search-space. Thus, the
intercept portion of this thesis concerns itself not with finding the true optimum, but
rather approximations.
Chapter 3
Background
This section details some of the underlying methods used in the construction of the
cooperative UAV search and intercept algorithm. The optimization methods of the Nelder-Mead simplex and particle swarm optimization are described. A short introduction to
game theory and its application to the relevant one-on-one single pursuer/single evader
problem is provided. Only brief descriptions are given here, and the time-constrained
reader already familiar with such concepts can skip this chapter without consequence.
For more detailed and rigorous descriptions of the methods, citations are provided.
3.1 Optimization Methods
The two primary gradient-free optimization methods used are the Nelder-Mead simplex algorithm and the particle swarm optimization method. Both are described in the
following sections.
3.1.1 Nelder Meade Simplex Algorithm
The Nelder Meade simplex algorithm is one of the two gradient free optimization meth-
ods used in this study. The method starts with a simplex with N + 1 vertices in a N
dimensional search space. For example, a search space defined by two search parameters
would have a triangle simplex, and a three-dimensional search space would have a
corresponding tetrahedron simplex. The vertices of the simplex represent evaluation points
for the objective function and are the only discrete points where the objective function
is measured. The vertices can therefore be ordered from worst to best in terms of their
value to the optimization being performed.
Figure 3.1: Simplex in a 2D search space, showing the current function evaluation points (the worst, lousy and best vertices of the current simplex), the new function evaluation point, and the resulting new simplex.
The general principle behind the simplex algorithm is to update the simplex by
continuously discarding the worst-performing vertex and replacing it with a better
point. The new point is selected based on the evaluation of trial vertices, which are
chosen based on an extrapolation of the objective function. Once a trial point is selected
and evaluated, depending on its performance compared to the known values
of the existing vertices, one of several simplex morphing steps is executed.
The result is a simplex that gradually converges towards local optimum solutions. In a
two-dimensional search space, it can be pictured as a triangle flip-flopping within the
search space, changing its shape one vertex at a time, until it reaches a local extremum.
A number of different variations exist; however, for this study only the following simplex
morphing steps are used (a two-dimensional search space example with a 3-vertex
simplex is shown):
Figure 3.2: Nelder-Mead simplex operations: (a) reflection, (b) expansion, (c) contraction (outside and inside shown), (d) shrinking.
The following is pseudo code for the algorithm used in this study.
1. Initialize simplex
2. Loop until converged:
(a) Identify the worst (highest: xw), second worst (second highest: xl) and best
(lowest: xb) points with function values fw, fl, and fb, respectively.
(b) Test for convergence
(c) Evaluate xa, the average of the points in the simplex excluding xw.
(d) Perform reflection to obtain xr and evaluate to obtain fr.
(e) if fr < fb then
i. Perform expansion to obtain xe, evaluate to obtain fe.
ii. If fe < fb then replace xw by xe, fw by fe (accept expansion).
iii. Else replace xw by xr, fw by fr (accept reflection).
(f) Else if fr ≤ fl then replace xw by xr, fw by fr (accept reflected point).
(g) Else
i. If fr > fw then perform an inside contraction and evaluate fc.
ii. Else perform an outside contraction and evaluate fc.
iii. If fc > fw then shrink simplex, evaluate at the n new points.
iv. Else replace xw by xc, fw by fc (accept contraction)
(h) End Loop.
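As an illustration only (not code from the thesis), the pseudo code above can be sketched in Python; the standard coefficients (reflection 1, expansion 2, contraction 0.5, shrink 0.5) and a convergence test on the function-value spread are assumptions:

```python
import numpy as np

def nelder_mead(f, x0, step=1.0, tol=1e-8, max_iter=500):
    """Minimal Nelder-Mead sketch following the pseudo code above."""
    n = len(x0)
    # 1. Initialize simplex: x0 plus one perturbed point per dimension
    simplex = [np.asarray(x0, dtype=float)]
    for i in range(n):
        v = np.array(x0, dtype=float)
        v[i] += step
        simplex.append(v)
    fvals = [f(v) for v in simplex]

    for _ in range(max_iter):
        # (a) Order vertices: best (lowest) first, worst (highest) last
        order = np.argsort(fvals)
        simplex = [simplex[i] for i in order]
        fvals = [fvals[i] for i in order]
        fb, fl, fw = fvals[0], fvals[-2], fvals[-1]

        # (b) Convergence test on the function-value spread (an assumption)
        if fw - fb < tol:
            break

        # (c) Centroid of all points excluding the worst
        xa = np.mean(simplex[:-1], axis=0)

        # (d) Reflection
        xr = xa + (xa - simplex[-1])
        fr = f(xr)

        if fr < fb:                       # (e) try expansion
            xe = xa + 2.0 * (xa - simplex[-1])
            fe = f(xe)
            simplex[-1], fvals[-1] = (xe, fe) if fe < fb else (xr, fr)
        elif fr <= fl:                    # (f) accept reflected point
            simplex[-1], fvals[-1] = xr, fr
        else:                             # (g) contraction or shrink
            if fr > fw:
                xc = xa - 0.5 * (xa - simplex[-1])   # inside contraction
            else:
                xc = xa + 0.5 * (xa - simplex[-1])   # outside contraction
            fc = f(xc)
            if fc > fw:                   # shrink all vertices toward the best
                simplex = [simplex[0]] + [simplex[0] + 0.5 * (v - simplex[0])
                                          for v in simplex[1:]]
                fvals = [fvals[0]] + [f(v) for v in simplex[1:]]
            else:                         # accept contraction
                simplex[-1], fvals[-1] = xc, fc
    return simplex[0], fvals[0]
```

For example, minimizing a simple quadratic bowl from a nearby starting point converges to the bowl's minimum within the default iteration budget.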
For a more detailed description of the Nelder-Mead simplex algorithm, the interested
reader is directed to Nelder and Mead's original paper[18].
3.1.2 Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a stochastic, gradient-free optimization method
for finding global extrema. The method derives its motivation from social foraging
creatures observed in nature, such as ant and bee colonies. In these species, one can often
observe that while individual search agents look for resources independently, they can
signal or influence other agents depending on each agent's resource discovery or lack thereof.
An ant that comes across a stockpile of sugar is known to release pheromones to signal
other search ants to help exploit the bounty. It is this balance between the individual
and social forces that is distilled from nature and replicated in code to solve engineering
optimization problems.
The particle swarm optimization begins with a population of agents or particles that
move within the search space. Each agent has an associated position and velocity which
are updated at each optimization iteration. Every agent maintains an information set
with the following information:
1. The value and location of its current point in the search space.
2. The value and location of the best point in the search space that the individual has
discovered on its own.
3. The value and location of the best point in the search space that any team member
has discovered.
Based on this information, each particle adjusts its velocity according to the following
rule:
χ^i_{k+1} = χ^i_k + μ^i_{k+1} Δτ
μ^i_{k+1} = w μ^i_k + c1 r1 (ρ^i_k − χ^i_k)/Δτ + c2 r2 (ρ^g_k − χ^i_k)/Δτ   (3.1)
Here k is the optimization iteration index, χ^i is the ith particle's position in the
design space, μ^i is the ith particle's update velocity and Δτ is the update time step (set
to unity). w, c1 and c2 are weighting parameters on particle momentum, cognitive and
social factors respectively. r1 and r2 are random numbers between 0 and 1. Finally, ρ^i is
the individual particle's best position and ρ^g is the global best found among all
particles.
Pseudo code for the version of particle swarm used in this study is provided below:
1. Initialize positions of particles to a random distribution within the search space.
2. Randomize the particle velocities and orientations.
3. Loop until converged:
(a) Evaluate the current locations of all particles
(b) Update the particle bests
(c) Update the global best
(d) Update the particle velocities
(e) Update the particle positions
(f) End Loop
The reader interested in particle swarm optimization is directed to reference [20].
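The update rule (3.1) and the pseudo code can be sketched in Python as follows. This is a sketch only: the particle count, iteration budget and the values of w, c1 and c2 are illustrative assumptions, not the settings used in this thesis.

```python
import numpy as np

def particle_swarm(f, bounds, n_particles=30, iters=200,
                   w=0.7, c1=1.5, c2=1.5, seed=0):
    """Sketch of PSO implementing update rule (3.1) with delta tau = 1."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    dim = lo.size
    # Steps 1-2: random initial positions and velocities in the search space
    x = rng.uniform(lo, hi, size=(n_particles, dim))
    v = rng.uniform(-(hi - lo), hi - lo, size=(n_particles, dim))
    p_best = x.copy()                          # individual bests (rho^i)
    p_val = np.array([f(xi) for xi in x])
    g_best = p_best[np.argmin(p_val)].copy()   # global best (rho^g)
    g_val = p_val.min()

    for _ in range(iters):
        # Velocity update: momentum + cognitive + social terms, eq. (3.1)
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = np.clip(x + v, lo, hi)             # position update, kept in bounds
        vals = np.array([f(xi) for xi in x])
        improved = vals < p_val                # update particle bests
        p_best[improved] = x[improved]
        p_val[improved] = vals[improved]
        if p_val.min() < g_val:                # update global best
            g_val = p_val.min()
            g_best = p_best[np.argmin(p_val)].copy()
    return g_best, g_val
```

On a simple sphere function over a bounded square, the swarm collapses onto the global minimum well within the default iteration budget.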
3.2 Game Theory
Game theory is the study of all forms of competition where opponents, often with conflicting
aims, execute interdependent strategies to achieve outcomes that maximize their respective
payoffs. One of the fundamental objectives of game theorists is to derive optimal player
strategies: the strategies that, if executed, will result in the highest payoff possible
when compared to the result of all other strategies. No general method for finding
these optimal strategies exists for all classes of games; however, solutions for select game
types do exist. In this particular study, focus is set on the game of pursuit and evasion.
3.2.1 The Classic Pursuit and Evasion
The classic pursuit and evasion (PE) problem has two players: a single pursuit agent (P)
and a single evading target (E). P typically has a speed greater than that of E;
however, E has the advantage of being more maneuverable. One macabre yet often cited
visualization is the scenario of the homicidal chauffeur. In this example, P is the driver
of a car with the malicious intent of running down the pedestrian E. Consistent with the
limitations of both P and E, the pedestrian is more maneuverable than the driver (e.g.,
parallel parking on two legs is much easier than doing the same on four wheels), and the
driver can reach much greater speeds than the hapless pedestrian. The abstracted model
is typically a unicycle for P, with a minimum turning radius and fixed velocity, and
a kinematic point for E, which can instantaneously change its orientation but is still
restricted to a fixed velocity.
The game can be classified as a zero-sum game of degree where the payoff is a
continuum. More specifically, the time to capture is the payoff to E, while the negative
of the time to capture is the payoff to P. Since both players are assumed to be rational
and would therefore always choose to maximize their payoffs, E chooses its inputs to
maximize the time to capture while P attempts to minimize it.
3.2.2 Isaacs’ Pursuit and Evasion Problem and Solution
For a detailed derivation of the optimal strategies for the classic PE game, refer to Isaacs’
text on differential games[11]. Only the results are provided in this section.
Isaacs demonstrated, through what he termed the main equation and the integration
of his retrograde path equations, that the optimal controls for both P and E depend on
the state of the game, that is, where E is at any instant relative to P. This
defines a feedback control law which ensures that, if executed by both P and E, the
time to capture will be the maximum achievable for E and the minimum achievable
for P. Furthermore, if the game is plotted in the reference frame of P, with P
located at the origin and the forward direction of P aligned with the y coordinate axis,
then specific geometric regions can be plotted which indicate the control to be
executed depending on which region the location of E falls within. A plot of the game
in a P-centred reference frame is provided below along with an overlay of the different
regions. In this plot, the position of the evader relative to the pursuer defines the state
of the game.
Figure 3.3: Isaacs' SPSE regions: region 1, region 2, region 3 and the capture region.
In the above figure there are four distinct regions: region 1, region 2, region 3 and the
capture region. The capture region is bounded by the terminal surface which, if touched
by the evader, ends the game as a successful capture. The three other regions have
associated optimal controls for both P and E. The regions and the controls are discussed
below. Plots of sample trajectories using the optimal strategies in the P-centred reference
frame and the inertial frame follow.
1. Region 1- Primary Region: This region is characterized by an E that is sufficiently
close to P and just slightly off course, such that a quick swerve by P will
result in a capture. The optimal controls in this case are:
• Evader: move directly away from the pursuer at all times.
• Pursuer: make a sharp turn into the evader.
2. Region 2- Universal Region: Chase situations that fall into this region are
characterized by either an E that is directly ahead of P or an E that is sufficiently
Figure 3.4: Region 1, the primary path, shown in (a) the pursuer reference frame and (b) the inertial reference frame.
far enough away from P that P has enough time to make a complete turn followed
by a straight run for E. The optimal controls in this case are:
• Evader: Move straight, tangential to P’s initial curvature circle.
• Pursuer: Turn sharply until pointed at E then head straight.
Figure 3.5: Region 2, the universal path, shown in (a) the pursuer reference frame and (b) the inertial reference frame.
3. Region 3: This region is associated with chase scenarios where E is close to
P, but P is oriented in such a way that an immediate swerve towards E would
result in a miss. In this scenario, P must therefore first turn away from E to gain
some space before making a quick turn around for the final kill pass. The optimal
controls in this case are:
• Evader: move towards P, tangential to P’s initial curvature circle, until game
state is in Universal region.
• Pursuer: Turn sharply away from E, until game state is in Universal region.
Figure 3.6: Region 3, shown in (a) the pursuer reference frame and (b) the inertial reference frame.
3.3 Summary
In this chapter, brief descriptions have been provided of select algorithms and background
information essential to the methods developed in this thesis. These algorithms will be
called upon in subsequent chapters. Two optimization methods have been discussed:
the Nelder-Mead simplex algorithm (used in the cooperative UAV search algorithm) and
particle swarm optimization (used in the cooperative intercept algorithm). Isaacs'
analytical solution to the single pursuer / single evader game, which is called upon in the
discussion on cooperative intercept, has also been provided.
-
Chapter 4
Cooperative UAV Search
As mentioned in the introduction, there are two distinct phases to the central UAV
cooperative task. The first phase is characterized by the goal of identifying the target
location within the game space; only after the position of the target is positively
ascertained can the secondary interception phase be initiated. In this section, the
former phase is discussed, beginning with how uncertainty in the target's position is
modelled. A diffusion model for uncertainty is used to account for the target's ability to
move. The task of searching is realized through the reduction of uncertainty regions by
UAV sensor sweeps through the area. Three different control schemes for
a single UAV (potential, receding horizon, and a hybrid of the two) are presented
and compared. This is followed by a discussion of the cooperation algorithm used and how
the hybrid method is extended to multi-UAV teams. Finally, search simulations using
the hybrid cooperative method are presented at the end of the chapter.
4.1 Problem Formulation
The search problem consists of a team of n UAVs assigned the common goal of finding
a single target in minimum time. Only a single target exists, and its position is
not known to the search UAVs with absolute certainty. Available to the search UAVs,
however, is knowledge of a general region or regions of the game space where the
target cannot possibly be. This knowledge is assumed to be known prior to the
start of the game, based on information provided by the mission planner. The target can,
however, be hidden within any other region, which would need to be scouted by the UAV
team if the target is to be found. The UAV sensors are assumed to have limited
range, and as such, some UAV maneuvering is likely required to find the target.
4.1.1 Grid World Representation
The game space is a 2-D plane that the agents move within. A grid of N×M square elements
is overlaid upon the game space, discretizing it in both the horizontal
and vertical directions. Each agent can occupy only one grid element at any time. Note
that the position of each agent can still take on continuous values: the grid map
is simply a discretized approximation to the continuous game. Both the continuous
positions and the discretized surrogate representations are updated during the course of
the search (see Figure 4.1).
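The continuous-to-grid mapping can be sketched as follows (a hypothetical helper, not from the thesis; unit-sized cells anchored at the origin are assumed):

```python
def position_to_cell(pos, cell_size=1.0):
    """Map a continuous (x, y) position to its grid cell indices,
    assuming square cells of side cell_size anchored at the origin."""
    return (int(pos[0] // cell_size), int(pos[1] // cell_size))
```

With the positions shown in Figure 4.1, the position (3.77, 0.25) maps to cell (3, 0).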
Figure 4.1: Grid approximation. (a) Continuous agent position representation (e.g., target position (3.77, 0.25)). (b) Grid agent position representation (e.g., target cell (3, 0)).
4.1.2 Target Uncertainty
At the outset of the search task, the target's exact location is not known with absolute
certainty to the UAV team; however, some information about the general whereabouts
of the target is available for the team to base its trajectory planning upon. For
example, regions where the target cannot be, known through mission planner intuition,
are admissible and of utility to the UAV team. These regions are referred to as certain
regions, since it is known with certainty that the target's position cannot coincide with
any point within them. Since the target does not exist within certain regions, it
follows that it can be found somewhere within the regions that are not certain. These
areas are designated uncertain regions. Any region that is not certain is deemed
uncertain and therefore includes all points where the target may be at a given instant
in time.
Uncertainty Representation
The spatial uncertainty environment is represented by the grid-based representation of
the 2-D game. Each of the N×M square elements is assigned the discrete binary state of
either being certain (value = 0) or uncertain (value = 1). In other words,

U^t = [u^t_{j,k}],   u^t_{j,k} = 0 if grid (j, k) is certain, 1 otherwise

where U^t is the uncertainty map, of size N×M, representing the UAV team's knowledge of the
target's position at the t-th time step, j is the index for the grid elements
in the x direction and k in the y direction.
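The binary map U^t can be constructed directly from the list of cells known to be certain (a hypothetical helper for illustration, not code from the thesis):

```python
import numpy as np

def make_uncertainty_map(n, m, certain_cells):
    """Build the binary uncertainty map U^t: u[j, k] = 0 for cells known
    to be certain, 1 otherwise. certain_cells is a list of (j, k) pairs."""
    u = np.ones((n, m))                  # all cells start uncertain
    for j, k in certain_cells:
        u[j, k] = 0.0                    # mission-planner knowledge
    return u
```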
Other Possible Uncertainty Representations
Although in this study, uncertainty is constrained to one of two binary states, a continuum
between the two extremes can be implemented to account for partial certainty of grid
Figure 4.2: Uncertainty represented by a gridded map. Uncertain regions (blocks of U = 1) are those the target could possibly be in; certain regions (blocks of U = 0) are those the target could not possibly be in.
spaces. This allowance for intermediate values would be useful for modelling imperfect
sensors, where certain regions of the UAV sensor area are less reliable than others.
Less-than-perfect reliability of a particular sensor area region would therefore correspond
to reduced uncertainty, but not to complete certainty. This partial certainty
would be represented by an uncertainty value that lies somewhere between
0 and 1. For this study, perfect sensors are assumed, hence partial certainty
is not considered.
Diffusion Model for a Moving Target
For static target problems, one can assume that changes to the uncertainty
map result only from the UAVs scanning the uncertain regions. The static case
is therefore a problem of steady or decreasing uncertainty, where elements that were
initially certain remain certain, while those that are uncertain can switch to certain only
as a result of the passing of a searching UAV. In the case of a moving target, however,
the steady or decreasing uncertainty assumption is no longer valid. Targets can move
and therefore can transition into neighbouring grid cells. Uncertainty therefore has the
ability to grow in the moving target case and must be taken into account by a mechanism
which can evolve the uncertainty map with time.
To take the target’s ability to move into account, a model based upon two dimensional
diffusion is adopted to evolve the uncertainty boundaries over time. A variation based
on the work done in [23] is adopted here. The motivation for this is that diffusion will
propagate the uncertainty in all directions equally. This is desireable since no information
is given on the behaviour of the target, and therefore all target moves must be considered
equally probable. Since this is the case, the worst case scenario must be assumed which
should be reflected on the uncertainty boundary evolution. The basic 2-D diffusion
equation is as follows:
∂u
∂t= c
(∂2u
∂x2+∂2u
∂y2
)(4.1)
where u is the uncertainty in the grid cell, t is the time, and c is the diffusion conductivity
constant. As it stands, the above equation is not useful when applied to a discretized
plane such as the already adopted grid world representation of the game space. Instead,
the 2-D finite element diffusion model is used:
u^{t+1}_{j,k} = u^t_{j,k} + cΔt [ (u^t_{j+1,k} − 2u^t_{j,k} + u^t_{j−1,k})/Δx² + (u^t_{j,k+1} − 2u^t_{j,k} + u^t_{j,k−1})/Δy² ]   (4.2)
In this equation, the time derivative has been approximated with a forward finite
difference and the spatial partial derivatives with central finite differences. The
conductivity constant, c, and the uncertainty, u, retain their meanings. The superscript t
represents the current time step, while j and k are the indices for the grid elements in
the x and y directions. Δx and Δy are the step sizes in the x and y directions.
This is one of the simplest forms of diffusion for 2-D finite element models. A
simulation of the evolution of the uncertainty using the above model is provided in
Figure 4.4. The diagrams demonstrate the movement of uncertainty from grid cells of
high uncertainty concentration to regions of low uncertainty concentration over time.
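One update of the finite-difference scheme (4.2) can be sketched as follows (a sketch only; boundary cells are simply held fixed here, an assumption the thesis does not specify):

```python
import numpy as np

def diffuse_step(u, c=0.2, dt=1.0, dx=1.0, dy=1.0):
    """One update of the 2-D diffusion model (4.2) on the N x M
    uncertainty map u. Interior cells get the discrete Laplacian;
    boundary cells are held fixed for simplicity (an assumption)."""
    lap = np.zeros_like(u)
    lap[1:-1, 1:-1] = ((u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dx ** 2 +
                       (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dy ** 2)
    return u + c * dt * lap
```

Starting from a single fully uncertain cell, one step spreads uncertainty to the four neighbours while (as noted below) conserving the total, which is exactly the conservation property the modified scheme later removes.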
Yet this is still not quite the desired behavior of the uncertainty evolution. Firstly, as
Figure 4.3: Moving target implications of time-varying uncertainty. (a) Given that the target position is known, it can either transition to a neighbouring cell or remain stationary; if a transition occurs, the new cell's uncertainty level must increase to 1. (b) The behaviour of the target is not known, so transitions to all neighbouring cells must be considered. (c) To take moving targets into account, the boundary of uncertainty must grow at a rate at least as fast as the target's maximum velocity.
Figure 4.4: Finite element diffusion process. (a) Initial distribution. (b) After 100 steps. (c) After 150 steps.
was mentioned in the previous section, every cell can take on only one of two states:
either the cell is certain or it is uncertain, with no intermediate values. This binary system
has not yet been taken into account in the 2-D diffusion equation. Secondly, at the basis
of the diffusion equation is the law of conservation. In standard diffusion applications,
the total amount of the parameter in question (whether it be energy or mass) remains
fixed provided the system is closed and has no sinks or sources. A consequence of this
is that cells of high uncertainty experience an unwarranted reduction with time as
uncertainty flows out to neighbouring cells of lesser uncertainty. This is not the
desired behaviour, since cells that are uncertain should remain uncertain until scanned by
a UAV; the maneuvering target within a cell can potentially choose to remain stationary,
and this possibility must be reflected in how the uncertainty evolves. The third
and final consequence of using an unmodified diffusion equation is that the rate at which
the uncertainty expands tends to slow in the latter stages of uncertainty
evolution as the spatial gradients of uncertainty become small. Again, this is not the
desired behaviour, as one would generally require the uncertainty boundary to grow at a
constant rate.
To address these concerns, some modifications to the diffusion equation are in order.
The first concern is addressed by maintaining two separate maps, U and U′. The first
map, U, is described as above and allows for cells that can take on intermediate values
between 1 (uncertain) and 0 (certain). It is on this map that the uncertainty is evolved
using the diffusion algorithm; changes to grid uncertainty values due to search UAV
movement are also taken into account on this map. The second map, U′, is a filtered
version of its cousin, where each cell of map U is compared to a defined threshold value.
If a given cell on map U has a value at or above this threshold, the corresponding cell on
map U′ is assigned the value of 1; that is, if a cell is sufficiently uncertain, the algorithm
considers that cell completely uncertain, and this is reflected in map U′. If, on
the other hand, the value of the map U is below the threshold value, the particular cell on
map U′ is assigned the value of 0. Both maps are maintained and updated
for the duration of the simulation. Map U manages the evolution of the uncertainty with
time, while map U′ is simply a binary filtered version of U which is used by the UAV
search team to develop their search trajectories.
Figure 4.5: Diffusion modification 1: maintaining two maps. (a) Map of U: intermediate values allowed. (b) Map of U′: binary filtered version of U with threshold value of 0.1.
The second concern, that uncertain cells should remain uncertain unless scanned
by passing search UAVs, is addressed with the following fix: cells within map U do not
experience a reduction in uncertainty due to diffusion. The flux of uncertainty is strictly
limited to net inflows; outflows for all cells are ignored. The updated equation for finite
element uncertainty diffusion is therefore
u^{t+1}_{j,k} = u^t_{j,k} + cΔt max[ 0, (u^t_{j+1,k} − 2u^t_{j,k} + u^t_{j−1,k})/Δx² + (u^t_{j,k+1} − 2u^t_{j,k} + u^t_{j,k−1})/Δy² ]   (4.3)
The third concern, non-constant uncertainty growth rates, can be addressed
by adding uncertainty sources to the map and then applying a saturation filter to all cells
to limit uncertainty values to a maximum of 1. In this implementation, each cell above
the threshold value is increased by a fixed percentage of its original uncertainty value;
in effect, every cell with a value above the threshold acts as an uncertainty source. This
ensures that the gradients near the border of the uncertainty remain sufficiently high,
which translates into a constant uncertainty boundary growth rate that does not exhibit
the undesirable slowing effect.
Figure 4.6: Diffusion modification 3: uncertainty source for regular boundary growth. (a) Initial distribution. (b) Every cell above the threshold (z value of 0.1) is boosted by a factor of 1.2 (exaggerated for visual purposes). (c) Filtered such that values fall between 0 and 1.
The handling of uncertainty is therefore summed up in the following steps.
For time step t:
1. Set the current U to map U from time step t−1
2. For each cell in U, perform the diffusion update
3. For each cell in U greater than the threshold value, boost the uncertainty value by a
fixed percentage
4. Limit each cell in U to a maximum of 1 and a minimum of 0
5. Update U′ from the new U by performing cell-wise threshold saturation
6. UAVs develop search trajectories with map U′
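The six steps above can be sketched as a single update function (a sketch only; the threshold and boost factor are illustrative values, and boundary cells are held fixed):

```python
import numpy as np

def update_uncertainty(u, c=0.2, dt=1.0, dx=1.0, dy=1.0,
                       threshold=0.1, boost=1.05):
    """One time step of the modified diffusion scheme (steps 1-6).
    Returns the updated map U and its binary filtered version U'."""
    # Step 2: diffusion update with outflows ignored, eq. (4.3)
    lap = np.zeros_like(u)
    lap[1:-1, 1:-1] = ((u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dx ** 2 +
                       (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dy ** 2)
    u_new = u + c * dt * np.maximum(0.0, lap)
    # Step 3: every sufficiently uncertain cell acts as an uncertainty source
    u_new = np.where(u_new >= threshold, u_new * boost, u_new)
    # Step 4: saturate all cells to the interval [0, 1]
    u_new = np.clip(u_new, 0.0, 1.0)
    # Step 5: binary filtered map U' used for trajectory planning (step 6)
    u_prime = (u_new >= threshold).astype(float)
    return u_new, u_prime
```

Starting from a single uncertain cell, the cell itself stays at 1 (no outflow), its neighbours gain boosted uncertainty, and U′ marks the whole five-cell cross as uncertain.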
The figures below depict the new uncertainty diffusion algorithm at different time
steps. With the modified diffusion, the uncertainty distribution grows in the radial
direction at a rate that approximates the maximum velocity of the target. Intermediate
cell values are no longer present; cells take on only the values of 1 (uncertain)
and 0 (certain).
4.2 Target Behaviour
For the search phase, no particular motion of any kind is assumed. All that is known to
the UAV search team is an upper bound on the velocity of the target. This information
is incorporated into the UAV uncertainty update algorithm; more specifically, it is taken
into account when selecting the conductivity constant for the diffusion method. High
values of conductivity correspond to faster-growing uncertainty, consistent
with a faster-moving target; conversely, low values of conductivity correspond to
slower growth of uncertainty, consistent with a slow-moving target.
For testing purposes, target motion is constrained to straight line segments. The
target is first given a random initial orientation. It then proceeds in straight
lines until it reaches the boundary of the game space, at which point it reflects off the
boundary with a reflection angle equal to the angle of incidence. Another candidate test
target behaviour for the search phase is random motion, where the target
selects a new orientation vector at each time instant.
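The test target motion can be sketched as follows (a hypothetical helper, not from the thesis; specular reflection at a rectangular boundary is assumed):

```python
import math

def step_target(pos, heading, speed, dt, xmax, ymax):
    """One time step of straight-line target motion with specular
    reflection off the game space boundary [0, xmax] x [0, ymax]."""
    x = pos[0] + speed * math.cos(heading) * dt
    y = pos[1] + speed * math.sin(heading) * dt
    if x < 0.0 or x > xmax:
        heading = math.pi - heading       # reflect off a vertical wall
        x = min(max(x, 0.0), xmax)
    if y < 0.0 or y > ymax:
        heading = -heading                # reflect off a horizontal wall
        y = min(max(y, 0.0), ymax)
    return (x, y), heading % (2 * math.pi)
```

A target heading toward the right wall bounces back with its heading mirrored, consistent with the reflection rule described above.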
Figure 4.7: Contour plots of the modified diffusion model for uncertainty management, shown (a) initially and after (b) 100 s, (c) 200 s, (d) 300 s, (e) 400 s and (f) 500 s. The green circle represents the uncertainty boundary.
4.3 Search Behaviour
At this point, only how uncertainty is managed for the UAV search has been defined. It
remains to be explained how to act upon the uncertainty to find the target in some
optimal or approximately optimal fashion. In this section, three methods are explored
as candidate search algorithms: the potential method, the receding horizon method, and
a hybrid method which is a combination of the two.
4.3.1 Potential Method
The potential method is widely used in control applications, including
autonomous UAV guidance laws. The general idea is that certain points
of interest in the game, whether they are enemy locations, uncertainty cells, obstacles,
or any other body of interest, serve as virtual masses which induce a virtual potential
field within the game space. The calculated gradients of this potential field
are typically used by the UAVs as a basis for their control laws. Depending on the UAV
task, different variations of the potential method can be used. Although the underlying
principles of the method remain constant, how the potential field is generated and how the
agents act upon it vary considerably between designers and
missions. The subsequent paragraphs describe the intricacies of the potential method
used for the agents in the team UAV search.
Potential field generation
Each cell of uncertainty exhibits a potential field whose magnitude decreases with
distance, much like a natural gravitational potential. Cells that are certain do not contribute
to the potential field. The potential field due to an uncertain cell at a distance r from
the cell is:

p_{j,k} = −u_{j,k}/r   (4.4)
The potential field at any position is the sum of the contributions from all uncertain
cells:

P = Σ_{j}^{M} Σ_{k}^{N} p_{j,k}   (4.5)
Figure 4.8: Example of a virtual potential field generated from an uncertainty distribution. (a) Uncertainty distribution with uncertainty cells located at the centre of the space. (b) Resulting virtual potential field.
Potential field control law
Each UAV calculates the gradient of the potential field at its current position. The
gradient is normalized to unity and its direction is compared to the UAV's
current orientation vector. The control law is simply to minimize the
angle between the two vectors: if the gradient is to the left of the UAV's
orientation vector, the UAV turns left; if the gradient is to the right, the UAV
turns right.
The following is pseudo code for each UAV performing potential-based search:
1. Calculate the potential field contribution for every uncertain cell
2. Sum all contributions at the UAV position
3. Calculate the gradient of the potential field
Figure 4.9: UAV located in a potential field. In this particular case, the UAV control law would dictate a left turn to align the orientation vector with the gradient.
4. Normalize the gradient to unity
5. Calculate the difference between the gradient angle and the current orientation angle
6. Set the control to this difference
7. Update the time step and go to step 1
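The pseudo code above can be sketched as follows. This is an illustration only: cell (j, k) is assumed to be centred at coordinates (j, k), the gradient is estimated numerically, and the steering command is the signed angle the UAV must turn through (the descent direction of the potential, which points toward the uncertain cells).

```python
import numpy as np

def potential_turn_command(uav_pos, uav_heading, u_map, eps=1e-9):
    """Sketch of the potential-field control law: sum the contributions
    (4.4)-(4.5) of all uncertain cells, estimate the gradient at the UAV
    position, and return the signed angle (positive = turn left) between
    the UAV's orientation vector and the desired direction."""
    def potential(pos):
        p = 0.0
        for (j, k), u in np.ndenumerate(u_map):
            if u > 0.0:                   # certain cells do not contribute
                p += -u / (np.hypot(pos[0] - j, pos[1] - k) + eps)
        return p

    h = 0.1  # central-difference step for the gradient estimate
    gx = (potential((uav_pos[0] + h, uav_pos[1])) -
          potential((uav_pos[0] - h, uav_pos[1]))) / (2 * h)
    gy = (potential((uav_pos[0], uav_pos[1] + h)) -
          potential((uav_pos[0], uav_pos[1] - h))) / (2 * h)

    d = np.array([-gx, -gy])              # descent direction: toward uncertainty
    d = d / (np.linalg.norm(d) + eps)     # normalize to unity
    hdg = np.array([np.cos(uav_heading), np.sin(uav_heading)])
    # Signed angle from the orientation vector to the desired direction
    return float(np.arctan2(hdg[0] * d[1] - hdg[1] * d[0], hdg @ d))
```

With a single uncertain cell directly ahead, the commanded turn is zero; facing ninety degrees away from it, the command is a quarter-turn toward the cell.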
Potential Method Deficiencies
One major deficiency of the potential method is its poor performance on symmetric
uncertainty distributions. The potential method directs the UAV to set a course
towards the centroid of the total uncertainty distribution, despite the possibility that
the centroid is devoid of any uncertain cells. One case that illustrates this downfall
particularly well is a doughnut-shaped uncertainty distribution
with the UAV located at the centre of the distribution. In this case, the control law
continues to direct the UAV to re-search the centre of the distribution where no
uncertain cells are present. It continues to search the inner boundary of the distribution
while neglecting the growing outer boundary, which contributes more to the increasing
uncertainty.
4.3.2 Receding Horizon Method
Receding horizon methods are typically applied to optimization problems with an unspecified end time, or to continuous problems with no end time defined at all. In these cases, optimizing over the entire time duration is often too computationally expensive or, in the continuous case, impossible. The general principle of the receding horizon method is to select a truncated look-ahead time interval over which the optimization can realistically be conducted. The performance parameter is optimized over this shortened time interval (or horizon) with respect to the controls available to the optimizer. A set of sequential controls is therefore obtained that will optimize the performance parameter
at the end of the look-ahead horizon. The first of these controls, corresponding to the current time step, is executed immediately. At the next time step, the entire optimization is repeated with the same horizon length, shifted one step further into the future. This defines the standard receding horizon method: continuously optimize over the finite look-ahead horizon, and allow that horizon to recede into the future as time progresses.
For a single UAV in the search problem, the performance measure is the total number of uncertain cells within the game space. The receding horizon is set to a predefined number of time steps into the future, each step with a corresponding turning control. These turning controls make up the search space for the optimizer, which re-optimizes the horizon controls at each time step.
Optimizer
The optimizer used is the Nelder-Mead simplex algorithm, chosen for its non-reliance on gradient calculations and its computational efficiency compared to alternative gradient-free optimization methods such as particle swarm and genetic algorithms.
The following is pseudo-code for each UAV performing receding horizon based search.
1. Define n=number of look ahead time steps
2. Use optimizer to find the next n best control inputs
3. Execute the first of the n control inputs
4. Update time step and go to step 2
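The loop above can be sketched as follows. The thesis optimizes the horizon controls with the Nelder-Mead simplex; to stay self-contained, this sketch substitutes a brute-force enumeration over a small discrete turn set. The turn set, speed, and sensor radius are assumed values, not parameters from the thesis.

```python
from itertools import product
import math

def receding_horizon_step(pos, heading, uncertain, horizon=3,
                          turns=(-math.pi / 4, 0.0, math.pi / 4),
                          speed=1.0, sensor_radius=0.3):
    """Pick the first control of the best look-ahead control sequence.

    `uncertain` is a set of (x, y) centres of still-uncertain cells.
    Returns (first_turn_of_best_sequence, cells_scanned_by_best_sequence).
    """
    best = None
    for seq in product(turns, repeat=horizon):
        x, y, h = pos[0], pos[1], heading
        scanned = set()
        for turn in seq:                 # simulate the candidate trajectory
            h += turn
            x += speed * math.cos(h)
            y += speed * math.sin(h)
            # a cell counts as scanned if it falls within sensor range
            scanned |= {c for c in uncertain
                        if math.hypot(c[0] - x, c[1] - y) <= sensor_radius}
        score = len(scanned)             # performance measure: cells scanned
        if best is None or score > best[0]:
            best = (score, seq[0])
    return best[1], best[0]
```

When an uncertain cell lies straight ahead within the horizon, the best first control is to fly straight; when every candidate trajectory scans nothing, the returned score of 0 signals the failure mode discussed below.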
Figure 4.10: Example of the receding horizon method. Three time steps are shown with a horizon length of 5 time steps. At each time step, the UAV re-optimizes to find the best combination of controls for the next 5 steps. The control corresponding to the current time step is executed.
Receding Horizon Method Deficiencies
A common problem with receding horizon methods is choosing the length of the look-ahead horizon. If the horizon is too long, the algorithm becomes excessively expensive computationally. If, on the other hand, the horizon is too short, UAV search times increase, compromising the purpose of any future planning. One rule could be to make the horizon as long as the available computational power permits; however, even then, the horizon length may not be sufficient to make a proper decision. Consider a receding horizon controlled UAV located so far from any uncertainty cell that, regardless of the look-ahead strategy it proposes, it cannot scan any uncertainty cells. In this case, the optimizer finds that no optimum look-ahead strategy exists, since all candidate trajectories are equally poor. In these scenarios, UAVs equipped with receding horizon control are rendered useless.
4.3.3 Hybrid Method
The hybrid method is a combination of the potential and receding horizon methods, where switching between the two search modes is performed depending on the UAV's position relative to uncertain grid cells in the game space. The default mode for the hybrid method is the receding horizon method described above. If, however, the UAV finds itself in a state where it is unable to find a set of controls better than any other, it switches to the potential method. This switch from receding horizon to potential corresponds to the case when a UAV is so far from any uncertainty cell that, regardless of the controls it selects for the receding horizon look-ahead steps, it cannot reach any uncertainty cell within the look-ahead interval. In this scenario, all look-ahead trajectories would be equally poor and there would be no reason to select one set of controls over another; hence,
the potential method would be executed until uncertainty cells come within reach of the UAV's look-ahead.
The following is pseudo-code for each UAV performing hybrid search.
1. Define n=number of look ahead time steps
2. Use optimizer to find the next n best control inputs
3. If optimizer has found a good trajectory
(a) Execute the first of the n control inputs
4. If optimizer has not found a good trajectory
(a) Calculate potential field contribution for every uncertain cell
(b) Sum up all contributions at UAV position
(c) Calculate the gradient of the potential field
(d) Normalize the gradient to unity
(e) Calculate the difference between the gradient angle and the current orientation angle
(f) Set the control to this difference
5. Update time step and go to step 2
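The switching logic above can be sketched in one self-contained function combining simplified versions of both modes. The discrete turn set, unit uncertainty values, finite-difference gradient, and the "any cell scanned" test for a good trajectory are all illustrative assumptions, not details from the thesis.

```python
import math
from itertools import product

def _scan_score(pos, heading, seq, uncertain, speed=1.0, radius=0.3):
    """Number of uncertain cells a candidate turn sequence would scan."""
    x, y, h = pos[0], pos[1], heading
    scanned = set()
    for turn in seq:
        h += turn
        x += speed * math.cos(h)
        y += speed * math.sin(h)
        scanned |= {c for c in uncertain
                    if math.hypot(c[0] - x, c[1] - y) <= radius}
    return len(scanned)

def hybrid_search_step(pos, heading, uncertain, horizon=3,
                       turns=(-math.pi / 4, 0.0, math.pi / 4)):
    """One control decision of the hybrid method (steps 1-5 above)."""
    # Default mode: receding horizon over all discrete turn sequences
    best_seq, best_score = None, -1
    for seq in product(turns, repeat=horizon):
        score = _scan_score(pos, heading, seq, uncertain)
        if score > best_score:
            best_seq, best_score = seq, score
    if best_score > 0:                   # a "good" trajectory exists
        return ("horizon", best_seq[0])
    # Fallback mode: steer down the virtual potential P = sum(-u / r)
    def field(x, y):
        return sum(-1.0 / max(math.hypot(x - cx, y - cy), 1e-9)
                   for cx, cy in uncertain)
    eps = 0.5
    gx = (field(pos[0] + eps, pos[1]) - field(pos[0] - eps, pos[1])) / (2 * eps)
    gy = (field(pos[0], pos[1] + eps) - field(pos[0], pos[1] - eps)) / (2 * eps)
    desired = math.atan2(-gy, -gx)
    diff = (desired - heading + math.pi) % (2 * math.pi) - math.pi
    return ("potential", diff)
```

With uncertainty inside look-ahead reach the function stays in receding horizon mode; with uncertainty far outside it, every trajectory scores zero and the potential fallback steers the UAV toward the distant uncertainty.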
The following plot compares the effectiveness of the hybrid method against the potential method alone, especially at high initial separation distances. Search simulations were run in a 25×25 grid game space, although the search UAV was permitted to start outside this region. Initial conditions such as UAV and target position and orientation were randomized, and a maximum search time of 1500 time steps was specified. A single UAV was used to locate the single target. Search times for the simulations are plotted against the initial separation of the UAV and the target.
Figure 4.11: Hybrid search method. (a) Initial sample game state with two uncertainty regions and a single UAV. (b) A candidate receding horizon trajectory is found and executed. (c) After the first uncertainty region is scanned, the UAV attempts to use the receding horizon but cannot find any optimal or approximately optimal trajectories, since the remaining uncertainty is not contained within the receding horizon boundaries. (d) Since the receding horizon failed to yield a trajectory, the potential method is used for guidance.
Figure 4.12: Comparison of the Hybrid method and the Potential method in simulated searches of varying initial separation between target and search UAVs. 2 UAVs used with a maximum time-out of 1500 time steps.
Figure 4.12 demonstrates the advantage of using the hybrid method over exclusively potential-method control. At low initial separations, there is no appreciable difference between the two; however, as the initial separation between the target and the search UAV increases, there are significantly more time-outs for the potential method, that is, search UAVs using the potential method are unable to find the target before the maximum allowed time. The reason, as mentioned in Section 4.3.1, is that a search UAV under potential control tends to head towards the centroid of the uncertainty regardless of whether there is any uncertainty at the centroid. The potential method thus spends an excess of time searching the centroid, resulting in an inefficient search. The hybrid method, on the other hand, does not suffer from this problem.
4.3.4 Multi-UAV Coordination Algorithm
Up until now, no coordination between UAVs has been discussed: all three methods (potential, receding horizon, and hybrid) were defined in the limited context of a single UAV. In this section, coordination between UAVs is introduced with the intention of enhancing the performance of the team search for the target.
Coordination is manifested through the way individual UAVs within the team perceive uncertainty cells. Up to this point, all uncertainty cells were considered of equal value to each UAV. For coordination purposes, weighted uncertainty is now proposed. The underlying principle is to make cells closer to a UAV more attractive to scan while making cells closer to the UAV's teammates less attractive to scan. As demonstrated under simulation, this reduces redundant search trajectories in which UAVs search the same area, ultimately reducing both the time to cover the entire area and the search time to find the moving target.
The weighted uncertainty values are thus altered versions of the standard uncertainty cells: cells closer to the individual UAV are made of greater value to scan, while those closer to other UAVs are made of less value to scan.
Both of these objectives can be accomplished by attenuating the uncertainty values as
follows:
The standard uncertainty representation for the UAVs is

U^t_{N×M} = [u^t_{i,j}],    u^t_{i,j} = { 0  if grid (i, j) is certain
                                        { 1  otherwise                    (4.6)
The weighted uncertainty representation for the ith UAV is

iŪ^t_{N×M} = [iū^t_{j,k}],    iū^t_{j,k} = { 0  if grid (j, k) is certain
                                           { ∏_{l=1}^{n} (1 − exp[−r_l² / (2σ²)])  otherwise    (4.7)
(4.7)
where iŪ^t_{N×M} is the weighted uncertainty map valid for UAV i at time step t, of size N×M. The quantity iū^t_{j,k} is the weighted uncertainty value for an individual grid element of iŪ^t_{N×M}, n is the total number of UAVs, l is the UAV index, r_l is the distance from the lth UAV to grid element (j, k), and σ is the attenuation factor used to adjust the degree to which the UAVs avoid uncertainty close to their teammates.
As the equation shows, cells that are closer to other UAVs are attenuated in value. Cells that are further from teammates, however, retain their initial uncertainty value and are therefore of greater value to be scanned by the UAV in question.
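Equation 4.7 can be sketched directly. The transcribed product runs over l = 1..n; whether it excludes UAV i itself is not explicit, but excluding it (l ≠ i) matches the stated intent of keeping cells near the UAV itself attractive, and is assumed here. The value of σ and the dictionary map representation are also assumptions.

```python
import math

def weighted_uncertainty(i, uav_positions, uncertainty, sigma=2.0):
    """Weighted uncertainty map for UAV i (sketch of Eq. 4.7).

    uav_positions: list of (x, y) positions for the n UAVs (index l)
    uncertainty:   dict {(j, k): u} with u = 1 for uncertain cells, 0 certain
    sigma:         attenuation factor (value here is an assumption)
    Cells near teammates are attenuated toward zero; cells far from every
    teammate keep their value, so UAV i prefers them.
    """
    weighted = {}
    for (j, k), u in uncertainty.items():
        if u == 0:                   # certain cells stay at zero
            weighted[(j, k)] = 0.0
            continue
        w = 1.0
        for l, (x, y) in enumerate(uav_positions):
            if l == i:               # assumed: no self-attenuation
                continue
            r2 = (x - j) ** 2 + (y - k) ** 2
            w *= 1.0 - math.exp(-r2 / (2.0 * sigma ** 2))
        weighted[(j, k)] = w
    return weighted
```

For two UAVs at opposite corners, each UAV's map attenuates the uncertainty sitting on its teammate to nearly zero while leaving the uncertainty near itself at full value, producing the implicit assignment described in Figure 4.13.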
Sixty random search scenarios were tested under simulation using both cooperative and non-cooperative UAVs. UAV initial conditions, target initial conditions, and the initial uncertainty region were all randomized. The only restrictions imposed were that (1) the game space was limited to a 25 × 25 grid, (2) all agents were contained within the grid, and (3) the target was initially within an uncertain region. The following table summarizes the results:
Figure 4.13: Under the coordination algorithm, uncertainty closer to a teammate is reduced in value to be searched. In the above scenario, since R(1, 2) > R(2, 2), U2 is reduced in value from UAV 1's perspective. Likewise, since R(2, 1) > R(1, 1), U1 is reduced in value from UAV 2's perspective. As a result, since UAVs are designed to reduce the greatest total uncertainty value, a type of uncertainty assignment is achieved, with UAV 1 covering U1 and UAV 2 covering U2.
Table 4.1: Comparison of cooperative and non-cooperative search
Type            | Wins | Avg. time to target found (time steps)
Cooperative     | 33   | 108
Non-Cooperative | 21   | 150
4.4 Benchmark Comparison
The Zamboni method is an exhaustive search method that can be adapted to autonomous multi-vehicle searching. Few benchmark search algorithms for cooperative searching exist; however, the Zamboni method in particular has been used in the past for algorithm comparison. A full description of the Zamboni method, along with other exhaustive search methods, can be found in the work of Ablavsky et al. [1]. The Zamboni method involves a series of loops, each comprising (1) a front-sweep, followed by (2) a 180-degree turn-around, and finally (3) a back-sweep (see Figure 4.14 for a sample search pattern). Each sweep (both front and back) is characterized by a full transition of the UAV from one side of the game space to the directly opposite side. Each successive loop is advanced slightly further than the last, resulting in a creeping boundary between the unsearched and searched space. The looping continues until the front-sweep of the first loop meets the back-sweep of the last loop. At this point, the UAV advances several units ahead and begins the process over again with another series of loops searching new area.
One hundred simulations were conducted comparing the cooperative hybrid search method to the Zamboni method. Two UAVs were assigned the mission of finding a target in a game space at varying starting distances. The simulation was first run using cooperative hybrid searching, followed by a re-run using the Zamboni method. The results appear in Figure 4.15 and Table 4.2. The simulations demonstrate that the coordinated hybrid method yields a lower capture time more than twice as often as the Zamboni method, with a 27% reduction in the average time to find the target.
4.5 Summary
This chapter discusses the first phase of the UAV mission, which is to locate a single moving target. Initially available to the UAV search team are regions where the target
Figure 4.14: Sample Zamboni search pattern.
Table 4.2: Comparison of cooperative hybrid and Zamboni exhaustive search
Type               | Wins (3 draws) | Avg. time to target found (time steps)
Cooperative Hybrid | 67             | 238
Zamboni            | 30             | 324
Figure 4.15: Simulation results after 100 trials comparing time to target found when using Cooperative Hybrid versus Zamboni search algorithms.
could possibly be. These regions, called uncertainty regions, are known before the start of the search. The UAV team uses a proposed modified diffusion model to manage the time-evolving nature of the uncertainty regions due to the target's ability to move from one location to another. Individual search algorithms are based on a