
  • Cooperative UAV Search and Intercept

    by

    Andrew Ke-Ping Sun

    A thesis submitted in conformity with the requirements
    for the degree of Master of Applied Science

    Graduate Department of Aerospace Science and Engineering
    University of Toronto

    Copyright © 2009 by Andrew Ke-Ping Sun

  • Abstract

    Cooperative UAV Search and Intercept

    Andrew Ke-Ping Sun

    Master of Applied Science

    Graduate Department of Aerospace Science and Engineering

    University of Toronto

    2009

    In this thesis, a solution to the multi-Unmanned Aerial Vehicle (UAV) search and
    intercept problem for a moving target is presented. For the search phase, an adapted
    diffusion-based algorithm is used to manage the target uncertainty, while individual
    UAVs are controlled with a hybrid receding horizon / potential method. The coordinated
    search is made possible by an uncertainty weighting process. The team intercept phase
    algorithm is a behavioural approach based on the analytical solution of Isaacs'
    Single-Pursuer/Single-Evader (SPSE) homicidal chauffeur problem. In this formulation,
    the intercepting control is taken to be a linear combination of the individual SPSE
    controls that would exist for each of the evader/pursuer pairs. A particle swarm
    optimizer is applied to find approximately optimal weighting coefficients for
    discretized intervals of the game time. Simulations for the team search, team
    intercept, and combined search and intercept problems are presented.


  • Acknowledgements

    First and foremost, I would like to thank my research supervisor, Professor Hugh Liu.
    This thesis would not have been possible without his patience, his guidance, and his
    willingness to always make time for his students regardless of how crowded his
    schedule becomes. I would also like to thank the two other professors on my research
    assessment committee, Professors Peter Grant and Chris Damaren, for their helpful
    suggestions. A big thank you goes out to all my fellow lab mates in the Flight
    Systems and Control Group for making my time at UTIAS so enjoyable, with special
    thanks going out to Ruben, Eric, Yoshi, Sohrab and Keith.

    On a more personal note, I would like to recognize those outside of my academic life
    who have supported me throughout the past two years. To my girlfriend Ada, my two
    brothers Mark and Christopher, my sister Stephanie, my grandparents, and my father:
    I am forever grateful for your guidance and words of encouragement. Last but not
    least, I owe my deepest gratitude to my mother. Her perseverance, unconditional
    support, and mental toughness have been truly inspirational, and for that I dedicate
    this thesis to her.


  • Contents

    1 Introduction 1

    1.1 Purpose of Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    1.1.1 Problem Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    1.2 Thesis Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    2 Literature Survey 8

    2.1 Cooperative UAV Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

    2.2 Cooperative UAV Pursuit and Evasion . . . . . . . . . . . . . . . . . . . 11

    2.3 Thesis Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    3 Background 13

    3.1 Optimization Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    3.1.1 Nelder-Mead Simplex Algorithm . . . . . . . . . . . . . . . . . . . 13

    3.1.2 Particle Swarm Optimization . . . . . . . . . . . . . . . . . . . . 16

    3.2 Game Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    3.2.1 The Classic Pursuit and Evasion . . . . . . . . . . . . . . . . . . 18

    3.2.2 Isaacs’ Pursuit and Evasion Problem and Solution . . . . . . . . . 19

    3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    4 Cooperative UAV Search 23

    4.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23


  • 4.1.1 Grid World Representation . . . . . . . . . . . . . . . . . . . . . . 24

    4.1.2 Target Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . 25

    4.2 Target Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    4.3 Search Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    4.3.1 Potential Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    4.3.2 Receding Horizon Method . . . . . . . . . . . . . . . . . . . . . . 37

    4.3.3 Hybrid Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    4.3.4 Multi-UAV Coordination Algorithm . . . . . . . . . . . . . . . . . 44

    4.4 Benchmark Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

    4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

    5 Cooperative UAV Intercept 51

    5.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

    5.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    5.2.1 Evasion Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    5.2.2 Pursuit Team Behaviour . . . . . . . . . . . . . . . . . . . . . . . 54

    5.2.3 Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

    5.2.4 Non-cooperative and Cooperative Simulations . . . . . . . . . . . 56

    5.3 Simulation Cases and Results . . . . . . . . . . . . . . . . . . . . . . . . 57

    5.3.1 Non-Cooperative Chase Simulations . . . . . . . . . . . . . . . . . 57

    5.3.2 Cooperative Chase Simulations . . . . . . . . . . . . . . . . . . . 61

    5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

    6 Search and Intercept Simulation Results 63

    6.1 Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

    6.2 Search Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    6.3 Intercept Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66


  • 7 Conclusions 67

    7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

    Bibliography 70


  • List of Tables

    4.1 Comparison of cooperative and non-cooperative search . . . . . . . . . . 46

    4.2 Comparison of cooperative hybrid and Zamboni exhaustive search . . . . 48

    5.1 Comparison of evasion capture times for simulations 1-A, 1-B and 1-C . . 58

    5.2 Comparison of evasion capture times for simulations 3-A and 3-B . . . . 60


  • List of Figures

    1.1 Overview of UAV cooperation task: Searching. . . . . . . . . . . . . . . . 6

    1.2 Overview of UAV cooperation task: Interception. . . . . . . . . . . . . . 7

    3.1 Simplex in 2D search space. . . . . . . . . . . . . . . . . . . . . . . . . . 14

    3.2 Nelder-Mead Simplex Operations . . . . . . . . . . . . . . . . . . . . . 15

    3.3 Isaacs' SPSE regions. . . . . . . . . . . . . . . . . . . . . . . . . . 20

    3.4 Region 1: The primary path. . . . . . . . . . . . . . . . . . . . . . . . . . 21

    3.5 Region 2: The universal path. . . . . . . . . . . . . . . . . . . . . . . . . 21

    3.6 Region 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    4.1 Grid approximation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

    4.2 Uncertainty represented by gridded map. . . . . . . . . . . . . . . . . . . 26

    4.3 Moving target implication of time varying uncertainty. . . . . . . . . . . 28

    4.4 Finite element diffusion process. . . . . . . . . . . . . . . . . . . . . . . . 29

    4.5 Diffusion modification 1: Maintaining two maps. . . . . . . . . . . . . . . 30

    4.6 Diffusion Modification 3: Uncertainty source for regular boundary growth. 31

    4.7 Contour plots of the modified diffusion model for uncertainty management.

    The green circle represents the uncertainty boundary. . . . . . . . . . . . 33

    4.8 Example of virtual potential field generated from an uncertainty distribution. 35

    4.9 UAV located in potential field. In this particular case, the UAV control

    law would dictate a left turn to align the orientation vector with the gradient. 36


  • 4.10 Example receding horizon method. Three time steps are shown with a

    horizon length of 5 time steps. At each time step, the UAV reoptimizes to

    find the best combination of controls for the next 5 steps. The time step

    that corresponds to the current time step is executed. . . . . . . . . . . . 39

    4.11 Hybrid search method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    4.12 Comparison of Hybrid method and Potential method in simulated searchs

    of varying initial separation between target and search UAVs. 2 UAVs

    used with a maximum time out of 1500 time steps. . . . . . . . . . . . . 43

    4.13 Under the coordination algorithm, uncertainty closer to a teammate will

    be reduced in value to be searched. In the above scenario, since R(1, 2) >

    R(2, 2) then U2 is reduced in value from UAV 1’s perspective. Likewise,

    since R(2, 1) > R(1, 1) then U1 is reduced in value from UAV 2’s perspec-

    tive. As a result, since UAVs are designed to reduce the greatest total

    uncertainty value, a type of uncertainty assignment is acheived with UAV

    1 covering U1 and UAV 2 covering U2. . . . . . . . . . . . . . . . . . . 46

    4.14 Sample Zamboni search pattern. . . . . . . . . . . . . . . . . . . . . . . . 48

    4.15 Simulation results after 100 trials comparing time to target found when

    using Cooperative Hybrid versus Zamboni search algorithms. . . . . . . . 49

    5.1 Agent dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    5.2 Blocking and Chasing behaviours . . . . . . . . . . . . . . . . . . . . . . 56

    5.3 Pursuer 1 Initial Condition=[0 0 0]; Pursuer 2 Initial Condition=[5 5 -π/2];

    Evader Initial Condition=[0 5]; . . . . . . . . . . . . . . . . . . . . . . . 58

    5.4 Simulation 2 (MPSE Evader Control): Pursuer 1 Initial Condition=[0 0

    0]; Pursuer 2 Initial Condition=[5 0]; Evader Initial Condition=[0 5 -π/2];

    Time to capture=1.82s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

    5.5 Pursuer 1 Initial Condition=[-5 0 0]; Pursuer 2 Initial Condition=[5 0 π];

    Evader Initial Condition=[0 0]; . . . . . . . . . . . . . . . . . . . . . . . 60


  • 5.6 Pursuit chase trajectories: Pursuer 1 Initial Condition=[−15/√2 −15/√2 π/4];

    Pursuer 2 Initial Condition=[1/√2 −1/√2 3π/4]; Evader Initial Condition=[0 0]; . . . 62

    6.1 Initial conditions for search simulation. . . . . . . . . . . . . . . . . . . . 64

    6.2 Searching for target. . . . . . . . . . . . . . . . . . . . . . . . . . 65

    6.3 Time step= 110; Target found. . . . . . . . . . . . . . . . . . . . . . . . 65

    6.4 Target intercept mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66


  • Chapter 1

    Introduction

    The Uninhabited Aerial Vehicle (UAV) has come a long way in terms of sophistication
    when compared to its early predecessors. One of the first historically documented
    applications of unmanned flying machines took place against the backdrop of the
    U.S. Civil War [19]. Charles Perley, a New York-based inventor, filed a patent for
    the design of a lighter-than-air balloon, laden with explosives, that was to be
    launched over enemy lines and travel through the air out of reach of enemy
    obstruction. The hope was that when the UAV eventually detonated, it would have
    journeyed close enough to a significant enemy target, thereby gaining a small
    victory without risking a single soldier. Unfortunately for the users of such a
    device, the balloons were not equipped with any form of control and, as such, were
    at the mercy of the atmosphere, including winds that would often shift from one
    direction to another without warning. To both Union and Confederate military
    commanders, uncontrollable flying explosives must have seemed more of a liability
    than a useful means of attack, since the projects on both sides were quickly
    abandoned in favour of more traditional types of warfare.

    Clearly, much has changed: we are now witnessing an accelerating interest in UAV
    technology, which is in current use and planned for future use by many in the
    global community. Unlike the orphaned maverick balloons of the past, today's UAVs
    are not only widely perceived as advanced, viable solutions, but to many they also
    represent the future of flight for both military and civilian uses. Take, for
    example, the contrast of UAV use in the first and second Gulf Wars. In the first
    Gulf War, a total of 1641 hours of flight time were logged with UAVs. According to
    a May 2001 Department of the Navy report, this translated into "at least one UAV
    was airborne at all times during Desert Storm" [6]. Comparing this with the second
    Gulf War, where a recent Associated Press article estimates the total number of UAV
    hours flown to be in excess of 500,000 [2], one can see a dramatic increase in the
    reliance on UAV technology. The role of UAVs is becoming more accepted and
    entrenched in the arsenal of tools at the modern military's disposal.

    Although military applications have dominated UAV use to date, there is also
    growing interest in applying UAV technology to civilian applications [26]. UAV
    prototypes have been built and are under development as solutions for monitoring
    forest fires, monitoring wildlife migration, and delivering medical supplies. Also
    remarkable is the breadth of countries employing and developing UAVs. A field that
    was once dominated by only a few players can now boast strong international
    participation. In the latest AIAA UAV roundup survey, conducted in 2007 [27], 36
    countries were found to be collectively working on over 200 UAV projects in active
    use, under development, or in production.

    Based on the increasing reliance on UAVs in military and civilian scenarios, the
    potential for application to a wide array of problems including scientific and
    civilian uses, and the amount of global participation, one can safely state that
    research into UAV systems and applications will expand for at least the foreseeable
    future. New problems will arise, and exploration and research into how these
    machines can be designed to meet new demands will continue.


    1.1 Purpose of Study

    The vast majority of current UAVs are still more or less remotely controlled by a
    human operator at a ground station. The Predator and Global Hawk are both normally
    piloted by personnel located at a ground station [19]. The hand-launched Raven is
    flown by a specialist soldier located directly in the field [19]. Admittedly, some
    UAVs do have the capability to perform some tasks independently of external
    operators. Examples include heading holds, altitude holds, waypoint navigation, and
    various other maneuvers including loiters, climbs, descents, and the flying of
    circuit patterns or approaches. These tasks can more or less be handled by
    existing, well-developed control techniques. Yet for more complicated tasks and
    maneuvers, a higher level of decision making is necessary, and humans are often
    still relied upon to make the decisions or, in most cases, to be in direct control
    of the aircraft itself.

    This gap in system autonomy represents an excellent opportunity for many
    researchers of UAV systems. Firstly, humans perform very well when confronted with
    new tasks; their flexibility and intuition are not to be discounted. Yet for many
    aircraft tasks, flexibility and intuition are only seldom called upon. More likely,
    an aircraft will be required to perform the same mission many times over, with
    little difference between the multiple sorties. In the vast majority of such cases,
    automation has a significant advantage, since consistency, accuracy, and precision
    are all weak points of the human operator. Secondly, if human operators are
    required to be in continuous contact with the aircraft, then aircraft missions are
    limited by the range and quality of the communication method employed. An aircraft
    capable of making higher-level decisions independently would potentially be able to
    fly a much larger class of extended missions, while at the same time maintaining
    robustness to severed and intermittent communication links.


    1.1.1 Problem Overview

    The study conducted and detailed in this thesis deals with the high-level decision
    algorithms necessary to handle problems of coordination between members of a UAV
    team assigned to conduct a specific task collectively. Multiple UAVs commissioned
    to complete a task have several advantages over a lone UAV. Assigning more UAVs to
    a given task provides more flexibility to mission planners, since multiple units
    are capable of undertaking many types of missions that a single UAV could not.
    Tasks can also often be completed faster in time-sensitive missions with multiple
    agents. Finally, multiple-UAV teams bring a greater degree of robustness, since the
    impact of losing one aircraft out of many is diminished compared to losing one out
    of one.

    The specific cooperation task dealt with in this study is that of multiple UAVs
    assigned the team task of searching for and intercepting a moving target. This
    scenario has real analogues in both the military and civilian realms. On the
    military side, search-and-intercept UAVs would be useful in missions where
    commanders wish to pursue or find evading enemy units. After finding the target,
    the UAVs could either track it and relay information back to the command station,
    or engage the target cooperatively as a team. On the civilian side,
    search-and-rescue missions could be facilitated by a similar system. An example
    scenario is that of a plane crash where the accident is known to have taken place,
    but the exact location of the survivors is uncertain. A group of UAVs could be
    dispatched first to search for survivors and second, if they are found, to relay
    information on their status to search-and-rescue crews.

    The idealized scenario begins with a team of UAVs assigned the collective mission
    of searching for and intercepting a target in minimum time (refer to Figure 1.1 and
    Figure 1.2). The exact location of the single target is not known, but the target
    is known to be within a region of known position and dimensions, herein referred to
    as the Uncertainty Region. The mission can be divided into two distinct operations:
    searching and intercepting. In the search phase, the UAVs collectively reduce the
    uncertainty by performing sensor sweeps of the uncertain region. Sensors are
    idealized to be perfect, and therefore a single sensor sweep of a given area is
    sufficient to ascertain whether the target is in that area at that time instant.
    The searching process continues until the target location is ascertained.
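    The perfect-sensor idealization can be sketched as a toy grid model (an
    illustrative sketch with invented names, not the thesis implementation): a single
    sweep of a cell is enough to decide, at that instant, whether the target occupies it.

```python
# Hypothetical sketch (not the thesis code) of the perfect-sensor assumption:
# one sweep of a grid cell decides whether the target occupies it right now.
def sweep(searched_cells, uav_cell, target_cell):
    """Mark the UAV's current cell as swept; return True iff the target is there."""
    searched_cells.add(uav_cell)
    return uav_cell == target_cell

searched = set()                 # cells already swept
target = (3, 7)                  # true target cell (unknown to the searchers)

print(sweep(searched, (0, 0), target))   # False: a clean miss, cell now swept
print(sweep(searched, (3, 7), target))   # True: detection ends the search phase
```

    Under this idealization, no cell ever needs a second sweep at the same instant; a
    moving target, however, can re-enter previously swept cells, which is why the
    uncertainty must grow again over time.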

    The end of the search phase initiates the intercept mode for the UAVs. The goal in
    this phase is to intercept the target in minimum time. Which UAV ultimately
    intercepts the target is irrelevant; what matters is the time at which the
    interception occurs. Since minimum interception time is desirable, the UAVs
    collectively develop strategies to accommodate this goal. Interception planning
    yields trajectories assigned to each of the UAVs, which are then executed to
    capture the target.

    In this simulation study, an algorithm for cooperative search and cooperative
    intercept is developed and analyzed for a team of multiple UAVs. General
    assumptions include:

    1. there is only a single target;

    2. the target is capable of moving;

    3. the maximum velocity of the target is known a priori.

    More assumptions will follow and will be detailed as they become relevant to the
    discussion.

    1.2 Thesis Layout

    The thesis is divided into seven chapters. In the current Chapter 1, motivation and
    a general overview of the problem are provided. In Chapter 2, a survey of the
    current state of research is provided, detailing relevant works and how this study
    complements existing knowledge. Chapter 3 is a brief overview of the algorithms,
    methods, and concepts used in the development of the search and intercept solution.
    Chapters 4 and 5 detail the development and implementation of the cooperative
    search and intercept algorithms, respectively. Simulation results of the combined
    search/intercept algorithms are presented in Chapter 6. Finally, future work and
    concluding remarks are presented in Chapter 7.

    (a) A team of UAVs is assigned the collective task of searching for and
    intercepting a moving target as fast as possible. The exact whereabouts of the
    target are unknown.

    (b) The UAVs search for the target by performing sensor sweeps over the
    uncertainty regions.

    (c) When the target is found, the UAVs discontinue search operations and initiate
    intercept planning.

    Figure 1.1: Overview of UAV cooperation task: Searching.

    (a) The end conditions of the search problem are the initial conditions of the
    intercept problem. The objective is now to intercept the target cooperatively in
    minimum time.

    (b) The UAVs collectively plan intercept trajectories assigned to each UAV.

    (c) The target is intercepted.

    Figure 1.2: Overview of UAV cooperation task: Interception.

  • Chapter 2

    Literature Survey

    There are two distinct UAV tasks studied in this thesis: cooperative UAV search and
    cooperative UAV intercept. Much of the existing work has focused on one or the
    other, and the literature survey is accordingly divided into these two areas of
    study.

    2.1 Cooperative UAV Search

    Studies of how a coordinated search can be implemented in autonomous vehicle
    systems have been conducted and continue to attract much scientific interest, at
    least in part due to the wide demand for such systems in military, scientific, and
    civilian applications. Some notable examples include planetary exploration for the
    scientific community [10], target localization for military missions [22], and
    search and rescue for civilian use [17].

    The archetypal collaborative search problem is characterized by finding a target or
    targets while keeping some parameter to a minimum. Targets can be stationary or
    moving, and the minimized parameter is often based on the limitations of the
    mission or UAV (examples include time to target found or minimum fuel). Several
    different methods exist and are currently being studied. The following is a
    sampling of the most common methods currently under investigation.


    Exhaustive searches are often used as benchmarks and, for the most part, use some
    form of open-loop, pre-defined search pattern such as the Zamboni pattern [1] or
    the progressive spiral-in maneuver [25]. The Zamboni method, aptly named after the
    ice-conditioning machine due to the similarity of the generated paths, involves
    making successive lateral sweeps, back and forth, across the search area. Each
    lateral sweep covers new ground, and the boundary between searched and unsearched
    space gradually moves forward until the entire region is covered. The progressive
    spiral-in, as the name implies, involves maneuvering the agents to cover the
    perimeter of the search space. The agents move in the same circular direction,
    either clockwise or counterclockwise, and gradually reduce their turning radius.
    This method in particular has an advantage when dealing with a moving-target
    scenario, since it can be guaranteed that, under the right conditions, a moving
    target does not escape and will eventually be captured.

    Exhaustive searches are intuitive and practical to implement. However, since they
    are open loop, they lack robustness to unforeseen events such as the loss of an
    agent or any other unexpected agent behaviour. Furthermore, certain search space
    characteristics may make an exhaustive search impractical, such as a search space
    populated with sparsely distributed potential search regions. There is no point in
    searching the entire space when a visit to a select few waypoints would be
    sufficient; in such a case, an exhaustive search would be a waste of time and
    resources. To overcome this, closed-loop search algorithms are preferred.

    One closed-loop approach is to transform the search problem into a task assignment
    problem. In these cases, the search space is subdivided into distinct regions, and
    search agents are assigned to the individual regions. In the work of Enns et
    al. [7], the search space is divided into flying lanes, and a search UAV is
    assigned to each lane. A market-oriented programming optimization method is used to
    perform the actual assignment, with search agents bidding on the individual lanes
    to search. Another example of recasting the search task as a task assignment
    problem is the work of Zhang et al. [30], in which UAVs are assigned to search for
    and prosecute targets in a game space. Coordination is achieved through the
    assignment of navigation points to individual UAVs, where the points to be assigned
    are those known to be most likely to coincide with the target's position.

    By far the most common approach to cooperative search is the receding horizon
    method, commonly used in continuous-time problems or problems where the end time is
    not explicitly defined. The basic receding horizon method involves looking ahead in
    time by a defined duration and then determining the best strategy for that
    particular look-ahead interval. The control input corresponding to the
    instantaneous time step is executed, and at the next time step the process is
    repeated. The exact implementation of coordination between search agents varies
    widely. In [21], a global optimization performance index is used by the search
    agents, and individual search agents base their receding horizon controls on
    optimizing that index. The index used is a multi-objective weighted sum which
    balances tasks that include searching, collision avoidance, and minimum search
    overlap, among others. In [29], a similar receding horizon method is used as a
    long-term planner with a short-term planner acting in parallel; UAVs therefore
    choose their trajectories based on the benefits of adopting the long-term control
    policy, the short-term control policy, or a combination of the two. Coordination in
    this case is achieved through virtual potential fields that minimize overlap and
    UAV collisions. Jin et al. [13] also use receding horizon control, with the
    distinction that replanning is done only at the end of the look-ahead interval, and
    not at every step, for search phases.
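    The look-ahead-and-replan cycle described above can be sketched generically (a
    minimal, hypothetical illustration: the one-dimensional dynamics, reward, and
    horizon length here are invented stand-ins, not any cited author's formulation).
    At each step the agent scores every control sequence of horizon length, then
    executes only the first control of the best sequence before replanning.

```python
from itertools import product

# Minimal receding-horizon sketch (illustrative only; names are invented).
CONTROLS = [-1, 0, +1]          # e.g. turn left / go straight / turn right

def simulate(state, control):
    return state + control       # trivial 1-D "dynamics" for illustration

def reward(state, goal):
    return -abs(state - goal)    # higher is better: closer to the goal

def receding_horizon_step(state, goal, horizon):
    """Score every control sequence over the horizon; return the first control
    of the best sequence (the only control that is actually executed)."""
    best_seq, best_score = None, float("-inf")
    for seq in product(CONTROLS, repeat=horizon):
        s, score = state, 0.0
        for u in seq:            # roll the model forward over the look-ahead
            s = simulate(s, u)
            score += reward(s, goal)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq[0]

state, goal = 0, 4
for _ in range(6):               # replan from the new state at every step
    state = simulate(state, receding_horizon_step(state, goal, horizon=3))
print(state)                     # the agent has driven the state to the goal: 4
```

    Executing only the first control and replanning is what makes the method closed
    loop: new information (a lost teammate, an updated uncertainty map) is absorbed at
    the next replanning step.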

    Other commonly encountered methods are potential methods, in which a virtual
    potential field is used to guide the UAVs to regions that have yet to be searched.
    In a study done in cooperation with Northrop Grumman [23], each unsearched grid
    cell acts as a virtual potential field source, much as a mass is a gravitational
    potential field source. The gradient of the potential field, calculated at any UAV
    location, is used as a guiding direction vector for the search agent. In that
    particular study, coordination was not explicitly addressed; under simulation,
    overlap and redundancy in search trajectories are observed to be commonplace under
    potential field control.
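    The mass-and-gravity analogy can be made concrete with a small sketch (purely
    illustrative; the potential function and its decay law are assumed here, not taken
    from the cited study): each unsearched cell contributes an attractive potential,
    and the UAV's guiding direction is the local gradient of the summed field.

```python
import math

# Illustrative potential-field guidance sketch (invented toy example).
# Each unsearched cell contributes an attractive potential that decays with
# distance; the UAV steers along the gradient of the summed field.
def potential(pos, unsearched_cells):
    x, y = pos
    return sum(1.0 / (math.hypot(x - cx, y - cy) + 1.0)
               for cx, cy in unsearched_cells)

def gradient(pos, unsearched_cells, eps=1e-4):
    """Central finite-difference gradient of the field at the UAV position."""
    x, y = pos
    dx = (potential((x + eps, y), unsearched_cells)
          - potential((x - eps, y), unsearched_cells)) / (2 * eps)
    dy = (potential((x, y + eps), unsearched_cells)
          - potential((x, y - eps), unsearched_cells)) / (2 * eps)
    return dx, dy

unsearched = [(5.0, 0.0), (6.0, 1.0)]     # a cluster of unsearched cells
gx, gy = gradient((0.0, 0.0), unsearched)
print(gx > 0)    # the gradient points toward the unsearched cluster
```

    Because every UAV in the team follows the same field, two nearby UAVs are pulled
    toward the same cluster, which is consistent with the overlap and redundancy
    observed in the cited simulations when coordination is not addressed.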

    2.2 Cooperative UAV Pursuit and Evasion

    Pursuit and evasion problems have been studied since Isaacs first published his
    work on differential game theory [11]. His initial work laid the foundation for the
    single pursuit and evasion (PE) game and presented an analytical technique for
    finding optimal path trajectories based on the principle of a constant-valued game,
    referred to as the main equation. Other analytical techniques were subsequently
    developed to solve various types of PE problems, such as the point-mass
    interception problem [3] and the isotropic rocket [5].

    Lately, numerical techniques have largely dominated PE research, primarily due to
    their flexibility when applied to the general breadth of PE problems. This has
    allowed researchers to explore more complex PE problems, such as multi-agent
    problems, which are currently an active and growing area of study. An example of an
    early study of the Multiple-Pursuer/Single-Evader (MPSE) problem is the work of
    Benda et al. [4], where agent dynamics are approximated with grid-world players.

    Alternate approaches to the MPSE problem include the genetically inspired methods

    of Haynes and Sen [8] where agent trajectories are generated from a distributed control

    algorithm based on a genetic programming approach called the strongly typed genetic

    programming method developed initially by Montana [16]. Another distributed approach

    by Yamaguchi [28] uses a hybrid behavioural reactive framework algorithm to simulate

    robot hunting cooperation. More recently, Jang and Tomlin [12] looked into a similar

    multi-agent problem, solving it using level set functions and a novel method for reflecting

    forward reachable sets. Li and Cruz[14] also studied the same problem with a look-ahead

  • Chapter 2. Literature Survey 12

optimization approach. Yet another work by Li, Cruz and colleagues[15] translated the multi-pursuer problem into a target assignment problem. In this work, a hierarchical approach is used where an upper-level optimization determines which pursuer targets

    which evader. Once pursuit pairs are determined, the pursuers chase the targets using

the results of Isaacs' analytical solution to the one-on-one pursuit problem.

    2.3 Thesis Contribution

    This thesis presents a novel solution for the combined collaborative search and intercept

    problem for a moving target. Many previous works have dealt with the individual prob-

    lems of search, intercept and a moving target, yet few have considered all three in the

    same problem. The search problem is solved using a diffusion based uncertainty map

management system and a combined potential field/receding horizon method (referred to as the hybrid method) to direct individual search agents. Presented work on the intercept

    problem can be viewed as an extension of the behavioural approach of Haynes, Sen[8][9],

    and Yamaguchi[28] applied to the MPSE problem studied by Jang, Tomlin[12], Li and

Cruz[14]. The primary advantage of the behavioural approach over existing techniques is its ability to reduce the search-space of possible control strategies for optimal evasion or pursuit, which in turn eases the burden on the optimization routines applied. The disadvantage

    is that the limitation to the heuristic behaviours may be overly conservative and the true

    optimum may not be included in the corresponding admissible search-space. Thus, the

    intercept portion of this thesis concerns itself not with finding the true optimum, but

    rather approximations.

Chapter 3

Background

    This section details some of the underlying methods used in the construction of the

cooperative UAV search and intercept algorithm. The optimization methods of the Nelder-Mead simplex and particle swarm optimization are described. A short introduction to

    game theory and its application to the relevant one-on-one single pursuer/single evader

problem is provided. Only brief descriptions are provided here, and the time-constrained reader already familiar with such concepts can skip this chapter without consequence.

    For more detailed and rigorous descriptions of the methods, citations are provided.

    3.1 Optimization Methods

Two of the primary gradient-free optimization methods used are the Nelder-Mead simplex algorithm and the particle swarm optimization method. Both are described in the

    following sections.

3.1.1 Nelder-Mead Simplex Algorithm

The Nelder-Mead simplex algorithm is one of the two gradient-free optimization methods used in this study. The method starts with a simplex of N + 1 vertices in an N-



dimensional search space. For example, a search space defined by two search parameters would have a triangular simplex, and a three-dimensional search space would have a corresponding tetrahedral simplex. The vertices of the simplex represent evaluation points for the objective function and are the only discrete points where the objective function is measured. The vertices can therefore be ordered from worst to best in terms of their objective function values.

Figure 3.1: Simplex in a 2-D search space, with axes Parameter 1 and Parameter 2; the current and new simplices and function evaluation points are shown, with vertices labelled worst, lousy, and best.

The general principle behind the simplex algorithm is to update the simplex by continuously discarding the worst performing vertex and replacing it with another, better

    point. The new point is selected based on the evaluation of trial vertices which are cho-

    sen based on an extrapolation of the objective function. Once a trial point is selected

    and evaluated, depending on the performance of the trial point compared to the known

    values of the existing vertices, one of several simplex morphing steps can be executed.

The result is a simplex that gradually converges towards a local optimum. In a two-dimensional search space, it can be pictured as a triangle flip-flopping within the search space, changing its shape one vertex at a time, until it reaches a local extremum.


A number of different variations exist; however, for this study only the following simplex morphing steps are used (a two-dimensional search space example with a three-vertex simplex is shown):

Figure 3.2: Nelder-Mead simplex operations: (a) reflection; (b) expansion; (c) contraction (outside and inside shown); (d) shrinking.

    The following is a pseudo code for the algorithm used in this study.

    1. Initialize simplex

    2. Loop until converged:

    (a) Identify the worst (highest: xw), second worst (second highest: xl) and best

    (lowest: xb) points with function values fw, fl, and fb, respectively.


    (b) Test for convergence

    (c) Evaluate xa, the average of the points in the simplex excluding xw.

(d) Perform reflection to obtain xr and evaluate to obtain fr.

    (e) if fr < fb then

    i. Perform expansion to obtain xe, evaluate to obtain fe.

    ii. If fe < fb then replace xw by xe, fw by fe (accept expansion).

    iii. Else replace xw by xr, fw by fr (accept reflection).

    (f) Else if fr ≤ fl then replace xw by xr, fw by fr (accept reflected point).

    (g) Else

    i. If fr > fw then perform an inside contraction and evaluate fc.

    ii. Else perform an outside contraction and evaluate fc.

    iii. If fc > fw then shrink simplex, evaluate at the n new points.

    iv. Else replace xw by xc, fw by fc (accept contraction)

    (h) End Loop.

For a more detailed description of the Nelder-Mead simplex algorithm, the interested reader is directed to Nelder and Mead's original paper[18].
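As an illustration, the loop above can be sketched in code. This is a minimal, self-contained sketch of the reflection/expansion/contraction/shrink logic described in the pseudocode, not the implementation used in this thesis; the initial step size, tolerance, and iteration limit are illustrative choices.

```python
import numpy as np

def nelder_mead(f, x0, step=0.5, tol=1e-8, max_iter=500):
    """Minimal Nelder-Mead minimizer following the loop in Section 3.1.1."""
    n = len(x0)
    # Initial simplex: x0 plus one perturbed vertex per axis (N + 1 vertices)
    simplex = [np.asarray(x0, dtype=float)]
    for i in range(n):
        v = simplex[0].copy()
        v[i] += step
        simplex.append(v)
    fvals = [f(v) for v in simplex]

    for _ in range(max_iter):
        # Order vertices: best (lowest f) first, worst (highest f) last
        order = np.argsort(fvals)
        simplex = [simplex[i] for i in order]
        fvals = [fvals[i] for i in order]
        xb, fb = simplex[0], fvals[0]
        xw, fw = simplex[-1], fvals[-1]
        fl = fvals[-2]                       # second worst ("lousy") value
        if abs(fw - fb) < tol:               # convergence test
            break
        xa = np.mean(simplex[:-1], axis=0)   # centroid excluding the worst

        xr = xa + (xa - xw)                  # reflection
        fr = f(xr)
        if fr < fb:
            xe = xa + 2.0 * (xa - xw)        # expansion
            fe = f(xe)
            simplex[-1], fvals[-1] = (xe, fe) if fe < fb else (xr, fr)
        elif fr <= fl:
            simplex[-1], fvals[-1] = xr, fr  # accept reflected point
        else:
            if fr > fw:
                xc = xa - 0.5 * (xa - xw)    # inside contraction
            else:
                xc = xa + 0.5 * (xa - xw)    # outside contraction
            fc = f(xc)
            if fc > fw:                      # shrink all vertices toward the best
                simplex = [xb + 0.5 * (v - xb) for v in simplex]
                fvals = [f(v) for v in simplex]
            else:
                simplex[-1], fvals[-1] = xc, fc

    order = np.argsort(fvals)
    return simplex[order[0]], fvals[order[0]]
```

On a smooth function such as a quadratic bowl, the simplex collapses onto the minimizer within a few dozen iterations.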

    3.1.2 Particle Swarm Optimization

    Particle Swarm Optimization (PSO) is a stochastic and gradient-free optimization method

for finding global extrema. The method derives its motivation from social foraging creatures observed in nature, such as ant and bee colonies. In these species, one can often observe that while individual search agents look for resources independently, they can signal or influence other agents depending on each agent's resource discovery or lack thereof. An ant that comes across a stockpile of sugar is known to release pheromones to signal other search ants to help exploit the bounty. It is this balance between the individual


    and social forces that is distilled from nature and replicated in code to solve engineering

    optimization problems.

Particle swarm optimization begins with a population of agents, or particles, that

    move within the search space. Each agent has an associated position and velocity which

are updated at each optimization iteration. Every agent maintains an information set containing:

    1. The value and location of its current point in the search space.

    2. The value and location of the best point in the search space that the individual has

    discovered on its own.

    3. The value and location of the best point in the search space that any team member

    has discovered.

    Based on this information, each particle adjusts its velocity according to the following

    rule:

χ^i_{k+1} = χ^i_k + µ^i_{k+1} ∆τ

µ^i_{k+1} = w µ^i_k + c1 r1 (ρ^i_k − χ^i_k)/∆τ + c2 r2 (ρ^g_k − χ^i_k)/∆τ        (3.1)

    Here k is the optimization iteration index, χi is the ith particle’s position in the

    design space, µi is the ith particle’s update velocity and ∆τ is the update time step (set

    to unity). w, c1 and c2 are weighting parameters on particle momentum, cognitive and

    social factors respectively. r1 and r2 are random numbers between 0 and 1. Finally, ρi is

    the individual particle’s optimum position and ρg is the global optimum found out of all

    particles.

    Pseudo code for the version of particle swarm used in this study is provided below:

    1. Initialize positions of particles to a random distribution within the search space.

    2. Randomize the particle velocities and orientations.


3. Loop until converged:

(a) Evaluate the objective function at the current locations of all particles.

(b) Update each particle's personal best.

(c) Update the global best.

(d) Update the particle velocities according to Eq. (3.1).

(e) Update the particle positions.

(f) End Loop.

The reader interested in particle swarm optimization is directed to reference [20].
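As an illustration, Eq. (3.1) and the loop above can be sketched as follows. The swarm size and the values of w, c1 and c2 are illustrative choices for this sketch, not the parameters used in this study.

```python
import numpy as np

def pso(f, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm minimizer using the velocity rule of Eq. (3.1)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    dim = lo.size
    x = rng.uniform(lo, hi, (n_particles, dim))                # positions (chi)
    v = rng.uniform(-(hi - lo), hi - lo, (n_particles, dim))   # velocities (mu)
    pbest = x.copy()                                           # personal bests (rho_i)
    pbest_f = np.array([f(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                         # global best (rho_g)
    g_f = pbest_f.min()

    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Momentum + cognitive + social terms, with delta-tau set to unity
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved] = x[improved]
        pbest_f[improved] = fx[improved]
        if pbest_f.min() < g_f:
            g_f = pbest_f.min()
            g = pbest[pbest_f.argmin()].copy()
    return g, g_f
```

Because the update is stochastic, repeated runs with different seeds give slightly different results; the swarm nonetheless concentrates around the global best on simple test functions.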

    3.2 Game Theory

    Game theory is the study of all forms of competition where opponents, often with conflict-

    ing aims, execute interdependent strategies to achieve outcomes that maximize respective

    payoffs. One of the fundamental objectives of game theorists is to derive optimal player

    strategies, which are the strategies that if executed, will result in the highest payoff pos-

    sible when compared to the result of all other strategies. No general method for finding

    these optimal strategies exist for all classes of games; however, solutions for select game

    types do exist. In this particular study, focus is set on the game of pursuit and evasion.

    3.2.1 The Classic Pursuit and Evasion

The classic pursuit and evasion (PE) problem has two players: a single pursuing agent (P) and a single evading target (E). P typically has a speed that is greater than that of E;

    however, E has the advantage that it is more maneuverable. One macabre yet often cited

    visualization tool is the scenario of the homicidal chauffeur. In this example, P is a driver

    of a car with the malicious intent of running down the pedestrian E. Consistent with the


limitations of both P and E, the pedestrian is more maneuverable than the driver (e.g., parallel parking on two legs is much easier than doing the same on four wheels) and the

    driver can reach much greater speeds than the hapless pedestrian. The abstracted model

is typically a unicycle for P, where a minimum turning radius and fixed velocity apply, and a kinematic point for E, where E can instantaneously change its orientation but is still

    restricted to a fixed velocity.
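As a sketch of these abstracted models, the discrete-time updates below implement a fixed-speed unicycle with a minimum turning radius for P and a fixed-speed kinematic point for E. The function names, speeds, turning radius, and time step are all illustrative assumptions, not values from this thesis.

```python
import math

def step_pursuer(x, y, theta, u, v_p=1.0, R_min=0.5, dt=0.01):
    """Unicycle pursuer: fixed speed v_p, steering input u in [-1, 1],
    minimum turning radius R_min (maximum turn rate v_p / R_min)."""
    u = max(-1.0, min(1.0, u))            # saturate the steering input
    theta += (v_p / R_min) * u * dt       # heading changes at a bounded rate
    return (x + v_p * math.cos(theta) * dt,
            y + v_p * math.sin(theta) * dt,
            theta)

def step_evader(x, y, phi, v_e=0.5, dt=0.01):
    """Kinematic-point evader: heading phi can be chosen freely each step
    (instantaneous turns), but speed is fixed at v_e < v_p."""
    return (x + v_e * math.cos(phi) * dt,
            y + v_e * math.sin(phi) * dt)
```

The asymmetry is visible in the interfaces: the pursuer can only influence its heading rate through u, while the evader selects its heading phi directly at every step.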

    The game can be classified as a zero-sum game of degree where the payoff is that of

    a continuum. More specifically, the capture time is the payoff to E while the negative

    of the time to capture is the payoff for P. Since both players are assumed to be rational

    and would therefore always choose to maximize payoffs, E would choose its inputs to

    maximize time to capture while P would attempt to minimize capture time.

    3.2.2 Isaacs’ Pursuit and Evasion Problem and Solution

    For a detailed derivation of the optimal strategies for the classic PE game, refer to Isaacs’

    text on differential games[11]. Only the results are provided in this section.

    Isaacs demonstrated through what he termed the main equation and the integration

of his retrograde path equations, that the optimal controls for both P and E depend on the state of the game, that is, where E is at any instant relative to P. This therefore

defines a feedback control law which ensures that, if executed by both P and E, the time to capture will be the minimum achievable by P and the maximum achievable by E. Furthermore, if the game is plotted in the reference frame of P, with P

    located at the origin and the forward direction of P aligned with the y coordinate axis,

    then specific geometric regions can be plotted which indicate a specific control to be

    executed depending on what region the location of E falls within. A plot of the game

    in a P-centred reference frame is provided below along with an overlay of the different

    regions. In this plot, the position of the evader relative to the pursuer defines the state

    of the game.


Figure 3.3: Isaacs' SPSE regions in the pursuer-centred frame: region 1, region 2, region 3, and the capture region.

In the above figure there are four distinct regions: region 1, region 2, region 3, and the capture region. The capture region is bounded by the terminal surface which, if touched by the evader, ends the game as a successful capture. The three other regions have

    associated optimal controls for both P and E. The regions and the controls are discussed

    below. Plots of sample trajectories using the optimal strategies in the P-centred reference

    frame and the inertial frame follow.

1. Region 1- Primary Region: This region is characterized by an E that is sufficiently close to P and just slightly off course, such that a quick swerve by P will result in a capture. The optimal controls in this case are:

    • Evader: move directly away from pursuer at all times.

    • Pursuer: make sharp turn into evader.

    2. Region 2- Universal Region: Chase situations that fall into this region are

    characterized by either an E that is directly ahead of P or an E that is sufficiently


Figure 3.4: Region 1, the primary path, shown in (a) the pursuer reference frame and (b) the inertial reference frame.

    far enough away from P that P has enough time to make a complete turn followed

    by a straight run for E. The optimal controls in this case are:

    • Evader: Move straight, tangential to P’s initial curvature circle.

    • Pursuer: Turn sharply until pointed at E then head straight.

Figure 3.5: Region 2, the universal path, shown in (a) the pursuer reference frame and (b) the inertial reference frame.

3. Region 3: This region is associated with chase scenarios where E is close to P, but P is oriented in such a way that an immediate swerve towards E would result in a miss. In this scenario, P must therefore first turn away from E to gain

    some space before making a quick turn around for the final kill pass. The optimal


    controls in this case are:

    • Evader: move towards P, tangential to P’s initial curvature circle, until game

    state is in Universal region.

    • Pursuer: Turn sharply away from E, until game state is in Universal region.

Figure 3.6: Region 3, shown in (a) the pursuer reference frame and (b) the inertial reference frame.

    3.3 Summary

    In this chapter, brief descriptions have been provided of select algorithms and background

    information essential to the methods developed in this thesis. These algorithms will be

called upon in subsequent chapters. Two optimization methods have been discussed: the Nelder-Mead simplex algorithm (used in the cooperative UAV search algorithm) and particle swarm optimization (used in the cooperative intercept algorithm). Isaacs' analytical solution to the single pursuer / single evader game has also

    been provided which is called upon in the discussion on cooperative intercept.

Chapter 4

Cooperative UAV Search

    As mentioned in the introduction, there are two distinct phases to the central UAV

    cooperative task. The first phase is characterized by the goal of identifying the target

    location within the game space. Only then can the secondary interception phase be

    initiated after the position of the target is positively ascertained. In this section, the

    former phase is discussed beginning with how uncertainty in the target’s position is

    modelled. A diffusion model for uncertainty is used to account for the target’s ability to

    move. The task of searching is realized through the reduction of uncertainty regions. This

is done by UAV sensor sweeps through the area. Three different UAV control schemes for a single UAV (potential, receding horizon, and a hybrid between the two) are presented and compared. This is followed by a discussion of the cooperation algorithm used and how the hybrid method is extended to multi-UAV teams. Finally, search simulations using the hybrid cooperative method are presented at the end of the chapter.

    4.1 Problem Formulation

    The search problem consists of a team of n UAVs assigned the common goal of finding

    the single target in minimum time. Only a single target exists, the position of which is

    not known to the search UAVs with absolute certainty. Available to the search UAVs,



however, is knowledge of a general region or regions in the game space where the target cannot possibly be. This knowledge is assumed to be known prior to the start of the game based on information provided by the mission planner. The target can, however, be hidden within all other regions, which need to be scouted by the UAV team if the target is to be found. The UAV sensors are assumed to have limited range and, as such, some UAV maneuvering is likely required to find the target.

    4.1.1 Grid World Representation

The game space is a 2-D plane that the agents move within. A grid of N×M square elements is overlaid upon the game space, discretizing it in both the horizontal and vertical directions. Each agent can occupy only one grid element at any time. Note that the position of each agent can still take on continuous values; the grid map is simply a discretized approximation to the continuous game. Both the continuous positions and the discretized surrogate representations are updated during the course of the search (see Figure 4.1).
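The continuous-to-grid mapping can be sketched as a simple floor operation. Unit-sized cells match the example coordinates of Figure 4.1; the cell_size parameter is an added generalization for this sketch.

```python
import math

def to_cell(x, y, cell_size=1.0):
    """Map a continuous position (x, y) to its discrete (j, k) grid cell index."""
    return (int(math.floor(x / cell_size)),
            int(math.floor(y / cell_size)))
```

With unit cells, a target at (3.77, 0.25) occupies cell (3, 0), as in the figure.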

Figure 4.1: Grid approximation: (a) continuous agent position representation (e.g., target at (3.77, 0.25)); (b) grid agent position representation (e.g., target cell (3, 0)).


    4.1.2 Target Uncertainty

At the outset of the search task, the target's exact location is not known with absolute certainty to the UAV team; however, some information about the general whereabouts of the target is available for the UAV team to base its trajectory planning upon. For example, knowledge of regions where the target cannot be, obtained through mission planner intuition, is admissible and of utility to the UAV team. These regions are referred to as certain regions, since it is known with certainty that the target's position cannot coincide with any point within them. Since the target does not exist within certain regions, it follows that it can be found somewhere within the regions that are not certain. These areas are designated uncertain regions. Any region that is not certain is deemed uncertain and therefore includes all points where the target may be at a given instant in time.

    Uncertainty Representation

    The spatial uncertainty environment is represented by the grid-based representation of

the 2-D game. Each of the N×M square elements is assigned the discrete binary state of either certain (value = 0) or uncertain (value = 1). In other words,

U^t = {u^t_{j,k}},    u^t_{j,k} = 0 if grid cell (j, k) is certain, 1 otherwise

where U^t is the uncertainty map of size N × M representing the UAV team's knowledge of the target's position at the t-th time step, j is the index for the grid elements in the x direction, and k is the index in the y direction.

    Other Possible Uncertainty Representations

    Although in this study, uncertainty is constrained to one of two binary states, a continuum

    between the two extremes can be implemented to account for partial certainty of grid


Figure 4.2: Uncertainty represented by a gridded map. Uncertain regions (U = 1) are blocks the target could possibly be in; certain regions (U = 0) are blocks the target could not possibly be in.

spaces. Such intermediate values would be useful for modelling imperfect sensors, where some regions of the UAV sensor footprint are less reliable than others. Less-than-perfect reliability of a particular sensor region would then correspond to reduced uncertainty, but not complete certainty; this partial certainty would be represented by an uncertainty value that lies somewhere between 0 and 1. For this study, perfect sensors are assumed, hence partial certainty is not considered.

    Diffusion Model for a Moving Target

    For static target problems, one can make the assumption that changes to the uncertainty

    map will only result from the UAVs scanning the uncertain regions. The static case

    is therefore a problem of steady or decreasing uncertainty where elements that were

    initially certain remain certain, while those that are uncertain can switch to certain only

    as a result of the passing of a searching UAV. However, in the case of a moving target,

    the steady or decreasing uncertainty assumption is no longer valid. Targets can move

    and therefore can transition into neighbouring grid cells. Uncertainty therefore has the


    ability to grow in the moving target case and must be taken into account by a mechanism

    which can evolve the uncertainty map with time.

    To take the target’s ability to move into account, a model based upon two dimensional

    diffusion is adopted to evolve the uncertainty boundaries over time. A variation based

on the work done in [23] is adopted here. The motivation is that diffusion propagates the uncertainty in all directions equally. This is desirable since no information is given on the behaviour of the target, and therefore all target moves must be considered equally probable. The worst-case scenario must thus be assumed, and this should be reflected in the evolution of the uncertainty boundary. The basic 2-D diffusion equation is as follows:

∂u/∂t = c (∂²u/∂x² + ∂²u/∂y²)        (4.1)

    where u is the uncertainty in the grid cell, t is the time, and c is the diffusion conductivity

    constant. As it stands, the above equation is not useful when applied to a discretized

    plane such as the already adopted grid world representation of the game space. Instead,

    the 2-D finite element diffusion model is used.

u^{t+1}_{j,k} = u^t_{j,k} + c∆t [ (u^t_{j+1,k} − 2u^t_{j,k} + u^t_{j−1,k})/∆x² + (u^t_{j,k+1} − 2u^t_{j,k} + u^t_{j,k−1})/∆y² ]        (4.2)

In this equation, the time derivative has been approximated with a forward finite difference and the spatial partial derivatives with central finite differences. The conductivity constant, c, and the uncertainty, u, retain their meanings. The superscript t represents the current time step, while j and k are the indices for the grid elements in the x and y directions. ∆x and ∆y are the step sizes in the x and y directions.
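A single explicit update per Eq. (4.2) can be sketched as follows. For simplicity, this illustrative version updates interior cells only and leaves the boundary cells fixed; the default values of c, ∆t, ∆x, and ∆y are assumptions for the sketch.

```python
import numpy as np

def diffuse_step(U, c=0.2, dt=1.0, dx=1.0, dy=1.0):
    """One explicit step of Eq. (4.2) on interior cells (boundary left fixed)."""
    V = U.copy()
    # Central second differences in x (rows) and y (columns)
    lap = ((U[2:, 1:-1] - 2 * U[1:-1, 1:-1] + U[:-2, 1:-1]) / dx**2 +
           (U[1:-1, 2:] - 2 * U[1:-1, 1:-1] + U[1:-1, :-2]) / dy**2)
    V[1:-1, 1:-1] = U[1:-1, 1:-1] + c * dt * lap
    return V
```

Because the unmodified scheme is conservative, the total uncertainty over a closed map is preserved as a single concentrated cell spreads to its neighbours.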

This is one of the simplest forms of diffusion for 2-D finite element models. A simulation of the evolution of the uncertainty using the above model is shown in Figure 4.4. The diagrams demonstrate the movement of uncertainty from grid cells of high uncertainty concentration to regions of low uncertainty concentration over time.

    Yet this is still not quite the desired behavior of the uncertainty evolution. Firstly, as


Figure 4.3: Moving-target implication of time-varying uncertainty. (a) Given that the target position is known, it can either transition to a neighbouring cell or remain stationary; if a transition occurs, the new cell's uncertainty level must increase to 1. (b) The behaviour of the target is not known, so transitions to all neighbouring cells must be considered. (c) To take moving targets into account, the boundary of uncertainty must grow at a rate at least as fast as the target's maximum velocity.


Figure 4.4: Finite element diffusion process: (a) initial distribution; (b) after 100 steps; (c) after 150 steps.

was mentioned in the previous section, every cell can be in only one of two states: either the cell is certain or it is uncertain, with no intermediate values. This binary system

    has not yet been taken into account in the 2-D diffusion equation. Secondly, at the basis

    of the diffusion equation is the law of conservation. In standard diffusion applications,

    the total amount of the parameter in question, (whether it be energy or mass), remains

    fixed provided the system is closed and has no sinks or sources. A consequence of this

    is that cells of high uncertainty will experience an unwarranted reduction with time as

    uncertainty flows out of the cell to neighboring cells of lesser uncertainty. This is not the

    desired behavior since cells that are uncertain should remain uncertain until scanned by

    a UAV. The maneuvering target within a cell can potentially choose to remain stationary

    and this possibility must be reflected in how the uncertainty evolves as well. The third

    and final consequence of using an unmodified diffusion equation is that the rate at which

    the uncertainty expands has a tendency to slow down in the latter stages of uncertainty

    evolution as the spatial gradients of uncertainty become small. Again, this is not the

    desired behavior, as one would generally require the uncertainty boundary to grow at a

    constant rate.

    To address these concerns, some modifications to the diffusion equation are in order.

    The first concern is addressed by maintaining two separate maps, U and U’. The first

    map, U, is described as above and allows for cells that can take on intermediate values


between 1 (uncertain) and 0 (certain). It is on this map that the uncertainty is evolved using the diffusion algorithm. Changes to grid uncertainty values due to search UAV movement are also taken into account on this map. The second map, U', is a filtered version of its cousin, where each cell of map U is compared to a defined threshold value. If a given cell on map U has a value at or above this threshold, the corresponding cell on map U' is assigned the value of 1; that is, if a cell is sufficiently uncertain, the algorithm considers that cell to be completely uncertain, and this is reflected in map U'. If, on the other hand, the value on map U is below the threshold value, the corresponding cell on map U' is assigned the value of 0. Both maps are maintained and updated

    for the duration of the simulation. Map U manages the evolution of the uncertainty with

    time, while map U’ is simply a binary filtered version of U which is used by the UAV

    search team to develop their search trajectories.

Figure 4.5: Diffusion modification 1: maintaining two maps. (a) Map U: intermediate values allowed. (b) Map U': binary filtered version of U with a threshold value of 0.1.

The second concern, that uncertain cells should remain uncertain unless scanned by passing search UAVs, is addressed with the following fix: cells within map U do not experience a reduction in uncertainty due to diffusion. The flux of uncertainty is strictly limited to net inflows; outflows for all cells are ignored. The updated equation for finite element uncertainty diffusion is therefore


u^{t+1}_{j,k} = u^t_{j,k} + c∆t max[ 0, (u^t_{j+1,k} − 2u^t_{j,k} + u^t_{j−1,k})/∆x² + (u^t_{j,k+1} − 2u^t_{j,k} + u^t_{j,k−1})/∆y² ]        (4.3)

    The third concern of having non-constant uncertainty growth rates can be addressed

    by adding uncertainty sources to the map and then applying a saturation filter to all cells

    to limit uncertainty values to a maximum of 1. In this implementation, each cell above

    the threshold value is increased by a fixed percentage of its original uncertainty value.

    In effect, every cell with a value above the threshold acts as an uncertainty source. This

    ensures that the gradients near the border of the uncertainty remain sufficiently high

    which translates into a constant uncertainty boundary growth rate that does not exhibit

    the undesirable slowing effect.

Figure 4.6: Diffusion modification 3: uncertainty source for regular boundary growth. (a) Initial distribution. (b) Every cell above the threshold (value of 0.1) is boosted by a factor of 1.2 (exaggerated for visual purposes). (c) Filtered such that values fall between 0 and 1.

    The handling of uncertainty is therefore summed up in the following steps:

    For time step t:

    1. Set current U to map U from time step t-1

    2. For each cell in U perform the diffusion update

    3. For each cell in U greater than threshold value, boost uncertainty value by a certain

    percentage


    4. Limit each cell in U to a maximum of 1 and a minimum of 0

    5. Update U’ with new U by performing cell wise threshold saturation

    6. UAVs develop search trajectories with map U’
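The steps above can be sketched as one update function. This is an illustrative NumPy sketch; the threshold, boost factor, and diffusion parameters are assumed values, not those used in the thesis.

```python
import numpy as np

def update_uncertainty(u_prev, c=1.0, dt=0.1, dx=1.0, dy=1.0,
                       threshold=0.1, boost=1.2):
    """One full uncertainty-update cycle: diffuse, boost sources, clip,
    then saturate to the binary map U' used by the UAVs (steps 1-6)."""
    u = u_prev.copy()                        # step 1: start from previous map
    lap = np.zeros_like(u)                   # step 2: inflow-only diffusion (Eq. 4.3)
    lap[1:-1, 1:-1] = (
        (u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dx**2
        + (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dy**2
    )
    u += c * dt * np.maximum(0.0, lap)
    u[u > threshold] *= boost                # step 3: super-threshold cells act as sources
    u = np.clip(u, 0.0, 1.0)                 # step 4: keep values in [0, 1]
    u_search = (u > threshold).astype(float) # step 5: cell-wise threshold saturation -> U'
    return u, u_search                       # step 6: UAVs plan with u_search
```

Iterating this function produces the radially growing, near-binary uncertainty distribution shown in the contour plots that follow.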

The figures below depict the new uncertainty diffusion algorithm at different time steps. With the modified diffusion, the uncertainty distribution now grows radially at a rate that approximates the maximum velocity of the target. Intermediate values are no longer present; cells take on only the values 1 (uncertain) and 0 (certain).

    4.2 Target Behaviour

For the search phase, no particular target motion is assumed. All that is known to the UAV search team is an upper bound on the velocity of the target. This information is incorporated into the UAV uncertainty update algorithm; more specifically, it guides the selection of the conductivity constant for the diffusion method. High values of conductivity correspond to faster-growing uncertainty, consistent with a faster-moving target. Conversely, low values of conductivity correspond to slower uncertainty growth, consistent with a slow-moving target.

For testing purposes, target motion is constrained to straight line segments. The target is first given a random initial orientation. It then proceeds in a straight line until it reaches the boundary of the game space, at which point it reflects off the boundary with an angle of reflection equal to the angle of incidence. Another candidate target behaviour for the search phase is random motion, where the target selects a new orientation vector at every time instant.
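The straight-line target model with specular boundary reflection can be sketched as follows; the function name and interface are illustrative assumptions.

```python
import math

def step_target(x, y, heading, speed, dt, xmax, ymax):
    """Advance a straight-line target one time step, reflecting specularly
    off the walls of the rectangular game space [0, xmax] x [0, ymax]."""
    x += speed * dt * math.cos(heading)
    y += speed * dt * math.sin(heading)
    if x < 0 or x > xmax:                  # reflect off a vertical wall
        heading = math.pi - heading
        x = min(max(x, 0.0), xmax)
    if y < 0 or y > ymax:                  # reflect off a horizontal wall
        heading = -heading
        y = min(max(y, 0.0), ymax)
    return x, y, heading % (2 * math.pi)
```

A random-motion target would instead draw a fresh heading each step before the position update.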


Figure 4.7: Contour plots of the modified diffusion model for uncertainty management at different times: (a) initial uncertainty distribution; (b) after 100 s; (c) after 200 s; (d) after 300 s; (e) after 400 s; (f) after 500 s. The green circle represents the uncertainty boundary.


    4.3 Search Behaviour

At this point, only the management of uncertainty for the UAV search has been defined. It remains to explain how to act upon the uncertainty to find the target in some optimal or approximately optimal fashion. In this section, three candidate search algorithms are explored: the potential method, the receding horizon method, and a hybrid method that combines the two.

    4.3.1 Potential Method

The potential method is widely used in control applications, including autonomous UAV guidance laws. The general idea is that certain points of interest in the game, whether they are enemy locations, uncertainty cells, obstacles, or any other body of interest, serve as virtual masses that induce a virtual potential field within the game space. The calculated gradients of this potential field are typically used by the UAVs as a basis for their control laws. Depending on the UAV task, different variations of the potential method can be used. Although the underlying principles of the method remain constant, how the potential field is generated and how the agents act upon it varies considerably between designers and missions. The subsequent paragraphs describe the particulars of the potential method used by the agents in the team UAV search.

    Potential field generation

Each uncertain cell generates a potential field whose magnitude decays with distance, much like a natural gravitational potential. Cells that are certain do not contribute to the potential field. The potential due to an uncertain cell at a distance r from the cell is:

p_{j,k} = -\frac{u_{j,k}}{r} \qquad (4.4)


    The potential field at any position is a sum of all contributions from all uncertain

    cells.

P = \sum_{j=1}^{M} \sum_{k=1}^{N} p_{j,k} \qquad (4.5)

Figure 4.8: Example of a virtual potential field generated from an uncertainty distribution. (a) Uncertainty distribution with uncertainty cells located at the centre of the space; (b) resulting virtual potential field.

    Potential field control law

Each UAV calculates the gradient of the potential field at its current position. The gradient is normalized to unity and its direction is compared to the UAV's current orientation vector. The control law simply minimizes the angle between the two vectors: if the gradient is to the left of the UAV's orientation vector, the UAV turns left; if the gradient is to the right, the UAV turns right.

The following is pseudo-code for each UAV performing potential-based search.

    1. Calculate potential field contribution for every uncertain cell

    2. Sum up all contributions at UAV position

    3. Calculate the gradient of the potential field


Figure 4.9: UAV located in a potential field. In this particular case, the UAV control law would dictate a left turn to align the orientation vector with the gradient.


    4. Normalize gradient to unity

    5. Calculate difference between gradient angle and current orientation angle

    6. Set control to difference.

7. Update the time step and go to step 1
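The steps above can be sketched as follows. This is an illustrative sketch, not the thesis implementation: the finite-difference step h, the turn-rate bound, and the convention of steering down the field toward the negative potential wells are assumptions.

```python
import math

def potential(pos, cells, eps=1e-6):
    """Total virtual potential at `pos` (Eqs. 4.4-4.5): each uncertain
    cell at (cx, cy) with value u contributes -u / r."""
    return sum(-u / max(math.hypot(pos[0] - cx, pos[1] - cy), eps)
               for (cx, cy), u in cells.items())

def potential_turn(pos, heading, cells, h=0.1, max_turn=0.2):
    """Steps 1-6: numerical gradient of the summed field, then a bounded
    turn toward the attracting direction (downhill, since uncertain
    cells form negative potential wells)."""
    gx = (potential((pos[0] + h, pos[1]), cells)
          - potential((pos[0] - h, pos[1]), cells)) / (2 * h)
    gy = (potential((pos[0], pos[1] + h), cells)
          - potential((pos[0], pos[1] - h), cells)) / (2 * h)
    desired = math.atan2(-gy, -gx)                 # steps 3-4: descent direction
    err = (desired - heading + math.pi) % (2 * math.pi) - math.pi  # step 5: wrapped angle
    return max(-max_turn, min(max_turn, err))      # step 6: bounded turn command
```

A UAV left of a single uncertain cell and headed slightly off-axis receives a small corrective turn; a UAV headed perpendicular to the cell receives the maximum turn command.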

    Potential Method Deficiencies

One major deficiency of the potential method is its poor performance in symmetric uncertainty distributions. The potential method directs the UAV to set a course towards the centroid of the total uncertainty distribution even when the centroid is devoid of uncertain cells. One case that illustrates this downfall well is a doughnut-shaped uncertainty distribution with the UAV located at the centre of the distribution. In this case, the control law continues to direct the UAV to re-search the centre of the distribution, where no uncertain cells are present. The UAV keeps searching the inner boundary of the distribution while neglecting the growing outer boundary, which contributes more to the increasing uncertainty.

4.3.2 Receding Horizon Method

Receding horizon methods are typically applied to optimization problems that have an unspecified end time, or to continuous problems with no end time defined. In these cases, optimization over the entire time duration is often too computationally expensive or, in the continuous case, impossible. The general principle of the receding horizon method is to select a truncated look-ahead time interval over which the optimization can realistically be conducted. The performance parameter is optimized over this shortened time interval (or horizon) with respect to the controls available to the optimizers. A set of sequential controls is thereby obtained that optimizes the performance parameter


at the end of the look-ahead horizon. The first of these controls, which corresponds to the current time step, is executed immediately. At the next time step, the entire optimization is repeated with the same look-ahead horizon length, shifted one step further into the future. This defines the standard receding horizon method: continuously optimize over a finite look-ahead horizon and allow that horizon to recede into the future as time progresses.

For a single UAV in the search problem, the performance measure is the total number of uncertain cells within the game space. The receding horizon is set to a predefined number of time steps into the future, each step with a corresponding turning control. These turning controls make up the search space for the optimizer, which re-optimizes the horizon controls at each time step.

    Optimizer

The optimizer used is the Nelder-Mead simplex algorithm, motivated by its non-reliance on gradient calculations and its computational efficiency when compared to alternative non-gradient-based optimization methods, including particle swarm and genetic algorithms.

The following is pseudo-code for each UAV performing receding-horizon-based search.

1. Define n = number of look-ahead time steps

2. Use the optimizer to find the next n best control inputs

3. Execute the first of the n control inputs

4. Update the time step and go to step 2
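The receding horizon loop can be sketched as follows. The thesis optimizes continuous turning controls with the Nelder-Mead simplex; to keep this sketch dependency-free, it instead enumerates a discretized control set (left, straight, right), and all names and parameters are assumptions.

```python
import itertools
import math

def simulate(pos, heading, controls, speed=1.0, dt=1.0, turn_rate=0.3):
    """Roll a candidate control sequence forward; return the grid cells visited."""
    x, y = pos
    visited = set()
    for u in controls:                       # u in {-1, 0, +1}: left / straight / right
        heading += u * turn_rate * dt
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
        visited.add((round(x), round(y)))
    return visited

def receding_horizon_control(pos, heading, uncertain, n=4):
    """Steps 1-3: search all 3^n control sequences for the one whose
    trajectory scans the most uncertain cells; only the first control
    of the winning sequence is executed."""
    best_seq, best_score = None, -1
    for seq in itertools.product((-1, 0, 1), repeat=n):
        score = len(simulate(pos, heading, seq) & uncertain)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq[0], best_score
```

A zero best score signals that every candidate trajectory is equally poor, the failure case exploited by the hybrid method below.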


Figure 4.10: Example of the receding horizon method. Three time steps are shown with a horizon length of 5 time steps. At each time step, the UAV re-optimizes to find the best combination of controls (turn left, straight, or turn right) for the next 5 steps, and the control corresponding to the current time step is executed.


    Receding Horizon Method Deficiencies

A common problem with receding horizon methods is choosing the length of the look-ahead horizon. If the horizon is too long, the algorithm becomes excessively expensive computationally. If, on the other hand, the horizon is too short, UAV search times increase, compromising the purpose of any future planning. One rule could be to make the horizon as long as possible within the limits of the computational power available to the user; however, even then, the horizon length may not be sufficient to make a proper decision. Consider the case where a receding-horizon-controlled UAV is located sufficiently far from any uncertainty cell that, regardless of its look-ahead strategy, it cannot scan any uncertain cells. The optimizer then finds that no optimum look-ahead strategy exists, since all candidate trajectories are equally poor. In these scenarios, UAVs equipped with receding horizon control are rendered useless.

    4.3.3 Hybrid Method

The hybrid method is a combination of the potential and receding horizon methods, switching between the two search modes depending on the UAV's position relative to uncertain grid cells in the game space. The default mode for the hybrid method is the receding horizon method as described above. If, however, the UAV finds itself in a state where it is unable to find a set of controls that is better than any other, it switches to the potential method. This switch from receding horizon to potential corresponds to the case where a UAV is sufficiently far from any uncertainty cell that, regardless of the controls it selects over the receding horizon look-ahead steps, it cannot reach any uncertainty cell within the look-ahead time interval. In this scenario, all look-ahead trajectories are equally poor and there is no reason to select one set of controls over another, and hence,


the potential method would be executed until uncertainty cells come within reach of the UAV's look-ahead.

The following is pseudo-code for each UAV performing hybrid search.

    1. Define n=number of look ahead time steps

    2. Use optimizer to find the next n best control inputs

    3. If optimizer has found a good trajectory

    (a) Execute the first of the n control inputs

    4. If optimizer has not found a good trajectory

    (a) Calculate potential field contribution for every uncertain cell

    (b) Sum up all contributions at UAV position

    (c) Calculate the gradient of the potential field

    (d) Normalize gradient to unity

    (e) Calculate difference between gradient angle and current orientation angle

    (f) Set control to difference.

    5. Update time step and go to step 2
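The switching logic above can be sketched as a single decision step. The two planners are passed in as callables; the interface, in which the receding horizon planner returns a control together with the number of uncertain cells its best trajectory scans, is an assumption for illustration.

```python
def hybrid_control(rh_control, potential_control, state, uncertain):
    """One hybrid decision step: try the receding horizon planner first;
    if it cannot find a trajectory that scans any uncertainty (all
    candidates equally poor), fall back to potential-field guidance."""
    control, score = rh_control(state, uncertain)            # step 2
    if score > 0:                                            # step 3: useful trajectory found
        return control, "receding_horizon"
    return potential_control(state, uncertain), "potential"  # step 4: fallback
```

The mode label is returned only to make the switch observable; in a simulation loop the control alone would be executed before advancing the time step.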

The following plot compares the effectiveness of the hybrid method against the potential method alone, especially at high initial separation distances. Search simulations were run in a 25 × 25 grid game space, although the search UAV was permitted to start outside this region. Initial conditions such as UAV and target position and orientation were randomized, and a maximum search time of 1500 time steps was specified. A single UAV was used to locate the single target. Search times for the simulations are plotted against the initial separation of the UAV and the target.


Figure 4.11: Hybrid search method. (a) Initial sample game state with two uncertainty regions and a single UAV, showing the states reachable within the look-ahead interval; (b) a candidate receding horizon trajectory is found and executed; (c) after the first uncertainty region is scanned, the UAV attempts to use the receding horizon but cannot find any optimal or approximately optimal trajectory, since the remaining uncertainty is not contained within the receding horizon boundaries; (d) since the receding horizon failed to yield a trajectory, the potential method is used for guidance.


Figure 4.12: Comparison of the hybrid method and the potential method in simulated searches of varying initial separation between target and search UAVs. 2 UAVs used with a maximum time-out of 1500 time steps.


Figure 4.12 demonstrates the advantage of the hybrid method over exclusive potential method control. At low initial separations there is no appreciable difference between the two; however, as the initial separation between the target and the search UAV increases, there are significantly more time-outs for the potential method; that is, search UAVs using the potential method fail to find the target before the maximum allowed time. The reason, as mentioned in Section 4.3.1, is that a search UAV under potential control tends to head towards the centroid of the uncertainty regardless of whether any uncertainty exists at the centroid. The resulting excess of centroid searching makes the search inefficient. The hybrid method does not suffer from this problem.

    4.3.4 Multi-UAV Coordination Algorithm

    Up until now, no coordination between UAVs has been discussed. All three methods

    (potential, receding horizon, and hybrid) were defined in the limited context of a single

    UAV. In this section coordination between UAVs is introduced with the intention of

    enhancing the performance of the team search for the target.

Coordination is manifested through the way in which the individual UAVs within the team perceive uncertainty cells. Up to this point, all uncertainty cells were considered of equal value to each UAV. For coordination purposes, weighted uncertainty is now proposed. The underlying principle is to make cells closer to a UAV more attractive to scan while making cells closer to the UAV's teammates less attractive. As demonstrated in simulation, this reduces redundant search trajectories in which UAVs search the same area, which ultimately reduces both the time to cover the entire area and the search time to find the moving target.

The weighted uncertainty values are thus altered versions of the standard uncertainty cells: cells closer to the individual UAV are of greater value to scan, while those closer to other UAVs are of less value.


    Both of these objectives can be accomplished by attenuating the uncertainty values as

    follows:

    The standard uncertainty representation for the UAVs is

U^{t}_{N \times M} = \left[ u^{t}_{j,k} \right], \qquad u^{t}_{j,k} = \begin{cases} 0 & \text{if grid cell } (j,k) \text{ is certain} \\ 1 & \text{otherwise} \end{cases} \qquad (4.6)

    The weighted uncertainty representation for the ith UAV is

{}_{i}\bar{U}^{t}_{N \times M} = \left[ {}_{i}\bar{u}^{t}_{j,k} \right], \qquad {}_{i}\bar{u}^{t}_{j,k} = \begin{cases} 0 & \text{if grid cell } (j,k) \text{ is certain} \\ \prod_{l=1}^{n} \left( 1 - \exp\left[ -\dfrac{r_{l}^{2}}{2\sigma^{2}} \right] \right) & \text{otherwise} \end{cases} \qquad (4.7)

where {}_{i}\bar{U}^{t}_{N \times M} is the weighted uncertainty map valid for UAV i at time step t, of size N × M. The quantity {}_{i}\bar{u}^{t}_{j,k} is the weighted uncertainty value of the individual grid element (j, k); n is the total number of UAVs; l is the UAV index; r_l is the distance from the lth UAV to grid element (j, k); and σ is the attenuation factor used to adjust the degree to which the UAVs avoid uncertainty closer to their teammates.

As the equation shows, cells that are closer to other UAVs are attenuated in value. Cells further from teammates, however, retain their initial uncertainty value and are therefore of greater value to be scanned by the UAV in question.
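The weighting of Equation 4.7 can be sketched as follows. One assumption is made explicit here: the product is taken over teammates only (l ≠ i), consistent with the stated intent that cells near the UAV's own position remain attractive; σ, the grid-coordinate positions, and the function name are illustrative.

```python
import numpy as np

def weighted_uncertainty(u, uav_positions, i, sigma=3.0):
    """Weighted uncertainty map for UAV i (Eq. 4.7): uncertain cells near
    teammates are attenuated by prod(1 - exp(-r_l^2 / (2 sigma^2)))."""
    N, M = u.shape
    jj, kk = np.meshgrid(np.arange(N), np.arange(M), indexing="ij")
    w = np.ones_like(u, dtype=float)
    for l, (pj, pk) in enumerate(uav_positions):   # (row, col) grid coordinates
        if l == i:
            continue  # assumption: a UAV does not attenuate cells near itself
        r2 = (jj - pj) ** 2 + (kk - pk) ** 2       # squared distance to UAV l
        w *= 1.0 - np.exp(-r2 / (2.0 * sigma**2))
    return np.where(u > 0, u * w, 0.0)             # certain cells stay zero
```

A cell directly under a teammate is attenuated all the way to zero, while cells many σ away from every teammate keep essentially their full value.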

Sixty random search scenarios were tested in simulation using both cooperative and non-cooperative UAVs. UAV initial conditions, target initial conditions, and the initial uncertainty region were all randomized. The only restrictions imposed were that (1) the game space was limited to a 25 × 25 grid, (2) all agents were contained within the grid, and (3) the target was initially within an uncertain region. The following table summarizes the results:


Figure 4.13: Under the coordination algorithm, uncertainty closer to a teammate is reduced in value to be searched. In the scenario shown, since R(1, 2) > R(2, 2), U2 is reduced in value from UAV 1's perspective. Likewise, since R(2, 1) > R(1, 1), U1 is reduced in value from UAV 2's perspective. As a result, since the UAVs are designed to reduce the greatest total uncertainty value, a type of uncertainty assignment is achieved, with UAV 1 covering U1 and UAV 2 covering U2.

Table 4.1: Comparison of cooperative and non-cooperative search

Type             | Wins | Avg. time to target found (time steps)
-----------------|------|---------------------------------------
Cooperative      | 33   | 108
Non-Cooperative  | 21   | 150


    4.4 Benchmark Comparison

The Zamboni method is an exhaustive search method that can be adapted to autonomous multi-vehicle searching. Few benchmark search algorithms for cooperative searching exist; however, the Zamboni method in particular has been used in the past for algorithm comparison. A full description of the Zamboni method, along with other exhaustive search methods, can be found in the work of Ablavsky et al. [1]. The Zamboni method involves a series of loops, where each individual loop comprises (1) a front-sweep, followed by (2) a 180-degree turn-around, and finally (3) a back-sweep (see Figure 4.14 for a sample search pattern). Each sweep (front and back) is a full transition of the UAV from one side of the game space to the direct opposite side. Each successive loop is advanced slightly further than the last, resulting in a creeping boundary between the unsearched and searched space. The looping continues until the front-sweep of the first loop meets the back-sweep of the last loop. At that point, the UAV advances several units ahead and begins the process again with another series of loops searching new area.
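One simple interpretation of a Zamboni block as a waypoint generator is sketched below: loop i flies a front-sweep half a block ahead of its back-sweep, and looping stops when the back-sweep row meets the first front-sweep row. The exact sweep offsets of the pattern in [1] may differ, and all names and parameters here are assumptions.

```python
def zamboni_block(x_left, x_right, y0, half=5.0, d=1.0):
    """Waypoints for one Zamboni block. Loop i: front-sweep left-to-right
    at y0 + half + i*d, 180-degree turn, back-sweep right-to-left at
    y0 + i*d; stop once the back-sweep reaches the first front-sweep row."""
    waypoints = []
    i = 0
    while y0 + i * d <= y0 + half:   # back-sweep has not yet met first front-sweep
        y_front = y0 + half + i * d
        y_back = y0 + i * d
        waypoints += [(x_left, y_front), (x_right, y_front),   # front-sweep
                      (x_right, y_back), (x_left, y_back)]     # turn + back-sweep
        i += 1
    return waypoints
```

A full search would chain such blocks, advancing y0 by roughly a block height each time so that each series of loops covers new area.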

One hundred simulations were conducted comparing the cooperative hybrid search method to the Zamboni method. Two UAVs were assigned the mission of finding a target in a game space at varying starting distances. Each simulation was first run using cooperative hybrid searching and then re-run using the Zamboni method. The results appear in Figure 4.15 and Table 4.2. The simulations demonstrate that the coordinated hybrid method yields a lower capture time more than twice as often as the Zamboni method, with a 27% reduction in average time to target found.

    4.5 Summary

This chapter discusses the first phase of the UAV mission, which is to locate a single moving target. Initially available to the UAV search team are regions where the target


Figure 4.14: Sample Zamboni search pattern, showing front-sweeps, back-sweeps, and the direction of travel.

Table 4.2: Comparison of cooperative hybrid and Zamboni exhaustive search

Type               | Wins (3 draws) | Avg. time to target found (time steps)
-------------------|----------------|---------------------------------------
Cooperative Hybrid | 67             | 238
Zamboni            | 30             | 324


Figure 4.15: Simulation results after 100 trials comparing time to target found when using the cooperative hybrid versus Zamboni search algorithms.


could possibly be. These regions, called uncertainty regions, are known before the start of the search. The UAV team uses a proposed modified diffusion model to manage the time-evolving nature of the uncertainty regions, which arises from the target's ability to move from one location to another. Individual search algorithms are based on a