
  • Cooperative UAV Search and Intercept

    by

    Andrew Ke-Ping Sun

    A thesis submitted in conformity with the requirements
    for the degree of Master of Applied Science

    Graduate Department of Aerospace Science and Engineering
    University of Toronto

    Copyright © 2009 by Andrew Ke-Ping Sun

  • Abstract

    Cooperative UAV Search and Intercept

    Andrew Ke-Ping Sun

    Master of Applied Science

    Graduate Department of Aerospace Science and Engineering

    University of Toronto

    2009

    In this thesis, a solution to the multi-Unmanned Aerial Vehicle (UAV) search and
    intercept problem for a moving target is presented. For the search phase, an adapted
    diffusion-based algorithm is used to manage the target uncertainty, while individual
    UAVs are controlled with a hybrid receding horizon / potential method. The coordinated
    search is made possible by an uncertainty weighting process. The team intercept phase
    algorithm is a behavioural approach based on the analytical solution of Isaacs'
    Single-Pursuer/Single-Evader (SPSE) homicidal chauffeur problem. In this formulation,
    the intercepting control is taken to be a linear combination of the individual SPSE
    controls that would exist for each of the evader/pursuer pairs. A particle swarm
    optimizer is applied to find approximately optimal weighting coefficients for
    discretized intervals of the game time. Simulations for the team search, team
    intercept, and combined search and intercept problems are presented.


  • Acknowledgements

    First and foremost, I would like to thank my research supervisor, Professor Hugh Liu.
    This thesis would not have been possible without his patience, his guidance, and his
    willingness to always make time for his students regardless of how crowded his
    schedule becomes. I would also like to thank the two other professors on my research
    assessment committee, Professors Peter Grant and Chris Damaren, for their helpful
    suggestions. A big thank you goes out to all my fellow lab mates in the Flight
    Systems and Control Group for making my time at UTIAS so enjoyable, with special
    thanks going out to Ruben, Eric, Yoshi, Sohrab and Keith.

    On a more personal note, I would like to recognize those outside of my academic life
    who have supported me throughout the past two years. To my girlfriend Ada, my two
    brothers Mark and Christopher, my sister Stephanie, my grandparents, and my father:
    I am forever grateful for your guidance and words of encouragement. Last but not
    least, I owe my deepest gratitude to my mother. Her perseverance, unconditional
    support, and mental toughness have been truly inspirational, and for that I dedicate
    this thesis to her.


  • Contents

    1 Introduction 1

    1.1 Purpose of Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    1.1.1 Problem Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    1.2 Thesis Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

    2 Literature Survey 8

    2.1 Cooperative UAV Search . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

    2.2 Cooperative UAV Pursuit and Evasion . . . . . . . . . . . . . . . . . . . 11

    2.3 Thesis Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

    3 Background 13

    3.1 Optimization Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

    3.1.1 Nelder-Mead Simplex Algorithm . . . . . . . . . . . . . . . . . . . 13

    3.1.2 Particle Swarm Optimization . . . . . . . . . . . . . . . . . . . . 16

    3.2 Game Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

    3.2.1 The Classic Pursuit and Evasion . . . . . . . . . . . . . . . . . . 18

    3.2.2 Isaacs’ Pursuit and Evasion Problem and Solution . . . . . . . . . 19

    3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    4 Cooperative UAV Search 23

    4.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23


  • 4.1.1 Grid World Representation . . . . . . . . . . . . . . . . . . . . . . 24

    4.1.2 Target Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . 25

    4.2 Target Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    4.3 Search Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    4.3.1 Potential Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

    4.3.2 Receding Horizon Method . . . . . . . . . . . . . . . . . . . . . . 37

    4.3.3 Hybrid Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

    4.3.4 Multi-UAV Coordination Algorithm . . . . . . . . . . . . . . . . . 44

    4.4 Benchmark Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

    4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

    5 Cooperative UAV Intercept 51

    5.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

    5.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    5.2.1 Evasion Behaviour . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    5.2.2 Pursuit Team Behaviour . . . . . . . . . . . . . . . . . . . . . . . 54

    5.2.3 Optimizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

    5.2.4 Non-cooperative and Cooperative Simulations . . . . . . . . . . . 56

    5.3 Simulation Cases and Results . . . . . . . . . . . . . . . . . . . . . . . . 57

    5.3.1 Non-Cooperative Chase Simulations . . . . . . . . . . . . . . . . . 57

    5.3.2 Cooperative Chase Simulations . . . . . . . . . . . . . . . . . . . 61

    5.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

    6 Search and Intercept Simulation Results 63

    6.1 Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

    6.2 Search Phase . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    6.3 Intercept Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66


  • 7 Conclusions 67

    7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

    Bibliography 70


  • List of Tables

    4.1 Comparison of cooperative and non-cooperative search . . . . . . . . . . 46

    4.2 Comparison of cooperative hybrid and Zamboni exhaustive search . . . . 48

    5.1 Comparison of evasion capture times for simulations 1-A, 1-B and 1-C . . 58

    5.2 Comparison of evasion capture times for simulations 3-A and 3-B . . . . 60


  • List of Figures

    1.1 Overview of UAV cooperation task: Searching. . . . . . . . . . . . . . . . 6

    1.2 Overview of UAV cooperation task: Interception. . . . . . . . . . . . . . 7

    3.1 Simplex in 2D search space. . . . . . . . . . . . . . . . . . . . . . . . . . 14

    3.2 Nelder-Mead Simplex Operations . . . . . . . . . . . . . . . . . . . . . 15

    3.3 Isaacs' SPSE regions. . . . . . . . . . . . . . . . . . . . . . . . . . 20

    3.4 Region 1: The primary path. . . . . . . . . . . . . . . . . . . . . . . . . . 21

    3.5 Region 2: The universal path. . . . . . . . . . . . . . . . . . . . . . . . . 21

    3.6 Region 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

    4.1 Grid approximation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

    4.2 Uncertainty represented by gridded map. . . . . . . . . . . . . . . . . . . 26

    4.3 Moving target implication of time varying uncertainty. . . . . . . . . . . 28

    4.4 Finite element diffusion process. . . . . . . . . . . . . . . . . . . . . . . . 29

    4.5 Diffusion modification 1: Maintaining two maps. . . . . . . . . . . . . . . 30

    4.6 Diffusion Modification 3: Uncertainty source for regular boundary growth. 31

    4.7 Contour plots of the modified diffusion model for uncertainty management.

    The green circle represents the uncertainty boundary. . . . . . . . . . . . 33

    4.8 Example of virtual potential field generated from an uncertainty distribution. 35

    4.9 UAV located in potential field. In this particular case, the UAV control

    law would dictate a left turn to align the orientation vector with the gradient. 36


  • 4.10 Example receding horizon method. Three time steps are shown with a

    horizon length of 5 time steps. At each time step, the UAV reoptimizes to

    find the best combination of controls for the next 5 steps. The time step

    that corresponds to the current time step is executed. . . . . . . . . . . . 39

    4.11 Hybrid search method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    4.12 Comparison of Hybrid method and Potential method in simulated searchs

    of varying initial separation between target and search UAVs. 2 UAVs

    used with a maximum time out of 1500 time steps. . . . . . . . . . . . . 43

    4.13 Under the coordination algorithm, uncertainty closer to a teammate will

    be reduced in value to be searched. In the above scenario, since R(1, 2) >

    R(2, 2) then U2 is reduced in value from UAV 1’s perspective. Likewise,

    since R(2, 1) > R(1, 1) then U1 is reduced in value from UAV 2’s perspec-

    tive. As a result, since UAVs are designed to reduce the greatest total

    uncertainty value, a type of uncertainty assignment is acheived with UAV

    1 covering U1 and UAV 2 covering U2. . . . . . . . . . . . . . . . . . . 46

    4.14 Sample Zamboni search pattern. . . . . . . . . . . . . . . . . . . . . . . . 48

    4.15 Simulation results after 100 trials comparing time to target found when

    using Cooperative Hybrid versus Zamboni search algorithms. . . . . . . . 49

    5.1 Agent dynamics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    5.2 Blocking and Chasing behaviours . . . . . . . . . . . . . . . . . . . . . . 56

    5.3 Pursuer 1 Initial Condition=[0 0 0]; Pursuer 2 Initial Condition=[5 5 -π/2];

    Evader Initial Condition=[0 5]; . . . . . . . . . . . . . . . . . . . . . . . 58

    5.4 Simulation 2 (MPSE Evader Control): Pursuer 1 Initial Condition=[0 0

    0]; Pursuer 2 Initial Condition=[5 0]; Evader Initial Condition=[0 5 -π/2];

    Time to capture=1.82s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

    5.5 Pursuer 1 Initial Condition=[-5 0 0]; Pursuer 2 Initial Condition=[5 0 π];

    Evader Initial Condition=[0 0]; . . . . . . . . . . . . . . . . . . . . . . . 60


  • 5.6 Pursuit chase trajectories: Pursuer 1 Initial Condition=[−15/√2 −15/√2 π/4];

    Pursuer 2 Initial Condition=[1/√2 −1/√2 3π/4]; Evader Initial Condition=[0 0]; . . . 62

    6.1 Initial conditions for search simulation. . . . . . . . . . . . . . . . . . . . 64

    6.2 Searching for target. . . . . . . . . . . . . . . . . . . . . . . . . . 65

    6.3 Time step= 110; Target found. . . . . . . . . . . . . . . . . . . . . . . . 65

    6.4 Target intercept mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66


  • Chapter 1

    Introduction

    The Uninhabited Aerial Vehicle (UAV) has come a long way in terms of sophistication
    when compared to its early predecessors. One of the first historically documented
    applications of unmanned flying machines took place against the backdrop of the
    U.S. Civil War [19]. Charles Perley, a New York-based inventor, filed a patent for
    the design of a lighter-than-air balloon, laden with explosives, that was to be
    launched over enemy lines and travel through the air out of reach of enemy
    obstruction. The hope was that when the UAV eventually detonated, it would have
    journeyed close enough to a significant enemy target, thereby gaining a small
    victory without risking a single soldier. Unfortunately for the users of such a
    device, the balloons were not equipped with any form of control and, as such, were
    at the mercy of the atmosphere, including winds that would often shift from one
    direction to another without warning. To both Union and Confederate military
    commanders, uncontrollable flying explosives must have seemed more of a liability
    than a useful means of attack, since the projects on both sides were quickly
    abandoned in favour of more traditional types of warfare.

    Clearly, much has changed: we are now witnessing an accelerating interest in UAV
    technology, which is in current use and planned for future use by many in the
    global community. Unlike the orphaned maverick balloons of the past, today's UAVs
    are not only widely perceived as advanced, viable solutions, but to many they also
    represent the future of flight for both military and civilian uses. Take, for
    example, the contrast of UAV use in the first and second Gulf Wars. In the first
    Gulf War, a total of 1641 hours of flight time were logged with UAVs. According to
    a May 2001 Department of the Navy report, this translated into "at least one UAV
    was airborne at all times during Desert Storm" [6]. Comparing this with the second
    Gulf War, where a recent Associated Press article estimates the total number of UAV
    hours flown to be in excess of 500,000 [2], one can see a dramatic increase in the
    reliance on UAV technology. The role of UAVs is becoming more accepted and
    entrenched in the arsenal of tools at the modern military's disposal.

    Although military applications have dominated UAV use to date, there is also
    growing interest in applying UAV technology to civilian applications [26]. UAV
    prototypes have been built and are under development as solutions for monitoring
    forest fires, monitoring wildlife migration, and delivering medical supplies. Also
    remarkable is the breadth of countries employing and developing UAVs. A field that
    was once dominated by only a few players can now boast strong international
    participation. In the latest AIAA UAV roundup survey, conducted in 2007 [27], 36
    countries were found to be collectively working on over 200 UAV projects in active
    use, under development, or in production.

    Based on the increasing reliance on UAVs in military and civilian scenarios, the
    potential for application to a wide array of problems including scientific and
    civilian uses, and the amount of global participation, one can safely state that
    research into UAV systems and applications will expand for at least the foreseeable
    future. New problems will arise, and exploration and research into how these
    machines can be designed to meet new demands will continue.


    1.1 Purpose of Study

    The vast majority of current UAVs are still more or less remotely controlled by a
    human operator at a ground station. The Predator and Global Hawk are both normally
    piloted by personnel located at a ground station [19]. The hand-launched Raven is
    flown by a specialist soldier located directly in the field [19]. Admittedly, some
    UAVs do have the capability to perform some tasks independently of external
    operators. Examples include heading holds, altitude holds, waypoint navigation, and
    various other maneuvers including loiters, climbs, descents, and the flying of
    circuit patterns or approaches. These tasks can more or less be handled by
    existing, well-developed control techniques. Yet for more complicated tasks and
    maneuvers, a higher level of decision making is necessary, and humans are often
    still relied upon to make the decisions or, in most cases, to be in direct control
    of the aircraft itself.

    This gap in system autonomy represents an excellent opportunity for many
    researchers of UAV systems. Firstly, humans perform very well when confronted with
    new tasks; their flexibility and intuition are not to be discounted. Yet for many
    aircraft tasks, flexibility and intuition are only seldom called upon. More likely,
    an aircraft will be required to perform the same mission many times over, with
    little difference between the multiple sorties. In the vast majority of such cases,
    automation has a significant advantage, since consistency, accuracy, and precision
    are all weak points of the human operator. Secondly, if human operators are
    required to be in continuous contact with the aircraft, then aircraft missions are
    limited by the range and quality of the communication method employed. An aircraft
    capable of making higher-level decisions independently would potentially be able to
    fly a much larger class of extended missions, while at the same time maintaining
    robustness to severed and intermittent communication links.


    1.1.1 Problem Overview

    The study conducted and detailed in this thesis deals with the high-level decision
    algorithms necessary to handle problems of coordination between members of a UAV
    team assigned to conduct a specific task collectively. Multiple UAVs commissioned
    to complete a task have several advantages over a lone UAV. Assigning more UAVs to
    a given task provides more flexibility to mission planners, since multiple units
    are capable of undertaking many types of missions that a single UAV could not.
    Tasks can also often be completed faster in time-sensitive missions with multiple
    agents. Finally, multiple-UAV teams bring a greater degree of robustness, since the
    impact of losing one aircraft out of many is diminished compared to losing one out
    of one.

    The specific cooperation task dealt with in this study is that of multiple UAVs
    assigned the team task of searching for and intercepting a moving target. This
    scenario has real analogues in both the military and civilian realms. On the
    military side, search-and-intercept UAVs would be useful in missions where
    commanders wish to pursue or find evading enemy units. After finding the target,
    the UAVs could either track it and relay information back to the command station,
    or engage the target cooperatively as a team. On the civilian side,
    search-and-rescue missions could be facilitated by a similar system. An example
    scenario is that of a plane crash where the accident is known to have taken place,
    but the exact location of the survivors is uncertain. A group of UAVs could be
    dispatched first to search for survivors and second, if they are found, to relay
    information on their status to search-and-rescue crews.

    The idealized scenario begins with a team of UAVs assigned the collective mission
    of searching for and intercepting a target in minimum time (refer to Figure 1.1 and
    Figure 1.2). The exact location of the single target is not known, but the target
    is known to be within a region of known position and dimensions, herein referred to
    as the Uncertainty Region. The mission can be divided into two distinct operations:
    searching and intercepting. In the search phase, the UAVs collectively reduce the
    uncertainty by performing sensor sweeps of the uncertain region. Sensors are
    idealized to be perfect, and therefore a single sensor sweep of a given area is
    sufficient to ascertain whether the target is in that area at that time instant.
    The searching process continues until the target location is ascertained.
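    The perfect-sensor idealization can be sketched as a toy grid model (an
    illustrative sketch with invented names, not the thesis implementation): a single
    sweep of a cell is enough to decide, at that instant, whether the target occupies it.

```python
# Hypothetical sketch (not the thesis code) of the perfect-sensor assumption:
# one sweep of a grid cell decides whether the target occupies it right now.
def sweep(searched_cells, uav_cell, target_cell):
    """Mark the UAV's current cell as swept; return True iff the target is there."""
    searched_cells.add(uav_cell)
    return uav_cell == target_cell

searched = set()                 # cells already swept
target = (3, 7)                  # true target cell (unknown to the searchers)

print(sweep(searched, (0, 0), target))   # False: a clean miss, cell now swept
print(sweep(searched, (3, 7), target))   # True: detection ends the search phase
```

    Under this idealization, no cell ever needs a second sweep at the same instant; a
    moving target, however, can re-enter previously swept cells, which is why the
    uncertainty must grow again over time.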

    The end of the search phase initiates the intercept mode for the UAVs. The goal in
    this phase is to intercept the target in minimum time. Which UAV ultimately
    intercepts the target is irrelevant; what matters is the time at which the
    interception occurs. Since minimum interception time is desirable, the UAVs
    collectively develop strategies to accommodate this goal. Interception planning
    yields trajectories assigned to each of the UAVs, which are then executed to
    capture the target.

    In this simulation study, an algorithm for cooperative search and cooperative
    intercept is developed and analyzed for a team of multiple UAVs. General
    assumptions include:

    1. there is only a single target;

    2. the target is capable of moving;

    3. the maximum velocity of the target is known a priori.

    More assumptions will follow and will be detailed as they become relevant to the
    discussion.

    1.2 Thesis Layout

    The thesis is divided into seven chapters. In the current Chapter 1, motivation and
    a general overview of the problem are provided. In Chapter 2, a survey of the
    current state of research is provided, detailing relevant works and how this study
    complements existing knowledge. Chapter 3 is a brief overview of the algorithms,
    methods, and concepts used in the development of the search and intercept solution.
    Chapters 4 and 5 detail the development and implementation of the cooperative
    search and intercept algorithms, respectively. Simulation results of the combined
    search/intercept algorithms are presented in Chapter 6. Finally, future work and
    concluding remarks are presented in Chapter 7.

    (a) A team of UAVs is assigned the collective task of searching for and
    intercepting a moving target as fast as possible. The exact whereabouts of the
    target are unknown.

    (b) The UAVs search for the target by performing sensor sweeps over the
    uncertainty regions.

    (c) When the target is found, the UAVs discontinue search operations and initiate
    intercept planning.

    Figure 1.1: Overview of UAV cooperation task: Searching.

    (a) The end conditions of the search problem are the initial conditions of the
    intercept problem. The objective is now to intercept the target cooperatively in
    minimum time.

    (b) The UAVs collectively plan intercept trajectories assigned to each UAV.

    (c) The target is intercepted.

    Figure 1.2: Overview of UAV cooperation task: Interception.

  • Chapter 2

    Literature Survey

    There are two distinct UAV tasks studied in this thesis: cooperative UAV search and
    cooperative UAV intercept. Much of the existing work has focused on one or the
    other, and the literature survey is accordingly divided into these two areas of
    study.

    2.1 Cooperative UAV Search

    Studies of how a coordinated search can be implemented in autonomous vehicle
    systems have been conducted and continue to attract much scientific interest, at
    least in part due to the wide demand for such systems in military, scientific, and
    civilian applications. Some notable examples include planetary exploration for the
    scientific community [10], target localization for military missions [22], and
    search and rescue for civilian use [17].

    The archetypal collaborative search problem is characterized by finding a target or
    targets while keeping some parameter to a minimum. Targets can be stationary or
    moving, and the minimized parameter is often based on the limitations of the
    mission or UAV (examples include time to target found or minimum fuel). Several
    different methods exist and are currently being studied. The following is a
    sampling of the most common methods currently under investigation.


    Exhaustive searches are often used as benchmarks and, for the most part, use some
    form of open-loop, pre-defined search pattern such as the Zamboni pattern [1] or
    the progressive spiral-in maneuver [25]. The Zamboni method, aptly named after the
    ice-conditioning machine due to the similarity of the generated paths, involves
    making successive lateral sweeps, back and forth, across the search area. Each
    lateral sweep covers new ground, and the boundary between searched and unsearched
    space gradually moves forward until the entire region is covered. The progressive
    spiral-in, as the name implies, involves maneuvering the agents to cover the
    perimeter of the search space. The agents move in the same circular direction,
    either clockwise or counterclockwise, and gradually reduce their turning radius.
    This method in particular has an advantage when dealing with a moving-target
    scenario, since it can be guaranteed that, under the right conditions, a moving
    target does not escape and will eventually be captured.

    Exhaustive searches are intuitive and practical to implement. However, since they
    are open loop, they lack robustness to unforeseen events such as the loss of an
    agent or any other unexpected agent behaviour. Furthermore, certain search space
    characteristics may make an exhaustive search impractical, such as a search space
    populated with sparsely distributed potential search regions. There is no point in
    searching the entire space when a visit to a select few waypoints would be
    sufficient; in such a case, an exhaustive search would be a waste of time and
    resources. To overcome this, closed-loop search algorithms are preferred.

    One closed-loop approach is to transform the search problem into a task assignment
    problem. In these cases, the search space is subdivided into distinct regions, and
    search agents are assigned to the individual regions. In the work of Enns et
    al. [7], the search space is divided into flying lanes, and a search UAV is
    assigned to each lane. A market-oriented programming optimization method is used to
    perform the actual assignment, with search agents bidding on the individual lanes
    to search. Another example of recasting the search task as a task assignment
    problem is the work of Zhang et al. [30], in which UAVs are assigned to search for
    and prosecute targets in a game space. Coordination is achieved through the
    assignment of navigation points to individual UAVs, where the points to be assigned
    are those known to be most likely to coincide with the target's position.

    By far the most common approach to cooperative search is the receding horizon
    method, commonly used in continuous-time problems or problems where the end time is
    not explicitly defined. The basic receding horizon method involves looking ahead in
    time by a defined duration and then determining the best strategy for that
    particular look-ahead interval. The control input corresponding to the
    instantaneous time step is executed, and at the next time step the process is
    repeated. The exact implementation of coordination between search agents varies
    widely. In [21], a global optimization performance index is used by the search
    agents, and individual search agents base their receding horizon controls on
    optimizing that index. The index used is a multi-objective weighted sum which
    balances tasks that include searching, collision avoidance, and minimum search
    overlap, among others. In [29], a similar receding horizon method is used as a
    long-term planner with a short-term planner acting in parallel; UAVs therefore
    choose their trajectories based on the benefits of adopting the long-term control
    policy, the short-term control policy, or a combination of the two. Coordination in
    this case is achieved through virtual potential fields that minimize overlap and
    UAV collisions. Jin et al. [13] also use receding horizon control, with the
    distinction that replanning is done only at the end of the look-ahead interval, and
    not at every step, for search phases.
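    The look-ahead-and-replan cycle described above can be sketched generically (a
    minimal, hypothetical illustration: the one-dimensional dynamics, reward, and
    horizon length here are invented stand-ins, not any cited author's formulation).
    At each step the agent scores every control sequence of horizon length, then
    executes only the first control of the best sequence before replanning.

```python
from itertools import product

# Minimal receding-horizon sketch (illustrative only; names are invented).
CONTROLS = [-1, 0, +1]          # e.g. turn left / go straight / turn right

def simulate(state, control):
    return state + control       # trivial 1-D "dynamics" for illustration

def reward(state, goal):
    return -abs(state - goal)    # higher is better: closer to the goal

def receding_horizon_step(state, goal, horizon):
    """Score every control sequence over the horizon; return the first control
    of the best sequence (the only control that is actually executed)."""
    best_seq, best_score = None, float("-inf")
    for seq in product(CONTROLS, repeat=horizon):
        s, score = state, 0.0
        for u in seq:            # roll the model forward over the look-ahead
            s = simulate(s, u)
            score += reward(s, goal)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq[0]

state, goal = 0, 4
for _ in range(6):               # replan from the new state at every step
    state = simulate(state, receding_horizon_step(state, goal, horizon=3))
print(state)                     # the agent has driven the state to the goal: 4
```

    Executing only the first control and replanning is what makes the method closed
    loop: new information (a lost teammate, an updated uncertainty map) is absorbed at
    the next replanning step.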

    Other commonly encountered methods are potential methods, in which a virtual
    potential field is used to guide the UAVs to regions that have yet to be searched.
    In a study done in cooperation with Northrop Grumman [23], each unsearched grid
    cell acts as a virtual potential field source, much as a mass is a gravitational
    potential field source. The gradient of the potential field, calculated at any UAV
    location, is used as a guiding direction vector for the search agent. In that
    particular study, coordination was not explicitly addressed; under simulation,
    overlap and redundancy in search trajectories are observed to be commonplace under
    potential field control.
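    The mass-and-gravity analogy can be made concrete with a small sketch (purely
    illustrative; the potential function and its decay law are assumed here, not taken
    from the cited study): each unsearched cell contributes an attractive potential,
    and the UAV's guiding direction is the local gradient of the summed field.

```python
import math

# Illustrative potential-field guidance sketch (invented toy example).
# Each unsearched cell contributes an attractive potential that decays with
# distance; the UAV steers along the gradient of the summed field.
def potential(pos, unsearched_cells):
    x, y = pos
    return sum(1.0 / (math.hypot(x - cx, y - cy) + 1.0)
               for cx, cy in unsearched_cells)

def gradient(pos, unsearched_cells, eps=1e-4):
    """Central finite-difference gradient of the field at the UAV position."""
    x, y = pos
    dx = (potential((x + eps, y), unsearched_cells)
          - potential((x - eps, y), unsearched_cells)) / (2 * eps)
    dy = (potential((x, y + eps), unsearched_cells)
          - potential((x, y - eps), unsearched_cells)) / (2 * eps)
    return dx, dy

unsearched = [(5.0, 0.0), (6.0, 1.0)]     # a cluster of unsearched cells
gx, gy = gradient((0.0, 0.0), unsearched)
print(gx > 0)    # the gradient points toward the unsearched cluster
```

    Because every UAV in the team follows the same field, two nearby UAVs are pulled
    toward the same cluster, which is consistent with the overlap and redundancy
    observed in the cited simulations when coordination is not addressed.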

    2.2 Cooperative UAV Pursuit and Evasion

    Pursuit and evasion problems have been studied since Isaacs first published his
    work on differential game theory [11]. His initial work laid the foundation for the
    single pursuit and evasion (PE) game and presented an analytical technique for
    finding optimal path trajectories based on the principle of a constant-valued game,
    referred to as the main equation. Other analytical techniques were subsequently
    developed to solve various types of PE problems, such as the point-mass
    interception problem [3] and the isotropic rocket [5].

    Lately, numerical techniques have largely dominated PE research, primarily due to
    their flexibility when applied to the general breadth of PE problems. This has
    allowed researchers to explore more complex PE problems, such as multi-agent
    problems, which are currently an active and growing area of study. An example of an
    early study of the Multiple-Pursuer/Single-Evader (MPSE) problem is the work of
    Benda et al. [4], where agent dynamics are approximated with grid-world players.

    Alternate approaches to the MPSE problem include the genetically inspired methods

    of Haynes and Sen [8] where agent trajectories are generated from a distributed control

    algorithm based on a genetic programming approach called the strongly typed genetic

    programming method developed initially by Montana [16]. Another distributed approach

    by Yamaguchi [28] uses a hybrid behavioural reactive framework algorithm to simulate

    robot hunting cooperation. More recently, Jang and Tomlin [12] looked into a similar

    multi-agent problem, solving it using level set functions and a novel method for reflecting

    forward reachable sets. Li and Cruz[14] also studied the same problem with a look-ahead

  • Chapter 2. Literature Survey 12

optimization approach. Yet another work by Li, Cruz and colleagues[15] translated the multi-pursuer problem into a target assignment problem. In this work, a hierarchical approach is used where an upper-level optimization determines which pursuer targets

    which evader. Once pursuit pairs are determined, the pursuers chase the targets using

the results of Isaacs' analytical solution to the one-on-one pursuit problem.

    2.3 Thesis Contribution

    This thesis presents a novel solution for the combined collaborative search and intercept

    problem for a moving target. Many previous works have dealt with the individual prob-

    lems of search, intercept and a moving target, yet few have considered all three in the

    same problem. The search problem is solved using a diffusion based uncertainty map

management system and a combined potential field/receding horizon method (referred to as the hybrid method) to direct individual search agents. Presented work on the intercept

    problem can be viewed as an extension of the behavioural approach of Haynes, Sen[8][9],

    and Yamaguchi[28] applied to the MPSE problem studied by Jang, Tomlin[12], Li and

Cruz[14]. The primary advantage of the behavioural approach over existing techniques is its ability to reduce the search-space of possible control strategies for optimal evasion or pursuit, which in turn eases the burden on the optimization routines applied. The disadvantage

    is that the limitation to the heuristic behaviours may be overly conservative and the true

    optimum may not be included in the corresponding admissible search-space. Thus, the

    intercept portion of this thesis concerns itself not with finding the true optimum, but

    rather approximations.

Chapter 3

Background

    This section details some of the underlying methods used in the construction of the

cooperative UAV search and intercept algorithm. The optimization methods of the Nelder-Mead simplex and particle swarm optimization are described. A short introduction to

    game theory and its application to the relevant one-on-one single pursuer/single evader

problem is provided. Only brief descriptions are provided here, and the time-constrained reader already familiar with such concepts can skip this chapter without consequence.

    For more detailed and rigorous descriptions of the methods, citations are provided.

    3.1 Optimization Methods

Two of the primary gradient-free optimization methods used are the Nelder-Mead simplex algorithm and the particle swarm optimization method. Both are described in the

    following sections.

3.1.1 Nelder-Mead Simplex Algorithm

The Nelder-Mead simplex algorithm is one of the two gradient-free optimization methods used in this study. The method starts with a simplex of N + 1 vertices in an N-



dimensional search space. For example, a search space defined by two search parameters would have a triangular simplex, and a three-dimensional search space would have a corresponding tetrahedral simplex. The vertices of the simplex represent evaluation points for the objective function and are the only discrete points where the objective function is measured. The vertices can therefore be ordered from worst to best in terms of their objective function values.

Figure 3.1: Simplex in a 2-D search space, with axes Parameter 1 and Parameter 2; the current and new simplices and function evaluation points are shown, with vertices labelled worst, lousy, and best.

The general principle behind the simplex algorithm is to update the simplex by continuously discarding the worst performing vertex and replacing it with another, better

    point. The new point is selected based on the evaluation of trial vertices which are cho-

    sen based on an extrapolation of the objective function. Once a trial point is selected

    and evaluated, depending on the performance of the trial point compared to the known

    values of the existing vertices, one of several simplex morphing steps can be executed.

The result is a simplex that gradually converges towards a local optimum. In a two-dimensional search space, it can be pictured as a triangle flip-flopping within the search space, changing its shape one vertex at a time, until it reaches a local extremum.


A number of different variations exist; however, for this study only the following simplex morphing steps are used (a two-dimensional search space example with a three-vertex simplex is shown):

Figure 3.2: Nelder-Mead simplex operations: (a) reflection; (b) expansion; (c) contraction (outside and inside shown); (d) shrinking.

    The following is a pseudo code for the algorithm used in this study.

    1. Initialize simplex

    2. Loop until converged:

    (a) Identify the worst (highest: xw), second worst (second highest: xl) and best

    (lowest: xb) points with function values fw, fl, and fb, respectively.


    (b) Test for convergence

    (c) Evaluate xa, the average of the points in the simplex excluding xw.

(d) Perform reflection to obtain xr and evaluate to obtain fr.

    (e) if fr < fb then

    i. Perform expansion to obtain xe, evaluate to obtain fe.

    ii. If fe < fb then replace xw by xe, fw by fe (accept expansion).

    iii. Else replace xw by xr, fw by fr (accept reflection).

    (f) Else if fr ≤ fl then replace xw by xr, fw by fr (accept reflected point).

    (g) Else

    i. If fr > fw then perform an inside contraction and evaluate fc.

    ii. Else perform an outside contraction and evaluate fc.

    iii. If fc > fw then shrink simplex, evaluate at the n new points.

    iv. Else replace xw by xc, fw by fc (accept contraction)

    (h) End Loop.

For a more detailed description of the Nelder-Mead simplex algorithm, the interested reader is directed to Nelder and Mead's original paper[18].
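As an illustration, the loop above can be sketched in code. This is a minimal, self-contained sketch of the reflection/expansion/contraction/shrink logic described in the pseudocode, not the implementation used in this thesis; the initial step size, tolerance, and iteration limit are illustrative choices.

```python
import numpy as np

def nelder_mead(f, x0, step=0.5, tol=1e-8, max_iter=500):
    """Minimal Nelder-Mead minimizer following the loop in Section 3.1.1."""
    n = len(x0)
    # Initial simplex: x0 plus one perturbed vertex per axis (N + 1 vertices)
    simplex = [np.asarray(x0, dtype=float)]
    for i in range(n):
        v = simplex[0].copy()
        v[i] += step
        simplex.append(v)
    fvals = [f(v) for v in simplex]

    for _ in range(max_iter):
        # Order vertices: best (lowest f) first, worst (highest f) last
        order = np.argsort(fvals)
        simplex = [simplex[i] for i in order]
        fvals = [fvals[i] for i in order]
        xb, fb = simplex[0], fvals[0]
        xw, fw = simplex[-1], fvals[-1]
        fl = fvals[-2]                       # second worst ("lousy") value
        if abs(fw - fb) < tol:               # convergence test
            break
        xa = np.mean(simplex[:-1], axis=0)   # centroid excluding the worst

        xr = xa + (xa - xw)                  # reflection
        fr = f(xr)
        if fr < fb:
            xe = xa + 2.0 * (xa - xw)        # expansion
            fe = f(xe)
            simplex[-1], fvals[-1] = (xe, fe) if fe < fb else (xr, fr)
        elif fr <= fl:
            simplex[-1], fvals[-1] = xr, fr  # accept reflected point
        else:
            if fr > fw:
                xc = xa - 0.5 * (xa - xw)    # inside contraction
            else:
                xc = xa + 0.5 * (xa - xw)    # outside contraction
            fc = f(xc)
            if fc > fw:                      # shrink all vertices toward the best
                simplex = [xb + 0.5 * (v - xb) for v in simplex]
                fvals = [f(v) for v in simplex]
            else:
                simplex[-1], fvals[-1] = xc, fc

    order = np.argsort(fvals)
    return simplex[order[0]], fvals[order[0]]
```

On a smooth function such as a quadratic bowl, the simplex collapses onto the minimizer within a few dozen iterations.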

    3.1.2 Particle Swarm Optimization

    Particle Swarm Optimization (PSO) is a stochastic and gradient-free optimization method

for finding global extrema. The method derives its motivation from social foraging creatures observed in nature, such as ant and bee colonies. In these species, one can often observe that while individual search agents look for resources independently, they can signal or influence other agents depending on each agent's resource discovery or lack thereof. An ant that comes across a stockpile of sugar is known to release pheromones to signal other search ants to help exploit the bounty. It is this balance between the individual


    and social forces that is distilled from nature and replicated in code to solve engineering

    optimization problems.

Particle swarm optimization begins with a population of agents, or particles, that

    move within the search space. Each agent has an associated position and velocity which

are updated at each optimization iteration. Every agent maintains an information set containing:

    1. The value and location of its current point in the search space.

    2. The value and location of the best point in the search space that the individual has

    discovered on its own.

    3. The value and location of the best point in the search space that any team member

    has discovered.

    Based on this information, each particle adjusts its velocity according to the following

    rule:

χ^i_{k+1} = χ^i_k + µ^i_{k+1} ∆τ

µ^i_{k+1} = w µ^i_k + c1 r1 (ρ^i_k − χ^i_k)/∆τ + c2 r2 (ρ^g_k − χ^i_k)/∆τ        (3.1)

    Here k is the optimization iteration index, χi is the ith particle’s position in the

    design space, µi is the ith particle’s update velocity and ∆τ is the update time step (set

    to unity). w, c1 and c2 are weighting parameters on particle momentum, cognitive and

    social factors respectively. r1 and r2 are random numbers between 0 and 1. Finally, ρi is

    the individual particle’s optimum position and ρg is the global optimum found out of all

    particles.

    Pseudo code for the version of particle swarm used in this study is provided below:

    1. Initialize positions of particles to a random distribution within the search space.

    2. Randomize the particle velocities and orientations.


3. Loop until converged:

(a) Evaluate the objective function at the current locations of all particles.

(b) Update each particle's personal best.

(c) Update the global best.

(d) Update the particle velocities according to Eq. (3.1).

(e) Update the particle positions.

(f) End Loop.

The reader interested in particle swarm optimization is directed to reference [20].
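As an illustration, Eq. (3.1) and the loop above can be sketched as follows. The swarm size and the values of w, c1 and c2 are illustrative choices for this sketch, not the parameters used in this study.

```python
import numpy as np

def pso(f, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm minimizer using the velocity rule of Eq. (3.1)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds[0], float), np.asarray(bounds[1], float)
    dim = lo.size
    x = rng.uniform(lo, hi, (n_particles, dim))                # positions (chi)
    v = rng.uniform(-(hi - lo), hi - lo, (n_particles, dim))   # velocities (mu)
    pbest = x.copy()                                           # personal bests (rho_i)
    pbest_f = np.array([f(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                         # global best (rho_g)
    g_f = pbest_f.min()

    for _ in range(iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Momentum + cognitive + social terms, with delta-tau set to unity
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved] = x[improved]
        pbest_f[improved] = fx[improved]
        if pbest_f.min() < g_f:
            g_f = pbest_f.min()
            g = pbest[pbest_f.argmin()].copy()
    return g, g_f
```

Because the update is stochastic, repeated runs with different seeds give slightly different results; the swarm nonetheless concentrates around the global best on simple test functions.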

    3.2 Game Theory

    Game theory is the study of all forms of competition where opponents, often with conflict-

    ing aims, execute interdependent strategies to achieve outcomes that maximize respective

    payoffs. One of the fundamental objectives of game theorists is to derive optimal player

    strategies, which are the strategies that if executed, will result in the highest payoff pos-

    sible when compared to the result of all other strategies. No general method for finding

    these optimal strategies exist for all classes of games; however, solutions for select game

    types do exist. In this particular study, focus is set on the game of pursuit and evasion.

    3.2.1 The Classic Pursuit and Evasion

The classic pursuit and evasion (PE) problem has two players: a single pursuing agent (P) and a single evading target (E). P typically has a speed that is greater than that of E;

    however, E has the advantage that it is more maneuverable. One macabre yet often cited

    visualization tool is the scenario of the homicidal chauffeur. In this example, P is a driver

    of a car with the malicious intent of running down the pedestrian E. Consistent with the


limitations of both P and E, the pedestrian is more maneuverable than the driver (e.g., parallel parking on two legs is much easier than doing the same on four wheels) and the

    driver can reach much greater speeds than the hapless pedestrian. The abstracted model

is typically a unicycle for P, where a minimum turning radius and fixed velocity apply, and a kinematic point for E, where E can instantaneously change its orientation but is still

    restricted to a fixed velocity.
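As a sketch of these abstracted models, the discrete-time updates below implement a fixed-speed unicycle with a minimum turning radius for P and a fixed-speed kinematic point for E. The function names, speeds, turning radius, and time step are all illustrative assumptions, not values from this thesis.

```python
import math

def step_pursuer(x, y, theta, u, v_p=1.0, R_min=0.5, dt=0.01):
    """Unicycle pursuer: fixed speed v_p, steering input u in [-1, 1],
    minimum turning radius R_min (maximum turn rate v_p / R_min)."""
    u = max(-1.0, min(1.0, u))            # saturate the steering input
    theta += (v_p / R_min) * u * dt       # heading changes at a bounded rate
    return (x + v_p * math.cos(theta) * dt,
            y + v_p * math.sin(theta) * dt,
            theta)

def step_evader(x, y, phi, v_e=0.5, dt=0.01):
    """Kinematic-point evader: heading phi can be chosen freely each step
    (instantaneous turns), but speed is fixed at v_e < v_p."""
    return (x + v_e * math.cos(phi) * dt,
            y + v_e * math.sin(phi) * dt)
```

The asymmetry is visible in the interfaces: the pursuer can only influence its heading rate through u, while the evader selects its heading phi directly at every step.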

    The game can be classified as a zero-sum game of degree where the payoff is that of

    a continuum. More specifically, the capture time is the payoff to E while the negative

    of the time to capture is the payoff for P. Since both players are assumed to be rational

    and would therefore always choose to maximize payoffs, E would choose its inputs to

    maximize time to capture while P would attempt to minimize capture time.

    3.2.2 Isaacs’ Pursuit and Evasion Problem and Solution

    For a detailed derivation of the optimal strategies for the classic PE game, refer to Isaacs’

    text on differential games[11]. Only the results are provided in this section.

    Isaacs demonstrated through what he termed the main equation and the integration

of his retrograde path equations, that the optimal controls for both P and E depend on the state of the game, that is, where E is at any instant relative to P. This therefore

defines a feedback control law which ensures that, if executed by both P and E, the time to capture will be the minimum achievable by P and the maximum achievable by E. Furthermore, if the game is plotted in the reference frame of P, with P

    located at the origin and the forward direction of P aligned with the y coordinate axis,

    then specific geometric regions can be plotted which indicate a specific control to be

    executed depending on what region the location of E falls within. A plot of the game

    in a P-centred reference frame is provided below along with an overlay of the different

    regions. In this plot, the position of the evader relative to the pursuer defines the state

    of the game.


Figure 3.3: Isaacs' SPSE regions in the pursuer-centred frame: region 1, region 2, region 3, and the capture region.

In the above figure there are four distinct regions: region 1, region 2, region 3, and the capture region. The capture region is bounded by the terminal surface which, if touched by the evader, ends the game as a successful capture. The three other regions have

    associated optimal controls for both P and E. The regions and the controls are discussed

    below. Plots of sample trajectories using the optimal strategies in the P-centred reference

    frame and the inertial frame follow.

1. Region 1- Primary Region: This region is characterized by an E that is sufficiently close to P and just slightly off course, such that a quick swerve by P will result in a capture. The optimal controls in this case are:

    • Evader: move directly away from pursuer at all times.

    • Pursuer: make sharp turn into evader.

    2. Region 2- Universal Region: Chase situations that fall into this region are

    characterized by either an E that is directly ahead of P or an E that is sufficiently


Figure 3.4: Region 1, the primary path, shown in (a) the pursuer reference frame and (b) the inertial reference frame.

    far enough away from P that P has enough time to make a complete turn followed

    by a straight run for E. The optimal controls in this case are:

    • Evader: Move straight, tangential to P’s initial curvature circle.

    • Pursuer: Turn sharply until pointed at E then head straight.

Figure 3.5: Region 2, the universal path, shown in (a) the pursuer reference frame and (b) the inertial reference frame.

3. Region 3: This region is associated with chase scenarios where E is close to P, but P is oriented in such a way that an immediate swerve towards E would result in a miss. In this scenario, P must therefore first turn away from E to gain

    some space before making a quick turn around for the final kill pass. The optimal


    controls in this case are:

    • Evader: move towards P, tangential to P’s initial curvature circle, until game

    state is in Universal region.

    • Pursuer: Turn sharply away from E, until game state is in Universal region.

Figure 3.6: Region 3, shown in (a) the pursuer reference frame and (b) the inertial reference frame.

    3.3 Summary

    In this chapter, brief descriptions have been provided of select algorithms and background

    information essential to the methods developed in this thesis. These algorithms will be

called upon in subsequent chapters. Two optimization methods have been discussed: the Nelder-Mead simplex algorithm (used in the cooperative UAV search algorithm) and particle swarm optimization (used in the cooperative intercept algorithm). Isaacs' analytical solution to the single pursuer / single evader game has also

    been provided which is called upon in the discussion on cooperative intercept.

Chapter 4

Cooperative UAV Search

    As mentioned in the introduction, there are two distinct phases to the central UAV

    cooperative task. The first phase is characterized by the goal of identifying the target

    location within the game space. Only then can the secondary interception phase be

    initiated after the position of the target is positively ascertained. In this section, the

    former phase is discussed beginning with how uncertainty in the target’s position is

    modelled. A diffusion model for uncertainty is used to account for the target’s ability to

    move. The task of searching is realized through the reduction of uncertainty regions. This

is done by UAV sensor sweeps through the area. Three different UAV control schemes for a single UAV (potential, receding horizon, and a hybrid between the two) are presented and compared. This is followed by a discussion of the cooperation algorithm used and how the hybrid method is extended to multi-UAV teams. Finally, search simulations using the hybrid cooperative method are presented at the end of the chapter.

    4.1 Problem Formulation

    The search problem consists of a team of n UAVs assigned the common goal of finding

    the single target in minimum time. Only a single target exists, the position of which is

    not known to the search UAVs with absolute certainty. Available to the search UAVs,



however, is knowledge of a general region or regions in the game space where the target cannot possibly be. This knowledge is assumed to be known prior to the start of the game based on information provided by the mission planner. The target can, however, be hidden within all other regions, which need to be scouted by the UAV team if the target is to be found. The UAV sensors are assumed to have limited range and, as such, some UAV maneuvering is likely required to find the target.

    4.1.1 Grid World Representation

The game space is a 2-D plane that the agents move within. A grid of N×M square elements is overlaid upon the game space, discretizing it in both the horizontal and vertical directions. Each agent can occupy only one grid element at any time. Note that the position of each agent can still take on continuous values; the grid map is simply a discretized approximation to the continuous game. Both the continuous positions and the discretized surrogate representations are updated during the course of the search (see Figure 4.1).
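The continuous-to-grid mapping can be sketched as a simple floor operation. Unit-sized cells match the example coordinates of Figure 4.1; the cell_size parameter is an added generalization for this sketch.

```python
import math

def to_cell(x, y, cell_size=1.0):
    """Map a continuous position (x, y) to its discrete (j, k) grid cell index."""
    return (int(math.floor(x / cell_size)),
            int(math.floor(y / cell_size)))
```

With unit cells, a target at (3.77, 0.25) occupies cell (3, 0), as in the figure.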

Figure 4.1: Grid approximation: (a) continuous agent position representation (e.g., target at (3.77, 0.25)); (b) grid agent position representation (e.g., target cell (3, 0)).


    4.1.2 Target Uncertainty

At the outset of the search task, the target's exact location is not known with absolute certainty to the UAV team; however, some information about the general whereabouts of the target is available for the UAV team to base its trajectory planning upon. For example, knowledge of regions where the target cannot be, obtained through mission planner intuition, is admissible and of utility to the UAV team. These regions are referred to as certain regions, since it is known with certainty that the target's position cannot coincide with any point within them. Since the target does not exist within certain regions, it follows that it can be found somewhere within the regions that are not certain. These areas are designated uncertain regions. Any region that is not certain is deemed uncertain and therefore includes all points where the target may be at a given instant in time.

    Uncertainty Representation

    The spatial uncertainty environment is represented by the grid-based representation of

the 2-D game. Each of the N×M square elements is assigned the discrete binary state of either certain (value = 0) or uncertain (value = 1). In other words,

U^t = {u^t_{j,k}},    u^t_{j,k} = 0 if grid cell (j, k) is certain, 1 otherwise

where U^t is the uncertainty map of size N × M representing the UAV team's knowledge of the target's position at the t-th time step, j is the index for the grid elements in the x direction, and k is the index in the y direction.

    Other Possible Uncertainty Representations

    Although in this study, uncertainty is constrained to one of two binary states, a continuum

    between the two extremes can be implemented to account for partial certainty of grid


Figure 4.2: Uncertainty represented by a gridded map. Uncertain regions (U = 1) are blocks the target could possibly be in; certain regions (U = 0) are blocks the target could not possibly be in.

spaces. Such intermediate values would be useful for modelling imperfect sensors, where some regions of the UAV sensor footprint are less reliable than others. Less-than-perfect reliability of a particular sensor region would then correspond to reduced uncertainty, but not complete certainty; this partial certainty would be represented by an uncertainty value that lies somewhere between 0 and 1. For this study, perfect sensors are assumed, hence partial certainty is not considered.

    Diffusion Model for a Moving Target

    For static target problems, one can make the assumption that changes to the uncertainty

    map will only result from the UAVs scanning the uncertain regions. The static case

    is therefore a problem of steady or decreasing uncertainty where elements that were

    initially certain remain certain, while those that are uncertain can switch to certain only

    as a result of the passing of a searching UAV. However, in the case of a moving target,

    the steady or decreasing uncertainty assumption is no longer valid. Targets can move

    and therefore can transition into neighbouring grid cells. Uncertainty therefore has the


    ability to grow in the moving target case and must be taken into account by a mechanism

    which can evolve the uncertainty map with time.

    To take the target’s ability to move into account, a model based upon two dimensional

    diffusion is adopted to evolve the uncertainty boundaries over time. A variation based

on the work done in [23] is adopted here. The motivation is that diffusion propagates the uncertainty in all directions equally. This is desirable since no information is given on the behaviour of the target, and therefore all target moves must be considered equally probable. The worst-case scenario must thus be assumed, and this should be reflected in the evolution of the uncertainty boundary. The basic 2-D diffusion equation is as follows:

∂u/∂t = c (∂²u/∂x² + ∂²u/∂y²)        (4.1)

    where u is the uncertainty in the grid cell, t is the time, and c is the diffusion conductivity

    constant. As it stands, the above equation is not useful when applied to a discretized

    plane such as the already adopted grid world representation of the game space. Instead,

    the 2-D finite element diffusion model is used.

u^{t+1}_{j,k} = u^t_{j,k} + c∆t [ (u^t_{j+1,k} − 2u^t_{j,k} + u^t_{j−1,k})/∆x² + (u^t_{j,k+1} − 2u^t_{j,k} + u^t_{j,k−1})/∆y² ]        (4.2)

In this equation, the time derivative has been approximated with a forward finite difference and the spatial partial derivatives with central finite differences. The conductivity constant, c, and the uncertainty, u, retain their meanings. The superscript t represents the current time step, while j and k are the indices for the grid elements in the x and y directions. ∆x and ∆y are the step sizes in the x and y directions.
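A single explicit update per Eq. (4.2) can be sketched as follows. For simplicity, this illustrative version updates interior cells only and leaves the boundary cells fixed; the default values of c, ∆t, ∆x, and ∆y are assumptions for the sketch.

```python
import numpy as np

def diffuse_step(U, c=0.2, dt=1.0, dx=1.0, dy=1.0):
    """One explicit step of Eq. (4.2) on interior cells (boundary left fixed)."""
    V = U.copy()
    # Central second differences in x (rows) and y (columns)
    lap = ((U[2:, 1:-1] - 2 * U[1:-1, 1:-1] + U[:-2, 1:-1]) / dx**2 +
           (U[1:-1, 2:] - 2 * U[1:-1, 1:-1] + U[1:-1, :-2]) / dy**2)
    V[1:-1, 1:-1] = U[1:-1, 1:-1] + c * dt * lap
    return V
```

Because the unmodified scheme is conservative, the total uncertainty over a closed map is preserved as a single concentrated cell spreads to its neighbours.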

This is one of the simplest forms of diffusion for 2-D finite element models. A simulation of the evolution of the uncertainty using the above model is shown in Figure 4.4. The diagrams demonstrate the movement of uncertainty from grid cells of high uncertainty concentration to regions of low uncertainty concentration over time.

    Yet this is still not quite the desired behavior of the uncertainty evolution. Firstly, as


Figure 4.3: Moving-target implication of time-varying uncertainty. (a) Given that the target position is known, it can either transition to a neighbouring cell or remain stationary; if a transition occurs, the new cell's uncertainty level must increase to 1. (b) The behaviour of the target is not known, so transitions to all neighbouring cells must be considered. (c) To take moving targets into account, the boundary of uncertainty must grow at a rate at least as fast as the target's maximum velocity.


Figure 4.4: Finite element diffusion process: (a) initial distribution; (b) after 100 steps; (c) after 150 steps.

was mentioned in the previous section, every cell can be in only one of two states: either the cell is certain or it is uncertain, with no intermediate values. This binary system

    has not yet been taken into account in the 2-D diffusion equation. Secondly, at the basis

    of the diffusion equation is the law of conservation. In standard diffusion applications,

    the total amount of the parameter in question, (whether it be energy or mass), remains

    fixed provided the system is closed and has no sinks or sources. A consequence of this

    is that cells of high uncertainty will experience an unwarranted reduction with time as

    uncertainty flows out of the cell to neighboring cells of lesser uncertainty. This is not the

    desired behavior since cells that are uncertain should remain uncertain until scanned by

    a UAV. The maneuvering target within a cell can potentially choose to remain stationary

    and this possibility must be reflected in how the uncertainty evolves as well. The third

    and final consequence of using an unmodified diffusion equation is that the rate at which

    the uncertainty expands has a tendency to slow down in the latter stages of uncertainty

    evolution as the spatial gradients of uncertainty become small. Again, this is not the

    desired behavior, as one would generally require the uncertainty boundary to grow at a

    constant rate.

    To address these concerns, some modifications to the diffusion equation are in order.

    The first concern is addressed by maintaining two separate maps, U and U’. The first

    map, U, is described as above and allows for cells that can take on intermediate values


between 1 (uncertain) and 0 (certain). It is on this map that the uncertainty is evolved using the diffusion algorithm. Changes to grid uncertainty values due to search UAV movement are also taken into account on this map. The second map, U', is a filtered version of its cousin, where each cell of map U is compared to a defined threshold value. If a given cell on map U has a value at or above this threshold, the corresponding cell on map U' is assigned the value of 1; that is, if a cell is sufficiently uncertain, the algorithm considers that cell to be completely uncertain, and this is reflected in map U'. If, on the other hand, the value on map U is below the threshold value, the corresponding cell on map U' is assigned the value of 0. Both maps are maintained and updated

    for the duration of the simulation. Map U manages the evolution of the uncertainty with

    time, while map U’ is simply a binary filtered version of U which is used by the UAV

    search team to develop their search trajectories.

Figure 4.5: Diffusion modification 1: maintaining two maps. (a) Map U: intermediate values allowed. (b) Map U': binary filtered version of U with a threshold value of 0.1.

The second concern, that uncertain cells should remain uncertain unless scanned by passing search UAVs, is addressed with the following fix: cells within map U do not experience a reduction in uncertainty due to diffusion. The flux of uncertainty is strictly limited to net inflows; outflows for all cells are ignored. The updated equation for finite element uncertainty diffusion is therefore


u^{t+1}_{j,k} = u^t_{j,k} + c∆t max[ 0, (u^t_{j+1,k} − 2u^t_{j,k} + u^t_{j−1,k})/∆x² + (u^t_{j,k+1} − 2u^t_{j,k} + u^t_{j,k−1})/∆y² ]        (4.3)

    The third concern of having non-constant uncertainty growth rates can be addressed

    by adding uncertainty sources to the map and then applying a saturation filter to all cells

    to limit uncertainty values to a maximum of 1. In this implementation, each cell above

    the threshold value is increased by a fixed percentage of its original uncertainty value.

    In effect, every cell with a value above the threshold acts as an uncertainty source. This

    ensures that the gradients near the border of the uncertainty remain sufficiently high

    which translates into a constant uncertainty boundary growth rate that does not exhibit

    the undesirable slowing effect.

Figure 4.6: Diffusion modification 3: uncertainty source for regular boundary growth. (a) Initial distribution. (b) Every cell above the threshold (value of 0.1) is boosted by a factor of 1.2 (exaggerated for visual purposes). (c) Filtered such that values fall between 0 and 1.

    The handling of uncertainty is therefore summed up in the following steps:

    For time step t:

    1. Set current U to map U from time step t-1

    2. For each cell in U perform the diffusion update

    3. For each cell in U greater than threshold value, boost uncertainty value by a certain

    percentage


    4. Limit each cell in U to a maximum of 1 and a minimum of 0

    5. Update U’ with new U by performing cell wise threshold saturation

    6. UAVs develop search trajectories with map U’
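The steps above can be sketched as one update function. This is an illustrative NumPy sketch; the threshold, boost factor, and diffusion parameters are assumed values, not those used in the thesis.

```python
import numpy as np

def update_uncertainty(u_prev, c=1.0, dt=0.1, dx=1.0, dy=1.0,
                       threshold=0.1, boost=1.2):
    """One full uncertainty-update cycle: diffuse, boost sources, clip,
    then saturate to the binary map U' used by the UAVs (steps 1-6)."""
    u = u_prev.copy()                        # step 1: start from previous map
    lap = np.zeros_like(u)                   # step 2: inflow-only diffusion (Eq. 4.3)
    lap[1:-1, 1:-1] = (
        (u[2:, 1:-1] - 2 * u[1:-1, 1:-1] + u[:-2, 1:-1]) / dx**2
        + (u[1:-1, 2:] - 2 * u[1:-1, 1:-1] + u[1:-1, :-2]) / dy**2
    )
    u += c * dt * np.maximum(0.0, lap)
    u[u > threshold] *= boost                # step 3: super-threshold cells act as sources
    u = np.clip(u, 0.0, 1.0)                 # step 4: keep values in [0, 1]
    u_search = (u > threshold).astype(float) # step 5: cell-wise threshold saturation -> U'
    return u, u_search                       # step 6: UAVs plan with u_search
```

Iterating this function produces the radially growing, near-binary uncertainty distribution shown in the contour plots that follow.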

The figures below depict the new uncertainty diffusion algorithm at different time steps. With the modified diffusion, the uncertainty distribution now grows radially at a rate that approximates the maximum velocity of the target. Intermediate values are no longer present; cells take on only the values 1 (uncertain) and 0 (certain).

    4.2 Target Behaviour

For the search phase, no particular target motion is assumed. All that is known to the UAV search team is an upper bound on the velocity of the target. This information is incorporated into the UAV uncertainty update algorithm; more specifically, it guides the selection of the conductivity constant for the diffusion method. High values of conductivity correspond to faster-growing uncertainty, consistent with a faster-moving target. Conversely, low values of conductivity correspond to slower uncertainty growth, consistent with a slow-moving target.

For testing purposes, target motion is constrained to straight line segments. The target is first given a random initial orientation. It then proceeds in a straight line until it reaches the boundary of the game space, at which point it reflects off the boundary with an angle of reflection equal to the angle of incidence. Another candidate target behaviour for the search phase is random motion, where the target selects a new orientation vector at every time instant.
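The straight-line target model with specular boundary reflection can be sketched as follows; the function name and interface are illustrative assumptions.

```python
import math

def step_target(x, y, heading, speed, dt, xmax, ymax):
    """Advance a straight-line target one time step, reflecting specularly
    off the walls of the rectangular game space [0, xmax] x [0, ymax]."""
    x += speed * dt * math.cos(heading)
    y += speed * dt * math.sin(heading)
    if x < 0 or x > xmax:                  # reflect off a vertical wall
        heading = math.pi - heading
        x = min(max(x, 0.0), xmax)
    if y < 0 or y > ymax:                  # reflect off a horizontal wall
        heading = -heading
        y = min(max(y, 0.0), ymax)
    return x, y, heading % (2 * math.pi)
```

A random-motion target would instead draw a fresh heading each step before the position update.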


Figure 4.7: Contour plots of the modified diffusion model for uncertainty management at different times: (a) initial uncertainty distribution; (b) after 100 s; (c) after 200 s; (d) after 300 s; (e) after 400 s; (f) after 500 s. The green circle represents the uncertainty boundary.


    4.3 Search Behaviour

At this point, only the management of uncertainty for the UAV search has been defined. It remains to explain how to act upon the uncertainty to find the target in some optimal or approximately optimal fashion. In this section, three candidate search algorithms are explored: the potential method, the receding horizon method, and a hybrid method that combines the two.

    4.3.1 Potential Method

The potential method is widely used in control applications, including autonomous UAV guidance laws. The general idea is that certain points of interest in the game, whether they are enemy locations, uncertainty cells, obstacles, or any other body of interest, serve as virtual masses that induce a virtual potential field within the game space. The calculated gradients of this potential field are typically used by the UAVs as a basis for their control laws. Depending on the UAV task, different variations of the potential method can be used. Although the underlying principles of the method remain constant, how the potential field is generated and how the agents act upon it varies considerably between designers and missions. The subsequent paragraphs describe the particulars of the potential method used by the agents in the team UAV search.

    Potential field generation

Each uncertain cell generates a potential field whose magnitude decays with distance, much like a natural gravitational potential. Cells that are certain do not contribute to the potential field. The potential due to an uncertain cell at a distance r from the cell is:

p_{j,k} = -\frac{u_{j,k}}{r} \qquad (4.4)


    The potential field at any position is a sum of all contributions from all uncertain

    cells.

P = \sum_{j=1}^{M} \sum_{k=1}^{N} p_{j,k} \qquad (4.5)

Figure 4.8: Example of a virtual potential field generated from an uncertainty distribution. (a) Uncertainty distribution with uncertainty cells located at the centre of the space; (b) resulting virtual potential field.

    Potential field control law

Each UAV calculates the gradient of the potential field at its current position. The gradient is normalized to unity and its direction is compared to the UAV's current orientation vector. The control law simply minimizes the angle between the two vectors: if the gradient is to the left of the UAV's orientation vector, the UAV turns left; if the gradient is to the right, the UAV turns right.

The following is pseudo-code for each UAV performing potential-based search.

    1. Calculate potential field contribution for every uncertain cell

    2. Sum up all contributions at UAV position

    3. Calculate the gradient of the potential field


Figure 4.9: UAV located in a potential field. In this particular case, the UAV control law would dictate a left turn to align the orientation vector with the gradient.


    4. Normalize gradient to unity

    5. Calculate difference between gradient angle and current orientation angle

    6. Set control to difference.

7. Update the time step and go to step 1
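The steps above can be sketched as follows. This is an illustrative sketch, not the thesis implementation: the finite-difference step h, the turn-rate bound, and the convention of steering down the field toward the negative potential wells are assumptions.

```python
import math

def potential(pos, cells, eps=1e-6):
    """Total virtual potential at `pos` (Eqs. 4.4-4.5): each uncertain
    cell at (cx, cy) with value u contributes -u / r."""
    return sum(-u / max(math.hypot(pos[0] - cx, pos[1] - cy), eps)
               for (cx, cy), u in cells.items())

def potential_turn(pos, heading, cells, h=0.1, max_turn=0.2):
    """Steps 1-6: numerical gradient of the summed field, then a bounded
    turn toward the attracting direction (downhill, since uncertain
    cells form negative potential wells)."""
    gx = (potential((pos[0] + h, pos[1]), cells)
          - potential((pos[0] - h, pos[1]), cells)) / (2 * h)
    gy = (potential((pos[0], pos[1] + h), cells)
          - potential((pos[0], pos[1] - h), cells)) / (2 * h)
    desired = math.atan2(-gy, -gx)                 # steps 3-4: descent direction
    err = (desired - heading + math.pi) % (2 * math.pi) - math.pi  # step 5: wrapped angle
    return max(-max_turn, min(max_turn, err))      # step 6: bounded turn command
```

A UAV left of a single uncertain cell and headed slightly off-axis receives a small corrective turn; a UAV headed perpendicular to the cell receives the maximum turn command.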

    Potential Method Deficiencies

One major deficiency of the potential method is its poor performance in symmetric uncertainty distributions. The potential method directs the UAV to set a course towards the centroid of the total uncertainty distribution even when the centroid is devoid of uncertain cells. One case that illustrates this downfall well is a doughnut-shaped uncertainty distribution with the UAV located at the centre of the distribution. In this case, the control law continues to direct the UAV to re-search the centre of the distribution, where no uncertain cells are present. The UAV keeps searching the inner boundary of the distribution while neglecting the growing outer boundary, which contributes more to the increasing uncertainty.

4.3.2 Receding Horizon Method

Receding horizon methods are typically applied to optimization problems that have an unspecified end time, or to continuous problems with no end time defined. In these cases, optimization over the entire time duration is often too computationally expensive or, in the continuous case, impossible. The general principle of the receding horizon method is to select a truncated look-ahead time interval over which the optimization can realistically be conducted. The performance parameter is optimized over this shortened time interval (or horizon) with respect to the controls available to the optimizers. A set of sequential controls is thereby obtained that optimizes the performance parameter


at the end of the look-ahead horizon. The first of these controls, which corresponds to the current time step, is executed immediately. At the next time step, the entire optimization is repeated with the same look-ahead horizon length, shifted one step further into the future. This defines the standard receding horizon method: continuously optimize over a finite look-ahead horizon and allow that horizon to recede into the future as time progresses.

For a single UAV in the search problem, the performance measure is the total number of uncertain cells within the game space. The receding horizon is set to a predefined number of time steps into the future, each step with a corresponding turning control. These turning controls make up the search space for the optimizer, which re-optimizes the horizon controls at each time step.

    Optimizer

The optimizer used is the Nelder-Mead simplex algorithm, motivated by its non-reliance on gradient calculations and its computational efficiency when compared to alternative non-gradient-based optimization methods, including particle swarm and genetic algorithms.

The following is pseudo-code for each UAV performing receding-horizon-based search.

1. Define n = number of look-ahead time steps

2. Use the optimizer to find the next n best control inputs

3. Execute the first of the n control inputs

4. Update the time step and go to step 2
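The receding horizon loop can be sketched as follows. The thesis optimizes continuous turning controls with the Nelder-Mead simplex; to keep this sketch dependency-free, it instead enumerates a discretized control set (left, straight, right), and all names and parameters are assumptions.

```python
import itertools
import math

def simulate(pos, heading, controls, speed=1.0, dt=1.0, turn_rate=0.3):
    """Roll a candidate control sequence forward; return the grid cells visited."""
    x, y = pos
    visited = set()
    for u in controls:                       # u in {-1, 0, +1}: left / straight / right
        heading += u * turn_rate * dt
        x += speed * dt * math.cos(heading)
        y += speed * dt * math.sin(heading)
        visited.add((round(x), round(y)))
    return visited

def receding_horizon_control(pos, heading, uncertain, n=4):
    """Steps 1-3: search all 3^n control sequences for the one whose
    trajectory scans the most uncertain cells; only the first control
    of the winning sequence is executed."""
    best_seq, best_score = None, -1
    for seq in itertools.product((-1, 0, 1), repeat=n):
        score = len(simulate(pos, heading, seq) & uncertain)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq[0], best_score
```

A zero best score signals that every candidate trajectory is equally poor, the failure case exploited by the hybrid method below.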


Figure 4.10: Example of the receding horizon method. Three time steps are shown with a horizon length of 5 time steps. At each time step, the UAV re-optimizes to find the best combination of controls (turn left, straight, or turn right) for the next 5 steps, and the control corresponding to the current time step is executed.


    Receding Horizon Method Deficiencies

A common problem with receding horizon methods is choosing the length of the look-ahead horizon. If the horizon is too long, the algorithm becomes excessively expensive computationally. If, on the other hand, the horizon is too short, UAV search times increase, compromising the purpose of any future planning. One rule could be to make the horizon as long as possible within the limits of the computational power available to the user; however, even then, the horizon length may not be sufficient to make a proper decision. Consider the case where a receding-horizon-controlled UAV is located sufficiently far from any uncertainty cell that, regardless of its look-ahead strategy, it cannot scan any uncertain cells. The optimizer then finds that no optimum look-ahead strategy exists, since all candidate trajectories are equally poor. In these scenarios, UAVs equipped with receding horizon control are rendered useless.

    4.3.3 Hybrid Method

The hybrid method is a combination of the potential and receding horizon methods, switching between the two search modes depending on the UAV's position relative to uncertain grid cells in the game space. The default mode for the hybrid method is the receding horizon method as described above. If, however, the UAV finds itself in a state where it is unable to find a set of controls that is better than any other, it switches to the potential method. This switch from receding horizon to potential corresponds to the case where a UAV is sufficiently far from any uncertainty cell that, regardless of the controls it selects over the receding horizon look-ahead steps, it cannot reach any uncertainty cell within the look-ahead time interval. In this scenario, all look-ahead trajectories are equally poor and there is no reason to select one set of controls over another, and hence,


the potential method would be executed until uncertainty cells come within reach of the UAV's look-ahead.

The following is pseudo-code for each UAV performing hybrid search.

    1. Define n=number of look ahead time steps

    2. Use optimizer to find the next n best control inputs

    3. If optimizer has found a good trajectory

    (a) Execute the first of the n control inputs

    4. If optimizer has not found a good trajectory

    (a) Calculate potential field contribution for every uncertain cell

    (b) Sum up all contributions at UAV position

    (c) Calculate the gradient of the potential field

    (d) Normalize gradient to unity

    (e) Calculate difference between gradient angle and current orientation angle

    (f) Set control to difference.

    5. Update time step and go to step 2
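The switching logic above can be sketched as a single decision step. The two planners are passed in as callables; the interface, in which the receding horizon planner returns a control together with the number of uncertain cells its best trajectory scans, is an assumption for illustration.

```python
def hybrid_control(rh_control, potential_control, state, uncertain):
    """One hybrid decision step: try the receding horizon planner first;
    if it cannot find a trajectory that scans any uncertainty (all
    candidates equally poor), fall back to potential-field guidance."""
    control, score = rh_control(state, uncertain)            # step 2
    if score > 0:                                            # step 3: useful trajectory found
        return control, "receding_horizon"
    return potential_control(state, uncertain), "potential"  # step 4: fallback
```

The mode label is returned only to make the switch observable; in a simulation loop the control alone would be executed before advancing the time step.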

The following plot compares the effectiveness of the hybrid method against the potential method alone, especially at high initial separation distances. Search simulations were run in a 25 × 25 grid game space, although the search UAV was permitted to start outside this region. Initial conditions such as UAV and target position and orientation were randomized, and a maximum search time of 1500 time steps was specified. A single UAV was used to locate the single target. Search times for the simulations are plotted against the initial separation of the UAV and the target.


Figure 4.11: Hybrid search method. (a) Initial sample game state with two uncertainty regions and a single UAV, showing the states reachable within the look-ahead interval; (b) a candidate receding horizon trajectory is found and executed; (c) after the first uncertainty region is scanned, the UAV attempts to use the receding horizon but cannot find any optimal or approximately optimal trajectory, since the remaining uncertainty is not contained within the receding horizon boundaries; (d) since the receding horizon failed to yield a trajectory, the potential method is used for guidance.


Figure 4.12: Comparison of the hybrid method and the potential method in simulated searches of varying initial separation between target and search UAVs. 2 UAVs used with a maximum time-out of 1500 time steps.


Figure 4.12 demonstrates the advantage of the hybrid method over exclusive potential method control. At low initial separations there is no appreciable difference between the two; however, as the initial separation between the target and the search UAV increases, there are significantly more time-outs for the potential method; that is, search UAVs using the potential method fail to find the target before the maximum allowed time. The reason, as mentioned in Section 4.3.1, is that a search UAV under potential control tends to head towards the centroid of the uncertainty regardless of whether any uncertainty exists at the centroid. The resulting excess of centroid searching makes the search inefficient. The hybrid method does not suffer from this problem.

    4.3.4 Multi-UAV Coordination Algorithm

    Up until now, no coordination between UAVs has been discussed. All three methods

    (potential, receding horizon, and hybrid) were defined in the limited context of a single

    UAV. In this section coordination between UAVs is introduced with the intention of

    enhancing the performance of the team search for the target.

Coordination is manifested through the way in which the individual UAVs within the team perceive uncertainty cells. Up to this point, all uncertainty cells were considered of equal value to each UAV. For coordination purposes, weighted uncertainty is now proposed. The underlying principle is to make cells closer to a UAV more attractive to scan while making cells closer to the UAV's teammates less attractive. As demonstrated in simulation, this reduces redundant search trajectories in which UAVs search the same area, which ultimately reduces both the time to cover the entire area and the search time to find the moving target.

The weighted uncertainty values are thus altered versions of the standard uncertainty cells: cells closer to the individual UAV are of greater value to scan, while those closer to other UAVs are of less value.


    Both of these objectives can be accomplished by attenuating the uncertainty values as

    follows:

    The standard uncertainty representation for the UAVs is

U^{t}_{N \times M} = \left[ u^{t}_{j,k} \right], \qquad u^{t}_{j,k} = \begin{cases} 0 & \text{if grid cell } (j,k) \text{ is certain} \\ 1 & \text{otherwise} \end{cases} \qquad (4.6)

    The weighted uncertainty representation for the ith UAV is

{}_{i}\bar{U}^{t}_{N \times M} = \left[ {}_{i}\bar{u}^{t}_{j,k} \right], \qquad {}_{i}\bar{u}^{t}_{j,k} = \begin{cases} 0 & \text{if grid cell } (j,k) \text{ is certain} \\ \prod_{l=1}^{n} \left( 1 - \exp\left[ -\dfrac{r_{l}^{2}}{2\sigma^{2}} \right] \right) & \text{otherwise} \end{cases} \qquad (4.7)

where {}_{i}\bar{U}^{t}_{N \times M} is the weighted uncertainty map valid for UAV i at time step t, of size N × M. The quantity {}_{i}\bar{u}^{t}_{j,k} is the weighted uncertainty value of the individual grid element (j, k); n is the total number of UAVs; l is the UAV index; r_l is the distance from the lth UAV to grid element (j, k); and σ is the attenuation factor used to adjust the degree to which the UAVs avoid uncertainty closer to their teammates.

As the equation shows, cells that are closer to other UAVs are attenuated in value. Cells further from teammates, however, retain their initial uncertainty value and are therefore of greater value to be scanned by the UAV in question.
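The weighting of Equation 4.7 can be sketched as follows. One assumption is made explicit here: the product is taken over teammates only (l ≠ i), consistent with the stated intent that cells near the UAV's own position remain attractive; σ, the grid-coordinate positions, and the function name are illustrative.

```python
import numpy as np

def weighted_uncertainty(u, uav_positions, i, sigma=3.0):
    """Weighted uncertainty map for UAV i (Eq. 4.7): uncertain cells near
    teammates are attenuated by prod(1 - exp(-r_l^2 / (2 sigma^2)))."""
    N, M = u.shape
    jj, kk = np.meshgrid(np.arange(N), np.arange(M), indexing="ij")
    w = np.ones_like(u, dtype=float)
    for l, (pj, pk) in enumerate(uav_positions):   # (row, col) grid coordinates
        if l == i:
            continue  # assumption: a UAV does not attenuate cells near itself
        r2 = (jj - pj) ** 2 + (kk - pk) ** 2       # squared distance to UAV l
        w *= 1.0 - np.exp(-r2 / (2.0 * sigma**2))
    return np.where(u > 0, u * w, 0.0)             # certain cells stay zero
```

A cell directly under a teammate is attenuated all the way to zero, while cells many σ away from every teammate keep essentially their full value.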

Sixty random search scenarios were tested in simulation using both cooperative and non-cooperative UAVs. UAV initial conditions, target initial conditions, and the initial uncertainty region were all randomized. The only restrictions imposed were that (1) the game space was limited to a 25 × 25 grid, (2) all agents were contained within the grid, and (3) the target was initially within an uncertain region. The following table summarizes the results:


Figure 4.13: Under the coordination algorithm, uncertainty closer to a teammate is reduced in value to be searched. In the scenario shown, since R(1, 2) > R(2, 2), U2 is reduced in value from UAV 1's perspective. Likewise, since R(2, 1) > R(1, 1), U1 is reduced in value from UAV 2's perspective. As a result, since the UAVs are designed to reduce the greatest total uncertainty value, a type of uncertainty assignment is achieved, with UAV 1 covering U1 and UAV 2 covering U2.

Table 4.1: Comparison of cooperative and non-cooperative search

Type             | Wins | Avg. time to target found (time steps)
-----------------|------|---------------------------------------
Cooperative      | 33   | 108
Non-Cooperative  | 21   | 150


    4.4 Benchmark Comparison

The Zamboni method is an exhaustive search method that can be adapted to autonomous multi-vehicle searching. Few benchmark search algorithms for cooperative searching exist; however, the Zamboni method in particular has been used in the past for algorithm comparison. A full description of the Zamboni method, along with other exhaustive search methods, can be found in the work of Ablavsky et al. [1]. The Zamboni method involves a series of loops, where each individual loop comprises (1) a front-sweep, followed by (2) a 180-degree turn-around, and finally (3) a back-sweep (see Figure 4.14 for a sample search pattern). Each sweep (front and back) is a full transition of the UAV from one side of the game space to the direct opposite side. Each successive loop is advanced slightly further than the last, resulting in a creeping boundary between the unsearched and searched space. The looping continues until the front-sweep of the first loop meets the back-sweep of the last loop. At that point, the UAV advances several units ahead and begins the process again with another series of loops searching new area.
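One simple interpretation of a Zamboni block as a waypoint generator is sketched below: loop i flies a front-sweep half a block ahead of its back-sweep, and looping stops when the back-sweep row meets the first front-sweep row. The exact sweep offsets of the pattern in [1] may differ, and all names and parameters here are assumptions.

```python
def zamboni_block(x_left, x_right, y0, half=5.0, d=1.0):
    """Waypoints for one Zamboni block. Loop i: front-sweep left-to-right
    at y0 + half + i*d, 180-degree turn, back-sweep right-to-left at
    y0 + i*d; stop once the back-sweep reaches the first front-sweep row."""
    waypoints = []
    i = 0
    while y0 + i * d <= y0 + half:   # back-sweep has not yet met first front-sweep
        y_front = y0 + half + i * d
        y_back = y0 + i * d
        waypoints += [(x_left, y_front), (x_right, y_front),   # front-sweep
                      (x_right, y_back), (x_left, y_back)]     # turn + back-sweep
        i += 1
    return waypoints
```

A full search would chain such blocks, advancing y0 by roughly a block height each time so that each series of loops covers new area.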

One hundred simulations were conducted comparing the cooperative hybrid search method to the Zamboni method. Two UAVs were assigned the mission of finding a target in a game space at varying starting distances. Each simulation was first run using cooperative hybrid searching and then re-run using the Zamboni method. The results appear in Figure 4.15 and Table 4.2. The simulations demonstrate that the coordinated hybrid method yields a lower capture time more than twice as often as the Zamboni method, with a 27% reduction in average time to target found.

    4.5 Summary

This chapter discusses the first phase of the UAV mission, which is to locate a single moving target. Initially available to the UAV search team are regions where the target


Figure 4.14: Sample Zamboni search pattern, showing front-sweeps, back-sweeps, and the direction of travel.

Table 4.2: Comparison of cooperative hybrid and Zamboni exhaustive search

Type               | Wins (3 draws) | Avg. time to target found (time steps)
-------------------|----------------|---------------------------------------
Cooperative Hybrid | 67             | 238
Zamboni            | 30             | 324


Figure 4.15: Simulation results after 100 trials comparing time to target found when using the cooperative hybrid versus Zamboni search algorithms.


could possibly be. These regions, called uncertainty regions, are known before the start of the search. The UAV team uses a proposed modified diffusion model to manage the time-evolving nature of the uncertainty regions, which arises from the target's ability to move from one location to another. Individual search algorithms are based on a