mobile robot localisation and mapping in …...mobile robot localisation and mapping in extensive...

Mobile Robot Localisation andMapping in Extensive Outdoor

Environments

Tim Bailey

A thesis submitted in fulfillmentof the requirements for the degree of

Doctor of Philosophy

SID

ERE·MEN

S·EADEM

·MUTATO

Australian Centre for Field RoboticsDepartment of Aerospace, Mechanical and Mechatronic Engineering

The University of Sydney

August 2002

Declaration

This thesis is submitted to the University of Sydney in fulfillment of the requirements forthe degree of Doctor of Philosophy. This thesis is entirely my own work, and except whereotherwise stated, describes my own research.

Tim Bailey

7 August, 2002

Reprinted with corrections and emendations 21 November, 2002

i

AbstractTim Bailey Doctor of PhilosophyThe University of Sydney August 2002

Mobile Robot Localisation andMapping in Extensive Outdoor

Environments

This thesis addresses the issues of scale for practical implementations of simultaneous local-isation and mapping (SLAM) in extensive outdoor environments. Building an incrementalmap while also using it for localisation is of prime importance for mobile robot navigationbut, until recently, has been confined to small-scale, mostly indoor, environments. Thecritical problems for large-scale implementations are as follows. First, data association—finding correspondences between map landmarks and robot sensor measurements—becomesdifficult in complex, cluttered environments, especially if the robot location is uncertain.Second, the information required to maintain a consistent map using traditional methodsimposes a prohibitive computational burden as the map increases in size. And third, themathematics for SLAM relies on assumptions of small errors and near-linearity, and thesebecome invalid for larger maps.

In outdoor environments, the problems of scale are exacerbated by complex structureand rugged terrain. This can impede the detection of stable discrete landmarks, and candegrade the utility of motion estimates derived from wheel-encoder odometry.

This thesis presents the following contributions for large-scale SLAM. First, a batch dataassociation method called combined constraint data association (CCDA) is developed, whichpermits robust association in cluttered environments even if the robot pose is completelyunknown. Second, an alternative to feature-based data association is presented, based oncorrelation of unprocessed sensor data with the map, for environments that don’t containeasily detectable discrete landmarks. Third, methods for feature management are presentedto control the addition and removal of map landmarks, which facilitates map reliability andreduces computation. Fourth, a new map framework called network coupled feature maps(NCFM) is introduced, where the world is divided into a graph of connected submaps.This map framework is shown to solve the problems of consistency and tractability for verylarge-scale SLAM.

The theoretical contributions of this thesis are demonstrated with a series of practicalimplementations using a scanning range laser in three different outdoor environments. Theseinclude: sensor-based dead reckoning, which is a highly accurate alternative to odometryfor rough terrain; correlation-based localisation using particle filter methods; and NCFMSLAM over a region greater than 50000 square metres, and including trajectories with largeloops.

ii

Acknowledgements

This work has benefited greatly from the input and support of many people over the pastfive years, and my gratitude at its completion is extended especially to those mentionedbelow.

To begin, I would like to thank my supervisors. To Eduardo Nebot, thanks for youravailability, expertise and advice, motivation, and encouragement to move forward. To JulioRosenblatt, thanks for your alternative perspective, in particular for introducing me to theconcept of topological maps. Also, Hugh Durrant-Whyte, while not an offical supervisor,deserves special thanks for his reviews of this thesis, helpful comments and wise counsel.

Thank you to Sebastian Thrun for his review of this thesis, for his suggestions andencouragement. His advice is greatly appreciated.

The team at the ACFR have made my time enjoyable and rewarding, and have oftenprovided a welcome respite from research. In particular, I must thank Monica Louda andMike Stevens for their valued friendship. To Monica, thanks for being such a long-sufferingneighbour and for all those cups of coffee. To Mike, thanks for your intelligence, humilityand humour—you are an inspiration. I am indebted to Stefan Williams both for his excel-lent thesis and for many stimulating conversations, thank you. To Jose Guivant and SomMajumder, thanks for the willingness with which you shared your knowledge and research.Thanks also to Jose and the ute team for the park data set, and to Eric Nettleton forthe mine data set. And thanks to “Dissa” Dissanayake for his interest and persistence inasking me “have you finished yet.” To the remaining members of the ACFR, notably SteveScheding, Keith Willis, Ben Grocholsky, and Ralph Koch, thanks for your company and formany entertaining lunchtime discussions.

My most valuable support by far has come from my lovely wife Angela. Thank you,Ang, for your constancy in love, support and understanding. To Mum and Dad, thank youfor a lifetime of love and care, for your prays, and for financing my life and education allthose years.

Finally, I thank my Lord Jesus who supplies all my needs. Thank you for inspirationand perseverance. Thank you, above all, for the incomparable gift of salvation and theeternal hope you offer all who would place their trust in you.

iii

There is a fine line between thorough and slow.

Anon.

We know that we all possess knowledge. Knowledgepuffs up, but love builds up. The man who thinks he

knows something does not yet know as he ought toknow. But the man who loves God is known by God.

1 Corinthians 8:1b–3

Contents

Abstract ii

Acknowledgements iii

Contents v

1 Introduction 11.1 Mobile Robot Localisation and Mapping 2

1.1.1 Forms of Localisation 21.1.2 A Brief History of Autonomous Localisation 4

1.2 Contributions 51.3 Thesis Overview 6

2 Autonomous Localisation 82.1 Navigational Maps and Their Application to SLAM 8

2.1.1 Occupancy Grids 102.1.2 Feature Maps 112.1.3 Topological Maps 142.1.4 Hybrid Topological-Metric Maps 17

2.2 The Stochastic SLAM Algorithm 182.2.1 Vehicle, Map and Augmented State Vectors 182.2.2 Prediction Stage 192.2.3 Update Stage 212.2.4 State Augmentation 222.2.5 Efficiency Improvements for Large-Scale SLAM 23

2.3 Experiments in Outdoor SLAM 242.3.1 SLAM with a Scanning Range Laser 252.3.2 Two Dimensional Projection of Non-Planar Environments 262.3.3 Dead Reckoning in Rough Terrain 262.3.4 Ramifications of GPS 28

2.4 Summary 29

v

3 Batch Data Association for Correlated Feature Sets 303.1 Data Association For Target Tracking 31

3.1.1 Validation Gating 313.1.2 Tracking a Single Target in Clutter 343.1.3 Tracking Multiple Targets in Clutter 34

3.2 Batch Data Association For Correlated Feature Sets 363.2.1 Joint Compatibility Branch and Bound 373.2.2 Maximum Common Subgraph 39

3.3 Combined Constraint Data Association 413.3.1 Calculation of Relative and Absolute Constraints 423.3.2 Set and Graph Terminology 443.3.3 Correspondence Graph with Unknown Vehicle Pose 453.3.4 Correspondence Graph with Partially Known Vehicle Pose 493.3.5 Maximum Clique Search 533.3.6 Computational Complexity of the Maximum Clique Search 553.3.7 Ambiguity Management 583.3.8 A Comparison of Optimal Assignment, JCBB and CCDA 59

3.4 Application: Sensor-based Dead Reckoning 603.4.1 Feature Extraction 603.4.2 Dead Reckoning Implementation Details 643.4.3 Results 64

3.5 Remarks 653.5.1 Utility of Simple Geometric Features 653.5.2 Minimum Acceptable Batch Size 68

3.6 Summary 68

4 An Alternative to Feature-based Data Association: Scan Correlation 704.1 A Review of Point Data Correlation Techniques 72

4.1.1 Iterative Closest Point 724.1.2 Angle Histogram 734.1.3 Occupancy Grid Correlation 744.1.4 Probabilistic Methods 74

4.2 A Probabilistic Representation for Unprocessed Data 754.2.1 Representation as a Two-Dimensional PDF 764.2.2 Higher Dimensional Alternatives 774.2.3 Application of the 2-D Representation 78

4.3 A Bayesian Likelihood Function for a Point Target Model 794.3.1 Notation 794.3.2 Robot Localisation in a Plane 79

4.4 Sum of Gaussians Scan Correlation 82

vi

4.4.1 Conversion to Gaussian Sum Representation 824.4.2 Scaling Models for Gaussian Sum 834.4.3 Cross-Correlation in One Dimension 864.4.4 Planar Cross-Correlation 874.4.5 An Alternative Correlation Scheme 884.4.6 A Comparison of Correlation and Explicit Data Association 89

4.5 Application: Maximum Likelihood Dead Reckoning 934.5.1 Results 934.5.2 Multimodal Likelihood Functions 964.5.3 Discussion of Results 99

4.6 Application: Particle Filter Localisation 994.6.1 State Vector and Process Model 994.6.2 Likelihood Modification 1004.6.3 Results 101

4.7 Remarks: A 2-D PDF for Map Building and SLAM 1044.8 Summary 105

5 Considerations for SLAM in Moderate-Sized Environments 1065.1 Feature Management 106

5.1.1 Feature Addition 1075.1.2 Constrained Initialisation 1075.1.3 Density Control 1095.1.4 Obsolete Feature Removal 111

5.2 Application: Partial SLAM 1135.2.1 Results 1155.2.2 Partial SLAM and GPS 115

5.3 Data Association and the Cycle Detection Problem 1185.4 Application: Full SLAM 118

5.4.1 Results With Medium-Scale Loop Closure 1205.4.2 Results With Large-Scale Loop Closure 123

5.5 Remarks: SLAM versus Batch Map Building 1255.6 Summary 125

6 Network Coupled Feature Maps 1276.1 Submap Methods for Stochastic SLAM 128

6.1.1 Decoupled Stochastic Mapping 1286.1.2 Two-Level Landmark Representation 1296.1.3 Relative Landmark Representation 1306.1.4 Hierarchical Local Maps 1316.1.5 Constrained Relative Submap Filter 133

vii

6.2 Network Coupled Feature Maps 1346.2.1 NCFM Overview 1356.2.2 Coupling Estimate Equations 1366.2.3 Map Traversal 1396.2.4 Submap Creation 1406.2.5 Coupling Convergence 1416.2.6 Consequences of Non-independent Couplings 144

6.3 Cycle Detection within the NCFM Framework 1476.3.1 First Pass Detection 1476.3.2 Second Pass Detection 1516.3.3 Cycle Confirmation 1526.3.4 Pathological Symmetries 152

6.4 Application: Large-scale Outdoor NCFM SLAM 1536.4.1 Implementation Details 1536.4.2 Results 154

6.5 Summary 157

7 Conclusion 1587.1 Summary of Contributions 158

7.1.1 Combined Constraint Data Association 1587.1.2 Sum of Gaussians Scan Correlation 1597.1.3 Feature Management 1597.1.4 Network Coupled Feature Maps 160

7.2 Future Research 1607.2.1 CCDA Algorithm Extensions 1617.2.2 Scan Correlation Applications 1617.2.3 Feature Management Extensions 1627.2.4 NCFM SLAM Extensions 1627.2.5 Longer Term Developments 162

A Environments 164A.1 Urban Parkland 164A.2 Internal Road of a Country Resort 167A.3 Underground Mine Tunnel 169

B Kalman Filter Topics 171B.1 State-Space 171

B.1.1 Linear Transformations 172B.2 The Kalman Filter 173B.3 The Extended Kalman Filter 174

viii

C Insert and Observe State Augmentation 176C.1 New Feature Initialisation 176C.2 Vehicle Pose Initialisation in a Partially Known Map 177C.3 Relative Pose Estimation 178

D Particle Filters 181D.1 Probability Density Functions 181D.2 Recursive Bayesian Estimation 182D.3 Particle Filtering: Sample Based Estimation 183D.4 Resampling Implementations 186D.5 Deficiencies of the Particle Filter Algorithm 188

E Volume from Gaussian Multiplication 189E.1 Multiplication of Two Gaussians 189E.2 Volume from the Multiplication of Two Gaussians 191

Bibliography 193

ix

Chapter 1

Introduction

Autonomous mobile robot localisation in previously unexplored environments requires therobot to incrementally construct a map of its surroundings by a process called simultaneouslocalisation and mapping (SLAM). This thesis is concerned with the implementation ofSLAM in extensive environments that possess difficult properties like rough terrain andlack of man-made structure. The development of a feasible and reliable SLAM system inthese types of environments is dependent on addressing the following issues.

• Data association. SLAM relies on correct correspondence between data obtained fromthe robot sensors and the data currently stored in the map. In natural environmentsthe varied distribution of recognisable objects (features) necessitates a data associationalgorithm robust to both high feature density and an absence of stable features. Theremay also be a high proportion of dynamic objects and spurious sensor measurements,and these difficulties are further accentuated if the robot location is unknown or highlyuncertain.

• Computation. For real-time operation in very large environments, the computationalcomplexity and storage requirements of the SLAM algorithm must scale in a reason-able manner.

• Non-linearity. The process of constructing an incremental map while also localisingfrom it is highly non-linear. In large environments involving significant error accumu-lation and correction, the SLAM algorithm must remain mathematically consistent.

• Adaptation. In the presence of dynamic objects and changing environmental struc-ture, it is necessary to establish criterion for information removal from the map sothat it maintains a contemporary representation of the real world.

The objective of this thesis is to develop solutions to the issues specified above anddemonstrate the functionality of these solutions with experimentation in real outdoor envi-ronments.

1.1 Mobile Robot Localisation and Mapping 2

1.1 Mobile Robot Localisation and Mapping

The ability of a mobile robot to determine its location in space is a fundamental competencefor autonomous navigation. Knowledge of self-location, and the location of other places ofinterest in the world, is the basic foundation on which all high level navigation operationsare built. It enables strategic path planning for tasks such as goal reaching, region cover-age, exploration and obstacle avoidance, and makes following of these planned trajectoriespossible. Without a notion of location, a robot is limited to reactive behaviour based solelyon local stimulii and is incapable of planning actions beyond its immediate sensing range.

1.1.1 Forms of Localisation

This section describes three types of localisation—dead reckoning, a priori map localisation,and SLAM—representing increasing levels of competence in pose estimation.1

The most basic form of localisation is dead reckoning, which is simply estimation ofthe vehicle pose by integrating estimates of its motion (e.g., inertial sensing, encoder-basedodometry). The problem with dead reckoning is that each change-in-pose estimate includesa component of error and these errors accumulate as part of the integration process. Thus,uncertainty in the pose estimate increases monotonically with time and improving sensorand motion model accuracy can serve only to slow, but not prevent, this increase. Eventu-ally, the pose estimate must become so uncertain that it can serve no useful purpose and,for this reason, dead reckoning is an insufficient mechanism for long-term localisation. Deadreckoning does, however, retain usefulness as an auxiliary information source in conjunctionwith map-based localisation.

Pose estimation with bounded uncertainty is only possible through the availability ofabsolute rather than incremental pose measurements. A map of the environment defined bythe locations of distinct landmarks provides such a source of absolute information. Thus,given an ability to sense its surroundings, the robot can obtain absolute pose estimatesby registering sensed information with the map. The problem with a priori map basedlocalisation is the need to have explored the environment in advance, and to have surveyedthe landmark locations before the robot can begin to navigate autonomously. Constructionof an a priori map may be a difficult operation and a new map must be built for each newenvironment. Moreover, the resulting map is static and cannot adapt to changes in theenvironment or grow with exploration into regions beyond the original map bounds.

The motivation for SLAM is to overcome the need for a priori maps as a mechanism forbounded pose uncertainty, and to enable map construction that is extensible and adaptiveto environmental change. SLAM is performed by storing landmarks in a map as they areobserved by the robot sensors, using the robot pose estimate to determine the landmarklocations, while at the same time, using these landmarks to improve the robot pose esti-mate (see Figure 1.1). As the landmarks are repeatedly reobserved, their locations becomeincreasingly certain and the map converges, eventually acquiring the rigidity of an a priorimap.

1This thesis investigates the localisation problem in the context of 2-D (planar) environments, meaningthat the location of the robot is given by its pose (i.e., position (x, y) and orientation φ). However, theconcepts and results presented are equally valid for 3-D environments.


0 10 20 30 40 50−25

−20

−15

−10

−5

0

5

10

0 10 20 30 40 50−25

−20

−15

−10

−5

0

5

10

Figure 1.1: A SLAM example. The vehicle (triangle) obtains a set of range-bearing mea-surements (green rays). From these measurements, landmarks are extracted; new landmarksare added to the map and reobserved landmarks are used to improve the estimates of thevehicle pose and known landmark positions. The landmark uncertainties are shown by redellipses (using a 5σ bound). Note, the uncertainty of the vehicle and added landmarks in-creases with distance from the map origin, but the uncertainty of each landmark decreasesmonotonically once initialised. In the limit, the landmark uncertainties approach a lowerbound and their uncertainty with respect to each other approaches zero [42].


The SLAM ideal is to allow immediate navigation capabilities in completely unknownenvironments, so that a robot might be placed in a new environment and left to exploreand map the environment without human intervention. Alternatively a human operatormight drive the robot through an environment once (a training phase where the operatormight demonstrate trajectories and way-points) and it could accumulate sufficient knowl-edge to thereafter travel competently within the region. It is worth noting that, for realenvironments, training by a human operator is currently a more practicable scenario thanautonomous exploration, as exploration is unlikely to operate safely in the presence of hid-den structures (i.e., structures invisible to the robot’s sensors). For example, transparentobjects, and objects outside of the view plane, are invisible to a robot with only a rangelaser sensor and will not be avoided. Also, structures such as stair-wells present a dangerunless there is a specific recognition of vertical free-space.

1.1.2 A Brief History of Autonomous Localisation

The origin of mobile robot localisation extends back to the 1950s with the installationof wire guided tractors in industrial factories [47]. By the 1970s this path following con-cept had been developed to the point where autonomous guided vehicles (AGVs) navigatedby following lines on the ground—either buried wires (via magnetic inductance) or paintedstripes [133]. Buried wires were reliable and permanent but suffered from substantial instal-lation effort and subsequent inflexibility. Painted lines enabled more rapid path generationand alteration but required continued maintenance to ensure reliability (against wear andfading). The basic limitation of path following is that it restricts navigation to fixed tra-jectories and, therefore, limits AGV application to simple repetitive tasks. Thus, the pathfollowing method, while not actually localisation in the pose estimation sense,2 was a pre-cursor to autonomous localisation in establishing the problem of autonomous navigationand precipitating the need for more flexible navigation strategies.

Increased flexibility via pose estimation was introduced through the use of artificialbeacons. These were either active beacons such as infrared [57] or ultrasonic [80] tranducers,or passive beacons such as retro-reflective markers [20] or radar trihedrals [49], and theyenabled mobile robots to localise relative to the known beacon locations. This meantthat the prescribed navigation paths could easily be redefined in software without anychange to the physical environment and the robot could generate adaptive trajectories tobypass obstacles. Nevertheless, this method still requires the introduction of specialisedinfrastructure (the beacons themselves) that need to be carefully surveyed so as to provideaccurate landmark locations.

The use of the natural environment structure to provide landmarks was the next step inthe development of autonomous localisation, removing the need for specialised infrastruc-ture. By providing the robot with accurate metric maps of the environment (constructedby hand or, commonly, based on CAD structural plans [33]) the sensed environment couldbe registered with the map to determine its location. This use of natural landmarks intro-duced the problem of data association—the process of finding a correspondence betweenelements of two data sets. In the case of mobile robot localisation, data association concerns

2Path following is a more direct forerunner to the topological map localisation paradigm [85], where aqualitative network of paths and places is employed in preference to a metric map of landmark locations.

1.2 Contributions 5

assigning sensed features to appropriate map landmarks. While data association is also anissue in artificial landmark localisation, it can be easily avoided by keeping the beacons wellseparated or making each beacon uniquely distinguishable, using bar-codes for example.

Manually mapping (surveying) an environment, when CAD plans are not available, istedious work and susceptible to human errors and inaccuracies. A better method for a priorimap construction was demonstrated recently [131] where the robot was driven manuallythrough the environment gathering information with its sensors. This information wassubsequently used to automatically generate an environment map using an offline batchprocess, reducing map building from several days to a matter of hours.

The first SLAM algorithm to comprise an explicit and consistent representation of uncer-tainty, and therefore provide qualification of map convergence, was presented in [122]. Thismethod, referred to here as stochastic SLAM, remains the basic foundation of practicallyall subsequent SLAM proposals using the landmark-based map framework. Early experi-mental verification of the algorithm with laser [102] and sonar [87] sensors demonstrated itsutility in relatively small-scale indoor environments. However, for larger regions, the O(n2)computation and storage involved in building a map of n features made direct applicationof the algorithm intractable. More feasible adaptations of stochastic SLAM have since beenproposed such as removing redundant map features [40], developing efficient estimationprocedures [65, 140], and dividing the environment into a network of submaps [90, 140, 30].

For practical SLAM in large-scale unstructured environments, several open problems re-main including issues regarding landmark representation, data association and map struc-ture. First, nearly all current stochastic SLAM implementations represent landmarks assimple geometric primitives like point-location or line features and so are viable only inenvironments where these forms exist. In many real environments, SLAM depends on thedevelopment of more general feature models. Second, SLAM tends to fail if data associa-tion is performed incorrectly. Therefore, data association must be made robust to variabledensities of (often identical) features and dynamic objects, and large uncertainties in vehiclepose. Finally, the structure of the SLAM map needs to enable consistent uncertainty repre-sentation regardless of the size of the map. This is partly a computational problem, wherecorrelations between uncertain variables need to be maintained, but is also a mathematicalissue as uncertainty must be dealt with properly in a highly non-linear system.

1.2 Contributions

This thesis is concerned with the robustness and tractability issues for practical stochasticSLAM in large-scale, particularly outdoor, environments. Specific contributions are madetowards reliable data association, map management, feasible computation, and mathemat-ical consistency. The principal contributions of this thesis are as follows:

• The development of a batch data association algorithm for sets of parametric features.This method, called Combined Constraint Data Association (CCDA), incorporatesall available constraint information to produce an optimal set of assignments betweentwo feature sets. Of particular value to robot navigation is its ability to perform dataassociation with or without the availability of vehicle pose information.

1.3 Thesis Overview 6

• The presentation of a probabilistic (Gaussian sum) representation for unprocessedsensor data, which provides a means for finding a Bayesian correlation between un-processed data sets without feature models. This is a necessary alternative to feature-based data association in environments not suitable to geometric feature extraction.

• The investigation of feature management techniques to limit the addition of unreliablefeatures, remove obsolete features and control feature density. These methods are vitalfor the adaptability and efficiency of long-term SLAM.

• The development of a hybrid topological-metric map representation, called NetworkCoupled Feature Maps (NCFM), that enables stochastic SLAM in very large-scaleenvironments by addressing issues of computation, storage and mathematical con-sistency. In conjunction with the CCDA algorithm, this framework also provides arobust solution to the difficult problem of loop closure.

• The implementation of experiments in outdoor environments to demonstrate the util-ity of CCDA, Bayesian correlation, feature management, and NCFM. These appli-cations include sensor-based (odometry-free) dead reckoning for use in rough-terrainenvironments, particle filter localisation, traditional stochastic SLAM, and NCFMSLAM with large-scale loop closure.

1.3 Thesis Overview

Chapter 2 presents the necessary background to this thesis by discussing common alter-native map representations and their pros and cons when used for SLAM. The traditionalstochastic SLAM algorithm for feature-based maps is then presented, and a discussion ofthe experimental issues for performing large-scale SLAM in outdoor environments.

Chapter 3 gives a detailed background to data association for multiple target tracking,and its relevance to stochastic SLAM. The CCDA algorithm is presented as a highly robustbatch association method, and is then used to implement laser-based dead reckoning inrugged outdoor terrain.

Chapter 4 discusses scan correlation—the association (or alignment) of unprocessed datawithout using geometric feature models. A probabilistic (Gaussian sum) representation isdeveloped for range-bearing data, which permits appropriate sensor modelling and Bayesianpose estimation. The validity of this approach is demonstrated through two applications:sensor-based dead reckoning and particle filter localisation.

Chapter 5 addresses two issues concerning long-term SLAM in medium-scale environ-ments. The first is feature management—feature addition and removal—for efficient andreliable maps that can adapt to structural change. The second concerns loop closure, whichcannot be performed robustly with batch association alone and requires feature groupingwithin the map. An implementation of traditional SLAM in a high feature density environ-ment is used to verify these solutions.

1.3 Thesis Overview 7

Chapter 6 examines the current state-of-the-art in submap SLAM techniques, and presentsthe NCFM approach as a complete SLAM framework that is consistent and highly accurateat a global level. NCFM is also shown to permit very robust loop closure. Preliminaryexperimental NCFM SLAM results are demonstrated.

Chapter 7 presents conclusions and suggests future directions for the completion andextension of this work.

Chapter 2

Autonomous Localisation

The problem of autonomous localisation has received considerable attention over the pasttwo decades and, as a result, a variety of paradigms exist for determining the positionand orientation of a robot vehicle in relation to other objects in the environment. Oneattribute all localisation methods have in common is a map—a representation of the ex-ternal environment that provides a reference for the information obtained from the robotsensors. This chapter examines several map forms currently used in mobile robot navigationsystems. Their suitability for reliable localisation is discussed, particularly with regard tolarge-scale SLAM, and this evaluation leads to the selection of metric feature-based mapsas a best choice for most situations. The remainder of this thesis is based on a featuremap representation, although the developments in Chapter 6 draw from topological mapconcepts.

The later sections of this chapter provide a context for the research in this thesis bydescribing the following aspects of feature-based SLAM.

• The basic stochastic SLAM algorithm is presented which stores the vehicle pose andmap feature parameters in a state vector and updates these estimates using an ex-tended Kalman filter (EKF).

• Recent improvements in the efficiency of the SLAM algorithm are reviewed in termsof their ability to permit feasible operation in large-scale environments.

• The issues involved in performing practical SLAM experiments in rugged outdoorenvironments are discussed with emphasis on the particular case where a scanningrange laser is the sole information source.

2.1 Navigational Maps and Their Application to SLAM

This section considers the three types of maps most commonly used in current localisationsystems: occupancy grids, feature maps and topological maps. For each of these maps, abasic functional description is given and a survey is made of their suitability to a priori maplocalisation (in terms of computational complexity, reliability, etc). An examination of suit-ability for SLAM follows, with attention to criteria considered essential for the development

2.1 Navigational Maps and Their Application to SLAM 9

of a tractable and consistent SLAM algorithm. These criteria are given below (note, thefirst two criteria are valid only for metric maps).

• Representation of uncertainty. Mobile robot sensors cannot measure locations in theenvironment with total accuracy and so there is a degree of uncertainty in the maprepresentation. At the same time, the vehicle location is derived from this map andso the pose estimate is also uncertain. This thesis stipulates that both map andpose estimates require a proper quantification of uncertainty, including conditionaluncertainty where various factors are dependent. Essentially, an uncertainty modelneeds to accurately reflect the error between the estimated and actual state of thesystem.

• Monotonic convergence. The primary purpose of an uncertainty measure is to ensuremap convergence. A map is convergent if the estimated spatial geometry of the en-vironment approaches its true (physical) geometry as new observation information isincorporated. This is synonymous with a monotonic reduction in map uncertainty(i.e., non-increasing uncertainty of the known map). Without this uncertainty mea-sure, a static object with estimated location (x1, y1) may drift with subsequent mapupdates to some arbitrarily distant location (x2, y2). Explicit uncertainty permits for-mal assessment of map accuracy and constrains the effect of subsequent observationinformation.

• Data association. The map representation must permit reliable correspondence be-tween the information obtained from the robot sensors and the stored map informa-tion. First, the search for observation-to-map association needs to be efficient enoughfor real-time operation and, second, the association must be robust to partial viewsand large search-spaces. Partial views occur because an observation may consist ofa combination of currently mapped region, unexplored region and dynamic objects.The size of the search space is determined by the vehicle pose uncertainty; thus, anaccurate uncertainty model improves both efficiency and robustness by prescribing aminimal search space.

• Cycle detection. If a mobile robot explores an environment by traversing a large loop(i.e., much larger than its sensing range), then identifying a return to an old map regionis the cycle detection problem (also known as the map revisitation or loop closureproblem). Cycle detection involves special-case conditions concerning the two previouscriteria. The first issue is data association, which is distinct from local associationbecause of the much larger vehicle pose uncertainty, and hence search-space, involved.Search efficiency is one aspect but, more importantly, there needs to be robustness indeciding whether an association is correct or an artifact of environmental similarity.The second issue, having found a correct association, is convergence, where substantialaccumulated error in the map loop must be compensated properly during the mapupdate phase by propagating the error-offset back through the map circuit.

• Computation and storage. The map must store sufficient information to enable dataassociation and convergence. This storage, and the computation required to updatethe map with new observation data, should scale reasonably with the area of environ-ment covered.


Figure 2.1: Occupancy grid map. The probability of occupancy for eachgrid square is defined by a value bounded by (0, 1) such that 0 indicatesdefinitely not occupied (free space) and 1 means definitely occupied. Aprior probability of 0.5 implies unexplored space (depicted by the lightgray regions). Note that the rectangular grid is not an efficient represen-tation of non-rectangular environments such as this one.

2.1.1 Occupancy Grids

Occupancy (or evidence) grids [50, 118] represent a region as a matrix of cells as shown inFigure 2.1. Each cell describes a small rectangular area in the environment, and indicatesthe probability that the area is occupied by a value in the range (0, 1). Localisation isaccomplished by registering observation data with the map using cross-correlation methods(the same technique as used for image-based template matching).

As an a priori map, in reasonably small environments, occupancy grids are an effectivelocalisation method. Typically observation data is accumulated in a short-term occupancygrid before being registered with the main map [100, 50, 142, 128]. This enables data fusionin a uniform occupancy representation—to filter noisy sensor data over a short time period,or to combine data from multiple sensors and different sensor modalities. Occupancy gridsalso offer an explicit representation of both occupied and free space, which is useful for pathplanning.

One difficulty concerning occupancy grids is data association. The cross-correlationsearch, within the region of the vehicle pose uncertainty, is expensive if the search-spaceis large (although fast search methods have been presented e.g., [82]). Also, if the cross-correlation result is multi-modal within the search-space region, a maximum likelihoodcorrelation search may fail by converging to the wrong mode. This problem might beaddressed by employing a Monte Carlo localisation procedure, as introduced in [38].

The most significant difficulty with occupancy grids, as a priori maps in large envi-ronments, is the tradeoff between grid resolution (termed granularity) and computational


complexity. Ideally, to capture environmental detail and to facilitate accurate pose estima-tion, the grid size is as small as possible (fine grained), whereas for feasible computation,given that storage and computation increases in proportion to the number of grid cells, alarger grid size may be necessary (coarse grained). Also, tasks like path-planning becomecomputationally expensive for fine grid resolutions. Methods to obtain variable granularity,and so focus resources at regions of environmental complexity, have been presented [103, 21],but these possess implementation difficulties of their own.

Occupancy grid SLAM [142, 119] interleaves the steps of localisation and map update byfirst registering the short-term map with the global map (localisation), and then updatingthe perceived occupancy of the global map grid cells (map building). This method hasbeen shown to work robustly in dynamic indoor environments over a limited period of time.However, as a SLAM map representation, occupancy grids do not possess an appropriateuncertainty model and so will tend to diverge in the longer term.

Occupancy grids are reasonably able to incorporate models of sensor uncertainty (e.g.,see [99, 50]), and so are suitable for either localisation given an a priori map or map buildinggiven location. SLAM, however, requires an integrated representation of sensor and vehiclepose uncertainty and their correlations, but this is not supported within the occupancy gridframework. That is, occupancy grids can represent uncertainty at a local (vehicle-centric)level, but not at a global level—which is essential for map convergence. By not definingcriteria for convergence, the developing map is able to drift with each observation updateand this divergence exhibits itself as a slow blurring of the map.

Cycle detection provides the main instance for examining the failings of occupancy gridSLAM. First, without a reasonable estimate of global map uncertainty, the search-spacefor cycle detection is undefined and must either cover the whole space or be limited by adhoc bounds. Second, for large cycles, the minimal search-space may become too great forreal-time cross-correlation. Third, given a large search-space, the possibility of multiplecorrelation modes is high, and data association failure becomes increasingly likely. Finally,and most importantly, even if a correct correspondence between the observed data and theold map region is made, there is still no mechanism for offsetting the error accumulated inthe map loop, and any large error in a map cycle will certainly result in an inconsistentmap.

2.1.2 Feature Maps

Feature maps (or landmark maps) [123, 89] represent the environment by the global loca-tions of parametric features (such as points and lines) as shown in Figure 2.2. Localisationis performed by extracting features from sensed data and associating them to features in themap. The differences between the predicted feature locations and the measured locationsare then used to calculate the vehicle pose. In this way, localisation is very like a multipletarget tracking problem [14] but, unlike normal target tracking, the targets are static andthe observer is in motion.

The landmark locations in an a priori feature map are assumed to be perfectly knownand so each feature is entirely defined by its parameter set. For example, a point-locationfeature for a cylindrical landmark, such as a pole or tree trunk, might be defined byf = (t, r, x, y), where t is the feature type (cylinder type), r is the cylinder radius, and


Figure 2.2: Feature map. The environment is defined by parameterisedfeatures (point locations in this example). These static landmarks aretracked using target tracking methods to determine the motion of theobserver.

x and y define its centre location. Only the location information is directly useful forlocalisation but the other information serves to assist landmark recognition for data associ-ation. Since each landmark is represented by a limited set of parameters, the feature map{f1, ..., fn} is a very efficient environment representation. Unlike occupancy grids, where adense environmental description is maintained, feature maps form a sparse representationof select landmarks. In particular, free-space is not represented and does not incur any costin the localisation process. For this reason, feature maps do not facilitate path-planning orobstacle avoidance, and these must be performed as separate operations.

Localisation using a feature map is a parameter estimation problem to determine thevehicle pose (x, y, φ) given the map feature information and a set of feature observations.Assuming the measurements are correctly associated to the appropriate map features, thevehicle pose can be tracked using standard estimation techniques—with the EKF being themost common method applied to this problem [88, 49, 42].1 Recursive EKF pose estimationhas the advantages of efficient data fusion from multiple sensor measurements and the abilityto incorporate explicit sensor uncertainty models.

Data association is arguably the main weakness of feature map localisation. Correctpose estimation relies on finding correct correspondence between a feature observation andits associated map feature. A misassociation results in an inconsistency where the vehiclepose uncertainty decreases but the estimate error actually increases. The effect of signifi-cant false associations is to dramatically increase the pose estimate error and prevent anysubsequent map registration—so that the vehicle becomes lost. Most feature map local-isation implementations are susceptible to data association failure because they rely onthe association methods developed for target tracking, which treat each measurement inisolation. By failing to exploit the correlations between fixed landmarks, this approach issensitive to observer pose uncertainty and high feature density. A dramatic increase in ro-bustness is possible using batch data association, where a set of observations are assigned atonce, as this enables association distinction based on their combined association likelihoods,effectively utilising the geometric character of the local region.

1Details regarding state-space methods, the Kalman filter and the extended Kalman filter are providedin Appendix B.


Figure 2.3: SLAM uncertainty correlations. In (a) the vehicle observesand initialises feature f1 and so the estimate of f1 is dependent on thevehicle pose estimate. In the next time step (b) the vehicle reobservesf1 and initialises f2. Thus the estimate of f2 is dependent on the vehiclepose which in turn is dependent on f1. As the features are repeatedlyreobserved, they become increasingly correlated to each other and, in thelimit, form a completely rigid map.

A further problem concerning feature maps is their suitability only to environmentswhere the observed objects can be reasonably depicted by basic geometric feature models.This is often not the case in very unstructured environments where the observed objectsmight appear as arbitrary curves rather than, say, distinct points or lines. For reliableoperation in these environments, it is necessary to devise parametric feature models thatdescribe these general objects sufficiently well for consistent extraction and classification.

Feature map SLAM [122, 42] comprises the dual task of adding observed features to themap, using the vehicle pose as a reference, while using existing map features to estimatethe vehicle pose. The uncertainty of sensor measurements, therefore, results in uncertainestimates of both the vehicle pose and the map feature locations, and these uncertainties aredependent (or correlated), as shown in Figure 2.3. Correlated uncertainty has an importantconsequence for feature-based SLAM as it inextricably couples the individual features toeach other and the vehicle to the map. Attempts to estimate the vehicle pose and mapfeatures independently have been shown to produce inconsistent (optimistic) uncertaintyestimates [87, 27].

Consistent stochastic estimation requires that correlations between parameters are ex-plicitly maintained. For an EKF, this means storing the parameters in a single state vectorand their associated correlations in a covariance matrix. The upshot for EKF-based SLAMis that the vehicle pose estimate and the map feature locations must be stored in the samestate vector, and this vector must be augmented as new features are observed and added tothe map. Assuming the SLAM process satisfies the basic EKF conditions of near-linearityand approximately Gaussian uncertainty distributions, then the uncertainty model providedby the EKF has been shown to yield monotonic map convergence [42]. In the limit, withrepeated observation, the map becomes totally rigid such that the relative feature locationsare perfectly known and the global uncertainty approaches a lower bound established bythe initial vehicle pose uncertainty.


Stochastic SLAM suffers from three main weaknesses: high computation and storagecosts, fragile data association, and inconsistent treatment of non-linearity. High compu-tation and storage are the price paid for maintaining correlations in the state covariancematrix. For n state parameters, the n2 covariance elements need to be updated with eachnew observation. Generally, the number of vehicle parameters is insignificant comparedto the number of feature parameters, and so the computation and storage are specified asbeing O(n2) where n is the number of map features.

Data association failure is a much more serious problem for SLAM than for a priorimap localisation. Simple localisation may be able to recover from a minor misassociation,because only the vehicle pose estimate is affected, but with SLAM the map is also alteredand these inconsistencies tend to be self-propagating, causing divergence. A further dataassociation problem concerns the management of non-associated observations. These areeither new map features, outlier measurements, or observations of dynamic objects, anddiscerning the latter two is essential to prevent cluttering the map. The most difficult dataassociation complication arises during cycle detection. This is difficult because, not only isthe vehicle pose uncertain, but the new and old portions of the map are also uncertain inrelation to each other, which reduces the reliability even of batch data association.

Non-linearity is typically not a significant problem for normal SLAM operation [48] but,again, proves to be an issue during cycle detection. Incremental map building withoutcycles is a reasonably linear process and the EKF requirement of small estimation errorsis generally met. The same is true for cycles that accrue only small pose errors, as themap feature correlations permit proper error compensation back through the map loop.However, cycles involving large accumulated errors can incur corrections that violate thelinearised filter assumptions and result in divergence. There are two aspects that contributeto the inconsistency of large corrections. The first is the effect of linearising a highly non-linear observation model, as the first-order approximation may be inaccurate if the stateuncertainty is large compared to the observation uncertainty. This problem could possiblybe addressed using iterated linearisation methods such the iterated EKF (IEKF) [6, pages404–406] or the smoothly constrained Kalman filter (SCKF) [37]. The second aspect is thatthe Gaussian uncertainty approximation for the state (i.e., vehicle and feature covariancesand correlations) may be optimistic because of the large estimate deviations, which wouldprevent convergence to the true state.

In summary, feature maps are a viable representation for long-term convergent SLAMin fairly small-scale environments where stable landmarks are observable, computation istractable, and accumulated state uncertainty does not exceed conservative limits. For fea-sible and convergent operation in larger areas, modifications to the basic stochastic SLAMalgorithm are required.

2.1.3 Topological Maps

Topological maps [85] illustrate a major conceptual shift in environmental representation.Occupancy grids and feature maps are both metric maps where location is defined as aset of coordinates in Cartesian space. Topological maps, however, do not rely on metricmeasurements and instead represent the environment in terms of places and connectingpaths as shown in Figure 2.4. They are generally depicted by a graph structure where


Figure 2.4: Topological map. The environment is defined using a graphdata structure where each node (or vertex) contains a place descriptionand each edge contains a path description. Travelling from one placeto another entails travelling via intermediate places and standard graphshortest path algorithms can be used (e.g., A to H requires travellingthrough the sequence A-D-F-H).

the graph nodes define particular locations in the environment (termed distinctive places)and the graph edges define procedural information for traveling between nodes. Thus,navigation between two non-adjacent locations is determined by a sequence of transitionsbetween intermediate place nodes. The concept works on the assumptions that distinctiveplaces are locally distinguishable from the surrounding area and the procedural informationis sufficient to enable the robot to travel within recognising distance of a specified place.

Place recognition is a form of data association where the observation-to-map correspon-dence is based on the apparent similarity between two data sets—in this case between theobserved information and a graph node description. Appearance-based association, whichrelies on locally unique data, can be considered in contrast with metric proximity con-straints, as used with feature maps, which enables association in the presence of identicallandmark data. For place recognition to function correctly, a node description must beunique along the connecting path regions from its adjacent nodes. This allows the robotto compare observed data with the node template while travelling towards it and to de-termine being “at” the place location once a positive match is made. Most topologicallocalisation systems use one of two basic types of place recognition. The first is based onrange-bearing measurements, such as sonar or laser, and involves defining place nodes atlocations fitting certain geometric qualities and then matching the measured data to theseplace descriptions. For example, these descriptions might be observation criteria such asequidistance from near objects [85, 31], or simple models of indoor structures like doors andcorridor intersections [83, 3]. The second recognition method is based on vision data andinvolves defining each place node with images obtained from its (arbitrary) location andthen matching observed images to the node according to a prescribed similarity measure[51, 1, 137].

Topological maps, as an a priori reference, are attractive for their efficient and compactrepresentation, and their logical organisation for tasks like path planning. The departure


from metric representation makes pose uncertainty estimation irrelevant and, instead, qual-itative measures are used like “follow path from A to B” or “at B”. Of particular advantage,is the ability to utilise standard graph algorithms for high-level planning operations suchas finding the shortest path between non-adjacent nodes [39].

The primary weakness of topological maps concerns ensuring reliable navigation betweenplaces, and subsequent place recognition, without the aid of some form of metric locationmeasure. Travelling between nodes using purely qualitative trajectory information, such aswall following [85], is often sufficient for static structured environments but, in more complexand dynamic environments, may fail to guide the robot to the appropriate place vicinity.The most critical weakness, however, is place recognition. If a place is not recognised (falsenegative) or an alternate location is mistaken for a place (false positive) then the topologicalsequence is broken and the robot becomes lost. False negatives occur because of alterationin place appearance through circumstances such as viewpoint variation, occlusion, struc-tural change, dynamic objects or changed lighting conditions. Both geometric and visualrecognition methods are sensitive to this form of failure. False positives can be generated ifanother portion of the environment appears similar to the place definition—meaning thatthe place is not locally unique. This is common in highly structured environments (such asrows of office cubicles) but can also be a symptom of inadequate place definition. For thesimple descriptions offered by most geometric recognition methods, ambiguous associationsmay be created even by the presence of transient objects. Vision-based recognition is prob-ably more immune to false positives because of the increased level of information definingthe node.

One further problem with topological maps is their limitation to waypoint-based navi-gation. The robot is constrained to follow specific trajectories and to pass through (or verynear) each place location. Effectively, control is directly tied to the localisation process(i.e., the robot must perform active localisation). While this situation is adequate for manyautonomous vehicle systems, there exist applications where the robot trajectory should beindependent of discrete place locations and passive localisation is necessary.

Topological SLAM [85] operates by performing exploration of the environment guidedby a set of path following criteria, and recording place descriptions at appropriate locations.For geometric place recognition, these would be locations effecting certain patterns in thesensory data and, for vision-based recognition, they could be either regularly-spaced loca-tions or locations where a given distinctiveness metric is maximised. As each new place isfound, it is connected to the previous place according to the path following specificationsrequired to reach it. In this way, the map is built as a linear sequence of places, whichcontinues until a place is observed that matches a previously stored place description (pre-suming the robot is not simply traversing old sections of the map). This matching placedescription is generated by either a new region of similar appearance or by observing theold place reached via an alternate route. If the match can be identified unequivocally asthe old place location, then a cycle2 is created, linking the topological sequence back uponitself to form a closed path.

The a priori map problems of qualitative path following and sequential data association2The term cycle detection, used in this thesis to refer to map revisitation, is coined from the graph-

theoretic term for a closed path. For topological maps, the result of loop closure is a cycle in the mapgraph.


remain significant for topological SLAM, but the most prominent concern is cycle detec-tion. By avoiding metric location measurement, topological SLAM removes the difficultiesof uncertainty representation and non-linearities but, instead, places full responsibility forrobust operation upon data association. In the case of cycle detection, where an observedplace is found to resemble a previous place (or perhaps several previous places), data asso-ciation becomes ambiguous, and the observed place could be one of the stored locations or anewly discovered location. Discerning the correct association can be found by a method ofrehearsal [85], where the ensuing sequence of places is tracked until the number of candidatecycles is reduced to one (a cycle) or none (a new place). This method is appropriate if aglobally unique place, or place sequence, exists in the map [9]. However, in many envi-ronments, such uniqueness cannot be ascertained, and the cycle cannot be confirmed sincethe matching sequence could actually represent a similar, but previously unexplored, regionof the environment. An approach for artificially creating unique locations by having therobot drop coloured markers has been proposed [46], but this method is not practical formany robot applications. In general, the observation of an expected place sequence servesto increase confidence of a cycle detection but cannot definitely confirm this hypothesis.

Essentially, unless the environment possesses at least one globally unique sequence ofplaces, cycle detection must always be ambiguous. This is the key weakness in the topologi-cal map paradigm as environmental similarity may eventually produce a consistently similarsequence of places and result in data association failure. The solution to this problem is tointroduce metric information, which would enable the estimation of pose uncertainty be-tween places—bounding the cycle search-space so that place sequences need only be locallyunique.

2.1.4 Hybrid Topological-Metric Maps

The qualities of metric and topological maps are complementary. Metric maps, with an ap-propriate uncertainty representation, constrain data association and permit non-qualitativetrajectory planning, while topological maps break the world up into locally connected re-gions and avoid the problems of maintaining a global reference frame. Hybrid topological-metric maps [141, 121, 30, 31] are basically topological frameworks where the place defini-tions and/or path definitions contain metric (as well as qualitative) information.3 Impor-tantly, this means that places are no longer restricted to discrete locations but can describeregions of arbitrary size and shape as local metric maps.

Defining places as small-scale feature maps is proposed in this thesis as an effectivemethod for performing large-scale SLAM. The global topological structure addresses theproblems of computational cost and non-linearities by decoupling distant (non-adjacent)regions. Cycle detection remains a significant problem but, by combining place recognitionmethods with pose uncertainty constraints, the ability to disambiguate cycles is greatlyimproved. Further development of this hybrid map concept is presented in Chapter 6.

3An alternative type of topological-metric hybrid is presented in [128] where the map is essentially astandard (occupancy grid) metric map overlaid with a topological network for path planning. However, thistype of map does not address the issues of large-scale localisation with which this thesis is concerned and sois not investigated further.

2.2 The Stochastic SLAM Algorithm 18

2.2 The Stochastic SLAM Algorithm

The fundamental equations for feature map SLAM based on the EKF are presented in thissection. An account of the EKF algorithm, and a brief introduction to state-space concepts,is provided in Appendix B. For convenience, the k notation used in Appendix B is droppedin the following explanation as the sequence of operations is apparent from its context.

Stochastic SLAM [122, 36, 42, 140] is performed by storing the vehicle pose and maplandmarks in a single state vector, and estimating the state parameters via a recursiveprocess of prediction and observation. The prediction stage deals with vehicle motionbased on incremental dead reckoning estimates, and increases the uncertainty of the vehiclepose estimate. The observation, or update, stage occurs with the re-observation of storedfeatures, and improves the overall state estimate. When a feature is observed for the firsttime, however, it is added to the state vector through an initialisation process called stateaugmentation.

2.2.1 Vehicle, Map and Augmented State Vectors

The vehicle state is represented, in this thesis, by its pose relative to a base Cartesiancoordinate frame (as shown in Figure 2.5) with mean and covariance defined as

xv =[

xv yv φv

]T(2.1)

Pv =

σ2

xvxvσ2

xvyvσ2

xvφv

σ2xvyv

σ2yvyv

σ2yvφv

σ2xvφv

σ2yvφv

σ2φvφv

(2.2)

The locations of 2-D point features observed by the vehicle form a map in the same basecoordinate system. (More elaborate parametric feature models, such as lines, might alsobe used, but are not implemented in this thesis.) The covariance matrix Pm of this mapincludes cross-correlation information between the features (i.e., the off-diagonal terms)which captures the dependence of each feature location upon knowledge of the other featuresin the map. Since the feature locations are static, these correlations will increase with eachre-observation as the map becomes increasingly rigid.

xm =[

x1 y1 . . . xn yn

]T (2.3)

Pm =

σ2x1x1

σ2x1y1

. . . σ2x1xn

σ2x1yn

σ2x1y1

σ2y1y1

. . . σ2y1xn

σ2y1yn

......

. . ....

...σ2

x1xnσ2

y1xn. . . σ2

xnxnσ2

xnyn

σ2x1yn

σ2y1yn

. . . σ2xnyn

σ2ynyn

(2.4)

The SLAM map is defined by an augmented state vector formed by the concatenationof the vehicle state and the feature map state. This is necessary as consistent SLAM relies


Figure 2.5: Augmented state vector. The SLAM state vector is com-posed of the vehicle pose and of landmarks observed in the environmentreferenced with respect to a base coordinate frame.

on the maintenance of correlations Pvm between the vehicle and the map.

xa =[

xv

xm

](2.5)

Pa =[

Pv Pvm

PTvm Pm

](2.6)

Note that the initial condition of the state estimate is usually given as xa = xv = 0 andPa = Pv = 0. In other words, no features have yet been observed and the initial vehiclepose defines the base coordinate origin.

2.2.2 Prediction Stage

The SLAM process model specifies that the vehicle moves relative to its previous poseaccording to a dead reckoning motion estimate, and the map features remain stationary.The effect of this model on the state estimate is a change in the xv portion of the statevector, and in the Pv and Pvm terms of the state covariance matrix, while the xm and Pm

portions remain constant.An estimate of the vehicle change-in-pose xδ = [xδ, yδ, φδ]T with covariance Pδ (see

Figure 2.6) is commonly obtained using wheel encoder odometry and a vehicle kinematicmodel. In this thesis, it is obtained using laser-based dead reckoning,4 which finds the rela-tive pose between sequential laser scans through a combination of the batch data associationalgorithm in Chapter 3 and the relative pose estimation method in Appendix C.3.

4Concurrent use of laser-based dead reckoning and laser-based SLAM would involve information reuseand lead to an over-optimistic SLAM estimate. However, SLAM may use the laser dead reckoning predictfor data association purposes, provided a separate predict step is used as the prior for the SLAM updatestep. To avoid reuse, the SLAM predict is based on a simple constant velocity dynamic model where thepredicted change-in-pose equals the previous observed change-in-pose with suitably large uncertainty tocapture vehicle accelerations.


Figure 2.6: Change in pose vector. Laser-based dead reckoning providesa vector of the vehicle motion between sequential scans.

The predicted augmented state, therefore, is given by

x−a = f (xa, xδ) =

[g (xv, xδ)

xm

]=

xv + xδ cos φv − yδ sin φv

yv + xδ sin φv + yδ cos φv

φv + φδ

xm

(2.7)

P−a = ∇fxaPa∇fT

xa+ ∇fxδ

Pδ∇fTxδ

(2.8)

where the Jacobians ∇fxa and ∇fxδare defined as

∇fxa =∂f∂xa

∣∣∣∣(xa,xδ)

=[ ∇gxv 0vm

0Tvm Im

](xa,xδ)

(2.9)

∇fxδ=

∂f∂xδ

∣∣∣∣(xa,xδ)

=[ ∇gxδ

0Tvm

](xa,xδ)

(2.10)

and the Jacobians ∇gxv and ∇gxδare as follows.

∇gxv =∂g∂xv

∣∣∣∣(xv,xδ)

=

1 0 −xδ sin φv − yδ cos φv

0 1 xδ cos φv − yδ sin φv

0 0 1

(2.11)

∇gxδ=

∂g∂xδ

∣∣∣∣(xv ,xδ)

=

cos φv − sin φv 0

sin φv cos φv 00 0 1

(2.12)

As these Jacobians only affect the vehicle portion of the covariance matrix Pv and itscross-correlations Pvm, Equation 2.8 can be implemented more efficiently as

P−a =

[ ∇gxvPv∇gTxv

+ ∇gxδPδ∇gT

xδ∇gxvPvm

(∇gxvPvm)T Pm

](2.13)


2.2.3 Update Stage

If a feature already stored in the map as estimate (xi, yi) is observed by a range-bearingsensor with the measurement

z =[

rθ

](2.14)

R =[

σ2r σ2

rθ

σ2rθ σ2

θ

](2.15)

where (r, θ) is the range and bearing relative to the observer and R is the observationcovariance, then the sensed information is related to the map by the following equation.

zi = hi (xa) =

√

(xi − xv)2 + (yi − yv)

2

arctan(

yi−yv

xi−xv

)− φv

(2.16)

Assuming correct data association of the observation z to the map feature estimate (xi, yi),the Kalman gain Wi can be determined as

νi = z − hi

(x−

a

)(2.17)

Si = ∇hxaP−a ∇hT

xa+ R (2.18)

Wi = P−a ∇hT

xaS−1

i (2.19)

where the Jacobian ∇hxa is given by

∇hxa =∂hi

∂xa

∣∣∣∣x−

a

=[ −∆x

d −∆yd 0 0 . . . 0 ∆x

d∆yd 0 . . . 0

∆yd2 −∆x

d2 −1 0 . . . 0 −∆yd2

∆xd2 0 . . . 0

](2.20)

∆x = xi − xv

∆y = yi − yv

d =√

(xi − xv)2 + (yi − yv)

2

For SLAM maps with large numbers of features, the Jacobian ∇hxa consists mainly ofzero terms enabling efficient calculations of Equations 2.18 and 2.19. The non-zero termsalign with the positions of the vehicle states and the observed feature states (xi, yi) in theaugmented state vector. The a posterior SLAM estimate is subsequently determined fromthe update equations.

x+a = x−

a + Wiνi (2.21)

P+a = P−

a − WiSiWTi (2.22)

The observation model in Equation 2.16 correlates the feature estimate to the vehiclepose estimate and serves to reduce the uncertainty of both. Through correlation to thevehicle pose estimate, the map features become correlated to each other and these correla-tions increase monotonically until their locations (relative to each other) become perfectlyknown.


On a practical note, if several independent observations are available at once (a batchobservation z = [r1, θ1, . . . , rn, θn]T ), then a more accurate update may be possible if theyare treated corporately than if each observation is processed individually. The reason forimproved performance is because the EKF performs linearised error correction and theupdate tends to be “pulled in the right direction” better if the innovation vector ν consistsof several observation errors at once. The difference in performance is most noticeable if thevehicle pose is very uncertain prior to the update stage. However, a disadvantage of batchupdate processing is the need to invert the innovation covariance matrix S which requirescomputation to the order O(n3) where n is the number of features in the batch. A simplecompromise is to process the observations in manageable sub-batches.

2.2.4 State Augmentation

As the environment is explored, new features are observed and must be added to the storedmap. A method for initialising new features is shown below.5 First, the state vector andcovariance matrix are extended (augmented) with the polar values of the new observationz, and its covariance R, as measured relative to the observer.

xaug =[

xa

z

](2.23)

Paug =

Pv Pvm 0

PTvm Pm 00 0 R

(2.24)

A function gi is derived to convert the polar observation z to a global Cartesian featurelocation. This transformation is a function of the new observation and the current vehiclepose.

gi (xv, z) =[

xi

yi

]=[

xv + r cos (θ + φv)yv + r sin (θ + φv)

](2.25)

The augmented state can then be initialised to the correct values by performing a linearisedtransformation by the function fi as follows.

x+a = fi (xaug) =

[xa

gi (xv, z)

](2.26)

P+a = ∇fxaugPaug∇fT

xaug(2.27)

where the Jacobian ∇fxaug is given by

∇fxaug =∂fi

∂xaug

∣∣∣∣xaug

=

Iv 0 0

0 Im 0∇gxv 0 ∇gz

(2.28)

5Thanks to Stefan Williams for this derivation.


and the Jacobians ∇gxv and ∇gz are as follows.

∇gxv =∂gi

∂xv

∣∣∣∣(xv ,z)

=

1 0 −r sin

(θ + φv

)0 1 r cos

(θ + φv

) (2.29)

∇gz =∂gi

∂z

∣∣∣∣(xv ,z)

=

cos

(θ + φv

)−r sin

(θ + φv

)sin(θ + φv

)r cos

(θ + φv

) (2.30)

The matrix multiplication in Equation 2.27 requires computation to the order O(n3) wheren is the number of features in the map. Due to the sparseness of the Jacobian matrix,however, a much more efficient implementation is possible as the transform only affects theblock diagonal matrix of the new feature and its off diagonal cross-correlations to the restof the map.

P+a =

Pv Pvm Pv∇gT

xv

PTvm Pm PT

vm∇gTxv

∇gxvPv ∇gxvPvm ∇gxvPv∇gTxv

+ ∇gzR∇gTz

(2.31)

Feature deletion from the SLAM map is straightforward. The elements are simplyremoved from the state vector and the associated rows and columns are deleted from thestate covariance matrix. Map management through the deletion of features is discussed atlength in Chapter 5.

2.2.5 Efficiency Improvements for Large-Scale SLAM

The state covariance update in Equation 2.22 is an O(n2) computation for n map fea-tures. For tractable operation in large-scale environments, several efficient variations of thestochastic SLAM algorithm have been proposed.

The compressed filter [65] limits the SLAM update stage to only affect features within abounded region surrounding the vehicle location. As such, the computational cost becomesO(n2

a) where na is the number of features contained within the local region. Correlations tofeatures outside the local region are preserved in an optimal manner but the O(n2) transferof this information to the entire map is only performed when the vehicle transitions toanother region. Thus a full O(n2) SLAM update can be processed as a low frequencyevent even with high frequency observation information. Moreover, the full update canbe handled as a background task with only the features within the new region requiringimmediate update in sequence with incoming observations. A further contribution in [65] isa suboptimal algorithm for the full update step that permits an upper bound to be set on thenumber of features receiving a decrease in uncertainty—therefore bounding the quadraticportion of the SLAM update. With appropriate map organisation, this method is shown toproduce results almost indistinguishable from the optimal algorithm.

The postponement algorithm [81] is similar to the compressed filter. It is an optimalfilter, and permits the same O(n2

a) local region complexity and deferred application of thefull O(n2) update while the vehicle remains within the local area. Unlike the compressedfilter, the postponement local regions are not governed by geometric bounds but can ex-pand dynamically to accommodate newly observed features and extend the region bounds.

2.3 Experiments in Outdoor SLAM 24

Thus, a local region is described as a “cache of features whose update is cheap, but whichultimately must be ‘swapped out’.” A full update is required only when the computationalload of the local region becomes excessive, whereupon a new local region is commenced andthe full update may be performed as a background task.

The constrained local submap filter (CLSF) [140] is an alternative approach which per-forms actual information gathering and localisation via the construction of an independentlocal map. This technique operates as follows. The robot is at some (approximately) knownlocation in the global map and, from this location, starts a new independent SLAM map.As the robot moves and receives observations of the environment the new local map is builtin the normal SLAM fashion and the global map serves as a static map. Thus, the robotpose in global coordinates may be found by transforming the local pose estimate given theknown global pose of the local map coordinate origin. When the local map reaches a statewhere either it contains too many features or the vehicle location is too uncertain, the localmap is registered with the global map and a new local map is commenced. The process oflocal map registration results in a global map update equivalent to the standard optimalSLAM algorithm. Like the compressed filter, this method enables a low frequency globalSLAM update that can be performed as a background task (i.e., the full update does notneed to reach completion until the next local map is ready for registration). An additionalbenefit of this approach is that it can be applied directly to multiple robot systems whereeach robot builds its own local map and periodically registers it with a shared global map.

Another recent technique called local map sequencing [127] performs local map construc-tion and registration in a manner virtually identical to the CLSF. The robot constructs localsubmaps, and these are periodically registered with the global map by transforming to theglobal coordinate frame and augmenting the global state vector. Common features betweenthe local and global maps are fused via geometric constraints and duplicate features aresubsequently removed to produce a single map equivalent to the full SLAM solution.

All of the above proposals improve the computational burden of stochastic SLAM butthey do not address the O(n2) memory requirement, and storage may eventually present aproblem. The covariance intersection [77] is an interesting development of the (extended)Kalman filter that enables the consistent fusion of information without knowing the correla-tion between state variables or between sources of observed data. Effectively, the covarianceintersection filter performs data fusion by producing the best possible estimate given an as-sumption of worst case conditions where all available information is fully correlated. Thisfilter is applied to the SLAM problem in [135] and results in a formulation where bothcomputation and storage are O(n). However, the assumption of total state and observationdependence is highly suboptimal and the SLAM map tends to suffer large feature uncer-tainty within a short distance from the map origin. The map also tends to converge onlyvery slowly or not at all with repeated observation.

2.3 Experiments in Outdoor SLAM

One of the main contributions of this thesis is verification of the techniques developed in thelater chapters through experimental application in a variety of rugged outdoor environments.These experiments reveal several factors influencing outdoor SLAM, in addition to scale-related issues, which make it considerably more difficult than indoor SLAM.


Figure 2.7: Two scanning lasers (different models) attached to the frontbumper of a standard utility vehicle.

Most indoor environments permit reasonably straightforward SLAM implementationbecause they possess a smooth planar ground surface and a high level of man-made struc-ture. Smooth ground surface enables the use of accurate wheel-encoder odometry and thesurface flatness means that the system closely resembles the assumed 2-D world model.Sensor observations of man-made structure, such as walls, are often well modelled by geo-metric primitives, such as lines and corners, and produce reliable static features. A furthersimplification for indoor SLAM is the relative low speed of indoor vehicles, which meansthat the vehicle motion is kinematic rather than dynamic, the sensor scan does not sufferappreciable distortion, and the effects of timestamp discrepancies are negligible.

Outdoor environments are much less ideal and this section discusses the practical issuesconcerning outdoor SLAM in non-flat non-smooth environments. The particular context ofthis discussion is derived from experiments using a high-speed road vehicle equipped witha scanning range laser sensor. Finally, some comments are made regarding the role of GPSin outdoor localisation.

2.3.1 SLAM with a Scanning Range Laser

The experimental data used in this thesis was obtained from a 2-D scanning range laser(SICK PLS) mounted on the front bumper of a standard motor vehicle (see Figure 2.7).The laser returns a 180◦ planar sweep of range measurements in 0.5◦ intervals (i.e., 361range values in anti-clockwise order) with a range resolution of about ±50mm. The vehiclewas also equipped with wheel and steering encoders, but this information was used forcomparison purposes only.

Scanning laser was the sole information source for all experimentation in this thesis(SLAM and dead reckoning). This meant that successful operation required the existence


of visible static objects in the laser field-of-view, and these tests would certainly fail inenvironments that do not meet this basic criterion. It is important to realise that these ex-periments do not seek to present a complete and infallible SLAM system but to demonstratea single application of the theoretical methods developed in this thesis. System integrity,on the other hand, is dependent on multiple independent and redundant sub-systems withdifferent physical sensing properties and statistically observable failure conditions [117].

A planar scanning laser is a rather inappropriate sensor for outdoor environments, withthe main impediment to adequate interpretation of the laser data being non-planar sensormovement—laser tilt caused by acceleration of the vehicle and undulations in the groundsurface. This effectively alters the laser view plane so that it observes varying cross-sectionsof the environment. Thus, objects can appear and disappear erratically from the sensor field-of-view, and persisting objects can appear to change shape or shift location. A particularlytroublesome problem is ground sweeps, where the laser dips towards the ground and theviewed scene becomes intermittently swamped with ground returns.

2.3.2 Two Dimensional Projection of Non-Planar Environments

In this thesis, the world is modelled as planar and the vehicle pose is defined by posi-tion and heading (x, y, φ). The three-dimensional components of non-flat ground surfaces,therefore, result a distorted representation of the environment. For objects observed by ascanning laser sensor, the effect of 3-D motion is most pronounced for angular tilt as shownin Figure 2.8. In these diagrams, the sensed objects are vertical cylinders and the laser isrotated about the y-axis only, resulting in increasing range measurement distortion withincreasing distance in the direction of the x-axis. These biased measurements not only dis-tort the SLAM map but also hinder data association, particularly for more distant objects.Most significantly, these biases are correlated for all feature measurements in a scan, whichinvalidates the assumption of independent observation errors.

To compensate for tilt distortion, an explicit estimate of sensor tilt may be required toproject the essentially 3-D measurements onto the 2-D map plane. However, if the observedfeatures are non-vertical or have variable cross-sections, this correction would provide littlebenefit and a more complete 3-D model may become necessary.

In this thesis, 3-D distortions, and other unmodelled factors such as imprecise featureclassification, are incorporated into the observation model by expanding the observationuncertainty to cater for expected worst-case environment conditions. This means that theactual observation uncertainties used in practice are much greater than would be presumedfrom the sensor accuracy alone. Also, in regions that more closely approximate the 2-Dideal, the actual pose error tends to be considerably less than the estimated uncertainty.

2.3.3 Dead Reckoning in Rough Terrain

Wheel-encoder based odometry becomes unreliable in outdoor terrain due to the combinedinfluence of rugged ground surface and vehicle dynamics, which can invalidate the kinematicmodel assumptions of pure rolling motion (i.e., that all motion is due to wheel rotation).Wheel slip through loss of traction, skidding, or sliding results in a biased motion estimateand, at high speeds or over rough terrain, the amount of slip can be significant, particularly


(a) Side view

(b) Plan view

Figure 2.8: Trigonometric distortion. The side view (a) shows the effect of tilting the sensorwhen viewing a vertical object—the apparent distance to the object increases. In plan view(b), the objects appear to recede in the direction of the view-plane incline (the x-axis inthis case) causing distorted geometry.


for heavier (high inertia) vehicles. For vehicles with high-slip steering mechanisms, suchas articulated skid steering or tracked differential steering, slip can be a problem even atlow speeds. Fundamentally, reasonable modelling of wheel slip error is difficult given thecomplex physics of ground-wheel interaction.

The shape of the terrain is another biasing factor for odometry since, when modellingthe world as a plane, a non-flat ground surface causes the apparent distance travelled to begreater than the actual planar motion.

Augmenting odometry with an inertial navigation system (INS) can significantly al-leviate these problems by detecting slip [116, 86], tilt, and non-planar terrain [56]. Also,odometry and INS are complementary since encoder-based constraint information facilitateson-line attitude alignment of the INS platform [41]. However, while a low-cost INS is able toimprove certain high-frequency odometry faults, it cannot to observe low-frequency faultssuch as model biases. Thus, INS-encoder dead reckoning is suited to short-term estimatesbetween external (map-based) observations, but is not sufficient for accurate dead reckoningin the longer term.

An alternative to odometry is dead reckoning based on the incremental pose changesobtained from external sensors (e.g., optical flow [11]). Sensor-based dead reckoning removesdependence on precise kinematic modelling and, compared to odometry, the estimate errorstend to be less prone to bias. This thesis presents a laser-based dead reckoning method whichestimates the relative pose between sequential laser scans using the batch data associationalgorithm in Chapter 3. From scan to scan, the predicted relative pose is determined by asimple dynamic (constant velocity) model, which states that, for fixed time intervals, theexpected change-in-pose equals the previous change-in-pose. Thus, the data associationsearch-space is constrained by rough maximum acceleration bounds about the predictedvalue and a dead reckoning estimate can be computed provided the sequential scans possessa subset of common landmark information.

Sensor-based and odometric dead reckoning are not mutually exclusive and, used to-gether, can provide a level of redundancy. That is, agreeing estimates from the two systemscan be fused while disparities can signal faults such as wheel slip or feature tracking failure.

2.3.4 Ramifications of GPS

GPS (Global Positioning System) [126, 62, 6] is a satellite based localisation system thatis capable of providing a three-dimensional global location estimate at almost any point on(or above) the planet surface. It has important implications for outdoor SLAM because, atface value, it would appear to make this line of research irrelevant.

As a localisation mechanism, GPS is essentially an artificial (active) beacon a priorimap. Observation of the satellites enables position estimation via triangulation, with accu-racy of about 10 metres influenced by factors such as atmospheric conditions and geometryof the satellite configuration. The estimate errors are also temporally correlated, consistingof a noise component and a drifting bias component. A dramatic reduction in error is pos-sible through the use of a differential base station (a nearby stationary GPS receiver thattransmits GPS error). Differential GPS (DGPS) compensates for the noise and bias com-ponents of the GPS measurement enabling improved precision; standard DPGS is accurateto 1–2 metres and real-time kinematic DGPS to 1–2 centimetres.

2.4 Summary 29

Exclusive application of GPS for all outdoor localisation purposes is hindered, however,by the following problems. First, reception of the satellite signals requires direct line-of-sight to the GPS receiver. This prevents GPS-based localisation for underground or subseaoperations. It also means that GPS can often provide only intermittent coverage in areascontaining tall structures like buildings or trees. Second, the level of precision necessaryfor autonomous navigation (less than one metre) requires simultaneous line-of-sight of atleast five satellites. This is again hampered by the presence of occluding structures. Third,satellite signal reflection off objects in the environment can result in multipath errors, wherethe estimated satellite distance is greater than its true distance.6 Fault detection alone isreason enough for alternative redundant localisation methods. Finally, since GPS is notself-contained within the mobile robot system, localisation is dependent on uncontrollableexternal factors such as satellite availability7 and possible system downtime.

In environments where GPS is at least intermittently available, it remains a very usefullocalisation tool since the position estimates it provides are known to have bounded error.These estimates might be used in conjunction with other localisation or SLAM systems toimplement smooth and continuous navigation on a very large scale (e.g., see Section 5.2).However, in environments where GPS is unavailable or insufficient (as an a priori map), orwhere multiple redundant navigation loops are required, either alternative maps must bebuilt or the robot must build its own using SLAM.

2.4 Summary

This chapter introduces the fundamental background information for performing SLAM.It presents the four most common map representations: occupancy grids, feature maps,topological maps, and hybrid topological-metric maps; and discusses their relative meritsand utility for SLAM. In particular, it contends that occupancy grids are inappropriate forlong-term SLAM (the map will drift), feature maps are suited to small-scale SLAM, and atopological-metric hybrid is the most likely answer for SLAM in larger environments.

The traditional algorithm for stochastic SLAM is presented based on feature maps andthe EKF. This algorithm requires O(n2) computation and storage for a map of n features,and so several techniques are reviewed for improving its efficiency.

The experimental context of this thesis is described, where a standard road vehicle isfitted with a scanning range laser and driven through a series of outdoor environments. Theissues relating to outdoor SLAM with a laser sensor involve high speeds, uneven groundsurfaces, sensor tilt, geometric distortion, and degraded odometry. Finally, given the com-mon availability of GPS, it is argued that research into large-scale SLAM is still relevant,since GPS may not always operate (reliably) in all areas, and some applications requiremultiple redundant navigation loops.

6Some of these problems may not remain in the long term, as scheduled changes to GPS over the nextfew decades are expected to improve position accuracy and reduce susceptibility to signal blockage andmultipath [62, pages 71–76].

7Satellite availability refers to the availability of a sufficient number of satellite signals within the receiverline-of-sight. It does not refer to “selective availability”—the intentional introduction of errors into the GPSsignals for civilian use. For one, selective availability errors are implicitly removed when using DGPS. Fur-thermore, selective availability has “been discontinued since May 1, 2000. The prevention of the hostile useof GPS is [now] accomplished through other measures, such as selective and local denial of GPS signals” [6].

Chapter 3

Batch Data Association forCorrelated Feature Sets

Data association is arguably the most critical aspect of the SLAM algorithm. Correctcorrespondence of sensed feature observations to map landmarks is essential for consistentmap construction, and a single false match may invalidate the entire process.

Most feature-based SLAM implementations model landmarks as simple geometric prim-itives, such as points or lines, which means that the maps are represented by sets of virtuallyidentical features, distinguishable only by their locations. This representation lends itselfto the data association methods developed for target tracking where correspondence is con-strained by statistical geometric information. However, general target tracking problemspresume that the targets move independently and the observation-to-target associationsare processed without considering possible correlations in target motions. The implicationfor SLAM is that each observation is processed individually and association is based onuncertainties in the measurement, landmark locations, and vehicle pose. Importantly, if thevehicle pose is highly uncertain (compared to the density of targets) then data associationbecomes very fragile indeed.

This chapter investigates the target tracking data association problem where the targetmotions have known correlation. Specifically, the SLAM problem states that the landmarkshave zero relative motion and, therefore, become increasingly correlated with repeated ob-servations. Given a set of observations, the constraint information available when processedas a batch can greatly increase data association robustness. This chapter presents the fol-lowing topics regarding the feature-based data association problem.

• Current target tracking data association methods, for single and multiple targets, arereviewed. In particular, the traditional methods for managing ambiguous associationsare discussed.

• The technique of batch data association for correlated targets is introduced as amechanism for reducing association ambiguity.

• The combined constraint data association (CCDA) algorithm is presented. This graphtheoretic method constrains association with all available information to produce

3.1 Data Association For Target Tracking 31

mutually compatible association sets. Reliable data association becomes possible evenwith total uncertainty in observer pose.

• The CCDA algorithm is applied to real laser data to permit sensor-based dead reck-oning in outdoor environments.

3.1 Data Association For Target Tracking

Data association when tracking identical targets is determined by the correspondence be-tween the observation measurement and the predicted observation for a given target. Thissection discusses the statistical measure most commonly used for gauging the valid neigh-bourhood for association, and states when this constraint is not sufficient for discerningunique associations. For ambiguous associations, there exist a variety of track managementmethods in the target tracking literature [14, 6]. These methods have been developed fromtracking single targets, amidst clutter of spurious measurements, to tracking multiple tar-gets with batch observations in clutter. It should be remembered that none of the methodspresented in this section utilise the correlation information that may be available betweenthe targets.

3.1.1 Validation Gating

Since, within the EKF framework, both the target location estimates and target observa-tions are assumed to have Gaussian uncertainty (with infinite tails), it is possible that anygiven observation might correspond to any target. However, to reject unlikely associations,only targets within a reasonable neighbourhood of an observation should be considered.Association validation is usually performed in observation space, and a validation gate de-fines the maximum permissible discrepancy between a measurement zi and a predictedobservation zj = h (xj) for target xj .

The most common statistical validation gate is based on the normalised innovationsquared (NIS) [6] also known as the Mahalanobis distance [45]. Given an observation inno-vation νij = zi − h (xj) with covariance Sij , the NIS is defined as

Mij = νTijS

−1ij νij (3.1)

For a Gaussian distributed innovation sequence, the NIS forms a χ2 (chi-squared) distri-bution. The shape of the χ2 distribution is dependent on the dimension of the innovationvector as shown in Figure 3.1. The gate, therefore, is applied as a maximum NIS thresholdMij < γn, where the innovation is of dimension n. The integral of the χ2 distribution from 0to γn specifies the probability that, if zi is truely an observation of target xj , the associationwill be accepted.1

The NIS validation gate remains the underlying association threshold for the batch dataassociation method presented in this chapter.

1It is worth noting that the NIS gate does not actually say anything about the rejection of false associa-tions; it only specifies the probability of accepting (or rejecting) correct associations. Thus, although usedas a rejection mechanism, it does not explicitly define a statistical measure of false association rejection.


0 1 2 3 4 5 6 7

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1

2

3

4

Figure 3.1: χ2 distribution. The shape of this distribution, shown herefor 1 to 4 degrees of freedom, is determined by the dimension of theinnovation vector. The integral of this distribution from 0 to γn definesthe probability of accepting a correct association.

Example 3.1Associating a range-bearing measurement z = [r, θ]T to a point target x = [x, y]T . The pre-dicted observation is found from the observation model

z = h (x) =[ √

x2 + y2

arctan( y

x

) ]If the target estimate is x with covariance P and the observation covariance is modelled asR, then the observation innovation ν = z − h (x) has covariance

S = ∇hxP∇hTx + R

where the Jacobian ∇hx is

∇hx =∂h∂x

∣∣∣∣x

=

[x√

x2+y2

y√x2+y2

− yx2+y2

xx2+y2

]

With the innovation vector being of dimension 2, the gate

νTS−1ν < γ2 = 6.0

will accept 95% of correct associations, as 95% of the χ2 probability mass lies between 0 and6.0.


Figure 3.2: Validation gate. The NIS of the measurement zi and thepredicted observation h (xj) must be less than γn to be considered as acandidate assignment.

Figure 3.3: Ambiguous data association. If either a single observationfalls within the validation gate of multiple targets, or multiple observa-tions fall within the validation gate of a single target, there exists uncer-tainty as to the correct assignments. For example, either z1 or z2 couldbe assigned to x1, and z1 could be assigned to either x1 or x2.

The validation gate may be visualised as an ellipsoid in observation space centred aboutthe predicted observation h (xj) as shown in Figure 3.2. An acceptable observation mustfall within this ellipse. Data association ambiguity occurs if either multiple observationsfall within the validation gates of a particular target, or a single observation lies within thegates of multiple targets (see Figure 3.3). Furthermore, it is possible that an observationmight arise from clutter or non-tracked targets leading to false associations even with thesatisfaction of unique gating conditions.

The simplest method for reducing ambiguous associations is to reduce the validationgate threshold γn. This increases the rejection ratio of outlier measurements but, effectively,serves to ignore good information while still allowing false associations to slip through. Inrecent target tracking literature, a number of mechanisms have been developed to appro-priately deal with data association ambiguity, while maintaining suitable gate thresholds(95% acceptance of true measurements, for example).


3.1.2 Tracking a Single Target in Clutter

The most basic form of target tracking is estimating the motion of a single point target withthe presence of spurious measurements or clutter.2 Single target tracking introduces sev-eral ambiguity management methods that are later developed for multiple target, multipleobservation problems.

The most common ambiguity resolution method is nearest neighbour data association.Given a set of observations z = {z1, . . . , zn} within the validation gate of target x, a likeli-hood of association Λi can be calculated for each zi ∈ z.

Λi =1

(2π)n/2√|Si|

exp(−1

2νT

i S−1i νi

)(3.2)

where n is the dimension of the innovation vector. This is simply a likelihood functionof Gaussian contact [125, page 35]. An equivalent metric, the normalised distance Ni, isobtained by taking logs of Equation 3.2.

Ni = νTi S−1

i νi + ln |Si| (3.3)

Nearest neighbour data association then selects the observation that minimises Ni (or max-imises Λi). Note that nearest neighbour is sometimes referred to as maximum likelihood dataassociation because of the equivalence of Equations 3.2 and 3.3. In environments where thedensity of clutter is much less than the target location and measurement uncertainties,nearest neighbour works quite well. However, its performance deteriorates rapidly in envi-ronments with relatively high clutter density. In practical terms, nearest neighbour offerssimilar performance to reduction of the validation gate threshold.

A method designed to explicitly incorporate the informational uncertainty induced byambiguous associations is probabilistic data association (PDA) [7]. Also known as the all-neighbours approach, PDA combines the information from all the candidate associations asa weighted average of the individual updates. The uncertainty of this combined update ismodified to represent the association uncertainty.

A third ambiguity management approach is multiple hypothesis tracking (MHT) [113].MHT delays the application of hard observation-to-target assignment by forming an indi-vidual track for each ambiguous association. This method, termed track splitting, viewseach candidate observation as a viable track hypothesis (plus a further hypothesis thatall observations in the gate are false). With subsequent observations, new hypotheses areformed and old ones are pruned based on a likelihood function. The effect of MHT is todefer irrevocable data association decisions until further information arises that removesthe ambiguity (i.e., through compatibility with subsequent motion observations).

3.1.3 Tracking Multiple Targets in Clutter

Multiple target tracking with multiple simultaneous observations develops the target track-ing problem to deal with ambiguities due to both spurious measurements and known targets.

2Clutter is defined as the presence of false observation returns. Typically these are assumed to form aPoisson distribution within the validation gate of the predicted target location. If, however, the returns arenot random occurrences but are produced by discrete interfering sources [55], then they do not behave asclutter and should be tracked as targets in their own right.


Figure 3.4: Injective assignment example. The validated observation-to-target associations are indicated by connecting lines. The set{E1, . . . , E5} of all maximal assignment sets complying with the injec-tive mapping constraint is as follows (E1 is pictured in bold above):E1 = {(z1,x2), (z2,x3), (z3,x4)}, E2 = {(z1,x2), (z2,x1), (z3,x4)},E3 = {(z1,x3), (z2,x1), (z3,x4)}, E4 = {(z1,x2), (z2,x4)}, E5 ={(z1,x3), (z2,x4)}.

The main extension introduced with multiple target tracking is enforcement of the injec-tive (or one-to-one) mapping constraint, such that no two observations in a batch may beassigned to the same target. In other words, each observation must map to a unique targetor be classified as unknown (clutter or new target).

The multiple target equivalent of nearest neighbour data association is optimal assign-ment, which operates as follows. Let Ek = {e1, . . . , en} represent a maximal set of com-patible (injective) assignment pairs such that em = {i, j} stands for assignment zi to xj

(see Figure 3.4, for example). Notice that not all Ek are the same size but are maximal inthe sense that, given the assignments already contained in the set, no further assignmentscan be added. Optimal assignment finds the set Ek that maximises the product of thelikelihoods Λij . ∏

{∀em∈Ek}Λem (3.4)

which is equivalent to maximising the sum of the log-likelihoods.∑{∀em∈Ek}

ln Λem (3.5)

or minimising the sum of normalised distances Nij .∑{∀em∈Ek}

Nem (3.6)

The method generally used to maximise the log-likelihood sum while ensuring one-to-one as-signment is maximum-weight bipartite graph matching [35]. Basic renditions of this methodassume the bipartite graph is complete (any observations might validly be assigned to any

3.2 Batch Data Association For Correlated Feature Sets 36

target), but extensions have been proposed to cater for the existence of false alarms (spuri-ous observations) and non-detections (non-observed targets) [34]. However, models of falsealarm probabilities are application specific and often of little value so, if the bipartite graphis incomplete, it is generally preferable to select the optimal assignment set only from theEk’s of equal maximum size.

Notice that, while optimal assignment uses the injective mapping constraint to preventconflicting associations, it does not use the concurrent existence of the observation set toinfluence the value of the association likelihoods. These likelihoods are still calculated on anindividual (isolated) basis. An alternative method is proposed in [134], which adjusts theindividual assignment probabilities as a batch event, and produces a modified set of likeli-hoods called the joint assignment matrix (JAM). This algorithm introduces the mappingconstraint into the likelihood calculation so that the likelihood of a particular observation-to-target pairing is “the sum of the probabilities of all [sets of] assignments containing thepair, normalised by the sum of the probabilities of all assignments” [134, page 134].

Λ′em

=

∑{Ek|∃em∈Ek}

∏{∀ei∈Ek} Λei∑

{∀Ek}∏

{∀ei∈Ek} Λei

(3.7)

The resulting constrained association likelihoods more accurately reflect the combined in-fluence of the observation set, and can be used to obtain better optimal assignment resultswhen substituted into Equation 3.4.

Extensions to PDA and MHT have also been proposed to incorporate the injectivemapping constraint. The multiple target version of PDA, joint probabilistic data asso-ciation (JPDA), again utilises all valid associations in a weighted average update but ismodified to account for some observations having non-unique correspondence possibilities.The injective mapping constraint means that both JPDA and MHT operate with sets ofmutually compatible association pairs, rather than with individual assignment pairs. Forthe sake of efficiency, the number of association sets is usually limited to the k most likelysets of hypotheses, which are found using an extension of optimal assignment [35].

3.2 Batch Data Association For Correlated Feature Sets

Stochastic SLAM is a form of multiple target tracking problem where the targets are knownto have zero relative motion. This is a specific case of a more general target tracking categorywhere the target set (or the observation set) is internally correlated. However, none ofthe traditional data association methods exploit these correlations to reduce ambiguity.This section presents methods for batch data association that capitalise on the correlationspresent within a set of features.

Data association requires the transformation of the two data sets into the same coordi-nate space (usually observation space). If either data set is internally correlated when theyare transformed to a common coordinate space, then the batch data association methodsdescribed in this section are able to produce better results than is possible with individualvalidation gating.


Caveat. It is important to make a distinction between ambiguity reduction and ambi-guity management. The NIS gate is an ambiguity reduction mechanism, as it constrainsthe number of observation-to-target associations according to a specified probability of ac-ceptance. (Recall that Gaussians have infinite tails and, without the gate, an observationmight correspond to any target.) On the other hand, the methods described in the previ-ous section—nearest neighbour, JPDA, MHT—are ambiguity management techniques, asthey attempt to resolve any conflicting data associations that remain after gating. Thus,ambiguity reduction implies reducing the number of association possibilities, and ambigu-ity management implies determining how to resolve ambiguous association possibilities.The batch data association methods presented in this section fall into the category ofambiguity reduction. They are essentially multiple-target versions of the NIS gate thatenforce stronger constraints by incorporating additional correlation information. This re-sults in fewer possible observation-to-target correspondences given the same probability ofacceptance. However, ambiguous associations may remain and, for these, the traditionalambiguity management techniques must still be applied (e.g., see Section 3.3.7).

3.2.1 Joint Compatibility Branch and Bound

The joint compatibility branch and bound (JCBB) algorithm [105, 106] generates tentativesets of associations and searches for the largest set that satisfies joint compatibility. Thesearch is performed by incrementally constructing an “interpretation tree” (see [63]) of thesolution space, which enables efficient search space pruning.

For a given set of association pairs, joint compatibility is determined by calculating asingle joint NIS gate. The benefit of joint compatibility is that it preserves the correla-tion information within the set of observations and predicted observations. Consider a setof observations z = [z1, . . . , zn]T with covariance model R and a set of target estimatesx = [x1, . . . , xm]T with covariance P. A tentative set of associations Ek = {e1, . . . , ej} ischosen, subject to the individual validation and injective mapping constraints. Let theassociation pair for ei be denoted zei and xei , such that the joint observation is given by

zEk= [ze1 , . . . , zej ]

T (3.8)

with covariance REk, and the joint predicted observation is as follows.

zEk= hEk

(x) =

he1 (x)

...hej (x)

(3.9)

The joint innovation and innovation covariance are then calculated as

νEk= zEk

− zEk(3.10)

SEk= ∇hxP∇hT

x + REk(3.11)

where the Jacobian ∇hx =∂hEk

∂x

∣∣∣x. The joint validation gate, therefore, is found to be

MEk= νT

EkS−1

EkνEk

< γn (3.12)

where the value of n is equal to the dimension of the joint innovation vector.


Example 3.2Joint compatibility for SLAM. At a given instant in time, the SLAM augmented state vectoris defined by xa = [xT

v ,xTm]T where xv is the vehicle pose [xv, yv, φv]T and the map contains

three point features xm = [x1, y1, x2, y2, x3, y3]T . This SLAM estimate is denoted xa withcovariance Pa. In this moment, a set of range-bearing observations z = [r1, θ1, r2, θ2, r3, θ3]T

is obtained. These measurements are modelled as uncorrelated with covariance

R =

R1 0 0

0 R2 00 0 R3

where Ri =

[σ2

riσ2

riθi

σ2riθi

σ2θi

]

A hypothetical association set Ek = {(z1,x2), (z3,x1)} is chosen, and so the tentative jointobservation is given by zEk

= [r1, θ1, r3, θ3]T with covariance

REk=[

R1 00 R3

]

and the joint predicted observation is

zEk= hEk

(xa) =[

h2 (xa)h1 (xa)

]=

√(x2 − xv)

2 + (y2 − yv)2

arctan(

y2−yv

x2−xv

)− φv√

(x1 − xv)2 + (y1 − yv)

2

arctan(

y1−yv

x1−xv

)− φv

The joint validation gate can then be calculated trivially from Equations 3.10 to 3.12. Theχ2 threshold for a 95% acceptance probability of true association sets is γ4 = 9.5.

The solution space for joint compatibility is depicted as an interpretation tree, whereeach level of the tree defines the set of possible associations for a particular observationand each descending path from the tree root represents a set of injective associations.Continuing from the example in Figure 3.4, the interpretation tree is pictured in Figure 3.5.The branch-and-bound algorithm in [106] constructs the interpretation tree incrementallyusing depth-first search with a maximum likelihood branching heuristic.3 At each level,a new node is added to the joint association set and tested for joint NIS. If a node ona given level fails the joint validation test, then its child nodes are also invalid and neednot be searched. Also, if a descending path has insufficient non-null nodes to produce anassociation set greater than or equal to the current maximum, it is not searched. Theconcept of JCBB is illustrated in Figure 3.5 where the search is marked in bold (with theleftmost branch being searched first).

In [106], the maximum set of valid joint associations is assumed to be the correct dataassociation hypothesis. However, the process of searching through the interpretation tree

3The branching heuristic used in [106] is simply the minimum joint NIS. However, a more correct likelihoodmeasure is the maximum joint likelihood ΛEk , or minimum joint normalised distance NEk , which may beobtained by substituting νEk and SEk into Equation 3.2 or Equation 3.3, respectively.


Figure 3.5: Interpretation tree. This figure shows the entire joint com-patibility solution space for the assignment example from Figure 3.4.Each level of the tree pertains to a particular observation and eachnode represents an observation-to-target association (these include pos-sible non-assignments). Each descending path represents an alternativejoint association hypothesis. An example maximum joint compatibilitybranch-and-bound search is shown in bold, with the best result beingEk={(z1,x3), (z2,x1), (z3,x4)}.

generates many alternative joint compatible association sets of maximum, or near maximum,size. The coexistence of these alternatives embodies batch data association ambiguity, whichcan be resolved using traditional ambiguity management methods. For instance, the nearest-neighbour association set is the set with maximum joint likelihood ΛEk

. Note, however, thatthe interpretation of joint likelihood is complicated by the different sizes |Ek| of the variouscandidate sets. This is the same problem as discussed in Section 3.1.3 regarding optimalassignment. If the joint likelihoods of unequal sized assignment sets are to be compared,then probability models for false alarms, target non-detections, and new target detectionsare required; and it is generally difficult to reasonably model these factors. Thus, as withoptimal assignment, the best option is usually to select the maximum likelihood result fromthe set of valid association sets of maximum size.

3.2.2 Maximum Common Subgraph

The maximum common subgraph (MCS) batch data association algorithm [5, 4] is a graph-theoretic approach to finding sets of jointly compatible associations.4 Each data set isdescribed by a complete graph (i.e., where each node is connected to every other node)such that the nodes represent features, defined by feature type and characteristics, and

4MCS-based data association was first proposed in the early 1980s as an object recognition tool [16, 72],but was not previously applied to the problems of multiple target tracking or robot localisation. Morerecently, a pair-wise constraint interpretation tree algorithm has been applied to mobile robot localisation[104, 26], which produces equivalent association results to MCS (including the case where a priori poseinformation is unavailable). However, the branch and bound search of this algorithm is considerably lessefficient than a maximum clique implementation of MCS.


Figure 3.6: Common subgraphs. The two data sets form complete graphswhere the nodes are defined by the feature types and the edges are theconstraints between features. The common subgraphs shown result fromthe matching of constraints {C45, C34, C35} from the first graph to con-straints {C12, C24, C14} of the second (along with matching node types).

the edges define pair-wise relational constraints between features. For example, a nodemight represent a point landmark and an edge the distance between two points. In thisthesis, these edges are termed relative constraints. The MCS algorithm compares the twofeature graphs and finds the maximum complete subgraphs that possess compatible nodesand edges (e.g, see Figure 3.6). Basically, this operation involves matching nodes and edgesto find matching subgraphs (see [136, 98, 120, 28] for MCS search algorithms).

The MCS data association algorithm presented in [5] operates purely through matchingrelative constraints and feature types, which has the effect of enabling batch data associationwithout a priori knowledge of the relative pose between the data sets.5 However, when poseinformation is available, a greater degree of association constraint can be achieved throughits application. In [4] pose constrained MCS is introduced, which uses the a priori relativepose information to constrain feature associations between the two data sets (in the samemanner as traditional target tracking). Thus, the compatibility of nodes between subgraphsbecomes constrained by the proximity of features from one data set to the predicted locationsof features from the other, in addition to compatibility of feature type and characteristics.In this thesis, constraints on node compatibility between data sets are collectively termedabsolute constraints.

The main advantages of the MCS data association method are that the constraints arepair-wise, may be of arbitrary type, and may be incomplete. For example, the relativeconstraint between two points might be distance, while the relative constraint between twoline features might be the subtended angle. Similarly, an absolute constraint might be thetype circle with radius r. Incomplete constraint information means that edge and nodematching between the two graphs may be partially unrestricted. That is, an unknown rela-tive constraint (or unconstrained edge) is compatible with all other edges, and an unknown

5For the robot localisation problem, where one data set is the map and the other is the observation scan,the “relative pose” between the two data sets is simply the robot pose with respect to the map coordinateframe. Thus, not having a priori knowledge of the relative pose means the robot location is initially unknown.

3.3 Combined Constraint Data Association 41

absolute constraint (or unconstrained node) may correspond to any other node. The ef-fect of incomplete information is most apparent in two extreme cases, where there are onlyrelative constraints or only absolute constraints. In the former, MCS data association isdetermined purely by relational similarity (this is basically the algorithm presented in [5]).The latter case is simply individual assignment limited by the injective mapping constraint,and has similar behaviour to the optimal assignment algorithm.

MCS and JCBB are similar in that they both utilise all the available correlation informa-tion to perform batch data association. JCBB does this by forming association hypothesesand searching an interpretation tree. MCS instigates a two-stage process of (i) compilingpair-wise constraints and (ii) searching for the largest compatible association set. The rel-ative advantages of these two approaches, when given an a priori estimate of the relativepose, have not been properly investigated. A probable outcome will be that JCBB is sim-pler to implement while MCS offers more flexible constraint options. However, MCS offersone significant advantage in its ability to provide a most likely batch association set whenthe relative pose is initially unknown or very uncertain.

3.3 Combined Constraint Data Association

This section presents a development of the MCS batch data association algorithm calledcombined constraint data association (CCDA). The CCDA algorithm integrates the appli-cation of relative and absolute constraints into a unified EKF framework.

The fundamental data structure of the CCDA algorithm is the correspondence graph(CG) [8], which represents valid associations between the two data sets (see Figure 3.7).Complete subgraphs (or cliques) within the CG indicate mutual association compatibilityand, by performing maximum clique search, the largest joint compatible association set maybe found.6

Construction of the CG is performed through the application of relative and absoluteconstraints. The nodes of the CG indicate individual association compatibility and aredetermined by absolute constraints (e.g., node 1 is valid if feature a1 may correspond to b1).The edges of the CG indicate joint compatibility of the connected nodes and are determinedby relative constraints (e.g., if the relative constraints between {a1, a2} and {b1, b2} match,then node 1 is connected to node 5).

Explicit examples of relative and absolute constraints are presented in this section,for the range-bearing SLAM problem with point location landmarks. These constraintsare statistical thresholds based on the NIS validation gate. Application of more generalconstraint forms (including non-statistical constraints such as feature type) can be inferredin light of these examples.

6Note, as mentioned previously, it is not possible to compare association sets of different sizes withoutmodels of non-detection, false-detection, etc. Here, as elsewhere [14, 106], the set with the greatest numberof compatible associations is assumed to represent the maximum likelihood association set—hence the searchfor the maximum clique. In reality, each clique implies an alternative possible association set and resolutionof these ambiguities is addressed in Section 3.3.7.


Figure 3.7: Correspondence graph. The (individually) possible associa-tions between feature sets, A and B, are depicted as nodes. Since, in thisexample, no absolute constraints exist, all combinations of associationpairs are valid. The edges of the CG define which pairings are mutuallycompatible, dependent on injective mapping and relative constraints. Themaximum clique of this graph is shown in bold.

3.3.1 Calculation of Relative and Absolute Constraints

In the previous discussions of target tracking data association, the NIS was calculatedin range-bearing observation space. However, in this section, the set of observations aretransformed to Cartesian space, with the observer as the coordinate origin. Since eachobservation zi is assumed uncorrelated to the others, the polar-to-Cartesian transformationcan be performed individually.

zCi = f (zi) =[

ri cos θi

ri sin θi

](3.13)

RCi = ∇fziRi∇fTzi

(3.14)

The two data sets, therefore, are the augmented SLAM state vector xa = [xTv , xT

m]T withcovariance Pa, and the batch observation vector zC with covariance RC .

Relative constraints concern each data set separately, so let x with covariance Prepresent either the state or observation data. The invariant property between the point


features (xi, yi) and (xj , yj) in x is their relative distance7

dij (x) =√

(xi − xj)2 + (yi − yj)

2 (3.15)

with scalar varianceσ2

ij = Pd = ∇dxP∇dTx

where the Jacobian ∇dx is given by

∇dx =∂dij

∂x

∣∣∣∣x

=[

0 . . . 0 xi−xj

dij(x)yi−yj

dij(x) 0 . . . 0 − xi−xj

dij(x) − yi−yj

dij(x) 0 . . . 0]

A more efficient form of this calculation can be derived considering the sparseness of theJacobian. First the relevant portions of the covariance matrix are extracted to produce areduced form8

Pr =[

Pxi Pxixj

PTxixj

Pxj

](3.16)

The distance variance can now be calculated as

σ2ij = ∇dxrPr∇dT

xr(3.17)

where the Jacobian ∇dxr is given by

∇dxr =[

xi−xj

dij(x)yi−yj

dij(x) − xi−xj

dij(x) − yi−yj

dij(x)

](3.18)

As relative constraints make no distinction between the observation and state information,let the two data sets be represented by set A and set B. Let CAi denote a relative constraint{dpq, σ

2pq

}(�{dAi , σ

2Ai

}) from set A and CBj denote a relative constraint

{drs, σ

2rs

}from

set B. The constraints CAi and CBj are said to match if they satisfy the NIS threshold

Mij =

(dAi − dBj

)2σ2

Ai+ σ2

Bj

< γ1 (3.19)

Absolute constraints determine individual compatibility across the two data sets. Forthe SLAM problem, this is determined by the NIS between an observation zCi in zC and afeature xj in xa. The predicted observation is modelled as

zCj = hj (xa) =[

(xj − xv) cos φv + (yj − yv) sin φv

−(xj − xv) sin φv + (yj − yv) cos φv

](3.20)

The innovation and innovation covariance between observation and predicted observationare as follows.

νij = zCi − hj (xa) (3.21)

Sij = ∇hxaPa∇hTxa

+ RCi (3.22)

7An alternative relative constraint between xi and xj is dij (x) = [xi − xj , yi − yj ]T , which is likely to

give better results since it is a linear constraint model. The NIS gate in Equation 3.19 would be γ2.8Notice that the cross-correlation terms Pxixj are utilised in the calculation of relative constraints. It is

this information that makes CCDA more constrained than non-batch methods, like optimal assignment.



∇hxa =∂hj

∂xa

∣∣∣∣xa

=[ − cos φv − sin φv −(xj − xv) sin φv + (yj − yv) cos φv

sin φv − cos φv −(xj − xv) cos φv − (yj − yv) sin φv

0 . . . 0 cos φv sin φv 0 . . . 00 . . . 0 − sin φv cos φv 0 . . . 0

]

As with the relative constraint calculations, the sparse nature of the Jacobian permits moreefficient calculation of the innovation covariance by extracting a reduced form of the statecovariance matrix.

Sij = ∇hxrPr∇hTxr

+ RCi (3.23)

Pr =[

Pv Pvxj

PTvxj

Pxj

](3.24)

∇hxr =[ − cos φv − sin φv −(xj − xv) sin φv + (yj − yv) cos φv cos φv sin φv

sin φv − cos φv −(xj − xv) cos φv − (yj − yv) sin φv − sin φv cos φv

](3.25)

The absolute constraint of NIS validity is therefore calculated as

Mij = νTijS

−1ij νij < γ2 (3.26)

3.3.2 Set and Graph Terminology

The pseudocode modules in this section are described using set and graph notation. Themeanings of these notations in the context of the CCDA algorithm are listed below.

A set S is a collection of elements {s1, . . . , sn} with the number of elements given by|S|. An empty set is denoted ∅. The existence of element si in set S is denoted si ∈ S. Twosets are combined by their union S3 ← S1 ∪ S2. Thus, new elements may be added to a setas S ← S ∪ {snew}. The set difference S3 ← S1 \ S2 assigns to S3 the elements in S1 thatare not in S2. This operation can be used to remove an element from a set as S ← S \ {si}.The intersection of two sets S3 ← S1 ∩ S2 defines the set of elements common to both sets.Most of the sets specified in the code modules below are ordered sets (also known as vectorsor arrays) as they require a specific iteration order. Adding new elements to these setsby the union operator ∪ implies addition to the back of the set. Similarly, the differenceoperator \ removes elements from the back of an ordered set.

An undirected unweighted graph G = {V, E} is defined as a set of nodes (or vertices)V = {v1, . . . , vn} and a set of edges E = {e1, . . . , em} where each edge ek = {vi, vj} repre-sents a connection between two nodes. Two nodes vi and vj are adjacent if there exists anek = {vi, vj} ∈ E. The set of nodes adjacent to node vi is given by adj(vi) which representsthe set of all vj ∈ V for which there exists ek = {vi, vj} ∈ E. Each node vi = {x, adj(vi)}represents a stored item x and links to its adjacent nodes. A graph G = {V, E} is completeif all its vertices are adjacent. A clique is a complete subgraph of G. A clique GM ⊂ Gis maximal if no additional node in G is adjacent to all the nodes contained in GM . Thelargest maximal clique in G is termed the maximum clique (and there may exist severalmaximum cliques).


3.3.3 Correspondence Graph with Unknown Vehicle Pose

The construction of the CG is presented here for the case where there exist no absoluteconstraints.9 This situation is encountered with pose initialisation in a (partially) knownmap, and recovery from localisation failure where the robot has become lost.

Data association based solely on relative constraints operates by finding subsets of mapfeatures that possess similar geometric configuration to subsets of the observed feature set.An example of CCDA with only relative constraints is shown in Figure 3.8. Both figuresrepresent the same two feature sets extracted from real scanning laser data but the bottomfigure depicts a false association set due to environmental symmetry. In both cases, therelative pose between the two feature sets is calculated subsequently using the method inAppendix C.3. Usually, as in this example, the correct association set is larger than forincorrect hypotheses, but this may not always be the case and, in general, symmetry-basedambiguities must be resolved using MHT.

The CG construction algorithm CorrespondenceGraph(A, B) in Algorithm 3.1 takesas input two feature sets A = {xA,PA} and B = {xB,PB}. The set of nodes for the un-connected CG are generated via the function MakeNodeSet(m, n, G) (see Algorithm 3.2).Each node represents a possible association and, since there are no absolute constraints,this means that mn nodes are created. The node set V is ordered such that each nodevk = {ai, bj} can be found as k = (ai − 1)n + bj .

The generation of relative constraints for each data set is performed via the functionMakeRelativeConstraints(F ) in Algorithm 3.3. Given a set of features F = {x,P},a constraint Ck = {i, j, dij , σ

2ij} is calculated for each combination of feature pairings. In

Algorithm 3.1, the second constraint set CB is subsequently ordered via a sorting functionSort(CB), which arranges the elements CBk

∈ CB so that the distance estimates dBkare

in ascending order. This enables efficient constraint matching between the two data sets.CG edges are generated by matching the two relative constraint sets CA and CB as

shown in Figure 3.9. To permit search efficiency, the matching region is restricted by afixed bound ∆d since, without this restriction, the constraint CAi would have to be checkedagainst all CBk

∈ CB. The consequence of ∆d is that some matching constraints in CB arenot tested. This is a reasonable tradeoff, in practice, as the efficiency gained is substantialand association possibilities rejected by the bounded search tend to be unlikely.

The details of constraint matching are provided in Algorithm 3.4 where the functionMatchConstraints(CA, CB, G) takes as input the two relative constraint sets and theunconnected graph G = {V, ∅}. For each constraint CAi ∈ CA, the constraints {CBj} ⊂ CB,satisfying dAi − ∆d ≤ {dBj} < dAi + ∆d, are checked as follows. First, a binary search10 isperformed using the function UpperBound(CB, dAi − ∆d) which returns the index j of thefirst constraint in CB with dBj ≥ dAi − ∆d. Each ensuing CBj satisfying dBj < dAi + ∆d isthen compared using the function CheckConstraint(CAi , CBj ), shown in Algorithm 3.5,which calculates the NIS gate given in Equation 3.19.

Let CAi be the relative constraint between feature pair {ar, as} and CBj the pair {bp, bq}.9Features are assumed to be identical point targets in this discussion and absolute constraints based on

feature distinctiveness are not considered. However, addition of these constraints, as for pose-based absoluteconstraints, is straightforward using the method presented in Section 3.3.4.

10The C++ STL function upper bound() performs the form of binary search task described.


−15 −10 −5 0 5 10 15 20 25 30

−15

−10

−5

0

5

10

15

20

12

features seen from 2features seen from 1associated features

−5 0 5 10 15 20 25 30 35 40

−15

−10

−5

0

5

10

15

20

2

1

Figure 3.8: CCDA with relative constraints. The above figures (dimensions in metres) showtwo association possibilities for the same two sets of real data. The relative pose betweenthese data sets is not known a priori, and both contain large numbers of spurious features.The location of the observer for each set is indicated by a triangle, and their positionsdepict the estimated relative pose derived from association. The top figure shows a correctassociation set and the bottom figure shows an incorrect association set.


Algorithm 3.1: CorrespondenceGraph(A, B)

G ← {V = ∅, E = ∅}G ← MakeNodeSet(|A| , |B| , G)CA ← MakeRelativeConstraints(A)CB ← MakeRelativeConstraints(B)Sort(CB)G ← MatchConstraints(CA, CB, G)return (G)

Algorithm 3.2: MakeNodeSet(m, n, G)

for i ← 1 to m

do{for j ← 1 to ndo V ← V ∪ {i, j}

return (G)

Algorithm 3.3: MakeRelativeConstraints(F )

C ← ∅for i ← 1 to |F |

do

for j ← i + 1 to |F |

do

dij ← Equation 3.15σ2

ij ← Equation 3.17

C ← C ∪{

i, j, dij , σ2ij

}return (C)


Algorithm 3.4: MatchConstraints(CA, CB, G)

for i ← 1 to |CA|

do

d ← dAi ∈ CAi

j ← UpperBound(CB, d − ∆d)while (dBj ∈ CBj ) < (d + ∆d)

do

if CheckConstraint(CAi , CBj )then G ← MakeEdges(CAi , CBj , G)

j ← j + 1return (G)

Algorithm 3.5: CheckConstraint(CAi , CBj )

Mij ←(dAi

−dBj

)2

σ2Ai

+σ2Bj

return (Mij < γ1)

Algorithm 3.6: MakeEdges(CAi , CBj , G)

k1 ← NodeIndex(ar ∈ CAi , bp ∈ CBj )k2 ← NodeIndex(as ∈ CAi , bq ∈ CBj )E ← E ∪ {vk1 , vk2}k1 ← NodeIndex(ar ∈ CAi , bq ∈ CBj )k2 ← NodeIndex(as ∈ CAi , bp ∈ CBj )E ← E ∪ {vk1 , vk2}return (G)

Algorithm 3.7: NodeIndex(ai, bj)

k ← (ai − 1)n + bj

return (k)


Figure 3.9: Relative constraint matching. The elements of constraint setCB are sorted in ascending order according to their distance values dBj

(representing the distance between features {bp, bq}). The variance σ2Bj

of each constraint is signified here by a vertical line. Matching constraintCAi to elements of CB is performed as follows. The first element dBj

greater than dAi − ∆d is found by binary search. A check for matchis then performed with each element CBj possessing dBj in the regiondAi − ∆d to dAi + ∆d.

A valid matching means that the associations {ar, bp} and {as, bq} are mutually compatible,and {ar, bq} and {as, bp} are mutually compatible (although, clearly, these two hypothesesare not compatible with each other). This results in two edges being added to the CGvia the function MakeEdges(CAi , CBj , G) in Algorithm 3.5. The indices of the connectednodes are found in O(1) time using the order of set V as shown in Algorithm 3.7.

The computational complexity of Algorithm 3.4 is not readily apparent because theinner while loop is non-deterministic. On average the while loop performs approximately kiterations, where k is dependent on n, ∆d, and the distribution of

{dBj

}in CB. An empirical

evaluation of k, as a function of n, is shown in Figure 3.10. These measurements are takenfrom the laser-based dead reckoning experiment presented in Section 3.4 using data fromthe internal road environment (see Appendix A). From these results, the complexity of kappears to be quadratic with respect to n, for fixed ∆d. However, over the experimentalrange of n, this complexity is bounded above by a small-constant linear bound k = 2n. Thisis a substantial improvement over the brute force complexity k = n2−n

2 for comparing CAi

to every CBj ∈ CB.The total complexity of Algorithm 3.1, therefore, is O(mn + m2 + n2 + n2 log n2 +

m2 log n2 + km2), which may be simplified to O(n2 log n2 + km2). If it is assumed thatm ≈ n, this can be further simplified to O(km2).

3.3.4 Correspondence Graph with Partially Known Vehicle Pose

With the existence of partial knowledge of the observer pose, absolute constraints can beincorporated into the CG construction process. Absolute constraints minimise the prob-lem of environmental symmetry by rejecting associations that fall outside the threshold ofobserver pose uncertainty. As a result, only local symmetries, that fall within the pose


0 10 20 30 40 50 60 700

20

40

60

80

100

Figure 3.10: Inner loop of CG construction. Empirical measure-ments for the number of iterations k of the while loop in the functionMatchConstraints(CA, CB, G) are shown as a function of n, wheren is the number of features in set B (and ∆d = 1.0 metre). The bruteforce complexity k = n2−n

2 and the linear bound k = 2n are shown forcomparison.

uncertainty region, remain ambiguous.Two CG construction algorithms are presented in this section. The first applies absolute

constraints after each relative constraint match, and is a simple extension to the methodin Section 3.3.3. The second applies absolute constraints prior to the relative constraintmatching process, and is a very efficient implementation for situations where the vehicle isalready reasonably well localised.

Method 1: Post application of absolute constraints

Let the data set A represent the observation data set {zC ,RC}, and the set B representthe augmented state estimate {xa,Pa}. The following algorithm differs from the algorithmin Section 3.3.3 only in the implementation of the function MakeEdges(CAi , CBj , G) (seeAlgorithm 3.6). The new version of this function checks the node validity for each edgeentry and inserts a new edge into the CG only if the nodes satisfy Equation 3.26. For thepurpose of this calculation, the function is additionally passed the data sets A and B asshown in Algorithm 3.8. The absolute constraint check is performed in Algorithm 3.9 bythe function CheckNode(ai, bj , A, B), which takes as input the indices of two features ai

and bj and the feature sets A and B. It returns a Boolean value indicating whether theabsolute constraint for the association ai → bj is satisfied.

Checking the absolute constraint for each node is an O(1) computation and, if imple-mented as a lookup table so that Mij for each node is evaluated at most once, Algorithm 3.9can be performed very efficiently. Thus, the complexity of the correspondence graph con-struction remains O(n2 log n2 + km2) as for the unknown observer pose algorithm in theprevious section.


Algorithm 3.8: MakeEdges(CAi , CBj , A, B, G)

if CheckNode(ar, bp, A, B) and CheckNode(as, bq, A, B)

then

k1 ← NodeIndex(ar, bp)k2 ← NodeIndex(as, bq)E ← E ∪ {vk1 , vk2}

if CheckNode(ar, bq, A, B) and CheckNode(as, bp, A, B)

then

k1 ← NodeIndex(ar, bq)k2 ← NodeIndex(as, bp)E ← E ∪ {vk1 , vk2}

return (G)

Algorithm 3.9: CheckNode(ai, bj , A, B)

νij ← Equation 3.21Sij ← Equation 3.23Mij ← νT

ijS−1ij νij

return (Mij < γ2)

Method 2: Prior application of absolute constraints

If the vehicle pose uncertainty is small, then the set of valid associations can be largelydetermined from absolute constraints alone. Association set consistency can be subsequentlyenforced by applying relative constraints to a greatly reduced solution space.

This version of CorrespondenceGraph(A, B) is shown in Algorithm 3.10. The setof valid nodes V is generated via the function MakeNodeSet(A, B) (see Algorithm 3.11),which applies absolute constraints using the function CheckNode(ai, bj , A, B) shown pre-viously in Algorithm 3.9. This node generation process takes O(mn) operations.11

The mutual compatibility of each node-pair {vi, vj} is then tested using the functionCheckConstraint(vi, vj , A, B) in Algorithm 3.12. This function first applies the injec-tive mapping constraint by rejecting nodes vi = {ar, bp} and vj = {as, bq} if either ar = as

or bp = bq. Surviving node-pairs are then tested for relative constraint compatibility bycomparing the constraints

{drs, σ

2rs

}and

{dpq, σ

2pq

}. For each compatible node-pair, a con-

necting edge is added to the CG.Algorithm 3.10 has computational complexity O(mn + |V |2). The efficiency of this

approach depends on the observer pose uncertainty being small enough that the size ofset V is approximately O(min {m, n}). That is, the assignments between sets A and Bare largely determined by the absolute constraints. If this assumption holds, the overallcomputational complexity becomes O(mn).

11The process of applying absolute constraints may be improved using more efficient gating procedures, aspresented in [32] for example. This would mean Algorithm 3.11 could be performed in O(m log n) iterations.


Algorithm 3.10: CorrespondenceGraph(A, B)

V ← MakeNodeSet(A, B)E ← ∅for i ← 1 to |V |

do

for j ← i + 1 to |V |do{if CheckConstraint(vi, vj , A, B)then E ← E ∪ {vi, vj}

return (G)

Algorithm 3.11: MakeNodeSet(A, B)

V ← ∅for i ← 1 to |A|

do

for j ← 1 to |B|do{if CheckNode(ai, bj , A, B)then V ← V ∪ {ai, bj}

return (V )

Algorithm 3.12: CheckConstraint(vi, vj , A, B)

if (ar ∈ vi = as ∈ vj) or (bp ∈ vi = bq ∈ vj)then return ( false )

drs ← Equation 3.15 for features ar and as

σ2rs ← Equation 3.17 for features ar and as

dpq ← Equation 3.15 for features bp and bq

σ2pq ← Equation 3.17 for features bp and bq

Mij ← (drs−dpq)2

σ2rs+σ2

pq

return (Mij < γ1)


3.3.5 Maximum Clique Search

Maximum clique search is a standard graph search algorithm, and a great number of tech-niques may be found in the literature for finding the maximum clique in an undirectedunweighted graph (see [17] for a comprehensive survey).

Finding an exact solution to the maximum clique problem in a general graph is thoughtto be NP complete [78], but two factors make it a feasible algorithm for the CCDA applica-tion. First, in typical environments (with normal levels of geometric symmetry), the CG isvery sparse and search heuristics favouring nodes of high degree tend to rapidly obtain goodresults. Second, the CCDA algorithm does not require the exact maximum clique(s) but,rather, the largest maximal clique (or set of maximal cliques) that can be found within areasonable time period. In fact, an approximate maximum clique search algorithm may betrivially implemented as an anytime algorithm [15] where the search runs until a specifiedtime limit and then returns its best result.

Two approximate maximum clique algorithms were empirically trialed for CCDA: adegree-ordered search with a tabu queue [58], and a simple randomised descent search. Inmost cases the latter method was more efficient and so this version is presented below.

The function MaximumClique(G) (see Algorithm 3.13) is sent the graph G = {V, E}as input and returns a clique M ⊆ G that is (approximately) the maximum clique of G.12

The algorithm uses three global variables: the largest clique found M , a temporary cliqueT , and a set of nodes S. These variables are shared by the other functions that declare themas global. In Algorithm 3.13, clique search is performed in a loop terminated when a timingfunction TimeUp() returns true. This loop provides the base level of the randomiseddescent search where a node vi is chosen at random from V ∈ G, and the node set S isinitialised with the adjacency list adj(vi). The randomised descent search is then performedby recursion of the function RecursiveSearch(n).

In Algorithm 3.14, the function GetRandomNode(V ) chooses a node at random fromthe node set V (i.e., a node index is selected with uniform13 likelihood from the discreterange [1, |V |]). Note, the input parameter V is local to this function and does not specifythe set V ∈ G (although V ∈ G may indeed be passed to this function). For a node vi,the maximum possible clique size is |adj(vi)| + 1; therefore, a selected node is accepted forsearching if the size of its adjacency list is at least as large as the current maximum cliqueM . If a valid node cannot be found within the final time limit, an empty set is returned.

The function RecursiveSearch(n) in Algorithm 3.15 is the central process of themaximum clique search operation. Each call of this function indicates the entry to a lowersearch depth and a new node is added to the accumulating clique stored in T , whichrepresents the current descent path. A set of nodes V that is adjacent to all the nodescurrently stored in T is used to perform the next descent level. Provided |V | is largeenough to possibly improve the maximum clique size, a node is selected at random from Vand another level of RecursiveSearch(n) is entered. After the completion of the recursive

12The cliques are actually represented in Algorithm 3.13 by sets of nodes only rather than as graphs asthe nodes implicitly (by definition) form a complete graph.

13Rather than simply selecting a node vi with uniform likelihood, better performance may be possible ifthe likelihood of selecting node vi is a function of its degree |adj(vi)| meaning that more strongly connectednodes are favoured for addition to the clique.


Algorithm 3.13: MaximumClique(G)

global M, T, SM ← ∅T ← ∅while not TimeUp()

do

vi ← GetRandomNode(V )S ← adj(vi)RecursiveSearch(vi)

return (M)

Algorithm 3.14: GetRandomNode(V )

global Mn ← ∅while n = ∅ and not TimeUp()

do

i ← U [1, |V |]if |adj(vi)| ≥ |M |then n ← vi

return (n)

Algorithm 3.15: RecursiveSearch(n)

global M, Tif n = ∅then return

T ← T ∪ nV ← MutuallyAdjacentNodes(n)if |V | > 0 and |T | + |V | > |M |then

{vi ← GetRandomNode(V )RecursiveSearch(vi)

if |T | > |M |then M ← T

T ← T \ n

Algorithm 3.16: MutuallyAdjacentNodes(n)

global SS ← S ∩ adj(n)return (S)


descent, the size of the maximal clique T is compared with the current maximum clique. IfT is greater, then it is assigned to M . The node n is subsequently removed from T at theend of the function to indicate the return to a higher level.

The function MutuallyAdjacentNodes(n) finds the intersection of the sets S andadj(n) and stores the result in S so that, at each level of descent, the nodes in S representthe remaining nodes in the graph G adjacent to all the nodes in the current clique T (butnot already stored in T ).

This basic implementation finds only a single largest clique but can easily be modified tostore the set of all maximal cliques found (a maximal clique is obtained with each descendingpath). Alternatively, the k largest maximal cliques might be retained. It is possible that theimplementation of more sophisticated maximum clique search algorithms (e.g., [10, 54, 28])may enable more efficient application of the CCDA algorithm, thereby permitting batchassociations with larger data sets, but determining the computational advantage of thesealgorithms for the particular structure of the CG will require an empirical evaluation.

3.3.6 Computational Complexity of the Maximum Clique Search

While maximum clique search may be NP complete for a general graph, it appears to befeasible here due to the structure of the CG. This section presents an empirical study ofthe CG density and maximum clique search complexity, both with and without absoluteconstraints.

These experimental results are obtained from the laser-based dead reckoning trials inSection 3.4 using the internal road data set (see Appendix A). CCDA was performed be-tween the feature sets obtained from sequential pairs of laser scans. This test was performedtwice over the same ensemble of feature sets: once including absolute constraints and oncewithout (i.e., using relative constraints only).

The structure of the CG is affected by numerous factors. These include the density ofobservable features (and clutter), their geometric distribution, the measurement (sensor)uncertainty, and the vehicle pose uncertainty. The effect of pose uncertainty (i.e, absoluteconstraints) on the number of CG nodes is particularly significant. Given moderate absoluteconstraints, the number of CG nodes is proportional to the number of features in the smallerof the two data sets while, without absolute constraints, the number of nodes is equal tothe number of features in one set multiplied by the number of features in the other.

The empirical CG density results, edges versus nodes, are shown in Figure 3.11. Inboth cases, the number of edges is quadratic in relation to the number of nodes. However,with absolute contraints, the CG is quite dense (between 20% and 60% connected) while,without absolute constraints, the CG density is between 2% and 4%.

In the absence of a theoretical bound for the maximum clique search, the empiricalresults in Figures 3.12 and 3.13 serve to characterise the expected complexity of the ran-domised descent algorithm. Figure 3.12 shows the number of descent iterations (i.e., thenumber of calls to the function RecursiveSearch(n)) required to find the (exact) max-imum clique, versus the number of CG nodes. Because the algorithm is randomised, thecomputational spread is large; however, the important attribute is the average case asshown.


0 20 40 60 80 100 120 140 160 1800

500

1000

1500

2000

2500

3000

3500

4000

4500

0 500 1000 1500 2000 2500 3000 3500 4000 4500 50000

0.5

1

1.5

2

2.5

3

3.5x 10

5

Figure 3.11: Correspondence graph density. These figures show empiri-cal measurements for the number of CG edges versus number of nodes.The top and bottom figures depict the results for CGs with and withoutabsolute constraints, respectively. The two lines in each figure show theaverage CG density and, for comparison, the density of a complete graphwhere e = n2−n

2 .

For the first case (with absolute constraints), the number of descent calls is linear withthe number of nodes. This is expected since the CG is dense and contains essentially asmall number of large cliques. (Note, the apparent drop-off for n > 140 is just an artifactof insufficient trials for these larger node quantities.) The second case, without absoluteconstraints, is particularly interesting as the number of descents achieves a constant boundfor large n. It appears that the sparse CG contains relatively few highly connected nodes,and the graph is composed of a small number of large cliques amidst a high proportion ofvery small cliques.

However, the results in the second case do not tell the entire story for CCDA withoutabsolute constraints. In these experiments, features are matched between sequential laser-scan pairs, meaning that the feature sets are of similar size, and have many features incommon. Thus, those CGs with many nodes tend to possess correspondingly large (anddistinctive) cliques. A more difficult problem would be associating a small number ofobservations in a large map without absolute constraints. This would mean a large number


0 20 40 60 80 100 120 140 160 1800

50

100

150

200

250

0 500 1000 1500 2000 2500 3000 3500 4000 4500 50000

200

400

600

800

1000

1200

1400

1600

Figure 3.12: Maximum clique search iterations. These figures show thenumber of descent iterations versus the number of CG nodes. The topand bottom figures depict the results for CGs with and without abso-lute constraints, respectively. The lines in each figure show the averagenumber of descents.

of CG nodes but small cliques and many ambiguities.The results shown in Figure 3.12 are dependent upon a small modification to the function

GetRandomNode(V ) in Algorithm 3.14. The modified version uses a simple rejectionsampling strategy to favour the selection of nodes of higher degree. This change has amarginal effect on the search with absolute constraints (i.e., the average number of descentiterations becomes c = 0.7n rather than c = 0.8n). However, for the search without absoluteconstraints, the average case falls from linear (c = 0.45n) to constant (c = 200). Clearly, themodified node selection strategy directs the search away from unlikely nodes and towardsthe relevant larger cliques within the sparse graph.

Finally, the number of descent iterations do not actually convey the total complexityof the maximum clique search algorithm, as each descent includes a call to the functionMutuallyAdjacentNodes(n) (see Algorithm 3.16). This function performs a set inter-section operation, which has linear complexity with respect to the degree of the passed nodeparameter.

The empirical results for the total search complexity, therefore, are shown in Figure 3.13.


0 20 40 60 80 100 120 1400

5000

10000

15000

0 500 1000 1500 2000 2500 3000 3500 4000 4500 50000

1

2

3

4

5

6x 10

4

Figure 3.13: Maximum clique search complexity. These figures show thenumber of inner-loop iterations versus the number of CG nodes (with andwithout absolute constraints, respectively). The lines depict the averagecomplexity.

The average case with absolute constraints is quadratic in relation to the number of nodesc = 0.3n2. The average case complexity without absolute constraints is not so well de-fined. For the number of CG nodes between 0 and 700, the complexity appears quadraticc = 0.01x2 but, after this point, it levels off. Whether this result is due to insufficient trialsis unclear, but certainly the complexity for large n seems not worse than linear.

3.3.7 Ambiguity Management

The process of searching for the maximum clique finds many maximal cliques of equal ornear maximum size. These cliques represent ambiguity in the CCDA result. There aretwo types of ambiguity that can arise (described here, for simplicity, in terms of just twomaximal cliques). The first is due to environmental symmetry, where the association setsgiven by two cliques represent two completely different observer locations. One (or both)of these sets must be incorrect, and determining the correct clique is dependent on furtherinformation. The appropriate way to deal with this type of ambiguity is to use MHT, where


a different track is maintained for each clique. The second type of ambiguity is simpleclutter, where both cliques represent essentially the same set of associations with somediffering assignment-pairs due to sensor uncertainty and the close proximity of features.

In the case of multiple cliques, representing simple clutter, one of the following ambiguitymanagement methods might be applied.

• Select the clique of maximum size that has maximum joint likelihood (i.e., joint nearestneighbour). Alternatively, choose the maximum joint likelihood clique from the setof cliques with size greater than a specified minimum (see Section 3.5.2 regardingminimum acceptable clique size).

• Retain only those associations common to the k largest cliques.

• Select the k largest cliques and perform JPDA with those associations that are am-biguous.

• Perform MHT with the k largest cliques (including, perhaps, a hypothesis that allassociation sets are false).

3.3.8 A Comparison of Optimal Assignment, JCBB and CCDA

This section compares the relative merits of optimal assignment (OA), JCBB and CCDAfor finding a single most likely set of associations. They are compared according to twocriteria: level of association constraint and computational complexity.

Constraint quality determines the number of associations that are rejected for a givenstatistical (i.e., NIS) probability of acceptance. The more constraint information available,the more associations become invalid. OA represents the best of the traditional multiple-target tracking data association methods (i.e., the best maximum likelihood method, thisdoes not include JPDA or MHT).14 It incorporates injective mapping constraints and ab-solute constraints, and searches for the association set of maximum size with maximumsummed log-likelihood.

The batch association methods, JCBB and CCDA, both incorporate injective map-ping, absolute and relative constraints. Through the addition of relative constraints, thesemethods will always give a better (i.e., less ambiguous) result than OA for the same NISthreshold. Both methods search for the association set of maximum size with maximumjoint likelihood, but their approaches to applying constraints are different. The JCBB algo-rithm selects an association set hypothesis and applies the absolute and relative constraintsconcurrently via calculation of the joint NIS. CCDA, on the other hand, applies theseconstraints explicitly and separately between feature pairs; this has a significant practicaladvantage as it enables the production of sensible results without absolute constraints.

An experimental comparison between OA and batch association is not given in thisthesis. However, a comparison of JCBB and a greedy nearest neighbour association similarto OA is provided in [106].

14Arguably, the JAM method improves on OA by performing batch likelihood adjustment. However,the JAM has yet to gain popular acceptance and the experimental verification needed to make a propercomparison with OA.

3.4 Application: Sensor-based Dead Reckoning 60

The computational complexity of OA has a theoretical worst-case bound of O(n4) forn measurements and n targets [35], although the average-case complexity is reported tobe considerably better. The implementation in [35] uses the “Hungarian” bipartite graphmatching algorithm [84], and lower complexity might be possible using more recent bipartitematching methods (e.g., see [59, 14]).

JCBB performs incremental construction and search of an interpretation tree of jointassociation hypotheses. The joint NIS gate determines acceptable hypotheses and performsbranch-and-bound pruning of the search space. The discussion in [106] does not provide atheoretical bound, but gives an empirical complexity estimate of O(1.53n), where n is thenumber of observed features (for fixed vehicle pose uncertainty).

The CCDA algorithm is composed of two parts: CG construction and maximum cliquesearch. The first part takes O(n2 log n2+km2) time if the vehicle pose is unknown, where mand n are the number of measurements and targets, respectively, and k is found empiricallyto be O(n). If the vehicle pose is known, CG construction may be performed by one of twomethods. For the first, the same complexity O(n2 log n2 + km2) applies but, for the secondmethod, the complexity becomes O(mn + |V |2), where |V | is the number of nodes in theCG. If the pose uncertainty is reasonably small, |V | is O(min {m, n}).

The maximum clique search complexity was found to be quadratic with the number ofCG nodes (on average) if the vehicle pose is known, and not worse than linear if the vehiclepose is unknown (although more analysis is required for this second result). In these twocases the relationship between the number of nodes and number of features is O(min {m, n})and O(mn), respectively. Therefore, assuming m ≈ n, the average complexity in both cases,in terms of the number of observed features, is expected to be O(m2).

3.4 Application: Sensor-based Dead Reckoning

This application associates features between sequential laser scans to determine the changein pose of the observer. The experimental results have a two-fold objective: first, to demon-strate the CCDA algorithm in outdoor high-clutter environments and, second, to present asensor-based alternative to odometry that is both more accurate and more reliable.

The two environments in which these tests are carried out are the parkland and internalroad locations described in Appendix A. Both environments provide a challenging data as-sociation problem, particularly the latter with feature density ranging from extreme clutterto extreme sparsity.

3.4.1 Feature Extraction

The predominant landmarks in both environments are trees, and so feature extraction isgoverned by a point feature model for detecting tree trunks.

To begin with, the range-bearing measurements for a given laser scan are broken intoclusters using Algorithm 3.17. Here, z = {(r1, θ1), . . . , (rn, θn)} is the angle-ordered setof laser measurements, and the returned set C = {C1, . . . , Ck} is the resulting clusters.Clustering is performed on the basis of discontinuity between adjacent range measurements,


Algorithm 3.17: Cluster(z)

k ← 1Ck ← (r1, θ1)for i ← 2 to n

do

∆max ← A + B · min {ri, ri−1}∆ ← abs (ri − ri−1)if ∆ < ∆max

then Ck ← Ck ∪ (ri, θi)

else{

k ← k + 1Ck ← (ri, θi)

return (C)

Figure 3.14: Point feature model. Given a foreground cluster (i.e., a clus-ter with both edge points closer than the adjacent clusters), the parame-ters of a circle can be calculated from the minimum range measurementrm and the angles of the two edge measurements.

and is tuned by adjusting the gains A and B. In these experiments, the constant gainA = 0.07m and the proportional gain B = 0.04.

Clusters are selected from the ordered set C if they are in foreground. That is, a clusteri is selected if its first and last range measurements are shorter than the last and first rangemeasurements of clusters i−1 and i+1, respectively. These foreground clusters define pointfeatures according to the model shown in Figure 3.14, which expresses the range-bearing ofa circle origin in terms of the cluster’s bounding angles {θ1, θ2} and minimum range rm.15

From this diagram, it can be seen that

sinα =R

rm + R(3.27)

15An alternative circle extraction model is given in [67], which estimates the circle parameters from theentire set of cluster measurements using an EKF. However, empirical laser tests indicate that range valuesat sharp discontinuities tend to be very unreliable, and this is the rationale for the simple model presentedin this thesis—which uses only angle information at the cluster edges.


where α = θ2−θ12 . Rearranging Equation 3.27, the circle radius is given by

R =rm sinα

1 − sinα(3.28)

Therefore, the range and bearing measurement zp = [rp, θp]T is

rp = rm + R =rm

1 − sinα(3.29)

θp =θ1 + θ2

2(3.30)

The individual laser measurements are assumed to be independent, with Gaussian uncer-tainty in range and bearing, such that each measurement zi = [ri, θi]T has covariance

Pi =[

σ2r 00 σ2

θ

]

Therefore, if the circle origin is defined as a function of the parameters (rm, θ1, θ2)

[rp

θp

]= h

rm

θ1

θ2

(3.31)

so that the Jacobian ∇h is given by

∇h =

[∂rp

∂rm

∂rp

∂θ1

∂rp

∂θ2∂θp

∂rm

∂θp

∂θ1

∂θp

∂θ2

]=

[1

1−sin α − rm cos α2(1−sin α)2

rm cos α2(1−sin α)2

0 12

12

](3.32)

then the foreground point covariance Pp can be approximated by linearised transformationof the measurement uncertainties.

Pp = ∇h

σ2

r 0 00 σ2

θ 00 0 σ2

θ

∇hT =

[1

(1−sin α)2σ2

r + r2m cos2 α

2(1−sin α)4σ2

θ 00 1

2σ2θ

](3.33)

Notice that this model produces a correct estimate of the point range-bearing covarianceonly if the object observed is, in fact, circular, and it assumes that the minimum rangemeasurement rm is actually an observation of this circle’s closest point. However, clustersare classified as points based on the sole criterion that they are foreground, which means thatnon-circular clusters can be selected, and are often poorly represented by this model (e.g.,see Figure 3.15). To cater for non-ideal feature classification, it is necessary to inflate theobservation uncertainty; also, to limit the possibility of gross misclassifications, a maximumradius estimate is applied (i.e., only keep those points satisfying R < RMAX). Typically,the tree trunks in the two experimental environments are well approximated by the circularmodel as depicted in Figure 3.16.


Figure 3.15: Feature misclassification. A non-circular object is poorlyrepresented by the circular foreground point model.

22 24 26 28 30 32 34 36 38 40−14

−12

−10

−8

−6

−4

−2

0

Figure 3.16: Tree trunk observations in the park environment. UsingSLAM pose estimates, the unprocessed laser points are recorded over aseries of passes to illustrate the circular shape of these four trunks.


3.4.2 Dead Reckoning Implementation Details

The steps of this dead reckoning algorithm follow from the CCDA algorithm in Section 3.3,and the relative pose estimation method in Appendix C.3. Given a set of change-in-poseestimates {xδ,Pδ} between sequential scans, the accumulated dead reckoning estimate isfound using the equations in Example C.1. Therefore, to avoid unnecessary repetition, thediscussion below just provides additional details regarding (i) calculation of the change-in-pose xδ, and (ii) application of the CCDA algorithm.

The initial estimate for a particular xδ is based on the prior information afforded bythe change-in-pose estimate of the previous scan-pair. Thus, the relative pose calculationsin Appendix C.3 use Equations C.17 and C.18 rather than Equations C.15 and C.16. Theprior estimate is given by a simple constant-velocity inertial model as follows.

xprior = xδ (3.34)Pprior = Pδ + Pacc (3.35)

where xδ and Pδ are the previous change-in-pose estimate, and Pacc represents the increasein uncertainty due to possible accelerations over the time interval. Note, the predictionof constant xδ is possible since the laser information arrives after equal time intervals (4.7Hz in these trials). The value of Pacc is determined by a model of the maximum possiblevehicle accelerations; in these trials, Pacc was fixed at

Pacc =

0.5 0 0

0 0.5 00 0 0.03

Batch data association was performed using the first of the tracking CCDA methods (seeSection 3.3.4). This implementation used a basic form of the maximum clique algorithm,which returned, as a “best” solution, the largest association set found (or, if multiple equallargest sets exist, the first largest set found). Significantly better results could probably beobtained if other association sets were also considered, but these results were sufficient todemonstrate the utility of the CCDA algorithm.

The test environments displayed a high degree of variability in landmark density atdifferent locations, resulting in batch associations ranging from less than 3 in some regionsto as great as 70 in others. For each scan-pair, a minimum batch size was applied todetermine whether the association set was deemed reliable and, while more sophisticatedmeasures may be possible (considering factors such as local feature density), this simplethreshold gave reasonable results (see Section 3.5.2 below for further discussion). If, for agiven scan-pair, insufficient associations are obtained, then the estimated change-in-pose isequal to the prior estimate, and this in turn becomes the prior for the next scan-pair (withexpanded uncertainty Pprior = Pδ + Pacc). There is a basic tradeoff, therefore, betweenassociation reliability and the possibility of prolonged periods of tracking failure, and batch-size thresholds of between 3 and 7 were found to work best in these trials.

3.4.3 Results

Wheel-encoder odometric data was logged during both the trials presented below, and anodometry-based dead reckoning estimate calculated using the vehicle model parameters

3.5 Remarks 65

that were used in [67, 65] (for the same vehicle). These model parameters were tuned togive optimal results, and the odometric results shown here are assumed to be of the bestpossible quality. The laser result, on the other hand, required no detailed kinematic modeland operated with just a coarse dynamic process model and a simple observation model.

The dead reckoning results for the park environment are shown in Figure 3.17, withthe top figure based on wheel-encoder information and the bottom on laser scans. Thesuperior accuracy of laser-based dead reckoning is readily apparent, particularly in heading.In general, odometry tends to provide a smoother estimate, but is subject to biases (manyof which are not easily modelled, such as slip during turning). Laser-based dead reckoning,while noisier, does not possess such significant bias.

The internal road result, shown in Figure 3.18, exhibits the same characteristics overa longer distance (approximately 6 km). Several regions of this environment producedrelatively inaccurate laser-based estimates due to lack of observable features, but overallthe trajectory retains the geometry of the roadway. The encoder-based estimate, on theother hand, rapidly diverges into incoherence.

3.5 Remarks

The two matters discussed in this section are concerned with (i) the usefulness of geometricfeatures in general unstructured environments, and (ii) determining a minimum associationset threshold for CCDA.

3.5.1 Utility of Simple Geometric Features

Simple geometric primitives, such as points and lines, are often criticised16 as insufficientfor general environments and a current trend [93, 118, 132, 95, 48] is to try and developmore general representations (e.g., scan correlation, see Chapter 4). A typical rationalefor rejecting simple features is the problem of misclassification; for example, if a smoothlycurved surface is represented by a set of line segments, these lines may shift arbitrarily.Also, methods like scan correlation use all the available observation information, and notjust the parts that can be classified.

It is argued here, however, that geometric models are probably more universally appro-priate than they are given credit. Point locations such as edges (i.e., range discontinuities)and corners occur in a great many environments. Batch association, and the feature man-agement methods presented in Chapter 5, can be used to reject unstable features withoutharm. Another important factor to note is that data association imparts significant infor-mation and feature-based methods may produce better results than scan correlation.

Some environments, such as mining tunnels and subsea, do not yield stable geometricfeatures and require more general representations. However, this does not preclude theuse of the EKF stochastic SLAM framework. Fundamentally, there is confusion betweenthe use of geometric (point) feature models and the estimation of the point location of anobject. The former may well not apply, but the latter is generally applicable to any static

16For example, geometric feature-based methods have been described as “less accurate and less robust”than scan matching for laser-based localisation in an indoor environment [93].

3.5 Remarks 66

−50 0 50 100 150 200

−160

−140

−120

−100

−80

−60

−40

−20

0

20

40

−40 −20 0 20 40 60 80 100 120 140 160

−80

−60

−40

−20

0

20

40

60

80

Figure 3.17: Encoder-based (top) versus laser-based dead reckoning inthe park environment.

3.5 Remarks 67

−300 −200 −100 0 100 200−250

−200

−150

−100

−50

0

50

100

150

200

250

−300 −250 −200 −150 −100 −50 0 50 100 150 200

−400

−350

−300

−250

−200

−150

−100

−50

0

Figure 3.18: Encoder-based (top) versus laser-based dead reckoning inthe internal road environment.

3.6 Summary 68

map. For example, an object might be represented by a set of unprocessed measurementdata, which serves as a static model of the object. Subsequent observations of the objectcan then use scan correlation to estimate the object location relative to the vehicle (i.e., asa relative pose (x, y, φ) and uncertainty). By approximating the relative pose uncertaintywith a Gaussian, the global object pose can be maintained in the EKF SLAM state vector.This approach is suggested as a future research direction combining scan correlation andSLAM in Section 7.2.2. Essentially, an object might not possess a simple parametric featuremodel but still defines a point pose in space (its local coordinate axis) which is static andapproximately Gaussian.

3.5.2 Minimum Acceptable Batch Size

Batch data association, as described in this chapter, produces sets of associations that aremutually compatible given their combined constraints. From this it is possible to obtain a“best” association set consisting of the greatest number of associations. However, selectinga best set, or set of best sets, is not sufficient for ensuring correct association (i.e., selectingthe best of a bad batch is not good enough). For the SLAM problem, the real requirementis a measure of association reliability; it is better to reject all associations, or perform MHTwhere one hypothesis rejects association, than to risk accepting an incorrect association set.

A likelihood of association set “correctness” is a function of many factors: (i) the numberof associations in the set, (ii) the quality of the individual constraints and correlations, (iii)the geometric distribution of features, (iv) the observer pose uncertainty, (v) the presence ofclutter, occlusions, and dynamic objects, (vi) and so on. In this thesis, association reliabilityis not properly quantified and defining an appropriate measure remains an open question.The approach used in the experiments of this chapter considers only property (i) above,such that a threshold on minimum batch size is used as a blanket criterion. This is a verysuboptimal approach; a low threshold is fragile in dense regions due to symmetries and ahigh threshold is fragile in sparse regions because of possible tracking loss.

3.6 Summary

This chapter addresses the data association problem for localisation and SLAM. First, itreviews the data association methods developed for target tracking, where the basic testfor accepting or rejecting an association is the NIS (or Mahalanobis distance) validationgate. For tracking a single target in clutter, there are three main procedures for resolvingassociation ambiguity: nearest neighbour, PDA, and MHT. These methods have analoguesfor tracking multiple targets in clutter: optimal assignment, JPDA, and MHT.

If a set of targets, or a set of observations, are correlated within themselves, thenmore robust data association can be achieved if they are processed as a batch. Two batchassociation methods are reviewed: JCBB and MCS data association. MCS data associationis unique in finding associations without prior knowledge of the observer pose.

The CCDA algorithm is presented as extension to the MCS data association method.This algorithm is based on a “correspondence graph” (CG) which is constructed from therelative and absolute constraints between feature pairs. The maximum clique of the CG

3.6 Summary 69

defines the largest set of mutually compatible associations. Every other clique in the CGrepresents an alternative (ambiguous) association set hypothesis.

The CCDA method is demonstrated with an implementation of laser-based dead reck-oning in two outdoor environments. These estimates are shown to be greatly more accuratethan encoder-based odometry.

Chapter 4

An Alternative to Feature-basedData Association: Scan Correlation

Feature-based data association is a viable solution in environments where static landmarksexist suitable for classification as geometric primitives. However, in some environments (e.g.,see Figure 4.1), it may be difficult to define appropriate parametric feature models, andmore reliable association is possible through direct correlation of unprocessed observationinformation with the map. Direct correlation is also a useful alternative in environmentsamenable to feature-based methods, as it incorporates all available sensor information (i.e.,does not reject unmodelled data) and avoids imperfect model-based classification.

Unprocessed data correlation, also called scan correlation or range-image registration,is the process of aligning an observed set of (2-D or 3-D) points with a reference point set.Thus, scan correlation may be defined as a function of the relative pose between the twodata sets.

For the mobile robot localisation problem, scan correlation is useful as a mechanism tofacilitate sensor-based dead reckoning, localisation from an a priori map [38], and off-linemap generation [128, 22, 82]. Presently, the implementation of real-time stochastic SLAMusing scan correlation is considered intractable, but recent proposals regarding incremen-tal EM-based SLAM [130] and hybrid topological-metric SLAM [92, 68] show promise fortractable and consistent correlation-based solutions.

A large number of correlation techniques have been proposed in the literature, as re-viewed in the next section. However, most are based on heuristic cost functions and lack atheoretical probabilistic derivation. The motivation of this chapter is to present a theoret-ically grounded approach to scan correlation—derived from a stochastic sensor model—sothat the correlation procedure provides an accurate estimate of the relative pose uncertainty.

This chapter presents a method and justification for performing scan correlation, withdiscussion of the following topics.

• Current methods for performing scan correlation are reviewed, with particular atten-tion to their deficiencies concerning sensor modelling and uncertainty estimation.

• For the probabilistic representation of scan measurements, several different parameter-spaces are considered. In particular, given a scan of 2-D points, the merits and

An Alternative to Feature-based Data Association: Scan Correlation 71

0 2 4 6 8 10 12 14

−4

−2

0

2

4

6

Figure 4.1: Unprocessed laser scan of a mining tunnel. In this environ-ment it is difficult to extract stable parametric features but there existsample information for scan correlation given the surface texture of thetunnel walls.

limitations of a 2-D Cartesian-space representation are examined and compared withthose of higher-dimensional parameter-spaces.

• The 2-D parameter-space causes a scan to be interpreted according to a point-targetmodel. Assuming this model, a Bayesian likelihood function for scan correlation isderived.

• Conversion from a scan of unprocessed point data to a 2-D probability distribution isimplemented using a sum of Gaussians representation. This representation is shownto facilitate efficient calculation of the scan correlation likelihood function.

• Scan correlation is applied to unprocessed laser data (represented as Gaussian sums)to perform two practical localisation tasks: maximum likelihood dead reckoning andparticle filter localisation.

Caveat. This chapter is quite disparate from the rest of the thesis, which is primarilyconcerned with feature-based SLAM. The focus of this chapter is not SLAM but simply thealignment of two point data scans in environments not suited to feature extraction. In thischapter, scan correlation is investigated in a probabilistic manner, and the reader is assumedto be familiar with the basic concepts of recursive Bayesian estimation. In particular, itis necessary to understand terms such as probability density function (PDF) and likelihoodfunction. For a brief introduction to these concepts, refer to Appendix D.

4.1 A Review of Point Data Correlation Techniques 72

Algorithm 4.1: ICP(Po, Pr,x0)

x ← x0

while not Converged

do

E ← ∅for each p ∈ Po

do

p′ ← TransformPoint(p,x)

q ← FindNearestNeighbour(Pr, p′)

E ← E ∪ {p, q}x ← CalculateLeastSquaresPose(E)

return (x)

4.1 A Review of Point Data Correlation Techniques

This section examines a selection of commonly used scan correlation methods, with a par-ticular focus on whether these methods incorporate appropriate sensor uncertainty models.The following review presents two non-probabilistic techniques, iterative closest point (ICP)and the angle histogram, and several other methods that may be implemented in a proba-bilistic fashion including grid correlation and particle filter localisation.

In the discussion below, the two scans are termed the reference scan and the observa-tion scan, where the fixed reference scan defines the base coordinate frame. Correlation,therefore, involves finding the pose of the observation scan relative to this base coordinateframe.

4.1.1 Iterative Closest Point

ICP [13, 143] is arguably the most commonly used range-image registration technique,with its popularity due mainly to its simplicity and efficiency. The basic algorithm isshown in Algorithm 4.1. Let Po = {p1, ..., pm} represent the observation point set andPr = {p1, ..., pn} be the reference point set. The relative pose of the observation set isdenoted x. The algorithm is initialised with an initial pose guess x0 and, until the es-timated pose satisfies some convergence criterion, it is iteratively refined by a process ofpoint-to-point data association and least-squares transformation. Each point p ∈ Po is firsttransformed to the reference coordinate frame using the current pose estimate, and thenassociated to its nearest neighbour in Pr. The original point p and its associate q are addedto an association set E. Finally, the pairs in E are used to calculate the relative pose thatminimises the least-mean-squared error between the associated points.

For each iteration, the nearest neighbour search can be performed in O(m log n) timeusing an efficient search data structure (e.g., k-d trees [12, 101]), and the least-squarescalculation has a closed form solution [2, 73, 93] that can be performed in O(m) compu-tations. Convergence of the algorithm occurs when the nearest neighbour for each p ∈ Po

does not change between iterations. However, it might also be determined by a least-mean-squared residual threshold, or simply a fixed number of iterations k—in which case the total


algorithm complexity is O(km log n).There are three basic shortcomings to the ICP algorithm. First, it performs explicit

point-to-point data association each iteration (n.b., without injective mapping constraints),which introduces error since the points in each scan represent a surface and not a setof discrete locations.1 Second, ICP converges to a local minima and so requires a goodinitial pose guess to find the global minimum. And third, the ICP result is a least-mean-squares estimate where each association pair is equally weighted; there is no direct meansto incorporate modelled sensor uncertainty or to obtain an estimate of pose uncertainty inthe solution.

A precursor to ICP [33], which uses a reference map of line segments and performs cor-relation with an observation point set, is worth noting for its efficiency and use in currentlocalisation systems [69, 68]. Also, a large selection of ICP variants have been proposed inthe literature over the past decade. These include improved convergence rate [93, 115], mod-els for point location error [43], and a method for estimating the final pose uncertainty [124](although neither of the latter two variants are derived from models of sensor uncertainty).

4.1.2 Angle Histogram

The angle histogram method [71, 139] of scan correlation is specific to a planar sensor withscanning-laser qualities. That is, the sensor must return a set of range-bearing measure-ments equispaced in bearing. In addition, each scan is expected to cover a full 360◦ so asto provide a circular (closed) sequence of measurements.

The essence of the angle histogram algorithm is as follows. Each measurement (ri, θi)in a scan is projected into observer-centred Cartesian space. Given two consecutive mea-surements, vi = (xi, yi) and vi+1 = (xi+1, yi+1), the angle of their vector difference is givenby

αi = arctanyi+1 − yi

xi+1 − xi

The angle histogram is generated from the set of angles for each pair of adjacent scanmeasurements. The shape of this histogram is invariant to pure rotation, which appears asa phase shift. Thus, the amount of sensor rotation may be found by cross-correlation ofthe two histograms. The relationship between phase shift and sensor rotation is valid evenwhen combined with a small quantity of sensor translation. Therefore, the change in sensorpose is found by first rotating the observation scan by the calculated angle offset, and thenfinding the translation offset using x and y histograms.

The angle histogram method does not appear to offer any advantage over ICP. It isneither more efficient, more accurate, or simpler to implement. Perhaps the one benefit ofthe angle histogram is that it does not require explicit point-to-point associations.

1The error due to discrete associations is usually small if the associated points represent nearby locations.However, outlier associations, where associated pairs are very distant, can skew the least-squares solution. Asimple measure to improve outlier robustness is to ignore pairs with distance greater than a given threshold.


4.1.3 Occupancy Grid Correlation

Occupancy grid maps (described in Section 2.1.1) use scan correlation to perform registra-tion of the observed scan with the map [50, 128, 118, 82]. The method of registration is2-D cross-correlation, which is usually implemented in the same manner as template match-ing in image processing. That is, the observation grid is shifted by a given pose, and thecorrelation value for this pose is calculated by (i) multiplying the overlapping grid cellstogether, and (ii) summing this result for each cell in the observation grid. (Note, sincethe observer pose varies in x, y, and φ, the search space is 3-dimensional. Thus, the twogrids do not overlap cleanly and interpolation techniques are required, as shown in [128] forexample. Also, the search-space is much greater than for 2-D, and cross-correlation may becomputationally infeasible without a priori bounds. This is also dependent on the tradeoffbetween efficiency and granularity as discussed in Section 2.1.1.)

Occupancy grids represent sensor information in a high-dimensional parameter-space,where each cell has an separate variable for its state between 0 and 1. The value andcorrelation of these states is updated in a non-Bayesian manner according to geometricallyderived sensor models (e.g., see the sonar model in [50]). That is, a sensor is modelledspatially, in terms of probability of occupancy, rather than by directly relating the mea-surement uncertainty to the state of each cell parameter. This means that the relationshipbetween the sensor uncertainty and the uncertainty obtained from cross-correlation is notprecisely defined. Cross-correlation may still be probabilistic, but is not associated to thesensor model in a Bayesian sense.

While occupancy grids may achieve good scan correlation results by virtue of the largequantity of information stored in their grid cells, they fail to exploit the higher accuracy androbustness available from a Bayesian model, and do not provide a theoretical justificationfor the resulting relative pose uncertainty.

4.1.4 Probabilistic Methods

A number of recent scan correlation techniques claim to operate according to probabilis-tic principles, meaning that correlation is driven by a likelihood function derived from astochastic observation model. Usually these methods are presented as maximum likelihoodalgorithms, which attempt to find the relative pose value that maximises the likelihoodfunction, but a few proposals demonstrate full probability density estimation, combiningthe likelihood function with prior information to calculate an a posterior PDF of the pose.

Two forms of maximum likelihood correlation appear commonly in the literature. Thefirst is ICP-like, iteratively searching for nearest neighbour data associations for each pointin the observation scan [108, 110]. Each association is given a likelihood value, and the prod-uct of these likelihoods defines the scan likelihood. A maximum likelihood pose can be foundby sampling pose values within the solution space, usually employing a greedy optimisationalgorithm to converge to a local maximum. The problem with these ICP-based methodsis that the likelihood models do not accurately represent the sensor uncertainty. They aretypically based on geometric properties rather than a proper sensor model (e.g., in [108]the likelihoods are simply functions of nearest-neighbour distance) and, more importantly,they rely on an incorrect model of explicit point-wise association. The algorithm presented

4.2 A Probabilistic Representation for Unprocessed Data 75

in [110] incorporates measures to account for correspondence error, and is a promising ICPvariant, but it remains an ad hoc geometric fix for these fictitious discrete associations.

The second common maximum likelihood correlation technique is grid-based, typicallyusing occupancy grids [50, 22, 82]. However, an alternative grid-based concept is probabilitygrids (see [125], for example), which are fundamentally different to occupancy grids. Whilea 2-D occupancy grid defines a high-dimensional parameter-space equal to the number ofits grid cells, a 2-D probability grid simply represents a 2-D parameter-space. (In [96, 95],a probabilistic grid is used to represent the uncertainty distributions of unprocessed sonarping returns. Note the difference between the sonar model in [96, 95] and the occupancygrid sonar model in [50].) The main weakness of grids as a probability representation isthat they artificially dicretise the distribution (i.e, pixelation), which involves a considerableloss of information. In some cases, better PDF representation might be obtained using, forexample, sums of Gaussians, wavelets, NURBS, and so forth. Using a 2-D probabilityrepresentation for scan correlation (of 2-D scans) is a primary focus of this chapter.

Estimation of the full a posteriori PDF is rarely applied to the scan correlation problembecause of computational complexity. However, a recent advance in general non-linear es-timation called particle filtering (see Appendix D), which approximates general PDFs withsamples, enables tractable computation of many problems previously considered infeasible.In [132], a particle filter is used to perform localisation from an a priori map via scan cor-relation.2 This implementation permits direct estimation of multi-modal pose distributionsand so avoids the fragility inherent in the maximum likelihood approach.

4.2 A Probabilistic Representation for Unprocessed Data

To perform probabilistic scan correlation, it is necessary, first, to define an appropriateparameter-space for the scan data and, second, to derive a Bayesian likelihood functionfor the relative pose between two scans. This section addresses the first requirement: thedefinition of a parameter-space, and the conversion of a scan of real data to a joint PDF inthis space. The derivation of a likelihood function, having specified a parameter-space, ispresented in Section 4.3.

For clarity, this discussion is given in terms of a 2-D laser scan consisting of n range-bearing measurements. Each laser measurement observes the presence of a (diffuse re-flective) surface within the envelope of its beam-width and maximum range. The actuallocation of the observed surface is uncertain and is modelled by a Gaussian PDF. Thus, fora set of n measurements, there is a set of n Gaussians each representing a discrete surfacelocation. These surface points are distinct from each other but may be correlated due totheir proximity, the smoothness of the environment, and the beam-width of the sensor.

The objective of this section is to represent the scan of measurements so that scan corre-lation is both accurate and consistent. Two basic forms of state-space parameterisation areconsidered. The first is a projection from 2-D measurement space to a 2-D Cartesian space,

2The implementation in [132] presents an interesting simplification to reduce computational load. Thelikelihood function is built on a ray-tracing model, which makes an implicit data association for each range-bearing measurement with its map intersection. Essentially, this implies that uncertainty is present inthe range measurement only (modelled as Gaussian) and both the bearing measurement and the map areperfectly known.


which represents world-coordinates. The second is higher-order, where the measurementsare mapped to a non-Cartesian parameter space of dimension greater than two.

The following discussion examines the transformation of the individual measurementuncertainties to these spaces, and considers whether each parameter-space is a reasonablerepresentation of the original scan. The utility and limitations of each representation forscan correlation is considered, with particular attention to the criteria below.

• Generality. The transformation from measurement to parameter-space should notrequire restrictive models, as this is the primary reason for choosing scan correlationover feature-based methods.

• Data association. Ideally, scan correlation should avoid data association thresholds.

• Accuracy and consistency. The resultant pose uncertainty from scan correlationshould be minimal, and consistent in the (weak) sense that the likelihood for thetrue pose is non-zero.

• Tractability. Scan correlation should be computationally feasible in real-time.

4.2.1 Representation as a Two-Dimensional PDF

A 2-D PDF is generated from a scan of measurements by transforming each range-bearingGaussian to sensor-centred Cartesian space, and accumulating these distributions by scalingand summing. The resulting Gaussian sum3 defines an uncertainty distribution over the2-D Cartesian parameter-space. The details for this operation are presented in Section 4.4.

The 2-D parameter-space implies that the scan represents a single point target. This isimplicit in its dimensionality; the PDF defines the uncertainty of a random vector (x, y) forwhich there can exist only a single true value (xt, yt).

This point target model has several advantages. First, it is general; conversion from mea-surement uncertainties to the Cartesian PDF does not involve any feature models. Second,since the model implies a single entity, there is no data association problem for scan corre-lation. Alignment is determined by cross-correlation of the two scan PDFs (see Section 4.3below). Third, cross-correlation of two Gaussian sums can be computed efficiently (seeSection 4.4) and does not incur the artificial pixelation of grid methods. And finally, thismodel appears to produce accurate results in practice (see Sections 4.5 and 4.6).

Nevertheless, the 2-D parameter-space representation has two significant difficulties.The first is obvious; the point target model is not a true representation of the actual data.Plainly, the measurements describe n discrete points of an arbitrarily complex surface, notthe location of a single point target. The second problem is more subtle. The individualmeasurement Gaussians are scaled before being added to the Gaussian sum, and the valuesof these scaling terms, necessary for good results in practice, are difficult to justify in theory(this issue is addressed further in Section 4.4.2).

3The transformed Gaussians will no longer be Gaussian due to the non-linear polar-to-Cartesian function.However, Gaussian approximations to these PDFs are reasonable in practice. Also note, the PDFs for eachmeasurement must be combined by summation rather than multiplication; for further explanation, seeSection 4.7.


The 2-D space representation effectively converts the interpretation of a scan from aset of discrete surface measurements to a set of possible locations for a point target. Thedistribution of one Gaussian in the sum says “the point target could be here” while anothersays “or here” and so forth. Basically, the scan measurements can be interpreted as a 2-Danalogue of a sonar ping, where each sampled amplitude of the return signal indicates thelikelihood of a point target at a certain range.

While the 2-D model has serious imperfections, it is not entirely incorrect either. Forinstance, for a single measurement, both the true and contrived interpretations state that“there is a point target somewhere in this distribution.” More importantly, provided twoscans share at least some commonality, scan correlation using the 2-D PDF representationwill result in a consistent likelihood function for their relative pose. (That is, the likeli-hood of the true relative pose will be non-zero, provided the scans actually share commoninformation and the individual measurement uncertainties are modelled consistently; theyneed not be modelled as Gaussian.) This property arises from the likelihood function beingcomputed via cross-correlation of the two scans (see Section 4.3), which will give a non-zeroresult for the true pose even if there is only common information from a single measurementin each scan. Therefore, the 2-D model seems also to offer robustness to outliers and partialviews; this is examined further in Section 4.4.6.

The 2-D parameter-space model satisfies the four essential criteria set out at the begin-ning of this section. However, in light of its deficiencies, a number of higher dimensionalmodels are considered as to their ability to better represent the scan data and to facilitatescan correlation.

4.2.2 Higher Dimensional Alternatives

Since a 2-D parameter-space is insufficient to properly describe the scan, a complete solutionnecessitates a higher-order parameter-space. Three higher-dimensional possibilities are con-sidered in this section: feature-based parameters, measurement parameters, and occupancygrids.

A feature-based parameter-space is obtained by specifying feature models, and clus-tering and classifying the sensor measurements according to these models; this is simplytraditional feature-based tracking as covered in Chapter 3. Where applicable, the featuremodels transform the measurement data to feature-space [f1, . . . , fk]T , and each new fea-ture specifies an augmentation of the space. The drawbacks of this approach are wellknown. Representation of the measurement data in feature-space is subject to restrictivemodels, which ignore a high proportion of the available information and which may notbe appropriate in certain environments. Also, performing scan alignment incurs the dataassociation problem—finding correspondences between features from the two scans—whichcan be fragile.

A preferable approach is one that avoids both feature models and data association.One such approach is to define the parameter-space by the measurements themselves[r1, θ1, . . . , rn, θn]T . Thus, a scan is represented without models and data association isnot meaningful, since a measurement from one scan observes a point on a continuous sur-face and will not directly correspond to any particular measurement in another scan. Inthis case, the full Bayesian solution to scan correlation is to define a likelihood function that


models the probability of obtaining the second scan, conditioned on the state of the firstscan and their relative pose. However, definition of such a function is complicated because,for any given relative pose, the likelihood of a measurement in the second scan may berelated to the existence of any number of measurements in the first.

Even if a reasonable likelihood function could be defined, the large state-space involved,consisting of the relative pose and the measurements of the first scan, presents a still greaterproblem. A recent general solution to the problem of tracking states without data associa-tion, called unified tracking, is presented in [125]. The implementation of this method is in-sightful particularly with regard to the tractability of tracking without data association. Theauthors state that unified tracking “is computationally infeasible for problems involving evenmoderate numbers of targets” [125, page 205]. They claim that it is practical for up to twotargets, but that efficient approximations to the full solution are necessary for applicationswith greater target numbers. Therefore, it seems that, for higher-dimensional parameter-spaces, there is no current, tractable solution to scan correlation without some form ofdata association (which might be maximum likelihood, multiple hypothesis, heuristic, orotherwise).

Occupancy grids are a high-dimensional parameter-space that do not require featuremodels and permit feasible scan correlation. However, the occupancy grid solution is ad hocin the sense that the uncertainty and correlations of the cells are updated in a non-Bayesianmanner. This means that there is no theoretical relationship between the sensor uncertaintyand the uncertainty of the scan correlation result. Also, data association between parame-ters is determined by an ad hoc metric of geometric proximity during cross-correlation. Forthe purpose of scan correlation, occupancy grids offer no advantage over the 2-D parameter-space model. They have no better theoretical justification, are expected to be less accurate(due to pixelation), and are likely to be less efficient (depending on their granularity).

Of the higher-dimensional representations, the first two fail to meet the criteria spec-ified at the beginning of this section. The feature-based parameter-space uses restrictivemodels and data association, and the measurement-space representation is not tractable.Occupancy grids satisfy the criteria in principle (although they implement heuristic dataassociation), but are a non-Bayesian solution that possess no better justification than the2-D parameter-space approach.

4.2.3 Application of the 2-D Representation

In environments where reliable feature extraction is difficult, the 2-D parameter-space modelseems to be a reasonable, if imperfect, representation and superior to its higher-dimensionalalternatives.

However, the 2-D representation is subject to restricted application. It permits thealignment of two point data sets and is, therefore, applicable to localisation from an a priorimap, where the map is the reference scan and the robot sensors obtain the observation scan.It might also apply to batch map building, where a set of logged scans are aligned pair-wiseto produce a conglomerate off-line map.

But, the 2-D representation is not suitable for applications where the reference scaninformation is updated by a recursive filter (i.e., the reference scan is non-static). Thisexcludes it from on-line map building and SLAM. Basically, a recursive estimate of the ref-

4.3 A Bayesian Likelihood Function for a Point Target Model 79

erence scan will not result in an improved description of the environment but will eventuallycollapse to a single point location (see Section 4.7 for further discussion).

The on-line mapping and SLAM problems must be represented by a higher-dimensionalstate-space. For SLAM, the feature-based approach has known practical solutions and themeasurement-space concept (without data association) is at least theoretically sufficient.Occupancy grids, on the other hand, are not suitable for SLAM as they do not maintain thenecessary correlations between their parameters. For on-line mapping, given location, thefeature-based and measurement-space methods are again applicable, and the occupancy gridapproach is also reasonable, as the correspondence heuristic between grid cells is sufficientto produce a valid and effective map.

4.3 A Bayesian Likelihood Function for a Point Target Model

This section derives a likelihood function for scan correlation based on the point targetmodel. It shows that, given two scan PDFs, each representing the distribution of a singlepoint target, the likelihood function for their relative pose is computed by their cross-correlation.

As a concrete illustration of scan correlation, this discussion is presented in terms ofrobot localisation in a plane. Therefore, scan correlation becomes the alignment of anuncertain observation to an uncertain map, where the map contains a single point landmark.(Performing localisation in this manner is subject to a minor caveat. After alignment ofthe first observation scan, the map and the robot pose become correlated and should nolonger be treated as independent when aligning subsequent observations. However, theinconsistency due to this correlation is easily absorbed by adding process stabilising noise,which means that the map can be treated as independent in practice.)

4.3.1 Notation

This section uses some specialised notation as follows. Given a 2-D random vector x ∈ X ,its PDF is defined by the distribution function fx (x, y). The subscript denotes that thisfunction represents a PDF of x, and the scalar variables x and y are independent parametersover the space X . If two random vectors a and b are concatenated as [a,b]T , their distri-bution function is given by fab (·). For example, if a and b were 2-D and 1-D, respectively,it would be fab (x, y, z).

The state at time k is the robot pose xk = [xk, yk, φk]T . The likelihood function for thestate is denoted Λ(xk) = fxk

(x, y, φ).

4.3.2 Robot Localisation in a Plane

In a plane, the robot pose is composed of three variables (xk, yk, φk). However, given asingle point-location landmark, as shown in Figure 4.2, it is possible to solve for only twoof these states, and therefore necessary to fix one of them to find a solution to the othertwo. In this presentation, the heading value is assumed known such that φk = φo. Theexposition below shows that, for any fixed φk, the likelihood function for the robot pose


Figure 4.2: Two-dimensional robot localisation. The single point land-mark is stored in the map as (xm, ym), and observed from the vehicle bythe measurement (xz, yz).

is given by the cross-correlation of the observation and map PDFs, and argues that this isalso the case if φk varies.

In Figure 4.2, the map defines the landmark location by xm = [xm, ym]T and the vehicle-centred observation measures this landmark as xz = [xz, yz]T . Both xm and xz are randomvariables with PDFs represented by fm (x, y) and fz (x, y), respectively (note that, fm (0, 0)denotes the map density at the map coordinate origin, and fz (0, 0) is the measurementdensity at the vehicle coordinate origin). The vehicle state is xk = [xk, yk]T with a knownheading φo.

The following derivation uses the method for transforming probability densities shownin [109, pages 146 and 173].4 Given the independent PDFs fm (x, y) and fz (x, y), it isdesired to find the state PDF fxk

(x, y) (which is simply the likelihood function Λ(xk) forfixed φk). The state is related to the observation by the following equation.

xk = xm − Rxz (4.1)

where R is the rotation matrix

R =[

cos φo − sinφo

sinφo cos φo

](4.2)

Using the auxiliary variables a1 and a2 to provide four equations in four unknowns,Equation 4.1 becomes

xk = xm − xz cos φo + yz sin φo

yk = ym − xz sinφo − yz cos φo

a1 = xm

a2 = ym

(4.3)

which is referred to functionally as (xk, yk, a1, a2) = g (xz, yz, xm, ym). This has the unique4Thanks to Jose Guivant and Hugh Durrant-Whyte for their assistance with this derivation.


solutionxz = (a1 − xk) cos φo + (a2 − yk) sinφo

yz = −(a1 − xk) sinφo + (a2 − yk) cos φo

xm = a1

ym = a2

(4.4)

The joint probability distribution for the transformation in Equation 4.3 is obtained asfollows (see [109, page 173]).

fxka (xk, yk, a1, a2) =fzm (xz, yz, xm, ym)

abs (|∇gzm|) (4.5)

where the Jacobian ∇gzm is given by

∇gzm =∂g

∂(xz, yz, xm, ym)=

− cos φo sinφo 1 0− sinφo − cos φo 0 1

0 0 1 00 0 0 1

(4.6)

The determinant of ∇gzm is one and, since xz and xm are independent, Equation 4.5becomes

fxka (xk, yk, a1, a2) = fz (xz, yz) fm (xm, ym) (4.7)

where the values of xz, yz, xm, and ym are the particular solutions found in Equation 4.4.The PDF fxk

(xk, yk) is extracted from this result by integrating over all (a1, a2) (see [109,page 125]).

fxk(xk, yk) =

∫ ∞

−∞

∫ ∞

−∞fz (b1, b2) fm (a1, a2) da1 da2 (4.8)

where b1 and b2 abbreviate the solutions for xz and yz, respectively.

b1 = (a1 − xk) cos φo + (a2 − yk) sinφo

b2 = (xk − a1) sin φo + (a2 − yk) cos φo

Intuitively, Equation 4.8 is interpreted as first rotating the observation PDF fz (x, y) byφo and then performing two-dimensional cross-correlation with the map PDF. This can beshown more clearly by defining the rotated observation xr = Rxz. Plainly, xz has the uniquesolution xz = R−1xr and, following the same procedure as demonstrated in Equation 4.5,the PDF of xr is

fr (x, y) = fz (x cos φo + y sin φo,−x sinφo + y cos φo) (4.9)

Therefore, Equation 4.8 may be rewritten as

fxk(xk, yk) =

∫ ∞

−∞

∫ ∞

−∞fr (a1 − xk, a2 − yk) fm (a1, a2) da1 da2 (4.10)

This result demonstrates that, for any fixed φk, the likelihood function Λ(xk) is calcu-lated by the cross-correlation of the map and the rotated observation distributions. Fur-thermore, for any fixed φk, the total probability mass of Λ(xk) is one and, therefore, theprobability masses for any two values of φk are equal. Thus, by rule of proportionality, itfollows that the set of cross-correlations for all φk constitutes a valid likelihood functionover the domain of xk as φk varies.


4.4 Sum of Gaussians Scan Correlation

This section presents the conversion of a set of point measurements to a Gaussian sumPDF, and derives the equations for the cross-correlation of two such PDFs. This discussionis given in the context of a laser sensor, which observes scans of range-bearing measurementsswept in a 2-D plane.

The following topics are examined with regard to implementing Gaussian sum basedscan correlation.

• The equations for converting a scan of range-bearing measurements to a CartesianGaussian sum are presented.

• Two possible scaling criteria for the individual Gaussians within the Gaussian sum areproposed, with equal height scaling being deemed the more suitable representation.

• The cross-correlation equations for two Gaussian sum PDFs are derived, both for the1-D and planar cases.

• A second correlation method is presented that treats the measurements in the obser-vation scan individually rather than as a single entity.

• The likelihood functions produced by the two scan correlation methods from thissection are compared with the likelihood function resulting from explicit point-wisedata association.

4.4.1 Conversion to Gaussian Sum Representation

An n-dimensional Gaussian distribution, with mean value p and covariance P, is definedby the following equation.

g(x; p,P) � 1(2π)n/2

√|P|exp(−1

2(x − p)TP−1(x − p)

)(4.11)

The integral (or, informally, volume) of this function over the space Rn is one and, therefore,

a scaled Gaussian αg(x; p,P) has a volume equal to α.An n-dimensional sum of Gaussians PDF is defined as the sum of k scaled Gaussians,

G (x) �k∑

i=1

αig(x; pi,Pi) (4.12)

where the sum of the scaling factors αi equals one (i.e., the total probability mass is one).For the purpose of cross-correlation, the correct total probability is not relevant—onlyrelative scale is important. Therefore, in this thesis, the Gaussian sums do not requirenormalisation (i.e., the αi’s need not sum to one).


Example 4.1Conversion of a laser scan to a sum of Gaussians. For a scan of k range-bearing measure-ments, each measurement zi = (ri, θi) is first converted to sensor-centred Cartesian space.

xi = f (zi) =[

ri cos θi

ri sin θi

]Pi = ∇fziRi∇fT

zi

where the Jacobian ∇fzi is given by

∇fzi =∂f∂zi

=[

cos θi −ri sin θi

sin θi ri cos θi

]

These Gaussian estimates are then compiled as a Gaussian sum as follows.

G (x) =k∑

i=1

αi

2π√|Pi|

exp(−1

2(x − xi)TP−1

i (x − xi))

A complete theoretical basis for the values of the αi’s is not presently known. However, twobasic scaling criteria have been trialed experimentally, and the effects of these values on thescan correlation results are discussed below.

4.4.2 Scaling Models for Gaussian Sum

An intuitive summation of the individual measurement Gaussians is to simply add themwithout scaling.

G (x) =k∑

i=1

g(x; pi,Pi) (4.13)

However, it transpires that this formulation gives a skewed representation of the environ-ment, and it would seem that a more suitable scaling is possible. To illustrate, Figure 4.3shows a scan of laser measurements transformed to Cartesian space, and Figure 4.4(a) de-picts the Gaussian sum PDF for this scan, based on Equation 4.13. Since, each unscaledGaussian has equal (unit) volume, those with small uncertainty5 possess very high peaks,indicating high likelihood. The effect of this scaling model on scan correlation is that theshort (highly certain) measurements swamp the cross-correlation result, and more distancepoints have little influence.

The difficulty with scaling arises from the discrepancy between the physical reality ofthe sensed measurements (i.e., discrete points on a surface) and the interpretation of theCartesian parameter-space model (i.e., distribution of a single point target). The intuitive“equal volume” scaling model makes sense if each measurement actually does observe thepoint target, as more a accurate measurement would give a better estimate of its location.However, the individual Gaussians in the Gaussian sum must be understood as defining

5For the laser sensor, the measurement uncertainties in range and bearing are assumed constant. There-fore, the uncertainty in Cartesian space increases with distance from the observer.


0 2 4 6 8 10 12 14 16 18 20−10

−8

−6

−4

−2

0

2

4

6

8

10

12

Figure 4.3: Unprocessed laser scan measurements of a mine tunnel wall.

possible locations for the target, and a more accurate measurement is no more likely thana coarse measurement. In fact, the inaccurate measurement, covering a wider region of thestate-space, actually conveys more information about the target’s possible location. (Thisis clear from the beam-width of the laser sensor; an accurate near measurement detects thelikelihood of a target within a very small region, and so provides little information aboutthe environment, while a more distant measurement covers a broader spatial region.)

An alternative scaling model is to make each Gaussian equal height.

G (x) =k∑

i=1

exp(−1

2(x − pi)TP−1

i (x − pi))

(4.14)

In other words, each Gaussian has a scaling factor αi = (2π)n/2√|Pi|. The Gaussian sum

PDF for the example scan is shown in Figure 4.4(b), and depicts a much more even spatialrepresentation. This “equal height” scaling model asserts that the point target is equallylikely to exist within the distribution of an accurate or inaccurate measurement, but thatthe coarser measurement covers a wider spatial region and so conveys more information;thus, the wider Gaussians have greater volume.


(a) Equal volume Gaussians

(b) Equal height Gaussians

Figure 4.4: Two alternative Gaussian sum representations for the laser scan in Figure 4.3.The first assigns each point measurement a Gaussian of equal volume. This results in veryhigh peaks for measurements with low uncertainty. The second assigns Gaussians of equalheight, which stipulates that very precise measurements contain less information about theshape of the environment (i.e., smaller volume Gaussians), and produces a better spatialdescription.


4.4.3 Cross-Correlation in One Dimension

The likelihood function for correlating two Gaussian sum PDFs is derived here for the n-dimensional case, but is strictly correct only for the 1-D case. That is, this form is validfor higher dimensions only if the angular alignment (e.g., roll, pitch, yaw) between the twodata sets is fixed.

As preliminary information, the following two points are noted. First, the volume undera scaled Gaussian is equal to the scale factor.∫ ∞

−∞αg(x; p,P)dx = α

∫ ∞

−∞g(x; p,P)dx

= α

(4.15)

Second, the volume obtained from the multiplication of two Gaussians∫ ∞

−∞g(x; p,P)g(x; q,Q)dx =

∫ ∞

−∞αg(x; r,R)dx (4.16)

is equal to the following (see Appendix E for derivation).

α =1

(2π)n/2√|P + Q|exp

(−1

2(p − q)T (P + Q)−1(p − q)

)(4.17)

Therefore, given two Gaussian sums

Go (x) =k1∑i=1

αig(x; pi,Pi)

Gr (x) =k2∑i=1

βig(x; qi,Qi)

(4.18)

the likelihood function (i.e., cross-correlation) is calculated as follows.

Λ(xk) = Go (xk) � Gr (xk)

=∫ ∞

−∞

k1∑i=1

αig(u − xk; pi,Pi)k2∑

j=1

βjg(u; qj ,Qj)du

=k1∑i=1

k2∑j=1

αiβj

∫ ∞

−∞g(u − xk; pi,Pi)g(u; qj ,Qj)du

=k1∑i=1

k2∑j=1

αiβjγij (xk)

(4.19)

where the function γij (xk) is the cross-correlation of the two Gaussians g(x; pi,Pi) andg(x; qj ,Qj). From Equation 4.17, this function is given by

γij (xk) =1

(2π)n/2√|Pi + Qj |

exp(−1

2(xk + pi − qj)T (Pi + Qj)−1(xk + pi − qj)

)(4.20)


Algorithm 4.2: Likelihood(Go, Gr,xk)

Λ ← 0

t ←[

xk

yk

]

R ←[

cos φk − sinφk

sin φk cos φk

]for i ← 1 to |Go|

do

p′ ← t + Rpi

P′ ← RPiRT

for j ← 1 to |Gr|

do

ν ← p′ − qj

S ← P′+ Qj

γ ← 1

2π√

|S|exp(−1

2νTS−1ν)

Λ ← Λ + αiβjγreturn (Λ)

4.4.4 Planar Cross-Correlation

For the planar case, such as the 2-D robot localisation problem, the state xk = (xk, yk, φk)varies in φk and so Λ(xk) cannot be obtained directly by cross-correlation. However, giventhe result in Section 4.3.2, it is sufficient to alter Equation 4.20 as follows,

γij (xk) =1

2π√|P′ + Qj |

exp(−1

2(p

′ − qj)T (P′+ Qj)−1(p

′ − qj))

(4.21)

where p′and P

′are the mean and covariance of the ith Gaussian in the observation PDF

transformed the the reference coordinate frame,

p′=[

xk

yk

]+ Rpi

P′= RPiRT

R =[

cos φk − sinφk

sinφk cos φk

]

Algorithm 4.2 provides a simple implementation of the planar likelihood function (where|G| denotes the number of Gaussians in a Gaussian sum G). For two Gaussian sums ofsize k1 and k2, this algorithm has complexity O(k1k2) and, for larger PDF definitions,is prohibitively slow for estimation methods like particle filtering. However, assuming theGaussians in each sum possess reasonably similar covariance and scale, the added likelihooddue to two particular Gaussians becomes negligible if the two-norm ||p′ − qj || is large. Thisleads to the possibility of various optimisations, two of which are discussed below.


First, the search space (in the reference Gaussian sum) can be partitioned so that, fora given mean p

′, only those Gaussians satisfying ||p′ − qj || < d are incorporated into the

likelihood calculation. This might be achieved by binning the mean values qj in a coarse gridof granularity 2d. Alternatively, a sufficiently accurate result might be obtained by usingonly the k nearest neighbours to p

′. This method, implemented with k-d trees [12, 101], is

used in the experimental applications of this chapter.A second optimisation is to cache a set of near neighbours for each Gaussian in Go,

and retain these while ever the state xk is reasonably close to its initial value. This isparticularly useful for maximum likelihood estimation, which will converge even if the nearneighbour sets are only updated every n iterations.

4.4.5 An Alternative Correlation Scheme

In the previous discussion, the reference and observation scans are both represented asa Gaussian sum PDFs, but an alternative proposition is to consider each measurementin the observation scan as an independent observation of the reference PDF. Thus, eachmeasurement determines a separate likelihood function

Λi (xk) =k2∑

j=1

βjγij (xk) (4.22)

where γij (xk) is given in Equation 4.21. Notice that the scaling factor αi for each obser-vation Gaussian is not required, since each measurement forms a unit volume PDF. Thecombined likelihood for the observation scan, therefore, is the intersection of the individuallikelihoods.

Λ(xk) =k1∏i=1

Λi (xk) (4.23)

This correlation method, with due assumptions, is shown in Section 4.4.6 to produce moreoptimal likelihood results than is obtained with a single observation PDF, but there areseveral complicating issues.

First, the reference PDF is reused for each iteration of Equation 4.22. This correlationbetween the likelihoods might be compensated by inflating the observation uncertaintybut, since each observation usually observes only a small portion of the reference PDF(and recalling that the reference PDF is itself composed of independent measurements),it is difficult to determine the quantity of this inflation. A better solution, perhaps, is touse the correlated reference information without inflation, given the understanding thatEquation 4.23 will converge to a best fit of the reference PDF and not the true state of theenvironment.

A second problem is that the observation scan not only observes parts of the referencescan, but may also include additional measurements due to viewpoint variation, occlusionand dynamic objects; corporately these are termed outlier measurements. The presence ofoutliers means that a state value xk close to the true state may be made unlikely by themultiplication of Λi (xk), where i represents an outlier measurement. This is particularlyso if the measurements possess bounded uncertainty, in which case the probability of thetrue state may become zero. A work-around for this problem is to specify a probability of


(a) A nominal likelihood function(truncated Gaussian)

(b) Likelihood with probabilityof false detection

Figure 4.5: Likelihood function with probability of false detection. Anyvalue Λ(xk) less than Pf becomes equal to Pf .

false detection Pf to set a minimum likelihood for any state value (see Figure 4.5). In otherwords, given a particular state xk = x0, if Λi (x0) is less than Pf , then the measurement i isassumed an outlier for x0. Determining an appropriate value for Pf is a matter of tuning.6

This method has not yet been implemented and is not used in the experimental appli-cations below; all results using real data are obtained with the more conservative represen-tation of a single observation scan PDF.

4.4.6 A Comparison of Correlation and Explicit Data Association

The two correlation methods described above are demonstrated via a simple 1-D example,and their results are compared to the likelihood function obtained from explicit data as-sociation (i.e., where each observation measurement is assigned to a particular referencelandmark).

The example depicts a simple case where the environment consists of two point land-marks (e.g., two trees), and the reference scan represents them by two Gaussians of equaluncertainty, and the observation scan is exactly the same as the reference. Two varia-tions of this example are presented: the first where the two Gaussians are merged (seeFigure 4.6(a)) and the second where they are distinct (see Figure 4.7(a)). These instancesserve to compare the characteristics of the three association methods.

To begin with, the equations for the three versions of likelihood function are shown. Eachshare the basic calculation of the correlation of two Gaussians where, from Equation 4.20,the cross-correlation of two 1-D Gaussians is given by

αij (xk) =1√

2π(σ2i + σ2

j )exp

(−(xk + xi − xj)2

2(σ2i + σ2

j )

)(4.24)

6One important factor in the tuning of Pf is the local density of the reference PDF, which is dependenton the total size of the reference scan. For a reference scan built from laser data, this difficulty may becircumvented by not normalising the reference Gaussian sum. Other influences on Pf may perhaps besubsumed by empirical tuning.


−5 0 5 10 150

0.05

0.1

0.15

0.2

0.25

(a) Scan Gaussian sum

−10 −8 −6 −4 −2 0 2 4 6 8 100

0.1

0.2

0.3

0.4

−10 −8 −6 −4 −2 0 2 4 6 8 100

0.1

0.2

0.3

0.4

−10 −8 −6 −4 −2 0 2 4 6 8 100

0.1

0.2

0.3

0.4

(b) Likelihood functions

Figure 4.6: Correlation results for merged measurements. The measurement Gaussians (a)are merged, resulting in the likelihood functions shown in (b). Note, the constituent αij ’sare shown in colour, while the resultant likelihood functions are black.


−5 0 5 10 150

0.05

0.1

0.15

0.2

0.25

(a) Scan Gaussian sum

−10 −8 −6 −4 −2 0 2 4 6 8 100

0.1

0.2

0.3

0.4

−10 −8 −6 −4 −2 0 2 4 6 8 100

0.1

0.2

0.3

0.4

−10 −8 −6 −4 −2 0 2 4 6 8 100

0.1

0.2

0.3

0.4

(b) Likelihood functions

Figure 4.7: Correlation results for distinct measurements. The measurement Gaussians (a)are well separated, resulting in the likelihood functions shown in (b).


where i represents an observation Gaussian and j a reference Gaussian. In the equationsbelow, this area is denoted αij , without the implicit state parameter.

The first correlation-based likelihood function is for the case where the observation scanforms a single Gaussian sum PDF. Thus, the reference and observation PDFs each consistof two scaled Gaussians (with scale factors equal to 1

2).

Λ(xk) =14(α11 + α12 + α21 + α22) (4.25)

Ignoring the (irrelevant) constant term, this is expressed as

Λ(xk) = α11 + α12 + α21 + α22 (4.26)

Notice that this likelihood function is the union of the individual likelihoods.The second correlation-based likelihood function is obtained when the observation mea-

surements are processed independently according to Section 4.4.5 (n.b., without inflatingthe observation uncertainties).

Λ(xk) = (α11 + α12)(α21 + α22) (4.27)

This likelihood function is always less conservative7 than Equation 4.26 since it is the in-tersection of the sums α11 + α12 and α21 + α22, which is more constrained than the unionof all α’s.

Finally, the third likelihood function (data association) is obtained by defining explicitcorrespondences, 1 → 1 and 2 → 2, between the observation and reference measurements.

Λ(xk) = α11α22 (4.28)

In other words, this likelihood function is the intersection of the likelihoods for each corre-spondence, and is clearly the least conservative of the three likelihood functions.

Examining the results of the two examples in Figures 4.6 and 4.7, the reference and(identical) observation Gaussian sums for each example are shown in (a) and their resultinglikelihood functions are shown in (b). These results depict (top to bottom) Equations 4.26to 4.28, respectively. The top result shows the four constituent αij ’s (note, all are Gaussianwith equal variance, and α11 and α22 have equal mean), and their sum; the middle resultshows α11 + α12 and α21 + α22, and their product; and the bottom result shows α11 andα22, and their product. Note, each of these distributions is normalised to possess area equalto one.

In both Figures 4.6 and 4.7, the first correlation method produces a likelihood functionthat reflects the degree of overlap between the observation and reference PDFs. This can beseen as a conservative likelihood function in the presence of outlier measurements (or miss-ing data etc)—this is particularly apparent in Figure 4.7. The second correlation technique,when compared in the two figures, produces a likelihood function that is more conservativefor merged measurements than for discrete measurements. That is, as the measurements

7The term conservative refers to the shape of a distribution. A flat distribution is totally conservativewith all states equally likely; a less conservative PDF is more peaked, with areas of concentrated probabilitymass.

4.5 Application: Maximum Likelihood Dead Reckoning 93

become more discrete, the product of the α12 and α21 components approach zero, andEquation 4.27 approaches Equation 4.28. Thus, Equation 4.27 correctly deals with dataassociation uncertainty from merged measurements provided each measurement in the ob-servation scan does, in fact, correspond to something in the reference scan. However, itfails to consider outlier measurements (and so requires a probability of false detection Pf ).Finally, the likelihood function for explicit data association is the same in both figures,as expected. This likelihood function provides greater accuracy than the correlation-basedmethods if the associations can be guaranteed (i.e., data association constraints, if correct,are a substantial source of information), but is inconsistent if the associations are uncertain(as for merged measurements).

To summarise these results,

• Equation 4.26 is conservative provided some of the observation measurements corre-spond to part of the reference scan.

• Equation 4.27 is conservative provided all of the observation measurements corre-spond to part of the reference scan. To some extent this limitation may be reducedby incorporating a probability for outlier measurements.

• Equation 4.28 is optimal provided the associations are known and correct, otherwiseit is optimistic.

4.5 Application: Maximum Likelihood Dead Reckoning

This application uses maximum likelihood scan correlation to perform laser-based dead reck-oning. For each scan-pair, the predicted change-in-pose is equal to the maximum likelihoodestimate obtained for the previous scan-pair. The estimated change-in-pose is then obtainedby searching for the maximum scan cross-correlation using a greedy search method (i.e.,the Nelder-Mead downhill simplex method [107, 112]). Note, these results do not supplyany measure of uncertainty.

4.5.1 Results

The results for the internal road environment are shown in Figure 4.8. The laser data usedhere is the same as used for feature-based dead reckoning in the previous chapter.

These results are comparable to the results of the previous chapter but are less accuratefor a number of reasons. First, for many scan-pairs the likelihood function was multi-modal and, on several occasions, the greedy optimisation converged to the wrong mode.Second, even when converging to the correct mode, the optimisation step was run for afixed number of iterations, and may not have converged completely; (the feature-based es-timate, on the other hand, is closed form). Third, in some regions there was insufficientcommon information between scans to allow proper correlation and, while this was detectedin the feature-based implementation, it was not detected here and spurious estimates wereobtained. (Note, lack of correlation information might be detected by examining the non-normalised value of the maximum cross-correlation.) Finally, the explicit batch data associ-


−250 −200 −150 −100 −50 0 50 100 150 200

−350

−300

−250

−200

−150

−100

−50

0

Figure 4.8: Maximum likelihood dead reckoning in the internal road envi-ronment. The result here is rather less accurate than was obtained usingthe feature-based approach.

ation used by the feature-based method imparts substantial information that is unavailableto correlation methods.

The results for the mine tunnel environment are shown in Figure 4.9 (see Appendix A.3for a description of this environment). In this environment reliable geometric features aredifficult to obtain, although some efforts have been made to do so using points of maximumcurvature [94]. Nevertheless, the mine walls are rich in information (highly textured),and permit extremely accurate and unambiguous correlation-based estimation. A smoothand precise dead reckoning trajectory is shown; note that the apparent discontinuity atcoordinates (72, 2) is, in fact, where the vehicle made a 3-point turn.

In addition to the vehicle trajectory, the unprocessed laser measurements are picturedin Figure 4.9, projected onto their global locations according to the estimated pose of thelaser for each scan. This gives a clear indication of the precision of the dead reckoning resultas the overlaid points produce walls about 10cm thick—which is plus-minus the laser rangeresolution (see Figure 4.10).


0 10 20 30 40 50 60 70 80 90 100

−20

0

20

40

60

80

100

120

Figure 4.9: Maximum likelihood dead reckoning in the mine environment. Here geometricfeatures are not readily available but excellent results can still be obtained (as demonstratedby the shape of the walls formed by plotting laser points at their predicted global locations).


56 56.5 57 57.5 58 58.5 59 59.5 60−0.5

0

0.5

48 48.5 49 49.5 50 50.5 51 51.5 5227.5

28

28.5

Figure 4.10: Mine walls depicted by unprocessed laser measurements.These two figures are characteristic of the scan correlation dead reckoningresult, where the projected laser points form walls with about ±5cmaccuracy.

4.5.2 Multimodal Likelihood Functions

The fragility of maximum likelihood estimation is primarily due to the ambiguity of multi-modal likelihood functions. This was true in the internal road environment, where tiltingof the laser sensor observed varying cross-sections of the world, including different layers ofundergrowth and other clutter.

Two examples of multimodal likelihood functions are shown here from data obtainedin the park environment. The first scan-pair, in Figure 4.11, shows a scene containingtree trunk features and also a large section of spurious returns, in the bottom-left corner,due to ground sweeps. The resulting likelihood function shows a narrow peak with mode(0.83,−0.03,−0.00) representing the correct change-in-pose, and a broad peak with mode(1.48,−1.09,−0.04) due to the apparent motion of the ground sweep. Note that the incorrectmode is more likely than the correct one.

The second scan-pair, in Figure 4.12, again depicts a scene consisting mostly of trees,but this time it also contains a set of returns, in the bottom-left corner, from a movingvehicle. The likelihood function shows a narrow peak at (0.66,−0.02, 0.00) for the truechange-in-pose, and a long thin distribution with mode (−1.70,−1.20,−0.01) showing therelative motion of the observer with respect to the (moving) observed vehicle. Once again,the false mode presents the greater likelihood.


0 5 10 15 20 25 30−20

−15

−10

−5

0

5

10

15

20

25

30

(a) Scan 1

0 5 10 15 20 25 30−20

−15

−10

−5

0

5

10

15

20

25

30

(b) Scan 2

(c) Likelihood function

Figure 4.11: Multimodal likelihood function due to ground sweep. Noticethat the maximum mode is not the correct mode.


0 5 10 15 20 25 30−30

−25

−20

−15

−10

−5

0

5

10

15

20

25

30

(a) Scan 1

0 5 10 15 20 25 30−30

−25

−20

−15

−10

−5

0

5

10

15

20

25

30

(b) Scan 2

(c) Likelihood function

Figure 4.12: Multimodal likelihood function due to dynamic object (mo-tor vehicle).

4.6 Application: Particle Filter Localisation 99

4.5.3 Discussion of Results

One of the key points to be learned from these results is that unprocessed scan correlation,which uses all the available information, is not necessarily more accurate than feature-basedmethods that use only part of the available information. This counter-intuitive outcomemay be primarily attributed to two factors. First, the ability of feature models to rejectspurious data makes use of a priori knowledge that certain formations in the data are morereliable than others. For example, data that resembles a tree trunk is likely to be reliable,while disordered data, such as a ground sweep, is not. Second, explicit data association,when correct, is a significant source of information and markedly constrains the resultingrelative pose esimate (as shown in Section 4.4.6).

In environments where reliable feature extraction is difficult, scan correlation is undeni-ably a valuable approach. The results in the mining tunnel demonstrate the quality of thealignment accuracy possible from distinctive, but difficult to model, data.

The multimodal results from the park environment show that dynamic objects (e.g., peo-ple, cars, ground sweeps) can induce strong alternative modes, which can cause maximum-likelihood scan correlation to fail. Thus, in environments susceptible to multiple modes, itmay be necessary to maintain a full representation of the vehicle pose PDF.

4.6 Application: Particle Filter Localisation

This section presents an implementation of particle filter localisation in the mining tunnelenvironment.8 An off-line map of the mine is created from the laser data and dead reckoningresults of the previous section. A different set of laser data, where the vehicle travels backalong the tunnel in the opposite direction, is used as the observation data set.

The off-line map was created in a rather ad hoc manner to produce a single referenceGaussian sum. Basically, the unprocessed laser points are located in the global referenceframe according to the maximum likelihood dead reckoning estimate. Each point representsa unit-height Gaussian, as before, and this set of Gaussians is culled—retaining points on thebasis of minimum determinant, representing minimal spatial area—to reduce the numberof points while maintaining a reasonable density. In other words, points are compared withother points in a small neighbourhood, and the point with the smallest determinant is kept,while the others are removed from the map. Thus, the map Gaussian sum is composed ofa minimal representative point set.

This experiment uses a naive realisation of the particle filter—without provision for theproblems of sample impoverishment or inadequate sample coverage. It is assumed that theprocess noise provides sufficient roughening to prevent impoverishment, and the sensor noiseis broad enough to permit reasonable overlap between the prior and posterior distributions.

4.6.1 State Vector and Process Model

The vehicle state is represented by its position and velocity as follows.

xk = [xk, yk, φk, xk, yk, φk]T

8Discussion of particle filter concepts and implementation is provided in Appendix D.


Note, the pose (xk, yk, φk) is the vehicle pose with respect to the global coordinate frame,while the velocity (xk, yk, φk) is the change-in-pose with respect to the vehicle-centred coor-dinate frame. This form permits a straightforward realisation of a constant velocity processmodel.

The six-element state means that a constant velocity state transition model can bederived that satisfies Markov assumptions (i.e., the predicted state x−

k+1 is dependent onlyon the previous state estimate x+

k ). The process model, therefore, is given by

xk+1 =

xk + ∆T (xk+1 cos φk − yk+1 sinφk)yk + ∆T (xk+1 sinφk + yk+1 cos φk)

φk + ∆T φk+1

xk + ∆Tq1

yk + ∆Tq2

φk + ∆Tq3

where ∆T represents the change-in-time, and q1, q2 and q3 are independent random variablesrepresenting the uncertain change in velocity due to accelerations. The values for q1, q2 andq3 are drawn from zero-mean Gaussian distributions with standard deviations 1.5m/s2,1.5m/s2, and 0.3rad/s2, respectively. Notice that the process model uses the predictedvelocities for time k +1 to predict the vehicle pose. These values introduce the accelerationuncertainties into the pose prediction and, in doing so, add roughening noise to the samplesof the next prior; thus, uncertainty in pose is due entirely to acceleration uncertainty.

4.6.2 Likelihood Modification

The structure of the mine environment necessitates a modification to the likelihood func-tion obtained directly from scan correlation. The problem is that the robustness of cross-correlation to outlier measurements produces an over-conservative likelihood in the directionof the mine walls.

A typical likelihood function from the mine environment is shown in Figure 4.13, whichillustrates the extended likelihood tails in the direction parallel to the mine walls. (Note,the likelihood is reasonably narrow in the perpendicular direction.) Essentially, the char-acteristic wall texture, which indicates the most likely vehicle pose, is not very distinctivefor the scan correlation method described in this chapter (since it fails to penalise outliermeasurements). The alternative scan correlation method in Section 4.4.5 might perhapsaddress this problem without having to resort to the ad hoc modification described below.

The solution chosen in this application is to use the basic cross-correlation method (i.e.,where the observation forms a single PDF for correlation with the reference PDF) andmodify the resulting likelihood to reduce the tail thickness. The modification is simplyto subtract half the maximum sampled likelihood from the likelihood weight of each posesample.9

Λk(i) = Λk(i) − max {Λk(i)}2

Likelihood samples with Λk(i) < 0 are assigned zero likelihood Λk(i) = 0.9The likelihood of each state sample is dependent only on the pose portion (xk, yk, φk) of the state vector,

not the velocity portion (xk, yk, φk).


Figure 4.13: Likelihood function along a straight tunnel section. Cross-correlation produces reasonably narrow peak in the direction perpendic-ular to the walls but is over-conservative along their course.

4.6.3 Results

An initial sample set of 3000 particles is drawn such that the pose estimate has mean[81m, 96m,−2rad] with uniform distribution in the range [±6,±8,±3.5], and the velocityestimate has mean [3.9m/s, 0.0046m/s, 0.0046rad/s] with Gaussian distribution governedby the standard deviations [.2, .2, .02]. The “true” initial state was somewhere close tox0 = [84.8, 100.9,−2.1, 3.8, 0.0036, 0.0055]T .

With each subsequent iteration of the particle filter, the number of resampled particles is1000. The first six iterations are shown in Figure 4.14. Notice that after the first iteration,the sample distribution is multimodal, and this quickly settles into two dominant modes dueto symmetries in the map structure. The distribution then converges to the true mode aftersix iterations. Notice also that this implementation does not make use of “coastline” infor-mation, which would constrain particles to within the mine walls (i.e., prior samples outsidethe mine walls would have zero likelihood); this constraint is not particularly advantageoushere after initial convergence.

The localisation results for the entire run are shown in Figure 4.15. This figure givesthe mean pose estimate for each iteration as a black line and, at selected intervals, showsthe full prior and posterior sample sets. Note, the apparent discontinuity at coordinates(35, 25) is actually a 3-point turn.

The (x, y) posterior standard deviation reached as high as 1.5m in the direction parallelto the tunnel walls along some straight sections. However, it was typically less than 0.1m inthe direction perpendicular to the walls, and similarly less than 0.1m in all directions as thevehicle approached distinctive regions such as tunnel junctions. Clearly, this particle-basedlocalisation result is quite accurate and robust.


60 65 70 75 80 85 9080

85

90

95

100

105

60 65 70 75 80 85 9080

85

90

95

100

105

60 65 70 75 80 85 9080

85

90

95

100

105

60 65 70 75 80 85 9080

85

90

95

100

105

60 65 70 75 80 85 9080

85

90

95

100

105

60 65 70 75 80 85 9080

85

90

95

100

105

Figure 4.14: First six iterations of localisation. The green samples are the prior (withgreen sample covariance ellipse). The black samples are the posterior (with black samplecovariance ellipse). The red points are the unprocessed laser data projected from the mostlikely posterior sample.


−10 0 10 20 30 40 50 60 70 80 90 100−20

0

20

40

60

80

100

120

Figure 4.15: Particle filter localisation. The mean pose for each iterationis depicted by the black line. The prior (green) and posterior (black)samples are shown at periodic intervals along with the unprocessed laserscan (red).

4.7 Remarks: A 2-D PDF for Map Building and SLAM 104

4.7 Remarks: A 2-D PDF for Map Building and SLAM

This chapter is, in part, a response to the work in [95] with regard to the use of a 2-DCartesian parameter-space as the probabilistic representation of a 2-D environment. A maingoal of this chapter is to show when and where a 2-D PDF is an appropriate representation,and how it should be interpreted.

The work in [95] presents experimentation in a subsea environment using a scanningsonar sensor. Each radiating sonar “ping” is modelled by a sum-of-Gaussians uncertaintyin range and a Gaussian in bearing, indicating the likelihood of a point-target location.This polar PDF is then transformed to observer-centred Cartesian space. The individualping PDFs, for a full rotation of the sonar, are accumulated into a joint “scan” PDF viamultiplication.10 The scan PDFs may then be used to perform localisation (through scancorrelation), map building (multiplication of the scan PDF with an accumulating map PDF,given location), or SLAM.

The fundamental assumption in [95] is that a 2-D PDF is capable of representing theshape of a complex environment and can provide the basis for multiple sensor Bayesiandata fusion.

This chapter contends that the use of a 2-D PDF for map building or SLAM is flawed;map building will produce nonsense and SLAM will diverge. The Gaussian sum representa-tion for a sonar ping is valid, but its fusion with other pings via multiplication is incorrect.

Inherently, the 2-D PDF represents the uncertainty distribution of a single point-locationtarget. Thus, the sampled amplitudes of a sonar ping, which measure the likelihood of apoint target versus range, are reasonably represented. The Gaussian sum approximationof the ping is also reasonable, and the set of Gaussians in the sum might be interpretedas saying “the true state is here or here.” On the other hand, the fusion of measurementsby multiplication is equivalent to saying “the true state is here and here.” This is not areasonable interpretation of the data, as each ping does not observe the same point targetbut different aspects of a non-point surface. Recursive estimation with this data in a 2-Dparameter-space is inconsistent.

The apparently reasonable results in [95] are most likely an artifact of using Gaussians(because of their infinite tails), and a counter example can be given by considering a pingwith bounded distribution in bearing. In this case, map building (for example) would failsince multiplication of the map PDF with the ping PDF would set every part of the mapoutside the bearing bounds irrevocably to zero likelihood. Circumventing this problem, byspecifying which portions of the map are affected, would involve extremely ad hoc tinkeringwith the estimation process.

The conclusion of this chapter is to claim that a 2-D PDF representation is suitedonly for tasks like scan correlation and localisation. At most, it might be used to aligna set of scans, using a batch process (possibly EM), such that the resulting “map” is thesum of the aligned distributions, which might be used for localisation. It is not suitablefor recursive estimation problems, such as map building or SLAM, which will inevitablyconverge (collapse) to a single point.

10Accumulation into a “scan” PDF is just an implementation detail to simplify scan correlation. Locali-sation, map building and SLAM might equally be performed using the individual ping PDFs directly.

4.8 Summary 105

4.8 Summary

This chapter examines scan correlation as an alternative to feature-based data association.Scan correlation involves finding the relative pose between two unprocessed point data sets,so that the data sets are aligned. A review of current correlation methods—ICP, anglehistogram, occupancy grids, and probabilistic methods—argues that most are deficient intheir use of sensor uncertainty. The goal of this chapter, therefore, is to formulate a scancorrelation method that incorporates sensor uncertainty appropriately.

If a scan of data is represented in Cartesian space by a 2-dimensional PDF (or 3-DPDF for a 3-D data set), this distribution is implicitly interpreted as defining a single pointtarget. Higher-dimensional parameter-spaces might permit more realistic interpretationsof the data, but are subject to other limitations (e.g., they require feature models, arecomputationally intractable, or lack theoretical justification).

A Bayesian likelihood function for scan correlation, using the “point-location target”model, is shown to be computed via cross-correlation of the two scan PDFs.

A 2-D sum of Gaussians is proposed as a PDF representation for point data sets wherethe individual point distributions are Gaussian. This representation is shown to permitefficient scan cross-correlation. Two variants of scan correlation are presented. The firstis straightforward cross-correlation between two scan PDFs, and the second involves cross-correlation of the reference scan PDF with individual measurement Gaussians from theobservation scan. The first variant is robust to outliers and viewpoint variation, while thesecond is less robust but can be more accurate.

Gaussian sum scan correlation is demonstrated via maximum likelihood dead reckoningin an outdoor environment and in a subterranean mining tunnel. This application gives goodresults but maximum likelihood estimation is fragile for multimodal likelihood functions. Asecond application for scan correlation is particle filter localisation, again in the mine tunnel,which shows that the likelihood function gives an accurate and conservative estimate of poseuncertainty.

Chapter 5

Considerations for SLAM inModerate-Sized Environments

This chapter is concerned with the practical realisation of stochastic SLAM in small tomoderate scale environments. The qualifier “moderate” refers to an environment size wherethe number of stored features (i.e., computation) and non-linearities are small enough topermit straightforward implementation of the SLAM algorithm as presented in Section 2.2.

The first consideration, particularly in dynamic environments with non-static and tem-porarily static entities, is feature management—the addition and deletion of map features.Without mechanisms to minimise the addition of non-static or unstable features, and to re-move obsolete or redundant features, the SLAM map will progressively accumulate clutter.Eventually, this clutter will lead to either excessive computational load or data associa-tion failure. Thus, feature management is an essential requirement for successful long-termdeployment.

The second consideration concerns the cycle detection problem. While batch data asso-ciation (such as CCDA) yields reliable association within local regions, it is an insufficientmechanism for robust cycle detection, and may provide little better reliability than individ-ual (i.e., one assignment at a time) association. This chapter describes when this problemwill occur and a simple method for its solution.

The feature management and cycle detection concepts are illustrated with experimentalSLAM results, performed using the basic SLAM implementation described in Section 2.2and the CCDA algorithm.

5.1 Feature Management

Feature management is a major contributing factor to SLAM being a harder problem thana priori map localisation; not only is it essential to ensure correct associations betweenobserved and stored features, but it is also necessary to manage measurements that fail thesevalidation tests. Non-matched observations might be due to dynamic objects, anomalousfeature extraction, outlier measurements of map landmarks, or measurements of landmarkspreviously unseen. SLAM requires that the last of these cases is detectable, while rejectingthe others, so as to permit map extension.

5.1 Feature Management 107

This section is concerned with three aspects of feature management for dynamic envi-ronments.

• Feature addition. Non-static or unstable features are avoided by deferring additioninto the map until satisfactory evidence of stability is obtained. The method of con-strained initialisation is examined as a mathematically consistent way to incorporatetentative observation information.

• Removal of redundant features. It is not necessary to retain all static features to buildan effective map, and the removal of extraneous features (i.e., density control) candeliver substantial computational saving. Measures are discussed for assessing featurereliability and information content as deletion criteria.

• Removal of obsolete features. If the structure of the environment changes over time,those landmarks ceasing to exist should be removed from the map. By deleting fea-tures that are predicted visible, but are not observed, the map automatically adjuststo change.

5.1.1 Feature Addition

The standard procedure for deferring feature addition is to add new features to a “tenta-tive” list, and transfer features from this list to the map once they have been reobservedsufficiently often. Thus, on obtaining a new observation, an attempt is made first to asso-ciate it with a feature in the map, then to a feature in the tentative list and, failing that, itbecomes a new tentative feature. A series of associations to a particular tentative featurerelegates it to the “confirmed” map, while failure to reobserve a tentative feature results inits demise.

In [53], SLAM is performed using sonar, which is “notorious for exhibiting drop-outs,false returns, no-returns and noise.” To circumvent this problem, all measurements notmatched to known map features are stored in a list as potential features. These are retainedfor N time-steps and are discarded if not reobserved during this time. Each time-step asearch for clusters1 of M ≤ N measurements is made in the tentative list and, if found, eachcluster defines a new confirmed map feature.

A similar approach is presented in [42], where each tentative landmark has a counter forthe number of times it is reobserved. A tentative feature is transferred to the map once asufficient number of associations have occurred, or deleted if not observed for a given periodof time.

5.1.2 Constrained Initialisation

The issue of feature initialisation is, in essence, an issue of data association—whether or nota set of subsequent observations correspond to a particular tentative feature. In ascribing

1Clustering is determined via pair-wise compatibility using the NIS validation gate. Each measurementin a cluster must be from a different time-step and must be compatible with every other measurement in thecluster. Typically, the NIS calculation for tentative features is not performed correctly; this is addressed inSection 5.1.2.


associations, however, the calculation of appropriate validation gates for tentative featuresis a non-trivial task and is frequently done incorrectly. A related problem is the matter oftransferring the accumulated observation information to the map once a feature has beenconfirmed, and this is usually carried out in a very suboptimal manner. In the discussionbelow, the method of constrained initialisation [140] is shown to address both of theseproblems.

The exposition in [42] provides a good example of the usual approach to the initialisationproblem. A potential feature xi is obtained from an observation z = [r, θ]T with covarianceR. [

xi

yi

]= gi (xv, z) =

[xv + r cos(θ + φv)yv + r sin(θ + φv)

](5.1)

The covariance of this feature is given by

Pi = ∇gxvPv∇gTxv

+ ∇gzR∇gTz (5.2)

where ∇gxv and ∇gz are the Jacobians of gi (xv, z) with respect to xv and z, respectively,as given in Equations 2.29 and 2.30 in Section 2.2.4 (notice that Equation 5.2 is simply thebottom diagonal of Equation 2.31). Having similarly obtained another potential feature xj ,the innovation (and innovation covariance) between them is calculated as follows.

νij =[

xi − xj

yi − yj

](5.3)

Sij = Pi + Pj (5.4)

Association validation is then determined via the NIS gate.

Mij = νTijS

−1ij νij < γ2 (5.5)

There are two suboptimal consequences of this approach. First, the validation gateis under-constrained since the innovation covariance Sij does not include the correlationsbetween features xi and xj (resulting from their correlations with the vehicle). Thus,associations may be made that should not be. Second, since these correlations have beenlost, only the final observation can be used when initialising a confirmed feature into themap; all preceding observation information for the feature will be lost.

Constrained initialisation [140] is an optimal and consistent mechanism for feature ini-tialisation, which allows immediate addition of tentative features into the map but defersdata association until the feature is confirmed. This method permits fully constrained dataassociation and conserves all tentative observation information.

Consider the following example where there is an augmented state vector xa = [xTv ,xT

m]T

that is subsequently augmented with two tentative features xi and xj .2 The covariance ofthe map may, therefore, be represented by

Pa =

Pv Pvm Pvi Pvj

PTvm Pm Pmi Pmj

PTvi PT

mi Pi Pij

PTvj PT

mj PTij Pj

(5.6)

2State augmentation is performed using the procedure shown in Section 2.2.4. This method is known tobe consistent—it adds no information to the older portions of the map and does not involve data association.


If, at some stage, the features xi and xj (and, perhaps others) are found to represent thesame (new) landmark, they can be combined by constraining their estimates. On the otherhand, if a tentative feature is not reobserved within a reasonable period, it is deleted fromthe map.3

Data association is performed by way of a perfect “virtual” observation zij = xi − xj = 0,stating that the two features are equivalent.

zij = hij (xa) =[

xi − xj

yi − yj

](5.7)

This constraint has the following innovation and innovation covariance.

νij = zij − hij (xa) = −hij (xa) (5.8)

Sij = ∇hxaPa∇hTxa

(5.9)


∇hxa =∂hij

∂xa

∣∣∣∣xa

=[

0 . . . 0 1 0 −1 00 . . . 0 0 1 0 −1

](5.10)

From the sparsity in ∇hxa , a more efficient evaluation of the innovation covariance is pos-sible.

Sij = ∇hr

[Pi Pij

PTij Pj

]∇hT

r (5.11)

where the reduced Jacobian ∇hr is

∇hr =[

1 0 −1 00 1 0 −1

](5.12)

The resulting NIS value (using Equation 5.5) correctly includes the cross correlations Pij

for properly constrained gating.Having confirmed that a set of tentative features represent the same landmark, a final

constrained estimate is obtained through the normal Kalman filter update equations pre-sented in Section 2.2.3 (using Equations 5.8 and 5.9 in place of Equations 2.17 and 2.18).This causes the constrained features to each possess the same value (and correlations), andso the set of redundant estimates can be simply deleted from the map. A complete discus-sion of constrained initialisation is provided in [140], including proof that the solution isequivalent to that obtained without deferred data association.

5.1.3 Density Control

In many environments, it is possible to obtain many more static landmarks than are neces-sary for accurate navigation, and the increase in information is outweighed by the increasein computation and storage. This is particularly so when features are unevenly distributed

3Deletion of a feature from the SLAM map has been shown to be consistent [40] and, in the case offeature initialisation, no information is lost save the location of the tentative feature.


or clustered. The aim of density control, therefore, is to determine which subset of featuresto retain, and which to remove, so as to obtain a “best” minimal map.

Presented in [40] are two important characteristics of the stochastic SLAM algorithmwith regard to feature removal.

• A feature can be deleted (by removing the appropriate elements from the state vec-tor, and the associated rows and columns of the state covariance matrix) withoutcompromising the statistical consistency of the map.

• The accumulated information of a deleted feature must be discarded, and cannot bereinstated if the feature is reobserved; the feature must be initialised using only thenew observation information. Conversely, reusing the old information will result in aninconsistent map.

In [40], the information content of a particular feature xi is specified as the inverse ofthe trace of its covariance Pi (a submatrix of the state covariance). The following strategyis proposed to permit feature removal with minimal information loss. At each time-step,those map features transitioning from visible to not-visible are stored in a list and, wheneverthe vehicle has travelled a distance interval greater than dv, a single feature from the listis retained (i.e., the one with maximum information) and the rest are discarded from themap. The result is that the landmark density outside the vehicle field-of-view becomesclosely related to the distance dv. Using this strategy, experimental results in an indoorenvironment showed that over half the map features could be deleted without significantchange in SLAM accuracy.

Another criterion for feature removal is a measure of “quality” as suggested in [42], whereat regular intervals (perhaps again using the distance dv) those features below a fixed qualitythreshold are deleted. The quality measure given in [42] is determined from the innovationsequence for each feature as follows. Suppose feature xi has been observed n times, thenthe innovation sequence for this feature is {ν1, . . . , νn} with covariance {S1, . . . , Sn}. Thequality of this feature, therefore, is given by

Qi =

∑nj=1

1

2π√

|Sj |exp(−1

2νTj S−1

j νj

)∑n

j=11

2π√

|Sj |(5.13)

which is the normalised sum of association likelihoods (see Equation 3.2 in Section 3.1.2).4

This quality measure is essentially an indicator of feature stability and may also be usefulas a criterion for feature addition.

In this thesis, a third criterion for feature removal is proposed: feature visibility. Thisexpresses that the utility of a particular feature is dependent on the size of the region fromwhich it is observable. In a sense, feature visibility is similar to information content—themore frequently a feature is observed, the more certain its location—however, there is asignificant difference, which is well illustrated by the case where the vehicle is stationary.Repeated observations of a feature from a fixed location serves to ever reduce its uncertainty

4Note, the likelihood formulation in Equation 5.13 represents the particular case where the innovationvector is of dimension 2 (i.e., for a range-bearing observation of a point feature).


(i.e., increase information) but says nothing about the size of its visibility region, which maybe arbitrarily small due to occlusion from surrounding objects.

For the SLAM problem, feature visibility is generally more important than informationcontent, but this is not necessarily always the case. The essential measure of a feature’sutility is the frequency with which it is observed—thereby enabling the robot to gauge itspose—and observation frequency is a function of the size of the feature’s visibility regionand the quantity of time the robot spends in that region. As a measure of utility, the sizeof the visibility region indicates potential observation frequency, and is the better measureif the robot trajectory is unconstrained (i.e., the robot may travel anywhere within theenvironment). On the other hand, information content provides an indication of past ob-servation frequency, and is therefore a better utility measure if the robot is constrained tofollow previous actions and trajectories (e.g., the robot trajectory might consistently favoura less visible feature). Note, further complication to measuring utility is introduced by theinteraction of multiple features. For example, a feature with small visibility region and lowinformation content becomes important if it is the only feature visible from a certain pose.Another feature may be of better quality, but might always share visibility with other nearfeatures.

The following metric for quantifying visibility is used in this thesis. A feature’s visibilityupon initialisation is zero and, whenever the feature is observed for two consecutive scans,its visibility rating is increased by the distance travelled by the vehicle over that interval.While this metric is not a very accurate representation of visibility (and, no doubt, bettermetrics are possible), it was found to be sufficient for the experimental purposes of thisthesis. Essentially, it is a compromise between visibility and information content; it doesnot actually specify a region of visibility, but rewards feature utility as a function of bothobservation frequency and distance travelled.

The strategy for feature removal, therefore, is as follows. The vehicle field-of-view isspecified by a bounded region and, each time-step, a set of features transition from insideto outside this region. If a transitioning feature has no neighbours closer than dv, it isretained. If, however, it has a neighbour closer than dv with greater visibility, it is removed.On the other hand, if the neighbour has lower visibility and is also outside the field-of-viewregion, the neighbour is removed.5 A demonstration of this strategy in practice is shownin Figure 5.1. Note that the field-of-view region may be arbitrarily greater than the actualsensor field-of-view and, in this example, extends 10 metres behind the vehicle.

5.1.4 Obsolete Feature Removal

Landmarks become obsolete if they move, are removed, or become permanently occluded.For example, people or vehicles might remain stationary for arbitrary periods of time, and beconsidered good quality landmarks by the SLAM algorithm. In the longer term, structuralchanges may occur in the environment—furniture shifted, trees or poles removed. An object

5A problem with this strategy can arise if a highly reliable feature becomes permanently occluded forsome reason. Other near features may be unable to accumulate sufficient visibility to remove it; they arealways deleted first. One way to circumvent this problem is to maintain the visibility count of all validlandmarks—even after removal. (This approach assumes that correct association can be made to a deletedfeature once it is reinitialised.)


−20 0 20 40 60 80 100

−40

−30

−20

−10

0

10

20

30

40

50

60

Figure 5.1: Density control. The vehicle performs SLAM along the tra-jectory shown (using a laser-sensor in an outdoor environment) with thefinal location depicted by the triangle. As map features move out ofthe prescribed field-of-view region they are culled to obtain a minimumspacing of dv = 4 metres.

might be placed in front of a landmark, occluding it from view. For whatever reason, somelandmarks may cease to exist and no longer provide useful information. These obsoletefeatures should be deleted from the map to maintain an uncluttered and contemporaryrepresentation of the environment.

The density control methods in Section 5.1.3 tend to remove obsolete features that arein close proximity to good features. This discussion focuses on removing obsolete featuresthat may not be near other objects.

In [53], if a map feature is predicted to be visible, but is not observed over severalconsecutive time-steps, it is deleted. A probability of detection PD < 1 is assumed suchthat the probability of a feature’s existence, if not observed over n time-steps, is (1 − PD)n.Thus, setting a threshold on n is equivalent to setting a minimum probability threshold onthe feature’s existence at the predicted location. The explanation in [53], however, doesnot specify criteria for “predicted visibility.” For example, it does not mention whether thevehicle uncertainty or models of occlusion are taken into consideration.6 These details arethe subject of the method presented below.

6Another visibility consideration, not addressed in this thesis, is the range of vehicle headings (i.e., angularrange) over which a feature is predicted visible. This is important for features such as edges (i.e., rangediscontinuities) that might only be visible from particular directions.

5.2 Application: Partial SLAM 113

In this thesis, obsolete features are detected by projecting map features into sensor-spaceand comparing their expected location with the current observation set. The approachdescribed is specific to a scanning laser sensor, where the set of measurements can beconnected to form a polygon, such that map features are either inside (i.e., predicted visible)or outside (i.e., occluded or outside the field-of-view).

One implementation is to convert each map feature xi to an observer-centred Cartesianframe as follows.

zCi = hi (xa) =[

(xi − xv) cos φv + (yi − yv) sin φv

(yi − yv) cos φv − (xi − xv) sin φv

](5.14)

RCi = ∇hxaPa∇hTxa

(5.15)

where the Jacobian

∇hxa =∂hi

∂xa

∣∣∣∣xa

is obtained in the same way as shown in Equations 3.20 to 3.25 of Section 3.3.1. The re-sulting estimate will lie either inside or outside the scan polygon as shown in Figure 5.2(a).If the feature is sufficiently inside the polygon, (e.g., by 2σ), then it is marked as “obso-lete” (i.e., predicted visible and not observed). This might be accomplished practically bysampling equispaced points about the feature’s 2σ ellipse, and subjecting them to a point-in-polygon test [70]. Note, in the figure, the sensor uncertainty is assumed much smallerthan the feature uncertainty, and is ignored. However, measurement uncertainty might beincluded by shortening each range measurement by 2σr.

Since the laser scan is a set of angle-ordered range-bearing measurements, a much moreefficient implementation of this test is possible. The map features are projected into polarsensor-space as follows.

zi = hi (xa) =

√

(xi − xv)2 + (yi − yv)

2

arctan(

yi−yv

xi−xv

)− φv

(5.16)

Ri = ∇hxaPa∇hTxa

(5.17)

where the Jacobian ∇hxa is given in Equation 2.20 of Section 2.2.3. The 2σ ellipse of zi

is then tested for being inside the (θ, r) polygon as shown in Figure 5.2(b). This may beperformed by sampling the ellipse as before and checking whether, for each sampled value(θs, rs), the two theta values in the laser scan bounding θs possess values for r greater thanrs.

5.2 Application: Partial SLAM

This application demonstrates an interesting type of feature management, where all featuresoutside the vehicle field-of-view are removed. The field-of-view, in this case, is specifiedas a (40m radius) circular region about the vehicle. This application is termed partialSLAM since it represents an intermediate form between full SLAM, where all features arestored, and the sensor-based dead reckoning of Section 3.4, where only the features fromthe previous scan are retained.


−40 −30 −20 −10 0 10 20 30 400

5

10

15

20

25

30

35

40

45

(a) Cartesian (x, y)

0 0.5 1 1.5 2 2.5 30

5

10

15

20

25

30

35

40

45

(b) Polar (θ, r)

Figure 5.2: Obsolete feature removal. The 180◦ unprocessed laser scan defines a polygonrelative to the sensor location. The range of these measurements is limited by a “no return”maximum (40 metres in this case). Features from the SLAM map are transformed to theobserver coordinate space and compared with the polygon bounds. If the 2σ ellipse of afeature is inside the polygon it is marked as obsolete. Figures (a) and (b) represent the sameinformation, (a) in observer-centred Cartesian space, and (b) in polar space. In Figure (a),the feature on the right is removed and the other two are retained.


Sensor-based dead reckoning, through alignment of consecutive scans, enables (almost)temporally uncorrelated motion estimation, but produces pose estimates with monotonicallyincreasing uncertainty over time—even if the vehicle is standing still. Partial SLAM, onthe other hand, is temporally correlated, and produces pose estimates that converge withtime, but diverge slowly with spatial motion.

Partial SLAM is identical to full SLAM for the case where the vehicle is motionlessor moves only in one direction (or, more precisely, does not return to old regions of theenvironment). While full SLAM experiences convergence when retracing old regions, par-tial SLAM suffers unbounded uncertainty growth with all translational motion beyond itscurrent field-of-view. This growth tends to be very slow, however, and has a correspondingbenefit of bounded computation due to the bounded map size. Also the problems with cycledetection (see Section 5.3 below) tend not to be an issue since the map size is of the sameorder as the sensor range, and is strongly correlated.

5.2.1 Results

Experimental results of partial SLAM are shown in Figure 5.3, where the feature extrac-tion and other implementation details are the same as for sensor-based dead reckoning inSection 3.4. While the accuracy of this result appears similar to the results in Section 3.4,there is a significant difference in the estimate of uncertainty, as shown in Figure 5.4.

The covariance of the sensor-based dead reckoning result (in Section 3.4) tended togrow very quickly due to over-conservative tuning of the measurement uncertainty and,particularly, occasional loss of tracking (i.e., if insufficient features were matched betweenscans, the sensor data was ignored and the change-in-pose estimate was simply the veryuncertain prediction). Because of the gross effects of occasional tracking loss on the rapidgrowth in pose uncertainty, these graphs were not shown in Section 3.4; they were essentiallymeaningless in the long-term.

However, for partial SLAM, with the same tuning parameters, quite reasonable esti-mates of uncertainty were obtained as shown in Figure 5.4. This was because storage oflocal features, and their reregistration, served to permit temporal convergence of the poseestimate, in spite of conservative tuning of the measurement uncertainty. The most signif-icant improvement was the ability to reregister after temporary tracking failure, such thatthe very great increases in uncertainty from being lost over short-term scan sequences couldbe resolved. For example, scans 1743 to 1760 failed to register, resulting in extreme poseuncertainty, but this was reduced subsequently with the registration of scan 1761.

5.2.2 Partial SLAM and GPS

Partial SLAM on its own is not particularly useful; it essentially provides a highly accurate,albeit temporally correlated, dead reckoning estimate. It has potential value, though, as anauxiliary information source in conjunction with an observer of absolute pose, such as GPS.While GPS may be intermittent, noisy, and inaccurate (depending on satellite availability),partial SLAM acts both as a low pass filter on the absolute measurements and an interpo-lating estimate between GPS signals. The GPS information can be incorporated into theSLAM equations using a trivial observation model such as zgps = xv with uncertainty Rgps.


−100 −50 0 50 100 150−300

−250

−200

−150

−100

−50

0

50

Figure 5.3: Partial SLAM in the internal road environment.


0 500 1000 1500 2000 25000

1

2

3

4

5

6

0 500 1000 1500 2000 25000

0.01

0.02

0.03

0.04

0.05

Figure 5.4: Partial SLAM covariance estimates. The top figure shows the xv and yv standarddeviations (scan number versus metres), and the bottom shows the φv standard deviation(truncated at 0.05 radians; peaked 0.5 at scan 1760). Notice that the uncertainty everincreases with vehicle motion because feature information outside the vehicle field-of-viewis lost. However, sudden large increases in uncertainty, due to temporary loss of tracking,are recovered by subsequent reregistration with the local map.

5.3 Data Association and the Cycle Detection Problem 118

5.3 Data Association and the Cycle Detection Problem

Batch data association permits robust association for a priori map localisation and formost situations in SLAM. However, the cycle detection problem introduces a difficulty notdirectly addressed by batch association methods. This difficulty arises from the weaknessof correlation between new and old parts of the map when closing the loop, which impliesweak relative constraints and reduced benefit in using a batch process.

Consider the example in Figure 5.5. The vehicle returns to an old map region via a largeloop, all the while creating new map landmarks. Since the vehicle has not yet reregisteredwith the old map features, its pose uncertainty relative to those features is very large.Furthermore, the uncertainty of the new map features with respect to the old map is alsolarge (i.e., large covariance diagonals and weak cross-covariance terms). Suppose the vehiclemakes an observation scan where it observes several of the new map features and anotherfeature not previously observed (perhaps spurious), but the latter observation is (wrongly)associated to an old map feature. The batch constraints in the association process do notprevent this catastrophic failure because the relative constraints between the new and oldmap features are so weak—the association to the old map feature may just as well have beenperformed individually. As a result of this association, the map irretrievably diverges sincethe relative location of the new map portion is updated, and hence constrained, incorrectly.

In general, the problem occurs if (i) there exist groups of features in the map that arestrongly correlated to each other but very weakly correlated to features in other groups,and (ii) the batch association assigns a disproportionate number of observations to somegroups and only one or two to others. Essentially, the reliability of batch association (dueto relative constraints) is dependent on the number of associations within each group.

A solution to the cycle detection problem is quite simple. First, group the map into setsof mutually constrained features and, second, perform batch association separately for eachgroup. If the same observation is assigned in two different groups, then the participatingmap features are said to be the same (i.e., constrained as a single feature).7 In this thesis,grouping map features was performed using a simple, though only approximately correct,algorithm as follows (see Figure 5.6). Initially there are no groups. For each feature xi inthe map, if there exists a group where each feature in the group is suitably constrainedwith xi, then xi is added to that group. Otherwise, xi starts a new group. The metric for“suitably constrained” is a threshold on the covariance of the relative constraint betweentwo features (see Equations 3.15 to 3.19 in Section 3.3.1).

A limitation of this solution to cycle detection is that it lacks checks for false batchassociations to entire groups due to the existence of symmetries in the environment. Thisis a difficult problem which is addressed more completely in Section 6.3.

5.4 Application: Full SLAM

This application presents a full SLAM experiment that incorporates simple feature man-agement, batch data association and cycle detection via feature grouping. The basic im-

7Constraint between two features from different groups is performed according to the constrained ini-tialisation method in Section 5.1.2. The redundant feature is subsequently removed. Note, the two groupsinvolved will then merge to form a single strongly correlated group.

5.4 Application: Full SLAM 119

Figure 5.5: The cycle detection problem. Consider, for example, the situation depictedwhere batch data association fails to prevent a major false association. A vehicle performsSLAM along a cyclic path such that there exists an old map portion (diamonds) and a newmap portion (circles, their true locations given by fainter circles). Each of these portionsof the map are strongly correlated within themselves but only weakly correlated to eachother. The current observation set (triangles) is assigned to the map features using batchdata association. Four are assigned to recent map features and one to an old map feature.However, due to lack of geometric constraint, the incorrect assignment to the old mapfeature is allowed and results in an inconsistent map update. (Note, the falsely assignedobservation may have been of a feature not previously seen, perhaps due to occlusion, orsimply a false return.)


Figure 5.6: Feature grouping. Working sequentially through the vector ofmap features, groups are formed of features sufficiently united by relativeconstraints. Each feature xi is first checked against existing groups. Ifa group is found where xi is constrained by every feature in the group,then xi is added to the group, and the process begins again with the nextnon-grouped feature xj . If, on the other hand, no groups adequatelyconstrain xi, then it forms a new group. Once grouping is completed,batch data association is performed separately for each group.

plementation details for this experiment (i.e., feature extraction, inertial prediction model)are the same as described in Section 3.4.

Feature management in this experiment was limited to visibility-based density controlonly. (A simple ad hoc initialisation criterion was also used, where new features were notadded to the map if closer than x metres to existing features.) The more sophisticatedfeature management methods described in this chapter were not implemented, althoughthis test would certainly benefit from constrained initialisation. Furthermore, the SLAMresult was not real-time since the naive SLAM algorithm used was insufficient to contendwith the large number of features involved.

5.4.1 Results With Medium-Scale Loop Closure

The result of SLAM after the first 1070 scans is shown in Figure 5.7, during which time thevehicle travelled 800 metres. The vehicle first turns clockwise about a small loop, and thentwice counter-clockwise around a large 330 metre loop, before heading away along a linearpath. The size of the main loop was not large enough to necessitate grouping as the mapwas strongly correlated throughout (due to the loop size being of similar order to the sensorrange and the high feature density in the region). The accuracy of the map is apparentfrom the closeness of the two loop traversals (on a narrow road), and the structure of theunprocessed laser measurements when projected onto their global locations; they depictprecisely the cabin walls and tree trunk shapes.

The SLAM features are shown as circles in Figure 5.7, and can be seen to correspondto objects of near-circular cross-section from the unprocessed laser data (n.b., the circlesare iconic and do not represent the estimated radius). These features were culled as they


−20 0 20 40 60 80 100 120 140−80

−60

−40

−20

0

20

40

60

80

Figure 5.7: Full SLAM in the internal road environment. The path of the vehicle over 1070laser scans (800 metres) is shown along with the unprocessed laser returns for each pose(depicting shrubs, trees and cabins). The SLAM features are shown as circles. Note, thereare two subtle points to be gleaned from this figure. The first is the effect of mild lasertilt, where both front and back edges of the cabin verandas are observed as different laserscans detect slightly different vertical slices of the environment. Second, notice that somefeatures lie on the lines of the cabin walls; these are not spurious, but are poles extendingabove (or piers below) the cabin floor.


0 200 400 600 800 1000 12000

0.05

0.1

0.15

0.2

0.25

0.3

0.35

0 200 400 600 800 1000 12000

0.001

0.002

0.003

0.004

0.005

0.006

0.007

0.008

0.009

0.01

Figure 5.8: Covariance for full SLAM over first 1070 scans (top, standard deviations for xv

and yv, bottom, standard deviation for φv). Note, the covariance of the first loop traversalis greater than the second, and returns to a minimum when returning close to the origin(scans 490 and 1000).


moved out of the vehicle field-of-view to obtain a minimum spacing of 4 metres.The covariance of the result is shown in Figure 5.8. The covariance increases as the

vehicle travels away from the origin and reaches a maximum about the extremities of theloop. The covariance falls back to a minimum as the vehicle returns towards the origin.Notice that the map converges with repeated observation, and the peak covariance of thesecond loop traversal is rather less than the first. The rapid rise in uncertainty at the endoccurs as the vehicle again heads away from the origin, this time into an unexplored regionwith considerably lower feature density.

5.4.2 Results With Large-Scale Loop Closure

The results in Figure 5.9 show the trajectory of the vehicle over 2290 scans, during whichit travelled about two kilometres. After completing the dual loop traversal describedpreviously, the vehicle maintained a straight heading until turning left at coordinates(−35,−145), continuing straight until (80,−200) and turning right. At this point, thevehicle performed an off-road counter-clockwise loop and, during this manoeuvre, observedmany spurious features (mostly due to a sloping embankment) and few stable features.This resulted in highly uncertain map building, and probably association failures due tothe simplistic implementations of the feature initialisation mechanism (no tentative set) andthe CCDA algorithm (choose largest clique). Returning to the road, the major clockwiseloop was resumed; a region with very few visible features was encountered at the turn at(−70,−240) and, finally, cycle detection and a large error correction was performed on clos-ing the loop. The vehicle then retraced the old map a distance, turning left at (55,−190),making a small clockwise loop at (100,−120), and performing a second major cycle detection(this time with little correction) on rejoining the original map loop.

The two regions mentioned with few useful features tended to break the map into sep-arated regions connected to each other by very weak correlations. Thus, for reliable cycledetection, feature grouping was essential. A detailed view of the main loop closure is shownat the bottom of Figure 5.9. Notice the large error correction and accompanying reductionin pose uncertainty at this point.

The large error correction in this experiment demonstrates a major flaw in the traditionalstochastic SLAM formulation. Even with correct data association, the map would definitelybecome inconsistent after such a large linearised update. In fact, given the prior mapuncertainty, it is probable that the map was inconsistent before the update. The problem istwo-fold: first, the linear (Gaussian) representation of such large uncertainties is likely to beoptimistic and, second, the large linearised update (including error propagation throughoutthe map via correlations) invalidates the basic near-linear, small-error assumptions of theEKF.

Fundamentally, the existence of large uncertainties in the map, so as to necessitatefeature grouping for data association, is likely to prevent consistent application of the EKF-based SLAM framework. This problem leads to the development of submap methods, whichbreak SLAM into manageable connected subsets, as presented in the next chapter.


−100 −50 0 50 100 150−300

−250

−200

−150

−100

−50

0

50

−50 −45 −40 −35 −30 −25 −20 −15 −10−180

−175

−170

−165

−160

−155

−150

−145

−50 −45 −40 −35 −30 −25 −20 −15 −10−180

−175

−170

−165

−160

−155

−150

−145

Figure 5.9: Full SLAM with large-scale loop closure. The top figure shows the vehicle pathover 2290 scans (2 kilometres), and the bottom figures show detail of the large loop errorcorrection—with the vehicle pose and 2σ covariance ellipse before and after cycle detection.

5.5 Remarks: SLAM versus Batch Map Building 125

5.5 Remarks: SLAM versus Batch Map Building

Recent progress in offline batch map building (e.g., using EM [131]) offers a competitivealternative to SLAM, particularly in highly dynamic environments. This section discussestheir relative merits and stipulates when an a priori batch map might be preferred.

A map produced by SLAM is incremental and requires no prior exploration of theenvironment; it is used immediately for localisation. It also permits on-line adaptation tostructural changes. However, in dynamic environments, SLAM reliability is hindered bythe need to consider all objects (static and dynamic) as potential landmarks. In extremecases, this can lead to data association fragility through map clutter.

Offline map generation using batch procedures has been shown in [131] to permit accu-rate metric maps with minimal human intervention. These maps can be checked manually(visually) for consistency and are subsequently used as static a priori maps. This approachhas several marked advantages. First, the map can be used consistently and reliably with-out a formal uncertainty model (i.e., unlike SLAM, the map does not require a measureof global convergence). The map is formed according to a maximum likelihood criteria,which does not necessarily have to match the true physical shape of the world. Second, iflocalisation becomes temporarily lost (perhaps through incorrect data association), it canbe recovered subsequently by increasing the search space; unlike SLAM, such errors arenot self-propagating. Third, measurements of dynamic objects can be identified on-lineand filtered from the localisation data set. This technique is demonstrated in [129], whereshorter-than-expected range measurements are rejected to permit robust localisation in anenvironment crowded with people. A disadvantage of offline maps is that the map must berebuilt whenever there is significant change to the structure of its working environment.

To offer some general rules of thumb regarding the use of SLAM or static batch maps,consider the following criteria. SLAM is best suited if a task requires an immediate estimateof location in a previously unknown environment. It is also suited to environments wherethe physical structure changes regularly but have only moderate levels of transient objects.In these environments, SLAM can operate robustly given the mechanisms described inthis thesis: batch data association serves to extract known landmarks and reject clutter,constrained initialisation prevents the addition of non-static objects to the map, and obsoletefeature removal adapts the map to structural change. On the other hand, an a priori batchmap cannot automatically adjust to environmental change but is very reliable in stableenvironments with large numbers of transient objects. The static nature of the map meansthat measurements of transient objects can be easily filtered out, and localisation can berecovered even after a data association failure.

5.6 Summary

This chapter is concerned with feature management and cycle detection for traditionalSLAM in medium-sized environments.

Feature management involves addition and removal of map features based on featurereliability and utility. The constrained initialisation procedure is reviewed as an optimalmethod for deferred feature initialisation, which permits delayed data association without

5.6 Summary 126

loss of information. Two reasons for feature removal are examined: density control andobsolete features. Density control reduces map clutter and improves efficiency by deletinglesser quality features. Two measures of feature quality are reviewed—information contentand innovation sequence stability—and a new measure is proposed: feature visibility, wherea feature’s utility is governed by the size of the region over which it is visible. Obsoletefeature removal is necessary to delete landmarks that no longer exist (due to environmentalchange). A feature is marked obsolete if it is predicted visible but not observed and, forthis purpose, a laser-specific model of “predicted visibility” is presented.

For large loops, robust cycle detection may require more than just batch data association;rather, it is shown to rely on feature grouping, where the map features are clustered intostrongly constrained sets. Batch association must then be performed separately for eachset.

Chapter 6

Network Coupled Feature Maps

Stochastic SLAM suffers two major problems in relation to scale: excessive computationand storage costs, and inconsistent (optimistic) uncertainty estimation due to accumulatednon-linearities and bias. This chapter proposes that both these problems can be addressedby partitioning the map into a topological-metric network of submaps. Each submap isa small independent stochastic SLAM map, and these are connected by suboptimal, butconsistent, coupling estimates.

The strengths of topological and metric maps are complementary. Topological mapsprovide a natural division of the environment, low computation and storage, large-scaleconnectivity and consistency, and place recognition through the characteristics of localgroupings. Metric maps, on the other hand, provide high local accuracy, quantified uncer-tainty, generality, and pose constrained data association. Combining these qualities in ahybrid topological-metric map permits feasible consistent SLAM on a very large scale.

This chapter presents a hybrid map framework for stochastic SLAM in extensive envi-ronments. The following topics are discussed with regard to this map proposal.

• A review is made of current submap-network systems that use stochastic SLAM as theestimation mechanism for the local submaps. This discussion is primarily concernedwith the consistency of the submap coupling methodologies.

• The network coupled feature maps (NCFM) framework is introduced. The NCFMconcept is presented in terms of the map structure and the methods for performingglobally consistent, convergent SLAM.

• The cycle detection problem is examined in detail, with solutions to the issues of very-large-scale search and loop confirmation presented using the NCFM framework. Alsodiscussed is the difficult problem of pathological environmental symmetries, whererejection of false cycles may be practically impossible.

• An application of NCFM SLAM in an outdoor environment is shown to demonstratethe concepts described in this chapter.

6.1 Submap Methods for Stochastic SLAM 128

6.1 Submap Methods for Stochastic SLAM

This section reviews five recent submap-SLAM proposals, which represent the current state-of-the-art in decoupled stochastic SLAM methods. Each of these methods possess a commontheme whereby, at the level of individual submaps, SLAM is performed using (essentially)the traditional EKF equations given in Section 2.2. Furthermore, these submaps are main-tained independently so as to avoid correlation with other submaps.

The critical difference between these methods is their approach to estimating the cou-pling between submaps, and thus connecting the map at a global level. This is the area wheresubmap methods break away from established SLAM concepts and involve new complexi-ties, particularly with regard to the consistent use of information. Therefore, this sectionis primarily concerned with the consistency of submap coupling estimation and submaptransitions, and the submap methods are examined in terms of the following questions.

• Coupling estimation. How are the submaps coupled? Are the coupling estimatesconsistent (i.e., do they provide a conservative estimate of pose errors at a globallevel)? Do the coupling estimates converge (i.e., global map convergence) and whatare the limits of this convergence?

• Vehicle transition. When the vehicle transitions from one submap to another, howis the pose estimate reinstated in the new submap? What (if any) information istransferred from the previous submap, and is this consistent?

• Loop closure. Is the global structure of the map appropriate for cycle detection andconsistent loop closure?

6.1.1 Decoupled Stochastic Mapping

Decoupled stochastic mapping (DSM) [90] is actually a global map, but the total region issplit into a number of “globally referenced” cells. (In this regard, DSM is distinct from thesubsequent submap methods discussed in this section, wherein each submap defines a localcoordinate origin.) Correlations are maintained between features within each submap, butare neglected between features from other submaps. Thus, the advantage of this method ispurely computational; if there were N cells of n features, full SLAM would require O(N2n2)computation and storage but DSM reduces computation to O(n2) and storage to O(Nn2).

Each submap has its own estimate of the vehicle pose and the locations of features withinthe cell region—all with respect to the global coordinate frame. All submaps are marked“inactive” except for the single “active” cell currently occupied by the vehicle and, whileactive, this submap is updated according to the standard SLAM algorithm. The inactivesubmaps are frozen with their last “active” state estimate (including vehicle pose).

When transitioning from the active submap to another, one of two transition proceduresare performed: cross-map relocation or cross-map update. The former is used when movinginto an older submap (where age refers to the order of cell creation), and serves simply toreinitialise the vehicle pose estimate in the newly activated cell. The latter procedure isused when entering a newer submap, and transfers information from the older cell to permitimprovement in the global uncertainty of the newly activated cell.


The fundamental contention with DSM is the information transfer between submapsduring the “cross-map” procedures. These processes inevitably introduce correlations be-tween the submaps, and the attempts to then decorrelate them using ad hoc uncertaintyinflation are questionable. In general, the inflation incurs over-conservative submap esti-mates but, in some circumstances, may yet produce optimistic results (n.b., empiricallythese procedures appear to yield consistent results). A consistent alternative to cross-maprelocation is presented in [140], wherein vehicle pose reinitialisation is performed withoutcorrelation between the submaps. However, a theoretically consistent alternative to cross-map update, necessary for global convergence, is not presently available.

Finally, the globally referenced structure of DSM does not address the problem of loopclosure. As with traditional stochastic SLAM, the large-error non-linear update associatedwith closing a large loop will tend to invalidate the basic EKF assumptions of near-linearity,and result in an inconsistent map estimate.

6.1.2 Two-Level Landmark Representation

In [24] a two-level representational hierarchy is proposed. The first of these is the landmarklevel, where a “landmark” is a set of local features expressed according to a common localcoordinate frame. Each landmark also includes an estimate of the vehicle pose relative toits local frame. The second level of representation is the global level, where the pose of eachlandmark (i.e., each landmark coordinate origin) is estimated with respect to a commonglobal coordinate frame. This base frame also has an estimate of the (global) vehicle pose.

The estimation process1 for each landmark is performed in isolation from every otherlandmark; similarly, the global frame is also treated in isolation. At each predictiontimestep, the vehicle states are propagated independently within each landmark, and alsofor the global frame. Meanwhile, in the vehicle-centred frame, a temporary landmark isconstructed and this is matched against the existing map landmarks. Having obtained aunique landmark match, the features of the map landmark are updated using the geomet-ric constraints of the observation landmark. This same observation information is then,separately, used to update the global estimate of the landmark pose.

The advantage of this two-level approach is that, by ignoring correlations between thelandmark and global levels, the computation and storage for SLAM are reduced to O(N2 +n2) and O(N2 + Nn2), respectively (where N is the number of landmarks and n is thenumber of features per landmark).

There are three significant problems with this approach. The first is that neglectingthe correlations between the two levels (i.e., between the features of a landmark and theglobal landmark location) must introduce inconsistencies into the global map; althoughit is possible, in this case, that the level of inconsistency is small and easily subsumed byconservative (inflated) observation uncertainty. The second problem is that the vehicle poseprediction step may introduce very large uncertainties into the local landmark estimates, andthese non-linear errors can compromise filter consistency. A solution to this problem is to

1The underlying estimation process used in [24] is a stochastic SLAM variant called the symmetriesand perturbations map (SP map) [25]. This approach is reported to provide a uniform representation forinformation gathered by different sensors, and to overcome the problem of singularities in the representationof geometric features.


remove the vehicle pose elements from the local landmark state while the vehicle is distant,and reinitialise them when it returns to the landmark vicinity (see [140] or Appendix C.2for details on the vehicle reinitialisation process). The third problem is that, as with DSMabove, the structure of the global level does not address the loop closure problem, and theglobal map estimate may be invalidated by a large non-linear error correction.

6.1.3 Relative Landmark Representation

The relative landmark representation (RLR) [64, 66] possesses a similar conceptual structureto the previous method [24]. It represents sets of nearby features in local relative coordi-nate frames and maintains these local frames according to the global coordinate system.However, unlike the previous method, the RLR manages all the appropriate correlations ina consistent, and near-optimal, manner.

The RLR divides the map into sub-regions where, in each region, the landmarks arerepresented with respect to a local coordinate frame. The pose of each local frame isdefined by a small subset of local features (called “base” features), which are representedin the global coordinate frame. Thus, the SLAM augmented vector consists of: the vehiclestates and local frame base features in absolute coordinates, and the remaining local regionfeatures in relative coordinates. The vehicle and base features are referred to as “absolute”states and the local features are called “relative” states.

The motivation behind the RLR structure is to reduce correlations between state ele-ments. In traditional SLAM, where vehicle and features are all in global coordinates, thestates become increasingly correlated over time. With the RLR, the absolute states becomestrongly correlated, and the relative states within each local region become strongly cor-related to each other and to the absolute states, but the relative states in different localregions are only ever weakly correlated. This is particularly so between relative states fromphysically distant local frames. The key idea, therefore, is that the relative states in differentlocal regions can be “decorrelated” from each other with little loss of information—resultingin a dramatic reduction in computation and storage requirements.

The method of decorrelation is to inflate the diagonal sub-matrices of the state covari-ance, so that the respective states can be treated as uncorrelated.[

A CC B

]decorrelate−−−−−−−→

[A

′0

0 B′

]

The decorrelated covariance is conservative if the difference between it and the originalcovariance matrix is positive semi-definite.[

A′

00 B

′

]−[

A CC B

]≥ 0

The decorrelation strategy of the RLR is to maintain all strong correlations: absolute-to-absolute, relative-to-absolute, and (within each local region) relative-to-relative; but todecorrelate relative-to-relative states from different local regions. Since the correlationsbetween these non-local relative states are very small, decorrelation can be performed withminor inflation of the covariance diagonals.


Nevertheless, performing decorrelation after each observation would result in a veryconservative SLAM estimate, and the RLR achieves its near-optimal convergence propertiesby employing the compressed EKF (CEKF) [65] as its estimation method. The CEKF limitsthe SLAM update step to only effect the relative and base landmarks in the vehicle’s currentneighbourhood (these are termed the “active” states) and postpones the full update, andthe decorrelation process, until the vehicle leaves the local region. Thus, decorrelation isperformed only intermittently and, when applied, inflates the covariance of the active statesbut leaves the covariance of the other “passive” states unchanged.

The advantage of the RLR is its O(Nn2) computation and storage, where N is thenumber of local regions and n is the number of features per region. Also, it is reported toachieve close to the accuracy and convergence rate of full SLAM (although this dependssomewhat on the assumption that the vehicle remains in a local area for a reasonable periodof time so as to minimise covariance inflation).

The main issue not addressed by the RLR is non-linearity. The same problems ofloop closure and large error corrections affecting traditional SLAM, and the two submapmethods discussed previously, will also tend to generate inconsistent estimates with theglobally referenced portions of the RLR state vector.

6.1.4 Hierarchical Local Maps

In [29, 30] a hierarchy of local submaps is introduced, where each local submap is maintainedindependently using traditional SLAM,2 and these are connected by estimates of the relativepose between submap origins.

The relative pose between two adjacent submaps is determined at the creation of thesecond submap as follows. The first submap is built using standard SLAM until the vehiclepose uncertainty reaches a given threshold. Let the augmented SLAM estimate with respectto the first submap coordinate frame S1 be given by

S1xa = [S1 xTv , S1 xT

m]T

S1Pa =[

S1PvS1Pvm

S1PTvm

S1Pm

]

At this point, the second submap S2 is created, with the current vehicle pose as its origin.Thus, the pose of the second submap relative to the first is equal to the vehicle pose portionof the first submap state estimate (i.e., S1 xS2 = S1 xv and S1PS2 = S1Pv).

The set of coupling estimates{

Si xSj ,SiPSj

}, therefore, are extracted from the relevant

state estimates and stored in a coupling tree, which represents the order of submap cre-ation. Consider, for example, a situation where the vehicle constructs and traverses a set ofsubmaps in the order shown in Figure 6.1(a). The coupling tree for this traversal sequenceis shown in Figure 6.1(b). The relative pose between two non-adjacent submap coordinateframes can be calculated from the “monotonic linkage” between the submaps—the vectorsum of connecting relative pose estimates in the coupling tree. For example, the relative

2In fact, [29, 30] implement a suboptimal variant of SLAM called relocation-fusion [102], and use anunscented filter [76] rather than the standard EKF.


(a) Map creation order (b) Coupling tree

Figure 6.1: Submap coupling tree. New submaps are coupled to theirparent submap according to the vehicle pose at creation. In this example,from [29, 30], the coupling tree was constructed from the following order-of-events. Create maps 1, 2, then 3, reenter map 2, create maps 4, 5, and6, reenter map 4, then 3, create maps 7 and 8.

pose between submaps 4 and 6 is the vector sum of 4-to-5 and 5-to-6.

S4 xS6 = f(S4 xS5 ,

S5 xS6

)=

S4 xS5 + S5 xS6 cos S4 φS5 − S5 yS6 sin S4 φS5

S4 yS5 + S5 xS6 sin S4 φS5 + S5 yS6 cos S4 φS5

S4 φS5 + S5 φS6

S4PS6 = ∇fS4xS5

S4PS5∇fS4xS5+ ∇fS5xS6

S5PS6∇fTS5xS6

where the Jacobians ∇fS4xS5and ∇fS4xS5

are found as follows.

∇fS4xS5=

∂f∂S4xS5

∣∣∣∣(S4 xS5

,S5 xS6)

∇fS5xS6=

∂f∂S5xS6

∣∣∣∣(S4 xS5

,S5 xS6)

The relative pose between submaps 4 and 6 can be calculated using two separate Jaco-bians because the individual coupling estimates are uncorrelated. However, a problem ariseswhen this approach is applied to relative pose calculations involving a “common root.” Forexample, the relative pose between submaps 3 and 4 is found from the coupling estimates2-to-3 and 2-to-4. In [29, 30], this calculation is performed in a separable fashion similarto the example above (i.e., assuming S2xS3 and S2xS4 are independent) when, in fact, the


coupling estimates are correlated. This error3 is rectified in [140], as described below.A further issue with this method is that it lacks convergence at a global level; the

coupling estimates remain static throughout. This is particularly troublesome for cycledetection and loop closure since the uncertainties accumulated over a large cycle of submapscannot be reduced even after the loop has been completed (e.g., consider a cycle betweensubmaps 4 and 8, the coupling estimate would remain the path 4-2-3-7-8). In other words,the coupling estimate between the two submaps connected by the cycle will possess a largeand irreducible uncertainty equal to the sum of monotonic linkages connecting them.

6.1.5 Constrained Relative Submap Filter

The constrained relative submap filter (CRSF) [140] is similar in structure to the submaphierarchy in [29, 30], but presents significant advantages in terms of correlation estimationand global convergence.

The CRSF is similar to the method in [29, 30] in the following ways. It consists ofindependent submaps performing traditional SLAM and coupled via “monotonic linkage.”When a new submap is created, its origin is defined by the current vehicle pose, and anestimate of this pose is held by its “parent” submap. Calculation of the relative pose betweentwo non-adjacent submaps is performed via the coupling tree. Both methods are consistentat a global level, avoid accumulated non-linearities, and require O(n2) computation andO(Nn2) storage (where N is the number of submaps and n is the number of features persubmap).

The CRSF differs from the method in [29, 30] in the way it stores the coupling esti-mates, and the consequences of this storage approach. Rather than extract the relativepose estimate from the parent submap state, the coupling estimate (and its correlations tothe parent map) is retained as follows.

S1 xa = [S1 xTS2

, S1xTm]T

S1Pa =[

S1PS2S1PS2m

S1PTS2m

S1Pm

]

When the vehicle subsequently returns to the parent map, the vehicle estimate is reinitialisedusing geometric constraints, so that the state becomes S1xa = [S1xT

v , S1xTS2

, S1xTm]T . There

are two important consequences of this formulation. First, if the parent submap latercreates a second child submap, the correlations between the two child submaps are known;this corrects the error in [29, 30] for “common root” calculations. Second, as the vehicleperforms SLAM within the parent submap, the map converges and the coupling estimatealso converges through its correlations to the map. Thus, the overall CRSF map experiencesglobal convergence while maintaining statistical independence between submaps.

A further advantage of the CRSF discussed in [140] is that geometric constraints caneventually be applied between the submaps so that it merges into a consistent monolithicmap. This operation is shown to be almost equivalent to optimal SLAM implementations,but with considerable improvements in computation and reduction of non-linearities.

3The common root “error” in [29, 30] does not actually constitute an inconsistency; rather it results inan over-conservative estimate of the coupling uncertainty (see Section 6.2.6).

6.2 Network Coupled Feature Maps 134

The CRSF in its present form is limited in terms of convergence. At a local level,the couplings between submaps can converge to a lower limit defined by the correlationsbetween the coupling estimate and the parent submap. More importantly, at the globallevel, convergence is restricted by monotonic linkage and, as in [29, 30], loop closure isunable to reduce the uncertainty between two submaps, which is specified by the couplingtree path.

The CRSF is a promising alternative to the NCFM approach presented in this chap-ter, particularly if the loop closure convergence problem can be solved. In general, thisrequires an investigation into whether a coupling estimate can be formed between two ex-isting submaps, which shortcuts the coupling tree, without having to propagate geometricconstraints throughout the entire map (i.e., without having to correlate all submaps).

6.2 Network Coupled Feature Maps

This section presents a new submap framework called network coupled feature maps (NCFM).4

The basic structure of NCFM is shown in Figure 6.2, where independent submaps are con-nected by a network (or graph) of coupling estimates—these being the relative pose betweensubmap coordinate frames. This concept is similar to the submap methods in [29, 30] and[140], but with the important difference that the coupling estimates are not restricted tomonotonic linkages. The implementation and consequences of this difference are discussedlater in this section.

This section presents the following basic mechanisms for SLAM within the NCFM frame-work.

• Coupling estimate equations. These equations for manipulating coupling estimatesare key to global map management as they enable pose estimation across submapsand along coupling-paths connecting non-adjacent submaps.

• Map traversal. This includes traversal within and across existing submaps, withparticular emphasis on consistent vehicle reinitialisation during submap transition.

• Submap creation. At the boundaries of the existing map, new submaps are createdand coupled to neighbouring submaps. This procedure is developed for the case wherethe vehicle pose becomes the new coordinate origin, and also for when the origin isoffset from the vehicle.

• Coupling convergence. Transitions between submaps allow for iterative improvementof their coupling estimates. A method for consistent coupling estimation is presentedthat produces a globally convergent map while retaining independent submaps. This

4The NCFM framework was first introduced in [4], where it was used primarily to demonstrate thereliability of batch data association in situations where the vehicle pose was highly uncertain. In May2002, a method very similar to the NCFM framework was introduced at an oral presentation at WorkshopW4, IEEE International Conference on Robotics and Automation (ICRA 2002). This approach, the Atlasframework, develops many of the same concepts given in this chapter. However, the author was made awareof this seminar only after submission of this thesis in August, and written details of the approach [18] onlybecame available in October 2002. Thus, the NCFM methodology presented here was not influenced by theAtlas framework in conception or particulars.


Figure 6.2: Network coupled feature map structure. The coordinate axesrepresent the origins of independent submaps. Neighbouring submapsare coupled by estimates of the relative pose between their coordinateorigins.

method tends to correlate coupling estimates attached to any one submap, and theconsequences of these correlations on global consistency are also discussed.

6.2.1 NCFM Overview

The NCFM structure consists of a local submap level and a global topological level. Atthe local level, each submap is a feature map defined with respect to a local coordinateframe. At the global level, each network link defines a coupling estimate of the relative posebetween two submap coordinate origins.

Within a local submap, a robot performs traditional stochastic SLAM. The submap statevector is augmented with the vehicle state, and the normal SLAM operations are performed(e.g., prediction, observation, feature addition). When moving to another area, the vehiclestate is initialised in the neighbouring submap using the coupling information connecting thetwo submaps, and so the robot transitions from one submap state vector to another. Thevehicle state may even be maintained simultaneously in multiple neighbouring submaps.However, each submap is built independently from its neighbours and no information isever transferred from one submap to another.

The coupling estimates between submaps are maintained so that they converge as theirassociated submaps converge. They enable consistent pose estimation across submaps, andeffect map convergence at a global level.

The NCFM framework facilitates two basic navigation paradigms: topological-metric,and (conventional) global navigation. With topological-metric navigation, the vehicle mightbe located at a certain pose in submap A, and its goal location might be in submap D,


which is connected to A via submaps B and C. The submap coupling estimates, therefore,describe a relative connecting path between the vehicle and its goal. Alternatively, for globalnavigation, a particular submap is nominated as the “base” frame, and poses relative toother submaps are converted to the base coordinate system via the topology of couplingestimates. Thus, NCFM can be used as though it were a traditional monolithic map.

Existing submap methods have solved the problem of efficiency for stochastic SLAM.However, the best of these solutions address either convergence (i.e., the RLR, with globalbase-feature coupling) or non-linearity (i.e., CRSF, with relative submap coupling), but notboth. The NCFM framework presents a combined solution to efficiency, convergence andnon-linearity.

In terms of efficiency, NCFM SLAM requires O(Nn2) storage and O(n2) computation,where N is the number of submaps and n is the number of features per submap. To convertthe robot’s local pose to a global estimate takes O(N) coupling evaluations between thecurrent and base submaps. Note, this complexity assumes the coupling path is alreadyknown, and excludes the complexity of the connecting path algorithm (e.g., shortest pathbetween two nodes [39]).

Non-linearity and bias are not significant issues for small-scale SLAM,5 but their cumu-lative effects are problematic for large-scale global maps, particularly in connection withloop closure. NCFM performs all map building at a local submap level so that error correc-tions are small and the effects of non-linearity and bias do not accumulate beyond submapbounds. The calculation of a “global” pose estimate, with respect to a base frame, isstill subject to cumulative errors, but these are non-critical. First, the global estimate isnever used in the actual SLAM update, and so does not effect the consistency of the map.And second, as the individual submaps converge, the local estimation problem becomesincreasingly linear, and the global estimate accordingly becomes more linear.

Eventually biases may cause even small-scale SLAM to become inconsistent. The NCFMframework cannot prevent this but, by containing biases within submaps, serves to makethe effects less pronounced and less detrimental.

Convergence at a submap level is the same as for traditional SLAM, and the couplingestimates converge as their associated submaps converge. Therefore, the map also convergesat a global level. The lower limit of global convergence, as the features within each submapbecome fully correlated, is conjectured to approach that of traditional SLAM.

6.2.2 Coupling Estimate Equations

The coupling estimate of submap frame S2 with respect to submap frame S1 is denotedS1xS2 with covariance S1PS2 . In this case, S1 is referred to as the base coordinate frame.Determining the pose of a particular submap (or of the vehicle) relative to a non-adjacent

5Non-linearity and bias problems fall into three basic categories. First, non-linear transformations causeGaussian distributions to become non-Gaussian, and to be underestimated by a Gaussian approximation.These inconsistencies can be easily offset by adding stabilising noise to the process and observation models.Second, linearised error correction may be divergent if the error is large (i.e., if the first-order Taylor ex-pansion is not a reasonable approximation to the non-linear function). This problem gives rise to the basicsmall-error assumption of the EKF. Third, persistent biases, which result from non-linear transformationsand modelling errors, may cause the system to converge to a biased steady-state. This means that thesystem may eventually become inconsistent even with the addition of stabilising noise.


base frame is an important operation for a variety of navigation tasks, and is crucial forperforming cycle detection (as described in Section 6.3). Three equations for coupling ma-nipulation are presented here—coupling summation, coupling inversion and common rootcoupling—which permit recursive cross-submap calculations.

Coupling summation is the vector addition of two coupling estimates S1xS2 and S2xS3

to obtain the coupling S1xS3 . This operation can be used recursively to determine the poseof an entity with respect to a non-adjacent base frame.

S1 xS3 = f(S1 xS2 ,

S2xS3

)=

S1 xS2 + S2 xS3 cos S1 φS2 − S2 yS3 sin S1 φS2

S1 yS2 + S2 xS3 sin S1 φS2 + S2 yS3 cos S1 φS2

S1 φS2 + S2 φS3

(6.1)

S1PS3 = ∇fS1xS2

S1PS2∇fTS1xS2

+ ∇fS2xS3

S2PS3∇fTS2xS3

(6.2)


are given by

∇fS1xS2=

∂f∂S1xS2

∣∣∣∣S1 xS2

=

1 0 −S2 xS3 sin S1 φS2 − S2 yS3 cos S1 φS2

0 1 S2 xS3 cos S1 φS2 − S2 yS3 sin S1 φS2

0 0 1

(6.3)

∇fS2xS3=

∂f∂S2xS3

∣∣∣∣S2 xS3

=

cos S1 φS2 − sin S1 φS2 0

sin S1 φS2 cos S1 φS2 00 0 1

(6.4)

Given a connecting path between the base frame and the frame representing the object ofinterest, a “global” estimate of the object can be found. For example, in Figure 6.3 the baseframe S1 and the vehicle are connected by the path 1-2-3-v of piece-wise couplings S1xS2 ,S2xS3 , and S3xv. A global vehicle estimate is obtained by applying Equation 6.1 twice todetermine first S1xS3 then S1xv.

Note that the covariance summation in Equation 6.2 is calculated with two separateJacobians, which implies that the coupling estimates S1 xS2 and S1 xS3 are independent (i.e.,uncorrelated). This is true given the coupling estimates when the submaps are first created(see Section 6.2.4), but is generally not the case following the convergence operations shownin Section 6.2.5. The issue of treating correlated coupling estimates as though they wereuncorrelated is addressed in Section 6.2.6 (with the verdict being that the result is consistentand, in fact, conservative).

Coupling inversion is the calculation of the pose of frame S1 with respect to frame S2,when given the pose of frame S2 with respect to frame S1. That is, it allows the switchingof base coordinate frames.

S2 xS1 = f(S1 xS2

)=

−S1 xS2 cos S1 φS2 − S1 yS2 sin S1 φS2

S1 xS2 sin S1 φS2 − S1 yS2 cos S1 φS2

−S1 φS2

(6.5)

S2PS1 = ∇fS1xS2

S1PS2∇fTS1xS2

(6.6)


Figure 6.3: Coupling summation. The vehicle pose with respect to thebase frame S1 is found by recursive application of Equation 6.1. FirstS1xS3= f

(S1xS2 ,

S2xS3

)then S1xv= f

(S1xS3 ,

S3xv

).

where the Jacobian ∇fS1xS2is given by

∇fS1xS2=

∂f∂S1xS2

∣∣∣∣S1 xS2

=

− cos S1 φS2 − sin S1 φS2

S1 xS2 sin S1 φS2 − S1 yS2 cos S1 φS2

sin S1 φS2 − cos S1 φS2S1 xS2 cos S1 φS2 + S1 yS2 sin S1 φS2

0 0 −1

(6.7)Common root coupling is the calculation of S2 xS3 given S1 xS2 and S1 xS3 . That is, given

the pose of two frames with respect to a common base frame, obtain the relative posebetween the two frames.

S2xS3 = f(S1 xS2 ,

S1 xS3

)=

(S1 xS3 − S1 xS2) cos S1 φS2 + (S1 yS3 − S1 yS2) sin S1 φS2

−(S1 xS3 − S1 xS2) sin S1 φS2 + (S1 yS3 − S1 yS2) cos S1 φS2

S1 φS3 − S1 φS2

(6.8)S2PS3 = ∇fS1xS2

S1PS2∇fTS1xS2

+ ∇fS1xS3

S1PS3∇fTS1xS3

(6.9)


are given by

∇fS1xS2=

∂f∂S1xS2

∣∣∣∣S1 xS2

=

− cos S1 φS2 − sin S1 φS2 −(S1 xS3 − S1 xS2) sin S1 φS2 + (S1 yS3 − S1 yS2) cos S1 φS2

sin S1 φS2 − cos S1 φS2 −(S1 xS3 − S1 xS2) cos S1 φS2 − (S1 yS3 − S1 yS2) sin S1 φS2

0 0 −1

(6.10)

∇fS1xS3=

∂f∂S1xS3

∣∣∣∣S1 xS3

=

cos S1 φS2 sin S1 φS2 0

− sin S1 φS2 cos S1 φS2 00 0 1

(6.11)


The three coupling manipulation equations can be summarised using vector notation asfollows.

• Coupling summation.S1xS3 = S1xS2 + RS2xS3 (6.12)

• Coupling inversion.S2xS1 = R−1(−S1xS2) (6.13)

• Common root coupling.

S2xS3 = R−1(S1xS3 − S1xS2) (6.14)

where R is the rotation matrix

R =



6.2.3 Map Traversal

Traversing a local submap is performed using traditional stochastic SLAM. The size of thesesubmaps is presumed reasonably small in comparison to the vehicle’s sensing range so thatminor inconsistencies in the map update (due to model non-linearities and biases) do notaccumulate significantly, and can be absorbed by stabilising noise.

When traversing the edge regions of a submap S1, attempts are made to associateobserved data with the adjacent submap S2. For batch association purposes, the pre-dicted pose of the vehicle is projected into S2 using the common root coupling equationS2 xv = f

(S1 xS2 ,

S1 xv

)as shown in Equation 6.8. This is concatenated with the submap

feature estimates as follows.

S2 xa =[

S2 xvS2 xm

](6.15)

S2Pa =[

S2Pv 00 S2Pm

](6.16)

Batch association is performed using the tracking version of CCDA given in Section 3.3.4.If an acceptable association set is found, then a new vehicle track is initialised in the secondsubmap. It is essential that no information is transferred between the two submaps as thiswould introduce correlations between them; rather the vehicle pose is initialised with zeroa priori knowledge beyond the given data association information.

The method proposed in this thesis for initialisation uses the insert and observe proce-dure (see Appendix C), and further details regarding vehicle initialisation are provided inAppendix C.2. A nominal value S2 xv for the vehicle location is the same as in Equation 6.15.


The prior state for S2 is given by the concatenation of the vehicle pose guess, with infiniteuncertainty,6 and the submap feature estimates.

S2 x−a =

[S2 xvS2xm

](6.17)

S2P−a =

[αIv 00 S2Pm

](6.18)

A posterior estimate of the vehicle pose in the second submap is then obtained by applyingthe normal observation update equations, as given in Section 2.2.3, for each of the associatedobservations from the batch association process.

A word of caution at this point. If both submaps continue to maintain a vehicle track,it is imperative that no observation information is used in both submaps. That is, if bothsubmaps contain common features (i.e., features representing the same physical objects),observations of these features may be used in one or other of the submaps, but not both.While this information reuse would not be a concern if the submaps were permanentlyindependent, it is a problem when they are both used to perform coupling estimation inSection 6.2.5. (Note, since batch data association is performed separately in each submapusing the same observation data set, common features are found when a particular mea-surement is assigned to a landmark in both submaps.)

An alternative mechanism for vehicle initialisation is presented in [140], where obser-vations are compiled in a small local map and subsequently applied to the submap usinggeometric constraints.

6.2.4 Submap Creation

As the vehicle traverses the boundaries of the existing map, new submaps are created torepresent new regions of the environment. Conventionally, the vehicle pose has been usedto depict the origin of a new submap at its creation. Thus, the new submap S2 is createdas

S2xa =[

S20vS2 xm

](6.19)

S2Pa =[

S20v 00 S2Pm

](6.20)

and the coupling to its parent submap S1 is given by the estimated vehicle pose S1 xv at thismoment. Note, this estimate may be extracted from S1 (as in [29, 30]), or may be retainedin the parent state estimate (as in the CRSF).

In certain applications, it may be desirable to create a new submap with origin offsetfrom the current vehicle pose. For example, in Figure 6.4 the new submap is offset so thatits axis is aligned with the parent frame and the submap region can be specified by geometricbounds (i.e., square submaps for efficient region coverage with minimal overlap). The twooperations for offset submap creation are (i) calculation of the coupling estimate S1 xS2 (with

6The insert and observe method uses a suitably large number α as an approximation for infinity (seeAppendix C).


Figure 6.4: Submap creation with offset origin. A rigid body transformis performed with respect to the vehicle frame to produce a coordinateorigin with a nominal coupling S1xS2 to the parent frame.

uncertainty S1PS2), and (ii) calculation of the augmented estimate S2 xa = [S2 xTv , S2 xT

m]T

with respect to the offset coordinate frame.Given the current vehicle pose estimate S1xv and a nominal new submap origin S1xS2

with respect to the parent submap, the offset submap frame relative to the vehicle isfound from the common root coupling equation vxS2 = f

(S1 xv,

S1xS2

)as in Equation 6.8.

Note, vxS2 is a rigid-body extension with respect to the vehicle such that vPS2 = 0.The coupling estimate is subsequently calculated using a coupling summation equationS1 xS2 = f

(S1 xv,

vxS2

)as in Equation 6.1, such that the estimated pose of the new submap

is the original nominal value and the coupling uncertainty is found according to Equation 6.2as follows.

S1PS2 = ∇fS1xv

S1Pv∇fTS1xv

+ ∇fvxS2

v0S2∇fTvxS2

= ∇fS1xv

S1Pv∇fTS1xv

The vehicle pose estimate relative to the new submap frame is found as a couplinginversion S2 xv = f (vxS2) as in Equation 6.5, and has zero uncertainty S2Pv = 0. At cre-ation, the map features are assumed to be uncorrelated to the vehicle and from each other;thus, feature estimates vxi can be independently transformed from the vehicle frame tothe new submap frame using a common root calculation S2 xi = f (vxS2 ,

vxi). The resultingaugmented state estimate S2xa is therefore block-diagonal.

Finally, the new submap is connected to other submaps that are adjacent to the parentsubmap and possess region boundaries adjacent to the new submap. These couplings aredetermined using a common root equation S2 xS3 = f

(S1 xS2 ,

S1 xS3

).

6.2.5 Coupling Convergence

In the course of traversing submap boundary regions, vehicle tracks may be maintained in-dependently in multiple adjacent submaps (with care that observation information is used


at most once). The information from these multiple tracks can be used to improve the cou-pling estimates between submaps without correlating the submaps themselves. Convergenceof the coupling estimates facilitates convergence of the map at a global level.

The method for performing coupling convergence is two stage. First, a new estimate ofthe coupling between two submaps is obtained and, second, this estimate is fused with theprior coupling estimate.

A new coupling estimate between two submaps is found using the insert-and-observeprocedure as follows. First, the two augmented state vectors S1 xa and S2 xa are concatenatedalong with an initial “guess” of the coupling estimate S1 xS2 . The coupling guess is assignedinfinite uncertainty as follows.

x−c = [S1 xT

S2, S1 xT

v , S1 xTm, S2 xT

v , S2xTm]T (6.21)

P−c =

αI 0 0 0 00 S1Pv

S1Pvm 0 00 S1PT

vmS1Pm 0 0

0 0 0 S2PvS2Pvm

0 0 0 S2PTvm

S2Pm

(6.22)

Second, a constrained estimate of the concatenated state is generated by applying two typesof perfect “virtual” observations. The first of these is that the vehicle in both submaps isthe same, zv = 0 where

zv = hv (xc) = S1 xv − (S1 xS2 + RS2 xv) (6.23)

and R is the rotation matrix

R =



This constraint has the following innovation and innovation covariance.

ν = zv − zv = −hv

(x−

c

)(6.24)

S = ∇hxcP−c ∇hT

xc(6.25)

where the Jacobian ∇hxc = ∂hv∂xc

∣∣∣xc

.

The second constraint type arises from the existence of common features between the twosubmaps. These are found from batch data association when, for a given set of observations,the same measurement is assigned to a feature in both maps; (this measurement is thenused to update only one of these features). Thus, for common features S1xi and S2xj , theconstraint is zij = 0 where

zij = hij (xc) = S1 xi − (S1 xS2 + RS2 xj) (6.26)

Here, S1xS2 refers to just the (x, y) part of the coupling estimate, and R is the matrix

R =[

cos S1 φS2 − sin S1 φS2

sin S1 φS2 cos S1 φS2

]


The innovation ν = zij − zij and innovation covariance can then be derived, and these“virtual” observations can be used to produce a posterior estimate of the concatenatedstate vector via the normal EKF update equations. From this result, the new couplingestimate S1 xS2 ,

S1PS2 is extracted.There are three points to note from the above process. First, the value of the initial

“guess” for S1 xS2 is simply the prior coupling estimate. Second, the concatenated vector isformed with copies of the two submaps; these coupling calculations are separate and isolatedfrom the independent submap SLAM operations. Third, for efficiency, only those featurescommon to both submaps need to be copied into the concatenated state (non-commonfeature exert no influence on the result).

Furthermore, if the two submaps are parent-child and the original coupling estimate isretained in the parent state estimate (as for the CRSF), then the following concatenatedprior can be used instead of Equation 6.22.

P−c =

S1PS2S1PS2v

S1PS2m 0 0S1PT

S2vS1Pv

S1Pvm 0 0S1PT

S2mS1PT

vmS1Pm 0 0

0 0 0 S2PvS2Pvm

0 0 0 S2PTvm

S2Pm

(6.27)

This prior incorporates the initial knowledge of the coupling estimate as retained in theparent submap. However, it is important to note that the coupling estimate used inEquation 6.27 is part of the submap state and separate from the coupling estimate ob-tained from the concatenation process. In other words, there are two versions of couplingestimate: (i) the estimate contained in the parent submap from the child’s creation, and(ii) the coupling estimates obtained from constraining the concatenated state. It is alsoworth noticing that the advantage of the prior in Equation 6.27 over the insert-and-observeapproach in Equation 6.22 may prove to be marginal as the submaps themselves converge,and probably does not warrant the added complexity.

Having obtained a new coupling estimate S1 xnewS2

, the new estimate is fused with the priorcoupling estimate S1 xold

S2using the covariance intersection [77]. This permits monotonically

convergent estimation when an unknown degree of correlation exists between the prior andnew estimates.7 For notational convenience, let xa = S1xold

S2and xb = S1xnew

S2; the posterior

estimate is found from the covariance intersection equations as follows.

Pc =(wP−1

a + (1 − w)P−1b

)−1 (6.28)

xc = Pc

(wP−1

a xa + (1 − w)P−1b xb

)(6.29)

where the scalar variable 0 ≤ w ≤ 1 is chosen to minimise some criterion (e.g., determinant,trace) of the posterior covariance matrix Pc.

A consequence of the coupling convergence method shown here is that the resultingcoupling estimate is correlated to both submaps and, moreover, to every other coupling

7The covariance intersection is hardly necessary if the coupling estimate involves common feature con-straints. This tends to produce “comparable” uncertainty estimates, where S1Pold

S2 − S1PnewS2 is positive

semi-definite, and the new estimate simply replaces the old. The covariance intersection is most useful whenonly vehicle pose constraints are involved.


connected to either submap. This is a serious issue for the coupling equations given inSection 6.2.2, which assume uncorrelated coupling estimates. The question of whetherthese equations still produce consistent results if the coupling estimates are correlated isaddressed below.

6.2.6 Consequences of Non-independent Couplings

This section investigates the properties of pair-wise correlated8 relative pose estimates,and is applicable to the coupling equations in Section 6.2.2 and also to the sensor-baseddead reckoning algorithm in Example C.1. The outcome of this investigation suggests thattreating the coupling estimates as independent, when they are actually pair-wise correlated,produces consistent and conservative results. This is an important finding as it meanscoupling equations can be used directly on converging coupling estimates, and indicatesthat NCFM SLAM may attain near-optimal convergence limits at a global level.

The general case for correlated information is as follows. Given two independent randomvectors a and b, and a model relating them to a third vector c = f (a,b), the covariance ofthe estimate c can be calculated as follows.

Pc = ∇faPa∇fTa + ∇fbPb∇fT

b (6.30)

where the Jacobians are ∇fa = ∂f∂a

∣∣(a,b) and ∇fb = ∂f

∂b

∣∣(a,b). If a and b are correlated, how-

ever, Equation 6.30 is invalid and may produce an inconsistent result, although a consistentestimate can be found using covariance intersection methods.

Pc = ∇faPa

w∇fT

a + ∇fbPb

1 − w∇fb (6.31)

where w is a scalar between 0 and 1. In particular, if Pa and Pb are each inflated by afactor of two, their substitution in Equation 6.30 produces a consistent estimate.

A basic extension of this concept is to say that, for a path of pair-wise correlated cou-plings, a consistent “coupling summation” estimate can be obtained by first inflating eachcoupling covariance by two. Clearly, this expansion results in a greatly inflated uncertaintyin the final estimate.

At this point, it is important to remark that ignoring correlations does not necessar-ily imply inconsistency; the action of ignoring correlations can lead to either inconsistent(optimistic) estimates or suboptimal (conservative) estimates. Consider the following 1-Dexample c = a + b, where the variables a and b possess variances A and B, respectively, andunknown correlation D. That is, the vector [a, b]T has covariance[

A DD B

]

The true variance of c, therefore, is C = A + B + 2D while, ignoring correlations, the esti-mated variance is C = A + B. The difference between the estimated and true variance is

8For a coupling path connecting two coordinate frames, pair-wise correlation means that each couplingestimate is correlated to its preceding and succeeding coupling estimates only. This is true for any non-closedpath in the NCFM framework.


Figure 6.5: Coupling correlations. Three submap origins in a globalcoordinate frame are shown in the top diagram; the location of a singlelandmark is shown also. The three lower diagrams depict estimates of thelandmark relative to each frame (arrow) and its true location (asterisk).The coupling estimates S1 xS2 and S2 xS3 are correlated by the error in S2.

−2D, which is positive-semi-definite if D ≤ 0. This means that ignoring correlations willlead to inconsistent results if D > 0 but will give conservative results if D < 0.9 In theformer case, consistent results can be obtained by inflating A and B by two, but in thelatter case, the original variance values suffice.

It is now shown, by way of a 1-D example, that the correlations occurring in the couplingequations are always of the conservative variety. Thus, inflation of the coupling estimateuncertainties is unnecessary.

Example 6.1One-dimensional coupling summation. This example considers a simple case where threesubmaps share the one feature. This might also be interpreted as sensor-based dead reckoningover three timesteps while observing a single landmark. The result shown here is equallyvalid for scenarios with multiple landmarks.

A single landmark L is represented in three submaps as shown in Figure 6.5. Theestimated distances in each submap are S1 xL = d1, S2 xL = d2 and S3 xL = d3. The truerelative distances are d1 + v1, d2 + v2 and d3 + v3, respectively, where v1, v2 and v3 areindependent random variables. The coupling estimates S1 xS2 and S2 xS3 are given by

S1 xS2 = S1 xL − S2 xL = d1 − d2

S2 xS3 = S1 xL − S2 xL = d2 − d3

while the true couplings are d1 + v1 − d2 − v2 and d2 + v2 − d3 − v3, respectively. Note,the two estimates are correlated by the error v2 in S2. Coupling summation results in thefollowing estimated coupling.

S1 xS3 = S1 xS2 + S2 xS3 = d1 − d2 + d2 − d3 = d1 − d3

9In more intuitive terms, if D > 0 then a and b will tend to be either above or below their mean valuestogether—exacerbating the error in c—while, if D < 0 then a and b will tend to occupy opposite sides oftheir means (e.g., a above and b below), and be self-correcting in c.


The true coupling is

d1 + v1 − d2 − v2 + d2 + v2 − d3 − v3 = d1 + v1 − d3 − v3

Notice that the correlated error v2 cancels out and the summation is self-correcting. If thecouplings were uncorrelated, the summation would have combined the uncertainty of fourrandom variables—v1, v2, v

′2, v3—but, since they are correlated, it involves only two.

Another way to explain this example is to say that the vector [S1 xL, S2 xL, S3 xL]T hascovariance

PL =

S1PL 0 0

0 S2PL 00 0 S3PL

The coupling estimates are calculated from this as follows.

[S1 xS2S2 xS3

]= F

S1 xL

S2 xLS3 xL

where F is the transition matrix

F =[

1 −1 00 1 −1

]

The covariance of the coupling estimates, therefore, is obtained as

PS = FPLFT =[

S1PL + S2PL −S2PL

−S2PLS2PL + S3PL

]

Notice that the off-diagonals of PS are negative, which means that the coupling summationS1 xS3 = S1 xS2 + S2 xS3 is conservative if the couplings are treated as uncorrelated. That is,the uncorrelated estimate covariance S1PL + 2S2PL + S3PL is always greater than the truecorrelated covariance S1PL + S3PL.

This example demonstrates that, for 1-D coupling summation, assuming uncorrelatedcoupling estimates (i.e., ignoring correlations) will produce a conservative summed estimate.

From the above 1-D example, it is conjectured that coupling summation, assuminguncorrelated coupling estimates, is conservative also for higher dimensions. This is suggestedempirically by the dead reckoning results in Section 3.4, where the noise between sequentialscans is quite high but the global summed pose is extremely accurate. It appears that thescan matching errors tend to cancel for each pair of relative pose estimates.

An intuitive argument for the error cancelling properties of correlated coupling estimatesis given by considering the limiting case. In the limit, each submap becomes (internally)fully correlated, so that its features approach a lower bound with respect to the coordinateframe, and become perfectly known relative to each other. This may be visualised as a rigidlattice of features that is still flexible relative to its coordinate origin. A coupling estimate,derived from common feature constraints between two submaps, defines a transform so that

6.3 Cycle Detection within the NCFM Framework 147

the two submaps are perfectly aligned, even though their relative origins are uncertain.Similarly, a path of coupling estimates is perfectly aligned; they produce a sequence of rigidlattices connected together by common feature constraints. The locations of the interveningsubmap origins are uncertain, but this is irrelevant since the coupling estimates are definedby the relative feature locations, which are perfectly known.

If this reasoning is correct, then it would appear that the lower bound feature uncertaintyin each submap cancels out for all intermediate submaps in a coupling path, and the globalaccuracy of NCFM approaches almost the same lower limit as full SLAM. In other words, theuncertainty of the intermediate couplings is dependent solely on the rigidity of the connectedsubmaps, and not the uncertainty of their coordinate origins. Only the coordinate frameuncertainties of the beginning and end submaps remain relevant.

Therefore, it is suggested that the true accuracy of NCFM converges to near that oftraditional SLAM, but the NCFM covariance estimate will be over-conservative since itincorporates the uncertainty of each submap origin.

6.3 Cycle Detection within the NCFM Framework

For all SLAM methodologies, the most difficult challenge is cycle detection. This sectionpresents a complete strategy for robust cycle detection and confirmation within the NCFMframework, so that loop closure becomes a feasible proposition in extensive environmentswith massive accumulated vehicle pose uncertainty.

The process of cycle detection is implemented as a three stage operation.

• First pass detection. A efficient search is performed to find all submaps not adjacentto the current submap that may be within the vehicle field-of-view (i.e., predictedsubmap visibility).

• Second pass detection. Batch data association determines the existence of possiblecycles to any of the candidate submaps.

• Confirmation. Each cycle hypothesis is tracked until satisfying reasonable criteria asa confirmed cycle. The coupling linkages for the cycle are then added to the map.

This strategy essentially combines the cycle detection mechanisms of topological maps(i.e., place recognition and rehearsal) with the added restriction of metric pose constraints.Nevertheless, these checks are insufficient to guarantee against false cycles, and this sectionconcludes with a discussion of conditions in which cycle detection may fail.

6.3.1 First Pass Detection

First pass search is performed to determine the possible visibility of every submap notadjacent to the current submap. This search checks if any part of a non-adjacent submapregion intersects with the vehicle field-of-view region, and efficiently culls the number ofsubmaps valid for the second pass detection phase.


(a) Vehicle field-of-view (b) Submap region bounds

Figure 6.6: Vehicle and submap visibility regions. The vehicle field-of-view is semi-circular with radius rv; this region is bounded by a rectan-gular box. The submap region is bounded by a circle of radius rs.

For each non-adjacent submap Si, the relative pose with respect to the vehicle vxSi

is found via coupling summation along the shortest connecting path.10 The submaps arechecked in breadth-first order radiating from the current submap. This serves the dualpurpose of (i) finding the (approximately) shortest connecting path to each submap, and(ii) permitting storage of the cumulative coupling sum estimates at each level of the searchtree (i.e., avoiding recalculation for subsequent levels).

The possible visibility test described below assumes that the vehicle field-of-view isbounded by a rectangular box {(0, rv), (rv, rv), (rv,−rv), (0,−rv)} as shown in Figure 6.6(a),and the submap region is bounded about its origin by a circle of radius rs, as shown inFigure 6.6(b).

The first step in calculating visibility is to define the nσ (n-sigma) uncertainty ellipseof the submap origin with respect to the vehicle. (The value of n sets a threshold on theuncertainty search-space; typically n equals two or three.) These equations do not involvethe relative orientation vφSi , so let x = [vxSi ,

vySi ]T with covariance P. Let the elements

of P be denoted as

P =[

σ2x σ2

xy

σ2xy σ2

y

]=[

a dd b

](6.32)

The eigenvalues of P are found from the characteristic equation |P − λI| = 0. That is,10The path chosen between a submap and the vehicle is not critical, and the most accurate solution would

be to calculate vxSi ,vPSi along all combinations of connecting paths and fuse them using the covariance

intersect. A practical solution is to simply choose the shortest connecting path, which usually correspondsto the minimum uncertainty path.


Figure 6.7: Submap visibility region. The inner ellipse represents thenσ uncertainty ellipse of the submap origin with respect to the vehicle.This region is extended by the submap radius rs to define the region overwhich the submap might be visible.

(a − λ)(b − λ) − d2 = 0, which has two roots

λ =a + b ±√(a − b)2 + 4d2

2(6.33)

The eigenvectors corresponding to the principal axes of P are then found as

e =[

ex

ey

]=[

λ + d − bλ + d − a

](6.34)

These eigenvectors are normalised by their 2-norm ||e|| =√

e2x + e2

y and scaled by n√

λ toproduce the principal axes of an nσ ellipse. This ellipse is depicted as the inner ellipse ofFigure 6.7, which is subsequently expanded by the submap radius rs to define the extentof the region over which the submap is possibly visible with respect to the vehicle. Theprincipal eigenvectors of this outer ellipse are calculated from the unscaled-eigenvectors inEquation 6.34 as follows.

es = (n√

λ + rs)e

||e|| (6.35)

The two scaled eigenvector es1 and es2 will subsequently be referred to as e1 and e2.Having defined the elliptical region of possible submap visibility, the test for possible

visibility becomes an intersection test between this ellipse and the vehicle field-of-viewrectangle. The intersection test requires some mathematical preliminaries relating the ellipseeigenvectors to Cartesian coordinates.[

xy

]=[

e1x cos θ + e2x sin θe1y cos θ + e2y sin θ

](6.36)


Figure 6.8: Cartesian ellipse coordinates. The (x, y) coordinates of theellipse are obtained as a function of the principal eigenvectors e1, e2 andthe angle θ.

where θ is the angle from the positive axis of e1 in the direction of the positive axis of e2

as shown in Figure 6.8. The two solutions for θ from Equation 6.36 are

θ = arctanxe2

2x± e1xe2x

√kx

xe1xe2x ∓ e22x

√kx

(6.37)

θ = arctanye2

2y± e1ye2y

√ky

ye1ye2y ∓ e22y

√ky

(6.38)

where kx = e21x

+ e22x

− x2 and ky = e21y

+ e22y

− y2. Note, θ is not defined (has imaginary

roots) for kx < 0 or ky < 0; thus, the Cartesian extremities of the ellipse are x = ±√

e21x

+ e22x

and y = ±√

e21y

+ e22y

.Note, the location of the ellipse centre relative to the vehicle is x, but for these calcula-

tions it is simpler to make the ellipse centre the origin and shift the vehicle by −x. Thus,the ellipse is defined by the eigenvectors e1 and e2, and the rectangle is given by coordinates{(−x, rv − y), (rv − x, rv − y), (rv − x,−rv − y), (−x,−rv − y)}.

The test for ellipse-rectangle intersection is performed in three stages and, if any stagesucceeds, the subsequent stages are not needed. The first is to find whether any line of thebox intersects with the ellipse; the second tests if the box is enclosed by the ellipse; and thethird tests if the ellipse is enclosed by the box.11

1. Box-ellipse intersection test. This test is performed on each line segment of the rect-angle. Here, the vertical segment {(xp, yp), (xp, yq)}, yp < yq is shown as an example.

11The third test, ellipse-in-box, is typically not required as the size of the submap region is usually greaterthan the vehicle field-of-view.


(a) Check whether the vertical line x = xp intersects the ellipse by satisfying thecriterion x2

p ≤ e21x

+ e22x

(n.b., use the form y2p ≤ e2

1y+ e2

2yfor horizontal lines).

If not, the test fails; repeat test with a different line segment.

(b) Calculate the two solutions θ1 and θ2 for Equation 6.37 where x = xp (n.b., useEquation 6.38 for horizontal lines).

(c) Calculate y1 and y2 for θ1 and θ2, respectively, using Equation 6.36. These arethe two ellipse intersect points for the line x = xp.

(d) If yp ≤ y1 ≤ yq or yp ≤ y1 ≤ yq, then this test returns true.

2. Box-in-ellipse test. This test involves choosing any corner of the rectangle (xp, yp).

(a) Check the vertical line x = xp intersects the ellipse by satisfying the criterionx2

p ≤ e21x

+ e22x

. If not, the test fails.

(b) Perform steps (b) and (c) of the box-ellipse intersection test.

(c) If y1 ≤ yp ≤ y2 then this test returns true.

3. Ellipse-in-box test. This is a trivial point-in-box test which returns true if the ellipsecentre (0, 0) lies within the rectangle bounds.

The non-adjacent submaps that meet the ellipse-rectangle intersection criteria are markedas potentially visible and are then subjected to the second pass detection process.

6.3.2 Second Pass Detection

The second pass stage of cycle detection performs batch data association within each can-didate submap. If sufficient associations are obtained between the current observation setand a submap Si, then a cycle hypothesis is formed. This process is analogous to topologicalplace recognition.

The expected vehicle pose Sixv with respect to the submap Si is found via couplingsummation. Thus, the augmented submap state for batch association incorporates poseconstraints (though these are not correlated to the map states) as follows.

Si xa =[

Si xvSi xm

](6.39)

SiPa =[

SiPv 00 SiPm

](6.40)

The submap state is associated to the vehicle observation set using the tracking CCDAalgorithm presented in Section 3.3.4. Note, the feature grouping procedure, as discussed inSection 5.3, is not required for submap cycle detection because correlated feature groupingis inherent within the NCFM submap structure.

Each successful batch association, including multiple association hypotheses within asingle submap, forms a cycle hypothesis. Each cycle hypothesis is tracked over a spec-ified period and distance to ascertain its validity (i.e., to ensure it is not an artifact ofenvironmental symmetry). This phase is termed cycle confirmation.


6.3.3 Cycle Confirmation

The cycle confirmation stage tracks a set of cycle hypotheses until a single hypothesisremains and continues for a reasonable time and distance. This process is analogous totopological rehearsal.

The objective of cycle confirmation is the protect against short-term symmetries in theenvironment, where different areas of the environment possess similar structural appearanceover a limited region. A cycle hypothesis is rejected if tracking is lost—if batch associationfails to produce sufficient correspondences as the vehicle moves through the region. Whenseveral cycle possibilities exist, tracking continues until at most one persists. The remaininghypothesis is accepted if it persists longer than a specified time period and vehicle traversaldistance.

During the confirmation process, SLAM is continued in the current submap and itsadjacencies, including the creation of new submaps. That is, map building and traversalcarry on as though cycle detection were not in progress. However, in the “cycle hypothesis”submaps, the vehicle pose is tracked separately from the submap state. That is, the submapstate is static and the vehicle performs localisation only as it confirms the cycle validity.When a cycle is eventually confirmed, the current submap is coupled to the cycle submap(and some of their neighbouring submaps are also connected). The coupling estimate iscalculated using the coupling convergence mechanism described in Section 6.2.5. It maybe beneficial even to merge the current submap with the cycle submap, using geometricconstraints, if they possess sufficient overlap.

6.3.4 Pathological Symmetries

The cycle detection and confirmation strategy presented above is robust but cannot abso-lutely guarantee the rejection of false cycles. It is possible that symmetries in the environ-ment may persist over a large enough area to admit an incorrect validation. This problemis most likely to occur in highly structured environments.

In general, symmetry-based cycle hypotheses will be rejected because they coexist withthe correct cycle hypothesis, which will eventually become the only surviving hypothesis.However, there are two possible scenarios where false cycle rejection may fail. First, it mayfail in dynamic environments if distinctive features are obscured by transient objects and thetrue hypothesis is rejected—leaving symmetry-based hypotheses unchallenged. Second, itmay fail if the vehicle is exploring a new region of the environment that possesses symmetrywith an old mapped region, and is close enough to this region to satisfy pose constraints.This second scenario has no “true” cycle and will be accepted if the symmetry is sufficientlypersistent.

Consider, for example, the situation in Figure 6.9, where a vehicle traverses a largeloop in a highly structured corridor environment. At the final position shown, the vehicle isabout to enter a new corridor identical to the first. Because of its large pose uncertainty, thevehicle forms a cycle hypothesis with the original location (X) and, since the new corridor isnot yet part of the map, the false cycle persists unchallenged until the end of the corridor.If the cycle confirmation distance is less than the corridor length, the false cycle will beadded to the map.

6.4 Application: Large-scale Outdoor NCFM SLAM 153

Figure 6.9: False cycle detection. This picture signifies a corridor en-vironment where the robot starts at location X and travels along thetrajectory shown. At the final pose, the true pose (triangle) and vehiclefield of view (semicircle) are depicted, along with the pose uncertainty el-lipse. The new corridor is structurally identical to the original (adjacent)corridor.

6.4 Application: Large-scale Outdoor NCFM SLAM

This application demonstrates the basic attributes of NCFM with an implementation ofSLAM in the internal road environment. As a first-cut attempt at NCFM SLAM, thisexperiment incorporates two major simplifications to the method described in this chapter.First, each submap is represented only by the scan of features obtained at its creation,and remains static thereafter. Second, the coupling estimates between submaps are formedwhen a new submap is created or upon a cycle confirmation, and these also remain staticafterwards. Note, as a result of these two restrictions, some rather ad hoc operations areincluded in the experimental details below.

The procedures demonstrated in this application are: map traversal and submap tran-sition, submap creation, and cycle detection and confirmation. Aspects described in thischapter that are not demonstrated here are: SLAM within local submaps, coupling conver-gence, and multiple cycle hypotheses per submap.

6.4.1 Implementation Details

This experiment is feature-based, with the details for feature extraction being the same asgiven in Section 3.4.1. The vehicle prediction model is the same as in Section 3.4.2 and, if


tracking is temporarily lost with respect to the local submaps, the vehicle pose estimate ispropagated using sensor-based dead reckoning.

Map traversal is simply a matter of tracking the features of whichever surroundingsubmaps are predicted visible, and relying on dead reckoning if none provide a suffi-cient number of associations. (Tracking may be lost intermittently because the single-scansubmap representations do not tend to cover the environment very evenly.)

A new submap is created once the vehicle moves more than 15 metres from the origin ofthe nearest existing submap. The new submap is defined by the features extracted from thecurrent laser scan, with the vehicle pose as its origin. Coupling estimates are then createdbetween all submaps with origins closer than 40 metres.

A simple version of cycle detection is performed with each new laser scan. The first-passdetection does not consider the size of the submap or vehicle view regions, but simply looksfor non-adjacent submaps with origins closer to the vehicle than a fixed NIS threshold. Thatis, xPx < γ2 where x, P represent the (x, y) part of the relative pose estimate vxSi ,

vPSi

for submap Si. The second-pass detection stage performs tracking CCDA in each candidatesubmap and, for submaps yielding sufficient associations (≥ 7), cycle hypotheses are formedon the basis of maximum association set (i.e., maximum cliques). Thus, at most one cyclehypothesis is formed per candidate submap each timestep. These hypotheses are added toa hypothesis list provided each particular hypothesis does not already exist on the list.

Cycle hypotheses are tracked for a rehearsal period across three submaps. If trackingis lost, due to insufficient associations, for more than 10 scans in a row, the hypothesisis deleted. If a hypothesis survives rehearsal, and all other hypotheses are significantlyyounger, then the cycle is confirmed and the remaining hypotheses are deleted. Connectionsare made between the original cycle submap and the submap that was current when the cyclewas first hypothesised; the later “current” submaps created during the rehearsal period areremoved from the map.

6.4.2 Results

NCFM SLAM is performed over the same trajectory as presented in Section 5.4. Theresultant map is displayed using two different global representations in Figures 6.10 and6.11. These figures attempt to exhibit the coupling properties of the map, which permitvery accurate pose estimates between closely connected submaps.

In Figure 6.10, the submap locations are shown at their estimated global positions whenfirst created. It shows the accumulated uncertainty of latter submaps with respect to thefirst, even if they a quite close physically. It also depicts the major differences between pre-dicted and estimated pose following cycle detection and confirmation, which are a primarysource of non-linearity problems for traditional SLAM implementations.

The plot in Figure 6.11 shows exactly the same map, but with the submap locationsestimated from breadth-first coupling summation. Summation along a breadth-first pathmeans that a shortest path corresponding to an (approximately) minimum uncertaintyestimate S1 xSi is obtained. The resulting global map gives a much better indication of itstrue structure, and demonstrates the improved certainty of short coupling paths comparedto long circuitous paths.


0 50 100 150 200−250

−200

−150

−100

−50

0

50

Figure 6.10: NCFM with submap origins plotted at their original globallocation estimates. This plot shows the large discontinuities between thepredicted global pose and the estimated pose after cycle confirmation.


0 50 100 150 200−250

−200

−150

−100

−50

0

50

Figure 6.11: NCFM with submap origins plotted in breadth-first order.This plot results in an (approximately) minimum uncertainty global mapwith respect to the first submap.

6.5 Summary 157

6.5 Summary

This chapter presents the NCFM framework for SLAM in large-scale environments.Current variants of submap-based stochastic SLAM are reviewed, with particular at-

tention to their consistency during submap transitions. The DSM method, while shownempirically to give satisfactory results, lacks a theoretical proof of consistency. Also, DSM,the “two-level landmark” hierarchy, and the RLR, having global base coordinate systems,fail to address the non-linearity problems incurred by large loop closures. The “hierarchicallocal maps” and CRSF methods are both consistent and have similar structure, but CRSFprovides significant advantages in the way it defines its submap couplings. Both methodsare limited by monotonic linkage.

The NCFM framework is presented for SLAM operations of local traversal, submap tran-sition, and submap creation. The map converges at a local (submap) level via traditionalSLAM and at a global level by monotonic improvement of the coupling estimates. Theconsequent correlation between coupling estimates is believed to have an error-cancelingeffect, so that NCFM SLAM may even converge to a near-optimal lower limit.

A strategy for robust cycle detection is presented for NCFM. This involves three stages:first-pass search, second-pass detection, and cycle confirmation. The result is a method forloop closure that is resistant to all but the most pathological environmental symmetries.

A preliminary demonstration of the NCFM method is presented in a large-scale outdoorenvironment with challenging loop closures. Basic implementation is shown for submaptraversal and creation and cycle detection, but stochastic SLAM at a local level and couplingestimate level remains to be empirically verified.

Chapter 7

Conclusion

This thesis attempts to address the theoretical and practical issues which previously lim-ited SLAM implementations to small-to-medium scale environments with simple distinctlandmarks. The primary difficulties concerning SLAM in complex large-scale environmentsare seen to be

• Reliable data association given large uncertainties in the vehicle position.

• The representation of landmarks that are not suited to simple geometric classification.

• Map management in terms of landmark addition and removal to avoid the long-termaccumulation of clutter in the map.

• Map management in terms of submaps to permit computationally tractable (scalable)and mathematically consistent SLAM.

• The reliable detection of cycles (loops) in the map and consistent map update on loopclosure.

This thesis presents solutions to these problems and verifies their practical utility throughexperimental applications in outdoor environments (i.e., sensor-based dead reckoning, apriori map localisation, traditional SLAM, submap SLAM).

This chapter summaries the contributions of this thesis and proposes a set of futuredirections for completing and extending this work.

7.1 Summary of Contributions

Four primary theoretical contributions are presented in this thesis: the CCDA algorithm,Gaussian sum scan correlation, mechanisms for feature management, and the NCFM frame-work. These contributions, and their practical implications, are reviewed below.

7.1.1 Combined Constraint Data Association

The CCDA algorithm considers a set of possible observation-map correspondences as a batchrather than individually. This method incorporates all available correlation information to

7.1 Summary of Contributions 159

constrain association, and specifies sets of associations that are mutually compatible. A keyproperty of CCDA is its ability to produce likely association sets even if the vehicle pose isentirely unknown.

An application of batch data association is to permit sensor-based dead reckoning, whichestimates the ego-motion of the vehicle without odometry. This is of practical importance inrugged outdoor environments, where significant wheel-slip can compromise the assumptionsof a kinematic vehicle model.

Contributions

• CCDA provides a solution to the problem of fragile data association, dramaticallyreducing reliance on multiple hypotheses tracking for reliability.

• CCDA offers the same constraint properties as the JCBB batch association method,but additionally permits data association with unknown vehicle pose. This is usefulfor initialising a vehicle within an existing map.

7.1.2 Sum of Gaussians Scan Correlation

Scan correlation—the alignment of unprocessed point data sets—is a valuable alternativeto feature-based data association in environments not suited to geometric feature models.In this thesis, a Gaussian sum representation of point data is shown to facilitate Bayesianscan correlation. That is, it represents the data so that the ensuing relative pose estimateis obtained from a justifiable Bayesian likelihood function, derived from a stochastic modelof the sensor uncertainty. The relative pose likelihood function is shown to be given bycross-correlation of the respective Gaussian sums.

The scan correlation method is demonstrated by applications of maximum-likelihoodsensor-based dead reckoning and particle filter localisation.

Contributions

• Presentation of a representation and means to perform Bayesian scan correlation be-tween unprocessed point data sets without feature extraction. This approach providesan accurate and consistent estimate of the relative pose uncertainty and does not in-volve data association.

• Development of a practical implementation of the scan correlation algorithm using aGaussian sum representation, which permits accurate description of the sensor uncer-tainty and efficient calculation of the relative pose likelihood function.

7.1.3 Feature Management

The first contribution towards feature management is the compilation of useful existingtechniques for feature initialisation and removal, notably the method of constrained initial-isation in [140]. Two motivations are given for feature removal: control of map density anddeletion of obsolete (non-existent) features.

7.2 Future Research 160

Density control reduces computational effort by removing non-essential landmarks fromthe map. A measure of feature visibility is introduced as a metric for deciding which featuresto keep and which to delete. That is, a feature has greater utility for SLAM if it is observablefrom many viewpoints than, say, a more stable feature visible only from a restricted vantage.

Features become obsolete due to changes in the structure of dynamic environments andthe basic criterion for removal occurs when a feature is predicted visible but is not observed.A practical definition of “predicted visibility” is overlooked in the literature, and this thesispresents a criterion specifically applicable to a laser sensor.

Contributions

• Presentation of motivations for feature management: to prevent excessive clutteraccumulation and to allow the map to adapt to structural change.

• Development of a mechanism for feature density control (a form of clutter reduction),based on a feature visibility metric.

• Development of a laser scan specific method for the removal of obsolete features.

7.1.4 Network Coupled Feature Maps

NCFM is a global map management framework that divides the world into a mosaic ofindependent local submaps connected by a network of relative pose estimates. Standard(optimal) stochastic SLAM is performed at a local level and a consistent suboptimal SLAMis performed at a global level. NCFM is computationally cheap and mathematically stable,not susceptible to large-scale non-linearity problems. Furthermore, it converges at a globallevel, and is conjectured to achieve a near optimal lower bound.

The ability of NCFM to address loop closure is one of its major benefits, and a completestrategy for cycle detection is presented. Cycle detection is realised by an efficient first-passsearch, a second-pass detection of possible cycles, and finally, loop confirmation and theinsertion of submap couplings.

Contributions

• NCFM effectively solves the problems of efficiency, convergence and non-linearity forlarge-scale SLAM.

• NCFM facilitates efficient and reliable cycle detection; it provides a solution to theloop closure problem.

7.2 Future Research

This section proposes several areas of this research requiring completion and extension.Furthermore, the contributions in this thesis give scope to numerous applications worthy offuture investigation, and some promising directions are suggested below.


7.2.1 CCDA Algorithm Extensions

The following extensions are proposed for the CCDA algorithm.

• Implement a version of maximum clique search that stores multiple large cliques. Thisenables testing of various ambiguity management methods such as MHT with the klargest cliques.

• Implement alternative clique search algorithms (as presented in the literature) andcompare their efficiency with the simple randomised search.

• Investigate environmental influences on batch association reliability and minimum ac-ceptable batch size (i.e., data association “confidence”, as discussed in Section 3.5.2).

• Compare the tracking version of CCDA to JCBB [106] in terms of search efficiencyand constraint action. Test whether a set of associations accepted by CCDA is alsoaccepted by JCBB for the same χ2 acceptance probability (and vice versa).

• Apply CCDA to different feature types and sensing modalities (esp. vision).

7.2.2 Scan Correlation Applications

Two applications for scan correlation are proposed here. First, in a mining tunnel envi-ronment, a precise metric location estimate is not particularly advantageous, and this kindof environment is suited to a simple topological-metric map as follows. The corridor in-tersections are described by a scan of unprocessed data (or, perhaps a small number ofaligned scans), representing the map nodes. A weaker metric estimate is required betweennodes, and the vehicle pose may be estimated via correlation-based dead reckoning. Placerecognition is then performed by scan correlation with the node template scans.

A second application for scan correlation is stochastic SLAM in environments devoid ofgeometric features. The key idea is to describe portions of the environment by clusters of un-processed data. That is, a scan of data is segmented into clusters representing objects or dis-tinctive “regions of interest,” and each cluster serves as a static model, or template, for theobject it describes. For each object, the representative data is referenced according to a localcoordinate frame, and the global pose of the local frame xf = (xf , yf , φf ) defines the ob-ject’s global location. Therefore, stochastic SLAM can be performed as follows. The objectcoordinate frame poses are stored in the augmented state vector xa = [xT

v ,xTf1

, . . . ,xTfn

]T ,and each time an object is observed, a likelihood function of its pose relative to the vehicleis obtained via scan correlation with its template data set. A Gaussian approximation ofthis likelihood function is then used to perform an EKF update of the state estimate. Asnew objects are observed, new static models are stored and new reference frames are addedto the state. (Note, an object template might be produced from a conglomerate of aligneddata clusters from the first few scans of the object.) The essential advantage of this ap-plication is that it permits EKF-based SLAM based on point-locations without requiringgeometric feature models.


7.2.3 Feature Management Extensions

Some future directions for feature management are as follows.

• Implement a version of constrained initialisation for bearing-only or range-only SLAM.Here, the primary question is: does this method translate to sensors where a pointfeature location is not fully defined by a single measurement? (Note, significant workin this area has been presented in [91].)

• Improve the robustness of constrained initialisation by employing batch associationto test all tentative features at once, and obtain a mutually compatible subset.

• Develop a more accurate and complete utility measure for density control. This wouldreplace the current, rather crude, metric of “feature visibility.”

• Define more general criteria for the “predicted visibility” of features for obsolete fea-ture removal. This problem may turn out to be application specific (e.g., models ofocclusion, different sensing properties, etc).

• Implement “partial SLAM,” where all landmarks outside the current field-of-view areremoved, and incorporate GPS information to obtain a fused global estimate.

7.2.4 NCFM SLAM Extensions

Most of the interesting properties of NCFM have not yet been implemented. A full exper-imental verification of NCFM SLAM is proposed as essential future work. This includes:local submap SLAM, vehicle transition, submap creation, coupling updates, and cycle de-tection.

Furthermore, the global consistency and convergence properties of NCFM require the-oretical confirmation. This involves formal proof of the claims: (i) that treating correlatededge couplings as though they were independent is consistent (and conservative), and (ii)that global convergence approaches the same lower limit as full SLAM.

Future work may also investigate the following extensions to NCFM SLAM. First, todetermine whether the conservative covariance estimates obtained from edge coupling sum-mation may be improved given that the uncertainty of intermediate submap origins is irrel-evant to a global estimate. And second, to investigate how absolute location information,such as GPS, might be incorporated into the map update.

7.2.5 Longer Term Developments

The original stochastic SLAM algorithm introduced an uncertainty measure that enabledconsistent map building. With subsequent extensions, including those presented in thisthesis, such as techniques for robust data association and tractable computation, stochasticSLAM is now possible at a very large scale.

Nevertheless, robust autonomous navigation in unstructured outdoor environments isfar from a solved problem, and some difficult open problems remain. These fall into twomain categories.


The first category is process modelling. Outdoor SLAM needs to contend with 212 -D

and 3-D environments, and rugged terrain. Thus, it is necessary to develop process modelsthat represent high-speed, non-planar motion and high levels of slip. One possible approachis to combine kinematic (i.e., encoder-based) models with INS and external-sensor baseddead reckoning methods.

The second category is observation modelling. Scanning range lasers are not well suitedto non-planar environments (although, given a non-planar motion model, improved resultsare possible), and alternative sensing and modelling methods are likely to be more effec-tive. Using sensors better suited to 3-D (e.g., pan-tilt scanning laser, radar, vision), thekey problem is to develop appropriate probabilistic observation models for them. For anygiven sensor, this problem includes the development of generic feature models that per-mit real-time data interpretation, calculation of egomotion and SLAM. Ideally, this wouldmean devising a feasible general-purpose landmark model that does require tailoring to newenvironments and applications.

The problem of sensor and landmark modelling, particularly for complex sensor infor-mation like vision, is by far the most difficult challenge for the implementation of general-purpose SLAM.

Appendix A

Environments

This appendix describes the three environments used for experimental evaluation of thetechniques described in this thesis. Each description contains a photograph of a typicalportion of the environment and an image produced from laser data obtained during exper-imental trials. The laser-based images were generated by first estimating the global poseof the laser for each scan (e.g., using SLAM) and then accumulating the unprocessed scanpoints into a single global scene. These images give a good impression of the types of objectsvisible to the laser sensor, both static and dynamic.

The particular characteristics of each environment that would influence the experimentalresults are discussed. These include the predominant types of static features and dynamicobjects, and the quality of the road surface and surrounding terrain. Also noted is theconfiguration of the laser, in terms of scan rate and maximum effective range, and thedriving speed of the vehicle. The vehicle speed is included to convey an idea of dynamiceffects, such as roll and pitch, which cause most of the difficulties in interpreting the laserdata.

The first two environments are outdoor regions where the observable landmarks arealmost exclusively tree trunks. This makes (point) feature extraction reasonably straightforward. Since this thesis is concerned mostly with data association and map management,this simplification is not considered important and alternative feature models for otherenvironments are common in the literature (particularly for indoors).

A.1 Urban Parkland

The park environment (see Figure A.1) consists of gently undulating terrain bounded alongone side by a major highway. Much of the region contains trees which provides the mainsource of static features for localisation. The road surface is mostly smooth bitumen, al-though some parts are in disrepair and some of the vehicle trajectory is off-road.

There are three sources of dynamic objects in this environment. The first is the trafficmoving along the adjacent highway, which at times occupies a large portion of the laserfield-of-view. Second, there are a small number of people walking through the park duringtesting (although this is a very minor factor). Finally, and most importantly, laser tilt—caused by vehicle acceleration and irregularities in the ground surface—sometimes results in

A.1 Urban Parkland 165

large portions of a scan being corrupted by ground returns. These transient measurementsboth occlude good information and give an illusion of moving landscape as the laser cross-section shifts with subsequent scans. Embankments and other sloping structures have asimilar effect.

The configuration of the scanning laser and indicative vehicle speeds for the trials inthis environment are as follows.

Laser SpecificationsScan rate: 4.7 HzMaximum effective range: ∼40 m

Vehicle SpeedMaximum speed: 5.3 m/s (19 km/hr)Average speed: 2.8 m/s (10 km/hr)Maximum turn rate: 0.6 rad/s (34 deg/s)

Figure A.1: Park environment.

A.1 Urban Parkland 166

A laser-based image for this environment is shown in Figure A.2. The trunks of treesare easily discernible as small discrete circles, while larger more blurred patches indicateshrubs (e.g., see coordinates (−46, 5) and (35,−30)).

There are a lot of smeared portions in this picture that are caused by the motion ofdynamic objects. The entire bottom section (from about -35 downward in the y-axis) is dueto moving traffic along the adjacent highway. The set of dots forming an approximate linefrom (−48,−3) to (−28,−2) was caused by a person walking across the laser field-of-viewover successive scans. Another fainter person-derived line is shown between (−31, 1) and(−11, 1). The remaining source of smearing is ground sweeps with the major occurrencesin this picture shown about the regions (−5, 5) and (95, 5). The shifting nature of groundsweeps is most evident in the latter of these two examples. Other ground sweep interferenceoccurred during the test run, some more severe than those shown here.

Note that in the upper-middle section of the picture, no stable features are visible. Whenthe semicircle of the laser field-of-view was directed into this region (usually only for shortperiods), data association was obviously impossible and vehicle localisation was forced torely solely on the inertial dynamic model. Other regions of sparse feature density exist inthe areas right and left of the region pictured. In these regions, more accurate and reliableresults would be possible with the incorporation of encoder-based odometry.

−80 −60 −40 −20 0 20 40 60 80 100

−60

−50

−40

−30

−20

−10

0

10

20

30

Figure A.2: Laser image of park environment. This image depicts all thelaser returns accumulated over the test trajectory shown.

A.2 Internal Road of a Country Resort 167

A.2 Internal Road of a Country Resort

The internal road environment is essentially static with ground sweeps from embankments inthe terrain (and, to a lesser degree, laser tilt) being the only significant source of dynamicinformation. As with the park environment, the main source of features here are treetrunks. However, as is evident in Figure A.3, the density of trees is much greater; makingdata association a decidedly greater challenge.

The road is smooth bitumen, and the terrain is either flat or gently sloping, so thatgenerally the laser scan information is close to ideal. However, there is one section of thetrial run that is off-road over rough terrain and another where, for a short period, no visiblefeatures are available within the laser field-of-view. The off-road section, over which thevehicle traverses a small loop, is further encumbered by the presence of a large embank-ment and few stable features, and localisation from laser-only information becomes ratherunreliable (see Section 5.4 for more details). However, in the other regions, where featuresare plentiful, the data association methods presented in this thesis perform extremely well.



Figure A.3: Internal road environment.

A.2 Internal Road of a Country Resort 168

The laser image shown in Figure A.4 depicts a region densely covered by trees. Inthis region, batch data association, as presented in Chapter 3, is absolutely essential asindividual associations based on vehicle pose would quickly fail. The trunks of trees areeasily discernible as small discrete points in the image, while shrubs and undergrowth appearas clumps of semi-merged points. The rectangular objects in the scene are small cabins.The clean straight lines defining these cabins indicate the accuracy of the SLAM algorithmused to generate this image.

−20 0 20 40 60 80 100 120

−60

−40

−20

0

20

40

Figure A.4: Laser image of internal road environment.

A.3 Underground Mine Tunnel 169

A.3 Underground Mine Tunnel

The underground mine tunnel is substantially different to the previous two environment inthat it possesses no discrete features but is characterised entirely by the texture of its walls(see Figure A.5). This environment is completely static with no dynamic objects present,and the road surface is flat and smooth. However, some variation in the perceived walltexture can occur due to the roll of the vehicle at higher speeds—which shifts the laser scancross-section.

Extraction of stable parametric features in this environment is extremely difficult, al-though some work has been done to find points of maximum curvature using scale-spacemethods [94]. On the other hand, the quantity of texture in the wall surface permits veryaccurate and reliable scan correlation, which enables accurate laser-based dead reckoning,and is particularly tenable to place recognition and a topological-metric implementation.



Figure A.5: Mine tunnel environment. Note the smooth road surfaceand distinctive wall texture. Coal is extracted from this mine using LHD(load, haul, dump) vehicles such as the one pictured in the foreground.

A.3 Underground Mine Tunnel 170

Experiments in the mine were performed using a diesel 4WD vehicle with the laserattached to the front bumper. The laser image shown in Figure A.6 is compiled using scancorrelation based dead reckoning (i.e., the pose of the laser for each scan is found relative tothe laser pose for the previous scan using the maximum likelihood scan correlation methodpresented in Chapter 4). The clarity and definition of the walls in the image demonstratethe accuracy of this dead reckoning procedure. Note, in the bottom right-hand corner ofthe image, the vehicle performs a 3-point turn—accounting for the apparent discontinuityin the vehicle trajectory.

0 20 40 60 80 100

−20

0

20

40

60

80

100

120

Figure A.6: Laser image of mine tunnel.

Appendix B

Kalman Filter Topics

This appendix contains a brief introduction to state-space representations and describesthe basic linear Kalman filter and extended Kalman filter algorithms. Further informationregarding Kalman filters can be found in the following texts [97, 6, 61].

B.1 State-Space

State space concerns the representation of a system in terms of a set of scalar parameters, orstate variables, such that the system is defined by a state vector x = [x1, . . . , xn]T . Usuallythe exact condition of the state vector is not known, and there is a degree of uncertaintyto each of the state variables. If the uncertainty has Gaussian distribution, then eachvariable can be defined in terms of its mean xi and standard deviation σi. Generally, themagnitude of uncertainty is given by the standard deviation squared (or variance) σ2

i , as inthis form the uncertainties of independent variables can be added. For example, if x1 andx2 are independent scalar variables with standard deviations σ1 and σ2 respectively, andx3 = x1 + x2 then σ2

3 = σ21 + σ2

2. Therefore, the system x is represented by its mean vectorand covariance matrix.

x =

x1

...xn

� E[x]

Px =

σ2

11 · · · σ21n

.... . .

...σ2

1n · · · σ2nn

� E[(x − x)(x − x)T ]

where E[x] defines the expected value or mean as follows.1

E[x] �∫ ∞

−∞xp (x) dx

The off-diagonal terms of Px are called cross-correlations and represent the dependence ofthe uncertainty of one variable upon another.

1The function p (x) defines the probability density of x. More information about probability densityfunctions (PDFs) is provided in Chapter 4 and Appendix D.

B.1 State-Space 172

B.1.1 Linear Transformations

If x is transformed to y by a system of linear equations F, then the covariance matrix istransformed as follows.

y = Fx

Py = FPxFT

The addition of two independent state vectors is similar.

x3 = Ax1 + Bx2

P3 = AP1AT + BP2BT

Non-linear transformations, however, present a problem as it is no longer possible to simplytransform the covariance matrix from one space to the other. The usual solution to thisproblem is to approximate the covariance transform by linearising the transformation func-tions. Given a set of non-linear functions f , the linearised Jacobian matrix ∇fx is definedas the partial derivative of f with respect to x about the point x.

y = f (x) =

f1 (x1, . . . , xn)

...fm (x1, . . . , xn)

∇fx =∂f∂x

∣∣∣∣x

=

∂f1

∂x1. . . ∂f1

∂xn...

. . ....

∂fm

∂x1. . . ∂fm

∂xn

x

Assuming that f is reasonably linear over a small region, the linearised transformationmatrix approximates f for values of x close to x. Therefore, the covariance transformationcan be calculated as for the linear case.

Py = ∇fxPx∇fTx

A problem with the linearised approach is that the transformations introduce a bias intothe covariance estimate, and may incur a significant error for very non-linear functions. Analternative method that addresses this issue is the unscented transformation [76, 75] whichtransforms a set of points through the non-linear function and reconstructs a Gaussiandistribution to encompass these points.

Example B.1Conversion between polar and Cartesian coordinate systems. A polar measurement p, suchas returned by a range-bearing laser, is often required as a point x in Cartesian space.

p =[

r

θ

]

Pp =[

σ2rr σ2

rθ

σ2rθ σ2

θθ

] f−−−−→x =

[xy

]Px

B.2 The Kalman Filter 173

where the non-linear transform f is given by

x = f (p) =[

r cos θr sin θ

]The transformed estimate x can be calculated directly from the non-linear function but thecovariance must be determined from the Jacobian.

∇fp =∂f∂p

∣∣∣∣p

=[

cos θ −r sin θ

sin θ r cos θ

]Px = ∇fpPp∇fT

p

The inverse transformation from Cartesian to polar coordinates is similar. The functionand its Jacobian are as follows.

p = h (x) =[ √

x2 + y2

arctan( y

x

) ]

∇hx =∂h∂x

∣∣∣∣x

=

[x√

x2+y2

y√x2+y2

−yx2+y2

xx2+y2

]

B.2 The Kalman Filter

The Kalman filter is an algorithm for fusing information in state-space. It is applicable tosystems where both the process (state transitions) and observations of the state are linearand the associated uncertainties are Gaussian. Essentially, it works by a two step cycle, aprediction step and an update step, to recursively produce a state estimate with minimummean squared error (i.e., minimise the trace of the state covariance matrix).

Given a system defined by the state xk (at time k) and modelled by the process

xk+1 = Axk + Buk + qk (B.1)

where uk represents some independent state vector (usually a control input) and qk denotesGaussian white noise, the predict step of the Kalman filter operates as follows.

x−k+1 = Axk + Buk (B.2)

P−k+1 = APkAT + BUkBT + Qk (B.3)

The − sign indicates that this is a prediction or an a priori state estimate. The matricesPk, Uk and Qk are the covariance matrices for xk, uk and qk respectively. If, at time k+1,an independent observation is made of the state

zk+1 = Hxk+1 + rk+1 (B.4)

where rk+1 represents Gaussian white noise with covariance Rk+1, then an improvement canbe made to the state estimate. That is, the update estimate of xk+1 will be a weighted sum

B.3 The Extended Kalman Filter 174

of the predicted state and the observed state. The Kalman filter determines the optimalweighting as follows.

νk+1 = zk+1 − Hx−k+1 (B.5)

Sk+1 = HP−k+1H

T + Rk+1 (B.6)

Wk+1 = P−k+1H

TS−1k+1 (B.7)

The vector νk+1 is the error (or residual) between the predicted and actual observation. Thiserror is commonly termed the innovation and the matrix Sk+1 represents the innovationcovariance. The optimal weighting Wk+1 is known as the Kalman gain. The updated stateestimate (or a posterior estimate) is then determined.

x+k+1 = x−

k+1 + Wk+1νk+1 (B.8)

P+k+1 = P−

k+1 − Wk+1Sk+1WTk+1 (B.9)

Notice that the Kalman filter estimates the covariance Pk+1 independent of the value ofthe state mean xk+1 (i.e., the value of the mean estimate does not affect the covarianceestimate). In fact, for linear systems, it is possible to compute the state covariance off-line.

B.3 The Extended Kalman Filter

Most real systems, and certainly the systems in this thesis, are not governed by linearequations, and the basic Kalman filter is insufficient for these tasks. To cater for non-linearestimation problems the extended Kalman filter (EKF) was devised. For example, a statetransition is modelled by a non-linear function f and an observation of the state modelledby a non-linear function h as follows.

xk+1 = f (xk,uk) + qk (B.10)zk+1 = h (xk+1) + rk+1 (B.11)

The key attribute of the EKF is to linearise the functions f and h about the point ofthe state mean. Therefore, unlike the linear Kalman filter, the state covariance estimateis no longer independent of the state mean estimate as they are coupled by the Jacobiancalculations. The prediction step becomes

x−k+1 = f (xk, uk) (B.12)

P−k+1 = ∇fxk

Pk∇fTxk

+ ∇fukUk∇fT

uk+ Qk (B.13)

where the Jacobian ∇fxkis the partial derivative of f with respect to x at the point (xk, uk)

and the Jacobian ∇fukis similarly defined for u.

∇fxk=

∂f∂xk

∣∣∣∣(xk,uk)

(B.14)

∇fuk=

∂f∂uk

∣∣∣∣(xk,uk)

(B.15)

B.3 The Extended Kalman Filter 175

Calculation of the Kalman gain is performed similarly by linearising the observation functionh.

νk+1 = zk+1 − h(x−

k+1

)(B.16)

Sk+1 = ∇hxk+1P−

k+1∇hTxk+1

+ Rk (B.17)

Wk+1 = P−k+1∇hT

xk+1S−1

k+1 (B.18)

where the Jacobian ∇hxk+1is given by

∇hxk+1=

∂h∂xk+1

∣∣∣∣x−

k+1

(B.19)

The state update is then calculated using the normal Kalman update equations B.8 and B.9.The EKF operates on the assumption that the functions f and h are near-linear so that thelinearised transformations sufficiently approximate the correct covariance transformations.However, for highly non-linear functions, linearisation may generate inconsistent uncertaintyestimates and a method like the unscented transform might provide more accurate results.Also, as with the standard linear Kalman filter, the EKF assumes Gaussian uncertaintydistributions for all estimates and large deviations from this ideal may also lead to unreliableperformance.

Appendix C

Insert and Observe StateAugmentation

An alternative to deriving specialised Jacobians for the initialisation of state variables is themethod of insert and observe, which enables state element initialisation via the EKF updateequations. The method is appropriate for any state augmentation procedure where newelements are added to an existing state vector. Initialisation is performed by first settingthe means of the inserted elements to arbitrary values and their covariance diagonals toinfinity (i.e., total uncertainty).

xaug =[

xold

xnew

](C.1)

Paug =[

Pold 00 ∞Inew

](C.2)

Obtaining the correct mean and covariance values for the added elements (including cross-correlations to the old portion of the state vector) is then simply a matter of applying thenormal observation update equations.

In practice, the added mean elements are initially set to values expected by the newobservation information (i.e., a close initial guess); this prevents linearisation problemsduring the EKF update. The initial covariance diagonals are set to a suitably large value(e.g., σ2 = 1012), as an approximation to infinity. While using non-infinite variances actuallyconstitutes a reduction in uncertainty without the addition of information, this inconsistencyis extremely small and easily absorbed by the added process and observation stabilisingnoise.

C.1 New Feature Initialisation

This example of the insert and observe method is an alternative to the map augmentationmethod presented in Section 2.2.4. However, the following approach is not recommendedfor feature initialisation in practice, since the direct Jacobian-based method is both moreexact and more efficient.

C.2 Vehicle Pose Initialisation in a Partially Known Map 177

Given the observation z = [r, θ]T of a new feature (xi, yi) where the observation is mod-elled as

zi = hi (xv, xi) =

√

(xi − xv)2 + (yi − yv)

2

arctan(

yi−yv

xi−xv

)− φv

(C.3)

the feature is added to the state vector xa with approximately1 infinite uncertainty.

xaug =

xa

xv + r cos (θ + φv)yv + r sin (θ + φv)

(C.4)

Paug =[

Pa 00 αI

](C.5)

The posterior augmented state estimate can then be obtained via the normal SLAM updateequations (i.e., Equations 2.17 to 2.22 from Section 2.2.3).

In the event of long term SLAM, where a huge number of features are initialised bythis method, it is conceivable that the small decrease in vehicle pose uncertainty due tofinite α may accumulate to the point where the estimate becomes inconsistent. To avoidthis problem, a simple modification to the update equations (i.e., Equations 2.21 and 2.22)may be implemented so that only the new feature portion of the state vector, and only thenew feature block and its off-diagonals of the state covariance matrix, are changed. Withthis modification, the information increase is confined to the new feature only and does notaccumulate.

C.2 Vehicle Pose Initialisation in a Partially Known Map

The insert and observe initialisation method is most advantageous when used to performan augmentation of the state vector that requires multiple simultaneous observations (i.e.,a batch observation). Again, the elements to be initialised are simply given approximatelycorrect nominal values and large uncertainty, and are updated to the correct values usingthe observation information.

If a SLAM map has been constructed for a particular environment, the vehicle statesxv can be removed from the map in a consistent manner by simply deleting the vehicleelements from the state vector and deleting the vehicle-correlated rows and columns fromthe covariance matrix. This enables the map xm to be decoupled from the vehicle poseestimate.

Supposing the vehicle (or another vehicle) later observes a set of features from the mapusing the non-tracking CCDA algorithm (see Section 3.3.3), and it is desired to append thevehicle pose to the map and continue SLAM map building. An initial guess of the vehiclepose xguess can be obtained using a batch relative pose estimation technique—such as theclosed-form least-squares algorithm presented in [73, 93]. The augmented state vector is

1A practical approximation to infinity is provided by a large scalar α.

C.3 Relative Pose Estimation 178

then constructed as follows.

x−a =

[xguess

xm

](C.6)

P−a =

[αIv 00 Pm

](C.7)

The batch observation information is subsequently used to update this state estimate tothe proper value (i.e., via Equations 2.17 to 2.22).

x+a =

[xv

xm

](C.8)

P+a =

[Pv Pvm

PTvm Pm

](C.9)

C.3 Relative Pose Estimation

The following technique is a variation on the vehicle pose initialisation example whichenables calculation of the relative pose (and uncertainty) between two scans of featurepoints.

To begin, the state vector is constructed with the vehicle at pose xv = [0, 0, 0]T , withzero uncertainty, and an initial map composed of the features from the first scan.2

xa =[

0v

xm

](C.10)

Pa =[

0 00 Pm

](C.11)

Notice that each feature xi = [xi, yi]T in the map is not correlated to the vehicle or to anyof the other features, and is derived from the measurement zi = [ri, θi]T as follows.

xi = g (zi) =[

ri cos θi

ri sin θi

](C.12)

Pi = ∇gz

[σ2

r 00 σ2

θ

]∇gT

z (C.13)

where the Jacobian ∇gz is given by

∇gz =∂g∂zi

=[

cos θi −ri sin θi

sin θi ri cos θi

](C.14)

The next step is to guess the vehicle pose for the second scan. If there is no a prioriinformation, this guess might be obtained by first finding the set of feature associations be-tween the two scans via the non-tracking version of the CCDA algorithm (see Section 3.3.3).

2Equations C.10 and C.11 are also used for initialising the state vector for SLAM.


Then, as in Section C.2, a guess of the relative pose can be obtained using a simple batchestimation algorithm, and the pose covariance diagonals are assigned a large uncertainty.

x−a =

[xguess

xm

](C.15)

P−a =

[αIv 00 Pm

](C.16)

On the other hand, if there exists a priori information about the relative pose (e.g.,from odometry or dynamic constraints), then the initial guess and uncertainty would be asfollows.

x−a =

[xprior

xm

](C.17)

P−a =

[Pprior 0

0 Pm

](C.18)

With this prior estimate, the tracking version of the CCDA algorithm (see Section 3.3.4)can be applied subsequently—making use of the absolute constraint information.

Having obtained a set of associations between the two scans, a full SLAM update isperformed using Equations 2.17 to 2.22.

x+a =

[xv

xm

](C.19)

P+a =

[Pv Pvm

PTvm Pm

](C.20)

This produces a state estimate where the vehicle and map portions are correlated. If thismethod is used to obtain the relative pose between two isolated scans, then the vehicle poseestimate xv and Pv can be extracted directly. However, if this method is used to calculatethe change-in-pose over a sequence of scans (e.g., laser-based dead reckoning), then the mapcorrelations become an issue, and direct use of Pv may be inconsistent (see the examplebelow).

Example C.1Laser-based dead reckoning. Given a sequence of laser scans, the relative pose between thekth and k+1th scan can be calculated using the method shown above. A naive dead reckoningimplementation would make the change-in-pose estimate, xδ and Pδ, equal to the extractedrelative pose estimate xv and Pv.

Let the dead reckoning estimate for the pose of the kth scan, be denoted xk = [xk, yk, φk]T ;the dead reckoning equations for the k + 1th scan are as follows.3

xk+1 = f (xk, xδ) =

xk + xδ cos φk − yδ sin φk

yk + xδ sin φk + yδ cos φk

φk + φδ

Pk+1 = ∇fxkPk∇fT

xk+ ∇fxδ

Pδ∇fTxδ

3These equations are a simple variation of the state prediction equations in Section 2.2.2 where the mapportion of the state vector is empty.


where the Jacobians ∇fxkand ∇fxδ

are defined as

∇fxk=

∂f∂xk

∣∣∣∣(xk,xδ)

=

1 0 −xδ sin φk − yδ cos φk

0 1 xδ cos φk − yδ sin φk

0 0 1

∇fxδ=

∂f∂xδ

∣∣∣∣(xk,xδ)

=

cos φk − sin φk 0

sin φk cos φk 00 0 1

The problem with assigning Pδ = Pv is that it may result in an inconsistent estimate ofPk+1 because the relative pose estimates are pair-wise correlated (i.e., the change-in-poseestimate between scans k − 1 and k is correlated to the change-in-pose estimate betweenscans k and k + 1). In other words, the information in each scan is used twice, once withthe preceding scan and once with the subsequent scan. The measure used in this thesis toprevent inconsistent dead reckoning is to expand the change-in-pose uncertainty by two (i.e.,Pδ = 2Pv), which caters for the worst case condition where the two adjacent change-in-poseestimates are fully correlated.4

4Ignoring correlations does not necessarily imply an inconsistent estimate and, in the particular caseof sensor-based dead reckoning, the correlations actually tend to produce a self-correcting estimate. Thus,assigning Pδ = Pv is conservative in most, if not all, circumstances (see Section 6.2.6 for further explanation).In other words, while uncertainty inflation is necessary in general for functions of variables with unknowncorrelations, it is probably not required for the summation of change-in-pose estimates.

Appendix D

Particle Filters

This appendix introduces Bayesian (probabilistic) estimation and describes the particlefilter algorithm as a practical method for implementing recursive Bayesian estimation forreal systems. Particle filtering is an attractive estimation technique for systems with any ofthe following properties: (i) governed by non-linear transition models, (ii) observed via non-linear sensor models, and (iii) possessing arbitrary (non-Gaussian) uncertainty distributions.

D.1 Probability Density Functions

The probabilistic approach to estimation is based on the concept of the probability densityfunction (PDF) which is used to define the uncertainty distribution of a set of randomvariables. In other words a PDF p (x) expresses, for a particular random vector x, thelikelihood that the true state of x lies within a particular region of the state-space X . Forthe purposes of this thesis it is sufficient to understand the following PDF properties.1

• A PDF p (x) represents a functional mapping p : x → R for all x ∈ X .

• A PDF p (x) is non-negative for all values of random vector x.

p (x) ≥ 0, ∀x ∈ X

• The area (or volume) under a PDF is one.∫ ∞

−∞p (x) dx = 1

If there exist two random vectors x and y where the value of x is to some degreedependent on the value of y, then there exists a conditional PDF p (x|y). The conditionalPDF p (x|y) may be understood as the probability or likelihood of x given a fixed value ofy. If x and y are independent then p (x|y) = p (x).

1Excellent expositions of Bayesian estimation theory can be found in [97, 109, 125, 6].

D.2 Recursive Bayesian Estimation 182

D.2 Recursive Bayesian Estimation

This section gives a cursory introduction to Bayesian estimation2 which involves trackingthe PDF of a random vector over time given a model of its incremental change and a meansof observing its actual state. A state vector xk at time k, is assumed to evolve over time asmodelled by the state transition function f .3

xk+1 = f (xk,qk) (D.1)

where qk is an independent noise sequence and the PDF p (qk) is known. An observationof the state is obtained at discrete intervals such that the measurement zk is modelled bythe function h.

zk = h (xk, rk) (D.2)

where rk is also an independent noise sequence with known PDF.Assuming that at time k, the PDF p (xk−1|Zk−1) is known4 and an observation zk has

been obtained, p (xk|Zk) can be estimated by a two-step process. The first is a predictionstep to find the prior PDF p (xk|Zk−1).

p (xk|Zk−1) =∫ ∞

−∞p (xk|xk−1) p (xk−1|Zk−1) dxk−1 (D.3)

where the p (xk|xk−1) is derived from the state transition model f . It is assumed thatthe transition function is a Markov process such that p (xk|{x0, . . . ,xk−1}) is equal top (xk|xk−1). That is, the PDF of xk is conditioned by xk−1 only and is independent ofall previous states.

p (xk|xk−1) =∫ ∞

−∞δ (xk − f (xk−1,qk−1)) p (qk−1) dqk−1 (D.4)

where δ (·) is the Dirac delta function or impulse function (an infinitely brief pulse withinfinite amplitude and unit area [19]). The second step, the update step, follows fromBayes theorem that the posterior PDF p (xk|Zk) is given by

p (xk|Zk) =p (zk|xk) p (xk|Zk−1)∫∞

−∞ p (zk|xk) p (xk|Zk−1) dxk(D.5)

where the denominator of the equation is simply a scaling factor to normalise the area underthe posterior to one. The PDF p (zk|xk) is defined by the observation model h.

p (zk|xk) =∫ ∞

−∞δ (zk − h (xk, rk)) p (rk) drk (D.6)

For the case where the state transition and observation functions, f and h, are linearand the uncertainty distributions are all Gaussian, then the above equations are equivalent

2Sections D.2 and D.3 are based on [60] which provides an accessible introduction to the particle filteralgorithm.

3The functions f and h may change over time and so should actually be written as fk and hk.4The conditional PDF p (xk−1|Zk−1) where Zk−1 is the set {z1, . . . , zk−1}, can be interpreted as the PDF

of xk−1 given all the available observation information up to time k − 1.

D.3 Particle Filtering: Sample Based Estimation 183

to the Kalman filter (see Appendix B for a description of the KF algorithm). If f andh are almost linear then, for Gaussian PDFs, Equations D.3 to D.6 may be adequatelyapproximated by the extended Kalman filter. However, for general non-linear systemswith arbitrary distributions, the EKF is unsuitable and it may be very difficult to find ananalytical description of the posterior distribution. An alternative approach is to obtainan approximate estimate of the posterior PDF using samples. The technique of drawingstate samples from the prior distribution and using these samples (in conjunction with statetransition and observation information) to approximate the posterior is known as particlefiltering.5

D.3 Particle Filtering: Sample Based Estimation

The basic particle filter algorithm is implemented as follows. First, assuming that theinitial PDF p (x0) is known, the algorithm is initialised by drawing n random samples{x0(1), . . . ,x0(n)} from this distribution. From this point on the conditional PDF of xis represented entirely by samples and an analytical formulation of the distribution is notrequired. The evolution of p (xk|Zk) is computed recursively using a three step process:prediction, likelihood weighting, and resampling.

1. Prediction is performed by passing each sample from the previous time step xk−1(i)through the state transition function (Equation D.1).

x−k (i) = f (xk−1(i),qk−1(i)) (D.7)

where the sample qk−1(i) is drawn6 from the known distribution p (qk−1). The result-ing set of samples

{x−

k (1), . . . ,x−k (n)

}represents the prior distribution p (xk|Zk−1).

2. Upon the reception of observation zk at time k, a likelihood weighting is calculatedfor each prior sample x−

k (i) based on the PDF p(zk|x−

k (i))

(of Equation D.6). Asboth zk and x−

k (i) are specified, the value of p(zk|x−

k (i))

is scalar and Λk(i) is givenby

Λk(i) =p(zk|x−

k (i))∑n

j=1 p(zk|x−

k (j)) (D.8)

That is, the likelihood Λk(i) is proportional to p(zk|x−

k (i))

and normalised so thatthe sum of likelihoods is one.

3. Having assigned a weighting to each prior sample, the posterior p (xk|Zk) is estimatedby resampling from the prior sample set according to their weightings. The resamplingalgorithm is as follows. First, a set of cumulative likelihoods {Υk(0), . . . ,Υk(n)} iscalculated.

Υk(i) ={

0, i = 0∑ij=1 Λk(j), otherwise

(D.9)

5Particle filters appear in the literature under a variety of guises including Bootstrap filter [60], SIR filter[114], Monte Carlo filter [79], and Condensation algorithm [74].

6The noise sequence qk is assumed to have known distribution p (qk) at each time step k. It is alsoassumed that a random sample may be feasibly drawn from this distribution.


Each posterior sample x+k (i) is then found by drawing a random scalar sample ui from

the uniform distribution U(0, 1] and selecting the indicated sample from the the priorset.

x+k (i) = x−

k (j), where Υk(j − 1) < ui ≤ Υk(j) (D.10)

The set of samples{x+

k (1), . . . ,x+k (n)

}then represents the posterior distribution

p (xk|Zk) and the estimation process can be repeated from step 1.

Implementation of step one is trivial. For each sample xk−1(i) of the old posteriorp (xk−1|Zk−1), draw a sample qk−1(i) from the distribution p (qk−1) and pass them throughthe state transition function. Steps two and three are less intuitive and the examplesbelow elaborate on the calculation of the likelihood function p

(zk|x−

k (i))

for step two andimplementation of the resampling phase for step three.

Example D.1Likelihood weighting calculation. This example is taken from the first example in [60] andserves to elucidate the calculation of p (zk|xk). Given a one-dimensional observation model

zk = x2k/20 + rk

where rk is zero-mean Gaussian white noise with variance 1.0, the likelihood functionp (zk|xk) is found from Equation D.6 as follows.

p (zk|xk) =∫ ∞

−∞δ

(zk − x2

k

20− rk

)p (rk) drk

This represents, for specific values of zk = zo and xk = xo, the area under the functionresulting from the multiplication of p (rk) with the impulse δ

(zo − x2

o/20 − rk

), where the

impulse is non-zero at rk = zo − x2o/20 as shown in Figure D.1. In this way, given an

observation zo at time k, the likelihood of each sampled particle x−k (i) can be calculated.

The likelihood p (zk|xk) can be expressed as a function of zk in sensor-space for a fixedvalue of xk = xo (see Figure D.2(a)). Alternatively, for a fixed value of zk = zo, thelikelihood can be expressed as a function of xk in target-space as shown in Figure D.2(b).

For general state-space systems with likelihood function defined by Equation D.6, thefunction p (zk|xk = xo) always integrates to one meaning that the likelihood function is aPDF in sensor-space. However, in target space, the function p (zk = zo|xk) does not ingeneral integrate to one and so does not represent a PDF. For further information onlikelihood functions, a detailed discussion can be found in [125, pages 33–35].

Example D.2Resampling from the weighted prior sample set. Let x be a one-dimensional random variableand let the prior PDF p (xk|Zk−1) be represented by five samples{

x−k (1) = 6, x−

k (2) = 3, x−k (3) = 8, x−

k (4) = 5, x−k (5) = 7

}with normalised likelihood weightings

{Λk(1) = 0.35, Λk(2) = 0.05, Λk(3) = 0.15, Λk(4) = 0.2, Λk(5) = 0.25}


Figure D.1: Likelihood evaluation. For fixed values of zk and xk (zo = 5and xo = 9 in this example), the likelihood is found as the area underthe function obtained from multiplying p (rk) with the impulse function.Note, this area is simply the value of p (rk) at the location where δ (·) isnon-zero.

(a) p (zk|xk = xo) as a function of zk (b) p (zk = zo|xk) as a function of xk

Figure D.2: Likelihood functions. The likelihood p (zk|xk) for the modelzk = x2

k/20 + rk is shown in (a) as a function of zk with xo = 9 and in(b) as a function of xk with zo = 5.

D.4 Resampling Implementations 186

Algorithm D.1: Resample(x−k , Λk)

Υ1 ← Λ1

for i ← 2 to ndo Υi ← Υi−1 + Λi

for i ← 1 to ndo Υi ← Υi/Υn

U ← UniformRandom(n)j ← 1for i ← 1 to n

do

while Υj < ui

do j ← j + 1x+

k (i) ← x−k (j)

return (x+k )

The cumulative likelihoods are therefore

{Υk(0) = 0, Υk(1) = 0.35, Υk(2) = 0.4, Υk(3) = 0.55, Υk(4) = 0.75, Υk(5) = 1}

A scalar value u1 = 0.43 is drawn at random from a uniform distribution U(0, 1]. Thisvalue is greater than Υk(2) and less than Υk(3) so the new sample x+

k (1) is assigned thevalue of x−

k (3). Suppose the five random scalars drawn were

{u1 = 0.43, u2 = 0.78, u3 = 0.60, u4 = 0.29, u5 = 0.85}

then the posterior sample set would be as follows.{x+

k (1) = 8, x+k (2) = 7, x+

k (3) = 5, x+k (4) = 6, x+

k (5) = 7}

D.4 Resampling Implementations

This section presents an implementation of the resampling algorithm. The key componentto this algorithm is the generation of an ordered set of uniform-random numbers in therange (0, 1], and so three alternative implementations of this module are presented.

The inputs to the resampling algorithm (see Algorithm D.1) are the prior sampleset x−

k ={x−

k (1), . . . ,x−k (n)

}and their associated (non-normalised) likelihood weight-

ings Λk = {Λk(1), . . . ,Λk(n)}. The algorithm returns the posterior sample set x+k ={

x+k (1), . . . ,x+

k (n)}.

The function UniformRandom(n) receives the number of samples required and returnsthis number of uniform-random samples, in ascending order, from the range (0, 1]. The firstversion of this function, shown in Algorithm D.2, performs straight-forward calculation

D.4 Resampling Implementations 187

Figure D.3: Uniform-random sampling algorithms. The same set of 50random samples is used in Algorithms D.2 to D.4 to allow comparison oftheir distributions (top to bottom, respectively). The first two algorithmspossess identical statistical characteristics, while the third gives regularlyspaced samples with uniform-random jitter in the range (− 1

2n , 12n).

Algorithm D.2: UniformRandom(n)

for i ← 1 to ndo ui ← U(0, 1]

Sort(U)return (U)


un ← U(0, 1]un ← u

1/nn

for i ← n − 1 to 1

do

{ui ← U(0, 1]ui ← u

1/ii ui+1

return (U)


k ← 1/nu1 ← k/2for i ← 2 to n

do{

ui ← ui−1 + k

ui−1 ← ui−1 + U(−k2 , k

2 )un ← un + U(−k

2 , k2 )

return (U)

D.5 Deficiencies of the Particle Filter Algorithm 188

of this set; first calculating a set of n values and later sorting them. The sort routinemeans that this implementation has O(n log n) complexity. A faster version with equivalentresults is presented in Algorithm D.3, where the samples are created in order by way ofexponential distributions—thus requiring only O(n) time (this algorithm is also providedin [44, 111]). Finally, a method known as stratified sampling (see also [79]) is presented inAlgorithm D.4. This algorithm also runs in O(n) time, but tends to give better statisticalresults by ensuring that the samples are evenly distributed (i.e., without clustering), withspacing between samples not greater than 1/n. A pictorial comparison of these algorithmsis shown in Figure D.3.

D.5 Deficiencies of the Particle Filter Algorithm

Particle filter theory states that the sampled representation of posterior approaches the trueposterior PDF as the number of samples approaches infinity. Practical implementations,however, are limited to a bounded number of samples and, in relation to this issue, thebasic particle filter algorithm suffers from several important weaknesses.

The first of these is that the number of samples n required to obtain reasonable statisticalresults must typically be very large (even for state vectors of just one or two variables).Nevertheless, for many applications, the distribution statistics may not be crucial providedsample coverage is sufficient to ensure that the target state remains consistently trackable(i.e., mode tracking). But even this task becomes infeasible for larger state-spaces as thevolume of state-space increases exponentially with the dimension of the state.

A second problem is sample impoverishment where the number of independent samplesdecreases over time (i.e., the filter behaves as if using only ne < n samples). This isapparent from Equation D.10 as clearly some samples from the prior set will be selectedmultiple times while others will be lost—resulting in a depletion in information. Sampleimpoverishment is countered in [60] by introducing “roughening”—adding Gaussian jitterto each resampled element x+

k (i)—which seems to effectively eliminate sample dependence.A third weakness arises when there is insufficient overlap between the prior sample

set and the observation likelihood function p (zk|xk) so that the posterior estimate is veryimprecise. This can happen, for example, if the likelihood function is highly peaked (asfor a very accurate sensor), or if the area covered by the bulk of the likelihood function ison the outskirts of the prior sample set (as would occur for an outlier observation). Thelikelihood coverage problem is difficult to solve and requires directing resampling to regionswhere the observation likelihood is significant and weighting the results to maintain thecorrect posterior statistics. Methods for tackling this problem can be found in [60, 23, 52,111, 138, 132].

Appendix E

Volume from GaussianMultiplication

The sum of Gaussians cross-correlation operation in Chapter 4 requires evaluation of thevolume1 under the product of two Gaussians. A derivation of the volume equation isprovided here in two sections. The first section shows the equation for the scaled Gaussianresulting from the multiplication of two Gaussians, and the second section solves for thevolume under this Gaussian.

E.1 Multiplication of Two Gaussians

Let g(x; p,P) and g(x; q,Q) define two n-dimensional Gaussians

g(x; p,P) =1

(2π)n/2√|P|exp

(−1

2(x − p)TP−1(x − p)

)(E.1)

g(x; q,Q) =1

(2π)n/2√|Q|exp

(−1

2(x − q)TQ−1(x − q)

)(E.2)

where p, P and q, Q are their respective means and variances.2 The scaled Gaussianαg(x; r,R) resulting from the multiplication of these two Gaussians is therefore

αg(x; r,R) = g(x; p,P)g(x; q,Q)

=α

(2π)n/2√|R|exp

(−1

2(x − r)TR−1(x − r)

)(E.3)

where r and R are given by

R−1 = P−1 + Q−1 (E.4)

r = R(P−1p + Q−1q) (E.5)

and the scaling factor α represents the volume under the function αg(x; r,R).1The term volume is used here to intuitively describe the integral over n-dimensional space.2The means are n × 1 real vectors and the variances are n × n real positive-definite matrices.

E.1 Multiplication of Two Gaussians 190

Proof. The derivation of Equation E.3 is as follows.3 From Equations E.1 and E.2, themultiplication of g(x; p,P) and g(x; q,Q) can be expanded to

g(x; p,P)g(x; q,Q)

=1

(2π)n√|P| |Q|exp

(−1

2[(x − p)TP−1(x − p) + (x − q)TQ−1(x − q)

])(E.6)

The next section of the proof concerns only the exponential portion of Equation E.6. Fornotational convenience, let A = P−1 and B = Q−1 so that

(x− p)TP−1(x− p) + (x− q)TQ−1(x− q) = (x− p)TA(x− p) + (x− q)TB(x− q) (E.7)

Notice that (x − p)TA(x − p) can be expanded using the matrix identity that xTAy = yTAxif A is real symmetric and xTAy is scalar.

(x − p)TA(x − p) = xTAx − xTAp − pTAx + pTAp

= xTAx − 2xTAp + pTAp

Therefore, Equation E.7 becomes

(x − p)TA(x − p) + (x − q)TB(x − q)

= xTAx − 2xTAp + pTAp + xTBx − 2xTBq + qTBq

= xT (A + B)x − 2xT (Ap + Bq) + pTAp + qTBq

(substituting C = R−1 from Equation E.4)

= xTCx − 2xT (Ap + Bq) + pTAp + qTBq

(inserting CC−1)

= xTCx − 2xTCC−1(Ap + Bq) + pTAp + qTBq

(substituting r from Equation E.5)

= xTCx − 2xTCr + pTAp + qTBq

(inserting rTCr − rTCr)

= xTCx − 2xTCr + rTCr + pTAp + qTBq − rTCr︸︷︷︸k= (x − r)TC(x − r) + k

3Thanks to Somajyoti Majumder for his assistance in this derivation. Note, the standard proof for “theproduct of two Gaussians is a Gaussian” is rather simpler than this one, but this approach fulfills a secondarymotivation to obtain Equation E.9.

E.2 Volume from the Multiplication of Two Gaussians 191

where k is a scalar constant term. Thus, incorporating this result back into Equation E.6the multiplication of g(x; p,P) and g(x; q,Q) can be expressed as

g(x; p,P)g(x; q,Q) =1

(2π)n√|P| |Q|exp

(−1

2[(x − r)TR−1(x − r) + k

])

=1

(2π)n√|P| |Q|exp

(−k

2

)exp(−1

2(x − r)TR−1(x − r)

) (E.8)

which is equivalent to Equation E.3 for α equal to

α =(2π)n/2

√|R|(2π)n

√|P| |Q|exp(−k

2

)(E.9)

E.2 Volume from the Multiplication of Two Gaussians

The volume α under the scaled Gaussian resulting from the multiplication of g(x; p,P) andg(x; q,Q) is given in Equation E.9 above. This equation expressed in terms of p, P, q andQ is as follows.

α =1

(2π)n/2√|P + Q|exp

(−1

2(p − q)T (P + Q)−1(p − q)

)(E.10)

Proof. The derivation of Equation E.10 from Equation E.9 is demonstrated in two parts.First, the simplification of √|R|√|P| |Q| =

√|P(P + Q)−1Q|√|P| |Q|=√

|P + Q|−1

=1√|P + Q|

(E.11)

and, second, the simplification of the constant k.

k = pTP−1p + qTQ−1q − rTR−1r (E.12)

The portion rTR−1r is expanded as

rTR−1r = (P−1p + Q−1q)TRTR−1R(P−1p + Q−1q)

= (pTP−1 + qTQ−1)(P(P + Q)−1Q

)(P−1p + Q−1q)

= pTP−1P(P + Q)−1QP−1p

+ qTQ−1Q(P + Q)−1PQ−1q

+ pTP−1P(P + Q)−1QQ−1q

+ qTQ−1Q(P + Q)−1PP−1p

= pT (P + Q)−1QP−1p + qT (P + Q)−1PQ−1q

+ pT (P + Q)−1q + qT (P + Q)−1p

E.2 Volume from the Multiplication of Two Gaussians 192

so that Equation E.12 becomes

k = pT(P−1 − (P + Q)−1QP−1

)p

+ qT(Q−1 − (P + Q)−1PQ−1

)q

− pT (P + Q)−1q − qT (P + Q)−1p

Now, P−1 − (P + Q)−1QP−1 can be simplified as follows.

P−1 − (P + Q)−1QP−1 = (P + Q)−1((P + Q)P−1 − QP−1

)= (P + Q)−1

(PP−1 + QP−1 − QP−1

)= (P + Q)−1

Similarly, Q−1 − (P + Q)−1PQ−1 = (P + Q)−1 so that Equation E.12 becomes

k = pT (P + Q)−1p + qT (P + Q)−1q

− pT (P + Q)−1q − qT (P + Q)−1p

= (p − q)T (P + Q)−1(p − q)

(E.13)

Substituting the results of Equations E.11 and E.13 into Equation E.9 provides the solutiongiven in Equation E.10.

Bibliography

[1] S. Argamon-Engelson. Using image signatures for place recognition. Pattern Recog-nition Letters, 19:941–951, 1998.

[2] K.S. Arun, T.S. Huang, and S.D. Blostein. Least-squares fitting of two 3-D point sets.IEEE Transactions on Pattern Analysis and Machine Intelligence, 9(5):698–700, 1987.

[3] O. Aycard, F. Charpillet, D. Fohr, and J.F. Mari. Place learning and recognitionusing hidden markov models. In IEEE International Conference on Intelligent Robotsand Systems, pages 1741–1746, 1997.

[4] T. Bailey and E. Nebot. Localisation in large-scale environments. Robotics andAutonomous Systems, 37(4):261–281, 2001.

[5] T. Bailey, E.M. Nebot, J.K. Rosenblatt, and H.F. Durrant-Whyte. Data associationfor mobile robot navigation: A graph theoretic approach. In IEEE InternationalConference on Robotics and Automation, volume 3, pages 2512–2517, 2000.

[6] Y. Bar-Shalom, X.R. Li, and T. Kirubarajan. Estimation with Applications to Track-ing and Navigation. John Wiley and Sons, 2001.

[7] Y. Bar-Shalom and E. Tse. Tracking in a cluttered environment with probabilisticdata association. Automatica, 11:451–460, 1975.

[8] H.G. Barrow and R.M. Burstall. Subgraph isomorphism, matching relational struc-tures and maximal cliques. Information Processing Letters, 4(4):83–84, 1976.

[9] K. Basye, T. Dean, and J.S. Vitter. Coping with uncertainty in map learning. Tech-nical report, Brown University, Department of Computer Science, 1989.

[10] R. Battiti and M. Protasi. Reactive local search for the maximum clique problem.Algorithmica, 29(4):610–637, 2001.

[11] S.S. Beauchemin and J.L. Barron. The computation of optical flow. ACM ComputingSurveys, 27(3):433–467, 1995.

[12] J.L. Bentley. K-d trees for semidynamic point sets. In Sixth Annual ACM Symposiumon Computational Geometry, pages 187–197, 1990.

[13] P.J. Besl and N.D. McKay. A method for registration of 3-D shapes. IEEE Transac-tions on Pattern Analysis and Machine Intelligence, 14(2):239–256, 1992.

BIBLIOGRAPHY 194

[14] S.S. Blackman and R. Popoli. Design and Analysis of Modern Tracking Systems.Artech House Radar Library, 1999.

[15] M. Boddy and T. Dean. Solving time-dependent planning problems. In EleventhInternational Joint Conference on Artificial Intelligence, pages 979–984, 1989.

[16] R.C. Bolles and R.A. Cain. Recognizing and locating partially visible objects: Thelocal-feature-focus method. International Journal of Robotics Research, 1(3):57–82,1982.

[17] I.M. Bomze, M. Budinich, P.M. Pardalos, and M. Pelillo. The maximum clique prob-lem. In D.Z. Du and P.M. Pardalos, editors, Handbook of Combinatorial Optimization.Kluwer Academic Publishers, 1999.

[18] M. Bosse, P. Newman, J. Leonard, and S. Teller. An Atlas framework for scalablemapping. Technical report, Massachusetts Institute of Technology, Marine RoboticsLaboratory, 2002.

[19] R.N. Bracewell. The Fourier Transform and its Applications, 2nd Edition. McGraw-Hill, 1986.

[20] M. Brady, S. Cameron, H. Durrant-Whyte, M. Fleck, D. Forsyth, A. Noble, andI. Page. Progress towards a system that can acquire pallets and clean warehouses. InFourth International Symposium of Robotics Research, pages 359–374, 1987.

[21] W. Burgard, A. Derr, D. Fox, and A.B. Cremers. Integrating global position esti-mation and position tracking for mobile robots: The dynamic markov localizationapproach. In IEEE/RSJ International Conference on Intelligent Robots and Systems,1998.

[22] W. Burgard, D. Fox, H. Jans, C. Matenar, and S. Thrun. Sonar-based mapping withmobile robots using EM. In International Conference on Machine Learning, 1999.

[23] J. Carpenter, P. Clifford, and P. Fearnhead. An improved particle filter for non-linearproblems. Technical report, Oxford University, Department of Statistics, 1997.

[24] J.A. Castellanos, M. Devy, and J.D. Tardos. Towards a topological representationof indoor environments: A landmark-based approach. In IEEE/RSJ InternationalConference on Intelligent Robots and Systems, pages 23–28, 1999.

[25] J.A. Castellanos, J.M.M Montiel, J. Neira, and J.D. Tardos. The SPmap: A proba-bilistic framework for simultaneous localization and map building. IEEE Transactionson Robotics and Automation, 15(5):948–952, 1999.

[26] J.A. Castellanos and J.D. Tardos. Mobile Robot Localization and Map Building: AMultisensor Fusion Approach. Kluwer Academic Publishers, 1999.

[27] J.A. Castellanos, J.D. Tardos, and G. Schmidt. Building a global map of the envi-ronment of a mobile robot: The importance of correlations. In IEEE InternationalConference on Robotics and Automation, pages 1053–1059, 1997.

BIBLIOGRAPHY 195

[28] C.K. Chen and D.Y.Y. Yun. Unify graph matching problems with a practical solution.In International Conference on Systems, Signals, Control, Computers, 1998.

[29] K.S. Chong and L. Kleeman. Large scale sonarray mapping using multiple connectedlocal maps. In International Conference on Field and Service Robotics, pages 538–545,1997.

[30] K.S. Chong and L. Kleeman. Feature-based mapping in real, large scale environmentsusing an ultrasonic array. International Journal Robotics Research, 18(1):3–19, 1999.

[31] H. Choset and K. Nagatani. Topological simultaneous localization and mapping(SLAM): Toward exact localization without explicit localization. IEEE Transactionson Robotics and Automation, 17(2):125–137, 2001.

[32] J.B. Collins and J.K. Uhlmann. Efficient gating in data association with multivariategaussian distributed states. IEEE Transactions on Aerospace and Electronic Systems,28(3):909–916, 1992.

[33] I.J. Cox. Blanche—An experiment in guidance and navigation of an autonomousrobot vehicle. IEEE Transactions on Robotics and Automation, pages 193–204, 1991.

[34] I.J. Cox and S.L. Hingorani. An efficient implementation of Reid’s multiple hypoth-esis tracking algorithm and its evaluation for the purpose of visual tracking. IEEETransactions on Pattern Analysis and Machine Intelligence, 18(2):138–150, 1996.

[35] I.J. Cox and M.L. Miller. On finding ranked assignments with application to multi-target tracking and motion correspondence. IEEE Transactions on Aerospace andElectronic Systems, pages 486–489, 1995.

[36] M. Csorba. Simultaneous Localisation and Map Building. PhD thesis, University ofOxford, Department of Engineering Science, 1997.

[37] J. de Geeter, H. van Brussel, J. de Schutter, and M. Decreton. A smoothly constrainedKalman filter. IEEE Transactions on Pattern Analysis and Machine Intelligence,19(10):1171–1177, 1997.

[38] F. Dellaert, D. Fox, W. Burgard, and S. Thrun. Monte Carlo localization for mobilerobots. In IEEE International Conference on Robotics and Automation, pages 1322–1328, 1999.

[39] E.W. Dijkstra. A note on two problems in connexion with graphs. NumerischeMathematik, 1:269–271, 1959.

[40] G. Dissanayake, H. Durrant-Whyte, and T. Bailey. A computationally efficient solu-tion to the simultaneous localisation and map building (SLAM) problem. In IEEEInternational Conference on Robotics and Automation, volume 2, pages 1009–1014,2000.

[41] G. Dissanayake, S. Sukkarieh, E. Nebot, and H. Durrant-Whyte. A new algorithmfor the alignment of inertial measurement units without external observation for landvehicle applications. In IEEE International Conference on Robotics and Automation,pages 2274–2279, 1999.

BIBLIOGRAPHY 196

[42] M.W.M.G. Dissanayake, P. Newman, S. Clark, H.F. Durrant-Whyte, and M. Csorba.A solution to the simultaneous localization and map building (SLAM) problem. IEEETransactions on Robotics and Automation, 17(3):229–241, 2001.

[43] C. Dorai, J. Weng, and A.K. Jain. Optimal registration of object views using rangedata. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(10):1131–1138, 1997.

[44] A. Doucet. On sequential simulation-based methods for Bayesian filtering. Technicalreport, Cambridge University, Department of Engineering, 1998.

[45] R.O. Duda and P.E. Hart. Pattern Classification and Scene Analysis. John Wileyand Sons, 1973.

[46] G. Dudek, M. Jenkin, E. Milios, and D. Wilkes. Map validation and robot self-locationin a graph-like world. Robotics and Autonomous Systems, 22(2):159–178, 1997.

[47] A. Dunkin. Automated guided vehicle systems: An introduction. Industrial Engi-neering, 26:47–53, August 1994.

[48] H. Durrant-Whyte, S. Majumder, S. Thrun, M. de Battista, and S. Scheding. ABayesian algorithm for simultaneous localisation and map building. In Tenth Inter-national Symposium of Robotics Research, 2001.

[49] H.F. Durrant-Whyte. An autonomous guided vehicle for cargo handling applications.International Journal of Robotics Research, 15, 1996.

[50] A. Elfes. Occupancy grids: A stochastic spatial representation for active robot per-ception. In Sixth Conference on Uncertainty in AI, 1990.

[51] S.P. Engelson and D.V. McDermott. Error correction in mobile robot map learning.In IEEE International Conference on Robotics and Automation, pages 2555–2560,1992.

[52] P. Fearnhead. Sequential Monte Carlo Methods in Filter Theory. PhD thesis, Univer-sity of Oxford, 1998.

[53] H.J.S. Feder, J.J. Leonard, and C.M. Smith. Adaptive mobile robot navigation andmapping. International Journal of Robotics Research, 18(7):650–668, 1999.

[54] T.A. Feo, M.G.C. Resende, and S.H. Smith. A greedy randomized adaptive searchprocedure for maximum independent set. Operations Research, 42:860–878, 1994.

[55] T.E. Fortmann, Y. Bar-Shalom, and M. Scheffe. Sonar tracking of multiple targetsusing joint probabilistic data association. IEEE Journal of Oceanic Engineering,8(3):173–184, 1983.

[56] Y. Fuke and E. Krotkov. Dead reckoning for a lunar rover on uneven terrain. In IEEEInternational Conference on Robotics and Automation, pages 411–416, 1996.

BIBLIOGRAPHY 197

[57] G. Giralt, R. Sobek, and R. Chatila. A multi-level planning and navigation system fora mobile robot: A first approach to Hilare. In Sixth International Joint Conferenceon Robotics and Automation, 1979.

[58] F. Glover and M. Laguna. Tabu Search. Kluwer Academic Publishers, 1997.

[59] A.V. Goldberg and R. Kennedy. An efficient cost scaling algorithm for the assignmentproblem. Mathematical Programming, 71:153–178, 1995.

[60] N.J. Gordon, D.J. Salmond, and A.F.M. Smith. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proceedings-F, 140(2):107–113, 1993.

[61] M.S. Grewal and A.P. Andrews. Kalman Filtering: Theory and Practice Using MAT-LAB, 2nd Edition. John Wiley and Sons, 2001.

[62] M.S. Grewal, L.R. Weill, and A.P. Andrews. Global Positioning Systems, InertialNavigation, and Integration. John Wiley and Sons, 2001.

[63] W.E.L. Grimson and T. Lozano-Perez. Localizing overlapping parts by searching theinterpretation tree. IEEE Transactions on Pattern Analysis and Machine Intelligence,9(4):469–482, 1987.

[64] J. Guivant and E. Nebot. Navigation in large outdoor unstructured environments. InTenth International Symposium of Robotics Research, 2001.

[65] J. Guivant and E. Nebot. Optimization of the simultaneous localization and mapbuilding algorithm for real time implementation. IEEE Transactions on Robotics andAutomation, 17(3):242–257, 2001.

[66] J. Guivant and E. Nebot. Improving computational and memory requirements ofsimultaneous localization and map building algorithms. In IEEE International Con-ference on Robotics and Automation, pages 2731–2736, 2002.

[67] J. Guivant, E. Nebot, and H.F. Durrant-Whyte. Simultaneous localization and mapbulding using natural features in outdoor environments. In Sixth International Con-ference on Intelligent Autonomous Systems, volume 1, pages 581–588, 2000.

[68] J.S. Gutmann and K. Konolige. Incremental mapping of large cyclic environments.In IEEE International Symposium on Computational Intelligence in Robotics andAutomation, pages 318–325, 1999.

[69] J.S. Gutmann and C. Schlegel. Amos: Comparison of scan matching approaches forself-localization in indoor environments. In 1st Euromicro Workshop on AdvancedMobile Robots (Eurobot’96), pages 61–67, 1996.

[70] E. Haines. Point in polygon strategies. In P.S. Heckbert, editor, Graphics Gems IV,pages 24–46. Academic Press, 1994.

[71] R. Hinkel and T. Knieriemen. Environment perception with a laser radar in a fastmoving robot. In Symposium on Robot Control, pages 68.1–68.7, 1988.

BIBLIOGRAPHY 198

[72] R. Horaud and T. Skordas. Stereo correspondence through feature grouping andmaximal cliques. IEEE Transactions on Pattern Analysis and Machine Intelligence,11(11):1168–1180, 1989.

[73] B.K.P. Horn. Closed-form solution of absolute orientation using unit quaternions.Journal of the Optical Society of America A, 4(4):629–642, 1987.

[74] M. Isard and A. Blake. Contour tracking by stochastic propagation of conditionaldensity. In European Conference on Computer Vision, pages 343–356, 1996.

[75] S. Julier, J. Uhlmann, and H.F. Durrant-Whyte. A new method for the nonlineartransformation of means and covariances in filters and estimators. IEEE Transactionson Automatic Control, 45(3):477–482, 2000.

[76] S.J. Julier and J.K. Uhlmann. A general method for approximating nonlinear trans-formations of probability distributions. Technical report, University of Oxford, De-partment of Engineering Science, 1996.

[77] S.J. Julier and J.K. Uhlmann. A non-divergent estimation algorithm in the presence ofunknown correlations. In American Control Conference, volume 4, pages 2369–2373,1997.

[78] R.M. Karp. Reducibility among combinatorial problems. In R.E. Miller and J.W.Thatcher, editors, Complexity of Computer Computations, pages 85–103. PlenumPress, 1972.

[79] G. Kitagawa. Monte Carlo filter and smoother for non-Gaussian nonlinear state spacemodels. Journal of Computational and Graphical Statistics, 5(1):1–25, 1996.

[80] L. Kleeman. Optimal estimation of position and heading for mobile robots usingultrasonic beacons and dead reckoning. In IEEE International Conference on Roboticsand Automation, pages 2582–2587, 1992.

[81] J. Knight, A. Davison, and I. Reid. Towards constant time SLAM using postponement.In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 405–413, 2001.

[82] K. Konolige and K. Chou. Markov localization using correlation. In InternationalJoint Conference on Articial Intelligence, pages 1154–1159, 1999.

[83] D. Kortenkamp, L.D. Baker, and T. Weymouth. Using gateways to build a routemap. In IEEE/RSJ International Conference on Intelligent Robots and Systems,pages 2209–2214, 1992.

[84] H.W. Kuhn. The Hungarian method for the assignment problem. Naval ResearchLogistics Quarterly, 2:83–97, 1955.

[85] B. Kuipers and Y.T. Byun. A robot exploration and mapping strategy based on asemantic hierarchy of spatial representations. Journal of Robotics and AutonomousSystems, 8:47–63, 1991.

BIBLIOGRAPHY 199

[86] A.T. Le. Modelling and Control of Tracked Vehicles. PhD thesis, University of Sydney,Australian Centre for Field Robotics, 1999.

[87] J.J. Leonard and H.F. Durrant-Whyte. Simultaneous map building and localizationfor an autonomous mobile robot. In IEEE International Workshop on IntelligentRobots and Systems, pages 1442–1447, 1991.

[88] J.J. Leonard and H.F. Durrant-Whyte. Directed Sonar Sensing for Mobile RobotNavigation. Kluwer Academic Publishers, 1992.

[89] J.J. Leonard, H.F. Durrant-Whyte, and I.J. Cox. Dynamic map building for anautonomous robot. International Journal of Robotics Research, 11(4):286–298, 1992.

[90] J.J. Leonard and H.J.S. Feder. Decoupled stochastic mapping. Technical report,Massachusetts Institute of Technology, Marine Robotics Laboratory, 1999.

[91] J.J. Leonard and R.J. Rikoski. Incorporation of delayed decision making into stochas-tic mapping. In International Symposium On Experimental Robotics, 2000.

[92] F. Lu and E. Milios. Globally consistent range scan alignment for environment map-ping. Autonomous Robots, 4:333–349, 1997.

[93] F. Lu and E. Milios. Robot pose estimation in unknown environments by matching2D range scans. Journal of Intelligent and Robotic Systems, 18:249–275, 1997.

[94] R. Madhavan. Terrain Aided Localisation of Autonomous Vehicles in UnstructuredEnvironments. PhD thesis, University of Sydney, Australian Centre for Field Robotics,2001.

[95] S. Majumder. Sensor Fusion and Feature Based Navigation for Subsea Robots. PhDthesis, University of Sydney, Australian Centre for Field Robotics, 2001.

[96] S. Majumder, J. Rosenblatt, S. Scheding, and H. Durrant-Whyte. Map building andlocalization for underwater navigation. In International Symposium On ExperimentalRobotics, 2000.

[97] P.S. Maybeck. Stochastic Models, Estimation and Control, volume 1. Academic Press,1979.

[98] J.J. McGregor. Backtrack search algorithms and the maximal common subgraphproblem. Software Practice and Experience, 12:23–34, 1982.

[99] H.P. Moravec. Sensor fusion in certainty grids for mobile robots. AI Magazine,9(2):61–74, 1988.

[100] H.P. Moravec and A. Elfes. High resolution maps from wide angle sonar. In IEEEInternational Conference on Robotics and Automation, pages 116–121, 1985.

[101] D.M. Mount. ANN programming manual. Technical report, University of Maryland,Department of Computer Science, 1998.

BIBLIOGRAPHY 200

[102] P. Moutarlier and R. Chatila. Stochastic multisensory data fusion for mobile robotlocation and environmental modelling. In Fifth International Symposium of RoboticsResearch, pages 85–94, 1989.

[103] E.M. Nebot and D. Pagac. Quadtree representation and ultrasonic information formapping an autonomous guided vehicle’s environment. International Journal of Com-puters and Their Applications, 2(3):160–170, 1995.

[104] J. Neira, L. Montano, and J.D. Tardos. Constraint-based object recognition in mul-tisensor systems. In IEEE International Conference on Robotics and Automation,pages 135–142, 1993.

[105] J. Neira and J.D. Tardos. Robust and feasible data association for simultaneouslocalization and map building. In IEEE International Conference on Robotics andAutomation, Workshop on Robot Navigation and Mapping, 2000.

[106] J. Neira and J.D. Tardos. Data association in stochastic mapping using the jointcompatibility test. IEEE Transactions on Robotics and Automation, 17(6):890–897,2001.

[107] J.A. Nelder and R. Mead. A simplex method for function minimization. ComputerJournal, 7:308–313, 1965.

[108] C.F. Olson and L.H. Matthies. Maximum likelihood rover localization by matchingrange maps. In IEEE International Conference on Robotics and Automation, pages272–277, 1998.

[109] A. Papoulis. Probability, Random Variables, and Stochastic Processes, 2nd Edition.McGraw-Hill, 1984.

[110] S.T. Pfister, K.L. Kriechbaum, S.I. Roumeliotis, and J.W. Burdick. Weighted rangesensor matching algorithms for mobile robot displacement estimation. In IEEE In-ternational Conference on Robotics and Automation, 2002.

[111] M.K. Pitt and N. Shephard. Filtering via simulation: Auxiliary particle filters. Journalof the American Statistical Association, 94:590–599, 1999.

[112] W.H. Press, S.A. Teukolsky, W.T. Vetterling, and B.P. Flannery. Numerical Recipesin C, 2nd Edition. Cambridge University Press, 1992.

[113] D.B. Reid. An algorithm for tracking multiple targets. IEEE Transactions on Auto-matic Control, 24(6):843–854, 1979.

[114] D.B. Rubin. Using the SIR algorithm to simulate posterior distributions. In J.M.Bernardo, M.H. DeGroot, D.V. Lindley, and A.F.M. Smith, editors, Bayesian Statis-tics 3, pages 395–402. Oxford University Press, 1988.

[115] S. Rusinkiewicz and M. Levoy. Efficient variants of the ICP algorithm. In ThirdInternational Conference on 3D Digital Imaging and Modeling, pages 145–152, 2001.

BIBLIOGRAPHY 201

[116] S. Scheding, G. Dissanayake, E. Nebot, and H. Durrant-Whyte. Slip modelling andaided inertial navigation of an LHD. In IEEE International Conference on Roboticsand Automation, pages 1904–1909, 1997.

[117] S. Scheding, E. Nebot, and H. Durrant-Whyte. High-integrity navigation: Afrequency-domain approach. IEEE Transactions on Control Systems Technology,8(4):676–694, 2000.

[118] A.C. Schultz and W. Adams. Continuous localization using evidence grids. In IEEEInternational Conference on Robotics and Automation, pages 2833–2839, 1998.

[119] A.C. Schultz, W. Adams, B. Yamauchi, and M. Jones. Unifying exploration, local-ization, navigation and planning through a common representation. In IEEE Inter-national Conference on Robotics and Automation, pages 2651–2658, 1999.

[120] A. Shoukry and M. Aboutabl. Neural network approach for solving the maximalcommon subgraph problem. IEEE Transactions on Systems, Man, and Cybernetics,26(5):785–790, 1996.

[121] S. Simhon and G. Dudek. A global topological map formed by local metric maps. InIEEE/RSJ International Conference on Intelligent Robots and Systems, pages 1708–1714, 1998.

[122] R. Smith, M. Self, and P. Cheeseman. A stochastic map for uncertain spatial rela-tionships. In Fourth International Symposium of Robotics Research, pages 467–474,1987.

[123] A. Stevens, M. Stevens, and H.F. Durrant-Whyte. OxNav: Reliable autonomousnavigation. In IEEE International Conference on Robotics and Automation, pages2607–2612, 1995.

[124] A.J. Stoddart, S. Lemke, A. Hilton, and T. Renn. Estimating pose uncertainty forsurface registration. Image and Vision Computing, 16(2):111–120, 1998.

[125] L.D. Stone, C.A. Barlow, and T.L. Corwin. Bayesian Multiple Target Tracking. ArtechHouse, 1999.

[126] S. Sukkarieh, E.M. Nebot, and H.F. Durrant-Whyte. A high integrity IMU/GPSnavigation loop for autonomous land vehicle applications. IEEE Transactions onRobotics and Automation, 15(3):572–578, 1999.

[127] J.D. Tardos, J. Neira, P.M. Newman, and J.J. Leonard. Robust mapping and local-ization in indoor environments using sonar data. International Journal of RoboticsResearch, 21(4):311–330, 2002.

[128] S. Thrun. Learning metric-topological maps for indoor mobile robot navigation. Ar-tificial Intelligence, 99(1):21–71, 1998.

[129] S. Thrun, M. Beetz, M. Bennewitz, W. Burgard, A.B. Cremers, F. Dellaert, D. Fox,D. Hahnel, C. Rosenberg, N. Roy, J. Schulte, and D. Schulz. Probabilistic algo-rithms and the interactive museum tour-guide robot Minerva. International Journalof Robotics Research, 19(11):972–999, 2000.

BIBLIOGRAPHY 202

[130] S. Thrun, W. Bugard, and D. Fox. A real-time algorithm for mobile robot mappingwith applications to multi-robot and 3D mapping. In International Conference onRobotics and Automation, pages 321–328, 2000.

[131] S. Thrun, W. Burgard, and D. Fox. A probabilistic approach to concurrent mappingand localization for mobile robots. Machine Learning, 31:29–53, 1998.

[132] S. Thrun, D. Fox, W. Burgard, and F. Dellaert. Robust Monte Carlo localization formobile robots. Artificial Intelligence, 128:99–141, 2001.

[133] T. Tsumura. Survey of automated guided vehicle in japanese factory. In IEEE Inter-national Conference on Robotics and Automation, pages 1329–1334, 1986.

[134] J.K. Uhlmann. Dynamic Map Building and Localization: New Theoretical Founda-tions. PhD thesis, University of Oxford, Department of Engineering Science, 1995.

[135] J.K. Uhlmann, S.J. Julier, and M. Csorba. Nondivergent simultaneous map buildingand localization using covariance intersection. In SPIE Aerosense Conference onNavigation and Control Technologies for Unmanned Systems II, volume 3087, pages2–11, 1997.

[136] J.R. Ullmann. An algorithm for subgraph isomorphism. Journal of the Associationfor Computing Machinery, 23(1):31–42, 1976.

[137] I. Ulrich and I. Nourbakhsh. Appearance-based place recognition for topological lo-calization. In IEEE International Conference on Robotics and Automation, pages1023–1029, 2000.

[138] R. van der Merwe, J.F.G. de Freitas, D. Doucet, and E.A. Wan. The unscentedparticle filter. Technical report, Cambridge University, Department of Engineering,2000.

[139] G. Weiß, C. Wetzler, and E. Puttkamer. Keeping track of position and orientation ofmoving indoor systems by correlation of range-finder scans. In International Confer-ence on Intelligent Robots and Systems, pages 595–601, 1994.

[140] S.B. Williams. Efficient Solutions to Autonomous Mapping and Navigation Problems.PhD thesis, University of Sydney, Australian Centre for Field Robotics, 2001.

[141] B. Yamauchi and P. Langley. Place recognition in dynamic environments. Journal ofRobotic Systems, 14(2):107–120, 1997.

[142] B. Yamauchi, A. Schultz, and W. Adams. Mobile robot exploration and map-buildingwith continuous localization. In IEEE International Conference on Robotics and Au-tomation, pages 3715–3720, 1998.

[143] Z. Zhang. Iterative point matching for registration of free-form curves and surfaces.International Journal of Computer Vision, 13(2):119–152, 1994.

mobile robot localisation and mapping in …...mobile robot localisation and mapping in extensive...

Documents