
Page 1: Multiple Hypothesis Object Tracking For Unsupervised Self …climatechange.cs.umn.edu/.../papers/FaghmousFUSLMBK2013.pdf · 2013-07-26 · resources required to maintain a large number

Multiple Hypothesis Object Tracking For Unsupervised Self-Learning: An Ocean Eddy Tracking Application

James H. Faghmous, Muhammed Uluyol, Luke Styles, Matthew Le, Varun Mithal, Shyam Boriah and Vipin Kumar

Department of Computer Science and Engineering
University of Minnesota

Abstract

Mesoscale ocean eddies transport heat, salt, energy, and nutrients across oceans. As a result, accurately identifying and tracking such phenomena are crucial for understanding ocean dynamics and marine ecosystem sustainability. Traditionally, ocean eddies are monitored through two phases: identification and tracking. A major challenge for such an approach is that the tracking phase is dependent on the performance of the identification scheme, which can be susceptible to noise and sampling errors. In this paper, we focus on tracking, and introduce the concept of multiple hypothesis assignment (MHA), which extends traditional multiple hypothesis tracking for cases where the features tracked are noisy or uncertain. Under this scheme, features are assigned to multiple potential tracks, and the final assignment is deferred until more data are available to make a relatively unambiguous decision. Unlike the most widely used methods in the eddy tracking literature, MHA uses contextual spatio-temporal information to take corrective measures autonomously on the detection step a posteriori and performs significantly better in the presence of noise. This study is also the first to empirically analyze the relative robustness of eddy tracking algorithms.

1 Introduction

Understanding how the oceans and their ecosystems will respond to global changes in temperature and atmospheric CO2 levels remains a topic of intense scientific and societal interest (Caldeira, Wickett, and others 2003; Hoegh-Guldberg and Bruno 2010). In order to project a future response accurately, it is crucial that current and past ocean dynamics be well understood. Mesoscale ocean eddies (hereafter eddies) are coherent rotating structures of ocean spanning tens to hundreds of kilometers and lasting a few days to several months. Eddies are critical phenomena as they dominate the ocean’s kinetic energy and are responsible for the transport of heat, salt, nutrients, and energy across the oceans (Fu et al. 2010).

Given their importance to ocean dynamics, identifying and tracking eddies has been an active field of research (see Chelton, Schlax, and Samelson (2011) for a review). Traditionally, autonomous eddy monitoring is performed in two independent steps. First, eddy-like features are identified in successive frames of satellite data. Second, the eddy-like features are tracked across time by associating each feature in one frame to another feature in the following time-step. The focus of this paper is on the tracking phase of this two-step process.

Figure 1: An eddy near the southern tip of the African continental mass (near bottom right corner). This large eddy (150 km in diameter) moved heat and nutrients from the warm Indian Ocean to the cooler Atlantic Ocean. Figure courtesy of NASA.

Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Recently, Chelton, Schlax, and Samelson (2011), hereafter CH11, performed the most comprehensive study of autonomous global eddy identification and tracking in sea surface height (SSH) altimeter data. These results were subsequently used in a groundbreaking study published in the journal Science that empirically detailed the impact eddies have on marine biological systems (Chelton et al. 2011). However, tracking eddies with the most widely used eddy-tracking technique has two major limitations. First, it is inextricably dependent on the performance of the automatic eddy identification algorithms. Such methods are highly susceptible to noise and have been known to “miss” features for a few time steps and to merge features in close proximity (Chelton, Schlax, and Samelson 2011). Second, most eddy tracking methods employ a local nearest neighbor (LNN) tracking algorithm that provides little flexibility for correcting previous errors in detection.

At its core, eddy tracking is a data association problem and lends itself to a family of multi-target tracking applications (Cham and Rehg 1999; Han et al. 2004), most notably deferred-logic (Poore 1995) techniques such as multiple hypothesis tracking (Reid 1979) that defer making uncertain associations until more data are available. Such methods have been especially useful in cluttered or noisy environments (Bar-Shalom and Li 1995). Extensive work has been done in multi-target tracking and its applications to security (Blackman 2004; Oh, Russell, and Sastry 2009), computer vision (Cox and Hingorani 1996), and autonomous agents (Nieto et al. 2003; Montemerlo et al. 2003). However, the most general multiple hypothesis tracking (MHT) methods do not address the challenges of tracking noisy (merged eddies in high density areas) and incomplete data (eddies disappearing for a few timesteps without actually dying). As a result, we introduce the notion of multiple hypothesis assignment to extend MHT for physical processes applications. MHA is inspired by MHT in terms of (i) tracking multiple objects over several time-steps; and (ii) deferred logic by maintaining multiple plausible assignments until a relatively more certain assignment is possible. We augment the basic MHT approach to address the nature of climate data by introducing a lookahead (incompleteness of input) and unsupervised self-learning (noise within input features) capability. The unsupervised learning feature is unique in the literature as we use it to autonomously identify errors in the eddy identification phase and subsequently update the eddy-feature database with the correct features by removing the artificially large feature and adding several (corrected) smaller features.

Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence

Our method provides several improvements over the state of the art. First, it is more resilient to noisy observations, which are a major concern in satellite products. Second, our tracking method is able to maintain tracks even when eddies “disappear” for a few time-steps and then reappear due to noise and other sampling constraints. Finally, we employ heuristics to identify and correct possible errors in the eddy identification step in an unsupervised fashion. We show with a variety of empirical and qualitative experiments that MHA’s improvements have a significant impact on the resulting eddy tracks compared to the most widely used method in the literature.

2 Background and Problem Formulation

Traditionally, the automatic detection and tracking of ocean eddies were achieved using sea surface temperature or ocean color satellite data (Pegau, Boss, and Martínez 2002; Fernandes 2008; Dong et al. 2011). The advent of SSH observations from satellite radar altimeters provided researchers with an unprecedented opportunity to study eddy dynamics on a global scale. This is because eddy behavior is intimately linked to SSH. Eddies are generally classified by their rotational direction. Cyclonic eddies rotate counter-clockwise (in the Northern Hemisphere), while anti-cyclonic eddies rotate clockwise. As a result, cyclonic eddies cause a decrease in SSH, while anti-cyclonic eddies cause an increase in SSH. Such impact allows us to identify ocean eddies in SSH satellite data, where cyclonic eddies manifest as closed contoured negative SSH anomalies and anti-cyclonic eddies as positive SSH anomalies.

Eddies are identified in the SSH field by assigning binary values to the SSH data based on whether or not a varying threshold was exceeded, and subsequently saving the eddy-like connected component features that remain after thresholding. The identified features are further pruned based on physically consistent criteria that define eddies (Chelton, Schlax, and Samelson 2011; Faghmous et al. 2012). Figure 2 shows the ubiquitous cyclonic eddy features identified in a single SSH snapshot. Each snapshot contains several thousand eddy features. However, that number is often reduced by a variety of significance tests to remove spurious discoveries.

Figure 2: Global cyclonic eddy-like features identified in a single SSH time frame. We identify several thousand eddy-like features in any time frame, many of which will be discarded after significance tests. The goal is to track eddies by connecting these nearly ubiquitous features with their corresponding features in subsequent time frames.

Figure 3: A moving eddy as identified in five successive time frames of SSH satellite data. Note how the feature changes size and shape from one frame to the next due to noise in the SSH field.

Once eddy-like features have been identified in all time frames, they need to be tracked across time. Figure 3 shows an example of an eddy identified in five successive SSH frames. While tracking a single object is trivial, tracking multiple user-defined features presents unique challenges both conceptually and computationally. First, since the objects being tracked are not self-defined with clear boundaries like physical objects (e.g. car, ball, etc.), the very notion of an object is subject to interpretation and errors. This can be seen in Figure 3, where the size and shape of the feature change across time due to noisy measurements. Second, given the multitude of objects in any given frame, the computational resources required to maintain a large number of potential tracks grow exponentially. Finally, splitting and merging are regular behaviors of ocean eddies that are not common concepts in the traditional object tracking literature.

The majority of eddy tracking algorithms employ a computationally modest, yet limited approach where a feature is connected to the nearest feature in the subsequent time frame. While this approach gives reasonable results in eddy tracking applications (Chelton, Schlax, and Samelson 2011), its greedy nature means that there will often be cases where it performs sub-optimally, especially in noisy or cluttered environments (Bar-Shalom and Li 1995; Blackman 2004).

Based on the preceding overview, we propose the following problem definition and present the methods used to solve it in the next section.

Problem Definition

Given T frames containing eddy features identified in an independent detection step, with each frame i having N_i eddy features, extract globally coherent eddy tracks, where an eddy track is a group of eddy features satisfying the following constraints: (1) each track has at most one feature from consecutive time frames; and (2) each feature is associated with at most one track.

3 Methods

We implemented local nearest neighbor assignment (LNN), the most widely used method in ocean eddy tracking (Chelton, Schlax, and Samelson 2011). We also implemented multiple hypothesis assignment (MHA), adapted from Blackman (2004), where each feature is associated with multiple potential tracks.

3.1 Local Nearest Neighbor (LNN)

In the LNN scheme, for each eddy feature identified at time t, the features at time t + 1 are searched to find the closest feature within a pre-defined search space based on the theoretical distance an eddy can travel during the period between two successive time frames. In the event that a feature at time t + 1 is closest to two features from the previous step, it is assigned to the first encountered feature. This causes the resulting tracks to depend on the scanning order of the feature set.
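The greedy step described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function name `lnn_step`, the flat (x, y) centroids, and the constant `gate` search radius are all assumptions made for the example.

```python
import math

def lnn_step(prev, curr, gate):
    """Greedily link each feature at time t to its nearest unclaimed feature at t + 1.

    prev, curr: dicts mapping a feature id to an (x, y) centroid (assumed layout).
    gate: assumed constant search radius between two successive frames.
    """
    links, taken = {}, set()
    for pid, (px, py) in prev.items():      # the scanning order decides ties
        best, best_d = None, math.inf
        for cid, (cx, cy) in curr.items():
            d = math.hypot(cx - px, cy - py)
            if cid not in taken and d <= gate and d < best_d:
                best, best_d = cid, d
        if best is not None:
            links[pid] = best
            taken.add(best)                 # first come, first served
    return links

prev = {"e1": (0.0, 1.0), "e2": (0.0, -1.0)}
curr = {"e3": (1.0, 2.5), "e4": (1.0, 0.0), "e5": (1.0, -2.5)}
print(lnn_step(prev, curr, gate=2.0))
```

In this toy layout e4 is the nearest feature to both e1 and e2, so reversing the order in which `prev` is scanned changes which of them claims it. That is the order dependence noted in the text.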

3.2 Multiple Hypothesis Assignment (MHA)

The main difference between MHA and LNN is that, unlike LNN, tracks are not formed at every time step. Instead, features are assigned to multiple plausible tracks and uncertain decisions are deferred until more data are available to make a relatively unambiguous track assignment. We use a track tree to maintain all possible tracks that can be constructed starting from any eddy (shown in Figure 4). Each path along the track tree is a potential track starting from its root.

In an MHA setting, every track has three outcomes: initiation, expansion, or termination. At any time t, all potential expansions and terminations for the active tracks at t − 1 are entered in the track tree. Additionally, all features identified at time t initiate potential new tracks by creating new track trees with the new features as the root.

Figure 4 demonstrates how MHA maintains all plausible hypotheses across 3 time-steps and 7 eddy features. In this scenario, there are two eddies moving in space and time, (e1,e3,e6) and (e2,e4,e7), and a spurious eddy feature e5. At t1 the only possibility is to initiate two track trees with e1 and e2 as roots. At t2, three eddy features emerge and they extend the trees created by e1 and e2 as well as initiate new trees. Moreover, the tracks created by e1 and e2 may end at t2, as denoted by the “X” nodes in Figure 4. Finally, two more eddies appear at t3 and they too are associated with existing trees as well as initiate new ones (denoted by the rightmost single-node trees e6 and e7).

Figure 4: An illustrative example of how MHA maintains all plausible tracks in memory. Top panel: A two-dimensional spatio-temporal view. Each panel represents one time step. The real tracks are (e1, e3, e6), (e2, e4, e7), and e5 and are emphasized by green edges. The light red edges are plausible tracks that are considered by MHA but ultimately discarded. Bottom panel: the corresponding track trees for all possible tracks based on the data in the top panel. The tracks selected are highlighted in green. Abandoned tracks are in red. “End” nodes are denoted by “X”. Note: for simplicity the scoring function, lookahead, and self-learning features are not demonstrated in this example. Figure best seen in color.

While this approach may be more accurate than LNN, the number of possible tracks to enumerate and maintain grows exponentially. To reduce the problem’s complexity, we use gating and pruning functions (Reid 1979; Kurien 1990). Gating ensures that only eddy features that are within a maximum search distance from existing tracks are considered as potential extensions. The gating distance, g, varies based on the location of the track, since the size and distance eddies travel vary uniformly as a function of latitude (Fu et al. 2010). An example of gating can be seen in Figure 4, where e5 is not considered as e1’s extension because e5 is outside e1’s gate.
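To make the bookkeeping concrete, the sketch below enumerates the initiation, expansion, and termination hypotheses for one time step with a distance gate. It is an illustration under simplifying assumptions: hypotheses are stored as flat tuples rather than as explicit trees, `expand` is a hypothetical name, and the gate is a single constant instead of the latitude-dependent g used in the paper.

```python
import math

def expand(paths, features, positions, gate):
    """One expansion step over an explicit list of hypothesised paths.

    paths: list of tuples of feature ids; a trailing None marks a terminated path.
    features: ids observed at the new time step.
    positions: dict id -> (x, y) centroid (assumed layout).
    gate: assumed constant gating distance.
    """
    out = []
    for path in paths:
        if path[-1] is None:               # already terminated, carry over unchanged
            out.append(path)
            continue
        x, y = positions[path[-1]]
        for f in features:
            fx, fy = positions[f]
            if math.hypot(fx - x, fy - y) <= gate:
                out.append(path + (f,))    # expansion hypothesis
        out.append(path + (None,))         # termination hypothesis (an "X" node)
    out.extend((f,) for f in features)     # initiation: each feature roots a new tree
    return out

positions = {"e1": (0.0, 1.0), "e2": (0.0, -1.0),
             "e3": (1.0, 2.5), "e4": (1.0, 0.0), "e5": (1.0, -2.5)}
h1 = expand([], ["e1", "e2"], positions, gate=2.0)
h2 = expand(h1, ["e3", "e4", "e5"], positions, gate=2.0)
print(len(h2), ("e1", "e5") in h2)
```

With these made-up coordinates, e5 falls outside e1's gate, so the hypothesis (e1, e5) is never created, while e4 is kept as a candidate extension of both e1 and e2.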

Furthermore, we employ N-scan pruning, where we resolve any ambiguities across all track trees by finalizing the assignment of eddies that appeared during the past N steps. Hence, every N steps, we reduce the number of paths along each track tree to at most one. To do so, we assign every edge in each track tree a reward $r = g - d(e_i^t, e_j^{t+1})$ if we extend a track at time t such that the distance between the connected features at time t and t + 1 is $d(e_i^t, e_j^{t+1})$. That is, we give higher scores to tracks that extend through the nearest feature in the search space.

To make final track assignments, we sum the scores of each edge along a path from time t − N to t and select the highest scoring path from each track tree. If two highest scoring paths from different trees share a node (i.e. they are ambiguous), then we select the track with the highest total score.
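The scoring and conflict rule can be sketched as below. The names `path_score` and `prune`, the constant gate, and the coordinates are assumptions for illustration. The sketch also assumes that a tree whose best path loses a conflict contributes no track, which the text does not specify.

```python
import math

def path_score(path, positions, gate):
    """Sum of per-edge rewards r = g - d over consecutive features in a path."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        (ax, ay), (bx, by) = positions[a], positions[b]
        total += gate - math.hypot(bx - ax, by - ay)
    return total

def prune(trees, positions, gate):
    """Keep the best path of each tree, then resolve shared nodes by total score.

    trees: dict root id -> list of candidate paths (tuples of feature ids).
    """
    best = [max(paths, key=lambda p: path_score(p, positions, gate))
            for paths in trees.values()]
    best.sort(key=lambda p: path_score(p, positions, gate), reverse=True)
    kept, used = [], set()
    for path in best:
        if used.isdisjoint(path):          # a feature may belong to one track only
            kept.append(path)
            used.update(path)
    return kept

positions = {"e1": (0.0, 0.5), "e3": (1.0, 0.5), "e6": (2.0, 0.5),
             "e2": (0.0, -0.5), "e4": (1.0, -0.5), "e7": (2.0, -0.5)}
trees = {"e1": [("e1", "e3", "e6"), ("e1", "e4", "e7"), ("e1",)],
         "e2": [("e2", "e4", "e7"), ("e2",)],
         "e3": [("e3", "e6"), ("e3",)]}
print(prune(trees, positions, gate=2.0))
```

Here the best path of the e3 tree, (e3, e6), shares nodes with the higher-scoring (e1, e3, e6) and is therefore dropped.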

Referring back to Figure 4, if we adopt a 3-scan pruning procedure (N = 3), all track trees must be pruned every 4 steps. In this case, each path along every tree will be scored by summing the rewards r of all edges along the paths between times t = 1 and t = 4. The only tracks remaining from the pruning phase are the highest scoring non-conflicting tracks across all trees. One example of conflicting tracks is (e1,e3,e6), which is the highest scoring track in the leftmost tree, and (e3,e6), the highest scoring track in the third tree from the left. They are conflicting because they share e3 and every feature can only be part of at most one track. In case of conflict, MHA will select the track that has the highest total reward, which in this case is the longest track (e1,e3,e6). Therefore, the tracks (e1,e3,e6) and (e2,e4,e7) are non-conflicting tracks and together have the globally optimal reward. If new features are identified at t = 4 (not shown), then the only possible extensions are for the first and second trees along the highest scored path (green path) or to extend e6 and e7. All remaining red paths and trees are pruned and are no longer eligible for extension.

Using the same example, LNN is unable to recover the proper tracks in Figure 4. At time t2, LNN will assign e1 and e2 to their closest feature, which is e4 in both cases; since e4 is already paired with e1, e2 is associated with its second closest feature, e5. At t3, LNN will continue to diverge from the proper path by assigning e4 to e7 and terminating e5. This will result in an incorrect final output of (e1,e4,e7), (e2,e5) and (e3,e6).

To address the noise in SSH data, MHA extends the traditional multiple hypothesis tracking algorithm to account for noisy observations, where features contain varying degrees of uncertainty. Two main extensions allow MHA to handle noisy data. The first is lookahead, where we do not terminate tracks that were not extended for a single time step, in case an eddy temporarily “disappeared”. The second is unsupervised self-learning, where MHA autonomously detects an error in the feature identification phase and takes a corrective measure a posteriori.

3.3 MHA Lookahead

A common challenge in tracking eddies is that some eddies might be left unassociated from one time-step to the next due to eddies temporarily “disappearing” as a result of noise and sampling errors. CH11 attempted to address this problem by allowing features to not be paired with a subsequent feature for 2-3 time steps in the hope that they would be associated with a track at a later time. However, due to LNN’s localized greedy nature, the “procedure for tracking temporarily lost eddies were disappointing. The resulting eddy trajectories often jumped from one eddy to another” (Chelton, Schlax, and Samelson 2011). Thus, while this lookahead feature is highly desirable, it was abandoned in the most comprehensive study to date, as it was preferable to shorten eddy tracks than to associate eddies with incorrect tracks. MHA implements the lookahead feature by assigning a “missing” node to tracks that were not associated with an eddy instead of terminating them. If two consecutive “missing” nodes are within the same path, then it is terminated. MHA avoids jumping tracks because it maintains all plausible future tracks and will likely choose the more consistent track since it has more data available than LNN does.
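The “missing” node rule can be illustrated with a small helper. The sentinel `MISSING` and the function `extend_with_lookahead` are hypothetical names introduced for this sketch, which handles a single path rather than a full track tree.

```python
MISSING = "missing"   # hypothetical sentinel for a time step with no associated eddy

def extend_with_lookahead(path, candidate):
    """Extend one path, tolerating a single unobserved time step.

    path: tuple of feature ids and MISSING placeholders.
    candidate: a feature id found within the gate, or None if nothing was found.
    Returns the extended path, or None when the track must terminate.
    """
    if candidate is not None:
        return path + (candidate,)     # normal expansion
    if path and path[-1] == MISSING:
        return None                    # two consecutive gaps: terminate the path
    return path + (MISSING,)           # first gap: keep the track alive

track = ("a",)
track = extend_with_lookahead(track, None)   # the eddy "disappears" for one step
track = extend_with_lookahead(track, "b")    # and is picked up again afterwards
print(track)
```

A second consecutive call with `candidate=None` would return `None`, which corresponds to terminating the path.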

3.4 MHA Unsupervised Self-Learning

One fundamental difference between our method and existing ones is its ability to identify errors from the eddy detection phase and correct them autonomously. The most common type of misidentification is that of artificially merged eddies. Due to the noisy SSH data, features that are in close proximity are susceptible to being labeled as one large eddy (Chelton, Schlax, and Samelson 2011). Faghmous et al. (2012) addressed such artificial merges by introducing a convexity measure on the features detected to ensure that only the most compact features are selected, as merged features tend to have abnormal shapes. Nonetheless, due to the global nature of the autonomous identification scheme, some clusters of eddies may still be merged as a single large feature. Assume that at time t there are three eddies e1, e2, and e3 with areas a1, a2, and a3 respectively. When a new eddy e_new appears at time t + 1, it is automatically associated with e1, e2, and e3. If e_new is related to any one of the existing eddies, it should be roughly that eddy’s size. If e_new is a merged eddy, then its area would be significantly larger than all existing eddies. Therefore, if MHA detects a new eddy where all of its associated predecessors in the track tree are at most 1/2 of its size, we flag the new eddy as a potential merge and apply the eddy identification algorithm described in (Chelton, Schlax, and Samelson 2011; Faghmous et al. 2012) to break up the large eddy into smaller ones. If the self-learning procedure returns new eddy features, we then remove all nodes that contained the artificially large eddy from all track trees and extend all active tracks with the new features. This will cause the trees’ breadth to increase by K − 1 branches, where K is the number of newly identified features. The new features will also initiate K new track trees.
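A rough sketch of the merge check and the subsequent tree update follows. It reads the 1/2 threshold as "every predecessor is no larger than half of the new feature", which is the reading consistent with a merged feature being significantly larger than its predecessors. The names `is_potential_merge` and `split_in_paths` are hypothetical, the re-detection step itself is not shown, and gating of the replacement features is ignored for brevity.

```python
def is_potential_merge(new_area, predecessor_areas, ratio=0.5):
    """Flag a new feature when every candidate predecessor is at most `ratio` of its area."""
    if not predecessor_areas:
        return False                   # no predecessors: simply a newly initiated track
    return all(a <= ratio * new_area for a in predecessor_areas)

def split_in_paths(paths, merged_id, parts):
    """Replace a flagged feature at the end of each path by the K re-detected parts."""
    out = []
    for path in paths:
        if path[-1] == merged_id:
            # one branch becomes K branches (K - 1 extra); a root becomes K new trees
            out.extend(path[:-1] + (p,) for p in parts)
        else:
            out.append(path)
    return out

paths = [("e1", "big"), ("e2", "big"), ("big",), ("e3", "e9")]
if is_potential_merge(new_area=10.0, predecessor_areas=[3.0, 4.0, 2.5]):
    paths = split_in_paths(paths, "big", ["s1", "s2"])
print(paths)
```

In this toy case the feature labelled `big` is more than twice the area of each predecessor, so it is replaced by the two assumed re-detected features `s1` and `s2` in every path that ended with it.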

The unsupervised self-learning feature is only possible thanks to the multiple hypothesis nature of MHA. There are significant conceptual challenges in extending unsupervised self-learning to an LNN framework. Assume that an eddy feature e1 is associated to a significantly larger feature e2 in the following time frame. In order for LNN to label e2 as an artificial merge with high probability, it must consider all possible feature pairings in the search space to ensure that there is no other feature, ei, that is of the same size as e2 and could be attached to it. Even then, LNN could not account for the scenario where e2 is simply a new large eddy that just initiated and will continue during the subsequent time frames.

3.5 Computational complexity

Given M, the maximum number of objects in any time frame, and T time-steps, LNN compares the distance between every pair of objects in consecutive frames, leading to $M^2$ comparisons per time-step. Therefore, the worst-case runtime complexity of LNN is $O(M^2 T)$.

For MHA, let K be the number of hypotheses maintained for consideration at any given time-step. In the worst case, $K = M^T$, where MHA would store all possible tracks. With N-scan pruning, however, MHA only stores hypotheses spanning the last N time-steps, resulting in $K = M^N$. The lookahead and self-learning features add additional costs to the algorithm. Their running time cannot be explicitly bounded, but in most reasonable cases it should be no more than $O(M)$. As a result, the worst-case runtime complexity of MHA is $O(M^{N+1} T)$.

4 Results

We identified eddies globally in weekly unfiltered SSH data from October 1992 to January 2011 using the procedure described in (Chelton, Schlax, and Samelson 2011; Faghmous et al. 2012). We used the Version 3 dataset of the Archiving, Validation, and Interpretation of Satellite Oceanographic data (AVISO), which contains 7-day averages of SSH on a 0.25° grid. Given the absence of eddy baseline data or “ground truth”, evaluating and comparing methods is a notable challenge. The majority of eddy tracking studies focus on aggregate results such as track lifetimes and distance propagation. An evaluation that is more relevant to our audience would compare the performance of each algorithm at detecting tracks. Ideally, one would use “ground truth” data where the eddy tracks are known in advance and test how well each method recovers such tracks under varying conditions. One way to generate such data would be through a numerical simulation (i.e. ocean model) with idealized eddies as ground truth and then gradually add noise. However, such simulations are computationally expensive and require sophisticated physics-based models to simulate eddies and their trajectories. An alternative would be to use field study data, where floats are dropped in the ocean and subsequently tracked, and eddies are identified when a float rotates while translating (i.e. the float is moving along with the translating eddy). However, such data make up a small sample size and are not sufficient to significantly differentiate between the two methods. To address these limitations, we designed an experiment to test the relative robustness of each method to noisy measurements, a significant concern in SSH data.

[Figure 5 plot. x-axis: probability of deleting a feature (0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.2); y-axis: mean % of baseline tracks recovered (0% to 100%); series: MHA and LNN.]

Figure 5: The mean percentage of baseline tracks recovered by LNN and MHA as a function of the deletion probability (p). At every timestep, each feature may be removed from the data with a probability p. Each probability level was simulated 50 times and the results show the mean of the 50 simulations. The error bars denote standard deviations.

4.1 Sensitivity To Noise

There are three ways that noise can affect automatic eddy monitoring procedures. First, a spurious connected component feature may appear for a single time-step before disappearing. Second, noise may cause uncertain feature boundaries and, as a result, shifted feature centroids. Finally, features might disappear temporarily, either due to limitations in the identification scheme or because perturbations in the SSH field make the feature unidentifiable. The first type of noise is handled by discarding features that do not persist over four weeks. The second can be handled by smoothing or taking a weighted centroid. The final type of noise is the most vexing. As highlighted earlier, disappearing features are a considerable challenge that CH11 could not address. As a result, we focus our evaluation on the resilience of LNN and MHA to disappearing features.

To test each algorithm’s ability to handle missing features, we built a baseline dataset and randomly removed features from that data. We then computed how many of the original baseline tracks were recovered despite the noise. We constructed the baseline dataset by running both LNN and MHA on the features identified in the South Pacific ocean for the years 2005 and 2006. Only tracks that lasted at least 4 weeks and perfectly matched between both methods were selected for inclusion in the baseline dataset. We simulated the event of a feature temporarily disappearing by randomly removing features in the baseline dataset based on a variable “deletion” probability p. To account for a variety of conditions that lead to disappearing eddies, we vary p from 0.01 to 0.2 and run both the LNN and MHA algorithms on the noisy dataset. Figure 5 shows the results of the experiment. As can be seen, LNN’s recovery rate degrades much faster than that of MHA. It should be noted that a track may have multiple features deleted and that longer-lived tracks are more likely to be missed, since there is an increased likelihood of multiple features being removed from longer tracks. However, these issues are the same for either procedure and therefore do not significantly affect the results. This experiment empirically shows that MHA is significantly more robust to noise relative to LNN, even in extreme conditions (e.g. 10-20% probability of missing a feature).
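The deletion experiment can be sketched along these lines. The text does not state the exact criterion for a track being "recovered", so the sketch assumes a baseline track counts as recovered when all of its surviving features end up in a single output track. The names `delete_features` and `recovery_rate` are hypothetical, and the tracker that would produce `output_tracks` is left out.

```python
import random

def delete_features(tracks, p, rng):
    """Drop each feature independently with probability p."""
    return [[f for f in track if rng.random() >= p] for track in tracks]

def recovery_rate(baseline, survivors, output_tracks):
    """Fraction of baseline tracks whose surviving features share one output track.

    baseline: original tracks; survivors: the same tracks after deletion;
    output_tracks: tracks produced by a tracker run on the surviving features.
    """
    owner = {f: i for i, track in enumerate(output_tracks) for f in track}
    recovered = 0
    for kept in survivors:
        ids = {owner.get(f) for f in kept}
        if kept and len(ids) == 1 and None not in ids:
            recovered += 1
    return recovered / len(baseline)

rng = random.Random(0)                      # fixed seed for a repeatable simulation
baseline = [["a1", "a2", "a3"], ["b1", "b2", "b3"]]
noisy = delete_features(baseline, p=0.04, rng=rng)
print(noisy)
```

Repeating the deletion with different seeds and averaging `recovery_rate` over the runs would give one point per deletion probability, in the spirit of the 50 simulations per level reported in Figure 5.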

4.2 Impact of Missing Eddies on Track Statistics

LNN’s sensitivity to missing eddies can have a significant impact on reported track durations, because an eddy disappearing for a single time step causes its track to terminate prematurely while the eddy is still moving. Figure 6 shows an example where MHA performs better than LNN in the case of an eddy “disappearing” for one time-step. Initially, both methods track the eddy identically. At time t3, however, the eddy identification method is unable to find an eddy, and since LNN does not implement lookahead, the track is terminated. MHA, by contrast, recovers the full track by allowing the track to remain unassociated for a single time-step and then resuming tracking once the eddy reappears.
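The effect of tolerating a one-step gap can be sketched with a simplified greedy nearest-neighbour linker. This is a rough illustration only: it keeps a single hypothesis per track, so it is not MHA, and the function `track`, its parameters, and the rule that the search radius grows with the number of missed steps are assumptions made for this example.

```python
import math


def track(frames, max_dist, max_missed):
    """Greedy nearest-neighbour linking across time steps.

    frames     -- list of per-time-step lists of (x, y) centroids
    max_dist   -- largest allowed displacement per time step
    max_missed -- consecutive steps a track may stay unassociated
    Returns a list of tracks, each a list of (time_step, (x, y)).
    """
    active, finished = [], []  # active entries: [points, missed_count]
    for t, centroids in enumerate(frames):
        unused = list(centroids)
        still_active = []
        for points, missed in active:
            last = points[-1][1]
            best = min(unused, key=lambda c: math.dist(last, c), default=None)
            # The allowed distance scales with the steps since the last match.
            if best is not None and math.dist(last, best) <= max_dist * (missed + 1):
                unused.remove(best)
                still_active.append([points + [(t, best)], 0])
            elif missed < max_missed:
                still_active.append([points, missed + 1])  # keep the track alive
            else:
                finished.append(points)  # terminate the track
        # Every unmatched centroid starts a new track.
        still_active.extend([[(t, c)], 0] for c in unused)
        active = still_active
    finished.extend(points for points, _ in active)
    return finished


# An eddy moving westward that is not detected at the third time step.
frames = [[(10.0, 0.0)], [(9.0, 0.0)], [], [(7.0, 0.0)], [(6.0, 0.0)]]
without_lookahead = track(frames, max_dist=1.5, max_missed=0)
with_one_step_gap = track(frames, max_dist=1.5, max_missed=1)
print(len(without_lookahead), len(with_one_step_gap))
```

With `max_missed=0` the missing detection splits the eddy into two short tracks; with `max_missed=1` the same detections form one track, mirroring the behaviour described for Figure 6.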

[Figure 6: image panels for time steps t1, t2, t3, and t4.]

Figure 6: An example of how MHA is able to recover tracks even when eddies temporarily “disappear.” The eddy is moving westward (from right to left). Top (from left): An eddy centroid (green square) and its path as detected by LNN with no lookahead ability. At t3 the eddy detection algorithm loses the eddy for one time step, and it reappears at t4. However, without the lookahead feature, LNN breaks up the long track into two small tracks, as indicated by the different colored tracks. Bottom (from left): the same eddy tracked with MHA. Although the eddy is lost at t3, MHA successfully recovers the track once the eddy reappears at t4.

To further quantify the effect of such premature terminations on track statistics, we repeated the noise experiment reported earlier on all features in the South Pacific between 2005 and 2006 (not just the baseline data). Due to space limitations, we only analyze one deletion probability, p = 0.04 (the median of the probabilities from the previous experiment). Figure 7 shows the mean cumulative track lifetimes for 50 simulations where features have a 4% probability of temporarily disappearing. Although both methods identify nearly the same number of total tracks, LNN identifies 20% fewer tracks that live 11 weeks or longer. That is a significant difference given that long-lived eddies play a fundamental role in transporting water across ocean basins.

[Figure 7: bar chart. Y-axis: mean number of tracks (ticks from 0 to 320). X-axis: track lifetime, binned as 5-10 weeks and 11+ weeks. Series: LNN and MHA. Bar labels as extracted: 268, 272, 224, 316.]

Figure 7: Mean cumulative track lifetimes for LNN and MHA tracks from the noise experiment with a probability p = 0.04 of a feature temporarily disappearing. MHA identified 20% more long-lived tracks (11+ weeks) than LNN. The simulation was run 50 times and the error bars denote standard deviations.

4.3 Impact of Self-Learning on Artificial Merges

Figure 8 shows two examples of unsupervised self-learning. In both cases, artificially large eddies were saved in the identification step, but MHA was able to flag such errors autonomously and correct them by splitting the large features into more accurate smaller eddies. The large panels (A) in Figure 8 highlight the artificially merged eddies over a background of greyscale SSH data. MHA identified these merges in an unsupervised manner by flagging any feature for which all potential tracks associated with it contained significantly smaller features in previous time steps.

[Figure 8: image panels labeled A to F.]

Figure 8: Two examples of artificially large eddies being merged as a single eddy and MHA’s ability to identify such errors. Panel A denotes the actual SSH data with the large eddy highlighted in red. The smaller vertical panels on the right (B-F) show the binary view of the merged eddy and the corrected data. Panel B shows the actual greyscale SSH data inside the eddy, with the multiple minima marked by red crosses, an indication of an artificial merge. Panel C shows the pixels associated with the eddy as saved by the identification scheme. The bottom panels (D-F) show the new smaller eddies after data correction.

The blue ellipses in panel A of Figure 8 enclose the merged features, which are shown in red. Panels B and C show the features as saved in the eddy identification phase in greyscale and binary form, respectively. The crosses in panel B indicate the local minima, which are the centroids of the actual eddies that were erroneously merged. Panels D-E show the resulting eddies once the data correction procedure was performed. In both cases, the separation occurred along the proper direction, as denoted by the major axis of the blue ellipses in panel A. These examples demonstrate how the unsupervised self-learning feature effectively leverages the spatio-temporal context of the data to make accurate corrections to the unsupervised eddy detection phase. Such an extension can be used in other multiple hypothesis tracking settings where the objects have varying degrees of uncertainty.
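The flagging rule described in this section can be sketched as a simple size comparison. The threshold is an assumption: the text says only that earlier features must be "significantly smaller", so the `ratio` parameter, the use of the mean area, and the function name `is_suspected_merge` are hypothetical choices for illustration.

```python
def is_suspected_merge(feature_area, candidate_track_areas, ratio=2.0):
    """Flag a feature as a possible artificial merge.

    feature_area          -- pixel count of the current feature
    candidate_track_areas -- one list per potential track, holding the
                             areas of that track's features at earlier steps
    ratio                 -- hypothetical threshold for "significantly
                             smaller" (not given in the text)
    """
    if not candidate_track_areas:
        return False  # no history to compare against
    return all(
        len(areas) > 0 and feature_area > ratio * (sum(areas) / len(areas))
        for areas in candidate_track_areas
    )


# Every candidate track previously held much smaller features: flag it.
print(is_suspected_merge(100, [[30, 35], [40, 42]]))
# One candidate track already held features of similar size: do not flag.
print(is_suspected_merge(100, [[30, 35], [90, 95]]))
```

Requiring the condition to hold for every potential track keeps the rule conservative: a single track with a comparable size history is enough to leave the feature untouched.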

One common sign of an artificial merge is the presence of multiple extrema within a single feature, because, physically, an eddy should have only a single extremum. Using MHA’s self-learning feature, we were able to autonomously identify 2185 artificial merges within the South Pacific ocean for the years 2005 and 2006. Of those potential merges, 1682 features had multiple extrema. An extremum was defined as a pixel whose value was greater than (or less than) all of the other pixels in its 5 × 5 neighborhood. We were able to successfully break 1058 of the reported merges, resulting in a significant increase in feature counts.
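The extremum definition above can be sketched directly. This is a minimal illustration on a synthetic grid, not the authors' implementation; clipping the 5 × 5 window at the grid border and the function name `local_extrema` are assumptions made for this example.

```python
def local_extrema(grid, kind="min", half_width=2):
    """Return (row, col) of pixels that are strict extrema of their window.

    grid       -- 2-D list of SSH values
    kind       -- "min" or "max"
    half_width -- 2 gives the 5 x 5 neighbourhood described in the text
    The window is clipped at the grid border (an assumption).
    """
    rows, cols = len(grid), len(grid[0])
    found = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                grid[i][j]
                for i in range(max(0, r - half_width), min(rows, r + half_width + 1))
                for j in range(max(0, c - half_width), min(cols, c + half_width + 1))
                if (i, j) != (r, c)
            ]
            value = grid[r][c]
            if kind == "min" and all(value < n for n in neighbours):
                found.append((r, c))
            elif kind == "max" and all(value > n for n in neighbours):
                found.append((r, c))
    return found


# Synthetic feature made of two adjacent depressions, centred on
# (2, 2) and (2, 6), mimicking two eddies saved as a single feature.
grid = [
    [min((r - 2) ** 2 + (c - 2) ** 2, (r - 2) ** 2 + (c - 6) ** 2) for c in range(9)]
    for r in range(5)
]
minima = local_extrema(grid, kind="min")
print(minima)
print("possible artificial merge:", len(minima) > 1)
```

Finding more than one minimum inside a single feature is the signal that the feature may combine several eddies.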

5 Summary & Future Work

We presented MHA: a multiple hypothesis tracking-inspired eddy monitoring application. MHA extends the traditional multiple hypothesis tracking approach to handle noisy observations and introduces an unsupervised self-learning feature that refines objects a posteriori. Despite the lack of “ground truth” data to evaluate our algorithm, we presented several experiments and case studies demonstrating that our method addresses two major shortcomings of the most widely used method in the eddy tracking literature. Furthermore, MHA is an example of how incorporating the data’s spatio-temporal context can improve the accuracy of feature identification and tracking algorithms.

Our MHA algorithm is able to use its multiple hypothesis nature, along with the data’s spatio-temporal context, to take corrective measures on a previously independent eddy identification phase. As far as we are aware, this is the first approach in the eddy tracking literature to do so. Furthermore, our method is more robust to noise because it implements a lookahead feature that makes it less likely to break up tracks if observations temporarily disappear.

Additional analysis of MHA’s performance based on various factors (e.g., eddy density and propagation speed), as well as of self-learning’s impact on track statistics, will be the subject of future work. MHA could also be extended to handle the evolutionary nature of eddies forming and dissipating by adopting MCMC techniques from the biological sciences (Smal et al. 2008). While MHA’s pruning heuristic is based on theoretical eddy dynamics, less rigid heuristics such as the joint probabilistic data association (JPDA) method (Bar-Shalom and Tse 1975; Bar-Shalom and Li 1995) may be more accurate. Finally, now that MHA has been shown to be a more robust alternative to LNN, concepts such as track splitting and merging can be incorporated into the MHA scheme by relaxing the constraint that an eddy be part of at most a single track. Such an addition may provide more accurate information about ocean dynamics.

6 Acknowledgements

The authors would like to thank three anonymous reviewers whose suggestions significantly improved the manuscript. This research was funded by a University of Minnesota Doctoral Dissertation Fellowship, the Planetary Skin Institute, NSF Grant IIS-0905581, and an NSF Expedition in Computing Grant (IIS-1029711). Access to computing resources was provided by the University of Minnesota Supercomputing Institute.

References

Bar-Shalom, Y., and Li, X. 1995. Multitarget-multisensor tracking: principles and techniques. YBS Publishing, Storrs, CT: University of Connecticut.
Bar-Shalom, Y., and Tse, E. 1975. Tracking in a cluttered environment with probabilistic data association. Automatica 11(5):451–460.
Blackman, S. 2004. Multiple hypothesis tracking for multiple target tracking. Aerospace and Electronic Systems Magazine, IEEE 19(1):5–18.
Caldeira, K.; Wickett, M.; et al. 2003. Anthropogenic carbon and ocean pH. Nature 425(6956):365–365.
Cham, T., and Rehg, J. 1999. A multiple hypothesis approach to figure tracking. In Computer Vision and Pattern Recognition, 1999. IEEE Computer Society Conference on., volume 2. IEEE.
Chelton, D.; Gaube, P.; Schlax, M.; Early, J.; and Samelson, R. 2011. The influence of nonlinear mesoscale eddies on near-surface oceanic chlorophyll. Science 334(6054):328–332.
Chelton, D.; Schlax, M.; and Samelson, R. 2011. Global observations of nonlinear mesoscale eddies. Progress in Oceanography.
Cox, I., and Hingorani, S. 1996. An efficient implementation of Reid's multiple hypothesis tracking algorithm and its evaluation for the purpose of visual tracking. Pattern Analysis and Machine Intelligence, IEEE Transactions on 18(2):138–150.
Dong, C.; Nencioli, F.; Liu, Y.; and McWilliams, J. 2011. An automated approach to detect oceanic eddies from satellite remotely sensed sea surface temperature data. Geoscience and Remote Sensing Letters, IEEE (99):1–5.
Faghmous, J. H.; Styles, L.; Mithal, V.; Boriah, S.; Liess, S.; Vikebo, F.; Mesquita, M. d. S.; and Kumar, V. 2012. EddyScan: A physically consistent ocean eddy monitoring application. In Intelligent Data Understanding (CIDU), 2012 Conference on, 96–103.
Fernandes, A. 2008. Identification of oceanic eddies in satellite images. Advances in Visual Computing 65–74.
Fu, L.; Chelton, D.; Le Traon, P.; and Morrow, R. 2010. Eddy dynamics from satellite altimetry. Oceanography 23(4):14–25.
Han, M.; Xu, W.; Tao, H.; and Gong, Y. 2004. An algorithm for multiple object trajectory tracking. In Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, volume 1, I–864. IEEE.
Hoegh-Guldberg, O., and Bruno, J. 2010. The impact of climate change on the world's marine ecosystems. Science 328(5985):1523–1528.
Kurien, T. 1990. Issues in the design of practical multitarget tracking algorithms. Multitarget-Multisensor Tracking: Advanced Applications 1:43–83.
Montemerlo, M.; Thrun, S.; Koller, D.; Wegbreit, B.; et al. 2003. FastSLAM 2.0: An improved particle filtering algorithm for simultaneous localization and mapping that provably converges. In International Joint Conference on Artificial Intelligence, volume 18, 1151–1156. Lawrence Erlbaum Associates Ltd.
Nieto, J.; Guivant, J.; Nebot, E.; and Thrun, S. 2003. Real time data association for FastSLAM. In Robotics and Automation, 2003. Proceedings. ICRA'03. IEEE International Conference on, volume 1, 412–418. IEEE.
Oh, S.; Russell, S.; and Sastry, S. 2009. Markov chain Monte Carlo data association for multi-target tracking. Automatic Control, IEEE Transactions on 54(3):481–497.
Pegau, W.; Boss, E.; and Martínez, A. 2002. Ocean color observations of eddies during the summer in the Gulf of California. Geophysical Research Letters 29(9):1295.
Poore, A. 1995. Multidimensional assignment and multitarget tracking. Partitioning Data Sets. DIMACS Series in Discrete Mathematics and Theoretical Computer Science 19:169–196.
Reid, D. 1979. An algorithm for tracking multiple targets. Automatic Control, IEEE Transactions on 24(6):843–854.
Smal, I.; Draegestein, K.; Galjart, N.; Niessen, W.; and Meijering, E. 2008. Particle filtering for multiple object tracking in dynamic fluorescence microscopy images: Application to microtubule growth analysis. Medical Imaging, IEEE Transactions on 27(6):789–804.
