

Adaptive Learning for Controlled Lagrangian Particle Tracking

Sungjin Cho∗, Fumin Zhang∗, Catherine R. Edwards†

∗School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia 30308

Email: scho88, [email protected]

†Skidaway Institute of Oceanography, University of Georgia, Savannah, Georgia, 31411

Email: [email protected]

Abstract—Controlled Lagrangian particle tracking (CLPT) is a feedback control methodology for autonomous underwater vehicles (AUVs) in dynamic ocean environments that can reveal information about ocean flow and model error through the operation of AUVs. From CLPT, we evaluate the navigational performance of AUVs with controlled Lagrangian prediction error (CLPE), which represents the difference between the predicted and true trajectories of AUVs. In this paper, using CLPE, we develop an adaptive learning algorithm that allows AUVs to learn their true motion in ocean flow fields in real time. While previous work shows that CLPE increases over time in simulated and field experiments, the proposed algorithm forces CLPE to converge to zero so that the vehicle motion model is unique and flow fields can be identified. For the robustness of the proposed algorithm, we prove that CLPE is ultimately bounded under disturbances. The proposed algorithm is verified by simulation results.

I. INTRODUCTION

Over the past decades, autonomous underwater vehicles (AUVs) have proven to be valuable sensing platforms in a variety of scientific and practical missions [1]: oil spill survey [2], acoustic mapping [3], and ocean sampling [4], among many others. High navigational performance of AUVs is essential for persistent and efficient collection of information-rich data [5], and serious performance degradation can result when flow speed is comparable to or exceeds AUV forward speed, as is the case for underwater gliders [6], [7].

Controlled Lagrangian particle tracking (CLPT) is a feedback control methodology for AUVs moving in dynamic ocean environments [8], [9] that has been demonstrated to reveal information about the flow field through which the vehicles move. In contrast to passive Lagrangian methods, an AUV is viewed as a controlled Lagrangian particle in the sense that AUVs are not freely advected by ocean flow. To track the controlled Lagrangian particle, we generate the predicted trajectory of the AUV by simulating vehicle motion models composed of incorporated flow models and controllers. Then, we compare the predicted trajectory with the measured true trajectory of the AUV. The discrepancy between the two trajectories, called controlled Lagrangian prediction error (CLPE), shows the tracking performance of the controlled Lagrangian particle. CLPE is a crucial measure that can be interpreted as the navigational performance of AUVs.

In this paper, we present an adaptive learning algorithm for controlled Lagrangian particle tracking. Previous work [8], [9] shows that CLPE can increase over time in simulated and field experiments because of inaccurate flow models included in the simulated motion models. Since CLPE represents the accuracy of the simulated motion models, the proposed algorithm employs learning models instead of the simulated models, and updates the learning models according to CLPE so that the learning models become identical to the vehicle motion model.

Many researchers have developed path-planning algorithms to improve navigational performance of AUVs against complex ocean flows [10]–[12]. By assuming that the true flow field is well-represented by ocean models [13], [14], the path-planning algorithms generate reference trajectories through off-line computation on shore. Waypoints corresponding to the reference trajectories are commanded to the AUVs in the field.

However, “true” ocean flow fields are unknown, and ocean models can have significant error due to unknown or under-constrained boundary conditions, initial conditions, or missing physics [15], [16]. Thus, we cannot guarantee that true trajectories of AUVs are the same as the reference trajectories generated by path-planning algorithms. Alternatively, Lagrangian data assimilation schemes [4], [7], [17] and glider computerized tomography algorithms [18], [19] can be used to better identify ocean flow using AUV-derived observations of the flow field. Such algorithms focus on updating flow models and mapping flows, and they can be complementary to our results.

For on-line learning models, we design two controllers, a motion controller and a learning controller. The first is designed for the motion of AUVs, and is composed of constant feedforward and feedback gains, plus a flow canceling term that uses a parametric flow model with known parameters. The second controller is designed to learn the motion of AUVs. In this learning controller, feedforward and feedback gains, learning parameters, and parameters of the flow canceling term are adapted by the proposed learning law based on CLPE. These controllers allow AUVs to learn their actual motion while identifying the ocean flow field.

We organize this paper into the following sections: Section II describes the problem setup, introducing the vehicle motion model, the learning model, flow models, and the state feedback controllers with flow canceling. Section III presents the adaptive learning algorithm and analyzes the robustness of the proposed algorithm. Section IV presents simulation results that verify the theoretical analysis, and Section V provides conclusions.

978-1-5090-1537-5/16/$31.00 ©2016 IEEE

II. PROBLEM FORMULATION

We present an adaptive learning algorithm using the framework of controlled Lagrangian particle tracking. To formulate the learning problem of AUVs influenced by the uncertain ocean flow, we describe the vehicle motion model, the learning model, flow models, and state feedback controllers.

A. Vehicle motion model

Let $D \subset \mathbb{R}^2$ be a domain, and let $F(x,t)$ be a deterministic ambient flow velocity, where $F: D \times [0,\infty) \to \mathbb{R}^2$. Let $v$ be the through-water velocity, or the controlled velocity, of the AUV, where $v: D \times [0,\infty) \to \mathbb{R}^2$. The vehicle motion model is

$$\frac{dx}{dt} = F_R(x,t) + v(x,t,z_c(t)). \tag{1}$$

The subscript $R$ for the flow $F$ denotes the real ocean flow. $F_R$ and $v$ are locally Lipschitz in $x = [x_1, x_2]^\top \in D$, where $x$ is the true position of the AUV, and $z_c(t) = [z_{1c}(t), z_{2c}(t)]^\top$ is the desired trajectory of the AUV.

B. Learning model

Learning models are used for generating the simulated positions of the vehicle. When we define $z(t)$ as the simulated position of the vehicle from a learning model, the trajectory of the vehicle is generated by integrating the following learning model:

$$\frac{dz}{dt} = F_M(z,t) + v(z,t,z_c(t)), \tag{2}$$

where the subscript $M$ for the flow $F$ denotes a modeled flow.

C. Flow models

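Flow fields can be represented by spatial and temporal basis functions [20]. We take the spatial and temporal basis functions to be Gaussian radial and tidal basis functions, respectively, and combine them. Let $N$ be a positive integer, and $\theta, \alpha_M \in \mathbb{R}^{2 \times N}$. Let $\phi: D \times [0,\infty) \to \mathbb{R}^N$ be $[\phi^1(x,t), \cdots, \phi^N(x,t)]^\top$. Then

$$F_R(x,t) = \theta\phi(x,t) \tag{3}$$

$$F_M(x,t) = \alpha_M\phi(x,t), \tag{4}$$

where

$$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix} = \begin{bmatrix} \theta_1^1 & \cdots & \theta_1^N \\ \theta_2^1 & \cdots & \theta_2^N \end{bmatrix}, \qquad \alpha_M = \begin{bmatrix} \alpha_{M1} \\ \alpha_{M2} \end{bmatrix} = \begin{bmatrix} \alpha_{M1}^1 & \cdots & \alpha_{M1}^N \\ \alpha_{M2}^1 & \cdots & \alpha_{M2}^N \end{bmatrix}.$$

The combined basis functions are

$$\phi^i(x,t) = \exp\left(-\frac{\|x - c_i\|}{2\sigma_i}\right)\cos(\omega_i t), \tag{5}$$

where $c_i$ are the centers, $\sigma_i$ the widths, and $\omega_i$ the tidal frequencies, for $i = 1, \cdots, N$.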
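As an illustrative sketch (not the authors' simulation code), the flow model of equations (3) to (5) can be evaluated in a few lines of Python. The centers, widths, tidal periods, and $\theta$ are the values quoted in Section IV. Everything else is an assumption made for illustration: the function names `basis` and `flow`, the pairing of each tidal period with a center, the use of hours as the time unit, and the reading of the exponent in (5) as $\|x - c_i\| / (2\sigma_i)$.

```python
import math

# Values quoted in Section IV; pairing each period with a center is an assumption.
CENTERS = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.0)]
WIDTHS = [5.0, 5.0, 5.0]
PERIODS_H = [12.42, 10.42, 24.42]                 # tidal periods in hours
OMEGAS = [2.0 * math.pi / p for p in PERIODS_H]   # rad/hour (unit choice is an assumption)
THETA = [[1.0, 0.4, 0.8], [0.9, 0.4, 0.8]]        # rows: horizontal, vertical


def basis(x, t):
    """Return [phi^1(x,t), ..., phi^N(x,t)].

    Assumes (5) reads exp(-||x - c_i|| / (2 sigma_i)) * cos(omega_i t).
    """
    values = []
    for (cx, cy), sigma, omega in zip(CENTERS, WIDTHS, OMEGAS):
        dist = math.hypot(x[0] - cx, x[1] - cy)
        values.append(math.exp(-dist / (2.0 * sigma)) * math.cos(omega * t))
    return values


def flow(coeffs, x, t):
    """Evaluate the 2-D flow coeffs * phi(x, t), as in (3) and (4)."""
    phi = basis(x, t)
    return [sum(c * p for c, p in zip(row, phi)) for row in coeffs]


if __name__ == "__main__":
    # True flow at the origin at the initial time.
    print(flow(THETA, (0.0, 0.0), 0.0))
```

Passing a different coefficient matrix (for example a hypothetical $\alpha_M$) to `flow` gives the modeled flow $F_M$ with the same basis functions.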

D. State feedback controllers with flow canceling

The flow canceling strategy is to cancel out flow velocity by controlling the heading angle of AUVs [9]. With the flow canceling strategy, we design two types of controllers as combinations of feedback and feedforward structures. Let

$$\alpha(t) = \begin{bmatrix} \alpha_1(t) \\ \alpha_2(t) \end{bmatrix} = \begin{bmatrix} \alpha_1^1(t) & \cdots & \alpha_1^N(t) \\ \alpha_2^1(t) & \cdots & \alpha_2^N(t) \end{bmatrix}.$$

Let $K_M(t) = \mathrm{diag}\{K_{M1}(t), K_{M2}(t)\}$ and $\Gamma_M(t) = \mathrm{diag}\{\Gamma_{M1}(t), \Gamma_{M2}(t)\}$ be diagonal matrices with time-varying parameters. Let $K_D = \mathrm{diag}\{K_{D1}, K_{D2}\}$ and $\Gamma_D = \mathrm{diag}\{\Gamma_{D1}, \Gamma_{D2}\}$ be diagonal matrices with desired fixed parameters. Let $\Psi_L(t) \in \mathbb{R}^2$ be a learning injection parameter. Then,

$$v(z,t,z_c(t)) = -\alpha(t)\phi(z,t) - K_M(t)z + \Gamma_M(t)z_c(t) + \Psi_L(t) \tag{6}$$

$$v(x,t,z_c(t)) = -\alpha_M\phi(x,t) - K_D x + \Gamma_D z_c(t). \tag{7}$$

The learning controller represented by equation (6) contains (in order) a flow canceling term, a positional feedback term, a feedforward term, and a learning injection term with time-varying parameters. By plugging equations (6) and (4) into equation (2), the closed-loop dynamics under the learning controller is

$$\dot z = (\alpha_M - \alpha(t))\phi(z,t) - K_M(t)z + \Gamma_M(t)z_c(t) + \Psi_L(t). \tag{8}$$

Meanwhile, by plugging equations (7) and (3) into equation (1), the closed-loop vehicle dynamics under the vehicle controller represented by equation (7) is

$$\dot x = (\theta - \alpha_M)\phi(x,t) - K_D x + \Gamma_D z_c(t). \tag{9}$$

If $x$ and $z$ have the same initial condition and $\alpha(t) = \alpha_M$, $K_M(t) = K_D$, $\Gamma_M(t) = \Gamma_D$, and $\Psi_L(t) = (\theta - \alpha_M)\phi(x,t)$, the learned trajectory is identical to the vehicle trajectory because the two closed-loop dynamics coincide. Our goal is to design the learning controller and the learning injection parameter so that the closed-loop dynamics of the learning controller follows that of the vehicle controller.

E. Controlled Lagrangian prediction error

To develop the learning controller that makes the simulated trajectory follow the true trajectory in the ocean flow field, we first derive the controlled Lagrangian prediction error (CLPE) dynamics, which model how much the true trajectory deviates from the simulated trajectory. Let $e = x - z$ denote CLPE. By subtracting equation (8) from equation (9), the CLPE dynamics is represented by

$$\begin{aligned}
\dot e &= \dot x - \dot z \\
&= (\theta - \alpha_M)\phi(x,t) - (\alpha_M - \alpha(t))\phi(z,t) - K_D x + \Gamma_D z_c(t) + K_M(t)z - \Gamma_M(t)z_c(t) - \Psi_L(t) \\
&= (\theta - \alpha_M)\phi(x,t) - (\alpha_M - \alpha(t))\phi(z,t) + (K_M(t) - K_D)z - K_D e + (\Gamma_D - \Gamma_M(t))z_c(t) - \Psi_L(t).
\end{aligned} \tag{10}$$

For example, if $\alpha(t) = \alpha_M$, $K_M(t) = K_D$, $\Gamma_M(t) = \Gamma_D$, and $\Psi_L(t) = (\theta - \alpha_M)\phi(x,t)$, CLPE goes to zero over time, which implies that the simulated trajectory follows the true trajectory. Our goal is to design a learning law for updating parameters $\alpha(t)$, $K_M(t)$, and $\Gamma_M(t)$ by using the CLPE dynamics so that CLPE converges to zero.
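The ideal parameter choice named in the example can be checked by direct substitution into equation (10). The short derivation below is a worked check rather than an additional result; the closed-form solution assumes a constant, positive definite $K_D$.

```latex
\begin{align*}
\dot e &= (\theta - \alpha_M)\phi(x,t) - (\alpha_M - \alpha_M)\phi(z,t)
          + (K_D - K_D)z - K_D e + (\Gamma_D - \Gamma_D)z_c(t)
          - (\theta - \alpha_M)\phi(x,t) \\
       &= -K_D e,
\qquad\text{so}\qquad e(t) = \exp(-K_D t)\, e(0) \to 0 \quad \text{as } t \to \infty .
\end{align*}
```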

III. AN ADAPTIVE LEARNING ALGORITHM

To design an adaptive learning algorithm for controlled Lagrangian particle tracking, we need some definitions and assumptions.

Definition 3.1 [21] A signal $u$ is persistently exciting if there exist positive constants $\kappa_1$, $\kappa_2$, and $T$ such that $\kappa_2 I > \int_t^{t+T} u(s)u^\top(s)\,ds > \kappa_1 I$ $\forall t$.

Assumption 3.1 $z_c$ is a persistently exciting signal.

Assumption 3.2 $K_D$, the feedback gain matrix of the vehicle controller in equation (7), is positive definite.

Assumption 3.3 Let $\hat\theta$ be the estimate of $\theta$. $\|(\theta - \hat\theta)\phi\|$ is bounded by flow velocity $f_{\max}$.

Remark 3.1 $z_c$ is bounded because of persistent excitation.

Remark 3.2 Since $\theta$ is an unknown constant matrix, we can estimate $\theta$ using the Kalman filter and the motion tomography algorithm. We assume $\|\theta - \hat\theta\|$ can be bounded from those algorithms.

Remark 3.3 A positive definite matrix $K_D$ is required for the vehicle controller with negative state feedback.
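The persistent excitation condition of Definition 3.1 can be probed numerically for a given signal. The sketch below is only an illustration under stated assumptions: the helper names `excitation_gram` and `eig_sym_2x2` are hypothetical, the integral is approximated with a midpoint rule, and the two test signals (a smooth circle of radius 3 and a constant) are chosen for illustration and are not the waypoint signal used in Section IV.

```python
import math


def excitation_gram(u, t0, T, steps=2000):
    """Midpoint-rule approximation of the 2x2 matrix int_{t0}^{t0+T} u(s) u(s)^T ds."""
    h = T / steps
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(steps):
        a, b = u(t0 + (k + 0.5) * h)
        g[0][0] += a * a * h
        g[0][1] += a * b * h
        g[1][0] += b * a * h
        g[1][1] += b * b * h
    return g


def eig_sym_2x2(g):
    """Return (smallest, largest) eigenvalue of a symmetric 2x2 matrix."""
    half_trace = (g[0][0] + g[1][1]) / 2.0
    radius = math.hypot((g[0][0] - g[1][1]) / 2.0, g[0][1])
    return half_trace - radius, half_trace + radius


def circle(s, r=3.0):
    """Hypothetical smooth reference: a circle of radius r."""
    return (r * math.cos(s), r * math.sin(s))


def constant(s):
    """Hypothetical constant reference, which excites only one direction."""
    return (3.0, 0.0)


if __name__ == "__main__":
    window = 2.0 * math.pi
    print(eig_sym_2x2(excitation_gram(circle, 0.0, window)))    # both eigenvalues positive
    print(eig_sym_2x2(excitation_gram(constant, 0.0, window)))  # smallest eigenvalue is zero
```

For the circle, both eigenvalues stay bounded away from zero over any window of one period, so constants $\kappa_1$ and $\kappa_2$ exist. For the constant signal, the smallest eigenvalue is zero, so no positive $\kappa_1$ exists and the signal is not persistently exciting.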

A. Using a flow model

This section addresses a learning controller that uses available modeled flows. We consider $\alpha(t) = \alpha_M$, which is the available modeled flow. Then the CLPE dynamics from equation (10) becomes

$$\dot e = (\theta - \alpha_M)\phi(x,t) - K_D e + (K_M(t) - K_D)z + (\Gamma_D - \Gamma_M(t))z_c(t) - \Psi_L(t). \tag{11}$$

Let $\bar K_M(t) = [K_{M1}(t), K_{M2}(t)]^\top$ and $\bar\Gamma_M(t) = [\Gamma_{M1}(t), \Gamma_{M2}(t)]^\top$ be two-dimensional vectors with time-varying parameters. Let $\bar K_D = [K_{D1}, K_{D2}]^\top$ and $\bar\Gamma_D = [\Gamma_{D1}, \Gamma_{D2}]^\top$ be two-dimensional vectors with desired fixed parameters. Let $\bar z = \mathrm{diag}\{z_1, z_2\}$ and $\bar z_c = \mathrm{diag}\{z_{1c}, z_{2c}\}$ be diagonal matrices. Let $\gamma$ be any positive constant. We design the learning parameter injection as follows:

$$\Psi_L(t) = (\theta - \alpha_M)\phi(x,t). \tag{12}$$

Since we have no knowledge of the true flow parameter $\theta$, we use an estimate $\hat\theta$ from known algorithms (e.g., Kalman filtering [17] or motion tomography [18]):

$$\Psi_L(t) = (\hat\theta - \alpha_M)\phi(x,t). \tag{13}$$

We design a learning law for the time-varying parameters $\bar K_M$ and $\bar\Gamma_M$ by the following equations:

$$\dot{\bar K}_M(t) = -\gamma \bar z^\top e \tag{14}$$

$$\dot{\bar\Gamma}_M(t) = \gamma \bar z_c^\top e \tag{15}$$
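In an implementation, the learning law (14) and (15) has to be discretized. A minimal sketch using a forward-Euler step is given below. The discretization scheme, the step size `dt`, and the function name `learning_step` are assumptions made for illustration; the paper states the law in continuous time only. Because the matrices multiplying $e$ in (14) and (15) are diagonal, the update acts component by component.

```python
def learning_step(K_M, G_M, z, z_c, e, gamma, dt):
    """One forward-Euler step of the gain learning law (14)-(15).

    K_M, G_M : current feedback / feedforward gain estimates, length-2 lists
    z, z_c   : simulated position and desired position, length-2 lists
    e        : CLPE, e = x - z, length-2 list
    gamma    : positive adaptation speed
    dt       : integration step (hypothetical, not specified in the paper)
    """
    # (14): dK_Mi/dt = -gamma * z_i * e_i   (diag{z1, z2} acting on e)
    K_next = [K_M[i] - dt * gamma * z[i] * e[i] for i in range(2)]
    # (15): dGamma_Mi/dt = gamma * z_ci * e_i
    G_next = [G_M[i] + dt * gamma * z_c[i] * e[i] for i in range(2)]
    return K_next, G_next


if __name__ == "__main__":
    K, G = learning_step([0.0, 0.0], [0.0, 0.0],
                         z=[1.0, 2.0], z_c=[3.0, 0.0], e=[0.5, -0.5],
                         gamma=0.8, dt=0.1)
    print(K, G)
```

The value `gamma=0.8` matches the adaptation speed reported in Section IV; the positions and error in the usage example are arbitrary.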

Theorem 3.1 Under assumptions 3.1, 3.2, and 3.3, and using equations (13), (14), and (15), CLPE is ultimately bounded by

$$\|e\| \le \frac{1}{\mu}\|(\theta - \hat\theta)\phi(x,t)\|, \tag{16}$$

where $0 < \mu < \lambda_{\min}(K_D)$.

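Proof. Consider a candidate Lyapunov function:

$$V(e, \bar K_M, \bar\Gamma_M) = \frac{1}{2}\left\{ e^\top e + \frac{1}{\gamma}(\bar K_M - \bar K_D)^\top(\bar K_M - \bar K_D) + \frac{1}{\gamma}(\bar\Gamma_D - \bar\Gamma_M)^\top(\bar\Gamma_D - \bar\Gamma_M) \right\}. \tag{17}$$

The derivative of $V$ is

$$\dot V = -e^\top K_D e + e^\top(\theta - \alpha_M)\phi(x,t) - e^\top\Psi_L + (\bar K_M - \bar K_D)^\top\left(\frac{1}{\gamma}\dot{\bar K}_M + \bar z^\top e\right) + (\bar\Gamma_D - \bar\Gamma_M)^\top\left(\bar z_c^\top e - \frac{1}{\gamma}\dot{\bar\Gamma}_M\right). \tag{18}$$

By assumption 3.2 and using equations (13), (14), and (15), $\dot V = -e^\top K_D e + e^\top(\theta - \hat\theta)\phi(x,t)$. Then,

$$\dot V \le -(\lambda_{\min}(K_D) - \mu)\|e\|^2 + \|e\|\,\|(\theta - \hat\theta)\phi(x,t)\| - \mu\|e\|^2.$$

When $\|e\| \ge \frac{1}{\mu}\|(\theta - \hat\theta)\phi(x,t)\|$ for a given positive constant $\mu < \lambda_{\min}(K_D)$, we have $\dot V \le -(\lambda_{\min}(K_D) - \mu)\|e\|^2$. This means that $\dot V$ is not positive outside this bound. Thus, CLPE is ultimately bounded by $\|e\| \le \frac{1}{\mu}\|(\theta - \hat\theta)\phi(x,t)\|$.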
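The ultimate bound of Theorem 3.1 can be illustrated with a scalar toy problem. The sketch below makes strong simplifying assumptions that are not part of the theorem: the gains are taken as already matched ($K_M = K_D$, $\Gamma_M = \Gamma_D$), so that the error obeys $\dot e = -k_d e + d(t)$, and the residual flow term $(\theta - \hat\theta)\phi(x,t)$ is replaced by a hypothetical sinusoid $d(t) = d_{\max}\sin t$. The function name `simulate_error` and all numerical values except $k_d = 1$ (from Section IV) are illustrative.

```python
import math


def simulate_error(k_d=1.0, d_max=0.2, e0=5.0, dt=0.001, t_end=30.0):
    """Forward-Euler simulation of de/dt = -k_d * e + d(t), with |d(t)| <= d_max.

    Returns a list of (t, e) samples.
    """
    e = e0
    history = []
    for n in range(int(round(t_end / dt))):
        t = n * dt
        history.append((t, e))
        d = d_max * math.sin(t)      # hypothetical stand-in for the flow estimation error
        e += dt * (-k_d * e + d)
    return history


if __name__ == "__main__":
    mu = 0.5                         # any constant with 0 < mu < k_d
    bound = 0.2 / mu                 # (1/mu) * d_max, the bound in (16)
    history = simulate_error()
    tail = max(abs(e) for t, e in history if t >= 15.0)
    print("bound:", bound, "largest |e| after the transient:", tail)
```

The error starts well outside the bound and, once the transient has decayed, stays inside it, which is the meaning of "ultimately bounded" in the theorem.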

Remark 3.4 Theorem 3.1 shows that CLPE is bounded by the estimation error of flow velocity, as expected. Hence, the bound in Theorem 3.1 is completely determined by the accuracy of $\hat\theta$, which depends on the convergence of known algorithms (Kalman filtering, motion tomography [17], [18]). The following section describes another learning algorithm that includes learning flows so that CLPE converges to zero.

B. Learning flow

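We rewrite equation (10) as follows:

$$\dot e = (\theta + \alpha(t))\phi(x,t) + (K_M(t) - K_D)z + (\Gamma_D - \Gamma_M(t))z_c(t) - K_D e - \Psi_L(t) - (\alpha_M + \alpha(t))\phi(x,t) - (\alpha_M - \alpha(t))\phi(z,t). \tag{19}$$

Let $\bar\theta$, $\bar\alpha$, and $\overline{e\phi^\top} \in \mathbb{R}^{2N}$ be row vectors. That is, $\bar\alpha(t) = [\alpha_1^1(t), \cdots, \alpha_1^N(t), \alpha_2^1(t), \cdots, \alpha_2^N(t)]$, $\bar\theta = [\theta_1^1, \cdots, \theta_1^N, \theta_2^1, \cdots, \theta_2^N]$, and $\overline{e\phi^\top} = [e_1\phi^1, \cdots, e_1\phi^N, e_2\phi^1, \cdots, e_2\phi^N]$. We design a learning parameter injection as follows:

$$\Psi_L(t) = -(\alpha_M + \alpha(t))\phi(x,t) - (\alpha_M - \alpha(t))\phi(z,t). \tag{20}$$

We design a learning law for the time-varying parameters:

$$\dot{\bar\alpha}(t) = -\gamma\,\overline{e\phi^\top}(x,t) \tag{21}$$

$$\dot{\bar K}_M(t) = -\gamma \bar z^\top e \tag{22}$$

$$\dot{\bar\Gamma}_M(t) = \gamma \bar z_c^\top e \tag{23}$$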

Theorem 3.2 Under assumptions 3.1 and 3.2, and using equations (20), (21), (22), and (23), CLPE converges to zero when time goes to infinity.

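Proof. Consider a candidate Lyapunov function:

$$V(e, \bar\alpha, \bar K_M, \bar\Gamma_M) = \frac{1}{2}\left\{ e^\top e + \frac{1}{\gamma}(\theta_1 + \alpha_1(t))(\theta_1 + \alpha_1(t))^\top + \frac{1}{\gamma}(\theta_2 + \alpha_2(t))(\theta_2 + \alpha_2(t))^\top + \frac{1}{\gamma}(\bar K_M - \bar K_D)^\top(\bar K_M - \bar K_D) + \frac{1}{\gamma}(\bar\Gamma_D - \bar\Gamma_M)^\top(\bar\Gamma_D - \bar\Gamma_M) \right\}. \tag{24}$$

The derivative of $V$ is

$$\begin{aligned}
\dot V ={}& -e^\top K_D e + e^\top(\theta + \alpha)\phi(x,t) - e^\top\Psi_L - e^\top\left\{(\alpha_M + \alpha)\phi(x,t) + (\alpha_M - \alpha)\phi(z,t)\right\} \\
& + (\bar K_M - \bar K_D)^\top\left(\frac{1}{\gamma}\dot{\bar K}_M + \bar z^\top e\right) + (\bar\Gamma_D - \bar\Gamma_M)^\top\left(\bar z_c^\top e - \frac{1}{\gamma}\dot{\bar\Gamma}_M\right) + \frac{1}{\gamma}(\bar\theta + \bar\alpha)\dot{\bar\alpha}^\top.
\end{aligned} \tag{25}$$

We know $e^\top(\theta + \alpha)\phi = (\bar\theta + \bar\alpha)\,(\overline{e\phi^\top})^\top$. Then, under assumption 3.2 and using equations (20), (21), (22), and (23),

$$\dot V = -e^\top K_D e \le 0. \tag{26}$$

$\dot V$ is negative semi-definite, which implies that $e$, $\alpha$, $K_M$, and $\Gamma_M$ are bounded. In addition, the second-order time derivative of $V$ satisfies $\ddot V = -2e^\top K_D \dot e = -2e^\top K_D\{(\theta + \alpha)\phi(x,t) - K_D e + (K_M(t) - K_D)z + (\Gamma_D - \Gamma_M(t))z_c - \Psi_L(t) - (\alpha_M + \alpha)\phi(x,t) - (\alpha_M - \alpha)\phi(z,t)\}$. From equation (20), $\Psi_L$ is bounded. By assumption 3.1, $z_c$ is bounded, and $z$ is bounded because equation (8) represents linear systems with sinusoidal inputs. In addition, $K_M$ and $\Gamma_M$ are bounded. Thus, $\ddot V$ is bounded, and hence $\dot V$ is uniformly continuous. By Barbalat's lemma in [22], [23], $e \to 0$ when $t \to \infty$.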
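The behavior claimed in Theorem 3.2 can be explored numerically in a reduced setting. The sketch below is a hedged illustration, not the simulation of Section IV: it uses a scalar state, a single basis function, forward-Euler integration, a hypothetical smooth reference $z_c(t) = 3\cos t$, and a hypothetical modeled flow coefficient `alpha_M = 0.5`. The exponent of the basis function follows the same reading of (5) as $\|x - c\|/(2\sigma)$, which is an assumption. The equations integrated are (8), (9), and (20) to (23).

```python
import math


def phi(x, t, c=0.0, sigma=5.0, omega=2.0 * math.pi / 12.42):
    """Scalar stand-in for one combined basis function (exponent form is an assumption)."""
    return math.exp(-abs(x - c) / (2.0 * sigma)) * math.cos(omega * t)


def z_c(t, r=3.0):
    """Hypothetical smooth reference trajectory (not the waypoints of Section IV)."""
    return r * math.cos(t)


def simulate(T=120.0, dt=0.005, theta=1.0, alpha_M=0.5, K_D=1.0, G_D=1.0, gamma=0.8):
    """Integrate vehicle (9), learning model (8) with injection (20), and laws (21)-(23)."""
    x = z = 0.0                     # same initial condition for vehicle and learner
    alpha = K_M = G_M = 0.0         # learning parameters initialized at 0
    errors = []
    for k in range(int(round(T / dt))):
        t = k * dt
        e = x - z                   # CLPE
        errors.append(e)
        px, pz, zc = phi(x, t), phi(z, t), z_c(t)
        psi = -(alpha_M + alpha) * px - (alpha_M - alpha) * pz      # (20)
        dx = (theta - alpha_M) * px - K_D * x + G_D * zc            # (9)
        dz = (alpha_M - alpha) * pz - K_M * z + G_M * zc + psi      # (8)
        d_alpha = -gamma * e * px                                   # (21)
        d_K = -gamma * z * e                                        # (22)
        d_G = gamma * zc * e                                        # (23)
        x += dt * dx
        z += dt * dz
        alpha += dt * d_alpha
        K_M += dt * d_K
        G_M += dt * d_G
    return errors, (alpha, K_M, G_M)


if __name__ == "__main__":
    errors, (alpha, K_M, G_M) = simulate()
    n = len(errors) // 10
    early = sum(abs(e) for e in errors[:n]) / n
    late = sum(abs(e) for e in errors[-n:]) / n
    print("mean |e| early:", early, "late:", late)
    print("alpha, K_M, Gamma_M:", alpha, K_M, G_M)
```

Comparing the mean error over the first and last tenth of the run gives a quick indication of whether CLPE is decaying; the learned parameters can be compared against $-\theta$, $K_D$, and $\Gamma_D$ in the sense of Theorem 3.3.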

Theorem 3.3 Under the same setting as theorem 3.2, $\alpha$, $K_M$, and $\Gamma_M$ converge to $-\theta$, $K_D$, and $\Gamma_D$, respectively.

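Proof. In order to identify the ocean flow field from the proposed learning control law, we prove the convergence of parameters $\alpha$, $K_M$, and $\Gamma_M$. Let $\eta_1$, $\eta_2$, $\eta_3$, and $\eta_4$ be $(\theta_1 + \alpha_1)^\top$, $(\theta_2 + \alpha_2)^\top$, $\bar K_M - \bar K_D$, and $\bar\Gamma_D - \bar\Gamma_M$, respectively. Let

$$\bar\phi_1 = \begin{bmatrix} \phi^1 & \cdots & \phi^N \\ 0 & \cdots & 0 \end{bmatrix} \quad\text{and}\quad \bar\phi_2 = \begin{bmatrix} 0 & \cdots & 0 \\ \phi^1 & \cdots & \phi^N \end{bmatrix}$$

be in $\mathbb{R}^{2 \times N}$. We rewrite equation (19) using $\eta_1$, $\eta_2$, $\eta_3$, and $\eta_4$ as follows:

$$\dot e = \bar\phi_1(x,t)\eta_1 + \bar\phi_2(x,t)\eta_2 + \bar z\eta_3 - K_D e + \bar z_c\eta_4. \tag{27}$$

We augment $e$, $\eta_1$, $\eta_2$, $\eta_3$, and $\eta_4$ into a new state variable $X$. Then

$$\dot X = A(t)X, \quad A(t) = \begin{bmatrix} -K_D & \bar\phi_1 & \bar\phi_2 & \bar z & \bar z_c \\ -\gamma\bar\phi_1^\top & 0 & 0 & 0 & 0 \\ -\gamma\bar\phi_2^\top & 0 & 0 & 0 & 0 \\ -\gamma\bar z^\top & 0 & 0 & 0 & 0 \\ -\gamma\bar z_c^\top & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad Y = CX, \quad C = \begin{bmatrix} I & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}. \tag{28}$$

Our goal is to show that the origin of $\dot X = A(t)X$ is uniformly asymptotically stable, which implies that $\alpha \to -\theta$, $K_M \to K_D$, and $\Gamma_M \to \Gamma_D$. By theorem 3.4.8 in [24], a necessary and sufficient condition is that there exists a symmetric matrix $P$ such that $c_1 I \le P \le c_2 I$ and $A^\top P + PA + \dot P + \nu C^\top C \le 0$ are satisfied $\forall t$ and some constant $\nu > 0$, where $c_1 > 0$, $c_2 > 0$, and $C$ is such that $(C, A)$ is uniformly completely observable.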

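Let

$$P = \begin{bmatrix} \frac{1}{2}K_D^{-1} & 0 & 0 & 0 & 0 \\ 0 & \frac{1}{2\gamma}K_D^{-1} & 0 & 0 & 0 \\ 0 & 0 & \frac{1}{2\gamma}K_D^{-1} & 0 & 0 \\ 0 & 0 & 0 & \frac{1}{2\gamma}K_D^{-1} & 0 \\ 0 & 0 & 0 & 0 & \frac{1}{2\gamma}K_D^{-1} \end{bmatrix},$$

and let $V'$ be $X^\top P X$. Then,

$$\dot V' = X^\top(A^\top P + P^\top A + \dot P)X \le -\nu X^\top C^\top C X = -\nu\|Y\|^2, \tag{29}$$

where $\dot P = 0$.

Now we prove that $(C, A)$ is uniformly completely observable. Because it is hard to prove the observability of the time-varying system matrix $A$ directly, we instead show that $(C, A + LC)$ is uniformly completely observable for some bounded matrix $L$, called output injection, by lemma 4.8.1 in [24]. Let

$$L = \begin{bmatrix} K_D & 0 & 0 & 0 & 0 \\ \gamma\bar\phi_1^\top & 0 & 0 & 0 & 0 \\ \gamma\bar\phi_2^\top & 0 & 0 & 0 & 0 \\ \gamma\bar z^\top & 0 & 0 & 0 & 0 \\ \gamma\bar z_c^\top & 0 & 0 & 0 & 0 \end{bmatrix}.$$

Since $z$ and $z_c$ are bounded, and $\phi$ is a sinusoidal function with exponential magnitude, $L$ is bounded. Then,

$$A + LC = \begin{bmatrix} 0 & \bar\phi_1 & \bar\phi_2 & \bar z & \bar z_c \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}.$$

Thus,

$$\dot X = AX = (A + LC)X - LY, \qquad Y = CX. \tag{30}$$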

Let $\eta = [\eta_1, \eta_2, \eta_3, \eta_4]^\top$ and $w = [\bar\phi_1, \bar\phi_2, \bar z, \bar z_c]^\top$. We have the following equation corresponding to equation (30):

$$\dot e = -K_D e + w^\top\eta, \qquad \dot\eta = 0, \qquad Y = e. \tag{31}$$

Because $z_c$ is persistently exciting, $z$ is bounded, and $\bar\phi_1$ and $\bar\phi_2$ are sinusoidal functions, $w$ is persistently exciting. Let $\Phi(\zeta) = \int_t^{\tau} \exp^{-K_D(\tau - \zeta)} w(\zeta)\,d\zeta$. By lemma 4.8.3 in [24], $\Phi(\zeta)$ satisfies the persistent excitation conditions because $(sI_{2 \times 2} + K_D)^{-1}$ is a stable, minimum phase, proper rational transfer function. Therefore, there exist constants $\rho_1, \rho_2, T_0 > 0$ such that $\rho_2 I \ge \frac{1}{T_0}\int_t^{t+T_0} \Phi(\zeta)\Phi^\top(\zeta)\,d\zeta \ge \rho_1 I$ $\forall t \ge 0$. By applying lemma 4.8.4 in [24] to the system of equation (31), $(C, A + LC)$ is uniformly completely observable; hence, the system of equation (28) is uniformly completely observable. Therefore, $\alpha$, $K_M$, and $\Gamma_M$ converge to $-\theta$, $K_D$, and $\Gamma_D$, respectively.

C. Inaccuracy in flow modeling

Although the basis functions capture the spatial variability of ocean flow well in a specific region, the functions still include deterministic errors induced by variability outside the region. In this section, we address the robustness of the proposed adaptive learning algorithms.

We show the boundedness of CLPE when the true flow model has unknown disturbances such as unstructured uncertainties. We assume $F_R(x,t) = \theta\phi(x,t) + \Delta$, where $\|\Delta\|$ is bounded by $\Delta_{\max}$. Then,

$$\dot e = (\theta + \alpha(t))\phi(x,t) + (K_M(t) - K_D)z - K_D e + (\Gamma_D - \Gamma_M(t))z_c + \Delta. \tag{32}$$

Theorem 3.4 Under the same setting as theorem 3.2, the bound of CLPE is

$$\|e\| \le \frac{1}{\beta}\|\Delta\|, \tag{33}$$

where the positive constant $\beta < \lambda_{\min}(K_D)$.

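Proof. Let $V$ be the Lyapunov function represented by equation (24). By using equation (32), the derivative of $V$ is

$$\dot V = -e^\top K_D e + e^\top\Delta + (\bar K_M - \bar K_D)^\top\left(\frac{1}{\gamma}\dot{\bar K}_M + \bar z^\top e\right) + (\bar\Gamma_D - \bar\Gamma_M)^\top\left(\bar z_c^\top e - \frac{1}{\gamma}\dot{\bar\Gamma}_M\right) + (\bar\theta + \bar\alpha)\left(\frac{1}{\gamma}\dot{\bar\alpha} + \overline{e\phi^\top}(x,t)\right)^\top. \tag{34}$$

Then, we plug the adaptive law represented by equations (20), (21), (22), and (23) into equation (34), which gives

$$\begin{aligned}
\dot V &= -e^\top K_D e + e^\top\Delta \\
&\le -\lambda_{\min}(K_D)e^\top e + e^\top\Delta \\
&\le -\lambda_{\min}(K_D)\|e\|^2 + \|e\|\,\|\Delta\| \\
&\le -(\lambda_{\min}(K_D) - \beta)\|e\|^2 + \|e\|\,\|\Delta\| - \beta\|e\|^2.
\end{aligned} \tag{35}$$

When $\|e\| \ge \frac{1}{\beta}\|\Delta\|$ for a given positive constant $\beta < \lambda_{\min}(K_D)$, we have $\dot V \le -(\lambda_{\min}(K_D) - \beta)\|e\|^2$, which means that $\dot V$ is negative outside this bound. Thus, CLPE is ultimately bounded by $\|e\| \le \frac{1}{\beta}\|\Delta\|$.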

IV. SIMULATION RESULTS

In this section, we describe simulation results for the proposed learning algorithm. We choose 3 spatial and 3 tidal basis functions to represent 2D ocean flow along each of the two directions. $\theta_1$, which represents the true flow parameters along the horizontal direction, is [1.0 0.4 0.8]. $\theta_2$, which represents the true flow parameters along the vertical direction, is [0.9 0.4 0.8]. The three combined basis functions are represented by center $c_i$, width $\sigma_i$, and tidal frequency $\omega_i$, where $i = 1, 2, 3$. $c_1$, $c_2$, and $c_3$ are $[0,0]^\top$, $[10,10]^\top$, and $[5,5]^\top$, respectively. $\sigma_1$, $\sigma_2$, and $\sigma_3$ are all equal to 5. We select three frequencies to represent tides: the M2 lunar semidiurnal (period = 12.42 hours), and two fictional frequencies with periods at intervals higher and lower than the M2 (10.42 and 24.42 hours). The feedback gain of the vehicle controller, $K_D$, is $\mathrm{diag}\{1,1\}$. $\Gamma_D$ is the same as $K_D$, and all parameters in the learning algorithm are initialized at 0. The adaptation speed $\gamma$ is set to 0.8. We consider 10 waypoints generated from $z_c = [r\cos\Theta, r\sin\Theta]^\top$ with $r = 3$ and $\Theta = \frac{\pi}{5}\lfloor\frac{t}{20}\rfloor$. Figure 1 represents the waypoints of the AUV. The AUV completes a cycle when, starting at waypoint (3,0), it sequentially travels through the nine other waypoints in the counterclockwise direction and arrives back at waypoint (3,0).
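The waypoint schedule described above can be written out directly. The sketch below only restates the formula for $z_c$ with $r = 3$ and $\Theta = \frac{\pi}{5}\lfloor t/20 \rfloor$; the function name `waypoint` and the parameter name `dwell` (the 20-second interval) are chosen here for illustration.

```python
import math


def waypoint(t, r=3.0, dwell=20.0):
    """Desired position z_c(t) = [r cos(Theta), r sin(Theta)], Theta = (pi/5) * floor(t / dwell)."""
    theta = (math.pi / 5.0) * math.floor(t / dwell)
    return (r * math.cos(theta), r * math.sin(theta))


if __name__ == "__main__":
    # One cycle: ten 20-second intervals, then the schedule returns to (3, 0).
    for k in range(11):
        x, y = waypoint(20.0 * k)
        print(f"t = {20 * k:3d} s  ->  ({x:+.3f}, {y:+.3f})")
```

The reference is piecewise constant: it holds each waypoint for one interval and jumps to the next one counterclockwise, so a full cycle takes 200 seconds.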

Fig. 1. An AUV starts to go to the waypoint (3,0) from the origin, and then keeps moving to the next waypoint in the counterclockwise direction. Arrows represent ocean flow at the initial time. Flow direction changes over time according to the tidal basis functions.

Figures 2, 3, and 4 show simulation results of CLPE and the six flow parameters. CLPE goes to zero over 200 seconds (10 intervals), which is about one cycle. Moreover, the six flow parameters converge to their true values. Feedback and feedforward gains converge to the gains of the desired controller with a trend similar to that of the flow parameters. These results support our theoretical analysis in the previous section.

Fig. 2. CLPE: Each unit represents a time interval of 20 seconds. 10 intervals represent one cycle. Over one cycle, CLPE is significantly reduced, and CLPE converges to zero.

V. CONCLUSION

We have developed an adaptive learning algorithm for controlled Lagrangian particle tracking (CLPT) that improves the ability of an AUV to sense the flow field in which it operates by incorporating its navigation error. This improvement accelerates convergence in the estimation of flow as well as the convergence of controller gains to true gains. Furthermore, we prove that the algorithm guarantees that controlled Lagrangian prediction error (CLPE) converges to zero with time. To address the robustness of the algorithm, we show that CLPE is ultimately bounded in terms of unknown disturbance in ocean flow.

Fig. 3. Parameters for horizontal flow converge to true values.

Fig. 4. Parameters for vertical flow converge to true values.

No AUV is a perfect navigator; each is limited in its ability to navigate because of limits to propulsion mechanisms and self-localization. Future work will develop algorithms that address boundedness of the problem when the controlled velocity is saturated and develop learning algorithms when the localization service is infrequent.

ACKNOWLEDGMENTS

This research is supported by ONR grants N00014-14-1-0635, N00014-16-1-2667; NSF grants IIS-1319874, CMMI-1436284, CNS-0931576, OCE-1032285, OCE-1559475; and NOAA award NA16NOS0120028.

REFERENCES

[1] F. Zhang, G. Marani, R. N. Smith, and H. T. Choi, “Future trends in marine robotics,” IEEE Robotics and Automation Magazine, vol. 22, no. 1, pp. 14–21, pp. 122, 2015.

[2] S. Mukhopadhyay, C. Wang, M. Patterson, M. Malisoff, and F. Zhang, “Collaborative autonomous surveys in marine environments affected by oil spills,” Cooperative Robots and Sensor Networks 2014, vol. 554, pp. 87–113, 2014.

[3] W. Wu, A. Song, P. Varnell, and F. Zhang, “Cooperatively mapping of the underwater acoustic channel by robot swarms,” in Proceedings of the Ninth ACM International Conference on Underwater Networks and Systems (WUWNet ’14), 2014.

[4] N. E. Leonard, D. A. Paley, R. E. Davis, D. M. Fratantoni, F. Lekien, and F. Zhang, “Coordinated control of an underwater glider fleet in an adaptive ocean sampling field experiment in Monterey Bay,” Journal of Field Robotics, vol. 27, no. 6, pp. 718–740, 2010.

[5] L. Paull, S. Saeedi, M. Seto, and H. Li, “AUV navigation and localization: A review,” IEEE Journal of Oceanic Engineering, vol. 39, no. 1, pp. 131–149, 2014.

[6] F. Zhang, D. M. Fratantoni, D. A. Paley, J. M. Lund, and N. E. Leonard, “Control of coordinated patterns for ocean sampling,” International Journal of Control, vol. 80, no. 7, pp. 1186–1199, 2007.

[7] D. Chang, F. Zhang, and C. R. Edwards, “Real-time guidance of underwater gliders assisted by predictive ocean models,” Journal of Atmospheric and Oceanic Technology, vol. 32, no. 3, pp. 562–578, 2015.

[8] K. Szwaykowska and F. Zhang, “Controlled Lagrangian particle tracking error under biased flow prediction,” in 2013 American Control Conference, Washington, DC, USA, 2013, pp. 2575–2580.

[9] ——, “Trend and bounds for error growth in controlled Lagrangian particle tracking,” IEEE Journal of Oceanic Engineering, vol. 39, no. 1, pp. 10–25, 2014.

[10] B. Garau, M. Bonet, A. Alvarez, S. Ruiz, and A. Pascual, “Path planning for autonomous underwater vehicles in realistic oceanic current fields: Application to gliders in the Western Mediterranean Sea,” Journal of Maritime Research, vol. 6, no. 2, pp. 5–21, 2009.

[11] B. Rhoads, I. Mezic, and A. C. Poje, “Minimum time heading control of underpowered vehicles in time-varying ocean currents,” Ocean Engineering, vol. 66, no. 1, pp. 12–31, 2012.

[12] A. A. Pereira, J. Binney, G. A. Hollinger, and G. S. Sukhatme, “Risk-aware path planning for autonomous underwater vehicles using predictive ocean models,” IEEE Journal of Field Robotics, vol. 30, pp. 741–762, 2013.

[13] A. F. Shchepetkin and J. C. McWilliams, “The regional oceanic modeling system (ROMS): a split-explicit, free-surface, topography-following-coordinate oceanic model,” Ocean Modelling, vol. 9, pp. 347–404, 2005.

[14] P. J. Martin, “Description of the navy coastal ocean model version 1.0,” Naval Research Lab, Tech. Rep. NRL/FR/7322–00-9962, 2000.

[15] A. C. Haza, L. I. Piterbarg, P. Martin, T. M. Ozgokmen, and A. Griffa, “A Lagrangian subgridscale model for particle transport improvement and application in the Adriatic Sea using the Navy Coastal Ocean Model,” Ocean Modelling, vol. 17, no. 1, pp. 68–91, 2007.

[16] A. Griffa, L. I. Piterbarg, and T. Ozgokmen, “Predictability of Lagrangian particle trajectories: Effects of smoothing of the underlying Eulerian flow,” Journal of Marine Research, vol. 62, no. 1, pp. 1–35, 2004.

[17] D. Chang, X. Liang, W. Wu, C. R. Edwards, and F. Zhang, “Real-time modeling of ocean currents for navigating underwater glider sensing networks,” Cooperative Robots and Sensor Networks, Studies in Computational Intelligence, vol. 507, pp. 61–75, 2014.

[18] W. Wu, D. Chang, and F. Zhang, “Glider CT: Reconstructing flow fields from predicted motion of underwater gliders,” in Proceedings of the Eighth ACM International Conference on Underwater Networks and Systems (WUWNet ’13), no. 47, 2013.

[19] D. Chang, W. Wu, and F. Zhang, “Glider CT: Analysis and experimental validation,” Distributed Autonomous Robotic Systems, vol. 112, pp. 285–298, November 2014.

[20] X. Liang, W. Wu, D. Chang, and F. Zhang, “Real-time modelling of tidal current for navigating underwater glider sensing networks,” in Procedia Computer Science, vol. 10, 2012, pp. 1121–1126.

[21] H. K. Khalil, Nonlinear Systems, 2nd ed. Prentice Hall, 1996.

[22] K. J. Astrom and B. Wittenmark, Adaptive Control, 2nd ed. Addison-Wesley, 1995.

[23] H. K. Khalil, Nonlinear Systems, 3rd ed. Prentice Hall, 2002.

[24] P. Ioannou and J. Sun, Robust Adaptive Control. Prentice Hall, 1996.