reactive controller synthesis for uav mission...

Reactive Controller Synthesis for UAV MissionPlanning

Kyunghoon Cho, Yunho Choi and Songhwai OhDepartment of Electrical and Computer Engineering and ASRI, Seoul National University, Seoul, Korea

(Tel : +82-2-880-1512; E-mail: {kyunghoon.cho,yunho.choi,songhwai.oh}@cpslab.snu.ac.kr)

Abstract—This paper presents an UAV control scheme undermission specifications. A mission is specified in linear temporallogic (LTL) formula. A point-to-point local planner is designedto search a trajectory satisfying the given mission specificationso that the low-level controller of a quadrotor can followthe trajectory. The designed planner adopts control Lyapunovfunctions, and has the following properties: (1) It reactivelygenerates a trajectory with respect to current quadrotor state.(2) The generated trajectory reflects the reference path producedduring the offline planning procedure. The proposed method isflexible in responding to disturbances and guides the quadrotorto satisfy the given mission requirement.

Keywords—Mission planning, Unmanned aerial vehicle, Lineartemporal logic

1. INTRODUCTION

Unmanned aerial vehicles (UAVs) have been attractingattention in broad area including both military and civilianapplications. Recently Amazon has launched a delivery systemwhich utilizes UAVs to deliver packages to customers, andsuch services are anticipated to prevail in the near future.Advances in guidance technologies allow UAVs to performfrom simple missions such as reaching goal to complicatedones like the vehicle routing problem.

Temporal logics describe high-level specifications with tem-poral and logical constraints. Especially, linear temporal logic(LTL) is widely used to specify robot task specifications [1]–[3]. For instance, in pick-up and delivery problems, “(eventu-ally) pick item A and deliver it to customer B” can be writtenin LTL formula φ = ♦(PA ∧ ♦DB).

In this paper, we design a local planner that guides aquadrotor to meet a mission specification, where missionsare represented as a syntactically co-safe LTL formula. Thedesigned planner utilizes control Lyapunov functions so thatthe desired trajectory can be searched with a single-step com-putation. This property allows the planner reactively generatesthe trajectory with respect to quadrotor’s state in real-time. Weadopted the Lyapunov function design method in [4], wherethe Lyapunov functions reflect the reference trajectories soughtin the off-line planning procedure. The proposed planner canhandle disturbances flexibly and lead the quadrotor to fulfillthe given mission requirement.

2. MATHEMATICAL FRAMEWORK

2.1. System ModelThe quadrotor system is modeled as a rigid body with a

twelve-dimensional state vector

x = [ξ, ξ, ν, w].

ξ = (x, y, z) is a position in the three-dimensional Cartesianspace, and ξ = (x, y, z) is a linear velocity. ν = (ψ, φ, θ) cor-responds to Euler angles (roll, pitch, yaw), and w = (p, q, r)is an angular velocity. The position of quadrotor evolvesaccording to the dynamic model:

mξ =

00−mg

+R

00∑i Fi

R =

cψcθ − sφsψsθ −cφsψ cψsθ + cθsφsψcθsψ + cψsφsθ cφcψ sψsθ − cψcθsφ−cφcθ sφ cφcθ

,where m is the mass, Fi is the force of each rotor and R isthe rotation matrix with cθ and sθ denoting cos θ and sin θrespectively. The relationship between the angular velocity wand time derivative of Euler angles can be described as below:pq

r

=

cθ 0 −cφsθ0 1 sφsθ 0 cφcθ

φθψ

.The quadrotor is controlled by the force of each motor Fi.We omit the detailed dynamic model of the quadrotor whichis shown in [5].W ∈ R3 denotes the robot’s workspace which describes the

physical space where a robot operates. The mapping functionh : x→ ξ ∈ W extracts position from quadrotor state.

2.2. Linear Temporal Logic (LTL)Linear temporal logic is a logical formalism, which is suited

for specifying linear time properties. The basic ingredients ofan LTL formula are a set of atomic propositions (APs), andBoolean and temporal operators. LTL formulas are formedaccording to the following grammar [6]: φ ::= true | a |φ1 ∨φ2 | ¬φ | ©φ |φ1Uφ2, where a ∈ AP ,© is the next operator,and U is the until operator. There are also derived operatorssuch as � (always), ♦ (eventually), and ⇒ (implication).

An atomic proposition is a statement or assertion whichmust be true or false. We use Π to denote a collection of allatomic propositions, i.e., Π = {π0, π1, ..., πN}. The semanticsof LTL are defined over infinite traces of a given system, wherea trace σ is a sequence of atomic propositions. For given traceσ, the notation σ � φ denotes that σ satisfies φ.

A. Syntactically Co-Safe LTL: In this work, we treat arestricted form of LTL formulas, which is called syntacticallyco-safe LTL formulas (sc-LTL). An LTL formula φ is co-safeif any infinite trace that satisfies φ has a finite prefix whichalso satisfies φ [7]. Syntactically co-safe LTL formulas are

the LTL formulas that contain only ©,♦,U operators, whenwritten in a positive normal form (¬ appears only in front ofatomic propositions).

B. Automaton Representation: Given a set of atomic propo-sitions Π and a syntactically co-safe LTL formula φ, it isknown that a nondeterministic finite automaton (NFA) canbe constructed [8]. There exists off-the-shelf software whichenables a fast translation from an LTL formula to NFA [6].An NFA can be converted to a deterministic finite automaton(DFA), which is more convenient in computation. A DFA Aφ,corresponding to φ, is defined below.

Definition 1: A deterministic finite automaton DFA is a tupleAφ = (Q,Σ, δ, qinit, Qacc), consisting of (i) a finite set ofstates Q, (ii) a finite alphabet Σ = 2Π, (iii) a transition relationδ : Q× Σ→ Q, (iv) a set of initial states qinit ⊆ Q, and (v)a set of accepting states Qacc ⊆ Q.

C. LTL Semantics over Trajectories: Let the workspace Wcontain regions of interest P = {P1, ..., Pn}, and each atomicproposition πj ∈ Π corresponds to the region of interest Pj . Alabeling function L :W → 2Π maps a state in the workspaceto a set of true propositions. π0 holds true for all workspacebeside regions of interest. For given πi ∈ Π, ¬πi holds truefor {w ∈ W |πi /∈ L(w)}.

For a discretized trajectory x∆t(x0) = x0, x1, ..., xm startingfrom x0 with time step ∆t, a trace can be defined as follows:

trace(x∆t(x0)) = L(h(x0)), L(h(x1)), ..., L(h(xm)).

For a given trace trace(x∆t(x0,u)) and a DFA Aφ, automa-ton states are sequentially updated from the initial automatonstate with respect to the transition relation in Aφ [9]. If areached automaton state is one of accepting states, then thetrajectory x∆t(x0) satisfies φ, denoted as x(x0))�∆tφ.

2.3. Candidate Lyapunov Function

Let us consider the system model described as below:

x = v,

where x(t) ⊂ Rd defines a state at time t ∈ R. A constructionof valid Lyapunov candidates can be done in various ways. Weconstruct Lyapunov candidates from demonstrations , whichwas introduced in [4]. The Lyapunov function is modeled asweighted sum of asymmetric quadratic functions (WSAQF):

V(z) = zTP 0z +

L∑l=1

βl(z)(zTP l(z− µl))2,

where z = x − x∗ and x∗ is a target state. L is the numberof used asymmetric quadratic functions, µl are mean vectorsto shape the asymmetry of the functions and P i ∈ Rd×dare positive definite matrices. The coefficient βl is definedas follows:

βl(z) =

{1 : zTP l(z− µl) ≥ 00 : zTP l(z− µl) < 0.

The main advantage of the WSAQF is the existence of anunique global minimum at z = 0. Through learning parameters

P l and µl, demonstrated trajectories can be reflected in the de-signed function, and it can be done by solving the optimizationproblem:

minimizeNtraj∑i=1

Ni∑k=1

1 + w

2sign(ψi,k)(ψi,k)2 +

1− w2

(ψi,k)2

subject to P l � 0, l = 0, ..., L,

where � denotes the positive definiteness of matrix and w isa small positive scalar. The function ψ is defined as

ψi,k =∇V(xik)T vik‖V(xik)‖ · ‖vik‖

.

3. PROPOSED APPROACH

The proposed approach consists of two major steps: offlineand online planning. In the first step, a reference path satisfy-ing the mission specification is searched by the offline planner.Based on the reference path, a set of Lyapunov functions andtemporal goal points are constructed. The role of Lyapunovfunctions is to find trajectories reaching to the assigned targets,which leads the robot to satisfy the given sc-LTL formula. Thesecond step is the online planning, a short-horizon trajectory iscomputed based on the Lyapunov functions and robot’s currentstate. It gradually guides the robot to meet the given missionspecification, even under disturbances.

3.1. Offline Planning

The offline procedure is given in Algorithm 1. A DFAfor the sc-LTL formula φ is constructed and a satisfied pathis searched by the offline planner (line 1-2). There existvarious works for solving path planning problem under LTLspecifications. We use one of the previous work [9] as theoffline planner.

The found reference path Ξ is decomposed into a set of pathsegments. Let Pi1 , ..., PiN is a sequence of interesting regionswhich the found reference path must pass to meet the missionspecification. If pi is a point in the region of interest Pi, thena sequence of points pi1 , ..., piN satisfies the sc-LTL formulatrace([pi1 , ..., piN ]) � φ. A set of temporal target points ξ∗k isdefined as

ξ∗k = arg minξ∈Ξ

‖ξ − pik‖, ∀k = 1, ..., N.

Path segments Ξ1, ...,ΞN are defined according to ξ∗k , whereeach end point of Ξk is ξ∗k for all k = 1, ..., N . The Lyapunovfunctions V1, ...,Vk are constructed from the path segmentsand the temporal targets [4]. Figure 1 describes the offlineprocedure for the LTL formula φ = ♦(a ∧ ♦b), which means“(eventually) reach region a, then b”. One can notice that thecontour map of Lyapunov function reflects the path segment. Ithas lower cost near path segment and the lowest at the end ofpath segment. Gradient flows of Lyapunov function supportsthe fact that ξ∗k is an unique global minimum point.

3.2. Online Planning

The online procedure is shown in Algorithm 2. An onlinepath planner is designed based on the Lyapunov functions (line3). At each time step, OnlineP lanner generates the desired

(a)

(b)

(c)

(d)

Fig. 1. The offline procedure. (a) The reference solution (orange line)found by the offline planner. (b) Path segments and temporal targets fromthe reference path. (c) Contour maps of Lyapunov functions reflecting pathsegments. (d) Gradient vector flows of Lyapunov function.

Algorithm 1 Offline Step (φ, xinit)1: Aφ ← Automaton(φ)2: Ξ← OfflinePlanner(φ, xinit)3: Ξ1, ...,ΞN , ξ

∗1 , ..., ξ

∗N

4: ← PathDecomposition(Ξ, qinit,Aφ)5: for k = 1, k++, while k ≤ N do6: Vk ← ConstructLyapunovFunc(Ξk, ξ

∗k)

7: end for

path with a short horizon, which leads the robot to satisfy thegiven LTL specification.

In order to find the desired path xdes, a goal point ξdes ∈ Wis defined as

ξdes = h(x) + η∆ξ,

where η is a scaling constant between 0 and 1. ∆ξ is computedby solving the following optimization problem:

minimize∆ξ

1

2(∆ξ)T (∆ξ)

subject to (∇Vξ(ξ))T∆ξ < −ρ,

(1)

where ρ ≥ 0. We set ρ = ‖∇Vξ‖, which is a slight variationof Sontag’s control Lyapunov formula [10]. Notice that (1) is

Algorithm 2 Online Step (xinit, V, ξ∗)1: x← xinit, i← 12: while i ≤ N do3: xdes ← OnlinePlanner(x, Vi, ξ∗i )4: x← Controller(x,xdes)5: if ‖h(x)− ξ∗i ‖ < σthres then6: i← i+ 17: end if8: end while

(a) (b)

Fig. 2. Snapshots of the online procedure for the the LTL formula φ =♦(a∧♦(b)). Different Lyapunov functions are used with respect to the currentstate of quadrotor.

a convex problem and has the global optimum point whichcan be solved analytically

∆ξ =

0 : ξ = ξ∗

− ρ

(∇Vξ(ξ))T (∇Vξ(ξ))∇Vξ(ξ) : ξ =W\ξ∗.

The desired trajectory xdes is generated by smoothly in-terpolating two points h(x) and ξdes. The trajectory xdesis defined as a 7th-order polynomials of time t. It travelsbetween a pair of points h(x) and ξdes, and takes a knownamount of time T . xdes(t) can be represented as xdes(t) =α0 + α1(t/T ) + α2(t/T )2 + ... + α7(t/T )7. Coefficients ofthe polynomial are found under the following constraints:

xdes(0) = h(x), xdes(T ) = ξref ,

xdes(0) = hv(x), xdes(T ) = 0,

xdes(k)(0) = 0 ∀k = 2, ..., 6,

with xdes(k) denoting the k-th derivative of xdes. The coef-

ficients are computed by solving the linear equation with theabove constraints.

The low-level controller for a quadrotor in [5] is used(line 4). The quadrotor is controlled independently by nestedfeedback loops. The inner attitude control loop uses onboardaccelerometers and gyros to control the roll, pitch, and yawangles, while the outer position control loop uses the estimatesof position and velocity of the center of mass to controlthe quadrotor. If the current state is reached to the targetξ∗i , then the Lyapunov function is updated (line 5-7). Thethreshold distance σthres is suitably decided so that pointswithin a ball of radius σthres centered at h(x) are included inthe corresponding region of interest. Updating the Lyapunovfunction changes how the online planner behaves so that itcan continuously guide the robot to satisfy the given LTLspecification.

(a) (b)

(c) (d)

Fig. 3. Simulated trajectories under white Gaussian noises with differentstandard deviations. (a) 0.02 (b) 0.04 (c) 0.1 (d) 0.2

(a) (b)

Fig. 4. (a) Crazyflie 2.0 quadrotor. (b) Vicon MX motion capture system.

4. EXPERIMENTAL RESULTS

In this section, we show simulation and experiment results.We ran simulation on the simple scenario of goal reachingproblem in order to show how the proposed planner behaveswell under disturbances. A 2D plane quadrotor model is testedin MATLAB, and the disturbance is modeled as an whiteGaussian noise affecting on the position of quadrotor. Figure 3shows the simulation result with different standard deviation ofwhite Gaussian noise. Notice that the proposed planner flexiblyresponds to external noises and guides the quadrotor to reachthe target point.

Also we have conducted experiments using quadrotorrobots. The robot platform used in the experiment is Crazyflie2.0, an open source nano quadrotor platform developedby Bitcraze. We consider a quadrotor moving inside theworkspace of 3.4m×2.4m. Both the position and orientationof a quadrotor are measured by a Vicon MX motion capturesystem (Figure 4).

We consider four different scenarios with different taskspecifications. For Scenario 1 and 2, sequential missions areassigned and coverage missions are given in Scenario 3 and4. LTL formulas for assigned missions are states as below.

Scenario 1 : φ = ♦(a ∧ ♦(b ∧ (♦c)))Scenario 2 : φ = ♦(a ∧ ♦(b ∧ ♦(c ∧ (♦d))))Scenario 3 : φ = ♦(a) ∧ ♦(b) ∧ ♦(c)Scenario 4 : φ = ♦(b) ∧ ♦(c) ∧ ♦(d)

The results of experiments are shown in Figure 5. It showsthat a quadrotor has successfully carried out assigned missionsin real environments.

(a) Scenario 1 (b) Scenario 2

(c) Scenario 3 (d) Scenario 4

Fig. 5. Results from experiments with Crazyflie 2.0 quadrotor robots. Theblack dashed line is the reference trajectory from the offline planner. Theorange line is the actual traveled trajectory of a quadrotor.

ACKNOWLEDGEMENT

This work was supported by Basic Science Research Pro-gram through the National Research Foundation of Korea(NRF) funded by the Ministry of Science, ICT Future Planning(NRF-2017R1A2B2006136).

REFERENCES

[1] G. E. Fainekos, H. Kress-Gazit, and G. J. Pappas, “Temporal logicmotion planning for mobile robots,” in Proc. of the IEEE InternationalConference on Robotics and Automation, Apr. 2005.

[2] P. Tabuada and G. J. Pappas, “Linear time logic control of discrete-time linear systems,” IEEE Transactions on Automatic Control, vol. 51,no. 12, pp. 1862–1877, Dec. 2006.

[3] T. Wongpiromsarn, U. Topcu, and R. M. Murray, “Receding horizontemporal logic planning for dynamical systems,” in Proc. of the IEEEConference on Decision and Control, Dec. 2009.

[4] S. M. Khansari-Zadeh and A. Billard, “Learning control lyapunovfunction to ensure stability of dynamical system-based robot reachingmotions,” Robotics and Autonomous Systems, vol. 62, no. 6, pp. 752–765, 2014.

[5] N. Michael, D. Mellinger, Q. Lindsey, and V. Kumar, “The graspmultiple micro-uav testbed,” IEEE Robotics & Automation Magazine,vol. 17, no. 3, pp. 56–65, 2010.

[6] C. Baier and J.-P. Katoen, Principles of model checking. MIT pressCambridge, 2008.

[7] A. P. Sistla, “Safety, liveness and fairness in temporal logic,” FormalAspects of Computing, vol. 6, no. 5, pp. 495–511, Sep. 1994.

[8] O. Kupferman and M. Y. Vardi, “Model checking of safety properties,”Formal Methods in System Design, vol. 19, no. 3, pp. 291–314, Nov.2001.

[9] A. Bhatia, L. E. Kavraki, and M. Y. Vardi, “Sampling-based motionplanning with temporal goals,” in Proc. of the IEEE InternationalConference on Robotics and Automation, May 2010.

[10] E. D. Sontag, “A universalconstruction of artstein’s theorem on nonlinearstabilization,” Systems & control letters, vol. 13, no. 2, pp. 117–123,1989.

reactive controller synthesis for uav mission...

Documents