
    Experimental Demonstrations of Real-time MILP Control

    Arthur Richards∗, Yoshiaki Kuwata,† and Jonathan How‡

    MIT Dept. of Aeronautics and Astronautics§

    ABSTRACT

    This paper presents results from recent hardware experiments using two forms of on-line Mixed-integer Linear Programming (MILP). The demonstrations were performed on wheeled ground vehicles, with a view to future use on autonomous teams of UAVs. The first experiment uses MILP at a high level to account for a limited detection horizon while maneuvering in the presence of obstacles. The second performs low-level control with MILP to reject disturbances while performing a rendezvous. The paper describes the design of the multi-vehicle testbed and discusses the architecture and implementation details of the experiments. In both cases, the on-line MILP compensated for the uncertainties present and successfully completed the maneuvers.

    INTRODUCTION

    Two forms of predictive control based on Mixed-integer Linear Programming (MILP) are demonstrated on a multi-vehicle testbed. The objective of these controllers is to enable coordinated operation of multiple Unmanned Aerial Vehicles (UAVs) to achieve high-level mission goals in a challenging environment [1]. The experiments, performed using wheeled ground vehicles to emulate the UAVs, highlight the transition from simulation to hardware implementation. A key issue is the interaction between high- and low-level controllers within the overall control system architecture. Significant dynamic uncertainty is present, ranging from disturbance forces acting on the vehicles to macroscopic changes in the environment, such as the discovery of new targets or obstacles. The two demonstrations in this paper show how MILP can be applied at different levels of the control system to account for these different types of uncertainty.

    ∗ Research Assistant, [email protected]
    † Research Assistant, [email protected]
    ‡ Associate Professor, [email protected], Senior Mbr. AIAA.
    § MIT 33-328, 77 Mass. Ave., Cambridge, MA 02139.

    Previous work has demonstrated the use of MILP for off-line trajectory design for vehicles subject to avoidance constraints [4, 7, 8]. MILP enables the inclusion of non-convex constraints and discrete decisions in the trajectory optimization. Binary decision variables allow the choice of whether to pass “left” or “right” of an obstacle, for example, or the discrete assignment of vehicles to targets, to be included in the planning problem. MILP can also be used on-line, replanning in real-time to account for dynamic uncertainty. This is known as either Model Predictive Control (MPC) or receding horizon control [10]. In particular, the use of on-line MILP for vehicle path-planning with avoidance has been proposed in two forms. In one, a MILP optimization is performed over a shortened horizon, with the cost-to-go represented by a realistic path approximation [2]. In this paper, this will be referred to as the cost-to-go method. In the other form, known here as the whole trajectory method, the entire trajectory is redesigned [10]. More recent developments of the latter scheme include a variable terminal time, which offers stability for a general target state, and modifications for robustness [5]. Both methods have been considered analytically and in simulation. The innovation of this paper is their implementation with hardware in the loop, including the details of application and the architectures used.

    MPC has been successfully applied to chemical process control [6]. However, application to systems with faster dynamics, such as those found in the field of aerospace, was impractical until the recent, on-going improvements in computation speed. Recent hardware work in this field includes the on-line use of NTG nonlinear optimization for control of an aerodynamic system [11, 12] and of a formation of highly-nonlinear vehicles [13, 14].

    The testbed used in this work consists of remote-controlled, wheel-steered miniature trucks. For these experiments, they are operated at constant speed. Due to limited steering angles, their turn rate is restricted. Therefore, the trucks can be used as representative models of UAVs, which would typically operate at a constant, nominal speed, flying at a fixed altitude and with turning rate limited by the achievable bank angle. The vehicles have on-board computers and wireless LAN communications, offering a flexible architecture for demonstrating co-operative control.

    American Institute of Aeronautics and Astronautics

    AIAA Guidance, Navigation, and Control Conference and Exhibit, 11-14 August 2003, Austin, Texas

    AIAA 2003-5802

    Copyright © 2003 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.

    In the first demonstration, the cost-to-go method [2] is used as a “high-level” controller to compensate for uncertainty in the environment. It designs a series of waypoints for the truck to follow. A “low-level” controller then steers the truck to move along this path. An obstacle exists between the vehicle and its goal, but the controller does not become aware of this until it comes within a “detection horizon” of the obstacle. This represents operation using sensors with limited range, such as radar, which can detect obstacles only within a certain radius.

    In the second demonstration, the low-level controller is removed and the on-line MILP is used to compensate for disturbances acting upon the vehicle. The whole-trajectory formulation is used [5], including modifications to ensure feasibility of the MILP at each step. Two vehicles are used: one is driven open-loop as a target while the other is controlled by MPC to perform a rendezvous with the first. The MPC must compensate for uncertain motion of both trucks. For the purposes of the experiments, the choice of a rendezvous maneuver adds the unknown target motion to the sources of uncertainty. In practice, such a maneuver could represent in-flight refuelling of a UAV.

    This paper begins with a description of the truck testbed used, which is common to both demonstrations. Each is then discussed separately, including specific objectives, controller formulations, integration with the testbed, test scenarios, procedures and results.

    APPARATUS DESCRIPTION

    The testbed for this experiment consists of two self-contained, remotely-controlled trucks. They can be operated in two distinct modes. Fig. 1(a) shows the architecture and loop closure in waypoint mode, in which the high-level controller provides a set of waypoints to the trucks and a low-level controller moves between them. In direct steer mode, shown in Fig. 1(b), the high-level controller generates steering commands for the truck, while a low-level controller maintains a desired speed. The major components are described separately in the following sections.

    Fig. 1: Testbed Architecture showing Control Loops in Both Modes. (a) Waypoint Mode. (b) Direct Steer Mode.

    Fig. 2: Testbed vehicles.

    Trucks

    Fig. 2 shows three of the trucks. Each is based on a commercially available model truck. Each carries a laptop PC on-board, the functions of which are described in the next section. Physically, the laptop is connected by serial links to a PWM converter, which generates signals for the motor speed controller and steering servo, and to a GPS receiver. The batteries and power supply circuitry complete the on-board hardware, making each truck an independent unit. A video camera can also be mounted on-board for demonstration purposes.

    On-Board Computer

    The laptop computer on each truck performs local estimation and low-level control functions. It performs precise relative position and velocity estimation using differential carrier-phase GPS [17]. A fixed GPS base station, not shown in Fig. 1, communicates with the on-board PC via wireless LAN to enable differential GPS. The estimation is performed at a rate of 5 Hz. A speed control loop is closed on the laptop, generating motor servo commands in order to maintain a constant reference speed. In waypoint mode, an additional steering controller acts to follow a list of waypoints. In direct steer mode, the steering servo command is relayed directly from a higher-level controller. The higher-level commands (reference speed, waypoint list and steering command) are sent via wireless LAN from the planner computer. To close the loop, the GPS state estimates are sent to the planner at a rate of 1 Hz.

    Planner

    This laptop computer performs the MILP optimizations. In waypoint mode, the first few waypoints of the plan are sent to the base-station. In direct-steer mode, a steering angle is computed for the first plan step and sent to the base-station for direct relay to the truck. MATLAB is used to manage communications, data recording, visualization and data conversion. The planner uses CPLEX optimization software [16] to solve the MILP problems, using AMPL [15] as the interface between CPLEX and MATLAB.

    Operating Limits

    To achieve as much as possible in a confined space, the trucks are operated at a low speed (nominal running speed is 0.5 m/s). At lower speeds, friction effects made the speed control unreliable. At this speed, the available steering range sets the minimum turn radius to approximately 1 m.

    COST-TO-GO DEMONSTRATION

    In these experiments, the receding-horizon MILP formulation with the approximate cost-to-go function [2] is used to compensate for uncertainty in the obstacle environment beyond a limited detection horizon. In this case, disturbances acting on the truck are assumed to be handled by the low-level feedback control system. The detection radius effect is simulated, since the actual trucks have no real obstacle sensors. For simplicity, the controller is given the size and location of the complete obstacle as soon as its detection radius intersects part of the obstacle [3].

    The following subsection reviews the receding horizon formulation. In the final subsection, the results are presented and interpreted.

    Review of Receding Horizon Control

    For review, the cost-to-go formulation for the minimum time path planning problem is briefly presented here [2]. This algorithm designs the minimum-time path to a fixed goal while avoiding a known set of obstacles. Operation in the presence of the limited detection horizon is described in the next subsection. Fig. 3 gives an overview of the method, including the different levels of resolution involved.

    The control strategy comprises two phases: cost estimation and trajectory design. The cost estimation phase estimates the cost-to-go from each obstacle corner. The procedure is as follows:

    1. Form a graph by connecting every obstacle corner to all other obstacle corners using straight lines.

    2. Remove those edges that intersect obstacles.

    3. Use Dijkstra’s Single-Source Shortest-Path Algorithm to find the “tree” of optimal paths from the goal to each corner.

    4. Store the cost-to-go from each corner along the calculated path.

    Fig. 3: Overview of Cost-to-Go Method.
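    The four cost-estimation steps can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names, the representation of obstacles as axis-aligned boxes `((x_low, y_low), (x_high, y_high))`, and the choice to store path length (which would be divided by the maximum speed to give a time) are all assumptions made here.

```python
import heapq
import math

def segment_hits_box(p, q, box):
    """True if segment p-q crosses the open interior of an axis-aligned box."""
    (xlo, ylo), (xhi, yhi) = box
    t0, t1 = 0.0, 1.0  # parameter interval of the segment still inside every slab
    for start, delta, lo, hi in ((p[0], q[0] - p[0], xlo, xhi),
                                 (p[1], q[1] - p[1], ylo, yhi)):
        if delta == 0.0:
            if start <= lo or start >= hi:
                return False  # parallel to this slab, on or outside its boundary
        else:
            ta, tb = sorted(((lo - start) / delta, (hi - start) / delta))
            t0, t1 = max(t0, ta), min(t1, tb)
    return t1 - t0 > 1e-9  # merely touching a corner or an edge does not count

def cost_map(goal, boxes):
    """Shortest obstacle-free path length from every box corner to the goal."""
    corners = [c for (xlo, ylo), (xhi, yhi) in boxes
               for c in ((xlo, ylo), (xlo, yhi), (xhi, ylo), (xhi, yhi))]
    nodes = [goal] + corners
    cost = {n: math.inf for n in nodes}
    cost[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:  # step 3: Dijkstra, rooted at the goal
        c, n = heapq.heappop(heap)
        if c > cost[n]:
            continue  # stale heap entry
        for m in nodes:
            # Steps 1-2: an edge exists only if it crosses no obstacle.
            if m != n and not any(segment_hits_box(n, m, b) for b in boxes):
                new = c + math.dist(n, m)
                if new < cost[m]:
                    cost[m] = new  # step 4: only the cost value is stored
                    heapq.heappush(heap, (new, m))
    return cost
```

    For example, `cost_map((3.0, 0.0), [((1.0, -1.0), (2.0, 1.0))])` returns a dictionary keyed by the goal and the four corners of the single box. Visibility is tested edge by edge inside the search, which is quadratic in the number of corners but adequate for a handful of obstacles.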

    The development of this set of approximate costs is based on the observation that the true optimal paths (i.e. minimum distance) tend to follow the edges and corners of the obstacles. In Fig. 3, each of the “cost points” is marked by + and the associated paths are the solid black lines. Note that it is not necessary to store the paths: only the cost values are needed.

    In the trajectory design phase, MILP optimizations are solved to design a series of short trajectory segments over a planning horizon with length $N$ steps. In Fig. 3, this section of the plan is shown by the dashed black line. Each optimization finds a control sequence $\{\mathbf{u}_{k|k} \ldots \mathbf{u}_{k|(k+N-1)}\}$ and corresponding states $\{\mathbf{x}_{k|k} \ldots \mathbf{x}_{k|(k+N)}\}$, where $\mathbf{u}_{k|j}$ denotes the control designed at time $k$ for application at time $j$ and each state is comprised of a position and velocity, i.e. $\mathbf{x} = (\mathbf{r}\ \mathbf{v})$. These states are subject to the dynamics constraint

    $$\forall j \in [k \ldots (k+N-1)] \quad \mathbf{x}_{k|(j+1)} = \mathbf{A}\mathbf{x}_{k|j} + \mathbf{B}\mathbf{u}_{k|j} \tag{1}$$
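    As an illustration of constraint (1), the sketch below builds discretized matrices for a 2-D point mass and applies one step. The zero-order hold on the force, the unit default mass, the state ordering `(rx, ry, vx, vy)` and the function names are assumptions made for this example.

```python
def point_mass_matrices(dt, mass=1.0):
    """A and B for state (rx, ry, vx, vy) and force input (fx, fy), zero-order hold."""
    g = dt * dt / (2.0 * mass)  # position gained from a constant force over one step
    h = dt / mass               # velocity gained from a constant force over one step
    A = [[1.0, 0.0, dt, 0.0],
         [0.0, 1.0, 0.0, dt],
         [0.0, 0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    B = [[g, 0.0],
         [0.0, g],
         [h, 0.0],
         [0.0, h]]
    return A, B

def step(A, B, x, u):
    """Apply constraint (1) once: x_next = A x + B u."""
    return [sum(a * xi for a, xi in zip(row_a, x)) +
            sum(b * ui for b, ui in zip(row_b, u))
            for row_a, row_b in zip(A, B)]
```

    Calling `step` repeatedly over a planned force sequence reproduces the state sequence that the MILP treats as decision variables tied together by (1).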

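    where $\mathbf{A}$ and $\mathbf{B}$ are the discretized system dynamics matrices for a point mass in 2-D free-space. The speed and force are limited, using the following approximation for a 2-norm [8]

    $$\forall j \in [k \ldots (k+N)] \quad \mathbf{W}\mathbf{v}_{k|j} \le \mathbf{1}\, v_{\max} \tag{2}$$

    $$\forall j \in [k \ldots (k+N)] \quad \mathbf{W}\mathbf{u}_{k|j} \le \mathbf{1}\, f_{\max} \tag{3}$$

    where $\mathbf{1}$ is a vector of 1’s and $\mathbf{W}$ has the structure

    $$\mathbf{W} = \begin{pmatrix} \vdots & \vdots \\ \cos(\theta_i) & \sin(\theta_i) \\ \vdots & \vdots \end{pmatrix} \tag{4}$$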

    The values of $\theta_i$ are uniformly distributed between 0 and $2\pi$. By limiting the components of the vector quantities in each of these directions, an approximate limit on their two-norms is effected in linear form [8]. A point mass dynamic model subject to the 2-norm force and speed constraints (2) and (3) forms a good approximate model for limited turn-rate vehicles, provided that the optimization favors the minimum time, or minimum distance, path [8].
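    A small sketch of the linear two-norm approximation in (2)-(4) follows. The number of directions `n_dirs` and the function names are hypothetical; the text does not say how many rows of W were used. With evenly spaced directions, the largest projection underestimates the true norm by at most a factor of cos(π/n_dirs), so the linear constraint admits vectors slightly longer than the limit.

```python
import math

def direction_rows(n_dirs):
    """Rows (cos θ_i, sin θ_i) of W for n_dirs evenly spaced angles in [0, 2π)."""
    return [(math.cos(2 * math.pi * i / n_dirs),
             math.sin(2 * math.pi * i / n_dirs)) for i in range(n_dirs)]

def within_limit(vec, limit, W):
    """Check W vec <= 1 * limit row by row, as in constraints (2) and (3)."""
    return all(c * vec[0] + s * vec[1] <= limit for c, s in W)

def approx_norm(vec, W):
    """Largest projection of vec onto the rows of W: a lower bound on its 2-norm."""
    return max(c * vec[0] + s * vec[1] for c, s in W)
```

    Increasing `n_dirs` tightens the approximation at the cost of more constraint rows per time step.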

    The MILP also chooses a “visible point” $\mathbf{r}_{\mathrm{vis}}$ which is visible from the “terminal point” $\mathbf{r}_{k|k+N}$ and from which the cost-to-go, $C_{\mathrm{vis}}$, has been estimated in the previous phase. These points are shown in Fig. 3. Note that the cost map is quite sparse: cost-to-go values are only known for the corners of the obstacles. The distance from the terminal point to the visible point is approximated in the cost function, allowing the cost-to-go to be evaluated at a freely-chosen terminal point using only a sparse grid of cost points. Since the optimal trajectories usually pass through obstacle vertices, this is a valid approximation and the performance remains close to the true optimal [2]. The map does not need to be recalculated at each step: it is valid as long as the obstacles are known and stationary. Changes in the obstacle field are dealt with by replanning, as discussed in the next section.

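    The following avoidance constraints are applied at each point of the dynamic segment and at intermediate points between the terminal point and the visible point. Rectangular obstacles are used in this formulation, and are described by their lower left corner $(u_{\mathrm{low}}, v_{\mathrm{low}})$ and upper right corner $(u_{\mathrm{high}}, v_{\mathrm{high}})$. To avoid collisions, the following constraints must be satisfied at each point on the trajectory [9]

    $$\begin{aligned} r_{k|j,1} &\le u_{\mathrm{low}} + R\, b_{\mathrm{in},1} \\ r_{k|j,1} &\ge u_{\mathrm{high}} - R\, b_{\mathrm{in},2} \\ r_{k|j,2} &\le v_{\mathrm{low}} + R\, b_{\mathrm{in},3} \\ r_{k|j,2} &\ge v_{\mathrm{high}} - R\, b_{\mathrm{in},4} \end{aligned} \tag{5}$$

    $$\sum_{j=1}^{4} b_{\mathrm{in},j} \le 3 \tag{6}$$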
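    The logic of (5) and (6) can be checked numerically with the sketch below. It assumes that R is a large positive constant (the usual big-M device; this excerpt does not define it) and that the b values are binary. The function names and the default value of R are illustrative only.

```python
def avoidance_binaries(r, box):
    """Pick binaries b_1..b_4 for (5); b_i = 0 keeps inequality i unrelaxed."""
    (u_low, v_low), (u_high, v_high) = box
    return [0 if r[0] <= u_low else 1,
            0 if r[0] >= u_high else 1,
            0 if r[1] <= v_low else 1,
            0 if r[1] >= v_high else 1]

def satisfies_avoidance(r, box, b, R=1e4):
    """Check (5) and (6) for point r, rectangle box and binaries b."""
    (u_low, v_low), (u_high, v_high) = box
    return (r[0] <= u_low + R * b[0] and
            r[0] >= u_high - R * b[1] and
            r[1] <= v_low + R * b[2] and
            r[1] >= v_high - R * b[3] and
            sum(b) <= 3)
```

    Constraint (6) forces at least one of the four inequalities to hold without relaxation, which places the point on the outer side of at least one face of the rectangle. A point strictly inside the rectangle therefore has no feasible choice of binaries.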

    The trajectory cost involves three terms: the cost of the steps to the terminal point $\mathbf{r}_{k|k+N}$; the approximate straight-line cost from the terminal point to the visible point, $\ell_{\mathrm{vis}} = \|\mathbf{r}_{k|k+N} - \mathbf{r}_{\mathrm{vis}}\|$; and the cost from the visible point to the goal, $C_{\mathrm{vis}}$. Referring to Fig. 3, these represent the black dashed line, the gray line, and the black solid line, respectively. Assuming that the first part of the plan is flown at constant speed, its distance is constant and it can be ignored in the cost function. This assumption is valid for the minimum distance, limited speed problem [8]. Only the distance to the visible point and the remaining straight-line approximate path are included. The actual cost function is

    $$J^* = \min_{\mathbf{u}(\cdot)} \left[ \frac{\ell_{\mathrm{vis}}}{v_{\max}} + C_{\mathrm{vis}} \right] \tag{7}$$

    with Eqs. (1)-(6) as constraints. The following constraints determine $\ell_{\mathrm{vis}}$

    $$\mathbf{W}(\mathbf{r}_{k|k+N} - \mathbf{r}_{\mathrm{vis}}) \le \mathbf{1}\, \ell_{\mathrm{vis}} \tag{8}$$

    This approximates the two-norm of the distance from the terminal point to the visible point, by approximating the smallest circle to enclose that vector.

    Detection Horizon Application

    This section describes the application of the formulation in the previous section to the truck demonstration. The truck was modeled as having maximum speed 0.5 m/s, equal to the nominal running speed. The modeled maximum turn rate was 7 deg/s, chosen to give smooth operation when using the low-level steering controller: the planner sends waypoints rather than steering commands in this demonstration. The system was discretized with a time-step of four seconds, long enough for the transient of the low-level steering controller to decay. The planning horizon was eight steps, equivalent to up to 25.6 m. The execution horizon was one step: after each plan was found, only one new waypoint was uploaded to the truck.

    The initial condition for each new plan is taken as the first waypoint of the previous plan. This allows the next plan to be made before the execution point is reached, reducing the control delay to that required for communication. This demonstrates the two-level architecture of the detection horizon demonstration. It is the responsibility of the low-level steering controller to see that the vehicle reaches the prescribed waypoint. The high-level MILP controller closes a different real-time loop, as it compensates for obstacle information that only becomes available as the vehicle moves.

    The detection horizon was 16 m. Initially, the cost map is computed using only those obstacles within the detection horizon of the starting point. At each step, before the trajectory design phase, the circle enclosed by the detection horizon around the current position is checked for intersections with new obstacles. If any are “found,” the cost estimation is repeated including the new obstacle. Note that this assumes that complete knowledge of the obstacle is available as soon as part of it is detected. On-going research is investigating how to include partially-detected obstacles in the path planning [3].

    In both the cost estimation and trajectory design phases, it is necessary to include an enlarged model of the obstacle. Since the avoidance constraints are only applied at discrete time steps, it would be possible for the planned trajectory to “cut the corner” of the obstacle between time points. To prevent collisions with the real obstacle, the obstacle models are $(v_{\max}\Delta t/\sqrt{2})$ larger in each direction. This is the maximum incursion distance.
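    The enlargement rule can be written as a short helper. The box representation and function name are assumptions for this sketch; the margin formula is the one quoted in the text.

```python
import math

def enlarge_obstacle(box, v_max, dt):
    """Grow an axis-aligned box by v_max * dt / sqrt(2) on every side."""
    margin = v_max * dt / math.sqrt(2)
    (x_low, y_low), (x_high, y_high) = box
    return (x_low - margin, y_low - margin), (x_high + margin, y_high + margin)
```

    With the values used in this demonstration (0.5 m/s and a four-second step), the margin works out to about 1.41 m on each side.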

    In summary, the control scheme for the cost-to-go method is as follows:

    1. If the detection circle intersects any “new” obstacles (or if starting), compute the cost map, including all obstacles currently known.

    2. Solve MILP (7) subject to (1)-(6) and (8), starting from the last waypoint uploaded (or initial state if starting).

    3. Upload the first waypoint $\mathbf{r}_{k|k+1}$ of the new plan to the truck.

    4. Wait until the truck reaches the execution horizon from the current point.

    5. Go to 1.
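    The five steps above can be sketched as a loop skeleton. Every callable passed in (`sense`, `build_cost_map`, `solve_milp`, `upload`, `wait`, `done`) is a hypothetical placeholder for testbed functionality that the paper describes only in prose; the sketch shows the control flow, not the authors' MATLAB/AMPL/CPLEX code.

```python
def cost_to_go_loop(initial_state, sense, build_cost_map, solve_milp,
                    upload, wait, done, max_iters=1000):
    """Receding-horizon loop: replan from the last uploaded waypoint each step."""
    known = set()        # obstacles discovered so far
    cost_map = None
    plan_start = initial_state
    for _ in range(max_iters):
        # Step 1: rebuild the cost map when starting or when new obstacles appear.
        found = set(sense()) - known
        if found or cost_map is None:
            known |= found
            cost_map = build_cost_map(known)
        # Step 2: solve the MILP from the last uploaded waypoint (or initial state).
        waypoint = solve_milp(plan_start, cost_map, known)
        # Step 3: upload only the first waypoint of the new plan.
        upload(waypoint)
        # Step 4: block until the truck reaches the execution horizon.
        wait()
        plan_start = waypoint
        if done(waypoint):
            break
    return plan_start    # Step 5 is the loop itself.
```

    Planning from the uploaded waypoint rather than the measured position mirrors the description above, in which the next plan is prepared while the truck is still travelling.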

    Detection Horizon Tests

    The test scenario is shown in Fig. 4(a). The truck begins at the origin (0, 0), heading in the direction of the goal (−50, −20). A single obstacle, marked by the solid black line, blocks the direct path to the goal. However, it is beyond the detection horizon (shown by the large circle) and hence not initially known to the planner. The obstacle was marked by cones in the test area. To successfully complete the test, the vehicle must move around the obstacle, once it is aware of its presence, and reach the goal, heading in the −x direction.

    When the experiment was performed, the truck was observed to begin by heading straight to the goal. When it approached the obstacle, it turned from the straight path and went around the cones marking the obstacle. It then proceeded straight to the goal. Plots of the result are shown in Fig. 4. In Fig. 4(a), the first plan from initial position over the execution horizon has been uploaded to the truck controller. Since it is not aware of the obstacle, the planner has commanded a path directly towards the goal. The truck starts moving and the planner starts solving for the next plan, starting from the execution horizon of the previous plan.

    Fig. 4: The trajectory generated by planner and the experimental data of the actual trajectory of the truck. The circle shows the detection horizon. Panels (a)-(f): Position in X-Y Frame (truck No. 1), axes X [m] and Y [m].

    Two seconds later, shown in Fig. 4(b), the truck is about to reach the waypoint sent from the first plan. Having completed the new plan, the next waypoint has been uploaded. The obstacle has not yet been detected, and the designed path is still straight to the goal.

    In Fig. 4(c), the obstacle is detected when it comes within the detection horizon of the truck. The next waypoint is uploaded, still heading straight to the goal, since the last plan was not aware of the obstacle. However, the planner now recomputes the cost map, including the obstacle, before making the next plan. The new map includes the enlarged obstacle, shown by the dashed line. In Fig. 4(d), the next plan reflects the new obstacle information and leads the truck to go around (above in the figure) the obstacle.

    Figs. 4(e) and 4(f) show the resulting trajectory to the goal. At the end, the truck aligns its heading with the −x direction, as specified in the terminal condition.

    These initial results show that the cost-to-go method can control a real vehicle in an uncertain environment. The control architecture allows the on-line planner to compensate for obstacle discovery beyond the execution horizon while low-level feedback rejects vehicle disturbance within that horizon.

    MPC FOR RENDEZVOUS

    In these experiments, the low-level feedback controller is removed. One truck is driven open-loop (“the target”) while the other (“the chaser”) is controlled by the whole trajectory MPC to rendezvous with the target. The MPC must now compensate for uncertainties including model error, disturbances and the uncertain motion of the target truck. The formulation used for these experiments was introduced in Ref. [5]. It is reviewed briefly here, followed by specific details related to these experiments, before the results are presented.

    Controller Formulation

    The controller in this section uses the whole-trajectory MPC formulation described in Ref. [5]. It has been shown to guarantee maneuver completion in finite time. It is presented in brief form here. As in the cost-to-go method, the core of the problem is to design a sequence of states $\mathbf{x}$ and controls $\mathbf{u}$ subject to the point mass dynamics constraints (1)-(3). However, in contrast to the previous method, there is no cost-to-go at the end of the horizon, and the maneuver must be completed before the $N$th step. “Completion” is achieved when the states are steered into the region defined by $\mathbf{P}\mathbf{x} \le \mathbf{q}$. Typically $\mathbf{P}$ and $\mathbf{q}$ define a position box, requiring the vehicle to be within some distance of a target point. They may also apply velocity constraints, such as some maximum velocity at the finish or some prescribed direction of travel.

    There are additional discrete (binary) decision variables in this formulation: an ‘input’ $\{v_{k|k} \ldots v_{k|(k+N-1)}\}$ and a ‘state’ $\{y_{k|k} \ldots y_{k|(k+N)}\}$ for which $y_k = 0$ is defined to mean that the maneuver has been completed at or before step $k$. The following two constraints define a state machine for the discrete state $y$ and coupling with the continuous states:

    $$\forall j \in [k \ldots (k+N-1)]$$

    $$y_{k|(j+1)} = y_{k|j} - v_{k|j} \tag{9}$$

    $$\mathbf{P}(\mathbf{A}\mathbf{x}_{k|j} + \mathbf{B}\mathbf{u}_{k|j}) \le \mathbf{q} + \mathbf{1} M (1 - v_{k|j}) \tag{10}$$

    where $M$ is a large positive number. If $v_{k|j} = 1$, then (9) implies $y$ transitions from 1 at step $j$ to 0 at step $j+1$ while (10) implies that the states are in the target region at step $j+1$.
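    The behaviour of the discrete state machine in (9) and the big-M coupling in (10) can be illustrated with the sketch below. The function names, the list-based representation and the default value of `big_m` are assumptions for this example.

```python
def completion_states(v_inputs):
    """Propagate (9) from y = 1; v_inputs holds binaries v_{k|k}..v_{k|(k+N-1)}."""
    y = [1]
    for v in v_inputs:
        y.append(y[-1] - v)
    if any(s not in (0, 1) for s in y):
        raise ValueError("y must stay binary: at most one transition is allowed")
    return y

def target_constraint_holds(p_x_next, q, v, big_m=1e6):
    """Check (10): rows of P x_{k|(j+1)} against q, relaxed by big_m unless v = 1."""
    return all(lhs <= rhs + big_m * (1 - v) for lhs, rhs in zip(p_x_next, q))
```

    For instance, `completion_states([0, 0, 1, 0])` gives `[1, 1, 1, 0, 0]`: the single input at the third step switches y from 1 to 0, and y then stays at 0. Because y must remain binary, the input can be 1 at most once in any plan.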

    The cost function is a combination of the maneuver time and a low weighting on the applied force

    $$J^* = \min_{\{\mathbf{x},\mathbf{u},y,v\}} \sum_{j=k}^{(k+N-1)} \left( \epsilon |\mathbf{u}_{k|j}| + y_{k|j} \right) \tag{11}$$

    where the small weighting $\epsilon \ll 1$ on force has been shown to help the solution process [18]. Since $y = 1$ implies that the maneuver is not finished, the $y_{k|j}$ term included in the summation in (11) is effectively a penalty on maneuver time, in units of time steps.

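    The following constraints bound the approximated two-norm of the force input, with robustness modifications to tighten the limit in the far future

    $$\begin{aligned} \mathbf{W}\mathbf{u}_{k|k} &\le \mathbf{1}\, f_{\max} \\ \mathbf{W}\mathbf{u}_{k|k+1} &\le \mathbf{1}\,(f_{\max} - \beta_1) \\ \mathbf{W}\mathbf{u}_{k|j} &\le \mathbf{1}\,(f_{\max} - \beta_1 - \beta_2) \quad \forall j > k+1 \end{aligned} \tag{12}$$

    where the matrix $\mathbf{W}$ is defined in (4). Similar constraints are applied to the velocity

    $$\begin{aligned} \mathbf{W}\mathbf{v}_{k|k} &\le \mathbf{1}\, M \\ \mathbf{W}\mathbf{v}_{k|k+1} &\le \mathbf{1}\,(v_{\max} + \alpha) \\ \mathbf{W}\mathbf{v}_{k|j} &\le \mathbf{1}\, v_{\max} \quad \forall j > k+1 \end{aligned} \tag{13}$$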

    Note that the constraint is completely relaxed on the first step, since this is fixed by the initial conditions. It was shown in Ref. [5] that, given a bound on the disturbances, values of $\alpha$, $\beta_1$ and $\beta_2$ can be chosen such that the MILP is guaranteed to be feasible under the action of the disturbance. If a solution exists at time step $k$, then it can be shown that a candidate solution, involving a two-step correction for the disturbance applied at $k$, is feasible at step $(k+1)$. Therefore, feasibility at the first step implies feasibility at all future steps.
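    The step-dependent limits in (12) and (13) amount to a simple schedule over the plan index. The helper names and the default `big_m` below are illustrative assumptions; the limits themselves follow the two constraint sets.

```python
def force_limit(j, k, f_max, beta1, beta2):
    """Right-hand side of (12) for plan step j of the plan made at time k."""
    if j == k:
        return f_max
    if j == k + 1:
        return f_max - beta1
    return f_max - beta1 - beta2

def speed_limit(j, k, v_max, alpha, big_m=1e6):
    """Right-hand side of (13): relaxed at j = k, loosened by alpha at j = k + 1."""
    if j == k:
        return big_m
    if j == k + 1:
        return v_max + alpha
    return v_max
```

    The force margin shrinks into the future, reserving control authority for the two-step disturbance correction, while the speed limit is loosest at the start of the plan, where the state has already been perturbed.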

    Equations (12) and (13) are “operating constraints,” applied at every step, and can be expressed in the general form

    $$\mathbf{C}_1(j)\mathbf{x}_{k|j} + \mathbf{C}_2(j)\mathbf{u}_{k|j} + \mathbf{C}_3(j) \le \mathbf{0}$$

    where $\mathbf{C}_i(j)$ are derived from (12) and (13). For simplicity, avoidance constraints were not included in the rendezvous experiments. They can be applied if necessary by the inclusion of auxiliary binary vectors in the operating constraints. To guarantee finite-time completion, the operating constraints must be relaxed in the plan after the projected time of completion [5]. Hence the operating constraints are implemented in the following form

    $$\mathbf{C}_1(j)\mathbf{x}_{k|j} + \mathbf{C}_2(j)\mathbf{u}_{k|j} + \mathbf{C}_3(j) \le \mathbf{1} M (1 - y_{k|j}) \tag{14}$$

    The following boundary conditions are also applied

    $$\mathbf{x}_{k|k} = \mathbf{A}_p \mathbf{x} \tag{15}$$

    $$y_{k|k} = 1 \tag{16}$$

    $$y_{k|k+N+1} = 0 \tag{17}$$

    where $\mathbf{x}$ denotes the current state estimate. The matrix $\mathbf{A}_p$ propagates the state forward by a short time period to account for the computation delays between the measurement of $\mathbf{x}$ and the execution of the first plan step [11]. The requirement for $y$ to start at 1 and end at 0 requires that the states pass through the target region at some step of the plan, thereby completing the mission.
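    One plausible reading of the delay compensation in (15) is a constant-velocity projection of the measured state over the delay. That reading, the state ordering and the function name are assumptions; the text states only that a matrix propagates the state forward by a short period.

```python
def propagate(state, tau):
    """Project (rx, ry, vx, vy) forward by tau seconds at constant velocity."""
    rx, ry, vx, vy = state
    return (rx + vx * tau, ry + vy * tau, vx, vy)
```

    Under this assumption the operation is linear in the state, so it can be written as a single matrix acting on the state estimate, consistent with the form of (15).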

    MPC Application

    This section describes the application of the MPCformulation above to the truck control problem inthese experiments. For the rendezvous problem, theMPC trajectory optimization designs a path for thechaser truck relative to the target truck. The initialcondition x in (15) is found by subtracting the GPSstate estimate for the target from that of the chaser.The propagation Ap was for one second, roughly theamount of computation and communication delayin the system. The dynamics model in the relativeframe is unchanged, apart from a translation of thevelocity to enforce the maximum speed limit

    $$\forall j \in [k \ldots (k+N)] \quad W(v_{k|j} + v_T) \le \mathbf{1}\,v_{\max} \tag{18}$$

    where $v_T$ is the modeled target velocity. The system was discretized with a time-step of four seconds, using a zero-order hold. An offset was included to convert the relative velocity in the optimization to an absolute velocity before applying the approximate norm bound. The speed was limited to 0.8 m/s, the running speed of the truck for this experiment, and the force to less than 0.07 N, equivalent to a maximum turning rate of approximately 5 deg/s. This is well within the truck's operating limits.
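The speed bound (18) and the quoted turning rate can be checked with a short sketch. Two assumptions are made here and are not taken from this section: the rows of W are unit vectors at evenly spaced headings, a common polygonal approximation of the two-norm, and the point mass has unit mass, which happens to reproduce the quoted figure of about 5 deg/s.

```python
import math


def norm_bound_matrix(n_sides=8):
    """Rows of W as unit vectors at evenly spaced headings (assumed form)."""
    return [(math.cos(2 * math.pi * i / n_sides),
             math.sin(2 * math.pi * i / n_sides))
            for i in range(n_sides)]


def speed_limit_ok(v_rel, v_target, v_max, W):
    """Evaluate W (v_rel + v_T) <= 1 * v_max row by row, as in (18)."""
    vx = v_rel[0] + v_target[0]
    vy = v_rel[1] + v_target[1]
    return all(wx * vx + wy * vy <= v_max + 1e-12 for wx, wy in W)


def turn_rate_deg_per_s(force_n, speed_m_s, mass_kg=1.0):
    """Turn rate of a point mass under a purely lateral force: F / (m v)."""
    return math.degrees(force_n / (mass_kg * speed_m_s))


W = norm_bound_matrix(8)
v_target = (-0.8, 0.0)  # target heading west at the running speed
slow_enough = speed_limit_ok((0.1, 0.0), v_target, 0.8, W)
too_fast = speed_limit_ok((-0.3, 0.0), v_target, 0.8, W)
max_turn_rate = turn_rate_deg_per_s(0.07, 0.8)
```

Because the bound is polygonal, speeds slightly above v_max can pass in directions between the rows of W; increasing `n_sides` tightens the approximation at the cost of more constraints.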

    The force reductions for robustness in (12) were β1 = 0.03 N and β2 = 0.015 N. The corresponding state perturbation was α = 0.12 m/s. These values were derived from simulations using a nonlinear model of the truck dynamics, since it was anticipated that the model approximation would be the greatest source of uncertainty. Note that this implies that the disturbances are quite strong: the planned actuation is reduced to 35% of the total in the far future, along with a permitted 15% overspeed at the first plan step. The target region was a box extending from 0.5 m to 3.5 m east of the target truck (approximately behind it, given its nominal direction of travel to the west) with relative velocity less than 2 cm/s in each component.
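The quoted percentages follow from the stated limits. The short calculation below assumes that the two force reductions accumulate, so that the far-future force limit is the nominal limit minus β1 and β2; this reading is an assumption, though it is consistent with the 35% figure.

```python
force_limit = 0.07   # N, nominal force bound
beta1 = 0.03         # N, first force reduction
beta2 = 0.015        # N, second force reduction
alpha = 0.12         # m/s, permitted state perturbation
v_max = 0.8          # m/s, speed limit

# Force available to the plan in the far future (assumed cumulative reduction).
far_future_force = force_limit - beta1 - beta2
force_fraction = far_future_force / force_limit

# Overspeed permitted at the first plan step, relative to the speed limit.
overspeed_fraction = alpha / v_max
```

Here `force_fraction` is roughly 0.36 and `overspeed_fraction` is 0.15, matching the figures in the text.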

    Given a solution to the trajectory optimization, the first force input $u_{k|k}$ is converted to a steering angle. Its component perpendicular to the direction of travel is found using

    $$u_{\mathrm{cross}} = e_{\mathrm{cross}} \cdot u_{k|k} \tag{19}$$

    where the direction vector across the current track is found from the current velocity state v by

    $$e_{\mathrm{cross}} = \frac{1}{\|v\|} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} v \tag{20}$$


    Then a scaling and bias conversion is performed to generate an integer servo command to be uploaded to the truck

    $$S_{\mathrm{command}} = \mathrm{int}(a\,u_{\mathrm{cross}} + b) \tag{21}$$

    The values of a and b are specific to each truck and were found experimentally. Their values were not known accurately, adding further uncertainty to the problem. Note that the steering conversion (19) masks any longitudinal force command. The planner may call for small changes in speed, but these will be ignored.

    In summary, the rendezvous MPC scheme is as follows. At each time step:

    1. Generate the relative state x from the current GPS estimates of the absolute states of each truck.

    2. Solve the minimization of (11) subject to constraints (1), (9), (10) and (14)–(17).

    3. Convert the first element of the continuous control to an equivalent steering servo command using (19)–(21) and upload this command to the truck.

    4. Go to 1.
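The steps above can be sketched as a single function, with the MILP solve left as a caller-supplied stub. Several details are assumptions made for illustration: the state ordering [east, north, v_east, v_north], the use of the chaser's absolute velocity for the direction of travel in (20), the function names, and the servo constants in the usage example. Python's `int` truncates toward zero, which may differ from the rounding used on the trucks.

```python
import math


def cross_track_unit(v):
    """e_cross from (20): apply [[0, 1], [-1, 0]] to v and normalize."""
    speed = math.hypot(v[0], v[1])
    if speed == 0.0:
        raise ValueError("velocity must be nonzero to define a cross-track direction")
    return (v[1] / speed, -v[0] / speed)


def servo_command(u_first, v, a, b):
    """Convert the first planned force to an integer servo command."""
    e = cross_track_unit(v)
    u_cross = e[0] * u_first[0] + e[1] * u_first[1]  # equation (19)
    return int(a * u_cross + b)                      # equation (21)


def relative_state(chaser, target):
    """Step 1: chaser state minus target state, component by component."""
    return [c - t for c, t in zip(chaser, target)]


def mpc_step(chaser_est, target_est, solve_milp, a, b):
    """One pass through steps 1 to 3; the caller repeats it every time step."""
    x_rel = relative_state(chaser_est, target_est)
    u_first = solve_milp(x_rel)           # step 2, supplied by the caller
    v_chaser = chaser_est[2:4]            # assumed: absolute chaser velocity
    return servo_command(u_first, v_chaser, a, b)


# Usage with a stub in place of the MILP solver and hypothetical constants.
def stub_solver(x_rel):
    return (0.0, 0.0625)  # first force input, N (within the 0.07 N limit)


command = mpc_step([0.0, 5.0, -0.8, 0.0], [0.0, 0.0, -0.8, 0.0],
                   stub_solver, a=1000.0, b=128.0)
```

Only the cross-track component of the force reaches the servo, so any along-track force returned by the solver is discarded, as noted in the text.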

    Rendezvous Tests

    This section describes the hardware experiments using the MPC rendezvous controller. The test setup and procedure are described first. Then, the results are presented and discussed.

    Test Procedure

    This section gives a step-by-step account of the procedure for each test.

    1. Truck #1 (the chaser) was manually driven to the origin, marked by a cone. Its GPS estimator was started with its current position at (0, 0).

    2. Truck #1 was manually driven to its starting position, roughly five meters away from the origin in the Y direction and pointing in the -X direction.

    3. Truck #2 (the target) was manually driven to the origin. Its GPS estimator was started with its current position at (0, 0). Its alignment was approximately aiming in the -X direction. Note: the initial alignments of both trucks and the positioning of Truck #1 were performed manually, to a wide tolerance. The uncertainty of these quantities must be handled by the feedback control.

    4. (Designated time t = 0) Both trucks were given an initial command to drive straight ahead (i.e. zero steering angle) with speed 0.8 m/s. (It is necessary for the trucks to be moving at the designated speed when the MPC controller is started. Otherwise, the point-mass dynamics model is invalid.)

    5. (At time t = 5 s and at intervals of 4 s thereafter) The MPC controller solved the control problem from the current position of the trucks. Truck #1 was given a steering command derived from the resulting trajectory plan. Truck #2 was commanded to steer straight. Both trucks were commanded to remain at speed 0.8 m/s.

    6. When the chaser truck was observed to have intercepted the target and matched its velocity, the test was terminated by the operator.

    Fig. 5: Trajectories of Trucks for Two Tests, panels (a) and (b)


    Fig. 6: Relative Motion of Trucks over Both Tests, panels (a) and (b)

    Results

    Fig. 5 shows the trajectories of the two trucks in two different tests. Since the target truck is not under closed-loop control, its paths deviate due to initial misalignment and disturbances. In both cases, the chaser truck diverts from its initial path to come west of the target, then follows it until the test is terminated. Some offset is observed between the target and chaser paths. Figs. 6(a) and 6(b) show the relative trajectories of the chaser with respect to the target truck, both as X-Y plots and time histories. The dashed lines on each plot indicate the target region. In both tests, the chaser successfully reaches the target region.
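Whether a relative state lies in the target region can be tested with a simple box check. The east extent (0.5 m to 3.5 m) and the 2 cm/s velocity tolerance come from the formulation above; the north extent of the box is not stated in this section, so it is left as a required argument, and the values used in the example are hypothetical.

```python
def in_target_region(rel_east, rel_north, rel_v_east, rel_v_north,
                     north_bounds, east_bounds=(0.5, 3.5), v_tol=0.02):
    """Box test on relative position (m) and relative velocity (m/s).

    north_bounds is a (low, high) pair supplied by the caller; its value
    is an assumption, since only the east extent is given in the text.
    """
    east_ok = east_bounds[0] <= rel_east <= east_bounds[1]
    north_ok = north_bounds[0] <= rel_north <= north_bounds[1]
    vel_ok = abs(rel_v_east) < v_tol and abs(rel_v_north) < v_tol
    return east_ok and north_ok and vel_ok


# Hypothetical relative states, with an assumed north extent of +/- 1 m.
arrived = in_target_region(2.0, 0.0, 0.01, -0.01, north_bounds=(-1.0, 1.0))
too_far = in_target_region(4.0, 0.0, 0.01, -0.01, north_bounds=(-1.0, 1.0))
still_closing = in_target_region(2.0, 0.0, 0.05, 0.0, north_bounds=(-1.0, 1.0))
```

A check of this kind applied to the logged relative states would mark the time step at which the rendezvous is complete.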

    The relative motion plots illustrate the strength of the disturbances at work. Typical plans would transfer smoothly from the starting position to the target region, but the plots show significant deviations. From observations of the tests, physical disturbances such as potholes have a significant effect on the results. Another substantial source of uncertainty is model error: the plan may call for small changes in speed, but only steering commands are used.

    The importance of the robust feasibility modifications was apparent from prior simulations of these tests: without the robustness formulation, the MPC problem quickly became infeasible due to disturbances. For example, the optimization would fail if the truck exceeded its modeled maximum speed, as happens occasionally during the maneuver. In the tests, the optimization remained feasible throughout, demonstrating the success of the robustness formulation.

    There is a slight overshoot observable in Fig. 6(b), but overall it would seem that any time delays present in the system have not had significant detrimental effects on the results. The forward propagation of the initial condition in (15) appears to have worked.

    These results show that MILP/MPC can be used for real-time feedback control of a vehicle. The robust feasibility formulation enabled successful completion of rendezvous, even in the presence of significant uncertainty. Computation times were sufficiently short to maintain smooth control of the vehicle. Also, the linear model of a constant-speed, limited-turn vehicle, e.g. a truck or aircraft, is a workable approximation, allowing the benefits of linear optimization to be applied to practical control problems.

    CONCLUSIONS

    Experiments have been presented to demonstrate the use of Mixed-Integer Linear Programming for on-line replanning to control vehicles in the presence of dynamic uncertainty. Two schemes have been implemented on a ground vehicle testbed to demonstrate technologies aimed at UAV control. In one demonstration, a receding horizon formulation was used to maneuver a vehicle to its assigned goal with restricted detection of obstacles, representing limited-range radar. This architecture uses MILP for high-level path-planning, accounting for obstacle detection, while a low-level loop rejects disturbances to the vehicle. In the other demonstration, MILP was used as part of Model Predictive Control for the low-level control, to perform a rendezvous between two vehicles. This supported analytical claims of robustness and stability of the MPC formulation. This initial set of experiments will be extended to involve more complicated scenarios. This work will include the distribution of the MILP planning process across multiple computers and transition to a greater number and variation of vehicles. Figs. 7 and 8 show the new multi-truck and multi-UAV testbeds that have been developed to support this future work.

    Fig. 7: New truck testbed developed for future experiments.

    Fig. 8: Autonomous UAV testbed developed for future experiments.

    ACKNOWLEDGMENTS

    Research funded under DARPA (MICA) contract N6601-01-C-8075. The testbeds were developed under the DURIP grant F49620-02-1-0216.

    REFERENCES

    [1] P. R. Chandler and M. Pachter, “Research Issues in Autonomous Control of Tactical UAVs,” Proceedings of the American Control Conference, Philadelphia, PA, June 1998, IEEE, Washington, DC, pp. 394–398.

    [2] J. S. Bellingham, “Receding Horizon Control of Autonomous Aerial Vehicles,” ACC, May 2002.

    [3] Y. Kuwata, Real-time Trajectory Design for Unmanned Aerial Vehicles using Receding Horizon Control, S.M. Thesis, Dept. of Aeronautics and Astronautics, MIT, June 2003.

    [4] A. G. Richards, “Trajectory Optimization using Mixed-Integer Linear Programming,” Masters Thesis, Massachusetts Institute of Technology, June 2002.

    [5] A. G. Richards, J. P. How, “Model Predictive Control of Vehicle Maneuvers with Guaranteed Completion Time and Robust Feasibility,” submitted to ACC 2003.

    [6] J.M. Maciejowski, Predictive Control with Constraints, Prentice Hall, England, 2002.

    [7] A. Richards, T. Schouwenaars, J. How, E. Feron, “Spacecraft Trajectory Planning With Collision and Plume Avoidance Using Mixed-Integer Linear Programming,” Journal of Guidance, Control and Dynamics, AIAA, August 2002.

    [8] A. G. Richards, J. P. How, “Aircraft Trajectory Planning with Collision Avoidance using Mixed Integer Linear Programming,” ACC, May 2002.

    [9] T. Schouwenaars, B. DeMoor, E. Feron and J. How, “Mixed integer programming for safe multi-vehicle cooperative path planning,” ECC, Porto, Portugal, September 2001.

    [10] A. Bemporad and M. Morari, “Control of Systems Integrating Logic, Dynamics, and Constraints,” in Automatica, Pergamon / Elsevier Science, Vol. 35, pp. 407–427, 1999.

    [11] R. Franz, M. Milam, and J. Hauser, “Applied Receding Horizon Control of the Caltech Ducted Fan,” ACC 2002.

    [12] W. B. Dunbar, M. B. Milam, R. Franz and R. M. Murray, “Model Predictive Control of a Thrust-Vectored Flight Control Experiment,” accepted for 15th IFAC World Congress on Automatic Control, 2002.

    [13] W. Dunbar and R. Murray, “Model predictive control of coordinated multi-vehicle formations,” IEEE CDC, 2002.

    [14] L. Cremean, W. Dunbar, D. van Gogh, J. Hickey, E. Klavins, J. Meltzer and R. Murray, “The Caltech Multi-Vehicle Wireless Testbed,” IEEE CDC, 2002.

    [15] R. Fourer, D. M. Gay, and B. W. Kernighan, AMPL, A modeling language for mathematical programming, The Scientific Press, 1993.

    [16] ILOG CPLEX User’s guide, ILOG, 1999.

    [17] N. A. Pohlman, “Estimation and Control of a Multi-Vehicle Testbed Using GPS Doppler Sensing,” S.M. Thesis, MIT, 2002.

    [18] H. P. Rothwangl, “Numerical Synthesis of the Time Optimal Nonlinear State Controller via Mixed Integer Programming,” IEEE ACC, 2001.
