energy-optimal trajectory planning for the pendubot and the acrobot

21
OPTIMAL CONTROL APPLICATIONS AND METHODS Optim. Control Appl. Meth. (2012) Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/oca.2020 Energy-optimal trajectory planning for the Pendubot and the Acrobot John Gregory 1 , Alberto Olivares 2, * ,† and Ernesto Staffetti 2 1 Department of Mathematics, Southern Illinois University, Carbondale, Illinois, USA 2 Statistics and Operations Research, Rey Juan Carlos University, Móstoles, Madrid, Spain SUMMARY In this paper, we study the trajectory-planning problem for planar underactuated robot manipulators with two revolute joints in the presence of gravity. This problem is studied as an optimal control problem. We consider both possible models of planar underactuated robot manipulators, namely with the elbow joint not actuated, called Pendubot, and with the shoulder joint not actuated, called Acrobot. Our method consists of a numerical resolution of a reformulation of the optimal control problem, as an unconstrained calculus of variations problem, in which the dynamic equations of the mechanical system are regarded as constraints and treated by means of special derivative multipliers. We solve the resulting calculus of variations problem by using a numerical approach based on the Euler–Lagrange necessary condition in integral form. In which, time is discretized, and admissible variations for each variable are approximated using a linear combination of the piecewise continuous basis functions of time. In this way, a general method for the solution of optimal control problems with constraints is obtained, which, in particular, can deal with nonintegrable differential constraints arising from the dynamic models of underactuated planar robot manipulators with two revolute joints in the presence of gravity. Copyright © 2012 John Wiley & Sons, Ltd. Received 17 January 2011; Revised 5 October 2011; Accepted 19 December 2011 KEY WORDS: optimal control; underactuated manipulators; nonholonomic constraints; Pendubot; Acrobot 1. INTRODUCTION The class of underactuated manipulators includes robots with rigid links and unactuated joints, robots with rigid links and elastic transmission elements, and robots with flexible links. Whereas, in the first case, underactuation is a consequence of design; in the other cases, it is the result of an accurate dynamic modeling of the system, in which the control inputs only has effect on the rigid- body motion. In any case, the number of available control inputs is strictly less than the number of degrees of freedom of the robot. However, the control problem of different underactuated manipu- lators may have different levels of difficulty. In this paper, we study the trajectory-planning problem for planar underactuated robot manipulators with two revolute joints that move in a vertical plane and, therefore, are subject to gravity force. The presence of two revolute joints in the mechanical system will be denoted by RR. We consider both possible models of planar underactuated RR robot manipulators, namely the model in which only the shoulder joint is actuated, which is usually called Pendubot, and the model in which only the elbow joint is actuated, which is usually called Acrobot. Planar underactuated RR robot manipulators, in which only the first joint is actuated, will be denoted with R R. Whereas, planar underactuated RR robot manipulators, in which only the second joint is actuated, will be denoted with RR. Underactuated robots are mechanical systems with second-order *Correspondence to: Alberto Olivares, Statistics and Operations Research, Rey Juan Carlos University, Móstoles, Madrid, Spain. E-mail: [email protected] Copyright © 2012 John Wiley & Sons, Ltd.

Upload: john-gregory

Post on 15-Jun-2016

221 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Energy-optimal trajectory planning for the Pendubot and the Acrobot

OPTIMAL CONTROL APPLICATIONS AND METHODSOptim. Control Appl. Meth. (2012)Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/oca.2020

Energy-optimal trajectory planning for the Pendubotand the Acrobot

John Gregory1, Alberto Olivares2,*,† and Ernesto Staffetti2

1Department of Mathematics, Southern Illinois University, Carbondale, Illinois, USA2Statistics and Operations Research, Rey Juan Carlos University, Móstoles, Madrid, Spain

SUMMARY

In this paper, we study the trajectory-planning problem for planar underactuated robot manipulators withtwo revolute joints in the presence of gravity. This problem is studied as an optimal control problem. Weconsider both possible models of planar underactuated robot manipulators, namely with the elbow joint notactuated, called Pendubot, and with the shoulder joint not actuated, called Acrobot. Our method consists ofa numerical resolution of a reformulation of the optimal control problem, as an unconstrained calculus ofvariations problem, in which the dynamic equations of the mechanical system are regarded as constraintsand treated by means of special derivative multipliers. We solve the resulting calculus of variations problemby using a numerical approach based on the Euler–Lagrange necessary condition in integral form. In which,time is discretized, and admissible variations for each variable are approximated using a linear combinationof the piecewise continuous basis functions of time. In this way, a general method for the solution of optimalcontrol problems with constraints is obtained, which, in particular, can deal with nonintegrable differentialconstraints arising from the dynamic models of underactuated planar robot manipulators with two revolutejoints in the presence of gravity. Copyright © 2012 John Wiley & Sons, Ltd.

Received 17 January 2011; Revised 5 October 2011; Accepted 19 December 2011

KEY WORDS: optimal control; underactuated manipulators; nonholonomic constraints; Pendubot; Acrobot

1. INTRODUCTION

The class of underactuated manipulators includes robots with rigid links and unactuated joints,robots with rigid links and elastic transmission elements, and robots with flexible links. Whereas,in the first case, underactuation is a consequence of design; in the other cases, it is the result of anaccurate dynamic modeling of the system, in which the control inputs only has effect on the rigid-body motion. In any case, the number of available control inputs is strictly less than the number ofdegrees of freedom of the robot. However, the control problem of different underactuated manipu-lators may have different levels of difficulty. In this paper, we study the trajectory-planning problemfor planar underactuated robot manipulators with two revolute joints that move in a vertical planeand, therefore, are subject to gravity force. The presence of two revolute joints in the mechanicalsystem will be denoted byRR. We consider both possible models of planar underactuatedRR robotmanipulators, namely the model in which only the shoulder joint is actuated, which is usually calledPendubot, and the model in which only the elbow joint is actuated, which is usually called Acrobot.Planar underactuatedRR robot manipulators, in which only the first joint is actuated, will be denotedwith RR. Whereas, planar underactuated RR robot manipulators, in which only the second joint isactuated, will be denoted withRR. Underactuated robots are mechanical systems with second-order

*Correspondence to: Alberto Olivares, Statistics and Operations Research, Rey Juan Carlos University, Móstoles,Madrid, Spain.

†E-mail: [email protected]

Copyright © 2012 John Wiley & Sons, Ltd.

Page 2: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

nonholonomic constraints because the dynamic equation of the unactuated part of the mechanicalsystem is a second-order differential constraint which, in general, is nonintegrable. In the absenceof gravity, it is not possible to integrate, even partially, this second-order differential constraint inthe dynamic model of the RR robot manipulator. However, in the presence of this second-ordernonholonomic constraint, the system is controllable. On the contrary, in the dynamic model of theRR robot manipulator without gravity, this differential constraint is completely integrable. It can beconverted into a holonomic constraint that makes the system not controllable. As a consequence, thetrajectory-planning problem has a solution only for particular initial and final states. In the presenceof gravity, the dynamic equation of the unactuated part of the mechanical system is a second-ordernonintegrable differential constraint for both the Pendubot and the Acrobot models, and therefore,both systems are controllable. Planning dynamically feasible trajectories, their asymptotic tracking,and the regulation to a desired equilibrium configuration are the main control problems for thisclass of mechanical systems. However, a general control theory for this class of mechanical sys-tems has not been developed yet, and only solutions for particular robot models have been obtained.A review of the most significant works on control of underactuated robots with passive joints canbe found in [1]. For both the Pendubot and the Acrobot models, the control objective is, in gen-eral, to drive the manipulator away from the stable equilibrium state and steer it at an unstableequilibrium state. Because the approximate linearization of these systems is controllable, it is notdifficult to control them locally. However, because of the gravitational drift, the region of the statespace where the robot can be kept in equilibrium is reduced and consists of two disjoint manifolds.Moving between these two manifolds requires the appropriate swing-up maneuvers, whose syn-thesis has been, so far, tackled by energy and/or passivity-based control techniques. In this paper,the trajectory-planning problem for planar underactuated RR robot manipulators in the presence ofgravity is studied as an energy-optimal control problem. Given an initial state and a final state, wefind a trajectory and the corresponding available control inputs that satisfy the dynamic equation ofthe robot manipulator and steer the system between the initial and the final states. This problem isreferred to as the boundary value problem. Whereas, problems in which the final or the initial stateare not completely specified are called initial value problems and final value problems, respectively.It is usually impossible to find analytical solutions to the optimal control problems of robot manip-ulators, and in general, numerical methods must be employed, which can be classified into direct orindirect methods.

Direct methods follow a ‘first discretize, then optimize’ strategy. First, the control variables andoften also the state variables are discretized, and the optimal control problem is converted intoa large, constrained, and often sparse nonlinear programming problem which is treated by stan-dard algorithms such as sequential quadratic programming methods or interior point methods.Well-known realizations of the direct methods are the direct shooting methods or direct colloca-tion methods. Several direct methods also give pointwise estimates of the adjoint variables [2, 3].General references on direct numerical methods for optimal control are [4–9]. Whereas, more spe-cific references on applications of these methods to robotics problems are [10,11]. Aside from theseapproaches, many direct methods are specially tailored for trajectory optimization of fully actuatedrobot manipulators. Examples are two-stage approaches, such as [12], based on the optimization ofthe geometric path, followed by the optimization of the velocity profile along this path, which usesapproaches like [13–15]. In [12], the path is obtained with a parameter optimization that iterates onthe control points of a B-spline. An example of a method based on the complete discretization of theoptimal control problem using B-splines is [16]. A method based on the control parameterizationtechnique [5] is described in [17]. In which, only the control is discretized, and piecewise constantfunctions are used for the approximation of the control function. Whereas, the system of differentialequations is solved using an accurate differential equation solver.

Indirect approaches are characterized by a ‘first optimize, then discretize’ strategy. On thebasis of Pontryagin’s maximum principle [18], the optimal control problem is first transformedinto a piecewise-defined multipoint boundary value problem which is then discretized to achievethe numerical solution. General references on indirect numerical methods for optimal controlare [19, 20]. These methods have been extensively applied to the generation of highly accurateoptimal trajectories for fully actuated robot manipulators, and a comprehensive survey of these

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 3: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

techniques is given in [21]. For time-optimal control, the existence of the solution for two-linkplanar manipulators is proven in [22]. It is shown that, at any moment, at least one control vari-able reaches its bound. In [23, 24], this result is extended to other robot models. In [25, 26], thegradient algorithm is used to solve the time-optimal control of a two-link planar manipulator. In[27, 28], the indirect multiple-shooting algorithm is used to solve the optimal control problemof a two-link planar manipulator. In [29], the minimum time and minimum energy-optimal con-trol problem of a three-link robot manipulator is solved by indirect multiple- shooting, in whichthe initial guess of the solution is generated by direct collocation. A similar approach is used, in[30], for a two-link planar robot manipulator. An indirect approach is used in [31] to deal withthe problem of minimum-time pointing of a two-link planar manipulator, that is, with the prob-lem of aligning the second link of the robot with a target point. In [32], Pontryagin’s maximumprinciple is used to solve optimal control problems in which the emphasis is on obtaining smoothcontrol inputs.

Indirect numerical methods for optimal control entail some intrinsic difficulties, such as the for-mulation of an extended problem that includes the differential equations for the adjoint variablesand the proper treatment of constraints. The first-order necessary conditions for the solution of opti-mal control problems are given by Pontryagin’s maximum principle [18] and the Euler–Lagrangeequation [33, 34]. The first-order necessary conditions of Pontryagin’s maximum principle dependon the constraints of the problem and change when some constraints become active, that is, whenthe state or the control reach the boundary of their admissible sets. This corresponds to disconti-nuities of the first derivative of the extremals. Similarly, the Euler–Lagrange equation holds only atpoints where the extremals are smooth and, therefore, at points where their first derivatives are notcontinuous which are called corners. The so-called Weierstrass–Erdmann corner conditions must besatisfied. The main drawback of this setting is that it is not possible to know in advance the num-ber of corners and their location in time, and therefore, it is not easy to derive a general numericalmethod from these conditions. To solve our optimal control problem, we apply a numerical methodthat falls into the class of indirect methods. More precisely, it is a variational approach in whichthe optimal control problem is transformed into an unconstrained calculus of variations problemby means of special derivative multipliers according to the Bliss multiplier rule [34, Chaper VII].Our method substantially differs from the usual indirect approaches in which the Euler–Lagrangedifferential equation is solved using a suitable numerical method. One of the main characteristicsof our approach is that, to compute the extremals, we use the Euler–Lagrange necessary condi-tion in integral form, plus the transversality conditions, to take into account those components ofstate and control variables, and multipliers that are not specified at the endpoints. In this way, weavoid the loss of information induced by the use of the differential form of the same condition thatimplies numerical corner conditions and the necessity of patching together solutions between cor-ners. Moreover, in our method, the control inputs are derivatives of some of the components of theextended state vector which can be piecewise continuous functions. Whereas, the original state vec-tor is supposed to be composed by piecewise smooth functions. Similarly, the multipliers, which arethe derivatives of other components of the extended state vector, need only be piecewise continu-ous. The use of derivative multipliers is another key aspect of our method. In our approach, timeis discretized, and admissible variations for each variable are approximated by linear combinationsof the piecewise continuous basis functions of time. In this way, variations depend on the valuesof the coefficients at the mesh points. Then, the conditions, under which the objective functional isstationary, with respect to all piecewise smooth variations that satisfy the boundary conditions, arederived. A set of nonlinear difference equations that must be satisfied by the coefficients is obtainedwhich a characteristic tridiagonal structure. This set of equations is then efficiently solved usingthe Newton–Raphson method. The existence of this numerical procedure is an additional advan-tage of the method. This basic procedure can be modified to incorporate equality and inequalityconstraints that involve state and control variables by means of derivative multipliers and derivativeexcess variables. The resulting formalism combines advantages of both direct and indirect methodsbecause it effectively deals with nonholonomic differential constrains induced by the unactuatedpart of the Pendubot and the Acrobot. It is based on optimality conditions and leads to an efficientnumerical procedure.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 4: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

Many specific methods have been devised for the optimal control of systems with nonholonomicconstraints. In [35], the necessary conditions for optimal control of the Lagrangian mechanicalholonomic and nonholonomic systems with symmetry are derived using the idea of Lagrangianreduction, that is, reduction under a symmetry group. The key idea is to link the method of Lagrangemultipliers with Lagrangian reduction as an alternative to the Pontryagin’s maximum principle andPoisson reduction. Reference [36, Chap. 7] is devoted to optimal control of the nonholonomicmechanical systems. The relationship between the variational nonholonomic control systems andthe classical Lagrange problem of optimal control is presented. Then, kinematic and dynamic opti-mal control problems are discussed, whereas related work on integrable systems is studied in theinternet supplement of this book. In [37], an affine connection formulation is employed to study anoptimal control problem for a class of nonholonomic, underactuated mechanical systems. The classof nonholonomic systems studied in this paper are wheeled vehicles. The nonholonomic affine con-nection together with Lagrange multiplier method in the calculus of variations are used to derive theoptimality necessary conditions. The region of the state space where the robot can be kept in equi-librium is reduced, and consists of two disjoint manifolds, a region around the straight-up position,called the attractive region, and the so-called swing-up region. The control objective in the swing-upregion is to pump enough energy into the manipulator to move it into the attractive region, whereasthe control objective in the attractive region is to stabilize the manipulator at the straight-up posi-tion. Different controllers have been used for swing-up control of the Pendubot [38–41], and variousmethods were devised for the swing-up control of the Acrobot [42,43]. In [44], a nonsmooth controllaw for the swing-up control of the Acrobot system is presented. More precisely, a strategy to placean Acrobot system to an energy level close to that of the straight-up position is described. Motivatedby the bang-bang nature of the control law, time optimality necessary conditions for this problemare given. In [45], a unified strategy for motion control of the underactuated RR robot manipulatorsin the presence of gravity, such as the Acrobot and the Pendubot, is presented. First, a control law isemployed to increase the energy and control the posture of the actuated link in the swing-up region.Finally, an optimal control law is designed for the attractive region which uses a linear approximatedmodel of the system around the straight-up position.

This paper is organized as follows. In Section 2, the dynamic models of the Pendubot and theAcrobot are described, and in Section 3, their control properties are discussed. The optimal con-trol problem for these dynamic systems is stated in Section 4. In Section 5, a variational calculusapproach to its resolution is presented. The reformulation of the optimal control problems as a calcu-lus of the variations problem is explained in Section 6. In Section 7, the proposed numerical methodused to solve the resulting calculus of variations problem is described. The results of the applicationof this numerical method to several optimal control problems for both the Pendubot and the Acrobotmodels are reported in Section 8. Finally, Section 9 contains the conclusions.

2. DYNAMIC MODEL OF UNDERACTUATED MANIPULATORS

The general dynamic model of a robot manipulator is described by the following second-orderdifferential equation

B.�/ R� CC.� , P�/ P� CF P� C e.�/D u�G.�/w, (1)

where the first term of this equation, B.�/ R� , represents the inertial forces caused by accelerationat the joints, and the second term, C.� , P�/ P� , represents the Coriolis and centrifugal forces. Thethird term is a simplified model of friction in which only the viscous friction is considered. Theterm e.�/ represents potential forces such as elasticity and gravity. Matrix G.�/ on the right-handside maps the external forces/torques w to the forces/torques at the joints. Finally, u represents theforces/torques at the joints that are the control variables of the system.

We suppose that the links are rigid, as well as the transmission elements, and do not take intoaccount the effects of the friction. We suppose that no external forces are acting on the mechanicalsystem. Under these hypotheses, the dynamic model of the robotic system is reduced to

B.�/ R� CC.� , P�/ P� C e.�/D u. (2)

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 5: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

Figure 1. A planar RR robot manipulator that moves in a vertical plane.

A planar vertical RR manipulator is composed of two homogeneous links and two revolute jointsmoving in a vertical plane ¹x,yº, as shown in Figure 1, where li is the length of link i , ri is the dis-tance between joint i and the mass center of link i ,mi is the mass of link i , and I´i is the barycentricinertia with respect to a vertical axis ´ of link i , for i D 1, 2. In this case, the two matrices B.�/ andC.� , P�/, and the vector e.�/ have the form

B.�/D�˛C 2ˇ cos �2 ıC ˇ cos �2ıC ˇ cos �2 ı

�, (3)

C.� , P�/D

��ˇ sin �2 P�2 �ˇ sin �2. P�1C P�2/ˇ sin �2 P�1 0

�, (4)

e.�/D�.m1r1Cm2l1/g cos �1Cm2r2g cos.�1C �2/

m2r2g cos.�1C �2/

�, (5)

where � D .�1, �2/T is the vector of configuration variables, �1 is the angular position of Link 1with respect to the x axis of the reference frame ¹x,yº, and �2 is the angular position of Link 2 withrespect to Link 1, as illustrated in Figure 1. P� D . P�1, P�2/T is the vector of angular velocities, andR� D . R�1, R�2/T is the vector of accelerations. The control inputs of the system are u D .u1,u2/T ,where u1 is the torque applied by the actuator at joint 1 and u2 is the torque applied by the actuatorat joint 2. The constant g D 9.80665m=s2 is the gravity acceleration. The parameters ˛, ˇ, and ı in(3) and (4) have the following expressions ˛ D I´1 C I´2 Cm1r

21 Cm2.l

21 C r

22 /, ˇ Dm2l1r2, and

ı D I´2 Cm2r22 .

A robot manipulator is said to be underactuated when the number of actuators is less than thedegrees of freedom of the mechanical system. The dynamic model (2) that does not consider theeffects of friction can be rewritten, for an RR robot manipulator underactuated by one control, inthe form [1] �

Baa.�/ Bua.�/

Bua.�/ Buu.�/

� �R�aR�u

�C

�Ca.� , P�/Cu.� , P�/

�C

�ea.�/eu.�/

�D

�ua0

�, (6)

in which the state variables �a and �u correspond to the actuated and unactuated joints, respectively,and ua is the available control input. The last equation describes the dynamics of the unactuatedpart of the mechanical system which has the form

Bua.�/ R�a CBuu.�/ R�uCCu.� , P�/C eu.�/D 0, (7)

which is a second-order differential constraint without input variables.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 6: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

3. CONTROL PROPERTIES OF UNDERACTUATED RR ROBOT MANIPULATORS

In this section, the main control properties of the planar underactuated RR robot manipulators inthe presence of gravity will be described. For a more general description of the control properties ofunderactuated robot manipulators, see [1].

Optimal control approaches to trajectory planning assume that there exists a control input thatsteers the system between two specific states. Thus, controllability is the most important aspect tocheck before studying optimal control of a dynamic system. If in the trajectory-planning problem,the duration of the motion T is not assigned, the existence of a finite-time solution for any state.�F , P�F / in a neighborhood of .�I , P�I / is equivalent, for the robotic system, to the property of localcontrollability at .�I , P�I /. If local controllability holds at any state, then the system is controllable,and the trajectory-planning problem is solvable for any pair of initial and final states.

For underactuated RR robot manipulators, controllability is related to the integrability of thesecond-order nonholonomic constraints. The second-order differential constraint (7) may either bepartially integrable to a first-order differential equation or completely integrable to a holonomicequation. Necessary and sufficient integrability conditions are given in [46]. If (7) is not partiallyintegrable, it is possible to steer the system between equilibrium points. This occurs for planarunderactuated RR robot manipulators in the absence of gravity and for both the RR and RR in thepresence of gravity.

If (7) is completely integrable to a holonomic constraint, the motion of the mechanical systemis restricted to a one-dimensional submanifold of the configuration space, which depends on theinitial configuration. This occurs for the planar underactuated RR robot manipulators without theeffects of gravity [46]. For this robot model, the trajectory-planning problem has a solution onlyfor particular initial and final states. Thus, when (7) is not partially or completely integrable, themechanical system is controllable. However, several aspects of controllability can be studied, whichcharacterize planar underactuated RR robot manipulators.

The equilibrium conditions for the Pendubot are

e1.�1, �2/D g.l1m2Cm1r1/ cos �1C gm2r2 cos.�1C �2/D u1,

e2.�1, �2/D gm2r2 cos.�1C �2/D 0,

whereas for the Acrobot,

e1.�1, �2/D g.l1m2Cm1r1/ cos �1C gm2r2 cos.�1C �2/D 0,

e2.�1, �2/D gm2r2 cos.�1C �2/D u2.

The values of .�1, �2/ that satisfy the equilibrium conditions of the Pendubot for a given valueof u1 are four isolated configurations. Likewise, the values of .�1, �2/ that satisfy the equilibriumconditions for a certain value of u2 are four isolated configurations. Not all these solutions cor-respond to stable equilibrium configurations. For u1 D 0 and u2 D 0, respectively, the Pendubotand the Acrobot have three unstable configurations, namely .�1, �2/ D .��=2,�/, which will becalled midconfiguration, .�1, �2/ D .�=2,��/, and .�1, �2/ D .�=2, 0/, which will be called topconfiguration, and one stable equilibrium configuration, .�1, �2/D .��=2, 0/, which will be calleddown configuration.

A dynamic system is linearly controllable at an equilibrium point if the linear approximation ofthe system around this point is controllable. Planar underactuated RR robot manipulators, in theabsence of gravity, are not linearly controllable. On the contrary, both planar underactuated RR andRR robot manipulators, in the presence of gravity, are linearly controllable.

A mechanical system is said to be small-time locally controllable (STLC) at xI D .�I , P�I / if,for any neighborhood Vx of xI and any time T > 0, the set RVx ,T .xI / of states that are reachablefrom xI within time T along trajectories contained in V includes a neighborhood of xI . Small-timelocal controllability is a stronger property than controllability [47]. A non-STLC but controllablesystem must, in general, perform finite maneuvers in order to perform arbitrarily small changes toconfiguration. It has been proven in [48,49] that planar underactuated RR robot manipulators in the

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 7: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

absence of gravity are not STLC. Both planar underactuated RR and RR robot manipulators in thepresence of gravity are also not STLC.

Second-order mechanical systems cannot be STLC at states with nonzero velocity. Therefore,the weaker concepts of small-time local configuration controllability has been introduced [50]. Asystem is said to be small-time local configuration controllable (STLCC) at a configuration �I if,for any neighborhood V� of �I in the configuration space and any time T > 0, the set RV� ,T .�I /

of configurations that are reachable (with some final velocity P� ) within T , starting from .�I , 0/ andalong a path in the configuration space contained in V� , includes a neighborhood of �I . By defini-tion, STLC systems are also STLCC. Sufficient conditions for STLCC are given in [50]. It has beenproven in [48, 49] that planar underactuated RR robot manipulators in the absence of gravity arenot STLCC. Both planar underactuated RR and RR robot manipulators in the presence of gravityare also not STLCC.

A final question is to investigate if the trajectory-planning problem for RR planar underactuatedrobot manipulators can be solved with algorithmic methods. A mechanical system is kinematicallycontrollable (KC) if every configuration is reachable by means of a sequence of kinematic motions,that is, feasible paths in the configuration space which may be followed with any arbitrary tim-ing law [48, 49, 51]. KC mechanical systems are also STLCC. Kinematic controllability does notimply small-time local controllability. If a mechanism is KC, the trajectory-planning problem maybe solved with algorithmic methods. Planar underactuated RR robot manipulators in the absence ofgravity are not KC. Both planar underactuated RR and RR robot manipulators in the presence ofgravity are not KC.

4. THE OPTIMAL CONTROL PROBLEM

Given the dynamic equation of an underactuated planar RR robot manipulator, an initial state,.�I , P�I /, and a final state, .�F , P�F /, the optimal control problem consists of finding the availablecontrol input, u1.t/ or u2.t/, and the resulting trajectory with t 2 ŒtI , tF � that steers the sys-tem between the initial and final states satisfying the dynamic equation (6) and minimizing theobjective functional

J D

Z tF

tI

u2i dt , (8)

where i D 1, 2 depending on which joint is actuated, and tI and tF are the initial and final timevalues, respectively. This cost functional represents a measure of the energy consumed during themotion because the torque produced with an electromechanical actuator is approximately propor-tional to the current flow and the rate of energy consumption is approximately equal to the squareof this current.

If P�I D P�F D 0, the problem is called rest-to-rest trajectory-planning problem. If the final or theinitial states or part of them are not assigned, the problems are called the initial value problem andfinal value problem, respectively. The final time tF may be fixed or not.

This problem is a particular case of an optimal control problem which can be stated in a moregeneral form, as follows. Minimize the integral

J.x,u/DZ tF

tI

f .t , x,u/dt (9)

subject to

x0.t/D g.t , x,u/, (10)

h.t , x,u/D 0, (11)

l.t , x,u/6 0, (12)

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 8: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

p.t , x/D 0, (13)

q.t , x/6 0, (14)

x.tI /D xtI , x.tF /D xtF , and (15)

u 2 U , (16)

where x.t/D .x1.t/, x2.t/, : : : , xn.t//T is an n-vector called the state vector; u.t/D .u1.t/,u2.t/,: : : ,um.t//T is an m-vector called the control vector; the real valued function J.x,u/ is the objec-tive functional, (10) is called the trajectory equation, and the conditions (15) are called the boundaryconditions. The set U � Rm is called the set of controls, with u.t/ 2 U for every t 2 ŒtI , tF �. Weassume that f , g, h, l , p, and q are sufficiently smooth for our purpose. This will imply solutions,such that x.t/ is piecewise smooth, whereas u.t/ is piecewise continuous [33].

5. CALCULUS OF VARIATIONS APPROACH

A variational approach has been used to solve the more general optimal control problem stated inthe previous section.

The classical calculus of variations problem is to minimize an integral of the form

I.y/D

Z b

a

f .x, y,y0/dx, (17)

such that

y.a/D A, y.b/D B , (18)

where the independent x variable is assumed to be in the interval Œa, b�, and the dependent variabley D y.x/ D .y1.x/, y2.x/ : : : ,yn.x//T is assumed to be an n-vector continuous on Œa, b� withderivative y0 D y0.x/ D .y01.x/,y

02.x/ : : : ,y

0n.x//

T . It is also assumed that y is piecewise smooth,that is, there exists a finite set of points a1, a2, : : : , ak . So that, a 6 a1 < a2 < : : : < ak 6 b, y.x/is continuously differentiable on .al , alC1/ and that the respective left-handed and right-handedlimits of y0.x/ exist. If y.x/ is piecewise smooth and satisfies the boundary conditions y.a/ D A,y.b/D B , then y.x/ is said to be an admissible arc. In other words, this problem consists of finding,among all arcs connecting end points .a,A/ and .b,B/, the one minimizing the integral (17).

The main optimality conditions are obtained by defining a variation ´.x/, a set of functions

y.x, �/D y.x/C �´.x/ for j�j< ı, (19)

and a functional

F.�/D

Z b

a

f .x,y.x, �/,y0.x, �//dx, (20)

where ı > 0 is a fixed real number, and the variation ´.x/ is a piecewise smooth function with´.a/D ´.b/D 0. With a Taylor series expansion, it is easy to see that a necessary condition, 0 is arelative minimum to F , is

I 0.y, ´/DZ b

a

Œfy´C fy0´0�dx D 0, (21)

where fy ,fy0 denote the partial derivatives of f evaluated along .x,y.x/, y0.x//, and the terms ´and ´0 are evaluated at x.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 9: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

Integrating (21) by the parts for all admissible variations ´.x/, another necessary condition fory D y.x/ used to give a relative minimum of the variational problem (17) and (18) is obtained,which is the following second-order differential equation

d

dxfy0 D fy , (22)

known as Euler–Lagrange condition. This equation must hold along .x,y.x/, y0.x//, except at afinite number of points [52, Section 2.1].

The extremals of (17) and (18) can be obtained by solving the Euler–Lagrange equation, but itonly holds at points where the extremal y�.x/ is smooth. At points where y�0.x/ has jumps, calledcorners, the Weierstrass–Erdmann corner conditions must be fulfilled [52, Section 2.3]. Becauseof the location of the corners, their numbers and the amplitudes of the jumps in y0�.x/ are notknown in advance. It is difficult to obtain a numerical method for a general problem by using theEuler–Lagrange equation (22). One of the key aspects of our method is that the integral form of thiscondition

fy0.x,y.x/, y0.x//DZ x

a

fy.x,y.x/, y0.x//dxC c, (23)

holds for all x 2 Œa, b� and some c. Therefore, the Weierstrass–Erdmann corner conditions arenot needed. Thus, an alternative way of computing the extremals can be based on this necessarycondition in integral form.

Note that the necessary condition requires that boundary values fulfill the Euler–Lagrangeequation. Thus, if some of the four values a, y.a/, b, and y.b/ are not explicitly given, alternateboundary conditions have to be provided. This is what transversality conditions do. Assume that a,y.a/, and b are given, but y.b/ is free. In this case, the additional necessary transversality condition

fy0.b,y�.b/,y0�.b//D 0 (24)

must hold.The variational approach does not consider constraints. However, the optimal control problem

has, at least, a first-order differential constraint (10) representing the dynamic equation of the sys-tem. Moreover, because the dynamic equation of a planar RR robot manipulator is a second-orderdifferential equation, additional differential constraints will arise while rewriting it as a first-orderdifferential equation. Therefore, the optimal control problem must be reformulated as an uncon-strained calculus of variations problem in order to deal with differential and algebraic constraints,as described in the following section.

6. VARIATIONAL REFORMULATION OF THE OPTIMAL CONTROL PROBLEM

First of all, following [52, Chapter 5], in this section, we reformulate, as an unconstrained calculus ofvariations problem, the optimal control problem consisting of minimizing (9), subject to (10)–(12),(15), and (16). Notice that we omitted constraints (13) and (14) which need a special treatment.

For convenience, we changed the independent variable from t to x and the dependent variablefrom x to y, to be consistent with the notation of the calculus of variations. Our reformulation isbased on special derivative multipliers and a change of variables in which

y1.x/D y.x/ is the renamed state vector,

y02.x/D u.x/ is the renamed control vector,

y03.x/ is the multiplier associated with equation .10/,

y04.x/ is the multiplier associated with constraint .11/,

y05.x/ is the multiplier associated with constraint .12/,

y06.x/ is the excess variable of constraint .12/.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 10: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

Because y2.x/ to y6.x/ are not unique without an extra condition, we initialize these variables bydefining yi .xI /D 0, i D 1, : : : , 6. Thus, our problem becomes

min I.Y/DZ xF

xI

F.x, Y, Y0/dx, (25)

where YD .y1,y2,y3,y4,y5,y6/T , and

F D f .x,y1,y02/C y0T3 .y

01 � g.x, y1,y02//C y

0T4 h.x,y1,y02/C y

0T5

�l.x,y1,y02/C y

026

�.

Because the values of yi .xF /, i D 2, : : : , 6 are unknown, transversality conditions are needed,having the form FY

0.xF , Y.xF /, Y0.xF //D 0, with YD .y1, Y/.

7. NUMERICAL METHOD

The numerical method used is based on the discretization of the unconstrained variational calculusproblem stated in the previous section. In particular, the main underlying idea is obtaining a dis-cretized solution yh.x/ solving (21) for all piecewise linear spline function variations ´.x/ insteadof dealing with the Euler–Lagrange equation (22). Thus, this method uses no numerical cornerconditions and avoids patching solutions to (22) between corners.

Let N be a large positive integer, h D .b � a/=N , and � D .a D a0 < a1 < : : : < aN D b/ apartition of the interval Œa, b�, where ak D aC kh for k D 0, 1, : : : ,N . Define the one-dimensionalspline hat functions

wk.x/D

´.x � ak�1/=h if ak�1 < x < ak ,.akC1 � x/=h if ak < x < akC1,0 otherwise,

where k D 1, 2, : : : ,N � 1. Define also the m-dimensional, piecewise linear component functions

yh.x/D

NXkD0

Wk.x/Ck and ´h.x/D

NXkD0

Wk.x/Dk,

whereWk.x/D wk.x/Im�m, yh.x/ is the sought numerical solution, and ´h.x/ is a numerical vari-ation. In particular, the constant vectors Ck are to be determined by the algorithm we developed, andthe constant vectors Dk are arbitrary. Thus, the discretized form of (21) is obtained in each subinter-val Œak�1, akC1�. For the sake of clarity of exposition, we assume mD 1. Note that I 0.y, ´/ in (21)is linear in ´ so that a three-term relationship may be obtained at x D ak by choosing ´.x/D wk.x/for k D 1, 2, : : : ,N � 1. Thus,

0D I 0.y,wk/DZ akC1

ak�1

Œfy.x,y, y0/wk C fy0.x,y,y0/w0k�dx

D

Z ak

ak�1

hfy.x,yh.x/, y

0h.x//

x � ak�1

h

idxC

Z ak

ak�1

�fy0.x,yh.x/, y

0h.x//

1

h

�dx

C

Z akC1

ak

hfy.x,yh.x/,y

0h.x//

akC1 � x

h

idxC

Z akC1

ak

�fy0.x,yh.x/,y

0h.x//

��1

h

��dx

D fy0

�a�k�1,

yk C yk�1

2,yk � yk�1

h

�Ch

2fy

�a�k�1,

yk C yk�1

2,yk � yk�1

h

� fy0

�a�k ,

yk C ykC1

2,ykC1 � yk

h

�Ch

2fy

�a�k ,

yk C ykC1

2,ykC1 � yk

h

�. (26)

In these equations, a�kD .ak C akC1/=2 and yk D yh.ak/ are the computed values of yh.x/ at ak .

In the general case, when m > 1, the same result is obtained, but fy0 and fy are column m-vectorsof functions with the i-th component fyi0 and fyi , respectively. Similarly, .ykC yk�1/=2 is them-vector which is the average of the m-vectors yh.ak/ and .yh.ak�1/.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 11: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

By the same arguments that led to (26), we obtain

fy0

�a�N�1,

yN C yN�1

2,yN � yN�1

h

�Ch

2fy

�a�k�1,

yN C yN�1

2,yN � yN�1

h

�D 0,

which is the numerical equivalent of the transversality condition (24). For further details, see[52, Chapter 6].

It can be shown [53] that with this method the global error has an a priori global reduction ratioof O.h2/. In practice, if the step size h is halved, the error decreases by 4.

8. IMPLEMENTATION AND RESULTS

Several numerical experiments have been carried out for both RR and RR planar underactuatedrobot manipulators in the presence of gravity.

8.1. Pendubot

In this section, the optimal control problem for a Pendubot is studied. In this robot model, the secondjoint is not actuated, thus u D .u1, 0/T . In this case, it is neither possible to integrate partially norcompletely the nonholonomic constraint because the conditions for partial integrability [46] are notfulfilled. Hence, the system is controllable. The numerical results of the application of our methodfor optimal control of two different boundary value problems for this system will be described.

In this case, (2) can be split into

.˛C2ˇ cos �2/ R�1C .ıC ˇ cos �2/ R�2 � ˇ sin �2.2 P�1 P�2C P�22/C

g.l1m2Cm1r1/ cos �1C gm2r2 cos.�1C �2/D u1, (27)

.ıCˇ cos �2/ R�1C ı R�2C ˇ sin �2 P�12C gm2r2 cos.�1C �2/D 0. (28)

Inequality constraints of the form (14) on the state variables will also be considered. In particular,the set of values of angle �2 will be bounded in order to take into account possible range limitationsof the mechanical system. Thus, the inequality constraints

�2l 6 �2 6 �2u (29)

are considered, where �2l and �2u are the lower and upper bounds for �2.To express this optimal control problem that involves second-order differential constraints in the

form of a basic optimal control problem, we first have to convert it into a first-order differentialmodel, introducing the following change of variables

x1 D �1, x2 D �2, x3 D P�1, x4 D P�2, (30)

with the following additional relations

x01 D x3, x02 D x4. (31)

The second-order differential equations (27) and (28) are converted into the first-order differentialequations

.˛C2ˇ cos x2/x03C .ıC ˇ cos x2/x

04 � ˇ sin x2.2x3x4C x

24/C

g.l1m2Cm1r1/ cos x1C gm2r2 cos.x1C x2/D u1, (32)

.ıCˇ cos x2/x03C ıx

04C ˇ sin x2x

23 C gm2r2 cos.x1C x2/D 0. (33)

The objective functional to minimize is

J D

Z tF

tI

u21 dt , (34)

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 12: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

relations (31)–(33) are now the differential constraints of the optimal control problem and (43)the simple bound constraints. Then, we apply the Bliss multiplier rule, introducing the followingnew variables

XD�X1 � � �X13

T, (35)

such that

Xi D xi , i D 1, : : : , 4,X 05 D u1,X5.tI /D 0, (36)

where X 06 with X6.tI /D 0 is the multiplier associated with the differential constraint (32), X 07 withX7.tI /D 0 is the multiplier associated with the differential constraint (33), andX 08 withX8.tI /D 0and X 09 with X9.tI / D 0 are the multipliers associated with the additional differential constraints(31). Variable X 010 with X10.tI / D 0 is the multiplier associated with the constraint �2 6 �2u, andX 011 with X11.tI /D 0 is the excess variable associated with the same constraint. Variable X 012 withX12.tI / D 0 is the multiplier associated with the constraint �2 > �2l , and X 013 with X13.tI / D 0 isthe excess variable associated with the same constraint. Thus, the unconstrained reformulation (25)of the problem is, in this case,

min I.X/DZ tF

tI

G.t , X, X0/dt , (37)

where

G DX 025 CX06.�ˇ sin.X2/X3X4 � ˇ sin.X2/X4.X3CX4/C .˛C 2ˇ cos.X2//X

03C

.ıC ˇ cos.X2//X04C g.l1m2Cm1r1/ cos.X1/C gm2r2 cos.X1CX2/�X

05/C

X 07.ˇ sin.X2/X23 C .ıC ˇ cos.X2//X

03C ıX

04C gm2r2 cos.X1CX2//C

X 08.X01 �X3/CX

09.X

02 �X4/CX

010.X2 � �2uCX

0112/CX 012.�X2C �2l CX

0132/

with initial conditions

Xi .tI /D �i , i D 1, : : : , 4 with �i -assigned constant values,

Xi .tI /D 0, i D 5, : : : , 13, (38)

for both optimal control problems. The final conditions for the two boundary value problems havethe form

Figure 2. Sequence of configurations of the robot manipulator at times t D k 564

with k D 0, 1, : : : , 64corresponding to the optimal solution of a boundary value problem for a Pendubot with boundary condi-tions �1.tI / D ��=2 rad, �2.tI / D 0 rad, P�1.tI / D 0 rad/s, P�2.tI / D 0 rad/s, �1.tF / D ��=2 rad,�2.tF / D � rad, P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s, obtained with a discretization of tI , tF into 64subintervals. The initial and final times are tI D 0 and tF D 5 s, respectively. The corresponding controland state variables are represented in Figure 3. (a) k D 0, 1, : : : , 18; (b) k D 19, : : : , 38; (c) k D 39 : : : , 55;

and (d) k D 56, : : : , 64.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 13: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

Xi .tF /D �i , i D 1, : : : , 4 with �i -assigned constant values. (39)

The initial values of the control variable and multipliers have been set to 0, whereas their finalvalues have not been assigned in both optimal control problems. Therefore, transversality conditionsare needed in both cases for the variables Xi .tF /, i D 5, : : : , 13, and they will be of the form

GX0.tF , X.tF /, X0.tF //D 0, (40)

with XD .X1,X2,X3,X4, X/. We have used, for both problems, the following settings I´1 D I´2 D1 kg=m2, l1 D l2 D 1m, and m1 Dm2 D 1 kg.

10 20 30 40 50 60

-4

-3

-2

-1

1

(a)

10 20 30 40 50 60

-2

-1

1

2

3

(b)

10 20 30 40 50 60

-4

-2

2

4

(c)

10 20 30 40 50 60

-6

-4

-2

2

4

(d)

10 20 30 40 50 60

-15

-10

-5

5

10

(e)

Figure 3. Optimal solution of a boundary value problem for a Pendubot with boundary conditions �1.tI /D��=2 rad, �2.tI / D 0 rad, P�1.tI / D 0 rad/s, P�2.tI / D 0 rad/s, �1.tF / D ��=2 rad, �2.tF / D � rad,P�1.tF /D 0 rad/s, and P�2.tF /D 0 rad/s. The initial and final times are tI D 0 and tF D 5 s, respectively. The

corresponding sequence of configurations of the robot manipulator at times t D k 564

with k D 0, 1, : : : , 64

is represented in Figure 2. (a) �1; (b) �2; (c) P�1; (d) P�2; and (e) u1.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 14: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

8.1.1. Swing the Pendubot to the mid configuration. The following initial and final conditions havebeen imposed on the state variables.

�1.tI /D��=2 rad, �1.tF /D��=2 rad,

�2.tI /D 0 rad, �2.TF /D � rad,

P�1.tI /D P�1.tF /D P�2.tI /D P�2.tF /D 0 rad/s with �2� 6 �2 6 2� . The initial and final times aretI D 0 and tF D 5 s, respectively.

Figure 2 shows the sequence of configurations of the robot at times t D k 564

, k D 0, 1, : : : , 64,corresponding to the optimal solution of the boundary value problem obtained with a discretizationof the interval ŒtI , tF � into 64 subintervals. Because the configurations of the sequence overlap, ithas been split into smaller sequences for better visualization of the manipulator motion. Figure 3depicts the corresponding control and state variables. The value of the objective functional for theoptimal solution is 346.32486 J.

8.1.2. Swing the Pendubot to the top configuration. Another boundary value problem has beensolved with the following initial and final conditions

�1.tI /D��=2 rad, �1.tF /D �=2 rad,

�2.tI /D 0 rad, �2.tF /D 0 rad,

P�1.tI / D P�1.tF / D P�2.tI / D P�2.tF / D 0 rad/s, with �� 6 �2 6 � . The initial and final times aretI D 0 and tF D 6 s, respectively.

Figure 4 shows the sequence of configurations of the robot at times t D k 664

, k D 0, 1, : : : , 64,corresponding to the optimal solution of the boundary value problem obtained with a discretizationof the interval ŒtI , tF � into 64 subintervals. Figure 5 depicts the corresponding control and statevariables. The value of the objective functional for the optimal solution is 167.63922 J.

8.2. Acrobot

In this section, the optimal control problem of the Acrobot is studied. In this robot model, the firstjoint is not actuated, thus u D .0,u2/T . In this case, it is neither possible to integrate partially norcompletely the nonholonomic constraint because the conditions for partial integrability [46] are notfulfilled. Hence, the system is controllable. The numerical results of the application of our methodfor optimal control to two different boundary value problems for this system will be described.

Figure 4. Sequence of configurations of the solution at times t D k 664

with k D 0, 1, : : : , 64 corre-sponding to the optimal solution of a boundary value problem for a Pendubot with boundary conditions�1.tI /D��=2 rad, �2.tI /D 0 rad, P�1.tI /D 0 rad/s, P�2.tI /D 0 rad/s, �1.tF /D �=2 rad, �2.tF /D 0 rad,P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s, obtained with a discretization of tI , tF into 64 subintervals. The

initial and final times are tI D 0 and tF D 6 s, respectively. The corresponding control and state variablesare represented in Figure 5. (a) k D 0, 1, : : : , 16; (b) k D 17, : : : , 56; and (c) k D 57, : : : , 64.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 15: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

10 20 30 40 50 60

-3

-2

-1

1

10 20 30 40 50 60

-2

-1

1

2

10 20 30 40 50 60

-2

-1

1

2

3

4

5

10 20 30 40 50 60

-6

-4

-2

2

10 20 30 40 50 60

-10

-5

5

(a) (b)

(c) (d)

(e)

Figure 5. Control and state variables of the optimal solution of a boundary value problem for a Pendubotwith boundary conditions �1.tI / D ��=2 rad, �2.tI / D 0 rad, P�1.tI / D 0 rad/s, P�2.tI / D 0 rad/s,�1.tF / D �=2 rad, �2.tF / D 0 rad, P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s. The initial and finaltimes are tI D 0 and tF D 6 s, respectively. The corresponding sequence of configurations of the robotmanipulator at times t D k 6

64with k D 0, 1, : : : , 64 is represented in Figure 4. (a) �1, (b) �2, (c) P�1,

(d) P�2, and (e) u1.

In this case, (2) can be split into

.˛C 2ˇ cos �2/ R�1C .ıC ˇ cos �2/ R�2 � ˇ sin �2.2 P�1 P�2C P�22/C

g.l1m2Cm1r1/ cos �1C gm2r2 cos.�1C �2/D 0, (41)

.ıC ˇ cos �2/ R�1C ı R�2C ˇsin�2 P�12C gm2r2 cos.�1C �2/D u2. (42)

Simple bound constraints

�2l 6 �2 6 �2u (43)

will be considered, where �2l and �2u are the lower and upper bounds for �2.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 16: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

To express this optimal control problem which involves second-order differential constraints inthe form of a basic optimal control problem, we can proceed as explained in Section 8.1, intro-ducing the change of variables (30) and the additional differential constraints (31). In this way, thesecond-order differential equations (41)and (42) are converted into first-order differential equations

.˛C 2ˇ cos x2/x03C .ıC ˇ cos x2/x

04 � ˇ sin x2.2x3x4C x

24/C

g.l1m2Cm1r1/ cos x1C gm2r2 cos.x1C x2/D 0, (44)

.ıC ˇ cos x2/x03C ıx

04C ˇ sin x2x

23 C gm2r2 cos.x1C x2/D u2. (45)

The objective functional to minimize is

J D

Z tF

tI

u22 dt , (46)

and relations (31), (44), and (45) are now the differential constraints of the optimal control problem.Then, we apply the Bliss multiplier rule, introducing the new variables (35), where X1, : : : ,X5

are defined in (36), X 06 with X6.tI /D 0 is the multiplier associated with differential constraint (44),X 07 with X7.tI / D 0 is the multiplier associated with the differential constraint (45), and X 08 withX8.tI / D 0 and X 09 with X9.tI / D 0 are the multipliers associated with the additional differentialconstraints (31). Variables X10, : : : ,X13 are the multiplier and the excess variables associated withthe simple bound constraints (43), as explained in Section 8.1.

Thus, the unconstrained reformulation (25) of the problem is, in this case,

min I.X/DZ tF

tI

G.t , X, X0/dt , (47)

where

G DX 025 CX06.�ˇ sin.X2/X3X4 � ˇ sin.X2/X4.X3CX4/C .˛C 2ˇ cos.X2//X

03C

.ıC ˇ cos.X2//X04C g.l1m2Cm1r1/ cos.X1/C gm2r2 cos.X1CX2//C

X 07.ˇ sin.X2/X23 C .ıC ˇ cos.X2//X

03C ıX

04C gm2r2 cos.X1CX2/�X

05/C

X 08.X01 �X3/CX

09.X

02 �X4/CX

010.X2 � �2uCX

0112/CX 012.�X2C �2l CX

0132/

with initial conditions (38) and final conditions (39) for both optimal control problems.

Figure 6. Sequence of configurations of the robot manipulator at times t D k 264

with k D 0, 1, : : : , 64corresponding to the optimal solution of a boundary value problem for an Acrobot with boundary condi-tions �1.tI / D ��=2 rad, �2.tI / D 0 rad, P�1.tI / D 0 rad/s, P�2.tI / D 0 rad/s, �1.tF / D ��=2 rad,�2.tF / D � rad, P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s, obtained with a discretization of tI , tF into64 subintervals. The initial and final times are tI D 0 and tF D 2 s, respectively. The correspondingcontrol and state variables are represented in Figure 7. (a) k D 0, 1, : : : , 29; (b) k D 21, : : : , 48; and (c)

k D 49, : : : , 64.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 17: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

The initial values of the control variable and multipliers have been set to 0, whereas their finalvalues have not been assigned in both optimal control problems. Therefore, transversality condi-tions are needed in both cases for the variables Xi .tF /, i D 5, : : : , 13, and they will be of theform (40).

We have used, for both problems, the following settings I´1 D I´2 D 1 kg=m2, l1 D l2 D 1m,m1 Dm2 D 1 kg.

8.2.1. Swing the acrobot to the midconfiguration. A boundary value problem has been solved withthe following initial and final conditions

10 20 30 40 50 60

-2.0

-1.8

-1.6

-1.4

10 20 30 40 50 60

0.5

1.0

1.5

2.0

2.5

3.0

10 20 30 40 50 60

-2

-1

1

10 20 30 40 50 60

-1

1

2

3

4

5

10 20 30 40 50 60

-20

-10

10

20

(a) (b)

(c) (d)

(e)

Figure 7. Control and state variables of the optimal solution of a boundary value problem for an Acrobotwith boundary conditions �1.tI / D ��=2 rad, �2.tI / D 0 rad, P�1.tI / D 0 rad/s, P�2.tI / D 0 rad/s,�1.tF / D ��=2 rad, �2.tF / D � rad, P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s. The initial and finaltimes are tI D 0 and tF D 2 s, respectively. The corresponding sequence of configurations of the robotmanipulator at times t D k 2

64with k D 0, 1, : : : , 64 is represented in Figure 6. (a) �1, (b) �2, (c) P�1, (d) P�2,

and (e) u2.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 18: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

�1.tI /D��=2 rad, �1.tF /D��=2 rad

�2.tI /D 0 rad, �2.tF /D � rad,

P�1.tI /D P�1.tF /D P�2.tI /D P�2.tF /D 0 rad/s, with �2� 6 �2 6 2� . The initial and final times aretI D 0 and tF D 2 s, respectively.

Figure 6 shows the sequence of configurations of the robot at times k 232

with k D 0, 1, : : : , 64,corresponding to the optimal solution of the boundary value problem obtained with a discretizationof the interval ŒtI , tF � into 64 subintervals. Figure 7 depicts the corresponding control and statevariables. The value of the objective functional for the optimal solution is 151.294317 J.

8.2.2. Swing the acrobot to the top configuration. Another boundary value problem has been solvedwith the following conditions

�1.tI /D��=2 rad, �1.tF /D �=2 rad,

�2.tI /D 0 rad, �2.tF /D 0 rad,

P�1.tI / D P�1.tF / D P�2.tI / D P�2.tF / D 0 rad/s, with �� 6 �2 6 � . The initial and final times aretI D 0 and tF D 5 s, respectively.

Figure 8 shows the sequence of configurations of the robot at times k 564

with k D 0, 1, : : : , 64corresponding to the optimal solution of the boundary value problem obtained with a discretizationof the interval ŒtI , tF � into 64 subintervals. Figure 9 depicts the corresponding control and statevariables. The value of the objective functional for the optimal solution is 168.48311 J.

8.3. Computational issues

If the optimal control problem has m variables and the time interval ŒtI , tF � has been discretizedinto N subintervals, the resulting set of difference equations (26) has m � .N � 1/ equations andm�.N �1/ variables, plus the equations and variables caused by transversality conditions. Feasiblesolutions have been used as initial guesses of the algorithm.

The solution of the nonlinear system of the difference equations (26) has been obtained usinga damped Newton algorithm within a line search methodology, implemented in Mathematica 7(Wolfram Research, Inc., IL, USA) under Mac OS X operating system (see [54] and [55] for moredetails).

Figure 8. Sequence of configurations of the robot manipulator at times t D k 532

with k D 0, 1, : : : , 64corresponding to the optimal solution of a boundary value problem for an Acrobot with boundary conditions�1.tI /D��=2 rad, �2.tI /D 0 rad, P�1.tI /D 0 rad/s, P�2.tI /D 0 rad/s, �1.tF /D �=2 rad, �2.tF /D 0 rad,P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s, obtained with a discretization of tI , tF into 64 subintervals. The

initial and final times are tI D 0 and tF D 5 s, respectively. The corresponding control and state variablesare represented in Figure 9. (a) k D 0, 1, : : : , 11; (b) k D 12, : : : , 31; and (c) k D 32, : : : , 64.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 19: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

10 20 30 40 50 60

-3

-2

-1

1

10 20 30 40 50 60

-2

-1

1

2

10 20 30 40 50 60

-2

-1

1

2

3

4

10 20 30 40 50 60

-4

-2

2

4

6

10 20 30 40 50 60

-6

-4

-2

2

4

(a) (b)

(c) (d)

(e)

Figure 9. Control and state variables of the optimal solution of a boundary value problem for an Acrobotwith boundary conditions �1.tI / D ��=2 rad, �2.tI / D 0 rad, P�1.tI / D 0 rad/s, P�2.tI / D 0 rad/s,�1.tF / D �=2 rad, �2.tF / D 0 rad, P�1.tF / D 0 rad/s, and P�2.tF / D 0 rad/s. The initial and final times aretI D 0 and tF D 5 s, respectively. The corresponding sequence of configurations of the robot manipulatorat times t D k 5

32with k D 0, 1, : : : , 64 is represented in Figure 8. (a) �1, (b) �2, (c) P�1, (d) P�2, and (e) u2.

9. CONCLUSIONS

In this paper, the trajectory-planning problem for planar underactuated robot manipulators with tworevolute joints in the presence of gravity has been studied. This problem is solved as an optimalcontrol problem based on a numerical resolution of an unconstrained variational calculus reformu-lation of the optimal control problem in which the dynamic equations of the mechanical system areregarded as constraints. It has been shown that the reformulation method based on special deriva-tive multipliers is able to tackle nonintegrable differential constraints of the dynamic models ofunderactuated planar robot manipulators with two revolute joints in the presence of gravity.

REFERENCES

1. De Luca A, Iannitti S, Mattone R, Oriolo G. Underactuated manipulators: control properties and techniques. MachineIntelligence and Robotic Control 2002; 4(3):113–125.

2. Hargraves CR, Paris SW. Direct trajectory optimization using nonlinear programming and collocation. Journal ofGuidance, Control, and Dynamics 1987; 10(4):338–342.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 20: Energy-optimal trajectory planning for the Pendubot and the Acrobot

J. GREGORY, A. OLIVARES AND E. STAFFETTI

3. von Stryk O. Numerical solution of optimal control problems by direct collocation. In Optimal Control-Calculus ofVariations, Optimal Control Theory and Numerical Methods, Vol. 111, Bulirsch R, Miele A, Stoer J, Well K-H (eds),International Series of Numerical Mathematics. Birkhäuser: Basel, 1993; 129–143.

4. Kraft D. On converting optimal control problems into nonlinear programming problems. In Computational Math-ematical Programming, Vol. 15, Schitkowski K (ed.), NATO ASI Series F: Computer and Systems Sciences.Springer-Verlag: Berlin, 1985; 261–280.

5. Teo KL, Goh CJ, Wong KH. A Unified Computational Approach to Optimal Control Problems, Pitman Monographsand Surveys in Pure and Applied Mathematics, Vol. 55. Longman Scientific and Technical: England, 1991.

6. Betts JT. Practical Methods for Optimal Control Using Nonlinear Programming, Advances in Design and Control.SIAM: Philadelphia, PA, 2001.

7. Enright PJ, Conway BA. Discrete approximations to optimal trajectories using direct transcription and nonlinearprogramming. Journal of Guidance, Control and Dynamics 1992; 15(4):994–1002.

8. Hull DG. Conversion of optimal control problems into parameter optimization problems. Journal of DynamicSystems, Control and Dynamics 1997; 20:57–60.

9. Betts JT. Survey of numerical methods for trajectory optimization. Journal of Guidance, Control and Dynamics1998; 21(2):193–207.

10. Chernousko FL. Optimization in control of robots. In Computational Optimal Control, Vol. 115, Bulirsch R, Kraft D(eds), International Series of Numerical Mathematics. Birkhäuser-Verlag: Basel, 1994.

11. Steinbach MC, Bock HG, Kostin GV, Longman RW. Mathematical optimization in robotics: towards automated highspeed motion planning. Surveys on Mathematics for Industry 1998; 7:303–340.

12. Shiller Z, Dubowski S. Robot path planning with obstacles, actuator, gripper, and payload constraints. TheInternational Journal of Robotics Research 1989; 8(6):3–18.

13. Bobrow JE, Dubowsky S, Gibson JS. Time-optimal control of robot manipulators. The International Journal ofRobotics Research 1985; 4(3):3–17.

14. Shin KG, McKay ND. Minimum-time control of robotic manipulators with geometric path constraints. IEEETransactions on Automatic Control 1985; 30:531–541.

15. Shiller Z, Lu HH. Computation of path constrained time optimal motions with dynamic singularities. ASME Journalof Dynamic Systems, Measurement and Control 1992; 114(2):34–40.

16. Chen YC. Solving robot trajectory planning problems with uniform cubic B-splines. Optimal Control Applicationsand Methods 1991; 12:247–262.

17. Lee HWJ, Teo KL, Jennings LS. On optimal control of multi-link vertical planar robot arms systems moving underthe effect of gravity. Journal of Australian Mathematical Society, Series B 1997; 39:195–213.

18. Pontryagin LS, Boltyanski VG, Gamkrelidze VG, Mishchenko EF. The Mathematical Theory of Optimal Processes.Interscience Publishers: New York, 1962.

19. Kirk D. Optimal Control Theory: An Introduction, Prentice-Hall Networks Series. Prentice-Hall: New Jersey, 1970.20. Bryson AE, Ho Y-C. Applied Optimal Control: Optimization, Estimation, and Control. Hemisphere Publishing

Corporation: Washington DC, 1975.21. Callies R, Rentrop P. Optimal control of rigid-link manipulators by indirect methods. GAMM Mitteilungen 2008;

31(1):27–58.22. Ailon A, Langholz G. On the existence of time-optimal control of mechanical manipulators. Journal of Optimization

Theory and Applications 1985; 46(1):1–23.23. Chen Y. Existence and structure of minimum-time control for multiple robot arms handling a common object.

International Journal of Control 1991; 53:855–869.24. Chen Y. On the structure of the minimum-time control law for multiple robot arms handling a common object. IEEE

Transactions on Automatic Control 1991; 36(2):230–233.25. Weinreb A, Bryson AE. Optimal control of systems with hard control bounds. IEEE Transactions on Automatic

Control 1985; 27:1135–1138.26. Meier EB, Bryson AE. Efficient algorithm for time-optimal control of a two-link manipulator. Journal of Guidance,

Control, and Dynamics 1990; 13:859–866.27. Geering HP, Guzzella L, Hepner SA, Onder CH. Time-optimal motions of robots in assembly tasks. IEEE

Transactions on Automatic Control 1986; AC-31(6):512–518.28. Oberle HJ. Numerical computation of singular control functions for a two-link robot arm. In Optimal Control, Vol. 95,

Bulirsch R, Miele A, Stoer J, Well KH (eds), Lecture Notes in Control and Information Sciences. Springer-Verlag:Heidelberg, 1987; 244–253.

29. von Stryk O, Schlemmer M. Optimal control of the industrial robot Manutec r3. Computational Optimal Control1994:367–382.

30. Schöpf R, Deuflhard P. A mixed symbolic-numeric optimal control calculator. In Computational optimal control,Bulirsch R, Kraft D (eds). Birkhäuser-Verlag, 1994; 269–278.

31. Wie B, Chuang C-H, Sunkel J. Minimum-time pointing control of a two-link manipulator. Journal of Guidance,Control and Dynamics 1990; 13:867–873.

32. Bessonnet G, Sardain P, Chesseé S. Optimal motion synthesis—dynamic modelling and numerical solving aspects.Multibody System Dynamics 2002; 8:257–278.

33. Hestenes MR. Calculus of Variations and Optimal Control Theory, Applied Mathematics Series. John Wiley & Sons:New York, 1966.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca

Page 21: Energy-optimal trajectory planning for the Pendubot and the Acrobot

TRAJECTORY PLANNING FOR PENDUBOT AND ACROBOT

34. Bliss GA. Lectures on the Calculus of Variations. The University of Chicago Press: Chicago, 1963.35. Koon WS, Marsden JE. Optimal control for holonomic and nonholonomic mechanical systems with symmetry and

Lagrangian reduction. SIAM Journal of Control and Optimization 1997; 35:901–929.36. Bloch AM. Nonholonomic Mechanics and Control, Interdisciplinary Applied Mathematics, Vol. 24. Springer-Verlag:

New York, 2003.37. Hussein II, Bloch AM. Optimal control of underactuated nonholonomic mechanical systems. IEEE Transactions on

Automatic Control 2008; 53(3):668–682.38. Fantoni I, Lozano R, Spong MW. Energy based control of the Pendubot. IEEE Transactions on Automatic Control

2000; 45(4):725–729.39. Zhang M, Tarn TJ. Hybrid control of the Pendubot. IEEE/ASME Transactions on Mechatronics 2002; 7(1):

79–86.40. Yabuno H, Goto K, Aoshima N. Swing-up and stabilization of an underactuated manipulator without state feedback

of free joint. IEEE Transactions on Robotics and Automation 2004; 20(2):359–365.41. Albahkali T, Mukherjee R, Das T. Swing-up control of the pendubot: an impulse–momentum approach. IEEE

Transactions on Robotics 2009; 25(4):975–982.42. Spong M. The swing up control problem for the acrobot. IEEE Control Systems Magazine 1995; 15(1):49–55.43. Xin X, Kaneda M. Analysis of the energy-based swing-up control of the acrobot. International Journal of Robust

and Nonlinear Control 2007; 17(16):1503–1524.44. Banavar RN, Mahindrakar AD. A non-smooth control law and time-optimality notions for the acrobot. International

Journal of Control 2005; 78(15):1166–1173.45. Lai XZ, She JH, Yang SX, Wu M. Comprehensive unified control strategy for underactuated two-link manipulators.

IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics 2009; 39(2):389–397.46. Oriolo G, Nakamura Y. Control of mechanical systems with second-order nonholonomic constraints: underactuated

manipulators. Proceedings of the 30th Conference on Decision and Control, 1991.47. Sussmann HJ. A general theorem on local controllability. SIAM Journal on Control and Optimization 1987;

25(1):158–194.48. Bullo F, Lewis AD, Lynch KM. Controllable kinematic reductions for mechanical systems: concepts, computational

tools, and examples. Proceedings of International Symposium on Mathematical Theory of Networks and Systems,2002.

49. Bullo F, Lewis AD. Low-order controllability and kinematic reductions for affine connection control systems. SIAMJournal on Control and Optimization 2005; 44(3):885–908.

50. Lewis AD, Murray RM. Configuration controllability of simple mechanical control systems. SIAM Journal onControl and Optimization 1997; 35(3):766–790.

51. Bullo F, Lynch KM. Kinematic controllability for decoupled trajectory planning in underactuated mechanicalsystems. IEEE Transactions on Robotics and Automation 2001; 17(4):402–412.

52. Gregory J, Lin C. Constrained Optimization in the Calculus of Variations and Optimal Control Theory. Chapman &Hall: London, 1996.

53. Gregory J, Wang S. Discrete variable methods for the m-dependent variable nonlinear, extremal problem in thecalculus of variations. SIAM Journal of Numerical Analysis 1990; 27(2):480–487.

54. Wolfram Research. URL http://www.wolfram.com.55. More JJ, Thuente DJ. Line search algorithms with guaranteed sufficient decrease. ACM Transactions on

Mathematical Software 1994; 20(3):286–307.

Copyright © 2012 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. (2012)DOI: 10.1002/oca