support vector machine


DESCRIPTION

Support vector machine optimal control for mobile wheeled inverted pendulums with unmodelled dynamics

TRANSCRIPT

  • Contents lists available at ScienceDirect

    Neurocomputing

    journal homepage: www.elsevier.com/locate/neucom

    Neurocomputing 73 (2010) 2773–2782

    Support vector machine optimal control for mobile wheeled inverted pendulums with unmodelled dynamics

    Zhijun Li a,*, Yunong Zhang b, Yipeng Yang a

    a Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China

    b School of Information Science and Technology, Sun Yat-Sen University, Guangzhou 510275, China

    Article info

    Article history: Received 1 September 2009; received in revised form 27 December 2009; accepted 12 April 2010; communicated by S. Hu; available online 4 May 2010.

    Keywords: Mobile wheeled inverted pendulum; Support vector machine; Zero dynamics

    Abstract

    The dynamic balance and motion control based on LS-SVM (least squares support vector machine) are considered for mobile wheeled inverted pendulums (WIP), in the presence of parametric and functional dynamics uncertainties. Based on Lyapunov synthesis, the proposed control mechanisms use the advantage of LS-SVM combined with on-line parameters estimation strategy in order to have an efficient approximation. Under the controller designed, we can ensure that the outputs of the system track the given bounded reference signals within a small neighborhood of zero, and guarantee semi-global uniform boundedness of all the closed loop signals. Simulation results are presented to verify the effectiveness of the proposed control.

    © 2010 Elsevier B.V. All rights reserved.

    1. Introduction

    Wheeled inverted pendulums shown in Fig. 1 are not planar and the motors driving the wheels are directly mounted on the pendulum body [4], which are different from cart and pendulums [7–9]. Recently, more researches have been done to investigate the dynamics and control of wheeled inverted pendulums [1–5]. Wheeled inverted pendulums belong to under-actuated dynamic system, where the number of control inputs is less than the number of degrees of freedom to be stabilized [6]. It is difficult to apply the conventional control approach for under-actuated dynamic systems. Therefore, some control designs have been proposed to guarantee stability and robustness for mobile wheeled inverted pendulums. Moreover, because of intrinsic under-actuated system, the dynamics of wheeled inverted pendulums systems is nonlinear and coupled, which can be described by coupled nonlinear differential equations. However, it is often possible to obtain an approximated linearized model around an operating point, where the signals involved are small enough. Several design of controllers and analysis techniques for linear systems were proposed, i.e., motion control using linear state-space model was proposed in [12], and the control was designed based on the dynamic equations linearized around an operating point [1]. In [10], dynamics involving the pitch of the inverted pendulum was studied and the rotation angles of the two wheels as the variables of interest were presented, and in [11], a linear controller was designed. In [13], only a planar model without yaw was considered, a linear stabilizing controller was derived based on this mode. In [14], although the exact dynamics of two-wheeled inverted pendulum was investigated, only the linear feedback control was developed on the dynamic model. In [4], based on partial feedback linearization, a two-level velocity controller and a stabilizing position controller were proposed.

    Feedback linearization is an approach to nonlinear control design which has attracted a great deal of research interest in recent years. The central idea of the approach is to algebraically transform a nonlinear system dynamics into a (fully or partly) linear one, so that linear control techniques can be applied. In the transformation, the exact dynamics of system must be known beforehand, the model-based control can provide an effective solution to the problem. However, wheeled inverted pendulum is characterized by unstable balance and unmodelled nonlinear dynamics, and there are time varying external disturbances, in the form of parametric and functional uncertainties, which is not possible to model accurately. Modeling errors might undermine the control approach based on linearized model [1,12] and the control proposed on the velocity level [4]. Therefore, model-based control may not be the ideal choice since the dynamics is not usually available. The presence of uncertainties and

    This work is partly supported by Mobile Computing Education research program of Microsoft Research Asia (no. FY08-RES-THEME-154) and Shanghai Pujiang Program (no. 08PJ1407000) and the National Natural Science Foundation of China (nos. 60804003, 60935001).

    * Corresponding author. E-mail addresses: [email protected], [email protected] (Z. Li), [email protected] (Y. Zhang).

    0925-2312/$ - see front matter © 2010 Elsevier B.V. All rights reserved.

    doi:10.1016/j.neucom.2010.04.009

  • Nomenclature

    $q_v$: the vector of generalized coordinates for the mobile platform with $q_v = [q_1, q_2, q_3]^T = [\theta, x, y]^T \in R^3$

    $x, y$: the position coordinates of the mid-point of the two driving wheels

    $\theta$: the heading angle in motion relative to the x-axis of the fixed frame

    $\alpha$: the tilt angle relative to z-axis of the fixed frame

    $D_v(q)$: the inertia matrices for the mobile platform with $D_v(q) \in R^{3 \times 3}$

    $D_a$: the inertia matrix for the inverted pendulum

    $D_{va}, D_{av}$: the coupling inertia matrices of the mobile platform and the inverted pendulum

    $C_v, C_a$: the centripetal and coriolis torques for the mobile platform and the inverted pendulum with $C_v \in R^{3 \times 3}$ and $C_a \in R$

    $C_{va}, C_{av}$: the coupling centripetal and coriolis torques of the mobile platform and the inverted pendulum

    $G_v, G_a$: the gravitational torque vectors for the mobile platform and the inverted pendulum with $G_v \in R^3$ and $G_a \in R$

    $\tau_v$: the control input vector for the mobile platform with $\tau_v \in R^2$

    $f_v(t), f_a(t)$: the external friction force on the mobile platform and the inverted pendulum with $f_v(t) \in R^3$ and $f_a(t) \in R^1$

    $d_v(t), d_a(t)$: the external disturbances on the mobile platform and the inverted pendulum with $d_v(t) \in R^3$ and $d_a(t) \in R^1$

    $B_v$: a full rank input transformation matrix, assumed to be known because it is a function of fixed geometry of the system, with $B_v \in R^{3 \times 2}$

    $J_v^T$: Jacobian matrix with $J_v^T \in R^3$

    $\lambda$: Lagrangian multipliers corresponding to the nonholonomic constraints

    Fig. 1. Mobile wheeled inverted pendulum.

    could disrupt the function of the model-based feedback control, and lead to the unstable balance. How to handle the parametric and functional uncertainties, unmodelled dynamics, and disturbances from the environment is one of the important issues in the control of wheeled inverted pendulum. The wheeled inverted pendulum is definitely different from other nonholonomic systems subject to (i) only kinematic constraints which geometrically restrict the direction of mobility, i.e., wheeled mobile robot [27,28], (ii) only dynamic constraints due to dynamic balance at passive degrees of freedom where no force or torque is applied [33], i.e., the manipulator with passive link [29,30]. It belongs to (iii) not only kinematic constraints but also dynamic constraints. Therefore, the wheeled inverted pendulum is more complex than the former two cases, so the previously proposed control approaches suitable for (i) and (ii) could not be applied directly to wheeled inverted pendulums.

    A challenging problem is to control a mobile wheeled inverted pendulum system whose cart is no longer constrained to the guide rail like cart-pendulum systems, but moves in its terrain while balancing the pendulum. Moreover, the control for wheeled inverted pendulums is different and difficult compared with other full-actuated systems because they consist of multiple under-actuated configurations.

    Least squares support vector machine (LS-SVM) has been proposed for solving nonlinear function estimation problems [23,22]. LS-SVM takes equality instead of inequality constraints of SVM in the problem formulation such that LS-SVM is easy to train. The previous works about SVM learning approaches have been proposed in [15–18] for modeling nonlinear system and SVM-based nonlinear controls; however, those works lack the definite stability proof of the closed-loop system using SVM approaches. In [19], the control design on the support vector machine (SVM) is developed to achieve accurate haptic display, the approximation model of friction is established off-line through SVM learning for online feed forward friction compensation. However, in practical control applications, it is desirable to have systematic methods to ensure on-line stability, robustness, and performance of the overall system in the unknown environments beforehand. Although the on-line adaptive fuzzy approximation is a well-known technique, for the real-time mechanical system, it is difficult to choose a satisfying fuzzy rule number: the performance would be degraded with less rule number; on the other hand, the explosion of rule number would bring great trouble to the limited computation resource. However, support vector machines approach could avoid this problem. To our best knowledge, there are few works dealing with SVM-based control proposed for the wheeled inverted pendulum up to now.

    The previous works, which utilized the SVMs approximation for system modelling, assume beforehand that the state variables of the system are bounded in a compact set since the approximated data can be provided off-line, while for the stability of system by on-line SVM approximation, the system errors cannot be predicted. It is not reasonable to assume that the bounds within the compact set beforehand. The direct SVMs approximation is not valid, while many applications using off-line SVM approximation (all data obtained are bounded) limit the extension of SVMs for the control of nonlinear dynamic systems, for example wheeled inverted pendulums.

    In this paper, by discovering and utilizing the unique physical property of the wheeled inverted pendulum, we could decouple the system to simplify the model, such that we can design the control easily. Moreover, we make full use of the physical properties of the wheeled inverted pendulum and then design on-line LS-SVM based control by sliding window to accommodate the presence of parametric and functional uncertainties in the dynamics of wheeled inverted pendulums. The developed SVM-based control combining the physical properties of the system is original.

    The main contributions of this paper lie in:

    (i) the developed SVM-based control combining the physical properties of the system for achieving better performance and simplifying control design and analysis;

    (ii) the use of SVMs in compensating for parametric and functional uncertainties commonly encountered in wheeled inverted pendulums control and the rigorous stability

  • analysis which shows that semi-global uniform boundedness of the tracking errors is guaranteed;

    (iii) the input-to-state stability properties of its zero dynamics are justified and used to derive the bounds on the tracking errors.

    2. System description

    2.1. Problem formulation and preliminaries

    Lemma 2.1 (Ge et al. [21]). Let $e = H(s) r$ with $H(s)$ representing an $(n \times m)$-dimensional strictly proper exponentially stable transfer function, $r$ and $e$ denoting its input and output, respectively. Then $r \in L_2^m \cap L_\infty^m$ implies that $e, \dot e \in L_2^n \cap L_\infty^n$, $e$ is continuous, and $e \to 0$ as $t \to \infty$. If, in addition, $r \to 0$ as $t \to \infty$, then $\dot e \to 0$.

    2.2. Dynamics of mobile wheeled inverted pendulums

    Consider the following wheeled inverted pendulum dynamics described by Lagrangian formulation:

    $$\begin{bmatrix} D_v & D_{va} \\ D_{av} & D_a \end{bmatrix} \begin{bmatrix} \ddot q_v \\ \ddot\alpha \end{bmatrix} + \begin{bmatrix} C_v & C_{va} \\ C_{av} & C_a \end{bmatrix} \begin{bmatrix} \dot q_v \\ \dot\alpha \end{bmatrix} + \begin{bmatrix} F_v \\ F_a \end{bmatrix} + \begin{bmatrix} G_v \\ G_a \end{bmatrix} + \begin{bmatrix} d_v \\ d_a \end{bmatrix} = \begin{bmatrix} B_v \tau_v \\ 0 \end{bmatrix} + \begin{bmatrix} J_v^T \lambda \\ 0 \end{bmatrix} \tag{1}$$

    Assumption 2.1. The nonholonomic constraints on the mobile wheeled inverted pendulum are known and not to be violated.

    Remark 2.1. In actual implementation, we can adopt the methods of producing enough friction between the wheels of the mobile platform and the ground such that the assumption of nonholonomic constraints holds.

    2.3. Reduced dynamics

    The vehicle is subjected to nonholonomic constraints, $J_v$ is the kinematic constraint matrix related to nonholonomic constraints, which can be expressed as

    $$J_v \dot q_v = 0 \tag{2}$$

    We could find a set of smooth and linearly independent vector fields $S_1(q)$ and $S_2(q)$ constituting the matrix $S = [S_1(q), S_2(q)] \in R^{3 \times 2}$ with full rank $S^T S$, which in the local coordinates satisfy the following relation [20]:

    $$S^T J_v^T = 0 \tag{3}$$

    Constraint Eq. (2) implies the existence of vector $\dot w = [\omega, v]^T \in R^2$, with $\omega$ representing the component of the angular velocity of the platform denoted in Fig. 1 and $v$ the forward velocity of mobile platform, such that

    $$\dot q_v = S(q) \dot w \tag{4}$$

    Considering the derivative of (4), define new variables $\dot z = [\dot w, \dot\alpha]^T = [\dot z_1, \dot z_2, \dot z_3]^T = [\omega, v, \dot\alpha]^T$, and multiplying $\mathrm{diag}[S^T, I]$ by both sides of (1) to eliminate $J_v^T$, the dynamics of wheeled inverted pendulum can be expressed as

    $$\begin{bmatrix} S^T D_v S & S^T D_{va} \\ D_{av} S & D_a \end{bmatrix} \begin{bmatrix} \ddot w \\ \ddot\alpha \end{bmatrix} + \begin{bmatrix} S^T D_v \dot S + S^T C_v S & S^T C_{va} \\ D_{av} \dot S + C_{av} S & C_a \end{bmatrix} \begin{bmatrix} \dot w \\ \dot\alpha \end{bmatrix} + \begin{bmatrix} S^T F_v \\ F_a \end{bmatrix} + \begin{bmatrix} S^T G_v \\ G_a \end{bmatrix} + \begin{bmatrix} S^T d_v \\ d_a \end{bmatrix} = \begin{bmatrix} S^T B_v \tau_v \\ 0 \end{bmatrix} \tag{5}$$

    2.4. Control objectives

    Assumption 2.2 (Dong et al. [25], Chang and Chen [26]). The desired trajectories $z_{1d}(t)$ and $z_{3d}(t)$ and their time derivatives up to the 3rd order are continuously differentiable and bounded for all $t \ge 0$.

    Remark 2.2. Since we can plan and design the desired trajectory for $z_{1d}(t)$ and $z_{3d}(t)$ before implementing control, it is reasonable and feasible that we give the trajectory satisfying Assumption 2.2.

    The objective of the control can be specified as: given desired trajectories $z_{1d}(t)$ and $z_{3d}(t)$, we are to determine a control law such that for any $(z_j(0), \dot z_j(0)) \in \Omega$, $z_j, \dot z_j$ converge to a manifold $\Omega_d$ specified as

    $$\Omega_d = \{ (z_j, \dot z_j) \mid |z_j - z_{jd}| \le \epsilon_{j1}, \ |\dot z_j - \dot z_{jd}| \le \epsilon_{j2} \} \tag{6}$$

    where $\epsilon_{ji} > 0$, $i = 1, 2$, $j = 1, 3$. Ideally, $\epsilon_{ji}$ should be the threshold of tolerable noise. At the same time, all the closed loop signals are to be kept bounded. The variables $z_1$ and $z_3$ can be thought as output equation of the system; the choice of $z$ is, as an example, illustrated in the next section.

    Assumption 2.3. $F_1(\dot z) = [f_1, f_2, f_3]^T$ is an independent vector, for simplification and convenience, such as $F_1(\dot z) = \rho \dot z$ with the friction coefficient $\rho$, which is diagonal positive definite.

    Remark 2.3. The friction model can be found in [34], and $F_1$ is a function including $\dot z$; for simplification, we give the above assumption.

    2.5. Physical properties

    It is observed that the dynamics of wheeled inverted pendulum, which is listed in Section 5, by exploiting the physical properties of mobile wheeled inverted pendulum embedded in the dynamics of (5), are given by

    $$D_1 = \begin{bmatrix} S^T D_v S & S^T D_{va} \\ D_{av} S & D_a \end{bmatrix} = \begin{bmatrix} D_{11}(z_3) & 0 & 0 \\ 0 & D_{22} & D_{23}(z_3) \\ 0 & D_{32}(z_3) & D_{33} \end{bmatrix} \tag{7}$$

    $$C_1 = \begin{bmatrix} S^T D_v \dot S + S^T C_v S & S^T C_{va} \\ D_{av} \dot S + C_{av} S & C_a \end{bmatrix} = \begin{bmatrix} C_{11} & 0 & C_{13} \\ 0 & 0 & C_{23} \\ C_{31} & 0 & 0 \end{bmatrix} \tag{8}$$

    $$\begin{bmatrix} S^T F_v \\ F_a \end{bmatrix} = \begin{bmatrix} f_1(\dot z_1) \\ f_2(\dot z_2) \\ f_3(\dot z_3) \end{bmatrix}, \quad \begin{bmatrix} S^T d_v \\ d_a \end{bmatrix} = [d_1, d_2, d_3]^T, \quad \begin{bmatrix} S^T B_v \tau_v \\ 0 \end{bmatrix} = \begin{bmatrix} \tau_1 \\ \tau_2 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} S^T G_v \\ G_a \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ g_3(z_3) \end{bmatrix}$$

    where $D_{22}$, $D_{33}$ are unknown constants, and $D_{11}(z_3)$, $D_{23}(z_3)$, $D_{32}(z_3)$, $C_{11}$, $C_{13}$, $C_{23}$, $C_{31}$, $f_1(\dot z_1)$, $f_2(\dot z_2)$, $f_3(\dot z_3)$, $g_3(z_3)$, and $d_1, d_2, d_3$ are unknown functions.

    From (1), (7) and (8), we can obtain three subsystems: $z_1$-subsystem, $z_2$-subsystem, and $z_3$-subsystem, respectively, as follows:

    $$D_{11}(z_3) \ddot z_1 + C_{11} \dot z_1 + C_{13} \dot z_3 + f_1 + d_1 = \tau_1 \tag{9}$$

  • $$\frac{D_{22} D_{33} - D_{23}^2(z_3)}{D_{33}} \ddot z_2 - \frac{D_{23}(z_3)}{D_{33}} \left( C_{31} \dot z_1 + g_3(z_3) + f_3(\dot z_3) + d_3 \right) + C_{23} \dot z_3 + f_2(\dot z_2) + d_2 - \tau_2 = 0 \tag{10}$$

    $$-\frac{D_{22} D_{33} - D_{23}^2(z_3)}{D_{23}(z_3)} \ddot z_3 - \frac{D_{22}}{D_{23}(z_3)} \left( C_{31} \dot z_1 + g_3(z_3) + f_3(\dot z_3) + d_3 \right) + C_{23} \dot z_3 + f_2(\dot z_2) + d_2 = \tau_2 \tag{11}$$

    Define the tracking errors and the filtered tracking errors as

    $$e_j = z_j - z_{jd} \tag{12}$$

    $$r_j = \dot e_j + \Lambda_j e_j \tag{13}$$

    where $\Lambda_j$ is a positive number, and $j = 1, 3$. Therefore, we could study the stability of $e_j$ and $\dot e_j$ by the properties of $r_j$. Moreover, it is easy to have the following computable signals:

    $$\dot z_{jr} = \dot z_{jd} - \Lambda_j e_j \tag{14}$$

    $$\ddot z_{jr} = \ddot z_{jd} - \Lambda_j \dot e_j \tag{15}$$

    3. LS-SVM based controller design

    Because the dynamic uncertainties of the system are usually hard to measure, such as friction forces and disturbances to the system, we need to estimate those uncertainties to design an effective controller. In this paper, we make use of support vector machine (SVM) for on-line nonlinear system identification.

    3.1. LS-SVM

    For support vector machine regression, the basic idea is to map the data to a higher dimensional feature space $H$ (reproducing kernel Hilbert space), via a nonlinear mapping $\phi$, and then to do the linear regression in this space. Therefore given a training set of $l$ training samples $(x_1, y_1), \ldots, (x_l, y_l) \in R^n \times R$, we introduce a nonlinear mapping $\phi : R^n \to H \in R^h$, which maps the training samples to a new data set $(\phi(x_1), y_1), \ldots, (\phi(x_l), y_l)$. In $\epsilon$-insensitive support vector regression the goal is to estimate the following function:

    $$\hat f(x) = \langle w, \phi(x) \rangle + b; \quad w \in R^h, \ b \in R \tag{16}$$

    where $w$ and $b$ are the coefficients, which are estimated by the risk function

    $$R = \min_{w, b, E} \left( \frac{1}{2} \|w\|^2 + c \, \frac{1}{2} \sum_{i=1}^{l} E_i^2 \right) \quad \text{s.t.} \quad y_i - \hat f(x_i) = E_i \tag{17}$$

    where $l$ is the number of the training samples and the constant $c > 0$ measures the trade-off between complexity and losses.

    Construct a Lagrangian to solve the optimization problem:

    $$\max_{\alpha} \min_{w, b} \left\{ L = \frac{1}{2} w^T w + \frac{1}{2} c \sum_{i=1}^{l} E_i^2 + \sum_{i=1}^{l} \alpha_i \left( y_i - w^T \phi(x_i) - b - E_i \right) \right\} \tag{18}$$

    According to Karush–Kuhn–Tucker optimization condition, we can seek the optimal solution and transform this optimization problem into a matrix function as

    $$\begin{bmatrix} 0 & 1 & \cdots & 1 \\ 1 & K(x_1, x_1) + \frac{1}{c} & \cdots & K(x_1, x_l) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & K(x_l, x_1) & \cdots & K(x_l, x_l) + \frac{1}{c} \end{bmatrix} \begin{bmatrix} b \\ \alpha_1 \\ \vdots \\ \alpha_l \end{bmatrix} = [0, y_1, \ldots, y_l]^T \tag{19}$$

    where $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$, $i, j = 1, \ldots, l$, is a kernel function which satisfies Mercer's theorem.

    The obtained nonlinear model is

    $$\hat f(x) = \sum_{i=1}^{l} \alpha_i K(x_i, x_j) + b \tag{20}$$

    where $x_i$ and $\alpha_i$ are obtained from solving the set of linear equations (19) and $x_j$ is the $j$th actual state vector; if the on-line sliding window is introduced as in the next section, $x_j$ is the actual state vector at time $j$. The data $x_i$ is used as support vector data for the control signal [32].
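The linear system (19) and the model (20) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names `rbf_kernel`, `lssvm_fit`, and `lssvm_predict`, the test function sin(3x), and the value c = 100 are assumptions made for the example; the RBF kernel and the width 0.25 follow the simulation settings quoted in Section 5.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-||A_i - B_j||^2 / (2 sigma^2))."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, c, sigma):
    """Solve the (l+1) x (l+1) system (19) for the bias b and multipliers alpha."""
    l = X.shape[0]
    M = np.zeros((l + 1, l + 1))
    M[0, 1:] = 1.0                      # first row: sum of alpha_i equals 0
    M[1:, 0] = 1.0                      # first column: bias term b
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(l) / c
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(M, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate model (20): f_hat(x) = sum_i alpha_i K(x_i, x) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Hypothetical 1-D regression problem used only to exercise the solver.
X = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
y = np.sin(3.0 * X[:, 0])
c = 100.0
b, alpha = lssvm_fit(X, y, c=c, sigma=0.25)
y_hat = lssvm_predict(X, b, alpha, X, sigma=0.25)
max_err = float(np.max(np.abs(y_hat - y)))
```

Because LS-SVM uses equality constraints, training reduces to one dense linear solve, and the training residuals satisfy y_i - f_hat(x_i) = alpha_i / c, which gives a quick consistency check on any implementation.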

    3.2. On-line training with time window

    In order to have a proper performance of LS-SVM, we need to select as many samples as possible for training; however, the dimension of SVM will greatly increase in the process of on-line training. Since the control should depend on the current state of the nonlinear dynamic system, the training data collected earlier might not suit the real time system, and a large data set might lead to time consuming calculation. Therefore, a sliding time window of length $l$ is constructed with a selected sampling time interval, and sample data are collected in order from current to past. Moreover, a new data sample is collected while the oldest data is dropped. We assume that the nearest data can more properly describe the feature of the system than the oldest data.

    Theorem 3.1. For any given continuous real function $f(x)$ on a compact set $U \subset R^n$ and arbitrary $\epsilon > 0$, there exists an LS-SVM approximation $\hat f(x)$ formed by (20) such that

    $$\sup_{x \in U} |\hat f(x) - f(x)| < \epsilon \tag{21}$$

    Proof. See Wang et al. [31]. □

    Theorem 3.2. For any given continuous real function $f(x)$ on a compact set, for a large enough length of sliding time window $l$ combined with properly selected sampling time interval and any given $\epsilon > 0$, there exists an LS-SVM approximation function $\hat f(x)$ formed by (20) such that

    $$\sup_{t \in [T-l+1, \, T]} |\hat f(x) - f(x)| \le \epsilon \tag{22}$$

    Proof. Let $\{\alpha_1, \alpha_2, \ldots, \alpha_l\}$ be the estimated weights of LS-SVM by minimizing $L$ given by (18). Let the input data be $(x_{T-l+1}, y_{T-l+1}), \ldots, (x_{T-1}, y_{T-1}), (x_T, y_T)$; here, $T$ denotes the current time. The length $l$ and the selected sampling time interval determine the size of data samples. From (18), we know the regularization constant $c$ determines the trade-off between the empirical risk and the model complexity. From Theorem 3.1 and [24], when the time interval is properly selected and $l$ is sufficiently large, we can obtain that for any given $\epsilon > 0$, there exists an LS-SVM approximation function $\hat f(x)$ formed by (20) such that

    $$\sup_{t \in [T-l+1, \, T]} |\hat f(x) - f(x)| \le \epsilon \tag{23}$$

    □
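The sliding-window scheme described above (add the newest sample, drop the oldest, retrain on the window) can be sketched as follows. This is a minimal sketch under stated assumptions: the class name `SlidingWindowLSSVM`, the streamed signal sin(2t), the window length 50, and the choice to refit by a full linear solve at every step are all illustrative and not taken from the paper, which does not say how the retraining is implemented.

```python
from collections import deque
import numpy as np

class SlidingWindowLSSVM:
    """LS-SVM regressor retrained on the most recent `window` samples (hypothetical helper)."""

    def __init__(self, window, c, sigma):
        self.xs = deque(maxlen=window)   # oldest sample is dropped automatically
        self.ys = deque(maxlen=window)
        self.c, self.sigma = c, sigma
        self.b, self.alpha = 0.0, np.zeros(0)

    def _kernel(self, A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def update(self, x, y):
        """Append one (x, y) pair and re-solve system (19) on the current window."""
        self.xs.append(np.atleast_1d(np.asarray(x, dtype=float)))
        self.ys.append(float(y))
        X = np.vstack(list(self.xs))
        l = len(self.ys)
        M = np.zeros((l + 1, l + 1))
        M[0, 1:] = 1.0
        M[1:, 0] = 1.0
        M[1:, 1:] = self._kernel(X, X) + np.eye(l) / self.c
        sol = np.linalg.solve(M, np.concatenate(([0.0], np.array(list(self.ys)))))
        self.b, self.alpha = sol[0], sol[1:]

    def predict(self, x):
        """Evaluate model (20) at a single point x."""
        X = np.vstack(list(self.xs))
        k = self._kernel(np.atleast_2d(np.asarray(x, dtype=float)), X)
        return float((k @ self.alpha)[0] + self.b)

# Stream a hypothetical signal through a window of 50 samples.
model = SlidingWindowLSSVM(window=50, c=100.0, sigma=0.25)
ts = np.arange(0.0, 3.0, 0.01)
for t in ts:
    model.update(t, np.sin(2.0 * t))
window_len = len(model.ys)
oldest_t = float(model.xs[0][0])
pred_last = model.predict(ts[-1])
```

Refitting from scratch costs O(l^3) per step, which is the computation cost that Remark 3.1 weighs against the accuracy gained from a longer window.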

  • D11z3_r1C11r1 t1m1 27

    1 2

    Z. Li et al. / Neurocomputing 73 (2010) 27732782 2777therefore, the close loop system for z1 and z3 coupled systembecomes

    D_rCr U 31where

    DD11z3 0

    0D22D33D223z3

    D23z3

    264

    375

    C C11 00 ~C23

    " #

    r r1 r3T , U u1 u2TD22D33D223z3D23z3

    _r3 ~C23r3 t2m3 28

    Dene now control new inputs as

    u1 t1m1 29

    u2 t2m3 30where u and u are auxiliary control inputs to be optimized later,Remark 3.1. For SVMs approximation, we know the computationcost of SVMs is directly proportional to the number of trainingdata pair. However, in order to guarantee high accuracy, we needto choose a large sliding window presented in Theorem 3.2, forexample, in the simulation presented later, we choose the slidingwindows containing 200 sample data, the control performance isgood. Since SVM system approximation ability only holds on acompact set, it is more rigorous to assume the states are within asufciently large compact set beforehand. Otherwise, the SVMsystem approximation will be violated. After the derivation, wecan nd know that if we choose the sliding window to be largeenough for SVM system approximation to cover the sufcientlylarge compact set, the states are bounded in the set by themselves.

    4. LS-SVM based control design

    In reality, dynamics of the system cannot be exactly known. Inaddition, external disturbances may also affect the performance ofthe system. In this section, we take both factors into considerationto develop an LS-SVM based control with adaptive law to dealwith uncertainties as well as external disturbances.

    4.1. z1 and z3- subsystems

    Since _zj _zjrrj and z j zjr _r j, j1, 3, let

    m1 C11 _z1rC13 _z3 f1 _z1d1D11z3 z1r 24

    m3 D22D33D223z3

    D23z3z3rC^23r3C23 _z3r

    D22D23z3

    C31 _z1 f3g3d3f2d2 25

    where we decompose C23 C^23 ~C23 such thatd

    dt

    D22D33D223z3D23z3

    2 ~C23 0 26

    Eqs. (9) and (11) becomeThe following properties are useful for the stability.Property 4.1. The inertia matrix D1z is positive denite andsymmetric, from which we know that the terms D11z3 andD22D33D223z3=D33 are positive.

    Property 4.2. The inertia matrix D is positive denite and symmetric.Property 4.3. The matrix _D12C1 is skew-symmetric, from whichwe can obtain d=dtD11z32C11 0.

    Remark 4.1. Property 4.3 can be veried by the dynamics modelpresented in Section 5.

    From Property 4.3 and (26), we could obtain

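    Property 4.4. The matrix $\dot D - 2C$ is skew-symmetric, and then we could obtain $x^T (\dot D - 2C) x = 0$, where $x \in R^2$.

    From (13), we obtain the error dynamics as

    $$\dot e = -\Lambda e + r \tag{32}$$

    where $e = [e_1, e_3]^T$, $\Lambda = \mathrm{diag}[\Lambda_j]$, and $r = [r_1, r_3]^T$.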
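The error dynamics (32) explain why it is enough to control the filtered error: with a positive gain, the tracking error is the output of a stable first-order filter driven by r, so a bounded r gives a bounded e. A small sketch for one scalar channel, using forward-Euler integration, illustrates this. The function name `simulate_error`, the initial error 1.0, and the test signal r(t) = 0.05 sin(3t) are assumptions for illustration; the gain 10 matches the value of Λ quoted in Section 5.

```python
import math

def simulate_error(e0, lam, r_of_t, dt=1e-3, t_end=5.0):
    """Integrate e_dot = -lam * e + r(t) with forward Euler; return a list of (t, e)."""
    e, t, history = e0, 0.0, []
    for _ in range(int(round(t_end / dt))):
        history.append((t, e))
        e += dt * (-lam * e + r_of_t(t))
        t += dt
    history.append((t, e))
    return history

lam = 10.0          # filter gain Lambda_j
r_bar = 0.05        # assumed bound on the filtered error |r(t)|
e0 = 1.0            # assumed initial tracking error
hist = simulate_error(e0, lam, lambda t: r_bar * math.sin(3.0 * t))
final_e = hist[-1][1]

# Standard bound for a stable first-order filter:
# |e(t)| <= |e(0)| exp(-lam t) + r_bar / lam
bound_ok = all(abs(e) <= abs(e0) * math.exp(-lam * t) + r_bar / lam + 1e-9
               for t, e in hist)
```

The bound shows the trade-off in choosing the gain: a larger Λ shrinks the residual tracking error r_bar / Λ for the same bound on r.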

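    We could build up the following augmented system as

    $$\dot X = \begin{bmatrix} \dot e \\ \dot r \end{bmatrix} = \begin{bmatrix} -\Lambda & I \\ 0 & -D^{-1} C \end{bmatrix} \begin{bmatrix} e \\ r \end{bmatrix} + \begin{bmatrix} 0 \\ D^{-1} \end{bmatrix} U \tag{33}$$

    which can be described by brief form

    $$\dot X = A(z, \dot z) X + B(z) U \tag{34}$$

    with $A \in R^{4 \times 4}$, $B \in R^{4 \times 2}$, and $X \in R^{4 \times 1}$.

    Lemma 4.1. Consider the dynamics described by (33). Given a weighted matrix $Q = Q^T > 0$, if there exists a symmetric positive definite matrix $K = K^T > 0$ satisfying the following algebraic Riccati-like equation

    $$P A + A^T P - P B R^{-1} B^T P + \dot P + Q = 0 \tag{35}$$

    for a gain matrix $R^{-1} = Q_{22}$ and positive definite matrix

    $$Q = \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix},$$

    and $Q_{12} + Q_{21}^T < 0$, $P = \mathrm{diag}[K, D]$, then the feedback control

    $$U = -\tfrac{1}{2} R^{-1} B^T P X = -\tfrac{1}{2} R^{-1} r \tag{36}$$

    guarantees that all the variables of the closed loop system are bounded and the tracking performance is achieved.

    Proof. Let us choose a Lyapunov function as

    $$V_1 = \tfrac{1}{2} X^T P X \tag{37}$$

    Taking the time derivative of $V_1$ along the dynamics (33), we obtain

    $$\dot V_1 = \tfrac{1}{2} \left( \dot X^T P X + X^T \dot P X + X^T P \dot X \right) = \tfrac{1}{2} X^T \left( A^T(z, \dot z) P + \dot P + P A(z, \dot z) \right) X + X^T P B(z) U$$

    Considering (35) and (36), we have

    $$\dot V_1 = -\tfrac{1}{2} X^T \begin{bmatrix} Q_{11} & Q_{12} \\ Q_{12}^T & Q_{22} \end{bmatrix} X \le 0 \tag{38}$$

    Therefore, using the feedback control (36) results in the controlled nonlinear system

    $$\dot X = \left( A(z, \dot z) - \tfrac{1}{2} B(z) R^{-1} B^T(z) P \right) X \tag{39}$$

    being globally exponentially stable about the origin in $R^2$, that is, $e$ and $r$ are bounded. □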

    Let F^1 and F^3 denote SVM estimation from (20) with on-line LS-

    SVM method, due to the approximation property of SVM and

  • Z. Li et al. / Neurocomputing 73 (2010) 277327822778Theorem 4.1, such that

    jF^1m1jre1 40

    jF^3m3jre3 41where e1 and e3 are bounded errors by Theorem 3.2. For simplicity,we assume a known upper limit

    JeJrem 42Since the control objective is to make r10 and r30, that is,m1 t1 and m3 t2, the effectiveness of t1 and t2 are to converger1-0 and r3-0, therefore, t1-m1, t2-m3.

    To estimate mj with SVM method, we select the sample pairs asx1,y1 r1,t1 and x3,y3 r3,t2, fjx f1j x, f

    2j x . . .

    fhj xT , where fjx is determined by the kernel function Kj(x,x),then mj /wj,fjxSbj F^ jxej

    Pli 1 a

    1i K1x,xjbjej.

    Remark 4.2. From Lemma 4.1, we know r1, e1, r3 and e3 arebounded uniformly in time, therefore, Theorem 3.2 can be applied.

    Therefore, the external torques are given as

    t1 F^ 1u1tr1 43

    t2 F^ 3u2tr2 44where trj is a robustifying vector dened later.

    Then, (27) and (28) can be rewritten as

    D11z3_r1C11r u1tr1 F^ 1m1 45

    D22D33D223z3D23z3

    _r3 ~C23r3 u2tr2 F^3m3 46

    The state space description (34) can be given by

    _X Az, _zXBzU F^trm 47where tr tr1,tr2T .

    Considering the feedback control (36), we have

    _X A12BR1BTPXBF^trm 48

    Theorem 4.1. Consider the $U$ provided by (36) with the robust term given by

    $$\tau_r = -\frac{r \, \epsilon_m^2}{\|r\| \epsilon_m + \delta(t)} \tag{49}$$

    with the positive $\epsilon_m$ defined in (42) and $r$ defined by (13), where $\delta(t)$ is a time varying positive function converging to zero as $t \to \infty$, such that $\int_0^t \delta(\omega) \, d\omega = a < \infty$ with bounded constant $a$. Then $X$ converges to a set containing the origin with a rate at least as fast as $e^{-\nu t}$ with $\nu = \lambda_{\min}(Q) > 0$.
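The role of the robust term (49) in the stability proof can be checked numerically: the worst-case approximation error contributes at most ‖r‖ε_m to the Lyapunov derivative, and adding r^T τ_r leaves a residual no larger than δ(t), which is integrable. The sketch below assumes the sign convention in which τ_r opposes r (the signs are lost in the extracted text) and assumes δ(t) = 1/(1+t)^2; the names `robust_term` and `delta_fn` are hypothetical, and ε_m = 10 is the value quoted in Section 5.

```python
import numpy as np

def robust_term(r, eps_m, delta):
    """Robust term (49), assumed sign: tau_r = -r * eps_m^2 / (||r|| * eps_m + delta)."""
    r = np.asarray(r, dtype=float)
    return -r * eps_m**2 / (np.linalg.norm(r) * eps_m + delta)

def delta_fn(t):
    """Assumed positive, integrable decay function delta(t) = 1 / (1 + t)^2."""
    return 1.0 / (1.0 + t) ** 2

rng = np.random.default_rng(0)
eps_m = 10.0
worst_margin = np.inf
for t in np.linspace(0.0, 5.0, 51):
    r = rng.normal(size=2)              # arbitrary filtered error sample
    d = delta_fn(t)
    # Residual left in the Lyapunov derivative after the robust term acts:
    # ||r|| eps_m + r^T tau_r = ||r|| eps_m * d / (||r|| eps_m + d) <= d
    residual = np.linalg.norm(r) * eps_m + r @ robust_term(r, eps_m, d)
    worst_margin = min(worst_margin, d - residual)
```

The residual never exceeds δ(t) for any r, which is why the proof only needs δ(t) to be positive and integrable rather than needing exact knowledge of the approximation error.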

    Proof. Considering (37), we have

    _V1 XT P _X12 _PX 50Introducing (48) into (50) yields

    _V1 XTPAX12 XTPBR1BTPX12XT _PXXTPBF^trm 51Using XTPAX 12XT ATPPAX, integrating (35), the time deriva-tive of Lyapunov function becomes

    _V1 12XTQXXTPBF^trm 52Since B 0 D1T from (33), XTPB eT rT diagK ,D0 D1T rT , therefore, we have

    _V1r1

    2JXJ2lminQ JrJeXTPBtrr

    1

    2JXJ2lminQ JrJem

    JrJ2e2m r1 JXJ2lminQ

    JrJemdt

    JrJemdt 2 JrJemdtr12JXJ2lminQ dt 53

    Therefore, we arrive at _VrnVdt. Thus, r converges to a setcontaining the origin with a rate at least as fast as ent .Integrating both sides of the above equation gives

    V1tV10rZ t0

    1

    2JXJ2lminQ

    dsa 54

    Thus V is bounded, which implies that XAL1. From (54), we haveZ t0

    1

    2JXJ2lminQ

    dsrV10V1ta 55

    which leads to XAL2. From r _eLe, it can be obtained thate, _eAL1. As we have established e, _eAL1, from Assumption 2.2, weconclude that _z1, _z3, _z1r , _z3rAL1. &

    4.2. z2- subsystem

    For system (9)(11) under the control laws (43) and (44), thez2- subsystem (10) can be also rewritten as

    _j f g,j,u 56where j z2, _z2T , g z1,z3, _z1, _z3T , u t1,t2T .

    Assumption 4.1. From (9) and (11), the reference signal satisesAssumption 2.2, and the following function is Lipschitz in g, i.e.,there exists Lipschitz positive constants Lg and Lf such that

    JC31 _z1g3 f3 _z3d3JrLgJgJLf 57moreover, from the stability analysis of z1 and z3 subsystems, gconverges to a small neighborhood of gd z1d,z3d, _z1d, _z3dT .

    Remark 4.3. In (57), for the dynamics presented in Section 5, wecan see C31 _z1 as a function of g, therefore, it is easy to obtain thatAssumption 4.1 is satised.

    Remark 4.4. Since the z1 and z3 subsystems satisfy the stabilityunder the proposed controls (43) and (44), and considering (6), letJggdJrr2, it is easy to obtain JgJrJgdJr2, and from (15),z jr-zjd, zj-zjd, since the close-loop signals are bounded, letJ zJrJ zdJr3, where r2 and r3 are small bounded errors.

    Lemma 4.2. The z2-subsystem (10), if z1-subsystem and z3-subsystem are stable, is globally asymptotically stable, too.

    Proof. From (9), (10) and (11), we choose the Lyapunov candidateas

    V2 V1 lncosh _z2 58Differentiating (58) along (10) gives

    _V2 _V1tanh _z2 z2 _V1tanh _z2D33

    D22D33D223z3D23z3D33

    C31 _z1g3z3 f3 _zd3C23 _z3f2 _zd2t2

    59

    From (11), we have

    t2C23z, _z_z3f2d2 D22

    D23z3C31 _z1g3 f3 _z3d3

    D22D33D223z3

    D23z3z3 60

    Integrating (60) into (59), we have

    _V2 _V1tanh _z2D33

    2

    D223z3D22D33D D z D22D33D23z3 33 23 3

  • C31 _z1g3 f3 _z3d3

    tanh _z2D33

    D22D33D223z3D22D33D223z3

    D23z3z3

    _V1tanh _z2D23z3

    C31 _z1g3 f3 _z3d3D33 z3

    Since Jtanh_z2Jr1, JD23z3J is bounded, let Jtanh _z2=D23z3Jrr1, where r1 is a bounded constant, from Assumption 4.1, we haveJC31 _z1g3 f3_z3d3JrLgJgdJ Lgr2Lf , similarly, since z3 z3r _r3, considering Assumption 2.2, r3 is bounded from z1 andz3- subsystems, and z3r-z3d, therefore, we have JD33 z3JrD33J zdJr3, therefore, we have_V2r12JXJ2lminQ dr1LgJgdJLgr2Lf D33J zdJr3 61

    Let p dr1LgJgdJLgr2Lf D33J zdJr3 and it is appar-ently bounded positive, we have _V2r0, when jXjZ

    p=2lminQ

    p, we can choose the proper Q and R such that the

    X can be arbitrarily small. Therefore, we can obtain the internaldynamics being stable with respect to the output _z2. Therefore, thez2- subsystem (11) is exponentially stable.

    Theorem 4.2. Consider the systems (911) with Lemma 4.2, underthe action of control laws (43) and (44). For each compact set O10,where (z10, _z10,AO10, each compact set O30, where (z30, _z30)AO30, the tracking errors r1 and r3 converge to a set containing theorigin with a rate at least as fast as ent , and all the signals in theclosed loop system are bounded.

    Proof. From the results (54) in Theorem 4.1, it is clear that thetracking errors r1 and r3 converge to a set containing the originwith a rate at least as fast as ent . From Lemma 2.1, we can knowe1, _e1, e3, _e3 are also bounded. From the boundedness of z1d,z3d inAssumption 2.2, we know that z1,z3 are bounded. Since _z1d, _z3d are

    [Fig. 2. Tracking the direction angle and rotation velocity by the LS-SVM based control. Two panels against time (s), in rad/s and rad; curves: SVM, robust control, model-based control.]

    [Fig. 3. Tracking the desired tilt angle by the LS-SVM based control. Two panels against time (s), in rad/s and rad; curves: SVM, robust control, model-based control.]

  • by three approaches are shown in Fig. 6. As these gures show, thetilt angle by LS-SVM approach is kept in the smallest set aroundthe vertical position compared with the other two approaches,therefore, the balance is more stable, and the forward velocity alsoconverges to a stable value compared with the other two

    0 0.5 1 1.5 2 2.5 3-1

    0

    1

    2

    3

    4

    5

    6

    7

    x (m)

    y (m

    )

    SVMRobust control

    Model-based control

    Fig. 6. The trajectory of wheeled inverted pendulum.

    0 1 2 3 4 5-2

    0

    2

    4

    v (m

    /s)

    Time (s)

    0 1 2 3 4 5-5

    0

    5

    10

    2 (m

    )

    Time (s)

    Robust control

    SVM

    Model-based control

    Model-based control

    SVM

    Robust control

    Fig. 5. The forward velocity by the LS-SVM based control.

    0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5-300

    -250

    -200

    -150

    -100

    -50

    0

    Time (s)

    (N

    m)

    SVM

    Model-based Control

    Fig. 4. Input torques.

    Z. Li et al. / Neurocomputing 73 (2010) 277327822780also bounded, it follows that _z1, _z3 are bounded. From Lemma 4.2,we know that the z2- subsystem (11) is stable, and z2, _z2 arebounded. &
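    As a hedged illustration of the step that the proof of Theorem 4.2 attributes to Lemma 2.1, suppose (an assumption, since the lemma is not restated in this section) that each filtered error has the common form $r_i = \dot{e}_i + \lambda_i e_i$ with $\lambda_i > 0$, and that $|r_i(t)| \le \bar{r}_i$ for all $t \ge 0$. The symbols $\lambda_i$ and $\bar{r}_i$ are introduced here only for illustration. Boundedness of $e_i$ and $\dot{e}_i$ then follows from the variation-of-constants formula:

```latex
\begin{aligned}
e_i(t) &= e^{-\lambda_i t} e_i(0) + \int_0^t e^{-\lambda_i (t-s)}\, r_i(s)\, ds, \\
|e_i(t)| &\le |e_i(0)| + \frac{\bar{r}_i}{\lambda_i}\left(1 - e^{-\lambda_i t}\right)
          \le |e_i(0)| + \frac{\bar{r}_i}{\lambda_i}, \\
|\dot{e}_i(t)| &= |r_i(t) - \lambda_i e_i(t)| \le 2\bar{r}_i + \lambda_i |e_i(0)|.
\end{aligned}
```

    Under this assumed form, a bounded $r_i$ gives bounded $e_i$ and $\dot{e}_i$, which is the property the proof relies on before invoking the boundedness of the desired trajectories.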

    5. Simulations

    Let us consider a mobile wheeled inverted pendulum as shown in Fig. 1. The following variables have been chosen to describe the vehicle (see also Fig. 1): $\tau_l$, $\tau_r$ are the torques of the left and right wheels; $\alpha$ is the tilt angle of the pendulum; $\theta$ the direction angle of the mobile platform; $R$ the radius of the wheels; $D$ the distance between the two wheels; $2L$ the length of the pendulum; $B$ the friction coefficient of the ground; $M$ the mass of the mobile platform and pendulum; $m$ the mass of each wheel; $J_\alpha$ the inertia moment of the mobile platform and pendulum; $J$ the inertia moment of each wheel; and $g$ the gravitational acceleration.

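    The wheeled inverted pendulum is subject to the constraint $\dot{x}\sin\theta - \dot{y}\cos\theta = 0$. Using the Lagrangian approach, we can obtain the reduced dynamics for $q_v = [\theta, x, y]^T$, $\alpha$, $J_v = [0, \sin\theta, -\cos\theta]$, and $\dot{\zeta} = [\omega, v, \dot{\alpha}]^T$ as

    $$D_1 = \begin{bmatrix} d_{11}(\zeta_3) & 0 & 0 \\ 0 & d_{22} & ML\cos\alpha \\ 0 & ML\cos\alpha & ML^2 + J_\alpha \end{bmatrix},$$

    $$C_1 = \begin{bmatrix} ML^2\sin(2\alpha)\,\dot{\alpha}/2 & 0 & \omega ML^2\sin(2\alpha)/2 \\ 0 & 0 & -ML\sin\alpha\,\dot{\alpha} \\ -\omega ML^2\sin(2\alpha)/2 & 0 & 0 \end{bmatrix},$$

    $$G_1 = [0, 0, -MgL\sin\alpha]^T, \quad F_1 = r[\omega, v, 0]^T,$$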
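    The matrices above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the reading $d_{11} = D^2 m/2 + JD^2/(2R^2) + J_w + ML^2\sin^2\alpha$, $d_{22} = 2m + 2J/R^2 + M$, the sign pattern of $C_1$ and $G_1$ implied by a Lagrangian derivation (the signs are lost in the extracted text), and the parameter values quoted in the simulation section. The function names, `G_ACC = 9.81`, and the test state are hypothetical. It verifies the standard property that $\dot{D}_1 - 2C_1$ is skew-symmetric.

```python
import math

# Parameter values quoted in the simulation section; variable names are illustrative.
M, m, L, D, R = 10.0, 2.0, 1.0, 1.0, 0.5
J, J_w, J_a = 1.0, 4.0, 2.0
G_ACC = 9.81  # assumed value of g; the paper does not state it


def inertia(alpha):
    """D1(alpha): inertia matrix of the reduced dynamics (assumed reading)."""
    d11 = D**2 * m / 2 + J * D**2 / (2 * R**2) + J_w + M * L**2 * math.sin(alpha) ** 2
    d22 = 2 * m + 2 * J / R**2 + M
    c = M * L * math.cos(alpha)
    return [[d11, 0.0, 0.0], [0.0, d22, c], [0.0, c, M * L**2 + J_a]]


def coriolis(alpha, alpha_dot, omega):
    """C1: Coriolis/centrifugal matrix with the signs assumed in the lead-in."""
    h = M * L**2 * math.sin(2 * alpha) / 2
    return [[h * alpha_dot, 0.0, omega * h],
            [0.0, 0.0, -M * L * math.sin(alpha) * alpha_dot],
            [-omega * h, 0.0, 0.0]]


def gravity(alpha):
    """G1: gravity vector; only the tilt coordinate is affected."""
    return [0.0, 0.0, -M * G_ACC * L * math.sin(alpha)]


def inertia_rate(alpha, alpha_dot):
    """Time derivative of D1 along a trajectory (chain rule through alpha)."""
    s = -M * L * math.sin(alpha) * alpha_dot
    return [[M * L**2 * math.sin(2 * alpha) * alpha_dot, 0.0, 0.0],
            [0.0, 0.0, s],
            [0.0, s, 0.0]]


def skew_defect(alpha, alpha_dot, omega):
    """Largest entry of N + N^T for N = dD1/dt - 2*C1; zero means skew-symmetric."""
    d_rate = inertia_rate(alpha, alpha_dot)
    c_mat = coriolis(alpha, alpha_dot, omega)
    n_mat = [[d_rate[i][j] - 2 * c_mat[i][j] for j in range(3)] for i in range(3)]
    return max(abs(n_mat[i][j] + n_mat[j][i]) for i in range(3) for j in range(3))


# Arbitrary test state: tilt of pi/18 rad, tilt rate 0.3 rad/s, yaw rate 0.5 rad/s.
print("skew defect:", skew_defect(math.pi / 18, 0.3, 0.5))
```

    A defect that is zero up to rounding indicates that the assumed signs of $C_1$ are at least mutually consistent with $D_1$; it does not by itself confirm the signs used in the original paper.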

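    where $d_{11}(\zeta_3) = D^2 m/2 + J D^2/(2R^2) + J_w + ML^2\sin^2\alpha$ and $d_{22} = 2m + 2J/R^2 + M$. In the simulation, we assume the parameters $M = 10.0$ kg, $J = 1.0$ kg m$^2$, $J_w = 4.0$ kg m$^2$, $J_\alpha = 2.0$ kg m$^2$, $m = 2.0$ kg, $L = 1$ m, $D = 1.0$ m, $R = 0.5$ m, $r = \mathrm{diag}[0.1]$, $\zeta(0) = [0.2, 0, \pi/18]^T$, and $\dot{\zeta}(0) = [0.0, 0.1, 0.0]^T$. The disturbances from the environment are introduced as $1.0\sin t$ and $1.0\cos t$ in the simulation model. The desired trajectories are chosen as $\theta_d = 0.5t$ rad and $\alpha_d = 0$ rad, and the initial velocity is 0.1 m/s. The design parameters of the above two LS-SVM controllers are $Q_{11} = \mathrm{diag}[80, 120]$, $Q_{12} = Q_{21}^T = \mathrm{diag}[4, 6]$, $Q_{22} = \mathrm{diag}[40, 300]$, $\Lambda = \mathrm{diag}[10, 10]$, $K = \mathrm{diag}[4, 6]$, $\varepsilon_m = 10$, and $\delta = 1/(1+t)^2$; 200 training data are sampled for the sliding window during [0, 5] s. The training data pairs are constructed. We use radial basis function (RBF) kernels for the SVM: $K(x_i, x_j) = \exp\{-\|x_i - x_j\|^2/(2\sigma^2)\}$. The parameter of the RBF kernels is $\sigma = 0.25$, and the parameters of the LS-SVM are selected as $c_1 = 20$, $c_2 = 40$, $\varepsilon = 0.02$.

    Under the same conditions, we compare two alternatives. (i) Model-based control, for which we assume 10% model uncertainty; the model-based controller is taken as

    $$\tau_1 = D_{11}\ddot{\zeta}_{1d} + C_{11}\dot{\zeta}_1 + C_{13}\dot{\zeta}_3 - k_{11}(\dot{\zeta}_1 - \dot{\zeta}_{1d}) - k_{12}(\zeta_1 - \zeta_{1d})$$

    and

    $$\tau_2 = -\frac{D_{22}D_{33} - D_{23}^2(\zeta_3)}{D_{23}(\zeta_3)}\left(\ddot{\zeta}_{3d} - k_{31}(\dot{\zeta}_3 - \dot{\zeta}_{3d}) - k_{32}(\zeta_3 - \zeta_{3d})\right) + C_{23}\dot{\zeta}_3 - \frac{D_{22}}{D_{23}(\zeta_3)}\left(C_{31}\dot{\zeta}_1 + f_3 + g_3 + d_3\right) + f_2 + g_2 + d_2$$

    with $k_{11} = k_{12} = k_{31} = k_{32} = 10.0$. (ii) Robust control with 30–50% model uncertainty and a known bound on the model uncertainty, for which we choose the control law

    $$\tau_1 = -k_{p1} r_1 - k_{i1}\int_0^t r_1\,ds - \sum_{i=1}^{3}\frac{r_1 W_{1i}\Psi_{1i}^2}{\Psi_{1i}|r_1| + \delta_{1i}},$$

    where $\Phi_1 = W_1^T\Psi_1$, $W_1 = [W_{11}, W_{12}, W_{13}]^T$, $\Psi_1 = [|\ddot{\zeta}_{1r}|, 1, 1]^T$, and

    $$\tau_2 = -k_{p3} r_3 - k_{i3}\int_0^t r_3\,dt - \sum_{i=1}^{8}\frac{r_3 W_{3i}\Psi_{3i}^2}{\Psi_{3i}|r_3| + \delta_{3i}},$$

    where $\Phi_3 = W_3^T\Psi_3$, $W_3 = [W_{31}, \ldots, W_{38}]^T$, $\Psi_3 = [|\ddot{\zeta}_{3r}|, |\dot{\zeta}_{3r}|, |\dot{\zeta}|, 1, 1, |\dot{\zeta}|, 1, 1]^T$. The control gains are selected as $k_{p1} = 10.0$, $k_{p3} = 60.0$, $k_{i1} = k_{i3} = 0.0$, $W_1 = [10.0, 10.0, 10.0]^T$, $W_3 = [10.0, \ldots, 10.0]^T$, and $\delta_{1i} = \delta_{3i} = 1/(1+t)^2$.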
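    To make the kernel choice concrete, here is a minimal sketch of RBF-kernel LS-SVM regression in plain Python. It uses the textbook LS-SVM linear system for function estimation, which is an assumption on our part and not necessarily the on-line sliding-window training used in the paper. The value $\sigma = 0.25$ and the window size of 200 samples come from the text; the regularisation name `gamma`, its value 20, the helper names, and the toy target $\sin t$ are hypothetical.

```python
import math


def rbf_kernel(x, y, sigma=0.25):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2)); sigma = 0.25 as quoted in the text."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma**2))


def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def lssvm_fit(X, y, gamma=20.0, sigma=0.25):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; a] = [0; y] for bias b and weights a."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the diagonal of K
        A.append(row)
    sol = solve(A, [0.0] + list(y))
    return sol[0], sol[1:]


def lssvm_predict(x, X, b, a, sigma=0.25):
    """f(x) = b + sum_i a_i K(x, x_i)."""
    return b + sum(ai * rbf_kernel(x, xi, sigma) for ai, xi in zip(a, X))


# Toy data: 200 samples of sin(t) on [0, 5] s, echoing the sliding-window size.
X = [[5.0 * i / 199] for i in range(200)]
y = [math.sin(x[0]) for x in X]
b, a = lssvm_fit(X, y)
print("abs error at t = 2.5:", abs(lssvm_predict([2.5], X, b, a) - math.sin(2.5)))
```

    The first row of the linear system enforces that the dual weights sum to zero, and the `1 / gamma` term trades fitting accuracy against smoothness. In a sliding-window setting the same system would be re-solved as old samples leave the window, which this sketch does not attempt.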

    The tracking performances in the comparison simulations are illustrated in Figs. 2–6. The balance performance is compared in Fig. 3. The direction angle and angular velocity tracking by the three approaches are shown in Fig. 2, the input torques by the three approaches are shown in Fig. 4, the stable velocities by the three approaches are shown in Fig. 5, and the trajectories of the system

    approaches, while the forward velocities and the produced trajectories by the other two approaches apparently diverge, as is clearly shown in Fig. 6. The performance is achieved under the given bounded initial disturbances from the environment, even though the model parameters of the system are unknown beforehand. Therefore, the proposed control scheme can achieve better tracking performance: although the parametric uncertainties and the external disturbances are both introduced into the simulation model, the motion/balance control performance of the system under the proposed control is not degraded. The simulation results demonstrate the effectiveness of the proposed LS-SVM based control in the presence of unknown nonlinear dynamic systems and environments. Different motion/balance tracking performance can be achieved by adjusting the control gains.

    The performance of the model-based approach is sensitive to the accuracy of the dynamic model, as seen in Figs. 3 and 5. In the model-based control, we introduce only 10% parametric uncertainty in the dynamic model: since $\zeta_3$ is very sensitive to the model, more than 10% model uncertainty would make the system unstable. For the robust control, we assume that the bounds of the dynamic parameters are obtained beforehand; however, it is unrealistic to obtain them in actual applications. From Figs. 3 and 5, its tracking performance is no better than that of the LS-SVM. In contrast, the LS-SVM control approach is tolerant of modeling errors, which can be viewed as a key advantage over model-based and robust control of wheeled inverted pendulums, for which accurate modeling of the dynamics is difficult, time-consuming and uncertain. The presence of parametric errors is a common problem for model-based and robust controllers, since the identification of dynamic parameters is error-prone. For instance, the controlled conditions in the test facility under which the parameters are identified are often very different from actual conditions, thus rendering the parameters inaccurate for real operating conditions. The LS-SVM control presented in this paper is not susceptible to this problem, since the unknown parameters are learned during the wheeled inverted pendulum operation in actual conditions. Therefore, the proposed control scheme can achieve better tracking performance than the model-based control and the robust control. The better tracking performance is largely due to the learning mechanism. Although the parametric uncertainties and the external disturbances are both introduced into the simulation model, the motion/balance control performance of the system under the proposed control is not degraded. The simulation results demonstrate the effectiveness of the proposed adaptive control in the presence of unknown nonlinear dynamic systems and environments. Different motion/balance tracking performance can be achieved by adjusting the control gains.

    These figures show that the proposed control performs better and is more realistic in practice, which validates the effectiveness of the control law in Theorem 4.2.

    6. Conclusions

    In this paper, LS-SVM based control design is carried out for dynamic balance and stable tracking of desired trajectories of a mobile wheeled inverted pendulum in the presence of unmodelled dynamics, or parametric/functional uncertainties. The control is mathematically shown to guarantee semi-globally uniformly bounded stability, and the steady-state compact sets to which the closed-loop error signals converge are derived. The size of the compact sets can be made small through appropriate choice of control design parameters. Simulation results demonstrate that the system is able to track reference signals satisfactorily, with all closed-loop signals uniformly bounded.

    References

    [1] F. Grasser, A. Arrigo, S. Colombi, A.C. Rufer, JOE: a mobile, inverted pendulum, IEEE Trans. Ind. Electron. 49 (1) (2002) 107–114.

    [2] R. Brooks, L. Aryanada, A. Edsinger, P. Fitzpatrick, C.C. Kemp, U. O'Reilly, E. Torres-jara, P. Varshavskaya, J. Weber, Sensing and manipulating built-for-human environments, Int. J. Humanoid Robotics 1 (1) (2004) 1–28.

    [3] T. Miyashita, H. Ishiguro, Human-like natural behavior generation based on involuntary motions for humanoid robots, Robotics Autonomous Syst. 48 (2004) 203–212.

    [4] K. Pathak, J. Franch, S.K. Agrawal, Velocity and position control of a wheeled inverted pendulum by partial feedback linearization, IEEE Trans. Robotics 21 (3) (2005) 505–513.

    [5] N.R. Gans, S.A. Hutchinson, Visual servo velocity and pose control of a wheeled inverted pendulum through partial-feedback linearization, in: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 3823–3828.

    [6] A. Isidori, L. Marconi, A. Serrani, Robust Autonomous Guidance: An Internal Model Approach, Springer, New York, 2003.

    [7] M. Zhang, T. Tarn, Hybrid control of the Pendubot, IEEE/ASME Trans. Mechatronics 7 (1) (2002) 79–86.

    [8] C.A. Ibanez, O.G. Frias, M.S. Castanon, Lyapunov-based controller for the inverted pendulum cart system, Nonlinear Dyn. 40 (4) (2005) 367–374.

    [9] S.S. Ge, C.C. Hang, T. Zhang, A direct adaptive controller for dynamic systems with a class of nonlinear parameterizations, Automatica 35 (1999) 741–747.

    [10] A. Salerno, J. Angeles, On the nonlinear controllability of a quasiholonomic mobile robot, in: Proceedings of IEEE International Conference on Robotics and Automation, 2003, pp. 3379–3384.

    [11] A. Salerno, J. Angeles, The control of semi-autonomous two-wheeled robots undergoing large payload-variations, in: Proceedings of IEEE International Conference on Robotics and Automation, 2004, pp. 1740–1745.

    [12] Y.S. Ha, S. Yuta, Trajectory tracking control for navigation of the inverse pendulum type self-contained mobile robot, Robotics Autonomous Syst. 17 (1996) 65–80.

    [13] A. Blankespoor, R. Roemer, Experimental verification of the dynamic model for a quarter size self-balancing wheelchair, in: American Control Conference, Boston, MA, 2004, pp. 488–492.

    [14] Y. Kim, S.H. Kim, Y.K. Kwak, Dynamic analysis of a nonholonomic two-wheeled inverted pendulum robot, J. Intelligent Robotic Syst. 44 (2005) 25–46.

    [15] G.L. Wang, Y.F. Li, D.X. Bi, Support vector machine networks for friction modeling, IEEE/ASME Trans. Mechatronics 9 (3) (2004) 601–606.

    [16] H.R. Zhang, X.D. Wang, C.J. Zhang, X.S. Cai, Robust identification of non-linear dynamic systems using support vector machine, IEE Proc. Sci. Meas. Technol. 153 (3) (2006) 125–129.

    [17] G. Lu, J. Song, L. Hua, C. Sun, Inverse system control of nonlinear systems using LS-SVM, in: Proceedings of the 26th Chinese Control Conference, China, 2007, pp. 233–236.

    [18] J. Xu, S. Chen, Adaptive control of a class of nonlinear discrete-time systems using support vector machine, in: Proceedings of the Fifth World Congress on Intelligent Control and Automation, China, 2004, pp. 440–443.

    [19] D. Bi, Y.F. Li, S.K. Tso, G.L. Wang, Friction modeling and compensation for haptic display based on support vector machine, IEEE Trans. Ind. Electron. 51 (2) (2004) 491–500.

    [20] C. Su, Y. Stepanenko, Robust motion/force control of mechanical systems with classical nonholonomic constraints, IEEE Trans. Automatic Control 39 (3) (1994) 609–614.

    [21] S.S. Ge, C.C. Hang, T.H. Lee, T. Zhang, Stable Adaptive Neural Network Control, Kluwer Academic Publisher, Boston, 2002.

    [22] V. Vapnik, An overview of statistical learning theory, IEEE Trans. Neural Networks 10 (5) (1999) 955–999.

    [23] J.A.K. Suykens, J. Vandewalle, B.D. Moor, Optimal control by least squares support vector machines, Neural Networks 14 (1) (2001) 23–35.

    [24] V.N. Vapnik, Statistical Learning Theory, Springer, New York, 1998.

    [25] W. Dong, Y. Xu, W. Huo, Trajectory tracking control of dynamics nonholonomic systems with unknown dynamics, Int. J. Robust Nonlinear Control 9 (1999) 905–922.

    [26] Y.C. Chang, B.S. Chen, Robust tracking designs for both holonomic and nonholonomic constrained mechanical systems: adaptive fuzzy approach, IEEE Trans. Fuzzy Syst. 8 (2000) 46–66.

    [27] S.S. Ge, J. Wang, T.H. Lee, G.Y. Zhou, Adaptive robust stabilization of dynamic nonholonomic chained systems, J. Robotic Syst. 18 (3) (2001) 119–133.

    [28] S.S. Ge, Z. Wang, T.H. Lee, Adaptive stabilization of uncertain nonholonomic systems by state and output feedback, Automatica 39 (8) (2003) 1451–1460.

    [29] H. Arai, K. Tanie, Nonholonomic control of a three-DOF planar underactuated manipulator, IEEE Trans. Robotics Automation 14 (5) (1998) 681–694.

    [30] A. De Luca, G. Oriolo, Trajectory planning and control for planar robots with passive last joint, Int. J. Robotics Res. 21 (5–6) (2002) 575–590.

    [31] J. Wang, Q. Chen, Y. Chen, RBF kernel based support vector machine with universal approximation and its application, in: Lecture Notes in Computer Science, Part III Support Vector Machines, vol. 3173, 2004, pp. 512–517.

    [32] J.A.K. Suykens, J. Vandewalle, B.D. Moor, Optimal control by least squares support vector machines, Neural Networks 14 (2001) 23–35.

    [33] S.S. Ge, B. Ren, K.P. Tee, T.H. Lee, Approximation based control of uncertain helicopter dynamics, IET Control Theory Appl. 3 (7) (2009) 941–956.

    [34] D. Karnopp, Computer simulation of stick-slip friction in mechanical dynamic systems, ASME J. Dyn. Syst. Meas. Control 107 (1985) 100–103.

    Zhijun Li was born in China in 1973. He received the Dr. Eng. degree in mechatronics from Shanghai Jiao Tong University, PR China, in 2002. From 2003 to 2005, he was a postdoctoral fellow in the Department of Mechanical Engineering and Intelligent Systems, The University of Electro-Communications, Tokyo, Japan. From 2005 to 2006, he was a research fellow in the Department of Electrical and Computer Engineering, National University of Singapore, and Nanyang Technological University, Singapore. Currently, he is an associate professor in the Department of Automation, Shanghai Jiao Tong University, PR China. He is a senior member of IEEE. Dr. Li's current research interests include adaptive/robust control, mobile manipulators, nonholonomic systems, etc.

    Yunong Zhang was born in Henan, China, in 1973. He received the B.S., M.S. and Ph.D. degrees from Huazhong University of Science and Technology (HUST), South China University of Technology (SCUT) and Chinese University of Hong Kong (CUHK) in 1996, 1999 and 2003, respectively. He is currently a professor at the School of Information Science and Technology, Sun Yat-Sen University (SYSU), Guangzhou, China. Before joining SYSU in 2006, he had been with National University of Ireland (NUI), University of Strathclyde, and National University of Singapore (NUS) since 2003. His main research interests include neural networks, robotics and Gaussian processes. His web-page is now available at http://www.ee.sysu.edu.cn/teacher/detail.asp?sn=129.

    Yipeng Yang was born in China in 1977. He received the B.S. and M.S. degrees from the Department of Applied Mathematics and the Department of Control Theory and Engineering at Shanghai Jiao Tong University, China, in 2001 and 2003, respectively. In 2008 he received the Ph.D. degree in Operations Research from North Carolina State University, USA. Since then he has been an assistant professor in the Department of Control Theory and Engineering at Shanghai Jiao Tong University, China. His main research interests include stochastic control, switching system control and optimization, and mobile robots.
